Publication

Improving the performance of machine learning models for biotechnology: the quest for deus ex machina

Journal contribution - Review article

Machine learning is becoming an integral part of the Design-Build-Test-Learn cycle in biotechnology. Machine learning models learn from collected datasets such as omics data and predict a defined outcome, which has led to both production improvements and predictive tools in the field. Robust prediction of the behavior of microbial cell factories and production processes not only greatly increases our understanding of how such systems function, but also saves significant development time. However, many pitfalls in modeling biological data (bad fit, noisy data, model instability, low data quantity, and imbalances in the data) degrade model performance. Here we provide an accessible, in-depth analysis of the problems created by these pitfalls, as well as means of detecting and mitigating them, with a focus on supervised learning. Assessing the state of the art, we show that in-depth analyses of model performance are currently often absent and that this practice must be improved. This review provides a toolbox for the analysis of model robustness and performance, and simultaneously proposes a standard for the community to facilitate future work. It is further accompanied by an interactive online tutorial on the discussed issues.
Journal: Biotechnology Advances
ISSN: 1873-1899
Volume: 53
Year of publication: 2021
Accessibility: Closed