Faculty opponent: Huimin Zhao, Ph.D., Steven L. Miller Chair, University of Illinois at Urbana-Champaign, USA
Supervisor: Senior Researcher Eduard Kerkhoven, Chalmers
Examiner: Professor Ivan Mijakovic, Chalmers
Over the years, synthetic biology has demonstrated significant potential for producing various bulk chemicals, as well as ingredients for cosmetics and pharmaceuticals. To achieve this, microorganisms such as yeasts are commonly used as microbial cell factories. Yeasts are advantageous because they tend to be easy to culture, and many of them can be engineered using genetic toolboxes. Nevertheless, despite their widespread use, many yeasts have not yet been studied in detail, and even for those that have been studied, many gaps remain in our knowledge of their cellular processes. To address these gaps, the rapid development of machine learning and comparative genomics techniques can help improve our understanding of yeasts, building on pre-existing data and knowledge.
Machine learning is a state-of-the-art technique that enables computers to detect patterns and make predictions from large datasets. In this thesis, I used machine learning to predict gene essentiality (i.e., to identify gene deletions that can cause the death of a cell) in yeasts. I also identified biological patterns that can improve such predictions. This is important for the future design of yeast cells and for drug target discovery. Moreover, I developed a deep learning model that predicts how fast and efficiently an enzyme works by looking only at its amino acid sequence. The model was applied to more than 300 yeast species to simulate how their metabolism works, by predicting the speed and efficiency of around 3 million enzymes. In addition, I used machine learning to investigate factors that significantly impact protein production in yeasts. This has provided crucial knowledge that can be used in the design of future protein-producing yeasts.
In this thesis, I also used comparative genomics, a valuable technique that complements machine learning in investigating complex biological problems. In my research, I developed a toolbox to detect so-called horizontal gene transfer (HGT) events. HGT occurs when a microorganism such as a yeast acquires a gene from an external source instead of from its parents. Using this toolbox, it is possible to trace potential transmission routes of such HGT genes. Furthermore, with the aid of various comparative genomic analyses, I systematically explored the underlying mechanisms of substrate usage (i.e., the ability of a microorganism to use a particular substance to carry out its functions) in over 300 yeast species.