Project

Single-Cell Analysis and Metabolic Modelling of Pairwise Interactions in the Gut Microbiome

The gut microbiota is a complex ecosystem that houses a diverse array of microorganisms, including eukaryotic cells, bacteria and viruses, and plays a crucial role in human health. Its functions include breaking down dietary fibres, producing vitamins, and protecting against pathogens. However, deciphering the role of the gut microbiota in disease remains challenging because of its size, diversity, and complexity, its variability across hosts, and the vast amount of data needed to describe it. A bottom-up approach makes it possible to identify specific gut bacterial species, and their metabolic products, that affect the host's health. Studying microbial species cultured in pairs already reveals aspects of their interaction behaviour, and this knowledge can then be applied to larger communities. Understanding these interactions is critical for identifying potential microbial targets for therapeutic intervention and for developing strategies that promote the growth of beneficial microorganisms while suppressing harmful ones. Although 16S rRNA sequencing is a popular method for studying in vitro bacterial communities and pairwise interactions, it has limitations and biases. Our objective is to develop a bioinformatics tool that provides a complementary, targeted analysis method for synthetic communities. In addition, we compare existing tools based on genome-scale metabolic models to test their accuracy and to assess the data and parameters they require to produce optimal outputs.

In Chapter II, we focused on developing an alternative to 16S rRNA sequencing, which is expensive and time-consuming, for in vitro studies. We adapted machine learning methods to create CellScanner, a user-friendly tool that analyses flow cytometry (FC) data to predict the composition of a known microbial community. We evaluated the accuracy of CellScanner's predictions and compared it to similar methods. CellScanner provides a fast and flexible approach to resolving microbial community composition, and it accurately predicts the composition of two- and three-species in vitro communities. However, its accuracy decreases as the number of species in the community increases, so it is not suited to complex communities. In addition, the ‘unknown’ events parameter implemented in the software was found to increase precision and specificity, but it may reduce accuracy if one species has many more unknown events than the others.

Furthermore, our study found that no single machine learning method systematically outperforms the others. We also showed that CellScanner’s ability to resolve the composition of synthetic communities is comparable to that of other tools (ScriptP and CellCognize). Finally, we presented the machine learning gating feature, which allows users to remove background and debris from training data by providing FC data from media without cells (a blank).
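
The idea of classifying flow cytometry events and setting aside ‘unknown’ ones can be illustrated with a minimal sketch. It is not CellScanner's implementation: a simple nearest-centroid classifier with a distance cut-off stands in for the machine learning methods mentioned above, and the two features, the species names, the synthetic data and the `max_distance` value are all assumptions made for illustration.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def simulate(center, n, spread=0.5):
    """Draw n synthetic two-feature events around a centre (hypothetical data)."""
    return [(rng.gauss(center[0], spread), rng.gauss(center[1], spread))
            for _ in range(n)]


def centroid(events):
    """Mean of each feature over a list of equal-length event tuples."""
    n = len(events)
    return tuple(sum(e[i] for e in events) / n for i in range(len(events[0])))


def classify(event, centroids, max_distance):
    """Label an event with its nearest centroid, or 'unknown' if it is too far."""
    label, dist = min(
        ((name, math.dist(event, c)) for name, c in centroids.items()),
        key=lambda pair: pair[1],
    )
    return label if dist <= max_distance else "unknown"


# "Training": one centroid per species, learned from monoculture events.
centroids = {
    "species_A": centroid(simulate((2.0, 2.0), 500)),
    "species_B": centroid(simulate((6.0, 6.0), 500)),
}

# A mixture with a 3:1 ratio plus some debris-like events far from both species.
mixture = (
    simulate((2.0, 2.0), 300)
    + simulate((6.0, 6.0), 100)
    + simulate((12.0, 0.0), 20)
)

counts = {name: 0 for name in centroids}
counts["unknown"] = 0
for event in mixture:
    counts[classify(event, centroids, max_distance=2.5)] += 1

# Composition is computed over the events assigned to a species only.
known = sum(counts[name] for name in centroids)
fractions = {name: counts[name] / known for name in centroids}

print(counts)
print(fractions)
```

The cut-off illustrates the trade-off reported for the ‘unknown’ parameter: rejecting ambiguous events keeps the retained assignments cleaner, but if one species loses many more events than the other, the estimated composition shifts.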

In Chapter III, we applied CellScanner to biological data and compared the results with those obtained from 16S rRNA sequencing. We evaluated supervised classification as a method for counting gut bacterial species in mixtures and discussed its advantages over other methods: it avoids labour-intensive DNA extraction or plating, does not require fluorescent labelling of species, and delivers absolute abundances. However, the method is limited to co-cultures and small bacterial communities. For certain combinations of gut bacterial species, CellScanner performed as well as or better than 16S rRNA gene sequencing, although low accuracy of the sequencing itself could also explain this observation. The study found that flow cytometry features linked to cell shape and size are insufficient on their own to distinguish species, and that multivariate methods are needed to classify each event more accurately. Factors such as cell size diversity, cell cycle variations, and bacterial aggregation could explain the variability across biological replicates of monocultures.

In Chapter IV, we explored the use of genome-scale metabolic models (GEMs) to predict pairwise interactions between gut bacterial species in silico. These models are attractive because they can integrate genome data. Using growth data from the literature, we systematically evaluated the accuracy of flux balance analysis (FBA) with semi-curated GEMs for predicting growth rates and interaction strengths between human and mouse gut bacteria. Our study revealed that FBA-based predictions of interaction strengths between gut bacterial species are currently not accurate enough to be reliable. We also found that the different model construction methods used in the literature lead to varying prediction accuracy. Our work provides the first systematic evaluation of FBA-based interaction prediction for gut bacteria, and it shows that using curated GEMs significantly increases the accuracy of growth rate prediction. However, it also highlights the need for condition-specific GEMs before constraint-based modelling is used for analysis. We suggest that additional tools and approaches be tested, and that more curated models be used to confirm the beneficial effect of curation on prediction accuracy. Our study was limited to mammalian gut bacteria; future evaluations could include microorganisms from other environments.
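
For readers unfamiliar with FBA, the optimisation it solves can be written in its standard textbook form. This is the generic formulation only, not the specific objective, constraints or medium definitions used in the thesis; the symbols S, v, c, l and u are introduced here for illustration.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& l \le v \le u
\end{aligned}
```

Here S is the stoichiometric matrix of the GEM (metabolites by reactions), v is the vector of reaction fluxes, c selects the objective (typically the biomass reaction, so that the optimum is the predicted growth rate), and l and u are lower and upper flux bounds, which is where the growth medium enters through limits on uptake reactions.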

In conclusion, our study demonstrated that the bioinformatics tool we developed provides a user-friendly and viable alternative to 16S rRNA sequencing for in vitro analysis of small gut bacterial communities when FC data are available, particularly when studying pairwise interactions. Although the accuracy of CellScanner predictions depends on the species present in the community, updating the tool with alternative methods and using more advanced equipment could improve it, even for species that are difficult to discriminate. Moreover, our in silico analysis highlights the limits of genome-scale metabolic modelling and the importance of careful model curation and parameter definition for producing accurate predictions.

Date: 4 Mar 2019 → 4 Jul 2023
Keywords: Modelling, Metabolism, Interaction
Disciplines: Bio-informatics, Modelling and simulation
Project type: PhD project