Understanding how microbes interact with each other is key to revealing the underlying role that microorganisms play in the host or environment and to identifying microorganisms as agents that can potentially alter the host or environment. For example, understanding how microbial interactions are associated with parasitic infection can help identify potential drugs or diagnostic tests for parasitic infection. To unravel microbial interactions, existing tools often rely on graphical models that infer the conditional dependence of microbial abundances to represent their interactions. However, current methods do not simultaneously account for the discreteness, compositionality, and heterogeneity inherent to microbiome data. Thus, we build a new approach called “compositional graphical lasso” upon existing tools by incorporating the above characteristics into the graphical model explicitly. We illustrate the advantage of compositional graphical lasso over current methods under a variety of simulation scenarios and on a benchmark study, the Tara Oceans Project. Moreover, we present our results from the analysis of a dataset from the Zebrafish Parasite Infection Study, which aims to gain insight into how the gut microbiome and parasite burden covary during infection, thus uncovering novel putative methods of disrupting parasite success. Our approach identifies changes in interaction degree between infected and uninfected individuals for three taxa, Photobacterium, Gemmobacter, and Paucibacter, whose changes are predicted in the opposite direction by other methods. Further investigation of these method-specific changes in taxon interactions reveals their biological plausibility. In particular, we speculate on the potential pathobiotic roles of Photobacterium and Gemmobacter in the zebrafish gut, and the potential probiotic role of Paucibacter. Collectively, our analyses demonstrate that compositional graphical lasso provides a powerful means of accurately resolving interactions between microbiota and can thus drive novel biological discovery.
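The "change in interaction degree" mentioned above can be illustrated with a minimal sketch. It assumes that each inferred network is available as a list of undirected edges between taxa and that a taxon's degree is the number of distinct partners it has. The function names and the placeholder taxa (`taxon_A`, etc.) are hypothetical and are not taken from the study; the network inference step itself is not shown.

```python
from collections import Counter


def degrees(edges):
    """Count distinct interaction partners per taxon in an undirected network."""
    # Deduplicate edges and ignore self-loops; (a, b) and (b, a) are the same edge.
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for edge in unique_edges:
        for taxon in edge:
            degree[taxon] += 1
    return degree


def degree_change(edges_infected, edges_uninfected):
    """Return degree(infected) - degree(uninfected) for every taxon seen."""
    d_inf = degrees(edges_infected)
    d_uninf = degrees(edges_uninfected)
    taxa = sorted(set(d_inf) | set(d_uninf))
    # Counter returns 0 for taxa absent from one of the networks.
    return {taxon: d_inf[taxon] - d_uninf[taxon] for taxon in taxa}


# Hypothetical edge lists, for illustration only.
uninfected = [("taxon_A", "taxon_B"), ("taxon_B", "taxon_C")]
infected = [
    ("taxon_A", "taxon_B"),
    ("taxon_A", "taxon_C"),
    ("taxon_A", "taxon_D"),
]

changes = degree_change(infected, uninfected)
```

A positive value means a taxon has more inferred partners in infected individuals, a negative value fewer. Two methods disagree in direction for a taxon when the signs of its values differ.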
Reads were grouped into OTUs using the following swarm-based pipeline: paired-end reads were merged with vsearch’s --fastq_mergepairs command (version 2.15.1, allowing for staggered reads; Rognes et al., 2016), and trimmed with cutadapt (version 3.0; Martin, 2011), keeping only reads containing both forward and reverse primers. After trimming, the expected error per read was estimated with vsearch’s command --fastq_filter and the option --eeout. Each sample was then dereplicated, i.e., strictly identical reads were merged, using vsearch’s command --derep_fulllength, and converted into fasta format. Clustering was performed at the sample level with swarm 3.0 using default parameters (Mahé et al., 2015). Prior to global clustering, individual fasta files (one per sample) were pooled and further dereplicated with vsearch. Files containing per-read expected error values were also dereplicated to retain only the lowest expected error for each unique sequence. Global clustering was performed with swarm (using the fastidious option). Cluster representative sequences were then searched for chimeras with vsearch’s command --uchime_denovo using default parameters (Edgar et al., 2011). Clustering results, expected error values, taxonomic assignments, and chimera detection results were used to build a “raw” occurrence table. Reads without primers, reads shorter than 32 nucleotides, and reads with uncalled bases (“N”) were discarded. For a “filtered” occurrence table, non-chimeric sequences, sequences with an expected error per nucleotide below 0.0002, and clusters containing at least 2 reads were retained. Since primer trimming is not perfect, some sequences can still contain primer fragments or be excessively trimmed. These sub- or super-sequences were identified using vsearch and merged with their closest, most abundant perfectly trimmed sequence. Finally, occurrence patterns throughout our sample collection were used to further refine the occurrence table. Clusters that contain sub-clusters with only a single-nucleotide difference but with different ecological patterns (defined here as uncorrelated abundance values in at least 5% of the samples) were turned into distinct clusters (https://github.com/frederic-mahe/fred-metabarcoding-pipeline). On the other hand, clusters with similar sequences that had correlated abundance values in at least 95% of the samples were merged using a re-implementation of lulu's method (Frøslev et al., 2017; https://github.com/frederic-mahe/mumu).
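The dereplication and filtering rules described above can be sketched in plain Python. This is an illustration under stated assumptions, not the pipeline's actual code (which relies on vsearch, cutadapt, and swarm): the `Cluster` type and all function names are hypothetical, and the three criteria for the "filtered" table are assumed to apply jointly. The thresholds (32 nucleotides, 0.0002 expected errors per nucleotide, 2 reads) are the ones stated in the text.

```python
from dataclasses import dataclass

MIN_LENGTH = 32          # reads shorter than this are discarded
MAX_EE_PER_NT = 0.0002   # expected error per nucleotide must be below this
MIN_READS = 2            # clusters must contain at least this many reads


@dataclass
class Cluster:
    """Hypothetical summary of one cluster (not a swarm data structure)."""
    seed: str              # representative sequence
    reads: int             # total number of reads in the cluster
    expected_error: float  # lowest expected error recorded for the seed
    chimeric: bool         # result of chimera detection


def is_valid_read(sequence):
    """Discard reads that are too short or contain uncalled bases ("N")."""
    return len(sequence) >= MIN_LENGTH and "N" not in sequence.upper()


def dereplicate(reads):
    """Merge strictly identical reads.

    `reads` is an iterable of (sequence, expected_error) pairs. For each
    unique sequence, keep its read count and its lowest expected error.
    """
    merged = {}
    for sequence, expected_error in reads:
        count, best = merged.get(sequence, (0, float("inf")))
        merged[sequence] = (count + 1, min(best, expected_error))
    return merged


def keep_in_filtered_table(cluster):
    """Apply the three criteria of the "filtered" occurrence table (assumed joint)."""
    ee_per_nt = cluster.expected_error / len(cluster.seed)
    return (
        not cluster.chimeric
        and ee_per_nt < MAX_EE_PER_NT
        and cluster.reads >= MIN_READS
    )


# Hypothetical reads, for illustration only.
amplicon = "ACGT" * 10
reads = [(amplicon, 0.004), (amplicon, 0.002), ("ACGN" + "ACGT" * 9, 0.001)]
unique = dereplicate(r for r in reads if is_valid_read(r[0]))
```

Keeping the lowest expected error per unique sequence mirrors the dereplication of the per-read expected error files described in the text, so that each unique sequence is judged by its best-quality copy.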
Background: Tropical members of the sponge genus Ircinia possess highly complex microbiomes that perform a broad spectrum of chemical processes that influence host fitness. Despite the pervasive role of microbiomes in Ircinia biology, it is still unknown how they remain in stable association across tropical species. To address this question, we performed a comparative analysis of the microbiomes of 11 Ircinia species using whole-metagenomic shotgun sequencing data to investigate three aspects of bacterial symbiont genomes: the redundancy in metabolic pathways across taxa, the evolution of genes involved in pathogenesis, and the nature of selection acting on genes relevant to secondary metabolism.
Results: A total of 424 new, high-quality bacterial metagenome-assembled genomes (MAGs) were produced for 10 Caribbean Ircinia species, which were evaluated alongside 113 publicly available MAGs sourced from the Pacific species Ircinia ramosa. Evidence of redundancy was discovered in that the core genes of several primary metabolic pathways could be found in the genomes of multiple bacterial taxa. Across hosts, the metagenomes were depleted in genes relevant to pathogenicity and enriched in eukaryotic-like proteins (ELPs) that likely mimic the hosts’ molecular patterning. Finally, clusters of steroid biosynthesis genes (CSGs), which appear to be under purifying selection and undergo horizontal gene transfer, were found to be a defining feature of Ircinia metagenomes.
Conclusions: These results illustrate patterns of genome evolution within highly complex microbiomes that illuminate how associations with hosts are maintained. The metabolic redundancy within the microbiomes could help buffer the hosts from changes in the ambient chemical and physical regimes and from fluctuations in the population sizes of the individual microbial strains that make up the microbiome. Additionally, the enrichment of ELPs and depletion of LPS and cellular motility genes provide a model for how alternative strategies to virulence can evolve in microbiomes undergoing mixed-mode transmission that do not ultimately result in higher levels of damage (i.e., pathogenicity) to the host. Our last set of results provides evidence that sterol biosynthesis in Ircinia-associated bacteria is widespread and that these molecules are important for the survival of bacteria in highly complex Ircinia microbiomes.
Background: For many environments, biome-specific microbial gene catalogues are being recovered using shotgun metagenomics followed by assembly and gene calling on the assembled contigs. The assembly is typically conducted either by individually assembling each sample or by co-assembling reads from all the samples. The co-assembly approach can potentially recover genes whose abundance is too low for them to be assembled from individual samples. On the other hand, combining samples increases the risk of mixing data from closely related strains, which can hamper the assembly process. In this respect, assembly on individual samples followed by clustering of (near) identical genes is preferable. Thus, both approaches have potential pros and cons, but it remains to be evaluated which assembly strategy is most effective. Here, we have evaluated three assembly strategies for generating gene catalogues from metagenomes using a dataset of 124 samples from the Baltic Sea: (1) assembly on individual samples followed by clustering of the resulting genes, (2) co-assembly on all samples, and (3) mix assembly, combining individual and co-assembly.
Results: The mix-assembly approach resulted in a more extensive nonredundant gene set than the other approaches, with more genes that were predicted to be complete and that could be functionally annotated. The mix assembly consists of 67 million genes (Baltic Sea gene set, BAGS) that have been functionally and taxonomically annotated. The majority of the BAGS genes are dissimilar (< 95% amino acid identity) to the Tara Oceans gene dataset, and hence, BAGS represents a valuable resource for brackish water research.
Conclusion: The mix-assembly approach represents a feasible way to increase the information obtained from metagenomic samples.
Additional file 3. Interactive chart of the BAGS gene set taxonomic affiliations.
A large fraction of marine primary production is performed by diverse small protists, and many of these phytoplankton are phagotrophic mixotrophs that vary widely in their capacity to consume bacterial prey. Prior analyses suggest that mixotrophic protists as a group vary in importance across ocean environments, but the mechanisms leading to broad functional diversity among mixotrophs, and the biogeochemical consequences of this, are less clear. Here we use isolates from seven major taxa to demonstrate a tradeoff between phototrophic performance (growth in the absence of prey) and phagotrophic performance (clearance rate when consuming Prochlorococcus). We then show that trophic strategy along the autotrophy-mixotrophy spectrum correlates strongly with global niche differences, across depths and across gradients of stratification and chlorophyll a. A model of competition shows that community shifts can be explained by greater fitness of faster-grazing mixotrophs when nutrients are scarce and light is plentiful. Our results illustrate how basic physiological constraints and principles of resource competition can organize complexity in the surface ocean ecosystem.
The OceanDNA MAG catalog contains over 50,000 prokaryotic genomes originating from various marine environments.
Additional file 10: Data File S1. Fasta-formatted CSGs found in bacterial symbionts of Ircinia.
Additional file 6: Table S1. Taxonomy, statistics, source metadata, and CSG occurrence in MAGs that passed QC and dereplication steps and were used in the final analyses. The archaeal MAGs and the Trichodesmium MAG that did not meet the completeness threshold are included.
Additional file 7: Table S2. Table of relative abundances of Synechococcus MAGs in Ircinia, inferred via CoverM.