Metagenomics is a rapidly developing scientific field that generates tremendous amounts of experimental data via high-throughput sequencing technologies. However, it comes with a two-part challenge: how to handle these vast quantities of data, and how to use that information to address biological questions and understand community-level functional processes. In this study, we describe a novel framework and approach for discerning network interactions using high-throughput sequencing-based metagenomic data. This approach allows microbiologists to address research questions about network interactions that could not be approached previously, and thus it should represent a research paradigm shift in metagenomic analysis.
In this study, the pairwise correlations of relative OTU abundance across different samples were used to delineate an adjacency matrix for network construction. Based on this adjacency matrix, a network graph was constructed to represent positive or negative interactions among different OTUs. Thus, a network connection between two OTUs in fact describes the co-occurrence of these two OTUs across different samples but not necessarily their physical interactions. In other words, both OTUs might be responding to a common environmental parameter rather than interacting directly.
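The construction described above can be illustrated with a minimal sketch. It assumes Pearson correlation, a samples-by-OTUs table of relative abundances, and a caller-supplied cutoff; the function name `signed_adjacency` and the `cutoff` parameter are hypothetical and stand in for whatever threshold the network method selects.

```python
import numpy as np


def signed_adjacency(abundance, cutoff):
    """Build a signed adjacency matrix from OTU relative abundances.

    abundance: 2-D array, rows are samples and columns are OTUs.
    cutoff: hypothetical correlation threshold (an assumption of this sketch).
    Returns (corr, adj), where adj holds +1, -1, or 0 for each OTU pair.
    """
    abundance = np.asarray(abundance, dtype=float)
    # Pairwise Pearson correlation between OTU columns across samples.
    corr = np.corrcoef(abundance, rowvar=False)
    # Keep only strong correlations and record their sign as the link type.
    adj = np.where(np.abs(corr) >= cutoff, np.sign(corr), 0.0)
    # An OTU is not linked to itself.
    np.fill_diagonal(adj, 0.0)
    return corr, adj


# Toy table: OTU 1 tracks OTU 0, OTU 2 moves in the opposite direction.
table = np.array([[1, 2, 4], [2, 4, 3], [3, 6, 2], [4, 8, 1]])
corr, adj = signed_adjacency(table, cutoff=0.8)
```

A link in `adj` only records co-occurrence (positive) or co-exclusion (negative) across samples; as noted above, it does not by itself establish a direct physical interaction.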
Compared to other Pearson correlation-based relevance network approaches (35), the network approach described here has several advantages (18). First, this approach was developed based on the two universal laws of RMT, and thus, it should be suitable for various biological systems (e.g., cells, populations, communities, and ecosystems). Theoretically, the results obtained with an RMT approach should be more robust and consistent and should more accurately reflect the nature of the complex systems under study. Second, the majority of relevance network analysis methods define the adjacency matrix for network construction using arbitrary thresholds based on known biological information (28). As a result, the networks obtained vary with the thresholds selected. Selecting an appropriate threshold for network construction is a great challenge, however, especially for poorly studied organisms and/or microbial communities. In contrast, the novel RMT-based approach developed here defines thresholds for network construction automatically, and hence the constructed networks carry no ambiguity from threshold choice. Moreover, RMT is useful in separating noise from nonrandom, system-specific features, and hence the networks identified should be more accurate and reliable (18). This is particularly important for dealing with high-throughput metagenomic data because such data generally have an inherently high noise level.
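To make the idea of a data-driven threshold concrete, the following is a rough sketch, not the procedure used in this study. RMT-based threshold selection is generally described as scanning cutoffs until the eigenvalue spacing statistics of the thresholded correlation matrix move from the Gaussian orthogonal ensemble (GOE) regime to the Poisson regime. For simplicity, this sketch swaps in the mean ratio of consecutive eigenvalue spacings as the test statistic, which avoids spectral unfolding; the reference values (about 0.53 for GOE, about 0.386 for Poisson), the function names, and the threshold grid are all assumptions of the example.

```python
import numpy as np

GOE_MEAN_RATIO = 0.5307                 # approximate value for GOE spectra
POISSON_MEAN_RATIO = 2 * np.log(2) - 1  # about 0.386 for uncorrelated levels


def mean_spacing_ratio(eigenvalues):
    """Mean ratio of consecutive eigenvalue spacings, min(s1, s2) / max(s1, s2)."""
    ev = np.sort(np.asarray(eigenvalues, dtype=float))
    spacing = np.diff(ev)
    lo = np.minimum(spacing[:-1], spacing[1:])
    hi = np.maximum(spacing[:-1], spacing[1:])
    keep = hi > 0  # skip exactly degenerate neighbours
    if not keep.any():
        return float("nan")
    return float(np.mean(lo[keep] / hi[keep]))


def scan_threshold(corr, thresholds):
    """Return the first cutoff whose spectrum looks closer to Poisson than GOE.

    corr: symmetric correlation matrix; thresholds: increasing candidate cutoffs.
    Returns None if no candidate reaches the Poisson-like regime.
    """
    corr = np.asarray(corr, dtype=float)
    for t in thresholds:
        # Zero out weak correlations, keep the diagonal at 1.
        m = np.where(np.abs(corr) >= t, corr, 0.0)
        np.fill_diagonal(m, 1.0)
        r = mean_spacing_ratio(np.linalg.eigvalsh(m))
        if abs(r - POISSON_MEAN_RATIO) < abs(r - GOE_MEAN_RATIO):
            return float(t)
    return None
```

The point of the sketch is the design choice rather than the specific statistic: the cutoff is read off from a property of the data's own spectrum instead of being fixed in advance from prior biological knowledge.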
The identification and characterization of OTU co-occurrence modules represent a new approach for detecting the interactions of microbial populations in a community. Based on the oft-invoked principle of guilt by association (26), the abundance changes in the microbial populations with strong module memberships are probably driven by the same underlying factors. Thus, it is reasonable to hypothesize that the microbial populations with strong module memberships are physically and/or functionally associated in a microbial community. This hypothesis has important implications not only for our understanding of the interactions and ecological functions of the known cultivated microorganisms but also for predicting the potential ecological roles of as-yet-uncultivated microorganisms. As shown in this study, the modularity, module memberships, topological roles, interaction patterns (positive, zero, or negative), and phylogenetic relationships of individual OTUs are rich sources of new hypotheses for identifying key microbial populations and for understanding their interactions and ecological roles in grassland microbial communities.
Identification of keystone populations is a critical issue in ecology, but it is very difficult to achieve, especially in microbial communities, given their extreme complexity, high diversity, and uncultivated status. As demonstrated in this study, key populations could be identified based on network topology, module memberships, and/or their relationships to ecosystem functional traits. The conceptual framework developed in this study could provide important information on the candidate genes/populations most important to certain ecosystem processes and functioning. This could be particularly important in ecosystem modeling studies, in which microbial community structure must be appropriately simplified prior to its incorporation into ecosystem models.
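As an illustration of how network topology and module membership can flag candidate key populations, the sketch below computes two widely used node-level measures: within-module connectivity (z) and the participation coefficient (P). Whether these exact measures and any particular cutoffs apply here is an assumption of the example; the function name `topological_roles` and the input layout are hypothetical.

```python
import numpy as np


def topological_roles(adj, modules):
    """Within-module connectivity z and participation coefficient P per node.

    adj: square adjacency matrix (any nonzero entry counts as a link).
    modules: one module label per node (assumed to come from a prior
             module-detection step, which is not shown here).
    """
    a = (np.asarray(adj) != 0).astype(float)
    np.fill_diagonal(a, 0.0)
    modules = np.asarray(modules)
    degree = a.sum(axis=1)
    z = np.zeros(len(modules))
    squared_fraction = np.zeros(len(modules))
    for label in np.unique(modules):
        members = modules == label
        # Number of links from every node into this module.
        links_into = a[:, members].sum(axis=1)
        with np.errstate(divide="ignore", invalid="ignore"):
            frac = np.where(degree > 0, links_into / degree, 0.0)
        squared_fraction += frac ** 2
        # z-score of within-module degree among the module's own members.
        within = links_into[members]
        sd = within.std()
        if sd > 0:
            z[members] = (within - within.mean()) / sd
    # P is 0 when all links stay in one module and grows as links spread out.
    p = np.where(degree > 0, 1.0 - squared_fraction, 0.0)
    return z, p


# Two triangles (nodes 0-2 and 3-5) joined by a single link between 2 and 3.
links = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    links[i, j] = links[j, i] = 1
z, p = topological_roles(links, [0, 0, 0, 1, 1, 1])
```

In this toy network, nodes 2 and 3 are the only ones with a nonzero participation coefficient, which matches the intuition that populations bridging modules are candidates for closer inspection.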
Knowledge of the responses of biological communities to eCO2 and their mechanisms is critical for projecting future climate change (6). In this study, we demonstrated the impacts of eCO2 on the network interactions among different phylogenetic groups/populations based on high-throughput metagenomic sequencing data, as well as the relationships between network structure and soil properties. The network interactions among different microbial phylogenetic groups/populations are greatly affected by eCO2 in this grassland ecosystem. These results are consistent with our previous study of fMENs (18) and other studies of macroecology (40). To the best of our knowledge, this is the first study to document the changes in network interactions among different phylogenetic groups/populations of microbial communities in response to eCO2.
The relationship between biodiversity and ecosystem functioning has emerged as a central issue in ecological and environmental sciences (41) and is one of the great scientific challenges of the 21st century (48). Traditionally, almost all biodiversity studies in microbial ecology consider just species richness and abundance and ignore the interactions among different microorganisms. However, network interactions could be more important to ecosystem processes and functions than species diversity (the parts list). In this study, we developed a novel conceptual framework for determining network interactions among different phylogenetic groups/populations in microbial communities based on high-throughput metagenomic sequencing data. This novel framework will allow microbial ecologists to examine research issues beyond microbial species richness and abundance. The developed pMEN framework and information on the responses of network structure to eCO2 should have a profound impact on the study of biodiversity, ecosystem ecology, systems microbiology, and climate change.