In the past few years, the availability of improved sequencing methods, including pyrosequencing [1], has revolutionized what we know about the microbes that inhabit our bodies. Although it has been known for decades that our microbial symbionts outnumber our own cells by about a factor of 10 [2], the differences in the repertoires of symbionts harbored by different healthy individuals, by different sites within the individual, and by individuals over time are only now coming to light. Initially, it was assumed that a 'core microbiome' existed; that is, that a substantial number of microbial species was shared in each body habitat in all or most humans, and that the genomes of these core species could be used as scaffolds to assemble fragmentary data from short-read shotgun sequencing of microbial community DNA [3]. The first three individuals whose gut microbiomes were surveyed using substantial numbers of 16S rRNA gene sequences shared few of their species, however [4]. Similarly, observations that a person's left and right hands have only 17% of bacterial species in common, and that two different people's hands share only 13% [5], cast doubt on the concept of a substantial core set of microbial species shared by all or most people. This doubt has been reinforced by recent work that redefines lineages or genes as 'core' even if they are shared by relatively few people [6]. In fact, on the basis of 16S rRNA gene analyses we can rule out the possibility that, even within relatively homogeneous small populations of fewer than 100 individuals, everyone's skin-surface communities or gut communities share more than a tiny fraction of species [6]. This unanticipated variability in shared community membership, and also in other important aspects of the human microbiome, poses substantial conceptual and computational challenges.
Of particular importance for microbiome studies is the following question: what is the effect size? That is, using standard terminology from statistics, how distinguishable are two communities or groups of communities? Obtaining an answer is essential for addressing many practical concerns with experimental design. For example, the effect size determines how many individuals need to be recruited for a given study, and how many sequences need to be collected per sample to observe differences if they exist. These considerations are particularly important for the study of systemic disorders such as diabetes or some autoimmune disorders, which are expected to influence the microbiome in multiple body habitats. We need a sense of how much variation exists among different body habitats, how much variation is observed among healthy individuals for the same body habitat, and how much of a shift occurs due to a pathophysiologic state. It is also important to define the most appropriate method for determining the magnitude of similarity or difference between communities, as the choice of method has a large influence on the results of community comparisons [9]. A general discussion of the pros and cons of different metrics of community overlap is beyond the scope of this paper (see [9] for reviews). Here, we summarize the types and sizes of effects found in studies that used various methods of comparing groups of samples, and look for large-scale patterns that can give information on the number of individuals and sequences that are needed to observe different types of effects (Figure ).
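The link between effect size and the number of individuals to recruit can be made concrete with a small calculation. The sketch below is only an illustration: it assumes the effect is expressed as Cohen's d (a standardized difference between two group means) and uses the standard normal-approximation formula for a two-sided, two-group comparison. Neither choice comes from the studies reviewed here, and the function names and default values of `alpha` and `power` are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def cohens_d(mean_a, mean_b, pooled_sd):
    """Standardized difference between two group means (assumed effect-size measure)."""
    return abs(mean_a - mean_b) / pooled_sd


def samples_per_group(d, alpha=0.05, power=0.8):
    """Approximate individuals needed per group to detect effect size d.

    Normal approximation for a two-sided comparison of two group means;
    alpha and power are conventional defaults, not values from the text.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Smaller effects require many more individuals per group.
for d in (2.0, 1.0, 0.5, 0.2):
    print(d, samples_per_group(d))
```

Under these assumptions the required group size grows with the inverse square of the effect size, so halving the effect roughly quadruples the number of individuals needed.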
Figure 1 The problem of distinguishing between sequences. (a) An investigator contemplating the problem of distinguishing between sequences from the gut of Equus asinus and the volar forearm of humans. (b) Our solution; guess the effect size based on the effect ...
Figure 2 Variation in human body habitats within and between people. (a) The full dataset (approximately 1,500 sequences per sample); (b) the dataset sampled at only 10 sequences per sample, showing the same pattern; (c) the relationship between sequencing depth ...
A variety of interrelated features differentiate microbial communities. These features include the relative abundance of specific taxa (the proportion of the bacteria in the sample that are Firmicutes, for example), the level of species richness or diversity observed within a community (alpha diversity), and the degree to which different communities share membership or structure (beta diversity). A major challenge in comparing studies is that there is no consistent way in which the size of community differences is reported, as the type of difference that is relevant depends on the study. For example, lean and obese mice and humans differ in their ratios of prominent bacterial phyla (Bacteroidetes (which include the common gut commensal Bacteroides), Firmicutes (Gram-positive bacteria, including Lactobacillus and Clostridium), and Actinobacteria (which include Corynebacteria)); men's and women's hands differ in the number of species-level phylotypes (defined as organisms with 16S sequence identity >97%) observed on average [5]; and samples from the same or similar sites on the bodies of different individuals cluster together using UniFrac-based principal coordinates analysis [4]. UniFrac is a metric for comparing microbial communities using phylogenetic information, which has been implemented in several tools.
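The three kinds of feature described above (relative abundance, alpha diversity, and beta diversity) can each be computed from a table of per-taxon sequence counts. The following is a minimal sketch on made-up counts. The Shannon index, Jaccard overlap, and Bray-Curtis dissimilarity are used here as simple, non-phylogenetic stand-ins; they are not UniFrac, which additionally requires a phylogenetic tree. All function names and sample values are hypothetical, and each sample is assumed to contain at least one sequence.

```python
from math import log


def relative_abundance(counts):
    """Proportion of sequences assigned to each taxon."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def richness(counts):
    """Alpha diversity as the number of taxa observed."""
    return sum(1 for n in counts.values() if n > 0)


def shannon(counts):
    """Alpha diversity as the Shannon index (natural log)."""
    return -sum(p * log(p) for p in relative_abundance(counts).values() if p > 0)


def jaccard_shared(a, b):
    """Beta diversity by membership: fraction of all observed taxa found in both samples."""
    present_a = {t for t, n in a.items() if n > 0}
    present_b = {t for t, n in b.items() if n > 0}
    return len(present_a & present_b) / len(present_a | present_b)


def bray_curtis(a, b):
    """Beta diversity by structure: 0 for identical counts, 1 for no shared taxa."""
    shared = sum(min(a.get(t, 0), b.get(t, 0)) for t in set(a) | set(b))
    return 1 - 2 * shared / (sum(a.values()) + sum(b.values()))


# Made-up counts for two samples (illustration only).
sample_1 = {"Bacteroides": 60, "Lactobacillus": 30, "Clostridium": 10}
sample_2 = {"Bacteroides": 20, "Corynebacterium": 80}

print(relative_abundance(sample_1)["Bacteroides"])
print(richness(sample_1), shannon(sample_1))
print(jaccard_shared(sample_1, sample_2), bray_curtis(sample_1, sample_2))
```

The two beta-diversity measures can disagree, because one looks only at shared membership and the other weights taxa by abundance, which is one reason effect sizes reported with different metrics are hard to compare.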
Because of the diverse ways in which microbial communities respond to various environmental factors, it is difficult to compare effect sizes across different studies or systems, as an analysis that highlights differences in one system may obscure them in another. Thus, in what follows, we review effect types and sizes as reported by the authors of individual studies. We focus on variation in human-associated microbial community diversity as assessed by 16S rRNA gene sequence surveys of abundant lineages, using various measures of both within- and between-sample diversity (alpha and beta diversity, respectively). We review comparisons of microbial communities in relation to both sampling depth (that is, number of sequences per sample) and breadth (that is, number of samples or individuals). We then perform simulations using an atlas of microbes associated with different sites in the human body to ask how many sequences per sample are needed in order to detect differences across individuals, time, and locations within the body.
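The idea behind such a depth simulation can be sketched in a few lines. The example below is not the procedure used in this paper: it assumes two invented communities with fixed taxon proportions, simulates sequencing by random draws, and uses Bray-Curtis dissimilarity rather than UniFrac. The community compositions, depths, replicate count, and function names are all hypothetical.

```python
import random
from collections import Counter


def draw_sample(community, depth, rng):
    """Simulate sequencing `depth` reads from a community given as {taxon: proportion}."""
    taxa = list(community)
    weights = [community[t] for t in taxa]
    return Counter(rng.choices(taxa, weights=weights, k=depth))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two Counters of per-taxon read counts."""
    shared = sum(min(a[t], b[t]) for t in set(a) | set(b))
    return 1 - 2 * shared / (sum(a.values()) + sum(b.values()))


def mean_distance(comm_x, comm_y, depth, rng, replicates=200):
    """Average dissimilarity between simulated samples of two communities."""
    total = 0.0
    for _ in range(replicates):
        total += bray_curtis(draw_sample(comm_x, depth, rng),
                             draw_sample(comm_y, depth, rng))
    return total / replicates


# Invented compositions for two body habitats (illustration only).
gut = {"Bacteroides": 0.6, "Clostridium": 0.3, "Lactobacillus": 0.1}
skin = {"Corynebacterium": 0.7, "Lactobacillus": 0.2, "Bacteroides": 0.1}

rng = random.Random(0)  # fixed seed so the run is repeatable
results = {}
for depth in (10, 100, 1000):
    within = mean_distance(gut, gut, depth, rng)
    between = mean_distance(gut, skin, depth, rng)
    results[depth] = (within, between)
    print(depth, round(within, 3), round(between, 3))
```

When the underlying difference between habitats is large, as in this invented case, the between-habitat distance exceeds the within-habitat distance even at very shallow depth. Deeper sequencing mainly shrinks the sampling noise within a habitat, which matters most when the effect being sought is small.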