Genetic variation contributes to the risk of developing endometriosis. This review summarizes gene mapping studies in endometriosis and the prospects of finding gene pathways contributing to disease using the latest genome-wide strategies.
To identify candidate-gene association studies of endometriosis, a systematic literature search was conducted in PubMed of publications up to 1 April 2008, using the search terms ‘endometriosis’ plus ‘allele’ or ‘polymorphism’ or ‘gene’. Papers were included if they gave information on both case and control selection, reported allelic and/or genotypic results for named germ-line polymorphisms and were published in the English language.
Genetic variants in 76 genes have been examined for association, but none shows convincing evidence of replication in multiple studies. There is evidence for genetic linkage to chromosomes 7 and 10, but the genes (or variants) in these regions contributing to disease risk have yet to be identified. Genome-wide association is a powerful method that has been successful in locating genetic variants contributing to a range of common diseases. Several groups are planning these studies in endometriosis. For this to be successful, the endometriosis research community must work together to genotype sufficient cases, using clearly defined disease classifications, and conduct the necessary replication studies in several thousands of cases and controls.
Genes with convincing evidence for association with endometriosis are likely to be identified in large genome-wide studies. This will provide a starting point for functional and biological studies to develop better diagnosis and treatment for this debilitating disease.
Endometriosis is a complex condition, with tissue resembling endometrium found in extra-uterine sites. Although symptoms vary, they commonly include severe pelvic pain, severe dysmenorrhea (painful periods) and reduced fertility (Giudice and Kao, 2004; Berkley et al., 2005). The main pathological processes associated with the disease are peritoneal inflammation and fibrosis, and the formation of adhesions and ovarian cysts. Different phenotypic classifications have been proposed, the main ones being the 4-stage rAFS classification (based on the total surface size of lesions, presence of adhesions and ovarian lesions) (The American Fertility Society, 1985), ovarian versus peritoneal disease (The American Fertility Society, 1985) and deep infiltrating versus superficial disease (Koninckx et al., 1999). Whether these subphenotypes represent the natural history of one disorder, or are in fact different disease subtypes altogether, is an important consideration in endometriosis research, but as yet remains unclear. The disorder often recurs and has a major impact on women's health, relationships, productivity and life choices (Mathias et al., 1996; Simoens et al., 2007). The risk of endometriosis increases with age during the reproductive years (Eskenazi and Warner, 1997); the onset can occur from the menarche onwards, but presentation after the menopause is rare, indicating that it is an estrogen-dependent condition (Bulun et al., 2002). The few established risk factors include increased exposure to menstruation (i.e. shorter cycle length, longer duration of flow and nulliparity), positive smoking history (which reduces ovarian production of estrogen and reduces risk of endometriosis) and increased peripheral body fat (which increases estrogen) (Eskenazi and Warner, 1997; Missmer and Cramer, 2003; Berkley et al., 2005).
As the diagnosis of endometriosis is made on visual inspection of the pelvis at laparoscopy, the population prevalence is unknown. The best estimates suggest endometriosis (all stages) affects 8–10% of women in their reproductive years (Eskenazi and Warner, 1997) and 20–50% of women with infertility (Gao et al., 2006), with an estimated prevalence of moderate–severe endometriosis of up to 2% (Zondervan et al., 2002). Prevalence of all stages of endometriosis in Australia was estimated at 7.2% in a volunteer sample of Australian twins (Treloar et al., 1999a). These figures represent prevalence rates in Caucasian populations; there is some evidence that the prevalence is lower in African-American women (Kyama et al., 2004), but estimates may be biased due to differential access to healthcare (Missmer et al., 2004). The most widely accepted theory to explain the origin of endometriotic tissue is that viable endometrial cells reach the peritoneal cavity through retrograde menstruation along the fallopian tubes (Sampson, 1927). However, menstrual debris is present in the peritoneal cavity of up to 90% of menstruating women. Possible explanations for susceptibility in only some women are summarized in Fig. 1 and include increased exposure to menstrual debris, abnormal eutopic endometrium, altered peritoneal environment, reduced immune surveillance and increased angiogenic capacity (Healy et al., 1998; Vinatier et al., 2001; Treloar et al., 2002; Varma et al., 2004). Biological studies have so far failed to clearly define the mechanisms leading to disease.
Endometriosis is commonly regarded as a complex trait, caused by the interplay between genetic and environmental factors. The contribution of genetic factors in endometriosis susceptibility is supported by a number of different studies (Kennedy, 1998; Zondervan et al., 2001; Simpson and Bischoff, 2002; Stefansson et al., 2002; Treloar et al., 2002; Viganò et al., 2003). Higher rates of endometriosis are found among the relatives of endometriosis cases compared with those of controls in both hospital (Kennedy et al., 1995; Simpson and Bischoff, 2002) and population-based (Stefansson et al., 2002) samples. The relative recurrence risk to sibs, which is the increase in risk of an individual whose sibling is affected compared with the risk in the general population, has been estimated at 2.34 in an Australian sample of twins and their families (Treloar et al., 2002), although estimates from imaging studies on the sisters of women with more severe disease suggest the value may be as high as 15 (Kennedy et al., 1997). It should be noted that, for endometriosis, obtaining an accurate estimate of this recurrence risk is difficult because the population prevalence is unknown and there is inevitable bias in ascertaining endometriosis cases through surgery that will influence the estimated risk to the siblings. Further evidence of genetic risk of endometriosis has been suggested by twin studies, which showed concordance for endometriosis among monozygotic twins (Moen, 1994; Hadfield et al., 1997), and increased concordance in monozygotic compared with dizygotic twins with heritability estimated at 51% (Moen, 1994; Hadfield et al., 1997; Treloar et al., 1999b, 2002). In addition to these studies in humans, familial aggregation has also been shown in non-human primates (Zondervan et al., 2004).
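The dependence of the sibling recurrence risk ratio on the assumed population prevalence can be illustrated with a minimal sketch. The function name and the sibling risk of 0.20 are hypothetical and chosen only for illustration; the three prevalence values are the estimates quoted above (up to 2% for moderate–severe disease, 7.2% in Australian twins and the upper 10% estimate).

```python
def sibling_recurrence_ratio(risk_to_siblings: float, population_prevalence: float) -> float:
    """Relative recurrence risk to sibs: risk in siblings of cases divided by population risk."""
    if not 0 < population_prevalence <= 1:
        raise ValueError("population_prevalence must be in (0, 1]")
    if not 0 <= risk_to_siblings <= 1:
        raise ValueError("risk_to_siblings must be in [0, 1]")
    return risk_to_siblings / population_prevalence


# Hypothetical risk of disease in sisters of cases (illustration only, not an estimate).
HYPOTHETICAL_SIBLING_RISK = 0.20

# The same sibling risk gives very different ratios under different prevalence assumptions.
for prevalence in (0.02, 0.072, 0.10):
    ratio = sibling_recurrence_ratio(HYPOTHETICAL_SIBLING_RISK, prevalence)
    print(f"prevalence={prevalence:.3f} -> recurrence risk ratio={ratio:.2f}")
```

Because the prevalence enters as the denominator, uncertainty in the prevalence translates directly into uncertainty in the ratio, which is one reason published estimates differ so widely.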
Recently, the genetic contribution to endometriosis has been questioned (Di and Guo, 2007). The authors point out problems with study designs due to small sample size in many studies, ascertainment bias, increased opportunity for diagnosis among family members of cases compared with controls and familial aggregation of confounding risk factors such as early age at menarche (Di and Guo, 2007). These are valid concerns particularly for some published studies, although we should acknowledge the difficulties inherent in endometriosis research because of the lack of a non-invasive diagnostic tool (Zondervan et al., 2002). The strongest evidence for genes influencing endometriosis comes from more recent, large-scale, studies in twins (Treloar et al., 1999b) and in the Icelandic population (Stefansson et al., 2002). A classical twin study in a large sample of Australian twins concluded that genetic factors contributed about half of the variation in endometriosis risk (Treloar et al., 1999b). The potential contribution of increased sharing of environmental risk-factors of endometriosis among identical versus non-identical twins (a potential alternative explanation for the observed results) was investigated and considered unlikely. Analysis of endometriosis cases in the Icelandic population addressed issues of bias in various ways and in each case reached the conclusion that there was evidence for genetic effects on endometriosis risk (Stefansson et al., 2002). Nevertheless, all studies trying to dissect the genetic and non-genetic causes using familial aggregation studies based on phenotypic observations alone must make explicit assumptions about shared environmental influences that are difficult to exclude entirely. Genetic contributions to disease can be tested directly without making these assumptions using genome-wide marker data either through linkage or association studies (Visscher et al., 2006), but very large sample sizes are required for accurate estimates.
Familial aggregation may result from comorbid conditions or traits explaining the observed findings. With regard to endometriosis, Di and Guo (2007) suggested that chromosomal regions linked to endometriosis (Treloar et al., 2005a; Zondervan et al., 2007) may be due to familial aggregation of age at menarche; itself a complex, heritable trait associated with endometriosis risk. Such an explanation—although theoretically possible—is unlikely, since linkage studies specifically designed for one complex trait of interest have low power to detect other complex traits that are loosely associated with it. Indeed, linkage analyses of age at menarche show that linkage signals for endometriosis do not overlap signals for age at menarche (see Linkage Mapping). Therefore, although individual studies carry interpretation difficulties to a greater or lesser extent, current evidence accumulated across a range of studies supports a genetic contribution to endometriosis risk.
Defining pathways to disease is a major goal of research in endometriosis. This knowledge can be used to develop more effective methods of diagnosis and treatment. Genetic studies provide one important approach to define causal pathways influencing endometriosis. The number of gene mapping studies for this disease has increased in recent years as the role of genetic factors has become more widely accepted. In addition, there have been dramatic advances in human genetics in the last few years with many recent papers reporting genetic variants associated with other complex diseases. This review summarizes current studies on genetic variation contributing to endometriosis and the prospects of finding gene pathways contributing to disease using the latest developments in high throughput genotyping and genome-wide strategies.
Gene mapping methods have been very successful in identifying mutations responsible for monogenic diseases. These mutations generally have large effects and have moderate to high penetrance. Consequently, most mutation carriers present with the disease, and most affected families carry the same mutation. Mutations with large and distinct effects are relatively easy to map and locate. Examples of mutations with large effects influencing reproductive pathways include mutations in KISS1 receptor (KISS1R) influencing idiopathic hypogonadotropic hypogonadism (de Roux et al., 2003; Seminara et al., 2003) and mutations in growth differentiation factor 9 (GDF9) increasing risk for dizygotic twinning (Palmer et al., 2006).
Finding genetic variants contributing to complex diseases such as endometriosis, diabetes or heart disease is far more difficult because the contribution of individual genes is small, many genes contribute to an individual's risk of developing the disease and disease risk is often modified by environment. However, common diseases present a much greater public health burden than Mendelian diseases, and there are major international efforts to define genetic contributions to these diseases. The basic requirements to map disease genes remain the same (Fig. 2). Many studies are required for both the discovery and replication steps with sufficient power to detect the small effects of any individual variants. In addition, different combinations of variants are likely to be present in cases in different families and cases in the same family may not all carry the same variants. Despite these limitations, gene mapping remains an important approach to understanding pathways involved in complex disease aetiology, including endometriosis.
Most genetic studies of endometriosis to date have followed a priori defined biological hypotheses and have analysed a small number of variants in candidate genes (often in small numbers of cases and controls). Candidate genes (candidates) are generally chosen based on biological mechanisms thought to contribute to disease. Variants in these candidate genes are genotyped in samples from cases and controls or in affected families to test for association by statistical analysis of genotype data.
To identify candidate-gene association studies of endometriosis, a systematic literature search was conducted in PubMed (http://www.ncbi.nlm.nih.gov/sites/entrez/) of publications up to 1 April 2008, using the search terms ‘endometriosis’ plus ‘allele’ or ‘polymorphism’ or ‘gene’. Only papers that included information on both case and control selection, and showed allelic and/or genotypic results for named germ-line polymorphisms, were included. To enable the application of these criteria by the authors, the review was limited to publications in the English language. Table I presents an overview of candidate gene studies in a format to allow readers to identify which genes have been tested and whether results have replicated in more than one study.
These studies report results from statistical tests for association with endometriosis susceptibility for 76 genes (summarized in Table I). Many of these studies have been reviewed in detail recently (Guo, 2005, 2006a,b; Falconer et al., 2007). Candidates tested include genes from detoxification pathways, sex steroid pathways and cytokine signalling pathways, adhesion molecules and matrix enzymes, and cell-cycle regulation (Falconer et al., 2007). Glutathione S-transferase enzymes involved in the pathway for detoxification of a range of toxic compounds and carcinogens have been studied extensively, in particular because of the suggestion of dioxin exposure being a risk factor (Birnbaum and Cummings, 2002), a finding which was later questioned (Rier et al., 2001; Eskenazi et al., 2002; Guo, 2004). Polymorphisms in glutathione S-transferase M1 (GSTM1) on chromosome 1p13.3 and glutathione S-transferase theta 1 (GSTT1) on chromosome 22q11.23 have been evaluated in over twenty studies (Guo, 2005). Pooled odds ratios for both enzymes (GSTM1: OR: 1.96, 95% CI: 1.29–2.98; GSTT1: OR: 1.77, 95% CI: 1.19–2.63) suggested increased risk of developing endometriosis. However, there is significant heterogeneity between studies for both enzymes and publication bias suggesting the results should be viewed with caution especially for GSTM1 (Guo, 2005). Meta-analysis for multiple studies for the detoxification enzymes N-acetyltransferase 2 (arylamine N-acetyltransferase) (NAT2) on chromosome 8p22 and cytochrome P450, family 1, subfamily A, polypeptide 1 (CYP1A1) on chromosome 15q24.1 found no evidence for association between the NAT2 acetylation polymorphism (pooled OR: 1.13, 95% CI: 0.70–1.82) and endometriosis (Guo, 2006a). There is some evidence for a small increase in risk for alleles at the MspI polymorphism in CYP1A1 (pooled OR: 1.44, 95% CI: 1.00–2.06), but the evidence is not strong, and further studies are needed to confirm the result (Guo, 2006a).
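The strength of evidence behind each pooled odds ratio quoted above can be approximated from its confidence interval. The sketch below assumes the intervals are Wald intervals that are symmetric on the log scale, which is an assumption and not something stated in the meta-analyses; the function names are illustrative, and the inputs are the rounded values given in the text.

```python
import math


def log_or_standard_error(ci_lower: float, ci_upper: float, z_crit: float = 1.96) -> float:
    """Standard error of log(OR), assuming a symmetric 95% Wald interval on the log scale."""
    return (math.log(ci_upper) - math.log(ci_lower)) / (2 * z_crit)


def z_statistic(odds_ratio: float, ci_lower: float, ci_upper: float) -> float:
    """Approximate z statistic for the null hypothesis OR = 1."""
    return math.log(odds_ratio) / log_or_standard_error(ci_lower, ci_upper)


def two_sided_p(z: float) -> float:
    """Two-sided P-value from a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Pooled odds ratios and 95% confidence intervals as quoted in the text.
pooled_estimates = {
    "GSTM1": (1.96, 1.29, 2.98),
    "GSTT1": (1.77, 1.19, 2.63),
    "NAT2": (1.13, 0.70, 1.82),
    "CYP1A1 MspI": (1.44, 1.00, 2.06),
}

for gene, (odds_ratio, lower, upper) in pooled_estimates.items():
    z = z_statistic(odds_ratio, lower, upper)
    print(f"{gene}: z={z:.2f}, approximate P={two_sided_p(z):.3f}")
```

These approximate values describe statistical evidence only; they do not address the heterogeneity and publication bias noted for the glutathione S-transferase studies.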
As noted earlier, endometriosis is an estrogen dependent disease and a number of studies have investigated genes from pathways of sex steroid biosynthesis and signalling. Review of association studies for cytochrome P450, family 17, subfamily A, polypeptide 1 (CYP17A1), cytochrome P450, family 19, subfamily A, polypeptide 1 (CYP19), androgen receptor (AR), progesterone receptor (PGR) and estrogen receptors (ESR1 and ESR2) (Guo, 2006b) concluded that many reported positive findings were unsound because of problems with data analysis in the original reports. Meta-analysis of the studies provided some, though limited, support for association between endometriosis and both the PGR-PROGINS polymorphism (pooled OR: 1.94, 95% CI: 1.31–2.88) and ESR1-PvuII polymorphism (pooled OR: 2.1, 95% CI: 1.20–3.68) (Guo, 2006b). A subsequent study in a large family-based sample failed to support any association between PGR and endometriosis (Treloar et al., 2005b).
Convincing evidence for association must include replication. Many reported associations for complex diseases fail at the replication step (Ioannidis et al., 2001; Hirschhorn et al., 2002). Fewer than half of the reported associations with endometriosis have been investigated in a separate sample and many associations are not replicated in subsequent studies (Table I). A number of factors contribute to this failure including low prior odds for association when testing only a few variants in one gene, study power, data analysis, publication bias, population differences, failure to type the same variants and technical issues. Many studies only test small samples and lack the necessary power to detect the small effects expected for most common diseases including endometriosis (Zondervan et al., 2002, 2004). Results that appear significant are more likely to be published (Ioannidis et al., 2001; Hirschhorn et al., 2002). In addition, some studies do not take adequate account of statistical problems of multiple testing (Guo, 2006b). This publication bias, together with problems in experimental designs, suggests many results are false positive associations. Studies in our large Australian sample failed to replicate putative associations for PGR and TNFA (Treloar et al., 2005b; Zhao et al., 2007). We recently typed the rs2476601 1858T/C (Trp620Arg) variant in PTPN22 (Ammendola et al., 2007) in the same sample and found no evidence for association with endometriosis (‘T’ allele frequencies in cases and controls were 0.095 and 0.091, respectively, P-value > 0.5).
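A simple allelic test shows why allele frequencies of 0.095 and 0.091 give no evidence for association. The sketch below uses a two-proportion z-test and assumes 1000 unrelated cases and 1000 unrelated controls. These sample sizes are hypothetical, since the text does not report them, and a family-based sample would in practice need a method that accounts for relatedness.

```python
import math


def allele_frequency_z_test(freq_cases: float, freq_controls: float,
                            n_cases: int, n_controls: int) -> tuple[float, float]:
    """Two-proportion z-test on allele frequencies; each person contributes two alleles."""
    alleles_cases = 2 * n_cases
    alleles_controls = 2 * n_controls
    # Pooled allele frequency under the null hypothesis of no association.
    pooled = (freq_cases * alleles_cases + freq_controls * alleles_controls) / (
        alleles_cases + alleles_controls)
    standard_error = math.sqrt(
        pooled * (1 - pooled) * (1 / alleles_cases + 1 / alleles_controls))
    z = (freq_cases - freq_controls) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return z, p_value


# Allele frequencies from the text; sample sizes are HYPOTHETICAL.
z, p_value = allele_frequency_z_test(0.095, 0.091, n_cases=1000, n_controls=1000)
print(f"z={z:.2f}, P={p_value:.2f}")
```

Under these assumed sample sizes the P-value exceeds 0.5, in line with the result reported above.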
For real associations, the strength of the true effect is often overestimated in the initial study (Ioannidis et al., 2001), an effect referred to as the ‘winner's curse’ (Zollner and Pritchard, 2007). Consequently, replication studies often need more samples and greater power than the original study to detect the effect, and some failures to replicate findings from the original study might be due to the low power of replication studies (Lohmueller et al., 2003; Zollner and Pritchard, 2007). Review of the large number of studies conducted for association with endometriosis (Table I) does not provide support for any gene variants clearly associated with increased risk of endometriosis. Some reported results may represent true associations, but given the small effect sizes expected, further studies in very large samples are required to provide convincing evidence. This can be achieved by combining samples from multiple sites. Recent candidate gene studies in breast cancer included up to 18 000 cases and 22 000 controls to identify a common variant in CASP8 associated with breast cancer risk (Cox et al., 2007), although more modest sample sizes have been used in successful studies for other common diseases.
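The winner's curse can be reproduced in a small simulation. The sketch below draws many study estimates of a log odds ratio around a true value and averages only those that reach nominal significance. The true odds ratio of 1.2, the standard error of 0.15 and the normal sampling model are all assumptions chosen for illustration.

```python
import math
import random


def simulate_winners_curse(true_or: float = 1.2, standard_error: float = 0.15,
                           n_studies: int = 20000, z_crit: float = 1.96,
                           seed: int = 1) -> tuple[float, float]:
    """Return (mean OR among significant studies, fraction of studies significant)."""
    rng = random.Random(seed)
    true_log_or = math.log(true_or)
    significant = []
    for _ in range(n_studies):
        # Each study estimates the log odds ratio with normally distributed error.
        estimate = rng.gauss(true_log_or, standard_error)
        if estimate / standard_error > z_crit:
            significant.append(estimate)
    if not significant:
        raise RuntimeError("no simulated study reached significance")
    mean_log_or = sum(significant) / len(significant)
    return math.exp(mean_log_or), len(significant) / n_studies


apparent_or, fraction_significant = simulate_winners_curse()
print(f"true OR=1.20, mean OR among significant studies={apparent_or:.2f}")
print(f"fraction of studies significant={fraction_significant:.2f}")
```

Because only estimates above the significance threshold are retained, their average must exceed the true effect. A replication study powered for the inflated estimate will therefore tend to be underpowered for the real one.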
A second gene mapping approach involves a hypothesis-free search for evidence of genomic regions harbouring genetic risk variants for endometriosis—prior to further association or sequencing studies—using linkage mapping across the genome. Over a period of 10 years, laboratories in Australia and the UK recruited families of sisters with surgically confirmed endometriosis for sib-pair linkage analysis. Sib-pair linkage studies use genome-wide analysis of informative polymorphic microsatellite markers to identify regions of significant excess sharing in affected sibs (Risch, 1990a; Kruglyak and Lander, 1995a,b; Lander and Kruglyak, 1995). For common diseases with a high genetic recurrence risk, the most informative relative pairs are distant ones. For diseases such as endometriosis with lower recurrence risk to sibs, sister pairs constitute the best design (Risch, 1990b). The affected sib method is also more suitable for those conditions where it is difficult to determine ‘unaffected’ status. In the case of endometriosis, determination of unaffected status would require a laparoscopy.
A genome scan with 1176 affected sister pair families was completed in the combined Australian and UK families (Treloar et al., 2005a) comprising the International Endogene Consortium (IEC). Power calculations suggested the study sample had 80% power to detect a locus with a recurrence risk to sisters of 1.35 (Treloar et al., 2002). Genetic markers spaced about every 10 cM (approximately every 10 Mb) across the genome were typed in DNA samples from the sisters and other family members. The combined data identified one peak of significant linkage on chromosome 10 with a maximum LOD score of 3.09 (genome-wide P-value of 0.047). A second peak on chromosome 20 showed suggestive evidence for linkage. The results were consistent for both data sets with evidence for linkage to chromosome 10 in both the Australian and the UK families. Fine mapping with an additional four microsatellite markers on chromosome 10 increased the evidence for linkage slightly. The peak of maximum linkage was located at 148.75 cM (127.92 Mb).
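The excess allele sharing that an affected sister pair design looks for can be sketched from the locus-specific recurrence risk ratio. The example below assumes a single locus acting additively with no dominance variance, under which the probability that an affected pair shares no alleles identical by descent is 0.25 divided by the recurrence risk ratio. This model and the function names are illustrative assumptions, not the analysis used in the genome scan.

```python
def expected_ibd_sharing(lambda_s: float) -> tuple[float, float, float]:
    """Expected probabilities of sharing 0, 1 and 2 alleles identical by descent
    for affected sib pairs at a disease locus (additive model, no dominance)."""
    if lambda_s < 1:
        raise ValueError("lambda_s must be at least 1")
    z0 = 0.25 / lambda_s  # sharing no alleles becomes less likely as lambda_s rises
    z1 = 0.5              # unchanged under the additive assumption
    z2 = 0.5 - z0         # remaining probability
    return z0, z1, z2


def mean_proportion_shared(lambda_s: float) -> float:
    """Expected proportion of alleles shared identical by descent."""
    _, z1, z2 = expected_ibd_sharing(lambda_s)
    return 0.5 * z1 + z2


# 1.0 corresponds to no linkage; 1.35 is the value used in the power calculation above.
for lambda_s in (1.0, 1.35):
    z0, z1, z2 = expected_ibd_sharing(lambda_s)
    print(f"lambda_s={lambda_s}: z0={z0:.3f}, z1={z1:.3f}, z2={z2:.3f}, "
          f"mean sharing={mean_proportion_shared(lambda_s):.3f}")
```

The departure from the null sharing proportion of 0.5 is modest for a locus of this size, which is why more than a thousand sister pair families were needed.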
A separate linkage analysis in a subset of families with three or more affected women (Oxford: n = 52; Australia: n = 196) was conducted to test whether the apparent concentration of cases in these families might reflect the presence of a more ‘Mendelian-like’ rare genetic variant acting in this subset of families. If present, this situation would be analogous to the discovery of BRCA1 and BRCA2 genes causal in a small subset of breast cancer patients with strong familial inheritance patterns (Miki et al., 1994; Wooster et al., 1995). The analysis in the subset of endometriosis families identified an additional peak of linkage on chromosome 7p (Zondervan et al., 2007). The combined analysis identified significant evidence for linkage to the region using a transmission model with a recessive gene conferring a high risk for developing the disease. This suggests there may be a high-penetrance susceptibility locus for endometriosis in this region present in a small subset of families.
It has been suggested that familial aggregation of endometriosis, and linkage to chromosomes 7 and 10, may be due to the association between endometriosis and age at menarche, since age at menarche is known to have a genetic component (Di and Guo, 2007). Specific gene mapping studies of age at menarche suggest that variants influencing this trait do not overlap the regions of chromosome 7 or 10 implicated in our studies of endometriosis (Treloar et al., 2005a; Guo, 2006; Rothenbuhler et al., 2006; Zondervan et al., 2007). Additionally, a large linkage study of age at menarche in 13 697 individuals and 4899 pseudo-independent sister-pairs found no evidence for significant linkage and no suggestive linkage peaks on chromosome 7 or 10 (G.W. Montgomery et al., unpublished observations). Linkage to these regions in endometriosis is therefore unlikely to be related to age at menarche, but it should be noted that common variants influencing both age at menarche and endometriosis would provide valuable information on the causes of endometriosis and have implications for variation in age at menarche.
Subsequent fine mapping and candidate gene resequencing studies have been—and continue to be—conducted for genes under the chromosome 7 and 10 linkage peaks to identify genetic variants contributing to linkage in these regions. Two genes on chromosome 10q (Fig. 3), which were previously implicated in endometriosis and endometrial cancer, have been investigated (Treloar et al., 2007). The genes are Empty Spiracles, Homolog of Drosophila, 2 (EMX2) and Phosphatase and Tensin Homolog (PTEN) (Sato et al., 2000; Kurose et al., 2001, 2002; Fujii et al., 2002; Latta and Chapman, 2002; Martini et al., 2002; Swiersz, 2002; Zhou et al., 2002; Dinulescu et al., 2005). EMX2 is a transcription factor essential for reproductive tract development also expressed in the adult uterine endometrium, with decreased expression during the luteal phase of the menstrual cycle (Troy et al., 2003; Daftary and Taylor, 2004). PTEN promotes cell survival and proliferation and inactivation of PTEN is an early event in endometrial hyperplasia and the development of ovarian and endometrial cancers (Maxwell et al., 1998). PTEN lies at 89.6 Mb, more centromeric than the region of significant linkage (Fig. 3). However, the linkage peak is broad and there is evidence for linkage and association with endometriosis in Puerto Rican families at marker D10S677 (Flores et al., 2004). D10S677 is located at 113.34 cM (95.95 Mb) close to the PTEN locus. Genotyping of a large number of markers across both genes found no evidence of association with endometriosis (Treloar et al., 2007).
More recently, another gene which lies within the region of significant linkage on chromosome 10, the fibroblast growth factor receptor 2 gene (FGFR2), was implicated in both endometrial (Pollock et al., 2007) and breast cancer (Easton et al., 2007; Hunter et al., 2007b). To investigate this gene for involvement in endometriosis, single nucleotide polymorphisms (SNPs) within intron 2 of FGFR2 including two SNPs (rs2981582 and rs1219648) significantly associated with breast cancer were genotyped. In addition, a dense set of 40 SNPs across 150 kb of the FGFR2 gene covering common variation within the coding region of the gene was genotyped in a large Australian sample of endometriosis cases and unrelated controls. No evidence for association between common variation in either intron 2 or the entire coding region in the FGFR2 gene was found, suggesting this gene is not a major contributor to endometriosis susceptibility (Zhao et al., 2008).
Experience has shown that, apart from a few notable exceptions, hypothesis-based candidate gene studies and linkage mapping followed by candidate gene targeting have been largely unsuccessful in identifying susceptibility genes for complex diseases. Even with prior linkage information limiting the genomic region of interest, it is difficult to define good candidates because we often lack sufficient knowledge about the biological processes underlying a disease, and—in addition—do not yet know the biological functions of all genes. Indeed, following on from identifying significant linkage on chromosome 10, the IEC has now turned to uncovering the gene(s) involved by utilizing further hypothesis-free fine mapping association methods using very densely spaced SNPs, chosen to most comprehensively capture the genetic architecture in the region.
We are using recently developed genome-wide association (GWA) methods, which provide a powerful approach to locating genetic variants contributing to common diseases. These methods have arisen from several genome discovery projects finalized in recent years, combined with rapid gains in the efficiency and reductions in the cost of high-throughput genotyping technologies. During 2007, the number of risk alleles for common diseases identified using these new methods exceeded the number identified in the previous decade (Petretto et al., 2007). Key developments underpinning GWA studies include the survey of common human variation by the HapMap Consortium (Consortium, 2005), and the development of high-throughput genomics platforms capable of genotyping up to 1 million SNPs in a single experiment. The human HapMap project (http://www.hapmap.org/) provided detailed information on population-specific patterns of genomic architecture and genetic variation. The results showed that common SNPs in close proximity to each other are often inherited together within populations, so that the presence of one predicts the presence of another (also termed ‘linkage disequilibrium’). The HapMap project also provided a ‘map’ indicating the dependence between these common SNPs, and this information can be used to greatly reduce the amount of genotyping required in GWA studies to ‘cover’ the whole genome with regard to common genetic variation.
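The degree to which one SNP predicts another is commonly summarized by the squared correlation between alleles, r². The sketch below computes it from a haplotype frequency and two allele frequencies. The function name and the example frequencies are hypothetical and are not taken from HapMap data.

```python
def ld_r_squared(freq_ab: float, freq_a: float, freq_b: float) -> float:
    """Squared allelic correlation between two SNPs.

    freq_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    freq_a, freq_b: frequencies of allele A and allele B
    """
    denominator = freq_a * (1 - freq_a) * freq_b * (1 - freq_b)
    if denominator <= 0:
        raise ValueError("both SNPs must be polymorphic")
    # D measures how far the haplotype frequency departs from independence.
    d = freq_ab - freq_a * freq_b
    return d * d / denominator


# Hypothetical frequencies for illustration.
print(ld_r_squared(0.30, 0.30, 0.30))  # alleles always occur together
print(ld_r_squared(0.12, 0.30, 0.40))  # alleles occur independently
print(ld_r_squared(0.25, 0.30, 0.40))  # intermediate dependence
```

When r² between two SNPs is high, genotyping one of them captures most of the information about the other, which is the basis for selecting a reduced set of ‘tag’ SNPs.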
In parallel with studies exploring genomic architecture, technological advances have led to development of platforms for genotyping many thousands of SNPs in a single experiment. Several high-throughput genotyping platforms can now type up to 1 million SNPs in an individual sample and process many samples per day. Current commercial SNP genotyping chips use sets of SNPs selected based on information from the HapMap data to achieve maximum genome coverage. The chips are being used successfully in GWA studies to discover genes contributing to complex disease.
In a recent series of papers, the Wellcome Trust Case Control Consortium (WTCCC) employed these genome-wide techniques on a common set of 3000 controls with sample collections exceeding 2000 cases. The studies reported associations for a range of complex diseases, including type 1 diabetes (Consortium, 2007; Todd et al., 2007), type 2 diabetes (Consortium, 2007; Saxena et al., 2007; Scott et al., 2007; Sladek et al., 2007), Crohn's disease (Consortium, 2007; Libioulle et al., 2007; Rioux et al., 2007) and coronary heart disease (Consortium, 2007; McPherson et al., 2007), which were subsequently replicated in further large sample collections. Genome-wide scans have also reported new variants affecting prostate cancer (Gudmundsson et al., 2007; Yeager et al., 2007) and breast cancer (Easton et al., 2007; Hunter et al., 2007a). Many of the papers reported associations to novel loci not previously implicated in the disease. Important features of these studies are that they included unprecedentedly large sample sizes, concentrated on the small number of variants with the strongest evidence of association and subsequently replicated these associations in additional large numbers of samples. Almost all the new risk alleles have odds ratios (effect sizes) <2.0 and many have odds ratios <1.5 (Petretto et al., 2007). Importantly, ~25% of associated variants are found in regions not coding for genes and would be missed in gene-centric disease discovery programmes.
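The consequence of odds ratios below 1.5 for study size can be illustrated with an approximate power calculation for a single-SNP allelic test. The sketch below uses a normal approximation and assumes a control allele frequency of 0.3 and a significance threshold of 5 × 10⁻⁸. Both values are assumptions chosen for illustration; the 2000 cases and 3000 controls follow the design described above.

```python
import math
from statistics import NormalDist


def case_allele_frequency(control_freq: float, odds_ratio: float) -> float:
    """Allele frequency in cases implied by the control frequency and the allelic odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)


def allelic_test_power(control_freq: float, odds_ratio: float,
                       n_cases: int, n_controls: int, alpha: float = 5e-8) -> float:
    """Approximate power of a two-sided allelic test (normal approximation)."""
    p0 = control_freq
    p1 = case_allele_frequency(p0, odds_ratio)
    # Each individual contributes two alleles.
    standard_error = math.sqrt(p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # The negligible chance of significance in the wrong direction is ignored.
    return NormalDist().cdf(abs(p1 - p0) / standard_error - z_crit)


# Control allele frequency 0.3 and alpha are ASSUMED values.
for odds_ratio in (1.2, 1.5):
    power = allelic_test_power(0.3, odds_ratio, n_cases=2000, n_controls=3000)
    print(f"OR={odds_ratio}: approximate power={power:.2f}")
```

Under these assumptions a study of this size detects an odds ratio of 1.5 almost always but usually misses an odds ratio of 1.2, which is consistent with the emphasis on combining samples and on large replication sets.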
The GWA methods employed by WTCCC were highly successful in identifying risk alleles in some diseases (such as Crohn's disease), but found very few or no risk alleles in others (such as hypertension and bipolar disorder) (Consortium, 2007; Todd et al., 2007). The reasons for this inconsistency are not immediately apparent. First, they may be related to the phenotype definitions representing a relatively heterogeneous case set (Zondervan and Cardon, 2007). Second, it may be that environmental factors contribute most to the aetiology of these diseases, and any genetic risk factors are of such small effect that much larger studies would be required to detect them. Moreover, even for diseases where multiple risk alleles are reported, the variants only account for a small proportion of the genetic risk. Further loci are likely to be identified by testing more variants identified in the genome-wide studies, but the effect sizes are all likely to be small.
GWA offers the prospect of making real progress in discovery of genes influencing endometriosis risk and several groups are planning GWA studies in endometriosis. There are major challenges to the successful outcome of these studies. An enduring lesson from all complex disease association studies is the need for very large studies because of the small contribution of each genetic effect, low prior odds for association and the need to adjust for testing many thousands of SNPs (Peltonen, 2007). Studies with sufficient power can only be achieved by collaboration of large consortia. Valid concerns have been expressed frequently in the past regarding the ability to dissect true findings from the large number of ‘significant’ findings that are produced from GWA studies because of the enormous number of statistical tests conducted when analysing 0.5–1 M SNP genotypes (Zondervan and Cardon, 2004; Di and Guo, 2007). However, the recent GWA and follow-up studies have shown that the combination of GWA study and multiple replication sets does enable the elucidation of common genetic variants underlying a complex trait, as long as both are sufficiently powered (including several thousands of cases and controls).
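The scale of the multiple testing problem for 0.5–1 M SNPs can be made concrete with a short calculation. The sketch below uses a Bonferroni correction, which treats all tests as independent. That is an assumption, and a conservative one, because many SNPs are correlated through linkage disequilibrium.

```python
def bonferroni_threshold(n_tests: int, family_alpha: float = 0.05) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at family_alpha."""
    return family_alpha / n_tests


def expected_false_positives(n_tests: int, nominal_alpha: float = 0.05) -> float:
    """Expected number of 'significant' results if no SNP is truly associated."""
    return n_tests * nominal_alpha


for n_snps in (500_000, 1_000_000):
    print(f"{n_snps} SNPs: "
          f"expected false positives at P<0.05 = {expected_false_positives(n_snps):.0f}, "
          f"Bonferroni threshold = {bonferroni_threshold(n_snps):.1e}")
```

Tens of thousands of nominally significant results are expected by chance alone, so only very small P-values followed by independent replication can be taken as convincing.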
Although genome-wide studies of gene–environment interaction and epistasis (gene–gene interaction) are likely to require sample sizes much in excess of those currently available, pathway-based approaches can be readily applied to GWA studies of complex disease to yield biological insights that would not be detectable by focusing only on the individual genes and/or regions that have the strongest evidence for association. For example, a typical pathway-based approach might rank all genes by their significance of association and then test whether a particular group of genes is enriched at the significant end of the ranked list more than expected by chance. Such approaches are suited to situations where multiple genes in the same pathway contribute to disease aetiology, but common variations in each of the causal genes make only modest contributions to disease risk. Their application has enormous potential both to detect novel causal pathways and disease mechanisms and to confirm hypothesized ones (Wang et al., 2007). A practical example where multiple genes in a common pathway influence a complex reproductive trait is twinning (multiple ovulation) frequency, where variation in twinning frequency is influenced by mutations in at least three genes from the intra-ovarian bone morphogenetic protein signalling pathway (Galloway et al., 2000; Wilson et al., 2001; Hanrahan et al., 2004; Palmer et al., 2006).
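The ranking-and-enrichment idea described above can be sketched in a few lines. This is a deliberately simplified stand-in, not the method of Wang et al. (2007): it compares the mean rank of a pathway's genes with that of random gene sets of the same size. The gene names, p-values and pathway memberships are invented for illustration, and the function name is hypothetical.

```python
import random


def pathway_enrichment_p(gene_pvalues, pathway_genes, n_permutations=10_000, seed=0):
    """Permutation p-value for enrichment of a pathway near the top of a ranked list.

    gene_pvalues: dict mapping gene name -> association p-value (one per gene).
    pathway_genes: iterable of gene names in the pathway of interest.
    """
    # Rank 1 is the most significant gene.
    ranked = sorted(gene_pvalues, key=gene_pvalues.get)
    rank = {gene: i + 1 for i, gene in enumerate(ranked)}

    members = {g for g in pathway_genes if g in rank}
    if not members:
        raise ValueError("no pathway genes found in the ranked list")
    observed = sum(rank[g] for g in members) / len(members)

    # Null distribution: mean rank of random gene sets of the same size.
    rng = random.Random(seed)
    all_ranks = list(rank.values())
    hits = 0
    for _ in range(n_permutations):
        sample = rng.sample(all_ranks, len(members))
        if sum(sample) / len(sample) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Toy data: 200 hypothetical genes with evenly spaced p-values.
    toy_pvalues = {f"GENE{i}": (i + 1) / 200 for i in range(200)}
    top_pathway = [f"GENE{i}" for i in range(6)]                # all near the top
    spread_pathway = [f"GENE{i}" for i in (10, 60, 110, 160, 190)]
    print(pathway_enrichment_p(toy_pvalues, top_pathway))
    print(pathway_enrichment_p(toy_pvalues, spread_pathway))
```

Permuting gene labels, as done here, ignores gene size and linkage disequilibrium between neighbouring genes; analyses of real GWA data would more usually permute case and control labels and recompute the gene-level statistics.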
Recent experience in the field of breast cancer research provides a good model of how studies might proceed to find the genes contributing to endometriosis (Cox et al., 2007; Easton et al., 2007; Hunter et al., 2007a). Studies must be carefully designed, ensuring that case definitions are similar, that large sample sizes are used and that replication is tested in multiple study populations. No individual group has sufficient samples to provide convincing evidence for association on its own, so international consortia must combine data across a range of studies and population samples (Cox et al., 2007; Easton et al., 2007; Hunter et al., 2007a). This combined effort has provided the power to demonstrate convincing evidence for variants with relatively small effects and to replicate findings from the GWA studies (Cox et al., 2007; Easton et al., 2007; Hunter et al., 2007a). Similar strategies are being used in a number of other complex diseases.
It is clear that endometriosis research groups will have to adopt comparable strategies and work closely together, combining data from as many samples as possible, to successfully identify genes contributing to this disease. The IEC, combining sample sets from Australia, the UK and the USA, has recently been awarded funding from the Australian National Health and Medical Research Council and The Wellcome Trust in the UK to conduct a GWA study in over 3000 cases and to conduct replication studies of the key variants in another 3000 cases and 3000 controls. Results are likely to be similar to those for other complex diseases, and the risk conferred by individual alleles will be small. Despite limitations in results from GWA studies, convincing evidence for variants increasing risk of endometriosis will help to define which pathways contribute to the disease. Subsequent investigations can address whether there is evidence for disease heterogeneity, with endometriosis arising through different pathways, and also examine interactions between genetic variants and environment. Given the small effect sizes, it is less likely that genetic tests will be used directly to assess individual risk of disease, but knowledge of pathways to disease may help to develop better diagnostic methods.
There is good evidence for a genetic contribution to the risk of developing endometriosis. Gene mapping studies provide an important alternative to biological studies for determining pathways and mechanisms of disease. However, endometriosis is a complex trait. Both theoretical and empirical evidence suggest many genes or variants with small effects are likely to account for the genetic risk. Consequently, powerful, well-designed studies are essential. The tools are now available to find genes predisposing to endometriosis. For this to be successful, we must follow the lead of studies in other diseases. The endometriosis research community must work together to fund the necessary GWA studies and conduct replication studies in many thousands of cases and controls. Convincing evidence for genes associated with endometriosis will provide the starting point for functional and biological studies to develop better diagnosis and treatment for this debilitating disease.
This study was supported by grants to GWM from the National Institute of Child Health and Human Development (HD050537) and National Health and Medical Research Council of Australia (339430 and 339446).