Three predisposition genes have been identified for cutaneous malignant melanoma (CMM), but they account for only about 25% of melanoma clusters/pedigrees. Linkage analyses of melanoma pedigrees from many countries have failed to identify significant linkage evidence for the remaining predisposition genes that must exist. The Utah linkage analysis approach of using singly informative extended high-risk pedigrees combined with high-density SNP markers has successfully identified significant linkage evidence for two regions. This first genome-wide linkage analysis of the extended Utah high-risk CMM pedigrees provides confirmation of linkage for a chromosome 9q region previously reported in Danish pedigrees. This report confirms that linkage analysis for common disorders can be successful in analysis of high-density markers in sets of singly informative high-risk pedigrees.
Although it is recognized that approximately 10% of melanoma is familial, only three predisposition genes have been identified as responsible for high-risk melanoma pedigrees. These three genes, CDKN2A, CDK4, and MITF together account for only 20–25% of families with multiple cases of melanoma. The gene responsible for the majority of high-risk pedigrees, CDKN2A (p16), was identified in a linkage study of high-risk Utah and Texas pedigrees. A genome-wide linkage search was never performed in this set of high-risk pedigrees; rather, following report of identification of a constitutional rearrangement of chromosomes 5 and 9 in an individual with multiple cutaneous malignant melanomas (CMM) and atypical moles (Petty et al., 1993), only these 2 regions of the genome were examined using linkage analysis and the gene was localized and cloned (Cannon-Albright et al., 1994; Kamb et al., 1994).
Genome-wide linkage studies of several populations of melanoma high-risk pedigrees have been performed, some with suggestive results, but none has yet identified additional melanoma predisposition genes. Greene et al. (1983) reported linkage analysis of 23 genetic markers in 14 high-risk pedigrees and suggested a region on chromosome 1 near the Rh locus; Bale et al. (1989) followed up on this, reporting significant evidence for linkage on chromosome 1p in 6 pedigrees; the candidate gene responsible for these results has never been identified. Nancarrow et al. (1992) performed genome-wide linkage analysis with 172 microsatellite markers in 3 large pedigrees and identified chromosome 6p as a candidate region in one of the pedigrees. This linkage was never confirmed and the pedigree was subsequently shown to carry a germline CDKN2A mutation (Walker et al., 1995a). Gillanders et al. (2003) performed a genome-wide linkage analysis for CMM in 49 Australian pedigrees for which involvement of CDKN2A and CDK4 was excluded. The best linkage evidence was for chromosome 1p22 in the subset of early onset pedigrees (mean age at diagnosis < 35 years). Analysis of 33 additional multiplex families with CMM from several continents added linkage evidence for the region, but no gene has been identified in this region. Jönsson et al. (2005) performed a genome-wide scan of 2 Danish pedigrees with multiple cases of ocular malignant melanoma (OMM) and CMM (with no germline mutations in CDKN2A, CDK4, BRCA1 and BRCA2). Significant evidence for linkage was reported for 9q21.32; this was further validated in analysis of a third Danish family with OMM and CMM; RASEF was examined, but no gene has been identified. Hoiom et al. (2011) recently analyzed 35 high-risk Swedish melanoma pedigrees with no significant findings.
Use of the Utah genealogical resource linked to statewide cancer reporting since 1966 has resulted in a successful Utah approach to cancer predisposition gene identification. Since p16 was identified in Utah pedigrees we have continued to use the Utah Population Data Base (UPDB) to identify and sample extended Utah high-risk melanoma pedigrees. Here we have performed genome-wide linkage analysis in 34 extended high-risk melanoma pedigrees using a subset of 27,000 high-density, linkage-disequilibrium (LD)-free SNPs from the Illumina 550,000 SNP marker set. Although summary results for all 34 pedigrees combined did not identify significant evidence for linkage, analysis of individual pedigrees identified significant replication evidence for a previously reported linked region. This informative and efficient study of 160 CMM cases in 34 high-risk pedigrees has validated a linkage approach using high-density markers in extended pedigrees by identifying confirmatory replication linkage evidence for the 9q21 melanoma predisposition gene localization previously reported for ocular melanoma and CMM.
Summary genome-wide het-TLODs for both the dominant and recessive models are shown in Figure 1. Although no regions reached significant evidence for linkage (LOD > 3.3) there are several regions with genome-wide suggestive evidence for linkage (LOD > 1.86). Table 1 summarizes those regions that reached a suggestive level of evidence for linkage for either the dominant or recessive model for all 34 pedigrees considered together.
Although overall multipoint consideration of Utah high-risk pedigrees did not give significant evidence for linkage, many of the Utah high-risk pedigrees are sufficiently informative to provide significant evidence for linkage when analyzed alone. Analyzing such pedigrees independently could allow identification of very rare or private segregating variants. Table 2 shows the linkage evidence for all pedigree-specific LOD scores > 1.86, ranked by LOD score. It includes two pedigrees that alone provide significant evidence for linkage (LOD > 3.30; Lander and Kruglyak, 1995), each at a different chromosomal region, and one pedigree that provided significant replication (LOD = 2.97) for the chromosome 9q21 linkage previously reported (Jönsson et al., 2005). The two other significant linkages are currently being analyzed, and are not presented here.
Figure 2 shows the segregating haplotypes in the pedigree showing linkage (LOD = 2.97) to chromosome 9q21 (Pedigree A); four of the five distantly related cases in the pedigree share a 9q haplotype. This pedigree had no known ocular melanoma cases. The average age at CMM diagnosis for the 4 haplotype-sharing cases = 55.8 years (range 40–70). Three of the four haplotype-sharing melanoma cases also had additional cancer diagnoses, including non-Hodgkin lymphoma, bladder cancer, colon cancer and prostate cancer; colon cancer was also observed in at least one obligate carrier in the pedigree. Counts of the number of nevi > 2 mm in diameter were performed for family members seen in clinic. The number of nevi was summarized into 1 of 6 categories: 0, 1–10, 11–25, 26–50, 51–100, or > 100 nevi. Counts were available for 11 individuals genotyped in this pedigree (median observed = 11–25 nevi); 5 were haplotype carriers. The median count category for non-haplotype carriers (11–25 nevi) was higher than the median observed for haplotype carriers (1–10 nevi).
One other pedigree showed a LOD score > 0.588 (nominal evidence for linkage) at the 9q region of interest, with a LOD score of 0.88 with four out of four CMM cases sharing the 9q21 haplotype (Pedigree B). Figure 3 shows the haplotype drawing of this second chromosome 9q pedigree. This pedigree also had no known ocular melanoma cases. The average age at diagnosis for the four haplotype sharing cases = 34.3 years (range 20–60). None of the haplotype sharing melanoma cases had additional cancer diagnoses. Nevus counts were available for 7 individuals genotyped in this pedigree (median = 51–100 nevi); 5 were haplotype carriers. The median count category for haplotype carriers (51–100 nevi) was higher than the median observed for non-haplotype carriers (26–50 nevi).
Following the initial linkage analysis, we identified additional stored DNA samples in Pedigree B. To test and extend the linkage data we genotyped the new samples and re-analyzed for evidence of linkage. An additional case in Pedigree B was deceased, and this individual's 9q21 haplotype was inferred from sampling 3 children. The additional CMM case was determined to share the 9q21 predisposition haplotype; the linkage LOD score for the extended pedigree with MSRs increased to +1.38. The new case was also diagnosed with breast cancer after age 60 years.
We performed SGS analysis to test for significant IBD sharing in the larger pedigree that significantly confirmed 9q21 linkage (Pedigree A). The longest run of sharing (genome-wide) spanned 1,014 markers, occurred at the linkage peak, and was shared by four of five cases in the pedigree. The empirical p-value for this shared segment was 0.008 based on 1000 replicate data sets. These results indicate that the pedigree contains a long region of IBD sharing among the affected cases that contributed to the linkage evidence.
Figure 4 shows the chromosome 9q21 region of interest as defined by our study. Of the two linked pedigrees at chromosome 9q, Pedigree A provides the narrowest boundaries in terms of LOD score evidence, ranging between 84.5 and 94.0 cM. The previously published significant linkage analysis in the Danish cohort of ocular melanoma and CMM (Jonsson, et al., 2005) supported the region spanning 76 cM to 87.3 cM. A consensus region between the two studies suggests a left boundary of 84.6 cM and a right boundary of 87.3 cM (approximately 87.6–89.2 Mb). The Jonsson study also reported a smaller region of interest (plotted as the dark arrow in Figure 4) based on a shared 9q21 haplotype appearing in multiple pedigrees, but mutation screening within that smaller region did not reveal causal variants; it is possible that such variants exist within the wider region not involving the three-marker shared haplotype. A third study reported suggestive evidence for linkage to this region for a phenotype that is closely correlated with CMM, nevus density. Linkage evidence was assessed for a cohort of twins less than age 35 years (Falchi, et al., 2006). A 1-LOD support interval surrounding the peak LOD score (2.55) covers the entire region of interest delineated by the Danish study and the present study (approximately 76–105 cM). The nevus study did not provide further localization, but does enhance interest for CMM in this genomic region.
It has been almost two decades since a significant linkage finding has been published for extended Utah high-risk melanoma pedigrees (Cannon-Albright et al., 1994), and that finding did not result from a genome-wide linkage scan. The most dense melanoma high-risk pedigrees in Utah were primarily ascribed to CDKN2A (p16; Kamb et al., 1994), but most Utah high-risk melanoma pedigrees were found not to be due to p16, indicating that other, perhaps more rare, predisposition variants may be responsible. We have continued to study high-risk CMM pedigrees and we have ascertained and sampled a new set of extended high-risk CMM pedigrees, all screened negative for CDKN2A, CDK4, and ARF, and all with a significant excess of CMM cases (p<0.05) among the descendants of a founder pair in the Utah genealogy. The melanoma cases in these pedigrees represent unexpected clusters of related CMM cases in excess among the descendants of the pedigree founders.
Our approach to this first genome-wide linkage analysis of this set of CMM pedigrees was unusual, in that we selected the very high-density SNP marker sets typically used for genome-wide association studies, rather than the smaller microsatellite repeat (MSR) or SNP linkage marker sets typically used for linkage analysis. Overall this linkage study of 34 high-risk CMM pedigrees did not identify any significant evidence for linkage for all pedigrees combined using standard heterogeneity measures. This is not an unusual outcome for a cancer linkage study, nor for any linkage study of a large set of pedigrees representing a heterogeneous disease with a complex mode of inheritance where not all pedigrees, and perhaps not even a majority of pedigrees, can be expected to be due to the same gene. However, when we considered the single homogeneous pedigrees within the Utah set, linkage analysis of one pedigree significantly confirmed linkage for a predisposition locus previously published for chromosome arm 9q; an additional linked pedigree was also identified for this region. Our approach for non-parametric confirmation of the linkage evidence used a straightforward new method (Thomas et al., 2008) to confirm evidence for IBD sharing.
A note on the underperformance of the heterogeneity LOD score in this analysis seems required. In this analysis, multiple individual pedigrees achieved significant evidence for linkage, yet the hetLOD approach failed to achieve even suggestive evidence at these regions. This outcome provides a strong caution against over-reliance on the hetLOD method to interpret linkage findings in a search for rare predisposition genes in analyses of multiple informative pedigrees.
The overlap between the current and previous linkage analyses for chromosome 9 is the most promising region for causal variants for CMM. The consensus region between the current study and the Danish study spans 87.6–89.2 Mb on chromosome 9 and contains eight genes (NTRK2, AGTPBP1 (a.k.a. KIAA1035), BC108718 (a.k.a. KIF27), MAK10, GOLM1, C9orf153, ISCA1, KIAA1711 (a.k.a. ZCCHC6)), several of which may be promising candidates for cancer predisposition. The gene NTRK2 is involved in the MAPK pathway (cell differentiation). A functional study of a point mutation in this gene identified in a melanoma cell line has been carried out (Geiger, et al., 2011). In addition, the products of this gene have also been considered as therapeutic targets for various cancers (Siu, et al., 2009; Wang, et al., 2009). Another potential candidate is the KIF27 gene, which is involved in the sonic hedgehog pathway, a pathway that has been associated with basal cell carcinoma, among other cancer types (Katoh & Katoh, 2004). In addition, the gene MAK10 is described as a corneal wound healing-related protein that is expressed in both corneal tissue and skin (Yi, et al., 2000). High expression of MAK10 is associated with rapid protein synthesis, which accompanies wound-healing events and could precipitate drastic alteration to cellular regulation (Polevoda & Sherman, 2003). Another potential candidate in this region is the GOLM1 gene, which codes for a Golgi membrane protein. The Golgi complex plays a role in post-translational modification of proteins exiting the endoplasmic reticulum. Components of the Golgi complex have been implicated in several cancers, and gene products of GOLM1 have been proposed as biomarkers for hepatic and prostate cancers (Jamaspishvili, et al., 2011; Shi, et al., 2011).
This linkage analysis for CMM significantly confirms a previous linkage reported for ocular melanoma and CMM in Danish pedigrees. Although we did not observe any ocular melanomas in the 2 linked Utah pedigrees, we did observe multiple other cancers in haplotype-sharing individuals with and without CMM. Further analysis will be required to understand the precise increased cancer risk attributable to the hypothesized CMM predisposition gene on chromosome arm 9q.
Why after so many years of linkage studies for melanoma have we only now observed significant linkage confirmation? There are probably several reasons. This is a previously unreported genome-wide linkage analysis of the extended Utah pedigrees. Most published melanoma linkage scans include families without confirmation that they represent high-risk clusters/pedigrees. In addition, the utilization of a high quality CMM phenotype (histologically confirmed for all cases in the Utah Cancer Registry) contributed to a lowering of bias from misclassification. Finally, the use of high-density SNP markers may have contributed to the power for identification of chromosomal sharing in these pedigrees, since it provides much better resolution in multipoint linkage. The shared regions in the two linked pedigrees are small enough that a less-dense MSR GW linkage scan could easily have missed them.
Clearly the existence of distantly related CMM cases in confirmed high-risk pedigrees was essential for this finding; the large number of meioses between cases provided power, and reduced the region of sharing identified. The existence of high-risk melanoma pedigrees suggests rare segregating mutations/variants for melanoma predisposition. If very rare or private predisposition alleles for melanoma exist, they might only be identified in pedigrees like these extended high-risk pedigrees studied in Utah.
It appears that the extended high-risk pedigree approach, successful early on in cancer predisposition gene identification in Utah pedigrees, has again succeeded. The significant confirmation of linkage for CMM reported here may represent identification of “missing heritability” for CMM. Two of the 34 pedigrees analyzed have evidence for 9q linkage and several Danish pedigrees are linked, so additional 9q-linked pedigrees may be identified. A search for segregating mutations is underway. Predisposition genes with rare segregating variants can be identified in a pedigree-specific search in extended pedigrees using a high-density marker map. The recent approach of increasing the number of cases and the number of markers for GWAS to enable identification of “rare variants with large effect” might more easily be accomplished by returning to a likely place to identify these variants: high-risk pedigrees.
Recently, sequence analysis of familial melanoma cases successfully identified a novel recurrent mutation in MITF that appears to predispose to familial and sporadic melanoma (Yokoyama et al., 2011). In another sequencing study of high-risk pedigrees, Wiesner et al. (2011) reported two families with segregating inactivating germline mutations in the BAP1 gene (chromosomal band 3p21.1) associated with a morphologically distinct type of melanocytic neoplasm, which sometimes displays similarity to CMM. A combination of linkage analysis in extended pedigrees to identify regions of interest, followed by sequencing of these target regions, provides a very powerful approach to identifying rare segregating predisposition variants responsible for familial melanoma.
The UPDB is a computerized resource combining the genealogies of the Utah Pioneers and their descendants with other statewide data (Skolnick, 1980). The original genealogy data included 1.6 million individuals extending to 7 generations. Addition and linking of statewide vital statistics data has enlarged the genealogy data (e.g. mother, father, child from a birth certificate); the most current version extends to 15 generations. The genealogy data is linked to the Utah Cancer Registry (UCR), an NCI Surveillance, Epidemiology and End Results (SEER) registry since 1973. The UCR has recorded every independent cancer primary diagnosed in the state since 1966 and includes data on primary site, histology, stage, grade and survival. Written informed consent was obtained for all study subjects and Institutional Review Board approval was in place for this study; the Declaration of Helsinki protocols were followed.
The UPDB is updated annually with new UCR data. For this study we identified approximately 5,000 histologically confirmed CMM cases in the UCR who also had at least 3 generations of genealogy data connecting to the original Utah genealogy. To identify high-risk pedigrees for recruitment we identified founders in the UPDB (individuals without parents) with a significant excess of CMM (p<0.05) among their descendants. We counted all descendants for each founder (a pedigree) by birth-year cohort, sex and birth-place (Utah or not), and, using CMM rates calculated from the UPDB data applied to the descendants, we calculated the expected number of CMM cases among the descendants within each pedigree. Pedigrees with a significant excess of observed to expected CMM cases were recruited to a high-risk pedigree study in which cases and relatives were examined and donated DNA. We also included pedigrees identified and sampled in our previous Utah high-risk pedigree studies (Cannon-Albright et al., 1992). Thirty-four of these high-risk pedigrees with at least 2, and up to 13 distant sampled melanoma cases were selected for a genome-wide linkage study.
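The observed-versus-expected comparison described above can be sketched in a few lines. This is a minimal illustration, not the study's actual procedure: the strata, rates, descendant counts and observed count are hypothetical, and the one-sided Poisson test is an assumption, since the text does not name the test that was used.

```python
import math

def expected_cases(descendant_counts, stratum_rates):
    """Expected CMM cases: sum over strata of (descendants) x (stratum rate)."""
    return sum(n * stratum_rates[s] for s, n in descendant_counts.items())

def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected); an assumed test of excess."""
    if observed <= 0:
        return 1.0
    below = sum(math.exp(-expected) * expected ** k / math.factorial(k)
                for k in range(observed))
    return max(0.0, 1.0 - below)

# Hypothetical strata keyed by (birth-year cohort, sex, birthplace).
rates = {
    ("1900-1919", "M", "Utah"): 0.004,
    ("1900-1919", "F", "Utah"): 0.003,
    ("1920-1939", "M", "Utah"): 0.006,
}
# Hypothetical counts of descendants of one founder in each stratum.
descendants = {
    ("1900-1919", "M", "Utah"): 400,
    ("1900-1919", "F", "Utah"): 350,
    ("1920-1939", "M", "Utah"): 500,
}

expected = expected_cases(descendants, rates)
observed = 12  # hypothetical number of CMM cases among the descendants
p_value = poisson_upper_tail(observed, expected)
high_risk = p_value < 0.05  # pedigree qualifies as high-risk under this sketch
```

Stratifying the rates by cohort, sex and birthplace means that a pedigree is compared against what would be expected for descendants with the same demographic makeup, rather than against a single population-wide rate.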
Typically genome-wide linkage scans are accomplished with hundreds of microsatellite markers or, more recently, with a slightly larger set of SNP markers (e.g. Illumina’s 6k SNP set). We chose instead to use the Illumina 550k SNP marker set, which is more commonly used for Genome Wide Association Studies (GWAS). This allowed us to perform standard linkage analysis with much more dense coverage than has been accomplished in the past. The GWAS analysis of these data is published elsewhere (Teerlink et al., 2012). We confirmed the standard linkage analysis presented here with results obtained using a new linkage-like approach that efficiently identifies regions shared identical by descent (IBD) from a common ancestor (Thomas et al., 2008).
Many fewer markers are necessary for genome-wide linkage analysis than for genome-wide association analysis (GWAS). For linkage analysis, markers in LD should not be used, because allele sharing that results from LD rather than linkage can produce false positives. A much smaller subset of SNP markers can represent the entire genome. We reduced the set of 550,000 genotyped SNPs to a low-LD subset of markers for linkage by selecting SNPs with higher heterozygosity and low or no LD, using publicly available HapMap data for 30 trios of U.S. residents of northern and western European ancestry (CEU). Specifically, we selected markers with a minimum spacing of 0.1 cM and a minimum heterozygosity of 0.3, and, using a sliding window of 500,000 bp, we minimized r² between markers (maximum r² of 0.16). This strategy resulted in selection of 27,157 markers genome-wide with a median heterozygosity of 0.49 and median spacing of 0.14 cM.
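The three selection criteria can be illustrated with a simple greedy pass along one chromosome. This is only a sketch under stated assumptions: the function name, the marker records and the pairwise r² lookup are hypothetical, and a greedy left-to-right filter is a simplification that enforces the r² ceiling without attempting the minimization described in the text.

```python
def select_linkage_markers(markers, r2, min_spacing_cm=0.1, min_het=0.3,
                           window_bp=500_000, max_r2=0.16):
    """Greedy thinning of position-sorted markers on a single chromosome.

    markers: list of dicts with keys 'name', 'cm', 'bp', 'het'.
    r2: function (name_a, name_b) -> pairwise r^2 from a reference panel.
    """
    kept = []
    for m in markers:
        if m["het"] < min_het:
            continue  # too uninformative for linkage
        if kept and m["cm"] - kept[-1]["cm"] < min_spacing_cm:
            continue  # too close to the previously kept marker
        in_window = [k for k in kept if m["bp"] - k["bp"] <= window_bp]
        if any(r2(k["name"], m["name"]) > max_r2 for k in in_window):
            continue  # in LD with a kept marker inside the sliding window
        kept.append(m)
    return kept

# Hypothetical markers and reference-panel r^2 values.
markers = [
    {"name": "rs_a", "cm": 0.00, "bp": 100_000, "het": 0.48},
    {"name": "rs_b", "cm": 0.05, "bp": 150_000, "het": 0.50},
    {"name": "rs_c", "cm": 0.20, "bp": 300_000, "het": 0.20},
    {"name": "rs_d", "cm": 0.25, "bp": 350_000, "het": 0.45},
    {"name": "rs_e", "cm": 0.40, "bp": 500_000, "het": 0.42},
    {"name": "rs_f", "cm": 1.20, "bp": 1_500_000, "het": 0.35},
]
r2_table = {frozenset(("rs_a", "rs_d")): 0.40,
            frozenset(("rs_a", "rs_e")): 0.05}

def lookup_r2(a, b):
    """Return the tabulated r^2 for a pair, or 0.0 if the pair is not listed."""
    return r2_table.get(frozenset((a, b)), 0.0)

selected = [m["name"] for m in select_linkage_markers(markers, lookup_r2)]
```

In this toy input, one marker is dropped for each reason (spacing, heterozygosity, LD), which makes it easy to see how each filter acts on its own.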
We used MCLINK (Thomas et al., 2000) software for linkage analysis. MCLINK uses Markov chain Monte Carlo techniques, including blocked Gibbs sampling, to generate haplotype reconstructions that are used to extract inheritance information in pedigrees (Thomas et al., 2000; Camp et al., 2001; Abkevich et al., 2001). In addition to calculating standard multipoint LOD scores, the program also calculates robust multipoint LOD scores (TLODs; Göring and Terwilliger, 2000). Uncertainties about the genetic model, locus heterogeneity, and misdiagnosis can complicate conventional multipoint LOD score analysis. However, similar to the two-point linkage statistic, the TLOD incorporates optimization of the recombination fraction (theta) in the statistic’s parameterization, thereby retaining the robustness of the two-point LOD to model misspecification while also benefiting from multipoint information. This procedure allows model misspecification to be "absorbed" by the estimate of theta, while retaining the increased informativeness of a multipoint analysis. Our analyses also included the heterogeneity LOD score approach to account for interfamilial heterogeneity (Ott, 1986). We analyzed all pedigrees using an affecteds-only model that assumed a disease gene frequency of 0.005 for a dominant model and 0.05 for a recessive model. The penetrance estimates for carriers and non-carriers were 0.5 and 0.0005, respectively. We considered LOD scores > 1.86 (corresponding to a false-positive rate of one per genome) as suggestive evidence for linkage, and scores > 3.30 as significant, as defined by Lander and Kruglyak (1995), whether considering evidence for 1 pedigree or for multiple pedigrees.
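To make the heterogeneity LOD and the two thresholds concrete, the sketch below uses the standard admixture form, HLOD(α) = Σ log10(α·10^LOD_i + 1 − α), maximized over the proportion α of linked pedigrees at a single position. It is an illustration only: it is not the het-TLOD computed by MCLINK (which also optimizes theta), the grid search is an assumed shortcut, and the per-pedigree LOD scores are hypothetical.

```python
import math

def hlod(alpha, pedigree_lods):
    """Admixture heterogeneity LOD at one position for a given alpha."""
    return sum(math.log10(alpha * 10 ** z + (1 - alpha)) for z in pedigree_lods)

def max_hlod(pedigree_lods, steps=1000):
    """Grid search over alpha in [0, 1]; returns (best HLOD, best alpha)."""
    return max((hlod(i / steps, pedigree_lods), i / steps)
               for i in range(steps + 1))

def classify(lod):
    """Apply the Lander and Kruglyak (1995) thresholds used in this study."""
    if lod > 3.30:
        return "significant"
    if lod > 1.86:
        return "suggestive"
    return "below suggestive"

# Hypothetical position: one linked pedigree among 34, the rest unlinked.
lods = [3.5] + [-0.5] * 33
best_hlod, best_alpha = max_hlod(lods)
single_pedigree_call = classify(lods[0])
combined_call = classify(best_hlod)
```

With these made-up inputs, the single linked pedigree is classified as significant while the combined statistic is not, which is the kind of dilution that can arise when only a small fraction of pedigrees segregate a variant at a given locus.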
To more fully establish the statistical significance of results obtained from the parametric linkage approach, given the frequently large genetic distances between cases in the pedigree resource, we followed up significant linkage results with a non-parametric method designed to identify regions shared IBD among distantly related cases, which we refer to as shared genomic segments (SGS) (Thomas et al., 2008). The SGS method assumes that in regions where individuals are observed to share alleles, regions shared IBD will be longer than regions shared merely identical by state (IBS). To determine the statistical significance of the length of a region shared among distantly related cases, we simulated the typical length of IBS sharing using random gene drop with founder haplotypes generated from a model for LD (Thomas, 2010). We estimated the LD model from a combined set of all genotyped pedigree members and 500 ethnically matched controls selected from the iControls database who were also genotyped on the Illumina 550K SNP platform. We empirically assessed significance by comparing the length of the longest observed segment across the genome that could be shared by affected pedigree members to the distribution of the longest such segment obtained from the simulated data sets. Significant rejection of the null hypothesis indicates that the observed sharing is IBD.
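The core of this comparison, finding the longest run of markers at which all cases could share an allele and ranking it against simulated runs, can be sketched as follows. This is a simplified illustration rather than the published SGS implementation: the function names and genotypes are hypothetical, the gene-drop simulation under an LD model is not implemented (simulated run lengths are taken as given), and the empirical p-value is assumed to be the simple proportion of replicates at least as long as the observed run.

```python
def shares_allele(genotypes):
    """True if every genotype (a pair of alleles) has an allele in common."""
    common = set(genotypes[0])
    for g in genotypes[1:]:
        common &= set(g)
    return bool(common)

def longest_shared_run(case_genotypes):
    """Longest run of consecutive markers at which all cases could share IBS.

    case_genotypes: one list per case, each a list of (allele1, allele2)
    tuples ordered by marker position.
    """
    best = run = 0
    for j in range(len(case_genotypes[0])):
        if shares_allele([case[j] for case in case_genotypes]):
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best

def empirical_p(observed_length, simulated_lengths):
    """Proportion of simulated longest runs at least as long as observed."""
    hits = sum(1 for s in simulated_lengths if s >= observed_length)
    return hits / len(simulated_lengths)

# Hypothetical genotypes for three cases at five markers.
cases = [
    [("A", "G"), ("C", "C"), ("T", "T"), ("A", "A"), ("G", "G")],
    [("A", "A"), ("C", "T"), ("T", "C"), ("G", "G"), ("G", "A")],
    [("G", "A"), ("C", "C"), ("T", "T"), ("A", "G"), ("G", "G")],
]
observed_run = longest_shared_run(cases)

# Hypothetical null distribution of longest runs from 1000 replicates.
simulated = [1200] * 8 + [300] * 992
p_value = empirical_p(1014, simulated)
```

Because the statistic is the longest run anywhere in the genome, the empirical p-value obtained this way is already genome-wide and needs no further correction for the number of positions examined.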
This research has been supported by the European Commission under the 6th Framework Programme, Contract nr: LSHC-CT-2006-018702; NIH National Cancer Institute grant R01 CA102422 (to L.A. Cannon Albright); and Award Number P30CA042014 from the National Cancer Institute. Research was also supported by the Utah Cancer Registry, which is funded by Contract No. HHSN261201000026C from the National Cancer Institute's SEER Program with additional support from the Utah State Department of Health and the University of Utah. Partial support for all data sets within the Utah Population Database (UPDB) was provided by Huntsman Cancer Institute, University of Utah and the Huntsman Cancer Institute's Cancer Center Support grant (P30 CA42014) from the National Cancer Institute.
CONFLICT OF INTEREST
The authors state no conflict of interest.