Genomic rearrangements, including rare copy number variations (CNVs), contribute significantly to disease susceptibility in sporadic genomic disorders and some inherited Mendelian diseases [1-6]. Although genomic rearrangements are among the most common somatic genetic events in human cancers, germline CNVs, particularly inherited duplications that are the major mechanism for disease susceptibility, have seldom been reported in familial cancer syndromes. We provide here evidence that duplications of the T gene confer major susceptibility to familial chordoma.
Chordoma is a rare bone cancer that is believed to originate from notochordal remnants [7]. We previously reported genetic linkage to chromosome 7q33 in three multiplex chordoma families (Families 1-3) but not in a fourth chordoma family (Family 4, Supplementary Figure 1) [8,9]. Recently, we evaluated a new chordoma case in Family 1 (individual 43, Supplementary Figure 1). On magnetic resonance (MR) images, the mass in her posterior nasopharynx was virtually identical in location, appearance and signal characteristics to the clival chordoma that had been removed from her affected father (individual 20). However, she did not inherit the disease-related 7q33 haplotype from her father. Consequently, evidence of linkage to 7q33 in this family decreased substantially (Supplementary Table 1a). This result prompted a further search for chordoma susceptibility loci. In a subsequent genome-wide linkage scan using independent SNPs (Illumina 2.25K) in chordoma Families 1-4, we identified six new regions with suggestive evidence for linkage. Fine mapping of the six candidate regions with microsatellite markers (STRs) further improved lod scores on 6q25-27 (3.04 in Family 1, Supplementary Table 1b), whereas the non-6q candidate regions showed weaker support (lod scores 0.5-0.8). All affected individuals in Family 1, including individual 43, shared a common 6q disease-related haplotype between D6S972 and D6S503 (~11 Mb). Only one unaffected individual (#44, 2 years old at MR evaluation) shared the 6q disease haplotype. Similarly, all chordoma cases and obligate gene carriers in Family 3 and the two sisters affected with chordoma in Family 4 shared a common haplotype in the 6q region. The minimal disease locus region defined by the three families was ~5 Mb. Family 2 did not show consistent evidence for linkage to this region.
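For readers less familiar with lod scores, the quantity can be illustrated with a simplified two-point calculation. The sketch below assumes phase-known, fully informative meioses; the function name `two_point_lod` is hypothetical, and this is not the analysis applied to the families above. The counts in the usage lines are illustrative values only, not data from this study.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """Return log10 of L(theta) / L(0.5) for phase-known, fully informative meioses.

    theta is the recombination fraction being tested (0 <= theta <= 0.5);
    theta = 0.5 corresponds to no linkage.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")

    non_recombinants = meioses - recombinants

    if theta == 0.0:
        # Complete linkage is impossible once a single recombinant is seen.
        if recombinants > 0:
            return float("-inf")
        log_likelihood_theta = 0.0
    else:
        log_likelihood_theta = (
            recombinants * math.log10(theta)
            + non_recombinants * math.log10(1.0 - theta)
        )

    log_likelihood_null = meioses * math.log10(0.5)
    return log_likelihood_theta - log_likelihood_null


# Illustrative values only: ten informative meioses with no recombinants.
print(round(two_point_lod(0, 10, 0.0), 2))
# The same ten meioses with one recombinant, evaluated at theta = 0.1.
print(round(two_point_lod(1, 10, 0.1), 2))
```

Under these simplifying assumptions, each non-recombinant meiosis adds log10(2) to the lod score at theta = 0, which is why small families rarely reach high lod scores on their own.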
The minimal disease locus region contains several biologically relevant genes that may be important in chordoma and in cancers in general. Among them, the T gene, which encodes Brachyury, is of particular interest. Brachyury is a tissue-specific transcription factor expressed in the nucleus of notochord cells [10] and is essential for proper development and maintenance of the notochord [11]. Brachyury is specifically expressed in chordomas but not in a wide variety of non-neoplastic tissues or in 42 other types of neoplasms, including chondrosarcomas [12]. Its expression in chordomas mimics expression in the embryonic notochord. Given its obvious biological relevance, we selected the T gene as our top candidate gene for follow-up. We sequenced the coding region of the T gene, including the adjacent splice sites of its nine exons, together with 5 kb upstream and downstream of the coding region, in DNA from ten affected individuals from Families 1, 2, 4, and 6 (Families 1-4 were included in the linkage analyses; Families 6-8 were newly examined smaller chordoma families, Supplementary Figure 1). We did not find any sequence variants consistent with a disease-causing mutation. We also sequenced 20 other candidate genes that reside in the minimal disease locus region, including Dapper and RPS6KA2, but did not find any disease-related mutations in them.
Table 1. Clinical features and 6q27 duplication status of subjects in seven multiplex chordoma families.
CNVs have recently been recognized as a significant source of genetic variation that can contribute to disease susceptibility [1]. Therefore, we conducted a genome-wide search for CNVs using a whole genome human array-CGH chip (Nimblegen 385K; average probe spacing, 7 kb). We analyzed blood-derived genomic DNA from eleven chordoma cases and two spouses selected from our seven chordoma families. We identified duplicated regions located on 6q27 in seven affected individuals from four families: Family 1 (#18, #21), Family 3 (#3), Family 4 (#1, #3), and Family 8 (#1 and #2) (Supplementary Table 2). The sizes of the duplicated regions ranged from 52 kb in Family 4 to 489 kb in Family 3. The duplications were not detected in individuals with chordoma from Families 2, 6, and 7, two unrelated spouse controls (from Families 1 and 2) or sixteen individuals from melanoma-prone families whose blood DNA was analyzed using the same assay (Supplementary Table 2). The duplicated regions in all four families contain only one known gene, T, and within the gene, there were no previously reported CNVs.
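To make the array-CGH readout concrete, the following sketch computes the idealized log2 ratio expected for a given copy number against a diploid reference, and the inverse conversion. It is an illustration only: the function names are hypothetical, the calculation ignores probe noise and signal compression seen on real arrays, and it is not the segmentation procedure used in this study.

```python
import math


def expected_log2_ratio(test_copies: int, reference_copies: int = 2) -> float:
    """Idealized array-CGH log2 ratio for test DNA versus a reference genome."""
    if test_copies <= 0 or reference_copies <= 0:
        raise ValueError("copy numbers must be positive")
    return math.log2(test_copies / reference_copies)


def copies_from_log2_ratio(log2_ratio: float, reference_copies: int = 2) -> float:
    """Invert the idealized relationship to estimate copy number from a log2 ratio."""
    return reference_copies * 2.0 ** log2_ratio


# A heterozygous germline duplication gives three copies against two.
print(round(expected_log2_ratio(3), 3))
# A heterozygous deletion gives one copy against two.
print(expected_log2_ratio(1))
```

In this idealized setting a heterozygous duplication shifts the log2 ratio by log2(3/2), a smaller magnitude than the shift of -1 produced by a heterozygous deletion, which is one reason duplications can be harder to call than deletions.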
To validate the finding, we developed a quantitative PCR (qPCR) assay targeting the 3’ end of exon 6 in the T gene and screened all 65 individuals (21 affected with chordoma) with DNA available in the seven families for copy number changes. qPCR analyses confirmed the duplications in all affected subjects and obligate carriers in Families 1, 3, 4, and 8 (Supplementary Figure 2A). 6q duplications were not observed in members of the other chordoma families or in 100 unrelated healthy controls (200 meioses) (Supplementary Figure 2A, 2B). The aggregation of chordoma in the three families without the duplications, each with two chordoma cases, may result from alterations of other susceptibility genes, another mutational mechanism targeting the T gene, or clustering of sporadic chordoma patients.
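The text above does not state how copy number was derived from the qPCR data. One widely used approach is relative quantification by the comparative Ct (2^-ΔΔCt) method, sketched below as an assumption rather than as the assay actually used here. The function name, the choice of a two-copy calibrator sample, the assumption of roughly 100% amplification efficiency, and the Ct values in the usage lines are all hypothetical.

```python
def estimate_copy_number(
    sample_target_ct: float,
    sample_reference_ct: float,
    calibrator_target_ct: float,
    calibrator_reference_ct: float,
    calibrator_copies: float = 2.0,
) -> float:
    """Estimate target copy number with the comparative Ct method.

    Assumes both assays amplify with about 100% efficiency and that the
    calibrator sample carries `calibrator_copies` copies of the target.
    """
    # Normalize the target to the reference locus within each sample.
    delta_ct_sample = sample_target_ct - sample_reference_ct
    delta_ct_calibrator = calibrator_target_ct - calibrator_reference_ct

    # Compare the test sample with the calibrator sample.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator

    # Each cycle of difference corresponds to a twofold change in template.
    return calibrator_copies * 2.0 ** (-delta_delta_ct)


# Made-up Ct values: the target crosses threshold about 0.58 cycles earlier
# in the test sample than in the calibrator, relative to the reference locus.
print(round(estimate_copy_number(24.42, 25.00, 25.00, 25.00), 2))
```

With these illustrative inputs the estimate is close to three copies, the value expected for a heterozygous duplication; in practice replicate wells and efficiency correction would be needed before calling a copy number change.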
To confirm the T gene duplications and to better define the breakpoints of the amplicons, we analyzed genomic DNA from seven individuals with chordoma (#1 and #5 in Family 1, #3 in Family 3, #3 in Family 4, #1 in Family 7, #1 in Family 8) and an unaffected individual in Family 6, using a Nimblegen custom-made fine-tiling CGH array specifically targeting the 6q27 region (average probe spacing, 4 bp). Duplications of the T gene were clearly demonstrated in chordoma cases from Families 1, 3, 4, and 8. In contrast, no duplication was observed in the two “controls” (a chordoma case from Family 7 that did not carry the duplication and an unaffected individual from Family 6). Using the predicted breakpoints defined by the fine-tiling arrays, we were able to amplify junction fragments from three of the four families (Families 1, 4, and 8). The genomic rearrangement in Family 3 is more complex than in the other families, as evidenced by the CGH segmentation analysis. Families 1, 4, and 8 carried similar tandem duplications, and the duplication sizes were consistent with the observed array-CGH changes (124,756 bp in Family 1, 97,284 bp in Family 4, and approximately 173 kb in Family 8, Supplementary Figure 3). Amplification of junction fragments was confirmed in all individuals of Families 1, 4, and 8 who had increased copy number of the T gene by qPCR (Supplementary Figure 2A). In contrast, the junction fragments observed in Families 1, 4 (200 unrelated controls), and 8 (100 unrelated controls) could not be amplified from controls.
Sequence analysis of the junction fragments identified the breakpoints (Families 1 and 4) or breakpoint region (Family 8). Bioinformatics analysis revealed that the breakpoints and breakpoint region were located at or near repetitive SINE and LINE elements but did not contain any low-copy repeats (LCRs) (Supplementary Figure 3). In Family 1, a single base pair was shared between the telomeric and centromeric sequences at the breakpoint junction (Supplementary Figure 3a). In Family 4, the telomeric and centromeric sequences were separated by a 5-bp insertion (Supplementary Figure 3b). In Family 8, the junction fragment was located within a 306-bp region of high (90%) homology formed by the fusion of two AluY elements located at 6q27 (Supplementary Figure 3c). A precise breakpoint could not be located in this fused region because the junction fragment gradually transitioned from the telomeric AluY (chr6:166,580,862-166,581,165) to the centromeric AluY (chr6:166,406,935-166,407,246) sequence. The tandem duplication in Family 8 most likely resulted from AluY-mediated non-allelic homologous recombination (NAHR). The junction fragments in Family 1 and Family 4 share features that seem to be consistent with non-homologous end joining (NHEJ), although the underlying mechanism could be more complex, as suggested in recent publications [13].
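The Family 8 coordinates given above can be checked with a short calculation. The sketch below assumes the coordinates are 1-based and inclusive (the genome build is not stated in the text) and uses a hypothetical `Interval` helper. It relies on the general property that recombination between two directly oriented repeats yields a tandem duplication whose length equals the offset between corresponding positions in the two repeats.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Genomic interval; coordinates assumed 1-based and inclusive."""

    chrom: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# AluY coordinates as quoted in the text for Family 8.
telomeric_alu = Interval("chr6", 166_580_862, 166_581_165)
centromeric_alu = Interval("chr6", 166_406_935, 166_407_246)

# Offsets between corresponding ends of the two repeats bound the size of
# the duplicated segment, since the exact crossover point is unknown.
start_offset = telomeric_alu.start - centromeric_alu.start
end_offset = telomeric_alu.end - centromeric_alu.end

print(telomeric_alu.length, centromeric_alu.length)
print(start_offset, end_offset)
```

Under these assumptions the two elements are 304 bp and 312 bp long, and the offsets between them are 173,927 bp and 173,919 bp, in line with the approximately 173 kb duplication reported for Family 8.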
In summary, we have identified T as a major susceptibility gene for familial chordoma using combined genetic linkage and high-resolution array-CGH analyses. This approach has enabled us to identify a susceptibility gene in a linkage region that did not reveal disease-associated mutations by sequencing. Our findings suggest that screening for complex genomic rearrangements that co-segregate with disease in families may provide a powerful alternative to traditional gene-mapping approaches.