Many individuals with multiple or large colorectal adenomas, or early-onset colorectal cancer (CRC), have no detectable germline mutations in the known cancer predisposition genes. Using whole-genome sequencing, supplemented by linkage and association analysis, we identified specific heterozygous POLE or POLD1 germline variants in several multiple adenoma and/or CRC cases, but in no controls. The susceptibility variants appear to have high penetrance. POLD1 is also associated with endometrial cancer predisposition. The mutations map to equivalent sites in the proof-reading (exonuclease) domain of DNA polymerases ε and δ, and are predicted to impair correction of mispaired bases inserted during DNA replication. In agreement with this prediction, mutation carriers’ tumours were microsatellite-stable, but tended to acquire base substitution mutations, as confirmed by yeast functional assays. Further analysis of published data showed that the recently-described group of hypermutant, microsatellite-stable CRCs is likely to be caused by somatic POLE exonuclease domain mutations.
Variant annotation is a crucial step in the analysis of genome sequencing data. Functional annotation results can have a strong influence on the ultimate conclusions of disease studies. Incorrect or incomplete annotations can cause researchers both to overlook potentially disease-relevant DNA variants and to dilute interesting variants in a pool of false positives. Researchers are aware of these issues in general, but the extent of the dependency of final results on the choice of transcripts and software used for annotation has not been quantified in detail.
This paper quantifies the extent of differences in annotation of 80 million variants from a whole-genome sequencing study. We compare results using the RefSeq and Ensembl transcript sets as the basis for variant annotation with the software Annovar, and also compare the results from two annotation software packages, Annovar and VEP (Ensembl’s Variant Effect Predictor), when using Ensembl transcripts.
We found only 44% agreement in annotations for putative loss-of-function variants when using the RefSeq and Ensembl transcript sets as the basis for annotation with Annovar. The rate of matching annotations for loss-of-function and nonsynonymous variants combined was 79% and for all exonic variants it was 83%. When comparing results from Annovar and VEP using Ensembl transcripts, matching annotations were seen for only 65% of loss-of-function variants and 87% of all exonic variants, with splicing variants revealed as the category with the greatest discrepancy. Using these comparisons, we characterised the types of apparent errors made by Annovar and VEP and discuss their impact on the analysis of DNA variants in genome sequencing studies.
Variant annotation is not yet a solved problem. Choice of transcript set can have a large effect on the ultimate variant annotations obtained in a whole-genome sequencing study. Choice of annotation software can also have a substantial effect. The annotation step in the analysis of a genome sequencing study must therefore be considered carefully, and a conscious choice made as to which transcript set and software are used for annotation.
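The agreement rates quoted above can be illustrated with a minimal sketch. It assumes each annotation run has been reduced to a mapping from variant to a single consequence category. The variant keys, category names and the matching rule (a variant counts towards a category if either run assigns it, and matches only if both runs agree) are hypothetical choices for illustration, not the study's definition.

```python
from collections import defaultdict


def concordance_by_category(calls_a, calls_b):
    """Return {category: fraction of variants annotated identically by both runs}.

    calls_a, calls_b: dicts mapping a variant key to one consequence category.
    Only variants present in both dicts are compared.
    """
    total = defaultdict(int)
    matched = defaultdict(int)
    for variant in calls_a.keys() & calls_b.keys():
        a, b = calls_a[variant], calls_b[variant]
        # A variant counts towards every category that either run assigned.
        for category in {a, b}:
            total[category] += 1
            if a == b:
                matched[category] += 1
    return {c: matched[c] / total[c] for c in total}


# Hypothetical toy input: the same four variants annotated against two transcript sets.
refseq_calls = {
    "chr1:100A>T": "stopgain",
    "chr1:200G>C": "nonsynonymous",
    "chr2:50C>T": "splicing",
    "chr3:10G>A": "stopgain",
}
ensembl_calls = {
    "chr1:100A>T": "stopgain",
    "chr1:200G>C": "nonsynonymous",
    "chr2:50C>T": "intronic",
    "chr3:10G>A": "nonsynonymous",
}

for category, rate in sorted(concordance_by_category(refseq_calls, ensembl_calls).items()):
    print(f"{category}: {rate:.0%} matching")
```

Reporting the rate per category, rather than one overall figure, keeps a small but important class such as loss-of-function variants from being swamped by the much larger number of other exonic variants.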
In severe early-onset epilepsy, precise clinical and molecular genetic diagnosis is complex, as many metabolic and electro-physiological processes have been implicated in disease causation. The clinical phenotypes share many features such as complex seizure types and developmental delay. Molecular diagnosis has historically been confined to sequential testing of candidate genes known to be associated with specific sub-phenotypes, but the diagnostic yield of this approach can be low. We conducted whole-genome sequencing (WGS) on six patients with severe early-onset epilepsy who had previously been refractory to molecular diagnosis, and their parents. Four of these patients had a clinical diagnosis of Ohtahara Syndrome (OS) and two patients had severe non-syndromic early-onset epilepsy (NSEOE). In two OS cases, we found de novo non-synonymous mutations in the genes KCNQ2 and SCN2A. In a third OS case, WGS revealed paternal isodisomy for chromosome 9, leading to identification of the causal homozygous missense variant in KCNT1, which produced a substantial increase in potassium channel current. The fourth OS patient had a recessive mutation in PIGQ that led to exon skipping and defective glycophosphatidyl inositol biosynthesis. The two patients with NSEOE had likely pathogenic de novo mutations in CBL and CSNK1G1, respectively. Mutations in these genes were not found among 500 additional individuals with epilepsy. This work reveals two novel genes for OS, KCNT1 and PIGQ. It also uncovers unexpected genetic mechanisms and emphasizes the power of WGS as a clinical tool for making molecular diagnoses, particularly for highly heterogeneous disorders.
Nutritional factors play important roles in the etiology of obesity, type 2 diabetes mellitus and their complications through genotype x environment interactions. We have characterised molecular adaptation to high-fat diet (HFD) feeding in inbred mouse strains widely used in genetic and physiological studies. We carried out physiological tests, plasma lipid assays, obesity measures, liver histology, hepatic lipid measurements and liver genome-wide gene transcription profiling in C57BL/6J and BALB/c mice fed either a control or a high-fat diet. The two strains showed marked susceptibility (C57BL/6J) and relative resistance (BALB/c) to HFD-induced insulin resistance and non-alcoholic fatty liver disease (NAFLD). Global gene set enrichment analysis (GSEA) of transcriptome data identified consistent patterns of expression of key genes (Srebf1, Stard4, Pnpla2, Ccnd1) and molecular pathways in the two strains, which may underlie homeostatic adaptations to dietary fat. Differential regulation of pathways, including the proteasome, ubiquitin-mediated proteolysis and PPAR signalling, in fat-fed C57BL/6J and BALB/c mice suggests that altered expression of underlying diet-responsive genes may be involved in contrasting nutrigenomic predisposition and resistance to insulin resistance and NAFLD in these models. Collectively, these data, which further demonstrate the impact of gene x environment interactions on the regulation of gene expression, contribute to improved knowledge of natural and pathogenic adaptive genomic regulation and of the molecular mechanisms associated with genetically determined susceptibility and resistance to metabolic diseases.
Colorectal cancer (CRC) is a disease of complex aetiology, with much of the expected inherited risk being due to several common low risk variants. Genome-Wide Association Studies (GWAS) have identified 20 CRC risk variants. Nevertheless, these have only been able to explain part of the missing heritability. Moreover, these signals have only been inspected in populations of Northern European origin.
Thus, we followed the same approach in a Spanish cohort of 881 cases and 667 controls. Sixty-four variants at 24 loci were found to be associated with CRC at p-values <10−5. We therefore evaluated the 24 loci in another Spanish replication cohort (1481 cases and 1850 controls). Two of these SNPs, rs12080929 at 1p33 (P_replication=0.042; P_pooled=5.523×10−3; OR (95% CI)=0.866 (0.782-0.959)) and rs11987193 at 8p12 (P_replication=0.039; P_pooled=6.985×10−5; OR (95% CI)=0.786 (0.705-0.878)), were replicated in the second phase, although they did not reach genome-wide statistical significance.
We have performed the first CRC GWAS in a Southern European population and by these means we were able to identify two new susceptibility variants at the 1p33 and 8p12 loci. These two SNPs are located near the SLC5A9 and DUSP4 loci, respectively, which could be good functional candidates for the association signals. We therefore believe that these two markers constitute good candidates for CRC susceptibility loci and should be further evaluated in other, larger datasets. Moreover, we highlight that if these two SNPs are true susceptibility variants, they would account for part of the missing heritability of CRC.
GWAS; SNPs; Colorectal cancer; Spanish cohort; 1p33; 8p12
Prevalence of colorectal cancer (CRC) in the British Bangladeshi population (BAN) is low compared to British Caucasians (CAU). Genetic background may influence mutations and disease features.
We characterized the clinicopathological features of BAN CRCs and interrogated their genomes using mutation profiling and high-density single nucleotide polymorphism (SNP) arrays and compared findings to CAU CRCs.
Age of onset of BAN CRC was significantly lower than for CAU patients (p=3.0 × 10−5) and this difference was not due to Lynch syndrome or the polyposis syndromes. KRAS mutations in BAN microsatellite stable (MSS) CRCs were rare (5.4%) compared to CAU MSS CRCs (25%; p=0.04), which correlates with the high percentage of mucinous histotype observed (31%) in the BAN samples. No BRAF mutations were seen in our BAN MSS CRCs (CAU CRCs, 12%; p=0.08). Array data revealed similar patterns of gains (chromosome 7 and 8q), losses (8p, 17p and 18q) and LOH (4q, 17p and 18q) in BAN and CAU CRCs. A small deletion on chromosome 16p13.2 involving only the alternative splicing factor RBFOX1 was found in significantly more BAN (50%) than CAU (15%) CRCs (p=0.04). Focal deletions targeting the 5’ end of the gene were also identified. Novel RBFOX1 mutations were found in CRC cell lines and tumours; mRNA and protein expression was reduced in tumours.
KRAS mutations were rare in BAN MSS CRC and a mucinous histotype common. Loss of RBFOX1 may explain the anomalous splicing activity associated with CRC.
Colorectal cancer; Genome analysis; British Bangladeshi; RBFOX1; KRAS; BRAF
Genome-wide association study (GWAS) data on a disease are increasingly available from multiple related populations. In this scenario, meta-analyses can improve power to detect homogeneous genetic associations, but if there exist ancestry-specific effects, via interactions on genetic background or with a causal effect that co-varies with genetic background, then these will typically be obscured. To address this issue, we have developed a robust statistical method for detecting susceptibility gene-ancestry interactions in multi-cohort GWAS based on closely-related populations. We use the leading principal components of the empirical genotype matrix to cluster individuals into “ancestry groups” and then look for evidence of heterogeneous genetic associations with disease or other trait across these clusters. Robustness is improved when there are multiple cohorts, as the signal from true gene-ancestry interactions can then be distinguished from gene-collection artefacts by comparing the observed interaction effect sizes in collection groups relative to ancestry groups. When applied to colorectal cancer, we identified a missense polymorphism in iron-absorption gene CYBRD1 that associated with disease in individuals of English, but not Scottish, ancestry. The association replicated in two additional, independently-collected data sets. Our method can be used to detect associations between genetic variants and disease that have been obscured by population genetic heterogeneity. It can be readily extended to the identification of genetic interactions on other covariates such as measured environmental exposures. We envisage our methodology being of particular interest to researchers with existing GWAS data, as ancestry groups can be easily defined and thus tested for interactions.
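The final comparison in such an analysis, whether a variant's effect differs between two ancestry groups, can be sketched in its simplest form as a z-test on the difference between two per-group log odds ratios. This is a generic stand-in, not the statistic developed in the paper; it omits the principal-component clustering and the comparison against collection groups, and the effect sizes and standard errors below are hypothetical.

```python
import math


def interaction_z_test(beta_a, se_a, beta_b, se_b):
    """Test whether two independent per-group effect estimates differ.

    beta_*: log odds ratio estimated within each ancestry group.
    se_*:   standard error of each estimate.
    Returns (z, two_sided_p) under a normal approximation.
    """
    z = (beta_a - beta_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical inputs: an effect in group A and no effect in group B.
z, p = interaction_z_test(beta_a=0.20, se_a=0.05, beta_b=0.00, se_b=0.05)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

A pooled meta-analysis of the same two groups would average the effects and could miss the signal entirely, which is the situation the abstract describes.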
β-III spectrin is present in the brain and is known to be important in the function of the cerebellum. Heterozygous mutations in SPTBN2, the gene encoding β-III spectrin, cause Spinocerebellar Ataxia Type 5 (SCA5), an adult-onset, slowly progressive, autosomal-dominant pure cerebellar ataxia. SCA5 is sometimes known as “Lincoln ataxia,” because the largest known family is descended from relatives of the United States President Abraham Lincoln. Using targeted capture and next-generation sequencing, we identified a homozygous stop codon in SPTBN2 in a consanguineous family in which childhood developmental ataxia co-segregates with cognitive impairment. The cognitive impairment could result from mutations in a second gene, but further analysis using whole-genome sequencing combined with SNP array analysis did not reveal any evidence of other mutations. We also examined a mouse knockout of β-III spectrin in which ataxia and progressive degeneration of cerebellar Purkinje cells has been previously reported and found morphological abnormalities in neurons from prefrontal cortex and deficits in object recognition tasks, consistent with the human cognitive phenotype. These data provide the first evidence that β-III spectrin plays an important role in cortical brain development and cognition, in addition to its function in the cerebellum; and we conclude that cognitive impairment is an integral part of this novel recessive ataxic syndrome, Spectrin-associated Autosomal Recessive Cerebellar Ataxia type 1 (SPARCA1). In addition, the identification of SPARCA1 and normal heterozygous carriers of the stop codon in SPTBN2 provides insights into the mechanism of molecular dominance in SCA5 and demonstrates that the cell-specific repertoire of spectrin subunits underlies a novel group of disorders, the neuronal spectrinopathies, which includes SCA5, SPARCA1, and a form of West syndrome.
β-III spectrin is present in the brain and is known to be important in the function of the cerebellum. Mutations in β-III spectrin cause spinocerebellar ataxia type 5 (SCA5), sometimes called Lincoln ataxia because it was first described in the relatives of United States President Abraham Lincoln. This is generally an adult-onset progressive cerebellar disorder. Recessive mutations have not previously been described in any of the brain spectrins. We identified a homozygous mutation in SPTBN2, which causes a more severe disorder than SCA5, with a developmental cerebellar ataxia, which is present from childhood; in addition there is marked cognitive impairment. We call this novel condition SPARCA1 (Spectrin-associated Autosomal Recessive Cerebellar Ataxia type 1). This condition could be caused by two separate gene mutations; but we show, using a combination of genome-wide mapping, whole-genome sequencing, and detailed behavioural and neuropathological analysis of a β-III spectrin mouse knockout, that both the ataxia and cognitive impairment are caused by the recessive mutations in β-III spectrin. SPARCA1 is one of a family of neuronal spectrinopathies and illustrates the importance of spectrins in brain development and function.
There is strong evidence that rare copy number variants (CNVs) have a role in susceptibility to autism spectrum disorders (ASDs). Much research has focused on how CNVs mediate a phenotypic effect by altering gene expression levels. We investigated an alternative mechanism whereby CNVs combine the 5′ and 3′ ends of two genes, creating a ‘fusion gene’. Any resulting mRNA with an open reading frame could potentially alter the phenotype via a gain-of-function mechanism. We examined 2382 and 3096 rare CNVs from 996 individuals with ASD and 1287 controls, respectively, for potential to generate fusion transcripts. There was no increased burden in individuals with ASD; 122/996 cases harbored at least one rare CNV of this type, compared with 179/1287 controls (P=0.89). There was also no difference in the overall frequency distribution between cases and controls. We examined specific examples of such CNVs nominated by case–control analysis and a candidate approach. Accordingly, a duplication involving REEP1-POLR1A (found in 3/996 cases and 0/1287 controls) and a single occurrence CNV involving KIAA0319-TDP2 were tested. However, no fusion transcripts were detected by RT-PCR. Analysis of additional samples based on cell line availability resulted in validation of a MAPKAPK5-ACAD10 fusion transcript in two probands. However, this variant was present in controls at a similar rate and is unlikely to influence ASD susceptibility. In summary, although we find no evidence that fusion-gene generating CNVs lead to ASD susceptibility, discovery of a MAPKAPK5-ACAD10 transcript with an estimated frequency of ∼1/200 suggests that gain-of-function mechanisms should be considered in future CNV studies.
CNV; MAPKAPK5; ACAD10; ALDH2; KIAA0319; dyslexia
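The burden comparison in the ASD abstract above (122/996 cases versus 179/1287 controls) can be checked with a short sketch. The abstract does not state which test produced P=0.89; the one-sided two-proportion z-test used here, which asks whether carriers are more frequent among cases, is an assumption made for illustration.

```python
import math


def one_sided_two_proportion_p(hits_cases, n_cases, hits_controls, n_controls):
    """P-value for 'carrier frequency is higher in cases than in controls'.

    Uses the pooled-variance two-proportion z-test with a normal approximation.
    """
    p_cases = hits_cases / n_cases
    p_controls = hits_controls / n_controls
    pooled = (hits_cases + hits_controls) / (n_cases + n_controls)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_cases + 1 / n_controls))
    z = (p_cases - p_controls) / se
    # Upper-tail probability P(Z >= z) under the null hypothesis.
    return 0.5 * math.erfc(z / math.sqrt(2))


p_value = one_sided_two_proportion_p(122, 996, 179, 1287)
print(f"one-sided p = {p_value:.2f}")
```

Because the carrier frequency is slightly lower in cases than in controls, a one-sided test for excess burden in cases gives a p-value well above 0.5, which is consistent with the reported absence of an increased burden.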
Summary: GREVE has been developed to assist with the identification of recurrent genomic aberrations across cancer samples. The exact characterization of such aberrations remains a challenge despite the availability of increasing amounts of data, from SNP arrays to next-generation sequencing. Furthermore, genomic aberrations in cancer are especially difficult to handle because they are, by nature, unique to each patient. However, their recurrence in specific regions of the genome has been shown to reflect their relevance in the development of tumors. GREVE makes use of previously characterized events to identify such regions and to focus any further analysis.
Availability: GREVE is available through a web interface and open-source application (http://www.well.ox.ac.uk/GREVE).
We have previously identified several colorectal cancer (CRC)-associated polymorphisms using genome-wide association (GWA) analysis. We sought to fine-map the location of the functional variants for three of these regions at 8q23.3 (EIF3H), 16q22.1 (CDH1/CDH3) and 19q13.11 (RHPN2). We genotyped two case–control sets at high density in the selected regions and used existing data from four other case–control sets, comprising a total of 9328 CRC cases and 10 480 controls. To improve marker density, we imputed genotypes from the 1000 Genomes Project and Hapmap3 data sets. All three regions contained smaller areas in which a cluster of single nucleotide polymorphisms (SNPs) showed clearly stronger association signals than surrounding SNPs, allowing us to assign those areas as the most likely location of the disease-associated functional variant. Further fine-mapping within those areas was generally unhelpful in identifying the functional variation based on strengths of association. However, functional annotation suggested a relatively small number of functional SNPs, including some with potential regulatory function at 8q23.3 and 16q22.1 and a non-synonymous SNP in RHPN2. Interestingly, the expression quantitative trait locus browser showed a number of highly associated SNP alleles correlated with mRNA expression levels not of EIF3H and CDH1 or CDH3, but of UTP23 and ZFP90, respectively. In contrast, none of the top SNPs within these regions was associated with transcript levels at EIF3H, CDH1 or CDH3. Our post-GWA study highlights the benefits of fine-mapping common disease variants in combination with publicly available data sets. In addition, caution should be exercised when assigning functionality to candidate genes in regions discovered through GWA analysis.
The manifestation of coronary artery disease (CAD) follows a well-choreographed series of events that includes damage of arterial endothelial cells and deposition of lipids in the sub-endothelial layers. Genome-wide association studies (GWAS) of multiple populations with distinctive genetic and lifestyle backgrounds are a crucial step in understanding global CAD pathophysiology. In this study, we report a GWAS on the genetic basis of arterial stenosis as measured by cardiac catheterization in a Lebanese population. The locus of the phosphatase and actin regulator 1 gene (PHACTR1) showed association with coronary stenosis in a discovery experiment with genome wide data in 1,949 individuals (rs9349379, OR = 1.37, p = 1.57×10−5). The association was replicated in an additional 2,547 individuals (OR = 1.31, p = 8.85×10−6), leading to genome-wide significant association in a combined analysis (OR = 1.34, p = 8.02×10−10). Results from this GWAS support a central role of PHACTR1 in CAD susceptibility irrespective of lifestyle and ethnic divergences. This association provides a plausible component for understanding molecular mechanisms involved in the formation of stenosis in cardiac vessels and a potential drug target against CAD.
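The way the discovery and replication results for rs9349379 combine into a genome-wide significant signal can be approximated from the reported summary statistics alone. The sketch below back-calculates each standard error from the odds ratio and two-sided p-value, then applies a fixed-effect inverse-variance meta-analysis. Both steps are assumptions made for illustration; the study's combined analysis may instead have pooled individual-level genotypes, so small differences from the reported values are expected.

```python
import math
from statistics import NormalDist


def se_from_or_and_p(odds_ratio, p_two_sided):
    """Back-calculate the standard error of log(OR) from a two-sided p-value."""
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)
    return abs(math.log(odds_ratio)) / z


def fixed_effect_meta(studies):
    """Inverse-variance fixed-effect meta-analysis.

    studies: list of (odds_ratio, two_sided_p) tuples.
    Returns (combined_odds_ratio, combined_two_sided_p).
    """
    weights, weighted_betas = [], []
    for odds_ratio, p in studies:
        se = se_from_or_and_p(odds_ratio, p)
        w = 1.0 / se ** 2
        weights.append(w)
        weighted_betas.append(w * math.log(odds_ratio))
    beta = sum(weighted_betas) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    p = math.erfc(abs(beta / se) / math.sqrt(2))
    return math.exp(beta), p


# Discovery and replication values as reported in the abstract.
combined_or, combined_p = fixed_effect_meta([(1.37, 1.57e-5), (1.31, 8.85e-6)])
print(f"combined OR = {combined_or:.2f}, p = {combined_p:.2e}")
```

Under these assumptions the combined odds ratio comes out at about 1.33 with a p-value below the conventional 5×10−8 threshold, in line with the reported combined result (OR = 1.34, p = 8.02×10−10).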
Genome wide association studies (GWAS) and their replications that have associated DNA variants with myocardial infarction (MI) and/or coronary artery disease (CAD) are predominantly based on populations of European or Eastern Asian descent. Replication of the most significantly associated polymorphisms in multiple populations with distinctive genetic backgrounds and lifestyles is crucial to the understanding of the pathophysiology of a multifactorial disease like CAD. We have used our Lebanese cohort to perform a replication study of nine previously identified CAD/MI susceptibility loci (LTA, CDKN2A-CDKN2B, CELSR2-PSRC1-SORT1, CXCL12, MTHFD1L, WDR12, PCSK9, SH2B3, and SLC22A3), and 88 genes in related phenotypes. The study was conducted on 2,002 patients with detailed demographic, clinical characteristics, and cardiac catheterization results. One marker, rs6922269, in MTHFD1L was significantly protective against MI (OR = 0.68, p = 0.0035), while the variant rs4977574 in CDKN2A-CDKN2B was significantly associated with MI (OR = 1.33, p = 0.0086). Associations were detected after adjustment for family history of CAD, gender, hypertension, hyperlipidemia, diabetes, and smoking. The parallel study of 88 previously published genes in related phenotypes encompassed 20,225 markers, three quarters of which had imputed genotypes. The study was based on our genome-wide genotype data set, with imputation across the whole genome to HapMap II release 22 using the HapMap CEU population as a reference. Analysis was conducted on both the genotyped and imputed variants in the 88 regions covering selected genes. This approach replicated the HNRNPA3P1-CXCL12 association with CAD and identified new significant associations of CDKAL1, ST6GAL1, and PTPRD with CAD. Our study provides evidence for the importance of the multifactorial aspect of CAD/MI and describes genes predisposing to their etiology.
In genome-wide association studies (GWASs) of colorectal cancer, we have identified two genomic regions in which pairs of tagging-single nucleotide polymorphisms (tagSNPs) are associated with disease; these comprise chromosomes 1q41 (rs6691170, rs6687758) and 12q13.13 (rs7136702, rs11169552). We investigated these regions further, aiming to determine whether they contain more than one independent association signal and/or to identify the SNPs most strongly associated with disease. Genotyping of additional sample sets at the original tagSNPs showed that, for both regions, the two tagSNPs were unlikely to identify a single haplotype on which the functional variation lay. Conversely, one of the pair of SNPs did not fully capture the association signal in each region. We therefore undertook more detailed analyses, using imputation, logistic regression, genealogical analysis using the GENECLUSTER program and haplotype analysis. In the 1q41 region, the SNP rs11118883 emerged as a strong candidate based on all these analyses, sufficient to account for the signals at both rs6691170 and rs6687758. rs11118883 lies within a region with strong evidence of transcriptional regulatory activity and has been associated with expression of PDGFRB mRNA. For 12q13.13, a complex situation was found: SNP rs7972465 showed stronger association than either rs11169552 or rs7136702, and GENECLUSTER found no good evidence for a two-SNP model. However, logistic regression and haplotype analyses supported a two-SNP model, in which a signal at the SNP rs706793 was added to that at rs11169552. Post-GWAS fine-mapping studies are challenging, but the use of multiple tools can assist in identifying candidate functional variants in at least some cases.
Forkhead-box protein P2 is a transcription factor that has been associated with intriguing aspects of cognitive function in humans, non-human mammals, and song-learning birds. Heterozygous mutations of the human FOXP2 gene cause a monogenic speech and language disorder. Reduced functional dosage of the mouse version (Foxp2) causes deficient cortico-striatal synaptic plasticity and impairs motor-skill learning. Moreover, the songbird orthologue appears critically important for vocal learning. Across diverse vertebrate species, this well-conserved transcription factor is highly expressed in the developing and adult central nervous system. Very little is known about the mechanisms regulated by Foxp2 during brain development. We used an integrated functional genomics strategy to robustly define Foxp2-dependent pathways, both direct and indirect targets, in the embryonic brain. Specifically, we performed genome-wide in vivo ChIP–chip screens for Foxp2-binding and thereby identified a set of 264 high-confidence neural targets under strict, empirically derived significance thresholds. The findings, coupled to expression profiling and in situ hybridization of brain tissue from wild-type and mutant mouse embryos, strongly highlighted gene networks linked to neurite development. We followed up our genomics data with functional experiments, showing that Foxp2 impacts on neurite outgrowth in primary neurons and in neuronal cell models. Our data indicate that Foxp2 modulates neuronal network formation, by directly and indirectly regulating mRNAs involved in the development and plasticity of neuronal connections.
Foxp2 codes for an intriguing regulatory protein that provides a window into unusual aspects of brain function in multiple species. For example, the gene is implicated in speech and language disorders in humans, song learning in songbirds, and learning of rapid movement sequences in mice. Foxp2 acts by tuning the expression levels of other genes (its downstream targets). In this study we used genome-wide techniques to comprehensively identify the major targets of Foxp2 in the embryonic brain, in order to understand its roles in fundamental biological pathways during neurodevelopment, which we followed up through functional analyses of neurons. Most notably, we found that Foxp2 directly and indirectly regulates networks of genes that alter the length and branching of neuronal projections, an important route for modulating the wiring of neural connections in the developing brain. Overall, our findings shed light on how Foxp2 directs particular features of nervous system development, helping us to build bridges between genes and complex aspects of brain function.
Genome-wide association studies (GWAS) have identified 14 tagging single nucleotide polymorphisms (tagSNPs) that are associated with the risk of colorectal cancer (CRC), and several of these tagSNPs are near bone morphogenetic protein (BMP) pathway loci. The penalty of multiple testing implicit in GWAS increases the attraction of complementary approaches for disease gene discovery, including candidate gene- or pathway-based analyses. The strongest candidate loci for additional predisposition SNPs are arguably those already known both to have functional relevance and to be involved in disease risk. To investigate this proposition, we searched for novel CRC susceptibility variants close to the BMP pathway genes GREM1 (15q13.3), BMP4 (14q22.2), and BMP2 (20p12.3) using sample sets totalling 24,910 CRC cases and 26,275 controls. We identified new, independent CRC predisposition SNPs close to BMP4 (rs1957636, P = 3.93×10−10) and BMP2 (rs4813802, P = 4.65×10−11). Near GREM1, we found using fine-mapping that the previously-identified association between tagSNP rs4779584 and CRC actually resulted from two independent signals represented by rs16969681 (P = 5.33×10−8) and rs11632715 (P = 2.30×10−10). As low-penetrance predisposition variants become harder to identify—owing to small effect sizes and/or low risk allele frequencies—approaches based on informed candidate gene selection may become increasingly attractive. Our data emphasise that genetic fine-mapping studies can deconvolute associations that have arisen owing to independent correlation of a tagSNP with more than one functional SNP, thus explaining some of the apparently missing heritability of common diseases.
Genome-wide association studies (GWAS) have identified several colorectal cancer (CRC) susceptibility polymorphisms near genes that encode proteins in the bone morphogenetic protein (BMP) pathway. However, most of the inherited susceptibility to CRC remains unexplained. We investigated three of the best candidate BMP genes (GREM1, BMP4, and BMP2) for additional polymorphisms associated with CRC. By extensive validation of polymorphisms with only modest evidence of association in the initial phases of the GWAS, we identified new, independent CRC predisposition polymorphisms close to BMP4 (rs1957636) and BMP2 (rs4813802). Near GREM1, we used additional genotyping around the GWAS-identified polymorphism rs4779584 to demonstrate two independent signals represented by rs16969681 and rs11632715. Common genes with modest effects on disease risk are becoming harder to identify, and approaches based on informed candidate gene selection may become increasingly attractive. In addition, genetic fine mapping around polymorphisms identified in GWAS can deconvolute associations which have arisen owing to two independent functional variants. These types of study can identify some of the apparently missing heritability of common disease.
Specific language impairment (SLI) is an unexpected deficit in the acquisition of language skills and affects between 5 and 8% of pre-school children. Despite its prevalence and high heritability, our understanding of the aetiology of this disorder is only emerging. In this paper, we apply genome-wide techniques to investigate an isolated Chilean population who exhibit an increased frequency of SLI. Loss of heterozygosity (LOH) mapping and parametric and non-parametric linkage analyses indicate that complex genetic factors are likely to underlie susceptibility to SLI in this population. Across all analyses performed, the most consistently implicated locus was on chromosome 7q. This locus achieved highly significant linkage under all three non-parametric models (max NPL=6.73, P=4.0 × 10−11). In addition, it yielded a HLOD of 1.24 in the recessive parametric linkage analyses and contained a segment that was homozygous in two affected individuals. Further, investigation of this region identified a two-SNP haplotype that occurs at an increased frequency in language-impaired individuals (P=0.008). We hypothesise that the linkage regions identified here, in particular that on chromosome 7, may contain variants that underlie the high prevalence of SLI observed in this isolated population and may be of relevance to other populations affected by language impairments.
Specific language impairment (SLI); Robinson Crusoe Island; linkage; language
Cutaneous squamous cell carcinomas (cSCCs) are the second most frequent cancers in fair-skinned populations; yet, because of their genetic heterogeneity, the key molecular events in cSCC tumorigenesis remain poorly defined. We have employed single nucleotide polymorphism microarray analysis to examine genome-wide allelic imbalance in 60 cSCCs using paired non-tumour samples. The most frequent recurrent aberrations were loss of heterozygosity (LOH) at 3p and 9p, observed in 39 (65%) and 45 (75%) tumours respectively. Microdeletions at 9p23 within the protein tyrosine phosphatase receptor type D (PTPRD) locus were identified in a total of 9 (15%) samples, supporting a tumour suppressor role for PTPRD in cSCC. In addition, microdeletions at 3p14.2 were detected in 3 (5%) cSCCs, implicating fragile histidine triad (FHIT) gene as a possible target for inactivation. Statistical analysis revealed that well-differentiated cSCCs demonstrated significantly fewer aberrations than moderately and poorly differentiated cSCCs; yet, despite a lower rate of allelic imbalance, some specific aberrations were observed equally frequently in both groups. No correlation was established between the frequency of chromosomal aberrations and immune or human papillomavirus status. Our data suggest that well differentiated tumours are a genetically distinct subpopulation of cSCC.
Acute myeloid leukaemia (AML) is the most common acute leukaemia in adults; however, the genetic aetiology of the disease is not yet fully understood. A quantitative expression profile analysis of 157 mature miRNAs was performed on 100 AML patients representing the spectrum of known karyotypes common in AML. The principal observation reported here is that AMLs bearing a t(15;17) translocation had a distinctive signature throughout the whole set of genes, including the upregulation of a subset of miRNAs located in the human 14q32 imprinted domain. The set included miR-127, miR-154, miR-154*, miR-299, miR-323, miR-368, and miR-370. Furthermore, specific subsets of miRNAs were identified that provided molecular signatures characteristic of the major translocation-mediated gene fusion events in AML. Analysis of variance showed the significant deregulation of 33 miRNAs across the leukaemic set with respect to bone marrow from healthy donors. Fluorescent in situ hybridisation analysis using miRNA-specific locked nucleic acid (LNA) probes on cryopreserved patient cells confirmed the results obtained by real-time PCR. This study, conducted on about a fifth of the miRNAs currently reported in the Sanger database (microrna.sanger.ac.uk), demonstrates the potential for using miRNA expression to sub-classify cancer and suggests a role in the aetiology of leukaemia.
Osteoporotic fractures are a major cause of morbidity and mortality in ageing populations. Osteoporosis, defined as low bone mineral density (BMD) and associated fractures, has significant genetic components that are largely unknown. Linkage analysis in a large number of extended osteoporosis families in Iceland, using a phenotype that combines osteoporotic fractures and BMD measurements, showed linkage to Chromosome 20p12.3 (multipoint allele-sharing LOD, 5.10; p value, 6.3 × 10−7), results that are statistically significant after adjusting for the number of phenotypes tested and the genome-wide search. A follow-up association analysis using closely spaced polymorphic markers was performed. Three variants in the bone morphogenetic protein 2 (BMP2) gene, a missense polymorphism and two anonymous single nucleotide polymorphism haplotypes, were determined to be associated with osteoporosis in the Icelandic patients. The association is seen with many definitions of an osteoporotic phenotype, including osteoporotic fractures as well as low BMD, both before and after menopause. A replication study with a Danish cohort of postmenopausal women was conducted to confirm the contribution of the three identified variants. In conclusion, we find that a region on the short arm of Chromosome 20 contains a gene or genes that appear to be a major risk factor for osteoporosis and osteoporotic fractures, and our evidence supports the view that BMP2 is at least one of these genes.
Genetic analysis of Icelandic families and a replication study in a Danish population provide evidence that variation in the gene BMP2 might contribute to osteoporosis.
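The link between the allele-sharing LOD score of 5.10 and the p value of 6.3 × 10−7 quoted in the osteoporosis abstract can be sketched with the usual large-sample conversion, in which 2·ln(10)·LOD is treated as a chi-square statistic with one degree of freedom and the test is one-sided. That this conversion was the one used in the study is an assumption; the function name is hypothetical.

```python
import math


def lod_to_one_sided_p(lod):
    """Convert a one-parameter LOD score to a one-sided p-value.

    Treats 2 * ln(10) * LOD as a 1-df chi-square statistic, takes its square
    root as a standard normal deviate and returns the upper-tail probability.
    """
    z = math.sqrt(2.0 * math.log(10.0) * lod)
    return 0.5 * math.erfc(z / math.sqrt(2))


print(f"LOD 5.10 -> p = {lod_to_one_sided_p(5.10):.1e}")
```

Under this assumption the conversion gives a value close to the reported p value, which suggests the two figures in the abstract are restatements of one another rather than independent results.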