Primary ciliary dyskinesia (PCD) is an inherited disorder characterized by recurrent infections of the upper and lower respiratory tract, reduced fertility in males and situs inversus in about 50% of affected individuals (Kartagener syndrome). It is caused by motility defects in the respiratory cilia that are responsible for airway clearance, the flagella that propel sperm cells and the nodal monocilia that determine left-right asymmetry1. Recessive mutations that cause PCD have been identified in genes encoding components of the outer dynein arms, radial spokes and cytoplasmic pre-assembly factors of axonemal dyneins, but these mutations account for only about 50% of cases of PCD. We exploited the unique properties of dog populations to positionally clone a new PCD gene, CCDC39. We found that loss-of-function mutations in the human ortholog underlie a substantial fraction of PCD cases with axonemal disorganization and abnormal ciliary beating. Functional analyses indicated that CCDC39 localizes to ciliary axonemes and is essential for assembly of inner dynein arms and the dynein regulatory complex.
Fertility is one of the most important traits in dairy cattle, and has been steadily declining over the last decades. We herein use state-of-the-art genomic tools, including high-throughput SNP genotyping and next-generation sequencing, to identify a 3.3 Kb deletion in the FANCI gene causing the brachyspina syndrome (BS), a rare recessive genetic defect in Holstein dairy cattle. We determine that despite the very low incidence of BS (<1/100,000), carrier frequency is as high as 7.4% in the Holstein breed. We demonstrate that this apparent discrepancy is likely due to the fact that a large proportion of homozygous mutant calves die during pregnancy. We postulate that several other embryonic lethals may segregate in livestock and significantly compromise fertility, and propose a genotype-driven screening strategy to detect the corresponding deleterious mutations.
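The gap between carrier frequency and observed incidence can be made concrete with a Hardy-Weinberg calculation. The sketch below is illustrative and not taken from the study. It assumes random mating and treats the quoted 7.4% carrier frequency as the heterozygote frequency 2q(1 − q). Both are simplifications for a breed that relies on artificial insemination. The function names are hypothetical.

```python
import math


def allele_freq_from_carrier_freq(carrier_freq: float) -> float:
    """Solve 2q(1 - q) = carrier_freq for the minor allele frequency q."""
    if not 0.0 <= carrier_freq <= 0.5:
        raise ValueError("carrier frequency must lie in [0, 0.5]")
    return (1.0 - math.sqrt(1.0 - 2.0 * carrier_freq)) / 2.0


def expected_homozygote_freq(carrier_freq: float) -> float:
    """Expected frequency of homozygous mutants at conception under random mating."""
    q = allele_freq_from_carrier_freq(carrier_freq)
    return q * q


carrier_freq = 0.074               # carrier frequency quoted in the text
observed_incidence = 1 / 100_000   # upper bound on BS incidence quoted in the text

expected = expected_homozygote_freq(carrier_freq)
fold_deficit = expected / observed_incidence

print(f"expected homozygote frequency: {expected:.5f}")
print(f"expected / observed incidence: {fold_deficit:.0f}-fold")
```

Under these assumptions the expected frequency of homozygous mutants is on the order of 1 in 700 conceptions, more than a hundred times the reported incidence. Prenatal loss of homozygous mutants would account for a deficit of this kind.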
Genome-wide association studies have identified numerous loci demonstrating genome-wide significant association with Crohn's disease. However, when many single nucleotide polymorphisms (SNPs) have weak-to-moderate disease risks, genetic risk prediction models based only on those markers that pass the most stringent statistical significance testing threshold may be suboptimal. Haplotype-based predictive models may provide advantages over single-SNP approaches by facilitating detection of associations driven by cis-interactions among nearby SNPs. In addition, these approaches may be helpful in assaying non-genotyped, rare causal variants. In this study, we investigated the use of two-marker haplotypes for risk prediction in Crohn's disease and showed that it leads to improved prediction accuracy compared with single-point analyses. With large numbers of predictors, traditional classification methods such as logistic regression and support vector machine approaches may be suboptimal. An alternative approach is to apply the risk-score method, in which the score is calculated as the number of risk haplotypes an individual carries, both within and across loci. We used the area under the curve (AUC) of the receiver operating characteristic curve to assess the performance of prediction models in large-scale genetic data. Prediction performance in the validation cohort continued to improve as thousands of haplotypes were included in the model, with the AUC reaching a plateau of 0.72 at ∼7000 haplotypes and gradually declining beyond that point. In contrast, using single SNPs as predictors, we obtained a maximum AUC of only 0.65. Validation studies in independent cohorts further support improved prediction capacity with multi-marker, as opposed to single-marker, analyses.
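The risk-score idea described above can be sketched in a few lines. The example below is a minimal illustration, not the authors' pipeline. It uses an unweighted count of risk haplotypes per individual and computes the AUC as the probability that a randomly chosen case scores higher than a randomly chosen control. The toy counts and all function names are hypothetical.

```python
from typing import Sequence


def risk_score(haplotype_counts: Sequence[int]) -> int:
    """Unweighted risk score: number of risk haplotypes carried, summed across loci.

    Each entry is the count (0, 1 or 2) of risk haplotypes at one locus.
    """
    return sum(haplotype_counts)


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """AUC as the fraction of case/control pairs in which the case scores higher.

    Tied pairs contribute 0.5, which matches the area under the ROC curve.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical toy data: one row per individual, one column per locus.
cases = [[2, 1, 1], [1, 1, 0], [2, 2, 1]]
controls = [[0, 1, 0], [1, 0, 0], [1, 1, 0]]

case_scores = [risk_score(row) for row in cases]
control_scores = [risk_score(row) for row in controls]
toy_auc = auc(case_scores, control_scores)

print(f"case scores: {case_scores}, control scores: {control_scores}")
print(f"AUC: {toy_auc:.3f}")
```

The pairwise AUC is quadratic in sample size, which is acceptable for a sketch. A rank-based implementation would be preferable for cohorts of realistic size.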
We report association mapping of a locus on bovine chromosome 3 that underlies a Mendelian form of stunted growth in Belgian Blue Cattle (BBC). By resequencing positional candidates, we identify the causative c124-2A>G splice variant in intron 1 of the RNF11 gene, for which all affected animals are homozygous. We make the remarkable observation that 26% of healthy Belgian Blue animals carry the corresponding variant. We demonstrate in a prospective study design that approximately one third of homozygous mutants die prematurely with major inflammatory lesions, hence explaining the rarity of growth-stunted animals despite the high frequency of carriers. We provide preliminary evidence that heterozygote advantage for an as yet unidentified phenotype may have caused a selective sweep accounting for the high frequency of the RNF11 c124-2A>G mutation in Belgian Blue Cattle.
Recessive defects in livestock are common, and this is considered to result from the contraction of the effective population size that accompanies intense selection for desired traits, especially when relying heavily on artificial insemination (as individual males may then have a very large number of offspring). The costs of recessive defects are assumed to correspond to the loss of the affected animals. By performing a molecular genetic analysis of stunted growth in Belgian Blue Cattle (BBC), we highlight (i) that the economic impact of recessive defects may exceed the mere loss of affected animals and (ii) that some genetic defects are common for reasons other than inbreeding. We first demonstrate that a splice site variant in the RING finger protein 11 (RNF11) gene accounts for ∼40% of cases of stunted growth in BBC. We then show that a large proportion of animals that are homozygous for the corresponding RNF11 mutation die at a young age due to compromised resistance to pathogens. We finally demonstrate that carriers of the mutation benefit from a selective advantage of unidentified origin that accounts for its high frequency in BBC.
An equine SNP genotyping array was developed and evaluated on a panel of samples representing 14 domestic horse breeds and 18 evolutionarily related species. More than 54,000 polymorphic SNPs provided an average inter-SNP spacing of ∼43 kb. The mean minor allele frequency across domestic horse breeds was 0.23, and the number of polymorphic SNPs within breeds ranged from 43,287 to 52,085. Genome-wide linkage disequilibrium (LD) in most breeds declined rapidly over the first 50–100 kb and reached background levels within 1–2 Mb. The extent of LD and the level of inbreeding were highest in the Thoroughbred and lowest in the Mongolian and Quarter Horse. Multidimensional scaling (MDS) analyses demonstrated the tight grouping of individuals within most breeds, close proximity of related breeds, and less tight grouping in admixed breeds. The close relationship between the Przewalski's Horse and the domestic horse was demonstrated by pair-wise genetic distance and MDS. Genotyping of other Perissodactyla (zebras, asses, tapirs, and rhinoceros) was variably successful, with call rates and the number of polymorphic loci varying across taxa. Parsimony analysis placed the modern horse as the sister taxon to Equus przewalskii. The utility of the SNP array in genome-wide association was confirmed by mapping the known recessive chestnut coat color locus (MC1R) and defining a conserved haplotype of ∼750 kb across all breeds. These results demonstrate the high quality of this SNP genotyping resource, its usefulness in diverse genome analyses of the horse, and potential use in related species.
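Two of the summary statistics reported above, minor allele frequency and pairwise LD, can be illustrated on genotype data. The sketch below is hedged in two ways. It assumes genotypes are coded as alternate-allele counts (0, 1 or 2). It also measures LD as the squared Pearson correlation between genotype counts, a common approximation for unphased data that is not necessarily the estimator used in the study. The toy genotypes and function names are hypothetical.

```python
from typing import Sequence


def minor_allele_frequency(genotypes: Sequence[int]) -> float:
    """Minor allele frequency from genotypes coded as alternate-allele counts (0/1/2)."""
    if not genotypes:
        raise ValueError("no genotypes supplied")
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1.0 - alt_freq)


def genotype_r2(snp_a: Sequence[int], snp_b: Sequence[int]) -> float:
    """Squared Pearson correlation between two SNPs' genotype counts."""
    if len(snp_a) != len(snp_b) or not snp_a:
        raise ValueError("SNPs must be genotyped on the same non-empty sample")
    n = len(snp_a)
    mean_a = sum(snp_a) / n
    mean_b = sum(snp_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(snp_a, snp_b))
    var_a = sum((a - mean_a) ** 2 for a in snp_a)
    var_b = sum((b - mean_b) ** 2 for b in snp_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return (cov * cov) / (var_a * var_b)


# Hypothetical genotypes for six animals at three SNPs.
snp_1 = [0, 0, 1, 2, 0, 1]
snp_2 = [0, 0, 1, 2, 0, 1]   # identical to snp_1: complete LD
snp_3 = [1, 0, 2, 0, 1, 0]   # only weakly correlated with snp_1

print(f"MAF(snp_1) = {minor_allele_frequency(snp_1):.3f}")
print(f"r2(snp_1, snp_2) = {genotype_r2(snp_1, snp_2):.3f}")
print(f"r2(snp_1, snp_3) = {genotype_r2(snp_1, snp_3):.3f}")
```

An LD decay curve of the kind summarized in the text would be obtained by averaging such r2 values over SNP pairs binned by physical distance.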
We utilized the previously generated horse genome sequence and a large SNP database to design an ∼54,000 SNP assay for use in the domestic horse and related species. The utility of this SNP array was demonstrated through genome-wide linkage disequilibrium, inbreeding and genetic distance measurements within breeds, as well as multidimensional scaling and parsimony analysis. Association mapping confirmed a large conserved segment containing the chestnut coat color locus in domestic horses. We also assessed the utility of the SNP array in related species, including the Przewalski's Horse, zebras, asses, tapirs, and rhinoceros. This SNP genotyping tool will facilitate many genetics applications in equids, including identification of genes for health and performance traits, and compelling studies of the origins of the domestic horse, diversity within breeds, and evolutionary relationships among related species.
Monozygotic (MZ) twin pair discordance for childhood-onset Type 1 Diabetes (T1D) is ∼50%, implicating roles for genetic and non-genetic factors in the aetiology of this complex autoimmune disease. Although significant progress has been made in elucidating the genetics of T1D in recent years, the non-genetic component has remained poorly defined. We hypothesized that epigenetic variation could underlie some of the non-genetic component of T1D aetiology and, thus, performed an epigenome-wide association study (EWAS) for this disease. We generated genome-wide DNA methylation profiles of purified CD14+ monocytes (an immune effector cell type relevant to T1D pathogenesis) from 15 T1D–discordant MZ twin pairs. This identified 132 different CpG sites at which the direction of the intra-MZ pair DNA methylation difference significantly correlated with the diabetic state, i.e. T1D–associated methylation variable positions (T1D–MVPs). We confirmed these T1D–MVPs display statistically significant intra-MZ pair DNA methylation differences in the expected direction in an independent set of T1D–discordant MZ pairs (P = 0.035). Then, to establish the temporal origins of the T1D–MVPs, we generated two further genome-wide datasets and established that, when compared with controls, T1D–MVPs are enriched in singletons both before (P = 0.001) and at (P = 0.015) disease diagnosis, and also in singletons positive for diabetes-associated autoantibodies but disease-free even after 12 years follow-up (P = 0.0023). Combined, these results suggest that T1D–MVPs arise very early in the etiological process that leads to overt T1D. Our EWAS of T1D represents an important contribution toward understanding the etiological role of epigenetic variation in type 1 diabetes, and it is also the first systematic analysis of the temporal origins of disease-associated epigenetic variation for any human complex disease.
Type 1 diabetes (T1D) is a complex autoimmune disease affecting >30 million people worldwide. It is caused by a combination of genetic and non-genetic factors, leading to destruction of insulin-secreting cells. Although significant progress has recently been made in elucidating the genetics of T1D, the non-genetic component has remained poorly defined. Epigenetic modifications, such as methylation of DNA, are indispensable for genomic processes such as transcriptional regulation and are frequently perturbed in human disease. We therefore hypothesized that epigenetic variation could underlie some of the non-genetic component of T1D aetiology, and we performed a genome-wide DNA methylation analysis of a specific subset of immune cells (monocytes) from monozygotic twins discordant for T1D. This revealed the presence of T1D–specific methylation variable positions (T1D–MVPs) in the T1D–affected co-twins. Since these T1D–MVPs were found in MZ twins, they cannot be due to genetic differences. Additional experiments revealed that some of these T1D–MVPs are found in individuals before T1D diagnosis, suggesting they arise very early in the process that leads to overt T1D and are not simply due to post-disease associated factors (e.g. medication or long-term metabolic changes). T1D–MVPs may thus potentially represent a previously unappreciated, and important, component of type 1 diabetes risk.
Genome-wide association studies (GWAS) and candidate gene studies in ulcerative colitis (UC) have identified 18 susceptibility loci. We conducted a meta-analysis of 6 UC GWAS, comprising 6,687 cases and 19,718 controls, and followed up the top association signals in 9,628 cases and 12,917 controls. We identified 29 additional risk loci (P<5×10−8), increasing the number of UC associated loci to 47. After annotating associated regions using GRAIL, eQTL data and correlations with non-synonymous SNPs, we identified many candidate genes providing potentially important insights into disease pathogenesis, including IL1R2, IL8RA/B, IL7R, IL12B, DAP, PRDM1, JAK2, IRF5, GNA12 and LSP1. The total number of confirmed inflammatory bowel disease (IBD) risk loci is now 99, including a minimum of 28 shared association signals between Crohn’s disease (CD) and UC.
Soluble ICAM-1 (sICAM-1) is an endothelium-derived inflammatory marker that has been associated with diverse conditions such as myocardial infarction, diabetes, stroke, and malaria. Despite evidence for a heritable component to sICAM-1 levels, few genetic loci have been identified so far. To comprehensively address this issue, we performed a genome-wide association analysis of sICAM-1 concentration in 22,435 apparently healthy women from the Women's Genome Health Study. While our results confirm the previously reported associations at the ABO and ICAM1 loci, four novel associations were identified in the vicinity of NFKBIK (rs3136642, P = 5.4×10−9), PNPLA3 (rs738409, P = 5.8×10−9), RELA (rs1049728, P = 2.7×10−16), and SH2B3 (rs3184504, P = 2.9×10−17). Two loci, NFKBIK and RELA, are involved in the NFKB signaling pathway; PNPLA3 is known for its association with fatty liver disease; and SH2B3 has been associated with a multitude of traits and diseases including myocardial infarction. These associations provide insights into the genetic regulation of sICAM-1 levels and implicate these loci in the regulation of endothelial function.
Soluble Intercellular Adhesion Molecule 1 (sICAM-1) is an inflammatory marker that has been associated with several common diseases such as diabetes, heart disease, stroke, and malaria. While it is known that blood concentrations of sICAM-1 are at least partially genetically determined, our current knowledge of which genes mediate this effect is limited. Taking advantage of technologies allowing us to interrogate genetic variation on a whole-genome basis, we found that variation in the NFKBIK, PNPLA3, RELA, and SH2B3 genes is an important determinant of sICAM-1 blood concentrations. The NFKBIK and RELA genes are involved in regulation of inflammation. These observations are significant because this is the first report of genetic association within these extensively studied inflammation genes. The PNPLA3 gene has previously been associated with liver disease, and the SH2B3 gene has been associated with a multitude of traits including cardiovascular disease. Extension of these associations to sICAM-1 adds to the intriguing diversity of effects of these genes.
Hereditary periodic fever syndromes are characterized by recurrent episodes of fever and inflammation with no known pathogenic or autoimmune cause. In humans, several genes have been implicated in this group of diseases, but the majority of cases remain unexplained. A similar periodic fever syndrome is relatively frequent in the Chinese Shar-Pei breed of dogs. In the western world, Shar-Pei have been strongly selected for a distinctive thick and heavily folded skin. In this study, a mutation affecting both these traits was identified. Using genome-wide SNP analysis of Shar-Pei and other breeds, the strongest signal of a breed-specific selective sweep was located on chromosome 13. The same region also harbored the strongest genome-wide association (GWA) signal for susceptibility to the periodic fever syndrome (p_raw = 2.3×10−6, p_genome = 0.01). Dense targeted resequencing revealed two partially overlapping duplications, 14.3 Kb and 16.1 Kb in size, unique to Shar-Pei and upstream of the Hyaluronic Acid Synthase 2 (HAS2) gene. HAS2 encodes the rate-limiting enzyme synthesizing hyaluronan (HA), a major component of the skin. HA is up-regulated and accumulates in the thickened skin of Shar-Pei. A high copy number of the 16.1 Kb duplication was associated with an increased expression of HAS2 as well as the periodic fever syndrome (p<0.0001). When fragmented, HA can act as a trigger of the innate immune system and stimulate sterile fever and inflammation. The strong selection for the skin phenotype therefore appears to enrich for a pleiotropic mutation predisposing these dogs to a periodic fever syndrome. The identification of HA as a major risk factor for this canine disease raises the potential of this glycosaminoglycan as a risk factor for human periodic fevers and as an important driver of chronic inflammation.
Shar-Pei dogs have two unique features: a breed-defining “wrinkled” skin phenotype and a genetic disorder called Familial Shar-Pei Fever (FSF). The wrinkled phenotype is strongly selected for and is the result of excessive hyaluronan (HA) deposited in the skin. HA is a molecule that may behave in a pro-inflammatory manner and create a “danger signal” by being analogous to molecules on the surface of pathogens. FSF is characterized by unprovoked episodes of fever and/or inflammation and resembles several human autoinflammatory syndromes. Here we show that the two features are connected and have the same genetic origin, a regulatory mutation located close to an HA-synthesizing gene (HAS2). The mutation is a 16.1 Kb duplication, the copy number of which correlates with HAS2 expression and disease. We suggest that the large amount of HA responsible for the skin condition predisposes to sterile fever and inflammation. HAS2 was previously not known to associate with autoinflammatory disease, and this finding is of wide interest since approximately 60% of human patients with periodic fever syndrome remain genetically unexplained. This investigation also demonstrates how strong artificial selection may affect not only desired and selected phenotypes, but also the health of domestic animals.
Crohn's disease (CD) and celiac disease (CelD) are chronic intestinal inflammatory diseases, involving genetic and environmental factors in their pathogenesis. The two diseases can co-occur within families, and studies suggest that CelD patients have a higher risk of developing CD than the general population. These observations suggest that CD and CelD may share common genetic risk loci. Two such shared loci, IL18RAP and PTPN2, have already been identified independently in these two diseases. The aim of our study was to explicitly identify shared risk loci for these diseases by combining results from genome-wide association study (GWAS) datasets of CD and CelD. Specifically, GWAS results from CelD (768 cases, 1,422 controls) and CD (3,230 cases, 4,829 controls) were combined in a meta-analysis. Nine independent regions had nominal association p-value <1.0×10−5 in this meta-analysis and showed evidence of association with the individual diseases in the original scans (p-value <1×10−2 in CelD and <1×10−3 in CD). These include the two previously reported shared loci, IL18RAP and PTPN2, with p-values of 3.37×10−8 and 6.39×10−9, respectively, in the meta-analysis. The other seven had not been reported as shared loci and thus were tested in additional CelD (3,149 cases and 4,714 controls) and CD (1,835 cases and 1,669 controls) cohorts. Two of these loci, TAGAP and PUS10, showed significant evidence of replication (Bonferroni corrected p-values <0.0071) in the combined CelD and CD replication cohorts and were firmly established as shared risk loci of genome-wide significance, with overall combined p-values of 1.55×10−10 and 1.38×10−11 respectively. Through a meta-analysis of GWAS data from CD and CelD, we have identified four shared risk loci: PTPN2, IL18RAP, TAGAP, and PUS10. The combined analysis of the two datasets provided the power, lacking in the individual GWAS for single diseases, to detect shared loci with a relatively small effect.
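The replication threshold quoted above is consistent with a Bonferroni correction for the seven loci carried into replication. The sketch below reproduces that arithmetic. The family-wise error rate of 0.05 is an assumption, since the text does not state it, and the helper names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that controls the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


N_CANDIDATE_LOCI = 7   # loci tested in the replication cohorts, from the text
ALPHA = 0.05           # assumed family-wise error rate (not stated in the text)

threshold = bonferroni_threshold(ALPHA, N_CANDIDATE_LOCI)


def replicates(p_value: float, cutoff: float = threshold) -> bool:
    """True when a replication p-value passes the Bonferroni-corrected cutoff."""
    return p_value < cutoff


print(f"Bonferroni threshold for {N_CANDIDATE_LOCI} tests: {threshold:.4f}")
```

With these inputs the threshold rounds to the 0.0071 quoted in the text.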
Celiac disease (CelD) and Crohn's disease (CD) are both chronic inflammatory diseases of the digestive tract. Both of these diseases are complex genetic traits with multiple genetic and non-genetic risk factors. Recent genome-wide association (GWA) studies have identified some of the genetic risk factors for these diseases. Interestingly, in addition to some similarities in phenotype, these studies have shown that CelD and CD share some genetic risk factors. Specifically, by comparing the results of independent GWA studies of CD and CelD, two genetic risk loci were found in common: the PTPN2 locus and the IL18RAP locus. Therefore, in order to directly test for additional shared genetic risk factors, we combined the GWA results from two large studies of CelD and CD, essentially creating a combined phenotype with anyone with CD or CelD being coded as affected. Association results were then replicated in additional cohorts of CelD and CD. It is expected that shared risk loci should show association in this analysis, whereas the signal of risk loci specific to either of the two diseases should be diluted. With this method of meta-analysis, we identified, in addition to PTPN2 and IL18RAP, two loci harbouring TAGAP and PUS10 as shared risk loci for Crohn's disease and celiac disease at genome-wide significance.
A multicenter genome-wide association scan for Crohn's Disease (CD) has recently reported 40 CD susceptibility loci, including 29 novel ones (19 significant and 10 putative). To gain insight into the genetic overlap between CD and ankylosing spondylitis (AS), these markers were tested for association in AS patients.
Two previously established associations, namely with the MHC and IL23R loci, were confirmed. In addition, rs2872507, which maps to a locus associated with asthma and influences the expression of the ORMDL3 gene in lymphoblastoid cells, showed a significant association with AS (p = 0.03). In gut biopsies of AS and CD patients, ORMDL3 expression was not significantly different from controls and no correlation was found with the rs2872507 genotype (Spearman's rho: −0.067). The distribution of p-values for the remaining 36 SNPs was significantly skewed towards low p-values unless the top 5 ranked SNPs (ORMDL3, NKX2–3, PTPN2, ICOSLG and MST1) were excluded from the analysis.
Association analysis using risk variants for CD led to the identification of a new risk variant associated with AS (ORMDL3), underscoring a role for ER stress in AS. In addition, two known and five potentially relevant associations were detected, contributing to common susceptibility of CD and AS.
Prediction of genetic merit using dense SNP genotypes can be used for estimation of breeding values for selection of livestock, crops, and forage species; for prediction of disease risk; and for forensics. The accuracy of these genomic predictions depends in part on the genetic architecture of the trait, in particular the number of loci affecting the trait and the distribution of their effects. Here we investigate the difference among three traits in distribution of effects and the consequences for the accuracy of genomic predictions. Proportion of black coat colour in Holstein cattle was used as one model complex trait. Three loci, KIT, MITF, and a locus on chromosome 8, together explain 24% of the variation of proportion of black. However, a surprisingly large number of loci of small effect are necessary to capture the remaining variation. A second trait, fat concentration in milk, had one locus of large effect and a host of loci with very small effects. Both these distributions of effects were in contrast to that for a third trait, an index of scores for a number of aspects of cow conformation (“overall type”), which had only loci of small effect. The differences in distribution of effects among the three traits were quantified by estimating the distribution of variance explained by chromosome segments containing 50 SNPs. This approach was taken to account for the imperfect linkage disequilibrium between the SNPs and the QTL affecting the traits. We also show that the accuracy of predicting genetic values is higher for traits with a proportion of large effects (proportion black and fat percentage) than for a trait with no loci of large effect (overall type), provided the method of analysis takes advantage of the distribution of loci effects.
Prediction of future phenotypes or genetic merit using high-density SNP chips can be used for prediction of disease risk in humans, for forensics, and for selection of livestock, crops, and forage species. Key questions are how accurately these predictions can be made and on what parameters the accuracy depends. In this paper, we use three dairy cow traits (proportion of black on coat, fat percentage in milk, and overall type, which measures cow conformation) to demonstrate the large differences among genetic architectures of complex traits. For example, 24% of the genetic variance in proportion of black is determined by three loci, KIT, MITF, and a locus on chromosome 8; however, a surprisingly large number of additional loci, all of small effect, are required to capture the remaining variation. For overall type, a very large number of loci are necessary to capture the same level of variance. We also show that the accuracy of predicting genetic values is higher for traits with a proportion of large effects (proportion black and fat percentage) than for a trait with no loci of large effect (overall type), provided the method of analysis takes advantage of the distribution of loci effects.
Arachnomelia is a monogenic recessive defect of skeletal development in cattle. The causative mutation was previously mapped to a ∼7 Mb interval on chromosome 5. Here we show that array-based sequence capture and massively parallel sequencing technology, combined with the typical family structure in livestock populations, facilitates the identification of the causative mutation. We re-sequenced the entire critical interval in a healthy partially inbred cow carrying one copy of the critical chromosome segment in its ancestral state and one copy of the same segment with the arachnomelia mutation, and we detected a single heterozygous position. The genetic makeup of several partially inbred cattle provides extremely strong support for the causality of this mutation. The mutation represents a single base insertion leading to a premature stop codon in the coding sequence of the SUOX gene and is perfectly associated with the arachnomelia phenotype. Our findings suggest an important role for sulfite oxidase in bone development.
Arachnomelia is a defect in skeletal development of cattle. Affected calves are born dead with elongated limbs and facial deformities. The causative mutation for this recessive condition had previously been mapped to a ∼7 Mb interval. We exploited the special structure of cattle families to identify the causative mutation by a purely genetic approach. The rich pedigree records in cattle breeding allowed us to identify the founder animal of arachnomelia, a Brown Swiss bull born in 1957. A few generations later several cattle received two copies of the same chromosome segment from the father of this bull due to inbreeding. One copy was passed through the founder animal and acquired the causative mutation, while the other copy was transmitted through a different line of animals and stayed in its ancestral state. Using next-generation sequencing, we sequenced the entire critical interval in one of these inbred animals. As expected, we found only one single heterozygous position, which consequently represents the causative mutation for arachnomelia. The mutation affects the gene for sulfite oxidase, thus indicating a previously unrecognized important role for this enzyme in bone development. Our findings can immediately be applied to remove this deleterious mutation from the cattle breeding population.
Large fractions of eukaryotic genomes contain repetitive sequences of which the vast majority is derived from transposable elements (TEs). In order to inactivate those potentially harmful elements, host organisms silence TEs via methylation of transposon DNA and packaging into chromatin associated with repressive histone marks. The contribution of individual histone modifications in this process is not completely resolved. Therefore, we aimed to define the role of reversible histone acetylation, a modification commonly associated with transcriptional activity, in transcriptional regulation of murine TEs. We surveyed histone acetylation patterns and expression levels of ten different murine TEs in mouse fibroblasts with altered histone acetylation levels, which was achieved via chemical HDAC inhibition with trichostatin A (TSA), or genetic inactivation of the major deacetylase HDAC1. We found that one LTR retrotransposon family encompassing virus-like 30S elements (VL30) showed significant histone H3 hyperacetylation and strong transcriptional activation in response to TSA treatment. Analysis of VL30 transcripts revealed that increased VL30 transcription is due to enhanced expression of a limited number of genomic elements, with one locus being particularly responsive to HDAC inhibition. Importantly, transcriptional induction of VL30 was entirely dependent on the activation of MAP kinase pathways, resulting in serine 10 phosphorylation at histone H3. Stimulation of MAP kinase cascades together with HDAC inhibition led to simultaneous phosphorylation and acetylation (phosphoacetylation) of histone H3 at the VL30 regulatory region. The presence of the phosphoacetylation mark at VL30 LTRs was linked with full transcriptional activation of the mobile element. Our data indicate that the activity of different TEs is controlled by distinct chromatin modifications. 
We show that activation of a specific mobile element is linked to a dual epigenetic mark and propose a model whereby phosphoacetylation of histone H3 is crucial for full transcriptional activation of VL30 elements.
The majority of genomic sequences in higher eukaryotes do not contain protein coding genes. Large fractions are covered by repetitive sequences, many of which are derived from transposable elements (TEs). These selfish genes, only containing sequences necessary for self-propagation, can multiply and change their location within the genome, threatening host genome integrity and provoking mutational bursts. Therefore host organisms have evolved a diverse repertoire of defence mechanisms to counteract and silence these genomic parasites. One way is to package DNA sequences containing TEs into transcriptionally inert heterochromatin, which is partly achieved via chemical modification of the packaging proteins associated with DNA, the histones. To better understand the contribution of histone acetylation in the activation of TEs, we treated mouse fibroblasts with a specific histone deacetylase inhibitor. By monitoring the expression of ten different types of murine mobile elements, we identified a defined subset of VL30 transposons specifically reactivated upon increased histone acetylation. Importantly, phosphorylation of histone H3, a modification that is triggered by stress, is required for acetylation-dependent activation of VL30 elements. We present a model where concomitant histone phosphorylation and acetylation cooperate in the transcriptional induction of VL30 elements.
Osteoporosis is a major public health problem. It is mainly characterized by low bone mineral density (BMD) and/or low-trauma osteoporotic fractures (OF), both of which have strong genetic determination. The specific genes influencing these phenotypic traits, however, are largely unknown. Using the Affymetrix 500K array set, we performed a case-control genome-wide association study (GWAS) in 700 elderly Chinese Han subjects (350 with hip OF and 350 healthy matched controls). A follow-up replication study was conducted to validate our major GWAS findings in an independent Chinese sample containing 390 cases with hip OF and 516 controls. We found that a SNP, rs13182402 within the ALDH7A1 gene on chromosome 5q31, was strongly associated with OF when evidence from the GWAS and replication studies was combined (P = 2.08×10−9, odds ratio = 2.25). In order to explore the target risk factors and potential mechanism underlying hip OF risk, we further examined this candidate SNP's relevance to hip BMD in both Chinese and Caucasian populations involving 9,962 additional subjects. This SNP was confirmed as consistently associated with hip BMD across ethnic boundaries, in both Chinese and Caucasians (combined P = 6.39×10−6), further attesting to its potential effect on osteoporosis. ALDH7A1 degrades and detoxifies acetaldehyde, which inhibits osteoblast proliferation and results in decreased bone formation. Our findings may provide new insights into the pathogenesis of osteoporosis.
Osteoporosis is a major health concern worldwide. It is a highly heritable disease characterized mainly by low bone mineral density (BMD) and/or osteoporotic fractures. However, the specific genetic variants determining risk for low BMD or osteoporotic fractures are largely unknown. Here, taking advantage of recent technological advances in human genetics, we performed a genome-wide association study and follow-up validation studies to identify genetic variants for osteoporosis. By examining a total of 11,568 individuals from Chinese and Caucasian populations, we discovered a susceptibility gene, ALDH7A1, which is associated with hip osteoporotic fracture and BMD. ALDH7A1 might inhibit osteoblast proliferation and decrease bone formation. Our finding opens a new avenue for exploring the pathophysiology of osteoporosis.
The Patrocles database (http://www.patrocles.org/) compiles DNA sequence polymorphisms (DSPs) that are predicted to perturb miRNA-mediated gene regulation. Distinctive features include: (i) the coverage of seven vertebrate species in its present release, aiming for more when information becomes available, (ii) the coverage of the three compartments involved in the silencing process (i.e. targets, miRNA precursors and silencing machinery), (iii) contextual information that enables users to prioritize candidate ‘Patrocles DSPs’, including graphical information on miRNA-target coexpression and eQTL effect of genotype on target expression levels, (iv) the inclusion of Copy Number Variants and eQTL information that affect miRNA precursors as well as genes encoding components of the silencing machinery and (v) a tool (Patrocles finder) that allows the user to determine whether her favorite DSP may perturb miRNA-mediated gene regulation of custom target sequences. To support the biological relevance of Patrocles' content, we searched for signatures of selection acting on ‘Patrocles single nucleotide polymorphisms (pSNPs)’ in humans and mice. As expected, we found a strong signature of purifying selection not only against SNPs that destroy conserved target sites but also against SNPs that create novel, illegitimate target sites, which is reminiscent of the Texel mutation in sheep.
Phylogenetic studies of the emergence and spread of natural recombinants in herpesviruses infecting humans and animals have been reported recently. However, despite an ever-increasing amount of evidence of recombination in herpesvirus history, the recombination process and its consequences for the genetic diversity of the progeny remain poorly characterized. We addressed this issue by using multiple single-nucleotide polymorphisms (SNPs) differentiating the two subtypes of an alphaherpesvirus, bovine herpesvirus 1 (BoHV-1). Analysis of a large sample of progeny virions obtained in a single growth cycle of coinfected BoHV-1 strains provided a prospective investigation of the recombination dynamics by using SNPs as recombination markers. We found that the simultaneous infection with two closely related herpesviruses results in a highly diversified recombination mosaic. From the analysis of multiple recombinants arising in the progeny, we provide the first evidence of genetic interference influencing the recombination process in herpesviruses. In addition, we report striking differences in the levels of recombination frequency observed along the BoHV-1 genome. With particular emphasis on the genetic structure of a progeny virus population arising in vitro, our data show the extent to which recombination contributes to the genetic diversification of herpesviruses.
We herein describe the positional identification of a 2-bp deletion in the open reading frame of the MRC2 receptor causing the recessive Crooked Tail Syndrome in cattle. The resulting frame-shift reveals a premature stop codon that causes nonsense-mediated decay of the mutant messenger RNA, and the virtual absence of functional Endo180 protein in affected animals. Cases exhibit skeletal anomalies thought to result from impaired extracellular matrix remodeling during ossification, and as yet unexplained muscular symptoms. We demonstrate that carrier status is very significantly associated with desired characteristics in the general population, including enhanced muscular development, and that the resulting heterozygote advantage caused a selective sweep which explains the unexpectedly high frequency (25%) of carriers in the Belgian Blue Cattle Breed.
Livestock are being subjected to intense artificial selection aimed at ever-increasing, sometimes extreme, production phenotypes. This is well-illustrated by the exceptional muscular hypertrophy characterizing the “double-muscled” Belgian Blue Cattle Breed (BBCB). We herein identify a loss-of-function mutation of the bovine MRC2 gene that increases muscle mass in heterozygotes, yet causes skeletal and muscular malformations known as Crooked Tail Syndrome (CTS) in homozygotes. As a result of the “heterozygote advantage”, the MRC2 c.2904_2905delAG mutation has swept through the BBCB population, resulting in as many as 25% carrier animals and causing a sudden burst of CTS cases. These findings highlight one of the risks associated with pushing domestic animals to their physiological limits by intense artificial selection.
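To give a sense of scale for the 25% carrier frequency reported above, the following is a minimal sketch that converts a carrier frequency into an allele frequency and an expected fraction of homozygous mutant offspring. It assumes Hardy-Weinberg proportions and random mating, which is a simplification: a population under selection for heterozygotes will not satisfy these assumptions exactly, and the function names are illustrative rather than taken from the study.

```python
import math


def allele_freq_from_carrier_freq(carrier_freq: float) -> float:
    """Solve 2*q*(1-q) = carrier_freq for the mutant allele frequency q (q <= 0.5).

    Assumes Hardy-Weinberg proportions (an assumption, not a result of the study).
    """
    if not 0.0 <= carrier_freq <= 0.5:
        raise ValueError("carrier frequency must lie in [0, 0.5] under Hardy-Weinberg")
    return (1.0 - math.sqrt(1.0 - 2.0 * carrier_freq)) / 2.0


def expected_homozygote_freq(carrier_freq: float) -> float:
    """Expected fraction of homozygous mutant offspring (q squared) under random mating."""
    q = allele_freq_from_carrier_freq(carrier_freq)
    return q * q


if __name__ == "__main__":
    carrier_freq = 0.25  # carrier frequency quoted in the text
    q = allele_freq_from_carrier_freq(carrier_freq)
    print(f"mutant allele frequency q ~ {q:.3f}")
    print(f"expected homozygous mutant fraction ~ {expected_homozygote_freq(carrier_freq):.3f}")
```

Under these assumptions, a 25% carrier frequency corresponds to an allele frequency of roughly 0.146 and about 2% homozygous mutant offspring, which is consistent in spirit with a noticeable rise in CTS cases.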
Sensory ataxic neuropathy (SAN) is a recently identified neurological disorder in golden retrievers. Pedigree analysis revealed that all affected dogs belong to one maternal lineage, and a statistical analysis showed that the disorder has a mitochondrial origin. A one base pair deletion in the mitochondrial tRNATyr gene was identified at position 5304 in affected dogs after re-sequencing the complete mitochondrial genome of seven individuals. The deletion was not found among dogs representing 18 different breeds or in six wolves, ruling out this as a common polymorphism. The mutation could be traced back to a common ancestor of all affected dogs that lived in the 1970s. We used a quantitative oligonucleotide ligation assay to establish the degree of heteroplasmy in blood and tissue samples from affected dogs and controls. Affected dogs and their first to fourth degree relatives had 0–11% wild-type (wt) sequence, while more distant relatives ranged between 5% and 60% wt sequence and all unrelated golden retrievers had 100% wt sequence. Northern blot analysis showed that tRNATyr had a 10-fold lower steady-state level in affected dogs compared with controls. Four out of five affected dogs showed decreases in mitochondrial ATP production rates and respiratory chain enzyme activities together with morphological alterations in muscle tissue, resembling the changes reported in human mitochondrial pathology. Altogether, these results provide conclusive evidence that the deletion in the mitochondrial tRNATyr gene is the causative mutation for SAN.
Mitochondrial disorders are a group of heterogeneous diseases. It has been estimated that the prevalence of mitochondrial diseases in humans, due to mutations of the mitochondrial genome (mtDNA), is approximately 1 in 8000 in a Caucasian population. Since the late 1980s, when the first disease-causing mutation in human mtDNA was identified, approximately 250 pathogenic mtDNA mutations have been described. Sensory ataxic neuropathy (SAN) is a recently identified neurological disorder in golden retriever dogs that is maternally transmitted. Affected dogs are ataxic, have postural reaction deficits, and exhibit reduced spinal reflexes. They have no pronounced muscle atrophy nor do they seem to be in pain. In this study, we report the identification and characterization of the mutation causing SAN, a single base pair deletion in the mitochondrial tRNATyr gene. The identification of this mutation makes it possible to eradicate the disease in golden retrievers. SAN constitutes a new animal model for mitochondrial disorders in humans.
Parasitic gastroenteritis caused by nematodes is only second to mastitis in terms of health costs to dairy farmers in developed countries. Sustainable control strategies complementing anthelmintics are desired, including selective breeding for enhanced resistance.
Results and Conclusion
To quantify and characterize the genetic contribution to variation in resistance to gastro-intestinal parasites, we measured the heritability of faecal egg and larval counts in the Dutch Holstein-Friesian dairy cattle population. The heritability of faecal egg counts ranged from 7 to 21% and was generally higher than for larval counts. We performed a whole genome scan in 12 paternal half-daughter groups for a total of 768 cows, corresponding to the ~10% most and least infected daughters within each family (selective genotyping). Two genome-wide significant QTL were identified in an across-family analysis, respectively on chromosomes 9 and 19, coinciding with previous findings in orthologous chromosomal regions in sheep. We identified six more suggestive QTL by within-family analysis. An additional 73 informative SNPs were genotyped on chromosome 19 and the ensuing high density map used in a variance component approach to simultaneously exploit linkage and linkage disequilibrium in an initial inconclusive attempt to refine the QTL map position.
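The selective genotyping step described above (keeping roughly the 10% most and least infected daughters within each family) can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the function name `select_extremes`, and the rounding rule are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def select_extremes(records, fraction=0.10):
    """Pick the lowest and highest phenotype tails within each family.

    records: iterable of (family_id, animal_id, phenotype) tuples (hypothetical layout).
    Returns {family_id: {"low": [animal_ids], "high": [animal_ids]}}.
    """
    by_family = defaultdict(list)
    for family_id, animal_id, phenotype in records:
        by_family[family_id].append((phenotype, animal_id))

    selected = {}
    for family_id, members in by_family.items():
        members.sort()  # ascending by phenotype, ties broken by animal id
        n = len(members)
        k = max(1, int(n * fraction))  # at least one animal per tail
        k = min(k, n // 2)             # tails must never overlap
        selected[family_id] = {
            "low": [animal for _, animal in members[:k]],
            "high": [animal for _, animal in members[n - k:]],
        }
    return selected


if __name__ == "__main__":
    # Toy data: one sire family with 20 daughters and made-up egg counts 0..19.
    toy = [("sire_A", f"cow_{i}", float(i)) for i in range(20)]
    print(select_extremes(toy))
```

Selecting within family rather than across the whole population keeps both tails represented for every sire, which is what a within-family linkage analysis needs.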
Polyunsaturated fatty acids (PUFA) have a role in many physiological processes, including energy production, modulation of inflammation, and maintenance of cell membrane integrity. High plasma PUFA concentrations have been shown to have beneficial effects on cardiovascular disease and mortality. To identify genetic contributors of plasma PUFA concentrations, we conducted a genome-wide association study of plasma levels of six omega-3 and omega-6 fatty acids in 1,075 participants in the InCHIANTI study on aging. The strongest evidence for association was observed in a region of chromosome 11 that encodes three fatty acid desaturases (FADS1, FADS2, FADS3). The SNP with the most significant association was rs174537 near FADS1 in the analysis of arachidonic acid (AA; p = 5.95×10−46). Minor allele homozygotes had lower AA compared to the major allele homozygotes and rs174537 accounted for 18.6% of the additive variance in AA concentrations. This SNP was also associated with levels of eicosadienoic acid (EDA; p = 6.78×10−9) and eicosapentaenoic acid (EPA; p = 1.07×10−14). Participants carrying the allele associated with higher AA, EDA, and EPA also had higher low-density lipoprotein cholesterol (LDL-C) and total cholesterol levels. Outside the FADS gene cluster, the strongest region of association mapped to chromosome 6 in the region encoding an elongase of very long fatty acids 2 (ELOVL2). In this region, association was observed with EPA (rs953413; p = 1.1×10−6). The effects of rs174537 were confirmed in an independent sample of 1,076 subjects participating in the GOLDN study. The ELOVL2 SNP was associated with docosapentaenoic acid and docosahexaenoic acid (DHA) but not with EPA in GOLDN. These findings show that polymorphisms of genes encoding enzymes in the metabolism of PUFA contribute to plasma concentrations of fatty acids.
Polyunsaturated fatty acids (PUFA) have a number of beneficial effects on human health. Plasma PUFA concentrations are determined by a combination of dietary intake and metabolic efficiency. To determine the genes involved in PUFA homeostasis, we scanned the genome for genetic variations associated with plasma PUFA concentrations. The fatty acid desaturase gene, studied in previous candidate gene association studies, was the strongest determinant of plasma PUFA. A second gene encoding a fatty acid elongase was associated with long chain PUFA. The results of this study contribute to our understanding of the genetics of PUFA homeostasis. These genetic markers may be useful tools to examine the inter-relationship between diet, genetics, and disease.
Yellow skin is an abundant phenotype among domestic chickens and is caused by a recessive allele (W*Y) that allows deposition of yellow carotenoids in the skin. Here we show that yellow skin is caused by one or more cis-acting and tissue-specific regulatory mutation(s) that inhibit expression of BCDO2 (beta-carotene dioxygenase 2) in skin. Our data imply that carotenoids are taken up from the circulation in both genotypes but are degraded by BCDO2 in skin from animals carrying the white skin allele (W*W). Surprisingly, our results demonstrate that yellow skin does not originate from the red junglefowl (Gallus gallus), the presumed sole wild ancestor of the domestic chicken, but most likely from the closely related grey junglefowl (Gallus sonneratii). This is the first conclusive evidence for a hybrid origin of the domestic chicken, and it has important implications for our views of the domestication process.
Many bird species possess yellow skin and legs whereas other species have white or black skin color. Yellow or white skin is due to the presence or absence of carotenoids. The genetic basis underlying this diversity is unknown. Domestic chickens with yellow skin are homozygous for a recessive allele, and white skinned chickens carry the dominant allele. As a result, chickens represent an ideal model for analyzing the genetic mechanisms responsible for skin color variation. In this study we demonstrate that yellow skin is caused by regulatory mutation(s) that inhibit expression of the beta-carotene dioxygenase 2 (BCDO2) enzyme in skin, but not in other tissues. Because BCDO2 cleaves colorful carotenoids into colorless apocarotenoids, a reduction in expression of this gene produces yellow skin. This study also provides the first conclusive evidence of a hybrid origin of the domestic chicken. It has been generally assumed that the red junglefowl is the sole ancestor of the domestic chicken. A phylogenetic analysis, however, demonstrates that though the white skin allele originates from the red junglefowl, the yellow skin allele originates from a different species, most likely the grey junglefowl. This result significantly advances our understanding of chicken domestication.
Advances in high-throughput genotyping and the International HapMap Project have enabled association studies at the whole-genome level. We have constructed whole-genome genotyping panels of over 550,000 (HumanHap550) and 650,000 (HumanHap650Y) SNP loci by choosing tag SNPs from all populations genotyped by the International HapMap Project. These panels also contain additional SNP content in regions that have historically been overrepresented in diseases, such as nonsynonymous sites, the MHC region, copy number variant regions and mitochondrial DNA. We estimate that the tag SNP loci in these panels cover the majority of all common variation in the genome as measured by coverage of both all common HapMap SNPs and an independent set of SNPs derived from complete resequencing of genes obtained from SeattleSNPs. We also estimate that, given a sample size of 1,000 cases and 1,000 controls, these panels have the power to detect single disease loci of moderate risk (λ ∼ 1.8–2.0). Relative risks as low as λ ∼ 1.1–1.3 can be detected using 10,000 cases and 10,000 controls depending on the sample population and disease model. If multiple loci are involved, the power increases significantly to detect at least one locus such that relative risks 20%–35% lower can be detected with 80% power if between two and four independent loci are involved. Although our SNP selection was based on HapMap data, which is a subset of all common SNPs, these panels effectively capture the majority of all common variation and provide high power to detect risk alleles that are not represented in the HapMap data.
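The statement above that power to detect at least one locus rises when several independent loci are involved follows from a simple probability argument, sketched below. The sketch assumes that every locus has the same per-locus power and that detections are independent; these are illustrative assumptions, and the authors' actual power calculation is not described here.

```python
def power_at_least_one(per_locus_power: float, n_loci: int) -> float:
    """Probability of detecting at least one of n_loci independent loci."""
    return 1.0 - (1.0 - per_locus_power) ** n_loci


def per_locus_power_needed(target_power: float, n_loci: int) -> float:
    """Per-locus power required so that at least one of n_loci is detected
    with probability target_power (inverse of power_at_least_one)."""
    return 1.0 - (1.0 - target_power) ** (1.0 / n_loci)


if __name__ == "__main__":
    for k in (1, 2, 4):
        needed = per_locus_power_needed(0.80, k)
        print(f"{k} loci: per-locus power needed for 80% overall = {needed:.3f}")
```

With two loci, a per-locus power of about 0.55 already gives 80% power to find at least one of them, and with four loci about 0.33 suffices, which is why smaller relative risks become detectable.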
Advances in high-throughput genotyping technology and the International HapMap Project have enabled genetic association studies at the whole-genome level. Our paper describes two genome-wide SNP panels that contain tag SNPs derived from the International HapMap Project. Tag SNPs are proxies for groups of highly correlated SNPs. Information can be captured for the entire group of correlated SNPs by genotyping only one representative SNP, the tag SNP. These whole-genome SNP panels also contain additional content thought to be overrepresented in disease, such as amino acid–changing nonsynonymous SNPs and mitochondrial SNPs. We show that these panels cover the genome with very high efficiency as measured by coverage of all HapMap SNPs and a set of SNPs derived from completely resequenced genes from the Seattle SNPs database. We also show that these panels have high power to detect disease risk alleles for both HapMap and non-HapMap SNPs. In complex disease where multiple risk alleles are believed to be involved, we show that the ability to detect at least one risk allele with the tag SNP panels is also high.
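The idea of a tag SNP standing in for a group of highly correlated SNPs can be illustrated with a small greedy selection sketch. It is not the procedure used to build the panels: the r² threshold of 0.8, the use of squared Pearson correlation on unphased 0/1/2 genotype codes as a stand-in for haplotype-based r², and the function names are all assumptions made for illustration.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors coded 0/1/2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no correlation information
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def greedy_tag_snps(genotypes, threshold=0.8):
    """Greedily pick tags until every SNP is itself a tag or has r^2 >= threshold with one.

    genotypes: dict mapping SNP id -> list of genotype codes (same sample order).
    """
    ids = list(genotypes)
    # For each candidate tag, the set of SNPs it would capture (always including itself).
    covers = {
        s: {t for t in ids
            if s == t or r_squared(genotypes[s], genotypes[t]) >= threshold}
        for s in ids
    }
    uncovered = set(ids)
    tags = []
    while uncovered:
        # Choose the SNP that captures the most still-uncovered SNPs.
        best = max(ids, key=lambda s: len(covers[s] & uncovered))
        tags.append(best)
        uncovered -= covers[best]
    return tags


if __name__ == "__main__":
    toy = {
        "snp1": [0, 1, 2, 0, 1, 2],
        "snp2": [0, 1, 2, 0, 1, 2],  # identical to snp1
        "snp3": [2, 1, 0, 2, 1, 0],  # perfectly anti-correlated with snp1
        "snp4": [0, 0, 1, 2, 2, 1],  # uncorrelated with snp1
    }
    print(greedy_tag_snps(toy))
```

In the toy data, one tag captures the three mutually correlated SNPs and the uncorrelated SNP must be genotyped on its own, which mirrors how a tag panel reduces the number of assays without losing much information.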