Substantial advances have been made in identifying common genetic variants influencing cardiometabolic traits and disease outcomes through genome-wide association studies. Nevertheless, gaps in knowledge remain and new questions have arisen regarding the population relevance, mechanisms, and applications for healthcare. Using a new high-resolution custom single nucleotide polymorphism (SNP) array (Metabochip) incorporating dense coverage of genomic regions linked to cardiometabolic disease, the University College-London School-Edinburgh-Bristol (UCLEB) consortium of highly-phenotyped population-based prospective studies aims to: (1) fine map functionally relevant SNPs; (2) precisely estimate individual absolute and population attributable risks based on individual SNPs and their combination; (3) investigate mechanisms leading to altered risk factor profiles and CVD events; and (4) use Mendelian randomisation to undertake studies of the causal role in CVD of a range of cardiovascular biomarkers to inform public health policy and help develop new preventative therapies.
Algorithms combining both clinical and genetic data have been developed to improve oral anticoagulant therapy. Three polymorphisms in two genes, VKORC1 and CYP2C9, are the main coumarin dose determinants and no additional polymorphisms of any relevant pharmacogenetic importance have been identified.
To identify new genetic variations in VKORC1 with relevance for oral anticoagulant therapy.
Methods and Results
3949 consecutive patients taking acenocoumarol were genotyped for the VKORC1 rs9923231 and CYP2C9* polymorphisms. Of these, 145 patients with a dose outside the expected range for the genetic profile determined by these polymorphisms were selected. Clinical factors explained the phenotype in 88 patients. In the remaining 57 patients, all with higher doses than expected, we sequenced the VKORC1 gene and genetic changes were identified in 14 patients. Four patients carried VKORC1 variants previously related to high coumarin doses (L128R, N = 1 and D36Y, N = 3). Three polymorphisms were also detected: rs17878544 (N = 5), rs55894764 (N = 4) and rs7200749 (N = 2), which was in linkage disequilibrium with rs17878544. Finally, 2 patients had lost the rs9923231/rs9934438 linkage. The prevalence of these variations was higher in these patients than in the whole sample. Multivariate linear regression analysis revealed that only the D36Y and rs55894764 variants significantly affected the dose, although the improvement in the prediction model was small (from 39% to 40%).
Our strategy identified novel associations of VKORC1 variants with higher acenocoumarol doses albeit with a low effect size. Further studies are necessary to test their influence on the VKORC1 function and the cost/benefit of their inclusion in pharmacogenetic algorithms.
Motivation: The number of missense mutations being identified in cancer genomes has greatly increased as a consequence of technological advances and the reduced cost of whole-genome/whole-exome sequencing methods. However, a high proportion of the amino acid substitutions detected in cancer genomes have little or no effect on tumour progression (passenger mutations). Therefore, accurate automated methods capable of discriminating between driver (cancer-promoting) and passenger mutations are becoming increasingly important. In our previous work, we developed the Functional Analysis through Hidden Markov Models (FATHMM) software and, using a model weighted for inherited disease mutations, observed improved performances over alternative computational prediction algorithms. Here, we describe an adaptation of our original algorithm that incorporates a cancer-specific model to potentiate the functional analysis of driver mutations.
Results: The performance of our algorithm was evaluated using two separate benchmarks. In our analysis, we observed improved performances when distinguishing between driver mutations and other germ line variants (both disease-causing and putatively neutral mutations). In addition, when discriminating between somatic driver and passenger mutations, we observed performances comparable with the leading computational prediction algorithms: SPF-Cancer and TransFIC.
Availability and implementation: A web-based implementation of our cancer-specific model, including a downloadable stand-alone package, is available at http://fathmm.biocompute.org.uk.
Supplementary data are available at Bioinformatics online.
MeltMADGE reconfigures the mutation scanning process of denaturing gradient gel electrophoresis (DGGE) so that the independent variable is time rather than space and the dependent (denaturing) variable is temperature rather than concentration of chemical denaturant. Use of a thermal ramp enables use of a homogeneous gel and therefore of high density arrays of wells such as those of microplate array diagonal gel electrophoresis (MADGE). In this configuration, electrophoresis of products on 10-12 96-well meltMADGE gels can be conducted in a 1-2 litre tank in a 1-2hour run, enabling the scanning of a target amplicon in over 1,000 subjects simultaneously. Gels are read by imaging fluorescence of UV-excited ethidium bromide, giving a simple, economical system for identifying rarer sequence variants in target genes which is suitable for large-scale case-control or population studies and other comparable applications. Different amplicons with similar melting characteristics can also be combined into the same run.
Primary ciliary dyskinesia (PCD) is a genetic disorder, usually autosomal recessive, causing early respiratory disease and later subfertility. Whole exome sequencing may enable efficient analysis for locus heterogeneous disorders such as PCD. We whole exome sequenced one consanguineous Saudi Arabian with clinically diagnosed PCD and normal laterality, to attempt ab initio molecular diagnosis.
We reviewed thirteen known PCD genes and potentially autozygous regions (extended homozygosity) for homozygous exon deletions, non-dbSNP codon, splice-site base variants or small indels. Homozygous non-dbSNP changes were also reviewed exome-wide.
One single molecular read representing RSPH9 p.Lys268del was observed, with no wildtype reads, and a notable deficiency of mapped reads at this location. Among all observations, RSPH9 was the strongest candidate for causality. Searching unmapped reads revealed seven more mutant reads. Direct assay for p.Lys268del (MboII digest) confirmed homozygosity in the affected individual, then confirmed homozygosity in three siblings with bronchiectasis. Our finding in southwest Saudi Arabia indicates that p.Lys268del, previously observed in two Bedouin families (Israel, UAE) is geographically widespread in the Arabian Peninsula. Analogous with cystic fibrosis CFTR p.Phe508del, screening for RSPH9 p.Lys268del (which lacks sentinel dextrocardia) in those at risk would help in early diagnosis, tailored clinical management, genetic counselling and primary prevention.
high-throughput nucleotide sequencing; primary ciliary dyskinesia; screening
Germ-line mutation rate has been regarded classically as a fundamental biological parameter, as it affects the prevalence of genetic disorders and the rate of evolution. Somatic mutation rate is also an important biological parameter, as it may influence the development and/or the course of acquired diseases, particularly of cancer. Estimates of this parameter have been previously obtained in a few instances from dermal fibroblasts and lymphoblastoid cells. However, the methodology required has been laborious and did not lend itself to the analysis of large numbers of samples. We have previously shown that the X-linked gene PIG-A, since its product is required for glycosyl-phosphatidylinositol-anchored proteins to become surface bound, is a good sentinel gene for studying somatic mutations. We now show that by this approach we can accurately measure the proportion of PIG-A mutant peripheral blood granulocytes, which we call mutant frequency, ƒ. We found that the results are reproducible, with a variation coefficient (CV) of 45%. Repeat samples from 32 subjects also had a CV of 44%, indicating that ƒ is a relatively stable individual characteristic. From a study of 142 normal subjects we found that log ƒ is a normally distributed variable; ƒ variability spans an 80-fold range, from less than 1×10⁻⁶ to 37.5×10⁻⁶, with a median of 4.9×10⁻⁶. Unlike other techniques commonly employed in population studies, such as the comet assay, this method can detect any kind of mutation, including point mutation, as long as it causes functional inactivation of the PIG-A gene. Since the test is rapid and requires only a small sample of peripheral blood, this methodology will lend itself to investigating genetic factors that underlie the variation in the somatic mutation rate, as well as environmental factors that may affect it. It will also be possible to test whether ƒ is a determinant of the risk of cancer.
Autoimmune polyendocrine syndrome type 1 (APS-1) is a rare autosomal recessive disease defined by the presence of two of the three conditions: mucocutaneous candidiasis, hypoparathyroidism, and Addison’s disease. Loss-of-function mutations of the autoimmune regulator (AIRE) gene have been linked to APS-1. Here we report mutational analysis and functional characterization of an AIRE mutation in a consanguineous Chinese family with APS-1. All exons of the AIRE gene and adjacent exon-intron sequences were amplified by PCR and subsequently sequenced. We identified a homozygous missense AIRE mutation c.463G>A (p.Gly155Ser) in two siblings with different clinical features of APS-1. In silico splice-site prediction and minigene analysis were carried out to study the potential pathological consequence. Minigene splicing analysis and subsequent cDNA sequencing revealed that the AIRE mutation potentially compromised the recognition of the splice donor of intron 3, causing alternative pre-mRNA splicing by intron 3 retention. Furthermore, the aberrant AIRE transcript was identified in a heterozygous carrier of the c.463G>A mutation. The aberrant intron 3-retaining transcript generated a truncated protein (p.G155fsX203) containing the first 154 AIRE amino acids and followed by 48 aberrant amino acids. Therefore, our study represents the first functional characterization of the alternatively spliced AIRE mutation that may explain the pathogenetic role in APS-1.
Polymorphisms in apolipoprotein genes have been shown to be predictors of plasma lipid levels in adult cohorts receiving highly active antiretroviral therapy (HAART). Our objective was to confirm the association between the APOC3 genotype and plasma lipid levels in an HIV-1-infected pediatric cohort exposed to HAART. A total of 130 HIV-1-infected children/adolescents who attended a reference center in Argentina were selected for an 8-year longitudinal study with retrospective data collection. Longitudinal measurements of plasma triglycerides, total cholesterol, HDL-C and LDL-C were analyzed under linear or generalized linear mixed models. The contribution of the APOC3 genotype at sites −482, −455 and 3238 to plasma lipid levels prediction was tested after adjusting for potential confounders. Four major APOC3 haplotypes were observed for sites −482/−455/3238, with estimated frequencies of 0.60 (C/T/C), 0.14 (T/C/C), 0.11 (C/C/C), and 0.11 (T/C/G). The APOC3 genotype showed a significant effect only for the prediction of total cholesterol levels (p<0.0001). However, the magnitude of the differences observed was dependent on the drug combination (p = 0.0007) and the drug exposure duration at the time of the plasma lipid measurement (p = 0.0002). A lower risk of hypercholesterolemia was predicted for double and triple heterozygous individuals, mainly in the first few months after the initiation of Ritonavir-boosted protease inhibitor-based regimens. We report for the first time a significant contribution of the genotype to total cholesterol levels in a pediatric cohort under HAART. The genetic determination of APOC3 might have an impact on a large portion of HIV-1-infected children at the time of choosing the treatment regimens or on the counter-measures against the adverse effects of drugs.
HP and HPR are related and contiguous genes in strong linkage disequilibrium (LD), encoding haptoglobin and haptoglobin-related protein. These bind and chaperone free Hb for recycling, protecting against oxidation. A copy number variation (CNV) within HP (Hp1/Hp2) results in different possible haptoglobin complexes which have differing properties. HPR rs2000999 (G/A), identified in meta-GWAS, influences total cholesterol (TC) and LDL-cholesterol (LDL-C). We examined the relationship between HP CNV, HPR rs2000999, Hb, red cell count (RCC), LDL-C and TC in the British Women's Heart and Health Study (n = 2779 for samples having CNV, rs2000999, and phenotypes). Analysing single markers by linear regression, rs2000999 was associated with LDL-C (β = 0.040 mmol/L, p = 0.023), TC (β = −0.040 mmol/L, p = 0.019), Hb (β = −0.044 g/dL, p = 0.028) and borderline with RCC (β = −0.032 × 10¹²/L, p = 0.066). Analysis of CNV by linear regression revealed an association with Hb (Hp1 vs Hp2, β = 0.057 g/dL, p = 0.004), RCC (β = 0.045 × 10¹²/L, p = 0.014), and showed a trend with LDL-C and TC. There were 3 principal haplotypes (Hp1-G 36%; Hp2-G 45%; Hp2-A 18%). Haplotype comparisons showed that LDL-C and TC associations were from rs2000999; Hb and RCC associations derived largely from the CNV. Distinct genotype–phenotype effects are evident at the genetic epidemiological level once LD has been analysed, perhaps reflecting HP–HPR functional biology and evolutionary history. The derived Hp2 allele of the HP gene has apparently been subject to malaria-driven positive selection. Haptoglobin-related protein binds Hb and apolipoprotein-L, i.e. linking HPR to the cholesterol system; and the HPR/apo-L complex is specifically trypanolytic. Our analysis illustrates the complex interplay between functions and haplotypes of adjacent genes, environmental context and natural selection, and offers insights into potential use of haptoglobin or haptoglobin-related protein as therapeutic agents.
► HP CNV/HPR SNP haplotype analysis shows association of HP CNV with Hb levels/RCC.
► HP CNV/HPR SNP haplotype analysis shows association of HPR SNP with LDL-C/TC.
► The HP CNV/Hb-related association may be via Hp2 allele advantage in malaria zones.
► The HPR SNP/cholesterol association is likely via apolipoproteins in TLF-1 and -2.
► We infer that HP CNV duplication preceded HPR SNP mutation.
HP, haptoglobin gene; HPR, haptoglobin-related protein gene; LD, linkage disequilibrium; Hb, haemoglobin; CNV, copy number variant; GWAS, genome-wide association study; TC, total cholesterol; LDL-C, low density lipoprotein cholesterol; HDL-C, high density lipoprotein cholesterol; RCC, red cell count; Apo-L, apolipoprotein-L; Apo-A, apolipoprotein-A; SNP, single nucleotide polymorphism; K-EDTA, potassium ethylene diamine tetraacetic acid; ARCS, amplification ratiometry control system; PCR, polymerase chain reaction; HW, Hardy Weinberg; HTR, haplotype trend regression; EM, expectation maximization; PASW, a statistical software package by the company SPSS Inc; PLINK, open-source software for whole genome data analysis; TLF-1, trypanosome lytic factor-1; TLF-2, trypanosome lytic factor-2; kD, kilo Daltons; CHD, coronary heart disease
HP; HPR; Haemoglobin; Cholesterol; Malaria; Trypanosome
To investigate the relationship between Angiopoietin-like protein 4 (Angptl4) levels, CHD biomarkers and ANGPTL4 variants.
Methods and Results
Plasma Angptl4 was quantified in 666 subjects of the Northwick Park Heart Study II using a validated ELISA. Seven ANGPTL4 SNPs were genotyped and CHD biomarkers assessed in the whole cohort (n=2775). Weighted mean (±SD) plasma Angptl4 levels were 10.0 (±11.0) ng/ml. Plasma Angptl4 concentration correlated positively with age (r=0.15, P<0.001) and body fat mass (r=0.19, P=0.003) but negatively with plasma HDL-cholesterol (r=−0.13, P=0.01). No correlation with triglycerides was observed. T266M was independently associated with plasma Angptl4 levels (P<0.001), but not associated with triglycerides or with CHD risk in the meta-analysis of five studies (4,061 cases/15,395 controls). E40K showed no independent association with plasma Angptl4 levels. In HEK293 and Huh7 cells compared to wild-type, E40K and T266M showed significantly altered synthesis and secretion, respectively.
These data suggest that circulating Angptl4 levels do not influence triglyceride levels or CHD risk since (1) Angptl4 levels were not correlated with triglycerides, (2) T266M, although associated with Angptl4 levels, showed no association with plasma triglycerides, and (3) triglyceride-lowering E40K did not influence Angptl4 levels. These results provide new insights into the role of Angptl4 in triglyceride metabolism.
Angptl4; E40K; T266M; cardiovascular disease; LPL
The causal nature of associations between circulating triglycerides, insulin resistance, and type 2 diabetes is unclear. We aimed to use Mendelian randomization to test the hypothesis that raised circulating triglyceride levels causally influence the risk of type 2 diabetes and raise normal fasting glucose levels and hepatic insulin resistance.
RESEARCH DESIGN AND METHODS
We tested 10 common genetic variants robustly associated with circulating triglyceride levels against the type 2 diabetes status in 5,637 case and 6,860 control subjects and four continuous outcomes (reflecting glycemia and hepatic insulin resistance) in 8,271 nondiabetic individuals from four studies.
Individuals carrying greater numbers of triglyceride-raising alleles had increased circulating triglyceride levels (SD 0.59 [95% CI 0.52–0.65] difference between the 20% of individuals with the most alleles and the 20% with the fewest alleles). There was no evidence that the carriers of greater numbers of triglyceride-raising alleles were at increased risk of type 2 diabetes (per weighted allele odds ratio [OR] 0.99 [95% CI 0.97–1.01]; P = 0.26). In nondiabetic individuals, there was no evidence that carriers of greater numbers of triglyceride-raising alleles had increased fasting insulin levels (SD 0.00 per weighted allele [95% CI −0.01 to 0.02]; P = 0.72) or increased fasting glucose levels (0.00 [−0.01 to 0.01]; P = 0.88). Instrumental variable analyses confirmed that genetically raised circulating triglyceride levels were not associated with increased diabetes risk, fasting glucose, or fasting insulin and, for diabetes, showed a trend toward a protective association (OR per 1-SD increase in log10 triglycerides: 0.61 [95% CI 0.45–0.83]; P = 0.002).
Genetically raised circulating triglyceride levels do not increase the risk of type 2 diabetes or raise fasting glucose or fasting insulin levels in nondiabetic individuals. One explanation for our results is that raised circulating triglycerides are predominantly secondary to the diabetes disease process rather than causal.
Low muscle mass and function have been associated with poorer indicators of physical capability in older people, which are in turn associated with increased mortality rates. The growth hormone/insulin-like growth factor (GH/IGF) axis is involved in muscle function and genetic variants in genes in the axis may influence measures of physical capability.
As part of the Healthy Ageing across the Life Course (HALCyon) programme, men and women from seven UK cohorts aged between 52 and 90 years old were genotyped for six polymorphisms: rs35767 (IGF1), rs7127900 (IGF2), rs2854744 (IGFBP3), rs2943641 (IRS1), rs2665802 (GH1) and the exon-3 deletion of GHR. The polymorphisms have previously been robustly associated with age-related traits or are potentially functional. Meta-analysis was used to pool within-study genotypic effects of the associations between the polymorphisms and four measures of physical capability: grip strength, timed walk or get up and go, chair rises and standing balance.
Few important associations were observed among the several tests. We found evidence that rs2665802 in GH1 was associated with inability to balance for 5 s (pooled odds ratio per minor allele = 0.90, 95% CI: 0.82–0.98, p-value = 0.01, n = 10,748), after adjusting for age and sex. We found no evidence for other associations between the polymorphisms and physical capability traits.
Our findings do not provide evidence for a substantial influence of these common polymorphisms in the GH/IGF axis on objectively measured physical capability levels in older adults.
A current concern in genetic epidemiology studies in admixed populations is that population stratification can lead to spurious results. The Brazilian census classifies individuals according to self-reported “color”, but several studies have demonstrated that stratifying according to “color” is not a useful strategy to control for population structure, due to the dissociation between self-reported “color” and genomic ancestry. We report the results of a study in a group of Brazilian siblings in which we measured skin pigmentation using a reflectometer, and estimated genomic ancestry using 21 Ancestry Informative Markers (AIMs). Self-reported “color”, according to the Brazilian census, was also available for each participant. This made it possible to evaluate the relationship between self-reported “color” and skin pigmentation, self-reported “color” and genomic ancestry, and skin pigmentation and genomic ancestry. We observed that, although there were significant differences between the three “color” groups in genomic ancestry and skin pigmentation, there was considerable dispersion within each group and substantial overlap between groups. We also saw that there was no good agreement between the “color” categories reported by each member of the sibling pair: 30 out of 86 sibling pairs reported different “color”, and in some cases, the sibling reporting the darker “color” category had lighter skin pigmentation. Socioeconomic status was significantly associated with self-reported “color” and genomic ancestry in this sample. This and other studies show that subjective classifications based on self-reported “color”, such as the one that is used in the Brazilian census, are inadequate to describe the population structure present in recently admixed populations. Finally, we observed that one of the AIMs included in the panel (rs1426654), which is located in the known pigmentation gene SLC24A5, was strongly associated with skin pigmentation in this sample.
The causal nature of associations between circulating triglycerides, insulin resistance and type 2 diabetes is unclear. We aimed to use Mendelian randomization to test the hypothesis that raised circulating triglyceride levels causally influence the risk of type 2 diabetes, raised normal fasting glucose levels, and hepatic insulin resistance.
Research design and methods
We tested 10 common genetic variants robustly associated with circulating triglyceride levels against type 2 diabetes status in 5637 cases, 6860 controls, and four continuous outcomes (reflecting glycemia and hepatic insulin resistance) in 8271 non-diabetic individuals from four studies.
Individuals carrying greater numbers of triglyceride-raising alleles had increased circulating triglyceride levels (0.59 SD [95% CI: 0.52, 0.65] difference between the 20% of individuals with the most alleles and the 20% with the fewest alleles). There was no evidence that carriers of greater numbers of triglyceride-raising alleles were at increased risk of type 2 diabetes (per weighted allele odds ratio (OR) 0.99 [95% CI: 0.97, 1.01]; P = 0.26). In non-diabetic individuals, there was no evidence that carriers of greater numbers of triglyceride-raising alleles had increased fasting insulin levels (0.00 SD per weighted allele [95% CI: −0.01, 0.02]; P = 0.72) or increased fasting glucose levels (0.00 SD per weighted allele [95% CI: −0.01, 0.01]; P = 0.88). Instrumental variable analyses confirmed that genetically raised circulating triglyceride levels were not associated with increased diabetes risk, fasting glucose or fasting insulin, and, for diabetes, showed a trend towards a protective association (OR per 1 SD increase in log10-triglycerides: 0.61 [95% CI: 0.45, 0.83]; P = 0.002).
Genetically raised circulating triglyceride levels do not increase the risk of type 2 diabetes, or raise fasting glucose or fasting insulin levels in non-diabetic individuals. One explanation for our results is that raised circulating triglycerides are predominantly secondary to the diabetes disease process rather than causal.
We describe a generic design for ratiometric analysis suitable for determination of copy number variation (CNV) class of a gene. Following two initial sequence-specific PCR priming cycles, both ends of both amplicons (one test and one reference) in a duplex reaction, are all primed by the same universal primer (UP). Following each amplification denaturation step, the UP target and its reverse complement (UP′) in each strand form a hairpin. The bases immediately beyond the 3′-end of the UP and 5′ of UP′ are chosen such as not to base pair in the hairpin (otherwise priming is ablated). This hairpin creates a single constant environment for priming events and chaperones free 3′-ends of amplicon strands. The resultant ‘amplification ratio control system’ (ARCS) permits ratiometric representation of amplicons relative to the original template into PCR plateau phase. These advantages circumvent the need for real-time PCR for quantitation. Choice of different %(G+C) content for the target and reference amplicons allows liquid phase thermal melt discrimination and quantitation of amplicons. The design is generic, simple to set up and economical. Comparisons with real-time PCR and other techniques are made and CNV assays demonstrated for haptoglobin duplicon and ‘chemokine (C-C motif) ligand 3-like 1’ gene.
Variation in pre-mRNA splicing is common and in some cases caused by genetic variants in intronic splicing motifs. Recent studies into the insulin gene (INS) discovered a polymorphism in a 5′ non-coding intron that influences the likelihood of intron retention in the final mRNA, extending the 5′ untranslated region and maintaining protein quality. Retention was also associated with increased insulin levels, suggesting that such variants - splice translational efficiency polymorphisms (STEPs) - may relate to disease phenotypes through differential protein expression. We set out to explore the prevalence of STEPs in the human genome and validate this new category of protein quantitative trait loci (pQTL) using publicly available data.
Gene transcript and variant data were collected and mined for candidate STEPs in motif regions. Sequences from transcripts containing potential STEPs were analysed for evidence of splice site recognition and an effect in expressed sequence tags (ESTs). 16 publicly released genome-wide association data sets of common diseases were searched for association to candidate polymorphisms with HapMap frequency data. Our study found 3324 candidate STEPs lying in motif sequences of 5′ non-coding introns and further mining revealed 170 with transcript evidence of intron retention. 21 potential STEPs had EST evidence of intron retention or exon extension, as well as population frequency data for comparison.
Results suggest that the insulin STEP was not a unique example and that many STEPs may occur genome-wide with potentially causal effects in complex disease. An online database of STEPs is freely accessible at http://dbstep.genes.org.uk/.
The association between the R allele of PON1 Q192R and symptoms reported by sheep dippers and Gulf War veterans has been used to suggest a biological basis for these symptoms. In the absence of such studies in non‐occupational populations, these conclusions may not be valid.
To examine the association of paraoxonase (PON1) Q192R with a report of ever being diagnosed with depression among a random sample of 3266 British women, aged 60–79 years.
The R allele of PON1 Q192R was associated with depression: per‐allele odds ratio 1.22 (95% confidence interval: 1.05 to 1.41) in this population.
These findings suggest that the association of PON1 Q192R with symptoms of depression in occupationally exposed groups may be driven by exposure to toxins that everyone in the general population is exposed to rather than exposure to toxins specifically used by sheep dippers or Gulf War veterans, or that other mechanisms underlie the association. This is because the study population in which we have found an association consisted of British women aged 60–79 years, few of whom were sheep dippers or Gulf War veterans. When using genotype–outcome associations to infer causality with respect to an environmental exposure modified by the genotype, it is important to examine these associations in general populations and in those specifically exposed to the putative agent. The possible role of PON1 Q192R in psychiatric morbidity requires further examination.
Mendelian randomization (MR) permits causal inference between exposures and a disease. It can be compared with randomized controlled trials. Whereas in a randomized controlled trial the randomization occurs at entry into the trial, in MR the randomization occurs during gamete formation and conception. Several factors, including time since conception and sampling variation, are relevant to the interpretation of an MR test. Particularly important is consideration of the “missingness” of genotypes, which can arise from chance, genotyping errors, or clinical ascertainment. Testing for Hardy-Weinberg equilibrium (HWE) is a genetic approach that permits evaluation of missingness. In this paper, the authors demonstrate evidence of nonconformity with HWE in real data. They also perform simulations to characterize the sensitivity of HWE tests to missingness. Unresolved missingness could lead to a false rejection of causality in an MR investigation of trait-disease association. These results indicate that large-scale studies, very high quality genotyping data, and detailed knowledge of the life-course genetics of the alleles/genotypes studied will largely mitigate this risk. The authors also present a Web program (http://www.oege.org/software/hwe-mr-calc.shtml) for estimating possible missingness and an approach to evaluating missingness under different genetic models.
epidemiologic methods; genetics; Hardy-Weinberg equilibrium; random allocation; research design
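The HWE test referred to above can be illustrated with a minimal sketch of the standard Pearson chi-square test (1 degree of freedom) for a biallelic locus. This is not the authors' Web program; the function name `hwe_chi_square` and the genotype counts in the usage note are hypothetical, chosen only for illustration.

```python
import math


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Pearson chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Arguments are observed genotype counts at a biallelic locus.
    Returns (chi-square statistic, p-value).
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele a
    if p <= 0.0 or q <= 0.0:
        raise ValueError("locus is monomorphic; the HWE test is undefined")

    # Genotype counts expected under HWE: p^2, 2pq, q^2.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, `hwe_chi_square(25, 50, 25)` describes a sample in exact Hardy-Weinberg proportions and gives a statistic of zero, whereas a sample from which heterozygotes are entirely missing, such as `hwe_chi_square(50, 0, 50)`, gives a large statistic and a very small p-value.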
It is unclear whether C-reactive protein (CRP) is causally related to coronary heart disease (CHD). Genetic variants that are known to be associated with CRP levels can be used to provide causal inference of the effect of CRP on CHD. Our objective was to examine the association between CRP genetic variant +1444C>T (rs1130864) and CHD risk in the largest study to date of this association.
Methods and Results
We estimated the association of CRP genetic variant +1444C>T (rs1130864) with CRP levels and with CHD in five studies and then pooled these analyses (N = 18,637 participants amongst whom there were 4,610 cases). CRP was associated with potential confounding factors (socioeconomic position, physical activity, smoking and body mass) whereas genotype (rs1130864) was not associated with these confounders. The pooled odds ratio of CHD per doubling of circulating CRP level after adjustment for age and sex was 1.13 (95%CI: 1.06, 1.21), and after further adjustment for confounding factors it was 1.07 (95%CI: 1.02, 1.13). Genotype (rs1130864) was associated with circulating CRP; the pooled ratio of geometric means of CRP level among individuals with the TT genotype compared to those with the CT/CC genotype was 1.21 (95%CI: 1.15, 1.28) and the pooled ratio of geometric means of CRP level per additional T allele was 1.14 (95%CI: 1.11, 1.18), with no strong evidence in either analysis of between-study heterogeneity (I² = 0%, p>0.9 for both analyses). There was no association of genotype (rs1130864) with CHD: pooled odds ratio 1.01 (95%CI: 0.88, 1.16) comparing individuals with TT genotype to those with CT/CC genotype and 0.96 (95%CI: 0.90, 1.03) per additional T allele (I²<7.5%, p>0.6 for both meta-analyses). An instrumental variables analysis (in which the proportion of CRP levels explained by rs1130864 was related to CHD) suggested that circulating CRP was not associated with CHD: the odds ratio for a doubling of CRP level was 1.04 (95%CI: 0.61, 1.80).
We found no association between a genetic variant known to be related to CRP levels (rs1130864) and CHD. These findings do not support a causal association between circulating CRP and CHD risk, but very large, extended genetic association studies would be required to rule this out.
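The instrumental variables logic can be illustrated with a ratio-of-coefficients (Wald) estimate. This is a minimal sketch: the abstract does not state which estimator the authors used, so the choice of the Wald ratio and the function name `wald_ratio_or` are assumptions. It uses only the pooled TT versus CT/CC figures reported above and does not attempt to reproduce the confidence interval.

```python
import math

def wald_ratio_or(or_outcome, ratio_exposure):
    """Odds ratio of the outcome per doubling of the exposure.

    or_outcome:     odds ratio of the outcome for a genotype contrast.
    ratio_exposure: ratio of geometric means of the exposure for the
                    same genotype contrast.
    Assumes (hypothetically) a simple ratio-of-coefficients estimator.
    """
    log_or = math.log(or_outcome)           # genotype -> outcome, log-odds scale
    doublings = math.log2(ratio_exposure)   # genotype -> exposure, in doublings
    return math.exp(log_or / doublings)

# Reported pooled figures for TT versus CT/CC:
# CHD odds ratio 1.01, ratio of geometric mean CRP 1.21.
iv_or = wald_ratio_or(1.01, 1.21)
print(round(iv_or, 2))  # 1.04
```

Under this assumption the point estimate is close to the reported 1.04. The wide reported interval reflects the fact that a small genotype effect on CRP sits in the denominator.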
We sought evidence of interaction between single nucleotide polymorphisms (SNPs) in the Calcium Sensing Receptor (CASR) gene and early life in determination of bone mineral density (BMD) among individuals from the Hertfordshire Cohort Study.
Four hundred and ninety eight men and 468 women aged 59-71 years were recruited. A lifestyle questionnaire was administered and BMD at lumbar spine and femoral neck was measured. DNA was obtained from whole blood samples using standard extraction techniques. Five SNPs of the CASR gene termed CASRV1 (rs1801725, G>T, S986A), CASRV2 (rs7614486, T>G, untranslated), CASRV3 (rs4300957, untranslated), CASRV4 (rs3804592 G>A, intron) and CASRV5 (rs1393189, T>C, intron) were analysed.
Among women the 11 genotype of the CASRV3 SNP was associated with higher lumbar spine BMD within the lowest birth-weight tertile, while the opposite pattern was observed among individuals in the highest birth-weight tertile (test for interaction on 1df, p=0.005, adjusted for age, BMI, physical activity, dietary calcium intake, cigarette and alcohol consumption, social class, menopausal status and HRT use). Similar relationships were seen at the total femur (p=0.042, fully adjusted) with birth-weight and at the total femur according to weight at one year tertile among women (p<0.001, fully adjusted). One haplotype was associated with lumbar spine BMD in women (p=0.008, fully adjusted); these findings were replicated in a second cohort.
We have found evidence of an interaction between a SNP of the CASR gene and birth weight in determination of bone mass in a UK female population.
bone; bone density; cohort studies; genetic studies
The frequency of a haplotype comprising one allele at each of two loci can be expressed as a cubic equation (the 'Hill equation'), the solution of which gives that frequency. Most haplotype and linkage disequilibrium analysis programs use iteration-based algorithms which substitute an estimate of haplotype frequency into the equation, producing a new estimate which is repeatedly fed back into the equation until the values converge to a maximum likelihood estimate (expectation-maximisation).
We present a program, "CubeX", which calculates the biologically possible exact solution(s) and provides estimated haplotype frequencies, D', r2 and χ2 values for each. CubeX provides a "complete" analysis of haplotype frequencies and linkage disequilibrium for a pair of biallelic markers under situations where sampling variation and genotyping errors distort sample Hardy-Weinberg equilibrium, potentially causing more than one biologically possible solution. We also present an analysis of simulations and real data using the algebraically exact solution, which indicates that under perfect sample Hardy-Weinberg equilibrium there is only one biologically possible solution, but that under other conditions there may be more.
Our analyses demonstrate that data with lower allele frequencies, smaller samples, population stratification and a possible |D'| value of 1 are particularly susceptible to distortion of sample Hardy-Weinberg equilibrium. This has significant implications for calculating linkage disequilibrium in small samples (e.g. HapMap) and for rarer alleles (e.g. paucimorphisms, q < 0.05), which may have particular disease relevance and require improved approaches for meaningful evaluation.
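For comparison, the iteration-based (expectation-maximisation) approach described above can be sketched as follows. This is not CubeX's exact cubic solution. The function name `em_two_locus`, the 3x3 genotype table layout and the linkage-equilibrium starting value are all assumptions made for illustration.

```python
def em_two_locus(counts, tol=1e-12, max_iter=10_000):
    """EM estimate of the AB haplotype frequency for two biallelic loci.

    counts[i][j] = number of individuals carrying i copies of allele A
    and j copies of allele B (i, j in 0..2). Only the double
    heterozygote, counts[1][1], has ambiguous phase.
    """
    n = counts
    n_haps = 2 * sum(sum(row) for row in n)
    n_dh = n[1][1]
    # Haplotypes that can be counted directly from unambiguous genotypes.
    known_AB = 2 * n[2][2] + n[2][1] + n[1][2]
    known_Ab = 2 * n[2][0] + n[2][1] + n[1][0]
    known_aB = 2 * n[0][2] + n[1][2] + n[0][1]
    # Allele frequencies (each double heterozygote carries one A and one B).
    p_A = (known_AB + known_Ab + n_dh) / n_haps
    p_B = (known_AB + known_aB + n_dh) / n_haps
    p_AB = p_A * p_B  # start from linkage equilibrium
    for _ in range(max_iter):
        cis = p_AB * (1 - p_A - p_B + p_AB)     # AB/ab phase
        trans = (p_A - p_AB) * (p_B - p_AB)     # Ab/aB phase
        share = cis / (cis + trans) if cis + trans > 0 else 0.5
        updated = (known_AB + n_dh * share) / n_haps
        converged = abs(updated - p_AB) < tol
        p_AB = updated
        if converged:
            break
    d = p_AB - p_A * p_B
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    return {
        "p_AB": p_AB,
        "D": d,
        "D_prime": d / d_max if d_max > 0 else float("nan"),
        "r2": d * d / denom if denom > 0 else float("nan"),
    }

# Hypothetical sample: 10 aa/bb individuals and 10 AA/BB individuals.
result = em_two_locus([[10, 0, 0], [0, 0, 0], [0, 0, 10]])
```

An iteration like this returns a single value that depends on where it starts, so it cannot by itself reveal whether other biologically possible solutions exist. That is the situation the exact solution is designed to expose.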
Various software tools are available for the display of pairwise linkage disequilibrium across multiple single nucleotide polymorphisms. The HapMap project also presents these graphics on its website. However, these approaches are limited in their use of data from multiallelic markers and provide limited information in graphical form.
We have developed a software package (MIDAS – Multiallelic Interallelic Disequilibrium Analysis Software) for the estimation and graphical display of interallelic linkage disequilibrium. Linkage disequilibrium is analysed for each allelic combination (of one allele from each of two loci), between all pairwise combinations of any type of multiallelic loci in a contig (or any set) of many loci (including single nucleotide polymorphisms, microsatellites, minisatellites and haplotypes). Data are presented graphically in a novel and informative way, and can also be exported in tabular form for other analyses. This approach facilitates visualisation of patterns of linkage disequilibrium across genomic regions, analysis of the relationships between different alleles of multiallelic markers and inferences about patterns of evolution and selection.
MIDAS is a linkage disequilibrium analysis program with a comprehensive graphical user interface providing novel views of patterns of linkage disequilibrium between all types of multiallelic and biallelic markers.
Available from and
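The idea of analysing linkage disequilibrium for each allelic combination of two multiallelic loci can be sketched as below. This is an illustration only, not MIDAS's implementation. It assumes phased haplotypes are already available as (allele at locus 1, allele at locus 2) tuples, and the function name `interallelic_ld` is hypothetical. D' is Lewontin's normalised coefficient computed per allele pair.

```python
from collections import Counter

def interallelic_ld(haplotypes):
    """Return {(a, b): (D, D')} for every allele pair across two loci.

    haplotypes: list of (allele_at_locus_1, allele_at_locus_2) tuples,
    assumed to be phased (an assumption of this sketch).
    """
    n = len(haplotypes)
    pair_counts = Counter(haplotypes)
    a_counts = Counter(a for a, _ in haplotypes)
    b_counts = Counter(b for _, b in haplotypes)
    table = {}
    for a, n_a in a_counts.items():
        for b, n_b in b_counts.items():
            p, q = n_a / n, n_b / n
            # Observed pair frequency minus its expectation under independence.
            d = pair_counts[(a, b)] / n - p * q
            if d >= 0:
                d_max = min(p * (1 - q), (1 - p) * q)
            else:
                d_max = min(p * q, (1 - p) * (1 - q))
            table[(a, b)] = (d, d / d_max if d_max > 0 else float("nan"))
    return table

# Hypothetical data: a three-allele marker paired with a biallelic SNP.
haps = [("M1", "C")] * 4 + [("M2", "T")] * 4 + [("M3", "C")] * 2
ld = interallelic_ld(haps)
```

Each entry of `ld` corresponds to one allelic combination, which is the unit a per-allele graphical display would plot.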
There have been inconsistent results from case-control studies assessing the association of the PON1 Q192R polymorphism with coronary heart disease (CHD). Most studies have included predominantly men and the association in women is unclear. Since lipid levels vary between the sexes the antioxidant effect of PON1 and any genes associated with it may also vary by sex. We have examined the association of the PON1 Q192R polymorphism with CHD in a large cohort of British women and combined the results from our cohort study with those from all other published studies.
The distribution of genotypes was the same among women with CHD and those without disease. The odds ratio (95% confidence interval) of having CHD comparing those with either the QR or RR genotype to those with QQ genotype (dominant model of association) was 1.03 (0.89, 1.21) and the per allele odds ratio was 0.98 (0.95, 1.01). In a meta-analysis of this and 38 other published studies (10,738 cases and 17,068 controls) the pooled odds ratio for the dominant effect was 1.14 (1.08, 1.20) and for the per allele effect was 1.10 (1.06, 1.13). There was evidence of small study bias in the meta-analyses and the dominant effect among those studies with 500 or more cases was 1.05 (0.96, 1.15). Ethnicity and reporting of whether the genotyping was done blind to the participants' clinical status also contributed to heterogeneity between studies, but there was no difference in effect between studies with 50% or more women compared to those with fewer women and no difference between studies of healthy populations compared to those at high risk (with diabetes, renal disease or familial hypercholesterolaemia).
There is no robust evidence that the PON1 Q192R polymorphism is associated with CHD risk in Caucasian women or men.
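Pooling odds ratios across studies can be sketched with a fixed-effect inverse-variance calculation. The abstract does not say which pooling model was used, so this choice, the function name `pool_fixed` and the input numbers (hypothetical studies, not those in the meta-analysis) are assumptions. Standard errors are recovered from each study's 95% confidence interval on the log scale.

```python
import math

Z95 = 1.959964  # normal quantile for a two-sided 95% interval

def pool_fixed(studies):
    """Fixed-effect inverse-variance pooled odds ratio.

    studies: list of (odds_ratio, ci_lower, ci_upper) tuples with 95% CIs.
    Returns (pooled_or, ci_lower, ci_upper).
    """
    weight_sum = 0.0
    weighted_log_or = 0.0
    for odds_ratio, lower, upper in studies:
        # Standard error of log(OR) recovered from the CI width.
        se = (math.log(upper) - math.log(lower)) / (2 * Z95)
        weight = 1.0 / se ** 2
        weight_sum += weight
        weighted_log_or += weight * math.log(odds_ratio)
    pooled = weighted_log_or / weight_sum
    pooled_se = math.sqrt(1.0 / weight_sum)
    return (
        math.exp(pooled),
        math.exp(pooled - Z95 * pooled_se),
        math.exp(pooled + Z95 * pooled_se),
    )

# Hypothetical inputs: one small study with a large effect, one large
# study with an effect near the null.
pooled_or, lo, hi = pool_fixed([(1.60, 0.90, 2.84), (1.05, 0.96, 1.15)])
```

Because weights are proportional to precision, the large study dominates the pooled estimate. This is why restricting the analysis to studies with 500 or more cases can move the result towards the null when small studies report larger effects.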
Electrophoresis continues to be a mainstay in molecular genetic laboratories for checking, sizing and separating PCR products, nucleic acids derived from in vivo or in vitro sources, and nucleic acid–protein complexes. Many genomic and genetic applications demand high throughput, such as the checking of amplification products from many loci, from many clones, from many cell lines or from many individuals at once. These applications include microarray resource development and expression analysis, genome mapping, library and DNA bank screening, mutagenesis experiments and single nucleotide polymorphism (SNP) genotyping. PCR hardware compatible with industry standard 96 and 384 well microplates is commonplace. We have previously described a simple system for submerged horizontal 96 and 192 well polyacrylamide or agarose microplate array diagonal gel electrophoresis (MADGE) which is microplate compatible and suitable for PCR checking, SNP typing (restriction fragment length polymorphism or amplification refractory mutation system), microsatellite sizing and identification of unknown mutations. By substantial redesign of format and operations, we have derived an efficient ‘dry’ gel system that enables direct 96 pin manual transfer from PCR or other reactions in microplates, into 768 or 384 well gels. Combined with direct electrode contact in clamshell electrophoresis boxes which plug directly into contacts in a powered stacking frame and using 5–10 min electrophoresis times, it would be possible (given a sufficient supply of PCRs for examination) for 1 million gel tracks to be run per day for a minimal hardware investment and at minimal reagent costs. Applications of this system for PCR checking and SNP genotyping are illustrated.
The rate at which nonsynonymous single nucleotide polymorphisms (nsSNPs) are being identified in the human genome is increasing dramatically owing to advances in whole-genome/whole-exome sequencing technologies. Automated methods capable of accurately and reliably distinguishing between pathogenic and functionally neutral nsSNPs are therefore assuming ever-increasing importance. Here, we describe the Functional Analysis Through Hidden Markov Models (FATHMM) software and server: a species-independent method with optional species-specific weightings for the prediction of the functional effects of protein missense variants. Using a model weighted for human mutations, we obtained performance accuracies that outperformed traditional prediction methods (i.e., SIFT, PolyPhen, and PANTHER) on two separate benchmarks. Furthermore, in one benchmark, we achieve performance accuracies that outperform current state-of-the-art prediction methods (i.e., SNPs&GO and MutPred). We demonstrate that FATHMM can be efficiently applied to high-throughput/large-scale human and nonhuman genome sequencing projects with the added benefit of phenotypic outcome associations. To illustrate this, we evaluated nsSNPs in wheat (Triticum spp.) to identify some of the important genetic variants responsible for the phenotypic differences introduced by intense selection during domestication. A Web-based implementation of FATHMM, including a high-throughput batch facility and a downloadable standalone package, is available at http://fathmm.biocompute.org.uk.
SNP; hidden Markov models; FATHMM