We applied genome-wide allele-specific expression analysis to monocytes from 188 samples. The monocytes were purified from white blood cells of healthy blood donors, and the analysis aimed to detect cis-acting genetic variation that regulates the expression of long non-coding RNAs (lncRNAs). We analysed 8929 regions harboring genes for potential long non-coding RNA that were retrieved from data from the ENCODE project. Of these regions, 60% were annotated as intergenic, which implies that they do not overlap with protein-coding genes. Focusing on the intergenic regions, and using stringent analysis of the allele-specific expression data, we detected robust cis-regulatory SNPs in 258 out of 489 informative intergenic regions included in the analysis. The cis-regulatory SNPs that were significantly associated with allele-specific expression of long non-coding RNAs were enriched in enhancer regions marked by histone modifications as active or bivalent (poised) chromatin. Out of the lncRNA regions regulated by cis-acting regulatory SNPs, 20% (n = 52) were co-regulated with the closest protein-coding gene. We compared the identified cis-regulatory SNPs with those in the catalog of SNPs identified by genome-wide association studies of human diseases and traits. This comparison identified 32 SNPs in loci from genome-wide association studies that displayed a strong association signal with allele-specific expression of non-coding RNAs in monocytes, with p-values ranging from 6.7×10⁻⁷ to 9.5×10⁻⁸⁹. The identified cis-regulatory SNPs are associated with diseases of the immune system, such as multiple sclerosis and rheumatoid arthritis.
Target enrichment and resequencing is a widely used approach for identification of cancer genes and genetic variants associated with diseases. Although cost effective compared to whole genome sequencing, analysis of many samples constitutes a significant cost, which could be reduced by pooling samples before capture. Another limitation to the number of cancer samples that can be analyzed is often the amount of available tumor DNA. We evaluated the performance of whole genome amplified DNA and the power to detect subclonal somatic single nucleotide variants in non-indexed pools of cancer samples using the HaloPlex technology for target enrichment and next generation sequencing.
We captured a set of 1528 putative somatic single nucleotide variants and germline SNPs, which were identified by whole genome sequencing, with the HaloPlex technology and sequenced to a depth of 792–1752. We found that the allele fractions of the analyzed variants are well preserved during whole genome amplification and that capture specificity or variant calling is not affected. We detected a large majority of the known single nucleotide variants present uniquely in one sample with allele fractions as low as 0.1 in non-indexed pools of up to ten samples. We also identified and experimentally validated six novel variants in the samples included in the pools.
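The pooling result above can be made concrete with a short calculation. The following is a minimal sketch, assuming that every sample contributes an equal amount of DNA to the pool and that variant reads follow a simple binomial model with no sequencing error; neither assumption is stated in the abstract, and the function names and the threshold of three reads are illustrative only.

```python
from math import comb


def pooled_allele_fraction(sample_fraction: float, n_samples: int) -> float:
    """Expected allele fraction in the pool for a variant present in one sample only.

    Assumes equal DNA input from each of the n_samples (an illustrative assumption).
    """
    return sample_fraction / n_samples


def prob_at_least_k_variant_reads(depth: int, fraction: float, k: int) -> float:
    """Binomial probability of observing at least k variant reads at a given depth."""
    p_fewer = sum(
        comb(depth, i) * fraction**i * (1.0 - fraction) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - p_fewer


# A variant at allele fraction 0.1 in one sample, pooled with nine other samples.
pool_fraction = pooled_allele_fraction(0.1, 10)

# Depth range taken from the text (792 to 1752); k = 3 is a hypothetical calling threshold.
for depth in (792, 1752):
    p = prob_at_least_k_variant_reads(depth, pool_fraction, 3)
    print(f"depth={depth}: expected variant reads={depth * pool_fraction:.1f}, P(>=3 reads)={p:.3f}")
```

Under these assumptions the variant is diluted to roughly 1% of reads in a ten-sample pool, which illustrates why deep sequencing is needed for detection in non-indexed pools.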
Our work demonstrates that whole genome amplified DNA can be used for target enrichment as effectively as genomic DNA and that accurate variant detection is possible in non-indexed pools of cancer samples. These findings show that analysis of a large number of samples is feasible at low cost, even when only small amounts of DNA are available, which significantly increases the chances of identifying recurrent mutations in cancer samples.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-14-856) contains supplementary material, which is available to authorized users.
Target enrichment; HaloPlex; Non-indexed pooling; Whole genome amplification; Single nucleotide variant; Deep sequencing
Although aberrant DNA methylation has been observed previously in acute lymphoblastic leukemia (ALL), the patterns of differential methylation have not been comprehensively determined in all subtypes of ALL on a genome-wide scale. The relationship between DNA methylation, cytogenetic background, drug resistance and relapse in ALL is poorly understood.
We surveyed the DNA methylation levels of 435,941 CpG sites in samples from 764 children at diagnosis of ALL and from 27 children at relapse. This survey uncovered four characteristic methylation signatures. First, compared with control blood cells, the methylomes of ALL cells shared 9,406 predominantly hypermethylated CpG sites, independent of cytogenetic background. Second, each cytogenetic subtype of ALL displayed a unique set of hyper- and hypomethylated CpG sites. The CpG sites that constituted these two signatures differed in their functional genomic enrichment to regions with marks of active or repressed chromatin. Third, we identified subtype-specific differential methylation in promoter and enhancer regions that were strongly correlated with gene expression. Fourth, a set of 6,612 CpG sites was predominantly hypermethylated in ALL cells at relapse, compared with matched samples at diagnosis. Analysis of relapse-free survival identified CpG sites with subtype-specific differential methylation that divided the patients into different risk groups, depending on their methylation status.
Our results suggest an important biological role for DNA methylation in the differences between ALL subtypes and in their clinical outcome after treatment.
Genome-wide association analysis on monozygotic twin pairs offers a route to discovery of gene–environment interactions through testing for variability loci associated with sensitivity to individual environment/lifestyle. We present a genome-wide scan of loci associated with intra-pair differences in serum lipid and apolipoprotein levels. We report data for 1,720 monozygotic female twin pairs from the GenomEUtwin project with 2.5 million SNPs, imputed or genotyped, and measured serum lipid fractions for both twins. We found one locus associated with intra-pair differences in high density lipoprotein (HDL) cholesterol, rs2483058 in an intron of SRGAP2, where twins carrying the C allele are more sensitive to environmental factors (p = 3.98 × 10⁻⁸). We followed up the association in further genotyped monozygotic twins (N = 1,261), which showed a moderate association for the variant (p = 0.002, same direction of effect). In addition, we report a new association with the level of apolipoprotein A-II (p = 4.03 × 10⁻⁸).
twins; association; lipids; apolipoproteins; interaction
A large number of genome-wide association studies have been performed during the past five years to identify associations between SNPs and human complex diseases and traits. The assignment of a functional role to the identified disease-associated SNPs is not straightforward. Genome-wide expression quantitative trait locus (eQTL) analysis is frequently used as the initial step to define a function, while allele-specific gene expression (ASE) analysis has not yet gained widespread use in disease mapping studies. We compared the power to identify cis-acting regulatory SNPs (cis-rSNPs) by genome-wide ASE analysis with that of traditional eQTL mapping. Our study included 395 healthy blood donors for whom global gene expression profiles in circulating monocytes were determined by Illumina BeadArrays. ASE was assessed in a subset of these monocytes from 188 donors by quantitative genotyping of mRNA using a genome-wide panel of SNP markers. The performance of the two methods for detecting cis-rSNPs was evaluated by comparing associations between SNP genotypes and gene expression levels in sample sets of varying size. We found that up to 8-fold more samples are required for eQTL mapping to reach the same statistical power as that obtained by ASE analysis for the same rSNPs. The performance of ASE is insensitive to SNPs with low minor allele frequencies and detects a larger number of significantly associated rSNPs using the same sample size as eQTL mapping. An unequivocal conclusion from our comparison is that ASE analysis is more sensitive for detecting cis-rSNPs than standard eQTL mapping. Our study shows the potential of ASE mapping in tissue samples and primary cells which are difficult to obtain in large numbers.
The QT interval, an electrocardiographic measure reflecting myocardial repolarization, is a heritable trait. QT prolongation is a risk factor for ventricular arrhythmias and sudden cardiac death (SCD) and could indicate the presence of the potentially lethal Mendelian Long QT Syndrome (LQTS). Using a genome-wide association and replication study in up to 100,000 individuals, we identified 35 common variant QT interval loci that collectively explain ∼8–10% of QT variation and highlight the importance of calcium regulation in myocardial repolarization. Rare variant analysis of 6 novel QT loci in 298 unrelated LQTS probands identified coding variants not found in controls but of uncertain causality and therefore requiring validation. Several newly identified loci encode proteins that physically interact with other recognized repolarization proteins. Our integration of common variant association, expression and orthogonal protein-protein interaction screens provides new insights into cardiac electrophysiology and identifies novel candidate genes for ventricular arrhythmias, LQTS, and SCD.
genome-wide association study; QT interval; Long QT Syndrome; sudden cardiac death; myocardial repolarization; arrhythmias
Genome-wide association studies (GWAS) for fasting glucose (FG) and insulin (FI) have identified common variant signals which explain 4.8% and 1.2% of trait variance, respectively. It is hypothesized that low-frequency and rare variants could contribute substantially to unexplained genetic variance. To test this, we analyzed exome-array data from up to 33,231 non-diabetic individuals of European ancestry. We found exome-wide significant (P < 5×10⁻⁷) evidence for two loci not previously highlighted by common variant GWAS: GLP1R (p.Ala316Thr, minor allele frequency (MAF) = 1.5%) influencing FG levels, and URB2 (p.Glu594Val, MAF = 0.1%) influencing FI levels. Coding variant associations can highlight potential effector genes at (non-coding) GWAS signals. At the G6PC2/ABCB11 locus, we identified multiple coding variants in G6PC2 (p.Val219Leu, p.His177Tyr, and p.Tyr207Ser) influencing FG levels, conditionally independent of each other and the non-coding GWAS signal. In vitro assays demonstrate that these associated coding alleles result in reduced protein abundance via proteasomal degradation, establishing G6PC2 as an effector gene at this locus. Reconciliation of single-variant associations and functional effects was only possible when haplotype phase was considered. In contrast to earlier reports suggesting that, paradoxically, glucose-raising alleles at this locus are protective against type 2 diabetes (T2D), the p.Val219Leu G6PC2 variant displayed a modest but directionally consistent association with T2D risk. Coding variant associations for glycemic traits in GWAS signals highlight PCSK1, RREB1, and ZHX3 as likely effector transcripts. These coding variant association signals do not have a major impact on the trait variance explained, but they do provide valuable biological insights.
Understanding how FI and FG levels are regulated is important because their derangement is a feature of T2D. Despite recent success from GWAS in identifying regions of the genome influencing glycemic traits, collectively these loci explain only a small proportion of trait variance. Unlocking the biological mechanisms driving these associations has been challenging because the vast majority of variants map to non-coding sequence, and the genes through which they exert their impact are largely unknown. In the current study, we sought to increase our understanding of the physiological pathways influencing both traits using exome-array genotyping in up to 33,231 non-diabetic individuals to identify coding variants and consequently genes associated with either FG or FI levels. We identified novel association signals for both traits including the receptor for GLP-1 agonists which are a widely used therapy for T2D. Furthermore, we identified coding variants at several GWAS loci which point to the genes underlying these association signals. Importantly, we found that multiple coding variants in G6PC2 result in a loss of protein function and lower fasting glucose levels.
To detect genes with CpG sites that display methylation patterns that are characteristic of acute lymphoblastic leukemia (ALL) cells, we compared the methylation patterns of cells taken at diagnosis from 20 patients with pediatric ALL to the methylation patterns in mononuclear cells from bone marrow of the same patients during remission and in non-leukemic control cells from bone marrow or blood. Using a custom-designed assay, we measured the methylation levels of 1,320 CpG sites in regulatory regions of 413 genes that were analyzed because they display allele-specific gene expression (ASE) in ALL cells. The rationale for our selection of CpG sites was that ASE could be the result of allele-specific methylation in the promoter regions of the genes. We found that the ALL cells had methylation profiles that allowed distinction between ALL cells and control cells. Using stringent criteria for calling differential methylation, we identified 28 CpG sites in 24 genes with recurrent differences in their methylation levels between ALL cells and control cells. Twenty of the differentially methylated genes were hypermethylated in the ALL cells, and as many as nine of them (AMICA1, CPNE7, CR1, DBC1, EYA4, LGALS8, RYR3, UQCRFS1, WDR35) have functions in cell signaling and/or apoptosis. The methylation levels of a subset of the genes were consistent with an inverse relationship with the mRNA expression levels in a large number of ALL cells from published data sets, supporting a potential biological effect of the methylation signatures and their application for diagnostic purposes.
Rapid advances in the development of sequencing technologies in recent years have enabled an increasing number of applications in biology and medicine. Here, we review key technical aspects of the preparation of DNA templates for sequencing, the biochemical reaction principles and assay formats underlying next-generation sequencing systems, methods for imaging and base calling, quality control, and bioinformatic approaches for sequence alignment, variant calling and assembly. We also discuss some of the most important advances that the new sequencing technologies have brought to the fields of human population genetics, human genetic history and forensic genetics.
Systemic Lupus Erythematosus (SLE) is a systemic autoimmune disease in which the type I interferon pathway has a crucial role. We have previously shown that three genes in this pathway, IRF5, TYK2 and STAT4, are strongly associated with risk for SLE. Here, we investigated 78 genes involved in the type I interferon pathway to identify additional SLE susceptibility loci. First, we genotyped 896 single-nucleotide polymorphisms in these 78 genes and 14 other candidate genes in 482 Swedish SLE patients and 536 controls. Genes with P<0.01 in the initial screen were then followed up in 344 additional Swedish patients and 1299 controls. SNPs in the IKBKE, TANK, STAT1, IL8 and TRAF6 genes gave nominal signals of association with SLE in this extended Swedish cohort. To replicate these findings, we extracted data from a genome-wide association study on SLE performed in a US cohort. Combined analysis of the Swedish and US data, comprising a total of 2136 cases and 9694 controls, implicates IKBKE and IL8 as SLE susceptibility loci (Pmeta=0.00010 and Pmeta=0.00040, respectively). STAT1 was also associated with SLE in this cohort (Pmeta=3.3 × 10⁻⁵), but this association signal appears to be dependent on that previously reported for the neighbouring STAT4 gene. Our study suggests additional genes from the type I interferon system in SLE, and highlights genes in this pathway for further functional analysis.
systemic lupus erythematosus; type I interferon system; candidate gene study; single nucleotide polymorphism; IKBKE; IL8
Most complex disease-associated genetic variants are located in non-coding regions and are therefore thought to be regulatory in nature. Association mapping of differential allelic expression (AE) is a powerful method to identify SNPs with direct cis-regulatory impact (cis-rSNPs). We used AE mapping to identify cis-rSNPs regulating gene expression in 55 and 63 HapMap lymphoblastoid cell lines from a Caucasian and an African population, respectively, 70 fibroblast cell lines, and 188 purified monocyte samples and found 40–60% of these cis-rSNPs to be shared across cell types. We uncover a new class of cis-rSNPs, which disrupt footprint-derived de novo motifs that are predominantly bound by repressive factors and are implicated in disease susceptibility through overlaps with GWAS SNPs. Finally, we provide the proof-of-principle for a new approach for genome-wide functional validation of transcription factor–SNP interactions. By perturbing NFκB action in lymphoblasts, we identified 489 cis-regulated transcripts with altered AE after NFκB perturbation. Altogether, we perform a comprehensive analysis of cis-variation in four cell populations and provide new tools for the identification of functional variants associated with complex diseases.
allelic expression; cis-rSNPs; complex disease; NFκB; repressor
Drinking coffee has been linked to reduced calcium conservation, but it is less clear whether it leads to sustained bone mineral loss and if individual predisposition for caffeine metabolism might be important in this context. Therefore, the relation between consumption of coffee and bone mineral density (BMD) at the proximal femur in men and women was studied, taking into account, for the first time, genotypes for cytochrome P450 1A2 (CYP1A2) associated with metabolism of caffeine.
Dietary intakes of 359 men and 358 women (aged 72 years), participants of the Prospective Investigation of the Vasculature in Uppsala Seniors (PIVUS), were assessed by a 7-day food diary. Two years later, BMD for total proximal femur, femoral neck and trochanteric regions of the proximal femur were measured by Dual-energy X-ray absorptiometry (DXA). Genotypes of CYP1A2 were determined. Adjusted means of BMD for each category of coffee consumption were calculated.
Men consuming 4 cups of coffee or more per day had 4% lower BMD at the proximal femur (p = 0.04) compared with low or non-consumers of coffee. This difference was not observed in women. In high consumers of coffee, those with rapid metabolism of caffeine (C/C genotype) had lower BMD at the femoral neck (p = 0.01) and at the trochanter (p = 0.03) than slow metabolizers (T/T and C/T genotypes). Calcium intake did not modify the relation between coffee and BMD.
High consumption of coffee seems to contribute to a reduction in BMD of the proximal femur in elderly men, but not in women. BMD was lower in high consumers of coffee with rapid metabolism of caffeine, suggesting that rapid metabolizers of caffeine may constitute a risk group for bone loss induced by coffee.
We have performed Quantitative Trait Loci (QTL) analysis of an F2 intercross between two chicken lines divergently selected for juvenile body-weight. In a previous study, 13 identified loci with effects on body-weight explained only a small proportion of the large variation in the F2 population. Epistatic interaction analysis, however, indicated that a network of interacting loci with large effect contributed to the difference in body-weight of the parental lines. This previous analysis was based on a sparse microsatellite linkage map, and the limited coverage could have affected the main conclusions. Here we present a revised QTL analysis based on a high-density linkage map that provided a more complete coverage of the chicken genome. Furthermore, we utilized genotype data from ~13,000 SNPs to search the genome for potential selective sweeps that have occurred in the selected lines.
We constructed a linkage map comprising 434 genetic markers, covering 31 chromosomes but leaving seven microchromosomes uncovered. The analysis showed that seven regions harbor QTL that influence growth. The pair-wise interaction analysis identified 15 unique QTL pairs; notably, nine of those involved interactions with a locus on chromosome 7, forming a network of interacting loci. The analysis of ~13,000 SNPs showed that a substantial proportion of the genetic variation present in the founder population has been lost in either of the two selected lines, since ~60% of the SNPs polymorphic among lines showed fixation in one of the lines. With the current marker coverage and QTL map resolution we did not observe clear signs of selective sweeps within QTL intervals.
The results from the QTL analysis using the new improved linkage map are to a large extent in concordance with our previous analysis of this pedigree. The difference in body-weight between the parental chicken lines is caused by many QTL each with a small individual effect. Although the increased chromosomal marker coverage did not lead to the identification of additional QTL, we were able to refine the localization of QTL. The importance of epistatic interaction as a mechanism contributing significantly to the remarkable selection response was further strengthened because additional pairs of interacting loci were detected with the improved map.
Estrogen is an established endometrial carcinogen. One of the most important mediators of estrogenic action is the estrogen receptor alpha. We have investigated whether polymorphic variation in the estrogen receptor alpha gene (ESR1) is associated with endometrial cancer risk.
In 702 cases with invasive endometrial cancer and 1563 controls, we genotyped five markers in ESR1 and used logistic regression models to estimate odds ratios (OR) and 95 percent confidence intervals (CI).
We found an association between endometrial cancer risk and rs2234670, rs2234693 and rs9340799, markers in strong linkage disequilibrium (LD). The association with rs9340799 was the strongest, with OR 0.75 (CI 0.60–0.93) for heterozygous carriers and OR 0.53 (CI 0.37–0.77) for carriers homozygous for the rare allele, compared with those homozygous for the most common allele. Haplotype models did not fit the data better than single marker models.
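To illustrate how an odds ratio and its 95% confidence interval relate to genotype counts, here is a minimal sketch. It uses the simpler Woolf (log) method on a 2×2 table rather than the logistic regression models used in the study, and the counts in the example are hypothetical, since the abstract does not report genotype counts.

```python
from math import exp, log, sqrt


def odds_ratio_ci(case_exposed: int, case_unexposed: int,
                  control_exposed: int, control_unexposed: int,
                  z: float = 1.96) -> tuple[float, float, float]:
    """Odds ratio with a Woolf (log-based) confidence interval from a 2x2 table.

    z = 1.96 corresponds to an approximate 95% interval. All cells must be non-zero.
    """
    odds_ratio = (case_exposed * control_unexposed) / (case_unexposed * control_exposed)
    se_log_or = sqrt(1 / case_exposed + 1 / case_unexposed
                     + 1 / control_exposed + 1 / control_unexposed)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: genotype carriers vs. reference homozygotes among cases and controls.
or_, lo, hi = odds_ratio_ci(case_exposed=60, case_unexposed=340,
                            control_exposed=220, control_unexposed=780)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

An interval that excludes 1.0, as for the estimates reported for rs9340799, indicates an association at the corresponding significance level.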
We found that intronic variation in ESR1 was associated with endometrial cancer risk.
Systemic lupus erythematosus (SLE) is the prototype autoimmune disease where genes regulated by type I interferon (IFN) are over-expressed and contribute to the disease pathogenesis. Because signal transducer and activator of transcription 4 (STAT4) plays a key role in the type I IFN receptor signaling, we performed a candidate gene study of a comprehensive set of single nucleotide polymorphisms (SNPs) in STAT4 in Swedish patients with SLE. We found that 10 out of 53 analyzed SNPs in STAT4 were associated with SLE, with the strongest signal of association (P = 7.1 × 10⁻⁸) for two perfectly linked SNPs, rs10181656 and rs7582694. The risk alleles of these 10 SNPs form a common risk haplotype for SLE (P = 1.7 × 10⁻⁵). According to conditional logistic regression analysis, the SNP rs10181656 or rs7582694 accounts for all of the observed association signal. By quantitative analysis of the allelic expression of STAT4 we found that the risk allele of STAT4 was over-expressed in primary human cells of mesenchymal origin, but not in B-cells, and that the risk allele of STAT4 was over-expressed (P = 8.4 × 10⁻⁵) in cells carrying the risk haplotype for SLE compared with cells with a non-risk haplotype. The risk allele of the SNP rs7582694 in STAT4 correlated with production of anti-dsDNA (double-stranded DNA) antibodies and displayed a multiplicatively increased, 1.82-fold risk of SLE with two independent risk alleles of the IRF5 (interferon regulatory factor 5) gene.
Mutations in the mismatch repair genes hMLH1 and hMSH2 predispose to hereditary non-polyposis colorectal cancer (HNPCC). Genetic screening of more than 350 Danish patients with colorectal cancer (CRC) has led to the identification of several new genetic variants (e.g. missense, silent and non-coding) in hMLH1 and hMSH2. The aim of the present study was to investigate the frequency of these variants in hMLH1 and hMSH2 in Danish patients with sporadic colorectal cancer and in the healthy background population. The purpose was to reveal if any of the common variants lead to increased susceptibility to colorectal cancer.
Associations between genetic variants in hMLH1 and hMSH2 and sporadic colorectal cancer were evaluated using a case-cohort design. The genotyping was performed on DNA isolated from blood from the 380 cases with sporadic colorectal cancer and a sub-cohort of 770 individuals. The DNA samples were analyzed using Single Base Extension (SBE) Tag-arrays. A Bonferroni corrected Fisher exact test was used to test for association between the genotypes of each variant and colorectal cancer. Linkage disequilibrium (LD) was investigated using HaploView (v3.31).
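The statistical step described above can be sketched as follows. This is a simplified illustration, not the study's analysis: it collapses genotypes into a 2×2 carrier versus non-carrier table (the study tested genotypes, which may involve three classes), the counts are hypothetical, and the number of tests used for the Bonferroni correction is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(x_min, x_max + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical carrier counts among 380 cases and a sub-cohort of 770 individuals.
p = fisher_exact_two_sided(30, 350, 40, 730)
# Assumes one test per analyzed variant (35 in the text); the actual number may differ.
print(f"raw p = {p:.4f}, Bonferroni-adjusted p = {bonferroni(p, 35):.4f}")
```

With many variants tested, a raw p-value close to 0.05 no longer counts as significant after correction, which is consistent with the borderline associations reported in the results.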
Heterozygous and homozygous changes were detected in 13 of 35 analyzed variants. Two variants showed a borderline association with colorectal cancer, whereas the remaining variants demonstrated no association. Furthermore, the genomic regions covering hMLH1 and hMSH2 displayed high linkage disequilibrium in the Danish population. Twenty-two variants were detected neither in the cases with sporadic colorectal cancer nor in the sub-cohort. Some of these rare variants have been classified either as pathogenic mutations or as neutral variants in other populations and some are unclassified Danish variants.
None of the variants in hMLH1 and hMSH2 analyzed in the present study were highly associated with colorectal cancer in the Danish population. High linkage disequilibrium in the genomic regions covering hMLH1 and hMSH2 indicates that common genetic variants in the two genes in general are not involved in the development of sporadic colorectal cancer. Nevertheless, some of the rare unclassified variants in hMLH1 and hMSH2 might be involved in the development of colorectal cancer in the families where they were originally identified.
Human body height is a complex genetic trait with high heritability. We performed an association study of 17 candidate genes for height in the Uppsala Longitudinal Study of Adult Men (ULSAM), which consists of 1153 elderly men of age 70 born in the central region of Sweden. First we genotyped a panel of 137 single nucleotide polymorphisms (SNPs) evenly distributed across the candidate genes in the ULSAM cohort. We identified 4 SNPs in the estrogen receptor gene (ESR1) on chromosome 6q25.1 with suggestive signals of association (p<0.05) with standing body height. This result was followed up by genotyping the same 25 SNPs in the ESR1 gene as in ULSAM in a second population cohort, the Prospective Investigation of the Vasculature in Uppsala Seniors (PIVUS) cohort, which consists of 507 males and 509 females of age 70 from the same geographical region as ULSAM. One SNP, rs2179922, located in intron 4 of ESR1, showed an association signal (p = 0.0056) in the male samples from the PIVUS cohort. Homozygous carriers of the G-allele of the SNP rs2179922 were on average 0.90 cm taller than individuals with the two other genotypes at this SNP in the ULSAM cohort and 2.3 cm taller in the PIVUS cohort. No association was observed for the females in the PIVUS cohort.
Using the relative expression levels of two SNP alleles of a gene in the same sample is an effective approach for identifying cis-acting regulatory SNPs (rSNPs). In the current study, we established a process for systematic screening for cis-acting rSNPs using experimental detection of allelic imbalance (AI) as an initial approach. We selected 160 expressed candidate genes that are involved in cancer and anticancer drug resistance for analysis of AI in a panel of cell lines that represent different types of cancers and have been well characterized for their response patterns against anticancer drugs. Of these genes, 60 contained heterozygous SNPs in their coding regions (cSNPs), and 41 of the genes displayed imbalanced expression of the two cSNP alleles. Genes that displayed AI were subjected to bioinformatics-assisted identification of rSNPs that alter the strength of transcription factor binding. rSNPs in 15 genes were subjected to electrophoretic mobility shift assay, and in eight of these genes (APC, BCL2, CCND2, MLH1, PARP1, SLIT2, YES1, XRCC1) we identified differential protein binding from a nuclear extract between the SNP alleles. The screening process allowed us to zoom in from 160 candidate genes to eight genes that may contain functional rSNPs in their promoter regions.
High-throughput genotyping of single nucleotide polymorphisms (SNPs) generates large amounts of data. In many SNP genotyping assays, the genotype assignment is based on scatter plots of signals corresponding to the two SNP alleles. In a robust assay the three clusters that define the genotypes are well separated and the distances between the data points within a cluster are short. "Silhouettes" is a graphical aid for interpretation and validation of data clusters that provides a measure of how well a data point was classified when it was assigned to a cluster. Thus "Silhouettes" can potentially be used as a quality measure for SNP genotyping results and for objective comparison of the performance of SNP assays under different circumstances.
We created a program (ClusterA) for calculating "Silhouette scores", and applied it to assess the quality of SNP genotype clusters obtained by single nucleotide primer extension ("minisequencing") in the Tag-microarray format. A Silhouette score condenses the quality of the genotype assignment for each SNP assay into a single numeric value, which ranges from 1.0, when the genotype assignment is unequivocal, down to -1.0, when the genotype assignment has been arbitrary. In the present study we applied Silhouette scores to compare the performance of four DNA polymerases in our minisequencing system by analyzing 26 SNPs in both DNA polarities in 16 DNA samples. We found Silhouettes to provide a relevant measure for the quality of SNP assays under different reaction conditions, illustrated here by the four DNA polymerases. According to our results, the genotypes can be unequivocally assigned without manual inspection when the Silhouette score for a SNP assay is > 0.65. All four DNA polymerases performed satisfactorily in our Tag-array minisequencing system.
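The idea behind a Silhouette score can be sketched in a few lines. The code below follows the standard silhouette definition (mean distance to the point's own cluster versus the nearest other cluster) with Euclidean distances; it is not the ClusterA program, and taking the mean silhouette over all data points as the per-assay score is an assumption. The signal values are hypothetical.

```python
from math import dist
from statistics import mean


def silhouette_values(points, labels):
    """Silhouette value for each point, given its cluster (genotype) label."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    values = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1 or len(clusters) < 2:
            values.append(0.0)  # convention for singleton clusters or a single cluster
            continue
        # Mean distance to the other members of the point's own cluster.
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # Smallest mean distance to the members of any other cluster.
        b = min(mean([dist(point, q) for q in members])
                for other, members in clusters.items() if other != label)
        values.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return values


def assay_score(points, labels):
    """Single score per SNP assay: mean silhouette over all points (an assumption)."""
    return mean(silhouette_values(points, labels))


# Hypothetical scatter-plot coordinates for one SNP assay with three genotype clusters.
signals = [(0.05, 3.1), (0.08, 3.3), (0.50, 3.2), (0.47, 3.0), (0.93, 3.1), (0.96, 3.4)]
genotypes = ["AA", "AA", "AB", "AB", "BB", "BB"]
score = assay_score(signals, genotypes)
print(f"Silhouette score = {score:.2f}; above 0.65 threshold: {score > 0.65}")
```

Tight, well-separated genotype clusters push the score towards 1.0, while overlapping clusters or misassigned points pull it towards zero or below.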
"Silhouette scores" are convenient for evaluating the quality of SNP genotype assignment, and provide an objective, numeric measure for comparing the performance of SNP assays. The program we created for calculating Silhouette scores is freely available, and can be used for quality assessment of the results from all genotyping systems in which the genotypes are assigned by cluster analysis using scatter plots.
Each of the human genes or transcriptional units is likely to contain single nucleotide polymorphisms that may give rise to sequence variation between individuals and tissues on the level of RNA. Based on recent studies, differential expression of the two alleles of heterozygous coding single nucleotide polymorphisms (SNPs) may be frequent for human genes. Methods with high accuracy to be used in a high throughput setting are needed for systematic surveys of expressed sequence variation. In this study we evaluated two formats of multiplexed, microarray based minisequencing for quantitative detection of imbalanced expression of SNP alleles. We used a panel of ten SNPs located in five genes known to be expressed in two endothelial cell lines as our model system.
The accuracy and sensitivity of quantitative detection of allelic imbalance was assessed for each SNP by constructing regression lines using a dilution series of mixed samples from individuals of different genotypes. Accurate quantification of SNP alleles by both assay formats was evidenced by R² values > 0.95 for the majority of the regression lines. According to a two-sample t-test, we were able to distinguish 1–9% of a minority SNP allele from a homozygous genotype, with larger variation between SNPs than between assay formats. Six of the SNPs, heterozygous in either of the two cell lines, were genotyped in RNA extracted from the endothelial cells. The coefficient of variation between the fluorescent signals from five parallel reactions was similar for cDNA and genomic DNA. The fluorescence signal intensity ratios measured in the cDNA samples were compared to those in genomic DNA to determine the relative expression levels of the two alleles of each SNP. Four of the six SNPs tested displayed a higher than 1.4-fold difference in allelic ratios between cDNA and genomic DNA. The results were verified by allele-specific oligonucleotide hybridisation and minisequencing in a microtiter plate format.
We conclude that microarray based minisequencing is an accurate and accessible tool for multiplexed screening for imbalanced allelic expression in multiple samples and tissues in parallel.
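The two calculations used in this evaluation, a regression line over a dilution series and a comparison of allelic ratios between cDNA and genomic DNA, can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline of the study: the function names are hypothetical, the signal is assumed to be summarised as one allele's fraction of the total signal, and all numbers are invented for illustration.

```python
def linear_fit(x, y):
    """Ordinary least-squares line through (x, y); returns slope, intercept, R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def allelic_fold_difference(cdna_ratio, gdna_ratio):
    """Fold difference between allelic signal ratios in cDNA and genomic DNA.

    The result is always >= 1, regardless of which allele is over-represented.
    """
    fold = cdna_ratio / gdna_ratio
    return fold if fold >= 1 else 1 / fold


# Hypothetical dilution series: known fraction of allele 1 in the mixture
# versus the measured fraction of the total signal from allele 1.
known_fraction = [0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0]
measured_fraction = [0.03, 0.12, 0.27, 0.49, 0.72, 0.88, 0.97]

slope, intercept, r_squared = linear_fit(known_fraction, measured_fraction)
print(r_squared > 0.95)

# Hypothetical heterozygous SNP: allele1/allele2 signal ratio in cDNA and gDNA.
print(allelic_fold_difference(cdna_ratio=1.5, gdna_ratio=1.0) > 1.4)
```

Normalising the cDNA ratio by the genomic DNA ratio of the same heterozygous sample removes assay-specific differences in signal between the two alleles, which is why the comparison is made against genomic DNA rather than against an ideal 1:1 ratio.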
Dyslipidemia has been associated with hypertension. The present study explored if polymorphisms in genes encoding proteins in lipid metabolism could be used as predictors for the individual response to antihypertensive treatment.
Ten single nucleotide polymorphisms (SNPs) in genes related to lipid metabolism were analysed by a microarray based minisequencing system in DNA samples from ninety-seven hypertensive subjects randomised to treatment with either 150 mg of the angiotensin II type 1 receptor blocker irbesartan or 50 mg of the β1-adrenergic receptor blocker atenolol for twelve weeks.
The reduction in blood pressure was similar in both treatment groups. The SNP C711T in the apolipoprotein B gene was associated with the blood pressure response to irbesartan, but not to atenolol, with an average reduction of 19 mmHg in the individuals carrying the C-allele. The C16730T polymorphism in the low density lipoprotein receptor gene predicted the change in systolic blood pressure in the atenolol group with an average reduction of 14 mmHg in the individuals carrying the C-allele.
Polymorphisms in genes encoding proteins in lipid metabolism are associated with the response to antihypertensive treatment in a drug-specific pattern. These results highlight the potential use of pharmacogenetics as a guide for individualised antihypertensive treatment, and also the role of lipids in blood pressure control.
Antihypertensive treatment; pharmacogenetics; lipids; minisequencing; genotyping
Oestrogen receptor α, which mediates the effect of oestrogen in target tissues, is genetically polymorphic. Because breast cancer development is dependent on oestrogenic influence, we have investigated whether polymorphisms in the oestrogen receptor α gene (ESR1) are associated with breast cancer risk.
We genotyped breast cancer cases and age-matched population controls for one microsatellite marker and four single-nucleotide polymorphisms (SNPs) in ESR1. The numbers of genotyped cases and controls for each marker were as follows: TAn, 1514 cases and 1514 controls; c.454-397C → T, 1557 cases and 1512 controls; c.454-351A → G, 1556 cases and 1512 controls; c.729C → T, 1562 cases and 1513 controls; c.975C → G, 1562 cases and 1513 controls. Using logistic regression models, we calculated odds ratios (ORs) and 95% confidence intervals (CIs). Haplotype effects were estimated in an exploratory analysis, using expectation-maximisation algorithms for case-control study data.
There were no compelling associations between single polymorphic loci and breast cancer risk. In haplotype analyses, a common haplotype of the c.454-351A → G or c.454-397C → T and c.975C → G SNPs appeared to be associated with an increased risk for ductal breast cancer: one copy of the c.454-351A → G and c.975C → G haplotype entailed an OR of 1.19 (95% CI 1.06–1.33) and two copies an OR of 1.42 (95% CI 1.15–1.77), compared with no copies, under a model of multiplicative penetrance. The association with the c.454-397C → T and c.975C → G haplotypes was similar. Our data indicated that these haplotypes were more influential in women with a high body mass index. Adjustment for multiple comparisons rendered the associations statistically non-significant.
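Under a model of multiplicative penetrance, each additional copy of the haplotype multiplies the odds by the same factor, so the two-copy OR is expected to be close to the square of the one-copy OR. The short sketch below only checks that the reported point estimates are consistent with this relationship. It does not reproduce the logistic regression or the expectation-maximisation haplotype estimation, and the function name is hypothetical.

```python
def odds_ratio_for_copies(per_copy_or, copies):
    """OR relative to zero copies under a multiplicative (log-additive) model."""
    return per_copy_or ** copies


# Reported one-copy OR of 1.19 implies a two-copy OR of about 1.19 squared.
print(round(odds_ratio_for_copies(1.19, 2), 2))  # 1.42
```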
We found suggestions of an association between common haplotypes in ESR1 and the risk for ductal breast cancer that is stronger in women with a high body mass index.
breast cancer; oestrogen receptor α; gene; haplotype; polymorphism
Whole genome amplification (WGA) procedures such as primer extension preamplification (PEP) or multiple displacement amplification (MDA) have the potential to provide an unlimited source of DNA for large-scale genetic studies. We have performed a quantitative evaluation of PEP and MDA for genotyping single nucleotide polymorphisms (SNPs) using multiplex, four-color fluorescent minisequencing in a microarray format. Forty-five SNPs were genotyped and the WGA methods were evaluated with respect to genotyping success, signal-to-noise ratios, power of genotype discrimination, yield and imbalanced amplification of alleles in the MDA product. Both PEP and MDA products provided genotyping results with a high concordance to genomic DNA. For PEP products the power of genotype discrimination was lower than for MDA due to a 2-fold lower signal-to-noise ratio. MDA products were indistinguishable from genomic DNA in all aspects studied. To obtain faithful representation of the SNP alleles at least 0.3 ng DNA should be used per MDA reaction. We conclude that the use of WGA, and MDA in particular, is a highly promising procedure for producing DNA in sufficient amounts even for genome-wide SNP mapping studies.
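Genotype concordance between a WGA product and genomic DNA can be computed as the fraction of SNPs called in both samples that receive the same genotype. The sketch below is one possible way to do this, not the procedure used in the study. The function name, the use of `None` for a failed call, and the genotype data are all hypothetical.

```python
def genotype_concordance(reference_calls, test_calls):
    """Return (concordance, number of SNPs compared).

    Both arguments map SNP identifiers to genotype strings, with None for a
    failed call. Only SNPs called in both samples are compared.
    """
    shared = [
        snp
        for snp, genotype in reference_calls.items()
        if genotype is not None and test_calls.get(snp) is not None
    ]
    if not shared:
        raise ValueError("no SNPs were called in both samples")
    agreeing = sum(reference_calls[snp] == test_calls[snp] for snp in shared)
    return agreeing / len(shared), len(shared)


# Hypothetical calls for five SNPs in genomic DNA and in a WGA product.
genomic = {"snp1": "AA", "snp2": "AG", "snp3": "GG", "snp4": "AG", "snp5": None}
amplified = {"snp1": "AA", "snp2": "AG", "snp3": "GG", "snp4": "AA", "snp5": "AG"}

concordance, compared = genotype_concordance(genomic, amplified)
print(concordance, compared)  # 0.75 4
```

In this invented example the discordant SNP is a heterozygote called as a homozygote in the amplified sample, which is the kind of error that imbalanced amplification of alleles would produce.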
Adipocyte-derived leucine aminopeptidase (ALAP) is a recently identified member of the M1 family of zinc-metallopeptidases and is thought to play a role in blood pressure control through inactivation of angiotensin II and/or generation of bradykinin. The enzyme seems to be particularly abundant in the heart. Recently, the Arg528-encoding allele of the ALAP gene was shown to be associated with essential hypertension.
We evaluated the influence of this polymorphism on the change in left ventricular mass index in 90 patients with essential hypertension and echocardiographically diagnosed left ventricular hypertrophy, randomised in a double-blind study to receive treatment with either the angiotensin II type I receptor antagonist irbesartan or the beta1-adrenoceptor blocker atenolol for 48 weeks. Genotyping was performed using minisequencing.
After adjustment for potential covariates (blood pressure and left ventricular mass index at baseline, blood pressure change, age, sex, dose and added antihypertensive treatment), there was a marked difference between the Arg/Arg and Lys/Arg genotypes in patients treated with irbesartan; those with the Arg/Arg genotype responded on average with an almost two-fold greater regression of left ventricular mass index than patients with the Lys/Arg genotype (-30.1 g/m2 [3.6] vs -16.7 [4.5], p = 0.03).
The ALAP genotype seems to determine the degree of regression of left ventricular hypertrophy during antihypertensive treatment with the angiotensin II type I receptor antagonist irbesartan in patients with essential hypertension and left ventricular hypertrophy. This is the first report of a role for ALAP/aminopeptidases in left ventricular mass regulation, and suggests a new potential target for antihypertensive drugs.
Aminopeptidase; irbesartan; hypertension; polymorphism; left ventricular hypertrophy; angiotensin; pharmacogenomics; bradykinin.
We selected 125 candidate single nucleotide polymorphisms (SNPs) in genes belonging to the human type 1 interferon (IFN) gene family and the genes coding for proteins in the main type 1 IFN signalling pathway by screening databases and by in silico comparison of DNA sequences. Using quantitative analysis of pooled DNA samples by solid-phase mini-sequencing, we found that only 20% of the candidate SNPs were polymorphic in the Finnish and Swedish populations. To allow more effective validation of candidate SNPs, we developed a four-colour microarray-based mini-sequencing assay for multiplex, quantitative allele frequency determination in pooled DNA samples. We used cyclic mini-sequencing reactions with primers carrying 5′-tag sequences, followed by capture of the products on microarrays by hybridisation to complementary tag oligonucleotides. Standard curves prepared from mixtures of known amounts of SNP alleles demonstrated the applicability of the system to quantitative analysis, and showed that for about half of the tested SNPs the limit of detection for the minority allele was below 5%. The microarray-based genotyping system established here is universally applicable for genotyping and quantification of any SNP, and the validated system for SNPs in type 1 IFN-related genes should find many applications in genetic studies of this important immunoregulatory pathway.
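Estimating an allele frequency in a pooled DNA sample from a standard curve can be sketched as an interpolation between the standards that bracket the measured signal. This is a minimal sketch under stated assumptions, not the quantification method of the study: it assumes the signal is summarised as one allele's fraction of the total signal, uses piecewise linear interpolation, and all names and numbers are hypothetical.

```python
def allele_frequency_from_standard_curve(standards, signal_fraction):
    """Interpolate an allele frequency from a standard curve.

    standards: (known allele fraction, measured signal fraction) pairs from
    mixtures of known composition; measured values are assumed to increase
    with the known fraction.
    """
    ordered = sorted(standards, key=lambda pair: pair[1])
    for (freq_low, sig_low), (freq_high, sig_high) in zip(ordered, ordered[1:]):
        if sig_low <= signal_fraction <= sig_high:
            if sig_high == sig_low:
                return (freq_low + freq_high) / 2
            position = (signal_fraction - sig_low) / (sig_high - sig_low)
            return freq_low + position * (freq_high - freq_low)
    raise ValueError("signal lies outside the range covered by the standards")


# Hypothetical standard curve for one SNP.
standards = [(0.0, 0.02), (0.05, 0.08), (0.25, 0.30), (0.5, 0.52), (1.0, 0.98)]

# Hypothetical pooled sample with a measured signal fraction of 0.19.
print(round(allele_frequency_from_standard_curve(standards, 0.19), 3))  # 0.15
```

A signal below the lowest standard raises an error here rather than extrapolating, since values near the background level correspond to frequencies below the limit of detection.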