Rapid advances in the development of sequencing technologies in recent years have enabled an increasing number of applications in biology and medicine. Here, we review key technical aspects of the preparation of DNA templates for sequencing, the biochemical reaction principles and assay formats underlying next-generation sequencing systems, methods for imaging and base calling, quality control, and bioinformatic approaches for sequence alignment, variant calling and assembly. We also discuss some of the most important advances that the new sequencing technologies have brought to the fields of human population genetics, human genetic history and forensic genetics.
Drinking coffee has been linked to reduced calcium conservation, but it is less clear whether it leads to sustained bone mineral loss and if individual predisposition for caffeine metabolism might be important in this context. Therefore, the relation between consumption of coffee and bone mineral density (BMD) at the proximal femur in men and women was studied, taking into account, for the first time, genotypes for cytochrome P450 1A2 (CYP1A2) associated with metabolism of caffeine.
Dietary intakes of 359 men and 358 women (aged 72 years), participants of the Prospective Investigation of the Vasculature in Uppsala Seniors (PIVUS), were assessed by a 7-day food diary. Two years later, BMD for the total proximal femur and for the femoral neck and trochanteric regions of the proximal femur was measured by dual-energy X-ray absorptiometry (DXA). Genotypes of CYP1A2 were determined. Adjusted means of BMD for each category of coffee consumption were calculated.
Men consuming 4 cups of coffee or more per day had 4% lower BMD at the proximal femur (p = 0.04) compared with low or non-consumers of coffee. This difference was not observed in women. In high consumers of coffee, those with rapid metabolism of caffeine (C/C genotype) had lower BMD at the femoral neck (p = 0.01) and at the trochanter (p = 0.03) than slow metabolizers (T/T and C/T genotypes). Calcium intake did not modify the relation between coffee and BMD.
High consumption of coffee seems to contribute to a reduction in BMD of the proximal femur in elderly men, but not in women. BMD was lower in high consumers of coffee with rapid metabolism of caffeine, suggesting that rapid metabolizers of caffeine may constitute a risk group for bone loss induced by coffee.
We have performed Quantitative Trait Loci (QTL) analysis of an F2 intercross between two chicken lines divergently selected for juvenile body-weight. In a previous study, the 13 identified loci with effects on body-weight explained only a small proportion of the large variation in the F2 population. Epistatic interaction analysis, however, indicated that a network of interacting loci with large effect contributed to the difference in body-weight of the parental lines. This previous analysis was based on a sparse microsatellite linkage map, and the limited coverage could have affected the main conclusions. Here we present a revised QTL analysis based on a high-density linkage map that provided a more complete coverage of the chicken genome. Furthermore, we utilized genotype data from ~13,000 SNPs to search the genome for potential selective sweeps that have occurred in the selected lines.
We constructed a linkage map comprising 434 genetic markers, covering 31 chromosomes but leaving seven microchromosomes uncovered. The analysis showed that seven regions harbor QTL that influence growth. The pair-wise interaction analysis identified 15 unique QTL pairs; notably, nine of those involved interactions with a locus on chromosome 7, forming a network of interacting loci. The analysis of ~13,000 SNPs showed that a substantial proportion of the genetic variation present in the founder population has been lost in either of the two selected lines, since ~60% of the SNPs polymorphic among lines showed fixation in one of the lines. With the current marker coverage and QTL map resolution we did not observe clear signs of selective sweeps within QTL intervals.
The results from the QTL analysis using the new improved linkage map are to a large extent in concordance with our previous analysis of this pedigree. The difference in body-weight between the parental chicken lines is caused by many QTL each with a small individual effect. Although the increased chromosomal marker coverage did not lead to the identification of additional QTL, we were able to refine the localization of QTL. The importance of epistatic interaction as a mechanism contributing significantly to the remarkable selection response was further strengthened because additional pairs of interacting loci were detected with the improved map.
Estrogen is an established endometrial carcinogen. One of the most important mediators of estrogenic action is the estrogen receptor alpha. We have investigated whether polymorphic variation in the estrogen receptor alpha gene (ESR1) is associated with endometrial cancer risk.
In 702 cases with invasive endometrial cancer and 1563 controls, we genotyped five markers in ESR1 and used logistic regression models to estimate odds ratios (OR) and 95 percent confidence intervals (CI).
We found an association between rs2234670, rs2234693, as well as rs9340799, markers in strong linkage disequilibrium (LD), and endometrial cancer risk. The association with rs9340799 was the strongest, OR 0.75 (CI 0.60–0.93) for heterozygous and OR 0.53 (CI 0.37–0.77) for homozygous rare compared to those homozygous for the most common allele. Haplotype models did not fit better to the data than single marker models.
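As an illustration of how an odds ratio and its 95% confidence interval are obtained, the following is a minimal sketch for a single 2x2 table using the Wald interval on the log odds ratio. It is a simplified stand-in, not the logistic regression models used in the study; the function name `odds_ratio_ci` and the example counts are hypothetical and are not taken from the study data.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed cases,    b: unexposed cases
    c: exposed controls, d: unexposed controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only.
or_, lo, hi = odds_ratio_ci(10, 20, 20, 10)
print(f"OR {or_:.2f} (CI {lo:.2f}-{hi:.2f})")
```

An interval that excludes 1.0 corresponds to an association that is nominally significant at the 5% level.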
We found that intronic variation in ESR1 was associated with endometrial cancer risk.
High-throughput genotyping of single nucleotide polymorphisms (SNPs) generates large amounts of data. In many SNP genotyping assays, the genotype assignment is based on scatter plots of signals corresponding to the two SNP alleles. In a robust assay the three clusters that define the genotypes are well separated and the distances between the data points within a cluster are short. "Silhouettes" is a graphical aid for interpretation and validation of data clusters that provides a measure of how well a data point was classified when it was assigned to a cluster. Thus "Silhouettes" can potentially be used as a quality measure for SNP genotyping results and for objective comparison of the performance of SNP assays under different conditions.
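The Silhouette measure can be sketched in a few lines. The example below uses the standard definition s(i) = (b - a) / max(a, b), where a is the mean distance from a point to the other members of its own cluster and b is the smallest mean distance to the members of any other cluster; values lie between -1 and 1. This is a minimal sketch, not the ClusterA program: the function names, the use of Euclidean distance, and the choice to summarise an assay by the mean over all points are assumptions made for illustration.

```python
import math

def silhouette_values(points, labels):
    """Per-point silhouette s(i) = (b - a) / max(a, b), Euclidean distance."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    values = []
    for point, label in zip(points, labels):
        own = clusters[label]
        other_means = [
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        ]
        if len(own) == 1 or not other_means:
            values.append(0.0)  # common convention for undefined cases
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, hence the len(own) - 1).
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest neighbouring cluster.
        b = min(other_means)
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values

def assay_score(points, labels):
    """Assumed summary: mean silhouette over all data points of one assay."""
    values = silhouette_values(points, labels)
    return sum(values) / len(values)

# Hypothetical allele-signal pairs forming two tight, well-separated clusters.
signals = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(assay_score(signals, ["AA", "AA", "BB", "BB"]))  # close to 1
print(assay_score(signals, ["AA", "BB", "AA", "BB"]))  # negative
```

Well-separated, compact clusters give a score near 1, whereas an assignment that mixes the clusters gives a score near or below 0.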
We created a program (ClusterA) for calculating "Silhouette scores", and applied it to assess the quality of SNP genotype clusters obtained by single nucleotide primer extension ("minisequencing") in the Tag-microarray format. A Silhouette score condenses the quality of the genotype assignment for each SNP assay into a single numeric value, which ranges from 1.0, when the genotype assignment is unequivocal, down to -1.0, when the genotype assignment has been arbitrary. In the present study we applied Silhouette scores to compare the performance of four DNA polymerases in our minisequencing system by analyzing 26 SNPs in both DNA polarities in 16 DNA samples. We found Silhouettes to provide a relevant measure for the quality of SNP assays under different reaction conditions, illustrated here by the four DNA polymerases. According to our results, the genotypes can be unequivocally assigned without manual inspection when the Silhouette score for a SNP assay is > 0.65. All four DNA polymerases performed satisfactorily in our Tag-array minisequencing system.
"Silhouette scores" are convenient for evaluating the quality of SNP genotype assignment and provide an objective, numeric measure for comparing the performance of SNP assays. The program we created for calculating Silhouette scores is freely available and can be used for quality assessment of the results from all genotyping systems in which the genotypes are assigned by cluster analysis using scatter plots.
Each of the human genes or transcriptional units is likely to contain single nucleotide polymorphisms that may give rise to sequence variation between individuals and tissues on the level of RNA. Based on recent studies, differential expression of the two alleles of heterozygous coding single nucleotide polymorphisms (SNPs) may be frequent for human genes. Methods with high accuracy to be used in a high throughput setting are needed for systematic surveys of expressed sequence variation. In this study we evaluated two formats of multiplexed, microarray based minisequencing for quantitative detection of imbalanced expression of SNP alleles. We used a panel of ten SNPs located in five genes known to be expressed in two endothelial cell lines as our model system.
The accuracy and sensitivity of quantitative detection of allelic imbalance were assessed for each SNP by constructing regression lines using a dilution series of mixed samples from individuals of different genotype. Accurate quantification of SNP alleles by both assay formats was evidenced by R² values > 0.95 for the majority of the regression lines. According to a two-sample t-test, we were able to distinguish 1–9% of a minority SNP allele from a homozygous genotype, with larger variation between SNPs than between assay formats. Six of the SNPs, heterozygous in either of the two cell lines, were genotyped in RNA extracted from the endothelial cells. The coefficient of variation between the fluorescent signals from five parallel reactions was similar for cDNA and genomic DNA. The fluorescence signal intensity ratios measured in the cDNA samples were compared to those in genomic DNA to determine the relative expression levels of the two alleles of each SNP. Four of the six SNPs tested displayed a higher than 1.4-fold difference in allelic ratios between cDNA and genomic DNA. The results were verified by allele-specific oligonucleotide hybridisation and minisequencing in a microtiter plate format.
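The two calculations described above, the goodness of fit of a dilution-series regression line and the fold difference in allelic ratio between cDNA and genomic DNA, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, ordinary least squares is assumed for the regression, and the signal values are invented for the example rather than taken from the study.

```python
def r_squared(x, y):
    """Coefficient of determination for an ordinary least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def allelic_fold_difference(cdna_allele1, cdna_allele2, gdna_allele1, gdna_allele2):
    """Fold difference between the cDNA and genomic DNA allele signal ratios.

    The genomic DNA ratio of a heterozygote serves as the 1:1 reference.
    The result is reported as a value >= 1 regardless of direction.
    """
    cdna_ratio = cdna_allele1 / cdna_allele2
    gdna_ratio = gdna_allele1 / gdna_allele2
    fold = cdna_ratio / gdna_ratio
    return fold if fold >= 1 else 1 / fold

# Hypothetical dilution series: minority allele fraction vs. measured signal ratio.
print(r_squared([0.0, 0.1, 0.2, 0.3], [0.02, 0.11, 0.21, 0.29]))
# Hypothetical signals: allele 1 is expressed at three times the level of allele 2.
print(allelic_fold_difference(300.0, 100.0, 100.0, 100.0))
```

Under these assumptions, a SNP would be flagged as showing allelic imbalance when the fold difference exceeds a chosen cut-off such as the 1.4-fold level mentioned in the text.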
We conclude that microarray based minisequencing is an accurate and accessible tool for multiplexed screening for imbalanced allelic expression in multiple samples and tissues in parallel.
Dyslipidemia has been associated with hypertension. The present study explored if polymorphisms in genes encoding proteins in lipid metabolism could be used as predictors for the individual response to antihypertensive treatment.
Ten single nucleotide polymorphisms (SNP) in genes related to lipid metabolism were analysed by a microarray based minisequencing system in DNA samples from ninety-seven hypertensive subjects randomised to treatment with either 150 mg of the angiotensin II type 1 receptor blocker irbesartan or 50 mg of the β1-adrenergic receptor blocker atenolol for twelve weeks.
The reduction in blood pressure was similar in both treatment groups. The SNP C711T in the apolipoprotein B gene was associated with the blood pressure response to irbesartan with an average reduction of 19 mmHg in the individuals carrying the C-allele, but not to atenolol. The C16730T polymorphism in the low density lipoprotein receptor gene predicted the change in systolic blood pressure in the atenolol group with an average reduction of 14 mmHg in the individuals carrying the C-allele.
Polymorphisms in genes encoding proteins in the lipid metabolism are associated with the response to antihypertensive treatment in a drug specific pattern. These results highlight the potential use of pharmacogenetics as a guide for individualised antihypertensive treatment, and also the role of lipids in blood pressure control.
Antihypertensive treatment; pharmacogenetics; lipids; minisequencing; genotyping
Oestrogen receptor α, which mediates the effect of oestrogen in target tissues, is genetically polymorphic. Because breast cancer development is dependent on oestrogenic influence, we have investigated whether polymorphisms in the oestrogen receptor α gene (ESR1) are associated with breast cancer risk.
We genotyped breast cancer cases and age-matched population controls for one microsatellite marker and four single-nucleotide polymorphisms (SNPs) in ESR1. The numbers of genotyped cases and controls for each marker were as follows: TAn, 1514 cases and 1514 controls; c.454-397C → T, 1557 cases and 1512 controls; c.454-351A → G, 1556 cases and 1512 controls; c.729C → T, 1562 cases and 1513 controls; c.975C → G, 1562 cases and 1513 controls. Using logistic regression models, we calculated odds ratios (ORs) and 95% confidence intervals (CIs). Haplotype effects were estimated in an exploratory analysis, using expectation-maximisation algorithms for case-control study data.
There were no compelling associations between single polymorphic loci and breast cancer risk. In haplotype analyses, a common haplotype of the c.454-351A → G or c.454-397C → T and c.975C → G SNPs appeared to be associated with an increased risk for ductal breast cancer: one copy of the c.454-351A → G and c.975C → G haplotype entailed an OR of 1.19 (95% CI 1.06–1.33) and two copies with an OR of 1.42 (95% CI 1.15–1.77), compared with no copies, under a model of multiplicative penetrance. The association with the c.454-397C → T and c.975C → G haplotypes was similar. Our data indicated that these haplotypes were more influential in women with a high body mass index. Adjustment for multiple comparisons rendered the associations statistically non-significant.
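The reported odds ratios can be checked against the multiplicative penetrance model with a short calculation: under that model each additional copy of the haplotype multiplies the odds by the same factor, so the two-copy OR should equal the square of the one-copy OR. The variable names below are illustrative only.

```python
# Under multiplicative penetrance, OR(two copies) = OR(one copy) ** 2.
or_one_copy = 1.19                  # reported OR for one haplotype copy
or_two_copies = or_one_copy ** 2    # value implied by the model
print(round(or_two_copies, 2))      # 1.42, matching the reported two-copy OR
```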
We found suggestions of an association between common haplotypes in ESR1 and the risk for ductal breast cancer that is stronger in heavy women.
breast cancer; oestrogen receptor α; gene; haplotype; polymorphism
Whole genome amplification (WGA) procedures such as primer extension preamplification (PEP) or multiple displacement amplification (MDA) have the potential to provide an unlimited source of DNA for large-scale genetic studies. We have performed a quantitative evaluation of PEP and MDA for genotyping single nucleotide polymorphisms (SNPs) using multiplex, four-color fluorescent minisequencing in a microarray format. Forty-five SNPs were genotyped and the WGA methods were evaluated with respect to genotyping success, signal-to-noise ratios, power of genotype discrimination, yield and imbalanced amplification of alleles in the MDA product. Both PEP and MDA products provided genotyping results with a high concordance to genomic DNA. For PEP products the power of genotype discrimination was lower than for MDA due to a 2-fold lower signal-to-noise ratio. MDA products were indistinguishable from genomic DNA in all aspects studied. To obtain faithful representation of the SNP alleles at least 0.3 ng DNA should be used per MDA reaction. We conclude that the use of WGA, and MDA in particular, is a highly promising procedure for producing DNA in sufficient amounts even for genome wide SNP mapping studies.
We selected 125 candidate single nucleotide polymorphisms (SNPs) in genes belonging to the human type 1 interferon (IFN) gene family and the genes coding for proteins in the main type 1 IFN signalling pathway by screening databases and by in silico comparison of DNA sequences. Using quantitative analysis of pooled DNA samples by solid-phase mini-sequencing, we found that only 20% of the candidate SNPs were polymorphic in the Finnish and Swedish populations. To allow more effective validation of candidate SNPs, we developed a four-colour microarray-based mini-sequencing assay for multiplex, quantitative allele frequency determination in pooled DNA samples. We used cyclic mini-sequencing reactions with primers carrying 5′-tag sequences, followed by capture of the products on microarrays by hybridisation to complementary tag oligonucleotides. Standard curves prepared from mixtures of known amounts of SNP alleles demonstrate the applicability of the system to quantitative analysis, and showed that for about half of the tested SNPs the limit of detection for the minority allele was below 5%. The microarray-based genotyping system established here is universally applicable for genotyping and quantification of any SNP, and the validated system for SNPs in type 1 IFN-related genes should find many applications in genetic studies of this important immunoregulatory pathway.
In the microarray format of the minisequencing method multiple oligonucleotide primers immobilised on a glass surface are extended with fluorescent ddNTPs using a DNA polymerase. The method is a promising tool for large-scale single nucleotide polymorphism (SNP) detection. We have compared eight chemical methods for covalent immobilisation of the oligonucleotide primers on glass surfaces. We included both commercially available, activated slides and slides that were modified by ourselves. In the comparison the differently derivatised glass slides were evaluated with respect to background fluorescence, efficiency of attaching oligonucleotides and performance of the primer arrays in minisequencing reactions. We found that there are significant differences in background fluorescence levels among the different coatings, and that the attachment efficiency, which was measured indirectly using extension by terminal transferase, varied largely depending on which immobilisation strategy was used. We also found that the attachment chemistry affects the genotyping accuracy, when minisequencing on microarrays is used as the genotyping method. The best genotyping results were observed using mercaptosilane-coated slides attaching disulfide-modified oligonucleotides.
Genetic variants of the interferon (IFN) regulatory factor 5 (IRF5) gene are associated with systemic lupus erythematosus (SLE) susceptibility. The contribution of these variants to IRF-5 expression in primary blood cells of SLE patients has not been addressed, nor has the role of type I IFN. The aim of this study was to determine the association between increased IRF-5 expression and the IRF5 risk haplotype in SLE patients.
IRF-5 transcript and protein levels in 44 Swedish patients with SLE and 16 healthy controls were measured by quantitative real-time PCR, minigene assay, and flow cytometry. The rs2004640, rs10954213, rs10488631 and the CGGGG indel were genotyped in these patients. Genotypes of these polymorphisms defined a common risk and protective haplotype.
IRF-5 expression and alternative splicing were significantly upregulated in SLE patients versus healthy donors. Enhanced transcript and protein levels were associated with the risk haplotype of IRF5; rs10488631 gave the only significant independent association that correlated with increased transcription from non-coding exon 1C. Minigene experiments demonstrated an important role for rs2004640 and the CGGGG indel, along with type I IFNs in regulating IRF-5 expression.
This study provides the first formal proof that IRF-5 expression and alternative splicing are significantly upregulated in primary blood cells of SLE patients. The risk haplotype is associated with enhanced IRF-5 transcript and protein expression in SLE patients.
Systemic lupus erythematosus (SLE) is a complex trait characterised by the production of a range of auto-antibodies and a diverse set of clinical phenotypes. Currently, ∼8% of the genetic contribution to SLE in Europeans is known, following publication of several moderate-sized genome-wide (GW) association studies, which identified loci with a strong effect (OR>1.3). In order to identify additional genes contributing to SLE susceptibility, we conducted a replication study in a UK dataset (870 cases, 5,551 controls) of 23 variants that showed moderate-risk for lupus in previous studies. Association analysis in the UK dataset and subsequent meta-analysis with the published data identified five SLE susceptibility genes reaching genome-wide levels of significance (Pcomb<5×10−8): NCF2 (Pcomb = 2.87×10−11), IKZF1 (Pcomb = 2.33×10−9), IRF8 (Pcomb = 1.24×10−8), IFIH1 (Pcomb = 1.63×10−8), and TYK2 (Pcomb = 3.88×10−8). Each of the five new loci identified here can be mapped into interferon signalling pathways, which are known to play a key role in the pathogenesis of SLE. These results increase the number of established susceptibility genes for lupus to ∼30 and validate the importance of using large datasets to confirm associations of loci which moderately increase the risk for disease.
Genome-wide association studies have revolutionised our ability to identify common susceptibility alleles for systemic lupus erythematosus (SLE). In complex diseases such as SLE, where many different genes make a modest contribution to disease susceptibility, it is necessary to perform large-scale association studies to combine results from several datasets, to have sufficient power to identify highly significant novel loci (P<5×10−8). Using a large SLE collection of 870 UK SLE cases and 5,551 UK unaffected individuals, we firstly replicated ten moderate-risk alleles (P<0.05) from a US–Swedish study of 3,273 SLE cases and 12,188 healthy controls. Combining our results with the US-Swedish data identified five new loci, which crossed the level for genome-wide significance: NCF2 (neutrophil cytosolic factor 2), IKZF1 (Ikaros family zinc-finger 1), IRF8 (interferon regulatory factor 8), IFIH1 (interferon-induced helicase C domain-containing protein 1), and TYK2 (tyrosine kinase 2). Each of these five genes regulates a different aspect of the immune response and contributes to the production of type-I and type-II interferons. Although further studies will be required to identify the causal alleles within these loci, the confirmation of five new susceptibility genes for lupus makes a significant step forward in our understanding of the genetic contribution to SLE.
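Combining association results from several datasets can be illustrated with an inverse-variance weighted fixed-effect meta-analysis of log odds ratios. The text does not state which meta-analysis method was used, so this is a sketch of one common approach rather than the study's procedure; the function name `fixed_effect_meta` and the input values are hypothetical.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-8  # significance level quoted in the text

def fixed_effect_meta(estimates):
    """Inverse-variance weighted fixed-effect meta-analysis.

    estimates: list of (log_odds_ratio, standard_error) pairs, one per dataset.
    Returns the pooled log OR, its standard error and a two-sided p-value
    from the normal approximation.
    """
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    z = pooled / pooled_se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled, pooled_se, p_value

# Hypothetical per-dataset estimates (log OR, SE), for illustration only.
pooled, pooled_se, p = fixed_effect_meta([(0.20, 0.10), (0.20, 0.10)])
print(math.exp(pooled), p, p < GENOME_WIDE_THRESHOLD)
```

The example shows why pooling matters: two datasets with the same effect estimate give a smaller standard error together than either gives alone, which is how moderate-risk alleles can reach genome-wide significance only in combined analyses.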
Targeted sequencing is a cost-efficient way to obtain answers to biological questions in many projects, but the choice of the enrichment method to use can be difficult. In this study we compared two hybridization methods for target enrichment for massively parallel sequencing and single nucleotide polymorphism (SNP) discovery, namely Nimblegen sequence capture arrays and the SureSelect liquid-based hybrid capture system. We prepared sequencing libraries from three HapMap samples using both methods, sequenced the libraries on the Illumina Genome Analyzer, mapped the sequencing reads back to the genome, and called variants in the sequences. 74–75% of the sequence reads originated from the targeted region in the SureSelect libraries and 41–67% in the Nimblegen libraries. We could sequence up to 99.9% and 99.5% of the regions targeted by capture probes from the SureSelect libraries and from the Nimblegen libraries, respectively. The Nimblegen probes covered 0.6 Mb more of the original 3.1 Mb target region than the SureSelect probes. In each sample, we called more SNPs and detected more novel SNPs from the libraries that were prepared using the Nimblegen method. Thus the Nimblegen method gave better results when judged by the number of SNPs called, but this came at the cost of more over-sampling.
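The on-target percentage quoted above is the share of mapped reads that overlap a targeted region. A minimal sketch of that calculation is shown below, assuming reads and targets are given as half-open `(chromosome, start, end)` intervals; the function name and the coordinates are hypothetical, and a real pipeline would use an indexed interval structure rather than this quadratic scan.

```python
def on_target_fraction(reads, targets):
    """Fraction of reads overlapping at least one target interval.

    reads, targets: lists of (chrom, start, end) with half-open coordinates.
    """
    def overlaps(read, target):
        return (
            read[0] == target[0]
            and read[1] < target[2]
            and target[1] < read[2]
        )

    hits = sum(1 for read in reads if any(overlaps(read, t) for t in targets))
    return hits / len(reads)

# Hypothetical coordinates, for illustration only.
targets = [("chr1", 100, 200)]
reads = [
    ("chr1", 90, 110),   # overlaps the target start
    ("chr1", 200, 250),  # begins exactly where the target ends: off target
    ("chr2", 100, 150),  # different chromosome: off target
    ("chr1", 150, 160),  # fully inside the target
]
print(on_target_fraction(reads, targets))  # 0.5
```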
IgA nephropathy (IgAN) and nephritis in Systemic Lupus Erythematosus (SLE) are two common forms of glomerulonephritis in which genetic findings are of importance for disease development. We have recently reported an association of IgAN with variants of TGFB1. In several autoimmune diseases, particularly in SLE, the IRF5 and STAT4 genes and the TRAF1-C5 locus have been shown to be important candidate genes. The aim of this study was to compare genetic variants from the TGFB1, IRF5 and STAT4 genes and the TRAF1-C5 locus with susceptibility to IgAN and lupus nephritis in two Swedish cohorts.
Patients and Methods
We genotyped 13 single nucleotide polymorphisms (SNPs) in four genetic loci in 1252 DNA samples from patients with biopsy proven IgAN or with SLE (with and without nephritis) and healthy age- and sex-matched controls from the same population in Sweden.
Genotype and allelic frequencies for SNPs from selected genes did not differ significantly between lupus nephritis patients and SLE patients without nephritis. In addition, haplotype analysis for seven selected SNPs did not reveal a difference for the SLE patient groups with and without nephritis. Moreover, none of these SNPs showed a significant difference between IgAN patients and healthy controls. IRF5 and STAT4 variants remained significantly different between SLE cases and healthy controls. In addition, the data did not show an association of TRAF1-C5 polymorphism with susceptibility to SLE in this Swedish population.
Our data do not support an overlap in genetic susceptibility between patients with IgAN or SLE and reveal no specific importance of SLE associated SNPs for the presence of lupus nephritis.
Human body height is a complex genetic trait with high heritability. We performed an association study of 17 candidate genes for height in the Uppsala Longitudinal Study of Adult Men (ULSAM), which consists of 1153 elderly men of age 70 born in the central region of Sweden. First we genotyped a panel of 137 single nucleotide polymorphisms (SNPs) evenly distributed across the candidate genes in the ULSAM cohort. We identified 4 SNPs in the estrogen receptor gene (ESR1) on chromosome 6q25.1 with suggestive signals of association (p<0.05) with standing body height. This result was followed up by genotyping the same 25 SNPs in the ESR1 gene as in ULSAM in a second population cohort, the Prospective Investigation of the Vasculature in Uppsala Seniors (PIVUS) cohort, which consists of 507 males and 509 females of age 70 from the same geographical region as ULSAM. One SNP, rs2179922, located in intron 4 of ESR1, showed an association signal (p = 0.0056) in the male samples from the PIVUS cohort. Homozygous carriers of the G-allele of the SNP rs2179922 were on average 0.90 cm taller than individuals with the two other genotypes at this SNP in the ULSAM cohort and 2.3 cm taller in the PIVUS cohort. No association was observed for the females in the PIVUS cohort.
Using the relative expression levels of two SNP alleles of a gene in the same sample is an effective approach for identifying cis-acting regulatory SNPs (rSNPs). In the current study, we established a process for systematic screening for cis-acting rSNPs using experimental detection of allelic imbalance (AI) as an initial approach. We selected 160 expressed candidate genes that are involved in cancer and anticancer drug resistance for analysis of AI in a panel of cell lines that represent different types of cancers and have been well characterized for their response patterns against anticancer drugs. Of these genes, 60 contained heterozygous SNPs in their coding regions (cSNPs), and 41 of the genes displayed imbalanced expression of the two cSNP alleles. Genes that displayed AI were subjected to bioinformatics-assisted identification of rSNPs that alter the strength of transcription factor binding. rSNPs in 15 genes were subjected to electrophoretic mobility shift assay, and in eight of these genes (APC, BCL2, CCND2, MLH1, PARP1, SLIT2, YES1, XRCC1) we identified differential protein binding from a nuclear extract between the SNP alleles. The screening process allowed us to zoom in from 160 candidate genes to eight genes that may contain functional rSNPs in their promoter regions.
Speciation is the combination of evolutionary processes that leads to the reproductive isolation of different populations. We investigate the significance of sex-chromosome evolution on the development of post- and prezygotic isolation in two naturally hybridizing Ficedula flycatcher species. Applying a tag-array-based mini-sequencing assay to genotype single nucleotide polymorphisms (SNPs) and interspecific substitutions, we demonstrate rather extensive hybridization and backcrossing in sympatry. However, gene flow across the partial postzygotic barrier (introgression) is almost exclusively restricted to autosomal loci, suggesting strong selection against introgression of sex-linked genes. In addition to this partial postzygotic barrier, character displacement of male plumage characteristics has previously been shown to reinforce prezygotic isolation in these birds. We show that male plumage traits involved in reinforcing prezygotic isolation are sex linked. These results suggest a major role of sex-chromosome evolution in mediating post- and prezygotic barriers to gene flow and point to a causal link in the development of the two forms of reproductive isolation.
Human group A rotavirus (HRV) is the major cause of severe gastroenteritis in infants worldwide. HRV shares the feature of a high degree of genetic diversity with many other RNA viruses, and therefore, genotyping of this organism is more complicated than genotyping of more stable DNA viruses. We describe a novel microarray-based method that allows high-throughput genotyping of RNA viruses with a high degree of polymorphism by multiplex capture and type-specific extension on microarrays. Denatured reverse transcription (RT)-PCR products derived from two outer capsid genes of clinical isolates of HRV were hybridized to immobilized capture oligonucleotides representing the most commonly occurring P and G genotypes on a microarray. Specific primer extension of the type-specific capture oligonucleotides was applied to incorporate the fluorescent nucleotide analogue cyanine 5-labeled dUTP as a detectable label. Laser scanning and fluorescence detection of the microarrays was followed by visual or computer-assisted interpretation of the fluorescence patterns generated on the microarrays. Initially, the method detected HRV in all 40 samples and correctly determined both the G and the P genotypes of 35 of the 40 strains analyzed. After modification by inclusion of additional capture oligonucleotides specific for the initially unassigned genotypes, all genotypes could be correctly defined. The results of genotyping with the microarray fully agreed with the results obtained by nucleotide sequence analysis and sequence-specific multiplex RT-PCR. Owing to its robustness, simplicity, and general utility, the microarray-based method may gain wide applicability for the genotyping of microorganisms, including highly variable RNA and DNA viruses.
In a GWAS meta-analysis, variants in the growth factor receptor-bound protein 10 (GRB10) gene were associated with reduced glucose-stimulated insulin secretion and increased risk of type 2 diabetes (T2D) if inherited from the father, but inexplicably with reduced fasting glucose when inherited from the mother. GRB10 is a negative regulator of insulin signaling and is imprinted in a parent-of-origin fashion in different tissues. GRB10 knock-down in human pancreatic islets showed reduced insulin and glucagon secretion, which together with changes in insulin sensitivity may explain the paradoxical reduction of glucose despite a decrease in insulin secretion. Together, these findings suggest that tissue-specific methylation and possibly imprinting of GRB10 can influence glucose metabolism and contribute to T2D pathogenesis. The data also emphasize the need in genetic studies to consider whether risk alleles are inherited from the mother or the father.
In this paper, we report the first large genome-wide association study in man for glucose-stimulated insulin secretion (GSIS) indices during an oral glucose tolerance test. We identify seven genetic loci and provide effects on GSIS for all previously reported glycemic traits and obesity genetic loci in a large-scale sample. We observe paradoxical effects of genetic variants in the growth factor receptor-bound protein 10 (GRB10) gene yielding both reduced GSIS and reduced fasting plasma glucose concentrations, specifically showing a parent-of-origin effect of GRB10 on lower fasting plasma glucose and enhanced insulin sensitivity for maternal and elevated glucose and decreased insulin sensitivity for paternal transmissions of the risk allele. We also observe tissue-specific differences in DNA methylation and allelic imbalance in expression of GRB10 in human pancreatic islets. We further disrupt GRB10 by shRNA in human islets, showing reduction of both insulin and glucagon expression and secretion. In conclusion, we provide evidence for complex regulation of GRB10 in human islets. Our data suggest that tissue-specific methylation and imprinting of GRB10 can influence glucose metabolism and contribute to T2D pathogenesis. The data also emphasize the need in genetic studies to consider whether risk alleles are inherited from the mother or the father.
Lupus nephritis is a cause of significant morbidity in systemic lupus erythematosus (SLE) and its genetic background has not been completely clarified. The aim of this investigation was to analyze single nucleotide polymorphisms (SNPs) for association with lupus nephritis, its severe form proliferative nephritis and renal outcome, in two Swedish cohorts. Cohort I (n = 567 SLE cases, n = 512 controls) was previously genotyped for 5676 SNPs and cohort II (n = 145 SLE cases, n = 619 controls) was genotyped for SNPs in STAT4, IRF5, TNIP1 and BLK.
Case-control and case-only association analyses for patients with lupus nephritis, proliferative nephritis and severe renal insufficiency were performed. In the case-control analysis of cohort I, four highly linked SNPs in STAT4 were associated with lupus nephritis at genome-wide significance (p = 3.7×10⁻⁹, OR 2.20 for the best SNP, rs11889341). Strong signals of association for IRF5 and an HLA-DR3 SNP marker were also detected in the lupus nephritis case versus healthy control analysis (p < 0.0001). An additional six genes showed an association with lupus nephritis with p < 0.001 (PMS2, TNIP1, CARD11, ITGAM, BLK and IRAK1). In the case-only meta-analysis of the two cohorts, the STAT4 SNP rs7582694 was associated with severe renal insufficiency (p = 1.6×10⁻³, OR 2.22). We conclude that genetic variations in STAT4 predispose to lupus nephritis and a worse outcome with severe renal insufficiency.
Target enrichment and resequencing is a widely used approach for identification of cancer genes and genetic variants associated with diseases. Although cost-effective compared to whole genome sequencing, analysis of many samples constitutes a significant cost, which could be reduced by pooling samples before capture. Another limitation to the number of cancer samples that can be analyzed is often the amount of available tumor DNA. We evaluated the performance of whole genome amplified DNA and the power to detect subclonal somatic single nucleotide variants in non-indexed pools of cancer samples using the HaloPlex technology for target enrichment and next-generation sequencing.
We captured a set of 1528 putative somatic single nucleotide variants and germline SNPs, which were identified by whole genome sequencing, with the HaloPlex technology and sequenced to a depth of 792–1752. We found that the allele fractions of the analyzed variants are well preserved during whole genome amplification and that capture specificity or variant calling is not affected. We detected a large majority of the known single nucleotide variants present uniquely in one sample with allele fractions as low as 0.1 in non-indexed pools of up to ten samples. We also identified and experimentally validated six novel variants in the samples included in the pools.
Our work demonstrates that whole genome amplified DNA can be used for target enrichment as well as genomic DNA and that accurate variant detection is possible in non-indexed pools of cancer samples. These findings show that analysis of a large number of samples is feasible at low cost, even when only small amounts of DNA are available, which significantly increases the chances of identifying recurrent mutations in cancer samples.
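The pooling result can be made concrete with a short calculation. The sketch below is illustrative only and is not the authors' analysis: it assumes that every sample contributes equally to a pool, that variant reads follow a simple binomial distribution, and that a call requires at least `min_alt_reads` supporting reads (a hypothetical threshold). Under those assumptions, a variant at allele fraction 0.1 in one of ten pooled samples is expected at a pooled fraction of about 0.01, and the code estimates how often it would be seen at the lower and upper ends of the reported depth range.

```python
from math import comb


def pooled_allele_fraction(sample_fraction: float, n_samples: int) -> float:
    """Expected allele fraction in the pool when the variant occurs in one sample only.

    Assumes each sample contributes an equal share of DNA to the pool.
    """
    return sample_fraction / n_samples


def detection_probability(depth: int, fraction: float, min_alt_reads: int) -> float:
    """Probability of observing at least min_alt_reads variant reads.

    Uses a Binomial(depth, fraction) model and ignores sequencing error.
    """
    p_below = sum(
        comb(depth, k) * fraction**k * (1 - fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below


if __name__ == "__main__":
    # Allele fraction 0.1 in one sample, pool of ten samples (values from the text).
    pooled = pooled_allele_fraction(0.1, 10)
    # 792 and 1752 are the ends of the reported depth range; 5 reads is a hypothetical cutoff.
    for depth in (792, 1752):
        expected_reads = depth * pooled
        prob = detection_probability(depth, pooled, min_alt_reads=5)
        print(f"depth={depth}: expected variant reads={expected_reads:.1f}, "
              f"P(detect)={prob:.3f}")
```

Real pools deviate from equal contribution, and sequencing errors add background variant reads, so these figures should be read as rough upper bounds on sensitivity rather than as a reproduction of the reported results.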
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-14-856) contains supplementary material, which is available to authorized users.
Target enrichment; HaloPlex; Non-indexed pooling; Whole genome amplification; Single nucleotide variant; Deep sequencing
Although aberrant DNA methylation has been observed previously in acute lymphoblastic leukemia (ALL), the patterns of differential methylation have not been comprehensively determined in all subtypes of ALL on a genome-wide scale. The relationship between DNA methylation, cytogenetic background, drug resistance and relapse in ALL is poorly understood.
We surveyed the DNA methylation levels of 435,941 CpG sites in samples from 764 children at diagnosis of ALL and from 27 children at relapse. This survey uncovered four characteristic methylation signatures. First, compared with control blood cells, the methylomes of ALL cells shared 9,406 predominantly hypermethylated CpG sites, independent of cytogenetic background. Second, each cytogenetic subtype of ALL displayed a unique set of hyper- and hypomethylated CpG sites. The CpG sites that constituted these two signatures differed in their functional genomic enrichment to regions with marks of active or repressed chromatin. Third, we identified subtype-specific differential methylation in promoter and enhancer regions that were strongly correlated with gene expression. Fourth, a set of 6,612 CpG sites was predominantly hypermethylated in ALL cells at relapse, compared with matched samples at diagnosis. Analysis of relapse-free survival identified CpG sites with subtype-specific differential methylation that divided the patients into different risk groups, depending on their methylation status.
Our results suggest an important biological role for DNA methylation in the differences between ALL subtypes and in their clinical outcome after treatment.
Recent genome-wide association studies (GWASs) conducted in Asian populations have identified novel risk loci for systemic lupus erythematosus (SLE). Here, we genotyped 10 single-nucleotide polymorphisms (SNPs) in eight such loci and investigated their disease associations in three independent Caucasian SLE case–control cohorts recruited from Sweden, Finland and the United States. The disease associations of the SNPs in ETS1, IKZF1, LRRC18-WDFY4, RASGRP3, SLC15A4, TNIP1 and 16p11.2 were replicated, whereas no solid evidence of association was observed for the 7q11.23 locus in the Caucasian cohorts. SLC15A4 was significantly associated with renal involvement in SLE. The association of TNIP1 was more pronounced in SLE patients with renal and immunological disorders, which is corroborated by two previous studies in Asian cohorts. The effects of all the associated SNPs, either conferring risk for or being protective against SLE, were in the same direction in Caucasians and Asians. The magnitudes of the allelic effects for most of the SNPs were also comparable across different ethnic groups. In contrast, remarkable differences in allele frequencies between Caucasian and Asian populations were observed for all associated SNPs. In conclusion, most of the novel SLE risk loci identified by GWASs in Asian populations were also associated with SLE in Caucasian populations. We observed both similarities and differences with respect to the effect sizes and risk allele frequencies across ethnicities.
systemic lupus erythematosus; genetic-association study; Asian; Caucasian
Circulating lipid levels, as well as several familial lipid metabolism disorders, are strongly associated with initiation and progression of atherosclerosis and incidence of myocardial infarction (MI).
We hypothesized that genetic variants associated with circulating lipid levels would also be associated with MI incidence, and have tested this in three independent samples.
Setting and Subjects
Using age- and sex-adjusted additive genetic models, we analyzed 554 single nucleotide polymorphisms (SNPs) in 41 candidate gene regions proposed to be involved in lipid-related pathways potentially predisposing to incidence of MI in 2,602 participants of the Swedish Twin Register (STR; 57% women). All associations with nominal P<0.01 were further investigated in the Uppsala Longitudinal Study of Adult Men (ULSAM; N = 1,142).
In the present study, we report associations of lipid-related SNPs with incident MI in two community-based longitudinal studies with in silico replication in a meta-analysis of genome-wide association studies. Overall, there were 9 SNPs in STR with nominal P-value <0.01 that were successfully genotyped in ULSAM. rs4149313 located in ABCA1 was associated with MI incidence in both longitudinal study samples with nominal significance (hazard ratio, 1.36 and 1.40; P-value, 0.004 and 0.015 in STR and ULSAM, respectively). In silico replication supported the association of rs4149313 with coronary artery disease in an independent meta-analysis including 173,975 individuals of European descent from the CARDIoGRAMplusC4D consortium (odds ratio, 1.03; P-value, 0.048).
rs4149313 is one of the few amino acid changing variants in ABCA1 known to associate with reduced cholesterol efflux. Our results are suggestive of a weak association between this variant and the development of atherosclerosis and MI.
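To illustrate what an additive genetic model implies for the hazard ratios quoted above, the sketch below codes a genotype as the number of copies of an effect allele (0, 1 or 2) and scales a per-allele hazard ratio accordingly. It assumes the reported hazard ratios are per-allele estimates from a proportional hazards model, which the abstract does not state explicitly. The genotype strings, the effect allele label and the function names are hypothetical.

```python
def allele_dosage(genotype: str, effect_allele: str) -> int:
    """Count copies of the effect allele in a two-letter genotype such as 'AG'."""
    if len(genotype) != 2:
        raise ValueError("expected a diploid genotype with two alleles")
    return genotype.count(effect_allele)


def implied_hazard_ratio(per_allele_hr: float, dosage: int) -> float:
    """Hazard ratio relative to zero copies under an additive model.

    The log hazard is linear in dosage, so the hazard ratio scales as
    per_allele_hr ** dosage.
    """
    return per_allele_hr ** dosage


if __name__ == "__main__":
    # Per-allele hazard ratios as quoted in the text for STR and ULSAM.
    reported = {"STR": 1.36, "ULSAM": 1.40}
    # Hypothetical genotypes, with 'G' arbitrarily labeled as the effect allele.
    for genotype in ("AA", "AG", "GG"):
        dosage = allele_dosage(genotype, "G")
        for cohort, hr in reported.items():
            print(f"{cohort} {genotype}: dosage={dosage}, "
                  f"implied HR={implied_hazard_ratio(hr, dosage):.2f}")
```

Under this coding, two copies of the effect allele imply the square of the per-allele hazard ratio. That is a property of the model and was not separately measured in the study.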