Related Articles

1.  Allelic based gene-gene interactions in rheumatoid arthritis 
BMC Proceedings  2009;3(Suppl 7):S76.
The detection of gene-gene interaction is an important approach to understand the etiology of rheumatoid arthritis (RA). The goal of this study is to identify gene-gene interaction of SNPs at the allelic level contributing to RA using real data sets (Problem 1) of North American Rheumatoid Arthritis Consortium (NARAC) provided by Genetic Analysis Workshop 16 (GAW16). We applied our novel method that can detect the interaction by a definition of nonrandom association of alleles that occurs when the contribution to RA of a particular allele inherited in one gene depends on a particular allele inherited at other unlinked genes. Starting with 639 single-nucleotide polymorphisms (SNPs) from 26 candidate genes, we identified ten two-way interacting genes and one case of three-way interacting genes. SNP rs2476601 on PTPN22 interacts with rs2306772 on SLC22A4, which interacts with rs881372 on TRAF1 and rs2900180 on C5, respectively. SNP rs2900180 on C5 interacts with rs2242720 on RUNX1, which interacts with rs881375 on TRAF1. Furthermore, rs2476601 on PTPN22 also interacts with three SNPs (rs2905325, rs1476482, and rs2106549) in linkage disequilibrium (LD) on IL6. The other three SNPs (rs2961280, rs2961283, and rs2905308) in LD on IL6 interact with two SNPs (rs477515 and rs2516049) on HLA-DRB1. SNPs rs660895 and rs532098 on HLA-DRB1 interact with rs2834779 and four SNPs in LD on RUNX1. Three-way interacting genes of rs10229203 on IL6, rs4816502 on RUNX1, and rs10818500 on C5 were also detected.
doi:10.1186/1753-6561-3-S7-S76
PMCID: PMC2795978  PMID: 20018071
2.  A Candidate Gene Approach Identifies the TRAF1/C5 Region as a Risk Factor for Rheumatoid Arthritis 
PLoS Medicine  2007;4(9):e278.
Background
Rheumatoid arthritis (RA) is a chronic autoimmune disorder affecting ∼1% of the population. The disease results from the interplay between an individual's genetic background and unknown environmental triggers. Although human leukocyte antigens (HLAs) account for ∼30% of the heritable risk, the identities of non-HLA genes explaining the remainder of the genetic component are largely unknown. Based on functional data in mice, we hypothesized that the immune-related genes complement component 5 (C5) and/or TNF receptor-associated factor 1 (TRAF1), located on Chromosome 9q33–34, would represent relevant candidate genes for RA. We therefore aimed to investigate whether this locus would play a role in RA.
Methods and Findings
We performed a multitiered case-control study using 40 single-nucleotide polymorphisms (SNPs) from the TRAF1 and C5 (TRAF1/C5) region in a set of 290 RA patients and 254 unaffected participants (controls) of Dutch origin. Stepwise replication of significant SNPs was performed in three independent sample sets from the Netherlands (cases/controls = 454/270), Sweden (cases/controls = 1,500/1,000) and US (cases/controls = 475/475). We observed a significant association (p < 0.05) of SNPs located in a haplotype block that encompasses a 65 kb region including the 3′ end of C5 as well as TRAF1. A sliding window analysis revealed an association peak at an intergenic region located ∼10 kb from both C5 and TRAF1. This peak, defined by SNP14/rs10818488, was confirmed in a total of 2,719 RA patients and 1,999 controls (common odds ratio = 1.28, 95% confidence interval 1.17–1.39, combined p = 1.40 × 10^-8) with a population-attributable risk of 6.1%. The A (minor susceptibility) allele of this SNP also significantly correlates with increased disease progression as determined by radiographic damage over time in RA patients (p = 0.008).
Conclusions
Using a candidate-gene approach we have identified a novel genetic risk factor for RA. Our findings indicate that a polymorphism in the TRAF1/C5 region increases the susceptibility to and severity of RA, possibly by influencing the structure, function, and/or expression levels of TRAF1 and/or C5.
Using a candidate-gene approach, Rene Toes and colleagues identified a novel genetic risk factor for rheumatoid arthritis in the TRAF1/C5 region.
Editors' Summary
Background.
Rheumatoid arthritis is a very common chronic illness that affects around 1% of people in developed countries. It is caused by an abnormal immune reaction to various tissues within the body; as well as affecting joints and causing an inflammatory arthritis, it can also affect many other organs of the body. Severe rheumatoid arthritis can be life-threatening, but even mild forms of the disease cause substantial illness and disability. Current treatments aim to give symptomatic relief with the use of simple analgesics, or anti-inflammatory drugs. In addition, most patients are also treated with what are known as disease-modifying agents, which aim to prevent joint damage. Rheumatoid arthritis is known to have a genetic component. For example, an association has been shown with the part of the genome that contains the human leukocyte antigens (HLAs), which are involved in the immune response. Information on other genes involved would be helpful both for understanding the underlying cause of the disease and possibly for the discovery of new treatments.
Why Was This Study Done?
Previous work in mice that have a disease similar to human rheumatoid arthritis has identified a number of possible candidate genes. One of these genes, complement component 5 (C5) is involved in the complement system—a primitive system within the body that is involved in the defense against foreign molecules. In humans the gene for C5 is located on Chromosome 9 close to another gene involved in the inflammatory response, TNF receptor-associated factor 1 (TRAF1). A preliminary study in humans of this region had shown some evidence, albeit weak, to suggest that this region might be associated with rheumatoid arthritis. The authors set out to look in more detail, and in a larger group of individuals, to see if they could prove this association.
What Did the Researchers Do and Find?
The researchers took 40 genetic markers, known as single-nucleotide polymorphisms (SNPs), from across the region that included the C5 and TRAF1 genes. SNPs have each been assigned a unique reference number that specifies a point in the human genome, and each is present in alternate forms so can be differentiated. They compared which of the alternate forms were present in 290 patients with rheumatoid arthritis and 254 unaffected participants of Dutch origin. They then repeated the study in three other groups of patients and controls of Dutch, Swedish, and US origin. They found a consistent association with rheumatoid arthritis of one region of 65 kilobases (a small distance in genetic terms) that included one end of the C5 gene as well as the TRAF1 gene. They could refine the area of interest to a piece marked by one particular SNP that lay between the genes. They went on to show that the genetic region in which these genes are located may be involved in the binding of a protein that modifies the transcription of genes, thus providing a possible explanation for the association. Furthermore, they showed that one of the alternate versions of the marker in this region was associated with more aggressive disease.
What Do These Findings Mean?
The finding of a genetic association is the first step in identifying a genetic component of a disease. The strength of this study is that a novel genetic susceptibility factor for RA has been identified and that the overall result is consistent in four different populations as well as being associated with disease severity. Further work will need to be done to confirm the association in other populations and then to identify the precise genetic change involved. Hopefully this work will lead to new avenues of investigation for therapy.
Additional Information.
Please access these Web sites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.0040278.
• Medline Plus, the health information site for patients from the US National Library of Medicine, has a page of resources on rheumatoid arthritis
• The UK's National Health Service online information site has information on rheumatoid arthritis
• The Arthritis Research Campaign, a UK charity that funds research on all types of arthritis, has a booklet with information for patients on rheumatoid arthritis
• Reumafonds, a Dutch arthritis foundation, gives information on rheumatoid arthritis (in Dutch)
• Autocure is an initiative whose objective is to transform knowledge obtained from molecular research into a cure for an increasing number of patients suffering from inflammatory rheumatic diseases
• The European league against Rheumatism, an organisation which represents the patient, health professionals, and scientific societies of rheumatology of all European nations
doi:10.1371/journal.pmed.0040278
PMCID: PMC1976626  PMID: 17880261
3.  Combining least absolute shrinkage and selection operator (LASSO) and principal-components analysis for detection of gene-gene interactions in genome-wide association studies 
BMC Proceedings  2009;3(Suppl 7):S62.
Variable selection in genome-wide association studies can be a daunting task and statistically challenging because there are more variables than subjects. We propose an approach that uses principal-component analysis (PCA) and least absolute shrinkage and selection operator (LASSO) to identify gene-gene interaction in genome-wide association studies. A PCA was used to first reduce the dimension of the single-nucleotide polymorphisms (SNPs) within each gene. The interactions of the gene PCA scores were placed into the LASSO to determine whether any gene-gene signals exist. We have extended the PCA-LASSO approach using the bootstrap to estimate the standard errors and confidence intervals of the LASSO coefficient estimates. This method was compared to placing the raw SNP values into the LASSO and the logistic model with individual gene-gene interaction. We demonstrated these methods with the Genetic Analysis Workshop 16 rheumatoid arthritis genome-wide association study data and our results identified a few gene-gene signals. Based on our results, the PCA-LASSO method shows promise in identifying gene-gene interactions, and at this time we suggest using it with other conventional approaches, such as generalized linear models, to narrow down genetic signals.
PMCID: PMC2795963  PMID: 20018056
4.  Detecting disease-causing genes by LASSO-Patternsearch algorithm 
BMC Proceedings  2007;1(Suppl 1):S60.
The Genetic Analysis Workshop 15 Problem 3 simulated rheumatoid arthritis data set provided 100 replicates of simulated single-nucleotide polymorphism (SNP) and covariate data sets for 1500 families with an affected sib pair and 2000 controls, modeled after real rheumatoid arthritis data. The data generation model included nine unobserved trait loci, most of which have one or more of the generated SNPs associated with them. These data sets provide an ideal experimental test bed for evaluating new and old algorithms for selecting SNPs and covariates that can separate cases from controls, because the cases and controls are known as well as the identities of the trait loci. LASSO-Patternsearch is a new multi-step algorithm with a LASSO-type penalized likelihood method at its core specifically designed to detect and model interactions between important predictor variables. In this article the original LASSO-Patternsearch algorithm is modified to handle the large number of SNPs plus covariates. We start with a screen step within the framework of parametric logistic regression. The patterns that survived the screen step were further selected by a penalized logistic regression with the LASSO penalty. Finally, a parametric logistic regression model was built on the patterns that survived the LASSO step. In our analysis of Genetic Analysis Workshop 15 Problem 3 data we have identified most of the associated SNPs and relevant covariates. Upon using the model as a classifier, very competitive error rates were obtained.
PMCID: PMC2367607  PMID: 18466561
5.  Association of common polymorphisms in known susceptibility genes with rheumatoid arthritis in a Slovak population using osteoarthritis patients as controls 
Introduction
Both genetic and environmental factors contribute to rheumatoid arthritis (RA), a common and complex autoimmune disease. As well as the major susceptibility gene HLA-DRB1, recent genome-wide and candidate-gene studies reported additional evidence for association of single nucleotide polymorphism (SNP) markers in the PTPN22, STAT4, OLIG3/TNFAIP3 and TRAF1/C5 loci with RA. This study was initiated to investigate the association between defined genetic markers and RA in a Slovak population. In contrast to recent studies, we included intensively-characterized osteoarthritis (OA) patients as controls.
Methods
We used material of 520 RA and 303 OA samples in a case-control setting. Six SNPs were genotyped using TaqMan assays. HLA-DRB1 alleles were determined by employing site-specific polymerase chain reaction (PCR) amplification.
Results
No statistically significant association of TRAF1/C5 SNPs rs3761847 and rs10818488 with RA was detected. However, we were able to replicate the association signals between RA and HLA-DRB1 alleles, STAT4 (rs7574865), PTPN22 (rs2476601) and OLIG3/TNFAIP3 (rs10499194 and rs6920220). The strongest signal was detected for HLA-DRB1*04 with an allelic P = 1.2 × 10^-13 (OR = 2.92, 95% confidence interval (CI) = 2.18–3.91). Additionally, SNPs rs7574865 in STAT4 (P = 9.2 × 10^-6; OR = 1.71, 95% CI = 1.35–2.18) and rs2476601 in PTPN22 (P = 9.5 × 10^-4; OR = 1.67, 95% CI = 1.23–2.26) were associated with susceptibility to RA, whereas after permutation testing OLIG3/TNFAIP3 SNPs rs10499194 and rs6920220 missed our criteria for significance (corrected P = 0.114 and corrected P = 0.180, respectively).
Conclusions
In our Slovak population, HLA-DRB1 alleles as well as SNPs in STAT4 and PTPN22 genes showed a strong association with RA.
doi:10.1186/ar2699
PMCID: PMC2714116  PMID: 19445664
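The odds ratios and 95% confidence intervals reported in the abstract above can be illustrated with a minimal sketch. It assumes a plain 2×2 allele-count table and the standard log-odds (Woolf) interval; the abstract does not state which interval method the authors used, and the counts below are hypothetical, not taken from the study.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) confidence interval from a 2x2 table.

    a: risk-allele count in cases      b: other-allele count in cases
    c: risk-allele count in controls   d: other-allele count in controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical allele counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(200, 800, 100, 900)
print(f"OR = {or_value:.2f}, 95% CI = {ci_low:.2f} to {ci_high:.2f}")
```

An interval that excludes 1 corresponds to a nominally significant association at the matching two-sided level.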
6.  BLOCK-BASED BAYESIAN EPISTASIS ASSOCIATION MAPPING WITH APPLICATION TO WTCCC TYPE 1 DIABETES DATA 
The annals of applied statistics  2011;5(3):2052-2077.
Interactions among multiple genes across the genome may contribute to the risks of many complex human diseases. Whole-genome single nucleotide polymorphisms (SNPs) data collected for many thousands of SNP markers from thousands of individuals under the case–control design promise to shed light on our understanding of such interactions. However, nearby SNPs are highly correlated due to linkage disequilibrium (LD) and the number of possible interactions is too large for exhaustive evaluation. We propose a novel Bayesian method for simultaneously partitioning SNPs into LD-blocks and selecting SNPs within blocks that are associated with the disease, either individually or interactively with other SNPs. When applied to homogeneous population data, the method gives posterior probabilities for LD-block boundaries, which not only result in accurate block partitions of SNPs, but also provide measures of partition uncertainty. When applied to case–control data for association mapping, the method implicitly filters out SNP associations created merely by LD with disease loci within the same blocks. Simulation study showed that this approach is more powerful in detecting multi-locus associations than other methods we tested, including one of ours. When applied to the WTCCC type 1 diabetes data, the method identified many previously known T1D associated genes, including PTPN22, CTLA4, MHC, and IL2RA. The method also revealed some interesting two-way associations that are undetected by single SNP methods. Most of the significant associations are located within the MHC region. Our analysis showed that the MHC SNPs form long-distance joint associations over several known recombination hotspots. By controlling the haplotypes of the MHC class II region, we identified additional associations in both MHC class I (HLA-A, HLA-B) and class III regions (BAT1). We also observed significant interactions between genes PRSS16, ZNF184 in the extended MHC region and the MHC class II genes. 
The proposed method can be broadly applied to the classification problem with correlated discrete covariates.
doi:10.1214/11-AOAS469
PMCID: PMC3226821  PMID: 22140419
Disease association study; epistasis; LD block; Bayesian methods
7.  Evaluation of a LASSO regression approach on the unrelated samples of Genetic Analysis Workshop 17 
BMC Proceedings  2011;5(Suppl 9):S12.
The Genetic Analysis Workshop 17 data we used comprise 697 unrelated individuals genotyped at 24,487 single-nucleotide polymorphisms (SNPs) from a mini-exome scan, using real sequence data for 3,205 genes annotated by the 1000 Genomes Project and simulated phenotypes. We studied 200 sets of simulated phenotypes of trait Q2. An important feature of this data set is that most SNPs are rare, with 87% of the SNPs having a minor allele frequency less than 0.05. For rare SNP detection, in this study we performed a least absolute shrinkage and selection operator (LASSO) regression and F tests at the gene level and calculated the generalized degrees of freedom to avoid any selection bias. For comparison, we also carried out linear regression and the collapsing method, which sums the rare SNPs, modified for a quantitative trait and with two different allele frequency thresholds. The aim of this paper is to evaluate these four approaches in this mini-exome data and compare their performance in terms of power and false positive rates. In most situations the LASSO approach is more powerful than linear regression and collapsing methods. We also note the difficulty in determining the optimal threshold for the collapsing method and the significant role that linkage disequilibrium plays in detecting rare causal SNPs. If a rare causal SNP is in strong linkage disequilibrium with a common marker in the same gene, power will be much improved.
doi:10.1186/1753-6561-5-S9-S12
PMCID: PMC3287844  PMID: 22373385
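The collapsing method mentioned in the abstract above, which sums rare SNPs within a gene, can be sketched as follows. This is a simplified illustration, assuming genotypes are coded as minor-allele counts (0, 1 or 2) and that "rare" means a minor allele frequency below a chosen threshold; the function names and the toy data are hypothetical.

```python
def minor_allele_freq(genotypes):
    """Allele frequency of the coded allele for one SNP (genotypes are 0/1/2 counts)."""
    return sum(genotypes) / (2 * len(genotypes))

def collapse_rare(genotype_matrix, maf_threshold=0.05):
    """Sum coded-allele counts over rare SNPs for each individual.

    genotype_matrix[i][j] is the allele count of individual i at SNP j.
    Returns (burden score per individual, indices of SNPs treated as rare).
    """
    n_snps = len(genotype_matrix[0])
    rare = [
        j for j in range(n_snps)
        if minor_allele_freq([row[j] for row in genotype_matrix]) < maf_threshold
    ]
    burden = [sum(row[j] for j in rare) for row in genotype_matrix]
    return burden, rare

# Toy gene with 20 individuals and 3 SNPs: SNP 0 is common, SNPs 1 and 2 are rare.
rows = [[1, 0, 0] for _ in range(20)]
rows[0][1] = 1
rows[1][2] = 1
burden, rare_snps = collapse_rare(rows, maf_threshold=0.05)
print(rare_snps, burden[:3])
```

The burden score would then be used as a single predictor for the gene. Changing `maf_threshold` changes which SNPs are pooled, which is one way to see why the choice of threshold matters.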
8.  Mining Gold Dust under the Genome Wide Significance Level: A Two-Stage Approach to Analysis of GWAS 
Genetic epidemiology  2010;35(2):111-118.
We propose a two-stage approach to analyze genome-wide association (GWA) data in order to identify a set of promising single-nucleotide polymorphisms (SNPs). In stage one, we select a list of top signals from single SNP analyses by controlling false discovery rate (FDR). In stage two, we use the least absolute shrinkage and selection operator (LASSO) regression to reduce false positives. The proposed approach was evaluated using simulated quantitative traits based on genome-wide SNP data on 8,861 Caucasian individuals from the Atherosclerosis Risk in Communities (ARIC) Study. Our first stage, targeted at controlling false negatives, yields better power than using Bonferroni corrected significance level. The LASSO regression reduces the number of significant SNPs in stage two: it reduces false positive SNPs and it reduces true positive SNPs also at simulated causal loci due to linkage disequilibrium. Interestingly, the LASSO regression preserves the power from stage one, i.e., the number of causal loci detected from the LASSO regression in stage two is almost the same as in stage one, while reducing false positives further. Real data on systolic blood pressure in the ARIC study was analyzed using our two-stage approach which identified two significant SNPs, one of which was reported to be genome-significant in a meta-analysis containing a much larger sample size. On the other hand, a single SNP association scan did not yield any significant results.
doi:10.1002/gepi.20556
PMCID: PMC3624896  PMID: 21254218
LASSO; FDR; multi-marker; association; power
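Stage one of the two-stage approach above selects top signals by controlling the false discovery rate. A minimal sketch of one common FDR procedure, Benjamini-Hochberg, is given here; the abstract does not say which FDR procedure the authors used, so this choice is an assumption, and the p-values are made up for illustration. Stage two (the LASSO fit on the selected SNPs) is not shown.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return the indices of hypotheses rejected at FDR level q.

    Finds the largest rank k such that the k-th smallest p-value is
    at most k * q / m, then rejects the k smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            k_max = rank
    return sorted(order[:k_max])

# Hypothetical single-SNP p-values.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
selected = benjamini_hochberg(pvals, q=0.05)
print(selected)
```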
9.  Predicting the Risk of Rheumatoid Arthritis and Its Age of Onset through Modelling Genetic Risk Variants with Smoking 
PLoS Genetics  2013;9(9):e1003808.
The improved characterisation of risk factors for rheumatoid arthritis (RA) suggests they could be combined to identify individuals at increased disease risks in whom preventive strategies may be evaluated. We aimed to develop an RA prediction model capable of generating clinically relevant predictive data and to determine if it better predicted younger onset RA (YORA). Our novel modelling approach combined odds ratios for 15 four-digit/10 two-digit HLA-DRB1 alleles, 31 single nucleotide polymorphisms (SNPs) and ever-smoking status in males to determine risk using computer simulation and confidence interval based risk categorisation. Only males were evaluated in our models incorporating smoking as ever-smoking is a significant risk factor for RA in men but not women. We developed multiple models to evaluate each risk factor's impact on prediction. Each model's ability to discriminate anti-citrullinated protein antibody (ACPA)-positive RA from controls was evaluated in two cohorts: Wellcome Trust Case Control Consortium (WTCCC: 1,516 cases; 1,647 controls); UK RA Genetics Group Consortium (UKRAGG: 2,623 cases; 1,500 controls). HLA and smoking provided strongest prediction with good discrimination evidenced by an HLA-smoking model area under the curve (AUC) value of 0.813 in both WTCCC and UKRAGG. SNPs provided minimal prediction (AUC 0.660 WTCCC/0.617 UKRAGG). Whilst high individual risks were identified, with some cases having estimated lifetime risks of 86%, only a minority overall had substantially increased odds for RA. High risks from the HLA model were associated with YORA (P<0.0001); ever-smoking associated with older onset disease. This latter finding suggests smoking's impact on RA risk manifests later in life. Our modelling demonstrates that combining risk factors provides clinically informative RA prediction; additionally HLA and smoking status can be used to predict the risk of younger and older onset RA, respectively.
Author Summary
Rheumatoid arthritis (RA) is a common, incurable disease with major individual and health service costs. Preventing its development is therefore an important goal. Being able to predict who will develop RA would allow researchers to look at ways to prevent it. Many factors have been found that increase someone's risk of RA. These are divided into genetic and environmental (such as smoking) factors. The risk of RA associated with each factor has previously been reported. Here, we demonstrate a method that combines these risk factors in a process called “prediction modelling” to estimate someone's lifetime risk of RA. We show that firstly, our prediction models can identify people with very high risks of RA and secondly, they can be used to identify people at risk of developing RA at a younger age. Although these findings are an important first step towards preventing RA, as only a minority of people tested had substantially increased disease risks our models could not be used to screen the general population. Instead they need testing in people already at risk of RA such as relatives of affected patients. In this context they could identify sufficient numbers of high-risk people to allow preventive methods to be evaluated.
doi:10.1371/journal.pgen.1003808
PMCID: PMC3778023  PMID: 24068971
10.  A genome-wide association scan for rheumatoid arthritis data by Hotelling's T² tests 
BMC Proceedings  2009;3(Suppl 7):S6.
We performed a genome-wide association scan on the North American Rheumatoid Arthritis Consortium (NARAC) data using Hotelling's T² tests, i.e., T_H based on allele coding and T_G based on genotype coding. The objective was to identify associations between single-nucleotide polymorphisms (SNPs) or markers and rheumatoid arthritis. In specific candidate gene regions, we evaluated the performance of Hotelling's T² tests. Then Hotelling's T² tests were used as a tool to identify new regions that contain SNPs showing strong associations with disease. As expected, the strongest association evidence was found in the region of the HLA-DRB1 locus on chromosome 6. In the region of the TRAF1-C5 genes, we identified two SNPs, rs2900180 and rs3761847, with the largest and the second largest T_H and T_G scores among all SNPs on chromosome 9. We also identified one SNP, rs2476601, in the region of the PTPN22 gene that had the largest T_H score and the second largest T_G score among all SNPs on chromosome 1. In addition, SNPs with the largest T_H score on each chromosome were identified. These SNPs may be located in the regions of genes that have modest effects on rheumatoid arthritis. These regions deserve further investigation.
PMCID: PMC2795960  PMID: 20018053
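For readers unfamiliar with the statistic named in the abstract above, here is a minimal sketch of the generic two-sample Hotelling's T² for two-dimensional observations. The abstract does not give the exact allele or genotype coding behind its two statistics, so the two-number coding per individual used here is an assumption, and the data are toy values chosen so the result can be checked by hand.

```python
def hotelling_t2_2d(group_a, group_b):
    """Two-sample Hotelling's T^2 for 2-dimensional observations.

    Each group is a list of (x, y) pairs, e.g. two coded genotype
    variables per individual (hypothetical coding).
    T^2 = n1*n2/(n1+n2) * d' S^-1 d, with d the difference of group
    means and S the pooled sample covariance matrix.
    """
    n1, n2 = len(group_a), len(group_b)

    def mean(group):
        return (sum(x for x, _ in group) / len(group),
                sum(y for _, y in group) / len(group))

    mean_a, mean_b = mean(group_a), mean(group_b)

    # Pooled within-group sums of squares and cross-products.
    sxx = sxy = syy = 0.0
    for group, m in ((group_a, mean_a), (group_b, mean_b)):
        for x, y in group:
            dx, dy = x - m[0], y - m[1]
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy
    dof = n1 + n2 - 2
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof

    # Quadratic form d' S^-1 d using the closed-form 2x2 inverse.
    det = sxx * syy - sxy * sxy
    d0, d1 = mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]
    quad = (d0 * d0 * syy - 2 * d0 * d1 * sxy + d1 * d1 * sxx) / det
    return n1 * n2 / (n1 + n2) * quad

# Toy data: group means (1, 1) and (2, 2), pooled covariance (4/3) * identity.
cases = [(0, 0), (2, 0), (0, 2), (2, 2)]
controls = [(1, 1), (3, 1), (1, 3), (3, 3)]
t2 = hotelling_t2_2d(cases, controls)
print(t2)
```

The sketch fails with a division by zero if the pooled covariance matrix is singular, for example when one coded variable is constant across all individuals.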
11.  Immunochip Identifies Novel, and Replicates Known, Genetic Risk Loci for Rheumatoid Arthritis in Black South Africans 
Molecular Medicine  2014;20(1):341-349.
The aim of this study was to identify genetic variants associated with rheumatoid arthritis (RA) risk in black South Africans. Black South African RA patients (n = 263) were compared with healthy controls (n = 374). Genotyping was performed using the Immunochip, and four-digit high-resolution human leukocyte antigen (HLA) typing was performed by DNA sequencing of exon 2. Standard quality control measures were implemented on the data. The strongest associations were in the intergenic region between the HLA-DRB1 and HLA-DQA1 loci. After conditioning on HLA-DRB1 alleles, the effect in the rest of the extended major histocompatibility complex (MHC) diminished. Non-HLA single nucleotide polymorphisms (SNPs) in the intergenic regions LOC389203|RBPJ, LOC100131131|IL1R1, KIAA1919|REV3L, LOC643749|TRAF3IP2, and SNPs in the intron and untranslated regions (UTR) of IRF1 and the intronic region of ICOS and KIAA1542 showed association with RA (p < 5 × 10^-5). Of the SNPs previously associated with RA in Caucasians, one SNP, rs874040, locating to the intergenic region LOC389203|RBPJ was replicated in this study. None of the variants in the PTPN22 gene was significantly associated. The seropositive subgroups showed similar results to the overall cohort. The effects observed across the HLA region are most likely due to HLA-DRB1, and secondary effects in the extended MHC cannot be detected. Seven non-HLA loci are associated with RA in black South Africans. Similar to Caucasians, the intergenic region between LOC389203 and RBPJ is associated with RA in this population. The strong association of the R620W variant of the PTPN22 gene with RA in Caucasians was not replicated since this variant was monomorphic in our study, but other SNP variants of the PTPN22 gene were also not associated with RA in black South Africans, suggesting that this locus does not play a major role in RA in this population.
doi:10.2119/molmed.2014.00097
PMCID: PMC4153842  PMID: 25014791
12.  Detecting significant single-nucleotide polymorphisms in a rheumatoid arthritis study using random forests 
BMC Proceedings  2009;3(Suppl 7):S69.
Random forest is an efficient approach for investigating not only the effects of individual markers on a trait but also the effect of the interactions among the markers in genetic association studies. This approach is especially appealing for the analysis of genome-wide data, such as those obtained from gene expression/single-nucleotide polymorphism (SNP) array experiments in which the number of candidate genes/SNPs is vast. We applied this approach to the Genetic Analysis Workshop 16 Problem 1 data to identify SNPs that contribute to rheumatoid arthritis. The random forest computed a raw importance score for each SNP marker, where higher importance score suggests higher level of association between the marker and the trait. The significance level of the association was determined empirically by repeatedly reapplying the random forest on randomly generated data under the null hypothesis that no association exists between the markers and the trait. Using random forest, we were able to identify 228 significant SNPs (at the genome-wide significant level of 0.05) across the whole genome, over two-thirds of which are located on chromosome 6, especially clustered in the region of 6p21 containing the human leukocyte antigen (HLA) genes, such as gene HLA-DRB1 and HLA-DRA. Further analysis of this region indicates a strong association to the rheumatoid arthritis status.
PMCID: PMC2795970  PMID: 20018063
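The empirical significance step described in the abstract above (re-scoring data generated under the null hypothesis of no association) can be sketched for a single SNP. To stay self-contained, this sketch swaps in a simple score, the absolute difference in mean allele count between cases and controls, in place of a random forest importance score, and generates null data by shuffling case/control labels; both choices are assumptions for illustration, as are the toy data.

```python
import random

def mean_diff(genotypes, labels):
    """Absolute difference in mean allele count between cases (1) and controls (0)."""
    cases = [g for g, y in zip(genotypes, labels) if y == 1]
    controls = [g for g, y in zip(genotypes, labels) if y == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))

def permutation_pvalue(genotypes, labels, score=mean_diff, n_perm=999, seed=0):
    """Empirical p-value: share of label permutations scoring at least the observed value."""
    rng = random.Random(seed)
    observed = score(genotypes, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-trait association
        if score(genotypes, shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_perm + 1)

# Toy SNP that separates cases from controls perfectly.
toy_genotypes = [2] * 10 + [0] * 10
toy_labels = [1] * 10 + [0] * 10
p_emp = permutation_pvalue(toy_genotypes, toy_labels)
print(p_emp)
```

Any importance measure with the same call signature could be passed as `score`. A genome-wide threshold would additionally need to account for the number of SNPs tested, for example by comparing against the maximum score across SNPs in each permutation.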
13.  Genome wide association studies for body conformation traits in the Chinese Holstein cattle population 
BMC Genomics  2013;14:897.
Background
Genome-wide association study (GWAS) is a powerful tool for revealing the genetic basis of quantitative traits. However, GWAS of conformation traits in cattle are comparatively few. This study aims to use GWAS to find candidate genes for body conformation traits.
Results
The Illumina BovineSNP50 BeadChip was used to identify single nucleotide polymorphisms (SNPs) that are associated with body conformation traits. A least absolute shrinkage and selection operator (LASSO) was applied to detect multiple SNPs simultaneously for 29 body conformation traits with 1,314 Chinese Holstein cattle and 52,166 SNPs. In total, 59 genome-wide significant SNPs associated with 26 conformation traits were detected by genome-wide association analysis; five SNPs were within previously reported QTL regions (Animal Quantitative Trait Loci (QTL) database) and 11 were very close to the reported SNPs. Twenty-two SNPs were located within annotated gene regions, while the remainder were 0.6–826 kb away from known genes. Some of the genes had clear biological functions related to conformation traits. By combining information about the previously reported QTL regions and the biological functions of the genes, we identified DARC, GAS1, MTPN, HTR2A, ZNF521, PDIA6, and TMEM130 as the most promising candidate genes for capacity and body depth, chest width, foot angle, angularity, rear leg side view, teat length, and animal size traits, respectively. We also found four SNPs that affected four pairs of traits, and the genetic correlation between each pair of traits ranged from 0.35 to 0.86, suggesting that these SNPs may have a pleiotropic effect on each pair of traits.
Conclusions
A total of 59 significant SNPs associated with 26 conformation traits were identified in the Chinese Holstein population. Six promising candidate genes were suggested, and four SNPs showed genetic correlation for four pairs of traits.
doi:10.1186/1471-2164-14-897
PMCID: PMC3879203  PMID: 24341352
Dairy cattle; GWAS; Body conformation traits; SNP; Holstein; QTL
14.  Effective Detection of Human Leukocyte Antigen Risk Alleles in Celiac Disease Using Tag Single Nucleotide Polymorphisms 
PLoS ONE  2008;3(5):e2270.
Background
The HLA genes, located in the MHC region on chromosome 6p21.3, play an important role in many autoimmune disorders, such as celiac disease (CD), type 1 diabetes (T1D), rheumatoid arthritis, multiple sclerosis, psoriasis and others. Known HLA variants that confer risk to CD, for example, include DQA1*05/DQB1*02 (DQ2.5) and DQA1*03/DQB1*0302 (DQ8). To diagnose the majority of CD patients and to study disease susceptibility and progression, typing these strongly associated HLA risk factors is of utmost importance. However, current genotyping methods for HLA risk factors involve many reactions, and are complicated and expensive. We sought a simple experimental approach using tagging SNPs that predict the CD-associated HLA risk factors.
Methodology
Our tagging approach exploits linkage disequilibrium between single nucleotide polymorphisms (SNPs) and the CD-associated HLA risk factors DQ2.5 and DQ8 that indicate direct risk, and DQA1*0201/DQB1*0202 (DQ2.2) and DQA1*0505/DQB1*0301 (DQ7) that contribute to the risk of DQ2.5 to CD. To evaluate the predictive power of this approach, we performed an empirical comparison of the predicted DQ types, based on these six tag SNPs, with those obtained with current validated laboratory typing methods of the HLA-DQA1 and -DQB1 genes in three large cohorts. The results were validated in three European celiac populations.
Conclusion
Using this method, only six SNPs were needed to predict the risk types carried by >95% of CD patients. We determined that for this tagging approach the sensitivity was >0.991, specificity >0.996 and the predictive value >0.948. Our results show that this tag SNP method is very accurate and provides an excellent basis for population screening for CD. This method is broadly applicable in European populations.
doi:10.1371/journal.pone.0002270
PMCID: PMC2386975  PMID: 18509540
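The sensitivity, specificity and predictive value quoted in the conclusion above can be illustrated with a minimal sketch. It assumes that "predictive value" means the positive predictive value, and the counts are hypothetical, not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, positive predictive value).

    tp/fn: carriers of the risk type predicted correctly/incorrectly by tag SNPs
    tn/fp: non-carriers predicted correctly/incorrectly by tag SNPs
    """
    sensitivity = tp / (tp + fn)   # share of true carriers that are detected
    specificity = tn / (tn + fp)   # share of non-carriers that are cleared
    ppv = tp / (tp + fp)           # share of positive calls that are correct
    return sensitivity, specificity, ppv


# Hypothetical counts for illustration only (not data from the paper).
sens, spec, ppv = diagnostic_metrics(tp=95, fp=5, tn=895, fn=5)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} PPV={ppv:.3f}")
```

Unlike sensitivity and specificity, the positive predictive value depends on how common the risk type is in the tested population, so it can differ between cohorts.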
15.  A Genome-Wide Association Study of Psoriasis and Psoriatic Arthritis Identifies New Disease Loci 
PLoS Genetics  2008;4(4):e1000041.
A genome-wide association study was performed to identify genetic factors involved in susceptibility to psoriasis (PS) and psoriatic arthritis (PSA), inflammatory diseases of the skin and joints in humans. 223 PS cases (including 91 with PSA) were genotyped with 311,398 single nucleotide polymorphisms (SNPs), and results were compared with those from 519 Northern European controls. Replications were performed with an independent cohort of 577 PS cases and 737 controls from the U.S., and 576 PSA patients and 480 controls from the U.K. Strongest associations were with the class I region of the major histocompatibility complex (MHC). The most highly associated SNP was rs10484554, which lies 34.7 kb upstream from HLA-C (P = 7.8×10−11, GWA scan; P = 1.8×10−30, replication; P = 1.8×10−39, combined; U.K. PSA: P = 6.9×10−11). However, rs2395029 encoding the G2V polymorphism within the class I gene HCP5 (combined P = 2.13×10−26 in U.S. cases) yielded the highest ORs with both PS and PSA (4.1 and 3.2 respectively). This variant is associated with low viral set point following HIV infection and its effect is independent of rs10484554. We replicated the previously reported association with interleukin 23 receptor and interleukin 12B (IL12B) polymorphisms in PS and PSA cohorts (IL23R: rs11209026, U.S. PS, P = 1.4×10−4; U.K. PSA: P = 8.0×10−4; IL12B:rs6887695, U.S. PS, P = 5×10−5 and U.K. PSA, P = 1.3×10−3) and detected an independent association in the IL23R region with a SNP 4 kb upstream from IL12RB2 (P = 0.001). Novel associations replicated in the U.S. PS cohort included the region harboring lipoma HMGIC fusion partner (LHFP) and conserved oligomeric golgi complex component 6 (COG6) genes on chromosome 13q13 (combined P = 2×10−6 for rs7993214; OR = 0.71), the late cornified envelope gene cluster (LCE) from the Epidermal Differentiation Complex (PSORS4) (combined P = 6.2×10−5 for rs6701216; OR 1.45) and a region of LD at 15q21 (combined P = 2.9×10−5 for rs3803369; OR = 1.43). This region is of interest because it harbors ubiquitin-specific protease-8 whose processed pseudogene lies upstream from HLA-C. This region of 15q21 also harbors the gene for SPPL2A (signal peptide peptidase like 2a) which activates tumor necrosis factor alpha by cleavage, triggering the expression of IL12 in human dendritic cells. We also identified a novel PSA (and potentially PS) locus on chromosome 4q27. This region harbors the interleukin 2 (IL2) and interleukin 21 (IL21) genes and was recently shown to be associated with four autoimmune diseases (Celiac disease, Type 1 diabetes, Grave's disease and Rheumatoid Arthritis).
Author Summary
Psoriasis (PS) and psoriatic arthritis (PSA) are common inflammatory diseases of humans affecting the skin and joints. Approximately 2% of Europeans are affected with PS, and ∼10–30% of patients develop PSA. Genetic variation in the MHC (multiple histocompatibility locus antigen cluster) increases risk of developing PS. However, only ∼10% of individuals with this risk factor develop PS, indicating that other genetic effects and environmental triggers are important. Recent approaches using a case/control approach and genome wide association studies with DNA markers known as SNPs (single nucleotide polymorphisms) have been fruitful in identifying genetic factors for common diseases. This study describes the first large scale genome wide scan for additional PS and PSA susceptibility genes using 233 cases and 519 controls. It revealed that the MHC is truly the most important risk factor for PS and that it plays a very major role in PSA, confirmed recently identified associations with interleukin 23 receptor and interleukin 12B in both PS and PSA, and identified new associations. These include a region on chromosome 4q27 that contains genes for interleukin 2 and interleukin 21 that has been recently implicated in other autoimmune diseases, and seven additional regions that include chromosome 13q13 and 15q21.
doi:10.1371/journal.pgen.1000041
PMCID: PMC2274885  PMID: 18369459
16.  Genome-Wide Association Study of Determinants of Anti-Cyclic Citrullinated Peptide Antibody Titer in Adults with Rheumatoid Arthritis 
Molecular Medicine  2009;15(5-6):136-143.
We carried out a genome-wide association study of genetic predictors of anti-cyclic citrullinated peptide antibody (anti-CCP) level in 531 self-reported non-Hispanic Caucasian Rheumatoid Arthritis (RA) patients enrolled in the Brigham Rheumatoid Arthritis Sequential Study (BRASS). For replication, we then analyzed 289 single nucleotide polymorphisms (SNPs) with P < 0.001 in BRASS in an independent population of 849 RA patients from the North American Rheumatoid Arthritis Consortium (NARAC). BRASS and NARAC samples were genotyped using the Affymetrix 100K and Illumina 550K platforms respectively. Association between SNPs and anti-CCP titer was tested using general linear models. The five most significant SNPs from BRASS all were within the major histocompatibility complex (MHC) region (P ≤ 3.5 × 10−6). After controlling for the human leukocyte antigen shared epitope (HLA-SE), the top SNPs still yielded P values < 0.0002. In NARAC, a single SNP from the MHC region near BTNL2 and HLA-DRA, rs1980493 (r2 = 0.85 with the top five SNPs from BRASS), was associated significantly with CCP titer (P = 6.1 × 10−5) even after adjustment for the HLA-SE (P = 0.0002). The top SNPs found in BRASS and NARAC had r2 = 0.46 and 0.64, respectively, to HLA-DRB1 DR3 alleles. These results confirm that the most significant genome region affecting anti-CCP titers in RA is the MHC region. We identified a SNP in moderate linkage disequilibrium (LD) with HLA-DR3, which may influence anti-CCP titer independently of the HLA-SE.
doi:10.2119/molmed.2009.00008
PMCID: PMC2654848  PMID: 19287509
17.  The Principal Genetic Determinants for Nasopharyngeal Carcinoma in China Involve the HLA Class I Antigen Recognition Groove 
PLoS Genetics  2012;8(11):e1003103.
Nasopharyngeal carcinoma (NPC) is an epithelial malignancy facilitated by Epstein-Barr Virus infection. Here we resolve the major genetic influences for NPC incidence using a genome-wide association study (GWAS), independent cohort replication, and high-resolution molecular HLA class I gene typing including 4,055 study participants from the Guangxi Zhuang Autonomous Region and Guangdong province of southern China. We detect and replicate strong association signals involving SNPs, HLA alleles, and amino acid (aa) variants across the major histocompatibility complex-HLA-A, HLA-B, and HLA-C class I genes (P(HLA-A-aa-site-62) = 7.4×10−29; P(HLA-B-aa-site-116) = 6.5×10−19; P(HLA-C-aa-site-156) = 6.8×10−8, respectively). Over 250 NPC-HLA associated variants within HLA were analyzed in concert to resolve separate and largely independent HLA-A, -B, and -C gene influences. Multivariate logistical regression analysis collapsed significant associations in adjacent genes spanning 500 kb (OR2H1, GABBR1, HLA-F, and HCG9) as proxies for peptide binding motifs carried by HLA-A*11:01. A similar analysis resolved an independent association signal driven by HLA-B*13:01, B*38:02, and B*55:02 alleles together. NPC resistance alleles carrying the strongly associated amino acid variants implicate specific class I peptide recognition motifs in HLA-A and -B peptide binding groove as conferring strong genetic influence on the development of NPC in China.
Author Summary
NPC is a deadly throat cancer in China that is dependent on EBV infection. Here, we performed a 1 M SNP genome-wide association study using a large cohort of Chinese study participants at risk for NPC. Although several putative gene regions show significant associations, the strongest statistical signals involved scores of variants within the HLA region on chromosome 6. HLA poses a formidable association-genetics challenge because of extensive linkage disequilibrium, rather low allele frequencies, and multiple physically close interacting genes of diverse function. We examined over 250 NPC-HLA associated variants detected with sequence-based nucleotide alleles and amino acid variants. The multiple associations were collapsed to implicate causal signals by multivariate logistical regression to resolve allele association interaction. One operative variant was identified as the HLA-A*11:01 allele motif, specifically in the peptide binding groove, which recognizes invading antigens; a second involved two aa sites with HLA-B tracking B*13:01 and B*55:02 alleles. We synthesize these new and previous discoveries to help resolve the important gene influences on this disease.
doi:10.1371/journal.pgen.1003103
PMCID: PMC3510037  PMID: 23209447
18.  Analysis of polymorphisms in 16 genes in type 1 diabetes that have been associated with other immune-mediated diseases 
BMC Medical Genetics  2006;7:20.
Background
The identification of the HLA class II, insulin (INS), CTLA-4 and PTPN22 genes as determinants of type 1 diabetes (T1D) susceptibility indicates that fine tuning of the immune system is centrally involved in disease development. Some genes have been shown to affect several immune-mediated diseases. Therefore, we tested the hypothesis that alleles of susceptibility genes previously associated with other immune-mediated diseases might perturb immune homeostasis, and hence also associate with predisposition to T1D.
Methods
We resequenced and genotyped tag single nucleotide polymorphisms (SNPs) from two genes, CRP and FCER1B, and genotyped 27 disease-associated polymorphisms from thirteen gene regions, namely FCRL3, CFH, SLC9A3R1, PADI4, RUNX1, SPINK5, IL1RN, IL1RA, CARD15, IBD5-locus (including SLC22A4), LAG3, ADAM33 and NFKB1. These genes have been associated previously with susceptibility to a range of immune-mediated diseases including rheumatoid arthritis (RA), systemic lupus erythematosus (SLE), Graves' disease (GD), psoriasis, psoriatic arthritis (PA), atopy, asthma, Crohn disease and multiple sclerosis (MS). Our T1D collections are divided into three sample subsets, consisting of set 1 families (up to 754 families), set 2 families (up to 743 families), and a case-control collection (ranging from 1,500 to 4,400 cases and 1,500 to 4,600 controls). Each SNP was genotyped in one or more of these subsets. Our study typically had approximately 80% statistical power for a minor allele frequency (MAF) >5% and odds ratios (OR) of 1.5 with the type 1 error rate, α = 0.05.
Results
We found no evidence of association with T1D at most of the loci studied 0.02

Conclusion
Polymorphisms in a variety of genes previously associated with immune-mediated disease susceptibility and/or having effects on gene function and the immune system, are unlikely to be affecting T1D susceptibility in a major way, even though some of the genes tested encode proteins of immune pathways that are believed to be central to the development of T1D. We cannot, however, rule out effect sizes smaller than OR 1.5.
doi:10.1186/1471-2350-7-20
PMCID: PMC1420277  PMID: 16519819
PLoS ONE  2012;7(2):e31584.
Genome-wide association studies and meta-analysis indicate that several genes/loci are consistently associated with rheumatoid arthritis (RA) in European and Asian populations. To evaluate the transferability status of these findings to an ethnically diverse north Indian population, we performed a replication analysis. We investigated the association of 47 single-nucleotide polymorphisms (SNPs) at 43 of these genes/loci with RA in a north Indian cohort comprising 983 RA cases and 1007 age and gender matched controls. Genotyping was done using Infinium human 660w-quad. Association analysis by chi-square test implemented in PLINK was carried out in two steps. Firstly, association of the index or surrogate SNP (r2>0.8, calculated from reference GIH Hap-Map population) was tested. In the second step, evidence for allelic/locus heterogeneity at aforementioned genes/loci was assessed by testing additional flanking SNPs in linkage equilibrium with index/surrogate marker.
Of the 44 European specific index SNPs, neither index nor surrogate SNPs were present for nine SNPs in the genotyping array. Of the remaining 35, associations were replicated at seven genes namely PTPN22 (rs1217407, p = 3×10−3); IL2–21 (rs13119723, p = 0.008); HLA-DRB1 (rs660895, p = 2.56×10−5; rs6457617, p = 1.6×10−09; rs13192471, p = 6.7×10−16); TNFAIP3 (rs9321637, p = 0.03); CCL21 (rs13293020, p = 0.01); IL2RA (rs2104286, p = 1.9×10−4) and ZEB1 (rs2793108, p = 0.006). Of the three Asian specific loci tested, rs2977227 in PADI4 showed modest association (p<0.02). Further, of the 140 SNPs (in LE with index/surrogate variant) tested, association was observed at 11 additional genes: PTPRC, AFF3, CD28, CTLA4, PXK, ANKRD55, TAGAP, CCR6, BLK, CD40 and IL2RB. This study indicates limited replication of European and Asian index SNPs and apparent allelic heterogeneity in RA etiology among north Indians warranting independent GWAS in this population. However, replicated associations of HLA-DRB1, PTPN22 (which confer ∼50% of the heritable risk to RA) and IL2RA suggest that cross-ethnicity fine mapping of such loci is apposite for identification of causal variants.
doi:10.1371/journal.pone.0031584
PMCID: PMC3280307  PMID: 22355377
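The chi-square association test mentioned in this abstract can be sketched for a single SNP as a basic 1-df allelic test on a 2x2 table of allele counts. This is an illustrative sketch only: the function name and the counts are hypothetical, and the exact computation performed by the software used in the study may differ.

```python
import math


def allelic_chi_square(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-square (1 df, no continuity correction) on allele counts.

    Returns (chi2, p_value, odds_ratio) for the minor allele.
    """
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    # Closed-form Pearson statistic for a 2x2 table.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


# Hypothetical allele counts (two alleles per person), not study data.
chi2, p_value, odds_ratio = allelic_chi_square(30, 70, 20, 80)
print(f"chi2={chi2:.3f} p={p_value:.3f} OR={odds_ratio:.3f}")
```

The allelic test counts alleles rather than individuals, so it implicitly assumes Hardy-Weinberg equilibrium; genotype-based tests avoid that assumption at the cost of an extra degree of freedom.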
Gastroenterology  2011;141(1):338-347.
Background & Aims
Drug-induced liver injury (DILI), especially from antimicrobial agents, is an important cause of serious liver disease. Amoxicillin-clavulanate (AC) is a leading cause of idiosyncratic DILI, but little is understood about genetic susceptibility to this adverse reaction.
Methods
We performed a genome-wide association study using 822,927 single-nucleotide polymorphism (SNP) markers from 201 White European and US cases of AC-DILI and 532 population controls, matched for genetic background.
Results
AC-DILI was associated with many loci in the major histocompatibility complex. The strongest effect was with a human leukocyte antigen (HLA) class II SNP (rs9274407, P=4.8×10−14), which correlated with rs3135388, a tag SNP of HLA-DRB1*1501-DQB1*0602 that was previously associated with AC-DILI. Conditioned on rs3135388, rs9274407 is still significant (P=1.1×10−4). An independent association was observed in the class I region (rs2523822, P=1.8×10−10), related to HLA-A*0201. The most significant class I and II SNPs showed statistical interaction (P=0.0015). High-resolution HLA genotyping (177 cases and 219 controls) confirmed associations of HLA-A*0201 (P=2×10−6) and HLA-DQB1*0602 (P=5×10−10), and their interaction (P=0.005). Additional, population-dependent effects were observed in HLA alleles with nominal significance. In an analysis of autoimmune-related genes, rs2476601 in the gene PTPN22 was associated (P=1.3×10−4).
Conclusions
Class I and II HLA genotypes affect susceptibility to AC-DILI, indicating the importance of the adaptive immune response in pathogenesis. The HLA genotypes identified will be useful in studies of the pathogenesis of AC-DILI, but have limited utility as predictive or diagnostic biomarkers because of the low positive-predictive values.
doi:10.1053/j.gastro.2011.04.001
PMCID: PMC3129430  PMID: 21570397
Hepatotoxicity; GWAS; pharmacogenomics; MHC; Side Effect
PLoS Genetics  2008;4(6):e1000107.
Rheumatoid arthritis (RA) is a chronic, systemic autoimmune disease affecting both joints and extra-articular tissues. Although some genetic risk factors for RA are well-established, most notably HLA-DRB1 and PTPN22, these markers do not fully account for the observed heritability. To identify additional susceptibility loci, we carried out a multi-tiered, case-control association study, genotyping 25,966 putative functional SNPs in 475 white North American RA patients and 475 matched controls. Significant markers were genotyped in two additional, independent, white case-control sample sets (661 cases/1322 controls from North America and 596 cases/705 controls from The Netherlands) identifying a SNP, rs1953126, on chromosome 9q33.2 that was significantly associated with RA (ORcommon = 1.28, trend Pcomb = 1.45E-06). Through a comprehensive fine-scale-mapping SNP-selection procedure, 137 additional SNPs in a 668 kb region from MEGF9 to STOM on 9q33.2 were chosen for follow-up genotyping in a staged-approach. Significant single marker results (Pcomb<0.01) spanned a large 525 kb region from FBXW2 to GSN. However, a variety of analyses identified SNPs in a 70 kb region extending from the third intron of PHF19 across TRAF1 into the TRAF1-C5 intergenic region, but excluding the C5 coding region, as the most interesting (trend Pcomb: 1.45E-06 → 5.41E-09). The observed association patterns for these SNPs had heightened statistical significance and a higher degree of consistency across sample sets. In addition, the allele frequencies for these SNPs displayed reduced variability between control groups when compared to other SNPs. Lastly, in combination with the other two known genetic risk factors, HLA-DRB1 and PTPN22, the variants reported here generate more than a 45-fold RA-risk differential.
Author Summary
Rheumatoid arthritis (RA), a chronic autoimmune disorder affecting ∼1% of the population, is characterized by immune-cell–mediated destruction of the joint architecture. Gene–environment interactions are thought to underlie RA etiology. Variants within HLA-DRB1 and the hematopoietic-specific phosphatase, PTPN22, are well established RA-susceptibility loci, and although other markers have been identified, they do not fully account for the disease heritability. To identify additional susceptibility alleles, we carried out a multi-tiered, case-control association study genotyping >25,000 putative functional SNPs; here we report our finding of RA-associated variants in chromosome 9q33.2. A detailed genetic analysis of this region, incorporating HapMap information, localizes the RA-susceptibility effects to a 70 kb region that includes a portion of PHF19, all of TRAF1, and the majority of the TRAF1-C5 intergenic region, but excludes the C5 coding region. In addition to providing new insights into underlying mechanism(s) of disease and suggesting novel therapeutic targets, these data provide the underpinnings of a genetic signature that may predict individuals at increased risk for developing RA. Indeed, initial analyses of three known genetic risk factors, HLA, PTPN22, and the chromosome 9q33.2 variants described here, suggest a >45-fold difference in RA risk depending on an individual's three-locus genotype.
doi:10.1371/journal.pgen.1000107
PMCID: PMC2481282  PMID: 18648537
Frontiers in Genetics  2013;4:270.
The number of publications performing genome-wide association studies (GWAS) has increased dramatically. Penalized regression approaches have been developed to overcome the challenges caused by the high dimensional data, but these methods are relatively new in the GWAS field. In this study we have compared the statistical performance of two methods (the least absolute shrinkage and selection operator—lasso and the elastic net) on two simulated data sets and one real data set from a 50 K genome-wide single nucleotide polymorphism (SNP) panel of 5570 Fleckvieh bulls. The first simulated data set displays moderate to high linkage disequilibrium between SNPs, whereas the second simulated data set from the QTLMAS 2010 workshop is biologically more complex. We used cross-validation to find the optimal value of regularization parameter λ with both minimum MSE and minimum MSE + 1SE of minimum MSE. The optimal λ values were used for variable selection. Based on the first simulated data, we found that the minMSE in general picked up too many SNPs. At minMSE + 1SE, the lasso didn't acquire any false positives, but selected too few correct SNPs. The elastic net provided the best compromise between few false positives and many correct selections when the penalty weight α was around 0.1. However, in our simulation setting, this α value didn't result in the lowest minMSE + 1SE. The number of selected SNPs from the QTLMAS 2010 data was after correction for population structure 82 and 161 for the lasso and the elastic net, respectively. In the Fleckvieh data set after population structure correction lasso and the elastic net identified from 1291 to 1966 important SNPs for milk fat content, with major peaks on chromosomes 5, 14, 15, and 20. Hence, we can conclude that it is important to analyze GWAS data with both the lasso and the elastic net and an alternative tuning criterion to minimum MSE is needed for variable selection.
doi:10.3389/fgene.2013.00270
PMCID: PMC3850240  PMID: 24363662
lasso; elastic net; simulation; GWAS; population structure; cattle
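The two tuning criteria compared in this abstract can be sketched as follows, assuming that "minimum MSE + 1SE" refers to the usual one-standard-error rule: take the largest λ whose cross-validated MSE is within one standard error of the minimum. The function name and the error values are hypothetical.

```python
def select_lambda(lambdas, cv_mean, cv_se):
    """Return (lambda at minimum CV error, lambda under the 1-SE rule).

    lambdas: candidate penalty values
    cv_mean: mean cross-validated MSE for each candidate
    cv_se:   standard error of that mean for each candidate
    """
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    # Largest penalty (sparsest model) whose error is still within one SE.
    lam_1se = max(lam for lam, m in zip(lambdas, cv_mean) if m <= threshold)
    return lambdas[i_min], lam_1se


# Hypothetical cross-validation summary, for illustration only.
lambdas = [0.01, 0.05, 0.1, 0.5, 1.0]
cv_mean = [1.30, 1.10, 1.00, 1.04, 1.40]
cv_se = [0.06, 0.05, 0.05, 0.05, 0.07]
lam_min, lam_1se = select_lambda(lambdas, cv_mean, cv_se)
print(lam_min, lam_1se)
```

Because the 1-SE choice is never smaller than the minimum-error choice, it penalizes more heavily and selects fewer SNPs, which is consistent with the trade-off the abstract reports.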
BMC Proceedings  2007;1(Suppl 1):S14.
In the present paper, we used the North American Rheumatoid Arthritis Consortium data provided for Genetic Analysis Workshop 15 Problem 2 to: 1) estimate the penetrances of PTPN22 and HLA-DRB1 and, 2) test the selected model of PTPN22 conditional on the rheumatoid factor status. To achieve these aims, we used the marker association segregation chi-square method, fitting simultaneously both genotype frequency and identical by descent distributions in a sample of 3690 White individuals from 604 nuclear families. A co-dominant model fitted the rs2476601 (R620W) single-nucleotide polymorphism (SNP) of the PTPN22 gene well, whereas a lack of fit for all models was observed for the HLA-DRB1 locus. Testing genetic models of rheumatoid arthritis that include the PTPN22 SNP in addition to the HLA-DRB1 locus did not affect the results, nor did subgroup analysis of PTPN22 conditional on the rheumatoid factor status. In conclusion, PTPN22 R620W SNP is a risk factor for rheumatoid arthritis. The genetic architecture of the HLA-DRB1 locus is highly complex, and more elaborate modeling of this locus is required.
PMCID: PMC2367526  PMID: 18466483
BMC Proceedings  2012;6(Suppl 2):S9.
Background
The least absolute shrinkage and selection operator (LASSO) can be used to predict SNP effects. This operator has the desirable feature of including in the model only a subset of explanatory SNPs, which can be useful both in QTL detection and GWS studies. LASSO solutions can be obtained by the least angle regression (LARS) algorithm. The big issue with this procedure is to define the best constraint (t), i.e. the upper bound of the sum of absolute value of the SNP effects which roughly corresponds to the number of SNPs to be selected. Usai et al. (2009) dealt with this problem by a cross-validation approach and defined t as the average number of selected SNPs over all replications. Nevertheless, in small size populations, such an estimator could give underestimated values of t. Here we propose two alternative ways to define t and compared them with the "classical" one.
Methods
The first (strategy 1), was based on 1,000 cross-validations carried out by randomly splitting the reference population (2,000 individuals with performance) into two halves. The value of t was the number of SNPs which occurred in more than 5% of replications. The second (strategy 2), which did not use cross-validations, was based on the minimization of the Cp-type selection criterion which depends on the number of selected SNPs and the expected residual variance.
Results
The size of the subset of selected SNPs was 46, 189 and 64 for the classical approach, strategy 1 and 2 respectively. Classical and strategy 2 gave similar results and indicated quite clearly the regions where QTL with additive effects were located. Strategy 1 confirmed such regions and added further positions which gave a less clear scenario. Correlations between GEBVs estimated with the three strategies and TBVs in progenies without phenotypes were 0.9237, 0.9000 and 0.9240 for classical, strategy 1 and 2 respectively.
Conclusions
This suggests that the Cp-type selection criterion is a valid alternative to the cross-validations to define the best constraint for selecting subsets of predicting SNPs by LASSO-LARS procedure.
doi:10.1186/1753-6561-6-S2-S9
PMCID: PMC3363163  PMID: 22640825
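The link between the LASSO constraint and the number of selected SNPs can be illustrated in the simplest special case. The sketch assumes orthonormal predictors, which real SNP data do not satisfy, so that the LASSO solution for the objective (1/2)||y - Xb||^2 + λ||b||_1 reduces to soft-thresholding the least-squares effects. It is not the LARS algorithm used in the paper, and all values are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_orthonormal(ols_effects, lam):
    """LASSO estimates when predictors are orthonormal (assumption)."""
    return [soft_threshold(z, lam) for z in ols_effects]


# Hypothetical least-squares SNP effects, for illustration only.
ols = [2.0, -1.5, 0.3, 0.1, -0.05]
beta = lasso_orthonormal(ols, lam=0.5)

n_selected = sum(1 for b in beta if b != 0.0)  # SNPs kept in the model
t_bound = sum(abs(b) for b in beta)            # sum of absolute effects
print(beta, n_selected, t_bound)
```

Raising the penalty lowers both the sum of absolute effects and the count of non-zero effects, which is why bounding one quantity indirectly controls the other.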
Hancock, Dana B. | Artigas, María Soler | Gharib, Sina A. | Henry, Amanda | Manichaikul, Ani | Ramasamy, Adaikalavan | Loth, Daan W. | Imboden, Medea | Koch, Beate | McArdle, Wendy L. | Smith, Albert V. | Smolonska, Joanna | Sood, Akshay | Tang, Wenbo | Wilk, Jemma B. | Zhai, Guangju | Zhao, Jing Hua | Aschard, Hugues | Burkart, Kristin M. | Curjuric, Ivan | Eijgelsheim, Mark | Elliott, Paul | Gu, Xiangjun | Harris, Tamara B. | Janson, Christer | Homuth, Georg | Hysi, Pirro G. | Liu, Jason Z. | Loehr, Laura R. | Lohman, Kurt | Loos, Ruth J. F. | Manning, Alisa K. | Marciante, Kristin D. | Obeidat, Ma'en | Postma, Dirkje S. | Aldrich, Melinda C. | Brusselle, Guy G. | Chen, Ting-hsu | Eiriksdottir, Gudny | Franceschini, Nora | Heinrich, Joachim | Rotter, Jerome I. | Wijmenga, Cisca | Williams, O. Dale | Bentley, Amy R. | Hofman, Albert | Laurie, Cathy C. | Lumley, Thomas | Morrison, Alanna C. | Joubert, Bonnie R. | Rivadeneira, Fernando | Couper, David J. | Kritchevsky, Stephen B. | Liu, Yongmei | Wjst, Matthias | Wain, Louise V. | Vonk, Judith M. | Uitterlinden, André G. | Rochat, Thierry | Rich, Stephen S. | Psaty, Bruce M. | O'Connor, George T. | North, Kari E. | Mirel, Daniel B. | Meibohm, Bernd | Launer, Lenore J. | Khaw, Kay-Tee | Hartikainen, Anna-Liisa | Hammond, Christopher J. | Gläser, Sven | Marchini, Jonathan | Kraft, Peter | Wareham, Nicholas J. | Völzke, Henry | Stricker, Bruno H. C. | Spector, Timothy D. | Probst-Hensch, Nicole M. | Jarvis, Deborah | Jarvelin, Marjo-Riitta | Heckbert, Susan R. | Gudnason, Vilmundur | Boezen, H. Marike | Barr, R. Graham | Cassano, Patricia A. | Strachan, David P. | Fornage, Myriam | Hall, Ian P. | Dupuis, Josée | Tobin, Martin D. | London, Stephanie J.
PLoS Genetics  2012;8(12):e1003098.
Genome-wide association studies have identified numerous genetic loci for spirometric measures of pulmonary function, forced expiratory volume in one second (FEV1), and its ratio to forced vital capacity (FEV1/FVC). Given that cigarette smoking adversely affects pulmonary function, we conducted genome-wide joint meta-analyses (JMA) of single nucleotide polymorphism (SNP) and SNP-by-smoking (ever-smoking or pack-years) associations on FEV1 and FEV1/FVC across 19 studies (total N = 50,047). We identified three novel loci not previously associated with pulmonary function. SNPs in or near DNER (smallest PJMA = 5.00×10−11), HLA-DQB1 and HLA-DQA2 (smallest PJMA = 4.35×10−9), and KCNJ2 and SOX9 (smallest PJMA = 1.28×10−8) were associated with FEV1/FVC or FEV1 in meta-analysis models including SNP main effects, smoking main effects, and SNP-by-smoking (ever-smoking or pack-years) interaction. The HLA region has been widely implicated for autoimmune and lung phenotypes, unlike the other novel loci, which have not been widely implicated. We evaluated DNER, KCNJ2, and SOX9 and found them to be expressed in human lung tissue. DNER and SOX9 further showed evidence of differential expression in human airway epithelium in smokers compared to non-smokers. Our findings demonstrated that joint testing of SNP and SNP-by-environment interaction identified novel loci associated with complex traits that are missed when considering only the genetic main effects.
Author Summary
Measures of pulmonary function provide important clinical tools for evaluating lung disease and its progression. Genome-wide association studies have identified numerous genetic risk factors for pulmonary function but have not considered interaction with cigarette smoking, which has consistently been shown to adversely impact pulmonary function. In over 50,000 study participants of European descent, we applied a recently developed joint meta-analysis method to simultaneously test associations of gene and gene-by-smoking interactions in relation to two major clinical measures of pulmonary function. Using this joint method to incorporate genetic main effects plus gene-by-smoking interaction, we identified three novel gene regions not previously related to pulmonary function: (1) DNER, (2) HLA-DQB1 and HLA-DQA2, and (3) KCNJ2 and SOX9. Expression analyses in human lung tissue from ours or prior studies indicate that these regions contain genes that are plausibly involved in pulmonary function. This work highlights the utility of employing novel methods for incorporating environmental interaction in genome-wide association studies to identify novel genetic regions.
doi:10.1371/journal.pgen.1003098
PMCID: PMC3527213  PMID: 23284291
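The idea of testing a SNP main effect and a SNP-by-smoking interaction jointly can be sketched as a 2-df Wald test on a pair of estimates and their covariance. This is a simplified illustration, not the joint meta-analysis procedure used in the study: it assumes the estimates have already been pooled, and the function name and numbers are hypothetical.

```python
import math


def joint_wald_2df(b_snp, b_int, var_snp, var_int, cov):
    """Joint Wald test of H0: SNP effect = 0 and interaction effect = 0.

    Computes W = b' V^{-1} b for the 2x2 covariance matrix V and the
    chi-square (2 df) upper-tail p-value, which equals exp(-W / 2).
    """
    det = var_snp * var_int - cov ** 2
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    w = (b_snp ** 2 * var_int
         - 2.0 * b_snp * b_int * cov
         + b_int ** 2 * var_snp) / det
    return w, math.exp(-w / 2.0)


# Hypothetical pooled estimates, for illustration only.
w, p_joint = joint_wald_2df(b_snp=0.10, b_int=0.05,
                            var_snp=0.0004, var_int=0.0009, cov=-0.0002)
print(f"W={w:.2f} p={p_joint:.2e}")
```

A joint test of this kind can reach significance when neither the main effect nor the interaction would do so alone, which matches the abstract's point about loci missed by main-effect-only analyses.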
