The genetic association of the major histocompatibility complex (MHC) to rheumatoid arthritis risk has commonly been attributed to HLA-DRB1 alleles. Yet controversy persists about the causal variants in HLA-DRB1 and the presence of independent effects elsewhere in the MHC. Using existing genome-wide SNP data in 5,018 seropositive cases and 14,974 controls, we imputed and tested classical alleles and amino acid polymorphisms for HLA-A, B, C, DPA1, DPB1, DQA1, DQB1, and DRB1 along with 3,117 SNPs across the MHC. Conditional and haplotype analyses reveal that three amino acid positions (11, 71 and 74) in HLA-DRβ1, and single amino acid polymorphisms in HLA-B (position 9) and HLA-DPβ1 (position 9), all located in the peptide-binding grooves, almost completely explain the MHC association to disease risk. This study illustrates how imputation of functional variation from large reference panels can help fine-map association signals in the MHC.
We investigated the prevalence of xenotropic murine leukemia virus-related virus (XMRV) in 293 participants seen at academic hospitals in Boston, Massachusetts. Participants were recruited from five groups of patients: chronic fatigue syndrome (CFS, n = 32), HIV infection (n = 43), rheumatoid arthritis (RA, n = 97), hematopoietic stem-cell or solid organ transplant (n = 26), or a general cohort of patients presenting for medical care (n = 95). XMRV DNA was not detected in any participant samples. We found no association between XMRV and patients with CFS or chronic immunomodulatory conditions.
XMRV; chronic fatigue syndrome; HIV infection; rheumatoid arthritis; hematopoietic stem-cell transplantation; solid organ transplantation
Cumulative genetic profiles can help identify individuals at high-risk for developing RA. We examined the impact of 39 validated genetic risk alleles on the risk of RA phenotypes characterized by serologic and erosive status.
We evaluated single nucleotide polymorphisms at 31 validated RA risk loci and 8 Human Leukocyte Antigen alleles among 542 Caucasian RA cases and 551 Caucasian controls from Nurses' Health Study and Nurses' Health Study II. We created a weighted genetic risk score (GRS) and evaluated it as 7 ordinal groups using logistic regression (adjusting for age and smoking) to assess the relationship between GRS group and odds of developing seronegative (RF− and CCP−), seropositive (RF+ or CCP+), erosive, and seropositive, erosive RA phenotypes. In separate case only analyses, we assessed the relationships between GRS and age of symptom onset.
In 542 RA cases, 317 (58%) were seropositive, 163 (30%) had erosions and 105 (19%) were seropositive with erosions. Comparing the highest GRS risk group to the median group, we found an OR of 1.2 (95% CI = 0.8–2.1) for seronegative RA, 3.0 (95% CI = 1.9–4.7) for seropositive RA, 3.2 (95% CI = 1.8–5.6) for erosive RA, and 7.6 (95% CI = 3.6–16.3) for seropositive, erosive RA. No significant relationship was seen between GRS and age of onset.
Results suggest that seronegative and seropositive/erosive RA have different genetic architecture and support the importance of considering RA phenotypes in RA genetic studies.
Electronic medical records (EMRs) are a rich data source for discovery research but are underutilized due to the difficulty of extracting highly accurate clinical data. We assessed whether a classification algorithm incorporating narrative EMR data (typed physician notes) classifies subjects with rheumatoid arthritis (RA) more accurately than an algorithm using codified EMR data alone.
Subjects with ≥1 ICD9 RA code (714.xx) or who had anti-CCP checked in the EMR of two large academic centers were included in an ‘RA Mart’ (n=29,432). For all 29,432 subjects, we extracted narrative (using natural language processing) and codified RA clinical information. In a training set of 96 RA and 404 non-RA cases from the RA Mart classified by medical record review, we used narrative and codified data to develop classification algorithms using logistic regression. These algorithms were applied to the entire RA Mart. We calculated and compared the positive predictive value (PPV) of these algorithms by reviewing records of an additional 400 subjects classified as RA by the algorithms.
A complete algorithm (narrative and codified data) classified RA subjects with a significantly higher PPV (94%) than an algorithm with codified data alone (PPV 88%). Characteristics of the RA cohort identified by the complete algorithm were comparable to existing RA cohorts (80% female, 63% anti-CCP+, 59% erosion+).
We demonstrate the ability to utilize complete EMR data to define an RA cohort with a PPV of 94%, which was superior to an algorithm using codified data alone.
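The PPV comparison above can be illustrated with a minimal sketch. The review counts below are hypothetical, chosen only so that the ratios reproduce the reported 94% and 88%; the abstract does not give the per-algorithm counts.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """PPV = confirmed cases / all subjects the algorithm labeled as cases."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no subjects were classified as positive")
    return true_positives / total


# Hypothetical chart-review outcomes (not taken from the study).
ppv_complete = positive_predictive_value(true_positives=376, false_positives=24)
ppv_codified = positive_predictive_value(true_positives=352, false_positives=48)

print(f"complete algorithm PPV: {ppv_complete:.2f}")
print(f"codified-only PPV:      {ppv_codified:.2f}")
```

PPV depends on how common true RA is among the subjects an algorithm flags, so the same algorithm may yield a different PPV in a different source population.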
Recent discoveries of risk alleles have made it possible to define genetic risk profiles for patients with rheumatoid arthritis (RA). We examined whether a cumulative score based on 22 validated genetic risk alleles for seropositive RA would identify high-risk, asymptomatic individuals who might benefit from preventive interventions.
We genotyped 14 single nucleotide polymorphisms (SNPs) at 13 validated RA risk loci and 8 HLA alleles among (1) 289 Caucasian seropositive cases and 481 controls from the US Nurses' Health Studies (NHS), and (2) 629 Caucasian CCP antibody positive cases and 623 controls from the Swedish Epidemiologic Investigation of RA (EIRA). We created a weighted genetic risk score (GRS), where the weight for each risk allele is the log of the published odds ratio. We used logistic regression to study associations with incident RA. We compared AUCs from a clinical-only model and clinical + genetic model in each cohort.
Patients with GRS > 1.25 standard deviations of the mean had a significantly higher OR of seropositive RA in both NHS (OR=2.9, 95%CI 1.8–4.6) and EIRA (OR=3.4, 95% CI 2.3–5.0) referent to the population average. In NHS, the AUC for a clinical model was 0.57 and for a clinical + genetic model was 0.66, and in EIRA was 0.63 and 0.75, respectively.
The combination of 22 risk alleles into a weighted genetic risk score significantly stratifies individuals for RA risk beyond clinical risk factors alone. However, given the low incidence of RA, the clinical utility of a weighted genetic risk score is limited in the general population.
rheumatoid arthritis; polymorphism; autoantibodies; anti-CCP; smoking
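The weighted genetic risk score described above (each risk allele weighted by the log of its published odds ratio) can be sketched as follows. The variant names, odds ratios, and allele counts are hypothetical placeholders, and the use of the natural logarithm is an assumption, since the abstract says only "the log".

```python
import math


def weighted_grs(risk_allele_counts: dict, odds_ratios: dict) -> float:
    """Sum over variants of (number of risk alleles carried) * ln(published OR)."""
    missing = set(risk_allele_counts) - set(odds_ratios)
    if missing:
        raise KeyError(f"no odds ratio for: {sorted(missing)}")
    return sum(
        count * math.log(odds_ratios[variant])
        for variant, count in risk_allele_counts.items()
    )


# Hypothetical published odds ratios and one individual's risk-allele counts (0, 1, or 2).
example_odds_ratios = {"variant_a": 1.5, "variant_b": 1.2, "variant_c": 2.0}
example_counts = {"variant_a": 2, "variant_b": 0, "variant_c": 1}

score = weighted_grs(example_counts, example_odds_ratios)
print(f"weighted GRS: {score:.4f}")
```

Weighting by the log odds ratio means that alleles with larger published effects contribute more to the score than alleles with modest effects.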
HLA-DRB1 shared epitope (HLA-SE), PTPN22 and CTLA4 alleles are associated with CCP+ RA.
We examined associations between HLA-SE, PTPN22, CTLA4 genotypes and RA phenotypes in a large cohort to (a) replicate prior associations with CCP status, and (b) determine associations with radiographic erosions and age of diagnosis.
A total of 689 RA patients from the Brigham RA Sequential Study (BRASS) were genotyped for HLA-SE, PTPN22 (rs2476601) and CTLA4 (rs3087243). Associations between genotypes and CCP, RF, and erosive phenotypes and age at diagnosis were assessed with multivariable models adjusting for age, sex, and disease duration. Novel causal pathway analysis was used to test the hypothesis that genetic risk factors and CCP are in the causal pathway for predicting erosions.
In multivariable analysis, presence of any HLA-SE was strongly associated with CCP+ (OR 3.05 (2.18–4.25)) and RF+ (OR 2.53 (1.83–3.5)) phenotypes; presence of any PTPN22 T allele was associated with CCP+ (OR 1.81 (1.24–2.66)) and RF+ (OR 1.84 (1.27–2.66)) phenotypes. CTLA4 was not associated with CCP or RF phenotypes. While HLA-SE was associated with erosive RA phenotype (OR 1.52 (1.01–2.17)), this was no longer significant after conditioning on CCP. PTPN22 and CTLA4 were not associated with erosive phenotype. Presence of any HLA-SE was associated with on average 3.6 years earlier diagnosis compared with absence of HLA-SE (41.3 vs. 44.9 years, p=0.003) and PTPN22 was associated with 4.2 years earlier age of diagnosis (39.5 vs. 43.6 years, p=0.002). CTLA4 genotypes were not associated with age at diagnosis of RA.
In this large clinical cohort, we replicated the association of HLA-SE and PTPN22, but not CTLA4, with CCP+ and RF+ phenotypes. We also found evidence for associations of HLA-SE and PTPN22 with earlier age at diagnosis. Because HLA-SE is associated with erosive phenotype in unconditional analysis but is no longer significant after conditioning on CCP, CCP appears to be in the causal pathway for predicting erosive phenotype.
rheumatoid arthritis; age at diagnosis; PTPN22; HLA; CCP
To identify additional variants in the major histocompatibility complex (MHC) region that independently contribute to risk in 2 disease subsets of rheumatoid arthritis (RA) defined according to the presence or absence of antibodies to citrullinated protein antigens (ACPAs).
In a multistep analytical strategy using unmatched as well as matched analyses to adjust for HLA–DRB1 genotype, we analyzed 2,221 single-nucleotide polymorphisms (SNPs) spanning 10.7 Mb, from 6p22.2 to 6p21.31, across the MHC. For ACPA-positive RA, we analyzed samples from the Swedish Epidemiological Investigation of Rheumatoid Arthritis (EIRA) and the North American Rheumatoid Arthritis Consortium (NARAC) studies (totaling 1,255 cases and 1,719 controls). For ACPA-negative RA, we used samples from the EIRA study (640 cases and 670 controls). Plink and SAS statistical packages were used to conduct all statistical analyses.
A total of 299 SNPs reached locus-wide significance (P < 2.3 × 10⁻⁵) for ACPA-positive RA, whereas, surprisingly, no SNPs reached this significance for ACPA-negative RA. For ACPA-positive RA, we adjusted for known DRB1 risk alleles and identified additional independent associations with SNPs near HLA–DPB1 (rs3117213; odds ratio 1.42 [95% confidence interval 1.17–1.73], Pcombined = 0.0003 for the strongest association).
There are distinct genetic patterns of MHC associations in the 2 disease subsets of RA defined according to ACPA status. HLA–DPB1 is an independent risk locus for ACPA-positive RA. We did not identify any associations with SNPs within the MHC for ACPA-negative RA.
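The abstract does not say how the locus-wide significance threshold was derived. As a worked check, the reported value is numerically consistent with a Bonferroni correction of a 0.05 significance level across the 2,221 SNPs analyzed; both the 0.05 level and the choice of Bonferroni correction are assumptions here.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Assumed family-wise alpha of 0.05; 2,221 is the SNP count given in the abstract.
threshold = bonferroni_threshold(alpha=0.05, n_tests=2221)
print(f"{threshold:.1e}")
```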
Next-generation DNA sequencing reveals rare alleles protective from type 1 diabetes.
A large population study using ultra-high-throughput DNA sequencing to re-sequence a genetic locus associated with type 1 diabetes reveals rare protective alleles.
Genetic case-control association studies often include data on clinical covariates, such as body mass index (BMI), smoking status, or age, that may modify the underlying genetic risk of case or control samples. For example, in type 2 diabetes, odds ratios for established variants estimated from low-BMI cases are larger than those estimated from high-BMI cases. An unanswered question is how to use this information to maximize statistical power in case-control studies that ascertain individuals on the basis of phenotype (case-control ascertainment) or phenotype and clinical covariates (case-control-covariate ascertainment). While current approaches improve power in studies with random ascertainment, they often lose power under case-control ascertainment and fail to capture available power increases under case-control-covariate ascertainment. We show that an informed conditioning approach, based on the liability threshold model with parameters informed by external epidemiological information, fully accounts for disease prevalence and non-random ascertainment of phenotype as well as covariates and provides a substantial increase in power while maintaining a properly controlled false-positive rate. Our method outperforms standard case-control association tests with or without covariates, tests of gene x covariate interaction, and previously proposed tests for dealing with covariates in ascertained data, with especially large improvements in the case of case-control-covariate ascertainment. We investigate empirical case-control studies of type 2 diabetes, prostate cancer, lung cancer, breast cancer, rheumatoid arthritis, age-related macular degeneration, and end-stage kidney disease over a total of 89,726 samples. In these datasets, informed conditioning outperforms logistic regression for 115 of the 157 known associated variants investigated (P-value = 1 × 10⁻⁹). The improvement varied across diseases with a 16% median increase in χ² test statistics and a commensurate increase in power. This suggests that applying our method to existing and future association studies of these diseases may identify novel disease loci.
This work describes a new methodology for analyzing genome-wide case-control association studies of diseases with strong correlations to clinical covariates, such as age in prostate cancer and body mass index in type 2 diabetes. Currently, researchers either ignore these clinical covariates or apply approaches that ignore the disease's prevalence and the study's ascertainment strategy. We take an alternative approach, leveraging external prevalence information from the epidemiological literature and constructing a statistic based on the classic liability threshold model of disease. Our approach not only improves the power of studies that ascertain individuals randomly or based on the disease phenotype, but also improves the power of studies that ascertain individuals based on both the disease phenotype and clinical covariates. We apply our statistic to seven datasets over six different diseases and a variety of clinical covariates. We found that there was a substantial improvement in test statistics relative to current approaches at known associated variants. This suggests that novel loci may be identified by applying our method to existing and future association studies of these diseases.
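The liability threshold model that the approach builds on can be sketched briefly. This is not the informed conditioning statistic itself; it only shows how a population prevalence maps to a threshold on a standard normal liability scale, and how a covariate that shifts mean liability changes the implied risk. The prevalence and shift values are hypothetical, and unit residual variance is assumed.

```python
from statistics import NormalDist


def liability_threshold(prevalence: float) -> float:
    """Threshold t on a standard normal liability scale with P(liability > t) = prevalence."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - prevalence)


def risk_given_shift(threshold: float, liability_shift: float) -> float:
    """Probability of exceeding the threshold when a covariate shifts mean liability.

    Assumes the residual liability is normal with unit variance.
    """
    return 1.0 - NormalDist(mu=liability_shift).cdf(threshold)


# Hypothetical values for illustration only.
t = liability_threshold(prevalence=0.08)
baseline_risk = risk_given_shift(t, liability_shift=0.0)
shifted_risk = risk_given_shift(t, liability_shift=0.5)

print(f"threshold: {t:.3f}")
print(f"risk at baseline: {baseline_risk:.3f}, risk with shifted liability: {shifted_risk:.3f}")
```

Under this model, a case with a low-risk covariate profile must carry more of its liability elsewhere, which is one intuition for why conditioning on covariates and prevalence could sharpen a genetic test.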
The co-occurrence of autoimmune diseases such as rheumatoid arthritis (RA) and type 1 diabetes (T1D) has been reported in individuals and families. We studied the strength and nature of this association at the population level.
We conducted a case-control study of 1419 incident RA cases and 1674 controls between 1996 and 2003. Subjects were recruited from university, public and private rheumatology units throughout Sweden. Blood samples were tested for the presence of antibodies to cyclic citrullinated peptide (anti-CCP), rheumatoid factor (RF) and the presence or absence of the 620W PTPN22 allele. Information on history of diabetes was obtained by questionnaire, telephone interview, and medical record review. The prevalence of T1D and type 2 diabetes (T2D) was compared between incident RA cases and controls and further stratified by anti-CCP, RF status, and the presence of the PTPN22 risk allele.
T1D was associated with an increased risk of RA, OR 4.9 (95% CI 1.8–13.1), and was specific for anti-CCP+ RA, OR 7.3 (95% CI 2.7–20.0), but not anti-CCP negative RA. Further adjustment for PTPN22 attenuated the odds ratio for anti-CCP+ RA in individuals with T1D to 5.3 (95% CI 1.5–18.7). No association was observed between RA and T2D.
The association between T1D and RA is specific for a particular RA subset, anti-CCP+ RA. The risk of type 1 diabetics developing RA later in life may be attributed in part to the presence of the 620W PTPN22 allele, suggesting a common pathway for the pathogenesis of these two diseases.
The single nucleotide polymorphism (SNP) rs11761231 on chromosome 7q has been reported as a sexually dimorphic marker for rheumatoid arthritis susceptibility in a British population. We sought to replicate this finding and better characterize susceptibility alleles in the region in a North American population.
DNA from two North American collections of RA patients and controls (1605 cases and 2640 controls) was genotyped for rs11761231 and 16 additional chromosome 7q tag SNPs using Sequenom iPlex assays. Association tests were performed for each collection and also separately contrasting male cases versus male controls and female cases versus female controls. Principal components analysis (EIGENSTRAT) was used to determine association with RA before and after adjusting for population stratification in the subset of the samples (772 cases and 1213 controls) with whole genome SNP data.
We failed to replicate association of the 7q region with rheumatoid arthritis. Initially, rs11761231 showed evidence for association with RA in the NARAC collection (p=0.0076) and rs11765576 showed association with RA in both the NARAC (p = 0.019) and RA replication (p = 0.0013) collections. These markers also exhibited sexual differentiation. However, in the whole genome subset, neither SNP showed significant association with RA after correction for population stratification.
While two SNPs on chromosome 7q appeared to be associated with RA in a North American cohort, the significance of this finding did not withstand correction for population substructure. Our results emphasize the need to carefully account for population structure to avoid false positive disease associations.
For Genetic Analysis Workshop 16 Problem 1, we provided data for genome-wide association analysis of rheumatoid arthritis. Single-nucleotide polymorphism (SNP) genotype data were provided for 868 cases and 1194 controls that had been assayed using an Illumina 550 k platform. In addition, phenotypic data were provided from genotyping DRB1 alleles, which were classified according to the rheumatoid arthritis shared epitope, levels of anti-cyclic citrullinated peptide, and levels of rheumatoid factor IgM. Several questions could be addressed using the data, including analysis of genetic associations using single SNPs or haplotypes, as well as gene-gene and genetic analysis of SNPs for qualitative and quantitative factors.
SLE is an autoimmune disease influenced by genetic and environmental components. We performed a genome-wide association scan (GWAS) and observed novel association evidence with a variant in TNFAIP3 (rs5029939, P = 2.89 × 10⁻¹², OR = 2.29). We also found evidence of two independent signals of association to SLE risk, including one described in rheumatoid arthritis. These results establish that genetic variation in TNFAIP3 contributes to differential risk for SLE and RA.
Translating a set of disease regions into insight about pathogenic mechanisms requires not only the ability to identify the key disease genes within them, but also the biological relationships among those key genes. Here we describe a statistical method, Gene Relationships Among Implicated Loci (GRAIL), that takes a list of disease regions and automatically assesses the degree of relatedness of implicated genes using 250,000 PubMed abstracts. We first evaluated GRAIL by assessing its ability to identify subsets of highly related genes in common pathways from validated lipid and height SNP associations from recent genome-wide studies. We then tested GRAIL, by assessing its ability to separate true disease regions from many false positive disease regions in two separate practical applications in human genetics. First, we took 74 nominally associated Crohn's disease SNPs and applied GRAIL to identify a subset of 13 SNPs with highly related genes. Of these, ten convincingly validated in follow-up genotyping; genotyping results for the remaining three were inconclusive. Next, we applied GRAIL to 165 rare deletion events seen in schizophrenia cases (less than one-third of which are contributing to disease risk). We demonstrate that GRAIL is able to identify a subset of 16 deletions containing highly related genes; many of these genes are expressed in the central nervous system and play a role in neuronal synapses. GRAIL offers a statistically robust approach to identifying functionally related genes from across multiple disease regions—that likely represent key disease pathways. An online version of this method is available for public use (http://www.broad.mit.edu/mpg/grail/).
Modern genetic studies, including genome-wide surveys for disease-associated loci and copy number variation, provide a list of critical genomic regions that play an important role in predisposition to disease. Using these regions to understand disease pathogenesis requires the ability to first distinguish causal genes from other nearby genes spuriously contained within these regions. To do this we must identify the key pathways suggested by those causal genes. In this manuscript we describe a statistical approach, Gene Relationships Across Implicated Loci (GRAIL), to achieve this task. It starts with genomic regions and identifies related subsets of genes involved in similar biological processes—these genes highlight the likely causal genes and the key pathways. GRAIL uses abstracts from the entirety of the published scientific literature about the genes to look for potential relationships between genes. We apply GRAIL to four very different phenotypes. In each case we identify a subset of highly related genes; in cases where false positive regions are present, GRAIL is able to separate out likely true positives. GRAIL therefore offers the potential to translate disease genomic regions from unbiased genomic surveys into the key processes that may be critical to the disease.
We carried out a genome-wide association study of genetic predictors of anti-cyclic citrullinated peptide antibody (anti-CCP) level in 531 self-reported non-Hispanic Caucasian rheumatoid arthritis (RA) patients enrolled in the Brigham Rheumatoid Arthritis Sequential Study (BRASS). For replication, we then analyzed 289 single nucleotide polymorphisms (SNPs) with P < 0.001 in BRASS in an independent population of 849 RA patients from the North American Rheumatoid Arthritis Consortium (NARAC). BRASS and NARAC samples were genotyped using the Affymetrix 100K and Illumina 550K platforms respectively. Association between SNPs and anti-CCP titer was tested using general linear models. The five most significant SNPs from BRASS all were within the major histocompatibility complex (MHC) region (P ≤ 3.5 × 10⁻⁶). After controlling for the human leukocyte antigen shared epitope (HLA-SE), the top SNPs still yielded P values < 0.0002. In NARAC, a single SNP from the MHC region near BTNL2 and HLA-DRA, rs1980493 (r² = 0.85 with the top five SNPs from BRASS), was associated significantly with CCP titer (P = 6.1 × 10⁻⁵) even after adjustment for the HLA-SE (P = 0.0002). The top SNPs found in BRASS and NARAC had r² = 0.46 and 0.64, respectively, to HLA-DRB1 DR3 alleles. These results confirm that the most significant genome region affecting anti-CCP titers in RA is the MHC region. We identified a SNP in moderate linkage disequilibrium (LD) with HLA-DR3, which may influence anti-CCP titer independently of the HLA-SE.
To identify susceptibility alleles associated with rheumatoid arthritis, we genotyped 397 individuals with rheumatoid arthritis for 116,204 SNPs and carried out an association analysis in comparison to publicly available genotype data for 1,211 related individuals from the Framingham Heart Study¹. After evaluating and adjusting for technical and population biases, we identified a SNP at 6q23 (rs10499194, ∼150 kb from TNFAIP3 and OLIG3) that was reproducibly associated with rheumatoid arthritis both in the genome-wide association (GWA) scan and in 5,541 additional case-control samples (P = 10⁻³, GWA scan; P < 10⁻⁶, replication; P = 10⁻⁹, combined). In a concurrent study, the Wellcome Trust Case Control Consortium (WTCCC) has reported strong association of rheumatoid arthritis susceptibility to a different SNP located 3.8 kb from rs10499194 (rs6920220; P = 5 × 10⁻⁶ in WTCCC)². We show that these two SNP associations are statistically independent, are each reproducible in the comparison of our data and WTCCC data, and define risk and protective haplotypes for rheumatoid arthritis at 6q23.
Rheumatoid arthritis has a complex mode of inheritance. Although HLA-DRB1 and PTPN22 are well-established susceptibility loci, other genes that confer a modest level of risk have been identified recently. We carried out a genomewide association analysis to identify additional genetic loci associated with an increased risk of rheumatoid arthritis.
We genotyped 317,503 single-nucleotide polymorphisms (SNPs) in a combined case-control study of 1522 case subjects with rheumatoid arthritis and 1850 matched control subjects. The patients were seropositive for autoantibodies against cyclic citrullinated peptide (CCP). We obtained samples from two data sets, the North American Rheumatoid Arthritis Consortium (NARAC) and the Swedish Epidemiological Investigation of Rheumatoid Arthritis (EIRA). Results from NARAC and EIRA for 297,086 SNPs that passed quality-control filters were combined with the use of Cochran-Mantel-Haenszel stratified analysis. SNPs showing a significant association with disease (P < 1 × 10⁻⁸) were genotyped in an independent set of case subjects with anti-CCP-positive rheumatoid arthritis (485 from NARAC and 512 from EIRA) and in control subjects (1282 from NARAC and 495 from EIRA).
We observed associations between disease and variants in the major-histocompatibility-complex locus, in PTPN22, and in a SNP (rs3761847) on chromosome 9 for all samples tested, the latter with an odds ratio of 1.32 (95% confidence interval, 1.23 to 1.42; P = 4 × 10⁻¹⁴). The SNP is in linkage disequilibrium with two genes relevant to chronic inflammation: TRAF1 (encoding tumor necrosis factor receptor-associated factor 1) and C5 (encoding complement component 5).
A common genetic variant at the TRAF1-C5 locus on chromosome 9 is associated with an increased risk of anti-CCP-positive rheumatoid arthritis.
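The Cochran-Mantel-Haenszel stratified analysis used to combine the two data sets can be sketched for a single SNP as below. Each stratum is a 2×2 table of counts; the two tables here are hypothetical and are not study data, and the statistic is computed without a continuity correction, which is an assumption about the variant of the test.

```python
def cmh_test(strata):
    """Mantel-Haenszel common odds ratio and CMH chi-square (1 df, no continuity correction).

    Each stratum is (a, b, c, d):
      a = cases with the risk allele,    b = cases without it,
      c = controls with the risk allele, d = controls without it.
    """
    or_numerator = or_denominator = 0.0
    deviation_sum = variance_sum = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        or_numerator += a * d / n
        or_denominator += b * c / n
        expected_a = (a + b) * (a + c) / n
        deviation_sum += a - expected_a
        variance_sum += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    common_odds_ratio = or_numerator / or_denominator
    chi_square = deviation_sum ** 2 / variance_sum
    return common_odds_ratio, chi_square


# Two hypothetical strata standing in for two collections.
example_strata = [(20, 80, 10, 90), (30, 70, 20, 80)]
common_or, chi2 = cmh_test(example_strata)
print(f"common OR: {common_or:.3f}, CMH chi-square: {chi2:.3f}")
```

Stratifying by collection keeps differences in allele frequency between the two source populations from being mistaken for a case-control difference.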
Rheumatoid arthritis is a chronic inflammatory disease with a substantial genetic component. Susceptibility to disease has been linked with a region on chromosome 2q.
We tested single-nucleotide polymorphisms (SNPs) in and around 13 candidate genes within the previously linked chromosome 2q region for association with rheumatoid arthritis. We then performed fine mapping of the STAT1-STAT4 region in a total of 1620 case patients with established rheumatoid arthritis and 2635 controls, all from North America. Implicated SNPs were further tested in an independent case-control series of 1529 patients with early rheumatoid arthritis and 881 controls, all from Sweden, and in a total of 1039 case patients and 1248 controls from three series of patients with systemic lupus erythematosus.
A SNP haplotype in the third intron of STAT4 was associated with susceptibility to both rheumatoid arthritis and systemic lupus erythematosus. The minor alleles of the haplotype-defining SNPs were present in 27% of chromosomes of patients with established rheumatoid arthritis, as compared with 22% of those of controls (for the SNP rs7574865, P = 2.81 × 10⁻⁷; odds ratio for having the risk allele in chromosomes of patients vs. those of controls, 1.32). The association was replicated in Swedish patients with recent-onset rheumatoid arthritis (P = 0.02) and matched controls. The haplotype marked by rs7574865 was strongly associated with lupus, being present on 31% of chromosomes of case patients and 22% of those of controls (P = 1.87 × 10⁻⁹; odds ratio for having the risk allele in chromosomes of patients vs. those of controls, 1.55). Homozygosity of the risk allele, as compared with absence of the allele, was associated with a more than doubled risk for lupus and a 60% increased risk for rheumatoid arthritis.
A haplotype of STAT4 is associated with increased risk for both rheumatoid arthritis and systemic lupus erythematosus, suggesting a shared pathway for these illnesses.
Lymphoblastoid cell lines (LCLs), originally collected as renewable sources of DNA, are now being used as a model system to study genotype–phenotype relationships in human cells, including searches for QTLs influencing levels of individual mRNAs and responses to drugs and radiation. In the course of attempting to map genes for drug response using 269 LCLs from the International HapMap Project, we evaluated the extent to which biological noise and non-genetic confounders contribute to trait variability in LCLs. While drug responses could be technically well measured on a given day, we observed significant day-to-day variability and substantial correlation to non-genetic confounders, such as baseline growth rates and metabolic state in culture. After correcting for these confounders, we were unable to detect any QTLs with genome-wide significance for drug response. A much higher proportion of variance in mRNA levels may be attributed to non-genetic factors (intra-individual variance—i.e., biological noise, levels of the EBV virus used to transform the cells, ATP levels) than to detectable eQTLs. Finally, in an attempt to improve power, we focused analysis on those genes that had both detectable eQTLs and correlation to drug response; we were unable to detect evidence that eQTL SNPs are convincingly associated with drug response in the model. While LCLs are a promising model for pharmacogenetic experiments, biological noise and in vitro artifacts may reduce power and have the potential to create spurious association due to confounding.
The use of lymphoblastoid cell lines (LCLs) has evolved from a renewable source of DNA to an in vitro model system to study the genetics of gene expression, drug response, and other traits in a controlled laboratory setting. While convincing relationships between SNPs and mRNA levels (eQTLs) have been described, the degree to which non-genetic variables also influence phenotypes in LCLs is less well characterized. In the course of attempting to map genes for drug responses in vitro, we evaluated the reproducibility of in vitro traits across replicates, the impact of the EBV virus used to transform B cells into cell lines, and the effect of in vitro culture conditions. We found that responses to at least some drugs and levels of many mRNAs can be technically well measured, but vary both across experiments and with non-genetic confounders such as growth rates, EBV levels, and ATP levels. The influence of such non-genetic factors can both decrease power to detect true relationships between DNA variation and traits and create the potential for non-genetic confounding and spurious associations between DNA variants and traits.
It has been suggested that polymorphisms in IL1 are correlated with severe and/or erosive rheumatoid arthritis (RA), but the implicated alleles have differed among studies. The aim of this study was to perform a broad and well-powered search for association between allelic polymorphism in IL1A and IL1B and the susceptibility to or severity of RA.
Key coding and regulatory regions in IL1A and IL1B were sequenced in 24 patients with RA, revealing 4 novel single-nucleotide polymorphisms (SNPs) in IL1B. These and a comprehensive set of 24 SNPs tagging most of the underlying genetic diversity were genotyped in 3 independent RA case-control sample sets and 1 longitudinal RA cohort, totaling 3,561 patients and 3,062 control subjects.
No fully significant associations were observed. Analysis of the discovery case-control sample sets indicated a potential association of IL1B promoter region SNPs with susceptibility to RA (for RA3/A, odds ratio [OR] 1.27, P = 0.0021) or with the incidence of radiographic erosions (for RA4/C, OR 1.56, P = 0.036), but these findings were not replicated in independent case-control samples. No association with rheumatoid factor, anti-cyclic citrullinated peptide, or the Disease Activity Score in 28 joints was found. None of the associations previously observed in other studies were replicated here.
In spite of a broad and highly powered study, we observed no robust, reproducible association between IL1A/B variants and the susceptibility to or severity of RA in white individuals of European descent. Our results provide evidence that, in the majority of cases, polymorphism in IL1A and IL1B is not a major contributor to genetic susceptibility to RA.
The prediction of response (or non-response) to anti-TNF treatment for rheumatoid arthritis (RA) is a pressing clinical problem. We conducted a genome-wide association study using the Illumina HapMap300 SNP chip on 89 RA patients prospectively followed after beginning anti-TNF therapy as part of the Autoimmune Biomarkers Collaborative Network (ABCoN) patient cohort. Response to therapy was determined by the change in Disease Activity Score (DAS28) observed after 14 weeks. We used a two-part analysis that treated the change in DAS28 as a continuous trait and then incorporated it into a dichotomous trait of “good responder” and “nonresponder” by European League Against Rheumatism (EULAR) criteria. We corrected for multiple tests by permutation, and adjusted for potential population stratification using EIGENSTRAT. Multiple single nucleotide polymorphism (SNP) markers showed significant associations near or within loci including: the v-maf musculoaponeurotic fibrosarcoma oncogene homolog B (MAFB) gene on chromosome 20; the type I interferon gene IFNk on chromosome 9; and in a locus on chromosome 7 that includes the paraoxonase I (PON1) gene. An SNP in the IL10 promoter (rs1800896) that was previously reported as associated with anti-TNF response was weakly associated with response in this cohort. Replication of these results in independent and larger data sets is clearly required. We provide a reference list of candidate SNPs (P < 0.01) that can be investigated in future pharmacogenomic studies.
Recent evidence suggests that additional risk loci for RA are present in the major histocompatibility complex (MHC), independent of the class II HLA-DRB1 locus. We have now tested a total of 1,769 SNPs across 7.5 Mb of the MHC located from 6p22.2 (26.03 Mb) to 6p21.32 (33.59 Mb) derived from the Illumina 550K Beadchip (Illumina, San Diego, CA, USA). For an initial analysis in the whole dataset (869 RA CCP+ cases, 1,193 controls), the strongest association signal was observed in markers near the HLA-DRB1 locus, with additional evidence for association extending out into the Class I HLA region. To avoid confounding that may arise due to linkage disequilibrium with DRB1 alleles, we analyzed a subset of the data by matching cases and controls by DRB1 genotype (both alleles matched 1:1), yielding a set of 372 cases with 372 controls. This analysis revealed the presence of at least two regions of association with RA in the Class I region, independent of DRB1 genotype. SNP alleles found on the conserved A1-B8-DR3 (8.1) haplotype show the strongest evidence of positive association (P ~ 0.00005) clustered in the region around the HLA-C locus. In addition, we identified risk alleles that are not present on the 8.1 haplotype, with maximal association signals (P ~ 0.001–0.0027) located near the ZNF311 locus. This latter association is enriched in DRB1*0404 individuals. Finally, several additional association signals were found in the extreme centromeric portion of the MHC, in regions containing the DOB1, TAP2, DPB1, and COL11A2 genes. These data emphasize that further analysis of the MHC is likely to reveal genetic risk factors for rheumatoid arthritis that are independent of the DRB1 shared epitope alleles.
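The 1:1 matching of cases and controls on DRB1 genotype can be sketched as follows. The abstract does not describe the matching procedure beyond "both alleles matched 1:1", so the greedy first-available pairing, the sample identifiers, and the allele labels below are all assumptions for illustration.

```python
from collections import defaultdict


def match_by_genotype(cases, controls):
    """Pair each case with one unused control carrying the same two alleles.

    cases, controls: lists of (sample_id, (allele_1, allele_2)).
    Allele order is ignored; unmatched samples are dropped.
    """
    controls_by_genotype = defaultdict(list)
    for sample_id, alleles in controls:
        controls_by_genotype[tuple(sorted(alleles))].append(sample_id)

    pairs = []
    for sample_id, alleles in cases:
        available = controls_by_genotype.get(tuple(sorted(alleles)), [])
        if available:
            # Greedy choice: take the first control that has not been used yet.
            pairs.append((sample_id, available.pop(0)))
    return pairs


# Hypothetical samples and allele labels.
example_cases = [
    ("case1", ("0401", "0101")),
    ("case2", ("0404", "0404")),
    ("case3", ("0301", "1501")),
]
example_controls = [
    ("ctrl1", ("0101", "0401")),
    ("ctrl2", ("0301", "0301")),
]

matched_pairs = match_by_genotype(example_cases, example_controls)
print(matched_pairs)
```

Because each matched pair shares the same DRB1 genotype, any remaining case-control difference at a nearby SNP cannot be explained by the DRB1 alleles themselves, which is the rationale the abstract gives for the matched subset.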