We found that a weighted genetic risk score was associated with development of seropositive RA, erosive RA and seropositive, erosive RA phenotypes. Although the continuous GRS39 measure showed a significant linear trend in predicting seronegative RA, there was no significant relationship when the score was divided into groups, with the exception of group 7 compared to group 1. In contrast, we found a strong and significant association between both continuous and grouped GRS39 and the erosive and/or seropositive phenotypes. Subjects with the highest GRS (group 7) had a 3.2 times increased odds of erosive RA as compared to the median group. This odds ratio increased to 7.6 when limiting the phenotype to those with seropositive, erosive RA. We observed similar results when comparing extreme GRS groups (group 7 vs. group 1), where we found a 5 times increased odds of erosive RA and a 14 times increased odds of seropositive, erosive RA. This suggests that the GRS has a stronger association with the more severe phenotype; however, narrowing the phenotype definition resulted in a wider confidence interval. Thus, although we detected a larger effect size (i.e. a larger OR), the estimate was also less precise, most likely due to the small sample size in this group.
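The trade-off between a larger OR and a wider confidence interval can be illustrated with a simple unadjusted calculation. The sketch below is not the regression model used in this analysis; it uses a Wald interval on a 2x2 table, and all counts are hypothetical values chosen only so that both tables give the same OR.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald 95% confidence interval.

    a = high-GRS cases,      b = high-GRS controls,
    c = reference cases,     d = reference controls.
    All counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) depends only on the cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts (not study data): same OR, five-fold fewer subjects.
broad_phenotype = odds_ratio_ci(60, 40, 30, 60)
narrow_phenotype = odds_ratio_ci(12, 8, 6, 12)

for label, (or_, lo, hi) in [("broad", broad_phenotype),
                             ("narrow", narrow_phenotype)]:
    print(f"{label}: OR={or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

With the point estimate held fixed, the interval widens only because the cell counts shrink, which is the pattern described above for the narrower phenotype definition.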
One interesting result is the association between the GRS with 39 risk alleles and seropositive RA. We found that group 7 had a 3.0 times increased odds of seropositive RA as compared to group 4. This is similar to the 2.9 times increased odds found by Karlson et al. with the GRS based on 22 risk alleles. In addition, we observed similar ORs in the ordinal model when comparing group 7 to group 1, where the OR was 6.3 (from Karlson et al., with 22 risk alleles) and 5.7 in our analysis that included 17 additional risk alleles. The combination of risk alleles also displayed a good ability to discriminate between an RA case and a control when the case is defined as seropositive RA, erosive RA or seropositive, erosive RA. However, the GRS showed very little, if any, ability to discriminate between seronegative RA and controls, with an AUC of 0.563. When we compare the seropositive RA model with the 39 alleles to the one from Karlson et al. with 22 alleles, we see no improvement in AUC: 0.660 (GRS22) versus 0.654 (GRS39). This suggests that the addition of these 17 newly discovered RA alleles, whose individual ORs range from 1.10 to 1.23, does not improve the predictive ability of the GRS. As genetic discoveries progress with next generation sequencing, it is likely that a cumulative GRS will improve in its predictive ability.
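For readers less familiar with these quantities, the following is a minimal sketch of how a weighted GRS and an AUC can be computed. It assumes that each risk allele is weighted by the natural log of its published OR, which is a common construction but is stated here as an assumption; the function names and the example OR of 2.5 are hypothetical, and the AUC is computed as the proportion of case-control pairs in which the case has the higher score.

```python
import math

def weighted_grs(allele_counts, odds_ratios):
    """Sum of risk-allele counts (0, 1 or 2 per locus), each weighted by
    log(OR). The log(OR) weighting is an assumption of this sketch."""
    return sum(n * math.log(or_) for n, or_ in zip(allele_counts, odds_ratios))

def auc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control,
    counting ties as one half (the rank-based definition of the AUC)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Per-allele weights at the two ends of the OR range quoted in the text,
# next to a hypothetical stronger locus (OR of 2.5 is illustrative only).
for or_ in (1.10, 1.23, 2.5):
    print(f"OR {or_:.2f} -> weight {math.log(or_):.3f}")

# Toy scores, not study data.
print(auc([0.9, 1.4, 2.0], [0.5, 0.9, 1.1]))
```

Under this weighting, alleles with ORs near 1.10 contribute small increments to the score, which is consistent with the observation that adding them did little to change the AUC.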
Our results for seronegative RA should be viewed in the context of prior research. The loci used in the GRS were discovered, and the weights determined, using only studies that included seropositive RA cases. Although a few genetic markers have been associated with an increased risk of seronegative RA, such as HLA-DR1*03, HLA-DR3, and allelic forms of DCIR and IRF5, there may be as yet undiscovered loci that predict the seronegative RA phenotype. In a dataset containing 1500 cases and 1500 controls, Kurreeman et al. demonstrated that a GRS based on 28 non-HLA risk alleles was associated with seronegative RA with an AUC of 0.55 and a p-value for a linear association of 0.0008, also suggesting only a very modest association for these risk alleles with the seronegative phenotype.
It has been shown that the HLA-SE is strongly associated with both RF status and presence of anti-CCP antibodies. More specifically, anti-CCP antibodies play a vital role in the causal pathway between HLA-SE and erosions. This is one explanation of the results demonstrating that GRS39 performs similarly when using erosive status to define severe disease, rather than seropositivity. In addition, the observation of an AUC of 0.712 for GRS39 identifying seropositive, erosive RA cases suggests that a narrower definition of RA leads to better discriminative ability. This lends support to the argument that RA falls along a severity continuum starting with seronegative as least severe and leading to seropositive, erosive RA as most severe.
We found that earlier age at onset of RA may be associated with a higher GRS. While the correlations were weak and not statistically significant, they are consistent with the possibility that those with earlier age at RA onset have a higher “load” of genetic risk factors than those with later onset. Previous studies have shown an earlier age at diagnosis of RA both for those having any HLA-SE compared to none, and for those with any PTPN22 T allele compared to the CC genotype. Since both HLA and PTPN22 have a strong influence on the GRS, this may be one explanation for the inverse relationship between the GRS and age at onset. The strongest effects that we detected for GRS and age at onset were with the seronegative and seropositive phenotypes. With this number of subjects, we had 37% and 35% statistical power to detect a significant ρ of 0.11 in seronegative RA and a ρ of −0.09 in seropositive RA, respectively. It is possible that with more subjects in all phenotype groups we might have been able to detect significant relationships.
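The dependence of power on sample size for a weak correlation can be sketched with the Fisher z approximation. This is an illustrative approximation, not the power calculation used in this analysis: it is derived for a Pearson correlation, the function name is hypothetical, and the sample sizes below are hypothetical because the group sizes are not restated in this section.

```python
import math
from statistics import NormalDist

def correlation_power(rho, n, alpha=0.05):
    """Approximate two-sided power to detect a correlation `rho` with `n`
    subjects, using the Fisher z transformation (requires n > 3)."""
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    # Under the alternative, atanh(r) is roughly normal with mean atanh(rho)
    # and standard deviation 1 / sqrt(n - 3).
    shift = abs(math.atanh(rho)) * math.sqrt(n - 3)
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)

# Hypothetical sample sizes, chosen only to show the trend.
for n in (100, 200, 400, 800):
    print(n, round(correlation_power(0.11, n), 2),
          round(correlation_power(-0.09, n), 2))
```

Power rises steadily with the number of subjects for a fixed ρ, which is the basis for the expectation that larger phenotype groups might have yielded significant relationships.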
One limitation of our study is that we have anti-CCP status tested at only one time point, either up to 12 years prior to the time of RA diagnosis or, for the subset of cases without a blood sample collected, after diagnosis. The lack of anti-CCP results in the medical records, due to the recent development of this test, limits our ability to study anti-CCP results after diagnosis in all cases. We have not systematically collected outcome data after diagnosis of RA in this cohort; thus we do not know whether some of the subjects defined as seronegative at diagnosis will later convert to seropositive. This could lead to misclassification bias, with some truly seropositive RA subjects being misclassified as seronegative, which would bias our results away from the null in the analysis. However, as we have found only modest associations within the seronegative group, we do not believe that this has affected our analysis. This is also the case with erosive disease status, which was based on chart data that included notes ranging from the date of diagnosis, when subjects had not yet had time to develop erosions, to many years of follow-up. Another possible limitation of this study is the lack of data to test for population stratification. However, a subset of this sample (437 RA cases and 437 controls) was genotyped for the lactase gene (rs4988235), known to exhibit substantial variation in allele frequency from Northern to Southern Europe. No significant differences were found between cases and controls, arguing strongly against any significant population stratification in this dataset.
In summary, many arguments have been made in the last few years for subdividing RA into different phenotypes. The analyses here add credence to these arguments. We demonstrate different genetic associations for the different RA subtypes, with only a modest relationship seen in the least severe phenotype, seronegative RA, and the strongest relationship seen with the most severe phenotype, seropositive, erosive RA. This suggests that seropositive RA has a different underlying genetic basis than seronegative RA and thus that, in future research, studying the two phenotypes separately would lead to a greater understanding of the genetic and functional make-up of the disease.