In this study, we identified or confirmed a strong influence of genetic variation on circulating plasma protein levels in an older adult population. In some cases this relationship was extraordinarily strong, accounting for as much as 61 percent of the variance (p<9.29×10−112), with 79 other associations exceeding the conventional GWAS correction for multiple testing (p<5×10−8). The biological relevance of the top 13 gene-protein associations, ranked by R2SNP in the ADNI cohort, was examined further. Each SNP accounted for 14 to 61 percent of total variation. Among these top 13 gene-protein associations, 9 were replicated in the IMAS cohort. Two associations, Tamm-Horsfall glycoprotein (THP) and Angiotensinogen, were not replicated in the IMAS cohort. One association, Thyroxine-binding globulin (TBG), whose gene is located on the X chromosome, could not be assessed because only a single male participant carried the minor allele of the SNP. The association between ApoE level and the APOE gene was also not replicated, warranting further investigation.
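As an illustration of how a per-SNP share of variance such as R2SNP might be computed, the sketch below takes the difference in R² between a linear model with covariates plus an additively coded SNP and a model with covariates only. This definition, the function names, and the toy data are assumptions made for illustration; they are not taken from the analysis reported here.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of a least-squares fit of y on the predictor columns plus an intercept."""
    n = len(y)
    design = [[1.0] * n] + [[float(v) for v in col] for col in columns]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in design] for ci in design]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in design]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * col[i] for b, col in zip(beta, design)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def snp_r_squared(genotype, covariates, protein):
    """Assumed definition: R^2(covariates + SNP) minus R^2(covariates only)."""
    return r_squared(covariates + [genotype], protein) - r_squared(covariates, protein)


# Toy data (hypothetical): minor-allele counts, age, and a plasma protein level.
genotype = [0, 1, 2, 0, 1, 2, 0, 1]
age = [70, 72, 74, 76, 78, 80, 82, 84]
protein = [1.1, 2.0, 3.2, 1.4, 2.1, 3.0, 1.3, 2.4]
print(round(snp_r_squared(genotype, [age], protein), 3))
```

Because the covariate-only model is nested in the full model, the difference is never negative, and it can be read as the additional share of variance attributable to the SNP under this assumed definition.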
Among the top 13 associations, a SNP in the CFHR1
(complement factor H-related 1) gene (rs7517126) showed a very strong influence (R2SNP
) on the plasma level of complement factor H-related protein 1, a complement regulatory protein and a member of the complement factor H family. In this study, another SNP in the CFH
(complement factor H) gene (rs6677604) showed a larger influence (R2SNP
) than rs7517126 although these two SNPs were not in strong LD ( and Figure S1
). Similar effects of these two SNPs on the expression level of the CFHR1
gene were observed in a previous study 
. It remains unclear why rs6677604 has a larger influence on the plasma level of complement factor H-related protein 1 than rs7517126, and this warrants further investigation. Variations in the CFH
genes have been studied for disease susceptibilities, including age-related macular degeneration 
, dense deposit disease 
, atypical hemolytic-uremic syndrome 
, and systemic lupus erythematosus (SLE) 
. Plasma complement factor H has been identified as a potential diagnostic biomarker for AD 
. Interestingly, the SNP with the strongest relationship in this study (rs6677604) has been previously associated with SLE.
For interleukin-6 receptor (IL-6r), rs4129267 in the IL6R
(interleukin 6 receptor) gene had the strongest relationship. The minor allele of the SNP up-regulated the plasma level of IL-6r in the present cohorts. Previous studies reported this association in serum and plasma.
Interleukin-16 is a cytokine that functions as a chemoattractant for a variety of CD4+ immune cells and as an immunomodulator 
. Two SNPs (rs4778636, rs11857713) in strong LD (pairwise r2
0.75) influenced the plasma level of Interleukin-16. The association of these two SNPs was replicated in the IMAS cohort, but no other studies have reported an association of these SNPs with plasma interleukin-16 level. An association of these SNPs with gene expression in human lymphoblastoid cell lines has recently been reported.
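A pairwise r² of the kind quoted above can be approximated from unphased genotype data as the squared Pearson correlation between minor-allele counts at the two SNPs. The sketch below assumes 0/1/2 allele-count coding and uses made-up genotypes; it is not necessarily the LD estimator used in this study, and haplotype-based estimators can give somewhat different values.

```python
def genotype_r2(snp_a, snp_b):
    """Squared Pearson correlation between two allele-count vectors.

    Individuals with a missing genotype (None) at either SNP are skipped.
    """
    pairs = [(a, b) for a, b in zip(snp_a, snp_b) if a is not None and b is not None]
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov * cov / (var_a * var_b)


# Hypothetical genotypes for eight individuals at two nearby SNPs.
snp_1 = [0, 1, 2, 1, 0, 2, 1, 0]
snp_2 = [0, 1, 2, 1, 0, 2, 0, 0]
print(round(genotype_r2(snp_1, snp_2), 3))
```

Values close to 1 indicate that the two SNPs carry largely redundant information, which is why two SNPs in strong LD are usually interpreted as a single association signal.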
Pulmonary and Activation-Regulated Chemokine (PARC) is a small chemokine that belongs to the CC chemokine family. Previous studies reported an association of serum PARC with active pulmonary fibrosis in patients with systemic sclerosis 
, and increased plasma levels have been observed in childhood acute lymphoblastic leukemia 
and Gaucher disease 
. Our study identified three SNPs (rs972317, rs854462, rs1467288) in or near the CCL18
(chemokine (C-C motif) ligand 18 (pulmonary and activation-regulated)) gene that significantly influenced the plasma PARC level in both cohorts; none of these associations has been previously reported.
Chemokine CC-4 (HCC-4), encoded by the CCL16
(chemokine (C-C motif) ligand 16) gene, is also a small chemokine belonging to the CC chemokine family; it chemoattracts lymphocytes and monocytes but not neutrophils 
. One of the three SNPs identified in this study (rs2063979) has been associated with visceral leishmaniasis susceptibility in Brazil 
. The association of rs11080369 and rs2063979 with plasma level of HCC-4 has been previously reported 
. Although in the present study the effect of rs11080369 was in the same direction, the effect of rs2063979 was in the opposite direction to that reported previously, indicating that the direction of effect warrants further investigation.
Apolipoprotein E (ApoE) protein plays a role in lipid metabolism, combining with lipids to form lipoproteins. ApoE is also a major component of very low-density lipoproteins, which remove excess cholesterol from the blood, and is known to bind to high-density lipoproteins (HDLs), forming HDL-E, which functions as an inhibitor of agonist-induced platelet aggregation 
. The APOE
gene encoding ApoE protein is one of the most extensively studied genes, especially for AD susceptibility 
, but also for other disease risks such as cardiovascular mortality 
and stroke 
. The relationship between plasma ApoE and AD has been inconsistent 
. The APOE
ε4 allele is a well-known risk factor for AD. The rs429358 SNP, found to be significantly associated with plasma ApoE in the ADNI cohort, is one of two key SNPs determining ε2/ε3/ε4 genotypes. Thus, this SNP not only determines different isoforms of ApoE but also influences the overall plasma level of ApoE in the ADNI cohort. In an additional analysis, there was no interaction effect between rs429358 and baseline diagnosis on plasma ApoE at uncorrected p<0.05. The relationship among rs429358, plasma ApoE levels, and AD should be further investigated using isoform-specific plasma ApoE levels, because the platform used to measure plasma ApoE did not measure the levels of specific isoforms.
Apolipoprotein A-IV (ApoA-IV) is another apolipoprotein in plasma that is involved in lipid metabolism. Previous studies have reported an association of ApoA-IV with AD, but the findings are inconsistent 
. The significant effect of rs1263167 on the plasma level of ApoA-IV was replicated in the IMAS cohort, but has not yet been reported in other studies. One study found the serum level of ApoA-IV to be up-regulated in AD patients 
and another study observed an association of ApoA-IV deficiency with increased Aβ deposition.
The human renin-angiotensin system (RAS) plays a role in the regulation of blood pressure, and angiotensinogen and angiotensin-converting enzyme (ACE) are part of the RAS. Several studies have shown associations of ACE
(angiotensin I converting enzyme (peptidyl-dipeptidase A) 1) variants with AD 
as well as type 2 diabetic nephropathy 
, and cerebral amyloid angiopathy-related lobar intracerebral hemorrhage recurrence 
. In our study, rs4343 showed the strongest effect (R2SNP
) on the plasma ACE level. Another study 
identified an association of rs4311 with serum ACE level in control participants, and the present study replicated this finding with the same direction of effect. Plasma angiotensinogen levels are highly heritable 
and previous studies 
reported an association between rs4762 and plasma angiotensinogen level. Although rs4762 was associated with plasma angiotensinogen level in the ADNI cohort, the direction of effect was opposite to that reported in both previous studies in a Mexican population 
. In addition, another study failed to identify this association in Nigerians 
. Further investigation of influencing factors other than genetic variation should be conducted to explain the inconsistency.
Fetuin-A is a serum protein, encoded by AHSG
(alpha-2-HS-glycoprotein), which is synthesized in the liver and secreted into the bloodstream. Plasma Fetuin-A level has been associated with cardiovascular disease 
, and AHSG variants have been previously associated with AD 
. Previous studies 
identified associations of the same SNPs (rs4917, rs2070633) with plasma Fetuin-A level, in the same direction of effect as observed in the present study.
Tamm-Horsfall glycoprotein (THP) is abundant in urine, and in humans it is encoded by the UMOD
(uromodulin) gene, which is associated with chronic kidney disease 
and blood pressure 
. We identified four SNPs (rs11647727, rs4506906, rs4293393, rs13333226) associated with the plasma THP level in the ADNI cohort, although they were not replicated in the IMAS cohort. Among these SNPs, rs13333226 has been previously associated with diastolic blood pressure 
. The SNP with the strongest effect in our study (rs4293393) has also been associated with urinary THP concentrations, in the same direction as the effect observed here.
Thyroxine-binding globulin (TBG) is a protein that is involved in the transport of thyroxine and triiodothyronine in human serum 
. Previous studies 
investigated the role of polymorphisms within SERPINA7
(serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 7) gene in relation to inherited TBG defects. We found rs1804495 to be associated with plasma TBG level, but only in males in the ADNI sample. This polymorphism lies in codon 303, replacing TTG (leucine) with TTT (phenylalanine), and a role for this variant in TBG defects or plasma TBG level has not been previously reported.
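The codon change described above can be checked against the standard genetic code. The sketch below uses a deliberately partial codon table (only the four TTN codons) and a hypothetical helper name, `classify_substitution`, purely for illustration.

```python
# Partial standard genetic code: only the codons needed for this illustration.
CODON_TO_AMINO_ACID = {
    "TTT": "Phe",
    "TTC": "Phe",
    "TTA": "Leu",
    "TTG": "Leu",
}


def classify_substitution(ref_codon, alt_codon):
    """Return (reference amino acid, alternate amino acid, substitution type)."""
    ref_aa = CODON_TO_AMINO_ACID[ref_codon]
    alt_aa = CODON_TO_AMINO_ACID[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "missense"
    return ref_aa, alt_aa, kind


# Codon 303 change as described in the text: TTG -> TTT.
print(classify_substitution("TTG", "TTT"))
```

A third-position change from G to T alters the encoded residue here, whereas a change from G to A at the same position (TTG to TTA) would leave leucine unchanged.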
The present study has some limitations that may be informative for future studies. First, both cohorts consisted of older adults including a large portion with MCI, AD or cognitive complaints. Although age, diagnosis and APOE
genotype were included as covariates, we could not definitively determine the extent to which age and AD risk may have influenced the observed associations. Studies of gene-protein associations in younger and cognitively healthy samples are needed to clarify the generalizability of the present results. Second, although this study included relevant covariates, factors other than those measured and selected for analysis may have influenced the associations we studied. Further investigation of other factors influencing protein levels beyond genetic variation and the current covariates may be important 
. Third, non-normal distributions may have influenced association statistics. However, this is relatively unlikely because our analyses did not indicate any significant evidence of statistical bias. Fourth, the IMAS replication sample was of modest size, resulting in limited detection power compared to the ADNI cohort. Additional studies with larger sample sizes are needed for confirmation of the observed relationships. Fifth, the genotyping microarray we used shows considerable variation in SNP coverage for the genes of interest, as illustrated in Figure S1
. Therefore, some potential influences of genetic variants on protein analyte levels may have been missed due to undersampling of targeted genomic regions. Imputation of SNP data using HapMap or 1000 Genomes reference panels can increase coverage and will be used in future studies. Finally, there may be technical differences in the RBM assays between the discovery and replication data, which were assayed at different times with different antibodies and conditions across RBM runs. Technical issues related to assay time or batch differences could have contributed to the associations that were not replicated, and this is also an issue for future validation of candidate analytes. A considerable amount of work to resolve these and other technical issues inherent to the RBM and follow-up assays will be required to evaluate the current findings and turn them into research- or clinical-grade diagnostic assays in the future.
Despite these limitations, the current study identified 112 SNP-protein associations in the ADNI cohort, and many (n = 80) of these associations were highly significant relative to the generally accepted genome-wide significance threshold (p<5×10−8
). Approximately half of the 112 SNP-protein associations identified in the ADNI cohort were replicated in the IMAS cohort. However, some findings in the ADNI cohort which were not replicated in the IMAS cohort were previously reported in other studies and therefore continue to warrant additional investigation.
In conclusion, this study investigated the role of genetic variation, specifically cis-effects, on corresponding protein levels. The strong influence of many genes on commonly measured plasma analytes should be taken into account. This is particularly critical when proteins are known to play an important role in a disease or treatment. In such cases, the evaluation of proteins as diagnostic, prognostic or therapeutic response biomarkers may need to be stratified by genetic background. Future studies should examine diagnostic classification after stratification. Our findings should be replicated in additional independent cohorts with larger samples. It is anticipated that future studies will investigate other genetic mechanisms such as trans-effects, haplotypes, copy number variation and epistasis, each of which may influence plasma protein levels. Finally, mRNA sequencing and transcriptome analyses of expression and alternative splicing should provide a more complete picture of functional genetic variation influencing plasma gene products.