Rationale: Previously reported linkage to FEV1 (LOD score = 5.0) on 6q27 in the Framingham Heart Study (FHS) led us to explore a candidate gene, SMOC2, at 168.6 Mb.
Objectives: We tested association between SMOC2 polymorphisms and FEV1 and FVC in unrelated FHS participants.
Methods: Twenty single-nucleotide polymorphisms (SNPs) around SMOC2 were genotyped in 1,734 subjects.
Measurements and Main Results: SNP data were analyzed using multiple linear regression models incorporating sex, age, body mass index, height, and smoking history as covariates, and analyses were repeated within strata of ever- and never-smokers. The minor allele of SNP rs1402 was associated with higher mean FEV1 (p = 0.003) and FVC (p = 0.02) measures. In never-smoking subjects, association with higher measures was observed with the minor allele of rs747995 (FEV1, p = 0.0006; FVC, p = 0.0008). These two SNPs lie in different haplotype blocks and reside in intron 4 of SMOC2. Haplotype analysis revealed a common G-T haplotype (rs747995–rs1402) with 77% frequency in never-smoking FHS subjects. The G-T haplotype was associated with reduction of 126 ml for FEV1 (p = 0.0002) and 157 ml for FVC (p = 0.0002). The G-T haplotype was similarly associated in a set of never-smoking subjects from the Family Heart Study (FEV1, p = 0.03; FVC, p = 0.03).
Conclusions: The replication of the association in two populations supports the possibility that SMOC2 might play an important role in the determination of FEV1 and FVC.
α1-Antitrypsin deficiency is the only proven genetic determinant of chronic obstructive pulmonary disease.
The study provides evidence for the implication of yet another gene, SMOC2, in lung function. SMOC2 was found to affect FEV1 and FVC in two different populations, the NHLBI Framingham Heart Study and the NHLBI Family Heart Study.
The observation that pulmonary diseases cluster in families (1, 2) and the substantial heritability of spirometric pulmonary function measures demonstrated in population-based samples (3–10) support the hypothesis that genetic factors influence both variability in pulmonary function and risk of chronic obstructive pulmonary disease (COPD). The spirometric measurements of FEV1, FVC, and FEV1/FVC are estimated to be between 15 and 60% heritable in population-based samples (8, 10). Familial factors may influence lung size during development, as well as affect the response to environmental toxins such as cigarette smoke.
Today, α1-antitrypsin deficiency is the only proven genetic determinant of COPD (11–16). The homozygous deficiency of the serine protease inhibitor α1-antitrypsin is associated with early-onset emphysema in smokers in their fourth decade of life and nonsmokers in their fifth decade (17) and accounts for less than 2% of all COPD (18). The heterozygous form of the mutation has been inconsistently associated with an increased risk of COPD. Mutations in alleles of α1-antichymotrypsin, a highly homologous protease inhibitor (19), have also been associated with obstructive lung disease in case-control studies (20).
Genomewide linkage for pulmonary function measures has been performed in population-based family studies. In the Framingham Heart Study (FHS), linkage was identified on chromosome 6qter: FEV1 (LOD [logarithm of the odds] = 2.4) and FEV1/FVC (LOD = 1.4) (21). A follow-up study demonstrated that the addition of new markers resulted in stronger evidence for linkage to FEV1 at 184.5 cM (LOD = 5.0) (22). The linkage in FHS lies in the region of 184 cM (D6S503) to 190 cM (D6S281) on 6q27. Linkage in this region was not reported in the population-based sample from the National Heart, Lung, and Blood Institute (NHLBI) Family Heart Study (23). To identify the possible gene on 6q27 that is contributing to the linkage peak observed in the Framingham sample, we adopted a candidate gene approach. One candidate gene is the “secreted protein acidic and rich in cysteines” (SPARC)–related modular calcium binding 2 gene (SMOC2).
SMOC2 (locus ID: 64094) harbors a Kazal domain, two thyroglobulin type-1 domains, two EF-hand calcium-binding domains, and a putative signal peptide (24). The SMOC2 Kazal domain, like α1-antitrypsin, encodes for a serine protease inhibitor. The thyroglobulin type-1 domains might also act as inhibitors of several proteases. The SMOC2 gene has been cloned and characterized (25). The protein was reported to be expressed in the lung and in the aorta, and the mRNA has been reported to be up-regulated during neointima formation in a rat balloon injury model (24), which suggests that SMOC2 may play an important role in lesion growth. Here, we examine the association between 20 single-nucleotide polymorphisms (SNPs) spanning 1,477 kb around and in SMOC2 and spirometry measures in the FHS population. In addition, we report a haplotype analysis of the implicated SNPs evaluated in a sample from the NHLBI Family Heart Study.
Some of the results of these studies have been previously reported in the form of an abstract and poster at the American Society of Human Genetics 2003 annual meeting (26).
This study examined unrelated subjects from the FHS offspring cohort, a sample of white Americans of predominantly western European descent. In 1971, FHS recruited the biologic offspring of the original participants and the spouses of these offspring. A cohort of unrelated offspring participants was sampled by selecting one member from each family, yielding a sample of 1,888 individuals. The spirometric methods used in the FHS have been previously described (21, 22, 27). Spirometric data were available on 1,734 of the unrelated subjects. The mean value of spirometry and covariates at two time points was used when available. Participants having only one examination with spirometry were included with data from a single exam.
Analyses were performed using a multiple linear regression model with FEV1 or FVC as the dependent variable and a dominant modeling strategy. The models included as covariates age, sex, height, body mass index (BMI) (kg/m2), smoking status (never, former, or current), and pack-years. In addition, SNPs were analyzed within strata of never- and ever-smokers using the same covariates. Haplotype association, adjusted for covariates, was assessed using the program haplo.stats (28, 29) (Haplo.stats software is available at http://cran.r-project.org/src/contrib/Descriptions/haplo.stats.html).
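The dominant modeling strategy described above can be illustrated with a minimal sketch. It assumes a toy data set with a single covariate (height) instead of the full covariate set used in the study, and it fits the model by ordinary least squares through the normal equations. The function names, the toy subjects, and the effect sizes are hypothetical and chosen only for illustration; they are not the study's data or software.

```python
def dominant_code(genotype, minor_allele):
    """Return 1 if the genotype carries at least one minor allele, else 0."""
    return 1 if minor_allele in genotype else 0


def ols(design, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(design[0])
    # Augmented matrix [X'X | X'y]
    a = [
        [sum(row[i] * row[j] for row in design) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(design, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting for numerical stability
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * z for x, z in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def fit_dominant_model(subjects, minor_allele):
    """Fit FEV1 ~ intercept + carrier + height.

    subjects: list of (genotype, height_m, fev1_litres) tuples.
    Returns [intercept, carrier_effect, height_effect].
    """
    design = [[1.0, dominant_code(g, minor_allele), h] for g, h, _ in subjects]
    y = [fev1 for _, _, fev1 in subjects]
    return ols(design, y)


# Hypothetical toy subjects: FEV1 is generated with a built-in 0.1 L carrier
# effect so that the fit can be checked against a known answer.
_TOY_INPUT = [("TT", 1.60), ("GT", 1.70), ("GG", 1.65),
              ("TT", 1.80), ("GT", 1.75), ("TT", 1.72)]
TOY_SUBJECTS = [
    (g, h, 1.0 + 0.1 * dominant_code(g, "G") + 2.0 * h) for g, h in _TOY_INPUT
]
```

Under dominant coding, heterozygotes and minor-allele homozygotes share one regression coefficient, which is why effect estimates are stated relative to common-allele homozygotes. A real analysis would add the remaining covariates as further columns of the design matrix and would also report standard errors and p values, which this sketch omits.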
The Family Heart Study participants evaluated in this report comprise 225 white families being studied for linkage to BMI on chromosome 7 (30). Details of the original design have been described previously (31). SNP and haplotype association was evaluated using Family Based Association Tests implemented in the FBAT program (32, 33) (FBAT software is available at http://www.biostat.harvard.edu/~fbat/default.html). The analysis was performed in the full sample and stratified by smoking using phenotypic residuals for FEV1 and FVC that were adjusted for age, age-squared, BMI, height, smoking status, and pack-years in separate regression models by sex (23). The p values reported are FBAT results testing the null hypothesis of no linkage and no association. Estimates of the SNP and haplotype effects on FEV1 and FVC in liters were generated using regression models adjusting for the same covariates. Haplotypes were predicted in families using Merlin (34) (Merlin software is available at http://www.sph.umich.edu/csg/abecasis/Merlin/download/).
We evaluated 42 SNPs selected from the SNP Consortium Database (www.ncbi.nlm.nih.gov/SNP) using assays based on the Sequenom matrix-assisted laser desorption/ionization time-of-flight mass spectrometry platform (35). All SNPs were genotyped in at least 32 individuals to establish their heterozygosity. Polymorphic SNPs were typed in all unrelated FHS subjects. Lab controls and duplicate samples were genotyped for quality control. All SNPs were tested for Hardy-Weinberg equilibrium (HWE). Two SNPs exhibiting modest departure from HWE were regenotyped using TaqMan assays on a Prism 7900HT Sequence Detection System (Applied Biosystems, Foster City, CA). The ABI Prism system was also used for genotyping in the Family Heart Study subjects.
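As an illustration of the HWE check, the sketch below applies a one-degree-of-freedom chi-square goodness-of-fit test to genotype counts for a biallelic SNP. The text does not state which HWE test was used, so the choice of the chi-square test is an assumption, and the function name is hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium for one biallelic SNP.

    n_aa, n_ab, n_bb: observed counts of the three genotypes.
    Returns (chi-square statistic, p value on 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p <= 0.0 or q <= 0.0:
        raise ValueError("SNP is monomorphic; HWE cannot be tested")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, P(chi-square > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

With thresholds like those applied in this study, a SNP would be excluded when the returned p value falls below 0.01 and flagged for regenotyping when it lies between 0.01 and 0.05. For SNPs with low minor allele frequency, an exact test is often preferred over the chi-square approximation.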
Linkage disequilibrium (LD) between all markers was assessed using Haploview (36) (Haploview software is available at http://www.broad.mit.edu/mpg/haploview/download.php). Haplotype blocks were estimated using the confidence interval method (37). The default settings were used to define SNP pairs in strong LD. A block was identified when at least 95% of SNP pairs in a region met criteria for strong LD. Haplotype block information was used to select SNPs for haplotype analysis. SNP rs747995 was selected for the haplotype because of the association observed in nonsmokers, and rs1402 was selected for its position outside of the LD block that included rs747995 and due to its observed association with phenotype in the complete sample. These two SNPs generated a strong haplotype result in the Framingham sample and were genotyped and tested for association in the Family Heart Study. To explore haplotypes among smokers, rs2248421, rs1402, and rs968440, each identified for association with FEV1 in the ever-smokers and representing different LD blocks, were combined.
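The pairwise LD measures that underlie block definition can be sketched as follows. This is a simplified illustration, not Haploview's procedure: it assumes phased two-SNP haplotype counts are already available and returns point estimates of D' and r², whereas the confidence interval method works from confidence bounds on D'. The function and argument names are hypothetical.

```python
def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """Return (D', r^2) from counts of the four two-SNP haplotypes.

    A/a are the alleles at the first SNP and B/b those at the second.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    if not (0.0 < p_A < 1.0 and 0.0 < p_B < 1.0):
        raise ValueError("both SNPs must be polymorphic")
    # D is the excess of the A-B haplotype over its expectation under independence
    d = p_AB - p_A * p_B
    if d >= 0:
        d_max = min(p_A * (1.0 - p_B), (1.0 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1.0 - p_A) * (1.0 - p_B))
    d_prime = abs(d) / d_max
    r_squared = d * d / (p_A * (1.0 - p_A) * p_B * (1.0 - p_B))
    return d_prime, r_squared
```

D' reaches 1 whenever at least one of the four haplotypes is absent, even if allele frequencies at the two SNPs differ, while r² reaches 1 only when the two SNPs are perfect proxies for each other. This is why two SNPs can sit in the same block yet carry partly independent association information.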
Characteristics of the unrelated FHS and the Family Heart Study subjects are presented in Tables 1 and 2. In both populations, nonsmoking subjects had a higher FEV1/FVC ratio than smokers. SNPs in and near the SMOC2 gene were selected from the National Center for Biotechnology Information (NCBI) database on the basis of their location. A total of 42 SNPs were genotyped; 15 SNPs were found to be nonpolymorphic in the initial screening test. The remaining 27 SNPs were typed in all FHS subjects; 7 were out of HWE (p < 0.01) and excluded from further analyses. SNPs rs747995 and rs714937 had moderate departure from HWE (0.01 < p < 0.05) and were retyped on the ABI platform. The genotypes between the Sequenom and ABI were largely consistent, with 98.5% identical genotypes for rs747995, and 99.1% identical genotypes for rs1402. The mismatched genotypes were discarded and only concordant calls were used in association analyses.
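The cross-platform comparison can be sketched as a simple concordance calculation. The sketch assumes genotype calls are stored per sample as two-letter strings and treats a genotype as an unordered pair of alleles; the function name, sample identifiers, and data layout are hypothetical.

```python
def concordance(calls_a, calls_b):
    """Compare genotype calls from two platforms for one SNP.

    calls_a, calls_b: dicts mapping sample ID to a two-letter genotype.
    Returns (fraction of identical calls, dict of concordant calls only).
    """
    shared = [sample for sample in calls_a if sample in calls_b]
    if not shared:
        raise ValueError("no samples were typed on both platforms")
    # Allele order within a genotype is ignored: "AG" and "GA" are the same call
    agree = {
        sample: calls_a[sample]
        for sample in shared
        if sorted(calls_a[sample]) == sorted(calls_b[sample])
    }
    return len(agree) / len(shared), agree
```

Keeping only the returned concordant calls mirrors the rule of discarding mismatched genotypes before association analysis.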
Table 3 shows characteristics of the 20 SNPs in and around SMOC2 that were used for association analysis. The SNPs are presented according to their chromosomal location in build 36.1. All tested SNPs in SMOC2 are intronic and their minor allele frequency is between 3 and 44%. Only one SNP (rs1402) showed significant association to FEV1 (p = 0.003) and FVC (p = 0.02) in the complete sample with the presence of the minor allele (G) associated with a mean FEV1 96 ml higher and mean FVC 84 ml higher than common allele homozygotes (Table 3). A second SNP, rs747995, was significantly associated with only FVC. After stratification by never/ever smoking status, however, seven SNPs showed significant association (p < 0.05) with FEV1 or FVC in at least one of the strata (Table 4). In nonsmokers, SNP rs747995 showed the strongest association with FEV1 (p = 0.0006) and FVC (p = 0.0008), with the minor allele associated with a 149 ml higher FEV1 and 177 ml higher FVC compared with common allele homozygotes.
The region spanning the genotyped SNPs was assessed for patterns of LD, and three LD blocks were identified (Figure 1). Block 1 harbored four SNPs, including rs747995 (SNPs 6–9 in Figure 1), and the three SNPs in this block with minor allele frequencies (MAF) between 14 and 20% each showed association with FEV1 and FVC in nonsmokers. The lower MAF SNP in this block (rs1884475, MAF = 5%) was modestly associated with FVC in smokers. SNP rs1402 was in the region adjacent to this block. A haplotype of SNPs rs747995 and rs1402 revealed a significant global haplotype association with FEV1 (p = 0.02) and FVC (p = 0.02) using all unrelated FHS subjects and a 0.05 significance level. The haplotype association was stratified by smoking status, and although the global haplotype association in ever-smokers did not suggest association (FEV1, p = 0.19; FVC, p = 0.59), the global haplotype association in the strata of never-smokers was statistically significant (FEV1, p = 0.0008; FVC, p = 0.002). Table 5 presents the different haplotypes of rs747995–rs1402 in never-smokers, the haplotype frequency, β-estimates, and p values for FEV1 and FVC. In never-smoking subjects, one common two-SNP haplotype (77%), representing alleles G and T from SNPs rs747995 and rs1402, respectively, was found to be associated with 126 ml lower FEV1 (p = 0.0002) and 157 ml lower FVC (p = 0.0002).
A haplotype incorporating the rs2248421, rs1402, and rs968440 SNPs was significantly associated with FEV1 and FVC (p = 0.006 for both) in the stratum of smokers (Table 6). The three-SNP haplotype result identified a 1.7% frequent haplotype (T-T-A) associated with 255 ml lower mean FEV1 and 294 ml lower mean FVC in smokers. This haplotype was not globally significant in the nonsmokers, and the effect of the T-T-A haplotype in nonsmokers was estimated to be only a 19 ml lower mean FEV1.
Similar to the FHS, the Family Heart Study is a population-based family cohort, and the lung function characteristics shown in Table 2 are similar to those of the Framingham sample. We attempted to replicate the SNP and haplotype association observed in never-smokers in a sample from the NHLBI Family Heart Study. SNPs rs747995 and rs1402 were tested using a dominant/recessive model in FBAT in the full sample of never- and ever-smokers, and the minor allele of rs1402 was associated with both FEV1 (p = 0.05) and FVC (p = 0.03). The presence of the minor allele (G) of rs1402 was associated with a higher mean FEV1 by 85 ml and FVC by 100 ml compared with the common allele homozygotes. In these samples, rs747995 was significantly associated with FVC (p = 0.02), with the minor allele (A) associated with higher FVC levels for the homozygous genotype. A total of 21 individuals carried the AA genotype of rs747995, and they had a mean FVC 264 ml higher than carriers of the G allele. In the full sample, the haplotype of rs747995 and rs1402 was borderline globally significant for association with FVC (p = 0.07). In never-smokers, the G-T haplotype of rs747995 and rs1402 showed association to FEV1 and FVC (p = 0.03 for both), although the global haplotype tests did not produce significant results (FEV1, p = 0.14; FVC, p = 0.06). Table 5 presents the different haplotype results for NHLBI Family Heart Study nonsmoking subjects using an additive genetic model for comparison with FHS results. Similar to the observed effects in Framingham, the Family Heart Study results indicate the common haplotype (80%) to be associated with lower mean levels of FEV1 and FVC and a 7% haplotype associated with higher levels.
Bioinformatic analysis for the region harboring the haplotype showed that rs747995 and rs1402 reside in intron 4 of SMOC2. Figure 2 shows a genomic view of the two haplotyped SNPs and their location in SMOC2. Further analysis showed that the region of SMOC2 intron 4 harbors an antisense encoding mRNA (AK130868) (Figure 2).
Using linkage analysis, we previously identified a quantitative trait locus influencing FEV1 on 6q27 (LOD score = 5 at 184.5 cM) (22). In an effort to identify a gene(s) in the linkage region that influences FEV1 and/or FVC, we applied a candidate gene approach. One candidate gene on 6q27 is SMOC2, which has homology to the protease inhibitor α1-antitrypsin, the one gene known to influence the development of COPD in smokers.
The homology of SMOC2 to α1-antitrypsin lies at the level of its protein domains. SMOC2 encodes an EF-hand calcium-binding domain, two thyroglobulin-like domains, a follistatin-like domain, and a novel domain without known homologs (25). The thyroglobulin domains bind and act as inhibitors to different proteases, such as serine and cysteine proteases (38, 39). The follistatin-like domains, including the Kazal-type protease inhibitor domain, are usually indicative of serine protease inhibitors (such as the thyroglobulin domain) (40). Thus, SMOC2 contains multiple domains with the potential to act as protease inhibitors, and protease/antiprotease imbalances represent a leading hypothesis in the etiology of COPD.
To assess the role of SMOC2 in association with FEV1 and FVC, we studied a set of 1,734 unrelated individuals from the FHS (Table 1). Although a significant association with both FEV1 and FVC was found in the total sample for only 1 of the 20 SNPs tested, an additional 6 SNPs showed significant association with lung function when the sample was stratified on smoking status. SNPs that showed significant association to FEV1 and FVC in nonsmokers (rs2255680, rs747995, and rs714937) were located in introns 3 and 4 of SMOC2. SNP rs2248421, which resides in intron 2, showed significant association to FVC in ever-smokers, and rs1884475 and rs1402 (intron 4) were also associated in the ever-smoking sample. HapMap data suggest that 21 haplotype blocks may be present within SMOC2 (http://www.hapmap.org/). Although we may not have sampled SNPs from each of these blocks to rule out other regions of the gene as harboring associated SNPs, the area of strongest association was localized to the region around exons 3 and 4.
The SMOC2 region was assessed for patterns of LD to identify SNPs for haplotype analysis (Figure 1). We identified a common haplotype (rs747995–rs1402) with a frequency of 77% in the nonsmoking FHS subjects that was associated with lower FEV1 and FVC levels (p = 0.0002 for both). Two minor haplotypes of 7 and 16% frequency were associated with higher FEV1 and FVC levels.
To replicate these results in a separate sample, we genotyped the two SNPs in the associated haplotype (rs1402–rs747995) in a set of 225 pedigrees from the NHLBI Family Heart Study. Both SNPs individually were associated with pulmonary function in the full Family Heart Study sample. For rs1402, the genetic model and estimates of the effect on FEV1 and FVC were consistent with the observations from the FHS. For rs747995, the minor allele association with higher levels of lung function was observed more clearly for carriers of two copies of the minor allele than one. This recessive model result compared with a dominant model result calls into question whether the heterozygote genotype has a replicated effect on phenotype, but in both studies, the minor allele homozygotes for rs747995 were observed with higher mean spirometry. Nonetheless, haplotype analysis of the two SNPs revealed that both the haplotype frequencies and their association with lung function among never-smokers were similar in the two samples. Haplotype association was observed despite the absence of linkage to pulmonary function on 6q27 in the Family Heart Study. The replication of the haplotype frequency and effect confirms that these SNPs are common in the general population and provides strong evidence that they are associated with FEV1 and FVC.
The two SNPs we identified span a small region (~12,560 bp) on 6q27 flanking exon 4 of SMOC2. How this region impacts SMOC2 gene function is not currently known. Effects on expression, splicing, or properties of the SMOC2 gene product are possible. In addition, an identified antisense mRNA transcript (AK130868) in this region could impact readout from SMOC2. Further functional analyses using various combinations of these SNPs may help to confirm these findings and provide insight into the function of each variant and transcript.
Because this was a population-based study with a low prevalence of overt obstructive lung disease, it was not possible to determine whether SMOC2 haplotypes were associated with COPD per se. Different SNPs showed association to lung function in smokers and never-smokers, with the strongest association signal found in never-smoking subjects. Moreover, the effect of the alleles on lung function appears to be modified by exposure to cigarette smoke. We observed minor allele effect estimates associated with higher lung function in never-smokers and lower lung function in smokers. If the haplotypes identified correspond to different SMOC2 isoforms, those isoforms may influence lung function differently in the presence or absence of smoking. The association of the A-T and G-G haplotypes with higher lung function in nonsmokers may reflect a gene variant(s) with an impact on lung growth and development, whereas the association of haplotypes with reduced lung function in smokers, if confirmed, may reflect a gene variant with an increased susceptibility to proteolytic lung injury. Whether SMOC2 variants directly affect susceptibility to development of COPD either by modifying lung volumes attained during growth and development or by influencing response to environmental irritants such as tobacco smoke would be better addressed in a clinical sample.
One limitation of this study is our restricted ability to address how the SMOC2 variants explain the linkage result previously described. Due to scarce DNA resources, only a subset of the participants previously studied for linkage is available for new genotyping. Although we previously obtained an LOD score of 5 on 6q27, the smaller sample currently available generates only an LOD of 1.38. Adjusting for the rs747995 and rs1402 SNPs in linkage increased the LOD to 1.47, suggesting that although the SNPs account for variance in FEV1, they do not necessarily account for the linkage observed.
In summary, the strong association of SNPs and haplotypes in the intron-4 region of SMOC2 with the pulmonary function measurements FEV1 and FVC suggests that common SMOC2 sequence variants are associated with pulmonary function in the general population. This inference is strengthened by the identification of a common haplotype associated with lung function in two populations, the unrelated FHS sample and the Family Heart Study sample. Identification of the specific genetic variant affecting lung function may permit improved COPD risk assessment.
The authors thank the investigators and participants of the NHLBI Framingham and Family Heart Studies for making this work possible. The authors thank Dr. Richard H. Myers for facilitating the collaboration with the Family Heart Study and providing genotyping services.
The Framingham Heart Study is supported by NIH/NHLBI N01-HC-25195. J.B.W. is supported by a Young Clinical Scientist award from the Flight Attendant Medical Research Institute.
Originally Published in Press as DOI: 10.1164/rccm.200601-110OC on January 4, 2007
Conflict of Interest Statement: None of the authors has a financial relationship with a commercial entity that has an interest in the subject of this manuscript.