Despite having identical CFTR genotypes, ΔF508 homozygous CF individuals demonstrate a full range of pulmonary disease. Although several environmental factors influencing severity of lung disease have been identified, there is growing evidence that genetic and molecular differences contribute to the significant variability seen in the CF phenotype. This study used microarray analysis of nasal respiratory epithelium to investigate the molecular basis of variability in CF phenotype by identifying differences in gene expression both between ΔF508 homozygotes in the most severe 20th percentile of lung disease and those in the mildest 20th percentile, and between ΔF508 homozygotes and age-matched non-CF control subjects. The results suggest that the most significant differences in gene expression between those with CF and those without involve genes for airway defense, antigen presentation, and protein metabolism. There are also differences in gene expression between those with mild CF lung disease and those with severe lung disease, even in nasal respiratory epithelium without evidence of significant differences in inflammation. These include differential expression of genes involving the ubiquitin cycle, oxidoreductase activity, and lipid metabolism.
A previous comparison of gene expression in CF and non-CF respiratory epithelium was performed in mice and identified differential expression of multiple gene classes, including those involved in transcription, inflammation, intracellular trafficking, signal transduction, and ion transport (21). Because of the lack of pulmonary pathology in these CF mice, no analysis could be performed to evaluate differences in gene expression associated with severity of lung disease. Another recent study of ΔF508 and non-CF primary respiratory cell cultures grown in sterile conditions for 60 d suggested minimal differences in gene expression (22). To design our study evaluating differences in gene expression by severity of lung disease, we first needed to select samples from ΔF508 homozygotes in distinct, contrasting phenotypic groups. A recent study by Schluchter and coworkers aided in selecting these groups by determining FEV1 criteria identifying those with the mildest 20% and most severe 20% of lung disease among ΔF508 homozygotes between the ages of 15 and 26 (7). Our investigation used these criteria as the basis for identifying “mild” and “severe” CF phenotypes after controlling for other known environmental influences, including ABPA, B. cepacia, and atypical mycobacteria.
CF versus Non-CF Differential Gene Expression
Perhaps the most striking of this group is DUOX2, which had an average 2.5-fold lower expression in individuals with CF. DUOX2 is an NADPH oxidase that has recently been identified as the key producer of H2O2 in the airway and oral cavity and is an essential component of the lactoperoxidase (LPO) airway epithelium defense system (10). LPO uses H2O2 to oxidize the anion thiocyanate to hypothiocyanite, a strong antimicrobial agent that prevents the growth of bacteria and fungi in the airways (25). Individuals with chronic granulomatous disease, who are known to be deficient in the production of airway reactive oxygen species, have been noted to be susceptible to the gram-negative infections often characteristic of CF, such as B. cepacia. Although authors have previously noted these similarities and have commented on the need for further investigation (11), these array findings provide the first in vivo support for a potential link between CF and abnormalities in the DUOX-LPO airway defense system.
One gene ontology category demonstrating clearly decreased expression in CF was that involving genes responding to biotic stimulus. These genes included β-casein (CSN2), which demonstrated on average a 1.5-fold lower expression in individuals with CF, and a group of genes involving lymphocyte differentiation (CD2, CD74) and antigen presentation (HLA-F, HLA-G). CSN2 has been identified as inhibiting the growth of viruses and bacteria, particularly S. aureus, through its strong inhibition of cysteine proteases (27). CD2 is a cell surface adhesion protein expressed on nearly all T cells that mediates STAT involvement in the regulation of IFN-γ expression, particularly in mucosal T cells (28). Detection of differential expression of CD2 suggests that the 6–9% inflammatory cells present in the nasal samples made a contribution to the expression profiles. CD2 has previously been shown to be cleaved by polymorphonuclear leukocyte elastase and cathepsin G in patients with CF (29). IFN-γ dysregulation has repeatedly been proposed to play a role in CF pathophysiology, and these array findings suggest that decreased expression of CD2 in mucosal lymphocytes might contribute to IFN-γ dysregulation.
Also significantly decreased in CF was insulin-like growth factor binding protein-3 (IGFBP3), a protein known to be a key modulator of the effects of insulin-like growth factor. Serum levels of IGFBP3 have been repeatedly demonstrated to be decreased in individuals with CF and in some cases correlate with lung function and nutritional status (30). Microarray analysis of CFTR null mice lung tissue has also previously demonstrated a significant decrease in insulin-like growth factor binding proteins (21). Although decreases in IGFBP3 can be related to chronic malnutrition, the significant decrease even in well nourished individuals with mild CF suggests an association between the loss of CFTR function and the decrease in insulin-like growth factor binding proteins.
Six of the 30 genes downregulated in CF were involved in lipid metabolism (PIGB, PIGF, PITPNB, SC4MOL, SLC27A2, and UGCG) and are located in the endoplasmic reticulum according to GO. None of these genes has been identified as potentially involved in CF, although abnormalities in lipid metabolism in CF are well known. Despite the large number of differentially expressed genes located in the endoplasmic reticulum, there was no indication of an ER overload response.
The inflammatory chemokine IL-8 has been identified as being characteristically elevated in CF (20). Although IL-8 expression was significantly elevated in the severe CF group compared with non-CF control subjects, there was much more variability in IL-8 expression among individuals in the mild CF group. The highest absolute IL-8 values in the study group were found in two individuals with mild CF lung disease and minimal sinus disease. This suggests that although IL-8 levels are usually elevated in CF, marked elevation is not necessarily associated with more aggressive disease. Alternatively, CF nasal respiratory epithelial cells may demonstrate IL-8 characteristics independent of lower airway respiratory epithelium.
There were no differences in CFTR transcript levels between ΔF508 homozygotes and non-CF control subjects. This is not surprising given that ΔF508 homozygotes are thought to develop CF pathology through abnormalities at the level of protein trafficking and function. This finding is also consistent with the recent results of Zabner and colleagues (22).
Mild versus Severe CF Differential Gene Expression
Some of the most striking differences in gene expression were seen when we analyzed the results of specific phenotypes within CF. There were 53 genes that were significantly differentially expressed in individuals with mild CF lung disease compared with individuals with severe CF and non-CF control subjects. The GO categories most represented included lipid metabolism, airway defense, G protein-coupled receptors, and ion transport. Among these genes, several strong candidate modifiers of CF phenotype were apparent.
STATH, a calcium-binding, 43-amino-acid phosphopeptide known to have antibacterial properties, is found in saliva, nasal secretions, and the upper airway. STATH was clearly upregulated in individuals with mild CF lung disease. This increased expression was confirmed in an additional 12 mild and severe patient samples collected after the microarray experiments were complete. STATH plays a key role in the development of the oral cavity biofilm by mediating adhesion of bacteria and was recently identified as being the most prominent protein in the saliva–air interface (31). It is known to have bacterial binding epitopes that promote the growth and adhesion in the oral cavity of some organisms (Porphyromonas gingivalis and Fusobacterium nucleatum) while inhibiting the growth of others (Peptostreptococci and S. aureus). Its antimicrobial effect on P. aeruginosa has not been investigated. Given that colonization with mucoid Pseudomonas is known to accelerate a decline in lung function in CF (3), a protein acting as a key determinant of bacterial adhesion in the oral cavity and upper airway is of significant interest.
ADIPOQ, a protein usually produced in adipocytes that potently inhibits inflammation and modulates insulin sensitivity, was also significantly upregulated in ΔF508 homozygotes with mild disease. ADIPOQ induces in leukocytes the production of the anti-inflammatory mediators IL-10 and IL-1 receptor antagonist (16). It also stimulates the release of IL-1β, IL-6, and TNF-α from adipocytes (34) and modulates energy metabolism and glucose sensitivity (17). Functional polymorphisms have been identified that influence circulating ADIPOQ levels (35). If the increased expression of ADIPOQ were due to the amount of adipose tissue alone, it would be expected that the well nourished, non-CF control subjects would have the highest levels, not individuals with mild CF. Functional polymorphisms leading to increased levels of ADIPOQ in CF would likely lead to milder disease by suppression of CF-related inflammation and improved utilization of nutritional intake. Alternatively, low levels of ADIPOQ in poorly nourished individuals with CF might contribute to the known relationship between poor nutritional status and decline in lung function.
ADIPOQ is classically identified as being produced only by adipocytes, although array studies have identified ADIPOQ expression in trachea, skin, adrenal gland, thymus, and thyroid (36). Separate RT-PCR studies confirmed the presence of ADIPOQ transcripts in CF and control nasal brushings, although further studies are needed to determine whether this expression is from respiratory epithelial cells or other cell types present in the sample.
Signal transducer and activator of transcription 1 (STAT1), represented by two probe sets, was the only gene identified as significantly decreased in expression in mild CF. In mucosal T cells, STAT1 is activated by CD2 receptors (37), which are also identified by the microarray data as being decreased in CF. In epithelial cells, STAT1 is essential for cellular antiviral defense and is central in activating the transcription of IFN-induced genes, particularly nitric oxide synthase-2 (38). STAT1 activates transcription by binding directly to regulatory DNA elements (38). STAT1-deficient mice display an absence of responsiveness to IFN and are highly sensitive to viral infection (39). It has previously been shown that STAT1 induction and activation are impaired in CF, and STAT1 has been proposed as a potential modifier of the CF phenotype (40). Although the microarray data corroborate the decrease of STAT1 in CF, it is unclear why this would be more apparent in individuals with mild CF, because low levels might be expected to result in greater susceptibility to infection. One possibility is the recently identified increased antiapoptotic effect of IFN in the absence of STAT1 (41). An alternative explanation is that STAT1 is usually low in all individuals with CF, but the studied severe group had more active inflammation leading to STAT1 induction.
Individuals with severe CF demonstrated the largest number of differentially expressed genes. A total of 569 genes demonstrated significant upregulation in individuals with severe CF lung disease compared with those with mild CF lung disease and non-CF control subjects. Although the respiratory epithelial cells sampled were not acutely infected or exposed to chronic purulence, the increased number of differentially expressed genes may not only reflect intrinsic differences but also cellular exposure to elevated serum levels of circulating inflammatory mediators present in individuals with severe CF lung disease.
Genes involved in the ubiquitin cycle, oxidoreductase activity, and lipid metabolism were those most strongly upregulated in severe CF. Of the 569 upregulated genes identified, nine were ubiquitin-activating and ubiquitin-conjugating enzymes. This strongly suggests a significant increase in the activity of the ubiquitin system in individuals with severe CF. Among the numerous upregulated ubiquitin cycle genes were the specific ubiquitin-activating enzyme UBA2 and its ubiquitin-like protein target NEDD8 (42). Also upregulated was ubiquitin-conjugating enzyme HIP2 (E2–25K), which by its covalent attachment of ubiquitin identifies proteins for intracellular proteolysis by the 26S proteasome (43). ΔF508-mutated CFTR is known to be targeted for degradation by the proteasome, and increased degradation would result in less partially functional CFTR reaching the epithelial cell apical membrane. Although functional polymorphisms in the identified ubiquitin-conjugating enzymes are not known, those leading to increased activity of the ubiquitination-proteasome pathway would almost certainly be detrimental to CFTR expression in CF and potentially modify the CF phenotype.
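Whether nine ubiquitin-cycle enzymes among 569 upregulated genes exceeds chance depends on how many such genes were assayed in total, which is not stated here. The following is a minimal sketch of the usual hypergeometric enrichment calculation; the background figures (20,000 annotated genes, 150 of them in the category) are purely hypothetical placeholders, and the function name is ours.

```python
from math import comb

def hypergeom_tail(k: int, n_drawn: int, n_category: int, n_total: int) -> float:
    """P(X >= k): chance of seeing at least k category genes when n_drawn
    genes are picked without replacement from n_total genes, of which
    n_category belong to the category."""
    upper = min(n_drawn, n_category)
    favorable = sum(
        comb(n_category, i) * comb(n_total - n_category, n_drawn - i)
        for i in range(k, upper + 1)
    )
    return favorable / comb(n_total, n_drawn)

if __name__ == "__main__":
    # Hypothetical background: 20,000 annotated genes, 150 in the category.
    p = hypergeom_tail(k=9, n_drawn=569, n_category=150, n_total=20_000)
    print(f"P(at least 9 of 569 by chance) = {p:.4f}")
```

With the true background counts from the array annotation substituted in, a small tail probability would support the reading that the ubiquitin-cycle genes are over-represented rather than randomly sampled.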
Six of the 569 upregulated genes in severe CF were subunits of NADH:ubiquinone oxidoreductase (complex I), the initial enzyme complex in the mitochondrial electron transport chain. Complex I catalyzes the reduction of ubiquinone by NADH (44). It also produces superoxide in the mitochondrial matrix, which is converted by superoxide dismutase into hydrogen peroxide (45). Abnormalities in NADH dehydrogenase have been identified in CF (46). Further investigation is required to determine whether the upregulation of complex I seen in individuals with severe CF is due to increased oxidative stress or is a primary contributor to pathology.
One challenge that we had to address in this study was the risk of type I error (false-positive differentially expressed genes) due to the multiple comparisons present in microarray analysis. There are several classic approaches to controlling for multiple comparisons; however, the normal physiologic variability and overlap between CF and non-CF gene expression in vivo, even in genes known to be differentially expressed, such as IL-8 (47), make the stringent P values required for significance by these classic methods nearly impossible to obtain. This was noted recently in a microarray study of CF versus non-CF epithelium grown in cell culture by Zabner and coworkers (22). Even when they analyzed numerous CF and non-CF epithelial cell samples grown under tightly controlled conditions, their initial correction for multiple comparisons resulted in 0 of 22,238 tested genes being identified as significantly differentially expressed.
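To make the stringency concrete, the sketch below computes the per-gene threshold implied by a Bonferroni correction and contrasts it with the Benjamini-Hochberg false discovery rate procedure. It is an illustration only: the text does not say which correction Zabner and coworkers applied, the significance level of 0.05 is an assumption, and the P values in the toy list are invented.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def benjamini_hochberg(p_values: list[float], q: float = 0.05) -> list[int]:
    """Indices of hypotheses rejected by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose P value sits under its own threshold.
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

if __name__ == "__main__":
    print(bonferroni_threshold(0.05, 22_238))
    toy_p = [0.0001, 0.004, 0.019, 0.03, 0.2, 0.6]  # invented values
    print(benjamini_hochberg(toy_p))
```

For 22,238 tests at an assumed level of 0.05, the Bonferroni cutoff is roughly 2.2 × 10⁻⁶ per gene, which shows why modest in vivo differences rarely survive it.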
Our goal was to use available statistical tools to minimize the possibility of type I error as much as possible without being so stringent that we would eliminate all leads to potentially important differentially expressed genes. First, we minimized variability before conducting ANOVA by using a two-component Rocke-Lorenzato model normalization, which corrects for the absolute error that dominates at low expression and the relative error present at high expression levels (8). Second, we used a deviation from median correction to minimize the effect of outliers. These adjustments resulted in a reduction of 70.1% in the number of genes identified as significantly differentially expressed compared with ANOVA alone. Next, we used gene ontology group analysis to further reduce type I error, because false-positive differentially expressed genes should be randomly distributed across ontology groups. Finally, we ensured that our key findings in individual genes were not due to false discovery by confirming the Statherin and Duox2 data by RT-PCR in a larger, separate CF population.
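The two-component model treats a measured intensity as a background offset plus a signal carrying multiplicative error plus additive error. A transform commonly paired with this model is the generalized logarithm, which behaves like a linear function near background and like an ordinary logarithm at high intensity. The sketch below assumes that form; the text does not give the exact normalization applied, and the parameters `alpha` (background offset) and `c` (ratio of additive to multiplicative error variance) are illustrative values, not estimates from this data set.

```python
import math

def glog(y: float, alpha: float, c: float) -> float:
    """Generalized log: ln((y - alpha) + sqrt((y - alpha)^2 + c)).

    Near background (y close to alpha) the additive-error term c dominates
    and the curve is roughly linear; far above background the result
    approaches ln(2 * (y - alpha)), i.e. an ordinary log scale.
    """
    z = y - alpha
    return math.log(z + math.sqrt(z * z + c))

if __name__ == "__main__":
    alpha, c = 50.0, 400.0  # hypothetical background and variance ratio
    for intensity in (40.0, 50.0, 100.0, 10_000.0):
        print(intensity, round(glog(intensity, alpha, c), 3))
```

Unlike a plain logarithm, the transform stays defined for intensities at or below the background estimate, which is where absolute error dominates.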
There are other potential limitations to using an array approach to identify candidate modifiers of CF phenotype. First is the lack of tight correlation between transcript levels and functional protein expression. All of the candidate genes identified here require further study at the protein level. Second is an inability to detect meaningful modifiers that occur in only a small percentage of the population. An example of this is the nonfunctional variant of the mannose-binding lectin gene that is present in 5–10% of the population and has been suggested as being associated with severe CF lung disease (48). The statistics of microarray analysis do not identify as significant a marked decrease in expression in only 5–10% of the samples. Finally, although using nasal respiratory epithelial cells for analysis is well accepted and provides some significant advantages, these cells may not fully represent the characteristics of lower airway cells.
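The dilution of a low-frequency modifier can be illustrated with simple arithmetic. The sketch below assumes a hypothetical group of 10 samples and a hypothetical fourfold (2 log2 units) decrease in carriers; neither figure comes from this study.

```python
def prob_no_carriers(freq: float, n_samples: int) -> float:
    """Chance that a group of n_samples contains no carrier of a variant
    present in a fraction freq of the population."""
    return (1.0 - freq) ** n_samples

def expected_mean_shift(freq: float, log2_change: float) -> float:
    """Expected shift in the group mean (log2 scale) when only a fraction
    freq of samples carries a change of log2_change."""
    return freq * log2_change

if __name__ == "__main__":
    for freq in (0.05, 0.10):
        print(freq, prob_no_carriers(freq, 10), expected_mean_shift(freq, -2.0))
```

Under these assumptions, a variant carried by 10% of the population would be absent from about a third of such groups, and when present would move the group mean by only about 0.2 log2 units, a difference easily lost in normal between-subject variability.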
Overall, the number of genes differentially expressed in the nasal respiratory epithelial cells of individuals with CF compared with non-CF control subjects is lower than might be expected. This is particularly true for cells from individuals with mild CF lung disease, with only 69 of the 44,670 assessed probe sets being clearly differentially expressed. This finding is consistent with the recent study by Zabner and colleagues of ΔF508 and non-CF respiratory primary cell cultures grown in sterile conditions for 60 d (22). They identified only 24 of 22,000 probed genes as differentially expressed. When in vivo nasal respiratory epithelial cells from individuals with severe CF lung disease are compared separately with non-CF control subjects, a larger number of differentially expressed genes is identified. Whether this is due to intrinsic cellular differences or to chronic exposure to circulating cytokines requires further investigation.
In summary, this study provides the first in vivo comparison of respiratory epithelial cell gene expression profiles in ΔF508 homozygotes with mild and severe CF lung disease and non-CF control subjects. Although the number of differentially expressed genes might be considered low, particularly in ΔF508 homozygotes with mild disease, specific candidate modifiers of interest involved in airway defense and inflammation are identified. ΔF508 homozygotes with severe disease demonstrate a unique expression profile, with significant upregulation of genes involved in protein ubiquitination and mitochondrial oxidoreductase activity. Further investigation of genes identified in this study may aid in the greater understanding of the molecular basis of variability in severity of CF lung disease.