Despite having identical cystic fibrosis transmembrane conductance regulator genotypes, individuals with ΔF508 homozygous cystic fibrosis (CF) demonstrate significant variability in severity of pulmonary disease. This investigation used high-density oligonucleotide microarray analysis of nasal respiratory epithelium to investigate the molecular basis of phenotypic differences in CF by (1) identifying differences in gene expression between ΔF508 homozygotes in the most severe 20th percentile of lung disease by forced expiratory volume in 1 s and those in the most mild 20th percentile of lung disease and (2) identifying differences in gene expression between ΔF508 homozygotes and age-matched non-CF control subjects. Microarray results from 23 participants (12 CF, 11 non-CF) met the strict quality control guidelines and were used for final data analysis. A total of 652 of the 11,867 genes identified as present in 75% of the samples were significantly differentially expressed in one of the three disease phenotypes: 30 in non-CF, 53 in mild CF, and 569 in severe CF. An analysis of genes differentially expressed by severity of CF lung disease demonstrated significant upregulation in severe CF of genes involved in protein ubiquitination (P < 0.04), mitochondrial oxidoreductase activity (P < 0.01), and lipid metabolism (P < 0.03). Analysis of genes with decreased expression in patients with CF compared with control subjects demonstrated significant downregulation of genes involved in airway defense (P < 0.047) and protein metabolism (P < 0.048). This study suggests that differences in CF lung phenotype are associated with differences in expression of genes involving airway defense, protein ubiquitination, and mitochondrial oxidoreductase activity and identifies specific new candidate modifiers of the CF phenotype.
Cystic fibrosis (CF) is the most common lethal autosomal recessive disorder in the white population. Although CF affects multiple organ systems, it is the severe lung disease that leads to the shortened life expectancy of 35.1 yr (1). The gene responsible for the disease, the cystic fibrosis transmembrane conductance regulator (CFTR), was first identified in 1989, and since then over 1,000 different mutations have been reported. ΔF508 is the most common mutation, and over 50% of individuals with CF are homozygous for the ΔF508/ΔF508 genotype (1).
Although certain aspects of CF phenotype, such as pancreatic insufficiency, are determined by CFTR genotype, most other aspects are not. Within the group of ΔF508 homozygotes, a full range in severity of pulmonary disease is seen, with some developing severe lung disease at an early age and others reaching adulthood with normal lung function (2). This observed variability has made it clear that the CFTR genotype is not the main determinant of severity of CF lung disease. The reasons for this variation in pulmonary phenotype in patients with identical CFTR mutations remain unclear. Certain infectious exposures, such as mucoid Pseudomonas aeruginosa and Burkholderia cepacia, along with nutritional factors, have been demonstrated to contribute to the variability in pulmonary phenotype (3, 4). Other environmental factors, including socioeconomic status, tobacco smoking, and marijuana abuse, have been identified as contributors. There is increasing recognition that genetic modifiers likely play a significant role in influencing pulmonary phenotype in CF (5). The European CF Twin and Sibling Study compared phenotypes for a cohort of 41 sets of twins with identical CFTR genotypes and found that monozygous twins had a significantly higher concordance in nutritional status and severity of lung disease than dizygous twins, suggesting that genetic factors besides the CFTR gene influence phenotype (6). A more in-depth understanding of the molecular basis of variability in CF lung disease is needed and would have major prognostic and therapeutic implications.
Several obstacles have made investigation into the molecular basis of this variability difficult. First has been the lack of an animal model of CF lung disease, making human studies necessary. Second has been the challenge of accurately identifying the most mild and severe pulmonary phenotypes while controlling for external confounders, such as B. cepacia, resistant P. aeruginosa, allergic bronchopulmonary aspergillosis (ABPA), and poor nutrition. Third has been the challenge of identifying molecular differences that lead to a variation in phenotype and are not just a response to chronic infection and inflammation.
The aim of this investigation was to use high-density oligonucleotide microarray studies of in vivo nasal respiratory epithelium to investigate the molecular basis of differences in the CF phenotype by (1) identifying differences in gene expression between ΔF508 homozygotes with very mild lung disease and those with very severe lung disease and (2) identifying differences in gene expression between ΔF508 homozygotes and non-CF control subjects.
We attempted to address each of the previously described obstacles by matching participants with CF for ΔF508 homozygosity and infection status and studying only those representing the most mild and severe 20th percentiles of CF lung disease based on forced expiratory volume in 1 sec (FEV1) criteria. We also chose to study nasal respiratory epithelium to permit analysis of respiratory epithelium from severe phenotype patients, which was less likely to be characterized only by response to chronic infection and inflammation.
The results of this study demonstrate that, with the exception of a few potentially interesting genes, there is less difference in nasal respiratory epithelial gene expression profiles between ΔF508 homozygotes with mild CF lung disease and non-CF control subjects than might be anticipated. In contrast, ΔF508 homozygotes with severe lung disease demonstrate upregulation of genes involved in protein ubiquitination and mitochondrial oxidoreductase activity. When comparing all ΔF508 homozygotes together with non-CF control subjects, the most significant difference in gene expression is a downregulation in CF of genes involved in airway defense and antigen presentation/immune response. The results also identify a small number of new potential modifiers of CF lung phenotype, including statherin (an antimicrobial peptide present in saliva and the upper airways), adiponectin (a potent anti-inflammatory cytokine and modulator of insulin sensitivity), and dual oxidase 2 (a key producer of hydrogen peroxide in the upper airway).
The study was conducted at the Johns Hopkins Cystic Fibrosis Center and was approved by the Johns Hopkins and Western Institutional Review Boards (protocol #1033335). Each participant voluntarily consented to participate. To be eligible for the study, individuals with CF had to be homozygous for the ΔF508 mutation and meet FEV1 criteria that placed them in the top or bottom 20th percentile for FEV1 for their genotype and age (Table 1). These criteria were based on work published by Schlucter and coworkers (7) identifying FEV1 cutoffs that stratify severity of lung disease in ΔF508 homozygotes. Individuals with ABPA, B. cepacia, atypical mycobacteria, methicillin-resistant Staphylococcus aureus, history of significant reactive airway disease, recent viral infection, or active CF exacerbation were excluded. Non-CF control subjects were recruited from age-matched healthy volunteers. If potential participants demonstrated obvious turbinate inflammation or hemorrhage on initial visual inspection, the brushing was rescheduled for a later date.
Bilateral nasal mucosal brushing to collect respiratory epithelium was performed on each subject with a Cytosoft cytology brush (Medical Packaging Corp., Camarillo, CA) under direct visualization of the inferior turbinate using a nasal telescope (Karl Storz Endoscopy America, Culver City, CA). The nasal mucosa was anesthetized with two sprays of 4% tetracaine. The cytology brush was then gently rotated on the mucosal surface of the inferior turbinate, removed from the nose, and agitated in a DNase/RNase-free microcentrifuge tube containing 1 ml of sterile, chilled PBS. After a 40-μl aliquot was set aside for cytologic evaluation, the sample was centrifuged, the supernatant was removed, and 500 μl of TRIzol Reagent (Invitrogen, Carlsbad, CA) was added to the tube. The sample was immediately snap frozen using liquid nitrogen and stored at −80°C.
A cytospin centrifuge was used to prepare respiratory epithelial samples for Papanicolaou staining. Slides were evaluated using light microscopy by a blinded cytopathologist, and cells were categorized as ciliated epithelial, squamous epithelial, or inflammatory. Results were expressed as the percentage of total cells.
Total RNA from each sample was extracted using RNeasy Mini Kits (Qiagen, Valencia, CA). Double-stranded cDNA was synthesized using SuperScript Choice System (Invitrogen) and used for in vitro transcription using ENZO BioArray HighYield RNA transcript labeling kits (Affymetrix, Santa Clara, CA) to produce biotin-labeled cRNA. The cRNA was purified by RNeasy kit (Qiagen) and fragmented randomly to ~ 50–100 bp (200 mM Tris-acetate [pH 8.2], 500 mM KOAc, and 150 mM MgOAc at 94°C for 35 min). Spike controls were added to fragmented cRNA before hybridization.
cRNA samples from each subject were separated into two aliquots and hybridized to Affymetrix Human Genome HG-U133 A and B microarrays for 16 h according to the protocols described in the GeneChip Expression Analysis Technical Manual (Affymetrix, Santa Clara, CA). After hybridization, each microarray was washed, stained, and amplified on the Affymetrix Fluidics Station 400 using a phycoerythrin–streptavidin stain and biotin-labeled, antistreptavidin antibody. Fluorescent images were read using the Hewlett-Packard G2500A Gene Array Scanner (Hewlett-Packard, Palo Alto, CA).
After scanning, array images were assessed by eye to confirm scanner alignment and the absence of significant bubbles or scratches. The 3′/5′ ratio for glyceraldehyde-3-phosphate dehydrogenase was confirmed to be < 4 for all chips used for analysis. The spike controls BioB, BioC, BioD, and CreX were identified as being present in increasing intensity. GeneChip initial expression analysis was performed using GCOS 1.0 software (Affymetrix).
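The two numeric quality control criteria above can be expressed as a simple per-chip check. The following is a minimal sketch, not the GCOS procedure itself: the function name, the argument layout, and the example intensities are hypothetical, and "present in increasing intensity" is interpreted here as strictly increasing signal in the order BioB, BioC, BioD, CreX.

```python
def passes_quality_control(gapdh_3prime, gapdh_5prime, spike_signals, max_ratio=4.0):
    """Return True if a chip meets both numeric QC criteria.

    gapdh_3prime, gapdh_5prime: signals of the 3' and 5' GAPDH probe sets.
    spike_signals: signals ordered BioB, BioC, BioD, CreX (assumed order).
    """
    ratio = gapdh_3prime / gapdh_5prime
    # Spike controls must rise monotonically in the assumed order.
    increasing = all(a < b for a, b in zip(spike_signals, spike_signals[1:]))
    return ratio < max_ratio and increasing


# Hypothetical chips: the first passes, the second has a degraded 5' signal.
chips = {
    "chip_ok": (1200.0, 800.0, [50.0, 150.0, 400.0, 1500.0]),
    "chip_degraded": (5000.0, 1000.0, [60.0, 140.0, 390.0, 1400.0]),
}
for name, (three_prime, five_prime, spikes) in chips.items():
    print(name, passes_quality_control(three_prime, five_prime, spikes))
```

Visual inspection of the scanned image for bubbles, scratches, and alignment is not captured by such a check and would remain a manual step.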
GeneChip expression data were exported to GeneSpring, where per chip normalization to the 50th percentile expression level and per gene normalization to the median expression intensity in all samples were performed. Only probe sets scored present or marginal in at least 75% of samples were included in the analysis. Data were transformed to log ratio for display and analysis. GeneSpring 7 (Silicon Genetics, Redwood City, CA) and S-Plus 6.2 (MathSoft, Cambridge, MA) software were used in data analysis and visualization. Significantly changing gene expression was set at P ≤ 0.05 using the GeneSpring t test statistic with cross-gene error model activated using the two-component Rocke-Lorenzato model of normalization (8). The deviation from median option was also used to reduce the effects of outliers. Genes were identified that were significantly differentially expressed in three disease phenotype categories: CF with mild lung disease, CF with severe lung disease, and individuals without CF. A secondary analysis identified genes differentially expressed by sex in the non-CF group to allow exclusion of genes that segregated by sex independent of CF phenotype. The majority of sex-specific genes were already excluded by the requirement that a probe set be present or marginal in at least 75% of the nasal respiratory epithelium samples to be included in the analysis.
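The filtering and normalization steps described above can be illustrated with a short sketch. This is not GeneSpring's implementation: the function names and the toy data are hypothetical, detection calls are assumed to be coded as "P" (present), "M" (marginal), and "A" (absent), log base 2 is an assumed choice for the log ratio, and the cross-gene error model is not reproduced.

```python
import math
from statistics import median


def filter_present(calls, min_fraction=0.75):
    """Keep probe sets called present or marginal in at least min_fraction of samples."""
    return [
        probe
        for probe, flags in calls.items()
        if sum(flag in ("P", "M") for flag in flags) / len(flags) >= min_fraction
    ]


def normalize(expr):
    """Per-chip then per-gene median normalization, returned as log2 ratios.

    expr maps each probe set to a list of positive signals, one per sample.
    """
    probes = list(expr)
    n_samples = len(expr[probes[0]])
    # Per chip: divide each signal by the median (50th percentile) signal of its chip.
    chip_medians = [median(expr[p][j] for p in probes) for j in range(n_samples)]
    per_chip = {p: [expr[p][j] / chip_medians[j] for j in range(n_samples)] for p in probes}
    # Per gene: divide by the gene's median across samples, then take the log ratio.
    normalized = {}
    for probe, values in per_chip.items():
        gene_median = median(values)
        normalized[probe] = [math.log2(v / gene_median) for v in values]
    return normalized


# Hypothetical detection calls and signals for three probe sets in four samples.
calls = {"probe_a": ["P", "P", "M", "A"], "probe_b": ["A", "A", "P", "P"], "probe_c": ["P"] * 4}
signals = {
    "probe_a": [100.0, 220.0, 390.0, 150.0],
    "probe_b": [10.0, 12.0, 35.0, 40.0],
    "probe_c": [50.0, 55.0, 48.0, 52.0],
}
kept = filter_present(calls)
log_ratios = normalize({probe: signals[probe] for probe in kept})
print(kept)
print(log_ratios)
```

In this sketch the chip median is computed over the probe sets that survive filtering; whether the original analysis normalized before or after filtering is not stated, so that ordering is an assumption.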
Probe sets identified as significantly differentially expressed by disease phenotype underwent an intensive search to identify biologic function. Probe set sequences from the Affymetrix web site were “blasted” against the University of California, Santa Cruz genome database to verify identity and update annotation. For individual genes with multiple probe set sequences specific to different regions of the gene, each probe set was checked separately. The resulting list of genes was submitted to PathwayAssist 3.0 (Stratagene, La Jolla, CA) for automated literature search. Gene Ontology classifications using GOMiner (9), conserved protein family domains, and reference literature were used to construct functional groupings of genes. GOMiner was then used to perform a two-sided Fisher's exact test to determine if a significantly greater number than expected of differentially expressed genes occurred in a category (8). All original array data images and files are available at http://pepr.cnmcresearch.org/browse.do?action=list_prj_expandprojectId=97. This site includes data on all arrays performed. The online supplement identifies which specific arrays were used for analysis and which had unacceptable glyceraldehyde-3-phosphate dehydrogenase ratios (see Tables E1 and E2 in the online supplement).
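The category enrichment test can be sketched as a two-sided Fisher's exact test on a 2 × 2 table of genes (differentially expressed or not, in the category or not). The code below is a generic implementation of that test, not GOMiner's code; the function name and the example counts are hypothetical, and the two-sided P value uses the common convention of summing the probabilities of all tables no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(changed_in, changed_out, unchanged_in, unchanged_out):
    """Two-sided Fisher's exact test for a 2 x 2 table.

    changed_in / changed_out: differentially expressed genes inside / outside the category.
    unchanged_in / unchanged_out: remaining genes inside / outside the category.
    """
    row1 = changed_in + changed_out      # all differentially expressed genes
    col1 = changed_in + unchanged_in     # all genes in the category
    total = row1 + unchanged_in + unchanged_out

    def prob(x):
        # Hypergeometric probability of x category genes among the changed genes.
        return comb(col1, x) * comb(total - col1, row1 - x) / comb(total, row1)

    lowest = max(0, row1 - (total - col1))
    highest = min(row1, col1)
    p_observed = prob(changed_in)
    # Sum every table at least as extreme (no more probable) than the observed one.
    p_value = sum(
        prob(x) for x in range(lowest, highest + 1) if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts: 6 of 30 changed genes fall in a category that contains
# 300 of the 11,867 expressed genes overall.
p = fisher_exact_two_sided(6, 24, 294, 11867 - 30 - 294)
print(f"P = {p:.3g}")
```

A small relative tolerance is applied when comparing table probabilities so that floating-point rounding does not exclude tables that are exactly as likely as the observed one.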
Changes in RNA levels of differentially expressed genes of interest were validated with quantitative RT-PCR using LightCycler real-time PCR (Roche, Indianapolis, IN). An aliquot of the nasal epithelium RNA was diluted to a concentration of 1 μg/μl, and cDNA synthesis was performed with 2 μg of RNA using the avian myeloblastosis virus (AMV) RT enzyme and Oligo-p(dT)15 primers included in the 1st Strand cDNA Synthesis Kit for RT-PCR (Roche). PCR amplification was performed on the LightCycler using DNA Master SYBR Green I kit and primers. Primers used to amplify STATH were 5′-GAAAAGGCAAGTATCCTGAAACAAA (forward) and 3′-TCCAGAACAACCACTATACCCACAA (reverse). Primers used to amplify DUOX2 were 5′-GCCCCTCTCTGCATCTACTG (forward) and 3′-GGGCAAGAGACTTTCAGTGC (reverse). γ-Actin and TATA box-binding protein (TBP) were amplified from the same samples at the same time (separate capillary reaction tubes) as internal controls. Primers used for amplification of the γ-actin and TBP genes were γ-actin: 5′-AAGCCACCGACTTGTCTTCC (forward), 3′-AGATCAAGATCATCGCACCC (reverse) and TBP: 5′-GAATATAATCCCAAGCGGTTTG (forward) and 3′-ACTTCACATCACAGCTCCCC (reverse). PCR product identity was confirmed by agarose gel electrophoresis and product sequencing.
A total of 48 individuals (30 with CF, 18 non-CF) underwent nasal brushing to acquire nasal respiratory epithelial cells. Eleven of the brushings (eight CF, three non-CF) did not provide RNA of sufficient quantity or quality after purification to permit microarray analysis. The remaining 37 samples were hybridized to Affymetrix Human Genome HG-U133 A and B microarrays. Twenty-three of the U133A arrays and 21 of the matching U133B arrays passed the stringent quality control guidelines described in Materials and Methods and were used for the final data analysis (12 CF, 11 non-CF). The characteristics of the 23 individuals used in the final data analysis are summarized in Table 2. All of the individuals with severe CF and all but two of the individuals with mild CF were infected with mucoid Pseudomonas.
To ensure that the cells collected and analyzed were predominantly respiratory epithelial cells, the nasal brushing smears were blindly read by the Johns Hopkins cytopathology department (Table 3). This analysis demonstrated an overall mean of 86.7 ± 7.1% respiratory epithelial cells (median, 90%; range, 73–98%). Other cell types identified were squamous (5.4 ± 5.6%) and inflammatory cells (7.8 ± 4.1%). There was not a significant difference in percentage of inflammatory cells between severe CF, mild CF, and non-CF samples, although the study was not powered to detect small differences. A mean of 6.6 ± 3.3% of cells were inflammatory in the non-CF group (n = 11), 9.0 ± 6.5% in the mild CF group (n = 5), and 8.9 ± 3.2% in the severe CF group (n = 7) (P = not significant).
A total of 11,867 of the 44,760 probe sets on the U133 A and B chips were identified as being present or marginal in at least 75% of the nasal respiratory epithelium samples. Using the GeneSpring t test with cross-gene error modeling to determine significance, 709 of the 11,867 were identified as significantly differentially expressed in one of the three disease phenotypes: 32 in non-CF, 69 in mild CF, and 608 in severe CF (Figure 1, supplemental file 1). Combining multiple probe sets for single genes and eliminating expressed sequence tags reduced the numbers to 30 differentially expressed genes in non-CF, 53 in mild CF, and 569 in severe CF.
K-means clustering analysis divided the differentially expressed probe sets into three main groups: genes with increased expression only in non-CF (Figure 1B), genes with increased expression only in mild CF (Figure 1C), and genes with increased expression only in severe CF (Figure 1D). There were few probe sets that demonstrated significantly decreased expression only in mild or severe CF. One notable exception was STAT1, which demonstrated significantly decreased expression in mild CF (Figure 1C).
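The grouping step can be illustrated with a minimal k-means sketch. The clustering settings used in GeneSpring (distance measure, initialization, number of iterations) are not given in the text, so this example assumes Euclidean distance, three clusters, and caller-supplied starting centroids; the function name and the expression profiles are hypothetical.

```python
def kmeans(profiles, centroids, max_iter=50):
    """Plain k-means on equal-length vectors; returns (labels, centroids)."""

    def squared_distance(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each profile joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in profiles
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(profiles, labels) if label == k]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical mean log ratios per probe set in (non-CF, mild CF, severe CF).
profiles = [
    [0.9, -0.4, -0.5], [1.1, -0.6, -0.4],   # higher only in non-CF
    [-0.3, 1.0, -0.5], [-0.4, 0.8, -0.6],   # higher only in mild CF
    [-0.5, -0.4, 1.2], [-0.6, -0.3, 0.9],   # higher only in severe CF
]
start = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
labels, centers = kmeans(profiles, start)
print(labels)
```

Supplying the starting centroids keeps the example deterministic; a production analysis would typically use repeated random initializations.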
The 30 genes with increased expression only in the non-CF group could alternatively be identified as genes with significantly decreased expression in mild and severe CF. These 30 genes therefore are those differentially expressed in nasal respiratory epithelium of individuals with CF compared with those without (Table 4). The genes most reduced in expression in CF were DUOX2 (2.5-fold) and CD2 (2.3-fold). DUOX2 is a key producer of H2O2 for airway mucosal defense and has previously been identified as potentially playing a role in CF (10, 11). An analysis by gene ontology (GO) categories of the list of all genes with decreased expression in CF demonstrated significant over-representation of those involving response to biotic stimulus (P < 0.047) and protein metabolism (P < 0.048). The biotic stimulus gene classes included those involved in defense response (P < 0.02), particularly the subset of genes involving antigen presentation (HLA-F, HLA-G; P < 0.0001) and lymphocyte differentiation (CD74, CD2; P < 0.003). Of the 11 downregulated genes in the GO protein metabolism category, calreticulin, a multifunctional ER protein chaperone, was of most potential interest. Transcript levels of CFTR were only marginally present on array, but RT-PCR revealed no detectable differences in CFTR transcript levels between ΔF508 homozygotes and non-CF control subjects.
There were 69 probe sets representing 53 genes that were significantly differentially expressed in individuals with mild CF lung disease compared with those with severe CF and non-CF control subjects. Fifty-two of the 53 genes demonstrated increased expression in mild CF. STAT1, an inducible transcription factor mediating response to IFN (12) and represented by two probe sets (Figure 1C), was the only gene significantly decreased in expression in mild CF. An analysis of the upregulated list demonstrated significant over-representation in several GO categories: lipid metabolism (P < 0.032), G protein-coupled receptors (P < 0.024), and ion transport (P < 0.03) (Table 5). In addition to the genes in these categories, two other genes of interest were found to be upregulated in individuals with mild CF lung disease: statherin (STATH) and adiponectin (ADIPOQ). STATH is a calcium-binding protein found in saliva and is primarily known for its regulation of calcium deposition (13). STATH is also well documented to have significant antibacterial properties (14) and is produced in submucosal glands of the nasal cavity and upper airway (15). ADIPOQ is a potent anti-inflammatory cytokine and inducer of IL-10 and is a modulator of insulin sensitivity (16, 17).
A total of 569 genes demonstrated significant upregulation in individuals with severe CF lung disease compared with those with mild disease and non-CF control subjects. Analysis of these genes by gene ontology categories revealed a striking over-representation of upregulated genes involved in oxidoreductase activity (P = 0.01), the ubiquitin cycle (P = 0.04), and lipid metabolism (P = 0.04) (Table 6). One particular cluster of upregulated oxidoreductase genes was that involved in NADH dehydrogenase:ubiquinone complex I, a mitochondrial complex essential for electron transfer (NDUFS1, NDUFS7, NDUFB3, NDUFB5, NDUFAB1, NDUFA3). Numerous ubiquitin-conjugating enzymes were also significantly upregulated in individuals with severe CF (UBE2A, UBE2B, UBE2E1, UBE2E3, FBXW2, HIP2, and NEDD8) along with two ubiquitin-activating enzymes (UBA2 and UBE1C). Other upregulated genes of interest in severe CF included glutamate-cysteine ligase, the rate-limiting enzyme of glutathione synthesis (18), and activating transcription factor 1, a transcription factor involved in increasing IL-8 inflammatory response (19). A full list of all genes differentially expressed in individuals with severe CF is available in the online supplement (Table E3).
The inflammatory chemokine IL-8 has previously been identified as being characteristically elevated in CF (20). IL-8 transcript levels were noted to be significantly elevated in the nasal epithelium of the patients with severe CF compared with non-CF control subjects (Figure 2). In contrast, a wide range of IL-8 transcript expression was exhibited in the patients with mild CF; although the two highest expression levels of IL-8 occurred in individuals with mild CF, the remainder had IL-8 levels similar to non-CF control subjects. Overall, the median IL-8 level in individuals with mild CF lung disease was not significantly different from that seen in non-CF (Figure 2).
Because of its potential importance, we sought to verify the significantly increased expression of STATH in mild CF by quantitative RT-PCR in the samples used for microarray analysis and in an additional 12 mild and severe patient samples collected after the microarray experiments were complete. This separate analysis confirmed a significant difference in STATH expression between those with mild CF lung disease (n = 12) and those with severe CF lung disease (n = 11) (Figure 3; Kruskal-Wallis rank sum test, P = 0.042).
Decreased expression of DUOX2 in individuals with CF was also confirmed by RT-PCR in an independent larger group. DUOX2 expression was significantly lower in individuals with CF (n = 22) compared with those without CF (n = 13) (Figure 4; Kruskal-Wallis rank sum test, P = 0.047).
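The Kruskal-Wallis rank sum test used for the RT-PCR comparisons can be sketched for the two-group case as follows. This is a generic textbook implementation, not the S-Plus routine; the function names and the expression values are hypothetical. The P value uses the chi-square approximation with one degree of freedom, which is why the sketch is restricted to two groups.

```python
import math


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_two_groups(x, y):
    """Return (H, P) for two independent samples, with tie correction."""
    pooled = list(x) + list(y)
    n = len(pooled)
    ranks = average_ranks(pooled)
    rank_sum_x = sum(ranks[: len(x)])
    rank_sum_y = sum(ranks[len(x):])
    h = 12 / (n * (n + 1)) * (rank_sum_x ** 2 / len(x) + rank_sum_y ** 2 / len(y)) - 3 * (n + 1)
    # Tie correction: 1 - sum(t^3 - t) / (N^3 - N) over groups of tied values.
    tie_sizes = [pooled.count(v) for v in set(pooled)]
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    if correction == 0:
        return 0.0, 1.0  # all values identical: no evidence of a difference
    h /= correction
    # Chi-square survival function with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(h / 2))
    return h, p_value


# Hypothetical relative expression values for two groups.
mild = [2.1, 3.4, 1.8, 4.0, 2.9, 3.1]
severe = [1.2, 0.9, 2.0, 1.5, 1.1]
h_stat, p = kruskal_wallis_two_groups(mild, severe)
print(f"H = {h_stat:.3f}, P = {p:.4f}")
```

With only two groups this test is equivalent to the Wilcoxon rank sum test under the same large-sample approximation; for small samples an exact P value would differ somewhat from the approximation shown here.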
Despite having identical CFTR genotypes, ΔF508 homozygous CF individuals demonstrate a full range of pulmonary disease. Although several environmental factors influencing severity of lung disease have been identified, there is growing evidence that genetic and molecular differences contribute to the significant variability seen in the CF phenotype. This study used microarray analysis of nasal respiratory epithelium to investigate the molecular basis of variability in CF phenotype by identifying differences in gene expression between ΔF508 homozygotes in the most severe 20th percentile of lung disease and those in the most mild 20th percentile and identifying differences in gene expression between ΔF508 homozygotes and age-matched non-CF control subjects. The results suggest that the most significant differences in gene expression between those with CF and those without include those involved with airway defense, antigen presentation, and protein metabolism. There are also differences in gene expression between those with mild CF lung disease and those with severe lung disease, even in nasal respiratory epithelium without evidence of significant differences in inflammation. These include differential expression of genes involving the ubiquitin cycle, oxidoreductase activity, and lipid metabolism.
A previous comparison of gene expression in CF and non-CF respiratory epithelium has been performed in mice and identified differential expression of multiple gene classes, including those involved in transcription, inflammation, intracellular trafficking, signal transduction, and ion transport (21). Because of the lack of pulmonary pathology in these CF mice, no analysis could be performed to evaluate differences in gene expression associated with severity of lung disease. Another recent study of ΔF508 and non-CF primary respiratory cell cultures grown in sterile conditions for 60 d suggested minimal differences in gene expression (22). To design our study evaluating differences in gene expression by severity of lung disease, we first needed to select samples from ΔF508 homozygotes in distinct contrasting phenotypic groups. A recent study by Schlucter and coworkers aided in selecting these groups by determining FEV1 criteria identifying those with the most mild 20% and most severe 20% for lung disease among ΔF508 homozygotes between the ages of 15 and 26 (7). Our investigation used these criteria as the basis for identifying “mild” and “severe” CF phenotypes after controlling for other known environmental influences, including ABPA, B. cepacia, and atypical mycobacteria.
One challenge to this study was identified from the outset: Could we identify differences in mild and severe CF respiratory epithelium that were not due just to response to chronic infection? This challenge was made evident in initial attempts to study lower airway cells from patients with severe CF by the amount of purulence in cell samples obtained by bronchoscopy in patients with CF just before lung transplantation. To address this concern, we studied gene expression in nasal ciliated respiratory epithelial cells, a commonly used surrogate in CF for lower airway respiratory cells. Just as in CF lower airway respiratory epithelium, a markedly decreased amount of CFTR reaches the apical surface membrane of nasal epithelial cells (23). Electrolyte transport characteristics have also been shown to be nearly identical (24). We tried to minimize the potential influence of active local infection by excluding individuals with symptoms of sinus or pulmonary exacerbation, with the hope that collected cells would be more likely to reflect intrinsic differences in gene expression not due solely to response to local infection.
Perhaps the most striking of the genes with decreased expression in CF is DUOX2, which had an average 2.5-fold lower expression in individuals with CF. DUOX2 is an NADPH oxidase that has recently been identified as the key producer of H2O2 in the airway and oral cavity and is an essential component of the lactoperoxidase (LPO) airway epithelium defense system (10). LPO uses H2O2 to oxidize the anion thiocyanate to hypothiocyanite, a strong antimicrobial agent that prevents the growth of bacteria and fungi in the airways (25). Individuals with chronic granulomatous disease, who are known to be deficient in the production of airway reactive oxygen species, have been noted to be susceptible to the gram-negative infections often characteristic of CF, such as B. cepacia (11). Although authors have previously noted these similarities and have commented on the need for further investigation (11, 26), these array findings provide the first in vivo support for a potential link between CF and abnormalities in the DUOX-LPO airway defense system.
One gene ontology category demonstrating clear decreased expression in CF was that involving genes responding to biotic stimulus. These genes included β-casein (CSN2), which demonstrated on average a 1.5-fold lower expression in individuals with CF, and a group of genes involving lymphocyte differentiation (CD2, CD74) and antigen presentation (HLA-F, HLA-G). CSN2 has been identified as inhibiting the growth of viruses and bacteria, particularly S. aureus, through its strong inhibition of cysteine proteases (27). CD2 is a cell surface adhesion protein expressed on nearly all T cells that mediates STAT involvement in the regulation of IFN-γ expression, particularly in mucosal T cells (28). Detection of differential expression of CD2 suggests that the 6–9% inflammatory cells present in the nasal samples made a contribution to the expression profiles. CD2 has previously been shown to be cleaved by polymorphonuclear leukocyte elastase and cathepsin G in patients with CF (29). IFN-γ dysregulation has repeatedly been proposed to play a role in CF pathophysiology, and these array findings suggest that decreased expression of CD2 in mucosal lymphocytes might contribute to IFN-γ dysregulation.
Also significantly decreased in CF was insulin-like growth factor binding protein-3 (IGFBP3), a protein known to be a key modulator of the effects of insulin-like growth factor. Serum levels of IGFBP3 have been repeatedly demonstrated to be decreased in individuals with CF and in some cases correlate with lung function and nutritional status (30). Microarray analysis of CFTR null mice lung tissue has also previously demonstrated a significant decrease in insulin-like growth factor binding proteins (21). Although decreases in IGFBP3 can be related to chronic malnutrition, the significant decrease even in well nourished individuals with mild CF suggests an association between the loss of CFTR function and the decrease in insulin-like growth factor binding proteins.
Six of the 30 genes downregulated in CF were involved in lipid metabolism (PIGB, PIGF, PITPNB, SC4MOL, SLC27A2, and UGCG) and are located in the endoplasmic reticulum according to GO. None of these genes has been identified as potentially involved in CF, although abnormalities in lipid metabolism in CF are well known. Despite the large number of differentially expressed genes located in the endoplasmic reticulum, there was no indication of an ER overload response.
The inflammatory chemokine IL-8 has been identified as being characteristically elevated in CF (20). Although IL-8 expression was significantly elevated in the severe CF group compared with non-CF control subjects, there was much more variability in IL-8 expression in the mild CF group. The highest absolute IL-8 values in the study group were found in two individuals with mild CF lung disease and minimal sinus disease. This suggests that although IL-8 levels are usually elevated in CF, marked elevation is not necessarily associated with more aggressive disease. Alternatively, CF nasal respiratory epithelial cells may demonstrate IL-8 characteristics independent of lower airway respiratory epithelium.
There were no differences in CFTR transcript levels between ΔF508 homozygotes and non-CF control subjects. This is not surprising given that it is thought that ΔF508 homozygotes develop CF pathology based on abnormalities at the protein trafficking and function level. This finding is also consistent with the recent results of Zabner and colleagues (22).
Some of the most striking differences in gene expression were seen when we analyzed the results of specific phenotypes within CF. There were 53 genes that were significantly differentially expressed in individuals with mild CF lung disease compared with individuals with severe CF and non-CF control subjects. The GO categories most represented included lipid metabolism, airway defense, G protein-coupled receptors, and ion transport. Among these genes, several strong candidate modifiers of CF phenotype were apparent.
STATH, a calcium-binding, 43-amino-acid phosphopeptide known to have antibacterial properties, is found in saliva, nasal secretions, and the upper airway. STATH was clearly upregulated in individuals with mild CF lung disease. This increased expression was confirmed in an additional 12 mild and severe patient samples collected after the microarray experiments were complete. STATH plays a key role in the development of the oral cavity biofilm by mediating adhesion of bacteria and was recently identified as being the most prominent protein in the saliva–air interface (31). It is known to have bacterial binding epitopes that promote the growth and adhesion in the oral cavity of some organisms (Porphyromonas gingivalis and Fusobacterium nucleatum) while inhibiting the growth of others (Peptostreptococci and S. aureus) (14, 32, 33). Its antimicrobial effect on P. aeruginosa has not been investigated. Given that colonization with mucoid Pseudomonas is known to accelerate a decline in lung function in CF (3), a protein acting as a key determinant of bacterial adhesion in the oral cavity and upper airway is of significant interest.
ADIPOQ, a protein usually produced in adipocytes that potently inhibits inflammation and modulates insulin sensitivity, was also significantly upregulated in ΔF508 homozygotes with mild disease. ADIPOQ induces in leukocytes the production of the anti-inflammatory mediators IL-10 and IL-1 receptor antagonist (16). It also stimulates the release of IL-1β, IL-6, and TNF-α from adipocytes (34) and modulates energy metabolism and glucose sensitivity (17). Functional polymorphisms have been identified that influence circulating ADIPOQ levels (35). If the increased expression of ADIPOQ were due to the amount of adipose tissue alone, the well-nourished non-CF control subjects would be expected to have the highest levels, not individuals with mild CF. Functional polymorphisms leading to increased levels of ADIPOQ in CF would likely lead to milder disease by suppressing CF-related inflammation and improving utilization of nutritional intake. Alternatively, low levels of ADIPOQ in poorly nourished individuals with CF might contribute to the known relationship between poor nutritional status and decline in lung function.
ADIPOQ is classically identified as being produced only by adipocytes, although array studies have identified ADIPOQ expression in trachea, skin, adrenal gland, thymus, and thyroid (36). Separate RT-PCR studies confirmed the presence of ADIPOQ transcripts in CF and control nasal brushings, although further studies are needed to determine whether this expression is from respiratory epithelial cells or other cell types present in the sample.
Signal transducer and activator of transcription 1 (STAT1), represented by two probe sets, was the only gene whose expression was significantly decreased in mild CF. In mucosal T cells, STAT1 is activated by CD2 receptors (37), which are also identified by the microarray data as being decreased in CF. In epithelial cells, STAT1 is essential for cellular antiviral defense and is central in activating the transcription of IFN-induced genes, particularly nitric oxide synthase-2 (38). STAT1 activates transcription by binding directly to regulatory DNA elements (38). STAT1-deficient mice display an absence of responsiveness to IFN and are highly sensitive to viral infection (39). It has previously been shown that STAT1 induction and activation are impaired in CF, and STAT1 has been proposed as a potential modifier of the CF phenotype (40). Although the microarray data corroborate the decrease of STAT1 in CF, it is unclear why this would be more apparent in individuals with mild CF, because low levels might be expected to result in more susceptibility to infection. One possibility is the recently identified increased antiapoptotic effect of IFN in the absence of STAT1 (41). An alternative explanation is that STAT1 is usually low in all CF, but the studied severe group had more active inflammation leading to STAT1 induction.
Individuals with severe CF demonstrated the largest number of differentially expressed genes. A total of 569 genes demonstrated significant upregulation in individuals with severe CF lung disease compared with those with mild CF lung disease and non-CF control subjects. Although the respiratory epithelial cells sampled were not acutely infected or exposed to chronic purulence, the increased number of differentially expressed genes may reflect not only intrinsic differences but also cellular exposure to elevated serum levels of circulating inflammatory mediators present in individuals with severe CF lung disease.
Genes involved in the ubiquitin cycle, oxidoreductase activity, and lipid metabolism were those most strongly upregulated in severe CF. Of the 569 upregulated genes identified, nine encode ubiquitin-activating or ubiquitin-conjugating enzymes. This strongly suggests a significant increase in the activity of the ubiquitin system in individuals with severe CF. Among the numerous upregulated ubiquitin cycle genes were the specific ubiquitin-activating enzyme UBA2 and its ubiquitin-like protein target NEDD8 (42). Also upregulated was ubiquitin-conjugating enzyme HIP2 (E2–25K), which by its covalent attachment of ubiquitin identifies proteins for intracellular proteolysis by the 26S proteasome (43). ΔF508-mutated CFTR is known to be targeted for degradation by the proteasome, and increased degradation would result in less partially functional CFTR reaching the epithelial cell apical membrane. Although functional polymorphisms in the identified ubiquitin-conjugating enzymes are not known, those leading to increased activity of the ubiquitination-proteasome pathway would almost certainly be detrimental to CFTR expression in CF and potentially modify the CF phenotype.
Six of the 569 genes upregulated in severe CF encode subunits of NADH:ubiquinone oxidoreductase (complex I). Complex I catalyzes the first step in the mitochondrial respiratory electron transport chain, the reduction of ubiquinone by NADH (44). It also produces superoxide in the mitochondrial matrix, which is converted by superoxide dismutase into hydrogen peroxide (45). Abnormalities in NADH dehydrogenase have been identified in CF (46). Further investigation is required to determine whether the upregulation of complex I seen in individuals with severe CF is due to increased oxidative stress or is a primary contributor to pathology.
One challenge that we had to address in this study was the risk of type I error (false-positive differentially expressed genes) due to the multiple comparisons present in microarray analysis. There are several classic approaches to controlling for multiple comparisons; however, the normal physiologic variability and overlap between CF and non-CF gene expression in vivo, even in genes known to be differentially expressed, such as IL-8 (47), make the stringent P values required for significance by these classic methods nearly impossible to obtain. This was noted recently in a microarray study of CF versus non-CF epithelium grown in cell culture by Zabner and coworkers (22). Even when they analyzed numerous CF and non-CF epithelial cell samples grown under tightly controlled conditions, their initial correction for multiple comparisons resulted in 0 of 22,238 tested genes being identified as significantly differentially expressed.
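To illustrate how stringent such corrections become at this scale, the following minimal Python sketch computes a Bonferroni cutoff and applies the Benjamini-Hochberg step-up procedure. The text does not state which correction Zabner and coworkers applied, so both are shown only as representative classic approaches; the function names and the 0.05 level are assumptions chosen for illustration.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def benjamini_hochberg(p_values: list[float], q: float = 0.05) -> list[bool]:
    """Return one reject flag per P value under the BH step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose ordered P value satisfies p_(k) <= (k / m) * q.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


N_GENES = 22_238  # number of genes tested in the cited cell-culture study
ALPHA = 0.05      # assumed significance level (not stated in the text)

cutoff = bonferroni_threshold(ALPHA, N_GENES)
# Expected false positives at an uncorrected alpha if no gene truly differed.
expected_false_positives = ALPHA * N_GENES
```

Under these assumptions, the Bonferroni cutoff is about 2.2 × 10^-6 per gene, whereas an uncorrected threshold of 0.05 would be expected to yield roughly 1,100 false positives among 22,238 genes if none truly differed. This is the trade-off between stringency and lost leads that the paragraph above describes.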
Our goal was to use available statistical tools to minimize the possibility of type I error as much as possible without being so stringent that we would eliminate all leads to potentially important differentially expressed genes. We did this by first minimizing variability before conducting ANOVA by using a two-component Rocke-Lorenzato model normalization, which corrects for the absolute error that dominates at low expression and the relative errors present at high expression levels (8). Second, we used a deviation from median correction to minimize the effect of outliers. These adjustments resulted in a reduction of 70.1% in the number of genes identified as significantly differentially expressed compared with ANOVA alone. Next, we used gene ontology group analysis to further reduce type I error, because false-positive differentially expressed genes should be randomly distributed across ontology groups. Finally, we ensured that our key findings in individual genes were not due to false discovery by confirming the statherin (STATH) and DUOX2 data by RT-PCR in a larger, separate CF population.
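The reasoning behind the ontology step can be sketched with a hypergeometric test: if false positives fall at random among the 11,867 expressed genes, the number landing in any one ontology group follows a hypergeometric distribution, so a marked excess in a single group is unlikely to arise by chance. The Python sketch below illustrates that assumption and is not the procedure used in this study. The counts 11,867, 569, and 9 are taken from the text, whereas the category size of 100 genes is hypothetical.

```python
from math import comb


def hypergeom_sf(k: int, population: int, in_category: int, drawn: int) -> float:
    """P(X >= k): chance that at least k of `drawn` genes fall in a category
    holding `in_category` of `population` genes, under random assignment."""
    upper = min(in_category, drawn)
    total = comb(population, drawn)
    favorable = sum(
        comb(in_category, i) * comb(population - in_category, drawn - i)
        for i in range(k, upper + 1)
    )
    return favorable / total


EXPRESSED_GENES = 11_867   # genes called present in 75% of samples (from the text)
UPREGULATED = 569          # genes upregulated in severe CF (from the text)
OBSERVED_IN_CATEGORY = 9   # upregulated ubiquitin enzymes (from the text)
CATEGORY_SIZE = 100        # HYPOTHETICAL number of expressed genes in the category

# Count expected in the category if upregulated genes were scattered at random.
expected_by_chance = UPREGULATED * CATEGORY_SIZE / EXPRESSED_GENES

# Probability of seeing at least the observed count under that null model.
p_enrichment = hypergeom_sf(
    OBSERVED_IN_CATEGORY, EXPRESSED_GENES, CATEGORY_SIZE, UPREGULATED
)
```

A small tail probability indicates that the category holds more upregulated genes than random scatter would predict. The result depends strongly on the true category size, which would have to be taken from the ontology annotation rather than assumed.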
There are other potential limitations to using an array approach to identify candidate modifiers of CF phenotype. First is the lack of tight correlation between transcript levels and functional protein expression. All of the candidate genes identified here require further study at the protein level. Second is an inability to detect meaningful modifiers that occur in only a small percentage of the population. An example of this is the nonfunctional variant of the mannose-binding lectin gene that is present in 5–10% of the population and has been suggested as being associated with severe CF lung disease (48). The statistics of microarray analysis do not identify as significant a marked decrease in expression in only 5–10% of the samples. Finally, although using nasal respiratory epithelial cells for analysis is well accepted and provides some significant advantages, these cells may not fully represent the characteristics of lower airway cells.
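The second limitation can be made concrete with a small worked example. The Python sketch below computes Welch's t statistic for two invented data sets with the same mean difference from controls: one in which a single sample out of 12 carries a marked decrease, and one in which all 12 samples share a small decrease. All expression values are hypothetical log2 intensities chosen for illustration, and Welch's t is an assumed stand-in for the ANOVA-based analysis described above.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a: list[float], b: list[float]) -> float:
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical log2 expression values, invented for illustration only.
controls = [9.6, 9.7, 9.8, 9.9, 10.0, 10.0, 10.0, 10.1, 10.2, 10.3, 10.4]

# Case 1: 1 of 12 CF samples (about 8%) carries a marked, 8-fold decrease.
cf_rare = controls + [7.0]

# Case 2: the same mean difference (0.25 log2 units) shared by all 12 samples.
cf_uniform = [9.35, 9.45, 9.55, 9.65, 9.75, 9.75,
              9.75, 9.75, 9.85, 9.95, 10.05, 10.15]

t_rare = welch_t(cf_rare, controls)
t_uniform = welch_t(cf_uniform, controls)
```

With these invented values, the t statistic is about -0.9 when the decrease is confined to one sample and about -2.5 when it is shared by all samples. The single low value inflates the group variance far more than it shifts the group mean, so a large effect in a small subset is diluted below any conventional significance threshold.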
Overall, the number of genes differentially expressed in the nasal respiratory epithelial cells of individuals with CF compared with non-CF control subjects is lower than might be expected. This is particularly true for cells from individuals with mild CF lung disease, with only 69 of the 44,670 assessed probe sets being clearly differentially expressed. This finding is consistent with the recent study by Zabner and colleagues of ΔF508 and non-CF respiratory primary cell cultures grown in sterile conditions for 60 d (22). They identified only 24 of 22,000 probed genes to be differentially expressed. When in vivo nasal respiratory epithelial cells from individuals with severe CF lung disease are compared separately with those from non-CF control subjects, a larger number of differentially expressed genes is identified. Whether this is due to intrinsic cellular differences or to chronic exposure to circulating cytokines requires further investigation.
In summary, this study provides the first in vivo comparison of respiratory epithelial cell gene expression profiles in ΔF508 homozygotes with mild and severe CF lung disease and non-CF control subjects. Although the number of differentially expressed genes might be considered low, particularly in ΔF508 homozygotes with mild disease, specific candidate modifiers of interest involved in airway defense and inflammation are identified. ΔF508 homozygotes with severe disease demonstrate a unique expression profile, with significant upregulation of genes involved in protein ubiquitination and mitochondrial oxidoreductase activity. Further investigation of the genes identified in this study may aid understanding of the molecular basis of variability in severity of CF lung disease.
The authors acknowledge the work of the Children's National Medical Microarray Center in processing the samples.
This work was supported by grants K23-HL071847, U01-HL66618, R025-CR02, CFFMerlo00Q0, R01-HL68927, and BAA HL 02–04 from the NHLBI and the Cystic Fibrosis Foundation.
This article has an online supplement, which is accessible from this issue's table of contents at www.atsjournals.org
Originally Published in Press as DOI: 10.1165/rcmb.2005-0359OC on April 13, 2006
Conflict of Interest Statement: None of the authors has a financial relationship with a commercial entity that has an interest in the subject of this manuscript.