Am J Respir Crit Care Med. 2011 November 15; 184(10): 1153–1163.
Published online 2011 November 15. doi:  10.1164/rccm.201106-1143OC
PMCID: PMC3262024

Sarcoidosis Blood Transcriptome Reflects Lung Inflammation and Overlaps with Tuberculosis


Rationale: Sarcoidosis is a granulomatous disease of unknown etiology, although M. tuberculosis may play a role in the pathogenesis. The traditional view holds that inflammation in sarcoidosis is compartmentalized to involved organs.

Objectives: To determine whether whole blood gene expression signatures reflect inflammatory pathways in the lung in sarcoidosis and whether these signatures overlap with tuberculosis.

Methods: We analyzed transcriptomic data from blood and lung biopsies in sarcoidosis and compared these profiles with blood transcriptomic data from tuberculosis and other diseases.

Measurements and Main Results: Applying machine learning algorithms to blood gene expression data, we built a classifier that distinguished sarcoidosis from health in derivation and validation cohorts (92% sensitivity, 92% specificity). The most discriminative genes were confirmed by quantitative PCR and correlated with disease severity. Transcript profiles significantly induced in blood overlapped with those in lung biopsies and identified shared dominant inflammatory pathways (e.g., Type-I/II interferons). Sarcoidosis and tuberculosis shared more overlap in blood gene expression compared with other diseases using the 86-gene signature reported to be specific for tuberculosis and the sarcoidosis signature presented herein, although reapplication of machine learning algorithms could identify genes specific for sarcoidosis.

Conclusions: These data indicate that blood transcriptome analysis provides a noninvasive method for identifying inflammatory pathways in sarcoidosis, that these pathways may be leveraged to complement more invasive procedures for diagnosis or assessment of disease severity, and that sarcoidosis and tuberculosis share overlap in gene regulation of specific inflammatory pathways.

Keywords: gene expression profiling, interferons, algorithms, computational biology

At a Glance Commentary

Current Scientific Knowledge on the Subject

Sarcoidosis is a granulomatous disease of unknown etiology, although M. tuberculosis may play a role in the pathogenesis. The traditional view holds that inflammation in sarcoidosis is compartmentalized to involved organs.

What This Study Adds to the Field

In this study, we provide genomic evidence that challenges current thinking about the compartmentalization of inflammation in sarcoidosis and demonstrate the significant overlap in transcriptional profiles between sarcoidosis and tuberculosis, which may have important implications for sarcoidosis etiology.

Sarcoidosis is a systemic inflammatory granulomatous disease. Regarding the immunology of sarcoidosis, studies have shown discordance between immune activation in the circulation compared with that in organs manifesting granulomatous inflammation. Specifically, many patients have anergy to skin tests (1–4) and impairment of proliferative responses by blood mononuclear cells in vitro (4–7). Furthermore, studies have described profound T-cell activation in the lung compared with T cells from the blood (4, 8–21), leading to the conclusion that activation of T cells occurs locally (i.e., in the diseased organs). The concept that inflammatory responses are “compartmentalized” in sarcoidosis has led to the persistent belief that studies of circulating immune cells do not reflect the types of inflammatory pathways activated in diseased organs.

Two major exposures have been considered to play roles in the pathogenesis of sarcoidosis: microbial organisms and noninfectious environmental agents (inorganic or organic) (22–24). Due to the histologic similarities between tuberculosis and sarcoidosis, there have been exhaustive attempts to isolate mycobacterial organisms in tissue samples from patients with sarcoidosis, yet intact mycobacteria have not been identified conclusively. More recent work using PCR- or immune assay–based techniques has increasingly found evidence for a link between M. tuberculosis (TB) and sarcoidosis by identifying mycobacterial nucleic acids, insoluble mycobacterial proteins, and specific T-cell immune responses to mycobacterial peptides in many subjects with sarcoidosis (23, 25–28). Taken together, these data suggest a role for measurable immune responses to components of mycobacteria, rather than an infection per se, in a certain percentage of individuals who develop sarcoidosis.

A recent genomic study of subjects with active pulmonary TB (PTB) demonstrated a robust peripheral blood transcriptional signature compared with control subjects and those with latent TB (LTB) (29). This study further showed that the TB signature tracked with extent of disease on chest radiograph and reverted to that of control subjects after antimicrobial treatment. Given the evidence suggesting that mycobacterial product(s) may drive abnormal immune responses in sarcoidosis and the observation that PTB has a robust peripheral blood signature, we proposed two hypotheses: (1) that the transcriptional signature of blood would be sufficiently robust to distinguish patients with sarcoidosis from control subjects and identify inflammatory pathways that reflect granulomatous inflammation in involved tissues and (2) that there would be significant overlap in the blood transcriptional signature from subjects with sarcoidosis and active PTB. Our results challenge the traditional view that sarcoidosis is a “compartmentalized” disease of organs by showing that transcript profiles in the blood can recapitulate inflammatory pathways in the diseased lung. In addition, data presented herein justify future studies to identify blood biomarkers of disease progression and phenotype in sarcoidosis. Some of the results of these studies have been previously reported in the form of an abstract (30).


University of California, San Francisco Subjects: Derivation Data Set

The blood gene expression derivation set included 38 patients with sarcoidosis and 20 healthy control subjects enrolled prospectively at University of California, San Francisco (UCSF) (Table 1). The diagnosis of sarcoidosis was based upon established criteria (31). Subjects with other concurrent systemic inflammatory conditions were excluded. Control subjects without lung disease or other significant medical conditions were recruited from the community through advertisements. The study was approved by the UCSF Committee on Human Research.


Oregon Health Sciences University Subjects: Validation Data Set

The blood gene expression validation set comprised publicly available data (GEO - GSE18781, from Oregon Health Sciences University [32]) (see Table E1 in the online supplement). Review of the publication associated with this dataset indicates that all patients with sarcoidosis had symmetric hilar adenopathy as judged by chest radiograph or by CT scan. Six patients also had biopsy-proven granulomas. An additional five subjects were diagnosed based on coexisting uveitis on dilated examination because the combination of symmetric hilar adenopathy and uveitis has been considered to be specific for a diagnosis of sarcoidosis (33). One subject was diagnosed on the basis of symmetric hilar adenopathy without uveitis or tissue confirmation (sample ID GSM465961), and analyses were performed with and without this subject in the validation set to assess the impact of this subject. No subjects were taking immunosuppressive medications. Control subjects were recruited from an ophthalmology clinic for routine eye care and had no evidence of uveitis.

TB Cohorts

Blood gene expression datasets from tuberculosis (TB) cohorts consisted of subjects with active PTB or LTB before treatment with antimicrobials and were obtained from publicly available data published by Berry and colleagues (29) (Figure 1). The groups consisted of 34 patients with PTB, 38 patients with LTB, and 24 control subjects from two London, United Kingdom cohorts (GEO accession numbers GSE19439 and GSE19444). These subjects were over 18 years of age and HIV negative, and all PTB subjects had laboratory isolation of TB on mycobacterial culture of a respiratory specimen (sputum or bronchoalveolar lavage fluid). LTB was defined by a positive tuberculin skin test or a positive TB antigen–specific IFN-γ release assay. Control subjects were negative for both assays.

Figure 1.
Flow diagram of bioinformatic analyses presented in the study. (A) Overview of random forest classifier development using derivation and validation blood gene expression datasets. (B) Overview of the analyses used to compare gene expression in blood versus ...

Additional Cohorts Representing Infectious Disease and Th1-Type Inflammation

Additional blood gene expression datasets from disease cohorts analyzed in this study included six subjects with hypersensitivity pneumonitis from UCSF (Figure 1 and Table E2) as well as data from cohorts that were publicly available and published by Chaussabel and colleagues (34). This dataset (GEO accession number GSE22098) included 28 adult subjects with systemic lupus erythematosus (SLE) (with 17 matched control subjects), 49 pediatric SLE subjects (with 19 matched control subjects), 23 subjects with group A Streptococcus infection, and 40 subjects with Staphylococcus aureus infection (together sharing a pool of 12 matched control subjects) (Figure 1).

Lung Biopsy Cohort

We analyzed lung biopsy gene expression datasets from six patients with active pulmonary sarcoidosis and six control subjects using publicly available data from Ohio (35) (GEO accession number GSE16538) (Figure 1). According to the clinical details reported for these subjects, lung samples from the subjects with sarcoidosis contained well-formed, nonnecrotizing epithelioid granulomas in the absence of identifiable infection or foreign bodies. Control subjects had normal lung histology (35). We also analyzed lung biopsy gene expression datasets from eight subjects with “nodular self-limiting” sarcoidosis and seven subjects with “progressive-fibrotic” sarcoidosis (36). These subjects had “active” disease at the time of biopsy and were not on inhaled or oral immunosuppression, and subjects in the progressive-fibrotic group had abnormal spirometry or DlCO (36). This study did not include healthy control subjects.

Gene Expression Analyses

For the blood gene expression derivation set, we isolated whole blood RNA using PAXgene RNA tubes (Becton Dickinson, Franklin Lakes, NJ) and isolation kits (PAXgene Blood RNA Kit; Qiagen, Valencia, CA). After RNA quality assessment (Agilent Bioanalyzer 2100; Agilent Technologies, Santa Clara, CA), hybridizations were performed at the Gladstone Institute Microarray Core Facility using Affymetrix U133 Plus 2.0 microarrays (Affymetrix Inc., Santa Clara, CA). Methods for background adjustment, normalization and probe summarization, and data processing are provided in the online supplement.

Development of the Classification Algorithm

To develop a classification algorithm (patients with sarcoidosis versus control subjects) based on blood gene expression, we initially applied three different machine learning algorithms to the derivation set—random forest (37), elastic net (38), and shrunken centroids (39)—and performed subsequent analyses with random forest (R/Bioconductor) (40) because it performed best on cross-validation and provided a ranking of the genes important in classification. The Oregon blood gene expression dataset provided external validation for receiver operator characteristic (ROC) curve analyses. Additional details are provided in the online supplement.

Comparison of Blood Transcriptome with Lung Tissue

To understand how blood transcript profiles relate to those found in granulomatous tissues, we analyzed the overlap in differentially expressed genes in blood (UCSF) and lung tissue (Ohio, GSE16538) (35) from patients with sarcoidosis as compared with healthy control subjects (Figure 1). In these analyses, we identified genes concordantly up- and down-regulated in blood and lung. To understand whether any genes in our peripheral blood transcript profile relate to fibrotic lung disease in sarcoidosis, we compared our differentially expressed genes with previously published lung biopsy data from Lockstone and colleagues (36) (Figure 1). That study identified genes that were up- or down-regulated in lung tissue from subjects with “progressive-fibrotic” as compared with “nodular self-limiting” sarcoidosis (no healthy control group was used). We identified the overlap between genes differentially expressed in the UCSF blood dataset (patients with sarcoidosis compared with control subjects, false discovery rate [FDR] < 0.05 using limma) and genes reported as differentially expressed in lung from the “progressive-fibrotic” group in the original paper (progressive-fibrotic sarcoidosis compared with nodular self-limiting [36]).

Cross-Platform Comparisons Using Blood Gene Expression Datasets

We assembled a meta-dataset consisting of the UCSF subjects (GSE19314), the Oregon dataset (GSE18781), and the other disease control subjects described above (29, 34) using normalization approaches described in detail in the online supplement. To examine overlap between groups, we constructed heat maps and principal component analysis (PCA) plots using two prespecified gene sets: (1) the 50 most discriminative genes in the sarcoidosis classifier and (2) the 86-gene “TB-signature” identified by Berry and colleagues (29). To identify genes potentially useful in distinguishing sarcoidosis from PTB, we used a dataset that merged these two groups, trained a second random forest classifier (details are provided in the online supplement), and constructed a heat map and PCA plot using the 50 most discriminative genes. A flow diagram outlining the origin of all datasets and a summary of the bioinformatic analyses performed is provided in Figure 1.

Real-Time PCR Confirmation of Selected Genes

Two-step, real-time quantitative PCR was performed as described previously (41, 42) using primers and probes presented in Table E3.

Statistical Analysis

Comparisons of age, sex, and clinical measures were performed using the Welch two-sample t test and Fisher's exact test as indicated. Classification was performed using random forests, and ROC curves were developed as described above. These analyses and PCA were performed using R/Bioconductor (43). Pathway analyses were performed using Ingenuity Pathway Analysis with Canonical Pathways. Comparisons for the expression patterns of the top 10 discriminative genes were made using a pairwise t test with Benjamini and Hochberg's false-discovery rate q values to correct for multiple comparisons (44) (P < 0.05 taken as statistically significant).
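The Benjamini and Hochberg correction mentioned above can be illustrated with a minimal sketch. The study's analyses were run in R/Bioconductor; the Python function below, its name, and the example P values are illustrative assumptions and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q values in the same order as the input P values."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that q values never increase
    # as the P values get smaller.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical P values, NOT study data.
example_p = [0.001, 0.008, 0.039, 0.041, 0.20]
example_q = benjamini_hochberg(example_p)
significant = [q < 0.05 for q in example_q]
```

In this toy input, two comparisons with raw P values just under 0.05 are no longer called significant after correction, which is the behavior the adjustment is meant to provide.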


Clinical Characteristics of Subjects with Sarcoidosis

The UCSF cohort consisted of 38 prospectively enrolled patients with sarcoidosis and 20 unaffected control subjects (Table 1). The Oregon Health Sciences University sarcoidosis cohort used for validation in this study included 12 patients with sarcoidosis and 12 control subjects (GSE18781; Table E1) (referred to as “Oregon” in the figures). There were no significant differences in age or sex between the UCSF and Oregon datasets (P > 0.05).

Discrimination of Patients with Sarcoidosis from Control Subjects Using Machine Learning Algorithms

Sarcoidosis is a systemic disease that is widely believed to have a compartmentalized inflammatory process. However, despite this belief, two prior studies have demonstrated “activation” of peripheral blood T cells in patients with sarcoidosis compared with control subjects in the degree of effector function (45) and with regard to IL-2 receptor expression (46). Therefore, we hypothesized that a genome-wide approach to analyzing the transcriptome of whole blood cells from patients with sarcoidosis would reveal robust changes in gene expression that could distinguish patients with sarcoidosis from control subjects with high accuracy.

To do this, we took advantage of machine learning algorithms, which are particularly well suited for classification problems in which the set of features (genes) is large. We began these analyses using three independent machine learning algorithms: random forests (37), shrunken centroids (39), and elastic net (38). Each algorithm demonstrated a high sensitivity and specificity to discriminate patients with sarcoidosis from control subjects based on blood gene expression profiles, and we chose to perform all additional analyses using the random forest algorithm as described in Materials and Methods.

We trained the random forest classifier using the UCSF microarray data (38 patients with sarcoidosis and 20 control subjects). We then assessed the classifier performance in two ways. First, prediction of class for “out-of-bag” UCSF samples (those not used for classifier development in any given classification tree) was used to provide an internal estimate of classifier performance, which was globally assessed through generation of ROC curves (Figure 2A). Then, the Oregon dataset (12 patients with sarcoidosis and 12 control subjects) was used for external validation of the classifier (Figure 2B). In the training dataset (UCSF), the error rate was 12.1% (sensitivity, 85%; specificity, 90%), estimated from the out-of-bag samples (Figure 2A). In the external validation dataset (Oregon), the observed error rate was 8% (sensitivity, 92%; specificity, 92%) (Figure 2B). All performance metrics are reported at the optimal decision threshold (i.e., the upper-left-most point in the ROC curve). Analyses performed after omitting the subject in the Oregon dataset who was diagnosed with sarcoidosis on the basis of hilar adenopathy alone yielded similar results (error rate, 9%; sensitivity, 92%; specificity, 90%). These data indicate that sarcoidosis induces robust changes in the transcriptional profile of blood as compared with control subjects.
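To make the "upper-left-most point" criterion concrete, the sketch below computes sensitivity and specificity at each candidate threshold and picks the point closest to the ROC corner. It assumes that "upper-left-most" means minimum Euclidean distance to perfect sensitivity and specificity, which the text does not state explicitly; the function names and the scores are hypothetical and are not study data.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each candidate threshold.

    labels: 1 = sarcoidosis, 0 = control. A sample is called positive
    when its classifier score is >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
        points.append((threshold, tp / n_pos, tn / n_neg))
    return points


def upper_left_threshold(points):
    """Pick the point closest to the corner where sensitivity = specificity = 1."""
    return min(points, key=lambda p: (1 - p[1]) ** 2 + (1 - p[2]) ** 2)


# Toy classifier scores (e.g., fraction of trees voting "sarcoidosis"), NOT study data.
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
best_threshold, sensitivity, specificity = upper_left_threshold(roc_points(scores, labels))
```

Other definitions of the optimal point (for example, maximizing sensitivity plus specificity) can select a different threshold, so reported metrics depend on which rule is used.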

Figure 2.
Receiver operator characteristic curve analysis of subject classification derived using the random forest algorithm on blood gene expression data. (A) University of California, San Francisco (UCSF) Derivation (training) set performance for all patients ...

Because 12 of the 38 subjects within the UCSF sarcoidosis cohort were taking systemic corticosteroids, we assessed whether the use of corticosteroids had a significant influence on the classifier. Two observations suggest that this is not the case. First, none of these 12 subjects was misclassified on out-of-bag prediction. Second, we performed PCA using the 100 genes most important in the classification algorithm based on their value in discriminating groups (defined in random forests as the “Gini coefficient” for each gene). Plotting the first and second principal components showed that patients with sarcoidosis who were using corticosteroids were intermixed with patients who were not and that both groups were well distinguished from healthy control subjects (Figure 3A), indicating that the discriminative power of the genes used in the classifier is not specifically influenced by corticosteroid use. To understand if a smaller number of genes could retain the predictive value, we repeated the PCA using the 10 most important genes in our random forest algorithm (Figure 3B). Again, the patients with sarcoidosis remained well distinguished from healthy control subjects. Similar retention of predictive value was observed in the Oregon dataset after reduction in the number of genes used for classification (Figures 3C and 3D).
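As a rough illustration of the quantity behind this gene ranking, the sketch below computes the decrease in Gini impurity produced by a single split on one gene's expression level. Random forest implementations typically accumulate such decreases over all splits and trees to score each gene; the exact computation used in the study is not described here, so the functions and values below are illustrative assumptions only.

```python
def gini_impurity(labels):
    """Gini impurity of a list of binary class labels (0 = control, 1 = sarcoidosis)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def gini_decrease(expression, labels, threshold):
    """Impurity decrease from splitting samples at expression < threshold."""
    left = [y for x, y in zip(expression, labels) if x < threshold]
    right = [y for x, y in zip(expression, labels) if x >= threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


# Toy log2 expression values for one hypothetical gene, NOT study data.
expression = [5.1, 5.3, 5.0, 7.2, 7.5, 7.1, 6.9, 5.2]
labels = [0, 0, 0, 1, 1, 1, 1, 0]
decrease = gini_decrease(expression, labels, threshold=6.0)
```

A gene whose expression separates the groups cleanly, as in this toy case, yields the largest possible decrease for a balanced two-class sample; an uninformative split yields a decrease of zero.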

Figure 3.
Principal component analysis (PCA) using (A) the 100 genes from blood most important in the classification algorithm based on their value in discriminating groups (defined in random forests as the Gini coefficient). Patients with sarcoidosis who were ...

Validation of the Most Discriminative Genes

Quantitative PCR was performed for the 10 most discriminative genes (Table 2) using samples from the UCSF dataset to validate microarray findings, and in each instance PCR results correlated extremely well with microarray data (Figure 4).
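A comparison of this kind can be sketched as a per-gene correlation between array and quantitative PCR measurements across samples. The text does not state which correlation statistic was used; the sketch below assumes Pearson's r, and the paired values are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired measurements for one gene across five samples, NOT study data.
array_log2 = [6.1, 6.8, 7.4, 8.0, 8.9]
qpcr_log2 = [1.0, 1.9, 2.3, 3.1, 4.0]
r = pearson_r(array_log2, qpcr_log2)
```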

Figure 4.
Quantitative PCR confirmation of array-based blood gene expression measurement from 10 of the most important classifier genes (as ranked by the Gini coefficient). The correlation of array to quantitative PCR data was statistically significant for all ...

Relationship with Sarcoidosis Lung Disease Severity

To determine whether the most highly discriminative genes varied with disease severity, we analyzed gene expression levels for the top 10 discriminative genes in subjects with sarcoidosis and low lung function (FEV1 and/or FVC < 80% predicted) separately from those with normal lung function in the UCSF dataset (Figure 5). Among the 10 genes, 3 are IFN inducible or encode mediators downstream of the IFN-γ signaling pathway. The expression of one of these genes, IRF1, differed significantly between patients with sarcoidosis with low and normal lung function. Although we do not have lung function data for the Oregon dataset, these analyses demonstrate the consistency of gene expression across the UCSF and Oregon datasets (Figure 5).

Figure 5.
Expression levels of the 10 most discriminative genes from blood in classifying patients with sarcoidosis. Relative gene expression levels of microarray data are compared between the internal derivation dataset (University of California, San Francisco ...

Similarities of Pathway Analyses in Blood and Granulomatous Lung Tissue

To understand how peripheral blood transcript profiles relate to those found in granulomatous tissues, we analyzed the overlap in differentially expressed genes in blood and lung tissue (GSE16538) (35) from patients with sarcoidosis (Figure 6). These data demonstrate that genes differentially expressed in both compartments are more likely than chance to be concordantly up-regulated and, in particular, that those with high fold-change are concordantly up-regulated in blood and lung. Specifically, we defined concordantly up-regulated genes as those with an FDR q value of < 0.1 in both datasets and fold-change greater than 2 in either dataset (a relatively lenient FDR threshold was chosen to optimize power in the biopsy analysis, which had a small sample size, but this threshold was applied across both datasets and to up- and down-regulated genes to avoid bias in the comparison). The results showed five genes significantly down-regulated in both datasets, two genes up-regulated in the whole blood dataset but down-regulated in the biopsy dataset, 10 genes down-regulated in the whole blood dataset but up-regulated in the biopsy dataset, and 33 genes up-regulated in both datasets. This excess in concordantly up-regulated genes in both datasets was statistically significant (Fisher's exact test; P = 3.3e-10). Among the genes that were concordantly induced in the lung and blood in sarcoidosis was a critical transcriptional regulator in the IFN-α, -β, and -γ signaling pathways (STAT1). Further, up-regulation of genes related to the IFN signaling pathways was significantly overrepresented in the blood transcriptome from patients with sarcoidosis (Figure E1 and Table E4). Collectively, these data are consistent with a large body of literature showing a significant role for Th1-type inflammation in sarcoidosis (reviewed in Ref. 47) and demonstrate that gene regulation of specific inflammatory pathways is overrepresented in circulating blood cells and may serve as a surrogate marker for monitoring disease severity (Figure 5) and activity.
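The filtering rule described above (FDR q < 0.1 in both datasets and fold-change greater than 2 in either) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: fold changes are assumed to be on a log2 scale, the fold-change cutoff is assumed to apply to the absolute change so that down-regulated genes are treated symmetrically, and the gene names and values are hypothetical.

```python
import math
from collections import Counter


def classify_gene(blood_q, blood_log2fc, lung_q, lung_log2fc,
                  q_cutoff=0.1, fold_cutoff=2.0):
    """Return a direction label for a gene, or None if it fails the filter."""
    # The gene must pass the FDR threshold in BOTH datasets.
    if blood_q >= q_cutoff or lung_q >= q_cutoff:
        return None
    # The absolute fold change must exceed the cutoff in EITHER dataset.
    min_log2fc = math.log2(fold_cutoff)
    if max(abs(blood_log2fc), abs(lung_log2fc)) <= min_log2fc:
        return None
    blood_dir = "up" if blood_log2fc > 0 else "down"
    lung_dir = "up" if lung_log2fc > 0 else "down"
    return f"blood {blood_dir} / lung {lung_dir}"


# Hypothetical genes: (blood q, blood log2 FC, lung q, lung log2 FC). NOT study data.
genes = {
    "GENE_A": (0.01, 1.4, 0.05, 2.1),
    "GENE_B": (0.02, -1.2, 0.08, -0.6),
    "GENE_C": (0.03, 0.4, 0.04, 0.7),   # fold change too small in both datasets
    "GENE_D": (0.20, 1.8, 0.01, 2.5),   # fails the blood FDR threshold
    "GENE_E": (0.04, -1.1, 0.02, 1.3),
}
labels = [classify_gene(*values) for values in genes.values()]
counts = Counter(label for label in labels if label is not None)
```

Counts tallied this way give the four direction categories reported in the text. The exact contingency table behind the reported Fisher's exact test P value is not given, so this sketch does not attempt to reproduce that value.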

Figure 6.
Overlap in differentially expressed genes in blood and lung tissue in sarcoidosis. Dots represent the mean log fold change in microarray gene expression of individual genes measured in lung biopsies (OHIO dataset; n = 6 patients with sarcoidosis and 6 ...

In addition to IFN pathway–related transcripts in blood, we found up-regulation of important genes related to T-cell homeostasis and survival (IL15 and IL7R [CD127]) (Figure 6; Table 2). These findings are also concordant with a prior genomics study that identified IL7 and IL15 as belonging to the most highly overexpressed gene networks in sarcoidosis lung tissues compared with control subjects (35). Thus, several important molecular pathways were identified in the blood transcriptome, which also appear to be relevant to disease pathogenesis in granulomatous inflammation of the lung.

In addition to examining the overlap of differential gene expression in sarcoidal lung tissue compared with blood, we took another approach to examine whether genes associated with progressive fibrotic lung disease in lung tissue were differentially expressed in the peripheral blood of subjects with sarcoidosis. As described in Materials and Methods, this dataset did not include healthy control subjects. Therefore, we identified genes differentially expressed in the UCSF blood dataset and compared the overlap in up- or down-regulated genes from the lung biopsies of patients classified as “progressive-fibrotic” by Lockstone and colleagues (36). Among the top 50 genes up-regulated in the lung tissue dataset (fibrotic versus self-limiting sarcoidosis), nine were significantly differentially expressed in our blood dataset (sarcoidosis versus healthy control) (Table 3). Among the 51 genes down-regulated in the lung tissue dataset, 12 were differentially expressed in our blood dataset (Table 4).


Overlap between Peripheral Blood Transcript Profiles in Sarcoidosis and TB

Given the increasing evidence suggesting exposure to TB as a potential etiologic agent of sarcoidosis, we assessed the qualitative degree of overlap in blood gene expression patterns between sarcoidosis and PTB using the top 50 most discriminating genes from the random forest classifier model based on patients with sarcoidosis versus control subjects. Figure 7A shows the relative gene expression levels depicted in a heat map as well as by principal component analysis. These data reveal overlaps in gene expression patterns between sarcoidosis and PTB and provide a “genomic perspective” of gene regulation in systemic granulomatous inflammation that is shared between M. tuberculosis infection and sarcoidosis. We included transcriptional data from other Th1-type driven inflammatory and infectious diseases for comparison. These data reveal that some diseases, such as pediatric lupus, share some features in common with PTB and sarcoidosis.

Figure 7.
The distribution of blood gene expression patterns between sarcoidosis and active pulmonary tuberculosis (PTB). Relative gene expression was assessed by generation of heat maps (left) and principal component analysis (right) as described in Materials ...

We also assessed the qualitative degree of gene expression overlap using the 86-gene whole blood transcriptional signature reported by Berry and colleagues to be specific for PTB (29). However, they did not consider sarcoidosis as a disease control in their analyses. Similar to the findings in Figure 7A, the 86-gene, PTB-specific transcriptional signature revealed significant overlap, as shown by heat map and principal component analysis (Figure 7B). Some of the most striking genes depicted in the heat map, such as DHRS9 and CCR7, have indistinguishable expression patterns. Analyses of this set of genes further highlight the similarities between PTB and sarcoidosis expression patterns compared with the other disease conditions.

As an alternative approach to assess the similarities in gene expression patterns in sarcoidosis and PTB, we used the “gene module” strategy first described by Chaussabel and colleagues (34), which examines groups of genes that are coordinately expressed, such as genes related to plasma cells, B cells, myeloid lineage, and T cells, as well as other functional categories as previously described (34). Figure E2 shows that gene expression patterns were very similar between sarcoidosis and PTB, with only 4 of 25 modules divergent. Using pathway analysis of the blood transcriptome in sarcoidosis, we found at least nine genes related to Type I and II IFN pathways that overlapped with the blood transcriptional signature reported from subjects with PTB (29) (Figure E1; Table E4). This overlap in blood transcriptional profiles between sarcoidosis and PTB provides evidence that similar activation pathways are engaged in these distinct diseases and may provide insights into biology or serve as genomic markers of disease progression and/or activity.

Although these data suggest that granulomatous lung diseases may share more similarities than differences regarding the blood transcriptional profile, we also explored how blood transcriptional profiles in sarcoidosis and TB may differ. To do this, we used machine learning algorithms to build a classifier with the goal of distinguishing sarcoidosis from PTB. This analysis (Figure 7C) suggests that a number of genes may be divergent between these two diseases, as assessed by the number of genes that are down-regulated in sarcoidosis but not in TB and other Th1-type inflammatory and infectious diseases (most highly divergent gene expression, either up- or down-regulated, was exhibited for genes GBP6 [guanylate-binding protein 6], which is induced by IFN-γ; SEPT4 [Septin 4]; TIMM10 [translocase of inner mitochondrial membrane 10]; and NOG [Noggin]). Also of note is the more marked difference in the pattern of gene expression between sarcoidosis, TB, and another pulmonary granulomatous disease from subjects recruited at our center (hypersensitivity pneumonitis) (Figures 7A–7C). Overall, these findings challenge the traditional view that the inflammatory fingerprint of sarcoidosis is confined to diseased organs and suggest that sarcoidosis- and TB-specific signatures can be identified by directed genomic analysis of blood samples despite significant overlap in immunopathology (e.g., shared activation of IFN-related signaling pathways).


The goals of this study were (1) to measure the transcriptional signature of peripheral blood from patients with sarcoidosis and determine whether the blood transcriptional profile in sarcoidosis reflects transcriptional abnormalities measured in diseased lungs and (2) to assess overlap in gene expression with a granulomatous infectious disease, PTB, which may play a role in the pathogenesis of sarcoidosis. To this end, we identified a robust transcriptional profile in whole blood from patients with sarcoidosis and identified concordance in gene expression patterns in the blood and diseased lungs, including genes important in the IFN, IL-15, and IL-7 signaling pathways. Finally, we demonstrated significant overlap in gene expression in blood cells in sarcoidosis and active PTB. Our findings indicate that circulating immune cells in sarcoidosis possess significant alterations in genome-wide gene expression supporting a “systemic” inflammatory nature of sarcoidosis. Because we found specific genomic markers in blood that correlate with gene expression patterns in the lung as well as severity of lung function abnormalities, these data justify future studies of blood biomarkers of disease progression and phenotype in sarcoidosis, which may be useful in clinical studies.

Although the concept that M. tuberculosis may be involved in the development of sarcoidosis is not a new one, more recent studies using biochemical, nucleic acid, and cell-based assays have made important contributions by providing evidence that links mycobacteria to sarcoidosis (23, 25, 28). Collectively, these studies have found nucleic acids and insoluble proteins from mycobacteria in diseased tissues as well as specific T-cell immune responses to mycobacterial antigens independent of length of time of known disease in a reproducible percentage of subjects with sarcoidosis. These published data argue that at least some components of the systemic inflammatory profile in chronic sarcoidosis are stable over time and measurable in peripheral blood. The sarcoidosis gene expression signature we present in this study is similar, but not identical, to TB. These data suggest a role for a TB-like immune response, rather than an infection per se, in a certain percentage of individuals who develop sarcoidosis.

Our results raise several questions in light of the available data. First, does this genomic data comparison between sarcoidosis and PTB argue further that M. tuberculosis plays an important role in the pathogenesis of sarcoidosis? Or have we identified a more general “granulomatous” transcriptional signature that would be observed in many types of granulomatous infectious and noninfectious lung diseases? Although we cannot answer these questions with confidence, transcriptional data presented from subjects with hypersensitivity pneumonitis (Figure 7) argue against a generalized “granulomatous lung disease” signature. Additional diseases, such as chronic berylliosis and histoplasmosis, must be analyzed to address this possibility. Second, can we track disease progression using a genomic transcriptional signature in whole blood? Our data show there are sets of genes that correlate with abnormally low lung function, and validation through longitudinal studies is necessary to address their utility in monitoring disease progression and severity. Finally, can we use the peripheral blood transcriptional signature to guide therapeutic decisions by tracking response to therapy? This concept was elegantly demonstrated by Berry and colleagues in monitoring treatment in patients with TB (29) and would be particularly helpful in managing patients with sarcoidosis who have chronic disease, because we now have a larger armamentarium of potential therapies (e.g., methotrexate, TNF-α inhibitors, mycophenolate mofetil, and thalidomide) but few tools for monitoring response.

Although sarcoidosis is traditionally thought to be a disease in which the pathogenic immune cells are “compartmentalized” in the diseased tissue or organ (4, 48), our data identify a set of genes that are concordantly regulated in blood and lung and that plausibly contribute to the development or persistence of granulomatous inflammation (IFN-α, IFN-γ, IL-15, and IL-7). We speculate that the breadth and sensitivity conferred by newer genomic approaches to characterize systemic inflammation (34) may explain our findings of overlap between blood and lung signatures of specific inflammatory pathways. The circulating immune cells expressing these genes, which are concordant in blood and lung, may identify cells that have recently migrated to the diseased organ and are contributing to ongoing inflammation. Immunophenotyping and functional analysis of these cells may enhance our understanding of the immune biology of sarcoidosis and elucidate novel immune subsets that could be targeted in new treatment approaches. In summary, we provide evidence for a robust peripheral blood transcriptional signature in sarcoidosis, which raises questions about how we should think of disease “compartmentalization” and could be leveraged to help us understand the immunobiology of this disease, provide markers of disease severity, and identify novel immune cells that contribute to organ inflammation.


The authors thank the members of the Interstitial Lung Disease Clinic and specifically Paul Wolters, Harold Chapman, and Sally McLaughlin as well as the participants in the study.


Supported by grants NIAID AI079340 and NHLBI HL09537.

Author Contributions: Conception and design, L.L.K., P.G.W., O.D.S., N.R.B.; analysis and interpretation, L.L.K., P.G.W., O.D.S., N.R.B., C.P.N., J.C.P.; drafting the manuscript for important intellectual content, L.L.K., P.G.W., O.D.S., N.R.B.

This article has an online supplement, which is accessible from this issue's table of contents at

Originally Published in Press as DOI: 10.1164/rccm.201106-1143OC on August 18, 2011

Author Disclosure: None of the authors has a financial relationship with a commercial entity that has an interest in the subject of this manuscript.


1. Siltzbach LE, James DG, Neville E, Turiaf J, Battesti JP, Sharma OP, Hosoda Y, Mikami R, Odaka M. Course and prognosis of sarcoidosis around the world. Am J Med 1974;57:847–852 [PubMed]
2. James DG. Immunology of sarcoidosis. Lancet 1966;2:633–635 [PubMed]
3. Sones M, Israel HL. Altered immunologic reactions in sarcoidosis. Ann Intern Med 1954;40:260–268 [PubMed]
4. Hudspith BN, Flint KC, Geraint-James D, Brostoff J, Johnson NM. Lack of immune deficiency in sarcoidosis: compartmentalisation of the immune response. Thorax 1987;42:250–255 [PMC free article] [PubMed]
5. Hirschhorn K, Schreibman RR, Bach FH, Siltzbach LE. In-vitro studies of lymphocytes from patients with sarcoidosis and lymphoproliferative diseases. Lancet 1964;2:842–843 [PubMed]
6. Sharma OP, James DG, Fox RA. A correlation of in vivo delayed-type hypersensitivity with in vitro lymphocyte transformation in sarcoidosis. Chest 1971;60:35–37 [PubMed]
7. Kataria YP, Sagone AL, LoBuglio AG, Bromberg PA. In vitro observations on sarcoid lymphocytes and their correlation with cutaneous anergy and clinical severity of disease. Am Rev Respir Dis 1973;108:767–776 [PubMed]
8. Robinson BW, McLemore TL, Crystal RG. Gamma interferon is spontaneously released by alveolar macrophages and lung T lymphocytes in patients with pulmonary sarcoidosis. J Clin Invest 1985;75:1488–1495 [PMC free article] [PubMed]
9. Greene CM, Meachery G, Taggart CC, Rooney CP, Coakley R, O'Neill SJ, McElvaney NG. Role of IL-18 in CD4+ T lymphocyte activation in sarcoidosis. J Immunol 2000;165:4718–4724 [PubMed]
10. Muller-Quernheim J, Saltini C, Sondermeyer P, Crystal RG. Compartmentalized activation of the interleukin 2 gene by lung T lymphocytes in active pulmonary sarcoidosis. J Immunol 1986;137:3475–3483 [PubMed]
11. Crystal RG, Roberts WC, Hunninghake GW, Gadek JE, Fulmer JD, Line BR. Pulmonary sarcoidosis: a disease characterized and perpetuated by activated lung T-lymphocytes. Ann Intern Med 1981;94:73–94 [PubMed]
12. Hunninghake GW, Gadek JE, Young RC, Jr, Kawanami O, Ferrans VJ, Crystal RG. Maintenance of granuloma formation in pulmonary sarcoidosis by T lymphocytes within the lung. N Engl J Med 1980;302:594–598 [PubMed]
13. Hunninghake GW, Crystal RG. Pulmonary sarcoidosis: a disorder mediated by excess helper T-lymphocyte activity at sites of disease activity. N Engl J Med 1981;305:429–434 [PubMed]
14. Daniele RP, Dauber JH, Rossman MD. Immunologic abnormalities in sarcoidosis. Ann Intern Med 1980;92:406–416 [PubMed]
15. Costabel U, Bross KJ, Ruhle KH, Lohr GW, Matthys H. Ia-like antigens on T-cells and their subpopulations in pulmonary sarcoidosis and in hypersensitivity pneumonitis: analysis of bronchoalveolar and blood lymphocytes. Am Rev Respir Dis 1985;131:337–342 [PubMed]
16. Saltini C, Spurzem JR, Lee JJ, Pinkston P, Crystal RG. Spontaneous release of interleukin 2 by lung T lymphocytes in active pulmonary sarcoidosis is primarily from the Leu3+DR+ T cell subset. J Clin Invest 1986;77:1962–1970 [PMC free article] [PubMed]
17. Rossi GA, Sacco O, Cosulich E, Risso A, Balbi B, Ravazzoni C. Helper T-lymphocytes in pulmonary sarcoidosis: functional analysis of a lung T-cell subpopulation in patients with active disease. Am Rev Respir Dis 1986;133:1086–1090 [PubMed]
18. Saltini C, Hemler ME, Crystal RG. T lymphocytes compartmentalized on the epithelial surface of the lower respiratory tract express the very late activation antigen complex VLA-1. Clin Immunol Immunopathol 1988;46:221–233 [PubMed]
19. Pinkston P, Bitterman PB, Crystal RG. Spontaneous release of interleukin-2 by lung T lymphocytes in active pulmonary sarcoidosis. N Engl J Med 1983;308:793–800 [PubMed]
20. Hunninghake GW, Bedell GN, Zavala DC, Monick M, Brady M. Role of interleukin-2 release by lung T-cells in active pulmonary sarcoidosis. Am Rev Respir Dis 1983;128:634–638 [PubMed]
21. Hunninghake GW, Crystal RG. Mechanisms of hypergammaglobulinemia in pulmonary sarcoidosis: site of increased antibody production and role of T lymphocytes. J Clin Invest 1981;67:86–92 [PMC free article] [PubMed]
22. Barnard J, Rose C, Newman L, Canner M, Martyny J, McCammon C, Bresnitz E, Rossman M, Thompson B, Rybicki B, et al. Job and industry classifications associated with sarcoidosis in a case-control etiologic study of sarcoidosis (ACCESS). J Occup Environ Med 2005;47:226–234 [PubMed]
23. Chen ES, Moller DR. Etiology of sarcoidosis. Clin Chest Med 2008;29:365–377 (vii) [PubMed]
24. Newman LS, Rose CS, Bresnitz EA, Rossman MD, Barnard J, Frederick M, Terrin ML, Weinberger SE, Moller DR, McLennan G, et al. A case control etiologic study of sarcoidosis: environmental and occupational risk factors. Am J Respir Crit Care Med 2004;170:1324–1330 [PubMed]
25. Drake WP, Dhason MS, Nadaf M, Shepherd BE, Vadivelu S, Hajizadeh R, Newman LS, Kalams SA. Cellular recognition of Mycobacterium tuberculosis ESAT-6 and KatG peptides in systemic sarcoidosis. Infect Immun 2007;75:527–530 [PMC free article] [PubMed]
26. Gupta D, Agarwal R, Aggarwal AN, Jindal SK. Molecular evidence for the role of mycobacteria in sarcoidosis: a meta-analysis. Eur Respir J 2007;30:508–516 [PubMed]
27. Klemen H, Husain AN, Cagle PT, Garrity ER, Popper HH. Mycobacterial DNA in recurrent sarcoidosis in the transplanted lung: a PCR-based study on four cases. Virchows Arch 2000;436:365–369 [PubMed]
28. Song Z, Marzilli L, Greenlee BM, Chen ES, Silver RF, Askin FB, Teirstein AS, Zhang Y, Cotter RJ, Moller DR. Mycobacterial catalase-peroxidase is a tissue antigen and target of the adaptive immune response in systemic sarcoidosis. J Exp Med 2005;201:755–767 [PMC free article] [PubMed]
29. Berry MP, Graham CM, McNab FW, Xu Z, Bloch SA, Oni T, Wilkinson KA, Banchereau R, Skinner J, Wilkinson RJ, et al. An interferon-inducible neutrophil-driven blood transcriptional signature in human tuberculosis. Nature 2010;466:973–977 [PMC free article] [PubMed]
30. Woodruff PG, Solberg OD, Snyder-Cappione JE, Hou L, Nguyen C, Chi J, Koth LL. Whole blood gene expression analysis identifies sarcoidosis-specific markers including decreased IL7 receptor expression. Am J Respir Crit Care Med 2010;181:A3981
31. Statement on sarcoidosis. Joint statement of the American Thoracic Society (ATS), the European Respiratory Society (ERS) and the World Association of Sarcoidosis and other Granulomatous Disorders (WASOG) adopted by the ATS Board of Directors and by the ERS Executive Committee, February 1999. Am J Respir Crit Care Med 1999;160:736–755 [PubMed]
32. Rosenbaum JT, Pasadhika S, Crouser ED, Choi D, Harrington CA, Lewis JA, Austin CR, Diebel TN, Vance EE, Braziel RM, et al. Hypothesis: sarcoidosis is a STAT1-mediated disease. Clin Immunol 2009;132:174–183 [PMC free article] [PubMed]
33. Winterbauer RH, Belic N, Moores KD. Clinical interpretation of bilateral hilar adenopathy. Ann Intern Med 1973;78:65–71 [PubMed]
34. Chaussabel D, Quinn C, Shen J, Patel P, Glaser C, Baldwin N, Stichweh D, Blankenship D, Li L, Munagala I, et al. A modular analysis framework for blood genomics studies: application to systemic lupus erythematosus. Immunity 2008;29:150–164 [PMC free article] [PubMed]
35. Crouser ED, Culver DA, Knox KS, Julian MW, Shao G, Abraham S, Liyanarachchi S, Macre JE, Wewers MD, Gavrilin MA, et al. Gene expression profiling identifies MMP-12 and ADAMDEC1 as potential pathogenic mediators of pulmonary sarcoidosis. Am J Respir Crit Care Med 2009;179:929–938 [PMC free article] [PubMed]
36. Lockstone HE, Sanderson S, Kulakova N, Baban D, Leonard A, Kok WL, McGowan S, McMichael AJ, Ho LP. Gene set analysis of lung samples provides insight into pathogenesis of progressive, fibrotic pulmonary sarcoidosis. Am J Respir Crit Care Med 2010;181:1367–1375 [PubMed]
37. Diaz-Uriarte R, Alvarez de Andres S. Gene selection and classification of microarray data using random forest. BMC Bioinformatics 2006;7:3. [PMC free article] [PubMed]
38. Zou H, Hastie T. Regularization and variable selection via the elastic net. J R Stat Soc Ser B 2005;67:301–320
39. Dabney AR. Classification of microarrays to nearest centroids. Bioinformatics 2005;21:4148–4154 [PubMed]
40. Liaw A, Wiener M. Classification and regression by randomForest. R News 2002;2:18–22
41. Woodruff PG, Boushey HA, Dolganov GM, Barker CS, Yang YH, Donnelly S, Ellwanger A, Sidhu SS, Dao-Pick TP, Pantoja C, et al. Genome-wide profiling identifies epithelial cell genes associated with asthma and with treatment response to corticosteroids. Proc Natl Acad Sci USA 2007;104:15858–15863 [PubMed]
42. Dolganov GM, Woodruff PG, Novikov AA, Zhang Y, Ferrando RE, Szubin R, Fahy JV. A novel method of gene transcript profiling in airway biopsy homogenates reveals increased expression of a Na+-K+-Cl- cotransporter (NKCC1) in asthmatic subjects. Genome Res 2001;11:1473–1483 [PubMed]
43. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 2004;5:R80. [PMC free article] [PubMed]
44. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 1995;57:289–300
45. Wahlstrom J, Katchar K, Wigzell H, Olerup O, Eklund A, Grunewald J. Analysis of intracellular cytokines in CD4+ and CD8+ lung and blood T cells in sarcoidosis. Am J Respir Crit Care Med 2001;163:115–121 [PubMed]
46. Konishi K, Moller DR, Saltini C, Kirby M, Crystal RG. Spontaneous expression of the interleukin 2 receptor gene and presence of functional interleukin 2 receptors on T lymphocytes in the blood of individuals with active pulmonary sarcoidosis. J Clin Invest 1988;82:775–781 [PMC free article] [PubMed]
47. Schwarz MI, King TE. Interstitial lung disease. BC Decker; Hamilton, Ontario, Canada: 1998
48. Hunninghake GW, Fulmer JD, Young RC, Jr, Gadek JE, Crystal RG. Localization of the immune response in sarcoidosis. Am Rev Respir Dis 1979;120:49–57 [PubMed]

Articles from American Journal of Respiratory and Critical Care Medicine are provided here courtesy of American Thoracic Society