Our results indicate that molecular signatures of gene expression appear to be useful both for identifying the presence of IPF and for predicting its activity. We have shown that molecular signatures can distinguish IPF from both normal lung and other chronic lung diseases. Moreover, our findings suggest that molecular signatures from lung parenchyma at the time of diagnosis may help predict disease progression.
Genome-wide analyses of gene expression have facilitated the identification of gene expression patterns, or signatures, that reveal the complexity of human cancer. Most of the work using large-scale gene expression data has focused on discovering gene expression profiles that can lead to a better understanding of tumor development and proliferation. The strength of gene expression analysis has been shown by its ability to identify new cancer subtypes and predict clinical outcome. A prognostic gene expression signature has been proposed for survival in early-stage lung cancer and was recently validated in a large, training-testing, multi-site, blinded study. Gene expression profiling has also allowed the prediction of breast cancer recurrence, which has ultimately led to the development of MammaPrint, a clinical test based on a 70-gene signature that predicts the risk of metastasis in breast cancer patients.
While gene expression profiling has proven to be a powerful tool for identifying specific gene patterns and pathways associated with certain types of human cancer, our findings suggest that these molecular signatures may also prove useful in understanding complex lung diseases such as IPF. The increase in the protein ubiquitination pathway could be associated with increased apoptosis of epithelial cells, but it has not been extensively studied in IPF. Few studies have implicated the PI3K/AKT signaling pathway in IPF. Studies of bleomycin-induced pulmonary fibrosis in mice have shown activation not only of TGF-beta but also of phosphatidylinositol 3-kinase (PI3K) and protein kinase B (PKB/AKT) via a semaphorin (SEMA) 7A-dependent mechanism, and PKB/AKT inhibition diminished TGF-beta-induced fibrosis. SEMA 7A was not differentially expressed in our dataset, although many of its family members and its receptor, integrin beta, are involved in the transcriptional profile of IPF. It has been shown that collagen accumulation can be reduced by the administration of PI3K inhibitors, implying that the PI3K/AKT pathway might play an important role in pulmonary fibrosis. The PI3K/PTEN/AKT pathway is one of the most commonly deregulated pathways in human malignancy. Significant advances have been made in understanding the AKT signaling pathway in oncogenesis and in developing small-molecule inhibitors. Whether this pathway could be targeted in human pulmonary fibrosis remains to be established; if so, it could offer new treatment opportunities. The integrin signaling pathway is expected to be associated with pulmonary fibrosis, since integrins are the primary extracellular matrix (ECM) receptors mediating ECM remodeling. In response to changes in the ECM, integrin signaling also regulates many other interrelated cellular processes, such as proliferation, survival, cell migration, and invasion. However, further studies in larger cohorts, using real-time PCR, a customized SAGE signature array, or tissue arrays, are needed to validate the importance and relevance of these findings for early diagnosis and disease management.
Our results may have a significant impact on the development of early biomarkers for IPF. Identifying biomarkers that could reduce the time to diagnosis may create a window of opportunity for therapeutic intervention, especially in a disease like IPF, where diagnosis is often delayed. While our transcriptional signature for disease progression was developed using lung biopsy samples, 47 of the 134 gene products associated with clinical progression have been detected in body fluids (such as blood, plasma/serum, bronchoalveolar lavage fluid, or sputum) in various diseases, according to Ingenuity Pathway Analysis software. Although the biological function and role in IPF pathogenesis of many of these 47 genes are unknown, these genes and gene products could potentially serve as biomarkers for this disease. Genes such as ADM (adrenomedullin), CCL2 (chemokine ligand 2), PTPRF (protein tyrosine phosphatase receptor F), and SPP1 (osteopontin) play a role in smooth muscle cell migration and in cell proliferation and/or invasion, implying a potentially more important role for these processes in disease progression. The chemokine CCL2 has previously been detected in metaplastic epithelial cells and vascular endothelial cells of IPF cases, and it was proposed that CCL2 may play a key role in the irreversible progression of IPF. In addition, decreased lung fibrosis was observed in CCL2 null mice exposed to bleomycin. Moreover, CCL2 has been shown to be elevated in bronchoalveolar lavage fluid from patients with IPF. The protein was also measured in plasma, where no significant difference was found between IPF patients and normal controls. Our results indicate that CCL2 is a potential marker of disease progression in IPF. Whether plasma levels of CCL2 correlate with disease progression remains unknown. Interestingly, SPP1 has been localized to the alveolar epithelial cells in IPF lungs, was significantly elevated in bronchoalveolar lavage fluid from IPF patients, and has been detected in plasma from patients with idiopathic interstitial pneumonia. Previous studies have shown that SPP1 null mice develop less fibrosis when exposed to bleomycin. It was suggested that SPP1 is secreted by the epithelial cells and has a profibrotic effect.
Some of these potential biomarker genes have been implicated in human cancers. Heat shock 70 kDa protein 1A (HSPA1A) is upregulated in brain, lung, and liver cancer. Macropain (PSMA7) is increased in brain, breast, and stomach cancer, and plays an important role in colorectal cancer progression, providing a unique target for drug development. The Ras homolog gene family member B (RHOB), a Rho GTPase, is upregulated in brain and breast cancer but downregulated in lung neoplasms. Rho GTPases are crucial regulators of the actin cytoskeleton and also play an important role in membrane trafficking. FK506 binding protein 2 (FKBP2) and Plunc (palate, lung and nasal epithelium carcinoma associated) are associated with lung cancer. Plunc belongs to the PLUNC family of proteins, which are postulated to play a role in the innate immune response, and is uniquely expressed in the upper respiratory tract. Studies in cystic fibrosis have shown a significant elevation of Plunc expression in diseased airways. As Plunc can be detected in sputum and bronchoalveolar lavage fluid, it appears to be an ideal candidate biomarker for disease progression in IPF. SAGE and microarray analysis have recently indicated that Plunc is a novel marker that distinguishes gastric hepatoid adenocarcinoma from primary hepatocellular carcinoma.
The extensive SAGE IPF transcriptome presented in this investigation demonstrates the complexity and scope of the biological activity involved in IPF. Some of the pathways identified by SAGE profiling have not previously been associated with IPF. Network and pathway analyses have also shown that various signaling pathways can interact or even partially overlap with each other, suggesting that IPF may be the result of multiple consecutive (or interactive) biological events, possibly triggered by environmental stimuli. Despite this biological complexity, our findings suggest that molecular signatures of gene expression may prove helpful in predicting disease progression in patients with IPF. Molecular and cellular functions such as cell proliferation, migration, invasion, and cell morphology appear to be overrepresented in the more progressive IPF group, a striking similarity to human cancers. The association with disease progression and the identifiable heterogeneity seen within samples emphasize the need for an extensive molecular classification of IPF and other forms of interstitial lung disease. Recognizing that IPF may have different subtypes that can be distinguished by their molecular patterns could help identify novel therapeutic targets and personalize the clinical approach to this complex group of diseases.