Advantages of RNA-Seq over array-based platforms include quantitative gene expression and discovery of expressed single nucleotide variants (eSNVs) and fusion transcripts from a single platform, but the sensitivity for each of these characteristics is unknown. We measured gene expression in a set of manually degraded RNAs and in nine pairs of matched fresh-frozen and FFPE RNA isolated from breast tumors, using the hybridization-based NanoString nCounter (226-gene panel) and whole-transcriptome RNA-Seq with RiboZeroGold ScriptSeq V2 library preparation kits. We performed correlation analyses of gene expression between samples and across platforms. We then specifically assessed whole-transcriptome expression of lincRNA and discovery of eSNVs and fusion transcripts in the FFPE RNA-Seq data. For gene expression in the manually degraded samples, we observed Pearson correlations of >0.94 and >0.80 with NanoString and ScriptSeq protocols, respectively. Gene expression data for matched fresh-frozen and FFPE samples yielded mean Pearson correlations of 0.874 and 0.783 for NanoString (226 genes) and ScriptSeq whole-transcriptome protocols, respectively (p < 2 × 10−16). Specifically for lincRNAs, we observed a very high Pearson correlation (0.988) between matched fresh-frozen and FFPE pairs. FFPE samples across NanoString and RNA-Seq platforms gave a mean Pearson correlation of 0.838. In FFPE libraries, we detected 53.4% of high-confidence SNVs and 24% of high-confidence fusion transcripts. Sensitivity of fusion transcript detection was not improved by an up to 3-fold increase in sequencing depth (from ~56 to ~159 million reads). Both NanoString and ScriptSeq RNA-Seq technologies yield reliable gene expression data for degraded and FFPE material. The high degree of correlation between NanoString and RNA-Seq platforms suggests that discovery-based whole-transcriptome studies from FFPE material will produce reliable expression data.
The RiboZeroGold ScriptSeq protocol performed particularly well for lincRNA expression from FFPE libraries, but detection of eSNV and fusion transcripts was less sensitive.
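The matched-pair comparison described above can be illustrated with a minimal sketch: compute a Pearson correlation for each fresh-frozen/FFPE pair, then average across pairs. The function names and the expression values below are hypothetical and are not taken from the study; the study's actual normalization and analysis pipeline is not specified here.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

def mean_pairwise_correlation(pairs):
    """Average the per-pair correlation over matched (fresh-frozen, FFPE) samples."""
    return sum(pearson(ff, ffpe) for ff, ffpe in pairs) / len(pairs)

# Hypothetical log2 expression values for two matched pairs (not study data).
pairs = [
    ([5.1, 7.3, 2.0, 9.4], [4.8, 7.0, 2.5, 9.0]),
    ([6.2, 3.3, 8.1, 1.9], [5.9, 3.9, 7.5, 2.4]),
]
print(round(mean_pairwise_correlation(pairs), 3))
```

Averaging per-pair coefficients, rather than pooling all genes from all samples into one vector, keeps each tumor's comparison separate, which is one plausible reading of "mean Pearson correlation".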
This study investigated the comparative ability of bone marrow and skeletal muscle derived stromal cells (BMSCs and SMSCs) to express a tenocyte phenotype, and whether this expression could be augmented by growth and differentiation factor-5 (GDF-5).
Tissue harvest was performed on the hind limbs of seven dogs. Stromal cells were isolated via serial expansion in culture. After 4 passages, tenogenesis was induced using either ascorbic acid alone or in conjunction with GDF-5. CD44, tenomodulin, collagen I, and collagen III expression levels were compared for each culture condition at 7 and 14 days following induction. Immunohistochemistry (IHC) was performed to evaluate cell morphology and production of tenomodulin and collagen I.
SMSCs and BMSCs were successfully isolated in culture. Following tenocytic induction, SMSCs demonstrated an increased mean relative expression of tenomodulin, collagen I, and collagen III at 14 days. BMSCs showed increased mean relative expression of only collagen I and collagen III at 14 days. IHC revealed positive staining for tenomodulin and collagen I at 14 days for both cell types. At 14 days, the skeletal muscle derived stromal cells had an organized appearance, in contrast to the haphazard arrangement of the bone marrow derived cells. GDF-5 did not significantly affect gene expression, cell staining, or cell morphology.
Stromal cells from either bone marrow or skeletal muscle can be induced to increase expression of matrix genes; however, based on expression of tenomodulin and cell culture morphology, SMSCs may be a better candidate for tenocytic differentiation.
Progenitors and Stem Cells; Tissue Engineering; Tendon Biology; Collagen and Matrix Proteins; Biomarkers
Recent genome-wide association studies (GWAS) of late-onset Alzheimer disease (LOAD) identified 9 novel risk loci. Discovery of functional variants within genes at these loci is required to confirm their role in Alzheimer disease (AD). Single nucleotide polymorphisms that influence gene expression (eSNPs) constitute an important class of functional variants. We therefore investigated the influence of the novel LOAD risk loci on human brain gene expression.
We measured gene expression levels in the cerebellum and temporal cortex of autopsied AD subjects and those with other brain pathologies (∼400 total subjects). To determine whether any of the novel LOAD risk variants are eSNPs, we tested their cis-association with expression of 6 nearby LOAD candidate genes detectable in human brain (ABCA7, BIN1, CLU, MS4A4A, MS4A6A, PICALM) and an additional 13 genes ±100 kb of these SNPs. To identify additional eSNPs that influence brain gene expression levels of the novel candidate LOAD genes, we identified SNPs ±100 kb of their location and tested for cis-associations.
CLU rs11136000 (p = 7.81 × 10−4) and MS4A4A rs2304933/rs2304935 (p = 1.48 × 10−4–1.86 × 10−4) significantly influence temporal cortex expression levels of these genes. The LOAD-protective CLU and risky MS4A4A locus alleles associate with higher brain levels of these genes. There are other cis-variants that significantly influence brain expression of CLU and ABCA7 (p = 4.01 × 10−5–9.09 × 10−9), some of which also associate with AD risk (p = 2.64 × 10−2–6.25 × 10−5).
CLU and MS4A4A eSNPs may at least partly explain the LOAD risk association at these loci. CLU and ABCA7 may harbor additional strong eSNPs. These results have implications in the search for functional variants at the novel LOAD risk loci.
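The ±100 kb cis window used above to pair SNPs with nearby genes can be sketched as a simple interval test. This is a minimal illustration under stated assumptions: the gene names and coordinates are hypothetical, and treating a gene as "in cis" when any part of it overlaps the window is an assumption, since the abstract does not say whether distance was measured to the gene body or to another landmark.

```python
WINDOW_BP = 100_000  # +/-100 kb, as in the text

def genes_in_cis(snp_chrom, snp_pos, genes, window=WINDOW_BP):
    """Return names of genes overlapping [snp_pos - window, snp_pos + window]."""
    lo, hi = snp_pos - window, snp_pos + window
    return [
        name
        for name, chrom, start, end in genes
        if chrom == snp_chrom and start <= hi and end >= lo
    ]

# Hypothetical coordinates for illustration only; not real annotations.
genes = [
    ("GENE_A", "chr8", 1_000_000, 1_020_000),
    ("GENE_B", "chr8", 1_150_000, 1_180_000),
    ("GENE_C", "chr8", 1_400_000, 1_450_000),
    ("GENE_D", "chr11", 1_000_000, 1_020_000),
]
print(genes_in_cis("chr8", 1_060_000, genes))  # prints ['GENE_A', 'GENE_B']
```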
MicroRNA plays an important role in human diseases and cancer. We seek to investigate the expression status, clinical relevance, and functional role of microRNA in non-small cell lung cancer.
We performed miRNA expression profiling in matched lung adenocarcinoma and uninvolved lung using 56 pairs of fresh-frozen (FF) and 47 pairs of formalin-fixed, paraffin-embedded (FFPE) samples from never smokers. The most differentially expressed miRNA genes were evaluated by Cox analysis and log-rank test. One of the best candidates, miR-708, was further examined for differential expression in two independent cohorts. The functional significance of miR-708 expression in lung cancer was examined by identifying its candidate mRNA target and by manipulating its expression levels in cultured cells.
Among the 20 miRNAs most differentially expressed between tested tumor and normal samples, a high expression level of miR-708 in the tumors was most strongly associated with an increased risk of death after adjustment for all clinically significant factors including age, sex, and tumor stage (FF cohort: HR, 1.90; 95% CI, 1.08-3.35; P=.025 and FFPE cohort: HR, 1.93; 95% CI, 1.02-3.63; P=.042). The transcript for the TMEM88 gene has a miR-708 binding site in its 3′ UTR and was significantly reduced in tumors with high miR-708 expression. Forced miR-708 expression reduced TMEM88 transcript levels and increased the rate of cell proliferation, invasion, and migration in culture.
MicroRNA-708 acts as an oncogene contributing to tumor growth and disease progression by directly downregulating TMEM88, a negative regulator of the Wnt signaling pathway in lung cancer.
NSCLC; adenocarcinoma; miR-708; never smoker; survival; TMEM88; Wnt signaling
The purpose of this study was to identify key genetic pathways involved in non-small cell lung cancer (NSCLC) and understand their role in tumor progression. We performed a genome-wide scan of paired tumors and 16 corresponding mucosal biopsies from four follow-up lung cancer patients on the Affymetrix 250K-NSpI array platform. We found that a single gene, SH3GL2, located on human chromosome 9p22, was most frequently deleted in all the tumors and corresponding mucosal biopsies. We further validated the alteration pattern of SH3GL2 in a substantial number of primary NSCLC tumors at the DNA and protein level. We also overexpressed wild-type SH3GL2 in three NSCLC cell lines to understand its role in NSCLC progression. Validation in 116 primary NSCLC tumors confirmed frequent loss of heterozygosity of SH3GL2 in 51% (49/97) of the informative cases. We found significantly reduced SH3GL2 protein expression (p=0.0015) in 71% (43/60) of primary tumors. Forced overexpression of wild-type (wt) SH3GL2 in three NSCLC cell lines resulted in a marked reduction of active epidermal growth factor receptor (EGFR) expression and an increase in EGFR internalization and degradation. Significantly decreased in vitro (p=0.0015–0.030) and in vivo (p=0.016) cellular growth, invasion (p=0.029–0.049), and colony formation (p=0.023–0.039) were also evident in the wt-SH3GL2-transfected cells, accompanied by markedly low expression of activated AKT (Ser473), STAT3 (Tyr705), and PI3K. Downregulation of the SH3GL2 interactor USP9X and of activated β-catenin was also evident in the SH3GL2-transfected cells. Our results indicate that SH3GL2 is frequently deleted in NSCLC and regulates cellular growth and invasion by modulating EGFR function.
Single nucleotide polymorphism array; Lung cancer; SH3GL2; Deletion
Lung cancer patients with mutations in EGFR tyrosine kinase have improved prognosis when treated with EGFR inhibitors. We hypothesized that EGFR mutations may be related to residential radon or passive tobacco smoke.
This hypothesis was investigated by analyzing EGFR mutations in seventy lung tumors from a population of never and long-term former female smokers from Missouri with detailed exposure assessments. The relationship with passive smoking was also examined in never-smoking female lung cancer cases from the Mayo Clinic.
Overall, the frequency of EGFR mutation was 41% [95% confidence interval (CI): 32-49%]. Neither radon nor passive-smoking exposure was consistently associated with EGFR mutations in lung tumors.
The results suggest that EGFR mutations are common in female, never-smoking lung cancer cases from the U.S., and that EGFR mutations are unlikely to be due to exposure to radon or passive smoking.
EGFR mutations; never-smokers; lung cancer; radon; passive-smoking; second hand smoke; tobacco smoke
Technical advancements in quantitative PCR (qPCR) instrumentation have made it possible to perform gene expression measurements using small sample input to support both basic and clinical research studies. As part of the strategic goals to assess new technologies and identify protocols that best fit the needs of the Mayo Clinic, we compared the Fluidigm BioMark system with standard Applied Biosystems (AB) instrumentation for mRNA and miRNA gene expression measurements. We also examined the performance of the BioMark system when using very low-input RNA.
We evaluated a set of control samples using the same TaqMan assays with both systems. We observed that the BioMark-generated data routinely yield Ct values approximately 10 cycles lower than those obtained with AB instrumentation. The correlations between the two platforms were high (r = 0.96) for both mRNA and miRNA expression experiments. For miRNA expression, a similarly high correlation was observed between fresh frozen and formalin-fixed, paraffin-embedded (FFPE) samples.
In an effort to accommodate our customers' needs, we also assessed the performance of the BioMark for measuring gene expression in very low-input samples. Using six standard TaqMan control assays (having high, medium and low expression levels), we observed that high-quality RNA inputs as low as 10 pg achieved linear amplification across four different pre-amplification cycle numbers (10, 14, 18 and 22). At 10 pg total RNA input, the low-expression control assay IPO8 demonstrated a correlation of r = 0.999 among the four pre-amplification cycle numbers. This linearity was also observed at higher RNA input levels, up to 10 ng. The only control assay that did not perform in a linear fashion across all input amounts and all pre-amplification cycle numbers was 18S ribosomal RNA. The highest correlation observed for 18S was r = 0.801, which supports the vendor's suggestion that 18S is not the best control assay option.
Glucose transporter-1 (GLUT-1) mediates the transport of glucose across the cellular membrane. Its elevated levels and/or activation have been shown to be associated with malignancy. The aim of this study was to investigate GLUT-1 expression in pulmonary neuroendocrine carcinomas. Tissue microarray-based samples of 178 neuroendocrine carcinomas, including 48 typical carcinoids, 31 atypical carcinoids, 27 large cell neuroendocrine carcinomas and 72 small cell carcinomas from different patients, were studied immunohistochemically for GLUT-1 expression. Forty-seven percent (75/161) of pulmonary neuroendocrine carcinomas were immunoreactive with GLUT-1. GLUT-1 was observed in 7% (3/46) of typical carcinoid, 21% (6/29) of atypical carcinoid, 74% (17/23) of large cell neuroendocrine carcinoma and 78% (49/63) of small cell carcinoma. GLUT-1 expression correlated with increasing patient age (P = 0.01) and with neuroendocrine differentiation/tumor type (P < 0.001), but not with gender, tumor size or stage. GLUT-1 expression was seen in a characteristic membranous pattern of staining along the luminal borders or adjacent to necrotic areas. GLUT-1 expression was associated with an increased risk of death for neuroendocrine carcinomas as a group (risk ratio = 2.519; 95% confidence interval = 1.519–4.178; P < 0.001) and carcinoids (risk ratio = 4.262; 95% confidence interval = 1.472–12.343; P = 0.01). In conclusion, GLUT-1 is expressed in approximately half of the pulmonary neuroendocrine carcinomas and shows a strong correlation with neuroendocrine differentiation/grade, but not with other clinicopathologic variables. Further studies appear plausible to elucidate the prognostic significance of GLUT-1 expression in pulmonary carcinoids.
GLUT-1; neuroendocrine carcinoma; carcinoid; survival; lung
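As a quick arithmetic check, the per-type counts reported in the abstract above can be summed to confirm the overall figure of 75/161 (47%). The counts are copied from the text; the denominators total 161 rather than the 178 arrayed samples, presumably because some cores were not evaluable, which is an assumption and is not stated in the abstract.

```python
# Counts as reported in the abstract: (GLUT-1 positive, evaluable) per tumor type.
counts = {
    "typical carcinoid": (3, 46),
    "atypical carcinoid": (6, 29),
    "large cell neuroendocrine carcinoma": (17, 23),
    "small cell carcinoma": (49, 63),
}

def percent(pos, total):
    """Percentage rounded to the nearest whole number, as in the text."""
    return round(100 * pos / total)

for tumor_type, (pos, total) in counts.items():
    print(f"{tumor_type}: {pos}/{total} = {percent(pos, total)}%")

total_pos = sum(p for p, _ in counts.values())
total_n = sum(n for _, n in counts.values())
print(f"overall: {total_pos}/{total_n} = {percent(total_pos, total_n)}%")
```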
The malignant transformation in several types of cancer, including lung cancer, results in a loss of growth inhibition by transforming growth factor-β (TGF-β). Here, we show that SMAD6 expression is associated with a reduced survival in lung cancer patients. Short hairpin RNA (shRNA)–mediated knockdown of SMAD6 in lung cancer cell lines resulted in reduced cell viability and increased apoptosis as well as inhibition of cell cycle progression. However, these results were not seen in Beas2B, a normal bronchial epithelial cell line. To better understand the mechanism underlying the association of SMAD6 with poor patient survival, we used a lentivirus construct carrying shRNA for SMAD6 to knock down expression of the targeted gene. Through gene expression analysis, we observed that knockdown of SMAD6 led to the activation of TGF-β signaling through up-regulation of plasminogen activator inhibitor-1 and phosphorylation of SMAD2/3. Furthermore, SMAD6 knockdown activated the c-Jun NH2-terminal kinase pathway and reduced phosphorylation of Rb-1, resulting in increased G0-G1 cell arrest and apoptosis in the lung cancer cell line H1299. These results jointly suggest that SMAD6 plays a critical role in supporting lung cancer cell growth and survival. Targeted inactivation of SMAD6 may provide a novel therapeutic strategy for lung cancers expressing this gene.
MicroRNAs play a role in regulating diverse biological processes and have considerable utility as molecular markers for diagnosis and monitoring of human disease. Several technologies are available commercially for measuring microRNA expression. However, cross-platform comparisons do not necessarily correlate well, making it difficult to determine which platform most closely represents the true microRNA expression level in a tissue. To address this issue, we have analyzed RNA derived from cell lines, as well as fresh frozen and formalin-fixed, paraffin-embedded tissues, using Affymetrix, Agilent, and Illumina microRNA arrays, NanoString counting, and Illumina Next Generation Sequencing. We compared the performance within and between the different platforms, and then verified these results against quantitative PCR data. Our results demonstrate that the within-platform reproducibility for each method is consistently high and, although the gene expression profiles from each platform show unique traits, comparison of genes that were commonly detectable showed that detection of microRNA transcripts was similar across multiple platforms.
Formalin-fixed, paraffin-embedded tissues are most commonly used for routine pathology analysis and for long-term tissue preservation in the clinical setting. Many institutions have large archives of formalin-fixed, paraffin-embedded tissues that provide a unique opportunity for understanding genomic signatures of disease. However, genome-wide expression profiling of formalin-fixed, paraffin-embedded samples has been challenging due to RNA degradation. Because of the significant heterogeneity in tissue quality, normalization and analysis of these data present particular challenges. The distributions of intensity values from archival tissues are inherently noisy and skewed due to differential sample degradation, raising two primary concerns: whether a highly skewed array will unduly influence initial normalization of the data and whether outlier arrays can be reliably identified.
Two simple extensions of common regression diagnostic measures are introduced that measure the stress an array undergoes during normalization and how much a given array deviates from the remaining arrays post-normalization. These metrics are applied to a study involving 1618 formalin-fixed, paraffin-embedded HER2-positive breast cancer samples from the N9831 adjuvant trial processed with Illumina’s cDNA-mediated Annealing, Selection, extension and Ligation (DASL) assay.
Proper assessment of array quality within a research study is crucial for controlling unwanted variability in the data. The metrics proposed in this paper have direct biological interpretations and can be used to identify arrays that should either be removed from analysis altogether or down-weighted to reduce their influence in downstream analyses.
High-dimensional array quality; Formalin-Fixed; Paraffin-embedded tissue; Outlier detection
Genetic variants that modify brain gene expression may also influence risk for human diseases. We measured expression levels of 24,526 transcripts in brain samples from the cerebellum and temporal cortex of autopsied subjects with Alzheimer's disease (AD, cerebellar n = 197, temporal cortex n = 202) and with other brain pathologies (non–AD, cerebellar n = 177, temporal cortex n = 197). We conducted an expression genome-wide association study (eGWAS) using 213,528 cisSNPs within ±100 kb of the tested transcripts. We identified 2,980 cerebellar cisSNP/transcript level associations (2,596 unique cisSNPs) significant in both ADs and non–ADs (q<0.05, p = 7.70×10−5–1.67×10−82). Of these, 2,089 were also significant in the temporal cortex (p = 1.85×10−5–1.70×10−141). The top cerebellar cisSNPs had 2.4-fold enrichment for human disease-associated variants (p<10−6). We identified novel cisSNP/transcript associations for human disease-associated variants, including progressive supranuclear palsy SLCO1A2/rs11568563, Parkinson's disease (PD) MMRN1/rs6532197, Paget's disease OPTN/rs1561570; and we confirmed others, including PD MAPT/rs242557, systemic lupus erythematosus and ulcerative colitis IRF5/rs4728142, and type 1 diabetes mellitus RPS26/rs1701704. In our eGWAS, there was 2.9–3.3 fold enrichment (p<10−6) of significant cisSNPs with suggestive AD–risk association (p<10−3) in the Alzheimer's Disease Genetics Consortium GWAS. These results demonstrate the significant contributions of genetic factors to human brain gene expression, which are reliably detected across different brain regions and pathologies. The significant enrichment of brain cisSNPs among disease-associated variants advocates gene expression changes as a mechanism for many central nervous system (CNS) and non–CNS diseases. Combined assessment of expression and disease GWAS may provide complementary information in discovery of human disease variants with functional implications. 
Our findings have implications for the design and interpretation of eGWAS in general and the use of brain expression quantitative trait loci in the study of human disease genetics.
Genetic variants that regulate gene expression levels can also influence human disease risk. Discovery of genomic loci that alter brain gene expression levels (brain expression quantitative trait loci = eQTLs) can be instrumental in the identification of genetic risk underlying both central nervous system (CNS) and non–CNS diseases. To systematically assess the role of brain eQTLs in human disease and to evaluate the influence of brain region and pathology in eQTL mapping, we performed an expression genome-wide association study (eGWAS) in 773 brain samples from the cerebellum and temporal cortex of ∼200 autopsied subjects with Alzheimer's disease (AD) and ∼200 with other brain pathologies (non–AD). We identified ∼3,000 significant associations between cisSNPs near ∼700 genes and their cerebellar transcript levels, which replicate in ADs and non–ADs. More than 2,000 of these associations were reproducible in the temporal cortex. The top cisSNPs are enriched for both CNS and non–CNS disease-associated variants. We identified novel and confirmed previous cisSNP/transcript associations for many disease loci, suggesting gene expression regulation as their mechanism of action. These findings demonstrate the reproducibility of the eQTL approach across different brain regions and pathologies, and advocate the combined use of gene expression and disease GWAS for identification and functional characterization of human disease-associated variants.
Glutathione S-transferase omega-1 and 2 genes (GSTO1, GSTO2), residing within an Alzheimer and Parkinson disease (AD and PD) linkage region, have diverse functions including mitigation of oxidative stress and may underlie the pathophysiology of both diseases. GSTO polymorphisms were previously reported to associate with risk and age-at-onset of these diseases, although inconsistent follow-up study designs make interpretation of results difficult. We assessed two previously reported SNPs, GSTO1 rs4925 and GSTO2 rs156697, in AD (3,493 ADs vs. 4,617 controls) and PD (678 PDs vs. 712 controls) for association with disease risk (case-controls), age-at-diagnosis (cases) and brain gene expression levels (autopsied subjects).
We found that the rs156697 minor allele associates with significantly increased risk (odds ratio = 1.14, p = 0.038) in the older ADs with age-at-diagnosis > 80 years. The minor allele of GSTO1 rs4925 associates with decreased risk in familial PD (odds ratio = 0.78, p = 0.034). There was no other association with disease risk or age-at-diagnosis. The minor alleles of both GSTO SNPs associate with lower brain levels of GSTO2 (p = 4.7 × 10−11–1.9 × 10−27), but not GSTO1. Pathway analysis of significant genes in our brain expression GWAS identified significant enrichment for glutathione metabolism genes (p = 0.003).
These results suggest that GSTO locus variants may lower brain GSTO2 levels and consequently confer AD risk in older age. Other glutathione metabolism genes should be assessed for their effects on AD and other chronic, neurologic diseases.
GSTO genes; Disease risk; Gene expression; Association
The dismal lethality of lung cancer is due to late stage at diagnosis and inherent therapeutic resistance. The incorporation of targeted therapies has modestly improved clinical outcomes, but the identification of new targets could further improve clinical outcomes by guiding stratification of poor-risk early stage patients and individualizing therapeutic choices. We hypothesized that a sequential, combined microarray approach would be valuable to identify and validate new targets in lung cancer. We profiled gene expression signatures during lung epithelial cell immortalization and transformation, and showed that genes involved in mitosis were progressively enhanced in carcinogenesis. 28 genes were validated by immunoblotting and 4 genes were further evaluated in non-small cell lung cancer tissue microarrays. Although CDK1 was highly expressed in tumor tissues, its loss from the cytoplasm unexpectedly predicted poor survival and conferred resistance to chemotherapy in multiple cell lines, especially microtubule-directed agents. An analysis of expression of CDK1 and CDK1-associated genes in the NCI60 cell line database confirmed the broad association of these genes with chemotherapeutic responsiveness. These results have implications for personalizing lung cancer therapy and highlight the potential of combined approaches for biomarker discovery.
Lung cancer in individuals who have never smoked tobacco products is an increasing medical and public-health issue. We aimed to unravel the genetic basis of lung cancer in never smokers.
We did a four-stage investigation. First, a genome-wide association study of single nucleotide polymorphisms (SNPs) was done with 754 never smokers (377 matched case-control pairs at Mayo Clinic, Rochester, MN, USA). Second, the top candidate SNPs from the first study were validated in two independent studies among 735 (MD Anderson Cancer Center, Houston, TX, USA) and 253 (Harvard University, Boston, MA, USA) never smokers. Third, further replication of the top SNP was done in 530 never smokers (UCLA, Los Angeles, CA, USA). Fourth, expression quantitative trait loci (eQTL) and gene-expression differences were analysed to further elucidate the causal relation between the validated SNPs and the risk of lung cancer in never smokers.
44 top candidate SNPs were identified that might alter the risk of lung cancer in never smokers. rs2352028 at chromosome 13q31.3 was subsequently replicated with an additive genetic model in the four independent studies, with a combined odds ratio of 1·46 (95% CI 1·26–1·70, p=5·94×10−6). A cis eQTL analysis showed there was a strong correlation between genotypes of the replicated SNPs and the transcription level of the gene GPC5 in normal lung tissues (p=1·96×10−4), with the high-risk allele linked with lower expression. Additionally, the transcription level of GPC5 in normal lung tissue was twice that detected in matched lung adenocarcinoma tissue (p=6·75×10−11).
Genetic variants at 13q31.3 alter the expression of GPC5, and are associated with susceptibility to lung cancer in never smokers. Downregulation of GPC5 might contribute to the development of lung cancer in never smokers.
MicroRNAs (miRNAs) represent a growing class of small non-coding RNAs that are important regulators of gene expression in both plants and animals. Studies have shown that miRNAs play a critical role in human cancer and that they can influence the level of cell proliferation and apoptosis by modulating gene expression. Currently, methods for the detection and measurement of miRNA expression include small and moderate-throughput technologies, such as standard quantitative PCR and microarray-based analysis. However, these methods have several limitations when used in large clinical studies, where a high-throughput and highly quantitative technology is needed for the efficient characterization of a large number of miRNA transcripts in clinical samples. Furthermore, archival formalin-fixed, paraffin-embedded (FFPE) samples are increasingly becoming the primary resource for gene expression studies because fresh frozen (FF) samples are often difficult to obtain and require special storage conditions. In this study, we evaluated the miRNA expression levels in FFPE and FF samples as well as several lung cancer cell lines employing a high-throughput qPCR-based microfluidic technology. The results were compared to standard qPCR and hybridization-based microarray platforms using the same samples.
We demonstrated highly correlated Ct values between multiplex and singleplex RT reactions in standard qPCR assays for miRNA expression using total RNA from A549 (R = 0.98; p < 0.0001) and H1299 (R = 0.95; p < 0.0001) lung cancer cell lines. The Ct values generated by the microfluidic technology (Fluidigm 48.48 dynamic array systems) resulted in a left-shift toward lower Ct values compared to those observed by ABI 7900 HT (mean difference, 3.79), suggesting that the microfluidic technology exhibited a greater sensitivity. In addition, we show that as little as 10 ng total RNA can be used to reliably detect all 48 or 96 tested miRNAs using a 96-multiplexing RT reaction in both FFPE and FF samples. Finally, we compared miRNA expression measurements in both FFPE and FF samples by qPCR using the 96.96 dynamic array and Affymetrix microarrays. Fold change comparisons for comparable genes between the two platforms indicated that the overall correlation was R = 0.60. The maximum fold change detected by the Affymetrix microarray was 3.5 compared to 13 by the 96.96 dynamic array.
The qPCR-array based microfluidic dynamic array platform can be used in conjunction with multiplexed RT reactions for miRNA gene expression profiling. We showed that this approach is highly reproducible and the results correlate closely with the existing singleplex qPCR platform at a throughput that is 5 to 20 times higher and a sample and reagent usage that was approximately 50-100 times lower than conventional assays. We established optimal conditions for using the Fluidigm microfluidic technology for rapid, cost effective, and customizable arrays for miRNA expression profiling and validation.
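The abstract above compares fold changes between platforms without stating how fold change was derived from Ct values. One common approach for qPCR data is the 2^-ΔΔCt calculation, sketched below as an illustration only; it is an assumption that a calculation of this kind applies here, the function name and the Ct values are hypothetical, and the formula assumes roughly 100% amplification efficiency.

```python
def fold_change_ddct(ct_target_case, ct_ref_case, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression (case vs. control) by the 2^-ddCt method.

    Each target Ct is first normalized to a reference assay in the same
    sample (dCt), then the case dCt is compared with the control dCt.
    Assumes the amount of product doubles every cycle.
    """
    d_case = ct_target_case - ct_ref_case
    d_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2 ** -(d_case - d_ctrl)

# Hypothetical Ct values: the target crosses threshold 3 cycles earlier in the
# case sample after normalization, i.e. about 8-fold higher expression.
print(fold_change_ddct(22.0, 18.0, 25.0, 18.0))  # prints 8.0
```

Because the calculation uses differences between Ct values, a uniform shift toward lower Ct on one instrument cancels out as long as it affects target and reference assays equally.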
The cDNA-mediated Annealing, Selection, extension and Ligation (DASL) assay has become a suitable gene expression profiling system for degraded RNA from paraffin-embedded tissue. We examined assay characteristics and the performance of the DASL 502-gene Cancer Panel v1 (1.5K) and 24,526-gene panel (24K) platforms at differentiating nine human epidermal growth factor receptor 2-positive (HER2+) and 11 HER2-negative (HER2-) paraffin-embedded breast tumors.
Bland-Altman plots and Spearman correlations were used to evaluate intra- and inter-panel agreement of normalized expression values. Unequal-variance t-statistics were used to test for differences in expression levels between HER2+ and HER2- tumors. Regulatory network analysis was performed using Metacore (GeneGo Inc., St. Joseph, MI).
Technical replicate correlations ranged between 0.815-0.956 and 0.986-0.997 for the 1.5K and 24K panels, respectively. Inter-panel correlations of expression values for the common 498 genes across the two panels ranged between 0.485-0.573. Inter-panel correlations of expression values of 17 probes with base-pair sequence matches between the 1.5K and 24K panels ranged between 0.652-0.899. In both panels, erythroblastic leukemia viral oncogene homolog 2 (ERBB2) was the most differentially expressed gene between the HER2+ and HER2- tumors, and seven additional genes had p-values < 0.05 and absolute log2 fold changes > 0.5 in expression between HER2+ and HER2- tumors: topoisomerase II alpha (TOP2A), cyclin a2 (CCNA2), v-fos fbj murine osteosarcoma viral oncogene homolog (FOS), wingless-type mmtv integration site family, member 5a (WNT5A), growth factor receptor-bound protein 7 (GRB7), cell division cycle 2 (CDC2), and baculoviral iap repeat-containing protein 5 (BIRC5). The top 52 discriminating probes from the 24K panel are enriched with genes belonging to the regulatory networks centered around v-myc avian myelocytomatosis viral oncogene homolog (MYC), tumor protein p53 (TP53), and estrogen receptor α (ESR1). Network analysis with a two-step extension also showed that the eight discriminating genes common to the 1.5K and 24K panels are functionally linked together through MYC, TP53, and ESR1.
The relative RNA abundances obtained from two gene panels of very different density are correlated, with eight common genes differentiating HER2+ and HER2- breast tumors. Network analyses demonstrated biological consistency between the 1.5K and 24K gene panels.
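The Bland-Altman agreement analysis named in the methods above reduces to two summary quantities: the mean difference between paired measurements (bias) and the 95% limits of agreement around it. A minimal sketch follows; the function name and the expression values are hypothetical, and the conventional 1.96 multiplier assumes the differences are roughly normally distributed.

```python
import statistics

def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements a and b.

    bias is the mean of the pairwise differences a - b; the limits of
    agreement are bias +/- 1.96 sample standard deviations of the differences.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical normalized expression values for the same genes on two panels.
panel_a = [8.2, 6.5, 10.1, 7.4, 9.0]
panel_b = [8.0, 6.9, 9.6, 7.5, 8.7]
bias, lower, upper = bland_altman(panel_a, panel_b)
print(f"bias={bias:.3f}, limits of agreement=({lower:.3f}, {upper:.3f})")
```

In a Bland-Altman plot these three values are drawn as horizontal lines over a scatter of each pair's difference against its mean.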
Exposure to secondhand smoke during adulthood has detrimental health effects, including increased lung cancer risk. Compared with adults, children may be more susceptible to secondhand smoke. This susceptibility may be exacerbated by alterations in inherited genetic variants of innate immunity genes. We hypothesized a positive association between childhood secondhand smoke exposure and lung cancer risk that would be modified by genetic polymorphisms in the mannose binding lectin-2 (MBL2) gene resulting in well-known functional changes in innate immunity.
Childhood secondhand smoke exposure and lung cancer risk were assessed among men and women in the ongoing National Cancer Institute-Maryland Lung Cancer (NCI-MD) study, which included 624 cases and 348 controls. Secondhand smoke history was collected via in-person interviews. DNA was used for genotyping the MBL2 gene. For replication, we used an independent case-control study from Mayo Clinic consisting of 461 never smokers (172 cases and 289 controls). All statistical tests were two-sided.
In the NCI-MD study, secondhand smoke exposure during childhood was associated with increased lung cancer risk among never smokers [odds ratio (OR), 2.25; 95% confidence interval (95% CI), 1.04-4.90]. This was confirmed in the Mayo study (OR, 1.47; 95% CI, 1.00-2.15). A functional MBL2 haplotype associated with high circulating levels of MBL and increased MBL2 activity was associated with increased lung cancer risk among those exposed to childhood secondhand smoke in both the NCI-MD and Mayo studies (OR, 2.52; 95% CI, 1.13-5.60, and OR, 2.78; 95% CI, 1.18-3.85, respectively).
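For readers less familiar with how an odds ratio and its confidence interval arise from case-control counts, the following is a minimal sketch of the crude (unadjusted) calculation using Woolf's log method. The abstract does not report the underlying 2x2 counts or state how the odds ratios were estimated (adjusted estimates from logistic regression are possible), so the function name `odds_ratio_ci` and all counts below are hypothetical and will not reproduce the reported values.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf (log-scale) confidence interval.

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    z: normal quantile (1.96 gives a two-sided 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not taken from either study).
print(odds_ratio_ci(20, 10, 10, 20))
```

Because the interval is built on the log scale, it is symmetric around the odds ratio only after taking logarithms, which is why reported intervals extend further above the point estimate than below it.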
Secondhand smoke exposure during childhood is associated with increased lung cancer risk among never smokers, particularly among those possessing a haplotype corresponding to a known overactive complement pathway of the innate immune system.
MicroRNAs (miRNAs) are known to be important regulators of both organ development and tumorigenesis. MiRNA networks and their regulation of messenger RNA (mRNA) translation and protein expression in specific biological processes are poorly understood.
We explored the dynamic regulation of miRNAs in mouse lung organogenesis. Comprehensive miRNA and mRNA profiling was performed encompassing all recognized stages of lung development beginning at embryonic day 12 and continuing to adulthood. We analyzed the expression patterns of dynamically regulated miRNAs and mRNAs using a number of statistical and computational approaches, and in an integrated manner with protein levels from an existing mass-spectrometry derived protein database for lung development.
In total, 117 statistically significant miRNAs were dynamically regulated during mouse lung organogenesis and clustered into distinct temporal expression patterns. 11,220 mRNA probes were also shown to be dynamically regulated and clustered into distinct temporal expression patterns, with 3 major patterns accounting for 75% of all probes. 3,067 direct miRNA-mRNA correlation pairs were identified involving 37 miRNAs. Two defined correlation patterns were observed upon integration with protein data: 1) increased levels of specific miRNAs directly correlating with downregulation of predicted mRNA targets; and 2) increased levels of specific miRNAs directly correlating with downregulation of translated target proteins without detectable changes in mRNA levels. Of 1345 proteins analyzed, 55% appeared to be regulated in this manner with a direct correlation between miRNA and protein level, but without detectable change in mRNA levels.
Systematic analysis of microRNA, mRNA, and protein levels over the time course of lung organogenesis demonstrates dynamic regulation and reveals 2 distinct patterns of miRNA-mRNA interaction. The translation of target proteins affected by miRNAs independent of changes in mRNA level appears to be a prominent mechanism of developmental regulation in lung organogenesis.
We seek to establish a genetic test to identify lung cancer using cells obtained through CT-guided fine needle aspiration (FNA).
We selected regions of frequent copy number gains in chromosomes 1q32, 3q26, 5p15, and 8q24 in non-small cell lung cancer (NSCLC) and tested their ability to determine the neoplastic state of cells obtained by FNA using fluorescence in situ hybridization (FISH). Two sets of samples were included. The pilot set included six paraffin-embedded non-cancerous lung tissues and 33 formalin-fixed FNA specimens. These 39 samples were used to establish the optimal fixation and signal scoring criteria for the samples. The test set included 40 FNA samples. The results of the genetic test were compared with the cytology, pathology, and clinical follow-up for each case to assess the sensitivity and specificity of the genetic test.
Non-tumor lung tissues had ≤4 signals per nucleus for all tested markers, while tumor samples had ≥5 signals per nucleus in five or more cells for at least one marker. Of the 40 test-set FNA samples, 36 (90%) were analyzable. Genetic analysis identified 15 cases as tumor and 21 as non-tumor. Clinical and pathological diagnoses confirmed the genetic test in 15 of 16 lung cancer cases regardless of tumor subtype, stage, or size and in 20 of 20 cases diagnosed as benign lung diseases.
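The counts above imply the test's sensitivity and specificity directly. The sketch below works through the arithmetic, taking the clinical and pathological diagnosis as the reference standard; the function name `sensitivity_specificity` is a hypothetical helper, and the mapping of the reported counts to true and false calls (15 true positives, 1 false negative, 20 true negatives, 0 false positives) is an inference from the figures given.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true cancers called tumor
    specificity = tn / (tn + fp)  # fraction of benign cases called non-tumor
    return sensitivity, specificity


# Counts inferred from the reported results: 15 of 16 cancers detected,
# 20 of 20 benign cases correctly called non-tumor.
sens, spec = sensitivity_specificity(tp=15, fn=1, tn=20, fp=0)
print(f"sensitivity = {sens:.4f}, specificity = {spec:.4f}")
```

These point estimates rest on 36 analyzable samples, so their confidence intervals would be wide.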
A set of only four genetic markers can distinguish the neoplastic state of lung lesions using small samples obtained through CT-guided FNA.
FNA; lung cancer; genetic markers; chromosome amplification; in situ hybridization
Tobacco smoking is responsible for over 90% of lung cancer cases, yet the precise molecular alterations that smoking induces in the lung, and that lead to cancer and affect survival, have remained obscure.
We performed gene expression analysis using HG-U133A Affymetrix chips on 135 fresh frozen tissue samples of adenocarcinoma and paired noninvolved lung tissue from current, former and never smokers, with biochemically validated smoking information. ANOVA adjusted for potential confounders, a multiple testing procedure, Gene Set Enrichment Analysis, and GO functional classification were used for gene selection. Results were confirmed in independent adenocarcinoma and non-tumor tissues from two studies. We identified a gene expression signature characteristic of smoking that includes cell cycle genes, particularly those involved in mitotic spindle formation (e.g., NEK2, TTK, PRC1). Expression of these genes strongly differentiated both smokers from non-smokers in lung tumors and early stage tumor tissue from non-tumor tissue (p<0.001 and fold-change >1.5, for each comparison), consistent with an important role for this pathway in lung carcinogenesis induced by smoking. These changes persisted many years after smoking cessation. NEK2 (p<0.001) and TTK (p = 0.002) expression in the noninvolved lung tissue was also associated with a 3-fold increased risk of mortality from lung adenocarcinoma in smokers.
Our work provides insight into the smoking-related mechanisms of lung neoplasia, and shows that the very mitotic genes known to be involved in cancer development are induced by smoking and affect survival. These genes are candidate targets for chemoprevention and treatment of lung cancer in smokers.
Identifying specific molecular markers and developing sensitive detection methods are two of the fundamental requirements for detection and differential diagnosis of cancer. Toward this goal, we first performed cDNA array analysis using 65 non-small cell lung cancer and non-involved normal lung tissues. We then used several complementary statistical and analytical methods to examine gene expression profiles generated by us and others from four independent sets of normal and neoplastic lung tissues. We report here that several sets of roughly 20 genes were sufficient to provide a robust distinction between normal and neoplastic tissues of the lung. Next we assessed the predictive ability of these gene sets by using Flow-Thru Chips® (FTC) (MetriGenix, Baltimore, MD) containing 20 genes to screen 48 primary lung tumors and normal lung tissues. Gene expression changes detected by FTC distinguished lung cancers from the normal lung tissues by using an RNA amount equivalent to that present in as few as 300 cells. We also used an independent set of 24 genes and showed that their expression profile was equally effective when measured by quantitative polymerase chain reaction (Q-PCR). Our results demonstrate that lung cancers can be identified based on the expression patterns of just 20 genes and that this approach is applicable for cancer diagnosis, prognosis, and monitoring using small amounts of tumor or biopsy samples.
cancer detection; class prediction; expression profiling; gene chip; non–small cell lung cancer; Flow-Thru Chip
Serial analysis of gene expression studies led us to identify a previously unknown gene, c20orf85, that is present in the normal lung epithelium, but absent or downregulated in most primary non-small cell lung cancers and lung cancer cell lines. We named this gene LLC1 for Low in Lung Cancer 1. LLC1 is located on chromosome 20q13.3 and has a 70% GC content in the promoter region. It has 4 exons and encodes a protein containing 137 amino acids. By in situ hybridization, we observed that LLC1 message is localized in normal lung bronchial epithelial cells, but absent in 13 of 14 lung adenocarcinoma and 9 of 10 lung squamous carcinoma samples. Methylation at CpG sites of the LLC1 promoter was frequently observed in lung cancer cell lines and in a fraction of primary lung cancer tissues. Treatment with 5-aza deoxycytidine resulted in reduced methylation of the LLC1 promoter concomitant with an increase in LLC1 expression. These results suggest that inactivation of LLC1 by means of promoter methylation is a frequent event in non-small cell lung cancer and may play a role in lung tumorigenesis.
nonsmall cell lung cancer; serial analysis of gene expression; promoter methylation