Very few analytical approaches have been reported to resolve the variability in microarray measurements stemming from sample heterogeneity. For example, tissue samples used in cancer studies are usually contaminated with the surrounding or infiltrating cell types. This heterogeneity in the sample preparation hinders further statistical analysis, significantly so if different samples contain different proportions of these cell types. Thus, sample heterogeneity can result in the identification of differentially expressed genes that may be unrelated to the biological question being studied. Similarly, irrelevant gene combinations can be discovered in the case of gene expression based classification.
We propose a computational framework for removing the effects of sample heterogeneity by "microdissecting" microarray data in silico. The computational method provides estimates of the expression values of the pure (non-heterogeneous) cell samples. The inversion of the sample heterogeneity can be facilitated by providing accurate estimates of the mixing percentages of different cell types in each measurement. For those cases where no such information is available, we develop an optimization-based method for joint estimation of the mixing percentages and the expression values of the pure cell samples. We also consider the problem of selecting the correct number of cell types.
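The case with known mixing percentages can be illustrated with a minimal sketch. It assumes a linear mixing model, in which each measured profile is a proportion-weighted sum of pure cell-type profiles, and uses ordinary least squares as a stand-in for the estimation step. The function name, variable names, and toy numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

def estimate_pure_profiles(mixed, proportions):
    """Least-squares estimate of pure cell-type expression (illustrative only).

    mixed:       (n_samples, n_genes) expression of heterogeneous samples
    proportions: (n_samples, n_types) known mixing fractions, rows summing to 1
    returns:     (n_types, n_genes) estimated pure expression profiles
    """
    mixed = np.asarray(mixed, dtype=float)
    proportions = np.asarray(proportions, dtype=float)
    # Solve proportions @ pure ~= mixed for all genes at once.
    pure, *_ = np.linalg.lstsq(proportions, mixed, rcond=None)
    # Expression cannot be negative; clip negative values caused by noise.
    return np.clip(pure, 0.0, None)

# Toy data (made up): two cell types, three genes, four mixed samples.
true_pure = np.array([[10.0, 2.0, 5.0],
                      [1.0, 8.0, 5.0]])
fractions = np.array([[0.9, 0.1], [0.7, 0.3], [0.4, 0.6], [0.2, 0.8]])
mixed_samples = fractions @ true_pure
estimated_pure = estimate_pure_profiles(mixed_samples, fractions)
```

With noise-free toy data the pure profiles are recovered exactly. Real measurements would need at least as many samples as cell types, with sufficiently different mixing fractions across samples.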
The efficiency of the proposed methods is illustrated by applying them to carefully controlled cDNA microarray data obtained from heterogeneous samples. The results demonstrate that the methods are capable of reconstructing both the sample and cell type specific expression values from heterogeneous mixtures and that the mixing percentages of different cell types can also be estimated. Furthermore, a general purpose model selection method can be used to select the correct number of cell types.
RNA-seq, a next-generation sequencing based method for transcriptome analysis, is rapidly emerging as the method of choice for comprehensive transcript abundance estimation. The accuracy of RNA-seq can be highly impacted by the purity of samples. A prominent, outstanding problem in RNA-seq is how to estimate transcript abundances in heterogeneous tissues, where a sample is composed of more than one cell type and the inhomogeneity can substantially confound the transcript abundance estimation of each individual cell type. Although experimental methods have been proposed to dissect multiple distinct cell types, computationally "deconvoluting" heterogeneous tissues provides an attractive alternative, since it keeps the tissue sample as well as the subsequent molecular content yield intact.
Here we propose a probabilistic model-based approach, Transcript Estimation from Mixed Tissue samples (TEMT), to estimate the transcript abundances of each cell type of interest from RNA-seq data of heterogeneous tissue samples. TEMT incorporates positional and sequence-specific biases, and its online EM algorithm only requires a runtime proportional to the data size and a small constant memory. We test the proposed method on both simulation data and recently released ENCODE data, and show that TEMT significantly outperforms current state-of-the-art methods that do not take tissue heterogeneity into account. Currently, TEMT only resolves the tissue heterogeneity resulting from two cell types, but it can be extended to handle tissue heterogeneity resulting from multiple cell types. TEMT is written in Python, and is freely available at https://github.com/uci-cbcl/TEMT.
The probabilistic model-based approach proposed here provides a new method for analyzing RNA-seq data from heterogeneous tissue samples. By applying the method to both simulation data and ENCODE data, we show that explicitly accounting for tissue heterogeneity can significantly improve the accuracy of transcript abundance estimation.
Interpreting gene expression profiles obtained from heterogeneous samples can be difficult because bulk gene expression measures are not resolved to individual cell populations. We have recently devised Population-Specific Expression Analysis (PSEA), a statistical method that identifies individual cell types expressing genes of interest and achieves quantitative estimates of cell type-specific expression levels. This procedure makes use of marker gene expression and circumvents the need for additional experimental information like tissue composition.
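A marker-based regression of this kind can be sketched as follows. This is an illustrative assumption about the general idea, not the PSEA implementation. For each cell population a reference signal is built by averaging its marker genes, and the gene of interest is then regressed on those signals, so that each coefficient reflects the contribution of one population. All names and numbers are hypothetical.

```python
import numpy as np

def reference_signal(expr, marker_rows):
    """Average the marker genes of one population and scale to mean 1."""
    signal = expr[marker_rows].mean(axis=0)
    return signal / signal.mean()

def fit_population_expression(gene_expr, signals):
    """Regress one gene on the reference signals.

    Returns the intercept followed by one coefficient per population.
    """
    design = np.column_stack([np.ones(len(gene_expr))] + list(signals))
    coef, *_ = np.linalg.lstsq(design, gene_expr, rcond=None)
    return coef

# Toy data (made up): 4 marker genes measured in 5 tissue samples.
neuron_amount = np.array([1.0, 2.0, 3.0, 4.0, 2.0])
astro_amount = np.array([2.0, 1.0, 1.0, 3.0, 4.0])
expr = np.vstack([2 * neuron_amount, 4 * neuron_amount,   # neuronal markers
                  3 * astro_amount, 1 * astro_amount])    # astrocytic markers
neuron_signal = reference_signal(expr, [0, 1])
astro_signal = reference_signal(expr, [2, 3])

# A gene expressed only in neurons, on top of a constant background.
gene_of_interest = 0.5 + 3.0 * neuron_signal
coef = fit_population_expression(gene_of_interest, [neuron_signal, astro_signal])
```

In this toy case the fitted astrocyte coefficient is zero, which is how such a regression would attribute the gene to the neuronal population.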
To systematically assess the performance of statistical deconvolution, we applied PSEA to gene expression profiles from cerebellum tissue samples and compared with parallel, experimental separation methods. Owing to the particular histological organization of the cerebellum, we could obtain cellular expression data from in situ hybridization and laser-capture microdissection experiments and successfully validated computational predictions made with PSEA. Upon statistical deconvolution of whole tissue samples, we identified a set of transcripts showing age-related expression changes in the astrocyte population.
PSEA can predict cell-type specific expression levels from tissue homogenates on a genome-wide scale. It thus represents a computational alternative to experimental separation methods and allowed us to identify age-related expression changes in the astrocytes of the cerebellum. These molecular changes might underlie important physiological modifications previously observed in the aging brain.
Genomics; Computational biology; Cerebellum; Gene expression; Aging; Astrocyte
Gene expression profiling studies based on DNA microarrays have demonstrated their ability to define the interaction pathways between neoplastic and nonmalignant stromal cells in cancer tissues. During the past ten years, a number of approaches including microdissection have tried to resolve the variability in DNA microarray measurements stemming from cancer tissue sample heterogeneity. Another approach, designated as virtual or in silico microdissection, avoids the laborious and time-consuming step of anatomic microdissection. It consists of comparing the gene expression profiles of complex tissue samples with those of cell lines representative of different cell lineages, different differentiation stages, or different signaling pathways. This strategy has been used in recent studies aiming to analyze microenvironment alterations using gene expression profiling of nonmicrodissected classical Hodgkin lymphoma tissues in order to generate new prognostic factors. These recent contributions are detailed and discussed in the present paper.
Uncertainty often affects molecular biology experiments and data for different reasons. Heterogeneity of gene or protein expression within the same tumor tissue is an example of biological uncertainty which should be taken into account when molecular markers are used in decision making. Tissue Microarray (TMA) experiments allow for large scale profiling of tissue biopsies, investigating protein patterns characterizing specific disease states. TMA studies deal with multiple sampling of the same patient, and therefore with multiple measurements of the same protein target, to account for possible biological heterogeneity. The aim of this paper is to provide and validate a classification model taking into consideration the uncertainty associated with measuring replicate samples.
We propose an extension of the well-known Naïve Bayes classifier, which accounts for biological heterogeneity in a probabilistic framework, relying on Bayesian hierarchical models. The model, which can be efficiently learned from the training dataset, exploits a closed-form classification equation, thus incurring no additional computational cost with respect to the standard Naïve Bayes classifier. We validated the approach on several simulated datasets, comparing its performance with that of the Naïve Bayes classifier. Moreover, we demonstrated that explicitly dealing with heterogeneity can improve classification accuracy on a TMA prostate cancer dataset.
The proposed Hierarchical Naïve Bayes classifier can be conveniently applied in problems where within sample heterogeneity must be taken into account, such as TMA experiments and biological contexts where several measurements (replicates) are available for the same biological sample. The performance of the new approach is better than the standard Naïve Bayes model, in particular when the within sample heterogeneity is different in the different classes.
For heterogeneous tissues, such as blood, measurements of gene expression are confounded by relative proportions of cell types involved. Conclusions have to rely on estimation of gene expression signals for homogeneous cell populations, e.g. by applying micro-dissection, fluorescence activated cell sorting, or in-silico deconfounding. We studied feasibility and validity of a non-negative matrix decomposition algorithm using experimental gene expression data for blood and sorted cells from the same donor samples. Our objective was to optimize the algorithm regarding detection of differentially expressed genes and to enable its use for classification in the difficult scenario of reversely regulated genes. This would be of importance for the identification of candidate biomarkers in heterogeneous tissues.
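To make the idea of a non-negative matrix decomposition concrete, the sketch below factors a genes-by-samples matrix into non-negative cell-type signatures and per-sample contributions. The abstract does not say which update scheme the authors used. The Lee-Seung multiplicative updates shown here are an assumption chosen for brevity, and all names and toy values are hypothetical.

```python
import numpy as np

def nmf(x, n_types, n_iter=200, seed=0, eps=1e-9):
    """Factor x (genes x samples) as w @ h with non-negative factors.

    w: (n_genes, n_types) cell-type signatures
    h: (n_types, n_samples) contribution of each type to each sample
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = x.shape
    w = rng.random((n_genes, n_types)) + eps
    h = rng.random((n_types, n_samples)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep both factors non-negative.
        h *= (w.T @ x) / (w.T @ w @ h + eps)
        w *= (x @ h.T) / (w @ h @ h.T + eps)
    return w, h

# Toy data (made up): 5 genes, 6 samples mixed from 2 cell types.
signatures = np.array([[9.0, 1.0], [1.0, 7.0], [4.0, 4.0], [6.0, 2.0], [0.5, 5.0]])
contributions = np.array([[0.9, 0.7, 0.5, 0.4, 0.2, 0.1],
                          [0.1, 0.3, 0.5, 0.6, 0.8, 0.9]])
blood_like = signatures @ contributions

w, h = nmf(blood_like, n_types=2)
reconstruction_error = np.linalg.norm(blood_like - w @ h)
```

The factors are only determined up to scaling and permutation of the cell types, so comparing `h` with measured proportions would require an additional normalization step.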
Experimental data and simulation studies involving noise parameters estimated from these data revealed that for valid detection of differential gene expression, quantile normalization and use of non-log data are optimal. We demonstrate the feasibility of predicting proportions of constituting cell types from gene expression data of single samples, as a prerequisite for a deconfounding-based classification approach.
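Quantile normalization, named above as part of the optimal preprocessing, can be sketched in a few lines. This is the generic textbook procedure applied to non-log values, not the authors' script. Ties within a sample are broken arbitrarily here, and the function name and example matrix are hypothetical.

```python
import numpy as np

def quantile_normalize(x):
    """Give every sample (column) the same empirical distribution.

    x: (n_genes, n_samples) non-log expression values.
    """
    x = np.asarray(x, dtype=float)
    # Rank of each value within its own column (0 = smallest).
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)
    # Mean of the k-th smallest values across all samples.
    mean_quantiles = np.sort(x, axis=0).mean(axis=1)
    # Replace each value by the mean quantile of its rank.
    return mean_quantiles[ranks]

# Toy matrix (made up): 4 genes measured in 3 samples.
raw = np.array([[5.0, 4.0, 3.0],
                [2.0, 1.0, 4.0],
                [3.0, 6.0, 6.0],
                [4.0, 2.0, 8.0]])
normalized = quantile_normalize(raw)
```

After normalization every column contains the same set of values, while the ordering of genes within each sample is preserved.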
Classification cross-validation errors with and without using deconfounding results are reported as well as sample-size dependencies. Implementation of the algorithm, simulation and analysis scripts are available.
The deconfounding algorithm without decorrelation using quantile normalization on non-log data is proposed for biomarkers that are difficult to detect, and for cases where confounding by varying proportions of cell types is the suspected reason. In this case, a deconfounding ranking approach can be used as a powerful alternative to, or complement of, other statistical learning approaches to define candidate biomarkers for molecular diagnosis and prediction in biomedicine, in realistically noisy conditions and with moderate sample sizes.
Rationale and Objectives
We have previously described a probabilistic model for the multiple-reader, multiple-case paradigm for ROC analysis. When the figure of merit is the Wilcoxon statistic, this model returns a seven-term expansion for the variance of this statistic as a function of the numbers of cases and readers. This probabilistic model also provides expressions for the coefficients in the seven-term expansion in terms of expectations over the internal noise, readers, and cases. Finally, this probabilistic model sets bounds on both the overall variance of the Wilcoxon statistic as well as the individual coefficients.
Materials and Methods
In this paper we will first validate the probabilistic model by comparing variances determined by direct computation of the expansion coefficients to empirical estimates of the variance using independent sampling. Validation of the probabilistic model will enable us to use the direct estimates of the expansion coefficients as a gold-standard to compare other coefficient-estimation techniques. Next, we develop a coefficient-estimation technique that employs bootstrapping to estimate the Wilcoxon statistic variance for different numbers of readers and cases. We then employ constrained, least-squares fitting techniques to estimate the expansion coefficients. The constraints used in this fitting are derived directly from the probabilistic model.
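The bootstrapping step can be illustrated with a short sketch. It computes the reader-averaged Wilcoxon statistic and estimates its variance by resampling readers and cases with replacement. This is a generic bootstrap written for illustration. It does not reproduce the seven-term expansion or the constrained least-squares fit, and the function names, array layout, and simulated scores are assumptions.

```python
import numpy as np

def wilcoxon_statistic(normal, diseased):
    """Reader-averaged Wilcoxon statistic (empirical AUC).

    normal:   (n_readers, n_normal) scores for normal cases
    diseased: (n_readers, n_diseased) scores for diseased cases
    """
    diff = diseased[:, :, None] - normal[:, None, :]
    # 1 for a correctly ordered pair, 0.5 for a tie, 0 otherwise.
    kernel = (diff > 0) + 0.5 * (diff == 0)
    return kernel.mean()

def bootstrap_variance(normal, diseased, n_boot=200, seed=0):
    """Bootstrap variance, resampling readers and both case sets."""
    rng = np.random.default_rng(seed)
    n_readers, n_normal = normal.shape
    n_diseased = diseased.shape[1]
    stats = np.empty(n_boot)
    for b in range(n_boot):
        r = rng.integers(0, n_readers, n_readers)    # resampled readers
        i = rng.integers(0, n_normal, n_normal)      # resampled normal cases
        j = rng.integers(0, n_diseased, n_diseased)  # resampled diseased cases
        stats[b] = wilcoxon_statistic(normal[np.ix_(r, i)], diseased[np.ix_(r, j)])
    return stats.var(ddof=1)

# Simulated scores (made up): 4 readers, 30 normal and 25 diseased cases.
rng = np.random.default_rng(1)
normal_scores = rng.normal(0.0, 1.0, size=(4, 30))
diseased_scores = rng.normal(1.0, 1.0, size=(4, 25))
auc = wilcoxon_statistic(normal_scores, diseased_scores)
auc_variance = bootstrap_variance(normal_scores, diseased_scores)
```

Repeating the variance estimate for several subsampled numbers of readers and cases would give the data points to which an expansion in those numbers could then be fitted.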
Results and Discussion
Using two different simulation studies, we show that the novel (and practical) bootstrapping/fitting technique returns estimates of the coefficients that are consistent with the gold standard. The results presented also serve to validate the seven-term expansion for the variance of the Wilcoxon statistic.
ROC analysis; multiple reader multiple case; Wilcoxon statistic
Aims—Laser capture microdissection is a recent development that enables the isolation of specific cell types for subsequent molecular analysis. This study describes a method for obtaining proteome information from laser capture microdissected tissue using colon cancer as a model.
Methods—Laser capture microdissection was performed on toluidine blue stained frozen sections of colon cancer. Tumour cells were selectively microdissected. Conditions were established for solubilising proteins from laser microdissected samples and these proteins were separated by two dimensional gel electrophoresis. Individual protein spots were cut from the gel, characterised by mass spectrometry, and identified by database searching. These results were compared with protein expression patterns and mass spectroscopic data obtained from bulk tumour samples run in parallel.
Results—Proteins could be recovered from laser capture microdissected tissue in a form suitable for two dimensional gel electrophoresis. The solubilised proteins retained their expected electrophoretic mobility in two dimensional gels as compared with bulk samples, and mass spectrometric analysis was also unaffected.
Conclusion—A method for performing two dimensional gel electrophoresis and mass spectrometry using laser capture microdissected tissue has been developed.
colon cancer; electrophoresis; proteomics
Breast tumors consist of several different tissue components. Despite the heterogeneity, most gene expression analyses have traditionally been performed without prior microdissection of the tissue sample. Thus, the gene expression profiles obtained reflect the mRNA contribution from the various tissue components. We utilized histopathological estimations of area fractions of tumor and stromal tissue components in 198 fresh-frozen breast tumor tissue samples for a cell type-associated gene expression analysis associated with distant metastasis. Sets of differentially expressed gene-probes were identified in tumors from patients who developed distant metastasis compared with those who did not, by weighting the contribution from each tumor with the relative content of stromal and tumor epithelial cells in their individual tumor specimen. The analyses were performed under various assumptions of mRNA transcription level from tumor epithelial cells compared with stromal cells. A set of 30 differentially expressed gene-probes was ascribed solely to carcinoma cells. Furthermore, two sets of 38 and five differentially expressed gene-probes were mostly associated with tumor epithelial and stromal cells, respectively. Finally, a set of 26 differentially expressed gene-probes was identified independently of cell type focus. The differentially expressed genes were validated in independent gene expression data from a set of laser capture microdissected invasive ductal carcinomas. We present a method for identifying and ascribing differentially expressed genes to tumor epithelial and/or stromal cells, by utilizing pathologic information and weighted t-statistics. Although a transcriptional contribution from the stromal cell fraction is detectable in microarray experiments performed on bulk tumor, the gene expression differences between the distant metastasis and no distant metastasis group were mostly ascribed to the tumor epithelial cells of the primary breast tumors.
However, the gene PIP5K2A was found to be significantly elevated in stromal cells in the distant metastasis group compared with stroma in the no distant metastasis group. These findings were confirmed in gene expression data from the representative compartments from microdissected breast tissue. The method described was also found to be robust to different histopathological procedures.
Cell type heterogeneity may have a substantial effect on gene expression profiling of human tissue. Several in silico methods for deconvoluting a gene expression profile into cell-type-specific subprofiles have been published but not widely used. Here, we consider recent methods and the experimental validations available for them. Shen-Orr et al. recently developed an approach called cell-type-specific significance analysis of microarray for deconvoluting gene expression. This method requires the measurement of the proportion of each cell type in each sample and the expression profiles of the heterogeneous samples. It determines how gene expression varies among pre-defined phenotypes for each cell type. Gene expression can vary substantially among cell types and sample heterogeneity can mask the identification of biologically important phenotypic correlations. Consequently, the deconvolution approach can be useful in the analysis of mixtures of cell populations in clinical samples.
Cells within tissues can be morphologically indistinguishable yet show molecular expression patterns that are remarkably heterogeneous. Here, we describe an approach for comprehensively identifying coregulated, heterogeneously expressed genes among cells that otherwise appear identical. The technique, called “stochastic profiling”, involves the repeated, random selection of very-small cell populations via laser-capture microdissection, followed by a customized single-cell amplification procedure and transcriptional profiling. Fluctuations in the resulting gene-expression measurements are then analyzed statistically to identify transcripts that are heterogeneously co-expressed. We stochastically profiled matrix-attached human epithelial cells in a three-dimensional culture model of mammary-acinar morphogenesis. Of 4,557 transcripts, we identified 547 genes with strong cell-to-cell expression differences. Clustering of this heterogeneous subset revealed several molecular “programs” implicated in protein biosynthesis, oxidative-stress responses, and nuclear factor-κB signaling, which were independently confirmed by RNA fluorescence in situ hybridization. Thus, stochastic profiling can reveal single-cell heterogeneities without measuring individual cells explicitly.
Tissue heterogeneity is a serious limiting factor for sound cell-specific molecular studies including genomic and proteomic analyses. Although tissue microdissection technologies (e.g. laser capture microdissection) have advanced tremendously over the last decades, several factors such as their generally high cost and inability to microdissect fresh or live tissues limit their widespread use. Therefore, there is a need for a low-cost and easy-to-use microdissection device. Here, we developed a low-cost vacuum-assisted capillary-based cell and tissue acquisition system (CTAS) and demonstrated its use for microdissection of brain tissue samples for several downstream applications including isolation of high quality RNA from microdissected brain tissue samples, their use for proteomics studies and electron microscopy as well as microdissection of native living brain tissues for primary cell culturing. Unlike LCM, CTAS is capable of microdissecting fresh frozen and live tissues, works in thicker tissue sections ranging from 10 µm to 300 µm and can collect individual cells, cell clusters and subanatomical regions. CTAS has been established as a straightforward and robust microdissection tool, allowing rapid, precise and efficient procurement of specific tissue and cell types at low cost. The developed microdissection protocol avoids extensive heating, chemical treatment, laser beam exposure, and other potentially harmful physical treatment of the tissue samples, thus preserving the primary functions of the dissected cells and the macromolecules within for subsequent downstream applications.
The cellular composition of heterogeneous samples can be predicted using an expression deconvolution algorithm to decompose their gene expression profiles based on pre-defined, reference gene expression profiles of the constituent populations in these samples. However, the expression profiles of the actual constituent populations are often perturbed from those of the reference profiles due to gene expression changes in cells associated with microenvironmental or developmental effects. Existing deconvolution algorithms do not account for these changes and give incorrect results when benchmarked against those measured by well-established flow cytometry, even after batch correction was applied. We introduce PERT, a new probabilistic expression deconvolution method that detects and accounts for a shared, multiplicative perturbation in the reference profiles when performing expression deconvolution. We applied PERT and three other state-of-the-art expression deconvolution methods to predict cell frequencies within heterogeneous human blood samples that were collected under several conditions (uncultured mono-nucleated and lineage-depleted cells, and culture-derived lineage-depleted cells). Only PERT's predicted proportions of the constituent populations matched those assigned by flow cytometry. Genes associated with cell cycle processes were highly enriched among those with the largest predicted expression changes between the cultured and uncultured conditions. We anticipate that PERT will be widely applicable to expression deconvolution strategies that use profiles from reference populations that vary from the corresponding constituent populations in cellular state but not cellular phenotypic identity.
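The baseline that such methods build on, decomposing a mixed profile against fixed reference profiles, can be sketched as follows. This is not PERT and does not model any perturbation of the references. It is a plain least-squares fit followed by clipping and renormalization, which is a simplification of a properly constrained solver. All names and numbers are hypothetical.

```python
import numpy as np

def estimate_proportions(mixed_profile, reference_profiles):
    """Estimate cell-type proportions of one heterogeneous sample.

    mixed_profile:      (n_genes,) expression of the mixed sample
    reference_profiles: (n_genes, n_types) expression of pure populations
    """
    coef, *_ = np.linalg.lstsq(reference_profiles, mixed_profile, rcond=None)
    coef = np.clip(coef, 0.0, None)   # proportions cannot be negative
    return coef / coef.sum()          # proportions sum to one

# Toy references (made up): 4 genes, 3 cell populations.
references = np.array([[10.0, 1.0, 2.0],
                       [1.0, 9.0, 3.0],
                       [2.0, 2.0, 8.0],
                       [5.0, 5.0, 5.0]])
true_proportions = np.array([0.5, 0.3, 0.2])
mixed = references @ true_proportions
estimated = estimate_proportions(mixed, references)
```

When the references match the constituent populations exactly, as in this toy case, the proportions are recovered. The failure mode described above arises when the true constituent profiles differ from the references, which this baseline cannot detect.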
The cellular composition of heterogeneous samples can be predicted from reference gene expression profiles that represent the homogeneous, constituent populations of the heterogeneous samples. However, existing methods fail when the reference profiles are not representative of the constituent populations. We developed PERT, a new probabilistic expression deconvolution method, to address this limitation. PERT was used to deconvolve the cellular composition of variably sourced and treated heterogeneous human blood samples. Our results indicate that even after batch correction is applied, cells presenting the same cell surface antigens display different transcriptional programs when they are uncultured versus culture-derived. Given gene expression profiles of culture-derived heterogeneous samples and profiles of uncultured reference populations, PERT was able to accurately recover proportions of the constituent populations composing the heterogeneous samples. We anticipate that PERT will be widely applicable to expression deconvolution strategies that use profiles from reference populations that vary from the corresponding constituent populations in cellular state but not cellular phenotypic identity.
Complicating proteomic analysis of whole tissues is the obvious problem of cell heterogeneity in tissues, which often results in misleading or confusing molecular findings. Thus, the coupling of tissue microdissection for tumor cell enrichment with capillary isotachophoresis-based selective analyte concentration not only serves as a synergistic strategy to characterize low abundance proteins, but it can also be employed to conduct comparative proteomic studies of human astrocytomas.
A set of fresh frozen brain biopsies were selectively microdissected to provide an enriched, high quality, and reproducible sample of tumor cells. Despite sharing many common proteins, there are significant differences in the protein expression level among different grades of astrocytomas. A large number of proteins, such as plasma membrane proteins EGFR and Erbb2, are up-regulated in glioblastoma. Besides facilitating the prioritization of follow-on biomarker selection and validation, comparative proteomics involving measurements in changes of pathways are expected to reveal the molecular relationships among different pathological grades of gliomas and potential molecular mechanisms that drive gliomagenesis.
Normal biological tissues harbour different populations of cells with intricate spatial distribution patterns resulting in heterogeneity of their overall cellular composition. Laser microdissection, involving direct viewing and expertise by a pathologist, enables access to defined cell populations or specific regions on any type of tissue sample, thus selecting near-pure populations of targeted cells. It opens the way for molecular methods directed towards well-defined populations, and also provides a powerful tool in studies focused on a limited number of cells. Laser microdissection has wide applications in oncology (diagnosis and research), cellular and molecular biology, biochemistry and forensics for tissue selection, but other areas have been gradually opened up to these new methodological approaches, such as cell cultures and cytogenetics. In clinical oncology trials, molecular profiling of microdissected samples can yield global “omics” information which, together with the morphological analysis of cells, can provide the basis for diagnosis, prognosis and patient-tailored treatments. This remarkable technology has brought new insights in the understanding of DNA, RNA, and the biological functions and regulation of proteins to identify molecular disease signatures. We review herein the different applications of laser microdissection in a variety of fields, and we particularly focus attention on the pre-analytical steps that are crucial to successfully perform molecular-level investigations.
Laser microdissection; histopathology; quality control; snap-freezing; DNA; RNA; proteomics; in situ cellular and molecular analyses
Gene expression analysis is generally performed on heterogeneous tissue samples consisting of multiple cell types. Current methods developed to separate heterogeneous gene expression rely on prior knowledge of the cell-type composition and/or signatures, which are not available in most public datasets. We present a novel method to identify the cell-type composition, signatures and proportions per sample without need for a-priori information. The method was successfully tested on controlled and semi-controlled datasets and performed as accurately as current methods that do require additional information. As such, this method enables the analysis of cell-type specific gene expression using existing large pools of publicly available microarray datasets.
Gene expression microarrays are widely used to uncover biological insights. Most microarray experiments profile whole tissues containing mixtures of multiple cell-types. As such, gene expression differences between samples may be due to different cellular compositions or biological differences, highly limiting the conclusions derived from the analysis. All current approaches to computationally separate the heterogeneous gene expression into individual cell-types require that the identity and relative amount of the cell-types in the tissue, or their individual gene expression, are known. Publicly available microarray-based datasets, which include thousands of patient samples, do not usually measure this information, rendering existing separation methods unusable. We developed a novel approach to estimate the number of cell-types, identities, individual gene expression and relative proportions in heterogeneous tissues with no a-priori information except for an initial estimate of the cell-types in the tissue analyzed and general reference signatures of these cell-types that may be easily obtained from public databases. We successfully applied our method to microarray datasets, yielding highly accurate estimations, which often exceed the performance of separation methods that require prior information. Thus, our method can be accurately applied to any heterogeneous dataset, where re-examination and analysis of the individual cell-types in the heterogeneous tissue can aid in discovering new aspects regarding these diseases.
Diagnosis of Barrett's esophagus (BE) is typically done through morphologic analysis of esophageal tissue biopsy. Such samples contain several cell types. Laser capture microdissection (LCM) allows the isolation of specific cells from heterogeneous cell populations. The purpose of this study was to determine the degree of overlap of the two sample types and to define a set of genes that may serve as biochemical markers for BE.
We obtained biopsies from regions of the glandular tissue of BE and normal esophagus from 9 subjects with BE. Samples from 5 subjects were examined as whole tissue (BE [whole]; E [whole]), and in 4 subjects the glandular epithelium of BE was isolated using LCM (BE [LCM]) and compared to the averaged values (E [LCM]) for both basal cell (B [LCM]) and squamous cell (S [LCM]) epithelium.
Gene expression analysis revealed 1797 differentially expressed probesets between BE [whole] and E [whole] (fold change > 2.0; p < 0.001). Most (74%) were also differentially expressed between BE [LCM] and E [LCM], showing that there was high concordance between the two sampling methods. LCM provided a great deal of additional information (2113 genes) about the alterations in gene expression that may represent the BE phenotype.
There are differences in gene expression profiles depending on whether specimens are whole tissue biopsies or LCM dissected. Whole tissue biopsies should prove satisfactory for diagnostic purposes. Because the data from LCM samples delineated many more Barrett's specific genes, this procedure may provide more information regarding pathogenesis than whole tissue material.
The molecular examination of pathologically altered cells and tissues at the DNA, RNA, and protein level has revolutionised research and diagnostics in pathology. However, the inherent heterogeneity of primary tissues with an admixture of various reactive cell populations can affect the outcome and interpretation of molecular studies. Recently, microdissection of tissue sections and cytological preparations has been used increasingly for the isolation of homogeneous, morphologically identified cell populations, thus overcoming the obstacle of tissue complexity. In conjunction with sensitive analytical techniques, such as the polymerase chain reaction, microdissection allows precise in vivo examination of cell populations, such as carcinoma in situ or the malignant cells of Hodgkin's disease, which are otherwise inaccessible for conventional molecular studies. However, most microdissection techniques are very time consuming and require a high degree of manual dexterity, which limits their practical use. Laser capture microdissection (LCM), a novel technique developed at the National Cancer Institute, is an important advance in terms of speed, ease of use, and versatility of microdissection. LCM is based on the adherence of visually selected cells to a thermoplastic membrane, which overlies the dehydrated tissue section and is focally melted by triggering of a low energy infrared laser pulse. The melted membrane forms a composite with the selected tissue area, which can be removed by simple lifting of the membrane. LCM can be applied to a wide range of cell and tissue preparations including paraffin wax embedded material. The use of immunohistochemical stains allows the selection of cells according to phenotypic and functional characteristics. Depending on the starting material, DNA, good quality mRNA, and proteins can be extracted successfully from captured tissue fragments, down to the single cell level. 
In combination with techniques like expression library construction, cDNA array hybridisation and differential display, LCM will allow the establishment of "genetic fingerprints" of specific pathological lesions, especially malignant neoplasms. In addition to the identification of new diagnostic and prognostic markers, this approach could help in establishing individualised treatments tailored to the molecular profile of a tumour. This review provides an overview of the technique of LCM, summarises current applications and new methodical approaches, and tries to give a perspective on future developments. In addition, LCM is compared with other recently developed laser microdissection techniques.
Key Words: laser capture microdissection • RNA analysis • DNA analysis • gene expression • profiling • immunohistochemistry
Laser capture microdissection (LCM) allows the precise procurement of enriched cell populations from a heterogeneous tissue, or live cell culture, under direct microscopic visualization. Histologically enriched cell populations can be procured by harvesting cells of interest directly, or by isolating specific cells through ablation of unwanted cells. The basic components of laser microdissection technology are a) visualization of cells via light microscopy, b) transfer of laser energy to a thermolabile polymer with formation of a polymer-cell composite (capture method) or transfer of laser energy via an ultraviolet laser to photovolatize a region of tissue (cutting method), and c) removal of cells of interest from the heterogeneous tissue section. The capture and cutting methods (instruments) for laser microdissection differ in the manner by which cells of interest are removed from the heterogeneous sample. Laser energy in the capture method is infrared (810 nm), while in the cutting method it is ultraviolet (355 nm). Infrared lasers melt a thermolabile polymer that adheres to the cells of interest, whereas ultraviolet lasers ablate cells for either removal of unwanted cells or excision of a defined area of cells. LCM technology supports a range of applications including mass spectrometry, DNA genotyping and loss-of-heterozygosity analysis, RNA transcript profiling, cDNA library generation, proteomics discovery, and signal kinase pathway profiling. This chapter describes laser capture microdissection using an ArcturusXT instrument for protein LCM sample analysis, and using a mmi CellCut Plus® instrument for RNA analysis via NanoString technology.
DNA; infrared laser; laser capture microdissection; molecular profiling; NanoString; phosphoprotein; pre-analytical variability; protein; RNA; tissue; tissue heterogeneity; UV laser
Laser-capture microdissection (LCM), which enables the isolation of specific cell populations from complex tissues under morphological control, is increasingly used ahead of gene expression studies in cell biology by methods such as real-time quantitative PCR (qPCR), microarrays and, most recently, RNA-sequencing. The challenges are i) to select the cells of interest precisely and efficiently and ii) to maintain RNA integrity. The mammary gland is a complex and heterogeneous tissue consisting of multiple cell types whose relative proportions change during development, which hampers the comparison of whole-tissue gene expression profiles between physiological stages. During lactation, mammary epithelial cells (MEC) are predominant. However, several other cell types, including myoepithelial (MMC) and immune cells, are present, making it difficult to attribute gene expression to its cell type of origin. In this work, an optimized and reliable procedure has been developed for producing RNA from alveolar epithelial cells isolated from frozen histological sections of lactating goat, sheep and cow mammary glands using an infrared-laser based Arcturus Veritas LCM (Applied Biosystems®) system. The steps of the microdissection workflow (cryosectioning, staining, dehydration and harvesting of microdissected cells) have been carefully considered and designed to ensure cell capture efficiency without compromising RNA integrity.
The best results were obtained when staining 8 μm-thick sections with Cresyl violet® (Ambion, Applied Biosystems®) and capturing microdissected cells for less than 2 hours before RNA extraction. In addition, particular attention was paid to animal preparation before biopsies or slaughtering (milking) and to the freezing of tissue blocks, which were embedded in a cryoprotective compound before being immersed in isopentane. The amount of RNA thus obtained from ca. 150 to 250 acini (300,000 to 600,000 μm²) ranges from 5 to 10 ng. The RNA integrity number (RIN) was ca. 8.0, and the selectivity of this LCM protocol was demonstrated through qPCR analyses for several alveolar cell specific genes, including LALBA (α-lactalbumin) and CSN1S2 (αs2-casein), as well as Krt14 (cytokeratin 14), CD3e and CD68, which are specific markers of MMC, lymphocytes and macrophages, respectively.
RNAs isolated from MEC in this manner were of very good quality for subsequent linear amplification, thus making it possible to establish a referential gene expression profile of the healthy MEC, a useful platform for tumor biomarker discovery.
We examined gene expression profiles of tumor cells from 29 untreated patients with lung cancer (10 adenocarcinomas (AC), 10 squamous cell carcinomas (SCC), and 9 small cell lung cancer (SCLC)) in comparison to 5 samples of normal lung tissue (NT). The European and American methodological quality guidelines for microarray experiments were followed, including the stipulated use of laser capture microdissection for separation and purification of the lung cancer tumor cells from surrounding tissue.
Based on differentially expressed genes, different lung cancer samples could be distinguished from each other and from normal lung tissue using hierarchical clustering. Comparing AC, SCC and SCLC with NT, we found 205, 335 and 404 genes, respectively, that were at least 2-fold differentially expressed (estimated false discovery rate: < 2.6%). Different lung cancer subtypes had distinct molecular phenotypes, which also reflected their biological characteristics. Differentially expressed genes in human lung tumors which may be of relevance in the respective lung cancer subtypes were corroborated by quantitative real-time PCR.
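The selection rule described above (a minimum fold change combined with a false discovery rate bound) can be sketched as follows. This is a minimal illustration, not the study's pipeline: the abstract does not say how the false discovery rate was estimated, so the Benjamini-Hochberg adjustment, the function names and the default thresholds here are all assumptions.

```python
import math

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_genes(tumor_mean, normal_mean, pvalues, min_fold=2.0, max_fdr=0.05):
    """Indices of genes changed at least min_fold in either direction
    whose adjusted p-value is at most max_fdr (both thresholds assumed)."""
    qvalues = benjamini_hochberg(pvalues)
    selected = []
    for i, (t, n, q) in enumerate(zip(tumor_mean, normal_mean, qvalues)):
        log2_fc = math.log2(t / n)  # positive: up in tumor; negative: down
        if abs(log2_fc) >= math.log2(min_fold) and q <= max_fdr:
            selected.append(i)
    return selected
```

Working on the log2 scale treats up- and down-regulation symmetrically, so a gene at one quarter of the normal level passes the same 2-fold filter as a gene at four times the normal level.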
Genetic programming (GP) was performed to construct a classifier for distinguishing between AC, SCC, SCLC, and NT. Forty genes that could be used to correctly classify the tumor or NT samples were identified. In addition, all samples from an independent test set of 13 further tumors (AC or SCC) were also correctly classified.
The data from this research identified potential candidate genes which could be used as the basis for the development of diagnostic tools and lung tumor type-specific targeted therapies.
An important need of many cancer research projects is the availability of high-quality, appropriately selected tissue. Tissue biorepositories are organized to collect, process, store, and distribute samples of tumor and normal tissue for further use in fundamental and translational cancer research. This, in turn, provides investigators with an invaluable resource of appropriately examined and characterized tissue specimens and linked patient information. Human tissues, in particular tumor tissues, are complex structures composed of heterogeneous mixtures of morphologically and functionally distinct cell types. It is essential to analyze specific cell types to identify and define accurately the biologically important processes in pathologic lesions. Laser capture microdissection (LCM) is a state-of-the-art technology that provides the scientific community with a rapid and reliable method to isolate a homogeneous population of cells from heterogeneous tissue specimens, thus providing investigators with the ability to analyze DNA, RNA, and protein accurately from pure populations of cells. LCM is particularly well suited to isolating tumor cells, which can be captured from complex tissue samples. The combination of LCM and a tissue biorepository offers a comprehensive means by which researchers can use valuable human biospecimens and cutting-edge technology to facilitate basic, translational, and clinical research. This review provides an overview of LCM technology with an emphasis on the applications of LCM in the setting of a tissue biorepository, based on the author's extensive experience in LCM procedures acquired at Fox Chase Cancer Center and Hollings Cancer Center.
pathology; cancer biology; cells of interest
In this paper we introduce an efficient algorithm for alignment of multiple large-scale biological networks. In this scheme, we first compute a probabilistic similarity measure between nodes that belong to different networks using a semi-Markov random walk model. The estimated probabilities are further enhanced by incorporating the local and the cross-species network similarity information through the use of two different types of probabilistic consistency transformations. The transformed alignment probabilities are used to predict the alignment of multiple networks based on a greedy approach. We demonstrate that the proposed algorithm, called SMETANA, outperforms many state-of-the-art network alignment techniques in terms of computational efficiency, alignment accuracy, and scalability. Our experiments show that SMETANA can align tens of genome-scale networks with thousands of nodes on a personal computer. The source code of SMETANA can be downloaded from http://www.ece.tamu.edu/~bjyoon/SMETANA/.
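The final step described above, predicting an alignment greedily from node-pair probabilities, can be illustrated with a deliberately simplified sketch. It is not SMETANA's implementation: it handles only two networks, enforces a one-to-one mapping, and takes the probabilities as given rather than deriving them from a semi-Markov random walk or consistency transformations. The function name `greedy_align` and its `threshold` parameter are hypothetical.

```python
def greedy_align(pair_probs, threshold=0.0):
    """Greedy one-to-one alignment from a {(u, v): probability} mapping.

    u is a node of the first network, v a node of the second.
    Pairs with probability <= threshold are never aligned.
    """
    aligned = []
    used_u, used_v = set(), set()
    # Visit pairs from most to least probable; ties are broken by node
    # names so the result is deterministic.
    ranked = sorted(pair_probs.items(), key=lambda kv: (-kv[1], kv[0]))
    for (u, v), p in ranked:
        if p <= threshold:
            break  # every remaining pair is at or below the threshold
        if u not in used_u and v not in used_v:
            aligned.append((u, v, p))
            used_u.add(u)
            used_v.add(v)
    return aligned
```

A greedy pass costs one sort over the candidate pairs, which is one reason this style of construction scales to large networks, at the price of not guaranteeing a globally optimal matching.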
Single-cell variations in gene and protein expression are important during development and disease. Cell-to-cell heterogeneities can be directly inspected one cell at a time, but global methods are usually not sensitive enough to work with such a small amount of starting material. Here, we provide a detailed protocol for stochastic profiling, a method that infers single-cell regulatory heterogeneities by repeatedly sampling small collections of cells at random. Repeated stochastic sampling is performed by laser capture microdissection or limiting dilution, followed by careful exponential cDNA amplification, hybridization to microarrays, and statistical analysis. Stochastic profiling surveys the transcriptome for programs that are heterogeneously regulated among cellular subpopulations in their native tissue context. The protocol is readily optimized for specific biological applications and takes about one week to complete.
heterogeneity; single-cell; stochastic; systems biology; gene expression; cancer; development
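The idea behind repeated random sampling of small cell pools can be illustrated with a toy simulation. This is a sketch under stated assumptions rather than the protocol's statistical analysis: the pool size of 10 cells, the 50 pools, the expression levels (arbitrary units), the noise model and all function names are invented for illustration.

```python
import random
import statistics

def pool_sums(cell_sampler, cells_per_pool, n_pools, rng):
    """Summed expression of one gene in each randomly drawn pool of cells."""
    return [
        sum(cell_sampler(rng) for _ in range(cells_per_pool))
        for _ in range(n_pools)
    ]

def homogeneous_cell(rng):
    # Every cell expresses the gene at about the same level.
    return 1.0 + rng.gauss(0.0, 0.1)

def heterogeneous_cell(rng, high_fraction=0.2):
    # A minority of cells is in a high-expression state (assumed fraction).
    base = 10.0 if rng.random() < high_fraction else 0.0
    return base + rng.gauss(0.0, 0.1)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
hom = pool_sums(homogeneous_cell, cells_per_pool=10, n_pools=50, rng=rng)
het = pool_sums(heterogeneous_cell, cells_per_pool=10, n_pools=50, rng=rng)

var_hom = statistics.variance(hom)
var_het = statistics.variance(het)
print(f"pool-to-pool variance, homogeneous gene:   {var_hom:.2f}")
print(f"pool-to-pool variance, heterogeneous gene: {var_het:.2f}")
```

Because each pool contains a random number of high-expressing cells, the heterogeneously regulated gene fluctuates far more from pool to pool than the uniformly expressed one, and that excess variability is the signal a small-pool sampling design looks for.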
AIM: To develop a method of labeling and micro-dissecting mouse Kupffer cells within an extraordinarily short period of time using laser capture microdissection (LCM).
METHODS: Tissues are complex structures comprised of a heterogeneous population of interconnected cells. LCM offers a method of isolating a single cell type from specific regions of a tissue section. LCM is an essential approach used in conjunction with molecular analysis to study the functional interaction of cells in their native tissue environment. The process of labeling and acquiring cells by LCM prior to mRNA isolation can be elaborate, thereby subjecting the RNA to considerable degradation. Kupffer cell labeling is achieved by injecting India ink intravenously, thus circumventing the need for in vitro staining. The significance of this novel approach was validated using a cholestatic liver injury model.
RESULTS: mRNA extracted from the microdissected cell population displayed marked increases in colony-stimulating factor-1 receptor and Kupffer cell receptor message expression, which demonstrated Kupffer cell enrichment. Gene expression by Kupffer cells derived from bile-duct-ligated, versus sham-operated, mice was compared. Microarray analysis revealed a significant (2.5-fold, q value < 10) change in 493 genes. Based on this fold-change and a standardized PubMed search, 10 genes were identified that were relevant to the ability of Kupffer cells to suppress liver injury.
CONCLUSION: The methodology outlined herein provides an approach to isolating high quality RNA from Kupffer cells, without altering the tissue integrity.
Kupffer cells; India ink; Laser capture microdissection; Bile duct ligation; DNA microarray