We have developed flowMeans, a time-efficient and accurate method for automated identification of cell populations in flow cytometry (FCM) data based on K-means clustering. Unlike traditional K-means, flowMeans can identify concave cell populations by modelling a single population with multiple clusters. flowMeans uses a change point detection algorithm to determine the number of sub-populations, enabling the method to be used in high throughput FCM data analysis pipelines. Our approach compares favourably to manual analysis by human experts and current state-of-the-art automated gating algorithms. flowMeans is freely available as an open source R package through Bioconductor.
flow cytometry; data analysis; cluster analysis; model selection; bioinformatics; statistics
Flow cytometry (FCM) software packages from R/Bioconductor, such as flowCore and flowViz, serve as an open platform for development of new analysis tools and methods. We created plateCore, a new package that extends the functionality in these core packages to enable automated negative control-based gating and make the processing and analysis of plate-based data sets from high-throughput FCM screening experiments easier. plateCore was used to analyze data from a BD FACS CAP screening experiment where five Peripheral Blood Mononuclear Cell (PBMC) samples were assayed for 189 different human cell surface markers. This same data set was also manually analyzed by a cytometry expert using the FlowJo data analysis software package (TreeStar, USA). We show that the expression values for markers characterized using the automated approach in plateCore are in good agreement with those from FlowJo, and that using plateCore allows for more reproducible analyses of FCM screening data.
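Negative control-based gating can be sketched in a few lines. The example assumes the gate is placed at a high percentile of the negative control's intensities (the 99th percentile is an assumption chosen for illustration) and reports the percentage of stained events above it. The function names are hypothetical and do not come from plateCore.

```python
def percentile(values, q):
    """q-th percentile (0-100) of values using linear interpolation."""
    s = sorted(values)
    if not s:
        raise ValueError("values must not be empty")
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def percent_positive(stained, negative_control, q=99.0):
    """Percent of stained events above the q-th percentile of the control."""
    threshold = percentile(negative_control, q)
    above = sum(1 for v in stained if v > threshold)
    return 100.0 * above / len(stained)


# Toy intensities: the control spans 0..99, two of four stained events exceed it.
control = list(range(100))
stained = [10, 50, 200, 300]
positive = percent_positive(stained, control)
```

Because the threshold is derived from the control well rather than drawn by hand, the same rule can be applied unchanged to every well on a plate.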
Effective quality assessment is an important part of any high-throughput flow cytometry data analysis pipeline, especially when considering the complex designs of the typical flow experiments applied in clinical trials. Technical issues like instrument variation, problematic antibody staining, or reagent lot changes can lead to biases in the extracted cell subpopulation statistics. These biases can manifest themselves in non–obvious ways that can be difficult to detect without leveraging information about the study design or other experimental metadata. Consequently, a systematic and integrated approach to quality assessment of flow cytometry data is necessary to effectively identify technical errors that impact multiple samples over time. Gated cell populations and their statistics must be monitored within the context of the experimental run, assay, and the overall study.
We have developed two new packages, flowWorkspace and QUAliFiER, to construct a pipeline for quality assessment of gated flow cytometry data. flowWorkspace makes manually gated data accessible to BioConductor’s computational flow tools by importing pre–processed and gated data from the widely used manual gating tool, FlowJo (Tree Star Inc, Ashland OR). The QUAliFiER package takes advantage of the manual gates to perform an extensive series of statistical quality assessment checks on the gated cell sub–populations while taking into account the structure of the data and the study design to monitor the consistency of population statistics across staining panels, subject, aliquots, channels, or other experimental variables. QUAliFiER implements SVG–based interactive visualization methods, allowing investigators to examine quality assessment results across different views of the data, and it has a flexible interface allowing users to tailor quality checks and outlier detection routines to suit their data analysis needs.
We present a pipeline constructed from two new R packages for importing manually gated flow cytometry data and performing flexible and robust quality assessment checks. The pipeline addresses the increasing demand for tools capable of performing quality checks on large flow data sets generated in typical clinical trials. The QUAliFiER tool objectively, efficiently, and reproducibly identifies outlier samples in an automated manner by monitoring cell population statistics from gated or ungated flow data conditioned on experiment–level metadata.
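One way to picture outlier detection conditioned on experiment-level metadata is sketched below. It assumes each sample contributes one population statistic and one grouping key (for example, a staining panel), and it flags values that lie far from their group median on a robust scale. The median/MAD rule, the cutoff of 3.5, and all names are illustrative assumptions rather than the QUAliFiER interface.

```python
from collections import defaultdict
from statistics import median


def flag_outliers(records, cutoff=3.5):
    """Flag samples whose statistic is far from the median of their group.

    records: iterable of (sample_id, group, value) tuples, where group is an
    experiment-level metadata key such as a staining panel.
    Returns the set of sample_ids whose robust z-score exceeds cutoff.
    """
    by_group = defaultdict(list)
    for sample_id, group, value in records:
        by_group[group].append((sample_id, value))

    flagged = set()
    for members in by_group.values():
        values = [v for _, v in members]
        med = median(values)
        mad = median(abs(v - med) for v in values)
        if mad == 0:
            continue  # no spread in this group, nothing to compare against
        for sample_id, v in members:
            # 0.6745 rescales the MAD to match a standard deviation
            # for normally distributed data.
            if abs(0.6745 * (v - med) / mad) > cutoff:
                flagged.add(sample_id)
    return flagged


# Hypothetical percentages of a gated population in two staining panels.
records = [
    ("s1", "panelA", 40.1), ("s2", "panelA", 39.8), ("s3", "panelA", 40.4),
    ("s4", "panelA", 12.0), ("s5", "panelA", 40.0),
    ("s6", "panelB", 5.0), ("s7", "panelB", 5.2), ("s8", "panelB", 4.9),
]
outliers = flag_outliers(records)
```

Grouping before scoring matters here: a value of 5% is normal in the second panel but would look extreme if all samples were pooled.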
Flow cytometry; Quality assessment; BioConductor package
In a high throughput setting, effective flow cytometry data analysis depends heavily on proper data preprocessing. While usual preprocessing steps of quality assessment, outlier removal, normalization, and gating have received considerable scrutiny from the community, the influence of data transformation on the output of high throughput analysis has been largely overlooked. Flow cytometry measurements can vary over several orders of magnitude, cell populations can have variances that depend on their mean fluorescence intensities, and may exhibit heavily-skewed distributions. Consequently, the choice of data transformation can influence the output of automated gating. An appropriate data transformation aids in data visualization and gating of cell populations across the range of data. Experience shows that the choice of transformation is data specific. Our goal here is to compare the performance of different transformations applied to flow cytometry data in the context of automated gating in a high throughput, fully automated setting. We examine the most common transformations used in flow cytometry, including the generalized hyperbolic arcsine, biexponential, linlog, and generalized Box-Cox, all within the BioConductor flowCore framework that is widely used in high throughput, automated flow cytometry data analysis. All of these transformations have adjustable parameters whose effects upon the data are non-intuitive for most users. By making some modelling assumptions about the transformed data, we develop maximum likelihood criteria to optimize parameter choice for these different transformations.
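For orientation, two of the transformation families named above can be written down directly. The sketch assumes a three-parameter hyperbolic arcsine of the form asinh(a + b·x) + c and the classical Box-Cox form for positive values; the generalized Box-Cox discussed in the text also handles non-positive values and is not reproduced here. Parameter names and defaults are illustrative assumptions, and no likelihood-based parameter optimization is shown.

```python
import math


def arcsinh_transform(x, a=0.0, b=1.0, c=0.0):
    """Hyperbolic arcsine with shift a, scale b and offset c."""
    return math.asinh(a + b * x) + c


def box_cox(x, lam):
    """Classical Box-Cox transform for x > 0; the limit lam -> 0 is log(x)."""
    if x <= 0:
        raise ValueError("Box-Cox requires positive input")
    if abs(lam) < 1e-12:
        return math.log(x)
    return (x ** lam - 1.0) / lam


# The arcsinh is roughly linear near zero and logarithmic for large values,
# so it compresses several decades of intensity while still accepting
# negative (compensated) values.
raw = [-50.0, 0.0, 50.0, 5000.0, 500000.0]
transformed = [arcsinh_transform(v, b=0.01) for v in raw]
```

The scale parameter b controls where the transition from linear to logarithmic behaviour occurs, which is why its choice changes how well separated dim populations appear.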
We compare the performance of parameter-optimized and default-parameter (in flowCore) data transformations on real and simulated data by measuring the variation in the locations of cell populations across samples, discovered via automated gating in both the scatter and fluorescence channels. We find that parameter-optimized transformations improve visualization, reduce variability in the location of discovered cell populations across samples, and decrease the misclassification (mis-gating) of individual events when compared to default-parameter counterparts.
Our results indicate that the preferred transformation for fluorescence channels is a parameter-optimized biexponential or generalized Box-Cox, in accordance with current best practices. Interestingly, for populations in the scatter channels, we find that the optimized hyperbolic arcsine may be a better choice in a high-throughput setting than current standard practice of no transformation. However, generally speaking, the choice of transformation remains data-dependent. We have implemented our algorithm in the BioConductor package, flowTrans, which is publicly available.
The recent development of semiautomated techniques for staining and analyzing flow cytometry samples has presented new challenges. Quality control and quality assessment are critical when developing new high throughput technologies and their associated information services. Our experience suggests that significant bottlenecks remain in the development of high throughput flow cytometry methods for data analysis and display. In particular, data quality control and quality assessment are crucial steps in processing and analyzing high throughput flow cytometry data.
We propose a variety of graphical exploratory data analytic tools for exploring ungated flow cytometry data. We have implemented a number of specialized functions and methods in the Bioconductor package rflowcyt. We demonstrate the use of these approaches by investigating two independent sets of high throughput flow cytometry data.
We found that graphical representations can reveal substantial nonbiological differences in samples. Empirical Cumulative Distribution Function and summary scatterplots were especially useful in the rapid identification of problems not identified by manual review.
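A minimal sketch of how empirical cumulative distribution functions (ECDFs) can expose a discrepant sample follows. It assumes one channel per sample and summarizes the difference between two ECDFs by their largest vertical gap; that numeric summary is an illustrative addition, since the text describes graphical inspection. The function names are hypothetical and are not taken from rflowcyt.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function F with F(t) = fraction of sample values <= t."""
    s = sorted(sample)
    n = len(s)
    return lambda t: bisect_right(s, t) / n


def max_ecdf_gap(sample_a, sample_b):
    """Largest vertical distance between the two ECDFs."""
    fa, fb = ecdf(sample_a), ecdf(sample_b)
    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(fa(t) - fb(t)) for t in points)


# Toy forward-scatter values: two similar wells and one shifted well.
well_1 = [1, 2, 3, 4]
well_2 = [1, 2, 3, 4]
well_3 = [11, 12, 13, 14]
gap_similar = max_ecdf_gap(well_1, well_2)
gap_shifted = max_ecdf_gap(well_1, well_3)
```

A gap near zero means the two curves overlap, while a gap near one means the distributions barely overlap at all, which is the pattern a plotted ECDF panel makes visible at a glance.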
Graphical exploratory data analytic tools are quick and useful means of assessing data quality. We propose that the described visualizations be used as quality assessment tools and, where possible, for quality control.
flow cytometry; high throughput; quality assessment; visualization; exploratory data analysis; statistics; software
High-throughput flow cytometry experiments produce hundreds of large multivariate samples of cellular characteristics. These samples require specialized processing to obtain clinically meaningful measurements. A major component of this processing is a form of cell subsetting known as gating. Manual gating is time-consuming and subjective. Good automatic and semi-automatic gating algorithms are very beneficial to high-throughput flow cytometry.
We develop a statistical procedure, named curvHDR, for automatic and semi-automatic gating. The method combines the notions of significant high negative curvature regions and highest density regions and has the ability to adapt well to human-perceived gates. The underlying principles apply to data of arbitrary dimension, although we focus on dimensions up to three. Accompanying software, compatible with contemporary flow cytometry informatics, is developed.
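The notion of a highest density region can be sketched on a one-dimensional histogram. The example assumes binned event counts and selects the fewest bins that together hold a requested share of the events; it omits the curvature-significance step entirely and is an illustration of the concept rather than the curvHDR algorithm. The function name is hypothetical.

```python
def highest_density_bins(counts, coverage=0.9):
    """Indices of the fewest histogram bins that hold >= coverage of events."""
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    # Visit bins from most to least populated.
    order = sorted(range(len(counts)), key=lambda i: counts[i], reverse=True)
    chosen, mass = [], 0
    for i in order:
        if mass >= coverage * total:
            break
        chosen.append(i)
        mass += counts[i]
    return sorted(chosen)


# Toy histogram with a main peak (bins 2-3) and a smaller peak (bin 6).
counts = [1, 4, 40, 30, 5, 0, 15, 5]
region = highest_density_bins(counts, coverage=0.8)
```

Unlike an interval around the mean, the selected region may consist of several disjoint pieces, which is what allows a density-based gate to follow multimodal data.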
The method is seen to adapt well to nuances in the data and, to a reasonable extent, match human perception of useful gates. It offers big savings in human labour when processing high-throughput flow cytometry data whilst retaining a good degree of efficacy.
Intracellular cytokine staining (ICS) by multiparameter flow cytometry is one of the primary methods for determining T cell immunogenicity in HIV-1 clinical vaccine trials. Data analysis requires considerable expertise and time. The amount of data is quickly increasing as more and larger trials are performed, and thus there is a critical need for high throughput methods of data analysis.
A web based flow cytometric analysis system, LabKey Flow, was developed for analyses of data from standardized ICS assays. A gating template was created manually in commercially-available flow cytometric analysis software. Using this template, the system automatically compensated and analyzed all data sets. Quality control queries were designed to identify potentially incorrect sample collections.
Comparison of the semi-automated analysis performed by LabKey Flow and the manual analysis performed using FlowJo software demonstrated excellent concordance (concordance correlation coefficient >0.990). Manual inspection of the analyses performed by LabKey Flow for 8-color ICS data files from several clinical vaccine trials indicates that template gates can appropriately be used for most data sets.
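The concordance correlation coefficient is commonly defined (following Lin) as twice the covariance divided by the sum of the two variances plus the squared difference of the means. The sketch below assumes that definition and paired measurements from two analysis methods; the function name and the toy values are illustrative.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with >= 2 values")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2.0 * cov / (var_x + var_y + (mx - my) ** 2)


# Toy example: identical results versus results with a constant offset.
manual = [1.0, 2.0, 3.0, 4.0]
ccc_identical = concordance_correlation(manual, manual)
ccc_offset = concordance_correlation(manual, [2.0, 3.0, 4.0, 5.0])
```

The offset case shows why this coefficient is stricter than an ordinary correlation: the two series are perfectly correlated, yet the systematic shift lowers the concordance below one.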
The semi-automated LabKey Flow analysis system can accurately analyze large ICS data files. Routine use of the system does not require specialized expertise. This high-throughput analysis will provide great utility for rapid evaluation of complex multiparameter flow cytometric measurements collected from large clinical trials.
Flow Cytometry; Intracellular Cytokine Staining; HIV-1; vaccine; T cell; immunogenicity; data analysis; automation
Manual gating of bivariate plots remains the most frequently used data analysis method in flow cytometry. However, gating is operator-dependent and cumbersome, particularly with the increasing complexity of modern multicolor immunophenotyping data. A method that can remove operator bias, enable systematic and thorough analysis of complex high-dimensional data, correlate temporal changes in different subsets and lead to biomarker discovery is needed. Here we apply such a method, called cytometric fingerprinting (CF), to data obtained on peripheral blood B cells from an adult patient with type-1 diabetes who underwent pancreatic islet transplantation. We establish that CF can be used to analyze longitudinal trends in immunophenotypic data, and show that results from CF are comparable to those obtained with traditional gating methods. Both methods reveal the appearance of transitional B cells and subsequent accumulation of more mature B cells following immunosuppression and transplantation. This pattern is consistent with a temporally ordered process of B cell auto-reconstitution. We also show the comparative efficiency of fingerprinting in recognizing relative changes in B cell subsets with respect to time, its ability to couple the data with statistical methods (agglomerative clustering) and its potential to define novel subsets.
B lymphocyte; flow cytometry; cytometric fingerprinting; type 1 diabetes; transplantation
Flow cytometry is one of the fundamental research tools available to the life scientist. The ability to observe multi-dimensional changes in protein expression and activity at single-cell resolution for a large number of cells provides a unique perspective on the behavior of cell populations. However, the analysis of complex multi-dimensional data is one of the obstacles for wider use of polychromatic flow cytometry.
Recent enhancements to an open-source platform - R/Bioconductor - enable the graphical and data analysis of flow cytometry data. Prior examples have focused on high-throughput applications. To facilitate wider use of this platform for flow cytometry, the analysis of a dataset, obtained following isolation of CD4+CD62L+ T cells from Balb/c splenocytes using magnetic microbeads, is presented as a form of tutorial.
A common workflow for analyzing flow cytometry data was presented using R/Bioconductor. In addition, density function estimation and principal component analysis are provided as examples of more complex analyses.
The compendium - in the form of text, supplemental R scripts, and supplemental FCS3.0 files - presented here is intended to help illuminate a path for inquisitive readers to explore their own data using R/Bioconductor.
bioinformatics; statistics; CD4+ T cells
Advances in multi-parameter flow cytometry (FCM) now allow for the independent detection of larger numbers of fluorochromes on individual cells, generating data with increasingly higher dimensionality. The increased complexity of these data has made it difficult to identify cell populations from high-dimensional FCM data using traditional manual gating strategies based on single-color or two-color displays.
To address this challenge, we developed a novel program, FLOCK (FLOw Clustering without K), that uses a density-based clustering approach to algorithmically identify biologically relevant cell populations from multiple samples in an unbiased fashion, thereby eliminating operator-dependent variability.
FLOCK was used to objectively identify seventeen distinct B cell subsets in a human peripheral blood sample and to identify and quantify novel plasmablast subsets responding transiently to tetanus and other vaccinations in peripheral blood. FLOCK has been implemented in the publicly available Immunology Database and Analysis Portal – ImmPort (http://www.immport.org) for open use by the immunology research community.
FLOCK is able to identify cell subsets in experiments that use multi-parameter flow cytometry through an objective, automated computational approach. The use of algorithms like FLOCK for FCM data analysis obviates the need for subjective and labor intensive manual gating to identify and quantify cell subsets. Novel populations identified by these computational approaches can serve as hypotheses for further experimental study.
flow cytometry; density-based analysis; data clustering; tetanus vaccination; B lymphocyte subsets
We present a framework for the identification of cell subpopulations in flow cytometry data based on merging mixture components using the flowClust methodology. We show that the cluster merging algorithm under our framework improves model fit and provides a better estimate of the number of distinct cell subpopulations than either Gaussian mixture models or flowClust, especially for complicated flow cytometry data distributions. Our framework allows the automated selection of the number of distinct cell subpopulations and we are able to identify cases where the algorithm fails, thus making it suitable for application in a high throughput FCM analysis pipeline. Furthermore, we demonstrate a method for summarizing complex merged cell subpopulations in a simple manner that integrates with the existing flowClust framework and enables downstream data analysis. We demonstrate the performance of our framework on simulated and real FCM data. The software is available in the flowMerge package through the
The ability of flow cytometry to allow fast single cell interrogation of a large number of cells has made this technology ubiquitous and indispensable in the clinical and laboratory setting. A current limit to the potential of this technology is the lack of automated tools for analyzing the resulting data. We describe methodology and software to automatically identify cell populations in flow cytometry data. Our approach advances the paradigm of manually gating sequential two-dimensional projections of the data to a procedure that automatically produces gates based on statistical theory. Our approach is nonparametric and can reproduce nonconvex subpopulations that are known to occur in flow cytometry samples, but which cannot be produced with current parametric model-based approaches. We illustrate the methodology with a sample of mouse spleen and peritoneal cavity cells.
Budding yeast Saccharomyces cerevisiae is a powerful model system for analyzing eukaryotic cell cycle regulation. Yeast cell cycle analysis is typically performed by visual analysis or flow cytometry, and both have limitations in the scope and accuracy of data obtained. This study demonstrates how Multispectral Imaging Flow Cytometry (MIFC) provides precise quantitation of cell cycle distribution and morphological phenotypes of yeast cells in flow.
Cell cycle analysis of wild-type yeast, nap1Δ, and yeast overexpressing NAP1, was performed visually, by flow cytometry and by MIFC. Quantitative morphological analysis employed measurements of cellular length, thickness and aspect ratio in an algorithm to calculate a novel feature, bud length.
MIFC demonstrated reliable quantification of the yeast cell cycle compared to morphological and flow cytometric analyses. By employing this technique we observed both the G2/M delay and elongated buds previously described in the nap1Δ strain.
Using MIFC, we demonstrate that overexpression of NAP1 causes elongated buds yet only a minor disruption in the cell cycle. The different effects of NAP1 expression level on cell cycle and morphology suggest that these phenotypes are independent. Unlike conventional yeast flow cytometry, MIFC generates complete cell cycle profiles and concurrently offers multiple parameters for morphological analysis.
Budding yeast; Cell cycle; MIFC; Multispectral Imaging Flow Cytometry; Nap1
The analysis of protein-protein interactions is a key focus of proteomics efforts. The yeast two-hybrid system has been the most commonly used method in genome-wide searches for protein interaction partners. However, the throughput of the current yeast two-hybrid array approach is hampered by the involvement of the time-consuming LacZ assay and/or the incompatibility of liquid handling automation due to the requirement for selection of colonies/diploids on agar plates. To facilitate large-scale yeast two-hybrid assays, we report a novel array approach by coupling a GFP reporter based yeast two-hybrid system with high throughput flow cytometry that enables the processing of a 96 well plate in as little as 3 minutes. In this approach, the yEGFP reporter has been established in both AH109 (MATa) and Y187 (MATα) reporter cells. It not only allows the generation of two copies of GFP reporter genes in diploid cells, but also allows the convenient determination of self-activators generated from both bait and prey constructs by flow cytometry. We demonstrate a Y2H array assay procedure that is carried out completely in liquid media in 96-well plates by mating bait and prey cells in liquid YPD media, selecting the diploids containing positive interaction pairs in selective media and analyzing the GFP reporter directly by flow cytometry. We have evaluated this flow cytometry based array procedure by showing that the interaction of the positive control pair P53/T can be reproducibly detected at 72 hrs post-mating compared to the negative control pairs. We conclude that our flow cytometry based yeast two-hybrid approach is robust, convenient, quantitative, and amenable to large-scale analysis using liquid-handling automation.
HT flow cytometry; Protein-protein interaction; Yeast two-hybrid system; Array approach
As a high-throughput technology that offers rapid quantification of multidimensional characteristics for millions of cells, flow cytometry (FCM) is widely used in health research, medical diagnosis and treatment, and vaccine development. Nevertheless, there is an increasing concern about the lack of appropriate software tools to provide an automated analysis platform to parallelize the high-throughput data-generation platform. Currently, to a large extent, FCM data analysis relies on the manual selection of sequential regions in 2-D graphical projections to extract the cell populations of interest. This is a time-consuming task that ignores the high-dimensionality of FCM data.
In view of the aforementioned issues, we have developed an R package called flowClust to automate FCM analysis. flowClust implements a robust model-based clustering approach based on multivariate t mixture models with the Box-Cox transformation. The package provides the functionality to identify cell populations whilst simultaneously handling the commonly encountered issues of outlier identification and data transformation. It offers various tools to summarize and visualize a wealth of features of the clustering results. In addition, to ensure its convenience of use, flowClust has been adapted for the current FCM data format, and integrated with existing Bioconductor packages dedicated to FCM analysis.
flowClust addresses the issue of a dearth of software that helps automate FCM analysis with a sound theoretical foundation. It tends to give reproducible results, and helps reduce the significant subjectivity and human time cost encountered in FCM analysis. The package contributes to the cytometry community by offering an efficient, automated analysis platform which facilitates the active, ongoing technological advancement.
This review describes the use of high-throughput flow cytometry for performing multiplexed cell-based and bead-based screens. With the many advances in cell-based analysis and screening, flow cytometry has historically been underutilized as a screening tool largely due to the limitations in handling large numbers of samples. However, there has been a resurgence in the use of flow cytometry due to a combination of innovations around instrumentation and a growing need for cell-based and bead-based applications. The HTFC™ Screening System (IntelliCyt Corporation, Albuquerque, NM) is a novel flow cytometry-based screening platform that incorporates a fast sample-loading technology, HyperCyt®, with a two-laser, six-parameter flow cytometer and powerful data analysis capabilities. The system is capable of running multiplexed screening assays at speeds of up to 40 wells per minute, enabling the processing of 96- and 384-well plates in as little as 3 and 12 min, respectively. Embedded in the system is HyperView®, a data analysis software package that allows rapid identification of hits from multiplexed high-throughput flow cytometry screening campaigns. In addition, the software is incorporated into a server-based data management platform that enables seamless data accessibility and collaboration across multiple sites. High-throughput flow cytometry using the HyperCyt technology has been applied to numerous assay areas and screening campaigns, including efflux transporters, whole cell and receptor binding assays, functional G-protein-coupled receptor screening, in vitro toxicology, and antibody screening.
Flow cytometry technology is widely used in both health care and research. The rapid expansion of flow cytometry applications has outpaced the development of data storage and analysis tools. Collaborative efforts being taken to eliminate this gap include building common vocabularies and ontologies, designing generic data models, and defining data exchange formats. The Minimum Information about a Flow Cytometry Experiment (MIFlowCyt) standard was recently adopted by the International Society for Advancement of Cytometry. This standard guides researchers on the information that should be included in peer reviewed publications, but it is insufficient for data exchange and integration between computational systems. The Functional Genomics Experiment (FuGE) formalizes common aspects of comprehensive and high throughput experiments across different biological technologies. We have extended the FuGE object model to accommodate flow cytometry data and metadata.
We used the MagicDraw modelling tool to design a UML model (Flow-OM) according to the FuGE extension guidelines and the AndroMDA toolkit to transform the model to a markup language (Flow-ML). We mapped each MIFlowCyt term to either an existing FuGE class or to a new FuGEFlow class. The development environment was validated by comparing the official FuGE XSD to the schema we generated from the FuGE object model using our configuration. After the Flow-OM model was completed, the final version of the Flow-ML was generated and validated against an example MIFlowCyt compliant experiment description.
The extension of FuGE for flow cytometry has resulted in a generic FuGE-compliant data model (FuGEFlow), which accommodates and links together all information required by MIFlowCyt. The FuGEFlow model can be used to build software and databases using FuGE software toolkits to facilitate automated exchange and manipulation of potentially large flow cytometry experimental data sets. Additional project documentation, including reusable design patterns and a guide for setting up a development environment, was contributed back to the FuGE project.
We have shown that an extension of FuGE can be used to transform minimum information requirements in natural language to markup language in XML. Extending FuGE required significant effort, but in our experience the benefits outweighed the costs. FuGEFlow is expected to play a central role in describing flow cytometry experiments and ultimately facilitating data exchange, including with public flow cytometry repositories currently under development.
Cell division is an essential cellular process that requires an array of known and unknown proteins for its spatial and temporal regulation. Here we develop a novel, high-throughput screening method for the identification of bacterial cell division genes and regulators. The method combines the over-expression of a shotgun genomic expression library to perturb the cell division process with high-throughput flow cytometry sorting to screen many thousands of clones. Using this approach, we recovered clones with a filamentous morphology for the model bacterium, Escherichia coli. Genetic analysis revealed that our screen identified both known cell division genes, and genes that have not previously been identified to be involved in cell division. This novel screening strategy is applicable to a wide range of organisms, including pathogenic bacteria, where cell division genes and regulators are attractive drug targets for antibiotic development.
Automated classification of biological cells according to their 3D morphology is highly desired in a flow cytometer setting. We have investigated this possibility experimentally and numerically using a diffraction imaging approach. A fast image analysis software based on the gray level co-occurrence matrix (GLCM) algorithm has been developed to extract feature parameters from measured diffraction images. The results of GLCM analysis and subsequent classification demonstrate the potential for rapid classification among six types of cultured cells. Combined with numerical results we show that the method of diffraction imaging flow cytometry has the capacity as a platform for high-throughput and label-free classification of biological cells.
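The gray level co-occurrence matrix (GLCM) mentioned above can be sketched for a small image. The example assumes an image already quantized to a few integer gray levels and a single pixel offset, and it derives two standard texture features, contrast and energy. It is a minimal illustration of the general GLCM idea, not the authors' software; all names are hypothetical.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray level co-occurrence matrix for one pixel offset.

    image: list of rows of integer gray levels in range(levels).
    Entry [i][j] is the fraction of pixel pairs where a pixel of level i
    has a neighbour of level j at offset (dx, dy). The image must contain
    at least one such pair.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    pairs = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                pairs += 1
    return [[v / pairs for v in row] for row in counts]


def contrast(p):
    """GLCM contrast: sum of (i - j)^2 weighted by co-occurrence."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p) for j, v in enumerate(row))


def energy(p):
    """GLCM energy (angular second moment): sum of squared entries."""
    return sum(v * v for row in p for v in row)


# Two toy 2x2 images: a flat patch and vertical stripes.
flat = glcm([[0, 0], [0, 0]], levels=2)
stripes = glcm([[0, 1], [0, 1]], levels=2)
```

The flat patch has zero contrast while the striped one does not, which shows how such features turn the texture of a diffraction image into numbers a classifier can use.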
(170.1530) Cell analysis; (290.5870) Scattering, Rayleigh
Recent advances in automation technologies have enabled the use of flow cytometry for high throughput screening, generating large complex data sets often in clinical trials or drug discovery settings. However, data management and data analysis methods have not advanced sufficiently far from the initial small-scale studies to support modeling in the presence of multiple covariates.
We developed a set of flexible open source computational tools in the R package flowCore to facilitate the analysis of these complex data. A key component is a set of suitable data structures that support the application of similar operations to a collection of samples or a clinical cohort. In addition, our software constitutes a shared and extensible research platform that enables collaboration between bioinformaticians, computer scientists, statisticians, biologists and clinicians. This platform will foster the development of novel analytic methods for flow cytometry.
The software has been applied in the analysis of various data sets and its data structures have proven to be highly efficient in capturing and organizing the analytic work flow. Finally, a number of additional Bioconductor packages successfully build on the infrastructure provided by flowCore and open new avenues for flow data analysis.
Polychromatic flow cytometry allows the capture of multidimensional data, providing the technical tool to assess complex immune responses. Interrogation of the adaptive T cell response to infection or vaccination already has benefited greatly from standardized protocols for polychromatic flow cytometric analysis. The innate immune system plays an important role in health and disease, and presents potentially important therapeutic and diagnostic modalities. We describe here a high-throughput polychromatic flow cytometry-based platform that enables the rapid interrogation and large scale screening of human blood antigen presenting cell responses to Toll-like receptor (TLR) ligands and other innate immune modulators. Using this assay, we found that for certain stimuli (e.g., TLR9 and TLR3 ligands), the general protocol for intracellular cytokine cytometry had to be significantly modified to allow response detection. Furthermore, high concentrations of TLR7/8 and TLR4 stimuli caused substantial changes in lineage markers, potentially confounding analysis if one were to use a conventional “lineage-negative” cocktail. The assay we developed is reproducible and has been used to show that a given individual’s TLR response pattern is relatively stable over at least several months. This protocol is in strict compliance with published guidelines for polychromatic flow cytometry, provides a common platform for scientists to compare their results directly, and may be applicable to the diagnostic evaluation of Toll-like receptor function and the rapid screening of promising therapeutic innate immune modulators.
Flow cytometry; Intracellular cytokine staining; Cytokine bead array; Innate immunity; Toll-like receptors; Assay validation
Flow cytometry specializes in high content measurements of cells and particles in suspension. Having long excelled in analytical throughput of single cells and particles, only recently with the advent of HyperCyt sampling technology has flow cytometry’s multi-experiment throughput begun to approach the point of practicality for efficiently analyzing hundreds-of-thousands of samples, the realm of high throughput screening (HTS). To extend performance and automation compatibility we built a HyperCyt-linked Cluster Cytometer platform, a network of flow cytometers for analyzing samples displayed in high-density, 1536-well plate format. To assess performance we used cell and microsphere based HTS assays that had been well characterized in previous studies. Experiments addressed important technical issues: challenges of small wells (assay volumes 10 μL or less, reagent mixing, cell and particle suspension), detecting and correcting for differences in performance of individual flow cytometers, and the ability to reanalyze a plate in the event of problems encountered during the primary analysis. Boosting sample throughput an additional four-fold, this platform is uniquely positioned to synergize with expanding suspension array and cell barcoding technologies in which as many as 100 experiments are performed in a single well or sample. As high-performance flow cytometers shrink in cost and size, cluster cytometry promises to become a practical, productive approach for HTS and other large scale investigations of biological complexity.
Flow cytometry; suspension array; high content analysis; high throughput screening
Identification of minor cell populations, e.g. leukemic blasts within blood samples, has become increasingly important in therapeutic disease monitoring. Modern flow cytometers enable researchers to reliably measure six or more variables, describing cellular size, granularity and expression of cell-surface and intracellular proteins, for thousands of cells per second. Currently, analysis of cytometry readouts relies on visual inspection and manual gating of one- or two-dimensional projections of the data. This procedure, however, is labor-intensive and misses potential characteristic patterns in higher dimensions.
Leukemic samples from patients with acute lymphoblastic leukemia at initial diagnosis and during induction therapy have been investigated by 4-color flow cytometry. We have utilized multivariate classification techniques, Support Vector Machines (SVM), to automate leukemic cell detection in cytometry. Classifiers were built on conventionally diagnosed training data. We assessed the detection accuracy on independent test data and analyzed marker expression of incongruently classified cells. SVM classification can recover manually gated leukemic cells with 99.78% sensitivity and 98.87% specificity.
Multivariate classification techniques allow for automating cell population detection in cytometry readouts for diagnostic purposes. They potentially reduce time, costs and arbitrariness associated with these procedures. Due to their multivariate classification rules, they also allow for the reliable detection of small cell populations.
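The sensitivity and specificity figures quoted above compare a classifier's per-cell calls against manual gating. As a minimal sketch of how such figures are computed, the following assumes that each cell carries a boolean manual label and a boolean predicted label (True meaning "leukemic"); the function name and the toy labels are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(manual_labels, predicted_labels):
    """Return (sensitivity, specificity), treating True as 'leukemic'.

    manual_labels: per-cell calls from manual gating (the reference).
    predicted_labels: per-cell calls from the automated classifier.
    """
    if len(manual_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")

    tp = fn = tn = fp = 0
    for truth, pred in zip(manual_labels, predicted_labels):
        if truth and pred:
            tp += 1          # leukemic cell correctly recovered
        elif truth and not pred:
            fn += 1          # leukemic cell missed
        elif not truth and pred:
            fp += 1          # normal cell wrongly called leukemic
        else:
            tn += 1          # normal cell correctly rejected

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one leukemic and one normal cell")

    return tp / (tp + fn), tn / (tn + fp)


# Toy labels, invented purely for illustration (not data from the study).
manual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, False, False, False, False, False, True]

sens, spec = sensitivity_specificity(manual, predicted)
print(f"sensitivity={sens:.4f} specificity={spec:.4f}")
```

In practice the predicted labels would come from a classifier evaluated on independent test data, as the abstract describes; the bookkeeping above is the same regardless of which classifier produced them.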
Telomere length analysis has been greatly simplified by the quantitative flow cytometry technique flow-FISH. In this method, a fluorescein-labeled synthetic oligonucleotide complementary to the telomere terminal repeat sequence is hybridized to the telomere sequence and the resulting fluorescence measured by flow cytometry. This technique has supplanted the traditional laborious Southern blot telomere length measurement techniques in many laboratories, and allows single cell analysis of telomere length in high-throughput sample formats. Nevertheless, the harsh conditions required for telomere probe annealing (82°C) have made it difficult to successfully combine this technique with simultaneous immunolabeling. Most traditional organic fluorescent probes (e.g. fluorescein, phycoerythrin) have limited thermal stability and do not survive the high-temperature annealing process, despite efforts to covalently crosslink the antigen-antibody-fluorophore complex. This loss of probe fluorescence has made it difficult to measure flow-FISH in complex lymphocyte populations, and has generally forced investigators to use fluorescence-activated cell sorting to pre-separate their populations, a laborious technique that requires prohibitively large numbers of cells.
In this study, we have substituted quantum dots (nanoparticles) for traditional fluorophores in FISH-flow. Quantum dots were demonstrated to possess much greater thermal stability than traditional low molecular weight and phycobiliprotein fluorophores. Quantum dot antibody conjugates directed against monocyte and T cell antigens were found to retain most of their fluorescence following the high-temperature annealing step, allowing simultaneous fluorescent immunophenotyping and telomere length measurement. Since quantum dots have very narrow emission bandwidths, we were able to analyze multiple quantum dot-antibody conjugates (Qdot 605, 655 and 705) simultaneously with FISH-flow measurement to assess the age-associated decline in telomere length in both human monocytes and T cell subsets. With quantum dot immunolabeling, the mean rate of decrease in telomere length for CD4+ cells was calculated at 41.8 bp/year, very close to previously reported values obtained using traditional flow-FISH and Southern blotting. This modification to the traditional flow-FISH technique should therefore allow simultaneous fluorescent immunophenotyping and telomere length measurement, permitting complex cell subset-specific analysis in small numbers of cells without the requirement for prior cell sorting.
FISH-flow cytometry; quantum dots; telomere length
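A decline rate in bp/year, such as the one reported above for CD4+ cells, is commonly obtained as the slope of a straight-line fit of telomere length against donor age. The sketch below assumes an ordinary least-squares fit, which the abstract does not confirm as the method used; the function name and the age and length values are hypothetical and serve only to show the arithmetic.

```python
def least_squares_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with >= 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")

    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up donor ages (years) and telomere lengths (bp), for illustration only.
ages_years = [20, 30, 40, 50, 60]
telomere_bp = [9000, 8600, 8150, 7800, 7350]

slope, intercept = least_squares_line(ages_years, telomere_bp)
# A negative slope means shortening; report its magnitude as the decline rate.
print(f"decline rate: {-slope:.1f} bp/year")
```

For these invented numbers the fitted slope is -41.0 bp/year; the value has no connection to the measurement reported in the study.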
A next step to interpret the findings generated by genome-wide association studies is to associate molecular quantitative traits with disease-associated alleles. To this end, researchers are linking disease risk alleles with gene expression quantitative trait loci (eQTL). However, gene expression at the mRNA level is only an intermediate trait and flow cytometry analysis can provide more downstream and biologically valuable protein level information in multiple cell subsets simultaneously using freshly obtained samples. Because the throughput of flow cytometry is currently limited, experiments may need to span several weeks or months to obtain a sufficient sample size to demonstrate genetic association. Therefore, normalisation methods are needed to control for technical variability and compare flow cytometry data over an extended period of time. We show how the use of normalising fluorospheres improves the repeatability of a cell surface CD25-APC mean fluorescence intensity phenotype on CD4+ memory T cells. We investigate two types of normalising beads: broad spectrum and spectrum matched. Lastly, we propose two alternative normalisation procedures that are usable in the absence of normalising beads.
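One way bead-based normalisation can work is sketched below. It assumes that a set of bead peaks with fixed reference intensities is acquired on every day, that the day's instrument response is fitted as a straight line in log10 space from observed to reference bead intensities, and that the same line is then applied to the sample's mean fluorescence intensity (MFI). This is an illustrative assumption and not necessarily the procedure used in the study; all names and numbers are hypothetical.

```python
import math


def fit_log_linear(observed_bead_mfi, reference_bead_mfi):
    """Fit log10(reference) = slope * log10(observed) + intercept."""
    xs = [math.log10(v) for v in observed_bead_mfi]
    ys = [math.log10(v) for v in reference_bead_mfi]
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need >= 2 matched bead peaks")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("bead peaks must differ in intensity")

    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def normalise_mfi(mfi, slope, intercept):
    """Map a raw MFI onto the reference scale using the fitted line."""
    return 10 ** (slope * math.log10(mfi) + intercept)


# Hypothetical bead peaks: on this day the instrument reads 20% low.
reference_beads = [100.0, 1000.0, 10000.0, 100000.0]
observed_beads = [80.0, 800.0, 8000.0, 80000.0]

slope, intercept = fit_log_linear(observed_beads, reference_beads)
normalised = normalise_mfi(400.0, slope, intercept)  # raw sample MFI of 400
print(f"normalised MFI: {normalised:.1f}")
```

Fitting in log space is a design choice made here because fluorescence intensities typically span several decades; a fit on the linear scale would be dominated by the brightest bead peak.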