Microarrays are among the most powerful detection systems, offering multiplexing and high-throughput capability. They have significant potential as a versatile biosensing platform for environmental monitoring, pathogen detection, medical therapeutics, and drug screening, to name a few. To date, however, microarray applications are still limited to preliminary screening for genome-scale transcription profiling or gene ontology analysis. Expanding the utility of microarrays as a detection tool for various biological and biomedical applications requires performance information such as the limits of detection and quantification, which are essential for determining the detection sensitivity of sensing devices. Here we present a calibration design that integrates detection limit theory and linear dynamic range to obtain a performance index for a microarray detection platform, using oligonucleotide arrays as a model system. Two different types of limits of detection and quantification, based on the prediction interval or the tolerance interval, are proposed for two common cyanine fluorescence dyes, Cy3 and Cy5. Beyond oligonucleotides, the proposed method can be generalized to other microarray formats with various biomolecules such as complementary DNA, protein, peptide, carbohydrate, tissue, or other small biomolecules. It can also be easily applied to other fluorescence dyes for further dye chemistry improvement.
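As a rough illustration of how limits of detection (LOD) and quantification (LOQ) relate to a calibration, the sketch below uses the conventional 3-sigma and 10-sigma rules on replicate blank signals. This is an assumption made for illustration: the abstract derives its limits from prediction or tolerance intervals, which this sketch does not reproduce. The function name, the multipliers, and the blank intensities and slope are all hypothetical.

```python
from statistics import stdev


def detection_limits(blank_signals, slope, k_lod=3.0, k_loq=10.0):
    """Return (LOD, LOQ) in concentration units.

    blank_signals: replicate signals from blank (no-target) spots.
    slope: calibration sensitivity (signal per unit concentration),
           taken from the linear dynamic range.
    k_lod, k_loq: conventional multipliers (assumed, not from the source).
    """
    if len(blank_signals) < 2:
        raise ValueError("need at least two blank replicates")
    if slope <= 0:
        raise ValueError("slope must be positive")
    sd_blank = stdev(blank_signals)  # sample standard deviation of the blanks
    return k_lod * sd_blank / slope, k_loq * sd_blank / slope


# Hypothetical blank intensities and calibration slope
blanks = [100.0, 102.0, 98.0, 101.0, 99.0]
lod, loq = detection_limits(blanks, slope=50.0)
print(f"LOD = {lod:.4f}, LOQ = {loq:.4f} (concentration units)")
```

Replacing the fixed multipliers with quantiles from a prediction or tolerance interval would bring the sketch closer to the interval-based limits described above.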
Summary: The introduction of in vitro nucleic acid amplification techniques, led by real-time PCR, into the clinical microbiology laboratory has transformed the laboratory detection of viruses and select bacterial pathogens. However, the progression of the molecular diagnostic revolution currently relies on the ability to efficiently and accurately offer multiplex detection and characterization for a variety of infectious disease pathogens. Microarray analysis has the capability to offer robust multiplex detection but has just started to enter the diagnostic microbiology laboratory. Multiple microarray platforms exist, including printed double-stranded DNA and oligonucleotide arrays, in situ-synthesized arrays, high-density bead arrays, electronic microarrays, and suspension bead arrays. One aim of this paper is to review microarray technology, highlighting technical differences between them and each platform's advantages and disadvantages. Although the use of microarrays to generate gene expression data has become routine, applications pertinent to clinical microbiology continue to rapidly expand. This review highlights uses of microarray technology that impact diagnostic microbiology, including the detection and identification of pathogens, determination of antimicrobial resistance, epidemiological strain typing, and analysis of microbial infections using host genomic expression and polymorphism profiles.
Comparative genomic hybridization (CGH) microarrays have been used to determine copy number variations (CNVs) and their effects on complex diseases. Detection of absolute CNVs independent of genomic variants of an arbitrary reference sample has been a critical issue in CGH array experiments. Whole genome analysis using massively parallel sequencing with multiple ultra-high resolution CGH arrays provides an opportunity to catalog highly accurate genomic variants of the reference DNA (NA10851). Using information on variants, we developed a new method, the CGH array reference-free algorithm (CARA), which can determine reference-unbiased absolute CNVs from any CGH array platform. The algorithm enables the removal and rescue of false positive and false negative CNVs, respectively, which appear due to the effects of genomic variants of the reference sample in raw CGH array experiments. We found that the CARA remarkably enhanced the accuracy of CGH array in determining absolute CNVs. Our method thus provides a new approach to interpret CGH array data for personalized medicine.
Ancient human remains of paleopathological interest typically contain highly degraded DNA in which pathogenic taxa are often minority components, making sequence-based metagenomic characterization costly. Microarrays may hold a potential solution to these challenges, offering a rapid, affordable, and highly informative snapshot of microbial diversity in complex samples without the lengthy analysis and/or high cost associated with high-throughput sequencing. Their versatility is well established for modern clinical specimens, but they have yet to be applied to ancient remains. Here we report bacterial profiles of archaeological and historical human remains using the Lawrence Livermore Microbial Detection Array (LLMDA). The array successfully identified previously-verified bacterial human pathogens, including Vibrio cholerae (cholera) in a 19th century intestinal specimen and Yersinia pestis (“Black Death” plague) in a medieval tooth, which represented only minute fractions (0.03% and 0.08% alignable high-throughput shotgun sequencing reads) of their respective DNA content. This demonstrates that the LLMDA can identify primary and/or co-infecting bacterial pathogens in ancient samples, thereby serving as a rapid and inexpensive paleopathological screening tool to study health across both space and time.
To address the limitations of traditional virus and pathogen detection methodologies in clinical diagnosis, scientists have developed high-throughput oligonucleotide microarrays to rapidly identify infectious agents. However, objectively identifying pathogens from the complex hybridization patterns of these massively multiplexed arrays remains challenging.
In this study, we conceived an automated method based on the hypergeometric distribution for identifying pathogens in multiplexed arrays and compared it to five other methods. We evaluated these metrics: 1) accurate prediction, whether the top ranked prediction(s) match the real virus(es); 2) four accuracy scores.
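One plausible way to score a candidate pathogen with the hypergeometric distribution is to ask how likely it is that at least the observed number of its probes would appear among the positive probes by chance. The sketch below is a minimal version of that idea, not the study's actual method; the function names, probe identifiers, and candidate names are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N items, K of them 'hits', n drawn)."""
    k = max(k, 0)
    upper = min(K, n)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def rank_candidates(positive_probes, probe_sets, total_probes):
    """Rank candidates by enrichment of their probes among positive probes.

    positive_probes: probe ids called positive on the array.
    probe_sets: mapping of candidate name -> probe ids designed for it.
    total_probes: number of probes on the array.
    Returns a list of (name, p_value), smallest p-value first.
    """
    positive = set(positive_probes)
    scored = []
    for name, probes in probe_sets.items():
        hits = len(positive & set(probes))
        p = hypergeom_sf(hits, total_probes, len(probes), len(positive))
        scored.append((name, p))
    return sorted(scored, key=lambda item: item[1])


# Hypothetical array with 20 probes and two candidate viruses
probe_sets = {
    "virus_A": ["p1", "p2", "p3", "p4", "p5"],
    "virus_B": ["p6", "p7", "p8", "p9", "p10"],
}
ranking = rank_candidates(["p1", "p2", "p3", "p4", "p6"], probe_sets, 20)
for name, p in ranking:
    print(name, f"{p:.4g}")
```

A small p-value means the candidate's probes are over-represented among the positive probes, so the top of the ranking is the most plausible match.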
Though accurate prediction and high specificity and sensitivity can be achieved with several methods, the method based on the hypergeometric distribution provides a significant advantage in terms of positive predictive value, with two to sixty times the positive predictive values of the other methods.
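For readers comparing methods on these terms, the sketch below computes sensitivity, specificity, and positive and negative predictive value from confusion-matrix counts. That these are the four scores used in the study is an assumption, and the counts in the example are hypothetical.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_scores(tp, fp, tn, fn):
    """Standard diagnostic scores from true/false positive/negative counts."""
    return {
        "sensitivity": _ratio(tp, tp + fn),  # detected among truly present
        "specificity": _ratio(tn, tn + fp),  # rejected among truly absent
        "ppv": _ratio(tp, tp + fp),          # correct among positive calls
        "npv": _ratio(tn, tn + fn),          # correct among negative calls
    }


# Hypothetical counts for one identification method
scores = diagnostic_scores(tp=8, fp=2, tn=88, fn=2)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Because many candidates are absent from any given sample, a method can keep specificity high while still producing enough false positive calls to lower its positive predictive value, which is why that score separates the methods.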
The proposed multi-species array analysis based on the hypergeometric distribution addresses shortcomings of previous methods by enhancing signals of positively hybridized probes.
A common technique for sensitive and specific diagnostic virus detection in clinical samples is PCR, which can identify one or several viruses in one assay. However, a diagnostic microarray containing probes for all human pathogens could replace hundreds of individual PCR reactions and remove the need for a clear clinical hypothesis regarding a suspected pathogen. We have established such a diagnostic platform for random amplification and subsequent microarray identification of viral pathogens in clinical samples. We show that Phi29 polymerase amplification of a diverse set of clinical samples generates enough viral material for successful identification by the Microbial Detection Array, demonstrating the potential of the microarray technique for broad-spectrum pathogen detection. We conclude that this method detects both DNA and RNA viruses present in the same sample and differentiates between virus subtypes. We propose this assay for diagnostic analysis of viruses in clinical samples.
A microbial diagnostic microarray for the detection of the most relevant bacterial food‐ and water‐borne pathogens and indicator organisms was developed and thoroughly validated. The microarray platform, based on sequence‐specific end labelling of oligonucleotides and the phylogenetically robust gyrB marker gene, allowed a highly specific (resolution on genus/species level) and sensitive (0.1% relative and 10⁴ cfu absolute detection sensitivity) detection of the target pathogens. Validation was performed using a set of reference strains and a set of spiked environmental samples. Reliability of the obtained data was additionally verified by independent analysis of the samples via fluorescence in situ hybridization (FISH) and conventional microbiological reference methods. The applicability of this diagnostic system for food analysis was demonstrated through extensive validation using artificially and naturally contaminated spiked food samples. The microarray‐based pathogen detection was compared with the corresponding microbiological reference methods (performed according to the ISO norm). Microarray results revealed high consistency with the reference microbiological data.
Microarrays are becoming a very popular tool for microbial detection and diagnostics. Although these diagnostic arrays are much simpler than traditional transcriptome arrays, their high-throughput nature means that data analysis requirements still form a bottleneck for their widespread use. Hence we developed a new online data sharing and analysis environment customised for diagnostic arrays.
Microbial Diagnostic Array Workstation (MDAW) is a database-driven application built on MS Access, with a front end designed in ASP.NET.
MDAW is a new resource that is customised for the data analysis requirements for microbial diagnostic arrays.
Rapid and multiplexed measurement is vital in the detection of food-borne pathogens. While highly specific and sensitive, traditional immunochemical assays such as enzyme-linked immunosorbent assays (ELISAs) often require expensive read-out equipment (e.g. fluorescent labels) and lack the capability of multiplex detection. By combining the superior specificity of immunoassays with the sensitivity and simplicity of magnetic detection, we have developed a novel multiplex magnetic nanotag-based detection platform for mycotoxins that functions on a sub-picomolar concentration level. Unlike fluorescent labels, magnetic nanotags (MNTs) can be detected with inexpensive giant magnetoresistive (GMR) sensors such as spin-valve sensors. In the system presented here, each spin-valve sensor has an active area of 90 × 90 µm², and the sensors are arranged in an 8 × 8 array. Sample is added to the antibody-immobilized sensor array prior to the addition of the biotinylated detection antibody. The sensor response is recorded in real time upon the addition of streptavidin-linked MNTs on the chip. Here we demonstrate the simultaneous detection of multiple mycotoxins (aflatoxin B1, zearalenone and HT-2) and show that a detection limit of 50 pg/mL can be achieved.
New design and optimization of pathogen detection microarrays is shown to allow robust and accurate detection of a range of pathogens. The customized microarray platform includes a method for reducing PCR bias during DNA amplification.
DNA microarrays used as 'genomic sensors' have great potential in clinical diagnostics. Biases inherent in random PCR-amplification, cross-hybridization effects, and inadequate microarray analysis, however, limit detection sensitivity and specificity. Here, we have studied the relationships between viral amplification efficiency, hybridization signal, and target-probe annealing specificity using a customized microarray platform. Novel features of this platform include the development of a robust algorithm that accurately predicts PCR bias during DNA amplification and can be used to improve PCR primer design, as well as a powerful statistical concept for inferring pathogen identity from probe recognition signatures. Compared to real-time PCR, the microarray platform identified pathogens with 94% accuracy (76% sensitivity and 100% specificity) in a panel of 36 patient specimens. Our findings show that microarrays can be used for the robust and accurate diagnosis of pathogens, and further substantiate the use of microarray technology in clinical diagnostics.
Infectious diseases emerge frequently in China, partly because of its large and highly mobile population. Therefore, a rapid and cost-effective pathogen screening method with broad coverage is required for prevention and control of infectious diseases. The availability of a large number of microbial genome sequences generated by conventional Sanger sequencing and next generation sequencing has enabled the development of a high-throughput high-density microarray platform for rapid large-scale screening of vertebrate pathogens.
An easy operating pathogen microarray (EOPM) was designed to detect almost all known pathogens and related species based on their genomic sequences. For effective identification of pathogens from EOPM data, a statistical enrichment algorithm has been proposed, and further implemented in a user-friendly web-based interface.
Using multiple probes designed to specifically detect a microbial genus or species, EOPM can correctly identify known pathogens at the species or genus level in blinded testing. Despite a lower sensitivity than PCR, EOPM is sufficiently sensitive to detect the predominant pathogens causing clinical symptoms. During application in two recent clinical infectious disease outbreaks in China, EOPM successfully identified the responsible pathogens.
EOPM is an effective surveillance platform for infectious diseases, and can play an important role in infectious disease control.
We have developed and validated a consolidated bead-based genotyping platform, the Bioplex suspension array for simultaneous detection of multiple single nucleotide polymorphisms (SNPs) of the ATP-binding cassette transporters. Genetic polymorphisms have been known to influence therapeutic response and risk of disease pathologies. Genetic screening for therapeutic and diagnostic applications thus holds great promise in clinical management. The allele-specific primer extension (ASPE) reaction was used to assay 22 multiplexed SNPs for eight subjects. Comparison of the microsphere-based ASPE assay results to sequencing results showed complete concordance in genotype assignments. The Bioplex suspension array thus proves to be a reliable, cost-effective and high-throughput technological platform for genotyping. It can be easily adapted to customized SNP panels for specific applications involving large-scale mutation screening of clinically relevant markers.
Genotype; Microspheres; Polymorphism, Genetic
Identifying the bacteria and viruses present in a complex sample is useful in disease diagnostics, product safety, environmental characterization, and research. Array-based methods have proven useful for detecting, in a single assay and at a reasonable cost, any microbe among the thousands that have been sequenced.
We designed a pan-Microbial Detection Array (MDA) to detect all known viruses (including phages), bacteria and plasmids and developed a novel statistical analysis method to identify mixtures of organisms from complex samples hybridized to the array. The array has broader coverage of bacterial and viral targets and is based on more recent sequence data and more probes per target than other microbial detection/discovery arrays in the literature. Family-specific probes were selected for all sequenced viral and bacterial complete genomes, segments, and plasmids. Probes were designed to tolerate some sequence variation to enable detection of divergent species with homology to sequenced organisms, and to have no significant matches to the human genome sequence.
In blinded testing on spiked samples with single or multiple viruses, the MDA was able to correctly identify species or strains. In clinical fecal, serum, and respiratory samples, the MDA was able to detect and characterize multiple viruses, phage, and bacteria in a sample to the family and species level, as confirmed by PCR.
The MDA can be used to identify the suite of viruses and bacteria present in complex samples.
Recent studies have shown that copy number variations (CNVs) are frequent in higher eukaryotes and associated with a substantial portion of inherited and acquired risk for various human diseases. The increasing availability of high-resolution genome surveillance platforms provides opportunity for rapidly assessing research and clinical samples for CNV content, as well as for determining the potential pathogenicity of identified variants. However, few informatics tools for accurate and efficient CNV detection and assessment currently exist.
We developed a suite of software tools and resources (CNV Workshop) for automated, genome-wide CNV detection from a variety of SNP array platforms. CNV Workshop includes three major components: detection, annotation, and presentation of structural variants from genome array data. CNV detection utilizes a robust and genotype-specific extension of the Circular Binary Segmentation algorithm, and the use of additional detection algorithms is supported. Predicted CNVs are captured in a MySQL database that supports cohort-based projects and incorporates a secure user authentication layer and user/admin roles. To assist with determination of pathogenicity, detected CNVs are also annotated automatically for gene content, known disease loci, and gene-based literature references. Results are easily queried, sorted, filtered, and visualized via a web-based presentation layer that includes a GBrowse-based graphical representation of CNV content and relevant public data, integration with the UCSC Genome Browser, and tabular displays of genomic attributes for each CNV.
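To give a feel for segmentation-based CNV detection, the sketch below implements plain recursive binary segmentation with a mean-shift statistic on a list of log ratios. It is a deliberate simplification: Circular Binary Segmentation additionally tests arcs of a circularized sequence and assesses significance by permutation, and the genotype-specific extension described above is not reproduced. The function names and the fixed threshold are assumptions for illustration.

```python
from math import sqrt


def best_breakpoint(values):
    """Return (index, score) of the split with the largest mean-shift score.

    The split separates values[:index] from values[index:]. The score is
    |mean_left - mean_right| * sqrt(n_left * n_right / n). Returns
    (None, 0.0) when no split shows any shift.
    """
    n = len(values)
    total = sum(values)
    best_index, best_score = None, 0.0
    left_sum = 0.0
    for i in range(1, n):
        left_sum += values[i - 1]
        mean_left = left_sum / i
        mean_right = (total - left_sum) / (n - i)
        score = abs(mean_left - mean_right) * sqrt(i * (n - i) / n)
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


def segment(values, threshold, offset=0):
    """Recursively split values; return sorted breakpoint positions."""
    if len(values) < 2:
        return []
    index, score = best_breakpoint(values)
    if index is None or score < threshold:
        return []
    left = segment(values[:index], threshold, offset)
    right = segment(values[index:], threshold, offset + index)
    return left + [offset + index] + right


# Hypothetical log2 ratios: five normal probes followed by five gained probes
log_ratios = [0.0] * 5 + [1.0] * 5
print(segment(log_ratios, threshold=1.0))
```

A single split can miss a short aberration flanked by normal regions on both sides, which is the case the circular formulation is designed to handle.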
To our knowledge, CNV Workshop represents the first cohesive and convenient platform for detection, annotation, and assessment of the biological and clinical significance of structural variants. CNV Workshop has been successfully utilized for assessment of genomic variation in healthy individuals and disease cohorts and is an ideal platform for coordinating multiple associated projects.
Availability and Implementation
Available on the web at: http://sourceforge.net/projects/cnv
Multiplexed detection assays that analyze a modest number of nucleic acid targets over large sample sets are emerging as the preferred testing approach in such applications as routine pathogen typing, outbreak monitoring, and diagnostics. However, very few DNA testing platforms have proven to offer a solution for mid-plexed analysis that is high-throughput, sensitive, and low in cost per test. In this work, an enhanced genotyping method based on MassCode technology was devised and integrated as part of a high-throughput mid-plexing analytical system that facilitates robust qualitative differential detection of DNA targets. Samples are first analyzed using MassCode PCR (MC-PCR) performed with an array of primer sets encoded with unique mass tags. Lambda exonuclease and an array of MassCode probes are then contacted with MC-PCR products for further interrogation and target sequences are specifically identified. Primer and probe hybridizations occur in homogeneous solution, a clear advantage over micro- or nanoparticle suspension arrays. The two cognate tags coupled to resultant MassCode hybrids are detected in an automated process using a benchtop single quadrupole mass spectrometer. The prospective value of using MassCode probe arrays for multiplexed bioanalysis was demonstrated by developing a 14-plex proof-of-concept assay designed to subtype a select panel of Salmonella enterica serogroups and serovars. This MassCode system is very flexible and test panels can be customized to include more, fewer, or different markers.
Despite the known relevance of genomic structural variants to pathogen behavior, cancer, development, and evolution, certain repeat-based structural variants may evade detection by existing high-throughput techniques. Here, we present ruler arrays, a technique to detect genomic structural variants including insertions and deletions (indels), duplications, and translocations. A ruler array exploits DNA polymerase’s processivity to detect physical distances between defined genomic sequences regardless of the intervening sequence. The method combines a sample preparation protocol, tiling genomic microarrays, and a new computational analysis. The analysis of ruler array data from two genomic samples enables the identification of structural variation between the samples. In an empirical test between two closely related haploid strains of yeast, ruler arrays detected 78% of the structural variants larger than 100 bp.
Many rapid methods have been developed for screening foods for the presence of pathogenic microorganisms. Rapid methods that can also identify microorganisms via multiplexed immunological recognition have the potential to classify or type microbial contaminants, thus facilitating epidemiological investigations that aim to identify outbreaks and trace the contamination back to its source. This manuscript introduces a novel, high-throughput typing platform that employs microarrayed multiwell plate substrates and laser-induced fluorescence of the nucleic acid intercalating dye/stain SYBR Gold for detection of antibody-captured bacteria. The aim of this study was to use this platform to compare different sets of antibodies raised against the same pathogens and to demonstrate its potential effectiveness for serotyping. To that end, two sets of antibodies raised against each of the “Big Six” non-O157 Shiga toxin-producing E. coli (STEC) as well as E. coli O157:H7 were array-printed into microtiter plates, and serial dilutions of the bacteria were added and subsequently detected. Though antibody specificity was not sufficient for the development of an STEC serotyping method, the STEC antibody sets performed reasonably well, showing that specificity increased at lower capture antibody concentrations or, conversely, at lower bacterial target concentrations. The favorable results indicated that with sufficiently selective and ideally concentrated sets of biorecognition elements (e.g., antibodies or aptamers), this high-throughput platform can be used to rapidly type microbial isolates derived from food samples within ca. 80 min of total assay time. It can also potentially be used to detect the pathogens from food enrichments and at least serve as a platform for testing antibodies.
antibody; microarray; bacteria; fluorescence; microtiter plate; typing
Peculiar high-level taxonomic imbalances of the human intestinal microbiota, which affect the core functional microbiome, have recently been associated with specific diseases, such as obesity, inflammatory bowel diseases, and intestinal inflammation.
To specifically monitor microbiota imbalances that impact human physiology, here we develop and validate an original DNA microarray (HTF-Microbi.Array) for the high taxonomic level fingerprint of the human intestinal microbiota. Based on the Ligase Detection Reaction-Universal Array (LDR-UA) approach, the HTF-Microbi.Array enables specific detection and approximate relative quantification of 16S rRNAs from 30 phylogenetically related groups of the human intestinal microbiota. The HTF-Microbi.Array was used in a pilot study of the faecal microbiota of eight young adults. Cluster analysis revealed good reproducibility of the high-level taxonomic microbiota fingerprint obtained for each of the subjects.
The HTF-Microbi.Array is a fast and sensitive tool for the high taxonomic level fingerprint of the human intestinal microbiota in terms of presence/absence of the principal groups. Moreover, analysis of the relative fluorescence intensity for each probe pair of our LDR-UA platform can provide an estimate of the relative abundance of the microbial target groups within each sample. Because its phylogenetic resolution is focused at the division, order and cluster levels, the HTF-Microbi.Array is blind to inter-individual variability at the species level.
Genotyping platforms such as single nucleotide polymorphism (SNP) arrays are powerful tools to study genomic aberrations in cancer samples. Allele specific information from SNP arrays provides valuable information for interpreting copy number variation (CNV) and allelic imbalance including loss-of-heterozygosity (LOH) beyond that obtained from the total DNA signal available from array comparative genomic hybridization (aCGH) platforms. Several algorithms based on hidden Markov models (HMMs) have been designed to detect copy number changes and copy-neutral LOH making use of the allele information on SNP arrays. However heterogeneity in clinical samples, due to stromal contamination and somatic alterations, complicates analysis and interpretation of these data.
We have developed MixHMM, a novel hidden Markov model using hidden states based on chromosomal structural aberrations. MixHMM allows CNV detection for copy numbers up to 7 and allows more complete and accurate description of other forms of allelic imbalance, such as increased copy number LOH or imbalanced amplifications. MixHMM also incorporates a novel sample mixing model that allows detection of tumor CNV events in heterogeneous tumor samples, where cancer cells are mixed with a proportion of stromal cells.
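A sample mixing model of this kind can be illustrated with the expected array signals for a tumor genotype diluted by normal stromal cells. The sketch below uses a common linear-mixture formulation for the log R ratio (LRR) and B allele frequency (BAF); it is an assumption for illustration and may differ from the parameterization used in MixHMM. All function and parameter names are hypothetical.

```python
from math import log2


def expected_lrr(tumor_cn, stromal_fraction, normal_cn=2):
    """Expected log2 ratio of total copy number in a tumor/stroma mixture."""
    avg_cn = stromal_fraction * normal_cn + (1 - stromal_fraction) * tumor_cn
    if avg_cn <= 0:
        return float("-inf")  # homozygous deletion in a pure tumor sample
    return log2(avg_cn / normal_cn)


def expected_baf(tumor_b, tumor_cn, stromal_fraction, normal_b=1, normal_cn=2):
    """Expected B allele frequency at a SNP that is heterozygous in stroma.

    tumor_b: number of B alleles in the tumor cells.
    tumor_cn: total copy number in the tumor cells.
    """
    b_copies = stromal_fraction * normal_b + (1 - stromal_fraction) * tumor_b
    all_copies = stromal_fraction * normal_cn + (1 - stromal_fraction) * tumor_cn
    if all_copies <= 0:
        return float("nan")  # no DNA at this locus
    return b_copies / all_copies


# Hemizygous deletion (one copy left, the B allele) with 80% stromal cells
print(round(expected_lrr(1, 0.8), 3), round(expected_baf(1, 1, 0.8), 3))
```

As the stromal fraction rises, both signals shrink toward their normal values (LRR of 0, BAF of 0.5), which is why heavily contaminated samples need an explicit mixing model to call tumor CNVs.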
We validate MixHMM and demonstrate its advantages with simulated samples, clinical tumor samples and a dilution series of mixed samples. We have shown that the CNVs of cancer cells in a tumor sample contaminated with up to 80% of stromal cells can be detected accurately using Illumina BeadChip and MixHMM.
The MixHMM is available as a Python package provided with some other useful tools at http://genecube.med.yale.edu:8080/MixHMM.
Emerging known and unknown pathogens create profound threats to public health. Platforms for rapid detection and characterization of microbial agents are critically needed to prevent and respond to disease outbreaks. Available detection technologies cannot provide broad functional information about known or novel organisms. As a step toward developing such a system, we have produced and tested a series of high-density functional gene arrays to detect elements of virulence and antibiotic resistance mechanisms. Our first generation array targets genes from Escherichia coli strains K12 and CFT073, Enterococcus faecalis and Staphylococcus aureus. We determined optimal probe design parameters for gene family detection and discrimination. When tested with organisms at varying phylogenetic distances from the four target strains, the array detected orthologs for the majority of targeted gene families present in bacteria belonging to the same taxonomic family. In combination with whole-genome amplification, the array detects femtogram concentrations of purified DNA, either spiked in to an aerosol sample background, or in combinations from one or more of the four target organisms. This is the first report of a high density NimbleGen microarray system targeting microbial antibiotic resistance and virulence mechanisms. By targeting virulence gene families as well as genes unique to specific biothreat agents, these arrays will provide important data about the pathogenic potential and drug resistance profiles of unknown organisms in environmental samples.
Alternative splicing (AS) is an important regulatory mechanism for gene expression and protein diversity in eukaryotes. Previous studies have demonstrated that it can be causative for, or specific to, splicing-related diseases. Understanding the regulation of AS will be helpful for diagnostic efforts and drug discovery for these diseases. As a novel exon-centric microarray platform, the exon array enables a comprehensive analysis of AS by investigating the expression of known and predicted exons. The identification of AS events from exon array data has attracted much attention; however, new and powerful algorithms for exon array data analysis are still lacking.
Here, we framed the identification of AS events as a variable selection problem and developed a regression method for AS detection (REMAS). First, features of alternatively spliced exons were scaled by reasonably defined variables. Second, we designed a hierarchical model that can represent gene structure and transcriptional influence on exons, and lasso-type penalties were introduced because of the large number of variables. Third, an iterative two-step algorithm was developed to select alternatively spliced genes and exons. To avoid negative effects introduced by small sample size, we iteratively ranked genes using parameters that indicate their AS capabilities. Evaluation on both simulated and real data showed that REMAS could efficiently identify potential AS events, some of which had been validated by RT-PCR or supported by literature evidence.
As a new lasso regression algorithm based on a hierarchical model, REMAS has been demonstrated to be a reliable and effective method for identifying AS events from exon array data.
Phylogenetic microarrays present an attractive strategy to high-throughput interrogation of complex microbial communities. In this work we present several approaches to optimize the analysis of intestinal microbiota with the recently developed Microbiota Array. First, we determined how 16S rDNA-specific PCR amplification influenced bacterial detection and the consistency of measured abundance values. Bacterial detection improved with an increase in the number of PCR amplification cycles, but 25 cycles were sufficient to achieve the maximum possible detection. A PCR-caused deviation in the measured abundance values was also observed. We also developed two mathematical algorithms aimed to account for a predicted cross-hybridization of 16S rDNA fragments among different species, and to adjust the measured hybridization signal based on the number of 16S rRNA gene copies per species genome. The 16S rRNA gene copy adjustment indicated that the presence of members of class Clostridia might be over-estimated in some 16S rDNA-based studies. Finally, we show that the examination of total community RNA with phylogenetic microarray can provide estimates of the relative metabolic activity of individual community members. Complementary profiling of genomic DNA and total RNA isolated from the same sample presents an opportunity to assess population structure and activity in the same microbial community.
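The 16S rRNA gene copy adjustment can be pictured as dividing each taxon's signal by its gene copy number and renormalizing. The sketch below is a minimal version of that idea, not the algorithm from the study; the taxon names, signal values, and copy numbers are hypothetical.

```python
def adjust_for_copy_number(signals, copies_per_genome):
    """Convert hybridization signals to copy-number-adjusted relative abundances.

    signals: mapping of taxon -> measured 16S signal.
    copies_per_genome: mapping of taxon -> 16S rRNA gene copies per genome.
    Returns a mapping of taxon -> relative abundance (sums to 1).
    """
    adjusted = {}
    for taxon, signal in signals.items():
        copies = copies_per_genome[taxon]
        if copies <= 0:
            raise ValueError(f"copy number for {taxon} must be positive")
        adjusted[taxon] = signal / copies  # signal per genome equivalent
    total = sum(adjusted.values())
    if total == 0:
        raise ValueError("all adjusted signals are zero")
    return {taxon: value / total for taxon, value in adjusted.items()}


# Hypothetical taxa: taxon_A carries three times as many 16S copies as taxon_B
signals = {"taxon_A": 600.0, "taxon_B": 400.0}
copies = {"taxon_A": 6, "taxon_B": 2}
print(adjust_for_copy_number(signals, copies))
```

In this example taxon_A accounts for 60% of the raw signal but only a third of the adjusted abundance, the same direction of correction reported above for high-copy-number groups.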
Recent advances in tissue microarray technology have allowed immunohistochemistry to become a powerful medium-to-high throughput analysis tool, particularly for the validation of diagnostic and prognostic biomarkers. However, as study size grows, the manual evaluation of these assays becomes a prohibitive limitation; it vastly reduces throughput and greatly increases variability and expense. We propose an algorithm—Tissue Array Co-Occurrence Matrix Analysis (TACOMA)—for quantifying cellular phenotypes based on textural regularity summarized by local inter-pixel relationships. The algorithm can be easily trained for any staining pattern, is absent of sensitive tuning parameters and has the ability to report salient pixels in an image that contribute to its score. Pathologists’ input via informative training patches is an important aspect of the algorithm that allows the training for any specific marker or cell type. With co-training, the error rate of TACOMA can be reduced substantially for a very small training sample (e.g., with size 30). We give theoretical insights into the success of co-training via thinning of the feature set in a high dimensional setting when there is “sufficient” redundancy among the features. TACOMA is flexible, transparent and provides a scoring process that can be evaluated with clarity and confidence. In a study based on an estrogen receptor (ER) marker, we show that TACOMA is comparable to, or outperforms, pathologists’ performance in terms of accuracy and repeatability.
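The inter-pixel relationships mentioned above are typically summarized in a gray-level co-occurrence matrix, which counts how often a pixel of one intensity level has a neighbor of another level at a fixed offset. The sketch below builds such a matrix for a small quantized image; it illustrates the general construct only, not the TACOMA scoring pipeline. The function name and the toy image are hypothetical.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Count co-occurring gray levels for pixel pairs at offset (dx, dy).

    image: list of rows, each a list of integer gray levels in [0, levels).
    Returns a levels x levels matrix where entry [a][b] is the number of
    pixels with level a whose neighbor at (row + dy, col + dx) has level b.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:  # skip pairs off the image
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


# Hypothetical 2 x 3 image quantized to two gray levels
image = [
    [0, 0, 1],
    [1, 1, 0],
]
print(cooccurrence_matrix(image, levels=2))
```

Matrices computed at several offsets can serve as texture features for a classifier trained on pathologist-selected patches.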
The sequencing of many genomes and tiling arrays consisting of millions of DNA segments spanning entire genomes have made high-resolution copy number analysis possible. Microarray-based comparative genomic hybridization (array CGH) has enabled the high-resolution detection of DNA copy number aberrations. While many of the methods and algorithms developed for the analysis of microarrays have focused on expression analysis, the same technology can be used to detect genetic alterations, using, for example, standard commercial Affymetrix arrays. Due to the nature of the resultant data, standard techniques for processing GeneChip expression experiments are inapplicable.
We have developed a robust and flexible methodology for high-resolution analysis of DNA copy number of whole genomes, using Affymetrix high-density expression oligonucleotide microarrays. Copy number is obtained from fluorescence signals after processing with novel normalization, spatial artifact correction, data transformation and deletion/duplication detection. We applied our approach to identify deleted and amplified regions in E. coli mutants obtained after prolonged starvation.
The availability of Affymetrix expression chips for a wide variety of organisms makes the proposed array CGH methodology useful more generally.
Microarrays are the most common method of studying global gene expression, and may soon enter the realm of FDA-approved clinical/diagnostic testing of cancer and other diseases. However, the acceptance of array data has been made difficult by the proliferation of widely different array platforms with gene probes ranging in size from 25 bases (oligonucleotides) to several kilobases (complementary DNAs or cDNAs). The algorithms applied for image and data analysis are also as varied as the microarray platforms, perhaps more so. In addition, there is a total lack of universally accepted standards for use among the different platforms and even within the same array types. Due to this lack of coherency in array technologies, confusion in interpretation of data within and across platforms has often been the norm, and studies of the same biological phenomena have, in many cases, led to contradictory results. In this commentary/review, some of the causes of this confusion will be summarized, and progress in overcoming these obstacles will be described, with the goal of providing an optimistic view of the future for the use of array technologies in global expression profiling and other applications.
microarray; expression profiling; RNA standards; controls; MGED; MAQC; NIST; ERCC