Related Articles

1.  Extending the tissue microarray data exchange specification for inclusion of data analysis results 
The Tissue Microarray Data Exchange Specification (TMA DES) is an eXtensible Markup Language (XML) specification for encoding TMA experiment data in a machine-readable format that is also human readable. TMA DES defines Common Data Elements (CDEs) that form a basic vocabulary for describing TMA data. TMA data are routinely subjected to univariate and multivariate statistical analysis to determine differences or similarities between pathologically distinct groups of tumors for one or more markers or between markers for different groups. Such statistical analysis tests include the t-test, ANOVA, Chi-square, Mann-Whitney U, and Kruskal-Wallis tests. All these generate output that needs to be recorded and stored with TMA data.
Materials and Methods:
We propose extending the TMA DES to include syntactic and semantic definitions of CDEs for describing the results of statistical analyses performed upon TMA DES data. This paper describes these CDEs and illustrates how they can be added to the TMA DES. We created a Document Type Definition (DTD) file defining the syntax for these CDEs, and a set of ISO 11179 entries providing semantic definitions for them. We describe how we wrote a program in R that read TMA DES data from an XML file, performed statistical analyses on that data, and created a new XML file containing both the original XML data and CDEs representing the results of our analyses. This XML file was submitted to XML parsers in order to confirm that it conformed to the syntax defined in our extended DTD file. TMA DES XML files with deliberately introduced errors were also parsed in order to verify that our new DTD file could perform error checking. Finally, we also validated an existing TMA DES XML file against our DTD file in order to demonstrate the backward compatibility of our DTD.
Our experiments demonstrated the encoding of analysis results using our proposed CDEs. We used XML parsers to confirm that these XML data were syntactically correct and conformed to the rules specified in our extended TMA DES DTD. We also demonstrated that this extended DTD was capable of being used to successfully perform error checking, and was backward compatible with pre-existing TMA DES data which did not use our new CDEs.
The TMA DES allows Tissue Microarray data to be shared. A variety of statistical tests are used to analyze such data. We have proposed a set of CDEs as an extension to the TMA DES which can be used to annotate TMA DES data with the results of statistical analyses performed on that data. We performed experiments which demonstrated the usage of TMA DES data containing our proposed CDEs.
PMCID: PMC3073064  PMID: 21572505
CDEs; DTD; statistical analysis; tissue microarray; TMA Data Exchange Specification; XML
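The workflow described in the abstract above (read TMA data from XML, run a statistical test, write the results back into the same document) can be sketched as follows. This is a minimal illustration in Python rather than the authors' R program. The element names (`tma`, `core`, `score`, `statistical_analysis`, `test_name`, `statistic`) and the sample values are hypothetical and are not the CDEs defined by the TMA DES or its extension. Welch's t statistic stands in for the range of tests the abstract lists.

```python
import math
import statistics
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a TMA DES document. These element
# names are illustrative only and are NOT the specification's actual CDEs.
SAMPLE_XML = """<tma>
  <core group="tumor"><score>3.0</score></core>
  <core group="tumor"><score>2.5</score></core>
  <core group="tumor"><score>3.5</score></core>
  <core group="normal"><score>1.0</score></core>
  <core group="normal"><score>1.5</score></core>
  <core group="normal"><score>0.5</score></core>
</tma>"""


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    standard_error = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def annotate(xml_text):
    """Return the XML text with an appended (hypothetical) results element."""
    root = ET.fromstring(xml_text)

    # Collect the marker scores for each group of cores.
    groups = {}
    for core in root.iter("core"):
        groups.setdefault(core.get("group"), []).append(
            float(core.findtext("score"))
        )

    t_value = welch_t(groups["tumor"], groups["normal"])

    # Keep the original data untouched and add the result alongside it.
    result = ET.SubElement(root, "statistical_analysis")
    ET.SubElement(result, "test_name").text = "Welch t-test"
    ET.SubElement(result, "statistic").text = f"{t_value:.4f}"
    return ET.tostring(root, encoding="unicode")


annotated = annotate(SAMPLE_XML)
print(annotated)
```

The standard-library parser used here only checks well-formedness. Confirming conformance to a DTD, as the abstract describes, would need a validating parser (for example, the `DTD` class in the third-party lxml package).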
2.  Reliability of cyclin A assessment on tissue microarrays in breast cancer compared to conventional histological slides 
British Journal of Cancer  2006;94(11):1697-1702.
Cyclin A has in some studies been associated with poor breast cancer survival, although not all studies have confirmed this. Its prognostic significance in breast cancer needs evaluation in larger studies. The tissue microarray (TMA) technique allows simultaneous analysis of a large number of tumours on a single microscopic slide. This makes rapid screening of molecular markers in a large number of tumours possible. Because only a small tissue sample of each tumour is punched on an array, the question has arisen about the representativeness of TMA when studying markers that are expressed in only a small proportion of cells. For this reason, we wanted to compare cyclin A expression on TMA and on traditional large sections. Two breast cancer TMAs were constructed of 200 breast tumours diagnosed in 1997–1998. TMA slides and traditional large section slides of these 200 tumours were stained with cyclin A antibody and analysed by two independent readers. The reproducibility of the two readers' results was good or even very good, with kappa values of 0.71–0.87. The agreement of TMA and large section results was good, with kappa values of 0.62–0.75. Cyclin A overexpression was significantly (P<0.001) associated with oestrogen receptor and progesterone receptor negativity and high grade both on TMA and large sections. Cyclin A overexpression was significantly associated with poor metastasis-free survival both on TMA and large sections. The relative risks for metastasis were similar on TMA and large sections. This study suggests that the TMA technique could be useful for studying histological correlations and the prognostic significance of cyclin A in breast cancer on a large scale.
PMCID: PMC2361315  PMID: 16670718
breast cancer; cyclin A; tissue microarray
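The kappa values reported in the abstract above measure agreement beyond what chance alone would produce. A minimal sketch of Cohen's kappa for two sets of categorical calls is given below. The function name and the ten example calls are invented for illustration and are not data from the study.

```python
from collections import Counter


def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa for two equal-length sequences of categorical calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(calls_a)

    # Observed agreement: fraction of cases where both calls match.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n

    # Chance agreement: product of the marginal frequencies per category.
    count_a = Counter(calls_a)
    count_b = Counter(calls_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Made-up example: cyclin A calls for ten tumours on TMA vs. large section.
tma_calls = ["high", "high", "low", "low", "high",
             "low", "low", "high", "low", "low"]
section_calls = ["high", "high", "low", "low", "low",
                 "low", "low", "high", "low", "high"]

kappa = cohen_kappa(tma_calls, section_calls)
print(f"kappa = {kappa:.3f}")
```

In this invented example the two methods agree on 8 of 10 tumours (observed agreement 0.80), chance agreement is 0.52, and kappa is therefore (0.80 - 0.52) / (1 - 0.52), or about 0.58.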
3.  Limitations of tissue microarrays compared with whole tissue sections in survival analysis 
Oncology Letters  2010;1(5):827-831.
Tissue microarray (TMA) is a promising technique in the evaluation of immunohistochemical markers in tumors and may be used as an alternative to whole sections. However, only a few studies have correlated a clinical outcome with both TMA and results obtained from whole sections. This study compared immunostaining for Ki-67 and p16 in TMA (3 cores from each specimen) and whole sections of 171 cases of stage III epithelial ovarian cancer with clinical data. A high expression of Ki-67 was identified in 85.0, 85.5, 85.8, 90.5 and 84% of cores 1, 2 and 3, TMAs and whole tissue sections, respectively. A high p16 expression was found in 36.5, 31.4, 30.3, 46.3 and 31.0% of cores 1, 2 and 3, TMAs and whole tissue sections, respectively. The high expression of Ki-67 and p16 in whole tissue sections significantly correlated with that of Ki-67 and p16 in core 1 (P<0.0001 and P<0.0001, respectively), core 2 (P<0.0001 and P<0.0001, respectively), core 3 (P<0.0001 and P<0.0001, respectively), and TMAs (P<0.0001 and P<0.0001, respectively). In univariate analysis, a high expression of Ki-67 and p16 in two of the cores, the TMA and the whole tissue sections was significantly correlated with disease-related survival (Ki-67: P=0.008, 0.012, 0.012 and 0.0001, respectively, and p16: P=0.0007, 0.0005, 0.0008 and 0.005, respectively). However, in the multivariate analysis only Ki-67 on whole tissue sections retained an independent prognostic significance (P=0.025). We concluded that more studies, with a higher number of cores, are necessary to determine the efficacy of TMA in reflecting the prognostic value of different antibodies. Moreover, evaluation of this method is crucial for each type of tumor and each separate antigen. It is also essential to confirm the clinical correlations on the whole sections before investigating the same parameters on TMA.
PMCID: PMC3436208  PMID: 22966388
tissue microarrays; whole tissue sections; immunohistochemistry; prognosis; Ki-67; p16
4.  Association of PKCζ Expression with Clinicopathological Characteristics of Breast Cancer 
PLoS ONE  2014;9(3):e90811.
The protein kinase C (PKC) family has been functionally linked to cancer. It has been suggested that atypical PKCs contribute to cell proliferation and cancer progression. With respect to breast cancer, studies using mammary cell lines have found that PKCζ plays a key role in intracellular transduction of mitogenic and apoptotic signals. However, little is known about its function in vivo. Here we examined the correlation between PKCζ protein levels and important clinicopathologic factors in breast cancer using patient samples. To conduct the study, 30 invasive ductal carcinoma cases and their paired normal tissues were used for tissue microarray analysis (TMA) and 16 were used for western blot analysis. In addition, the correlation between PKCζ expression levels and clinicopathologic characteristics was determined in 176 cases with relevant clinical data. Finally, the correlation between PKCζ and human epidermal growth factor receptor 2 (HER2) expression was determined in three breast cancer cell lines by western blot analysis. Both TMA and western blot results showed that PKCζ protein was highly expressed in primary tumors but not in paired normal tissue. The correlation study indicated that high PKCζ levels were associated with premenopausal status (p = 0.019) and worse prognostic factors, such as advanced clinical stage, more lymph node involvement and larger tumor size. Both disease-free survival and overall survival rates were lower in the high PKCζ group than in the low PKCζ group. No correlation was observed between PKCζ levels and age, histological grade, or estrogen or progesterone receptor expression status. A positive correlation between PKCζ and HER2 levels was observed in both tumor samples and cell lines. Our observations link PKCζ expression with factors pointing to worse prognosis, higher HER2 levels and a lower survival rate. This suggests that PKCζ protein levels may serve as a prognostic marker of breast cancer.
PMCID: PMC3946230  PMID: 24603690
5.  A Novel Survival-Based Tissue Microarray of Pancreatic Cancer Validates MUC1 and Mesothelin as Biomarkers 
PLoS ONE  2012;7(7):e40157.
One-fifth of patients with seemingly ‘curable’ pancreatic ductal adenocarcinoma (PDA) experience an early recurrence and death, receiving no definable benefit from a major operation. Some patients with advanced stage tumors are deemed ‘unresectable’ by conventional staging criteria (e.g. liver metastasis), yet progress slowly. Effective biomarkers that stratify PDA based on biologic behavior are needed. To help researchers sort through the maze of biomarker data, a compendium of ∼2500 published candidate biomarkers in PDA was compiled (PLoS Med, 2009. 6(4) p. e1000046).
Methods and Findings
Building on this compendium, we constructed a survival tissue microarray (termed s-TMA) comprised of short-term (cancer-specific death <12 months, n = 58) and long-term survivors (>30 months, n = 79) who underwent resection for PDA (total, n = 137). The s-TMA functions as a biological filter to identify bona fide prognostic markers associated with survival group extremes (at least 18 months separate survival groups). Based on a stringent selection process, 13 putative PDA biomarkers were identified from the public biomarker repository. Candidates were tested against the s-TMA by immunohistochemistry to identify the best markers of tumor biology. In a multivariate model, MUC1 (odds ratio, OR = 28.95, 3+ vs. negative expression, p = 0.004) and MSLN (OR = 12.47, 3+ vs. negative expression, p = 0.01) were highly predictive of early cancer-specific death. By comparison, pathologic factors (size, lymph node metastases, resection margin status, and grade) had ORs below three, and none reached statistical significance. ROC curves were used to compare the four pathologic prognostic features (ROC area = 0.70) to three univariate molecular predictors (MUC1, MSLN, MUC2) of survival group (ROC area = 0.80, p = 0.07).
MUC1 and MSLN were superior to pathologic features and other putative biomarkers in predicting survival group. Molecular assays comparing cancers from short and long survivors are an effective strategy to screen biomarkers and prioritize candidate cancer genes for diagnostic and therapeutic studies.
PMCID: PMC3391218  PMID: 22792233
6.  CpG island hypermethylation of BRCA1 and loss of pRb as co-occurring events in basal/triple-negative breast cancer 
Epigenetics  2011;6(5):638-649.
Triple-negative breast cancer (TNBC) occurs in approximately 15% of all breast cancer patients, and the incidence of TNBC is greatly increased in BRCA1 mutation carriers. This study aimed to assess the impact of BRCA1 promoter methylation with respect to breast cancer subtypes in sporadic disease. Tissue microarrays (TMAs) were constructed representing tumors from 303 patients previously screened for BRCA1 germline mutations, of which a subset of 111 sporadic tumors had previously been analyzed with respect to BRCA1 methylation. Additionally, a set of eight tumors from BRCA1 mutation carriers were included on the TMAs. Expression analysis was performed on TMAs by immunohistochemistry (IHC) for BRCA1, pRb, p16, p53, PTEN, ER, PR, HER2, CK5/6, CK8, CK18, EGFR, MUC1, and Ki-67. Data on BRCA1 aberrations and IHC expression was examined with respect to breast cancer-specific survival. The results demonstrate that CpG island hypermethylation of BRCA1 significantly associates with the basal/triple-negative subtype. Low expression of pRb, and high/intense p16, were associated with BRCA1 promoter hypermethylation, and the same effects were seen in BRCA1 mutated tumors. The expression patterns of BRCA1, pRb, p16 and PTEN were highly correlated, and define a subgroup of TNBCs characterized by BRCA1 aberrations, high Ki-67 (≥ 40%) and favorable disease outcome. In conclusion, our findings demonstrate that epigenetic inactivation of the BRCA1 gene associates with RB/p16 dysfunction in promoting TNBCs. The clinical implications relate to the potential use of targeted treatment based on PARP inhibitors in sporadic TNBCs, wherein CpG island hypermethylation of BRCA1 represents a potential marker of therapeutic response.
PMCID: PMC3121973  PMID: 21593597
BRCA1; methylation; epigenetics; triple negative; breast cancer; retinoblastoma tumor suppressor gene; pRb; p16
7.  The tissue microarray data exchange specification: implementation by the Cooperative Prostate Cancer Tissue Resource 
BMC Bioinformatics  2004;5:19.
Tissue Microarrays (TMAs) have emerged as a powerful tool for examining the distribution of marker molecules in hundreds of different tissues displayed on a single slide. TMAs have been used successfully to validate candidate molecules discovered in gene array experiments. Like gene expression studies, TMA experiments are data intensive, requiring substantial information to interpret, replicate or validate. Recently, an open access Tissue Microarray Data Exchange Specification has been released that allows TMA data to be organized in a self-describing XML document annotated with well-defined common data elements. While this specification provides sufficient information for the reproduction of the experiment by outside research groups, its initial description did not contain instructions or examples of actual implementations, and no implementation studies have been published. The purpose of this paper is to demonstrate how the TMA Data Exchange Specification is implemented in a prostate cancer TMA.
The Cooperative Prostate Cancer Tissue Resource (CPCTR) is funded by the National Cancer Institute to provide researchers with samples of prostate cancer annotated with demographic and clinical data. The CPCTR now offers prostate cancer TMAs and has implemented a TMA database conforming to the new open access Tissue Microarray Data Exchange Specification. The bulk of the TMA database consists of clinical and demographic data elements for 299 patient samples. These data elements were extracted from an Excel database using a transformative Perl script. The Perl script and the TMA database are open access documents distributed with this manuscript.
TMA databases conforming to the Tissue Microarray Data Exchange Specification can be merged with other TMA files, expanded through the addition of data elements, or linked to data contained in external biological databases. This article describes an open access implementation of the TMA Data Exchange Specification and provides detailed guidance to researchers who wish to use the Specification.
PMCID: PMC373442  PMID: 15040818
8.  A TMA De-Arraying Method for High Throughput Biomarker Discovery in Tissue Research 
PLoS ONE  2011;6(10):e26007.
Tissue MicroArrays (TMAs) represent a potential high-throughput platform for the analysis and discovery of tissue biomarkers. As TMA slides are produced manually and subject to processing and sectioning artefacts, the layout of TMA cores on the final slide and subsequent digital scan (TMA digital slide) is often disturbed making it difficult to associate cores with their original position in the planned TMA map. Additionally, the individual cores can be greatly altered and contain numerous irregularities such as missing cores, grid rotation and stretching. These factors demand the development of a robust method for de-arraying TMAs which identifies each TMA core, and assigns them to their appropriate coordinates on the constructed TMA slide.
This study presents a robust TMA de-arraying method consisting of three functional phases: TMA core segmentation, gridding and mapping. The segmentation of TMA cores uses a set of morphological operations to identify each TMA core. Gridding then utilises a Delaunay Triangulation based method to find the row and column indices of each TMA core. Finally, mapping correlates each TMA core from a high resolution TMA whole slide image with its name within a TMAMap.
This study describes a genuinely robust TMA de-arraying algorithm for the rapid identification of TMA cores from digital slides. The result of this de-arraying algorithm allows the easy partition of each TMA core for further processing. Based on a test group of 19 TMA slides (3129 cores), 99.84% of cores were segmented successfully, 99.81% of cores were gridded correctly and 99.96% of cores were mapped with their correct names via TMAMaps. The gridding of TMA cores was also extensively tested using a set of 113 pseudo slides (13,536 cores) with a variety of irregular grid layouts including missing cores, rotation and stretching. 100% of the cores were gridded correctly.
PMCID: PMC3189244  PMID: 22016800
9.  TMA Navigator: network inference, patient stratification and survival analysis with tissue microarray data 
Nucleic Acids Research  2013;41(Web Server issue):W562-W568.
Tissue microarrays (TMAs) allow multiplexed analysis of tissue samples and are frequently used to estimate biomarker protein expression in tumour biopsies. TMA Navigator is an open access web application for analysis of TMA data and related information, accommodating categorical, semi-continuous and continuous expression scores. Non-biological variation, or batch effects, can hinder data analysis and may be mitigated using the ComBat algorithm, which is incorporated with enhancements for automated application to TMA data. Unsupervised grouping of samples (patients) is provided according to Gaussian mixture modelling of marker scores, with cardinality selected by Bayesian information criterion regularization. Kaplan–Meier survival analysis is available, including comparison of groups identified by mixture modelling using the Mantel-Cox log-rank test. TMA Navigator also supports network inference approaches useful for TMA datasets, which often constitute comparatively few markers. Tissue and cell-type specific networks derived from TMA expression data offer insights into the molecular logic underlying pathophenotypes, towards more effective and personalized medicine. Output is interactive, and results may be exported for use with external programs. Private anonymous access is available, and user accounts may be generated for easier data management.
PMCID: PMC3692046  PMID: 23761446
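The Kaplan–Meier survival analysis mentioned in the abstract above rests on a simple product-limit calculation. The sketch below is a generic, from-scratch version of that estimator, not TMA Navigator's implementation. The function name and the six follow-up records are invented for illustration.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Patients censored at t still count as at risk at t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Made-up follow-up data (months) for six patients.
follow_up = [5, 8, 8, 12, 15, 20]
observed = [1, 1, 0, 1, 0, 1]

curve = kaplan_meier(follow_up, observed)
for t, s in curve:
    print(f"t = {t:>2}  S(t) = {s:.3f}")
```

Each step multiplies the running survival probability by the fraction of at-risk patients who survive that time point. Comparing two such curves, as the abstract describes, would additionally require a log-rank test, which is not shown here.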
10.  TmaDB: a repository for tissue microarray data 
BMC Bioinformatics  2005;6:218.
Tissue microarray (TMA) technology has been developed to facilitate large, genome-scale molecular pathology studies. This technique provides a high-throughput method for analyzing a large cohort of clinical specimens in a single experiment thereby permitting the parallel analysis of molecular alterations (at the DNA, RNA, or protein level) in thousands of tissue specimens. As a vast quantity of data can be generated in a single TMA experiment a systematic approach is required for the storage and analysis of such data.
To analyse TMA output a relational database (known as TmaDB) has been developed to collate all aspects of information relating to TMAs. These data include the TMA construction protocol, experimental protocol and results from the various immunocytological and histochemical staining experiments including the scanned images for each of the TMA cores. Furthermore the database contains pathological information associated with each of the specimens on the TMA slide, the location of the various TMAs and the individual specimen blocks (from which cores were taken) in the laboratory and their current status i.e. if they can be sectioned into further slides or if they are exhausted. TmaDB has been designed to incorporate and extend many of the published common data elements and the XML format for TMA experiments and is therefore compatible with the TMA data exchange specifications developed by the Association for Pathology Informatics community. Finally the design of the database is made flexible such that TMA experiments from several types of cancer can be stored in a single database, which incorporates the national minimum data set required for pathology reports supported by the Royal College of Pathologists (UK).
TmaDB will provide a comprehensive repository for TMA data such that a large number of results from the numerous immunostaining experiments can be efficiently compared for each of the TMA cores. This will allow a systematic, large-scale comparison of tumour samples to facilitate the identification of gene products of clinical importance such as therapeutic or prognostic markers. In addition this work will contribute to the establishment of a standard for reporting TMA data analogous to MIAME in the description of microarray data.
PMCID: PMC1215475  PMID: 16137321
11.  Identification of biology-based breast cancer types with distinct predictive and prognostic features: role of steroid hormone and HER2 receptor expression in patients treated with neoadjuvant anthracycline/taxane-based chemotherapy 
Reliable predictive and prognostic markers for routine diagnostic purposes are needed for breast cancer patients treated with neoadjuvant chemotherapy. We evaluated protein biomarkers in a cohort of 116 participants of the GeparDuo study on anthracycline/taxane-based neoadjuvant chemotherapy for operable breast cancer to test for associations with pathological complete response (pCR) and disease-free survival (DFS). Particularly, we evaluated if interactions between hormone receptor (HR) and human epidermal growth factor receptor 2 (HER2) expression might lead to a different clinical behavior of HR+/HER2+ co-expressing and HR+/HER2- tumors and whether subgroups of triple negative tumors might be identified by the help of Ki67 labeling index, cytokeratin 5/6 (CK5/6), as well as cyclooxygenase-2 (COX-2), and Y-box binding protein 1 (YB-1) expression.
Expression analysis was performed using immunohistochemistry and silver-enhanced in situ hybridization on tissue microarrays (TMAs) of pretherapeutic core biopsies.
pCR rates were significantly different between the biology-based tumor types (P = 0.044) with HR+/HER2+ and HR-/HER2- tumors having higher pCR rates than HR+/HER2- tumors. Ki67 labeling index, confirmed as a significant predictor of pCR in the whole cohort (P = 0.001), identified HR-/HER2- (triple negative) carcinomas with a higher chance for a pCR (P = 0.006). Biology-based tumor type (P = 0.046 for HR+/HER2+ vs. HR+/HER2-), Ki67 labeling index (P = 0.028), and treatment arm (P = 0.036) were independent predictors of pCR in a multivariate model. DFS was different in the biology-based tumor types (P < 0.0001) with HR+/HER2- and HR+/HER2+ tumors having the best prognosis and HR-/HER2+ tumors showing the worst outcome. Biology-based tumor type was an independent prognostic factor for DFS in multivariate analysis (P < 0.001).
Our data demonstrate that a biology-based breast cancer classification using estrogen receptor (ER), progesterone receptor (PgR), and HER2 bears independent predictive and prognostic potential. The HR+/HER2+ co-expressing carcinomas emerged as a group of tumors with a good response rate to neoadjuvant chemotherapy and a favorable prognosis. HR+/HER2- tumors had a good prognosis irrespective of a pCR, whereas patients with HR-/HER2- and HR-/HER2+ tumors, especially if they had not achieved a pCR, had an unfavorable prognosis and are in need of additional treatment options.
Trial registration identifier: NCT00793377
PMCID: PMC2790846  PMID: 19758440
12.  Prognostic utility of basaloid differentiation in oropharyngeal cancer 
Human papillomavirus (HPV) is recognized as the key risk factor for a distinct subset of oropharyngeal squamous cell carcinoma. P16 is a reliable, sensitive surrogate marker for HPV and confers a positive prognostic advantage. Basaloid differentiation on hematoxylin and eosin (H&E) staining is anecdotally noted by some pathologists to be associated with p16 positivity. This association, however, has not been adequately quantified in the literature, nor have the prognostic implications of basaloid differentiation been described.
1) To correlate the H&E staining feature of basaloid differentiation with p16 positivity in oropharyngeal cancer. 2) To investigate the prognostic utility of basaloid differentiation in oropharyngeal cancer survival.
Retrospective cross-sectional study of all patients diagnosed with and treated for oropharyngeal cancer at a single tertiary cancer center from 2002 to 2009. Tissue microarrays (TMAs) were generated from 208 oropharyngeal tumor specimens stained with H&E and immunohistochemical markers. These oropharyngeal TMAs were utilized in several previous publications. Samples were scored for basaloid differentiation by a pathologist blinded to the p16 result. A multivariate survival analysis with Cox-regression and Kaplan-Meier survival analysis was performed.
In the 208 samples, basaloid differentiation correlated with p16 positivity (Spearman’s rho 0.435). Basaloid differentiation and p16 positivity were both independent predictors of improved survival. The 5 year disease specific survival (DSS) was 73% for p16 positive tumors and 35% for p16 negative tumors (p < 0.001). Similarly, the 5 year DSS of basaloid differentiated tumors was 74% compared to 41% for non-basaloid tumors (p = 0.001). Patients with p16 positive and basaloid differentiated tumors had the best survival outcomes with a 5 year DSS of 80%.
Basaloid differentiation is a feature on H&E which correlates with p16 positivity and is a simple, inexpensive, independent, positive prognostic indicator of comparable magnitude to p16 status. Due to the added prognostic value of basaloid differentiation, this feature should be routinely reported by qualified pathologists.
PMCID: PMC3892036  PMID: 24350944
Basaloid differentiation; HPV; p16; Hematoxylin; Eosin; Oropharynx; Squamous cell carcinoma; Outcomes; Survival
13.  Immunoprofiles of 11 Biomarkers Using Tissue Microarrays Identify Prognostic Subgroups in Colorectal Cancer 
Neoplasia (New York, N.Y.)  2005;7(8):741-747.
BACKGROUND AND AIMS: Genomewide expression profiling has identified a number of genes expressed at higher levels in colorectal cancer (CRC) than in normal tissues. Our objectives in this study were: 1) to test whether genes were also distinct on the protein level; 2) to evaluate these biomarkers in a series of well-characterized CRCs; and 3) to apply hierarchical cluster analysis to the immunohistochemical data. METHODS: Tissue microarrays (TMAs) comprising 351 CRC specimens from 270 patients were constructed to evaluate the genes Adam10, CyclinD1, AnnexinII, NFKB, Casein-kinase-2-beta (CK2B), YB-1, P32, Rad51, c-fos, IGFBP4, and Connexin26 (Cx26). In total, 3,797 samples were analyzed. RESULTS: Unsupervised hierarchical clustering discovered subgroups of CRC that differed by tumor stage and survival. Kaplan-Meier analysis showed that reduced Cx26 expression was significantly associated with shorter patient survival and higher tumor grade (G1/G2 vs G3, P = .02), and Adam10 expression with a higher tumor stage (pT1/2 vs pT3/4, P = .04). CONCLUSIONS: Our study highlights the potential of TMAs for a higher-dimensional analysis by evaluating serial sections of the same tissue core (three-dimensional TMA analysis). In addition, it endorses the use of immunohistochemistry supplemented by hierarchical clustering for the identification of tumor subgroups with diagnostic and prognostic signatures.
PMCID: PMC1501883  PMID: 16207476
Connexin26 (Cx26); Adam10; colorectal cancer (CRC); hierarchical clustering; tissue microarray (TMA)
14.  High-throughput proteomic analysis of formalin-fixed paraffin-embedded tissue microarrays using MALDI imaging mass spectrometry 
Proteomics  2008;8(18):3715-3724.
A novel method for high-throughput proteomic analysis of formalin-fixed paraffin-embedded (FFPE) tissue microarrays (TMA) is described using on-tissue tryptic digestion followed by MALDI imaging MS. A TMA section containing 112 needle core biopsies from lung-tumor patients was analyzed using MS and the data were correlated to a serial hematoxylin and eosin (H&E)-stained section having various histological regions marked, including cancer, non-cancer, and normal ones. By correlating each mass spectrum to a defined histological region, statistical classification models were generated that can sufficiently distinguish biopsies from adenocarcinoma from squamous cell carcinoma biopsies. These classification models were built using a training set of biopsies in the TMA and were then validated on the remaining biopsies. Peptide markers of interest were identified directly from the TMA section using MALDI MS/MS sequence analysis. The ability to detect and characterize tumor marker proteins for a large cohort of FFPE samples in a high-throughput approach will be of significant benefit not only to investigators studying tumor biology, but also to clinicians for diagnostic and prognostic purposes.
PMCID: PMC2927989  PMID: 18712763
Cancer; Formalin-fixed paraffin-embedded; Imaging mass spectrometry; Lung; Tissue
15.  Internet-based profiler system as integrative framework to support translational research 
BMC Bioinformatics  2005;6:304.
Translational research requires taking basic science observations and developing them into clinically useful tests and therapeutics. We have developed a process to develop molecular biomarkers for diagnosis and prognosis by integrating tissue microarray (TMA) technology and an internet-database tool, Profiler. TMA technology allows investigators to study hundreds of patient samples on a single glass slide, resulting in the conservation of tissue and a reduction in inter-experimental variability. The Profiler system allows investigators to reliably track, store, and evaluate TMA experiments. Herein we describe the process that has evolved empirically over the past 5 years at two academic institutions.
The generic design of this system makes it compatible with multiple organ systems (e.g., prostate, breast, lung, renal, and hematopoietic). Studies and folders are restricted to authorized users as required. Over the past 5 years, investigators at 2 academic institutions have scanned 656 TMA experiments and collected 63,311 digital images of these tissue samples. 68 pathologists from 12 major user groups have accessed the system. Two groups directly link clinical data from over 500 patients for immediate access and the remaining groups choose to maintain clinical and pathology data on separate systems. Profiler currently has 170 K data points such as staining intensity, tumor grade, and nuclear size. Due to the relational database structure, analysis can be easily performed on single or multiple TMA experimental results. The TMA module of Profiler can maintain images acquired from multiple systems.
We have developed a robust process to develop molecular biomarkers using TMA technology and an internet-based database system to track all steps of this process. This system is extendable to other types of molecular data as separate modules and is freely available to academic institutions for licensing.
PMCID: PMC1343596  PMID: 16364175
16.  SAMSN1 Is Highly Expressed and Associated with a Poor Survival in Glioblastoma Multiforme 
PLoS ONE  2013;8(11):e81905.
To study the expression pattern and prognostic significance of SAMSN1 in glioma.
Affymetrix and Arraystar gene microarray data from glioma were analyzed to make a preliminary study of the expression pattern of SAMSN1 in glioma tissues, and hierarchical clustering of the gene microarray data was performed to filter out genes that have prognostic value in malignant glioma. Survival analysis by Kaplan-Meier estimates stratified by SAMSN1 expression was then made based on the data of more than 500 GBM cases provided by The Cancer Genome Atlas (TCGA) project. Finally, we detected the expression of SAMSN1 in large numbers of glioma and normal brain tissue samples using Tissue Microarray (TMA). Survival analysis by Kaplan-Meier estimates in each grade of glioma was stratified by SAMSN1 expression. Multivariate survival analysis was made by Cox proportional hazards regression models in corresponding groups of glioma.
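For readers unfamiliar with the Kaplan-Meier estimate used above, the following is a minimal sketch of the product-limit calculation in plain Python. The follow-up times and event flags are toy values invented for illustration; they are not data from the study, and a stratified analysis would simply run the same function once per expression group.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up duration for each subject
    events -- 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after t.
        n_at_risk -= removed
    return curve


# Toy example: five subjects, two of them censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Censored subjects contribute to the risk set up to their last follow-up but never lower the curve themselves, which is why the estimate only steps down at observed event times.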
With the expression data of SAMSN1 and 68 other genes, high-grade glioma could be classified into two groups with clearly different prognoses. Gene and large sample tissue microarrays showed high expression of SAMSN1 in glioma particularly in GBM. Survival analysis based on the TCGA GBM data matrix and TMA multi-grade glioma dataset found that SAMSN1 expression was closely related to the prognosis of GBM, either PFS or OS (P<0.05). Multivariate survival analysis with Cox proportional hazards regression models confirmed that high expression of SAMSN1 was a strong risk factor for PFS and OS of GBM patients.
SAMSN1 is over-expressed in glioma as compared with that found in normal brains, especially in GBM. High expression of SAMSN1 is a significant risk factor for the progression free and overall survival of GBM.
PMCID: PMC3838348  PMID: 24278465
17.  Integrative genome-wide expression profiling identifies three distinct molecular subgroups of renal cell carcinoma with different patient outcome 
BMC Cancer  2012;12:310.
Renal cell carcinoma (RCC) is characterized by a number of diverse molecular aberrations that differ among individuals. Recent approaches to molecularly classify RCC were based on clinical, pathological as well as on single molecular parameters. As a consequence, gene expression patterns reflecting the sum of genetic aberrations in individual tumors may not have been recognized. In an attempt to uncover such molecular features in RCC, we used a novel, unbiased and integrative approach.
We integrated gene expression data from 97 primary RCC of different pathologic parameters, 15 RCC metastases as well as 34 cancer cell lines for two-way nonsupervised hierarchical clustering using gene groups suggested by the PANTHER Classification System. We depicted the genomic landscape of the resulting tumor groups by means of Single Nucleotide Polymorphism (SNP) technology. Finally, the achieved results were immunohistochemically analyzed using a tissue microarray (TMA) composed of 254 RCC.
We found robust, genome wide expression signatures, which split RCC into three distinct molecular subgroups. These groups remained stable even if randomly selected gene sets were clustered. Notably, the pattern obtained from RCC cell lines was clearly distinguishable from that of primary tumors. SNP array analysis demonstrated differing frequencies of chromosomal copy number alterations among RCC subgroups. TMA analysis with group-specific markers showed a prognostic significance of the different groups.
We propose the existence of characteristic and histologically independent genome-wide expression outputs in RCC with potential biological and clinical relevance.
PMCID: PMC3488567  PMID: 22824167
DNA-microarray; SNP-array; RCC subgroups; Tissue microarray; Outcome
18.  An improved method for constructing tissue microarrays from prostate needle biopsy specimens 
Journal of Clinical Pathology  2009;62(8):694-698.
Prostate cancer diagnosis is routinely made by the histopathological examination of formalin fixed needle biopsy specimens. Frequently this is the only cancer tissue available from the patient for the analysis of diagnostic and prognostic biomarkers. There is, therefore, an urgent need for methods that allow the high-throughput analysis of these biopsy samples using immunohistochemical (IHC) markers and fluorescence in situ hybridisation (FISH) analysis based markers.
A method that allows the construction of tissue microarrays (TMAs) from diagnostic prostate needle biopsy cores has previously been reported. However, the technique only allows the production of low-density biopsy TMAs with a maximum of 20 cores per TMA. Here two methods are presented that allow the rapid and uniform production of biopsy TMAs containing between 54 and 72 biopsy cores. IHC and FISH techniques were used to detect biomarker status.
Biopsy TMAs were constructed from prostate needle biopsy specimens taken from 102 patients entered into an active surveillance trial and 201 patients in a radiotherapy trial. The detection rate for cancer in slices of these biopsy TMAs was 66% and 79% respectively. Slices of a biopsy TMA prepared from biopsies from active surveillance patients were used to detect multiple IHC markers and to score TMPRSS2-ERG fusion status in a FISH-based assay.
The construction of biopsy TMAs provides an effective method for the multiplex analysis of IHC and FISH markers and for their assessment as prognostic biomarkers in the context of clinical trials.
PMCID: PMC2709943  PMID: 19638540
19.  Validation of cytoplasmic-to-nuclear ratio of survivin as an indicator of improved prognosis in breast cancer 
BMC Cancer  2010;10:639.
Conflicting data exist regarding the prognostic and predictive impact of survivin (BIRC5) in breast cancer. We previously reported survivin cytoplasmic-to-nuclear ratio (CNR) as an independent prognostic indicator in breast cancer. Here, we validate survivin CNR in a separate and extended cohort. Furthermore, we present new data suggesting that a low CNR may predict outcome in tamoxifen-treated patients.
Survivin expression was assessed using immunohistochemistry on a breast cancer tissue microarray (TMA) containing 512 tumours. Whole slide digital images were captured using an Aperio XT scanner. Automated image analysis was used to identify tumour from stroma and then to quantify tumour-specific nuclear and cytoplasmic survivin. A decision tree model selected using a 10-fold cross-validation approach was used to identify prognostic subgroups based on nuclear and cytoplasmic survivin expression.
Following optimisation of the staining procedure, it was possible to evaluate survivin protein expression in 70.1% (n = 359) of the 512 tumours represented on the TMA. Decision tree analysis predicted that nuclear, as opposed to cytoplasmic, survivin was the most important determinant of overall survival (OS) and breast cancer-specific survival (BCSS). The decision tree model confirmed CNR of 5 as the optimum threshold for survival analysis. Univariate analysis demonstrated an association between a high CNR (>5) and a prolonged BCSS (HR 0.49, 95% CI 0.29-0.81, p = 0.006). Multivariate analysis revealed a high CNR (>5) was an independent predictor of BCSS (HR 0.47, 95% CI 0.27-0.82, p = 0.008). An increased CNR was associated with ER positive (p = 0.045), low grade (p = 0.007), Ki-67 (p = 0.001) and Her2 (p = 0.026) negative tumours. Finally, a high CNR was an independent predictor of OS in tamoxifen-treated ER-positive patients (HR 0.44, 95% CI 0.23-0.87, p = 0.018).
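As a small illustration of how the CNR threshold reported above would be applied, the sketch below computes a cytoplasmic-to-nuclear ratio and assigns a tumour to the high (>5) or low group. The function names and the example scores are hypothetical; the abstract does not state the units or scale of the image-analysis output, so the inputs here are arbitrary positive numbers.

```python
def cytoplasmic_to_nuclear_ratio(cytoplasmic, nuclear):
    """Ratio of cytoplasmic to nuclear survivin signal (arbitrary units)."""
    if nuclear <= 0:
        raise ValueError("nuclear score must be positive to form a ratio")
    return cytoplasmic / nuclear


def cnr_group(cytoplasmic, nuclear, threshold=5.0):
    """Classify a tumour as 'high' CNR (ratio strictly above threshold) or 'low'."""
    ratio = cytoplasmic_to_nuclear_ratio(cytoplasmic, nuclear)
    return "high" if ratio > threshold else "low"


# Invented example scores: ratios of 6, 5 and 2.
groups = [cnr_group(c, n) for c, n in [(60.0, 10.0), (50.0, 10.0), (20.0, 10.0)]]

# Share of the 512 arrayed tumours that could be evaluated (n = 359).
evaluable_pct = round(100 * 359 / 512, 1)
print(groups, evaluable_pct)
```

Note that a ratio of exactly 5 falls in the low group here, following the ">5" wording in the abstract.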
Using the same threshold as our previous study, we have validated survivin CNR as a marker of good prognosis in breast cancer in a large independent cohort. These findings provide robust evidence of the importance of survivin CNR as a breast cancer biomarker, and its potential to predict outcome in tamoxifen-treated patients.
PMCID: PMC2999619  PMID: 21092276
20.  Cancer stem cell markers in breast cancer: pathological, clinical and prognostic significance 
Breast Cancer Research : BCR  2011;13(6):R118.
The cancer stem cell (CSC) hypothesis states that tumours consist of a cellular hierarchy with CSCs at the apex driving tumour recurrence and metastasis. Hence, CSCs are potentially of profound clinical importance. We set out to establish the clinical relevance of breast CSC markers by profiling a large cohort of breast tumours in tissue microarrays (TMAs) using immunohistochemistry (IHC).
We included 4,125 patients enrolled in the SEARCH population-based study with tumours represented in TMAs and classified into molecular subtype according to a validated IHC-based five-marker scheme. IHC was used to detect CD44/CD24, ALDH1A1, aldehyde dehydrogenase family 1 member A3 (ALDH1A3) and integrin alpha-6 (ITGA6). A 'Total CSC' score representing expression of all four CSC markers was also investigated. Association with breast cancer specific survival (BCSS) at 10 years was assessed using a Cox proportional-hazards model. This study complied with REMARK criteria.
In ER negative cases, multivariate analysis showed that ITGA6 was an independent prognostic factor with a time-dependent effect restricted to the first two years of follow-up (hazard ratio (HR) for 0 to 2 years follow-up, 2.4; 95% confidence interval (95% CI), 1.2 to 4.8; P = 0.009). The composite 'Total CSC' score carried independent prognostic significance in ER negative cases for the first four years of follow-up (HR for 0 to 4 years follow-up, 1.3; 95% CI, 1.1 to 1.6; P = 0.006).
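Confidence intervals for hazard ratios from Cox models are usually Wald-type intervals, symmetric on the log scale, so the point estimate should sit near the geometric mean of the two bounds. The sketch below, which assumes that form of interval, recovers the implied hazard ratio and the standard error of log(HR) from the bounds quoted above. The reported figures are rounded to one decimal place, so agreement can only be approximate.

```python
import math


def log_hr_summary(lower, upper, z=1.959964):
    """Implied HR and SE of log(HR) from a 95% CI.

    Assumes a Wald-type interval that is symmetric on the log scale;
    z is the standard normal quantile for a two-sided 95% interval.
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    implied_hr = math.exp((log_lower + log_upper) / 2)  # geometric mean of bounds
    se_log_hr = (log_upper - log_lower) / (2 * z)
    return implied_hr, se_log_hr


# Bounds taken from the abstract: ITGA6 (0 to 2 years) and Total CSC (0 to 4 years).
itga6_hr, itga6_se = log_hr_summary(1.2, 4.8)
total_hr, total_se = log_hr_summary(1.1, 1.6)
print(round(itga6_hr, 2), round(itga6_se, 3))
print(round(total_hr, 2), round(total_se, 3))
```

Both implied values are consistent with the reported hazard ratios of 2.4 and 1.3 once rounding is taken into account.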
Breast CSC markers do not identify identical subpopulations in primary tumours. Both ITGA6 and a composite Total CSC score show independent prognostic significance in ER negative disease. The use of multiple markers to identify tumours enriched for CSCs has the greatest prognostic value. In the absence of more specific markers, we propose that the effective translation of the CSC hypothesis into patient benefit will necessitate the use of a panel of markers to robustly identify tumours enriched for CSCs.
PMCID: PMC3326560  PMID: 22112299
21.  Association of Drug Transporter Expression with Mortality and Progression-Free Survival in Stage IV Head and Neck Squamous Cell Carcinoma 
PLoS ONE  2014;9(10):e108908.
Drug transporters such as P-glycoprotein (ABCB1) have been associated with chemotherapy resistance and are considered unfavorable prognostic factors for survival of cancer patients. Analyzing mRNA expression levels of a subset of drug transporters by quantitative reverse transcription polymerase chain reaction (qRT-PCR) or protein expression by tissue microarray (TMA) in tumor samples of therapy naïve stage IV head and neck squamous cell carcinoma (HNSCC) (qRT-PCR, n = 40; TMA, n = 61), this in situ study re-examined the significance of transporter expression for progression-free survival (PFS) and overall survival (OS). Data from The Cancer Genome Atlas database was used to externally validate the respective findings (n = 317). In general, HNSCC tended to show lower expression of drug transporters compared to normal epithelium. High ABCB1 mRNA tumor expression was associated with both favorable progression-free survival (PFS, p = 0.0357) and overall survival (OS, p = 0.0535). Similar results were obtained for the mRNA of ABCC1 (MRP1, multidrug resistance-associated protein 1; PFS, p = 0.0183; OS, p = 0.038). In contrast, protein expression of ATP7b (copper transporter ATP7b), mRNA expression of ABCG2 (BCRP, breast cancer resistance protein), ABCC2 (MRP2), and SLC31A1 (hCTR1, human copper transporter 1) did not correlate with survival. Cluster analysis however revealed that simultaneous high expression of SLC31A1, ABCC2, and ABCG2 indicates poor survival of HNSCC patients. In conclusion, this study militates against the intuitive dogma where high expression of drug efflux transporters indicates poor survival, but demonstrates that expression of single drug transporters might indicate even improved survival. Prospectively, combined analysis of the ‘transportome’ should rather be performed as it likely unravels meaningful data on the impact of drug transporters on survival of patients with HNSCC.
PMCID: PMC4183512  PMID: 25275603
22.  Low RBM3 protein expression correlates with tumour progression and poor prognosis in malignant melanoma: An analysis of 215 cases from the Malmö Diet and Cancer Study 
We have previously reported that expression of the RNA- and DNA-binding protein RBM3 is associated with a good prognosis in breast cancer and ovarian cancer. In this study, the prognostic value of immunohistochemical RBM3 expression was assessed in incident cases of malignant melanoma from a prospective population-based cohort study.
Until Dec 31st 2008, 264 incident cases of primary invasive melanoma had been registered in the Malmö Diet and Cancer Study. Histopathological and clinical information was obtained for available cases and tissue microarrays (TMAs) constructed from 226 (85.6%) suitable paraffin-embedded tumours and 31 metastases. RBM3 expression was analysed by immunohistochemistry on the TMAs and a subset of full-face sections. Chi-square and Mann-Whitney U tests were used for comparison of RBM3 expression and relevant clinicopathological characteristics. Kaplan Meier analysis and Cox proportional hazards modelling were used to assess the relationship between RBM3 and recurrence free survival (RFS) and overall survival (OS).
RBM3 could be assessed in 215/226 (95.1%) of primary tumours and all metastases. Longitudinal analysis revealed that 16/31 (51.6%) of metastases lacked RBM3 expression, in contrast to the primary tumours in which RBM3 was absent in 3/215 (1.4%) cases and strongly expressed in 120/215 (55.8%) cases. Strong nuclear RBM3 expression in the primary tumour was significantly associated with favourable clinicopathological parameters; i.e. non-ulcerated tumours, lower depth of invasion, lower Clark level, less advanced clinical stage, low mitotic activity and non-nodular histological type, and a prolonged RFS (RR = 0.50; 95% CI = 0.27-0.91) and OS (RR = 0.36, 95%CI = 0.20-0.64). Multivariate analysis demonstrated that the beneficial prognostic value of RBM3 remained significant for OS (RR = 0.33; 95%CI = 0.18-0.61).
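The proportions quoted above can be re-derived from the raw counts with a few lines of Python. This is only an arithmetic check on the numbers given in the abstract; the dictionary keys are descriptive labels chosen for the example.

```python
def percentage(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * numerator / denominator, 1)


# Counts as stated in the abstract.
reported = {
    "primary tumours assessable": (215, 226),
    "metastases lacking RBM3": (16, 31),
    "primary tumours with absent RBM3": (3, 215),
    "primary tumours with strong RBM3": (120, 215),
}

derived = {label: percentage(n, d) for label, (n, d) in reported.items()}
for label, value in derived.items():
    print(f"{label}: {value}%")
```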
In line with previous in vitro data, we here show that RBM3 is down-regulated in metastatic melanoma and high nuclear RBM3 expression in the primary tumour is an independent marker of a prolonged OS. The potential utility of RBM3 in treatment stratification of patients with melanoma should be pursued in future studies.
PMCID: PMC3156749  PMID: 21777469
23.  Quantitative Multiplexed Analysis of ErbB Family Co-expression for Primary Breast Cancer Prognosis in a Large Retrospective Cohort 
Cancer  2009;115(11):2400-2409.
Assessment of outcome using a single prognostic or predictive marker is the current basis of targeted therapy, but is inherently limited by its simplicity. Multiplexing has provided better classification but has only been done quantitatively using RNA or DNA. Automated quantitative analysis (AQUA) is a new technology that allows quantitative in situ assessment of protein expression. We hypothesize that multiplexed quantitative measurement of ErbB receptor family proteins may allow better prediction of outcome.
We quantitatively assessed the expression of six proteins in four subcellular compartments in 676 patients using breast carcinoma tissue microarrays (TMA). Then using Cox proportional hazards modeling and unsupervised hierarchical clustering, we assessed the prognostic value of the expression singly and multiplexed.
EGFR, HER-2 and HER-3 expression were associated with decreased survival. Multivariate analysis showed high HER-2 and HER-3 expression maintained independence as prognostic markers. Hierarchical clustering of expression data defined a small class enriched for HER-2 expression with 40% 10 year survival, compared to 55% using conventional methods. Clustering also revealed a similarly poor-prognostic subgroup co-expressing EGFR and HER-3 (but low for ER, PR and HER-2) with a 42% 10 year survival.
This work shows that the combined analysis of protein expression improved prognostic classification and that multiplexed models were superior to any single marker-based method for prediction of 10-year survival. These methods illustrate a protein-based, multiplexed approach that could more accurately identify patients for targeted therapies.
PMCID: PMC2756449  PMID: 19330842
AQUA; HER2; EGFR; HER3; HER4; Immunohistochemistry; survival
24.  Subtyping of Breast Cancer by Immunohistochemistry to Investigate a Relationship between Subtype and Short and Long Term Survival: A Collaborative Analysis of Data for 10,159 Cases from 12 Studies 
PLoS Medicine  2010;7(5):e1000279.
Paul Pharoah and colleagues evaluate the prognostic significance of immunohistochemical subtype classification in more than 10,000 breast cancer cases with early disease, and examine the influence of a patient's survival time on the prediction of future survival.
Immunohistochemical markers are often used to classify breast cancer into subtypes that are biologically distinct and behave differently. The aim of this study was to estimate mortality for patients with the major subtypes of breast cancer as classified using five immunohistochemical markers, to investigate patterns of mortality over time, and to test for heterogeneity by subtype.
Methods and Findings
We pooled data from more than 10,000 cases of invasive breast cancer from 12 studies that had collected information on hormone receptor status, human epidermal growth factor receptor-2 (HER2) status, and at least one basal marker (cytokeratin [CK]5/6 or epidermal growth factor receptor [EGFR]) together with survival time data. Tumours were classified as luminal and nonluminal tumours according to hormone receptor expression. These two groups were further subdivided according to expression of HER2, and finally, the luminal and nonluminal HER2-negative tumours were categorised according to expression of basal markers. Changes in mortality rates over time differed by subtype. In women with luminal HER2-negative subtypes, mortality rates were constant over time, whereas mortality rates associated with the luminal HER2-positive and nonluminal subtypes tended to peak within 5 y of diagnosis and then decline over time. In the first 5 y after diagnosis the nonluminal tumours were associated with a poorer prognosis, but over longer follow-up times the prognosis was poorer in the luminal subtypes, with the worst prognosis at 15 y being in the luminal HER2-positive tumours. Basal marker expression distinguished the HER2-negative luminal and nonluminal tumours into different subtypes. These patterns were independent of any systemic adjuvant therapy.
The six subtypes of breast cancer defined by expression of five markers show distinct behaviours with important differences in short term and long term prognosis. Application of these markers in the clinical setting could have the potential to improve the targeting of adjuvant chemotherapy to those most likely to benefit. The different patterns of mortality over time also suggest important biological differences between the subtypes that may result in differences in response to specific therapies, and that stratification of breast cancers by clinically relevant subtypes in clinical trials is urgently required.
Editors' Summary
Each year, more than one million women discover they have breast cancer. Breast cancer begins when cells in the breast's milk-producing glands or in the tubes (ducts) that take milk to the nipples acquire genetic changes that allow them to divide uncontrollably and to move around the body (metastasize). The uncontrolled cell division leads to the formation of a lump that can be detected by mammography (a breast X-ray) or by manual breast examination. Breast cancer is treated by surgical removal of the lump or, if the cancer has started to spread, by removal of the whole breast (mastectomy). Surgery is usually followed by radiotherapy or chemotherapy. These “adjuvant” therapies are designed to kill any remaining cancer cells but can make women very ill. Generally speaking, the outlook (prognosis) for women with breast cancer is good. In the United States, for example, nearly 90% of affected women are still alive five years after their diagnosis.
Why Was This Study Done?
Because there are several types of cells in the milk ducts and glands, there are several subtypes of breast cancer. Luminal tumors, for example, begin in the cells that line the ducts and glands and usually grow slowly; basal-type tumors arise in deeper layers of the ducts and glands and tend to grow quickly. Clinicians need to distinguish between different breast cancer subtypes so that they can give women a realistic prognosis and can give adjuvant treatments to those women who are most likely to benefit. One way to distinguish between different subtypes is to stain breast cancer samples using antibodies (immune system proteins) that recognize particular proteins (antigens). This “immunohistochemical” approach can identify several breast cancer subtypes but its prognostic value and the best way to classify breast tumors remain unclear. In this study, the researchers investigate the survival over time of women with six major subtypes of breast cancer classified using five immunohistochemical markers: the estrogen receptor and the progesterone receptor (two hormone receptors expressed by luminal cells), the human epidermal growth factor receptor-2 (HER2, a protein marker used to select specific adjuvant therapies), and CK5/6 and EGFR (proteins expressed by basal cells).
What Did the Researchers Do and Find?
The researchers pooled data on survival time and on the expression of the five immunohistochemical markers from more than 10,000 cases of breast cancer from 12 studies. They then divided the tumors into six subtypes on the basis of their marker expression: luminal (hormone receptor-positive), HER2-positive tumors; luminal, HER2-negative, basal marker-positive tumors; luminal, HER2-negative, basal marker-negative tumors; nonluminal (hormone receptor-negative), HER2-positive tumors; nonluminal, HER2-negative, basal marker-positive tumors; and nonluminal, HER2-negative, basal marker-negative tumors. In the first five years after diagnosis, women with nonluminal tumor subtypes had the worst prognosis but at 15 years after diagnosis, women with luminal HER2-positive tumors had the worst prognosis. Furthermore, death rates (the percentage of affected women dying each year) differed by subtype over time. Thus, women with the two luminal HER2-negative subtypes were as likely to die soon after diagnosis as at later times whereas the death rates associated with nonluminal subtypes peaked within five years of diagnosis and then declined.
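The six-way scheme described above is a small decision tree, which the following sketch makes explicit. It assumes each marker has already been reduced to a boolean status. How the hormone receptor status is derived from the estrogen and progesterone receptor results, and how CK5/6 and EGFR are combined into a single basal marker status, are left to the caller, since the summary does not spell out those rules.

```python
from itertools import product


def classify_subtype(hormone_receptor_positive, her2_positive, basal_marker_positive):
    """Assign one of the six immunohistochemical subtypes.

    All three arguments are booleans prepared by the caller (an assumption
    of this sketch). Basal marker status is only consulted for
    HER2-negative tumours, as in the scheme described in the text.
    """
    lineage = "luminal" if hormone_receptor_positive else "nonluminal"
    if her2_positive:
        return f"{lineage}, HER2-positive"
    basal = "positive" if basal_marker_positive else "negative"
    return f"{lineage}, HER2-negative, basal marker-{basal}"


# Enumerate every marker combination to confirm exactly six subtypes result.
subtypes = sorted({classify_subtype(hr, her2, basal)
                   for hr, her2, basal in product([True, False], repeat=3)})
for name in subtypes:
    print(name)
```

Eight marker combinations collapse to six subtypes because basal marker expression does not split the two HER2-positive groups.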
What Do These Findings Mean?
These and other findings indicate that the six subtypes of breast cancer defined by the expression of five immunohistochemical markers have distinct biological characteristics that are associated with important differences in short-term and long-term outcomes. Because different laboratories measured the immunohistochemical markers using different methods, it is possible that some of the tumors included in this study were misclassified. However, the finding of clear differences in the behavior of the immunochemically classified subtypes suggests that the use of the five markers for tumor classification might be robust enough for routine clinical practice. The application of these markers in the clinical setting, suggest the researchers, could improve the targeting of adjuvant therapies to those women most likely to benefit. Furthermore, note the researchers, these findings strongly suggest that subtype-specific responses should be evaluated in future clinical trials of treatments for breast cancer.
Additional Information
This study is further discussed in a PLoS Medicine Perspective by Stefan Ambs
The US National Cancer Institute provides detailed information for patients and health professionals on all aspects of breast cancer (in English and Spanish)
The American Cancer Society has a detailed guide to breast cancer, which includes information on the immunochemical classification of breast cancer subtypes
The UK charities MacMillan Cancer Support and Cancer Research UK also provide detailed information about breast cancer
The MedlinePlus Encyclopedia provides information for patients about breast cancer; Medline Plus provides links to many other breast cancer resources (in English and Spanish)
PMCID: PMC2876119  PMID: 20520800
25.  A data model and database for high-resolution pathology analytical image informatics 
The systematic analysis of imaged pathology specimens often results in a vast amount of morphological information at both the cellular and sub-cellular scales. While microscopy scanners and computerized analysis are capable of capturing and analyzing data rapidly, microscopy image data remain underutilized in research and clinical settings. One major obstacle which tends to reduce wider adoption of these new technologies throughout the clinical and scientific communities is the challenge of managing, querying, and integrating the vast amounts of data resulting from the analysis of large digital pathology datasets. This paper presents a data model, which addresses these challenges, and demonstrates its implementation in a relational database system.
This paper describes a data model, referred to as Pathology Analytic Imaging Standards (PAIS), and a database implementation, which are designed to support the data management and query requirements of detailed characterization of micro-anatomic morphology through many interrelated analysis pipelines on whole-slide images and tissue microarrays (TMAs).
(1) Development of a data model capable of efficiently representing and storing virtual slide related image, annotation, markup, and feature information. (2) Development of a database, based on the data model, capable of supporting queries for data retrieval based on analysis and image metadata, queries for comparison of results from different analyses, and spatial queries on segmented regions, features, and classified objects.
Settings and Design:
The work described in this paper is motivated by the challenges associated with characterization of micro-scale features for comparative and correlative analyses involving whole-slide tissue images and TMAs. Technologies for digitizing tissues have advanced significantly in the past decade. Slide scanners are capable of producing high-magnification, high-resolution images from whole slides and TMAs within several minutes. Hence, it is becoming increasingly feasible for basic, clinical, and translational research studies to produce thousands of whole-slide images. Systematic analysis of these large datasets requires efficient data management support for representing and indexing results from hundreds of interrelated analyses generating very large volumes of quantifications such as shape and texture and of classifications of the quantified features.
Materials and Methods:
We have designed a data model and a database to address the data management requirements of detailed characterization of micro-anatomic morphology through many interrelated analysis pipelines. The data model represents virtual slide related image, annotation, markup and feature information. The database supports a wide range of metadata and spatial queries on images, annotations, markups, and features.
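To give a feel for the kind of combined metadata and spatial query described above, here is a minimal sketch using Python's built-in sqlite3 module. It is not the PAIS schema: the three tables, their columns, the algorithm labels, and the toy rows are all hypothetical, and the spatial predicate is a plain bounding-box overlap test, not a true spatial database operator.

```python
import sqlite3

# Hypothetical, heavily simplified schema (NOT the actual PAIS model).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE image (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE markup (
    id INTEGER PRIMARY KEY,
    image_id INTEGER REFERENCES image(id),
    algorithm TEXT,                       -- which analysis produced the region
    min_x REAL, min_y REAL, max_x REAL, max_y REAL  -- bounding box in pixels
);
CREATE TABLE feature (
    markup_id INTEGER REFERENCES markup(id),
    name TEXT,
    value REAL
);
""")

# Toy rows invented for the example.
conn.execute("INSERT INTO image VALUES (1, 'slide_001')")
conn.executemany("INSERT INTO markup VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (1, 1, "algoA", 10, 10, 20, 20),
    (2, 1, "algoA", 100, 100, 120, 115),
    (3, 1, "algoB", 12, 11, 21, 19),
])
conn.executemany("INSERT INTO feature VALUES (?, ?, ?)", [
    (1, "area", 78.5), (2, "area", 240.0), (3, "area", 70.0),
])


def features_in_window(connection, image_id, window, feature_name):
    """Return (markup id, algorithm, value) for markups overlapping a window."""
    x0, y0, x1, y1 = window
    return connection.execute("""
        SELECT m.id, m.algorithm, f.value
        FROM markup AS m
        JOIN feature AS f ON f.markup_id = m.id
        WHERE m.image_id = ? AND f.name = ?
          AND m.max_x >= ? AND m.min_x <= ?
          AND m.max_y >= ? AND m.min_y <= ?
        ORDER BY m.id
    """, (image_id, feature_name, x0, x1, y0, y1)).fetchall()


hits = features_in_window(conn, 1, (0, 0, 50, 50), "area")
print(hits)
```

Keeping the algorithm name on each markup lets one query compare the regions and feature values that different analyses produced for the same area of a slide.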
We currently have three databases running on a Dell PowerEdge T410 server with CentOS 5.5 Linux operating system. The database server is IBM DB2 Enterprise Edition 9.7.2. The set of databases consists of 1) a TMA database containing image analysis results from 4740 cases of breast cancer, with 641 MB storage size; 2) an algorithm validation database, which stores markups and annotations from two segmentation algorithms and two parameter sets on 18 selected slides, with 66 GB storage size; and 3) an in silico brain tumor study database comprising results from 307 TCGA slides, with 365 GB storage size. The latter two databases also contain human-generated annotations and markups for regions and nuclei.
Modeling and managing pathology image analysis results in a database provides immediate benefits for the value and usability of data in a research study. The database provides powerful query capabilities, which are otherwise difficult or cumbersome to support by other approaches such as programming languages. Standardized, semantically annotated data representation and interfaces also make it possible to more efficiently share image data and analysis results.
PMCID: PMC3153692  PMID: 21845230
Data models; databases; digitized slides; image analysis
