High-dimensional datasets can be confounded by variation from technical sources, such as batches. Undetected batch effects can have severe consequences for the validity of a study’s conclusions. We evaluate high-throughput RNAseq and miRNAseq as well as DNA methylation and gene expression microarray datasets, mainly from the Cancer Genome Atlas (TCGA) project, with respect to technical and biological annotations. We observe technical bias in these datasets and discuss corrective interventions. We then suggest a general procedure to control study design, detect technical bias using linear regression of principal components, correct for batch effects, and re-evaluate principal components. This procedure is implemented in the R package swamp and as graphical user interface software. In conclusion, high-throughput platforms that generate continuous measurements are sensitive to various forms of technical bias. For such data, monitoring of technical variation is an important analysis step.
data adjustment; batch effect; bias; sample annotation; RNAseq; high-throughput analysis
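The detection step named in the abstract above, regressing principal components on sample annotations, can be illustrated with a minimal sketch. It is not the swamp package's implementation or API: the function names, the simulated data, and the use of power iteration for the first principal component are all assumptions made for illustration. For a categorical annotation such as batch, the least-squares fit is simply the group mean, so R² is the share of a component's variance explained by that annotation.

```python
import random

def first_pc_scores(rows, iters=100):
    """Project samples onto the first principal component (power iteration)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    X = [[r[j] - means[j] for j in range(p)] for r in rows]  # column-centered data
    v = [1.0 + j / p for j in range(p)]  # arbitrary non-zero start vector
    for _ in range(iters):
        s = [sum(x[j] * v[j] for j in range(p)) for x in X]            # X v
        w = [sum(X[i][j] * s[i] for i in range(n)) for j in range(p)]  # X^T X v
        norm = sum(a * a for a in w) ** 0.5
        if norm == 0:
            raise ValueError("data have no variance along the start vector")
        v = [a / norm for a in w]
    return [sum(x[j] * v[j] for j in range(p)) for x in X]

def r_squared_on_factor(scores, labels):
    """R^2 of a linear regression of scores on one categorical annotation."""
    grand_mean = sum(scores) / len(scores)
    groups = {}
    for score, label in zip(scores, labels):
        groups.setdefault(label, []).append(score)
    ss_total = sum((s - grand_mean) ** 2 for s in scores)
    # Fitted value for each sample is the mean of its annotation group.
    ss_resid = sum((s - sum(g) / len(g)) ** 2 for g in groups.values() for s in g)
    return 1.0 - ss_resid / ss_total

# Hypothetical data: 20 samples x 5 features, batch B shifted by +5 in every feature.
random.seed(0)
batch = ["A"] * 10 + ["B"] * 10
biology = ["tumor", "normal"] * 10  # balanced within each batch
data = [[random.gauss(0, 1) + (5.0 if b == "B" else 0.0) for _ in range(5)]
        for b in batch]

scores = first_pc_scores(data)
r2_batch = r_squared_on_factor(scores, batch)
r2_biology = r_squared_on_factor(scores, biology)
print(f"PC1 R^2 on batch:   {r2_batch:.3f}")
print(f"PC1 R^2 on biology: {r2_biology:.3f}")
```

A high R² for a technical annotation together with a low R² for the biological annotation flags technical bias in that component. If batch and biology were confounded by design, both values would be high and the two sources could not be separated, which is why controlling the study design comes first in the procedure.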
Truncating germline mutations in the tumor suppressor gene BRCA-1 associated protein-1 (BAP1) have been reported in families predisposed to developing a wide range of different cancer types including uveal melanoma and cutaneous melanoma. There has also been an association between amelanotic tumor development and germline BAP1 mutation, suggesting a possible phenotypic characteristic of BAP1 mutation carriers. Although many types of cancer have been associated with germline BAP1 mutation, the full spectrum of disease association is yet to be ascertained. Here we describe a Danish family with predominantly uveal melanoma but also a range of other tumor types including lung, neuroendocrine, stomach, and breast cancer, as well as pigmented skin lesions. Whole-exome sequencing identified a BAP1 splice mutation located at c.581-2A>G, which leads to a premature truncation of BAP1, in an individual with uveal melanoma. This mutation was carried by several other family members with melanoma or various cancers. The finding expands on the growing profile of BAP1 as an important uveal and cutaneous melanoma tumor suppressor gene and implicates its involvement in the development of lung and stomach cancer.
For primary melanomas, tumor thickness, mitotic rate, and ulceration are well-laid cornerstones of prognostication. However, a molecular exposition of melanoma aggressiveness is critically missing. We recently uncovered a four-class structure in metastatic melanoma, which predicts outcome and informs biology. This raises the possibility that a molecular structure exists even in the early stages of melanoma and that molecular determinants could underlie histophenotype and eventual patient outcome.
We subjected 223 archival primary melanomas to a horizontally integrated analysis of RNA expression, oncogenic mutations at 238 lesions, histomorphometry, and survival data.
Our previously described four-class structure that was elucidated in metastatic lesions was evident within the expression space of primary melanomas. Because these subclasses converged into two larger prognostic and phenotypic groups, we used the metastatic lesions to develop a binary subtype-based signature capable of distinguishing between "high" and "low" grade forms of the disease. The two-grade signature was subsequently applied to the primary melanomas. Compared with low-grade tumors, high-grade primary melanomas were significantly associated with increased tumor thickness, mitotic rate, ulceration (all P < 0.01), and poorer relapse-free (HR = 4.94; 95% CI, 2.84–8.59), and overall (HR = 3.66; 95% CI, 2.40–5.58) survival. High-grade melanomas exhibited elevated levels of proliferation and BRCA1/DNA damage signaling genes, whereas low-grade lesions harbored higher expression of immune genes. Importantly, the molecular-grade signature was validated in two external gene expression data sets.
We provide evidence for a molecular organization within melanomas, which is preserved across all stages of disease.
Metastatic melanoma is characterized by a poor response to chemotherapy. Furthermore, there is a lack of established predictive and prognostic markers. In this single-institution study, we correlated mutation status and expression levels of BRAF and NRAS to dacarbazine (DTIC) treatment response as well as progression-free and overall survival in a cohort of 85 patients diagnosed with advanced melanoma. Neither BRAF nor NRAS mutation status correlated to treatment response. However, patients with tumors harboring NRAS mutations had a shorter overall survival (p < 0.001) compared to patients with tumors wild-type for NRAS. Patients having a clinical benefit (objective response or stable disease at 3 months) on DTIC therapy had lower BRAF and NRAS expression levels compared to patients progressing on therapy (p = 0.037 and 0.003, respectively). For BRAF expression, this association was stronger among patients with tumors wild-type for BRAF (p = 0.005). Further, low BRAF as well as NRAS expression levels were associated with a longer progression-free survival in the total population (p = 0.004 and <0.001, respectively). In contrast to low NRAS expression levels, which were associated with improved overall survival in the total population (p = 0.01), low BRAF levels were associated with improved overall survival only among patients with tumors wild-type for BRAF (p = 0.013). These findings indicate that BRAF and NRAS expression levels may influence responses to DTIC as well as prognosis in patients with advanced melanoma.
Electronic supplementary material
The online version of this article (doi:10.1007/s10585-013-9587-4) contains supplementary material, which is available to authorized users.
Melanoma; BRAF; NRAS; Chemoresistance; Dacarbazine
The objectives of the Southern Swedish Malignant Melanoma (SSMM) are to develop, build and utilize cutting-edge biobanks and OMICS platforms to better understand disease pathology and drug mechanisms. The SSMM research team is a cross-functional group with members from oncology, surgery, bioinformatics, proteomics, and genomics initiatives. Within the research team there are members who daily diagnose patients with suspected melanomas, follow up malignant melanoma patients, and remove primary or metastatic lesions by surgery. This interdisciplinary clinical patient care builds competence and establishes best-practice procedures that benefit the patient.
Clinical materials from patients before, during and after treatments with clinical end points are being collected. Tissue samples as well as bio-fluid samples such as blood fractions, plasma, serum and whole blood will be archived in 384-high density sample tube formats. Standardized approaches for patient selections, patient sampling, sample-processing and analysis platforms with dedicated protein assays and genomics platforms that will hold value for the research community are used. The patient biobank archives are fully automated with novel ultralow temperature biobank storage units and used as clinical resources.
An IT infrastructure using a laboratory information management system (LIMS) has been established as the key interface through which the research teams share and explore data generated within the project. The cross-site data repository in Lund forms the basis for sample processing, together with biological samples in southern Sweden, including blood fractions and tumor tissues. Clinical registries are associated with the biobank materials, including pathology reports on disease diagnosis for the malignant melanoma (MM) patients.
We provide data on the developments of protein profiling and targeted protein assays on isolated melanoma tumors, as well as reference blood standards that are used by the team members in the respective laboratories. These pilot data show biobank access and the feasibility of performing quantitative proteomics in MM biobank repositories collected in southern Sweden. The scientific outcomes further strengthen the build of healthcare benefit in the complex challenges of malignant melanoma pathophysiology that are addressed by the novel personalized medicines entering the market.
Malignant melanoma; Protein sequencing; Proteomics; Genes; Antibodies; mRNA; Mass spectrometry; Bioinformatics
Mycobacterium goodii is a rare cause of significant infection. M. goodii has mainly been associated with lymphadenitis, cellulitis, osteomyelitis, and wound infection.
A case of a 76-year-old Caucasian female is presented. The patient developed prosthetic valve endocarditis caused by M. goodii. She also suffered from severe neurological symptoms related to septic emboli, demonstrated as an ischemic lesion on CT of the brain. Transesophageal echocardiography verified a large vegetation attached to the prosthetic valve. Commonly used blood culture bottles showed growth of the bacteria after 3 days.
Although M. goodii is rarely involved in these kinds of severe infections, rapidly growing mycobacteria should be recognized during conventional bacterial investigations and identified by molecular tools such as analysis of 16S rDNA. Species identification of nontuberculous mycobacteria is demanding and is preferably done in collaboration with a mycobacterial laboratory. An early diagnosis provides the opportunity for adequate treatment. In the present case, prolonged antimicrobial treatment and surgery with replacement of the prosthetic valve were successful.
Mycobacterium goodii; NTM; Endocarditis; Septic emboli; Prosthetic valve; 16S rDNA analysis
Podocalyxin-like 1 (PODXL) is a cell-adhesion glycoprotein and stem cell marker that has been associated with an aggressive tumour phenotype and adverse outcome in several cancer types. We recently demonstrated that overexpression of PODXL is an independent factor of poor prognosis in colorectal cancer (CRC). The aim of this study was to validate these results in two additional independent patient cohorts and to examine the correlation between PODXL mRNA and protein levels in a subset of tumours.
PODXL protein expression was analyzed by immunohistochemistry in tissue microarrays with tumour samples from a consecutive, retrospective cohort of 270 CRC patients (cohort 1) and a prospective cohort of 337 CRC patients (cohort 2). The expression of PODXL mRNA was measured by real-time quantitative PCR in a subgroup of 62 patients from cohort 2. Spearman's rho and chi-square tests were used for analysis of correlations between PODXL expression and clinicopathological parameters. Kaplan-Meier analysis and Cox proportional hazards modelling were applied to assess the relationship between PODXL expression and time to recurrence (TTR), disease-free survival (DFS) and overall survival (OS).
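As a brief illustration of the rank correlation named above, Spearman's rho can be computed as the Pearson correlation of the ranks, with tied values given their average rank. The sketch below is a generic textbook formulation, not the authors' analysis code; the function names and the example values (ordinal staining scores paired with relative mRNA levels) are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical values: ordinal staining scores and relative mRNA levels.
protein_score = [0, 1, 1, 2, 3, 3]
mrna_level = [1.2, 0.4, 2.9, 0.8, 1.1, 2.0]
print(f"rho = {spearman_rho(protein_score, mrna_level):.3f}")
```

Working on ranks makes the statistic suitable for ordinal immunohistochemistry scores, where the spacing between categories carries no numerical meaning.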
High PODXL protein expression was significantly associated with unfavourable clinicopathological characteristics in both cohorts. In cohort 1, high PODXL expression was associated with a significantly shorter 5-year OS in both univariable (HR = 2.28; 95% CI 1.43-3.63, p = 0.001) and multivariable analysis (HR = 2.07; 95% CI 1.25-3.43, p = 0.005). In cohort 2, high PODXL expression was associated with a shorter TTR (HR = 2.93; 95% CI 1.26-6.82, p = 0.013) and DFS (HR = 2.44; 95% CI 1.32-4.54, p = 0.005), remaining significant in multivariable analysis, HR = 2.50; 95% CI 1.05-5.96, p = 0.038 for TTR and HR = 2.11; 95% CI 1.13-3.94, p = 0.019 for DFS.
No significant correlation could be found between mRNA levels and protein expression of PODXL and there was no association between mRNA levels and clinicopathological parameters or survival.
Here, we have validated the previously demonstrated association between immunohistochemical expression of PODXL and poor prognosis in CRC in two additional independent patient cohorts. The results further underline the potential utility of PODXL as a biomarker for more precise prognostication and treatment stratification of CRC patients.
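The hazard ratios above are reported with 95% confidence intervals and p-values. A reader can check that such a triplet is internally consistent by recovering the standard error of the log hazard ratio from the interval and computing a Wald test. This is a minimal sketch that assumes the interval is a symmetric Wald interval on the log scale; the function name is hypothetical, and because the published bounds are rounded, only approximate agreement with a reported p-value should be expected.

```python
import math

def wald_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Back-calculate SE(log HR), Wald z and two-sided p from an HR and its 95% CI.

    Assumes the interval was built as exp(log(HR) +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p

# Cohort 1, univariable 5-year OS as reported above: HR = 2.28; 95% CI 1.43-3.63.
se, z, p = wald_from_hr_ci(2.28, 1.43, 3.63)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```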
Lung cancer is the leading cause of cancer death worldwide. Tobacco usage is the major pathogenic factor, but not all lung cancers are attributable to smoking. Specifically, lung cancer in never-smokers has been suggested to represent a distinct disease entity compared to lung cancer arising in smokers due to differences in etiology, natural history and response to specific treatment regimens. However, the genetic aberrations that differ between lung carcinomas of smokers and never-smokers remain to a large extent unclear.
Unsupervised gene expression analysis of 39 primary lung adenocarcinomas was performed using Illumina HT-12 microarrays. Results from unsupervised analysis were validated in six external adenocarcinoma data sets (n=687), and six data sets comprising normal airway epithelial or normal lung tissue specimens (n=467). Supervised gene expression analysis between smokers and never-smokers was performed in seven adenocarcinoma data sets, and results were validated in the six normal data sets.
Initial unsupervised analysis of 39 adenocarcinomas identified two subgroups of which one harbored all never-smokers. A generated gene expression signature could subsequently identify never-smokers with 79-100% sensitivity in external adenocarcinoma data sets and with 76-88% sensitivity in the normal materials. A notable fraction of current/former smokers were grouped with never-smokers. Intriguingly, supervised analysis of never-smokers versus smokers in seven adenocarcinoma data sets generated similar results. Overlap in classification between the two approaches was high, indicating that both approaches identify a common set of samples from current/former smokers as potential never-smokers. The gene signature from unsupervised analysis included several genes implicated in lung tumorigenesis, immune-response associated pathways, genes previously associated with smoking, as well as marker genes for alveolar type II pneumocytes, while the best classifier from supervised analysis comprised genes strongly associated with proliferation, but also genes previously associated with smoking.
Based on gene expression profiling, we demonstrate that never-smokers can be identified with high sensitivity in both tumor material and normal airway epithelial specimens. Our results indicate that tumors arising in never-smokers, together with a subset of tumors from smokers, represent a distinct entity of lung adenocarcinomas. Taken together, these analyses provide further insight into the transcriptional patterns occurring in lung adenocarcinoma stratified by smoking history.
Lung cancer; Smoking; Gene expression analysis; Adenocarcinoma; EGFR; Never-smokers; Immune response
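The sensitivity figures quoted in the abstract above describe the fraction of true never-smokers that the signature assigns to the never-smoker group. The sketch below spells out that definition alongside specificity, which is the quantity affected when current or former smokers are grouped with never-smokers. The counts are hypothetical and are not taken from the study.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of actual never-smokers classified as never-smokers."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no positive cases")
    return true_positives / total

def specificity(true_negatives, false_positives):
    """Fraction of actual smokers classified as smokers."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no negative cases")
    return true_negatives / total

# Hypothetical data set: 24 never-smokers (19 identified), 100 smokers (40 misgrouped).
print(f"sensitivity = {sensitivity(19, 5):.3f}")
print(f"specificity = {specificity(60, 40):.3f}")
```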
All cancers carry somatic mutations. The patterns of mutation in cancer genomes reflect the DNA damage and repair processes to which cancer cells and their precursors have been exposed. To explore these mechanisms further, we generated catalogs of somatic mutation from 21 breast cancers and applied mathematical methods to extract mutational signatures of the underlying processes. Multiple distinct single- and double-nucleotide substitution signatures were discernible. Cancers with BRCA1 or BRCA2 mutations exhibited a characteristic combination of substitution mutation signatures and a distinctive profile of deletions. Complex relationships between somatic mutation prevalence and transcription were detected. A remarkable phenomenon of localized hypermutation, termed “kataegis,” was observed. Regions of kataegis differed between cancers but usually colocalized with somatic rearrangements. Base substitutions in these regions were almost exclusively of cytosine at TpC dinucleotides. The mechanisms underlying most of these mutational signatures are unknown. However, a role for the APOBEC family of cytidine deaminases is proposed.
► The genomes of 21 breast cancers sequenced ► Multiple somatic mutational processes extracted from mutation catalogs ► Mutational processes of BRCA1/BRCA2 breast cancers are distinctive ► Localized regions of hypermutation, “kataegis,” are frequent in breast cancers
Analyses of breast cancer genomes define distinct mutational signatures that imply the existence of multiple distinct somatic mutational processes throughout the genome and reveal a remarkable phenomenon of localized hypermutation. These highly mutated regions vary in size and chromosomal location and are surprisingly frequent in cancer genomes, often colocalizing with somatic rearrangements.
We report a genome-wide association study of melanoma, conducted by GenoMEL, of 2,981 cases of European ancestry and 1,982 study-specific controls, plus a further 6,426 French and UK population controls, all genotyped for 317,000 or 610,000 SNPs. The analysis confirmed previously known melanoma susceptibility loci. The 7 novel regions with at least one SNP with p<10^-5 and further local imputed or genotyped support were selected for replication using two other genome-wide studies (from Australia and Houston, Texas). Additional replication came from UK and Dutch case-control series. Three of the 7 regions replicated at p<10^-3: an ATM missense polymorphism (rs1801516, overall p=3.4×10^-9); a polymorphism within MX2 (rs45430, p=2.9×10^-9) and a SNP adjacent to CASP8 (rs13016963, p=8.6×10^-10). A fourth region near CCND1 remains of potential interest, showing suggestive but inconclusive evidence of replication. Unlike the previously known regions, the novel loci showed no association with nevus or pigmentation phenotypes in a large UK case-control series.
Human epidermal growth factor receptor 2 (HER2)-amplified breast cancer represents a clinically well-defined subgroup due to availability of targeted treatment. However, HER2-amplified tumors have been shown to be heterogeneous at the genomic level by genome-wide microarray analyses, pointing towards a need of further investigations for identification of recurrent copy number alterations and delineation of patterns of allelic imbalance.
High-density whole genome array-based comparative genomic hybridization (aCGH) and single nucleotide polymorphism (SNP) array data from 260 HER2-amplified breast tumors or cell lines, and 346 HER2-negative breast cancers with molecular subtype information were assembled from different repositories. Copy number alteration (CNA), loss-of-heterozygosity (LOH), copy number neutral allelic imbalance (CNN-AI), subclonal CNA and patterns of tumor DNA ploidy were analyzed using bioinformatical methods such as genomic identification of significant targets in cancer (GISTIC) and genome alteration print (GAP). The patterns of tumor ploidy were confirmed in 338 unrelated breast cancers analyzed by DNA flow cytometry with concurrent BAC aCGH and gene expression data.
A core set of 36 genomic regions commonly affected by copy number gain or loss was identified by integrating results with a previous study, together comprising > 400 HER2-amplified tumors. While CNN-AI frequency appeared evenly distributed over chromosomes in HER2-amplified tumors, not targeting specific regions and often < 20% in frequency, the occurrence of LOH was strongly associated with regions of copy number loss. HER2-amplified and HER2-negative tumors stratified by molecular subtypes displayed different patterns of LOH and CNN-AI, with basal-like tumors showing highest frequencies followed by HER2-amplified and luminal B cases. Tumor aneuploidy was strongly associated with increasing levels of LOH, CNN-AI, CNAs and occurrence of subclonal copy number events, irrespective of subtype. Finally, SNP data from individual tumors indicated that genomic amplification in general appears as monoallelic, that is, it preferentially targets one parental chromosome in HER2-amplified tumors.
We have delineated the genomic landscape of CNAs, amplifications, LOH, and CNN-AI in HER2-amplified breast cancer, but also demonstrated a strong association between different types of genomic aberrations and tumor aneuploidy irrespective of molecular subtype.
The CD44 cell adhesion molecule is aberrantly expressed in many breast tumors and has been implicated in the metastatic process as well as in the putative cancer stem cell (CSC) compartment. We aimed to investigate potential associations between alternatively spliced isoforms of CD44 and CSCs as well as to various breast cancer biomarkers and molecular subtypes.
We used q-RT-PCR and exon-exon spanning assays to analyze the expression of four alternatively spliced CD44 isoforms as well as the total expression of CD44 in 187 breast tumors and 13 cell lines. ALDH1 protein expression was determined by IHC on TMA.
Breast cancer cell lines showed a heterogeneous expression pattern of the CD44 isoforms, which shifted considerably when cells were grown as mammospheres. Tumors characterized as positive for the CD44+/CD24- phenotype by immunohistochemistry were associated with all isoforms except the CD44 standard (CD44S) isoform, which lacks all variant exons. Conversely, tumors with strong expression of the CSC marker ALDH1 had elevated expression of CD44S. A high expression of the CD44v2-v10 isoform, which retains all variant exons, was correlated with positive steroid receptor status, low proliferation and luminal A subtype. The CD44v3-v10 isoform showed similar correlations, while high expression of CD44v8-v10 was correlated with positive EGFR, negative/low HER2 status and basal-like subtype. High expression of CD44S was associated with strong HER2 staining and also a subgroup of basal-like tumors. Unsupervised hierarchical cluster analysis of CD44 isoform expression data divided tumors into four main clusters, which showed significant correlations with molecular subtypes and differences in 10-year overall survival.
We demonstrate that individual CD44 isoforms can be associated with different breast cancer subtypes and clinical markers such as HER2, ER and PgR, which suggests involvement of CD44 splice variants in specific oncogenic signaling pathways. Efforts to link CD44 to CSCs and tumor progression should consider the expression of various CD44 isoforms.
Basal-like breast cancer (BBC) is a subtype of breast cancer with poor prognosis [1–3]. Inherited mutations of BRCA1, a cancer susceptibility gene involved in double-strand DNA break (DSB) repair, lead to breast cancers that are nearly always of the BBC subtype [3–5]; however, the precise molecular lesions and oncogenic consequences of BRCA1 dysfunction are poorly understood. Here we show that heterozygous inactivation of the tumor suppressor gene Pten leads to the formation of basal-like mammary tumors in mice, and that loss of PTEN expression is significantly associated with the BBC subtype in human sporadic and BRCA1-associated hereditary breast cancers. In addition, we identify frequent gross PTEN mutations, involving intragenic chromosome breaks, inversions, deletions and micro copy number aberrations, specifically in BRCA1-deficient tumors. These data provide an example of a specific and recurrent oncogenic consequence of BRCA1-dependent dysfunction in DNA repair and provide insight into the pathogenesis of BBC with therapeutic implications. These findings also argue that obtaining an accurate census of genes mutated in cancer will require a systematic examination for gross gene rearrangements, particularly in tumors with deficient DSB repair.
Complement C2 deficiency is the most common genetically determined complete complement deficiency and is associated with a number of diseases. Most prominent are the associations with recurrent serious infections in young children and the development of systemic lupus erythematosus (SLE) in adults. The links with these diseases reflect the important role complement C2 plays in both innate immunity and immune tolerance. Infusions with normal fresh frozen plasma for the treatment of associated disease have demonstrated therapeutic effects but so far protein replacement therapy has not been evaluated.
Human complement C2 was cloned and expressed in a mammalian cell line. The purity of recombinant human C2 (rhC2) was greater than 95% and it was characterized for stability and activity. It was sensitive to C1s cleavage and restored classical complement pathway activity in C2-deficient serum both in a complement activation ELISA and a hemolytic assay. Furthermore, rhC2 could increase C3 fragment deposition on the human pathogen Streptococcus pneumoniae in C2-deficient serum to levels equal to those with normal serum.
Taken together these data suggest that recombinant human C2 can restore classical complement pathway activity and may serve as a potential therapeutic for recurring bacterial infections or SLE in C2-deficient patients.
A significant proportion of high-risk breast cancer families are not explained by mutations in known genes. Recent genome-wide searches (GWS) have not revealed any single major locus reminiscent of BRCA1 and BRCA2, indicating that still unidentified genes may explain relatively few families each or interact in a way obscure to linkage analyses. This has drawn attention to possible benefits of studying populations where genetic heterogeneity might be reduced. We thus performed a GWS for linkage on nine Icelandic multiple-case non-BRCA1/2 families of desirable size for mapping highly penetrant loci. To follow up suggestive loci, an additional 13 families from other Nordic countries were genotyped for selected markers.
GWS was performed using 811 microsatellite markers providing about five centiMorgan (cM) resolution. Multipoint logarithm of odds (LOD) scores were calculated using parametric and nonparametric methods. For selected markers and cases, tumour tissue was compared to normal tissue to look for allelic loss indicative of a tumour suppressor gene.
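To make the LOD and heterogeneity LOD quantities concrete, the sketch below uses a deliberately simplified setting: a two-point LOD score for phase-known meioses, and the standard admixture formulation of the HLOD maximized over the proportion of linked families by grid search. The study itself computed multipoint scores with parametric and nonparametric methods, which this sketch does not reproduce; the function names and the example numbers are hypothetical.

```python
import math

def two_point_lod(recombinants, meioses, theta):
    """LOD score for phase-known meioses at recombination fraction theta.

    LOD(theta) = log10( L(theta) / L(0.5) ),
    with L(theta) = theta^k * (1 - theta)^(n - k).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    k, n = recombinants, meioses
    linked = theta ** k * (1.0 - theta) ** (n - k)
    unlinked = 0.5 ** n
    if linked == 0.0:
        return float("-inf")  # observed recombinants are impossible at theta = 0
    return math.log10(linked / unlinked)

def hlod(family_lods, steps=100):
    """Heterogeneity LOD under the admixture model, maximized over alpha.

    HLOD(alpha) = sum_i log10( alpha * 10^LOD_i + (1 - alpha) ),
    where alpha is the proportion of linked families. Expects finite LODs.
    """
    best_value, best_alpha = 0.0, 0.0  # alpha = 0 always gives HLOD = 0
    for i in range(steps + 1):
        alpha = i / steps
        value = sum(math.log10(alpha * 10.0 ** lod + (1.0 - alpha))
                    for lod in family_lods)
        if value > best_value:
            best_value, best_alpha = value, alpha
    return best_value, best_alpha

# Hypothetical example: 1 recombinant in 10 informative meioses, theta = 0.1.
print(f"LOD = {two_point_lod(1, 10, 0.1):.3f}")

# Hypothetical per-family LODs: one family supports linkage, two argue against it.
value, alpha = hlod([2.0, -1.0, -1.0])
print(f"HLOD = {value:.3f} at alpha = {alpha:.2f}")
```

The second example shows why an HLOD can remain positive when the plain sum of family LODs is zero: the admixture model lets only a fraction of families be linked, which matches a situation where one family contributes most of the signal.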
The three highest signals were located at chromosomes 6q, 2p and 14q. One family contributed suggestive LOD scores (LOD 2.63 to 3.03, dominant model) at all these regions, without consistent evidence of a tumour suppressor gene. Haplotypes in nine affected family members mapped the loci to 2p23.2 to p21, 6q14.2 to q23.2 and 14q21.3 to q24.3. No evidence of a highly penetrant locus was found among the remaining families. The heterogeneity LOD (HLOD) at the 6q, 2p and 14q loci in all families was 3.27, 1.66 and 1.24, respectively. The subset of 13 Nordic families showed supportive HLODs at chromosome 6q (ranging from 0.34 to 1.37 by country subset). The 2p and 14q loci overlap with regions indicated by large families in previous GWS studies of breast cancer.
Chromosomes 2p, 6q and 14q are candidate sites for genes contributing together to high breast cancer risk. A polygenic model is supported, suggesting the joint effect of genes in contributing to breast cancer risk to be rather common in non-BRCA1/2 families. For genetic counselling it would seem important to resolve the mode of genetic interaction.
Breast cancer is a profoundly heterogeneous disease with respect to biologic and clinical behavior. Gene-expression profiling has been used to dissect this complexity and to stratify tumors into intrinsic gene-expression subtypes, associated with distinct biology, patient outcome, and genomic alterations. Additionally, breast tumors occurring in individuals with germline BRCA1 or BRCA2 mutations typically fall into distinct subtypes.
We applied global DNA copy number and gene-expression profiling in 359 breast tumors. All tumors were classified according to intrinsic gene-expression subtypes and included cases from genetically predisposed women. The Genomic Identification of Significant Targets in Cancer (GISTIC) algorithm was used to identify significant DNA copy-number aberrations and genomic subgroups of breast cancer.
We identified 31 genomic regions that were highly amplified in > 1% of the 359 breast tumors. Several amplicons were found to co-occur, the 8p12 and 11q13.3 regions being the most frequent combination besides amplicons on the same chromosomal arm. Unsupervised hierarchical clustering with 133 significant GISTIC regions revealed six genomic subtypes, termed 17q12, basal-complex, luminal-simple, luminal-complex, amplifier, and mixed subtypes. Four of them had striking similarity to intrinsic gene-expression subtypes and showed associations to conventional tumor biomarkers and clinical outcome. However, luminal A-classified tumors were distributed in two main genomic subtypes, luminal-simple and luminal-complex, the former group having a better prognosis, whereas the latter group included also luminal B and the majority of BRCA2-mutated tumors. The basal-complex subtype displayed extensive genomic homogeneity and harbored the majority of BRCA1-mutated tumors. The 17q12 subtype comprised mostly HER2-amplified and HER2-enriched subtype tumors and had the worst prognosis. The amplifier and mixed subtypes contained tumors from all gene-expression subtypes, the former being enriched for 8p12-amplified cases, whereas the mixed subtype included many tumors with predominantly DNA copy-number losses and poor prognosis.
Global DNA copy-number analysis integrated with gene-expression data can be used to dissect the complexity of breast cancer. This revealed six genomic subtypes with different clinical behavior and a striking concordance to the intrinsic subtypes. These genomic subtypes may prove useful for understanding the mechanisms of tumor development and for prognostic and treatment prediction purposes.
Five different molecular subtypes of breast cancer have been identified through gene expression profiling. Each subtype has a characteristic expression pattern suggested to partly depend on cellular origin. We aimed to investigate whether the molecular subtypes also display distinct methylation profiles.
We analysed methylation status of 807 cancer-related genes in 189 fresh frozen primary breast tumours and four normal breast tissue samples using an array-based methylation assay.
Unsupervised analysis revealed three groups of breast cancer with characteristic methylation patterns. The three groups were associated with the luminal A, luminal B and basal-like molecular subtypes of breast cancer, respectively, whereas cancers of the HER2-enriched and normal-like subtypes were distributed among the three groups. The methylation frequencies were significantly different between subtypes, with luminal B and basal-like tumours being most and least frequently methylated, respectively. Moreover, targets of the polycomb repressor complex in breast cancer and embryonic stem cells were more methylated in luminal B tumours than in other tumours. BRCA2-mutated tumours had a particularly high degree of methylation. Finally, by utilizing gene expression data, we observed that a large fraction of genes reported as having subtype-specific expression patterns might be regulated through methylation.
We have found that breast cancers of the basal-like, luminal A and luminal B molecular subtypes harbour specific methylation profiles. Our results suggest that methylation may play an important role in the development of breast cancers.
HER2 gene amplification and protein overexpression (HER2+) define a clinically challenging subgroup of breast cancer with variable prognosis and response to therapy. Although gene expression profiling has identified an ERBB2 molecular subtype of breast cancer, it is clear that HER2+ tumors reside in all molecular subtypes and represent a genomically and biologically heterogeneous group that needs to be further characterized in large sample sets.
Genome-wide DNA copy number profiling, using bacterial artificial chromosome (BAC) array comparative genomic hybridization (aCGH), and global gene expression profiling were performed on 200 and 87 HER2+ tumors, respectively. Genomic Identification of Significant Targets in Cancer (GISTIC) was used to identify significant copy number alterations (CNAs) in HER2+ tumors, which were related to a set of 554 non-HER2 amplified (HER2-) breast tumors. High-resolution oligonucleotide aCGH was used to delineate the 17q12-q21 region in high detail.
The HER2-amplicon was narrowed to an 85.92 kbp region including the TCAP, PNMT, PERLD1, HER2, C17orf37 and GRB7 genes, and higher HER2 copy numbers indicated worse prognosis. In 31% of HER2+ tumors the amplicon extended to TOP2A, defining a subgroup of HER2+ breast cancer associated with estrogen receptor-positive status and with a trend of better survival than HER2+ breast cancers with deleted (18%) or neutral TOP2A (51%). HER2+ tumors were clearly distinguished from HER2- tumors by the presence of recurrent high-level amplifications and firestorm patterns on chromosome 17q. While there was no significant difference between HER2+ and HER2- tumors regarding the incidence of other recurrent high-level amplifications, differences in the co-amplification pattern were observed, as shown by the almost mutually exclusive occurrence of 8p12, 11q13 and 20q13 amplification in HER2+ tumors. GISTIC analysis identified 117 significant CNAs across all autosomes. Supervised analyses revealed: (1) significant CNAs separating HER2+ tumors stratified by clinical variables, and (2) CNAs separating HER2+ from HER2- tumors.
We have performed a comprehensive survey of CNAs in HER2+ breast tumors, pinpointing significant genomic alterations including both known and potentially novel therapeutic targets. Our analysis sheds further light on the genomically complex and heterogeneous nature of HER2+ tumors in relation to other subgroups of breast cancer.
Results from studies using mice deficient in specific complement factors and clinical data on patients with an inherited deficiency of the classical complement pathway component C2 suggest that the classical pathway is vital for immunity to Streptococcus pneumoniae. However, the consequences of defects in classical pathway activity for opsonization with C3b and the phagocytosis of different S. pneumoniae serotypes in human serum are not known, and there has not been a systematic analysis of the abilities of sera from subjects with a C2 deficiency to opsonize S. pneumoniae. Hence, to investigate the role of the classical pathway in immunity to S. pneumoniae in more detail, flow cytometry assays of opsonization with C3b and the phagocytosis of three capsular serotypes of S. pneumoniae were performed using human sera depleted of the complement factor C1q or B or sera obtained from C2-deficient subjects. The results demonstrate that, in human serum, the classical pathway is vital for C3b-iC3b deposition onto cells of all three serotypes of S. pneumoniae and seems to be more important than the alternative pathway for phagocytosis. Compared to the results for sera from normal subjects, C3b-iC3b deposition and total anti-S. pneumoniae antibody activity levels in sera obtained from C2−/− subjects were reduced and the efficiency of phagocytosis of all three S. pneumoniae strains was impaired. Anticapsular antibody levels did not correlate with phagocytosis or C3b-iC3b deposition. These data confirm that the classical pathway is vital for complement-mediated phagocytosis of S. pneumoniae and demonstrate why subjects with a C2 deficiency have a marked increase in susceptibility to S. pneumoniae infections.
Today, no objective criteria exist to differentiate between individual primary tumors and intra- or intermammary dissemination in patients diagnosed with two or more synchronous breast cancers. Elucidating whether these tumors arise through clonal expansion or represent individual primary tumors is of tumor biological interest and may have clinical implications. In this respect, high-resolution genomic profiling may provide a more reliable approach than conventional histopathological and tumor biological factors.
32 K tiling microarray-based comparative genomic hybridization (aCGH) was used to explore the genomic similarities among synchronous unilateral and bilateral invasive breast cancer tumor pairs, and was compared with histopathological and tumor biological parameters.
Based on global copy number profiles and unsupervised hierarchical clustering, five of ten (p = 1.9 × 10⁻⁵) unilateral tumor pairs displayed similar genomic profiles within the pair, while only one of eight bilateral tumor pairs (p = 0.29) displayed pair-wise genomic similarities. DNA index, histological type and presence of vessel invasion correlated with the genomic analyses.
Synchronous unilateral tumor pairs are often genomically similar, while synchronous bilateral tumors most often represent individual primary tumors. However, two independent unilateral primary tumors can develop synchronously and contralateral tumor spread can occur. The presence of an intraductal component is not informative when establishing the independence of two tumors, while vessel invasion, which was found in clustering tumor pairs but not in tumor pairs that did not cluster together, supports the clustering outcome. Our data suggest that genomically similar unilateral tumor pairs may represent a more aggressive disease that requires the addition of more severe treatment modalities, and underscore the importance of evaluating the clonality of multiple tumors for optimal patient management. In summary, our findings demonstrate the importance of evaluating the properties of both tumors in order to determine optimal patient management.
Esophageal squamous cell carcinoma (ESCC) is a genetically complex tumor type and a major cause of cancer-related mortality. Although distinct genetic alterations have been linked to ESCC development and prognosis, these alterations have not gained clinical applicability. We applied array-based comparative genomic hybridization (aCGH) to obtain a whole-genome copy number profile relevant for identifying deranged pathways and clinically applicable markers.
A 32 k aCGH platform was used for high resolution mapping of copy number changes in 30 stage I-IV ESCC. Potential interdependent alterations and deranged pathways were identified and copy number changes were correlated to stage, differentiation and survival.
Copy number alterations affected a median of 19% of the genome and included recurrent gains of chromosome regions 5p, 7p, 7q, 8q, 10q, 11q, 12p, 14q, 16p, 17p, 19p, 19q, and 20q and losses of 3p, 5q, 8p, 9p and 11q. High-level amplifications were observed in 30 regions and recurrently involved 7p11 (EGFR), 11q13 (MYEOV, CCND1, FGF4, FGF3, PPFIA, FAD, TMEM16A, CTTS and SHANK2) and 11q22 (PDFG). Gain of 7p22.3 predicted nodal metastases and gains of 1p36.32 and 19p13.3 independently predicted poor survival in multivariate analysis.
aCGH profiling verified genetic complexity in ESCC and herein identified imbalances of multiple central tumorigenic pathways. Distinct gains correlate with clinicopathological variables and independently predict survival, suggesting clinical applicability of genomic profiling in ESCC.
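A summary figure such as the median percentage of the genome affected by copy number alterations can be derived from segmented aCGH data. The following is a minimal sketch under stated assumptions, not the procedure used in the study: each tumor is represented as a list of hypothetical `(start_bp, end_bp, log2_ratio)` segments, and the alteration threshold of 0.2 is an illustrative value, not one reported above.

```python
from statistics import median


def fraction_altered(segments, threshold=0.2):
    """Return the fraction of covered base pairs lying in altered segments.

    segments  : list of (start_bp, end_bp, log2_ratio) tuples (hypothetical layout)
    threshold : absolute log2 ratio at or above which a segment counts as
                gained or lost (illustrative value, an assumption)
    """
    total = sum(end - start for start, end, _ in segments)
    altered = sum(
        end - start
        for start, end, ratio in segments
        if abs(ratio) >= threshold
    )
    return altered / total


def median_fraction_altered(tumors, threshold=0.2):
    """Median of the per-tumor altered fractions across a cohort."""
    return median(fraction_altered(segs, threshold) for segs in tumors.values())


# Toy cohort with made-up segments, for illustration only.
toy_tumors = {
    "tumor_1": [(0, 100, 0.0), (100, 150, 0.5), (150, 200, -0.4)],
    "tumor_2": [(0, 200, 0.0)],
    "tumor_3": [(0, 50, 0.3), (50, 200, 0.0)],
}
print(median_fraction_altered(toy_tumors))
```

The result depends strongly on the chosen threshold and on the segmentation, so figures from different studies are comparable only when these settings match.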
Genome wide DNA alterations were evaluated by array CGH in addition to RNA expression profiling in colorectal cancer from patients with excellent and poor survival following primary operations.
DNA was used for CGH in BAC and cDNA arrays. Global RNA expression was determined by 44K arrays. DNA and RNA from tumor and normal colon tissue were obtained from cancer patients grouped according to death, survival or Dukes A, B, C and D tumor stage. Confirmed DNA alterations present in all Dukes stages A–D were judged relevant for carcinogenesis, while changes in Dukes C and D only were regarded as relevant for tumor progression.
Copy number gain was more common than loss in tumor tissue (p < 0.01). Major tumor DNA alterations occurred in chromosomes 8, 13, 18 and 20, where short survival included gain in 8q and loss in 8p. Copy number gains related to tumor progression were most common on chromosomes 7, 8, 19 and 20, while corresponding major losses appeared in chromosome 8. Losses at chromosome 18 occurred in all Dukes stages. Normal colon tissue from cancer patients displayed gains in chromosomes 19 and 20. Mathematical vector analysis implied a number of BAC clones in tumor DNA with genes of potential importance for death or survival.
The genomic variation in colorectal cancer cells is tremendous and emphasizes that BAC array CGH is presently more powerful than available statistical models to discriminate DNA sequence information related to outcome. Present results suggest that a majority of DNA alterations observed in colorectal cancer are secondary to tumor progression. Therefore, an immense effort would be required to distinguish primary from secondary DNA alterations behind colorectal cancer.
Colorectal cancer array CGH; Tumor DNA
High-resolution microarray-based comparative genomic hybridization (CGH) techniques have successfully been applied to study copy number imbalances in a number of settings such as the analysis of cancer genomes. For normalization of array-CGH data, methods initially developed for gene expression microarray analysis have, in general, been directly adopted and used. However, these methods are designed to work under assumptions that may not be valid for array-CGH data when copy number imbalances are present. We therefore sought to investigate the effect on normalization imposed by copy number imbalances.
Here we demonstrate that copy number imbalances correlate with intensity in array-CGH data thereby causing problems for conventional normalization methods. We propose a strategy to circumvent these problems by taking copy number imbalances into account during normalization, and we test the proposed strategy using several data sets from the analysis of cancer genomes. In addition, we show how the strategy can be applied to conveniently define adaptive sample-specific boundaries between balanced copy number, losses, and gains to facilitate management of variation in tissue heterogeneity when calling copy number changes.
We highlight the importance of considering copy number imbalances during normalization of array-CGH data, and show how failure to do so can deleteriously affect data and hamper interpretation.
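The idea of taking copy number imbalances into account during normalization, and of deriving sample-specific calling boundaries, can be illustrated with a deliberately simplified sketch. It is not the strategy proposed in the study. It assumes that the most populated log-ratio bin corresponds to balanced copy number, centers the data on the median of probes near that bin instead of on the global median, and sets gain and loss boundaries from the spread of those probes. All function names and the parameters `bin_width`, `tol` and `k` are hypothetical.

```python
from statistics import median


def balanced_center(log_ratios, bin_width=0.05, tol=0.15):
    """Estimate the log-ratio level of balanced probes.

    Assumption: the densest histogram bin contains balanced probes.
    Returns the median of all probes within `tol` of that bin, together
    with the list of those probes.
    """
    counts = {}
    for r in log_ratios:
        key = round(r / bin_width)
        counts[key] = counts.get(key, 0) + 1
    mode = max(counts, key=counts.get) * bin_width
    balanced = [r for r in log_ratios if abs(r - mode) <= tol]
    return median(balanced), balanced


def normalize_and_call(log_ratios, k=3.0):
    """Center on the balanced population and call gains and losses.

    The boundary is k robust standard deviations (scaled MAD) of the
    balanced probes, so it adapts to the noise level of each sample.
    """
    center, balanced = balanced_center(log_ratios)
    normalized = [r - center for r in log_ratios]
    mad = median([abs(r - center) for r in balanced])
    boundary = k * 1.4826 * mad
    calls = [
        "gain" if r > boundary else "loss" if r < -boundary else "balanced"
        for r in normalized
    ]
    return normalized, boundary, calls
```

Centering on the global median instead would shift the balanced probes away from zero whenever gains and losses are unevenly represented, which is the kind of distortion the abstract above warns about. The assumption that the largest population is balanced fails for highly aberrant genomes and would need to be checked per sample.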
Non-linearities in observed log-ratios of gene expressions, also known as intensity dependent log-ratios, can often be accounted for by global biases in the two channels being compared. Any step in a microarray process may introduce such offsets and in this article we study the biases introduced by the microarray scanner and the image analysis software.
By scanning the same spotted oligonucleotide microarray at different photomultiplier tube (PMT) gains, we have identified a channel-specific bias present in two-channel microarray data. For the scanners analyzed it was in the range of 15–25 (out of 65,535). The observed bias was very stable between subsequent scans of the same array although the PMT gain was greatly adjusted. This indicates that the bias does not originate from a step preceding the scanner detector parts. The bias varies slightly between arrays. When comparing estimates based on data from the same array, but from different scanners, we have found that different scanners introduce different amounts of bias. So do various image analysis methods. We propose a scanning protocol and a constrained affine model that allows us to identify and estimate the bias in each channel. Backward transformation removes the bias and brings the channels to the same scale. The result is that systematic effects such as intensity dependent log-ratios are removed, but also that signal densities become much more similar. The average scan, which has a larger dynamical range and greater signal-to-noise ratio than individual scans, can then be obtained.
The study shows that microarray scanners may introduce a significant bias in each channel. Such biases have to be calibrated for, otherwise systematic effects such as intensity dependent log-ratios will be observed. The proposed scanning protocol and calibration method are simple to use and are useful for evaluating scanner biases or for obtaining calibrated measurements with extended dynamical range and better precision. The cross-platform R package aroma, which implements all described methods, is available for free from .
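How a channel offset can be recovered from repeated scans of the same array can be shown with a small worked example. This is a minimal sketch, not the constrained affine model of the study. It assumes noise-free, unsaturated signals from a single channel that follow y_k = a + b_k · x, where a is an offset shared by all scans, b_k is the gain of scan k and x is the true signal. Regressing the high-gain scan on the low-gain scan then gives y_2 = α + β · y_1 with β = b_2 / b_1 and α = a · (1 − β), so that a = α / (1 − β). The function names and the use of ordinary least squares are assumptions made for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_offset(scan_low, scan_high):
    """Estimate the shared offset a and the gain ratio beta from two scans.

    Requires scans taken at different PMT gains (beta != 1).
    """
    alpha, beta = fit_line(scan_low, scan_high)
    return alpha / (1.0 - beta), beta


def calibrate(scan_low, scan_high):
    """Remove the offset, rescale to the low-gain scan and average the scans."""
    offset, beta = estimate_offset(scan_low, scan_high)
    return [
        ((low - offset) + (high - offset) / beta) / 2.0
        for low, high in zip(scan_low, scan_high)
    ]


# Synthetic example: true signals, an offset of 20 and gains of 1 and 3.
true_signal = [10, 50, 200, 1000, 5000]
scan_low = [20 + 1 * s for s in true_signal]
scan_high = [20 + 3 * s for s in true_signal]
offset, beta = estimate_offset(scan_low, scan_high)
print(offset, beta)  # approximately 20 and 3
```

With real scans, noise in both variables and saturation near the upper limit of 65,535 would bias an ordinary least squares fit, so a robust or errors-in-variables fit restricted to unsaturated spots would be more appropriate.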