ERG rearrangements (most commonly TMPRSS2:ERG [T2:ERG] gene fusions) have been identified in approximately 50% of prostate cancers (PCa). Quantification of T2:ERG in post-DRE urine, in combination with PCA3, improves the performance of serum PSA for PCa prediction on biopsy. Here we compared urine T2:ERG and PCA3 scores to ERG+ (determined by immunohistochemistry) and total prostate cancer burden in 41 mapped prostatectomies. Prostatectomies had a median of 3 tumor foci (range: 1–15) and 2.6 cm of summed linear tumor dimension (range: 0.6–7.1 cm). Urine T2:ERG score correlated most strongly with summed linear ERG+ tumor dimension and number of ERG+ foci (rs=0.68 and 0.67, respectively, both p<0.001). Urine PCA3 score showed weaker correlation with both number of tumor foci (rs=0.34, p=0.03) and summed linear tumor dimension (rs=0.26, p=0.10). In summary, we demonstrate a strong correlation between urine T2:ERG score and total ERG+ PCa burden at prostatectomy, consistent with high tumor specificity.
TMPRSS2:ERG; prostate cancer; PCA3; urine
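The rs values reported in the abstract above are Spearman rank correlations. As a minimal sketch of how such a coefficient is computed (ranks with ties averaged, then a Pearson correlation on the ranks), the following uses only the Python standard library. The function names are illustrative assumptions, and any numbers passed in would be hypothetical rather than the study's data.

```python
import math


def average_ranks(values):
    """Return 1-based ranks of `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, the coefficient reflects monotonic association and is insensitive to the scale of the urine score or of the tumor dimension.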
Fusions of androgen-regulated genes and v-ets erythroblastosis virus E26 oncogene homolog (avian) (ERG) occur in ~50% of prostate cancers, encoding a truncated ERG product. In prostatectomy specimens, ERG-rearrangements are >99% specific for prostate cancer or high grade prostatic intraepithelial neoplasia (HGPIN) adjacent to ERG-rearranged prostate cancer by fluorescence in situ hybridization (FISH) and immunohistochemistry (IHC).
To evaluate ERG staining by IHC on needle biopsies, including diagnostically challenging cases.
Biopsies from a retrospective cohort (n=111) enriched in cores requiring diagnostic IHC and a prospective cohort from all cases over 3 months (n=311) were stained with an anti-ERG antibody (clone EPR3864).
Amongst evaluable cores (n=418), ERG staining was confined to cancerous epithelium (71/160 cores, 44%), HGPIN (12/68 cores, 18%) and atypical foci (3/28 cores, 11%), with staining in only 2/162 (1%) cores diagnosed as benign. ERG was expressed in ~5 morphologically benign glands across 418 cores, and was uniformly expressed by all cancerous glands in 70/71 cores.
ERG staining is more prostate cancer-specific than alpha-methylacyl-CoA racemase (AMACR), and staining in an atypical focus supports a diagnosis of cancer if HGPIN can be excluded. Thus, ERG staining shows utility in diagnostically challenging biopsies and may be useful in molecularly subtyping prostate cancer and risk stratifying isolated HGPIN.
Quantitative targeted proteomics has recently taken front stage in the proteomics community. Centered on multiple reaction monitoring–mass spectrometry (MRM–MS) methodologies, quantitative targeted proteomics is being used in the verification of global proteomics data, the discovery of lower abundance proteins, protein post-translational modifications, discrimination of select highly homologous protein isoforms and as the final step in biomarker discovery. Although MRM–MS is an older methodology long used in small molecule analysis, the proteomics community is making great technological strides to develop it as the next method to address previously challenging issues in global proteomics experimentation, namely dynamic range, identification of post-translational modifications, and sensitivity and selectivity of measurement, advances that should further biomedical knowledge. This brief review will provide a general introduction to MRM–MS and highlight its novel application in targeted quantitative proteomic experiments.
absolute quantification; quantitative proteomics; mass spectrometry; multiple reaction monitoring; stable isotope dilution; targeted proteomics
Nuclear magnetic resonance-based measurements of small molecule mixtures continue to be confronted with the challenge of spectral assignment. While multidimensional experiments are capable of addressing this challenge, the imposed time constraint becomes prohibitive, particularly with the large sample sets commonly encountered in metabolomic studies. Thus, one-dimensional spectral assignment is routinely performed, guided by two-dimensional experiments on a selected sample subset; however, no publicly available graphical interface currently exists to aid this process. We have collected spectral information for 360 unique compounds from publicly available databases including chemical shift lists and authentic full resolution spectra, supplemented with spectral information for 25 compounds collected in-house at a proton NMR frequency of 900 MHz. This library serves as the basis for MetaboID, a Matlab-based user interface designed to aid in the one-dimensional spectral assignment process. The tools of MetaboID were built to guide resonance assignment in order of increasing confidence, starting from cursory compound searches based on chemical shift positions to analysis of authentic spike experiments. Together, these tools streamline the often repetitive task of spectral assignment. The overarching goal of the integrated toolbox of MetaboID is to centralize the one-dimensional spectral assignment process, from providing access to large chemical shift libraries to providing a straightforward, intuitive means of spectral comparison. Such a toolbox is expected to be attractive to both experienced and new metabolomic researchers as well as general complex mixture analysts.
NMR; metabolomics; assignment; mixture analysis
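The first, lowest-confidence step described above, a cursory compound search based on chemical shift position, can be sketched as a tolerance lookup against a shift library. This is an illustrative Python sketch and not MetaboID's Matlab code: the function name, the 0.02 ppm default tolerance, the library layout, and the compound names and shift values in the usage note are all hypothetical.

```python
def search_by_shift(library, observed_ppm, tolerance=0.02):
    """Return compound names with a library shift within `tolerance` ppm
    of the observed peak, closest match first.

    `library` maps a compound name to a list of chemical shifts in ppm
    (a hypothetical layout chosen for this sketch).
    """
    hits = []
    for name, shifts in library.items():
        deltas = [abs(s - observed_ppm) for s in shifts]
        within = [d for d in deltas if d <= tolerance]
        if within:
            hits.append((min(within), name))
    hits.sort()  # smallest deviation first, name as tie-breaker
    return [name for _, name in hits]
```

For instance, with a hypothetical library `{"compound_A": [1.33, 4.11], "compound_B": [1.48, 3.78], "compound_C": [1.32, 2.05]}`, a peak observed at 1.33 ppm returns compound_A and then compound_C as candidates. Such candidates would still need confirmation by the higher-confidence steps the abstract mentions, such as comparison with authentic spectra or spike experiments.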
Poly(ADP-ribose) polymerase-1 (PARP-1) is an abundant nuclear enzyme that modifies substrates by poly(ADP-ribose)-ylation. PARP-1 has well-described functions in DNA damage repair, and also functions as a context-specific regulator of transcription factors. Using multiple models, data demonstrate that PARP-1 elicits pro-tumorigenic effects in androgen receptor (AR)-positive prostate cancer (PCa) cells, both in the presence and absence of genotoxic insult. Mechanistically, PARP-1 is recruited to sites of AR function, therein promoting AR occupancy and AR function. It was further confirmed in genetically-defined systems that PARP-1 supports AR transcriptional function, and that in models of advanced PCa, PARP-1 enzymatic activity is enhanced, further linking PARP-1 to AR activity and disease progression. In vivo analyses demonstrate that PARP-1 activity is required for AR function in xenograft tumors, as well as tumor cell growth in vivo and generation and maintenance of castration-resistance. Finally, in a novel explant system of primary human tumors, targeting PARP-1 potently suppresses tumor cell proliferation. Collectively, these studies identify novel functions of PARP-1 in promoting disease progression, and ultimately suggest that the dual functions of PARP-1 can be targeted in human PCa to suppress tumor growth and progression to castration-resistance.
prostate cancer; androgen receptor; PARP-1; PARP inhibitor; DNA damage
A better understanding of molecular pathways involved in malignant transformation of head and neck squamous cell carcinoma (HNSCC) is essential for the development of novel and efficient anti-cancer drugs. To delineate the global metabolism of HNSCC, we report 1H NMR-based metabolic profiling of HNSCC cells from five different patients that were derived from various sites of the upper aerodigestive tract, including the floor of mouth, tongue and larynx. Primary cultures of normal human oral keratinocytes (NHOK) from three different donors were used for comparison. 1H NMR spectra of polar and non-polar extracts of cells were used to identify more than thirty-five metabolites. Principal component analysis performed on the NMR data revealed a clear classification of NHOK and HNSCC cells. HNSCC cells exhibited significantly altered levels of various metabolites that clearly revealed dysregulation in multiple metabolic events, including the Warburg effect, oxidative phosphorylation, energy metabolism, TCA cycle anaplerotic flux, glutaminolysis, the hexosamine pathway, and osmo-regulatory and anti-oxidant mechanisms. In addition, significant alterations in the ratios of phosphatidylcholine/lysophosphatidylcholine and phosphocholine/glycerophosphocholine, and elevated arachidonic acid observed in HNSCC cells reveal an altered membrane choline phospholipid metabolism (MCPM). Furthermore, significantly increased activity of phospholipase A2 (PLA2), particularly cytosolic PLA2 (cPLA2), observed in all the HNSCC cells confirms an altered MCPM. In summary, the metabolomic findings presented here can be useful to further elucidate the biological aspects that lead to HNSCC, and also provide a rational basis for monitoring molecular mechanisms in response to chemotherapy. Moreover, cPLA2 may serve as a potential therapeutic target for anti-cancer therapy of HNSCC.
Head and Neck Squamous Cell Carcinoma; NMR spectroscopy; Metabolites; Lipids; Metabolomics; phospholipase A2
Since the introduction of serum prostate specific antigen (PSA) screening twenty-five years ago, prostate cancer diagnosis and management have been guided by this biomarker. Yet, PSA has proven controversial as a diagnostic assay due to its limitations. The next wave of prostate cancer biomarkers has emerged, introducing new assays in serum and urine that may supplement or, in time, replace PSA due to higher cancer specificity. This expanding universe of biomarkers has been facilitated, in large part, by new genomic technologies that have enabled an unbiased look at cancer biology. Such efforts have produced several notable success stories, moving biomarkers from the bench to the clinic rapidly. However, biomarker research has centered on disease diagnostics, rather than prognosis and prediction, which could work toward disease prevention—an important focus moving forward. We review the current state of prostate cancer biomarker research, including the PSA revolution, its impact on early prostate cancer detection, the recent advances in biomarker discovery, and the future efforts that promise to improve clinical management of this disease.
Donor T cells that respond to host alloantigens following allogeneic bone marrow transplantation (BMT) induce graft-versus-host (GVH) responses, but their molecular landscape is not well understood. MicroRNAs (miRNAs) regulate gene (mRNA) expression and fine-tune the molecular responses of T cells. We stimulated naive T cells with either allogeneic or nonspecific stimuli and used argonaute cross-linked immunoprecipitation (CLIP) with subsequent ChIP microarray analyses to profile miR responses and their direct mRNA targets. We identified a unique expression pattern of miRs and mRNAs following the allostimulation of T cells and a high correlation between the expression of the identified miRs and a reduction of their mRNA targets. miRs and mRNAs that were predicted to be differentially regulated in allogeneic T cells compared with nonspecifically stimulated T cells were validated in vitro. These analyses identified wings apart-like homolog (Wapal) and synaptojanin 1 (Synj1) as potential regulators of allogeneic T cell responses. The expression of these molecular targets in vivo was confirmed in MHC-mismatched experimental BMT. Targeted silencing of either Wapal or Synj1 prevented the development of GVH response, confirming a role for these regulators in allogeneic T cell responses. Thus, this genome-wide analysis of miRNA-mRNA interactions identifies previously unrecognized molecular regulators of T cell responses.
Apoptosis is a fundamental biologic process by which metazoan cells orchestrate their own self-demise. Genetic analyses of the nematode C elegans identified three core components of the suicide apparatus which include CED-3, CED-4, and CED-9. An analogous set of core constituents exists in mammalian cells and includes caspase-9, Apaf-1, and bcl-2/xl, respectively. CED-3 and CED-4, along with their mammalian counterparts, function to kill cells, whereas CED-9 and its mammalian equivalents protect cells from death. These central components biochemically intermingle in a ternary complex recently dubbed the “apoptosome.” The C elegans protein EGL-1 and its mammalian counterparts, pro-apoptotic members of the bcl-2 family, induce cell death by disrupting apoptosome interactions. Thus, EGL-1 may represent a primordial signal integrator for the apoptosome. Various biochemical processes including oligomerization, adenosine triphosphate (ATP)/dATP binding, and cytochrome c interaction play a role in regulating the ternary death complex. Recent studies suggest that cell death receptors, such as CD95, may amplify their suicide signal by activating the apoptosome. These mutual associations by core components of the suicide apparatus provide a molecular framework in which diverse death signals likely interface. Understanding the apoptosome and its cellular connections will facilitate the design of novel therapeutic strategies for cancer and other disease states in which apoptosis plays a pivotal role.
apoptosis; apoptosome; cell death; death receptor
ETS gene fusions, which result in overexpression of an ETS transcription factor, are considered driving mutations in approximately half of all prostate cancers. Dysregulation of ETS transcription factors is also known to exist in Ewing's sarcoma, breast cancer, and acute lymphoblastic leukemia. We previously discovered that ERG, the predominant ETS family member in prostate cancer, interacts with the DNA damage response protein poly (ADP-ribose) polymerase 1 (PARP1) in human prostate cancer specimens. Therefore, we hypothesized that the ERG-PARP1 interaction may confer radiation resistance by increasing DNA repair efficiency and that this radio-resistance could be reversed through PARP1 inhibition. Using lentiviral approaches, we established isogenic models of ERG overexpression in PC3 and DU145 prostate cancer cell lines. In both cell lines, ERG overexpression increased clonogenic survival following radiation by 1.25 (±0.07) fold (mean ± SEM) and also resulted in increased PARP1 activity. PARP1 inhibition with olaparib preferentially radiosensitized ERG-positive cells by a factor of 1.52 (±0.03) relative to ERG-negative cells (P < .05). Neutral and alkaline COMET assays and immunofluorescence microscopy assessing γ-H2AX foci showed increased short- and long-term efficiencies of DNA repair, respectively, following radiation that was preferentially reversed by PARP1 inhibition. These findings were verified in an in vivo xenograft model. Our findings demonstrate that ERG overexpression confers radiation resistance through increased efficiency of DNA repair following radiation that can be reversed through inhibition of PARP1. These results motivate the use of PARP1 inhibitors as radiosensitizers in patients with localized ETS fusion-positive cancers.
E26 transformation-specific (ETS) transcription factors are known to be involved in gene aberrations in various malignancies including prostate cancer; however, their role in melanoma oncogenesis has yet to be fully explored. We have completed a comprehensive fluorescence in situ hybridization (FISH)-based screen for all 27 members of the ETS transcription factor family on two melanoma tissue microarrays, representing 223 melanomas, 10 nevi, and 5 normal skin tissues. None of the melanoma cases demonstrated ETS fusions; however, 6 of 114 (5.3%) melanomas were amplified for ETV1 using a break-apart FISH probe. For the six positive cases, locus-controlled FISH probes revealed that two of six cases were amplified for the ETV1 region, whereas four cases showed copy gains of the entire chromosome 7. The remaining 26 ETS family members showed no chromosomal aberrations by FISH. Quantitative polymerase chain reaction showed an average 3.4-fold (P value = .00218) increased expression of ETV1 in melanomas, including the FISH ETV1-amplified cases, when compared to other malignancies (prostate, breast, and bladder carcinomas). These data suggest that a subset of melanomas overexpresses ETV1 and amplification of ETV1 may be one mechanism for achieving high gene expression.
Using a series of detailed experiments, Zhang et al establish that the prostate cancer RNA chimera SLC45A3-ELK4 is generated by cis-splicing between the two adjacent genes and does not involve DNA rearrangements or trans-splicing. The chimera expression is induced by androgen treatment likely by overcoming the read-through block imposed by the intergenic CCCTC-insulators bound by CTCF repressor protein. The chimeric transcript, but not wild type ELK4, is shown to augment prostate cancer cell proliferation.
Pseudogene transcripts can provide a novel tier of gene regulation through generation of endogenous siRNAs or miRNA-binding sites. Characterization of pseudogene expression, however, has remained confined to anecdotal observations due to analytical challenges posed by the extremely close sequence similarity with their counterpart coding genes. Here, we describe a systematic analysis of pseudogene “transcription” from an RNA-Seq resource of 293 samples, representing 13 cancer and normal tissue types, and observe a surprisingly prevalent, genome-wide expression of pseudogenes that could be categorized as ubiquitously expressed or lineage and/or cancer specific. Further, we explore disease subtype specificity and functions of selected expressed pseudogenes. Taken together, we provide evidence that transcribed pseudogenes are a significant contributor to the transcriptional landscape of cells and are positioned to play significant roles in cellular differentiation and cancer progression, especially in light of the recently described ceRNA networks. Our work provides a transcriptome resource that enables high-throughput analyses of pseudogene expression.
In an effort to address the variable correspondence problem across large sample cohorts common in metabolomic/metabonomic studies, we have developed a pre-alignment protocol that aims to generate spectral segments sharing a common target spectrum. Under the assumption that a single reference spectrum will not correctly represent all spectra of a data set, the goal of this approach is to perform local alignment corrections on spectral regions which share a common ‘most similar’ spectrum. A natural beneficial outcome of this procedure is the automatic definition of spectral segments, a feature that is not common to all alignment methods. This protocol is shown to specifically improve the quality of alignment in 1H NMR data sets exhibiting large inter-sample compositional variation (e.g. pH, ionic strength). As a proof-of-principle demonstration, we have utilized two recently developed alignment algorithms specific to NMR data, recursive segment-wise peak alignment and interval correlated shifting and applied them to two data sets comprised of 15 aqueous cell line extract and 20 human urine 1H NMR profiles. Application of this protocol represents a fundamental shift from current alignment methodologies that seek to correct misalignments utilizing a single representative spectrum, with the added benefit that it can be appended to any alignment algorithm.
Metabolomic; alignment; NMR; urine
In the past decade, biomarker discovery has become ubiquitous in cancer research. However, despite this interest in biomarker research, few newly-characterized biomarkers have emerged as clinically-used entities. Here, we review the current state of biomarker research in cancer and identify challenges that stall many biomarker discovery efforts. We outline a model for systematic biomarker discovery, exemplified by recent efforts in prostate cancer, in which bioinformatics plays a central role in identifying promising new candidate biomarkers. Finally, we review the role of the National Cancer Institute’s Early Detection Research Network (EDRN) in biomarker studies and the importance of EDRN-led efforts to establish a research standard for more effective biomarker discovery efforts.
biomarker; prostate cancer; bioinformatics; early detection
BACKGROUND & AIMS
Polymorphisms that reduce the function of nucleotide-binding oligomerization domain (NOD)2, a bacterial sensor, have been associated with Crohn’s disease (CD). No proteins that regulate NOD2 activity have been identified as selective pharmacologic targets. We sought to discover regulators of NOD2 that might be pharmacologic targets for CD therapies.
Carbamoyl phosphate synthetase/aspartate transcarbamylase/dihydroorotase (CAD) is an enzyme required for de novo pyrimidine nucleotide synthesis; it was identified as a NOD2-interacting protein by immunoprecipitation-coupled mass spectrometry. CAD expression was assessed in colon tissues from individuals with and without inflammatory bowel disease by immunohistochemistry. The interaction between CAD and NOD2 was assessed in human HCT116 intestinal epithelial cells by immunoprecipitation, immunoblot, reporter gene, and gentamicin protection assays. We also analyzed human cell lines that express variants of NOD2 and the effects of RNA interference, overexpression and CAD inhibitors.
CAD was identified as a NOD2-interacting protein expressed at increased levels in the intestinal epithelium of patients with CD compared with controls. Overexpression of CAD inhibited NOD2-dependent activation of nuclear factor κB and p38 mitogen-activated protein kinase, as well as intracellular killing of Salmonella. Reduction of CAD expression or administration of CAD inhibitors increased NOD2-dependent signaling and antibacterial functions of NOD2 variants that are and are not associated with CD.
The nucleotide synthesis enzyme CAD is a negative regulator of NOD2. The antibacterial function of NOD2 variants that have been associated with CD increased in response to pharmacologic inhibition of CAD. CAD is a potential therapeutic target for CD.
NLR; Innate Immunity; IBD; PALA
We explore the utility of p-value weighting for enhancing the power to detect differential metabolites in a two-sample setting. Related gene expression information is used to assign an a priori importance level to each metabolite being tested. We map the gene expression to a metabolite through pathways and then gene expression information is summarized per-pathway using gene set enrichment tests. Through simulation we explore four styles of enrichment tests and four weight functions to convert the gene information into a meaningful p-value weight. We implement the p-value weighting on a prostate cancer metabolomics dataset. Gene expression on matched samples is used to construct the weights. Under certain regulatory conditions, the use of weighted p-values does not inflate the type I error above what we see for the unweighted tests except in high correlation situations. The power to detect differential metabolites is notably increased in situations with disjoint pathways and shows moderate improvement, relative to the proportion of enriched pathways, when pathway membership overlaps.
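To make the idea of p-value weighting concrete, the sketch below shows one common form, a weighted Bonferroni rule in which hypothesis i is rejected when p_i <= alpha * w_i / m and the weights are rescaled to average 1. This is an assumption chosen for illustration: the abstract does not state which multiple-testing procedure or which of the four weight functions was used, and the function names and example numbers are hypothetical.

```python
def normalize_weights(raw_weights):
    """Rescale non-negative raw weights so that they average to 1.

    Keeping the mean weight at 1 is what preserves the overall error
    budget of the weighted Bonferroni rule used below.
    """
    total = sum(raw_weights)
    if total <= 0 or any(w < 0 for w in raw_weights):
        raise ValueError("weights must be non-negative with a positive sum")
    m = len(raw_weights)
    return [w * m / total for w in raw_weights]


def weighted_bonferroni(pvalues, raw_weights, alpha=0.05):
    """Return one reject/accept decision per test.

    Test i is rejected when p_i <= alpha * w_i / m, where w_i is the
    normalized weight and m is the number of tests.
    """
    if len(pvalues) != len(raw_weights):
        raise ValueError("need exactly one weight per p-value")
    weights = normalize_weights(raw_weights)
    m = len(pvalues)
    return [p <= alpha * w / m for p, w in zip(pvalues, weights)]
```

With four hypothetical metabolites, p-values `[0.02, 0.03, 0.5, 0.2]` and raw weights `[3, 1, 0.5, 0.5]`, the first test is rejected because its threshold rises from 0.0125 to 0.03, whereas equal weights reject nothing. The gain for up-weighted tests is paid for by stricter thresholds on the down-weighted ones.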
External beam radiation therapy is often used in an attempt to cure localized prostate cancer (PCa), but is only palliative against disseminated disease. Raf Kinase Inhibitory Protein (RKIP) is a metastasis suppressor whose expression is reduced in approximately 50% of localized PCa tissues and is absent in metastases. Chemotherapeutic agents have been shown to induce tumor apoptosis through induction of RKIP expression. Our goal was to test if radiation therapy similarly induces apoptosis through induction of RKIP expression.
The C4-2B PCa cell line was engineered to over express or under express RKIP. The engineered cells were tested for apoptosis in cell culture and tumor regression in mice following radiation treatment.
Radiation induced both RKIP expression and apoptosis of PCa cells. Over expression of RKIP sensitized PCa cells to radiation-induced apoptosis, whereas short-hairpin targeting of RKIP, so that radiation could not induce RKIP expression, protected cells from radiation-induced apoptosis. In a murine model, knockdown of RKIP in PCa cells diminished radiation-induced apoptosis. Molecular concept mapping of genes altered upon manipulation of RKIP expression revealed an inverse correlation with the concept of genes altered by irradiation.
The data presented here indicate that the loss of RKIP, as seen in primary PCa tumors and metastases, confers protection against radiation-induced apoptosis. Therefore, it is conceivable that loss of RKIP confers a growth advantage upon PCa cells at distant sites since loss of RKIP would decrease apoptosis, favoring proliferation.
RKIP; ionizing radiation; apoptosis; prostate cancer; radioresistance
Despite significant advancement in alignment algorithms, the exponential growth of nucleotide sequencing throughput threatens to outpace bioinformatic analysis. Computation may become the bottleneck of genome analysis if growing alignment costs are not mitigated by further improvement in algorithms. Much gain has been gleaned from indexing and compressing alignment databases, but many widely used alignment tools process input reads sequentially and are oblivious to any underlying redundancy in the reads themselves.
Here we present Oculus, a software package that attaches to standard aligners and exploits read redundancy by performing streaming compression, alignment, and decompression of input sequences. This nearly lossless process (> 99.9%) led to alignment speedups of up to 270% across a variety of data sets, while requiring a modest amount of memory. We expect that streaming read compressors such as Oculus could become a standard addition to existing RNA-Seq and ChIP-Seq alignment pipelines, and potentially other applications in the future as throughput increases.
Oculus efficiently condenses redundant input reads and wraps existing aligners to provide nearly identical SAM output in a fraction of the aligner runtime. It includes a number of useful features, such as tunable performance and fidelity options, compatibility with FASTA or FASTQ files, and adherence to the SAM format. The platform-independent C++ source code is freely available online, at http://code.google.com/p/oculus-bio.
DNA nucleotide sequence alignment streaming identity redundancy compression software algorithm
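The core idea in the abstract above, exploiting redundancy among input reads so that the aligner sees each distinct sequence only once, can be sketched as follows. This is a simplified illustration and not Oculus's implementation: it collapses exact duplicates only, works on an in-memory batch rather than a stream, and the function names and the toy substring-search "aligner" are hypothetical stand-ins for a real aligner that would emit SAM records.

```python
def align_with_dedup(reads, align_unique):
    """Align reads while sending each distinct sequence to the aligner once.

    `align_unique` is any callable that takes a list of distinct reads and
    returns one result per read, in the same order.
    """
    index_of = {}  # sequence -> position in `unique`
    unique = []    # distinct sequences, in first-seen order ("compression")
    slots = []     # for every input read, the index of its unique sequence
    for read in reads:
        if read not in index_of:
            index_of[read] = len(unique)
            unique.append(read)
        slots.append(index_of[read])
    results = align_unique(unique)      # the aligner runs on unique reads only
    return [results[i] for i in slots]  # "decompression" back to input order


def toy_aligner(reference):
    """Hypothetical stand-in aligner: first exact match position, or -1."""
    def align(unique_reads):
        return [reference.find(read) for read in unique_reads]
    return align
```

For example, `align_with_dedup(["ACGT", "TTGA", "ACGT", "GGGG"], toy_aligner("ACGTACGTTTGA"))` returns `[0, 8, 0, -1]` while the aligner processes only three sequences. The saving grows with the fraction of repeated reads, which is why redundancy-heavy data sets would benefit most.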
High-resolution magic-angle spinning (HR-MAS) proton NMR spectroscopy was used to explore the metabolic signatures of head and neck squamous cell carcinoma (HNSCC), which included matched normal adjacent tissue (NAT) and tumor originating from tongue, lip, larynx and oral cavity, and associated lymph-node metastatic (LN-Met) tissues. A total of 43 tissues (18 NAT, 18 Tumor and 7 LN-Met) from twenty-two HNSCC patients were analyzed. Principal Component Analysis of NMR data showed a clear classification between NAT and tumor tissues; however, LN-Met tissues were classified among tumor. A partial least squares discriminant analysis model generated from NMR metabolic profiles was used to differentiate normal from tumor samples (Q2 > 0.80, Receiver Operator Characteristic area under the curve > 0.86, using 7-fold cross validation). HNSCC and LN-Met tissues showed elevated levels of lactate, amino acids including leucine, isoleucine, valine, alanine, glutamine, glutamate, aspartate, glycine, phenylalanine and tyrosine, choline containing compounds, creatine, taurine, glutathione and decreased levels of triglycerides. These elevated metabolites were associated with highly active glycolysis, increased amino acids influx (anaplerosis) into the TCA cycle, altered energy metabolism, membrane choline phospholipid metabolism, and oxidative and osmotic defense mechanisms. Moreover, decreased levels of triglycerides may indicate lipolysis followed by β-oxidation of fatty acids that may exist to deliver bioenergy for rapid tumor cell proliferation and growth.
HR-MAS NMR; Metabolites; Metabolomics; Head and Neck Squamous Cell Carcinoma; Lymph-node metastasis
Neuroendocrine prostate cancer (NEPC) is an aggressive subtype of prostate cancer that most commonly evolves from preexisting prostate adenocarcinoma (PCA). Using Next Generation RNA-sequencing and oligonucleotide arrays, we profiled 7 NEPC, 30 PCA, and 5 benign prostate tissue (BEN), and validated findings on tumors from a large cohort of patients (37 NEPC, 169 PCA, 22 BEN) using IHC and FISH. We discovered significant overexpression and gene amplification of AURKA and MYCN in 40% of NEPC and 5% of PCA, respectively, and evidence that they cooperate to induce a neuroendocrine phenotype in prostate cells. There was dramatic and enhanced sensitivity of NEPC (and MYCN overexpressing PCA) to Aurora kinase inhibitor therapy both in vitro and in vivo, with complete suppression of neuroendocrine marker expression following treatment. We propose that alterations in Aurora kinase A and N-myc are involved in the development of NEPC, and future clinical trials will help determine the efficacy of Aurora kinase inhibitor therapy.
neuroendocrine prostate cancer; aurora kinase A; n-myc; drug targets
An avalanche of next generation sequencing (NGS) studies has generated an unprecedented amount of genomic structural variation data. These studies have also identified many novel gene fusion candidates with more detailed resolution than previously achieved. However, in the excitement and necessity of publishing the observations from this recently developed cutting-edge technology, no community standardization approach has arisen to organize and represent the data with the essential attributes in an interchangeable manner. As transcriptome studies have been widely used for gene fusion discoveries, the current non-standard mode of data representation could potentially impede data accessibility, critical analyses, and further discoveries in the near future.
Here we propose a prototype, Gene Fusion Markup Language (GFML) as an initiative to provide a standard format for organizing and representing the significant features of gene fusion data. GFML will offer the advantage of representing the data in a machine-readable format to enable data exchange, automated analysis interpretation, and independent verification. As this database-independent exchange initiative evolves it will further facilitate the formation of related databases, repositories, and analysis tools. The GFML prototype is made available at
The Gene Fusion Markup Language (GFML) presented here could facilitate the development of a standard format for organizing, integrating and representing the significant features of gene fusion data in an inter-operable and query-able fashion that will enable biologically intuitive access to gene fusion findings and expedite functional characterization. A similar model is envisaged for other NGS data analyses.
Summary: Next generation sequencing (NGS) technologies have enabled de novo gene fusion discovery that could reveal candidates with therapeutic significance in cancer. Here we present an open-source software package, ChimeraScan, for the discovery of chimeric transcription between two independent transcripts in high-throughput transcriptome sequencing data.
Supplementary Information: Supplementary data are available at Bioinformatics online.