1.  A proteogenomics approach integrating proteomics and ribosome profiling increases the efficiency of protein identification and enables the discovery of alternative translation start sites 
Proteomics  2014;14(0):2688-2698.
Next-generation transcriptome sequencing is increasingly integrated with mass spectrometry to enhance MS-based protein and peptide identification. Recently, a breakthrough in transcriptome analysis was achieved with the development of ribosome profiling (ribo-seq). This technology is based on the deep sequencing of ribosome-protected mRNA fragments, thereby enabling the direct observation of in vivo protein synthesis at the transcript level. In order to explore the impact of a ribo-seq-derived protein sequence search space on MS/MS spectrum identification, we performed a comprehensive proteome study on a human cancer cell line, using both shotgun and N-terminal proteomics, next to ribosome profiling, which was used to delineate (alternative) translational reading-frames. By including protein-level evidence of sample-specific genetic variation and alternative translation, this strategy improved the identification score of 69 proteins and identified 22 new proteins in the shotgun experiment. Furthermore, we discovered 18 new alternative translation start sites in the N-terminal proteomics data and observed a correlation between the quantitative measures of ribo-seq and shotgun proteomics with a Pearson correlation coefficient ranging from 0.483 to 0.664. Overall, this study demonstrated the benefits of ribosome profiling for MS-based protein and peptide identification and we believe this approach could develop into a common practice for next-generation proteomics.
PMCID: PMC4391000  PMID: 25156699
proteogenomics; ribosome profiling; N-terminomics; bioinformatics; translation initiation
2.  Clinical Validation of an Epigenetic Assay to Predict Negative Histopathological Results in Repeat Prostate Biopsies 
The Journal of urology  2014;192(4):1081-1087.
The DOCUMENT multicenter trial in the United States validated the performance of an epigenetic test as an independent predictor of prostate cancer risk to guide decision making for repeat biopsy. Confirming an increased negative predictive value could help avoid unnecessary repeat biopsies.
Materials and Methods
We evaluated the archived, cancer negative prostate biopsy core tissue samples of 350 subjects from a total of 5 urological centers in the United States. All subjects underwent repeat biopsy within 24 months with a negative (controls) or positive (cases) histopathological result. Centralized blinded pathology evaluation of the 2 biopsy series was performed in all available subjects from each site. Biopsies were epigenetically profiled for GSTP1, APC and RASSF1 relative to the ACTB reference gene using quantitative methylation specific polymerase chain reaction. Predetermined analytical marker cutoffs were used to determine assay performance. Multivariate logistic regression was used to evaluate all risk factors.
The epigenetic assay resulted in a negative predictive value of 88% (95% CI 85–91). In multivariate models correcting for age, prostate specific antigen, digital rectal examination, first biopsy histopathological characteristics and race the test proved to be the most significant independent predictor of patient outcome (OR 2.69, 95% CI 1.60–4.51).
The DOCUMENT study validated that the epigenetic assay was a significant, independent predictor of prostate cancer detection in a repeat biopsy collected an average of 13 months after an initial negative result. Due to its 88% negative predictive value adding this epigenetic assay to other known risk factors may help decrease unnecessary repeat prostate biopsies.
PMCID: PMC4337855  PMID: 24747657
prostate; prostatic neoplasms; epigenomics; methylation; biopsy
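As a brief illustration of the headline metric in the abstract above, negative predictive value (NPV) is the share of test-negative subjects whose repeat biopsy is also negative. The sketch below is a minimal Python illustration only; the function name and the counts are hypothetical and are not taken from the study.

```python
def negative_predictive_value(true_negatives: int, false_negatives: int) -> float:
    """NPV = TN / (TN + FN): share of negative test results that are truly negative."""
    total_negative_calls = true_negatives + false_negatives
    if total_negative_calls == 0:
        raise ValueError("no negative test results to evaluate")
    return true_negatives / total_negative_calls


# Hypothetical counts, for illustration only (not the study's data).
npv = negative_predictive_value(true_negatives=88, false_negatives=12)
print(f"NPV = {npv:.0%}")
```

Because NPV depends on how common cancer is among the tested subjects, the same assay could yield a different NPV in a population with a different prevalence.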
3.  Spectrin Repeat Containing Nuclear Envelope 1 and Forkhead Box Protein E1 Are Promising Markers for the Detection of Colorectal Cancer in Blood 
Identifying biomarkers in body fluids may improve the noninvasive detection of colorectal cancer. Previously, we identified N-Myc downstream-regulated gene 4 (NDRG4) and GATA binding protein 5 (GATA5) methylation as promising biomarkers for colorectal cancer in stool DNA. Here, we examined the utility of NDRG4, GATA5, and two additional markers [Forkhead box protein E1 (FOXE1) and spectrin repeat containing nuclear envelope 1 (SYNE1)] promoter methylation as biomarkers in plasma DNA. Quantitative methylation-specific PCR was performed on plasma DNA from 220 patients with colorectal cancer and 684 noncancer controls, divided into a training set and a test set. Receiver operating characteristic analysis was performed to measure the area under the curve of GATA5, NDRG4, SYNE1, and FOXE1 methylation. Functional assays were performed in SYNE1 and FOXE1 stably transfected cell lines. The sensitivity of NDRG4, GATA5, FOXE1, and SYNE1 methylation in all stages of colorectal cancer (154 cases, 444 controls) was 27% [95% confidence interval (CI), 20%–34%], 18% (95% CI, 12%–24%), 46% (95% CI, 38%–54%), and 47% (95% CI, 39%–55%), with a specificity of 95% (95% CI, 93%–97%), 99% (95% CI, 98%–100%), 93% (95% CI, 91%–95%), and 96% (95% CI, 94%–98%), respectively. Combining SYNE1 and FOXE1 increased the sensitivity to 56% (95% CI, 48%–64%), while the specificity decreased to 90% (95% CI, 87%–93%) in the training set; in a test set (66 cases, 240 controls), the combination reached 58% sensitivity (95% CI, 46%–70%) and 91% specificity (95% CI, 80%–100%). SYNE1 overexpression showed no major differences in cell proliferation, migration, and invasion compared with controls. Overexpression of FOXE1 significantly decreased the number of colonies in SW480 and HCT116 cell lines. Overall, our data suggest that SYNE1 and FOXE1 are promising markers for colorectal cancer detection.
PMCID: PMC4316751  PMID: 25538088
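The abstract above reports sensitivities and specificities with 95% confidence intervals but does not say how the intervals were computed. As a hedged sketch, the normal-approximation (Wald) interval below is one common choice; the function name is hypothetical, and using this method here is an assumption rather than the authors' stated procedure. For the reported NDRG4 sensitivity (27% of 154 cases) it gives bounds that round to the published 20% and 34%.

```python
import math


def wald_interval(proportion: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation (Wald) confidence interval for a proportion, clipped to [0, 1]."""
    half_width = z * math.sqrt(proportion * (1.0 - proportion) / n)
    return max(0.0, proportion - half_width), min(1.0, proportion + half_width)


# Reported NDRG4 sensitivity: 27% among 154 colorectal cancer cases.
low, high = wald_interval(0.27, 154)
print(f"{low:.0%} to {high:.0%}")
```

The Wald interval is known to behave poorly for proportions close to 0 or 1, so for values such as the 99% specificity a Wilson or exact interval may be preferable.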
4.  Mining for viral fragments in methylation enriched sequencing data 
Most next generation sequencing experiments generate more data than is usable for the experimental set up. For example, methyl-CpG binding domain (MBD) affinity purification based sequencing is often used for DNA-methylation profiling, but up to 30% of the sequenced fragments cannot be mapped uniquely to the reference genome. Here we present and evaluate a methodology for the identification of viruses in these otherwise unused paired-end MBD-seq data. Viral detection is accomplished by mapping non-reference alignable reads to a comprehensive set of viral genomes. As viruses play an important role in epigenetics and cancer development, 92 (pre)malignant and benign samples, originating from two different collections of cervical samples and related cell lines, were used in this study. These samples include primary carcinomas (n = 22), low- and high-grade cervical intraepithelial neoplasia (CIN1 and CIN2/3 - n = 2/n = 30) and normal tissue (n = 20), as well as control samples (n = 17). Viruses that were detected include phages, adenoviruses, herpesviridae and HPV. HPV, which causes virtually all cervical cancers, was identified in 95% of the carcinomas, 100% of the CIN2/3 samples, both CIN1 samples and in 55% of the normal samples. Comparing the amount of mapped fragments on HPV for each HPV-infected sample yielded a significant difference between normal samples and carcinomas or CIN2/3 samples (adjusted p-values resp. <10⁻⁵, <10⁻⁵), reflecting different viral loads and/or methylation degrees in non-normal samples. Fragments originating from different HPV types could be distinguished and were independently validated by PCR-based assays in 71% of the detections. In conclusion, although limited by the a priori knowledge of viral reference genome sequences, the proposed methodology can provide a first confined but substantial insight into the presence, concentration and types of methylated viral sequences in MBD-seq data at low additional cost.
PMCID: PMC4316777  PMID: 25699076
viruses; epigenomics; DNA-methylation; next generation sequencing; bioinformatics; cervical cancer; human papillomavirus; MBD-seq
5.  On Cross-Sectional Associations of Leukocyte Telomere Length with Cardiac Systolic, Diastolic and Vascular Function: The Asklepios Study 
PLoS ONE  2014;9(12):e115071.
Systemic telomere length has been associated with measures of diastolic function, vascular stiffness and left ventricular mass mainly in smaller, patient-specific settings and not in a general population. In this study we describe the applicability of these findings in a large, representative population.
Methods and Results
Peripheral blood leukocyte telomere length (PBL TL) was measured using telomere restriction fragment analysis in the young to middle-aged (>2500 volunteers, ∼35 to 55 years old) Asklepios study population, free from overt cardiovascular disease. Subjects underwent extensive echocardiographic, hemodynamic and biochemical phenotyping. After adjusting for relevant confounders (age, sex, systolic blood pressure, heart rate, body mass index and use of antihypertensive drugs) we found no associations between PBL TL and left ventricular mass index (P = 0.943), ejection fraction (P = 0.933), peak systolic septal annular motion (P = 0.238), pulse wave velocity (P = 0.971) or pulse pressure (P = 0.999). In contrast, our data showed positive associations between PBL TL and parameters of LV filling: the transmitral flow early (E) to late (A) velocity ratio (E/A-ratio; P<0.001), the ratio of early (e′) to late (a′) mitral annular velocities (e′/a′-ratio; P = 0.012) and isovolumic relaxation time (P = 0.015). Interestingly, these associations were stronger in women than in men and were driven by associations between PBL TL and the late diastolic components (A and a′).
In a generally healthy, young to middle-aged population, PBL TL is not related to LV mass or systolic function, but might be associated with an altered LV filling pattern, especially in women.
PMCID: PMC4266659  PMID: 25506937
6.  PROTEOFORMER: deep proteome coverage through ribosome profiling and MS integration 
Nucleic Acids Research  2014;43(5):e29.
An increasing number of studies integrate mRNA sequencing data into MS-based proteomics to complement the translation product search space. However, several factors, including extensive regulation of mRNA translation and the need for three- or six-frame translation, impede the use of mRNA-seq data for the construction of a protein sequence search database. With that in mind, we developed the PROTEOFORMER tool that automatically processes data of the recently developed ribosome profiling method (sequencing of ribosome-protected mRNA fragments), resulting in genome-wide visualization of ribosome occupancy. Our tool also includes a translation initiation site calling algorithm allowing the delineation of the open reading frames (ORFs) of all translation products. A complete protein synthesis-based sequence database can thus be compiled for mass spectrometry-based identification. This approach increases the overall protein identification rates by 3% and 11% (improved and new identifications) for human and mouse, respectively, and enables proteome-wide detection of 5′-extended proteoforms, upstream ORF translation and near-cognate translation start sites. The PROTEOFORMER tool is available as a stand-alone pipeline and has been implemented in the Galaxy framework for ease of use.
PMCID: PMC4357689  PMID: 25510491
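The abstract above names six-frame translation as one obstacle to building a search database from mRNA-seq data. The sketch below is a minimal Python illustration of what six-frame translation involves: every nucleotide sequence yields six candidate protein sequences. It is not PROTEOFORMER code; all function names are hypothetical, and it assumes the standard genetic code.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built by enumerating codons in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons (e.g. containing N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[pos:pos + 3], "X")
        for pos in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict[str, str]:
    """Return the three forward-strand and three reverse-strand translations."""
    dna = dna.upper()
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


for frame, protein in six_frame_translation("ATGGCCTAA").items():
    print(frame, protein)
```

Since at most one of the six frames is normally the translated one, a database built this way is inflated with sequences that are never made. This is the problem that ribosome-profiling-based delineation of ORFs is meant to avoid.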
7.  Identification by array comparative genomic hybridization of a new amplicon on chromosome 17q highly recurrent in BRCA1 mutated triple negative breast cancer 
Triple Negative Breast Cancers (TNBC) represent about 12% to 20% of all breast cancers (BC) and have a worse outcome compared to other BC subtypes. TNBC often show a deficiency in DNA double-strand break repair mechanisms. This is generally related to the inactivation of a repair enzymatic complex involving BRCA1 caused either by genetic mutations, epigenetic modifications or by post-transcriptional regulations.
The identification of new molecular biomarkers that would allow the rapid identification of BC presenting a BRCA1 deficiency could be useful to select patients who could benefit from PARP inhibitors, alkylating agents or platinum-based chemotherapy.
Genomic DNA from 131 formalin-fixed paraffin-embedded (FFPE) tumors (luminal A and B, HER2+ and triple negative BC) with known BRCA1 mutation status or unscreened for BRCA1 mutation was analysed by array Comparative Genomic Hybridization (array CGH). One highly significant and recurrent gain in the 17q25.3 genomic region was analysed by fluorescent in situ hybridization (FISH). Expression of the genes of the 17q25.3 amplicon was studied using customized Taqman low density arrays and single Taqman assays (Applied Biosystems).
We identified by array CGH and confirmed by FISH a gain in the 17q25.3 genomic region in 90% of the BRCA1 mutated tumors. This chromosomal gain was present in only 28.6% of the BRCA1 non-mutated TNBC, 26.7% of the unscreened TNBC, 13.6% of the luminal B, 19.0% of the HER2+ and 0% of the luminal A breast cancers. The 17q25.3 gain was also detected in 50% of the TNBC with BRCA1 promoter methylation. Interestingly, BRCA1 promoter methylation was never detected in BRCA1 mutated BC. Gene expression analyses of the 17q25.3 sub-region showed a significant over-expression of 17 genes in BRCA1 mutated TNBC (n = 15) as compared to the BRCA1 non mutated TNBC (n = 13).
In this study, we have identified by array CGH and confirmed by FISH a recurrent gain in 17q25.3 significantly associated with BRCA1 mutated TNBC. Up-regulated genes in the 17q25.3 amplicon might represent potential therapeutic targets and warrant further investigation.
Electronic supplementary material
The online version of this article (doi:10.1186/s13058-014-0466-y) contains supplementary material, which is available to authorized users.
PMCID: PMC4303204  PMID: 25416589
8.  SNP-guided identification of monoallelic DNA-methylation events from enrichment-based sequencing data 
Nucleic Acids Research  2014;42(20):e157.
Monoallelic gene expression is typically initiated early in the development of an organism. Dysregulation of monoallelic gene expression has already been linked to several non-Mendelian inherited genetic disorders. In humans, DNA-methylation is deemed to be an important regulator of monoallelic gene expression, but only a few examples are known. One important reason is that current, cost-affordable truly genome-wide methods to assess DNA-methylation are based on sequencing post-enrichment. Here, we present a new methodology based on classical population genetic theory, i.e. the Hardy–Weinberg theorem, that combines methylomic data from MethylCap-seq with associated SNP profiles to identify monoallelically methylated loci. Applied to 334 MethylCap-seq samples of very diverse origin, this resulted in the identification of 80 genomic regions characterized by monoallelic DNA-methylation. Of these 80 loci, 49 are located in genic regions, of which 25 have already been linked to imprinting. Further analysis revealed statistically significant enrichment of these loci in promoter regions, further establishing the relevance and usefulness of the method. Additional validation was done using both 14 whole-genome bisulfite sequencing data sets and 16 mRNA-seq data sets. Importantly, the developed approach can be easily applied to other enrichment-based sequencing technologies, like the ChIP-seq-based identification of monoallelic histone modifications.
PMCID: PMC4227762  PMID: 25237057
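The abstract above builds on the Hardy–Weinberg theorem without spelling out the calculation. The sketch below is one plausible reading of the idea, stated as an assumption: if only one allele is methylated in each sample, enrichment-based reads show a single allele per sample, so a SNP shows fewer apparent heterozygotes than the p², 2pq, q² expectation. The function names and the deficit score are hypothetical and are not the authors' published statistic.

```python
def hardy_weinberg_expected(n_samples: int, freq_a: float) -> dict[str, float]:
    """Expected genotype counts (AA, AB, BB) under Hardy-Weinberg equilibrium."""
    freq_b = 1.0 - freq_a
    return {
        "AA": n_samples * freq_a ** 2,
        "AB": n_samples * 2 * freq_a * freq_b,
        "BB": n_samples * freq_b ** 2,
    }


def heterozygote_deficit(observed: dict[str, int]) -> float:
    """Fraction of expected heterozygotes that is missing (1.0 = none observed)."""
    n = sum(observed.values())
    # Allele frequency of A estimated from the apparent genotype counts.
    freq_a = (2 * observed["AA"] + observed["AB"]) / (2 * n)
    expected_ab = hardy_weinberg_expected(n, freq_a)["AB"]
    if expected_ab == 0:
        return 0.0
    return 1.0 - observed["AB"] / expected_ab


# Hypothetical apparent genotype counts at one SNP (illustration only).
print(heterozygote_deficit({"AA": 50, "AB": 0, "BB": 50}))
print(heterozygote_deficit({"AA": 25, "AB": 50, "BB": 25}))
```

A real analysis would also need a significance test and a correction for sequencing errors and low coverage, which this sketch leaves out.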
9.  Systemic Suppression of the Shoot Metabolism upon Rice Root Nematode Infection 
PLoS ONE  2014;9(9):e106858.
Hirschmanniella oryzae is the most common plant-parasitic nematode in flooded rice cultivation systems. These migratory animals penetrate the plant roots and feed on the root cells, creating large cavities, extensive root necrosis and rotting. The objective of this study was to investigate the systemic response of the rice plant upon root infection by this nematode. RNA sequencing was applied to the above-ground parts of the rice plants at 3 and 7 days post inoculation. The data revealed significant modifications in the primary metabolism of the plant shoot, with a general suppression of, for instance, chlorophyll biosynthesis, the brassinosteroid pathway, and amino acid production. In the secondary metabolism, we detected a repression of the isoprenoid and shikimate pathways. These molecular changes can have dramatic consequences for the growth and yield of the rice plants, and could potentially change their susceptibility to above-ground pathogens and pests.
PMCID: PMC4162577  PMID: 25216177
10.  Illumina sequencing of 15 deafness genes using fragmented amplicons 
BMC Research Notes  2014;7:509.
Resequencing of deafness related genes using GS FLX massive parallel sequencing of PCR amplicons spanning selected genes has previously been reported as a successful strategy to discover causal variants. The amplicon lengths were designed to be smaller than the sequencing read length of GS FLX technology, but are longer than Illumina sequencing technology read lengths. Fragmentation is thus required to sequence these amplicons using high throughput Illumina technology.
We performed Illumina sequencing in 4 patients on 563 multiplexed amplicons covering the exons of 15 genes involved in the hearing process. After exploring several fragmentation strategies, the amplicons were fragmented using Covaris sonication prior to library preparation. CLC genomic workbench was used to analyze the data.
We achieved excellent coverage, with more than 99% of the amplicon bases covered. All variants that were previously validated using Sanger sequencing were also called in this study. Variant calling revealed fewer false positive and false negative results compared to the previous study. For each patient, several variants were found that are reported by ClinVar as possible hearing loss variants.
Migration from GS FLX amplicon sequencing to Illumina amplicon sequencing is straightforward and leads to more accurate results.
Electronic supplementary material
The online version of this article (doi:10.1186/1756-0500-7-509) contains supplementary material, which is available to authorized users.
PMCID: PMC4266979  PMID: 25106482
11.  Bacterial Diversity Assessment in Antarctic Terrestrial and Aquatic Microbial Mats: A Comparison between Bidirectional Pyrosequencing and Cultivation 
PLoS ONE  2014;9(6):e97564.
The application of high-throughput sequencing of the 16S rRNA gene has increased the size of microbial diversity datasets by several orders of magnitude, providing improved access to the rare biosphere compared with cultivation-based approaches and more established cultivation-independent techniques. By contrast, cultivation-based approaches allow the retrieval of both common and uncommon bacteria that can grow in the conditions used and provide access to strains for biotechnological applications. We performed bidirectional pyrosequencing of the bacterial 16S rRNA gene diversity in two terrestrial and seven aquatic Antarctic microbial mat samples previously studied by heterotrophic cultivation. While, not unexpectedly, 77.5% of genera recovered by pyrosequencing were not among the isolates, 25.6% of the genera picked up by cultivation were not detected by pyrosequencing. To allow comparison between both techniques, we focused on the five phyla (Proteobacteria, Actinobacteria, Bacteroidetes, Firmicutes and Deinococcus-Thermus) recovered by heterotrophic cultivation. Four of these phyla were among the most abundantly recovered by pyrosequencing. Strikingly, there was relatively little overlap between cultivation and the forward and reverse pyrosequencing-based datasets at the genus (17.1–22.2%) and OTU (3.5–3.6%) level (defined on a 97% similarity cut-off level). Comparison of the V1–V2 and V3–V2 datasets of the 16S rRNA gene revealed remarkable differences in number of OTUs and genera recovered. The forward dataset missed 33% of the genera from the reverse dataset despite comprising 50% more OTUs, while the reverse dataset did not contain 40% of the genera of the forward dataset. Similar observations were evident when comparing the forward and reverse cultivation datasets. Our results indicate that the region under consideration can have a large impact on perceived diversity, and should be considered when comparing different datasets. 
Finally, a high number of OTUs could not be classified using the RDP reference database, suggesting the presence of a large amount of novel diversity.
PMCID: PMC4041716  PMID: 24887330
12.  Reduced Rate of Repeated Prostate Biopsies Observed in ConfirmMDx Clinical Utility Field Study 
American Health & Drug Benefits  2014;7(3):129-134.
The diagnosis of prostate cancer is dependent on histologic confirmation in biopsy core tissues. The biopsy procedure is invasive, puts the patient at risk for complications, and is subject to significant sampling errors. An epigenetic test that uses methylation-specific polymerase chain reaction to determine the epigenetic status of the prostate cancer–associated genes GSTP1, APC, and RASSF1 has been clinically validated and is used in clinical practice to increase the negative predictive value in men with no history of prostate cancer compared with standard histopathology. Such information can help to avoid unnecessary repeat biopsies. The repeat biopsy rate may provide preliminary clinical utility evidence in relation to this assay's potential impact on the number of unnecessary repeat prostate biopsies performed in US urology practices.
The purpose of this preliminary study was to quantify the number of repeat prostate biopsy procedures to demonstrate a low repeat biopsy rate for men with a history of negative histopathology who received a negative epigenetic assay result on testing of the residual prostate tissue.
In this recently completed field observation study, practicing urologists used the epigenetic test called ConfirmMDx for Prostate Cancer (MDxHealth, Inc, Irvine, CA) to evaluate cancer-negative men considered at risk for prostate cancer. This test has been previously validated in 2 blinded multicenter studies that showed the superior negative predictive value of the epigenetic test over standard histopathology for cancer detection in prostate biopsies. A total of 5 clinical urology practices that had ordered a minimum of 40 commercial epigenetic test requisitions for patients with previous, cancer-negative biopsies over the course of the previous 18 months were contacted to assess their interest to participate in the study. Select demographic and prostate-screening parameter information, as well as the incidence of repeat biopsy, specifically for patients with a negative test result, was collected and merged into 1 collective database. All men from each of the 5 sites who had negative assay results were included in the analysis.
A total of 138 patients were identified in these urology practices and were included in the analysis. The median age of the men was 63 years, and the current median serum prostate-specific antigen level was 4.7 ng/mL. Repeat biopsies had been performed in 6 of the 138 (4.3%) men with a negative epigenetic assay result, in whom no evidence of cancer was found on histopathology.
In this study, a low rate of repeat prostatic biopsies was observed in the group of men with previous histopathologically negative biopsies who were considered to be at risk for harboring cancer. The data suggest that patients managed using the ConfirmMDx for Prostate Cancer negative results had a low rate of repeat prostate biopsies. These results warrant a large, controlled, prospective study to further evaluate the clinical utility of the epigenetic test to lower the unnecessary repeat biopsy rate.
PMCID: PMC4070628  PMID: 24991397
13.  Emerging evidence for CHFR as a cancer biomarker: from tumor biology to precision medicine 
Cancer Metastasis Reviews  2013;33(1):161-171.
Novel insights into the biology of cancer have switched the paradigm of a “one-size-fits-all” cancer treatment to an individualized biology-driven treatment approach. In recent years, a diversity of biomarkers and targeted therapies has been discovered. Although these examples accentuate the promise of personalized cancer treatment, for most cancers and cancer subgroups no biomarkers and effective targeted therapy are available. The great majority of patients still receive unselected standard therapies with no use of their individual molecular characteristics. Better knowledge about the underlying tumor biology will lead the way toward personalized cancer treatment. In this review, we summarize the evidence for a promising cancer biomarker: checkpoint with forkhead and ring finger domains (CHFR). CHFR is a mitotic checkpoint and tumor suppressor gene, which is inactivated in a diverse group of solid malignancies, mostly by promoter CpG island methylation. CHFR inactivation has been shown to be an indicator of poor prognosis and sensitivity to taxane-based chemotherapy. Here we summarize the current knowledge of altered CHFR expression in cancer, the impact on tumor biology and implications for personalized cancer treatment.
PMCID: PMC3988518  PMID: 24375389
CHFR promoter methylation; Predictive biomarker; Taxane sensitivity
14.  Staphylococcal enterotoxin B influences the DNA methylation pattern in nasal polyp tissue: a preliminary study 
Staphylococcal enterotoxins may influence the pro-inflammatory pattern of chronic sinus diseases via epigenetic events. This work investigated the potential of staphylococcal enterotoxin B (SEB) to induce changes in the DNA methylation pattern. Nasal polyp tissue explants were cultured in the presence and absence of SEB; genomic DNA was then isolated and used for whole genome methylation analysis. Results showed that SEB stimulation altered the methylation pattern of gene regions when compared with non-stimulated tissue. Data enrichment analysis highlighted two genes, IKBKB and STAT-5B, both playing a crucial role in T-cell maturation/activation and immune response.
PMCID: PMC3867657  PMID: 24341752
Staphylococcus aureus enterotoxin B; Chronic rhinosinusitis and nasal polyps; DNA methylation; MBD2; Whole genome methylation analysis; Hypermethylation
15.  Combining in silico prediction and ribosome profiling in a genome-wide search for novel putatively coding sORFs 
BMC Genomics  2013;14:648.
It was long assumed that proteins are at least 100 amino acids (AAs) long. Moreover, the detection of short translation products (e.g. coded from small Open Reading Frames, sORFs) is very difficult, as the short length makes it hard to distinguish true coding ORFs from ORFs occurring by chance. Nevertheless, over the past few years many such non-canonical genes (with ORFs < 100 AAs) have been discovered in different organisms like Arabidopsis thaliana, Saccharomyces cerevisiae, and Drosophila melanogaster. Thanks to advances in sequencing, bioinformatics and computing power, it is now possible to scan the genome with unprecedented scrutiny, for example in search of this type of small ORF.
Using bioinformatics methods, we performed a systematic search for putatively functional sORFs in the Mus musculus genome. A genome-wide scan detected all sORFs, which were subsequently analyzed for their coding potential, based on evolutionary conservation at the AA level, and ranked using a Support Vector Machine (SVM) learning model. The ranked sORFs were finally overlapped with ribosome profiling data, hinting at sORF translation. All candidates were visually inspected using an in-house developed genome browser. In this way, dozens of highly conserved sORFs targeted by ribosomes were identified in the mouse genome, putatively encoding micropeptides.
Our combined genome-wide approach leads to the prediction of a comprehensive but manageable set of putatively coding sORFs, a very important first step towards the identification of a new class of bioactive peptides, called micropeptides.
PMCID: PMC3852105  PMID: 24059539
Micropeptide; Small open reading frame; Mus musculus; Genome-wide; Ribosome profiling; LincRNA; sORF; ncRNA; Bioactive peptide
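The first step of the approach summarized above is a genome-wide scan for small ORFs. The sketch below is a minimal Python illustration of such a scan, under assumptions the abstract does not confirm: it looks only at the forward strand, accepts only ATG start codons, and counts an ORF as small when its coding part is shorter than 100 codons. The function name and return format are hypothetical, and the conservation scoring, SVM ranking and ribosome profiling overlap are not covered.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(dna: str, max_codons: int = 99) -> list[tuple[int, int, int]]:
    """Return (start, end, n_codons) for forward-strand, ATG-initiated ORFs.

    n_codons counts the start codon but not the stop codon; ORFs with more
    than max_codons coding codons are skipped. Coordinates are 0-based and
    end is exclusive (it includes the stop codon).
    """
    dna = dna.upper()
    hits = []
    for start in range(len(dna) - 5):
        if dna[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this start codon.
        for pos in range(start + 3, len(dna) - 2, 3):
            n_codons = (pos - start) // 3
            if n_codons > max_codons:
                break  # too long to count as a small ORF
            if dna[pos:pos + 3] in STOP_CODONS:
                hits.append((start, pos + 3, n_codons))
                break
    return hits


print(find_sorfs("CCATGAAATGA"))
```

Nested in-frame ATG codons are each reported as a separate candidate here. A chromosome-scale scan would yield a very large number of chance ORFs, which is presumably why the downstream coding-potential ranking is needed.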
16.  Transcriptional analysis through RNA sequencing of giant cells induced by Meloidogyne graminicola in rice roots 
Journal of Experimental Botany  2013;64(12):3885-3898.
One of the reasons for the progressive yield decline observed in aerobic rice production is the rapid build-up of populations of the rice root knot nematode Meloidogyne graminicola. These nematodes induce specialized feeding cells inside root tissue, called giant cells. By injecting effectors in and sipping metabolites out of these cells, they reprogramme normal cell development and deprive the plant of its nutrients. In this research we have studied the transcriptome of giant cells in rice, after isolation of these cells by laser-capture microdissection. The expression profiles revealed a general induction of primary metabolism inside the giant cells. Although the roots were shielded from light induction, we detected a remarkable induction of genes involved in chloroplast biogenesis and tetrapyrrole synthesis. The presence of chloroplast-like structures inside these dark-grown cells was confirmed by confocal microscopy. On the other hand, genes involved in secondary metabolism and more specifically, the majority of defence-related genes were strongly suppressed in the giant cells. In addition, significant induction of transcripts involved in epigenetic processes was detected inside these cells 7 days after infection.
PMCID: PMC3745741  PMID: 23881398
Giant cell; laser-capture microdissection; Meloidogyne graminicola; Oryza sativa; root knot nematode; transcriptome.
17.  Quality Evaluation of Methyl Binding Domain Based Kits for Enrichment DNA-Methylation Sequencing 
PLoS ONE  2013;8(3):e59068.
DNA-methylation is an important epigenetic feature in health and disease. Methylated sequence capturing by Methyl Binding Domain (MBD) based enrichment followed by second-generation sequencing provides the best combination of sensitivity and cost-efficiency for genome-wide DNA-methylation profiling. However, existing implementations are numerous, and quality control and optimization require expensive external validation. Therefore, this study has two aims: 1) to identify a best performing kit for MBD-based enrichment using independent validation data, and 2) to evaluate whether quality evaluation can also be performed solely based on the characteristics of the generated sequences. Five commercially available kits for MBD enrichment were combined with Illumina GAIIx sequencing for three cell lines (HCT15, DU145, PC3). Reduced representation bisulfite sequencing data (all three cell lines) and publicly available Illumina Infinium BeadChip data (DU145 and PC3) were used for benchmarking. Consistent large-scale differences in yield, sensitivity and specificity between the different kits could be identified, with Diagenode's MethylCap kit as overall best performing kit under the tested conditions. This kit could also be identified with the Fragment CpG-plot, which summarizes the CpG content of the captured fragments, implying that the latter can be used as a tool to monitor data quality. In conclusion, there are major quality differences between kits for MBD-based capturing of methylated DNA, with the MethylCap kit performing best under the used settings. The Fragment CpG-plot is able to monitor data quality based on inherent sequence data characteristics, and is therefore a cost-efficient tool for experimental optimization, but also to monitor quality throughout routine applications.
PMCID: PMC3598902  PMID: 23554971
18.  Genome-wide promoter methylation analysis in neuroblastoma identifies prognostic methylation biomarkers 
Genome Biology  2012;13(10):R95.
Accurate outcome prediction in neuroblastoma, which is necessary to enable the optimal choice of risk-related therapy, remains a challenge. To improve neuroblastoma patient stratification, this study aimed to identify prognostic tumor DNA methylation biomarkers.
To identify genes silenced by promoter methylation, we first applied two independent genome-wide methylation screening methodologies to eight neuroblastoma cell lines. Specifically, we used re-expression profiling upon 5-aza-2'-deoxycytidine (DAC) treatment and massively parallel sequencing after capturing with a methyl-CpG-binding domain (MBD-seq). Putative methylation markers were selected from DAC-upregulated genes through a literature search and an upfront methylation-specific PCR on 20 primary neuroblastoma tumors, as well as through MBD-seq in combination with publicly available neuroblastoma tumor gene expression data. This yielded 43 candidate biomarkers that were subsequently tested by high-throughput methylation-specific PCR on an independent cohort of 89 primary neuroblastoma tumors that had been selected for risk classification and survival. Based on this analysis, methylation of KRT19, FAS, PRPH, CNR1, QPCT, HIST1H3C, ACSS3 and GRB10 was found to be associated with at least one of the classical risk factors, namely age, stage or MYCN status. Importantly, HIST1H3C and GNAS methylation was associated with overall and/or event-free survival.
This study combines two genome-wide methylation discovery methodologies and is the most extensive validation study in neuroblastoma performed thus far. We identified several novel prognostic DNA methylation markers and provide a basis for the development of a DNA methylation-based prognostic classifier in neuroblastoma.
PMCID: PMC3491423  PMID: 23034519
19.  A tissue biopsy-based epigenetic multiplex PCR assay for prostate cancer detection 
BMC Urology  2012;12:16.
PSA-directed prostate cancer screening leads to a high rate of false positive identifications and an unnecessary biopsy burden. Epigenetic biomarkers have proven useful, exhibiting frequent and abundant inactivation of tumor suppressor genes through such mechanisms. An epigenetic, multiplex PCR test for prostate cancer diagnosis could provide physicians with better tools to help their patients. Biomarkers like GSTP1, APC and RASSF1 have demonstrated involvement with prostate cancer, with the latter two genes playing prominent roles in the field effect. The epigenetic states of these genes can be used to assess the likelihood of cancer presence or absence.
An initial test cohort of 30 prostate cancer-positive samples and 12 cancer-negative samples was used as basis for the development and optimization of an epigenetic multiplex assay based on the GSTP1, APC and RASSF1 genes, using methylation specific PCR (MSP). The effect of prostate needle core biopsy sample volume and age of formalin-fixed paraffin-embedded (FFPE) samples was evaluated on an independent follow-up cohort of 51 cancer-positive patients. Multiplexing affects copy number calculations in a consistent way per assay. Methylation ratios are therefore altered compared to the respective singleplex assays, but the correlation with patient outcome remains equivalent. In addition, tissue-biopsy samples as small as 20 μm can be used to detect methylation in a reliable manner. The age of FFPE-samples does have a negative impact on DNA quality and quantity.
The developed multiplex assay appears functionally similar to individual singleplex assays, with the benefit of lower tissue requirements, lower cost and decreased signal variation. This assay can be applied to small biopsy specimens, down to 20 microns, widening clinical applicability. Increasing the sample volume can compensate for the loss of DNA quality and quantity in older samples.
PMCID: PMC3431995  PMID: 22672250
GSTP1; APC; RASSF1; Methylation; Epigenetics; Prostate cancer; Diagnosis; Multiplex; Singleplex; MSP
20.  Molecular diagnostics for congenital hearing loss including 15 deafness genes using a next generation sequencing platform 
BMC Medical Genomics  2012;5:17.
Hereditary hearing loss (HL) can originate from mutations in one of many genes involved in the complex process of hearing. Identification of the genetic defects in patients is currently labor intensive and expensive. While screening with Sanger sequencing for GJB2 mutations is common, this is not the case for the other known deafness genes (> 60). Next generation sequencing technology (NGS) has the potential to be much more cost efficient. Published methods mainly use hybridization based target enrichment procedures that are time saving and efficient, but lead to loss in sensitivity. In this study we used a semi-automated PCR amplification and NGS in order to combine high sensitivity, speed and cost efficiency.
In this proof of concept study, we screened 15 autosomal recessive deafness genes in 5 patients with congenital genetic deafness. 646 specific primer pairs for all exons and most of the UTR of the 15 selected genes were designed using primerXL. Using patient specific identifiers, all amplicons were pooled and analyzed using the Roche 454 NGS technology. Three of these patients are members of families in which a region of interest has previously been characterized by linkage studies. In these, we were able to identify two new mutations in CDH23 and OTOF. For another patient, the etiology of deafness was unclear, and no causal mutation was found. In a fifth patient, included as a positive control, we could confirm a known mutation in TMC1.
We have developed an assay that holds great promise as a tool for screening patients with familial autosomal recessive nonsyndromal hearing loss (ARNSHL). For the first time, an efficient, reliable and cost effective genetic test, based on PCR enrichment, for newborns with undiagnosed deafness is available.
PMCID: PMC3443074  PMID: 22607986
Deafness; Next generation sequencing; PCR based enrichment; Genetic diagnostics
21.  Colorectal adenoma to carcinoma progression is accompanied by changes in gene expression associated with ageing, chromosomal instability, and fatty acid metabolism 
Cellular Oncology (Dordrecht)  2012;35(1):53-63.
Colorectal cancer develops in a multi-step manner from normal epithelium, through a pre-malignant lesion (so-called adenoma), into a malignant lesion (carcinoma), which invades surrounding tissues and eventually can spread systemically (metastasis). It is estimated that only about 5% of adenomas do progress to a carcinoma.
The present study aimed to unravel the biology of adenoma to carcinoma progression by mRNA expression profiling, and to identify candidate biomarkers for adenomas that are truly at high risk of progression.
Genome-wide mRNA expression profiles were obtained from a series of 37 colorectal adenomas and 31 colorectal carcinomas using oligonucleotide microarrays. Differentially expressed genes were validated in an independent colorectal gene expression data set. Gene Set Enrichment Analysis (GSEA) was used to identify altered expression of sets of genes associated with specific biological processes, in order to better understand the biology of colorectal adenoma to carcinoma progression.
mRNA expression of 248 genes was significantly different, of which 96 were upregulated and 152 downregulated in carcinomas compared to adenomas. Classification of adenomas and carcinomas using the expression of these genes proved to be very accurate, also when tested in an independent expression data set. Gene-sets associated with ageing (which is related to senescence) and chromosomal instability were upregulated, and a gene-set associated with fatty acid metabolism was downregulated in carcinomas compared to adenomas. Moreover, gene-sets associated with chromosomal location revealed chromosome 4q22 loss and chromosome 20q gain of gene-set expression as being relevant in this progression.
Concluding remark
These data are consistent with the notion that adenomas and carcinomas are distinct biological entities. Disruption of specific biological processes like senescence (ageing), maintenance of chromosomal instability and altered metabolism, are key factors in the progression from adenoma to carcinoma.
Electronic supplementary material
The online version of this article (doi:10.1007/s13402-011-0065-1) contains supplementary material, which is available to authorized users.
PMCID: PMC3308003  PMID: 22278361
Colorectal cancer; Progression; Gene expression; Ageing; Chromosomal instability; Fatty acid metabolism
22.  Cancer-related Epigenome Changes Associated with Reprogramming to Induced Pluripotent Stem Cells 
Cancer research  2010;70(19):7662-7673.
The ability to induce pluripotent stem cells from committed, somatic, human cells provides tremendous potential for regenerative medicine. However, there is a defined neoplastic potential inherent to such reprogramming that must be understood and may provide a model for understanding key events in tumorigenesis. Using genome wide assays we identify cancer-related epigenetic abnormalities that arise early during reprogramming and persist in induced pluripotent stem cell (iPS) clones. These include hundreds of abnormal gene silencing events, patterns of aberrant responses to epigenetic modifying drugs resembling those for cancer cells, and presence in iPS and partially reprogrammed cells of cancer-specific, gene promoter, DNA methylation alterations. Our findings suggest that by studying the process of induced reprogramming we may gain significant insight into the origins of epigenetic gene silencing associated with human tumorigenesis and add to means of assessing iPS for safety.
PMCID: PMC2980296  PMID: 20841480
reprogramming; induced pluripotent stem cells (iPS); embryonic stem cells (ESC); DNA methylation; chromatin; cancer
23.  Practical Tools to Implement Massive Parallel Pyrosequencing of PCR Products in Next Generation Molecular Diagnostics 
PLoS ONE  2011;6(9):e25531.
Despite improvements in terms of sequence quality and price per basepair, Sanger sequencing remains restricted to screening of individual disease genes. The development of massively parallel sequencing (MPS) technologies heralded an era in which molecular diagnostics for multigenic disorders becomes reality. Here, we outline different PCR amplification based strategies for the screening of a multitude of genes in a patient cohort. We performed a thorough evaluation in terms of set-up, coverage and sequencing variants on the data of 10 GS-FLX experiments (over 200 patients). Crucially, we determined the actual coverage that is required for reliable diagnostic results using MPS, and provide a tool to calculate the number of patients that can be screened in a single run. Finally, we provide an overview of factors contributing to false negative or false positive mutation calls and suggest ways to maximize sensitivity and specificity, both important in a routine setting. By describing practical strategies for screening of multigenic disorders in a multitude of samples and providing answers to questions about minimum required coverage, the number of patients that can be screened in a single run and the factors that may affect sensitivity and specificity we hope to facilitate the implementation of MPS technology in molecular diagnostics.
PMCID: PMC3184136  PMID: 21980484
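The abstract above mentions a tool to calculate how many patients can be screened in a single run once the required coverage is known. The sketch below illustrates the underlying arithmetic only. It assumes reads are spread evenly over all amplicons and patients, which real runs do not achieve, and it is not the authors' tool. The function name, parameters and example numbers are hypothetical.

```python
import math


def patients_per_run(reads_per_run: int,
                     amplicons_per_patient: int,
                     min_coverage: int,
                     usable_fraction: float = 1.0) -> int:
    """Estimate how many patients fit in one run under a uniform-coverage assumption.

    reads_per_run         -- total reads produced by the run
    amplicons_per_patient -- number of amplicons sequenced per patient
    min_coverage          -- reads required per amplicon for a reliable call
    usable_fraction       -- share of reads that map and pass filters (assumed)
    """
    if amplicons_per_patient <= 0 or min_coverage <= 0:
        raise ValueError("amplicons_per_patient and min_coverage must be positive")
    if not 0.0 < usable_fraction <= 1.0:
        raise ValueError("usable_fraction must be in (0, 1]")
    usable_reads = reads_per_run * usable_fraction
    reads_needed_per_patient = amplicons_per_patient * min_coverage
    return math.floor(usable_reads / reads_needed_per_patient)


# Hypothetical numbers for illustration only, not values from the study.
capacity = patients_per_run(reads_per_run=800_000,
                            amplicons_per_patient=100,
                            min_coverage=40)
```

Because coverage is uneven in practice, a calculation of this kind gives an upper bound; lowering `usable_fraction` or raising `min_coverage` is one simple way to build in a safety margin.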
24.  Genome-Wide Promoter Analysis Uncovers Portions of the Cancer Methylome 
Cancer research  2008;68(8):2661-2670.
DNA methylation has a role in mediating epigenetic silencing of CpG island genes in cancer and other diseases. Identification of all gene promoters methylated in cancer cells, “the cancer methylome”, would greatly advance our understanding of gene regulatory networks in tumorigenesis. We previously described a new method of identifying methylated tumor suppressor genes based on pharmacologic unmasking of the promoter region and detection of re-expression on microarray analysis. In this study, we modified and greatly improved the selection of candidates based on a new promoter structure algorithm and microarray data generated from 20 cancer cell lines of 5 major cancer types. We identified a set of 200 candidate genes that cluster throughout the genome, of which 25 were previously reported as harboring cancer-specific promoter methylation. The remaining 175 genes were tested for promoter methylation by bisulfite sequencing or methylation-specific PCR (MSP). Eighty-two of 175 (47%) genes were found to be methylated in cell lines, and 53 of these 82 genes (65%) were methylated in primary tumor tissues. From these 53 genes, cancer-specific methylation was identified in 28 genes (28 of 53; 53%). Furthermore, we tested 8 of the 28 newly identified cancer-specific methylated genes with quantitative MSP in a panel of 300 primary tumors representing 13 types of cancer. We found cancer-specific methylation of at least one gene with high frequency in all cancer types. Identification of a large number of genes with cancer-specific methylation provides new targets for diagnostic and therapeutic intervention, and opens fertile avenues for basic research in tumor biology.
PMCID: PMC3102297  PMID: 18413733
25.  Pharmacologic Unmasking of Epigenetically Silenced Genes in Breast Cancer 
Aberrant promoter hypermethylation of several known or putative tumor suppressor genes occurs frequently during the pathogenesis of various cancers including breast cancer. Many epigenetically inactivated genes involved in breast cancer development remain to be identified. Therefore, in this study we used a pharmacologic unmasking approach in breast cancer cell lines with 5-aza-2′-deoxycytidine (5-aza-dC) followed by microarray expression analysis to identify epigenetically inactivated genes in breast cancer.
Experimental Design
Breast cancer cell lines were treated with 5-aza-dC followed by microarray analysis to identify epigenetically inactivated genes in breast cancer. We then used bisulfite DNA sequencing, conventional methylation-specific PCR, and quantitative fluorogenic real-time methylation-specific PCR to confirm cancer-specific methylation in novel genes.
Forty-nine genes were up-regulated in breast cancer cell lines after 5-aza-dC treatment, as determined by microarray analysis. Five genes (MAL, FKBP4, VGF, OGDHL, and KIF1A) showed cancer-specific methylation in breast tissues. Methylation of at least two was found at high frequency only in breast cancers (40 of 40) as compared with normal breast tissue (0 of 10; P < 0.0001, Fisher’s exact test).
This study identified new cancer-specific methylated genes to help elucidate the biology of breast cancer and as candidate diagnostic markers for the disease.
PMCID: PMC3082476  PMID: 19228724
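The comparison reported above (40 of 40 breast cancers versus 0 of 10 normal tissues, Fisher's exact test) can be checked with a short worked example. The sketch below is a plain-Python two-sided Fisher's exact test for a 2x2 table. It assumes a two-sided test, which the abstract does not state, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Rows: cancer, normal. Columns: methylated, unmethylated (counts from the abstract).
p_value = fisher_exact_two_sided(40, 0, 0, 10)
```

With these counts the observed table is the most extreme one possible for its margins, so the p-value reduces to 1 divided by the number of ways to choose 10 samples from 50, which is on the order of 1e-10 and consistent with the reported P < 0.0001.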

