Chronic lymphocytic leukemia (CLL) can be divided into prognostic subgroups based on the IGHV gene mutational status, and is further characterized by multiple subsets of cases with quasi-identical or stereotyped B cell receptors that also share clinical and biological features. We recently reported differential DNA methylation profiles in IGHV-mutated and IGHV-unmutated CLL subgroups. For the first time, we here explore the global methylation profiles of stereotyped subsets with different prognosis, by applying high-resolution methylation arrays on CLL samples from three major stereotyped subsets: the poor-prognostic subsets #1 (n = 15) and #2 (n = 9) and the favorable-prognostic subset #4 (n = 15). Overall, the three subsets exhibited significantly different methylation profiles, which only partially overlapped with those observed in our previous study according to IGHV gene mutational status. Specifically, gene ontology analysis of the differentially methylated genes revealed a clear enrichment of genes involved in immune response, such as B cell activation (e.g., CD80, CD86 and IL10), with higher methylation levels in subset #1 than subsets #2 and #4. Accordingly, higher expression of the co-stimulatory molecules CD80 and CD86 was demonstrated in subset #4 vs. subset #1, pointing to a key role for these molecules in the crosstalk of CLL subset #4 cells with the microenvironment. In summary, investigation of three prototypic, stereotyped CLL subsets revealed distinct DNA methylation profiles for each subset, which suggests subset-biased patterns of transcriptional control and highlights a key role for epigenetics during leukemogenesis.
Chronic lymphocytic leukemia; DNA methylation; microarrays; stereotyped B cell receptors; immune response
Abnormal tail biting behaviour is a major welfare problem for pigs receiving the behaviour, as well as an indication of decreased welfare in the pigs performing it. However, not all pigs in a pen perform or receive tail biting behaviour and it has recently been shown that these ‘neutral’ pigs not only differ in their behaviour, but also in their gene expression compared to performers and receivers of tail biting in the same pen. To investigate whether this difference was linked to the cause or a consequence of them not being involved in the outbreak of tail biting, behaviour and brain gene expression were compared with ‘control’ pigs housed in pens with no tail biting. It was shown that the pigs housed in control pens performed a wider variety of pig-directed abnormal behaviour (belly nosing 0.95±1.59, tail in mouth 0.31±0.60 and ‘other’ abnormal 1.53±4.26; mean±S.D.) compared to the neutral pigs (belly nosing 0.30±0.62, tail in mouth 0.13±0.50 and ‘other’ abnormal 0.42±1.06). With Affymetrix gene expression arrays, 107 transcripts were identified as differentially expressed (p<0.05) between these two categories of pigs. Several of these transcripts had already been shown to be differentially expressed in the neutral pigs when they were compared to performers and receivers of tail biting in the same pen in an earlier study. Hence, the different expression of these genes cannot be a consequence of the neutral pigs not being involved in tail biting behaviour, but is rather linked to the cause contributing to why they were not involved in tail biting interactions. These neutral pigs seem to have a genetic and behavioural profile that somehow contributes to them being resistant to performing or receiving pig-directed abnormal behaviour, such as tail biting, even when housed in an environment that elicits that behaviour in other pigs.
Whole-genome sequencing of tumor tissue has the potential to provide comprehensive characterization of genomic alterations in tumor samples. We present Patchwork, a new bioinformatic tool for allele-specific copy number analysis using whole-genome sequencing data. Patchwork can be used to determine the copy number of homologous sequences throughout the genome, even in aneuploid samples with moderate sequence coverage and tumor cell content. No prior knowledge of average ploidy or tumor cell content is required. Patchwork is freely available as an R package, installable via R-Forge (http://patchwork.r-forge.r-project.org/).
Cancer; allele-specific copy number analysis; whole-genome sequencing; aneuploidy; tumor heterogeneity; chromothripsis
Ovarian cancer is a heterogeneous disease and prognosis for apparently similar cases of ovarian cancer varies. Recurrence of the disease in early stage (FIGO-stages I-II) serous ovarian cancer results in survival that is comparable to those with recurrent advanced-stage disease. The aim of this study was to investigate if there are specific genomic aberrations that may explain recurrence and clinical outcome.
Fifty-one women with early stage serous ovarian cancer were included in the study. DNA was extracted from formalin fixed samples containing tumor cells from ovarian tumors. Tumor samples from thirty-seven patients were analysed for allele-specific copy numbers using OncoScan single nucleotide polymorphism arrays from Affymetrix and the bioinformatic tool Tumor Aberration Prediction Suite. Genomic gains, losses, and loss-of-heterozygosity that associated with recurrent disease were identified.
The most significant differences (p < 0.01) in loss-of-heterozygosity (LOH) were identified in two relatively small regions of chromosome 19: 8.0-8.8 Mbp (19 genes) and 51.5-53.0 Mbp (37 genes). Thus, 56 genes on chromosome 19 were potential candidate genes associated with clinical outcome. LOH at 19q (51-56 Mbp) was associated with shorter disease-free survival and was an independent prognostic factor for survival in a multivariate Cox regression analysis. In particular, LOH on chromosome 19q (51-56 Mbp) was significantly (p < 0.01) associated with loss of TP53 function.
The results of our study indicate that presence of two aberrations in TP53 on 17p and LOH on 19q in early stage serous ovarian cancer is associated with recurrent disease. Further studies related to the findings of chromosomes 17 and 19 are needed to elucidate the molecular mechanism behind the recurring genomic aberrations and the poor clinical outcome.
Allele-specific copy number; FFPE; LOH; Prognosis; Serous ovarian cancer; TAPS; Early-stage
The rate of failed interlock blood alcohol concentration (BAC) tests is a strong predictor of recidivism post-interlock and a partial proxy for alcohol use. Alcohol biomarkers measured at the start of an interlock program are known to correlate well with rates of failed BAC tests over months of interlock use. This study evaluates two methods of measuring low blood levels of the biomarker PEth (phosphatidylethanol). PEth is a 100% alcohol-specific biomarker and is strongly intercorrelated with several independent indicators of drinking-driving risk, including 8 other biomarkers, 3 psychometric assessments, and the rate of failed interlock BAC tests during many months of interlock use. Does a more sensitive method of measuring PEth at program entry detect drinking even among those who subsequently log no failed interlock tests?
In 281 blood samples from drivers, PEth was measured by both high-performance liquid chromatography (HPLC) and liquid chromatography tandem mass spectrometry (LCMSMS) in order to compare sensitivity and accuracy. The average rate of failed interlock BAC tests was the criterion measure for marker sensitivity. LCMSMS, calibrated to detect low levels of drinking as a possible measure of abstinence violation, was judged relative to the standard HPLC assay for PEth measured up to 4 µmol/L.
The two methods showed a good quantitative relationship (r² > 0.86). LCMSMS detected positive PEth levels in samples that were below the limit of detection of the HPLC method. PEth measured by LCMSMS was positive for a higher proportion of DUI offenders who logged zero failed interlock BAC tests than were detected by HPLC.
Although HPLC is the widely used standard for measuring PEth in clinical alcoholism samples, the LCMSMS method, when calibrated to detect trace amounts of the major component of PEth, can detect abstinence levels of alcohol near zero intake and still correlate strongly with other indicators related to alcohol use and road safety.
phosphatidylethanol; HPLC; LCMSMS; DUI drivers; alcohol; interlocks; abstinence
We describe a bioinformatic tool, Tumor Aberration Prediction Suite (TAPS), for the identification of allele-specific copy numbers in tumor samples using data from Affymetrix SNP arrays. It includes detailed visualization of genomic segment characteristics and iterative pattern recognition for copy number identification, and does not require patient-matched normal samples. TAPS can be used to identify chromosomal aberrations with high sensitivity even when the proportion of tumor cells is as low as 30%. Analysis of cancer samples indicates that TAPS is well suited to investigate samples with aneuploidy and tumor heterogeneity, which is commonly found in many types of solid tumors.
To evaluate the prevalence of primary aldosteronism (PA) in newly diagnosed and untreated hypertensive patients in primary care using the aldosterone/renin ratio (ARR), and to assess clinical and biochemical characteristics in patients with high and normal ARR.
Patient survey study.
Setting and subjects
A total of 200 consecutive patients with newly diagnosed and untreated hypertension from six primary health care centres in Sweden were included.
Main outcome measures
ARR was calculated from serum aldosterone and plasma renin concentrations. The cut-off level for ARR was 65. Patients with an increased ARR were considered for confirmatory testing with the fludrocortisone suppression test (FST), followed by adrenal computed tomographic radiology (CT) and adrenal venous sampling (AVS).
Of 200 patients, 36 had an ARR > 65. Of these 36 patients, 11 had incomplete aldosterone inhibition during FST. Three patients were diagnosed with an aldosterone-producing adenoma (APA) and eight with bilateral adrenal hyperplasia (BAH). Except for a moderately lower level of plasma potassium (P-K) in patients with an ARR > 65 and in patients with PA, no biochemical or clinical differences were found between hypertensive patients with PA and those without PA.
Eleven of 200 evaluated patients (5.5%) were considered to have PA. The diagnosis of PA should therefore be considered in newly diagnosed hypertensive subjects and screening for the diagnosis is warranted.
Aldosterone; aldosterone to renin ratio; family practice; hypertension; primary aldosteronism; renin
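The screening arithmetic in the abstract above can be sketched in a few lines. This is only an illustration: the cut-off of 65 and the counts (200, 36, 11) come from the abstract, but the measurement units behind the ratio are not stated there, and the function names are hypothetical.

```python
def aldosterone_renin_ratio(aldosterone, renin):
    """Return the ARR; both values must be in the units the cut-off assumes."""
    return aldosterone / renin


def screens_positive(arr, cutoff=65.0):
    """True if the ARR exceeds the cut-off used in the study (65)."""
    return arr > cutoff


def percent(count, total):
    """Express a count as a percentage of the total."""
    return 100.0 * count / total


# Counts reported in the abstract.
screened = 200
high_arr = 36       # ARR > 65
confirmed_pa = 11   # incomplete aldosterone inhibition during FST

share_high_arr = percent(high_arr, screened)    # share flagged by screening
prevalence_pa = percent(confirmed_pa, screened) # share considered to have PA
```

With these counts, 18% of patients were flagged by the ARR and 5.5% were considered to have PA, so roughly one in three screen-positive patients was confirmed.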
Technologies based on DNA microarrays have the potential to provide detailed information on genomic aberrations in tumor cells. In practice a major obstacle for quantitative detection of aberrations is the heterogeneity of clinical tumor tissue. Since tumor tissue invariably contains genetically normal stromal cells, this may lead to a failure to detect aberrations in the tumor cells.
Using SNP array data from 44 non-small cell lung cancer samples we have developed a bioinformatic algorithm that accurately models the fractions of normal and tumor cells in clinical tumor samples. The proportion of normal cells in combination with SNP array data can be used to detect and quantify copy number neutral loss-of-heterozygosity (CNNLOH) in the tumor cells both in crude tumor tissue and in samples enriched for tumor cells by laser capture microdissection.
Genome-wide quantitative analysis of CNNLOH using the CNNLOH Quantifier method can help to identify recurrent aberrations contributing to tumor development in clinical tumor samples. In addition, SNP-array based analysis of CNNLOH may become important for detection of aberrations that can be used for diagnostic and prognostic purposes.
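To make the role of normal-cell admixture concrete, the following minimal sketch shows generic mixture arithmetic for a single heterozygous SNP. It is not the published CNNLOH Quantifier algorithm. It assumes normal cells carry one A and one B allele, that tumor cells with copy-number-neutral LOH carry two copies of the same allele, and that signal is proportional to allele copies; all function names are hypothetical.

```python
def expected_b_allele_fraction(normal_fraction, tumor_b_copies, tumor_total_copies):
    """Expected B-allele fraction at a germline-heterozygous SNP in a mixed sample.

    Normal cells contribute 1 B copy out of 2; tumor cells contribute
    tumor_b_copies out of tumor_total_copies.
    """
    tumor_fraction = 1.0 - normal_fraction
    b_copies = normal_fraction * 1 + tumor_fraction * tumor_b_copies
    all_copies = normal_fraction * 2 + tumor_fraction * tumor_total_copies
    return b_copies / all_copies


def normal_fraction_from_cnnloh(b_allele_fraction):
    """Invert the mixture model for a region assumed to carry CNNLOH.

    With tumor genotype BB (or AA) at two copies, the major-allele
    fraction equals 1 - normal_fraction / 2.
    """
    major = max(b_allele_fraction, 1.0 - b_allele_fraction)
    return 2.0 * (1.0 - major)


# Example: a sample with 50% normal cells and CNNLOH (BB) in the tumor cells.
baf = expected_b_allele_fraction(0.5, tumor_b_copies=2, tumor_total_copies=2)
recovered = normal_fraction_from_cnnloh(baf)
```

In a pure tumor sample the B-allele fraction would be 1.0; with half the cells normal it drops to 0.75, which is why an unmodelled stromal component can hide the aberration.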
There is strong evidence for the importance of genetic factors in idiopathic autism. The results from independent twin and family studies suggest that the disorder is caused by the action of several genes, possibly acting epistatically. We have used cDNA microarray technology for the identification of constitutional changes in the gene expression profile associated with idiopathic autism. Samples were obtained and analyzed from six affected subjects belonging to multiplex autism families and from six healthy controls. We assessed the expression levels for approximately 7,700 genes by cDNA microarrays using mRNA derived from Epstein-Barr virus (EBV)-transformed B-lymphocytes. The microarray data were analyzed in order to identify up- or down-regulation of specific genes. A common pattern with nine down-regulated genes was identified among samples derived from individuals with autism when compared to controls. Four of these nine genes encode proteins involved in biological processes associated with brain function or the immune system, and are consequently considered as candidates for genes associated with autism. Quantitative real-time PCR confirmed the down-regulation of the gene encoding SEMA5A, a protein involved in axonal guidance. EBV should be considered as a possible source of altered expression, but our consistent results lead us to suggest SEMA5A as a candidate gene in the etiology of idiopathic autism.
Adolescent; Autistic Disorder/genetics/metabolism; Child; Child, Preschool; Down-Regulation/physiology; Female; Gene Expression/physiology; Genetic Predisposition to Disease; Humans; Male; Membrane Proteins/genetics/metabolism; Nerve Tissue Proteins/genetics/metabolism; Oligonucleotide Array Sequence Analysis/methods; Reverse Transcriptase Polymerase Chain Reaction/methods; Autistic disorder; cDNA microarrays; Gene expression; Chromosome 7q31; SEMA5A
A strategy is presented for detection of loss-of-heterozygosity and allelic imbalance in cancer cells from whole genome SNP genotyping data.
We present a strategy for detection of loss-of-heterozygosity and allelic imbalance in cancer cells from whole genome single nucleotide polymorphism genotyping data. Using a dilution series of a tumor cell line mixed with its paired normal cell line and data generated on Affymetrix and Illumina platforms, including paired tumor-normal samples and tumors characterized by fluorescent in situ hybridization, we demonstrate a high sensitivity and specificity of the strategy for detecting both minute and gross allelic imbalances in heterogeneous tumor samples.
We address the issue of explaining the presence or absence of phase-specific transcription in budding yeast cultures under different conditions. To this end we use a model-based detector of gene expression periodicity to divide genes into classes depending on their behavior in experiments using different synchronization methods. While computational inference of gene regulatory circuits typically relies on expression similarity (clustering) in order to find classes of potentially co-regulated genes, this method instead takes advantage of known time profile signatures related to the studied process.
We explain the regulatory mechanisms of the inferred periodic classes with cis-regulatory descriptors that combine upstream sequence motifs with experimentally determined binding of transcription factors. By systematic statistical analysis we show that periodic classes are best explained by combinations of descriptors rather than single descriptors, and that different combinations correspond to periodic expression in different classes. We also find evidence for additive regulation in that the combinations of cis-regulatory descriptors associated with genes periodically expressed in fewer conditions are frequently subsets of combinations associated with genes periodically expressed in more conditions. Finally, we demonstrate that our approach retrieves combinations that are more specific towards known cell-cycle related regulators than the frequently used clustering approach.
The results illustrate how a model-based approach to expression analysis may be particularly well suited to detect biologically relevant mechanisms. Our new approach makes it possible to provide more refined hypotheses about regulatory mechanisms of the cell cycle and it can easily be adjusted to reveal regulation of other, non-periodic, cellular processes.
Using the relative expression levels of two SNP alleles of a gene in the same sample is an effective approach for identifying cis-acting regulatory SNPs (rSNPs). In the current study, we established a process for systematic screening for cis-acting rSNPs using experimental detection of allelic imbalance (AI) as an initial approach. We selected 160 expressed candidate genes that are involved in cancer and anticancer drug resistance for analysis of AI in a panel of cell lines that represent different types of cancers and have been well characterized for their response patterns against anticancer drugs. Of these genes, 60 contained heterozygous SNPs in their coding regions (cSNPs), and 41 of the genes displayed imbalanced expression of the two cSNP alleles. Genes that displayed AI were subjected to bioinformatics-assisted identification of rSNPs that alter the strength of transcription factor binding. rSNPs in 15 genes were subjected to electrophoretic mobility shift assay, and in eight of these genes (APC, BCL2, CCND2, MLH1, PARP1, SLIT2, YES1, XRCC1) we identified differential protein binding from a nuclear extract between the SNP alleles. The screening process allowed us to zoom in from 160 candidate genes to eight genes that may contain functional rSNPs in their promoter regions.
Supervised learning for classification of cancer employs a set of design examples to learn how to discriminate between tumors. In practice it is crucial to confirm that the classifier is robust with good generalization performance to new examples, or at least that it performs better than random guessing. A suggested alternative is to obtain a confidence interval of the error rate using repeated design and test sets selected from available examples. However, it is known that even in the ideal situation of repeated designs and tests with completely novel samples in each cycle, a small test set size leads to a large bias in the estimate of the true variance between design sets. Therefore, different methods for small sample performance estimation, such as a recently proposed procedure called Repeated Random Sampling (RSS), are also expected to result in heavily biased estimates, which in turn translate into biased confidence intervals. Here we explore such biases and develop a refined algorithm called Repeated Independent Design and Test (RIDT).
Our simulations reveal that repeated designs and tests based on resampling in a fixed bag of samples yield a biased variance estimate. We also demonstrate that it is possible to obtain an improved variance estimate by means of a procedure that explicitly models how this bias depends on the number of samples used for testing. For the special case of repeated designs and tests using new samples for each design and test, we present an exact analytical expression for how the expected value of the bias decreases with the size of the test set.
We show that via modeling and subsequent reduction of the small sample bias, it is possible to obtain an improved estimate of the variance of classifier performance between design sets. However, the uncertainty of the variance estimate is large in the simulations performed indicating that the method in its present form cannot be directly applied to small data sets.
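The small-test-set bias described in this abstract can be illustrated for the idealized case of fresh samples in every design and test cycle. The sketch below is not the RIDT algorithm and does not reproduce the paper's analytical expression. It only demonstrates the standard decomposition Var(estimated error) = Var(true error) + E[e(1 - e)] / n_test, with a simple moment-based correction. The Beta(20, 80) distribution of true error rates and all function names are assumptions made for the simulation.

```python
import random
import statistics


def simulate_error_estimates(n_designs, n_test, seed=0):
    """Draw a true error rate per design set and estimate it on n_test fresh samples."""
    rng = random.Random(seed)
    true_errors, estimates = [], []
    for _ in range(n_designs):
        e = rng.betavariate(20, 80)  # hypothetical true error rate of this classifier
        mistakes = sum(rng.random() < e for _ in range(n_test))
        true_errors.append(e)
        estimates.append(mistakes / n_test)
    return true_errors, estimates


def corrected_variance(estimates, n_test):
    """Subtract the binomial test-set noise from the naive variance.

    Since E[p(1 - p)] = E[e(1 - e)] * (n_test - 1) / n_test for an estimate p,
    the noise term E[e(1 - e)] / n_test equals mean(p(1 - p)) / (n_test - 1).
    """
    noise = statistics.mean([p * (1 - p) for p in estimates]) / (n_test - 1)
    return statistics.variance(estimates) - noise


true_errors, estimates = simulate_error_estimates(n_designs=2000, n_test=20)
target = statistics.variance(true_errors)   # variance between design sets
naive = statistics.variance(estimates)      # inflated by test-set noise
corrected = corrected_variance(estimates, n_test=20)
```

With only 20 test samples per cycle, the naive variance is several times larger than the variance between design sets, while the corrected value lands much closer. In a fixed bag of samples, as the abstract notes, the resampled sets are not independent, so this simple correction would not carry over unchanged.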
Detection of periodically expressed genes from microarray data without use of known periodic and non-periodic training examples is an important problem, e.g. for identifying genes regulated by the cell-cycle in poorly characterised organisms. Commonly the investigator is only interested in genes expressed at a particular frequency that characterizes the process under study but this frequency is seldom exactly known. Previously proposed detector designs require access to labelled training examples and do not allow systematic incorporation of diffuse prior knowledge available about the period time.
A learning-free Bayesian detector that does not rely on labelled training examples and allows incorporation of prior knowledge about the period time is introduced. It is shown to outperform two recently proposed alternative learning-free detectors on simulated data generated with models that are different from the one used for detector design. Results from applying the detector to mRNA expression time profiles from S. cerevisiae show that the genes detected as periodically expressed only contain a small fraction of the cell-cycle genes inferred from mutant phenotype. For example, when the probability of false alarm was equal to 7%, only 12% of the cell-cycle genes were detected. The genes detected as periodically expressed were found to have a statistically significant overrepresentation of known cell-cycle regulated sequence motifs. One known sequence motif and 18 putative motifs, previously not associated with periodic expression, were also overrepresented.
In comparison with recently proposed alternative learning-free detectors for periodic gene expression, Bayesian inference allows systematic incorporation of diffuse a priori knowledge about, e.g. the period time. This results in relative performance improvements due to increased robustness against errors in the underlying assumptions. Results from applying the detector to mRNA expression time profiles from S. cerevisiae include several new findings that deserve further experimental studies.
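The idea of using diffuse prior knowledge about the period can be illustrated with a much simpler stand-in than the Bayesian detector described here. The sketch below fits a least-squares sinusoid at each candidate period and averages the fraction of variance explained, weighted by a prior over periods. The scoring rule, the candidate periods, the prior weights and the function names are all assumptions for illustration and are not taken from the paper.

```python
import math


def periodic_score(times, values, period):
    """Fraction of variance explained by a least-squares sinusoid of the given period."""
    n = len(values)
    mean_y = sum(values) / n
    y = [v - mean_y for v in values]
    total = sum(v * v for v in y)
    if total == 0.0:
        return 0.0  # a constant profile has nothing to explain
    w = 2.0 * math.pi / period
    c = [math.cos(w * t) for t in times]
    s = [math.sin(w * t) for t in times]
    # Centre the regressors so the fit includes an intercept.
    mean_c, mean_s = sum(c) / n, sum(s) / n
    c = [v - mean_c for v in c]
    s = [v - mean_s for v in s]
    scc = sum(v * v for v in c)
    sss = sum(v * v for v in s)
    scs = sum(a * b for a, b in zip(c, s))
    scy = sum(a * b for a, b in zip(c, y))
    ssy = sum(a * b for a, b in zip(s, y))
    det = scc * sss - scs * scs
    if det <= 1e-12:
        return 0.0  # regressors are degenerate for this sampling
    # Solve the 2x2 normal equations for the cosine and sine amplitudes.
    a_hat = (scy * sss - ssy * scs) / det
    b_hat = (ssy * scc - scy * scs) / det
    return (a_hat * scy + b_hat * ssy) / total


def prior_weighted_score(times, values, periods, prior_weights):
    """Average the per-period scores using a prior over candidate periods."""
    weight_sum = sum(prior_weights)
    scores = [periodic_score(times, values, p) for p in periods]
    return sum(w * sc for w, sc in zip(prior_weights, scores)) / weight_sum


# Example: a noise-free profile with period 10 sampled at 30 time points.
times = list(range(30))
profile = [math.sin(2.0 * math.pi * t / 10.0) for t in times]
score = prior_weighted_score(times, profile, periods=[9, 10, 11],
                             prior_weights=[0.25, 0.5, 0.25])
```

Spreading the weight over neighbouring periods keeps the score from collapsing when the assumed period is slightly off, which is the kind of robustness the abstract attributes to incorporating prior knowledge.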
Parallel, highly specific analysis methods are required to take advantage of the extensive information about DNA sequence variation and of expressed sequences. We present a scalable laboratory technique suitable to analyze numerous target sequences in multiplexed assays. Sets of padlock probes were applied to analyze single nucleotide variation directly in total genomic DNA or cDNA for parallel genotyping or gene expression analysis. All reacted probes were then co-amplified and identified by hybridization to a standard tag oligonucleotide array. The technique was illustrated by analyzing normal and pathogenic variation within the Wilson disease-related ATP7B gene, both at the level of DNA and RNA, using allele-specific padlock probes.