The replication of eukaryotic chromosomes is organized temporally and spatially within the nucleus through epigenetic regulation of replication origin function. The characteristic initiation timing of specific origins is thought to reflect their chromatin environment or sub-nuclear positioning; however, the mechanism remains obscure. Here we show that the yeast Forkhead transcription factors, Fkh1 and Fkh2, are global determinants of replication origin timing. Forkhead regulation of origin timing is independent of local levels or changes of transcription. Instead, we show that Fkh1 and Fkh2 are required for the clustering of early origins and their association with the key initiation factor Cdc45 in G1-phase, suggesting that Fkh1 and Fkh2 selectively recruit origins to emergent replication factories. Fkh1 and Fkh2 bind Fkh-activated origins, and interact physically with ORC, providing a plausible mechanism to cluster origins. These findings add a new dimension to our understanding of the epigenetic basis for differential origin regulation and its connection to chromosomal domain organization.
Replication origin timing; chromatin; Forkhead; Fox; centromere; telomere; chromosome-conformation; Cdc45; epigenetics; transcription; nuclear architecture
Dynamic activity of signaling pathways, such as Notch, is vital to achieve correct development and homeostasis. However, most studies assess output many hours or days after initiation of signaling, once the outcome has been consolidated. Here we analyze genome-wide changes in transcript levels, binding of the Notch pathway transcription factor, CSL [Suppressor of Hairless, Su(H), in Drosophila], and RNA Polymerase II (Pol II) immediately following a short pulse of Notch stimulation. A total of 154 genes showed significant differential expression (DE) over time, and their expression profiles stratified into 14 clusters based on the timing, magnitude, and direction of DE. E(spl) genes were the most rapidly upregulated, with Su(H), Pol II, and transcript levels increasing within 5–10 minutes. Other genes had a more delayed response, the timing of which was largely unaffected by more prolonged Notch activation. Neither Su(H) binding nor poised Pol II could fully explain the differences between profiles. Instead, our data indicate that regulatory interactions, driven by the early-responding E(spl)bHLH genes, are required. Proposed cross-regulatory relationships were validated in vivo and in cell culture, supporting the view that feed-forward repression by E(spl)bHLH/Hes shapes the response of late-responding genes. Based on these data, we propose a model in which Hes genes are responsible for co-ordinating the Notch response of a wide spectrum of other targets, explaining the critical functions these key regulators play in many developmental and disease contexts.
Signaling via the Notch pathway conveys important information that helps to shape tissues and, when misused, contributes to diseases. Cells respond to the Notch signal by changing which genes are transcribed. Most previous studies have looked at changes in gene activity at a single time point, long after the start of signaling. By looking at carefully timed intervals immediately after Notch pathway activation, we have been able to follow the dynamic changes in transcription of all the genes and have found that they exhibit different patterns of activity. For example, activity of some genes, especially a previously characterised family called the E(spl) genes, starts very early, whereas others show more delayed upregulation. Our investigations into the underlying mechanisms reveal that cross-regulatory interactions driven by the early genes are required to shape the timing of the delayed response. This feed-forward mechanism is important because it explains why the E(spl)/Hes genes can play such a pivotal role in the Notch response, despite the fact that many other genes are regulated by the signal, a finding that will be valuable for understanding the contribution of E(spl)/Hes genes in diseases associated with altered Notch.
The elucidation of breast cancer subgroups and their molecular drivers requires integrated views of the genome and transcriptome from representative numbers of patients. We present an integrated analysis of copy number and gene expression in a discovery and validation set of 997 and 995 primary breast tumours, respectively, with long-term clinical follow-up. Inherited variants (copy number variants and single nucleotide polymorphisms) and acquired somatic copy number aberrations (CNAs) were associated with expression in ~40% of genes, with the landscape dominated by cis- and trans-acting CNAs. By delineating expression outlier genes driven in cis by CNAs, we identified putative cancer genes, including deletions in PPP2R2A, MTAP and MAP2K4. Unsupervised analysis of paired DNA–RNA profiles revealed novel subgroups with distinct clinical outcomes, which reproduced in the validation cohort. These include a high-risk, oestrogen-receptor-positive 11q13/14 cis-acting subgroup and a favourable prognosis subgroup devoid of CNAs. Trans-acting aberration hotspots were found to modulate subgroup-specific gene networks, including a TCR deletion-mediated adaptive immune response in the ‘CNA-devoid’ subgroup and a basal-specific chromosome 5 deletion-associated mitotic network. Our results provide a novel molecular stratification of the breast cancer population, derived from the impact of somatic CNAs on the transcriptome.
Phylogeographic methods have attracted a lot of attention in recent years, stressing the need to provide a solid statistical framework for many existing methodologies so as to draw statistically reliable inferences. Here, we take a flexible fully Bayesian approach by reducing the problem to a clustering framework, whereby the population distribution can be explained by a set of migrations, forming geographically stable population clusters. These clusters are such that they are consistent with a fixed number of migrations on the corresponding (unknown) subdivided coalescent tree. Our methods rely upon a clustered population distribution, and allow for inclusion of various covariates (such as phenotype or climate information) at little additional computational cost. We illustrate our methods with an example from weevil mitochondrial DNA sequences from the Iberian peninsula.
migration; coalescent; subdivided population; island model; Markov chain Monte Carlo; reversible jump
Protein synthesis and autophagic degradation are regulated in an opposite manner by mammalian target of rapamycin (mTOR), whereas under certain conditions it would be beneficial if they occurred in unison to handle rapid protein turnover. We observed a distinct cellular compartment at the trans-side of the Golgi apparatus, the ‘TOR-autophagy spatial coupling compartment’ (TASCC), where (auto)lysosomes and mTOR accumulated during Ras-induced senescence. mTOR recruitment to the TASCC was amino acid- and Rag guanosine triphosphatase (GTPase)-dependent, and disruption of mTOR localization to the TASCC suppressed interleukin-6/8 synthesis. TASCC-formation was observed during macrophage differentiation and in glomerular podocytes; both displayed increased protein secretion. The spatial coupling of cells’ catabolic and anabolic machinery could augment their respective functions and facilitate the mass synthesis of secretory proteins.
Sample tracking errors have been and always will be a part of the practical implementation of large experiments. It has recently been proposed that expression quantitative trait loci (eQTLs) and their associated effects could be used to identify sample mix-ups, and this approach has been applied to a number of large population genomics studies to illustrate the prevalence of the problem. We adopted a similar approach, termed ‘BADGER’, in the METABRIC project. METABRIC is a large breast cancer study that may have been the first in which eQTL-based detection of mismatches was used during the study, rather than after the event, to aid quality assurance. We report here on the particular issues associated with large cancer studies performed using historical samples, which complicate the interpretation of such approaches. In particular we identify the complications of using tumour samples, of considering cellularity and RNA quality, of distinct subgroups existing in the study population (including family structures), and of choosing eQTLs to use. We also present some results regarding the design of experiments given consideration of these matters. The eQTL-based approach to identifying sample tracking errors is seen to be of value to these studies, but it requires care in its implementation.
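The general idea of eQTL-based mismatch detection can be illustrated with a minimal sketch. It is not the BADGER implementation: the function names, the toy data, and the use of Pearson correlation as the matching score are all assumptions made for illustration. It also assumes that expression values have already been scaled and oriented so that they increase with allele count at each eQTL. Each expression profile is compared against every genotype profile, and a sample is flagged when its best match is not its own label.

```python
from statistics import fmean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def best_genotype_match(expression, genotypes):
    """Map each expression label to the genotype label it correlates with best."""
    return {
        label: max(genotypes, key=lambda g: pearson(profile, genotypes[g]))
        for label, profile in expression.items()
    }


def flag_mismatches(expression, genotypes):
    """Return only the samples whose best-matching genotype carries a different label."""
    matches = best_genotype_match(expression, genotypes)
    return {label: best for label, best in matches.items() if best != label}


# Hypothetical toy data: allele counts (0/1/2) at six eQTL SNPs for three samples.
genotypes = {
    "A": [0, 1, 2, 0, 1, 2],
    "B": [2, 2, 0, 1, 0, 1],
    "C": [1, 0, 1, 2, 2, 0],
}
# Expression at the six linked genes; the profiles labelled B and C have been swapped.
expression = {
    "A": [0.1, 1.0, 1.9, 0.2, 1.1, 2.0],
    "B": [1.1, 0.1, 0.9, 2.0, 1.8, 0.2],
    "C": [2.1, 1.9, 0.1, 1.0, 0.2, 0.9],
}

print(flag_mismatches(expression, genotypes))
```

In tumour data the complications listed above (cellularity, RNA quality, copy number changes, related individuals) would weaken or distort these correlations, so a real implementation would need a more careful score and a calibrated threshold rather than a simple best match.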
In vivo imaging and quantification of fluorescent reporter molecules is increasingly useful in biomedical research. For example, tracking animal movement in 3D with simultaneous quantification of fluorescent transgenic reporters allows for correlations between behavior, aging and gene expression. However, implementation has been hindered in the past by the complexity of operating the systems.
We report significant technical improvements and user-friendly software (called FluoreScore) that enables tracking of 3D movement and the dynamics of gene expression in adult Drosophila, using two cameras and recorded GFP videos. Expression of a transgenic construct encoding eGFP was induced in free-moving adult flies using the Gene-Switch system and RU486 drug feeding. The time course of induction of eGFP expression was readily quantified from internal tissues including central nervous tissue.
FluoreScore should facilitate a variety of future studies involving quantification of movement behaviors and fluorescent molecules in free-moving animals.
The Piwi proteins of the Argonaute superfamily are required for normal germline development in Drosophila, zebrafish and mice, and associate with 24-30 nucleotide RNAs termed piRNAs. We identify a class of 21 nucleotide RNAs, previously named 21U-RNAs, as the piRNAs of C. elegans. Piwi and piRNA expression is restricted to the male and female germline and independent of many proteins in other small RNA pathways, including DCR-1. We show that Piwi is specifically required to silence Tc3, but not other Tc/mariner DNA transposons. Tc3 excision rates in the germline are increased at least 100-fold in piwi mutants compared with wild type. We find no evidence for a Ping-Pong model for piRNA amplification in C. elegans. Instead, we demonstrate that Piwi acts upstream of an endogenous siRNA pathway in Tc3 silencing. These data suggest a link between piRNA and siRNA function.
Estimation of divergence times is usually done using either the fossil record or sequence data from modern species. We provide an integrated analysis of palaeontological and molecular data to give estimates of primate divergence times that utilize both sources of information. The number of preserved primate species discovered in the fossil record, along with their geological age distribution, is combined with the number of extant primate species to provide initial estimates of the primate and anthropoid divergence times. This is done by using a stochastic forwards-modeling approach where speciation and fossil preservation and discovery are simulated forward in time. We use the posterior distribution from the fossil analysis as a prior distribution on node ages in a molecular analysis. Sequence data from two genomic regions (CFTR on human chromosome 7 and the CYP7A1 region on chromosome 8) from 15 primate species are used with the birth–death model implemented in mcmctree in PAML to infer the posterior distribution of the ages of 14 nodes in the primate tree. We find that these age estimates are older than previously reported dates for all but one of these nodes. To perform the inference, a new approximate Bayesian computation (ABC) algorithm is introduced, where the structure of the model can be exploited in an ABC-within-Gibbs algorithm to provide a more efficient analysis.
Approximate Bayesian computation; molecular phylogeny; palaeontological data; primate divergence
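For readers unfamiliar with ABC, the basic rejection form of the method can be sketched as follows. This is plain ABC rejection on a hypothetical toy problem (inferring the mean of a normal distribution), not the ABC-within-Gibbs algorithm or the speciation and fossil-preservation model described above. The function name, prior, distance, and tolerance are all illustrative assumptions.

```python
import random
import statistics


def abc_rejection(observed, simulate, prior_sample, distance, tolerance, n_draws, rng):
    """Keep the prior draws whose simulated data lie within `tolerance` of the observed data."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)          # draw a parameter from the prior
        simulated = simulate(theta, rng)   # forward-simulate data under that parameter
        if distance(simulated, observed) <= tolerance:
            accepted.append(theta)         # retained draws approximate the posterior
    return accepted


# Toy data: 50 observations from a normal distribution with unknown mean.
rng = random.Random(0)
observed = [rng.gauss(3.0, 1.0) for _ in range(50)]

posterior = abc_rejection(
    observed=observed,
    simulate=lambda theta, r: [r.gauss(theta, 1.0) for _ in range(50)],
    prior_sample=lambda r: r.uniform(0.0, 10.0),
    # The summary statistic is the sample mean; the distance is its absolute difference.
    distance=lambda a, b: abs(statistics.fmean(a) - statistics.fmean(b)),
    tolerance=0.2,
    n_draws=5000,
    rng=rng,
)

print(len(posterior), statistics.fmean(posterior))
```

The acceptance rate falls quickly as the tolerance shrinks or the number of parameters grows, which is the inefficiency that schemes exploiting model structure, such as updating one block of parameters at a time, aim to reduce.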
It is possible to infer the past of populations by comparing genomes between individuals. In general, older populations have more genomic diversity than younger populations. The force of selection can also be inferred from population diversity. If selection is strong and frequently eliminates less fit variants, diversity will be limited because new, initially homogeneous populations constantly emerge.
Methodology and Results
Here we translate a population genetics approach to human somatic cancer cell populations by measuring genomic diversity within and between small colorectal cancer (CRC) glands. Control tissue culture and xenograft experiments demonstrate that the population diversity of certain passenger DNA methylation patterns is reduced after cloning but subsequently increases with time. When measured in CRC gland populations, passenger methylation diversity from different parts of nine CRCs was relatively high and uniform, consistent with older, stable lineages rather than mixtures of younger homogeneous populations arising from frequent cycles of selection. The diversity of six metastases was also high, suggesting dissemination early after transformation. Diversity was lower in DNA mismatch repair deficient CRC glands, possibly suggesting more selection and the elimination of less fit variants when mutation rates are elevated.
The many hitchhiking passenger variants observed in primary and metastatic CRC cell populations are consistent with relatively old populations, suggesting that clonal evolution leading to selective sweeps may be rare after transformation. Selection in human cancers appears to be a weaker force than presumed after transformation, consistent with the observed rarity of driver mutations in cancer genomes. Phenotypic plasticity rather than the stepwise acquisition of new driver mutations may better account for the many different phenotypes within human tumors.
The cancer stem cell (CSC) concept is a highly debated topic in cancer research.
While experimental evidence in favor of the cancer stem cell theory is
apparently abundant, the results are often criticized as being difficult to
interpret. An important reason for this is that most experimental data that
support this model rely on transplantation studies. In this study we use a novel
cellular Potts model to elucidate the dynamics of established malignancies that
are driven by a small subset of CSCs. Our results demonstrate that epigenetic
mutations that occur during mitosis display highly altered dynamics in
CSC-driven malignancies compared to a classical, non-hierarchical model of
growth. In particular, the heterogeneity observed in CSC-driven tumors is
considerably higher. We speculate that this feature could be used in combination
with epigenetic (methylation) sequencing studies of human malignancies to prove
or refute the CSC hypothesis in established tumors without the need for
transplantation. Moreover, our tumor growth simulations indicate that CSC-driven
tumors display evolutionary features that can be considered beneficial during
tumor progression. Besides increased heterogeneity, they also exhibit
properties that allow the escape of clones from local fitness peaks. This leads
to more aggressive phenotypes in the long run and makes the neoplasm more
adaptable to stringent selective forces such as cancer treatment. Indeed, when
therapy is applied the clone landscape of the regrown tumor is more aggressive
with respect to the primary tumor, whereas the classical model demonstrated
similar patterns before and after therapy. Understanding these often
counter-intuitive fundamental properties of (non-)hierarchically organized
malignancies is a crucial step in validating the CSC concept as well as
providing insight into the therapeutical consequences of this model.
Cancer is in essence a genetic disease that leads to uncontrolled cell
proliferation, invasion and metastasis. The cancer stem cell (CSC) hypothesis
states that tumors are not just a mass of uniform malignant cells but they are
hierarchically organized, like normal tissues. At the top of such a hierarchy
are cancer stem cells that fuel tumor growth in the long run, whereas the
majority of other cells are able to divide only a few times. The experiments
that support the CSC hypothesis are often criticized as being difficult to
interpret. A novel approach to test the CSC paradigm is to integrate
mathematical modeling with DNA variation data that carry the phylogenetic
history of cells. We have developed a model that simulates the occurrence of
such changes under both the CSC hypothesis and the classical, purely stochastic
scenario. We found that although a CSC-driven tumor has a smaller number of
tumorigenic cells, it triggers more malignant properties such as invasive
growth, heterogeneity and evolutionary escape from peaks in the fitness
landscape. These properties, which are unique to the CSC model, are enhanced even
further when a treatment is applied to the tumor.
Motivation: Identification of genomic regions of interest in ChIP-seq data, commonly referred to as peak-calling, aims to find the locations of transcription factor binding sites, modified histones or nucleosomes. The BayesPeak algorithm was developed to model the data structure using Bayesian statistical techniques and was shown to be a reliable method, but did not have a full-genome implementation.
Results: In this note we present BayesPeak, an R package for genome-wide peak-calling that provides a flexible implementation of the BayesPeak algorithm and is compatible with downstream BioConductor packages. The BayesPeak package introduces a new method for summarizing posterior probability output, along with methods for handling overfitting and support for parallel processing. We briefly compare the package with other common peak-callers.
Availability: Available as part of BioConductor version 2.6. URL: http://bioconductor.org/packages/release/bioc/html/BayesPeak.html
Supplementary information: Supplementary data are available at Bioinformatics online.
The demands of microarray expression technologies for quantities of RNA place a limit on the questions they can address. As a consequence, the RNA requirements have reduced over time as technologies have improved. In this paper we investigate the costs of reducing the starting quantity of RNA for the Illumina BeadArray platform. This we do via a dilution data set generated from two reference RNA sources that have become the standard for investigations into microarray and sequencing technologies.
We find that the starting quantity of RNA has an effect on observed intensities despite the fact that the quantity of cRNA being hybridized remains constant. We see a loss of sensitivity when using lower quantities of RNA, but no great rise in the false positive rate. Even with 10 ng of starting RNA, the positive results are reliable although many differentially expressed genes are missed. We see that there is some scope for combining data from samples that have contributed differing quantities of RNA, but note also that sample sizes should increase to compensate for the loss of signal-to-noise when using low quantities of starting RNA.
The BeadArray platform maintains a low false discovery rate even when small amounts of starting RNA are used. In contrast, the sensitivity of the platform drops off noticeably over the same range. Thus, those conducting experiments should not opt for low quantities of starting RNA without consideration of the costs of doing so. The implications for experimental design, and the integration of data from different starting quantities, are complex.
The amplification of millions of single molecules in parallel can be carried out on microscopic magnetic beads contained in aqueous compartments of an oil-buffer emulsion. These bead-emulsion amplification (BEA) reactions result in beads covered by almost identical copies derived from a single template. The post-PCR analysis is carried out using different fluorophore-labeled probes. We have identified BEA reaction conditions that efficiently produce longer amplicons of up to 450 base pairs. These conditions include the use of a Titanium Taq amplification system. Second, we explored alternate fluorophores coupled to probes for post-PCR DNA analysis. We demonstrate that four different Alexa fluorophores can be used simultaneously with extremely low crosstalk. Finally, we developed an allele-specific extension chemistry based on Alexa dyes to query individual nucleotides of the amplified material that is both highly efficient and specific.
High-throughput measurement of allele-specific expression (ASE) is a relatively new and exciting application area for array-based technologies. In this paper, we explore several data sets which make use of Illumina's GoldenGate BeadArray technology to measure ASE. This platform exploits coding SNPs to obtain relative expression measurements for alleles at approximately 1500 positions in the genome.
We analyze data from a mixture experiment where genomic DNA samples from pairs of individuals of known genotypes are pooled to create allelic imbalances at varying levels for the majority of SNPs on the array. We observe that GoldenGate has less sensitivity at detecting subtle allelic imbalances (around 1.3 fold) compared to extreme imbalances, and note the benefit of applying local background correction to the data. Analysis of data from a dye-swap control experiment allowed us to quantify dye-bias, which can be reduced considerably by careful normalization. The need to filter the data before carrying out further downstream analysis to remove non-responding probes, which show either weak, or non-specific signal for each allele, was also demonstrated. Throughout this paper, we find that a linear model analysis of the data from each SNP is a flexible modelling strategy that allows for testing of allelic imbalances in each sample when replicate hybridizations are available.
Our analysis shows that local background correction carried out by Illumina's software, together with quantile normalization of the red and green channels within each array, provides optimal performance in terms of false positive rates. In addition, we strongly encourage intensity-based filtering to remove SNPs which only measure non-specific signal. We anticipate that a similar analysis strategy will prove useful when quantifying ASE on Illumina's higher density Infinium BeadChips.
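Quantile normalization of two channels can be sketched as below. This is a generic, minimal version written for illustration, not Illumina's software or the code used in the analysis: the function name and intensity values are hypothetical, and ties are broken by position rather than averaged. Each value is replaced by the mean, across channels, of the values sharing its rank, so that both channels end up with the same distribution.

```python
def quantile_normalize(channels):
    """Give every channel the same distribution: the across-channel mean at each rank."""
    n = len(channels[0])
    if any(len(c) != n for c in channels):
        raise ValueError("all channels must have the same length")
    # Reference distribution: average of the sorted channels, rank by rank.
    sorted_channels = [sorted(c) for c in channels]
    reference = [sum(col) / len(col) for col in zip(*sorted_channels)]
    normalized = []
    for c in channels:
        # Indices of this channel's values in increasing order (ties broken by position).
        order = sorted(range(n), key=lambda i: c[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical red and green intensities for four SNPs on one array.
red = [5.0, 2.0, 3.0, 4.0]
green = [4.0, 1.0, 4.5, 2.0]
norm_red, norm_green = quantile_normalize([red, green])

print(norm_red)    # [4.75, 1.5, 2.5, 4.0]
print(norm_green)  # [4.0, 1.5, 4.75, 2.5]
```

After normalization the two channels contain the same set of values in different orders, which removes channel-wide intensity differences such as dye bias while preserving the rank of each SNP within its channel.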
A key stage for all microarray analyses is the extraction of feature-intensities from an image. If this step goes wrong, then subsequent preprocessing and processing stages will stand little chance of rectifying the matter. Illumina employ random construction of their BeadArrays, making feature-intensity extraction even more important for the Illumina platform than for other technologies. In this paper we show that using raw Illumina data it is possible to identify, control, and perhaps correct for a range of spatial-related phenomena that affect feature-intensity extraction.
We note that feature intensities can be unnaturally high when in the proximity of a number of phenomena relating either to the images themselves or to the layout of the beads on an array. Additionally we note that beads neighbour beads of the same type more often than one might expect, which may cause concern in some models of hybridization. We highlight issues in the identification of a bead's location, and in particular how this both affects and is affected by its intensity. Finally we show that beads can be wrongly identified in the image on either a local or array-wide scale, with obvious implications for data quality.
The image processing issues identified will often pass unnoticed by an analysis of the standard data returned from an experiment. We detail some simple diagnostics that can be implemented to identify problems of this nature, and outline approaches to correcting for such problems. These approaches require access to the raw data from the arrays, not just the summarized data usually returned, making the acquisition of such raw data highly desirable.
Imprinted genes show expression from one parental allele only and are important for development and behaviour. This extreme mode of allelic imbalance has been described for approximately 56 human genes. Imprinting status is often disrupted in cancer and dysmorphic syndromes. More subtle variation in gene expression that is not parent-of-origin specific, termed 'allele-specific gene expression' (ASE), is more common and may give rise to milder phenotypic differences. Using two allele-specific high-throughput technologies alongside bioinformatics predictions, normal term human placenta was screened to find new imprinted genes and to ascertain the extent of ASE in this tissue.
Twenty-three family trios of placental cDNA, placental genomic DNA (gDNA) and gDNA from both parents were tested for 130 candidate genes with the Sequenom MassArray system. Six genes were found differentially expressed but none imprinted. The Illumina ASE BeadArray platform was then used to test 1536 SNPs in 932 genes. The array was enriched for the human orthologues of 124 mouse candidate genes from bioinformatics predictions and 10 human candidate imprinted genes from EST database mining. After quality control pruning, a total of 261 informative SNPs (214 genes) remained for analysis. Imprinting with maternal expression was demonstrated for the lymphocyte imprinted gene ZNF331 in human placenta. Two potential differentially methylated regions (DMRs) were found in the vicinity of ZNF331. None of the bioinformatically predicted candidates tested showed imprinting except for a skewed allelic expression in a parent-specific manner observed for PHACTR2, a neighbour of the imprinted PLAGL1 gene. ASE was detected for two or more individuals in 39 candidate genes (18%).
Both Sequenom and Illumina assays were sensitive enough to study imprinting and strong allelic bias. Previous bioinformatics approaches were not predictive of new imprinted genes in the human term placenta. ZNF331 is imprinted in human term placenta and might be a new ubiquitously imprinted gene, part of a primate-specific locus. Demonstration of partial imprinting of PHACTR2 calls for re-evaluation of the allelic pattern of expression for the PHACTR2-PLAGL1 locus. ASE was common in human term placenta.
miR-124 is a highly conserved microRNA (miRNA) whose in vivo function is poorly understood. Here, we identify miR-124 targets based on the analysis of the first mir-124 mutant in any organism. We find that miR-124 is expressed in many sensory neurons in Caenorhabditis elegans and onset of expression coincides with neuronal morphogenesis. We analyzed the transcriptome of miR-124 expressing and nonexpressing cells from wild-type and mir-124 mutants. We observe that many targets are co-expressed with and actively repressed by miR-124. These targets are expressed at reduced relative levels in sensory neurons compared to the rest of the animal. Our data from mir-124 mutant animals show that this effect is due to a large extent to the activity of miR-124. Genes with nonconserved target sites show reduced absolute expression levels in sensory neurons. In contrast, absolute expression levels of genes with conserved sites are comparable to control genes, suggesting a tuning function for many of these targets. We conclude that miR-124 contributes to defining cell-type-specific gene activity by repressing a diverse set of co-expressed genes.
The accurate and high resolution mapping of DNA copy number aberrations has become an important tool by which to gain insight into the mechanisms of tumourigenesis. There are various commercially available platforms for such studies, but there remains no general consensus as to the optimal platform. There have been several previous platform comparison studies, but they have either described older technologies, used less-complex samples, or have not addressed the issue of the inherent biases in such comparisons. Here we describe a systematic comparison of data from four leading microarray technologies (the Affymetrix Genome-wide SNP 5.0 array, Agilent High-Density CGH Human 244A array, Illumina HumanCNV370-Duo DNA Analysis BeadChip, and the Nimblegen 385 K oligonucleotide array). We compare samples derived from primary breast tumours and their corresponding matched normals, well-established cancer cell lines, and HapMap individuals. By careful consideration and avoidance of potential sources of bias, we aim to provide a fair assessment of platform performance.
A theoretical assessment of the reproducibility, noise, and sensitivity of each platform revealed notable differences. Nimblegen exhibited between-replicate array variances an order of magnitude greater than the other three platforms, with Agilent slightly outperforming the others, and a comparison of self-self hybridizations revealed similar patterns. An assessment of the single probe power revealed that Agilent exhibits the highest sensitivity. Additionally, we performed an in-depth visual assessment of the ability of each platform to detect aberrations of varying sizes. As expected, all platforms were able to identify large aberrations in a robust manner. However, some focal amplifications and deletions were only detected in a subset of the platforms.
Although there are substantial differences in the design, density, and number of replicate probes, the comparison indicates a generally high level of concordance between platforms, despite differences in the reproducibility, noise, and sensitivity. In general, Agilent tended to be the best aCGH platform and Affymetrix, the superior SNP-CGH platform, but for specific decisions the results described herein provide a guide for platform selection and study design, and the dataset a resource for more tailored comparisons.
The differential expression pattern of microRNAs (miRNAs) during mammary gland development might provide insights into their role in regulating the homeostasis of the mammary epithelium. Our aim was to analyse these regulatory functions by deriving a comprehensive tissue-specific combined miRNA and mRNA expression profile of post-natal mouse mammary gland development.
We measured the expression of 318 individual murine miRNAs by bead-based flow-cytometric profiling of whole mouse mammary glands throughout a 16-point developmental time course, including juvenile, puberty, mature virgin, gestation, lactation, and involution stages. In parallel whole-genome mRNA expression data were obtained.
One third (n = 102) of all murine miRNAs analysed were detected during mammary gland development. MicroRNAs were represented in seven temporally co-expressed clusters, which were enriched for both miRNAs belonging to the same family and breast cancer-associated miRNAs. Global miRNA and mRNA expression was significantly reduced during lactation and the early stages of involution after weaning. For most detected miRNA families we did not observe systematic changes in the expression of predicted targets. For miRNA families whose targets did show changes, we observed inverse patterns of miRNA and target expression. The data sets are made publicly available and the combined expression profiles represent an important community resource for mammary gland biology research.
MicroRNAs were expressed in likely co-regulated clusters during mammary gland development. Breast cancer-associated miRNAs were significantly enriched in these clusters. The mechanism and functional consequences of this miRNA co-regulation provide new avenues for research into mammary gland biology and generate candidates for functional validation.
Illumina BeadArrays are among the most popular and reliable platforms for gene expression profiling. However, little external scrutiny has been given to the design, selection and annotation of BeadArray probes, which is a fundamental issue in data quality and interpretation. Here we present a pipeline for the complete genomic and transcriptomic re-annotation of Illumina probe sequences, also applicable to other platforms, with its output available through a Web interface and incorporated into Bioconductor packages. We have identified several problems with the design of individual probes and we show the benefits of probe re-annotation on the analysis of BeadArray gene expression data sets. We discuss the importance of aspects such as probe coverage of individual transcripts, alternative messenger RNA splicing, single-nucleotide polymorphisms, repeat sequences, RNA degradation biases and probes targeting genomic regions with no known transcription. We conclude that many of the Illumina probes have unreliable original annotation and that our re-annotation allows analyses to focus on the good quality probes, which form the majority, and also to expand the scope of biological information that can be extracted.
and mutant forms of p53 affect life span in Drosophila, nematodes and mice; however, the role of wild-type p53 in aging remains unclear. Here conditional over-expression of both wild-type and mutant p53 transgenes indicated that, in adult flies, p53 limits life span in females but favors life span in males. In contrast, during larval development, moderate over-expression of p53 produced both male and female adults with increased life span. Mutations of the endogenous p53 gene also had sex-specific effects on life span under control and stress conditions: null mutation of p53 increased life span in females, and had smaller, more variable effects in males. These developmental stage-specific and sex-specific effects of p53 on adult life span are consistent with a sexual antagonistic pleiotropy model.
aging; sexual conflict; Geneswitch; maternal effects; tumor suppressor
Circadian rhythms in animals are regulated at the level of individual cells and by systemic signaling to coordinate the activities of multiple tissues. The circadian pacemakers have several physiological outputs, including daily locomotor rhythms. Several redox-active compounds have been found to function in regulation of circadian rhythms in cells; however, how particular compounds might be involved in regulating specific animal behaviors remains largely unknown. Here the effects of hydrogen peroxide on Drosophila movement were analyzed using a recently developed three-dimensional real-time multiple fly tracking assay. Both hydrogen peroxide feeding and direct injection of hydrogen peroxide caused increased adult fly locomotor activity. Continuous treatment with hydrogen peroxide also suppressed daily locomotor rhythms. Conditional over-expression of the hydrogen peroxide-producing enzyme superoxide dismutase (SOD) also increased fly activity and altered the patterns of locomotor activity across days and weeks. The real-time fly tracking system allowed for detailed analysis of the effects of these manipulations on behavior. For example, both hydrogen peroxide feeding and SOD over-expression increased all fly motion parameters; however, hydrogen peroxide feeding caused relatively more erratic movement, whereas SOD over-expression produced relatively faster-moving flies. Taken together, the data demonstrate that hydrogen peroxide has dramatic effects on fly movement and daily locomotor rhythms, and implicate hydrogen peroxide in the normal control of these processes.
In this paper, the design of a real-time image acquisition system for tracking the movement of Drosophila in three-dimensional space is presented. The system uses three calibrated and synchronized cameras to detect multiple flies and integrates the detected fly silhouettes to construct the three-dimensional visual hull models of each fly. We used an extended Kalman filter to estimate the state of each fly, given past positions from the reconstructed fly visual hulls. The results show that our approach constructs the three-dimensional visual hull of each fly from the detected image silhouettes and robustly tracks them at real-time rates. The system is suitable for a more detailed analysis of fly behaviour.
real-time three-dimensional tracking; Drosophila activity monitoring; visual hull construction; extended Kalman filtering
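The state-estimation step described above can be illustrated with a much simpler filter than the one in the paper. The sketch below is not the authors' extended Kalman filter: it assumes a linear constant-velocity motion model, treats the three coordinate axes as independent, and assumes each measurement is a 3D centroid taken from a reconstructed visual hull. The class name, function name, and noise parameters `q` and `r` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one axis (state: pos, vel)."""
    pos: float = 0.0
    vel: float = 0.0
    # 2x2 state covariance, stored entry by entry; large = uncertain start.
    p00: float = 10.0
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 10.0

    def predict(self, dt: float, q: float) -> None:
        # x <- F x with F = [[1, dt], [0, 1]]
        self.pos += self.vel * dt
        # P <- F P F^T + q * I (simplified process noise, an assumption)
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11 + q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11

    def update(self, z: float, r: float) -> None:
        # Only position is measured: H = [1, 0].
        s = self.p00 + r                      # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s   # Kalman gain
        resid = z - self.pos
        self.pos += k0 * resid
        self.vel += k1 * resid
        # P <- (I - K H) P
        p00 = (1 - k0) * self.p00
        p01 = (1 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11


def track(measurements, dt, q=1e-4, r=1e-2):
    """Filter a sequence of (x, y, z) centroids for a single fly."""
    axes = [AxisKalman() for _ in range(3)]
    estimates = []
    for xyz in measurements:
        for axis_filter, z in zip(axes, xyz):
            axis_filter.predict(dt, q)
            axis_filter.update(z, r)
        estimates.append(tuple(f.pos for f in axes))
    return estimates


# Synthetic straight-line flight sampled every 0.1 s (made-up data).
path = [(0.5 * k * 0.1, -0.2 * k * 0.1, 0.1 * k * 0.1) for k in range(1, 201)]
print(track(path, dt=0.1)[-1])
```

Tracking several flies at once would additionally need a data-association step that assigns each reconstructed hull to a filter, which this sketch leaves out.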
Chromatin immunoprecipitation on tiling arrays (ChIP-chip) has been employed to examine features such as protein binding and histone modifications on a genome-wide scale in a variety of cell types. Array data from the latter studies typically have a high proportion of enriched probes whose signals vary considerably (due to heterogeneity in the cell population), and this makes their normalization and downstream analysis difficult.
Here we present strategies for analyzing such experiments, focusing our discussion on the analysis of bromodeoxyuridine (BrdU) immunoprecipitation on tiling array (BrdU-IP-chip) datasets. BrdU-IP-chip experiments map large, recently replicated genomic regions and have similar characteristics to histone modification/location data. To prepare such data for downstream analysis we employ a dynamic programming algorithm that identifies a set of putative unenriched probes, which we use for both within-array and between-array normalization. We also introduce a second dynamic programming algorithm that incorporates a priori knowledge to identify and quantify positive signals in these datasets.
Highly enriched IP-chip datasets are often difficult to analyze with traditional array normalization and analysis strategies. Here we present and test a set of analytical tools for their normalization and quantification that allows for accurate identification and analysis of enriched regions.
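Once a set of putative unenriched probes is available, it can anchor normalization both within and between arrays. The abstract does not give the normalization formula, so the following is only a sketch under stated assumptions: the unenriched probe indices are taken as already known (the dynamic programming step that finds them is not reproduced), within-array normalization is assumed to be median centering, and between-array normalization is assumed to be scaling by the median absolute deviation. The function name and the example values are hypothetical.

```python
from statistics import median


def normalize_array(log_ratios, unenriched_idx):
    """Center and scale one array using its putative unenriched probes.

    log_ratios     -- per-probe log(IP / input) values for one array
    unenriched_idx -- indices of probes assumed to carry background signal
    """
    background = [log_ratios[i] for i in unenriched_idx]
    # Within-array step (assumed): background median becomes zero.
    center = median(background)
    # Between-array step (assumed): background spread becomes one, so
    # arrays processed the same way share a common scale.
    spread = median(abs(x - center) for x in background)
    if spread == 0:
        spread = 1.0  # avoid dividing by zero on a flat background
    return [(x - center) / spread for x in log_ratios]


# Made-up example: probes 2 and 3 are enriched, the rest are background.
array_a = [0.1, 0.3, 2.0, 2.5, 0.2, -0.1]
print(normalize_array(array_a, unenriched_idx=[0, 1, 4, 5]))
```

Median-based statistics are used here because they are less sensitive to any enriched probes that are mistakenly included in the background set.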