Results 1-21 (21)

1.  Heptamethine Cyanine Based 64Cu - PET Probe PC-1001 for Cancer Imaging: Synthesis and In Vivo Evaluation 
Nuclear medicine and biology  2013;40(3):351-360.
Development of a heptamethine cyanine based tumor-targeting PET imaging probe for noninvasive detection and diagnosis of breast cancer.
Tumor-specific heptamethine-cyanine DOTA conjugate complexed with Cu-64 (PC-1001) was synthesized for breast cancer imaging. In vitro cellular uptake studies were performed in the breast cancer MCF-7 and noncancerous breast epithelial MCF-10A cell lines to establish tumor specificity. In vivo time-dependent fluorescence and PET imaging of breast tumor xenografts in mice were performed. Blood clearance, biodistribution, and tumor-specific uptake and plasma binding of PC-1001 were quantified. Tumor histology (H&E staining) and fluorescence imaging were examined.
PC-1001 displayed fluorescence properties (ε=82,880 cm⁻¹ M⁻¹, Ex/Em=750/820 nm) similar to those of the parental dye. Time-dependent cellular accumulation indicated significantly higher probe uptake (>2-fold, 30 min) in MCF-7 than in MCF-10A cells, and the uptake was observed to be mediated by the organic anion-transporting polypeptide (OATP) system. In vivo studies revealed that PC-1001 has a desirable accumulation profile in tumor tissues, with tumor versus muscle uptake of about 4.3-fold at 24 h and 5.8-fold at 48 h after probe injection. The blood half-life of PC-1001 was observed to be 4.3±0.2 h. Microscopic fluorescence imaging of harvested tumor indicated that the uptake of PC-1001 was restricted to viable rather than necrotic tumor cells.
A highly efficient tumor-targeting PET/fluorescence imaging probe, PC-1001, was synthesized and validated in vitro in MCF-7 breast cancer cells and in vivo in a mouse breast cancer xenograft model.
PMCID: PMC3643142  PMID: 23375364
Breast cancer; Cu-64; PET imaging probe; tumor-targeting; xenograft
2.  Estimating species richness by a Poisson-compound gamma model 
Biometrika  2010;97(3):727-740.
We propose a Poisson-compound gamma approach for species richness estimation. Based on the denseness and nesting properties of the gamma mixture, we fix the shape parameter of each gamma component at a unified value, and estimate the mixture using nonparametric maximum likelihood. A least-squares crossvalidation procedure is proposed for the choice of the common shape parameter. The performance of the resulting estimator of N is assessed using numerical studies and genomic data.
PMCID: PMC3372246  PMID: 22822253
Crossvalidation; Nesting property of gamma mixtures; Nonparametric maximum likelihood estimation; Poisson-compound gamma model; Species richness estimation
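As a rough illustration of the model in this abstract: a Poisson count whose rate follows a gamma distribution with shape k and scale θ has zero-count probability (1 + θ)^(−k), so a mixture of such components with a shared shape gives a simple expression for the chance that a species goes unseen. The sketch below plugs that probability into the common estimator "observed species divided by detection probability". The abstract does not spell out the estimator's exact form, so the function names, the plug-in formula, and the fitted values are all assumptions for illustration, and the nonparametric maximum likelihood fit itself is not shown.

```python
from math import fsum


def zero_prob(weights, scales, shape):
    """P(X = 0) under a finite mixture of Poisson-gamma components.

    Each component has rate lambda ~ Gamma(shape, scale), so its
    zero-count probability is (1 + scale) ** (-shape).
    """
    if abs(fsum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return fsum(w * (1.0 + s) ** (-shape) for w, s in zip(weights, scales))


def richness_estimate(observed_species, weights, scales, shape):
    """Scale the observed species count up by the estimated detection probability."""
    p0 = zero_prob(weights, scales, shape)
    return observed_species / (1.0 - p0)


# Hypothetical fitted mixture: two components sharing shape = 1.
n_hat = richness_estimate(150, weights=[0.5, 0.5], scales=[1.0, 3.0], shape=1.0)
print(round(n_hat, 1))  # 240.0
```

With these made-up values the unseen probability is 0.5/2 + 0.5/4 = 0.375, so 150 observed species scale up to 150/0.625 = 240.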
4.  A base pair resolution map of nucleosome positions in yeast 
Nature  2012;486(7404):496-501.
The exact positions of nucleosomes along genomic DNA can influence many aspects of chromosome function, yet existing methods for mapping nucleosomes do not provide the necessary single base pair accuracy to determine these positions. Here we develop and apply a new approach for direct mapping of nucleosome centers based on chemical modification of engineered histones. The resulting map locates nucleosome positions genome-wide in unprecedented detail and accuracy. It reveals novel aspects of the in vivo nucleosome organization that are linked to transcription factor binding, RNA polymerase pausing, and the higher order structure of the chromatin fiber.
PMCID: PMC3786739  PMID: 22722846
5.  Functional Specialization of the Small Interfering RNA Pathway in Response to Virus Infection 
PLoS Pathogens  2013;9(8):e1003579.
In Drosophila, post-transcriptional gene silencing occurs when exogenous or endogenous double stranded RNA (dsRNA) is processed into small interfering RNAs (siRNAs) by Dicer-2 (Dcr-2) in association with a dsRNA-binding protein (dsRBP) cofactor called Loquacious (Loqs-PD). siRNAs are then loaded onto Argonaute-2 (Ago2) by the action of Dcr-2 with another dsRBP cofactor called R2D2. Loaded Ago2 executes the destruction of target RNAs that have sequence complementarity to siRNAs. Although Dcr-2, R2D2, and Ago2 are essential for innate antiviral defense, the mechanism of virus-derived siRNA (vsiRNA) biogenesis and viral target inhibition remains unclear. Here, we characterize the response mechanism mediated by siRNAs against two different RNA viruses that infect Drosophila. In both cases, we show that vsiRNAs are generated by Dcr-2 processing of dsRNA formed during viral genome replication and, to a lesser extent, viral transcription. These vsiRNAs seem to preferentially target viral polyadenylated RNA to inhibit viral replication. Loqs-PD is completely dispensable for silencing of the viruses, in contrast to its role in silencing endogenous targets. Biogenesis of vsiRNAs is independent of both Loqs-PD and R2D2. R2D2, however, is required for sorting and loading of vsiRNAs onto Ago2 and inhibition of viral RNA expression. Direct injection of viral RNA into Drosophila results in replication that is also independent of Loqs-PD. This suggests that triggering of the antiviral pathway is not related to viral mode of entry but recognition of intrinsic features of virus RNA. Our results indicate the existence of a vsiRNA pathway that is separate from the endogenous siRNA pathway and is specifically triggered by virus RNA. We speculate that this unique framework might be necessary for a prompt and efficient antiviral response.
Author Summary
The RNA interference (RNAi) pathway utilizes small non-coding RNAs to silence gene expression. In insects, RNAi regulates endogenous genes and functions as an RNA-based immune system against viral infection. Here we have uncovered details of how RNAi is triggered by RNA viruses. Double-stranded RNA (dsRNA) generated as a replication intermediate or from transcription of the RNA virus can be used as substrate for the biogenesis of virus-derived small interfering RNAs (vsiRNAs). Unlike other dsRNAs, virus RNA processing involves Dicer but not its canonical partner protein Loqs-PD. Thus, vsiRNA biogenesis is mechanistically different from biogenesis of endogenous siRNAs or siRNAs derived from other exogenous RNA sources. Our results suggest a specialization of the pathway dedicated to silencing of RNA viruses versus other types of RNAi silencing. The understanding of RNAi mechanisms during viral infection could have implications for the control of insect-borne viruses and the use of siRNAs to treat viral infections in humans.
PMCID: PMC3757037  PMID: 24009507
6.  Large-scale Cortical Network Properties Predict Future Sound-to-Word Learning Success 
Journal of cognitive neuroscience  2012;24(5):1087-1103.
The human brain possesses a remarkable capacity to interpret and recall novel sounds as spoken language. These linguistic abilities arise from complex processing spanning a widely distributed cortical network and are characterized by marked individual variation. Recently, graph theoretical analysis has facilitated the exploration of how such aspects of large-scale brain functional organization may underlie cognitive performance. Brain functional networks are known to possess small-world topologies characterized by efficient global and local information transfer, but whether these properties relate to language learning abilities remains unknown. Here we applied graph theory to construct large-scale cortical functional networks from cerebral hemodynamic (fMRI) responses acquired during an auditory pitch discrimination task and found that such network properties were associated with participants’ future success in learning words of an artificial spoken language. Successful learners possessed networks with reduced local efficiency but increased global efficiency relative to less successful learners and had a more cost-efficient network organization. Regionally, successful and less successful learners exhibited differences in these network properties spanning bilateral prefrontal, parietal, and right temporal cortex, overlapping a core network of auditory language areas. These results suggest that efficient cortical network organization is associated with sound-to-word learning abilities among healthy, younger adults.
PMCID: PMC3736731  PMID: 22360625
7.  Archaeal nucleosome positioning in vivo and in vitro is directed by primary sequence motifs 
BMC Genomics  2013;14:391.
Histone wrapping of DNA into nucleosomes almost certainly evolved in the Archaea, and predates Eukaryotes. In Eukaryotes, nucleosome positioning plays a central role in regulating gene expression and is directed by primary sequence motifs that together form a nucleosome positioning code. The experiments reported were undertaken to determine if archaeal histone assembly conforms to the nucleosome positioning code.
Eukaryotic nucleosome positioning is favored and directed by phased helical repeats of AA/TT/AT/TA and CC/GG/CG/GC dinucleotides, and disfavored by longer AT-rich oligonucleotides. Deep sequencing of genomic DNA protected from micrococcal nuclease digestion by assembly into archaeal nucleosomes has established that archaeal nucleosome assembly is also directed and positioned by these sequence motifs, both in vivo in Methanothermobacter thermautotrophicus and Thermococcus kodakarensis and in vitro in reaction mixtures containing only one purified archaeal histone and genomic DNA. Archaeal nucleosomes assembled at the same locations in vivo and in vitro, with much reduced assembly immediately upstream of open reading frames and throughout the ribosomal rDNA operons. Providing further support for a common positioning code, archaeal histones assembled into nucleosomes on eukaryotic DNA and eukaryotic histones into nucleosomes on archaeal DNA at the same locations. T. kodakarensis has two histones, designated HTkA and HTkB, and strains with either but not both histones deleted grow normally but do exhibit transcriptome differences. Comparisons of the archaeal nucleosome profiles in the intergenic regions immediately upstream of genes that exhibited increased or decreased transcription in the absence of HTkA or HTkB revealed substantial differences but no consistent pattern of changes that would correlate directly with archaeal nucleosome positioning inhibiting or stimulating transcription.
The results obtained establish that an archaeal histone and a genome sequence together are sufficient to determine where archaeal nucleosomes preferentially assemble and where they avoid assembly. We confirm that the same nucleosome positioning code operates in Archaea as in Eukaryotes and presumably therefore evolved with the histone-fold mechanism of DNA binding and compaction early in the archaeal lineage, before the divergence of Eukaryotes.
PMCID: PMC3691661  PMID: 23758892
Archaea; Nucleosome positioning; Dinucleotide repeats; Histone deletions; rDNA expression; Chromatin evolution
10.  High-resolution nucleosome mapping of targeted regions using BAC-based enrichment 
Nucleic Acids Research  2013;41(7):e87.
We report a target enrichment method to map nucleosomes of large genomes at unprecedented coverage and resolution by deeply sequencing locus-specific mononucleosomal DNA enriched via hybridization with bacterial artificial chromosomes. We achieved ∼10 000-fold enrichment of specific loci, which enabled sequencing nucleosomes at up to ∼500-fold higher coverage than has been reported in a mammalian genome. We demonstrate the advantages of generating high-sequencing coverage for mapping the center of discrete nucleosomes, and we show the use of the method by mapping nucleosomes during T cell differentiation using nuclei from effector T-cells differentiated from clonal, isogenic, naïve, primary murine CD4 and CD8 T lymphocytes. The analysis reveals that discrete nucleosomes exhibit cell type-specific occupancy and positioning depending on differentiation status and transcription. This method is widely applicable to mapping many features of chromatin and discerning its landscape in large genomes at unprecedented resolution.
PMCID: PMC3627574  PMID: 23413004
11.  Nucleosome mapping across the CFTR locus identifies novel regulatory factors 
Nucleic Acids Research  2013;41(5):2857-2868.
Nucleosome positioning on the chromatin strand plays a critical role in regulating accessibility of DNA to transcription factors and chromatin modifying enzymes. Hence, detailed information on nucleosome depletion or movement at cis-acting regulatory elements has the potential to identify predicted binding sites for trans-acting factors. Using a novel method based on enrichment of mononucleosomal DNA by bacterial artificial chromosome hybridization, we mapped nucleosome positions by deep sequencing across 250 kb, encompassing the cystic fibrosis transmembrane conductance regulator (CFTR) gene. CFTR shows tight tissue-specific regulation of expression, which is largely determined by cis-regulatory elements that lie outside the gene promoter. Although multiple elements are known, the repertoire of transcription factors that interact with these sites to activate or repress CFTR expression remains incomplete. Here, we show that specific nucleosome depletion corresponds to well-characterized binding sites for known trans-acting factors, including hepatocyte nuclear factor 1, Forkhead box A1 and CCCTC-binding factor. Moreover, the cell-type selective nucleosome positioning is effective in predicting binding sites for novel interacting factors, such as BAF155. Finally, we identify transcription factor binding sites that are overrepresented in regions where nucleosomes are depleted in a cell-specific manner. This approach recognizes the glucocorticoid receptor as a novel trans-acting factor that regulates CFTR expression in vivo.
PMCID: PMC3597660  PMID: 23325854
12.  Estrogen utilization of IGF-1-R and EGF-R to signal in breast cancer cells 
As breast cancer cells develop secondary resistance to estrogen deprivation therapy, they increase their utilization of non-genomic signaling pathways. Our prior work demonstrated that estradiol causes an association of ERα with Shc, Src and the IGF-1-R. In cells developing resistance to estrogen deprivation (surrogate for aromatase inhibition) and to the anti-estrogens tamoxifen, 4-OH-tamoxifen, and fulvestrant, an increased association of ERα with c-Src and the EGF-R occurs. At the same time, there is a translocation of ERα out of the nucleus and into the cytoplasm and cell membrane. Blockade of c-Src with the Src kinase inhibitor PP-2 causes relocation of ERα into the nucleus. While these changes are not identical in response to each anti-estrogen, ERα binding to the EGF-R is increased in response to 4-OH-tamoxifen when compared with tamoxifen. The changes in EGF-R interactions with ERα impart an enhanced sensitivity of tamoxifen-resistant cells to the inhibitory properties of the specific EGF-R tyrosine kinase inhibitor, AG 1478. However, with long-term exposure of tamoxifen-resistant cells to AG 1478, the cells begin to re-grow but can now be inhibited by the IGF-R tyrosine kinase inhibitor, AG 1024. These data suggest that the IGF-R system becomes the predominant signaling mechanism as an adaptive response to the EGF-R inhibitor. Taken together, this information suggests that both the EGF-R and IGF-R pathways can mediate ERα signaling.
To further examine the effects of fulvestrant on ERα function, we examined the acute effects of fulvestrant on non-genomic functionality. Fulvestrant enhanced ERα association with the membrane IGF-1 receptor (IGF-1R). Using siRNA or expression vectors to knock down or knock in selected proteins, we further demonstrated that the ERα/IGF-1R association is Src-dependent. Fulvestrant rapidly induced IGF-1R and MAPK phosphorylation. The Src inhibitor PP2 and the IGF-1R inhibitor AG1024 greatly blocked the fulvestrant-induced ERα/IGF-1R interaction, leading to a further depletion of total cellular ERα induced by fulvestrant, and further enhanced fulvestrant-induced cell growth arrest. More dramatic was the translocation of ERα to the plasma membrane in combination with the IGF-1-R, as shown by confocal microscopy. Taken in aggregate, these studies suggest that secondary resistance to hormonal therapy results in usage of both IGF-R and EGF-R for non-genomic signaling.
PMCID: PMC2826506  PMID: 19815064
13.  Large-Scale Cortical Functional Organization and Speech Perception across the Lifespan 
PLoS ONE  2011;6(1):e16510.
Aging is accompanied by substantial changes in brain function, including functional reorganization of large-scale brain networks. Such differences in network architecture have been reported both at rest and during cognitive task performance, but an open question is whether these age-related differences show task-dependent effects or represent only task-independent changes attributable to a common factor (i.e., underlying physiological decline). To address this question, we used graph theoretic analysis to construct weighted cortical functional networks from hemodynamic (functional MRI) responses in 12 younger and 12 older adults during a speech perception task performed in both quiet and noisy listening conditions. Functional networks were constructed for each subject and listening condition based on inter-regional correlations of the fMRI signal among 66 cortical regions, and network measures of global and local efficiency were computed. Across listening conditions, older adult networks showed significantly decreased global (but not local) efficiency relative to younger adults after normalizing measures to surrogate random networks. Although listening condition produced no main effects on whole-cortex network organization, a significant age group x listening condition interaction was observed. Additionally, an exploratory analysis of regional effects uncovered age-related declines in both global and local efficiency concentrated exclusively in auditory areas (bilateral superior and middle temporal cortex), further suggestive of specificity to the speech perception tasks. Global efficiency also correlated positively with mean cortical thickness across all subjects, establishing gross cortical atrophy as a task-independent contributor to age-related differences in functional organization. 
Together, our findings provide evidence of age-related disruptions in cortical functional network organization during speech perception tasks, and suggest that although task-independent effects such as cortical atrophy clearly underlie age-related changes in cortical functional organization, age-related differences also demonstrate sensitivity to task domains.
PMCID: PMC3031590  PMID: 21304991
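The global efficiency measure named in this abstract is conventionally defined as the average inverse shortest-path length over all ordered node pairs. The sketch below computes it for a small unweighted graph using breadth-first search. The study describes weighted networks over 66 cortical regions, so the binary adjacency lists, the function names, and the toy graphs here are simplifying assumptions rather than the authors' pipeline.

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """Breadth-first search: hop counts from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def global_efficiency(adj):
    """Mean of 1/d(i, j) over ordered pairs i != j; unreachable pairs add 0."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for source in adj:
        dist = shortest_path_lengths(adj, source)
        total += sum(1.0 / d for node, d in dist.items() if node != source)
    return total / (n * (n - 1))


# Toy graphs (hypothetical): a 4-node chain and a fully connected 4-node graph.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
complete = {i: [j for j in range(4) if j != i] for i in range(4)}
print(global_efficiency(chain), global_efficiency(complete))
```

The fully connected graph reaches the maximum efficiency of 1, while the chain scores 13/18 because its end nodes are three hops apart.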
14.  Predicting nucleosome positioning using a duration Hidden Markov Model 
BMC Bioinformatics  2010;11:346.
The nucleosome is the fundamental packing unit of DNA in eukaryotic cells. Its detailed positioning on the genome is closely related to chromosome functions. Increasing evidence has shown that the genomic DNA sequence itself is highly predictive of nucleosome positioning genome-wide. Therefore, a fast software tool for predicting nucleosome positioning can help in understanding how a genome's nucleosome organization may facilitate genome function.
We present a duration Hidden Markov model for nucleosome positioning prediction by explicitly modeling the linker DNA length. The nucleosome and linker models trained from yeast data are re-scaled when making predictions for other species to adjust for differences in base composition. A software tool named NuPoP is developed in three formats for free download.
Simulation studies show that modeling the linker length distribution and utilizing a base composition re-scaling method both improve the prediction of nucleosome positioning regarding sensitivity and false discovery rate. NuPoP provides a user-friendly software tool for predicting the nucleosome occupancy and the most probable nucleosome positioning map for genomic sequences of any size. When compared with two existing methods, NuPoP shows improved performance in sensitivity.
PMCID: PMC2900280  PMID: 20576140
15.  Involvement of 90-kuD ribosomal S6 kinase in collagen type I expression in rat hepatic fibrosis 
AIM: To investigate the relationship between 90-kuD ribosomal S6 kinase (p90RSK) and collagen type I expression during the development of hepatic fibrosis in vivo and in vitro.
METHODS: Rat hepatic fibrosis was induced by intraperitoneal injection of dimethylnitrosamine. The protein expression and cell location of p90RSK and their relationship with collagen type I were determined by co-immunofluorescence and confocal microscopy. Subsequently, an RNAi strategy was employed to silence p90RSK mRNA expression in HSC-T6, an activated hepatic stellate cell (HSC) line. The expression of collagen type I in HSC-T6 cells was assessed by Western blotting and real-time polymerase chain reaction. Furthermore, HSCs were transfected with expression vectors or RNAi constructs of p90RSK to increase or decrease p90RSK expression, and collagen type I promoter activity in the transfected HSCs was then examined by reporter assay. Lastly, HSC-T6 cells transfected with p90RSK siRNA were treated with or without platelet-derived growth factor (PDGF)-BB at a final concentration of 20 μg/L, and cell growth was determined by MTS conversion.
RESULTS: In fibrotic liver tissues, p90RSK was over-expressed in activated HSCs and had a significant positive correlation with collagen type I levels. In HSC-T6 cells transfected with RNAi targeted to p90RSK, the expression of collagen type I was down-regulated (61.8% in mRNA, P < 0.01; 89.1% in protein, P < 0.01). However, collagen type I promoter activity was neither increased by over-expression of p90RSK nor decreased by low expression, compared with controls in the same cell line (P = 0.076). Furthermore, p90RSK siRNA inhibited HSC proliferation and also abolished the effect of PDGF on HSC proliferation.
CONCLUSION: p90RSK is over-expressed in activated HSCs and involved in regulating the abnormal expression of collagen type I through initiating the proliferation of HSCs.
PMCID: PMC2678581  PMID: 19418583
90-kuD ribosomal S6 kinase; Collagen type I; Hepatic fibrosis; Hepatic stellate cell; RNAi
16.  A genomic code for nucleosome positioning 
Nature  2006;442(7104):772-778.
Eukaryotic genomes are packaged into nucleosome particles that occlude the DNA from interacting with most DNA binding proteins. Nucleosomes have higher affinity for particular DNA sequences, reflecting the ability of the sequence to bend sharply, as required by the nucleosome structure. However, it is not known whether these sequence preferences have a significant influence on nucleosome position in vivo, and thus regulate the access of other proteins to DNA. Here we isolated nucleosome-bound sequences at high resolution from yeast and used these sequences in a new computational approach to construct and validate experimentally a nucleosome-DNA interaction model, and to predict the genome-wide organization of nucleosomes. Our results demonstrate that genomes encode an intrinsic nucleosome organization and that this intrinsic organization can explain ∼50% of the in vivo nucleosome positions. This nucleosome positioning code may facilitate specific chromosome functions including transcription factor binding, transcription initiation, and even remodelling of the nucleosomes themselves.
PMCID: PMC2623244  PMID: 16862119
17.  Development of a high sensitivity, nested Q-PCR assay for mouse and human aromatase 
Measurement of breast tissue estradiol levels could provide a powerful method to predict the risk of developing breast cancer, but obtaining sufficient amounts of tissue from women is difficult from a practical standpoint. Assessment of aromatase in ductal lavage fluid or fine needle aspirates from breast might provide a surrogate marker for tissue estrogen levels, but highly sensitive methods would be required. These considerations prompted us to develop an ultra-sensitive, “nested” PCR assay for aromatase which is up to one million fold more sensitive than standard PCR methods. We initially validated this assay using multiple tissues from the aromatase transgenic mouse and found that coefficients of variation for measurement of replicate samples averaged less than 5%. We demonstrated a 60-fold enhancement in aromatase message in the transgenic versus the wild type mouse breast but, surprisingly, levels in the transgenic animals were highly variable, ranging from 0.4 to 27 relative units. The variability of aromatase expression in the transgenic breast did not correlate with the degree of breast development and did not appear to relate to hormonal manipulation of the MMTV promoter, but probably related to lack of exhaustive inbreeding and mixed zygosity of transgenic animals. Extensive validation in mouse tissues provided confidence regarding the assay in human tissues, since nearly identical methods were used. The human assay was sufficiently sensitive to detect aromatase in a single human JAR (choriocarcinoma) cell, in all breast biopsies measured, and in 7/23 ductal lavage fluids.
PMCID: PMC2579313  PMID: 17975728
Aromatase; Breast tissue; Ductal lavage; Estrogen; Fine needle aspirate (FNA); Nested PCR
18.  Preferentially Quantized Linker DNA Lengths in Saccharomyces cerevisiae 
PLoS Computational Biology  2008;4(9):e1000175.
The exact lengths of linker DNAs connecting adjacent nucleosomes specify the intrinsic three-dimensional structures of eukaryotic chromatin fibers. Some studies suggest that linker DNA lengths preferentially occur at certain quantized values, differing one from another by integral multiples of the DNA helical repeat, ∼10 bp; however, studies in the literature are inconsistent. Here, we investigate linker DNA length distributions in the yeast Saccharomyces cerevisiae genome, using two novel methods: a Fourier analysis of genomic dinucleotide periodicities adjacent to experimentally mapped nucleosomes and a duration hidden Markov model applied to experimentally defined dinucleosomes. Both methods reveal that linker DNA lengths in yeast are preferentially periodic at the DNA helical repeat (∼10 bp), obeying the forms 10n+5 bp (integer n). This 10 bp periodicity implies an ordered superhelical intrinsic structure for the average chromatin fiber in yeast.
Author Summary
Eukaryotic genomic DNA exists as chromatin, with the DNA wrapped locally into a repeating array of protein–DNA complexes (“nucleosomes”) separated by short stretches of unwrapped “linker” DNA. Nucleosome arrays further compact into ∼30-nm-wide higher-order chromatin structures. Despite decades of work, there remains no agreement about the structure of the 30 nm fiber, or even if the structure is ordered or random. The helical symmetry of DNA couples the one-dimensional distribution of nucleosomes along the DNA to an intrinsic three-dimensional structure for the chromatin fiber. Random linker length distributions imply random three-dimensional intrinsic fiber structures, whereas different possible nonrandom length distributions imply different ordered structures. Here we use two independent computational methods, with two independent kinds of experimental data, to experimentally define the probability distribution of linker DNA lengths in yeast. Both methods agree that linker DNA lengths in yeast come in a set of preferentially quantized lengths that differ one from another by ∼10 bp, the DNA helical repeat, with a preferred phase offset of 5 bp. The preferential quantization of lengths implies that the intrinsic three-dimensional structure for the average chromatin fiber is ordered, not random. The 5 bp offset implies a particular geometry for this intrinsic structure.
PMCID: PMC2522279  PMID: 18787693
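The quantization described in this abstract, linker lengths of the form 10n+5 bp, can be checked on any list of linker lengths by reducing each length modulo the helical repeat. The sketch below is a minimal tally of that kind, not the Fourier or duration hidden Markov model analyses used in the study. The function names, the tolerance parameter, and the example lengths are hypothetical.

```python
from collections import Counter


def linker_phase_histogram(linker_lengths, period=10):
    """Count how many linkers fall at each phase (length mod period)."""
    return Counter(length % period for length in linker_lengths)


def fraction_at_phase(linker_lengths, phase=5, period=10, tolerance=1):
    """Share of linkers within `tolerance` bp of period * n + phase."""
    if not linker_lengths:
        return 0.0

    def circular_distance(length):
        # Distance on the circle of phases, so phase 9 is 1 bp from phase 0.
        d = abs(length % period - phase)
        return min(d, period - d)

    hits = sum(1 for length in linker_lengths if circular_distance(length) <= tolerance)
    return hits / len(linker_lengths)


# Made-up linker lengths in bp, for illustration only.
lengths = [5, 15, 14, 26, 35, 20, 45, 30]
print(linker_phase_histogram(lengths))
print(fraction_at_phase(lengths))  # 0.75
```

In this toy list, six of the eight linkers lie within 1 bp of a 10n+5 length, which is the kind of enrichment the abstract reports for yeast.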
19.  Meta-Analysis of Drosophila Circadian Microarray Studies Identifies a Novel Set of Rhythmically Expressed Genes 
PLoS Computational Biology  2007;3(11):e208.
Five independent groups have reported microarray studies that identify dozens of rhythmically expressed genes in the fruit fly Drosophila melanogaster. Limited overlap among the lists of discovered genes makes it difficult to determine which, if any, exhibit truly rhythmic patterns of expression. We reanalyzed data from all five reports and found two sources for the observed discrepancies, the use of different expression pattern detection algorithms and underlying variation among the datasets. To improve upon the methods originally employed, we developed a new analysis that involves compilation of all existing data, application of identical transformation and standardization procedures followed by ANOVA-based statistical prescreening, and three separate classes of post hoc analysis: cross-correlation to various cycling waveforms, autocorrelation, and a previously described fast Fourier transform–based technique [1–3]. Permutation-based statistical tests were used to derive significance measures for all post hoc tests. We find application of our method, most significantly the ANOVA prescreening procedure, significantly reduces the false discovery rate relative to that observed among the results of the original five reports while maintaining desirable statistical power. We identify a set of 81 cycling transcripts previously found in one or more of the original reports as well as a novel set of 133 transcripts not found in any of the original studies. We introduce a novel analysis method that compensates for variability observed among the original five Drosophila circadian array reports. Based on the statistical fidelity of our meta-analysis results, and the results of our initial validation experiments (quantitative RT-PCR), we predict many of our newly found genes to be bona fide cyclers, and suggest that they may lead to new insights into the pathways through which clock mechanisms regulate behavioral rhythms.
Author Summary
Circadian genes regulate many of life's most essential processes, from sleeping and eating to cellular metabolism, learning, and much more. Many of these genes exhibit cyclic transcript expression, a characteristic utilized by an ever-expanding corpus of microarray-based studies to discover additional circadian genes. While these attempts have identified hundreds of transcripts in a variety of organisms, they exhibit a striking lack of agreement, making it difficult to determine which, if any, are truly cycling. Here, we examine one group of these reports (those performed on the fruit fly—Drosophila melanogaster) to identify the sources of observed differences and present a means of analyzing the data that drastically reduces their impact. We demonstrate the fidelity of our method through its application to the original fruit fly microarray data, detecting more than 200 (133 novel) transcripts with a level of statistical fidelity better than that found in any of the original reports. Initial validation experiments (quantitative RT-PCR) suggest these to be truly cycling genes, one of which is now known to be a bona fide circadian gene (cwo). We report the discovery of 133 novel candidate circadian genes as well as the highly adaptable method used to identify them.
PMCID: PMC2098839  PMID: 17983263
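Autocorrelation, one of the post hoc analyses listed in this abstract, can be sketched as the Pearson correlation between an expression series and a copy of itself shifted by one presumed period. The example below assumes samples taken every 4 h over two days, so a 24 h rhythm corresponds to a lag of six samples. The sampling scheme, function names, and synthetic series are assumptions, and the permutation-based significance testing described in the abstract is not shown.

```python
from math import pi, sin, sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def autocorrelation(series, lag):
    """Correlate the series with itself shifted by `lag` samples."""
    if lag <= 0 or len(series) - lag < 2:
        raise ValueError("lag must leave at least two overlapping samples")
    return pearson(series[:-lag], series[lag:])


# Synthetic transcript: a pure 24 h rhythm sampled every 4 h for 48 h.
expression = [sin(2 * pi * t / 6) for t in range(12)]
print(autocorrelation(expression, 6))  # close to +1 at a one-day lag
print(autocorrelation(expression, 3))  # close to -1 at a half-day lag
```

A cycling transcript scores near +1 at the full-period lag and near −1 at the half-period lag, whereas a non-rhythmic series would show no such pattern.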
20.  Gene capture prediction and overlap estimation in EST sequencing from one or multiple libraries 
BMC Bioinformatics  2005;6:300.
In expressed sequence tag (EST) sequencing, we are often interested in how many genes we can capture in an EST sample of a targeted size. This information provides insights to sequencing efficiency in experimental design, as well as clues to the diversity of expressed genes in the tissue from which the library was constructed.
We propose a compound Poisson process model that can accurately predict the gene capture in a future EST sample based on an initial EST sample. It also allows estimation of the number of expressed genes in one cDNA library or co-expressed in two cDNA libraries. The superior performance of the new prediction method over an existing approach is established by a simulation study. Our analysis of four Arabidopsis thaliana EST sets suggests that the number of expressed genes present in four different cDNA libraries of Arabidopsis thaliana varies from 9155 (root) to 12005 (silique). An observed fraction of co-expressed genes in two different EST sets as low as 25% can correspond to an actual overlap fraction greater than 65%.
The proposed method provides a convenient tool for gene capture prediction and cDNA library property diagnosis in EST sequencing.
PMCID: PMC1369009  PMID: 16351717
21.  Improved alignment of nucleosome DNA sequences using a mixture model 
Nucleic Acids Research  2005;33(21):6743-6755.
DNA sequences that are present in nucleosomes have a preferential ∼10 bp periodicity of certain dinucleotide signals (1,2), but the overall sequence similarity of the nucleosomal DNA is weak, and traditional multiple sequence alignment tools fail to yield meaningful alignments. We develop a mixture model that characterizes the known dinucleotide periodicity probabilistically to improve the alignment of nucleosomal DNAs. We assume that a periodic dinucleotide signal of any type emits according to a probability distribution around a series of ‘hot spots’ that are equally spaced along nucleosomal DNA with 10 bp period, but with a 1 bp phase shift across the middle of the nucleosome. We model the three statistically most significant dinucleotide signals, AA/TT, GC and TA, simultaneously, while allowing phase shifts between the signals. The alignment is obtained by maximizing the likelihood of both Watson and Crick strands simultaneously. The resulting alignment of 177 chicken nucleosomal DNA sequences revealed that all 10 distinct dinucleotides are periodic, however, with only two distinct phases and varying intensity. By Fourier analysis, we show that our new alignment has enhanced periodicity and sequence identity compared with center alignment. The significance of the nucleosomal DNA sequence alignment is evaluated by comparing it with that obtained using the same model on non-nucleosomal sequences.
PMCID: PMC1310902  PMID: 16339114
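The ∼10 bp dinucleotide periodicity mentioned in this abstract can be illustrated by marking where a dinucleotide occurs along a sequence and measuring the Fourier power of that indicator signal at a chosen period. The sketch below evaluates a single Fourier coefficient directly. It does not implement the mixture-model alignment, and the function names, the AA/TT motif choice, and the synthetic sequence are assumptions for illustration.

```python
from cmath import exp
from math import pi


def dinucleotide_indicator(seq, motifs=("AA", "TT")):
    """1.0 at each position where one of the motifs starts, else 0.0."""
    seq = seq.upper()
    return [1.0 if seq[i:i + 2] in motifs else 0.0 for i in range(len(seq) - 1)]


def power_at_period(signal, period):
    """Squared magnitude of the mean-centered Fourier coefficient at `period`."""
    n = len(signal)
    if n == 0:
        return 0.0
    mean = sum(signal) / n
    coeff = sum((v - mean) * exp(-2j * pi * k / period) for k, v in enumerate(signal))
    return abs(coeff) ** 2 / n


# Synthetic sequence with an AA dinucleotide every 10 bp (illustration only).
sequence = "AACGCGCGCG" * 10
signal = dinucleotide_indicator(sequence)
print(power_at_period(signal, 10), power_at_period(signal, 7))
```

For this synthetic sequence the power at a 10 bp period is far larger than at an unrelated period such as 7 bp, which is the signature a Fourier analysis would look for in aligned nucleosomal DNA.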