
Results 1-21 (21)

1.  miR-191 and miR-135 are required for long-lasting spine remodeling associated with synaptic long term depression 
Nature communications  2014;5:3263.
Activity-dependent modification of dendritic spines, subcellular compartments accommodating postsynaptic specializations in the brain, is an important cellular mechanism for brain development, cognition and synaptic pathology of brain disorders. NMDA receptor-dependent long-term depression (NMDAR-LTD), a prototypic form of synaptic plasticity, is accompanied by prolonged remodeling of spines. The mechanisms underlying long-lasting spine remodeling in NMDAR-LTD, however, are largely unclear. Here we show that LTD induction causes global changes in miRNA transcriptomes affecting many cellular activities. Specifically, we show that expression changes of miR-191 and miR-135 are required for maintenance but not induction of spine restructuring. Moreover, we find that actin depolymerization and AMPA receptor exocytosis are regulated for extended periods of time by miRNAs to support long-lasting spine plasticity. These findings reveal a novel miRNA-mediated mechanism and a new role of AMPA receptor exocytosis in long-lasting spine plasticity, and identify a number of candidate miRNAs involved in LTD.
PMCID: PMC3951436  PMID: 24535612
2.  Comparison of Spectral and Image Morphological Analysis for Egg Early Hatching Property Detection Based on Hyperspectral Imaging 
PLoS ONE  2014;9(2):e88659.
The use of non-destructive methods to detect egg hatching properties could increase efficiency in commercial hatcheries by saving space, reducing costs, and ensuring hatching quality. For this purpose, a hyperspectral imaging system was built to detect embryo development and vitality using spectral and morphological information of hatching eggs. A total of 150 green shell eggs were used, and hyperspectral images were collected for every egg on days 0, 1, 2, 3 and 4 of incubation. After imaging, two analysis methods were developed to extract egg hatching characteristics. First, hyperspectral images of samples were evaluated using Principal Component Analysis (PCA), and a single optimal band at 822 nm was selected for extracting the spectral characteristics of hatching eggs. Second, an image segmentation algorithm was applied to isolate the image morphological characteristics of hatching eggs. To investigate the applicability of spectral and image morphological analysis for detecting egg early hatching properties, a Learning Vector Quantization neural network (LVQNN) was employed. The experimental results demonstrated that the model using image morphological characteristics achieved better accuracy and generalization than the model using spectral characteristic parameters; the discrimination accuracy for eggs with embryo development was 97% at day 3 and 100% at day 4. In addition, the recognition rate for eggs with weak embryo development reached 81% at day 3 and 92% at day 4. This study suggested that image morphological analysis was a novel application of hyperspectral imaging technology to detect egg early hatching properties.
PMCID: PMC3923798  PMID: 24551130
3.  cGRNB: a web server for building combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets 
BMC Systems Biology  2013;7(Suppl 2):S7.
We are witnessing rapid progress in the development of methodologies for building the combinatorial gene regulatory networks involving both TFs (Transcription Factors) and miRNAs (microRNAs). There are a few tools available to do these jobs but most of them are not easy to use and not accessible online. A web server is especially needed in order to allow users to upload experimental expression datasets and build combinatorial regulatory networks corresponding to their particular contexts.
In this work, we compiled putative TF-gene, miRNA-gene and TF-miRNA regulatory relationships from forward-engineering pipelines and curated them as built-in data libraries. We streamlined the R code of our two separate forward-and-reverse engineering algorithms for combinatorial gene regulatory network construction and formalized them as two major functional modules. As a result, we released cGRNB (combinatorial Gene Regulatory Networks Builder), a web server for constructing combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. cGRNB provides two major network-building modules, one for MPGE (miRNA-perturbed gene expression) datasets and the other for parallel miRNA/mRNA expression datasets. A miRNA-centered two-layer combinatorial regulatory cascade is the output of the first module, and a comprehensive genome-wide network involving all three types of combinatorial regulations (TF-gene, TF-miRNA, and miRNA-gene) is the output of the second module.
In this article we propose cGRNB, a web server for building combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. Since parallel miRNA/mRNA expression datasets are rapidly accumulating with the advance of next-generation sequencing techniques, cGRNB will be a very useful tool for researchers to build combinatorial gene regulatory networks based on expression datasets. The cGRNB web server is free and available online at
PMCID: PMC3851836  PMID: 24565134
4.  A Splicing-Independent Function of SF2/ASF in MicroRNA Processing 
Molecular Cell  2010;38(1):67-77.
Both splicing factors and microRNAs are important regulatory molecules that play key roles in post-transcriptional gene regulation. By miRNA deep sequencing, we identified 40 miRNAs that are differentially expressed upon ectopic overexpression of the splicing factor SF2/ASF. Here we show that SF2/ASF and one of its upregulated microRNAs (miR-7) can form a negative feedback loop: SF2/ASF promotes miR-7 maturation, and mature miR-7 in turn targets the 3′UTR of SF2/ASF to repress its translation. Enhanced microRNA expression is mediated by direct interaction between SF2/ASF and the primary miR-7 transcript to facilitate Drosha cleavage and is independent of SF2/ASF’s function in splicing. Other miRNAs, including miR-221 and miR-222, may also be regulated by SF2/ASF through a similar mechanism. These results underscore a function of SF2/ASF in pri-miRNA processing and highlight the potential coordination between splicing control and miRNA-mediated gene repression in gene regulatory networks.
PMCID: PMC3395997  PMID: 20385090
5.  Expression of miRNAs and Their Cooperative Regulation of the Pathophysiology in Traumatic Brain Injury 
PLoS ONE  2012;7(6):e39357.
Traumatic brain injury (TBI) is a leading cause of injury-related death and disability worldwide. Effective treatment for TBI is limited and many TBI patients suffer from neuropsychiatric sequelae. The molecular and cellular mechanisms underlying the neuronal damage and impairment of mental abilities following TBI are largely unknown. Here we used the next generation sequencing platform to delineate miRNA transcriptome changes in the hippocampus at 24 hours and 7 days following TBI in the rat controlled cortical impact injury (CCI) model, and developed a bioinformatic analysis to identify cellular activities that are regulated by miRNAs differentially expressed in the CCI brains. The results of our study indicate that distinct sets of miRNAs are regulated at different post-traumatic times, and suggest that multiple miRNA species cooperatively regulate cellular pathways for the pathological changes and management of brain injury. The distinctive miRNAs expression profiles at different post-CCI times may be used as molecular signatures to assess TBI progression. In addition to known pathophysiological changes, our study identifies many other cellular pathways that are subjected to modification by differentially expressed miRNAs in TBI brains. These pathways can potentially be targeted for development of novel TBI treatment.
PMCID: PMC3382215  PMID: 22761770
6.  Combinatorial network of transcriptional regulation and microRNA regulation in human cancer 
BMC Systems Biology  2012;6:61.
Both transcriptional control and microRNA (miRNA) control are critical regulatory mechanisms for cells to direct their destinies. At present, the combinatorial regulatory network composed of transcriptional regulations and post-transcriptional regulations is often constructed through a forward engineering strategy that is based solely on searching of transcriptional factor binding sites or miRNA seed regions in the putative target sequences. If the reverse engineering strategy is integrated with the forward engineering strategy, a more accurate and more specific combinatorial regulatory network will be obtained.
In this work, utilizing both sequence-matching information and parallel expression datasets of miRNAs and mRNAs, we integrated forward engineering with reverse engineering strategies and as a result built a hypothetical combinatorial gene regulatory network in human cancer. The credibility of the regulatory relationships in the network was validated by random permutation procedures and supported by authoritative experimental evidence-based databases. The global and local architecture properties of the combinatorial regulatory network were explored, and the most important tumor-regulating miRNAs and TFs were highlighted from a topological point of view.
By integrating the forward engineering and reverse engineering strategies, we manage to sketch a genome-scale combinatorial gene regulatory network in human cancer, which includes transcriptional regulations and miRNA regulations, allowing systematic study of cancer gene regulation. Our work establishes a pipeline that can be extended to reveal conditional combinatorial regulatory landscapes correlating to specific cellular contexts.
PMCID: PMC3483236  PMID: 22691419
7.  The Prevalence and Regulation of Antisense Transcripts in Schizosaccharomyces pombe 
PLoS ONE  2010;5(12):e15271.
A strand-specific transcriptome sequencing strategy, directional ligation sequencing or DeLi-seq, was employed to profile antisense transcriptome of Schizosaccharomyces pombe. Under both normal and heat shock conditions, we found that polyadenylated antisense transcripts are broadly expressed while distinct expression patterns were observed for protein-coding and non-coding loci. Dominant antisense expression is enriched in protein-coding genes involved in meiosis or stress response pathways. Detailed analyses further suggest that antisense transcripts are independently regulated with respect to their sense transcripts, and diverse mechanisms might be potentially involved in the biogenesis and degradation of antisense RNAs. Taken together, antisense transcription may have profound impacts on global gene regulation in S. pombe.
PMCID: PMC3004915  PMID: 21187966
8.  DCGL: an R package for identifying differentially coexpressed genes and links from gene expression microarray data 
Bioinformatics  2010;26(20):2637-2638.
Summary: Gene coexpression analysis was developed to explore gene interconnection at the expression level from a systems perspective, and differential coexpression analysis (DCEA), which examines the change in gene expression correlation between two conditions, was accordingly designed as a complementary technique to traditional differential expression analysis (DEA). Since there is a shortage of DCEA tools, we implemented in an R package ‘DCGL’ five DCEA methods for identification of differentially coexpressed genes and differentially coexpressed links, including three currently popular methods and two novel algorithms described in a companion paper. DCGL can serve as an easy-to-use tool to facilitate differential coexpression analyses.
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC2951087  PMID: 20801914
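The core idea of differential coexpression analysis described in entry 8, examining how the expression correlation of a gene pair changes between two conditions, can be illustrated with a minimal sketch. This is not one of the five methods implemented in DCGL; it only computes the change in Pearson correlation for a single gene pair. The function names, gene names, and toy expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_change(gene_a, gene_b, cond1, cond2):
    """Change in correlation of one gene pair between two conditions.

    cond1 and cond2 map a gene name to its list of expression values
    (hypothetical data layout, chosen for illustration only).
    """
    r1 = pearson(cond1[gene_a], cond1[gene_b])
    r2 = pearson(cond2[gene_a], cond2[gene_b])
    return r2 - r1

# Toy data: g1 and g2 are positively correlated in one condition
# and negatively correlated in the other.
normal = {"g1": [1.0, 2.0, 3.0, 4.0], "g2": [2.0, 4.0, 6.0, 8.0]}
disease = {"g1": [1.0, 2.0, 3.0, 4.0], "g2": [8.0, 6.0, 4.0, 2.0]}
delta = correlation_change("g1", "g2", normal, disease)
```

A large absolute change marks a candidate differentially coexpressed link; a real analysis would also need a significance assessment, which this sketch omits.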
9.  Using GeneReg to construct time delay gene regulatory networks 
BMC Research Notes  2010;3:142.
Understanding gene expression and regulation is essential for understanding biological mechanisms. Because gene expression profiling has been widely used in basic biological research, especially in transcription regulation studies, we have developed GeneReg, an easy-to-use R package, to construct gene regulatory networks from time course gene expression profiling data. More importantly, this package can provide information about time delays between expression change in a regulator and that of its target genes.
The R package GeneReg is based on time delay linear regression, which can generate a model of the expression levels of regulators at a given time point against the expression levels of their target genes at a later time point. There are two parameters in the model, time delay and regulation coefficient. Time delay is the time lag during which expression change of the regulator is transmitted to change in target gene expression. Regulation coefficient expresses the regulation effect: a positive regulation coefficient indicates activation and negative indicates repression. GeneReg was implemented on a real Saccharomyces cerevisiae cell cycle dataset; more than thirty percent of the modeled regulations, based entirely on gene expression files, were found to be consistent with previous discoveries from known databases.
GeneReg is an easy-to-use, simple, fast R package for gene regulatory network construction from short time course gene expression data. It may be applied to study time-related biological processes such as cell cycle, cell differentiation, or causal inference.
PMCID: PMC2892504  PMID: 20500822
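The time delay linear regression idea described in entry 9 can be sketched as follows. This is a simplified illustration, not the GeneReg implementation: it assumes evenly spaced time points, tries only integer lags, and picks the lag with the highest R². All function names and the toy series are hypothetical.

```python
def fit_line(x, y):
    """Least-squares fit y = slope * x + intercept; returns (slope, intercept, r2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r2

def best_time_delay(regulator, target, max_lag):
    """Regress target(t + lag) on regulator(t) for each lag; keep the best fit.

    Returns (lag, regulation_coefficient, r2). A positive coefficient
    suggests activation and a negative one repression.
    """
    best = None
    for lag in range(max_lag + 1):
        x = regulator[: len(regulator) - lag]  # regulator at time t
        y = target[lag:]                       # target at time t + lag
        slope, _, r2 = fit_line(x, y)
        if best is None or r2 > best[2]:
            best = (lag, slope, r2)
    return best

# Toy series: the target follows -2 * regulator with a delay of two steps.
regulator = [0.0, 1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0]
target = [1.0, 1.0, 0.0, -2.0, -6.0, -4.0, -10.0, -8.0]
lag, coefficient, r2 = best_time_delay(regulator, target, max_lag=3)
```

Here the selected lag plays the role of the time delay parameter and the slope that of the regulation coefficient; the negative slope in the toy data corresponds to repression.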
10.  GEOGLE: context mining tool for the correlation between gene expression and the phenotypic distinction 
BMC Bioinformatics  2009;10:264.
In the post-genomic era, the development of high-throughput gene expression detection technology provides huge amounts of experimental data, which challenges the traditional pipelines for data processing and analysis in scientific research.
In our work, we integrated gene expression information from Gene Expression Omnibus (GEO), biomedical ontology from Medical Subject Headings (MeSH) and signaling pathway knowledge from sigPathway entries to develop a context mining tool for gene expression analysis, GEOGLE. GEOGLE offers a rapid and convenient way of searching for relevant experimental datasets, pathways and biological terms according to multiple types of queries, including biomedical vocabularies, GDS IDs, gene IDs, pathway names and signature lists. Moreover, GEOGLE summarizes the signature genes from a subset of GDSes and estimates the correlation between gene expression and the phenotypic distinction with an integrated p value.
This approach, which performs a global search of expression data, may extend the traditional way of collecting heterogeneous gene expression experiment data. GEOGLE is a novel tool that provides researchers with a quantitative way to understand the correlation between gene expression and phenotypic distinction through meta-analysis of gene expression datasets from different experiments, as well as the biological meaning behind it. The web site and user guide of GEOGLE are available at:
PMCID: PMC2745391  PMID: 19703314
11.  Combinatorial network of primary and secondary microRNA-driven regulatory mechanisms 
Nucleic Acids Research  2009;37(18):5969-5980.
Recent miRNA transfection experiments show strong evidence that miRNAs influence not only their target but also non-target genes; the precise mechanism of the extended regulatory effects of miRNAs remains to be elucidated. A hypothetical two-layer regulatory network in which transcription factors (TFs) function as important mediators of miRNA-initiated regulatory effects was envisioned, and a comprehensive strategy was developed to map such miRNA-centered regulatory cascades. Given gene expression profiles after miRNA-perturbation, along with putative miRNA–gene and TF–gene regulatory relationships, highly likely degraded targets were fetched by a non-parametric statistical test; miRNA-regulated TFs and their downstream targets were mined out through linear regression modeling. When applied to 53 expression datasets, this strategy discovered combinatorial regulatory networks centered around 19 miRNAs. A tumor-related regulatory network was diagrammed as an example, with the important tumor-related regulators TP53 and MYC playing hub connector roles. A web server is provided for query and analysis of all reported data in this article. Our results reinforce the growing awareness that non-coding RNAs may play key roles in the transcription regulatory network. Our strategy could be applied to reveal conditional regulatory pathways in many more cellular contexts.
PMCID: PMC2764428  PMID: 19671526
12.  Identification and target prediction of miRNAs specifically expressed in rat neural tissue 
BMC Genomics  2009;10:214.
MicroRNAs (miRNAs) are a large group of RNAs that play important roles in regulating gene expression and protein translation. Several studies have indicated that some miRNAs are specifically expressed in human, mouse and zebrafish tissues. For example, miR-1 and miR-133 are specifically expressed in muscles. Tissue-specific miRNAs may have particular functions. Although previous studies have reported the presence of human, mouse and zebrafish tissue-specific miRNAs, there have been no detailed reports of rat tissue-specific miRNAs. In this study, home-made rat miRNA microarrays established in our previous study were used to investigate rat neural tissue-specific miRNAs and to map their target genes in rat tissues. This study will provide information for the functional analysis of these miRNAs.
In order to obtain as complete a picture of specific miRNA expression in rat neural tissues as possible, customized miRNA microarrays with 152 selected miRNAs from miRBase were used to detect miRNA expression in 14 rat tissues. After a general clustering analysis, the 14 rat tissues could be clearly classified into neural and non-neural tissues based on the obtained expression profiles with p values < 0.05. The results indicated that the miRNA profiles were different in neural and non-neural tissues. In total, we found 30 miRNAs that were specifically expressed in neural tissues. For example, miR-199a was specifically expressed in neural tissues. Of these, the expression patterns of four miRNAs were comparable with those of Landgraf et al., Bak et al., and Kapsimani et al. Thirty neural tissue-specific miRNAs were chosen to predict target genes. A total of 1,475 target mRNAs were predicted based on the intersection of three public databases, and pathway, function, and regulatory network analyses of the target mRNAs were performed. We focused on target enrichments of the dorsal root ganglion (DRG) and olfactory bulb. There were four Gene Ontology (GO) functions and five KEGG pathways significantly enriched in DRG. Only one GO function was significantly enriched in the olfactory bulb. These targets are all predictions and have not been experimentally validated.
Our work provides a global view of rat neural tissue-specific miRNA profiles and a target map of miRNAs, which is expected to contribute to future investigations of miRNA regulatory mechanisms in neural systems.
PMCID: PMC2688525  PMID: 19426523
13.  Human gene expression sensitivity according to large scale meta-analysis 
BMC Bioinformatics  2009;10(Suppl 1):S56.
Genes show different sensitivities in expression corresponding to various biological conditions. A systematic study of this concept is required because of its important implications for microarray analysis and related applications. J.H. Ohn et al. first studied this gene property with yeast transcriptional profiling data.
Here we propose a calculation framework for gene expression sensitivity analysis. We also compared the functions, centralities and transcriptional regulations of the sensitive and robust genes. We found that the robust genes tended to be involved in essential cellular processes. In contrast, the sensitive genes perform diverse functions. Moreover, while genes from both groups show similar geometric centrality when mapped onto integrated protein networks, the robust genes have higher vertex degree and betweenness than the sensitive genes. Interestingly, unlike the sensitive genes, the robust genes shared fewer transcription factors as their regulators.
Our study reveals different propensities of gene expression to external perturbations, demonstrates different roles of sensitive genes and robust genes in the cell, and argues for incorporating gene expression sensitivity into microarray analysis.
PMCID: PMC2648786  PMID: 19208159
14.  Gene expression module-based chemical function similarity search 
Nucleic Acids Research  2008;36(20):e137.
Investigation of biological processes using selective chemical interventions is generally applied in biomedical research and drug discovery. Many studies of this kind make use of gene expression experiments to explore cellular responses to chemical interventions. Recently, some research groups constructed libraries of chemical-related expression profiles, and introduced similarity comparison into chemical-induced transcriptome analysis. Resembling sequence similarity alignment, expression pattern comparison among chemical intervention-related expression profiles provides a new way for chemical function prediction and chemical–gene relation investigation. However, existing methods place more emphasis on comparing profile patterns globally, ignoring noise and marginal effects. At the same time, although the whole information of expression profiles has been used, it is difficult to uncover the underlying mechanisms that lead to the functional similarity between two molecules. Here a new approach is presented to perform biological effects similarity comparison within small, biologically meaningful gene categories. Regarding gene categories as units, a reduced similarity matrix is generated for measuring the biological distances between the query and the profiles in the library and for indicating in which modules chemical pairs resemble each other. Through the modularization of expression patterns, this method reduces experimental noise and marginal effects and directly correlates chemical molecules with gene function modules.
PMCID: PMC2582597  PMID: 18842630
15.  Localized-Statistical Quantification of Human Serum Proteome Associated with Type 2 Diabetes 
PLoS ONE  2008;3(9):e3224.
Recent advances in proteomics have shed light on the discovery of serum proteins or peptides as biomarkers for tracking the progression of diabetes as well as understanding molecular mechanisms of the disease.
In this work, human serum of non-diabetic and diabetic cohorts was analyzed by a proteomic approach. To analyze a total of 1377 high-confidence serum proteins, we developed a computing strategy called localized statistics of protein abundance distribution (LSPAD) to calculate a significant bias of a particular protein abundance between these two cohorts. As a result, 68 proteins were found to be significantly over-represented in the diabetic serum (p<0.01). In addition, a pathway-associated analysis was developed to obtain the overall pathway bias associated with type 2 diabetes, from which the significant over-representation of the complement system associated with type 2 diabetes was uncovered. Moreover, an upstream activator of the complement pathway, ficolin-3, was observed to be over-represented in the serum of type 2 diabetic patients, which was further validated with statistical significance (p = 0.012) with more clinical samples.
The developed LSPAD approach is well suited for analyzing proteomic data derived from complex biological systems such as the plasma proteome. With LSPAD, we disclosed the comprehensive distribution of the proteins associated with diabetes in different abundance levels and the involvement of ficolin-related complement activation in diabetes.
PMCID: PMC2529402  PMID: 18795103
16.  The prediction of interferon treatment effects based on time series microarray gene expression profiles 
The status of a disease can be reflected by specific transcriptional profiles resulting from the induction or repression activity of a number of genes. Here, we proposed a time-dependent diagnostic model to predict the treatment effects of interferon and ribavirin on HCV-infected patients by using time series microarray gene expression profiles of a published study.
In the published study, 33 African-American (AA) and 36 Caucasian American (CA) patients with chronic HCV genotype 1 infection received pegylated interferon and ribavirin therapy for 28 days. The HG-U133A GeneChip containing 22283 probes was used to analyze the global gene expression in peripheral blood mononuclear cells (PBMC) of all the patients on days 0 (pretreatment), 1, 2, 7, 14, and 28. According to the decrease of HCV RNA levels on day 28, two categories of responses were defined: good and poor. A voting method based on Student's t test, the Wilcoxon test, the empirical Bayes test and significance analysis of microarrays was used to identify differentially expressed genes. A time-dependent diagnostic model based on the C4.5 decision tree was constructed to predict the treatment outcome. This model utilized gene expression profiles not only before but also during the treatment. Leave-one-out cross validation was used to evaluate the performance of the model.
The model could correctly predict all Caucasian American patients' treatment effects at a very early time point. The prediction accuracy for African-American patients reached 85.7%. In addition, thirty potential biomarkers which may play important roles in response to interferon and ribavirin were identified.
Our method provides a way of using time series gene expression profiling to predict the treatment effect of pegylated interferon and ribavirin therapy on HCV-infected patients. Similar experimental and bioinformatic strategies may be used to improve treatment decisions for other chronic diseases.
PMCID: PMC2546378  PMID: 18691426
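The voting step mentioned in entry 16, where several statistical tests each nominate differentially expressed genes, can be sketched as a simple vote count. The abstract does not state the voting rule, so the `min_votes` threshold below is an assumption, and the gene names and per-method calls are invented for illustration. The sketch does not run the tests themselves; it only combines their outputs.

```python
from collections import Counter

def vote_genes(calls_by_method, min_votes):
    """Return genes flagged by at least min_votes methods, sorted by name.

    calls_by_method maps a method name to the collection of genes
    that method called differentially expressed.
    """
    votes = Counter()
    for genes in calls_by_method.values():
        votes.update(set(genes))  # each method votes at most once per gene
    return sorted(g for g, v in votes.items() if v >= min_votes)

# Hypothetical calls from the four tests named in the abstract.
calls = {
    "t_test": {"geneA", "geneB", "geneC"},
    "wilcoxon": {"geneA", "geneB"},
    "empirical_bayes": {"geneA", "geneC", "geneD"},
    "sam": {"geneA", "geneB", "geneD"},
}
selected = vote_genes(calls, min_votes=3)  # assumed threshold: 3 of 4
```

Requiring agreement among several tests trades sensitivity for robustness; a lower threshold keeps more genes at the cost of more false positives.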
17.  Transcript-level annotation of Affymetrix probesets improves the interpretation of gene expression data 
BMC Bioinformatics  2007;8:194.
The wide use of Affymetrix microarrays in a broadening range of biological research fields has made probeset annotation an important issue. Standard Affymetrix probeset annotation is at the gene level, i.e. a probeset is precisely linked to a gene, and probeset intensity is interpreted as gene expression. The growing knowledge that one gene may have multiple transcript variants clearly brings up the necessity of updating this gene-level annotation to a refined transcript level.
Through performing rigorous alignments of the Affymetrix probe sequences against a comprehensive pool of currently available transcript sequences, and further linking the probesets to the International Protein Index, we generated transcript-level or protein-level annotation tables for two popular Affymetrix expression arrays, Mouse Genome 430A 2.0 Array and Human Genome U133A Array. Application of our new annotations in re-examining existing expression data sets shows increased expression consistency among synonymous probesets and strengthened expression correlation between interacting proteins.
By refining the standard Affymetrix annotation of microarray probesets from the gene level to the transcript level and protein level, one can achieve a more reliable interpretation of experimental data, which may lead to the discovery of more profound regulatory mechanisms.
PMCID: PMC1913542  PMID: 17559689
18.  The use of global transcriptional analysis to reveal the biological and cellular events involved in distinct development phases of Trichophyton rubrum conidial germination 
BMC Genomics  2007;8:100.
Conidia are considered to be the primary cause of infections by Trichophyton rubrum.
We have developed a cDNA microarray containing 10250 ESTs to monitor the transcriptional strategy of conidial germination. A total of 1561 genes whose expression levels were specifically altered in the process were obtained and hierarchically clustered with respect to their expression profiles. By functional analysis, we provided a global view of an important biological system related to conidial germination, including characterization of the pattern of gene expression at sequential developmental phases, and changes of gene expression profiles corresponding to morphological transitions. We matched the EST sequences to GO terms in the Saccharomyces Genome Database (SGD). A number of homologues of Saccharomyces cerevisiae genes related to signalling pathways and some important cellular processes were found to be involved in T. rubrum germination. These genes and signalling pathways may play roles in distinct steps, such as activating conidial germination, maintenance of isotropic growth, establishment of cell polarity and morphological transitions.
Our results may provide insights into molecular mechanisms of conidial germination at the cell level, and may enhance our understanding of regulation of gene expression related to the morphological construction of T. rubrum.
PMCID: PMC1871584  PMID: 17428342
19.  Analysis of the dermatophyte Trichophyton rubrum expressed sequence tags 
BMC Genomics  2006;7:255.
Dermatophytes are the primary causative agents of dermatophytoses, diseases that affect billions of individuals worldwide. Trichophyton rubrum is the most common of the superficial fungi. Although T. rubrum is a recognized pathogen for humans, little is known about how its transcriptional pattern is related to development of the fungus and establishment of disease. It is therefore necessary to identify genes whose expression is relevant to growth, metabolism and virulence of T. rubrum.
We generated 10 cDNA libraries covering nearly the entire growth phase and used them to isolate 11,085 unique expressed sequence tags (ESTs), including 3,816 contigs and 7,269 singletons. Comparisons with the GenBank non-redundant (NR) protein database revealed putative functions or matched homologs from other organisms for 7,764 (70%) of the ESTs. The remaining 3,321 (30%) of ESTs were only weakly similar or not similar to known sequences, suggesting that these ESTs represent novel genes.
The present data provide a comprehensive view of fungal physiological processes including metabolism, sexual and asexual growth cycles, signal transduction and pathogenic mechanisms.
PMCID: PMC1621083  PMID: 17032460
21.  Systematic Analysis of Head-to-Head Gene Organization: Evolutionary Conservation and Potential Biological Relevance 
PLoS Computational Biology  2006;2(7):e74.
Several “head-to-head” (or “bidirectional”) gene pairs have been studied in individual experiments, but genome-wide analysis of this gene organization, especially in terms of transcriptional correlation and functional association, is still insufficient. We conducted a systematic investigation of head-to-head gene organization focusing on structural features, evolutionary conservation, expression correlation and functional association. Of the present 1,262, 1,071, and 491 head-to-head pairs identified in human, mouse, and rat genomes, respectively, pairs with 1– to 400–base pair distance between transcription start sites form the majority (62.36%, 64.15%, and 55.19% for human, mouse, and rat, respectively) of each dataset, and the largest group is always the one with a transcription start site distance of 101 to 200 base pairs. The phylogenetic analysis among Fugu, chicken, and human indicates a negative selection on the separation of head-to-head genes across vertebrate evolution, and thus the ancestral existence of this gene organization. The expression analysis shows that most of the human head-to-head genes are significantly correlated, and the correlation could be positive, negative, or alternative depending on the experimental conditions. Finally, head-to-head genes statistically tend to perform similar functions, and gene pairs associated with the significant cofunctions seem to have stronger expression correlations. The findings indicate that the head-to-head gene organization is ancient and conserved, which subjects functionally related genes to correlated transcriptional regulation and thus provides an exquisite mechanism of transcriptional regulation based on gene organization. These results have significantly expanded the knowledge about head-to-head gene organization. Supplementary materials for this study are available at
It was commonly assumed that higher eukaryotic genomes are loosely organized and genes are interspersed in the whole genome sequences. However, experiments have continuously identified eukaryotic head-to-head gene pairs with genes located closely next to each other, possibly sharing the same promoter; and preliminary genomic surveys have even shown head-to-head gene pairs to be a common feature of the human genome. The authors report a systematic investigation of head-to-head gene pairs in terms of the genomic structure, evolutionary conservation, expressional correlation, and functional association. The authors first identified some common structural and distributional patterns in three representative mammalian genomes: human, mouse, and rat. Then, through comparative analyses between human, chicken, and Fugu, they observed a conservation tendency of head-to-head gene pairs in vertebrates. Finally, interactive analyses of expressional and functional association yielded some interesting results, including the significant expression correlation of head-to-head genes, especially for the pairs with significant functional association. The main conclusion of this paper is that the head-to-head gene organization is ancient and conserved, subjecting functionally related genes to coregulated transcription. Lists of head-to-head gene pairs in human, mouse, rat, chicken, and Fugu are provided, while some individual pairs in need of further in-depth investigations are highlighted.
PMCID: PMC1487180  PMID: 16839196
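As an illustration of how head-to-head pairs like those in entry 21 might be enumerated from an annotation, here is a minimal sketch. It assumes a pair consists of two neighbouring genes on opposite strands that are transcribed away from each other, with transcription start sites (TSSs) within a distance cutoff. The 1000 bp default, the tuple layout, and the gene names are assumptions for illustration, not the criteria used in the paper.

```python
from collections import defaultdict

def head_to_head_pairs(genes, max_distance=1000):
    """Find divergent neighbouring gene pairs with close TSSs.

    genes: iterable of (name, chromosome, strand, tss) tuples.
    Returns a list of (minus_gene, plus_gene, tss_distance).
    max_distance is an assumed cutoff in base pairs.
    """
    by_chrom = defaultdict(list)
    for name, chrom, strand, tss in genes:
        by_chrom[chrom].append((tss, strand, name))

    pairs = []
    for chrom in sorted(by_chrom):
        ordered = sorted(by_chrom[chrom])  # order genes by TSS position
        for (tss1, s1, n1), (tss2, s2, n2) in zip(ordered, ordered[1:]):
            distance = tss2 - tss1
            # Divergent: minus-strand gene first, plus-strand gene second.
            if s1 == "-" and s2 == "+" and 1 <= distance <= max_distance:
                pairs.append((n1, n2, distance))
    return pairs

# Hypothetical annotation records.
genes = [
    ("A", "chr1", "-", 1000),
    ("B", "chr1", "+", 1150),   # A/B: divergent, 150 bp apart
    ("C", "chr1", "+", 5000),
    ("D", "chr1", "-", 5200),   # C/D: wrong strand order, not head-to-head
    ("E", "chr2", "-", 300),
    ("F", "chr2", "+", 2300),   # E/F: divergent but too far apart
]
pairs = head_to_head_pairs(genes)
```

Comparing only TSS-adjacent genes keeps the sketch short; overlapping or nested genes would need a more careful neighbour definition.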
