1.  Functional and evolutionary correlates of gene constellations in the Drosophila melanogaster genome that deviate from the stereotypical gene architecture 
BMC Genomics  2010;11:322.
The biological dimensions of genes are manifold. These include genomic properties (e.g., X/autosomal linkage, recombination) and functional properties (e.g., expression level, tissue specificity). Multiple properties, each generally of subtle influence individually, may affect the evolution of genes or merely be (auto-)correlates. Results of multidimensional analyses may reveal the relative importance of these properties for the evolution of genes, and therefore help evaluate whether these properties should be considered during analyses. While numerous properties are now considered during studies, most work still assumes the stereotypical solitary gene as commonly depicted in textbooks. Here, we investigate the Drosophila melanogaster genome to determine whether deviations from the stereotypical gene architecture correlate with other properties of genes.
Deviations from the stereotypical gene architecture were classified as the following gene constellations: overlapping genes were defined as those that overlap in the 5-prime, exonic, or intronic regions; chromatin co-clustering genes were defined as genes that co-clustered within 20 kb of transcriptional territories. When this scheme is applied, the stereotypical gene emerges as a rare occurrence (7.5%); slightly varied schemes yielded between ~1% and 50%. Moreover, under our scheme, paired-overlapping genes and chromatin co-clustering genes accounted for 50.1% and 42.4% of the genes analyzed, respectively. Gene constellation was a correlate of a number of functional and evolutionary properties of genes, but its statistical effect was ~1-2 orders of magnitude lower than the effects of recombination, chromosome linkage and protein function. Analysis of datasets on male reproductive proteins showed these were biased in their representation of gene constellations and in their evolutionary rate (Ka/Ks) estimates, but these biases did not overwhelm the biologically meaningful observation of high evolutionary rates of male reproductive genes.
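The overlap part of the classification above can be illustrated with a minimal sketch. It assumes each gene is reduced to a single chromosomal span, which collapses the abstract's 5-prime, exonic and intronic categories into one. The `Gene` record, the function name and all coordinates are hypothetical and are not taken from the study; chromatin co-clustering is not modelled here.

```python
from itertools import combinations
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # inclusive coordinate (hypothetical)
    end: int    # inclusive coordinate (hypothetical)


def overlapping_genes(genes: list[Gene]) -> set[str]:
    """Return names of genes whose span overlaps another gene on the same chromosome."""
    flagged: set[str] = set()
    # Pairwise comparison: quadratic, but easy to verify for a small example.
    for a, b in combinations(genes, 2):
        same_chrom = a.chrom == b.chrom
        spans_intersect = a.start <= b.end and b.start <= a.end
        if same_chrom and spans_intersect:
            flagged.update((a.name, b.name))
    return flagged


# Illustrative input only; these are not real annotations.
example = [
    Gene("A", "2L", 100, 500),
    Gene("B", "2L", 400, 900),
    Gene("C", "2L", 2000, 2500),
    Gene("D", "X", 450, 600),
]
print(sorted(overlapping_genes(example)))
```

Genes that are neither flagged here nor part of a co-cluster would be the candidates for the "solitary" category in such a scheme.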
Given the rarity of the solitary stereotypical gene, and the abundance of gene constellations that deviate from it, the presence of gene constellations, while once thought to be exceptional in large eukaryotic genomes, might have broader relevance to the understanding and study of the genome. However, according to our definition, while gene constellations can be significant correlates of functional properties of genes, they generally are weak correlates of the evolution of genes. Thus, the need for their consideration would depend on the context of studies.
PMCID: PMC2891614  PMID: 20497561
2.  Adaptive introgression of anticoagulant rodent poison resistance by hybridization between Old World mice 
Current biology : CB  2011;21(15):1296-1301.
It is known that evolution by selection on new or standing single nucleotide polymorphisms (SNPs) in the vitamin K 2,3-epoxide reductase subcomponent 1 (vkorc1) of house mice (Mus musculus domesticus) can cause resistance to anticoagulant rodenticides such as warfarin [1–3]. Here we report an introgression in European M. m. domesticus spanning as much as ~20.3 megabases (Mb) and including vkorc1, the molecular target of anticoagulants [1–4], that stems from hybridization with the Algerian mouse (M. spretus). We show that in the laboratory the homozygous complete vkorc1 allele of M. spretus confers resistance when introgressed into M. m. domesticus. Consistent with selection on the introgression after the introduction of rodenticides in the 1950s, we document historically adaptive population genetics of vkorc1 in M. m. domesticus. Furthermore, we detected adaptive protein evolution of vkorc1 in the M. spretus lineage (Ka/Ks=1.54–1.93) resulting in radical amino-acid substitutions that apparently have anticoagulant tolerance of M. spretus as a pleiotropic effect. Thus, positive selection produced an adaptive, divergent and pleiotropic vkorc1 allele in the donor species, M. spretus, which crossed a species barrier and is expressed as an adaptive trait in the recipient species, M. m. domesticus. Resistant house mice originated from selection on new or standing vkorc1 polymorphisms and from selection on vkorc1 polymorphisms acquired by adaptive introgressive hybridization.
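For readers unfamiliar with the Ka/Ks statistic quoted above, the sketch below shows how the ratio is conventionally read: values above 1 are taken as a signal of positive selection, values below 1 as purifying selection. The function names and the input rates are hypothetical, and a real analysis would estimate Ka and Ks from aligned coding sequences and test the ratio statistically rather than compare it to 1 directly.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    return ka / ks


def selection_regime(ratio: float) -> str:
    """Conventional reading of a Ka/Ks ratio (no significance test applied)."""
    if ratio > 1:
        return "positive selection"
    if ratio < 1:
        return "purifying selection"
    return "neutral"


# Hypothetical rates chosen for illustration; not values from the study.
print(selection_regime(ka_ks_ratio(ka=0.03, ks=0.01)))
print(selection_regime(ka_ks_ratio(ka=0.002, ks=0.02)))
```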
PMCID: PMC3152605  PMID: 21782438
3.  Co-expression of adjacent genes in yeast cannot be simply attributed to shared regulatory system 
BMC Genomics  2007;8:352.
Adjacent gene pairs in the yeast genome have a tendency to be expressed concurrently. Sharing of regulatory elements within the intergenic region of those adjacent gene pairs has often been considered the major mechanism responsible for such co-expression. However, it is still debated to what extent common transcription factors (TFs) contribute to the co-expression of adjacent genes. In order to resolve the evolutionary aspect of this issue, we investigated the conservation of adjacent pairs in five yeast species. Using the information on TF binding sites in promoter regions available from the MYBS database, we analyzed the ratios of TF-sharing pairs among all the adjacent pairs in yeast genomes. The levels of co-expression in different adjacent patterns were also compared.
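The ratio of TF-sharing pairs described above can be sketched as follows. This is an assumption-laden illustration, not the authors' pipeline: genes are given in chromosomal order, each gene maps to the set of TFs with a binding site in its promoter, and a pair counts as TF-sharing when those sets intersect. The function name, gene labels and TF labels are all hypothetical.

```python
def tf_sharing_fraction(gene_order: list[str], tfs_by_gene: dict[str, set[str]]) -> float:
    """Fraction of adjacent gene pairs whose promoters share at least one TF."""
    # Adjacent pairs are consecutive genes in the given chromosomal order.
    pairs = list(zip(gene_order, gene_order[1:]))
    if not pairs:
        return 0.0
    sharing = sum(
        1
        for left, right in pairs
        if tfs_by_gene.get(left, set()) & tfs_by_gene.get(right, set())
    )
    return sharing / len(pairs)


# Hypothetical annotations for illustration only.
order = ["g1", "g2", "g3", "g4"]
tfs = {"g1": {"TF_A"}, "g2": {"TF_A", "TF_B"}, "g3": {"TF_C"}, "g4": set()}
print(tf_sharing_fraction(order, tfs))
```

A fuller treatment would handle each chromosome separately and distinguish divergent, convergent and tandem pairs, since the abstract compares co-expression across adjacency patterns.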
Our analyses showed that the proportion of adjacent pairs conserved in five yeast species is relatively low compared to that in the mammalian lineage. The proportion was also low for adjacent gene pairs with shared TFs. In particular, the statistical analysis suggested that co-expression of adjacent gene pairs was not noticeably associated with the sharing of TFs in these pairs. We further examined the case of the PAC (polymerase A and C) and RRPE (rRNA processing element) motifs, which co-regulate divergent/bidirectional pairs, and found that the shared TFs were not significantly relevant to co-expression of divergent promoters among adjacent genes.
Our findings suggest that the commonly shared cis-regulatory system is not the sole contributor to the co-expression of adjacent gene pairs in the yeast genome. We therefore believe that during evolution yeasts have developed a sophisticated regulatory system that integrates both TF-based and non-TF-based mechanism(s) for concurrent regulation of neighboring genes in response to various environmental changes.
PMCID: PMC2045684  PMID: 17910772
4.  MYBS: a comprehensive web server for mining transcription factor binding sites in yeast 
Nucleic Acids Research  2007;35(Web Server issue):W221-W226.
Correct interactions between transcription factors (TFs) and their binding sites (TFBSs) are of central importance to gene regulation. Recently developed chromatin-immunoprecipitation DNA chip (ChIP-chip) techniques and the phylogenetic footprinting method provide ways to identify TFBSs with high precision. In this study, we constructed a user-friendly interactive platform for dynamic binding site mapping using ChIP-chip data and phylogenetic footprinting as two filters. MYBS (Mining Yeast Binding Sites) is a comprehensive web server that integrates an array of both experimentally verified and predicted position weight matrices (PWMs) from eleven databases, including 481 binding motif consensus sequences and 71 PWMs that correspond to 183 TFs. MYBS users can search within this platform for motif occurrences (possible binding sites) in the promoters of genes of interest via simple motif or gene queries in conjunction with the above two filters. In addition, MYBS enables users to visualize in parallel the potential regulators for a given set of genes, a feature useful for finding potential regulatory associations between TFs. MYBS also allows users to identify target gene sets of each TF pair, which could be used as a starting point for further explorations of TF combinatorial regulation. MYBS is available at
PMCID: PMC1933147  PMID: 17537814
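As background for the PWM-based motif search mentioned in the MYBS abstract, the following is a generic log-odds scanning sketch. It is not MYBS's implementation and omits the ChIP-chip and phylogenetic footprinting filters. The function name, the toy matrix, the uniform background frequency and the threshold are all assumptions made for illustration.

```python
import math


def scan_pwm(sequence: str, pwm: list[dict[str, float]], threshold: float,
             background: float = 0.25) -> list[tuple[int, str, float]]:
    """Slide a PWM along a sequence and return (position, window, score) hits.

    pwm holds one dict per motif position, mapping each base to a nonzero
    probability. The score is the sum of log2(probability / background).
    """
    width = len(pwm)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        # Skip windows containing symbols the matrix does not cover (e.g. N).
        if any(base not in pwm[j] for j, base in enumerate(window)):
            continue
        score = sum(math.log2(pwm[j][base] / background)
                    for j, base in enumerate(window))
        if score >= threshold:
            hits.append((i, window, score))
    return hits


def toy_column(preferred: str) -> dict[str, float]:
    """Hypothetical column: 0.85 for the preferred base, 0.05 for each other base."""
    return {base: (0.85 if base == preferred else 0.05) for base in "ACGT"}


# Toy three-column matrix favouring "TGA"; not a motif taken from any database.
toy_pwm = [toy_column(base) for base in "TGA"]
hits = scan_pwm("ACTGATTGAC", toy_pwm, threshold=4.0)
print([(pos, window) for pos, window, _ in hits])
```

In practice the probabilities would come from curated matrices with pseudocounts, and the background would reflect the base composition of the genome being scanned.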

