1.  Barcoded DNA-Tag Reporters for Multiplex Cis-Regulatory Analysis 
PLoS ONE  2012;7(4):e35934.
Cis-regulatory DNA sequences causally mediate patterns of gene expression, but efficient experimental analysis of these control systems has remained challenging. Here we develop a new version of “barcoded” DNA-tag reporters, “Nanotags”, that permit simultaneous quantitative analysis of up to 130 distinct cis-regulatory modules (CRMs). The activities of these reporters are measured in single experiments by the NanoString RNA counting method and other quantitative procedures. We demonstrate the efficiency of the Nanotag method by simultaneously measuring hourly temporal activities of 126 CRMs from 46 genes in the developing sea urchin embryo, otherwise a virtually impossible task. Nanotags are also used in gene perturbation experiments to reveal cis-regulatory responses of many CRMs at once. Nanotag methodology can be applied to many research areas, ranging from gene regulatory networks to functional and evolutionary genomics.
PMCID: PMC3339872  PMID: 22563420
2.  High accuracy, high-resolution prevalence measurement for the majority of locally expressed regulatory genes in early sea urchin development 
Gene expression patterns : GEP  2010;10(4-5):177-184.
Accurate measurements of transcript abundance are a prerequisite for understanding gene activity in development. Using the NanoString nCounter, an RNA counting device, we measured the prevalence of 172 transcription factors and signaling molecules in early sea urchin development. These measurements show high fidelity over more than five orders of magnitude down to a few transcripts per embryo. Most of the genes included are locally restricted in their spatial expression, and contribute to the divergent regulatory states of cells in the developing embryo. In order to obtain high-resolution expression profiles from fertilization to late gastrulation, samples were collected at hourly intervals. The measured time courses agree well with, and substantially extend, prior relative abundance measurements obtained by quantitative PCR. High temporal resolution permits sequences of successively activated genes to be precisely delineated, providing an ancillary tool for assembling maps of gene regulatory networks. The data are available via an interactive website for quick plotting of selected time courses.
PMCID: PMC2902461  PMID: 20398801
Transcription factor; Gene expression time course; mRNA prevalence measurement; Embryogenesis
3.  Exclusive Developmental Functions of gatae cis-Regulatory Modules in the Strongylocentrotus purpuratus Embryo 
Developmental biology  2007;307(2):434-445.
The gatae gene of Strongylocentrotus purpuratus is orthologous to vertebrate gata-4,5,6 genes. This gene is expressed in the endomesoderm in the blastula and later the gut of the embryo, and is required for normal development. A gatae BAC containing a GFP reporter knocked into exon one of the gene was able to reproduce all aspects of endogenous gatae expression in the embryo. To identify putative gatae cis-regulatory modules, we carried out an interspecific sequence conservation analysis with respect to a Lytechinus variegatus gatae BAC, which revealed 25 conserved non-coding sequence patches. These were individually tested in gene transfer experiments, and two modules capable of driving localized reporter expression in the embryo were identified. Module 10 produces early expression in mesoderm and endoderm cells up to the early gastrula stage, while module 24 generates late endodermal expression at gastrula and pluteus stages. Module 10 was then deleted from the gatae BAC by reciprocal recombination, resulting in total loss of reporter expression in the time frame in which it is normally active. Similar deletion of module 24 led to ubiquitous GFP expression in the gastrula and pluteus. These results show that module 10 is uniquely necessary and sufficient to account for the early phase of gatae expression during endomesoderm specification. In addition, they imply a functional cis-regulatory module exclusion, whereby only a single module can associate with the basal promoter and drive gene expression at any given time.
PMCID: PMC2031225  PMID: 17570356
sea urchin; gene regulation; GATA factors; cis-regulatory analysis; gatae
4.  Cis-regulatory control of the nodal gene, initiator of the sea urchin oral ectoderm gene network 
Developmental biology  2007;306(2):860-869.
Expression of the nodal gene initiates the gene regulatory network which establishes the transcriptional specification of the oral ectoderm in the sea urchin embryo. This gene encodes a TGFβ ligand, and in Strongylocentrotus purpuratus its transcription is activated in the presumptive oral ectoderm at about the 30-cell stage. Thereafter Nodal signaling occurs among all cells of the oral ectoderm territory, and nodal expression is required for expression of oral ectoderm regulatory genes. The cis-regulatory system of the nodal gene transduces anisotropically distributed cytoplasmic cues that distinguish the future oral and aboral domains of the early embryo. Here we establish the genomic basis for the initiation and maintenance of nodal gene expression in the oral ectoderm. Functional cis-regulatory control modules of the nodal gene were identified by interspecific sequence conservation. A 5′ cis-regulatory module functions both to initiate expression of the nodal gene and to maintain its expression by means of feedback input from the Nodal signal transduction system. These functions are mediated respectively by target sites for bZIP transcription factors, and by SMAD target sites. At least one SMAD site is also needed for the initiation of expression. An intron module also contains SMAD sites which respond to Nodal feedback, and in addition acts to repress vegetal expression. These observations explain the main features of nodal expression in the oral ectoderm: since the activity of bZIP factors is redox sensitive, and the initial polarization of oral vs aboral fate is manifested in a redox differential, the bZIP sites account for the activation of nodal on the oral side; and since the immediate early signal transduction response factors for Nodal are SMAD factors, the SMAD sites account for the feedback maintenance of nodal gene expression.
PMCID: PMC2063469  PMID: 17451671
Nodal; Oral ectoderm; Gene regulatory network; Community effect; TGF-beta; bZIP; SMAD; Sea urchin; Positive feedback regulation; cis-regulatory analysis
5.  Evolutionary Change of the Numbers of Homeobox Genes in Bilateral Animals 
Molecular biology and evolution  2005;22(12):2386-2394.
It is known that the conservation or diversity of homeobox genes is responsible for the similarity and variability of some of the morphological or physiological characters among different organisms. To gain some insights into the evolutionary pattern of homeobox genes in bilateral animals, we studied the change of the numbers of these genes during the evolution of bilateral animals. We analyzed 2,031 homeodomain sequences compiled from 11 species of bilateral animals ranging from Caenorhabditis elegans to humans. Our phylogenetic analysis using a modified reconciled-tree method suggested that there were at least about 88 homeobox genes in the common ancestor of bilateral animals. About 50–60 of these genes have left at least one descendant gene in each of the 11 species studied, suggesting that about 30–40 genes were lost in a lineage-specific manner. Although similar numbers of ancestral genes have survived in each species, vertebrate lineages gained many more genes by duplication than invertebrate lineages, resulting in more than 200 homeobox genes in vertebrates and about 100 in invertebrates. After these gene duplications, a substantial number of old duplicate genes have also been lost in each lineage. Because many old duplicate genes were lost, it is likely that lost genes had already been differentiated from other groups of genes at the time of gene loss. We conclude that both gain and loss of homeobox genes were important for the evolutionary change of phenotypic characters in bilateral animals.
PMCID: PMC1464090  PMID: 16079247
homeobox genes; molecular evolution; gene duplication; gene loss; evolutionary developmental biology
6.  A simple method for predicting the functional differentiation of duplicate genes and its application to MIKC-type MADS-box genes 
Nucleic Acids Research  2005;33(2):e12.
A simple statistical method for predicting the functional differentiation of duplicate genes was developed. This method is based on the premise that the extent of functional differentiation between duplicate genes is reflected in the difference in evolutionary rate because the functional change of genes is often caused by relaxation or intensification of functional constraints. With this idea in mind, we developed a window analysis of protein sequences to identify the protein regions in which the significant rate difference exists. We applied this method to MIKC-type MADS-box proteins that control flower development in plants. We examined 23 pairs of sequences of floral MADS-box proteins from petunia and found that the rate differences for 14 pairs are significant. The significant rate differences were observed mostly in the K domain, which is important for dimerization between MADS-box proteins. These results indicate that our statistical method may be useful for predicting protein regions that are likely to be functionally differentiated. These regions may be chosen for further experimental studies.
PMCID: PMC548370  PMID: 15659573
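The window analysis described in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' published statistic: it assumes three pre-aligned protein sequences (two duplicates and an outgroup), counts the sites in each window where only one duplicate differs from the outgroup, and compares the two counts with an exact two-sided binomial test. The function names, the window and step sizes, and the choice of test are all hypothetical and chosen only for illustration.

```python
from math import comb


def binomial_two_sided_p(k, n):
    """Exact two-sided p-value for k successes in n trials with p = 0.5."""
    if n == 0:
        return 1.0  # no informative sites, so no evidence of a difference
    tail = min(k, n - k)
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, p)


def window_rate_differences(dup_a, dup_b, outgroup, window=4, step=1):
    """Slide a window along three aligned sequences and compare the number
    of changes unique to each duplicate (assumed test, not the paper's)."""
    if not (len(dup_a) == len(dup_b) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(outgroup) - window + 1, step):
        only_a = only_b = 0
        end = start + window
        for a, b, o in zip(dup_a[start:end], dup_b[start:end], outgroup[start:end]):
            if "-" in (a, b, o):
                continue  # skip alignment gaps
            if a != o and b == o:
                only_a += 1  # change unique to duplicate A
            elif b != o and a == o:
                only_b += 1  # change unique to duplicate B
        p_value = binomial_two_sided_p(only_a, only_a + only_b)
        results.append((start, only_a, only_b, p_value))
    return results


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    for row in window_rate_differences("MKTAYIAK", "MKTLLLLK", "MKTAYIAK", window=4, step=4):
        print(row)
```

Windows with a small p-value would be candidates for regions where the two duplicates evolve at different rates. A real analysis would also need a substitution model and a correction for testing many windows, neither of which this sketch attempts.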

