Comprehensive gene expression data can provide a powerful resource for predicting and building models of gene regulatory networks. Identifying all the factors that control where, when and to what level mRNA is produced will be essential for deciphering the common language governing transcriptional regulation. Work is being done to identify sequence elements that convey lineage, spatial and temporal expression information for a gene. However, an integral part of such analyses will be datasets containing in vivo, temporal-spatial descriptions of gene expression in experimentally manipulated environments. We have focused on developing a method to reproducibly characterize gene expression in pigment cell lineages at E11.5 in the mouse. This stage captures a time when early melanoblasts are present along the entirety of the embryo and are undergoing active migration and proliferation. It is within this context that we have generated a pigment cell gene expression dataset, using systematic evaluation criteria to identify distinct melanoblast populations and categorize genes based on in situ gene expression.
An inherent requirement of using whole mount in situ hybridization data for comparative analyses is appropriate quantitation and spatial annotation of the data. To accomplish this, we have developed a defined MDS scoring system, tailored to the melanoblast lineage, to rapidly, systematically and reproducibly annotate the melanoblast gene expression pattern. This system can be applied to multiple embryos and used to compare genes within backgrounds or genes in the context of genetic or environmental variations. It is important to note that the MDS system is a quantitative score of melanoblast number based on in situ hybridization detection of a specific gene. Thus, if the in situ hybridization signal for a gene is detectable in all melanoblasts, then the MDS score reflects the total melanoblast number present in a given embryonic region. Alternatively, the MDS score may represent a sub-population of melanoblasts. For example, a systemic perturbation (such as genetic background or inclusion of a genetic mutation) or natural variation in gene expression may result in a subset of melanoblasts with undetectable levels of the gene without affecting overall melanoblast cell numbers. Both indices are biologically important to identify; however, one must be mindful of these parameters when incorporating appropriate controls and interpreting the data using this approach.
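The interpretive logic described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that MDS scores are available as one number per marker gene for a control and a test condition. The `classify_reduction` helper, the 0.5 cutoff, the placeholder gene labels and all score values are hypothetical and are not part of the scoring system or the data reported here.

```python
from typing import Dict


def classify_reduction(
    control: Dict[str, float],
    test: Dict[str, float],
    threshold: float = 0.5,  # hypothetical cutoff on the test/control score ratio
) -> str:
    """Interpret marker-wise score changes between two conditions.

    Returns "none" if no marker is reduced, "cell-number" if every marker
    is reduced (consistent with fewer melanoblasts overall), and
    "gene-specific" if only a subset of markers is reduced (consistent
    with altered expression of those genes in otherwise present cells).
    """
    # Only markers detected in the control condition are informative.
    informative = [gene for gene, score in control.items() if score > 0]
    reduced = [
        gene
        for gene in informative
        if test.get(gene, 0.0) / control[gene] < threshold
    ]
    if not reduced:
        return "none"
    if len(reduced) == len(informative):
        return "cell-number"
    return "gene-specific"


# Invented placeholder scores; they do not correspond to any real measurement.
control_scores = {"geneA": 4.0, "geneB": 4.0, "geneC": 3.0}
one_marker_down = {"geneA": 1.0, "geneB": 4.0, "geneC": 3.0}
all_markers_down = {"geneA": 1.0, "geneB": 1.0, "geneC": 1.0}

print(classify_reduction(control_scores, one_marker_down))
print(classify_reduction(control_scores, all_markers_down))
```

The point of the sketch is only that a single marker cannot separate the two interpretations; several markers scored in the same condition are needed as controls.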
With this in situ data set we find examples where the MDS score captures either gene expression differences or a reduction in overall melanoblast numbers in the embryo. An example of our ability to capture gene expression differences comes from the comparative analysis of all Class I genes in transcription factor-haploinsufficient conditions. Here, analysis of Dct expression revealed a significant reduction in the number of Dct+ melanoblasts found in Sox10tm1Weg/+ embryos in comparison to Mitfmi/+ embryos. None of the other Class I genes exhibits a similar reduction in gene-positive melanoblasts in Sox10tm1Weg/+ embryos. Taken together, these results indicate that there are more melanoblasts present in Sox10tm1Weg/+ embryos than are found to be Dct+, and that an in vivo reduction in SOX10 levels results in a more dramatic reduction in Dct expression than that observed due to alteration in MITF levels alone. These in vivo results are consistent with what we have observed through in vitro promoter transactivation assays, which indicate that SOX10 and MITF synergistically regulate Dct expression (Jiao et al., 2004; Ludwig et al., 2004; Potterf et al., 2001), and that Mitf is also a direct target of SOX10 (Bondurand et al., 2000b; Potterf et al., 2000) and would thus be reduced in Sox10tm1Weg/+ embryos (Britsch et al., 2001). These results also suggest that, unlike Dct, the promoters for the genes Si, Gpnmb, Slc45a2 and Tyr would not be predicted to respond to MITF and SOX10 in a synergistic manner.
Another example of the capacity of the MDS score to capture variation in gene expression levels is observed in the analysis of Gpnmb and Si. For both genes, there were more gene-positive melanoblasts across the entirety of the embryo in Sox10tm1Weg/+ embryos than in Mitfmi/+ embryos. Given that three other melanoblast genes, Mitf, Slc45a2 and Tyr, present with similar numbers of melanoblasts in both Sox10 and Mitf heterozygous mice, we suggest that regulatory elements governing Si and Gpnmb gene expression may be more sensitive to reductions in functional MITF levels than to reductions in SOX10. Therefore, Gpnmb and Si can be grouped distinctly from the other Class I genes, Dct, Slc45a2 and Tyr. Future systematic analysis of the cis-regulatory regions of these genes, correlated with annotation of the repertoire of transcription factors available in the melanoblast and RPE lineages at E11.5, will be required to begin building more comprehensive predictive models that define these gene expression patterns.
The MDS score also captures gene expression differences with respect to the location of gene-positive cell populations in vivo. For example, Mitf gene expression patterns are distinct relative to the other genes in this data set when comparing the MDS heatmap profiles in Mitfmi/mi homozygous embryos. Previously, we observed Gpnmb-positive cells marking Sox10-independent melanoblasts in a discrete region over the hind limb and tail at E11.5 (Loftus et al., 2008). Here we have extended the analysis of this population of cells in Sox10tm1Weg/tm1Weg embryos and determined that Dct and Slc45a2 are not expressed in this newly defined cell population. This distinct gene expression pattern may be correlated with the observation that Gpnmb is also regulated by MITF in the osteoclast cell lineage (Ripoll et al., 2008). Future analyses will be required to determine precisely what factors contribute to the SOX10-independent expression of these two genes in the caudal melanoblast population and thus differentiate Mitf from other pigment cell-expressed genes.
Analysis of our dataset also provides examples of how genes can be categorized based on temporal-spatial and lineage information. For example, if expression were examined only in terminally differentiated melanocyte cell populations, most genes in this study would be grouped into one class. However, analysis at E11.5 showed that Gpnmb, Si and Slc45a2 (Class I) can be distinguished from Trpm1 and Mlana (Class II), as the latter are not expressed in melanoblasts at the E11.5 timepoint. This temporal difference is not explained by whether or not these genes are MITF target genes, as experimental evidence demonstrates that MITF contributes to the regulation of Trpm1 and Mlana in pigmented melanocytes and melanoma cells (Du et al., 2003; Yasumoto et al., 1995; Zhiqi et al., 2004). These data suggest that a more complex interaction of regulatory factors, in combination with MITF, regulates the temporal initiation of gene expression in early melanoblasts versus melanocytes.
Finally, by applying the MDS scoring system to a comparison of multiple inbred strains, we demonstrate that, early in melanoblast development, patterns can vary significantly depending on the strain background and embryonic location. In this case the variation observed is likely not due to modifier loci merely reducing Si expression in a region-specific manner, as we have observed greater numbers of trunk melanoblasts in E11.5 FVB/NJ mice than in C57BL/6J mice for Dct and Slc45a2 expression (Baxter and Pavan, 2003, and unpublished data), similar to what is seen for Si in these two strains. Thus, the strain-dependent variation in numbers of Si-positive melanoblasts is likely due to regional differences in melanoblast numbers between strains and not to variations in levels of Si gene expression within melanoblasts. Taken together, this suggests that there are strain-dependent modifiers of early melanoblast numbers in the embryo. It has long been appreciated that the location and size of hypo-pigmented areas in adult mice are inherited as quantitative traits (Doolittle et al., 1975; Dunn and Charles, 1937; Pavan et al., 1995), and it has more recently been demonstrated that the background strains on which the spotting mutation alleles of Ednrb are maintained affect the severity of the spotting (Asher et al., 1996; Britsch et al., 2001; Pavan et al., 1995; Southard-Smith et al., 1999). However, the total number of loci that represent strain-specific modifiers of hypopigmentation remains to be determined. Currently, 104 genes and/or loci annotated in the MGI database (http://www.informatics.jax.org/) exhibit some degree of “white spotting”, with typical patterns being belly spots, head spots, piebald spotting, and white “belts” which encircle the caudal trunk (Baxter et al., 2004). These genes/loci are known to affect a variety of melanoblast/melanocyte cellular processes including migration, differentiation, proliferation, and cell survival. There is the potential for numerous pigmentation modifier loci to segregate among the many inbred strains. Application of the MDS scoring system in the context of a consistent inbred strain background can help establish whether gene expression differences such as those described here are correctly attributed to gene-intrinsic variations or to independent modifiers.
All 14 genes in this data set are expressed in the terminally differentiated melanocyte lineage, with 8 of 14 genes (Si, Dct, Gpnmb, Slc45a2, Mlana, Tyr, Tyrp1, and Trpm1) classified as MITF transcriptional targets. However, here we demonstrate that not all of these genes respond to alteration in MITF levels in a similar manner. Through systematic evaluation of this expression dataset we have been able to sub-divide this gene set into smaller, expression-based, phenotype-correlated gene groupings. These results reinforce the fact that transcriptional regulation is a complex process. Transcriptional regulation is often approached in a simplified manner through the prism of data collected in a single isolated cell type, such as a melanocyte or melanoma-derived cell line, under a single condition. Such analysis by default excludes spatio-temporal analysis, such as observations of marked cell populations during embryonic development. Our dataset and methodology for annotation provide a framework for phenotypic correlations to be made for genes based upon in vivo expression characteristics. In the future, in-depth functional analysis of genomic elements will be required to obtain detailed and precise gene expression information. More lineage-appropriate gene expression information is required, and must be collected in a standardized manner, if one is to develop computational models that identify the parameters that define where, when, and at what level a gene is expressed. Fundamental to this type of modeling and analysis is information on how modulation of a single regulatory pathway or a single transcription factor varies the expression of individual genes, and how that variation compares across a collection of genes with defined, similar patterns of expression. Cataloging this variation is the first step in identifying temporal-spatial gene expression pattern correlations between genes and the regulatory sequences for those genes.
These correlations will provide the foundation to annotate and query the common and distinct cis-regulatory factors for each of these genes and will allow for future testing and evaluation of candidate regulatory elements, among additional genes demonstrating similar defined expression patterns, to assess the extent to which these elements are predictive of the expression patterns observed.