Two-dimensional patterning of the follicular epithelium in Drosophila oogenesis is required for the formation of three-dimensional eggshell structures. Our analysis of a large number of published gene expression patterns in the follicle cells suggested that they follow a simple combinatorial code, based on six spatial building blocks and the operations of union, difference, intersection, and addition. The building blocks are related to the distribution of the inductive signals, provided by the highly conserved EGFR and DPP pathways. We demonstrated the validity of the code by testing it against a set of newly identified expression patterns, obtained in a large-scale transcriptional profiling experiment. Using the proposed code, we distinguished 36 distinct patterns for 81 genes expressed in the follicular epithelium and characterized their joint dynamics over four stages of oogenesis. This work provides the first systematic analysis of the diversity and dynamics of two-dimensional gene expression patterns in a developing tissue.
The Drosophila eggshell is an elaborate three-dimensional structure that is derived from the follicular epithelium in the developing egg chamber (Figure 1A-C). The dorsal-anterior structures of the eggshell, including the dorsal appendages and operculum, are formed by the follicle cells that are patterned by Gurken (GRK), a TGFα-like ligand secreted by the oocyte, and Decapentaplegic (DPP), a BMP2/4-type ligand secreted by the follicle cells stretched over the nurse cells (reviewed in Dobens and Raftery, 2000; Berg, 2005). GRK and DPP control the expression of multiple genes in the follicular epithelium. Under their action, the expression of a Zn-finger transcription factor Broad (BR) evolves into a pattern with two patches on either side of the dorsal midline (Deng and Bownes, 1997; Yakoby et al., 2008). The BR-expressing cells form the roof (upper part) of the dorsal appendages (James and Berg, 2003; Dorman et al., 2004; Ward and Berg, 2005). Adjacent to the BR-expressing cells are two stripes of cells that express rhomboid (rho), a gene that is directly repressed by BR and encodes a ligand-processing protease in the EGFR pathway (Ruohola-Baker et al., 1993; Sapir et al., 1998; Lee et al., 2001; Ward and Berg, 2005). These cells form the floor (lower part) of the appendages (James and Berg, 2003; Dorman et al., 2004; Berg, 2005).
The patterns of genes expressed during the stages of oogenesis that correspond to the formation of dorsal eggshell structures are very diverse (Figure 1D). At the same time, inspection of a large number of published patterns suggests that they can be “constructed” from a small number of building blocks. For instance, the T-shaped pattern of CG3074 is similar to the domain “missing” in the early pattern of br (Figure 1D,iv,iii), while the two patches in the late pattern of br appear to correspond to the two “holes” in the expression of 18w (Figure 1D,i,v). Based on a number of similar observations, we hypothesized that all of the published patterns could be constructed from just six basic shapes, or primitives, which reflect the anatomy of the egg chamber and the spatial structure of the patterning signals (Figure 2).
In computer graphics, representation of geometrical objects in terms of a small number of building blocks is known as Constructive Solid Geometry (CSG), which provides a way to describe complex shapes in terms of just a few parameters: the types of the building blocks, such as cylinders, spheres, and cubes, their sizes, and operations, such as difference, union, and intersection (Requicha and Voelcker, 1982; Foley et al., 1992). Thus, information about a large number of structures can be stored in a compact form of statements that contain information about the types of the building blocks and the operations from which these structures were assembled. Here, we describe a similar approach for two-dimensional patterns and demonstrate how it enables the synthesis, comparison, and analysis of gene expression at the tissue scale.
The six building blocks used in our annotation system can be related to the structure of the egg chamber and the spatial distribution of the EGFR and DPP signals (Figure 2A,B). The first primitive, M (for “midline”), is related to the EGFR signal. It reflects high levels of EGFR activation and has a concave boundary, which can be related to the spatial pattern of GRK secretion from the oocyte (Peri et al., 1999; Queenan et al., 1999; Nakamura and Matsuno, 2003). The second primitive, denoted by D (for “dorsal”), reflects the intermediate levels of EGFR signaling during the early phase of EGFR activation by GRK, and is defined as a region of the follicular epithelium that is bounded by a level set (line of constant value) of the DV profile of EGFR activation. The boundary of this shape is convex and can be extracted from the experimentally validated computational model of the GRK gradient (Goentoro et al., 2006a; Chang et al., 2008). The third primitive, denoted by A (for “anterior”), is an anterior stripe which is obtained from a level set of the early pattern of DPP signaling in the follicular epithelium (Lembong et al., 2008). This pattern is uniform along the DV axis, as visualized by the spatial pattern of phosphorylated MAD (P-MAD) (Jekely and Rorth, 2003; Shravage et al., 2007). Thus, the D, M, and A primitives represent the spatial distribution of the inductive signals at the stage of eggshell patterning when the EGFR and DPP pathways act as independent AP and DV gradients.
Each of the next two primitives, denoted R (for “roof”) and F (for “floor”), is composed of two identical regions, shaped as the respective expression domains of br and rho (Figure 1B, (Ruohola-Baker et al., 1993; Deng and Bownes, 1997)), and reflect spatial and temporal integration of the EGFR and DPP pathways in later stages of eggshell patterning (Peri et al., 1999; Astigarraga et al., 2007; Yakoby et al., 2008). The mechanisms responsible for the emergence of the F and R domains are not fully understood. It was shown that the R domain is established as a result of sequential action of the feedforward and feedback loops within the EGFR and DPP pathways (Yakoby et al., 2008). The formation of the F domain requires the activating EGFR signal and repressive BR signal, expressed in the R domain (Peri et al., 1999; Ward and Berg, 2005; Ward et al., 2006). Thus, at the current level of understanding, the R and F domains should be viewed as just two of the shapes that are commonly seen in the two-dimensional expression patterns in the follicular epithelium (Figure 1D). The sixth primitive, U (for “uniform”), is spatially uniform and will be used in combination with other primitives to generate more complex patterns.
While a number of patterns, such as those of jar and Dad (Figure 1D,ii,vii), can be described with just a single primitive, more complex patterns are constructed combinatorially, using the operations of intersection (∩), difference (\), and union (∪). E.g., the dorsal anterior stripe of argos expression (Figure 1D,viii) is obtained as an intersection of the A and D primitives (A∩D, Figure 2C,i). The ventral pattern of pip (Figure 1D,ix) is obtained as a difference of the U and D primitives (U\D, Figure 2C,ii). The pattern of 18w (Figure 1D,v) is constructed from the A, D, and R primitives, joined by the operations of union and difference (A∪D\R, Figure 2C,iii). For a small number of published patterns, our annotations reflect the experimentally demonstrated regulatory connections. E.g., the U\D annotation for pip reflects the actual repression of pip by the dorsal gradient of EGFR activation (Pai et al., 2000; James et al., 2002; Peri et al., 2002). For a majority of genes, our annotations should be viewed as a way to schematically represent a two-dimensional pattern and as a hypothetical description of regulation.
The geometric operations of intersection, difference, and union can be implemented by the Boolean operations performed at the regulatory regions of individual genes (Davidson, 2005), (Figure 2D). Boolean operations evaluate expression at each point and assign a value of zero (off) or one (on). As an example, consider a regulatory module, hypothesized for argos (Figure 2Ci), that performs a logical AND operation on two inputs: the output of the module is one only when both inputs are present. When both of the inputs are spatially distributed, the output is nonzero only in those regions of space where both inputs are present, leading to an output that corresponds to the intersection of the two inputs. Similarly, a spatial difference of the two inputs can be realized by a regulatory module that performs the ANDN (ANDNOT) operation. This is the case for pip, repressed by the DV gradient of Gurken signaling and activated by a still unknown uniform signal (Figure 2Cii)(Sen et al., 1998; Pai et al., 2000; James et al., 2002). Finally, a regulatory module that performs an OR operation is non-zero when at least one of the inputs is nonzero. When the inputs are spatially distributed, the output is their spatial union (illustrated for 18w, Figure 2C,iii).
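The correspondence between geometric and Boolean operations can be illustrated with a minimal sketch, in which each primitive is a set of cells on a toy grid. The grid size and the rectangular shapes of A and D are assumptions made only for illustration; they are not the level-set shapes described in the text.

```python
# Toy grid: x runs from anterior (0) to posterior, y from ventral (0) to dorsal.
# The rectangular shapes below are illustrative assumptions, not the real primitives.
WIDTH, HEIGHT = 8, 6


def region(pred):
    """Return the set of (x, y) cells for which pred(x, y) is true."""
    return {(x, y) for x in range(WIDTH) for y in range(HEIGHT) if pred(x, y)}


U = region(lambda x, y: True)    # uniform primitive
A = region(lambda x, y: x < 2)   # anterior stripe (toy shape)
D = region(lambda x, y: y >= 3)  # dorsal domain (toy shape)

a_and_d = A & D   # intersection, i.e. a pointwise AND gate (A∩D, as for argos)
u_not_d = U - D   # difference, i.e. a pointwise ANDNOT gate (U\D, as for pip)
a_or_d = A | D    # union, i.e. a pointwise OR gate (A∪D)


def show(cells):
    """Render a region as text, dorsal side on top."""
    rows = []
    for y in reversed(range(HEIGHT)):
        rows.append("".join("#" if (x, y) in cells else "." for x in range(WIDTH)))
    return "\n".join(rows)


if __name__ == "__main__":
    print(show(a_and_d))
```

Evaluating the gate cell by cell and taking the set operation on whole regions give the same result, which is the point made in the paragraph above.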
Boolean operations on primitives lead to patterns with just two levels of expression (the gene is either expressed or not). In addition to Boolean logic, developmental cis-regulatory modules and systems for posttranscriptional control of gene expression can perform analog operations, leading to multiple nonzero levels of output (Yuh et al., 1998; Davidson, 2001; Buchler et al., 2003; Longabaugh et al., 2005; Istrail et al., 2007; Cory and Perkins, 2008). Consider a module that adds the two binary inputs, shaped as the primitives (Figure 2D). The output is nonzero in the domain shaped as the union of the two primitives, but is characterized by two nonzero levels of expression. We reserve this type of annotation only for those cases where the application of Boolean operations would lead to a loss of the spatial structure of the pattern (such as the A+U expression pattern of mia at stage 11 of oogenesis, Figure 2C,iv). For example, the union of the A and U primitives is a U primitive, whereas the sum of these primitives is an anterior band superimposed on top of a spatially uniform background (Figures 2C,iv and 3E,iv).
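The difference between the union and the sum of the A and U primitives can be made concrete with a small sketch. The grid and the width of the anterior stripe are hypothetical choices; only the contrast between one and two nonzero expression levels is taken from the text.

```python
# Toy grid; the stripe width is an illustrative assumption.
WIDTH, HEIGHT = 8, 6
cells = [(x, y) for x in range(WIDTH) for y in range(HEIGHT)]

A = {c for c in cells if c[0] < 2}  # anterior stripe (toy shape)
U = set(cells)                      # uniform primitive

# Boolean union: each cell is either off (0) or on (1).
union_levels = {c: int(c in A or c in U) for c in cells}

# Addition: each binary input contributes one unit of expression.
sum_levels = {c: int(c in A) + int(c in U) for c in cells}

if __name__ == "__main__":
    print(sorted(set(union_levels.values())))  # one nonzero level
    print(sorted(set(sum_levels.values())))    # two nonzero levels
```

In this sketch the union is indistinguishable from U, while the sum keeps the anterior band visible as a second, higher level on a uniform background.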
As a first test of our combinatorial code, we used it to describe the two-dimensional patterns of the 49 genes previously shown to be expressed between stages 10A and 12 of oogenesis. The results of the annotation of this group of genes, which have been studied for approximately two decades, are shown in Table 1 (see also Table S1 for references to the original publications). For ~50% of these genes we relied on published in situ hybridization images. For the other genes (26/49), we verified or corrected the published patterns, or completed the temporal profile, in our own in situ hybridization experiments. This yielded an expanded set of 118 gene expression patterns, all of which could be successfully annotated using our system of six primitives and four operations. The entire dataset required only 28 logical statements (Table 1). This analysis of the previously published and verified data supports the feasibility of our approach.
As a more rigorous test of the proposed code, we applied it to a large set of newly identified genes and patterns. We reasoned that new eggshell patterning genes could be discovered by screening for targets of the EGFR and DPP signals. Based on this, we used the GAL4/UAS system to perturb the EGFR and DPP signals at the stages 9-10 of oogenesis in a manner that induced clear perturbations of the dorsal eggshell structures (Figure 3A) (Brand and Perrimon, 1993; Twombly et al., 1996; Queenan et al., 1997; Yakoby et al., 2008). Specifically, ectopic activation of EGFR signaling abolishes the dorsoventral polarity of the eggshell, completely eliminates the dorsal appendages, and generates an operculum-like material at the anterior of the eggshell (Figure 3A,i). Uniform inhibition of EGFR signaling in the follicle cells abolishes the dorsal eggshell structures (Figure 3A,ii). Uniform activation of DPP signaling also leads to the loss of dorsal appendages and greatly expands the operculum (Figure 3A,iii). Finally, uniform expression of the intracellular inhibitor of DPP signaling leads to eggshells with a smaller operculum and deformed appendages (Figure 3A,iv).
At the next step, we used Affymetrix Gene Chip microarrays to identify ~100 genes that changed in abundance in stage 9-10 egg chambers, when compared to the wild type (Figure 3B). The details of transcriptional profiling experiments, their statistical analysis, and validation are described in the Supplemental Figures and Text. Briefly, we selected ~200 genes that responded to perturbations in both the EGFR and DPP pathways to identify potential targets of the EGFR and DPP signal integration. We then used a large scale qRT-PCR transcriptional profiling approach to validate all of these targets and eliminated ones that were likely induced due to stress, reducing the number of the candidate genes to ~100. Using in situ hybridization, we found that ~1/3 of these genes are expressed in the wild type ovary during the stages of oogenesis relevant for eggshell patterning (Figure 3C, Table 1). Some of the identified genes were known to be expressed before, while others are new (Figure 3C-E). In addition to identifying new genes, we identified a number of novel spatial patterns, e.g. the mask-like pattern of the putative cell adhesion gene Cad74A (Figure 3C,ix) (Zartman et al., 2008). Importantly, the proposed code describes all identified patterns, demonstrating that it provides an adequate language for gene expression in the follicular epithelium.
Our experiments have essentially doubled the number of reported expression patterns in the follicle cells. In combination with the published data, we have collected 211 two-dimensional spatial patterns for 81 genes at four consecutive stages of oogenesis (stages 10A, 10B, 11, and 12). This set of data forms the basis for our analysis of the diversity of the spatial gene expression patterns in the follicular epithelium. We assigned each of the 211 images in our database to one of 36 distinct patterns (Figure 4A-C). Out of these, six are the primitives themselves, 16 are binary terms (e.g., A∩D for argos at stage 10A), while the remaining 14 are built from three or four terms (e.g., 18w at stage 10B). Approximately two thirds of the patterns can be described with just four primitives that are directly related to the early spatial patterns of the EGFR and DPP pathway activation (D, M, A, and U). The number of distinct patterns is low when compared to the number of hypothetical patterns that can be generated within the framework of our combinatorial system. For example, if we consider only patterns that potentially could be built from two primitives and three operations, we can generate 6*6*3=108 patterns, compared to the 16 observed patterns described by only 2 primitives. This is a consequence of the fact that the primitives in our system are not completely independent and are downstream of a smaller number (most likely, just two) of inductive signals.
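The count of 108 hypothetical two-primitive patterns can be reproduced by direct enumeration. The sketch below assumes that the count runs over ordered pairs of primitives, including a primitive paired with itself, combined by the three Boolean operations; this reading is inferred from the 6*6*3 product and is not stated explicitly in the text.

```python
from itertools import product

primitives = ["M", "D", "A", "R", "F", "U"]
operations = ["∩", "\\", "∪"]  # intersection, difference, union

# Every ordered (first primitive, operation, second primitive) combination.
candidates = [f"{p}{op}{q}" for p, op, q in product(primitives, operations, primitives)]

if __name__ == "__main__":
    print(len(candidates))  # 6 * 3 * 6 combinations
```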
Our annotation describes the shape of the boundary of the spatial pattern, but not the precise location of this boundary or the quantitative expression level within the domain. For example, the ventral patterns of bves, jim, fng, and pip are all annotated with the same expression (U\D), even though the expression domains of these genes overlap only partially (Sen et al., 1998; Doerflinger et al., 1999; Jordan et al., 2000; Lin et al., 2007). Qualitative similarity of spatial patterns for a group of genes, which corresponds to the exact match of our annotations, can be used as a proxy for a similar gene regulatory strategy. Analyzing the statistics of such events (see Experimental Procedures), we found that an exact match of the statements used to describe the real patterns is a rare event. For example, the probability that any three genes, selected at random from our database, are co-expressed in the same pattern in at least one stage of oogenesis, is <1%.
Sharing the same annotation is not only rare, but also a transient event (Figure 5A). We found that most of the groups of genes with the same annotation at any given point either converge to a similar pattern from diverse patterns in the past, or diverge from a common annotation to different patterns at later stages. For example, given that a pair of genes is expressed in the same pattern at some point of oogenesis, the probability that this pair will still be co-expressed at some other time point is estimated to be only ~14%. The numbers are much lower for triplets and quadruplets of co-expressed genes (~4% and 1%, respectively; see Experimental Procedures for details). In other words, the fact that any group of genes is expressed in a similar pattern is not predictive of common expression patterns at other time points. Thus, patterns are dynamically reassembled from a small number of building blocks at every stage of oogenesis.
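One plausible way to estimate such a persistence statistic for pairs is sketched below. The actual procedure is given in the Experimental Procedures and is not reproduced here; the gene names (g1 to g5), their annotations, and the function names are hypothetical and serve only to show the bookkeeping.

```python
from itertools import combinations

STAGES = ["10A", "10B", "11", "12"]

# Hypothetical annotation table: gene -> {stage: annotation}.
# A missing stage means the gene is not expressed at that stage.
toy_table = {
    "g1": {"10A": "A", "10B": "A∩D", "11": "F"},
    "g2": {"10A": "A", "10B": "R", "11": "F"},
    "g3": {"10A": "A", "10B": "A∩D"},
    "g4": {"10A": "U\\D", "10B": "U\\D", "11": "U\\D", "12": "U\\D"},
    "g5": {"10A": "U\\D", "10B": "D"},
}


def shared_stages(a, b, table):
    """Stages at which genes a and b are both expressed with the same annotation."""
    return [s for s in STAGES if s in table[a] and table[a].get(s) == table[b].get(s)]


def pair_persistence(table):
    """Fraction of pairs sharing an annotation at one stage that share one at another stage too."""
    co_expressed = 0
    persistent = 0
    for a, b in combinations(sorted(table), 2):
        n_shared = len(shared_stages(a, b, table))
        if n_shared >= 1:
            co_expressed += 1
        if n_shared >= 2:
            persistent += 1
    return persistent / co_expressed if co_expressed else float("nan")


if __name__ == "__main__":
    print(pair_persistence(toy_table))
```

The same counting extends to triplets and quadruplets by replacing the pairwise comparison with a check that all members of the group carry the same annotation at a given stage.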
Based on our database, each stage is characterized by the expression of ~50 genes, most of which are expressed in multiple stages of oogenesis (Figures 5A-C, Tables S2,3). The sets of genes expressed at two consecutive stages show a considerable overlap, and the differences between the two sets constitute the genes that either stop being expressed or are expressed de novo (Figure 5D). Most of the genes are expressed in more than one stage of oogenesis and their expression patterns are very dynamic (Figure 5C). In general, the probability that a gene, which is expressed in multiple stages of oogenesis, will be expressed in different patterns is 74±10%.
To characterize dynamics at the scale of dozens of genes, we grouped their expression patterns into three classes. The first class, called AP, is composed of four patterns constructed from A and U primitives, is related to the AP patterning signals alone, and likely reflects regulation by the DPP pathway (A, U, U\A, and A+U). The second class, called DV, is composed of eight patterns that could be assigned to the DV patterning signal alone. This class is composed of patterns constructed from the D, M, and U primitives and their combinations (D, M, U\D, D\D, D\M, D+U, U\M, and M∪(U\D)). Thus, the AP and DV classes can be viewed as simple responses to the EGFR and DPP signals during the early stages of eggshell patterning. The remaining 24 annotations form the third class, called INT (for ‘integration’). This class is constructed from combinations of purely AP and purely DV patterns and/or the R and F primitives.
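The three-class grouping can be expressed as a simple rule on the set of primitives that appear in an annotation. The sketch below is an illustration under stated assumptions: the function name `classify` and the string encoding of annotations are hypothetical, and the spatially uniform pattern U is placed in the AP class because the text lists it there.

```python
import re


def classify(annotation):
    """Assign an annotation string to the AP, DV, or INT class by its primitives."""
    prims = set(re.findall(r"[MDARFU]", annotation))
    if not prims:
        raise ValueError(f"no primitives found in {annotation!r}")
    if prims <= {"A", "U"}:       # anterior and uniform only (U alone is listed under AP)
        return "AP"
    if prims <= {"D", "M", "U"}:  # dorsal, midline, and uniform only
        return "DV"
    return "INT"                  # mixes AP and DV primitives, or uses R or F


if __name__ == "__main__":
    for ann in ["A+U", "U\\D", "A∩D", "F"]:
        print(ann, classify(ann))
```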
Next, we analyzed how the sets of genes that belong to each of these classes change over four consecutive time points (Figure 5E). Purely AP or DV patterns dominate the expression patterns in stage 10A egg chambers. Over time, however, most of the expressed genes appear as complex patterns, as can be expected based on the change in the spatial distribution of the EGFR and BMP signaling (Wasserman and Freeman, 1998; Peri et al., 1999; Yakoby et al., 2008). Complex patterns (members of the INT class) appear in two different ways. First, a gene can evolve into a more complex pattern from a simpler one. The change of the rho pattern, from D\M to F annotation, is one example (Peri et al., 1999). Second, a gene can first appear as a complex pattern, which is the case for the F pattern of Vinc. We found that newly appearing genes (after stage 10) are more likely to belong to the complex (INT) class. The set memberships of the simple, AP and DV, classes are more transient than that of the INT class, which tends to be more maintained over time (Figure 5F), potentially reflecting the reinforcing action of the feedback loops in the EGFR and DPP systems.
A qualitative change in the spatial expression pattern of a gene can reflect the use of a different regulatory region and/or the dynamics of the inductive signal. In the follicle cells, the dynamics of patterns and convergence into the INT category likely reflects the reinforcing action of the feedforward and feedback loops that split the spatial profiles of the EGFR and DPP signaling along the dorsal midline (Peri et al., 1999; Yakoby et al., 2008). For the EGFR system, this was attributed to the action of the midline-expressed EGFR inhibitor and a switch in the activating ligand: from the oocyte-derived GRK to SPI, secreted by the rho-expressing follicle cells. Thus, in both the early and late stages of oogenesis, EGFR activation is thought to be generated by a locally produced ligand that acts through a uniformly expressed receptor.
Interestingly, one of the genes identified in our experiments is Ras85D, which is essential for signal transmission from activated EGFR. During the patterning of the follicular epithelium, Ras is expressed in a dynamic pattern, which can be described as the U→ →F→F→F, sequence (stages 9 through 12, respectively), indicating that it is regulated by EGFR signaling (Figure 5G). This suggests a new layer of regulation, which depends on the ability of the follicle cells to transduce signals downstream of EGFR. At least two other components of the EGFR pathway, Shc and drk/Grb2, are expressed in dynamic patterns (Figure 5G,H). One function of this highly coordinated patterning of multiple pathway components is to localize EGFR signaling to the F domain and to prevent the spreading of a traveling wave of the EGFR activation across the entire follicular epithelium (Pribyl et al., 2003).
In addition to enabling the analysis of patterns at the tissue scale, our annotations guide mechanistic studies of individual genes and gene groups. As an example, the A∩D annotation for the stage 10A expression of ana, a secreted glycoprotein identified in our transcriptional profiling experiments (Figures 3E,ix; 6A), suggests that it is generated by a local AND gate that responds to the anterior DPP and dorsoventral GRK gradients. Thus, removal of either of these signals should erase the pattern. In agreement with this prediction, the pattern of ana is abolished in response to uniform inhibition of DPP signaling (Figure 6A,iv) and uniform inhibition of the EGFR signaling (not shown). Thus, experiments with complete inhibition of the EGFR and DPP inputs support the model whereby the A∩D pattern is established by a locally acting AND gate. At the same time, partial reduction of either of the inputs is predicted to reduce the size of the stripe of the A∩D pattern. This prediction is supported by the fact that the size of the ana pattern is reduced in flies homozygous for the weak allele of Ras (Figure 6A,iii) (Schnorr et al., 2001).
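The two predictions of the AND-gate model (loss of the pattern when an input is removed, shrinkage when an input is weakened) can be illustrated with a toy calculation. Everything quantitative here is an assumption made for the sketch: the exponential gradient shape, the decay length, the threshold, and the function name `and_gate_domain` are not taken from the paper.

```python
import math


def and_gate_domain(egfr_amp, dpp_amp, threshold=0.3, n=21, decay=5.0):
    """Cells in which both toy gradients exceed the threshold (a local AND gate).

    x is the distance from the anterior edge (DPP source) and y is the distance
    from the dorsal midline (EGFR source); both gradients decay exponentially,
    which is an illustrative assumption.
    """
    cells = set()
    for x in range(n):
        for y in range(n):
            dpp = dpp_amp * math.exp(-x / decay)
            egfr = egfr_amp * math.exp(-y / decay)
            if dpp > threshold and egfr > threshold:
                cells.add((x, y))
    return cells


wild_type = and_gate_domain(egfr_amp=1.0, dpp_amp=1.0)
weak_egfr = and_gate_domain(egfr_amp=0.5, dpp_amp=1.0)  # partial reduction of one input
no_dpp = and_gate_domain(egfr_amp=1.0, dpp_amp=0.0)     # complete loss of one input

if __name__ == "__main__":
    print(len(wild_type), len(weak_egfr), len(no_dpp))
```

In this sketch the weakened-input domain is a smaller region contained in the wild-type domain, and the domain is empty when either input is absent, in line with the qualitative behavior described for ana.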
We predict that these changes in the spatial pattern of ana will be also observed for other genes with the A∩D annotation. One of these genes is argos, a negative feedback inhibitor induced by high levels of EGFR signaling (Figures 1D,viii; 2C,i; 6C) (Wasserman and Freeman, 1998; Mantrova et al., 1999). The AND-gate model for argos is supported by the fact that the uniform activation of EGFR generates ectopic argos expression only in the anterior part of the follicular epithelium, suggesting a necessary role of the anterior DPP signal (Queenan et al., 1997).
Another example of the connection between spatial patterns, as described by our annotations, and regulatory signals is provided by the A∪D\R pattern of a cell adhesion molecule 18w (Figure 2B,i,ii). The D\R part of the annotation suggests that 18w is repressed by BR, a transcription factor expressed in the R domain. Thus, in flies with reduced level of EGFR signaling (Schnorr et al., 2001), the change in the BR domain from the two-domain R pattern to a single domain pattern should lead to the midline repression of 18w. This is exactly what is observed experimentally: the wild type A∪D\R pattern is converted into an A∪D\D pattern (Figure 6B,iii). Similarly, uniform inhibition of DPP signaling, which generates ectopic BR in the anterior cells, leads to the loss of 18w expression from a part of the A domain, as predicted by its wild type annotation (Figure 6B,iv). We observed similar transitions in the spatial pattern of Cad74A, a cell adhesion molecule with the U\R annotation (Zartman et al., 2008). In this case, BR is sufficient for repressing Cad74A in the R domain, illustrating the predictive power of our annotations.
In a clear demonstration of a general trend revealed by the analysis of pattern dynamics, the highly correlated initial patterns of ana and argos and those of 18w and Cad74A, follow different dynamics (Figure 6C). At stage 11 ana expression disappears, whereas argos evolves into a midline-type pattern. These differences might reflect differences in the quantitative parameters of the locally acting regulatory modules. One possibility is that the AND gates, hypothesized to establish the wild type A∩D patterns, are activated at different levels of the EGFR and DPP signals and respond in qualitatively different ways to changes in these signals at later stages of oogenesis.
Signaling pathways guide organogenesis through the spatial and temporal control of gene expression. While the identities of genes controlled by any given signal can be identified using a combination of genetic and transcriptional profiling techniques, systematic analysis of the diversity of induced patterns requires a formal approach for pattern quantification, categorization and comparison (Leptin, 2005; Dequéant et al., 2006; Brady et al., 2007). Such approaches are only beginning to be developed (Kumar et al., 2002; Megason and Fraser, 2007; Fisher et al., 2008; Fowlkes et al., 2008). Multiplex detection of gene expression, which has a potential to convert images of the spatial distribution of transcripts into a vector format preferred by a majority of statistical methods, is currently feasible only for a small number of genes and systems with simple anatomies (Kosman et al., 2004; Luengo Hendriks et al., 2007). We presented an alternative approach, based on the combinatorial construction of patterns from simple building blocks.
In general, the building blocks can be identified as shapes that are overrepresented in a large set of experimentally collected gene expression patterns. This approach can be potentially pursued in systems where mechanisms of pattern formation are yet to be explored. At the same time, in well-studied systems, the building blocks can be linked to identified patterning mechanisms. We chose six primitives, based on the features that are commonly observed in real patterns and related to the structure of the tissue as well as the spatial distribution of the inductive signals. A similar approach will be useful whenever a two-dimensional cellular layer is patterned by a small number of signals, when cells can convert smoothly varying signals into spatial patterns with sharp boundaries, and when the regulatory regions of target genes have the ability to combinatorially process the inductive signals. One system where this approach could be feasible is the wing imaginal disk which is patterned by the spatially orthogonal WG and DPP morphogens (Cadigan, 2002; Butler et al., 2003; Jacobsen et al., 2006).
We have shown that six primitives are sufficient to describe the experimentally observed patterns during stages 10-12 of oogenesis. A natural question is whether it is possible to accomplish this with a smaller number of primitives. Two of our primitives, R and F, could be potentially constructed from the D, M, and A primitives, which are related to the patterns of EGFR and DPP activation during the earlier stages of eggshell patterning. Specifically, recent studies of br regulation suggest that the R domain is formed as a difference of the D, A, and M patterns (Yakoby et al., 2008). Furthermore, the formation of the F domain requires repressive action in the adjacent R domain (Ward et al., 2006). With the R and F domains related to the other four primitives, the size of our spatial alphabet will be reduced even further (from 6 to 4), but at the expense of increasing the complexity of the expressions used to describe various spatial patterns.
Previously, the question of the diversity of the spatial patterns has been addressed only in one-dimensional systems. For example, transcriptional responses to the Dorsal morphogen gradient in the early Drosophila embryo give rise to three types of patterns in the form of the dorsal, lateral, and ventral bands (Markstein et al., 2004). Our work provides the first attempt to characterize the diversity and dynamics of two-dimensional patterns. We identified 36 qualitatively different patterns and proposed that each of them can be constructed using a compact combinatorial code. The sizes of the datasets from the literature and from our own transcriptional profiling experiments are approximately the same (118 and 93 patterns, respectively; Figure 3B,C). Based on this, we expect that newly discovered patterns will be readily described using our annotation system.
We found that a gene expressed in more than one stage of oogenesis is more likely to appear in different patterns than in the same pattern, and that groups of genes sharing the same pattern at one time point are more likely to scatter in the future than to stay together. More detailed understanding of the dynamics of the spatial patterns of the EGFR and DPP pathway activation is crucial for explaining these trends and the two observed scenarios for the emergence of complex patterns. A gene that makes its first appearance as a complex pattern, such as the A∩D pattern of argos at stage 10B, can be a direct target of the EGFR and DPP signal integration. In contrast, a gene such as Cct1, which changes from the A to R patterns, can be a dedicated target of DPP signaling alone, and changes as a consequence of change in the spatial pattern of DPP signaling (Gupta and Schupbach, 2003; Lembong et al., 2008; Yakoby et al., 2008). Future tests of such hypotheses require analysis of cis-regulatory modules responsible for gene regulation in the follicular epithelium. While only a few enhancers have been identified at this time (Tolias et al., 1993; Andrenacci et al., 2000; Ward et al., 2006), our categorization of patterns should accelerate the identification of enhancers for a large number of genes.
Although proposed for the spatial patterns of transcripts, our annotations can also describe patterns of protein expression, modification, and subcellular localization. For example, the stage 10A patterns of MAD phosphorylation and CIC nuclear localization can be accurately described using the A and U\D annotations, respectively. The ultimate challenge is to use the information about the patterning of the follicular epithelium to explore how it is transformed into the three-dimensional eggshell. A number of genes in the assembled database encode cytoskeleton and cell adhesion molecules, suggesting that they provide a link between patterning and morphogenesis (Ward and Berg, 2005; Kleve et al., 2006; Laplante and Nilson, 2006; Zartman et al., 2008). We hypothesize that the highly correlated expression patterns of these genes give rise to the spatial patterns of force generation and mechanical properties of cells that eventually convert the epithelium into a target three-dimensional structure.
Wild type: OreR. UAS lines: UAS-Dad (Tsuneizumi et al., 1997), UAS-tkv* (caTKV) (Lecuit et al., 1996; Nellen et al., 1996), UAS-dnEGFR (a gift from A. Michelson), UAS-λ-top 4.2 (caEGFR) (Queenan et al., 1997), and UAS-2EGFP-AH3 (Halfon et al., 2002). The UAS lines were crossed to the CY2-GAL4 driver (Queenan et al., 1997; Goentoro et al., 2006b). Flies were grown on agar cornmeal medium; all crosses were done at 23°C, except for the caTKV, where the cross was done at 18°C, and adult progeny were transferred to 23°C prior to dissection.
Previous transcriptional profiling studies of oogenesis used either the entire ovary or sorted subpopulations of the follicle cells (Bryant et al., 1999; Jordan et al., 2005). We hand-dissected stage 9-10 egg chambers. Flies were transferred to a fresh cornmeal vial with yeast and incubated at 23°C for 24 h prior to dissection. Ovaries were hand-dissected in cold PBS, and egg chambers at stages 9–10 (defined in Spradling, 1993) were separated from older and younger stages and divided into three groups containing equal numbers of egg chambers. To prevent RNA degradation, egg chambers were collected in groups of 20 in the RNA stabilizing buffer (RNeasy Mini Kit, Qiagen, Valencia, CA), over 5 rounds, to accumulate a total of 100 egg chambers for each biological replicate (a total of 300 egg chambers from each genetic background). Details of RNA extraction, microarray experiments, qRT-PCR validation, and data analysis are described in the Supplementary Material.
Digoxigenin-labeled antisense RNA probes were synthesized using cDNA clones obtained from the Drosophila Gene Collection (http://www.fruitfly.org/DGC/index.html). For genes with no available cDNA, gene-specific PCR primers were designed to obtain 900-1200 bp products, which were cloned using the StrataClone PCR cloning kit (Stratagene). All clones were sequenced (GeneWiz) and BLASTed against the D. melanogaster genome. In situ hybridization was carried out as described elsewhere (Wang et al., 2006), but without the RNase digestion step, which, as we found, does not allow detection of patterns in the main body follicle cells. Immunohistochemistry procedures are described elsewhere (Yakoby et al., 2008).
In situ hybridization images were annotated using 6 blocks and 4 operations (see the main text). Annotation of all of the patterns was done independently by four different scientists. In deciding on the annotation for every gene and time point, we examined 10-20 images collected from multiple focal planes from multiple egg chambers at different orientations on the microscope slide (see Figures S3-5 for examples of representative datasets). Since fixed and stained egg chambers are frequently connected to each other within the ovariole and are found in a wide range of orientations, standard image registration and normalization approaches, similar to those used in the computer-assisted annotations of in situ hybridization images of gene expression in the Drosophila embryo (Kumar et al., 2002; Gurunathan et al., 2004; Peng et al., 2007), were not applicable. Furthermore, the color reactions at the final step of in situ hybridization were developed for different periods of time for different genes (the development times ranged from 10 minutes to 3 hours). As a consequence, images for different genes have different levels of background, which makes the segmentation of images, required for automated extraction of gene expression patterns, impossible at this stage. Given these limitations and the reasonably small number of images in our database (hundreds, as opposed to thousands in the fly embryo dataset), we used human annotation to summarize the information contained in multiple images. Similar approaches have been used in analyzing in situ hybridization images in Drosophila and other organisms (Tomancak et al., 2002; Sprague et al., 2006; Darnell et al., 2007; Visel et al., 2007).
To characterize the frequency of co-expression, defined as the exact match of annotations, we first generated lists of all nonredundant k-tuples (pairs, triplets, quadruplets) of genes in the list of 81 genes in our database. We then identified those k-tuples that share the annotation in at least one of the four analyzed stages of oogenesis. The ratio of the number of k-tuples that are co-expressed in at least one stage to the total number of k-tuples serves as the estimate for the frequency of co-expression. For the gene pairs, triplets, and quadruplets, these estimates and 95% confidence intervals (computed using the Bernoulli model (Wasserman, 2003)) are: 0.10±0.01, 0.0099±0.0007, and 0.0011±0.0001, respectively. Similar analysis was done to estimate the probabilities of pattern convergence and scattering.
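The counting procedure described above can be sketched as follows. This is a minimal illustration and not the code used for the analysis: the function name, the data layout (a dictionary mapping each gene to its annotation at each stage in which it is expressed), and the toy gene and stage labels are all hypothetical. The confidence interval is assumed here to be the normal-approximation interval for a Bernoulli proportion, p ± 1.96·sqrt(p(1−p)/N), where N is the total number of k-tuples.

```python
from itertools import combinations
from math import sqrt


def coexpression_frequency(annotations, k, z=1.96):
    """Estimate the co-expression frequency for k-tuples of genes.

    annotations: dict mapping gene -> {stage: annotation string}; a stage
    is absent from the inner dict if the gene is not expressed there.
    Returns (estimate, half-width of the approximate 95% interval).
    ASSUMPTION: normal-approximation interval for a Bernoulli proportion.
    """
    genes = sorted(annotations)
    total = 0
    hits = 0
    for group in combinations(genes, k):  # all nonredundant k-tuples
        total += 1
        # Stages at which every gene in the group has an annotation.
        shared = set.intersection(*(set(annotations[g]) for g in group))
        # Co-expressed: annotations match exactly in at least one stage.
        if any(len({annotations[g][s] for g in group}) == 1 for s in shared):
            hits += 1
    p = hits / total
    half_width = z * sqrt(p * (1 - p) / total)
    return p, half_width


# Hypothetical toy input: four made-up genes, two stages.
toy = {
    "g1": {"10A": "A", "10B": "A∩D"},
    "g2": {"10A": "A"},
    "g3": {"10B": "R"},
    "g4": {"10A": "U", "10B": "R"},
}
p_pairs, hw_pairs = coexpression_frequency(toy, 2)
print(p_pairs, hw_pairs)
```

With 81 genes there are 3240 pairs, so exhaustive enumeration is inexpensive for pairs and triplets; for quadruplets the same loop still applies but takes correspondingly longer.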
The Affymetrix Gene Chip data generated for this study are available in the GEO database (GSE12477).
This work was supported by the P50 GM071508 and R01GM078079 grants from the NIH. TS is supported by the Howard Hughes Medical Institute. CAB was supported by the BWF Graduate Training Program in Biological Dynamics, the NSF-IGERT PICASso Program, and by the Sigma-Xi Grant in Aid of Research. We thank Denise Montell and Xiaobo Wang for the in situ hybridization protocol. We are grateful to Pam Carroll, Hong Xaio, and Aiqing He for help with the statistical design and quality control for transcriptional profiling assays, and to Donna Storton for help with the microarray facility. We thank Tiffany Vora and Stefano De Renzis for their help with the microarray experiments and Jack Leatherbarrow for assistance in generating the in situ hybridization probes. We also thank Thomas Funkhouser, Sudhir Kumar, Dannie Durand, Olga Troyanskaya, Stuart Newfeld, Mathieu Coppey, and Gregory Reeves for helpful discussions during the course of this work.