We established a collection of 7,000 transgenic lines of Drosophila melanogaster. Expression of GAL4 in each line is controlled by a different, defined fragment of genomic DNA that serves as a transcriptional enhancer. We used confocal microscopy of dissected nervous systems to determine the expression patterns driven by each fragment in the adult brain and ventral nerve cord. We present image data on 6,650 lines. Using both manual and machine-assisted annotation, we describe the expression patterns in the most useful lines. We illustrate the use of these data to identify novel neuronal cell types, reveal brain asymmetry, and describe the nature and extent of neuronal shape stereotypy. The GAL4 lines allow expression of exogenous genes in distinct, small subsets of the adult nervous system. The set of DNA fragments, each driving a documented expression pattern, will facilitate the generation of additional constructs to manipulate neuronal function.
The ability to express a gene of interest in a spatially restricted manner in a transgenic animal has greatly contributed to the successful use of Drosophila in a wide variety of biological studies. This is especially true for the nervous system, where experiments often call for manipulating the activity of small, reproducible sets of neurons (reviewed in Venken et al., 2011; Griffith, 2012). Our objective was to generate and characterize a set of transgenic lines that can each drive expression of a reporter gene in a distinct, small subset of the approximately 150,000 neurons that comprise the adult central nervous system. The set of lines also needed to be large enough to ensure that all, or nearly all, neurons are represented in at least one line.
Transgenic animals in which the yeast transcriptional activator GAL4 is placed under the control of endogenous regulatory elements have proven to be powerful and versatile tools for manipulating gene expression (Fischer et al., 1988; Brand and Perrimon, 1993). Analysis of confocal imagery of whole mount preparations from animals in which a reporter gene is expressed under the control of GAL4 has been widely used for identifying and characterizing the morphology of neuronal populations, an approach pioneered and applied most extensively by Kei Ito’s laboratory (see for example, Ito et al., 2003; Otsuna and Ito, 2006; Tanaka et al., 2012).
Prior studies have used collections of GAL4-expressing lines based on enhancer traps. In an enhancer trap, a transposon carrying a GAL4 gene is inserted at a large number of random genomic locations; at each insertion site, the pattern of GAL4 expression comes under the influence of local transcriptional regulatory elements (O’Kane and Gehring, 1987). Enhancer traps have a number of properties that limit their use: the precise nature of the sequence elements driving expression is unknown, the varied genomic locations complicate genetic manipulations and, in general, GAL4 is expressed in more cells than optimal.
These limitations can often be overcome for anatomic studies. All that is required for such studies is the ability to recognize different neuronal populations under the microscope; stochastic labeling methods can be used to generate animals in which only a fraction of the cells in the parent GAL4 pattern express the reporter gene (reviewed in Venken et al., 2011). But this is not the case when one intends to use the GAL4 driver to manipulate the activity of a specific population of neurons to study physiology or behavior. For such applications, the ability to direct expression of an exogenous protein to reproducible, small subsets of neurons in all animals in a population is critical.
The collection of GAL4 lines we generated for this project, based on the experimental design of Pfeiffer et al. (2008), begins to address those requirements. In these lines, a short fragment of genomic DNA controls GAL4 expression. As a result, GAL4 is, on average, expressed in fewer cells than in enhancer trap lines (Pfeiffer et al., 2008). Since the transgenes are all inserted at a known location by site-specific integration, genomic position effects on expression are held constant. Moreover, because the DNA fragments driving expression in each line are completely defined, one can make constructs that reuse the same enhancer fragment to drive the expression of another protein. Importantly, the expression pattern of that protein will be predictable from the image data obtained with the original GAL4 line. Pfeiffer et al. (2010) show examples of such enhancer reuse in making constructs expressing the transcriptional activator LexA and the repressor GAL80.
The ability to reuse the enhancer fragments greatly enhances the value of the image database we report here. One can use our database not only for neuroanatomy or to select a GAL4 line of interest for further studies, but also as a resource to select enhancer fragments to generate the constructs required for more sophisticated applications. For example, LexA- and GAL4-expressing constructs can be combined in the same animal to separately control expression of two reporters in different cell populations. This is required for a variety of experiments used to determine neuronal connectivity in circuit mapping (reviewed in Griffith, 2012).
Although the GAL4 lines we present here are expressed more sparsely than enhancer trap lines, in most cases even more restricted expression will be required for behavioral studies that aim to alter the activity of a very small population of neurons. For example, GAL4 expression can be restricted to the cells in which the expression patterns of two enhancer fragments overlap by using the split-GAL4 method, developed by Luan et al. (2006) and improved upon by Pfeiffer et al. (2010). This approach appears to be sufficient to reach single “cell-type” specificity. The modular nature of our constructs enables the facile construction of the required split-GAL4 transgenic lines.
With 6,650 lines successfully imaged, this is the largest dataset of GAL4-driven expression patterns in the adult brain and the only large dataset available of expression patterns in the adult ventral nerve cord (VNC). The expression patterns generated by these same lines in the embryonic nervous system and in third instar imaginal discs are described in accompanying papers (Manning et al., 2012; Jory et al., 2012). These combined data sets should be especially powerful for efforts to understand the logic underlying the cis-regulatory code.
The GAL4 lines have been deposited in the Bloomington Drosophila Stock Center for distribution and the expression patterns of the lines in several tissues are documented on a web-accessible database. We also developed a number of software tools to support our production pipeline and to analyze the large image dataset we generated; we expect these tools and methods to be useful for other large-scale imaging projects.
We constructed transgenic lines in which the expression of the transcriptional activator GAL4 is driven by a defined DNA fragment that contains one or more transcriptional enhancers, using methods we have described previously (Pfeiffer et al., 2008). We then examined the expression patterns that these GAL4 drivers generated, in combination with a UAS-GFP reporter construct, using confocal microscopy of whole mounts of adult brains and VNCs. We developed a pipeline for dissection, immunohistochemistry and imaging that enabled us to produce high quality data at scale. We also developed a variety of tools for browsing and annotating the image data. Our initial intent was to use expert anatomists to describe the expression patterns. While an effective strategy for a small number of lines, or to identify neurons in a particular brain region in all the lines, such an approach did not scale well for a project of this size where our goal was to fully annotate thousands of lines. To overcome this limitation, we developed methods for machine-assisted annotation of the adult central brain. The patterns of expression of a subset of the lines were registered (Peng et al., 2011) on a standard model of the adult brain (K. Ito, pers. comm.) using features visible in a reference stain. Using manually constructed 3-D masks, we computationally assigned aspects of the expression pattern in each line to one of the 68 major brain regions and then quantified the intensity and distribution of expression within each region. A human expert then vetted those annotations. The VNCs were annotated by an expert anatomist using a controlled vocabulary. These annotations permit text-based searching of a portion of the GAL4 image collection. 
The image database also allows a user to simply browse the data to identify lines that express GAL4 in a particular neuronal cell type, to uncover cell types not previously described, or to examine the patterns generated by all the DNA fragments surrounding a gene of interest.
We selected approximately 1,200 genes for which available expression data or predicted function implied expression in a subset of cells in the adult brain: for example, genes encoding transcription factors, neuropeptides, cell surface proteins, ion channels, transporters, and receptors. We spanned the flanking upstream and downstream intergenic regions of these genes, as well as any of their introns larger than 300 bp, with fragments of DNA that averaged 3 kb in length and overlapped (in regions that could not be covered by a single fragment) by about 1 kb. These putative enhancer fragments were cloned into a vector containing the GAL4 coding region and a core promoter and then each construct was integrated into the same location in the fly genome by site-specific integration (Groth et al., 2004). We generated ~7,000 GAL4 lines. See Pfeiffer et al. (2008) and Experimental Procedures for details.
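The tiling scheme described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used to design the actual fragments (see Pfeiffer et al., 2008): the function name `tile_region`, the fixed 3,000 bp fragment length, and the rule of anchoring the last fragment to the end of the region are all assumptions made for the example.

```python
def tile_region(start, end, frag_len=3000, overlap=1000):
    """Cover the half-open interval [start, end) with overlapping fragments.

    Hypothetical helper: fragments are frag_len bp long and successive
    fragments share at least `overlap` bp. A region that fits in a single
    fragment is returned as one tile.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    if frag_len <= overlap:
        raise ValueError("frag_len must exceed overlap")

    if end - start <= frag_len:
        return [(start, end)]

    step = frag_len - overlap
    tiles = []
    pos = start
    # Lay down full-length fragments while one still fits before the end.
    while pos + frag_len < end:
        tiles.append((pos, pos + frag_len))
        pos += step
    # Anchor the final fragment to the end of the region so that no
    # sequence is left uncovered; it may overlap its neighbor by more
    # than `overlap` bp.
    tiles.append((end - frag_len, end))
    return tiles


# A 7 kb intergenic region is covered by three 3 kb fragments.
print(tile_region(0, 7000))  # [(0, 3000), (2000, 5000), (4000, 7000)]
```

Anchoring the last fragment to the end of the region keeps every fragment the same length at the cost of a variable final overlap; an alternative would be to spread the fragments evenly across the region.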
To determine the expression patterns generated by each GAL4 driver line, we developed a high throughput pipeline for brain dissection, immunohistochemistry and imaging (see Experimental Procedures). Each GAL4 line was crossed to a UAS-GFP reporter line placing the expression of GFP under the positive control of GAL4 in the progeny, which were heterozygous for both the GAL4 driver and UAS reporter. Brains and VNCs were dissected from 3–5 day old adult females, stained with an anti-GFP antibody and with a reference marker to allow identification of different brain regions and to provide the features used to align images obtained from different brains. We obtained high quality image data from 6,647 GAL4 lines and collected a total of over 47,000 confocal stacks. We developed several enhancements to methods previously employed for Drosophila brain and VNC dissection and immunohistochemistry, as well as software tools for data tracking, which are described in Experimental Procedures.
Based on visual inspection of the confocal stacks (Figure 1) we judged that just over half of the lines had a density of labeling—as well as intense, crisp expression patterns—that would make them useful for future anatomical and behavioral experiments. These ranged from expression in approximately 20 (0.02%) to 5,000 (3%) neurons, excluding Kenyon cells, present in the central brain. We excluded Kenyon cells because the 5,000 Kenyon cells are tightly clustered in the mushroom body and do not obscure expression patterns elsewhere in the brain.
1,200 lines (17%) showed no obvious staining in the brain, although many of these showed expression in other developmental stages or tissues. Indeed, DNA fragments that show no enhancer activity in any tissue are rare; while we did not directly attempt to determine this number, a simple comparison of our data with the expression patterns driven by the same fragments in the third instar larva (J. Truman, pers. comm.; Jory et al., 2012) and embryo (Manning et al., 2012) indicates that this fraction is below 10% and would likely decrease further if more tissues were examined.
850 lines (12%) contain glial expression; in 320 lines expression was predominantly in glia, with little or no neuronal expression. All known glial cell types were observed, as well as several types not previously described (M. Kremer, pers. comm.). 1,050 lines (15%) showed broad neuronal expression that we estimated to be in greater than 5,000 cells in the central brain (excluding Kenyon cells; see for example, Figure 1 panels I and L).
Some lines displayed obvious stochastic expression, where not all cells in the pattern showed expression in each brain. This was most easily observed in lines expressing in the repeated columnar structures of the optic lobe, where gaps in the pattern are readily recognized. An example of a non-stochastic line is shown in Fig. 2G; note the even array of cell bodies in the medulla, which has a repeating columnar structure.
We chose not to further analyze broadly expressed lines, lines with no expression, lines with extensive glial expression, or those showing obvious stochastic expression. We present image data on all 6,647 lines at www.janelia.org/gal4-gen1, but focused our more detailed analyses on the remaining ~3,850 lines. Examples of what we considered very high quality lines, corresponding to the best few hundred lines in the collection, are shown in Figure 2.
A detailed mining of the image database is beyond the scope of this report. Here we present three examples to illustrate the types of studies our data enable: (1) the identification of a new class of antennal lobe projection neurons; (2) a confirmation that the fly brain displays left-right asymmetry (albeit very limited); and (3) the use of a line with sparse expression to evaluate the degree of neuronal stereotypy. The transgenic GAL4 lines and the database of images we are providing should enable these types of studies to be performed by any skilled anatomist.
Many types of antennal lobe projection neurons have been described by Golgi impregnation, stochastic labeling and MARCM analysis of GAL4 drivers, and screening of sparsely labeled enhancer trap GAL4 drivers (reviewed in Vosshall and Stocker, 2007; see for example Stocker et al., 1990; Jefferis et al., 2007; Tanaka et al., 2008; Tanaka et al., 2012). Despite this intensive study, we were able to identify novel projection neurons in several GAL4 lines, including R19B06, R49A01 and R60H12. In Figure 3, we describe in detail the projection neurons from the VP3 glomerulus observed in R60H12.
Our initial visual screening of the expression patterns revealed one—but only one—clear case of an innervated structure that displayed left-right asymmetry in the adult brain. This structure appears to correspond to what Pascual et al. (2004) described as the asymmetric body (AB), a small area embedded on the right side of the central complex, ventrally in layer 1 of the fan-shaped body. They identified the AB in brains stained with an antibody against Fasciclin 2, a broadly expressed cell surface protein. They reported that a small fraction (7.6%) of wild type flies displayed Fasciclin 2 positive structures in both hemispheres and that these animals had a defect in long-term memory. Because neurons contributing to the AB had not been identified, the opportunities for further mechanistic experiments were limited. We identified several GAL4 lines that innervate the AB (R38D01, R42C09, R52H03, R70H05, and R72A10) and show an anatomical analysis of R72A10 in Figure 4. In 22 of the 23 R72A10 brains examined, innervation of the AB was detected exclusively in the right side of the brain; in one brain, staining was detectable bilaterally, but clearly more extensive on the right side. In lines R38D01, R42C09, and R70H05 a majority of brains displayed similarly asymmetric innervation; that is, in these GAL4 lines expression was usually bilateral, but staining on the right side was considerably more prominent than on the left side (data not shown).
Most neurons show reproducible overall morphologies, but the precise branching patterns of their fine arbors differ between animals. Examination of multiple brains from a GAL4 line that expresses sparsely enough to permit the identification of individual cell classes provides a way to estimate the extent and nature of such morphological variability. Chou et al. (2010) performed a study of this type on local interneurons in the antennal lobe (AL) and found a high degree of variability, but such studies have been limited. In Figure 5, we show an example of a population of about seven olfactory projection neurons that innervate glomerulus DL3. Consistent with earlier studies (Jenett et al., 2006), we observed a high degree of variability in the position of the cell bodies; the detailed morphology of the AL glomeruli is also known to vary somewhat between animals. We were, however, surprised by the high level of variability in the arborization patterns of these neurons in the areas of their synaptic targets in the MB calyx and the lateral horn. Differences could be seen not only between animals, but also between the two brain hemispheres of the same animal (Figure 5).
We only imaged the brains and VNCs of females in a systematic manner and so our data do not address the interesting question of the extent of sexual dimorphism. From a cursory comparison of expression patterns of 400 GAL4 lines in males and females, our impression is that less than 10% of lines show sexual dimorphism. Previous studies have shown that subsets of neurons that express fruitless (Cachero et al., 2010) or doublesex (Sanders et al., 2008) are dimorphic.
We developed a machine-assisted annotation process that abstracts the expression patterns displayed in the GAL4 lines in a way that they can be evaluated computationally. The process is initiated by aligning the confocal stacks for each brain to a standard brain model. This process transfers the expression pattern in each brain to a standard three-dimensional coordinate system. Expression in any volume of interest (VOI) can then be quantitated. A 3-D mask corresponding to that VOI is made manually. The mask specifies the voxels that correspond to the VOI. While a VOI can be freely defined, we chose to use the standard regions defined by the Insect Brain Name Working Group (K. Ito, pers. comm.). We describe the expression pattern in a VOI with two numbers, one reflecting the intensity of the expression and one the staining distribution, or what fraction of the voxels show expression. Figure 6 illustrates these metrics as applied to a set of volumes derived from the confocal images of different GAL4 lines. In this case, the VOI was the fan-shaped body, one of the 68 standard brain regions. Our database provides for searching expression patterns, using these intensity and distribution values, of a selected subset (~3,800) of the lines.
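The two per-region numbers described above can be illustrated with a short Python sketch. It is not the implementation used for the database: the function name `voi_metrics`, the use of a single signal threshold, and the choice to report intensity as the mean of the above-threshold voxels are assumptions made for the example, and the voxel values are invented. The aligned volume and the 3-D mask are represented as flat sequences with voxels in the same order.

```python
def voi_metrics(volume, mask, threshold):
    """Summarize expression inside one volume of interest (VOI).

    volume    -- flat sequence of voxel intensities from an aligned brain
    mask      -- flat sequence of the same length; truthy where the voxel
                 belongs to the VOI
    threshold -- assumed cutoff above which a voxel counts as expressing

    Returns (intensity, distribution): the mean intensity of expressing
    voxels in the VOI, and the fraction of VOI voxels that express.
    """
    if len(volume) != len(mask):
        raise ValueError("volume and mask must have the same length")

    in_voi = [v for v, m in zip(volume, mask) if m]
    if not in_voi:
        raise ValueError("mask selects no voxels")

    expressing = [v for v in in_voi if v > threshold]
    distribution = len(expressing) / len(in_voi)
    intensity = sum(expressing) / len(expressing) if expressing else 0.0
    return intensity, distribution


# Toy example: six voxels, four of which lie inside the VOI.
volume = [0, 10, 200, 300, 50, 400]
mask = [0, 1, 1, 1, 1, 0]
print(voi_metrics(volume, mask, threshold=100))  # (250.0, 0.5)
```

Keeping the two numbers separate distinguishes a region filled with weak signal from one containing a few bright processes, which a single summed value would not.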
We do not yet have computational tools capable of aligning the imaged VNCs into a standard framework. As this is a requirement for machine-assisted annotation, VNCs were annotated manually. The VNC expression patterns for 2,500 selected lines were described using a controlled vocabulary indicating the position of cell bodies, neuronal arbors, and afferent and efferent axons.
Image data on a representative brain and VNC of each of the 6,650 lines can be viewed at www.janelia.org/gal4-gen1. We present maximum intensity projections of the confocal stacks as well as MPEG movies in which each frame corresponds to an ~1 μm optical section moving through the confocal stack along the z-axis. The site also enables searching the image data by line name, the name of the gene from which the enhancer fragment was derived, or for expression in a given brain area. In addition to the data described in this paper, data on the expression of the lines in third instar imaginal discs described in Jory et al. (2012) and expression in the embryonic CNS described in Manning et al. (2012) are included. The large size of the original confocal stacks (~500 MB each) limits on-line distribution. It is possible to download a small number of the original confocal stacks (LSM files) from the web server. We are also able to provide access to the original confocal stacks of the most useful 3,850 lines (~5 TB) on hard disk; requests should be sent to gal4-gen1@janelia.hhmi.org. The GAL4 lines have been deposited in the Bloomington Drosophila Stock Center for distribution (http://flystocks.bio.indiana.edu/Browse/misc-browse/Janelia.php).
The collection of GAL4 driver lines we describe here should be particularly useful for identifying and characterizing the morphology of neuronal populations in the adult fly for several reasons: (1) The lines express, on average, in fewer cells than the enhancer trap lines used in previous studies. (2) The number of lines we have imaged is larger than in any previous study. (3) We have imaged not only the brain but also the VNC, an important part of the central nervous system that has not generally been included in other studies. (4) We provide for convenient on-line browsing as well as an initial annotation of the expression patterns of the lines. (5) We are making the GAL4 lines and our primary image data freely available.
We estimate that our collection contains ~3,850 lines in which the number of labeled central-brain neurons is in the range of 20 to 5,000. We expect that these lines will be adequate to represent the vast majority, if not all, cells in a variety of overlapping patterns. Establishing the completeness of the coverage in a rigorous way is not currently possible, but we note that for several well studied brain areas, such as the mushroom body and its extrinsic neurons (Y. Aso, unpublished) and the optic lobes (A. Nern, pers. comm.), expert annotators have been able to identify more than 90% of previously described cell types as well as many novel cell types.
We report the identification of a previously undescribed type of antennal lobe projection neuron, even though this cell class has been heavily studied in previous work. We also present data establishing that the fly brain is asymmetric. However, asymmetry is rare, as we only observed one obviously asymmetric brain structure during our initial analysis of the collection. We believe that one of the major uses of the data we have generated will be further neuroanatomical studies, which can be carried out by simply browsing and mining the existing images. The intent of the work presented in this report was not to carry out such analyses, but to create a widely distributed resource that enables this activity.
The large number of lines in our collection that express in sparse populations of cells allows for a detailed examination of the degree of stereotypy in neuronal shape: the extent to which the same genetically defined cell population displays the same morphology from animal to animal. Initial neuronal wiring must be established to a large extent by a program specified in the genome. But how much variability these “molecular algorithms” generate in practice, and the extent to which this differs between cell types, has not yet been studied in detail. We present one example in this paper for a population of antennal lobe projection neurons. Their display of extensive variability in the positions of axonal termini in the lateral horn implies that highly dynamic mechanisms are used for circuit formation. The resources we provide here should enable a much more comprehensive study of this question in diverse cell types.
Our analysis of the expression patterns generated by 6,650 DNA fragments confirms the initial impression from the small sample (45 fragments) described in Pfeiffer et al. (2008) about the density of enhancers (also called cis-regulatory modules or CRMs) in the D. melanogaster genome. The genome contains >50,000 enhancers and multiple enhancers drive distinct subsets of expression of a gene in each tissue and developmental stage. But the GAL4 lines we describe are not individually adequate to provide cell-type specific expression. Even when single 3-kb DNA fragments are used to drive GAL4 expression, only a tiny fraction of patterns, perhaps less than one in a thousand, is limited to a single neuronal population in the adult brain. The analyses of expression patterns generated by these same GAL4 lines in the embryo (Manning et al., 2012) and third instar imaginal discs (Jory et al., 2012) come to a similar conclusion. Taken together, these observations demonstrate that single enhancers only very rarely produce expression patterns limited to a single cell-type in a complex tissue.
We generated fragments by PCR from genomic DNA which were then cloned, sequence verified, and inserted upstream of a synthetic core promoter and the GAL4 coding region as described in Pfeiffer et al. (2008). In about 200 cases where the upstream intergenic region was small, we generated PCR fragments that also contained the start site of transcription and used them to create transcriptional fusion constructs. Constructs were inserted into the attP2 integration site using the phiC31 site-specific integration system (Groth et al., 2004) and homozygous stocks were generated. Embryo injections and screening for transformant flies were performed by GSI (Cambridge, MA); see Pfeiffer et al. (2008) for additional details. The primers used for each fragment and the molecular coordinates of the fragments on the Drosophila genome sequence can be downloaded at http://flystocks.bio.indiana.edu/Browse/misc-browse/Janelia.php and are also given at www.janelia.org/gal4-gen1.
To achieve consistent image quality while producing a high number of specimens we developed standardized processes for dissection, immunohistochemistry and confocal laser scanning microscopy. Males from each GAL4 line were crossed to females homozygous for the pJFRC2-10XUAS-IVS-mCD8::GFP reporter inserted at attP2 (Pfeiffer et al., 2010) and 3–5 day old female adults heterozygous for the GAL4 driver and UAS reporter were dissected. Dissections were performed using Leica MZ9-5 stereomicroscopes. Flies were anesthetized with carbon dioxide gas, rinsed once in 70% ethanol, twice in S2 media (Schneider’s Insect Medium, SIGMA), submerged in cold S2 media on a dish coated with silicone (Sylgard 184, Dow Corning), and dissected with Dumont #5 stainless steel forceps. Dissection in S2 media resulted in better preservation of neuronal morphology than dissection in phosphate buffered saline (PBS). When dissecting complete central nervous systems, we first separated the VNC from the thorax and abdomen before dissecting the head. Once the nervous system was exposed, we removed membranes, trachea, and other tissues, taking care not to touch the brain, VNC, or neck connective. A well-dissected nervous system sinks in the media and has an intact connection between the VNC and the brain. A video of our dissection procedure can be found at http://www.janelia.org/team-project/fly-light#5064.
The nervous systems were transferred within 20 min of dissection to 1.5 ml Eppendorf Protein LoBind microcentrifuge tubes containing 1% paraformaldehyde (Electron Microscopy Sciences, Hatfield, PA) in S2 media on ice. After ensuring that the samples settled to the bottom of the tube, the tubes were placed on a rotating, rocking mixer (Fisher Scientific Nutating Mixer) at 4°C overnight. The next morning, the fixative solution was aspirated and replaced with cold PAT (0.5% Triton X-100, 0.5% bovine serum albumin in PBS) at 4°C. Samples were warmed to room temperature in a rack on the lab bench and were washed 3 times at room temperature with 1 ml of room temperature PAT for 1 h per wash. They were then incubated in blocking buffer (3% normal goat serum (Invitrogen) in PAT) for 1 h at room temperature, after which the blocking buffer was replaced with 1 ml of a mixture of two primary antibodies in blocking buffer: rabbit anti-GFP (Invitrogen A11122) at 1:1000 and mouse nc82 supernatant (Developmental Studies Hybridoma Bank, Iowa City, IA) at 1:50; mAb nc82 is directed against the Bruchpilot protein, a component of pre-synaptically-located T-bars (Wagh et al., 2006). Primary antibody incubations were overnight at 4°C with rocking.
After the primary antibody incubation, samples were returned to room temperature and were given three 1 h washes in PAT. The last wash solution was replaced with 1 ml of a secondary antibody solution consisting of a 1:800 dilution of Alexa Fluor 488 goat anti-rabbit “green” (Invitrogen A11034) and a 1:400 dilution of Alexa Fluor 568 goat anti-mouse “red” (Invitrogen A11031) in blocking buffer and the samples were incubated for 4–5 days with rocking at 4°C in the dark. Following the secondary antibody incubation, tissues were washed several times with PAT as in previous steps. Just prior to mounting on a microscope slide, tissues were washed once for 20 min at room temperature in PBS.
We optimized the dissection and staining protocol iteratively, using as a metric the quality of the computational alignment of each imaged brain to a standard brain model. The BrainAligner (Peng et al., 2011) attempts to identify landmarks within the nc82 staining pattern of each brain and then uses the landmarks it can identify to warp the image to a collection of about 280 features in the standard brain. The two key factors in obtaining a good alignment were achieving high quality nc82 staining that penetrated the entire brain and avoiding damage to the brains during the dissection, staining or mounting processes. The quality score, Qi, produced by the BrainAligner reports the fraction of features that were not found by the program. Scores of less than 0.30 correlated with high quality alignment as subjectively judged by visual inspection. More than 22,000 of the brains we imaged in our screen of the GAL4 lines aligned with a score less than 0.30. For over 5,500 different GAL4 lines, we have at least one brain with a score less than 0.30.
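The use of the quality score as a filter can be sketched as follows. This is a minimal Python illustration, not BrainAligner code: the function names, the dictionary layout, and the line labels and scores in the usage example are hypothetical. It assumes only what is stated above, namely that Qi is the fraction of standard-brain features not found and that scores below 0.30 indicate a good alignment.

```python
def quality_score(n_found, n_total):
    """Qi as described in the text: fraction of features NOT found."""
    if n_total <= 0 or not 0 <= n_found <= n_total:
        raise ValueError("need 0 <= n_found <= n_total and n_total > 0")
    return (n_total - n_found) / n_total


def lines_with_good_alignment(scores_by_line, cutoff=0.30):
    """Return the lines that have at least one brain scoring below cutoff.

    scores_by_line -- dict mapping a line name to the Qi scores of all
                      brains imaged for that line (hypothetical layout)
    """
    return {
        line
        for line, scores in scores_by_line.items()
        if any(score < cutoff for score in scores)
    }


# 210 of 280 features located gives Qi = 70 / 280.
print(quality_score(210, 280))  # 0.25

# Invented scores for two placeholder lines.
scores = {"line_A": [0.41, 0.12], "line_B": [0.35]}
print(lines_with_good_alignment(scores))  # {'line_A'}
```

Because a lower Qi means fewer missing features, the filter keeps scores strictly below the cutoff, and a line qualifies as soon as any one of its brains passes.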
For confocal imaging, samples were mounted on a 75 mm × 25 mm microscope slide (Fisherbrand SuperFrost Plus) to which we had applied an 8 or 10-well silicone adhesive spacer (custom ordered from Grace Biolabs). Approximately 40–50 samples were mounted per slide. The samples were mounted in a 9:1 mixture of Vectashield mounting solution (Vector Laboratories) and PBS. Wells were covered with a 10 mm round cover glass (Carl Zeiss, High Performance cover glass, #1.5 thickness, 0.17 ± 0.005 mm) held in place by nail polish. A video of our mounting procedure is posted at http://www.janelia.org/team-project/fly-light#5064.
Imaging was done on two Zeiss LSM 510 instruments with 20X, 0.80 numerical aperture plan-apochromat objectives; the motorized stage was modified to carry a custom two-slide holder. The pixel size was 0.62 μm × 0.62 μm with a z-section thickness of ~1 μm; the pixel dimensions of the stacks are 1024 × 1024 × ~150. We imaged with 12 bits of dynamic range and averaged four successive scans. Laser illumination was increased with z-position to counter attenuation. Fluorescence emission from the 488 nm excitation passed through a 505–550 nm bandpass filter and emission from the 561 nm excitation passed through a 575 nm longpass filter.
In collaboration with Carl Zeiss Microscopy, LLC, we developed a custom version of the Zeiss MultiTime software to enable the scanning of large numbers of selected samples in sequence. The “Add Next” feature simplified the setup of complex multiple channel z-stack image acquisitions over multiple locations. For improved image quality through z-stacks, the configuration of laser ramp points and laser power setup was simplified. In addition to other small usability enhancements, automated file management was added. Finally, MultiTime was made more robust so that it could seamlessly recover position and configuration information in the event of computer crash, power loss, or other unexpected interruption.
Our database shows image data for 6,647 lines. While we generated ~7,000 transgenic lines, the remaining lines did not yield images of sufficient quality in our first imaging attempt and their expression patterns were not sufficiently interesting to warrant re-imaging.
We developed a number of software tools to manage the volume of image data generated during the project. The large size of each confocal stack (~500 MB) necessitated the generation of smaller representations of the original data to facilitate analysis by enabling rapid data loading over network connections. Three types of derived representations were generated for each confocal stack: (1) a maximum intensity projection of the full stack as an ~300 KB JPEG file (http://www.w3.org/Graphics/JPEG/jfif3.pdf); (2) a series of ~10 μm sub-stack maximum intensity projections as JPEG files; and (3) a movie in which each slice of a stack is turned into a movie frame. By using the MPEG-4 compression algorithm (http://mpeg.chiariglione.org/standards/mpeg-4/mpeg-4.htm) we were able to produce movies with file sizes of about 5 MB, approximately 100 times smaller than the original confocal stack, while introducing only minor compression artifacts. During the production of the movies, the image data in each frame were contrast optimized to improve the ability to see weak signals. A calibration bar is included in each frame, which displays the maximum and minimum intensities in the original image. The calibration bars should be used when judging the strength of an expression pattern. We produced all projections in two versions, one with the reference channel (nc82 staining) and one without. The former facilitates the localization of expression pattern elements relative to the reference channel in which the various neuropil of the brain can be identified. The latter permits the detection of faint aspects of the expression pattern, which might otherwise be masked by the reference channel.
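The first two derived representations can be illustrated with a minimal, pure-Python sketch of maximum intensity projection. It is not the software used in the project: the function names and the nested-list stack layout are hypothetical, and the z-step of 1 μm is the approximate section thickness given earlier. JPEG and MPEG-4 encoding are not shown.

```python
def max_projection(stack):
    """Maximum intensity projection along z.

    `stack` is a list of slices; each slice is a list of rows, and each
    row is a list of pixel intensities. All slices share one shape.
    """
    return [
        [max(column) for column in zip(*rows)]  # max over z for each pixel
        for rows in zip(*stack)                 # same row index in every slice
    ]


def substack_projections(stack, z_step_um=1.0, thickness_um=10.0):
    """Project consecutive sub-stacks of roughly `thickness_um` each."""
    per_chunk = max(1, round(thickness_um / z_step_um))
    return [
        max_projection(stack[start:start + per_chunk])
        for start in range(0, len(stack), per_chunk)
    ]


# Tiny synthetic stack: 25 slices of 2 x 2 pixels. One pixel brightens
# with depth and another dims with depth.
stack = [[[z, 0], [0, 100 - z]] for z in range(25)]

full = max_projection(stack)
chunks = substack_projections(stack)
```

Here `full` combines the brightest value of every pixel across all 25 slices, while `chunks` holds three projections covering slices 0–9, 10–19, and 20–24, so that structures at different depths are not superimposed.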
In the course of our analysis we noted that the imaging process produced a primary image file (LSM format) that was inverted in the left-right axis relative to the specimen. Because we wanted to provide primary data with minimal manipulation and we wanted the derived data files to be consistent with the primary data, we did not correct for this inversion. The sole exception was in the case of the asymmetric body data shown in Figure 4, where left-right orientation was central to data description.
We developed a standardized file name for the image files that provides human readable information about the file contents. Another benefit of the standardized file name is that it is very easy to parse, which allows the file system to function as a simple machine searchable image database. The structure of this naming convention is shown in Figure S1; the renaming software and the file structure of the data are described in Supplemental Information.
In order to distribute the data to the research community we developed a web interface (www.janelia.org/gal4-gen1) written in Perl and PHP, and fed by a MySQL relational database. The data provided in this way are primarily the derived data sets, as the large size of the original confocal stacks limits on-line distribution.
We developed a number of software tools and scripts, which we refer to collectively as the Fiji Annotation Suite (FAS), to facilitate manual annotation of the images by a human expert. These ImageJ macros, shell scripts, and helper applications work inside of Fiji (http://fiji.sc/) to access files without requiring the user to interact with the Unix file system. They facilitate adding descriptive metadata (annotations) to image data by streamlining and organizing the workflow; for example, accelerating data entry while analyzing an image or video by assigning annotation terms from a controlled vocabulary to specific keys of the keyboard. While initially developed for this project, these tools are flexible enough to be used for the annotation of any structured collection of image data. The FAS will be described in detail elsewhere; it is open source (GPL) and is available upon request.
The expression patterns in the VNC were annotated manually using a controlled vocabulary to describe the position of cell bodies, neuronal arbors and afferent and efferent axons. The terminology for all peripheral nerves and most of the neuropil regions has been previously described in detail (Power, 1948). A paper describing all the neuropil regions and peripheral nerves of the VNC is in preparation (Court, R., Armstrong, D.A., and Shepherd, D., unpublished).
Machine-assisted annotation was performed on image data that had been aligned to a standard brain model using the BrainAligner software described by Peng et al. (2011). Because of the requirement that expression patterns need to be transferred to the standard brain map with a high degree of precision, it was only possible to apply this approach to specimens that could be aligned with high quality (Qi scores below 0.30). After alignment, the expression patterns generated by each GAL4 line can be described using the same coordinate system. By defining the volumes that correspond to the brain areas in the standard map, it was possible to computationally perform annotation in a semi-quantitative way. A set of 3-D masks was generated manually using Amira (Visage Imaging GmbH) corresponding to the volume associated with each of 68 major brain areas as defined in the standard brain map of D. melanogaster (Figure 7; K. Ito, pers. comm.).
To abstract the intensity and distribution of the expression within the volumes of interest (VOI) from the details of the morphology of the GAL4 positive neurons, we generated a 4-bit histogram (16 bins) of all voxels within the VOI. This number of bins was determined empirically; it proved to be a good compromise between preserving data density and maximizing the visual perceptibility of brightness-values expressed in false colors. To assist in communicating these results we generated a version of the data in which the intensity and the distribution (what fraction of the voxels within that brain region show expression) within each brain region are each represented by an integer from 0 to 5 (see Figure 6). We found that these numerical representations reflect human perception of the distribution and intensity of the expression patterns and allow efficient querying of the database for lines that express in a given brain region. The result of the machine-assisted annotation is an abstraction of the expression patterns displayed by each GAL4 line into 136 scores representing the intensity and distribution of expression in each of 68 brain areas.
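The abstraction step can be sketched in Python as follows. The 16-bin histogram restricted to a VOI mask follows the description above, but the mapping from the histogram to the two 0 to 5 scores is not specified in the text; the linear mapping, the treatment of bin 0 as background, the 12-bit input range, and all function names below are assumptions made for illustration.

```python
def voi_histogram(voxels, mask, bit_depth=12, n_bins=16):
    """16-bin (4-bit) histogram of the voxels that lie inside the VOI mask.

    `voxels` and `mask` are flat sequences of equal length; a truthy mask
    entry marks a voxel as belonging to the VOI.
    """
    bin_width = (1 << bit_depth) // n_bins
    counts = [0] * n_bins
    for value, inside in zip(voxels, mask):
        if inside:
            counts[min(value // bin_width, n_bins - 1)] += 1
    return counts


def score_0_to_5(fraction):
    """Hypothetical linear mapping of a fraction in [0, 1] to 0..5."""
    return min(5, max(0, round(5 * fraction)))


def intensity_and_distribution(counts, background_bins=1):
    """Return (intensity, distribution) scores for one brain region.

    Assumption: voxels in the lowest `background_bins` bins show no
    expression. Distribution is the expressing fraction of the VOI;
    intensity is the mean bin index of expressing voxels, normalized.
    """
    total = sum(counts)
    expressing = sum(counts[background_bins:])
    if total == 0 or expressing == 0:
        return 0, 0
    mean_bin = sum(
        index * count
        for index, count in enumerate(counts)
        if index >= background_bins
    ) / expressing
    top_bin = len(counts) - 1
    return score_0_to_5(mean_bin / top_bin), score_0_to_5(expressing / total)


# Made-up 12-bit voxel values; the last two lie outside the VOI.
voxels = [0, 100, 300, 4095, 2048, 1000, 50, 3000]
mask = [1, 1, 1, 1, 1, 1, 0, 0]

counts = voi_histogram(voxels, mask)
intensity, distribution = intensity_and_distribution(counts)
```

Applying such a function to each of the 68 masks yields two integers per region, which corresponds to the 136 scores per line described above.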
We thank Hua-Peng Liaw, Omotara Ogundeyi, Nicholas Abel, Emily Willis, Ying-Jou Lee, and Rebecca Vorimo for assistance with data acquisition; Karen Hibbard, Jessica Keating, James McMahon, Megan Hong, Monti Mercer, Grace Zheng, Jui-Chun Kao and Danielle Ruiz for fly husbandry; Christopher Bruns, Nikolay Kladt, Frank Midgley, Lowell Umayam and Charlotte Weaver for software development; Geoffrey Meissner and Eric Hoopfer for help in evaluating sexual dimorphism; and Bruce Kimmel and Crystal Sullivan for administrative support.