1.  The neuronal architecture of the mushroom body provides a logic for associative learning 
eLife  2014;3:e04577.
We identified the neurons comprising the Drosophila mushroom body (MB), an associative center in invertebrate brains, and provide a comprehensive map describing their potential connections. Each of the 21 MB output neuron (MBON) types elaborates segregated dendritic arbors along the parallel axons of ∼2000 Kenyon cells, forming 15 compartments that collectively tile the MB lobes. MBON axons project to five discrete neuropils outside of the MB and three MBON types form a feedforward network in the lobes. Each of the 20 dopaminergic neuron (DAN) types projects axons to one, or at most two, of the MBON compartments. Convergence of DAN axons on compartmentalized Kenyon cell–MBON synapses creates a highly ordered unit that can support learning to impose valence on sensory representations. The elucidation of the complement of neurons of the MB provides a comprehensive anatomical substrate from which one can infer a functional logic of associative olfactory learning and memory.
DOI: http://dx.doi.org/10.7554/eLife.04577.001
eLife digest
One of the key goals of neuroscience is to understand how specific circuits of brain cells enable animals to respond optimally to the constantly changing world around them. Such processes are more easily studied in simpler brains, and the fruit fly—with its small size, short life cycle, and well-developed genetic toolkit—is widely used to study the genes and circuits that underlie learning and behavior.
Fruit flies can learn to approach odors that have previously been paired with food, and also to avoid any odors that have been paired with an electric shock, and a part of the brain called the mushroom body has a central role in this process. When odorant molecules bind to receptors on the fly's antennae, they activate neurons in the antennal lobe of the brain, which in turn activate cells called Kenyon cells within the mushroom body. The Kenyon cells then activate output neurons that convey signals to other parts of the brain.
It is known that relatively few Kenyon cells are activated by any given odor. Moreover, it seems that a given odor activates different sets of Kenyon cells in different flies. Because the association between an odor and the Kenyon cells it activates is unique to each fly, each fly needs to learn through its own experiences what a particular pattern of Kenyon cell activation means.
Aso et al. have now applied sophisticated molecular genetic and anatomical techniques to thousands of different transgenic flies to identify the neurons of the mushroom body. The resulting map reveals that the mushroom body contains roughly 2200 neurons, including seven types of Kenyon cells and 21 types of output cells, as well as 20 types of neurons that use the neurotransmitter dopamine. Moreover, this map provides insights into the circuits that support odor-based learning. It reveals, for example, that the mushroom body can be divided into 15 anatomical compartments that are each defined by the presence of a specific set of output and dopaminergic neuron cell types. Since the dopaminergic neurons help to shape a fly's response to odors on the basis of previous experience, this organization suggests that these compartments may be semi-autonomous information processing units.
In contrast to the rest of the insect brain, the mushroom body has a flexible organization that is similar to that of the mammalian brain. Elucidating the circuits that support associative learning in fruit flies should therefore make it easier to identify the equivalent mechanisms in vertebrate animals.
DOI: http://dx.doi.org/10.7554/eLife.04577.002
doi:10.7554/eLife.04577
PMCID: PMC4273437  PMID: 25535793
mushroom body; olfactory learning; associative memory; neuronal circuits; dopamine; plasticity; D. melanogaster
2.  Mushroom body output neurons encode valence and guide memory-based action selection in Drosophila 
eLife  2014;3:e04580.
Animals discriminate stimuli, learn their predictive value and use this knowledge to modify their behavior. In Drosophila, the mushroom body (MB) plays a key role in these processes. Sensory stimuli are sparsely represented by ∼2000 Kenyon cells, which converge onto 34 output neurons (MBONs) of 21 types. We studied the role of MBONs in several associative learning tasks and in sleep regulation, revealing the extent to which information flow is segregated into distinct channels and suggesting possible roles for the multi-layered MBON network. We also show that optogenetic activation of MBONs can, depending on cell type, induce repulsion or attraction in flies. The behavioral effects of MBON perturbation are combinatorial, suggesting that the MBON ensemble collectively represents valence. We propose that local, stimulus-specific dopaminergic modulation selectively alters the balance within the MBON network for those stimuli. Our results suggest that valence encoded by the MBON ensemble biases memory-based action selection.
DOI: http://dx.doi.org/10.7554/eLife.04580.001
eLife digest
An animal's survival depends on its ability to respond appropriately to its environment, approaching stimuli that signal rewards and avoiding any that warn of potential threats. In fruit flies, this behavior requires activity in a region of the brain called the mushroom body, which processes sensory information and uses that information to influence responses to stimuli.
Aso et al. recently mapped the mushroom body of the fruit fly in its entirety. This work showed, among other things, that the mushroom body contained 21 different types of output neurons. Building on this work, Aso et al. have started to work out how this circuitry enables flies to learn to associate a stimulus, such as an odor, with an outcome, such as the presence of food.
Two complementary techniques—the use of molecular genetics to block neuronal activity, and the use of light to activate neurons (a technique called optogenetics)—were employed to study the roles performed by the output neurons in the mushroom body. Results revealed that distinct groups of output cells must be activated for flies to avoid—as opposed to approach—odors. Moreover, the same output neurons are used to avoid both odors and colors that have been associated with punishment. Together, these results indicate that the output cells do not encode the identity of stimuli: rather, they signal whether a stimulus should be approached or avoided. The output cells also regulate the amount of sleep taken by the fly, which is consistent with the mushroom body having a broader role in regulating the fly's internal state.
The results of these experiments—combined with new knowledge about the detailed structure of the mushroom body—lay the foundations for new studies that explore associative learning at the level of individual circuits and their component cells. Given that the organization of the mushroom body has much in common with that of the mammalian brain, these studies should provide insights into the fundamental principles that underpin learning and memory in other species, including humans.
DOI: http://dx.doi.org/10.7554/eLife.04580.002
doi:10.7554/eLife.04580
PMCID: PMC4273436  PMID: 25535794
mushroom body; memory; behavioral valence; sleep; population code; action selection; D. melanogaster
3.  Shared mushroom body circuits underlie visual and olfactory memories in Drosophila 
eLife  2014;3:e02395.
In nature, animals form memories associating reward or punishment with stimuli from different sensory modalities, such as smells and colors. It is unclear, however, how distinct sensory memories are processed in the brain. We established appetitive and aversive visual learning assays for Drosophila that are comparable to the widely used olfactory learning assays. These assays share critical features, such as reinforcing stimuli (sugar reward and electric shock punishment), and allow direct comparison of the cellular requirements for visual and olfactory memories. We found that the same subsets of dopamine neurons drive formation of both sensory memories. Furthermore, distinct yet partially overlapping subsets of mushroom body intrinsic neurons are required for visual and olfactory memories. Thus, our results suggest that distinct sensory memories are processed in a common brain center. Such centralization of related brain functions is an economical design that avoids the repetition of similar circuit motifs.
DOI: http://dx.doi.org/10.7554/eLife.02395.001
eLife digest
Animals tend to associate good and bad things with certain visual scenes, smells and other kinds of sensory information. If we get food poisoning after eating a new food, for example, we tend to associate the taste and smell of the new food with feelings of illness. This is an example of a negative ‘associative memory’, and it can persist for months, even when we know that our sickness was not caused by the new food itself but by some foreign body that should not have been in the food. The same is true for positive associative memories.
It is known that many associative memories contain information from more than one of the senses. Our memory of a favorite food, for instance, includes its scent, color and texture, as well as its taste. However, little is known about the ways in which information from the different senses is processed in the brain. Does each sense have its own dedicated memory circuit, or do multiple senses converge to the same memory circuit?
A number of studies have used olfactory (smell) and visual stimuli to study the basic neuroscience that underpins associative memories in fruit flies. The olfactory experiments traditionally use sugar and electric shocks to induce positive and negative associations with various scents. However, the visual experiments use other methods to induce associations with colors. This means that it is difficult to combine and compare the results of olfactory and visual experiments.
Now, Vogt, Schnaitmann et al. have developed a transparent grid that can be used to administer electric shocks in visual experiments. This allows direct comparisons to be made between the neuronal processing of visual associative memories and the neural processing of olfactory associative memories.
Vogt, Schnaitmann et al. showed that both visual and olfactory stimuli are modulated in the same subset of dopamine neurons for positive associative memories. Similarly, another subset of dopamine neurons was found to drive negative memories of both the visual and olfactory stimuli. The work of Vogt, Schnaitmann et al. shows that associative memories are processed by a centralized circuit that receives both visual and olfactory inputs, thus reducing the number of memory circuits needed for such memories.
DOI: http://dx.doi.org/10.7554/eLife.02395.002
doi:10.7554/eLife.02395
PMCID: PMC4135349  PMID: 25139953
associative memory; dopamine neurons; visual learning; D. melanogaster
4.  A GAL4-Driver Line Resource for Drosophila Neurobiology 
Cell reports  2012;2(4):991-1001.
SUMMARY
We established a collection of 7,000 transgenic lines of Drosophila melanogaster. Expression of GAL4 in each line is controlled by a different, defined fragment of genomic DNA that serves as a transcriptional enhancer. We used confocal microscopy of dissected nervous systems to determine the expression patterns driven by each fragment in the adult brain and ventral nerve cord. We present image data on 6,650 lines. Using both manual and machine-assisted annotation, we describe the expression patterns in the most useful lines. We illustrate the use of these data to identify novel neuronal cell types, reveal brain asymmetry, and describe the nature and extent of neuronal shape stereotypy. The GAL4 lines allow expression of exogenous genes in distinct, small subsets of the adult nervous system. The set of DNA fragments, each driving a documented expression pattern, will facilitate the generation of additional constructs to manipulate neuronal function.
doi:10.1016/j.celrep.2012.09.011
PMCID: PMC3515021  PMID: 23063364
5.  A visual motion detection circuit suggested by Drosophila connectomics 
Nature  2013;500(7461):175-181.
Summary
Animal behavior arises from computations in neuronal circuits, but our understanding of these computations has been frustrated by the lack of detailed synaptic connection maps, or connectomes. For example, despite intensive investigations over half a century, the neuronal implementation of local motion detection in the insect visual system remains elusive. Here, we developed a semi-automated pipeline using electron microscopy to reconstruct a connectome, containing 379 neurons and 8,637 chemical synaptic contacts, within the Drosophila optic medulla. By matching reconstructed neurons to examples from light microscopy, we assigned neurons to cell types and assembled a connectome of the medulla's repeating module. Within this module, we identified cell types constituting a motion detection circuit and showed that the connections onto individual motion-sensitive neurons in this circuit were consistent with their direction selectivity. Our results identify cellular targets for future functional investigations, and demonstrate that connectomes can provide key insights into neuronal computations.
doi:10.1038/nature12450
PMCID: PMC3799980  PMID: 23925240
6.  Contributions of the 12 neuron classes in the fly lamina to motion vision 
Neuron  2013;79(1):128-140.
SUMMARY
Motion detection is a fundamental neural computation performed by many sensory systems. In the fly, local motion computation is thought to occur within the first two layers of the visual system, the lamina and medulla. We constructed specific genetic driver lines for each of the 12 neuron classes in the lamina. We then depolarized and hyperpolarized each neuron type, and quantified fly behavioral responses to a diverse set of motion stimuli. We found that only a small number of lamina output neurons are essential for motion detection, while most neurons serve to sculpt and enhance these feedforward pathways. Two classes of feedback neurons (C2 and C3), and lamina output neurons (L2 and L4), are required for normal detection of directional motion stimuli. Our results reveal a prominent role for feedback and lateral interactions in motion processing, and demonstrate that motion-dependent behaviors rely on contributions from nearly all lamina neuron classes.
doi:10.1016/j.neuron.2013.05.024
PMCID: PMC3806040  PMID: 23849200
7.  Annotated embryonic CNS expression patterns of 5000 GMR GAL4 lines: a resource for manipulating gene expression and analyzing cis-regulatory modules 
Cell reports  2012;2(4):1002-1013.
Here we describe the embryonic CNS expression of 5,000 GAL4 lines made using molecularly defined cis-regulatory DNA inserted into a single attP genomic location. We document and annotate the patterns in early embryos when neurogenesis is at its peak, and in older embryos where there is maximal neuronal diversity and the first neural circuits are established. We note expression in other tissues such as the lateral body wall (muscle, sensory neurons, trachea) and viscera. Companion papers report on the adult brain and larval imaginal discs, and the integrated datasets are available online (www.janelia.org/flylight/gal4-gen1). This collection of embryonically-expressed GAL4 lines will be valuable for determining neuronal morphology and function; the 1862 lines expressed in small subsets of neurons (<20/segment) will be especially valuable for characterizing interneuronal diversity and function, as interneurons comprise the majority of all CNS neurons, yet their gene expression profile and function remain virtually unexplored.
doi:10.1016/j.celrep.2012.09.009
PMCID: PMC3523218  PMID: 23063363
8.  A survey of 6300 genomic fragments for cis-regulatory activity in the imaginal discs of Drosophila melanogaster 
Cell reports  2012;2(4):1014-1024.
6,334 fragments from the genome of Drosophila melanogaster were analyzed for their ability to drive expression of GAL4 reporter genes in the 3rd instar larval imaginal discs. 1,194 reporter genes drove expression in the eye, antenna, leg, wing, haltere, or genital imaginal discs. The patterns ranged from large regions to individual cells. About 75% of the active fragments drove expression in multiple discs; 20% were expressed in ventral, but not dorsal, discs (legs, genital, and antenna), while ~23% were expressed in dorsal, but not ventral discs (wing, haltere, and eye). Several patterns, for example within the leg chordotonal organ, appeared a surprisingly large number of times. Unbiased searches for DNA sequence motifs suggest candidate transcription factors that may regulate enhancers with shared activities. Together, these expression patterns provide a valuable resource to the community and offer a broad overview of how transcriptional regulatory information is distributed in the Drosophila genome.
doi:10.1016/j.celrep.2012.09.010
PMCID: PMC3483442  PMID: 23063361
9.  Gene Ontology: tool for the unification of biology 
Nature genetics  2000;25(1):25-29.
Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.
doi:10.1038/75556
PMCID: PMC3037419  PMID: 10802651
10.  Comparative Genomics of the Eukaryotes 
Science (New York, N.Y.)  2000;287(5461):2204-2215.
A comparative analysis of the genomes of Drosophila melanogaster, Caenorhabditis elegans, and Saccharomyces cerevisiae—and the proteins they are predicted to encode—was undertaken in the context of cellular, developmental, and evolutionary processes. The nonredundant protein sets of flies and worms are similar in size and are only twice that of yeast, but different gene families are expanded in each genome, and the multidomain proteins and signaling pathways of the fly and worm are far more complex than those of yeast. The fly has orthologs to 177 of the 289 human disease genes examined and provides the foundation for rapid analysis of some of the basic processes involved in human disease.
PMCID: PMC2754258  PMID: 10731134
11.  Global analysis of patterns of gene expression during Drosophila embryogenesis 
Genome Biology  2007;8(7):R145.
Embryonic expression patterns for 6,003 (44%) of the 13,659 protein-coding genes identified in the Drosophila melanogaster genome were documented, of which 40% show tissue-restricted expression.
Background
Cell and tissue specific gene expression is a defining feature of embryonic development in multi-cellular organisms. However, the range of gene expression patterns, the extent of the correlation of expression with function, and the classes of genes whose spatial expression are tightly regulated have been unclear due to the lack of an unbiased, genome-wide survey of gene expression patterns.
Results
We determined and documented embryonic expression patterns for 6,003 (44%) of the 13,659 protein-coding genes identified in the Drosophila melanogaster genome with over 70,000 images and controlled vocabulary annotations. Individual expression patterns are extraordinarily diverse, but by supplementing qualitative in situ hybridization data with quantitative microarray time-course data using a hybrid clustering strategy, we identify groups of genes with similar expression. Of 4,496 genes with detectable expression in the embryo, 2,549 (57%) fall into 10 clusters representing broad expression patterns. The remaining 1,947 (43%) genes fall into 29 clusters representing restricted expression, 20% patterned as early as blastoderm, with the majority restricted to differentiated cell types, such as epithelia, nervous system, or muscle. We investigate the relationship between expression clusters and known molecular and cellular-physiological functions.
Conclusion
Nearly 60% of the genes with detectable expression exhibit broad patterns reflecting quantitative rather than qualitative differences between tissues. The other 40% show tissue-restricted expression; the expression patterns of over 1,500 of these genes are documented here for the first time. Within each of these categories, we identified clusters of genes associated with particular cellular and developmental functions.
doi:10.1186/gb-2007-8-7-r145
PMCID: PMC2323238  PMID: 17645804
12.  Global analyses of mRNA translational control during early Drosophila embryogenesis 
Genome Biology  2007;8(4):R63.
The polysomal profiles of over 10,000 transcripts during the first ten hours after egg laying have been determined.
Background
In many animals, the first few hours of life proceed with little or no transcription, and developmental regulation at these early stages is dependent on maternal cytoplasm rather than the zygotic nucleus. Translational control is critical for early Drosophila embryogenesis and is exerted mainly at the gene level. To understand post-transcriptional regulation during Drosophila early embryonic development, we used sucrose polysomal gradient analyses and GeneChip analysis to illustrate the translation profile of individual mRNAs.
Results
We determined ribosomal density and ribosomal occupancy of over 10,000 transcripts during the first ten hours after egg laying.
Conclusion
We report the extent and general nature of gene regulation at the translational level during early Drosophila embryogenesis on a genome-wide basis. The diversity of the translation profiles indicates multiple mechanisms modulating transcript-specific translation. Cluster analyses suggest that the genes involved in some biological processes are co-regulated at the translational level at certain developmental stages.
doi:10.1186/gb-2007-8-4-r63
PMCID: PMC1896012  PMID: 17448252
13.  Large-Scale Trends in the Evolution of Gene Structures within 11 Animal Genomes 
PLoS Computational Biology  2006;2(3):e15.
We have used the annotations of six animal genomes (Homo sapiens, Mus musculus, Ciona intestinalis, Drosophila melanogaster, Anopheles gambiae, and Caenorhabditis elegans) together with the sequences of five unannotated Drosophila genomes to survey changes in protein sequence and gene structure over a variety of timescales—from the less than 5 million years since the divergence of D. simulans and D. melanogaster to the more than 500 million years that have elapsed since the Cambrian explosion. To do so, we have developed a new open-source software library called CGL (for “Comparative Genomics Library”). Our results demonstrate that change in intron–exon structure is gradual, clock-like, and largely independent of coding-sequence evolution. This means that genome annotations can be used in new ways to inform, corroborate, and test conclusions drawn from comparative genomics analyses that are based upon protein and nucleotide sequence similarities.
Synopsis
Just as protein sequences change over time, so do gene structures. Over comparatively short evolutionary timescales, introns lengthen and shorten; and over longer timescales the number and positions of introns in homologous genes can change. These facts suggest that the intron–exon structures of genes may provide a source of evolutionary information. The utility of gene structures as materials for phylogenetic analyses, however, depends upon their independence from the forces driving protein evolution. If, for example, intron–exon structures are strongly influenced by selection at the amino acid level, then using them for phylogenetic investigations is largely pointless, as the same information could have been more easily gained from protein analyses. Using 11 animal genomes, Yandell et al. show that evolution of intron lengths and positions is largely—though not completely—independent of protein sequence evolution. This means that gene structures provide a source of information about the evolutionary past independent of protein sequence similarities—a finding the authors employ to investigate the accuracy of the protein clock and to explore the utility of gene structures as a means to resolve deep phylogenetic relationships within the animals.
doi:10.1371/journal.pcbi.0020015
PMCID: PMC1386723  PMID: 16518452
14.  Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura 
Genome Biology  2004;5(9):R61.
27 predicted gene-regulatory regions in the Drosophila melanogaster genome were analyzed in vivo, confirming 15 active enhancer regions. A comparison with Drosophila pseudoobscura sequences revealed that conservation of binding-site clusters accurately discriminates functional regions from non-functional ones.
Background
The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters.
Results
We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns.
Conclusions
Measuring conservation of sequence features closely linked to function - such as binding-site clustering - makes better use of comparative sequence data than commonly used methods that examine only sequence identity.
doi:10.1186/gb-2004-5-9-r61
PMCID: PMC522868  PMID: 15345045
15.  Drosophila melanogaster MNK/Chk2 and p53 Regulate Multiple DNA Repair and Apoptotic Pathways following DNA Damage 
Molecular and Cellular Biology  2004;24(3):1219-1231.
We have used genetic and microarray analysis to determine how ionizing radiation (IR) induces p53-dependent transcription and apoptosis in Drosophila melanogaster. IR induces MNK/Chk2-dependent phosphorylation of p53 without changing p53 protein levels, indicating that p53 activity can be regulated without an Mdm2-like activity. In a genome-wide analysis of IR-induced transcription in wild-type and mutant embryos, all IR-induced increases in transcript levels required both p53 and the Drosophila Chk2 homolog MNK. Proapoptotic targets of p53 include hid, reaper, sickle, and the tumor necrosis factor family member Eiger. Overexpression of Eiger is sufficient to induce apoptosis, but mutations in Eiger do not block IR-induced apoptosis. Animals heterozygous for deletions that span the reaper, sickle, and hid genes exhibited reduced IR-dependent apoptosis, indicating that this gene complex is haploinsufficient for induction of apoptosis. Among the genes in this region, hid plays a central, dosage-sensitive role in IR-induced apoptosis. p53 and MNK/Chk2 also regulate DNA repair genes, including two components of the nonhomologous end-joining repair pathway, Ku70 and Ku80. Our results indicate that MNK/Chk2-dependent modification of Drosophila p53 activates a global transcriptional response to DNA damage that induces error-prone DNA repair as well as intrinsic and extrinsic apoptosis pathways.
doi:10.1128/MCB.24.3.1219-1231.2004
PMCID: PMC321428  PMID: 14729967
16.  Computational identification of Drosophila microRNA genes 
Genome Biology  2003;4(7):R42.
An informatic procedure has been used to analyze the euchromatic sequences of Drosophila melanogaster and D. pseudoobscura for conserved sequences that adopt an extended stem-loop structure and display other characteristics of known miRNAs.
Background
MicroRNAs (miRNAs) are a large family of 21-22 nucleotide non-coding RNAs with presumed post-transcriptional regulatory activity. Most miRNAs were identified by direct cloning of small RNAs, an approach that favors detection of abundant miRNAs. Three observations suggested that miRNA genes might be identified using a computational approach. First, miRNAs generally derive from precursor transcripts of 70-100 nucleotides with extended stem-loop structure. Second, miRNAs are usually highly conserved between the genomes of related species. Third, miRNAs display a characteristic pattern of evolutionary divergence.
Results
We developed an informatic procedure called 'miRseeker', which analyzed the completed euchromatic sequences of Drosophila melanogaster and D. pseudoobscura for conserved sequences that adopt an extended stem-loop structure and display a pattern of nucleotide divergence characteristic of known miRNAs. The sensitivity of this computational procedure was demonstrated by the presence of 75% (18/24) of previously identified Drosophila miRNAs within the top 124 candidates. In total, we identified 48 novel miRNA candidates that were strongly conserved in more distant insect, nematode, or vertebrate genomes. We verified expression for a total of 24 novel miRNA genes, including 20 of 27 candidates conserved in a third species and 4 of 11 high-scoring, Drosophila-specific candidates. Our analyses lead us to estimate that drosophilid genomes contain around 110 miRNA genes.
Conclusions
Our computational strategy succeeded in identifying bona fide miRNA genes and suggests that miRNAs constitute nearly 1% of predicted protein-coding genes in Drosophila, a percentage similar to the percentage of miRNAs recently attributed to other metazoan genomes.
doi:10.1186/gb-2003-4-7-r42
PMCID: PMC193629  PMID: 12844358
17.  Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome 
Genome Biology  2002;3(12):research0086.1-86.2.
Analysis of conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D. pseudoobscura, D. willistoni, and D. littoralis) covering more than 500 kb of the D. melanogaster genome. All D. melanogaster genes (and 78-82% of coding exons) identified in divergent species such as D. pseudoobscura show evidence of functional constraint. Addition of a third species can reveal functional constraint in otherwise non-significant pairwise exon comparisons.
Background
It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined.
Results
We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D. pseudoobscura, D. willistoni, and D. littoralis) covering more than 500 kb of the D. melanogaster genome. All D. melanogaster genes (and 78-82% of coding exons) identified in divergent species such as D. pseudoobscura show evidence of functional constraint. Addition of a third species can reveal functional constraint in otherwise non-significant pairwise exon comparisons. Microsynteny is largely conserved, with rearrangement breakpoints, novel transposable element insertions, and gene transpositions occurring in similar numbers. Rates of amino-acid substitution are higher in uncharacterized genes relative to genes that have previously been studied. Conserved non-coding sequences (CNCSs) tend to be spatially clustered with conserved spacing between CNCSs, and clusters of CNCSs can be used to predict enhancer sequences.
Conclusions
Our results provide the basis for choosing species whose genome sequences would be most useful in aiding the functional annotation of coding and cis-regulatory sequences in Drosophila. Furthermore, this work shows how decoding the spatial organization of conserved sequences, such as the clustering of CNCSs, can complement efforts to annotate eukaryotic genomes on the basis of sequence conservation alone.
doi:10.1186/gb-2002-3-12-research0086
PMCID: PMC151188  PMID: 12537575
18.  Heterochromatic sequences in a Drosophila whole-genome shotgun assembly 
Genome Biology  2002;3(12):research0085.1-85.16.
Annotation of an improved whole-genome shotgun assembly of the Drosophila melanogaster genome predicted 297 protein-coding genes and six non-protein-coding genes, including known heterochromatic genes, and regions of similarity to known transposable elements. Fluorescence in situ hybridization was used to correlate the genomic sequence with the cytogenetic map; the annotated euchromatic sequence extends into the centric heterochromatin on each chromosome arm.
Background
Most eukaryotic genomes include a substantial repeat-rich fraction termed heterochromatin, which is concentrated in centric and telomeric regions. The repetitive nature of heterochromatic sequence makes it difficult to assemble and analyze. To better understand the heterochromatic component of the Drosophila melanogaster genome, we characterized and annotated portions of a whole-genome shotgun sequence assembly.
Results
WGS3, an improved whole-genome shotgun assembly, includes 20.7 Mb of draft-quality sequence not represented in the Release 3 sequence spanning the euchromatin. We annotated this sequence using the methods employed in the re-annotation of the Release 3 euchromatic sequence. This analysis predicted 297 protein-coding genes and six non-protein-coding genes, including known heterochromatic genes, and regions of similarity to known transposable elements. Bacterial artificial chromosome (BAC)-based fluorescence in situ hybridization analysis was used to correlate the genomic sequence with the cytogenetic map in order to refine the genomic definition of the centric heterochromatin; on the basis of our cytological definition, the annotated Release 3 euchromatic sequence extends into the centric heterochromatin on each chromosome arm.
Conclusions
Whole-genome shotgun assembly produced a reliable draft-quality sequence of a significant part of the Drosophila heterochromatin. Annotation of this sequence defined the intron-exon structures of 30 known protein-coding genes and 267 protein-coding gene models. The cytogenetic mapping suggests that an additional 150 predicted genes are located in heterochromatin at the base of the Release 3 euchromatic sequence. Our analysis suggests strategies for improving the sequence and annotation of the heterochromatic portions of the Drosophila and other complex genomes.
doi:10.1186/gb-2002-3-12-research0085
PMCID: PMC151187  PMID: 12537574
19.  Annotation of the Drosophila melanogaster euchromatic genome: a systematic review 
Genome Biology  2002;3(12):research0083.1-83.22.
The recent completion of the Drosophila melanogaster genomic sequence to high quality, and the availability of a greatly expanded set of Drosophila cDNA sequences, afforded FlyBase the opportunity to significantly improve genomic annotations.
Background
The recent completion of the Drosophila melanogaster genomic sequence to high quality and the availability of a greatly expanded set of Drosophila cDNA sequences, aligning to 78% of the predicted euchromatic genes, afforded FlyBase the opportunity to significantly improve genomic annotations. We made the annotation process more rigorous by inspecting each gene visually, utilizing a comprehensive set of curation rules, requiring traceable evidence for each gene model, and comparing each predicted peptide to SWISS-PROT and TrEMBL sequences.
Results
Although the number of predicted protein-coding genes in Drosophila remains essentially unchanged, the revised annotation significantly improves gene models, resulting in structural changes to 85% of the transcripts and 45% of the predicted proteins. We annotated transposable elements and non-protein-coding RNAs as new features, and extended the annotation of untranslated (UTR) sequences and alternative transcripts to include more than 70% and 20% of genes, respectively. Finally, cDNA sequence provided evidence for dicistronic transcripts, neighboring genes with overlapping UTRs on the same DNA sequence strand, alternatively spliced genes that encode distinct, non-overlapping peptides, and numerous nested genes.
Conclusions
Identification of so many unusual gene models not only suggests that some mechanisms for gene regulation are more prevalent than previously believed, but also underscores the complex challenges of eukaryotic gene prediction. At present, experimental data and human curation remain essential to generate high-quality genome annotations.
doi:10.1186/gb-2002-3-12-research0083
PMCID: PMC151185  PMID: 12537572
20.  Systematic determination of patterns of gene expression during Drosophila embryogenesis 
Genome Biology  2002;3(12):research0088.1-88.14.
As a first step to creating a comprehensive atlas of gene-expression patterns during Drosophila embryogenesis, 2,179 genes have been examined by in situ hybridization to fixed Drosophila embryos. Of the genes assayed, 63.7% displayed dynamic expression patterns that were documented with 25,690 digital photomicrographs of individual embryos.
Background
Cell-fate specification and tissue differentiation during development are largely achieved by the regulation of gene transcription.
Results
As a first step to creating a comprehensive atlas of gene-expression patterns during Drosophila embryogenesis, we examined 2,179 genes by in situ hybridization to fixed Drosophila embryos. Of the genes assayed, 63.7% displayed dynamic expression patterns that were documented with 25,690 digital photomicrographs of individual embryos. The photomicrographs were annotated using controlled vocabularies for anatomical structures that are organized into a developmental hierarchy. We also generated a detailed time course of gene expression during embryogenesis using microarrays to provide an independent corroboration of the in situ hybridization results. All image, annotation and microarray data are stored in a publicly available database. We found that the RNA transcripts of about 1% of genes show clear subcellular localization. Nearly all the annotated expression patterns are distinct. We present an approach for organizing the data by hierarchical clustering of annotation terms that allows us to group tissues that express similar sets of genes as well as genes displaying similar expression patterns.
Conclusions
Analyzing gene-expression patterns by in situ hybridization to whole-mount embryos provides an extremely rich dataset that can be used to identify genes involved in developmental processes that have been missed by traditional genetic analysis. Systematic analysis of rigorously annotated patterns of gene expression will complement and extend the types of analyses carried out using expression microarrays.
doi:10.1186/gb-2002-3-12-research0088
PMCID: PMC151190  PMID: 12537577
21.  The transposable elements of the Drosophila melanogaster euchromatin: a genomics perspective 
Genome Biology  2002;3(12):research0084.1-84.2.
Using Release 3 of the euchromatic genomic sequence of Drosophila melanogaster, 85 known and eight novel families of transposable element have been identified, varying in copy number from one to 146. A total of 1,572 full and partial transposable elements were identified, comprising 3.86% of the sequence.
Background
Transposable elements are found in the genomes of nearly all eukaryotes. The recent completion of the Release 3 euchromatic genomic sequence of Drosophila melanogaster by the Berkeley Drosophila Genome Project has provided precise sequence for the repetitive elements in the Drosophila euchromatin. We have used this genomic sequence to describe the euchromatic transposable elements in the sequenced strain of this species.
Results
We identified 85 known and eight novel families of transposable element varying in copy number from one to 146. A total of 1,572 full and partial transposable elements were identified, comprising 3.86% of the sequence. More than two-thirds of the transposable elements are partial. The density of transposable elements increases an average of 4.7 times in the centromere-proximal regions of each of the major chromosome arms. We found that transposable elements are preferentially found outside genes; only 436 of 1,572 transposable elements are contained within the 61.4 Mb of sequence that is annotated as being transcribed. A large proportion of transposable elements is found nested within other elements of the same or different classes. Lastly, an analysis of structural variation from different families reveals distinct patterns of deletion for elements belonging to different classes.
Conclusions
This analysis represents an initial characterization of the transposable elements in the Release 3 euchromatic genomic sequence of D. melanogaster for which comparison to the transposable elements of other organisms can begin to be made. These data have been made available on the Berkeley Drosophila Genome Project website for future analyses.
doi:10.1186/gb-2002-3-12-research0084
PMCID: PMC151186  PMID: 12537573
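The claim in entry 21 that transposable elements are preferentially found outside genes can be checked with a short calculation. The sketch below is illustrative only: it takes the element counts and the 61.4 Mb transcribed span from the abstract, and it assumes that the total euchromatic sequence is the 116.9 Mb reported for Release 3 in entry 23 of this listing. That pairing, and the uniform-density null model, are assumptions made here and are not stated in the abstract itself.

```python
# Figures quoted in the abstract of entry 21.
te_total = 1572             # full and partial transposable elements
te_in_transcribed = 436     # elements inside annotated transcribed sequence
transcribed_mb = 61.4       # Mb annotated as transcribed

# ASSUMPTION: total euchromatic length taken from entry 23 (Release 3).
euchromatin_mb = 116.9

# Share of elements that fall inside transcribed sequence.
observed_fraction = te_in_transcribed / te_total

# Share of the sequence that is transcribed.
transcribed_fraction = transcribed_mb / euchromatin_mb

# Elements expected inside transcribed sequence if they were spread uniformly.
expected_in_transcribed = te_total * transcribed_fraction

# A ratio below 1 means fewer elements inside genes than uniform placement predicts.
depletion_ratio = te_in_transcribed / expected_in_transcribed

print(f"observed fraction in transcribed sequence: {observed_fraction:.3f}")
print(f"transcribed fraction of euchromatin:       {transcribed_fraction:.3f}")
print(f"observed / expected:                       {depletion_ratio:.3f}")
```

Under these assumptions the observed share of elements inside transcribed sequence is well below the share of sequence that is transcribed, which is consistent with the abstract's statement.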
22.  A Drosophila full-length cDNA resource 
Genome Biology  2002;3(12):research0080.1-80.8.
High-quality full-insert sequence for 8,921 putative full-length cDNA clones in the Drosophila Gene Collection has been generated and compared to the annotated Release 3 genomic sequence. More than 5,300 cDNAs have been identified that contain a complete and accurate protein-coding sequence, corresponding to at least one splice form for 40% of the predicted D. melanogaster genes.
Background
A collection of sequenced full-length cDNAs is an important resource both for functional genomics studies and for the determination of the intron-exon structure of genes. Providing this resource to the Drosophila melanogaster research community has been a long-term goal of the Berkeley Drosophila Genome Project. We have previously described the Drosophila Gene Collection (DGC), a set of putative full-length cDNAs that was produced by generating and analyzing over 250,000 expressed sequence tags (ESTs) derived from a variety of tissues and developmental stages.
Results
We have generated high-quality full-insert sequence for 8,921 clones in the DGC. We compared the sequence of these clones to the annotated Release 3 genomic sequence, and identified more than 5,300 cDNAs that contain a complete and accurate protein-coding sequence. This corresponds to at least one splice form for 40% of the predicted D. melanogaster genes. We also identified potential new cases of RNA editing.
Conclusions
We show that comparison of cDNA sequences to a high-quality annotated genomic sequence is an effective approach to identifying and eliminating defective clones from a cDNA collection and ensuring its utility for experimentation. Clones were eliminated either because they carry single nucleotide discrepancies, which most probably result from reverse transcriptase errors, or because they are truncated and contain only part of the protein-coding sequence.
doi:10.1186/gb-2002-3-12-research0080
PMCID: PMC151182  PMID: 12537569
23.  Finishing a whole-genome shotgun: Release 3 of the Drosophila melanogaster euchromatic genome sequence 
Genome Biology  2002;3(12):research0079.1-79.14.
The Drosophila melanogaster genome was the first metazoan genome to be sequenced by whole-genome shotgun. Now, the sequence has been finished in a process designed to close gaps, improve sequence quality and validate the assembly.
Background
The Drosophila melanogaster genome was the first metazoan genome to have been sequenced by the whole-genome shotgun (WGS) method. Two issues relating to this achievement were widely debated in the genomics community: how correct is the sequence with respect to base-pair (bp) accuracy and frequency of assembly errors? And, how difficult is it to bring a WGS sequence to the accepted standard for finished sequence? We are now in a position to answer these questions.
Results
Our finishing process was designed to close gaps, improve sequence quality and validate the assembly. Sequence traces derived from the WGS and draft sequencing of individual bacterial artificial chromosomes (BACs) were assembled into BAC-sized segments. These segments were brought to high quality, and then joined to constitute the sequence of each chromosome arm. Overall assembly was verified by comparison to a physical map of fingerprinted BAC clones. In the current version of the 116.9 Mb euchromatic genome, called Release 3, the six euchromatic chromosome arms are represented by 13 scaffolds with a total of 37 sequence gaps. We compared Release 3 to Release 2; in autosomal regions of unique sequence, the error rate of Release 2 was one in 20,000 bp.
Conclusions
The WGS strategy can efficiently produce a high-quality sequence of a metazoan genome while generating the reagents required for sequence finishing. However, the initial method of repeat assembly was flawed. The sequence we report here, Release 3, is a reliable resource for molecular genetic experimentation and computational analysis.
doi:10.1186/gb-2002-3-12-research0079
PMCID: PMC151181  PMID: 12537568
24.  Computational analysis of core promoters in the Drosophila genome 
Genome Biology  2002;3(12):research0087.1-87.12.
Candidate transcription start sites have been identified for about 2,000 Drosophila genes by aligning 5' expressed sequence tags (ESTs) from cap-trapped cDNA libraries to the genome. Examination of the sequences flanking these candidate transcription start sites revealed the presence of well-known core promoter motifs such as the TATA box, the initiator and the downstream promoter element (DPE).
Background
The core promoter, a region of about 100 base-pairs flanking the transcription start site (TSS), serves as the recognition site for the basal transcription apparatus. Drosophila TSSs have generally been mapped by individual experiments; the low number of accurately mapped TSSs has limited analysis of promoter sequence motifs and the training of computational prediction tools.
Results
We identified TSS candidates for about 2,000 Drosophila genes by aligning 5' expressed sequence tags (ESTs) from cap-trapped cDNA libraries to the genome, while applying stringent criteria concerning coverage and 5'-end distribution. Examination of the sequences flanking these TSSs revealed the presence of well-known core promoter motifs such as the TATA box, the initiator and the downstream promoter element (DPE). We also define, and assess the distribution of, several new motifs prevalent in core promoters, including what appears to be a variant DPE motif. Among the prevalent motifs is the DNA-replication-related element DRE, recently shown to be part of the recognition site for the TBP-related factor TRF2. Our TSS set was then used to retrain the computational promoter predictor McPromoter, allowing us to improve the recognition performance to over 50% sensitivity and 40% specificity. We compare these computational results to promoter prediction in vertebrates.
Conclusions
There are relatively few recognizable binding sites for previously known general transcription factors in Drosophila core promoters. However, we identified several new motifs enriched in promoter regions. We were also able to significantly improve the performance of computational TSS prediction in Drosophila.
doi:10.1186/gb-2002-3-12-research0087
PMCID: PMC151189  PMID: 12537576
25.  Evidence for large domains of similarly expressed genes in the Drosophila genome 
Journal of Biology  2002;1(1):5.
Background
Transcriptional regulation in eukaryotes generally operates at the level of individual genes. Regulation of sets of adjacent genes by mechanisms operating at the level of chromosomal domains has been demonstrated in a number of cases, but the fraction of genes in the genome subject to regulation at this level is unknown.
Results
Drosophila gene-expression profiles that were determined from over 80 experimental conditions using high-density oligonucleotide microarrays were searched for groups of adjacent genes that show similar expression profiles. We found about 200 groups of adjacent and similarly expressed genes, each having between 10 and 30 members; together these groups account for over 20% of assayed genes. Each group covers between 20 and 200 kilobase pairs of genomic sequence, with a mean group size of about 100 kilobase pairs. Groups do not appear to show any correlation with polytene banding patterns or other known chromosomal structures, nor were genes within groups functionally related to one another.
Conclusions
Groups of adjacent and co-regulated genes that are not otherwise functionally related in any obvious way can be identified by expression profiling in Drosophila. The mechanism underlying this phenomenon is not yet known.
PMCID: PMC117248  PMID: 12144710
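The search described in entry 25, for groups of adjacent genes with similar expression profiles, can be pictured as a sliding window over genes in chromosomal order. The following is a minimal sketch and not the authors' procedure: the scoring rule (mean pairwise Pearson correlation), the function names, the window size, the threshold, and the toy data are all hypothetical choices made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def mean_pairwise_correlation(profiles):
    """Average correlation over every pair of profiles in the window."""
    pairs = list(combinations(profiles, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def coexpressed_windows(profiles, window=10, threshold=0.5):
    """Return (start_index, score) for each window of adjacent genes whose
    mean pairwise correlation reaches the (hypothetical) threshold."""
    if window < 2:
        raise ValueError("window must contain at least two genes")
    hits = []
    for start in range(len(profiles) - window + 1):
        score = mean_pairwise_correlation(profiles[start:start + window])
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy data: six genes in chromosomal order, four conditions each.
# The first three genes rise together; the rest do not.
toy_profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.5, 2.5, 3.5, 4.5],
    [4.0, 1.0, 3.0, 2.0],
    [2.0, 3.0, 1.0, 4.0],
    [3.0, 1.0, 4.0, 2.0],
]

hits = coexpressed_windows(toy_profiles, window=3, threshold=0.9)
for start, score in hits:
    print(f"genes {start}..{start + 2}: mean correlation {score:.2f}")
```

A real analysis would also need a significance estimate, for example by comparing scores against windows drawn from a shuffled gene order; that step is omitted here.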
