Wiley Interdiscip Rev Syst Biol Med. Author manuscript; available in PMC 2012 January 4.
Published in final edited form as:
PMCID: PMC3251331

The zebrafish: scalable in vivo modeling for systems biology


The zebrafish offers a scalable vertebrate model for many areas of biologic investigation. There is substantial conservation of genetic and genomic features and, at a higher order, conservation of intermolecular networks, as well as physiologic systems and phenotypes. We highlight recent work demonstrating the extent of this homology, and efforts to develop high-throughput phenotyping strategies suited to genetic or chemical screening on a scale compatible with in vivo validation for systems biology. We discuss the implications of these approaches for functional annotation of the genome, elucidation of multicellular processes in vivo, and mechanistic exploration of hypotheses generated by a broad range of ‘unbiased’ ‘omic technologies such as expression profiling and genome-wide association. Finally, we outline potential strategies for the application of the zebrafish to the systematic study of phenotypic architecture, disease heterogeneity and drug responses.


Reductionism remains a central paradigm in biological research, but it is increasingly clear that we must understand the integrated interactions between myriad pathways, cells, tissues, and organs if we are ever to be able to attain the level of prediction possible in the physical sciences.1–3 Importantly, for medical purposes we must comprehend biologic systems not only in their basal states but also under a diverse set of perturbations including stress, environmental exposures, disease states, and drug challenges.

In the postgenomic era, there have been tremendous changes in the scale of data collection that is feasible in basic and translational science, largely through the rapid dissemination of expression profiling, proteomic, and metabolomic technologies.2,4 Although ‘omic cataloging is proceeding at a tremendous pace, we also know that signals transmitted through electric fields, specific ions, lipids, and other small molecules are unlikely ever to be captured in any meaningful way by genomic technologies. These and many other epigenetic or environmental factors will require the development of innovative technologies to generate relevant comprehensive datasets. While characterizing biology and medicine on such a grand scale is emerging as a long-term goal, the challenges of defining connectivity in such an enormous range of states remain daunting.

It is possible to define molecular interactions in many ways: direct physical interaction, shared location, coregulation, or through common multidimensional phenotypes (phenoclustering).5–8 The latter approach may incorporate endpoints as diverse as the initial cell division phenotypes of a nematode and complex clinical syndromes.8–11 As the definition of genotype–phenotype relationships has become feasible on a genome-wide scale, extensive interaction networks have superseded linear pathways, though these constructs are still largely confined to a single cell or a specific scale, and are often merely a static depiction of ‘what is possible’ rather than ‘what is happening’. Algorithms agnostic to data format incorporated in machine learning have proven powerful in several biological applications, particularly in the integration of disparate datasets.12–15

Largely undeclared is a need for the empiric testing of the multiscale hypotheses generated by such in silico models.16 While for many phenotypes or networks in vivo analysis is tractable in the genomes of yeast, Drosophila and Caenorhabditis elegans, for the bulk of vertebrate phenotypes modeling on such a scale is inconceivable. Relevant models for in vivo validation on a genomic scale simply do not exist for most disease-relevant traits. This review is designed to outline the utility of the zebrafish for the in vivo study of biology at scale, proposing its development as a key organism for the evaluation and refinement of computational models, and highlighting entry points to the literature for those interested in further exploration.


The zebrafish is small for a vertebrate. Adults grow to 3 cm or more in length, but during the embryonic and larval stages of life, the zebrafish is only a few mm long. During these stages, developing fish can live for days in a single well of a standard 384-well plate, surviving on nutrients stored in their yolk sacs. Zebrafish are simple and inexpensive to raise, with a single pair of adults routinely laying hundreds of fertilized eggs in a single morning. Importantly, the zebrafish also has a tractable, diploid genome that has been sequenced and is amenable to both forward and reverse genetics. Consequently, as even a small zebrafish facility can generate many thousands of embryos per day, it is straightforward to perform large-scale, phenotype-based genetic or chemical screens.17,18 Since such screening can be performed in vivo, modulation of potential therapeutic targets by mutations or small molecules (exploring chemical or environmental space) reveals the effects of such perturbations across multiple scales from individual molecules, through specific cell types to integrated organ physiology.18,19

The zebrafish genome and body plan are similar to those of other vertebrates, but its transparency and external development make real time observation of its internal organs straightforward, especially when combined with fluorescent markers or reporters that highlight the locations or activities of specific populations of cells. Large numbers of transgenic zebrafish lines have been created that express fluorescent proteins in locations ranging from the presomitic mesoderm20 to the pituitary gland.21 These lines greatly facilitate detection of anatomical and physiological changes caused by genetic perturbations or small molecules. Numerous disease models ranging from congenital heart defects to cancers have been developed22,23 and the zebrafish is genetically and pharmacologically similar to humans.24,25

As zebrafish have become more widely used, additional technologies have been developed that further increase the utility of the model. The zebrafish genome project is now nearly complete, and DNA microarrays have been generated for expression profiling studies and are under development for genotyping.26,27 Antisense morpholino oligos have proven to be an effective means of ‘knocking down’ gene function with very rapid turnaround times and the potential for manipulating tens or hundreds of genes.28 The ease of gene knockdown or gene over-expression facilitates the efficient connection between genotype and phenotype at these extremes. Rigorous proof is often feasible in the form of genetic rescue with cDNA or genomic clones, and in many instances such experiments can be performed using RNA injection or transient transgenesis.29 Such transient approaches can also be used to study dominant negative alleles, and are scalable. More recently, reverse genetic approaches have been developed for the zebrafish, enabling researchers to generate mutations in virtually any gene of interest, and a major effort to generate mutants in every gene is underway.30 The zebrafish is becoming a mature model organism, with an expanding range of genomic and experimental tools.

The relevance of any model organism to human disease is dependent on conservation of genetic and genomic features and, at a higher order, on the conservation of intermolecular networks, physiologic systems and phenotypes. We highlight below the degree of structural and functional conservation between zebrafish and higher organisms.

Sequence Conservation

At the nucleotide level, there is extensive coding sequence conservation between the zebrafish and higher vertebrates that enables relatively efficient delineation of the cognate orthologs for comparative biology. Estimates range from 70 to 80% identity at the nucleotide level to around 80–90% homology at the amino acid level in functional domains.31 Codon usage tables have not yet been systematically defined for the zebrafish, though there is evidence from other teleosts of some distinctive biases. The overall topology of individual genes is predictably conserved, with comparable intron–exon organization, splicing repertoire, and intronic dimensions. As the zebrafish genome project is completed these homologies will be more readily quantitated. Large syntenic regions conserved across multiple mammalian species are often distributed among several smaller loci in zebrafish.32 Importantly, there has also been a substantial genome duplication event in the zebrafish with up to 30% of genes existing as two copies. These paralogous genes may share redundant functions or may have diverged in the location or timing of their expression to subsume completely different and nonredundant roles.31,33
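To make the identity figures above concrete, the following minimal sketch computes percent identity between two pre-aligned sequences. The convention used here (columns containing a gap are excluded from the denominator) is only one of several in common use and is an assumption, as are the function name and the toy sequences, which are not real zebrafish or human sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical 10-base fragments differing at two positions.
toy_fish = "ATGGCCAAGT"
toy_human = "ATGGCTAAGA"
identity = percent_identity(toy_fish, toy_human)
```

Other conventions, such as counting gaps as mismatches or normalizing by the shorter sequence, will give different values for the same alignment, so reported identity ranges are only comparable when the convention is stated.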

Not only are traditional protein coding genes conserved, but so too are many of the noncoding transcripts identified through emerging genomic technologies.34 MicroRNAs and their target sequences are present, though definition of the precise extent of functional conservation in these complex networks will require considerable experimentation.35 There are some biologically significant long noncoding RNAs which do not appear to be conserved at the level of the primary sequence, but direct experimentation may be the only way to exclude conservation at a functional level. Similarly, other aspects of genome biology including nuclear organization, long-range regulation of the genome and DNA–chromatin interactions may also be accessible in the zebrafish.36,37

Regulatory Homology

The degree of sequence conservation between zebrafish and higher vertebrates falls off rapidly in noncoding regions of the genome. Of all non-coding sequences highly conserved between human and mouse, fewer than 15% are conserved in fish. A substantial number of ultraconserved sequences, conserved from human to fish, exhibit long-range enhancer activity, but physiologic roles for these sequences remain to be discovered.34,38

Even where there is little evidence of conservation at the level of the primary sequence, there may be considerable conservation of function. Complementary transgenesis of murine and zebrafish promoters has revealed remarkable functional conservation in the absence of high-level sequence homology.36,39 The genome-wide extent of this phenomenon has not been systematically explored, but these data suggest that the zebrafish may offer strategies for the efficient functional annotation of at least a subset of regulatory elements. Indeed, such empiric experimentation may be required to discern the structural requirements for many forms of genomic control.37

Functional Conservation

Many aspects of complex physiology can be characterized in the zebrafish. Not surprisingly, considering the ease of detection of phenomena in the frequency domain, cardiovascular physiology and neurologic behaviors were among the first areas of investigation. Remarkably, within 48 h of fertilization, and while it can be sustained in large numbers for days in multiwell plates, the larval fish has established complex physiology comparable in many instances to that of adult humans. For example, remarkable physiologic and pharmacologic homology has been demonstrated not only for simple parameters such as heart rate, contractility, and blood flow, but also for a repertoire of higher resolution phenotypes such as depolarization, repolarization, Ca2+ handling in different subcellular compartments, as well as classic transcriptional responses such as that of the natriuretic peptides in hypertrophy or heart failure.29,40–45 These tools enable efficient large-scale screens for genetic or chemical modifiers of known disease pathways in a completely native context.18,44
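As a minimal illustration of why periodic phenotypes are easy to detect in the frequency domain, the sketch below estimates heart rate from an intensity trace, such as the mean brightness of a region over the beating heart in a video recording. The trace is synthetic, and the frame rate and function names are assumptions for illustration; they are not taken from the assays cited above.

```python
import cmath
import math


def dominant_frequency_hz(signal, frame_rate_hz):
    """Frequency (Hz) of the strongest non-DC component of a real signal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the constant offset
    best_bin, best_power = 0, 0.0
    # For a real-valued signal only bins 1..n//2 carry distinct information.
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * frame_rate_hz / n


def heart_rate_bpm(signal, frame_rate_hz):
    """Convert the dominant frequency to beats per minute."""
    return 60.0 * dominant_frequency_hz(signal, frame_rate_hz)


# Synthetic trace: a 2.5 Hz oscillation sampled at 30 frames/s for 4 s.
frame_rate = 30.0
trace = [100 + 10 * math.sin(2 * math.pi * 2.5 * t / frame_rate)
         for t in range(120)]
rate = heart_rate_bpm(trace, frame_rate)
```

The frequency resolution of this direct transform is the frame rate divided by the number of frames (0.25 Hz here), so longer recordings are needed to resolve small changes in rate. A production pipeline would more likely use an FFT library and windowing, which are omitted to keep the sketch dependency-free.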

While the majority of zebrafish phenotypes studied to date have been found to be highly representative, it is imperative that investigators validate fundamental homologies (e.g., miRNA targets, transcription factor networks, translational control, molecular physiology, and specific basal and perturbed phenotypes) for each biological context. Such empiric data will help determine the diseases and regulatory networks most suited for exploration in the zebrafish. At present the phenotypic repertoire in zebrafish is a limiting step in many areas of investigation.

There is clearly a need for a much more robust approach to comparing phenotypes across species.46 The basic phenotypic vocabularies for each model organism have evolved in the fields in which the organism was first used, and in many instances each community emphasizes very different aspects of the biology. Function is rarely so well characterized in animal models as it is in humans, and the resolution of physiology that is feasible will vary across species. On the other hand, specific cells or tissues are often more accessible in particular organisms shedding light on distinctive aspects of the genotype–phenotype relationship. Ultimately, the ability to capture all of the phenotypic components for a given genotype will require integrated investigation across multiple organisms. An important step in this direction will be the formalization of phenotype ontologies.46,47


The zebrafish has proven extremely facile for experimentation in certain areas of biology. These include the earliest stages of organ development, which are inaccessible in other vertebrate models, as well as specific pathways for which comparable mutants do not exist in other organisms. However, perhaps the most obvious niche for the zebrafish is as a screenable vertebrate. Indeed, because it is amenable to automatable or semiautomatable large-scale genetic or chemical screens, the zebrafish is uniquely suited to the in vivo exploration of integrative biology. In most instances some investment must be made to change the scale of phenotyping, but multiple examples suggest that this is feasible for phenotypes ranging from simple cell motility to complex behavior.41,48,49 We have outlined some general examples, but an enormous range of possibilities can be imagined.

Gene Expression

The transparency of the zebrafish lends itself to the comprehensive representation of native gene expression. In situ hybridization techniques have been elegantly employed to follow the emerging patterns of gene expression at cellular resolution during the initial days of development. There are currently large efforts to systematically map the expression of several thousand zebrafish cDNAs through development from early gastrulation through to day 5 postfertilization50 (available online). Robotic approaches to in situ hybridization have enabled genetic or chemical screens based on transcript expression. While susceptible to both false positives and false negatives, if carefully designed, the hits in such screens can readily be validated in second round assays. This strategy may be increasingly important in the empiric evaluation of noncoding sequences, as the logic of gene regulation is deconvoluted. Permeabilization of the fish is much more difficult beyond the first few days of development, so that despite the generation of fish strains that remain transparent through adulthood, transgenic reporters will be required for more complete mapping of gene expression throughout life.

Immunohistochemical characterization of protein localization is also feasible in whole mount in the zebrafish embryo. It is possible in many instances to discriminate between different phosphorylation states, and localization to individual cells can be delineated.44 The relatively small size of zebrafish cells in many organs can make subcellular localization more challenging, but ultimately, comprehensive information on gene expression, protein localization and many aspects of posttranslational modification may be accessible. Adult zebrafish strains that maintain complete or partial transparency open the entire lifespan of the organism to this type of analysis.51

Gene Function Reporters

The zebrafish also offers the opportunity to directly assay transcription, transcript splicing and stability, as well as miRNA targeting and other gene functions, using a variety of reporter strategies that can readily be adapted for in vivo screening.52 Coupling specific promoters with fluorescent proteins or luciferase reporters enables the robust analysis of transcription in spatial and temporal domains. The introduction into a constitutively expressed reporter construct of a target intron for a specific splicing regulator allows the spatial characterization of the regulator’s activity in real time. The discovery of miRNAs has led to an explosion of work in this arena, and the combination of in vivo reporters with target 3′ untranslated sequences has proven a powerful means for the assay of miRNA function. Several other classes of noncoding RNA have been discovered through direct sequencing technologies in the last few years, and may also be accessible in the zebrafish.34


One of the most attractive features of the zebrafish is the prospect of vertebrate cell biology in the context of all of the relevant in vivo cues. This is evident early in development where a ‘recital’ of much of the repertoire of cellular and molecular biology is revealed in the first few days of development. This remarkable and highly reproducible series of events may even be exploited as a uniquely multipotent assay system. Phenotypic signatures of specific cellular or pathway defects can be recognized, and through phenoclustering, other pathway members or modifiers can be identified. This approach has been used to discover inhibitors of specific pathways, but might be much more widely applicable.53
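The idea behind phenoclustering can be sketched in a few lines: each perturbation is summarized as a vector of quantitative phenotype scores, and perturbations with highly correlated vectors are proposed as candidate members of the same pathway. The gene names, phenotype features, and scores below are entirely hypothetical, and Pearson correlation is only one of many possible similarity measures.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length score vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def rank_by_phenotype_similarity(query, profiles):
    """Other perturbations, ordered from most to least similar to the query."""
    scores = [(pearson(profiles[query], vec), name)
              for name, vec in profiles.items() if name != query]
    return [name for _, name in sorted(scores, reverse=True)]


# Hypothetical z-scores for five features per knockdown:
# heart rate, ventricle size, axis curvature, pigmentation, circulation.
profiles = {
    "geneA": [-2.0, 1.5, 0.1, 0.0, -1.8],
    "geneB": [-1.7, 1.2, 0.3, -0.1, -2.1],
    "geneC": [0.1, -0.2, 2.2, 1.9, 0.0],
    "geneD": [0.2, 0.1, 1.8, 2.3, -0.3],
}
neighbors_of_a = rank_by_phenotype_similarity("geneA", profiles)
```

In this toy dataset geneA and geneB share a cardiovascular signature while geneC and geneD share an axis and pigment signature, so each ranks its partner first. Real applications would need many more features, normalization against control embryos, and a clustering method suited to the noise in the assays.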

Automated microscopy platforms have recently been used to assay specific cell behaviors in in vitro systems and in invertebrate development.48,54 Combining these techniques with genome-wide siRNA knockdown has led to successful screens for novel members of specific pathways, and exploration of the genetic networks responsible for fundamental cellular functions.55,56 The ongoing development of robust reporters for a host of cellular processes in the zebrafish, and their potential implementation in an automated or semiautomated mode, will hopefully open these phenomena to unbiased exploration at scale in the zebrafish.57 The availability of reporters for specific biochemical processes allows these also to be integrated in four-dimensional maps of organismal biology. Several fluorescence resonance energy transfer or FRET-based reporters for a range of physiologic variables including Ca2+ concentration, cAMP or specific kinase activities in different cellular compartments have been successfully translated from cell culture to in vivo analysis in the zebrafish.58 Other investigators have used different approaches such as ion and voltage sensitive dyes, or other molecular reporters.59 As with any application of these technologies close attention must be paid to the signal–noise ratios, dynamic range, and the effects of the probes or reporters themselves on basal physiology. The complexity of vertebrate genetics and cell biology is such that the greatest advantage of the fish is likely to be in the in vivo modeling of sophisticated cellular interactions in organogenesis, and multitissue responses in health or disease, rather than in subcellular biology.


Several groups have successfully adapted imaging or other assays to multiwell plates and undertaken screens for genetic or chemical modifiers with varying degrees of success. For example, in the cardiovascular arena, sophisticated assays of cardiac rate, rhythm, contractility, blood flow, vasomotion, and rheology are available, and can often be multiplexed using different fluorescent reporters and refined image analysis techniques.43,45,57 Ca2+ reporters have been used to screen for mutants affecting the developmental maturation of cardiac physiology. Combining screening technologies in series, to optimize sensitivity and specificity in turn, highlights the utility of counter-screening. This type of staged or combination screen is readily feasible with higher throughput assays, and allows sophisticated screen logic to be applied.59 Assays for many other physiologic functions have been developed (see examples in Table 1), and as the zebrafish is becoming increasingly popular as a model for physiology and disease, the range of these assays will continue to grow. Investigators can exploit not only the integrative physiology of the zebrafish, but also the ontologic sequence that the organism offers. Different stages of integration can be defined for many of the processes being studied, enabling evaluation of the role of early function in the patterning of whole organism responses. Perhaps the most powerful strategies will incorporate aspects of cell biology, physiology, and drug challenges, all in the context of validated zebrafish models of disease. The potential for data collection on a scale matching that of functional genomics is close to being realized, but will require collaboration among geneticists, cell and computational biologists, engineers, and physiologists.
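The logic of running two screens in series can be made explicit with a short calculation. The sketch below assumes that a hit must be positive in both assays and that the errors of the two assays are independent; both are simplifying assumptions, and the performance figures and hit prevalence are invented for illustration rather than taken from any screen discussed here.

```python
def serial_screen(sens1, spec1, sens2, spec2):
    """Combined sensitivity and specificity when a hit must pass both assays.

    Assumes the two assays err independently of one another.
    """
    sensitivity = sens1 * sens2
    specificity = 1.0 - (1.0 - spec1) * (1.0 - spec2)
    return sensitivity, specificity


def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of called hits that are true hits."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical figures: a sensitive primary assay, a specific secondary assay,
# and true hits making up 2% of the lines screened.
primary = (0.95, 0.80)
secondary = (0.90, 0.99)
prevalence = 0.02

combined = serial_screen(*primary, *secondary)
ppv_primary = positive_predictive_value(*primary, prevalence)
ppv_combined = positive_predictive_value(*combined, prevalence)
```

With these assumed numbers, the staged design gives up a modest amount of sensitivity in exchange for a large gain in the fraction of hits that are real, which is why a high-throughput, error-tolerant first pass followed by a slower, high-resolution counter-screen can be an efficient use of phenotyping effort.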

Examples of Scalable Modeling in Zebrafish

Where higher throughput assays have been developed, it has been possible to exploit the zebrafish for the annotation of small molecule function, for in vivo approaches to the study of drug toxicity, empiric pathway discovery or the testing of large numbers of variables in genetically faithful disease models.41,60 This approach has been used to explore phenotypes including the pharmacogenetics of specific drug responses and the development of repolarization. Similarly, behaviors ranging from individual reflexes through midbrain responses to social interaction have been characterized in the zebrafish, and large-scale screens for fundamental neurological responses are underway.

A recent study by Mitchison et al. depicts the power of the zebrafish model for studying dynamic physiologic processes.61 The authors focused on the problem of wound detection and healing—in particular the mechanism by which, in the initial minutes after injury, leukocytes from hundreds of microns away sense and migrate toward damaged tissue. Given its biochemical properties, they hypothesized that hydrogen peroxide (H2O2) may contribute toward the required spatial signaling.

An in vivo evaluation of the hypothesis required that the authors genetically and pharmacologically manipulate a living biological system, while simultaneously allowing visualization of the dynamics of a chemical gradient and the resulting migration of leukocytes, a task ideally suited for study in the zebrafish embryo.61 The authors made use of a genetically encoded H2O2 sensor, HyPer, which consists of a circularly permuted yellow fluorescent protein (YFP) covalently attached to OxyR, a bacterial transcription factor that changes conformation in response to H2O2. The resulting change is then transmitted to YFP, altering its fluorescent properties, and allowing specific H2O2 quantification in vivo. Using this probe in the context of an embryonic zebrafish tail injury model, the authors were able to detect a rapid increase in H2O2 at the wound margin which extended outwards several hundred microns. Leukocyte influx was tracked using leukocyte-specific fluorescent tags and found to temporally follow rather than precede H2O2 production, further supporting H2O2’s role as a spatial chemotactic signal. Finally, the authors elucidated the molecular basis for the peroxide gradient through chemical and genetic inhibition, thus establishing dual oxidase (Duox1) as the enzyme responsible for generating H2O2 at the wound margin.

Small model organisms not only permit the opportunity to model intercellular crosstalk in vivo, but such approaches can potentially lead to quantitative, dynamic models of higher-order processes such as cell-fate specification, which rely on the integration of local signals to determine cellular identity. Although there has been little use of zebrafish for this purpose to date, similar investigations in other small organisms can suggest potential applications of the zebrafish system toward defining mechanisms of intercellular communication for vertebrate-specific processes such as glomerular filtration or arrhythmias.62

One of the most elegant examples of cell-fate specification is that of C. elegans vulval development, originally described in detail in 1989 by Sternberg and Horvitz.63 The C. elegans vulva arises from three vulval precursor cells, which integrate signals from a gonadal anchor cell, the surrounding hypodermal syncytium and one another to attain one of three cellular fates. The genes important in sending and receiving signals have been identified, and a systematic series of mutations has resulted in genotype–phenotype descriptions for over 48 mutation combinations. The wealth of data has led to the parallel development of dynamic mathematical64 and computational (executable) models65 to describe the intercellular communication, with the ability to parameterize models on the basis of empiric data. In the case of computational models, a prediction regarding the requirement of a time-delay between signals was verified with the use of fluorescent reporters of cell-fate in vivo.

Multicellular processes in zebrafish, which may have greater relevance to vertebrates, should lend themselves to similar modeling approaches. One area of significant interest is cardiac regeneration, which has recently been shown in zebrafish to involve dedifferentiation of mature cardiomyocytes.66 Genetic and chemical screens could be used to define mediators of this process, and fluorescent-based reporters could then be used to assist the development and refinement of quantitative models describing the timing of intercellular communication. Such models in fact may have clinical relevance for healing postmyocardial infarction.


There are many possible strategies for using zebrafish to derive and inform multigene networks and gene–environment interactions relevant to human biology or disease. In this section, we have outlined some of the most pertinent applications, but, as phenotyping technologies change, the scope of what is feasible will only grow.

Modeling of Mutations or Interactions from Other Organisms

One of the most pressing needs in systems biology is for in vivo modeling on a scale compatible with ‘omics technologies. In the last few years, the advent and validation of morpholinos, as well as the increasing efficiency of transgenesis have made it possible to study the effects of tens, or even hundreds, of genes on specific phenotypes. This scale of investigation has often only been limited by phenotyping throughput, but the availability of automatable, quantitative phenotypes opens these to comprehensive screens.41,49 Several biological problems are immediately accessible to this scale of modeling. Specific networks, derived from the integration of expression profiling data or genetic screens in lower organisms, can be efficiently validated, and new hypotheses on network structure in cognate vertebrate phenotypes can be efficiently explored.

Perhaps one of the most obvious areas where the scale of downstream investigation is currently a roadblock to progress is in the follow-up of genome-wide association studies (GWAS). The initial wave of human GWAS has raised several important challenges for investigators.67–69 The simultaneous identification of multiple novel loci has highlighted a need for efficient model systems both to define the causal genes and to explore the fundamental mechanisms of disease.69,70 In addition, many of the most successful studies to date have explained only a proportion of the heritability, and approaches to identifying the remaining genes are far from straightforward in humans.70–73 The cost of increasing the size (and power) of many human disease cohorts for secondary analyses of loci of borderline significance and for gene–gene or gene–environment interaction is prohibitive. Notably, these findings also suggest there is substantial unidentified etiologic heterogeneity underlying many common disorders which must be resolved if the genetic basis of these traits is to be understood.68,69,74,75 Any strategy to identify the causal genes underlying any of the GWA loci for a specific phenotype must allow the evaluation of multiple genes across hundreds of kilobases at each locus, alone and in combination, as well as offering the capability to explore interactions between loci. Ideally, tissue-specific mechanistic studies would also be feasible once the genes have been identified. Initial work in the mouse has suggested that such an integrative genomic strategy can move toward mechanism at individual GWAS loci, but the practical obstacles to addressing large numbers of loci in parallel may be difficult to overcome.4,76,77 The zebrafish offers a compromise, with robust modeling of complex vertebrate biology and accessibility to functional genomics for different tissues or organs, but with the scalability necessary for rigorous gene–gene and gene–environment analyses, facilitating the parallel exploration of all loci for the trait in question.18,19,23,78
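The experimental burden implied by testing genes "alone and in combination" grows quickly, which is part of the argument for a scalable model. The sketch below enumerates single knockdowns, pairwise knockdowns within a locus, and pairwise knockdowns between loci for a small design. The locus and gene names are placeholders, and restricting combinations to pairs is an assumption made to keep the example small.

```python
from itertools import combinations


def knockdown_design(genes_by_locus):
    """Enumerate single, within-locus pair, and between-locus pair knockdowns."""
    singles = [(gene,)
               for genes in genes_by_locus.values()
               for gene in genes]
    within_locus = [pair
                    for genes in genes_by_locus.values()
                    for pair in combinations(genes, 2)]
    between_loci = [(a, b)
                    for (_, genes1), (_, genes2)
                    in combinations(genes_by_locus.items(), 2)
                    for a in genes1
                    for b in genes2]
    return singles, within_locus, between_loci


# Placeholder design: two associated loci with three and two candidate genes.
loci = {
    "locus1": ["gene1a", "gene1b", "gene1c"],
    "locus2": ["gene2a", "gene2b"],
}
singles, within_locus, between_loci = knockdown_design(loci)
total_conditions = len(singles) + len(within_locus) + len(between_loci)
```

Even this two-locus toy design requires 15 conditions before replicates, doses, or environmental exposures are considered, and the number of pairs rises roughly with the square of the number of candidate genes.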

Network exploration in the zebrafish itself will ultimately be feasible on a genome-wide basis for appropriately scalable phenotypes. Bulk aquaculture, embryo sorting, robotic techniques for oocyte injection, and the streamlining of phenotyping and other aspects of the process will eventually bring these approaches to many complex traits. Such efforts will be complemented by ongoing work to define mutant alleles in every zebrafish gene.79 Not all downstream mechanistic studies will be feasible in the zebrafish, but as for many of the overarching strategies we have described, the fish complements work in simpler in vivo or in vitro models and can prioritize hypotheses for testing in rodents, larger animals or humans.

Traditional Unbiased Genetic or Chemical Screens

Once faithful phenotypic surrogates have been established in the zebrafish, through direct physiologic or pharmacologic parallels, or through disrupting known human disease gene orthologs, many avenues of exploration are feasible. The creation of high-throughput assays for relevant phenotypes in the fish allows the rapid genetic or chemical annotation of disease pathways. Genetic screens can be performed for pathway modifiers, and chemical screens can yield pathway probes or drug leads.80

Specific examples include shelf screens of existing zebrafish mutants. Such screens could readily be adapted to genome-wide mutant libraries currently under construction, but can also be performed using large morpholino libraries or for some tissues even transient expression of panels of cDNAs. In one shelf screen, the robust parallels observed between zebrafish and human cardiac repolarization suggested that formal genetic dissection of this clinically important complex trait might be feasible.60 Using a stepwise phenotyping strategy investigators could attain the throughput required for saturation genetic screening. Thus, by combining an initial sensitive high-throughput screen for abnormal heart rate response to dofetilide with a very specific second high-resolution assay, in which confirmed mutants were studied using optical mapping, sensitivity and specificity were optimized.41,80 The screening strategy identified mutants based on bimodal distributions of heart rate response to dofetilide. Subsequent testing in the absence of dofetilide allowed discrimination between pure drug response phenotypes and intrinsic heart rate defects. In the initial shelf screen of 340 insertional mutants, 15 genes with major effects on repolarization were identified, none of which had previously been implicated in this process. Interestingly, the majority of these genes appear to belong to an integrin-associated network modulating channels and their adaptor proteins (Figure 1). Recently, in collaboration with clinical investigators it was established that some of these genes modify human repolarization, confirming the utility of zebrafish screens for gene discovery in physiologic or pharmacologic pathways.59
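One simple way to flag a clutch whose drug response is bimodal is to split the measurements into two groups and ask how well separated the groups are. The sketch below does this with a one-dimensional two-group split and an ad hoc separation score. It is not the analysis used in the screen described above; the threshold, the function names, and the heart rate values are all assumptions for illustration.

```python
import statistics


def split_two_groups(values):
    """Split sorted values at the cut that minimizes within-group scatter."""
    ordered = sorted(values)
    best = None
    # Keep at least two values in each group so a variance can be computed.
    for cut in range(2, len(ordered) - 1):
        low, high = ordered[:cut], ordered[cut:]
        cost = (statistics.pvariance(low) * len(low)
                + statistics.pvariance(high) * len(high))
        if best is None or cost < best[0]:
            best = (cost, low, high)
    return best[1], best[2]


def separation_score(low, high):
    """Gap between group means relative to the combined group spread."""
    spread = (statistics.pvariance(low) + statistics.pvariance(high)) ** 0.5
    gap = statistics.mean(high) - statistics.mean(low)
    return float("inf") if spread == 0 else gap / spread


def is_bimodal(values, threshold=4.0):
    """Flag a clutch as bimodal; the threshold is an arbitrary assumption."""
    low, high = split_two_groups(values)
    return separation_score(low, high) > threshold


# Synthetic percent changes in heart rate after drug exposure, 16 embryos each.
segregating_clutch = [-31, -29, -30, -32, -28, -30, -31, -29,
                      -30, -33, -27, -30, -6, -4, -5, -7]
uniform_clutch = [-31, -29, -30, -32, -28, -30, -31, -29,
                  -30, -33, -27, -30, -29, -31, -30, -28]
```

In the synthetic segregating clutch a quarter of the embryos respond differently, as would be expected for a recessive mutation in an incross of heterozygous carriers, and the split recovers that minority group. Because any dataset can be divided into two groups, a score threshold calibrated on control clutches is needed to avoid calling ordinary unimodal variation bimodal.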

Figure 1. A network of integrin-associated proteins identified in a screen for modifiers of repolarization in the zebrafish.59
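The bimodality criterion used in the dofetilide screen can be illustrated with a minimal sketch. It assumes, hypothetically, that each clutch yields a list of per-embryo heart rate changes after drug exposure, and it applies a one-dimensional two-means split with a simple separation score. The statistical procedure actually used in the screen is not described here; the function names, thresholds, and values below are invented for illustration only.

```python
from statistics import mean


def two_means_split(values, iterations=50):
    """Split 1-D values into a low and a high group by two-means iteration."""
    lo_c, hi_c = min(values), max(values)
    low, high = list(values), []
    for _ in range(iterations):
        # Assign each value to the nearer of the two current centers.
        low = [v for v in values if abs(v - lo_c) <= abs(v - hi_c)]
        high = [v for v in values if abs(v - lo_c) > abs(v - hi_c)]
        if not high:
            break
        new_lo, new_hi = mean(low), mean(high)
        if (new_lo, new_hi) == (lo_c, hi_c):
            break  # centers stopped moving
        lo_c, hi_c = new_lo, new_hi
    return low, high


def separation_score(low, high):
    """Gap between group means in units of pooled within-group SD."""
    if not low or not high:
        return 0.0
    m_low, m_high = mean(low), mean(high)
    ss = sum((v - m_low) ** 2 for v in low) + sum((v - m_high) ** 2 for v in high)
    pooled_sd = (ss / (len(low) + len(high))) ** 0.5
    if pooled_sd == 0:
        return float("inf")
    return (m_high - m_low) / pooled_sd


def looks_bimodal(values, min_score=4.0, min_fraction=0.1):
    """Flag a clutch whose responses fall into two well-separated groups.

    min_score and min_fraction are arbitrary illustrative thresholds.
    A single Gaussian split at its mean scores roughly 2.6 on this
    measure, so 4.0 is chosen to sit above that.
    """
    low, high = two_means_split(values)
    smaller = min(len(low), len(high)) / len(values)
    return smaller >= min_fraction and separation_score(low, high) >= min_score


# Hypothetical heart rate changes (beats per minute) after dofetilide.
unimodal_clutch = [-52, -50, -49, -51, -48, -50, -53, -47]
bimodal_clutch = [-52, -50, -49, -51, -48, -50, -5, -4]
```

In this sketch a clutch such as `bimodal_clutch`, in which a minority of embryos fail to slow in response to the drug, would be flagged for the second, higher-resolution assay, whereas `unimodal_clutch` would not.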

Modeling physiology or disease in whole organisms provides a comprehensive context that typically cannot be achieved with in vitro cell-based assays.18,19,81 However, whole organism disease models do not typically allow for the scale and automation that are central features of high-throughput small molecule screens. The zebrafish makes it possible to combine the physiological context of the whole organism with high-throughput screening in a vertebrate. The existence of in vivo disease models has enabled direct small molecule screens to identify novel pathway probes and ultimately initial drug leads.18,44 In many instances, previously annotated libraries of compounds can be used to move rapidly from phenotype to pathway, and data regarding mechanism of action, clinical safety, and other factors are available for many of the hits. To date ‘proof of concept’ screens have been successfully undertaken, but the true potential for such screens in chemical biology or drug discovery has yet to be realized.18 The advantages of this approach include the ability to screen all of the relevant targets together in the appropriate context, parallel identification of efficacy and toxicity, and the feasibility of exploring drug-drug interactions. Several drug leads identified in zebrafish screens are making their way toward the clinic, and the economic success of even one compound is likely to change the landscape for this type of ultra-high content chemical screen.

Phenotypic Exploration and Discovery

Phenotype-driven modeling in yeast, fly, and nematode has proven a powerful approach to pathway dissection.4,10,54 The converse approach, ‘unbiased’ phenotype discovery, is emerging for basic cellular phenotypes in lower organisms, but large-scale in vivo modeling of vertebrate traits has not been feasible. The zebrafish offers the potential for unsupervised learning of quantitative trait measures to determine the characteristic phenotype(s) associated with specific gene manipulations. The phenotypes captured in this way may be orthogonal to those identified in humans and, through the discovery of associated pathways, may shed light on different aspects of biology or disease. This type of analysis would allow empiric study of many phenomena in genotype-phenotype correlation, including penetrance, pleiotropy, and pathogenicity. Importantly, any comprehensive approach to these problems requires systematic and relevant systems perturbations that are readily performed in the zebrafish. With sufficient examples of causal alleles, this approach would be highly informative for human disease, and is difficult to imagine in other model organisms.
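One simple form of such unsupervised analysis can be sketched as follows. The example assumes, hypothetically, that each gene manipulation has already been summarized as a profile of trait measurements standardized against controls, and it groups manipulations whose profiles are strongly correlated. The gene names, traits, values, and correlation threshold are all invented, and single-linkage grouping is only one of many possible clustering choices.

```python
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # a flat profile carries no pattern to correlate
    return cov / (va * vb) ** 0.5


def group_profiles(profiles, min_r=0.8):
    """Single-linkage grouping: manipulations linked by r >= min_r share a group.

    profiles maps a manipulation name to a dict of trait -> standardized value.
    """
    if not profiles:
        return []
    names = sorted(profiles)
    traits = sorted(next(iter(profiles.values())))
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the group representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        vec_a = [profiles[a][t] for t in traits]
        for b in names[i + 1:]:
            vec_b = [profiles[b][t] for t in traits]
            if pearson(vec_a, vec_b) >= min_r:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


# Hypothetical standardized trait profiles for four gene knockdowns.
example_profiles = {
    "geneA": {"heart_rate": -2.0, "body_length": 0.1, "swim_distance": -1.5, "pigment": 0.0},
    "geneB": {"heart_rate": -1.8, "body_length": 0.0, "swim_distance": -1.2, "pigment": 0.2},
    "geneC": {"heart_rate": 0.1, "body_length": -2.2, "swim_distance": 0.3, "pigment": -1.9},
    "geneD": {"heart_rate": 0.0, "body_length": -2.0, "swim_distance": 0.1, "pigment": -2.1},
}
example_groups = group_profiles(example_profiles)
```

Manipulations that fall into the same group share a characteristic multi-trait signature, which is the kind of empirically derived phenotype class that could then be examined for shared pathways.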

While much of the focus in human and animal modeling of disease over the last two decades has been on the genetic and epigenetic contributors to complexity, there has been little work to improve the granularity of clinical phenotypes.10,82,83 Many disease syndromes are rooted in the late 19th century, and the success of randomized controlled trials, while a major advance, has acted as a ‘lumping’ influence in the last two decades. As we begin to understand variability in disease biology and in drug responses, the concept of personalized medicine has emerged.84,85 To facilitate this phenotypic refinement, new diagnostic tools to discriminate homogeneous disease subsets, and to identify causally related, more penetrant ‘endophenotypes’ are being proposed.83 Implicit in this construct is a dramatic change in our understanding of the relationship between genotype and phenotype, including quantitative comprehension of gene-gene, gene-environment, and gene-drug interactions.

The information content of any network is greatly enhanced by systematic perturbations, and drugs are one form of such perturbation that might be broadly applicable. While dynamic or provoked phenotypes are not uncommon in vitro, the concept of empiric in vivo discovery of provoked phenotypes may overcome current barriers.82,83 Ideally, a phenotype discovery effort would encompass relatively unbiased assessment of multiple phenotypic axes, including diverse organ systems or tissues.10,54,82,83 The zebrafish offers an extensive repertoire of vertebrate phenotypes, but at present their detection requires careful observation or directed rational assay design.10 Modeling genetic defects will not only allow surrogate fish phenotypes to be defined, but may also detect new aspects of such multisystem defects.86,87 By testing several thousand compounds, one might hope to identify a pathway probe that would discriminate between distinctive pathways with superficially similar basal phenotypes. Such network perturbations can also readily be translated across species. Combining zebrafish disease models with libraries of known and approved drugs may enable the collection of datasets highly informative not only for disease network architecture, but also for pharmacogenetics.
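The idea of a compound acting as a discriminating pathway probe can be made concrete with a small sketch. It assumes, hypothetically, that the same quantitative phenotype has been measured in replicate for two disease models under each compound, and it ranks compounds by the standardized difference in response between the models. The compound names, values, and the choice of a pooled-SD effect size are illustrative assumptions, not a description of any published screen.

```python
from statistics import mean, variance


def standardized_difference(a, b):
    """Difference in means divided by the pooled sample SD (Cohen's d)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if pooled_var == 0:
        return 0.0 if mean(a) == mean(b) else float("inf")
    return (mean(a) - mean(b)) / pooled_var ** 0.5


def rank_probes(responses):
    """Rank compounds by how strongly they separate two models.

    responses maps a compound name to (model_a_values, model_b_values).
    Returns (name, absolute effect size) pairs, largest separation first.
    """
    scored = [
        (name, abs(standardized_difference(a, b)))
        for name, (a, b) in responses.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical heart rates (beats per minute) in two models that look
# alike at baseline ("vehicle") but diverge under one compound.
example_responses = {
    "vehicle": ([100, 102, 98], [101, 99, 100]),
    "compound_1": ([80, 82, 78], [82, 80, 84]),
    "compound_2": ([60, 62, 58], [99, 101, 100]),
}
example_ranking = rank_probes(example_responses)
```

Here the two models are indistinguishable under vehicle, and only the top-ranked compound provokes a response that separates them, which is the behavior one would seek in a candidate pathway probe.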


Importantly, each of these techniques can be undertaken in an iterative manner, with every cycle further informing in silico models and seeding complementary approaches to mechanistic, diagnostic, and therapeutic discovery in clinical, translational, and basic arenas. A central aspect of this approach is the identification of key genetic or environmental perturbations that aid in the characterization of the underlying network architecture. With the unique scalability and powerful genetic resources of the zebrafish model, it is possible to systematically validate multiple aspects of the underlying network models across different times, different states, and different populations.59,88,89 This tractable vertebrate will engage systems biologists in a multidisciplinary effort that by its very nature mandates the generation of new integrative models and of new research teams operating at the interfaces between previously disparate scientific fields.


1. Loscalzo J, Kohane I, Barabasi AL. Human disease classification in the postgenomic era: a complex systems approach to human pathobiology. Mol Syst Biol. 2007;3:124. [PMC free article] [PubMed]
2. Reed JL, Patel TR, Chen KH, Joyce AR, Applebee MK, Herring CD, Bui OT, Knight EM, Fong SS, Palsson BO. Systems approach to refining genome annotation. Proc Natl Acad Sci USA. 2006;103:17480–17484. [PubMed]
3. Strange K. The end of “naive reductionism”: rise of systems biology or renaissance of physiology? Am J Physiol Cell Physiol. 2005;288:C968–C974. [PubMed]
4. Ge H, Walhout AJ, Vidal M. Integrating ‘omic information: a bridge between genomics and systems biology. Trends Genet. 2003;19:551–560. [PubMed]
5. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, Li J, Pu S, Datta N, Tikuisis AP. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440:637–643. [PubMed]
6. Hartman JLT, Tippery NP. Systematic quantification of gene interactions by phenotypic array analysis. Genome Biol. 2004;5:R49. [PMC free article] [PubMed]
7. Ge H, Liu Z, Church GM, Vidal M. Correlation between transcriptome and interactome mapping data from Saccharomyces cerevisiae. Nat Genet. 2001;29:482–486. [PubMed]
8. Piano F, Schetter AJ, Morton DG, Gunsalus KC, Reinke V, Kim SK, Kemphues KJ. Gene clustering based on RNAi phenotypes of ovary-enriched genes in C. elegans. Curr Biol. 2002;12:1959–1964. [PubMed]
9. Srinivasan P, Piano F, Shatkin AJ. mRNA capping enzyme requirement for Caenorhabditis elegans viability. J Biol Chem. 2003;278:14168–14173. [PubMed]
10. Rual JF, Ceron J, Koreth J, Hao T, Nicot AS, Hirozane-Kishikawa T, Vandenhaute J, Orkin SH, Hill DE, van den Heuvel S, et al. Toward improving Caenorhabditis elegans phenome mapping with an ORFeome-based RNAi library. Genome Res. 2004;14:2162–2168. [PubMed]
11. King OD, Foulger RE, Dwight SS, White JV, Roth FP. Predicting gene function from patterns of annotation. Genome Res. 2003;13:896–904. [PubMed]
12. Deo RC, Hunter L, Lewis GD, Pare G, Vasan RS, Chasman D, Wang TJ, Gerszten RE, Roth FP. Interpreting metabolomic profiles using unbiased pathway models. PLoS Comput Biol. 2010;6(2):e1000692. [PMC free article] [PubMed]
13. Berriz GF, King OD, Bryant B, Sander C, Roth FP. Characterizing gene sets with FuncAssociate. Bioinformatics. 2003;19:2502–2504. [PubMed]
14. Clare A, King RD. Machine learning of functional class from phenotype data. Bioinformatics. 2002;18:160–166. [PubMed]
15. Pavlidis P, Weston J, Cai J, Noble WS. Learning gene functional classifications from multiple data types. J Comput Biol. 2002;9:401–411. [PubMed]
16. Joyce AR, Palsson BO. The model organism as a system: integrating ‘omics’ data sets. Nat Rev Mol Cell Biol. 2006;7:198–210. [PubMed]
17. Anderson RH, Webb S, Brown NA, Lamers W, Moorman A. Development of the heart: (3) formation of the ventricular outflow tracts, arterial valves, and intrapericardial arterial trunks. Heart. 2003;89:1110–1118. [PMC free article] [PubMed]
18. MacRae CA, Peterson RT. Zebrafish-based small molecule discovery. Chem Biol. 2003;10:901–908. [PubMed]
19. Zon LI, Peterson RT. In vivo drug discovery in the zebrafish. Nat Rev Drug Discov. 2005;4:35–44. [PubMed]
20. Gajewski M, Sieger D, Alt B, Leve C, Hans S, Wolff C, Rohr KB, Tautz D. Anterior and posterior waves of cyclic her1 gene expression are differentially regulated in the presomitic mesoderm of zebrafish. Development. 2003;130:4269–4278. [PubMed]
21. Liu NA, Huang H, Yang Z, Herzog W, Hammerschmidt M, Lin S, Melmed S. Pituitary corticotroph ontogeny and regulation in transgenic zebrafish. Mol Endocrinol. 2003;17:959–966. [PubMed]
22. Barut BA, Zon LI. Realizing the potential of zebrafish as a model for human disease. Physiol Genom. 2000;2:49–51. [PubMed]
23. Lieschke GJ, Currie PD. Animal models of human disease: zebrafish swim into view. Nat Rev Genet. 2007;8:353–367. [PubMed]
24. Langheinrich U, Vacun G, Wagner T. Zebrafish embryos express an orthologue of HERG and are sensitive toward a range of QT-prolonging drugs inducing severe arrhythmia. Toxicol Appl Pharmacol. 2003;193:370–382. [PubMed]
25. Milan DJ, Peterson TA, Ruskin JN, Peterson RT, MacRae CA. Drugs that induce repolarization abnormalities cause bradycardia in zebrafish. Circulation. 2003;107:1355–1358. [PubMed]
26. Wienholds E, Plasterk RH. MicroRNA function in animal development. FEBS Lett. 2005;579:5911–5922. [PubMed]
27. Stuckenholz C, Lu L, Thakur P, Kaminski N, Bahary N. FACS-assisted microarray profiling implicates novel genes and pathways in zebrafish gastrointestinal tract development. Gastroenterology. 2009;137:1321–1332. [PMC free article] [PubMed]
28. Nasevicius A, Ekker SC. Effective targeted gene ‘knockdown’ in zebrafish. Nat Genet. 2000;26:216–220. [PubMed]
29. Heuser A, Plovie ER, Ellinor PT, Grossmann KS, Shin JT, Wichter T, Basson CT, Lerman BB, Sasse-Klaassen S, Thierfelder L, et al. Mutant desmocollin-2 causes arrhythmogenic right ventricular cardiomyopathy. Am J Hum Genet. 2006;79:1081–1088. [PubMed]
30. Wienholds E, Schulte-Merker S, Walderich B, Plasterk RH. Target-selected inactivation of the zebrafish rag1 gene. Science. 2002;297:99–102. [PubMed]
31. Postlethwait JH, Woods IG, Ngo-Hazelett P, Yan YL, Kelly PD, Chu F, Huang H, Hill-Force A, Talbot WS. Zebrafish comparative genomics and the origins of vertebrate chromosomes. Genome Res. 2000;10:1890–1902. [PubMed]
32. Barbazuk WB, Korf I, Kadavi C, Heyen J, Tate S, Wun E, Bedell JA, McPherson JD, Johnson SL. The syntenic relationship of the zebrafish and human genomes. Genome Res. 2000;10:1351–1358. [PubMed]
33. Van de Peer Y, Taylor JS, Joseph J, Meyer A. Wanda: a database of duplicated fish genes. Nucleic Acids Res. 2002;30:109–112. [PMC free article] [PubMed]
34. Elgar G, Vavouri T. Tuning in to the signals: noncoding sequence conservation in vertebrate genomes. Trends Genet. 2008;24:344–352. [PubMed]
35. Thatcher EJ, Bond J, Paydar I, Patton JG. Genomic organization of zebrafish microRNAs. BMC Genom. 2008;9:253. [PMC free article] [PubMed]
36. Pashos EE, Kague E, Fisher S. Evaluation of cis-regulatory function in zebrafish. Brief Funct Genom Proteomic. 2008;7:465–473. [PubMed]
37. Dong X, Navratilova P, Fredman D, Drivenes O, Becker TS, Lenhard B. Exonic remnants of whole-genome duplication reveal cis-regulatory function of coding exons. Nucleic Acids Res. 2010;38:1071–1085. [PMC free article] [PubMed]
38. Shin JT, Priest JR, Ovcharenko I, Ronco A, Moore RK, Burns CG, MacRae CA. Human-zebrafish non-coding conserved elements act in vivo to regulate transcription. Nucleic Acids Res. 2005;33:5437–5445. [PMC free article] [PubMed]
39. Kague E, Bessling SL, Lee J, Hu G, Passos-Bueno MR, Fisher S. Functionally conserved cis-regulatory elements of COL18A1 identified through zebrafish transgenesis. Dev Biol. 2010;337:496–505. [PubMed]
40. Burns CG, MacRae CA. Purification of hearts from zebrafish embryos. Biotechniques. 2006;40:274, 276, 278. passim. [PubMed]
41. Burns CG, Milan DJ, Grande EJ, Rottbauer W, MacRae CA, Fishman MC. High-throughput assay for small molecules that modulate zebrafish embryonic heart rate. Nat Chem Biol. 2005;1:263–264. [PubMed]
42. Gerull B, Heuser A, Wichter T, Paul M, Basson CT, McDermott DA, Lerman BB, Markowitz SM, Ellinor PT, MacRae CA. Mutations in the desmosomal protein plakophilin-2 are common in arrhythmogenic right ventricular cardiomyopathy. Nat Genet. 2004;36:1162–1164. [PubMed]
43. Lee JS, Yu Q, Shin JT, Sebzda E, Bertozzi C, Chen M, Mericko P, Stadtfeld M, Zhou D, Cheng L, et al. Klf2 is an essential regulator of vascular hemodynamic forces in vivo. Dev Cell. 2006;11:845–857. [PubMed]
44. Peterson RT, Shaw SY, Peterson TA, Milan DJ, Zhong TP, Schreiber SL, MacRae CA, Fishman MC. Chemical suppression of a genetic mutation in a zebrafish model of aortic coarctation. Nat Biotechnol. 2004;22:595–599. [PubMed]
45. Schonberger J, Wang L, Shin JT, Kim SD, Depreux FF, Zhu H, Zon L, Pizard A, Kim JB, Macrae CA, et al. Mutation in the transcriptional coactivator EYA4 causes dilated cardiomyopathy and sensorineural hearing loss. Nat Genet. 2005;37:418–422. [PubMed]
46. Washington NL, Haendel MA, Mungall CJ, Ashburner M, Westerfield M, Lewis SE. Linking human diseases to animal models using ontology-based phenotype annotation. PLoS Biol. 2009;7:e1000247. [PMC free article] [PubMed]
47. McGary KL, Park TJ, Woods JO, Cha HJ, Wallingford JB, Marcotte EM. Systematic discovery of nonobvious human disease models through orthologous phenotypes. Proc Natl Acad Sci USA. 2010;107:6544–6549. [PubMed]
48. Beis D, Stainier DY. In vivo cell biology: following the zebrafish trend. Trends Cell Biol. 2006;16:105–112. [PubMed]
49. Kokel D, Bryan J, Laggner C, White R, Cheung CY, Mateus R, Healey D, Kim S, Werdich AA, Haggarty SJ, et al. Rapid behavior-based identification of neuroactive small molecules in the zebrafish. Nat Chem Biol. 2010;6:231–237. [PMC free article] [PubMed]
50. Thisse B, Heyer V, Lux A, Alunni V, Degrave A, Seiliez I, Kirchner J, Parkhill JP, Thisse C. Spatial and temporal expression of the zebrafish genome by large-scale in situ hybridization screening. Methods Cell Biol. 2004;77:505–519. [PubMed]
51. White RM, Sessa A, Burke C, Bowman T, LeBlanc J, Ceol C, Bourque C, Dovey M, Goessling W, Burns CE, et al. Transparent adult zebrafish as a tool for in vivo transplantation analysis. Cell Stem Cell. 2008;2:183–189. [PMC free article] [PubMed]
52. Pase L, Lieschke GJ. Validating microRNA target transcripts using zebrafish assays. Methods Mol Biol. 2009;546:227–240. [PubMed]
53. Yu H, Braun P, Yildirim MA, Lemmens I, Venkatesan K, Sahalie J, Hirozane-Kishikawa T, Gebreab F, Li N, Simonis N, et al. High-quality binary protein interaction map of the yeast interactome network. Science. 2008;322:104–110. [PMC free article] [PubMed]
54. Walhout AJ, Reboul J, Shtanko O, Bertin N, Vaglio P, Ge H, Lee H, Doucette-Stamm L, Gunsalus KC, Schetter AJ, et al. Integrating interactome, phenome, and transcriptome mapping data for the C. elegans germline. Curr Biol. 2002;12:1952–1958. [PubMed]
55. Perrimon N, Friedman A, Mathey-Prevot B, Eggert US. Drug-target identification in Drosophila cells: combining high-throughout RNAi and small-molecule screens. Drug Discov Today. 2007;12:28–33. [PubMed]
56. Sepp KJ, Hong P, Lizarraga SB, Liu JS, Mejia LA, Walsh CA, Perrimon N. Identification of neural outgrowth genes using genome-wide RNAi. PLoS Genet. 2008;4:e1000111. [PMC free article] [PubMed]
57. Hack A, Ahouse J, Roberts P, Garakani A. New methods for automated phenotyping of complex cellular behaviors. Conf Proc IEEE Eng Med Biol Soc. 2004;7:5124–5126. [PubMed]
58. Chi NC, Shaw RM, Jungblut B, Huisken J, Ferrer T, Arnaout R, Scott I, Beis D, Xiao T, Baier H, et al. Genetic and physiologic dissection of the vertebrate cardiac conduction system. PLoS Biol. 2008;6:e109. [PMC free article] [PubMed]
59. Milan DJ, Kim AM, Winterfield JR, Jones IL, Pfeufer A, Sanna S, Arking DE, Amsterdam AH, Sabeh KM, Mably JD, et al. Drug-sensitized zebrafish screen identifies multiple genes, including GINS3, as regulators of myocardial repolarization. Circulation. 2009;120:553–559. [PMC free article] [PubMed]
60. Milan DJ, Jones IL, Ellinor PT, MacRae CA. In vivo recording of adult zebrafish electrocardiogram and assessment of drug-induced QT prolongation. Am J Physiol Heart Circ Physiol. 2006;291:H269–H273. [PubMed]
61. Niethammer P, Grabher C, Look AT, Mitchison TJ. A tissue-scale gradient of hydrogen peroxide mediates rapid wound detection in zebrafish. Nature. 2009;459:996–999. [PMC free article] [PubMed]
62. Berger SI, Ma’ayan A, Iyengar R. Systems pharmacology of arrhythmias. Sci Signal. 2010;3:ra30. [PMC free article] [PubMed]
63. Sternberg PW, Horvitz HR. The combined action of two intercellular signaling pathways specifies three cell fates during vulval induction in C. elegans. Cell. 1989;58:679–693. [PubMed]
64. Giurumescu CA, Sternberg PW, Asthagiri AR. Intercellular coupling amplifies fate segregation during Caenorhabditis elegans vulval development. Proc Natl Acad Sci USA. 2006;103:1331–1336. [PubMed]
65. Fisher J, Piterman N, Hajnal A, Henzinger TA. Predictive modeling of signaling crosstalk during C. elegans vulval development. PLoS Comput Biol. 2007;3:e92. [PubMed]
66. Kikuchi K, Holdway JE, Werdich AA, Anderson RM, Fang Y, Egnaczyk GF, Evans T, Macrae CA, Stainier DY, Poss KD. Primary contribution to zebrafish heart regeneration by gata4(+) cardiomyocytes. Nature. 2010;464:601–605. [PMC free article] [PubMed]
67. Flint J, Mott R. Finding the molecular basis of quantitative traits: successes and pitfalls. Nat Rev Genet. 2001;2:437–445. [PubMed]
68. Cardon LR, Bell JI. Association study designs for complex diseases. Nat Rev Genet. 2001;2:91–99. [PubMed]
69. McCarthy MI, Hirschhorn JN. Genome-wide association studies: potential next steps on a genetic journey. Hum Mol Genet. 2008;17:R156–R165. [PMC free article] [PubMed]
70. Wellcome Trust Case Control Consortium (WTCCC). Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447:661–678. [PMC free article] [PubMed]
71. Arking DE, Pfeufer A, Post W, Kao WH, Newton-Cheh C, Ikeda M, West K, Kashuk C, Akyol M, Perz S. A common genetic variant in the NOS1 regulator NOS1AP modulates cardiac repolarization. Nat Genet. 2006;38:644–651. [PubMed]
72. Gudbjartsson DF, Arnar DO, Helgadottir A, Gretarsdottir S, Holm H, Sigurdsson A, Jonasdottir A, Baker A, Thorleifsson G, Kristjansson K. Variants conferring risk of atrial fibrillation on chromosome 4q25. Nature. 2007;448:353–357. [PubMed]
73. Ellinor PT, Yoerger DM, Ruskin JN, MacRae CA. Familial aggregation in lone atrial fibrillation. Hum Genet. 2005;118:179–184. [PubMed]
74. Anttila V, Kallela M, Oswell G, Kaunisto MA, Nyholt DR, Hamalainen E, Havanka H, Ilmavirta M, Terwilliger J, Sobel E, et al. Trait components provide tools to dissect the genetic susceptibility of migraine. Am J Hum Genet. 2006;79:85–99. [PubMed]
75. Terwilliger JD, Hiekkalinna T. An utter refutation of the “Fundamental Theorem of the HapMap”. Eur J Hum Genet. 2006;14:426–437. [PubMed]
76. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, Vidalain PO, Han JD, Chesneau A, Hao T. A map of the interactome network of the metazoan C. elegans. Science. 2004;303:540–543. [PMC free article] [PubMed]
77. Schadt EE, Lamb J, Yang X, Zhu J, Edwards S, Guhathakurta D, Sieberts SK, Monks S, Reitman M, Zhang C. An integrative genomics approach to infer causal associations between gene expression and disease. Nat Genet. 2005;37:710–717. [PMC free article] [PubMed]
78. Warren KS, Fishman MC. “Physiological genomics”: mutant screens in zebrafish. Am J Physiol. 1998;275(1 Pt 2):H1–H7. [PubMed]
79. Meng X, Noyes MB, Zhu LJ, Lawson ND, Wolfe SA. Targeted gene inactivation in zebrafish using engineered zinc-finger nucleases. Nat Biotechnol. 2008;26:695–701. [PMC free article] [PubMed]
80. Milan DJ, Giokas AC, Serluca FC, Peterson RT, MacRae CA. Notch1b and neuregulin are required for specification of central cardiac conduction tissue. Development. 2006;133:1125–1132. [PubMed]
81. Peterson RT, Link BA, Dowling JE, Schreiber SL. Small molecule developmental screens reveal the logic and timing of vertebrate development. Proc Natl Acad Sci USA. 2000;97:12965–12969. [PubMed]
82. Freimer N, Sabatti C. The human phenome project. Nat Genet. 2003;34:15–21. [PubMed]
83. Singer E. “Phenome” project set to pin down subgroups of autism. Nat Med. 2005;11:583. [PubMed]
84. Roden DM, George AL., Jr The genetic basis of variability in drug responses. Nat Rev Drug Discov. 2002;1:37–44. [PubMed]
85. Lin M, Aquilante C, Johnson JA, Wu R. Sequencing drug response with HapMap. Pharmacogenom J. 2005;5:149–156. [PubMed]
86. Plaster NM, Tawil R, Tristani-Firouzi M, Canun S, Bendahhou S, Tsunoda A, Donaldson MR, Iannaccone ST, Brunt E, Barohn R. Mutations in Kir2.1 cause the developmental and episodic electrical phenotypes of Andersen’s syndrome. Cell. 2001;105:511–519. [PubMed]
87. Splawski I, Timothy KW, Sharpe LM, Decher N, Kumar P, Bloise R, Napolitano C, Schwartz PJ, Joseph RM, Condouris K, et al. Ca(V)1.2 calcium channel dysfunction causes a multisystem disorder including arrhythmia and autism. Cell. 2004;119:19–31. [PubMed]
88. Patton EE, Zon LI. The art and design of genetic screens: zebrafish. Nat Rev Genet. 2001;2:956–966. [PubMed]
89. St Johnston D. The art and design of genetic screens: Drosophila melanogaster. Nat Rev Genet. 2002;3:176–188. [PubMed]