Over the last decade, the sequencing of a vast number of genomes has revealed that an increase in organismal complexity is not explained merely by a dramatic increase in the number of protein-coding genes. Indeed, highly complex organisms frequently contain roughly the same number of protein-coding genes as organisms with less intricate morphology and behaviors. For example, the nematode Caenorhabditis elegans has a relatively simple body plan but ~20,000 predicted protein-coding genes.(1) The fruit fly Drosophila melanogaster and humans have a much more complex anatomy and physiology than worms, yet their genomes encode only ~14,000 and ~25,000 predicted protein-coding genes, respectively.(2,3) It has been proposed that organismal complexity developed from a gradual increase in protein diversity, mainly due to alternative mRNA splicing, combined with a gradual increase in the extent and intricacy of gene regulation.
(4–6) For instance, the human genome is 3.2 Gb in length, whereas C. elegans has a genome of only 100 Mb. Since exon and open reading frame length does not increase with animal complexity, this means that the non-coding part of the human genome can be up to 30 times larger than that of C. elegans. In addition to an increase in regulatory genomic space, there is also an increase in the number of
trans-regulators. First, the proportion of genes that encode transcription factors (TFs) increases with organismal complexity; around 5% of the protein-coding genes code for TFs in flies and nematodes, compared to almost 10% for mouse and human.(7–9) Second, the number of microRNAs (miRNAs) encoded by a genome appears to correlate with organismal complexity as well.(10) For example, 154, 337, and 695 miRNAs have been annotated to date in the C. elegans, zebrafish (Danio rerio), and human genomes, respectively (miRBase).
(11)

Table 1. Overview of TF- and miRNA-mediated gene regulation
Both TFs and miRNAs can exert a widespread impact on gene expression. Most, if not all, genes in the genome are controlled by TFs, which either up- or down-regulate transcription. Overall, miRNAs are predicted to target approximately 10–30% of animal protein-coding genes, with each miRNA repressing on average 200 transcripts.
(12–15) Hierarchically, miRNAs function downstream of TFs since miRNAs can repress an mRNA only after it has been transcribed. However, recent observations suggest that transcriptional regulation by TFs and post-transcriptional regulation by miRNAs are often highly coordinated. To gain an understanding of the coordinated effects of TFs and miRNAs, it is critical to delineate and characterize the genome-scale regulatory networks in which these regulators operate. Such networks combine the plethora of regulatory circuits for a tissue, organism, or process of interest, usually into a single model. Analyses of these models provide insights into the mechanisms that control gene expression at a systems level, rather than at the level of individual genes. Here, we first briefly describe the main principles of TF- and miRNA-mediated gene regulation, and then discuss our current knowledge of how these factors interact with each other in the context of genome-scale regulatory networks, concentrating primarily on animal systems. We also briefly discuss the need to incorporate spatiotemporal gene expression information and, in the longer term, other functional networks, such as those involving signaling or RNA-binding proteins (RBPs), into dynamic “meta network models” that describe gene expression in animal development, homeostasis, behavior, and pathology at high resolution and precision.
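
To make the idea of a combined regulatory network concrete, the following is a minimal sketch, not a method taken from the literature discussed here. It assumes a network can be stored as two layers of directed edges (transcriptional and post-transcriptional) and searches for one simple form of coordination: a TF that regulates both a miRNA and a gene targeted by that miRNA. All node names (TF_A, miR_1, gene_X, and so on) and the helper functions are hypothetical placeholders, not real interactions.

```python
from collections import defaultdict

# Hypothetical toy edges: (regulator, target, layer).
# The names are placeholders and do not describe real interactions.
EDGES = [
    ("TF_A", "gene_X", "transcriptional"),
    ("TF_A", "miR_1", "transcriptional"),
    ("miR_1", "gene_X", "post-transcriptional"),
    ("TF_B", "gene_Y", "transcriptional"),
    ("miR_2", "gene_Y", "post-transcriptional"),
]


def build_network(edges):
    """Map each regulator to its set of targets, one mapping per layer."""
    network = {
        "transcriptional": defaultdict(set),
        "post-transcriptional": defaultdict(set),
    }
    for regulator, target, layer in edges:
        network[layer][regulator].add(target)
    return network


def coordinated_circuits(network):
    """Return sorted (TF, miRNA, gene) triples in which the TF regulates
    both the miRNA and a gene that the miRNA targets."""
    circuits = []
    for tf, tf_targets in network["transcriptional"].items():
        for candidate in tf_targets:
            # Only candidates that act as miRNAs have post-transcriptional targets.
            mirna_targets = network["post-transcriptional"].get(candidate, ())
            for gene in mirna_targets:
                if gene in tf_targets:
                    circuits.append((tf, candidate, gene))
    return sorted(circuits)


network = build_network(EDGES)
circuits = coordinated_circuits(network)
print(circuits)
```

Keeping the two layers separate preserves the hierarchy described above, in which miRNAs act on transcripts only after TFs have driven their transcription, while still allowing queries that span both layers.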