Using two different combination functions to investigate the network supports an investigative methodology in which hypotheses are generated through systematic network exploration. The top 1,000 edges as scored by either function generate a network comprising 945 genes and 1,743 edges in total. This collection of high-scoring edges is organized as 92 pairs, 15 triplets, seven small clusters (<10 nodes), three medium-sized clusters (27 to 51 nodes each), and one large ‘yarnball’ (551 nodes). One of the medium-sized sub-networks (45 nodes and 107 edges in total; circled in the accompanying figure) is analyzed in detail here, illustrating a typical use of the Hanalyzer.
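The census of pairs, triplets, and clusters described above can be reproduced in outline by taking the union of the top-scoring edges under each function and counting connected components by size. The following is a minimal sketch on invented toy scores, not the Hanalyzer implementation; the function names, the score dictionaries, and the "larger cluster" label are assumptions made for illustration.

```python
from collections import defaultdict

def top_edges(scores, k):
    """Return the k highest-scoring edges from a {(u, v): score} mapping."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

def connected_components(edges):
    """Group nodes into connected components with a simple union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return list(groups.values())

def size_census(components):
    """Count components using the size classes named in the text."""
    census = defaultdict(int)
    for comp in components:
        n = len(comp)
        if n == 2:
            census["pair"] += 1
        elif n == 3:
            census["triplet"] += 1
        elif n < 10:
            census["small cluster"] += 1
        else:
            census["larger cluster"] += 1
    return dict(census)

# Hypothetical toy scores; edges are stored as sorted (u, v) tuples.
average_scores = {("a", "b"): 0.9, ("b", "c"): 0.8, ("d", "e"): 0.7, ("f", "g"): 0.1}
logit_scores = {("a", "b"): 0.95, ("b", "c"): 0.1, ("d", "e"): 0.2, ("f", "g"): 0.85}

# An edge is kept if it is in the top k under either function.
union = top_edges(average_scores, 2) | top_edges(logit_scores, 2)
census = size_census(connected_components(union))
print(census)
```

With real data, `k` would be 1,000 and the two score dictionaries would hold the Average and Logit scores for every candidate edge.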
Sub-network explanation guided by the Average combination network
That sub-network contains 50 edges from the Average combination graph, involving 20 nodes: 15 edges asserted solely by the Average metric and 35 asserted by both the Average and Logit measures. Browsing the annotations associated with these 20 genes and their protein products quickly showed that the theme common to this sub-network is muscle (Table S1). Nineteen of the 20 nodes have at least one reference to ‘muscle’ within their annotations or description. The most informative descriptive terms are the GO Biological Process terms “muscle contraction” GO:0006936 (and children, including “regulation of muscle contraction” GO:0006937) and “muscle development” GO:0007517, which together annotate 15 of the 20 nodes. It is also of interest that the majority of the nodes in this network (13 of 20) belong to one of three well-characterized muscle protein families (Actin, Myosin and Troponin), suggesting that this network is involved in force generation and the structural integrity of muscle.
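The kind of annotation tally reported above (how many nodes carry any of a set of informative terms) amounts to a set-intersection count. The sketch below uses the three GO identifiers named in the text, but the gene names and their annotations are invented placeholders, and the function name is an assumption for illustration only.

```python
# GO terms named in the text as most informative for the muscle theme.
MUSCLE_TERMS = {"GO:0006936", "GO:0006937", "GO:0007517"}

# Hypothetical annotations: placeholder genes, not data from the study.
annotations = {
    "gene_a": {"GO:0006936"},
    "gene_b": {"GO:0006937", "GO:0007517"},
    "gene_c": {"GO:0045211"},
    "gene_d": set(),
}

def annotated_with_any(annotations, terms):
    """Return the sorted genes whose annotation set shares a term with `terms`."""
    return sorted(gene for gene, gene_terms in annotations.items() if gene_terms & terms)

hits = annotated_with_any(annotations, MUSCLE_TERMS)
print(f"{len(hits)} of {len(annotations)} nodes carry a muscle term: {hits}")
```

A fuller version would expand each term to all of its descendants in the GO hierarchy before intersecting, which this sketch approximates by listing one child term explicitly.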
Sub-network comprising edges asserted by the Average combinatorial metric.
The single apparent exception to this muscle theme was Thbs4
(Thrombospondin 4, MGI:1101779). Direct searching of PubMed identified a role for Thbs4 (also known as TSP-4) in muscle formation. Thbs4 is secreted by developing tendon mesenchyme cells, and is part of a local signaling process involving the protein ankyrin repeat domain 1 (Ankrd1
; MGI:1097717) which couples tendon morphogenesis to muscle formation 
(note that Ankrd1
was called “muscle ankyrin repeat protein” or marp
in that paper). Thbs4
is expressed at high levels, in a pattern complementary to that of Ankrd1
, during myogenesis through late embryogenesis, and is still observed postnatally.
This network is intriguing both because of its strong muscle theme and because the expression profile of its nodes is strikingly specific to the mandible. The expression of this group of 20 genes is consistently and exclusively up-regulated in the mandibular sample as development progresses from E10.5 to E12.5. The literature indicates that this expression profile is consistent with tongue muscle development, the tongue being the largest single muscle mass in the head and located within the mandible. At approximately E11, the migration of myogenic cells from the occipital somites into the tongue primordia is considered complete, with myoblasts continuing to proliferate and differentiate until around E15, when they fuse and withdraw from the cell cycle 
. Desmin (Des
, MGI:94885) mRNA is detected as early as E10, consistent with its marking early steps in skeletal myogenesis, such as myoblast determination 
. Also, Thbs4
has been shown to promote myogenic differentiation specifically in the tongue, which, owing to its lack of cartilage, links muscle groups through a tendinous scaffold.
Heatmap of genes in the Average sub-network.
This same group of genes is also up-regulated at the later E12–12.5 time point in the maxilla sample, consistent with muscle cell differentiation beginning later elsewhere than in the tongue. Skeletal muscle development is staggered, with the tongue maturing approximately 1.5 days earlier (in mice) than all other skeletal muscles. The more advanced stage of tongue muscle development at birth is thought to correlate with its requirement for mammalian suckling immediately after birth 
. The lack of significant muscle in the frontonasal prominence accounts for the low level of expression of these genes in that tissue. The systematically reported and easily explored collection of relevant background knowledge made the interpretation of this complex set of evidence regarding the broad developmental function of a complex group of interacting genes much more straightforward than it would have been using any other approach with which we are familiar.
Hypothesis generation guided by the Logit combination network
Once the well-understood aspects of the sub-network had been explored and a biological explanation for the observations had been developed, the analyst added the edges asserted only by the Logit metric to the visualization of the sub-network. Including the Logit-asserted edges introduced an additional 25 nodes (45 nodes in total) and expanded the network to 107 edges. These 107 edges consist of 48 Logit-only edges, 18 Average-only edges (three more than before, linked into the network through nodes introduced by the Logit edges) and 41 edges asserted by both the Logit and Average metrics. The nodes of this larger network display the same striking mandible-specific expression pattern as the Average combination network, suggesting that these additional nodes may also be implicated in tongue development.
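The bookkeeping in this step, which splits edges into Average-only, Logit-only, and shared sets and identifies the nodes that the Logit edges introduce, can be expressed with plain set operations. The sketch below uses invented toy edges and assumed function names; only the three reported counts (48, 18, 41) come from the text.

```python
def classify_edges(average_edges, logit_edges):
    """Split edges by which combination metric asserted them."""
    both = average_edges & logit_edges
    return {
        "average_only": average_edges - both,
        "logit_only": logit_edges - both,
        "both": both,
    }

def nodes_of(edges):
    """Collect every node that appears in a set of (u, v) edges."""
    return {node for edge in edges for node in edge}

# Hypothetical toy edges, stored as sorted (u, v) tuples.
average_edges = {("g1", "g2"), ("g2", "g3")}
logit_edges = {("g2", "g3"), ("g3", "g4"), ("g4", "g5")}

classes = classify_edges(average_edges, logit_edges)
new_nodes = nodes_of(logit_edges) - nodes_of(average_edges)
print({name: len(edges) for name, edges in classes.items()}, sorted(new_nodes))

# Consistency check on the counts reported in the text.
reported = {"logit_only": 48, "average_only": 18, "both": 41}
total_reported = sum(reported.values())
print(total_reported)  # 107
```

Because the three classes are disjoint by construction, their sizes must sum to the size of the combined edge set, which is the check applied to the reported counts.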
Sub-network comprising edges asserted by both Average and Logit combined metrics.
Heatmap of all genes in the sub-network.
Although nine of these additional nodes expand the core cluster described above, the majority form two new clusters tethered to the initial group by one to four edges. Browsing the collated annotations associated with these additional nodes allowed rapid insight into common functional themes. These annotations indicated that the two additional clusters represent myogenic differentiation (eight nodes) and synapse interactions (six nodes) (Table S2). Within the synapse cluster the most informative annotations are the KEGG annotation “Neuroactive ligand-receptor interaction” KEGG:mmu04080 and the GO Cellular Component term “postsynaptic membrane” GO:0045211, which together annotate all six members of this cluster. All eight nodes within the transcription cluster are, unsurprisingly, annotated with the GO Biological Process “transcription” GO:0006350, and five of these nodes also have a documented muscle-related knockout phenotype. The specific genes and interactions in each of these three clusters are explored in turn, and several are selected for experimental validation.
Functional clusters of nodes within the mandibular specific sub-network.
We called the first cluster investigated the core cluster. Of the nine additional nodes contributing to this structural cluster, four (Cdh15, Nrk, Fndc5, and E430002G05Rik; MGI:106672, MGI:1351326, MGI:1917614 and MGI:2445082, respectively) lack expert annotations suggesting a role in muscle or, more generally, in craniofacial development. Supplementary investigation of the literature and of publicly available expression data was required to establish the muscle association of these four genes.
In contrast to the other ‘unannotated’ nodes, Cdh15
(also known as M-Cadherin, M denoting muscle 
) is a very well studied gene with a number of associated publications (23 references tied to its MGI record alone [accessed 4/23/2008]). It has long been known that Cdh15
is expressed in myogenic cells and has a role in skeletal muscle differentiation, as indicated by low level expression in skeletal myoblasts followed by an increased expression in myotube forming cells 
. Its precise role during muscle development and regeneration is yet to be determined, however, and a recent Cdh15
null mouse model with an apparently normal muscle phenotype suggests functional compensation by other cadherin proteins.
The lack of information linking Cdh15
with muscle development highlights the persisting problem of organism-specific gene name normalization. While Cdh15
is the only official gene symbol, there are two approved names for the resultant protein product: Cadherin 15 and M-Cadherin (myotubule) [data from HUGO, www.genenames.org;
accessed 5/1/2008]. To confuse things further, both names are used only in the human records for this gene (both GenBank [NM_004933] and Entrez Gene [ID: 1013] use “Homo sapiens cadherin 15, M-cadherin (myotubule) (CDH15), mRNA” as their definition).
The literature indicates that the Ste20-type kinase, NIK-related kinase (Nrk
) is predominantly expressed in developing skeletal musculature from E10.5 through E17 during mouse embryogenesis; however, Nrk
expression is not detected in any adult tissues, including skeletal muscles 
. Limited RNA expression data obtained from GenePaint.org 
also appear to show Nrk
expression in E14.5 tongue (GenePaint set ID: MH1818, section Embryo_C1818_1_4B).
In the developing embryo, the recently characterized fibronectin type III domain containing 5 gene (Fndc5
, also known as PeP
; data from iHop 
) is almost exclusively expressed in developing skeletal muscle 
. Absent at E7, Fndc5
expression is first detected in whole embryos at E11, and at E13.5 is specifically observed in the tongue and other skeletal muscles 
. A role during myoblast differentiation is indicated by a two-fold increase in expression during the transition from myoblasts into myotubes, after which expression stabilizes and continues into and throughout adulthood 
Finally, investigation of the Riken clone E430002G05Rik
presented little informative annotation. A single GeneRIF in the associated Entrez Gene entry (GeneID: 210622) pointed to one publication, which supplied all of the information ascertained about this gene. This single publication 
identified mRNAs affected in a mouse model (mdx
) for Duchenne muscular dystrophy (DMD). E430002G05Rik
was identified as a down-regulated transcript in the mdx
mouse and subsequently named RAMP
(Regeneration-associated muscle protease homolog) 
. It was observed that RAMP
is predominantly expressed in normal adult skeletal muscle and brain, and that it is specifically up-regulated in regenerating skeletal muscle fibers after injury 
. The absence of any annotation regarding development prompted the selection of this gene for further experimental validation.
We called the second cluster explored the Transcription Factor Cluster
. Although Pitx3, Rxrg, and Zim1
(MGI:1100498, MGI:98216, and MGI:1341879, respectively) are well annotated as transcription factors, the information provided by the reading experts did not suggest roles in muscle development (Table S2
), prompting further investigations. Pitx3
is well characterized and annotated with respect to its role in lens formation during eye development 
. However, literature searching revealed that tongue-specific expression of Pitx3
(also known as Ptx3
) during development (expression first detected at E11.5) was documented over a decade ago 
, while its specific role in myogenesis and myoblast differentiation has only more recently been reported 
Known and annotated principally for its role in mediating the effects of retinoic acid, there also exists extensive literature associating Rxrg
(retinoid X receptor gamma) with myoblast differentiation. This association was not asserted by any of the reading experts, although a PubMed search with the query “rxr muscle” returned 117 papers (accessed 4/25/2008), again suggesting difficulties in species-specific gene name normalization. As early as 1993, RXRs were identified as positive regulators of skeletal muscle development via their direct interactions with Myogenin and MyoD promoter elements 
, and the role of Rxrg
in muscle continues to be explored, with the most recent associated publication identifying a role in lipogenesis and SREBP1c regulation in skeletal muscle 
. A high-throughput study identifying transcription units involved in brain development 
indirectly documented the tongue-specific expression profile of Rxrg
in E13.5 mice (image MGI:3507450), with the same expression pattern persisting weakly in E14.5 mice (GenePaint.org set ID: C1279, section Embryo_C1279_6_3D).
Significantly less is known about the zinc-finger gene, Zim1
. In mouse, this gene is part of an imprinted cluster that includes Zim2
(MGI:1923887) and Peg3
, but a Zim1
ortholog has not been identified to date in human. Therefore, it has been proposed that Zim1
is a recent addition to the mouse genome that was derived via a local duplication of Zim2
. In mice, Zim1
is maternally imprinted and is expressed only during embryogenesis, notably in the limb bud; it has therefore been suggested to have a role in limb development 
. Limited and unannotated RNA expression information was available from additional studies in the mouse 
; however, these did not address Zim1
expression in the developing face. We therefore selected Zim1
for experimental validation, as there was only limited knowledge of this gene and its function in mouse facial and muscle development.
Although well studied in craniofacial development, we also selected Hoxa2
(MGI:96174) for further analysis as its expression is not normally associated with branchial arch 1, which gives rise to the mandible. Indeed, Hoxa2
has a strong anterior limit of expression in the neural crest cells originating in rhombomere 4 that generate the mesenchyme of the second branchial arch. Moreover, the absence of Hox gene expression in more rostral tissues, including the first branchial arch, has been postulated to have enabled the evolution of the vertebrate head 
. We therefore decided to explore this potential novel domain of Hoxa2
expression in more detail.
The third cluster explored was called the synapse cluster
. All the nodes contributing to the synapse cluster are unambiguously implicated in neuromuscular signaling. However, two additional nodes (Ablim3 and Apobec2
; MGI:2442582 and MGI:1343178, respectively) fail to fit neatly into any cluster, and instead appear to straddle the synapse interaction and muscle structure clusters. Ablim3
annotation includes both the GO Molecular Function term “actin binding” GO:0003779 as well as the KEGG annotation “Axon guidance” KEGG:mmu04360. However, the annotation associated with Apobec2
strongly indicates a role in RNA editing and processing, but gives no indication of a role in muscle (Table S2
), and the Apobec2-associated literature revealed little consensus regarding its function. Apobec2
has been documented as an ancestral, cardiac and skeletal muscle-specific member of the Apobec
family implicated in muscle regeneration 
. It has also been described as a ubiquitously expressed protein with cytidine deaminase RNA editing activity 
. Apobec2 knockout mice appear viable and fertile 
but no examination of the tongue was reported. Apobec2
was selected for further biological investigation because of the sparse nature of current knowledge about it and its possible function in tongue muscle development.