Interactions between signaling pathways in mammalian cells indicate that a large-scale complex network of interactions is involved in determining and controlling cellular phenotype [1
]. To visualize and analyze these complex networks, the biochemical networks may be abstracted to directed graphs [4
]. To understand the topology of such networks, graph-theory methodologies can be applied to analyze networks' global and local structural properties [5
]. Additionally, the value of assembled network datasets is enhanced with network visualization software and web-based information systems. These systems provide summary information, order, and logic for interpretation of sparse experimental results [6
]. Visualization tools and web-based navigation systems provide an integrative resource that aids in understanding the system under investigation and may lead to the development of new hypotheses.
Graph-theory methods have been used in other scientific fields to analyze complex systems abstracted to networks. For example, Watts and Strogatz [8
] defined a measure called the "clustering coefficient" (CC), which characterizes the level of clustered interactions within a network by measuring the abundance of triangles (three interactions among three components). For instance, if a node has four neighbors and three of those neighbors are directly connected to one another (three links), the CC for that node is 0.5, because four neighbors can be connected by at most six links (3/6 = 0.5). The network's CC is the average of the CCs computed for all nodes. Caldarelli et al
] formulated an algorithm that considers rectangles (four interacting nodes) in the clustering calculation and called the resulting measure the grid coefficient. Watts and Strogatz also used the characteristic path length to measure the separation between nodes in networks. The characteristic path length is the average length of the shortest path between two nodes, computed over all possible pairs of nodes in the network. Most real networks combine a CC much higher than that of comparable random networks with a characteristic path length nearly as short. This observation is called the "small-world" phenomenon [8
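The two measures described above can be illustrated with a minimal Python sketch. It is not SNAVI's implementation: the function names, the toy node labels, and the convention of assigning a CC of 0 to nodes with fewer than two neighbors are assumptions made for illustration, and the graph is treated as undirected.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(adj, node):
    """Fraction of possible links among a node's neighbors that actually exist."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumed convention for nodes with fewer than two neighbors
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)


def characteristic_path_length(adj):
    """Average shortest-path length over all connected pairs (BFS from each node)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in adj[current]:
                if nxt not in dist:
                    dist[nxt] = dist[current] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Toy undirected graph (hypothetical labels): node "X" has four neighbors,
# and three of them (A, B, C) are linked to one another.
edges = [("X", "A"), ("X", "B"), ("X", "C"), ("X", "D"),
         ("A", "B"), ("B", "C"), ("A", "C")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

cc_x = clustering_coefficient(adj, "X")  # 3 existing links out of 6 possible
network_cc = sum(clustering_coefficient(adj, n) for n in adj) / len(adj)
cpl = characteristic_path_length(adj)
print(cc_x, network_cc, cpl)
```

Node "X" reproduces the worked example in the text (3/6 = 0.5). Averaging shortest paths only over connected pairs is one possible choice; disconnected networks may call for a different treatment.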
Barabasi and coworkers [10
] analyzed the connectivity distribution of metabolic and other biochemical networks and observed a distribution termed "scale-free". The scale-free property indicates that the connectivity distribution of nodes has a long, heavy tail that fits a power law. Such a distribution results in a few highly connected nodes that serve as hubs, whereas most other nodes have few links. Another topological property used to statistically analyze biochemical regulatory networks is the prevalence of network motifs. In biochemical regulatory networks, motifs are subcircuits of molecular interactions involving multiple cellular components. The different possible subcircuit configurations for a given number of components define the different types of network motifs. All possible interconnection patterns among a few components in a directed graph can be enumerated [11
] and their prevalence can then be assessed by comparing their counts in the real network with the counts in randomized topologies. This method was used to characterize motifs in gene regulatory networks from Caenorhabditis elegans
and Saccharomyces cerevisiae
]. This type of analysis identified signature patterns of network motifs that can characterize different types of networks, including signal-transduction networks [13
]. The graph-theory-based network analysis methods described above are statistical. Such statistical analysis of signaling networks requires a sufficiently large network (an estimated minimum of 200 nodes). SNAVI includes functions to compute the clustering coefficient, characteristic path length, and connectivity distribution of networks, and provides the means to identify and visualize network motifs.
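As a rough illustration of the connectivity distribution and motif counting described above, the following Python sketch tallies node degrees and counts one motif type, the feed-forward loop, in a directed graph, then compares the count with simple random graphs. It is a sketch under stated assumptions, not SNAVI's method: the function names and toy edge list are hypothetical, the random model (uniformly drawn links, same number of nodes and links) is only one of several possible null models, and the toy graph is far smaller than the roughly 200 nodes the text estimates are needed for meaningful statistics.

```python
import random
from collections import Counter
from itertools import permutations


def degree_distribution(edges):
    """Map each degree k to the number of nodes with k links (incoming + outgoing)."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    return Counter(degree.values())


def count_feed_forward_loops(edges):
    """Count ordered triples (a, b, c) with links a->b, b->c, and a->c."""
    edge_set = set(edges)
    nodes = {n for edge in edges for n in edge}
    return sum(
        1
        for a, b, c in permutations(nodes, 3)
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set
    )


def random_edges(nodes, n_edges, rng):
    """Draw n_edges distinct directed links (no self-loops) uniformly at random."""
    candidates = [(a, b) for a in nodes for b in nodes if a != b]
    return rng.sample(candidates, n_edges)


# Toy directed network with hypothetical node labels.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("A", "D")]
dist = degree_distribution(edges)
observed = count_feed_forward_loops(edges)

# Assumed null model: random graphs with the same numbers of nodes and links.
rng = random.Random(0)
nodes = sorted({n for edge in edges for n in edge})
random_counts = [
    count_feed_forward_loops(random_edges(nodes, len(edges), rng))
    for _ in range(100)
]
random_mean = sum(random_counts) / len(random_counts)
print(dict(dist), observed, random_mean)
```

A motif would be considered prevalent when the observed count clearly exceeds the counts in the randomized networks. The brute-force enumeration over node triples is adequate only for small examples.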
Statistical analysis of network topology is complemented by effective network visualization and web-based navigation tools. Maps or diagrams of signaling pathways help summarize many interactions at once. Maps may suggest new interpretations for experiments, because the act of preparing the maps imposes logical interpretation [15
]. Additionally, mapping a network is an important initial step for developing models for quantitative simulation [16
]. Molecular interaction network maps are constantly changing as new data become available, and manually redrawing signaling maps is not convenient or desirable. The requirements for mapping large-scale biological networks include showing an appropriate level of detail, minimizing overlap of nodes and links, and compatibility with multiple data storage formats [7
Existing software tools draw networks automatically from databases, Excel spreadsheets, XML (Extensible Markup Language), or text files where interactions are listed in a structured format. One recommended platform is Cytoscape [17
]. General network visualization software is also often used by computational biologists, for example the Pajek software project [18
], or AT&T's Graphviz project [19
]; the latter is an open-source project used as a library in many applications. For example, Science Signaling uses Graphviz to display its Connections Maps [20
]. When maps grow beyond roughly 40–50 nodes, it becomes impossible to follow the links in layouts generated by the Pajek (version 1.10) or Graphviz (version 1.0) programs. One solution is to implement zooming and panning using scalable vector graphics (SVG) code [21
] or Flash, or to divide large, complex pathways into sets of smaller interrelated pathways. Another solution is to let users specify the portion of the network they want to explore and then construct subnetworks that are easily navigated. SNAVI can be used to construct and visualize such subnetworks, allowing investigation of larger networks.
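One way to construct such a subnetwork can be sketched in Python as follows. This is an assumption about how a user-specified portion might be defined (seed nodes plus a maximum number of steps), not a description of SNAVI's own procedure; the function name, parameters, and node labels are hypothetical.

```python
from collections import deque


def extract_subnetwork(edges, seeds, max_steps):
    """Return the links whose endpoints both lie within max_steps of a seed node."""
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)  # direction ignored for reachability

    # Breadth-first search outward from the seeds, stopping at max_steps.
    dist = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if dist[node] == max_steps:
            continue
        for nxt in neighbors.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)

    # Keep the original link direction and order for the selected nodes.
    return [(s, d) for s, d in edges if s in dist and d in dist]


# Toy directed network with hypothetical node labels.
edges = [("R", "G"), ("G", "E"), ("E", "K"), ("K", "T"), ("G", "K")]
sub = extract_subnetwork(edges, seeds=["G"], max_steps=1)
print(sub)
```

Here the link from "K" to "T" is dropped because "T" is two steps from the seed. Restricting the view this way keeps the drawn map within a size that remains readable.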
SNAVI is a software tool for statistical analysis and visualization of large-scale cellular signaling networks and other intracellular biochemical networks. Here, we demonstrate how SNAVI can be used for web-based visualization and statistical analysis of biological regulatory networks. As an example, the SNAVI installation includes a network representing signaling pathways in hippocampal neuronal cells [1
]. To create this network, direct interactions were extracted from primary papers into a template stored in a flat file (Table ), and then verified through a multistep manual review process by biologists. The network currently contains 594 nodes and 1422 links extracted from 1296 articles. Users may use this dataset or may load their own data. The process of creating, analyzing, and visualizing signaling networks using SNAVI is described in the methods.
Table: SNAVI native file storage format (SIG file)