Rheumatoid Arthritis (RA) is a complex disease involving an as yet unknown number of genes and affecting a large number of organs, tissues and sites across the body. It affects approximately 1% of the population worldwide
[1], with this rate rising for the first time in 40 years, as reported at the American College of Rheumatology meeting in San Francisco (CA, USA) in 2008. RA is a systemic autoimmune disease causing recruitment and activation of inflammatory cells, synovial hyperplasia, and destruction of cartilage and bone. A complete loss of mobility and functioning can be the final evolution of the disease
[2]. Although RA involves the synovial joints, it also presents systemic features, since several other organs are affected, including skin, lungs, kidneys, blood vessels and heart
[3]–
[6]. Because of its complexity, having a broad, systemic perspective on the biological functions activated and the molecular pathways involved in the disease is of crucial importance.
To this end, several types of approaches and data platforms can be used. Genome-wide association studies (GWAS) scan the whole genome in search of loci likely to carry mutations related to RA (for a sample of very recent studies, see
[7]–
[9]). Gene microarray data have contributed greatly to the understanding of pathogenesis, to the identification of biomarkers for diagnosis, and to patient stratification and prognostication in RA
[10]. Other studies combine the information from these two approaches, comparing differentially expressed genes with genome-wide association results to better predict candidate susceptibility genes for RA
[11]. Furthermore, some signal transduction pathways have also been identified as being involved in disease progression and in the effects of RA therapies. The TGF-β pathway, for example, shows broad, constitutive alteration in Rheumatoid Arthritis Synovial Fibroblasts (RASFs)
[12] and the NF-κB pathway is inhibited during anti-TNF-α therapy with etanercept
[13]. The signal transduction pathways in RA and some of the important proteins of these pathways have been identified as drug targets to treat RA
[14]–
[16]. However, due to the complex interactions of these pathways, treatments that target only one protein may not be very effective. Besides the relevance of proteins as targets, a recent study has also shown that miR-155 was up-regulated during treatment with TNF-α in RASFs
[17]. This implies that some microRNAs may be involved in RA progression. Due to the complexity of RA, however, the interaction among all of these molecules and pathways is still obscure. This is highly relevant to the identification of new therapies, since some of the most common drugs used to treat RA, such as methotrexate (MTX), can cause liver, lung and kidney damage, as well as strong immunodepression. To avoid these important side effects and to develop more specific and useful drugs, the whole structure of the molecular networks involved in RA needs to be studied and clarified. The identification and analysis of this complex map cannot be performed without the help of computational biology. Hence we present here a comprehensive map for RA that combines the molecules and pathways that have so far been found to be associated with RA, based on systemic, high-throughput data, and made available following the format suggested by
CellDesigner [18], a popular and successful standard for the exchange of cellular maps
[19].
To date, the most abundant source of high-throughput, systemic, genome-wide data is still represented by microarrays for gene expression, although this may soon be replaced by more quantitative information from mRNA sequencing
[20],
[21]. For this reason, in order to build a comprehensive map of the processes ongoing in RA, we chose to construct a molecular map based on the results of high-throughput analyses. In fact, despite the growing availability of proteomic data and their promising applications, the throughput remains lower and limited to a number of validated targets
[22]. In order to give a systemic description of the relationships among the genes and pathways known to be involved in RA, we merged the information of all available papers related to high-throughput experiments (mRNA, miRNA) on RA (
[11]–
[13],
[17],
[23]–
[46]). Using this information and further data available from the Kyoto Encyclopedia of Genes and Genomes (KEGG) ‘
http://www.genome.jp/kegg/’, we build a comprehensive cell-level interaction map. Visually, we present this molecular-interaction map as a gene regulation map and a protein-protein interaction map, linked by a number of transcription factors. We then use network analysis methods to analyse the map both as a whole and in a tissue-specific manner. A schematic view of the study framework is presented in the accompanying figure.
In essence, our map uses the information retrieved from the results of functional genomic analyses on RA (in the form of differentially expressed genes) as a blueprint for the construction of a more detailed interaction map based on assessed literature (in the form of pathways). Clearly, such a map is likely to include a number of false positives, since the differential expression of a gene may not correspond to the effective presence of the corresponding protein. However, the additional layer of information retrieved from pathways documented in the literature shows that the reconstructed map is able to identify hubs that are known targets of current RA treatments, and to suggest interesting new targets. Moreover, one of the preeminent characteristics of
in silico modeling and analysis is to offer ‘cheap’ working hypotheses to be subsequently tested
in vitro and
in vivo. In this respect our analysis highlights the importance of proteins that have already been identified as relevant in the evolution of RA, giving more insight into their potential role, and in particular into their ability to affect neighboring pathways and functions. This type of analysis indicates potential new markers for the diagnosis and/or monitoring of RA, that is, markers that could be loaded, for example, onto point-of-care diagnostic tools based not only on genomic (DNA) screens
[47], but also on functional genomic (mRNA) screens
[48]. Generally, the mRNAs on which we base our analysis are identified as relevant under different biological conditions, such as healthy versus diseased subjects, RA versus other immune diseases, or subjects before versus after treatment. The experimental samples also come from different tissues, such as peripheral blood mononuclear cells (PBMCs), synovial fibroblasts and cartilage.
A range of network analysis methods have been successfully applied in multiple studies in an attempt to understand the structure of interaction networks, or the effect that single genes or molecules have on such networks (see
[49]–
[53] for example). In the ‘Analysis of Molecular-Interaction Map and Network Modules’ section of this paper we use network analysis methods to understand the systemic interactions of molecules involved in RA. We begin by determining the topological parameters of the network and analysing the structure of the map as a whole. Many biological networks display scale-free properties
[54]–
[58], which means that they contain a few nodes, called hubs, that have many connections to other nodes. Targeting hubs enables one to reach several other nodes in a shorter time frame than would be possible by targeting nodes at random. On the downside, the presence of hubs in scale-free networks means that the network can be more easily destroyed if the hubs are removed. We use the degree distribution, which describes how the number of links per node is distributed across the network, to identify hubs and to determine whether the network is scale-free (if it is, the degree distribution follows a power law). We then consider whether the hubs in the molecular-interaction map have known biological relevance. Further to understanding the structure of the network through the presence of hubs, we decompose the network into modules according to its structure. We expect the network to have a bow tie structure
[56], which means that it can be separated into four components: a central part containing strongly connected components, an IN component containing nodes from which the central component can be reached, an OUT component containing nodes that can be reached from the central component, and a fourth component containing all other nodes. Within the central component of the bow tie structure, we can identify topologically relevant cycles of nodes. A cycle in this sense is a group of nodes connected to each other such that the links between them form a closed path through all nodes in the group. Cycles in a cell can represent biologically significant features, such as feedback, which is an important way for the cell to regulate different biological mechanisms, such as protein-protein interactions, gene regulation or metabolic pathways
[59],
[60]. We use the relevant cycles and consider the paths attached to them, thus extracting from the interaction network separate modules, each with a closed cycle as its core component. We then look for biological relevance in these newly defined modules. We determine whether the modules produced show similarities to biological modules (in the sense that they may act as an independent sub-system or perform a specific biological function in the cell)
[50]. This module analysis helps one to decompose the complex network and furthermore identify the pathways involved in RA. By careful dissection of the pathways, novel therapeutic interventions designed to block signaling may be developed. Several potential targets, including MAP kinases and NF-κB, are already being explored. Analysis of the interaction network without any amount of decomposition will not give a full understanding of its structure, which is important for thorough biological interpretation.
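As an illustration of the topological steps described above, the following minimal Python sketch computes a degree histogram, ranks candidate hubs and splits a directed network into bow tie components. It is not the pipeline used in this study: the node names and the list `TOY_EDGES` are hypothetical, the function names are ours, and the sketch assumes that the central component is the single largest strongly connected component. The histogram it returns could then be inspected on a log-log scale for an approximately linear (power-law) trend.

```python
from collections import Counter, defaultdict

# Toy directed network with hypothetical node names (not taken from the RA map).
TOY_EDGES = [
    ("in1", "a"), ("a", "b"), ("b", "c"), ("c", "a"),
    ("c", "out1"), ("x", "out1"),
]


def node_degrees(edges):
    """Total degree (incoming plus outgoing links) of every node."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return degree


def degree_histogram(edges):
    """Map each degree value to the number of nodes that have it."""
    return Counter(node_degrees(edges).values())


def top_hubs(edges, top_k=2):
    """The top_k nodes by degree; ties are broken alphabetically."""
    degree = node_degrees(edges)
    return sorted(degree, key=lambda n: (-degree[n], n))[:top_k]


def _reachable(start_nodes, adjacency):
    """All nodes reachable from start_nodes (inclusive) by following adjacency."""
    seen = set(start_nodes)
    stack = list(start_nodes)
    while stack:
        node = stack.pop()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


def bow_tie(edges):
    """Split nodes into core, IN, OUT and other components.

    Assumption: the core is the largest strongly connected component.
    """
    forward, backward, nodes = defaultdict(set), defaultdict(set), set()
    for source, target in edges:
        forward[source].add(target)
        backward[target].add(source)
        nodes.update((source, target))

    core, unassigned = set(), set(nodes)
    while unassigned:
        seed = min(unassigned)
        # A node's strongly connected component is the set of nodes that it
        # can both reach and be reached from.
        component = _reachable([seed], forward) & _reachable([seed], backward)
        unassigned -= component
        if len(component) > len(core):
            core = component

    in_part = _reachable(core, backward) - core   # nodes that can reach the core
    out_part = _reachable(core, forward) - core   # nodes reachable from the core
    other = nodes - core - in_part - out_part
    return {"core": core, "in": in_part, "out": out_part, "other": other}
```

Calling `bow_tie(TOY_EDGES)` places the cycle formed by `a`, `b` and `c` in the core, which is also where the feedback cycles discussed above would be searched for. The repeated reachability search is quadratic in the worst case and is chosen here for readability rather than speed.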
Given that transcription factors have been shown to be potential drug targets, and that it has also been shown that it is possible to modulate some transcription factors through signaling cascades, our attention is drawn to the transcription factors in our network. In ‘The Role of Transcription Factors’ we investigate whether the transcription factors present in our network also have important topological properties, in the sense that they link topologically distinct parts (i.e. different modules) of the network. If this is the case, then it may be possible to influence the different topologically important parts of the cell by concentrating on specific transcription factors.
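One simple way to make the notion of a "linking" transcription factor concrete is to count how many distinct modules a factor touches, through its own membership and through its direct neighbours. The Python sketch below does this under stated assumptions: the module assignment `MODULE_OF`, the edge list and all node names are hypothetical, and the threshold of two modules is an illustrative choice rather than a criterion taken from this study.

```python
from collections import defaultdict

# Hypothetical module assignment and edges, for illustration only.
MODULE_OF = {"p": "M1", "q": "M1", "tf1": "M1", "r": "M2", "s": "M2", "tf2": "M2"}
EDGES = [("p", "tf1"), ("tf1", "r"), ("s", "tf2"), ("tf2", "r"), ("p", "q")]


def modules_touched(edges, module_of):
    """For each node, collect its own module and the modules of its neighbours."""
    touched = defaultdict(set)
    for source, target in edges:
        for node, neighbour in ((source, target), (target, source)):
            if node in module_of:
                touched[node].add(module_of[node])
            if neighbour in module_of:
                touched[node].add(module_of[neighbour])
    return touched


def linking_factors(edges, module_of, transcription_factors, min_modules=2):
    """Transcription factors whose links span at least min_modules modules."""
    touched = modules_touched(edges, module_of)
    return {
        tf: sorted(touched[tf])
        for tf in sorted(transcription_factors)
        if len(touched.get(tf, ())) >= min_modules
    }
```

With these toy inputs, `linking_factors(EDGES, MODULE_OF, {"tf1", "tf2"})` keeps only `tf1`, because `tf2` interacts solely with members of its own module.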
Further to analysis of the interaction map as a whole, in section ‘Analysis of Tissue-Specific Networks’ we also present the results of a tissue-specific analysis. Here we consider whether there are topological and biological differences in the way in which various tissue types act within the cell with respect to RA. By assigning a species tag to each node in the molecular-interaction map, we produce five tissue-specific sub-maps (Blood Peripheral Blood Mononuclear Cell (Blood_PBMC), Blood Peripheral Blood Mononuclear Cell plus Polymorphonuclear leukocytes (Blood_PBMC_PMN), cartilage, Synovial Fibroblast and synovial Polymorphonuclear leukocytes (synovial_PMN)). Of these five sub-maps, we are only able to achieve meaningful topological results for three (Blood_PBMC, Synovial Fibroblast and cartilage), due to the small amount of data used to build the remaining two sub-maps. For these three larger sub-maps, we pay particular attention to the identification of hubs by tissue type, and to areas where there is an overlap between tissue types. The results from this part of the study enable us to comment on whether there exist tissue-specific markers that could play a role in the diagnosis of RA.
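The construction of tissue-specific sub-maps from tagged nodes can be sketched as follows. This is a minimal Python illustration, not the procedure used in this study: only the tissue labels come from the text, while the nodes, edges and helper names are hypothetical, and the sketch assumes that an interaction belongs to a sub-map when both of its endpoints carry the tissue tag.

```python
from collections import Counter

# Hypothetical nodes and edges; only the tissue labels are taken from the text.
TISSUES_OF = {
    "n1": {"Blood_PBMC", "cartilage"},
    "n2": {"Blood_PBMC"},
    "n3": {"Blood_PBMC"},
    "n4": {"cartilage"},
    "n5": {"Blood_PBMC"},
}
EDGES = [("n1", "n2"), ("n1", "n3"), ("n1", "n4"), ("n2", "n3"), ("n1", "n5")]


def tissue_submap(edges, tissues_of, tissue):
    """Keep the edges whose two endpoints are both tagged with the tissue."""
    return [
        (source, target)
        for source, target in edges
        if tissue in tissues_of.get(source, ()) and tissue in tissues_of.get(target, ())
    ]


def top_hubs(edges, top_k=1):
    """The top_k (node, degree) pairs; ties are broken alphabetically."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:top_k]


def shared_nodes(tissues_of, tissue_a, tissue_b):
    """Nodes tagged with both tissues, i.e. the overlap between two sub-maps."""
    return sorted(
        node for node, tags in tissues_of.items()
        if tissue_a in tags and tissue_b in tags
    )
```

Comparing `top_hubs(tissue_submap(EDGES, TISSUES_OF, tissue))` across tissues, together with `shared_nodes`, gives a first view of which highly connected nodes are specific to one tissue and which are common to several.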
Throughout the analysis, we are constantly required to return to the literature in order to determine whether the topological results have any biological significance. We present our findings and identify areas for further research.