We aimed to (i) identify proteins that may be mediators of crosstalk between angiogenesis-associated protein families and (ii) characterize their association with angiogenesis. We accomplished the first aim by application of graph diffusion on the human molecular interaction network followed by verification of statistical significance. We accomplished the second aim using a previously reported time series gene expression experimental dataset taken during angiogenesis.15
The search for angiogenesis-associated proteins and crosstalk
We used the human physical interactome as the basis for the analysis. We used a graph-theoretic technique called graph diffusion to quantify the distance between proteins in the interactome19 (see Methods). The graph diffusion method, also known as the diffusion kernel, allowed us to quantify the distance between a single protein and a protein family. We refer to the distance between a protein and a protein family as the diffusion kernel score (DKS). A protein with a high DKS interacts closely with the protein family. For example, consider the family of type IV collagen fibrils: a protein that physically interacts with all type IV collagens would receive a high DKS, while a protein that only indirectly interacts with type IV collagens would receive a relatively lower DKS. We use the DKS to estimate the association between a single protein and a family of proteins.
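As an illustration of the idea, the following minimal sketch computes a diffusion kernel on a tiny, made-up graph and sums the kernel entries between one protein and a family. It rests on several assumptions that are not taken from the paper: the kernel is written as exp(−βL) with L the graph Laplacian, the value of β, the definition of the DKS as a sum over family members, and all node names (`COL_A`, `P_direct`, and so on) are hypothetical. The authors' exact formulation is given in their Methods.

```python
# Toy diffusion-kernel score (DKS); all names and parameters are hypothetical.

def mat_mul(a, b):
    """Multiply two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diffusion_kernel(edges, nodes, beta=0.5, terms=30):
    """Approximate K = exp(-beta * L), L = D - A, with a truncated Taylor series."""
    idx = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    lap = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        i, j = idx[u], idx[v]
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
        lap[i][i] += 1.0
        lap[j][j] += 1.0
    m = [[-beta * x for x in row] for row in lap]
    kernel = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in kernel]          # current Taylor term, starts at I
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        kernel = [[kernel[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return kernel, idx

def dks(kernel, idx, protein, family):
    """Assumed DKS: sum of kernel entries between a protein and the family members."""
    return sum(kernel[idx[protein]][idx[f]] for f in family)

family = ["COL_A", "COL_B", "COL_C"]           # stand-in for a protein family
nodes = family + ["P_direct", "X", "P_indirect"]
edges = [("P_direct", "COL_A"), ("P_direct", "COL_B"), ("P_direct", "COL_C"),
         ("COL_A", "X"), ("X", "P_indirect")]

kernel, idx = diffusion_kernel(edges, nodes)
direct = dks(kernel, idx, "P_direct", family)
indirect = dks(kernel, idx, "P_indirect", family)
print(f"direct binder DKS:   {direct:.3f}")
print(f"indirect binder DKS: {indirect:.3f}")
```

In this toy graph the protein that binds every family member receives the higher score, while the protein that reaches the family only through an intermediate still receives a small positive score, because the kernel accumulates contributions from all paths rather than only direct edges.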
To locate proteins that potentially mediate crosstalk between families, we define crosstalk proteins as those that are highly associated with multiple protein families (i.e. proteins whose DKS exceeds a threshold for more than one family). For example, a crosstalk protein for type IV collagens and CXC chemokines would have many direct and indirect interactions with both protein families. We evaluate the statistical significance of a crosstalk protein against hundreds of rewired networks, each created by repeatedly swapping interactions. This statistical test controls for the size of the protein families and the degree of protein interaction.
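The rewiring step can be sketched as a degree-preserving edge swap followed by an empirical p-value. This is a minimal sketch under stated assumptions: the swap rule (exchange the endpoints of two randomly chosen interactions, rejecting self-loops and duplicates), the add-one correction in the p-value, and the toy edge list are all illustrative choices, not details confirmed by the paper.

```python
import random

def rewire(edges, n_swaps, seed=0):
    """Degree-preserving randomization: a-b, c-d becomes a-d, c-b."""
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if len({a, b, c, d}) < 4:
            continue  # shared endpoint: swap could create a self-loop
        if frozenset((a, d)) in edge_set or frozenset((c, b)) in edge_set:
            continue  # swap would duplicate an existing interaction
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {frozenset((a, d)), frozenset((c, b))}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list

def degrees(edges):
    """Number of interactions per node."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def empirical_p(observed, null_scores):
    """Share of rewired networks scoring at least as high as observed (add-one corrected)."""
    hits = sum(s >= observed for s in null_scores)
    return (hits + 1) / (len(null_scores) + 1)

# Hypothetical toy network
toy_edges = [("A", "B"), ("C", "D"), ("E", "F"), ("A", "C"),
             ("B", "E"), ("D", "F"), ("A", "F")]
rewired = rewire(toy_edges, n_swaps=20, seed=1)
print(degrees(rewired) == degrees(toy_edges))
```

Because each swap leaves every node's degree unchanged, a score recomputed on the rewired networks gives a null distribution in which highly connected proteins remain highly connected, which is one way to control for the degree of protein interaction.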
Using this approach, we found 126 proteins that were topologically close to the angiogenesis-associated protein families. We evaluated the quality of the protein annotations by their statistical significance and functional enrichment in angiogenesis. To put this network in context, it accounts for less than 1% of the proteins (0.93%) and interactions (0.25%) in the known human interactome. The analysis pointed to many proteins whose role in angiogenesis is well known, which serves as a validation of the approach. There are 194 human proteins that have angiogenesis as part of a GO annotation (as of 6/2010). The likelihood that a protein is annotated with angiogenesis by chance is 0.014. Excluding the 31 seed proteins, our analysis of the human protein-protein interaction network identifies 4 proteins that have angiogenesis as part of their GO annotation. The probability that the 95 remaining proteins (i.e. 126 associated − 31 seeds) contained 4 angiogenesis-annotated proteins by chance is 0.045; we calculated this p-value using Fisher’s exact test. Our analysis also suggests new or understudied modulators of angiogenesis. These centrally located proteins may be attractive targets because of their potential to manipulate multiple protein families.
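The enrichment calculation can be sketched with the hypergeometric tail, which is the one-sided form of Fisher's exact test. The counts 194, 95, and 4 come from the text. The background size `N_ASSUMED` is an assumption, back-calculated from 194 / 0.014 and rounded, because this section does not state the exact universe or how the seed proteins were handled. The sketch is therefore not expected to reproduce the reported 0.045 exactly.

```python
from math import comb

def hypergeom_tail(k, N, K, n):
    """P(X >= k) when n items are drawn without replacement from N, K of them annotated."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, x) * comb(N - K, n - x) for x in range(k, upper + 1)) / total

N_ASSUMED = 13_900   # hypothetical background size, roughly 194 / 0.014
K_ANNOTATED = 194    # proteins with angiogenesis in a GO annotation (from the text)
N_DRAWN = 95         # 126 associated proteins minus 31 seeds (from the text)
K_OBSERVED = 4       # annotated proteins among the 95 (from the text)

p_value = hypergeom_tail(K_OBSERVED, N_ASSUMED, K_ANNOTATED, N_DRAWN)
print(f"one-sided p-value under the assumed background: {p_value:.3f}")
```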
We show a Venn diagram to illustrate the associations of the 126 proteins. These proteins are topologically close to the type IV collagens, CXC chemokines, or TSP1-containing proteins, or to some combination of these families, as indicated by the figure. The figure gives the putative crosstalk between three angiogenesis-associated protein families: type IV collagens (blue), CXC chemokines (red), and TSP1-containing proteins (green). The crosstalk proteins are shown for CXC chemokines and type IV collagens (purple), CXC chemokines and TSP1-containing proteins (tan), type IV collagens and TSP1-containing proteins (yellow), and all three families (orange). The number of angiogenesis-associated proteins is shown in parentheses. In the results, we focus on the proteins associated with multiple families. First, we discuss crosstalk proteins between type IV collagens and TSP1-containing proteins. Then, we highlight six proteins identified as crosstalk proteins between all three families. These six proteins (three syndecans, MMP9, CD44, and versican) may be important mediators of crosstalk for these angiogenesis-associated protein families.
Venn diagram of the putative crosstalk
Method for comparison of topology-based annotation
Crosstalk between pathways is an important concept in biology. There have been computational20 efforts to identify crosstalk between pathways. Some of these approaches are not suitable in this context because they rely on overlapping pathways to identify crosstalk. Alternative approaches might consider “first neighbors” or “second neighbors” to identify association between pathways or modules. These rigid approaches have the inherent disadvantage of being unable to identify crosstalk between modules at distance 2 for “first neighbors” or distance 3 for “second neighbors.” Other studies used shortest paths to help define crosstalk proteins.20,22 These methods borrow from concepts such as betweenness centrality. Because graph diffusion considers all paths, our method has inherent advantages over those that only consider shortest paths between proteins.
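The rigidity of neighbor-based annotation can be seen in a small sketch. The breadth-first search below returns every node within k steps of a family; the chain-shaped toy graph and its node names (`FAM1`, `A`, `B`, `C`) are hypothetical and chosen only to show the cutoff effect.

```python
from collections import deque

def within_k_hops(adjacency, sources, k):
    """Nodes reachable from any source in at most k steps (sources themselves excluded)."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # do not expand beyond the hop limit
        for nb in adjacency.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return {n for n, d in dist.items() if d > 0}

# Hypothetical chain: FAM1 - A - B - C
adjacency = {"FAM1": ["A"], "A": ["FAM1", "B"], "B": ["A", "C"], "C": ["B"]}

first = within_k_hops(adjacency, ["FAM1"], 1)
second = within_k_hops(adjacency, ["FAM1"], 2)
print(sorted(first), sorted(second))
```

Node `C`, three steps from the family, is invisible to both the first-neighbor and second-neighbor rules regardless of how many paths lead to it, whereas a diffusion-based score would still assign it a small, nonzero association.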
To motivate the use of the graph diffusion method, we performed a systematic head-to-head comparison of graph diffusion with three alternative methods, based on first neighbors, second neighbors, and betweenness centrality. The results are given in the head-to-head comparison of topological annotation methods. We found that graph diffusion identified more statistically significant proteins at both the 0.01 and 0.05 levels. The graph diffusion method also identified a more functionally cohesive set of proteins, as demonstrated by the number of GO term enrichments at the 0.001 and 0.0001 levels.
Head-to-head comparison of topological annotation methods
Gene expression validates crosstalk proteins as angiogenesis-associated
To further validate the role of the crosstalk proteins in angiogenesis, we reanalysed a time series gene expression dataset taken during VEGF-induced angiogenesis. We expected that many crosstalk proteins would have perturbed gene expression during angiogenesis. If this proved to be the case, the microarray dataset would provide additional evidence of the role of crosstalk proteins in angiogenesis.
A research team led by Claesson-Welsh took measurements from a gene expression time series of VEGF-induced capillary endothelial tube formation in a 3D collagen matrix in vitro.15 The dataset included 8 time points: 15 min, and 1, 3, 6, 9, 12, 18, and 24 h of VEGF stimulation. We reanalysed these data to identify the transcription profiles that significantly increase or decrease during tube formation (which we refer to as angiogenesis). To accomplish this, we ranked transcripts by the absolute value of the covariance between the transcript measurements and the time points. We tested the null hypothesis that the crosstalk proteins are uniformly distributed among the ranked list of genes. We computed the family-wise error rate (FWER) p-value using gene set enrichment analysis,23 which is based on the Kolmogorov-Smirnov test followed by permutation testing. We found the crosstalk proteins significantly enriched at the head of the ranked list of perturbed genes (p=3×10^−4). The accompanying figure gives the trajectory of gene expression during VEGF-induced angiogenesis. We measure the trajectory of gene expression change by the covariance between the gene expression and the time points. The statistical test indicates that many of the crosstalk proteins have either increasing or decreasing gene expression during angiogenesis. This analysis helped confirm the importance of these crosstalk proteins in VEGF-induced angiogenesis and serves as a validation of our bioinformatics analysis.
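The ranking step described above can be sketched as follows. The eight time points are taken from the text; the expression profiles, the gene names, and the use of the sample covariance (divisor n − 1) are assumptions made for illustration. The enrichment test and permutation step are not reproduced here.

```python
# Time points from the text: 15 min, then 1, 3, 6, 9, 12, 18, and 24 h
TIME_POINTS_H = [0.25, 1, 3, 6, 9, 12, 18, 24]

def covariance(xs, ys):
    """Sample covariance between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def rank_by_abs_covariance(profiles, times=TIME_POINTS_H):
    """Rank genes by |cov(expression, time)|, largest first; also return signed scores."""
    scores = {gene: covariance(values, times) for gene, values in profiles.items()}
    ranked = sorted(scores, key=lambda gene: abs(scores[gene]), reverse=True)
    return ranked, scores

# Hypothetical expression profiles (arbitrary units), one value per time point
profiles = {
    "GENE_UP":   [1.0, 1.2, 1.5, 2.1, 2.8, 3.3, 4.0, 4.6],
    "GENE_DOWN": [5.0, 4.9, 4.5, 4.0, 3.2, 2.9, 2.0, 1.1],
    "GENE_FLAT": [3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0],
}

ranked, scores = rank_by_abs_covariance(profiles)
for gene in ranked:
    print(f"{gene}: covariance with time = {scores[gene]:+.2f}")
```

Taking the absolute value places both steadily increasing and steadily decreasing transcripts near the head of the list, while a flat profile falls to the bottom; the sign of the covariance still records the direction of change.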
Putative crosstalk between type IV collagens and TSP1-containing proteins
We studied the association between type IV collagens and TSP1-containing proteins to reveal the mediators of crosstalk between these two families. The crosstalk proteins between type IV collagens and TSP1-containing proteins are highlighted in yellow in the accompanying figure. A significant number of these proteins bind collagen and associate with the vesicle lumen (see Crosstalk protein functional enrichment). CD36 is also known to interact with type IV collagens.24 The identification of CD36 as a crosstalk protein for type IV collagens and TSP1-containing proteins helps confirm our approach. Decorin (DCN) is another proteoglycan that we identify as a crosstalk protein. Decorin interacts with collagens and the extracellular matrix (ECM) and promotes angiogenesis.25 Fibronectin 1 (FN1) is an important connective molecule in the extracellular space. FN1 has domains for collagen, fibulin 1, heparin, and syndecan binding.26 We identify FN1 as a crosstalk protein between type IV collagens and TSP1-containing proteins. FN1 connects extracellular collagens with membrane-bound integrins. As such, FN1 has a central role in endothelial cell adhesion to the ECM. Another important conduit of information between TSP1-containing proteins and type IV collagens runs through aggrecan (ACAN) and brevican (BCAN) via fibulin 2 (FBLN2).27,28 The crosstalk between type IV collagens and TSP1-containing proteins through ACAN, BCAN, and FBLN2 has not been reported in the context of angiogenesis, although it is known that FBLN2 inhibits tumor angiogenesis.28 The crosstalk between type IV collagens and TSP1-containing proteins may be significantly influenced by fibronectin 1, aggrecan, brevican, and fibulin 2.
The amyloid beta (A4) precursor protein (APP) is also annotated as a crosstalk protein between type IV collagens and TSP1-containing proteins. The network figure shows the direct interactions between APP and COL4A1, COL4A2, COL4A5, COL4A6, and the TSP1-containing spondin 1 (SPON1). APP is known to be associated with Alzheimer’s disease.29 It is also known that Alzheimer’s disease is related to angiogenesis.30 This study suggests that angiogenesis might influence Alzheimer’s disease through the association of APP with type IV collagens and TSP1-containing proteins.
Network of association between type IV collagens, CXC chemokines and TSP1-containing proteins
Crosstalk protein functional enrichment
Putative crosstalk between type IV collagens, CXC chemokines and TSP1-containing proteins
We were also interested in identifying the potential avenues of crosstalk between type IV collagens, CXC chemokines, and TSP1-containing proteins. We identified six proteins that are well connected to all three families of angiogenesis-associated proteins. The crosstalk proteins between all three families are shown in orange in the accompanying figure. A significant number of these proteins bind collagen and are localized on the cell surface (see Crosstalk protein functional enrichment). MMP9 was identified as a crosstalk protein between the three families of angiogenesis-associated proteins. MMP9 is known to degrade type IV collagens31 and CXC chemokines such as PF4.32 Thrombospondins are known to regulate the amount of MMP9.33 These functions outline the pivotal role of MMP9 in angiogenesis. Although MMP9 degrades many proteins, the interaction between MMP9 and the angiogenesis-associated protein families is highly significant (p=0.004).
Our work highlights syndecan 1 (SDC1), syndecan 2 (SDC2), and syndecan 4 (SDC4) at the centre of crosstalk between type IV collagens, CXC chemokines, and TSP1-containing proteins. Syndecans have been previously implicated in angiogenesis.34 Endothelial CD44 plays an important role in tube formation during angiogenesis.35 Our study suggests that CD44 may operate as a mediator of crosstalk between type IV collagens, CXC chemokines, and TSP1-containing proteins. Note that WISP-1, a TSP1-containing protein, is connected to the type IV collagen family through Bone Morphogenetic Protein 3 (BMP-3). An anti-angiogenic peptide derived from WISP-1, previously identified5 with relatively low anti-proliferative and anti-migratory activity in vitro, showed significant in vivo activity in corneal and laser-induced choroidal neovascularization mouse models.9
Versican (VCAN) is the last protein in the set of centrally located proteins. VCAN is involved in the attachment of endothelial cells to the extracellular matrix. The importance of VCAN in angiogenesis could easily be missed by methods that only consider direct interactions. VCAN has only a few physical protein-protein interactions, and it has only one direct interaction with the angiogenesis-associated proteins (i.e. ADAMTS1). Still, our analysis highlights VCAN as a potential component of crosstalk between type IV collagens, CXC chemokines, and TSP1-containing proteins. Using the head-to-head comparison of topological annotation methods, we confirmed that local approaches such as first neighbors (p=0.046) and second neighbors (p=0.11) would have missed VCAN at the 0.01 level, while non-local approaches such as graph diffusion (p=0.008) and betweenness centrality (p=0.006) would have identified VCAN as significant at that level. We identify six proteins at the center of the type IV collagen, CXC chemokine, and TSP1-containing protein network. These proteins, SDC1, SDC2, SDC4, MMP9, CD44, and VCAN, appear to be important components of angiogenesis, based on their position within the angiogenesis-associated network.