Proteins localized within the same subcellular compartment tend to be functionally associated. This study shows that subcellular localization and network distance between disease-associated proteins provide complementary information explaining patterns of disease comorbidity.
A positive correlation was found between subcellular localization of disease-associated protein pairs and measures of comorbidity. A higher comorbidity tendency was found for disease-associated protein pairs that are positioned within a shorter distance in the protein interaction network. The integration of subcellular localization information with the protein interaction network sheds light on the potential molecular connections underlying comorbidity patterns and will help to understand the mechanisms of human disease.
It has been shown that the emergence of phenotypically similar diseases is triggered by molecular connections between disease-causing genes (Oti and Brunner, 2007; Zaghloul and Katsanis, 2010). From a genetics perspective, diseases are associated with certain genes (Goh et al, 2007; Feldman et al, 2008), whereas from a proteomics perspective, phenotypically similar diseases are connected via biological modules such as protein–protein interactions (PPIs) or molecular pathways (Lage et al, 2007; Jiang et al, 2008; Wu et al, 2008; Linghu et al, 2009; Suthram et al, 2010). These molecular connections between diseases were observed on the population level as well: diseases connected through shared genes, PPIs, and metabolic pathways tend to show elevated comorbidity (Rzhetsky et al, 2007; Lee et al, 2008; Zhernakova et al, 2009; Park et al, 2009a, 2009b). While these findings constitute a step toward improving our understanding of the mechanism of disease progression, many more molecule-level connections between disease pairs remain to be explored in order to establish a firmer comorbidity association.
Subcellular localization provides spatial information of proteins in the cell; proteins target subcellular localizations to interact with appropriate partners and form functional complexes in signaling pathways and metabolic processes (Au et al, 2007). Abnormal protein localizations are known to lead to the loss of functional effects in diseases (Luheshi et al, 2008; Laurila and Vihinen, 2009). For example, mis-localizations of nuclear/cytoplasmic transport have been detected in many types of carcinoma cells (Kau et al, 2004). A proper identification of protein subcellular localization can hence be useful in discovering disease-associated proteins (Giallourakis et al, 2005; Calvo and Mootha, 2010). With this understanding, we postulate that disease-associated proteins connected by subcellular localizations could also explain the phenotypic similarities between diseases. Furthermore, such connections may also couple to disease progressions that contribute to multiple disease manifestation, that is, comorbidity.
Protein subcellular localization has been extensively studied through various methods to determine a variety of protein functions. To the best of our knowledge, however, the connection between diseases and subcellular localizations is yet to be studied systematically. To resolve this, we constructed, for the first time, a human Disease-associated Protein and subcellular Localization (DPL) matrix (top panel in Box 1). Our DPL matrix provides the ‘cellular localization map of diseases’ that represents the spatial index of diseases in the cell. We found that each disease shows a characteristic subcellular localization profile in the DPL matrix. We were interested in determining whether subsets of 1284 human diseases exhibit distinct enrichment profiles across subcellular localizations. We calculated pairwise correlations and performed a hierarchical clustering of the enrichments of the 1284 diseases across 10 different subcellular localizations.
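As an illustration of how a disease-by-localization enrichment matrix of this kind could be assembled, the following minimal sketch scores each disease against each localization with a one-sided hypergeometric test. The choice of test, the toy annotations, the function names (`hypergeom_tail`, `build_dpl`), and the simplification that each protein has a single localization are all assumptions made for illustration; they are not taken from the study.

```python
from math import comb

# Hypothetical toy annotations (not data from the study).
disease_proteins = {
    "disease_A": {"P1", "P2", "P3"},
    "disease_B": {"P3", "P4"},
}
# Simplifying assumption: one localization per protein.
protein_localization = {
    "P1": "mitochondrion",
    "P2": "mitochondrion",
    "P3": "mitochondrion",
    "P4": "nucleus",
    "P5": "nucleus",
    "P6": "cytosol",
}


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n proteins are drawn from N, of which K carry the label."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def build_dpl(disease_proteins, protein_localization):
    """Return {disease: {localization: enrichment P value}}."""
    localizations = sorted(set(protein_localization.values()))
    total = len(protein_localization)
    matrix = {}
    for disease, proteins in disease_proteins.items():
        # Only proteins with a known localization can be scored.
        annotated = [p for p in proteins if p in protein_localization]
        row = {}
        for loc in localizations:
            in_loc = sum(1 for l in protein_localization.values() if l == loc)
            observed = sum(1 for p in annotated if protein_localization[p] == loc)
            row[loc] = hypergeom_tail(observed, len(annotated), in_loc, total)
        matrix[disease] = row
    return matrix


dpl = build_dpl(disease_proteins, protein_localization)
```

Each row of `dpl` is a localization profile for one disease; small P values mark the compartments in which that disease's proteins are over-represented relative to the annotated background.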
Our DPL matrix revealed that 778 diseases (∼62%, P=1.40 × 10−3) are enriched in a single localization and 273 diseases (∼21%, P=3.45 × 10−3) are enriched in dual localizations. In the DPL matrix, certain disease-associated proteins are likely to be found in membrane-bounded organelles such as mitochondria, lysosome, and peroxisome, indicating that the mutations of proteins localized to these compartments are connected to the pathophysiological conditions of those organelles. Meanwhile, certain disease-associated proteins in the DPL matrix are enriched in dual localizations, such as extracellular/plasma membrane or endoplasmic reticulum/Golgi. Although the two localizations in each of these pairs appear at first to be distinct compartments, they are functionally related and in close proximity during the protein translocation process in the cell, and thus are likely to share interacting protein partners (Gandhi et al, 2006).
Comorbidity represents the co-occurrence of multiple diseases in the same individual (Lee et al, 2008; Hidalgo et al, 2009; Park et al, 2009a). Many comorbid disease pairs have been shown to share common genes in the human disease network. For example, diabetes and Alzheimer's disease share a risk factor in angiotensin I converting enzyme, and frequently occur together in an individual. In such instances, comorbidity can be partially attributed to the disease connections on the molecular level. To explore the impact of protein subcellular localization on comorbidity, we hypothesized that certain disease pairs could also be connected via subcellular localization through the molecular connections between the disease-associated proteins (bottom panel in Box 1).
We found a positive correlation between subcellular localization similarity and relative risk (Figure 3B, Pearson's correlation coefficient between relative risk and subcellular localization similarity=0.81, P=2.96 × 10−5). The subcellular localization similarity represents the correlation of subcellular localization profiles between disease pairs. To our surprise, when we compared the relative risk of disease pairs linked via various molecular connections, we found that disease pairs connected by subcellular localization showed a near three-fold higher comorbidity tendency (with link distances equal to 2 or 3) when compared with random pairs (Figure 3E).
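The two quantities compared above can be sketched as follows. Subcellular localization similarity is computed as the Pearson correlation between two diseases' localization profiles, as described in the text. The relative risk formula used here, RR = C_ij · N / (P_i · P_j), is the definition commonly used in the comorbidity literature and is an assumption, since this section does not state the formula. The profiles and patient counts are hypothetical toy values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sd_x * sd_y)


def relative_risk(c_ij, p_i, p_j, n_total):
    """Assumed definition: observed co-occurrence over that expected by chance.

    c_ij    -- number of patients diagnosed with both diseases
    p_i     -- number of patients with disease i
    p_j     -- number of patients with disease j
    n_total -- total number of patients in the population
    """
    return c_ij * n_total / (p_i * p_j)


# Hypothetical localization profiles over four compartments.
profile_a = [0.8, 0.1, 0.1, 0.0]
profile_b = [0.7, 0.2, 0.1, 0.0]
profile_c = [0.0, 0.1, 0.1, 0.8]

similarity_ab = pearson(profile_a, profile_b)  # similar profiles
similarity_ac = pearson(profile_a, profile_c)  # dissimilar profiles
rr_example = relative_risk(c_ij=50, p_i=100, p_j=200, n_total=10000)
```

Under this definition, a relative risk above 1 means the two diseases co-occur more often than expected if they were independent. Correlating the per-pair similarity values with the per-pair relative risks across many disease pairs would give a coefficient of the kind reported above.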
We then quantitatively assessed the impact of network distances and subcellular localizations on the comorbidity tendency of disease pairs. We expected the proteins associated with comorbid disease pairs to be located closely in the protein interaction network, separated by fewer links than those of random disease pairs. Indeed, a higher comorbidity tendency was found when two disease-associated proteins were positioned within a shorter distance (gray plots in Figure 3F). Moreover, when subcellular localization information was combined with small network distances, the comorbidity tendency increased dramatically (orange plots in Figure 3F). This suggests that subcellular localization and close network distances, two conceptually distinct molecular connections, contribute synergistically to the comorbidity tendency.
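The notion of network distance between two diseases can be illustrated with a breadth-first search over an undirected interaction graph. In this minimal sketch, the distance between two diseases is taken to be the smallest number of links separating any protein of one disease from any protein of the other. That definition, the toy edge list, and the function names are assumptions for illustration and may differ from the procedure used in the study.

```python
from collections import deque

# Hypothetical toy interaction network (not data from the study).
ppi_edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P5", "P6")]


def build_graph(edges):
    """Build an undirected adjacency map from a list of protein pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_distance(graph, sources, targets):
    """Fewest links from any source protein to any target protein.

    Returns None when no path exists. Multi-source breadth-first search
    visits nodes in order of increasing distance, so the first target
    reached is at the minimum distance.
    """
    targets = set(targets)
    queue = deque((s, 0) for s in sources if s in graph)
    seen = {node for node, _ in queue}
    while queue:
        node, dist = queue.popleft()
        if node in targets:
            return dist
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


graph = build_graph(ppi_edges)
distance_far = shortest_distance(graph, {"P1"}, {"P4"})
distance_near = shortest_distance(graph, {"P1", "P3"}, {"P4"})
distance_none = shortest_distance(graph, {"P1"}, {"P5"})
```

Disease pairs could then be binned by this distance, and by whether their proteins share a localization, before comparing the average comorbidity of each bin.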
Disease progression is not restricted to the mutation of disease-causing genes, but is also affected by molecular connections in ‘disease modules,’ resulting in comorbidity (Fraser, 2006; Lee et al, 2008). In this study, we applied subcellular localization information for the first time to elucidate the molecular connections between comorbid diseases. Based on our findings, we believe that our approach helps to define the boundaries of ‘disease modules.’ Taken together, integration of diverse molecular connections should improve the molecular-level understanding of hitherto unexplained comorbid disease pairs and help to expand our knowledge of the mechanism of human disease progression.
Proteins targeting the same subcellular localization tend to participate in mutual protein–protein interactions (PPIs) and are often functionally associated. Here, we investigated the relationship between disease-associated proteins and their subcellular localizations, based on the assumption that protein pairs associated with phenotypically similar diseases are more likely to be connected via subcellular localization. The spatial constraints from subcellular localization significantly strengthened the disease associations of the proteins connected by subcellular localizations. In particular, certain disease types were more prevalent in specific subcellular localizations. We analyzed the enrichment of disease phenotypes within subcellular localizations, and found that there exists a significant correlation between disease classes and subcellular localizations. Furthermore, we found that two diseases displayed high comorbidity when disease-associated proteins were connected via subcellular localization. We newly explained 7584 disease pairs by using the context of protein subcellular localization, which had not been identified using shared genes or PPIs only. Our result establishes a direct correlation between protein subcellular localization and disease association, and helps to understand the mechanism of human disease progression.