Related Articles

1.  Building Disease-Specific Drug-Protein Connectivity Maps from Molecular Interaction Networks and PubMed Abstracts 
PLoS Computational Biology  2009;5(7):e1000450.
The recently proposed concept of molecular connectivity maps enables researchers to integrate experimental measurements of genes, proteins, metabolites, and drug compounds under similar biological conditions. The study of these maps provides opportunities for future toxicogenomics and drug discovery applications. We developed a computational framework to build disease-specific drug-protein connectivity maps. We integrated gene/protein and drug connectivity information based on protein interaction networks and literature mining, without requiring gene expression profile information derived from drug perturbation experiments on disease samples. We described the development and application of this computational framework using Alzheimer's Disease (AD) as a primary example in three steps. First, molecular interaction networks were incorporated to reduce bias and improve relevance of AD seed proteins. Second, PubMed abstracts were used to retrieve enriched drug terms that are indirectly associated with AD through molecular mechanistic studies. Third and lastly, a comprehensive AD connectivity map was created by relating enriched drugs and related proteins in literature. We showed that this molecular connectivity map development approach outperformed both curated drug target databases and conventional information retrieval systems. Our initial explorations of the AD connectivity map yielded a new hypothesis that diltiazem and quinidine may be investigated as candidate drugs for AD treatment. Molecular connectivity maps derived computationally can help study molecular signature differences between different classes of drugs in specific disease contexts. To achieve overall good data coverage and quality, a series of statistical methods have been developed to overcome high levels of data noise in biological networks and literature mining results. 
Further development of computational molecular connectivity maps to cover major disease areas will likely set up a new model for drug development, in which therapeutic/toxicological profiles of candidate drugs can be checked computationally before costly clinical trials begin.
Author Summary
Molecular connectivity maps between drugs and a wide range of bio-molecular entities can help researchers to study and compare the molecular therapeutic/toxicological profiles of many candidate drugs. Recent studies in this area have focused on linking drug molecules and genes in specific disease contexts using drug-perturbed gene expression experiments, which can be costly and time-consuming to derive. In this paper, we developed a computational framework to build disease-specific drug-protein connectivity maps, by mining molecular interaction networks and PubMed abstracts. Using Alzheimer's Disease (AD) as a case study, we described how drug-protein molecular connectivity maps can be constructed to overcome data coverage and noise issues inherent in automatically extracted results. We showed that this new approach outperformed both curated drug target databases and conventional text mining systems in retrieving disease-related drugs, with an overall balanced performance of sensitivity, specificity, and positive predictive values. The AD molecular connectivity map contained novel information on AD-related genes/proteins, AD candidate drugs, and protein therapeutic/toxicological profiles of all the AD candidate drugs. Bi-clustering of the molecular connectivity map revealed interesting patterns of functionally similar proteins and drugs, therefore creating new opportunities for future drug development applications.
PMCID: PMC2709445  PMID: 19649302
2.  Protein localization as a principal feature of the etiology and comorbidity of genetic diseases 
Proteins localized within the same subcellular compartment tend to be functionally associated. This study shows that subcellular localization and network distance between disease-associated proteins provide complementary information explaining patterns of disease comorbidity.
A positive correlation was found between subcellular localization of disease-associated protein pairs and measures of comorbidity. A higher comorbidity tendency was found for disease-associated protein pairs that are positioned within a shorter distance in the protein interaction network. The integration of subcellular localization information with protein interaction network sheds light onto the potential molecular connections underlying comorbidity patterns and will help to understand the mechanisms of human disease.
It was shown that the emergence of phenotypically similar diseases is triggered as a result of molecular connections between disease-causing genes (Oti and Brunner, 2007; Zaghloul and Katsanis, 2010). From a genetics perspective, diseases are associated with certain genes (Goh et al, 2007; Feldman et al, 2008), whereas from a proteomics perspective phenotypically similar diseases are connected via biological modules such as protein–protein interactions (PPIs) or molecular pathways (Lage et al, 2007; Jiang et al, 2008; Wu et al, 2008; Linghu et al, 2009; Suthram et al, 2010). These molecular connections between diseases were observed on the population level as well: diseases connected through molecular connections such as shared genes, PPIs, and metabolic pathways tend to show elevated comorbidity (Rzhetsky et al, 2007; Lee et al, 2008; Zhernakova et al, 2009; Park et al, 2009a, 2009b). While these findings constitute a step toward improving our understanding of the mechanism of disease progression, there are still many more molecule-level connections between disease pairs that need to be explored in order to establish a firmer comorbidity association.
Subcellular localization provides spatial information of proteins in the cell; proteins target subcellular localizations to interact with appropriate partners and form functional complexes in signaling pathways and metabolic processes (Au et al, 2007). Abnormal protein localizations are known to lead to the loss of functional effects in diseases (Luheshi et al, 2008; Laurila and Vihinen, 2009). For example, mis-localizations of nuclear/cytoplasmic transport have been detected in many types of carcinoma cells (Kau et al, 2004). A proper identification of protein subcellular localization can hence be useful in discovering disease-associated proteins (Giallourakis et al, 2005; Calvo and Mootha, 2010). With this understanding, we postulate that disease-associated proteins connected by subcellular localizations could also explain the phenotypic similarities between diseases. Furthermore, such connections may also couple to disease progressions that contribute to multiple disease manifestation, that is, comorbidity.
Protein subcellular localization has been extensively studied through various methods to determine a variety of protein functions. To the best of our knowledge, the connection between diseases and subcellular localizations is yet to be studied systematically. To resolve this we constructed, for the first time, a human Disease-associated Protein and subcellular Localization (DPL) matrix (top panel in Box 1). Our DPL matrix provides the ‘cellular localization map of diseases’ that represents the spatial index of diseases in the cell. We found that each disease shows unique characteristics of subcellular localization profile in the DPL matrix. We were interested in determining whether subsets of 1284 human diseases exhibit distinct enrichment profiles across subcellular localizations. We calculated pairwise correlations and performed a hierarchical clustering of the enrichments of the 1284 diseases across 10 different subcellular localizations.
Our DPL matrix revealed that 778 diseases (∼62%, P=1.40 × 10^−3) are enriched in a single localization and 273 diseases (∼21%, P=3.45 × 10^−3) are enriched in dual localizations. In the DPL matrix, certain disease-associated proteins are likely to be found in membrane-bounded organelles such as mitochondria, lysosome, and peroxisome, indicating that the mutations of proteins localized to these compartments are connected to the pathophysiological conditions of those organelles. Meanwhile, certain disease-associated proteins in the DPL matrix are enriched in dual localizations, such as extracellular/plasma membrane or endoplasmic reticulum/Golgi. Although these two pairs of subcellular localizations appear to be distinct compartments at first, they are functionally related compartments in close proximity during protein translocation process in the cell, and thus are likely to share interacting protein partners (Gandhi et al, 2006).
Comorbidity represents the co-occurrence of multiple diseases in the same individual (Lee et al, 2008; Hidalgo et al, 2009; Park et al, 2009a). Many comorbid disease pairs have been shown to share common genes in the human disease network. For example, diabetes and Alzheimer's disease share a risk factor in angiotensin I converting enzyme, and frequently occur together in an individual. In such instances, comorbidity can be partially attributed to the disease connections on the molecular level. To explore the impact of protein subcellular localization on comorbidity, we hypothesized that certain disease pairs could also be connected via subcellular localization by the molecular connections between the disease-associated proteins (bottom panel in Box 1).
We found a positive correlation between subcellular localization similarity and relative risk (Figure 3B, Pearson's correlation coefficient between relative risk and subcellular localization similarity=0.81, P=2.96 × 10^−5). The subcellular localization similarity represents the correlation of subcellular localization profiles between disease pairs. To our surprise, when we compared the relative risk of disease pairs linked via various molecular connections, we found that disease pairs connected by subcellular localization showed a near three-fold higher comorbidity tendency (with link distances equal to 2 or 3) when compared with random pairs (Figure 3E).
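The localization similarity described above can be illustrated with a minimal sketch, assuming each disease is summarized as a vector of enrichment scores over the 10 subcellular localizations. The function name `pearson` and all profile values below are hypothetical and chosen only for illustration; the study's actual enrichment scores and implementation are not given in this summary.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)

# Hypothetical enrichment scores over 10 subcellular localizations
# (illustrative values only, not data from the study).
disease_a = [0.9, 0.1, 0.0, 0.2, 0.0, 0.1, 0.0, 0.0, 0.3, 0.0]
disease_b = [0.8, 0.2, 0.0, 0.1, 0.0, 0.0, 0.1, 0.0, 0.4, 0.0]
disease_c = [0.0, 0.0, 0.7, 0.0, 0.6, 0.0, 0.0, 0.1, 0.0, 0.0]

# Diseases A and B are enriched in the same compartments; A and C are not.
similarity_ab = pearson(disease_a, disease_b)
similarity_ac = pearson(disease_a, disease_c)
```

In this toy setting, the pair with overlapping enrichment (A and B) scores much higher than the pair enriched in different compartments (A and C), which is the kind of signal that would then be compared against relative risk.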
We then assessed quantitatively the impact of network distances and subcellular localizations on the comorbidity tendency of disease pairs. We expected the proteins associated with comorbid disease pairs to be located closely in the protein interaction network via fewer links compared with random disease pairs. Indeed, a higher comorbidity tendency was found when two disease-associated proteins were positioned within a shorter distance (gray plots in Figure 3F). Moreover, when subcellular localization information was combined with small network distances, the comorbidity tendency increased dramatically (orange plots in Figure 3F). It suggests that subcellular localization and close network distances, two conceptually distinct molecular connections, contributed synergistically to the comorbidity tendency.
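As a rough illustration of the network distance used above, the following sketch counts the links on the shortest path between two proteins with a breadth-first search. It assumes an undirected, unweighted interaction network stored as an adjacency mapping; the protein names and the function `network_distance` are hypothetical, and the study's own distance computation may differ.

```python
from collections import deque

def network_distance(ppi, source, target):
    """Number of links on the shortest path between two proteins, or None if unconnected."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        protein, dist = queue.popleft()
        for neighbor in ppi.get(protein, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

# Hypothetical undirected interaction network (placeholder protein names).
ppi = {
    "P1": {"P2"},
    "P2": {"P1", "P3"},
    "P3": {"P2", "P4"},
    "P4": {"P3"},
    "P5": set(),  # isolated protein with no known interactions
}
```

For example, `network_distance(ppi, "P1", "P4")` walks P1, P2, P3, P4 and reports three links, while an isolated protein such as P5 yields `None`.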
Disease progression is not restricted to the mutation of disease-causing genes, but is also affected by molecular connections in ‘disease modules,’ resulting in comorbidity (Fraser, 2006; Lee et al, 2008). In this study, for the first time we applied subcellular localization information to elucidate the molecular connections between comorbid diseases. We believe that, based on our finding, our approach helps to define the boundaries of ‘disease modules.’ Taken together, integration of diverse molecular connections should improve the molecular level understanding of hitherto unexplained comorbid disease pairs and help us in expanding the scope of our knowledge of the mechanism of human disease progression.
Proteins targeting the same subcellular localization tend to participate in mutual protein–protein interactions (PPIs) and are often functionally associated. Here, we investigated the relationship between disease-associated proteins and their subcellular localizations, based on the assumption that protein pairs associated with phenotypically similar diseases are more likely to be connected via subcellular localization. The spatial constraints from subcellular localization significantly strengthened the disease associations of the proteins connected by subcellular localizations. In particular, certain disease types were more prevalent in specific subcellular localizations. We analyzed the enrichment of disease phenotypes within subcellular localizations, and found that there exists a significant correlation between disease classes and subcellular localizations. Furthermore, we found that two diseases displayed high comorbidity when disease-associated proteins were connected via subcellular localization. We newly explained 7584 disease pairs by using the context of protein subcellular localization, which had not been identified using shared genes or PPIs only. Our result establishes a direct correlation between protein subcellular localization and disease association, and helps to understand the mechanism of human disease progression.
PMCID: PMC3130560  PMID: 21613983
cellular networks; comorbidity; human disease; subcellular localization
3.  Automated identification of pathways from quantitative genetic interaction data 
We present a novel Bayesian learning method that reconstructs large detailed gene networks from quantitative genetic interaction (GI) data. The method uses global reasoning to handle missing and ambiguous measurements, and provides confidence estimates for each prediction. Applied to a recent data set over genes relevant to protein folding, the learned networks reflect known biological pathways, including details such as pathway ordering and directionality of relationships. The reconstructed networks also suggest novel relationships, including the placement of SGT2 in the tail-anchored biogenesis pathway, a finding that we experimentally validated.
Recent developments have enabled large-scale quantitative measurement of genetic interactions (GIs) that report on the extent to which the activity of one gene is dependent on a second. It has long been recognized (Avery and Wasserman, 1992; Hartman et al, 2001; Segre et al, 2004; Tong et al, 2004; Drees et al, 2005; Schuldiner et al, 2005; St Onge et al, 2007; Costanzo et al, 2010) that functional dependencies revealed by GI data can provide rich information regarding underlying biological pathways. Further, the precise phenotypic measurements provided by quantitative GI data can provide evidence for even more detailed aspects of pathway structure, such as differentiating between full and partial dependence between two genes (Drees et al, 2005; Schuldiner et al, 2005; St Onge et al, 2007; Jonikas et al, 2009) (Figure 1A). As GI data sets become available for a range of quantitative phenotypes and organisms, such patterns will allow researchers to elucidate pathways important to a diverse set of biological processes.
We present a new method that exploits the high-quality, quantitative nature of recent GI assays to automatically reconstruct detailed multi-gene pathway structures, including the organization of a large set of genes into coherent pathways, the connectivity and ordering within each pathway, and the directionality of each relationship. We introduce activity pathway networks (APNs), which represent functional dependencies among a set of genes in the form of a network. We present an automatic method to efficiently reconstruct APNs over large sets of genes based on quantitative GI measurements. This method handles uncertainty in the data arising from noise, missing measurements, and data points with ambiguous interpretations, by performing global reasoning that combines evidence from multiple data points. In addition, because some structure choices remain uncertain even when jointly considering all measurements, our method maintains multiple likely networks, and allows computation of confidence estimates over each structure choice.
We applied our APN reconstruction method to the recent high-quality GI data set of Jonikas et al (2009), which examined the functional interaction between genes that contribute to protein folding in the ER. Specifically, Jonikas et al used the cell's endogenous sensor (the unfolded protein response), to first identify several hundred yeast genes with functions in endoplasmic reticulum folding and then systematically characterized their functional interdependencies by measuring unfolded protein response levels in double mutants. Our analysis produced an ensemble of 500 likelihood-weighted APNs over 178 genes (Figure 2).
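One common way to turn a weighted ensemble of networks into per-relationship confidence estimates is to sum the normalized weights of the networks that contain each edge. The sketch below shows that generic model-averaging idea only; it is an assumption for illustration, not the authors' stated formula, and the function `edge_confidence`, the weights, and the gene names are all hypothetical.

```python
def edge_confidence(ensemble):
    """Map each directed edge to the weight-normalized share of networks containing it."""
    total = sum(weight for weight, _ in ensemble)
    if total <= 0:
        raise ValueError("ensemble weights must sum to a positive value")
    confidence = {}
    for weight, edges in ensemble:
        # set() guards against an edge being listed twice in one network
        for edge in set(edges):
            confidence[edge] = confidence.get(edge, 0.0) + weight / total
    return confidence

# Hypothetical three-network ensemble of (weight, directed edges);
# gene names are placeholders, not genes from the data set.
ensemble = [
    (0.5, [("geneA", "geneB"), ("geneB", "geneC")]),
    (0.3, [("geneA", "geneB"), ("geneC", "geneB")]),
    (0.2, [("geneA", "geneB")]),
]
conf = edge_confidence(ensemble)
```

Here the edge from geneA to geneB appears in every network and receives full confidence, whereas the two competing orientations between geneB and geneC split the remaining weight, reflecting a structure choice that stays uncertain.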
We performed an aggregate evaluation of our results by comparing to known biological relationships between gene pairs, including participation in pathways according to the Kyoto Encyclopedia of Genes and Genomes (KEGG), correlation of chemical genomic profiles in a recent high-throughput assay (Hillenmeyer et al, 2008) and similarity of Gene Ontology (GO) annotations. In each evaluation performed, our reconstructed APNs were significantly more consistent with the known relationships than either the raw GI values or the Pearson correlation between profiles of GI values.
Importantly, our approach provides not only an improved means for defining pairs or groups of related genes, but also enables the identification of detailed multi-gene network structures. In many cases, our method successfully reconstructed known cellular pathways, including the ER-associated degradation (ERAD) pathway, and the biosynthesis of N-linked glycans, ranking them among the highest confidence structures. In-depth examination of the learned network structures indicates agreement with many known details of these pathways. In addition, quantitative analysis indicates that our learned APNs are indicative of ordering within KEGG-annotated biological pathways.
Our results also suggest several novel relationships, including placement of uncharacterized genes into pathways, and novel relationships between characterized genes. These include the dependence of the J domain chaperone JEM1 on the PDI homolog MPD1, dependence of the Ubiquitin-recycling enzyme DOA4 on N-linked glycosylation, and the dependence of the E3 Ubiquitin ligase DOA10 on the signal peptidase complex subunit SPC2. Our APNs also place the poorly characterized TPR-containing protein SGT2 upstream of the tail-anchored protein biogenesis machinery components GET3, GET4, and MDY2 (also known as GET5), suggesting that SGT2 has a function in the insertion of tail-anchored proteins into membranes. Consistent with this prediction, our experimental analysis shows that sgt2Δ cells show a defect in localization of the tail-anchored protein GFP-Sed5 from punctate Golgi structures to a more diffuse pattern, as seen in other genes involved in this pathway.
Our results show that multi-gene, detailed pathway networks can be reconstructed from quantitative GI data, providing a concrete computational manifestation to intuitions that have traditionally accompanied the manual interpretation of such data. Ongoing technological developments in both genetics and imaging are enabling the measurement of GI data at a genome-wide scale, using high-accuracy quantitative phenotypes that relate to a range of particular biological functions. Methods based on RNAi will soon allow collection of similar data for human cell lines and other mammalian systems (Moffat et al, 2006). Thus, computational methods for analyzing GI data could have an important function in mapping pathways involved in complex biological systems including human cells.
High-throughput quantitative genetic interaction (GI) measurements provide detailed information regarding the structure of the underlying biological pathways by reporting on functional dependencies between genes. However, the analytical tools for fully exploiting such information lag behind the ability to collect these data. We present a novel Bayesian learning method that uses quantitative phenotypes of double knockout organisms to automatically reconstruct detailed pathway structures. We applied our method to a recent data set that measures GIs for endoplasmic reticulum (ER) genes, using the unfolded protein response as a quantitative phenotype. The results provided reconstructions of known functional pathways including N-linked glycosylation and ER-associated protein degradation. It also contained novel relationships, such as the placement of SGT2 in the tail-anchored biogenesis pathway, a finding that we experimentally validated. Our approach should be readily applicable to the next generation of quantitative GI data sets, as assays become available for additional phenotypes and eventually higher-level organisms.
PMCID: PMC2913392  PMID: 20531408
computational biology; genetic interaction; pathway reconstruction; probabilistic methods
4.  Survival-Related Profile, Pathways, and Transcription Factors in Ovarian Cancer 
PLoS Medicine  2009;6(2):e1000024.
Ovarian cancer has a poor prognosis due to advanced stage at presentation and either intrinsic or acquired resistance to classic cytotoxic drugs such as platinum and taxoids. Recent large clinical trials with different combinations and sequences of classic cytotoxic drugs indicate that further significant improvement in prognosis by this type of drugs is not to be expected. Currently, a large number of drugs targeting dysregulated molecular pathways in cancer cells have been developed and are being introduced in the clinic. A major challenge is to identify those patients who will benefit from drugs targeting these specific dysregulated pathways. The aims of our study were (1) to develop a gene expression profile associated with overall survival in advanced stage serous ovarian cancer, (2) to assess the association of pathways and transcription factors with overall survival, and (3) to validate our identified profile and pathways/transcription factors in an independent set of ovarian cancers.
Methods and Findings
According to a randomized design, profiling of 157 advanced stage serous ovarian cancers was performed in duplicate using ∼35,000 70-mer oligonucleotide microarrays. A continuous predictor of overall survival was built taking into account well-known issues in microarray analysis, such as multiple testing and overfitting. A functional class scoring analysis was utilized to assess pathways/transcription factors for their association with overall survival. The prognostic value of genes that constitute our overall survival profile was validated on a fully independent, publicly available dataset of 118 well-defined primary serous ovarian cancers. Furthermore, functional class scoring analysis was also performed on this independent dataset to assess the similarities with results from our own dataset. An 86-gene overall survival profile discriminated between patients with unfavorable and favorable prognosis (median survival, 19 versus 41 mo, respectively; permutation p-value of log-rank statistic = 0.015) and maintained its independent prognostic value in multivariate analysis. Genes that composed the overall survival profile were also able to discriminate between the two risk groups in the independent dataset. In our dataset 17/167 pathways and 13/111 transcription factors were associated with overall survival, of which 16 and 12, respectively, were confirmed in the independent dataset.
Our study provides new clues to genes, pathways, and transcription factors that contribute to the clinical outcome of serous ovarian cancer and might be exploited in designing new treatment strategies.
Ate van der Zee and colleagues analyze the gene expression profiles of ovarian cancer samples from 157 patients, and identify an 86-gene expression profile that seems to predict overall survival.
Editors' Summary
Ovarian cancer kills more than 100,000 women every year and is one of the most frequent causes of cancer death in women in Western countries. Most ovarian cancers develop when an epithelial cell in one of the ovaries (two small organs in the pelvis that produce eggs) acquires genetic changes that allow it to grow uncontrollably and to spread around the body (metastasize). In its early stages, ovarian cancer is confined to the ovaries and can often be treated successfully by surgery alone. Unfortunately, early ovarian cancer rarely has symptoms so a third of women with ovarian cancer have advanced disease when they first visit their doctor with symptoms that include vague abdominal pains and mild digestive disturbances. That is, cancer cells have spread into their abdominal cavity and metastasized to other parts of the body (so-called stage III and IV disease). The outlook for women diagnosed with stage III and IV disease, which are treated with a combination of surgery and chemotherapy, is very poor. Only 30% of women with stage III, and 5% with stage IV, are still alive five years after their cancer is diagnosed.
Why Was This Study Done?
If the cellular pathways that determine the biological behavior of ovarian cancer could be identified, it might be possible to develop more effective treatments for women with stage III and IV disease. One way to identify these pathways is to use gene expression profiling (a technique that catalogs all the genes expressed by a cell) to compare gene expression patterns in the ovarian cancers of women who survive for different lengths of time. Genes with different expression levels in tumors with different outcomes could be targets for new treatments. For example, it might be worth developing inhibitors of proteins whose expression is greatest in tumors with short survival times. In this study, the researchers develop an expression profile that is associated with overall survival in advanced-stage serous ovarian cancer (more than half of ovarian cancers originate in serous cells, epithelial cells that secrete a watery fluid). The researchers also assess the association of various cellular pathways and transcription factors (proteins that control the expression of other proteins) with survival in this type of ovarian carcinoma.
What Did the Researchers Do and Find?
The researchers analyzed the gene expression profiles of tumor samples taken from 157 patients with advanced stage serous ovarian cancer and used the “supervised principal components” method to build a predictor of overall survival from these profiles and patient survival times. This 86-gene predictor discriminated between patients with favorable and unfavorable outcomes (average survival times of 41 and 19 months, respectively). It also discriminated between groups of patients with these two outcomes in an independent dataset collected from 118 additional serous ovarian cancers. Next, the researchers used “functional class scoring” analysis to assess the association between pathway and transcription factor expression in the tumor samples and overall survival. Seventeen of 167 KEGG pathways (“wiring” diagrams of molecular interactions, reactions and relations involved in cellular processes and human diseases listed in the Kyoto Encyclopedia of Genes and Genomes) were associated with survival, 16 of which were confirmed in the independent dataset. Finally, 13 of 111 analyzed transcription factors were associated with overall survival in the tumor samples, 12 of which were confirmed in the independent dataset.
What Do These Findings Mean?
These findings identify an 86-gene overall survival gene expression profile that seems to predict overall survival for women with advanced serous ovarian cancer. However, before this profile can be used clinically, further validation of the profile and more robust methods for determining gene expression profiles are needed. Importantly, these findings also provide new clues about the genes, pathways and transcription factors that contribute to the clinical outcome of serous ovarian cancer, clues that can now be exploited in the search for new treatment strategies. Finally, these findings suggest that it might eventually be possible to tailor therapies to the needs of individual patients by analyzing which pathways are activated in their tumors and thus improve survival times for women with advanced ovarian cancer.
Additional Information.
Please access these Web sites via the online version of this summary at
This study is further discussed in a PLoS Medicine Perspective by Simon Gayther and Kate Lawrenson
See also a related PLoS Medicine Research Article by Huntsman and colleagues
The US National Cancer Institute provides a brief description of what cancer is and how it develops, and information on all aspects of ovarian cancer for patients and professionals (in English and Spanish)
The UK charity Cancerbackup provides general information about cancer, and more specific information about ovarian cancer
MedlinePlus also provides links to other information about ovarian cancer (in English and Spanish)
The KEGG Pathway database provides pathway maps of known molecular networks involved in a wide range of cellular processes
PMCID: PMC2634794  PMID: 19192944
5.  Human Disease-Drug Network Based on Genomic Expression Profiles 
PLoS ONE  2009;4(8):e6536.
Drug repositioning offers the possibility of faster development times and reduced risks in drug discovery. With the rapid development of high-throughput technologies and ever-increasing accumulation of whole genome-level datasets, an increasing number of diseases and drugs can be comprehensively characterized by the changes they induce in gene expression, protein, metabolites and phenotypes.
Methodology/Principal Findings
We performed a systematic, large-scale analysis of genomic expression profiles of human diseases and drugs to create a disease-drug network. A network of 170,027 significant interactions was extracted from the ∼24.5 million comparisons between ∼7,000 publicly available transcriptomic profiles. The network includes 645 disease-disease, 5,008 disease-drug, and 164,374 drug-drug relationships. At least 60% of the disease-disease pairs were in the same disease area as determined by the Medical Subject Headings (MeSH) disease classification tree. The remaining can drive a molecular level nosology by discovering relationships between seemingly unrelated diseases, such as a connection between bipolar disorder and hereditary spastic paraplegia, and a connection between actinic keratosis and cancer. Among the 5,008 disease-drug links, connections with negative scores suggest new indications for existing drugs, such as the use of some antimalaria drugs for Crohn's disease, and a variety of existing drugs for Huntington's disease; while the positive scoring connections can aid in drug side effect identification, such as tamoxifen's undesired carcinogenic property. From the ∼37K drug-drug relationships, we discover relationships that aid in target and pathway deconvolution, such as 1) KCNMA1 as a potential molecular target of lobeline, and 2) both apoptotic DNA fragmentation and G2/M DNA damage checkpoint regulation as potential pathway targets of daunorubicin.
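The headline counts above can be sanity-checked with a short calculation. This sketch assumes that the comparisons are all unordered pairs of profiles and that the profile count is exactly 7,000; the abstract gives only approximate figures, so both are assumptions made for illustration.

```python
from math import comb

# Approximate number of transcriptomic profiles reported in the abstract.
profiles = 7000

# Assumption: every unordered pair of profiles is compared exactly once.
pairwise_comparisons = comb(profiles, 2)

# Relationship counts as reported for the extracted network.
disease_disease = 645
disease_drug = 5008
drug_drug = 164374
total_links = disease_disease + disease_drug + drug_drug
```

Under these assumptions, the pairwise count comes to 24,496,500, in line with the reported ∼24.5 million comparisons, and the three relationship types add up to the 170,027 significant interactions.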
We have automatically generated thousands of disease and drug expression profiles using GEO datasets, and constructed a large scale disease-drug network for effective and efficient drug repositioning as well as drug target/pathway identification.
PMCID: PMC2715883  PMID: 19657382
6.  Automatic Filtering and Substantiation of Drug Safety Signals 
PLoS Computational Biology  2012;8(4):e1002457.
Drug safety issues pose serious health threats to the population and constitute a major cause of mortality worldwide. Due to the prominent implications to both public health and the pharmaceutical industry, it is of great importance to unravel the molecular mechanisms by which an adverse drug reaction can be potentially elicited. These mechanisms can be investigated by placing the pharmaco-epidemiologically detected adverse drug reaction in an information-rich context and by exploiting all currently available biomedical knowledge to substantiate it. We present a computational framework for the biological annotation of potential adverse drug reactions. First, the proposed framework investigates previous evidences on the drug-event association in the context of biomedical literature (signal filtering). Then, it seeks to provide a biological explanation (signal substantiation) by exploring mechanistic connections that might explain why a drug produces a specific adverse reaction. The mechanistic connections include the activity of the drug, related compounds and drug metabolites on protein targets, the association of protein targets to clinical events, and the annotation of proteins (both protein targets and proteins associated with clinical events) to biological pathways. Hence, the workflows for signal filtering and substantiation integrate modules for literature and database mining, in silico drug-target profiling, and analyses based on gene-disease networks and biological pathways. Application examples of these workflows carried out on selected cases of drug safety signals are discussed. The methodology and workflows presented offer a novel approach to explore the molecular mechanisms underlying adverse drug reactions.
Author Summary
Adverse drug reactions (ADRs) constitute a major cause of morbidity and mortality worldwide. Due to the relevance of ADRs for both public health and pharmaceutical industry, it is important to develop efficient ways to monitor ADRs in the population. In addition, it is also essential to comprehend why a drug produces an adverse effect. To unravel the molecular mechanisms of ADRs, it is necessary to consider the ADR in the context of current biomedical knowledge that might explain it. Nowadays there are plenty of information sources that can be exploited in order to accomplish this goal. Nevertheless, the fragmentation of information and, more importantly, the diverse knowledge domains that need to be traversed, pose challenges to the task of exploring the molecular mechanisms of ADRs. We present a novel computational framework to aid in the collection and exploration of evidences that support the causal inference of ADRs detected by mining clinical records. This framework was implemented as publicly available tools integrating state-of-the-art bioinformatics methods for the analysis of drugs, targets, biological processes and clinical events. The availability of such tools for in silico experiments will facilitate research on the mechanisms that underlie ADR, contributing to the development of safer drugs.
PMCID: PMC3320573  PMID: 22496632
7.  IPAD: the Integrated Pathway Analysis Database for Systematic Enrichment Analysis 
BMC Bioinformatics  2012;13(Suppl 15):S7.
Next-Generation Sequencing (NGS) technologies and Genome-Wide Association Studies (GWAS) generate millions of reads and hundreds of datasets, and there is an urgent need for a better way to accurately interpret and distill such large amounts of data. Extensive pathway and network analysis allow for the discovery of highly significant pathways from a set of disease vs. healthy samples in the NGS and GWAS. Knowledge of activation of these processes will lead to elucidation of the complex biological pathways affected by drug treatment, to patient stratification studies of new and existing drug treatments, and to understanding the underlying anti-cancer drug effects. There are approximately 141 biological human pathway resources as of Jan 2012 according to the Pathguide database. However, most currently available resources do not contain disease, drug or organ specificity information such as disease-pathway, drug-pathway, and organ-pathway associations. Systematically integrating pathway, disease, drug and organ specificity together becomes increasingly crucial for understanding the interrelationships between signaling, metabolic and regulatory pathway, drug action, disease susceptibility, and organ specificity from high-throughput omics data (genomics, transcriptomics, proteomics and metabolomics).
We designed the Integrated Pathway Analysis Database for Systematic Enrichment Analysis (IPAD), defining inter-association between pathway, disease, drug and organ specificity, based on six criteria: 1) comprehensive pathway coverage; 2) gene/protein to pathway/disease/drug/organ association; 3) inter-association between pathway, disease, drug, and organ; 4) multiple and quantitative measurement of enrichment and inter-association; 5) assessment of enrichment and inter-association analysis with the context of the existing biological knowledge and a "gold standard" constructed from reputable and reliable sources; and 6) cross-linking of multiple available data sources.
IPAD is a comprehensive database covering about 22,498 genes, 25,469 proteins, 1956 pathways, 6704 diseases, 5615 drugs, and 52 organs integrated from databases including BioCarta, KEGG, NCI-Nature curated, Reactome, CTD, PharmGKB, DrugBank, and HOMER. The database has a web-based user interface that allows users to perform enrichment analysis from genes/proteins/molecules and inter-association analysis from a pathway, disease, drug, and organ.
Moreover, the quality of the database was validated with the context of the existing biological knowledge and a "gold standard" constructed from reputable and reliable sources. Two case studies were also presented to demonstrate: 1) self-validation of enrichment analysis and inter-association analysis on brain-specific markers, and 2) identification of previously undiscovered components by the enrichment analysis from a prostate cancer study.
IPAD is a new resource for analyzing, identifying, and validating pathway, disease, drug, organ specificity and their inter-associations. The statistical method we developed for enrichment and similarity measurement and the two criteria we described for setting the threshold parameters can be extended to other enrichment applications. Enriched pathways, diseases, drugs, organs and their inter-associations can be searched, displayed, and downloaded from our online user interface. The current IPAD database can help users address a wide range of biological pathway related, disease susceptibility related, drug target related and organ specificity related questions in human disease studies.
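The abstract does not spell out IPAD's enrichment statistic, so the following is a generic sketch rather than the authors' method. It uses the standard one-sided hypergeometric test that is commonly applied to gene-set enrichment. The function name and the example numbers are assumptions chosen for illustration.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """Upper-tail hypergeometric p-value P(X >= k).

    N: genes in the background
    K: background genes annotated to the pathway (or disease, drug, organ)
    n: genes in the query list
    k: query genes that carry the annotation
    """
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible overlaps contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical numbers: 20,000 background genes, a 100-gene pathway,
# a 200-gene query list, and 8 genes in common.
p_value = hypergeom_enrichment_p(20000, 100, 200, 8)
print(p_value)
```

When many pathways are tested at once, the resulting p-values would normally be corrected for multiple testing before a threshold is applied.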
PMCID: PMC3439721  PMID: 23046449
8.  Molecular Concepts Analysis Links Tumors, Pathways, Mechanisms, and Drugs 
Neoplasia (New York, N.Y.)  2007;9(5):443-454.
Global molecular profiling of cancers has shown broad utility in delineating pathways and processes underlying disease, in predicting prognosis and response to therapy, and in suggesting novel treatments. To gain further insights from such data, we have integrated and analyzed a comprehensive collection of “molecular concepts” representing > 2500 cancer-related gene expression signatures from Oncomine and manual curation of the literature, drug treatment signatures from the Connectivity Map, target gene sets from genome-scale regulatory motif analyses, and reference gene sets from several gene and protein annotation databases. We computed pairwise association analysis on all 13,364 molecular concepts and identified > 290,000 significant associations, generating hypotheses that link cancer types and subtypes, pathways, mechanisms, and drugs. To navigate a network of associations, we developed an analysis platform, the Molecular Concepts Map. We demonstrate the utility of the approach by highlighting molecular concepts analyses of Myc pathway activation, breast cancer relapse, and retinoic acid treatment.
PMCID: PMC1877973  PMID: 17534450
Cancer; bioinformatics; gene expression signature; network; oncomine
9.  A pharmacogenomic method for individualized prediction of drug sensitivity 
Using valproic acid as an example, the authors demonstrate that drug response signatures derived from genome-wide expression data can identify individuals likely to respond to a drug, and propose that this method could select optimal populations for clinical trials of new therapies.
Drug response signatures that accurately reflect the cellular response to a drug can be generated from Connectivity Map and publicly available gene expression data. Predictions from the drug response signature for valproic acid correlate with sensitivity to valproic acid in breast cancer cell lines and patient tumors grown in three-dimensional culture and mouse xenografts. The MATCH algorithm provides an efficient approach for using genome-wide gene expression data to identify a target population for a drug prior to clinical trials. MATCH can predict drug sensitivity in tumors without knowledge of mechanism of action.
Unlike traditional chemotherapy, targeted cancer therapies are expected to work in only a subset of people with a particular cancer. However, biomarkers of response are not always known before clinical trial initiation. We present MATCH (Merging genomic and pharmacologic Analyses for Therapy CHoice), an algorithm for using genome-wide gene expression data to identify and validate a genomic biomarker of sensitivity (see Figure 1). Our proof-of-principle example is valproic acid (VPA), but we also show that an estrogen blocking drug currently used for breast cancer and a B-RAF inhibitor in trials for melanoma give predictions that correspond to their clinical uses.
We use genome-wide gene expression data from treated and untreated samples from the Connectivity Map to generate a VPA response signature. We validate that the VPA signature can identify treated and untreated cells in an independent data set of normal cells and in independent samples from the Connectivity Map. The AUC for the ROC curve is 0.86. We then apply the VPA signature to publicly available data sets from a panel of cancer cell lines and from primary tumor and normal tissue samples. These data suggest that there is a subset of women with breast cancer who will be sensitive to VPA. Finally, we validate that our predictions correlate with sensitivity to VPA in breast cancer cell lines grown in two-dimensional culture, primary breast tumor samples grown in three-dimensional culture, and in vivo mouse breast cancer xenografts. Together, these studies show that MATCH can identify cancer patients most likely to respond to a specific drug treatment.
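To make the reported AUC concrete, the sketch below computes the area under the ROC curve as the probability that a treated sample receives a higher signature score than an untreated one. The scores are invented for illustration and are unrelated to the study's data. The function name is likewise an assumption.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    Each (positive, negative) pair counts 1 if the positive scores higher,
    0.5 on a tie, and 0 otherwise.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Invented signature scores for treated and untreated samples.
treated = [0.9, 0.8, 0.4]
untreated = [0.7, 0.3, 0.2]
auc = roc_auc(treated, untreated)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to a signature that separates the two groups no better than chance, and 1.0 to perfect separation.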
Identifying the best drug for each cancer patient requires an efficient individualized strategy. We present MATCH (Merging genomic and pharmacologic Analyses for Therapy CHoice), an approach using public genomic resources and drug testing of fresh tumor samples to link drugs to patients. Valproic acid (VPA) is highlighted as a proof-of-principle. In order to predict specific tumor types with high probability of drug sensitivity, we create drug response signatures using publicly available gene expression data and assess sensitivity in a data set of >40 cancer types. Next, we evaluate drug sensitivity in matched tumor and normal tissue and exclude cancer types that are no more sensitive than normal tissue. From these analyses, breast tumors are predicted to be sensitive to VPA. A meta-analysis across breast cancer data sets shows that aggressive subtypes are most likely to be sensitive to VPA, but all subtypes have sensitive tumors. MATCH predictions correlate significantly with growth inhibition in cancer cell lines and three-dimensional cultures of fresh tumor samples. MATCH accurately predicts reduction in tumor growth rate following VPA treatment in patient tumor xenografts. MATCH uses genomic analysis with in vitro testing of patient tumors to select optimal drug regimens before clinical trial initiation.
PMCID: PMC3159972  PMID: 21772261
biomarkers; cancer; pharmacogenomics
10.  Structural and functional protein network analyses predict novel signaling functions for rhodopsin 
Proteomic analyses, literature mining, and structural data were combined to generate an extensive signaling network linked to the visual G protein-coupled receptor rhodopsin. Network analysis suggests novel signaling routes to cytoskeleton dynamics and vesicular trafficking.
Using a shotgun proteomic approach, we identified the protein inventory of the light sensing outer segment of the mammalian photoreceptor. These data, combined with literature mining, structural modeling, and computational analysis, offer a comprehensive view of signal transduction downstream of the visual G protein-coupled receptor rhodopsin. The network suggests novel signaling branches downstream of rhodopsin to cytoskeleton dynamics and vesicular trafficking. The network serves as a basis for elucidating physiological principles of photoreceptor function and suggests potential disease-associated proteins.
Photoreceptor cells are neurons capable of converting light into electrical signals. The rod outer segment (ROS) region of the photoreceptor cells is a cellular structure made of a stack of around 800 closed membrane disks loaded with rhodopsin (Liang et al, 2003; Nickell et al, 2007). In disc membranes, rhodopsin arranges itself into paracrystalline dimer arrays, enabling optimal association with the heterotrimeric G protein transducin as well as additional regulatory components (Ciarkowski et al, 2005). Disruption of these highly regulated structures and processes by germline mutations is the cause of severe blinding diseases such as retinitis pigmentosa, macular degeneration, or congenital stationary night blindness (Berger et al, 2010).
Traditionally, signal transduction networks have been studied by combining biochemical and genetic experiments addressing the relations among a small number of components. More recently, large throughput experiments using different techniques like two hybrid or co-immunoprecipitation coupled to mass spectrometry have added a new level of complexity (Ito et al, 2001; Gavin et al, 2002, 2006; Ho et al, 2002; Rual et al, 2005; Stelzl et al, 2005). However, in these studies, space, time, and the fact that many interactions detected for a particular protein are not compatible, are not taken into consideration. Structural information can help discriminate between direct and indirect interactions and more importantly it can determine if two or more predicted partners of any given protein or complex can simultaneously bind a target or rather compete for the same interaction surface (Kim et al, 2006).
In this work, we build a functional and dynamic interaction network centered on rhodopsin on a systems level, using six steps: In step 1, we experimentally identified the proteomic inventory of the porcine ROS, and we compared our data set with a recent proteomic study from bovine ROS (Kwok et al, 2008). The union of the two data sets was defined as the ‘initial experimental ROS proteome'. After removal of contaminants and applying filtering methods, a ‘core ROS proteome', consisting of 355 proteins, was defined.
In step 2, proteins of the core ROS proteome were assigned to six functional modules: (1) vision, signaling, transporters, and channels; (2) outer segment structure and morphogenesis; (3) housekeeping; (4) cytoskeleton and polarity; (5) vesicles formation and trafficking, and (6) metabolism.
In step 3, a protein-protein interaction network was constructed based on the literature mining. Since for most of the interactions experimental evidence was co-immunoprecipitation, or pull-down experiments, and in addition many of the edges in the network are supported by single experimental evidence, often derived from high-throughput approaches, we refer to this network, as ‘fuzzy ROS interactome'. Structural information was used to predict binary interactions, based on the finding that similar domain pairs are likely to interact in a similar way (‘nature repeats itself') (Aloy and Russell, 2002). To increase the confidence in the resulting network, edges supported by a single evidence not coming from yeast two-hybrid experiments were removed, exception being interactions where the evidence was the existence of a three-dimensional structure of the complex itself, or of a highly homologous complex. This curated static network (‘high-confidence ROS interactome') comprises 660 edges linking the majority of the nodes. By considering only edges supported by at least one evidence of direct binary interaction, we end up with a ‘high-confidence binary ROS interactome'. We next extended the published core pathway (Dell'Orco et al, 2009) using evidence from our high-confidence network. We find several new direct binary links to different cellular functional processes (Figure 4): the active rhodopsin interacts with Rac1 and the GTP form of Rho. There is also a connection between active rhodopsin and Arf4, as well as PDEδ with Rab13 and the GTP-bound form of Arl3 that links the vision cycle to vesicle trafficking and structure. We see a connection between PDEδ with prenyl-modified proteins, such as several small GTPases, as well as with rhodopsin kinase. Further, our network reveals several direct binary connections between Ca2+-regulated proteins and cytoskeleton proteins; these are CaMK2A with actinin, calmodulin with GAP43 and S1008, and PKC with 14-3-3 family members.
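The edge-filtering rule described in step 3 can be written as a small predicate. This is a sketch under assumptions: the evidence labels ("y2h", "structure", "co-ip", "pull-down"), the node names, and the data layout are hypothetical, and each list entry is treated as one independent piece of evidence.

```python
def keep_edge(evidence):
    """Decide whether an interaction stays in the high-confidence network.

    An edge is kept if it has two or more pieces of evidence, or if its
    single piece of evidence is a yeast two-hybrid result or a
    three-dimensional structure of the (or a highly homologous) complex.
    """
    if len(evidence) >= 2:
        return True
    return len(evidence) == 1 and evidence[0] in {"y2h", "structure"}


# Hypothetical edges with their supporting evidence.
edges = {
    ("A", "B"): ["co-ip"],               # single, not Y2H or structure: removed
    ("A", "C"): ["co-ip", "pull-down"],  # two pieces of evidence: kept
    ("B", "C"): ["y2h"],                 # single Y2H: kept
    ("C", "D"): ["structure"],           # structural exception: kept
}

high_confidence = {pair for pair, ev in edges.items() if keep_edge(ev)}
print(sorted(high_confidence))
```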
In step 4, part of the network was experimentally validated using three different approaches to identify physical protein associations that would occur under physiological conditions: (i) Co-segregation/co-sedimentation experiments, (ii) immunoprecipitations combined with mass spectrometry and/or subsequent immunoblotting, and (iii) utilizing the glycosylated N-terminus of rhodopsin to isolate its associated protein partners by Concanavalin A affinity purification. In total, 60 co-purification and co-elution experiments supported interactions that were already in our literature network, and new evidence from 175 co-IP experiments in this work was added. Next, we aimed to provide additional independent experimental confirmation for two of the novel networks and functional links proposed based on the network analysis: (i) the proposed complex between Rac1, RhoA, CRMP-2, tubulin, and ROCK II in ROS was investigated by culturing retinal explants in the presence of a ROCK II-specific inhibitor (Figure 6). While morphology of the retinas treated with ROCK II inhibitor appeared normal, immunohistochemistry analyses revealed several alterations on the protein level. (ii) We supported the hypothesis that PDEδ could function as a GDI for Rac1 in ROS, by demonstrating that PDEδ and Rac1 co-localize in ROS and that PDEδ could dissociate Rac1 from ROS membranes in vitro.
In step 5, we use structural information to distinguish between mutually compatible (‘AND') or excluded (‘XOR') interactions. This enables breaking a network of nodes and edges into functional machines or sub-networks/modules. In the vision branch, both ‘AND' and ‘XOR' gates synergize. This may allow dynamic tuning of light and dark states. However, all connections from the vision module to other modules are ‘XOR' connections suggesting that competition, in connection with local protein concentration changes, could be important for transmitting signals from the core vision module.
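One simple way to picture the 'AND' versus 'XOR' distinction is to compare the binding surfaces that two partners use on a shared protein. The abstract does not state the exact structural criterion, so the overlap test below is an assumed simplification, and the partner names and residue indices are invented.

```python
from itertools import combinations


def classify_partner_pairs(interfaces):
    """Label each pair of partners of one hub protein as 'AND' or 'XOR'.

    interfaces: dict mapping a partner name to the set of hub residue
    indices it contacts. Overlapping contact sets mean the partners
    compete for the same surface ('XOR'); disjoint sets mean they could
    bind simultaneously ('AND').
    """
    result = {}
    for a, b in combinations(sorted(interfaces), 2):
        result[(a, b)] = "XOR" if interfaces[a] & interfaces[b] else "AND"
    return result


# Invented contact sets on a hypothetical hub protein.
interfaces = {
    "partner1": {10, 11, 12},
    "partner2": {12, 13},
    "partner3": {40, 41},
}

gates = classify_partner_pairs(interfaces)
for pair, gate in gates.items():
    print(pair, gate)
```

A real analysis would also need to account for steric clashes between partners bound at non-overlapping sites, which this sketch ignores.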
In the last step, we map and functionally characterize the known mutations that produce blindness.
In summary, this represents the first comprehensive, dynamic, and integrative rhodopsin signaling network, which can be the basis for integrating and mapping newly discovered disease mutants, to guide protein or signaling branch-specific therapies.
Orchestration of signaling, photoreceptor structural integrity, and maintenance needed for mammalian vision remain enigmatic. By integrating three proteomic data sets, literature mining, computational analyses, and structural information, we have generated a multiscale signal transduction network linked to the visual G protein-coupled receptor (GPCR) rhodopsin, the major protein component of rod outer segments. This network was complemented by domain decomposition of protein–protein interactions and then qualified for mutually exclusive or mutually compatible interactions and ternary complex formation using structural data. The resulting information not only offers a comprehensive view of signal transduction induced by this GPCR but also suggests novel signaling routes to cytoskeleton dynamics and vesicular trafficking, predicting an important level of regulation through small GTPases. Further, it demonstrates a specific disease susceptibility of the core visual pathway due to the uniqueness of its components present mainly in the eye. As a comprehensive multiscale network, it can serve as a basis to elucidate the physiological principles of photoreceptor function, identify potential disease-associated genes and proteins, and guide the development of therapies that target specific branches of the signaling pathway.
PMCID: PMC3261702  PMID: 22108793
protein interaction network; rhodopsin signaling; structural modeling
11.  DEAR1 Is a Dominant Regulator of Acinar Morphogenesis and an Independent Predictor of Local Recurrence-Free Survival in Early-Onset Breast Cancer 
PLoS Medicine  2009;6(5):e1000068.
Ann Killary and colleagues describe a new gene that is genetically altered in breast tumors, and that may provide a new breast cancer prognostic marker.
Breast cancer in young women tends to have a natural history of aggressive disease for which rates of recurrence are higher than in breast cancers detected later in life. Little is known about the genetic pathways that underlie early-onset breast cancer. Here we report the discovery of DEAR1 (ductal epithelium–associated RING Chromosome 1), a novel gene encoding a member of the TRIM (tripartite motif) subfamily of RING finger proteins, and provide evidence for its role as a dominant regulator of acinar morphogenesis in the mammary gland and as an independent predictor of local recurrence-free survival in early-onset breast cancer.
Methods and Findings
Suppression subtractive hybridization identified DEAR1 as a novel gene mapping to a region of high-frequency loss of heterozygosity (LOH) in a number of histologically diverse human cancers within Chromosome 1p35.1. In the breast epithelium, DEAR1 expression is limited to the ductal and glandular epithelium and is down-regulated in transition to ductal carcinoma in situ (DCIS), an early histologic stage in breast tumorigenesis. DEAR1 missense mutations and homozygous deletion (HD) were discovered in breast cancer cell lines and tumor samples. Introduction of the DEAR1 wild type and not the missense mutant alleles to complement a mutation in a breast cancer cell line, derived from a 36-year-old female with invasive breast cancer, initiated acinar morphogenesis in three-dimensional (3D) basement membrane culture and restored tissue architecture reminiscent of normal acinar structures in the mammary gland in vivo. Stable knockdown of DEAR1 in immortalized human mammary epithelial cells (HMECs) recapitulated the growth in 3D culture of breast cancer cell lines containing mutated DEAR1, in that shDEAR1 clones demonstrated disruption of tissue architecture, loss of apical basal polarity, diffuse apoptosis, and failure of lumen formation. Furthermore, immunohistochemical staining of a tissue microarray from a cohort of 123 young female breast cancer patients with a 20-year follow-up indicated that in early-onset breast cancer, DEAR1 expression serves as an independent predictor of local recurrence-free survival and correlates significantly with strong family history of breast cancer and the triple-negative phenotype (ER−, PR−, HER-2−) of breast cancers with poor prognosis.
Our data provide compelling evidence for the genetic alteration and loss of expression of DEAR1 in breast cancer, for the functional role of DEAR1 in the dominant regulation of acinar morphogenesis in 3D culture, and for the potential utility of an immunohistochemical assay for DEAR1 expression as an independent prognostic marker for stratification of early-onset disease.
Editors' Summary
Each year, more than one million women discover that they have breast cancer. This type of cancer begins when cells in the breast that line the milk-producing glands or the tubes that take the milk to the nipples (glandular and ductal epithelial cells, respectively) acquire genetic changes that allow them to grow uncontrollably and to move around the body (metastasize). The uncontrolled division leads to the formation of a lump that can be detected by mammography (a breast X-ray) or by manual breast examination. Breast cancer is treated by surgical removal of the lump or, if the cancer has started to spread, by removal of the whole breast (mastectomy). Surgery is usually followed by radiotherapy or chemotherapy. These “adjuvant” therapies are designed to kill any remaining cancer cells but can make patients very ill. Generally speaking, the outlook for women with breast cancer is good. In the US, for example, nearly 90% of affected women are still alive five years after their diagnosis.
Why Was This Study Done?
Although breast cancer is usually diagnosed in women in their 50s or 60s, some women develop breast cancer much earlier. In these women, the disease is often very aggressive. Compared to older women, young women with breast cancer have a lower overall survival rate and their cancer is more likely to recur locally or to metastasize. It would be useful to be able to recognize those younger women at the greatest risk of cancer recurrence so that they could be offered intensive surveillance and adjuvant therapy; those women at a lower risk could have gentler treatments. To achieve this type of “stratification,” the genetic changes that underlie breast cancer in young women need to be identified. In this study, the researchers discover a gene that is genetically altered (by mutations or deletion) in early-onset breast cancer and then investigate whether its expression can predict outcomes in women with this disease.
What Did the Researchers Do and Find?
The researchers used “suppression subtractive hybridization” to identify a new gene in a region of human Chromosome 1 where loss of heterozygosity (LOH; a genetic alteration associated with cancer development) frequently occurs. They called the gene DEAR1 (ductal epithelium-associated RING Chromosome 1) to indicate that it is expressed in ductal and glandular epithelial cells and encodes a “RING finger” protein (specifically, a subtype called a TRIM protein; RING finger proteins such as BRCA1 and BRCA2 have been implicated in early cancer development and in a large fraction of inherited breast cancers). DEAR1 expression was reduced or lost in several ductal carcinomas in situ (a local abnormality that can develop into breast cancer) and advanced breast cancers, the researchers report. Furthermore, many breast tumors carried DEAR1 missense mutations (genetic changes that interfere with the normal function of the DEAR1 protein) or had lost both copies of DEAR1 (the human genome contains two copies of most genes). To determine the function of DEAR1, the researchers introduced a normal copy of DEAR1 into a breast cancer cell that had a mutation in DEAR1. They then examined the growth of these genetically manipulated cells in special three-dimensional cultures. The breast cancer cells without DEAR1 grew rapidly without an organized structure while the breast cancer cells containing the introduced copy of DEAR1 formed structures that resembled normal breast acini (sac-like structures that secrete milk). In normal human mammary epithelial cells, the researchers silenced DEAR1 expression and also showed that without DEAR1, the normal mammary cells lost their ability to form proper acini.
Finally, the researchers report that DEAR1 expression (detected “immunohistochemically”) was frequently lost in women who had had early-onset breast cancer and that the loss of DEAR1 expression correlated with reduced local recurrence-free survival, a strong family history of breast cancer and with a breast cancer subtype that has a poor outcome.
What Do These Findings Mean?
These findings indicate that genetic alteration and loss of expression of DEAR1 are common in breast cancer. Although laboratory experiments may not necessarily reflect what happens in people, the results from the three-dimensional culture of breast epithelial cells suggest that DEAR1 may regulate the normal acinar structure of the breast. Consequently, loss of DEAR1 expression could be an early event in breast cancer development. Most importantly, the correlation between DEAR1 expression and both local recurrence in early-onset breast cancer and a breast cancer subtype with a poor outcome suggests that it might be possible to use DEAR1 expression to identify women with early-onset breast cancer who have an increased risk of local recurrence so that they get the most appropriate treatment for their cancer.
Additional Information
This study is further discussed in a PLoS Medicine Perspective by Senthil Muthuswamy
The US National Cancer Institute provides detailed information for patients and health professionals on all aspects of breast cancer, including information on genetic alterations in breast cancer (in English and Spanish)
The MedlinePlus Encyclopedia provides information for patients about breast cancer; MedlinePlus also provides links to many other breast cancer resources (in English and Spanish)
The UK charities Cancerbackup (now merged with MacMillan Cancer Support) and Cancer Research UK also provide detailed information about breast cancer
PMCID: PMC2673042  PMID: 19536326
12.  The BARD1 Cys557Ser Variant and Breast Cancer Risk in Iceland 
PLoS Medicine  2006;3(7):e217.
Most, if not all, of the cellular functions of the BRCA1 protein are mediated through heterodimeric complexes composed of BRCA1 and a related protein, BARD1. Some breast-cancer-associated BRCA1 missense mutations disrupt the function of the BRCA1/BARD1 complex. It is therefore pertinent to determine whether variants of BARD1 confer susceptibility to breast cancer. Recently, a missense BARD1 variant, Cys557Ser, was reported to be at increased frequencies in breast cancer families. We investigated the role of the BARD1 Cys557Ser variant in a population-based cohort of 1,090 Icelandic patients with invasive breast cancer and 703 controls. We then used a computerized genealogy of the Icelandic population to study the relationships between the Cys557Ser variant and familial clustering of breast cancer.
Methods and Findings
The Cys557Ser allele was present at a frequency of 0.028 in patients with invasive breast cancer and 0.016 in controls (odds ratio [OR] = 1.82, 95% confidence interval [CI] 1.11–3.01, p = 0.014). The allelic frequency was 0.037 in a high-predisposition group of cases defined by having a family history of breast cancer, early onset of breast cancer, or multiple primary breast cancers (OR = 2.41, 95% CI 1.22–4.75, p = 0.015). Carriers of the common Icelandic BRCA2 999del5 mutation were found to have their risk of breast cancer further increased if they also carried the BARD1 variant: the frequency of the BARD1 variant allele was 0.047 (OR = 3.11, 95% CI 1.16–8.40, p = 0.046) in 999del5 carriers with breast cancer. This suggests that the lifetime probability of a BARD1 Cys557Ser/BRCA2 999del5 double carrier developing breast cancer could approach certainty. Cys557Ser carriers, with or without the BRCA2 mutation, had an increased risk of subsequent primary breast tumors after the first breast cancer diagnosis compared to non-carriers. Lobular and medullary breast carcinomas were overrepresented amongst Cys557Ser carriers. We found that an excess of ancestors of contemporary carriers lived in a single county in the southeast of Iceland and that all carriers shared a SNP haplotype, which is suggestive of a founder event. Cys557Ser was found on the same SNP haplotype background in the HapMap Project CEPH sample of Utah residents.
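The relationship between the allele frequencies and the odds ratios above can be checked with a short calculation. The sketch below is illustrative only. Using the rounded frequencies 0.028 and 0.016 gives an odds ratio of about 1.77, close to but not identical to the reported 1.82, which was presumably computed from the unrounded allele counts. The confidence-interval function uses the standard Woolf (log) method, and the counts passed to it are hypothetical because the abstract does not report them.

```python
from math import exp, log, sqrt


def odds_ratio_from_freqs(p_case, p_control):
    """Allelic odds ratio from carrier-allele frequencies in cases and controls."""
    return (p_case / (1 - p_case)) / (p_control / (1 - p_control))


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval.

    a, b: variant / other allele counts in cases
    c, d: variant / other allele counts in controls
    z:    normal quantile (1.96 for a 95% interval)
    """
    or_ = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return or_, exp(log(or_) - z * se), exp(log(or_) + z * se)


# Rounded frequencies quoted in the abstract.
approx = odds_ratio_from_freqs(0.028, 0.016)
print(round(approx, 2))

# Hypothetical allele counts, used only to demonstrate the interval.
print(odds_ratio_ci(10, 90, 5, 95))
```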
Our findings suggest that BARD1 Cys557Ser is an ancient variant that confers risk of single and multiple primary breast cancers, and this risk extends to carriers of the BRCA2 999del5 mutation.
Editors' Summary
About 13% of women (one in eight women) will develop breast cancer during their lifetime, but many factors affect the likelihood of any individual woman developing this disease, for example, whether she has had children and at what age, when she started and stopped her periods, and her exposure to certain chemicals or radiation. She may also have inherited a defective gene that affects her risk of developing breast cancer. Some 5%–10% of all breast cancers are familial, or inherited. In 20% of these cases, the gene that is defective is BRCA1 or BRCA2. Inheriting a defective copy of one of these genes greatly increases a woman's risk of developing breast cancer, while researchers think that the other inherited genes that predispose to breast cancer—most of which have not been identified yet—have a much weaker effect. These are described as low-penetrance genes. Inheriting one such gene only slightly increases breast cancer risk; a woman has to inherit several to increase her lifetime risk of cancer significantly.
Why Was This Study Done?
It is important to identify these additional predisposing gene variants because they might provide insights into why breast cancer develops, how to prevent it, and how to treat it. To find low-penetrance genes, researchers do case–control association studies. They find a large group of women with breast cancer (cases) and a similar group of women without cancer (controls), and examine how often a specific gene variant occurs in the two groups. If the variant is found more often in the cases than in the controls, it might be a variant that increases a woman's risk of developing breast cancer.
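The case–control comparison described here is usually summarized as an odds ratio. As a rough sketch, the Python below computes an allelic odds ratio from the rounded allele frequencies quoted in the abstract (0.028 in cases, 0.016 in controls). The function name `allelic_odds_ratio` is illustrative and not part of the study. Because the inputs are rounded frequencies rather than the underlying allele counts, the result only approximates the published OR of 1.82.

```python
def allelic_odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Odds of carrying the variant allele in cases divided by the odds in controls.

    Inputs are allele frequencies (proportions), not raw counts; this is an
    illustrative helper, not the authors' analysis code.
    """
    for f in (freq_cases, freq_controls):
        if not 0.0 < f < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


if __name__ == "__main__":
    # Rounded frequencies from the abstract: cases 0.028, controls 0.016.
    print(round(allelic_odds_ratio(0.028, 0.016), 2))
```

A confidence interval, such as the one reported in the abstract, would additionally require the allele counts in each group, which are not given here.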
What Did the Researchers Do and Find?
The researchers involved in this study recruited Icelandic women who had had breast cancer and unaffected women, and looked for a specific variant—the Cys557Ser allele—of a gene called BARD1. They chose BARD1 because the protein it encodes interacts with the protein encoded by BRCA1. Because defects in BRCA1 increase the risk of breast cancer, defects in an interacting protein might have a similar effect. In addition, the Cys557Ser allele has been implicated in breast cancer in other studies. The researchers found that the Cys557Ser allele was nearly twice as common in women with breast cancer as in control women. It was also more common (but not by much) in women who had a family history of breast cancer or who had developed breast cancer more than once. And having the Cys557Ser allele seemed to increase the already high risk of breast cancer in women who had a BRCA2 variant (known as BRCA2 999del5) that accounts for 40% of inherited breast cancer risk in Iceland.
What Do These Findings Mean?
These results indicate that inheriting the BARD1 Cys557Ser allele increases a woman's breast cancer risk but that she is unlikely to have a family history of the disease. Because carrying the Cys557Ser allele only slightly increases a woman's risk of breast cancer, for most women there is no clinical reason to test for this variant. Eventually, when all the low-penetrance genes that contribute to breast cancer risk have been identified, it might be helpful to screen women for the full set to determine whether they are at high risk of developing breast cancer. This will not happen for many years, however, since there might be tens or hundreds of these genes. For women who carry BRCA2 999del5, the situation might be different. It might be worth testing these women for the BARD1 Cys557Ser allele, the researchers explain, because the lifetime probability of developing breast cancer in women carrying both variants might approach 100%. This finding has clinical implications in terms of counseling and monitoring, as does the observation that Cys557Ser carriers have an increased risk of a second, independent breast cancer compared to non-carriers. However, all these findings need to be confirmed in other groups of patients before anyone is routinely tested for the BARD1 Cys557Ser allele.
Additional Information.
Please access these Web sites via the online version of this summary at
• MedlinePlus pages about breast cancer
• Information on breast cancer from the United States National Cancer Institute
• Information on inherited breast cancer from the United States National Human Genome Research Institute
• United States National Cancer Institute information on genetic testing for BRCA1 and BRCA2 variants
• GeneTests pages on the involvement of BRCA1 and BRCA2 in hereditary breast and ovarian cancer
• Cancer Research UK's page on breast cancer statistics
In a population-based cohort of 1090 Icelandic patients, a Cys557Ser missense variant of the BARD1 gene, which interacts with BRCA1, increased the risk of single and multiple primary breast cancers.
PMCID: PMC1479388  PMID: 16768547
13.  An integrative analysis of cellular contexts, miRNAs and mRNAs reveals network clusters associated with antiestrogen-resistant breast cancer cells 
BMC Genomics  2012;13:732.
A major goal of the field of systems biology is to translate genome-wide profiling data (e.g., mRNAs, miRNAs) into interpretable functional networks. However, employing a systems biology approach to better understand the complexities underlying drug resistance phenotypes in cancer continues to represent a significant challenge to the field. Previously, we derived two drug-resistant breast cancer sublines (tamoxifen- and fulvestrant-resistant cell lines) from the MCF7 breast cancer cell line and performed genome-wide mRNA and microRNA profiling to identify differential molecular pathways underlying acquired resistance to these important antiestrogens. In the current study, to further define molecular characteristics of acquired antiestrogen resistance we constructed an “integrative network”. We combined joint miRNA-mRNA expression profiles, cancer contexts, miRNA-target mRNA relationships, and miRNA upstream regulators. In particular, to reduce the probability of false positive connections in the network, experimentally validated, rather than prediction-oriented, databases were utilized to obtain connectivity. Also, to improve biological interpretation, cancer contexts were incorporated into the network connectivity.
Based on the integrative network, we extracted “substructures” (network clusters) representing the drug-resistant states (tamoxifen- or fulvestrant-resistant cells) compared to the drug-sensitive state (parental MCF7 cells). We identified previously undescribed network clusters that contribute to antiestrogen resistance consisting of miR-146a, -27a, -145, -21, -155, -15a, -125b, and let-7s, in addition to the previously described miR-221/222.
By integrating miRNA-related networks, gene/miRNA expression, and text mining, the current study provides a computational systems biology approach for further investigating the molecular mechanism underlying antiestrogen resistance in breast cancer cells. In addition, new miRNA clusters that contribute to antiestrogen resistance were identified, and they warrant further investigation.
PMCID: PMC3560207  PMID: 23270413
Bioinformatics; miRNA; Network; Breast cancer; Antiestrogen resistance
14.  PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media 
Journal of biomedical informatics  2013;46(6):10.1016/j.jbi.2013.07.007.
The role of social media in biomedical knowledge mining, including clinical, medical and healthcare informatics, prescription drug abuse epidemiology and drug pharmacology, has become increasingly significant in recent years. Social media offers opportunities for people to share opinions and experiences freely in online communities, which may contribute information beyond the knowledge of domain professionals. This paper describes the development of a novel Semantic Web platform called PREDOSE (PREscription Drug abuse Online Surveillance and Epidemiology), which is designed to facilitate the epidemiologic study of prescription (and related) drug abuse practices using social media. PREDOSE uses web forum posts and domain knowledge, modeled in a manually created Drug Abuse Ontology (DAO) (pronounced dow), to facilitate the extraction of semantic information from User Generated Content (UGC). A combination of lexical, pattern-based and semantics-based techniques is used together with the domain knowledge to extract fine-grained semantic information from UGC. In a previous study, PREDOSE was used to obtain the datasets from which new knowledge in drug abuse research was derived. Here, we report on various platform enhancements, including an updated DAO, new components for relationship and triple extraction, and tools for content analysis, trend detection and emerging patterns exploration, which enhance the capabilities of the PREDOSE platform. Given these enhancements, PREDOSE is now more equipped to impact drug abuse research by alleviating traditional labor-intensive content analysis tasks.
Using custom web crawlers that scrape UGC from publicly available web forums, PREDOSE first automates the collection of web-based social media content for subsequent semantic annotation. The annotation scheme is modeled in the DAO, and includes domain specific knowledge such as prescription (and related) drugs, methods of preparation, side effects, routes of administration, etc. The DAO is also used to help recognize three types of data, namely: 1) entities, 2) relationships and 3) triples. PREDOSE then uses a combination of lexical and semantic-based techniques to extract entities and relationships from the scraped content, and a top-down approach for triple extraction that uses patterns expressed in the DAO. In addition, PREDOSE uses publicly available lexicons to identify initial sentiment expressions in text, and then a probabilistic optimization algorithm (from related research) to extract the final sentiment expressions. Together, these techniques enable the capture of fine-grained semantic information from UGC, and querying, search, trend analysis and overall content analysis of social media related to prescription drug abuse. Moreover, extracted data are also made available to domain experts for the creation of training and test sets for use in evaluation and refinements in information extraction techniques.
A recent evaluation of the information extraction techniques applied in the PREDOSE platform indicates 85% precision and 72% recall in entity identification, on a manually created gold standard dataset. In another study, PREDOSE achieved 36% precision in relationship identification and 33% precision in triple extraction, through manual evaluation by domain experts. Given the complexity of the relationship and triple extraction tasks and the abstruse nature of social media texts, we interpret these as favorable initial results. Extracted semantic information is currently in use in an online discovery support system, by prescription drug abuse researchers at the Center for Interventions, Treatment and Addictions Research (CITAR) at Wright State University.
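Precision and recall figures of this kind come from comparing extracted items against a gold standard. The following is a minimal sketch, assuming both are available as Python sets. The function name and the example drug mentions are hypothetical and are not taken from PREDOSE or its evaluation data.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[str], gold: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) of predicted items against a gold standard."""
    true_positives = len(predicted & gold)
    # Precision: share of predicted items that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold-standard items that were found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical entity mentions, for illustration only.
    gold = {"oxycodone", "buprenorphine", "loperamide", "heroin"}
    predicted = {"oxycodone", "buprenorphine", "naloxone"}
    p, r = precision_recall(predicted, gold)
    print(f"precision={p:.2f} recall={r:.2f}")
```

Note that the relationship and triple figures above report precision only, which needs just a manual judgment of each extracted item rather than a complete gold standard.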
A comprehensive platform for entity, relationship, triple and sentiment extraction from such abstruse texts has never been developed for drug abuse research. PREDOSE has already demonstrated the importance of mining social media by providing data from which new findings in drug abuse research were uncovered. Given the recent platform enhancements, including the refined DAO, components for relationship and triple extraction, and tools for content, trend and emerging pattern analysis, it is expected that PREDOSE will play a significant role in advancing drug abuse epidemiology in future.
PMCID: PMC3844051  PMID: 23892295
Entity Identification; Relationship Extraction; Triple Extraction; Sentiment Extraction; Semantic Web; Drug Abuse Ontology; Prescription Drug Abuse; Epidemiology
15.  MalaCards: an integrated compendium for diseases and their annotation 
Comprehensive disease classification, integration and annotation are crucial for biomedical discovery. At present, disease compilation is incomplete, heterogeneous and often lacking systematic inquiry mechanisms. We introduce MalaCards, an integrated database of human maladies and their annotations, modeled on the architecture and strategy of the GeneCards database of human genes. MalaCards mines and merges 44 data sources to generate a computerized card for each of 16 919 human diseases. Each MalaCard contains disease-specific prioritized annotations, as well as inter-disease connections, empowered by the GeneCards relational database, its searches and GeneDecks set analyses. First, we generate a disease list from 15 ranked sources, using disease-name unification heuristics. Next, we use four schemes to populate MalaCards sections: (i) directly interrogating disease resources, to establish integrated disease names, synonyms, summaries, drugs/therapeutics, clinical features, genetic tests and anatomical context; (ii) searching GeneCards for related publications, and for associated genes with corresponding relevance scores; (iii) analyzing disease-associated gene sets in GeneDecks to yield affiliated pathways, phenotypes, compounds and GO terms, sorted by a composite relevance score and presented with GeneCards links; and (iv) searching within MalaCards itself, e.g. for additional related diseases and anatomical context. The latter forms the basis for the construction of a disease network, based on shared MalaCards annotations, embodying associations based on etiology, clinical features and clinical conditions. This broadly disposed network has a power-law degree distribution, suggesting that this might be an inherent property of such networks. Work in progress includes hierarchical malady classification, ontological mapping and disease set analyses, striving to make MalaCards an even more effective tool for biomedical research.
Database URL:
PMCID: PMC3625956  PMID: 23584832
16.  Cross-species chemogenomic profiling reveals evolutionarily conserved drug mode of action 
Chemogenomic screens were performed in both budding and fission yeasts, allowing for a cross-species comparison of drug–gene interaction networks. Drug–module interactions were more conserved than individual drug–gene interactions. Combination of data from both species can improve drug–module predictions and helps identify a compound's mode of action.
Understanding the molecular effects of chemical compounds in living cells is an important step toward rational therapeutics. Drug discovery aims to find compounds that will target a specific pathway or pathogen with minimal side effects. However, even when an effective drug is found, its mode of action (MoA) is typically not well understood. The lack of knowledge regarding a drug's MoA makes the drug discovery process slow and rational therapeutics incredibly difficult. More recently, different high-throughput methods have been developed that attempt to discern how a compound exerts its effects in cells. One of these methods relies on measuring the growth of cells carrying different mutations in the presence of the compounds of interest, commonly referred to as chemogenomics (Wuster and Babu, 2008). The differential growth of the different mutants provides clues as to what the compounds target in the cell (Figure 2). For example, if a drug inhibits a branch in a vital two-branch pathway, then mutations in the second branch might result in cell death if the mutants are grown in the presence of the drug (Figure 2C). As these compound–mutant functional interactions are expected to be relatively rare, one can assume that the growth rate of a mutant–drug combination should generally be equal to the product of the growth rate of the untreated mutant with the growth rate of the drug-treated wild type. This expectation is defined as the neutral model, and deviations from it provide a quantitative score that allows us to make informed predictions regarding a drug's MoA (Figure 2B; Parsons et al, 2006).
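The neutral model described above reduces to a few lines of arithmetic. The sketch below is illustrative only. Growth rates are assumed to be expressed relative to untreated wild type (so 1.0 means normal growth), and the function names are hypothetical. The raw deviation it returns is not the D-score itself, whose exact statistical definition is not given in this summary.

```python
def neutral_expectation(mutant_growth: float, drug_wt_growth: float) -> float:
    """Expected growth of a drug-treated mutant if drug and mutation do not interact.

    Both inputs are assumed to be relative to untreated wild type (= 1.0).
    """
    return mutant_growth * drug_wt_growth


def interaction_deviation(observed: float, mutant_growth: float,
                          drug_wt_growth: float) -> float:
    """Observed minus expected growth.

    Negative values suggest the mutant is more sensitive to the drug than
    expected; positive values suggest it is more resistant.
    """
    return observed - neutral_expectation(mutant_growth, drug_wt_growth)


if __name__ == "__main__":
    # Hypothetical numbers: mutant grows at 0.8, drug-treated wild type at 0.5.
    print(neutral_expectation(0.8, 0.5))
    print(interaction_deviation(0.1, 0.8, 0.5))  # grows worse than expected
    print(interaction_deviation(0.6, 0.8, 0.5))  # grows better than expected
```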
The availability of these high-throughput approaches now allows us to perform cross-species studies of functional interactions between compounds and genes. In this study, we have performed a quantitative analysis of compound–gene interactions for two fungal species (budding yeast (S. cerevisiae) and fission yeast (S. pombe)) that diverged from each other approximately 500–700 million years ago. A collection of 2957 compounds from the National Cancer Institute (NCI) were screened in both species for inhibition of wild-type cell growth. A total of 132 were found to be bioactive in both fungi and 9, along with 12 additional well-characterized drugs, were selected for subsequent screening. Mutant libraries of 727 and 438 gene deletions were used for S. cerevisiae and S. pombe, respectively, and these were selected based on availability of genetic interaction data from previous studies (Collins et al, 2007; Roguev et al, 2008; Fiedler et al, 2009) and contain an overlap of 190 one-to-one orthologs that can be directly compared. Deviations from the neutral expectation were quantified as drug–gene interactions scores (D-scores) for the 21 compounds against the deletion libraries. Replicates of both screens showed very high correlations (S. cerevisiae r=0.72, S. pombe r=0.76) and reproduced well previously known compound–gene interactions (Supplementary information). We then compared the D-scores for the 190 one-to-one orthologs present in the data set of both species. Despite the high reproducibility, we observed a very poor conservation of these compound–gene interaction scores across these species (r=0.13, Figure 4A).
Previous work had shown that, across these same species, genetic interactions within protein complexes were much more conserved than average genetic interactions (Roguev et al, 2008). Similarly, we observed a higher cross-species conservation of the compound–module (complex or pathway) interactions than the overall compound–gene interactions. Specifically, the data derived from fission yeast were a poor predictor of S. cerevisiae drug–gene interactions, but a good predictor of budding yeast compound–module connections (Figure 4B). Also, a combined score from both species improved the prediction of compound–module interactions, above the accuracy observed with the S. cerevisiae information alone, but this improvement was not observed for the prediction of drug–gene interactions (Figure 4B). Data from both species were used to predict drug–module interactions, and one specific interaction (compound NSC-207895 interaction with DNA repair complexes) was experimentally verified by showing that the compound activates the DNA damage repair pathway in three species (S. cerevisiae, S. pombe and H. sapiens).
To understand why the combination of chemogenomic data from two species might improve drug–module interaction predictions, we also analyzed previously published cross-species genetic–interaction data. We observed a significant correlation between the conservation of drug–gene and gene–gene interactions among the one-to-one orthologs (r=0.28, P-value=0.0078). Additionally, the strongest interactions of benomyl (a microtubule inhibitor) were to complexes that also had strong and conserved genetic interactions with microtubules (Figure 4C). We hypothesize that a significant number of the compound–gene interactions obtained from chemogenomic studies are not direct interactions with the physical target of the compounds, but include many indirect interactions that genetically interact with the main target(s). This would explain why the compound interaction networks show similar evolutionary patterns as the genetic interactions networks.
In summary, these results shed some light on the interplay between the evolution of genetic networks and the evolution of drug response. Understanding how genetic variability across different species might result in different sensitivity to drugs should improve our capacity to design treatments. Concretely, we hope that this line of research might one day help us create drugs and drug combinations that specifically affect a pathogen or diseased tissue, but not the host.
We present a cross-species chemogenomic screening platform using libraries of haploid deletion mutants from two yeast species, Saccharomyces cerevisiae and Schizosaccharomyces pombe. We screened a set of compounds of known and unknown mode of action (MoA) and derived quantitative drug scores (or D-scores), identifying mutants that are either sensitive or resistant to particular compounds. We found that compound–functional module relationships are more conserved than individual compound–gene interactions between these two species. Furthermore, we observed that combining data from both species allows for more accurate prediction of MoA. Finally, using this platform, we identified a novel small molecule that acts as a DNA damaging agent and demonstrate that its MoA is conserved in human cells.
PMCID: PMC3018166  PMID: 21179023
chemogenomics; evolution; modularity
17.  Cancer Screening with Digital Mammography for Women at Average Risk for Breast Cancer, Magnetic Resonance Imaging (MRI) for Women at High Risk 
Executive Summary
The purpose of this review is to determine the effectiveness of 2 separate modalities, digital mammography (DM) and magnetic resonance imaging (MRI), relative to film mammography (FM), in the screening of women asymptomatic for breast cancer. A third analysis assesses the effectiveness and safety of the combination of MRI plus mammography (MRI plus FM) in screening of women at high risk. An economic analysis was also conducted.
Research Questions
How do the sensitivity and specificity of DM compare to those of FM?
How do the sensitivity and specificity of MRI compare to those of FM?
How do the recall rates compare among these screening modalities, and what effect might this have on radiation exposure? What are the risks associated with radiation exposure?
How do the sensitivity and specificity of the combination of MRI plus FM compare to those of either MRI or FM alone?
What are the economic considerations?
Clinical Need
The effectiveness of FM with respect to breast cancer mortality in the screening of asymptomatic average-risk women over the age of 50 has been established. However, based on a Medical Advisory Secretariat review completed in March 2006, screening is not recommended for women between the ages of 40 and 49 years. Guidelines published by the Canadian Task Force on Preventive Care recommend mammography screening every 1 to 2 years for women aged 50 years and over, hence, the inclusion of such women in organized breast cancer screening programs. In addition to the uncertainty of the effectiveness of mammography screening from the age of 40 years, there is concern over the risks associated with mammographic screening for the 10 years between the ages of 40 and 49 years.
The lack of effectiveness of mammography screening starting at the age of 40 years (with respect to breast cancer mortality) is based on the assumption that the ability to detect cancer decreases with increased breast tissue density. As breast density is highest in the premenopausal years (approximately 23% of postmenopausal and 53% of premenopausal women having at least 50% of the breast occupied by high-density tissue), mammography screening is not promoted in Canada or in many other countries for women under the age of 50 at average risk for breast cancer. It is important to note, however, that screening of premenopausal women (i.e., younger than 50 years of age) at high risk for breast cancer by virtue of a family history of cancer or a known genetic predisposition (e.g., having tested positive for the breast cancer genes BRCA1 and/or BRCA2) is appropriate. Thus, this review will assess the effectiveness of breast cancer screening with modalities other than film mammography, specifically DM and MRI, for both pre/perimenopausal and postmenopausal age groups.
International estimates of the epidemiology of breast cancer show that the incidence of breast cancer is increasing for all ages combined whereas mortality is decreasing, though at a slower rate. The observed decreases in mortality rates may be attributable to screening, in addition to advances in breast cancer therapy over time. Decreases in mortality attributable to screening may be a result of the earlier detection and treatment of invasive cancers, in addition to the increased detection of ductal carcinoma in situ (DCIS), of which certain subpathologies are less lethal. Evidence from the Surveillance, Epidemiology and End Results (better known as SEER) cancer registry in the United States indicates that the age-adjusted incidence of DCIS has increased almost 10-fold over a 20-year period, from 2.7 to 25 per 100,000.
There is a 4-fold lower incidence of breast cancer in the 40 to 49 year age group than in the 50 to 69 year age group (approximately 140 per 100,000 versus 500 per 100,000 women, respectively). The sensitivity of FM is also lower among younger women (approximately 75%) than for women aged over 50 years (approximately 85%). Specificity is approximately 80% for younger women versus 90% for women over 50 years. The increased density of breast tissue in younger women is likely responsible for the decreased accuracy of FM.
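To see how these figures combine, the sketch below converts a cancer rate, sensitivity and specificity into expected counts per 100,000 women screened. It rests on a strong simplifying assumption that is not made in the review: the quoted incidence is treated as the proportion of screened women who have cancer at a single screening round. The function name and output keys are illustrative.

```python
from typing import Dict


def expected_screening_outcomes(cases_per_100k: float, sensitivity: float,
                                specificity: float,
                                screened: int = 100_000) -> Dict[str, float]:
    """Expected screening outcomes for one round, under the stated assumption.

    cases_per_100k is treated as prevalence among screened women, which is an
    assumption made for illustration rather than a claim of the review.
    """
    with_cancer = screened * cases_per_100k / 100_000
    without_cancer = screened - with_cancer
    true_pos = sensitivity * with_cancer
    false_pos = (1.0 - specificity) * without_cancer
    return {
        "true_positives": true_pos,
        "false_negatives": with_cancer - true_pos,
        "false_positives": false_pos,
        "true_negatives": without_cancer - false_pos,
    }


if __name__ == "__main__":
    # Approximate FM figures quoted above for each age group.
    print("40-49:", expected_screening_outcomes(140, 0.75, 0.80))
    print("50-69:", expected_screening_outcomes(500, 0.85, 0.90))
```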
Treatment options for breast cancer vary with the stage of disease (based on tumor size, involvement of surrounding tissue, and number of affected axillary lymph nodes) and its pathology, and may include a combination of surgery, chemotherapy and/or radiotherapy. Surgery is the first-line intervention for biopsy-confirmed tumors. The subsequent use of radiation, chemotherapy or hormonal treatments is dependent on the histopathologic characteristics of the tumor and the type of surgery. There is controversy regarding the optimal treatment of DCIS, which is considered a noninvasive tumour.
Women at high risk for breast cancer are defined as genetic carriers of the more commonly known breast cancer genes (BRCA1, BRCA2, TP53), first degree relatives of carriers, women with varying degrees of high risk family histories, and/or women with greater than 20% lifetime risk for breast cancer based on existing risk models. Genetic carriers for this disease, primarily women with BRCA1 or BRCA2 mutations, have a lifetime probability of approximately 85% of developing breast cancer. Preventive options for these women include surgical interventions such as prophylactic mastectomy and/or oophorectomy, i.e., removal of the breasts and/or ovaries. Therefore, it is important to evaluate the benefits and risks of different screening modalities, to identify additional options for these women.
This Medical Advisory Secretariat review is the second of 2 parts on breast cancer screening, and concentrates on the evaluation of both DM and MRI relative to FM, the standard of care. Part I of this review (March 2006) addressed the effectiveness of screening mammography in 40 to 49 year old average-risk women. The overall objective of the present review is to determine the optimal screening modality based on the evidence.
Evidence Review Strategy
The Medical Advisory Secretariat followed its standard procedures and searched the following electronic databases: Ovid MEDLINE, EMBASE, Ovid MEDLINE In-Process & Other Non-Indexed Citations, Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews and The International Network of Agencies for Health Technology Assessment database. The subject headings and keywords searched included breast cancer, breast neoplasms, mass screening, digital mammography, magnetic resonance imaging. The detailed search strategies can be viewed in Appendix 1.
This review includes articles specific to screening and does not include evidence on diagnostic mammography. The search was further restricted to English-language articles published between January 1996 and April 2006. Excluded were case reports, comments, editorials, nonsystematic reviews, and letters.
Digital Mammography: In total, 224 articles specific to DM screening were identified. These were examined against the inclusion/exclusion criteria described below, resulting in the selection and review of 5 health technology assessments (HTAs) (plus 1 update) and 4 articles specific to screening with DM.
Magnetic Resonance Imaging: In total, 193 articles specific to MRI were identified. These were examined against the inclusion/exclusion criteria described below, resulting in the selection and review of 2 HTAs and 7 articles specific to screening with MRI.
The evaluation of the addition of FM to MRI in the screening of women at high risk for breast cancer was also conducted within the context of standard search procedures of the Medical Advisory Secretariat, as outlined above. The subject headings and keywords searched included the concepts of breast cancer, magnetic resonance imaging, mass screening, and high risk/predisposition to breast cancer. The search was further restricted to English-language articles published between September 2007 and January 15, 2010. Case reports, comments, editorials, nonsystematic reviews, and letters were not excluded.
MRI plus mammography: In total, 243 articles specific to MRI plus FM screening were identified. These were examined against the inclusion/exclusion criteria described below, resulting in the selection and review of 2 previous HTAs, and 1 systematic review of 11 paired design studies.
Inclusion Criteria
English-language articles, and English or French-language HTAs published from January 1996 to April 2006, inclusive.
Articles specific to screening of women with no personal history of breast cancer.
Studies in which DM or MRI were compared with FM, and where the specific outcomes of interest were reported.
Randomized controlled trials (RCTs) or paired studies only for assessment of DM.
Prospective, paired studies only for assessment of MRI.
Exclusion Criteria
Studies in which outcomes were not specific to those of interest in this report.
Studies in which women had been previously diagnosed with breast cancer.
Studies in which the intervention (DM or MRI) was not compared with FM.
Studies assessing DM with a sample size of less than 500.
Intervention
Digital mammography.
Magnetic resonance imaging.
Comparator
Screening with film mammography.
Outcomes of Interest
Breast cancer mortality (although no studies were found with such long follow-up).
Recall rates.
Summary of Findings
Digital Mammography
There is moderate quality evidence that DM is significantly more sensitive than FM in the screening of asymptomatic women aged less than 50 years, those who are premenopausal or perimenopausal, and those with heterogeneously or extremely dense breast tissue (regardless of age).
It is not known what effect these differences in sensitivity will have on the more important effectiveness outcome measure of breast cancer mortality, as there was no evidence of such an assessment.
Other factors have been set out to promote DM, for example, issues of recall rates and reading and examination times. Our analysis did not show that recall rates were necessarily improved in DM, though examination times were lower than for FM. Other factors including storage and retrieval of screens were not the subject of this analysis.
Magnetic Resonance Imaging
There is moderate quality evidence that the sensitivity of MRI is significantly higher than that of FM in the screening of women at high risk for breast cancer based on genetic or familial factors, regardless of age.
Radiation Risk Review
Cancer Care Ontario (CCO) conducted a review of the evidence on radiation risk in mammographic screening of women at high risk for breast cancer. From this review of recent literature and a risk assessment that considered the potential impact of screening mammography in cohorts of women who start screening at an earlier age or who are at increased risk of developing breast cancer due to genetic susceptibility, the following conclusions can be drawn:
For women over 50 years of age, the benefits of mammography greatly outweigh the risk of radiation-induced breast cancer irrespective of the level of a woman’s inherent breast cancer risk.
Annual mammography for women aged 30–39 years who carry a breast cancer susceptibility gene or who have a strong family breast cancer history (defined as a first degree relative diagnosed in their thirties) has a favourable benefit:risk ratio. Mammography is estimated to detect 16 to 18 breast cancer cases for every one induced by radiation (Table 1). Initiation of screening at age 35 for this same group would increase the benefit:risk ratio to an even more favourable level of 34 to 50 cases detected for each one potentially induced.
Mammography for women under 30 years of age has an unfavourable benefit:risk ratio due to the challenges of detecting cancer in younger breasts, the aggressiveness of cancers at this age, the potential for radiation susceptibility at younger ages and a greater cumulative radiation exposure.
Mammography used in combination with MRI for women who carry a strong breast cancer susceptibility (e.g., BRCA1/2 carriers), if begun at age 35 and continued for 35 years, may confer greatly improved benefit:risk ratios, estimated to be about 220 to one.
While there is considerable uncertainty in the risk of radiation-induced breast cancer, the risk expressed in published studies is almost certainly conservative as the radiation dose absorbed by women receiving mammography recently has been substantially reduced by newer technology.
A CCO update of the mammography radiation risk literature for 2008 and 2009 identified one article, by Berrington de Gonzalez et al. (2009, JNCI, vol. 101: 205-209). This article focuses on estimating the risk of radiation-induced breast cancer for mammographic screening of young women at high risk for breast cancer (with BRCA gene mutations). Assuming a reduction in mortality from mammography of 15% to 25% or less in these high-risk women, the authors conclude that such a reduction is not substantially greater than the risk of radiation-induced breast cancer mortality when screening before the age of 34 years. That is, there would be no net benefit from annual mammographic screening of BRCA mutation carriers at ages 25-29 years; the net benefit would be zero or small if screening occurs in 30-34 year olds, and there would be some net benefit at age 35 years or older.
The Addition of Mammography to Magnetic Resonance Imaging
The effects of adding FM to MRI screening of high-risk women were also assessed, with inclusion and exclusion criteria as follows:
Inclusion Criteria
English-language articles and English or French-language HTAs published from September 2007 to January 15, 2010.
Articles specific to screening of women at high risk for breast cancer, regardless of the definition of high risk.
Studies in which accuracy data for the combination of MRI plus FM are available to be compared to that of MRI and FM alone.
RCTs or prospective, paired studies only.
Studies in which women were previously diagnosed with breast cancer were also included.
Exclusion Criteria
Studies in which outcomes were not specific to those of interest in this report.
Studies in which there was insufficient data on the accuracy of MRI plus FM.
Both MRI and FM.
Screening with MRI alone and FM alone.
Outcomes of Interest
Summary of Findings
Magnetic Resonance Imaging Plus Mammography
There is moderate GRADE-level evidence that the sensitivity of MRI plus mammography is significantly higher than that of MRI or FM alone, although the specificity either remains unchanged or decreases, in the screening of women at high risk for breast cancer based on genetic/familial factors, regardless of age.
These studies include women at high risk defined as BRCA1/2 or TP53 carriers, first degree relatives of carriers, women with varying degrees of high risk family histories, and/or >20% lifetime risk based on existing risk models. This definition of high risk accounts for approximately 2% of the female adult population in Ontario.
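The pattern reported above (higher sensitivity, equal or lower specificity when MRI and FM are combined) follows from the usual "positive if either test is positive" rule. The sketch below illustrates this on invented toy data; the screening results, the combination rule, and the function names are assumptions for illustration and are not taken from the reviewed studies.

```python
def sens_spec(results, truth):
    """Return (sensitivity, specificity) for paired test results and true status."""
    tp = sum(1 for r, t in zip(results, truth) if r and t)
    fn = sum(1 for r, t in zip(results, truth) if (not r) and t)
    tn = sum(1 for r, t in zip(results, truth) if (not r) and (not t))
    fp = sum(1 for r, t in zip(results, truth) if r and (not t))
    return tp / (tp + fn), tn / (tn + fp)


def either_positive(test_a, test_b):
    """Combined screen: positive when at least one modality is positive."""
    return [a or b for a, b in zip(test_a, test_b)]


# Hypothetical paired results for 10 women (4 with cancer, 6 without).
truth = [True, True, True, True, False, False, False, False, False, False]
mri = [True, True, True, False, True, False, False, False, False, False]
fm = [True, False, False, True, False, True, False, False, False, False]

combined = either_positive(mri, fm)
for name, res in [("MRI", mri), ("FM", fm), ("MRI+FM", combined)]:
    sens, spec = sens_spec(res, truth)
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Under this rule the combined screen can never miss a cancer that either modality detects, but it inherits the false positives of both, which is why specificity cannot rise.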
PMCID: PMC3377503  PMID: 23074406
18.  Assessing Drug Target Association Using Semantic Linked Data 
PLoS Computational Biology  2012;8(7):e1002574.
The rapidly increasing amount of public data in chemistry and biology provides new opportunities for large-scale data mining for drug discovery. Systematic integration of these heterogeneous sets and provision of algorithms to data mine the integrated sets would permit investigation of complex mechanisms of action of drugs. In this work we integrated and annotated data from public datasets relating to drugs, chemical compounds, protein targets, diseases, side effects and pathways, building a semantic linked network consisting of over 290,000 nodes and 720,000 edges. We developed a statistical model to assess the association of drug target pairs based on their relation with other linked objects. Validation experiments demonstrate the model can correctly identify known direct drug target pairs with high precision. Indirect drug target pairs (for example drugs which change gene expression level) are also identified but not as strongly as direct pairs. We further calculated the association scores for 157 drugs from 10 disease areas against 1683 human targets, and measured their similarity using a score matrix. The similarity network indicates that drugs from the same disease area tend to cluster together in ways that are not captured by structural similarity, with several potential new drug pairings being identified. This work thus provides a novel, validated alternative to existing drug target prediction algorithms. The web service is freely available at:
Author Summary
Modern drug discovery requires the understanding of chemogenomics, the complex interaction of chemical compounds and drugs with a wide variety of protein targets and genes in the body. A large amount of data pertaining to such relationships exists in publicly-accessible datasets but it is siloed and thus impossible to use in an integrated fashion. In this work we have integrated and semantically annotated a large amount of public data from a wide range of databases, including compound-gene, drug-drug, protein-protein, drug-side effects and so on, to create a complex network of interactions relating to compounds and protein targets. We developed a statistical algorithm called Semantic Link Association Prediction (SLAP) for predicting “missing links” in this data network: i.e. compound-target interactions for which there is no experimental data but which are statistically probable given the other relationships that exist in this set. We present validation experiments which show this method works with a high degree of accuracy, and also demonstrate how it can be used to create a drug similarity network to make predictions of new indications for existing drugs.
PMCID: PMC3390390  PMID: 22859915
19.  IIS – Integrated Interactome System: A Web-Based Platform for the Annotation, Analysis and Visualization of Protein-Metabolite-Gene-Drug Interactions by Integrating a Variety of Data Sources and Tools 
PLoS ONE  2014;9(6):e100385.
High-throughput screening of physical, genetic and chemical-genetic interactions brings important perspectives in the Systems Biology field, as the analysis of these interactions provides new insights into protein/gene function, cellular metabolic variations and the validation of therapeutic targets and drug design. However, such analysis depends on a pipeline connecting different tools that can automatically integrate data from diverse sources and result in a more comprehensive dataset that can be properly interpreted.
We describe here the Integrated Interactome System (IIS), an integrative platform with a web-based interface for the annotation, analysis and visualization of the interaction profiles of proteins/genes, metabolites and drugs of interest. IIS works in four connected modules: (i) Submission module, which receives raw data derived from Sanger sequencing (e.g. two-hybrid system); (ii) Search module, which enables the user to search for the processed reads to be assembled into contigs/singlets, or for lists of proteins/genes, metabolites and drugs of interest, and add them to the project; (iii) Annotation module, which assigns annotations from several databases for the contigs/singlets or lists of proteins/genes, generating tables with automatic annotation that can be manually curated; and (iv) Interactome module, which maps the contigs/singlets or the uploaded lists to entries in our integrated database, building networks that gather novel identified interactions, protein and metabolite expression/concentration levels, subcellular localization and computed topological metrics, GO biological processes and KEGG pathways enrichment. This module generates an XGMML file that can be imported into Cytoscape or be visualized directly on the web.
We have developed IIS by the integration of diverse databases following the need of appropriate tools for a systematic analysis of physical, genetic and chemical-genetic interactions. IIS was validated with yeast two-hybrid, proteomics and metabolomics datasets, but it is also extendable to other datasets. IIS is freely available online at:
PMCID: PMC4065059  PMID: 24949626
20.  Semi-automated literature mining to identify putative biomarkers of disease from multiple biofluids 
Computational methods for mining of biomedical literature can be useful in augmenting manual searches of the literature using keywords for disease-specific biomarker discovery from biofluids. In this work, we develop and apply a semi-automated literature mining method to mine abstracts obtained from PubMed to discover putative biomarkers of breast and lung cancers in specific biofluids.
A positive set of abstracts was defined by the terms ‘breast cancer’ and ‘lung cancer’ in conjunction with 14 separate ‘biofluids’ (bile, blood, breastmilk, cerebrospinal fluid, mucus, plasma, saliva, semen, serum, synovial fluid, stool, sweat, tears, and urine), while a negative set of abstracts was defined by the terms ‘(biofluid) NOT breast cancer’ or ‘(biofluid) NOT lung cancer.’ More than 5.3 million total abstracts were obtained from PubMed and examined for biomarker-disease-biofluid associations (34,296 positive and 2,653,396 negative for breast cancer; 28,355 positive and 2,595,034 negative for lung cancer). Biological entities such as genes and proteins were tagged using ABNER, and processed using Python scripts to produce a list of putative biomarkers. Z-scores were calculated, ranked, and used to determine significance of putative biomarkers found. Manual verification of relevant abstracts was performed to assess our method’s performance.
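The abstract does not state how the Z-scores were computed. As a minimal sketch, assuming a two-proportion z-test that compares how often a tagged entity appears in the positive versus the negative abstract set, the ranking step could look like the following. The set sizes are the breast cancer counts quoted above; the marker names and mention counts are hypothetical.

```python
import math


def two_proportion_z(k_pos, n_pos, k_neg, n_neg):
    """Z-score for the difference between two mention proportions (pooled SE)."""
    p_pos = k_pos / n_pos
    p_neg = k_neg / n_neg
    p_pool = (k_pos + k_neg) / (n_pos + n_neg)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_pos + 1 / n_neg))
    return (p_pos - p_neg) / se


# Abstract set sizes for breast cancer, as reported in the text above.
N_POS, N_NEG = 34_296, 2_653_396

# Hypothetical (positive mentions, negative mentions) per tagged entity.
mentions = {
    "MARKER_A": (1_200, 900),
    "MARKER_B": (300, 40_000),
}

scores = {m: two_proportion_z(kp, N_POS, kn, N_NEG) for m, (kp, kn) in mentions.items()}
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for marker, z in ranked:
    print(f"{marker}: z = {z:.1f}")
```

A large positive score marks an entity mentioned disproportionately often alongside the disease term; a negative score marks one that is more typical of the background abstracts for that biofluid.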
Biofluid-specific markers were identified from the literature, assigned relevance scores based on frequency of occurrence, and validated using known biomarker lists and/or databases for lung and breast cancer [NCBI’s On-line Mendelian Inheritance in Man (OMIM), Cancer Gene annotation server for cancer genomics (CAGE), NCBI’s Genes & Disease, NCI’s Early Detection Research Network (EDRN), and others]. The specificity of each marker for a given biofluid was calculated, and the performance of our semi-automated literature mining method assessed for breast and lung cancer.
We developed a semi-automated process for determining a list of putative biomarkers for breast and lung cancer. New knowledge is presented in the form of biomarker lists; ranked, newly discovered biomarker-disease-biofluid relationships; and biomarker specificity across biofluids.
PMCID: PMC4215335  PMID: 25379168
Literature mining; Text mining; Lung cancer; Breast cancer; Biomarker; Biofluid
21.  A Computational Approach to Finding Novel Targets for Existing Drugs 
PLoS Computational Biology  2011;7(9):e1002139.
Repositioning existing drugs for new therapeutic uses is an efficient approach to drug discovery. We have developed a computational drug repositioning pipeline to perform large-scale molecular docking of small molecule drugs against protein drug targets, in order to map the drug-target interaction space and find novel interactions. Our method emphasizes removing false positive interaction predictions using criteria from known interaction docking, consensus scoring, and specificity. In all, our database contains 252 human protein drug targets that we classify as reliable-for-docking as well as 4621 approved and experimental small molecule drugs from DrugBank. These were cross-docked, then filtered through stringent scoring criteria to select top drug-target interactions. In particular, we used MAPK14 and the kinase inhibitor BIM-8 as examples where our stringent thresholds enriched the predicted drug-target interactions with known interactions up to 20 times compared to standard score thresholds. We validated nilotinib as a potent MAPK14 inhibitor in vitro (IC50 40 nM), suggesting a potential use for this drug in treating inflammatory diseases. The published literature indicated experimental evidence for 31 of the top predicted interactions, highlighting the promising nature of our approach. Novel interactions discovered may lead to the drug being repositioned as a therapeutic treatment for its off-target's associated disease, added insight into the drug's mechanism of action, and added insight into the drug's side effects.
Author Summary
Most drugs are designed to bind to and inhibit the function of a disease target protein. However, drugs are often able to bind to ‘off-target’ proteins due to similarities in the protein binding sites. If an off-target is known to be involved in another disease, then the drug has potential to treat the second disease. This repositioning strategy is an alternate and efficient approach to drug discovery, as the clinical and toxicity histories of existing drugs can greatly reduce drug development cost and time. We present here a large-scale computational approach that simulates three-dimensional binding between existing drugs and target proteins to predict novel drug-target interactions. Our method focuses on removing false predictions, using annotated ‘known’ interactions, scoring and ranking thresholds. 31 of our top novel drug-target predictions were validated through literature search, and demonstrated the utility of our method. We were also able to identify the cancer drug nilotinib as a potent inhibitor of MAPK14, a target in inflammatory diseases, which suggests a potential use for the drug in treating rheumatoid arthritis.
PMCID: PMC3164726  PMID: 21909252
22.  The apoptotic machinery as a biological complex system: analysis of its omics and evolution, identification of candidate genes for fourteen major types of cancer, and experimental validation in CML and neuroblastoma 
BMC Medical Genomics  2009;2:20.
Apoptosis is a critical biological phenomenon, executed under the guidance of the Apoptotic Machinery (AM), which allows the physiologic elimination of terminally differentiated, senescent or diseased cells. Because of its relevance to BioMedicine, we have sought to obtain a detailed characterization of AM Omics in Homo sapiens, namely its Genomics and Evolution, Transcriptomics, Proteomics, Interactomics, Oncogenomics, and Pharmacogenomics.
This project exploited the methodology commonly used in Computational Biology (i.e., mining of many omics databases of the web) as well as the High Throughput biomolecular analytical techniques.
In Homo sapiens AM is comprised of 342 protein-encoding genes (possessing either anti- or pro-apoptotic activity, or a regulatory function) and 110 MIR-encoding genes targeting them: some have a critical role within the system (core AM nodes), others perform tissue-, pathway-, or disease-specific functions (peripheral AM nodes). By overlapping the cancer type-specific AM mutation map in the fourteen most frequent cancers in western societies (breast, colon, kidney, leukaemia, liver, lung, neuroblastoma, ovary, pancreas, prostate, skin, stomach, thyroid, and uterus) with their transcriptome, proteome and interactome in the same tumour type, we have identified the most prominent AM molecular alterations within each class. The comparison of the fourteen mutated AM networks (both protein- and MIR-based) has allowed us to pinpoint the hubs with a general and critical role in tumour development and, conversely, in cell physiology: in particular, we found that some of these had already been used as targets for pharmacological anticancer therapy. For a better understanding of the relationship between AM molecular alterations and pharmacological induction of apoptosis in cancer, we examined the expression of AM genes in K562 and SH-SY5Y after anticancer treatment.
We believe that our data on the Apoptotic Machinery will lead to the identification of new cancer genes and to the discovery of new biomarkers, which could then be used to profile cancers for diagnostic purposes and to pinpoint new targets for pharmacological therapy. This approach could pave the way for future studies and applications in molecular and clinical Medicine with important perspectives both for Oncology and for Regenerative Medicine.
PMCID: PMC2683874  PMID: 19402918
23.  Target Inhibition Networks: Predicting Selective Combinations of Druggable Targets to Block Cancer Survival Pathways 
PLoS Computational Biology  2013;9(9):e1003226.
A recent trend in drug development is to identify drug combinations or multi-target agents that effectively modify multiple nodes of disease-associated networks. Such polypharmacological effects may reduce the risk of emerging drug resistance by means of attacking the disease networks through synergistic and synthetic lethal interactions. However, due to the exponentially increasing number of potential drug and target combinations, systematic approaches are needed for prioritizing the most potent multi-target alternatives on a global network level. We took a functional systems pharmacology approach toward the identification of selective target combinations for specific cancer cells by combining large-scale screening data on drug treatment efficacies and drug-target binding affinities. Our model-based prediction approach, named TIMMA, takes advantage of the polypharmacological effects of drugs and infers combinatorial drug efficacies through system-level target inhibition networks. Case studies in MCF-7 and MDA-MB-231 breast cancer and BxPC-3 pancreatic cancer cells demonstrated how the target inhibition modeling allows systematic exploration of functional interactions between drugs and their targets to maximally inhibit multiple survival pathways in a given cancer type. The TIMMA prediction results were experimentally validated by means of systematic siRNA-mediated silencing of the selected targets and their pairwise combinations, showing increased ability to identify not only such druggable kinase targets that are essential for cancer survival either individually or in combination, but also synergistic interactions indicative of non-additive drug efficacies. These system-level analyses were enabled by a novel model construction method utilizing maximization and minimization rules, as well as a model selection algorithm based on sequential forward floating search. 
Compared with an existing computational solution, TIMMA showed both enhanced prediction accuracies in cross validation as well as significant reduction in computation times. Such cost-effective computational-experimental design strategies have the potential to greatly speed-up the drug testing efforts by prioritizing those interventions and interactions warranting further study in individual cancer cases.
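The model selection step named above, sequential forward floating search, is a generic subset-selection heuristic. The sketch below shows the search on its own with a made-up scoring function; it does not reproduce TIMMA's maximization and minimization rules, and the target names, weights, and synergy bonus are hypothetical.

```python
def sffs(features, score, max_size):
    """Sequential forward floating search.

    Returns a dict mapping subset size -> (best score seen, subset as a list).
    """
    max_size = min(max_size, len(features))
    selected = []
    best = {}
    while len(selected) < max_size:
        # Forward step: add the single feature that raises the score the most.
        candidates = [f for f in features if f not in selected]
        add = max(candidates, key=lambda f: score(selected + [f]))
        selected = selected + [add]
        s = score(selected)
        if len(selected) not in best or s > best[len(selected)][0]:
            best[len(selected)] = (s, list(selected))
        # Floating step: drop features while that beats the best smaller subset.
        while len(selected) > 2:
            drop = max(selected, key=lambda f: score([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            s_red = score(reduced)
            if s_red > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (s_red, list(reduced))
            else:
                break
    return best


# Hypothetical scoring: additive weights plus a bonus when T3 and T4 co-occur.
WEIGHTS = {"T1": 5, "T2": 3, "T3": 2, "T4": 1}


def toy_score(subset):
    bonus = 5 if "T3" in subset and "T4" in subset else 0
    return sum(WEIGHTS[t] for t in subset) + bonus


result = sffs(list(WEIGHTS), toy_score, max_size=4)
for size in sorted(result):
    print(size, result[size])
```

In this toy run, plain forward selection reaches {T1, T2, T3} at size three; the floating step later replaces it with {T1, T3, T4}, which scores higher because of the pairwise bonus. That backtracking is what distinguishes the floating variant from greedy forward selection.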
Author Summary
Selective inhibition of specific panels of multiple protein targets provides an unprecedented potential for improving therapeutic efficacy of anticancer agents. We introduce a computational systems pharmacology strategy, which uses the concept of target inhibition networks to predict effective multi-target combinations for treating specific cancer types. The strategy is based on integration of two complementary information sources, drug treatment efficacies and drug-target binding affinities, which are readily available in drug screening labs. Compared to the cancer sequencing efforts, which often result in a huge number of non-targetable genetic alterations, the target combinations from our strategy are druggable, by definition, hence enabling more straightforward translation toward clinically actionable treatment strategies. The model predictions were experimentally validated using siRNA-mediated target silencing screens in three case studies involving MDA-MB-231 and MCF-7 breast cancer and BxPC-3 pancreatic cancer cells. In more general terms, the cancer cell-specific target inhibition networks provided additional insights into the drugs' mechanisms of action, for instance, how the cancer cell survival pathways can be targeted by synergistic and synthetic lethal interactions through multi–target perturbations. These results demonstrate that the principles introduced here offer the possibilities to move toward more systematic prediction and evaluation of the most effective drug target combinations.
PMCID: PMC3772058  PMID: 24068907
24.  DrugComboRanker: drug combination discovery based on target network analysis 
Bioinformatics  2014;30(12):i228-i236.
Motivation: Currently there are no curative anticancer drugs, and drug resistance is often acquired after drug treatment. One of the reasons is that cancers are complex diseases, regulated by multiple signaling pathways and cross talks among the pathways. It is expected that drug combinations can reduce drug resistance and improve patients’ outcomes. In clinical practice, the ideal and feasible drug combinations are combinations of existing Food and Drug Administration-approved drugs or bioactive compounds that are already used on patients or have entered clinical trials and passed safety tests. These drug combinations could directly be used on patients with less concern of toxic effects. However, there is so far no effective computational approach to search effective drug combinations from the enormous number of possibilities.
Results: In this study, we propose a novel systematic computational tool DrugComboRanker to prioritize synergistic drug combinations and uncover their mechanisms of action. We first build a drug functional network based on their genomic profiles, and partition the network into numerous drug network communities by using a Bayesian non-negative matrix factorization approach. As drugs within overlapping communities share common mechanisms of action, we next uncover potential targets of drugs by applying a recommendation system on drug communities. We meanwhile build disease-specific signaling networks based on patients’ genomic profiles and interactome data. We then identify drug combinations by searching drugs whose targets are enriched in the complementary signaling modules of the disease signaling network. The novel method was evaluated on lung adenocarcinoma and endocrine receptor positive breast cancer, and compared with other drug combination approaches. These case studies discovered a set of effective drug combinations top ranked in our prediction list, and mapped the drug targets on the disease signaling network to highlight the mechanisms of action of the drug combinations.
Availability and implementation: The program is available on request.
PMCID: PMC4058933  PMID: 24931988
25.  Receptor-Defined Subtypes of Breast Cancer in Indigenous Populations in Africa: A Systematic Review and Meta-Analysis 
PLoS Medicine  2014;11(9):e1001720.
In a systematic review and meta-analysis, Isabel dos Santos Silva and colleagues estimate the prevalence of receptor-defined subtypes of breast cancer in North Africa and sub-Saharan Africa.
Please see later in the article for the Editors' Summary
Breast cancer is the most common female cancer in Africa. Receptor-defined subtypes are a major determinant of treatment options and disease outcomes but there is considerable uncertainty regarding the frequency of poor prognosis estrogen receptor (ER) negative subtypes in Africa. We systematically reviewed publications reporting on the frequency of breast cancer receptor-defined subtypes in indigenous populations in Africa.
Methods and Findings
Medline, Embase, and Global Health were searched for studies published between 1st January 1980 and 15th April 2014. Reported proportions of ER positive (ER+), progesterone receptor positive (PR+), and human epidermal growth factor receptor-2 positive (HER2+) disease were extracted and 95% CIs calculated. Random effects meta-analyses were used to pool estimates. Fifty-four studies from North Africa (n = 12,284 women with breast cancer) and 26 from sub-Saharan Africa (n = 4,737) were eligible. There was marked between-study heterogeneity in the ER+ estimates in both regions (I² > 90%), with the majority reporting proportions between 0.40 and 0.80 in North Africa and between 0.20 and 0.70 in sub-Saharan Africa. Similarly, large between-study heterogeneity was observed for PR+ and HER2+ estimates (I² > 80% in all instances). Meta-regression analyses showed that the proportion of ER+ disease was 10% (4%–17%) lower for studies based on archived tumor blocks rather than prospectively collected specimens, and 9% (2%–17%) lower for those with ≥40% versus those with <40% grade 3 tumors. For prospectively collected samples, the pooled proportions for ER+ and triple negative tumors were 0.59 (0.56–0.62) and 0.21 (0.17–0.25), respectively, regardless of region. Limitations of the study include the lack of standardized procedures across the various studies; the low methodological quality of many studies in terms of the representativeness of their case series and the quality of the procedures for collection, fixation, and receptor testing; and the possibility that women with breast cancer may have contributed to more than one study.
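The abstract does not name the random-effects estimator or any transformation applied to the proportions. As a minimal sketch, assuming the DerSimonian-Laird estimator applied to untransformed proportions, pooling and the I² heterogeneity statistic can be computed as follows. The study counts are invented for illustration and are not data from the review.

```python
import math


def pool_random_effects(events, totals):
    """DerSimonian-Laird pooling of proportions.

    Needs at least two studies, each with a proportion strictly between 0 and 1.
    Returns (pooled proportion, (lower, upper) 95% CI, I-squared in percent).
    """
    p = [e / n for e, n in zip(events, totals)]
    var = [pi * (1 - pi) / n for pi, n in zip(p, totals)]
    w = [1 / v for v in var]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * pi for wi, pi in zip(w, p)) / sum(w)
    q = sum(wi * (pi - fixed) ** 2 for wi, pi in zip(w, p))  # Cochran's Q
    df = len(p) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    w_re = [1 / (v + tau2) for v in var]
    pooled = sum(wi * pi for wi, pi in zip(w_re, p)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2


# Hypothetical ER+ counts and sample sizes from four studies.
er_positive = [120, 45, 300, 80]
tested = [200, 150, 400, 100]

pooled, (lower, upper), i2 = pool_random_effects(er_positive, tested)
print(f"pooled = {pooled:.2f} (95% CI {lower:.2f} to {upper:.2f}), I2 = {i2:.0f}%")
```

When I² is high, as in this toy input, the random-effects weights become nearly equal across studies and the confidence interval widens, so the pooled value should be read as an average over heterogeneous settings rather than a single common proportion.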
The published data from the more appropriate prospectively measured specimens are consistent with the majority of breast cancers in Africa being ER+. As no single subtype dominates in the continent availability of receptor testing should be a priority, especially for young women with early stage disease where appropriate receptor-specific treatment modalities offer the greatest potential for reducing years of life lost.
Editors' Summary
Breast cancer is the commonest female tumor in Africa and death rates from the disease in some African countries are among the highest in the world. Breast cancer begins when cells in the breast acquire genetic changes that allow them to grow uncontrollably and to move around the body. When a breast lump is found (by mammography or manual examination), a few cells are collected from the lump (a biopsy) to look for abnormal cells and to test for the presence of estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor-2 (HER2) on the cells. The hormones estrogen and progesterone promote the growth of normal breast cells and of ER+ and PR+ breast cancer cells. HER2 also controls the growth of breast cells. The receptor status of breast cancer is a major determinant of treatment options and prognosis (likely outcome). ER+ tumors, for example, are more receptive to hormonal therapy and have a better prognosis than ER− tumors, whereas HER2+ tumors, which make large amounts of HER2, are more aggressive than HER2− tumors. Breast cancer is treated by surgically removing the lump or the whole breast (mastectomy) if the tumor has already spread, before killing any remaining cancer cells with chemotherapy or radiotherapy. In addition, ER+, PR+, and HER2+ tumors are treated with drugs that block these receptors (including tamoxifen and trastuzumab), thereby slowing breast cancer growth.
Why Was This Study Done?
ER+ tumors predominate in white women but the proportion of ER+ tumors among US-born black women is slightly lower. The frequency of different receptor-defined subtypes of breast cancer in indigenous populations in Africa is currently unclear but policy makers need this information to help them decide whether routine receptor status testing should be introduced across Africa. Because receptor status is a major determinant of treatment options and outcomes, it would be more important to introduce receptor testing if all subtypes are present in breast cancers in indigenous African women and if no one subtype dominates than if most breast cancers in these women are ER+. In this systematic review (a study that uses pre-defined criteria to identify all the research on a given topic) and meta-analysis (a statistical approach that combines the results of several studies), the researchers examine the distribution of receptor-defined breast cancer subtypes in indigenous populations in Africa.
What Did the Researchers Do and Find?
The researchers identified 54 relevant studies from North Africa involving 12,284 women with breast cancer (mainly living in Egypt or Tunisia) and 26 studies from sub-Saharan Africa involving 4,737 women with breast cancer (mainly living in Nigeria or South Africa) and used the data from these studies to calculate the proportions of ER+, PR+, and HER2+ tumors (the number of receptor-positive tumors divided by the number of tumors with known receptor status) across Africa. The proportion of ER+ tumors varied markedly between studies, ranging between 0.40 and 0.80 in North Africa and between 0.20 and 0.70 in sub-Saharan Africa. Among prospectively collected samples (samples collected specifically for receptor-status testing; studies that determined the receptor status of breast cancers using stored samples reported a lower proportion of ER+ disease than studies that used prospectively collected samples), the overall pooled proportions of ER+ and triple negative tumors were 0.59 and 0.21, respectively.
What Do These Findings Mean?
Although these findings highlight the scarcity of data on hormone receptor and HER2 status in breast cancers in indigenous African populations, they provide new information about the distribution of breast cancer subtypes in Africa. Specifically, these findings suggest that although slightly more than half of breast cancers in Africa are ER+, no single subtype dominates. They also suggest that the distribution of receptor-defined breast cancer subtypes in Africa is similar to that found in Western populations. The accuracy of these findings is likely to be affected by the low methodological quality of many of the studies and the lack of standardized procedures. Thus, large well-designed studies are still needed to accurately quantify the distribution of various breast cancer subtypes across Africa. In the meantime, the current findings support the introduction of routine receptor testing across Africa, especially for young women with early stage breast cancer in whom the potential to improve survival and reduce the years of life lost by knowing the receptor status of an individual's tumor is greatest.
Additional Information
Please access these websites via the online version of this summary at
This study is further discussed in a PLOS Medicine Perspective by Sulma I. Mohammed
The US National Cancer Institute (NCI) provides comprehensive information about cancer (in English and Spanish), including detailed information for patients and professionals about breast cancer including an online booklet for patients
Cancer Research UK, a not-for-profit organization, provides information about cancer; its detailed information about breast cancer includes sections on tests for hormone receptors and HER2, on treatments that target hormone receptors, and on treatments that target HER2. A further not-for-profit organization provides up-to-date information about breast cancer (in English and Spanish), including information on hormone receptor status and HER2 status
The UK National Health Service Choices website has information and personal stories about breast cancer; the not-for-profit organization Healthtalkonline also provides personal stories about dealing with breast cancer
PMCID: PMC4159229  PMID: 25202974
