Serum proteins are routinely used to diagnose diseases, but new serum protein biomarkers are hard to find because of the low sensitivity of serum proteome screening. Public repositories of microarray data, such as the Gene Expression Omnibus (GEO), contain RNA expression profiles for more than 16,000 biological conditions, covering diseases that account for more than 30% of United States mortality. We hypothesized that genes that code for serum- and urine-detectable proteins and show differential RNA expression in disease-damaged tissues would make ideal diagnostic protein biomarkers for those diseases. We showed that predicted protein biomarkers are significantly enriched for known diagnostic protein biomarkers in 22 diseases, with enrichment significantly higher in diseases for which at least three datasets are available. We then used this strategy to search for new biomarkers indicating acute rejection (AR) across different types of transplanted solid organs. We integrated three biopsy-based microarray studies of AR from pediatric renal, adult renal and adult cardiac transplantation and identified 45 genes upregulated in all three. From this set, we chose 10 proteins for serum ELISA assays in 39 renal transplant patients, and discovered three that were significantly higher in AR. Interestingly, all three proteins were also significantly higher during AR in the 63 cardiac transplant recipients studied. Our best marker, serum PECAM1, identified renal AR with 89% sensitivity and 75% specificity, and also showed increased expression in AR by immunohistochemistry in renal, hepatic and cardiac transplant biopsies. Our results demonstrate that integrating gene expression microarray measurements from disease samples, including publicly available data sets, can be a powerful, fast, and cost-effective strategy for the discovery of new diagnostic serum protein biomarkers.
Protein biomarkers in the blood are urgently needed to diagnose a wide variety of diseases and improve health care. We aimed to find a fast and cost-effective strategy for discovering diagnostic protein biomarkers. Hundreds of diseases have already been investigated using microarray technology, which measures the mRNA expression of all genes in disease-damaged tissues. We analyzed biopsy-based microarray data for 41 diseases in a public repository, identified genes with dysregulated mRNA expression whose proteins are detectable in the blood, and predicted them to be candidate diagnostic protein biomarkers. We found that clinically and preclinically validated diagnostic protein biomarkers were significantly enriched among our predicted candidates for 22 diseases. We then measured the concentrations of ten predicted protein biomarkers in serum samples from 39 renal transplant patients. Three of them were confirmed to be diagnostic of acute rejection after renal transplantation. All three proteins were further confirmed to be diagnostic of acute rejection in 63 cardiac transplant recipients. Our results show that publicly available genome-wide gene expression data on disease-damaged tissues can be effectively translated into diagnostic protein biomarkers.
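The candidate-selection step described above can be illustrated with a minimal sketch. It assumes that each study has already been reduced to a set of dysregulated genes and that a separate list of blood-detectable proteins is available. The gene names, the function names, and the choice of a hypergeometric test for enrichment are all illustrative assumptions and are not taken from the study.

```python
from math import comb


def predict_candidates(dysregulated_by_study, detectable_in_blood):
    """Return genes dysregulated in every study whose protein is blood-detectable.

    Both arguments are hypothetical inputs: a dict mapping a study label to a
    set of gene symbols, and a set of gene symbols with detectable proteins.
    """
    shared = set.intersection(*(set(g) for g in dysregulated_by_study.values()))
    return shared & set(detectable_in_blood)


def enrichment_p_value(population, known, predicted, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    Assumption: enrichment is modeled as drawing `predicted` genes without
    replacement from `population` genes, of which `known` are validated
    biomarkers. The source does not state which test was used.
    """
    total = comb(population, predicted)
    upper = min(known, predicted)
    favorable = sum(
        comb(known, k) * comb(population - known, predicted - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / total


# Toy data with placeholder gene names (not real results).
dysregulated_by_study = {
    "pediatric_renal": {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    "adult_renal": {"GENE_A", "GENE_B", "GENE_D", "GENE_E"},
    "adult_cardiac": {"GENE_A", "GENE_B", "GENE_D", "GENE_F"},
}
detectable_in_blood = {"GENE_A", "GENE_D", "GENE_F", "GENE_G"}

candidates = predict_candidates(dysregulated_by_study, detectable_in_blood)
print(sorted(candidates))

# Chance of at least 1 known biomarker among 2 predictions when
# 3 of 10 genes in the toy population are known biomarkers.
p_value = enrichment_p_value(population=10, known=3, predicted=2, overlap=1)
print(round(p_value, 4))
```

Requiring a gene to appear in every study mirrors the "upregulated in all three" criterion; a real analysis would also need per-study differential expression statistics and multiple-testing correction, which this sketch leaves out.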