Despite complete surgical resection, survival in early-stage non-small cell lung cancer (NSCLC) remains poor. Based on prior pre-clinical evaluations, we hypothesized that combined induction proteasome and histone deacetylase inhibitor therapy, followed by tumor resection, is feasible.
A phase I clinical trial using a two-staged multiple agent design of bortezomib and vorinostat as induction therapy followed by consolidative surgery in patients with NSCLC was performed. Standard toxicity and MTD were examined. Pre- and post-treatment tumor gene expression arrays were performed and analyzed. Pre- and post-treatment FDG-PET imaging was used to assess tumor metabolism. Finally, serum 20S proteasome levels were analyzed with ELISA, and selected intratumoral proteins were assessed via immunohistochemistry.
Thirty-four patients consented, and 21 enrolled in the trial. One patient withdrew early secondary to disease progression. The MTD was bortezomib 1.3 mg/m² and vorinostat 300 mg BID. There were two grade III dose-limiting toxicities, fatigue and hypophosphatemia, both of which were self-limited. There was no mortality. Thirty percent (6/20) of patients had greater than 60% histologic necrosis of their tumor following treatment, with two having ≥90% tumor necrosis. Tumor metabolism, 20S proteasome activity, and specific protein expression did not demonstrate consistent results. Gene expression arrays comparing pre- and post-therapy NSCLC specimens revealed robust intratumoral changes in specific genes.
Induction bortezomib and vorinostat therapy followed by surgery in patients with operable NSCLC is feasible. Correlative gene expression studies suggest new targets and cell signaling pathways that may be important in modulating this combined therapy.
Histone deacetylase; proteasome inhibitor; lung cancer
Protein tyrosine phosphatases (PTPs) constitute a large family of signaling enzymes that control the cellular levels of protein tyrosine phosphorylation. A detailed understanding of PTP functions in normal physiology and in pathogenic conditions has been hampered by the absence of PTP-specific, cell-permeable small molecule agents. We present a stepwise focused library approach that transforms a weak and general nonhydrolyzable pTyr mimetic (F2Pmp, phosphonodifluoromethyl phenylalanine) into a highly potent and selective inhibitor of PTP-MEG2, an antagonist of hepatic insulin signaling. The crystal structures of the PTP-MEG2-inhibitor complexes provide direct evidence that potent and selective PTP inhibitors can be obtained by introducing molecular diversity into the F2Pmp scaffold to engage both the active site and unique nearby peripheral binding pockets. Importantly, the PTP-MEG2 inhibitor possesses highly efficacious cellular activity and is capable of augmenting insulin signaling and improving insulin sensitivity and glucose homeostasis in diet-induced obese mice. The results indicate that F2Pmp can be converted into highly potent and selective PTP inhibitory agents with excellent in vivo efficacy. Given the general nature of the approach, this strategy should be applicable to other members of the PTP superfamily.
Telomerase reverse transcriptase (TERT) promoter mutations were recently shown to drive telomerase activity in various cancer types, including medulloblastoma. However, the clinical and biological implications of TERT mutations in medulloblastoma have not been described. Hence, we sought to describe these mutations and their impact in a subgroup-specific manner. We analyzed the TERT promoter by direct sequencing and genotyping in 466 medulloblastomas. The mutational distributions were determined according to subgroup affiliation, demographics, and clinical, prognostic, and molecular features. Integrated genomics approaches were used to identify specific somatic copy number alterations in TERT promoter-mutated and wild-type tumors. Overall, TERT promoter mutations were identified in 21 % of medulloblastomas. Strikingly, the highest frequencies of TERT mutations were observed in SHH (83 %; 55/66) and WNT (31 %; 4/13) medulloblastomas derived from adult patients. Group 3 and Group 4 harbored this alteration in <5 % of cases and showed no association with increased patient age. The prognostic implications of these mutations were highly subgroup-specific. TERT mutations identified a subset with good and poor prognosis in SHH and Group 4 tumors, respectively. Monosomy 6 was mostly restricted to WNT tumors without TERT mutations. Hallmark SHH focal copy number aberrations and chromosome 10q deletion were mutually exclusive with TERT mutations within SHH tumors. TERT promoter mutations are the most common recurrent somatic point mutation in medulloblastoma, and are very highly enriched in adult SHH and WNT tumors. TERT mutations define a subset of SHH medulloblastoma with distinct demographics, cytogenetics, and outcomes.
Electronic supplementary material
The online version of this article (doi:10.1007/s00401-013-1198-2) contains supplementary material, which is available to authorized users.
TERT promoter mutations; SHH pathway; Adult; Medulloblastoma
To identify features of primary care quality improvement associated with improved health outcomes using premature coronary heart disease (CHD) mortality as an example, and to determine impacts of different modelling approaches.
Cross-sectional study of mortality rates in 229 general practices.
General practices from three East Midlands primary care trusts.
Patients registered to the practices above between April 2006 and March 2009.
Main outcome measures
Numbers of CHD deaths in those aged under 75 (premature mortality) and at all ages in each practice.
Population characteristics and markers of quality of primary care were associated with variations in premature CHD mortality. Increasing levels of deprivation, percentages of practice populations on practice diabetes registers, white, over 65 and male were all associated with increasing levels of premature CHD mortality. Control of serum cholesterol levels in those with CHD and the percentage of patients recalling access to their preferred general practitioner were both associated with decreased levels of premature CHD mortality. Similar results were found for all-age mortality. A combined measure of quality of primary care for CHD comprising 12 quality outcomes framework indicators was associated with decreases in both all-age and premature CHD mortality. The selected models suggest that practices in less deprived areas may have up to 20% lower premature CHD mortality than those with median deprivation and that improvement in the CHD care quality from 83% (lower quartile) to 86% (median) could reduce premature CHD mortality by 3.6%. Different modelling approaches yielded qualitatively similar results.
High-quality primary care, including aspects of access to and continuity of care, detection and management, appears to be associated with reducing CHD mortality. The impact on premature CHD mortality is greater than on all-age CHD mortality. Determining the most useful measures of quality of primary care needs further consideration.
PRIMARY CARE; STATISTICS & RESEARCH METHODS
A new immuno-TRAP technique overcomes limitations of spatial resolution and selection bias to identify gene locus associations with a nuclear subcompartment such as promyelocytic leukemia nuclear bodies.
Important insights into nuclear function would arise if gene loci physically interacting with particular subnuclear domains could be readily identified. Immunofluorescence microscopy combined with fluorescence in situ hybridization (immuno-FISH), the method that would typically be used in such a study, is limited by spatial resolution and requires prior assumptions for selecting genes to probe. Our new technique, immuno-TRAP, overcomes these limitations. Using promyelocytic leukemia nuclear bodies (PML NBs) as a model, we used immuno-TRAP to determine if specific genes localize within molecular dimensions with these bodies. Although we confirmed a TP53 gene–PML NB association, immuno-TRAP allowed us to uncover novel locus-PML NB associations, including the ABCA7 and TFF1 loci and, most surprisingly, the PML locus itself. These associations were cell type specific and reflected the cell’s physiological state. Combined with microarrays or deep sequencing, immuno-TRAP provides powerful opportunities for identifying gene locus associations with potentially any nuclear subcompartment.
Recent increases in the number of deposited membrane protein crystal structures necessitate the use of automated computational tools to position them within the lipid bilayer. Identifying the correct orientation allows us to study the complex relationship between sequence, structure and the lipid environment, which is otherwise challenging to investigate using experimental techniques due to the difficulty in crystallising membrane proteins embedded within intact membranes.
We have developed a knowledge-based membrane potential, calculated by the statistical analysis of transmembrane protein structures, coupled with a combination of genetic and direct search algorithms, and demonstrate its use in positioning proteins in membranes, refinement of membrane protein models and in decoy discrimination.
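As a rough illustration of what a knowledge-based (statistical) potential can look like, the sketch below uses the generic inverse-Boltzmann form: the energy of a residue type in a membrane slab is the negative log of its observed frequency there relative to its overall frequency. This is an assumption made for illustration only; the abstract does not state the functional form used by the authors, and the function name and all counts are hypothetical.

```python
import math


def knowledge_based_energy(count_in_slab, total_in_slab, count_overall, total_overall):
    """Generic inverse-Boltzmann score (in units of kT) for one residue type.

    Negative values mean the residue is enriched in the slab relative to
    its overall frequency; positive values mean it is depleted.
    This is an illustrative form, not the published potential.
    """
    if min(count_in_slab, total_in_slab, count_overall, total_overall) <= 0:
        raise ValueError("all counts must be positive")
    observed = count_in_slab / total_in_slab
    expected = count_overall / total_overall
    return -math.log(observed / expected)


# Hypothetical counts: a hydrophobic residue makes up 30 of 100 residues
# in a membrane-core slab but only 100 of 1000 residues overall.
energy = knowledge_based_energy(30, 100, 100, 1000)
print(round(energy, 3))
```

In a scheme of this kind, a search algorithm would score candidate orientations by summing such per-residue terms and keep the orientation with the lowest total.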
Our method is able to quickly and accurately orientate both alpha-helical and beta-barrel membrane proteins within the lipid bilayer, showing closer agreement with experimentally determined values than existing approaches. We also demonstrate both consistent and significant refinement of membrane protein models and the effective discrimination between native and decoy structures. Source code is available under an open source license from http://bioinf.cs.ucl.ac.uk/downloads/memembed/.
Membrane protein; Statistical potential; Orientation; Refinement; Genetic algorithm
Background and Aim
Liver cirrhosis is associated with decreased hepatic cytochrome P4503A (CYP3A) activity but the pathogenesis of this phenomenon is not well elucidated. In this study, we examined if certain microRNAs (miRNA) are associated with decreased hepatic CYP3A activity in cirrhosis.
Hepatic CYP3A activity and miRNA microarray expression profiles were measured in cirrhotic (n=28) and normal (n=12) liver tissue. Hepatic CYP3A activity was measured via midazolam hydroxylation in human liver microsomes. Additionally, hepatic CYP3A4 protein concentration and the expression of CYP3A4 mRNA were measured. Analyses were conducted to identify miRNAs which were differentially expressed between two groups but also were significantly associated with lower hepatic CYP3A activity.
Hepatic CYP3A activity in cirrhotic livers was 1.7-fold lower than in the normal livers (0.28 ± 0.06 vs. 0.47 ± 0.07 mL·min⁻¹·mg protein⁻¹ (mean ± SEM), P=0.02). Six microRNAs (miR-155, miR-454, miR-582-5p, let-7f-1*, miR-181d, and miR-500) had >1.2-fold increase in cirrhotic livers and also had significant negative correlation with hepatic CYP3A activity (range of r = -0.44 to -0.52, P <0.05). Notably, miR-155, a known regulator of liver inflammation, had the highest fold increase in cirrhotic livers (2.2-fold, P=4.16E-08) and significantly correlated with hepatic CYP3A activity (r=-0.50, P=0.017). The relative expression (2^-ΔΔCt, mean ± SEM) of hepatic CYP3A4 mRNA was significantly higher in cirrhotic livers (21.76 ± 2.65 vs. 5.91 ± 1.29, P=2.04E-07) but their levels did not significantly correlate with hepatic CYP3A activity (r=-0.43, P=0.08).
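The relative mRNA expression values above follow the comparative Ct convention (2^-ΔΔCt). A minimal sketch of that arithmetic is given below; the function name and the Ct values are hypothetical and are not taken from the study. The last lines simply recompute the reported 1.7-fold difference in activity from the two group means.

```python
def relative_expression(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Return relative expression by the comparative Ct method (2^-ΔΔCt).

    ΔCt  = Ct(target gene) - Ct(reference gene), computed per group
    ΔΔCt = ΔCt(sample) - ΔCt(control)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the sample than in the control, relative to the reference gene.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold)  # 4.0

# Fold difference in CYP3A activity from the reported group means.
activity_fold = 0.47 / 0.28
print(round(activity_fold, 1))  # 1.7
```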
The strong association between certain miRNAs, notably miR-155, and lower hepatic CYP3A activity suggests that altered miRNA expression may regulate hepatic CYP3A activity.
The epithelial-mesenchymal transition (EMT) is a de-differentiation process required for wound healing and development. In tumors of epithelial origin aberrant induction of EMT contributes to cancer progression and metastasis. Studies have begun to implicate epigenetic reprogramming in EMT; however, the relationship between reprogramming and the coordination of cellular processes is largely unexplored. We have previously developed a system to study EMT in a canonical non-small cell lung cancer (NSCLC) model. In this system we have shown that the induction of EMT results in constitutive NF-κB activity. We hypothesized a role for chromatin remodeling in the sustained deregulation of cellular signaling pathways.
We mapped sixteen histone modifications and two variants for epithelial and mesenchymal states. Combinatorial patterns of epigenetic changes were quantified at gene and enhancer loci. We found a distinct chromatin signature among genes in well-established EMT pathways. Strikingly, these genes are only a small minority of those that are differentially expressed. At putative enhancers of genes with the ‘EMT-signature’ we observed highly coordinated epigenetic activation or repression. Furthermore, enhancers that are activated are bound by a set of transcription factors that is distinct from those that bind repressed enhancers. Upregulated genes with the ‘EMT-signature’ are upstream regulators of NF-κB, but are also bound by NF-κB at their promoters and enhancers. These results suggest a chromatin-mediated positive feedback as a likely mechanism for sustained NF-κB activation.
There is highly specific epigenetic regulation at genes and enhancers across several pathways critical to EMT. The sites of these changes in chromatin state implicate several inducible transcription factors with critical roles in EMT (NF-κB, AP-1 and MYC) as targets of this reprogramming. Furthermore, we find evidence that suggests that these transcription factors are in chromatin-mediated transcriptional feedback loops that regulate critical EMT genes. In sum, we establish an important link between chromatin remodeling and shifts in cellular reprogramming.
EMT; Epigenetics; Chromatin; Reprogramming; Feedback
Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based Critical Assessment of protein Function Annotation (CAFA) experiment. Fifty-four methods representing the state-of-the-art for protein function prediction were evaluated on a target set of 866 proteins from eleven organisms. Two findings stand out: (i) today’s best protein function prediction algorithms significantly outperformed widely-used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is significant need for improvement of currently available tools.
Patients undergoing resections for suspicious pulmonary lesions have a 9-55% benign rate. Validated prediction models exist to estimate the probability of malignancy in a general population and current practice guidelines recommend their use. We evaluated these models in a surgical population to determine the accuracy of existing models to predict benign or malignant disease.
We conducted a retrospective review of our thoracic surgery quality improvement database (2005-2008) to identify patients who underwent resection of a pulmonary lesion. Patients were stratified into subgroups based on age, smoking status and fluorodeoxyglucose positron emission tomography (PET) results. The probability of malignancy was calculated for each patient using the Mayo and SPN prediction models. Receiver operating characteristic (ROC) and calibration curves were used to measure model performance.
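To make the two performance measures concrete, the sketch below computes the area under the ROC curve as the probability that a randomly chosen malignant case receives a higher predicted probability than a randomly chosen benign case, and a simple calibration-in-the-large check (mean predicted probability versus observed malignancy rate). The function names and the six data points are hypothetical; the study's own analysis may have used different software and a different calibration assessment.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positive and negative cases.

    labels: 1 = malignant, 0 = benign. Ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def calibration_in_the_large(scores, labels):
    """Return (mean predicted probability, observed event rate)."""
    return sum(scores) / len(scores), sum(labels) / len(labels)


# Hypothetical predicted probabilities and pathology results.
predicted = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
outcome = [1, 1, 0, 1, 0, 0]

auc = roc_auc(predicted, outcome)
mean_pred, observed = calibration_in_the_large(predicted, outcome)
print(round(auc, 3), round(mean_pred, 3), observed)
```

The two measures answer different questions: a model can rank cases well (high AUC) while its predicted probabilities sit systematically above or below the observed rate, which is the pattern of good discrimination with poor calibration.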
Eighty-nine patients met selection criteria; 73% had malignant disease. Patients with preoperative PET scans were divided into 4 subgroups based on age, smoking history and nodule PET avidity. Older smokers with PET-avid lesions had a 90% malignancy rate. Patients with PET non-avid lesions, or PET-avid lesions with age <50 years or never smokers of any age, had a 62% malignancy rate. The area under the ROC curve for the Mayo and SPN models was 0.79 and 0.80, respectively; however, the models were poorly calibrated (p<0.001).
Despite improvements in diagnostic and imaging techniques, current general population models do not accurately predict lung cancer among patients referred for surgical evaluation. Prediction models with greater accuracy are needed to identify patients with benign disease to reduce non-therapeutic resections.
Lung Cancer; diagnosis; cancer staging; Positron Emission Tomography (PET)
Video-assisted thoracic surgery (VATS) lobectomy has become the standard of care for early stage lung cancer throughout the world. Teaching this complex procedure requires adequate case volume, adequate instrumentation, a committed operating room team and baseline experience with open lobectomy. We outline what key maneuvers and steps are required to teach and learn VATS lobectomy. This is most easily performed as part of a thoracic surgery training program, but with adequate commitment and proctoring, there is no reason experienced open surgeons cannot become proficient VATS surgeons. We provide videos showing the key portions of a subcarinal lymph node dissection, posterior hilar dissection of the right upper lobe, fissureless right middle lobectomy, and fissureless left lower lobectomy. These videos highlight what we feel are important principles in VATS lobectomy, i.e., N2 and N1 lymph node dissection, fissureless techniques, and progressive responsibility of the learner. Current literature in simulation of VATS lobectomy is also outlined as this will be the future of teaching in VATS lobectomy.
video-assisted thoracic surgery (VATS) lobectomy; teaching; simulation
The epithelial-to-mesenchymal transition (EMT) is a de-differentiation process that has been implicated in metastasis and the generation of cancer initiating cells (CICs) in solid tumors. To examine EMT in non-small cell lung cancer (NSCLC), we utilized a three dimensional (3D) cell culture system in which cells were co-stimulated with tumor necrosis factor alpha (TNF) and transforming growth factor beta (TGFβ). NSCLC spheroid cultures display elevated expression of EMT master-switch transcription factors, TWIST1, SNAI1/Snail1, SNAI2/Slug and ZEB2/Sip1, and are highly invasive. Mesenchymal NSCLC cultures show CIC characteristics, displaying elevated expression of transcription factors KLF4, SOX2, POU5F1/Oct4, MYCN, and KIT. As a result, these putative CIC display a cancer “stem-like” phenotype by forming lung metastases under limiting cell dilution. The pleiotropic transcription factor, NF-κB, has been implicated in EMT and metastasis. Thus, we set out to develop a NSCLC model to further characterize the role of NF-κB activation in the development of CICs. Here, we demonstrate that induction of EMT in 3D cultures results in constitutive NF-κB activity. Furthermore, inhibition of NF-κB resulted in the loss of TWIST1, SNAI2, and ZEB2 induction, and a failure of cells to invade and metastasize. Our work indicates that NF-κB is required for NSCLC metastasis, in part, by transcriptionally upregulating master-switch transcription factors required for EMT.
Conodonts have been considered the earliest skeletonizing vertebrates and their mineralized feeding apparatus interpreted as having performed a tooth function. However, the absence of jaws in conodonts and the small size of their oropharyngeal musculature limit the force available for fracturing food items, presenting a challenge to this interpretation. We address this issue quantitatively using engineering approaches previously applied to mammalian dentitions. We show that the morphology of conodont food-processing elements was adapted to overcome size limitations through developing dental tools of unparalleled sharpness that maximize applied pressure. Combined with observations of wear, we also show how this morphology was employed, demonstrating how Wurmiella excavata used rotational kinematics similar to other conodonts, suggesting that this occlusal style is typical for the clade. Our work places conodont elements within a broader dental framework, providing a phylogenetically independent system for examining convergence and scaling in dental tools.
conodont; tooth; dental tools; finite-element analysis; Wurmiella excavata
The expansion of repressive epigenetic marks has been implicated in heterochromatin formation during embryonic development, but the general applicability of this mechanism is unclear. Here we show that nuclear rearrangement of repressive histone marks H3K9me3 and H3K27me3 into nonoverlapping structural layers characterizes senescence-associated heterochromatic foci (SAHF) formation in human fibroblasts. However, the global landscape of these repressive marks remains unchanged upon SAHF formation, suggesting that in somatic cells, heterochromatin can be formed through the spatial repositioning of pre-existing repressively marked histones. This model is reinforced by the correlation of presenescent replication timing with both the subsequent layered structure of SAHFs and the global landscape of the repressive marks, allowing us to integrate microscopic and genomic information. Furthermore, modulation of SAHF structure does not affect the occupancy of these repressive marks, nor vice versa. These experiments reveal that high-order heterochromatin formation and epigenetic remodeling of the genome can be discrete events.
The organisation of the large volume of mammalian genomic DNA within cell nuclei requires mechanisms to regulate chromatin compaction involving the reversible formation of higher order structures. The compaction state of chromatin varies between interphase and mitosis and is also subject to rapid and reversible change upon ATP depletion/repletion. In this study we have investigated mechanisms that may be involved in promoting the hyper-condensation of chromatin when ATP levels are depleted by treating cells with sodium azide and 2-deoxyglucose. Chromatin conformation was analysed in both live and permeabilised HeLa cells using FLIM-FRET, high resolution fluorescence microscopy and by electron spectroscopic imaging microscopy. We show that chromatin compaction following ATP depletion is not caused by loss of transcription activity and that it can occur at a similar level in both interphase and mitotic cells. Analysis of both live and permeabilised HeLa cells shows that chromatin conformation within nuclei is strongly influenced by the levels of divalent cations, including calcium and magnesium. While ATP depletion results in an increase in the level of unbound calcium, chromatin condensation still occurs even in the presence of a calcium chelator. Chromatin compaction is shown to be strongly affected by small changes in the levels of polyamines, including spermine and spermidine. The data are consistent with a model in which the increased intracellular pool of polyamines and divalent cations, resulting from depletion of ATP, bind to DNA and contribute to the large scale hyper-compaction of chromatin by a charge neutralisation mechanism.
The autoimmune liver disease primary biliary cirrhosis (PBC) is associated with life-altering fatigue in ∼50% of patients. Previous work suggests that fatigued PBC subjects have evidence of autonomic dysfunction and may be at a higher risk of sudden cardiac death. The manifestation of this risk is not clear. This pilot study investigated whether alterations in cardiac torsion and strain could be detected in fatigued or nonfatigued early-stage PBC patients. We performed cardiac tissue tagging and anatomical cine-imaging in 13 early-stage PBC patients (including 7 with significant fatigue) and 10 control subjects to calculate cardiac torsion and strain throughout systole and diastole. From the cardiac tagging, we calculated the torsion-to-shortening ratio (TSR), a measure of subepicardial torsion exerting mechanical advantage over subendocardial shortening. Autonomic function testing was performed to evaluate baroreceptor effective index on standing. TSR was markedly increased in the fatigued PBC patients (0.70 ± 0.13) compared with both controls (0.46 ± 0.11, P = 0.002) and nonfatigued PBC patients (0.44 ± 0.12, P = 0.003). Decreased baroreceptor effective index on standing strongly correlated with increased TSR within the whole PBC group (r = −0.71, P = 0.007). Fatigued PBC patients demonstrate a redistribution of myocardial strain characteristic of a reduced relative contribution to contraction from the subendocardium. This is analogous to the changes found in healthy aging for subjects ∼16 yr older than the fatigued PBC patients. Hence the hearts of fatigued PBC patients may be subject to processes of accelerated aging.
torsion; strain; magnetic resonance imaging; liver disease; autoimmune
The emergence of high-throughput, next-generation sequencing technologies has dramatically altered the way we assess genomes in population genetics and in cancer genomics. Currently, there are four commonly used whole-genome sequencing platforms on the market: Illumina’s HiSeq2000, Life Technologies’ SOLiD 4 and its completely redesigned 5500xl SOLiD, and Complete Genomics’ technology. A number of earlier studies have compared a subset of those sequencing platforms or compared those platforms with Sanger sequencing, which is prohibitively expensive for whole genome studies. Here we present a detailed comparison of the performance of all currently available whole genome sequencing platforms, especially regarding their ability to call SNVs and to evenly cover the genome and specific genomic regions. Unlike earlier studies, we base our comparison on four different samples, allowing us to assess the between-sample variation of the platforms. We find a pronounced GC bias in GC-rich regions for Life Technologies’ platforms, with Complete Genomics performing best here, while we see the least bias in GC-poor regions for HiSeq2000 and 5500xl. HiSeq2000 gives the most uniform coverage and displays the least sample-to-sample variation. In contrast, Complete Genomics exhibits by far the smallest fraction of bases not covered, while the SOLiD platforms reveal remarkable shortcomings, especially in covering CpG islands. When comparing the performance of the four platforms for calling SNPs, HiSeq2000 and Complete Genomics achieve the highest sensitivity, while the SOLiD platforms show the lowest false positive rate. Finally, we find that integrating sequencing data from different platforms offers the potential to combine the strengths of different technologies. In summary, our results detail the strengths and weaknesses of all four whole-genome sequencing platforms. They indicate application areas that call for a specific sequencing platform and disallow other platforms. This helps to identify the proper sequencing platform for whole genome studies with different application scopes.
Here, we present the new UCL Bioinformatics Group’s PSIPRED Protein Analysis Workbench. The Workbench unites all of our previously available analysis methods into a single web-based framework. The new web portal provides a greatly streamlined user interface with a number of new features to allow users to better explore their results. We offer a number of additional services to enable computationally scalable execution of our prediction methods; these include SOAP and XML-RPC web server access and new HADOOP packages. All software and services are available via the UCL Bioinformatics Group website at http://bioinf.cs.ucl.ac.uk/.
The outbreak of severe acute respiratory syndrome in 2002–2003 exacted considerable human and economic costs from countries involved. It also exposed major weaknesses in several of these countries in coping with an outbreak of a newly emerged infectious disease. In the 10 years since the outbreak, in addition to the increase in knowledge of the biology and epidemiology of this disease, a major lesson learned is the value of having a national public health institute that is prepared to control disease outbreaks and designed to coordinate a national response and assist localities in their responses.
severe acute respiratory syndrome; SARS; coronavirus; viruses; disease threats; public health; national public health institutes
Medulloblastoma is an aggressively-growing tumour, arising in the cerebellum or medulla/brain stem. It is the most common malignant brain tumour in children, and displays tremendous biological and clinical heterogeneity1. Despite recent treatment advances, approximately 40% of children experience tumour recurrence, and 30% will die from their disease. Those who survive often have a significantly reduced quality of life.
Four tumour subgroups with distinct clinical, biological and genetic profiles are currently discriminated2,3. WNT tumours, displaying activated wingless pathway signalling, carry a favourable prognosis under current treatment regimens4. SHH tumours show hedgehog pathway activation, and have an intermediate prognosis2. Group 3 & 4 tumours are molecularly less well-characterised, and also present the greatest clinical challenges2,3,5. The full repertoire of genetic events driving this distinction, however, remains unclear.
Here we describe an integrative deep-sequencing analysis of 125 tumour-normal pairs. Tetraploidy was identified as a frequent early event in Group 3 & 4 tumours, and a positive correlation between patient age and mutation rate was observed. Several recurrent mutations were identified, both in known medulloblastoma-related genes (CTNNB1, PTCH1, MLL2, SMARCA4) and in genes not previously linked to this tumour (DDX3X, CTDNEP1, KDM6A, TBR1), often in subgroup-specific patterns. RNA-sequencing confirmed these alterations, and revealed the expression of the first medulloblastoma fusion genes. Chromatin modifiers were frequently altered across all subgroups.
These findings enhance our understanding of the genomic complexity and heterogeneity underlying medulloblastoma, and provide several potential targets for new therapeutics, especially for Group 3 & 4 patients.
To fully understand cell behaviour, biologists are making progress towards cataloguing the functional elements in the human genome and characterising their roles across a variety of tissues and conditions. Yet, functional information – either experimentally validated or computationally inferred by similarity – remains completely missing for approximately 30% of human proteins. FFPred was initially developed to bridge this gap by targeting sequences with distant or no homologues of known function and by exploiting clear patterns of intrinsic disorder associated with particular molecular activities and biological processes. Here, we present an updated and improved version, which builds on larger datasets of protein sequences and annotations, and uses updated component feature predictors as well as revised training procedures. FFPred 2.0 includes support vector regression models for the prediction of 442 Gene Ontology (GO) terms, which greatly expand the coverage of the ontology and of the biological process category in particular. The GO term list mainly revolves around macromolecular interactions and their role in regulatory, signalling, developmental and metabolic processes. Benchmarking experiments on newly annotated proteins show that FFPred 2.0 provides more accurate functional assignments than its predecessor and the ProtFun server do; also, its assignments can complement information obtained using BLAST-based transfer of annotations, improving especially prediction in the biological process category. Furthermore, FFPred 2.0 can be used to annotate proteins belonging to several eukaryotic organisms with a limited decrease in prediction quality. We illustrate all these points through the use of both precision-recall plots and of the COGIC scores, which we recently proposed as an alternative numerical evaluation measure of function prediction accuracy.
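For readers unfamiliar with precision-recall evaluation of function predictions, the sketch below shows how one point on such a plot can be computed for a single protein: predicted terms are those scoring at or above a threshold, and they are compared with the set of experimentally annotated terms. The function name, term labels and scores are hypothetical placeholders, and the handling of an empty prediction set is a convention chosen here, not something taken from the paper. The COGIC score is not reproduced.

```python
def precision_recall(scores, truth, threshold):
    """Precision and recall of predicted terms at one score threshold.

    scores: dict mapping term label -> prediction score in [0, 1]
    truth:  set of term labels annotated for the protein
    """
    predicted = {term for term, s in scores.items() if s >= threshold}
    true_positives = len(predicted & truth)
    # Convention assumed here: precision is 1.0 when nothing is predicted.
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical scores for four placeholder terms; two are truly annotated.
scores = {"term_A": 0.9, "term_B": 0.7, "term_C": 0.4, "term_D": 0.2}
truth = {"term_A", "term_C"}

# Sweeping the threshold traces out the precision-recall curve.
curve = [precision_recall(scores, truth, t) for t in (0.8, 0.5, 0.3)]
for point in curve:
    print(point)
```

Lowering the threshold admits more predicted terms, which typically raises recall at the cost of precision; the plot summarises that trade-off across all thresholds.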