1.  Dynamic changes of HBV markers and HBV DNA load in infants born to HBsAg(+) mothers: can positivity of HBsAg or HBV DNA at birth be an indicator for HBV infection of infants? 
BMC Infectious Diseases  2013;13:524.
Background
Neither HBV DNA nor HBsAg positivity at birth is an accurate marker for HBV infection of infants. No data are available on continuous changes of HBV markers in newborns of HBsAg(+) mothers. This prospective, multicenter study aims to observe the dynamic changes of HBV markers and to explore an early diagnostic marker for mother-to-infant infection.
Methods
One hundred forty-eight HBsAg(+) mothers and their newborns were enrolled after the mothers signed informed consent forms. The infants received combination immunoprophylaxis (hepatitis B immunoglobulin [HBIG] and hepatitis B vaccine) at birth and were then followed up to 12 months. Venous blood of the infants (0, 1, 7, and 12 months of age) was collected to test for HBV DNA and HBV markers.
Results
Of the 148 infants enrolled in our study, 41 and 24 infants were detected as HBsAg(+) and HBV DNA(+) at birth, respectively. Nine were diagnosed with HBV infection after 7 mo follow-up. Dynamic observation of the HBV markers showed that HBV DNA and HBsAg decreased gradually and eventually sero-converted to negativity in the non-infected infants, whereas in the infected infants, HBV DNA and HBsAg were persistently positive, or higher at the end of follow-up. At 1 mo, the infants with anti-HBs(+), despite positivity for HBsAg or HBV DNA at birth, were resolved after 12 mo follow-up, whereas all the nine infants with anti-HBs(−) were diagnosed with HBV infection. Anti-HBs(−) at 1 mo showed a higher positive likelihood ratio for HBV mother-infant infection than HBV DNA and/or HBsAg at birth.
Conclusions
Negativity for anti-HBs at 1 mo can be considered a sensitive and early diagnostic indicator for HBV infection in infants with positive HBV DNA and HBsAg at birth, especially for those infants with low levels of HBV DNA load and HBsAg titer.
doi:10.1186/1471-2334-13-524
PMCID: PMC3829094  PMID: 24195671
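The Results of entry 1 compare markers by their positive likelihood ratio. As a minimal sketch of that statistic, assuming the standard definition LR+ = sensitivity / (1 - specificity), the function below computes it from a 2x2 table. The function name and the example counts are hypothetical and are not taken from the study.

```python
def positive_likelihood_ratio(tp: int, fn: int, fp: int, tn: int) -> float:
    """Return LR+ = sensitivity / (1 - specificity) for a 2x2 table.

    tp/fn: infected infants with a positive/negative test result.
    fp/tn: non-infected infants with a positive/negative test result.
    """
    sensitivity = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    if false_positive_rate == 0.0:
        # No false positives at all: the ratio is unbounded.
        return float("inf")
    return sensitivity / false_positive_rate


# Hypothetical counts for illustration only (NOT the study's data).
example_lr = positive_likelihood_ratio(tp=8, fn=2, fp=10, tn=90)
```

A marker with no false positives yields an unbounded ratio, which the sketch reports as infinity rather than raising a division error.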
2.  Large-scale integrative network-based analysis identifies common pathways disrupted by copy number alterations across cancers 
BMC Genomics  2013;14:440.
Background
Many large-scale studies analyzed high-throughput genomic data to identify altered pathways essential to the development and progression of specific types of cancer. However, no previous study has been extended to provide a comprehensive analysis of pathways disrupted by copy number alterations across different human cancers. Towards this goal, we propose a network-based method to integrate copy number alteration data with human protein-protein interaction networks and pathway databases to identify pathways that are commonly disrupted in many different types of cancer.
Results
We applied our approach to a data set of 2,172 cancer patients across 16 different types of cancers, and discovered a set of commonly disrupted pathways, which are likely essential for tumor formation in the majority of the cancers. We also identified pathways that are only disrupted in specific cancer types, providing molecular markers for different human cancers. Analysis with independent microarray gene expression datasets confirms that the commonly disrupted pathways can be used to identify patient subgroups with significantly different survival outcomes. We also provide a network view of disrupted pathways to explain how copy number alterations affect pathways that regulate cell growth, cycle, and differentiation for tumorigenesis.
Conclusions
In this work, we demonstrated that the network-based integrative analysis can help to identify pathways disrupted by copy number alterations across 16 types of human cancers, which are not readily identifiable by conventional overrepresentation-based and other pathway-based methods. All the results and source code are available at http://compbio.cs.umn.edu/NetPathID/.
doi:10.1186/1471-2164-14-440
PMCID: PMC3703268  PMID: 23822816
3.  Solution NMR structure of the ribosomal protein RP-L35Ae from Pyrococcus furiosus 
Proteins  2012;80(7):1901-1906.
The ribosome consists of small and large subunits each comprised of dozens of proteins and RNA molecules. However, the functions of many of the individual protomers within the ribosome are still unknown. Here we describe the solution NMR structure of the ribosomal protein RP-L35Ae from the archaeon Pyrococcus furiosus. RP-L35Ae is buried within the large subunit of the ribosome and belongs to Pfam protein domain family PF01247, which is highly conserved in eukaryotes, present in a few archaeal genomes, but absent in bacteria. The protein adopts a six-stranded anti-parallel β-barrel analogous to the ‘tRNA binding motif’ fold. The structure of the P. furiosus RP-L35Ae presented here constitutes the first structural representative from this protein domain family.
doi:10.1002/prot.24071
PMCID: PMC3639469  PMID: 22422653
ribosomal protein; L35Ae; PF01247; tRNA binding; solution NMR; structural genomics
4.  Evaluation of EML4-ALK Fusion Proteins in Non-Small Cell Lung Cancer Using Small Molecule Inhibitors 
Neoplasia (New York, N.Y.)  2011;13(1):1-11.
The echinoderm microtubule-associated protein-like 4-anaplastic lymphoma kinase (EML4-ALK) fusion gene resulting from an inversion within chromosome 2p occurs in approximately 5% of non-small cell lung cancer and is mutually exclusive with Ras and EGFR mutations. In this study, we have used a potent and selective ALK small molecule inhibitor, NVP-TAE684, to assess the oncogenic role of EML4-ALK in non-small cell lung cancer (NSCLC). We show here that TAE684 inhibits proliferation and induces cell cycle arrest, apoptosis, and tumor regression in two NSCLC models that harbor EML4-ALK fusions. TAE684 inhibits EML4-ALK activation and its downstream signaling including ERK, AKT, and STAT3. We used microarray analysis to carry out targeted pathway studies of gene expression changes in H2228 NSCLC xenograft model after TAE684 treatment and identified a gene signature of EML4-ALK inhibition. The gene signature represents 1210 known human genes, and the top biologic processes represented by these genes are cell cycle, DNA synthesis, cell proliferation, and cell death. We also compared the effect of TAE684 with PF2341066, a c-Met and ALK small molecule inhibitor currently in clinical trial in cancers harboring ALK fusions, and demonstrated that TAE684 is a much more potent inhibitor of EML4-ALK. Our data demonstrate that EML4-ALK plays an important role in the pathogenesis of a subset of NSCLC and provides insight into the mechanism of EML4-ALK inhibition by a small molecule inhibitor.
PMCID: PMC3022423  PMID: 21245935
5.  NMR and X-RAY structures of human E2-like ubiquitin-fold modifier conjugating enzyme 1 (UFC1) reveal structural and functional conservation in the metazoan UFM1-UBA5-UFC1 ubiquination pathway 
For cell regulation, E2-like ubiquitin-fold modifier conjugating enzyme 1 (Ufc1) is involved in the transfer of ubiquitin-fold modifier 1 (Ufm1), a ubiquitin like protein which is activated by E1-like enzyme Uba5, to various target proteins. Thereby, Ufc1 participates in the very recently discovered Ufm1-Uba5-Ufc1 ubiquination pathway which is found in metazoan organisms. The structure of human Ufc1 was solved by using both NMR spectroscopy and X-ray crystallography. The complementary insights obtained with the two techniques provided a unique basis for understanding the function of Ufc1 at atomic resolution. The Ufc1 structure consists of the catalytic core domain conserved in all E2-like enzymes and an additional N-terminal helix. The active site Cys116, which forms a thio-ester bond with Ufm1, is located in a flexible loop that is highly solvent accessible. Based on the Ufc1 and Ufm1 NMR structures, a model could be derived for the Ufc1-Ufm1 complex in which the C-terminal Gly83 of Ufm1 may well form the expected thio-ester with Cys116, suggesting that Ufm1-Ufc1 functions as described for other E1-E2-E3 machineries. α-helix 1 of Ufc1 adopts different conformations in the crystal and in solution, suggesting that this helix plays a key role to mediate specificity.
doi:10.1007/s10969-008-9054-7
PMCID: PMC2850604  PMID: 19101823
Ufc1; Ufm1; Ubiquitin; E2; Ubiquitin Conjugating Enzyme
6.  Understanding the physical properties controlling protein crystallization based on analysis of large-scale experimental data 
Nature biotechnology  2009;27(1):51-57.
Crystallization has proven to be the most significant bottleneck to high-throughput protein structure determination using diffraction methods. We have used the large-scale, systematically generated experimental results of the Northeast Structural Genomics Consortium to characterize the biophysical properties that control protein crystallization. Datamining of crystallization results combined with explicit folding studies lead to the conclusion that crystallization propensity is controlled primarily by the prevalence of well-ordered surface epitopes capable of mediating interprotein interactions and is not strongly influenced by overall thermodynamic stability. These analyses identify specific sequence features correlating with crystallization propensity that can be used to estimate the crystallization probability of a given construct. Analyses of entire predicted proteomes demonstrate substantial differences in the bulk amino acid sequence properties of human versus eubacterial proteins that reflect likely differences in their biophysical properties including crystallization propensity. Finally, our thermodynamic measurements enable critical evaluation of previous claims regarding correlations between protein stability and bulk sequence properties, which generally are not supported by our dataset.
doi:10.1038/nbt.1514
PMCID: PMC2746436  PMID: 19079241
protein crystallization; protein thermodynamics; crystallization mechanism; surface entropy; datamining; structural genomics
7.  Association of C-Terminal Ubiquitin Hydrolase BRCA1-Associated Protein 1 with Cell Cycle Regulator Host Cell Factor 1 
Molecular and Cellular Biology  2009;29(8):2181-2192.
Protein ubiquitination provides an efficient and reversible mechanism to regulate cell cycle progression and checkpoint control. Numerous regulatory proteins direct the addition of ubiquitin to lysine residues on target proteins, and these are countered by an army of deubiquitinating enzymes (DUBs). BRCA1-associated protein-1 (Bap1) is a ubiquitin carboxy-terminal hydrolase and is frequently mutated in lung and sporadic breast tumors. Bap1 can suppress growth of lung cancer cells in athymic nude mice and this requires its DUB activity. We show here that Bap1 interacts with host cell factor 1 (HCF-1), a transcriptional cofactor found in a number of important regulatory complexes. Bap1 binds to the HCF-1 β-propeller using a variant of the HCF-binding motif found in herpes simplex virus VP16 and other HCF-interacting proteins. HCF-1 is K48 and K63 ubiquitinated, with a major site of linkage at lysines 1807 and 1808 in the HCF-1C subunit. Expression of a catalytically inactive version of Bap1 results in the selective accumulation of K48 ubiquitinated polypeptides. Depletion of Bap1 using small interfering RNA results in a modest accumulation of HCF-1C, suggesting that Bap1 helps to control cell proliferation by regulating HCF-1 protein levels and by associating with genes involved in the G1-S transition.
doi:10.1128/MCB.01517-08
PMCID: PMC2663315  PMID: 19188440
8.  Structural genomics is the largest contributor of novel structural leverage 
The Protein Structure Initiative (PSI) at the US National Institutes of Health (NIH) is funding four large-scale centers for structural genomics (SG). These centers systematically target many large families without structural coverage, as well as very large families with inadequate structural coverage. Here, we report a few simple metrics that demonstrate how successfully these efforts optimize structural coverage: while the PSI-2 (2005-now) contributed more than 8% of all structures deposited into the PDB, it contributed over 20% of all novel structures (i.e. structures for protein sequences with no structural representative in the PDB on the date of deposition). The structural coverage of the protein universe represented by today’s UniProt (v12.8) has increased linearly from 1992 to 2008; structural genomics has contributed significantly to the maintenance of this growth rate. Success in increasing novel leverage (defined in Liu et al. in Nat Biotechnol 25:849–851, 2007) has resulted from systematic targeting of large families. PSI’s per structure contribution to novel leverage was over 4-fold higher than that for non-PSI structural biology efforts during the past 8 years. If the success of the PSI continues, it may just take another ~15 years to cover most sequences in the current UniProt database.
doi:10.1007/s10969-008-9055-6
PMCID: PMC2705706  PMID: 19194785
Protein structure determination; Structural genomics; Evolution; Protein universe
9.  Natural selection of protein structural and functional properties: a single nucleotide polymorphism perspective 
Genome Biology  2008;9(4):R69.
A large-scale survey using single nucleotide polymorphism data from dbSNP provides insights into the evolutionary selection constraints on human proteins of different structural and functional categories.
Background
The rates of molecular evolution for protein-coding genes depend on the stringency of functional or structural constraints. The Ka/Ks ratio has been commonly used as an indicator of selective constraints and is typically calculated from interspecies alignments. Recent accumulation of single nucleotide polymorphism (SNP) data has enabled the derivation of Ka/Ks ratios for polymorphism (SNP A/S ratios).
Results
Using data from the dbSNP database, we conducted the first large-scale survey of SNP A/S ratios for different structural and functional properties. We confirmed that the SNP A/S ratio is largely correlated with Ka/Ks for divergence. We observed stronger selective constraints for proteins that have high mRNA expression levels or broad expression patterns, have no paralogs, arose earlier in evolution, have natively disordered regions, are located in cytoplasm and nucleus, or are related to human diseases. On the residue level, we found higher degrees of variation for residues that are exposed to solvent, are in a loop conformation, natively disordered regions or low complexity regions, or are in the signal peptides of secreted proteins. Our analysis also revealed that histones and protein kinases are among the protein families that are under the strongest selective constraints, whereas olfactory and taste receptors are among the most variable groups.
Conclusion
Our study suggests that the SNP A/S ratio is a robust measure for selective constraints. The correlations between SNP A/S ratios and other variables provide valuable insights into the natural selection of various structural or functional properties, particularly for human-specific genes and constraints within the human lineage.
doi:10.1186/gb-2008-9-4-r69
PMCID: PMC2643940  PMID: 18397526
10.  Membrane Protein Prediction Methods 
Methods (San Diego, Calif.)  2007;41(4):460-474.
We survey computational approaches that tackle membrane protein structure and function prediction. While describing the main ideas that have led to the development of the most relevant and novel methods, we also discuss pitfalls, provide practical hints and highlight the challenges that remain. The methods covered include: sequence alignment, motif search, functional residue identification, transmembrane segment and protein topology predictions, homology and ab initio modeling. Overall, predictions of functional and structural features of membrane proteins are improving, although progress is hampered by the limited amount of high-resolution experimental information available. While predictions of transmembrane segments and protein topology rank among the most accurate methods in computational biology, more attention and effort will be required in the future to ameliorate database search, homology and ab initio modeling.
doi:10.1016/j.ymeth.2006.07.026
PMCID: PMC1934899  PMID: 17367718
membrane proteins; protein structure prediction; protein function prediction; alignments; transmembrane segment prediction; homology modeling; ab initio modeling
11.  Natively Unstructured Loops Differ from Other Loops 
PLoS Computational Biology  2007;3(7):e140.
Natively unstructured or disordered protein regions may increase the functional complexity of an organism; they are particularly abundant in eukaryotes and often evade structure determination. Many computational methods predict unstructured regions by training on outliers in otherwise well-ordered structures. Here, we introduce an approach that uses a neural network in a very different and novel way. We hypothesize that very long contiguous segments with nonregular secondary structure (NORS regions) differ significantly from regular, well-structured loops, and that a method detecting such features could predict natively unstructured regions. Training our new method, NORSnet, on predicted information rather than on experimental data yielded three major advantages: it removed the overlap between testing and training, it systematically covered entire proteomes, and it explicitly focused on one particular aspect of unstructured regions with a simple structural interpretation, namely that they are loops. Our hypothesis was correct: well-structured and unstructured loops differ so substantially that NORSnet succeeded in their distinction. Benchmarks on previously used and new experimental data of unstructured regions revealed that NORSnet performed very well. Although it was not the best single prediction method, NORSnet was sufficiently accurate to flag unstructured regions in proteins that were previously not annotated. In one application, NORSnet revealed previously undetected unstructured regions in putative targets for structural genomics and may thereby contribute to increasing structural coverage of large eukaryotic families. NORSnet found unstructured regions more often in domain boundaries than expected at random. In another application, we estimated that 50%–70% of all worm proteins observed to have more than seven protein–protein interaction partners have unstructured regions. The comparative analysis between NORSnet and DISOPRED2 suggested that long unstructured loops are a major part of unstructured regions in molecular networks.
Author Summary
The details of protein structures are important for function. Regions that do not adopt any regular structure in isolation (natively unstructured or disordered regions) initially appeared as a curious exception to this structure–function paradigm. It has become increasingly clear that unstructured regions are fundamental to many roles and that they are particularly important for multicellular organisms. Structural biology is just beginning to apprehend the stunning diversity of these roles. Here, we focused on unstructured regions dominated by a particular type of loop, namely the natively unstructured one. We developed a method that succeeded in the distinction between well-structured and natively unstructured loops. For the development, we did not use any experimental data for unstructured regions; when tested on experimental data, the method performed surprisingly well. Due to its different premises, the method captured very different aspects of unstructured regions than other methods that we tested. We applied the new method to two different problems. The first was the identification of proteins that may be difficult targets for structure determination. The second was the identification of worm proteins that have many interaction partners (more than seven) and unstructured regions. Surprisingly, we found unstructured regions of the loopy type in more than 50% of all the promiscuous worm proteins.
doi:10.1371/journal.pcbi.0030140
PMCID: PMC1924875  PMID: 17658943
13.  Distinguishing Protein-Coding from Non-Coding RNAs through Support Vector Machines 
PLoS Genetics  2006;2(4):e29.
RIKEN's FANTOM project has revealed many previously unknown coding sequences, as well as an unexpected degree of variation in transcripts resulting from alternative promoter usage and splicing. Ever more transcripts that do not code for proteins have been identified by transcriptome studies, in general. Increasing evidence points to the important cellular roles of such non-coding RNAs (ncRNAs). The distinction of protein-coding RNA transcripts from ncRNA transcripts is therefore an important problem in understanding the transcriptome and carrying out its annotation. Very few in silico methods have specifically addressed this problem. Here, we introduce CONC (for “coding or non-coding”), a novel method based on support vector machines that classifies transcripts according to features they would have if they were coding for proteins. These features include peptide length, amino acid composition, predicted secondary structure content, predicted percentage of exposed residues, compositional entropy, number of homologs from database searches, and alignment entropy. Nucleotide frequencies are also incorporated into the method. Confirmed coding cDNAs for eukaryotic proteins from the Swiss-Prot database constituted the set of true positives, ncRNAs from RNAdb and NONCODE the true negatives. Ten-fold cross-validation suggested that CONC distinguished coding RNAs from ncRNAs at about 97% specificity and 98% sensitivity. Applied to 102,801 mouse cDNAs from the FANTOM3 dataset, our method reliably identified over 14,000 ncRNAs and estimated the total number of ncRNAs to be about 28,000.
Synopsis
There are two types of RNA: messenger RNAs (mRNAs), which are translated into proteins, and non-coding RNAs (ncRNAs), which function as RNA molecules. Besides textbook examples such as tRNAs and rRNAs, non-coding RNAs have been found to carry out very diverse functions, from mRNA splicing and RNA modification to translational regulation. It has been estimated that non-coding RNAs make up the vast majority of transcription output of higher eukaryotes. Discriminating mRNA from ncRNA has become an important biological and computational problem. The authors describe a computational method based on a machine learning algorithm known as a support vector machine (SVM) that classifies transcripts according to features they would have if they were coding for proteins. These features include peptide length, amino acid composition, secondary structure content, and protein alignment information. The method is applied to the dataset from the FANTOM3 large-scale mouse cDNA sequencing project; it identifies over 14,000 ncRNAs in mouse and estimates the total number of ncRNAs in the FANTOM3 data to be about 28,000.
doi:10.1371/journal.pgen.0020029
PMCID: PMC1449884  PMID: 16683024
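Entry 13 lists peptide length, amino acid composition, and compositional entropy among the features fed to the classifier. The sketch below shows one plausible way to compute those three features for a translated peptide. It assumes that "compositional entropy" means the Shannon entropy of the amino acid frequencies, which the abstract does not state, and all function names are hypothetical. It is not the CONC implementation, and it omits the SVM and the remaining features.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(peptide: str) -> dict[str, float]:
    """Fraction of each standard residue; other characters are ignored."""
    counts = Counter(ch for ch in peptide.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("peptide contains no standard amino acids")
    return {aa: counts.get(aa, 0) / total for aa in AMINO_ACIDS}


def compositional_entropy(peptide: str) -> float:
    """Shannon entropy (in bits) of the residue frequencies (assumed definition)."""
    composition = amino_acid_composition(peptide)
    return -sum(p * math.log2(p) for p in composition.values() if p > 0)


def feature_vector(peptide: str) -> list[float]:
    """Length, entropy, then the 20 composition fractions in AMINO_ACIDS order."""
    composition = amino_acid_composition(peptide)
    features = [float(len(peptide)), compositional_entropy(peptide)]
    features.extend(composition[aa] for aa in AMINO_ACIDS)
    return features
```

A vector like this could be passed to any off-the-shelf SVM library; which kernel and parameters the authors used is not stated in the abstract.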
14.  Sequence-based prediction of protein domains 
Nucleic Acids Research  2004;32(12):3522-3530.
Guessing the boundaries of structural domains has been an important and challenging problem in experimental and computational structural biology. Predictions were based on intuition, biochemical properties, statistics, sequence homology and other aspects of predicted protein structure. Here, we introduced CHOPnet, a de novo method that predicts structural domains in the absence of homology to known domains. Our method was based on neural networks and relied exclusively on information available for all proteins. Evaluating sustained performance through rigorous cross-validation on proteins of known structure, we correctly predicted the number of domains in 69% of all proteins. For 50% of the two-domain proteins the centre of the predicted boundary was closer than 20 residues to the boundary assigned from three-dimensional (3D) structures; this was about eight percentage points better than predictions by ‘equal split’. Our results appeared to compare favourably with those from previously published methods. CHOPnet may be useful to restrict the experimental testing of different fragments for structure determination in the context of structural genomics.
doi:10.1093/nar/gkh684
PMCID: PMC484172  PMID: 15240828
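Entry 14 evaluates two-domain predictions against an 'equal split' baseline, counting a prediction as correct when it lies closer than 20 residues to the structure-derived boundary. A minimal sketch of that baseline and criterion follows. It assumes that 'equal split' means cutting the sequence at its midpoint, and the function names are hypothetical.

```python
def equal_split_boundary(sequence_length: int) -> int:
    """Baseline prediction: place the single domain boundary at the midpoint."""
    return sequence_length // 2


def boundary_is_correct(predicted: int, observed: int, tolerance: int = 20) -> bool:
    """True when the predicted boundary is closer than `tolerance` residues."""
    return abs(predicted - observed) < tolerance


def fraction_correct(lengths_and_boundaries: list[tuple[int, int]]) -> float:
    """Share of two-domain proteins for which the equal-split baseline is correct.

    Each item is (sequence_length, observed_boundary_position).
    """
    hits = sum(
        boundary_is_correct(equal_split_boundary(length), observed)
        for length, observed in lengths_and_boundaries
    )
    return hits / len(lengths_and_boundaries)
```

For a hypothetical 300-residue protein the baseline predicts position 150, so it counts as correct for an observed boundary at 165 but not at 170.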
15.  CHOP: parsing proteins into structural domains 
Nucleic Acids Research  2004;32(Web Server issue):W569-W571.
Sequence-based domain assignment is one of the most important and challenging problems in structural biology. We have developed a method, CHOP, that chops proteins into domain-like fragments. The basic idea is to cut proteins from entirely sequenced organisms beginning from very reliable experimental information (Protein Data Bank), proceeding to expert annotations of domain-like regions (Pfam-A) and completing through cuts based on termini of native protein ends. The CHOP server takes protein sequences as input and returns the dissections supported by homology transfer. CHOP results are precompiled for many entirely sequenced proteomes. The service is available at http://www.rostlab.org/services/CHOP/.
doi:10.1093/nar/gkh481
PMCID: PMC441619  PMID: 15215452
16.  The PredictProtein server 
Nucleic Acids Research  2004;32(Web Server issue):W321-W326.
PredictProtein (http://www.predictprotein.org) is an Internet service for sequence analysis and the prediction of protein structure and function. Users submit protein sequences or alignments; PredictProtein returns multiple sequence alignments, PROSITE sequence motifs, low-complexity regions (SEG), nuclear localization signals, regions lacking regular structure (NORS) and predictions of secondary structure, solvent accessibility, globular regions, transmembrane helices, coiled-coil regions, structural switch regions, disulfide-bonds, sub-cellular localization and functional annotations. Upon request fold recognition by prediction-based threading, CHOP domain assignments, predictions of transmembrane strands and inter-residue contacts are also available. For all services, users can submit their query either by electronic mail or interactively via the World Wide Web.
doi:10.1093/nar/gkh377
PMCID: PMC441515  PMID: 15215403
17.  Predicting transmembrane beta-barrels in proteomes 
Nucleic Acids Research  2004;32(8):2566-2577.
Very few methods address the problem of predicting beta-barrel membrane proteins directly from sequence. One reason is that only very few high-resolution structures for transmembrane beta-barrel (TMB) proteins have been determined thus far. Here we introduced the design, statistics and results of a novel profile-based hidden Markov model for the prediction and discrimination of TMBs. The method carefully attempts to avoid over-fitting the sparse experimental data. While our model training and scoring procedures were very similar to a recently published work, the architecture and structure-based labelling were significantly different. In particular, we introduced a new definition of beta-hairpin motifs, explicit state modelling of transmembrane strands, and a log-odds whole-protein discrimination score. The resulting method reached an overall four-state (up-, down-strand, periplasmic-, outer-loop) accuracy as high as 86%. Furthermore, the method accurately discriminated TMB from non-TMB proteins (45% coverage at 100% accuracy). This high precision enabled the application to 72 entirely sequenced Gram-negative bacteria. We found over 164 previously uncharacterized TMB proteins at high confidence. Database searches did not implicate any of these proteins with membranes. We contend that the vast majority of our 164 predictions will eventually be verified experimentally. All proteome predictions and the PROFtmb prediction method are available at http://www.rostlab.org/services/PROFtmb/.
doi:10.1093/nar/gkh580
PMCID: PMC419468  PMID: 15141026
18.  NORSp: predictions of long regions without regular secondary structure 
Nucleic Acids Research  2003;31(13):3833-3835.
Many structurally flexible regions play important roles in biological processes. It has been shown that extended loopy regions are very abundant in the protein universe and that they have been conserved through evolution. Here, we present NORSp, a publicly available predictor for disordered regions in proteins. Specifically, NORSp predicts long regions with NO Regular Secondary structure. Upon user submission of a protein sequence, NORSp analyses the protein for its secondary structure and for the presence of transmembrane helices and coiled-coil regions. It then emails the user a report on the presence and position of disordered regions. NORSp can be accessed from http://cubic.bioc.columbia.edu/services/NORSp/.
PMCID: PMC168922  PMID: 12824431
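The idea of flagging "long regions with no regular secondary structure" can be sketched as a sliding-window scan over a per-residue secondary-structure string. This is a simplified illustration only: the function name, the H/E/L alphabet, the window length and the structured-content threshold are assumptions, and the sketch ignores the transmembrane-helix and coiled-coil checks mentioned in the abstract.

```python
def find_nors_regions(ss, min_len=70, max_structured_frac=0.12):
    """Return half-open (start, end) spans with little helix/strand content.

    `ss` is a predicted secondary-structure string with one letter per
    residue: 'H' helix, 'E' strand, anything else counts as unstructured.
    The defaults for `min_len` and `max_structured_frac` are illustrative
    assumptions, not values stated in the abstract.
    """
    n = len(ss)
    # Prefix sums of structured residues give O(1) counts per window.
    prefix = [0]
    for state in ss:
        prefix.append(prefix[-1] + (1 if state in "HE" else 0))

    # Mark every residue covered by at least one qualifying window.
    flagged = [False] * n
    for start in range(0, n - min_len + 1):
        end = start + min_len
        if (prefix[end] - prefix[start]) / min_len <= max_structured_frac:
            for i in range(start, end):
                flagged[i] = True

    # Merge consecutive flagged residues into regions.
    regions = []
    i = 0
    while i < n:
        if flagged[i]:
            j = i
            while j < n and flagged[j]:
                j += 1
            regions.append((i, j))
            i = j
        else:
            i += 1
    return regions


# Toy input: 4 helix residues, 10 loop residues, 4 strand residues.
print(find_nors_regions("HHHHLLLLLLLLLLEEEE", min_len=10, max_structured_frac=0.0))
```

Overlapping windows are merged, so one long unstructured stretch is reported as a single region rather than as many overlapping hits.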
19.  The PredictProtein server 
Nucleic Acids Research  2003;31(13):3300-3304.
PredictProtein (PP, http://cubic.bioc.columbia.edu/pp/) is an internet service for sequence analysis and the prediction of aspects of protein structure and function. Users submit protein sequences or alignments; the server returns a multiple sequence alignment, PROSITE sequence motifs, low-complexity regions (SEG), ProDom domain assignments, nuclear localisation signals, regions lacking regular structure and predictions of secondary structure, solvent accessibility, globular regions, transmembrane helices, coiled-coil regions, structural switch regions and disulfide bonds. Upon request, fold recognition by prediction-based threading is available. For all services, users can submit their query either by electronic mail or interactively via the World Wide Web.
PMCID: PMC168915  PMID: 12824312
20.  PEP: Predictions for Entire Proteomes 
Nucleic Acids Research  2003;31(1):410-413.
PEP is a database of Predictions for Entire Proteomes. The database contains summaries of analyses of protein sequences from a range of organisms representing all three major kingdoms of life: eukaryotes, prokaryotes and archaea. All publicly available proteins for these organisms were aligned against SWISS-PROT, TrEMBL and PDB. Additionally, the following annotations are provided: secondary structure, transmembrane helices, coiled coils, regions of low complexity, signal peptides, PROSITE motifs, nuclear localization signals and classes of cellular function. Proteins that contain long regions without regular secondary structure are also identified. We have produced a related database of structural domain-like fragments derived from PEP, together with clusters based on homology between all fragments. The PEP database, fragments and clusters are distributed freely as a set of flat files and have been integrated into SRS. The PEP group of databases can be accessed from: http://cubic.bioc.columbia.edu/pep.
PMCID: PMC165549  PMID: 12520036
21.  Host genetic background impacts modulation of the TLR4 pathway by RON in tissue-associated macrophages 
Immunology and Cell Biology  2013;91(7):451-460.
Toll-like receptors (TLRs) enable metazoans to mount effective innate immune responses to microbial and viral pathogens, as well as to endogenous host-derived ligands. It is understood that genetic background of the host can influence TLR responsiveness, altering susceptibility to pathogen infection, autoimmunity and cancer. Macrophage stimulatory protein (MSP), which activates the receptor tyrosine kinase recepteur d'origine nantais (RON), promotes key macrophage functions such as motility and phagocytic activity. MSP also acts via RON to modulate signaling by TLR4, which recognizes a range of pathogen or endogenous host-derived molecules. Here, we show that RON exerts divergent control over TLR4 activity in macrophages from different mouse genetic backgrounds. RON potently modulated the TLR4 response in macrophages from M2-prone FVB mice, as compared with M1-skewed C57Bl6 mice. Moreover, global expression analysis revealed that RON suppresses the TLR4-dependent type-I interferon gene signature only in FVB macrophages. This leads to attenuated production of the potent inflammatory mediator, tumor necrosis factor-α. Eliminating RON kinase activity markedly decreased carcinogen-mediated tumorigenesis in M2/Th2-biased FVB mice. We propose that host genetic background influences RON function, thereby contributing to the variability in TLR4 responsiveness in rodents and, potentially, in humans. These findings provide novel insight into the complex interplay between genetic context and immune function.
doi:10.1038/icb.2013.27
PMCID: PMC3736205  PMID: 23817579
RON; macrophage; TLR4; interferon
