Related Articles

1.  EVAcon: a protein contact prediction evaluation service 
Nucleic Acids Research  2005;33(Web Server issue):W347-W351.
Here we introduce EVAcon, an automated web service that evaluates the performance of contact prediction servers. Currently, EVAcon is monitoring nine servers, four of which are specialized in contact prediction and five are general structure prediction servers. Results are compared for all newly determined experimental structures deposited into PDB (∼5–50 per week). EVAcon allows for a precise comparison of the results based on a system of common protein subsets and the commonly accepted evaluation criteria that are also used in the corresponding category of the CASP assessment. EVAcon is a new service added to the functionality of the EVA system for the continuous evaluation of protein structure prediction servers. The new service is accessible from any of the three EVA mirrors: PDG (CNB-CSIC, Madrid); CUBIC (Columbia University, NYC); and Sali Lab (UCSF, San Francisco).
PMCID: PMC1160172  PMID: 15980486
2.  LOC3D: annotate sub-cellular localization for protein structures 
Nucleic Acids Research  2003;31(13):3337-3340.
LOC3D is both a weekly-updated database and a web server for predictions of sub-cellular localization for eukaryotic proteins of known three-dimensional (3D) structure. Localization is predicted using four different methods: (i) PredictNLS, prediction of nuclear proteins through nuclear localization signals; (ii) LOChom, inferring localization through sequence homology; (iii) LOCkey, inferring localization through automatic text analysis of SWISS-PROT keywords; and (iv) LOC3Dini, ab initio prediction through a system of neural networks and support vector machines. The final prediction is based on the method that predicts localization with the highest confidence. The LOC3D database currently contains predictions for >8700 eukaryotic protein chains taken from the Protein Data Bank (PDB). The web server can be used to predict sub-cellular localization for proteins for which only a predicted structure is available from threading servers. This makes the resource of particular interest to structural genomics initiatives.
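The selection rule described above (report the prediction of whichever method is most confident) can be sketched in a few lines of Python. This is only an illustration: the `Prediction` type, the `pick_most_confident` helper, and the confidence values are hypothetical, and the sketch assumes the four methods report confidences on a comparable scale, which the abstract does not specify.

```python
from typing import NamedTuple, Sequence


class Prediction(NamedTuple):
    method: str        # name of the predicting method
    localization: str  # predicted sub-cellular compartment
    confidence: float  # assumed comparable across methods (an assumption)


def pick_most_confident(predictions: Sequence[Prediction]) -> Prediction:
    """Return the prediction with the highest confidence value."""
    if not predictions:
        raise ValueError("no predictions supplied")
    return max(predictions, key=lambda p: p.confidence)


# Made-up confidences, for illustration only.
example = [
    Prediction("PredictNLS", "nuclear", 0.62),
    Prediction("LOChom", "cytoplasmic", 0.81),
    Prediction("LOCkey", "nuclear", 0.74),
    Prediction("LOC3Dini", "extracellular", 0.55),
]
final = pick_most_confident(example)
print(final.method, final.localization)
```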
PMCID: PMC168921  PMID: 12824321
3.  Improving protein secondary structure prediction based on short subsequences with local structure similarity 
BMC Genomics  2010;11(Suppl 4):S4.
When characterizing the structural topology of proteins, protein secondary structure (PSS) plays an important role in analyzing and modeling protein structures because it represents the local conformation of amino acids into regular structures. Although PSS prediction has been studied for decades, the prediction accuracy reaches a bottleneck at around 80%, and further improvement is very difficult.
In this paper, we present an improved dictionary-based PSS prediction method called SymPred, and a meta-predictor called SymPsiPred. We adopt the concept behind natural language processing techniques and propose synonymous words to capture local sequence similarities in a group of similar proteins. A synonymous word is an n-gram pattern of amino acids that reflects the sequence variation in a protein’s evolution. We generate a protein-dependent synonymous dictionary from a set of protein sequences for PSS prediction.
On a large non-redundant dataset of 8,297 protein chains (DsspNr-25), the average Q3 scores of SymPred and SymPsiPred are 81.0% and 83.9%, respectively. On the two latest independent test sets (EVA_Set1 and EVA_Set2), the average Q3 scores of SymPred are 78.8% and 79.2%, respectively. SymPred outperforms other existing methods by 1.4% to 5.4%. We study two factors that may affect the performance of SymPred and find that it is very sensitive to the number of proteins of both known and unknown structures. This finding implies that SymPred and SymPsiPred have the potential to achieve higher accuracy as the number of protein sequences in the NCBInr and PDB databases increases.
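For readers unfamiliar with the metric, Q3 is the percentage of residues whose three-state secondary structure (helix H, strand E, coil C) is predicted correctly. A minimal sketch of the per-chain calculation follows; the function name `q3_score` and the two example strings are illustrative and are not taken from the paper.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label (H, E, C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("empty sequence")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Toy ten-residue example: one residue (index 3) is mispredicted.
observed = "HHHHEEEECC"
predicted = "HHHEEEEECC"
print(q3_score(predicted, observed))
```

An average Q3 over a dataset, as reported above, would then be the mean of this value across chains (or across all residues, depending on the convention used).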
Our experimental results show that local similarities in protein sequences typically exhibit conserved structures, which can be used to improve the accuracy of secondary structure prediction. As an application of synonymous words, we demonstrate a sequence alignment generated from the distribution of synonymous words shared by a pair of protein sequences. The two sequences, which are very dissimilar at the sequence level but very similar at the structural level, can be aligned nearly perfectly. The SymPred and SymPsiPred prediction servers are available online.
PMCID: PMC3005913  PMID: 21143813
4.  META-PP: single interface to crucial prediction servers 
Nucleic Acids Research  2003;31(13):3308-3310.
The META-PP server simplifies access to a battery of public protein structure and function prediction servers by providing a common and stable web-based interface. The goal is to make these powerful and increasingly essential methods more readily available to nonexpert users and the bioinformatics community at large. At present META-PP provides access to a selected set of high-quality servers in the areas of comparative modelling, threading/fold recognition, secondary structure prediction and more specialized fields like contact and function prediction.
PMCID: PMC168978  PMID: 12824314
5.  Distill: a suite of web servers for the prediction of one-, two- and three-dimensional structural features of proteins 
BMC Bioinformatics  2006;7:402.
We describe Distill, a suite of servers for the prediction of protein structural features: secondary structure; relative solvent accessibility; contact density; backbone structural motifs; residue contact maps at 6, 8 and 12 Angstrom; coarse protein topology. The servers are based on large-scale ensembles of recursive neural networks and trained on large, up-to-date, non-redundant subsets of the Protein Data Bank. Together with structural feature predictions, Distill includes a server for prediction of Cα traces for short proteins (up to 200 amino acids).
The servers are state-of-the-art, with secondary structure predicted correctly for nearly 80% of residues (currently the top performance on EVA), 2-class solvent accessibility nearly 80% correct, and contact maps exceeding 50% precision on the top non-diagonal contacts. A preliminary implementation of the predictor of protein Cα traces featured among the top 20 Novel Fold predictors at the last CASP6 experiment as group Distill (ID 0348). The majority of the servers, including the Cα trace predictor, now take into account homology information from the PDB, when available, resulting in greatly improved reliability.
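Contact-map precision on the top-ranked non-diagonal contacts can be computed roughly as follows. This is a hedged sketch, not Distill's or EVA's evaluation code: the function name, the dictionary layout of `scores`, the minimum sequence separation of 6 used to exclude near-diagonal pairs, and the toy numbers are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[int, int]  # residue index pair (i, j) with i < j


def top_contact_precision(
    scores: Dict[Pair, float],
    true_contacts: Set[Pair],
    top_n: int,
    min_separation: int = 6,  # assumed cut-off for "non-diagonal" pairs
) -> float:
    """Fraction of the top_n highest-scoring non-diagonal pairs that are true contacts."""
    candidates = [
        (pair, score)
        for pair, score in scores.items()
        if pair[1] - pair[0] >= min_separation
    ]
    ranked = sorted(candidates, key=lambda item: item[1], reverse=True)[:top_n]
    if not ranked:
        raise ValueError("no candidate pairs after filtering")
    hits = sum(1 for pair, _ in ranked if pair in true_contacts)
    return hits / len(ranked)


# Toy data: (1, 3) scores highest but is too close to the diagonal to count.
scores = {(0, 10): 0.9, (2, 9): 0.8, (1, 3): 0.95, (4, 12): 0.7, (5, 20): 0.4}
true_contacts = {(0, 10), (4, 12)}
precision = top_contact_precision(scores, true_contacts, top_n=3)
print(round(precision, 3))
```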
All predictions are freely available through a simple joint web interface and the results are returned by email. In a single submission the user can send protein sequences for a total of up to 32k residues to all or a selection of the servers. Distill is accessible online.
PMCID: PMC1574355  PMID: 16953874
6.  Cluster-based exposure variation analysis 
Static posture, repetitive movements and lack of physical variation are known risk factors for work-related musculoskeletal disorders, and thus need to be properly assessed in occupational studies. The aims of this study were (i) to investigate the effectiveness of a conventional exposure variation analysis (EVA) in discriminating exposure time lines and (ii) to compare it with a new cluster-based method for analysis of exposure variation.
For this purpose, we simulated a repeated cyclic exposure varying within each cycle between “low” and “high” exposure levels in a “near” or “far” range, and with “low” or “high” velocities (exposure change rates). The duration of each cycle was also manipulated by selecting a “small” or “large” standard deviation of the cycle time. These parameters reflected three dimensions of exposure variation, i.e. range, frequency and temporal similarity.
Each simulation trace included two realizations of 100 concatenated cycles with either low (ρ = 0.1), medium (ρ = 0.5) or high (ρ = 0.9) correlation between the realizations. These traces were analyzed by conventional EVA, and a novel cluster-based EVA (C-EVA). Principal component analysis (PCA) was applied on the marginal distributions of 1) the EVA of each of the realizations (univariate approach), 2) a combination of the EVA of both realizations (multivariate approach) and 3) C-EVA. The least number of principal components describing more than 90% of variability in each case was selected and the projection of marginal distributions along the selected principal component was calculated. A linear classifier was then applied to these projections to discriminate between the simulated exposure patterns, and the accuracy of classified realizations was determined.
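The component-selection step described above (keep the smallest number of principal components that together describe more than 90% of the variability) can be sketched as below. The sketch assumes the PCA eigenvalues, i.e. the per-component variances, have already been computed; the function name and the example eigenvalues are hypothetical.

```python
from typing import Sequence


def min_components_for_variance(
    eigenvalues: Sequence[float], threshold: float = 0.90
) -> int:
    """Smallest k such that the k largest eigenvalues explain more than `threshold`."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive value")
    cumulative = 0.0
    for k, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value
        if cumulative / total > threshold:
            return k
    # Threshold never exceeded (e.g. threshold >= 1): keep every component.
    return len(eigenvalues)


# Hypothetical eigenvalues: cumulative explained variance is 50%, 80%, 95%, 100%.
k = min_components_for_variance([5.0, 3.0, 1.5, 0.5])
print(k)
```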
C-EVA classified exposures more correctly than univariate and multivariate EVA approaches; classification accuracy was 49%, 47% and 52% for EVA (univariate and multivariate), and C-EVA, respectively (p < 0.001). All three methods performed poorly in discriminating exposure patterns differing with respect to the variability in cycle time duration.
While C-EVA had a higher accuracy than conventional EVA, both failed to detect differences in temporal similarity. The data-driven optimality of data reduction and the capability of handling multiple exposure time lines in a single analysis are the advantages of the C-EVA.
PMCID: PMC3623884  PMID: 23557439
Ergonomics; Physical work load; Linear discriminant analysis; Work-related musculoskeletal disorders; Principal component analysis
7.  Improving the accuracy of protein secondary structure prediction using structural alignment 
BMC Bioinformatics  2006;7:301.
The accuracy of protein secondary structure prediction has steadily improved over the past 30 years. Now many secondary structure prediction methods routinely achieve an accuracy (Q3) of about 75%. We believe this accuracy could be further improved by including structure (as opposed to sequence) database comparisons as part of the prediction process. Indeed, given the large size of the Protein Data Bank (>35,000 sequences), the probability of a newly identified sequence having a structural homologue is actually quite high.
We have developed a method that performs structure-based sequence alignments as part of the secondary structure prediction process. By mapping the structure of a known homologue (sequence ID >25%) onto the query protein's sequence, it is possible to predict at least a portion of that query protein's secondary structure. By integrating this structural alignment approach with conventional (sequence-based) secondary structure methods and then combining it with a "jury-of-experts" system to generate a consensus result, it is possible to attain very high prediction accuracy. Using a sequence-unique test set of 1644 proteins from EVA, this new method achieves an average Q3 score of 81.3%. Extensive testing indicates this is approximately 4–5% better than any other method currently available. Assessments using non sequence-unique test sets (typical of those used in proteome annotation or structural genomics) indicate that this new method can achieve a Q3 score approaching 88%.
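The abstract does not say how the "jury-of-experts" combines its inputs. As a stand-in, the sketch below uses a simple per-residue majority vote over several three-state predictions, with ties resolved in favour of the earliest-listed predictor. Both the voting rule and the tie-break are assumptions made for illustration and should not be read as the PROTEUS method.

```python
from collections import Counter
from typing import Sequence


def consensus(predictions: Sequence[str]) -> str:
    """Per-residue majority vote; ties go to the earliest-listed predictor."""
    lengths = {len(p) for p in predictions}
    if len(lengths) != 1:
        raise ValueError("need at least one prediction, all of equal length")
    result = []
    for column in zip(*predictions):
        counts = Counter(column)
        best = max(counts.values())
        # Walk the column in predictor order so ties favour earlier predictors.
        winner = next(state for state in column if counts[state] == best)
        result.append(winner)
    return "".join(result)


# Three hypothetical predictors for a six-residue fragment.
votes = ["HHHECC", "HHEECC", "HCEECC"]
print(consensus(votes))
```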
By using both sequence and structure databases and by exploiting the latest techniques in machine learning it is possible to routinely predict protein secondary structure with an accuracy well above 80%. A program and web server, called PROTEUS, that performs these secondary structure predictions is accessible online. For high throughput or batch sequence analyses, the PROTEUS programs, databases (and server) can be downloaded and run locally.
PMCID: PMC1550433  PMID: 16774686
8.  SWISS-MODEL: an automated protein homology-modeling server 
Nucleic Acids Research  2003;31(13):3381-3385.
SWISS-MODEL is a server for automated comparative modeling of three-dimensional (3D) protein structures. It pioneered the field of automated modeling starting in 1993 and is the most widely-used free web-based automated modeling facility today. In 2002 the server computed 120 000 user requests for 3D protein models. SWISS-MODEL provides several levels of user interaction through its World Wide Web interface: in the ‘first approach mode’ only an amino acid sequence of a protein is submitted to build a 3D model. Template selection, alignment and model building are performed fully automatically by the server. In the ‘alignment mode’, the modeling process is based on a user-defined target-template alignment. Complex modeling tasks can be handled with the ‘project mode’ using DeepView (Swiss-PdbViewer), an integrated sequence-to-structure workbench. All models are sent back via email with a detailed modeling report. WhatCheck analyses and ANOLEA evaluations are provided optionally. The reliability of SWISS-MODEL is continuously evaluated in the EVA-CM project. The SWISS-MODEL server is under constant development to improve the successful implementation of expert knowledge into an easy-to-use server.
PMCID: PMC168927  PMID: 12824332
9.  Visual Acuity Testing Using Autorefraction or Pinhole Occluder as Compared with a Manual Protocol Refraction in Individuals with Diabetes 
Ophthalmology  2010;118(3):537-542.
To compare visual acuity (VA) scores obtained after autorefraction or using a pinhole occluder to scores obtained after refraction according to a standard clinical research protocol.
Prospective, comparative case series
One hundred and ten study participants (209 eyes) with diabetes mellitus and a broad range of diabetic retinopathy severity and visual acuity (VA).
VA was measured after autorefraction by a Topcon KR-8000 autorefractor as well as after a Diabetic Retinopathy Clinical Research Network protocol manual refraction. The order of testing was randomized and examiners were masked to the source of each refraction. A second VA measurement, utilizing an identical manual refraction, was made in a subset of eyes (N = 144, 69%) in order to establish test-retest variability for comparison purposes. All eyes underwent VA testing using a pinhole occluder.
Main Outcome Measures
Best corrected VA as measured by the Electronic Early Treatment Diabetic Retinopathy Study Visual Acuity Test© (E-ETDRS).
In all eyes, the median E-ETDRS VA letter score (EVA) obtained after manual refraction (MR-EVA) was 69 (Snellen equivalent 20/40), ranging from 4 to 93 (20/800 to 20/16). The median MR-EVA was slightly better than the median EVA obtained after autorefraction (AR-EVA), with a median difference (AR-EVA – MR-EVA) of −1 letter (25th, 75th percentiles: −4, 2 letters). The absolute difference between AR-EVA and MR-EVA was similar to the test-retest variability of MR-EVA alone. In contrast, MR-EVA was better than EVA obtained using a pinhole occluder (PH-EVA), (median PH-EVA – MR-EVA: −4 letters [−9, 0]), and had significantly less test-retest variability (P<0.001). Generally, the spherical equivalent of autorefraction was slightly more hyperopic (or less myopic) than the spherical equivalent of manual refraction (median difference: +0.25 Diopters [0, +0.63 Diopters]).
Given the substantial time and effort required for training and certification of study protocol refractionists, and the similarity between AR-EVA and MR-EVA, further evaluation of autorefraction, but not pinhole occluder testing, as an alternative to the current clinical research gold standard of ETDRS protocol manual refraction in study participants with diabetic retinopathy is warranted.
PMCID: PMC3057328  PMID: 20947171
10.  Static benchmarking of membrane helix predictions 
Nucleic Acids Research  2003;31(13):3642-3644.
Prediction of trans-membrane helices continues to be a difficult task with a few prediction methods clearly taking the lead; none of these is clearly best on all accounts. Recently, we have carefully set up protocols for benchmarking the most relevant aspects of prediction accuracy and have applied them to >30 prediction methods. Here, we present the extension of that analysis to the level of an automatic web server evaluating new methods. The most important achievements of the tool are: (i) any new method is compared to the battery of well-established tools; (ii) the battery of measures explored allows spotting strengths in methods that may not be ‘best’ overall. In particular, we report per-residue and per-segment scores for accuracy and the error-rates for confusing membrane helices with globular proteins or signal peptides. An additional feature is that developers can directly investigate any hydrophobicity scale for its potential in predicting membrane helices.
PMCID: PMC168939  PMID: 12824384
11.  The PredictProtein server 
Nucleic Acids Research  2003;31(13):3300-3304.
PredictProtein (PP) is an internet service for sequence analysis and the prediction of aspects of protein structure and function. Users submit protein sequence or alignments; the server returns a multiple sequence alignment, PROSITE sequence motifs, low-complexity regions (SEG), ProDom domain assignments, nuclear localisation signals, regions lacking regular structure and predictions of secondary structure, solvent accessibility, globular regions, transmembrane helices, coiled-coil regions, structural switch regions and disulfide-bonds. Upon request, fold recognition by prediction-based threading is available. For all services, users can submit their query either by electronic mail or interactively from World Wide Web.
PMCID: PMC168915  PMID: 12824312
12.  The effect of customised and sham foot orthoses on plantar pressures 
The effectiveness of foot orthoses has been evaluated in many clinical trials with sham foot orthoses used as the control intervention in at least 10 clinical trials. However, the mechanical effects and credibility of sham orthoses have rarely been quantified. This study aimed to: (i) compare the effects on plantar pressures of three sham foot orthoses to a customised foot orthosis, and (ii) establish the perceived credibility and the expected benefit of each orthotic condition.
Thirty adults aged between 18 and 51 participated in this study. At 0 and 4 weeks, plantar pressure data were collected for the heel, midfoot and forefoot using the pedar®-X in-shoe system for the following five randomly assigned conditions: (i) shoe alone, (ii) customised foot orthosis, (iii) contoured polyethylene sham foot orthosis, (iv) contoured EVA sham foot orthosis, and (v) flat EVA sham foot orthosis. At the initial data collection session, each participant completed a Credibility/Expectancy Questionnaire (CEQ) to determine the credibility and expected benefit of each orthotic condition.
Compared to the shoe alone at week 0, the contoured polyethylene sham orthosis was the only condition that did not significantly affect peak pressure at any region of the foot. In contrast, the contoured EVA sham orthosis, the flat EVA sham orthosis and the customised orthosis significantly reduced peak pressure at the heel. At the medial midfoot, all sham orthoses provided the same effect as the shoe alone, which corresponded to effects that were significantly different to the customised orthosis. There were no differences in peak pressure between conditions at the other mask regions, the lateral midfoot and forefoot. When the conditions were compared at week 4, the differences between the conditions were generally similar to the findings observed at week 0. With respect to credibility and expected benefit, all orthotic conditions were considered the same with the exception of the contoured polyethylene sham orthosis, which was perceived as being less credible and less likely to provide benefits.
The findings of this study indicate that all of the sham orthoses tested provided the same effect on plantar pressures at the midfoot and forefoot as a shoe alone. However, the contoured EVA sham orthosis and the flat EVA sham orthosis significantly reduced peak pressure under the heel, which was similar to the customised orthosis. In contrast, the contoured polyethylene sham orthosis had no significant effect on plantar pressure and was comparable to the shoe alone at all regions of the foot. Hence, lower plantar pressures were found under the heel with some sham orthoses, but not with others. Importantly, participants perceived the polyethylene sham orthosis – the sham that had no effect on plantar pressure – to be the least credible orthosis and the least likely to provide benefits. This may be critical for the design of future clinical trials as it may introduce confounding effects that produce inaccurate results. These findings provide some evidence for the mechanical effects, treatment credibility and expected benefit of sham foot orthoses, which should be considered when they are used as a control intervention in a clinical trial.
PMCID: PMC3663766  PMID: 23680496
Orthoses; Orthotic devices; Sham treatment; Kinetics
13.  A benchmark server using high resolution protein structure data, and benchmark results for membrane helix predictions 
BMC Bioinformatics  2013;14:111.
Helical membrane proteins are vital for the interaction of cells with their environment. Predicting the location of membrane helices in protein amino acid sequences provides substantial understanding of their structure and function and identifies membrane proteins in sequenced genomes. Currently there is no comprehensive benchmark tool for evaluating prediction methods, and there is no publication comparing all available prediction tools. Current benchmark literature is outdated, as recently determined membrane protein structures are not included. Current literature is also limited to global assessments, as specialised benchmarks for predicting specific classes of membrane proteins were not previously carried out.
We present a benchmark server that uses recent high resolution protein structural data to provide a comprehensive assessment of the accuracy of existing membrane helix prediction methods. The server further allows a user to compare uploaded predictions generated by novel methods, permitting the comparison of these novel methods against all existing methods compared by the server. Benchmark metrics include sensitivity and specificity of predictions for membrane helix location and orientation, and many others. The server allows for customised evaluations such as assessing prediction method performances for specific helical membrane protein subtypes.
We report results for custom benchmarks which illustrate how the server may be used for specialised benchmarks. Which prediction method is the best performing method depends on which measure is being benchmarked. The OCTOPUS membrane helix prediction method is consistently one of the highest performing methods across all measures in the benchmarks that we performed.
The benchmark server allows general and specialised assessment of existing and novel membrane helix prediction methods. Users can employ this benchmark server to determine the most suitable method for the type of prediction the user needs to perform, be it general whole-genome annotation or the prediction of specific types of helical membrane protein. Creators of novel prediction methods can use this benchmark server to evaluate the performance of their new methods. The benchmark server will be a valuable tool for researchers seeking to extract more sophisticated information from the large and growing protein sequence databases.
PMCID: PMC3620685  PMID: 23530628
Helical membrane proteins; Transmembrane helix prediction; Benchmark
14.  Effects of Dilation on Electronic-ETDRS Visual Acuity (EVA) in Diabetic Patients 
To evaluate the effect of pupillary dilation on electronic-ETDRS visual acuity (EVA) in diabetic subjects and to assess post-dilation EVA as a surrogate for pre-dilation visual acuity (VA).
Methods and Design: Refraction and EVA were measured pre- and post-dilation in diabetic subjects by independent, masked examiners.
In 129 eyes of 66 subjects, median [25th, 75th percentiles] pre-dilation EVA score was 69 [54, 86] (Snellen-equivalent 20/40-1 [20/80-1, 20/20+1]). Pre-dilation VA was ≥20/20, 20/25-20/40, 20/50-20/80, and <20/80 in 29%, 19%, 26%, and 26% of eyes, respectively. Median EVA change post-dilation was -3 letters [-7, 0]. EVA change was ≥15 letters (≥ 3 ETDRS lines) in 9% of eyes and ≥10 letters (≥ 2 ETDRS lines) in 19% of eyes. Extent of change (range +12 to -25 letters) was associated with baseline VA. No relationship was identified between EVA change and gender, race, lens status, refractive error, DR severity, or primary cause of vision loss.
In an optimized clinical trial setting, there is a decline in best-corrected EVA after dilation in diabetic subjects. The large range and magnitude of VA change preclude using post-dilation EVA as a surrogate for undilated VA.
PMCID: PMC2762194  PMID: 18936147
15.  Oxidative Lung Damage Resulting from Repeated Exposure to Radiation and Hyperoxia Associated with Space Exploration 
Spaceflight missions may require crewmembers to conduct Extravehicular Activities (EVA) for repair, maintenance or scientific purposes. Pre-breathe protocols in preparation for an EVA entail 100% hyperoxia exposure that may last for a few hours (5-8 hours), and may be repeated 2-3 times weekly. Each EVA is associated with additional challenges such as low levels of total body cosmic/galactic radiation exposure that may present a threat to crewmember health and therefore, pose a threat to the success of the mission. We have developed a murine model of combined, hyperoxia and radiation exposure (double-hit) in the context of evaluating countermeasures to oxidative lung damage associated with space flight. In the current study, our objective was to characterize the early and chronic effects of repeated single and double-hit challenge on lung tissue using a novel murine model of repeated exposure to low-level total body radiation and hyperoxia. This is the first study of its kind evaluating lung damage relevant to space exploration in a rodent model.
Mouse cohorts (n=5-15/group) were exposed to repeated: a) normoxia; b) >95% O2 (O2); c) 0.25Gy single fraction gamma radiation (IR); or d) a combination of O2 and IR (O2+IR) given 3 times per week for 4 weeks. Lungs were evaluated for oxidative damage, active TGFβ1 levels, cell apoptosis, inflammation, injury, and fibrosis at 1, 2, 4, 8, 12, 16, and 20 weeks post-initiation of exposure.
Mouse cohorts exposed to all challenge conditions displayed decreased bodyweight compared to untreated controls at 4 and 8 weeks post-challenge initiation. Chronic oxidative lung damage to lipids (malondialdehyde levels), DNA (TUNEL, cleaved Caspase 3, cleaved PARP positivity) leading to apoptotic cell death, and proteins (nitrotyrosine levels) was elevated in all treatment groups. Significant systemic oxidative stress was also noted at the late phase in mouse plasma, BAL fluid, and urine. Importantly, late oxidative damage across all parameters that we measured was significantly higher than in controls in all cohorts and was exacerbated by the combined exposure to O2 and IR. Additionally, impaired levels of arterial blood oxygenation were noted in all exposure cohorts. Significant but transient elevation of lung tissue fibrosis (p<0.05), determined by lung hydroxyproline content, was detected as early as 2 weeks in mice exposed to challenge conditions and persisted for 4-8 weeks only. Interestingly, active TGFβ1 levels in BAL fluid were also transiently elevated during the exposure time only (1-4 weeks). Inflammation and lung edema/lung injury were also significantly elevated in all groups at both early and late time points, especially in the double-hit group.
We have characterized significant, early and chronic lung changes consistent with oxidative tissue damage in our murine model of repeated radiation and hyperoxia exposure relevant to space travel. Lung tissue changes, detectable several months after the original exposure, include significant oxidative lung damage (lipid peroxidation, DNA damage and protein nitrosative stress) and increased pulmonary fibrosis. These findings, along with increased oxidative stress in diverse body fluids and the observed decreases in blood oxygenation levels in all challenge conditions (whether single or in combination), lead us to conclude that in our model of repeated exposure to oxidative stressors, chronic tissue changes are detected that persist even months after the exposure to the stressor has ended. This data will provide useful information in the design of countermeasures to tissue oxidative damage associated with space exploration.
PMCID: PMC3866035  PMID: 24358450
Apoptosis; Bronchoalveolar lavage; Caspase 3; Double-hit; Extravehicular activity; Hyperoxia; Inflammation; Lung fibrosis; Lung injury; Mouse model; Nitrotyrosine; Oxidative stress; PARP; Radiation pneumonopathy; Space exploration; TGF-β1; Total body irradiation; TUNEL
16.  Evaluation of the Thyroid in Patients With Hearing Loss and Enlarged Vestibular Aqueducts 
To evaluate thyroid structure and function in patients with enlargement of the vestibular aqueduct (EVA) and sensorineural hearing loss.
Prospective cohort survey.
National Institutes of Health Clinical Center, a federal biomedical research facility.
The study population comprised 80 individuals, aged 1.5 to 59 years, ascertained on the basis of EVA and sensorineural hearing loss.
Main Outcome Measures
Associations among the number of mutant alleles of SLC26A4; volume and texture of the thyroid; percentage of iodine 123 (123I) discharged at 120 minutes after administration of perchlorate in the perchlorate discharge test; and peripheral venous blood levels of thyrotropin, thyroxine, free thyroxine, triiodothyronine, thyroglobulin, antithyroid peroxidase and antithyroglobulin antibodies, and thyroid-binding globulin.
Thyroid volume is primarily genotype dependent in pediatric patients but age dependent in older patients. Individuals with 2 mutant SLC26A4 alleles discharged a significantly (P ≤ .001) greater percentage of 123I compared with those with no mutant alleles or 1 mutant allele. Thyroid function, as measured by serologic testing, is not associated with the number of mutant alleles.
Ultrasonography with measurement of gland volume is recommended for initial assessment and follow-up surveillance of the thyroid in patients with EVA. Perchlorate discharge testing is recommended for the diagnostic evaluation of patients with EVA along with goiter, nondiagnostic SLC26A4 genotypes (zero or 1 mutant allele), or both.
PMCID: PMC2941509  PMID: 19620588
17.  PROMALS3D web server for accurate multiple protein sequence and structure alignments 
Nucleic Acids Research  2008;36(Web Server issue):W30-W34.
Multiple sequence alignments are essential in computational sequence and structural analysis, with applications in homology detection, structure modeling, function prediction and phylogenetic analysis. We report the PROMALS3D web server for constructing alignments for multiple protein sequences and/or structures using information from available 3D structures, database homologs and predicted secondary structures. PROMALS3D shows higher alignment accuracy than a number of other advanced methods. Input to the PROMALS3D web server can be FASTA format protein sequences, PDB format protein structures and/or user-defined alignment constraints. The output page provides alignments in several formats, including a colored alignment augmented with useful information about sequence grouping, predicted secondary structures and consensus sequences. Intermediate results of sequence and structural database searches are also available. The PROMALS3D web server is available online.
PMCID: PMC2447800  PMID: 18503087
18.  Jenner-predict server: prediction of protein vaccine candidates (PVCs) in bacteria based on host-pathogen interactions 
BMC Bioinformatics  2013;14:211.
Subunit vaccines based on recombinant proteins have been effective in preventing infectious diseases and are expected to meet the demands of future vaccine development. Computational approaches, especially the reverse vaccinology (RV) method, have enormous potential for identification of protein vaccine candidates (PVCs) from a proteome. The existing protective antigen prediction software and web servers have low prediction accuracy, leading to limited applications for vaccine development. Besides machine learning techniques, those software and web servers have considered only a protein’s adhesin-likeliness as the criterion for identification of PVCs. Several non-adhesin functional classes of proteins involved in host-pathogen interactions and pathogenesis are known to provide protection against bacterial infections. Therefore, knowledge of bacterial pathogenesis has the potential to identify PVCs.
A web server, Jenner-Predict, has been developed for prediction of PVCs from proteomes of bacterial pathogens. The web server targets host-pathogen interactions and pathogenesis by considering known functional domains from protein classes such as adhesin, virulence, invasin, porin, flagellin, colonization, toxin, choline-binding, penicillin-binding, transferrin-binding, fibronectin-binding and solute-binding. It predicts non-cytosolic proteins containing the above domains as PVCs. It also provides the vaccine potential of PVCs in terms of their possible immunogenicity by comparing them with experimentally known IEDB epitopes, absence of autoimmunity and conservation in different strains. Predicted PVCs are prioritized so that only a few prospective PVCs need to be validated experimentally. The performance of the web server was evaluated against known protective antigens from diverse classes of bacteria reported in the Protegen database and the datasets used for VaxiJen server development. The web server efficiently predicted known vaccine candidates reported from Streptococcus pneumoniae and Escherichia coli proteomes. The Jenner-Predict server outperformed the NERVE, Vaxign and VaxiJen methods. It has a sensitivity of 0.774 and 0.711 for the Protegen and VaxiJen datasets, respectively, while a specificity of 0.940 was obtained for the latter dataset.
The better prediction accuracy of the Jenner-Predict web server signifies that domains involved in host-pathogen interactions and pathogenesis are better criteria for prediction of PVCs. The web server has successfully predicted most of the known PVCs belonging to different functional classes. Jenner-Predict server is freely accessible at
PMCID: PMC3701604  PMID: 23815072
Protein vaccine candidates (PVCs); Host-pathogen interactions; Domain; Antigen; Reverse vaccinology; Virulence
19.  SLC26A4 genotype, but not cochlear radiologic structure, is correlated with hearing loss in ears with an enlarged vestibular aqueduct (EVA) 
The Laryngoscope  2010;120(2):384-389.
Identify correlations among SLC26A4 genotype, cochlear structural anomalies, and hearing loss associated with enlargement of the vestibular aqueduct (EVA).
Study Design
Prospective cohort survey, National Institutes of Health, Clinical Center, a federal biomedical research facility.
83 individuals, 11 months to 59 years of age, with EVA in at least one ear. Correlations among pure-tone hearing thresholds, number of mutant SLC26A4 alleles, and the presence of cochlear anomalies detected by computed tomography or magnetic resonance imaging.
Linear mixed-effect model indicates significantly poorer hearing in ears with EVA from individuals with two mutant alleles of SLC26A4 than in those with EVA and a single mutant allele (p = .012) or no mutant alleles (p = .007) in this gene. There was no detectable relationship between degree of hearing loss and the presence of structural cochlear anomalies.
The number of mutant alleles of SLC26A4, but not the presence of cochlear anomalies, has a significant association with severity of hearing loss in ears with EVA. This information will be useful for prognostic counseling of patients and families with EVA.
PMCID: PMC2811762  PMID: 19998422
enlarged vestibular aqueduct; SLC26A4; hearing
20.  KoBaMIN: a knowledge-based minimization web server for protein structure refinement 
Nucleic Acids Research  2012;40(Web Server issue):W323-W328.
The KoBaMIN web server provides an online interface to a simple, consistent and computationally efficient protein structure refinement protocol based on minimization of a knowledge-based potential of mean force. The server can be used to refine either a single protein structure or an ensemble of proteins starting from their unrefined coordinates in PDB format. The refinement method is particularly fast and accurate due to the underlying knowledge-based potential derived from structures deposited in the PDB; as such, the energy function implicitly includes the effects of solvent and the crystal environment. Our server allows for an optional but recommended step that optimizes stereochemistry using the MESHI software. The KoBaMIN server also allows comparison of the refined structures with a provided reference structure to assess the changes brought about by the refinement protocol. The performance of KoBaMIN has been benchmarked widely on a large set of decoys, all models generated at the seventh worldwide experiments on critical assessment of techniques for protein structure prediction (CASP7) and it was also shown to produce top-ranking predictions in the refinement category at both CASP8 and CASP9, yielding consistently good results across a broad range of model quality values. The web server is fully functional and freely available at
PMCID: PMC3394243  PMID: 22564897
21.  LISE: a server using ligand-interacting and site-enriched protein triangles for prediction of ligand-binding sites 
Nucleic Acids Research  2013;41(Web Server issue):W292-W296.
LISE is a web server for a novel method for predicting small molecule binding sites on proteins. It differs from a number of servers currently available for such predictions in two aspects. First, rather than relying on knowledge of similar protein structures, identification of surface cavities or estimation of binding energy, LISE computes a score by counting geometric motifs extracted from sub-structures of interaction networks connecting protein and ligand atoms. These network motifs take into account spatial and physicochemical properties of ligand-interacting protein surface atoms. Second, LISE has now been more thoroughly tested, as, in addition to the evaluation we previously reported using two commonly used small benchmark test sets and targets of two community-based experiments on ligand-binding site predictions, we now report an evaluation using a large non-redundant data set containing >2000 protein–ligand complexes. This unprecedented test, the largest ever reported to our knowledge, demonstrates LISE’s overall accuracy and robustness. Furthermore, we have identified some hard to predict protein classes and provided an estimate of the performance that can be expected from a state-of-the-art binding site prediction server, such as LISE, on a proteome scale. The server is freely available at
PMCID: PMC3692107  PMID: 23609546
22.  WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation 
BMC Genomics  2013;14(Suppl 3):S6.
SNPs&GO is a method for the prediction of deleterious Single Amino acid Polymorphisms (SAPs) using protein functional annotation. In this work, we present the web server implementation of SNPs&GO (WS-SNPs&GO). The server is based on Support Vector Machines (SVM) and for a given protein, its input comprises: the sequence and/or its three-dimensional structure (when available), a set of target variations and its functional Gene Ontology (GO) terms. The output of the server provides, for each protein variation, the probabilities to be associated to human diseases.
The server consists of two main components, including updated versions of the sequence-based SNPs&GO (recently scored as one of the best algorithms for predicting deleterious SAPs) and of the structure-based SNPs&GO3d programs. Sequence and structure based algorithms are extensively tested on a large set of annotated variations extracted from the SwissVar database. Selecting a balanced dataset with more than 38,000 SAPs, the sequence-based approach achieves 81% overall accuracy, 0.61 correlation coefficient and an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve of 0.88. For the subset of ~6,600 variations mapped on protein structures available at the Protein Data Bank (PDB), the structure-based method scores with 84% overall accuracy, 0.68 correlation coefficient, and 0.91 AUC. When tested on a new blind set of variations, the results of the server are 79% and 83% overall accuracy for the sequence-based and structure-based inputs, respectively.
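The accuracy and correlation figures quoted above can be illustrated with a small sketch. It assumes that "correlation coefficient" refers to the Matthews correlation coefficient computed from a binary confusion matrix (disease-related versus neutral variants); the abstract does not spell this out, and the function name and the counts below are hypothetical, not the SwissVar results.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (overall accuracy, Matthews correlation coefficient).

    tp/tn/fp/fn are confusion-matrix counts for a binary classifier.
    Assumption: the "correlation coefficient" in the abstract is the MCC.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is set to 0 when any marginal sum is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical balanced example: 50 deleterious and 50 neutral variants.
acc, mcc = accuracy_and_mcc(tp=40, tn=40, fp=10, fn=10)
print(f"accuracy={acc:.2f}, MCC={mcc:.2f}")
```

On a balanced set such as this one, accuracy and MCC move together; on unbalanced sets the MCC is usually the more informative of the two, which may be why both are reported.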
WS-SNPs&GO is a valuable tool that includes in a unique framework information derived from protein sequence, structure, evolutionary profile, and protein function. WS-SNPs&GO is freely available at
PMCID: PMC3665478  PMID: 23819482
23.  Evaluation of Visual Acuity Measurements after Autorefraction versus Manual Refraction in Eyes with and without Diabetic Macular Edema 
Archives of Ophthalmology  2011;130(4):470-479.
To compare visual acuity (VA) scores after autorefraction versus research protocol manual refraction in eyes of patients with diabetes and a wide range of VA.
Electronic Early Treatment Diabetic Retinopathy Study (E-ETDRS) VA Test© letter score (EVA) was measured after autorefraction (AR-EVA) and after Diabetic Retinopathy Clinical Research Network protocol manual refraction (MR-EVA). Testing order was randomized, study participants and VA examiners were masked to refraction source, and a second EVA utilizing an identical manual refraction (MR-EVAsupl) was performed to determine test-retest variability.
In 878 eyes of 456 study participants, median MR-EVA was 74 (Snellen equivalent approximately 20/32). Spherical equivalent was often similar for manual and autorefraction (median difference: 0.00, 5th and 95th percentiles −1.75 to +1.13 Diopters). However, on average, MR-EVA results were slightly better than AR-EVA results across the entire VA range. Furthermore, variability between AR-EVA and MR-EVA was substantially greater than the test-retest variability of MR-EVA (P<0.001). Variability of differences was highly dependent on autorefractor model.
Across a wide range of VA at multiple sites using a variety of autorefractors, VA measurements tend to be worse with autorefraction than manual refraction. Differences between individual autorefractor models were identified. However, even among autorefractor models comparing most favorably to manual refraction, VA variability between autorefraction and manual refraction is higher than the test-retest variability of manual refraction. The results suggest that with current instruments, autorefraction is not an acceptable substitute for manual refraction for most clinical trials with primary outcomes dependent on best-corrected VA.
PMCID: PMC3489033  PMID: 22159173
24.  MetaDBSite: a meta approach to improve protein DNA-binding sites prediction 
BMC Systems Biology  2011;5(Suppl 1):S7.
Protein-DNA interactions play an important role in many fundamental biological activities such as DNA replication, transcription and repair. Identification of amino acid residues involved in DNA binding site is critical for understanding of the mechanism of gene regulations. In the last decade, there have been a number of computational approaches developed to predict protein-DNA binding sites based on protein sequence and/or structural information.
In this article, we present metaDBSite, a meta web server to predict DNA-binding residues for DNA-binding proteins. MetaDBSite integrates the prediction results from six available online web servers: DISIS, DNABindR, BindN, BindN-rf, DP-Bind and DBS-PRED and it solely uses sequence information of proteins. A large dataset of DNA-binding proteins is constructed from the Protein Data Bank and it serves as a gold-standard benchmark to evaluate the metaDBSite approach and the other six predictors.
The comparison results show that metaDBSite outperforms each individual approach. We believe that metaDBSite will become a useful and integrative tool for predicting protein DNA-binding residues. The MetaDBSite web-server is freely available at and
PMCID: PMC3121123  PMID: 21689482
25.  Prediction of protein secondary structures with a novel kernel density estimation based classifier 
BMC Research Notes  2008;1:51.
Though prediction of protein secondary structures has been an active research issue in bioinformatics for quite a few years and many approaches have been proposed, a new challenge emerges as the sizes of contemporary protein structure databases continue to grow rapidly. The new challenge concerns how we can effectively exploit all the information implicitly deposited in the protein structure databases and deliver ever-improving prediction accuracy as the databases expand rapidly.
The new challenge is addressed in this article by proposing a predictor designed with a novel kernel density estimation algorithm. One main distinctive feature of the kernel density estimation based approach is that the average execution time taken by the training process is on the order of O(n log n), where n is the number of instances in the training dataset. In the experiments reported in this article, the proposed predictor delivered an average Q3 (three-state prediction accuracy) score of 80.3% and an average SOV (segment overlap) score of 76.9% for a set of 27 benchmark protein chains extracted from the EVA server that are longer than 100 residues.
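For readers unfamiliar with the Q3 measure mentioned above, the following is a minimal sketch of how a per-chain Q3 score is commonly computed: the percentage of residues whose predicted state (helix H, strand E, coil C) matches the observed state. The function name and the example strings are hypothetical and are not taken from the article; the SOV score is not covered here.

```python
def q3_score(predicted, observed):
    """Percentage of residues whose predicted 3-state label matches.

    Both arguments are strings over the alphabet H (helix), E (strand)
    and C (coil), one character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical 8-residue chain with one mismatch at the sixth residue.
print(q3_score("HHHEECCC", "HHHEEECC"))
```

An average Q3 over a benchmark set, as reported in the abstract, would then be the mean of the per-chain scores, assuming per-chain rather than per-residue averaging.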
The experimental results reported in this article reveal that we can continue to achieve higher prediction accuracy of protein secondary structures by effectively exploiting the structural information deposited in fast-growing protein structure databases. In this respect, the kernel density estimation based approach enjoys a distinctive advantage with its low time complexity for carrying out the training process.
PMCID: PMC2527571  PMID: 18710504
