BMC Bioinformatics. 2010; 11: 407.
Published online 2010 July 31. doi:  10.1186/1471-2105-11-407
PMCID: PMC2920885

Predicting β-turns and their types using predicted backbone dihedral angles and secondary structures

Abstract

Background

β-turns are secondary structure elements usually classified as coil. Their prediction is important, because of their role in protein folding and their frequent occurrence in protein chains.

Results

We have developed a novel method that predicts β-turns and their types using information from multiple sequence alignments, predicted secondary structures and, for the first time, predicted dihedral angles. Our method uses support vector machines, a supervised classification technique, and is trained and tested on three established datasets of 426, 547 and 823 protein chains. We achieve a Matthews correlation coefficient of up to 0.49 when predicting the location of β-turns, the highest reported value to date. Moreover, the additional dihedral information improves the prediction of β-turn types I, II, IV, VIII and "non-specific", achieving correlation coefficients up to 0.39, 0.33, 0.27, 0.14 and 0.38, respectively. Our results are more accurate than those of other methods.

Conclusions

We have created an accurate predictor of β-turns and their types. Our method, called DEBT, is available online at http://comp.chem.nottingham.ac.uk/debt/.

Background

Secondary structure can provide important information about three-dimensional protein structure. Therefore, its prediction has been an area of intense research over the past three decades. To predict secondary structure many methods have been implemented, including different machine learning techniques, such as artificial neural networks (ANNs) [1,2] and support vector machines (SVMs) [3-5], and different input schemes, such as position specific scoring matrices (PSSMs) [2] and hidden Markov models [6]. Notably, the predictive accuracy reached 80% for three-state prediction, where residues are divided into helix, strand and coil. Helices and strands are repetitive, regular structures, while the remaining residues, which can be tight turns, loops, bulges or random coil, are all classified as coil; they are non-repetitive, irregular secondary structures [7]. Although the helix and strand classes are structurally well-defined, the third class, coil, does not provide any detailed structural information. Hence, further analysis of the local structure is necessary, such as prediction of backbone dihedral angles [5,8] and prediction of tight turns [9].

Tight turns play an important role in protein folding and stability. They are partly responsible for the compact, globular shape of proteins, because they provide directional change to the polypeptide chain [10]. Depending on the number of constituent residues, tight turns can be classified as α-turns, β-turns, γ-turns, δ-turns or π-turns. A β-turn is formed by four adjacent residues, which are not in an α-helix, where the distance between Cα(i) and Cα(i + 3) is less than 7 Å [9]. The β-turns are the most common tight turns. On average, about a quarter of all residues are in a β-turn [11]. Moreover, β-turns are crucial components of β-hairpins, the fundamental elements of anti-parallel β-sheets, whose prediction has recently attracted interest [12-14]. Furthermore, β-turn formation is an important step in protein folding [15], while improved β-turn sequences can improve protein stability [16,17]. Additionally, their occurrence on the surface of proteins suggests their involvement in molecular recognition processes and in interactions between peptide substrates and receptors [18]. Recently, mimicking β-turns for the synthesis of medicines [19,20] or for nucleating β-sheet folding [21] has also attracted interest. Thus, the prediction of β-turns can facilitate three-dimensional structure prediction and can provide important information about protein folding. Hutchinson and Thornton [22] classified the β-turns into nine types based on the dihedral angles of residues i + 1 and i + 2 in the turn (Table 1). This is the most established classification of β-turns.

Table 1
The dihedral angles of β-turn types [22]

Prediction of β-turns has attracted interest in the past. The approaches can be divided into statistical methods and machine learning techniques. The former include early methods which used amino acid propensities [23-27] as well as more recent methods, like COUDES [28], which used probabilities with multiple sequence alignments. Over the past few years, machine learning techniques have been applied successfully to predict β-turns. Since their first use [29], ANNs have been frequently used for β-turn prediction [30-32]. Over the past decade, several studies used SVMs to predict β-turns [33-37] and other techniques, such as nearest neighbour, have been applied recently [38]. Through the use of evolutionary information and more sophisticated machine learning techniques, the correlation coefficient for turn/non-turn prediction is now as high as 0.47 [34]. Other methods predict the type of β-turn, rather than the location of the turn in the chain, with significant success, even though this problem is challenging, due to the lack of examples for many β-turn types. BTPRED [30], BetaTurns [39], MOLEBRNN [32] and the method of Asgary and colleagues [40] are ANN-based, whereas COUDES [28] uses amino acid propensities with multiple sequence alignments. In spite of its successful use for the prediction of β-turn location [34,37], the SVM method has not been employed widely for β-turn type prediction.

Despite the success so far, there is a need for more accurate predictions of both β-turn location and β-turn type, which could be realised through the use of additional information. Evolutionary information from multiple alignments [31] as well as predicted secondary structures [30] can improve β-turn predictions dramatically. In this work, we show that the backbone dihedral angles can provide crucial information for turn/non-turn prediction and can also noticeably improve the prediction of β-turn types, since the types are defined by the dihedral angles of the central residues. Predicted dihedral angles have been used successfully for secondary structure prediction [5,41]. The method presented here, called DEBT (Dihedrally Enhanced Beta Turn prediction), uses predicted secondary structures and predicted dihedral angles from DISSPred [5] and achieves the highest correlation coefficient reported to date for turn/non-turn prediction, while the prediction of β-turn types is, in most cases, more accurate than other contemporary methods. The method predicts β-turn types I, II, IV and VIII as defined by Hutchinson and Thornton [22], while all remaining types are classified as NS (non-specific). Moreover, we show that using a small local window of predicted secondary structures and dihedral angles, rather than using the predictions of one individual residue, is beneficial.

Methods

Datasets

DEBT was trained and tested on four non-redundant datasets, which contain chains with at least one β-turn and have X-ray crystallographic resolution better than 2.0 Å. All protein chains have less than 25% sequence similarity, to ensure that there is very little homology in the training set. The first dataset, denoted as GR426 in this paper, consists of 426 protein chains [42] and was used to study the impact of various training schemes and to tune the kernel parameters. GR426 has been used by the majority of recent β-turn prediction methods and, therefore, we can use it to make direct comparisons. In 2002, the GR426 dataset was used for the evaluation of β-turn prediction methods [43] and was partitioned into seven subsets in order to perform seven-fold cross-validation. In this work, we utilised the same partition for the cross-validation. After finding the optimal input scheme and tuning the kernel parameters, we used two additional datasets, constructed for training and testing COUDES [28], to validate the performance of our method. The datasets consist of 547 and 823 protein chains and are denoted as FA547 and FA823, respectively. Finally, DEBT's web-server was trained using PDB-Select25 (version October 2008) [44], a set of 4018 chains from the PDB with less than 25% sequence similarity. From these chains, we have selected those that contain at least one β-turn and have an X-ray crystallographic resolution below 2.0 Å. This gave a dataset of 1296 protein chains, denoted as PDB1296 in this article, which is the largest training set used for a β-turn prediction server. The PDB codes and chain identifiers of the above datasets are listed at DEBT's website http://comp.chem.nottingham.ac.uk/debt/. The β-turns and their types were assigned using the PROMOTIF program [45]. In this work, we predict β-turn types I, II, IV and VIII, while all the remaining types are assigned to the miscellaneous class NS (non-specific). Table 2 shows the distributions of β-turns and their types in each dataset.

Table 2
Distribution of residues in β-turns and their types in different datasets

The DEBT method utilises PSSMs, constructed by the PSI-BLAST algorithm [46], to predict β-turns and their types. PSSMs have N × 20 elements, where the N rows correspond to the length of the amino acid sequence and the columns correspond to the 20 standard amino acids. PSSMs represent the log-likelihood of a particular residue substitution, usually based on a weighted average of BLOSUM62 [47]. We generated the PSSMs using the BLOSUM62 substitution matrix with an E-value of 0.001 and three iterations against a non-redundant (nr) database, which was downloaded in February 2009. The data were filtered by pfilt [48] to remove low complexity regions, transmembrane spans and coiled coil regions. The PSSM values were linearly scaled by dividing them by ten. Typically, PSSM values are in the range [-7,7], but some values outside this range may appear. Linear scaling maintains the same distribution in the input data and helps avoid numerical difficulties during training.

Support Vector Machines

DEBT employs SVM [49], a state-of-the-art supervised learning technique. The SVM method has become an area of intense research, because it performs well with real-world problems, it is simple to understand and implement and, most importantly, it finds the global solution, while other methods, like ANNs, have several local solutions [50]. The SVM can find non-linear boundaries between two classes by using a kernel function, which maps the data from the input space into a richer feature space, where linear boundaries can be implemented. Furthermore, the SVM effectively handles large feature spaces, since it does not suffer from the "curse of dimensionality", and, therefore, avoids overfitting, a common drawback of supervised learning techniques.

A detailed description of the SVM algorithm can be found in various textbooks [50-52]. In brief, given input vectors $\mathbf{x}_i \in \mathbb{R}^n$ and output values $y_i \in \{-1, 1\}$, the fundamental goal of a binary SVM classifier is to solve the following optimisation problem:

$$\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \; \frac{1}{2}\mathbf{w}^{T}\mathbf{w} + C\sum_{i}\xi_i \quad \text{subject to} \quad y_i\left(\mathbf{w}^{T}\phi(\mathbf{x}_i) + b\right) \geq 1 - \xi_i, \quad \xi_i \geq 0 \tag{1}$$

where $\mathbf{w}$ is a vector perpendicular to the hyperplane, b is the offset from the origin, $\xi_i$ are slack variables that measure how far instance i violates the margin, $\phi$ is the mapping into the feature space induced by the kernel, and C is a penalty parameter for each misclassification. Thus, C controls the trade-off between training error and the margin that separates the two classes. The kernel function used in our case is the radial basis function (RBF), shown in equation 2, which was successfully used for complex problems, such as secondary structure prediction [3] and dihedral prediction [5].

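$$K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\gamma \left\lVert \mathbf{x}_i - \mathbf{x}_j \right\rVert^{2}\right), \quad \gamma > 0 \tag{2}$$

where $\mathbf{x}_i$ and $\mathbf{x}_j$ are the input vectors for instances i and j, respectively, and γ is a parameter that controls the width of the kernel.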
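As a minimal sketch, the RBF kernel in its standard form, K = exp(-γ‖xi - xj‖²), can be evaluated for two input vectors as follows. The function name `rbf_kernel` and the plain-list representation of the vectors are assumptions made for illustration; LibSVM computes the kernel internally.

```python
import math


def rbf_kernel(x_i, x_j, gamma):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    if len(x_i) != len(x_j):
        raise ValueError("input vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)


# Identical vectors give the maximum value of 1; the value decays
# towards 0 as the vectors move apart, faster for larger gamma.
same = rbf_kernel([0.3, -0.1], [0.3, -0.1], gamma=0.5)
apart = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)
```

A larger γ makes the kernel narrower, so each training instance influences a smaller neighbourhood of the input space.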

LibSVM [53], a popular SVM software package, was employed for the training and testing of the SVM classifiers. In order to get the optimal predictive performance, we optimised three parameters: C (equation 1), γ (equation 2) and w. The latter controls the cost of misclassification for the minority class and, therefore, reduces the effect of the imbalance in the datasets. In other words, a different penalty parameter is used for each class [54]. The optimised parameters for each classifier are shown in Table 3. Seven-fold cross-validation was applied on datasets GR426, FA547 and FA823. For GR426, we utilised the same subsets used by Kaur and Raghava [55] to evaluate different β-turn prediction methods, whereas the partition of the other two datasets was identical to the one used to train COUDES [28].
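One common way to write class-dependent penalties, and the form that LibSVM's per-class weight option corresponds to, is sketched below. The symbols $C_{+}$, $C_{-}$ and the scalar $\omega$ are notation assumed for this sketch: $\omega$ stands in for the parameter the text calls w and is renamed only to avoid a clash with the weight vector $\mathbf{w}$. Treating the turn class as the minority class with label $y_i = +1$ is likewise an assumption.

```latex
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \;
  \frac{1}{2}\mathbf{w}^{T}\mathbf{w}
  + C_{+}\sum_{i \,:\, y_i = +1}\xi_i
  + C_{-}\sum_{i \,:\, y_i = -1}\xi_i,
\qquad C_{+} = \omega\, C, \quad C_{-} = C
```

With $\omega > 1$, a margin violation by a minority-class residue costs more than one by a majority-class residue, which counteracts the tendency of a classifier trained on unbalanced data to predict every residue as non-turn.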

Table 3
Optimised parameters for each SVM classifier used in DEBT.

DEBT architecture

Figure 1 shows the architecture of the method. DEBT uses two different local windows around the residue to be predicted: one, l1, of nine residues for the PSSM values and a second, l2, of five residues for the predicted secondary structures and dihedral angles, both centred around the residue to be predicted. DISSPred [5] is used to predict both three-state secondary structure and the dihedral angles. DISSPred uses different partitions of the ϕ - ψ space created by two unsupervised clustering algorithms and both the algorithm and the number of clusters can be adjusted by the user. Subsequently, DISSPred predicts the secondary structure and the dihedral angles using an iterative process. For each residue in window l2, the predicted secondary structures are encoded using three binary attributes, one for each state: (1,0,0) for helix, (0,1,0) for strand and (0,0,1) for coil. The dihedral angles are predicted by DISSPred using a partition of seven clusters and, therefore, are encoded similarly using seven binary attributes. Thus, the input vectors of the SVM classifiers have 230 attributes: 180 attributes for the PSSM values, 15 attributes for the predicted secondary structures and 35 attributes for the predicted dihedral clusters. We used the same architecture for both turn/non-turn prediction and β-turn type prediction.
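The encoding described above can be sketched as follows, to make the attribute count concrete (9 × 20 + 5 × 3 + 5 × 7 = 230). This is an illustration under stated assumptions rather than DEBT's actual code: the function names, the ordering of the three blocks, the 0 to 6 cluster labels and, in particular, the zero padding beyond the chain termini are hypothetical, since the text does not say how window positions outside the chain are handled.

```python
PSSM_WINDOW = 9    # l1: residues contributing PSSM rows
LOCAL_WINDOW = 5   # l2: residues contributing predicted states
SS_STATES = "HEC"  # helix, strand, coil -> (1,0,0), (0,1,0), (0,0,1)
N_CLUSTERS = 7     # dihedral clusters, labelled 0..6 here (assumption)


def one_hot(index, size):
    """Return a binary vector of length `size` with a 1 at `index`."""
    vector = [0.0] * size
    vector[index] = 1.0
    return vector


def window(position, width, length):
    """Yield chain indices of a centred window, or None beyond the termini."""
    half = width // 2
    for k in range(position - half, position + half + 1):
        yield k if 0 <= k < length else None


def encode_residue(position, pssm, secondary, clusters):
    """Build the 230-attribute input vector for one residue."""
    length = len(pssm)
    features = []
    for k in window(position, PSSM_WINDOW, length):
        # PSSM values are divided by ten; padded rows are zeros (assumption).
        features.extend([v / 10.0 for v in pssm[k]] if k is not None else [0.0] * 20)
    for k in window(position, LOCAL_WINDOW, length):
        features.extend(one_hot(SS_STATES.index(secondary[k]), 3) if k is not None else [0.0] * 3)
    for k in window(position, LOCAL_WINDOW, length):
        features.extend(one_hot(clusters[k], N_CLUSTERS) if k is not None else [0.0] * N_CLUSTERS)
    return features


# Toy six-residue chain with made-up PSSM rows and predictions.
pssm = [[5] * 20 for _ in range(6)]
vector = encode_residue(0, pssm, "CHHEEC", [3, 0, 1, 6, 2, 4])
print(len(vector))  # 9*20 + 5*3 + 5*7 = 230
```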

Figure 1
The architecture of our β-turn location and β-turn type prediction method. An example of an input sequence is provided at the top. Around each residue to be predicted (shown in red), two local windows are used. One, l1, has a size of nine ...

Filtering

Because the prediction is based on individual residues, the SVM outputs include some β-turns that are shorter than four residues, which is unrealistic. Turn predictions longer than four adjacent residues are acceptable, since there are many β-turns in the dataset that are overlapping. In fact, about 58% are multiple turns [22]. To ensure that the predictions are at least four residues long, we applied some filtering rules similar to the "state-flipping" rule described by Shepherd and colleagues [30]. The rules are applied in the following order: (1) flip isolated non-turn predictions to turn (tnt → ttt), (2) flip isolated turn predictions to non-turn (ntn → nnn), (3) flip isolated pairs of turn predictions to non-turn (nttn → nnnn) and (4) flip the adjacent non-turn predictions to turn for three isolated consecutive turn predictions (ntttn → ttttt).
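The four rules can be sketched as regular-expression substitutions over a string of per-residue states. This is a minimal illustration, not the authors' implementation: the function name `filter_turns`, the 't'/'n' string encoding and the choice to leave predictions at the chain termini untouched are assumptions. Lookarounds are used so that overlapping patterns (for example tntnt) are handled in a single pass per rule.

```python
import re

# Rules in the order given in the text; each pattern rewrites only the
# states that flip, using lookarounds to check the flanking states.
RULES = [
    (re.compile(r"(?<=t)n(?=t)"), "t"),          # (1) tnt   -> ttt
    (re.compile(r"(?<=n)t(?=n)"), "n"),          # (2) ntn   -> nnn
    (re.compile(r"(?<=n)tt(?=n)"), "nn"),        # (3) nttn  -> nnnn
    (re.compile(r"n(?=tttn)|(?<=nttt)n"), "t"),  # (4) ntttn -> ttttt
]


def filter_turns(states):
    """Apply the four state-flipping rules to a string of 't' and 'n'."""
    for pattern, replacement in RULES:
        states = pattern.sub(replacement, states)
    return states
```

For example, `filter_turns("nnttnn")` removes the isolated pair of turn predictions, whereas a run of four or more turn residues passes through unchanged.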

Prediction accuracy assessment

Six different scalar measures were used to assess DEBT's performance. All of them can be derived from two or more of the following quantities: (1) true positives, $p_i$, the number of correctly classified β-turns or β-turn type i, (2) true negatives, $n_i$, the number of correctly classified non-turns, (3) false positives, $o_i$, the number of non-turns incorrectly classified as β-turns or β-turn type i (over-predictions), (4) false negatives, $u_i$, the number of β-turns or β-turn type i incorrectly classified as non-turn (under-predictions) and (5) the total number of residues, $t = p_i + n_i + o_i + u_i$, where i = I, II, IV, VIII or NS. The first measure used is the predictive accuracy, the percentage of correctly classified residues.

$$\mathrm{Accuracy} = \frac{p_i + n_i}{t} \times 100 \tag{3}$$

Two measures that are usually used together are sensitivity (also labelled as Qobs in some articles) and specificity, which give the percentage of observed β-turns or β-turn types that are predicted correctly and the percentage of observed non-turns that are predicted correctly, respectively. Ideally, a classifier balances the two measures.

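$$\mathrm{Sensitivity} = Q_{\mathrm{obs}} = \frac{p_i}{p_i + u_i} \times 100 \tag{4}$$

$$\mathrm{Specificity} = \frac{n_i}{n_i + o_i} \times 100 \tag{5}$$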

We report the commonly used Matthews correlation coefficient (MCC) [56], which is the most robust measure for β-turn prediction. The reason is that, when the dataset is unbalanced, it is possible to achieve high predictive accuracy just by predicting all instances as non-turn. The MCC, defined by equation 6, is a number between -1 and 1, with perfect correlation giving a coefficient equal to 1. Therefore, a higher MCC corresponds to a better predictive performance.

$$\mathrm{MCC} = \frac{p_i n_i - o_i u_i}{\sqrt{(p_i + o_i)(p_i + u_i)(n_i + o_i)(n_i + u_i)}} \tag{6}$$

Finally, we report Qpred, the percentage of β-turn predictions that are correct:

$$Q_{\mathrm{pred}} = \frac{p_i}{p_i + o_i} \times 100 \tag{7}$$

Another important consideration is whether the classifiers perform better than random prediction. Herein, we report a normalised percentage better than random (Si), defined in equation 8, which was introduced by Shepherd and colleagues [30]. Perfect predictions score Si = 100%, whereas Si = 0% shows that the prediction is no better than random.

$$S_i = \frac{p_i + n_i - R}{t - R} \times 100 \tag{8}$$

where R is the expected number of residues that would be predicted correctly by a random prediction and is defined as:

$$R = \frac{(p_i + o_i)(p_i + u_i) + (n_i + o_i)(n_i + u_i)}{t} \tag{9}$$
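The six scalar measures can be computed from the four confusion counts as sketched below, assuming their standard definitions (accuracy, sensitivity, specificity, MCC, Qpred and the better-than-random score of Shepherd and colleagues). The function name and the dictionary keys are illustrative choices, and the counts in the usage lines are made up. The sketch assumes that each class is both observed and predicted at least once, so that no denominator is zero.

```python
import math


def turn_metrics(p, n, o, u):
    """Scalar measures from true positives p, true negatives n,
    over-predictions o and under-predictions u."""
    t = p + n + o + u
    # Expected number of correct predictions for a random predictor.
    r = ((p + o) * (p + u) + (n + o) * (n + u)) / t
    return {
        "accuracy": 100.0 * (p + n) / t,
        "sensitivity": 100.0 * p / (p + u),
        "specificity": 100.0 * n / (n + o),
        "mcc": (p * n - o * u)
        / math.sqrt((p + o) * (p + u) * (n + o) * (n + u)),
        "q_pred": 100.0 * p / (p + o),
        "s": 100.0 * (p + n - r) / (t - r),
    }


# Made-up counts for illustration only.
scores = turn_metrics(p=50, n=30, o=10, u=10)
```

Unlike the accuracy, the MCC and the better-than-random score do not reward a classifier for simply following the class proportions, which is why they are preferred for unbalanced data.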

Apart from the scalar measures described above, we report the receiver operating characteristic (ROC) curves, which plot the sensitivity (or true positive rate, TP rate) against the false positive rate (1 - specificity). ROC curves have been widely used in bioinformatics [57] for visualisation and assessment of machine learning classifiers. Moreover, the area under the ROC curve (AUC) is calculated to provide a scalar measure of the ROC analysis and compare different methods. The trapezium rule is used to calculate the AUC, as described by Fawcett [58].
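The trapezium rule for the AUC can be sketched as follows. The function name and the representation of the curve as a list of (false positive rate, true positive rate) pairs are assumptions; how the points are generated from the classifier outputs is not specified in the text and is left outside this sketch.

```python
def auc_trapezium(points):
    """Area under a ROC curve given (fpr, tpr) pairs, by the trapezium rule."""
    ordered = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(ordered, ordered[1:]):
        # Width of the strip times the mean of its two heights.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# The diagonal of a random classifier encloses half of the unit square.
random_auc = auc_trapezium([(0.0, 0.0), (1.0, 1.0)])
```

An AUC of 0.5 corresponds to random prediction and 1.0 to a perfect classifier, so the value can be compared directly across methods.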

Results and Discussion

The effect of the input scheme

Before optimising the SVM classifiers, we tried different input schemes, which showed that the combination of evolutionary information (PSSMs), predicted secondary structures and predicted dihedral angles gives the most accurate predictions. Table 4 shows the results on the GR426 dataset from the experiments using various input schemes and different window sizes for the turn/non-turn classifier. Firstly, we changed the size of the PSSM window, l1, by using lengths of seven, nine and eleven residues. The last two sizes give the highest MCC value. We selected a window size of nine residues, because the input vector is smaller and, therefore, the training time is shorter. Subsequently, we augmented the PSSM-only input vector with additional attributes only for the central residue (i.e. l2 = 1) using predicted secondary structures, predicted dihedral angles or both. The results show that, when used together, predicted secondary structures and dihedral angles achieve the best performance. Finally, we changed the size of the second window, l2, using three, five or seven residues. The optimal window size is five residues. The same window sizes, l1 and l2, were utilised for all classifiers.

Table 4
Experiments on the GR426 dataset with different input schemes.

Turn/non-turn prediction

Predicted dihedral angles and secondary structures improve the performance of the turn/non-turn classifier, as shown in Table 5. In fact, the MCC shows an improvement of over 10% and reaches values of 0.48, 0.49 and 0.48 for datasets GR426, FA547 and FA823, respectively. Moreover, the overall accuracy is higher than 80% for datasets FA547 and FA823, while it is 79.2% for the GR426 dataset. Finally, Qpred, Qobs (sensitivity) and the better-than-random score, S, also improved after using predicted dihedral angles and secondary structures.

Table 5
Performance of DEBT for the prediction of β-turn location on three datasets.

Table 6 compares DEBT's predictive performance with that of other turn/non-turn predictors in the literature on the established datasets GR426, FA547 and FA823, sorted by the reported MCC score. The comparison is based on the MCC value, because it is the most robust measure, particularly when the dataset is unbalanced. Our achieved MCC values are the highest reported to date on all datasets. Interestingly, the methods by Zheng and Kurgan [34] and by Hu and Li [37], which report the second highest MCC score (0.47) on the GR426 dataset, are also SVM-based, which highlights the superiority of the SVM method compared to other machine learning techniques for β-turn prediction. Moreover, our method achieves a high MCC score by using a single SVM model, without any preprocessing, feature selection or predictions from multiple secondary structure or dihedral prediction methods, which may, potentially, improve the results. DEBT's performance using other measures is also one of the highest in the literature, with overall accuracy around 80% and the Qpred and Qobs scores around 55% and 70%, respectively. These measures can vary depending on the balance of the dataset and the selected SVM parameters (Table 3), which we optimised based on the more robust MCC score.

Table 6
Comparison of DEBT with other turn/non-turn prediction methods on three different datasets.

Prediction of β-turn types

Table 7 shows the performance of our method for the prediction of β-turn types on three different datasets. Notably, the MCC score increases dramatically when we augment the input vector with a local window of predicted dihedral angles and secondary structures. The improvement of the MCC score is at least 16%, 7%, 17%, 40% and 11% for types I, II, IV, VIII and NS, respectively, on all datasets. The explanation for the dramatic improvement of the prediction of some types, such as types I and VIII, can be derived from their dihedral angles (Table 1). These types have negative ϕ and ψ angles and, hence, their structure is closer to a helical conformation, which is more accurately predicted by DISSPred [5]. Therefore, more accurate secondary structure and dihedral predictions lead to more accurate β-turn type predictions. DEBT's predictive accuracy is over 70% for all types, with the caveat that it is not a reliable measure when the dataset is unbalanced. The prediction of the NS class with the highest MCC score clearly reflects the under-predictions, since the specificity is high and the sensitivity is low. When we attempted to equalise the two measures on the GR426 dataset, the MCC value dropped to 0.22, with the sensitivity and specificity at 68.5% and 84.3%, respectively. For all datasets, the better-than-random scores, S, are higher than 20% for all β-turn types except type VIII. On the GR426 dataset, DEBT's achieved S scores of 30.1%, 23.1%, 20.4% and 26.2% for types I, II, IV and NS, respectively, are noticeably higher than the scores reported by BTPRED [30] and BetaTurns [39]. The former achieved better-than-random scores of 18.1%, 18.9%, 4.5% and 2.6% for types I, II, VIII and IV, respectively, while BetaTurns reported values of 19.1%, 23.2%, 12.4%, 1.8% and 6.1% for types I, II, IV, VIII and NS, respectively.

Table 7
DEBT's prediction of β-turn types on three different datasets.

Table 8 compares the performance of β-turn type prediction with other methods in the literature based on the GR426 dataset. DEBT outperforms other contemporary methods for the prediction of types I, IV, VIII and NS. Our achieved MCC score is higher by at least 12.5% for types I and IV and by at least 27% and 29% for types VIII and NS, respectively. The performance highlights the importance of predicted dihedral angles in β-turn type prediction, since the types are defined by the dihedral angles of the central residues (Table 1). The prediction of type II is the only one that does not achieve an MCC score as high as some other methods. MOLEBRNN [32] and, using a different dataset, the method by Asgary and co-workers [40] report higher MCC values, while COUDES [28] reports an MCC of 0.30, which is slightly higher than our achieved value of 0.29. However, DEBT achieves a comparable MCC of 0.33 for the prediction of type II using datasets FA547 and FA823, which generally give higher MCC values than GR426 for β-turn type prediction (see Table 7).

Table 8
Performance of DEBT and other β-turn type prediction methods based on the achieved MCC value.

ROC analysis

Figure 2 illustrates the ROC curves for turn/non-turn prediction and β-turn type prediction before and after using predicted secondary structures and dihedral angles on the GR426 dataset. The ROC curves on datasets FA547 and FA823 are shown in additional file 1. The corresponding areas under the curves were calculated and are presented in Tables 5 and 7 for turn/non-turn prediction and β-turn type prediction, respectively. The improvement in the results highlights the utility of predicted dihedral angles and secondary structure in both turn/non-turn and β-turn type prediction methods.

Figure 2
ROC curves for the prediction on the GR426 dataset. Dashed curves correspond to the PSSM-only prediction, while solid curves correspond to the prediction after augmenting the input vector with predicted dihedral angles and secondary structures.

DEBT web-server

Our method is freely available online at http://comp.chem.nottingham.ac.uk/debt/. The web-server was trained using a large training set of 1296 protein chains with at least one β-turn to improve the performance of the method. It is written in Perl using a CGI interface. The user can either cut and paste the amino acid sequence or upload a FASTA file. Additionally, multiple FASTA files can be uploaded in an archive. DEBT first runs the PSI-BLAST algorithm [46] to construct the PSSMs and DISSPred [5] to predict the secondary structures and the dihedral angles. Subsequently, the results are merged to create the input file for six SVM classifiers. The output file, shown in Figure 3, contains the number and the one-letter abbreviation of the amino acids with six binary prediction values: one for turn/non-turn prediction and five for the β-turn types. The prediction value is "1" if the corresponding residue is predicted to be in a β-turn/β-turn type and "0" otherwise. Moreover, the user can ask for DISSPred's results to be attached in the output file, which makes DEBT not only a β-turn prediction server, but also a three-state secondary structure prediction and a seven-state dihedral prediction interface. The output file, together with the log files, is sent to the user by e-mail, or can be downloaded, after the calculations are completed. The combination of DISSPred's iterative process with the training on a large dataset makes the DEBT web-server slightly slower, but more accurate, than other β-turn prediction servers.

Figure 3
An example of an output file produced in DEBT web-server. The first and second columns show the one-letter code and the number of the amino acids, respectively. Column three shows the prediction value of the turn/non-turn prediction and columns four to ...

Conclusions

In this article, we presented a method that predicts the location of β-turns and their types in a protein chain. Our method uses predicted dihedral angles from DISSPred [5] to enhance the predictions. Moreover, we improved the predictive performance by using a local window of predicted secondary structures and dihedral angles, rather than the predictions for one individual residue. The MCC of 0.48, achieved for turn/non-turn prediction on a set of 426 non-redundant proteins, shows that DEBT is more accurate than other β-turn prediction methods. Moreover, we report the highest MCCs of 0.49 and 0.48 on two larger datasets of 547 and 823 non-redundant protein chains. Additionally, the dihedrally enhanced prediction for β-turn types is more accurate than other methods. We report DEBT's prediction on three datasets with achieved MCCs up to 0.39, 0.33, 0.27, 0.14 and 0.38 for β-turn types I, II, IV, VIII and NS, respectively. The prediction of β-turn types has limitations derived from the observation that identical tetrapeptides may form different β-turn types. In fact, around 15% of all tetrapeptides that form β-turns in datasets GR426 and FA547 appear in multiple β-turn types. This number is close to 18% in the FA823 dataset. A detailed analysis of the fundamental limitation of β-turn prediction is a challenging future focus. In spite of the limitations, the performance might be improved further by applying techniques introduced by other studies, such as feature selection techniques [34], or by using predicted secondary structures and dihedral angles from multiple predictors. Predicted β-turns can be used to improve secondary structure prediction [59] and we are currently exploring this.
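The tetrapeptide ambiguity mentioned above can be measured with a short script such as the following sketch. It assumes that the percentage is taken over distinct tetrapeptide sequences, which the text does not state explicitly; the function name and the toy data are hypothetical.

```python
from collections import defaultdict


def ambiguous_fraction(turns):
    """Fraction of distinct tetrapeptides observed in more than one turn type.

    `turns` is an iterable of (tetrapeptide, turn_type) pairs.
    """
    types_by_peptide = defaultdict(set)
    for peptide, turn_type in turns:
        types_by_peptide[peptide].add(turn_type)
    if not types_by_peptide:
        return 0.0
    ambiguous = sum(1 for types in types_by_peptide.values() if len(types) > 1)
    return ambiguous / len(types_by_peptide)


# Made-up assignments: only NGKD occurs in two different types.
toy_turns = [
    ("NGKD", "I"), ("NGKD", "II"), ("PDGS", "I"),
    ("PDGS", "I"), ("AAGG", "IV"), ("SPDG", "VIII"),
]
fraction = ambiguous_fraction(toy_turns)
```

A sequence-based predictor cannot separate such cases from the four turn residues alone, so the flanking context in the local windows carries the remaining signal.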

Authors' contributions

PK carried out the experiments and wrote the manuscript. JDH conceived the study and assisted in writing the manuscript. Both authors read and approved the final manuscript for publication.

Supplementary Material

Additional file 1:

ROC curves for datasets FA547 and FA823. ROC curves for the predictions on datasets FA547 and FA823, before and after using predicted dihedral angles and secondary structures. Dashed curves correspond to the PSSM-only prediction, while solid curves correspond to the prediction after augmenting the input vector with predicted dihedral angles and secondary structures.

Acknowledgements

We thank the University of Nottingham for a PhD studentship and its HPC facility for computing resources.

References

  • Rost B, Sander C. Prediction of protein secondary structure at better than 70% accuracy. J Mol Biol. 1993;232(2):584–599. doi: 10.1006/jmbi.1993.1413. [PubMed] [Cross Ref]
  • Jones DT. Protein secondary structure prediction based on position-specific scoring matrices. J Mol Biol. 1999;292(2):195–202. doi: 10.1006/jmbi.1999.3091. [PubMed] [Cross Ref]
  • Hua S, Sun Z. A novel method of protein secondary structure prediction with high segment overlap measure: support vector machine approach. J Mol Biol. 2001;308(2):397–407. doi: 10.1006/jmbi.2001.4580. [PubMed] [Cross Ref]
  • Karypis G. YASSPP: Better kernels and coding schemes lead to improvements in protein secondary structure prediction. Proteins. 2006;64(3):575–586. doi: 10.1002/prot.21036. [PubMed] [Cross Ref]
  • Kountouris P, Hirst JD. Prediction of backbone dihedral angles and protein secondary structure using support vector machines. BMC Bioinformatics. 2009;10:437. doi: 10.1186/1471-2105-10-437. [PMC free article] [PubMed] [Cross Ref]
  • Karplus K, Barrett C, Cline M, Diekhans M, Grate L, Hughey R. Predicting protein structure using only sequence information. Proteins. 1999. pp. 121–125. [PubMed] [Cross Ref]
  • Richardson JS. The anatomy and taxonomy of protein structure. Adv Protein Chem. 1981;34:167–339. full_text. [PubMed]
  • Dor O, Zhou Y. Real-SPINE: an integrated system of neural networks for real-value prediction of protein structural properties. Proteins. 2007;68:76–81. doi: 10.1002/prot.21408. [PubMed] [Cross Ref]
  • Chou KC. Prediction of tight turns and their types in proteins. Anal Biochem. 2000;286:1–16. doi: 10.1006/abio.2000.4757. [PubMed] [Cross Ref]
  • Marcelino AMC, Gierasch LM. Roles of beta-turns in protein folding: from peptide models to protein engineering. Biopolymers. 2008;89(5):380–391. doi: 10.1002/bip.20960. [PMC free article] [PubMed] [Cross Ref]
  • Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 1983;22(12):2577–2637. doi: 10.1002/bip.360221211. [PubMed] [Cross Ref]
  • de la Cruz X, Hutchinson EG, Shepherd A, Thornton JM. Toward predicting protein topology: an approach to identifying beta hairpins. Proc Natl Acad Sci USA. 2002;99(17):11157–11162. doi: 10.1073/pnas.162376199. [PubMed] [Cross Ref]
  • Kuhn M, Meiler J, Baker D. Strand-loop-strand motifs: prediction of hairpins and diverging turns in proteins. Proteins. 2004;54(2):282–288. doi: 10.1002/prot.10589. [PubMed] [Cross Ref]
  • Kumar M, Bhasin M, Natt NK, Raghava GPS. BhairPred: prediction of beta-hairpins in a protein from multiple alignment information using ANN and SVM techniques. Nucleic Acids Res. 2005. pp. W154–W159. [PMC free article] [PubMed] [Cross Ref]
  • Takano K, Yamagata Y, Yutani K. Role of amino acid residues at turns in the conformational stability and folding of human lysozyme. Biochemistry. 2000;39(29):8655–8665. doi: 10.1021/bi9928694. [PubMed] [Cross Ref]
  • Trevino SR, Schaefer S, Scholtz JM, Pace CN. Increasing protein conformational stability by optimizing beta-turn sequence. J Mol Biol. 2007;373:211–218. doi: 10.1016/j.jmb.2007.07.061. [PMC free article] [PubMed] [Cross Ref]
  • Fu H, Grimsley GR, Razvi A, Scholtz JM, Pace CN. Increasing protein stability by improving beta-turns. Proteins. 2009;77(3):491–498. doi: 10.1002/prot.22509. [PMC free article] [PubMed] [Cross Ref]
  • Rose GD, Gierasch LM, Smith JA. Turns in peptides and proteins. Adv Protein Chem. 1985;37:1–109. [PubMed]
  • Müller G, Hessler G, Decornez HY. Are β-turn mimetics mimics of β-turns? Angew Chem Int Ed Engl. 2000;39(5):894–896. doi: 10.1002/(SICI)1521-3773(20000303)39:5<894::AID-ANIE894>3.0.CO;2-2. [PubMed] [Cross Ref]
  • Kee KS, Jois SDS. Design of β-turn based therapeutic agents. Curr Pharm Des. 2003;9(15):1209–1224. doi: 10.2174/1381612033454900. [PubMed] [Cross Ref]
  • Fuller AA, Du D, Liu F, Davoren JE, Bhabha G, Kroon G, Case DA, Dyson HJ, Powers ET, Wipf P, Gruebele M, Kelly JW. Evaluating beta-turn mimics as beta-sheet folding nucleators. Proc Natl Acad Sci USA. 2009;106(27):11067–11072. doi: 10.1073/pnas.0813012106. [PubMed] [Cross Ref]
  • Hutchinson EG, Thornton JM. A revised set of potentials for β-turn formation in proteins. Protein Sci. 1994;3(12):2207–2216. doi: 10.1002/pro.5560031206. [PubMed] [Cross Ref]
  • Chou PY, Fasman GD. Conformational parameters for amino acids in helical, β-sheet, and random coil regions calculated from proteins. Biochemistry. 1974;13(2):211–222. doi: 10.1021/bi00699a001. [PubMed] [Cross Ref]
  • Wilmot CM, Thornton JM. Analysis and prediction of the different types of β-turn in proteins. J Mol Biol. 1988;203:221–232. doi: 10.1016/0022-2836(88)90103-9. [PubMed] [Cross Ref]
  • Wilmot CM, Thornton JM. β-turns and their distortions: a proposed new nomenclature. Protein Eng. 1990;3(6):479–493. doi: 10.1093/protein/3.6.479. [PubMed] [Cross Ref]
  • Chou KC, Blinn JR. Classification and prediction of β-turn types. J Protein Chem. 1997;16(6):575–595. doi: 10.1023/A:1026366706677. [PubMed] [Cross Ref]
  • Zhang C, Chou K. Prediction of β-turns in proteins by 1-4 and 2-3 correlation model. Biopolymers. 1997;41(6):673–702. doi: 10.1002/(SICI)1097-0282(199705)41:6<673::AID-BIP7>3.0.CO;2-N. [Cross Ref]
  • Fuchs PFJ, Alix AJP. High accuracy prediction of β-turns and their types using propensities and multiple alignments. Proteins. 2005;59(4):828–839. doi: 10.1002/prot.20461. [PubMed] [Cross Ref]
  • McGregor MJ, Flores TP, Sternberg MJE. Prediction of β-turns in proteins using neural networks. Protein Eng. 1989;2(7):521–526. doi: 10.1093/protein/2.7.521. [PubMed] [Cross Ref]
  • Shepherd AJ, Gorse D, Thornton JM. Prediction of the location and type of β-turns in proteins using neural networks. Protein Sci. 1999;8(5):1045–1055. doi: 10.1110/ps.8.5.1045. [PubMed] [Cross Ref]
  • Kaur H, Raghava GPS. Prediction of β-turns in proteins from multiple alignment using neural network. Protein Sci. 2003;12(3):627–634. doi: 10.1110/ps.0228903. [PubMed] [Cross Ref]
  • Kirschner A, Frishman D. Prediction of β-turns and β-turn types by a novel bidirectional Elman-type recurrent neural network with multiple output layers (MOLEBRNN). Gene. 2008;422(1-2):22–29. doi: 10.1016/j.gene.2008.06.008. [PubMed] [Cross Ref]
  • Cai YD, Liu XJ, Li YX, Xu XB, Chou KC. Prediction of β-turns with learning machines. Peptides. 2003;24(5):665–669. doi: 10.1016/S0196-9781(03)00133-5. [PubMed] [Cross Ref]
  • Zheng C, Kurgan L. Prediction of β-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments. BMC Bioinformatics. 2008;9:430. doi: 10.1186/1471-2105-9-430. [PMC free article] [PubMed] [Cross Ref]
  • Zhang Q, Yoon S, Welsh WJ. Improved method for predicting β-turn using support vector machine. Bioinformatics. 2005;21(10):2370–2374. doi: 10.1093/bioinformatics/bti358. [PubMed] [Cross Ref]
  • Pham TH, Satou K, Ho TB. Prediction and analysis of β-turns in proteins by support vector machine. Genome Inform. 2003;14:196–205. [PubMed]
  • Hu X, Li Q. Using support vector machine to predict β- and γ-turns in proteins. J Comput Chem. 2008;29(12):1867–1875. doi: 10.1002/jcc.20929. [PubMed] [Cross Ref]
  • Kim S. Protein beta-turn prediction using nearest-neighbor method. Bioinformatics. 2004;20:40–44. doi: 10.1093/bioinformatics/btg368. [PubMed] [Cross Ref]
  • Kaur H, Raghava GPS. A neural network method for prediction of β-turn types in proteins using evolutionary information. Bioinformatics. 2004;20(16):2751–2758. doi: 10.1093/bioinformatics/bth322. [PubMed] [Cross Ref]
  • Asgary MP, Jahandideh S, Abdolmaleki P, Kazemnejad A. Analysis and identification of β-turn types using multinomial logistic regression and artificial neural network. Bioinformatics. 2007;23(23):3125–3130. doi: 10.1093/bioinformatics/btm324. [PubMed] [Cross Ref]
  • Wood MJ, Hirst JD. Protein secondary structure prediction with dihedral angles. Proteins. 2005;59(3):476–481. doi: 10.1002/prot.20435. [PubMed] [Cross Ref]
  • Guruprasad K, Rajkumar S. β- and γ-turns in proteins revisited: a new set of amino acid turn-type dependent positional preferences and potentials. J Biosci. 2000;25(2):143–156. [PubMed]
  • Kaur H, Raghava GPS. An evaluation of β-turn prediction methods. Bioinformatics. 2002;18(11):1508–1514. doi: 10.1093/bioinformatics/18.11.1508. [PubMed] [Cross Ref]
  • Hobohm U, Scharf M, Schneider R, Sander C. Selection of representative protein data sets. Protein Sci. 1992;1(3):409–417. doi: 10.1002/pro.5560010313. [PubMed] [Cross Ref]
  • Hutchinson EG, Thornton JM. PROMOTIF-a program to identify and analyze structural motifs in proteins. Protein Sci. 1996;5(2):212–220. doi: 10.1002/pro.5560050204. [PubMed] [Cross Ref]
  • Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25(17):3389–3402. doi: 10.1093/nar/25.17.3389. [PMC free article] [PubMed] [Cross Ref]
  • Henikoff S, Henikoff JG. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci USA. 1992;89(22):10915–10919. doi: 10.1073/pnas.89.22.10915. [PubMed] [Cross Ref]
  • Jones DT, Swindells MB. Getting the most from PSI-BLAST. Trends Biochem Sci. 2002;27(3):161–164. doi: 10.1016/S0968-0004(01)02039-4. [PubMed] [Cross Ref]
  • Vapnik V. The Nature of Statistical Learning Theory. N.Y.: Springer; 1995.
  • Cristianini N, Shawe-Taylor J. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press; 2000.
  • Burges CJ. A Tutorial on Support Vector Machines for Pattern Recognition. Data Min and Knowl Disc. 1998;2(2):121–167. doi: 10.1023/A:1009715923555. [Cross Ref]
  • Scholkopf B, Smola AJ. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. Cambridge, MA, USA: MIT Press; 2001.
  • Chang CC, Lin CJ. LIBSVM: a library for support vector machines. 2001. http://www.csie.ntu.edu.tw/~cjlin/libsvm
  • Osuna E, Freund R, Girosi F. Support Vector Machines: Training and Applications. Tech. rep., Cambridge, MA, USA; 1997.
  • Kaur H, Raghava GPS. BetaTPred: prediction of β-turns in a protein using statistical algorithms. Bioinformatics. 2002;18(3):498–499. doi: 10.1093/bioinformatics/18.3.498. [PubMed] [Cross Ref]
  • Matthews BW. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta. 1975;405(2):442–451. [PubMed]
  • Sonego P, Kocsor A, Pongor S. ROC analysis: applications to the classification of biological sequences and 3D structures. Brief Bioinform. 2008;9(3):198–209. doi: 10.1093/bib/bbm064. [PubMed] [Cross Ref]
  • Fawcett T. An introduction to ROC analysis. Pattern Recogn Lett. 2006;27(8):861–874. doi: 10.1016/j.patrec.2005.10.010. [Cross Ref]
  • Frishman D, Argos P. Seventy-five percent accuracy in protein secondary structure prediction. Proteins. 1997;27(3):329–335. doi: 10.1002/(SICI)1097-0134(199703)27:3<329::AID-PROT1>3.0.CO;2-8. [PubMed] [Cross Ref]

Articles from BMC Bioinformatics are provided here courtesy of BioMed Central