TY - JOUR
T1 - Simplified sequence-based method for ATP-binding prediction using contextual local evolutionary conservation
AU - Fang, Chun
AU - Noguchi, Tamotsu
AU - Yamana, Hayato
PY - 2014/3/11
Y1 - 2014/3/11
N2 - Background: Identifying ligand-binding sites is a key step to annotate the protein functions and to find applications in drug design. Now, many sequence-based methods adopted various predicted results from other classifiers, such as predicted secondary structure, predicted solvent accessibility and predicted disorder probabilities, to combine with position-specific scoring matrix (PSSM) as input for binding sites prediction. These predicted features not only easily result in high-dimensional feature space, but also greatly increased the complexity of algorithms. Moreover, the performances of these predictors are also largely influenced by the other classifiers.Results: In order to verify that conservation is the most powerful attribute in identifying ligand-binding sites, and to show the importance of revising PSSM to match the detailed conservation pattern of functional site in prediction, we have analyzed the Adenosine-5'-triphosphate (ATP) ligand as an example, and proposed a simple method for ATP-binding sites prediction, named as CLCLpred (Contextual Local evolutionary Conservation-based method for Ligand-binding prediction). Our method employed no predicted results from other classifiers as input; all used features were extracted from PSSM only. We tested our method on 2 separate data sets. Experimental results showed that, comparing with other 9 existing methods on the same data sets, our method achieved the best performance.Conclusions: This study demonstrates that: 1) exploiting the signal from the detailed conservation pattern of residues will largely facilitate the prediction of protein functional sites; and 2) the local evolutionary conservation enables accurate prediction of ATP-binding sites directly from protein sequence.
AB - Background: Identifying ligand-binding sites is a key step to annotate the protein functions and to find applications in drug design. Now, many sequence-based methods adopted various predicted results from other classifiers, such as predicted secondary structure, predicted solvent accessibility and predicted disorder probabilities, to combine with position-specific scoring matrix (PSSM) as input for binding sites prediction. These predicted features not only easily result in high-dimensional feature space, but also greatly increased the complexity of algorithms. Moreover, the performances of these predictors are also largely influenced by the other classifiers.Results: In order to verify that conservation is the most powerful attribute in identifying ligand-binding sites, and to show the importance of revising PSSM to match the detailed conservation pattern of functional site in prediction, we have analyzed the Adenosine-5'-triphosphate (ATP) ligand as an example, and proposed a simple method for ATP-binding sites prediction, named as CLCLpred (Contextual Local evolutionary Conservation-based method for Ligand-binding prediction). Our method employed no predicted results from other classifiers as input; all used features were extracted from PSSM only. We tested our method on 2 separate data sets. Experimental results showed that, comparing with other 9 existing methods on the same data sets, our method achieved the best performance.Conclusions: This study demonstrates that: 1) exploiting the signal from the detailed conservation pattern of residues will largely facilitate the prediction of protein functional sites; and 2) the local evolutionary conservation enables accurate prediction of ATP-binding sites directly from protein sequence.
KW - ATP-binding site
KW - Local evolutionary conservation
KW - Sequence-based
UR - http://www.scopus.com/inward/record.url?scp=84899131695&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84899131695&partnerID=8YFLogxK
U2 - 10.1186/1748-7188-9-7
DO - 10.1186/1748-7188-9-7
M3 - Article
AN - SCOPUS:84899131695
VL - 9
JO - Algorithms for Molecular Biology
JF - Algorithms for Molecular Biology
SN - 1748-7188
IS - 1
M1 - 7
ER -