TY - JOUR
T1 - POODLE-L
T2 - A two-level SVM prediction system for reliably predicting long disordered regions
AU - Hirose, Shuichi
AU - Shimizu, Kana
AU - Kanai, Satoru
AU - Kuroda, Yutaka
AU - Noguchi, Tamotsu
N1 - Funding Information:
We thank our colleagues at PharmaDesign, Inc. and the protein function team at the Computational Biological Research Center (CBRC) for helpful advice and discussion. This work was funded by PharmaDesign, Inc. and Advanced Industrial Science and Technology (AIST).
PY - 2007/8/15
Y1 - 2007/8/15
N2 - Motivation: Recent experimental and theoretical studies have revealed several proteins containing sequence segments that are unfolded under physiological conditions. These segments are called disordered regions. They are actively investigated because of their possible involvement in various biological processes, such as cell signaling and transcriptional and translational regulation. Additionally, disordered regions can represent a major obstacle to high-throughput proteome analysis and often need to be removed from experimental targets. The accurate prediction of long disordered regions is thus expected to provide annotations that are useful for a wide range of applications. Results: We developed Prediction of Order and Disorder by machine LEarning (POODLE-L; L stands for long), a support vector machine (SVM)-based method for predicting long disordered regions using 10 kinds of simple physico-chemical properties of amino acids. POODLE-L assembles the output of 10 two-level SVM predictors into a final prediction of disordered regions. The performance of POODLE-L for predicting long disordered regions, which exhibited a Matthews correlation coefficient of 0.658, was the highest when compared with eight well-established publicly available disordered region predictors.
UR - http://www.scopus.com/inward/record.url?scp=34548567232&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34548567232&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btm302
DO - 10.1093/bioinformatics/btm302
M3 - Article
C2 - 17545177
AN - SCOPUS:34548567232
VL - 23
SP - 2046
EP - 2053
JO - Bioinformatics
JF - Bioinformatics
SN - 1367-4803
IS - 16
ER -