Condensing position-specific scoring matrixs by the Kidera factors for ligand-binding site prediction

Chun Fang, Tamotsu Noguchi, Hayato Yamana

    Research output: Contribution to journal, Article

    3 Citations (Scopus)

    Abstract

    Position-specific scoring matrices (PSSMs) have been widely used for identifying protein functional sites. However, a PSSM is 20-dimensional and contains many redundant features. The Kidera factors were reported to contain information relating to almost all physical properties of amino acids, but they require appropriate weighting coefficients to express those properties. We developed a novel method, named KSPSSMpred, which integrates the PSSM and the Kidera factors into a 10-dimensional matrix (KSPSSM) for ligand-binding site prediction. Flavin adenine dinucleotide (FAD) was chosen as a representative ligand for this study. When compared with five other feature-based methods on a benchmark dataset, KSPSSMpred performed the best. This study demonstrates that KSPSSM is an effective feature extraction method that can enrich the PSSM with information relating to 188 physical properties of residues and reduce the feature dimensions by 50% without losing the information included in the PSSM.
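
The abstract does not spell out how the PSSM and the Kidera factors are combined, so the following is only a minimal sketch under one plausible assumption: that an L x 20 PSSM (one row per residue) is multiplied by a 20 x 10 table holding the ten Kidera factors of each amino acid, giving an L x 10 matrix. The function name `condense_pssm`, the toy PSSM, and the factor values are all hypothetical placeholders; they are not the published Kidera factors and not the authors' implementation.

```python
import random

NUM_AMINO_ACIDS = 20      # columns of a standard PSSM
NUM_KIDERA_FACTORS = 10   # number of Kidera factors per amino acid


def condense_pssm(pssm, factors):
    """Multiply an L x A score matrix by an A x K factor table, giving L x K.

    Assumption: condensation is a plain matrix product. The source only says
    the two inputs are "integrated" into a 10-dimensional matrix.
    """
    num_rows = len(factors)
    width = len(factors[0])
    for row in pssm:
        if len(row) != num_rows:
            raise ValueError("each PSSM row needs one score per amino acid")
    return [
        [sum(row[a] * factors[a][k] for a in range(num_rows)) for k in range(width)]
        for row in pssm
    ]


# Placeholder inputs only: random numbers standing in for real data.
rng = random.Random(0)
placeholder_factors = [
    [rng.uniform(-2.0, 2.0) for _ in range(NUM_KIDERA_FACTORS)]
    for _ in range(NUM_AMINO_ACIDS)
]
toy_pssm = [
    [rng.randint(-5, 10) for _ in range(NUM_AMINO_ACIDS)]
    for _ in range(7)  # a hypothetical 7-residue sequence
]

kspssm = condense_pssm(toy_pssm, placeholder_factors)
print(len(kspssm), len(kspssm[0]))  # prints "7 10"
```

Under this assumption each residue is described by 10 values instead of 20, which matches the 50% reduction in feature dimensions stated in the abstract.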

    Original language: English
    Pages (from-to): 70-84
    Number of pages: 15
    Journal: International Journal of Data Mining and Bioinformatics
    Volume: 12
    Issue number: 1
    DOI: 10.1504/IJDMB.2015.068954
    Publication status: Published - 2015

    Fingerprint

    Position-Specific Scoring Matrices
    Binding sites
    Ligands
    weighting
    Benchmarking
    Physical properties
    Flavin-Adenine Dinucleotide
    Amino acids
    Feature extraction
    Proteins

    Keywords

    • Kidera factors
    • Ligand-binding site
    • Position specific scoring matrix

    ASJC Scopus subject areas

    • Library and Information Sciences
    • Information Systems
    • Biochemistry, Genetics and Molecular Biology(all)

    Cite this

    Condensing position-specific scoring matrixs by the Kidera factors for ligand-binding site prediction. / Fang, Chun; Noguchi, Tamotsu; Yamana, Hayato.

    In: International Journal of Data Mining and Bioinformatics, Vol. 12, No. 1, 2015, p. 70-84.


    @article{9d928715e58c4b489837f1b738c03277,
    title = "Condensing position-specific scoring matrixs by the Kidera factors for ligand-binding site prediction",
    abstract = "Position-specific scoring matrices (PSSMs) have been widely used for identifying protein functional sites. However, a PSSM is 20-dimensional and contains many redundant features. The Kidera factors were reported to contain information relating to almost all physical properties of amino acids, but they require appropriate weighting coefficients to express those properties. We developed a novel method, named KSPSSMpred, which integrates the PSSM and the Kidera factors into a 10-dimensional matrix (KSPSSM) for ligand-binding site prediction. Flavin adenine dinucleotide (FAD) was chosen as a representative ligand for this study. When compared with five other feature-based methods on a benchmark dataset, KSPSSMpred performed the best. This study demonstrates that KSPSSM is an effective feature extraction method that can enrich the PSSM with information relating to 188 physical properties of residues and reduce the feature dimensions by 50{\%} without losing the information included in the PSSM.",
    keywords = "Kidera factors, Ligand-binding site, Position specific scoring matrix",
    author = "Chun Fang and Tamotsu Noguchi and Hayato Yamana",
    year = "2015",
    doi = "10.1504/IJDMB.2015.068954",
    language = "English",
    volume = "12",
    pages = "70--84",
    journal = "International Journal of Data Mining and Bioinformatics",
    issn = "1748-5673",
    publisher = "Inderscience Enterprises Ltd",
    number = "1",
    }

    TY - JOUR

    T1 - Condensing position-specific scoring matrixs by the Kidera factors for ligand-binding site prediction

    AU - Fang, Chun

    AU - Noguchi, Tamotsu

    AU - Yamana, Hayato

    PY - 2015

    Y1 - 2015

    AB - Position-specific scoring matrices (PSSMs) have been widely used for identifying protein functional sites. However, a PSSM is 20-dimensional and contains many redundant features. The Kidera factors were reported to contain information relating to almost all physical properties of amino acids, but they require appropriate weighting coefficients to express those properties. We developed a novel method, named KSPSSMpred, which integrates the PSSM and the Kidera factors into a 10-dimensional matrix (KSPSSM) for ligand-binding site prediction. Flavin adenine dinucleotide (FAD) was chosen as a representative ligand for this study. When compared with five other feature-based methods on a benchmark dataset, KSPSSMpred performed the best. This study demonstrates that KSPSSM is an effective feature extraction method that can enrich the PSSM with information relating to 188 physical properties of residues and reduce the feature dimensions by 50% without losing the information included in the PSSM.

    KW - Kidera factors

    KW - Ligand-binding site

    KW - Position specific scoring matrix

    UR - http://www.scopus.com/inward/record.url?scp=84928801306&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84928801306&partnerID=8YFLogxK

    U2 - 10.1504/IJDMB.2015.068954

    DO - 10.1504/IJDMB.2015.068954

    M3 - Article

    VL - 12

    SP - 70

    EP - 84

    JO - International Journal of Data Mining and Bioinformatics

    JF - International Journal of Data Mining and Bioinformatics

    SN - 1748-5673

    IS - 1

    ER -