Improvement in speed and accuracy of multiple sequence alignment program PRIME

Shinsuke Yamada, Osamu Gotoh, Hayato Yamana

    Research output: Contribution to journal › Article

    3 Citations (Scopus)

    Abstract

    Multiple sequence alignment (MSA) is a useful tool in bioinformatics. Although many MSA algorithms have been developed, there is still room for improvement in accuracy and speed. We have developed an MSA program PRIME, whose crucial feature is the use of a group-to-group sequence alignment algorithm with a piecewise linear gap cost. We have shown that PRIME is one of the most accurate MSA programs currently available. However, PRIME is slower than other leading MSA programs. To improve computational performance, we newly incorporate anchoring and grouping heuristics into PRIME. An anchoring method is to locate well-conserved regions in a given MSA as anchor points to reduce the region of DP matrix to be examined, while a grouping method detects conserved subfamily alignments specified by phylogenetic tree in a given MSA to reduce the number of iterative refinement steps. The results of BAliBASE 3.0 and PREFAB 4 benchmark tests indicated that these heuristics contributed to reduction in the computational time of PRIME by more than 60% while the average alignment accuracy measures decreased by at most 2%. Additionally, we evaluated the effectiveness of iterative refinement algorithm based on maximal expected accuracy (MEA). Our experiments revealed that when many sequences are aligned, the MEA-based algorithm significantly improves alignment accuracy compared with the standard version of PRIME at the expense of a considerable increase in computation time.
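    The anchoring idea in the abstract can be illustrated with a minimal sketch. This is not PRIME's implementation: it assumes plain pairwise alignment with a linear gap cost (PRIME uses group-to-group alignment with a piecewise linear gap cost), and the names `nw_score`, `anchored_score`, and the scoring constants are hypothetical. It only shows how fixing conserved blocks as anchors shrinks the portion of the DP matrix that has to be filled.

```python
# Hypothetical scoring constants chosen for illustration only.
MATCH, MISMATCH, GAP = 1, -1, -1


def nw_score(a, b):
    """Global alignment score (Needleman-Wunsch, linear gap cost)."""
    prev = [j * GAP for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * GAP]
        for j in range(1, len(b) + 1):
            sub = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            cur.append(max(prev[j - 1] + sub, prev[j] + GAP, cur[j - 1] + GAP))
        prev = cur
    return prev[-1]


def anchored_score(a, b, anchors):
    """Score an alignment forced through the given anchors.

    anchors: list of (i, j, length) blocks, 0-based, sorted, colinear and
    non-overlapping; a[i:i+length] is aligned to b[j:j+length] without gaps.
    Returns (score, number of DP cells filled between anchors).
    """
    total, cells = 0, 0
    pi = pj = 0
    for i, j, n in anchors:
        # DP is run only on the segment between the previous anchor and this one.
        total += nw_score(a[pi:i], b[pj:j])
        cells += (i - pi) * (j - pj)
        # The anchored block itself is scored column by column, with no DP.
        total += sum(MATCH if a[i + k] == b[j + k] else MISMATCH for k in range(n))
        pi, pj = i + n, j + n
    total += nw_score(a[pi:], b[pj:])
    cells += (len(a) - pi) * (len(b) - pj)
    return total, cells


a, b = "GATTACA", "GATCACA"
full = nw_score(a, b)                                  # fills 7 * 7 cells
anchored = anchored_score(a, b, [(0, 0, 3), (4, 4, 3)])
print(full, anchored)
```

    In this toy case the anchors lie on an optimal path, so the anchored score equals the unrestricted one while only a single DP cell is examined. In general, anchoring is a heuristic: badly placed anchors can lower the score, which is consistent with the small accuracy loss reported in the abstract.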

    Original language: English
    Pages (from-to): 2-12
    Number of pages: 11
    Journal: IPSJ Transactions on Bioinformatics
    Volume: 1
    DOIs: 10.2197/ipsjtbio.1.2
    Publication status: Published - 2008 Nov

    Fingerprint

    Sequence Alignment
    Benchmarking
    Computational Biology
    Bioinformatics
    Anchors
    Costs and Cost Analysis

    ASJC Scopus subject areas

    • Computer Science Applications
    • Biochemistry, Genetics and Molecular Biology (miscellaneous)

    Cite this

    Improvement in speed and accuracy of multiple sequence alignment program PRIME. / Yamada, Shinsuke; Gotoh, Osamu; Yamana, Hayato.

    In: IPSJ Transactions on Bioinformatics, Vol. 1, 11.2008, p. 2-12.

    @article{aebe562633334dceb3e08f5e041e7fb2,
    title = "Improvement in speed and accuracy of multiple sequence alignment program {PRIME}",
    abstract = "Multiple sequence alignment (MSA) is a useful tool in bioinformatics. Although many MSA algorithms have been developed, there is still room for improvement in accuracy and speed. We have developed an MSA program PRIME, whose crucial feature is the use of a group-to-group sequence alignment algorithm with a piecewise linear gap cost. We have shown that PRIME is one of the most accurate MSA programs currently available. However, PRIME is slower than other leading MSA programs. To improve computational performance, we newly incorporate anchoring and grouping heuristics into PRIME. An anchoring method is to locate well-conserved regions in a given MSA as anchor points to reduce the region of DP matrix to be examined, while a grouping method detects conserved subfamily alignments specified by phylogenetic tree in a given MSA to reduce the number of iterative refinement steps. The results of BAliBASE 3.0 and PREFAB 4 benchmark tests indicated that these heuristics contributed to reduction in the computational time of PRIME by more than 60{\%} while the average alignment accuracy measures decreased by at most 2{\%}. Additionally, we evaluated the effectiveness of iterative refinement algorithm based on maximal expected accuracy (MEA). Our experiments revealed that when many sequences are aligned, the MEA-based algorithm significantly improves alignment accuracy compared with the standard version of PRIME at the expense of a considerable increase in computation time.",
    author = "Shinsuke Yamada and Osamu Gotoh and Hayato Yamana",
    year = "2008",
    month = "11",
    doi = "10.2197/ipsjtbio.1.2",
    language = "English",
    volume = "1",
    pages = "2--12",
    journal = "IPSJ Transactions on Bioinformatics",
    issn = "1882-6679",
    publisher = "Information Processing Society of Japan",

    }

    TY - JOUR

    T1 - Improvement in speed and accuracy of multiple sequence alignment program PRIME

    AU - Yamada, Shinsuke

    AU - Gotoh, Osamu

    AU - Yamana, Hayato

    PY - 2008/11

    Y1 - 2008/11

    N2 - Multiple sequence alignment (MSA) is a useful tool in bioinformatics. Although many MSA algorithms have been developed, there is still room for improvement in accuracy and speed. We have developed an MSA program PRIME, whose crucial feature is the use of a group-to-group sequence alignment algorithm with a piecewise linear gap cost. We have shown that PRIME is one of the most accurate MSA programs currently available. However, PRIME is slower than other leading MSA programs. To improve computational performance, we newly incorporate anchoring and grouping heuristics into PRIME. An anchoring method is to locate well-conserved regions in a given MSA as anchor points to reduce the region of DP matrix to be examined, while a grouping method detects conserved subfamily alignments specified by phylogenetic tree in a given MSA to reduce the number of iterative refinement steps. The results of BAliBASE 3.0 and PREFAB 4 benchmark tests indicated that these heuristics contributed to reduction in the computational time of PRIME by more than 60% while the average alignment accuracy measures decreased by at most 2%. Additionally, we evaluated the effectiveness of iterative refinement algorithm based on maximal expected accuracy (MEA). Our experiments revealed that when many sequences are aligned, the MEA-based algorithm significantly improves alignment accuracy compared with the standard version of PRIME at the expense of a considerable increase in computation time.

    UR - http://www.scopus.com/inward/record.url?scp=76249130835&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=76249130835&partnerID=8YFLogxK

    U2 - 10.2197/ipsjtbio.1.2

    DO - 10.2197/ipsjtbio.1.2

    M3 - Article

    AN - SCOPUS:76249130835

    VL - 1

    SP - 2

    EP - 12

    JO - IPSJ Transactions on Bioinformatics

    JF - IPSJ Transactions on Bioinformatics

    SN - 1882-6679

    ER -