Prediction of RNA secondary structure using generalized centroid estimators

Michiaki Hamada, Hisanori Kiryu, Kengo Sato, Toutai Mituyama, Kiyoshi Asai

Research output: Contribution to journal › Article

128 Citations (Scopus)

Abstract

Motivation: Recent studies have shown that methods for predicting RNA secondary structures by posterior decoding of base-pairing probabilities have an advantage in prediction accuracy over the conventionally used minimum free energy methods. However, there is room for improvement in the objective functions presented in previous studies, which are maximized in the posterior decoding with respect to the accuracy measures for secondary structures.

Results: We propose novel estimators that improve the accuracy of secondary structure prediction of RNAs. The proposed estimators maximize an objective function that is the weighted sum of the expected number of true positives and the expected number of true negatives of the base pairs. The proposed estimators are also improved versions of the ones used in previous works, namely CONTRAfold for secondary structure prediction from a single RNA sequence and McCaskill-MEA for common secondary structure prediction from multiple alignments of RNA sequences. We clarify the relations between the proposed estimators and the estimators presented in previous works, and show theoretically that the previous estimators include additional unnecessary terms in the evaluation measures with respect to accuracy. Furthermore, computational experiments confirm the theoretical analysis by indicating improvement in the empirical accuracy. The proposed estimators represent extensions of the centroid estimators proposed in Ding et al. and Carvalho and Lawrence, and are applicable to a wide variety of problems in bioinformatics.
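To make the objective in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a weight gamma on true positives (the abstract says only "weighted sum"), so that the expected gain of a structure y under base-pairing probabilities p_ij is the sum over i < j of gamma * p_ij * y_ij + (1 - p_ij) * (1 - y_ij). Up to a constant this equals the sum of (gamma + 1) * p_ij - 1 over the predicted pairs. The function name, the dictionary input, the `min_loop` parameter, and the Nussinov-style recursion used to search nested structures are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]


def weighted_tp_tn_decode(
    probs: Dict[Pair, float], n: int, gamma: float = 1.0, min_loop: int = 3
) -> Tuple[float, List[Pair]]:
    """Find a nested structure maximizing sum of (gamma + 1) * p_ij - 1.

    probs maps 0-based (i, j) with i < j to a base-pairing probability;
    missing keys count as probability 0. All names here are hypothetical.
    """
    if n == 0:
        return 0.0, []

    def score(i: int, j: int) -> float:
        # Per-pair contribution to the expected gain, constant term dropped.
        return (gamma + 1.0) * probs.get((i, j), 0.0) - 1.0

    # best[i][j] holds the best total score on positions i..j inclusive.
    best = [[0.0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            cand = max(best[i + 1][j], best[i][j - 1])  # i or j unpaired
            cand = max(cand, best[i + 1][j - 1] + score(i, j))  # i pairs j
            for k in range(i + 1, j):  # split into two independent parts
                cand = max(cand, best[i][k] + best[k + 1][j])
            best[i][j] = cand

    # Trace back one optimal structure by replaying the same comparisons.
    pairs: List[Pair] = []
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i + 1][j]:
            stack.append((i + 1, j))
        elif best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
        elif best[i][j] == best[i + 1][j - 1] + score(i, j):
            pairs.append((i, j))
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):
                if best[i][j] == best[i][k] + best[k + 1][j]:
                    stack.extend([(i, k), (k + 1, j)])
                    break
    return best[0][n - 1], sorted(pairs)


# Toy probabilities, invented for illustration only.
toy = {(0, 9): 0.9, (1, 8): 0.8, (2, 7): 0.3}
print(weighted_tp_tn_decode(toy, n=10, gamma=1.0))
print(weighted_tp_tn_decode(toy, n=10, gamma=4.0))
```

Under this scoring a pair can only help when its probability exceeds 1 / (gamma + 1), so a larger gamma admits lower-probability pairs and trades specificity for sensitivity. In the toy input the pair with probability 0.3 is rejected at gamma = 1 and accepted at gamma = 4.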

Original language: English
Pages (from-to): 465-473
Number of pages: 9
Journal: Bioinformatics
Volume: 25
Issue number: 4
DOIs: 10.1093/bioinformatics/btn601
Publication status: Published - 2009 Feb
Externally published: Yes

Fingerprint

RNA Secondary Structure
Centroid
RNA
Base Pairing
Estimator
Secondary Structure
Prediction
Computational Biology
Structure Prediction
Decoding
Bioinformatics
Free energy
Objective function
Energy Method
Weighted Sums
Pairing
Computational Experiments
Theoretical Analysis
Alignment

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Computational Mathematics
  • Statistics and Probability

Cite this

Prediction of RNA secondary structure using generalized centroid estimators. / Hamada, Michiaki; Kiryu, Hisanori; Sato, Kengo; Mituyama, Toutai; Asai, Kiyoshi.

In: Bioinformatics, Vol. 25, No. 4, 02.2009, p. 465-473.

Hamada, Michiaki ; Kiryu, Hisanori ; Sato, Kengo ; Mituyama, Toutai ; Asai, Kiyoshi. / Prediction of RNA secondary structure using generalized centroid estimators. In: Bioinformatics. 2009 ; Vol. 25, No. 4. pp. 465-473.
@article{64a56cb5db39414eb3030ab4c95ecae7,
title = "Prediction of RNA secondary structure using generalized centroid estimators",
abstract = "Motivation: Recent studies have shown that methods for predicting RNA secondary structures by posterior decoding of base-pairing probabilities have an advantage in prediction accuracy over the conventionally used minimum free energy methods. However, there is room for improvement in the objective functions presented in previous studies, which are maximized in the posterior decoding with respect to the accuracy measures for secondary structures. Results: We propose novel estimators that improve the accuracy of secondary structure prediction of RNAs. The proposed estimators maximize an objective function that is the weighted sum of the expected number of true positives and the expected number of true negatives of the base pairs. The proposed estimators are also improved versions of the ones used in previous works, namely CONTRAfold for secondary structure prediction from a single RNA sequence and McCaskill-MEA for common secondary structure prediction from multiple alignments of RNA sequences. We clarify the relations between the proposed estimators and the estimators presented in previous works, and show theoretically that the previous estimators include additional unnecessary terms in the evaluation measures with respect to accuracy. Furthermore, computational experiments confirm the theoretical analysis by indicating improvement in the empirical accuracy. The proposed estimators represent extensions of the centroid estimators proposed in Ding et al. and Carvalho and Lawrence, and are applicable to a wide variety of problems in bioinformatics.",
author = "Michiaki Hamada and Hisanori Kiryu and Kengo Sato and Toutai Mituyama and Kiyoshi Asai",
year = "2009",
month = "2",
doi = "10.1093/bioinformatics/btn601",
language = "English",
volume = "25",
pages = "465--473",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "4"
}

TY - JOUR

T1 - Prediction of RNA secondary structure using generalized centroid estimators

AU - Hamada, Michiaki

AU - Kiryu, Hisanori

AU - Sato, Kengo

AU - Mituyama, Toutai

AU - Asai, Kiyoshi

PY - 2009/2

Y1 - 2009/2

AB - Motivation: Recent studies have shown that methods for predicting RNA secondary structures by posterior decoding of base-pairing probabilities have an advantage in prediction accuracy over the conventionally used minimum free energy methods. However, there is room for improvement in the objective functions presented in previous studies, which are maximized in the posterior decoding with respect to the accuracy measures for secondary structures. Results: We propose novel estimators that improve the accuracy of secondary structure prediction of RNAs. The proposed estimators maximize an objective function that is the weighted sum of the expected number of true positives and the expected number of true negatives of the base pairs. The proposed estimators are also improved versions of the ones used in previous works, namely CONTRAfold for secondary structure prediction from a single RNA sequence and McCaskill-MEA for common secondary structure prediction from multiple alignments of RNA sequences. We clarify the relations between the proposed estimators and the estimators presented in previous works, and show theoretically that the previous estimators include additional unnecessary terms in the evaluation measures with respect to accuracy. Furthermore, computational experiments confirm the theoretical analysis by indicating improvement in the empirical accuracy. The proposed estimators represent extensions of the centroid estimators proposed in Ding et al. and Carvalho and Lawrence, and are applicable to a wide variety of problems in bioinformatics.

UR - http://www.scopus.com/inward/record.url?scp=60149094002&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=60149094002&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btn601

DO - 10.1093/bioinformatics/btn601

M3 - Article

VL - 25

SP - 465

EP - 473

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 4

ER -