Large scale similarity search for locally stable secondary structures among RNA sequences

Michiaki Hamada, Toutai Mituyama, Kiyoshi Asai

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Recently, a large number of candidate non-coding RNAs (ncRNAs) have been predicted by experimental or computational approaches. Moreover, in genomic sequences, there are still many interesting regions whose functions are unknown (e.g., indel conserved regions, human accelerated regions, ultraconserved elements and transposon-free regions), and some of those regions may be ncRNAs. On the other hand, it is known that many ncRNAs have characteristic secondary structures which are strongly related to their functions. Therefore, detecting clusters which have mutually similar secondary structures is important for revealing new ncRNA families. In this paper, we describe a novel method, called RNAclique, which is able to search for clusters containing mutually similar and locally stable secondary structures among a large number of unaligned RNA sequences. Our problem is formulated as a constrained quasi-clique search problem, and we use an approximate combinatorial optimization method, called GRASP, for solving the problem. Several computational experiments show that our method is useful and scalable for detecting ncRNA families from large sequences. We also present two examples of large-scale sequence analysis using RNAclique.
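The quasi-clique search with GRASP mentioned in the abstract can be illustrated with a minimal sketch. This is not RNAclique itself: it assumes a plain undirected graph in which each vertex stands for a sequence (or local structure) and each edge marks a pair already judged similar, and it uses a simple edge-density bound `gamma` in place of whatever constraints the paper defines. The function names, the `alpha` parameter and the swap-based local search are illustrative choices, not taken from the paper.

```python
import random
from itertools import combinations


def density(adj, nodes):
    """Fraction of possible edges present among `nodes` (1.0 for fewer than 2 nodes)."""
    nodes = list(nodes)
    if len(nodes) < 2:
        return 1.0
    edges = sum(1 for u, v in combinations(nodes, 2) if v in adj[u])
    return edges / (len(nodes) * (len(nodes) - 1) / 2)


def construct(adj, gamma, alpha, rng):
    """Greedy randomized construction: grow a set while density stays >= gamma."""
    current = {rng.choice(sorted(adj))}
    while True:
        # Score each feasible outside vertex by its number of neighbours in the set.
        scored = [(len(adj[v] & current), v)
                  for v in sorted(set(adj) - current)
                  if density(adj, current | {v}) >= gamma]
        if not scored:
            return current
        best = max(s for s, _ in scored)
        worst = min(s for s, _ in scored)
        # Restricted candidate list: alpha=0 is pure greedy, alpha=1 is pure random.
        threshold = best - alpha * (best - worst)
        rcl = [v for s, v in scored if s >= threshold]
        current.add(rng.choice(rcl))


def local_search(adj, gamma, current):
    """Swap one member for two outsiders whenever the density bound still holds."""
    improved = True
    while improved:
        improved = False
        outside = sorted(set(adj) - current)
        for out in sorted(current):
            for a, b in combinations(outside, 2):
                trial = (current - {out}) | {a, b}
                if density(adj, trial) >= gamma:
                    current, improved = trial, True
                    break
            if improved:
                break
    return current


def grasp_quasi_clique(adj, gamma=0.8, alpha=0.3, iterations=50, seed=0):
    """Repeat construction + local search and keep the largest set found."""
    rng = random.Random(seed)
    best = set()
    for _ in range(iterations):
        candidate = local_search(adj, gamma, construct(adj, gamma, alpha, rng))
        if len(candidate) > len(best):
            best = candidate
    return best


if __name__ == "__main__":
    # Hypothetical toy graph: vertices 0-4 are mutually similar, 5-7 form a smaller group.
    toy = {v: set() for v in range(8)}
    for u, v in list(combinations(range(5), 2)) + [(5, 6), (5, 7), (6, 7), (4, 5)]:
        toy[u].add(v)
        toy[v].add(u)
    print(sorted(grasp_quasi_clique(toy, gamma=1.0)))
```

Setting `gamma=1.0` asks for exact cliques, while lower values tolerate missing edges, which is the usual motivation for quasi-cliques when pairwise similarity scores are noisy. Because GRASP is a heuristic, the returned set is a good candidate rather than a guaranteed optimum.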

Original language: English
Pages (from-to): 36-46
Number of pages: 11
Journal: IPSJ Transactions on Bioinformatics
Volume: 2
DOI: 10.2197/ipsjtbio.2.36
Publication status: Published - 2009
Externally published: Yes

Fingerprint

Untranslated RNA
RNA
Combinatorial optimization
Sequence Analysis
Experiments

ASJC Scopus subject areas

  • Computer Science Applications
  • Biochemistry, Genetics and Molecular Biology (miscellaneous)

Cite this

Large scale similarity search for locally stable secondary structures among RNA sequences. / Hamada, Michiaki; Mituyama, Toutai; Asai, Kiyoshi.

In: IPSJ Transactions on Bioinformatics, Vol. 2, 2009, p. 36-46.

@article{05bb81fd693d40838b9f2b9b3a738575,
title = "Large scale similarity search for locally stable secondary structures among RNA sequences",
abstract = "Recently, a large number of candidates of non-coding RNAs (ncRNAs) has been predicted by experimental or computational approaches. Moreover, in genomic sequences, there are still many interesting regions whose functions are unknown (e.g., indel conserved regions, human accelerated regions, ultraconserved elements and transposon free regions) and some of those regions may be ncRNAs. On the other hand, it is known that many ncRNAs have characteristic secondary structures which are strongly related to their functions. Therefore, detecting clusters which have mutually similar secondary structures is important for revealing new ncRNA families. In this paper, we describe a novel method, called RNAclique, which is able to search for clusters containing mutually similar and locally stable secondary structures among a large number of unaligned RNA sequences. Our problem is formulated as a constraint quasiclique search problem, and we use an approximate combinatorial optimization method, called GRASP, for solving the problem. Several computational experiments show that our method is useful and scalable for detecting ncRNA families from large sequences. We also present two examples of large scale sequence analysis using RNAclique.",
author = "Michiaki Hamada and Toutai Mituyama and Kiyoshi Asai",
year = "2009",
doi = "10.2197/ipsjtbio.2.36",
language = "English",
volume = "2",
pages = "36--46",
journal = "IPSJ Transactions on Bioinformatics",
issn = "1882-6679",
publisher = "Information Processing Society of Japan"
}

TY - JOUR

T1 - Large scale similarity search for locally stable secondary structures among RNA sequences

AU - Hamada, Michiaki

AU - Mituyama, Toutai

AU - Asai, Kiyoshi

PY - 2009

Y1 - 2009

N2 - Recently, a large number of candidates of non-coding RNAs (ncRNAs) has been predicted by experimental or computational approaches. Moreover, in genomic sequences, there are still many interesting regions whose functions are unknown (e.g., indel conserved regions, human accelerated regions, ultraconserved elements and transposon free regions) and some of those regions may be ncRNAs. On the other hand, it is known that many ncRNAs have characteristic secondary structures which are strongly related to their functions. Therefore, detecting clusters which have mutually similar secondary structures is important for revealing new ncRNA families. In this paper, we describe a novel method, called RNAclique, which is able to search for clusters containing mutually similar and locally stable secondary structures among a large number of unaligned RNA sequences. Our problem is formulated as a constraint quasiclique search problem, and we use an approximate combinatorial optimization method, called GRASP, for solving the problem. Several computational experiments show that our method is useful and scalable for detecting ncRNA families from large sequences. We also present two examples of large scale sequence analysis using RNAclique.

UR - http://www.scopus.com/inward/record.url?scp=79954566331&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79954566331&partnerID=8YFLogxK

U2 - 10.2197/ipsjtbio.2.36

DO - 10.2197/ipsjtbio.2.36

M3 - Article

AN - SCOPUS:79954566331

VL - 2

SP - 36

EP - 46

JO - IPSJ Transactions on Bioinformatics

JF - IPSJ Transactions on Bioinformatics

SN - 1882-6679

ER -