A Novel Method for Assessing the Statistical Significance of RNA-RNA Interactions Between Two Long RNAs

Tsukasa Fukunaga, Michiaki Hamada

    Research output: Contribution to journal › Article

    Abstract

    RNA-RNA interactions are key mechanisms through which noncoding RNA (ncRNA) regions exert biological functions. Computational prediction is essential for detecting novel RNA-RNA interactions because their comprehensive detection by biological experimentation remains difficult. Many RNA-RNA interaction prediction tools have been developed, but they tend to produce many false positives, so assessing the statistical significance of computationally predicted interactions is an important task. However, no existing method for evaluating the statistical significance of RNA-RNA interactions is applicable to interactions between two long RNA sequences. We developed a method to calculate the p-value for the minimum interaction energy between two long RNA sequences. The method relies on the observation that minimum interaction energies of RNA-RNA interactions between long RNAs follow a Gumbel distribution when repeat sequences in the RNAs are masked. To demonstrate its usefulness, we applied it to whole human 5′-untranslated region (UTR) and 3′-UTR sequences to detect novel 5′-UTR-3′-UTR interactions, and identified two significant 5′-UTR-3′-UTR interactions. Specifically, the human small proline-rich repeat protein 3 shows conserved 5′-UTR-3′-UTR interactions, with some nucleotide variations among primates that preserve the base pairings. The method enables detection of statistically significant RNA-RNA interactions between long RNAs such as long ncRNAs. Statistical significance estimates help identify interactions for experimental validation and provide novel insights into the function of ncRNA regions.
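The idea of turning a minimum interaction energy into a p-value via a Gumbel distribution can be illustrated with a minimal sketch. It is not the authors' implementation: the method-of-moments fit, the function names (`fit_gumbel_min`, `p_value`, `sample_gumbel_min`), and the simulated background energies are all assumptions made for illustration. The only part taken from the abstract is the premise that minimum interaction energies follow a Gumbel distribution. Because lower (more negative) energies indicate stronger interactions, the sketch uses the Gumbel distribution for minima, with CDF F(x) = 1 - exp(-exp((x - mu) / beta)), and reports P(E <= observed energy).

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_min(energies):
    """Method-of-moments fit of a Gumbel (minimum) distribution.

    Assumption: this simple estimator stands in for whatever fitting
    procedure the original method uses.
    For Gumbel-min: mean = mu - gamma * beta, variance = (pi * beta)^2 / 6.
    """
    mean = statistics.fmean(energies)
    sd = statistics.stdev(energies)
    beta = sd * math.sqrt(6.0) / math.pi
    mu = mean + EULER_GAMMA * beta
    return mu, beta


def p_value(energy, mu, beta):
    """Lower-tail probability P(E <= energy) = 1 - exp(-exp(z))."""
    z = min((energy - mu) / beta, 700.0)  # clamp to avoid overflow in exp
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for tiny x.
    return -math.expm1(-math.exp(z))


def sample_gumbel_min(rng, mu, beta):
    """Draw one Gumbel-min variate by inverting the CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0) in the inverse transform
        u = rng.random()
    return mu + beta * math.log(-math.log(1.0 - u))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical background: simulated values (kcal/mol) standing in for
    # minimum interaction energies of unrelated, repeat-masked RNA pairs.
    background = [sample_gumbel_min(rng, -12.0, 1.5) for _ in range(5000)]
    mu, beta = fit_gumbel_min(background)
    for observed in (-12.0, -18.0, -25.0):
        print(f"energy={observed:7.1f}  p={p_value(observed, mu, beta):.3e}")
```

In practice the background energies would come from a prediction tool run on a suitable null set of sequence pairs, and the fitted parameters would likely need to account for sequence length; both are outside the scope of this sketch.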

    Original language: English
    Pages (from-to): 976-986
    Number of pages: 11
    Journal: Journal of Computational Biology
    Volume: 25
    Issue number: 9
    DOIs: 10.1089/cmb.2017.0260
    Publication status: Published - 2018 Sep 1

    Fingerprint

    Statistical Significance
    RNA
    Interaction
    5' Untranslated Regions
    3' Untranslated Regions
    Untranslated RNA
    Long Noncoding RNA
    Gumbel Distribution
    Prediction
    Experimental Validation
    p-Value
    Energy
    False Positive
    Pairing
    Experimentation
    Base Pairing
    Primates

    Keywords

    • RNA bioinformatics
    • RNA-RNA interaction
    • statistical test

    ASJC Scopus subject areas

    • Modelling and Simulation
    • Molecular Biology
    • Genetics
    • Computational Mathematics
    • Computational Theory and Mathematics

    Cite this

    A Novel Method for Assessing the Statistical Significance of RNA-RNA Interactions Between Two Long RNAs. / Fukunaga, Tsukasa; Hamada, Michiaki.

    In: Journal of Computational Biology, Vol. 25, No. 9, 01.09.2018, p. 976-986.


    @article{62e77ecc56d64439aa467b547d4487c3,
    title = "A Novel Method for Assessing the Statistical Significance of RNA-RNA Interactions Between Two Long RNAs",
    abstract = "RNA-RNA interactions are key mechanisms through which noncoding RNA (ncRNA) regions exert biological functions. Computational prediction of RNA-RNA interactions is an essential method for detecting novel RNA-RNA interactions because their comprehensive detection by biological experimentation is still quite difficult. Many RNA-RNA interaction prediction tools have been developed, but they tend to produce many false positives. Accordingly, assessment of the statistical significance of computationally predicted interactions is an important task. However, there is no method to evaluate the statistical significance of RNA-RNA interactions that is applicable to interactions between two long RNA sequences. We developed a method to calculate the p-value for the minimal interaction energy between two long RNA sequences. The developed method depends on the fact that minimum interaction energies of RNA-RNA interactions between long RNAs follow a Gumbel distribution when repeat sequences in RNAs are masked. To show the usefulness of the developed method, we applied it to whole human 5′-untranslated region (UTR) and 3′-UTR sequences to detect novel 5′-UTR-3′-UTR interactions. We thus identified two significant 5′-UTR-3′-UTR interactions. Specifically, the human small proline-rich repeat protein 3 shows conserved 5′-UTR-3′-UTR interactions with some nucleotide variations preserving base pairings among primates. Our developed method enables us to detect statistically significant RNA-RNA interactions between long RNAs such as long ncRNAs. Statistical significance estimates help in identification of interactions for experimental validation and provide novel insights into the function of ncRNA regions.",
    keywords = "RNA bioinformatics, RNA-RNA interaction, statistical test",
    author = "Tsukasa Fukunaga and Michiaki Hamada",
    year = "2018",
    month = "9",
    day = "1",
    doi = "10.1089/cmb.2017.0260",
    language = "English",
    volume = "25",
    pages = "976--986",
    journal = "Journal of Computational Biology",
    issn = "1066-5277",
    publisher = "Mary Ann Liebert Inc.",
    number = "9",

    }

    TY - JOUR

    T1 - A Novel Method for Assessing the Statistical Significance of RNA-RNA Interactions Between Two Long RNAs

    AU - Fukunaga, Tsukasa

    AU - Hamada, Michiaki

    PY - 2018/9/1

    Y1 - 2018/9/1

    N2 - RNA-RNA interactions are key mechanisms through which noncoding RNA (ncRNA) regions exert biological functions. Computational prediction of RNA-RNA interactions is an essential method for detecting novel RNA-RNA interactions because their comprehensive detection by biological experimentation is still quite difficult. Many RNA-RNA interaction prediction tools have been developed, but they tend to produce many false positives. Accordingly, assessment of the statistical significance of computationally predicted interactions is an important task. However, there is no method to evaluate the statistical significance of RNA-RNA interactions that is applicable to interactions between two long RNA sequences. We developed a method to calculate the p-value for the minimal interaction energy between two long RNA sequences. The developed method depends on the fact that minimum interaction energies of RNA-RNA interactions between long RNAs follow a Gumbel distribution when repeat sequences in RNAs are masked. To show the usefulness of the developed method, we applied it to whole human 5′-untranslated region (UTR) and 3′-UTR sequences to detect novel 5′-UTR-3′-UTR interactions. We thus identified two significant 5′-UTR-3′-UTR interactions. Specifically, the human small proline-rich repeat protein 3 shows conserved 5′-UTR-3′-UTR interactions with some nucleotide variations preserving base pairings among primates. Our developed method enables us to detect statistically significant RNA-RNA interactions between long RNAs such as long ncRNAs. Statistical significance estimates help in identification of interactions for experimental validation and provide novel insights into the function of ncRNA regions.

    KW - RNA bioinformatics

    KW - RNA-RNA interaction

    KW - statistical test

    UR - http://www.scopus.com/inward/record.url?scp=85053185317&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=85053185317&partnerID=8YFLogxK

    U2 - 10.1089/cmb.2017.0260

    DO - 10.1089/cmb.2017.0260

    M3 - Article

    AN - SCOPUS:85053185317

    VL - 25

    SP - 976

    EP - 986

    JO - Journal of Computational Biology

    JF - Journal of Computational Biology

    SN - 1066-5277

    IS - 9

    ER -