DeepM6ASeq: Prediction and characterization of m6A-containing sequences using deep learning

Yiqian Zhang, Michiaki Hamada

    Research output: Contribution to journal (Article)

    2 Citations (Scopus)

    Abstract

    Background: N6-methyladenosine (m6A) is a common and abundant RNA methylation modification found in various species. As a type of post-transcriptional methylation, m6A plays an important role in diverse RNA activities such as alternative splicing, interplay with microRNAs, and translation efficiency. Although existing tools can predict m6A at single-base resolution, it is still challenging to extract the biological information surrounding m6A sites.

    Results: We implemented a deep learning framework, named DeepM6ASeq, to predict m6A-containing sequences and characterize surrounding biological features based on miCLIP-Seq data, which detects m6A sites at single-base resolution. DeepM6ASeq showed better performance than other machine learning classifiers. Moreover, an independent test on m6A-Seq data, which identifies m6A-containing genomic regions, revealed that our model is competitive in predicting m6A-containing sequences. The learned motifs from DeepM6ASeq correspond to known m6A readers. Notably, DeepM6ASeq also identifies a newly recognized m6A reader: FMR1. In addition, we found that a saliency map in the deep learning model could be used to visualize the locations of m6A sites.

    Conclusion: We developed a deep-learning-based framework to predict and characterize m6A-containing sequences, and we hope it helps investigators gain more insight into m6A research. The source code is available at https://github.com/rreybeyb/DeepM6ASeq.

    Original language: English
    Article number: 524
    Journal: BMC Bioinformatics
    Volume: 19
    DOI: 10.1186/s12859-018-2516-4
    Publication status: Published - 2018 Dec 31

    Keywords

    • Deep learning
    • N6-methyladenosine
    • RNA modification

    ASJC Scopus subject areas

    • Structural Biology
    • Biochemistry
    • Molecular Biology
    • Computer Science Applications
    • Applied Mathematics

    Cite this

    DeepM6ASeq: Prediction and characterization of m6A-containing sequences using deep learning. / Zhang, Yiqian; Hamada, Michiaki.

    In: BMC Bioinformatics, Vol. 19, 524, 31.12.2018.

    @article{3f8a7194e6964872ae2c4c46bd5b05e0,
    title = "DeepM6ASeq: Prediction and characterization of m6A-containing sequences using deep learning",
    abstract = "Background: N6-methyladenosine (m6A) is a common and abundant RNA methylation modification found in various species. As a type of post-transcriptional methylation, m6A plays an important role in diverse RNA activities such as alternative splicing, interplay with microRNAs, and translation efficiency. Although existing tools can predict m6A at single-base resolution, it is still challenging to extract the biological information surrounding m6A sites. Results: We implemented a deep learning framework, named DeepM6ASeq, to predict m6A-containing sequences and characterize surrounding biological features based on miCLIP-Seq data, which detects m6A sites at single-base resolution. DeepM6ASeq showed better performance than other machine learning classifiers. Moreover, an independent test on m6A-Seq data, which identifies m6A-containing genomic regions, revealed that our model is competitive in predicting m6A-containing sequences. The learned motifs from DeepM6ASeq correspond to known m6A readers. Notably, DeepM6ASeq also identifies a newly recognized m6A reader: FMR1. In addition, we found that a saliency map in the deep learning model could be used to visualize the locations of m6A sites. Conclusion: We developed a deep-learning-based framework to predict and characterize m6A-containing sequences, and we hope it helps investigators gain more insight into m6A research. The source code is available at https://github.com/rreybeyb/DeepM6ASeq.",
    keywords = "Deep learning, N6-methyladenosine, RNA modification",
    author = "Yiqian Zhang and Michiaki Hamada",
    year = "2018",
    month = "12",
    day = "31",
    doi = "10.1186/s12859-018-2516-4",
    language = "English",
    volume = "19",
    journal = "BMC Bioinformatics",
    issn = "1471-2105",
    publisher = "BioMed Central",

    }

    TY - JOUR

    T1 - DeepM6ASeq

    T2 - Prediction and characterization of m6A-containing sequences using deep learning

    AU - Zhang, Yiqian

    AU - Hamada, Michiaki

    PY - 2018/12/31

    Y1 - 2018/12/31

    AB - Background: N6-methyladenosine (m6A) is a common and abundant RNA methylation modification found in various species. As a type of post-transcriptional methylation, m6A plays an important role in diverse RNA activities such as alternative splicing, interplay with microRNAs, and translation efficiency. Although existing tools can predict m6A at single-base resolution, it is still challenging to extract the biological information surrounding m6A sites. Results: We implemented a deep learning framework, named DeepM6ASeq, to predict m6A-containing sequences and characterize surrounding biological features based on miCLIP-Seq data, which detects m6A sites at single-base resolution. DeepM6ASeq showed better performance than other machine learning classifiers. Moreover, an independent test on m6A-Seq data, which identifies m6A-containing genomic regions, revealed that our model is competitive in predicting m6A-containing sequences. The learned motifs from DeepM6ASeq correspond to known m6A readers. Notably, DeepM6ASeq also identifies a newly recognized m6A reader: FMR1. In addition, we found that a saliency map in the deep learning model could be used to visualize the locations of m6A sites. Conclusion: We developed a deep-learning-based framework to predict and characterize m6A-containing sequences, and we hope it helps investigators gain more insight into m6A research. The source code is available at https://github.com/rreybeyb/DeepM6ASeq.

    KW - Deep learning

    KW - N6-methyladenosine

    KW - RNA modification

    UR - http://www.scopus.com/inward/record.url?scp=85059265094&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=85059265094&partnerID=8YFLogxK

    U2 - 10.1186/s12859-018-2516-4

    DO - 10.1186/s12859-018-2516-4

    M3 - Article

    C2 - 30598068

    AN - SCOPUS:85059265094

    VL - 19

    JO - BMC Bioinformatics

    JF - BMC Bioinformatics

    SN - 1471-2105

    M1 - 524

    ER -