Reference-free prediction of rearrangement breakpoint reads

Edward Wijaya, Kana Shimizu, Kiyoshi Asai, Michiaki Hamada

    Research output: Contribution to journal › Article

    4 Citations (Scopus)

    Abstract

    Availability and implementation: The source code of SlideSort-BPR can be freely downloaded from https://code.google.com/p/slidesortbpr/.

    Motivation: Chromosome rearrangement events are triggered by atypical breaking and rejoining of DNA molecules, which are observed in many cancer-related diseases. The detection of rearrangement is typically done by using short reads generated by next-generation sequencing (NGS) and combining the reads with knowledge of a reference genome. Because structural variations and genomes differ from one person to another, intermediate comparison via a reference genome may lead to loss of information.

    Results: In this article, we propose a reference-free method for detecting clusters of breakpoints from the chromosomal rearrangements. This is done by directly comparing a set of NGS normal reads with another set that may be rearranged. Our method SlideSort-BPR (breakpoint reads) is based on a fast algorithm for all-against-all comparisons of short reads and theoretical analyses of the number of neighboring reads. When applied to a dataset with a sequencing depth of 100×, it finds ∼88% of the breakpoints correctly with no false-positive reads. Moreover, evaluation on a real prostate cancer dataset shows that the proposed method predicts more fusion transcripts correctly than previous approaches, and yet produces fewer false-positive reads. To our knowledge, this is the first method to detect breakpoint reads without using a reference genome.
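
The comparison described in the Results can be illustrated with a small sketch. It is not the SlideSort-BPR implementation: it replaces the fast all-against-all algorithm with a brute-force Hamming-distance scan over equal-length reads, and the function names and thresholds (`max_dist`, `min_case_neighbors`) are hypothetical choices made only for this example. It assumes that a read from the possibly rearranged set with no close neighbor among the normal reads, but several close neighbors within its own set, is a candidate breakpoint read.

```python
from typing import List


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length reads."""
    if len(a) != len(b):
        raise ValueError("reads must have equal length")
    return sum(x != y for x, y in zip(a, b))


def neighbor_count(read: str, pool: List[str], max_dist: int) -> int:
    """Number of reads in `pool` within `max_dist` mismatches of `read`."""
    return sum(1 for other in pool if hamming(read, other) <= max_dist)


def candidate_breakpoint_reads(
    case_reads: List[str],
    normal_reads: List[str],
    max_dist: int = 1,            # hypothetical mismatch tolerance
    min_case_neighbors: int = 2,  # hypothetical support threshold
) -> List[str]:
    """Return case reads that have no neighbor among the normal reads
    but enough neighbors among the case reads (the read itself counts)."""
    candidates = []
    for read in case_reads:
        in_normal = neighbor_count(read, normal_reads, max_dist)
        in_case = neighbor_count(read, case_reads, max_dist)
        if in_normal == 0 and in_case >= min_case_neighbors:
            candidates.append(read)
    return candidates


if __name__ == "__main__":
    normal = ["ACGTACGT", "ACGTACGA", "TTTTCCCC"]
    case = ["ACGTACGT", "ACGTCCCC", "ACGTCCCC", "ACGTCCCA"]
    print(candidate_breakpoint_reads(case, normal))
```

The brute-force scan is quadratic in the number of reads, so it is only practical for toy inputs; the abstract attributes the method's scalability to a fast all-against-all comparison algorithm, which this sketch does not reproduce.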

    Original language: English
    Pages (from-to): 2559-2567
    Number of pages: 9
    Journal: Bioinformatics
    Volume: 30
    Issue number: 18
    DOI: 10.1093/bioinformatics/btu360
    Publication status: Published - 2014

    Fingerprint

    Rearrangement
    Genome
    Genes
    Sequencing
    Prediction
    False Positive
    Genomic Structural Variation
    Prostate Cancer
    Chromosomes
    Fast Algorithm
    Chromosome
    Prostatic Neoplasms
    Cancer
    Fusion
    Person
    DNA
    Fusion reactions
    Availability
    Molecules
    Predict

    ASJC Scopus subject areas

    • Biochemistry
    • Molecular Biology
    • Computational Theory and Mathematics
    • Computer Science Applications
    • Computational Mathematics
    • Statistics and Probability
    • Medicine(all)

    Cite this

    Reference-free prediction of rearrangement breakpoint reads. / Wijaya, Edward; Shimizu, Kana; Asai, Kiyoshi; Hamada, Michiaki.

    In: Bioinformatics, Vol. 30, No. 18, 2014, p. 2559-2567.

    @article{a70bd29fec244c7b9d706f269b975a0b,
    title = "Reference-free prediction of rearrangement breakpoint reads",
    abstract = "Availability and implementation: The source code of SlideSort-BPR can be freely downloaded from https://code.google.com/p/slidesortbpr/. Motivation: Chromosome rearrangement events are triggered by atypical breaking and rejoining of DNA molecules, which are observed in many cancer-related diseases. The detection of rearrangement is typically done by using short reads generated by next-generation sequencing (NGS) and combining the reads with knowledge of a reference genome. Because structural variations and genomes differ from one person to another, intermediate comparison via a reference genome may lead to loss of information. Results: In this article, we propose a reference-free method for detecting clusters of breakpoints from the chromosomal rearrangements. This is done by directly comparing a set of NGS normal reads with another set that may be rearranged. Our method SlideSort-BPR (breakpoint reads) is based on a fast algorithm for all-against-all comparisons of short reads and theoretical analyses of the number of neighboring reads. When applied to a dataset with a sequencing depth of 100×, it finds ∼88{\%} of the breakpoints correctly with no false-positive reads. Moreover, evaluation on a real prostate cancer dataset shows that the proposed method predicts more fusion transcripts correctly than previous approaches, and yet produces fewer false-positive reads. To our knowledge, this is the first method to detect breakpoint reads without using a reference genome.",
    author = "Edward Wijaya and Kana Shimizu and Kiyoshi Asai and Michiaki Hamada",
    year = "2014",
    doi = "10.1093/bioinformatics/btu360",
    language = "English",
    volume = "30",
    pages = "2559--2567",
    journal = "Bioinformatics",
    issn = "1367-4803",
    publisher = "Oxford University Press",
    number = "18",

    }

    TY - JOUR

    T1 - Reference-free prediction of rearrangement breakpoint reads

    AU - Wijaya, Edward

    AU - Shimizu, Kana

    AU - Asai, Kiyoshi

    AU - Hamada, Michiaki

    PY - 2014

    Y1 - 2014

    N2 - Availability and implementation: The source code of SlideSort-BPR can be freely downloaded from https://code.google.com/p/slidesortbpr/. Motivation: Chromosome rearrangement events are triggered by atypical breaking and rejoining of DNA molecules, which are observed in many cancer-related diseases. The detection of rearrangement is typically done by using short reads generated by next-generation sequencing (NGS) and combining the reads with knowledge of a reference genome. Because structural variations and genomes differ from one person to another, intermediate comparison via a reference genome may lead to loss of information. Results: In this article, we propose a reference-free method for detecting clusters of breakpoints from the chromosomal rearrangements. This is done by directly comparing a set of NGS normal reads with another set that may be rearranged. Our method SlideSort-BPR (breakpoint reads) is based on a fast algorithm for all-against-all comparisons of short reads and theoretical analyses of the number of neighboring reads. When applied to a dataset with a sequencing depth of 100×, it finds ∼88% of the breakpoints correctly with no false-positive reads. Moreover, evaluation on a real prostate cancer dataset shows that the proposed method predicts more fusion transcripts correctly than previous approaches, and yet produces fewer false-positive reads. To our knowledge, this is the first method to detect breakpoint reads without using a reference genome.

    UR - http://www.scopus.com/inward/record.url?scp=84907504043&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84907504043&partnerID=8YFLogxK

    U2 - 10.1093/bioinformatics/btu360

    DO - 10.1093/bioinformatics/btu360

    M3 - Article

    VL - 30

    SP - 2559

    EP - 2567

    JO - Bioinformatics

    JF - Bioinformatics

    SN - 1367-4803

    IS - 18

    ER -