Availability and implementation: The source code of SlideSort-BPRcan be freely downloaded from https://code.google.com/p/slidesortbpr/.
Motivation: Chromosome rearrangement events are triggered by atypical breaking and rejoining of DNA molecules, which are observed in many cancer-related diseases. The detection of rearrangement is typically done by using short reads generated by next-generation sequencing (NGS) and combining the reads with knowledge of a reference genome. Because structural variations and genomes differ from one person to another, intermediate comparison via a reference genome may lead to loss of information.
Results: In this article, we propose a reference-free method for detecting clusters of breakpoints from the chromosomal rearrangements. This is done by directly comparing a set of NGS normal reads with another set that may be rearranged. Our method SlideSort-BPR (breakpoint reads) is based on a fast algorithm for all-against-all comparisons of short reads and theoretical analyses of the number of neighboring reads. When applied to a dataset with a sequencing depth of 100×, it finds ∼88% of the breakpoints correctly with no false-positive reads. Moreover, evaluation on a real prostate cancer dataset shows that the proposed method predicts more fusion transcripts correctly than previous approaches, and yet produces fewer false-positive reads. To our knowledge, this is the first method to detect breakpoint reads without using a reference genome.
ASJC Scopus subject areas
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics