Secure wavelet matrix: Alphabet-friendly privacy-preserving string search for bioinformatics

Hiroki Sudo, Masanobu Jimbo, Koji Nuida, Kana Shimizu

Research output: Contribution to journalArticle

1 Citation (Scopus)

Abstract

Biomedical data often includes personal information, and the technology is demanded that enables the searching of such sensitive data while protecting privacy. We consider a case in which a server has a text database and a user searches the database to find substring matches. The user wants to conceal his/her query and the server wants to conceal the database except for the search results. The previous approach for this problem is based on a linear-time algorithm in terms of alphabet size $\mathbf{|\Sigma |}$||, and it cannot search on the database of large alphabet such as biomedical documents. We present a novel algorithm that can search a string in logarithmic time of $\mathbf{|\Sigma |}$||. In our algorithm, named secure wavelet matrix sWM, we use an additively homomorphic encryption to build an efficient data structure called a wavelet matrix. In an experiment using a simulated string of length 10,000 whose alphabet size ranges from 4 to 1024, the run time of the sWM was up to around two orders of magnitude faster than that of the previous method. sWM enables the searching of a private database efficiently and thus it will facilitate utilizing sensitive biomedical information.

Original languageEnglish
Article number3370684
Pages (from-to)1675-1684
Number of pages10
JournalIEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume16
Issue number5
DOIs
Publication statusPublished - 2019 Sep

Keywords

  • FM-index
  • String search
  • homomorphic encryption
  • privacy
  • wavelet matrix

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics

Fingerprint Dive into the research topics of 'Secure wavelet matrix: Alphabet-friendly privacy-preserving string search for bioinformatics'. Together they form a unique fingerprint.

Cite this