Secure Wavelet Matrix: Alphabet-Friendly Privacy-Preserving String Search for Bioinformatics

Hiroki Sudo, Masanobu Jimbo, Koji Nuida, Kana Shimizu

Research output: Contribution to journal › Article

Abstract

Biomedical data often include personal information, and there is demand for technology that can search such sensitive data while protecting privacy. We consider a case in which a server holds a text database and a user searches the database to find substring matches. The user wants to conceal the query, and the server wants to conceal the database except for the search results. The previous approach to this problem is based on an algorithm whose running time is linear in the alphabet size $|\Sigma|$, and it cannot search a database with a large alphabet, such as biomedical documents. We present a novel algorithm that can search a string in time logarithmic in $|\Sigma|$. In our algorithm, named secure wavelet matrix (sWM), we use additively homomorphic encryption to build an efficient data structure called a wavelet matrix. In an experiment using a simulated string of length 10,000 whose alphabet size ranges from 4 to 1024, the sWM ran up to around two orders of magnitude faster than the previous method. sWM enables efficient search of a private database and will thus facilitate the use of sensitive biomedical information.
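To illustrate why a wavelet matrix gives query time logarithmic in the alphabet size, the following is a minimal plaintext sketch of the data structure and its rank query, the operation that FM-index style backward search relies on. It is not the sWM protocol: no homomorphic encryption is involved, and the class name `WaveletMatrix`, its `rank` method, and the prefix-sum representation of each bit level are assumptions made here for illustration only.

```python
class WaveletMatrix:
    """Plaintext wavelet matrix over integer symbols in [0, sigma)."""

    def __init__(self, seq, sigma):
        self.n = len(seq)
        # Number of bit levels: ceil(log2(sigma)), at least 1.
        self.bits = max(1, (sigma - 1).bit_length())
        self.ones_prefix = []  # per level: prefix counts of 1-bits
        self.zeros = []        # per level: total number of 0-bits
        cur = list(seq)
        for level in range(self.bits - 1, -1, -1):
            b = [(c >> level) & 1 for c in cur]
            prefix = [0]
            for x in b:
                prefix.append(prefix[-1] + x)
            self.ones_prefix.append(prefix)
            self.zeros.append(self.n - prefix[-1])
            # Stable partition: symbols with bit 0 first, then bit 1.
            cur = ([c for c, x in zip(cur, b) if x == 0]
                   + [c for c, x in zip(cur, b) if x == 1])

    def rank(self, c, i):
        """Count occurrences of symbol c in seq[:i]."""
        lo, hi = 0, i
        # One step per bit of c, so the loop runs about log2(sigma) times.
        for k, level in enumerate(range(self.bits - 1, -1, -1)):
            prefix = self.ones_prefix[k]
            if (c >> level) & 1:
                lo = self.zeros[k] + prefix[lo]
                hi = self.zeros[k] + prefix[hi]
            else:
                lo = lo - prefix[lo]
                hi = hi - prefix[hi]
        return hi - lo


seq = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
wm = WaveletMatrix(seq, sigma=16)
print(wm.rank(5, len(seq)))  # occurrences of symbol 5 in the whole sequence
```

Each rank query touches one bit vector per level, so its cost grows with the number of bits per symbol rather than with the number of distinct symbols. The abstract describes the secure protocol as achieving the same logarithmic dependence on $|\Sigma|$.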

Original language: English
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Publication status: Accepted/In press - 2018 Mar 8
Externally published: Yes

Keywords

  • Complexity theory
  • Data structures
  • FM-index
  • Homomorphic Encryption
  • Indexes
  • Privacy
  • Protocols
  • Search problems
  • Servers
  • String Search
  • Wavelet Matrix

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics
