Efficient privacy-preserving variable-length substring match for genome sequence

Yoshiki Nakagawa, Satsuya Ohata, Kana Shimizu*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Finding similar substrings that appear in both query and database sequences is an essential task in genome data analysis. This study proposes a secure two-party variable-length string search protocol based on secret sharing. The distinguishing feature of our protocol is that, once the query has been input, its time, communication, and round complexities do not depend on the database length N. This property brings dramatic improvements in search time, since N is usually very large in a real genome database and the same database is reused for many queries. Our approach hinges on a technique that efficiently applies the compressed full-text index (FOCS 2000) to a secret-sharing scheme. In an experiment using a human genomic sequence of length 10 million as the database and a query of length 100, the query response time of our protocol was at least three orders of magnitude faster than that of a well-designed baseline protocol under a realistic computation and network environment.
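
To illustrate why a full-text index lets search cost depend on the query length rather than on N, here is a minimal plaintext sketch of FM-index backward search, assuming a toy database and naive index construction. It is not the secure protocol described in the abstract: nothing here is secret-shared, and the function names `build_fm_index` and `count_occurrences` are hypothetical, chosen only for this example.

```python
def build_fm_index(text):
    """Build the C and Occ tables of an FM-index (toy, non-secure version)."""
    text += "$"  # sentinel, assumed smaller than every other character
    # Naive suffix array by sorting suffixes; fine for a small example only.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform; text[-1] is '$', which covers the case i == 0.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c]: number of characters in text that are strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return C, occ, len(bwt)


def count_occurrences(query, C, occ, n):
    """Count occurrences of query with one table lookup pair per character."""
    lo, hi = 0, n  # half-open interval of suffix-array rows
    for c in reversed(query):
        if c not in C:
            return 0
        lo = C[c] + occ[c][lo]
        hi = C[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo


C, occ, n = build_fm_index("ACGTACGTAC")
print(count_occurrences("ACG", C, occ, n))
```

After the index is built, the search loop runs once per query character and performs only table lookups, so its length is set by the query rather than by the database. The abstract's contribution is making this kind of lookup work over secret shares, which this sketch does not attempt.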

Original language: English
Title of host publication: 21st International Workshop on Algorithms in Bioinformatics, WABI 2021
Editors: Alessandra Carbone, Mohammed El-Kebir
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
ISBN (Electronic): 9783959772006
Publication status: Published - 2021 Jul 1
Event: 21st International Workshop on Algorithms in Bioinformatics, WABI 2021 - Virtual, Chicago, United States
Duration: 2021 Aug 2 - 2021 Aug 4

Publication series

Name: Leibniz International Proceedings in Informatics, LIPIcs
Volume: 201
ISSN (Print): 1868-8969

Conference

Conference: 21st International Workshop on Algorithms in Bioinformatics, WABI 2021
Country/Territory: United States
City: Virtual, Chicago
Period: 21/8/2 - 21/8/4

Keywords

  • FM-index
  • Maximal exact match
  • Private genome sequence search
  • Secret sharing
  • Secure multiparty computation
  • Suffix tree

ASJC Scopus subject areas

  • Software
