Solving Google's continuous audio CAPTCHA with HMM-based automatic speech recognition

Shotaro Sano, Takuma Otsuka, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution

7 Citations (Scopus)

Abstract

CAPTCHAs play a critical role in maintaining the security of various Web services by distinguishing humans from automated programs and preventing Web services from being abused. CAPTCHAs are designed to block automated programs by presenting questions that are easy for humans but difficult for computers, e.g., recognition of visual digits or audio utterances. Recent audio CAPTCHAs, such as Google's audio reCAPTCHA, present overlapping and distorted target voices with stationary background noise. We investigate the security of overlapping audio CAPTCHAs by developing an audio reCAPTCHA solver. Our solver is built on speech recognition techniques using hidden Markov models (HMMs) and is implemented with HMM Toolkit, an off-the-shelf library. Our experiments revealed vulnerabilities in the current version of audio reCAPTCHA, with the solver cracking 52% of the questions. We further explain that stationary background noise did not contribute to enhancing security against our solver.
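
The abstract describes the solver only as HMM-based speech recognition built with HMM Toolkit. As a rough illustration of the decoding step that HMM recognizers rely on, the following minimal Python sketch runs Viterbi decoding over a toy discrete HMM. The two states, the three observation symbols, and every probability are hypothetical values chosen for illustration; none of them come from the paper, and a real recognizer would work on acoustic feature vectors with trained models.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for an observation sequence."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    # back[t][s]: predecessor of state s on that best path
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log_emit[s][obs[t]]
            back[t][s] = prev
    # Trace back from the most likely final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


def to_log(table):
    """Convert a (possibly nested) dict of probabilities to log-probabilities."""
    return {
        k: (to_log(v) if isinstance(v, dict) else math.log(v))
        for k, v in table.items()
    }


# Toy model with hypothetical values: a "digit" state and a "noise" state,
# each emitting a coarse loudness symbol.
STATES = ["digit", "noise"]
START = to_log({"digit": 0.5, "noise": 0.5})
TRANS = to_log({
    "digit": {"digit": 0.8, "noise": 0.2},
    "noise": {"digit": 0.3, "noise": 0.7},
})
EMIT = to_log({
    "digit": {"loud": 0.7, "mid": 0.2, "quiet": 0.1},
    "noise": {"loud": 0.1, "mid": 0.3, "quiet": 0.6},
})

if __name__ == "__main__":
    log_prob, path = viterbi(
        ["loud", "loud", "quiet", "quiet"], STATES, START, TRANS, EMIT
    )
    print(path, math.exp(log_prob))
```

Working in log-probabilities avoids numerical underflow on long sequences, which is why the sketch converts the tables once with `to_log` instead of multiplying raw probabilities.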

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 36-52
Number of pages: 17
Volume: 8231 LNCS
DOIs: https://doi.org/10.1007/978-3-642-41383-4_3
Publication status: Published - 2013
Externally published: Yes
Event: 8th International Workshop on Security, IWSEC 2013 - Okinawa
Duration: 2013 Nov 18 to 2013 Nov 20

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8231 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 8th International Workshop on Security, IWSEC 2013
City: Okinawa
Period: 13/11/18 to 13/11/20

Keywords

  • audio CAPTCHA
  • automatic speech recognition
  • hidden Markov model
  • human interaction proof
  • reCAPTCHA

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Sano, S., Otsuka, T., & Okuno, H. G. (2013). Solving Google's continuous audio CAPTCHA with HMM-based automatic speech recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8231 LNCS, pp. 36-52). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8231 LNCS). https://doi.org/10.1007/978-3-642-41383-4_3

@inproceedings{49fc6327e7954c47ab9199325b2b7efd,
title = "Solving Google's continuous audio CAPTCHA with HMM-based automatic speech recognition",
abstract = "CAPTCHAs play a critical role in maintaining the security of various Web services by distinguishing humans from automated programs and preventing Web services from being abused. CAPTCHAs are designed to block automated programs by presenting questions that are easy for humans but difficult for computers, e.g., recognition of visual digits or audio utterances. Recent audio CAPTCHAs, such as Google's audio reCAPTCHA, present overlapping and distorted target voices with stationary background noise. We investigate the security of overlapping audio CAPTCHAs by developing an audio reCAPTCHA solver. Our solver is built on speech recognition techniques using hidden Markov models (HMMs) and is implemented with HMM Toolkit, an off-the-shelf library. Our experiments revealed vulnerabilities in the current version of audio reCAPTCHA, with the solver cracking 52{\%} of the questions. We further explain that stationary background noise did not contribute to enhancing security against our solver.",
keywords = "audio CAPTCHA, automatic speech recognition, hidden Markov model, human interaction proof, reCAPTCHA",
author = "Sano, Shotaro and Otsuka, Takuma and Okuno, Hiroshi G.",
year = "2013",
doi = "10.1007/978-3-642-41383-4_3",
language = "English",
isbn = "9783642413827",
volume = "8231 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "36--52",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY  - GEN
T1  - Solving Google's continuous audio CAPTCHA with HMM-based automatic speech recognition
AU  - Sano, Shotaro
AU  - Otsuka, Takuma
AU  - Okuno, Hiroshi G.
PY  - 2013
Y1  - 2013
AB  - CAPTCHAs play a critical role in maintaining the security of various Web services by distinguishing humans from automated programs and preventing Web services from being abused. CAPTCHAs are designed to block automated programs by presenting questions that are easy for humans but difficult for computers, e.g., recognition of visual digits or audio utterances. Recent audio CAPTCHAs, such as Google's audio reCAPTCHA, present overlapping and distorted target voices with stationary background noise. We investigate the security of overlapping audio CAPTCHAs by developing an audio reCAPTCHA solver. Our solver is built on speech recognition techniques using hidden Markov models (HMMs) and is implemented with HMM Toolkit, an off-the-shelf library. Our experiments revealed vulnerabilities in the current version of audio reCAPTCHA, with the solver cracking 52% of the questions. We further explain that stationary background noise did not contribute to enhancing security against our solver.
KW  - audio CAPTCHA
KW  - automatic speech recognition
KW  - hidden Markov model
KW  - human interaction proof
KW  - reCAPTCHA
UR  - http://www.scopus.com/inward/record.url?scp=84891904580&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84891904580&partnerID=8YFLogxK
U2  - 10.1007/978-3-642-41383-4_3
DO  - 10.1007/978-3-642-41383-4_3
M3  - Conference contribution
AN  - SCOPUS:84891904580
SN  - 9783642413827
VL  - 8231 LNCS
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 36
EP  - 52
BT  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER  -