CAPTCHA challenge strings: problems and improvements
16 January 2006
Jon Bentley, Colin Mallows
Proceedings Volume 6067, Document Recognition and Retrieval XIII; 60670H (2006) https://doi.org/10.1117/12.650644
Event: Electronic Imaging 2006, 2006, San Jose, California, United States
Abstract
A CAPTCHA is a Completely Automated Public Turing test to tell Computers and Humans Apart. Typical CAPTCHAs present a challenge string consisting of a visually distorted sequence of letters and perhaps numbers, which in theory only a human can read. Attackers of CAPTCHAs have two primary points of leverage: Optical Character Recognition (OCR) can identify some characters, while nonuniform probabilities make other characters relatively easy to guess. This paper uses a mathematical theory of assurance to characterize the probability that a correct answer to a CAPTCHA is not just a lucky guess. We examine the three most common types of challenge strings (dictionary words, Markov text, and random strings) and find substantial weaknesses in each. We therefore propose improvements to Markov text, and new challenges based on the consonant-vowel-consonant (CVC) trigrams of psychology. Theory and experiment together quantify problems in current challenges and the improvements offered by modifications.
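As a rough illustration of two ideas in the abstract (CVC-based challenge strings and the advantage an attacker gains from nonuniform character probabilities), the following Python sketch may help. It is not the authors' construction: the function names, the uniform sampling of letters, the treatment of "y" as a consonant, and the example probabilities are all assumptions made here for illustration.

```python
import random
import string

VOWELS = "aeiou"
# Assumption: every non-vowel letter, including "y", counts as a consonant.
CONSONANTS = "".join(c for c in string.ascii_lowercase if c not in VOWELS)


def cvc_trigram(rng):
    """Return one consonant-vowel-consonant trigram, each letter drawn uniformly."""
    return rng.choice(CONSONANTS) + rng.choice(VOWELS) + rng.choice(CONSONANTS)


def cvc_challenge(n_trigrams, rng=None):
    """Concatenate n_trigrams independent CVC trigrams into one challenge string."""
    if rng is None:
        rng = random.Random()
    return "".join(cvc_trigram(rng) for _ in range(n_trigrams))


def blind_guess_probability(n_trigrams):
    """Chance that one blind guess matches a uniformly drawn CVC challenge."""
    per_trigram = len(CONSONANTS) * len(VOWELS) * len(CONSONANTS)
    return 1.0 / per_trigram ** n_trigrams


def best_guess_probability(char_probs):
    """Chance of guessing one unread character by always picking the likeliest."""
    return max(char_probs.values())


if __name__ == "__main__":
    print(cvc_challenge(2, random.Random(0)))
    print(blind_guess_probability(2))
    # Hypothetical skewed distribution over three candidate characters.
    print(best_guess_probability({"e": 0.5, "q": 0.1, "z": 0.4}))
```

Under these assumptions there are 21 × 5 × 21 = 2205 equally likely trigrams, so a blind guess at a two-trigram challenge succeeds with probability 1/2205². With a skewed distribution, by contrast, an attacker who cannot read a character still succeeds with the probability of its most likely value, which is the kind of leverage the abstract attributes to nonuniform probabilities.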
© (2006) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Jon Bentley and Colin Mallows "CAPTCHA challenge strings: problems and improvements", Proc. SPIE 6067, Document Recognition and Retrieval XIII, 60670H (16 January 2006); https://doi.org/10.1117/12.650644
CITATIONS
Cited by 18 scholarly publications and 2 patents.
KEYWORDS
Associative arrays
Optical character recognition
Probability theory
Psychology
Computing systems
Defense and security
Information assurance