I Am Robot: (Deep) Learning to Break Semantic Image CAPTCHAs
Suphannee Sivakorn, Iasonas Polakis and Angelos D. Keromytis
Department of Computer Science
Columbia University, New York, USA
{suphannee, polakis, angelos}@cs.columbia.edu
Abstract—Since their inception, captchas have been widely
used for preventing fraudsters from performing illicit actions.
Nevertheless, economic incentives have resulted in an arms
race, where fraudsters develop automated solvers and, in turn,
captcha services tweak their design to break the solvers. Recent
work, however, presented a generic attack that can be applied
to any text-based captcha scheme. Fittingly, Google recently
unveiled the latest version of reCaptcha. The goal of their new
system is twofold; to minimize the effort for legitimate users,
while requiring tasks that are more challenging to computers
than text recognition. ReCaptcha is driven by an “advanced
risk analysis system” that evaluates requests and selects the
difficulty of the captcha that will be returned. Users may
be required to click in a checkbox, or solve a challenge by
identifying images with similar content.
In this paper, we conduct a comprehensive study of re-
Captcha, and explore how the risk analysis process is in-
fluenced by each aspect of the request. Through extensive
experimentation, we identify flaws that allow adversaries to
effortlessly influence the risk analysis, bypass restrictions, and
deploy large-scale attacks. Subsequently, we design a novel
low-cost attack that leverages deep learning technologies for
the semantic annotation of images. Our system is extremely
effective, automatically solving 70.78% of the image reCaptcha
challenges, while requiring only 19 seconds per challenge. We
also apply our attack to the Facebook image captcha and
achieve an accuracy of 83.5%. Based on our experimental
findings, we propose a series of safeguards and modifications
for limiting the scalability and accuracy of our attacks.
Overall, while our study focuses on reCaptcha, our findings
have wide implications; as the semantic information conveyed
via images is increasingly within the realm of automated
reasoning, the future of captchas relies on the exploration of
novel directions.
1. Introduction
The widespread use of automated bots for increasing
the scale of nefarious online activities, has rendered the
use of captchas¹ [1] a necessity. Captchas are a valuable
defense mechanism in the fight against fraudsters and are
used for preventing, among others, the mass creation of
1. For readability, we follow the convention of the lowercased acronym.
accounts and posting of messages in popular services. Ac-
cording to reports, users solve over 200 million reCaptcha
challenges every day [2]. However, it is widely accepted
that users consider captchas to be a nuisance, and may
require multiple attempts before passing a challenge. Even
simple challenges deter a significant number of users from
visiting a website [3], [4]. Thus, it is imperative to render
the process of solving a captcha challenge as effortless
as possible for legitimate users, while remaining robust
against automated solvers. Traditionally, automated solvers
have been considered less lucrative than human solvers
for underground markets [5], due to the quick response
times exhibited by captcha services in tweaking their design.
These design changes can render a solver ineffective, as
each solver must be crafted for a specific type of challenge.
However, the ever-advancing capabilities of computer vision
continue to diminish the security that can be offered by text-
based captchas. Recent work demonstrated the effectiveness
of a generic solver that can be applied to new captcha
schemes without requiring a custom-tailored approach [6].
As such, the future of text-based captchas seems uncertain,
and automated solvers may soon become dominant in un-
derground captcha-solving economies.
The “no captcha reCaptcha” deployed by Google aims
to tackle the aforementioned challenges. The motivation
behind the new version is to verify users, when possible,
without requiring them to actually solve a tedious challenge.
An advanced risk analysis system analyzes various aspects
of a user’s request for a captcha (e.g., browser character-
istics, tracking cookie) and calculates a confidence score.
The score reflects the confidence of the system that the
request originates from an honest user and is not suspicious
(e.g., bot or human-solver). For high confidence scores, the
user is only required to click within a checkbox. For lower
scores, the user may be presented with a new image-based
challenge or a traditional text-based captcha. In the image-
based scheme, users are required to distinguish between a
set of images and select those with similar content.
In this paper, we conduct an extensive exploration of the
design and implementation aspects of reCaptcha. We find
that, apart from the risk analysis system, the reCaptcha wid-
get also performs a series of browser checks for detecting
the use of web automation frameworks or discrepancies in
the browser’s behavior. The checks range from verifying the
format of browser attributes to more complex techniques like
canvas fingerprinting [7]. Nonetheless, we build a system
that leverages a popular web automation framework and
still passes the checks. Furthermore, following our black-
box testing, we identify design flaws that allow an adversary
to trivially “influence” the risk analysis process and receive
the checkbox challenge. Specifically, we find that supplying
a Google tracking cookie that is 9 days old is sufficient,
even if it has not been associated with any browsing history
(apart from the initial request that created it). Moreover,
since tracking cookies have not been previously used by
attackers, no safeguards exist for preventing their creation
at a large scale; we create more than 63K cookies a day from
a single host without triggering any defenses. Using these
cookies, our system can maintain a solving rate of 52K-60K
checkbox captchas per day, from a single IP address.
Next, we design a novel attack for solving image-based
captchas. To our knowledge, this is the first captcha-breaking
attack that extracts semantic information from images for
solving the challenge; with the use of image annotation
services and libraries, we are able to identify the content
of images and select those depicting similar objects. We
also take advantage of Google’s reverse image search func-
tionality for enriching our information about the images.
We further leverage machine learning for our image selec-
tion, and develop a classifier that processes the output of
the image annotation systems and searches for subsets of
common tags that occur across images with similar content.
Our automated captcha-breaking system is highly effective,
achieving an accuracy of 70.78% against the image-based
reCaptcha. It is also efficient, as it solves challenges in 19
seconds. To demonstrate the general applicability of our
attack, we also target Facebook’s new image captcha, where
we achieve an accuracy of 83.5%.
Due to the prevalence of human solving services, the
utility of any captcha-breaking system must also be eval-
uated in terms of cost-effectiveness. As such, we evaluate
our system in an offline mode, where no online information
or service is used. Under such restrictions, and running
on commodity hardware, our attack solves 41.57% of the
captchas while requiring only 20.9 seconds per challenge,
with practically no cost. We employ a popular captcha-
solving service, and find that our system is comparable in
both accuracy and efficiency, rendering it an effective cost-
free solution for fraudsters.
Overall, while our study focuses on reCaptcha, our find-
ings are generalizable and with significant implications for
future directions. Despite their current flaws, incorporating
the safeguards we identify in reCaptcha’s risk analysis and
widget can improve the security of future captcha imple-
mentations. Furthermore, evolving from the recognition of
distorted alphanumeric characters to more advanced tasks,
such as extracting semantic information from images, has
been considered a promising direction for the design of
robust and usable captchas [8]. Our attack, however, raises
questions about the suitability of such tasks, given the recent
advancements in computer vision and machine learning. We
believe the future of captchas depends on the exploration of
fundamentally different approaches to their design.
Figure 1. The reCaptcha widget. (a) Before user clicks checkbox. (b) User considered human.
The major contributions of this paper are the following:
• We conduct a comprehensive study of the security aspects
of the most widely used captcha service. Our extensive
testing reveals flaws in Google’s advanced risk analysis
system, which can be exploited by adversaries for deploying
large-scale automated attacks. Our novel misuse of tracking
cookies renders them a valuable commodity for adversaries.
• We design a novel attack that leverages deep learning
systems for extracting the semantic information of images.
We extensively evaluate our attack against reCaptcha,
quantify the effect of different services and types of
information on the accuracy, and demonstrate that it is
both highly accurate and efficient. We also demonstrate
that it is generalizable, as it is effective even against
Facebook’s image captcha, and practical, as it achieves
comparable results to captcha-solving services at virtually
no cost.
• Based on the insights from our empirical analysis, and key
aspects of our approach, we introduce new safeguards for
preventing the manipulation of the risk analysis process
at a large scale. We also present guidelines for mitigating
attacks against the image reCaptcha. The disclosure of
our findings to Google led to the modification of the risk
analysis and image reCaptcha, resulting in a more robust
captcha service.
2. Analyzing Recaptcha
The reCaptcha service [9], offered by Google, is the most
widely used captcha service and has been adopted by a
plethora of popular websites for preventing automated bots
from conducting nefarious activities. A recent announce-
ment [10] reported the deployment of a new reCaptcha
mechanism designed to be more user-friendly and secure.
By leveraging information about users’ activities, acquired
through persistent tracking cookies, Google can correlate
requests to users that have previously interacted with any of
its services. This is possible even in cases where the user is
not currently logged into a Google account or the browser
is in a private/incognito mode, as the (anonymous) tracking
cookie can still reveal a user’s previous activities. This is a
novel direction, as it allows Google to further strengthen
the captcha service by coupling it to their vast pool of
user information, allowing them to better detect suspicious
requests. While safeguards have been previously proposed
for mitigating automated large-scale captcha solving [3], re-
Captcha goes beyond the stateless, per IP address, approach
and employs a plethora of checks.
Widget. When visiting a webpage protected by re-
Captcha, a widget is displayed (shown in Figure 1(a)). The
widget’s JavaScript code is obfuscated, to prevent analysis
from third parties. When the widget loads, it collects infor-
mation about the user’s browser which will be sent back
to the server. Furthermore, it performs a series of checks
for verifying the user’s browser, and also checks for known
browser automation kits. We provide more details on the
checks in Section 4.
Workflow. Once the user clicks in the checkbox, a
request is sent to Google containing (i) the Referrer,
(ii) the website’s sitekey (obtained when registering for
reCaptcha), (iii) the cookie for google.com, and (iv)
the information generated by the widget’s browser checks
(encrypted). If the user is logged into her Google account,
the request will also contain an extra field with the user’s
authentication cookie. The request is then analyzed by the
advanced risk analysis system, which decides what type of
captcha challenge will be presented to the user. The response
is an HTML frame that contains the challenge, which is
displayed by the widget in a popup.
Solution. Once the challenge has been presented to the
user, it has to be answered within 55 seconds. Otherwise,
the popup is closed and the user is required to click on
the checkbox again to receive a new challenge. Once the
user clicks, an HTML field called recaptcha-token is
populated with a token. If the user is deemed legitimate
and not required to solve a challenge, the token becomes
valid on Google’s side. Otherwise, the token will remain
invalid until the user solves the given challenge correctly.
The token is submitted to the website when completing
the desired action (e.g., POST an account creation form),
and the user’s session expires after 2 minutes. The token
is invalidated on Google’s side 2 minutes and 10 seconds
after its creation. The website sends a verification request
through the reCaptcha API which contains: (i) a shared
secret, (ii) the response token and, optionally, (iii) the user’s
IP address. The response is a JSON object with a boolean
field indicating if the verification was a success. If the
verification fails, error codes offer more information.
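For concreteness, the verification step can be sketched in a few lines of Python; the endpoint and field names below follow the documented reCaptcha verification API, and the requests library is used for illustration.

    import requests

    # Server-side verification of a reCaptcha response token, following
    # the API description above: secret is the website's shared secret,
    # token is the recaptcha-token submitted with the user's form.
    VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"

    def verify_token(secret, token, remote_ip=None):
        fields = {"secret": secret, "response": token}
        if remote_ip is not None:          # the user's IP is optional
            fields["remoteip"] = remote_ip
        reply = requests.post(VERIFY_URL, data=fields).json()
        # reply["success"] is a boolean; on failure, an "error-codes"
        # list offers more information.
        return reply.get("success", False), reply.get("error-codes", [])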
Based on the level of confidence assigned to the specific
request, Google’s advanced risk analysis system will select
which type of challenge to present to the user. The different
versions present a varying level of difficulty and nuisance,
as some are trivial to pass while others are problematic even
for humans. If a specific user requests multiple challenges or
provides several wrong answers in a short amount of time,
the system will return increasingly harder challenges.
Challenge type. In our experiments we came across the
following versions of reCaptcha:
“No captcha reCaptcha” [Figure 1]. This new user-
friendly version is designed to completely remove the nui-
sance of solving captchas. Upon clicking the checkbox in
the widget, if the advanced risk analysis system has high
confidence that the request is “legitimate”, the checkbox
changes to a tick, the challenge is considered solved, and no
further action is required. For the remainder of the paper,
we will refer to this version as the checkbox captcha.
Figure 2. Similar images challenge by reCaptcha.

TABLE 1. Examples of remaining versions of reCaptcha (panels (a)–(e), referenced below).
Image reCaptcha [Figure 2]. This new version is built
on the notion that identifying images with similar content
is a difficult task for bots. The challenge contains a sample
image and 9 candidate images, and the user is requested to
select those that are similar to the sample.
Distorted one-word [Table 1-(a)]. This is the easiest
version of a distorted text reCaptcha.
Street view numbers [Table 1-(b)]. The answers provided
by users make Google Maps more precise and complete.
Scanned words [Table 1-(c)]. In an effort to improve
the digitization of books, the challenge includes at least
one word that was scanned from a book, but has not been
recognized by the OCR software with high confidence.
Distorted two-word [Table 1-(d)]. This is returned when
the request is considered suspicious.
Fallback captcha [Table 1-(e)]. If the User-Agent
fails certain browser checks, the widget automatically
fetches and presents a challenge of this type, before the
checkbox is clicked. We explore when it is triggered in
Section 4.
Type selection. While the new reCaptcha relies on the
image challenges for security, text captchas have not been
completely removed from the system. Specifically, upon
initial release of the new reCaptcha, the easy text captchas
were returned intermittently with the new image captcha.
However, over the period of the following 6 months, text
captchas appeared to be gradually “phased out”, with the
image captcha now being the default type returned. Nonethe-
less, the difficult text captchas (Table 1-(d), Table 1-(e))
are still in use; they are returned after multiple solutions
of captchas or when certain browser checks fail. This sug-
gests that the text captchas target suspicious human users
(e.g., workers for captcha-solving services) and not bots, as
these captchas are harder for humans to solve despite being
solvable by bots [6], [11].
Threat model. In practice, fraudsters may follow two
distinct approaches for solving challenges. First, they may
employ an automated captcha breaking system, which will
allow them to conduct nefarious actions unencumbered
(e.g., create email accounts, post in forums). Second, they
may employ humans to manually solve challenges, i.e.,
through an underground captcha-solving service [5]. In this
paper, we develop methods for automatically solving the
challenges. Our goal is to design methods for bypassing
safeguards and influencing the advanced risk analysis into
returning checkbox captchas, and develop an attack that can
successfully solve semantic image captchas. Nevertheless,
our findings could also be exploited by solving services,
as we demonstrate the feasibility of accurate, large-scale,
low-cost attacks.
3. System Overview
In this section we present an overview of our system
designed to solve reCaptcha challenges. Due to the approach
taken by Google, where part of the functionality is offloaded
to the advanced risk analysis system (i.e., determining the
difficulty of the challenge), our captcha-breaking system
exhibits a novel aspect that has not been previously required
by similar attacks; we build a component designed to “in-
fluence” the process that determines the level of difficulty
for the challenge.
Our system is built on Selenium, an open-source browser
automation framework. We opt for the Firefox-WebDriver,
so we can leverage the rich functionality of the browser
engine and handle all aspects of the webpages required for
bypassing the browser checks of reCaptcha. Specifically, we
build on top of Mozilla Firefox (v.36). The WebDriver offers
functionality for locating specific HTML DOM elements
in a page, provides features for executing JavaScript and
controllers for handling keyboard and mouse events. We use
the XVFB virtual framebuffer for configuring the display
screen resolution. We also manage the saving and loading
of browser cookies, as the persistent Google tracking cookie
is an integral part of the advanced risk analysis process.
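A minimal sketch of such an environment is shown below (Python bindings for Selenium, with pyvirtualdisplay wrapping XVFB; the resolution and file handling are illustrative, not our exact configuration).

    import json
    from pyvirtualdisplay import Display
    from selenium import webdriver

    # Start an XVFB virtual framebuffer so the browser renders into a
    # configurable display resolution without a physical screen.
    display = Display(visible=0, size=(1920, 1080))
    display.start()

    driver = webdriver.Firefox()   # Firefox-WebDriver

    def save_cookies(path):
        # Persist all cookies, including the Google tracking cookie.
        with open(path, "w") as f:
            json.dump(driver.get_cookies(), f)

    def load_cookies(path):
        # Cookies can only be set for the currently loaded domain.
        driver.get("https://www.google.com")
        with open(path) as f:
            for cookie in json.load(f):
                driver.add_cookie(cookie)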
The system consists of two main components. The first is
responsible for creating tracking cookies that can influence
the risk analysis process. Our second component processes
the challenges and, depending on the type of challenge
returned, follows different techniques for solving them.
Cookie Manager. The Google tracking cookie plays a
crucial role in determining the difficulty of the challenge
that is presented to the user. Furthermore, each cookie
can receive up to 8 checkbox captchas in a day. As part
of our attack, we develop functionality for automatically
creating Google cookies. The goal is to create cookies
which are subsequently “trained” to appear as originating
from legitimate users and not automated bots. In each case,
we create a cookie in a clean virtual machine, where our
browser automation system imitates a user browsing the
web. We configure the system to perform actions specific to
the website being visited, while mimicking a diurnal cycle
and following random resting intervals between actions. For
example, we conduct Google searches for certain terms and
follow links from the results, open videos in Youtube, and
perform searches in Google Maps. We also visit popular
websites that contain social plugins associated to Google.
As such, we are able to create Google tracking cookies and
associate them with varying amounts of browsing activity.
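The training loop can be summarized by the following sketch; the action helpers and parameter values are hypothetical stand-ins for the Selenium routines described above.

    import random
    import time

    # Placeholders for Selenium routines that perform each action with
    # the browser holding the cookie being "trained".
    def google_search(term): pass     # query and follow a result link
    def watch_youtube_video(): pass   # open a video in Youtube
    def browse_maps(): pass           # perform a search in Google Maps

    SEARCH_TERMS = ["weather", "news", "recipes"]   # illustrative

    def train_cookie(days):
        for _ in range(days):
            # A handful of actions per "day", separated by random
            # resting intervals, to mimic a diurnal cycle.
            for _ in range(random.randint(5, 15)):
                action = random.choice(
                    [lambda: google_search(random.choice(SEARCH_TERMS)),
                     watch_youtube_video, browse_maps])
                action()
                time.sleep(random.uniform(60, 600))
            time.sleep(8 * 3600)   # overnight pause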
Recaptcha Breaker. This component is designed to
collect reCaptcha challenges. Using the cookies created by
the previous module, it visits sites that employ reCaptcha for
deterring automated bots from completing certain actions.
If the image captcha is presented, it is passed to a different
module. We have gathered a list of websites that rely on the
reCaptcha service for preventing automated actions. During
our experiments, we do not conduct any actions in these
websites; we only target the reCaptcha challenges. Upon
locating such websites, we manually inspected the website’s
HTML page code and located the appropriate DOM element
that holds the reCaptcha widget IFrame. We recorded the
particular element and used it during our attack for identify-
ing the frame containing the captcha. As the widget requires
the user to click the checkbox to receive a captcha, our
system locates the checkbox element through its identifier
(recaptcha-anchor) and performs a mouse click action
on it. If the advanced risk analysis considers the request
legitimate and no challenge is returned, we only need to
extract the recaptcha-token.
If the user is required to solve a challenge, a popup
is created on the client page by the JavaScript, within an
element with a class name of goog-bubble-content.
Inside the popup there is an IFrame where all captcha
challenge elements are located, along with the veri-
fication button for sending the response. To identify
which type of captcha is shown to the user, our
system attempts to locate rc-imageselect which
is the element created for the image captcha, or
rc-defaultchallenge-response-field which is
the element created for holding the user response for text
captchas. Based on which element appears in the code, we
identify the type of challenge that was presented.
The text captcha is sent from the Google server
and put inside the rc-defaultchallenge-payload
element in JPG format. For the image captcha, the
text description (which contains the hint) is located in
rc-imageselect-desc. The sample and candidate im-
ages are put in rc-imageselect-tile. All images
are sent in a base64 format and rendered in the user
browser. Our system extracts the description and the
base64 images, which are saved in PNG format. Im-
ages are saved individually, so they can be supplied to
the image annotation modules. The system also collects
the rc-imageselect-tile and the verification button
recaptcha-verify-button, for sending a mouse ac-
tion to those HTML objects when our back-end breaker
finishes processing the challenge and is ready to submit the
solution.
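The interaction with the widget can be sketched as follows (Selenium’s Python bindings; the element names are those listed above, while the locator types and frame boundaries are assumptions).

    import base64

    def fetch_challenge(driver, widget_iframe):
        # Enter the widget IFrame and click the checkbox.
        driver.switch_to.frame(widget_iframe)
        driver.find_element_by_id("recaptcha-anchor").click()
        driver.switch_to.default_content()

        # The challenge popup is created within goog-bubble-content.
        popup = driver.find_element_by_class_name("goog-bubble-content")
        driver.switch_to.frame(popup.find_element_by_tag_name("iframe"))

        if driver.find_elements_by_id("rc-imageselect"):
            hint = driver.find_element_by_class_name(
                "rc-imageselect-desc").text
            # Tiles carry base64 image data; save each tile as PNG so it
            # can be supplied to the annotation modules individually.
            tiles = driver.find_elements_by_class_name("rc-imageselect-tile")
            for i, tile in enumerate(tiles):
                src = tile.find_element_by_tag_name("img") \
                          .get_attribute("src")
                with open("tile-%02d.png" % i, "wb") as f:
                    f.write(base64.b64decode(src.split(",", 1)[1]))
            return "image", hint
        return "text", None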
3.1. Breaking the image captcha
The ability to discern objects and provide detailed de-
scriptions of images is a cognitive process that comes nat-
urally to humans. On the other hand, processing an image
for identifying objects and assigning semantic information
to it, is considered a complex computer vision problem [12].
Therefore, such tasks have been considered a promising
alternative to text-based captchas, and several schemes have
been proposed, in which users are asked to identify or
label images [3], [13]. However, recent advancements in
the area of computer vision have demonstrated impressive
results [12], [14]–[16], and image annotation services that
leverage such techniques have emerged. As these services
are bound to become even more widely available, we explore
if they can be used for solving image captchas.
To solve an image captcha, our system has to automat-
ically identify which of the given images are semantically
similar to the sample image. Upon receiving a challenge we
extract the sample and candidate images, and the hint that
describes the content of the sample image (e.g., “wine”).
Next, all images are passed to an image annotation module.
GRIS. The Google Reverse Image Search, built on the
work by Krizhevsky et al. [17], offers the ability to conduct
a search based on an image. If the search is successful it
may return a “best guess” description of the image (which
may differ for the same image across searches) along with
a list of websites where the image is contained, and other
available sizes of that image. While this is not part of
Google’s public API, we identified the format of the search
URL so our module can replicate the functionality. When
conducting the reverse search for the 9 candidate images,
we also collect the page titles of the webpages that contain
the image, as an extra piece of information. If available, we
also obtain a higher resolution version of each image, as it
increases the accuracy of the image annotation modules.
During our initial experiments, we came across instances
where the descriptions were not in English. As such, if a
description is returned, we extract it and convert it to English
through the automatic language detection feature of Google
Translate. We also convert all tags to their singular form.
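A sketch of the reverse-search query is given below; the searchbyimage URL format is our assumption based on Google’s public web interface, and the scraping of the “best guess”, page titles, and alternate sizes out of the response is omitted.

    import requests

    HEADERS = {"User-Agent": "Mozilla/5.0"}   # a browser-like UA is needed

    def reverse_search(image_url):
        # Query the reverse image search for an image reachable by URL;
        # locally stored tiles would be submitted via an upload variant.
        resp = requests.get(
            "https://www.google.com/searchbyimage",
            params={"image_url": image_url},
            headers=HEADERS,
        )
        # The "best guess" description, page titles, and links to higher
        # resolution versions are then parsed out of resp.text.
        return resp.text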
Image annotation. There are several free online services
and libraries that offer relevant functionality, ranging from
assigning tags (keywords) to providing free-form descrip-
tions of images. Example outputs are shown in Table 2.
Clarifai is built on the deconvolutional networks by
Zeiler et al. [18], and returns a set of 20 tags describing the
image along with a confidence score for each tag. The tags
range from generic (“drink”) to very specific (“cabernet”).
Alchemy² is also built upon deep learning, and offers an
API for image recognition. For each submitted image, the
service returns a set of tags and a confidence score for each
tag. In our experiments, images received up to 8 tags, which
tend to be specific (e.g., “wine”).
TDL. Srivastava and Salakhutdinov [16] have released an
app for demonstrating the image classification capabilities
of their deep learning system. We have identified the API
calls used by the app, and built a module that leverages
the API for obtaining a description of the images. For each
image, 8 tags are returned along with a confidence score.
NeuralTalk. Karpathy and Fei-Fei [12] developed Neu-
ralTalk, a Recurrent Neural Network architecture for gen-
erating free-form descriptions of an image’s contents. We
have developed a module for processing the images with the
NeuralTalk library locally; we break the returned description
down into individual words, and remove verbs, prepositions
and conjunctions. The remaining words are considered the
tags for the image.
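The reduction of a free-form caption to tags can be approximated with off-the-shelf part-of-speech tagging; the sketch below uses NLTK, and additionally drops determiners, which is our assumption rather than a step stated above.

    import nltk   # requires the punkt and POS-tagger data packages

    # Penn tags whose words are dropped: verbs (VB*), prepositions (IN),
    # conjunctions (CC), plus TO and determiners (DT) as an assumption.
    DROP = ("VB", "IN", "CC", "TO", "DT")

    def caption_to_tags(caption):
        words = nltk.word_tokenize(caption.lower())
        return [w for w, pos in nltk.pos_tag(words)
                if not pos.startswith(DROP)]

    # caption_to_tags("a glass of wine sitting on top of a table")
    # yields approximately ['glass', 'wine', 'top', 'table']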
Caffe. Jia et al. [19] have released Caffe, a deep learning
framework, which we also leverage for processing images
locally. Caffe returns a set of 10 labels; 5 with the highest
confidence scores and 5 that are more specific as keywords
but may have lower confidence scores.
Tag Classifier. The returned tags do not always exactly
match the description (i.e., hint) given by reCaptcha for a
challenge. To overcome this, we leverage machine learning
to develop a classifier that can “guess” the content of an im-
age based on a subset of the tags. Specifically, we opted for
the Word2Vec word vectors proposed by Mikolov et al. [20]
for finding the similarity between tags and hints. During the
training of our classifier, we modeled and represented each
word (tag and hint) as a real vector in vector space. Each tag
assigned to an image is paired with the correct hint and all
<tag, hint> pairs are given as input to the model. We
tune the parameters and identify the optimal values for each
annotation system. Once the classifier has been trained, it
can be used to predict the similarity of the captcha’s hint and
the tags by computing the cosine similarity between their
corresponding word vectors, with the goal of identifying
subsets of tags from each image that have been associated
with the hint during the training phase. Thus, our classifier
allows our system to select images with similar content even
if the annotation system does not return tags that exactly
match the given hint.
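A minimal sketch of this step is given below, using the gensim implementation of Word2Vec (pre-4.0 API) with skip-grams; the training pairs and parameter values are illustrative, not the tuned values.

    from gensim.models import Word2Vec

    # Each <tag, hint> pair is treated as a two-word "sentence", so tags
    # co-occur with the hint they were observed with during training.
    pairs = [["drink", "wine"], ["cabernet", "wine"], ["kitten", "cat"]]
    model = Word2Vec(pairs, sg=1, size=100, window=2, min_count=1, iter=50)

    def image_score(tags, hint):
        # Cosine similarity between each tag's vector and the hint's
        # vector; an image is scored by its best-matching tag.
        sims = [model.wv.similarity(t, hint) for t in tags
                if t in model.wv.vocab and hint in model.wv.vocab]
        return max(sims) if sims else 0.0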
History Module. During our experiments we detect a
non-negligible amount of repetition within the captchas, i.e.,
many images are actually re-used across challenges. As
such, we manually create a labelled dataset with images and
their tags from the challenges we collect. Each image is anno-
tated with the hint given in the challenge that describes the
content (e.g., cat, soup). We also maintain a hint_list
that contains the hints we have seen.
2. http://www.alchemyapi.com
TABLE 2. Example output from each image annotation module, with tags sorted in descending order of confidence.

GRIS: wine and blood
Alchemy: wine, glass
Clarifai: glass, red wine, wine, merlot, liquid, bottle, still, glassware, alcohol, drink, wineglass, beverage, pouring, white wine, cabernet, taste, leaded glass, dining, party, vino
TDL: red wine, goblet, wine bottle, punching bag, beer glass, perfume, balloon
NeuralTalk: a glass of wine sitting on top of a table
Caffe: red wine, wine, alcohol, drug of abuse, drug, red wine, punching bag, beaker, cocktail shaker, table lamp
Solution. Each module assigns the candidate images to
one of 3 sets: select, discard, or undecided. First,
we collect information for all the images through GRIS.
Next, if a hint is not provided, we search for the sample
image in the labelled dataset to obtain one if possible.
The history module searches for the candidate images in
our labelled dataset and, if found, compares their tag to
the hint and adds those that match to the select list.
The remaining images are compared to the hint_list
and added to the discard list if there is a match. An
image annotation module then processes all the images and
assigns them tags. If one of the tags matches the hint the
image is added to the select set. If it matches one of the
other tags in the hint_list it is added to the discard
set. A similar process is conducted when we leverage the
best guess and page title results for each image. Once
all the modules have completed, the system processes the
results and merges the sets from the modules. Each type of
information is given a different “weight” (e.g., title pages
have the lowest confidence) which allows us to overcome
cases where modules assign the same image to a different
set. If there is not an adequate number of images in the
select set, the system picks the remaining images from
the undecided set. That is done either through our tag
classifier, or by selecting the images that have the most
overlapping (i.e., common) tags with the sample image.
In Section 4 we evaluate the effectiveness of these two
approaches, and elaborate on why the optimal strategy is
to select 3 images for the solution.
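The merging step can be summarized by the sketch below; the weight values are hypothetical, as only their relative ordering (page titles lowest) is stated above.

    # Hypothetical per-source weights; only the ordering is given above.
    WEIGHTS = {"history": 1.0, "tags": 0.8, "best_guess": 0.6, "title": 0.3}

    def merge_votes(votes, n_answer=3):
        # votes: {image_id: [(source, "select" or "discard"), ...]}
        # A weighted sum resolves modules that disagree about an image.
        scores = {}
        for img, vs in votes.items():
            scores[img] = sum(WEIGHTS[src] * (1 if v == "select" else -1)
                              for src, v in vs)
        ranked = sorted(scores, key=scores.get, reverse=True)
        # Always answer with 3 images; undecided images (scores near 0)
        # top up the selection when fewer than 3 are clearly selected.
        return ranked[:n_answer]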
4. Attack Evaluation
In this section we evaluate our attacks against reCaptcha.
First, we explore Google’s advanced risk analysis system,
and identify how various characteristics of our system influ-
ence the confidence assigned by the system to our captcha
requests. Next, we evaluate the accuracy of our attack
against the image-based version of reCaptcha.
4.1. Influencing the risk analysis system
We follow a black-box testing approach to identify how
different aspects of our system and testing environment
influence the risk analysis process. Our goal is to issue
requests for captchas that will be considered legitimate
TABLE 3. Tracking cookie creation and training behavior.

Network | Web Surfing | Account | Threshold
Departmental | Frequent | No | 9th day
Departmental | Moderate | No | 9th day
ToR | Frequent | No | 9th day
ToR | Moderate | No | 9th day
Any | None | No | 9th day
by the advanced risk analysis system and, thus, receive
checkbox captchas that can be solved with a single click.
Browsing history. We aim to quantify the minimum
amount of browsing history required for a specific cookie
before it is presented with a checkbox challenge. When
using a Google service, regardless of being logged in to a
Google account or not, a plethora of relevant information
is collected and associated to the cookies. As such, we
deployed virtual users that exhibit different web surfing
behavior. We explored multiple network connection setups,
as Google may assign a higher level of trust to IP addresses
originating from our university’s subnets. Thus, we also
tunneled connections over ToR, which we configured to
select exit nodes located in the USA. As shown in Table 3,
regardless of our experimental setup, our system is presented
with a checkbox captcha on the 9th day after the creation
of the cookie. Specifically, we obtain a checkbox captcha
after the beginning of the 9th day (00:01 PST) from the
cookie’s creation. Our follow-up experiments revealed that
the threshold remains the same even without conducting any
web surfing with the cookie. Thus, Google’s advanced risk
analysis can practically be neutralized by simply appending
a 9-day old cookie to the request.
Account. We investigated if being logged in a Google
account influences the risk analysis. First, we created fresh
accounts but did not conduct a phone verification. We also
created accounts and supplied them with an alternative
email address from another provider. Our findings indicate
that such accounts are considered suspicious and negatively
influence the outcome. In both cases we were presented with
a checkbox captcha after 60 days had passed. We repeated
the experiment with a fresh account which we verified with
a phone number from a US provider. Again, we were able to
obtain a checkbox captcha after the 60th day, indicating that
even phone-verified accounts start with a “bad reputation”.
Surprisingly, we conclude that it is actually better for an
adversary to not use any account at all.
TABLE 4. Combinations of mismatching information between what our system uses and what the User-Agent contains (✓: 9-day cookie supplied; ✗: not supplied; ✓|✗: either; –: irrelevant).

Component | 9-day Cookie | System runs | User-Agent reports | Captcha
Browser | ✓ | Firefox/36.0 | {Mobile/8C148 Safari/6533.18.5, Chrome/42.0.2311.135 Safari/537.36} | image
Browser version | ✓ | Firefox/36.0 | Firefox/{10.0, 35.0, 36.0, 3.0.12} | checkbox
Browser version | ✗ | Firefox/36.0 | Firefox/{10.0, 35.0, 36.0} | image
Browser version | – | Firefox/36.0 | Firefox/1.0.4 | fallback
Browser version | ✗ | Chrome/42.0 | Chrome/{15.0.861.0, 4.0.212.1} | image
Browser version | – | Chrome/42.0 | Chrome/3.0.191.3 | fallback
Engine version | – | Chrome/42.0; AppleWebKit/537.36 | AppleWebKit/{528.10, 530.5, 531.3} | fallback
Engine version | ✗ | Chrome/42.0; AppleWebKit/537.36 | AppleWebKit/{532 and up} | image
Engine version | ✓|✗ | Firefox/36.0; Gecko/20100101 | Gecko/20040914 | image
Browser/Engine | – | Chrome/42.0; AppleWebKit/537.36 | Chrome/42.0; Gecko/20100101 | fallback
Browser/Engine | ✓|✗ | Firefox/36.0; Gecko/20100101 | Firefox/36.0; AppleWebKit/537.36 | image
Platform | ✓ | Linux x86_64 | {(Macintosh; Intel Mac OS X 10.8;), (Android; Mobile;), (Windows NT 6.3;)} | checkbox
– | – | – | wrong format or incomplete information | fallback
– | ✓ | Linux x86_64; Firefox/36.0 | Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) AppleWebKit/420 (KHTML, like Gecko) Version/3.0 Mobile/1A543a Safari/419.3 | checkbox
Geo-location. We used ToR for exploring the impact
of the user’s geo-location. We selected exit nodes across
different countries and continents, including the top 3 coun-
tries associated with abused phone numbers used for Google
account verification [21]. We find that there is no
restriction based on the country in which a cookie is created,
as we are able to obtain the checkbox captcha regardless
of the combination between the country where the cookie
is created, and the current location of the user sending
the request. This facilitates fraudsters since they can create
cookies without any restrictions on the location of the IP
addresses used.
Browser checks. The reCaptcha widget executes a se-
ries of checks for detecting suspicious browser attributes
or behavior. While the widget’s JavaScript is obfuscated
to prevent analysis, de-obfuscated code has been released³
providing indications about the type of checks conducted.
Here we explore how aspects of our automated browser
environment affect the outcome of the risk analysis. Due
to space constraints we present a subset of our experiments.
Automation. According to the W3C specification⁴, Web-
Driver is required to set a webdriver attribute to True, so
websites can detect automation. While this is implemented
by the current version of Firefox-WebDriver, and the widget
checks for the attribute, we did not find it to have an effect.
To further ascertain our finding, we changed our website’s
JavaScript to explicitly set the attribute. Again the outcome
remained the same, as we obtained the checkbox captcha.
Canvas fingerprinting. The way that content is rendered
across machines and browsers varies, enabling device fin-
gerprinting [7]. The reCaptcha widget leverages that for
identifying the user’s browser. Specifically, the JavaScript
code creates a Canvas element and draws a predefined
composition. The display attribute is set to None, so the
process remains invisible to the user. After the rendering
is complete, the element is encoded in base64 and sent
back with the other data when the user clicks the checkbox.
3. https://github.com/neuroradiology/InsideReCaptcha
4. https://w3c.github.io/webdriver/webdriver-spec.html
The element can be compared to the outcome of known
browser versions, for identifying the environment the widget
is running in and detecting discrepancies with the reported
User-Agent.
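The check can be reproduced from an automated browser; the sketch below executes JavaScript through Selenium to draw on a hidden canvas and read back its base64 encoding. The drawn composition is illustrative; the widget’s actual composition is defined in its obfuscated code.

    CANVAS_JS = """
    var c = document.createElement('canvas');
    c.style.display = 'none';            // invisible to the user
    document.body.appendChild(c);
    var ctx = c.getContext('2d');
    ctx.font = '14px Arial';
    ctx.fillText('fingerprint me', 2, 15);
    return c.toDataURL().split(',')[1];  // base64-encoded rendering
    """

    def canvas_fingerprint(driver):
        # Identical browser/OS/font stacks produce identical encodings,
        # so the value can be compared against known browser versions to
        # detect a mismatch with the reported User-Agent.
        return driver.execute_script(CANVAS_JS)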
User-Agent. Table 4 presents a subset of the combi-
nations we used in our experiments, to identify how the
User-Agent influences the type of captcha that we re-
ceive. We have grouped certain variations that exhibit the
same behavior. We repeat each combination multiple times
at different moments, to verify the consistency of the out-
come. We also conduct certain experiments with Chrome.
Surprisingly, we found that if the User-Agent contains
an outdated version of the browser or browser engine we
are actually using, the widget automatically considers the
environment suspicious and presents the user with a fallback
captcha before the checkbox is clicked. The same happens
if the browser and engine versions are up-to-date, but don’t
correspond to the actual environment of the experiment (e.g.,
if we use Firefox but report Chrome). Fallback captchas are
also returned when the User-Agent does not contain the
complete information, or is malformed. For other types
of mismatching information, the widget usually returns the
image captcha. Our findings also suggest that the widget
does not detect the underlying operating system with the
canvas fingerprinting, as we obtain checkbox captchas re-
gardless of the platform stated in the User-Agent. The
last row in Table 4 presents an interesting finding, as the
browser engine we report (AppleWebKit) does not map to
the engine we are actually using, yet we are still presented
with the checkbox captcha. This behavior is unique to this
outdated User-Agent from iOS1.0, and does not occur
for newer versions. Finally, we also found that even if the
User-Agent used during a cookie’s creation is different
to the one used when requesting a captcha with that cookie,
the outcome is not affected.
Screen resolution. We experimented with multiple
combinations of screen resolutions, ranging from 1×1
to 4096×2160 pixels, and still obtained the check-
box captcha. Even when combining them with different
User-Agent options for both mobile and desktop devices,
results remained the same.
Mouse. To identify whether the behavior of the mouse
affects the outcome of the risk analysis, we experi-
mented with various mouse behavior configurations. We
explored aspects such as the timing of movements, er-
ratic movement patterns, and issuing multiple clicks within
the widget and checkbox. We also used the JavaScript
getElementById().click() function to simulate a
click within the checkbox without hovering over the widget.
None of these had a negative effect on the risk analysis.
Cookie reputation. We explore if the server keeps in-
formation regarding previous behavior of a cookie, as a
“reputation” score that affects the outcome of the risk anal-
ysis. We opt for two different types of suspicious behavior,
using 9-day old cookies. In the first configuration, we use a
User-Agent that automatically receives a fallback captcha.
The second configuration uses a User-Agent that receives
image or text captchas and always provides wrong solutions
to the captchas. We run our experiments with varying re-
quest rates and for a varying number of days. In all cases,
even after a week of consecutive wrong answers every
hour, once we change the User-Agent to a valid value the
cookie receives a checkbox captcha. Our results suggest that,
even though Google accounts have a reputation that affects
the risk analysis outcome, that approach is not applied to
cookies and they are not assigned a reputation.
Site restriction. The attack’s scale can be in-
creased if we solve captchas on a website we control
(attacker.com) but associate the tokens with a target
website (example.com). This would facilitate captcha-
solving services that harvest and sell tokens to others,
as it will reduce the network activity and, thus, cost
of the attacks. Furthermore, targeted websites may have
stricter thresholds than the reCaptcha service, which would
be preferable to avoid. Soon after the deployment of
the new reCaptcha, a straightforward “clickjacking” at-
tack was demonstrated [22]. To prevent the attack, re-
Captcha was re-designed so that the token is tied to the
website where the challenge was presented. Apart from
checking the Referrer, the widget identifies the website
through the document.location.hostname, which is
read-only and cannot be intercepted for security reasons.
We present a workaround for bypassing this restric-
tion. We setup a virtual host on our server and set the
ServerName and other necessary fields to correspond
with example.com. By using a2ensite, and modi-
fying the hosts file, we can run our website on the
localhost and trick reCaptcha into associating our re-
quest to example.com. To complete our attack we also
need to send the target’s sitekey. This can be trivially
obtained by visiting the website once and monitoring a re-
Captcha request. Ultimately, when example.com verifies
the token through the reCaptcha API, it will be successful.
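A minimal configuration sketch of the workaround follows; the DocumentRoot path is illustrative.

    # /etc/hosts -- resolve the target domain to the local machine
    127.0.0.1    example.com

    # /etc/apache2/sites-available/example.com.conf, enabled via a2ensite
    <VirtualHost *:80>
        ServerName example.com
        DocumentRoot /var/www/attacker    # illustrative path
    </VirtualHost>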
Token harvesting. We also explored if creating a large
number of cookies from a single IP address is prohibited.
To avoid impacting other sites, we ran the attack against
our own webserver.

Figure 3. Checkbox captchas obtained per minute, by hour of day (EST).

We maintain the same User-Agent
header for all requests, and deploy multiple instances of
our system. Surprisingly, we are able to create over 63,000
cookies in a single day without triggering any mechanisms
or getting blocked, and are only limited by the physical
capabilities of the machine. This indicates that there is no
mechanism to prohibit the creation of cookies from a single
IP address. The only restriction we detected was triggered by
a massive number of concurrent requests (i.e., for detecting
DoS attacks). The lack of a safeguard can be justified
by the fact that creating cookies at a large scale has not
been required by attacks before. Indeed, we present a novel
misuse of tracking cookies, which makes them a valuable
commodity for fraudsters.
Next, we deployed our system to identify how many
checkbox captchas we can solve in a single day. We experi-
mented with different captcha-request rates. Figure 3 shows
the results from representative experiments with different
rates plotted in the attacker’s timezone (EST). Each exper-
iment lasted 24 hours and is plotted with a different color.
While the most aggressive rate (red points) gets blocked
numerous times, if we maintain a lower rate of about 1,200
requests per hour (black points) we do not get blocked.
For intermediate rates we get blocked at specific times
that recur across experiments. Interestingly, these blocked
periods coincide with a typical workday diurnal cycle (i.e.,
before work, lunchtime, after work). This can be attributed
to the advanced risk analysis system either pro-actively
adapting the threshold or “dropping” requests during peak
hours, due to the increased traffic. Despite being blocked for
short periods of time, at the optimal rate (green points) we
receive approximately 2,500 checkbox captchas per hour,
which drops to about 1,200 during peak hours. During
weekdays, our results vary between 52,000 and 55,000. We
observe less blocking during the weekend, and obtain over
59,000 checkbox captchas per day.
Overall evaluation. The current design and implementa-
tion of reCaptcha suffer from significant flaws and omissions
that can be trivially exploited by adversaries. In an attempt
to remove the burden for legitimate users, reCaptcha has
enabled attacks that can effortlessly harvest tokens at a large
scale and pass checkbox captchas, which do not require
any computation. However, the plethora of checks that are
performed, combined with those that are feasible (e.g.,
mouse checks) can be used to introduce more safeguards
and improve the robustness of any captcha system.
TABLE 5. Combinations for passing the image reCaptcha.

Image Selection | Constraint | Pass
n Correct + k Wrong | k ≤ 1 | ✓
(n−1) Correct | n > 2 | ✓
(n−1) Correct + k Wrong | k > 0 | ✗
4.2. Breaking the image captcha
Solution flexibility. We explore whether reCaptcha has
any flexibility when deciding if the given solution is correct.
In any case, at least two images have to be selected before
the response is sent. We manually solved image challenges
using different combinations of the number of correct (n)
and wrong (k) selections. In most cases (74%) we found
the number of correct candidate images to be 2; the rest
contain 3 and we also found two challenges with 4. As can
be seen in Table 5, our experiments reveal that a user can
pass the challenge even if a correct image is missed or a
wrong selection is provided. Due to the small size of the
images, in some cases their content may not be discernible
even to humans. Based on these results, we set our captcha
breaking system to select 3 images for the solution; this
strategy offers us a “free” selection when n = 2 and may fall
within the “relaxation” limits for the remaining challenges.
We also came across a few cases where we passed the
challenge having supplied 2 correct and 2 wrong answers.
This suggests that the image reCaptcha may contain both
“control” and unknown images, similar to the text chal-
lenges [9]: apart from images with known content (i.e.,
control images), the challenge contains images for which
the system has low confidence about their content. If a
user selects the control images correctly, and also selects an
unknown image, then the system can associate the unknown
image with the hint. Thus, while in practice accepted com-
binations for passing a challenge may be even more flexible,
Table 5 shows the passing combinations that we found to
always hold true.
Image repetition. During our experiments we came
across several cases of images being repeated across chal-
lenges. To quantify this behavior, we created a dataset of 700
downloaded challenges. First, we searched for images with
identical MD5 values. Upon inspection, we came across a
surprising finding. Out of the 700 captchas, we identified
6 pairs of completely identical challenges: for each pair,
the images and their ordering were exactly the same in
both challenges. In all 6 cases, the two challenges had
been collected from a different website and always within
two hours. Apart from the significant implications for the
robustness of reCaptcha, it also suggests that challenges are
not created “on-the-fly” but selected from a relatively small
pool of challenges which is periodically updated.
Figure 4. Accuracy of simulated attack for different combinations of
modules and data against the image reCaptcha.

We also found that images were being repeated across
challenges. However, in most cases they had different MD5
values. Thus, we conducted a comparison using perceptual
hashes, to identify identical images with different MD5
values. In all cases the images we detected seemed visually
identical, despite being transformed for each new pool of
challenges; this may be done to prevent hash-based identifi-
cation of the images. Since the hash value for identical im-
ages remains the same across websites and within the same
pool of challenges, but changes over time, the transformation
is independent of the website. We identified 1,368 redundant
images that belonged to 358 sets of identical images. The
largest set contained 92 images, i.e., the same image was
shown as the sample in 92 different challenges. The most
re-used candidate image was seen 12 times.
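The comparison can be sketched with the Python imagehash library (the exact pHash implementation we relied on may differ); subtracting two hashes yields their Hamming distance, and the threshold below is illustrative.

    import imagehash
    from PIL import Image

    # Perceptual hashes survive the small transformations applied to
    # re-used captcha images, unlike MD5.
    THRESHOLD = 8   # illustrative Hamming-distance cutoff

    def is_same_image(path_a, path_b):
        ha = imagehash.phash(Image.open(path_a))
        hb = imagehash.phash(Image.open(path_b))
        return ha - hb <= THRESHOLD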
Attack simulation. To evaluate the effectiveness of each
module, we simulated our attack on the dataset. To verify
the accuracy of our attacks, we manually inspected the
challenges and noted their solution. Figure 4 breaks down
the accuracy for each module and type of information. Out
of the 700 challenges, 667 contained a hint, and we acquired
a best guess to use as a hint for 25 of the remaining
challenges through a reverse search of the sample image.
Here, we used a hint_list with the hints that we had
come across in our experiments, but did not use the history
module or tag classifier.
As we showed in Table 5, reCaptcha is flexible and a
solution is considered correct even if it contains a wrong
image along with the correct ones. Thus, to calculate the
accuracy that our system would obtain against reCaptcha,
we ran our simulated attack and accounted for that flexibility
when deciding the outcome of each solution. As such, we
configured the system to select 3 images for the solution so
as to fall within the “relaxed” limits. The Pass Challenge
bars represent the outcome of the attacks. We started with
a baseline measurement for the “vanilla” version of each
module which selects images based on the overlapping tags;
for GRIS, this also entails using the hint provided in the
challenge and the best guess returned by the reverse image
search. GRIS passed 13.1% of the challenges. When using
the page titles returned for the 9 candidate images as tags,
the success rate increased to 19.2%. In general, the success
rate for GRIS is limited by the number of candidate images
for which we can obtain a best guess description. For the
other modules, the baseline attack selects the 3 images that
have the most common (overlapping) tags with the sample
image. For Alchemy and Clarifai, the baseline success of
the attack was 27.9% and 38.9% respectively. When using
the hint, best guess and page titles, the Alchemy module
passed 49.9% of the challenges, while Clarifai passed 58%.
Caffe is also very effective, solving 45.9% of the challenges.
The hint has a significant effect in most cases, increasing the
accuracy by 1.5-15.5% depending on the annotation system.
We explored how the attack’s accuracy is impacted by
supplying the image annotation module with higher resolu-
tion versions of the images. We were able to automatically
obtain a higher resolution version of 2,909 images from
the 700 challenges. Out of those, 371 corresponded to the
sample image. The high resolution images increased the
attacks’ success, with Alchemy and Clarifai passing 53.4%
and 61.2% of the challenges respectively. TDL is less accu-
rate achieving 45%, while Caffe increases to 49.1%. While
the higher resolution images could potentially improve the
success rate of the GRIS module as well, conducting reverse
image searches on all versions of each image, for acquiring
a best guess description, would require a significantly higher
number of queries for each challenge and is, thus, omitted.
We also measured the number of challenges our system
would pass if there was no flexibility. Since in most cases the
solution consists of 2 images, we tuned the system to select
2 images for each challenge. The Exact Solution bars in
Figure 4 present the results, and we can see that all the image
annotation services were quite effective in identifying the
correct images. Clarifai is the most effective as it selected
the exact set of images in 40.2% of the challenges, while
Alchemy reached 31.5% and Caffe 28.3%.
Tag classifier. To quantify the effectiveness of our tag
classifier as part of our captcha-breaking system, we fol-
lowed a 10-fold validation approach for training and testing
our classifier on the dataset of 700 labelled image captchas.
In our first experiment, we skipped the other image selection
steps, and relied solely on the classifier for selecting the
images. For each image, the classifier received as input the
hint and the set of tags, and returned a “similarity” score;
we selected the 3 images with the highest score. Our attack
provided an exact match solution for 26.28% (σ = 7.09),
and passed 44.71% (σ = 6.39) of the challenges. In the
second experiment, we incorporated our classifier into our
system, and used the classifier-based selection as a replace-
ment of the overlapping-based selection of images from the
undecided set. When using the classifier, our attack’s
average accuracy for Clarifai reached 66.57% (σ = 7.53),
resulting in an improvement of about 5.3%. The classifier
is more effective than the overlap approach, as it identifies
specific subsets of tags that are associated with each hint,
instead of the more simplistic metric of the number of
common tags. Furthermore, the use of the classifier does
not impact the performance of the attack as the duration is
increased by 0.025 sec.
Live attack. To obtain an exact measurement of our
attack’s accuracy, we run our automated captcha-breaker
against reCaptcha. To minimize our impact, we do not repeat
all previous combinations, but opt for the one that had the
best results in the simulated experiments and, as such, we
employ the Clarifai service.
Labelled dataset. We created a labelled dataset to exploit
the image repetition. We manually labelled 3,000 images
collected from challenges, and assigned each image a tag
describing the content. We selected the appropriate tags from
our hint_list. We used pHash for the comparison, as it
is very efficient, and allows our system to compare all the
images from a challenge to our dataset in 3.3 seconds.
We ran our captcha-breaking system against 2,235
captchas, and obtained a 70.78% accuracy. The higher accu-
racy compared to the simulated experiments is, at least par-
tially, attributed to the image repetition; the history module
located 1,515 sample images and 385 candidate images in
our labelled dataset. We also came across 4 pairs of iden-
tical challenges. In one case, we had solved the challenge
correctly the first time we received it, indicating that Google
does not remove challenges from the pool even if they are
answered correctly.
Figure 5 shows the average number of images added to
the select and discard lists. We break the numbers
down into challenges that our system passed and those it
failed. The Undecided column depicts the number
of images we select from the undecided set using the
overlap module or our skip-gram classifier, for reaching the
3 images we provide as the solution. As each module creates
its own lists, the overall number of images in the select
lists is higher than 3 due to images being in multiple
lists. Our attack is more successful when Clarifai provides
accurate tags that can be directly matched to the hint and
the hint_list; passed challenges have an average of
1.5 selected images and 4.25 discarded, while failed ones
have 1.09 and 3.8 respectively. The history, best guess, and
title page are also effective in discarding images, which is
beneficial as it reduces the chance of selecting wrong images
from the undecided set. On average, we selected 1.49
undecided images for the solution when we succeeded.
For the failed challenges, the average is 2, verifying that it
is a more error-prone process.
Figure 6 shows the attack’s total time, broken down to
each phase. The most time consuming phase is running
GRIS, as it searches for all the images in Google and
processes the results, including the extraction of links that
point to higher resolution versions of the images. Currently,
we follow a multi-threaded approach for processing the
10 images in parallel. This could be improved by further
parallelizing the processing of each image. Nonetheless, the
attack is very efficient, with an average duration of 19.2
seconds per challenge.
Hint repetition. We also found significant repetition of
the type of content presented in the challenges. Figure 7
depicts all the hints and their respective frequency. As can
be seen, there is a very limited variety of image categories
used. 5 types account for the solution of over 54.7% of the
challenges, and 10 for over 91.5%. An adversary can further
increase the accuracy by tailoring the attack and training the
image annotation system for these specific types of images.
Figure 5. Average number of images selected or removed by each module (History, Tags, BestGuess, PageTitle, Undecided), for passed and failed challenges.
Figure 6. Cumulative distribution of time required for each step (Extract, History, HighRes, GRIS, Clarifai, Solve, Total).
Figure 7. Frequency and success rate for each type of hint (no-hint, rose, sandwich, soup, hamburger, coffee, wine, avocado, guinea-pig, pasta, bread, steak, cake, sushi, ice-cream, cat, rice-dish, dog, burrito).

Time restriction. The widget removes the challenge after 55 seconds and the user is required to click in the checkbox for receiving a new one. However, we found that the challenge is not invalidated on the server and we can provide
an answer even after 20 minutes. The challenge is simply
hidden by the widget, and changing the visibility and
position attributes back to their original values makes the
challenge reappear.
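A sketch of this step, assuming Selenium; the selector for the challenge container is hypothetical and must be taken from the widget's actual markup.

```python
# Sketch: after the 55-second timeout, undo the client-side hiding so
# the (still valid) challenge can be answered. Selector is hypothetical.
from selenium import webdriver

driver = webdriver.Chrome()
# ... load the page, let the widget time out and hide the challenge ...
driver.execute_script("""
    var c = document.querySelector('CHALLENGE_CONTAINER_SELECTOR');
    c.style.visibility = 'visible';  // restore original visibility
    c.style.position = 'static';     // restore original position
""")
```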
Offline mode. We also evaluated our attack in an offline
mode, where we did not use any online annotation services
or Google’s reverse image search; we relied solely on the
local library, our labelled dataset, and our skip-gram classi-
fier. We tested our attack with the two libraries, NeuralTalk
and Caffe. When using Caffe and our classifier, our system
solved 41.57% (σ= 4.28) of the image captchas, while
the attack duration increased to 20.9 seconds per challenge.
While NeuralTalk is similarly accurate at 40%, it introduces a large increase in the attack's duration, which rises to 117.8 seconds, as NeuralTalk requires an average of 110.9 seconds to process the 10 images. However, leveraging a GPU for these computations would improve performance and reduce the duration.
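As a concrete illustration of the offline pipeline, the following sketch labels an image locally with pycaffe, assuming a pretrained ImageNet model; the prototxt, weights, and synset label file are the standard model-zoo artifacts, and the paths are placeholders.

```python
# Sketch: label images locally with a pretrained Caffe model, replacing
# the online annotation services in offline mode.
import caffe
import numpy as np

net = caffe.Classifier("deploy.prototxt", "model.caffemodel",
                       channel_swap=(2, 1, 0), raw_scale=255,
                       image_dims=(256, 256))
# synset_words.txt maps class indices to human-readable labels.
labels = [line.strip().split(" ", 1)[1] for line in open("synset_words.txt")]

def local_tags(image_path, top_k=5):
    """Top-k ImageNet labels, matched against the hint_list downstream."""
    probs = net.predict([caffe.io.load_image(image_path)])[0]
    return [labels[i] for i in np.argsort(probs)[::-1][:top_k]]
```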
Thus, adversaries can deploy accurate and efficient at-
tacks against the image reCaptcha without relying on ex-
ternal services, which may require payment for processing
large collections of images or report suspicious actions.
4.3. Economic analysis
Since captcha-breaking is driven by monetary incentives,
we evaluate our findings from an economic perspective.
Image captcha. To evaluate the viability of our captcha-
breaking system as a solution for fraudsters, we compare
our performance to that of Decaptcher, the (self-reported)
oldest captcha-solving service. We selected Decaptcher for
two reasons. First, it supports the image reCaptcha, charging
$2 per 1000 solved captchas (customers can request refunds
for failed challenges). Second, previous work [5] found it
to be the most accurate solving service (tied with another
service), rendering it a suitable candidate for comparison.
We submitted the 700 image captchas to Decaptcher
and measured the response time and accuracy (taking into
account the solution flexibility). Interestingly, 147 were
initially rejected due to the service being overloaded, and
had to be re-submitted at a later time. Out of the 700
captchas, 88 received a time-out error (ERR_TIMEOUT) as
the solvers did not provide an answer in the time window
allocated by the service, and 258 (36.85%) were an exact
match. When taking into account the flexibility, 321 (44.3%)
of the captchas were solved. The average solving time for
the challenges that received a solution was 22.5 seconds.
While the accuracy may increase over time as the human
solvers become more accustomed to the image reCaptcha,
it is evident that our system is a cost-effective alternative.
Notably, even our completely offline captcha-breaking system is comparable to a professional solving service in both accuracy and attack duration, with the added benefit of not incurring any cost on the attacker.
Checkbox captcha. Assuming a selling price of $2 per
1,000 solved captchas, our token harvesting attack could
accrue $104 - $110 daily, per host (i.e., IP address). By
leveraging proxy services and running multiple attacks in
parallel, this amount could be significantly higher for a
single machine.
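The figure follows directly from the price point; as a back-of-the-envelope check, the token volumes below are simply the rates implied by the revenue range.

```python
# Sanity check of the daily revenue estimate: at $2 per 1,000 tokens,
# $104-$110 per day corresponds to roughly 52,000-55,000 tokens
# harvested per day from a single IP address.
PRICE_PER_TOKEN = 2.0 / 1000
for tokens_per_day in (52_000, 55_000):
    print(f"{tokens_per_day:,} tokens/day -> ${tokens_per_day * PRICE_PER_TOKEN:.0f}")
```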
5. Applicability
We have demonstrated the effectiveness of our attack
against the image reCaptcha. Nonetheless, the basis of our
attack is widely applicable; by extracting the semantics of
images, we can construct attacks against other image-based
captchas. While other schemes may require different asso-
ciations of objects, our findings demonstrate that extracting
semantic information from images is no longer an obstacle
for machines, and should not be the yardstick by which we
evaluate the security of image captchas.
Currently, our approach can be readily applied to similar
existing schemes, such as the recently released Facebook
captcha (Figure 8). The image captcha is shown to users
when they send messages to other users that contain sus-
picious URLs. Before the message is actually delivered
to other users, the sender must pass an image captcha.
This mechanism is designed to prevent the propagation of
(automated) spam or phishing links by fake or compromised
accounts. Facebook’s image captcha follows the same ap-
proach with reCaptcha, where users have to identify which
images (out of 12 candidates) have content that match a
given hint. There are, however, some differences. Facebook
resizes the images dynamically in HTML, allowing access
to the high resolution versions. Also, no sample image
is shown.

Figure 8. Image captcha by Facebook.
Figure 9. Attack accuracy against Facebook's image captcha (exact match and pass rate for GRIS, Alchemy, Clarifai, TDL, NeuralTalk, Caffe).

The system allows the same flexibility rules as
reCaptcha. The number of correct images varies from 2 to
10, with 5-7 being the most common cases. Accordingly,
we tweak our solution algorithm to only select the images
contained in the select set, and not to opt for a specific
number of images.
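A sketch of this tweak follows; the sets of image identifiers are assumed inputs produced by our selection and discard modules.

```python
# Sketch: Facebook's variable solution size means we submit exactly the
# images in the select set, instead of padding to a fixed count of 3
# from the undecided set as we do for reCaptcha.
def solve_facebook(select, discard):
    return sorted(select - discard)  # selected images only, no padding

# Example with hypothetical image ids:
answer = solve_facebook({"img2", "img5", "img7"}, {"img7"})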
Figure 9 depicts the accuracy of our attack against 200
Facebook image captchas. Once again, Clarifai achieves
the highest accuracy with 83.5% while Alchemy is also
very effective with 67%. The higher accuracy, compared to
reCaptcha, is due to two characteristics. First, the higher-resolution images lead to more accurate labels from the annotation systems. Second, the use of completely unrelated
images when creating the challenge facilitates the discarding
of the incorrect options; on the other hand, reCaptcha opts
for images that belong to the same category (e.g., all are
some type of food) which renders the distinction more
difficult. Furthermore, these results demonstrate that while
increasing the number of images significantly impacts ran-
dom guessing attacks, it does not affect our attack, as other
aspects of the captcha’s implementation are more influential.
6. Guidelines and Countermeasures
Here we discuss countermeasures for defending against
our attacks, and their potential impact on the usability of
the service. These measures are not exclusive to reCaptcha,
and can be adopted by other captcha providers. Since auto-
mated solvers for the image captcha cannot be prohibited,
we present guidelines for reducing the accuracy and cost-
effectiveness of the image captcha attack. Due to the poten-
tial impact, extensive user studies are required for evaluating
each modification.
Token auctioning. The token verification API call has
an optional field for comparing the IP address of the user that solved the captcha to the one that submitted the token
to the website. This field should be made mandatory to
prevent services from selling tokens obtained from checkbox
captchas. However, it cannot prevent outsourcing the solu-
tion for the other types of captchas, since the adversaries
can extract the challenges, send them to the solving service
and receive the solution. Nonetheless, it increases the cost
of large scale attacks, as (automatically) solving the other
captcha types is more costly in terms of computation.
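For reference, the verification call and its optional IP field, sketched with the requests library; the secret and token values are placeholders.

```python
# The server-side token verification call; `remoteip` is the optional
# field that would bind a token to the IP address that solved it.
import requests

resp = requests.post("https://www.google.com/recaptcha/api/siteverify",
                     data={"secret": "YOUR_SITE_SECRET",
                           "response": "harvested-token",
                           "remoteip": "203.0.113.7"})  # currently optional
print(resp.json().get("success"))
```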
Risk analysis. We propose the following safeguards for
improving the advanced risk analysis system.
Account. Previous work [23] has proposed regulating
the number of available challenges by tying them to a
service account. If requests are valid only when they
originate from users logged into their Google account, the
scale of attacks could be constrained. While this would
result in the value of accounts increasing in the underground
market, Google has deployed several safeguards that under-
mine the ability of adversaries to maintain phone verified
accounts [21]. On the other hand, this introduces a usability
issue, as users that browse with privacy (incognito) mode enabled will be blocked. As a workaround, for such
cases reCaptcha can return the hardest type of challenge and
maintain a separate, severely limited, bucket of tokens [3]
per IP address. Once the tokens have been exhausted for a
specific time period, no more challenges should be returned
to requests that do not originate from a logged in user.
Cookie reputation. Assigning high confidence to cookies
with no browsing history is a significant flaw. The “reputa-
tion” of a cookie should rise with the amount of browsing conducted. This would increase the cost of cookie creation.
Also, the number of cookies that can be created within a
time period from a specific host should be regulated. While
cookie creation was not a problem in the past, their use in
reCaptcha has rendered them a valuable commodity.
Browser checks. A stricter approach than the existing one would be to return no challenge if the checks detect an
overtly suspicious environment (e.g., mismatch between de-
tected browser and what is reported in the User-Agent).
Image captcha attacks. Due to recent advancements
in computer vision and machine learning [12], [14]–[16],
[18], [19], and the availability of free image annotation
services and libraries, developing a captcha scheme that can
defeat automated solvers remains an open problem. Here, we
propose modifications and safeguards focused on reducing
the accuracy of our automated attacks in the absence of such a
development.
Solution. Creating challenges where the number of cor-
rect images is selected from a larger range, and uniformly
distributed, will impact the accuracy of the attack.

TABLE 6. EXAMPLE OUTPUT FOR IMAGE WITH ARTIFICIAL NOISE.
Clarifai: modern, glass, business, window, office, light, reflection, office building, communication, architecture, futuristic, future, panoramic, technology, nobody, city, structure, geometric, bright, building
Caffe: prison, correctional institution, penal institution, institution, shoji, prison, shoji, pirate, pick, bearskin

Without knowledge of the number of images to select, the attack will
require a threshold confidence score to be set for selecting
an image, which may result in correct images not being
selected. Furthermore, by not allowing any flexibility in the
solution and requiring users to only select correct images,
the attack’s accuracy can be further reduced.
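To illustrate, without a known solution size the selection step degenerates to thresholding; the threshold value below is an assumption an attacker would have to tune.

```python
# Sketch: with the number of correct images unknown, selection must use
# a confidence threshold, and correct images scoring below it are lost.
THRESHOLD = 0.6  # assumed; would be tuned on previously seen challenges

def select_images(scores):
    """scores: {image_id: confidence that the image matches the hint}."""
    return [img for img, s in scores.items() if s >= THRESHOLD]
```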
Repetition. Once a challenge has been presented to a
user, it should be removed from the pool of available
challenges, even if the given solution was incorrect. Fur-
thermore, the pool from which images are selected should
be greatly increased in size. This will minimize the existing
amount of repetition which can be exploited by adversaries.
Hint and content. As shown in Figure 7, when a captcha
does not contain a hint, our attack is less successful. As such,
the hint should be removed. Furthermore, the attack is less
accurate for certain types of content. Captcha providers can
conduct experiments for identifying more categories that are
problematic for image annotation software and use those to
construct challenges.
Content homogeneity. Populating challenges with “filler” images (i.e., those that are not part of the solution) that have the same type of content as the solution may reduce the accuracy of the attack. The output of the image
annotation systems is not always precise enough to discern
similar types of content (e.g., if all the images are some
type of food). The heterogeneity of the filler images in
Facebook’s captcha contributes significantly to the attack’s
higher success rate.
Advanced semantic relations. Requiring users to identify
images with more complicated semantic relations could
increase the complexity required for the captcha-breaking
system. Instead of similar objects, challenges could ask users
to select semantically-related objects (e.g., a tennis ball, a
racket, and a tennis court). While this cannot completely
prevent attacks, it can be adopted as a temporary measure.
Introducing noise. We have conducted a preliminary
experiment for evaluating the impact of artificial noise on the
effectiveness of the image annotation services. Our system
draws a random grid on each image, with a varying number
of lines on each axis, and varying distances between each
line. The grid's color is the complement of the image's average color value, which keeps the grid easily discernible to human users. We ran our attack against 100 image reCaptcha challenges and found that the accuracy drops to 16% for Clarifai and 13% for Caffe. An
example output can be found in Table 6. Interestingly, the
image annotation systems are not impacted by the grid when
the original image depicts an animal, as the grid is identified
as a cage. The grid also reduces the probability of retrieving
higher resolution versions of the images, as the reverse
search is significantly affected.
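A minimal sketch of the noise generator, assuming Pillow; the line counts and spacing ranges are illustrative.

```python
# Sketch: draw a random grid over the image in the complement of its
# average color, as used in our preliminary noise experiment.
import random
from PIL import Image, ImageDraw, ImageStat

def add_grid(in_path, out_path):
    img = Image.open(in_path).convert("RGB")
    avg = ImageStat.Stat(img).mean                 # per-channel averages
    color = tuple(255 - int(c) for c in avg)       # complementary color
    draw = ImageDraw.Draw(img)
    w, h = img.size
    # Varying number of lines per axis, at varying positions.
    for x in random.sample(range(0, w, 8), random.randint(4, 10)):
        draw.line([(x, 0), (x, h)], fill=color)
    for y in random.sample(range(0, h, 8), random.randint(4, 10)):
        draw.line([(0, y), (w, y)], fill=color)
    img.save(out_path)
```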
As previous work has demonstrated effective methods
for removing noise from images [24], [25], we consider
this a temporary solution that can increase the cost of
the automated attack. As images will have to be “cleaned”
before being processed by the image annotation system,
these mechanisms may increase the computational cost of
the attack. Further exploration is required for evaluating the
trade-off between the amount of noise introduced (which
may impact usability) and the time required by adversaries
for removing it.
Adversarial images. Szegedy et al. [26] described a
pixel-level distortion process that produces images that are
almost identical (visually) to the original, yet are misclassi-
fied by all the neural networks they tested, regardless of their
model parameters or training datasets. By altering a small
number of pixels, the resulting images are misclassified
(e.g., a school bus classified as an ostrich) or not recognized
(e.g., a binary car classifier fails to recognize an image
previously identified as a car). This process could be applied
to image captchas, as it can prevent automated attacks
from identifying the images, while the images remain easily
identifiable for humans. However, evaluation is required for
verifying the performance overhead of such an approach,
as well as its robustness; it may be easy for adversaries to
transform the images in a way that negates the distortion,
before processing them with the image annotation system.
7. Ethics and disclosure
Demonstrating large scale attacks against reCaptcha has
implications for the many websites that rely on it for secur-
ing resources and protecting their users. While the presented
image-based attack cannot be prohibited, we propose safe-
guards and countermeasures for impacting the accuracy and
cost-effectiveness of our attack, until a suitable replacement
mechanism can be found. Furthermore, we have not affected
the websites in any way during our experiments, as we do
not perform any actions apart from acquiring the reCaptcha
challenges. We have disclosed a report with our findings and
recommendations to Google, in an effort to assist them in
making reCaptcha more robust to automated attacks. Fol-
lowing our disclosure, reCaptcha altered the safeguards and
the risk analysis process to mitigate our large-scale token
harvesting attacks. They also removed the solution flexibility
and sample image from the image captcha for reducing the
attack’s accuracy. We have also informed Facebook, but
have not been notified of any changes. Overall, we hope
that sharing our findings will help initiate the much-needed
discussion between researchers and industry regarding the
future of captchas.
8. Limitations
Certain aspects of our system can be explored for further
improving the accuracy of our attack.
Customize. Our current design does not account for
certain characteristics of the image annotation modules. For
example, Clarifai returns generic tags which can lead to false
positives. In the future we plan to explore how the attack
is impacted if the generic tags are removed, and identify
specific types of keywords that can be omitted.
Confidence Scores. Several of the image annotation sys-
tems return confidence scores. However, the highest scores
are frequently assigned to generic tags. Our current version
of the attack does not assign higher weights to tags with
higher confidence scores, as it may increase the false posi-
tives. By exploring which generic tags can be omitted, we
can modify our system to leverage the tag-confidence scores.
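As a sketch of this planned refinement, the generic-tag list below is a placeholder; in practice it would be learned from past challenges.

```python
# Sketch: drop overly generic tags before matching, then weight the
# remaining tags by the annotation service's confidence score.
GENERIC = {"nobody", "no person", "background", "image"}  # assumed list

def weighted_match(tags_with_conf, hint_terms):
    """tags_with_conf: [(tag, confidence)]; hint_terms: hint_list entries."""
    return sum(c for t, c in tags_with_conf
               if t not in GENERIC and t in hint_terms)
```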
9. Future directions for captchas
The results by Bursztein et al. [6] have significant implications regarding the robustness of text-based captcha
schemes and the security they offer. Similarly, our novel
attacks pose an interesting dilemma regarding future di-
rections for designing captchas. We believe that the ca-
pabilities of computer vision and machine learning have
finally reached the point where expectations of automati-
cally distinguishing between humans and bots with existing
schemes, without excluding a considerable number of users
in the process, seem unrealistic. Thus, we must reassess
our concept of Reverse Turing tests, and approach their
design from a fundamentally different perspective. While an
in-depth exploration of this open problem is beyond the scope
of our work, we present certain alternative schemes in the
next section. An overview of potential alternative captcha
designs was recently presented in [6].
10. Related Work
Text captchas. The majority of deployed captcha
schemes are based on distorted letters or numbers, and
a broad body of work has demonstrated attacks for au-
tomatically breaking such schemes. Yan and Ahmad [27]
presented an attack against a Microsoft captcha. Their novel
text segmentation approach focused on locating the separate
characters which are, then, easier to identify. The authors
had previously [28] demonstrated attacks against a variety
of captcha schemes, using simple pattern recognition algo-
rithms and exploiting design errors. Li et al. [29] explored
the robustness of captcha schemes used by multiple e-
banking services and found that in most cases they could
achieve 100% recognition accuracy. They concluded that,
in an attempt to maintain the usability of their systems, e-
banking services opted for captcha designs that offered little
security against automated attacks. Mori and Malik [30] pre-
sented two methods for breaking the EZ-Gimpy and Gimpy
captchas that depict words over artificial noise, achieving a
success rate of 92% and 33% respectively. The attacks take
advantage of the fact that the challenges depict actual words
and not random strings of alphanumerics.
Bursztein et al. [31] measured the success rate of users against captchas from various services. In their study they employed users through an underground solving service and workers from Amazon Mechanical Turk. An important observation the authors
made is that the difficulty of captchas is often very high,
rendering their solution a troublesome process for users.
Various attacks have been demonstrated against previous
text versions of reCaptcha [32]–[34]. Bursztein et al. [35]
conducted an extensive study on the strengths and weak-
nesses of text captchas. During their evaluation they found
13 of the 15 schemes to be vulnerable to automated attacks.
In [36] the authors proposed a more user-friendly scheme
based on distorted numbers, with users passing 95.3% of
the challenges. Recently, Bursztein et al. [6] presented a
novel approach for breaking text captchas, that performs
character segmentation and recognition in a single step.
Most importantly, their approach was universally applicable
to all tested schemes, and solved them with varying levels
of accuracy (5.33-55.22%).
Alternative designs have also proposed the use of video captcha challenges that contain text. Xu et al. [37]
demonstrated highly effective attacks against the NuCaptcha
scheme that depicted dynamic text strings.
Image captchas. The first publication to rigorously
explore the use of automated tests for security [1], also
proposed the use of distorted images of animals for cre-
ating challenges. The Asirra captcha [3] required users to
distinguish between images depicting cats and dogs. It was
broken within a year [38] with a classifier trained on color
and texture features.
Chew and Tygar [13] proposed three image captcha
schemes where users are required to: (i) type a label that
describes six images with similar content (e.g., astronaut),
(ii) detect if each of 2 sets of images contains similar
content, and (iii) identify one image out of 6 that has content
different to the others. Our attack against the image captcha
can be readily applied to the second and third type without
complications as it is based on the same principle; identi-
fying images with the same content. The first type would
require an extension of our existing attack, for deciding
which tag should be supplied as the answer.
The use of human faces for captchas has been pro-
posed in multiple schemes. Goswami et al. proposed FaceD-
captcha [39], a scheme where users are required to differ-
entiate between actual images of human faces and animated
versions of human faces. Rui and Liu [40] proposed AR-
TiFACIAL, a captcha scheme where users are required to
identify faces and facial features within a heavily distorted
image. Zhu et al. [8] demonstrated attacks against a series
of image-based captcha schemes, including ARTiFACIAL.
Based on the insights gathered from their attacks' characteristics, they set guidelines for designing robust image-based captcha schemes. Their guidelines mandated, among other things,
that a captcha must rely on semantic information, require
identification of multiple types of objects and prohibit at-
tacks based on a-priori knowledge such as the type of ob-
jects. While the image reCaptcha design does exhibit these
characteristics, the current implementation breaks the third
guideline as we found that the types of objects presented
are from a limited selection of categories.
Social authentication. Yardi et al. [41] proposed photo-
based authentication for social networks; to verify their
identity, users are required to identify their friends in photos.
Such a system was eventually deployed by Facebook for
preventing adversaries from gaining access to user accounts
after acquiring their credentials. This could be offered by
Facebook as a captcha service for other websites. However,
Polakis et al. [42] demonstrated an attack that leveraged
publicly available data and face recognition algorithms for
solving the challenges. In follow up work [23] the authors
proposed a photo-based captcha scheme, based upon the
same principle, which created challenges robust against face
recognition and image comparison attacks. As the captcha
challenges are crafted specifically for each user, such a
scheme can prevent captcha-solving services [5] or smug-
gling attacks [43]. The drawback is the requirement for users
to have an account with a specific service. ReCaptcha allevi-
ates that requirement with cookies, which reveal information
about a user’s activities without requiring a Google account.
Captcha-solving services. Motoyama et al. [5] explored
the inner workings of captcha-solving markets from an
economic perspective. They concluded that captchas should
not be viewed as an isolated defense mechanism, but eval-
uated as a way to impact the attackers’ profitability at a
large scale. In our experiments we have demonstrated the
feasibility of low-cost large-scale attacks against reCaptcha
that can be highly profitable for solving services. Shin et
al. [44] analyzed the functionality of a popular forum spam
automator designed to solve arithmetic tasks, which also
contained answer-question pairs for trivia challenges.
As fraudsters can distribute their illicit activities across
many hosts (e.g., by employing a botnet [45]), (IP address-
based) token bucket approaches [3] cannot prevent large
scale captcha-solving attacks. Leveraging reputation has
been proposed as a mitigation technique [6]. Jakobsson [46]
argued that existing approaches for throttling access to
valuable resources through captchas are no longer a viable
solution; challenges are becoming increasingly difficult for
humans, while the effectiveness of automated attacks con-
tinues to improve. As an alternative, Jakobsson proposed a
mechanism where users are required to create an account
with a trusted third party that serves as a mediator. If
the trusted party can verify the machine’s “identity” the
website will grant access to the user. To effectively throttle
requests from a machine, and prevent large scale attacks,
accounts must be tied to (physical) resources that are, by
nature, restricted. Such resources may be bank accounts,
phone numbers or postal addresses. The current approach
by reCaptcha is similar, in the sense that it treats users’
browsing history as a resource, and distinguishes users based
on their web history (cookie). We have demonstrated,
however, that this is not a restricted resource and the system
can be manipulated. Furthermore, while recent work has
found that underground markets use phone-verified Google
accounts [21], periodic verification of each account’s phone
number could significantly mitigate such incidents. Thus,
requiring a phone-verified account for presenting a captcha
may be an effective direction to explore.
Dynamic cognitive game captchas have been proposed
as an alternative, and mechanisms for detecting the outsourc-
ing of the solution have been demonstrated [47]. While the
most popular game captcha available has been broken [48],
this approach could potentially lead to more robust captcha
schemes.
11. Conclusion
This paper offers a comprehensive examination of the
design and implementation characteristics of the reCaptcha
service. Our findings demonstrate that the development of
new generation captcha systems is a complex task, even
for resource-rich entities such as Google and Facebook.
In an effort to evolve towards more advanced functional-
ity and address the issues of previous schemes, reCaptcha
has introduced a series of mechanisms that have not been
used in captcha systems before. Accordingly, we explored
the design directions of this new system, as well as the
implementation flaws. Due to the essential role of the risk
analysis system, we evaluated how different aspects of our
captcha-breaking system affect the outcome of our requests,
and demonstrated a novel misuse of persistent tracking
cookies. We identified a plethora of checks performed by the
reCaptcha widget for identifying suspicious characteristics
of the environment. We also found a lack of safeguards for
preventing the creation and use of tracking cookies, which
we exploited to demonstrate the feasibility of large-scale
captcha-solving attacks. Next, we presented a novel no-cost
attack for image captchas that builds on deep learning tech-
nologies for extracting semantic information from images.
The effectiveness and efficiency of our attack further corrob-
orate that new directions need to be explored for the design
of captchas, as existing schemes rely on tasks that are within
the capabilities of automated reasoning. Nonetheless, the
advanced risk analysis and widget introduced by reCaptcha
possess valuable functionality that can be incorporated into
future captcha schemes for mitigating attacks.
Acknowledgements
We would like to thank the anonymous reviewers, as
well as Fabian Monrose and Michalis Polychronakis for
their comments on previous drafts of this paper. This work
was supported by the NSF under grant CNS-13-18415.
Author Suphannee Sivakorn is also partially supported by
the Ministry of Science and Technology of the Royal Thai
Government. Any opinions, findings, conclusions, or recom-
mendations expressed herein are those of the authors, and
do not necessarily reflect those of the US Government or
the NSF.
References
[1] L. von Ahn, M. Blum, N. J. Hopper, and J. Langford, “CAPTCHA:
Using hard AI problems for security,” in EUROCRYPT ’03.
[2] M. Foley, ““Prove You’re Human”: Fetishizing material embodiment
and immaterial labor in information networks,” Critical Studies in
Media Communication, vol. 31, no. 5, pp. 365–379, 2014.
[3] J. Elson, J. R. Douceur, J. Howell, and J. Saul, “Asirra: a
CAPTCHA that exploits interest-aligned manual image
categorization,” in CCS ’07.
[4] Distil Networks. CAPTCHAs Have Negative Impact on Web Traffic
and Leads. http://www.distilnetworks.com/distil-networks-study-captchas-negative-impact-on-web-traffic/.
[5] M. Motoyama, K. Levchenko, C. Kanich, D. McCoy, G. M. Voelker,
and S. Savage, “Re: CAPTCHAs: understanding captcha-solving
services in an economic context,” in USENIX Security ’10.
[6] E. Bursztein, J. Aigrain, A. Moscicki, and J. C. Mitchell, “The end
is nigh: Generic solving of text-based CAPTCHAs.” in USENIX
WOOT ’14.
[7] K. Mowery and H. Shacham, “Pixel perfect: Fingerprinting canvas
in html5,” in W2SP ’12.
[8] B. B. Zhu, J. Yan, Q. Li, C. Yang, J. Liu, N. Xu, M. Yi, and K. Cai,
“Attacks and design of image recognition CAPTCHAs,” in CCS ’10.
[9] L. Von Ahn, B. Maurer, C. McMillen, D. Abraham, and M. Blum,
“reCAPTCHA: Human-based character recognition via web security
measures,” Science, vol. 321, no. 5895, 2008.
[10] Google Online Security Blog, “Are you a robot? Introducing “No
CAPTCHA reCAPTCHA”,” http://googleonlinesecurity.blogspot.com/2014/12/are-you-robot-introducing-no-captcha.html.
[11] I. J. Goodfellow, Y. Bulatov, J. Ibarz, S. Arnoud, and V. Shet,
“Multi-digit number recognition from street view imagery using
deep convolutional neural networks,” in CoRR ’13.
[12] A. Karpathy and L. Fei-Fei, “Deep visual-semantic alignments for
generating image descriptions,” in CoRR ’14.
[13] M. Chew and J. D. Tygar, “Image recognition CAPTCHAs,” in ISC
’04.
[14] O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, “Show and tell: A
neural image caption generator,” in CoRR ’14.
[15] M. M. Kalayeh, H. Idrees, and M. Shah, “NMF-KNN: Image
Annotation Using Weighted Multi-view Non-negative Matrix
Factorization,” in CVPR ’14.
[16] N. Srivastava and R. Salakhutdinov, “Multimodal learning with
deep boltzmann machines,” Journal of Machine Learning Research,
vol. 15, pp. 2949–2980, 2014.
[17] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet
classification with deep convolutional neural networks,” in NIPS ’12.
[18] M. D. Zeiler, G. W. Taylor, and R. Fergus, “Adaptive
deconvolutional networks for mid and high level feature learning,”
in ICCV ’11.
[19] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick,
S. Guadarrama, and T. Darrell, “Caffe: Convolutional architecture
for fast feature embedding.”
[20] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation
of word representations in vector space,” in CoRR ’13.
[21] K. Thomas, D. Iatskiv, E. Bursztein, T. Pietraszek, C. Grier, and
D. McCoy, “Dialing back abuse on phone verified accounts,” in
CCS ’14.
[22] E. Homakov. The No CAPTCHA problem.
http://homakov.blogspot.in/2014/12/the-no-captcha-problem.html.
[23] I. Polakis, P. Ilia, F. Maggi, M. Lancini, G. Kontaxis, S. Zanero,
S. Ioannidis, and A. D. Keromytis, “Faces in the distorting mirror:
Revisiting photo-based social authentication,” in CCS ’14.
[24] R. H. Chan, C.-W. Ho, and M. Nikolova, “Salt-and-pepper noise
removal by median-type noise detectors and detail-preserving
regularization,” Trans. Img. Proc., vol. 14, no. 10, 2005.
[25] C. Liu, R. Szeliski, S. B. Kang, C. L. Zitnick, and W. T. Freeman,
“Automatic estimation and removal of noise from a single image,”
IEEE TPAMI, vol. 30, no. 2, 2008.
[26] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. J.
Goodfellow, and R. Fergus, “Intriguing properties of neural
networks,” in CoRR ’13.
[27] J. Yan, E. Ahmad, and A. Salah, “A low-cost attack on a microsoft
CAPTCHA,” in CCS ’08.
[28] ——, “Breaking visual CAPTCHAs with naive pattern recognition
algorithms,” in ACSAC ’07.
[29] S. Li, S. A. H. Shah, M. A. U. Khan, S. A. Khayam, A.-R. Sadeghi,
and R. Schmitz, “Breaking e-banking CAPTCHAs,” in ACSAC ’10.
[30] G. Mori and J. Malik, “Recognizing objects in adversarial clutter:
Breaking a visual CAPTCHA,” in CVPR ’03.
[31] E. Bursztein, S. Bethard, C. Fabry, J. C. Mitchell, and D. Jurafsky,
“How good are humans at solving CAPTCHAs? A large scale
evaluation,” in SP ’10.
[32] C. Cruz-Perez, O. Starostenko, F. Uceda-Ponga, V. Alarcon-Aquino,
and L. Reyes-Cabrera, “Breaking reCAPTCHAs with unpredictable
collapse: Heuristic character segmentation and recognition,” vol.
7329, 2012.
[33] P. Baecher, N. Büscher, M. Fischlin, and B. Milde, “Breaking
reCAPTCHA: a holistic approach via shape recognition,” in Future
Challenges in Security and Privacy for Academia and Industry,
2011, vol. 354.
[34] O. Starostenko, C. Cruz-Perez, F. Uceda-Ponga, and
V. Alarcon-Aquino, “Breaking text-based CAPTCHAs with variable
word and character orientation,” Pattern Recognition, vol. 48, 2015.
[35] E. Bursztein, M. Martin, and J. C. Mitchell, “Text based
CAPTCHA strengths and weaknesses,” in CCS ’11.
[36] E. Bursztein, A. Moscicki, C. Fabry, S. Bethard, J. C. Mitchell, and
D. Jurafsky, “Easy does it: More usable CAPTCHAs,” in CHI ’14.
[37] Y. Xu, G. Reynaga, S. Chiasson, J.-M. Frahm, F. Monrose, and
P. van Oorschot, “Security and usability challenges of
moving-object CAPTCHAs: Decoding codewords in motion,” in
USENIX Security ’12.
[38] P. Golle, “Machine learning attacks against the asirra CAPTCHA,”
in CCS ’08.
[39] G. Goswami, B. M. Powell, M. Vatsa, R. Singh, and A. Noore,
“FaceDCAPTCHA: Face detection based color image CAPTCHA,”
Future Generation Computer Systems, vol. 31, 2014.
[40] Y. Rui and Z. Liu, “ARTiFACIAL: Automated reverse Turing test using
facial features,” in Multimedia ’03.
[41] S. Yardi, N. Feamster, and A. Bruckman, “Photo-based
authentication using social networks,” in WOSN ’08.
[42] I. Polakis, M. Lancini, G. Kontaxis, F. Maggi, S. Ioannidis, A. D.
Keromytis, and S. Zanero, “All your face are belong to us: breaking
facebook’s social authentication,” in ACSAC ’12.
[43] M. Egele, L. Bilge, E. Kirda, and C. Kruegel, “CAPTCHA
smuggling: Hijacking web browsing sessions to create captcha
farms,” in SAC ’10.
[44] Y. Shin, M. Gupta, and S. Myers, “The nuts and bolts of a forum
spam automator,” in USENIX LEET ’11.
[45] B. Stone-Gross, T. Holz, G. Stringhini, and G. Vigna, “The
underground economy of spam: A botmasters perspective of
coordinating large-scale spam campaigns,” in USENIX LEET ’11.
[46] M. Jakobsson, “Captcha-free throttling,” in AISec ’09.
[47] M. Mohamed, S. Gao, N. Saxena, and C. Zhang, “Dynamic
cognitive game CAPTCHA usability and detection of
streaming-based farming,” in USEC ’14.
[48] SPAM tech. Cracking the AreYouAHuman Captcha. http://spamtech.co.uk/software/bots/cracking-the-areyouhuman-captcha/.