
14 CHAPTER 2. BACKGROUND AND CONTEXT
very specific JavaScript functions and they decompile the Flash files they encounter to verify the
presence of fingerprinting-related function calls. With their approach, they do not need to rely on
a known list of tracking scripts as they can directly look for behaviours related to fingerprinting
activities. They found 404 sites out of 1 million performing JavaScript-based font probing and 95
sites out of 10,000 performing Flash-based font probing.
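The behaviour-based detection described above can be illustrated with a minimal Python sketch. It is not the authors' tooling: the `RecordingProxy` wrapper, the `FakeTextElement` stand-in, the `font_family` and `measure_width` names, and the threshold of 30 fonts are all assumptions made for illustration. The only point is that instrumented objects log what each script does, and a criterion is then applied to that log rather than to a list of known tracking scripts.

```python
class FakeTextElement:
    """Hypothetical stand-in for a page element whose rendered width depends on its font."""

    def __init__(self):
        self.font_family = "sans-serif"

    def measure_width(self):
        # Toy metric: a real browser would return the rendered width in pixels.
        return 10 * len(self.font_family)


class RecordingProxy:
    """Forwards everything to `target` while logging attribute writes and method calls."""

    def __init__(self, target, script_url, log):
        # Bypass our own __setattr__ so these bookkeeping fields are not logged.
        object.__setattr__(self, "_target", target)
        object.__setattr__(self, "_script_url", script_url)
        object.__setattr__(self, "_log", log)

    def __getattr__(self, name):
        # Only reached for names not stored on the proxy itself.
        value = getattr(self._target, name)
        if not callable(value):
            return value

        def logged_call(*args):
            self._log.append((self._script_url, "call", name, args))
            return value(*args)

        return logged_call

    def __setattr__(self, name, value):
        self._log.append((self._script_url, "set", name, (value,)))
        setattr(self._target, name, value)


def looks_like_font_probing(log, script_url, min_fonts=30):
    """Assumed criterion: at least `min_fonts` distinct fonts set and
    at least `min_fonts` width measurements by the same script."""
    fonts = set()
    measurements = 0
    for url, kind, name, args in log:
        if url != script_url:
            continue
        if kind == "set" and name == "font_family":
            fonts.add(args[0])
        elif kind == "call" and name == "measure_width":
            measurements += 1
    return len(fonts) >= min_fonts and measurements >= min_fonts


def run_prober(element, font_names):
    """Simulated fingerprinting script: tries each font and measures the result."""
    widths = {}
    for font in font_names:
        element.font_family = font
        widths[font] = element.measure_width()
    return widths
```

With this sketch, a script that cycles through 40 fonts via the proxy is flagged by `looks_like_font_probing`, while a script that sets a single font and measures once is not. The threshold is arbitrary here and would need tuning against real pages.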
Finally, in 2016, two researchers at Princeton University released the OpenWPM platform, “a
web privacy measurement framework which makes it easy to collect data for privacy studies on a
scale of thousands to millions of site” [160]. To demonstrate the capabilities of their tool, they
analysed the top 1 million sites to detect and quantify emerging online tracking behaviours [77].
Their findings are more accurate than those of earlier studies because they extensively instrumented a
very large number of JavaScript objects to build a detection criterion for each fingerprinting technique.
Out of 1 million, they found 14,371 sites performing canvas fingerprinting, 3,250 sites performing
canvas font fingerprinting and 715 sites performing WebRTC-based fingerprinting. These numbers
are much higher than what was reported in previous studies. However, they do not report on the
spread of more traditional techniques like the collection of navigator properties. After contacting
the authors, we learned that the amount of data they collected was so large that designing new
heuristics required a very time-consuming manual analysis of scripts. In the end, the
number of actors performing device fingerprinting on the web is probably much higher than what
is currently reported by large crawls.
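Because the studies crawled samples of different sizes, the raw counts are easier to compare once converted to a share of crawled sites. The short Python sketch below does this using only the counts quoted above; the labels and the helper names `prevalence_percent` and `prevalence_table` are introduced here for illustration.

```python
# (sites flagged, sites crawled), using only the counts quoted in the text above.
REPORTED_COUNTS = {
    "JavaScript-based font probing (earlier study)": (404, 1_000_000),
    "Flash-based font probing (earlier study)": (95, 10_000),
    "canvas fingerprinting (2016 crawl)": (14_371, 1_000_000),
    "canvas font fingerprinting (2016 crawl)": (3_250, 1_000_000),
    "WebRTC-based fingerprinting (2016 crawl)": (715, 1_000_000),
}


def prevalence_percent(flagged, crawled):
    """Share of crawled sites on which the technique was detected, in percent."""
    return 100 * flagged / crawled


def prevalence_table(counts):
    """Return {technique: percentage} for every entry in `counts`."""
    return {name: prevalence_percent(*pair) for name, pair in counts.items()}


if __name__ == "__main__":
    for name, percent in prevalence_table(REPORTED_COUNTS).items():
        print(f"{name}: {percent:.4f}%")
```

Under these figures, canvas fingerprinting appears on roughly 1.44% of the crawled sites, against about 0.04% for JavaScript-based font probing in the earlier study. The techniques and crawl dates differ, so the comparison is only indicative.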
Evolution of privacy policies
As browsers started to reveal larger parts of their configuration, many websites updated their
privacy policies to indicate that they had started collecting and storing device-specific information. Here,
we take a look at the privacy policies of major web actors to see if they perform device fingerprinting
and to find out what they do with this information.
• Google. At the time of writing, the latest version of Google’s privacy policy dates from
March 1st, 2017 [151]. It notably includes the following pieces of information: “We collect
device-specific information (such as your hardware model, operating system version, unique
device identifiers, and mobile network information including phone number)” and “device
event information such as crashes, system activity, hardware settings, browser type, browser
language, the date and time of your request and referral URL”. There is also a specific
paragraph named “Cookies and similar technologies” which states: “We and our
partners use various technologies to collect and store information when you visit a Google
service, and this may include using cookies or similar technologies to identify your browser
or device”. While there is not a single mention of the term “fingerprinting”, collecting device-
specific information to identify a browser or device definitely fits the definition.
Another important aspect of Google’s privacy policy is its evolution. In January 2001, we
can see the first traces of the collection of information as the policy was updated to include:
“Google notes and saves information such as time of day, browser type, browser language, and
IP address with each query. That information is used to verify our records and to provide more
relevant services to users. For example, Google may use your IP address or browser language
to determine which language to use when showing search results or advertisements.” [146].
As written, the information is used here to tailor the response to the user’s device, which is
exactly why such attributes were introduced in browsers in the first place.
In July 2004, a section called “Data collection” gives new but vague details about the purpose
of the collected information [147]. Notably, it indicates that “Google collects limited
non-personally identifying information your browser makes available whenever you visit a website”
and that they use this information “to operate, develop and improve our services”.
The March 2012 update lays the foundations of Google’s current privacy policy and introduces
a small but notable shift in how Google uses the data [148]. While previous privacy
policies indicated that the collected data was primarily for Google’s own use, the update
enables them to share some data with their own advertising service and their partners. Then,
as indicated by the subsequent updates in March 2014 and June 2015, this sharing of data extends