Proceedings on Privacy Enhancing Technologies 2021

Yana Dimova, Gunes Acar, Lukasz Olejnik, Wouter Joosen, and Tom Van Goethem

The CNAME of the Game: Large-scale Analysis of DNS-based Tracking Evasion

Abstract: Online tracking is a whack-a-mole game

between trackers who build and monetize behavioral

user profiles through intrusive data collection, and anti-tracking

mechanisms, deployed as a browser extension,

built into the browser, or as a DNS resolver. As a response

to pervasive and opaque online tracking, more

and more users adopt anti-tracking tools to preserve

their privacy. Consequently, as the information that

trackers can gather on users is being curbed, some track-

ers are looking for ways to evade these tracking counter-

measures. In this paper we report on a large-scale lon-

gitudinal evaluation of an anti-tracking evasion scheme

that leverages CNAME records to include tracker re-

sources in a same-site context, effectively bypassing

anti-tracking measures that use fixed hostname-based

block lists. Using historical HTTP Archive data we find

that this tracking scheme is rapidly gaining traction,

especially among high-traffic websites. Furthermore, we

report on several privacy and security issues inherent to

the technical setup of CNAME-based tracking that we

detected through a combination of automated and man-

ual analyses. We find that some trackers are using the

technique against the Safari browser, which is known to

include strict anti-tracking configurations. Our findings

show that websites using CNAME trackers must take

extra precautions to avoid leaking sensitive information

to third parties.

Yana Dimova: imec-DistriNet, KU Leuven, E-mail:

yana.dimova@cs.kuleuven.be

Gunes Acar: imec-COSIC, KU Leuven, E-mail:

gunes.acar@esat.kuleuven.be

Lukasz Olejnik: European Data Protection Supervisor, inde-

pendent researcher, E-mail: me@lukaszolejnik.com

Wouter Joosen: imec-DistriNet, E-mail:

wouter.joosen@cs.kuleuven.be

Tom Van Goethem: imec-DistriNet, E-mail:

tom.vangoethem@cs.kuleuven.be

1 Introduction

Websites use trackers for various purposes including

analytics, advertising and marketing. Although track-

ing may help websites in monetization of their content,

the use of such methods may often come at the ex-

pense of users’ privacy, for example when it involves

building detailed behavioral profiles of users. As a re-

action to the omnipresence of online tracking, in the

previous decade many countermeasures have been de-

veloped, including specialised browser extensions, DNS

resolvers, and built-in browser protections. As of to-

day, all major browsers (except Google Chrome) include

some forms of anti-tracking measures. Safari’s Intelligent

Tracking Prevention (ITP) includes multiple features

to thwart various forms of tracking and circumvention

techniques [54]; Firefox’s Enhanced Tracking Protection

(ETP) and the tracking prevention mechanism in

Edge rely on block lists to exclude trackers [30, 55].

As a counter-reaction to the increased use of anti-

tracking measures, several trackers have resorted to new

techniques in an attempt to circumvent these measures.

Prominent and well-studied examples of these evasion

techniques include browser fingerprinting [2, 19, 20, 24,

38], leveraging various browser mechanisms to persist

a unique identifier [7, 22, 48], and creating a finger-

print from hardware anomalies [15, 32, 58]. A notable

example of the use of evasion techniques is the case of

Criteo, one of the tracking actors we study in this paper.

In 2015, Criteo leveraged a redirection technique to set

first-party cookies [10, 41], and later abused the HTTP

Strict-Transport-Security mechanism [22, 48], both in

an effort to circumvent Safari’s Intelligent Tracking Prevention

(ITP). Our study complements these past reports

with an observation that Criteo is applying a specialised

form of first-party tracking to Safari browsers.

In this paper, we report on an evasion technique

that has been known for several years but recently

gained more traction, presumably due to the increased

protection against third-party tracking. This tracking

scheme takes advantage of a CNAME record on a sub-

domain such that it is same-site to the including web-

arXiv:2102.09301v2 [cs.CR] 23 Feb 2021

site. As such, defenses that block third-party cookies are

rendered ineffective. Furthermore, because custom sub-

domains are used, these are unlikely to be included in

block lists (instead of blocking the tracker for all sites,

block lists would have to include every instance for each

website including the CNAME-based tracker).

Using the HTTP Archive dataset, supplemented

with results from custom crawls, we report on a large-

scale evaluation of the CNAME-based tracking ecosys-

tem, involving 13 manually-vetted tracking companies.

We find that this type of tracking is predominantly

present on popular websites: 9.98% of the top 10,000

websites employ at least one CNAME-based tracker.

The use of such tracking is rising. Through a histor-

ical analysis of the ecosystem, we show that the num-

ber of websites that rely on this type of tracking is

steadily growing, especially compared to similarly-sized

tracking companies which have experienced a decline

in number of publishers. We find that CNAME-based

tracking is often used in conjunction with other track-

ers: on average 28.43 third-party tracking scripts can be

found on websites that also use CNAME-based track-

ing. We note that this complexity in the tracking ecosys-

tem results in unexpected privacy leaks, as it actually

introduces new privacy threats inherent to the ecosys-

tem where various trackers often set first-party cookies

via the document.cookie interface. We find that due to

how the web architecture works, such practices lead to

wide-spread cookie leaks. Using automated methods we

measure such cookie leaks to CNAME-based trackers

and identify cookie leaks on 95% of the sites embedding

CNAME-based trackers. Although most of these leaks

are due to first-party cookies set by other third-party

scripts, we also find cases of cookie leaks to CNAME-

based trackers in POST bodies and in URL parame-

ters, which indicates a more active involvement by the

CNAME-based trackers.

Furthermore, through a series of experiments, we

report on the increased threat surface that is caused by

including the tracker as same-site. Specifically, we find

several instances where requests are sent to the tracking

domain over an insecure connection (HTTP) while the

page was loaded over a secure channel (HTTPS). This

allows an attacker to alter the response and inject new

cookies, or even alter the HTML code effectively launch-

ing a cross-site scripting attack against the website that

includes the tracker; the same attacks would have neg-

ligible consequences if the tracking iframe was included

from a cross-site domain. Finally, we detected two vul-

nerabilities in the tracking functionality of CNAME-

based trackers. This could expose the data of visitors on

all publisher websites through cross-site scripting and

session-fixation attacks.

In summary, we make the following contributions:

– We provide a general overview of the CNAME-

based tracking scheme, based on a large-scale anal-

ysis involving a custom detection method, allowing

us to discover previously unknown trackers.

– We perform a historical analysis to study the ecosys-

tem, and find that this form of first-party tracking

is becoming increasingly popular and is often used

to complement third-party tracking.

– Through a series of experiments, we analyze the se-

curity and privacy implications that are intrinsic

to the tracking scheme. We identify numerous is-

sues, including the extensive leakage of cookies set

by third-party trackers.

– Based on the observation of practical deployments

of the CNAME-based tracking scheme, we report

on the worrying security and privacy practices that

have negative consequences for web users.

– We discuss the various countermeasures that have

recently been developed to thwart this type of track-

ing, and assess to what extent these are resistant to

further circumvention techniques.

2 Background

2.1 Web browser requests

Upon visiting a web page, the browser will make various

requests to fetch embedded resources such as scripts,

style sheets and images. Depending on the relation between

the embedding website and the site that the resources

are hosted on, these can be same-origin, same-site

or cross-site. If the resource shares the same scheme

(i.e. http or https), host (e.g. www.example.com) and

port (e.g. 80 or 443) as the embedding site, it is con-

sidered same-origin. In case there is no exact match for

the host, but the resource is located on the same reg-

istrable domain name, the effective top level domain

plus one (eTLD+1 ), as the embedding website (e.g.

www.example.com and foo.example.com), it is consid-

ered same-site. Finally, resources that have a different

eTLD+1 domain with regard to the including website

are considered cross-site, i.e., resources from tracker.com

included on example.com are cross-site.
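
The three categories above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PUBLIC_SUFFIXES` set is a tiny stand-in for the Public Suffix List, and the helper names (`registrable_domain`, `origin`, `relation`) are hypothetical and not taken from the paper. Following the definition in the text, same-site is decided on the eTLD+1 alone.

```python
from urllib.parse import urlsplit

# Assumption: a toy stand-in for the Public Suffix List; real code needs the full list.
PUBLIC_SUFFIXES = {"com", "org", "net", "co.uk"}
DEFAULT_PORTS = {"http": 80, "https": 443}


def registrable_domain(host):
    """Return the eTLD+1 of host under the toy suffix list above."""
    labels = host.lower().rstrip(".").split(".")
    # Walk from the longest candidate suffix to the shortest, so "co.uk" wins over "uk".
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            if i == 0:
                raise ValueError(f"{host!r} is itself a public suffix")
            return ".".join(labels[i - 1:])
    # Unknown suffix: fall back to the last two labels.
    return ".".join(labels[-2:])


def origin(url):
    """Return the (scheme, host, port) triple that defines an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS[scheme]
    return scheme, parts.hostname, port


def relation(page_url, resource_url):
    """Classify a resource as same-origin, same-site or cross-site to the page."""
    page, res = origin(page_url), origin(resource_url)
    if page == res:
        return "same-origin"
    if registrable_domain(page[1]) == registrable_domain(res[1]):
        return "same-site"
    return "cross-site"
```

For example, `relation("https://www.example.com/", "https://foo.example.com/a.js")` yields `"same-site"`, while a resource on `tracker.com` is classified as `"cross-site"`.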

Prior to making the connection to the server, the do-

main name first needs to be resolved to an IP address.

In the most straightforward case, the DNS resolution of

the domain name returns an A record containing the IP

address. However, the domain could also use a CNAME

record to refer to any other domain name. This can

be an iterative process as the new domain name can

again resolve to another CNAME record; this process continues

until an A record is found. Through this indirection

of CNAMEs, the host that the browser connects to

may belong to a different party, such as a tracker, than

the domain it actually requests the resource from. This

means that requests to xxx.example.com may actually

be routed to a different site, such as yyy.tracker.com.
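
The iterative resolution described above can be illustrated with a minimal sketch. It does not query a real resolver: the `RECORDS` table is hypothetical zone data that mirrors the example hostnames in the text, the IP addresses are placeholders, and the `resolve` helper and its hop limit are assumptions made for illustration.

```python
# Hypothetical zone data; a real tool would query a DNS resolver instead.
RECORDS = {
    "xxx.example.com": ("CNAME", "yyy.tracker.com"),
    "yyy.tracker.com": ("A", "1.2.3.4"),
    "www.example.com": ("A", "5.6.7.8"),
}


def resolve(name, records=RECORDS, max_hops=10):
    """Follow CNAME records until an A record is found.

    Returns (chain, ip), where chain lists every name visited, in order.
    """
    chain = [name]
    for _ in range(max_hops):
        rtype, value = records[chain[-1]]
        if rtype == "A":
            return chain, value
        # CNAME: continue the lookup with the name the record points to.
        chain.append(value)
    raise RuntimeError("CNAME chain too long or looping")
```

With this data, `resolve("xxx.example.com")` visits `xxx.example.com` and then `yyy.tracker.com` before reaching an address, so the connection ends up at a host operated by a different party than the one named in the request.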

Cookie scoping Before a request is sent, the

browser will first determine which cookies to attach in

the HTTP request. This includes all cookies that were

set on the same (sub)domain as the one where the re-

quest will be sent to. Other cookies that will be in-

cluded are those that were set by a same-site resource,

i.e. either on another subdomain, or on the top do-

main, and had the Domain attribute set to the top do-

main, for instance by the following response header on

https://sub.example.com/

Set-Cookie: cookie=value; Domain=example.com

Cookies that were set without the Domain attribute will

only be included on requests that are same-origin to

the response containing the Set-Cookie header. The

SameSite attribute on cookies determines whether a

cookie will be included if the request is cross-site. If the

value of this attribute is set to None, no restrictions will

be imposed; if it is set to Lax or Strict, it will not be in-

cluded on requests to resources that are cross-site to the

embedding website; the latter imposes further restric-

tions on top-level navigational requests. Several browser

vendors intend to move to a configuration that assigns

SameSite=Lax to all cookies by default [11, 31, 51]. As

such, for third-party tracking to continue to work, the

cookies set by the trackers explicitly need to set the

SameSite=None attribute, which may make them easier

to distinguish. For CNAME-based tracking, where the

tracking requests are same-site, the move to SameSite=Lax

cookies by default has no effect.
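
The scoping rule for the Domain attribute can be sketched as follows. This is a simplified model, not a browser implementation: it ignores the Path, Secure and SameSite attributes as well as expiry, and the `Cookie` class and `is_sent` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cookie:
    name: str
    set_by_host: str               # host of the response that set the cookie
    domain: Optional[str] = None   # value of the Domain attribute, if any


def is_sent(cookie, request_host):
    """Decide whether the cookie is attached to a request to request_host."""
    request_host = request_host.lower()
    if cookie.domain is None:
        # No Domain attribute: only sent back to the exact host that set it.
        return request_host == cookie.set_by_host.lower()
    domain = cookie.domain.lower().lstrip(".")
    # Domain attribute set: sent to that domain and to all of its subdomains.
    return request_host == domain or request_host.endswith("." + domain)
```

Under this model, a cookie set on `sub.example.com` with `Domain=example.com` is also attached to requests to `track.example.com`, which is why a tracker reached through a same-site subdomain can receive it.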

2.2 Tracking

2.2.1 Third-party tracking

In a typical tracking scenario, websites include resources

from a third-party tracker in a cross-site context. As

a result, when a user visits one of these web pages, a

cookie originating from the third party is stored in the

visitor’s browser. The next time a user visits a website

on which the same tracker is embedded, the browser will

include the cookie in the request to the tracker. This

scheme allows trackers to identify users across differ-

ent websites to build detailed profiles of their browsing

behavior. Such tracking has triggered privacy concerns

and has resulted in substantial research effort to under-

stand the complexity of the tracking ecosystem [21, 33]

and its evolution [29].

2.2.2 First-party tracking

In first-party tracking the script and associated analyt-

ics requests are loaded from a same-site origin. Conse-

quently, any cookie that is set will only be included with

requests to the same site. Historically, one method that

was used to bypass this limitation was cookie match-

ing [40], where requests containing the current cookie

are sent to a common third-party domain. However,

such scripts can be blocked by anti-tracking tools based

on simple matching rules. Instead, the technique cov-

ered in this work uses a delegation of the domain name,

which circumvents the majority of anti-tracking mecha-

nisms currently offered to users.

2.2.3 CNAME-based tracking

General overview In the typical case of third-party

tracking, a website will include a JavaScript file from

the tracker, which will then report the analytics infor-

mation by sending (cross-site) requests to the tracker

domain. With CNAME-based tracking, the same oper-

ations are performed, except that the domain that the

scripts are included from and where the analytics data

is sent to, is a subdomain of the website. For example,

the website example.com would include a tracking script

from track.example.com, thus effectively appearing as

same-site to the including website. Typically, the sub-

domain has a CNAME record that points to a server of

the tracker. An overview of the CNAME-based tracking

scheme is shown in Figure 1.

Bypassing anti-tracking measures The CNAME

tracking scheme has direct implications for many anti-

tracking mechanisms. Because the requests to the track-

ing services are same-site (i.e. they point to the same

eTLD+1 domain as the visited website), countermea-

sures that aim to block third-party cookies, such as Sa-

fari’s ITP, are effectively circumvented. Other popular

anti-tracking mechanisms that rely on blocking requests

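[Figure 1: the browser visiting example.com queries the resolver for track.example.com and receives CNAME: x.tracker.com; a second query for x.tracker.com returns A: 1.2.3.4; the browser then sends GET /track.js for track.example.com and receives 200 OK with Set-Cookie: uid=xyz.]

Fig. 1. Overview of CNAME-based tracking.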

or cookies by using block lists (such as EasyPrivacy [18]

or Disconnect.me [16]) become much harder to main-

tain when trackers are served from a custom subdomain

that is unique to every website. To block CNAME-based

tracking, block lists would need to either contain an entry

for every website that uses the CNAME-based tracking

service (instead of a single entry per tracker) or match

against all DNS-level domains, leading to greater performance

costs.

As a consequence of how the CNAME-based track-

ing scheme is constructed, it faces certain limitations in

comparison to third-party tracking. For instance, there

no longer exists a common identifier shared across the

different websites (in typical third-party tracking, the

third-party cookie is responsible for this functionality).

Consequently, visits to different websites cannot be at-

tributed to the same user using standard web develop-

ment features.

3 Detecting CNAME-based tracking

In this section we describe the composition of the

datasets along with the various steps of our method-

ology that we used to detect CNAME-based trackers

and the publishers that include them.

3.1 Dataset

In order to analyze the CNAME-based tracking scheme

at scale, we leveraged the (freely available) crawling

data from HTTP Archive [6]. This dataset originates

from visiting the home page of all origins from the

Chrome User Experience Report (CrUX), which lists

websites (including those hosted on subdomains) fre-

quently visited by Chrome users. The results reported

in this section are based on the desktop crawl performed

in October 2020, consisting of 5,506,818 visited web pages

from 4,218,763 unique eTLD+1 domains. The informa-

tion contained in this dataset includes all request and

response headers of all the requests (507M in total) that

were made when visiting the web pages with the latest

Chrome browser. As the dataset only contains the IP

address of the remote host that was connected to at the

time of making the request, we extended the dataset

with DNS records (in particular CNAME) obtained by

running zdns [57] on all first-party subdomains.

3.1.1 Methodology

Discovering trackers To detect services that offer

CNAME-based tracking, we used a three-pronged ap-

proach that leverages features intrinsic to the ecosystem,

combining both automated and manual analysis. First

we filtered all requests from HTTP Archive’s dataset

and only considered the ones that were same-site but

not same-origin, i.e. the same eTLD+1 but not the ex-

act same origin as the visited web page. Furthermore, we

only retained requests to domain names that returned

a CNAME record referring (either directly or indirectly

after redirection of other CNAME records) to a differ-

ent eTLD+1 domain in our DNS data. We aggregated

these requests on the eTLD+1 of the CNAME record,

and recorded a variety of information, such as the av-

erage number of requests per website, variation of re-

quest size, percentage of requests that contain a cookie

or set one via the HTTP response header, etc. In Ap-

pendix B we elaborate on these features and discuss how

they could be used to assist or automate the detection

of CNAME-based tracking. Out of the resulting 46,767

domains, we only consider the ones that are part of a

CNAME-chain on at least 100 different websites, which

leaves us with 120 potential CNAME-based trackers.
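
The first, automated phase can be sketched roughly as below. This is not the authors' pipeline: the input shapes, the `candidate_trackers` name, and the naive eTLD+1 helper (last two labels only, no Public Suffix List) are assumptions made for illustration, and the sketch compares hostnames rather than full origins.

```python
from collections import defaultdict


def registrable_domain(host):
    # Assumption: naive eTLD+1 (last two labels); real code needs the Public Suffix List.
    return ".".join(host.lower().rstrip(".").split(".")[-2:])


def candidate_trackers(requests, cname_targets, min_sites=100):
    """Count, per CNAME target eTLD+1, the publisher sites that point to it.

    requests: iterable of (page_host, request_host) pairs.
    cname_targets: dict mapping a hostname to the final name of its CNAME chain.
    """
    sites_per_target = defaultdict(set)
    for page_host, request_host in requests:
        site = registrable_domain(page_host)
        # Keep requests that are same-site but not to the page's own host.
        if request_host == page_host or registrable_domain(request_host) != site:
            continue
        target = cname_targets.get(request_host)
        if target is None:
            continue
        target_site = registrable_domain(target)
        # Keep only CNAME chains that leave the publisher's eTLD+1.
        if target_site != site:
            sites_per_target[target_site].add(site)
    # Retain targets that appear on at least min_sites different websites.
    return {t: len(s) for t, s in sites_per_target.items() if len(s) >= min_sites}
```

The default threshold of 100 websites mirrors the cut-off stated in the text; the per-target statistics mentioned above (request counts, cookie usage) would be collected in the same loop.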

In the second phase, we performed a manual anal-

ysis to rule out services that have no strict intention to

track users. Many services that are unrelated to track-

ing, such as CDNs, use a same-site subdomain to serve

content, and may also set a cookie on this domain,

thus giving them potential tracking capabilities. For in-

stance, Cloudflare sets a _cfduid cookie in order to de-

tect malicious visitors, but does not intend to track users

with this cookie (user information is kept less than 24

hours) [12]. For each of the 120 domains, we visited the

web page of the related organization (if available) and

gathered information about the kind of service(s) it pro-

vides according to the information and documentation

provided on its website. Based on this information, we

then determined whether tracking was the main service

provided by this company, either because it explicitly in-

dicated this, or tracking would be required for the main

advertised product, e.g. in order to provide users with

personalized content, or whether this was clear from

the way the products were marketed. For instance, one

such provider, Pardot, offers a service named “Marketing

Automation”, which they define as “a technology that

helps businesses grow by automating marketing pro-

cesses, tracking customer engagement, and delivering

personalized experiences to each customer across marketing,

sales, and service”¹, indicating that customers

(website visitors) may be tracked. Finally, we validate

this based on the requests sent to the purported tracker

when visiting a publisher website: we only consider a

company to be a tracker when a uniquely identifying pa-

rameter is stored in the browser and sent along with sub-

sequent requests, e.g. via a cookie or using localStorage.

Using this method, we found a total of 5 trackers. Fur-

thermore, we extended the list with eight trackers from

the CNAME cloaking blocklist by NextDNS [13, 37].

Four of the trackers we detected in our manual analysis

were not included in the block list. We left two of the

trackers from the list out of consideration, as they were

not included in the DNS data. In total we consider 13

CNAME-based trackers.

Detecting the prevalence of CNAME-based

tracking By examining request information to host-

names having a CNAME record to one of the identified

trackers, we manually constructed a signature for all

tracking requests for each of the 13 trackers, based on

the DNS records and request/response information (e.g.

the same JavaScript resource being accessed or a request

URL according to a specific pattern). This allows us to

filter out any instances where a resource was included

from a tracking provider but is unrelated to tracking,

as the providers may offer various other services and

simply relying on DNS data to detect CNAME publisher

domains leads to an overestimate (we justify this

claim in Section 5.2). Using this approach, we detected

a total of 10,474 websites (eTLD+1) that used at least

one of the trackers; we explore these publishers that use

CNAME tracking in more detail in Section 4.2.

¹ https://www.pardot.com/what-is-marketing-automation/

3.2 Alternative user agent

A limitation of the HTTP Archive dataset is that

all websites were visited with the Chrome User-Agent

string, a browser that does not have built-in tracking

protection. Furthermore, only the home page of each

website was visited. To evaluate whether these limitations

would affect our results, we performed a crawling

experiment on the Tranco top 10,000 websites²;

for every website, we visited up to 20 web pages (to-

taling 146,397 page visits). We performed the experi-

ment twice: once with the Chrome User-Agent string,

and once with Safari’s. The latter is known for its

strict policies towards tracking, and thus may receive

different treatment. We used a headless Chrome in-

strumented through the Chrome DevTools Protocol

[43] as our crawler. A comparative analysis of these

two crawls showed that one tracker, namely Criteo,

would only resort to first-party tracking for Safari users.

Previously, this tracker was found to abuse top-level

redirections [41] and leverage the HTTP Strict Trans-

port Security (HSTS) mechanism to circumvent Safari’s

ITP [22, 48].

3.3 Coverage

Finally, to analyze the representativeness of our results

and determine whether the composition of the HTTP

Archive dataset did not affect our detection, we per-

formed a comparative analysis with our custom crawl. In

the 8,499 websites that were both in the Tranco top 10k,

and the HTTP Archive dataset, we found a total of 465

(5.47%) websites containing a CNAME-based tracker.

These included 66 websites that were not detected to

contain CNAME-based tracking based on the data from

HTTP Archive (as it does not crawl through differ-

ent pages). On the other hand, in the HTTP Archive

dataset we found 209 websites that were detected to

contain a CNAME-based tracker, which could not be

detected as such based on our crawl results. This is because

the HTTP Archive dataset also contains popular

subdomains, which are not included in the Tranco

list. As such, we believe that the HTTP Archive dataset

provides a representative view of the state of CNAME-based

tracking on the web. We note however that the

numbers reported in this paper should be considered

lower bounds, as certain instances of tracking can only

be detected when crawling through multiple pages on a

website.

² https://tranco-list.eu/list/6WGX/10000

4 CNAME-based tracking

In this section, we provide an in-depth overview of

the CNAME-based tracking ecosystem through a large-

scale analysis.

4.1 CNAME-based trackers

An overview of the detected trackers can be found in

Table 1. For every tracker we indicated the number of

publishers, counted as the number of unique eTLD+1

domains that have at least one subdomain set up to

refer to a tracker (typically with a CNAME record).

Furthermore, we estimated the total number of publishers

by leveraging DNS information from the SecurityTrails

API [49]. More precisely, all CNAME-based

trackers either require the publishers that include them

to set a CNAME record to a specific domain, or the

trackers create a new subdomain for every publisher.

As such, the estimated number of publishers could be

determined by finding the domains that had a CNAME

record pointing to the tracker, or by listing the sub-

domains of the tracker domain and filtering out those

that did not match the pattern that was used for pub-

lishers. For Ingenious Technologies we were unable to

estimate the total number of publishers as they use a

wildcard subdomain (and thus it could not be deter-

mined whether a subdomain referred to an actual pub-

lisher using CNAME tracking).

We noted the price of the services offered by the

tracker suppliers when such information was available,

either from the tracker’s website or through third-party

reviews. In most cases, with the exception of TraceDock,

which specifically focuses on providing mechanisms for

circumvention of anti-tracking techniques, the offered

services included a range of analytics and marketing

tools.

Finally, for every tracker we determined whether

tracking requests would be blocked by three relevant

anti-tracking solutions: uBlock Origin (version 1.26) on

both Firefox and Chrome, and the NextDNS CNAME

blocklist [36], which was used to extend the list of trackers

we considered. As of version 1.25 of uBlock Origin,

the extension on Firefox implements a custom defense

against CNAME-based tracking [1]: it resolves

the domain name of requests that are not filtered

by the standard block list and then checks

the resolved CNAME records against this block list.

Because Chrome does not support a DNS resolution

API for extensions, the defense could not be applied

to this browser. Consequently, we find that four of

the CNAME-based trackers (Oracle Eloqua, Eulerian,

Criteo, and Keyade) are blocked by uBlock Origin on

Firefox but not on the Chrome version of the anti-

tracking extension.
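
The difference between plain hostname matching and the CNAME-aware check can be illustrated with a short sketch. It is not uBlock Origin's actual code: the `is_blocked` function, the block-list contents and the hostnames are hypothetical, and the CNAME chain is passed in by the caller rather than resolved.

```python
def is_blocked(hostname, blocklist, cname_chain=()):
    """Return True if hostname, or any name in its CNAME chain, is on the block list."""

    def listed(name):
        labels = name.lower().rstrip(".").split(".")
        # A block-list entry covers the listed domain and all of its subdomains.
        return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))

    # First check the requested hostname, then every name it resolves through.
    return listed(hostname) or any(listed(name) for name in cname_chain)
```

With a block list containing only `tracker.com`, a request to `track.example.com` passes the hostname check but is blocked once its CNAME target `x.tracker.com` is supplied. Without access to DNS resolution, only the first check is available, which corresponds to the situation described for Chrome.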

4.2 Tracking publishers

As a result of our analysis of the HTTP Archive dataset,

we detected 10,474 eTLD+1 domains that had a sub-

domain pointing to at least one CNAME-based tracker,

with 85 publishers referring to two different trackers.

We find that for 9,501 publisher eTLD+1s the tracking

request is included from a same-site origin, i.e., the publisher

website has the same eTLD+1 as the subdomain

it includes tracker content from. Furthermore, on 18,451

publisher eTLD+1s we found the tracker was included

from a cross-site origin; these were typically sites that

were related in some way, e.g. belonging to the same

organization. Although these instances cannot circum-

vent countermeasures where all third-party cookies are

blocked, e.g. the built-in protection of Safari, they still

defeat blocklists.

Figure 2 displays the percentage of publisher

eTLD+1s involved in CNAME-based tracking, both in a

same-site or cross-site context, for bins of 10,000 Tranco-

ranked websites. The ratio of same-site to cross-site

CNAME-based tracking is consistently between 50%

and 65% for all bins. We can clearly see that the use of

CNAME-based tracking is heavily biased towards more

popular websites. In the top 10,000 Tranco websites 10%

refer to a tracker via a CNAME record. Because our

dataset only contains information about the homepage

of websites, and does not include results from Criteo, the

reported number should be considered a lower bound.

Using the categorization service by McAfee [34],

we determined the most popular categories among

Table 1. Overview of the analyzed CNAME-based trackers, based on the HTTP Archive dataset from October 2020. The last three columns indicate whether requests to the tracker are blocked by the listed tool (✓: blocked; ✗: not blocked).

Tracker                 Detected      Est. total    Pricing      uBlock Origin  uBlock Origin  NextDNS
                        # publishers  # publishers  (min. /mo)   Firefox        Chrome         CNAME blocklist
Pardot                  5,993         21,759        $1,250       ✓*             ✓*             ✗
Adobe Experience Cloud  2,612         9,029         $5,000†      ✓              ✓              ✓
Act-On Software         1,041         2,533         $900         ✓              ✓              ✗
Oracle Eloqua           304           3,743         $2,000†      ✓              ✗              ✗
Eulerian                253           1,501         ?            ✓              ✗              ✓
Webtrekk                101           822           ?            ✓              ✓              ✓
Ingenious Technologies  41            -             ?            ✗              ✗              ✓
TraceDock               49            69            €49          ✗              ✗              ✓
<intent>                14            124           ?            ✗              ✗              ✓
AT Internet             31            74            €355         ✗              ✗              ✓
Criteo                  16            13,082        ?            ✓              ✗              ✓
Keyade                  12            86            ?            ✓              ✗              ✓
Wizaly                  12            55            $2,000†      ✗              ✗              ✓

†: Pricing information does not originate from the original source, but is as reported in reviews of the product.

*: Requests made to the CNAME subdomain are triggered by a third-party analytics script hosted on pardot.com; the block list prevents the analytics script from loading. If this script was loaded from the CNAME domain, it would not be blocked.

[Figure 2: chart; x-axis: rank, with ticks from 100,000 to 1,000,000; y-axis: % websites using CNAME-based tracking.]

Fig. 2. Percentage of websites using CNAME-based tracking per bin of 10,000 ranks.

CNAME-based tracking publishers, as shown in Fig-

ure 3. As a baseline comparison, we also include the

distribution of categories in the Tranco top 10k. Because

of the strong financial motives to perform tracking,

e.g. marketing and attribution of online purchases, it

is not surprising that publishers are mainly financially-focused,

with approximately 40% of the publishers’ websites

being categorized as Business.

Finally, we explored to what extent publishers that

employ CNAME-based tracking also include third-party

trackers. To this end we analyzed all requests using

the EasyPrivacy blocklist [18] to determine the num-

ber of trackers that would be blocked by this list. We

find that on the vast majority of websites that include

a CNAME-based tracker (93.97%) at least one third-

[Figure 3: chart; categories: Business, Finance, Marketing, Shopping, Internet, Education, Travel, Real Estate, Health; axis: % of hostnames belonging to category; series: CNAME publishers, Tranco top 10k.]

Fig. 3. Most popular categories among CNAME-based tracking publishers.

party tracker was present; on average these sites had

28.43 third-party tracking requests. This clearly shows

that CNAME-based tracking is most often used in con-

junction with other types of tracking. From a privacy

perspective this may cause certain issues, as the other

trackers may also set first-party cookies via JavaScript;

we explore this in more detail in Section 6.

5 Historical Evolution

In this section we report on various analyses we

performed to capture the longitudinal evolution of

CNAME-based tracking.

[Figure 4: flow diagram with the elements: October 2020; ZDNS; CNAME records; A records; determine trackers; IPs & hostnames used by trackers overall; previous month; IPs & hostnames used by trackers in previous month; current month; patterns of tracking requests; requests; CNAME-based trackers.]

Fig. 4. Overview of the methodology that was used to determine CNAME-based trackers over time.

5.1 Uptake in CNAME-based tracking

First, we explore the change in prevalence of CNAME-

based tracking over time. To achieve this, we leverage

the dataset of HTTP Archive, which is collected on a

monthly basis and dates back several years. We consider

the datasets from December 2018, when the pages from

the Chrome User Experience Report started to be used

as input for their crawler, until October 2020.

To determine the number of publishers using

CNAME tracking over time, we used an iterative ap-

proach as shown in Figure 4. Starting from the most

recent month (October 2020), we obtained the domain

names and associated IP addresses that were used to

connect to the CNAME-trackers. Next, we use data

from HTTP Archive’s dataset from the previous month

to determine all IP addresses that (confirmed) CNAME

domains resolve to, allowing us to capture changes of IP

addresses by trackers. By adding these IP addresses to

the list of IPs we found in October through a scan with

zdns, we obtain a set of IP addresses that were ever

used by the different CNAME trackers. Furthermore,

whenever we noticed that a tracker is using IPs within

a certain range for the tracking subdomains, we added

the whole range to the set of used IPs (e.g. Eulerian allo-

cates IP addresses in the range 109.232.192.0/21 for the

tracking subdomains). Relying just on the IP informa-

tion would likely lead to false positives as the trackers

provide various other services which may be hosted on

the same IP address, and ownership of IP addresses may

change over time. To prevent marking unrelated services

as tracking, we rely on our manually-defined request sig-

natures (as defined in Section 3.1.1) to filter out any

requests that are unrelated to tracking. Using the domain names of the confirmed tracking requests and the set of IP addresses associated with tracking providers, we can apply the same approach again for the previous month. We repeat this process for every month between October 2020 and December 2018.

Fig. 5. Number of eTLD+1 domains that include CNAME-based tracking in a same-site and cross-site context. [Line plot: x-axis, December 2018 to October 2020; y-axis, number of publisher eTLD+1s; series: Same-site, Cross-site, Total.]

Fig. 6. Relative percentage, based on the state as of December 2018, of the number of publishers of popular and less popular trackers and CNAME-based trackers. [Line plot: x-axis, December 2018 to October 2020; y-axis, change in # publishers; series: most widely used trackers, less popular trackers, and CNAME-cloaking publisher domains, all in the Tranco top 10k.]
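The iterative procedure described above can be sketched in Python as follows. This is a minimal illustration, not the pipeline used for the measurements: the request layout (dicts with `host`, `ip` and `url` keys), the function name `propagate_backwards`, and the `matches_signature` callback that stands in for the manually defined request signatures of Section 3.1.1 are all assumptions made for the example.

```python
from ipaddress import ip_address, ip_network


def propagate_backwards(months, seed_ips, seed_hosts, ranges, matches_signature):
    """Walk monthly request lists from newest to oldest.

    months: list of request lists, newest month first; each request is a
            dict with 'host', 'ip' and 'url' keys (assumed layout).
    seed_ips / seed_hosts: IPs and hostnames confirmed for the newest month.
    ranges: CIDR blocks known to be used by a tracker.
    matches_signature: callable(url) -> bool, hypothetical signature check.
    Returns one set of confirmed tracking hostnames per month.
    """
    known_ips = set(seed_ips)
    known_hosts = set(seed_hosts)
    networks = [ip_network(r) for r in ranges]

    def in_tracker_space(ip):
        return ip in known_ips or any(ip_address(ip) in net for net in networks)

    per_month = []
    for requests in months:
        # Step 1: learn the IPs that already-confirmed hostnames used this month,
        # which captures IP changes made by the tracker.
        known_ips |= {r["ip"] for r in requests if r["host"] in known_hosts}
        # Step 2: an IP match alone is not enough; the request must also
        # match a tracking signature to avoid flagging unrelated services.
        confirmed = {
            r["host"]
            for r in requests
            if in_tracker_space(r["ip"]) and matches_signature(r["url"])
        }
        known_hosts |= confirmed
        per_month.append(confirmed)
    return per_month
```

The signature check in step 2 is what keeps unrelated services hosted on the same IP address from being counted as tracking.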

Figure 5 shows the total number of publisher

eTLD+1s using CNAME-based tracking, either in a

same-site or cross-site context. The sudden drop in num-

ber of cross-site inclusions of CNAME trackers in Oc-

tober 2019 is mainly due to a single tracker (Adobe

Experience Cloud). We suspect it is related to changes

it made with regard to CCPA regulations (the HTTP

Archive crawlers are based in California) [5]. In gen-

eral, we find that the number of publisher sites that

employ CNAME-based tracking is gradually increasing

over time.

To further explore the evolution of the adoption of

CNAME-based tracking, we compare it to the evolution


of third-party tracking on the web. More specifically, for

the ten most popular tracking companies according to

WhoTracks.me [26], and fifteen randomly selected less

popular trackers with between 50 and 15,000 publishers

as of October 2020 (similar to the customer base we ob-

served for the CNAME-based trackers), we determined

the number of publishers in the Tranco top 10k list³, between December 2018 and October 2020. To this end we

used the EasyPrivacy block list, and only used the rules

that match the selected trackers. For the three cases

(popular trackers, less popular trackers and CNAME-

based trackers) we computed the relative increase or

decrease in number of publishers for the Tranco top 10k

websites. As the point of reference, we take the first en-

try of our dataset: December 2018. The relative changes

in the number of publishers are shown in Figure 6, and

indicate that the customer base of less popular trackers

declines whereas popular trackers retain a stable cus-

tomer base. This is in line with the findings of a study

by Cliqz and Ghostery [52]. Our results clearly show

that compared to third-party trackers, the CNAME-

based trackers are rapidly gaining in popularity, with

a growth of 21% over the past 22 months (compared to

a change of −3% for popular trackers and −8% for less

popular trackers).
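The relative change plotted in Figure 6 is a simple percentage difference against the December 2018 baseline. The sketch below shows the computation; the function name and the monthly counts in the usage example are made up for illustration and are not taken from the dataset.

```python
def relative_change(counts, baseline_key):
    """Return the percentage change of each entry relative to the baseline.

    counts: dict mapping a month label to the number of publishers.
    baseline_key: the month used as the point of reference.
    """
    base = counts[baseline_key]
    return {month: 100.0 * (n - base) / base for month, n in counts.items()}


# Hypothetical counts, chosen only to illustrate a 21% growth.
example = {"2018-12": 1000, "2020-10": 1210}
changes = relative_change(example, "2018-12")
```

With these illustrative numbers, `changes["2020-10"]` evaluates to a growth of 21% and the baseline month itself to 0%.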

5.2 Method evaluation

In this section, we evaluate the method we used to de-

tect CNAME-based tracking throughout time for cor-

rectness and completeness. For this analysis, we make

use of historical DNS data provided by Rapid7 [44]. We

try to determine both the web pages that were incor-

rectly considered to be using CNAME-based tracking,

as well as publishers that we might have missed by using

our method.

Correctness To assess the correctness of our ap-

proach, we looked for subdomains that we considered

to be using CNAME tracking for each month of our

analysis (December 2018 until October 2020), but that

did not have a CNAME record pointing to a tracker in

the corresponding month in the historical Rapid7 DNS

dataset. We found 81 publishers, 0.46% of the 17,633

publishers that we determined over the whole period,

that could potentially be labeled incorrectly. Upon a

closer examination, we find that all of these 81 publish-

ers were in fact correctly marked as such.

³ https://tranco-list.eu/list/Z7GG/10000

These 81 publishers fall into three major groups, plus three remaining cases, according to the reason for the mismatch between the datasets. First, because of the timing difference between the HTTP Archive dataset and the Rapid7 dataset, the tracking domain of 21 publishers did not yet appear in the Rapid7 DNS dataset in the first month in which they used CNAME-based tracking. Second, we found that 15 CNAME-based tracking domains had incorrectly configured DNS records, causing them to send tracking requests to a non-existent or typo domain. For instance, several CNAME records pointed to a .207.net domain instead of a .2o7.net domain. Third, we found 42 publisher tracking subdomains that did not have a CNAME record pointing to a known tracking domain. Instead, they pointed to another domain that would still resolve to the same IP address used by the tracker. This occurs when the tracker

adds a new tracking domain but the publisher that in-

cluded it did not yet update their CNAME records. For

example, we observe nine publisher subdomains that

have a CNAME record pointing to .ca-eulerian.net,

whereas the currently used domain is .eulerian.net.

On the other hand, as of October 2020, Adobe Ex-

perience Cloud added a new tracking domain, namely

data.adobedc.net; in the dataset of this month we

found 33 tracking subdomains that already started re-

ferring to it. As our method is agnostic of the domain

name used in the CNAME record of the publisher sub-

domain (the domain name may change over time), it

can detect these instances, in contrast to an approach

that is purely based on CNAME records. Finally, for

the remaining three publishers, we found that a DNS

misconfiguration on the side of the publisher caused the

CNAME record to not correctly appear in the Rapid7

dataset. Although tracking requests were sent to the

tracking subdomain, these subdomains would not al-

ways resolve to the correct IP address, or return dif-

ferent results based on the geographic location of the

resolver.

As a result, we conclude that all of the publishers were

correctly categorized as using CNAME-based tracking.

Moreover, our method is robust against changes in

tracking domains used by CNAME trackers.

Completeness We evaluate the completeness of

our method by examining domain names that we did

not detect as publishers, but that do have a CNAME

record to a tracking domain. Our detection method uses

an accumulating approach starting from the most recent

month’s data (October 2020) and detecting CNAME-

based tracking for each previous month, based on the

current month’s data. For this reason, we only consider


publisher subdomains that we might have missed in the

final month of our analysis (December 2018), where the

error from missed domains would be most notable. Out of the

20,381 domain names that have a CNAME record in the

Rapid7 dataset pointing to a tracking domain, 12,060

(59.2%) were not present in the HTTP Archive dataset.

From the remaining domain names, 7,866 (38.6%) were

labeled as publishers by us, leaving 455 (2.2%) domain

names that we potentially missed as a consequence of

using our method. After examining the HTTP Archive

dataset for these domains, we find that for 195 host-

names the IP address is missing in the dataset. For

the remaining 260 domains, we find that the majority

(196) does not send any tracking-related request to the

tracker, which could indicate that the tracking service is

not actively being used. For 41 domain names, we find

that the sent requests do not match our request pat-

tern, and further examination shows that these are in

fact using another service, unrelated to tracking, from

one of the providers. The remaining 22 domain names

were missed as publishers in our method since these re-

solved to an IP address that was not previously used for

CNAME-based tracking.

Our results show that relying solely on DNS data to de-

tect CNAME-based tracking leads to an overestimation

of the number of publishers. Furthermore, our method

missed only 0.28% of CNAME-based tracking publish-

ers due to irregularities in the set of IP addresses used

by CNAME-based tracking providers. A downside of

our method is that it cannot automatically account for

changes of the request signature used by CNAME track-

ers throughout time. However, we note that in the anal-

ysis spanning 22 months, we did not encounter changes

in the request signature for any of the 13 trackers.
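The percentages in the completeness evaluation can be reproduced from the counts given above. The short check below uses only numbers stated in the text; the one assumption is the denominator of the miss rate, which we take to be the labeled publishers plus the 22 missed ones (dividing by the labeled publishers alone also rounds to 0.28%).

```python
total_cname_domains = 20381   # domains with a CNAME record to a tracker (Rapid7)
not_in_http_archive = 12060   # not crawled by HTTP Archive
labeled_as_publisher = 7866   # detected by the method
missed = 22                   # resolved to a previously unseen tracker IP

# Domains that were crawled but not labeled: the candidates for misses.
potentially_missed = total_cname_domains - not_in_http_archive - labeled_as_publisher


def share(n):
    """Percentage of all CNAME domains, rounded to one decimal."""
    return round(100 * n / total_cname_domains, 1)


# Assumed denominator: all publishers that should have been detected.
miss_rate = round(100 * missed / (labeled_as_publisher + missed), 2)
```

Here `potentially_missed` is 455, the three shares come out at 59.2%, 38.6% and 2.2%, and `miss_rate` is 0.28%.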

Tracker domain ownership Lastly, we verify

whether the ownership of the IP-addresses used by the

thirteen trackers changes throughout time. To achieve

this, we examine PTR records of the IP-addresses used

for tracking in December 2018 and check whether the

owner company of the resulting domains has changed

since then, by using Rapid7’s reverse DNS dataset [45]

and historical WHOIS data [53]. We find that all of the

IP addresses point to domains owned by the correspond-

ing tracker. Furthermore, for 7 trackers, the ownership

of the tracking domains has not changed since Decem-

ber 2018. 6 trackers had redacted their WHOIS infor-

mation due to privacy, out of which 1 was not updated

throughout our measurement period. The other 5 have

been updated recently and therefore we cannot conclude

that their owner has remained the same. We do suspect

this is the case, however, since all of the domains were owned by the corresponding tracker before the details became redacted.

Fig. 7. Number of third-party trackers adopted by publishers in the six months before and after they adopted a CNAME-based tracker. [Line plot: x-axis, months before/after the website adopted a CNAME-based tracker (6 before to 6 after); y-axis, average number of third-party trackers.]

5.3 Effects on third-party tracking

In order to gather more insight on the reasons as to why

websites adopt CNAME-based tracking, we performed

an additional experiment. We posed the hypothesis that

if the number of third-party trackers employed by web-

sites decreases after they started using the CNAME-

based tracking services, this would indicate that the

CNAME-based tracking is used as a replacement for

third-party tracking. A possible reason for this could

be privacy concerns: without any anti-tracking mea-

sures, third-party tracking allows the tracker to build

profiles of users by following them on different sites,

whereas CNAME-based tracking only tracks users on

a specific site (assuming that the tracker acts in good

faith). Conversely, if the number of third-party track-

ers remains stable or even increases, this would indicate

that CNAME-based tracking is used in conjunction with

third-party tracking, e.g. to still obtain information on

users that employ anti-tracking measures.

To measure the evolution of the number of third-

party trackers on publisher sites that recently adopted

CNAME-based tracking, we again use the measure-

ments ranging between December 2018 and October

2020 from the HTTP Archive dataset. We consider a publisher website including a CNAME tracker to be a new adopter if for six consecutive months it did not refer to this tracker through a CNAME record on a subdomain, and then for the following six months always included a resource from this tracker. In total we found 1,129 publishers that started using CNAME tracking during the period of our analysis. For these publishers, we determined the number of third-party trackers based on

the EasyPrivacy blocklist for the six months before and

after the time the publishers adopted CNAME-based

tracking. The average number of third-party trackers

over this time period is shown in Figure 7. We find

that the adoption of CNAME-based tracking services

does not significantly affect the third-party trackers that

are in use, indicating that these CNAME-based trackers

are used to complement the information obtained from

other trackers.
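The criterion for a new adopter (six months without the tracker, followed by six months with it) can be expressed as a sliding-window check over a per-publisher presence timeline. The sketch below is illustrative only; the function name and the boolean-list representation are assumptions, not the code used in the study.

```python
def adoption_month(present, window=6):
    """Find the month in which a publisher adopted a CNAME tracker.

    present: list of booleans in chronological order, one per monthly
             crawl; True if the tracker was included that month.
    Returns the index of the first month of use, provided it is preceded
    by `window` months without the tracker and is the first of `window`
    consecutive months with it. Returns None otherwise.
    """
    for i in range(window, len(present) - window + 1):
        absent_before = not any(present[i - window:i])
        present_after = all(present[i:i + window])
        if absent_before and present_after:
            return i
    return None
```

For a timeline of six absent months followed by six present months the function returns index 6; the number of third-party trackers can then be averaged over the six months on either side of that index.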

6 Implications of first-party

inclusion

In this section we explore how CNAME-based tracking

can increase the security and threat surface, in compar-

ison to third-party tracking. Because the tracker is in-

cluded in a same-site context, there may be additional

security risks. Furthermore, privacy-sensitive informa-

tion, e.g. contained in cookies, may be inadvertently

sent to the tracker, posing increased threats for users.

6.1 Transport security

When visiting a website that employs CNAME-based

tracking, various types of requests are made to the

tracker-controlled subdomain. We find that most com-

monly, the web page makes a request to report analytics

data, typically via an asynchronous request or by creat-

ing an (invisible) <img> element. Additionally, we find

that in most cases the tracking script is also included

from the CNAME subdomain. To ensure that a man-in-

the-middle attacker cannot read or modify the requests

and responses, a secure HTTPS connection is required.

Based on the HTTP Archive dataset from July 2020,

we find that the vast majority (92.18%) of sites that use CNAME-based tracking support TLS, and in almost all cases the resources of the trackers are included

over a secure connection as well. Nevertheless, we did

identify 19 websites where active content, i.e. HTML

or JavaScript, was requested from the tracker over an

insecure connection. Although most modern browsers

block this, users of older browser versions would still be

susceptible to man-in-the-middle attacks.

On 72 websites we found that a request reporting

analytics data to the tracker was sent over HTTP while

the web page was loaded over HTTPS. In this case, the

request is not blocked but instead the browser warns the

user that the connection is not secure. Because this is

a same-site request (as opposed to a cross-site request

as would be the case with third-party tracking), cook-

ies that are scoped to the eTLD+1 domain, and that

do not contain the Secure attribute, are attached to

this request. Consequently these cookies can be inter-

cepted on the network by an adversary. Furthermore an

attacker could benefit from manipulating the response.

More precisely, an adversary could inject Set-Cookie

headers to set a cookie with an arbitrary value on the in-

cluding website, effectively launching a session-fixation

attack against the user [27, 46]. In the remainder of this

section, we explore the privacy and security threats as-

sociated with including the tracker as first-party in more

detail.
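To make the cookie exposure described above concrete, the following sketch models which cookies a browser would attach to a request, based on the Domain and Secure attributes only. It is a deliberate simplification (Path, SameSite and expiry are ignored), and the class, function and host names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cookie:
    name: str
    host: str      # host that set the cookie
    domain: str    # Domain attribute without leading dot; "" means host-only
    secure: bool   # True if the Secure attribute is set


def domain_matches(request_host, cookie_domain):
    """Suffix match on a label boundary, as used for the Domain attribute."""
    return request_host == cookie_domain or request_host.endswith("." + cookie_domain)


def cookies_sent(cookies, request_host, scheme):
    """Return the names of cookies attached to a request (simplified model)."""
    sent = []
    for c in cookies:
        if c.domain:
            in_scope = domain_matches(request_host, c.domain)
        else:
            in_scope = request_host == c.host
        # Secure cookies are withheld from plain HTTP requests.
        if in_scope and (scheme == "https" or not c.secure):
            sent.append(c.name)
    return sent


jar = [
    Cookie("session", "www.example.org", "example.org", secure=False),
    Cookie("auth", "www.example.org", "example.org", secure=True),
    Cookie("pref", "www.example.org", "", secure=False),
]
```

In this model, an HTTP request to a CNAME subdomain such as `metrics.example.org` carries the `session` cookie in clear text, whereas a request to a third-party host such as `tracker.example.net` carries none of the site's cookies.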

6.2 Tracker vulnerabilities: case studies

To further explore how the security of websites and

their visitors is affected by including a CNAME-based

tracker, we performed a limited security evaluation of

the trackers that are included on publisher websites.

For at most 30 minutes per tracker, we analyzed the requests and responses to/from the CNAME

subdomain for client-side web vulnerabilities. In most

cases, we found that only a single request was made,

and an empty response was returned. Despite the time-

limited nature of our analysis, we did identify vulnera-

bilities in two different trackers that affect all publishers

that include them. We reported the vulnerabilities to

the affected trackers and actively worked with them to

mitigate the issues. Unfortunately, in one instance the

tracker did not respond to repeated attempts to report

the vulnerability, leaving hundreds of websites exposed.

We hope to still be able to contact this vendor through

one of their customers.

6.2.1 Vulnerability 1: session fixation

The first vulnerability is caused by the tracker’s func-

tionality to extend the lifetime of first-party adver-

tising and analytics cookies, such as Facebook’s _fbp

cookie or the _ga cookie by Google Analytics. Because

these cookies are set by a cross-site script through the document.cookie API, Safari’s ITP and Firefox’s ETP

limit the lifespan of these cookies to 7 days. To overcome

these limits, the tracker provides a specific endpoint on

the CNAME subdomain that accepts a POST request

with a JSON payload containing the cookie names and


values whose lifetime should be extended. In the re-

sponse, the tracker’s server includes several Set-Cookie

headers containing the tracking cookies. Consequently,

these cookies are no longer set via the DOM API and

will have an extended lifetime in regard to Safari’s ITP

or Firefox’s ETP limits.

We found that the tracker endpoint did not ad-

equately validate the origin of the requests, nor the

cookie names and values. Consequently, through the

functionality provided by the tracker, which is enabled

by default on all the websites that include the tracker

in a first-party context, it becomes possible to launch a

session-fixation attack. For example, on a shopping site

the attacker could create their own profile and capture

the cookies associated with their session. Subsequently,

the attacker could abuse the session-fixation vulnerabil-

ity to force the victim to set the same session cookie

as the one from the attacker, resulting in the victim be-

ing logged in as the attacker. If at some point the victim

would try to make a purchase and enter their credit card

information, this would be done in the attacker’s pro-

file. Finally, the attacker can make purchases using the

victim’s credit card, or possibly even extract the credit

card information.
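The two missing checks named above, validating the origin of the request and the cookie names, could look roughly like the sketch below. This is a hypothetical illustration and not the tracker's actual code or its eventual fix: the function names, the allowlist contents, and the shape of the JSON payload (a flat mapping of cookie names to values) are all assumptions.

```python
from urllib.parse import urlsplit

# Illustrative allowlist: only known analytics cookies may be extended.
ALLOWED_COOKIE_NAMES = {"_ga", "_fbp"}


def is_request_allowed(origin_header, site_domain):
    """Accept only requests whose Origin is the site or one of its subdomains."""
    if not origin_header:
        return False
    host = urlsplit(origin_header).hostname
    if host is None:
        return False
    return host == site_domain or host.endswith("." + site_domain)


def cookies_to_extend(payload, origin_header, site_domain):
    """Filter the posted cookies down to those that are safe to re-set.

    payload: dict of cookie name -> value, as parsed from the POST body.
    Returns an empty dict for cross-site requests; otherwise only
    allowlisted names with string values are kept.
    """
    if not is_request_allowed(origin_header, site_domain):
        return {}
    return {
        name: value
        for name, value in payload.items()
        if name in ALLOWED_COOKIE_NAMES and isinstance(value, str)
    }
```

With such checks in place, a request from an attacker-controlled page, or one that tries to set a session cookie, would not result in any Set-Cookie header for that cookie.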

The impact of this vulnerability highlights the in-

creased threat surface caused by using the CNAME-

based tracking scheme. If a third-party tracker that was

included in a cross-site context would have the same vul-

nerability, the consequences would be negligible. The ex-

tent of the vulnerability would be limited to the setting

of an arbitrary cookie on a tracking domain (as opposed

to the first-party visited website) which would have no

effect on the user. However, because in the CNAME-

tracking scheme the tracking domain is a subdomain of

the website, cookies set with a Domain attribute of the

eTLD+1 domain (this was the default in the detected

vulnerability), will be attached to all requests of this

website and all its subdomains. As a result, the vulner-

ability does not only affect the tracker, but introduces

a vulnerability to all the websites that include it.

6.2.2 Vulnerability 2: cross-site scripting

The second vulnerability that we identified affects pub-

lishers that include a different tracker, and likewise it is

directly related to tracker-specific functionality. In this

case, the tracker offers a method to associate a user’s

email address with their fingerprint (based on IP ad-

dress and browser properties such as the User-Agent

string). This email address is later reflected in a dy-

namically generated script that is executed on every

page load, allowing the website to retrieve it again, even

if the user would clear their cookies. However, because

the value of the email address is not properly sanitized,

it is possible to include an arbitrary JavaScript payload

that will be executed on every page that includes the

tracking script. Interestingly, because the email address

is associated with the user’s browser and IP fingerprint,

we found that the payload will also be executed in a pri-

vate browsing mode or on different browser profiles. We

tested this vulnerability on several publisher websites,

and found that all could be exploited in the same way.

As such, the issue introduced by the tracking provider

caused a persistent XSS vulnerability in several hun-

dreds of websites.
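The class of bug described above can be illustrated with a small sketch. We do not know how the affected tracker builds its script, so both functions below are hypothetical: the first shows a naive string concatenation that reflects the stored value verbatim, the second serializes the value as a JSON string literal before embedding it.

```python
import json


def unsafe_script(email):
    """Naive reflection: the stored value is pasted into the script as-is."""
    return "var userEmail = '" + email + "';"


def safer_script(email):
    """Serialize the value as a JSON string literal before embedding it.

    json.dumps escapes quotes, backslashes and control characters, so the
    value cannot break out of the string. If the script were inlined in
    HTML, '<' would additionally need escaping.
    """
    return "var userEmail = " + json.dumps(email) + ";"


payload = "a@b.c';alert(1);//"
```

With this payload, `unsafe_script` produces a script in which `alert(1)` is a separate statement, while `safer_script` keeps the entire payload inside a single string literal.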

6.3 Sensitive information leaked to

CNAME-based trackers

CNAME-based trackers operate on a subdomain of pub-

lisher websites. It is therefore possible that cookies sent

to the tracker may contain sensitive information, such

as personal information (name, email, location) and au-

thentication cookies, assuming these sensitive cookies

are scoped to the eTLD+1 domain of the visited website

(i.e. Domain=.example.org). Furthermore, it is possible

that websites explicitly share personal information with

the CNAME-based trackers in order to build a better

profile on their users.

To analyze the type of information that is sent to

trackers and to assess the frequency of occurrence, we

performed a manual experiment on a random subset of

publishers. Based on data from a preliminary crawl of

20 pages per website, we selected up to ten publisher

websites per tracker that had at least one HTML form

element with a password field. We limited the number of websites because of the manual effort required to register, log in, interact with each site, and thoroughly

analyze the requests that were sent. We looked for au-

thentication cookies (determined by verifying that these

were essential to remain logged on to the website), and

personal information such as the name and email that

was provided during the registration process.

Out of the 103 considered websites, we were able to

successfully register and log in on 50 of them. In total,

we found that on 13 of these websites sensitive infor-

mation leaked to the CNAME tracker. The leaked in-

formation included the user’s full name (on 1 website),

location (on 2 websites), email address (on 4 websites,

either in plain-text or hashed), and the authentication


cookie (on 10 websites). We note that such leaks are the

result of including the trackers in a first-party context.

Our limited study indicates that the CNAME tracking

scheme negatively impacts users’ security (authentica-

tion cookie leaks) and privacy (personal data leaks).

6.4 Cookie leaks to CNAME-based

trackers

Next we perform an automated analysis to investigate

cookies that are inadvertently sent to CNAME trackers.

We conducted an automated crawl on June 7, 2020 of

8,807 websites that we, at that time, identified as us-

ing CNAME-based tracking following the methodology

outlined in Section 4.2. In this crawl, we searched for

cookies sent to the CNAME subdomain while exclud-

ing the cookies set by the CNAME tracker itself (either

through its subdomain or its third-party domains).

The crawler We built our crawler by modifying

the DDG Tracker Radar Collector [17], a Puppeteer-

based crawler that uses the Chrome DevTools Protocol

(CDP). We extended the crawler by adding capabili-

ties to capture HTTP request cookies, POST data, and

document.cookie assignments. DDG Tracker Radar Col-

lector uses the Chrome DevTools Protocol to set break-

points and capture the access to the Web API methods

and properties that may be relevant to browser finger-

printing and tracking (e.g. document.cookie). We used

this JavaScript instrumentation to identify scripts that

set cookies using JavaScript.

For each website, we loaded the homepage using a

fresh profile. We instructed the crawler to wait ten seconds on each website, and then reload the page. This

allowed us to capture the leaks of cookies that were

set after the request to the CNAME-based tracker do-

main. We also collected HTTP headers, POST bodies,

JavaScript calls, and cookies from the resulting profile.

When crawling, we used a Safari User-Agent string, as

we found at least one CNAME-based tracker (Criteo)

employing first-party tracking for Safari users only.

Data analysis To identify the cookie leaks, we first

built the list of cookies sent to the CNAME subdomain.

From the resulting list, we excluded session cookies,

short cookies (less than 10 characters), and cookies that

contain values that occur on multiple visits (to exclude

non-uniquely identifying cookies). To determine the lat-

ter, we first built a mapping between the distinct cookie

values and the number of sites they occur on.

Next, we identified the setter of the cookies. First, we searched the cookie name and value in Set-Cookie headers in HTTP responses. When the cookie in question was sent in the corresponding request, we excluded its response from the analysis. For JavaScript cookies, we searched for the name-value pair in assignments to document.cookie using the JavaScript instrumentation data. We then used the JavaScript stack trace to determine the origin of the script. After determining the setter, we excluded cookies set by the CNAME-based tracker itself.

Table 2. Five origins with most leaked cookies to CNAME-based trackers. The right column indicates the number of distinct sites on which we observed one or more leaked cookies set by scripts from these origins.

Cookie origin               Purpose           Num. of distinct sites
www.google-analytics.com    Analytics         5,970
connect.facebook.net        FB Pixel          3,287
www.googletagmanager.com    Tag management    2,376
bat.bing.com                Advertising       1,182
assets.adobedtm.com         Tag management    887
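The exclusion rules from the data analysis (session cookies, values shorter than 10 characters, and values seen on more than one site) can be sketched as a two-pass filter. The tuple layout and function name are assumptions made for this example; the setter identification step is not modeled.

```python
from collections import defaultdict


def candidate_leaks(observations, min_length=10):
    """Filter cookies sent to CNAME subdomains down to likely identifiers.

    observations: list of (site, name, value, is_session) tuples, one per
                  cookie observed in a request to a CNAME subdomain.
    Returns (site, name, value) tuples that survive all three exclusions.
    """
    # First pass: map each distinct value to the set of sites it occurs on.
    sites_per_value = defaultdict(set)
    for site, _name, value, _is_session in observations:
        sites_per_value[value].add(site)

    kept = []
    for site, name, value, is_session in observations:
        if is_session:
            continue  # session cookies are excluded
        if len(value) < min_length:
            continue  # short values are unlikely to be identifiers
        if len(sites_per_value[value]) > 1:
            continue  # a value shared across sites is not unique to a visit
        kept.append((site, name, value))
    return kept
```

Building the value-to-sites mapping first is what allows the uniqueness check to be applied in a single pass afterwards.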

Leaks in HTTP Cookie headers

We identified one or more cookie leaks on 7,377 sites

(95%) out of the 7,797 sites where we could identify

the presence of at least one CNAME-based tracker. Ta-

ble 2 shows the five origins with most cookies leaked to

CNAME-based trackers. The overwhelming majority of

cookie leaks (31K/35K) are due to third-party analytics

scripts setting cookies on the first-party domain.

The leakage of first-party cookies containing unique

IDs may not reveal any additional information to

CNAME-based trackers, since these trackers may al-

ready have an ID for the users in their own cookies.

However, cookies containing other information such as

ad campaign information, emails, and authentication cookies may also leak to the CNAME-based trackers (as

shown in Section 6.3). Moreover, our analysis found that

on 4,006 sites, a cookie set by a third-party domain is

sent to the CNAME-based tracker’s subdomain. 3,898 of

these sites are due to Pardot, which sets the same cookie

on its first-party subdomain and its third-party domain.

To set the same cookie on both domains, Pardot sends its unique ID in a URL parameter called visitor_id

to its first-party subdomain.

Leaks in POST request bodies While we do not rule out that cookie leaks often happen inadvertently, i.e. without the knowledge or cooperation of the CNAME trackers, because browsers send cookies with a matching domain to the tracker, the picture is not always so straightforward. Namely, we identified and investigated two other types of cookie leaks that involve more active participation by the CNAME trackers.

First, we studied cookie values sent in the POST request

bodies, again excluding the cookies set by the CNAME

tracker itself, and session cookies and cookies that oc-

cur on multiple sites, as described above. We found that

166 cookies (on 94 distinct sites) set by another party

were sent to a CNAME tracker’s subdomain in a POST

request body. The majority of these cases were due to

TraceDock (46 sites) and Adobe Experience Cloud (30

sites), while Otto Group and Webtrekk caused these

cookie leaks on 11 and 7 sites, respectively.

We used the request “initiators” field to identify the

senders of the requests. The “initiators” field contains

the set of script addresses that triggered an HTTP re-

quest, derived from JavaScript stack traces. In 78 of the

166 instances, the CNAME subdomain or the tracker’s

third-party domains were among the initiators of the

POST request. In the remaining cases, the CNAME

tracker’s script was served on a different domain (e.g.

Adobe Experience Cloud, assets.adobedtm.com), a dif-

ferent subdomain that also belongs to the CNAME

tracker (e.g. Otto Group uses tp.xyz.com subdomain for

its scripts and te.xyz.com for the endpoint), or the re-

quest was triggered by a tag manager script, or a com-

bined script that contains the CNAME tracker’s script.

The cookies sent in the POST bodies indicate that

certain CNAME tracker scripts actively read and ex-

filtrate cookies they may access on first party sites. Al-

though the content of the cookies may not always reveal

additional information, our manual analysis presented

above revealed that sensitive information such as email addresses, authentication cookies and other personal information is leaking to the CNAME trackers.

Leaks in request URLs Next we investigate the

cookies sent to CNAME tracker subdomains in the re-

quest URLs. To detect such leaks we searched for cook-

ies in the request URLs (and URL-decoded URLs) ex-

cluding the scheme and the hostname. We excluded the

same set of cookies as the previous two analyses – cook-

ies set by CNAME tracker itself, short cookies, session

cookies and cookies with non-unique values.

We found 1,899 cookie leaks in request URLs to

CNAME subdomains on 1,295 distinct sites. 1,566 of

the cookies were sent to Adobe Experience Cloud’s subdomain, while Pardot’s and Eulerian’s subdomains received 130 and 101 cookies, respectively. In addition,

in 4,121 cases (4,084 sites), a cookie set by Pardot’s

third-party domain was sent to its CNAME subdomain,

confirming the finding above that Pardot syncs cookies

between its third-party domain and its CNAME sub-

domain. Overall, in 378 cases the leaked cookie was set

by a third-party domain, indicating that cookies were

synced or simply exchanged between the domains.
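The URL search described above (matching cookie values against both the raw and the URL-decoded request URL, with scheme and hostname excluded) can be sketched as follows. The function name and the example URL are illustrative assumptions.

```python
from urllib.parse import urlsplit, unquote


def cookies_in_url(url, cookie_values):
    """Return the cookie values that appear in a request URL.

    Only the path, query string and fragment are searched, so a value
    that merely coincides with the scheme or hostname is not reported.
    Both the raw and the URL-decoded forms are checked.
    """
    parts = urlsplit(url)
    remainder = parts.path
    if parts.query:
        remainder += "?" + parts.query
    if parts.fragment:
        remainder += "#" + parts.fragment
    haystacks = (remainder, unquote(remainder))
    return [v for v in cookie_values if any(v in h for h in haystacks)]


example_url = (
    "https://metrics.example.org/b/ss"
    "?vid=GA1.2.123456789.1591500000&u=user%40example.org"
)
```

For `example_url`, the identifier in `vid` is found in the raw query string and the email address only after URL decoding, which is why both forms are searched.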

Our automated analysis of cookie leaks, in combina-

tion with the deeper manual analysis presented above

indicates that passive and active collection of cookies

by the CNAME trackers is highly prevalent and has

severe privacy and security implications including the

collection of email addresses, unique identifiers and au-

thentication cookies. Further, our results show that cer-

tain CNAME-based trackers use third-party cookies for

cross-site tracking and at times receive cookies set by

other third-party domains, allowing them to track users

across websites.

7 Discussion

CNAME-based tracking has existed for several years. Our analysis shows that it has recently been rapidly gaining in popularity, especially on frequently-visited websites.

In this section we explore the current countermeasures

against this form of tracking, and discuss their effective-

ness and potential circumvention techniques that track-

ers may use in the future.

Countermeasures In response to a report that

a tracker was using CNAMEs to circumvent privacy

blocklists (see footnote 4), uBlock Origin released an update for its

Firefox version that thwarts CNAME cloaking [23]. The

extension blocks requests to CNAME trackers by resolv-

ing the domain names using the browser.dns.resolve

API method to obtain the last CNAME record (if there

is any) before each request is sent. Subsequently, the

extension checks whether the domain name matches

any of the rules in its blocklists, and blocks requests

with matching domains while adding the outcome to

a local cache. Although uBlock Origin also has a ver-

sion for Chromium-based browsers, the same defense

cannot be applied because Chromium-based browser

extensions do not have access to an API to perform

DNS queries. As such, at the time of this writing, it

is technically impossible for these extensions to block

requests to trackers that leverage CNAME records to

avoid detection. As we explain in Section 4, uBlock

Origin for Chrome, which does not have a defense for

CNAME-based tracking, still manages to block several

trackers. This is because the requests to the trackers

[Footnote 4: https://github.com/uBlockOrigin/uBlock-issues/issues/780]

matched an entry of the blocklist with a URL pattern

that did not consider the hostname. Unfortunately, it

is fairly straightforward for the tracker to circumvent

such a fixed rule-based measure, e.g. by randomizing

the path of the tracking script and analytics endpoint,

as is evidenced by the various trackers that could only

be blocked by the uBlock Origin version on Firefox. An

alternative strategy for browser extensions that are un-

able to resolve DNS records before making the decision

to block requests, could be to analyze the behavior or

artifacts of tracking scripts. However, the tracker’s code

could be dynamic and include many variations, making

detection arduous and performance-intensive. As such,

blocking CNAME-based trackers is highly challenging

for browser extensions that do not have access to sufficient

information to reveal the actual party that a

request is made to.
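
The uncloaking logic described above (resolve the hostname, take the last CNAME target, match it against the blocklist, cache the outcome) can be sketched as follows. This is a language-neutral illustration in Python rather than the extension's own code: the class name, the `resolve_cname` callback, and the suffix-matching rule are all assumptions made for the example.

```python
from typing import Callable, Optional


def matches_blocklist(hostname: str, blocklist: set) -> bool:
    """True if the hostname or any of its parent domains is on the blocklist."""
    labels = hostname.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


class CnameUncloaker:
    """Decides whether a request should be blocked, uncloaking CNAMEs first."""

    def __init__(self, resolve_cname: Callable[[str], Optional[str]], blocklist: set):
        # Hypothetical resolver: returns the last CNAME target, or None if there is none.
        self._resolve_cname = resolve_cname
        self._blocklist = blocklist
        self._cache = {}

    def should_block(self, hostname: str) -> bool:
        if hostname in self._cache:
            return self._cache[hostname]
        blocked = matches_blocklist(hostname, self._blocklist)
        if not blocked:
            target = self._resolve_cname(hostname)
            blocked = target is not None and matches_blocklist(target, self._blocklist)
        self._cache[hostname] = blocked  # avoid resolving again for later requests
        return blocked
```

The cache matters because the check runs before every request; without it, each request to the same subdomain would trigger a fresh DNS lookup.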

Other tracking countermeasures operate as a DNS

resolver and return a bogus IP address, e.g. 127.0.0.1,

when the domain name matches an entry from the

blocklist. As this defense works at the DNS level, such

resolvers can also consider all the intermediary resolutions to

CNAME records, and return a bogus IP address if any

of them resolves to a domain on the blocklist. Examples

of DNS-based anti-tracking measures that adopted de-

fenses against CNAME cloaking include NextDNS [42],

AdGuard [4], and Pi-hole [50].
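
A resolver-side version of this defense can be sketched as below. The example works on in-memory record tables instead of real DNS traffic; the function names, the table layout, and the chain-length limit are assumptions for illustration and do not describe how NextDNS, AdGuard, or Pi-hole implement it.

```python
SINKHOLE_IP = "127.0.0.1"
MAX_CHAIN_LENGTH = 16  # assumed guard against CNAME loops


def is_blocklisted(hostname, blocklist):
    """True if the hostname or any of its parent domains is on the blocklist."""
    labels = hostname.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


def resolve_with_blocking(name, cname_records, a_records, blocklist):
    """Follow the CNAME chain and sinkhole the answer if any hop is blocklisted.

    cname_records maps a name to its CNAME target, a_records maps a name to an
    IPv4 address. Returns an IP string, or None if the name cannot be resolved.
    """
    current = name
    for _ in range(MAX_CHAIN_LENGTH):
        if is_blocklisted(current, blocklist):
            return SINKHOLE_IP
        if current not in cname_records:
            return a_records.get(current)
        current = cname_records[current]
    return None  # chain too long or cyclic
```

Checking every hop, rather than only the final target, also catches trackers whose own domain is an intermediate alias for a CDN hostname.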

Circumvention Both anti-tracking solutions, i.e.

browser extensions and DNS resolvers, rely on a block-

list, and can thus only block trackers whose domain

names are on the list. This provides a first avenue for circumvention

for the trackers: by randomizing the domain

names that are referred to in CNAME records, it would

become infeasible to rely on a domain-based blocklist.

However, this would mean that every time the tracker

changes domains, all the publishers that include it would

need to update their CNAME record, making it largely

impractical. Another circumvention option could be to

directly refer to the IP address of the tracker through

an A record instead of a CNAME record. We found the

pool of IP addresses used by CNAME-based trackers

to be relatively stable over time, and in fact found sev-

eral (35) publishers that already do this. At the time of

this writing, this circumvents the current blocklists, as

these do not contain the IP addresses used by the track-

ers, but this can be easily defended against by adding

the IP addresses to the blocklist.
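
Extending a blocklist with IP addresses, as suggested above, could look like the following sketch. The helper names are hypothetical and the addresses come from documentation ranges; they are not the trackers' real addresses.

```python
import ipaddress


def parse_networks(entries):
    """Turn blocklist entries (single IPs or CIDR ranges) into network objects."""
    return [ipaddress.ip_network(entry) for entry in entries]


def ip_is_blocked(ip, blocked_networks):
    """True if the resolved IP address falls inside any blocked network."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in blocked_networks)
```

For example, `ip_is_blocked("198.51.100.42", parse_networks(["198.51.100.0/24"]))` evaluates to `True`. Storing ranges instead of individual addresses keeps the list small when a tracker serves requests from a contiguous pool.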

Similar to randomizing the CNAME records, chang-

ing IP addresses as soon as they appear on the block-

list would also be practically infeasible, as it requires

all publishers to update their DNS records. Neverthe-

less, a tracker could request their publishers to delegate

authority for a specific subdomain/zone to the tracker

by setting an NS record that points to the tracker. As

such, the tracker could dynamically generate A record

responses for any domain name within the delegated

zone, and thus periodically change them to avoid being

added to the blocklist. For anti-tracking mechanisms

to detect this circumvention technique, this would re-

quire obtaining the NS records to determine whether

they point to a tracker. Although it is feasible to detect

this, it may introduce a significant additional overhead

for the browser extensions and DNS-based anti-tracking

mechanisms.
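
One way to detect such a delegation is to look up the NS records of the closest enclosing zone and compare the name servers against a list of tracker domains. The sketch below assumes the NS records have already been collected into a dictionary; the function names and the data layout are illustrative only.

```python
def on_list(hostname, domains):
    """True if the hostname or any of its parent domains is in the given set."""
    labels = hostname.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in domains for i in range(len(labels)))


def delegated_to_tracker(hostname, ns_records, tracker_domains):
    """Check whether the closest zone with NS records is served by a tracker.

    ns_records maps a zone name to the list of its name server hostnames.
    """
    labels = hostname.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        zone = ".".join(labels[i:])
        nameservers = ns_records.get(zone)
        if nameservers is not None:
            return any(on_list(ns, tracker_domains) for ns in nameservers)
    return False
```

The additional overhead mentioned above comes from the NS lookups themselves: each previously unseen subdomain requires at least one more query before the request can be allowed or blocked.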

In general, as long as the anti-tracking mechanism

can detect the indirection to the third-party tracker, it

is possible to detect and block requests to the tracker,

albeit at a certain performance overhead. Trackers could

try to further camouflage their involvement in serving

the tracking scripts and collecting the analytics infor-

mation. For instance, they could request the publishers

that include tracking scripts to create a reverse proxy

for a specific path that points to the tracker, which could

be as easy as adding a few lines in the web server config-

uration, or adjusting the settings of the CDN provider.

In such a situation, the tracking-related requests would

appear, from a user’s perspective, to be sent to the vis-

ited website, both in terms of domain name as well as IP

address. Consequently, current tracking defenses would

not be able to detect or block such requests. As the

perpetual battle between anti-tracking mechanisms and

trackers continues, as evidenced by the increasing popularity

of CNAME-based tracking, we believe that further

research into detecting novel circumvention techniques and

analyzing their prevalence is warranted.

Limitations As stated in Section 5, the method

we use to detect CNAME-based tracking in historical

data cannot account for changes in the request signature

used by trackers. In practice, these signatures remained

the same during our measurement period. Furthermore,

part of the experiments we conducted in Section 6 re-

quired substantial manual analysis, making it infeasible

to perform on a larger set of websites.

8 Related work

In 2009, Krishnamurthy and Wills provided one of the

first longitudinal analyses of user information flows to

third-party sites (called aggregators) [28]. The authors

also observed a trend of serving third-party tracking

content from first-party contexts, pointing out the challenges

for countermeasures based on blocklists. Mayer

and Mitchell studied the technology and policy aspects

of third-party tracking [33]. Englehardt and Narayanan

[20] measured tracking on the Alexa top one million websites using

OpenWPM and discovered new fingerprinting techniques

such as AudioContext API-based fingerprinting.

The CNAME tracking scheme was mentioned anecdotally

by Bau et al. in 2013 [9], but the authors did not focus

on the technique specifically. To our knowledge, the

first systematic analysis of the CNAME scheme used

to embed third-party trackers in first-party content is

the work of Olejnik and Castelluccia [39], in which they

identified this special arrangement as part of the real-

time bidding setup. The authors also reported leaks of

first-party cookies to such third parties. In our paper,

we substantially extend these analyses. Although cookies

were most commonly used for cross-site tracking, more

advanced mechanisms have been deployed by websites

and studied by researchers. Browser fingerprinting

[19], where traits of the host [56], system, browser, and

graphics stack [35] are extracted to identify the user, is

one of the stateless tracking vectors that do not need

cookies to operate. Fingerprinting on the web was measured

at scale by Acar et al. [2, 3], Nikiforakis et al. [38],

and Englehardt and Narayanan [20]. Combining multiple

tracking vectors at the same time may give rise

to supercookies or evercookies, as first demonstrated by

Samy Kamkar [25]. Over the years, many information

exfiltration or tracking vectors have been studied, including

the ETag HTTP cache header [7], WebSockets [8],

ultrasound beacons [32], and fingerprinting of sensor

calibrations on mobile devices [58].

Similar to these studies we measure the prevalence

of a tracking mechanism that tries to circumvent existing

countermeasures. However, our work uses novel

methods to identify CNAME-based trackers in histor-

ical crawl data, allowing us to perform a longitudinal

measurement.

In concurrent work, Dao et al. also explored the

ecosystem of CNAME-based trackers [14]. Based on a

crawl of the Alexa top 300k, they find 1,762 CNAME-

based tracking domains as of January 2020, which

are detected by matching the CNAME domain with

EasyPrivacy. In our work, we detected 9,273 sites that

leverage CNAME-based tracking in a same-site context

and an additional 19,226 websites that use it in a cross-

site context. We rely on an approach that combines

historical DNS records (A records) with manually con-

structed fingerprints. The latter is used to filter out any

potential false positives that may be caused by changes

in the IP space ownership, or because the CNAME- or

A-records may be used for other services of the same

provider unrelated to tracking. Based on the evaluation

of our method in Section 5.2, we find that it is impor-

tant to use request-specific information to prevent incor-

rectly marking domains as using CNAME-based track-

ing. Furthermore, relying on filter lists, and in particular

on the eTLD+1 domains that are listed, could result in

the inclusion of non-tracking domains. For example, sp-prod.net

is the second most popular tracker considered by Dao

et al., but was excluded in our work as it is part of a

“Consent Management Platform” that captures cookie

consent for compliance with GDPR [47]. Additionally,

filter lists may be incomplete, resulting in trackers be-

ing missed: for example, Pardot, the tracker we find to

be most widely used, was not detected in prior work.

Consequently, relying on filter lists also prevents the detection

of new trackers, a limitation that does not apply

to our method.

Dao et al. also perform an analysis of the histori-

cal evolution of CNAME-based tracking, based on four

datasets of the Alexa top 100k websites collected between

January 2016 and January 2020. As the OpenWPM

datasets they used do not include DNS records, the

researchers rely on a historical forward DNS dataset

provided by Rapid7 [44], which does not cover all do-

mains over time. By using the HTTP Archive dataset,

which includes the IP address that was used, we were

able to perform a more granular analysis, showing a

more accurate growth pattern. We also show that this

growth is rapidly increasing, significantly outpacing that of

third-party trackers with a comparable customer base.

Finally, to the best of our knowledge, we are the first

to perform an analysis of the privacy and security im-

plications associated with the CNAME-based tracking

scheme.

9 Conclusion

Our research sheds light on the emerging ecosystem of

CNAME-based tracking, a tracking scheme that takes

advantage of a DNS-based cloaking technique to evade

tracking countermeasures. Using HTTP Archive data

and a novel method, we performed a longitudinal analy-

sis of the CNAME-based tracking ecosystem using crawl

data of 5.6M web pages. Our findings show that un-

like other trackers with similar scale, CNAME-based

trackers are becoming increasingly popular, and are

mostly used to supplement “typical” third-party track-

ing services. We evaluated the privacy and security

threats that are caused by including CNAME track-

ers in a same-site context. Through manual analysis we

found that sensitive information such as email addresses

and authentication cookies leak to CNAME trackers

on sites where users can create accounts. Furthermore,

we performed an automated analysis of cookie leaks to

CNAME trackers and found that cookies set by other

parties leak to CNAME trackers on 95% of the websites

that we studied. Finally, we identified two major web security

vulnerabilities that CNAME trackers caused. We

disclosed the vulnerabilities to the respective parties and

have worked with them to mitigate the issues. We hope

that our research helps address the security and

privacy issues that we highlighted, and informs the development

of countermeasures and policy making with regard

to online privacy and tracking.

References

[1] Lawrence Abrams. 2019. uBlock Origin Now Blocks

Sneaky First-Party Trackers in Firefox.

https://www.bleepingcomputer.com/news/security/ublock-origin-now-blocks-sneaky-first-party-trackers-in-firefox/.

[2] Gunes Acar, Christian Eubank, Steven Englehardt, Marc

Juarez, Arvind Narayanan, and Claudia Diaz. 2014. The

Web Never Forgets: Persistent Tracking Mechanisms in the

Wild. In Proceedings of the 2014 ACM SIGSAC Conference

on Computer and Communications Security (Scottsdale,

Arizona, USA) (CCS ’14). ACM, New York, NY, USA, 674–

689. https://doi.org/10.1145/2660267.2660347

[3] Gunes Acar, Marc Juarez, Nick Nikiforakis, Claudia Diaz,

Seda Gürses, Frank Piessens, and Bart Preneel. 2013. FPDe-

tective: dusting the web for fingerprinters. In Proceedings of

the 2013 ACM SIGSAC conference on Computer & commu-

nications security. 1129–1140.

[4] AdGuard. 2019. Disguised trackers threat and how we will

address it. https://adguard.com/en/blog/disguised-trackers.

html.

[5] Adobe Experience Cloud. 2019. Adobe Experience Cloud

Release Notes - October 2019.

https://docs.adobe.com/content/help/en/release-notes/experience-cloud/previous/2019/10102019.html.

[6] HTTP Archive. 2020. State Of The Web Report.

https://httparchive.org/.

[7] Mika D Ayenson, Dietrich James Wambach, Ashkan Soltani,

Nathan Good, and Chris Jay Hoofnagle. 2011. Flash cookies

and privacy II: Now with HTML5 and ETag respawning.

Available at SSRN 1898390 (2011).

[8] Muhammad Ahmad Bashir, Sajjad Arshad, Engin Kirda,

William Robertson, and Christo Wilson. 2018. How tracking

companies circumvented ad blockers using websockets. In

Proceedings of the Internet Measurement Conference 2018.

ACM, 471–477.

[9] Jason Bau, Jonathan Mayer, Hristo Paskov, and John C

Mitchell. 2013. A promising direction for web tracking coun-

termeasures. Proceedings of W2SP (2013).

[10] Omar Benguerah. 2017. Setting first-party cookies by redi-

rection. US Patent 9,723,051.

[11] Google Chrome. 2020. Cookies default to Same-

Site=Lax. https://www.chromestatus.com/feature/

5088147346030592.

[12] Cloudflare. 2020. Understanding the Cloudflare Cookies.

https://support.cloudflare.com/hc/en-us/articles/200170156-Understanding-the-Cloudflare-Cookies.

[13] Romain Cointepas. 2019. CNAME Cloaking, the dangerous

disguise of third-party trackers.

https://medium.com/nextdns/cname-cloaking-the-dangerous-disguise-of-third-party-trackers-195205dc522a.

[14] Ha Dao, Johan Mazel, and Kensuke Fukuda. 2020. Char-

acterizing CNAME Cloaking-Based Tracking on the Web.

In Proceedings of IFIP/IEEE Traffic Measurement Analysis

Conference (TMA). 9 pages.

[15] Sanorita Dey, Nirupam Roy, Wenyuan Xu, Romit Roy

Choudhury, and Srihari Nelakuditi. 2014. AccelPrint: Imperfections

of Accelerometers Make Smartphones Trackable.

In NDSS. Citeseer.

[16] Disconnect. 2020. Privacy Solutions. https://disconnect.

me/.

[17] duckduckgo. 2020. tracker-radar-collector. https://github.

com/duckduckgo/tracker-radar-collector [Online; accessed

10. Jun. 2020].

[18] EasyPrivacy. 2020. Filter List That Completely Removes All

Forms Of Tracking From The Internet. https://easylist.to/

index.html.

[19] Peter Eckersley. 2010. How unique is your web browser?. In

International Symposium on Privacy Enhancing Technologies

Symposium. Springer, 1–18.

[20] Steven Englehardt and Arvind Narayanan. 2016. Online

tracking: A 1-million-site measurement and analysis. In Pro-

ceedings of the 2016 ACM SIGSAC conference on computer

and communications security. 1388–1401.

[21] Steven Englehardt, Dillon Reisman, Christian Eubank, Pe-

ter Zimmerman, Jonathan Mayer, Arvind Narayanan, and

Edward W Felten. 2015. Cookies that give you away: The

surveillance implications of web tracking. In Proceedings

of the 24th International Conference on World Wide Web.

289–299.

[22] Brent Fulgham. 2018. Protecting Against HSTS Abuse.

https://webkit.org/blog/8146/protecting-against-hsts-

abuse.

[23] Raymond Hill. 2020. uBlock Origin - 1.25.0. https://github.

com/gorhill/uBlock/releases/tag/1.25.0.

[24] Umar Iqbal, Steven Englehardt, and Zubair Shafiq.

2020. Fingerprinting the Fingerprinters: Learning to De-

tect Browser Fingerprinting Behaviors. arXiv preprint

arXiv:2008.04480 (2020).

[25] Samy Kamkar. 2010. Evercookie-virtually irrevocable persis-

tent cookies. His Blog 9 (2010).

[26] Arjaldo Karaj, Sam Macbeth, Rémi Berson, and Josep M

Pujol. 2018. WhoTracks. Me: Shedding light on the opaque

world of online tracking. arXiv preprint arXiv:1804.08959

(2018).

[27] Mitja Kolšek. 2002. Session fixation vulnerability in web-

based applications. Acros Security 7 (2002).

[28] Balachander Krishnamurthy and Craig Wills. 2009. Privacy

diffusion on the web: a longitudinal perspective. In Proceed-

ings of the 18th international conference on World wide

web. ACM, 541–550.

[29] Adam Lerner, Anna Kornfeld Simpson, Tadayoshi Kohno,

and Franziska Roesner. 2016. Internet jones and the raiders

of the lost trackers: An archaeological study of web tracking

from 1996 to 2016. In 25th USENIX Security Symposium

(USENIX Security 16).

[30] Scott Low and Joe Martin. 2020. Tracking Prevention in

Microsoft Edge (Chromium).

https://docs.microsoft.com/en-us/microsoft-edge/web-platform/tracking-prevention.

[31] Andrea Marchesini. 2019. Enable sameSite=lax by default

on Nightly. https://bugzilla.mozilla.org/show_bug.cgi?id=

1604212.

[32] Vasilios Mavroudis, Shuang Hao, Yanick Fratantonio, Fed-

erico Maggi, Christopher Kruegel, and Giovanni Vigna.

2017. On the privacy and security of the ultrasound ecosys-

tem. Proceedings on Privacy Enhancing Technologies 2017,

2 (2017), 95–112.

[33] Jonathan R Mayer and John C Mitchell. 2012. Third-party

web tracking: Policy and technology. In 2012 IEEE Sympo-

sium on Security and Privacy. IEEE, 413–427.

[34] McAfee. 2020. Customer URL Ticketing System.

https://trustedsource.org/.

[35] Keaton Mowery and Hovav Shacham. 2012. Pixel perfect:

Fingerprinting canvas in HTML5. Proceedings of W2SP

(2012), 1–12.

[36] NextDNS. 2020. CNAME Cloaking Blocklist.

https://github.com/nextdns/cname-cloaking-blocklist.

[37] NextDNS. 2020. NextDNS CNAME Cloaking Blocklist.

https://github.com/nextdns/cname-cloaking-blocklist.

[38] Nick Nikiforakis, Alexandros Kapravelos, Wouter Joosen,

Christopher Kruegel, Frank Piessens, and Giovanni Vigna.

2013. Cookieless monster: Exploring the ecosystem of web-

based device fingerprinting. In 2013 IEEE Symposium on

Security and Privacy. IEEE, 541–555.

[39] Lukasz Olejnik and Claude Castelluccia. 2014. Analysis of

openx-publishers cooperation. In 7th Workshop on Hot

Topics in Privacy Enhancing Technologies (HotPETs 2014).

[40] Lukasz Olejnik, Tran Minh-Dung, and Claude Castelluccia.

2014. Selling off privacy at auction. In Proceedings of

the 2014 Symposium on Network and Distributed System

Security.

[41] Mike O’Neill. 2015. Discovered In The Wild: A New Method

Bypassing Safari’s Third-Party Cookie Blocking.

https://baycloud.com/blog/PostDetail?slug=discovered-in-the-wild-a-new-method-bypassing-safaris-third-party-cookie-blocking.

[42] Olivier Poitrey. 2019. NextDNS first to support blocking

of ALL third-party trackers disguised as first-party.

https://medium.com/nextdns/nextdns-added-cname-uncloaking-support-becomes-the-first-cross-platform-solution-to-the-problem-e3f437f84342.

[43] Chrome DevTools Protocol. 2020. Instrument, Inspect,

Debug And Profile Chromium. https://chromedevtools.

github.io/devtools-protocol/.

[44] Rapid7. 2020. DNS ’ANY’, ’A’, ’AAAA’, ’TXT’, ’MX’,

and ’CNAME’ responses for known forward DNS names.

https://opendata.rapid7.com/sonar.fdns_v2/.

[45] Rapid7. 2020. DNS IPv4 PTR responses. https://opendata.

rapid7.com/sonar.rdns_v2/.

[46] Michael Schrank, Bastian Braun, Martin Johns, and

Joachim Posegga. 2010. Session fixation–the forgotten

vulnerability? Sicherheit 2010. Sicherheit, Schutz und Zuver-

lässigkeit (2010).

[47] SourcePoint. 2020. Consent Management Platform.

https://help.sourcepoint.com/en/collections/1255107-consent-management-platform.

[48] Alan Toner. 2017. Safari in Arms Race Against Trackers -

Criteo Feels the Heat. https://www.eff.org/deeplinks/2017/

12/arms-race-against-trackers-safari-leads-criteo-30.

[49] Security Trails. 2020. Robust APIs & Data Services for

Security Teams. https://securitytrails.com/.

[50] Adam Warner. 2020. Pi-hole v5.0 is here!

https://pi-hole.net/2020/05/10/pi-hole-v5-0-is-here/.

[51] Mike West. 2020. Incrementally Better Cookies.

https://tools.ietf.org/html/draft-west-cookie-incrementalism-01.

[52] WhoTracks.me. 2018. GDPR - What happened?

https://whotracks.me/blog/gdpr-what-happened.html.

[53] Whoxy. 2020. WHOIS Lookup API for Domain Names.

https://www.whoxy.com/.

[54] John Wilander. 2020. Full Third-Party Cookie Blocking

and More.

https://webkit.org/blog/10218/full-third-party-cookie-blocking-and-more/.

[55] Marissa Wood. 2019. Today’s Firefox Blocks Third-Party

Tracking Cookies and Cryptomining by Default.

https://blog.mozilla.org/blog/2019/09/03/todays-firefox-blocks-third-party-tracking-cookies-and-cryptomining-by-default/.

[56] Ting-Fang Yen, Yinglian Xie, Fang Yu, Roger Peng Yu, and

Martin Abadi. 2012. Host Fingerprinting and Tracking

on the Web: Privacy and Security Implications. In NDSS,

Vol. 62. 66.

[57] ZDNS. 2020. Command-line Utility That Provides High-

speed DNS Lookups. https://github.com/zmap/zdns.

[58] Jiexin Zhang, Alastair R. Beresford, and Ian Sheret. 2019.

SensorID: Sensor Calibration Fingerprinting for Smartphones.

In Proceedings of the 40th IEEE Symposium on Security

and Privacy (SP). IEEE.

A Acknowledgement

This research is partially funded by the Research Fund

KU Leuven, and by the Flemish Research Programme

Cybersecurity with reference number VR20192203. We

would like to thank the reviewers for their constructive

comments. Gunes Acar holds a Postdoctoral fellowship

of the Research Foundation Flanders (FWO).

B Assisted detection

First-party subdomains referring to third parties are

by no means exclusive to CNAME-based tracking: ser-

vices such as CDNs rely on a similar setup. Many

websites hosting various services utilize CNAMEs to

connect website domains to third-party hosts. Further-

more, a variety of different kinds of services provide

third-party content in a first-party context by using

CNAME records. Examples include Consent Manage-

ment Providers or domain parking services and traffic

management platforms.

In our approach to distinguish the various kinds of

first-party services we collected features that help us

characterize a resource. For each of the 120 services we

considered, we measured the number of websites the

first-party is active on, the number of different host-

names a request to the service originates from, and the

number of unique paths occurring in requests to the

service. Furthermore, we captured the body size of the

response, its content type (i.e. an image, script, video

or HTML resource) and the average number of requests

per website using the service. Lastly, we detected the

percentage of requests and websites that sent and re-

ceived cookies from the service.
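
A subset of these per-service features could be computed from crawl records along the following lines. The `Request` record, its field names, and the output keys are assumptions for this sketch; in particular, `hostname` is taken here to mean the first-party subdomain that the request was sent to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    site: str          # website on which the request was observed
    hostname: str      # first-party subdomain pointing to the service (assumed meaning)
    path: str          # path component of the request URL
    sets_cookie: bool  # whether the response set at least one cookie


def service_features(requests):
    """Aggregate simple per-service features from a non-empty list of requests."""
    if not requests:
        raise ValueError("at least one request is required")
    sites = {r.site for r in requests}
    cookie_sites = {r.site for r in requests if r.sets_cookie}
    return {
        "num_sites": len(sites),
        "num_hostnames": len({r.hostname for r in requests}),
        "num_unique_paths": len({r.path for r in requests}),
        "avg_requests_per_site": len(requests) / len(sites),
        "pct_sites_receiving_cookie": 100 * len(cookie_sites) / len(sites),
    }
```

Body size and content type are left out of the sketch; they would be aggregated in the same way from additional fields on each record.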

To measure the uniformity of the response sizes of

potential first-party trackers we sorted the sizes in buck-

ets, each bucket with a size of 100 bytes. We then con-

sidered the number of buckets as a possible feature for

distinction between different kinds of services. A low

number of buckets would indicate that the service has

a similar response to each request (e.g. the same script)

which would increase the likelihood of the service being

a tracker.
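
The bucket count described above can be sketched in a few lines. The function name is an assumption, and sizes are taken to be response body sizes in bytes.

```python
BUCKET_SIZE = 100  # bytes per bucket, as described in the text


def count_size_buckets(sizes, bucket_size=BUCKET_SIZE):
    """Return the number of distinct size buckets that the responses fall into."""
    return len({size // bucket_size for size in sizes})
```

For instance, responses of 120, 150, and 199 bytes all fall into the same bucket, so `count_size_buckets([120, 150, 199])` is 1, whereas a service returning many differently sized resources yields a count close to the number of distinct resources.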

After manually visiting the websites of each of the

considered services, we were able to classify them into

three different categories: trackers, Content Distribution

Networks (CDNs), and other. Any service that did not

explicitly describe itself as a CDN or a tracker on its

website was categorized as “other”.

To gain a better understanding of the features we

collected, we analyzed their distribution across the dif-

ferent categories. Figure 8 shows the features that are

the least overlapping for the three categories.

As can be deduced from Figure 8d and Figure 8a, the

number of response size buckets and the number of

unique paths accessed by the website are much lower for

trackers than for CDNs and other services. This was in

line with our expectation that customer websites access

a similar resource each time. Furthermore, tracking services

receive a low number of requests per website and

often respond with a cookie.

[Figure 8: four panels, each showing the categories other, cdn, and tracker. Every panel has an axis labeled "Percentage of labels belonging to category".]

(a) Distribution of the average number of unique paths per website (axis: average number of unique paths accessed per website)

(b) Distribution of the average number of requests per website to the service (axis: average number of requests sent to the domain per website)

(c) Distribution of the percentage of responses containing at least one cookie (axis: percentage of websites that receive a cookie)

(d) Distribution of different response sizes sorted in buckets of size 100 bytes (axis: number of response size buckets)

Fig. 8. Features distinguishing trackers from other types of services

Given that we had only a small list of confirmed

trackers, it was not feasible to build a classifier

with the purpose of distinguishing tracking services from

other types of services. However, our findings are still

useful for performing assisted detection of tracking ser-

vices. They form a simple heuristic for ruling out some

companies from being trackers. With more data, the

features that we gathered could likely be used for auto-

matic detection.