Talon: An Automated Framework for Cross-Device Tracking Detection
Konstantinos Solomos
FORTH, Greece
solomos@ics.forth.gr
Panagiotis Ilia
Univ. of Illinois at Chicago, USA
pilia@uic.edu
Sotiris Ioannidis
FORTH, Greece
sotiris@ics.forth.gr
Nicolas Kourtellis
Telefonica Research, Spain
nicolas.kourtellis@telefonica.com
Abstract
Although digital advertising fuels much of today’s free Web,
it typically does so at the cost of online users’ privacy, due to
the continuous tracking and leakage of users’ personal data.
In search of new ways to optimize the effectiveness of ads,
advertisers have introduced new advanced paradigms such
as cross-device tracking (CDT), to monitor users’ browsing
on multiple devices and screens, and deliver (re)targeted ads
on the most appropriate screen. Unfortunately, this practice
leads to greater privacy concerns for the end-user.
Going beyond the state-of-the-art, we propose a novel
methodology for detecting CDT and measuring the factors
affecting its performance, in a repeatable and systematic
way. This new methodology is based on emulating realistic
browsing activity of end-users, from different devices, and
thus triggering and detecting cross-device targeted ads. We
design and build Talon^1, a CDT measurement framework
that implements our methodology and allows experimenta-
tion with multiple parallel devices, experimental setups and
settings. By employing Talon, we perform several critical
experiments, and we are able to not only detect and measure
CDT with average AUC score of 0.78-0.96, but also to pro-
vide significant insights about the behavior of CDT entities
and the impact on users’ privacy. In the hands of privacy
researchers, policy makers and end-users, Talon can be an
invaluable tool for raising awareness and increasing trans-
parency on tracking practices used by the ad-ecosystem.
1 Introduction
Online advertising has become a driving force of the econ-
omy, with digital ad spending already surpassing the spend-
ing for TV-based advertising in 2017 [46], and expected to
reach $327 billion in 2019 [61]. This is because online
advertising can be easily tailored to, and targeted at, specific audiences.
In order to personalize ads, advertisers employ various track-
ing practices to collect user behavioral and browsing data.
1 https://en.wikipedia.org/wiki/Talos
Figure 1: High level representation of cross-device tracking. [Diagram labels: Ad-Ecosystem, User Info, Ads.]
Until recently, the tracking of a user was confined to the
physical boundary of each one of her devices. However,
as users typically own multiple devices [4, 80], advertisers
have started employing advanced targeting practices specif-
ically designed to track and target users across all their de-
vices. These efforts indicate a radical shift of the ad-targeting
paradigm, from device-centric to user-centric. In this new
paradigm, an advertiser tries to identify which devices (e.g.,
smartphone, tablet, laptop) belong to the same user, and then
target her across all devices with ads related to her overall
online behavior. Figure 1 illustrates a typical cross-device
tracking (CDT) scenario, where a user is targeted with rele-
vant ads in her second device (desktop), due to the behavior
exhibited to the ad-ecosystem from her first device (mobile).
A recent FTC Staff Report [74] states that CDT can be de-
terministic or probabilistic, and companies engaging in such
practices typically use a mixture of both techniques. Deter-
ministic tracking utilizes 1st-party login services that require
user authentication (e.g., Facebook, Twitter, Gmail). These
1st-party services often share information (e.g., a unique
identifier) with 3rd-parties, enabling them to perform a more
effective CDT. In the case of probabilistic CDT, there are
no shared identifiers between the users’ devices, and 3rd-
parties attempt to identify which devices belong to the same
user by considering network access data, common behav-
ioral patterns in browsing history, etc. In fact, to understand
the degree to which CDT trackers appear on the Web, we
arXiv:1812.11393v5 [cs.CR] 31 Jul 2019
measured their frequency of appearance on Alexa Top-10k
websites: companies performing probabilistic CDT can be
found in 27% of the websites, and when also considering
deterministic CDT, this coverage reaches 80%. Also, sev-
eral advertising companies such as Criteo [26], Tapad [79],
Drawbridge [32] etc., claim that they can track users across
devices with very high accuracy (e.g., Drawbridge’s Cross-
Device connected consumer graph is 97.3% accurate [31]).
Despite its significant impact on user privacy, apart from some
empirical evidence about CDT, there is only limited work
investigating it. In the work closest to ours, Zimmeck
et al. [88] designed an algorithm that correlates mobile and
desktop devices into pairs by considering devices’ browsing
history and IP addresses. While this approach shows that
correlation of devices is possible when such data are avail-
able, it does not provide an approach for detecting and mea-
suring CDT. In fact, to the best of our knowledge, there is no
existing approach to audit the probabilistic CDT ecosystem
and the factors that impact its performance on the Web. Our
work is the first to propose a novel methodology that enables
auditing the CDT ecosystem in an automated and systematic
way. In effect, our work takes a first, crucial step toward
understanding the inner workings of CDT and measuring the
parameters that affect how it performs.
The methodology proposed in this work is based on the
following idea: we want to detect when CDT trackers suc-
cessfully correlate a user’s devices, by identifying cross-
device targeted behavioral ads they send, i.e., ads that are de-
livered on one device, but have been triggered because of the
user’s browsing on a different device. In order to design this
methodology, we first study browsing data of real users with
multiple devices from [88] and extract topics of interest and
other user behavioral patterns. Then, to make trackers cor-
relate the different devices of the end-user and serve cross-
device targeted ads, we employ artificially created personas
with specific interests, to emulate realistic browsing activity
across the user devices as extracted from the real data.
We build Talon, a novel framework that materializes our
methodology in order to collect, categorize and analyze all
the ads delivered to the different user devices, and evaluate
with simple and advanced statistical methods the potential
existence of CDT. Through a variety of experiments we are
able to measure CDT with an average AUC of 0.78-0.96.
Specifically, in the simplest experiment, where the user ex-
hibits significant browsing activity mainly from the mobile
device, the average value of AUC is 0.78 for the 10 different
behavioral profiles used. When the user exhibits significant
browsing activity from both devices (mobile and desktop),
with a matching behavioral profile, we observe CDT with
an average AUC of 0.83. In the case of visiting specifically
chosen websites that employ multiple known CDT trackers,
we achieve AUC score of 0.96. We also find that brows-
ing in incognito can reduce the effect of CDT, but does not
eliminate it, as trackers can perform device matching based
only on the current browsing session of the user, and not all
her browsing history. Finally, we compare the data collected
with our real user-driven artificial personas (such as CDT
trackers found, types of ads detected, etc.) with correspond-
ing distributions observed in the real user data from [88],
offering a strong validation to the realistic design of Talon.
Overall, our main contributions in this work are:
• Design a novel, real data-driven methodology for detecting
  CDT by triggering behavioral cross-device targeted
  ads on one user device, according to specifically-crafted
  emulated personas, and then detecting those ads when
  delivered on a different device of the same user.
• Implement Talon, a practical framework for CDT measurements.
  Talon has been designed to provide scalability
  for fast deployment of multiple parallel device instances,
  to support various experimental setups, and to
  be easily extensible.
• Conduct a set of experiments for measuring the potential
  existence of CDT in different types of emulated users,
  with an average AUC score of 0.78-0.96, and investigate
  the various factors that affect its performance under different
  classes of experimental setups and configurations.
2 Background & Related Work
In this section, we provide the necessary terminology to un-
derstand the technical contributions of our work, and in par-
allel we present various mechanisms and technologies pro-
posed in related works.
2.1 Personalized Targeted Advertising
As the purpose of advertising is to increase market share, the
advertising industry continuously develops new mechanisms
to deliver more effective ads. These mechanisms involve the
delivery of contextual ads, targeted behavioral ads, and also
retargeted ads.
Contextual advertising refers to the delivery of ads rel-
evant to the content of the publishing website. With re-
gards to the effectiveness of contextual advertisement, Chun
et al. [24] found that it enhances brand recognition and that
users tend to have favourable attitudes towards it. In one
of the first works in this area, Broder et al. [16] proposed
an approach for classifying ads and web pages into a broad
taxonomy of topics, and then matching web pages with se-
mantically relevant ads. A large body of work also investi-
gates targeted behavioral advertising with regards to differ-
ent levels of personalization, based on the type of informa-
tion that is used to target the user [14, 9, 84, 69, 66], and
its effectiveness [86, 38, 54, 39]. Interestingly, Aguirre et
al. [9] found that, while highly personalized ads are more
relevant to users, they increase users’ sense of vulnerabil-
ity. In another study, Dolin et al. [30] measured users’
comfort regarding personalized advertisement. In a differ-
ent direction of investigation, Carrascosa et al. [21] devel-
oped a methodology that employs artificially-created behav-
ioral profiles (i.e., personas) for detecting behavioral targeted
advertising at scale. Their methodology could distinguish
interest-based targeting from other forms of advertising such
as retargeting. An extensive review of the literature about
behavioral advertising can be found in [15].
2.2 Leakage of Personal Information
In order to serve highly targeted ads, advertisers employ vari-
ous, often questionable and privacy intrusive, techniques for
collecting and inferring users’ personal information. They
typically employ techniques for tracking users’ visits across
different websites, which allow them to reconstruct parts of
the users’ browsing history. Numerous works investigate the
various approaches employed by trackers, and focus on pro-
tecting users’ privacy.
In a recent work, Papadopoulos et al. [70] developed a
methodology that enables users to estimate the actual price
advertisers pay for serving them ads. The range of these
prices can indicate which personal information of the user is
exposed to the advertiser and the sensitivity of this informa-
tion. Liu et al. [55] proposed AdReveal, a tool for character-
izing ads, and found that advertisers frequently target users
based on their interests and browsing behavior. Lecuyer et
al. [51] proposed XRay, a data tracking system that allows
users to identify which data is being used for targeting, by
comparing outputs from different accounts. In another work,
they propose Sunlight [52], a system that employs method-
ologies from statistics and machine learning to detect target-
ing at large scale.
Bashir et al. [12] developed a methodology that detects in-
formation flows between ad-exchanges. This approach lever-
ages retargeted ads, in order to detect when ad-exchanges
share the user’s information between them, for tracking and
retargeting the user. Datta et al. [27] developed AdFisher, a
tool that explores causal connections between users’ brows-
ing activities, their ad settings and the ads they receive, and
found cases of discriminatory ads. This tool uses machine
learning to determine, based on the ads received, if the user
belongs to a group of users that exhibit a specific browsing
behavior i.e., visited specific websites that affected their be-
havioral profile. Castelluccia et al. [83] showed that targeted
ads contain information that enable reconstruction of users’
behavioral profiles, and that user’s personal information can
be revealed to any party that has access to the ads received by
the user.
In order to enable ad targeting without compromising user
privacy, Toubiana et al. [82] and Guha et al. [43] proposed
Adnostic and Privad, respectively. These two approaches
try to protect users’ privacy by keeping user profiles on the
client-side and thus, hiding user activities and interests from
the ad-network. Furthermore, in an attempt to provide a better
alternative, Parra-Arnau et al. [71] proposed a tool that
allows users to control which information can be used for
the purpose of advertising.
Furthermore, many works investigate privacy leakage,
specifically, in mobile devices and the different factors in-
fluencing mobile advertising [68, 81, 75, 42, 62]. A recent
study by Papadopoulos et al. [68] compared privacy leakage
when visiting mobile websites and using mobile apps. Meng
et al. [62] studied the accuracy of personalized ads served by
mobile applications based on the information collected by
the ad-networks. Also, Razaghpanah et al. [75] developed
a technique that detects third-party advertising and tracking
services in the mobile ecosystem and uncovers unknown re-
lationships between these services.
2.3 Web Tracking
As mentioned previously, various techniques are employed
for tracking and correlating users’ activities across different
websites. Many works investigated stateful tracking tech-
niques [77, 65, 35, 87, 53], and also stateless techniques such
as browser fingerprinting [34, 6, 5, 64, 63, 67]. One of the
first studies about tracking [59], investigated which informa-
tion is collected by third parties and how users can be identi-
fied. Roesner et al. [77] measured the prevalence of trackers
and different tracking behaviors in the web.
Olejnik et al. [65] investigated “cookie syncing”, a technique
that enables third parties to obtain a more complete
view of users’ browsing history by synchronizing their
cookies. Acar et al. [5] investigated the prevalence of “ev-
ercookies” and the effects of cookie respawning in combi-
nation with cookie syncing. Englehardt and Narayanan [35]
conducted a large scale measurement study to quantify state-
ful and stateless tracking in the web, and cookie syncing,
while Lerner et al. [53] conducted a longitudinal measure-
ment study of third party tracking behaviors and found that
tracking has increased in prevalence and complexity over
time.
With regards to stateless tracking, Nikiforakis et al. [64]
investigated various fingerprinting techniques employed by
popular trackers and measured the adoption of fingerprint-
ing in the web. Acar et al. [6] proposed FPDetective, a
framework to detect fingerprinting by identifying and ana-
lyzing specific events such as the loading of fonts, or access-
ing specific browser properties. In another work, Nikiforakis
et al. [63] proposed PriVaricator, a tool that employs ran-
domization to make fingerprints non-deterministic, in order
to make it harder for trackers to link user fingerprints across
websites. Also, in a recent work, Cao et al. [20] proposed a
fingerprinting technique that utilizes OS and hardware level
features, for enabling user tracking not only within a single
browser, but also across different browsers on the same ma-
chine.
2.4 Cross-Device Tracking
A few recent works investigate cross-device tracking that
is implemented based on technologies such as ultrasound
and Bluetooth, and measure the prevalence of these ap-
proaches [58, 11, 49]. As in this work we focus on web
based cross-device tracking, our work is complementary to
works that investigate such technologies.
A work by Brookman et al. [17], one of the few that inves-
tigate CDT on the web, provides some initial insights about
the prevalence of trackers. This work examines 100 popular
websites in order to determine which of them disclose data
to trackers, identifies which websites contain trackers known
to employ CDT techniques, and also investigates if users are
aware of these techniques.
During the Drawbridge Cross-Device Connection com-
petition of the ICDM 2015 conference [2], the participants
were provided with a dataset [1] that contained informa-
tion about some users’ devices, cookies, IP addresses and
also browsing activity, and were challenged to match cook-
ies with devices and users. This resulted in a number of short
papers [10, 19, 47, 48, 50, 28, 78, 85] that describe different
machine learning approaches followed during the competi-
tion for matching devices and cookies. Some of the proposed
methods achieved accuracy greater than 90% and, from a
perspective different from ours, showed that users’
devices can potentially be correlated if enough network and
device information is available.
Zimmeck et al. [88] conducted an initial small-scale ex-
ploratory study on CDT based on the observation of cross-
device targeted ads in two “paired” devices (mobile and
desktop) over the course of two months. Following this
exploration, they collected the browsing history of 126 users,
of whom 107 provided data from both their desktop
and mobile device, and designed an algorithm that estimates
similarities and correlates the devices into pairs. This ap-
proach, which is based on IP addresses and browsing history,
and achieves high matching rates, shows that users’ network
information and browsing history can be used for correlating
user devices, and thus potentially for CDT.
In general, research around CDT is still very limited; in
fact, only [88, 17] initially studied some of its aspects, but
without proving its actual existence or providing a methodol-
ogy for detecting and measuring it. Overall, our work builds
on these early studies on CDT, as well as past studies on de-
tection of web tracking during targeted ads. We propose the
first of its kind methodology for systematic investigation of
probabilistic CDT, by leveraging artificially-created profiles
with specific web behaviors, and measuring the existence of,
and factors affecting CDT in various experimental setups.
3 A methodology to measure CDT
The proposed methodology emulates realistic browsing ac-
tivity of end-users across different devices, and collects and
categorizes all ads delivered to these devices based on the
intensity of the targeting. Finally, it compares these ads with
baseline browsing activity to establish if CDT is present or
not, at what level, and for which types of user interests.
3.1 Design Principle
In general, the CDT performed by the ad-ecosystem is a very
complex process, with multiple parties involved, and a non-
trivial task to dissect and understand. To infer its internal
mechanics, we rely on probing the ecosystem with consis-
tent and repeatable inputs (I), under specific experimental
settings (V), allowing the ecosystem to process and use this
input via transformations and modeling (F), and produce out-
puts we can measure on the receiving end (Y):
$$(I, V) \xrightarrow{\;F\;} Y$$
In this expression, the unknown F is the probabilistic modeling
performed by CDT entities, allowing them to track users
across their devices. Following this design principle, our
methodology allows us to push realistic input signals to the
ad-ecosystem via website visits, and measure the ecosystem’s
output through the delivered ads, to demonstrate whether F enabled
the ecosystem to perform probabilistic CDT. An overview of
our methodology is illustrated in Figure 2.
3.2 Design Overview
3.2.1 Input Signal (I)
To trigger CDT, we first need to inject to the ad-ecosystem
some activity from a user’s browsing behavior (I). This input
can be visits (i) to pages of interest (e.g., travel, shopping),
or (ii) to control pages of null interest (e.g., weather pages).
Intuitively, the former can be used first to demonstrate par-
ticular behavior of a user from a given device (mobile), and
the latter afterwards for collecting ads delivered as the output
of the ecosystem (Y) due to I, to that device or to another device
of the same user (desktop).
Persona Pages. We extract real users’ interests from the
dataset provided by Zimmeck et al. [88] and leverage an ap-
proach similar to Carrascosa et al. [21] to emulate brows-
ing behavior according to specific web categories, and cre-
ate multiple, carefully-crafted personas of different granular-
ities. This design makes the methodology systematic and re-
peatable and produces realistic browsing traffic from scripted
browsers. For each persona, our approach identifies a set of
websites (dubbed as persona pages) that have, at the given
time, active ad-campaigns. This “training activity” aims to
drive CDT trackers into possible device-pairing between the
user’s two devices with high degree of confidence.
Figure 2: High level representation of methodology design principles and units for CDT measurements. [Diagram components: Input Signal (I): Control Pages, Persona Pages; Experimental Setup (V): Experimental Setup Selector, Instantiation of Probing Devices (Mobile, Paired PC, Baseline PC; Same Public IP Address); Ad-ecosystem: CDT Functions & Modeling (F); Output Signal (Y): Page Parser & Ad Extractor, Ad Categorizer (HTML, Ads, Metadata, Categories); CDT Detection: Feature Extractor, CDT Machine Learning Modeler (CDT / No CDT).]
Control Pages. Following past works [21, 12], all devices
in the system collect ads by visiting neutral websites that
typically serve ads not related to their content, thus, reducing
bias from possible behavioral ads delivered to specific types
of websites. We refer to these websites as control pages. We
detail the design of personas and control pages in §4.1.
3.2.2 Experimental Setup (V)
No 1st-party logins. Since we focus on probabilistic CDT,
we assume that the emulated user does not visit or log into
any 1st-party service that employs deterministic CDT and
thus, there is no common identifier (e.g., email address, so-
cial network ID) shared between the user’s devices.
Devices, IP addresses & Activity. The approach we fol-
low is based on triggering and identifying behavioral cross-
device targeted ads, and specifically ads that appear on one of
the user’s devices, but have been triggered by the user’s ac-
tivity on a different device. For this trigger to be facilitated,
the ad-ecosystem must be provided with hints that these two
devices belong to the same user. Zimmeck et al. [88] suggest
that in many cases, the devices’ IP address is adequate for
matching devices that belong to the same user. Also, accord-
ing to relevant industrial teams [57, 8] more signals can be
used, such as location, browsing, etc., for device matching.
Following these observations, our methodology requires
a minimum of three different devices: one mobile device
and two desktop computers, with two different public IP ad-
dresses. We assume that two devices (i.e., the mobile and one
desktop) belong to the same user, and are connected to the
same network. That is, these devices have the same public
IP address, are active in the same geolocation as in a typical
home network, and will be considered by the ad-ecosystem
as producing traffic from the same user. The second desk-
top (i.e., baseline PC), which has a different IP address, is
used for receiving a different flow of ads while replicating
the browsing of the user’s desktop (i.e., paired PC). This
control instance is used for establishing a baseline set of ads
to compare with the ads received by the user’s paired PC.
CDT Direction. In principle, the design allows the inves-
tigation of both directions of CDT. That is, users may first
browse on the mobile device, and then move to their desk-
top, and vice versa. However, since ad-targeting companies
such as AdBrain and Criteo state that the direction from
mobile to desktop is more suitable for cross-device retargeting
[72, 7, 25], in this work we focus on the mobile-to-desktop
direction (Mob → PC). In essence, the mobile device
performs a specifically instructed web browsing session to
establish the persona, by visiting the set of persona pages,
i.e., training phase; then, the two desktop computers perform
web browsing, i.e., testing phase, where they visit the set of
control pages and collect the delivered ads. The browsing
performed by the desktops is synchronized by means of vis-
iting the same pages and performing the exact same clicks.
3.2.3 Output Signal (Y)
In order to handle the Output Signal and transform it ap-
propriately, we design and implement two different compo-
nents: (i) Page Parser & Ad Extractor and (ii) Ad Catego-
rizer. The first is responsible for the identification and ex-
traction of ad elements inside the webpages. The module
uses string matching techniques and a public list of common
ad-domains (Easylist [33]) to identify the delivered ads. The
second module assigns a keyword to each ad identified in
the previous step, based on its type and content (e.g., “On-
line Shopping”, “Fashion”, “Recreation”, etc.). Using both
modules, we store the ads delivered in all devices of our ex-
perimental setup along with their categories, as well as data
related to the activity of the devices that attracted these ads.
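
The domain-matching step of the Page Parser & Ad Extractor can be illustrated with a minimal Python sketch. This is not Talon's actual implementation: the three listed domains are a hypothetical stand-in for the much larger Easylist (whose real filter syntax is richer than plain domain names), the regular expression is a simplification of proper HTML parsing, and the function names are chosen here for illustration only.

```python
import re
from urllib.parse import urlparse

# Hypothetical sample of ad-serving domains; the real Easylist is far larger
# and uses a richer filter syntax than plain domain names.
AD_DOMAINS = {"doubleclick.net", "criteo.com", "adnxs.com"}

# Match src/href attribute values in raw HTML (a simplification of real parsing).
URL_ATTR = re.compile(r'(?:src|href)\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def is_ad_url(url, ad_domains=AD_DOMAINS):
    """Return True if the URL's host equals, or is a subdomain of, a listed ad domain."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in ad_domains)


def extract_ad_urls(html, ad_domains=AD_DOMAINS):
    """Collect every src/href URL in the page whose host matches the ad-domain list."""
    return [u for u in URL_ATTR.findall(html) if is_ad_url(u, ad_domains)]


# Made-up page fragment: two ad elements, one first-party image, one look-alike host.
page = '''
<a href="https://ad.doubleclick.net/click?id=1"><img src="https://static.criteo.com/b.png"></a>
<img src="https://www.weather.com/logo.png">
<a href="https://notdoubleclick.net/x">link</a>
'''
ads = extract_ad_urls(page)
```

Matching on the host name with a leading dot, rather than on a raw substring, keeps look-alike hosts such as the last link in the example from being counted as ads.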
Figure 3: Persona design and automatic generation. [Diagram labels: Real User’s Interests [88], Synthetic Personas [21], Pre-processing, Persona Selection, Google Product Taxonomy, Extract Keywords, Group Keywords per Persona, Google Search - Campaign Extraction, Persona Pages, Alexa Top Sites, Extract Weather Websites, Control Pages.]
3.2.4 CDT Detection
Comparing Signals. Various statistical methods can be used
to associate the input signal I of persona browsing on the
mobile device with the output signal Y of ads delivered to
the potentially paired-PC. For example, simple methods that
perform similarity computation between the two signals in a
given dimensionality (e.g., Jaccard, Cosine) can be applied.
These methods, as well as typical statistical techniques (e.g.,
permutation tests) capture only one dimension of each in-
put/output signal and thus, might not be suitable for measur-
ing with confidence the high complexity of the CDT signal.
In this case, more advanced methods can be employed, such
as Machine Learning techniques (ML) for classification of
the signals as similar enough to match, or not. In our analy-
sis, we mainly focus on ML to compute the likelihood of the
two signals being the product of CDT, as it takes into con-
sideration this multidimensionality in the feature space. We
describe the modeling and methods used for ML in §4.4.
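
The simple similarity measures mentioned above can be sketched as follows. This is an illustrative Python example under stated assumptions, not the analysis code of Talon: each signal is represented as a list of category keywords, and the example category lists are invented for demonstration.

```python
import math
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity of two category sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(a, b):
    """Cosine similarity of two category-frequency vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[k] * cb[k] for k in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


# Hypothetical signals: categories of persona pages vs. categories of collected ads.
persona_input = ["Fashion", "Online Shopping", "Fashion"]
paired_pc_ads = ["Fashion", "Online Shopping", "Weather"]
baseline_pc_ads = ["Weather", "Finance"]

paired_score = jaccard(persona_input, paired_pc_ads)
baseline_score = jaccard(persona_input, baseline_pc_ads)
```

In this toy example the paired PC scores higher than the baseline PC. Each measure compresses the signals into a single number, which is the one-dimensionality limitation noted above.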
4 Framework Implementation
A high level overview of our methodology, and its material-
ization by our framework Talon, is presented in Figure 2 and
described in §3. In the following, we provide more details
about its building blocks, and argue for various design de-
cisions taken while implementing this methodology into the
fully-fledged automated system.
4.1 Input Signal: Control Pages & Personas
Persona Pages. A critical part of our methodology is the de-
sign and automatic building of realistic user personas. Each
persona has a unique collection of visiting links, that form
the set of persona pages. Since we do not know in advance
which e-commerce sites are conducting cross-device
ad-campaigns, we design a process to dynamically detect active
persona pages of given interest categories. Our approach
for persona generation is shown in Figure 3.

Table 1: Behavioral personas used in our experiments.
Persona  Category - Description
1        Online Shopping - Accessories, Jewelry.
2        Online Shopping - Fashion, Beauty.
3        Online Shopping - Sports and Accessories.
4        Online Shopping - Health and Fitness.
5        Online Shopping - Pet Supplies.
6        Air Travel.
7        Online Courses and Language Resources.
8        Online Business, Marketing, Merchandising.
9        Browser Games - Online Games.
10       Hotels and Vacations.

We first use the list of topics of Zimmeck et al. [88] that
describe real users’ online interests. We cluster the
interests based on their content and label the clusters
appropriately (e.g., we group together “Shopping” and
“Beauty and Fashion” under the label: “Shopping and Fash-
ion”). Then, we use the persona categorization of Carrascosa
et al. [21] for their top 50 personas, and select only those
personas that describe similar interests with the previously
formed list. For the resulting intersection of personas from
the two lists, we iterate through the Google Product Taxon-
omy list [40] to obtain the related keywords for each one.
To increase the probability of capturing active ad-
campaigns that can potentially deliver ads to the devices,
we use Google Search as it reveals campaigns associated
with products currently being advertised. That is, if a
user searches for specific keywords (e.g., “men watches”),
Google will display a set of results, including sponsored
links for sites conducting campaigns for the terms searched.
In this way, we use the keywords set for each persona, as
extracted above, and transform them into search queries by
appending common string patterns such as “buy”, “sell”, and
“offers”. This process is repeated until between five and ten
unique domains per persona are collected. If the procedure
fails, no persona is formed.
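
The query-generation and domain-collection loop described above can be sketched in Python as follows. This is a sketch under assumptions rather than Talon's code: `fetch_sponsored_links` is a hypothetical callable standing in for whatever mechanism retrieves sponsored links for a query, `fake_search` is a stub returning made-up URLs, and stopping at exactly ten domains is one possible reading of "between five and ten".

```python
from urllib.parse import urlparse

QUERY_PATTERNS = ("buy", "sell", "offers")  # patterns named in the text
MIN_DOMAINS, MAX_DOMAINS = 5, 10


def build_queries(keywords, patterns=QUERY_PATTERNS):
    """Turn persona keywords into search queries by appending each pattern."""
    return [f"{kw} {p}" for kw in keywords for p in patterns]


def collect_persona_domains(keywords, fetch_sponsored_links):
    """Gather unique sponsored-link domains until MAX_DOMAINS is reached.

    fetch_sponsored_links is a hypothetical callable (query -> list of URLs);
    how the sponsored links are actually retrieved is not specified here.
    Returns None when fewer than MIN_DOMAINS are found (no persona is formed).
    """
    domains = []
    for query in build_queries(keywords):
        for url in fetch_sponsored_links(query):
            host = urlparse(url).hostname
            if host and host not in domains:
                domains.append(host)
            if len(domains) == MAX_DOMAINS:
                return domains
    return domains if len(domains) >= MIN_DOMAINS else None


def fake_search(query):
    # Stub standing in for a real search: two made-up sponsored links per query.
    slug = query.replace(" ", "-")
    return [f"https://shop-{slug}.example/p", f"https://deals-{slug}.example/p"]


persona = collect_persona_domains(["men watches", "leather wallets"], fake_search)
```

Passing the search function as a parameter keeps the collection logic testable without network access.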
As the effectiveness of a persona depends on the active ad-
campaigns at the given time, in our experiments, we deploy
personas in 10 categories related to shopping, traveling, etc.
(full list shown in Table 1). With this procedure, we manage
to design personas sufficiently similar to real users, as well as
to emulated users designed in previous works [21, 12, 13,
88].
Control Pages. For retrieving the delivered ads (after any
type of browsing), we employ a set of webpages that contain:
(i) easily identifiable ad-elements and (ii) a sufficient num-
ber of ads that remains consistent through time. These pages
have neutral context and do not affect the behavioral profile
of the device visiting them. For most of the experiments in
§5, we use a set of five popular weather websites^2 as control
pages, similarly to [21]. We manually confirmed the neutral-
ity of these pages, by observing no contextual ads delivered
to them. When visiting the set of control pages, our meth-
ods extract and categorize all the ads received, in order to
identify those that have potentially resulted from CDT.
4.2 Experimental System Setup
The experimental setup contains different types of units, con-
nected together for replicating browsing activity on multiple
devices. Typically, CDT is applied on two or more devices
that belong to the same user, such as a desktop and a mobile
device. Thus, the system contains emulated instances of both
types, controlled by a number of experimental parameters.
Devices & Automation. The desktop devices are built on
top of the web measurement framework OpenWPM [35].
This platform enables launching instances of the Firefox
browser, performs realistic browsing with scrolling, sleeps
and clicks, and collects a wide range of measurements in
every browsing session. It is also capable of storing the
browser’s data (cookies, local cache, temporary files) and ex-
ports a browser profile after the end of a browsing session,
which can be loaded in a future session. With these options,
we can perform stateful experiments, as a typical user’s web
browser that stores all the data through time, or stateless ex-
periments to emulate browsing in incognito mode.
For the mobile device, we use the official Android Emula-
tor [41], as well as the Appium UI Automator [73] for the au-
tomation of browsing. We build the mobile browsing module
on top of these components to automate visits to pages via
the Browser Application. This browsing module provides
functionalities for realistic interaction with a website, e.g.,
scrolling, click and sleep rate. Similarly to the desktop, it
can run either in a stateful or stateless mode.
Experimental Setup Selector. As briefly described in §3,
we need two phases of browsing to different types of
webpages (training and testing), in order to successfully
measure CDT. For that reason, we set the two browsing phases
in the following way: During the training phase, the selected
device visits the set of Persona Pages for a specific
duration, referred to as training time (t_train). The test
phase is the set of visits to control pages for the purpose
of collecting ads. During this phase, we control the duration
of browsing (i.e., t_test). The experimental setup selector
controls various parameters such as: which type of device
will be trained and tested, the times t_train and t_test, the
sequence of time slots for training and testing from the
selected device, number of repetitions of this procedure, etc.
Timeline of phases. Each class of experiments is executed
multiple times (or runs), through parallel instantiations of the
user devices within the framework (as shown in Figure 2).
² accuweather.com, wunderground.com, weather.com,
weather-forecast.com, metcheck.com
[Figure 4: timeline diagram. Sessions 1 to N are laid out on
a time axis t, from 0 to t_S, in the order B1 M1 W A1 R,
B2 M2 W A2 R, ..., BN MN W AN, with a CDT detection marker.]
Figure 4: Timeline of phases for CDT measurement.
M_i: mobile training time t_train + testing time t_test;
B_i (A_i): desktop testing time t_test before (after) mobile
phase; W: wait time (t_wait); R: rest time (t_rest);
t_Si: time of session i.
Each experimental run is executed following a timeline of
phases as illustrated in Figure 4. This timeline contains N
sessions with three primary stages in each: Before, Mobile,
and After. The Before (B_i) stage is when the two desktop
devices perform a parallel test browsing, with a duration of
t_test, to establish the state of ads before the mobile
device injects signal into the ad-ecosystem. The Mobile (M_i)
stage is when the mobile device performs a training browsing
for t_train time, and a test browsing for t_test time. This
phase injects the signal from the mobile during training with
a persona, but also performs a subsequent test with control
pages to establish the state of ads after the training.
Finally, the After (A_i) stage is when the two desktops
perform the final test browsing, with the same duration
t_test as in the Before (B_i) stage, to establish the state
of ads after the mobile training.
After extensive experimentation, we found that a minimum
training time t_train=15 minutes and testing time t_test=20
minutes are sufficient for injecting a clear signal over
noise, from the trained device to the ad-ecosystem. There is
also a waiting time (t_wait=10 minutes) and resting time
(t_rest=5 minutes) between the stages of each session, to
allow alignment of instantiations of devices running in
parallel during each session. In total, each session lasts
1.5 hours and is repeated N=15 times during a run. Through
the experimental setup selector, we define the values of such
variables (t_train, t_test, t_wait, t_rest, N, type of
device), offering the researcher the flexibility to
experiment in different cases of CDT.
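The 1.5-hour session length follows directly from the stage durations given above. The following minimal sketch reproduces that arithmetic; the class and method names are hypothetical and not part of Talon, and the per-run total assumes that a rest period is counted after every session.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionTiming:
    """Timing parameters of one session, in minutes (values from the text)."""
    t_train: int = 15     # mobile training time
    t_test: int = 20      # test browsing time (Before, Mobile and After stages)
    t_wait: int = 10      # wait time W
    t_rest: int = 5       # rest time R
    n_sessions: int = 15  # N, sessions per run

    def session_minutes(self) -> int:
        before = self.t_test                 # B_i: desktops test
        mobile = self.t_train + self.t_test  # M_i: mobile trains, then tests
        after = self.t_test                  # A_i: desktops test again
        return before + mobile + self.t_wait + after + self.t_rest

    def run_hours(self) -> float:
        # Assumption: every session, including the last, is charged a rest period.
        return self.n_sessions * self.session_minutes() / 60


timing = SessionTiming()
print(timing.session_minutes())  # 90 minutes, i.e. 1.5 hours
print(timing.run_hours())        # 22.5 hours for N=15 sessions
```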
4.3 Output Signal
Page Parser. This component is activated when the visited
page is fully loaded and no further changes occur on the con-
tent. To collect the display ads, we first need to identify spe-
cific DOM elements inside the visited webpages. This task is
challenging due to the dynamic JavaScript execution and the
complex DOM structures generated in most webpages. For
the reliable extraction of ad-elements and identification of
the landing pages,³ we follow a methodology similar to the
one proposed in [55]. The functionality of this component
is to parse the rendered webpage and extract the attributes of
display ads, which also contain the landing pages.
³ Destination websites the user is redirected to when clicking on the ads.
Ad Extractor. In most modern websites, the displayed ads
are embedded in iFrame tags that create deep nesting layers,
containing numerous and different types of elements. How-
ever, the ads served by the control pages are found directly
inside the iFrames so the module does not have to handle
such complex behavior. Therefore, the module first
identifies all the active iFrame elements and filters out the
invalid ones that have either empty content or zero dimensions.
Then, it retrieves the href attributes of image and Flash ads
and parses the URLs, while searching for specific string pat-
terns such as adurl=, redirect=, etc. These patterns are typ-
ically used by the ad-networks for encoding URLs in web-
pages. Next, the module forms the list of candidate landing
pages, which are then processed and analyzed to create the
set of true landing pages. The Ad Extractor is fully com-
patible with the crawlers, and does not need to perform any
clicks on the ad-elements, since it extracts only the landing
pages’ URLs directly from the rendered webpage. After col-
lecting the candidate landing pages, the module filters them
with the EasyList [33], similarly to previous works [12, 35],
and stores only the true active ad-domains. Finally, the Page
Parser & Ad Extractor module also stores metadata from the
crawls such as: time and date of execution, number of identi-
fied ads, number of categories, type and phase of crawl, etc.
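The URL-parsing step described above can be illustrated with a short sketch. It is not the Ad Extractor itself: the function name, the parameter tuple and the example URLs are assumptions made for illustration, and only the `adurl` and `redirect` patterns come from the text. The sketch decodes the destination URL carried in the ad link's query string and returns its host as a candidate landing page.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Query-parameter names that may carry an encoded destination URL.
# "adurl" and "redirect" are named in the text; a real list would be longer.
REDIRECT_PARAMS = ("adurl", "redirect")


def candidate_landing_page(href: str) -> Optional[str]:
    """Return the host of the destination URL embedded in an ad link, if any."""
    query = parse_qs(urlparse(href).query)  # parse_qs also percent-decodes values
    for name in REDIRECT_PARAMS:
        for value in query.get(name, []):
            host = urlparse(value).hostname
            if host:
                return host
    return None


# Hypothetical ad link, for illustration only.
href = ("https://ads.example.net/click?id=42"
        "&adurl=https%3A%2F%2Fshop.example.com%2Fshoes%3Fref%3Dad")
print(candidate_landing_page(href))
```

Candidates obtained this way would still need the filtering step described in the text before being stored as true landing pages.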
Ad Categorizer. To associate landing pages or browsing
URLs with web categories, we employ the McAfee Trusted-
Sources database [60], which provides URLs organized into
categories. This system was able to categorize 96% of the
landing pages of our collection into a total of 76 unique
categories, by providing up to four semantic categories for
each page, while the remaining 4% of domains were manually
classified into the same categories. The final output contains the
landing pages of collected ads, along with their categories.
4.4 CDT Detection
Probabilistic CDT is a kind of task generally suitable for
investigation through ML. Previous work [88] and industry
directions [57, 8] claim that probabilistic device-pairing is
based on specific, well-defined signals such as: IP address,
geolocation, type and frequency of browsing activity. Since
we control these parameters in our methodology, by defi-
nition we construct the ground truth with our experimental
setups. That is, we control (i) the devices used, which are
potentially paired under a given IP address, geolocation and
browsing patterns, (ii) the control instance of baseline desk-
top device, and (iii) the browsing with the personas.
Before applying any statistical method, every instance of
the input data has to be transformed into a vector of values;
each position in the vector corresponds to a feature. Features
are different properties of the collected data: browsing
activity of a user during training time, experimental setup
used (persona, etc.), time-related details of the experiment,
as well as information about the collected ads, which is the
output signal received from the given browsing activity.
These features can be studied systematically to identify
statistical association between the input and output signals,
given an experimental setup. In effect, our feature space is
the union of these vectors, since all features are either
controlled, or measurable by us (detailed description of the
features is given in Table 2). The only unknown is whether the
ad-ecosystem has successfully associated the devices, and if
it has exhibited this in the output signal via ads.
Table 2: Description of features used by datasets. The type of
desktop crawl takes values in {0,1}, where 0 represents the
before/test sessions and 1 the after/train sessions. The time
of crawl is divided into 30-minute timeslots and is encoded in
range {0,48}. The day of crawl is encoded in range {1,7}.
V represents the (enumerated) vectors of values in the sets
of: landing pages, training pages, ads and ad categories.
Feature Label                      Description
Crawl Type                         The type of desktop crawl.
Run ID                             The index of the run {1,4}.
Session ID                         The index of the session {1,15}.
Persona Keywords                   V: keyword categories of training pages.
Mobile Timeslot                    Time of crawl (Mobile).
Desktop Timeslot                   Time of crawl (Desktop).
Desktop Day                        The day of crawl (Desktop).
Mobile Number of Ads               # ad domains (Mobile).
Desktop Number of Ads              # ad domains collected (Desktop).
Mobile Unique Number of Ads        # distinct ad domains (Mobile).
Desktop Unique Number of Ads       # distinct ad domains (Desktop).
Mobile Number of Keywords          # ad categories (Mobile).
Desktop Number of Keywords         # ad categories (Desktop).
Mobile Unique Number of Keywords   # distinct ad categories (Mobile).
Desktop Unique Number of Keywords  # distinct ad categories (Desktop).
Mobile Keywords                    V: keyword categories of landing pages (Mobile).
Desktop Keywords                   V: keyword categories of landing pages (Desktop).
Mobile Landing Pages               V: landing pages of delivered ads (Mobile).
Desktop Landing Pages              V: landing pages of delivered ads (Desktop).
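As an illustration of how a crawl record could be turned into such a vector, the sketch below encodes the time-related features and one enumerated vector V. All function names are hypothetical. Two choices are assumptions rather than the paper's stated method: the 30-minute slot index runs from 0 to 47 here (the paper states the range {0,48}, so its exact convention may differ), and V is encoded as a multi-hot vector over a fixed vocabulary.

```python
from datetime import datetime


def timeslot(ts: datetime) -> int:
    """Index of the 30-minute slot of the day (0..47 under this convention)."""
    return ts.hour * 2 + ts.minute // 30


def day_of_week(ts: datetime) -> int:
    """Day of crawl encoded in 1..7 (Monday=1, an assumed ordering)."""
    return ts.isoweekday()


def multi_hot(items, vocabulary):
    """1 where a vocabulary entry occurs in items, else 0 (assumed encoding of V)."""
    present = set(items)
    return [1 if entry in present else 0 for entry in vocabulary]


# Hypothetical crawl record, for illustration only.
crawl_time = datetime(2019, 3, 4, 14, 45)
keyword_vocabulary = ["Beauty", "Sports", "Travel"]
row = [timeslot(crawl_time), day_of_week(crawl_time)]
row += multi_hot(["Travel"], keyword_vocabulary)
print(row)  # [29, 1, 0, 0, 1]
```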
One Dimension Statistical Analysis. At the first level of
analysis, to measure the similarity of distribution of ads
delivered in the different devices, we compare the signals
using a two-tailed permutation test and reject the null
hypothesis that the frequency of ads delivered (for a given
category) comes from the same distribution, if the t-test
statistic leads to a p-value smaller than the significance
level α=0.05.
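A two-tailed permutation test of this kind can be sketched in a few lines. For brevity, the sketch uses the absolute difference of sample means as the test statistic instead of the t statistic mentioned above; the function name, defaults and example counts are illustrative assumptions.

```python
import random
from statistics import mean


def permutation_test(x, y, n_permutations=10_000, seed=0):
    """Two-tailed permutation p-value for a difference in means.

    x, y: per-session ad counts for one category on two devices.
    """
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabeling of the pooled observations
        diff = abs(mean(pooled[:len(x)]) - mean(pooled[len(x):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_permutations + 1)


# Made-up counts, for illustration only.
mobile = [3, 4, 2, 5, 4, 3]
desktop = [3, 5, 2, 4, 4, 3]
p_value = permutation_test(mobile, desktop, n_permutations=2000)
print("reject H0" if p_value < 0.05 else "cannot reject H0")
```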
Table 3: Characteristics of the datasets used in each setup
(S) of experiments. S={1,2,3} are the setups of experiments
in §5.2, §5.3 and §5.4, respectively; t_total: the total
duration of experiment; t_train: the training duration;
t_test: the testing duration; I: independent personas;
C: data combined from personas; SF: stateful browser;
SL: stateless browser; B: boosted CDT browsing.
S   Personas      Runs  t_train  t_test  t_total  Samples  Features
1a  10 (I, SF)    4     15min    20min   37 days  240      1100
1b  10 (C, SF)    -     -        -       -        2400     2201
2a  2 (I, SF)     4     480min   30min   6 days   192      600
2b  2 (C, SF)     -     -        -       -        384      750
2c  2 (I, SF, B)  4     480min   30min   6 days   192      500
2d  2 (C, SF, B)  -     -        -       -        384      576
3a  5 (I, SL)     2     15min    20min   9 days   120      450
3b  5 (C, SL)     -     -        -       -        600      880
Multidimensional Statistical Analysis. Given that a
unidimensional test such as the previous one does not take
into account the various other features available in each
experiment, we further consider ML methods, which take into
account multidimensional data, to decide if the ads delivered
in each device are from the same distribution or not. We
transform the problem of identifying whether the previously
exported vectors are similar enough into a typical binary
classification problem, where the predicted class describes
whether pairing may have occurred between the mobile device
and one of the two desktop devices. As a paired combination
we consider the desktop device that exists under the same IP
address with the mobile device. The “not paired” combination
is the mobile device and the baseline desktop. The analysis is
based on three classification algorithms with different
dependences on the data distributions. An easily applied
classifier that is typically used for performance comparison
with other models is the Gaussian Naive Bayes classifier.
Logistic Regression is a well-behaved classification
algorithm that performs well when the classes are
(approximately) linearly separable. It is also robust to noise
and can avoid overfitting by tuning its regularization/penalty
parameters. Random Forest and Extra-Trees classifiers
construct a multitude of decision trees and output the class
that is the mode of the classes of the individual trees. They
also use the Gini index metric to compute the importance of
features.
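To make the two tree-ensemble ideas concrete, the sketch below computes the Gini impurity of a set of labels, the impurity decrease obtained by one split (the per-split quantity that Gini-based feature importance aggregates over all trees), and the majority vote over individual tree predictions. The helper names are hypothetical; library implementations additionally weight and normalize these quantities.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(labels, goes_left):
    """Impurity decrease when labels are split by the boolean mask goes_left."""
    left = [y for y, flag in zip(labels, goes_left) if flag]
    right = [y for y, flag in zip(labels, goes_left) if not flag]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


def majority_vote(tree_predictions):
    """Class predicted by most trees (the mode of the individual outputs)."""
    return Counter(tree_predictions).most_common(1)[0][0]


labels = [0, 0, 1, 1]            # 0 = not paired, 1 = paired
perfect_split = [True, True, False, False]
print(gini_impurity(labels))                 # 0.5
print(gini_decrease(labels, perfect_split))  # 0.5
print(majority_vote([1, 0, 1]))              # 1
```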
A fundamental point when considering the performance
evaluation of ML algorithms is the selection of the
appropriate metrics. Pure Accuracy can be used, but it is not
representative for our analysis, since we want to report the
most accurate estimation for the number of predicted paired
devices, while at the same time measure the absolute number of
misclassified samples overall. For this reason, metrics like
Precision, Recall and F1-score, and the Area Under the
Receiver Operating Characteristic Curve (AUC) are typically
used, since they can quantify this type of information.
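For reference, these metrics can be computed from labels, predictions and scores as in the following sketch. The function names and the toy data are illustrative only; AUC is computed here through its rank interpretation, i.e., the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, Recall and F1-score for the class `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Requires at least one sample of each class.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = paired desktop, 0 = not paired desktop.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
print(precision_recall_f1(y_true, y_pred))
print(roc_auc(y_true, scores))
```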
5 Experimental Evaluation
We use the Talon framework to perform various experiments
and construct different datasets for each. Since every ex-
perimental setup has different experimental parameters (i.e.,
training and testing time, number of personas, browsing
functionalities), the datasets vary in terms of sample size
and feature space. The datasets collected during our
experiments and used in our analysis are presented in Table 3.
5.1 Does IP-sharing allow CDT?
A first set of preliminary experiments was performed to
demonstrate that our platform can (i) successfully identify
and collect the ads delivered to our multiple devices (mobile
and desktops), (ii) inject browsing signal from a device, thus
biasing it to have a realistic persona and (iii) lead to match-
ing/pairing of devices, which could be due to same behav-
ioral ads, retargeting ads or CDT.
First, we use a simple experimental setup: we connect
three instances of desktop devices and one mobile device un-
der the same IP address. We create one persona (as in §4.1),
with an interest in “Online Shopping-Fashion, Beauty”, and
following the described timeline of phases, we run this ex-
periment for two days. Then, we perform one-dimensional
statistical analysis, as introduced in §4.4, and find that there
is no similarity between the mobile and any of the desktop
devices (null hypothesis rejected with highest p-value=0.030),
while all desktop distributions are similar to each other (null
hypothesis accepted with lowest p-value=0.33). These statis-
tical results indicate that there is no clear device-pairing (at
the level of ad distribution for the given persona), and that
we should consider controlling more factors to instigate it.
Consequently, we expand this experiment by also training
one of the desktop devices using the same persona as the
mobile. By repeating the same statistical tests, we find that
the mobile and desktop with the same browsing behavior re-
ceive ads coming from the same distribution (null hypothesis
accepted with lowest p-value=0.84), while the other desk-
top devices show no similarity with each other or the mobile
(null hypothesis rejected with highest p-value=0.008). This
result indicates that browsing behavior under a shared IP ad-
dress can boost the signal towards advertisers, which they
can use to apply advanced targeting, either as CDT, or retar-
geting on each device or a mixture of both techniques.
Finally, these preliminary experiments and statistical tests
provide us with evidence regarding the effectiveness of our
framework to inject enough browsing signal from different
devices under selected personas. Our framework is also able
to collect ads delivered between devices, that can be later
analyzed and linked back to the personas. Those are fun-
damental components for our system and importantly they
are potentially causing CDT between the devices involved.
Next, we present more elaborate experiments with our
framework, in order to study CDT in action.
5.2 Does short-time browsing allow CDT?
Independent Personas: Setup 1a. This experimental setup
emulates the behavior of a user that browses frequently about
some topics, but in short-lived sessions on her devices. Given
that most users do not frequently delete their local brows-
ing state, this setup assumes that the user’s browser stores
all state, i.e., cookies, cache, browsing history. This enables
trackers to identify users more easily across their devices, as
they have historical information about them. In this setup,
every experimental run starts with a clean browser profile;
cookies and temporary browser files are stored for the whole
duration of the experimental run (stateful). We use all
personas of Table 1, and the data collection for each lasts
4 days.
Table 4: Performance evaluation for Random Forest in
Setups 1a and 1b. Left value in each column is the score for
Class 0 (C0=not paired desktop); right value for Class 1
(C1=paired desktop).
Persona        Precision    Recall       F1-Score     AUC
(Setup)        C0    C1     C0    C1     C0    C1
1 (1a)         0.89  0.60   0.57  0.90   0.70  0.72   0.73
2 (1a)         0.84  0.78   0.81  0.82   0.82  0.80   0.82
3 (1a)         0.81  0.73   0.78  0.76   0.79  0.74   0.76
4 (1a)         0.87  0.78   0.87  0.78   0.87  0.78   0.82
5 (1a)         0.94  0.65   0.68  0.93   0.79  0.76   0.80
6 (1a)         0.57  0.67   0.81  0.38   0.67  0.48   0.59
7 (1a)         0.81  0.87   0.89  0.76   0.85  0.81   0.81
8 (1a)         0.86  0.85   0.89  0.81   0.87  0.83   0.84
9 (1a)         0.74  0.90   0.91  0.73   0.82  0.81   0.81
10 (1a)        0.77  0.85   0.81  0.81   0.79  0.83   0.81
combined (1b)  0.77  0.84   0.81  0.84   0.82  0.84   0.89
We perform the same statistical analysis as in §5.1, and
find that in 4/10 personas, the mobile and paired desktop
ads are similar (null hypothesis accepted with lowest p-
value=0.13), while the mobile and baseline desktop ad dis-
tributions are different (null hypothesis is rejected with high-
est p-value=0.009). This inconsistency is reasonable since
the statistical analysis is based only on one dimension (the
frequency count of types of ads appearing in the devices),
which may not be enough for fully capturing the existence of
device-pairing. For this reason, we choose to use more ad-
vanced, multidimensional ML methods which take into ac-
count the various variables available, to effectively compare
the potential CDT signals received by the two devices.
The classification results of the Random Forest (best per-
forming) algorithm are reported in Table 4. We use AUC
score as the main metric in our analysis, since the ad-industry
seems to prefer higher Precision scores over Recall, as the
False Positives have greater impact on the effectiveness of
ad-campaigns.⁴ As shown in Table 4, the model achieves
high AUC scores for most of the personas, with a maximum
value of 0.84. Specifically, the personas 2, 4 and 8 scored
highest in AUC, and also in Precision and Recall, whereas
persona 6 has poor performance compared to the rest. These
results indicate that for high scoring personas, we success-
fully captured the active CDT campaigns, but for the per-
sonas with lower scores, there may not be active campaigns
for the period of the experiments.
In order to retrieve the variables that affect the discovery
and measurement of CDT, we applied the feature importance
method on the dataset of each persona, and selected the
top-10 highest scoring features. For the majority of the
personas (7 out of 10) the most important features were the
number of ads (distinct or not) and the number of keywords in
desktop. In some cases, there were also landing pages that
scored highly (i.e., specific ad-campaigns), but this was not
consistent across all personas.
⁴ Tapad [3] mentions: “Maintaining a low false positive rate while also
having a low false negative rate and scale is optimal. This combination is
a strong indicator that the Device Graph in question was neither artificially
augmented nor scrubbed.”
[Figure 5: bar chart of Gini Score (y-axis, 0 to 0.03) per
feature (x-axis), with bars grouped as Crawl Attributes, Ad
Domain and Keyword. Features in plotted order: Desktop Day,
Real Estate, Marketing, Online Shopping, Crawl Type, Desktop
Timeslot, Stock Trading, Beauty, Desktop #Ads, Domain 1,
Domain 2, Recreation, Domain 3, Education, Mobile Day,
Business, Run Id, Hardware, Domain 4, Mobile #Ads (unq),
Games, Fashion, Merchandising, Sports, Mobile #Keywords (unq),
Domain 5, Software, Travel, Mobile Timeslot, Session Id.]
Figure 5: Top-30 features ranked by importance using Gini
index, in the machine learning model.
Combined Personas: Setup 1b. Here, we use all the
datasets collected individually, for each persona in the previ-
ous experiment (Setup 1a), and combine them into one uni-
fied dataset. This setup emulates the real scenario of a user
exhibiting multiple and diverse web interests, that give ex-
tra information to the ad-ecosystem about their browsing be-
havior. Of course, there is an increase in the possible feature
space to accommodate all the domains and keywords from
all personas. In fact, the dataset contains 2201 features as it
stores the vectors of landing pages and keywords, for all the
different types of personas. In total, there were 890 distinct
ad-domains described by keywords in 76 distinct categories.
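The growth of the feature space when per-persona datasets are merged can be illustrated as follows. This is a sketch under stated assumptions, not the procedure used to build the combined dataset: rows are represented as feature-name to value mappings, the unified columns are the union of all feature names, and a feature absent from a row is filled with 0. The function and feature names are hypothetical.

```python
def combine_datasets(datasets):
    """Merge per-persona datasets into one matrix over the union of features.

    datasets: list of datasets, each a list of {feature_name: value} rows.
    Returns (columns, matrix); missing features are filled with 0 (assumption).
    """
    columns = sorted({name for rows in datasets for row in rows for name in row})
    matrix = [[row.get(name, 0) for name in columns]
              for rows in datasets for row in rows]
    return columns, matrix


# Made-up rows, for illustration only.
persona_a = [{"kw:Travel": 1, "domain:shop.example": 1}]
persona_b = [{"kw:Sports": 1}, {"kw:Travel": 1}]
columns, matrix = combine_datasets([persona_a, persona_b])
print(columns)  # ['domain:shop.example', 'kw:Sports', 'kw:Travel']
print(matrix)   # [[1, 0, 1], [0, 1, 0], [0, 0, 1]]
```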
In this dataset, we apply feature selection with the Extra-
Trees classifier to select the most relevant features and cre-
ate a more accurate predictive model. This method reduced
the feature space to 984 useful features out of 2201. Next,
we use the three classification algorithms and a range of
hyper-parameters for each one. Also, we apply a 10-fold
nested cross-validation method for selecting the best model
(in terms of scoring performance) that can give us an ac-
curate, non overly-optimistic estimation [22]. Again, the
best selected model was Random Forest, with 200 estimators
(trees) and a maximum depth of 200 for each tree, with AUC=0.89 (bottom
row in Table 4). The model’s performance is high in all the
mentioned scores, which indicates that the more diverse data
the advertisers collect, the easier it is to identify the different
user’s devices. This result is in line with Zimmeck et al. [88],
who attempted a threshold-based approach for probabilistic
CDT detection on real users’ data, lending credence to our
proposed platform’s performance.
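The structure of nested cross-validation can be sketched independently of any ML library: the inner loop selects a hyper-parameter using only the outer training part, and the outer loop scores that choice on data never seen during selection. In the sketch below, `fit_score` is a hypothetical callback supplied by the caller that trains on the given training indices with a given parameter and returns a score on the given test indices; folds are contiguous and unshuffled for brevity, whereas a real setup would typically shuffle and stratify.

```python
def kfold(n, k):
    """Yield (train_idx, test_idx) for k contiguous folds over range(n)."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def nested_cv(n, param_grid, fit_score, outer_k=10, inner_k=10):
    """Mean outer-fold score, with parameters chosen on inner folds only."""
    outer_scores = []
    for outer_train, outer_test in kfold(n, outer_k):

        def inner_mean(param):
            scores = [
                fit_score([outer_train[i] for i in tr],
                          [outer_train[i] for i in te], param)
                for tr, te in kfold(len(outer_train), inner_k)
            ]
            return sum(scores) / len(scores)

        best = max(param_grid, key=inner_mean)  # model selection step
        outer_scores.append(fit_score(outer_train, outer_test, best))
    return sum(outer_scores) / len(outer_scores)
```

Because the outer test fold takes no part in choosing `best`, the averaged outer score is not inflated by the hyper-parameter search.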
We also measure the feature importance for the top-30 fea-
tures (shown in Figure 5). One third of the top features are
related to crawl specific metadata, whereas about half of the
top features are keyword-related. Interestingly, features such
as the day and time of the experiment, as well as the number
of received ads, are important for the algorithm to make the
classification of the devices. Indeed, time-related features
provide hints on when the ad-ecosystem receives the brows-
ing signal and attempts the CDT, and thus, which days and
hours of day the CDT is stronger. These results give support
to our initial decision to experiment in a continuous fashion
with regular sessions injecting browsing signal, while at the
same time measuring the output signal via delivered ads.
5.3 Does long-time browsing improve CDT?
Independent Personas: Setup 2a. In this set of experi-
ments, we allow the devices to train for a longer period of
time, to emulate the scenario where a user is focused on a
particular interest, and produces heavy browsing behavior
around a specific category. This long-lived browsing injects
a significantly higher input signal to the ad-ecosystem than
the previous setup, which should make it easier to perform
CDT. In order to increase the setup’s complexity, and make
it more difficult to track the user, we allow all devices (i.e.,
1 mobile, 2 desktops) to train in the same way under the
same persona. In effect, this setup also tests a basic coun-
termeasure from the user’s point of view, who tries to blur
her browsing by injecting traffic of the same persona from
all devices to the ad-ecosystem.
In this setup, while all devices are trained with the same
behavioral profile, we examine if the statistical tests and ML
models can still detect and distinguish the CDT. This
experiment contains three different phases during each run.
First, in the mobile phase, the mobile performs training
crawls for t_train=480 mins, and a testing crawl for
t_test=30 mins. Second, in parallel with the mobile training,
the two desktops perform test crawls for t_test=30 mins.
Third, after mobile training and testing, both desktops start
continuous training and testing crawls alternately for 8
hours (t_train=t_test=30 min).
Due to the long time needed for executing this experiment,
we focus on two personas constructed in the following way.
We use the methodology for persona creation as described
in §4.1, and focus on active ad-campaigns, resulting in two
personas in the interest of “Online Shopping-Accessories”,
and “Online Shopping-Health and Fitness” (loosely match-
ing the personas 1 and 4 from Table 1). Then, we per-
formed 4 runs of 16 hours duration each, for each persona.
In this setup, since all devices are uniformly trained, we do
not include the keyword vector of the persona pages into the
datasets, to not introduce any bias from repetitive features.
The statistical analysis for this experiment reveals
potential CDT, since we accept the null hypothesis for the
distribution of ads delivered in the paired desktop and mobile
(lowest p-value=0.052), and reject it in the baseline desktop
and mobile (highest p-value=0.006). This consistency is
interesting, since for this setup all three devices are
uniformly trained with the same persona, and thus all of them
collect similar ads due to retargeting. However, there is no
similarity between the distributions of ads in the devices
that do not share the same IP address.
Table 5: Performance evaluation for Logistic Regression in
all components of Setup 2. Left value in each column is
the score for Class 0 (C0=not paired desktop); right value
for Class 1 (C1=paired desktop).
Persona        Precision    Recall       F1-Score     AUC
(Setup)        C0    C1     C0    C1     C0    C1
1 (2a)         0.90  0.79   0.82  0.88   0.86  0.83   0.85
4 (2a)         0.83  0.79   0.81  0.81   0.82  0.80   0.81
combined (2b)  0.87  0.92   0.92  0.87   0.89  0.90   0.89
1 (2c)         0.87  1.0    1.0   0.88   0.93  0.93   0.93
4 (2c)         1.0   0.98   0.98  1.0    0.99  0.99   0.99
combined (2d)  1.0   0.86   0.88  1.0    0.93  0.93   0.93
To clarify this finding, we applied the ML algorithms as in
the previous experiment. The algorithms again detect CDT
between the mobile and the paired desktop, even though all
devices were exposed to similar training with the same
persona. In fact, Logistic Regression performed the best
across both personas, with AUC≥0.81 and F1-score≥0.80 for
both classes. Detailed evaluation results of §5.3 are
presented in Table 5. When computing the importance of features, the
desktop number of ads and keywords and the desktop time
slot are in the top-10 features. Based on these observa-
tions, we believe that the longer training time allowed the
ad-ecosystem to establish an accurate user profile, and retar-
get ads on the paired desktop, based on the mobile’s activity.
Combined Personas: Setup 2b. Similarly to §5.2 we
combine all data collected from the Setup 2a into a unified
dataset. Under this scenario, in which we mix data from both
personas, the classifier again performs well, with AUC=0.89.
Important features in this case are the number of ads and
keywords delivered to the desktops, and the time of the
experiment.
Boosted Browsing with CDT trackers and Independent
Personas: Setup 2c. In the next set of experiments, we
investigate the role of CDT trackers in the discovery and
measurement of CDT. In particular, we attempt to boost the
CDT signal, by visiting webpages with a higher proportion of
CDT trackers. Therefore, the experimental setup and the
preprocessing method remain the same as in the previous
Setup 2a, but we select webpages to be visited that have ac-
tive ad-campaigns and their landing pages embed the most-
known CDT trackers (as we also show in the next section):
Criteo, Tapad, Demdex, Drawbridge. We also change the
set of our control pages, so that each one contains at least
a CDT tracker. News sites have many 3rd-parties compared
to other types of sites [35]. Thus, for this boosted browsing
experiment, we choose the set of control pages to contain 3
weather pages and 2 news websites,⁵ while verifying they do
not serve contextual ads.
⁵ accuweather.com, wunderground.com, weather.com,
usatoday.com, huffingtonpost.com
Performing the same analysis as earlier, we find that mo-
bile and paired desktop have ads coming from the same dis-
tribution (lowest p-value=0.10), and that there is no simi-
larity between the ads delivered in the mobile and baseline
desktop (highest p-value=0.007). For a clearer investigation
of the importance of the CDT trackers, we also evaluate the
findings with the ML models. For persona 1, Logistic Re-
gression and Random Forest models perform near optimally,
with high precision of Class 1, high recall for class 0, aver-
age F1-Score=0.93 for both classes, and AUC=0.93. For per-
sona 4, the scores are even higher, outperforming the other
setups, as all metrics for Logistic Regression scored higher
than 0.98. Overall, these results indicate that we success-
fully biased the trackers to identify the emulated user in both
devices, and to provide enough output signal (ads delivered)
for the statistical algorithms to detect the CDT performed.
Boosted Browsing with CDT trackers and Combined
Personas: Setup 2d. We follow a similar approach as before,
and combine all data collected from the Setup 2c, into
a unified dataset for Setup 2d. Under this scenario, the clas-
sifier (Logistic Regression) again performs very well, with
AUC=0.93. Important features in this case are the number
of ads delivered to the desktops, the time of the experiment
in each desktop and the number of keywords. Interestingly,
and perhaps unexpectedly, the existence of Criteo tracker in
a landing page, is a feature appearing in the top-10 features.
5.4 Does incognito browsing help evade CDT?
Independent Personas: Setup 3a. In this final experimen-
tal setup, we investigate if it is possible for the user to apply
some basic countermeasures to avoid, or at least reduce the
possibility of CDT, by removing her browsing state in every
new session. For this, we perform experiments where the tra-
ditional tracking mechanisms (e.g., cookies, cache, browsing
history, etc.) are disabled or removed, emulating incognito
browsing. We select the first five personas from Table 1,
which had the most active ad-campaigns and appeared to be
promising due to the “online shopping” interest. Every desk-
top executed browsing in a stateless mode, while the mobile
in a stateful mode. For each persona, we collected data for
two runs, following the timeline of phases as in Setup 1a.
The distributions between mobile vs. paired desktop, as
well as mobile vs. baseline desktop, were found to be dif-
ferent (highest p-value=0.034). Also, none of the ML classi-
fiers performed higher than 0.7 (in all metrics), and thus we
could not clearly extract any significant result. Specifically,
the highest AUC score for personas 1 and 2 was 0.70 with the
use of the Random Forest classifier, and for personas 3 and 4
was 0.73 using the Logistic Regression classifier. The worst
scoring, independent of algorithm, was recorded for persona
5, with AUC=0.57, and Precision/Recall scores under 0.50.
Combined Personas: Setup 3b. When the data from all five
personas are combined, the classifier performing best was
Logistic Regression, with AUC=0.79. Overall, these results
point to the partial effectiveness of incognito browsing in
limiting CDT. That is, by removing the browsing state of a
user on a given device, the signal provided to the CDT
entities is reduced, but not fully removed. In fact, when the
data from various personas are combined, the CDT is still
somewhat effective, since the paired devices have the same IP
address.
[Figure 6: two CDF plots (y-axis: CDF of Sessions, 0 to 1).
Left x-axis: Number of Ads, 0 to 35; right x-axis: Number of
Keywords, 0 to 35. Each plot shows curves for Paired PC,
Baseline PC and Mobile.]
Figure 6: CDF of collected ads (left) and corresponding
keywords of the ads (right) per crawling session for all
devices.
6 Platform Validation
In this section we validate the representativeness of the data
collected from the previous experiments, by examining: (i)
the type and frequency of ads delivered in each device, and
(ii) the type and number of trackers that our personas were
exposed to. We compare the distributions of these quantities
with past works and data on real users, to assess whether our
synthetic personas successfully emulate real users’ traffic, and
whether our measurements of the CDT ad-ecosystem are realistic.
We first measure the frequency of ads delivered to our devices
in the experiment of §5.2, since it follows a well-crafted
timeline that is suitable for this kind of measurement. The
ads delivered to the three devices during these sessions are
shown in Figure 6 (left). For most sessions (90%), the
mobile device was exposed to fewer than five ads, since the
mobile version of websites typically delivers a smaller number
of ads, designed for smaller screens and devices. In
contrast, the desktop devices had a higher exposure to ads
than the mobile device. Also, the two desktops received
a similar number of ads (on average 2 to 4 ads on every
visit to the control pages). Similar observations can be made
for the keyword categories of ads (Figure 6 (right)). The
ad industry has reported that, on average, 300 ads are
displayed to desktop users every day [44, 18, 45, 23],
and it recommends delivering 5 ads per mobile
domain [76]; these figures proportionally match the number of
ads we collected in our mobile and desktop sessions.
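The per-session statistics discussed above can be read off an empirical CDF. The sketch below is illustrative only: the per-session ad counts are hypothetical values chosen for the example, not the collected data, and the helper names are our own.

```python
def ecdf(values):
    """Return sorted (x, F(x)) pairs, where F(x) is the share of sessions with <= x ads."""
    n = len(values)
    return [(x, sum(v <= x for v in values) / n) for x in sorted(set(values))]

def share_below(values, threshold):
    """Fraction of sessions with strictly fewer than `threshold` ads."""
    return sum(v < threshold for v in values) / len(values)

# Hypothetical number of ads collected in each crawling session.
mobile_ads = [0, 1, 1, 2, 2, 3, 3, 4, 4, 7]
desktop_ads = [2, 3, 4, 4, 6, 8, 9, 12, 15, 20]

mobile_cdf = ecdf(mobile_ads)
mobile_share = share_below(mobile_ads, 5)
desktop_share = share_below(desktop_ads, 5)
```

Plotting the `ecdf` pairs for each device as a step function gives curves of the kind summarized in Figure 6, and `share_below(counts, 5)` is the quantity behind statements such as "fewer than five ads in most sessions".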
We also validate the representativeness of the data collected
from the experiments of §5.2 and §5.3, by examining the
trackers appearing in the webpages visited by the personas.
We use the Disconnect List [29] to detect them and measure their
frequency of appearance (Figure 7). Of the trackers
detected in the set of persona pages, and using the list provided
by [88], 37% were found to be CDT related, including
both deterministic and probabilistic. In fact, the top CDT
trackers found in our data, which may perform both types
of CDT, include Google-owned domains, Facebook, Criteo,
Zopim, Bing, and Advertising.com (AOL), and are in line with
the top CDT trackers found in [88, 17] (66% overlap of top-20
with [88] and 55% overlap with [17]). In addition, 17%
of these trackers are mainly focused on probabilistic CDT,
including Criteo, BlueKai, AdRoll, Cardlytics, Drawbridge,
and Tapad, and each individual tracker is found in at least 2% of
the persona pages, again in line with the results in [88].
[Figure 7 plot: bar chart, y-axis “Occurrences (%)” (0 to 50); x-axis labels: Google, Facebook, Bing, Zopim, AppNexus, Advertising.com, Yahoo, Pinterest, Casalemedia, Yandex, OpenX, Twitter, Pubmatic, Hotjar, Tapad, Amazon, Outbrain, Drawbridge, Taboola, Linkedin (Microsoft).]
Figure 7: Top-20 trackers (grouped based on organization)
and their coverage in persona pages. For example, all the
Google-owned domains, such as Doubleclick, Googleapis,
Google-Analytics, are grouped under the “Google” label.
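The tracker statistics above (coverage per organization and overlap between top-k lists) can be computed along the following lines. This is a sketch under stated assumptions: the domain-to-organization map, the page contents and the two example lists are hypothetical, and in practice the mapping would be derived from a tracker list such as the one cited above.

```python
from collections import defaultdict

# Hypothetical domain -> organization map (illustrative entries only).
ORG_OF = {
    "doubleclick.net": "Google",
    "googleapis.com": "Google",
    "google-analytics.com": "Google",
    "facebook.net": "Facebook",
    "criteo.com": "Criteo",
}

def coverage_by_org(pages):
    """Map each organization to the percentage of pages embedding any of its domains.

    `pages` maps a page identifier to the set of third-party domains seen on it.
    Domains missing from ORG_OF are ignored.
    """
    pages_with_org = defaultdict(set)
    for page, domains in pages.items():
        for domain in domains:
            org = ORG_OF.get(domain)
            if org is not None:
                pages_with_org[org].add(page)
    return {org: 100 * len(hit) / len(pages) for org, hit in pages_with_org.items()}

def top_k_overlap(ours, theirs):
    """Percentage of entries in `ours` that also appear in `theirs`."""
    return 100 * len(set(ours) & set(theirs)) / len(ours)

# Hypothetical crawl result: four persona pages and their third-party domains.
pages = {
    "page-a": {"doubleclick.net", "criteo.com"},
    "page-b": {"googleapis.com"},
    "page-c": {"facebook.net", "unknown.example"},
    "page-d": set(),
}
coverage = coverage_by_org(pages)
overlap = top_k_overlap(["Google", "Facebook", "Criteo"],
                        ["Google", "Criteo", "Tapad"])
```

Grouping by organization before counting avoids inflating the coverage of companies that operate many domains, which is the reason all Google-owned domains share one label in Figure 7.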
7 Discussion & Conclusion
Through extensive experiments with the proposed framework
Talon, we were able to trigger CDT trackers into pairing
the emulated users’ devices. This allowed us to statistically
verify that CDT is indeed happening, and measure its
effectiveness on different user interests and browsing behaviors,
independently and in combination. In fact, CDT was
prominent when user devices were trained to browse pages
of similar interests, reinforcing the behavioral signal sent to
CDT entities, and specifically when browsing activity was
related to online shopping, since those types of users seem to
be more targeted by advertisers. The CDT effect was further
amplified when the visited persona and control pages had
embedded CDT trackers, pushing the accuracy of detection
up to 99%. We also found that browsing in a stateless mode
showed a reduced, but not completely removed CDT effect,
as incognito browsing somewhat obfuscates the signal sent
to the ad-ecosystem, but not the network access information.
Admittedly, our data collection was performed across relatively
short time periods, in comparison to the wealth of browsing
data that advertising networks have at their disposal. In fact,
we anticipate that CDT companies collect data about users
and devices for months or years, and even buy data from data
brokers, to have the capacity of targeting users at even
higher rates. We therefore believe that the high accuracies
self-reported by CDT companies (e.g., Lotame: >90% [56],
Drawbridge: 97.3% [31]) are possible.
Impact on user privacy: Undoubtedly, CDT infringes on
users’ online privacy and reduces their anonymity. But the
actual extent of this tracking paradigm and its consequences
to users, the community, and even to the ad-ecosystem itself,
are still unknown. In fact, since CDT is heavily dependent on
users’ browsing activity, and the ad-ecosystem employs such
collected data for targeting purposes, one major line of future
work is the study of targeting sensitive user categories (e.g.,
gender, sexual orientation, race) via CDT. This is especially
relevant nowadays with the enforcement of recent EU
privacy regulations such as GDPR [37] and ePrivacy [36].
This is where Talon comes into play, as it provides a concrete,
scalable and extensible methodology for experimenting with
different CDT scenarios, auditing its mechanics and measuring
its impact. In fact, the modular design of our methodology
allows studying CDT in depth, and supports new extensions
for studying the CDT ecosystem: new plugins, personas
and ML techniques. To that end, our design makes Talon
an enhanced transparency tool that can reveal potentially
illegal biases or discrimination from the ad-ecosystem.
Acknowledgments
The research leading to these results has received fund-
ing from the European Union’s Horizon 2020 Research and
Innovation Programme under grant agreement No 786669
(project CONCORDIA), the Marie Sklodowska-Curie grant
agreement No 690972 (project PROTASIS), and the Defense
Advanced Research Projects Agency (DARPA) ASED Pro-
gram and AFRL under contract FA8650-18-C-7880. The
paper reflects only the authors’ views and the Agency and
the Commission are not responsible for any use that may be
made of the information it contains.
References
[1] ICDM 2015: Drawbridge Cross-Device Connections
- Data. https://www.kaggle.com/c/icdm-2015-
drawbridge-cross-device-connections/data,
2015.
[2] WebWire - Drawbridge Challenges Scientific Com-
munity to Better the Accuracy of Its Cross-Device
Consumer Graph. https://www.webwire.com/
ViewPressRel.asp?aId=198392, 2017.
[3] Measuring Cross-Device: The Methodology.
https://www.tapad.com/resources/cross-
device/measuring-cross-device-the-
methodology, 2018.
[4] Pew Research Center - Mobile Fact Sheet. http:
//www.pewinternet.org/fact-sheet/mobile/,
2018.
[5] ACAR, G., EUBANK, C., ENGLEHARDT, S., JUAREZ,
M., NARAYANAN, A., AND DIAZ, C. The web never
forgets: Persistent tracking mechanisms in the wild. In
Proceedings of the 2014 ACM SIGSAC Conference on
Computer and Communications Security, CCS ’14.
[6] ACAR, G., JUAREZ, M., NIKIFORAKIS, N., DIAZ,
C., GÜRSES, S., PIESSENS, F., AND PRENEEL, B.
Fpdetective: Dusting the web for fingerprinters. In
Proceedings of the 2013 ACM SIGSAC Conference on
Computer & Communications Security, CCS ’13.
[7] ADBRAIN. Demystifying cross-device. Essential
reading for product management, business
development and business technology leaders.
https://www.iabuk.com/sites/
default/files/white-paper-docs/Adbrain-
Demystifying-Cross-Device.pdf, 2016.
[8] ADELPHIC. How cross-device identity matching
works. https://adelphic.com/how-cross-
device-identity-matching-works-part-1/,
2016.
[9] AGUIRRE, E., MAHR, D., GREWAL, D., DE RUYTER,
K., AND WETZELS, M. Unraveling the personalization
paradox: The effect of information collection and
trust-building strategies on online advertisement effec-
tiveness. Journal of Retailing 91, 1 (2015), 34–49.
[10] ANAND, T. R., AND RENOV, O. Machine learning approach
to identify users across their digital devices. In
IEEE International Conference on Data Mining Work-
shop (ICDMW) (2015), pp. 1676–1680.
[11] ARP, D., QUIRING, E., WRESSNEGGER, C., AND
RIECK, K. Privacy threats through ultrasonic side
channels on mobile devices. In IEEE European Sym-
posium on Security and Privacy (EuroS&P) (2017),
pp. 35–47.
[12] BASHIR, M. A., ARSHAD, S., ROBERTSON, W., AND
WILSON, C. Tracing information flows between ad ex-
changes using retargeted ads. In 25th USENIX Security
Symposium (2016), pp. 481–496.
[13] BASHIR, M. A., FAROOQ, U., SHAHID, M., ZAFFAR,
M. F., AND WILSON, C. Quantity vs. quality: Evaluat-
ing user interest profiles using ad preference managers.
In Proceedings of the Annual Network and Distributed
System Security Symposium (NDSS), San Diego, CA
(2019).
[14] BLEIER, A., AND EISENBEISS, M. Personalized online
advertising effectiveness: The interplay of what,
when, and where. Marketing Science 34, 5 (2015),
669–688.
[15] BOERMAN, S. C., KRUIKEMEIER, S., AND
ZUIDERVEEN BORGESIUS, F. J. Online behavioral
advertising: A literature review and research
agenda. Journal of Advertising.
[16] BRODER, A., FONTOURA, M., JOSIFOVSKI, V., AND
RIEDEL, L. A semantic approach to contextual adver-
tising. In Proceedings of the 30th Annual International
ACM SIGIR Conference on Research and Development
in Information Retrieval (2007).
[17] BROOKMAN, J., ROUGE, P., ALVA, A., AND YEUNG,
C. Cross-device tracking: Measurement and disclo-
sures. Proceedings on Privacy Enhancing Technolo-
gies, 2 (2017), 133–148.
[18] BRYCE SANDERS. Do we really see 4,000 ads a day?
https://www.bizjournals.com/bizjournals/
how-to/marketing/2017/09/do-we-really-
see-4-000-ads-a-day.html, 2017.
[19] CAO, X., HUANG, W., AND YU, Y. Recovering cross-device
connections via mining ip footprints with ensemble
learning. In IEEE International Conference on
Data Mining Workshop (ICDMW) (2015), pp. 1681–
1686.
[20] CAO, Y., LI, S., AND WIJMANS, E. (cross-)browser
fingerprinting via os and hardware level features. In
Proceedings of Network & Distributed System Security
Symposium (NDSS) (2017), Internet Society.
[21] CARRASCOSA, J. M., MIKIANS, J., CUEVAS, R.,
ERRAMILLI, V., AND LAOUTARIS, N. I always feel
like somebody’s watching me: measuring online be-
havioural advertising. In Proceedings of the 11th ACM
Conference on Emerging Networking Experiments and
Technologies (CONEXT) (2015).
[22] CAWLEY, G. C., AND TALBOT, N. L. On over-fitting
in model selection and subsequent selection bias in per-
formance evaluation. Journal of Machine Learning Re-
search 11 (2010).
[23] CHRISTOPHER ELLIOTT. Yes, there are too many
ads online. Yes, you can stop them. Here's how.
https://www.huffingtonpost.com/entry/yes-
there-are-too-many-ads-online-yes-you-
can-stop_us_589b888de4b02bbb1816c297, 2017.
[24] CHUN, K. Y., SONG, J. H., HOLLENBECK, C. R.,
AND LEE, J.-H. Are contextual advertisements effective?
International Journal of Advertising 33, 2 (2014),
351–371.
[25] CRITEO. The State of Cross-Device Com-
merce. https://www.criteo.com/wp-content/
uploads/2017/07/Report-criteo-state-of-
cross-device-commerce-2016-h2-SEA.pdf,
2016.
[26] CRITEO. The 5 top attribution methodologies for
cross-channel roi. https://www.criteo.com/
insights/top-attribution-methodologies-
for-cross-channel-roi/, 2018.
[27] DATTA, A., TSCHANTZ, M. C., AND DATTA, A. Automated
experiments on ad privacy settings. Proceedings
on Privacy Enhancing Technologies 2015, 1 (2015),
92–112.
[28] DIAZ-MORALES, R. Cross-device tracking: Matching
devices and cookies. In 2015 IEEE International Con-
ference on Data Mining Workshop (ICDMW) (2015).
[29] DISCONNECT. Disconnect lets you visualize and block
the invisible websites that track your browsing history.
https://disconnect.me/, 2019.
[30] DOLIN, C., WEINSHEL, B., SHAN, S., HAHN, C. M.,
CHOI, E., MAZUREK, M. L., AND UR, B. Unpacking
perceptions of data-driven inferences underlying online
targeting and personalization. In Proceedings of the
2018 CHI Conference on Human Factors in Computing
Systems (2018), ACM, p. 493.
[31] DRAWBRIDGE. Cross-Device Consumer Graph.
https://go.drawbridge.com/rs/454-ORY-
155/images/Drawbridge-Cross-Device-
Consumer-Graph.pdf, 2015.
[32] DRAWBRIDGE. Drawbridge Cross-Device Con-
nected Consumer Graph Is 97.3% Accurate.
https://go.drawbridge.com/rs/454-ORY-
155/images/Drawbridge-Cross-Device-
Consumer-Graph.pdf, 2015.
[33] EASYLIST. Easylist is the primary filter list that re-
moves most adverts from international webpages, in-
cluding unwanted frames, images and objects. https:
//easylist.to/, 2018.
[34] ECKERSLEY, P. How unique is your web browser? In
Proceedings of the 10th International Conference on
Privacy Enhancing Technologies, PETS’10.
[35] ENGLEHARDT, S., AND NARAYANAN, A. Online
tracking: A 1-million-site measurement and analysis.
In Proceedings of the ACM SIGSAC conference on
computer and communications security (CCS) (2016),
pp. 1388–1401.
[36] EUROPEAN PARLIAMENT, COUNCIL OF THE EUROPEAN
UNION. Directive 2002/58/EC of the European
Parliament and of the Council of 12 July 2002 concerning
the processing of personal data and the protection of
privacy in the electronic communications sector (Directive
on privacy and electronic communications), 2002.
[37] Regulation (EU) 2016/679 of the European Parliament
and of the Council of 27 April 2016 on the protection
of natural persons with regard to the processing of per-
sonal data and on the free movement of such data, and
repealing Directive 95/46/EC (General Data Protection
Regulation). Official Journal of the European Union
L119 (2016), 1–88.
[38] FARAHAT, A., AND BAILEY, M. C. How effective is
targeted advertising? In Proceedings of the 21st ACM
International Conference on World Wide Web (2012),
pp. 111–120.
[39] GIRONDA, J. T., AND KORGAONKAR, P. K. iSpy?
Tailored versus invasive ads and consumers' perceptions
of personalized advertising. Electronic Commerce Re-
search and Applications 29 (2018), 64–77.
[40] GOOGLE. Google Product Taxonomy. https:
//www.google.com/basepages/producttype/
taxonomy.en-US.txt, 2015.
[41] GOOGLE. Run apps on the Android Emula-
tor. https://developer.android.com/studio/
run/emulator/, 2018.
[42] GRACE, M. C., ZHOU, W., JIANG, X., AND
SADEGHI, A.-R. Unsafe exposure analysis of mobile
in-app advertisements. In Proceedings of the Fifth
ACM Conference on Security and Privacy in Wireless
and Mobile Networks, WISEC ’12.
[43] GUHA, S., CHENG, B., AND FRANCIS, P. Privad:
Practical privacy in online advertising. In Proceedings
of the 8th USENIX Conference on Networked Systems
Design and Implementation (2011).
[44] JON SIMPSON. Finding brand success in the
digital world. https://www.forbes.com/
sites/forbesagencycouncil/2017/08/25/
finding-brand-success-in-the-digital-
world/#734eaba626e2, 2018.
[45] JUSTIN MALLINSON. How many ads do we really see
each day? http://www.tcsmedia.co.uk/many-
ads-really-see-day/, 2018.
[46] KAFKA, P., AND MOLLA, R. Recode - 2017 was
the year digital ad spending finally beat TV. https:
//www.recode.net/2017/12/4/16733460/2017-
digital-ad-spend-advertising-beat-tv, 2017.
[47] KEJELA, G., AND RONG, C. Cross-device consumer
identification. In IEEE International Conference on
Data Mining Workshop (ICDMW) (2015), pp. 1687–
1689.
[48] KIM, M. S., LIU, J., WANG, X., AND YANG, W.
Connecting devices to cookies via filtering, feature en-
gineering, and boosting. In IEEE International Con-
ference on Data Mining Workshop (ICDMW) (2015),
pp. 1690–1694.
[49] KOROLOVA, A., AND SHARMA, V. Cross-app tracking
via nearby bluetooth low energy devices. In Proceedings
of the 8th ACM Conference on Data and
Application Security and Privacy (CODASPY) (2018),
pp. 43–52.
[50] LANDRY, M., CHONG, R., ET AL. Multi-layer classification:
ICDM 2015 Drawbridge cross-device connections
competition. In IEEE International Conference
on Data Mining Workshop (ICDMW) (2015), pp. 1695–
1698.
[51] LÉCUYER, M., DUCOFFE, G., LAN, F., PAPANCEA,
A., PETSIOS, T., SPAHN, R., CHAINTREAU, A., AND
GEAMBASU, R. XRay: Enhancing the web's transparency
with differential correlation. In 23rd USENIX
Security Symposium (2014), pp. 49–64.
[52] LECUYER, M., SPAHN, R., SPILIOPOLOUS, Y.,
CHAINTREAU, A., GEAMBASU, R., AND HSU, D.
Sunlight: Fine-grained targeting detection at scale with
statistical confidence. In Proceedings of the 22nd ACM
SIGSAC Conference on Computer and Communica-
tions Security (CCS) (2015).
[53] LERNER, A., SIMPSON, A. K., KOHNO, T., AND
ROESNER, F. Internet Jones and the raiders of the lost
trackers: An archaeological study of web tracking from
1996 to 2016. In 25th USENIX Security Symposium
(2016).
[54] LEWIS, R. A., RAO, J. M., AND REILEY, D. H. Here,
there, and everywhere: correlated online behaviors can
lead to overestimates of the effects of advertising. In
Proceedings of the 20th ACM International Conference
on World Wide Web (2011), pp. 157–166.
[55] LIU, B., SHETH, A., WEINSBERG, U., CHANDRASHEKAR,
J., AND GOVINDAN, R. AdReveal: improving
transparency into online targeted advertising.
In Proceedings of the 12th ACM Workshop on Hot Top-
ics in Networks (2013), p. 12.
[56] LOTAME. Cross-Device ID Graph Accuracy: Method-
ology. https://www.lotame.com/cross-device-
id-graph-accuracy-methodology/, 2016.
[57] LOTAME. Cross-device. Bridging the gap between
screens. https://www.lotame.com/products/
cross-device/, 2018.
[58] MAVROUDIS, V., HAO, S., FRATANTONIO, Y.,
MAGGI, F., KRUEGEL, C., AND VIGNA, G. On the
privacy and security of the ultrasound ecosystem. Pro-
ceedings on Privacy Enhancing Technologies (2017).
[59] MAYER, J. R., AND MITCHELL, J. C. Third-party
web tracking: Policy and technology. In Proceedings
of the 2012 IEEE Symposium on Security and Privacy,
SP ’12.
[60] MCAFEE. Customer URL Ticketing System. https:
//www.trustedsource.org/, 2018.
[61] MCNAIR, C. Global Ad Spending Update.
https://www.emarketer.com/content/global-
ad-spending-update, 2018.
[62] MENG, W., DING, R., CHUNG, S. P., HAN, S., AND
LEE, W. The price of free: Privacy leakage in personalized
mobile in-apps ads. In NDSS (2016).
[63] NIKIFORAKIS, N., JOOSEN, W., AND LIVSHITS, B.
Privaricator: Deceiving fingerprinters with little white
lies. In Proceedings of the 24th International Confer-
ence on World Wide Web, WWW ’15.
[64] NIKIFORAKIS, N., KAPRAVELOS, A., JOOSEN, W.,
KRUEGEL, C., PIESSENS, F., AND VIGNA, G. Cookieless
monster: Exploring the ecosystem of web-based
device fingerprinting. In Proceedings of the 2013 IEEE
Symposium on Security and Privacy, SP ’13.
[65] OLEJNIK, L., MINH-DUNG, T., AND CASTELLUCCIA,
C. Selling off privacy at auction. In Network
and Distributed System Security Symposium (NDSS)
(2014).
[66] PACHILAKIS, M., PAPADOPOULOS, P., MARKATOS,
E. P., AND KOURTELLIS, N. No more chasing waterfalls:
A measurement study of the header bidding
ad-ecosystem. In 19th ACM Internet Measurement
Conference (2019).
[67] PANCHENKO, A., LANZE, F., PENNEKAMP, J., ENGEL,
T., ZINNEN, A., HENZE, M., AND WEHRLE,
K. Website fingerprinting at internet scale. In NDSS
(2016).
[68] PAPADOPOULOS, E. P., DIAMANTARIS, M., PAPADOPOULOS,
P., PETSAS, T., IOANNIDIS, S., AND
MARKATOS, E. P. The long-standing privacy debate:
Mobile websites vs mobile apps. In Proceedings of
the 26th ACM International Conference on World Wide
Web (2017), pp. 153–162.
[69] PAPADOPOULOS, P., KOURTELLIS, N., AND
MARKATOS, E. Cookie synchronization: Everything
you always wanted to know but were afraid to ask.
In The World Wide Web Conference (2019), ACM,
pp. 1432–1442.
[70] PAPADOPOULOS, P., KOURTELLIS, N., RODRIGUEZ,
P. R., AND LAOUTARIS, N. If you are not paying for
it, you are the product: How much do advertisers pay to
reach you? In Proceedings of the ACM Internet Mea-
surement Conference (2017), pp. 142–156.
[71] PARRA-ARNAU, J., ACHARA, J. P., AND CASTELLUCCIA,
C. MyAdChoices: Bringing transparency and
control to online advertising. ACM Transactions on the
Web (TWEB) 11, 1 (2017), 7.
[72] PATRICK HOLMES. Mobile and Desktop Ad-
vertising Strategies Based on User Intent.
https://instapage.com/blog/adwords-
search-device-user-intent, 2018.
[73] PROJECT, J. F. Automation for Apps. http://
appium.io/, 2018.
[74] RAMIREZ, E., OHLHAUSEN, M., AND MCSWEENY,
T. Cross-device tracking: An FTC staff report. Tech.
rep., 2017.
[75] RAZAGHPANAH, A., NITHYANAND, R., VALLINA-RODRIGUEZ,
N., SUNDARESAN, S., ALLMAN, M.,
KREIBICH, C., AND GILL, P. Apps, trackers, privacy
and regulators: A global study of the mobile tracking
ecosystem. Proceedings of the Annual Network and
Distributed System Security Symposium (NDSS), San
Diego, CA.
[76] RENE HERMENAU. Adsense max allowed number of
ads - 2018 Rules. https://wpquads.com/google-
adsense-allowed-number-ads/, 2018.
[77] ROESNER, F., KOHNO, T., AND WETHERALL, D.
Detecting and defending against third-party tracking on
the web. In Proceedings of the 9th USENIX Confer-
ence on Networked Systems Design and Implementa-
tion, NSDI’12.
[78] SELSAAS, L. R., AGRAWAL, B., RONG, C., AND
WIKTORSKI, T. AFFM: Auto feature engineering in
field-aware factorization machines for predictive ana-
lytics. In IEEE International Conference on Data Min-
ing Workshop (ICDMW) (2015), pp. 1705–1709.
[79] TAPAD. Tapad device graph - creating a unified view of
the consumer. https://www.tapad.com/device-
graph/, 2018.
[80] TAPAD. The expert’s guide to cross-device conversion
& attribution. https://www.tapad.com/uses/the-
experts-guide-to-cross-device-conversion-
attribution, 2018.
[81] TERKKI, E., RAO, A., AND TARKOMA, S. Spying on
android users through targeted ads. In 2017 9th Inter-
national Conference on Communication Systems and
Networks (COMSNETS).
[82] TOUBIANA, V., NARAYANAN, A., BONEH, D., NISSENBAUM,
H., AND BAROCAS, S. Adnostic: Privacy
preserving targeted advertising. In Proceedings Net-
work and Distributed System Symposium (2010).
[83] TRAN, M. M.-D., AND CASTELLUCCIA, C. Betrayed
by your ads! reconstructing user profiles from targeted
ads. In The 12th (PETS 2012) Privacy Enhancing Tech-
nologies Symposium, Vigo, Spain (2012).
[84] TUCKER, C. E. Social networks, personalized adver-
tising, and privacy controls. Journal of Marketing Re-
search 51, 5 (2014), 546–562.
[85] WALTHERS, J. Learning to rank for cross-device
identification. In 2015 IEEE International Conference
on Data Mining Workshop (ICDMW) (2015), IEEE,
pp. 1710–1712.
[86] YAN, J., LIU, N., WANG, G., ZHANG, W., JIANG,
Y., AND CHEN, Z. How much can behavioral targeting
help online advertising? In Proceedings of the 18th
International Conference on World Wide Web (2009).
[87] YU, Z., MACBETH, S., MODI, K., AND PUJOL, J. M.
Tracking the trackers. In Proceedings of the 25th In-
ternational Conference on World Wide Web (2016),
WWW ’16, pp. 121–132.
[88] ZIMMECK, S., LI, J. S., KIM, H., BELLOVIN, S. M.,
AND JEBARA, T. A privacy analysis of cross-device
tracking. In 26th USENIX Security Symposium (2017),
pp. 1391–1408.