
IMC ’19, October 21–23, 2019, Amsterdam, Netherlands Jordan Jueckstock and Alexandros Kapravelos
measurement goals of this paper, our in-browser approach provided
coverage OpenWPM’s in-band instrumentation could not match
(Section 2). Merzdovnik et al. [36] measured the effectiveness of
tracker blocking tools like Ghostery and AdBlock Plus while visiting
over 100,000 sites within the Alexa top 200,000 domains. Their focus
was on identifying sources of 3rd-party tracking and measuring the
success or failure of blockers; ours is on fine-grained feature usage
attribution and analysis on a script-by-script basis.
Lauinger et al. [30] surveyed 133,000 top sites and discovered
widespread use of outdated or vulnerable JS libraries using a browser
automation system like ours but without instrumenting or logging
API usage. Snyder’s measurement [49] of JS browser API usage on
top web sites found approximately 50% of the available features
completely unused by the Alexa top 10K sites at the time of measurement.
A follow-up work [50] explored the degree to which potentially
dangerous or undesirable JS browser API features could be disabled to
reduce the browser’s attack surface without disrupting the user’s
web browsing experience.
7 CONCLUSION
We have made the case for choosing out-of-band over in-band JS
instrumentation when measuring the web for security and privacy
concerns. We also presented VisibleV8, a custom variant of Chrome
for measuring native JS/browser features used on the web. VV8
is modern, stealthy, and fast enough for both interactive use and
web-scale automation. Our implementation is a small, highly
maintainable patch easily ported to new browser versions. The resulting
instrumentation, hidden inside the JS engine itself, is transparent
to the visited pages, performs as well as or better than in-band
equivalents, and provides fine-grained feature tracking by source script and
security origin. With VV8 we have observed JS code loaded directly
or by frames on 29% of the Alexa top 50k sites actively testing for
common automated browser frameworks. As many web measurements
rely on such tools, this result marks a concerning development
for security and privacy research on the web.
VisibleV8 has proven itself a transparent, efficient, and effective
observation platform. We hope its public release contributes to the
development of more next-generation web instrumentation and
measurement tools for security
and privacy research.
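As an illustration of why in-band instrumentation is hard to hide from visited pages, the following minimal JavaScript sketch wraps a native function in a logging hook and then applies a check that a page script could run. The helper names (instrumentInBand, looksNative) and the choice of Date.now as the wrapped function are hypothetical, chosen only for this example; they are not taken from VV8 or from any measured site.

```javascript
// Hypothetical in-band hook: replace obj[name] with a JS wrapper that
// records each call and then forwards to the original function.
function instrumentInBand(obj, name, log) {
  const original = obj[name];
  obj[name] = function (...args) {
    log.push(name);
    return original.apply(this, args);
  };
}

// Page-side check: built-in functions stringify with a "[native code]"
// body, while a JS wrapper exposes its own source text instead.
function looksNative(fn) {
  const source = Function.prototype.toString.call(fn);
  return /\{\s*\[native code\]\s*\}/.test(source);
}

const api = { now: Date.now };
const log = [];

const nativeBefore = looksNative(api.now); // unmodified built-in
instrumentInBand(api, "now", log);
api.now();                                 // the call is recorded in log
const nativeAfter = looksNative(api.now);  // the wrapper is now visible

console.log({ nativeBefore, nativeAfter, log });
```

An in-band tool can try to mask this particular signal (for example by also patching Function.prototype.toString), but each such patch is itself page-visible JS state. Instrumentation placed inside the JS engine leaves the function objects untouched, so a check of this kind has nothing to observe.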
8 AVAILABILITY
The VisibleV8 patches to Chromium, along with tools and
documentation, are publicly available at:
https://kapravelos.com/projects/vv8
9 ACKNOWLEDGEMENTS
We would like to thank our shepherd Dave Levin and the anonymous
reviewers for their insightful comments and feedback. This work
was supported by the Office of Naval Research (ONR) under grant
N00014-17-1-2541, by DARPA under agreement number FA8750-19-
C-0003, and by the National Science Foundation (NSF) under grant
CNS-1703375.
REFERENCES
[1] 2014. Understanding web pages better. https://webmasters.googleblog.com/2014/05/understanding-web-pages-better.html. (2014). Accessed: 2019-8-19.
[2] 2016. javascript - Can a website detect when you are using selenium with chromedriver? https://stackoverflow.com/a/41220267. (2016). Accessed: 2018-11-15.
[3] 2017. Bug 1424176. https://bugzilla.mozilla.org/show_bug.cgi?id=1424176. (2017). Accessed: 2018-11-15.
[4] 2017. Issue 793217. https://bugs.chromium.org/p/chromium/issues/detail?id=793217. (2017). Accessed: 2018-11-15.
[5] 2018. marmelab/gremlins.js: Monkey testing library for web apps and Node.js. https://github.com/marmelab/gremlins.js. (2018). Accessed: 2018-11-15.
[6] 2018. The State of the Octoverse: top programming languages of 2018. https://github.blog/2018-11-15-state-of-the-octoverse-top-programming-languages/. (2018). Accessed: 2019-5-8.
[7] 2018. WebIDL Level 1. https://www.w3.org/TR/WebIDL-1/. (2018). Accessed: 2018-11-15.
[8] 2019. BrowserBench.org. https://browserbench.org/. (2019). Accessed: 2019-1-25.
[9] 2019. Dromaeo. http://dromaeo.com/?recommended. (2019). Accessed: 2019-1-25.
[10] 2019. PhantomJS - Scriptable Headless Browser. http://phantomjs.org/. (2019). Accessed: 2019-2-1.
[11] 2019. Selenium - Web Browser Automation. https://docs.seleniumhq.org/. (2019). Accessed: 2019-2-1.
[12] 2019. The RedMonk Programming Language Rankings: January 2019. https://redmonk.com/sogrady/2019/03/20/language-rankings-1-19/. (2019). Accessed: 2019-5-8.
[13] Gunes Acar, Marc Juarez, Nick Nikiforakis, Claudia Diaz, Seda Gürses, Frank Piessens, and Bart Preneel. 2013. FPDetective: Dusting the Web for Fingerprinters. In Proceedings of the ACM Conference on Computer and Communications Security (CCS).
[14] Pieter Agten, Steven Van Acker, Yoran Brondsema, Phu H Phung, Lieven Desmet, and Frank Piessens. 2012. JSand: Complete Client-Side Sandboxing of Third-Party JavaScript without Browser Modifications. In Proceedings of the Annual Computer Security Applications Conference (ACSAC). ACM.
[15] James P Anderson. 1972. Computer Security Technology Planning Study. Volume 2. Technical Report. Anderson (James P) and Co Fort Washington PA.
[16] Marc Andreessen. 2011. Why Software is Eating the World. https://www.wsj.com/articles/SB10001424053111903480904576512250915629460. (2011). Accessed: 2018-04-20.
[17] Yinzhi Cao, Zhanhao Chen, Song Li, and Shujiang Wu. 2017. Deterministic Browser. In Proceedings of the ACM Conference on Computer and Communications Security (CCS).
[18] Yinzhi Cao, Zhichun Li, Vaibhav Rastogi, Yan Chen, and Xitao Wen. 2012. Virtual Browser: A Virtualized Browser to Sandbox Third-party JavaScripts with Enhanced Security. In Proceedings of the 7th ACM Symposium on Information, Computer and Communications Security (ASIACCS ’12). ACM.
[19] Quan Chen and Alexandros Kapravelos. 2018. Mystique: Uncovering Information Leakage from Browser Extensions. In Proceedings of the ACM Conference on Computer and Communications Security (CCS). https://doi.org/10.1145/3243734.3243823
[20] Andrey Chudnov and David A Naumann. 2015. Inlined information flow monitoring for javascript. In Proceedings of the ACM Conference on Computer and Communications Security (CCS). ACM.
[21] Anupam Das, Gunes Acar, Nikita Borisov, and Amogh Pradeep. 2018. The Web’s Sixth Sense: A Study of Scripts Accessing Smartphone Sensors. In Proceedings of the ACM Conference on Computer and Communications Security (CCS). https://doi.org/10.1145/3243734.3243860
[22] Steven Englehardt and Arvind Narayanan. 2016. Online Tracking: A 1-million-site Measurement and Analysis. In Proceedings of the ACM Conference on Computer and Communications Security (CCS). ACM. https://doi.org/10.1145/2976749.2978313
[23] Úlfar Erlingsson. 2003. The Inlined Reference Monitor Approach to Security Policy Enforcement. Technical Report. Cornell University.
[24] Luca Invernizzi, Kurt Thomas, Alexandros Kapravelos, Oxana Comanescu, Jean-Michel Picod, and Elie Bursztein. 2016. Cloak of Visibility: Detecting When Machines Browse A Different Web. In Proceedings of the IEEE Symposium on Security and Privacy.
[25] Gregoire Jacob, Engin Kirda, Christopher Kruegel, and Giovanni Vigna. 2012. PUBCRAWL: Protecting Users and Businesses from CRAWLers. In Proceedings of the USENIX Security Symposium.
[26] Simon Holm Jensen, Peter A. Jonsson, and Anders Møller. 2012. Remedying the Eval That Men Do. In Proceedings of the 2012 International Symposium on Software Testing and Analysis (ISSTA 2012). ACM. https://doi.org/10.1145/2338965.2336758
[27] Alexandros Kapravelos, Chris Grier, Neha Chachra, Chris Kruegel, Giovanni Vigna, and Vern Paxson. 2014. Hulk: Eliciting Malicious Behavior in Browser Extensions. In Proceedings of the USENIX Security Symposium. USENIX.
[28] Alexandros Kapravelos, Yan Shoshitaishvili, Marco Cova, Chris Kruegel, and Giovanni Vigna. 2013. Revolver: An Automated Approach to the Detection of Evasive Web-based Malware. In Proceedings of the USENIX Security Symposium.
[29] Clemens Kolbitsch, Benjamin Livshits, Benjamin Zorn, and Christian Seifert. 2012. Rozzle: De-Cloaking Internet Malware. In Proceedings of the IEEE Symposium on Security and Privacy. IEEE.