
7 Acknowledgments
This research was supported by the Ministry of Education, Science,
Sports, and Culture, Japan, through Grant-in-Aid for Scientific Research
(B) 16H02874, and by the Commissioned Research of the National
Institute of Information and Communications Technology (NICT),
Japan, Grant Number 190.
8 References
[1] Curtsinger, C., Livshits, B., Zorn, B., et al.: ‘ZOZZLE: fast and precise
in-browser JavaScript malware detection’. Proc. of USENIX Security Symp.,
Berkeley, CA, USA, 2011, pp. 33–48
[2] Howard, F.: ‘Malware with your mocha: obfuscation and anti-emulation tricks in
malicious JavaScript’, Sophos Lab, September 2010, pp. 1–18
[3] Kaplan, S., Livshits, B., Zorn, B., et al.: ‘‘NOFUS: Automatically
Detecting’ + String.fromCharCode(32) + ‘ObFuSCateD’.toLowerCase()
+ ‘JavaScript Code’’. Microsoft Research Technical Report, MSR-TR-2011,
(57), 2011, pp. 1–11
[4] Sebastian, S., Malgaonkar, S., Shah, P., et al.: ‘A study and review on code
obfuscation’. World Conf. on Futuristic Trends in Research and Innovation for
Social Welfare (WCFTR’16), Coimbatore, Tamilnadu, India, 2016, pp. 1–6
[5] Schrittwieser, S., Katzenbeisser, S., Kinder, J., et al.: ‘Protecting software
through obfuscation: can it keep pace with progress in code analysis?’, ACM
Comput. Surv., 2016, 49, (1), pp. 4:1–4:40
[6] Xu, W., Zhang, F., Zhu, S.: ‘The power of obfuscation techniques in malicious
JavaScript code: a measurement study’. 7th Int. Conf. on Malicious and
Unwanted Software (MALWARE), Fajardo, PR, USA, 2012, pp. 9–16
[7] Lu, G., Debray, S.: ‘Automatic simplification of obfuscated JavaScript code: a
semantics-based approach’. Proc. of the IEEE 6th Int. Conf. on Software
Security and Reliability, Gaithersburg, MD, USA, 2012, pp. 31–40
[8] JavaScript Obfuscator: ‘JavaScript obfuscator is a powerful encoding, and
obfuscation technologies prevent reverse engineering, copyright infringement
and unauthorized modification of your code’. Available at
https://javascriptobfuscator.com/Javascript-Obfuscator.aspx, accessed on August 2019
[9] Serafim, T., Kachalov, T.: ‘JavaScript obfuscator tool’, A free and efficient
obfuscator for JavaScript (including ES2017) – a Web UI tool to the excellent
(and open source) javascript-obfuscator@0.21.0. Available at
https://obfuscator.io/, accessed on August 2019
[10] DirtyMarkup: ‘JavaScript beautifier’. Available at http://www.dirtymarkup.com/,
accessed on November 2019
[11] Lielmanis, E., Newman, L.: ‘Online JavaScript beautifier (v1.10.2), beautify,
unpack or deobfuscate JavaScript and HTML, make JSON/JSONP readable’.
Available at https://beautifier.io/, accessed on November 2019
[12] Dan’s Tools: ‘JavaScript viewer, beautifier, formatter and editor’. Available at
https://www.cleancss.com/javascript-beautify/, accessed on November 2019
[13] Ndichu, S., Ozawa, S., Misu, T., et al.: ‘A machine learning approach to
malicious JavaScript detection using fixed length vector representation’. Proc.
of 2018 Int. Joint Conf. on Neural Networks, (IJCNN’18), Rio de Janeiro,
Brazil, 8–13 July 2018, pp. 1–8
[14] Ndichu, S., Kim, S., Ozawa, S., et al.: ‘A machine learning approach to detection
of JavaScript-based attacks using AST features and paragraph vectors’, Appl. Soft
Comput. J., 2019, 84, (105721), pp. 1–11
[15] Skolka, P., Staicu, C., Pradel, M.: ‘Anything to hide? Studying minified and
obfuscated code in the web’. Proc. of the World Wide Web Conf.
(WWW’19), San Francisco, CA, USA, 2019, vol. 4, pp. 1–11
[16] Yadegari, B., Johannesmeyer, B., Whitely, B., et al.: ‘A generic approach to
automatic deobfuscation of executable code’. Proc. of the 36th IEEE Symp.
on Security and Privacy (IEEE SP’15), San Jose, CA, USA, 2015, pp. 674–691
[17] Gorji, A., Abadi, M.: ‘Detecting obfuscated JavaScript malware using sequences
of internal function calls’. Proc. of the 52nd ACM Southeast Regional Conf.
(ACMSE’14), Kennesaw, GA, USA, 2014, pp. 1–6
[18] Tellenbach, B., Paganoni, S., Rennhard, M.: ‘Detecting obfuscated JavaScript
from known and unknown obfuscators using machine learning’, Int. J. Adv.
Secur., 2016, 9, (3–4), pp. 196–206
[19] Fass, A., Krawczyk, R., Backes, M., et al.: ‘JaSt: fully syntactic detection of
malicious (obfuscated) JavaScript’. Proc. of the 15th Int. Conf. on
Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA),
Saclay, France, 2018, pp. 303–325
[20] Pan, J., Mao, X.: ‘Obfuscated malicious JavaScript detection by machine
learning’. 2nd Int. Conf. on Advances in Mechanical Engineering and
Industrial Informatics (AMEII’16), Hangzhou, Zhejiang, 2016, pp. 805–810
[21] Graziano, M., Balzarotti, D., Zidouemba, A.: ‘ROPMEMU: a framework for
the analysis of complex code-reuse attacks’. Proc. of the 11th Asia Conf. on
Computer and Communications Security (ASIA CCS’16), Xi’an, China, 2016,
pp. 47–58
[22] Hu, X., Cheng, Y., Duan, Y., et al.: ‘JSForce: a forced execution engine for
malicious JavaScript detection’. Proc. of the 13th Int. Conf. on Security and
Privacy in Communication Networks (SecureComm’17), Lecture Notes of the
Institute for Computer Sciences, Social Informatics and Telecommunications
Engineering, Niagara Falls, Canada, 2017, vol. 238, pp. 704–720
[23] Python: ‘Python 2.7.17 documentation, the python standard library’. Available at
https://docs.python.org/2/library/functions.html, accessed on August 2019
[24] Zhang, Y., Jin, R., Zhou, Z.-H.: ‘Understanding bag-of-words model: a statistical
framework’, Int. J. Mach. Learn. Cybern., 2010, 1, (1–4), pp. 43–52
[25] Goldberg, Y.: ‘Neural network methods for natural language processing’,
‘Synthesis lectures on human language technologies’, vol. 37 (Morgan and
Claypool, San Rafael, CA, USA, 2017), no. 69, pp. 1–309
[26] Mikolov, T., Sutskever, I., Chen, K., et al.: ‘Distributed representations of words
and phrases and their compositionality’. Proc. of the Advances in Neural
Information Processing Systems (NIPS), Lake Tahoe, Nevada, USA, 2013b,
vol. 26, pp. 3111–3119
[27] Mikolov, T., Chen, K., Corrado, G., et al.: ‘Efficient estimation of word
representations in vector space’. Int. Conf. on Learning Representations
(ICLR), Workshop Papers, Scottsdale, AZ, USA, 2013a, pp. 1–12
[28] Le, Q., Mikolov, T.: ‘Distributed representations of sentences and documents’.
Proc. of the 31st Int. Conf. on Machine Learning (ICML-14), Beijing, China,
2014, pp. 1188–1196
[29] Dai, A., Olah, C., Le, Q.: ‘Document embedding with paragraph vectors’. Neural
Information Processing Systems (NIPS), Deep Learning Workshop, Palais des
Congrès de Montréal, 2014, pp. 1–8
[30] Bojanowski, P., Grave, E., Joulin, A., et al.: ‘Enriching word vectors with
subword information’, Trans. Assoc. Comput. Linguist., 2017, 5, pp. 135–146,
arXiv preprint arXiv:1607.04606
[31] Joulin, A., Grave, E., Bojanowski, P., et al.: ‘Bag of tricks for efficient text
classification’. Proc. of the 15th Conf. of the European Chapter of the
Association for Computational Linguistics (EACL), Short Papers, Valencia,
Spain, 2017, pp. 427–431
[32] Wang, B., Wang, A., Chen, F., et al.: ‘Evaluating word embedding models:
methods and experimental results’, APSIPA Trans. Signal Inf. Process., 2019,
E19, (8), pp. 1–13, arXiv:1901.09785
[33] Řehůřek, R., Sojka, P.: ‘Software framework for topic modelling with large
corpora’. Proc. of the Int. Conf. on Language Resources and Evaluation
(LREC’10), Workshop on New Challenges for NLP Frameworks, Valletta,
Malta, 2010, pp. 45–50
[34] Petrak, H.: ‘JavaScript malware collection – a collection of almost 40.000
JavaScript malware samples’. Available at
https://github.com/HynekPetrak/javascript-malwarecollection, accessed on August 2019
[35] Raychev, V., Bielik, P., Vechev, M., et al.: ‘Learning programs from noisy data’.
Proc. of the 43rd Annual ACM SIGPLAN-SIGACT Symp. on Principles of
Programming Languages, (POPL’16), New York, NY, USA, 2016, pp. 761–774
[36] The Majestic Million Service: ‘The million domains we find with the most
referring subnets’. Available at https://majestic.com/reports/majestic-million,
accessed on August 2019
[37] Burges, C.J.C.: ‘A tutorial on support vector machines for pattern recognition’,
Data Min. Knowl. Discov., 1998, 2, (2), pp. 121–167
[38] Pedregosa, F., Varoquaux, G., Gramfort, A., et al.: ‘Scikit-learn: machine
learning in python’, J. Mach. Learn. Res., 2011, 12, pp. 2825–2830
CAAI Trans. Intell. Technol., 2020, Vol. 5, Iss. 3, pp. 184–192
This is an open access article published by the IET, Chinese Association for Artificial Intelligence and
Chongqing University of Technology under the Creative Commons Attribution License
(http://creativecommons.org/licenses/by/3.0/)