
flow-based statistical analysis scheme [21]. Both methods are
related to our proposal; however, their objectives are limited
either to SSL/TLS application recognition or to the classification
of encrypted application layer protocols. In this work,
we focus on an in-depth analysis of the SSL/TLS protocol
message sequences to characterize and classify application
flows. In our previous work, we considered the problem of
detecting Skype traffic and classifying its service flows. We
proposed a classification method for Skype traffic tunneled
over TLS in addition to proprietary encryption. The method
is based on Statistical Protocol IDentification (SPID), which
analyzes the distributions of flow and application layer data
[13].
Bissias et al. presented a traffic analysis attack against
encrypted HTTP streams to identify the source of the traffic by
analyzing distributions of packet sizes and inter-arrival times
of web requests from sites of interest [23]. Although their work
differs from ours in terms of objectives and methodology,
the conclusion is the same: encrypting traffic does not
prevent some types of traffic analysis.
Lee et al. and Levillain et al. evaluated the practices of
SSL/TLS servers by investigating server replies [24], [18].
They studied the details of the encryption parameters, e.g. ci-
pher suites, key sizes, and protocol features such as supported
versions and their extensions. Our work is a further step in
this direction.
VII. CONCLUSIONS
In this paper, we have proposed stochastic fingerprints for
application traffic flows conveyed in SSL/TLS sessions. The
fingerprints are based on first-order homogeneous Markov
chains for which we identify the parameters from observed
training application traces. As the fingerprint parameters of
the chosen applications differ considerably, the method achieves
very good accuracy of application discrimination and makes it
possible to detect abnormal SSL/TLS sessions. We have
also shown that application fingerprints need to be updated
periodically, because they change over time.
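To make the fingerprint construction concrete, the following is a minimal Python sketch, not the implementation used in this paper. It assumes that each session has already been reduced to a sequence of state labels. The label names, the toy traces, the function names (fit_fingerprint, log_likelihood, classify), and the probability floor for unseen transitions are all illustrative assumptions. Parameters are estimated by relative-frequency counting, which is the maximum likelihood estimate for a first-order homogeneous Markov chain, and a session is attributed to the fingerprint with the highest log-likelihood.

```python
import math
from collections import defaultdict


def fit_fingerprint(sessions):
    """Estimate entry and transition probabilities by counting.

    `sessions` is a list of state-label sequences (hypothetical labels).
    Returns (entry_probabilities, transition_probabilities).
    """
    entry = defaultdict(float)
    trans = defaultdict(lambda: defaultdict(float))
    for seq in sessions:
        if not seq:
            continue
        entry[seq[0]] += 1
        for a, b in zip(seq, seq[1:]):
            trans[a][b] += 1
    total = sum(entry.values())
    entry_p = {s: c / total for s, c in entry.items()}
    trans_p = {}
    for a, row in trans.items():
        row_total = sum(row.values())
        trans_p[a] = {b: c / row_total for b, c in row.items()}
    return entry_p, trans_p


def log_likelihood(seq, fingerprint, floor=1e-9):
    """Log-probability of `seq` under a fingerprint.

    Unseen entries or transitions get the assumed probability `floor`
    so that the logarithm stays finite.
    """
    if not seq:
        raise ValueError("empty session")
    entry_p, trans_p = fingerprint
    ll = math.log(entry_p.get(seq[0], floor))
    for a, b in zip(seq, seq[1:]):
        ll += math.log(trans_p.get(a, {}).get(b, floor))
    return ll


def classify(seq, fingerprints):
    """Return the application whose fingerprint best explains `seq`."""
    return max(fingerprints, key=lambda app: log_likelihood(seq, fingerprints[app]))


# Toy traces with made-up state labels, for illustration only.
training = {
    "app_a": [["ServerHello", "Certificate", "ServerHelloDone",
               "ChangeCipherSpec", "ApplicationData"]] * 2,
    "app_b": [["ServerHello", "ChangeCipherSpec",
               "ApplicationData", "ApplicationData"]],
}
fingerprints = {app: fit_fingerprint(s) for app, s in training.items()}
print(classify(["ServerHello", "ChangeCipherSpec", "ApplicationData"], fingerprints))
```

Under the same assumptions, a session whose best log-likelihood is very low under every fingerprint could be flagged as abnormal; a suitable threshold would have to be chosen empirically.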
Our analysis of the results reveals that application
discrimination mainly stems from incorrect and diverse
implementation practices, misuse of the SSL/TLS protocol,
various server configurations, and the nature of the application. Finally,
even though we are able to identify some very reliable statistical
fingerprints for selected applications, it is also possible to
evade the classification by avoiding implementation mistakes
and building the secure layer on a limited but widely used set
of SSL/TLS states.
In future work, we plan to further investigate the
proposed method on a wider range of Internet applications
and to cross-validate it on other heterogeneous datasets gathered in
various subnetworks. We also aim to analyze the SSL/TLS
stack to verify its consistency with protocol recommendations
and best security practices. Finally, we plan to apply the
approach to reveal intrusions that exploit the SSL/TLS protocol
by establishing suspicious, unlikely sessions.
ACKNOWLEDGMENTS
We would like to thank DIMACS and CCICADA for
support, and Nina Fefferman for useful comments on the
draft. This work was partially supported by the European
Commission FP7 project INDECT under contract 218086.
REFERENCES
[1] “Internet Assigned Numbers Authority (IANA),”
http://www.iana.org/assignments/port-numbers.
[2] A. W. Moore and K. Papagiannaki, “Toward the Accurate Identification
of Network Applications,” Proc. of the PAM Conference, pp. 41–54,
2005.
[3] S. Sen, O. Spatscheck, and D. Wang, “Accurate, Scalable In-network
Identification of P2P Traffic Using Application Signatures,” in Proc. of
the WWW Conference, 2004, pp. 512–521.
[4] L. Bernaille and R. Teixeira, “Early Recognition of Encrypted Applica-
tions,” Proc. of the PAM Conference, vol. 4427, pp. 165–175, 2007.
[5] A. W. Moore and D. Zuev, “Internet Traffic Classification Using
Bayesian Analysis Techniques,” Proc. of ACM SIGMETRICS, pp. 50–60,
2005.
[6] M. Iliofotou, P. Pappu, M. Faloutsos, M. Mitzenmacher, S. Singh, and
G. Varghese, “Network Monitoring Using Traffic Dispersion Graphs
(TDGs),” in Proc. of ACM IMC, 2007, pp. 315–320.
[7] T. Karagiannis, K. Papagiannaki, and M. Faloutsos, “BLINC: Multilevel
Traffic Classification in the Dark,” Proc. of ACM SIGCOMM, pp. 229–
240, 2005.
[8] H. Kim, K. Claffy, M. Fomenkov, D. Barman, M. Faloutsos, and K. Lee,
“Internet Traffic Classification Demystified: Myths, Caveats, and the
Best Practices,” in Proc. of ACM CoNEXT, 2008, pp. 1–12.
[9] F. Risso, M. Baldi, O. Morandi, A. Baldini, and P. Monclus,
“Lightweight, Payload-Based Traffic Classification: An Experimental
Evaluation,” Proc. of IEEE ICC, pp. 5869–5875, 2008.
[10] T. Dierks and E. Rescorla, “The Transport Layer Security (TLS)
Protocol, Version 1.2,” RFC 5246 (Proposed Standard), August 2008.
[11] A. Freier, P. Karlton, and P. Kocher, “The Secure Sockets Layer (SSL)
Protocol Version 3.0,” RFC 6101 (Historic), August 2011.
[12] I. L. MacDonald and W. Zucchini, Hidden Markov and Other Models
for Discrete-Valued Time Series. Chapman & Hall, 1997.
[13] M. Korczyński and A. Duda, “Classifying Service Flows in the Encrypted
Skype Traffic,” Proc. of IEEE ICC, pp. 1–5, 2012.
[14] R. Seggelmann, M. Tuexen, and M. Williams, “Transport Layer Security
(TLS) and Datagram Transport Layer Security (DTLS) Heartbeat
Extension,” RFC 6520 (Proposed Standard), February 2012.
[15] J. Aldrich, “R.A. Fisher and the Making of Maximum Likelihood 1912–1922,”
Statistical Science, vol. 12, no. 3, pp. 162–176, August 1997.
[16] I. Drago, M. Mellia, M. Munafo, A. Sperotto, R. Sadre, and A. Pras,
“Inside Dropbox: Understanding Personal Cloud Storage Services,” in
Proc. of ACM IMC, 2012, pp. 481–494.
[17] S. Blake-Wilson, M. Nystrom, D. Hopwood, and J. Mikkelsen, “Transport
Layer Security (TLS) Extensions,” RFC 4366 (Proposed Standard),
April 2006.
[18] O. Levillain, A. Ébalard, B. Morin, and H. Debar, “One Year of SSL
Internet Measurement,” in Proc. of ACM ACSAC, 2012, pp. 11–20.
[19] A. Langley, “Unfortunate Current Practices for HTTP over TLS,”
Internet Draft, January 2011.
[20] K. Bhargavan, C. Fournet, M. Kohlweiss, A. Pironti, and P. Strub,
“Implementing TLS with Verified Cryptographic Security,” in Proc. of
the IEEE Symposium on Security & Privacy, 2013, pp. 445–459.
[21] G.-L. Sun, Y. Xue, Y. Dong, D. Wang, and C. Li, “An Novel Hybrid
Method for Effectively Classifying Encrypted Traffic,” Proc. of IEEE
GLOBECOM, pp. 1–5, 2010.
[22] R. Alshammari and A. Zincir-Heywood, “Machine Learning Based
Encrypted Traffic Classification: Identifying SSH and Skype,” in IEEE
Symposium on Computational Intelligence for Security and Defense
Applications, 2009, pp. 1–8.
[23] G. D. Bissias, M. Liberatore, D. Jensen, and B. N. Levine, “Privacy
Vulnerabilities in Encrypted HTTP Streams,” in Proc. of the 5th Int.
Conference on Privacy Enhancing Technologies, 2006, pp. 1–11.
[24] H. K. Lee, T. Malkin, and E. Nahum, “Cryptographic Strength of
SSL/TLS Servers: Current and Recent Practices,” in Proc. of ACM IMC,
2007, pp. 83–92.