
In this paper, we propose a new passive OS
fingerprinting method based on the analysis of DNS traffic.
The method exploits two characteristics of DNS queries:
each OS sends queries for specific domains that no other OS
queries, and each OS has a characteristic distribution of
the time intervals between these OS-specific queries. The
method can estimate the number of OS devices from the
number of OS-specific DNS queries. To realize our method,
we derive these characteristics by analyzing the DNS
queries sent by each OS. Our analysis shows that each OS
exhibits the two characteristics described above. We also
devise a method for estimating the number of OS devices
from the number of queries by utilizing the
characteristics. For the estimation, we derive an estimation
equation that uses the characteristics of OS-specific DNS
queries and also accounts for irregular time intervals, in
which some queries are sent less frequently than the
regular time interval implies and others more frequently.
In this paper, we provide the results of our analysis of
DNS queries from the Android OS and the characteristics of
those queries. Furthermore, we show the results of applying
our estimation to our intra-network to estimate the number
of OS devices. Some results show that our estimates are
close to the results of DHCP fingerprinting.
A. Contribution and Outline of this Paper
In this paper, we propose a new passive OS
fingerprinting method using DNS traffic. Taking the
Android OS as an example, we demonstrate characteristics
for OS fingerprinting derived from an analysis of
DNS-related traffic. We derive a method for estimating the
number of OS devices by using these characteristics and
considering the likelihood of irregular events: queries
sent much less frequently than the regular time interval
implies, and queries sent much more frequently. We
demonstrate the results of applying the estimation to our
intra-network.
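The counting idea can be illustrated with a minimal sketch. It is not the estimation equation derived in Section IV: it assumes that every device sends exactly one OS-specific query per fixed interval, and it ignores the irregular events mentioned above. The domain suffixes are placeholders under the reserved ".example" TLD, and all function names, the interval, and the sample log are hypothetical.

```python
from collections import Counter

# Hypothetical OS-specific domain suffixes (placeholders, not observed domains).
OS_SPECIFIC_SUFFIXES = {
    "os-a": ("checkin.os-a.example",),
    "os-b": ("update.os-b.example",),
}


def classify_query(qname):
    """Return the OS label whose specific domain matches qname, else None."""
    name = qname.rstrip(".").lower()
    for os_label, suffixes in OS_SPECIFIC_SUFFIXES.items():
        for suffix in suffixes:
            if name == suffix or name.endswith("." + suffix):
                return os_label
    return None


def count_os_specific_queries(qnames):
    """Count OS-specific queries per OS label in an iterable of query names."""
    counts = Counter()
    for qname in qnames:
        label = classify_query(qname)
        if label is not None:
            counts[label] += 1
    return counts


def estimate_device_count(query_count, window_seconds, interval_seconds):
    """Naive estimate: observed queries divided by queries per device.

    Assumes one query per device every interval_seconds (an assumption of
    this sketch, not a result of the paper).
    """
    if window_seconds <= 0 or interval_seconds <= 0:
        raise ValueError("window and interval must be positive")
    queries_per_device = window_seconds / interval_seconds
    return query_count / queries_per_device


# Hypothetical one-hour log: 8 queries for the "os-a" domain, assumed
# regular interval of 15 minutes (4 queries per device per hour).
log = ["checkin.os-a.example."] * 8 + ["www.other.example", "a.update.os-b.example"]
counts = count_os_specific_queries(log)
print(estimate_device_count(counts["os-a"], 3600, 900))  # 2.0
```

Under these assumptions, eight queries in one hour correspond to two devices. Devices that query more or less often than the assumed interval would bias this naive ratio, which is why the irregular events need separate treatment.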
The outline of this paper is as follows. We describe
related work in Section II. We summarize our
proposal for the estimation in Section III. We present the
results of our DNS traffic analysis with the Android OS and
the equation for estimating the number of OS devices in
Section IV. We present an evaluation of our estimation
using DNS traffic on our intra-network in Section V, and
conclude this paper in Section VI.
II. RELATED WORKS
Several studies address OS fingerprinting. In [2],
Zalewski uses a passive approach that monitors differences
in TCP/IP header fields, such as TTL (Time To Live) and MSS
(Maximum Segment Size), to distinguish OSs. In the HTTP
headers, the User-Agent field has information about the web
browsers as well as the OSs of the users. In [5], Shah tries to
distinguish HTTP server software and the OS by using
information included in the HTTP responses. However, these
approaches are not feasible on large, complicated networks,
since they need traffic monitoring equipment at
all network borders and require the filtering of usable
information from high volumes of captured traffic data.
Moreover, the approach in [2] in particular does not work
in the case of tethering, because some fields in the TCP/IP
headers are usually rewritten. In [3], Kollmann uses
DHCP-related packets for passive OS fingerprinting. He uses
the time difference between retransmission frames or DHCP
fields, such as Secs. However, the DHCP frames contain no
information about the services or applications that users
use. Therefore, an additional system is needed to
complement the information from DHCP frames with
information from other traffic analysis.
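The header-based passive approach can be illustrated with a deliberately coarse sketch that uses only the TTL field. It is not the signature matching used in [2]: it assumes that senders start from one of a few commonly used initial TTL values and that each hop decrements the TTL by one. The function names and the mapping from initial TTL to OS family are illustrative assumptions.

```python
# Commonly used initial TTL values, in ascending order.
COMMON_INITIAL_TTLS = (32, 64, 128, 255)

# Coarse, illustrative hints only (an assumption of this sketch); practical
# tools combine many more header fields than the TTL.
OS_FAMILY_HINTS = {
    64: "Linux/Android/macOS-like",
    128: "Windows-like",
    255: "network-equipment-like",
}


def guess_initial_ttl(observed_ttl):
    """Round an observed TTL up to the nearest common initial TTL."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial


def guess_os_family(observed_ttl):
    """Map an observed TTL to a coarse OS-family hint, or 'unknown'."""
    return OS_FAMILY_HINTS.get(guess_initial_ttl(observed_ttl), "unknown")


# A packet observed with TTL 57 is assumed to have started at 64.
print(guess_os_family(57))   # Linux/Android/macOS-like
print(guess_os_family(120))  # Windows-like
```

Because the guess depends on header fields as they arrive at the monitor, any device on the path that rewrites those fields changes the result, which is consistent with the tethering limitation noted above.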
Other studies address active OS fingerprinting. In [6],
Lyon uses the network scanning tool Nmap, which has a
remote OS fingerprinting function. Nmap sends probe
packets to the target devices and monitors the responses. The
application then determines the OS of the target from the
response packets. In [10], Gagnon takes a hybrid approach
that combines passive and active approaches
to increase the accuracy of OS fingerprinting. However,
methods that rely on active probing do not work when the
target devices are located behind network devices, such as
a firewall or NAT box, because the application is unable to
send probe packets to the targets. Several studies have
tried to overcome such NAT-like situations. In [13],
Beverly used a passive approach to distinguish traffic
originating from NAT hosts from that of other hosts by
applying a naïve Bayesian classifier to characteristic
values in the TCP/IP header fields. In [4],
Schulz enabled active OS fingerprinting in a tethering
environment by injecting ICMP (Internet Control Message
Protocol) error packets into the target client's TCP session.
However, these approaches require an additional system to
monitor the traffic of all clients and, especially in [4],
to inject ICMP packets at the right time. Therefore, they
are infeasible on large, complicated networks.
Other studies profile user activities by
analyzing traffic. In [11], Xu classified Internet backbone
traffic into clusters (servers/services, heavy hitter hosts,
scans/exploits) with source/destination IP addresses. This
approach is unrealistic for large networks because of the
storage and computational cost of analyzing the volume of
traffic data needed to profile all activities.
Furthermore, there is a problem with the deployment of
monitors to obtain all traffic data on a large network. In [12],
Zhang tried to infer online user activities (browsing, online
game, video, etc.) by analyzing MAC-level traffic on a
wireless LAN and extracting features of
data/control/management frames (data rate, frame interval
time, etc.). This approach is specialized for wireless LAN
traffic and also has a monitor deployment issue.
III. DESCRIPTION OF PROPOSED METHOD
Figure 1 shows the network environment that we assume
for passive OS fingerprinting. The whole network contains
several access networks (cellular, FTTH, etc.),
and each device can connect to any access network. There is
a DNS server (or a set of DNS servers) on the core network.
Whichever access network a device connects to, it sends its
queries to the same DNS server. We also assume that some
devices connect to an access network through another
device, such as a tethering-enabled device or a NAT box. This