Deterministically Deterring Timing Attacks in Deterland
Weiyi Wu
Yale University
New Haven, CT, USA
weiyi.wu@yale.edu
Bryan Ford
Swiss Federal Institute of Technology (EPFL)
Lausanne, Switzerland
bryan.ford@epfl.ch
Abstract
The massive parallelism and resource sharing embodied in today's cloud business model not only exacerbate the security challenge of timing channels, but also undermine the viability of defenses based on resource partitioning. We pro-
pose hypervisor-enforced timing mitigation to control tim-
ing channels in cloud environments. This approach closes
“reference clocks” internal to the cloud by imposing a de-
terministic view of time on guest code, and uses timing
mitigators to pace I/O and rate-limit potential information
leakage to external observers. Our prototype hypervisor is
the first system to mitigate timing-channel leakage across
full-scale existing operating systems such as Linux and ap-
plications in arbitrary languages. Mitigation incurs a vary-
ing performance cost, depending on workload and tunable
leakage-limiting parameters, but this cost may be justified
for security-critical cloud applications and data.
1. Introduction
The cloud computing model, which shares computing resources among many customers, has become popular due
to perceived advantages such as elasticity, scalability, and
lower total cost of ownership. Clouds typically use virtual-
ization to protect one customer’s computation and data from
another. Virtualization alone does not protect against timing
attacks [40, 57], however, in which a malicious party learns
sensitive information by analyzing the observable timing ef-
fects of a victim’s computation.
Computers contain many shared resources that can be
used for internal timing attacks by a co-resident attacker.
L1 data caches [39], functional units [48], branch target
caches [2], and instruction caches [1] have all been used to
learn sensitive information. Protecting these resources indi-
vidually through partitioning [18, 36, 55] is insufficient as
attackers regularly identify new timing channels, such as L3
caches, memory, and I/O devices. Defenses such as address space layout randomization attempt to make attacks more difficult, but often fail to do so adequately [28].
Even external attackers on a remote machine can exploit
side-channels to learn a server’s private key [12], and pos-
sibly any information not processed in constant time [10].
Other existing defenses against timing channels rely on
language support and impose new programming models or
limits to communication with the external world, making
them unusable on unmodified existing application code. De-
terministic containers rely on language support [4, 42, 45,
55], and Determinator requires applications to be ported to
new microkernel APIs [6, 7]. Predictive mitigation offers
important formal techniques to reason about and limit leak-
age [5, 45, 53, 54], but to our knowledge has not been im-
plemented in a fashion compatible with existing languages
and applications.
Today’s cloud consumers expect to execute unmodified
applications in a variety of existing languages, and to have
unrestricted network access, precluding most existing tech-
niques. Practical timing channel protection in cloud envi-
ronments must support unmodified applications, unrestricted
network access, and multi-tenancy.
As a first step towards this goal, we introduce Deterland,
a hypervisor that builds on many of the techniques summa-
rized above, but can run existing guest OS and application
code in an environment guaranteeing strong rate-limits to
timing channel leakage. Deterland uses system-enforced de-
terministic execution to eliminate internal timing channels
exploitable by guest VMs, and applies mitigation to each
VM’s I/O to limit information leakage to external observers.
Deterland permits unmodified VMs and applications to
run in parallel on the same machine and permits remote net-
work interaction, while offering fine-grained control over the
maximum information leakage via timing channels. Deter-
land presents guest VMs with a deterministic internal notion
of time based on instructions executed instead of real time,
and mitigates external leakage via techniques similar to pre-
dictive mitigation but adapted to the hypervisor’s I/O model.
We have implemented Deterland by extending Certi-
KOS [24], a small experimental hypervisor. Unsurprisingly,
we find that Deterland’s general and backward-compatible
timing channel defense comes with performance cost. How-
ever, cloud providers and their customers can tune Deter-
land’s mitigation parameters for different tradeoffs between
performance and leakage risk. Deterland’s overhead can be
fairly small (e.g., less than 10%) in some configurations
on compute-intensive or latency-tolerant workloads, while
mitigation is more costly for interactive or latency-sensitive
workloads. We expect these costs could be reduced with fur-
ther hypervisor improvements, and could also benefit from
hardware features such as precise instruction counting.
While timing channel mitigation via Deterland is prob-
ably not suitable for all cloud environments, our results
suggest that it may offer promise as a general-purpose,
backward-compatible approach to harden security-critical
cloud computations whose sensitivity justifies the perfor-
mance cost of mitigation.
In summary, this paper’s primary contributions are 1) the
first hypervisor offering general-purpose timing channel mit-
igation for unmodified guest operating systems and applica-
tions, 2) an evaluation of the overheads and leakage tradeoffs
for various workloads, and 3) a tunable mitigation model en-
abling cloud providers and customers to control the tradeoffs
between performance overheads and leakage risk.
2. Background
Timing channels [30, 51] require a victim-owned, attacker-
accessible resource and a reference clock to determine ac-
cess delays in order to infer sensitive information processed
by the victim's computation. An attacker co-resident with a victim's process can exploit internal timing channels based upon shared resources [1–3, 32, 39, 48, 50, 57]. Alternatively, if a victim has a network-accessible or otherwise externally accessible resource, an attacker can exploit external timing
channels by observing the timing relationship between in-
puts and outputs of the victim’s resource [10, 12].
2.1 Internal Timing Channels
Figure 1 illustrates a typical internal timing attack, where an
attacker attempts to learn a victim’s secret key as the victim
runs known code such as AES encryption. The attacker be-
gins by selecting a shared resource, such as the L1 cache,
and a time source, such as a high precision timer. In the
case of an L1 cache, the attacker determines which cache
blocks the victim accesses during known code sequences,
e.g., while performing key-dependent table lookups in AES.
To obtain this information, the attacker first touches each of
the L1 cache blocks to evict any data held by the victim,
then yields the CPU to the victim. Upon resuming, the at-
tacker then reads from the cache and measures the time for
the reads to complete. Cache misses will take longer than
cache hits, revealing the victim’s memory accesses. This in-
formation has been used to learn AES keys [3, 50], ElGamal
keys [57], and other sensitive cryptographic content [32].
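To make the probe step concrete, the sketch below times accesses to the attacker's own buffer with rdtscp to distinguish cache hits from misses. It is illustrative only: the buffer layout, line count, and 150-cycle threshold are hypothetical, not taken from any particular published attack.

#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>

#define LINES     64
#define LINE_SIZE 64

static uint8_t probe_buf[LINES * LINE_SIZE];

/* Time a single cache-line access with the CPU timestamp counter. */
static inline uint64_t timed_access(volatile uint8_t *p)
{
    unsigned aux;
    uint64_t start = __rdtscp(&aux);
    (void)*p;                          /* touch one cache line */
    uint64_t end = __rdtscp(&aux);
    return end - start;
}

int main(void)
{
    /* Prime: bring every monitored line into the cache, evicting victim data. */
    for (int i = 0; i < LINES; i++)
        probe_buf[i * LINE_SIZE] = 1;

    /* ... yield to the victim here; its accesses evict some of these lines ... */

    /* Probe: slow accesses reveal which lines the victim displaced. */
    for (int i = 0; i < LINES; i++) {
        uint64_t t = timed_access(&probe_buf[i * LINE_SIZE]);
        printf("line %2d: %4llu cycles%s\n", i, (unsigned long long)t,
               t > 150 ? "  (likely evicted by victim)" : "");
    }
    return 0;
}

In a real attack the buffer is chosen to contend with the victim's lookup tables, and the prime/probe cycle is repeated many times to average out noise.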
Internal timing attacks require access to a high-resolution
clock, common in existing computer architectures and made
available to userspace by most operating systems. Even if
the OS or hypervisor were to disable direct high-resolution
timing sources such as this, hardware caches and intercon-
nects [1, 2, 39, 48] as well as I/O devices [30, 51] are read-
ily usable as reference clocks. Even a thread incrementing a shared-memory counter in a tight loop can create a reference clock [6, 51].
Figure 1: Internal timing channel. The attacker first flushes the cache, then the victim's VM performs memory operations whose access pattern depends on a secret. The attacker finally uses a high-precision clock to detect the victim's cache accesses, thereby deducing the secret.
Figure 2: External timing channel. The victim's VM offers an SSL-secured Web service. The attacker sends a series of SSL requests and externally measures the victim's response delays to deduce the victim's secret.
Resource partitioning can eliminate sharing of
cache [55], memory [36], and I/O [18], but partitioning un-
dermines the sharing and oversubscription advantages of the
cloud model [6] and perpetuates an arms race as adversaries
identify more resources usable for timing channel attacks.
Methods for addressing internal timing channel attacks
include language-based techniques and system-enforced de-
terminism. Language-based techniques [4, 42, 45, 54] elim-
inate reference clocks through time-oblivious operations that
effectively execute for a fixed period of time regardless of in-
put. Alternatively, system-enforced determinism [6, 7] elim-
inates reference clocks by isolating processes from sources
of time both internal and external. Clocks accessible to un-
trusted code reveal only the number of instructions executed,
and communication occurs at explicitly-defined points in
time, preventing the modulation of a reference clock. While
useful, these approaches impose new programming environ-
ments, breaking backwards-compatibility with existing code
bases, and do not address external timing attacks.
2.2 External Timing Channels
Figure 2 illustrates a typical external timing attack, where an
attacker attempts to learn a victim’s RSA private key through
network-based communication. The attacker initiates several
SSL sessions with a victim, causing the victim to perform
repeated RSA calculations using the same private key. If the
victim’s RSA arithmetic is not coded with sufficient care,
the time the victim takes for session initiation statistically
reveals information about the victim’s private RSA key. By
measuring response delays against a local reference clock,
the attacker eventually deduces the victim’s private key [12].
This style of attack generalizes to many cryptographic oper-
ations not implemented in constant time [10].
One way to address external timing channels is to delay
all output from the system. Early proposals added random
noise [23, 26], but offered no formal bounds on information
leakage. More recently, predictive mitigation [5, 53] delays
events according to a schedule predicted using only non-
sensitive information. Mispredictions incur a doubling of fu-
ture delays, however, which may result in unbounded perfor-
mance penalties. Additive noise and predictive mitigation do
not directly counter timing attacks in cloud environments, in
which an attacker co-resident with the victim [56] can use
fine-grained timers to exploit shared resources [40].
3. System Overview
Before delving into design details, this section presents a
high-level overview of the Deterland architecture. We use
the word Deterland to refer both to this architecture and
more generically to a cloud environment using Deterland as
its hypervisor. The name Deterland was inspired by Never-
land [9], a mythical place where time stops, and as such must
be inherently free of timing channels.
3.1 Mitigation Domains
A Deterland cloud, shown in Figure 3, may contain multi-
ple mitigation domains. Virtual machines (VMs) in differ-
ent mitigation domains receive different levels of protection
against timing channels, or more precisely, different rate-
limits on information leakage. As we will see in the eval-
uation (§6), mitigation inevitably comes with performance
costs; a choice among several mitigation domains enables
customers to balance performance against leakage risks.
Each mitigation domain has two properties for protecting
against timing attacks: mitigation interval and virtual CPU
speed. VMs within a given mitigation domain delay I/O operations that occur within a specified period until the end of that period; this period constitutes the mitigation interval. The virtual CPU speed is
defined as the number of guest instructions executed within
the guest VM per mitigation interval.
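For concreteness, the sketch below shows how these two per-domain parameters might be represented and how together they fix the length of a computation segment, assuming a vCPU that retires one instruction per cycle (as §5.2 describes). The structure and names are hypothetical, not Deterland's actual data structures.

#include <stdint.h>
#include <stdio.h>

struct mitigation_domain {
    uint64_t interval_us;   /* mitigation interval, in microseconds of real time */
    uint64_t vcpu_hz;       /* virtual CPU speed, in instructions per second */
};

/* Guest instructions that make up one computation segment. */
static uint64_t segment_instructions(const struct mitigation_domain *d)
{
    return d->vcpu_hz / 1000000ULL * d->interval_us;
}

int main(void)
{
    struct mitigation_domain d = { .interval_us = 1000, .vcpu_hz = 3200000000ULL };
    printf("segment = %llu instructions per 1ms interval\n",
           (unsigned long long)segment_instructions(&d));
    return 0;
}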
The cloud provider is responsible for creating mitigation
domains for customers according to their requirements and
contracts. Different mitigation domains may have different
maximum information leakage rates, performance overheads
and prices, but mitigation parameters are fixed within a miti-
gation domain. When a customer creates a VM instance, she
chooses a mitigation domain suitable for a particular appli-
cation. VM instances in different mitigation domains may be
created interchangeably, and a VM instance can be migrated
from one mitigation domain to another.
Figure 3: Mitigation domains in Deterland. Each color represents a different mitigation domain: for example, one domain with deterministic execution and a low leakage rate, one with deterministic execution and a medium leakage rate, and one with nondeterministic execution and an unconstrained leakage rate; all domains provide data protection.
Figure 4: Deterland architecture overview. Orange components – the untrusted guest VMs, vTimer, monitor, simulated devices, and virtio devices, which see only artificial time – lie within the mitigation boundary, and hence have no direct access to real time. Blue components – the mitigator, backend drivers, and physical I/O devices, which operate on wall time – connect these orange components to the outside world, but all interactions between them pass through the mitigator.
3.2 Architecture Overview
Figure 4 illustrates Deterland’s architecture, which is broadly
similar to conventional hypervisors. However, Deterland en-
forces deterministic execution and predictive mitigation to
control internal and external timing channels.
As shown in Figure 4, each VM instance running on
Deterland has its own mitigation boundary, which delimits
a specific mitigation domain. Code running on a guest VM
within this boundary can observe only artificial time, based
on the number of guest instructions executed since the last
mitigation interval. Thus, the only internal timing source
the VM can access is an artificial time generator (vTimer
in Figure 4). The other components inside the boundary
are also constrained to behave deterministically, so as to
eliminate all internal timing channels.
The only nondeterminism that can affect computations in a mitigation domain is introduced via explicit I/O crossing
the boundary, through the non-mitigated blue area in Fig-
ure 4. An external observer can measure the times at which
outputs emerge from the cloud computation, yielding exter-
nal timing channels. Deterland uses mitigation to limit the
maximum rate of information leakage across this I/O bound-
ary, however. An internal observer can receive time-stamped
messages from an external clock source, of course, but mit-
igation limits the rate and granularity with which external
time sources are observable by the VM.
Unlike many common hypervisors, Deterland does not
allow its VMs to access hardware directly, but fully virtu-
alizes all essential devices and isolates the VMs from the
physical hardware. This device virtualization is necessary
to prevent untrusted guest VMs from using nondeterminis-
tic device hardware to form high-precision internal timing
sources, which would compromise mitigation security.
Deterland allows a guest VM to communicate with any
remote endpoint, subject to other security policies the cloud
provider might impose, but all external communications are
mitigated – currently including communication between dif-
ferent cloud hosts within the same mitigation domain. An
area for future improvement would be to support determin-
istic intra-cloud communication protocols that need not be
mitigated, but we leave such optimizations to future work.
3.3 Security Assumptions
Deterland’s goal is to protect against both internal and ex-
ternal timing attacks in an environment that can execute ex-
isting, unmodified operating system and application code.
Specifically, Deterland offers timing channel mitigation
against both co-resident and remote adversaries. An adver-
sary is assumed to have full, unrestricted access to his guest
VM’s resources including all guest-accessible hardware and
instructions. The adversary can run arbitrary code that can
flush cache lines, attempt to measure memory access times,
set and receive (virtual) timer interrupts, and send and re-
ceive network packets both inside and outside Deterland.
We impose no special operational limits on the attacker; he
may repeat these activities at will.
We assume an adversary can contrive to co-locate a pro-
cess with a victim, and may allocate multiple virtual ma-
chines on the same machine, although Deterland applies mit-
igation to these VMs as a unit. The underlying hardware pre-
vents direct access to the users’ confidential data via virtu-
alization and access controls. We assume an attacker cannot
escalate privileges and compromise the cloud or a target’s re-
source. An attacker can learn any public information about
a target including virtual machine or process placement and
published or declassified data.
Deterland assumes that clients employ secure communi-
cation primitives, such as HTTPS and TLS. An external at-
tacker may have access to any I/O flows into and out of a
hardware resource executing a target’s process and can com-
pare these accesses against a high-precision clock.
4. Deterland Design
The layout of Deterland components in a single cloud com-
puting node is shown in Figure 5. The hypervisor runs virtual
machines (VMs) inside a mitigation boundary, and isolates
the VMs from real hardware.
Figure 5: Layout of Deterland components (hypervisor and guest VMs on their own CPU cores inside the mitigation boundary under artificial time; the mitigator and backend drivers on a separate core outside the boundary under real time, in front of the disk, network, and console devices).
For implementation simplicity, the mitigator and backend drivers currently run on separate CPU cores outside the mitigation boundary, although
this current restriction is not essential to the architecture.
The mitigator communicates with the backend drivers, and
the backend drivers in turn access the underlying hardware.
This section details the design of these three key modules:
hypervisor (§4.1), backend (§4.2) and mitigator (§4.3).
4.1 Hypervisor
The hypervisor consists of four main components: the VM
monitor, the artificial time generator (vTimer), simulated
non-I/O devices, and virtio devices, as shown in Figure 4.
All four parts are in the Trusted Computing Base (TCB)
because they can access VM state, which must not be visible
to external observers. The hypervisor enforces deterministic
execution and contributes to the mitigation process.
4.1.1 VM Monitor (VMM)
The VM monitor in Deterland has functionality similar to
a conventional VMM. In contrast with conventional VMMs
designed to maximize computing power and minimize over-
head, Deterland’s VMM must enforce strict determinism
within each guest VM despite the performance costs of do-
ing so. This currently means that Deterland’s VMM must
virtualize all privileged instructions, I/O operations and
interrupts, and avoid using hardware-assisted virtualiza-
tion features that could compromise determinism. Common
hardware virtualization features could in principle be en-
hanced to enforce deterministic execution, thereby poten-
tially improving Deterland’s performance considerably.
The VMM intercepts instructions that access timers,
statistical counters, and other nondeterministic CPU state.
Timing-related instructions are emulated by the vTimer, and
statistical counters are currently not accessible to guest VMs at all. The VMM in this way systematically prevents the
guest from accessing CPU features that could be used as
high-precision timing sources to obtain information about
the execution timings of other guest VMs.
The VMM also tracks the precise number of executed
instructions for the vTimer, and pauses the VM after a de-
terministic instruction count as dictated by the vTimer. Al-
though some historical architectures such as PA-RISC [29]
directly support exact instruction counting, on the x86 ar-
chitecture the VMM must use well-known but more costly
techniques to count instructions precisely in software [19].
Deterland does not support I/O passthrough, as the only physical hardware inside the deterministic boundary is the CPU and memory, excluding timing-related features. All devices the VM can access are deterministically simulated. The VMM intercepts and forwards every I/O instruction to the corresponding virtual device, either a simulated non-I/O de-
vice or a virtio device.
The VMM intercepts and handles all raw hardware inter-
rupts, since hardware interrupts could be utilized as timing
sources. The VMM therefore either disables physical hard-
ware interrupts for the CPU core it runs on, or reroutes them
to the CPU core the mitigators run on. The only interrupts
the VMM injects into guest VMs are virtual interrupts gen-
erated deterministically by simulated devices.
4.1.2 vTimer
The artificial time generator, or vTimer, generates the only
fine-grained notion of time observable to guest VMs, and
schedules mitigated I/O and interrupts. Almost all other
components refer to the vTimer’s artificial notion of time.
The vTimer provides three APIs for other components:
1) get the current artificial time; 2) set an alarm at some fu-
ture artificial time for an event, and 3) get the artificial time
for the nearest event. The first API is used widely in other
components as it is the only timing source they can access.
The second API is used by simulated devices that take pe-
riodic actions. For example, the simulated timer device gen-
erates timer interrupts periodically; the virtio devices communicate periodically with the mitigator, at the end of each
computation segment. The vTimer will notify the device at
the artificial time so that it can take the corresponding action
and reset the alarm for the next iteration. The third API is
called by the VMM before each VM entry to pause the VM
after the appropriate number of instructions are executed, or
to manually advance the artificial time in some cases.
The vTimer updates the artificial time after each VM exit.
The VMM notifies the vTimer with the number of instruc-
tions the VM executed since the last VM entry. The VMM
also asks the vTimer to manually increase the artificial time
when it manually advances the program counter in the case
it intercepts an instruction and then “executes” it by hand.
Some virtual devices may ask the vTimer to manually ad-
vance the artificial time in order to simulate I/O operations
that normally take some time to complete. When the VM
enters a halt state, the VMM asks the vTimer to advance the
artificial time to the point for the next vTimer event.
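The sketch below illustrates what such a vTimer interface could look like: artificial time is just the cumulative guest instruction count, and alarms are kept in a list ordered by artificial time. All names and types are hypothetical; this is a sketch of the behavior described above, not CertiKOS/Deterland code.

#include <stdint.h>

typedef void (*vtimer_cb)(void *arg);      /* device callback fired at its alarm */

struct vtimer_event {
    uint64_t  when;                        /* artificial time at which to fire */
    vtimer_cb fire;
    void     *arg;
    struct vtimer_event *next;             /* singly-linked list, sorted by when */
};

struct vtimer {
    uint64_t now;                          /* current artificial time = instructions executed */
    struct vtimer_event *events;
};

/* API 1: current artificial time. */
static uint64_t vtimer_now(const struct vtimer *vt) { return vt->now; }

/* API 2: set an alarm at a future artificial time. */
static void vtimer_set_alarm(struct vtimer *vt, struct vtimer_event *ev)
{
    struct vtimer_event **pp = &vt->events;
    while (*pp && (*pp)->when <= ev->when)
        pp = &(*pp)->next;
    ev->next = *pp;
    *pp = ev;
}

/* API 3: artificial time of the nearest pending event; the VMM uses this to
 * decide how many guest instructions to run before the next pause. */
static uint64_t vtimer_next_event(const struct vtimer *vt)
{
    return vt->events ? vt->events->when : UINT64_MAX;
}

/* Called after each VM exit with the instructions executed since VM entry
 * (or a manual advance); fires any alarms that have come due. */
static void vtimer_advance(struct vtimer *vt, uint64_t executed)
{
    vt->now += executed;
    while (vt->events && vt->events->when <= vt->now) {
        struct vtimer_event *ev = vt->events;
        vt->events = ev->next;
        ev->fire(ev->arg);
    }
}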
4.1.3 Simulated Devices
The hypervisor simulates essential device hardware for guest
VMs. The simulated devices behave deterministically, do
not access the underlying hardware, and use the vTimer as
their only timing source. Only the most critical devices are
simulated, such as bus controller, interrupt controller and
timer device, and Deterland implements a basic version of
each device to minimize hypervisor complexity.
4.1.4 virtio Devices
All external-world I/O in Deterland uses the virtio inter-
face [41]. Virtio is the widely-adopted, de facto standard for
I/O virtualization in current hypervisors. Its request-based
ring structure has a simple design, and its asynchronous na-
ture enables I/O buffering and operation bundling across De-
terland’s mitigation boundary.
When a guest VM needs to perform I/O, it first creates
descriptors and data buffers in guest memory, then notifies
the virtio device. Conventional hypervisors extract requests
from guest memory and process them right away. Deterland
instead extracts the requests immediately but then buffers
them in a mitigation queue. At the end of the current mitiga-
tion interval, the vTimer notifies the virtio device to send all
the queued requests to the mitigator as a bundle, and receives
a response bundle from the mitigator. After receiving this re-
sponse bundle, the virtio device unpacks virtio responses and
copies them into guest memory. When the virtio device fin-
ishes processing I/O responses, it sets a new alarm for the
next mitigation interval.
The data transferred between the virtio devices and the
mitigator are not the scattered, raw data buffers as in the
virtio specification. For output requests, the virtio device
allocates a shared data buffer in hypervisor memory, copies
data from guest memory to the buffer, and sets up the request
with the address of the buffer. This data buffer is shared
between the hypervisor, the mitigator and the backend driver,
but is inaccessible to the guest VM. For input requests, the
virtio device simply sets up the request without an attached
buffer. The data buffer is then allocated and attached to
the corresponding response by the backend driver. For both
types of requests, data buffers are attached to the responses
so that the virtio device can safely deallocate the buffers after
copying data into guest memory.
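The sketch below captures the queue-then-bundle behavior described above: guest kicks only append requests to a pending bundle, and the bundle is exchanged with the mitigator when the vTimer alarm for the interval fires. The structures, the fixed bundle size, and the mitigator_exchange helper are hypothetical.

#include <stdint.h>
#include <stddef.h>

struct io_request { uint64_t shared_buf; uint32_t len; };   /* metadata + shared-buffer handle */
struct io_bundle  { struct io_request reqs[256]; size_t n; };

struct virtio_dev {
    struct io_bundle pending;      /* requests queued during the current interval */
};

/* Assumed to exist elsewhere: synchronous exchange of bundles with the mitigator. */
extern void mitigator_exchange(const struct io_bundle *requests, struct io_bundle *responses);

/* Guest "kick": extract the request from guest memory, but only buffer it. */
static void virtio_queue_request(struct virtio_dev *d, struct io_request r)
{
    if (d->pending.n < 256)
        d->pending.reqs[d->pending.n++] = r;
}

/* vTimer alarm handler, fired once per mitigation interval. */
static void virtio_interval_end(struct virtio_dev *d)
{
    struct io_bundle responses;
    mitigator_exchange(&d->pending, &responses);
    d->pending.n = 0;
    /* ... copy each response's data into guest memory, inject a virtual
     * interrupt, then re-arm the vTimer alarm for the next interval ... */
}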
4.2 Backend
The backend performs actual I/O operations. After receiving
an I/O bundle from the mitigator, the backend driver unpacks
the bundle and performs the I/O requests it contains. Once an
output request finishes, or an input arrives, the driver sets up
the response and sends it back to the mitigator.
The backend driver does not bundle responses, because
it does not know when a mitigation interval starts and ends.
Instead the backend sends the mitigator individual responses
and leaves the task of buffering and bundling responses to
the mitigator.
4.3 Mitigator
The mitigator implements the core of the mitigation process,
which adapts previously-proposed mitigation techniques to
the hypervisor environment.
Figure 6: Mitigations for outputs: requests are bundled at the
end of each artificial-time period, and become available for
the device at the beginning of the next real-time period.
Deterland supports different mitigation intervals for dif-
ferent I/O queues of virtio devices for the same VM instance.
In practice, however, we expect cloud providers to use the
same mitigation interval for all virtio devices to rate-limit in-
formation leakage consistently within a mitigation domain.
When a VM instance is created, the hypervisor synchronizes
all mitigator instances for the virtio devices.
The control loop of a virtio device’s mitigator is straight-
forward: 1) receive a request bundle from the virtio device;
2) collect all pending responses until the next time slot;
3) forward the request bundle to the backend driver, and
4) bundle all responses and send the bundle to the virtio de-
vice. The communication primitive the mitigator uses be-
tween the virtio device and itself is a synchronous exchange
operation, while the primitive it uses between the backend
driver and itself is an asynchronous send and receive.
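One possible shape of this loop is sketched below; the exchange, send, collect, and slot-wait primitives are assumed helpers, not actual Deterland functions.

#include <stddef.h>

struct io_bundle { void *items; size_t n; };   /* virtio request/response metadata */

/* Assumed primitives: synchronous exchange with the virtio device,
 * asynchronous send/collect with the backend driver. */
extern void virtio_exchange(struct io_bundle *requests_out, const struct io_bundle *responses_in);
extern void backend_send(const struct io_bundle *requests);
extern void backend_collect(struct io_bundle *responses);
extern void wait_until_next_real_time_slot(void);

static void mitigator_loop(void)
{
    struct io_bundle requests = {0}, responses = {0};
    for (;;) {
        /* 1) synchronous exchange: deliver the previous response bundle and
         *    receive the next request bundle from the virtio device. */
        virtio_exchange(&requests, &responses);

        /* 2) buffer individual backend responses until the next real-time slot. */
        wait_until_next_real_time_slot();
        backend_collect(&responses);

        /* 3) only now release the request bundle to the backend driver, so
         *    outputs are paced by the real-time slot schedule. */
        backend_send(&requests);

        /* 4) the responses collected in this slot go back to the virtio
         *    device on the next iteration's exchange. */
    }
}

Because the request bundle is released to the backend only at a real-time slot boundary, the timing of the VM's outputs depends on the slot schedule rather than on when the guest produced them.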
As discussed in §4.1.4, the actual data buffer is shared
among the mitigator, the virtio device and the local backend
driver. Therefore the actual objects being mitigated are the
I/O metadata, i.e., virtio request bundles and response bun-
dles and permissions for accessing the actual data buffer. For
some hardware, the mitigator can be embedded in the local
backend driver to further reduce communication overhead.
In this case, the backend driver becomes part of the TCB.
4.3.1 Mitigation in Deterland
Deterland does not limit the number of I/O operations in a
real-time period or an artificial-time period, unlike earlier
mitigation designs [5]. By enforcing deterministic execution,
the content and ordering of I/O operations in an artificial-
time period become purely deterministic functions of the VM
state and the contents of response bundles virtio devices
receive at the beginning of the artificial-time period. We can
therefore use the request bundle in place of a single request
as the atomic unit of output mitigation, as shown in Figure 6.
Deterland mitigates both input and output. The mitiga-
tion for input is symmetric with that of output, and uses a
response bundle as the unit of mitigation. Figure 7 shows the
data flow of request and response bundles as well as infor-
mation leakage. External observers can obtain at most one
bit of leaked timing information in each mitigation interval.
The VM may gain several bits of information in an artificial-
time period, but no more than what external observers could
have learned already.
Figure 7: Information leakage: when no request bundle is available at the beginning of a real-time period, external observers gain one bit of information; when a response bundle contains responses from multiple real-time periods, the VM gains the same information. Red frownies represent potential one-bit information leakage events.
For example, in Figure 7, an external observer may learn one bit of information from the fact that "there is no output in R5 because A3 missed the deadline." Correspondingly, the VM learns that "the input at the beginning of A5 comes from R4 and R5." These two "leaks" reveal exactly the same information, however, as they both reflect the same root cause: A3 missed the deadline.
In practice, mitigation introduces 2–3 mitigation intervals
of latency to I/O operations – the most important perfor-
mance cost of mitigation. For example, in Figure 7, if the
VM issues a disk I/O operation in A1, the operation will be bundled at the end of A1 and actually performed in R3. The result of the operation will then be bundled at the end of R3 and become available to the guest VM in A4. The VM thus sees a 2–3 mitigation interval latency. If the VM misses deadlines during this roundtrip, the latency may be higher. Symmetrically, if an external request arrives in R1, it can be processed as early as in A2 and the corresponding response will be published as early as in R4.
Multiple I/O devices for the same VM instance could
leak more than one bit of information per mitigation interval if
incorrectly configured. This leakage is avoided if the cloud
provider sets one mitigation interval for all VM instances
in the same mitigation domain, and Deterland applies the
value to all I/O queues for each VM instance. Deterland
also keeps the real-time periods across I/O queues for the
same VM instance synchronized. The artificial-time periods
are naturally synchronized as they use the same notion of
artificial time.
4.3.2 Mitigation Configuration
The maximum information leakage rate is directly related to
the mitigation interval. However, the upper bound is likely
to be hard to reach, especially when the computing node
has low utilization. Actual information leakage is heavily
impacted by the virtual CPU speed and other factors.
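As a worked example of this relationship (a sketch, using the one-bit-per-interval bound from §4.3.1): the loose upper bound on the leakage rate is simply the reciprocal of the mitigation interval T,

  L_max = 1 bit / T

so T = 1ms gives L_max = 1000 bit/s and T = 100ms gives L_max = 10 bit/s, consistent with the roughly 1Kbps theoretical bound cited for 1ms intervals in §6.4.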
Figure 8 illustrates information leakage for different uti-
lizations under different vCPU speed settings. In Figure 8a,
the vCPU speed is significantly lower than the host CPU
speed, which means that each segment runs only a small
number of instructions and the host CPU can always com-
plete them in a single real-time slot. Provided the VM always
finishes the segment within one time slot, it effectively be-
haves like a hard real-time system and leaks no timing information at all, at a likely cost of substantially under-utilizing the CPU.
Figure 8: Information leakage and utilization. (a) Slow vCPU, low utilization, no information leakage. (b) Fast vCPU, high utilization, two bits are leaked.
Figure 9: Time synchronization using the piggybacked value from the mitigator.
In contrast, the vCPU speed in Figure 8b is close to or higher than the host CPU speed. In this case the VM may leak one bit of information per time slot – namely whether the segment completed execution in this time slot or in some later slot – but utilizes the CPU more efficiently.
Calculating the actual amount of information an execu-
tion trace leaks using Shannon entropy [43] is possible in
principle but difficult. However, we know that an under-
utilized computing node is unlikely to miss deadlines and
hence leak information. An over-utilized computing node,
on which the VM has 50% chance of finishing a segment
in time and 50% chance of missing the deadline, might be
expected to leak information at the maximum rate.
4.3.3 Time Synchronization
If one segment takes longer than expected, the artificial time
the VM perceives will permanently become one mitigation
interval behind real time, as with A5 and R6 in Figure 9. Deter-
land automatically adjusts the guest VM’s perception of time
so that artificial and real time remain roughly synchronized,
provided all I/O queues have the same mitigation interval.
This requirement is normally satisfied as long as the cloud
provider applies the same mitigation interval to all devices.
This time synchronization leaks no additional information.
Just before exchanging buffers, the mitigator calculates
the number of elapsed real-time periods since the last ex-
change, and piggybacks this number in the response bundle,
as shown in Figure 9. The virtio devices are expected to re-
ceive the same number at the same time since the mitigators
are synchronized, and the mitigation intervals for all devices
are the same. After receiving this number, the virtio device
notifies the vTimer to update the speed ratio accordingly.
The speed ratio controls how fast the vTimer advances
artificial time. If the speed ratio of the vTimer is set to n, it
advances the artificial time n times faster than normal, which is equivalent to temporarily reducing the virtual CPU speed to 1/n-th of its normal value. The number of instructions in a segment remains unchanged, however. Therefore the virtio devices need to set up the alarm n times as far ahead as they normally would. This speed ratio compensates for the missed
real-time periods and quickly recovers artificial time to keep
it synchronized with real time.
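A minimal sketch of this adjustment, assuming the mitigator piggybacks the number of elapsed real-time slots as described above; the structure and function names are hypothetical.

#include <stdint.h>

struct vtimer_sync {
    uint64_t segment;      /* guest instructions per mitigation interval at ratio 1 */
    uint64_t speed_ratio;  /* artificial time advances this many times faster */
};

/* Called by a virtio device with the piggybacked count of real-time periods
 * that elapsed since the last exchange (normally 1). */
static void vtimer_update_speed_ratio(struct vtimer_sync *vt, uint64_t elapsed_slots)
{
    vt->speed_ratio = elapsed_slots ? elapsed_slots : 1;
}

/* A device arms its next alarm speed_ratio times further ahead in artificial
 * time, so a segment still contains the same number of guest instructions. */
static uint64_t next_alarm_delta(const struct vtimer_sync *vt)
{
    return vt->segment * vt->speed_ratio;
}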
The only information crossing the mitigation boundary is
the number of elapsed real-time slots, and this information
is already known to the external observer and the VM, so
the piggybacked value does not leak additional information.
The maximum leakage rate remains constant since the actual
length of segments and time slots remains unchanged.
4.4 Limitations
Deterland currently supports only a single virtual CPU
per guest VM, though it can run many single-core guests
on a multicore machine. Enforcing deterministic execu-
tion on multiprocessor guests is possible but complex and
costly [20]. More efficient user-space libraries for deter-
ministic parallelism [16] still introduce overhead and can-
not prevent malicious guests from violating determinism to
gain access to high-precision time sources. Determinator’s
microkernel-enforced deterministic parallel model [7] would
satisfy Deterland’s security requirements but is not compat-
ible with existing applications or guest operating systems.
This limitation may not be a serious problem for many
“scale-out” cloud applications, however, which are typically
designed to divide work just as readily across processes or
whole VMs as among shared-memory threads. Especially in
large-scale parallel processing paradigms such as MapRe-
duce, there may not be much fundamental efficiency differ-
ence between running a workload on N M-core (virtual) machines versus running the same workload on N×M single-core (virtual) machines. Deterland can run these N×M single-core VMs organized as M guests on each of N physical M-core machines.
Another limitation of Deterland is that it mitigates all
inter-VM communication, even among cloud hosts within
the same mitigation domain, which is in principle unnec-
essary to ensure security. This limitation might be solvable
through deterministic intra-cloud communication primitives
or efficient deterministic-time synchronization techniques,
but we leave such improvements to future work.
Deterland currently provides only coarse-grained tim-
ing information control, across entire guest VMs. An IFC-
based timing information control [45] might in principle of-
fer finer-granularity control and reduce mitigation overhead.
Finally, Deterland does not support direct hardware ac-
cess from the VM even if the hardware itself supports virtu-
alization features, such as Intel SR-IOV. Future hypervisor
optimizations and hardware improvements could eventually
enable Deterland to make use of hardware virtualization and
reduce its performance costs.
5. Implementation
Deterland builds upon CertiKOS [24], a certified hypervi-
sor. Deterland adds mitigation, an artificial timer, timing-
related virtual devices such as vPIT and vRTC, and virtio
devices. Deterland also virtualizes the network device via
a network driver running in the hypervisor. CertiKOS gives
guests direct access to network devices via IOMMU, but De-
terland cannot do this without compromising determinism.
Deterland also optimizes IPC performance for communica-
tion patterns resulting from I/O mitigation. Deterland’s code
additions to CertiKOS are not yet certified.
Deterland inherits several drawbacks from CertiKOS.
CertiKOS currently only supports a single VM instance per
CPU core. Also the CertiKOS disk driver utilizes a relatively
slow AHCI mode.
Deterland uses Intel VT-x instructions for virtualization,
but does not make use of Intel VT-d for direct I/O in order to
isolate the VM from physical device hardware.
5.1 Precise Versus Coarse Execution
For experimentation purposes, the Deterland hypervisor can
be configured to enforce either precise or coarse determinis-
tic execution. The coarse mode reflects the currently-untrue
assumption that the hardware is capable of interrupting exe-
cution after a precise number of instructions. Current x86
hardware unfortunately does not have this capability, but
other architectures (notably PA-RISC) have offered it [29],
and it could in principle be added to x86 processors. Coarse
mode experiments may thus offer some insight into Deter-
land’s potential performance on enhanced hardware.
In the precise execution mode, the Deterland hypervisor
uses well-known but costly techniques to emulate exact in-
struction counting in software [19]. Deterland uses Intel’s
architectural performance monitoring counters to count the
executed instructions in guest mode, and to pause the VM af-
ter a certain number of executed instructions. In practice the
counter will nondeterministically overshoot, since the Intel
CPU uses the local APIC to generate performance counter
overflow interrupts instead of generating precise exceptions
internally. The propagation latency is around 80 instructions on average, and the maximum latency we observed is around
250. Deterland therefore pauses the VM 250 instructions
early, then uses Monitor TF to single-step the VM to the
target instruction count.
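The sketch below outlines this two-phase approach: run the guest with the retired-instructions counter armed to overflow a safety margin before the target, then single-step the remainder with the monitor trap flag. The helper functions and the 250-instruction margin constant are assumptions based on the measurements above, not the actual Deterland code.

#include <stdint.h>

#define OVERSHOOT_MARGIN 250   /* worst observed PMC interrupt latency, in instructions */

/* Assumed low-level helpers provided elsewhere in the hypervisor. */
extern void     pmc_arm_retired_instructions(uint64_t count);   /* overflow after count */
extern uint64_t pmc_read_retired_instructions(void);
extern void     vmcs_set_monitor_trap_flag(int on);
extern void     vm_enter(void);                                 /* run guest until next VM exit */

/* Run the guest for exactly `target` more instructions. */
static void run_guest_precisely(uint64_t target)
{
    uint64_t done = 0;

    /* Phase 1: run at full speed, stopping a margin before the target. */
    if (target > OVERSHOOT_MARGIN) {
        pmc_arm_retired_instructions(target - OVERSHOOT_MARGIN);
        vm_enter();                     /* exits on counter overflow (plus some overshoot) */
        done = pmc_read_retired_instructions();
    }

    /* Phase 2: single-step the remaining instructions with MTF. */
    vmcs_set_monitor_trap_flag(1);
    while (done < target) {
        vm_enter();                     /* MTF exit after one guest instruction */
        done++;
    }
    vmcs_set_monitor_trap_flag(0);
}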
Since the performance counters are highly nondetermin-
istic, Deterland hides them from guest VMs by counterfeit-
ing CPUID results and intercepting MSR accesses.
5.2 Artificial Time
The vTimer does not directly calculate artificial time, but
only calculates instruction counts. The “time” the VM per-
ceives is generated by the vRTC and vPIT, which use the vir-
tual CPU frequency to calculate the number of instructions
to execute between guest VM interrupts.
Figure 10: Interrupt routing
Deterland intercepts the rdtsc instruction and accesses
to the MSR alias of the TSC. The vTimer then uses the
number of executed instructions as the value of TSC and
simulates a vCPU that executes exactly one instruction per
cycle, to support OS and application code that relies on the
TSC without compromising determinism.
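A minimal sketch of this rdtsc emulation, assuming hypothetical helpers for reading the vTimer and writing guest registers:

#include <stdint.h>

extern uint64_t vtimer_now(void);                 /* guest instructions executed so far */
extern void guest_write_edx_eax(uint64_t value);  /* store a 64-bit result in EDX:EAX */
extern void guest_advance_rip(int bytes);

/* VM-exit handler for an intercepted rdtsc instruction: report the
 * deterministic instruction count as the "cycle" count. */
static void emulate_rdtsc(void)
{
    guest_write_edx_eax(vtimer_now());
    guest_advance_rip(2);              /* rdtsc is a two-byte instruction (0F 31) */
}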
5.3 Interrupts
Deterland disables the LAPIC timer, enables performance
counter overflow interrupts on the CPU cores the VM runs
on, and configures the IOAPIC to reroute all external inter-
rupts to the core the mitigator runs on. The complete inter-
rupt routing scheme is shown in Figure 10. Guest VMs are
thus preempted by performance counter overflows, instead
of by the true timer. Deterland uses IPIs for IPC since each
guest VM runs under the control of a CertiKOS process.
Virtual interrupts are queued to the vPIC first, then in-
jected into the VM as soon as IF in the guest’s eflags is
set. Virtualization of MSI is not yet implemented, and the
interrupt numbers for devices are currently hardcoded.
5.4 I/O Virtualization
For simplicity, Deterland currently uses port I/O instead of
memory-mapped I/O. MMIO and port I/O each cost one VM
exit because Deterland intercepts all I/O operations. MMIO
requires the VMM to walk the guest page tables and decode the faulting instruction to determine the actual I/O operation. Port I/O is much
simpler, however, and the port address space is sufficient for
all simulated and virtio devices.
6. Evaluation
This section evaluates the practicality of VM-wide timing
mitigation, focusing on measuring the performance over-
head of Deterland across different configurations. Experi-
ments are run on a commodity PC system consisting of a 4-
core Intel Core i5-4705S CPU running at 3.2GHz with 16GB
of 1600MHz DDR3 RAM, a 128GB SSD and a gigabit Eth-
ernet card (Realtek 8111G). This system design corresponds
roughly to a low-cost cloud hardware configuration.
We compare execution on Deterland against QEMU/KVM
2.1.0 [31] running on Ubuntu 14.10. For a fair comparison,
we enable VT-x and disable VT-d in the BIOS, so that KVM
uses the same CPU features for virtualization. The guest OS
is Ubuntu 14.04.2 LTS server version.
As Deterland, by design, does not expose real time to the
VM, we measure performance overhead using vmcalls that
report hypervisor statistical data to the VM via x86 registers.
These calls are included strictly for experimentation and are
not part of the production hypervisor interface.
We report performance overhead versus KVM along with
an upper-bound estimate of information leakage rate. This
estimated leakage is a loose bound reflecting total possi-
ble information flow across the mitigation boundary, includ-
ing noise and known public information. It seems unlikely
that practical data-exfiltration attacks could come anywhere
close to this maximum leakage rate, except perhaps with the
deliberate help of conspiring “trojan” code in the victim it-
self.
6.1 Cloud Configurations
Deterland depends on three main tunable parameters: miti-
gation interval, vCPU speed and precise/coarse mode. The
vCPU speed may heavily impact the performance of CPU-
bounded applications, as it limits the number of instructions
executed in a time slot. The mitigation interval will likely
impact the performance of I/O-bounded applications, as it
adds delays for mitigation purposes. Precise mode intro-
duces more overhead than coarse mode due to the need to
perform precise instruction-counting in software.
We examine different settings and report both perfor-
mance overhead and maximum information leakage. Infor-
mation leakage rate does not strictly reflect Shannon entropy,
but is approximated by the number of missed deadlines, as-
suming the VM normally finishes every segment in time.
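A sketch of how such an estimate could be computed from hypervisor counters, assuming at most one bit leaks per missed deadline; the structure and names are hypothetical.

#include <stdint.h>

struct leak_stats {
    uint64_t missed_deadlines;   /* real-time slots in which the segment did not finish */
    uint64_t elapsed_seconds;    /* wall-clock duration of the benchmark */
};

/* Estimated leakage in bits per second: one bit charged per missed deadline. */
static double estimated_leakage_bps(const struct leak_stats *s)
{
    if (s->elapsed_seconds == 0)
        return 0.0;
    return (double)s->missed_deadlines / (double)s->elapsed_seconds;
}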
VM entry and exit overhead is categorized as VMX, and is
measured by the number of VM exits. The time for one VM exit is measured by single-stepping a piece of code. We use 1.0445 microseconds per VM exit for all the benchmarks.
6.2 Micro-Benchmarks
The micro-benchmarks focus on three major computing re-
sources a cloud provides: CPU (§6.2.1), network I/O (§6.2.2)
and disk I/O (§6.2.3).
6.2.1 CPU
The CPU benchmark uses the built-in CPU test in sys-
bench [33] to check 20000 primes. We tested different set-
tings of vCPU speed and mitigation intervals. This experi-
ment aims to reveal Deterland’s overhead over KVM.
Figure 11a and Figure 12a show performance results and
maximum leakage. Figure 13a breaks down execution time.
Performance slowdown is almost linear in vCPU frequency.
Performance saturates near 6.4GHz because the CPU is ca-
pable of executing two simple instructions in one cycle, or
6.4 billion instructions per second.
Clearly the precise mode introduces extra overhead due
to single-stepping. The overhead for single-stepping alone
is 30% for 1ms mitigation interval, including the time spent
in the hypervisor and during VM entries and VM exits. The
same overhead is 5% in total for 100ms interval.
Almost all remaining overhead comes from mitigation,
because the CPU-bounded application mostly executes in-
structions that do not cause VM exits. A cloud provider
might in practice schedule another VM instance during the
mitigation period of this VM. Thus the actual overhead to the
cloud provider is small in coarse mode, or in precise mode
with a long mitigation interval.
The higher red line in Figure 12a is the theoretical upper
bound of information leakage rate for 1ms mitigation inter-
val, while the lower red line is for 100ms mitigation interval.
The estimated information leakage rate is significantly lower
than the theoretical upper bound, and much of this estimated
leakage is in turn likely to be noise.
6.2.2 Network I/O
Network overheads for TCP and UDP were obtained via
iperf [46], with default window sizes (47.3K for TCP and
160K for UDP). All tests run on a 3.2GHz vCPU.
Figure 11b and Figure 12b show performance and max-
imum leakage. Figure 13b breaks down execution time. As
an I/O bounded application, iperf does not execute the max-
imum number of instructions per time slot as in the CPU
benchmark, because the kernel executes hlt when all pro-
cesses are blocked by I/O.
The VM is under-utilized in most settings, but may still
leak information. This leakage is caused by the hypervisor,
as the hypervisor still needs to move data between guest
memory and the shared data buffer. The mitigator may not
receive bundled requests in time because the hypervisor is
busy copying data. The estimated information leakage rate
is much less than the theoretical maximum for TCP because
traffic is low, while the leakage rate for UDP is higher due
to higher-volume traffic. However, this estimated leakage
mostly reflects public information about traffic volume, and
is unlikely to contain much truly sensitive information.
UDP bandwidth changes over mitigation intervals as ex-
pected. When the mitigation interval is too short, the hyper-
visor spends significant time on VM entries and exits, in-
stead of sending packets. In contrast, when the mitigation
interval is too long, the VM may not have enough virtio de-
scriptors to send packets, as most of them are buffered in the
hypervisor or the mitigator.
TCP bandwidth exhibits more erratic behavior, because
the guest’s congestion control algorithms are severely im-
pacted by the effects of mitigation. The reported result is
measured with CUBIC; other algorithms aside from BIC are
basically unusable. These results are understandable since
congestion control is highly timing-sensitive. A future op-
timization might therefore be to move the TCP stack from
the guest VM into the hypervisor and give the guest a more
latency-tolerant socket-proxy interface.
6.2.3 Disk I/O
Figure 11: Performance overhead for micro benchmarks (slowdown ratio): (a) Raw CPU performance vs. vCPU frequency; (b) Raw network performance vs. mitigation interval; (c) Raw disk performance vs. mitigation interval; (d) Filesystem performance vs. mitigation interval.
Figure 12: Estimated information leakage rate (bit/s) for micro benchmarks: (a) Leakage for CPU benchmark; (b) Leakage for network benchmark; (c) Leakage for disk benchmark; (d) Leakage for filesystem benchmark.
Raw disk The disk benchmark uses fio [8] for both synchronous disk I/O and Linux's native asynchronous disk I/O
(libaio). Synchronous disk I/O operations are expected to ex-
hibit latencies up to three times as long as the mitigation in-
terval, as discussed in §4.3.1. The total data size is 16GB for
the libaio test, and 1GB for the synchronous test.
Figure 11c and Figure 12c show performance results and
maximum leakage. Figure 13c breaks down execution time.
The estimated leakage rate is an order of magnitude lower
than the theoretical upper bound. As with the network
benchmark, the estimated leakage rate is largely caused by
data copying in the hypervisor, and is unlikely to contain
much truly sensitive information.
The asynchronous disk I/O basically keeps the maximum
rate for all mitigation intervals, except in the 100ms interval
case due to a lack of virtio descriptors. Deterland is 50%
slower than KVM because the underlying AHCI driver is not
well-optimized. The throughput is still more than a spinning
magnetic disk offers, however.
Synchronous disk I/O, on the other hand, shows a recip-
rocal rate as expected. With a 1ms mitigation interval, the
throughput is 10 times slower than that of KVM, and the VM
nearly stalls when the mitigation interval is more than 10ms.
Fortunately, many disk I/O bounded applications, such as
databases, use asynchronous I/O and data buffer pools to tol-
erate disk latency. Modern filesystems also have features like
buffering and read-ahead to improve performance.
Figure 13: Execution time breakdowns (fractions of total execution time spent in guest, VMX, MTF, mitigation, and other): (a) Breakdown for CPU benchmark; (b) Breakdown for network benchmark; (c) Breakdown for disk benchmark; (d) Breakdown for filesystem benchmark.
Filesystem The filesystem benchmark also uses fio for
three popular filesystems: btrfs, ext4 and ext2. We turn on
LZO compression for btrfs, and journaling for both ext4
and btrfs. We simulate a common filesystem access pat-
tern by performing random reads and writes on files. The
file size ranges from 1KB to 1MB, and each I/O operation
reads or writes between 4KB and 32KB of data. We use the
O_DIRECT flag to skip the kernel data buffer, so as to reveal
the impact of filesystems themselves.
Figure 11d and Figure 12d show performance results and
maximum leakage. Figure 13d breaks down execution time.
The ext filesystems show a pattern similar to the syn-
chronous disk I/O benchmark, since most filesystem oper-
ations are synchronous. The journaling feature does not pro-
vide a performance benefit in this case.
The btrfs filesystem performs much better when the mitigation interval is long: btrfs reads and writes the same
amount of data with many fewer disk I/O operations, be-
cause of its compression feature and a better buffering mech-
anism. However, when the mitigation interval is short, its
compression operations produce more overhead.
6.3 Macro Benchmarks
We use several benchmark suites to explore realistic cloud
workloads. The PARSEC [11] benchmark represents sci-
entific computation workloads. OLTP-Bench [17] repre-
sents typical relational database workloads. The YCSB [15]
benchmark represents key-value storage workloads.
PARSEC Complementing the CPU micro-benchmark, PAR-
SEC measures performance overhead for more complex
compute-intensive applications. Instead of simple instruc-
tions, PARSEC applications exercise more CPU features
such as FPU and cache. We compare Deterland’s PARSEC
performance against KVM with 3.2GHz real CPU speed.
The results are shown in Figure 14. The slowdown ratio
is near 1 for all applications when the vCPU speed is 6.4GHz.
Most benchmarks show sub-linear slowdown ratios, as the relevant functional units are saturated, such as the FPU for blackscholes and streamcluster. The only appli-
cation showing nearly linear slowdown is fluidanimate,
which performs frequent lock operations; lock operations on
a single-core vCPU degenerate to normal instructions. Per-
formance is fairly competitive when the vCPU frequency is
3.2GHz, regardless of mitigation interval.
Storage We test the MySQL server with the TPC-C work-
load from OLTP-Bench, with local clients and remote clients.
We also test the Cassandra server with three different work-
loads from YCSB, with local clients and remote clients. The
backend filesystems for both MySQL and Cassandra are
ext4 and btrfs, with the same configuration as in the filesys-
tem micro benchmark. The remote clients run on a Linux
machine outside the mitigation boundary.
Figure 15a shows performance overhead, and Figure 15b
shows latency overhead. Overall performance is acceptable
with a 1ms mitigation interval. Cassandra shows a better
slowdown ratio than MySQL, as it has less chance of holding
locks during mitigation. Cassandra is also more tolerant of
mitigation level for local clients, as it has simpler buffering
requirements than those of MySQL. Cassandra maintains the
same slowdown ratio when the mitigation interval becomes
10ms, whereas MySQL becomes unusable.
The remote benchmark, however, reveals the impact of
network latency. Both Cassandra and MySQL suffer signif-
icant performance drop when the mitigation level increases
to 10ms. The average latency remains nearly the same for all local benchmarks, while the remote latency is higher due to extra network mitigation in addition to disk mitigation.
Figure 14: Performance overhead for scientific computation benchmarks (slowdown ratio vs. vCPU frequency for fluidanimate, streamcluster, swaptions, blackscholes, and canneal): (a) PARSEC with 1ms mitigation interval; (b) PARSEC with 100ms mitigation interval.
Figure 15: Performance overhead for storage benchmarks (MySQL TPC-C and Cassandra YCSB workloads with local and remote clients on ext4 and btrfs): (a) Throughput of YCSB and OLTP-Bench; (b) Latency of YCSB and OLTP-Bench.
6.4 Deployment Considerations
As the benchmarks indicate, we find the best mitigation in-
terval for legacy applications to be around 1ms. Estimated
information leakage is around 100bps for all benchmarks,
and theoretically bounded at 1Kbps. Of course, much of
this estimated leakage is unlikely to be sensitive information
useful to an attacker. If the VM constantly sends network
packets at a high rate, the current Deterland prototype spends
more than 1ms handling all the VM exits and data buffering,
and exhibits a relatively fixed pattern of missed deadlines.
This fixed pattern, if predictable, does not actually leak sen-
sitive information. CPU-bound applications exhibit little
overhead. Network-bound applications lose no more than
50% of their throughput for both UDP and TCP, compared to
KVM. Cloud providers such as Rackspace are known to
limit network bandwidth for VM instances anyway, which
may hide this throughput impact. While filesystem perfor-
mance suffers, well-designed applications like databases
handle this reasonably well and incur no more than 60%
throughput loss in exchange for mitigation.
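For intuition, the bound quoted above follows directly from the mitigation interval: assuming an external observer learns at most one bit per interval (whether or not the output deadline was met), the worst-case leakage rate is simply the reciprocal of the interval. The short sketch below (illustrative Python, not Deterland code) reproduces the 1Kbps bound for a 1ms interval.

    # Back-of-the-envelope bound on external leakage, assuming at most one
    # observable bit (deadline met or missed) per mitigation interval.
    def worst_case_leakage_bps(interval_s, bits_per_interval=1.0):
        return bits_per_interval / interval_s

    print(worst_case_leakage_bps(0.001))  # 1ms interval  -> 1000.0 bps, i.e. 1Kbps
    print(worst_case_leakage_bps(0.010))  # 10ms interval ->  100.0 bps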
The ideal vCPU speed depends on the characteristics of
the underlying physical CPU. However, most VM instances
in the cloud do not execute a full allocation of instructions in
a given segment. As a result, even if the vCPU speed is set
high, the VM will likely be under-utilized.
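To make the instruction-allocation point concrete, suppose the vCPU speed is realized as a fixed instruction budget per mitigation interval (an illustrative simplification; Deterland's exact accounting may differ). A 6.4GHz vCPU with a 1ms interval is then allotted 6.4 million instructions per segment, far more than a lightly loaded VM typically retires:

    # Illustrative per-segment instruction budget, assuming the budget is
    # simply vCPU frequency times mitigation interval (a simplification).
    def segment_budget(vcpu_hz, interval_s):
        return vcpu_hz * interval_s

    print(segment_budget(6.4e9, 0.001))  # 6.4GHz vCPU, 1ms interval -> 6.4e6 instructions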
In summary, CPU-bound applications may not notice any
appreciable difference when running in Deterland. Applications
using asynchronous I/O exhibit moderate overhead, less than
30%, with a 1ms mitigation interval. Applications that depend
heavily on synchronous I/O may not be suitable for Deterland.
Deterland represents only one of many possible design
points for systematic timing-channel mitigation in the cloud,
and it generally emphasizes generality and implementation
simplicity over other considerations. Further optimizations,
new hardware features, and specialization to particular
applications or security models of interest could likely
improve Deterland's performance and practicality substantially.
7. Related Work
Timing channel control has been well-studied in commercial
security systems for decades [10, 12, 21, 28, 32, 39]. We now
discuss related work grouped into three categories: controlling
internal timing channels, controlling external timing channels,
and detecting timing channels.
7.1 Controlling Internal Timing Channels
Controlling cache-based timing channels Wang et al.
propose refining the processor architecture to minimize
information leakage through cache-based timing channels [49].
Oswald et al. present a method that disables cache sharing,
avoiding cache-based timing attacks [37]. These approaches
are difficult to use in the cloud model, because they either
require specific hardware architectures or undermine the
cloud's economic model of oversubscribing and statistically
multiplexing resources among customers [6].
System-enforced determinism Aviram et al. propose to
eliminate internal timing channels by forcing virtual ma-
chine execution to be deterministic [6]. The proposed ap-
proach does not address external timing channels, how-
ever. In addition, the effort achieves deterministic execution
through Determinator, an OS that imposes new parallel pro-
gramming constraints and limits compatibility with existing
applications and OS kernels.
StopWatch [34] replicates each VM across three physical
machines and uses the median of certain I/O events’ tim-
ing from the three replicated VMs to determine the timings
observed by other VMs co-resident on the same physical
machines. However, StopWatch cannot fundamentally elim-
inate or bound internal timing channels, since attackers still
can learn information by probing multiple victim VMs.
Language-based timing channel control Many language-
based efforts address internal timing channels [27, 52] and
mitigate external timing channels [42, 44, 47]. Zhang et al.
propose a language-level timing mitigation approach, which
can reason about the secrecy of computation time more ac-
curately [54]. In this effort, well-typed programs provably
leak at most a bounded amount of information over time via
external timing channels. Through predictive mitigation, the
approach also offers an expressive programming model, in
which applications are expressible only if their timing leakage
is provably bounded by a desired amount. Unfortunately, these
language-based approaches require programs to be rewritten
in new and unfamiliar languages.
7.2 Controlling External Timing Channels
Additive noise To resist external timing channels, sev-
eral proposals inject additive noise [23, 25, 26]. These ap-
proaches control timing channels only in specific workloads,
however. In addition, stealthy timing channels robust to ad-
ditive noise have been constructed [35].
Predictive mitigation Predictive timing mitigation is a
general scheme for provably limiting leakage through exter-
nal channels [5, 53]. In general, predictive mitigation works
by predicting future timing from past non-sensitive infor-
mation, and enforcing these predictions. When a timing-
sensitive event arrives before the predicted time point, the
event is delayed and released according to the prediction; if a
prediction fails, a new epoch with a longer prediction interval
begins, and information is leaked. This approach not only
tracks the amount of information leaked at each epoch
transition, but also offers a provable guarantee on total leakage.
However, predictive mitigation does not directly address
internal timing channels; a practical virtual machine's workload
is hard to predict; and predictive mitigation cannot identify
which timing variations may leak sensitive information, so it
unnecessarily hurts performance even when most timing
variation is benign.
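For illustration, a minimal predictive mitigator might look like the sketch below (Python, assuming a simple quantum-doubling policy; this is an approximation of the schemes in [5, 53], not Deterland's implementation). Outputs are released only at predicted times, so total leakage grows with the number of epoch transitions rather than with every fine-grained timing variation.

    import time

    class PredictiveMitigator:
        """Release outputs only at predicted times; a misprediction starts a
        new epoch with a longer quantum, which is where information leaks."""
        def __init__(self, quantum=0.001):
            self.quantum = quantum                      # current prediction interval (s)
            self.deadline = time.time() + quantum       # next predicted output time
            self.epoch_transitions = 0                  # leakage grows with this count

        def emit(self, output_ready):
            now = time.time()
            if now < self.deadline:
                time.sleep(self.deadline - now)         # early output: delay to prediction
            if not output_ready():                      # prediction failed at the deadline
                self.epoch_transitions += 1
                self.quantum *= 2                       # start a new, slower epoch
            self.deadline = time.time() + self.quantum  # predict the next output time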
7.3 Detecting Timing Channels
Instead of controlling timing channels, some proposals aim
to detect potential timing channels in programs or systems.
Most existing approaches [13, 22, 38] detect timing channels
either by inspecting some high-level statistic (e.g., entropy)
of network traffic, or by looking for specific patterns. A pow-
erful adversary might be able to circumvent these defenses
by varying timing in a slightly different way, however.
Chen et al. propose to use time-deterministic replay (TDR)
to discover timing channels [14]. A TDR system not only
offers deterministic replay, but also reproduces events that
have non-deterministic timing, thus exposing potential timing
channels. While both TDR and Deterland make use of
determinism, Deterland aims to control timing channels
proactively rather than detect them retroactively.
8. Conclusion
This paper has presented Deterland, a hypervisor that runs
unmodified OS kernels and applications while controlling
both internal and external timing channels. While only a first step,
our proof-of-concept prototype and experiments suggest that
systematic timing channel mitigation may be practical and
realistic for security-critical cloud applications, depending
on configuration parameters and application workload.
Acknowledgments
We thank our shepherd Ken Birman and the anonymous
reviewers for their many helpful suggestions. We also thank
Liang Gu and Daniel Jackowitz for their help in experiments,
as well as David Wolinsky and Ennan Zhai for their help
in copy-editing. This work was funded by NSF under grant
CNS-1407454.
References
[1] Onur Acıiçmez. Yet another microarchitectural attack: Ex-
ploiting I-cache. In 1st ACM Workshop on Computer Security
Architecture (CSAW), November 2007.
[2] Onur Acıiçmez et al. Predicting secret keys via branch predic-
tion. In Cryptographers' Track - RSA Conference (CT-RSA),
February 2007.
[3] Onur Acıiçmez, Werner Schindler, and Çetin K. Koç. Cache
based remote timing attack on the AES. In Topics in
Cryptology–CT-RSA 2007, pages 271–286. Springer, 2006.
[4] Johan Agat. Transforming out timing leaks. In Proceedings
of the 27th ACM SIGPLAN-SIGACT Symposium on Principles
of Programming Languages, pages 40–53. ACM, 2000.
[5] Aslan Askarov, Danfeng Zhang, and Andrew C Myers. Pre-
dictive black-box mitigation of timing channels. In Proceed-
ings of the 17th ACM Conference on Computer and Commu-
nications Security, pages 297–307. ACM, 2010.
[6] Amittai Aviram, Sen Hu, Bryan Ford, and Ramakrishna Gum-
madi. Determinating timing channels in compute clouds. In
Proceedings of the 2010 ACM Workshop on Cloud Computing
Security Workshop, pages 103–108. ACM, 2010.
[7] Amittai Aviram, Shu-Chun Weng, Sen Hu, and Bryan Ford.
Efficient system-enforced deterministic parallelism. In Pro-
ceedings of the 9th USENIX Conference on Operating Systems
Design and Implementation, pages 1–16. USENIX Associa-
tion, 2010.
[8] Jens Axboe. FIO: Flexible I/O tester. http://freecode.
com/projects/fio.
[9] James Matthew Barrie. Peter and Wendy. Hodder &
Stoughton, 1911.
[10] Daniel J Bernstein. Cache-timing attacks on AES,
2005. http://cr.yp.to/antiforgery/
cachetiming-20050414.pdf.
[11] Christian Bienia. Benchmarking Modern Multiprocessors.
PhD thesis, Princeton University, January 2011.
[12] David Brumley and Dan Boneh. Remote timing attacks are
practical. In Proceedings of the 12th Conference on USENIX
Security Symposium-Volume 12. USENIX Association, 2003.
[13] Serdar Cabuk, Carla E Brodley, and Clay Shields. IP covert
timing channels: Design and detection. In Proceedings of
the 11th ACM Conference on Computer and Communications
Security, pages 178–187. ACM, 2004.
[14] Ang Chen, W Brad Moore, Hanjun Xiao, Andreas Haeberlen,
Linh Thi Xuan Phan, Micah Sherr, and Wenchao Zhou. De-
tecting covert timing channels with time-deterministic replay.
In USENIX Symposium on Operating System Design and Im-
plementation (OSDI), 2014.
[15] Brian F Cooper, Adam Silberstein, Erwin Tam, Raghu Ra-
makrishnan, and Russell Sears. Benchmarking cloud serving
systems with YCSB. In Proceedings of the 1st ACM Sympo-
sium on Cloud Computing, pages 143–154. ACM, 2010.
[16] Joseph Devietti, Brandon Lucia, Luis Ceze, and Mark Oskin.
DMP: Deterministic shared memory multiprocessing. In ACM
SIGARCH Computer Architecture News, volume 37, pages
85–96. ACM, 2009.
[17] Djellel Eddine Difallah, Andrew Pavlo, Carlo Curino, and
Philippe Cudre-Mauroux. OLTP-Bench: An extensible
testbed for benchmarking relational databases. Proceedings
of the VLDB Endowment, 7(4), 2013.
[18] Yaozu Dong, Xiaowei Yang, Jianhui Li, Guangdeng Liao, Kun
Tian, and Haibing Guan. High performance network virtu-
alization with SR-IOV. Journal of Parallel and Distributed
Computing, 72(11):1471–1480, 2012.
[19] George W. Dunlap et al. ReVirt: Enabling intrusion analysis
through virtual-machine logging and replay. In 5th USENIX
Symposium on Operating Systems Design and Implementation
(OSDI), December 2002.
[20] George W Dunlap, Dominic G Lucchetti, Michael A Fetter-
man, and Peter M Chen. Execution replay of multiproces-
sor virtual machines. In Proceedings of the 4th ACM SIG-
PLAN/SIGOPS International Conference on Virtual Execu-
tion Environments, pages 121–130. ACM, 2008.
[21] Edward W Felten and Michael A Schneider. Timing attacks
on web privacy. In Proceedings of the 7th ACM conference on
Computer and communications security, pages 25–32. ACM,
2000.
[22] Steven Gianvecchio and Haining Wang. Detecting covert tim-
ing channels: an entropy-based approach. In Proceedings of
the 14th ACM Conference on Computer and Communications
Security, pages 307–316. ACM, 2007.
[23] J Giles and B Hajek. An information-theoretic and game-
theoretic study of timing channels. IEEE Transactions on
Information Theory, 48(9):2455–2477, 2002.
[24] Liang Gu, Alexander Vaynberg, Bryan Ford, Zhong Shao,
and David Costanzo. CertiKOS: A certified kernel for secure
cloud computing. In Proceedings of the Second Asia-Pacific
Workshop on Systems, 2011.
[25] Andreas Haeberlen, Benjamin C. Pierce, and Arjun Narayan.
Differential privacy under fire. In 20th USENIX Security
Symposium (USENIX Security), August
2011.
[26] Wei-Ming Hu. Reducing timing channels with fuzzy time. In
IEEE Symposium on Security and Privacy, pages 8–20, 1991.
[27] Marieke Huisman, Pratik Worah, and Kim Sunesen. A tem-
poral logic characterisation of observational determinism. In
IEEE 19th Computer Security Foundations Workshop (CSFW),
July 2006.
[28] Ralf Hund, Carsten Willems, and Thorsten Holz. Practical
timing side channel attacks against kernel space ASLR. In
IEEE Symposium on Security and Privacy, pages 191–205.
IEEE, 2013.
[29] Gerry Kane. PA-RISC 2.0 Architecture. Prentice Hall PTR,
1996.
[30] Richard A. Kemmerer. Shared resource matrix methodol-
ogy: An approach to identifying storage and timing channels.
Transactions on Computer Systems (TOCS), 1(3):256–277,
August 1983.
[31] Avi Kivity, Yaniv Kamay, Dor Laor, Uri Lublin, and Anthony
Liguori. KVM: The Linux virtual machine monitor. In
Proceedings of the Linux Symposium, volume 1, pages 225–
230, 2007.
[32] Paul Kocher. Timing attacks on implementations of Diffie-
Hellman, RSA, DSS, and other systems. In Advances in
Cryptology (CRYPTO), pages 104–113. Springer, 1996.
[33] Alexey Kopytov. SysBench: A system performance bench-
mark. http://sysbench.sourceforge.net.
[34] Peng Li, Debin Gao, and Michael K Reiter. StopWatch:
a cloud architecture for timing channel mitigation. ACM
Transactions on Information and System Security (TISSEC),
17(2):8, 2014.
[35] Yali Liu, Dipak Ghosal, Frederik Armknecht, Ahmad-Reza
Sadeghi, Steffen Schulz, and Stefan Katzenbeisser. Hide and
seek in time – Robust covert timing channels. In 14th Eu-
ropean Symposium on Research in Computer Security (ES-
ORICS), September 2009.
[36] Gil Neiger, Amy Santoni, Felix Leung, Dion Rodgers, and
Rich Uhlig. Intel virtualization technology: Hardware sup-
port for efficient processor virtualization. Intel Technology
Journal, 10(3), 2006.
[37] Elisabeth Oswald, Stefan Mangard, Norbert Pramstaller, and
Vincent Rijmen. A side-channel analysis resistant description
of the AES S-Box. In Fast Software Encryption Workshop
(FSE), February 2005.
[38] Pai Peng, Peng Ning, and Douglas S. Reeves. On the secrecy
of timing-based active watermarking trace-back techniques.
In IEEE Symposium on Security and Privacy (S&P), May
2006.
[39] Colin Percival. Cache missing for fun and profit. In BSDCan,
May 2005.
[40] Thomas Ristenpart, Eran Tromer, Hovav Shacham, and Stefan
Savage. Hey, you, get off of my cloud: Exploring information
leakage in third-party compute clouds. In 16th ACM Con-
ference on Computer and Communications Security (CCS),
pages 199–212, 2009.
[41] Rusty Russell. virtio: Towards a de-facto standard for vir-
tual I/O devices. ACM SIGOPS Operating Systems Review,
42(5):95–103, 2008.
[42] Andrei Sabelfeld and David Sands. Probabilistic noninter-
ference for multi-threaded programs. In Computer Security
Foundations Workshop, pages 200–214, 2000.
[43] Claude Elwood Shannon and Warren Weaver. The mathemat-
ical theory of communication. University of Illinois Press,
1959.
[44] Geoffrey Smith and Dennis M. Volpano. Secure information
flow in a multi-threaded imperative language. In 25th Sympo-
sium on Principles of Programming Languages (POPL), Jan-
uary 1998.
[45] Deian Stefan, Alejandro Russo, Pablo Buiras, Amit Levy,
John C Mitchell, and David Mazières. Addressing covert
termination and timing channels in concurrent information
flow systems. In ACM SIGPLAN International Conference
on Functional Programming (ICFP), 2012.
[46] Ajay Tirumala, Feng Qin, Jon Dugan, Jim Ferguson, and
Kevin Gibbs. Iperf: TCP and UDP bandwidth measurement
tool. http://iperf.sourceforge.net.
[47] Dennis M. Volpano and Geoffrey Smith. Eliminating covert
flows with minimum typings. In 10th Computer Security
Foundations Workshop (CSFW), June 1997.
[48] Zhenghong Wang and Ruby B. Lee. Covert and side channels
due to processor architecture. In 22nd Annual Computer
Security Applications Conference (ACSAC), December 2006.
[49] Zhenghong Wang and Ruby B. Lee. A novel cache architecture
with enhanced performance and security. In IEEE/ACM 41st
International Symposium on Microarchitecture (MICRO), 2008.
[50] Michael Weiss, Benedikt Heinz, and Frederic Stumpf. A
cache timing attack on AES in virtualization environments. In
Financial Cryptography and Data Security, pages 314–328.
Springer, 2012.
[51] John C. Wray. An analysis of covert timing channels. In IEEE
Symposium on Security and Privacy, May 1991.
[52] Steve Zdancewic and Andrew C Myers. Observational deter-
minism for concurrent program security. In Computer Secu-
rity Foundations Workshop, pages 29–43. IEEE, 2003.
[53] Danfeng Zhang, Aslan Askarov, and Andrew C Myers. Pre-
dictive mitigation of timing channels in interactive systems.
In Proceedings of the 18th ACM conference on Computer and
communications security, pages 563–574, 2011.
[54] Danfeng Zhang, Aslan Askarov, and Andrew C Myers.
Language-based control and mitigation of timing channels. In
ACM SIGPLAN Conference on Programming Language De-
sign and Implementation (PLDI), 2012.
[55] Xiao Zhang, Sandhya Dwarkadas, and Kai Shen. Towards
practical page coloring-based multicore cache management.
In Proceedings of the 4th ACM European Conference on Com-
puter Systems, pages 89–102. ACM, 2009.
[56] Yinqian Zhang, Ari Juels, Alina Oprea, and Michael K Reiter.
HomeAlone: Co-residency detection in the cloud via side-
channel analysis. In IEEE Security and Privacy (SP), pages
313–328, 2011.
[57] Yinqian Zhang, Ari Juels, Michael K Reiter, and Thomas
Ristenpart. Cross-vm side channels and their use to extract
private keys. In Proceedings of the 2012 ACM Conference
on Computer and Communications Security, pages 305–316.
ACM, 2012.