Thursday, October 1, 2015

Software Protection and Simulation on Oblivious RAMs

In the talk “Software Protection and Simulation on Oblivious RAMs”, we explored a very different and amazing side of Secure Computation that shows how widespread its applications are. The major motivation for designing Oblivious RAM (ORAM) was to prevent duplication of software. We know that to keep the contents of the memory secure, we can use an appropriate encryption scheme to encrypt the program code and data. Is that enough? No - Software protection needs to go beyond that.

An adversary can get valuable information looking at the sequence of memory accesses. Access patterns in programs having properties like, say, a loop structure for instance, can leak some information about the program / the input in the program execution. Consider the example of a sorting program – Suppose the adversary only knows that the program is a sorting algorithm, looking at the access patterns he may be able to figure out what kind of sorting algorithm is used – for example in case of a binary search. An adversary can conduct some experiments with different inputs and get more information than he is supposed to; by looking at the sequence of memory accesses. Oblivious RAM is proposed as a solution to prevent this type of security breach.               

What does “oblivious” mean in the context of machines? – A machine is oblivious if the sequence in which it accesses memory locations is equivalent for any two inputs with the same running time. We consider the following setup - CPU has a small number of registers which can be shielded but the encrypted data and program is not shielded. The adversary cannot inspect or modify the contents of the CPU registers but can see the communication between the CPU and memory. Consider a “smart CPU” that handles the program execution in such a way that the memory access patterns leak no extra information to the adversary. We looked at two different definitions –
1)      Consider two cases when given an encrypted program as input –
(a)    Let { ap1 } denote the memory access pattern when the program is run on smart CPU.
(b)   Let { ap2 } denote the memory access pattern when a dummy program with same running time is run on fake CPU.
The smart CPU achieves obliviousness if the access patterns are computationally indistinguishable. Mathematically, it can be denoted as { ap1} » { ap2}

2)      The conditional probability for a particular input given a sequence of memory accesses which occur during an execution on that input, equals the apriori probability for that particular input. Mathematically, this can be denoted as Pr[Input|{ap}] = Pr[Input]  

Let us look at a couple of ways to design such a smart CPU. Suppose a program runs in “t” steps. Here “t” is the number of memory accesses which can be either a read/ update or a write instruction. Suppose M denotes the content of the RAM. Its size can be denoted by |M| = m. We assume m >>> t. Memory contents of the ORAM are of the form (v, x) where v denotes the virtual address in the actual RAM and x denotes the data at that address.

A trivial way to design a smart CPU that can achieve obliviousness is as follows - For every access, the CPU scans the entire memory content. This will have O(t.m) complexity. This will clearly serve the purpose to defeat the adversary since for every access, irrespective of the input, all the locations of the memory are accessed. So the adversary will not be able to distinguish when the same program is run with two different inputs. The goal has been achieved but at a very high cost. We will see how this can be optimized.

 Let us look at the “square root” algorithm – RAM is simulated by an oblivious RAM of size (m + 2√m) as shown.
                     m words
         m dummy words
     √m shelter words

Initialization : The first (m+√m) locations are allotted to the virtual m memory words and √m dummy words. The remaining √m locations serve to store shelter words as we shall see.

The algorithm runs in epoch of √m steps. In each epoch –
A.      Contents of locations from 1 to (m + √m) are randomly permuted. Let π(v) denote the current location of address v in this epoch.
B.      Shelter is used to maintain the values retrieved in the current epoch.
Consider a single memory access to virtual address ‘v’. There is a boolean variable “found” stored in one of the CPU register initialized to false
1.       First the entire shelter is scanned to check whether (v,x) is present in the shelter. If the address is found, the variable “found” is set to true and the scan is continued till the entire shelter has been scanned.
2.       If the address was not found in the shelter, it is retrieved from the actual word π(v) which is the current location of the content of address v in this epoch.
3.       If the virtual address had been found in the shelter, we still need to make an access in the first (m+√m) locations to maintain the obliviousness. We access a dummy word at location (m + count) which is currently present at π(m + count). “count” is a variable that increments through the √m steps of the epoch so that different dummy words are accessed in each iteration.
4.       Now the value retrieved (updated value if there was a change, the same value otherwise) is encrypted and written into the first empty location at the shelter. All the shelter words are scanned again. Since the encryption is randomized, the adversary will not be able to distinguish if the same value was written back or the updated value.

Analysis - The access pattern is independent of the input – For each virtual access we scan through the entire shelter (√m locations), then we access one location among the first (m + √m) locations (either the actual word location or the dummy word location – this cannot be distinguished since these locations are in permuted order) and during the write-back we access √m locations of the shelter again. Thus we access a total of (2√m + 1) locations for every virtual address access. The adversary will not be able to get any additional information looking at this access pattern.
How is the permutation on the first (m + √m) locations done? The elements are tagged with randomly chosen integers drawn from the set T = {1, 2 …… n2/Ꜫ }. Say τ: {1,2…n} à T is the function that does the random allocation of tags. Then π(i) = k iff  τ(i) is the kth smallest element in { τ(j) : 1<= j <= n} . The n elements can be sorted according to the tags using Batcher’s oblivious sorting. So, suppose we want to access a location i – we actually need to perform a binary search on the tag τ(i). This is how we can retrieve the location π(i) . So what we had considered as a single access for π(i) is actually log (m+√m) accesses needed for the binary search for the tag. We need a total of (2√m + log (m+√m)) = O(√m) accesses for every virtual address access. Thus, this algorithm shows a way to simulate ORAM of size (m + 2√m) and with a complexity of O(t.√m) accesses for t virtual address accesses.
                                              

No comments:

Post a Comment