This post is part one of a two part internal blog entry on creating a Pintool for an assessment. Unfortunately I cannot talk about it, so I decided to put the first part out. If I find an opensource program similar to the assessment I will try and recreate the tool (but I am not holding my breath). As this part is essentially a build up, it may not be coherent at times. Alteratively, if you really want to read it, you can join us. We are almost always hiring (let me do the referral though ;).
Today we are going to talk about discovering encryption keys in sneaky ways. We will start with simple examples, do a bit of Digital Forensics or DF (for low standards of DF) and finally in part two we will use our recently acquired knowledge of Pintool to do [redacted]
.
First let's talk a bit about the inner-workings of AES decryption. By inner-workings of AES I do not mean the following diagrams that you have seen so many times.
Instead I am going to talk about what happens inside those rectangles labeled “block cipher encryption/decryption.” If you don't want to know about the AES stuff, jump directly to 2. AES Keys in Action.
1. Thinking Inside the Box
Each of these boxes consist of a few rounds. The number of rounds is based on key size in AES. Keep in mind that AES is a subset of the Rijndael family of ciphers (and I still do not know how to pronounce the name). NIST (National Institute of Standards and Technology) selected a fixed block size (16 bytes) and three different key sizes (128, 192 and 256 bits) and called it AES (Advanced Encryption Standard) because that's what NIST does (other than allegedly embedding backdoors in almost never used random number generators, see DUAL_EC_DRBG ;)). You do not need to memorize the formula that calculates the number of rounds based on key and block size. You can see the result of my painstaking calculations in the following table:
|
| Key Size (bits) | Number of Rounds (potatoes) |
|------------------|-----------------------------|
| 128 | 10 |
| 192 | 12 |
| 256 | 14 |
|
That was easy. So what happens inside each of these rounds. Except the last round, there are four steps in each round (encryption/decryption). For the remainder of this section I am going to assume that we are using a 128-bit key (16 bytes) resulting in 10 rounds.
There are four different operations but I am going to go into some detail about AddRoundKey
. It is also the only operation which introduces an unknown element (key) into the process. The other operations are also simple and we can probably guess what they do based on their names.
1.1 AddRoundKey
It's a simple XOR. A 16 byte round key is XOR-ed with the current block. If we count the number of AddRoundKey
operations for Nr==10, we get 11. But we only have one 16 byte key and need 16*11 or 176 bytes.
“How am I going to create the extra 160 (176-16) bytes?” one may ask. This is done through some magic known as key expansion
which creates bytes out of thin air. It expands the original key into the 176 bytes also known as key schedule
.
1.1.1 AES Key Schedule (aka Rijndael Key Schedule)
The key expansion algorithm takes the original key and returns the key schedule. I could talk about the boring details of it but you are already bored and I am lazy. Search for Rijndael Key Schedule if you want to know more. Instead we are going to talk about some interesting stuff.
Don't make the convenient mistake of thinking of the key schedule as a Pseudo-Random Number Generator (PRNG) where we enter the original key as the seed and then reap bytes. In a good PRNG, we should not be able to discover the seed by observing the output. In the Rijndael/AES key schedule there is direct correlation between the original key and each round key.
In AES-128, knowing a single round key (regardless of round number) is enough to generate the original key. In AES-256 we need to know two consecutive round keys and that is a good thing for AES-256. If not, the schedule had reduced the entropy of a 256-bit key to 128 bits. In a lot of hardware (a.k.a limited on-board memory), the first (actual encryption key) and last round keys (first two and last two round keys for AES-256) are stored for encryption/decryption and the rest are generated when needed from them.
Also by looking at the key schedule, we can see that the original AES key is copied directly to the start of the key schedule. In other words, the first 128 bits (16 bytes) of the AES-128 key schedule are the same as the original key.
1.1.2 Round Key Usage
Great, so we have 16 bytes that are XOR-ed with something in each round. For decryption, we can create the key schedule and inverse it. This works because XOR is transitive (i.e. If ciphertext = plaintext XOR key then plaintext = ciphertext XOR key).
Notice the first AddRoundKey block in both encryption and decryption. In encryption this is first 16 bytes of the original key (or the whole key in case of AES-128). In decryption, this is the last round key. Keep this in mind, we are going to use it later.
2. AES Keys in Action
By now we know how AES keys are used. Let's do some stuff. We're going to use the same set up as last time. A Kali 32-bit VM running in VirtualBox.
2.1 Function Calls
External function calls leak information. I am going to divide them into two parts System Calls
(syscalls) and Library Calls
. Basically these are functions that you can call and use in your program. If these functions part of the Operating System they are System Calls and if they are provided by a 3rd party library (shared library, DLL etc) they are Library Calls. For an excellent description of system calls, read the blog post by Gustavo Duartes named System Calls Make the World Go Round (also read the rest of his blog).
2.1.1 OpenSSL Example
Our example will be a simple Encryption/Decryption program in C using OpenSSL modified from [http://wiki.openssl.org/index.php/EVP_Symmetric_Encryption_and_Decryption. It will encrypt and decrypt the string “The quick brown fox jumps over the lazy dog” with AES using the 256 bit (32 byte) key ee12c03ceacdfb5d4c0e67c8f5ab3362
and IV d36a4bf2e6dd9c68
(128 bits or 16 bytes). My comments start with ///
.
|
|
we need the libssl-dev
library which can be installed by sudo apt-get install libssl-dev
. To compile use gcc [filename].c -o [outputfile] -lcrypto -ggdb
. We will use the debug information in GDB later. Here is the output:
|
|
2.2 Monitoring Library Calls
To monitor these calls, we have a few tools at hand. On *nix operating systems we can use strace (for system calls) and ltrace (for both syscalls and library calls). On Windows we can use API Monitor as I have used in the past. If you have a Mac you can try your luck with dtruss which uses dtrace. I am not quite sure if it can be used to trace library calls and if it works on iOS.
2.2.1 Discovering Shared Libraries
Assuming we are approaching this application from a black-box perspective, we need to discover the shared libraries first. This can be done in different ways. We will talk about ldd
, nm
, strings
or just ltrace
. Just using ltrace may do the job but if there are a lot of library calls, we need to spot critical/interesting libraries to filter out the noise.
2.2.1.1 ldd
ldd
“prints shared library dependencies” according to the man page. Let's run it.
|
|
In line 3 we can see libcrypto which means the application is using OpenSSL (the other OpenSSL library is libssl
).
2.2.1.2 nm
nm
“lists symbols from object files.” It's a good idea to look at its output and look for familiar symbols. We can clearly see OPENSSL and function names in the truncated output.
|
|
2.2.1.3 strings
strings
is useful because it may leak great information about the binary. It will give us the key and IV directly in our example. We can also see OpenSSL and libcrypto strings. It also gives us the version of the used OpenSSL library.
|
|
2.3 Using ltrace to Find the Key
Finally let's run ltrace on the binary. The i
switch prints the value of instruction pointer at the time of library call (we will need it later). You can also trace syscalls using the S
(capital S) switch.
|
|
In a non-ideal situation, we have to either recognize the good functions from past experience or search them all. Here we are looking for a function with key and IV as parameters. According to the documentation EVP_DecryptInit_ex
is what we are looking for:
int EVP_DecryptInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *type, ENGINE *impl, unsigned char *key, unsigned char *iv);
But what are these values:[0x8048b0b] EVP_DecryptInit_ex(0x90bdce0, 0xb7735040, 0, 0x8048d50, 0x8048d71) = 1
These are pointers and are 4 bytes each (remember we are in a 32-bit OS). “But where are these pointers pointing to? Do I have to use GDB?” Yes, we had to use GDB before I knew that we can configure ltrace to dereference pointers. But we will use GDB too.
2.3.1 Configuring ltrace
If we know the type of pointers, we can dereference them by modifying ~/.ltrace.conf. We can also do more elaborate stuff like defining structs as explained here. In short we can add lines to ltrace.conf for certain functions. In our case we know the 4th and 5th arguments for EVP_DecryptInit_ex are strings (char*). We do not care about the first three arguments so can ignore them by defining them as addr
(for address). Let's add the following line to ltrace.conf:int EVP_DecryptInit_ex(addr, addr, addr, string, string)
run ltrace again and annnnnnnd voila (look at lines 4 for key and IV):1
2
3
4
5
6
7
# most of the output has been removed
EVP_CIPHER_CTX_new(0, 0xb77cc9c0, 0xbfdecdec, 0xb769fda0, 0xb769f910) = 0x9ff5ce0
EVP_aes_256_cbc(0, 0xb77cc9c0, 0xbfdecdec, 0xb769fda0, 0xb769f910) = 0xb7789040
EVP_DecryptInit_ex(0x09ff5ce0, 0xb7789040, NULL, "ee12c03ceacdfb5d4c0e67c8f5ab3362", "d36a4bf2e6dd9c68") = 1
EVP_DecryptUpdate(0x9ff5ce0, 0xbfdecd6c, 0xbfdecd24, 0xbfdecdec, 48) = 1
EVP_DecryptFinal_ex(0x9ff5ce0, 0xbfdecd8c, 0xbfdecd24, 0xbfdecdec, 48) = 1
EVP_CIPHER_CTX_free(0x9ff5ce0, 0xbfdecd8c, 0xbfdecd24, 0xbfdecdec, 48) = 0
2.4 Finding the Key (Using GDB) II: Electric Boogaloo
That was too easy but we pleased a powerful friend. Let's try and find it using GDB (gasp). Good thing that we compiled out binary using the ggdb switch. If not go ahead and do that. We know we are looking for EVP_DecryptInit_ex
and we have already seen how to use GDB. We will set verbose on
(in case stuff happens).
|
|
We can see EVP_CipherInit_ex
called at 0xb7ed3a5e
. Let's put a breakpoint there (right before function call) and look at its arguments.
|
|
We can see the arguments loaded from memory to eax and then pushed to the stack (esp is the stack pointer and points to the top of the stack at all times). We are in a Linux 32-bit OS so arguments (or parameters whatever) are pushed to the stack from right to left (almost the same in 32-bit Windows systems). Here is what it looks like right before the call instruction:
|
|
We can print the values of both key and IV. To do this in GDB we need to use this command x/s *((char **) ( $esp+0x10 ))
. The s switch tells GDB to print the result as a string. $esp+0x10
is a pointer that points to a location on the stack. In that location we have a char *
which is another pointer to a string, so we need to dereference it twice (hence the char **
). And finally to print it using the s
switch we need to make it a string (e.g. char *
) so we will use the first *. And it works.
|
|
2.5 Using GDB without Debug Info
Our example is in a controlled environment, so we were able to build the binary with debug info. But in a real world scenario we do not have this luxury. In this section I will discuss how to get to EVP_DecryptInit_ex
without debug info.
First we have to build our binary without debug info, just remove the -ggdb
switch to get gcc -o sampleaes-nodebug AES-OpenSSL.c -lcrypto
. Now how do we find the location of EVP_DecryptInit_ex
call?
Remember the following line in the original ltrace output.
[0x8048b0b] EVP_DecryptInit_ex(0x90bdce0, 0xb7735040, 0, 0x8048d50, 0x8048d71) = 1
We used the i
switch to print the value of instruction pointer after the call. This is our entry point. We will debug the binary in GDB and set up a breakpoint at 0x8048b0b
and see what happens.
|
|
Again we see the arguments pushed to the stack.
|
|
We put a breakpoint at 0x08048b06
and re-run the binary. Then we can read key and IV like before:
|
|
However, notice the difference in the function name. It is not just called (0xb7ed3a21) EVP_DecryptInit_ex
but (0x08048b06) EVP_DecryptInit_ex@plt
. Addresses are different. Here's a tip which is not scientific or anything but works for me. If you see an address starting with 0×08 you are in process-land and addresses starting with 0xb are in shared library land. But what is this @plt?
In short, it's the Procedure Linkage Table
. The compiler does not know where EVP_DecryptInit_ex
points to at runtime so it just puts the function call there (relocation) because it does not know the address of our shared library at runtime. Linker will get this function call and replace it with the correct address for the function (actually this is a lot more complex but PLT and Global Offset Table or GOT need their own article). You can read about GOT/PLT in The ELF Object File Format by Dissection on Linux Journal (search for “plt” and read 3 paragraphs including the one with lazy binding).
2.6 iOS and Android
I am not going to go into detail about how we can monitor crypto function calls in iOS and Android as we already have two excellent tools that accomplish this. [redacted internal tool]
is for iOS and [[redacted internal tool]]
is for Android. You can make them hook into crypto function calls and find keys. This is left as an exercise to the reader (meaning I am too lazy). There are also two excellent tutorials by two of my co-workers on how to create custom hooks in iOS and Android Substrate - hooking C on Android and iOS part1/2 and Substrate - hooking C on Android and iOS part 2/2.
2.7 Defence?
We saw that function calls (library calls) leak information. One defense against this side-channel is to link the binaries statically. This will replicate the library code inside the binary and will hopefully make the binary independent of any shared libraries (better for installation). On the other hand, it will increase code size (and thus binary size).
3.0 Looking for Key in Memory
But there are ways to defeat that too. This is our small incursion into the lands of Digital Forensics. The keys are going to be on memory. So that's where we are going to look for them. But how do we find keys in memory. One step is to look for data with high entropy because keys usually look random. But there are many 128-bit (or 256) parts of memory that look random so what do we do?
Remember the Key Schedule
? It's the original key, followed by a number of round keys. If we see a 176 byte structure on memory that looks random, that's probably a key schedule. After finding memories with these characteristics, we can use the relation between the round keys and the original encryption key to determine if the structure is a key schedule.
There are tools that do this for us and they were mostly created for use in Cold Boot Attacks and digital forensics. Imagine if you have a computer running disk encryption software. These keys may be stored in memory in plaintext. Open it up while running until you have access to the RAM. Get a can of air spray, turn it upside down and spray the RAM with it. It will freeze. Frozen RAM degrade much slower so we will have more time to read it. Read it and then run tools on it to find keys. Because memory may have been degraded, these tools use the relationship between round keys and original key to recover degraded bits. For more information you can read this paper Lest We Remember: Cold Boot Attacks on Encryption Keys.
3.1 Dumping Memory
First we need to dump process memory. I know of a couple of different tools. One is memfetch by lcamtuf
(creator of American fuzzy lop fuzzer). In order to build it in Kali you need some modifications. Another is shortstop but has not been update for a long time. By using a Loadable Kernel Module (LKM)
named LiME we can make a memory snapshot of the entire machine. And last but not least Volatility (a memory forensics framework). If you are interested the creators of Volatility recently released a book The Art of Memory Forensics. I have not had time to read it but it looks very useful.
Let's use LiME in our VM.
|
|
This dumps Virtual Machine's memory to memorydump.raw
. Now we need to find keys.
3.2 Finding Keys
There are different tools that we can use here again. One is from the “Lest We Remember” paper called aeskeyfind
. Another is Bulk extractor which finds other memory artifacts such as URLs, emails and Credit Card numbers. We will use aeskeyfind
. The v
switch is for verbose mode that prints the key schedule among other information. This is really not recommended in memory forensics because we are running the dump program inside the VM memory and it will alter memory but it is enough for our purposes. Another thing to note is that I was not running our example program while making the memory snapshot but I found encryption keys.
The 0 constraints mean that no keys were degraded (because we took an on a VM). I do not know what the encryption key is, it's just in memory of VM. If you find out please let me know. In order to find the key for our OpenSSL program this way, we need to stop execution when the key schedule is on memory. This is left as an exercise to the reader (lol).
This concludes our part one. I initially wanted to write everything in one blog post but it this was already too long. Hopefully I can find a 3rd party app to demonstrate my technique in part 2. As usual if you have any feedback/questions, you know where to find me (side bar --->).