Over the weekend at Women Who Hack I gave a short demonstration on reverse engineering. I wanted to show how “cracking” works, to give a better understanding of how programs work once they’re compiled. It also serves my abiding interest in processors and other low-level stuff from the 80s.

My goal was to write a program which accepts a password and outputs whether the password is correct or not. Then I would compile the program to binary form (the way in which most programs are distributed) and attempt to alter the compiled program1 to accept any password. I did the demonstration on OS X, but the entire process uses open source tools from beginning to end, so you can easily do this on Windows (in an environment like Cygwin) or on Linux. If you want to follow along at home, I’m assuming an audience familiar with programming, in some form or another, but not much else.

Building a Program

I opened a terminal window and fired up my text editor (Vim) to create a new file called program.c. I wanted to write something that would be easy to understand, edit, and yet still could be compiled, so C seemed like a fine choice. My program wasn’t doing anything that would’ve been strange in the year 1982.

First, I wrote a function for validating a password.

int is_valid(const char* password)
{
    if (strcmp(password, "poop") == 0) {
        return 1;
    } else {
        return 0;
    }
}

This function accepts a string and returns a 1 if the string is “poop” and 0 otherwise2. I’ve chosen to call it is_valid to make it easier to find later. You’ll understand what I mean a few sections down.

Now we need a bit of code to accept a string as input and call is_valid on it.

int main()
{
    char* input = NULL;
    input = malloc(256);3
    printf("Please input a word: ");
    scanf("%s", input);

    if (is_valid(input)) {
        printf("That's correct!\n");
    } else {
        printf("That's not correct!\n");
    }

    free(input);
    return 0;
}

This source code is likewise pretty standard. It prompts the user to type in a string and reads it in to a variable called input. Once that’s done, it calls is_valid with that string. Depending on the result, it either prints “That’s correct!” or “That’s not correct!” and exits, returning control to the operating system. With a couple of “include” directives at the top, this is a fully functioning program4.

Let’s build it! I saved the file program.c and used the command gcc program.c -o program5 to build it.

This outputs a file in the current directory called program which can be executed directly. Let’s run our program by typing ./program. It’ll ask us to put in a word to check. We already know what to put in (“poop”), so let’s do that and make sure we see the result we expect.

Please input a word: poop
That's correct!

And if we run it again and type in the wrong word, we get the other possible result.

Please input a word: butts
That's not correct!

So far, so good.

A Deeper Look

There’s nothing special about this program that makes it different than your web browser or photo editor; it’s just a lot simpler. I can demonstrate this on my system with the file command. Trying it first on the program I just built, with the command file program, I see:

program: Mach-O 64-bit executable x86_64

This is the file format OS X uses to store programs. If this kind of file seems unfamiliar, the reason is that most applications are distributed as app bundles which are essentially folders holding the executable program itself and some ancillary resources. Again, with file, we can see this directly by running file /Applications/Safari.app/Contents/MacOS/Safari:

/Applications/Safari.app/Contents/MacOS/Safari: Mach-O 64-bit executable x86_64

Let’s learn a little more about the binary we just built. We can’t open it in a text editor, or else we get garbage. Using a program called hexdump we can see the raw binary information (translated to hexadecimal) contained in the file. Let’s get a glimpse with hexdump -C program | head -n 20.

00000000  cf fa ed fe 07 00 00 01  03 00 00 80 02 00 00 00  |................|
00000010  10 00 00 00 10 05 00 00  85 00 20 00 00 00 00 00  |.......... .....|
00000020  19 00 00 00 48 00 00 00  5f 5f 50 41 47 45 5a 45  |....H...__PAGEZE|
00000030  52 4f 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |RO..............|
00000040  00 00 00 00 01 00 00 00  00 00 00 00 00 00 00 00  |................|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 00 00 00 00 00 00  19 00 00 00 28 02 00 00  |............(...|
00000070  5f 5f 54 45 58 54 00 00  00 00 00 00 00 00 00 00  |__TEXT..........|
00000080  00 00 00 00 01 00 00 00  00 10 00 00 00 00 00 00  |................|
00000090  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
000000a0  07 00 00 00 05 00 00 00  06 00 00 00 00 00 00 00  |................|
000000b0  5f 5f 74 65 78 74 00 00  00 00 00 00 00 00 00 00  |__text..........|
000000c0  5f 5f 54 45 58 54 00 00  00 00 00 00 00 00 00 00  |__TEXT..........|
000000d0  10 0e 00 00 01 00 00 00  e7 00 00 00 00 00 00 00  |................|
000000e0  10 0e 00 00 04 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 04 00 80 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000100  5f 5f 73 74 75 62 73 00  00 00 00 00 00 00 00 00  |__stubs.........|
00000110  5f 5f 54 45 58 54 00 00  00 00 00 00 00 00 00 00  |__TEXT..........|
00000120  f8 0e 00 00 01 00 00 00  1e 00 00 00 00 00 00 00  |................|
00000130  f8 0e 00 00 01 00 00 00  00 00 00 00 00 00 00 00  |................|

The left column is the “offset,” in hexadecimal (like line numbering, it tells us how many bytes into the file we are on a particular line). The middle two columns are the actual contents of the file itself, again in hexadecimal. The right column shows an ASCII equivalent for the file’s contents, where possible. If you pipe the file’s contents to less you can scan through and see mostly a lot of garbage and also a few familiar strings. If you’re interested in knowing what pieces of text are embedded in a file, the program strings speeds this process up a great deal. In our case, it tells us:

poop
Please input a word:
That's correct!
That's not correct!

So clearly those strings are still floating around in the file. What’s the rest of this stuff? Volumes of documentation exist out there on the Mach-O file format, but I don’t want to bog down in the details. I have to level with you here—I honestly don’t actually know much about it. Analogizing from other executable formats I’ve seen before, I know there’s probably a header of some kind that helps the operating system know what kind of file this is and points out how the rest of the file is laid out. The rest of the file, incidentally, is made up of sections which may contain any of a number of things, including data (the strings in this case) built into the program; information on how to find code called from elsewhere in the system (imports, like our printf and strcmp functions, among others); and executable machine code.

Disassembling the Program

It’s the machine code we’re interested in now. This is the interesting part! Machine code is binary data, a long string of numbers which correspond to instructions the processor understands. When we run our program, the operating system looks at the file, lays it out in memory, finds the entry point, and starts feeding those instructions directly to the processor.

If you’re used to scripted programming languages, this concept might seem a little odd, but it bears on what we’re about to do to our binary. There’s no interpreter going over things, checking stuff, making sure it makes sense, throwing exceptions for errors and ensuring they get handled. These instructions go right into the processor, and being a physical machine, it has no choice but to accept them and execute each one6. This knowledge is very empowering because we have the final say over what these instructions are.

As you may know, the compiler gcc translated my source code I wrote earlier into machine language (and packaged it nicely in an executable file). This allows the operating system to execute it directly, but as another important consequence of this process, we also no longer need the source code. Most of the programs you run likely came as binary executables without source code at all. Others may have source code available, but they’re distributed in binary form.

Whatever the case, let’s imagine I lost the source code to program up above and can’t remember it. Let’s also imagine I can’t even remember the password, and now my program holds hostage important secrets.

You might think I could run the binary through the strings utility, hoping the password gets printed out, and in this case, you’d be on the right track. Imagine if the program didn’t have a single password built in and only accepted passwords whose letters were in alphabetical order or added up (in binary) a specific way. Without the source code, I couldn’t scan to see which strings seem interesting, and I wouldn’t have a clue what to type in.

But we don’t need to lose heart because we already know that the program contains machine code, and since this machine code is meant to be fed directly to the processor, there’s no chance it’s been obfuscated or otherwise hidden. It’s there, and it can’t hide. If we knew how to read the machine code, there would be no need for the source code.

Machine code is hard for a human to read. There’s a nice GNU utility called objdump which helps enormously in this respect. We’ll use it to disassemble the binary. This process is called “disassembly” instead of “decompilation” because we can’t get back the original source code; instead we can recover the names of the instructions encoded in machine code. It’s not ideal, but we’ll have to do our best. (Many people use a debugger to do this job, and there’s a ton of benefits to doing so, like being able to watch instructions execute step by step, inspect values in memory, and so on, but a disassembly listing is simpler and less abstract.)

I looked up the documentation for gobjdump (as it’s called on my system7) and picked out some options that made sense for my purposes. I ended up running gobjdump -S -l -C -F -t -w program | less to get the disassembly8. This is probably more than we’d care to know about our program’s binary, much of it mysterious to me, but there’s some very useful information here too.

The Disassembly

I’ll share at least what I can make of the disassembly. At the top of the listing is some general information. This symbol table is interesting. We can see the names of the functions I defined. If I had truly never seen the source code, I would at this point take an especial amount of interest in a function called is_valid, wouldn’t I?

Immediately below this is a “Disassembly of section .text”. I happen to know from past experience that the “.text” bit is a bit misleading for historical reasons; a “.text” section actually contains machine code! The leftmost column contains offsets (the place in the file where each instruction begins). The next column is the binary instructions themselves, represented in hexadecimal. After that are the names and parameters of each instruction (sometimes with a helpful little annotation left by objdump).

Of course, the very first thing I see is the instructions of the is_valid function.

Disassembly of section .text:9

0000000100000e10  (File Offset: 0xe10):
   100000e10:   55                      push   %rbp
   100000e11:   48 89 e5                mov    %rsp,%rbp
   100000e14:   48 83 ec 10             sub    $0x10,%rsp
   100000e18:   48 89 7d f0             mov    %rdi,-0x10(%rbp)
   100000e1c:   48 8b 7d f0             mov    -0x10(%rbp),%rdi
   100000e20:   48 8d 35 33 01 00 00    lea    0x133(%rip),%rsi        # 100000f5a <strcmp$stub+0x4a> (File Offset: 0xf5a)
   100000e27:   e8 e4 00 00 00          callq  100000f10 <strcmp$stub> (File Offset: 0xf10)
   100000e2c:   3d 00 00 00 00          cmp    $0x0,%eax
   100000e31:   0f 85 0c 00 00 00       jne    100000e43 <is_valid+0x33> (File Offset: 0xe43)
   100000e37:   c7 45 fc 01 00 00 00    movl   $0x1,-0x4(%rbp)
   100000e3e:   e9 07 00 00 00          jmpq   100000e4a <is_valid+0x3a> (File Offset: 0xe4a)
   100000e43:   c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%rbp)
   100000e4a:   8b 45 fc                mov    -0x4(%rbp),%eax
   100000e4d:   48 83 c4 10             add    $0x10,%rsp
   100000e51:   5d                      pop    %rbp
   100000e52:   c3                      retq   
   100000e53:   66 66 66 66 2e 0f 1f 84 00 00 00 00 00  data16 data16 data16 nopw %cs:0x0(%rax,%rax,1)

This is super exciting because we’re about to read assembly language10. There are lots of books and sites on this subject, and my own understanding of assembly language is a bit rusty from years of disuse, but I know enough to get the gist. Let’s break it down.

  • The first three instructions (the first three lines, starting with 100000e10) are a preamble that begin most functions in assembly language generated by a compiler. They’re not important for us. (It saves the old frame pointer, gets a new frame pointer, and clears space on the stack for locals.)11
  • The next two instructions set up for our strcmp function. This looks a bit odd in assembly language compared to what we’re used to. The mov instructions are shifting data from a location in memory to a register12 and vice versa. Because registers are involved, the disassembly wasn’t able to hint very well what these values may be, but we can guess it’s moving the strings to compare into place. I know this because of the calling convention for the function call (basically, set up the data and then make the call, which will know where to find the data); because the %rbp is the base register, which usually points to data; and because -0x10(%rbp) is a way of saying “look sixteen bytes earlier in memory than the address in the %rbp register.”
  • The lea and callq instructions load and call the strcmp function using the parameters we just moved in place. That function lives elsewhere in the system, so some magic happens here to transfer control of our program to that function.
  • By the time we reach the cmp instruction, strcmp has done its thing and stored its result in the accumulator register %eax. By convention13, return values usually live in %eax, so given that we’re using a cmp (“compare”), and it’s acting on %eax and $0x0 (a zero), it’s a safe bet we’re checking to make sure strcmp returned zero. This instruction has the side effect of setting a flag in the processor called ZF to either 1 or 0, depending on if the comparison is true or not.
  • The next instruction is jne which is short for “jump if not equal.” It checks the ZF flag, and if it’s zero, skips ahead twelve bytes (bypassing any instructions in the intervening space).
  • That’s followed by a movl and a jmpq. These instructions move a 1 into a location in memory and skip ahead another seven bytes. Look at the two-digit hexadecimal numbers to the left of these two instructions. They add up to twelve!
  • Likewise, after these instructions, one other instruction moves the value 0 into the same location of memory and continues ahead. This instruction is exactly seven bytes long. So these jumps accomplish one of either two things: either the memory location -0x4(%rbp) is going to hold a 1 or a 0 by the time we get to the final mov. This is how assembly language does an if—a very interesting detail we’ll return to.
  • That last mov puts the value at -0x4(%rbp) (we just saw it’s either a 1 or a 0) into %eax, which we know is going to be the return value.
  • Finally, the function undoes the work from the preamble and returns. (After that is some junk that’s never executed.)

That was a lengthy explanation, so to sum up, we learned that the binary executable has a function called is_valid, and this function calls strcmp with some values and returns either a 1 or a 0 based on its return value. That’s a pretty accurate picture based on what we know of the source code, so I’m pleased as punch!

Directly below the definition for this function is the main function. It’s longer, but it’s no more complex. It does the same basic tasks of moving values around, calling functions, inspecting the values, and branching based on this. Again, the values are difficult to get insight into because many registers are used, and there’s a bit more setup. For the sake of brevity, I’ll leave analyzing this function as an exercise for the reader (I promise it won’t be on the test).

Breaking the Program

Remember, we don’t have the slightest idea what the password is, and there’s no good indication from the disassembly what it might be. Now that we have a good understanding of how the program works, we stand a good chance of modifying the program so that it believes any password is correct, which is the next best thing.

We can’t modify this disassembly listing itself. It’s output from objdump meant to help us understand the machine code (the stuff in the second column). We have to modify the program file itself by finding and changing those hexadecimal numbers somewhere in the file.

After looking over how both is_valid and main work, there are lots of opportunities to change the flow of the program to get the result we want, but we have to stay within a few rules. Notice how a lot of the instructions specify where other parts of the program are in terms of relative counts of bytes? That means that we can’t change the number of bytes anywhere, or else we’d break all the symbol references, section locations, jumps, offsets, and so on. We also need to put in numbers which are valid processor instructions so that the program doesn’t crash.

If this were your first program, I’d be forced to assume you wouldn’t know what numbers mean what to the processor. Luckily, the disassembly gives us hints on how to attack it. Let’s confine our possibilities (such as changing jump logic or overwriting instructions with dummy instructions) to only those we can exploit by using looking at this disassembly itself. There isn’t a lot of variety here.

To me, one neat thing about is_valid stands out. Two of the lines are extremely similar: movl $0x0,-0x4(%rbp) and movl $0x1,-0x4(%rbp). They do complementing things with the same memory location, use the same number of bytes (seven), involve the same setup, are near one another, and directly set up the return value for is_valid. This says to me the machine code for each instruction would be interchangeable, and by changing one or the other, we can directly change the return value for is_valid to whatever we want. It’s a safe bet, with a function named that, we want it to return a 1, but if we weren’t sure, I could look ahead to the main function and see how its return value gets used later on.

In other words, we want to change movl $0x0,-0x4(%rbp) to be movl $0x1,-0x4(%rbp) so that no matter what, is_valid returns a one. The machine code for the instruction we have is c7 45 fc 00 00 00 00. Conveniently, the machine code for that precise instruction we want is just two lines above: c7 45 fc 01 00 00 00. The last challenge ahead is to find these bytes in the actual file and change them.

Where in the file are these bytes? Note that the listing says “File Offset: 0xe10” for the function is_valid. That’s actually the count of bytes into the file we’d find the first instruction for this function (3648 bytes, in decimal), and the offset in the left column for the first instruction is “100000e10”, so those offsets in the left column look like they tell where in the file each instruction’s machine code is. The instruction we care about is at “100000e43”, so it must be 3651 bytes into the file. We only need to change the fourth byte of the instruction, so we can add four to that count to get 3655 bytes.

Using hexdump -C program | less and scrolling ahead a bit, I find a line like this one:

00000e40  00 00 00 c7 45 fc 00 00  00 00 8b 45 fc 48 83 c4  |....E......E.H..|

Sure enough, there’s the instruction, and the seventh byte on this line is the one we want to change. Patching a binary file from the command line is sort of difficult, but this command should do the trick:

printf '\x01' | dd of=program bs=1 seek=3654 count=1 conv=notrunc14

dd is writing to the file program (of=program), seeking by one byte at a time (bs=1), skipping ahead 3654 bytes past the first one to land on 3655 (seek=3654), changing only one byte (count=1), and not truncating the rest of the file (conv=notrunc).

Now I’ll run the program the same way we did before (./program) and see if this worked.

Please input a word: butts
That's correct!

Success!

Conclusions

That’s about it. It’s a contrived example, and I knew it would work out before the end, but this is a great way to start learning how programs are compiled, how processors work, and how software cracking happens. The concepts here also apply themselves to understanding how many security exploits work on a mechanistic level.