The Art of Reverse Engineering

A Crackme program is an executable file and usually, on execution, you are supposed to provide a password. This article is a tutorial on cracking passwords for Crackme programs that are designed to test the skills of coders and programmers.

Wikipedia defines reverse engineering as the process by which a man-made object is deconstructed to reveal its design and architecture or to extract knowledge from the object. But what does this term mean to a software engineer? Software reverse engineering is the analysis of software to obtain information about its design and implementation. Practical applications of software reverse engineering include detecting viruses, worms, trojans and other malware, designing better software, etc. It is often believed that with source code readily available all the time, open source software systems do not need reverse engineering. But this is not true. Software reverse engineering is also done for fun and to learn. An excellent example is a Crackme program that is used to test a programmer’s reverse engineering skills.

Do note that applying reverse engineering techniques to proprietary software may not be legal due to intellectual property rights. As we all know, proprietary software does not provide the source code to the user. The developers of proprietary software use code protection schemes and algorithms to conceal the code from casual disassembly. Since proprietary software is protected by copyright and licence terms, reverse engineering it may be illegal. However, small programs called Crackme programs are specifically designed for students of software engineering, just for practising their reverse engineering skills. Some of them can be even more difficult to reverse engineer than proprietary software.

Figure 1: An incorrect password on crack1.c
Figure 2: Output of strings command on test1
Figure 3: The correct password on crack1.c

Now let us look at how a Crackme program works. Usually, it is an executable file and, on execution, you are supposed to provide a password. But you are given no clues about the password and, hence, finding it by using brute force techniques alone is very difficult. In this article, we will learn a few simple Linux commands and utilities to break Crackme programs compiled in the executable format (ELF – Executable and Linkable Format) of Linux. Many online resources provide Crackme programs in the executable format (EXE) of Microsoft Windows also.

Figure 4: Output of the file, strip and ls commands

To keep the discussion short and simple, the Linux commands and tools to perform reverse engineering discussed in this article will be tested against simple C programs written by the authors themselves. Let us first consider the following C program called crack1.c:

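#include <stdio.h>
#include <string.h>

#define pass "hello"    /* the hard-coded password */

int main()
{
    char pas[100];      /* buffer for the password typed by the user */

    printf("\nEnter the password: ");
    scanf("%s", pas);   /* unbounded read: fine for a demo, unsafe in real code */
    if (!strcmp(pass, pas))
    {
        printf("\nYou have CRACKED me!!!\n");
    }
    else
    {
        printf("\nSorry\n");
    }
    return 0;
}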

First, let us compile the program crack1.c with the command gcc crack1.c -o test1 to obtain an executable file named test1. Figure 1 shows the compilation and execution of the program crack1.c. A typical execution of a Crackme program looks similar. The program asks the user to enter a password; it then checks whether the correct password has been entered or not, and then an appropriate message is displayed. In this case, we know the password is hello because we have seen the source code of the program crack1.c. But what if we only have the executable file test1 of the program crack1.c? Then, to a person unskilled in reverse engineering, the only option is to guess the password repeatedly. But we know that such an approach is pointless because we have no idea about the number of characters in the password. This is when we need to perform reverse engineering on the executable file test1 to make an educated guess about the password.

First, let us use a simple Linux command called strings to find out the password hidden in the executable file test1. The command strings prints the strings of printable characters in a file. So it can be used to print the human-readable strings in an executable file. The command strings is a part of the GNU Binary Utilities (Binutils). Figure 2 shows a part of the output when the command strings test1 is executed.

From Figure 2, we can clearly see that trying out the word hello is a very logical move. Figure 3 shows the output of the program crack1.c when provided with the password hello.

What if we modify the program crack1.c by replacing the line of code:

#define pass "hello"

…with the line of code:

#define pass "123456"

After recompiling the program, the command strings test1 reveals the password 123456 without much difficulty, because the number 123456 is also stored as a human-readable string. Now let us use the Linux command file to obtain information about the executable file test1. The command file is used to determine the file type.

Figure 5: Output of objdump utility on test1
Figure 6: Output of the strings command on test2
Figure 7: Output of objdump utility on test2
Figure 8: Output of the command ldd

Figure 4 shows the output of the command file test1. Along with other information, it tells us that the executable file test1 is not stripped. A stripped file is one from which debugging information and other data not required for program execution have been removed, so that the resulting executable file is smaller. Let us strip the executable file test1 with the command strip, hoping that stripping will hide some information from a potential reverse engineering attack. Figure 4 also shows the execution of the strip command, followed by the file command, which shows the changes made by strip. The output of the command ls in Figure 4 shows the size of the file before and after the strip operation.

From Figure 4, we can see that the file size reduces from 8488 bytes to 6120 bytes after the strip operation. Let us now use the command strings again to check whether the stripped executable file test1 hides the password or not. On execution of the command strings test1, we can see that the password is not hidden in the stripped executable file test1 and the output is the same as the one shown in Figure 2. This is because strip removes symbol and debugging information, whereas the password is program data that is needed at run time. So, stripping the executable file will not complicate the reverse engineering process much.

Now, let us use a utility called objdump which shows the information from an object file. Figure 5 shows a part of the output from the execution of the command objdump -s test1. The utility objdump is also a part of the GNU Binary Utilities (Binutils). It can be used as a disassembler also. But we will only use the option -s which is used to show the full contents of all the sections.

Figure 9: The disassembled code of test2

From Figure 5, we can easily guess that the password is hello. Another GNU Binary Utility called readelf is also very useful, and it offers functionality similar to that of objdump. Though strings and objdump are relatively powerful utilities, it is easy to hide the password from them. Simple hacks can be used to make reverse engineering more difficult. For example, consider the modified C program called crack2.c shown below. In this program the password is not stored as a single string; instead, the line of code:

char pass[]={'h','e','l','l','o','\0'};

…declares the password as an array of characters.

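#include <stdio.h>
#include <string.h>

int main()
{
    char pas[100];      /* buffer for the password typed by the user */
    char pass[] = {'h', 'e', 'l', 'l', 'o', '\0'};  /* password built character by character */

    printf("\nEnter the password: ");
    scanf("%s", pas);   /* unbounded read: fine for a demo, unsafe in real code */
    if (!strcmp(pass, pas))
    {
        printf("\nYou have CRACKED me!!!\n");
    }
    else
    {
        printf("\nSorry\n");
    }
    return 0;
}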

Compile this C program with the command ‘gcc crack2.c -o test2’ to obtain the executable file test2. Now let us search this executable file with the command strings and the utility objdump to find the password. Figure 6 shows a part of the output of the command ‘strings test2’ and Figure 7 shows a part of the output of the command ‘objdump -s test2’.

From Figures 6 and 7, we can clearly see that neither the strings command nor the objdump utility reveals the password of the C program crack2.c. So, in order to attack this program, we are going to use the powerful debugger GDB (GNU Debugger), which works for many programming languages like Ada, C, C++, Objective-C, Free Pascal, FORTRAN, Go, etc. But how can we check that the executable file test2 was indeed obtained from one of the above languages? A utility called ldd, which prints the shared libraries used by a program, comes to our help in this situation. Figure 8 shows the output of the command ‘ldd test2’.

From Figure 8, we can see that the executable file test2 depends on the shared library libc.so.6. Since the term ‘libc’ is often used as a shorthand for the ‘standard C library’, it is reasonable to guess that test2 was compiled from a C program, although programs written in several other languages also link against libc. Thus, we can readily use GDB to disassemble the executable file test2. The command ‘gdb test2’ opens the executable file test2 for processing. Run the program once with the command ‘run’ at the (gdb) prompt, before adding any breakpoints, so that the disassembly shows the run-time addresses used below. Now execute the command ‘disassemble main’. Figure 9 shows a part of the disassembled code of the executable file test2.

The line of code:

0x0000555555554810 <+102>: callq 0x555555554670 <strcmp@plt>

…in Figure 9 is the only place where the string comparison function strcmp() of C is used. Thus, it is a good idea to search for the password in registers used immediately before this line of code. The lines of code:

0x000055555555480a <+96>: mov %rdx,%rsi
0x000055555555480d <+99>: mov %rax,%rdi

…shown in Figure 9, copy the contents of the registers rdx and rax into rsi and rdi, the registers that carry the two arguments of strcmp() on 64-bit Linux. So we have to probe the registers rdx and rax to retrieve the password. Now we set two breakpoints, the first at the beginning of the main() function and the second at the line of code where the function strcmp() is called, with the commands b main and b * 0x0000555555554810. Now run the executable file test2 with the command run. The execution temporarily stops at the first breakpoint. Enter the command c to continue the execution and enter the test password dummy when prompted by the program. When the program stops at the second breakpoint, we get the contents of the registers with the command info registers. Figure 10 shows the contents of the registers, as well as the contents of rdx and rax printed in string format with the command x/s (for example, x/s $rax). The register rdx contains the test password dummy entered by us and the register rax contains the actual password hello.

Figure 10: Register content in test2 with GDB

Thus, from Figure 10, we can see that the password is hello. Hence, GDB has been successfully used to unearth the hidden password in the executable file test2. Please note that the techniques we have discussed here are quite simple. People who make difficult Crackme programs use a variety of techniques to make password recovery very hard. Some of the techniques involve automatic key generation, obfuscating the code itself, etc. If you want to practise further and become a master of software reverse engineering, there are a lot of resources available on the Internet. However, we would like to caution you to be extremely careful while downloading and using Crackme programs from unreliable sources, as you would be downloading and running a program whose code is absolutely unknown to you. For all you know, it could be a virus or some other sort of malicious program.

So absolute care must be taken while practising with Crackme programs downloaded from the Internet. But all things said, we would like to share our personal experience. We have downloaded and used Crackme programs from a popular website called crackmes.one many times and never had any trouble with the executable files.

Before winding up, we would like to make one confession and give one warning. The confession is that the simple techniques we have discussed here may not be enough to break even the simplest puzzles available on the Internet. But the important question is: does there exist a really unbreakable Crackme program? Well, there cannot be. The program has to compare the password entered by the user with the actual password. For this, the real password, or something derived from it, must be stored somewhere in the program. It may be obfuscated or processed into some other form by an algorithm, but whatever the technique employed, a trace of the password will be there in the program. So, a good programmer with sufficient skills and lots of patience can eventually break any Crackme program. In theory, every Crackme program is crackable but, in practice, many a programmer has gone mad trying to crack Crackmes! So, be careful.

Our warning is that applying the knowledge you have gained from this article, and from further study, to proprietary software may not always be legal due to intellectual property rights. So you should be extra careful while choosing software for reverse engineering.