Basic concepts
In Part 1, we created an application, traceMe_rel
(the traceMe.c code is reproduced below).
    #include <stdio.h>
    #include <stdlib.h>   /* atoi, srand, rand */
    #include <time.h>     /* time */

    /*
     * This is a user-defined type
     */
    typedef struct st {
        int val;
        char arrayArg[15];
    } ST;

    /* This recursive function is meant to simulate a function call with
     * different parameters every time, so we can log the parameters later on.
     *
     * We take only a user-defined type pointer.
     */
    static unsigned int prevSeed;

    int func(ST *st)
    {
        static int counter = 0;

        counter++;
        printf("\n Entered [%-5d] with st->val=%-15d, st->arrayArg=%s",
               counter, st->val, st->arrayArg);
        if (st->val <= 0) {
            return 0;
        } else {
            srand(prevSeed);
            /* rand() so we get some random calls, % so not too much randomness */
            st->val = st->val / ((rand() % 3) + 2);
            sprintf(st->arrayArg, "%014d", st->val);
            prevSeed = st->val * rand(); /* :-) */
            return func(st); /* Recursive call */
        }
    }

    int main(int argc, char *argv[])
    {
        /* No check on argc and argv, to keep it simple */
        ST st;

        st.val = atoi(argv[1]);
        sprintf(st.arrayArg, "%014d", st.val);
        prevSeed = time(NULL);
        func(&st);
        printf("\n");
        return 0;
    }
Since this application is built in release mode, the binary carries no information about the argument types of the function func that it calls.
As a developer, you know the argument types of func, so you need to somehow cast the argument to the correct type (i.e., ST). This can be done by writing an auxiliary program in C (let us name it auxSymbols.c), the sole purpose of which is to inject debug symbols into GDB when debugging traceMe_rel.
Let us compile auxSymbols.c in debug mode (passing -g to GCC) and load the symbols from this auxiliary module (it could be in object code form like auxSymbols.o, or a shared library like libauxSym.so). The benefits of this approach are that we do not change the original application binary at all, and that the only symbols the auxiliary program needs to define are the ones we need to target, e.g., ST. We will discuss the details below, but first, let’s review our sample program and write our auxiliary program.
Note: The tool version used here is 7.2.90.20110429-36.fc15 (Fedora 15). The sample binaries produced in the article are all 32-bit.

Let us look at the same program (traceMe.c) used in Part 1 (the code is reproduced above), as an example.
There is a UDT (user-defined type) ST with two fields: val of type int and arrayArg of type char[15]. The program expects a positive integer as a command-line parameter. The main function assigns the number to st.val and a zero-padded string representation of it to st.arrayArg, where st is a variable of type ST. Then main calls func, which takes the address of st as a parameter. func is a recursive function. A recursive call to func is made after dividing st.val by a random divisor between 2 and 4. The recursion stops when st.val reaches 0.
Here is the output of running this program with an argument of 100:
    [raman@Chalotra Part2]$ ./traceMe_rel 100

     Entered [1    ] with st->val=100            , st->arrayArg=00000000000100
     Entered [2    ] with st->val=33             , st->arrayArg=00000000000033
     Entered [3    ] with st->val=16             , st->arrayArg=00000000000016
     Entered [4    ] with st->val=5              , st->arrayArg=00000000000005
     Entered [5    ] with st->val=1              , st->arrayArg=00000000000001
     Entered [6    ] with st->val=0              , st->arrayArg=00000000000000
    [raman@Chalotra Part2]$
Note that the output may vary across executions, because of the random number used in reducing the recursive problem.
Auxiliary program
Here is auxSymbols.c:

    #include <stdio.h>

    /* The UDT 'auxST' is same as UDT 'ST' used by traceMe_rel
     */
    typedef struct auxType {
        int auxVal;
        char auxArray[15];   /* char, so the layout matches arrayArg in ST */
    } auxST;

    /* Following function is just to make sure
     * that there is a user of 'auxST',
     * so that GCC does not optimise 'auxST'
     *
     * This function would never be called!
     */
    int fakeFunc(auxST *ptr)
    {
        auxST *ptr_1;

        ptr->auxVal++;
        printf("\n %d", ptr->auxVal);   /* auxVal is an int, so %d */
        return 0;
    }
Compile auxSymbols.c with gcc -g auxSymbols.c -c and gcc -g --shared -fPIC auxSymbols.c -o libauxSym.so. (The first creates an object file auxSymbols.o, the second a shared library libauxSym.so.) Note that both commands pass -g to GCC, to build in debug mode, which is very important. Note down the size of auxSymbols.o with ls -al auxSymbols.o; we will need this later on.
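The whole technique depends on auxST having exactly the same memory layout as ST. A minimal sketch of how that could be verified at compile time is shown below; it assumes auxArray is declared as char[15] so that it mirrors arrayArg, and the helper name layouts_match is hypothetical, not part of the article's sources.

```c
#include <assert.h>
#include <stddef.h>

/* ST as declared in traceMe.c */
typedef struct st {
    int  val;
    char arrayArg[15];
} ST;

/* Mirror type for auxSymbols.c (assumption: char array, like arrayArg) */
typedef struct auxType {
    int  auxVal;
    char auxArray[15];
} auxST;

/* Compile-time checks: the build fails if the two layouts diverge */
_Static_assert(sizeof(auxST) == sizeof(ST), "size mismatch");
_Static_assert(offsetof(auxST, auxVal) == offsetof(ST, val),
               "offset of first field differs");
_Static_assert(offsetof(auxST, auxArray) == offsetof(ST, arrayArg),
               "offset of second field differs");

/* Run-time version of the same comparison; returns 1 when layouts agree */
int layouts_match(void)
{
    return sizeof(auxST) == sizeof(ST)
        && offsetof(auxST, auxVal) == offsetof(ST, val)
        && offsetof(auxST, auxArray) == offsetof(ST, arrayArg);
}
```

Such a check is only possible when you have both declarations at hand, as the developer in this scenario does; the release binary itself is never touched.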
‘Function parameter logging’ scripts
Let’s apply the technique discussed in the Basic concepts section by writing three scripts:
- trace1.gdb: Uses auxSymbols.o to log the function parameters.
- trace2.gdb: Uses libauxSym.so to log the function parameters.
- trace2_mod.gdb: A modified trace2.gdb that supports function argument modification.
I have declared an auxiliary UDT with the name auxST that is equivalent to ST (used in traceMe_rel, for the func parameter ST *). So, here is what we do:
- Load traceMe_rel in GDB. At this point, type information for ST is not known.
- Load symbols from the auxiliary object code (auxSymbols.o) or the shared library (libauxSym.so), compiled in debug mode.
- Put a breakpoint on function func, and extract the argument using the processor register EBP, which acts as the base pointer.
- Once the argument is extracted, cast it to the auxST type and extract the subtypes, i.e., auxVal and auxArray (the val and arrayArg fields of the UDT ST).
- Play with the arguments as you wish: print them, test them against some condition, modify them, etc. You can even stop the program from executing.
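To see why the argument can be found relative to EBP, here is a rough sketch that models a 32-bit stack frame as plain memory. It assumes the usual 32-bit x86 C calling convention with a frame pointer (saved EBP at offset 0, return address at offset 4, first argument at offset 8); the function name first_arg_from_frame and the sample addresses in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed frame layout once the prologue (push %ebp; mov %esp,%ebp) has run:
 *   ebp + 0 : caller's saved EBP
 *   ebp + 4 : return address
 *   ebp + 8 : first argument (here: the ST * passed to func)
 */
enum { FIRST_ARG_OFFSET = 8 };

/* Read the 32-bit word stored at ebp + 8, the C equivalent of the
 * GDB expression *((int *)($ebp + 0x8)). */
uint32_t first_arg_from_frame(const unsigned char *ebp)
{
    uint32_t word;

    assert(ebp != NULL);
    memcpy(&word, ebp + FIRST_ARG_OFFSET, sizeof word);
    return word;
}
```

For example, given a fake frame `uint32_t frame[3] = { 0xbffff000u, 0x08048500u, 0x0804a010u };`, calling `first_arg_from_frame((const unsigned char *)frame)` yields the third word, which is then treated as the pointer to cast to auxST *. If the binary were built without a frame pointer, this offset would not hold.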
Now, let’s dig into the interesting details, one by one.
‘trace1.gdb’
    file ./traceMe_rel
    br main
    run 100
    bt

    # Open and map auxSymbols.o inside the inferior
    call open("auxSymbols.o", 2)
    set variable $_fd=$
    call mmap(0, 2516, 1|2|4, 1, $_fd,0)
    set variable $_address=$
    printf "\n $_fd=%d, $_address=0x%x\n", $_fd, $_address

    # Load the debug symbols at the mapped address
    add-symbol-file auxSymbols.o $_address
    ptype auxST
    ptype func

    # Log the argument on every call to func
    br func
    c
    while 1
      set variable $_one= *((int *)($ebp +0x8))
      printf "\n GDB:TRACER: %d, %s\n", ((auxST *) ($_one))->auxVal, ((auxST *) ($_one))->auxArray
      c
    end
This script uses the add-symbol-file command of GDB. The auxSymbols.o file is opened using the open system call via GDB’s call command. Then the mmap system call is used to map the just-opened auxSymbols.o. Once mapped, it is part of the address space of the inferior process (traceMe_rel). Then debug symbols from auxSymbols.o are added to the symbol table using add-symbol-file. After this, GDB knows what auxST is. The commands call and add-symbol-file were discussed in Part 1.
Also, the GDB variable $ refers to the most recent value in GDB’s value history. I have used this variable at two points:
- To get the file descriptor returned by open, which is passed to mmap later on.
- To get the address returned by mmap, at which mmap loaded auxSymbols.o in memory. This address is subsequently passed to add-symbol-file.

Please refer to the documentation of mmap for the magic numbers used in the script. 2516 is the size of the auxSymbols.o object file (the value noted down earlier with ls -al); substitute the size of your own build.
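The numeric arguments in the script are easier to read with their symbolic names. On Linux, 2 is O_RDWR, 1|2|4 is PROT_READ|PROT_WRITE|PROT_EXEC, and 1 is MAP_SHARED. The sketch below performs the same open and mmap sequence from C, taking the length from fstat() rather than a hard-coded size; the function names map_whole_fd and map_object_file are hypothetical helpers, not something the script or GDB provides.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map an already-open file in full, taking the length from fstat()
 * instead of a hard-coded size. Returns MAP_FAILED on error. */
void *map_whole_fd(int fd, int prot, size_t *len_out)
{
    struct stat sb;

    assert(len_out != NULL);
    if (fstat(fd, &sb) != 0 || sb.st_size <= 0)
        return MAP_FAILED;
    *len_out = (size_t)sb.st_size;
    return mmap(NULL, *len_out, prot, MAP_SHARED, fd, 0);
}

/* Same sequence as the script: open(path, 2) then mmap(0, size, 1|2|4, 1, fd, 0) */
void *map_object_file(const char *path, size_t *len_out)
{
    void *addr;
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return MAP_FAILED;
    addr = map_whole_fd(fd, PROT_READ | PROT_WRITE | PROT_EXEC, len_out);
    close(fd);   /* the mapping remains valid after the descriptor is closed */
    return addr;
}
```

In the GDB script the two calls run inside the inferior, so the mapping lands in the address space of traceMe_rel; this standalone version only illustrates what the arguments mean.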
The script executes the inferior process inside a while 1 loop, and emits a line starting with GDB:TRACER: on every intercepted call to function func. Every line of script output contains GDB:, to distinguish it from the output of traceMe_rel (lines with the prefix Entered).
Now run traceMe_rel in GDB with the script trace1.gdb and redirect the output to a file named out_trace1.txt. GDB prints an error (reported at line 21 of the script) because the continue command inside the while loop attempted to execute even after traceMe_rel had finished executing; ignore this for now!
    [raman@Chalotra Part2]$ gdb -x trace1.gdb -batch > out_trace1.txt
    trace1.gdb:21: Error in sourced command file:
    No registers.
Let’s analyse the output of this command:
    [raman@Chalotra Part2]$ grep GDB out_trace1.txt
     GDB:TRACER: 100, 00000000000100
     GDB:TRACER: 25, 00000000000025
     GDB:TRACER: 6, 00000000000006
     GDB:TRACER: 1, 00000000000001
     GDB:TRACER: 0, 00000000000000
Grepping for GDB: gives us all the argument values traced by the script. To make sure that the script traced the correct values, given below is the output of traceMe_rel from the same run. Both outputs match, which means the script is working! You may like to look at out_trace1.txt for the other output of the script.
    [raman@Chalotra Part2]$ grep Enter out_trace1.txt
     Entered [1    ] with st->val=100            , st->arrayArg=00000000000100
     Entered [2    ] with st->val=25             , st->arrayArg=00000000000025
     Entered [3    ] with st->val=6              , st->arrayArg=00000000000006
     Entered [4    ] with st->val=1              , st->arrayArg=00000000000001
     Entered [5    ] with st->val=0              , st->arrayArg=00000000000000
Note: Calling a function defined in auxSymbols.o (for example, fakeFunc) from GDB will fail, because symbols in auxSymbols.o have not been relocated/resolved into the inferior address space yet.

‘trace2.gdb’
This script uses the LD_PRELOAD variable to load libauxSym.so into the address space of traceMe_rel. The rest is similar to trace1.gdb. Here is trace2.gdb:
    set environment LD_PRELOAD ./libauxSym.so
    file ./traceMe_rel
    br main
    run 100
    bt
    ptype auxST
    ptype func
    br func
    c
    while 1
      set variable $_one= *((int *)($ebp +0x8))
      printf "\n GDB:TRACER: %d, %s\n", ((auxST *) ($_one))->auxVal, ((auxST *) ($_one))->auxArray
      c
    end
Run traceMe_rel using trace2.gdb, redirecting output to out_trace2.txt, with gdb -x trace2.gdb -batch > out_trace2.txt. Analyse the output; it has the same structure as out_trace1.txt, although the values differ because of the random divisor:
    [raman@Chalotra Part2]$ grep GDB out_trace2.txt
     GDB:TRACER: 100, 00000000000100
     GDB:TRACER: 33, 00000000000033
     GDB:TRACER: 11, 00000000000011
     GDB:TRACER: 3, 00000000000003
     GDB:TRACER: 0, 00000000000000
    [raman@Chalotra Part2]$ grep Enter out_trace2.txt
     Entered [1    ] with st->val=100            , st->arrayArg=00000000000100
     Entered [2    ] with st->val=33             , st->arrayArg=00000000000033
     Entered [3    ] with st->val=11             , st->arrayArg=00000000000011
     Entered [4    ] with st->val=3              , st->arrayArg=00000000000003
     Entered [5    ] with st->val=0              , st->arrayArg=00000000000000
‘trace2_mod.gdb’
This script is a modified version of trace2.gdb, which shows how one can modify the function arguments. Let us suppose that, arbitrarily, the developer wants func to be called with only odd values of st->val. The random-number logic may not always satisfy this condition, so this script captures the function arguments, changes positive even numbers in st->val into odd numbers by subtracting one from the value, and signals this by emitting a line containing ...( HACK ):.
Please note: the script only changes the numeric value st->val and leaves st->arrayArg unchanged.
    set environment LD_PRELOAD ./libauxSym.so
    file ./traceMe_rel
    br main
    run 100
    bt
    ptype auxST
    ptype func
    br func
    c
    while 1
      set variable $_one= *((int *)($ebp +0x8))
      printf "\n GDB:TRACER: %d, %s", ((auxST *) ($_one))->auxVal, ((auxST *) ($_one))->auxArray
      set variable $_integer=((auxST *) ($_one))->auxVal
      # Turn positive even values into odd ones before func uses them
      if $_integer > 0 && $_integer %2 == 0
        printf "...( HACK ): Not allowing even numbers, old = %d, new =%d", $_integer, $_integer -1
        set ((auxST *) ($_one))->auxVal = $_integer -1
      end
      c
    end
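The rule applied inside the if block can be written out in C as a rough sketch. The function name force_odd is hypothetical; auxST is assumed to use a char array so that it mirrors ST. As in the script, only the numeric field is changed and the string field is left alone.

```c
#include <assert.h>
#include <stddef.h>

/* Mirror of ST (assumption: char array, like arrayArg) */
typedef struct auxType {
    int  auxVal;
    char auxArray[15];
} auxST;

/* Turn a positive even auxVal into the odd number just below it.
 * Returns 1 if the value was changed, 0 otherwise.
 * auxArray is deliberately left untouched, as in trace2_mod.gdb. */
int force_odd(auxST *arg)
{
    assert(arg != NULL);
    if (arg->auxVal > 0 && arg->auxVal % 2 == 0) {
        arg->auxVal -= 1;
        return 1;
    }
    return 0;
}
```

Zero is excluded on purpose: it is the value that ends the recursion in func, so decrementing it would only produce a negative number.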
Run trace2_mod.gdb on traceMe_rel with gdb -x trace2_mod.gdb -batch > out_trace2_mod.txt. Now, extract the script output with grep on GDB in out_trace2_mod.txt (#1), and the output of traceMe_rel with a grep on Enter (#4).
We find that traceMe_rel called func with even numbers in st->val twice (100 and 12), as seen in the two lines with HACK (#2 and #3).
These lines show the original value passed to func (old = 100) and the changed value (new = 99). To see that the state has really been changed, look at lines #5 and #6: in #5, st->val=99 while the string version is left as st->arrayArg=00000000000100, and in #6, st->val=11 while st->arrayArg still shows 12, since trace2_mod.gdb does not change arrayArg.
    [raman@Chalotra Part2]$ grep GDB out_trace2_mod.txt                                           #1
     GDB:TRACER: 100, 00000000000100...( HACK ): Not allowing even numbers, old = 100, new =99    #2
     GDB:TRACER: 49, 00000000000049
     GDB:TRACER: 12, 00000000000012...( HACK ): Not allowing even numbers, old = 12, new =11      #3
     GDB:TRACER: 3, 00000000000003
     GDB:TRACER: 0, 00000000000000
    [raman@Chalotra Part2]$ grep Enter out_trace2_mod.txt                                         #4
     Entered [1    ] with st->val=99             , st->arrayArg=00000000000100                    #5
     Entered [2    ] with st->val=49             , st->arrayArg=00000000000049
     Entered [3    ] with st->val=11             , st->arrayArg=00000000000012                    #6
     Entered [4    ] with st->val=3              , st->arrayArg=00000000000003
     Entered [5    ] with st->val=0              , st->arrayArg=00000000000000
This approach is powerful, because GDB provides a huge set of commands to check the state of the program, and hence developers can run a rich set of tests before taking any action. This script just demonstrates a small modification to the inferior (converting an even number to an odd one); however, far more elaborate things can be done using this technique.
For example, instead of changing the even number to odd, the developer can provide a second version of func (let’s call it func_even) packaged in libauxSym.so. On seeing an even number in st->val, the script can call func_even instead of func.
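As a purely illustrative sketch, a func_even that could be added to auxSymbols.c might look like the following. Its behaviour (halving the value once and refreshing the string field) is an assumption made up for this example; the article does not define what func_even should do.

```c
#include <assert.h>
#include <stdio.h>

/* Mirror of ST (assumption: char array, like arrayArg) */
typedef struct auxType {
    int  auxVal;
    char auxArray[15];
} auxST;

/* Hypothetical alternative to func for even values:
 * halves the value once, keeps the string field in sync,
 * and returns the new value. */
int func_even(auxST *arg)
{
    assert(arg != NULL);
    printf("\n func_even: handling even st->val=%d", arg->auxVal);
    arg->auxVal /= 2;
    snprintf(arg->auxArray, sizeof arg->auxArray, "%014d", arg->auxVal);
    return arg->auxVal;
}
```

From the script, such a function could be invoked with GDB’s call command, for example call func_even((auxST *) $_one) inside the if block. How to skip the original func afterwards (possibly with GDB’s return command) is left open here.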
Summing up
In this article, we have discussed how a debugger can be used to perform nice things other than just debugging! I hope you enjoyed and learned something new from this article. The approach suggested here may not be used safely on production applications, but can be used for research and to learn about the various aspects of the system and your application.