Profiler[0] - Position-independent Executables
It’s always a bit sad to dust off an older project only to find it’s broken. And, that’s exactly what happened to a small C/C++ profiler I wrote.
The Problem
My profiler has just been laying around for a few years now. Though, it has been quite useful and I thought it may actually be worth open sourcing.
I blew off the dust, ran one of it’s tests and here’s a snippet of the first output I saw:
[Pruned Function Tree]
\-- : (5.01 sec) (100.00%)
| \-- ??:??:0 (5.01 sec) (100.00%)
| | \-- ??:??:0 (4.00 sec) (79.95%)
| | \-- ??:??:0 (1.00 sec) (20.01%)
The profiler should be able to map addresses to FUNCTION:FILE:LINENO
, but all I got was ??:??:0
.
Let’s go back to before I figured out the problem and do some debugging together and see what went wrong.
Debugging
My profiler relies on a utility from binutils called addr2line
to convert addresses into something human readable. So, let’s play around with that a bit.
We’ll start with this nearly useless program. Call it test.c
:
int main()
{
return 0;
}
Compile it with debug info.
gcc -g test.c
Now, just run the executable in gdb and print the address of main.
gdb a.out
(gdb) p main
$1 = {int ()} 0x5fa <main>
Great. We know that’s the address of main, but let’s just make sure addr2line
comes to the same conclusion.
addr2line -fse a.out 0x5fa
main
test.c:2
Perfect, gdb and addr2line are on the same page. But, then why is my profiler not able to resolve the addresses.
When we take the same silly program and profile it, we still get ??:??:0
instead of the FUNCTION:FILE:LINENO
.
[Pruned Function Tree]
\-- : (0.00 sec) (100.00%)
| \-- ??:??:0 (0.00 sec) (100.00%)
Enabling some debug printing, we can see exactly what addr2line
commands it’s running:
cmd: addr2line -f -s -e /home/kevin/dev/profiler/a.out 0x559359602F4A
Wow, look at that huge address. That’s certainly not the 0x5fa
gdb and addr2line
had shown above.
It seems like maybe the profiler is getting a bad address. The program to be profiled is compiled with gcc’s -finstrument-functions
option to add special code at the beginning and end of each function call. That’s how the profiler actually gets the address that is eventually passed to addr2line
.
Excerpt from gcc’s man page:
-finstrument-functions
Just after function entry and just before function exit, the following
profiling functions are called with the address of the current
function and its call site.
void __cyg_profile_func_enter (void *this_fn,
void *call_site);
void __cyg_profile_func_exit (void *this_fn,
void *call_site);
Let’s just expand upon our silly test a bit and see what addresses we get when we instrument the functions.
We’ll have gcc instrument the entry to functions for our test.
#include <stdio.h>
void
__attribute__ ((no_instrument_function))
__cyg_profile_func_enter (void *func, void *caller)
{
printf("func: %p, caller: %p\n", func, caller);
}
int main()
{
return 0;
}
When we compile and run the program, we get the same weird large address:
gcc -g -finstrument-functions test.c
./a.out
func: 0x55dbdd09b709, caller: 0x7efd255c9b97
Oh, wait… that’s not the same address. We got 0x559359602F4A
before and now 0x55dbdd09b709
. Let’s run it a few more times.
./a.out
func: 0x556b558a2709, caller: 0x7fb8179a0b97
./a.out
func: 0x559bb0380709, caller: 0x7f96d7714b97
Ah hah! Finally, a good hint at what’s happening. Every time the program is run, it ends up in a different place in memory. That sounds an aweful lot like address space randomization. I knew this was done for shared libraries, but didn’t expect it to place the main executable randomly as well.
The profiler definitely worked on various releases of CentOS and Ubuntu in the past. So, what changed?
A bit of googling later and it turns out there was a (somewhat) recent change to gcc on multiple distros: PIE is enabled by default since Ubuntu 17.10 and I just recently upgraded my personal machines to Xubuntu 18.04.
When an executable is compiled to be position-indepenent (PIE), it allows the kernel to use ASLR and place the executable at a “random” place in memory. This explains why we get a different address every time we run the test programs.
So, what can we do about it? Check out part 1, where we’ll walk through the solution I used.