From time to time I use the ltrace tool for introspection into user-space
processes. This is similar to strace, but hooks into library API calls instead
of just system calls. This is quite useful, but has some extra challenges.
With system calls you know beforehand the full set of functions you are hooking,
their prototypes, and the meaning and purpose of each argument. With general
libraries the space of all the possible APIs is huge, so you generally do not
know this. ltrace can read configuration files that define these interfaces,
so with a bit of manual effort you can provide this information. It would be
really nice to be able to trace generic function calls with no extra effort at
all. Much of the prototype data exists in debug information, which is often
available along with the executable binary. So by parsing this information, we
can trace API calls without needing to edit a configuration file.
Stock behavior
Let's say I have the following simple project. There are 3 files: tstlib.h,
tstlib.c and tst.c. The first two define a small library, and the third an
application that uses it. Their contents are:
tstlib.h
    #pragma once

    struct tree
    {
        int x;
        struct tree* left;
        struct tree* right;
    };
    struct tree treetest(struct tree* t);

    struct loop_a;
    struct loop_b;
    typedef struct loop_a { struct loop_b* b; int x;} loop_a_t;
    struct loop_b { loop_a_t* a; int x;};
    void looptest( loop_a_t* a );

    enum E { A,B,C };
    typedef enum E E_t;
    int enumtest( enum E a, E_t b );

    struct witharray
    {
        double x[5];
    };
    double arraytest( struct witharray* s );
tstlib.c
    #include <stddef.h>   /* for NULL */
    #include "tstlib.h"

    /* Increment x in every node of the tree, children first */
    struct tree treetest(struct tree* t)
    {
        if(t->left  != NULL) treetest(t->left);
        if(t->right != NULL) treetest(t->right);
        t->x++;
        return *t;
    }

    /* Touch both structures in a pair that point at each other */
    void looptest( loop_a_t* a )
    {
        a->x++;
        a->b->x++;
    }

    int enumtest( enum E a, E_t b )
    {
        return a == b;
    }

    double arraytest( struct witharray* s )
    {
        return s->x[0];
    }
tst.c
    #include "tstlib.h"
    #include <unistd.h>

    int main(void)
    {
        /* A small tree: a -> (b, c), c -> (nil, d) */
        struct tree d = {.x = 4};
        struct tree c = {.x = 3, .right = &d};
        struct tree b = {.x = 2};
        struct tree a = {.x = 1, .left = &b, .right = &c};
        treetest( &a );

        /* Two structures that point at each other */
        struct loop_a la = {.x = 5};
        struct loop_b lb = {.x = 6};
        la.b = &lb;
        lb.a = &la;
        looptest(&la);

        enum E ea = A, eb = B;
        enumtest( ea, eb );

        struct witharray s = {.x = {1.0,2.0,1.0,2.0,1.0}};
        arraytest( &s );

        return 0;
    }
Now I build this with debug information, placing the library in a DSO and setting the RPATH:
    cc -g -c -o tst.o tst.c
    cc -fpic -g -c -o tstlib.o tstlib.c
    cc -shared -Wl,-rpath=/home/dima/projects/ltrace/ltracetests -o tstlib.so tstlib.o
    cc -Wl,-rpath=/home/dima/projects/ltrace/ltracetests tst.o tstlib.so -o tst
I now run the stock ltrace to see calls into the tstlib library. I'm using
the latest ltrace in Debian/sid: version 0.7.3-4:
    dima@shorty:~/projects/ltrace/ltracetests$ ltrace -n2 -l tstlib.so ./tst
    tst->treetest(0x7fff6b36ad30, 0x7fff6b36ada0, 0x7fff6b36ada0, 0 <unfinished ...>
      tstlib.so->treetest(0x7fff6b36acf0, 0x7fff6b36adc0, 0x7fff6b36adc0, 0) = 0
      tstlib.so->treetest(0x7fff6b36acf0, 0x7fff6b36ade0, 0x7fff6b36ade0, 0 <unfinished ...>
        tstlib.so->treetest(0x7fff6b36acb0, 0x7fff6b36ae00, 0x7fff6b36ae00, 0) = 0
      <... treetest resumed> ) = 0x7fff6b36acb0
    <... treetest resumed> ) = 0x7fff6b36ad30
    tst->looptest(0x7fff6b36ad90, 0x7fff6b36ae00, 0x7fff6b36ade0, 0x7fff6b36adc0) = 0x7fff6b36ad80
    tst->enumtest(0, 1, 1, 0x7fff6b36adc0) = 0
    tst->arraytest(0x7fff6b36ad50, 1, 1, 0x7fff6b36adc0) = 0x3ff0000000000000
    +++ exited (status 0) +++
So we clearly see the calls, but the meaning of the arguments (and return
values) isn't clear. This is because ltrace has no idea what the prototypes of
anything are, and assumes that every API call is long f(long,long,long,long).
Patched behavior
I made a patch to read in the prototypes from DWARF debugging information. The
initial version lives at https://github.com/dkogan/ltrace. This is far from
done, but it's enough to evaluate the core functionality. With the patched
ltrace:
    dima@shorty:~/projects/ltrace/ltracetests$ ltrace -n2 -l tstlib.so ./tst
    tst->treetest({ 1, { 2, nil, nil }, { 3, nil, { 4, nil, nil } } } <unfinished ...>
      tstlib.so->treetest({ 2, nil, nil }) = nil
      tstlib.so->treetest({ 3, nil, { 4, nil, nil } } <unfinished ...>
        tstlib.so->treetest({ 4, nil, nil }) = nil
      <... treetest resumed> ) = { 5, nil, nil }
    <... treetest resumed> ) = { 2, { 3, nil, nil }, { 4, nil, { 5, nil, nil } } }
    tst->looptest({ { recurse^, 6 }, 5 }) = <void>
    tst->enumtest(A, B) = 0
    tst->arraytest({ [ 1.000000, 2.000000, 1.000000, 2.000000... ] }) = 1.000000
    +++ exited (status 0) +++
Much better! We see the tree structure, the array and the enum values. The return values make sense too. So this is potentially very useful.
Issues to resolve
Playing with this for a bit, it's becoming clearer what the issues are. The DWARF information gives you the prototype, but an API definition is more than just a prototype. For one thing, if a function has a pointer argument, this can represent an input or an output. My implementation currently assumes it's an input, but being wrong either way is problematic here:
- If a pointer is an output and ltrace interprets it as an input, then the
  output is never printed (as we can see in the loop test above). Furthermore,
  the not-yet-written contents will be printed as if they were an input, and
  since these could contain nested pointers, following them could result in a
  segmentation fault. In this case ltrace can thus crash the process being
  instrumented. Oof.
- If a pointer is an input treated as an output, then again, we won't see
  useful information, and will be printing potentially bogus data at the
  output.
This can be remedied somewhat by assuming that an input must be const (and
vice versa), but one can't assume that across the board.
Even if we somehow know that a pointer is an input, we still don't know how to
print it. How many integers does an int* point to? Currently I assume the
answer is 1, but what if it's not? Guessing too low, we don't print enough
useful information; guessing too high, we can overrun our memory.
These are all things that ltrace's configuration files can take care of. So it
sounds to me like the best approach is a joint system, where both DWARF and the
config files are read in, and complementary definitions are used. It wouldn't be
fully automatic, but at least it could be right. In theory this is implemented
in the tree I linked to above, but it doesn't work yet.
This all needs a bit more thought, but I think I'm on to something.