Re: [DynInst_API:] Binary Rewriting using wrapFunction


Date: Tue, 16 Dec 2014 00:59:05 +0100
From: Sergej Proskurin <prosig@xxxxxxx>
Subject: Re: [DynInst_API:] Binary Rewriting using wrapFunction
On 16.12.2014 00:31, Josh Stone wrote:
> On 12/15/2014 03:19 PM, Sergej Proskurin wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Hello,
>>
>>> On 15.12.2014 19:38, Josh Stone wrote:
>>
>>> Note that GCC may use builtin versions of some libc functions, and
>>> make transformations for efficiency.  For instance, if you didn't
>>> compile with -fno-builtin-printf, a call like printf("%s\n", str)
>>> might be translated to puts(str).
>>
>> I know about this but this should not present any troubles, thanks :)
>>
>>> How about: 3) Use an LD_PRELOAD approach to insert your custom
>>> function, and dlsym RTLD_NEXT if that still needs to call the
>>> original.  You'd have to write your code in its own library, of
>>> course.  Then you can either use LD_PRELOAD directly, or use
>>> dyninst loadLibrary -- although I'm not sure if that lets you
>>> ensure the order of loading.  And this approach will only work with
>>> dynamic functions, not statically linked.
>>
>> The LD_PRELOAD approach should work quite well for dynamically linked
>> library functions. And if it is possible to determine whether a
>> function in question is statically or dynamically linked, the program
>> could be designed in such a way that it wraps statically linked
>> library functions with help of the mentioned BPatch_wrapFunction(...).
>>
>> Regarding the function "BPatch::loadLibrary(...)":
>> I have been using it to load my own library filled with predefined
>> wrappers for shared libraries into the address space of the mutatee.
>> However, that led to the behaviour, described above: Wrappers to
>> dynamic function calls are not preserved, when the binary is
>> statically rewritten (however it works when the process is
>> instrumented at runtime). Or did I miss something?
>>
>> I thought (hoped) that Dyninst BPatch::loadLibrary(...) internally
>> modifies the dynamic symbol tables of the ELF file so that the wrapped
>> calls will be preserved after binary rewriting :)
> 
> ELF symbol tables don't lock in which library it comes from, which is
> why I hedged about loading order above.  If your wrapper function is
> named the same as the original symbol, then you'll only win if your
> library is loaded first.  If you give your functions unique names, then
> I'd think (hope) it should work.  :)
> 

This makes sense :) This is exactly what I have been trying in my
program. The resulting binary, however, only contains wrapped calls to
functions that are part of the binary itself or statically linked into
it. (I have not tried to wrap statically linked library functions yet,
but they should behave like the other functions located within the
binary.) The unique symbols of the wrappers for dynamically linked
functions unfortunately do not even appear in the symbol table after
rewriting.
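
To spell out the load-order point from the quoted reply, here is a toy
model of first-match symbol resolution. It is only a sketch and does not
use Dyninst or the real dynamic linker: LoadedObject, resolve() and the
library name libwrap.so are hypothetical names chosen for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a loaded object and the symbols it exports.
struct LoadedObject {
    std::string name;
    std::vector<std::string> exports;
};

// Toy model of dynamic symbol lookup: walk the objects in load order and
// return the name of the first one that exports the symbol ("" if none).
std::string resolve(const std::vector<LoadedObject> &loadOrder,
                    const std::string &symbol) {
    for (const LoadedObject &obj : loadOrder)
        for (const std::string &exported : obj.exports)
            if (exported == symbol)
                return obj.name;
    return "";
}
```

In this model a wrapper that is itself named printf only wins if its
library comes first in the load order, whereas a uniquely named wrapper
such as bp_printf resolves to the wrapper library in either order.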

My naming convention is as follows:
* printf() // library function to be wrapped
* bp_printf() // wrapper function
* orig_printf() // represents the original printf() after wrapping

But maybe I have overlooked some detail in my implementation. Here is a
short snippet from my code that does the function wrapping:

---
...
/* function names in question: providing unique names */
string newFuncName = "bp_" + oldFuncName;
string origFuncName = "orig_" + oldFuncName;

/* find functions in question */
img->findFunction(oldFuncName.c_str(), funcs);
BPatch_function *oldFunc = funcs[0];

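/* start from an empty vector so that funcs[0] is the wrapper and not a
   leftover match from the lookup above */
funcs.clear();
img->findFunction(newFuncName.c_str(), funcs);
BPatch_function *newFunc = funcs[0];

/* find the symbol, the orig func shall be referred to from now on
   (symbols is populated in code not shown here) */
Module *mod = SymtabAPI::convert(newFunc->getModule());
for (auto s=symbols.begin(); s!=symbols.end(); s++) {
	if (!(*s)->getPrettyName().compare(origFuncName)) {
		bp_as->wrapFunction(oldFunc, newFunc, (*s));
		break; /* wrap only once, with the first matching symbol */
	}
}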

---

PS: In contrast to the solution given in the Dyninst documentation
(which uses the function "findSymbol(...)" to get the symbol that should
represent the original library function from now on), I use a for loop
to find the right symbol, since the documented solution did not work
for me (I am using Dyninst 8.2.1).
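
The lookup by pretty name can also be written as a small helper that
makes the "not found" case explicit. This is only a sketch: MockSymbol
is a hypothetical stand-in for the SymtabAPI symbol type, and
findByPrettyName() is not a Dyninst function.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a SymtabAPI symbol; only the pretty name
// matters for this sketch.
struct MockSymbol {
    std::string prettyName;
    const std::string &getPrettyName() const { return prettyName; }
};

// Return the first symbol whose pretty name equals `wanted`, or nullptr
// if no symbol matches.
const MockSymbol *findByPrettyName(const std::vector<MockSymbol *> &symbols,
                                   const std::string &wanted) {
    auto it = std::find_if(symbols.begin(), symbols.end(),
                           [&wanted](const MockSymbol *s) {
                               return s->getPrettyName() == wanted;
                           });
    return it == symbols.end() ? nullptr : *it;
}
```

A nullptr result would mean that no orig_ symbol is available for the
wrapper, in which case the wrapping call should be skipped rather than
made with an invalid symbol.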

Best regards,
Sergej

