Re: [DynInst_API:] [PATCH] proccontrol: Always set UserRPCState in runIRPC


Date: Mon, 10 Feb 2014 13:47:09 -0800 (PST)
From: Matthew LeGendre <legendre1@xxxxxxxx>
Subject: Re: [DynInst_API:] [PATCH] proccontrol: Always set UserRPCState in runIRPC

On Fri, 7 Feb 2014, Josh Stone wrote:
On 02/07/2014 05:08 PM, Matthew LeGendre wrote:
Thanks for the summary, I was a little confused from reading the proposed
patch.

Yeah, sorry.  I was excited to have figured something out, but I just
sent it too quickly.

There was a fair amount of debate about ProcControlAPI's iRPC stopping
semantics a while back.  Our solution was to allow a thread to be stopped
by the user during an iRPC.  That's important for allowing the handling of
any asynchronous events (like signals) that could interrupt an iRPC, or
handling events triggered by the iRPC (like a library load).  We expect
that if the user stops a thread during an iRPC, they will eventually issue
a matching continue and then we'll finish the iRPC.

I haven't tried sneaking signals in there, but this iRPC for my
loadLibrary is still properly handling the library callbacks AFAICT.

It looks like this patch forces a thread with a iRPC to always be running,
even if a user asks for it to be stopped.  That's going to cause other
problems.  I think the proper fix is to figure out why the DyninstAPI code
that sits over ProcControlAPI isn't issuing a continue after it stops the
iRPC.

Sure, we can track down that stop.

But I also wonder how it makes any sense to stop an iRPC, *especially*
for a synchronous iRPC like loadLibrary.

ProcControlAPI and Dyninst have different models for iRPCs/OneTimeCodes, and they deal with this issue differently. For ProcControlAPI, running an iRPC is a state that a thread can be in, which is fairly independent from the rest of the system. A thread running an iRPC can be stopped/continued, handle events and be subject to any normal ProcControlAPI operation.

Dyninst uses OneTimeCodes, which are built on top of ProcControlAPI's iRPCs. Dyninst won't let a OneTimeCode be stopped or allow all other operations to happen while the OneTimeCode is running. It will always immediately run a OneTimeCode to completion when it's posted.

The Dyninst model presents a simpler API to the user, but has more restrictions and has a harder time handling complicated cases (which I suspect is the source of this bug).

If a user stops that from some nested callback, then they'll never get control back!

The Dyninst model doesn't allow that to happen, and the ProcControlAPI model does two things prevent it:

If a user waits for completion of a synchronous iRPC and stops the thread it's running on, this gets detected in ProcControlAPI and is returned as an error from the 'wait for completion' API call.

Also, ProcControlAPI doesn't allow nested callbacks. A ProcControlAPI callback function is somewhat analogous to a UNIX signal handler in that it has a limited set of "safe" calls it can make. The set of calls that can trigger a callback (mostly anything that waits for something to happen) is well defined and is not allowed to be invoked from a callback.

And then you have that "interesting problem" block of code in
PCProcess::postIRPC_internal which will force-continue it anyway...

That code is part of the mapping between the Dyninst model and the ProcControlAPI model. Dyninst does the force continue, ProcControlAPI does not.

Do you have a small reproducer for the original problem?  If not, could
you share some DYNINST_DEBUG_PROCCONTROL output showing the issue?

Sure, see attached.  When you run like:

$ ./mutator ./forkexec
started pid 15523
exec in pid 15524
waiting for pid 15524

Thanks. This reproduced it for me. I'll try to look into it in more depth soon.

Or with my patch, it's never allowed to stop the iRPC at all. :)  But if
that has too many of its own problems, I'm open to suggestions.

I think the general idea behind your patch is fine, but it needs to happen at the Dyninst level rather than ProcControlAPI level.

-Matt
[← Prev in Thread] Current Thread [Next in Thread→]