Re: [DynInst_API:] [PATCH] proccontrol: Always set UserRPCState in runIRPC


Date: Fri, 07 Feb 2014 18:55:05 -0800
From: Josh Stone <jistone@xxxxxxxxxx>
Subject: Re: [DynInst_API:] [PATCH] proccontrol: Always set UserRPCState in runIRPC
On 02/07/2014 05:08 PM, Matthew LeGendre wrote:
> Thanks for the summary, I was a little confused from reading the proposed 
> patch.

Yeah, sorry.  I was excited to have figured something out, but I just
sent it too quickly.

> There was a fair amount of debate about ProcControlAPI's iRPC stopping 
> semantics a while back.  Our solution was to allow a thread to be stopped 
> by the user during an iRPC.  That's important for allowing the handling of 
> any asynchronous events (like signals) that could interrupt an iRPC, or 
> handling events triggered by the iRPC (like a library load).  We expect 
> that if the user stops a thread during an iRPC, they will eventually issue 
> a matching continue and then we'll finish the iRPC.

I haven't tried sneaking signals in there, but this iRPC for my
loadLibrary is still properly handling the library callbacks AFAICT.

> It looks like this patch forces a thread with a iRPC to always be running, 
> even if a user asks for it to be stopped.  That's going to cause other 
> problems.  I think the proper fix is to figure out why the DyninstAPI code 
> that sits over ProcControlAPI isn't issuing a continue after it stops the 
> iRPC.

Sure, we can track down that stop.

But I also wonder how it makes any sense to stop an iRPC, *especially*
for a synchronous iRPC like loadLibrary.  If a user stops that from some
nested callback, then they'll never get control back!

And then you have that "interesting problem" block of code in
PCProcess::postIRPC_internal which will force-continue it anyway...

> Do you have a small reproducer for the original problem?  If not, could 
> you share some DYNINST_DEBUG_PROCCONTROL output showing the issue?

Sure, see attached.  When you run like:

$ ./mutator ./forkexec
started pid 15523
exec in pid 15524
waiting for pid 15524

... it will hang there.  A "kill -WINCH 15523" on the parent is enough
to nudge back it into action.  That's where it returns from the
runIRPCSync under loadLibrary, switching to the continueStoppedIRPC.

With DYNINST_DEBUG_PROCCONTROL=1 DYNINST_DEBUG_DYNPC=1, I can see:
> [handler.C:437-U] - Handling event 'Library' with handler 'Callback'
> [handler.C:2210-U] - Changing callback state of 20073 before CB
> [process.C:4250-U] - Desyncing 20073/20073 callback state to level 1
> [process.C:4372-U] - Changing callback state for 20073/20073 from dontcare to ditto
> [process.C:4372-U] - Changing user state for 20073/20073 from running to stopped
> [handler.C:2233-U] - Triggering callback for event 'Library'
> [handler.C:2238-U] - Triggering callback #0 for event 'Library'
> [140237764323200]pcEventMuxer.C[219]: Begin callbackMux, process pointer = 0x5eb02b0, event Library
> [140237764323200]pcEventMuxer.C[439]: Added event Library to mailbox, size now 1
> [140237764323200]pcEventMuxer.C[221]: after muxing event, mailbox size is 1
> [handler.C:2249-U] - Callback #0 return cbProcStop/cbProcStop
> [process.C:4372-U] - Changing user state for 20073/20073 from stopped to running
> [handler.C:2267-U] - Handling return value for main process
> [handler.C:2192-U] - Callbacks returned process stop
> [process.C:4372-U] - Changing user state for 20073/20073 from running to stopped
> [handler.C:2272-U] - Handling return value for child process/thread
> [handler.C:2292-U] - Restoring callback state of 20073/20073 after CB
> [process.C:4275-U] - Restoring 20073/20073 callback state to level 0
> [process.C:4372-U] - Changing callback state for 20073/20073 from ditto to dontcare
> [handler.C:437-U] - Handling event 'Breakpoint' with handler 'BreakpointContinue'
> [process.C:4275-U] - Restoring 20073/20073 bp hold state to level 0
> [process.C:4372-U] - Changing bp hold state for 20073/20073 from stopped to dontcare
> [process.C:797-U] - At top of syncRunState for 20073
> [process.C:3287-U] - 20073/20073: setting target state to Stopped
> [process.C:834-U] - Checking if any ProcStop events on 20073 are ready
> [process.C:3287-U] - 20073/20073: setting target state to Stopped
> [process.C:846-U] - Current Threading State for 20073:
> [process.C:862-U] - 20073/20073 hand: stopped, gen: stopped, user: stopped, code: ----------------SSS
> [process.C:871-U] - Running plat_syncRunState on 20073
> [process.C:1023-U] - Polling for events = !F && ((T && !T && ?) || F || F || F || F || F || F || F || F)
> [mailbox.C:149-U] - Polled mailbox for messages, but none found
> [process.C:1042-U] - Returning after handling events
> [process.C:1148-U] - Leaving WaitAndHandleEvents with return true, 'cause we're done
> [process.C:972-U] - Top of waitAndHandleEvents.  Block = true
> [process.C:1023-U] - Blocking for events = !F && ((T && !F && T) || ? || ? || ? || ? || ? || ? || ? || ?)
> [mailbox.C:155-U] - Blocking for events from mailbox, queue size = 0
(hang)

Which seems to implicate PCEventMuxer::defaultCallback for its default
ret_stopped, no?  But I don't know enough about how this is called.

If you now give the fork parent that SIGWINCH, you can see the handlers
for that signal, then:
> [process.C:1023-U] - Polling for events = !F && ((T && !F && F) || F || F || F || F || F || F || F || F)
> [mailbox.C:149-U] - Polled mailbox for messages, but none found
> [process.C:1056-U] - No threads are running to produce events
> [process.C:1148-U] - Leaving WaitAndHandleEvents with return false, 'cause we're done
> [process.C:6528-U] - Error: Error waiting for process to finish iRPC
> [140658579666816]dynProcess.C[2192]: Iterating in loop waiting for IRPC to complete
> [process.C:8320-H] - Handler starting to check for events
> [process.C:8328-H] - Events already handled, going back to sleep
> [140658579666816]dynProcess.C[2209]: ProcControl reported IRPC thread stopped, continuing and consuming events

This is where it finally kicks back into gear with that fallback loop.
Hopefully you can reproduce this yourself and get whatever full logs you
like...

Or with my patch, it's never allowed to stop the iRPC at all. :)  But if
that has too many of its own problems, I'm open to suggestions.

Thanks,
Josh
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int
main()
{
    printf("started pid %d\n", getpid());
    pid_t pid = fork();
    if (pid == 0) {
        printf("exec in pid %d\n", getpid());
        execlp("/bin/true", "/bin/true", NULL);
    }
    else if (pid > 0) {
        printf("waiting for pid %d\n", pid);
        waitpid(pid, NULL, 0);
    }
    return 0;
}
#include <error.h>
#include <BPatch.h>

static void
g_exec_callback(BPatch_thread *thread)
{
    thread->getProcess()->loadLibrary("librt.so.1");
}

int
main(int argc, const char* argv[])
{
    if (argc < 2)
        error(1, 0, "need to specify the target");

    BPatch bpatch;
    bpatch.registerExecCallback(g_exec_callback);
    BPatch_process *proc = bpatch.processCreate(argv[1], &argv[1]);
    if (!proc)
        error(1, 0, "couldn't create the process");

    proc->continueExecution();
    while (!proc->isTerminated())
        bpatch.waitForStatusChange();

    return 0;
}
[← Prev in Thread] Current Thread [Next in Thread→]