
Re: [HTCondor-users] SIGQUIT / debugging



On Feb 19, 2013, at 9:45 PM, "Shrum, Donald C" <DCShrum@xxxxxxxxxxxxx> wrote:

> Thanks for the reply, Jaime. Here is some more detail: the program I am running is a simple test program.
> The problem seems to occur on only one submit node, so perhaps I will figure this out before getting a reply :)


Here's one way to debug why your jobs are crashing: submit a job whose executable is a shell script that sleeps for one minute. Once the job starts executing, run condor_ssh_to_job on it. This gives you a login session on the execute machine with the same environment as your job, so you can run and debug your program interactively in the environment in which it's crashing.
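
The steps above could be sketched roughly as follows. This is only an illustration, not a tested recipe for your pool: the file names sleep_job.sh and sleep_job.sub are made up for the example, the submit description assumes a plain vanilla-universe job, and the script only submits if condor_submit is actually on the PATH.

```shell
#!/bin/sh
# Sketch: build a placeholder job that only sleeps, leaving time to attach to it.
# sleep_job.sh and sleep_job.sub are hypothetical names chosen for this example.

# The job's executable: a shell script that sleeps for one minute.
cat > sleep_job.sh <<'EOF'
#!/bin/sh
sleep 60
EOF
chmod +x sleep_job.sh

# A minimal submit description (assumes the vanilla universe).
cat > sleep_job.sub <<'EOF'
universe = vanilla
executable = sleep_job.sh
output = sleep_job.out
error = sleep_job.err
log = sleep_job.log
queue
EOF

# Submit only when the HTCondor tools are available on this machine.
if command -v condor_submit >/dev/null 2>&1; then
    condor_submit sleep_job.sub
    echo "When condor_q shows the job running, attach with:"
    echo "  condor_ssh_to_job <cluster>.<proc>"
else
    echo "condor_submit not found; wrote sleep_job.sh and sleep_job.sub only"
fi
```

Replace <cluster>.<proc> with the job ID that condor_submit reports. If one minute turns out to be too short to attach, a longer sleep in the script is an easy adjustment.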

Thanks and regards,
Jaime Frey
UW-Madison HTCondor Project