
Re: [HTCondor-users] SIGQUIT / debugging



I did solve my problem the hard way... but condor_ssh_to_job would have been useful, so thanks for the reply. I'll use that suggestion next time.
Have a good week.

--Donny
FSU Research Computing Center

-----Original Message-----
From: htcondor-users-bounces@xxxxxxxxxxx [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Jaime Frey
Sent: Tuesday, February 26, 2013 10:51 AM
To: HTCondor-Users Mail List
Subject: Re: [HTCondor-users] SIGQUIT / debugging

On Feb 19, 2013, at 9:45 PM, "Shrum, Donald C" <DCShrum@xxxxxxxxxxxxx> wrote:

> Thanks for the reply Jaime... Here is some more detail..  The program I am running is a simple test program.
> The problem seems to occur on only one submit node so perhaps I will figure this out prior to getting a reply  :)


Here's one way to debug why your jobs are crashing: submit a job whose executable is a shell script that sleeps for one minute. Once the job starts executing, run condor_ssh_to_job on it. This gives you a login session on the execute machine with the same environment as your job, so you can run and debug your program interactively in the environment in which it's crashing.
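
As a rough sketch of that setup, the Python helper below writes a sleeping shell script and a matching submit description file, then prints the commands to run by hand. The file names (sleep_job.sh, sleep_job.sub), the vanilla universe, and the output/error/log names are assumptions made for illustration, not anything from the original report; adjust them for your pool.

```python
import stat
import tempfile
from pathlib import Path


def write_debug_job(workdir, sleep_seconds=60):
    """Write a sleeping shell script and an HTCondor submit file into workdir.

    File names and submit settings are illustrative assumptions.
    """
    workdir = Path(workdir)

    # The "job" does nothing except stay alive long enough to attach to it.
    script = workdir / "sleep_job.sh"
    script.write_text(f"#!/bin/sh\nsleep {sleep_seconds}\n")
    script.chmod(script.stat().st_mode | stat.S_IXUSR)

    # Minimal submit description; the executable path is relative, so
    # condor_submit should be run from inside workdir.
    submit = workdir / "sleep_job.sub"
    submit.write_text(
        "universe   = vanilla\n"
        f"executable = {script.name}\n"
        "output     = sleep_job.out\n"
        "error      = sleep_job.err\n"
        "log        = sleep_job.log\n"
        "queue\n"
    )
    return script, submit


if __name__ == "__main__":
    script, submit = write_debug_job(tempfile.mkdtemp(prefix="condor_debug_"))
    print(f"cd {submit.parent} && condor_submit {submit.name}")
    print("condor_q    # wait until the job is running")
    print("condor_ssh_to_job <cluster>.<proc>")
```

One minute is the interval suggested above; if the job tends to sit idle before it starts, or you want more time in the session, pass a larger sleep_seconds.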

Thanks and regards,
Jaime Frey
UW-Madison HTCondor Project
