Re: [HTCondor-users] Job not starting correctly



Hi Peter,

You say that when you submit an interactive job, you run the script by doing "./runscript". Do your jobs ever use condor file transfer or is your pool set up to assume a shared file system?

When you submit the job normally, do you still get back the output (stdout) and error (stderr) files? It might be useful to print out the environment at the very beginning of the script and compare between a normal job and an interactive job.
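A minimal sketch of that comparison, assuming each job writes the output of `env` to a file as its first step. The helper names and the two file arguments are hypothetical, and multi-line variable values are not handled:

```python
import os
import sys


def parse_env_dump(text):
    """Parse the output of `env` (one KEY=VALUE per line) into a dict."""
    env = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # lines without '=' are skipped; multi-line values are not handled
            env[key] = value
    return env


def diff_env(normal, interactive):
    """Return (only_in_normal, only_in_interactive, changed) for two env dicts."""
    only_normal = sorted(set(normal) - set(interactive))
    only_interactive = sorted(set(interactive) - set(normal))
    changed = {
        key: (normal[key], interactive[key])
        for key in sorted(set(normal) & set(interactive))
        if normal[key] != interactive[key]
    }
    return only_normal, only_interactive, changed


if __name__ == "__main__":
    # Hypothetical usage: python envdiff.py normal_env.txt interactive_env.txt
    args = sys.argv[1:]
    if len(args) == 2 and all(os.path.isfile(a) for a in args):
        with open(args[0]) as f:
            normal = parse_env_dump(f.read())
        with open(args[1]) as f:
            interactive = parse_env_dump(f.read())
        only_normal, only_interactive, changed = diff_env(normal, interactive)
        print("Only in normal job:", only_normal)
        print("Only in interactive job:", only_interactive)
        for key, (a, b) in changed.items():
            print(f"{key}: normal={a!r} interactive={b!r}")
```

Variables such as PATH, LD_LIBRARY_PATH, HOME, or anything license-related that differ between the two dumps would be the first things to look at.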

Jason Patton

On Mon, May 3, 2021 at 5:04 PM Peter Ellevseth <Peter.Ellevseth@xxxxxxxxxx> wrote:

Gents

 

We are running a commercial CFD code via HTCondor and have been doing so for years without any issues. I installed a new version of that software and want to run it via HTCondor as usual. I do this by telling condor to run a locally installed bash script on the execute node, which in turn starts the CFD solver. I have to do it this way to source some files needed by the solver to start (license etc.).

 

However, the new version is refusing to start. From the StarterLog.slotX I see the job immediately stops with

 

05/03/21 23:56:33 (pid:4135578) Create_Process succeeded, pid=4135579

05/03/21 23:56:33 (pid:4135578) Process exited, pid=4135579, status=139

05/03/21 23:56:33 (pid:4135578) Got SIGQUIT.  Performing fast shutdown.
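For reference, an exit status of 139 is consistent with the solver having been killed by a signal, assuming the wrapper script passed on its child's status using the usual shell convention of 128 plus the signal number. A small sketch of that arithmetic (the function name is hypothetical, and the signal name returned depends on the platform's signal numbering):

```python
import signal


def decode_shell_status(status):
    """Return the signal name if status follows the shell's 128+N convention, else None."""
    if status <= 128:
        return None  # ordinary exit code, not a signal
    try:
        return signal.Signals(status - 128).name
    except ValueError:
        return None  # no signal with that number on this platform


print(decode_shell_status(139))
```

On Linux, 139 - 128 = 11, which is SIGSEGV (a segmentation fault).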

 

If I ssh in to one of the execute nodes I can start it just fine and it runs as normal.

 

If I do condor_submit -interactive my_submit_file, I am able to run the script with ./runscript just fine.

 

Then why won't it start when I submit the file normally?

 

Peter
