
[Condor-users] job submit



Hi, I need your help.
I am using Condor version 7.0.2.
I have a problem running jobs that are submitted from a submit or execute machine. The jobs start running, but after some time they stop. On the local machine the execution is fine; the problem appears with remote submission. In addition, when I submit jobs from my manager machine, the jobs do get executed on the remote machines.
My submit description file:
Universe   = vanilla
Executable =/home/condor/test
Arguments  =15 10
Log        =/home/condor/test.log
Output     =/home/condor/test.out
Error      =/home/condor/test.error
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
Queue

My log file is:
000 (012.000.000) 03/25 14:06:22 Job submitted from host: <41.229.35.203:49377>
...
001 (012.000.000) 03/25 14:06:23 Job executing on host: <41.229.35.204:45769>
...
022 (012.000.000) 03/25 14:06:38 Job disconnected, attempting to reconnect
    Socket between submit and execute hosts closed unexpectedly
    Trying to reconnect to slot2@Grid011 <41.229.35.204:45769>
...
023 (012.000.000) 03/25 14:06:38 Job reconnected to slot2@Grid011
    startd address: <41.229.35.204:45769>
    starter address: <41.229.35.204:45308>
...
022 (012.000.000) 03/25 14:06:38 Job disconnected, attempting to reconnect
    Socket between submit and execute hosts closed unexpectedly
    Trying to reconnect to slot2@Grid011 <41.229.35.204:45769>
...
023 (012.000.000) 03/25 14:06:38 Job reconnected to slot2@Grid011
    startd address: <41.229.35.204:45769>
    starter address: <41.229.35.204:45308>
...
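
To make the pattern in this log easier to see, here is a minimal Python sketch that tallies the three-digit event codes in a user log. It is not part of Condor: the function name `count_events` and the regular expression are my own assumptions, based only on the event header layout visible in the excerpt above (code, job id in parentheses, date, time, message).

```python
import re
from collections import Counter

# Assumed layout of an event header line, taken from the excerpt above:
#   022 (012.000.000) 03/25 14:06:38 Job disconnected, attempting to reconnect
EVENT_RE = re.compile(
    r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\) (\d\d/\d\d) (\d\d:\d\d:\d\d) (.*)$"
)


def count_events(log_text):
    """Return a Counter mapping each event code (e.g. '022') to its count.

    Indented detail lines and the '...' separators do not match the
    header pattern and are skipped.
    """
    counts = Counter()
    for line in log_text.splitlines():
        match = EVENT_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A shortened copy of the log shown above, used only as sample input.
SAMPLE_LOG = """\
000 (012.000.000) 03/25 14:06:22 Job submitted from host: <41.229.35.203:49377>
...
001 (012.000.000) 03/25 14:06:23 Job executing on host: <41.229.35.204:45769>
...
022 (012.000.000) 03/25 14:06:38 Job disconnected, attempting to reconnect
    Socket between submit and execute hosts closed unexpectedly
    Trying to reconnect to slot2@Grid011 <41.229.35.204:45769>
...
023 (012.000.000) 03/25 14:06:38 Job reconnected to slot2@Grid011
    startd address: <41.229.35.204:45769>
    starter address: <41.229.35.204:45308>
...
"""

if __name__ == "__main__":
    for code, n in sorted(count_events(SAMPLE_LOG).items()):
        print(code, n)
```

Running it over the full log file instead of the sample would show how often the 022 (disconnected) and 023 (reconnected) events repeat.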
Thank you.
walid SAAD