
Re: [Condor-users] remote condor job never gets removed

Hey Joe, got it! It's amazing what you can learn by actually reading the Condor Manual :-)

Since it isn't explicitly mentioned in the manual, here are the steps to submit a remote job and get the results back:

   $ condor_submit -remote cmhost -pool cmhost remote_vanilla.sub
   Submitting job(s).
   Logging submit event(s).
   1 job(s) submitted to cluster 61.
   Spooling data files for 1 jobs...

After the job completes (JobStatus=4 'C'), get the results and then remove the job:

   $ condor_q -pool cmhost -name cmhost

   -- Schedd: cmhost.bestsystems.co.jp : <>
     61.0   ajs             5/18 16:55   0+00:00:06 C  0   9.8  vanilla.sh

   0 jobs; 0 idle, 0 running, 0 held
   $ condor_transfer_data -pool cmhost -name cmhost 61.0
   Fetching data files...
   $ condor_rm -pool cmhost -name cmhost 61.0
   Job 61.0 marked for removal
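
For what it's worth, the check-fetch-remove steps above could be wrapped in a small shell sketch like the one below. The function names and the CONDOR_RUN hook are my own inventions for illustration, not part of Condor, and I am assuming that condor_q's -format option can be combined with -pool and -name as shown. JobStatus 4 is "Completed", as in the listing above.

```shell
#!/bin/sh
# Sketch only: wraps the commands shown above.
# job_completed, fetch_and_remove and CONDOR_RUN are illustrative names,
# not part of Condor.

# Set CONDOR_RUN=echo to print the commands instead of running them.
CONDOR_RUN=${CONDOR_RUN:-}

# Succeeds if the job's JobStatus attribute is 4 (Completed).
# Other common values: 1 Idle, 2 Running, 3 Removed, 5 Held.
job_completed() {
    host=$1; job=$2
    status=$(condor_q -pool "$host" -name "$host" -format '%d' JobStatus "$job")
    [ "$status" = "4" ]
}

# Fetch the spooled output, then remove the job only if the fetch worked.
fetch_and_remove() {
    host=$1; job=$2
    $CONDOR_RUN condor_transfer_data -pool "$host" -name "$host" "$job" &&
        $CONDOR_RUN condor_rm -pool "$host" -name "$host" "$job"
}
```

With those defined, something like `job_completed cmhost 61.0 && fetch_and_remove cmhost 61.0` would only pull the data and remove the job once it shows up as completed. Chaining with && means the job is left in the queue if the transfer fails, so the spooled results are not lost.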


On 5/20/2006 3:17 AM, Joe Meehean wrote:

condor_transfer_data <cluster.process>


Andrew Stubbings wrote:
I have sent the logs to condor-admin. When I said the "submitting machine" I was referring to where condor_submit was invoked. I didn't know data was not returned to the original machine. I can't find any data for the job left on the machine with the schedd. Is this an effect of the job stuck in the completed ('C') state?


On 5/18/2006 11:03 PM, Erik Paulson wrote:
On Thu, May 18, 2006 at 05:20:45PM +0900, Andrew Stubbings wrote:
A remote job submitted from 6.7.18 (SuSE 9.3/x86_64) to 6.7.19 (SuSE 8.2/x86) completes, but it is never removed from the queue and the results are never returned to the submitting machine:

The SchedLog shows the job completed but ends with an mrec error:

The mrec thing is not the real problem. Please post (or stick
on a website, or send to condor-admin) the whole schedd log
and shadow log; there's not enough here to figure out what's
going on.

BTW, when you say "submitting machine" you mean the machine with
the schedd, right? Not the machine where condor_submit was invoked?
In a remote submit, Condor doesn't return data to the original
machine (there's no agent on that machine to accept the data), so
it stays on the machine with the schedd.


Condor-users mailing list