Re: [HTCondor-users] Jobs starting on HOLD



Hi Lukas,

As Zach stated below, what you are observing is expected: the "-remote" option to condor_submit implies "-spool" as well. The assumption is that you are submitting from a client (like a laptop) to a remote Schedd that does not share the same filesystem mounts, so the input files need to be spooled (transferred) to the remote Schedd.
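
To tell this kind of temporary hold apart from a real problem, you can look at the job's JobStatus and HoldReasonCode attributes. Here is a minimal sketch, assuming the usual JobStatus codes (1 = Idle, 2 = Running, 5 = Held) and assuming HoldReasonCode 16 is the code for "input files are being spooled"; the helper names job_status_name and is_spooling_hold are made up for illustration and are not part of HTCondor:

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of HTCondor.

# Map a numeric JobStatus value to a readable name.
job_status_name() {
    case "$1" in
        1) echo "Idle" ;;
        2) echo "Running" ;;
        3) echo "Removed" ;;
        4) echo "Completed" ;;
        5) echo "Held" ;;
        6) echo "Transferring Output" ;;
        7) echo "Suspended" ;;
        *) echo "Unknown" ;;
    esac
}

# Succeed if the job is held only because its input is still being spooled.
# Assumption: HoldReasonCode 16 means "input files are being spooled".
is_spooling_hold() {
    [ "$1" = "5" ] && [ "$2" = "16" ]
}

# Usage against a live Schedd (not run here):
#   set -- $(condor_q -name tux223 227.0 -af JobStatus HoldReasonCode)
#   job_status_name "$1"
#   is_spooling_hold "$1" "$2" && echo "waiting for spooled input"
```

If is_spooling_hold succeeds, the job should move to Idle by itself once the transfer has finished, with no condor_release needed.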

If you are using a shared file system, e.g. client "tux208" and submit host "tux223" can access all of your job's files with the same path names, you can submit to a remote Schedd without any file spooling.  The condor_config knob "SCHEDD_HOST", or the environment variable "_condor_SCHEDD_HOST", tells tools like condor_submit, condor_q, condor_rm, etc. which Schedd to use by default.  So if your job's files are in a shared file system, instead of "-remote" you could try the following from bash:

  lkosch@tux208:HTCondor$ _condor_SCHEDD_HOST=tux223.iehk.RWTH-Aachen.DE condor_submit free.job

And/or you could create a file ~/.condor/user_config with the contents:

   SCHEDD_HOST=tux223.iehk.RWTH-Aachen.DE

and then just run "condor_submit free.job".
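
If you want to script the config file approach, something like the following could work. It is only a sketch: the function name set_default_schedd and its optional second argument (the config directory) are made up for illustration, and it declines to modify a file that already sets SCHEDD_HOST rather than guessing which value should win.

```shell
#!/bin/sh
# Hypothetical helper: point the HTCondor tools at a default Schedd by
# writing SCHEDD_HOST into a per-user config file.
#   $1 = Schedd host name
#   $2 = config directory (optional, defaults to ~/.condor)
set_default_schedd() {
    schedd_host=$1
    config_dir=${2:-"$HOME/.condor"}
    config_file="$config_dir/user_config"

    mkdir -p "$config_dir" || return 1

    # Leave the file alone if it already sets SCHEDD_HOST.
    if [ -f "$config_file" ] &&
       grep -qi '^[[:space:]]*SCHEDD_HOST[[:space:]]*=' "$config_file"; then
        echo "SCHEDD_HOST is already set in $config_file" >&2
        return 1
    fi

    # Append so that any other settings in the file are preserved.
    printf 'SCHEDD_HOST = %s\n' "$schedd_host" >> "$config_file"
}

# Usage (not run here):
#   set_default_schedd tux223.iehk.RWTH-Aachen.DE
#   condor_config_val SCHEDD_HOST    # check which value the tools now see
#   condor_submit free.job
```

The "_condor_SCHEDD_HOST" environment variable is still handy for a one-off submit to a different Schedd, since it only applies to that one command.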

regards,
Todd


On 4/24/2018 12:01 PM, Zach Miller wrote:
> Yes, this is expected.
> 
> For "remote" submissions (or more specifically, anything where the input data is spooled) the job is submitted on hold until all the input data has been transferred.  It then goes to the Idle state where it can be matched and executed.
> 
> 
> Cheers,
> -zach
> 
> 
>> -----Original Message-----
>> From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of
>> Koschmieder, Lukas
>> Sent: Tuesday, April 24, 2018 11:51 AM
>> To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
>> Subject: [HTCondor-users] Jobs starting on HOLD
>>
>> Hi,
>>
>> according to condor_q my jobs start on HOLD and then transit to IDLE and
>> RUN. Is this the expected behavior?
>>
>> lkosch@tux208:HTCondor$ condor_submit -remote tux223 free.job && \
>> while true; do \
>> condor_q -n tux223; \
>> sleep 1; \
>> done
>>
>> Submitting job(s).
>> 1 job(s) submitted to cluster 227.
>>
>>
>> -- Schedd: tux223.iehk.RWTH-Aachen.DE : <137.226.130.73:9618?... @ 04/24/18 18:31:52
>> OWNER  BATCH_NAME        SUBMITTED   DONE   RUN    IDLE   HOLD  TOTAL JOB_IDS
>> lkosch CMD: /bin/bash   4/24 18:31      _      _      _      1      1 227.0
>>
>> 1 jobs; 0 completed, 0 removed, 0 idle, 0 running, 1 held, 0 suspended
>>
>>
>> -- Schedd: tux223.iehk.RWTH-Aachen.DE : <137.226.130.73:9618?... @ 04/24/18 18:31:53
>> OWNER  BATCH_NAME        SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
>> lkosch CMD: /bin/bash   4/24 18:31      _      _      1      1 227.0
>>
>> 1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended
>>
>>
>> (...)
>>
>>
>> -- Schedd: tux223.iehk.RWTH-Aachen.DE : <137.226.130.73:9618?... @ 04/24/18 18:31:59
>> OWNER  BATCH_NAME        SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
>> lkosch CMD: /bin/bash   4/24 18:31      _      1      _      1 227.0
>>
>> 1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
>>
>>
>>
>> lkosch@tux208:HTCondor$ condor_version
>> $CondorVersion: 8.6.10 Mar 12 2018 BuildID: 435200 $
>> $CondorPlatform: x86_64_RedHat7 $
>>
>>
>> lkosch@tux223:HTCondor$ condor_version
>> $CondorVersion: 8.6.10 Mar 12 2018 BuildID: 435200 $
>> $CondorPlatform: x86_64_RedHat7
>>
>>
>> Best regards,
>> Lukas
>>
>>
>> --
>> Lukas Koschmieder
>> Steel Institute IEHK
>> RWTH Aachen University
>> Intzestraße 1
>> 52072 Aachen
>> Germany
>>


-- 
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing   Department of Computer Sciences
HTCondor Technical Lead                1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                  Madison, WI 53706-1685