Re: [HTCondor-users] Specifying log files from python



Thanks. I will look into this. The example at that site seems to be
for a local submitter. For a remote submitter, would the general flow
be this?

       coll = htcondor.Collector(condor_host)
       schedd_ad = coll.locate(htcondor.DaemonTypes.Schedd)
       sub = htcondor.Submit({...})
       schedd = htcondor.Schedd(schedd_ad)
       with schedd.transaction() as txn:
           cluster_id = sub.queue(txn)



On Thu, Dec 28, 2017 at 3:05 PM, Jason Patton <jpatton@xxxxxxxxxxx> wrote:
> I think it was passed along at some point, but it's certainly worth
> repeating the link to the work Brian Bockelman has done documenting the
> python bindings:
> http://htcondor-python.readthedocs.io/en/latest/job_management_intro.html
>
> You can transform (most) any standard submit file (i.e. a file meant to be
> submitted using condor_submit) by transforming the attribute-value pairs in
> the file to a dict and creating a htcondor.Submit object. The example given
> on that page...
>
>>>> sub = htcondor.Submit({"executable": "/bin/sleep", "arguments": "5m"})
>
> ...is the equivalent of a submit file that says:
>
> executable = /bin/sleep
> arguments = 5m
>
> There's more documentation out there (official/unofficial) on using submit
> files than using the python bindings, but the htcondor.Submit object should
> allow a more direct translation.
>
>
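
As a rough sketch of the translation Jason describes above, the
attribute-value pairs of a simple submit file can be collected into a
dict with a few lines of plain Python. The helper name
submit_text_to_dict is hypothetical (it is not part of the htcondor
bindings), and it only handles plain "key = value" lines; macros,
includes and the arguments to "queue" are ignored.

```python
def submit_text_to_dict(text):
    """Collect simple 'key = value' submit-file lines into a dict.

    Hypothetical helper for illustration only. Blank lines, '#' comments
    and the 'queue' statement are skipped; anything more elaborate
    (macros, includes, multi-line values) is not handled.
    """
    pairs = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.split()[0].lower() == "queue":
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, sep, value = line.partition("=")
        if not sep:
            continue
        pairs[key.strip()] = value.strip()
    return pairs


submit_text = """\
executable = /bin/sleep
arguments = 5m
queue 1
"""

description = submit_text_to_dict(submit_text)
print(description)
```

The resulting dict has the same shape as the one passed to
htcondor.Submit in the example quoted above.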
> On Thu, Dec 28, 2017 at 1:44 PM, Larry Martell <larry.martell@xxxxxxxxx>
> wrote:
>>
>> Thanks very much for the reply. Let me detail my use case. I have a
>> docker container and from there I want to submit thousands of jobs.
>> The content of the data, which changes on an hourly basis, determines
>> what jobs and what their arguments are. The jobs will be running on
>> bare metal not in a container. This is my first experience with
>> condor. From googling and researching and reading the docs it seemed
>> that using the python API would be the way to go with this. Reading
>> about that, everything I found was about creating ClassAds.
>>
>> If I were to do this using submit files, how would that work? Is there
>> a python API for creating those? Or would I have to just open the
>> files and write them out with python? I guess I could do this, but
>> then the management of all those files will become a headache. Putting
>> that aside for a moment, I would have to get the submit files from
>> inside the docker container to somewhere condor can find them. Do they
>> have to be on the condor master?
>>
>>
>> On Thu, Dec 28, 2017 at 11:37 AM, John M Knoeller <johnkn@xxxxxxxxxxx>
>> wrote:
>> > That is not the equivalent submit classad for that submit file, it's not
>> > even close.   It might generously be considered "conceptually"
>> > equivalent.
>> > But it *cannot* be used as an actual submit classad.
>> >
>> > There are at least a dozen required attributes missing, and the UserLog
>> > expression (and many others) must be a simple string, not an expression
>> > using strcat().
>> >
>> > You need to do the string concatenation in python and then assign the
>> > results to UserLog.
>> >
>> >
>> >
>> > Or better yet, ignore ANY guide to submitting jobs that does not use the
>> > htcondor.Submit() class.  While submitting jobs as bare classads is
>> > technically possible, it is ridiculously hard to do correctly.
>> >
>> >
>> >
>> > If you want to see the equivalent submit classad for a given submit file,
>> > the condor_submit tool will show the resulting classad when you use either
>> > the -dump or -dry-run arguments.
>> >
>> > $ condor_submit -dump - example.sub
>> >
>> >
>> >
>> > ClusterId=1
>> > BufferSize=524288
>> > NiceUser=false
>> > CoreSize=0
>> > CumulativeSlotTime=0
>> > OnExitHold=false
>> > RequestCpus=1
>> > BufferBlockSize=32768
>> > Err="test.err"
>> > ImageSize=0
>> > WantCheckpoint=false
>> > CommittedTime=0
>> > WhenToTransferOutput="ON_EXIT"
>> > TargetType="Machine"
>> > Cmd="/home/john/test.sh"
>> > JobUniverse=5
>> > ExitBySignal=false
>> > TransferIn=false
>> > Iwd="/home/john"
>> > NumJobCompletions=0
>> > CumulativeRemoteUserCpu=0.0
>> > NumRestarts=0
>> > EncryptExecuteDirectory=false
>> > CommittedSuspensionTime=0
>> > Owner="johnkn"
>> > NumSystemHolds=0
>> > CumulativeSuspensionTime=0
>> > Environment=""
>> > Requirements=( TARGET.Arch == "X86_64" ) && ( TARGET.OpSys == "LINUX" ) && ( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory ) && ( TARGET.HasFileTransfer )
>> > RequestDisk=DiskUsage
>> > MinHosts=1
>> > JobNotification=0
>> > TransferOutput="output"
>> > NumCkpts=0
>> > LastSuspensionTime=0
>> > NumJobStarts=0
>> > WantRemoteSyscalls=false
>> > JobLeaseDuration=2400
>> > JobPrio=0
>> > RootDir="/"
>> > CurrentHosts=0
>> > StreamOut=false
>> > WantRemoteIO=true
>> > OnExitRemove=true
>> > DiskUsage=0
>> > In="/dev/null"
>> > PeriodicRemove=false
>> > ExecutableSize=0
>> > RemoteUserCpu=0.0
>> > LocalUserCpu=0.0
>> > RemoteSysCpu=0.0
>> > LocalSysCpu=0.0
>> > CompletionDate=0
>> > RemoteWallClockTime=0.0
>> > Rank=0.0
>> > LeaveJobInQueue=false
>> > CondorVersion="$CondorVersion: 8.6.9 Dec 18 2017 BuildID: 427508 $"
>> > MyType="Job"
>> > StreamErr=false
>> > PeriodicHold=false
>> > Arguments="foo bar"
>> > Out="test.out.0"
>> > UserLog="/home/john/test.log"
>> > JobStatus=1
>> > PeriodicRelease=false
>> > RequestMemory=ifthenelse(MemoryUsage =!= undefined,MemoryUsage,( ImageSize + 1023 ) / 1024)
>> > MaxHosts=1
>> > TotalSuspensions=0
>> > CommittedSlotTime=0
>> > CumulativeRemoteSysCpu=0.0
>> > TransferInputSizeMB=0
>> > CondorPlatform="$CondorPlatform: x86_64_RedHat6 $"
>> > ShouldTransferFiles="YES"
>> > ExitStatus=0
>> > QDate=1514477927
>> > EnteredCurrentStatus=1514477927
>> > ProcId=0
>> >
>> >
>> > Notice that Out="test.out.0", and NOT an expression using strcat.
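
One way to follow that advice is to build the file names as ordinary
Python strings before they go into the job description. A minimal
sketch, assuming a hypothetical helper log_paths and the ".<ProcId>"
suffix convention seen for Out in the dump above; the directory and job
name are the ones from the original question.

```python
import posixpath


def log_paths(log_dir, job_name, proc_id):
    """Return plain-string file names for one job (hypothetical helper).

    The concatenation happens here, in Python, so every value is a simple
    string rather than a strcat() expression.
    """
    base = posixpath.join(log_dir, job_name)
    return {
        "Err": "{}.err.{}".format(base, proc_id),
        "Out": "{}.out.{}".format(base, proc_id),
        # UserLog is left without a ProcId suffix, as it is in the dump.
        "UserLog": base + ".log",
    }


paths = log_paths("/Staging/Repos/CAPbase/cluster/logs", "compute_radiology", 0)
for name, value in sorted(paths.items()):
    print(name, "=", value)
```

Each value is then a finished string that can be assigned directly to
the corresponding attribute.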
>> >
>> >
>> > From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Larry Martell
>> > Sent: Wednesday, December 27, 2017 3:32 PM
>> > To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
>> > Subject: Re: [HTCondor-users] Specifying log files from python
>> >
>> >
>> >
>> > I am just getting started with condor. I do not have any submit files nor
>> > any legacy system. I am finding it very difficult to get going. I have been
>> > messing with it for 2 weeks and have yet to be able to successfully submit a
>> > job.
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > On Wed, Dec 27, 2017 at 1:26 PM Todd Tannenbaum <tannenba@xxxxxxxxxxx>
>> > wrote:
>> >
>> > If you are using HTCondor v8.6.6 or above, I recommend using the
>> > htcondor.Submit() class to submit jobs instead of the Schedd() class. With
>> > the Submit class, you do not have to convert your submit file into a
>> > classad - the Submit class will do that for you. So if you have a
>> > condor_submit file that works, you are likely good to go. Much easier than
>> > trying to use the more primitive submit method in the Schedd class in my
>> > opinion. Details on the Submit class are in the Manual, and there is an
>> > example submitting a job with the Submit class in this ticket:
>> >
>> > https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=6420
>> >
>> >
>> > Hope this helps
>> >
>> > Todd
>> >
>> >
>> >
>> >
>> >
>> > On Dec 27, 2017, at 11:59 AM, Larry Martell <larry.martell@xxxxxxxxx>
>> > wrote:
>> >
>> > On this site:
>> >
>> > http://osgtech.blogspot.com/2014/03/submitting-jobs-to-htcondor-using-python.html
>> > I see this:
>> >
>> > For example, consider the following submit file:
>> >
>> > executable = test.sh
>> > arguments = foo bar
>> > log = test.log
>> > output = test.out.$(Process)
>> > error = test.err
>> > transfer_output_files = output
>> > should_transfer_files = yes
>> > queue 1
>> >
>> > The equivalent submit ClassAd is:
>> >
>> > [
>> >    Cmd = "test.sh";
>> >    Arguments = "foo bar";
>> >    UserLog = "test.log";
>> >    Out = strcat("test.out",ProcId);
>> >    Err = "test.err";
>> >    TransferOutput = "output";
>> >    ShouldTransferFiles = "YES";
>> > ]
>> >
>> > So in my python code I create this logging in my ClassAd:
>> >
>> > [
>> >        Err = "strcat(\"/Staging/Repos/CAPbase/cluster/logs/compute_radiology.err\", ProcId)";
>> >        Out = "strcat(\"/Staging/Repos/CAPbase/cluster/logs/compute_radiology.out\", ProcId)";
>> >        UserLog = "strcat(\"/Staging/Repos/CAPbase/cluster/logs/compute_radiology.log\", ProcId)";
>> > ]
>> >
>> > But I see this in the SchedLog:
>> >
>> > 12/27/17 12:34:15 (pid:3755290) WriteUserLog::initialize: safe_open_wrapper("/opt/capcompute/util/strcat("/Staging/Repos/CAPbase/cluster/logs/compute_radiology.log", ProcId)") failed - errno 2 (No such file or directory)
>> >
>> > /opt/capcompute/util/ is the dir the python script that is submitting
>> > the job is running from.
>> >
>> > What am I doing wrong here? How do I properly specify the path and
>> > file name for the logs?