
Re: [HTCondor-users] need help in settings for condo job submission using python bindings



Yes, condor_submit and sub.queue() do a great many things that schedd.submit() does not do. That is why the schedd.submit() method (and SOAP) was deprecated: it requires you to do everything that sub.queue() does internally.

 

I don’t have any guesses as to why your output and error files are empty. I would suggest comparing the job ad you see from condor_q -long for a job that returned the correct output with the one for a job that did not. If the failure to return output is somehow a bug in HTCondor, it will almost certainly be triggered by some difference between those job ads.
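A minimal sketch of that comparison in Python, assuming the condor_q -long output for each job has been saved as text with one "Attr = value" pair per line. The helper names parse_long_ad and diff_ads are hypothetical and not part of the htcondor bindings; attribute names are compared exactly as written.

```python
def parse_long_ad(text):
    """Parse 'Attr = value' lines (as printed by condor_q -long) into a dict."""
    ad = {}
    for line in text.splitlines():
        # Split on the first " = " only, so values containing " = " stay intact.
        name, sep, value = line.partition(" = ")
        if sep:
            ad[name.strip()] = value.strip()
    return ad


def diff_ads(good, bad):
    """Return {attr: (good_value, bad_value)} for every attribute that differs.

    An attribute missing from one ad is reported with None on that side.
    """
    diffs = {}
    for name in sorted(set(good) | set(bad)):
        g, b = good.get(name), bad.get(name)
        if g != b:
            diffs[name] = (g, b)
    return diffs
```

With the two ads saved to files (the file names here are placeholders), diff_ads(parse_long_ad(open("good.txt").read()), parse_long_ad(open("bad.txt").read())) lists only the attributes worth looking at. Attributes such as the cluster id and timestamps will always differ and can be ignored.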

 

-tj

 

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Xin Wang
Sent: Monday, September 25, 2017 10:25 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>; 'htcondor-admin@xxxxxxxxxxx' <htcondor-admin@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] need help in settings for condo job submission using python bindings

 

Hi, John,

 

I tried your approach and used condor_submit -dump <dumpfile> to see the job classad for my submit file. It has ~80 lines, and most of them do not make any sense to me. I tried adding those extra settings to my script, but it did not help.

 

The error when running schedd.submit(job_ad) in my original script is:

condor_exec.exe: error while loading shared libraries: libpython3.6m.so.1.0: cannot open shared object file: No such file or directory

which clearly indicates that something is wrong with the environment and that condor cannot find the python3.6 shared libraries.

 

The strange thing is that I did set PYTHONHOME in the environment. That is sufficient for condor_submit <submitfile> and for the job submitted using sub.queue(), but not for schedd.submit(job_ad).

 

To confirm my idea, I updated the environment to

sub['environment'] = "PYTHONHOME=/my/path/to/anaconda3 LD_LIBRARY_PATH=/my/path/to/anaconda3/lib"

and then my script worked with schedd.submit(job_ad).
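As an illustration only, the environment string used above can be built from a dictionary so that adding a variable such as LD_LIBRARY_PATH is less error-prone. This is a sketch: format_environment is a hypothetical helper, not an htcondor function, and it assumes values contain no whitespace or quote characters, since those would need HTCondor's quoting rules.

```python
def format_environment(variables):
    """Join NAME=value pairs with single spaces, as in the string shown above.

    Assumption: values are plain paths or tokens. Values containing
    whitespace or quotes are rejected rather than guessed at.
    """
    parts = []
    for name, value in variables.items():
        if any(ch.isspace() for ch in value) or '"' in value or "'" in value:
            raise ValueError(f"value for {name} needs quoting: {value!r}")
        parts.append(f"{name}={value}")
    return " ".join(parts)


env = format_environment({
    "PYTHONHOME": "/my/path/to/anaconda3",
    "LD_LIBRARY_PATH": "/my/path/to/anaconda3/lib",
})
```

The resulting env string is the same value assigned to the environment setting in the message above.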

 

Now the question is: do condor_submit and sub.queue() do anything extra that schedd.submit() is not doing?

 

For the job submitted using sub.queue(), I’m 100% sure that the job ran without issues, as I can see all the results generated by my script. The only problem is that the output and error files specified for the job are not updated at all.

 

Thank you.

 

Xin

 

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of John M Knoeller
Sent: Friday, September 22, 2017 5:09 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>; 'htcondor-admin@xxxxxxxxxxx' <htcondor-admin@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] need help in settings for condo job submission using python bindings

 


First of all, the job submitted using schedd.submit(job_ad) doesn’t run because the job ad is incomplete. When you use that method, you must fully specify the job classad. To see what a fully specified job classad looks like, run condor_submit -dump <submit_file>

 

For the job submitted using sub.queue(): are you sure that the job ran and produced output? When the job is submitted, the output and error files are created as 0-size files before the job ever runs.

 

-tj

 

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Xin Wang
Sent: Friday, September 22, 2017 2:44 PM
To: 'htcondor-users@xxxxxxxxxxx' <htcondor-users@xxxxxxxxxxx>; 'htcondor-admin@xxxxxxxxxxx' <htcondor-admin@xxxxxxxxxxx>
Subject: [HTCondor-users] need help in settings for condo job submission using python bindings

 

I’m trying to submit jobs to condor to run some python scripts. If I generate a job file and submit with condor_submit, everything works fine.

Here is the job file:

 

universe = vanilla

environment = "PYTHONHOME=/my/path/to/anaconda3"

executable = /my/path/to/anaconda3/bin/python

arguments = /my/path/to/scripts/myrun.py

log = /tmp/job.log

output = /tmp/test.log

error = /tmp/test.err

queue

 

 

For the same job, I tried to submit through the python bindings using two different methods, but did not have luck with either.

 

First, I tried htcondor.Submit with sub.queue(), using the following code:

 

import htcondor

schedd = htcondor.Schedd()

sub = htcondor.Submit()

sub['universe'] = 'vanilla'

sub['environment'] = "PYTHONHOME=/my/path/to/anaconda3"

sub['executable'] = '/my/path/to/anaconda3/bin/python'

sub['arguments'] = '/my/path/to/scripts/myrun.py'

sub['log'] = '/tmp/job.log'

sub['output'] = '/tmp/test.log'

sub['error'] = '/tmp/test.err'

 

with schedd.transaction() as txn:

    sub.queue(txn)

 

The job was submitted and ran successfully, and the log file /tmp/job.log was generated. However, output and error do not work: /tmp/test.log and /tmp/test.err are created, but with size 0 (empty).

 

 

Second, I tried schedd.submit with the following code:

import htcondor

schedd = htcondor.Schedd()

job_ad = {

    "cmd" : '/my/path/to/anaconda3/bin/python',

    "arguments" : '/my/path/to/scripts/myrun.py',

    'env': "PYTHONHOME=/my/path/to/anaconda3",

    "log": '/tmp/job.log',

    "out": '/tmp/test.log',

    "err": "/tmp/test.err",

}

clusterId = schedd.submit(job_ad)

 

The job could not run. However, /tmp/test.err was generated with a proper error message:

condor_exec.exe: error while loading shared libraries: libpython3.6m.so.1.0: cannot open shared object file: No such file or directory

I suspect that the error is because the environment is not properly set, but I had no luck when I also tried to set “environment” instead of “env”.

 

How should I fix the settings so that I can submit condor jobs through the python bindings properly? Thanks.

 

Xin

 


Jefferies archives and monitors outgoing and incoming e-mail. The contents of this email, including any attachments, are confidential to the ordinary user of the email address to which it was addressed. If you are not the addressee of this email you may not copy, forward, disclose or otherwise use it or any part of it in any form whatsoever. This email may be produced at the request of regulators or in connection with civil litigation. Jefferies accepts no liability for any errors or omissions arising as a result of transmission. Use by other than intended recipients is prohibited. In the United Kingdom, Jefferies operates as Jefferies International Limited; registered in England: no. 1978621; registered office: Vintners Place, 68 Upper Thames Street, London EC4V 3BJ. Jefferies International Limited is authorized and regulated by the Financial Conduct Authority.
