Re: [HTCondor-users] Condor SOAP API and queue jobs.



On 11/14/2012 07:56 AM, Javi Roman wrote:
On Wed, Nov 14, 2012 at 1:08 PM, Matthew Farrellee <matt@xxxxxxxxxx> wrote:
On 11/14/2012 06:10 AM, Javi Roman wrote:
Hello.

I'm taking my first steps with Python and the HTCondor SOAP API (condor
v7.8.5). Submitting single jobs through the SOAP API cycle works fine.

It's easy to send jobs with a simple submit description file like this:

Universe   = vanilla
Executable = /bin/sleep
Arguments  = 30
Log        = simple.log
Output     = simple.out
Error      = simple.error
Queue

However, I've run into a problem when I want to queue many instances of
the same job, for example:

Universe   = vanilla
Executable = /bin/sleep
Arguments  = 30
Log        = simple.log
Output     = simple.out
Error      = simple.error
Queue 150

I'm using the following Python code to send these 150 jobs:

########## begin code ############
from suds.client import Client
import logging
import sys

CONDOR_HOST      = "server"
SCHEDD_PORT      = "8080"
SCHEDD_LOCATION  = "http://" + CONDOR_HOST + ":" + SCHEDD_PORT
WSDL_SCHEDD_FILE = "file:condorSchedd.wsdl"

def updateAdProperty(job, name, type=None, value=None):
     for i in range(len(job[1][0])):
         if (job[1][0][i].name == name):
             if type:
                 job[1][0][i].type = type
             if value:
                 job[1][0][i].value = value
             return True
     return False

condor_schedd = Client(WSDL_SCHEDD_FILE, location=SCHEDD_LOCATION)

#
# Job submission cycle:
# 1. beginTransaction.
# 2. newCluster
# 3. newJob
# 4. createJobTemplate
# 5. submit
# 6. commitTransaction
#
transaction = condor_schedd.service.beginTransaction(10);
transactionId = transaction[1]
print "TransactionId: %s" % transactionId

cluster = condor_schedd.service.newCluster(transactionId)
clusterId=cluster[1]
print "ClusterId: %s" % clusterId

for i in xrange(150):
         print "newJob"
         job = condor_schedd.service.newJob(transactionId, clusterId)
         jobId = job[1]
         print "createJobTemplate"
         job = condor_schedd.service.createJobTemplate(clusterId, jobId,
"condor", 5, "/bin/sleep", "30", "")
         updateAdProperty(job, "LeaveJobInQueue", value="FALSE")
         jobAd = job[1]
         print "submit jobId -> %s" % jobId
         print "submit"
         result = condor_schedd.service.submit(transactionId, clusterId,
jobId, jobAd)

result = condor_schedd.service.commitTransaction(transactionId)
condor_schedd.service.requestReschedule();
#res = condor_schedd.service.closeSpool(transaction, clusterId, jobId)
print result

############ end code #####################


The job is submitted 150 times under the same ClusterId; however, the
submit loop is quite slow, probably due to the HTTP connections.

Is this the correct way to send a set of jobs?
Is there any way to speed up this submission cycle, the way
"condor_submit" does it?

Best Regards.

--
Javi Roman

The result from createJobTemplate() can be reused. Call it once before the loop, update the ClusterId on it, then update its ProcId for each newJob() call.

Best,


matt
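
Matt's suggestion could look roughly like the sketch below. It is only an illustration under a few assumptions: `service` stands for `condor_schedd.service` from the original script, the results are indexed the same way as there (`result[1]` holds the payload, and `template[1][0]` is the attribute list), and attribute values are passed as strings. The helper names `set_attr` and `submit_cluster` are hypothetical and not part of the SOAP API.

```python
def set_attr(template, name, value):
    """Overwrite an existing attribute in a createJobTemplate() result.

    Assumes the layout used in the original script: template[1][0] is a
    list of items with .name and .value fields.
    """
    for item in template[1][0]:
        if item.name == name:
            item.value = value
            return True
    return False


def submit_cluster(service, count, owner="condor", cmd="/bin/sleep", args="30"):
    """Submit `count` jobs in one cluster, building the job ad only once."""
    transaction_id = service.beginTransaction(10)[1]
    cluster_id = service.newCluster(transaction_id)[1]

    # Single createJobTemplate() call. It is made after newCluster(), so the
    # template already carries the right ClusterId; 5 is the vanilla universe,
    # as in the original script.
    template = service.createJobTemplate(cluster_id, 0, owner, 5, cmd, args, "")
    set_attr(template, "LeaveJobInQueue", "FALSE")

    job_ids = []
    for _ in range(count):
        job_id = service.newJob(transaction_id, cluster_id)[1]
        # Only the ProcId changes from one job to the next
        # (assumption: the value is sent as a string).
        set_attr(template, "ProcId", str(job_id))
        service.submit(transaction_id, cluster_id, job_id, template[1])
        job_ids.append(job_id)

    service.commitTransaction(transaction_id)
    service.requestReschedule()
    return cluster_id, job_ids
```

With the client from the original script this would be called as `submit_cluster(condor_schedd.service, 150)`. Each job then costs two SOAP calls (newJob and submit) instead of three.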


Many thanks, Matt.

Updating the ClusterId, or even only the ProcId, in the job template from createJobTemplate() (called once before the loop) reduces the execution time from about 50 seconds to 12 seconds for 50 iterations.

It's a good solution, but I guess it will still be slow if I launch hundreds of jobs. I'm afraid the SOAP API is not suitable for this task from a performance point of view.


--
Javi Roman


Javi,

You may want to check out the alternate SOAP API in the contrib space: Aviary. It goes pretty fast after the nominal overhead of the HTTP connection. With Aviary, there is more packing into the single submit RPC call as opposed to the 3 separate RPC calls used above in the original Birdbath SOAP interface.

http://condor-git.cs.wisc.edu/?p=condor.git;a=blob;f=src/condor_contrib/aviary/README
http://condor-git.cs.wisc.edu/?p=condor.git;a=blob;f=src/condor_contrib/aviary/test/submit.py
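
To put rough numbers on the difference, here is a small back-of-the-envelope count of SOAP round trips for the three approaches discussed in this thread. It assumes every call is one HTTP request and ignores connection setup; the per-job counts are read off the original script and the description above, and the helper name `soap_calls` is made up for the illustration.

```python
def soap_calls(jobs, per_job_calls, fixed_calls):
    """Total SOAP round trips for a batch of `jobs` submissions."""
    return fixed_calls + jobs * per_job_calls


JOBS = 150

# Original Birdbath loop: newJob + createJobTemplate + submit per job, plus
# beginTransaction, newCluster, commitTransaction and requestReschedule.
original = soap_calls(JOBS, per_job_calls=3, fixed_calls=4)

# Template reused: newJob + submit per job, plus the four fixed calls and a
# single createJobTemplate before the loop.
reused = soap_calls(JOBS, per_job_calls=2, fixed_calls=5)

# Aviary, as described above: one submit call per job.
aviary = soap_calls(JOBS, per_job_calls=1, fixed_calls=0)
```

For 150 jobs this gives 454, 305 and 150 calls respectively, so the call count alone does not explain all of the timing differences reported in this thread.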

On my machine I get results like this for 150 "simple" submits like yours:
real    0m7.922s

Another user obtained:
$ time PYTHONPATH=/usr/share/condor/aviary/module/ submit.py | wc -l
150
real 0m2.453s

YMMV,
\Pete

-- 
Peter MacKinnon
Cloud BU/MRG Grid
Red Hat Inc.
Raleigh, NC