Re: [HTCondor-users] Condor SOAP API and queue jobs.



On Wed, Nov 14, 2012 at 3:50 PM, Peter MacKinnon <pmackinn@xxxxxxxxxx> wrote:
On 11/14/2012 07:56 AM, Javi Roman wrote:
On Wed, Nov 14, 2012 at 1:08 PM, Matthew Farrellee <matt@xxxxxxxxxx> wrote:
On 11/14/2012 06:10 AM, Javi Roman wrote:
Hello.

I'm taking my first steps with Python and the HTCondor SOAP API (condor
v7.8.5). Submitting single jobs through the SOAP API cycle works fine.

It's easy to send jobs with a simple submit description file like this:

Universe   = vanilla
Executable = /bin/sleep
Arguments  = 30
Log        = simple.log
Output     = simple.out
Error      = simple.error
Queue

However, I run into a problem when I want to submit a number of
queued jobs, for example:

Universe   = vanilla
Executable = /bin/sleep
Arguments  = 30
Log        = simple.log
Output     = simple.out
Error      = simple.error
Queue 150

I'm using the following Python code to submit these 150 jobs:

########## begin code ############
from suds.client import Client
import logging
import sys

CONDOR_HOST      = "server"
SCHEDD_PORT      = "8080"
SCHEDD_LOCATION  = "http://" + CONDOR_HOST + ":" + SCHEDD_PORT
WSDL_SCHEDD_FILE = "file:condorSchedd.wsdl"

def updateAdProperty(job, name, type=None, value=None):
     for i in range(len(job[1][0])):
         if (job[1][0][i].name == name):
             if type:
                 job[1][0][i].type = type
             if value:
                 job[1][0][i].value = value
             return True
     return False

condor_schedd = Client(WSDL_SCHEDD_FILE, location=SCHEDD_LOCATION)

#
# Job submission cycle:
# 1. beginTransaction.
# 2. newCluster
# 3. newJob
# 4. createJobTemplate
# 5. submit
# 6. commitTransaction
#
transaction = condor_schedd.service.beginTransaction(10);
transactionId = transaction[1]
print "TransactionId: %s" % transactionId

cluster = condor_schedd.service.newCluster(transactionId)
clusterId=cluster[1]
print "ClusterId: %s" % clusterId

for i in xrange(150):
    print "newJob"
    job = condor_schedd.service.newJob(transactionId, clusterId)
    jobId = job[1]
    print "createJobTemplate"
    job = condor_schedd.service.createJobTemplate(
        clusterId, jobId, "condor", 5, "/bin/sleep", "30", "")
    updateAdProperty(job, "LeaveJobInQueue", value="FALSE")
    jobAd = job[1]
    print "submit jobId -> %s" % jobId
    print "submit"
    result = condor_schedd.service.submit(
        transactionId, clusterId, jobId, jobAd)

result = condor_schedd.service.commitTransaction(transactionId)
condor_schedd.service.requestReschedule();
#res = condor_schedd.service.closeSpool(transaction, clusterId, jobId)
print result

############ end code #####################


The job is submitted 150 times under the same ClusterId, but the submit
loop is quite slow, probably because of the HTTP connections.

Is this the correct way to submit a set of jobs?
Is there any way to speed up this submission cycle, the way "condor_submit"
does it?

Best Regards.

--
Javi Roman

The result from createJobTemplate() can be reused. Call it once before the loop, update the ClusterId on it, then update its ProcId for each newJob() call.
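
A minimal sketch of that suggestion, reusing only the SOAP calls from the original script. The helper names set_ad_attr and submit_many are hypothetical, as is the assumption that ClusterId and ProcId are stored in the ad as strings; `service` stands for condor_schedd.service from the code above. It avoids print so it should run unchanged on Python 2 or 3.

```python
def set_ad_attr(ad, name, value):
    """Set one attribute on a ClassAd shaped like the one the original
    updateAdProperty() walks: ad[0] is a list of items, each with
    .name and .value. Returns True if the attribute was found."""
    for item in ad[0]:
        if item.name == name:
            item.value = value
            return True
    return False


def submit_many(service, count, executable="/bin/sleep", arguments="30"):
    """Submit `count` jobs in one cluster, building the template once."""
    transaction_id = service.beginTransaction(10)[1]
    cluster_id = service.newCluster(transaction_id)[1]

    # Build the template a single time. The ProcId passed here (0) is a
    # placeholder; it is overwritten for every job in the loop below.
    template = service.createJobTemplate(
        cluster_id, 0, "condor", 5, executable, arguments, "")
    job_ad = template[1]
    # Assumption: attribute values are sent as strings.
    set_ad_attr(job_ad, "ClusterId", str(cluster_id))
    set_ad_attr(job_ad, "LeaveJobInQueue", "FALSE")

    job_ids = []
    for _ in range(count):
        # Two RPCs per job (newJob, submit) instead of three.
        job_id = service.newJob(transaction_id, cluster_id)[1]
        set_ad_attr(job_ad, "ProcId", str(job_id))
        service.submit(transaction_id, cluster_id, job_id, job_ad)
        job_ids.append(job_id)

    service.commitTransaction(transaction_id)
    return cluster_id, job_ids
```

With the client from the original script this would be called as submit_many(condor_schedd.service, 150), followed by requestReschedule() as before.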

Best,


matt


Many thanks, Matt.

Updating the ClusterId, or even only the ProcId, in the job template returned by createJobTemplate() (called once before the loop) improves the execution time: from about 50 seconds to 12 seconds for 50 iterations.

It's a good solution, but I guess it will still be slow if I launch hundreds of jobs. I'm afraid the SOAP API is not suitable for this task from a performance point of view.


--
Javi Roman


Javi,

You may want to check out the alternate SOAP API in the contrib space: Aviary. It goes pretty fast after the nominal overhead of the HTTP connection. With Aviary, there is more packing into the single submit RPC call as opposed to the 3 separate RPC calls used above in the original Birdbath SOAP interface.

http://condor-git.cs.wisc.edu/?p=condor.git;a=blob;f=src/condor_contrib/aviary/README
http://condor-git.cs.wisc.edu/?p=condor.git;a=blob;f=src/condor_contrib/aviary/test/submit.py

On my machine I get results like this for 150 "simple" submits like yours:
real    0m7.922s

Another user obtained:
$ time PYTHONPATH=/usr/share/condor/aviary/module/ submit.py | wc -l
150
real 0m2.453s

YMMV,
\Pete

-- 
Peter MacKinnon
Cloud BU/MRG Grid
Red Hat Inc.
Raleigh, NC

Many thanks Pete,

This SOAP interface alternative is great.

I'm trying to get the condor-aviary contrib from RPMs, but I cannot find it in the UW repositories. Neither the stable nor the development version appears to include the Aviary contrib plugin:

condor-7.8.6-73238
condor-7.9.1-70216

So I decided to build the Aviary plugin from source, but I've run into a new problem. When I build the Condor source code from condor_src-7.9.1-all-all.tar.gz (the latest version), AviaryScheddPlugin-plugin.so is not produced, even with the command:

"cmake  -DWANT_CONTRIB:BOOL=TRUE -DWITH_MANAGEMENT:BOOL=TRUE -DWITH_AVIARY:BOOL=TRUE"

Can you give me any hints about this?

Many thanks again.

Best.


--
Javi Roman