Re: [HTCondor-users] API



I have a python script and it kicks off a bunch of threads.  Each of
those threads kicks off another bunch of threads. Each of those
threads runs a python script using Popen. Currently this is run from
cron on one host. I would like to move this to condor to take
advantage of multiple machines and achieve parallelism.
...
Does using the API for this seem like the best way?

Generally, the right API depends on how much communication is necessary between the different threads (and in which direction it flows), and how long each thread lasts. I tend to think that HTCondor works best when scheduling jobs that don't need to communicate with each other and take about an hour to run. If that describes the last step of your application (the Python script called by popen()), then you might benefit from replacing that popen() with an HTCondor job.

If it takes a while to get to that point, you could consider replacing the second-level threads with HTCondor jobs as well. If the second level takes a while, but the scripts it popen()s do not, maybe you want to start HTCondor jobs instead of second-level threads but not replace the popen()s.
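
As a rough sketch of the first option (one HTCondor job in place of each popen()), something like the following might work. The helper names (build_submit_description, submit_job), the script name worker.py, and the output file naming are made up for illustration. It also assumes a reasonably recent version of the Python bindings, where Schedd.submit() accepts a Submit object directly; older releases needed an explicit transaction instead.

```python
def build_submit_description(script, args, tag):
    """Build a submit description for one run of `script`.

    `script`, `args`, and `tag` are hypothetical; the keys are ordinary
    submit-file commands. Assumes `script` is directly executable (has a
    shebang line) and that no argument contains spaces.
    """
    return {
        "executable": script,
        "arguments": " ".join(args),
        "output": f"{tag}.out",  # the job's stdout
        "error": f"{tag}.err",   # the job's stderr
        "log": f"{tag}.log",     # HTCondor's event log for the job
    }


def submit_job(description):
    """Submit one job to the local schedd and return its cluster ID."""
    # Imported here so the description can be built and inspected on a
    # machine that does not have the HTCondor bindings installed.
    import htcondor

    schedd = htcondor.Schedd()
    result = schedd.submit(htcondor.Submit(description))
    return result.cluster()


# Where the thread used to call Popen(["worker.py", "--shard", "3"]):
#   description = build_submit_description("worker.py", ["--shard", "3"], "shard3")
#   cluster_id = submit_job(description)
```

Unlike Popen, this returns as soon as the job is queued, so the calling thread would need to watch the job's log or query the schedd if it has to wait for the result.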

Can anyone point me at an example for using the API?

Brian Bockelman has been working on improving the documentation for the Python API. The documentation is here:

https://htcondor-python.readthedocs.io/en/latest/

His tutorial from earlier this year has links to earlier tutorials and
other resources:

http://research.cs.wisc.edu/htcondor/HTCondorWeek2017/presentations/TueBockelman_Python.pdf

- ToddM