A potential user has asked me about running Stata over Condor on our pool (or a new one).

1. Does Stata need to be installed on all the nodes (the way other packages are handled)?

We currently have Stata installed on all our Condor nodes (we have
a network license for it), but we also have a HAS_STATA variable
defined on each node, and a wrapper script called condor_stata
constructs a submission that requires that variable. (It doesn't
really matter for Stata since it's on all the nodes, but we also do
something similar with SAS, which is only on a couple of our nodes.)
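
A minimal sketch of what such a wrapper might generate, assuming each
node advertises HAS_STATA in its machine ClassAd. This is not the
actual condor_stata script: the function name build_submit, the Stata
path, and the output file names are all assumptions made for
illustration. The only piece taken from the description above is the
requirement on HAS_STATA.

```python
def build_submit(do_file, stata_path="/usr/local/bin/stata"):
    """Return a Condor submit description for a batch Stata run.

    stata_path is an assumed install location; adjust for your nodes.
    """
    # Strip a trailing ".do" so output files share the job's base name.
    base = do_file[:-3] if do_file.endswith(".do") else do_file
    lines = [
        "universe     = vanilla",
        f"executable   = {stata_path}",
        # "-b do file" runs Stata in batch mode on Unix.
        f"arguments    = -b do {do_file}",
        # Only match machines that advertise HAS_STATA as true.
        "requirements = (HAS_STATA =?= True)",
        f"output       = {base}.out",
        f"error        = {base}.err",
        # Named differently from Stata's own batch log (base.log).
        f"log          = {base}.condor.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_submit("analysis.do"), end="")
```

The printed text could be saved to a file and passed to condor_submit.
Swapping HAS_STATA for an analogous attribute would restrict a job to
the few nodes that have a package such as SAS.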

2. There is also Stata/MP (multiprocessing): how does it interact with Condor?

Fairly well, actually. Last year we bought 10 dual-CPU machines, and
set them up with 4 dual-CPU VMs with Stata/MP and 12 single-CPU VMs (6
machines) with regular Stata. Earlier this year, we accidentally
upgraded the main Stata binary to be Stata/MP for all of them, and were
pleasantly surprised by how nicely it behaved on the single-CPU
machines. If there was only one job on a single-CPU VM, Stata/MP
felt (rightly) it could multiprocess across both CPUs -- the CPU
assigned to it by Condor, as well as the CPU that Condor thought was
idle. If another job was matched to the idle CPU, Stata/MP became less
aggressive about grabbing the other CPU -- it seems to be pretty well
behaved.

Of course, these are all dedicated nodes, and we've turned off
preemption completely, so I could see how either of those could
complicate matters.

3. Anything else I need to know about?

Thanks for any help with this!


