
Re: [HTCondor-users] best way to use cached data



On 12/09/2012 11:36 PM, John Wong wrote:

> I tried to run a script to read a file on the slave machine, but condor
> couldn't read it even though it has full privileges. I think condor runs
> every job in an isolated environment?

What was the problem? We put files in /var/tmp/mydatabases and then tell
the program to read from there (for this particular program, by setting
the $DBDIR environment variable), and that works just fine.
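
To narrow down a "couldn't read it" failure, a small check at the top of
the job's wrapper script can help. This is only a sketch: check_dbdir is
a made-up helper name, and the default path is just the one mentioned
above. Depending on how the pool is configured, the job may run as a
different user than the one who submitted it, so the message prints the
effective user.

```shell
#!/bin/sh
# Hypothetical helper (not part of HTCondor): verify that the database
# directory exists and is readable by whatever user the job runs as.
check_dbdir() {
    # Default to the path used in this thread if no argument is given.
    dir="${1:-/var/tmp/mydatabases}"
    if [ -d "$dir" ] && [ -r "$dir" ]; then
        echo "using databases in $dir"
        return 0
    fi
    # Report the effective user, since that is often the surprise.
    echo "cannot read $dir as user $(id -un)" >&2
    return 1
}
```

In the wrapper one would call something like `check_dbdir "$DBDIR" ||
exit 1` before starting the real program, so that a permissions or path
problem shows up in the job's stderr file.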

> My real concern is to cache frequently used data on some servers, and
> when I run jobs, I can have condor pull them over, or let condor decide
> where the jobs should be sent depending on where those data exist.

That is a different story, I think. I'd love to see a node-level data
placement mechanism in condor, or at least the ability to evaluate
`[ -f /var/tmp/mydatabase ]` at job submission time, but I don't believe
you can.

You can use the "TransferXXX" file-transfer settings in the submit file,
but that normally works per-job rather than per-node. You can also
advertise something like "HAS_DATA" in the node's ClassAd and match jobs
against it, but setting it requires root and a condor_reconfig on the
node.
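
A rough sketch of the HAS_DATA approach, assuming the attribute is
published through the STARTD_ATTRS config macro. The function name
emit_has_data_config is made up for illustration, and the default path
is the one from the example above; adjust both to taste.

```shell
#!/bin/sh
# Hypothetical helper: print HTCondor config lines that advertise
# whether this node holds the cached data.
emit_has_data_config() {
    data_path="${1:-/var/tmp/mydatabase}"
    # -e rather than -f so that a directory of databases also counts.
    if [ -e "$data_path" ]; then
        has_data=True
    else
        has_data=False
    fi
    printf 'HAS_DATA = %s\n' "$has_data"
    # Single quotes keep $(STARTD_ATTRS) literal for condor to expand.
    printf 'STARTD_ATTRS = $(STARTD_ATTRS) HAS_DATA\n'
}
```

As root on the node, the output would be written to a local config file
(for example under /etc/condor/config.d, if that is where your
installation reads local config from) and followed by condor_reconfig.
A job could then ask for such nodes with something like
`requirements = (HAS_DATA =?= True)` in the submit file. The attribute
only changes when the script is re-run, so it is still not a real data
placement mechanism.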

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
