
Re: [Condor-users] Standard Universe blues...



Hi,

Back from the Christmas holidays...

Kewley, J (John) writes:
 > It sounds like you want the checkpointing facility of "standard" universe
 > without the shadow/IO handling abilities (actually quite useful from
 > a security point of view).
 > 
 > I am sure there was one way of getting this, but can't remember the best
 > way to do it. I am sure someone else will post it. In the meantime ...
 > 
 > Is it possible to use:
 > ===
 >   local_files = file1,file2,...
 > 
 >     If your job attempts to access a file mentioned in this list, Condor will cause
 >     it to be read or written at the execution machine. This is most useful for
 >     temporary files not used for input or output. This list uses the same syntax
 >     as compress_files, shown above. 
 > 
 >     local_files = /tmp/*
 > 
 >     This option only applies to standard-universe jobs. 
 > ===
 > 
 > for instance you could create a tmp dir in your execute directory and
 > have local_files = tmp/*       # I hope relative names are OK
 > adding tmp/* to your transfer_output_files might be possible to avoid
 > copying them up a level, but use of that option brings with it warnings in the manual.


Thanks for the suggestion. I would love to know how to get the standard universe
without the shadow/IO abilities. Anyone?

For the moment I took a second look at the local_files option, but I
don't see how I could get it to work. Our code has an "input" directory
with lots of files, some of which are read at startup, while others are
written, then read, written again, and so on.

If I were to use this approach, I would first need to transfer the input
directory to the execute machine. If I put this in the submit file

executable = fixgal
[...]
local_files = input/

should_transfer_files = YES
when_to_transfer_output = ON_EXIT_OR_EVICT

transfer_input_files = input/*
transfer_output_files = input/*

I get:

ERROR: Can't open "/scratch/angelv/Work/guillermo_condor3/input/*"  with flags 00
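
The path in the error suggests that the "*" was taken literally rather than
expanded. A minimal sketch, assuming Python is available on the submit machine,
of expanding the pattern into an explicit comma-separated list before writing
the submit file. The helper name expand_for_submit is hypothetical, and whether
the standard universe honours transfer_input_files at all is not established
here:

```python
import glob
import os


def expand_for_submit(pattern, base_dir="."):
    """Expand a wildcard into a comma-separated list of regular files.

    Hypothetical helper: paths are returned relative to base_dir,
    and directories matched by the pattern are skipped.
    """
    matches = sorted(glob.glob(os.path.join(base_dir, pattern)))
    files = [os.path.relpath(m, base_dir) for m in matches if os.path.isfile(m)]
    return ", ".join(files)
```

For example, print("transfer_input_files = " + expand_for_submit("input/*"))
would produce a line that names each file individually instead of relying on
the wildcard.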

---------
If I try with a script that just decompresses the input:

executable = fixgal.pl
universe = standard
[...]
local_files = input/

should_transfer_files = YES
when_to_transfer_output = ON_EXIT_OR_EVICT

transfer_input_files = input.tar.gz, fixgal
transfer_output_files = input/*

allow_startup_script = True

then I get this in the Shadow Log:

1/3 12:38:23 (1060.0) (10851):My_UID_Domain = "iac.es"
1/3 12:38:24 (1060.0) (10851):	Entering pseudo_get_file_stream
1/3 12:38:24 (1060.0) (10851):	file = "/scratch/condor/spool/cluster1060.ickpt.subproc0"
1/3 12:38:24 (1060.0) (10851):	161.72.81.187
1/3 12:38:24 (1060.0) (10851):	161.72.81.187
1/3 12:38:24 (1060.0) (10851):Reaped child status - pid 10852 exited with status 0
1/3 12:38:24 (1060.0) (10851):Read: tar (child): input.tar.gz: Cannot open: No such file or directory
1/3 12:38:24 (1060.0) (10851):Read: tar (child): Error is not recoverable: exiting now
1/3 12:38:24 (1060.0) (10851):Read: tar: Child returned status 2
1/3 12:38:24 (1060.0) (10851):Read: tar: Error in writing to standard output
1/3 12:38:24 (1060.0) (10851):Read: tar: Error is not recoverable: exiting now
1/3 12:38:24 (1060.0) (10851):Read: chmod: failed to get attributes of `fixgal': No such file or directory
1/3 12:38:24 (1060.0) (10851):Shadow: Job 1060.0 exited, termsig = 0, coredump = 0, retcode = 0
1/3 12:38:24 (1060.0) (10851):Shadow: Job exited normally with status 0
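
Note that the log reports exit status 0 even though both tar and chmod failed.
A sketch of a startup step that checks for the transferred files and reports
what is missing, so that the failure shows up in the exit status. This is an
illustration in Python, not the fixgal.pl script used above; the function name
prepare is hypothetical, and having Python on the execute machine is an
assumption:

```python
import os
import stat
import tarfile


def prepare(archive="input.tar.gz", binary="fixgal"):
    """Unpack the archive and make the binary executable.

    Returns a list of problems; an empty list means both files were
    found in the current working directory and handled.
    """
    problems = []
    if not os.path.isfile(archive):
        problems.append(f"missing archive: {archive}")
    else:
        # Extract into the current working directory.
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall()
    if not os.path.isfile(binary):
        problems.append(f"missing binary: {binary}")
    else:
        # Add the owner-execute bit, keeping the other mode bits.
        mode = os.stat(binary).st_mode
        os.chmod(binary, mode | stat.S_IXUSR)
    return problems
```

A wrapper could call prepare() first and exit with a non-zero status whenever
the returned list is not empty, instead of going on to start the real program.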


So it looks like I cannot transfer files as in the vanilla universe? (The file
input.tar.gz is in the submitting directory).


Any idea if this can be done in some other way?

Thanks (and happy new year),
Angel de Vicente
-- 
----------------------------------
http://www.iac.es/galeria/angelv/

PostDoc Software Support
Instituto de Astrofisica de Canarias