
Re: [Condor-users] How To TroubleShoot Flocking



Again, thanks everyone for trying to help.
Here is what I have done:
I changed the submit file to the following:
  Executable     = /bin/hostname
  Requirements    = UidDomain == "condor.calumet.purdue.edu" && Arch == "X86_64"
  Universe       = vanilla
  transfer_executable = NO
  should_transfer_files = YES
  when_to_transfer_output = ON_EXIT
  Output         = hostname3.out
  Log            = hostname3.log
  Queue
I also set SHADOW_DEBUG = FULL_DEBUG on the server, which is running all the daemons, including condor_master, condor_collector, condor_negotiator, condor_startd, and condor_schedd.
 
The above job failed to execute, just like the previous jobs.  Here are the contents of StarterLog.vm1 from the server that should be executing the job.
  7/7 15:09:19 Communicating with shadow <x.x.x.x:41587>
  7/7 15:09:19 Submitting machine is "radon.rcac.purdue.edu"
  7/7 15:09:19 File transfer completed successfully.
  7/7 15:09:20 Starting a VANILLA universe job with ID: 252376.0
  7/7 15:09:20 IWD: /usr/local/condor/home/execute/dir_2586
  7/7 15:09:20 Output file: /usr/local/condor/home/execute/dir_2586/hostname3.out
  7/7 15:09:20 About to exec /usr/local/condor/home/execute/dir_2586/condor_exec.exe condor_exec.exe
  7/7 15:09:20 Create_Process: child failed with errno 2 (No such file or directory) before exec()
  7/7 15:09:20 ERROR "Create_Process(/usr/local/condor/home/execute/dir_2586/condor_exec.exe,condor_exec.exe, ...) failed" at line 387 in file os_proc.C
  7/7 15:09:20 ShutdownFast all jobs.
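
Reading the log: the starter tries to exec condor_exec.exe inside the execute directory, and errno 2 says that file is not there. One variant that may be worth trying, as a sketch only, is to let Condor transfer the executable, which is the default when transfer_executable is omitted. This assumes the /bin/hostname binary on the submit machine will run on the X86_64 execute machine; the hostname4 file names are placeholders, not files from the runs above.
  Executable     = /bin/hostname
  Requirements    = UidDomain == "condor.calumet.purdue.edu" && Arch == "X86_64"
  Universe       = vanilla
  # transfer_executable left out, so the default (transfer the executable) applies
  should_transfer_files = YES
  when_to_transfer_output = ON_EXIT
  Output         = hostname4.out
  Log            = hostname4.log
  Queue
If this variant runs, that would point at the transfer_executable = NO setting rather than at flocking itself.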

Does this help at all?
 
Thanks
 

 
John Alberts
Technical Assistant for EMS
alberts@xxxxxxxxxxxxxxxxxx
219-989-2083
CLO 332
http://public.xdi.org/=john.alberts

________________________________

From: condor-users-bounces@xxxxxxxxxxx on behalf of Kewley, J (John)
Sent: Fri 7/7/2006 9:46 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] How To TroubleShoot Flocking



> ...
> >According to 6.7 manual :
> >  should_transfer_files = <YES | NO | IF_NEEDED >
> >
> >Is True a valid alternative?
> > 
> >
>
> Ah, good question.  I had to dig into the source code to find
> out.  The
> answer is that should_transfer_files=True is equivalent to
> should_transfer_files=Yes.  I didn't realize my
> recommendation relied on
> an undocumented feature!

I suspected it was valid, but had never seen it before. All the examples seem to
use YES or IF_NEEDED.

Section 2.5.4 and the section on condor_submit are where this is documented
(at least in the 6.7 manual).

The choice of YES/NO over TRUE/FALSE is a good one. When people see
T/F they immediately think of a 2-valued logic, so having something other than
T/F is good when other values are allowed.

I believe these values are case insensitive (but I don't have the luxury of
the code to check!)

I have still never quite worked out why

"NOTE: The combination of:

  should_transfer_files = IF_NEEDED
  when_to_transfer_output = ON_EXIT_OR_EVICT

 would produce undefined file access semantics. Therefore, this combination is
 prohibited by condor_submit."

It is obviously something obvious, but I haven't twigged it yet.

If you are on a system where some machines are in the same FileSystemDomain and some aren't,
you may still want (possibly in)complete files to be returned regardless of how the job
finished.
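
For comparison, here is a sketch of the two pairings that, as far as the 6.7 manual text suggests, condor_submit does accept. They are alternatives, not lines for a single submit file, and I have not checked them against the code:

  # Alternative 1: always transfer files, and return output on exit or eviction
  should_transfer_files   = YES
  when_to_transfer_output = ON_EXIT_OR_EVICT

  # Alternative 2: transfer only when the FileSystemDomains differ, output on exit only
  should_transfer_files   = IF_NEEDED
  when_to_transfer_output = ON_EXIT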

JK


