I have a Windows-only HTCondor pool, and I’m trying to submit a very simple job to that pool from another Windows machine outside the pool using the grid universe. The batch file being run is on a network drive that’s accessible from all machines involved, and I don’t care about storing stdout, stderr, or log files, so I don’t want any file transfer to happen. As a result, I’ve set transfer_executable to False and remote_ShouldTransferFiles to “NO”. Here are the contents of my submit file:


universe = grid
# This is accessible to all machines
executable = //FileServer/path/to/file/test.bat
transfer_executable = False
concurrency_limits = 100
accounting_group = group_condor
accounting_group_user = farnhamj

grid_resource = condor HeadNode.aqrcapital.com HeadNode.aqrcapital.com
remote_universe = vanilla
+remote_RunAsOwner = True
+remote_requirements = HasFincad == True
+remote_ShouldTransferFiles = "NO"



Once the task makes it onto the machine I’m calling HeadNode, it ends up staying Idle forever, because the condor_starter tries and fails to start the job. I found the following message in the StarterLog.slot1 log on the machine that was trying to start the task:


06/05/15 17:50:22 (pid:2092) Create_Process: CreateProcess failed, errno=267
06/05/15 17:50:22 (pid:2092) SharedPortEndpoint: Inside stop listener.
06/05/15 17:50:22 (pid:2092) Create_Process(//FileServer/path/to/file/test.bat,, ...) failed:
06/05/15 17:50:22 (pid:2092) In OwnerProfile::loaded()
06/05/15 17:50:22 (pid:2092) Failed to start job, exiting
06/05/15 17:50:22 (pid:2092) ShutdownFast all jobs.
06/05/15 17:50:22 (pid:2092) Got ShutdownFast when no jobs running.
06/05/15 17:50:22 (pid:2092) HOOK_JOB_EXIT not configured.
06/05/15 17:50:22 (pid:2092) Entering JICShadow::updateShadow()
06/05/15 17:50:22 (pid:2092) Sent job ClassAd update to startd.
06/05/15 17:50:22 (pid:2092) Leaving JICShadow::updateShadow(): success
06/05/15 17:50:22 (pid:2092) Inside JICShadow::transferOutput(void)
06/05/15 17:50:22 (pid:2092) JICShadow::transferOutput(void): Transferring...
06/05/15 17:50:22 (pid:2092) Inside JICShadow::transferOutputMopUp(void)
06/05/15 17:50:22 (pid:2092) dirscat: dirpath = /
06/05/15 17:50:22 (pid:2092) dirscat: subdir = C:\condor\execute
06/05/15 17:50:22 (pid:2092) Initializing Directory: curr_dir = /\C:\condor\execute\
06/05/15 17:50:22 (pid:2092) **** condor_starter (condor_STARTER) pid 2092 EXITING WITH STATUS 0


The last four lines look suspicious to me. It seems like Condor is trying to run out of C:\condor\execute instead of the location of the script, //FileServer/path/to/file/test.bat, which might be why the condor_starter is failing to start the job.
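
As a way of narrowing this down, here is a minimal Python sketch that pulls the error code out of StarterLog lines like the ones above and looks it up in a small table. It assumes the errno printed by Create_Process is the Win32 GetLastError() value; under that assumption, 267 is ERROR_DIRECTORY ("The directory name is invalid") in winerror.h, which would point at a bad working directory rather than a missing executable. The names LOG_LINES, WIN32_ERRORS, find_create_process_errors, and describe are made up for this illustration, and the table is deliberately incomplete.

```python
import re

# Sample lines copied from the StarterLog excerpt in this post.
LOG_LINES = [
    "06/05/15 17:50:22 (pid:2092) Create_Process: CreateProcess failed, errno=267",
    "06/05/15 17:50:22 (pid:2092) Failed to start job, exiting",
]

# Small, deliberately incomplete table of Win32 error codes (winerror.h).
# Assumption: the errno in the log is the value of GetLastError().
WIN32_ERRORS = {
    2: "ERROR_FILE_NOT_FOUND",
    3: "ERROR_PATH_NOT_FOUND",
    5: "ERROR_ACCESS_DENIED",
    267: "ERROR_DIRECTORY (The directory name is invalid)",
}

ERRNO_RE = re.compile(r"CreateProcess failed, errno=(\d+)")


def find_create_process_errors(lines):
    """Return the errno values reported by failed CreateProcess calls."""
    codes = []
    for line in lines:
        match = ERRNO_RE.search(line)
        if match:
            codes.append(int(match.group(1)))
    return codes


def describe(code):
    """Map a Win32 error code to its symbolic name, if this table knows it."""
    return WIN32_ERRORS.get(code, "unknown (not in this small table)")


if __name__ == "__main__":
    for code in find_create_process_errors(LOG_LINES):
        print(code, describe(code))
```

On a Windows machine, ctypes.FormatError(code) from the standard library should give the system's own message text for the same code, so the hand-written table is only there to keep the sketch portable.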


In addition, when I use condor_q -l to look at the job’s ClassAd on the machine I’m calling HeadNode, I see the following:

Iwd = "C:\condor\spool\2133\0\cluster22133.proc0.subproc0"


This doesn’t look right--shouldn’t the initial working directory be the directory containing the script, //FileServer/path/to/file?
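
To make the mismatch concrete, here is a small sketch using Python's ntpath module (the Windows flavour of os.path, usable on any platform) that derives the script's directory from the executable path and compares it with the Iwd reported by condor_q -l. The helper names script_directory and same_directory are hypothetical, and the comparison is purely textual: it says nothing about whether the execute machine can actually reach the share.

```python
import ntpath

# Values copied from the submit file and from the condor_q -l output.
EXECUTABLE = "//FileServer/path/to/file/test.bat"
OBSERVED_IWD = r"C:\condor\spool\2133\0\cluster22133.proc0.subproc0"


def script_directory(executable):
    """Return the directory part of a Windows or UNC path."""
    return ntpath.dirname(executable)


def same_directory(a, b):
    """Compare two Windows paths, ignoring case and slash direction."""
    def norm(p):
        return ntpath.normcase(ntpath.normpath(p))
    return norm(a) == norm(b)


if __name__ == "__main__":
    expected = script_directory(EXECUTABLE)
    print("directory of the script:", expected)
    print("Iwd in the job ClassAd: ", OBSERVED_IWD)
    print("match:", same_directory(expected, OBSERVED_IWD))
```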


Finally, every machine in question has the same value set for FILESYSTEM_DOMAIN, which was my attempt to avoid issues accessing the //FileServer/path/to/file UNC path.


I know this is a detailed question--thanks for any help you can provide.



