
[Condor-users] "I/O error : Permission denied" on windows



Hi,
I have an intermittent problem with (dual-processor) execute hosts
occasionally stalling with the message "I/O error : Permission denied"
appearing on stderr. I am running a vanilla universe job under Windows XP.


The executable I am running lives on a network share, referenced by a
UNC path and readable by guest. It's a large installation (about 30 DLLs
and support files) that I am reluctant to install on the execute host. The
submitted executable is a batch file that runs the exe on the UNC share.

The logs on the execute machine show that it tried to start two jobs
within about two seconds of each other. The first runs fine:
------VM1 - exits normally
8/19 14:21:47 ******************************************************
8/19 14:21:47 ** condor_starter (CONDOR_STARTER) STARTING UP
8/19 14:21:47 ** C:\Condor\bin\condor_starter.exe
8/19 14:21:47 ** $CondorVersion: 6.6.9 Mar 10 2005 $
8/19 14:21:47 ** $CondorPlatform: INTEL-WINNT40 $
8/19 14:21:47 ** PID = 320
8/19 14:21:47 ******************************************************
8/19 14:21:47 Using config file: C:\Condor\condor_config
8/19 14:21:47 Using local config files: C:\Condor/condor_config.local
8/19 14:21:47 DaemonCore: Command Socket at <131.242.76.104:3575>
8/19 14:21:47 Setting resource limits not implemented!
8/19 14:21:48 Starter communicating with condor_shadow <131.242.76.84:4451>
8/19 14:21:48 Submitting machine is "tbadjp973.dpi.qld.gov.au"
8/19 14:21:49 File transfer completed successfully.
8/19 14:21:50 Starting a VANILLA universe job with ID: 430.1
8/19 14:21:50 IWD: C:\Condor/execute\dir_320
8/19 14:21:50 Output file: C:\Condor/execute\dir_320\ghab22.stdout
8/19 14:21:50 Error file: C:\Condor/execute\dir_320\ghab22.stderr
8/19 14:21:50 Renice expr "10" evaluated to 10
8/19 14:21:50 About to exec C:\WINDOWS\system32\cmd.exe /Q /C condor_exec.bat ghab22.sim
8/19 14:21:50 Create_Process succeeded, pid=3504
8/19 14:22:27 Process exited, pid=3504, status=0
8/19 14:22:27 Got SIGQUIT.  Performing fast shutdown.
8/19 14:22:27 ShutdownFast all jobs.
8/19 14:22:27 **** condor_starter (condor_STARTER) EXITING WITH STATUS 0

However, VM2 stalls and is killed by the submitting user about ten
minutes later:
------VM2 - stalls, killed by submitting user
8/19 14:21:50 ******************************************************
8/19 14:21:50 ** condor_starter (CONDOR_STARTER) STARTING UP
8/19 14:21:50 ** C:\Condor\bin\condor_starter.exe
8/19 14:21:50 ** $CondorVersion: 6.6.9 Mar 10 2005 $
8/19 14:21:50 ** $CondorPlatform: INTEL-WINNT40 $
8/19 14:21:50 ** PID = 3704
8/19 14:21:50 ******************************************************
8/19 14:21:50 Using config file: C:\Condor\condor_config
8/19 14:21:50 Using local config files: C:\Condor/condor_config.local
8/19 14:21:50 DaemonCore: Command Socket at <131.242.76.104:3580>
8/19 14:21:50 Setting resource limits not implemented!
8/19 14:21:50 Starter communicating with condor_shadow <131.242.76.84:4460>
8/19 14:21:50 Submitting machine is "tbadjp973.dpi.qld.gov.au"
8/19 14:21:51 File transfer completed successfully.
8/19 14:21:52 Starting a VANILLA universe job with ID: 430.2
8/19 14:21:52 IWD: C:\Condor/execute\dir_3704
8/19 14:21:52 Output file: C:\Condor/execute\dir_3704\ghab23.stdout
8/19 14:21:52 Error file: C:\Condor/execute\dir_3704\ghab23.stderr
8/19 14:21:52 Renice expr "10" evaluated to 10
8/19 14:21:52 About to exec C:\WINDOWS\system32\cmd.exe /Q /C condor_exec.bat ghab23.sim
8/19 14:21:52 Create_Process succeeded, pid=3448
8/19 14:31:55 Got SIGQUIT.  Performing fast shutdown.
8/19 14:31:55 ShutdownFast all jobs.
8/19 14:31:55 Process exited, pid=3448, status=1
8/19 14:31:55 Last process exited, now Starter is exiting
8/19 14:31:55 **** condor_starter (condor_STARTER) EXITING WITH STATUS 0
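
The gap between the two job starts can be checked mechanically from the starter logs. What follows is only a minimal sketch in Python, assuming every log line has the "M/D HH:MM:SS message" layout shown above. The function names event_time and start_gap_seconds are hypothetical, and because the logs carry no year, a placeholder year is passed in purely so that a timestamp can be built.

```python
import re
from datetime import datetime

# Matches lines such as "8/19 14:21:50 Create_Process succeeded, pid=3504".
LINE_RE = re.compile(r"^(\d{1,2})/(\d{1,2}) (\d{2}):(\d{2}):(\d{2}) (.*)$")


def event_time(log_text, marker, year=2005):
    """Return the timestamp of the first line whose message starts with marker.

    The year is an assumption: the starter log lines do not record one.
    Returns None when no line matches.
    """
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group(6).startswith(marker):
            month, day, hh, mm, ss = (int(g) for g in m.groups()[:5])
            return datetime(year, month, day, hh, mm, ss)
    return None


def start_gap_seconds(log_a, log_b, marker="Create_Process succeeded"):
    """Seconds between the first occurrence of marker in each of two logs."""
    a = event_time(log_a, marker)
    b = event_time(log_b, marker)
    if a is None or b is None:
        return None
    return abs((b - a).total_seconds())


# Excerpts copied from the VM1 and VM2 logs above.
vm1 = (
    "8/19 14:21:50 Create_Process succeeded, pid=3504\n"
    "8/19 14:22:27 Process exited, pid=3504, status=0"
)
vm2 = (
    "8/19 14:21:52 Create_Process succeeded, pid=3448\n"
    "8/19 14:31:55 Process exited, pid=3448, status=1"
)

print(start_gap_seconds(vm1, vm2))
```

Run against whole StarterLog files rather than these excerpts, the same function would report how close together the two starts were on any execute host that shows the stall.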


Any ideas? I'm guessing at an obscure Windows file-sharing error. A
workaround that doesn't involve killing and restarting the stalled job
would be welcome.

Yours,
Pdev. 
