Re: [condor-users] Still Shadow Exceptions :-(



hi,

there's something on the borland website about having multiple 
protocols bound to the same network card:

http://community.borland.com/article/0,1410,25340,00.html

maybe it'll help...;)
henry

On Mon, 8 Dec 2003 15:10:57 +0100 Thomas Bauer 
<tombauer@xxxxxxxxxxxxxxxxxxx> wrote:

> Hello all,
> 
> I still have the shadow exception problem. I have a small test pool
> with 4 Win2000 (SP4) machines. All machines have Condor 6.6.0 (Nov 24 2003)
> installed.
> On all machines the local condor-reuse account is in a group called
> PowerUsers. This group has the right to start a batch job; I verified
> this by checking the local security policy on every machine.
> OK, that's my setup. Now the problem:
> I submit a job called trapez.exe. This program should create 3 result
> files: fort.19, fort.20 and fort.21.
> After a few seconds the shadow exception occurs.
> The trapez.log says:
> 
> 000 (015.000.000) 12/08 13:43:37 Job submitted from host:
> <128.176.208.220:1051>
> ...
> 001 (015.000.000) 12/08 13:43:47 Job executing on host:
> <128.176.206.149:1048>
> ...
> 006 (015.000.000) 12/08 13:43:55 Image size of job updated: 868
> ...
> 007 (015.000.000) 12/08 13:44:12 Shadow exception!
>         Can no longer talk to condor_starter on execute machine
> (128.176.206.149)
>         0  -  Run Bytes Sent By Job
>         528441  -  Run Bytes Received By Job
> 
> The StarterLog of the executing machine says:
> 
> 12/8 13:43:43 ******************************************************
> 12/8 13:43:43 ** condor_starter (CONDOR_STARTER) STARTING UP
> 12/8 13:43:43 ** $CondorVersion: 6.6.0 Nov 24 2003 $
> 12/8 13:43:43 ** $CondorPlatform: INTEL-WINNT40 $
> 12/8 13:43:43 ** PID = 580
> 12/8 13:43:44 ******************************************************
> 12/8 13:43:44 Using config file: C:\Condor\condor_config
> 12/8 13:43:44 Using local config files: C:\Condor\condor_config.local
> 12/8 13:43:44 DaemonCore: Command Socket at <128.176.206.149:1153>
> 12/8 13:43:44 Setting resource limits not implemented!
> 12/8 13:43:44 Starter communicating with condor_shadow
> <128.176.208.220:3410>
> 12/8 13:43:44 Submitting machine is "PFT23.NWZNET.UNI-MUENSTER.DE"
> 12/8 13:43:45 File transfer completed successfully.
> 12/8 13:43:46 Starting a VANILLA universe job with ID: 15.0
> 12/8 13:43:46 IWD: C:\Condor\execute\dir_580
> 12/8 13:43:46 Output file: C:\Condor\execute\dir_580\trapez.out
> 12/8 13:43:46 Error file: C:\Condor\execute\dir_580\trapez.err
> 12/8 13:43:46 Renice expr "10" evaluated to 10
> 12/8 13:43:46 About to exec C:\Condor\execute\dir_580\condor_exec.exe
> 12/8 13:43:47 Create_Process succeeded, pid=528
> 12/8 13:44:11 Process exited, pid=528, status=0
> 12/8 13:44:12 ReliSock: put_file: TransmitFile() failed, errno=10054
> 12/8 13:44:12 ERROR "DoUpload: Failed to send file
> C:\Condor\execute\dir_580\fort.19, exiting at 1371
> " at line 1370 in file ..\src\condor_c++_util\file_transfer.C
> 12/8 13:44:12 ShutdownFast all jobs.
> 12/8 13:44:12 Error disabling account condor-reuse-vm1 (ACCESS DENIED)
> 
> OK, something goes wrong here. The fort.19 file is created on the executing
> machine but can't be uploaded to the submitting machine.
> But why? Why is there suddenly a problem bringing that file back to the
> submitting machine?
> 
> Thanks in advance,
> Thomas Bauer
> --------------------------------------------
> Westfaelische Wilhelms-Universitaet Muenster
> Institut fuer Festkoerpertheorie
> Wilhelm-Klemm-Str. 10
> D 48149 Muenster
> ++49 (251) 8339040
> --------------------------------------------
> 
> Condor Support Information:
> http://www.cs.wisc.edu/condor/condor-support/
> To Unsubscribe, send mail to majordomo@xxxxxxxxxxx with
> unsubscribe condor-users <your_email_address>
> 
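
A note on the errno=10054 in the starter log above: that number is the
Winsock code WSAECONNRESET (connection reset by peer), so the connection
was dropped while fort.19 was being sent back. The sketch below is only an
illustration of how such lines could be picked out of a log and labelled.
The function name explain_errnos and the small lookup table are made up for
this example (the table is a hand-picked subset, not a complete list), and
nothing here is part of Condor itself.

```python
import re

# Hand-picked subset of Winsock error codes; NOT a complete table.
WINSOCK_ERRORS = {
    10053: "WSAECONNABORTED (software caused connection abort)",
    10054: "WSAECONNRESET (connection reset by peer)",
    10060: "WSAETIMEDOUT (connection timed out)",
    10061: "WSAECONNREFUSED (connection refused)",
}

# Matches the "errno=<number>" fragment seen in the log excerpt.
ERRNO_RE = re.compile(r"errno=(\d+)")


def explain_errnos(log_text):
    """Return (line, description) pairs for every line containing errno=N.

    Hypothetical helper for illustration only.
    """
    hits = []
    for line in log_text.splitlines():
        match = ERRNO_RE.search(line)
        if match:
            code = int(match.group(1))
            meaning = WINSOCK_ERRORS.get(code, "unknown code %d" % code)
            hits.append((line.strip(), meaning))
    return hits


# Two lines copied from the log excerpt quoted above.
sample = """12/8 13:44:11 Process exited, pid=528, status=0
12/8 13:44:12 ReliSock: put_file: TransmitFile() failed, errno=10054"""

for line, meaning in explain_errnos(sample):
    print(meaning)
```

Run against the two sample lines, this flags only the TransmitFile() line.
It does not say which side reset the connection or why; that still has to
be worked out from the logs on the submitting machine.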

----------------------
In performance classes, llamas may be asked to cross bridges

Henry Knowles, Electrical & Electronic Engineering
Henry.Knowles@xxxxxxxxxxxxx
