
Re: [Condor-users] jobs submitted from windows don't execute



Hi,
The SCHEDD daemon is already included in DAEMON_LIST alongside MASTER, but it still doesn't start automatically.
The schedd log shows the following error:
 
5/14 11:57:49 (pid:2384) Using config source: C:\condor\condor_config
5/14 11:57:49 (pid:2384) Using local config sources:
5/14 11:57:49 (pid:2384)    C:\condor/condor_config.local
5/14 11:57:49 (pid:2384) DaemonCore: Command Socket at
5/14 11:57:49 (pid:2384) History file rotation is enabled.
5/14 11:57:49 (pid:2384)   Maximum history file size is: 20971520 bytes
5/14 11:57:49 (pid:2384)   Number of rotated history files is: 2
5/14 11:57:49 (pid:2384) my_popen: CreateProcess failed
5/14 11:57:49 (pid:2384) Failed to execute C:\condor/bin/condor_shadow.pvm.exe, ignoring
5/14 11:57:49 (pid:2384) my_popen: CreateProcess failed
5/14 11:57:49 (pid:2384) Failed to execute C:\condor/bin/condor_shadow.std.exe, ignoring
5/14 11:57:49 (pid:2384) ERROR "select, error # = 10038" at line 2417 in file ..\src\condor_daemon_core.V6\daemon_core.C
 
and the master log shows:
 
5/14 11:57:48 Using config source: C:\condor\condor_config
5/14 11:57:48 Using local config sources:
5/14 11:57:48    C:\condor/condor_config.local
5/14 11:57:48 DaemonCore: Command Socket at < 192.168.100.53:1541>
5/14 11:57:48 Started DaemonCore process "C:\condor/bin/condor_schedd.exe", pid and pgroup = 2384
5/14 11:57:49 DaemonCore: Command received via UDP from host < 192.168.100.53:1546>
5/14 11:57:49 DaemonCore: received command 60011 (DC_NOP), calling handler (handle_nop())
5/14 11:57:49 The SCHEDD (pid 2384) exited with status 4
5/14 11:57:49 Sending obituary for "C:\condor/bin/condor_schedd.exe"
5/14 11:57:52 restarting C:\condor/bin/condor_schedd.exe in 10 seconds
 
The condor_schedd didn't restart, so I had to start it manually again. Now, every time I submit a job to the pool and specify in the requirements that it should run on a Linux machine, the job just stays idle.
 
I have also tried configuring the PED by modifying boot.ini, as mentioned in one of the earlier messages, but it didn't help.
 
Please help.
 


 
On 5/13/07, Matt Hope <matthew.hope@xxxxxxxxx> wrote:
On 5/13/07, mohammed shambakey <shambakey1@xxxxxxxxx> wrote:
> Hi,
> I checked the shadow log and it has the following error:
>
> 5/10 10:00:37 Using config source: C:\condor\condor_config
> 5/10 10:00:37 Using local config sources:
> 5/10 10:00:37    C:\condor/condor_config.local
> 5/10 10:00:37 DaemonCore: Command Socket at
> 5/10 10:00:38 Initializing a JAVA shadow for job 7.0
> 5/10 10:00:38 (7.0) (3316): Request to run on <192.168.100.120:47348 > was
> ACCEPTED
> 5/10 10:00:38 (7.0) (3316): ERROR "select, error # = 10038" at line 2417 in
> file ..\src\condor_daemon_core.V6\daemon_core.C
>
> and the schedd.log gives :-
>
> 5/13 10:41:11 (pid:3836) Using config source: C:\condor\condor_config
> 5/13 10:41:11 (pid:3836) Using local config sources:
> 5/13 10:41:11 (pid:3836)    C:\condor/condor_config.local
> 5/13 10:41:12 (pid:3836) DaemonCore: Command Socket at
> 5/13 10:41:12 (pid:3836) History file rotation is enabled.
> 5/13 10:41:12 (pid:3836)   Maximum history file size is: 20971520 bytes
> 5/13 10:41:12 (pid:3836)   Number of rotated history files is: 2
> 5/13 10:41:12 (pid:3836) my_popen: CreateProcess failed
> 5/13 10:41:12 (pid:3836) Failed to execute
> C:\condor/bin/condor_shadow.pvm.exe, ignoring
> 5/13 10:41:12 (pid:3836) my_popen: CreateProcess failed
> 5/13 10:41:12 (pid:3836) Failed to execute
> C:\condor/bin/condor_shadow.std.exe, ignoring
> 5/13 10:41:12 (pid:3836) Sent ad to central manager for
> shambakey@shambakeyserv
> 5/13 10:41:12 (pid:3836) Sent ad to 1 collectors for shambakey@shambakeyserv
> 5/13 10:41:12 (pid:3836) ERROR "select, error # = 10038" at line 2417 in
> file ..\src\condor_daemon_core.V6\daemon_core.C
>
> I looked for this daemon_core.C file but didn't find it.
> One more thing: I have to start schedd.exe manually, as it's not included
> among the services running in Windows. I don't know if this has something
> to do with the problem.

I'm afraid you have a serious misunderstanding of how condor works
regarding the various daemons it needs.
You should not start the daemons directly (unless you are an expert
user doing something complicated at least)

Instead, you should alter the condor_config file in your Condor
installation and change the following line:

DAEMON_LIST = MASTER,   ... etc ...

to include SCHEDD in the comma-separated list. On Windows, only one
service is required for Condor: it fires up the master, which then
creates the relevant schedd, startd, or whatever else is listed.
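
As a rough sketch only, the edited line might look like the fragment below. The STARTD entry is an assumption for a machine that should also run jobs; keep whatever your existing line already lists and just make sure SCHEDD is among the entries.

```
## condor_config (illustrative; STARTD is assumed, not taken from your setup)
DAEMON_LIST = MASTER, SCHEDD, STARTD
```

After saving the change, restart the Condor service so the master reads the new list and launches the schedd itself.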

I suggest taking another read through the documentation in light of
this and see if this makes more sense and works.

Hope this helps
Matt



--
Mohammed Talat El-Shambakey
Assistant Researcher
AI Department
Informatics Institute
Moubarak City for Scientific Research