
Re: [Condor-users] Normal Termination (return value 10)



Shruti,

Start with the exit code:

08/19 07:04:50 Process exited, pid=3768, status=10

I suspect you're still encountering missing dependency issues. Have you tried my suggestion of running with USE_VISIBLE_DESKTOP=True on your execute node so you can watch for DLL-missing error popups?
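
In case it helps, here is a minimal sketch of that setting. It assumes your execute node reads its local settings from C:\condor\condor_config.local, which is the path your StarterLog reports:

```
# condor_config.local on the execute node (Windows only)
# Let jobs draw on the visible desktop so that error dialogs,
# such as a missing-DLL popup, can actually be seen.
USE_VISIBLE_DESKTOP = True
```

Run condor_reconfig on that machine afterwards (or restart Condor) so the change is picked up, and set it back to False once you're done debugging.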

If the R executable runs fine in the shell you're submitting from, have you tried exporting your shell environment to the remote job with:

getenv=true

in the submit ticket?
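
One more thing that may be worth ruling out: your StarterLog shows the job being started as C:\condor\execute\dir_3264\condor_exec.exe, which means Rscript.exe was copied into the scratch directory, away from the rest of the R installation. A rough sketch of a submit file that runs it in place instead is below. It assumes R is installed at the same path on the execute node, which I can't confirm from here:

```
universe = vanilla
# Assumption: R lives at this same path on every execute node.
executable = C:\R\R-2.10.1\bin\Rscript.exe
# Run the installed Rscript.exe rather than a copy in the scratch directory.
transfer_executable = false
getenv = true
arguments = Simulate_Normal_Data.R
```

The remaining lines of your submit file (file transfer, output, log, error, queue) would stay as they are.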

- Ian

On Thu, Aug 19, 2010 at 7:07 AM, Shruti Mudra <shruti.mudra@xxxxxxxxx> wrote:
Hi Everyone,

I'm running R jobs on Windows Vista. I would appreciate ideas on why the following job failed.
Here is my submit file:
******************************************************************
universe = vanilla
Executable = C:\R\R-2.10.1\bin\Rscript.exe
getenv = true
arguments = Simulate_Normal_Data.R

should_transfer_files = YES
when_to_transfer_output = ON_EXIT
input = Simulate_Normal_Data.R
transfer_input_files = bay_alpha.R

Output = test_r_out.out
Log = test_r_log.log
error = error_r.error

queue
******************************************************************

Here is my job log file (test_r_log.log):
******************************************************************
000 (111.000.000) 08/19 06:39:52 Job submitted from host: <y.y.y.y:51083>
...
001 (111.000.000) 08/19 06:44:06 Job executing on host: <y.y.y.y:51084>
...
005 (111.000.000) 08/19 06:44:07 Job terminated.
(1) Normal termination (return value 10)
Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
0  -  Run Bytes Sent By Job
42837  -  Run Bytes Received By Job
0  -  Total Bytes Sent By Job
42837  -  Total Bytes Received By Job
...
******************************************************************

Here is my StarterLog file:

******************************************************************
08/19 07:04:47 Locale: English_United States.1252
08/19 07:04:47 ******************************************************
08/19 07:04:47 ** condor_starter (CONDOR_STARTER) STARTING UP
08/19 07:04:47 ** C:\condor\bin\condor_starter.exe
08/19 07:04:47 ** SubsystemInfo: name=STARTER type=STARTER(8) class=DAEMON(1)
08/19 07:04:47 ** Configuration: subsystem:STARTER local:<NONE> class:DAEMON
08/19 07:04:47 ** $CondorVersion: 7.4.2 Mar 30 2010 BuildID: 227044 $
08/19 07:04:47 ** $CondorPlatform: INTEL-WINNT50 $
08/19 07:04:47 ** PID = 3264
08/19 07:04:47 ** Log last touched 8/19 05:44:07
08/19 07:04:47 ******************************************************
08/19 07:04:47 Using config source: C:\condor\condor_config
08/19 07:04:47 Using local config sources: 
08/19 07:04:47    C:\condor\condor_config.local
08/19 07:04:47 DaemonCore: Command Socket at <y.y.y.y:52431>
08/19 07:04:48 GLEXEC_JOB not supported on this platform; ignoring
08/19 07:04:48 Setting resource limits not implemented!
08/19 07:04:48 Communicating with shadow <y.y.y.y:52424>
08/19 07:04:48 Submitting machine is "machine_name"
08/19 07:04:48 setting the orig job name in starter
08/19 07:04:48 setting the orig job iwd in starter
08/19 07:04:48 File transfer completed successfully.
08/19 07:04:49 Job 112.0 set to execute immediately
08/19 07:04:49 Starting a VANILLA universe job with ID: 112.0
08/19 07:04:49 Tracking process family by login "condor-reuse-slot1"
08/19 07:04:49 IWD: C:\condor\execute\dir_3264
08/19 07:04:49 Input file: C:\condor\execute\dir_3264\Simulate_Normal_Data.R
08/19 07:04:49 Output file: C:\condor\execute\dir_3264\test_r_out.out
08/19 07:04:49 Error file: C:\condor\execute\dir_3264\error_r.error
08/19 07:04:49 Renice expr "10" evaluated to 10
08/19 07:04:49 About to exec C:\condor\execute\dir_3264\condor_exec.exe Simulate_Normal_Data.R
08/19 07:04:50 Create_Process succeeded, pid=3768
08/19 07:04:50 Process exited, pid=3768, status=10
08/19 07:04:50 Got SIGQUIT.  Performing fast shutdown.
08/19 07:04:50 ShutdownFast all jobs.
08/19 07:04:50 **** condor_starter (condor_STARTER) pid 3264 EXITING WITH STATUS 0

*********************************************************************************

startLog
*********************************************************************************
08/19 07:04:46 slot1: match_info called
08/19 07:04:46 slot1: Received match <y.y.y.y:51084>#1282206288#45#...
08/19 07:04:46 slot1: State change: match notification protocol successful
08/19 07:04:46 slot1: Changing state: Unclaimed -> Matched
08/19 07:04:46 slot1: Request accepted.
08/19 07:04:46 slot1: Remote owner is "owner_name"
08/19 07:04:46 slot1: State change: claiming protocol successful
08/19 07:04:46 slot1: Changing state: Matched -> Claimed
08/19 07:04:47 slot1: Got activate_claim request from shadow (<y.y.y.y:52427>)
08/19 07:04:47 slot1: Remote job ID is 112.0
08/19 07:04:47 slot1: Got universe "VANILLA" (5) from request classad
08/19 07:04:47 slot1: State change: claim-activation protocol successful
08/19 07:04:47 slot1: Changing activity: Idle -> Busy
08/19 07:04:50 slot1: Called deactivate_claim_forcibly()
08/19 07:04:50 slot1: State change: received RELEASE_CLAIM command
08/19 07:04:50 slot1: Changing state and activity: Claimed/Busy -> Preempting/Vacating
08/19 07:04:50 Starter pid 3264 exited with status 0
08/19 07:04:50 slot1: State change: starter exited
08/19 07:04:50 slot1: State change: No preempting claim, returning to owner
08/19 07:04:50 slot1: Changing state and activity: Preempting/Vacating -> Owner/Idle
08/19 07:04:50 slot1: State change: IS_OWNER is false
08/19 07:04:50 slot1: Changing state: Owner -> Unclaimed

*********************************************************************************

shadowLog
*********************************************************************************
08/19 07:04:46 Locale: English_United States.1252
08/19 07:04:46 ******************************************************
08/19 07:04:46 ** condor_shadow (CONDOR_SHADOW) STARTING UP
08/19 07:04:46 ** C:\condor\bin\condor_shadow.exe
08/19 07:04:47 ** SubsystemInfo: name=SHADOW type=SHADOW(6) class=DAEMON(1)
08/19 07:04:47 ** Configuration: subsystem:SHADOW local:<NONE> class:DAEMON
08/19 07:04:47 ** $CondorVersion: 7.4.2 Mar 30 2010 BuildID: 227044 $
08/19 07:04:47 ** $CondorPlatform: INTEL-WINNT50 $
08/19 07:04:47 ** PID = 2364
08/19 07:04:47 ** Log last touched 8/19 05:44:07
08/19 07:04:47 ******************************************************
08/19 07:04:47 Using config source: C:\condor\condor_config
08/19 07:04:47 Using local config sources: 
08/19 07:04:47    C:\condor\condor_config.local
08/19 07:04:47 DaemonCore: Command Socket at <y.y.y.y:52424>
08/19 07:04:47 Initializing a VANILLA shadow for job 112.0
08/19 07:04:47 (112.0) (2364): Request to run on slot1@xxxxxxxxx <y.y.y.y:51084> was ACCEPTED
08/19 07:04:50 (112.0) (2364): Job 112.0 terminated: exited with status 10
08/19 07:04:50 (112.0) (2364): **** condor_shadow (condor_SHADOW) pid 2364 EXITING WITH STATUS 100
*************************************************************************************************

My error file is empty.

Thanks,





Cycle Computing, LLC
The Leader in Open Compute Solutions for Clouds, Servers, and Desktops
Enterprise Condor Support and Management Tools

http://www.cyclecomputing.com
http://www.cyclecloud.com