
Re: [Condor-users] Problems in Condor-C



Hi Hailong,

Do you have the corresponding logs from the execute side? The StartLog or StarterLog might have more detail on that error. 
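
As a rough sketch of how to pull those lines out on the execute machine: the helper name `find_transfer_errors` below is made up for illustration, and it assumes the logs sit in the directory reported by `condor_config_val LOG` under the default names (StartLog, and StarterLog with a possible per-slot suffix such as StarterLog.slot1). Adjust if your configuration puts them elsewhere.

```shell
#!/bin/sh
# Hypothetical helper (not part of Condor): print every line that mentions
# FileTransfer in the execute-side logs found in the given log directory.
# Assumes the default log file names; the StarterLog* glob is meant to
# cover per-slot files such as StarterLog.slot1.
find_transfer_errors() {
    log_dir=$1
    grep "FileTransfer" "$log_dir"/StartLog "$log_dir"/StarterLog* 2>/dev/null
}

# Usage on the execute node (needs the Condor tools on PATH):
#   find_transfer_errors "$(condor_config_val LOG)"
```

Each match is prefixed with the file it came from, so it should be easy to see whether the startd or the starter reported the problem.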

-alain

On Dec 31, 2009, at 9:53 AM, hailong.yang1115 wrote:
> Hi everyone,
>  
> Recently we configured two Condor pools to flock jobs using Condor-C. The problem is that when the jobs appear in the remote Condor pool, they stay idle indefinitely. There is an error in the ShadowLog file:
> 06/07 12:43:20 ******************************************************
> 06/07 12:43:20 ** condor_shadow (CONDOR_SHADOW) STARTING UP
> 06/07 12:43:20 ** /opt/condor-7.4.1/sbin/condor_shadow
> 06/07 12:43:20 ** SubsystemInfo: name=SHADOW type=SHADOW(6) class=DAEMON(1)
> 06/07 12:43:20 ** Configuration: subsystem:SHADOW local:<NONE> class:DAEMON
> 06/07 12:43:20 ** $CondorVersion: 7.4.1 Dec 17 2009 BuildID: 204351 $
> 06/07 12:43:20 ** $CondorPlatform: I386-LINUX_RHEL3 $
> 06/07 12:43:20 ** PID = 11152
> 06/07 12:43:20 ** Log last touched 6/7 12:43:20
> 06/07 12:43:20 ******************************************************
> 06/07 12:43:20 Using config source: /opt/condor-7.4.1/etc/condor_config
> 06/07 12:43:20 Using local config sources: 
> 06/07 12:43:20    /opt/condor-7.4.1/local.euchina08/condor_config.local
> 06/07 12:43:20 DaemonCore: Command Socket at <202.38.140.91:38889>
> 06/07 12:43:20 Initializing a VANILLA shadow for job 5.0
> 06/07 12:43:20 (5.0) (11152): Request to run on slot1@xxxxxxxxxxxxxxxxxxxxx <202.38.140.91:38395> was ACCEPTED
> 06/07 12:43:20 (5.0) (11152): ERROR "Error from slot1@xxxxxxxxxxxxxxxxxxxxx: FileTransfer: DownloadFiles called on server side" at line 655 in file pseudo_ops.cpp
>  
> Here is the job description file:
> [ddg2@www simple_test]$ cat simple.submit
> universe = grid
> grid_resource = condor euchina08.buaa.edu.cn euchina08.buaa.edu.cn
> executable = simple.sh
> output = simple.out
> error = simple.err
> log = simple.log
> remote_universe = vanilla
> +remote_requirements = True
> +remote_ShouldTransferFiles = "YES"
> +remote_WhenToTransferOutput = "ON_EXIT"
> queue
>  
> [ddg2@www simple_test]$ cat simple.sh 
> #!/bin/sh
> echo "Start to sleep for 5 seconds"
> sleep 5
> echo "All done"
>  
> Any clue?
>  
> -Hailong
>  
> 2009-12-31
> ***********************************************
> * Hailong Yang, PhD. Candidate 
> * Sino-German Joint Software Institute, 
> * School of Computer Science&Engineering, Beihang University
> * Phone: (86-010)82315908
> * Email: hailong.yang1115@xxxxxxxxx
> * Address: G413, New Main Building in Beihang University, 
> *              No.37 XueYuan Road,HaiDian District, 
> *              Beijing,P.R.China,100191
> ***********************************************