
Re: [Condor-users] Problems in Condor-C



Hailong,

I found that in Condor 7.4.1 there is a problem with the attribute
NeverCreateJobSandbox. This explains your issue.

In addition to the workaround I already mentioned, your original submit
file can be made to work by adding the following:

+remote_NeverCreateJobSandbox = false
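
For illustration, here is a sketch of what the original submit file might
look like with that line added. It is assembled from the simple.submit
quoted further down in this thread. Placing the new attribute just before
the queue statement is an assumption, and the result has not been
re-tested here.

universe = grid
grid_resource = condor euchina08.buaa.edu.cn euchina08.buaa.edu.cn
executable = simple.sh
output = simple.out
error = simple.err
log = simple.log
remote_universe = vanilla
+remote_requirements = True
+remote_ShouldTransferFiles = "YES"
+remote_WhenToTransferOutput = "ON_EXIT"
# Added line: works around the NeverCreateJobSandbox problem in 7.4.1
+remote_NeverCreateJobSandbox = false
queue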

--Dan

Dan Bradley wrote:
> Hi Hailong,
>
> I have reproduced the problem you reported. I haven't fully understood
> it, but I did find that I could make things work if I submit the
> original job with file transfer turned on. In other words, change your
> submit file to this:
>
> universe = grid
> grid_resource = condor euchina08.buaa.edu.cn euchina08.buaa.edu.cn
> executable = simple.sh
> output = simple.out
> error = simple.err
> log = simple.log
> remote_universe = vanilla
> +remote_requirements = True
> ShouldTransferFiles = yes
> WhenToTransferOutput = ON_EXIT
>
> queue
>
> --Dan
>
> hailong.yang1115 wrote:
>   
>> Hi Alain,
>> The corresponding log files from the execute node are in the
>> attachment.
>> -Hailong
>> 2010-01-02
>> ------------------------------------------------------------------------
>> ***********************************************
>> * Hailong Yang, PhD. Candidate
>> * Sino-German Joint Software Institute,
>> * School of Computer Science&Engineering, Beihang University
>> * Phone: (86-010)82315908
>> * Email: hailong.yang1115@xxxxxxxxx
>> * Address: G413, New Main Building in Beihang University,
>> * No.37 XueYuan Road,HaiDian District,
>> * Beijing,P.R.China,100191
>> ***********************************************
>> ------------------------------------------------------------------------
>> *From:* Alain Roy
>> *Sent:* 2010-01-01 00:14:03
>> *To:* Condor-Users Mail List
>> *Cc:*
>> *Subject:* Re: [Condor-users] Problems in Condor-C
>> Hi Hailong,
>> Do you have the corresponding logs from the execute side? The StartLog
>> or StarterLog might have more detail on that error.
>> -alain
>> On Dec 31, 2009, at 9:53 AM, hailong.yang1115 wrote:
>>     
>>> Hi everyone,
>>>
>>> Recently we configured two Condor pools to flock jobs using
>>> Condor-C. The problem is that when the jobs appear in the remote Condor
>>> pool, they stay idle indefinitely. There is an error in the ShadowLog file:
>>>
>>> 06/07 12:43:20 ******************************************************
>>> 06/07 12:43:20 ** condor_shadow (CONDOR_SHADOW) STARTING UP
>>> 06/07 12:43:20 ** /opt/condor-7.4.1/sbin/condor_shadow
>>> 06/07 12:43:20 ** SubsystemInfo: name=SHADOW type=SHADOW(6) class=DAEMON(1)
>>> 06/07 12:43:20 ** Configuration: subsystem:SHADOW local:<NONE> class:DAEMON
>>> 06/07 12:43:20 ** $CondorVersion: 7.4.1 Dec 17 2009 BuildID: 204351 $
>>> 06/07 12:43:20 ** $CondorPlatform: I386-LINUX_RHEL3 $
>>> 06/07 12:43:20 ** PID = 11152
>>> 06/07 12:43:20 ** Log last touched 6/7 12:43:20
>>> 06/07 12:43:20 ******************************************************
>>> 06/07 12:43:20 Using config source: /opt/condor-7.4.1/etc/condor_config
>>> 06/07 12:43:20 Using local config sources:
>>> 06/07 12:43:20 /opt/condor-7.4.1/local.euchina08/condor_config.local
>>> 06/07 12:43:20 DaemonCore: Command Socket at <202.38.140.91:38889>
>>> 06/07 12:43:20 Initializing a VANILLA shadow for job 5.0
>>> 06/07 12:43:20 (5.0) (11152): Request to run on slot1@xxxxxxxxxxxxxxxxxxxxx <202.38.140.91:38395> was ACCEPTED
>>> 06/07 12:43:20 (5.0) (11152): ERROR "Error from slot1@xxxxxxxxxxxxxxxxxxxxx: FileTransfer: DownloadFiles called on server side" at line 655 in file pseudo_ops.cpp
>>>
>>> Here is the job description file:
>>> [ddg2@www simple_test]$ cat simple.submit
>>> universe = grid
>>> grid_resource = condor euchina08.buaa.edu.cn euchina08.buaa.edu.cn
>>> executable = simple.sh
>>> output = simple.out
>>> error = simple.err
>>> log = simple.log
>>> remote_universe = vanilla
>>> +remote_requirements = True
>>> +remote_ShouldTransferFiles = "YES"
>>> +remote_WhenToTransferOutput = "ON_EXIT"
>>> queue
>>>
>>> [ddg2@www simple_test]$ cat simple.sh
>>> #!/bin/sh
>>> echo "Start to sleep for 5 seconds"
>>> sleep 5
>>> echo "All done"
>>>
>>> Any clue?
>>>
>>> -Hailong
>>>
>>> 2009-12-31
>>> ***********************************************
>>> * Hailong Yang, PhD. Candidate
>>> * Sino-German Joint Software Institute,
>>> * School of Computer Science&Engineering, Beihang University
>>> * Phone: (86-010)82315908
>>> * Email: hailong.yang1115@xxxxxxxxx
>>> * Address: G413, New Main Building in Beihang University,
>>> * No.37 XueYuan Road,HaiDian District,
>>> * Beijing,P.R.China,100191
>>> ***********************************************
>>> _______________________________________________
>>> Condor-users mailing list
>>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>>> subject: Unsubscribe
>>> You can also unsubscribe by visiting
>>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>>
>>> The archives can be found at:
>>> https://lists.cs.wisc.edu/archive/condor-users/