
Re: [Condor-users] Condor-C



 
Hello again,

For now I can tell you that the line

grid_resource = condor joe@xxxxxxxxxxxxxxxxxxxxxxxxx remotecentralmanager.example.com

is correct in a submit description file; it is not an error in the manual.
Every machine that runs a schedd daemon has a name for that schedd, and in
the manual's example the name is joe@xxxxxxxxxxxxxxxxxxxxxxxxx. Sometimes,
though, the name is not assigned explicitly. To check the name of the schedd,
you can use the command "condor_config_val -schedd SCHEDD_NAME".
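
To make the two fields concrete, here is a minimal sketch of how that line sits in a submit description file. Both host names are placeholders (remotemachine.example.com is borrowed from the example discussed in this thread, not a real host); the first position takes whatever name the remote schedd actually reports.

```
# Sketch only: both host names are placeholders, not real machines.
universe      = grid
# Form: grid_resource = condor <remote schedd name> <remote central manager>
grid_resource = condor remotemachine.example.com remotecentralmanager.example.com
```

If the remote schedd reports a name of the form name@host, as in the manual's example, that whole string goes in the first position.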

good luck,
Hisham Ihshaish







Urs Fitze <fitze@xxxxxxxxxxxx> wrote:
>

On Wed, Jul 04, 2007 at 05:56:08PM +0200, Hisham Ihshaish wrote:
>> 
>> Note: a correction to the first two lines. The remote queue is
>> fitze@xxxxxxxxxxxxxxxxxxxxx, which means the queue of the remote
>> machine, not the Condor manager (goldbach.math.ethz.ch).
>> 
>> Thanks,
>> Hisham
>
>Hi
>
>Yes, I retried it with
>-------------------------------------------------------------------
>grid_resource = condor goldbach.math.ethz.ch goldbach.math.ethz.ch
>-------------------------------------------------------------------
>and got more 'action' but no success, so I also included (as mentioned in
>the manual) the lines
>-----------------------------------------------
>SEC_DEFAULT_NEGOTIATION = OPTIONAL
>SEC_DEFAULT_AUTHENTICATION_METHODS = CLAIMTOBE
>-----------------------------------------------
>in the local config files of both the submitter and goldbach and
>issued a 'condor_reconfig' on both, after which those variables were set.
>Now I could finally submit the cluster, but I ran into the problem
>that the file '/path/to/collatz4.$$(arch)' was not found, so I changed
>the submit file to name the executable explicitly:
>----------------------------------
>executable      = collatz4.X86_64
>----------------------------------
>and THEN it finally worked, in so far as I can now see (with
>condor_q -global) my local cluster (in state 'I') and 5 new jobs on
>goldbach that stem from the grid-submitted ones. However, those new jobs
>won't run because, seemingly, they do get matched, BUT by INTEL machines,
>i.e. the manager seems not to understand or to have received the
>----------------------------------
>Requirements = (Arch == "X86_64")
>----------------------------------
>from the submit-file.
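
One possible explanation, not confirmed in this thread: the original submit file sets +remote_requirements = True, and the +remote_ attributes are the ones applied to the job on the remote schedd. Under that assumption the remote job's Requirements would simply be True, so any machine could match it, while the plain Requirements line would only affect the local grid job. A minimal sketch of what one might try instead, reusing the names from this thread:

```
# Assumption: +remote_<attribute> sets <attribute> on the job at the remote schedd.
universe             = grid
grid_resource        = condor goldbach.math.ethz.ch goldbach.math.ethz.ch
+remote_jobuniverse  = 5
# Restrict the remote job to 64-bit machines instead of letting it match everything.
+remote_requirements = (Arch == "X86_64")
executable           = collatz4.X86_64
```

This is an untested sketch, not a verified fix for Condor 6.8.5.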
>
>Summary:
>1. There is an error in the manual concerning Condor-C:
>-------------------------------------------------------
>grid_resource = condor joe@xxxxxxxxxxxxxxxxxxxxxxxxx remotecentralmanager.example.com
>-------------------------------------------------------
>should be only
>------------------------------------------------------
>grid_resource = condor remotemachine.example.com remotecentralmanager.example.com
>------------------------------------------------------
>because the entry next to 'condor' is the name of the remote schedd, so
>no 'user@schedd' is needed.
> 
>2. It should also be said there that the lines
>-----------------------------------------------
>SEC_DEFAULT_NEGOTIATION = OPTIONAL
>SEC_DEFAULT_AUTHENTICATION_METHODS = CLAIMTOBE
>-----------------------------------------------
>belong on both the local and the remote submitter (and possibly on the
>manager too).
>
>3. There is an error in Condor 6.8.5 (which I am using) in the translation
>of things like '$$(arch)' from submit files. Furthermore, 'Requirements = ...'
>from the submit file seems not to be taken into account on the manager.
>
>Thanks for your help and the fast answer
>
>Urs Fitze
>
>> 
>> On Wed, 2007-07-04 at 15:12 +0200, Hisham Ihshaish wrote:
>> > 
>> > Hi,
>> > I think that your job has not been moved to the remote queue
>> > (goldbach.math.ethz.ch), and this is why the local job log file gives you
>> > >------------------------------------------------------------
>> > >020 (005.000.000) 07/04 13:52:02 Detected Down Globus Resource
>> > >    RM-Contact: fitze@xxxxxxxxxxxxxxxxxxxxx
>> > >...
>> > >026 (005.000.000) 07/04 13:52:02 Detected Down Grid Resource
>> > >    GridResource: condor fitze@xxxxxxxxxxxxxxxxxxxxx goldbach.math.ethz.ch
>> > -------------------------------------------------------------------
>> > This means that, so far, the problem is not in the job universe; it
>> > could be in your configuration files or in the remote schedd daemon
>> > name. This error message appears in both of these cases.
>> > 
>> > Make sure that the schedd name on the remote machine is
>> > fitze@xxxxxxxxxxxxxxxxxxxxx, and then make sure that you have set up its
>> > configuration file to accept Condor-C jobs, along with the security
>> > parameters mentioned in the Condor manual.
>> > 
>> > Regards,
>> > Hisham Ihshaish
>> > UAB, Barcelona 
>> > 
>> > 
>> > 
>> > Urs Fitze <fitze@xxxxxxxxxxxx> wrote:
>> > >
>> > >Hi,
>> > >
>> > >To test Condor-C I did the following (according to section 5.3.1 of
>> > >the manual): I submitted a cluster of 5 jobs with the submit file
>> > >-------------------------------------------------------
>> > >universe         = grid
>> > >grid_resource = condor fitze@xxxxxxxxxxxxxxxxxxxxx goldbach.math.ethz.ch
>> > >+remote_jobuniverse = 1 
>> > >+remote_requirements = True
>> > >+remote_ShouldTransferFiles = "YES"
>> > >+remote_WhenToTransferOutput = "ON_EXIT"
>> > >
>> > >my_procs        = 5
>> > >executable      = collatz4.$$(arch)
>> > >arguments       = $(Process)
>> > >
>> > >log             = collatz4-C.log
>> > >output          = collatz4-C.$(Process).out
>> > >error           = collatz4-C.$(Process).err
>> > >
>> > >Requirements = (Arch == "X86_64")
>> > >
>> > >queue $(my_procs)
>> > >------------------------------------------------------
>> > >from a machine that sits (already regularly) in the Condor-pool of
>> > >'goldbach.math.ethz.ch' where I (fitze) am a UID/NFS-known user.
>> > >I did the same with '+remote_jobuniverse = 5' and in both cases the
>> > >log-file only says
>> > >------------------------------------------------------------
>> > >020 (005.000.000) 07/04 13:52:02 Detected Down Globus Resource
>> > >    RM-Contact: fitze@xxxxxxxxxxxxxxxxxxxxx
>> > >...
>> > >026 (005.000.000) 07/04 13:52:02 Detected Down Grid Resource
>> > >    GridResource: condor fitze@xxxxxxxxxxxxxxxxxxxxx goldbach.math.ethz.ch
>> > >------------------------------------------------------------
>> > >and condor_q reports the jobs as idle ('I'), so the jobs are not being
>> > >processed, and there is nothing special mentioned in the Condor log
>> > >files on either the submitter or 'goldbach.math.ethz.ch'.
>> > >However, 'normal' submission with 'universe = standard' or
>> > >'universe = vanilla' works flawlessly. What could be the cause of the
>> > >failure of Condor-C?
>> > >
>> > >Regards
>> > >
>> > >Urs Fitze
>> > >
>> > >_______________________________________________
>> > >Condor-users mailing list
>> > >To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>> > >subject: Unsubscribe
>> > >You can also unsubscribe by visiting
>> > >https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>> > >
>> > >The archives can be found at: 
>> > >https://lists.cs.wisc.edu/archive/condor-users/
>> > >