
Re: [Condor-users] Condor-C



On Wed, Jul 04, 2007 at 05:56:08PM +0200, Hisham Ihshaish wrote:
> 
> Note: a correction to the first two lines: the remote queue is
> (fitze@xxxxxxxxxxxxxxxxxxxxx), which means the queue of the remote
> machine, not the Condor manager (goldbach.math.ethz.ch).
> 
> Thanks
> Hisham

Hi,

Yes, I retried it with
-------------------------------------------------------------------
grid_resource = condor goldbach.math.ethz.ch goldbach.math.ethz.ch
-------------------------------------------------------------------
and got more 'action' but still no success, so I also added (as mentioned in the
manual) the lines
-----------------------------------------------
SEC_DEFAULT_NEGOTIATION = OPTIONAL
SEC_DEFAULT_AUTHENTICATION_METHODS = CLAIMTOBE
-----------------------------------------------
to the local config-files of both the submitter and goldbach and
issued a 'condor_reconfig' on both => those variables were set.
Now I could finally submit the cluster, but ran into the problem
that the file '/path/to/collatz4.$$(arch)' was not found, so I changed
the submit-file to name the executable explicitly:
----------------------------------
executable      = collatz4.X86_64
----------------------------------
and THEN it finally worked, at least to the point that I can now see
(with condor_q -global) my local cluster (in state 'I') and 5 new jobs on
goldbach that stem from the grid submission. However, those new jobs won't
run because, seemingly, they get matched, BUT by INTEL machines, i.e. the
manager seems not to understand/have the
----------------------------------
Requirements = (Arch == "X86_64")
----------------------------------
from the submit-file.

Summary:
1. There is an error in the manual concerning Condor-C:
-------------------------------------------------------
grid_resource = condor joe@xxxxxxxxxxxxxxxxxxxxxxxxx remotecentralmanager.example.com
-------------------------------------------------------
should be only
------------------------------------------------------
grid_resource = condor remotemachine.example.com remotecentralmanager.example.com
------------------------------------------------------
because the entry next to 'condor' is the name of the remote schedd, so
no 'user@schedd' is needed.
 
2. The manual should also say that the lines
-----------------------------------------------
SEC_DEFAULT_NEGOTIATION = OPTIONAL
SEC_DEFAULT_AUTHENTICATION_METHODS = CLAIMTOBE
-----------------------------------------------
belong on both the local and the remote submitter (and possibly on the manager too).

3. There is an error in Condor-6.8.5 (which I am using) in the translation of things
like '$$(arch)' from submit-files. Furthermore, 'Requirements = ...' from the submit-file
seems not to be taken into account on the manager.
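
For reference, here is a sketch of the submit-file with the changes described
above applied. Everything is taken from the files quoted in this thread except
the '+remote_requirements' line, which is an untested assumption: the manual
describes attributes prefixed with 'remote_' as the ones that are set on the
job in the remote schedd, so the original '+remote_requirements = True' may be
what allows the Intel machines to match.
-------------------------------------------------------
universe         = grid
# Name of the remote schedd, then the remote central manager.
grid_resource    = condor goldbach.math.ethz.ch goldbach.math.ethz.ch
# 1 = standard universe; 5 (vanilla) was tried as well.
+remote_jobuniverse = 1
# ASSUMPTION (untested): put the architecture constraint on the remote job,
# instead of the 'True' used in the original submit-file.
+remote_requirements = (Arch == "X86_64")
+remote_ShouldTransferFiles = "YES"
+remote_WhenToTransferOutput = "ON_EXIT"

my_procs        = 5
# Explicit name, because '$$(arch)' was not expanded for the grid job.
executable      = collatz4.X86_64
arguments       = $(Process)

log             = collatz4-C.log
output          = collatz4-C.$(Process).out
error           = collatz4-C.$(Process).err

Requirements = (Arch == "X86_64")

queue $(my_procs)
-------------------------------------------------------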

Thanks for your help and the fast answer

Urs Fitze

> 
> On Wed, 2007-07-04 at 15:12 +0200, Hisham Ihshaish wrote:
> > 
> > Hi,
> > I think that your job has not been moved to the remote queue
> > (goldbach.math.ethz.ch), and this is why the job's local log file shows
> > >------------------------------------------------------------
> > >020 (005.000.000) 07/04 13:52:02 Detected Down Globus Resource
> > >    RM-Contact: fitze@xxxxxxxxxxxxxxxxxxxxx
> > >...
> > >026 (005.000.000) 07/04 13:52:02 Detected Down Grid Resource
> > >    GridResource: condor fitze@xxxxxxxxxxxxxxxxxxxxx goldbach.math.ethz.ch
> > -------------------------------------------------------------------
> > This means that, so far, the problem is not in the job universe; it
> > could be in your configuration files or in the name of the remote
> > schedd daemon. This error message appears in both of those cases.
> > 
> > Make sure that the schedd name on the remote machine is
> > fitze@xxxxxxxxxxxxxxxxxxxxx, and that its configuration file is set up
> > to accept Condor-C jobs, including the security parameters mentioned in
> > the Condor manual.
> > 
> > Regards,
> > Hisham Ihshaish
> > UAB, Barcelona 
> > 
> > 
> > 
> > Urs Fitze <fitze@xxxxxxxxxxxx> wrote:
> > >Hi,
> > >
> > >For testing Condor-C I did the following (according to the manual 5.3.1):
> > >I submitted a cluster of 5 jobs with the submit-file
> > >-------------------------------------------------------
> > >universe         = grid
> > >grid_resource = condor fitze@xxxxxxxxxxxxxxxxxxxxx goldbach.math.ethz.ch
> > >+remote_jobuniverse = 1 
> > >+remote_requirements = True
> > >+remote_ShouldTransferFiles = "YES"
> > >+remote_WhenToTransferOutput = "ON_EXIT"
> > >
> > >my_procs        = 5
> > >executable      = collatz4.$$(arch)
> > >arguments       = $(Process)
> > >
> > >log             = collatz4-C.log
> > >output          = collatz4-C.$(Process).out
> > >error           = collatz4-C.$(Process).err
> > >
> > >Requirements = (Arch == "X86_64")
> > >
> > >queue $(my_procs)
> > >------------------------------------------------------
> > >from a machine that sits (already regularly) in the Condor-pool of
> > >'goldbach.math.ethz.ch' where I (fitze) am a UID/NFS-known user.
> > >I did the same with '+remote_jobuniverse = 5' and in both cases the
> > >log-file only says
> > >------------------------------------------------------------
> > >020 (005.000.000) 07/04 13:52:02 Detected Down Globus Resource
> > >    RM-Contact: fitze@xxxxxxxxxxxxxxxxxxxxx
> > >...
> > >026 (005.000.000) 07/04 13:52:02 Detected Down Grid Resource
> > >    GridResource: condor fitze@xxxxxxxxxxxxxxxxxxxxx goldbach.math.ethz.ch
> > >------------------------------------------------------------
> > >and condor_q reports the jobs as being idle 'I' => The jobs are not being
> > >processed, and there is nothing special mentioned in the Condor log files
> > >on either the submitter or 'goldbach.math.ethz.ch'.
> > >However, 'normal' submission with 'universe = standard' or
> > >'universe = vanilla' works flawlessly. What could be the cause of the
> > >failure of Condor-C?
> > >
> > >Regards
> > >
> > >Urs Fitze
> > >
> > >_______________________________________________
> > >Condor-users mailing list
> > >To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> > >subject: Unsubscribe
> > >You can also unsubscribe by visiting
> > >https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> > >
> > >The archives can be found at: 
> > >https://lists.cs.wisc.edu/archive/condor-users/
> > >
> > >
> > 
> > 
> > 