
Re: [Condor-users] Condor-C



Note: a correction to the first two lines of my message below. The remote
queue is fitze@xxxxxxxxxxxxxxxxxxxxx, meaning the queue on the remote
machine, not the Condor manager (goldbach.math.ethz.ch).

Thanks,
Hisham

On Wed, 2007-07-04 at 15:12 +0200, Hisham Ihshaish wrote:
> 
> Hi,
> I think your job has not been moved to the remote queue
> (goldbach.math.ethz.ch), which is why the job's local log file shows:
> >------------------------------------------------------------
> >020 (005.000.000) 07/04 13:52:02 Detected Down Globus Resource
> >    RM-Contact: fitze@xxxxxxxxxxxxxxxxxxxxx
> >...
> >026 (005.000.000) 07/04 13:52:02 Detected Down Grid Resource
> >    GridResource: condor fitze@xxxxxxxxxxxxxxxxxxxxx goldbach.math.ethz.ch
> -------------------------------------------------------------------
> This means that, so far, the problem is not in the job universe; it could
> be in your configuration files or in the name of the remote schedd daemon.
> This error message appears in both of those cases.
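
To check a longer log for the same symptom, the two events above (020 and 026) can be picked out with a small script. This is only a rough sketch: it assumes the user-log layout shown in the excerpt (a header line with a three-digit event code and the job ID, indented detail lines, and "..." closing each event), and the function name find_down_resource_events is made up for illustration.

```python
import re

# Header line of an event, as it appears in the excerpt above, e.g.
# "020 (005.000.000) 07/04 13:52:02 Detected Down Globus Resource"
EVENT_HEADER = re.compile(
    r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\) (\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$"
)

# Event codes taken from the excerpt: 020 and 026 both report a down resource.
DOWN_EVENTS = {"020", "026"}


def find_down_resource_events(log_text):
    """Return the 'Detected Down ... Resource' events found in log_text."""
    events = []
    current = None  # the event whose detail lines are being collected
    for line in log_text.splitlines():
        match = EVENT_HEADER.match(line)
        if match:
            code, cluster, proc, _subproc, when, message = match.groups()
            current = None
            if code in DOWN_EVENTS:
                current = {
                    "code": code,
                    "job": f"{int(cluster)}.{int(proc)}",
                    "time": when,
                    "message": message,
                    "details": [],
                }
                events.append(current)
        elif line.strip() == "...":
            current = None  # "..." closes the current event
        elif current is not None and line.strip():
            current["details"].append(line.strip())
    return events


SAMPLE_LOG = """\
020 (005.000.000) 07/04 13:52:02 Detected Down Globus Resource
    RM-Contact: fitze@xxxxxxxxxxxxxxxxxxxxx
...
026 (005.000.000) 07/04 13:52:02 Detected Down Grid Resource
    GridResource: condor fitze@xxxxxxxxxxxxxxxxxxxxx goldbach.math.ethz.ch
"""

for event in find_down_resource_events(SAMPLE_LOG):
    print(event["job"], event["code"], event["message"], event["details"])
```

If every job in the cluster shows these events and nothing else, the jobs most likely never reached the remote schedd.
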
> 
> Make sure that the schedd on the remote machine is named
> fitze@xxxxxxxxxxxxxxxxxxxxx, and that its configuration file is set up to
> accept Condor-C jobs, including the security parameters described in the
> Condor manual.
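
The name check can also be done mechanically. For Condor-C, the grid_resource line has the form "condor <remote schedd name> <remote pool>", so the second field must match a schedd name that the remote pool actually advertises (for example, as listed by condor_status -schedd). The sketch below assumes you have already collected those names into a list; the helper names parse_condor_grid_resource and schedd_is_known are hypothetical, and the case-insensitive comparison is my own assumption.

```python
def parse_condor_grid_resource(value):
    """Split a Condor-C grid_resource value into schedd name and pool."""
    parts = value.split()
    if len(parts) != 3 or parts[0].lower() != "condor":
        raise ValueError(
            "expected 'condor <schedd name> <pool>', got: %r" % value
        )
    return {"schedd_name": parts[1], "pool": parts[2]}


def schedd_is_known(grid_resource, known_schedd_names):
    """Check whether the schedd named in grid_resource is in the given list.

    known_schedd_names is whatever list of names you gathered from the
    remote pool; the comparison ignores case (an assumption of this sketch).
    """
    target = parse_condor_grid_resource(grid_resource)["schedd_name"].lower()
    return target in {name.lower() for name in known_schedd_names}


resource = "condor fitze@xxxxxxxxxxxxxxxxxxxxx goldbach.math.ethz.ch"
print(parse_condor_grid_resource(resource))
# A list holding only the pool manager's name does not contain the schedd:
print(schedd_is_known(resource, ["goldbach.math.ethz.ch"]))
```

If the check fails, either the schedd name in the submit file is wrong or the remote schedd is not advertising itself under that name.
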
> 
> Regards,
> Hisham Ihshaish
> UAB, Barcelona 
> 
> 
> 
> Urs Fitze <fitze@xxxxxxxxxxxx> wrote:
> >
> >Hi,
> >
> >For testing Condor-C I did the following (according to the manual 5.3.1):
> >I submitted a cluster of 5 jobs with the submit-file
> >-------------------------------------------------------
> >universe         = grid
> >grid_resource = condor fitze@xxxxxxxxxxxxxxxxxxxxx goldbach.math.ethz.ch
> >+remote_jobuniverse = 1 
> >+remote_requirements = True
> >+remote_ShouldTransferFiles = "YES"
> >+remote_WhenToTransferOutput = "ON_EXIT"
> >
> >my_procs        = 5
> >executable      = collatz4.$$(arch)
> >arguments       = $(Process)
> >
> >log             = collatz4-C.log
> >output          = collatz4-C.$(Process).out
> >error           = collatz4-C.$(Process).err
> >
> >Requirements = (Arch == "X86_64")
> >
> >queue $(my_procs)
> >------------------------------------------------------
> >from a machine that is already a regular member of the Condor pool of
> >'goldbach.math.ethz.ch', where I (fitze) am a UID/NFS-known user.
> >I did the same with '+remote_jobuniverse = 5', and in both cases the
> >log file only says
> >------------------------------------------------------------
> >020 (005.000.000) 07/04 13:52:02 Detected Down Globus Resource
> >    RM-Contact: fitze@xxxxxxxxxxxxxxxxxxxxx
> >...
> >026 (005.000.000) 07/04 13:52:02 Detected Down Grid Resource
> >    GridResource: condor fitze@xxxxxxxxxxxxxxxxxxxxx goldbach.math.ethz.ch
> >------------------------------------------------------------
> >and condor_q reports the jobs as idle ('I'), so the jobs are not being
> >processed, and nothing special is mentioned in the Condor log files on
> >either the submitter or 'goldbach.math.ethz.ch'.
> >However, 'normal' submission with 'universe = standard' or
> >'universe = vanilla' works flawlessly. What could be causing Condor-C
> >to fail?
> >
> >Regards
> >
> >Urs Fitze
> >
> >_______________________________________________
> >Condor-users mailing list
> >To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> >subject: Unsubscribe
> >You can also unsubscribe by visiting
> >https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> >
> >The archives can be found at: 
> >https://lists.cs.wisc.edu/archive/condor-users/
> >
> >
> 
> 
> 