
Re: [Condor-users] Condor-C




Hi,
I think your job was never moved to the remote queue
(goldbach.math.ethz.ch), which is why the job's local log file shows:
>------------------------------------------------------------
>020 (005.000.000) 07/04 13:52:02 Detected Down Globus Resource
>    RM-Contact: fitze@xxxxxxxxxxxxxxxxxxxxx
>...
>026 (005.000.000) 07/04 13:52:02 Detected Down Grid Resource
>    GridResource: condor fitze@xxxxxxxxxxxxxxxxxxxxx goldbach.math.ethz.ch
-------------------------------------------------------------------
This means that, so far, the problem is not in the job universe. It could
be in your configuration files, or in the name of the remote schedd daemon.
Either of these two cases produces this error message.
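
To check quickly whether a job log contains these "resource down" events, a small script can scan it. This is only a minimal sketch: it assumes the log layout shown in the excerpt above (an event header line followed by an indented detail line), and the function name and the placeholder address in the sample are made up for illustration.

```python
import re

# Event header: 3-digit code, (cluster.proc.subproc), date, time, description.
EVENT_RE = re.compile(r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\) (\S+) (\S+) (.*)$")

# Event codes as they appear in the log excerpt above.
DOWN_EVENTS = {"020", "026"}


def find_down_events(log_text):
    """Return (code, job_id, timestamp, detail) for each resource-down event."""
    events = []
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        m = EVENT_RE.match(line)
        if not m or m.group(1) not in DOWN_EVENTS:
            continue
        job_id = f"{int(m.group(2))}.{int(m.group(3))}"
        # The detail (RM-Contact / GridResource) is on the next, indented line.
        detail = ""
        if i + 1 < len(lines) and lines[i + 1].startswith((" ", "\t")):
            detail = lines[i + 1].strip()
        events.append((m.group(1), job_id, f"{m.group(5)} {m.group(6)}", detail))
    return events


# Sample input; the schedd address is a placeholder, not the real name.
SAMPLE = """\
020 (005.000.000) 07/04 13:52:02 Detected Down Globus Resource
    RM-Contact: fitze@remote.example
...
026 (005.000.000) 07/04 13:52:02 Detected Down Grid Resource
    GridResource: condor fitze@remote.example goldbach.math.ethz.ch
...
"""

if __name__ == "__main__":
    for code, job_id, when, detail in find_down_events(SAMPLE):
        print(code, job_id, when, detail)
```

If the scan finds event 026 for every job in the cluster and no later event, the jobs most likely never reached the remote schedd.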

Make sure that the schedd on the remote machine really is named
fitze@xxxxxxxxxxxxxxxxxxxxx, and then make sure that its configuration file
is set up to accept Condor-C jobs, including the security parameters
described in the Condor manual.
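
One way to do the name check is to take the schedd name and pool host from the grid_resource line of the submit file and query the remote pool with them. The sketch below only builds the command lines and does not run them. The helper names and the placeholder schedd name are hypothetical. `condor_status -schedd -pool <host>` should list the schedd names the remote collector knows about, and `condor_q -name <schedd> -pool <host>` should reach that schedd's queue if the name is right.

```python
def parse_condor_c_resource(value):
    """Split a Condor-C grid_resource value into (schedd_name, pool_host).

    Expected form: "condor <schedd name> <pool / collector host>".
    """
    parts = value.split()
    if len(parts) != 3 or parts[0].lower() != "condor":
        raise ValueError(f"not a Condor-C grid_resource: {value!r}")
    return parts[1], parts[2]


def check_commands(schedd_name, pool_host):
    """Build the two commands used to verify the remote schedd name."""
    return [
        # Lists the schedd names advertised in the remote pool.
        ["condor_status", "-schedd", "-pool", pool_host],
        # Queries the queue of the named schedd; fails if the name is wrong.
        ["condor_q", "-name", schedd_name, "-pool", pool_host],
    ]


if __name__ == "__main__":
    # Placeholder schedd name; substitute the real one from your submit file.
    name, pool = parse_condor_c_resource(
        "condor fitze@remote.example goldbach.math.ethz.ch"
    )
    for cmd in check_commands(name, pool):
        print(" ".join(cmd))
```

If the name printed by condor_status differs from the one in grid_resource, correct the submit file to match the advertised name.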

Regards,
Hisham Ihshaish
UAB, Barcelona 



Urs Fitze <fitze@xxxxxxxxxxxx> wrote:
>Hi,
>
>For testing Condor-C I did the following (according to the manual 5.3.1):
>I submitted a cluster of 5 jobs with the submit-file
>-------------------------------------------------------
>universe         = grid
>grid_resource = condor fitze@xxxxxxxxxxxxxxxxxxxxx goldbach.math.ethz.ch
>+remote_jobuniverse = 1 
>+remote_requirements = True
>+remote_ShouldTransferFiles = "YES"
>+remote_WhenToTransferOutput = "ON_EXIT"
>
>my_procs        = 5
>executable      = collatz4.$$(arch)
>arguments       = $(Process)
>
>log             = collatz4-C.log
>output          = collatz4-C.$(Process).out
>error           = collatz4-C.$(Process).err
>
>Requirements = (Arch == "X86_64")
>
>queue $(my_procs)
>------------------------------------------------------
>from a machine that sits (already regularly) in the Condor-pool of
>'goldbach.math.ethz.ch' where I (fitze) am a UID/NFS-known user.
>I did the same with '+remote_jobuniverse = 5' and in both cases the
>log-file only says
>------------------------------------------------------------
>020 (005.000.000) 07/04 13:52:02 Detected Down Globus Resource
>    RM-Contact: fitze@xxxxxxxxxxxxxxxxxxxxx
>...
>026 (005.000.000) 07/04 13:52:02 Detected Down Grid Resource
>    GridResource: condor fitze@xxxxxxxxxxxxxxxxxxxxx goldbach.math.ethz.ch
>------------------------------------------------------------
>and condor_q reports the jobs as being idle 'I' => The jobs are not being
>processed and there is nothing special mentioned in the Condor-logfiles on
>both the submitter and 'goldbach.math.ethz.ch'.
>However 'normal' submission with 'universe = standard' resp ' = vanilla'
>works flawlessly. What could be the cause of the failure of Condor-C?
>
>Regards
>
>Urs Fitze
>