
Re: [Condor-users] Job submitting and Permission Denied problems on Redhat Linux9



Hello Jianwei,
Have you put the should_transfer_files and when_to_transfer_output commands in your submit file?
About dedicated jobs, Mr. Diego is almost right. You have to submit dedicated jobs from a single host, the dedicated scheduler; however, that host does not have to be the central manager. According to the manual, you can choose another host for this. In my case I've chosen a host that isn't the central manager. By the way, your job is in the vanilla universe.
http://www.cs.wisc.edu/condor/manual/v6.8/2_5Submitting_Job.html#SECTION00354000000000000000
2.5.4 Submitting Jobs Without a Shared File System: Condor's File Transfer Mechanism
2.5.4.2 Specifying If and When to Transfer Files
To enable the file transfer mechanism, two commands are placed in the job's submit description file: should_transfer_files and when_to_transfer_output. An example is:

  should_transfer_files = YES
  when_to_transfer_output = ON_EXIT

The should_transfer_files command specifies whether Condor should transfer input files from the submit machine to the remote machine where the job executes. It also specifies whether the output files are transferred back to the submit machine. The command takes on one of three possible values:

  1. YES: Condor always transfers both input and output files.

  2. IF_NEEDED: Condor transfers files if the job is matched with (and to be executed on) a machine in a different FileSystemDomain than the one the submit machine belongs to. If the job is matched with a machine in the local FileSystemDomain, Condor will not transfer files and relies on a shared file system.

  3. NO: Condor's file transfer mechanism is disabled.
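
Putting the two commands in context, a complete submit description file for a vanilla job might look roughly like the sketch below. The executable name aab is taken from the shadow log quoted later in this thread; the output, error, and log file names are placeholders I assumed, so adjust them to your job.

```
# Sketch of a submit description file; file names other than aab are assumed.
universe                = vanilla
executable              = aab
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
output                  = aab.out
error                   = aab.err
log                     = aab.log
queue
```

With file transfer enabled like this, Condor should copy the executable to the execute machine itself rather than expecting to find it on a shared file system.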

http://www.cs.wisc.edu/condor/manual/v6.8/3_13Setting_Up.html
3.13.8.2 Selecting and Setting up your Dedicated Scheduler
We recommend that you select a single host to act as the dedicated scheduler. This is the host from which all users submit their MPI jobs. If you have a dedicated cluster of compute nodes and a single front-end machine from which users are supposed to submit jobs, that machine would be a perfect choice for your dedicated scheduler. If your pool does not have an obvious choice for a submit machine, choose a host that all of your users can log into, and one that is likely to be up and running all the time. All of Condor's other resource requirements for a submit node apply to this machine, such as having enough disk space in the spool directory to hold jobs (see section 3.2.2 for details on these issues).
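
If I read the same section of the manual correctly, each dedicated execute node is then pointed at that host in its local configuration, roughly as in the sketch below. The host name full.host.name is only a placeholder for your own dedicated scheduler, and you should check the exact settings against the manual for your version.

```
# Sketch for the local config of each dedicated execute node.
# Replace full.host.name with the host chosen as dedicated scheduler.
DedicatedScheduler = "DedicatedScheduler@full.host.name"
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler
```

The second line advertises the DedicatedScheduler attribute in the machine's ClassAd, so the dedicated scheduler can find the nodes it is allowed to claim.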

Regards,

Elaine.


On 9/6/06, Diego Bello < dbello@xxxxxxxxx > wrote:
I had the same problem, but with MPI jobs. I finally realized that they
only worked when submitted from the central manager, not from
other hosts.

Serial jobs work just fine when submitted from any host.

Regards.

On 9/5/06, jianwei_wu < jianwei_wu@xxxxxxx> wrote:
>
> Thanks, Elaine Machtans. I have modified the configuration files but the
> problem still exists.
> It seems that the executable cannot be transferred to the destination
> location because of permissions. Is there anything else to
> configure or modify? Thanks in advance!
>
>
>
>
>
>  ________________________________
>
> -----Original Message-----
> From: "Elaine Machtans"
> Sent: 2006-09-05 03:40:26
> To: "Condor-Users Mail List"
> Cc: (none)
> Subject: Re: [Condor-users] Job submitting and Permission Denied problems on
> Redhat Linux9
>
>
>
> Hello,
> I've had a similar problem and, since I am using Condor only for testing, I
> configured the condor_config file on each machine in my pool for
> TESTINGMODE. Look for the text below and replace UWCS with TESTINGMODE in
> the appropriate locations.
> #######################################
> ##  This where you choose the configuration that you would like to
> ##  use.  It has no defaults so it must be defined.  We start this
> ##  file off with the UWCS_* policy.
> ########################################
> ##  Also here is what is referred to as the TESTINGMODE_*, which is
> ##  a quick hardwired way to test Condor.
> ##  Replace UWCS_* with TESTINGMODE_* if you wish to do testing mode.
> ##  For example:
> ##  WANT_SUSPEND                = $(UWCS_WANT_SUSPEND)
> ##  becomes
> ##  WANT_SUSPEND                = $(TESTINGMODE_WANT_SUSPEND)
> WANT_SUSPEND            = $(TESTINGMODE_WANT_SUSPEND)
> WANT_VACATE             = $(TESTINGMODE_WANT_VACATE)
> ##  When is this machine willing to start a job?
> START                   = $(TESTINGMODE_START)
> ##  When should a local universe job be allowed to start?
> START_LOCAL_UNIVERSE    = True
> # Only start a local universe jobs if there are less
> # than 100 local jobs currently running
> #START_LOCAL_UNIVERSE   = TotalLocalJobsRunning < 100
> ##  When should a scheduler universe job be allowed to start?
> START_SCHEDULER_UNIVERSE        = True
> # Only start a scheduler universe jobs if there are less
> # than 100 scheduler jobs currently running
> #START_SCHEDULER_UNIVERSE       = TotalSchedulerJobsRunning < 100
> ##  When to suspend a job?
> SUSPEND                 = $(TESTINGMODE_SUSPEND)
> ##  When to resume a suspended job?
> CONTINUE                = $(TESTINGMODE_CONTINUE)
> ##  When to nicely stop a job?
> ##  (as opposed to killing it instantaneously)
> PREEMPT                 = $(TESTINGMODE_PREEMPT)
> ##  When to instantaneously kill a preempting job
> ##  ( e.g. if a job is in the pre-empting stage for too long)
> KILL                    = $(TESTINGMODE_KILL)
> PERIODIC_CHECKPOINT     = $(TESTINGMODE_PERIODIC_CHECKPOINT)
> PREEMPTION_REQUIREMENTS = $(TESTINGMODE_PREEMPTION_REQUIREMENTS)
> PREEMPTION_RANK         = $(TESTINGMODE_PREEMPTION_RANK)
> NEGOTIATOR_PRE_JOB_RANK = $(TESTINGMODE_NEGOTIATOR_PRE_JOB_RANK)
> NEGOTIATOR_POST_JOB_RANK = $(TESTINGMODE_NEGOTIATOR_POST_JOB_RANK)
> MaxJobRetirementTime    = $(TESTINGMODE_MaxJobRetirementTime)
> I've also changed some time intervals:
> MASTER_UPDATE_INTERVAL          = 30
> UPDATE_INTERVAL         = 30
> MATCH_TIMEOUT           = 30
> SCHEDD_INTERVAL = 30
> You can't submit a job as root, for security reasons. Use the condor
> account. Examine your submit file too.
> You should start the daemons as root if you can.
> I hope this helps.
> Elaine.
>
>
>
> On 9/4/06, jianwei_wu <jianwei_wu@xxxxxxx > wrote:
> >
> >
> > Hello, I am new to Condor and have installed Condor 6.8.0 on 2 machines running
> > Redhat Linux9. Both have the account condor, and the local configuration files
> > are located in /home/condor. One problem is that submitting jobs as "root" is
> > not allowed. Secondly, when I submit a job that executes on the submit
> > machine it goes well, but when I submit a job that executes on the other
> > machine, the job goes into the idle state. The shadow log file suggests that
> > it is a permission problem. I do not know what the real problem is and need
> > your help. Thanks in advance!
> >
> >
> > The log file is as follows:
> >
> > 9/4 22:04:17 ******************************************************
> > 9/4 22:04:17 Using config source: /usr/local/condor/etc/condor_config
> > 9/4 22:04:17 Using local config sources:
> > 9/4 22:04:17    /home/condor/condor_config.local
> > 9/4 22:04:17 DaemonCore: Command Socket at <192.141.121.217:43538>
> > 9/4 22:04:17 Initializing a VANILLA shadow for job 42.0
> > 9/4 22:04:17 (42.0) (6711): Request to run on <192.141.121.216:32796> was ACCEPTED
> > 9/4 22:04:18 (42.0) (6711): Job 42.0 going into Hold state (code 6,2): Error from starter on xhg.whu.edu.cn: Failed to execute '/home/condor/execute/aab': No such file or directory
> > 9/4 22:04:18 (42.0) (6711): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 112
> >
> > _______________________________________________
> > Condor-users mailing list
> > To unsubscribe, send a message to
> condor-users-request@xxxxxxxxxxx with a
> > subject: Unsubscribe
> > You can also unsubscribe by visiting
> > https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> >
> > The archives can be found at either
> > https://lists.cs.wisc.edu/archive/condor-users/
> >
> http://www.opencondor.org/spaces/viewmailarchive.action?key=CONDOR
> >
> >


--
Diego Bello Carreño
Estudiante Memorista de Ingeniería Civil Informática
UTFSM, Valparaíso, Chile
Usuario #294897 counter.li.org