
Re: [HTCondor-users] condor-ce: troubleshooting and jobRouter



Hello, Stefano.

 

As far as I know, the "osg-local-job-environment.conf" file is created by "osg-configure" when the local settings option is used, so it cannot be found with "yum provides".

Please see https://opensciencegrid.org/docs/other/configuration-with-osg-configure/#local-settings

I actually installed condor-ce on UMD middleware, so I cannot use the "osg-configure" program. I created the file manually at that location and it has been working well.

If your settings work well, you do not need to change them. Please review this only if your settings are not working or you run into a limitation.

 

Regards,

----------------------- Original Message -----------------------
From: "Stefano Dal Pra" <stefano.dalpra@xxxxxxxxxxxx>
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>, Geonmo Ryu <geonmo@xxxxxxxxxxx>
Date: 2018-09-20 17:28:20 GMT +0900 (ROK)
Subject: Re: [HTCondor-users] condor-ce: troubleshooting and jobRouter

 

 

Hello Geonmo, thank you,

On 19/09/2018 02:02, "Geonmo Ryu" wrote:

Hello, Stefano.

 

Could you use osg-local-job-environment.conf?

I'm not sure where the best place to put it is:

I no longer have the path /var/lib/osg/. In my first attempt with condor-ce I followed the instructions at
https://opensciencegrid.org/docs/compute-element/install-htcondor-ce/ which led me to install
htcondor-ce-*.osg*el7*.rpm

However, I now have:
[root@ce01-htc ~]# rpm -qa | grep condor-ce
htcondor-ce-client-3.1.0-1.el7.noarch
htcondor-ce-condor-3.1.0-1.el7.noarch
htcondor-ce-view-3.1.0-1.el7.noarch
htcondor-ce-3.1.0-1.el7.noarch

and none of those packages provides /var/lib/osg/.

In an attempt to provide external jobs with a PATH, I defined

[root@ce01-htc ~]# condor_config_val -dump APPEND_REQUIREMENTS
APPEND_REQUIREMENTS = Environment = "PATH=/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin"

but I'm not sure this is the best approach.

Cheers,
Stefano
 

 

I modified the file to add the environment.

 

## /var/lib/osg/osg-local-job-environment.conf
#!/bin/sh
VO_CMS_SW_DIR=/cvmfs/cms.cern.ch
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
export VO_CMS_SW_DIR
export PATH
####################################
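
One way to check that a file like this produces the intended variables is to source it in an otherwise empty environment and print what it exports. The following is only a minimal sketch, assuming a POSIX shell at /bin/sh; the function name show_job_env is hypothetical and is not part of HTCondor-CE or the OSG tooling.

```shell
#!/bin/sh
# show_job_env FILE: source FILE in an empty environment and print
# the resulting exported variables, sorted by name.
# Pass a path that contains a slash (for example an absolute path), since
# "." searches PATH for bare file names.
show_job_env() {
    env -i /bin/sh -c '. "$1" && env' sh "$1" | sort
}

# Usage (the path is the one used in this thread):
#   show_job_env /var/lib/osg/osg-local-job-environment.conf
```

The shell itself may add a few variables of its own (such as PWD) to the output; the lines of interest are the ones the file exports.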

 

And here is my JobRouter configuration:

 

## /etc/condor-ce/config.d/61-job-routes.conf
##  61-job-routes.conf
#####################################################
# Example Job Route
#
# This is an extraordinarily simple job route.
# All it does is route local condor and set a
# simple Accounting Group and default RequestMemory.
#####################################################
# No custom functions for job router entries; these are causing crashes in 8.3.5.
# Can remove the eval_set_environment attribute below starting in 8.3.8.
JOB_ROUTER_ENTRIES = [ \
        name = "condor_pool_dteam"; \
        TargetUniverse = 5; \
        Requirements = target.x509UserProxyVOName =?= "dteam"; \
        set_requirements = (Arch == "X86_64") && (TARGET.OpSys == "LINUX"); \
        MaxJobs = 100; \
        MaxIdleJobs = 100; \
] \
[ \
        name = "condor_pool_ops"; \
        TargetUniverse = 5; \
        Requirements = target.x509UserProxyVOName =?= "ops"; \
        set_requirements = (Arch == "X86_64") && (TARGET.OpSys == "LINUX"); \
        MaxJobs = 100; \
        MaxIdleJobs = 100; \
] \
[ \
        name = "condor_pool_cms"; \
        TargetUniverse = 5; \
        Requirements = target.x509UserProxyVOName =?= "cms"; \
        set_requirements = (Arch == "X86_64") && (TARGET.OpSys == "LINUX"); \
        MaxJobs = 1280; \
        MaxIdleJobs = 1280; \
] \
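
For a quick sanity check of a routes file written in this style, the route names and their MaxJobs limits can be listed with a small awk filter. This is only a rough sketch: the function name list_routes is hypothetical, and it assumes one attribute per line as in the configuration above. It does not parse ClassAds properly.

```shell
#!/bin/sh
# list_routes FILE: print "<route name> <MaxJobs>" for each route in FILE.
# Assumes each attribute sits on its own line, as in the file above.
list_routes() {
    awk '
        /^[ \t]*name[ \t]*=/ {
            sub(/^[ \t]*name[ \t]*=[ \t]*"/, "")   # drop everything up to the opening quote
            sub(/".*/, "")                          # drop the closing quote and the rest
            name = $0
        }
        /^[ \t]*MaxJobs[ \t]*=/ {
            sub(/^[ \t]*MaxJobs[ \t]*=[ \t]*/, "")  # drop the attribute name
            sub(/;.*/, "")                          # drop the semicolon and continuation
            print name, $0
        }
    ' "$1"
}

# Usage (the path is the one used in this thread):
#   list_routes /etc/condor-ce/config.d/61-job-routes.conf
```

If the HTCondor-CE tools are installed, "condor_ce_job_router_info -config" should show the routes as the JobRouter itself parses them, which is more reliable than a text filter like this one.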

 

 

 

 

 

----------------------- Original Message -----------------------
From: "Stefano Dal Pra" <stefano.dalpra@xxxxxxxxxxxx>
To: htcondor-users <htcondor-users@xxxxxxxxxxx>
Date: 2018-09-18 22:12:29 GMT +0900 (ROK)
Subject: [HTCondor-users] condor-ce: troubleshooting and jobRouter

 

 

Hello,

I'm practicing with HTCondor-CE and need some help, as I'm not very fluent at troubleshooting and configuration.

Test pilot jobs submitted by a CMS factory are failing a validation shell script when running on the execute node. Apparently, the reason is that no environment variable is passed to the job:

Environment = ""

I verified that the shell script succeeds if I submit it from the condor-ce itself by adding

environment = "PATH=/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin"

to the submit file.

However, if I submit the same job from an external machine, again no environment is passed to the job on the execute node.

That seems to suggest that a few parameters are trimmed away. I think the JobRouter is where such submission parameters might be altered, but I'm not sure at all, and some simpler misconfiguration could explain this problem.

 

 

A couple of questions:

1) For jobs I submit, there are log files such as

/var/log/condor-ce/GridmanagerLog.dteam039

containing a line such as:

09/17/18 15:08:10 (D_ALWAYS:2) [4098033] GAHP[4098037] <- 'CONDOR_JOB_SUBMIT [SNIP] Environment\ =\ "PATH=/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin"; [SNIP]

where I can see the submit file content. However, there is no similar file for the cms user:

/var/log/condor-ce/GridmanagerLog.pilcms017

Is there a way to compare the job parameters "before" and "after" the routing?
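
One possible way to make such a comparison is to dump the job ad from the CE queue and the ad of its routed copy from the local pool queue, then diff the two. The sketch below rests on assumptions: the helper name diff_ads and the job IDs in the usage comment are hypothetical, and it relies on the CE job ad carrying a RoutedToJobId attribute that points at the routed copy.

```shell
#!/bin/sh
# diff_ads BEFORE AFTER: compare two ClassAd dumps attribute by attribute.
# Sorting first makes the diff independent of attribute order.
# Returns diff's exit status: 0 if identical, 1 if the ads differ.
diff_ads() {
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    sort "$1" > "$a"
    sort "$2" > "$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return "$rc"
}

# Usage (job IDs are placeholders):
#   condor_ce_q -l 123.0 > before.ad              # ad as submitted to the CE
#   condor_ce_q 123.0 -af RoutedToJobId           # ID of the routed copy, e.g. 456.0
#   condor_q -l 456.0 > after.ad                  # ad after routing, in the local pool
#   diff_ads before.ad after.ad | grep -i environment
```

Many attributes legitimately differ between the two ads, so filtering the output for the attribute of interest (here Environment) keeps the comparison readable.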

 

 

2) Does anyone have a few examples of job routing configuration for a WLCG-like HTCondor-CE?

Currently I'm looking at https://opensciencegrid.org/docs/compute-element/job-router-recipes/ . If the examples there are mostly adequate for a non-OSG CE, I can go on and refer to those.

Thanks for any help, bye

Stefano

 

 

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/

 

 

 
 