Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


Re: [HTCondor-users] Core grouping

Date: Sat, 22 Dec 2012 10:31:09 +0100
From: Max Fischer <mfischer@xxxxxxxxxxxxxxxxxxxx>
Subject: Re: [HTCondor-users] Core grouping

Hi,

Use condor_reconfig [1] if you have made changes to either config file that affect a daemon. It's more in line with what you actually want to do, and will be less disruptive than completely restarting a node.
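
As a rough sketch of what that looks like on a node: the snippet below assumes the HTCondor command-line tools are on the PATH, that you are authorized to reconfigure the daemons, and that the custom attribute is the Cluster_Group from the quoted message below. The guard is written for a POSIX shell; the two condor_ commands themselves are the same in a Windows command prompt.

```shell
# Attribute name taken from the quoted message; adjust to your own setup.
ATTR="Cluster_Group"
CONSTRAINT="$ATTR =?= True"

if command -v condor_reconfig >/dev/null 2>&1; then
    # Ask the local daemons to re-read their configuration files.
    condor_reconfig
    # List the slots that advertise the attribute. The collector may need
    # a moment to receive the updated machine ads.
    condor_status -constraint "$CONSTRAINT"
else
    echo "HTCondor tools not found on PATH; nothing done."
fi
```

If the second command lists no slots, the startd is not advertising the attribute yet, so no job that requires it can match.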

> N.B: I haven't restarted nor rebuilt the node yet (as I can't do it!!)

If you are authorized to change the configuration files but not to restart/reconfigure the service, you should rethink your administration policies. Not sure if I understand you correctly though...

If you know that your reconfiguration is fine with the administration and the condor services are set to start automatically on boot-up, you can try rebooting the machines condor is running on. This should make Condor load the new configuration files (as the daemons restart as well) even when you cannot manually tell it to do so.


Cheers,
Max

[1]
http://research.cs.wisc.edu/htcondor/manual/current/condor_reconfig.html

On 12/21/2012 04:19 PM, Hermann Fuchs wrote:

Hi

I guess you need to restart condor on the nodes, in order for the
changes to take effect.

Best regards,
Hermann
On Fri, 2012-12-21 at 12:52 +0000, Mostafa.B wrote:

Thanks for the quick answer, Hermann,

I entered the following into the condor_config.local file on specific
nodes of the cluster:


Cluster_Group = True

STARTD_ATTRS = $(STARTD_ATTRS), Cluster_Group


then set the requirements to the following (basically just
added Cluster_Group =?= True to what I was requiring before):

Requirements = (Arch == "X86_64" && OpSys == "WINDOWS" && Cluster_Group =?= True) || \
               (Arch == "INTEL" && OpSys == "WINDOWS" && Cluster_Group =?= True)
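
Both branches of that expression repeat the OpSys and Cluster_Group tests, so it can be factored. The following is a sketch of a shorter form that should match the same machines, assuming Arch and OpSys are defined in every machine ad (HTCondor advertises both by default):

```
# Same intent as the two-branch version: Windows machines of either
# architecture that advertise Cluster_Group = True.
Requirements = (Arch == "X86_64" || Arch == "INTEL") && \
               OpSys == "WINDOWS" && Cluster_Group =?= True
```

Either form only matches once the nodes actually advertise Cluster_Group, which requires the daemons to have re-read condor_config.local.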



But the jobs I submit don't run in any of the available slots! Does
anyone have any ideas?


N.B: I haven't restarted nor rebuilt the node yet (as I can't do it!!)


On Thu, Dec 20, 2012 at 12:39 PM, Hermann Fuchs
<hermann.fuchs@xxxxxxxxxxxxxxxx> wrote:
Hi

The easiest way to make sure a certain piece of software is installed
on a machine is using ClassAds.

On each machine where you have installed the required software (for
example gnuplot), you define a ClassAd attribute. In your submit file,
require that this attribute is present.

Example: On the nodes where gnuplot is installed:

         HAS_GNUPLOT = True
         STARTD_EXPRS = $(STARTD_EXPRS), HAS_GNUPLOT

In the submit file:

         Requirements   = Memory >= 512 && HAS_GNUPLOT =?= True

Then the job will only run on machines which do have the necessary
software.
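
For context, here is a minimal sketch of a complete submit description file built around that Requirements line. The executable and file names (run_gnuplot.bat, job.out, job.err, job.log) are hypothetical placeholders, not taken from this thread:

```
# Hypothetical submit description file; only the Requirements line
# comes from the example above.
universe     = vanilla
executable   = run_gnuplot.bat
output       = job.out
error        = job.err
log          = job.log

# =?= evaluates to False (rather than UNDEFINED) on machines that do not
# advertise HAS_GNUPLOT, so those machines are never matched.
Requirements = Memory >= 512 && HAS_GNUPLOT =?= True

queue
```

Passing this file to condor_submit queues one job that stays idle until a slot advertising HAS_GNUPLOT = True becomes available.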

Best regards,
Hermann

On Thu, 2012-12-20 at 12:22 +0000, Mostafa.B wrote:

> Hi All,
> In our research group, we have a small cluster of 30 cores dedicated
> to the cluster and a small number of temporary cores available in the
> research group network, which are added to the cluster whenever they
> are idle.
> Access to the mentioned 30 cores is easy, and required third-party
> programs can be installed on their respective PCs without
> difficulties; however, the temporary cores are often very hard to
> access.
> Issues arise when a job is sent to one of the cores that doesn't have
> the necessary software to perform the required task. If this was only
> one job, then there would be no major issue. However, when it comes to
> numerous queued jobs, the situation is much more severe.
> In general, what I do for running a job in the cluster is define the
> executable file as a .bat application, which itself runs a few other
> programs that are already installed on the core's respective PC. So
> this clarifies where exactly the problem is happening.
> Has anybody experienced this problem?
> Does anyone know how to tackle this problem without having to remove
> the temporary cores? Probably some way that can specify which group of
> cores may be used for running a specific job.
>

> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/

--
-------------
DI Hermann Fuchs
Christian Doppler Laboratory for Medical Radiation Research
for Radiation Oncology
Department of Radiation Oncology
Medical University Vienna
Währinger Gürtel 18-20
A-1090 Wien

Tel. + 43 / 1 / 40 400 7271
Mail. hermann.fuchs@xxxxxxxxxxxxxxxx

