
Re: [HTCondor-users] Dynamically determine the amount of extensible machine resource



What is the difference between this method and using STARTD_CRON_JOBLIST and STARTD_CRON_xxxx_EXECUTABLE?
http://spinningmatt.wordpress.com/2009/11/17/custom-classad-attributes-in-condor-freememorymb-via-startd_cron/


2013/5/18 Zhe Zhang <zhangzhe.hust@xxxxxxxxx>
Thanks Alex, I appreciate your help. Let me state my question more clearly: I want to make sure the cron job can run as root, preferably without a two-step flow. Your proposed solution seems to need a separate process that periodically checks the bandwidth and writes the file. What I would prefer is for the cron job to check the bandwidth when the startd launches.

On May 17, 2013, at 2:34 PM, Alex Hunt <ahunt@xxxxxxxxxxx> wrote:

I'm sure this isn't the only or the best way, but you could use a normal cron job running as root (or, better, as another user with sudo privileges) to write to a file (atomically), then use the condor startd cron to just cat that file.
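
To illustrate the "write to a file (atomically)" step: a minimal sketch, assuming the privileged cron job is a shell script. The function name publish_attr, the file path, and the value 1000 are placeholders, not something from this thread. The idea is to write to a temporary file in the same directory and then rename it, so a reader never sees a half-written file.

```shell
#!/bin/sh
# Sketch only: publish one ClassAd-style "Name = Value" line atomically.
# Usage: publish_attr FILE NAME VALUE
publish_attr() {
    target=$1
    # Create the temp file next to the target so that mv is a rename
    # within one filesystem rather than a copy.
    tmp=$(mktemp "$(dirname "$target")/.attr.XXXXXX") || return 1
    printf '%s = %s\n' "$2" "$3" > "$tmp" || { rm -f "$tmp"; return 1; }
    # mktemp creates the file with mode 0600; open it up so an
    # unprivileged reader (e.g. the condor user) can cat it.
    chmod 644 "$tmp"
    mv -f "$tmp" "$target"
}

# Hypothetical call from the root cron job, with a placeholder value:
# publish_attr /var/run/bandwidth.ad DetectedBandwidth 1000
```

The unprivileged side would then only need to cat the published file.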


On 17 May 2013 14:41, Zhe Zhang <zhangzhe.hust@xxxxxxxxx> wrote:
Hi Todd,

Thanks for your solution! This is exactly what I wanted. If I use a simple script like:

#!/bin/sh
echo "DetectedBandwidth = 1000"

The machine classad then gets the correct attribute-value pair to advertise.

However, the only problem right now is that the code I use to determine the Ethernet link bandwidth requires sudo to run (it makes an ioctl() call). If I set the callout executable to that program, restart condor, and run condor_status, the machine slot does not show up. Do you know how to handle callout executables that require sudo permission? Thanks.

Zhe
On May 17, 2013, at 11:01 AM, Todd Tannenbaum <tannenba@xxxxxxxxxxx> wrote:

On 5/17/2013 10:23 AM, Zhe Zhang wrote:
Hi,

I know that HTCondor supports extensible machine resources. The post at
http://spinningmatt.wordpress.com/2012/11/19/extensible-machine-resources/
explains this.

For example, if you want to advertise GPU as a resource, you can add the
following in your config files:

MACHINE_RESOURCE_NAMES = GPU
MACHINE_RESOURCE_GPU = 2

SLOT_TYPE_1 = cpus=100%,auto
SLOT_TYPE_1_PARTITIONABLE = TRUE
NUM_SLOTS_TYPE_1 = 1

Now I have another resource, e.g. bandwidth, which I also want to
advertise as a machine resource in this way. However, instead of hard
coding a specific bandwidth value on each machine, I want this to be
general: each machine would dynamically determine its own bandwidth
(I have a piece of code that does that) and put the value in the
config file. Is there a way I can do that?

Hi Zhe! Yes, there is a way to do that (a few ways actually).  One way off the top of my head would be to use the MACHINE_RESOURCE_INVENTORY_<xxx> knob.  This specifies a script to run at startup of the startd - the stdout of this script should spit out an attribute "Detected<xxx>=y" for the extensible resource plus any other optional characteristics of the resource to advertise in the machine classad.

So instead of doing:

 # declare a local resource quantity directly in config
 MACHINE_RESOURCE_railgun = 8

You could do something like this:

 # specify a script to run that returns local resource inventory with
 # other optional attributes to be advertised in slot ads
 MACHINE_RESOURCE_INVENTORY_railgun = /bin/railgun_inventory.sh 8 0.95c

where railgun_inventory.sh looks like this:

 #!/bin/sh
 echo "DetectedRailgun = $1"
 echo "RailgunMuzzleVelocity = \"$2\""

P.S. If you do not specify "MACHINE_RESOURCE_NAMES", the startd should be able to figure it out automatically by searching for all the MACHINE_RESOURCE_* and MACHINE_RESOURCE_INVENTORY_* knobs...
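
Applied to the bandwidth case from the original question, an inventory script in the same style might look like the sketch below. It is an illustration under assumptions: the script name bandwidth_inventory.sh, the function detect_bandwidth, and the default interface eth0 are made up, and it assumes a Linux host where sysfs reports the link speed in Mb/s at /sys/class/net/<iface>/speed. That file is normally readable without root, but it may not report the same thing as an ioctl()-based measurement.

```shell
#!/bin/sh
# Hypothetical bandwidth_inventory.sh: prints DetectedBandwidth for the startd.
# Assumption: Linux sysfs exposes link speed (Mb/s) in /sys/class/net/<iface>/speed.

# detect_bandwidth SPEED_FILE
# Prints "DetectedBandwidth = N", falling back to 0 if the reading is unusable.
detect_bandwidth() {
    speed_file=$1
    speed=$(cat "$speed_file" 2>/dev/null) || speed=0
    # An empty, non-numeric, or negative reading (e.g. link down)
    # is reported as 0.
    case $speed in
        ''|*[!0-9]*) speed=0 ;;
    esac
    echo "DetectedBandwidth = $speed"
}

# The interface name comes from the first argument; eth0 is only a default guess.
detect_bandwidth "/sys/class/net/${1:-eth0}/speed"
```

Following the railgun example above, this would be hooked up with something like MACHINE_RESOURCE_INVENTORY_Bandwidth = /path/to/bandwidth_inventory.sh eth0, where the path and interface name are again placeholders.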

Hope the above helps,
regards,
Todd



------------------------------------------------
Zhe Zhang
Department of Computer Science and Engineering
University of Nebraska-Lincoln
Lincoln, NE, 68588


_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
