
Re: [Condor-users] How to run a job in all machines of a Condor Poll, but only once?



Angel,

I tend to do this with a Perl script: parse the condor_status output, find the central manager, then construct and submit one submit file for each machine (except the manager).

#!/usr/bin/perl
# FARM OUT PROCESS ONCE TO EACH MACHINE
use strict;
use warnings;

# condor_status prints one "name opsys" record per slot, each ended by a tab
chomp(my $nodelist = `condor_status -format "%s " name -format "%s\t" opsys`);
my %nodes;
foreach my $record (split(/\t/, $nodelist)) {
	my ($machine, $opsys) = split(/ /, $record);
	# Strip the slot prefix (slot1@host -> host) so each machine is kept once
	my $nod = $machine =~ /@/ ? (split(/@/, $machine))[1] : $machine;
	$nodes{$nod} = $opsys;
}
# GET NAME OF CENTRAL MANAGER
chomp(my $condor_host = `condor_config_val CONDOR_HOST`);
foreach my $node (keys %nodes) {
	next if $node eq $condor_host;
	# LOOPS ONCE PER MACHINE **NOT** ONCE PER NODE/SLOT
	# EDIT THE PRINT STATEMENT TO CREATE THE REQUIRED condor_submit FILE
	open(my $locatesub, '>', "locate_$node.sub") or die "Can't open locate_$node.sub: $!";
	print $locatesub "universe = vanilla
executable = XXXXXXX
arguments = XXXXXXX
notification = NEVER
output = locateis.\$\$(Name).stdout
error = locateis.\$\$(Name).stderr
log = locateis.log
requirements = Machine == \"$node\"
queue
";
	close($locatesub) or die "Can't close locate_$node.sub: $!";
	# SUBMIT JOB TO NODE
	system("condor_submit locate_$node.sub 1>/dev/null") == 0
		or warn "condor_submit failed for $node\n";
}

There's probably a better way, but this has worked well on our small cluster.
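
If you want to check the slot-to-machine step of the script above without touching a live pool, something like the following sketch may help. It is only an illustration: the sub name unique_machines, the example.org host names and the OpSys values are made up for the test, and the input string simply mimics the tab-separated "name opsys" records that the condor_status call above is assumed to produce.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collapse per-slot records ("name opsys", tab-separated) into one entry
# per machine, leaving out the central manager.
# unique_machines is a hypothetical helper, not part of Condor.
sub unique_machines {
    my ($status_output, $manager) = @_;
    my %nodes;
    foreach my $record (split /\t/, $status_output) {
        my ($name, $opsys) = split / /, $record;
        next unless defined $name && defined $opsys;
        # "slot1@node1.example.org" -> "node1.example.org"
        my $machine = $name =~ /@/ ? (split /@/, $name)[1] : $name;
        next if $machine eq $manager;
        $nodes{$machine} = $opsys;
    }
    return %nodes;
}

# Made-up sample data standing in for real condor_status output
my $sample = join "\t",
    "slot1\@node1.example.org LINUX",
    "slot2\@node1.example.org LINUX",
    "node2.example.org WINNT51",
    "slot1\@cm.example.org LINUX";

my %nodes = unique_machines($sample, "cm.example.org");
print "$_ $nodes{$_}\n" for sort keys %nodes;
```

With the sample above, node1 is listed once despite having two slots, and the manager is skipped. Note that the comparison against the manager is a plain string match, so it assumes CONDOR_HOST and the Machine attribute use the same form of the host name.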

Steve

Dr Steven Platt
Bioinformatics
Health Protection Agency
UK


-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Ángel de Vicente
Sent: 29 June 2010 09:43
To: condor-users@xxxxxxxxxxx
Subject: [Condor-users] How to run a job in all machines of a Condor Poll,but only once?

Hi all,

 from time to time I find that I would like to run a job in all machines in  
my Condor pool (mostly to find misbehaving ones), but I haven't found an  
easy way of doing it. In the past I've been using the Hawkeye stuff, but  
this is a bit cumbersome, as it implies modifying the configuration files,  
restarting the Startd daemon, etc.

Any ideas on how to easily run a job (only once) on all (as per the  
current pool state) the machines of a Condor pool?

Thanks,
Ángel de Vicente
-- 
+---------------------------------------------+
|                                             |
| http://www.iac.es/galeria/angelv/           |
| VoIP -> sip:angelv@xxxxxxxxx                |
|                                             |
| High Performance Computing Support PostDoc  |
| Instituto de Astrofísica de Canarias        |
|                                             |
+---------------------------------------------+