
Re: [Condor-users] Running a HOOK_JOB_EXIT



FYI, that config will cause the script to be run for every job. If that's what you want, great. If you want to run the script selectively, only for munin jobs, then you'll need the +HookKeyword in the job's submit file.
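
A minimal sketch of the selective variant, assuming the execute node's configuration defines MUNIN_HOOK_JOB_EXIT as quoted later in this thread. The helper name make_submit is hypothetical; munin.sh and x.out are the names used in the example submission further down.

```shell
#!/bin/sh
# Print a submit description that opts a job in to the hooks for one keyword.
# make_submit is a hypothetical helper; $1 is the hook keyword (e.g. MUNIN).
make_submit() {
    keyword=$1
    cat <<EOF
cmd = munin.sh
output = x.out
should_transfer_files = ALWAYS
when_to_transfer_output = ON_EXIT
+HookKeyword = "$keyword"
queue
EOF
}

# Show the generated submit description on stdout.
make_submit MUNIN
```

Piping the output into condor_submit (make_submit MUNIN | condor_submit) would submit the job. Jobs submitted without the +HookKeyword line would not trigger the MUNIN hook unless a default keyword is configured on the execute node.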

Best,


matt

On 06/21/2011 06:12 PM, Colin Leavett-Brown wrote:
Thank you Matt for all your help with this problem. I think, in the end,
the problem turns out to be a bug in 7.5.1. Here's the output from 7.6.1:

[crlb@elephant condor]$ cat x.out

1. Condor version - Command "condor_version" produces:
$CondorVersion: 7.6.1 May 31 2011 BuildID: 339001 $
$CondorPlatform: x86_rhap_5 $

2. Condor configuration - Command "condor_config_val -dump | grep -i
hook" produces:
MUNIN_HOOK_JOB_EXIT = /usr/local/bin/munin-node-condor-job-exit

3. Job Class Ad - Command "grep -i hook $_CONDOR_JOB_AD" produces:
HookKeyword = "MUNIN"

4. Job hook permissions - Command "ls -l
/usr/local/bin/munin-node-condor-job-exit" produces:
-rwxr-xr-x 1 root root 126 Jun 21 17:18
/usr/local/bin/munin-node-condor-job-exit

5. Job hook test run - Command
"/usr/local/bin/munin-node-condor-job-exit" produces:
Executing munin-node-condor-job-exit

6. Job exit should produce a second line of output from the hook:
Executing munin-node-condor-job-exit
[crlb@elephant condor]$

I found that the following also works, and doesn't require the user to
add anything to their job file:

2. Condor configuration - Command "condor_config_val -dump | grep -i
hook" produces:
STARTER_HOOK_JOB_EXIT = /usr/local/bin/munin-node-condor-job-exit
STARTER_JOB_HOOK_KEYWORD = STARTER

Thanks again for your help, Colin.

Matthew Farrellee wrote:
Colin,

It could very well be a bug fixed between 7.5.1 and 7.6.1 (over a year
of development).

If you turn on STARTER_DEBUG = D_FULLDEBUG you should see something
along the lines of...

... HOOK_JOB_EXIT (/usr/local/bin/munin-node-condor-job-exit) invoked with reason: "exit"
...
... HookClient /usr/local/bin/munin-node-condor-job-exit (pid 8077) exited with status 0

...in the StarterLog.slotX where the job ran.
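
One way to pull those lines out of the starter logs, as a rough sketch: the helper name hook_log_lines is hypothetical, and the search patterns are simply the two message prefixes quoted above, so they may need adjusting for other versions.

```shell
#!/bin/sh
# Print hook-related lines from every StarterLog.slot* file in a log directory.
# hook_log_lines is a hypothetical helper; $1 is the directory holding the logs.
hook_log_lines() {
    grep -h -e 'HOOK_JOB_EXIT' -e 'HookClient' "$1"/StarterLog.slot* 2>/dev/null
}

# Typical use on an execute node, where "condor_config_val LOG" prints the
# configured log directory:
#   hook_log_lines "$(condor_config_val LOG)"
```

If nothing is printed with D_FULLDEBUG enabled, the starter most likely never invoked the hook for that job.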

Best,


matt

On 06/18/2011 11:36 AM, Colin Leavett-Brown wrote:
Hi Matt, I think I'm doing everything you suggest, but it still won't
run:

[crlb@elephant condor]$ cat x.out
1. Condor version - Command "condor_version" produces:
$CondorVersion: 7.5.1 Mar 1 2010 BuildID: 220663 $
$CondorPlatform: I386-LINUX_RHEL5 $

2. Condor configuration - Command "condor_config_val -dump | grep -i
hook" produces:
MUNIN_HOOK_JOB_EXIT = /usr/local/bin/munin-node-condor-job-exit

3. Job Class Ad - Command "grep -i hook $_CONDOR_JOB_AD" produces:
HookKeyword = "MUNIN"

4. Job hook permissions - Command "ls -l
/usr/local/bin/munin-node-condor-job-exit" produces:
-rwxr-xr-x 1 root root 102 Jun 16 13:18
/usr/local/bin/munin-node-condor-job-exit

5. Job hook test run - Command
"/usr/local/bin/munin-node-condor-job-exit" produces:
Executing munin-node-condor-job-exit

6. Job exit should produce a second line of output from the hook:
[crlb@elephant condor]$

Sorry, Colin.


On 11-06-17 11:02 AM, Matthew Farrellee wrote:
From this morning...

--
$ condor_version
$CondorVersion: 7.6.1 May 31 2011 BuildID: RH-7.6.1-0.9.el6 $
$CondorPlatform: X86_64-Fedora_13 $

$ condor_config_val -dump | grep HOOK
MUNIN_HOOK_JOB_EXIT = /usr/local/bin/munin-node-condor-job-exit

$ echo 'cmd = munin.sh\ntransfer_executable=true\noutput=x.out\nshould_transfer_files=ALWAYS\nwhen_to_transfer_output=ON_EXIT\n+HookKeyword="MUNIN"\nqueue' | condor_submit

$ cat x.out
1. The job script:
#!/bin/bash

#echo
#echo 0. pwd
#pwd
#ls -al

echo 1. The job script:
cat condor_exec.exe

echo
echo 2. Hook config:
grep -i HOOK /etc/condor/condor_config

echo
echo 3. Permissions on the hook:
ls -l /usr/local/bin/munin-node-condor-job-exit

echo
echo 4. The hook:
cat /usr/local/bin/munin-node-condor-job-exit

echo
echo 5. Test run of the hook:
/usr/local/bin/munin-node-condor-job-exit

echo
echo 6. Job exit should produce a second line of output from the hook:

2. Hook config:

3. Permissions on the hook:
-rwxr-xr-x. 1 root root 98 Jun 17 11:04
/usr/local/bin/munin-node-condor-job-exit

4. The hook:
#!/usr/bin/perl
open(DD, ">>x.out");
print DD "Executing munin-node-condor-job-exit\n";
close(DD);
5. Test run of the hook:
Executing munin-node-condor-job-exit

6. Job exit should produce a second line of output from the hook:
Executing munin-node-condor-job-exit
--

Submit again without the HookKeyword and you won't see the message
in 6.

Best,


matt