Re: [Condor-users] Get remote queue id

On Wed, 17 Oct 2007, Michael Thomas wrote:

I have a condor-g host that is used heavily by users.  Every now and
then the users complain about some problem, and send me the condor-g
log, which gives the local condor-g queue id.  But I need to map this to
the job id in the remote condor queue (also managed by me) so that I can
poke through the remote system logs to find out what may have gone wrong.

Is there a way to map the local condor-g queue id to the remote condor
queue id?

There are two ways.
The hard way: look in the user log of the condor-g user on the client.
It will contain an entry something like this:

027 (176900.089.000) 09/27 09:39:46 Job submitted to grid resource
    GridResource: gt2 fgitb-gk.fnal.gov/jobmanager-condor
    GridJobId: gt2 fgitb-gk.fnal.gov/jobmanager-condor https://fgitb-gk.fnal.gov

(in the case of a grid/gt2 resource).
In the GridJobId URL, the second number, in this case 27985, is the
process id of the globus-job-manager process on the remote host, and
1190903977 is the timestamp.
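
As a rough sketch of the log lookup described above, the following Python pulls the local job id and the job-manager pid and timestamp out of each 027 event. It assumes the complete GridJobId URL has the layout https://host:port/pid/timestamp/ and that events in the user log are separated by lines of three dots. The host name gk.example.org, the port 40001 and the function name are made up for illustration; only the job id, pid and timestamp come from the example above.

```python
import re
from datetime import datetime, timezone

# Event header: "NNN (cluster.proc.subproc) MM/DD HH:MM:SS text"
HEADER_RE = re.compile(r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\)")
# ASSUMED contact layout: https://host:port/<pid>/<timestamp>/
CONTACT_RE = re.compile(r"https://[^/\s]+/(\d+)/(\d+)/?")


def map_local_to_remote(log_text):
    """Return {(cluster, proc): (jobmanager_pid, timestamp)} for 027 events."""
    mapping = {}
    current = None  # local job id of the 027 event being read, if any
    for line in log_text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            is_grid_submit = header.group(1) == "027"
            current = (int(header.group(2)), int(header.group(3))) if is_grid_submit else None
            continue
        if line.startswith("..."):  # end-of-event separator
            current = None
            continue
        if current is not None and line.strip().startswith("GridJobId:"):
            contact = CONTACT_RE.search(line)
            if contact:
                mapping[current] = (int(contact.group(1)), int(contact.group(2)))
    return mapping


# Hypothetical log excerpt: host and port are invented for this sketch.
SAMPLE_LOG = """\
027 (176900.089.000) 09/27 09:39:46 Job submitted to grid resource
    GridResource: gt2 gk.example.org/jobmanager-condor
    GridJobId: gt2 gk.example.org/jobmanager-condor https://gk.example.org:40001/27985/1190903977/
...
"""

remote = map_local_to_remote(SAMPLE_LOG)
for (cluster, proc), (pid, stamp) in remote.items():
    started = datetime.fromtimestamp(stamp, tz=timezone.utc)
    print(cluster, proc, pid, started.isoformat())
```

Converting the timestamp to a date is handy because the pid alone can be reused on the remote host; the time narrows down which part of the remote logs to read.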

By looking in /var/log/messages on the grid CE you should
be able to match the condor job id with the pid of the globus-job-manager.
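
A small sketch of that search, assuming the entries follow the usual syslog convention of tagging each line with "name[pid]:". The process name used as the tag and the sample lines below are invented for illustration and are not real gatekeeper output; check what your CE actually writes.

```python
import re


def lines_for_pid(syslog_lines, process_name, pid):
    """Return the syslog lines tagged 'process_name[pid]:'."""
    tag = re.compile(r"\b%s\[%d\]:" % (re.escape(process_name), pid))
    return [line for line in syslog_lines if tag.search(line)]


# Invented sample lines; the message text is a placeholder.
SAMPLE_MESSAGES = [
    "Sep 27 09:39:40 ce globus-job-manager[27985]: (placeholder message)",
    "Sep 27 09:39:41 ce globus-job-manager[27986]: (placeholder message)",
    "Sep 27 09:39:42 ce sshd[27985]: (placeholder message)",
]

matches = lines_for_pid(SAMPLE_MESSAGES, "globus-job-manager", 27985)
for line in matches:
    print(line)

# On the CE itself, something like this would scan the real file:
#   with open("/var/log/messages") as f:
#       matches = lines_for_pid(f, "globus-job-manager", 27985)
```

The function only narrows the log down to the lines for one pid; picking out the condor job id from those lines still has to be done by reading them.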

If you are using a VDT-based grid installation, there is a
utility called vdt-get-job-info that will match the condor job id
on the server end to the globus-job-manager pid, but not to the condor-G
job id on the client, which is what you really need.

The easier way:
We wrap condor_submit on the client end, adding things to the classad like this:

GlobusRSL = "(condorsubmit=('+SubmitHost' 'clienthostname.clientdomain')('+SubmitClusterID' $(Cluster)))"

The punctuation may not be quite right, but the effect is to
change the client so that it always sends two extra fields across the
grid: the originating cluster id and the originating hostname.
Of course this only works if you have condor on both ends,
but we have modified our few non-condor installations to discard
the condorsubmit RSL attribute so that it doesn't cause
an exception.
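
To illustrate the wrapping idea, here is a minimal sketch of the text-rewriting half of such a wrapper. It inserts the extra line ahead of the first queue statement in a submit description. This is not the actual wrapper used here: the function name and the sample submit file are hypothetical, and the attribute line is copied in the form shown above, whose punctuation is unverified.

```python
import socket

# Line format taken from the message above; its exact punctuation is NOT verified.
RSL_TEMPLATE = (
    "GlobusRSL = \"(condorsubmit=('+SubmitHost' '{host}')"
    "('+SubmitClusterID' $(Cluster)))\""
)


def add_origin_rsl(submit_text, hostname=None):
    """Return submit_text with the origin RSL line placed before the first queue."""
    rsl_line = RSL_TEMPLATE.format(host=hostname or socket.getfqdn())
    lines = submit_text.splitlines()
    for i, line in enumerate(lines):
        # A queue statement may carry a count, e.g. "queue 10".
        if line.strip().lower().split()[:1] == ["queue"]:
            lines.insert(i, rsl_line)
            break
    else:
        lines.append(rsl_line)  # no queue statement found
    return "\n".join(lines) + "\n"


# Hypothetical submit description used only to exercise the function.
SAMPLE_SUBMIT = """\
universe = grid
grid_resource = gt2 gk.example.org/jobmanager-condor
executable = job.sh
queue
"""

wrapped = add_origin_rsl(SAMPLE_SUBMIT, "client.example.org")
print(wrapped)
```

A real wrapper would write the result to a temporary file and hand it to condor_submit, which expands $(Cluster) at submit time. It would also need to decide what to do when the user's submit file already sets its own RSL.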

Steve Timm


Steven C. Timm, Ph.D  (630) 840-8525
timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group Leader.