
Re: [condor-users] globally unique job identifiers



On Wed, 11 Feb 2004, Matt Hope wrote:

> I have been writing a quick C# class library to allow programmatic
> submission / query / control functionality, essentially by parsing the
> output from condor_submit, condor_q, condor_rm and condor_prio (so far)
>
> this is working well except the use case internally is central
> management of all submitted jobs (an end user or a batch process can
> submit a job and the submitted job is linked to an entry in a database)
>
> this lets me track timeouts / runaway jobs and take the appropriate
> action (resubmit for example)
>
> to do this I needed a way to identify a job and foolishly assumed that
> clusterId and ProcessId would give a unique id for a job in a pool. This
> turns out not to be the case: the pair is only unique to the user who
> submitted the job (which I see the rationale for given the source of
> the project)
>
> I am left with three options
>
> 1) fake the submission user in some way even when running on another
> machine (don't even know if this is possible)
> 2) record the user which submitted the job in the database as well and
> use -global and user constraints
> 3) add a new attribute to all class ads with a Guid which is stored in
> the database, use this in any queries (along with global)
>
> 3 sounds like the best option so far but if condor has an internal
> unique id is there any way I can easily get it / query on it?

Cluster and proc id are unique per schedd daemon. So you can generate a
globally unique job id (within the pool, at least) by combining the job's
cluster/proc id with the schedd's name. One potential problem with this is
that the schedd's name is derived from the machine's hostname and the
SCHEDD_NAME parameter in the config file. So you need to make sure these
don't change, and that no two schedds ever end up with the same name, even
at different times.
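
As a rough illustration of the combination described above, here is a
minimal Python sketch. The PoolJobId type, the make_pool_job_id helper and
the "name#cluster.proc" string layout are all hypothetical choices made for
this example, not anything Condor itself defines; the schedd name is
assumed to have been obtained and stored by your own tooling.

```python
from typing import NamedTuple


class PoolJobId(NamedTuple):
    """Hypothetical pool-wide job key: schedd name plus cluster/proc id."""
    schedd_name: str
    cluster: int
    proc: int

    def __str__(self) -> str:
        # The "name#cluster.proc" layout is an arbitrary choice for this sketch.
        return f"{self.schedd_name}#{self.cluster}.{self.proc}"


def make_pool_job_id(schedd_name: str, cluster: int, proc: int) -> PoolJobId:
    """Combine the schedd name with the per-schedd cluster/proc id."""
    name = schedd_name.strip()
    if not name:
        raise ValueError("schedd name must not be empty")
    return PoolJobId(name, int(cluster), int(proc))
```

Two jobs with the same cluster/proc id submitted through different schedds
then get different keys, which is what a database column would need. The
key is only as stable as the schedd name, for the reasons given above.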

For some work we've been doing in the 6.7 development series (coming soon
to a cluster near you), we've added a GlobalJobId attribute that'll be
included in every job ad. A variation on the idea above, it consists of
the cluster/proc id, the schedd name, and a timestamp. Once Condor 6.7.0
is released, you'll be able to use that instead of rolling your own.
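
For anyone storing such an id in a database, a small parser can split it
back into its parts. The sketch below assumes the three components are
joined as "scheddname#cluster.proc#timestamp"; this message only says which
components are included, so treat that layout, the parse_global_job_id
helper and the sample value in the comment as assumptions to check against
a real job ad.

```python
from typing import NamedTuple


class GlobalJobId(NamedTuple):
    schedd_name: str
    cluster: int
    proc: int
    timestamp: int


def parse_global_job_id(value: str) -> GlobalJobId:
    """Parse an id assumed to look like 'submit1.example.com#12.0#1076500000'."""
    try:
        # Split from the right so a '#' inside the schedd name is tolerated.
        schedd_name, job, stamp = value.rsplit("#", 2)
        cluster, proc = job.split(".")
        return GlobalJobId(schedd_name, int(cluster), int(proc), int(stamp))
    except ValueError as exc:
        raise ValueError(f"unrecognised GlobalJobId: {value!r}") from exc
```

Keeping the raw string as the database key and parsing it only for display
or filtering avoids depending on the assumed layout more than necessary.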

+------------------------------------+-------------------------------+
|             Jaime Frey             |There are 10 types of people in|
|         jfrey@xxxxxxxxxxx          |the world: Those who understand|
|   http://www.cs.wisc.edu/~jfrey/   |  binary, and those who don't  |
+------------------------------------+-------------------------------+