
Re: [HTCondor-users] worker node job directory names?

> On Jul 7, 2020, at 2:37 PM, Maarten Litmaath <Maarten.Litmaath@xxxxxxx> wrote:
> Dear HTCondor users,
> there is a new site trying to get HTCondor CE + batch system to work
> for the ALICE LHC experiment.  So far they seem to be the only site
> where the job directory names have a structure like in this example:
>    /users/condor/spool/715/0/cluster715.proc0.subproc0/\
>    home_*_${CE}_9619_${CE}#716.0#1594078077/
> The presence of those '#' characters is problematic for legacy SW
> that cannot handle such paths, which the site got by default.
> How may the admins configure HTCondor to avoid such characters
> being used in job directories?  I looked at all occurrences of the
> words "directory" or "scratch" in the admin guide, to no avail...

Those directory names are part of an attempt to submit from HTCondor to other batch systems in non-CE environments.
In particular, to handle a user submitting many jobs with the same working directory, we create a temporary subdirectory in which to run each job. To ensure each job gets a different subdirectory that HTCondor can clean up after any errors, we use the HTCondor job's GlobalJobId attribute as part of the name. That's where the '#' characters are coming from.

We should sanitize those values, and I will ensure future releases do so.
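
To illustrate what sanitizing could look like, here is a minimal Python sketch. It is not HTCondor's actual code: the function names, the "home" prefix handling, the example host name, and the set of characters treated as safe are all assumptions made for this example. It relies only on the GlobalJobId layout visible in the path above (schedd name, cluster.proc, and a timestamp, separated by '#').

```python
import re

# Hypothetical helper: replace every character outside a conservative
# "safe for paths" set (letters, digits, '.', '_', '-') with '_'.
def sanitize_job_id(global_job_id: str) -> str:
    return re.sub(r"[^A-Za-z0-9._-]", "_", global_job_id)

# Hypothetical helper: build a per-job subdirectory name from a prefix
# and the sanitized GlobalJobId.
def sandbox_dir_name(prefix: str, global_job_id: str) -> str:
    return f"{prefix}_{sanitize_job_id(global_job_id)}"

# "ce.example.org" is a placeholder schedd name, not a real site.
name = sandbox_dir_name("home", "ce.example.org#716.0#1594078077")
print(name)  # home_ce.example.org_716.0_1594078077
```

The cluster.proc pair and the timestamp survive unchanged, so the name should stay distinct per job while no longer containing '#'.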
As an immediate work-around, you can disable the logic that names the unique subdirectory after GlobalJobId by modifying the Job Router rules to include the following line:

For example, in the standard CE configuration files, you would set JOB_ROUTER_ENTRIES like so for Slurm:

  GridResource = "batch slurm";
  TargetUniverse = 9;
  name = "Local_Slurm";

 - Jaime