Re: [HTCondor-users] .update.ad problems after upgrade.

07/20/20 14:56:07 (pid:10322) Failed to open '.update.ad' to read update ad: No such file or directory (2).

Problems with .update.ad will almost never cause problems for HTCondor directly, but they can sometimes be an indication of other issues. If I recall correctly, the starter checks .update.ad just after the job exits (to make sure the starter catches the last update about GPU usage from the startd).

From what I've seen, this file should be created in /var/condor/execute,

It should be created in the job's sandbox, which will be a subdirectory of the directory named by the HTCondor configuration variable EXECUTE, IIRC. That directory should probably be owned by group condor as well as user condor, but I don't know if that will matter.
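
If you want to check the ownership quickly, something like the following sketch would do it. The function name describe_dir is made up for illustration, and the path you pass in is whatever `condor_config_val EXECUTE` reports on the execute node; nothing here is HTCondor-specific.

```python
import grp
import os
import pwd
import stat


def describe_dir(path):
    """Return (owner, group, mode string) for the given directory.

    Hypothetical helper for illustration; it only wraps os.stat().
    """
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        # No passwd entry for this uid; fall back to the number.
        owner = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        # No group entry for this gid; fall back to the number.
        group = str(st.st_gid)
    return owner, group, stat.filemode(st.st_mode)
```

Calling describe_dir on the EXECUTE directory and seeing something other than condor for the owner or group would be worth noting, though as I said, I'm not sure it matters here.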

having this problem upon upgrade, so at this point I'm not positive this
IS a problem? Is it THE problem that's causing these jobs to fail? What
the heck can I do to diagnose/resolve this issue?

Look further back into the starter log; what's causing the job to fail? Is the job actually starting successfully? If you can't find anything, consider setting STARTER_DEBUG = D_FULLDEBUG in the config to make the log more (exceedingly) verbose.
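
To make "look further back" concrete, here is a rough sketch that pulls the lines leading up to the first .update.ad message out of a starter log. The function name lines_before and the default of 20 lines are my own choices, not anything HTCondor provides.

```python
from collections import deque


def lines_before(lines, needle, count=20):
    """Return up to `count` lines preceding the first line containing `needle`.

    Hypothetical helper for illustration; returns [] if `needle` never appears.
    """
    recent = deque(maxlen=count)  # rolling window of the most recent lines
    for line in lines:
        if needle in line:
            return list(recent)
        recent.append(line.rstrip("\n"))
    return []
```

You would open the starter log for the affected slot (its location depends on your configuration) and pass the file object in, e.g. lines_before(f, ".update.ad"). Whatever the starter logged just before the failed open is usually more informative than the .update.ad message itself.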

- ToddM