
Re: [HTCondor-users] weird log messages



Here's the obvious smelly part:

08/05/13 09:54:55 (pid:16584) WriteUserLog::initialize: safe_open_wrapper("/var/lib/condor/spool/2748/0/cluster32748.proc0.subproc0/job.32748.0.log") failed - errno 13 (Permission denied)

I assume you've eyeballed the permissions - see anything sketchy in that directory (or its parents)?  I'm not sure at what point in the job lifecycle that directory is supposed to be owned by the *user*, not *condor*.
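If it helps, here is a rough sketch of how one might dump the owner and mode of that directory and every parent in one go, rather than running ls on each level by hand. The helper names (owner_name, ownership_chain) are made up for illustration, and the path in the usage part is simply the directory taken from the log line above; adjust it for your spool.

```python
import os
import pwd
import stat
from pathlib import Path


def owner_name(uid):
    """Return the user name for uid, or the numeric uid as a string if unknown."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def ownership_chain(path):
    """Return (path, owner, mode) for path and each of its parents, root first.

    Entries that cannot be stat'ed (missing, or no search permission on a
    parent) are reported with owner "?" and the error text in place of the mode.
    """
    p = Path(os.path.abspath(path))  # abspath does not resolve symlinks
    rows = []
    for d in [p, *p.parents]:
        try:
            st = os.stat(d)
        except OSError as e:
            rows.append((str(d), "?", e.strerror))
            continue
        rows.append((str(d), owner_name(st.st_uid), stat.filemode(st.st_mode)))
    return rows[::-1]


if __name__ == "__main__":
    # Directory taken from the failing safe_open_wrapper() call in the log.
    job_dir = "/var/lib/condor/spool/2748/0/cluster32748.proc0.subproc0"
    for path, owner, mode in ownership_chain(job_dir):
        print(f"{mode:<12} {owner:<10} {path}")
```

Run it as root (or as the condor user) so that the stat calls themselves don't fail on a restrictive parent directory. Any level whose owner or mode differs from its siblings is the place to look first.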

What user is the condor_schedd running as?  One possibility is that the wrong EUID is in effect when the directory gets created.
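On Linux, one way to answer that is to read the Uid: line from /proc/<pid>/status, which lists the real, effective, saved, and filesystem UIDs in that order. The sketch below assumes a Linux /proc filesystem; parse_uids and process_uids are hypothetical helper names, and the pid in the usage comment is the one from the log line above, which may or may not be the schedd itself.

```python
def parse_uids(status_text):
    """Extract the four UIDs from the text of /proc/<pid>/status.

    The Uid: line holds real, effective, saved-set, and filesystem UIDs.
    """
    for line in status_text.splitlines():
        if line.startswith("Uid:"):
            real, effective, saved, fs = (int(x) for x in line.split()[1:5])
            return {"real": real, "effective": effective, "saved": saved, "fs": fs}
    raise ValueError("no Uid: line found")


def process_uids(pid):
    """Read /proc/<pid>/status (Linux only) and return the process's UIDs."""
    with open(f"/proc/{pid}/status") as f:
        return parse_uids(f.read())


# Example (pid taken from the log line; substitute the real schedd pid):
#   print(process_uids(16584))
```

This only gives a snapshot: if the daemon switches its effective UID around file operations, the value seen here may differ from the one in effect at the moment the directory was created.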

Brian

On Aug 6, 2013, at 9:28 AM, Pek Daniel <pekdaniel@xxxxxxxxx> wrote:

> Hi!
> 
> I started to hit a single schedd with ~20 submissions / sec (vanilla universe). It couldn't handle the load, and I got really strange messages in my SchedLog:
> 
> http://fpaste.org/30096/56896391/
> 
> I tried increasing the file descriptor limit, but it didn't help. The permissions are fine in /var/lib/condor (everything is owned by the condor user, recursively).
> 
> Any idea what causes this?
> 
> Thanks,
> Daniel