Re: [condor-users] Fwd: [Medusa-users] ulimit -a
- Date: Mon, 29 Sep 2003 18:10:46 +0100
- From: Alexander Klyubin <A.Kljubin@xxxxxxxxxxx>
- Subject: Re: [condor-users] Fwd: [Medusa-users] ulimit -a
Actually, looking at the log entries in StarterLog, I am starting to suspect
that Condor does indeed set resource limits for job processes.
For example, Windows machine's StarterLog:
9/29 15:08:57 Setting resource limits not implemented!
Linux machine's StarterLog:
9/8 20:07:52 Done setting resource limits
I wonder what limits it tries to set and where the settings governing
this process are located.
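One way to see which limits a job actually ends up with is to have the job
report them itself. The following is only a sketch using Python's standard
resource module; the function names (current_limits, fmt) and the choice of
limits to print are my own assumptions, and nothing here reads or reflects
Condor's configuration.

```python
import resource

# Limits to report. This selection is an assumption made for illustration;
# it is not a list of what Condor sets.
LIMITS = {
    "open files": resource.RLIMIT_NOFILE,
    "core file size": resource.RLIMIT_CORE,
    "stack size": resource.RLIMIT_STACK,
    "cpu time": resource.RLIMIT_CPU,
    "data seg size": resource.RLIMIT_DATA,
}


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {name: (soft, hard)} for the process running this code."""
    return {name: resource.getrlimit(res) for name, res in LIMITS.items()}


if __name__ == "__main__":
    for name, (soft, hard) in current_limits().items():
        print(f"{name:16s} soft={fmt(soft):>12s} hard={fmt(hard):>12s}")
```

Running the same script once from a login shell and once as a vanilla
universe job, then comparing the two outputs, would show which limits differ.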
Scott Koranda wrote:
On Monday 29 September 2003 8:58 am, Scott Koranda wrote:
Hmmmm. The limit is not being set in /etc/profile. It is being set in
/etc/security/limits.conf, but perhaps your failure mechanism is the
Well... I can think of a couple, probably all obvious things that you've
thought of, but off the top of my head:
1. Restart Condor after the change; the condor master process would be running
with the original limits until it's restarted. Actually, a condor_restart
-master probably wouldn't do the trick, either. This is because the last
thing the master does on a restart is "exec condor_master", so the new master
would inherit the ulimit from the original master. :-(
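The inheritance described here is ordinary Unix behaviour: resource limits
survive both fork and exec. A small sketch that shows this, using Python's
standard resource and subprocess modules (child_soft_limit is a name made up
for this example, and no Condor daemon is involved):

```python
import resource
import subprocess
import sys

# One-liner run in the child: print the soft open-files limit it sees.
CHILD_CODE = "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"


def child_soft_limit(new_soft):
    """Set this process's soft open-files limit to new_soft, start a child
    process, and return the soft limit the child reports."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    try:
        result = subprocess.run(
            [sys.executable, "-c", CHILD_CODE],
            capture_output=True,
            text=True,
            check=True,
        )
        return int(result.stdout.strip())
    finally:
        # Put the parent's original soft limit back.
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))


if __name__ == "__main__":
    print("child sees soft limit:", child_soft_limit(64))
```

The child reports whatever value the parent had at the moment it was started,
which is why a daemon started before the limit was raised keeps handing the
old value to everything it launches.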
I am pretty sure that Condor has been restarted since the edits to
/etc/security/limits.conf were made, but just to be sure I am
restarting Condor now.
2. Does the user that's running condor have its own limit set, or is it set
in one of the startup files?
The limit is set in /etc/security/limits.conf. In this file
I read the following:
"Also, please note that all limit settings are set PER LOGIN. They
are not global, nor are they permanent (they apply for the session
Also I read in there:
"No limits are imposed on UID 0 accounts."
So I am guessing that since the Condor daemons run as root,
/etc/security/limits.conf is ignored, and so the limits are not passed
to the vanilla universe job.
(I am not by any means a PAM expert so if you have any thoughts please
let me know...)
Thanks, I will dig deeper.
Can it be that you changed the number of open files in a script, which
does not get executed for user "nobody"? E.g., /etc/profile gets
executed for normal users, but not for user "nobody" under which Condor
normally runs jobs?
Scott Koranda wrote:
We recently changed the nodes in our cluster to allow 2048 open file
descriptors rather than the standard 1024. On any node in our cluster
I see the following:
[skoranda@medusa-slave001 ]$ ulimit -n
But as the user below points out, when the ulimit is run via Condor in
the vanilla universe we always get 1024 and not 2048.
----- Forwarded message from Vladimir Dergachev
Subject: [Medusa-users] ulimit -a
Date: Sun, 28 Sep 2003 20:15:41 -0400 (EDT)
It has been a while since I needed to run my statistics generating program
that needs to open many files at once, and for some reason I cannot do
it using condor.
condor_run "ulimit -a" reports limit of open files as 1024, but when I
rsh to a node and run ulimit -a myself I get 2048 (as it should be
after recent changes).
Would anyone have a suggestion how I can explain to condor not to lower
the limit ?
thank you !
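A possible workaround, offered only as a sketch: start the program through a
small wrapper that raises its own soft open-files limit up to the hard limit
before exec'ing the real command. This helps only if the hard limit the job
inherits is above 1024, which nothing in this thread confirms. The function
name raise_nofile_soft_limit and the wrapper's command-line convention are
made up for this example.

```python
import os
import resource
import sys


def raise_nofile_soft_limit():
    """Try to raise the soft open-files limit to the hard limit.

    Returns the (soft, hard) pair in effect afterwards. If the system
    refuses the change, the limits are left as they were.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != hard:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
        except (ValueError, OSError):
            pass  # not permitted here; keep the inherited soft limit
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: python wrapper.py <command> [args...]
    raise_nofile_soft_limit()
    # Replace this process with the real command; it inherits the new limit.
    os.execvp(sys.argv[1], sys.argv[1:])
```

An unprivileged process may raise its soft limit only as far as the hard
limit, so if the job is started with a hard limit of 1024 the wrapper changes
nothing and the limit has to be fixed where the daemons are started.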