
Re: [Condor-users] condor_shadow "D" state in processes



Followup:

A combination of running nscd on the submit node and disabling log files on NFS volumes seems to have alleviated the problem.

rob


On Dec 5, 2007, at 2:04 PM, Robert E. Parrott wrote:

Looking at the condor_shadow processes with strace, I'm seeing that
they hang while doing a lookup of user IDs.

I can sometimes see yp errors while doing a "ps" on the command line
when many condor_shadow processes are hanging.

I'm going to take a look at nscd as a possible fix here.
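
One way to check whether user ID lookups are slow outside of Condor is to time them directly. The following is a minimal sketch, not something taken from this thread: the function name time_uid_lookups and the repeat count are illustrative assumptions, and it only uses Python's standard pwd module.

```python
import os
import pwd
import time


def time_uid_lookups(uid, repeats=5):
    """Return wall-clock durations (in seconds) of repeated getpwuid calls."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        try:
            pwd.getpwuid(uid)
        except KeyError:
            # Unknown uid: the name service was still consulted, so keep the timing.
            pass
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Hypothetical usage: time lookups for the current user.
    for i, seconds in enumerate(time_uid_lookups(os.getuid())):
        print(f"lookup {i}: {seconds * 1000:.2f} ms")
```

Consistently slow lookups on the submit node would point at the name service (yp in this case) rather than at Condor itself. Running the same check before and after enabling nscd gives a rough idea of whether caching helps.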

rob


On Dec 5, 2007, at 4:21 AM, Matt Hope wrote:

On Dec 4, 2007 6:00 PM, Todd Tannenbaum <tannenba@xxxxxxxxxxx> wrote:
Robert E. Parrott wrote:
Guess: Is the user asking to have a log file via
   log = /some/path
in his/her submit file?  If so, does the path to the log file exist on NFS?
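
A quick way to answer the NFS question for a given log path on Linux is to compare it against the mount table. This is a rough sketch under stated assumptions: the helper names fs_type_for and is_on_nfs are hypothetical, the mount table is assumed to be in /proc/mounts format, and mount points containing escaped spaces are not handled.

```python
import os
import sys


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the mount that contains path, or None."""
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # not a "device mountpoint fstype ..." line
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest mount point; on ties, the later entry wins.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


def is_on_nfs(path, mounts_text):
    """True if the containing mount has a type starting with 'nfs' (nfs, nfs4)."""
    fs_type = fs_type_for(path, mounts_text)
    return fs_type is not None and fs_type.startswith("nfs")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python check_nfs.py /some/path
    with open("/proc/mounts") as mounts_file:
        table = mounts_file.read()
    print(sys.argv[1], "is on NFS" if is_on_nfs(sys.argv[1], table) else "is not on NFS")
```

If the path given to log in the submit file turns out to be on NFS, pointing it at a local directory on the submit node is the change discussed in this thread.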

As a good general rule (general rules are for assimilation rather than
dogmatic application): in a distributed high-throughput system, try to
avoid going off your local node wherever possible.

It is tempting to write everything to the network, where it is
available to every node, and it can make life easier, but you are
creating a potential bottleneck, so caveat architectus.

Matt
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx
with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/
