
Re: [Condor-users] condor_shadow "D" state in processes



Does the ShadowLog contain any clues about what the shadows are doing during the time of high load?

If not, it may be enlightening to run 'strace -p <pid of a shadow>' and see what the shadow is trying to do.
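
For example, something along these lines (a rough sketch; the ShadowLog location assumes the default LOG directory, which 'condor_config_val LOG' will report on your submit machine):

   # Watch the shadow log while a batch of jobs is completing
   tail -f "$(condor_config_val LOG)/ShadowLog"

   # Attach to one of the stuck shadows to see which system call it is blocked in
   strace -p <pid of a shadow>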

--Dan

Robert E. Parrott wrote:

Hi Folks,

I'm seeing an unfortunate behavior with condor_shadow processes in the vanilla universe. This is LINUX X86_64 and Condor v6.8.6.

A user submits a large number (500-1000) of jobs to a cluster with 150 processors, and has about 100 jobs running simultaneously. These jobs all run for about 3 minutes and then complete at nearly the same time. When they do, the load on the submit machine, which is also the head node, climbs to a little over N, where N is the number of this user's running jobs.

Closer inspection shows that all of the condor_shadow processes owned by this user are in the "D" (uninterruptible sleep) state, contending for what appears to be the same resource.
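
This is roughly what I'm looking at (just a sketch; the exact ps columns may vary a bit by distribution):

   # List the condor_shadow processes in uninterruptible sleep, with their wait channel
   ps -C condor_shadow -o pid,user,stat,wchan:30,cmd | awk 'NR==1 || $3 ~ /^D/'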

At first I thought that the contention arose when the output data was returned from the compute nodes to the submit node. So I asked the user to add

   initialdir = [ the run dir ]
   should_transfer_files = NO

to the submit file, but this doesn't help. Also, looking at the actual output, each job produces less than 20 KB of output data.
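
For reference, the relevant submit file looks roughly like this (a sketch from memory; the executable and directory names are placeholders, not the user's actual ones):

   universe              = vanilla
   executable            = run_job.sh
   initialdir            = /home/user/rundir
   should_transfer_files = NO
   output                = job.$(Process).out
   error                 = job.$(Process).err
   log                   = jobs.log
   queue 1000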

What could be causing such contention in a vanilla universe condor_shadow job, if not the final file transfer process? Has anyone seen such behavior before in the vanilla universe? Any hints or guesses about things to look at?


thanks,
rob





