[Condor-users] condor_shadow "D" state in processes
- Date: Tue, 4 Dec 2007 11:13:01 -0500
- From: "Robert E. Parrott" <parrott@xxxxxxxxxxxxxxxx>
- Subject: [Condor-users] condor_shadow "D" state in processes
I'm seeing an unfortunate behavior with condor_shadow jobs in the
vanilla universe. This is Linux x86_64 running Condor v6.8.6.
A user submits a large number (500-1000) of jobs on a cluster with
150 processors, and has about 100 jobs running simultaneously. These
jobs all run for about 3 minutes, and then complete at nearly the
same time. At this time, the load on the submit machine, which is
also the head node, reaches a little over N, where N is the number of
this user's running jobs.
Closer inspection shows that all of the condor_shadow processes owned
by this user are in the "D" state, contending for what appears to be
the same resources.
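For anyone who wants to check for the same symptom, this is roughly how I'm spotting the blocked shadows (standard ps options; the wchan column is what suggests they're all waiting on the same thing):

```shell
# List condor_shadow processes in uninterruptible sleep (D state),
# along with the kernel wait channel each one is blocked in.
ps -eo pid,stat,wchan:32,comm | awk 'NR == 1 || ($2 ~ /^D/ && $4 == "condor_shadow")'
```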
At first I thought that this contention arose as the output data was
returned from the compute nodes to the submit node. So I asked
the user to add
initialdir = [ the run dir ]
should_transfer_files = NO
to the submit file, but this doesn't help. Also, looking at the
actual output, each job produces less than 20 K of output data.
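For context, the relevant portion of the submit file now looks roughly like this (the path and executable name are placeholders, not the user's actual values):

```
universe              = vanilla
initialdir            = /path/to/run/dir
should_transfer_files = NO
executable            = the_job
queue
```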
What could be causing such contention in a vanilla universe
condor_shadow job, if not the final file transfer process? Has
anyone seen such behavior before in the vanilla universe? Any hints
or guesses about things to look at?