The limiting factor in how fast condor_submit completes is usually
the time the schedd spends in fsync(). If your jobs have user logs,
condor_submit itself also called fsync() prior to 7.5.
To confirm whether this is your problem, you can start a long series
of condor_submit invocations. While those are running, periodically
run gstack <schedd_pid>. This will print out the schedd
stack. Is fsync() frequently listed?
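
If you would rather not eyeball the stack dumps, the sampling can be scripted. The following is only a rough sketch: the function names are made up for illustration, it assumes gstack is on your PATH and that you are allowed to attach to the schedd, and it treats any dump that mentions fsync anywhere as a hit.

```python
import subprocess
import time


def stack_has_fsync(stack_text):
    """Return True if one gstack dump mentions fsync in any frame."""
    # Plain substring match so variants such as __libc_fsync also count.
    return "fsync" in stack_text


def fsync_fraction(samples):
    """Fraction of the collected stack dumps that contain an fsync frame."""
    if not samples:
        return 0.0
    hits = sum(1 for dump in samples if stack_has_fsync(dump))
    return hits / len(samples)


def sample_schedd(pid, count=20, interval=0.5):
    """Run `gstack <pid>` `count` times, sleeping `interval` seconds between runs.

    Hypothetical helper: assumes gstack is installed and may attach to `pid`.
    """
    samples = []
    for _ in range(count):
        result = subprocess.run(
            ["gstack", str(pid)], capture_output=True, text=True, check=False
        )
        samples.append(result.stdout)
        time.sleep(interval)
    return samples

# Example use while a long series of condor_submit calls is running:
#   print(fsync_fraction(sample_schedd(schedd_pid)))
```

A fraction close to 1 would suggest the schedd is spending most of its time waiting on fsync(); a small fraction points elsewhere.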
Things you can do to speed up fsync: put $(SPOOL) on a fast disk
that doesn't have a lot of other usage. For testing, you can even
stick it in /dev/shm, which is basically a ramdisk. The same
applies to user logs.
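
To get a feel for how much a given disk costs you, you can time fsync() directly in a candidate directory. This is a minimal sketch, not how the schedd itself writes its log: the function name, the 512-byte payload and the number of writes are arbitrary choices for illustration.

```python
import os
import tempfile
import time


def time_fsync(directory, writes=50, payload=b"x" * 512):
    """Return the mean seconds per write+fsync on a scratch file in `directory`."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, payload)
            os.fsync(fd)  # force the data to stable storage before continuing
        elapsed = time.perf_counter() - start
    finally:
        # Always clean up the scratch file, even if a write fails.
        os.close(fd)
        os.unlink(path)
    return elapsed / writes

# Example comparison (paths are placeholders for your own setup):
#   print(time_fsync("/path/to/your/spool"))
#   print(time_fsync("/dev/shm"))
```

Running it once against the directory reported by condor_config_val SPOOL and once against /dev/shm should show roughly how much of each submit is disk latency.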
--Dan
On 9/8/11 3:32 PM, Patty Bragger wrote:
Thanks David,
I've tried adding the -disable flag, and it seems to help a little
bit, but not a whole lot. It's now averaging about 10 seconds
per 100 instead of 11 seconds.
So this is still a pretty stark difference in performance from
what you're seeing, and granted, my 4 core machine is probably
pretty weak compared to a 16 core Nehalem, but I guess I was still
expecting to see some kind of explanation by way of maxed out cpu, or
something, but I'm not seeing that at all. I submitted 1200
jobs, just to sustain the "load" for a noticeable time of 2+
minutes. During that time, the load average didn't even break 1,
and the cpu usage increased from about 10% to about 35%.
Oh well, this isn't the end of the world, thanks for all of the
info.
-Patty
On Thu, Sep 8, 2011 at 3:37 PM, David J. Herzfeld <herzfeldd@xxxxxxxxx> wrote:
On Thu, 2011-09-08 at 15:23 -0400, David J. Herzfeld wrote:
> Hi Patty:
>
> On Thu, 2011-09-08 at 14:40 -0400, Patty Bragger wrote:
> > So an average of about 9 jobs/sec, which is faster (but only a
> > little) than submitting through dag. What kind of rates are you
> > guys getting? Maybe this is normal?
> >
>
> My guess is that the numbers you are seeing are probably pretty normal
> (both for dagman and when calling directly from the command line).
>
> We see faster times (real = 0m2.454s, user = 0m1.315s, ~40 jobs/s), but
> have a pretty customized config. For instance, we set
> SUBMIT_SKIP_FILECHECKS = False
> SUBMIT_SEND_RESCHEDULE = False
> I would assume that both of these knobs would reduce submit times
> (although I haven't tested them myself).
Sorry, that should be:
SUBMIT_SKIP_FILECHECKS = True
(see http://www.cs.wisc.edu/condor/manual/v7.6/3_3Configuration.html#SECTION004314000000000000000).
Sorry about that.
You should be able to emulate this behavior with the -disable flag to
condor_submit (if you want to try to see if that increases your speed).
Best of luck,