On Wed, Sep 7, 2011 at 3:57 PM, David J. Herzfeld
<herzfeldd@xxxxxxxxx> wrote:
Hi Patty:
On Wed, 2011-09-07 at 15:33 -0400, Patty Bragger wrote:
> here are the relevant config settings:
> DAGMAN_MAX_SUBMITS_PER_INTERVAL = 250
> DAGMAN_SUBMIT_DELAY = 0
> DAGMAN_USER_LOG_SCAN_INTERVAL = 5
> (at least I think that's all of them)
> Has anyone else seen performance like this or does anyone know how to
> figure out what is taking it so long to dispatch these nodes to the
> queue?
We are running 7.6.1 on RHEL here, and do not see the issue that you are
describing. Your configuration settings seem fine to me. I assume that
when you say "config settings", you mean that you are using the CONFIG
directive directly in your dag.
The config settings that I was referring to are those defined in the config files for the central manager:
$ condor_config_val -dump | grep -i dag
DAGMAN_MAX_SUBMITS_PER_INTERVAL = 250
DAGMAN_SUBMIT_DELAY = 0
DAGMAN_USER_LOG_SCAN_INTERVAL = 5
I have not tried setting these directly in the job submit file.
What operating system are you using?
We are also on RHEL, with a mix of v5.4 and v5.5 machines.
Perhaps upgrading to condor 7.6.1 will resolve this issue, or perhaps
Kent is correct in this being the overhead of submitting jobs.
But if this is just the overhead of submitting individual jobs, why
don't I see the same overhead when submitting 100 separate jobs through
a non-DAG submit? What is the difference between submitting a DAG file
that has 100 entries and submitting a non-DAG submit file that has 100
job submissions (not through "queue 100", but through 100 separate job
definitions in the file)? I would expect those to have pretty much the
same overhead.
Thanks,
Patty
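One way to test the comparison described above is to generate both workloads from the same job definition and time each submission. The following is only a sketch: the file names (cmp_node.sub, cmp.dag, cmp_flat.sub) and the job count are made up for illustration, and it only writes the files; it does not submit anything.

```shell
#!/bin/bash
# Sketch: build two equivalent 100-job workloads, one as a DAG and one as
# a flat submit file. All file names here are hypothetical.
N=100

# Submit file shared by every DAG node (one job per node).
cat > cmp_node.sub << 'EOF'
Executable = /bin/echo
Arguments = "Hello World"
transfer_executable = False
Log = cmp_dag.log
Queue
EOF

# DAG with N independent nodes, all using the same node submit file.
: > cmp.dag
for i in $(seq 1 "$N"); do
    echo "JOB A${i} cmp_node.sub" >> cmp.dag
done

# Flat submit file with N separate job definitions (N Queue statements,
# not a single "Queue N").
cat > cmp_flat.sub << 'EOF'
Executable = /bin/echo
transfer_executable = False
Log = cmp_flat.log
EOF
for i in $(seq 1 "$N"); do
    echo "Arguments = \"Hello World ${i}\"" >> cmp_flat.sub
    echo "Queue" >> cmp_flat.sub
done
```

The two can then be submitted with "condor_submit cmp_flat.sub" and "condor_submit_dag cmp.dag". One difference that may matter: as far as I know, DAGMan runs a separate condor_submit for each node, while the flat file is handled by a single condor_submit invocation, so the per-job cost need not be the same.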
Here's an example that I ran on our RHEL system running Condor 7.6.1,
which submits 250 jobs/second (following the initial delay to "ensure
ProcessId uniqueness"):
#!/bin/bash
# Create config file
cat > test.config << EOF
DAGMAN_MAX_SUBMITS_PER_INTERVAL = 250
DAGMAN_SUBMIT_DELAY = 0
DAGMAN_USER_LOG_SCAN_INTERVAL = 5
EOF
# Create submit file
cat > test.sub << EOF
Executable = /bin/echo
Arguments = "Hello World"
transfer_executable = False
Output = out/test_\$(RUN).out
Error = err/test_\$(RUN).err
Log = test.log
Queue
EOF
# Create dag. Note that "seq 0 250" yields 251 nodes (A0 through A250).
echo "CONFIG test.config" > test.dag
for i in $(seq 0 250)
do
echo "JOB A${i} test.sub" >> test.dag
echo "VARS A${i} RUN=\"${i}\"" >> test.dag
done
# Make out and err directories
rm -rf out err 2>/dev/null
mkdir out err
# Submit!
condor_submit_dag test.dag
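To check the rate actually achieved, the submit events in test.log can be counted per second. This is a rough sketch: it assumes the classic user log format, where each submit event starts with "000 (" followed by the job ID, the date, and the time. The function name submits_per_second is made up for illustration.

```shell
#!/bin/bash
# Count submit events per timestamp in a Condor user log.
# Assumption: a submit event looks like
#   000 (123.000.000) 09/07 15:57:01 Job submitted from host: <...>
# so the date is field 3 and the time is field 4.
submits_per_second() {
    awk '/^000 \(/ { count[$3 " " $4]++ }
         END { for (t in count) print t, count[t] }' "$1" | sort
}
```

Run it as "submits_per_second test.log". Each output line is a timestamp followed by the number of jobs submitted in that second.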