
RE: [Condor-users] Condor job submission delayed

I never saw an answer to this question. Was one offered off the list? If so, could you please cross-post it? I am curious about this delay too, as I'm seeing it in my flock of Windows XP machines.

Also, I do believe there is a typo in the condor_q -analyze output in the 6.7.1 release. The line says:

~      4 match, match, but reject the job for unknown reasons

But it feels like it should read:

~      4 match, but reject the job for unknown reasons

Technically the comma is not required on any of those lines if you really want to get picky about grammar.

Can I request that these lines receive detailed descriptions in the help for condor_q in the manual? The line in question is very troubling to a user who doesn't know what's going on in the system. You start asking questions like: well, how can I make it match? How can there be an unknown reason? A little technical explanation of the meaning of each line would help relieve some user stress.


-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Marc Saric
Sent: August 31, 2004 10:26 AM
To: condor-users@xxxxxxxxxxx
Subject: [Condor-users] Condor job submission delayed

Hi all,

I am experimenting with a small Condor cluster (Condor 6.6.6, mostly on Windows boxes, unfortunately), as you can see from my various beginner's mails popping up in the forum.

I have set up a bunch of Windows machines (Win2k SP6 and WinXP Pro SP1) and a central Linux master server.

Submission of jobs works in principle (tested with the hello-world examples from http://www.liv.ac.uk/e-science/condor/hello.html), but sometimes I observe strange behaviour: certain jobs take a very long time until they are executed.

This happens while most of the machines are not busy and are listed as available (15 min with no user + low CPU utilization).

"condor_status" gives something like:

saric@u-191-srv2:~/tmp> condor_status

Name          OpSys       Arch   State      Activity   LoadAv Mem

u-191-srv2.pr LINUX       INTEL  Unclaimed  Idle       0.010  1004
u-099-cpc-esi WINNT50     INTEL  Owner      Idle       0.240   512
vm1@u-099-csr WINNT50     INTEL  Claimed    Busy       0.000  1024
vm2@u-099-csr WINNT50     INTEL  Unclaimed  Idle       0.000  1024
u-099-cbb1    WINNT51     INTEL  Unclaimed  Idle       0.000   511
u-099-cnb2    WINNT51     INTEL  Owner      Idle       0.020   511
u-099-cpc-sek WINNT51     INTEL  Owner      Idle       0.040   512
u-099-cpc1    WINNT51     INTEL  Owner      Idle       0.000   512
u-099-cpc2    WINNT51     INTEL  Owner      Idle       0.030   512
u-099-cpc3    WINNT51     INTEL  Unclaimed  Idle       0.000   512
u-099-cpc4    WINNT51     INTEL  Owner      Idle       -0.010   512
u-099-cpc5    WINNT51     INTEL  Unclaimed  Idle       0.000   512

so there are at least 4 unclaimed machines in the pool which should match the requirements ((OpSys == "WINNT50") || (OpSys == "WINNT51")).
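As a rough cross-check of that count, here is a minimal Python sketch that filters the rows of the condor_status listing above by state and OpSys. It only mirrors the OpSys clause of the job's requirements and the State column; it does not reproduce Condor's actual matchmaking, which also evaluates each machine's own requirements. The names STATUS_ROWS and matching_unclaimed are made up for this illustration.

```python
# Rows copied verbatim from the condor_status listing above.
STATUS_ROWS = """\
u-191-srv2.pr LINUX       INTEL  Unclaimed  Idle       0.010  1004
u-099-cpc-esi WINNT50     INTEL  Owner      Idle       0.240   512
vm1@u-099-csr WINNT50     INTEL  Claimed    Busy       0.000  1024
vm2@u-099-csr WINNT50     INTEL  Unclaimed  Idle       0.000  1024
u-099-cbb1    WINNT51     INTEL  Unclaimed  Idle       0.000   511
u-099-cnb2    WINNT51     INTEL  Owner      Idle       0.020   511
u-099-cpc-sek WINNT51     INTEL  Owner      Idle       0.040   512
u-099-cpc1    WINNT51     INTEL  Owner      Idle       0.000   512
u-099-cpc2    WINNT51     INTEL  Owner      Idle       0.030   512
u-099-cpc3    WINNT51     INTEL  Unclaimed  Idle       0.000   512
u-099-cpc4    WINNT51     INTEL  Owner      Idle       -0.010   512
u-099-cpc5    WINNT51     INTEL  Unclaimed  Idle       0.000   512
"""


def matching_unclaimed(text, wanted_opsys=("WINNT50", "WINNT51")):
    """Return names of Unclaimed machines whose OpSys is in wanted_opsys.

    Hypothetical helper: it looks only at the Name, OpSys and State
    columns and ignores everything else Condor would consider.
    """
    names = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        name, opsys, _arch, state = fields[:4]
        if state == "Unclaimed" and opsys in wanted_opsys:
            names.append(name)
    return names


if __name__ == "__main__":
    for name in matching_unclaimed(STATUS_ROWS):
        print(name)
```

The Linux master is also Unclaimed, but it is filtered out by the OpSys clause, so only the Windows machines remain.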

Running "condor_q -analyze" takes quite a long time and returns something like:

045.000:  Run analysis summary.  Of 12 machines,
~      1 are rejected by your job's requirements
~      6 reject your job because of their own requirements
~      0 match, but are serving users with a better priority in the pool
~      4 match, match, but reject the job for unknown reasons
~      1 match, but will not currently preempt their existing job
~      0 are available to run your job

I can't see why the 4 should reject for unknown reasons. Is there any place I could look to find out these unknown reasons (system log, local Condor log on the machines)?

Thanks in advance!

--
Marc Saric

Dr. Marc Saric, Bioinformatik, Proteom Centrum Tübingen,
Auf der Morgenstelle 15, D-72076 Tübingen, Germany,
Tel: +49 (0)7071 29 70557, marc.saric@xxxxxxxxxxxxxxxx http://www.proteom-centrum-tuebingen.de