
Re: [Condor-users] Basic Job submission problems




Abdul,
      A blank return from condor_status suggests that the pool isn't up, i.e. it
has no nodes. I would suggest looking in the collector log on the central manager
for an indication that the nodes have successfully registered a ClassAd. In my
very limited experience, the problem is usually access control, indicated by
"permission denied" messages in the logs. You should also be aware that there
are a number of time windows in the system, and the full cycle from a node
registering to being presented as part of the pool can take some minutes.
For testing, you could change the defaults to the testing-mode defaults in the
config files on the central manager and the nodes. This would speed things up and
avoid the execute nodes going into the Owner state when you are logged on to
them looking at things.
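
As a rough sketch of the log check described above, the helper below prints any
line of a daemon log that mentions "permission denied", in any letter case. The
function name find_denied is made up for illustration, and the exact wording of
the message in your logs may differ, so treat the pattern as an assumption.

```shell
# Hypothetical helper: list lines (with line numbers) in a log file that
# mention "permission denied", ignoring case.
# Usage: find_denied /path/to/logfile
find_denied() {
    grep -i -n 'permission denied' "$1"
}

# Possible use on the central manager. This assumes condor_config_val is on
# the PATH and that the collector log has the default name CollectorLog in
# the directory given by the LOG setting:
#   find_denied "$(condor_config_val LOG)/CollectorLog"
```

If this prints nothing, access control is probably not the cause, and it may
just be a matter of waiting a few minutes for the node to appear in the pool.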

Cheers Paul



From: ABDUL SUBHAN <subhan@xxxxxxxxpm.edu.sa>
Date: 02/09/2004 01:22 AM
Please respond to Condor-Users Mail List

To: condor-users@xxxxxxxxxxx
cc: (bcc: Paul Chubb/Staff/ABS)
Subject: [Condor-users] Basic Job submission problems




I've just installed Condor on a 2-machine cluster. I set up the submitting
station/central manager as one machine, and another node as the executing
station.

Although ps -ef | grep 'condor' on the executing station shows the startd daemon
up and running, it's not listed in the available nodes. The command below
shows no entries:
> condor_status -available

This is also the reason why I can't submit jobs. When I do, condor_q shows the
jobs as idle. condor_q -analyze shows that

08.000:  Run analysis summary.  Of 2 machines,
      0 are rejected by your job's requirements
      2 reject your job because of their own requirements
      0 match, but are serving users with a better priority in the pool
      0 match, match, but reject the job for unknown reasons
      0 match, but will not currently preempt their existing job
      0 are available to run your job
        No successful match recorded.
        Last failed match: Thu Sep  2 05:48:23 2004
        Reason for last match failure: no match found

I think I'm missing some configuration file entries needed to start job
execution. Any help would be really appreciated. Thanks.


ABDUL SUBHAN

RESEARCH ASSISTANT
COMPUTER ENGINEERING DEPARTMENT

P.O BOX # 7851
KING FAHD UNIVERSITY OF PETROLEUM & MINERALS
DHAHRAN, 31261
KINGDOM OF SAUDI ARABIA
PHONE RESI: 00966-3-860-8000-EXT: 9902126

_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
http://lists.cs.wisc.edu/mailman/listinfo/condor-users





