Re: [Condor-users] Job not running .....



Rohit,
First of all, I was only trying to help. When I started working with Condor I struggled to make it work; I remember that my farm would not run jobs similar to the one you are trying to get working. Someone on this forum suggested that I review the log files on all the machines in my pool to look for errors. Those logs and the errors in them gave me a starting point for googling specific issues. In the end I found that I had not configured my farm correctly, and that was why the jobs were not running. I am not trying to imply that your work is incorrect; I am only describing my particular case. Troubleshooting Condor problems is not an easy task and requires a lot of patience. If I were you, I would google any error message I could find to see whether other people have experienced similar issues and how they fixed them.
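
For the log review described above, a small script can save some scrolling. This is only a rough sketch, not an official Condor tool: the function name and the keyword list are my own assumptions, and you have to supply the log directory yourself (running condor_config_val LOG on each machine should print it).

```python
from pathlib import Path

# Keywords are an assumption; adjust them to whatever your logs actually contain.
KEYWORDS = ("ERROR", "FAILED", "DENIED")


def find_suspect_lines(log_dir, keywords=KEYWORDS):
    """Return (file name, line number, text) for each log line containing a keyword."""
    hits = []
    for path in sorted(Path(log_dir).iterdir()):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a log holds odd bytes.
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                upper = line.upper()
                if any(word in upper for word in keywords):
                    hits.append((path.name, number, line.rstrip("\n")))
    return hits
```

Call find_suspect_lines() with the log directory of each machine in the pool and print the tuples it returns; each hit is then something concrete to search for.
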
In your case, I would carefully read all the messages generated when you ran condor_q -better-analyze; at the end of the output, Condor gives you a hint about where the problem might be. I would correct the missing condition before sending any more jobs. See below:

The following attributes are missing from the job ClassAd:
CheckpointPlatform
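
The counts in the "Run analysis summary" part of the same output are also worth reading line by line. As a sketch only, here is one way to pull them out with Python. The function name is hypothetical, and the line wording it matches is copied from the output pasted in this thread, so it may need adjusting for other Condor versions.

```python
import re

# Summary lines copied from the condor_q -better-analyze output in this thread.
SAMPLE = """\
007.000:  Run analysis summary.  Of 4 machines,
    0 are rejected by your job's requirements
    3 reject your job because of their own requirements
    0 match but are serving users with a better priority in the pool
    1 match but reject the job for unknown reasons
    0 match but will not currently preempt their existing job
    0 are available to run your job
"""


def parse_analysis_summary(text):
    """Map each summary phrase to the number of machines it applies to."""
    counts = {}
    for line in text.splitlines():
        # A summary line is "<count> <phrase>", where the phrase starts with
        # "are", "reject" or "match" (an assumption based on the sample above).
        match = re.match(r"\s*(\d+)\s+((?:are|reject|match)\b.*)", line)
        if match:
            counts[match.group(2).strip()] = int(match.group(1))
    return counts


summary = parse_analysis_summary(SAMPLE)
for reason, machines in summary.items():
    if machines:
        print(f"{machines} machine(s): {reason}")
```

For the output in this thread, only two lines have a non-zero count: 3 machines reject the job because of their own requirements, and 1 matches but rejects the job for unknown reasons. Those two lines are probably worth as much attention as the missing attribute.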

Again, I hope this message helps. Good luck,
Alex 

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Rohit Farmer
Sent: Monday, August 31, 2009 11:52 AM
To: Condor-Users Mail List
Subject: [Condor-users] Job not running .....

Hi there,

I installed Condor from the RPM package on two systems running Fedora 11,
then ran condor_master and condor_status. All four processors are
displayed, as shown below, but when I submit a job it does not execute.
I am pasting the outputs; please help me find out why my jobs are not
running.


# condor_status

Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@xxxxxxxxxxxx LINUX      INTEL  Owner     Idle     0.000   245  0+00:15:08
slot2@xxxxxxxxxxxx LINUX      INTEL  Unclaimed Idle     0.000   245  0+00:00:05
slot1@xxxxxxxxxxxx LINUX      INTEL  Owner     Idle     0.070   245  0+00:15:12
slot2@xxxxxxxxxxxx LINUX      INTEL  Owner     Idle     0.000   245  0+00:15:13

                   Total Owner Claimed Unclaimed Matched Preempting Backfill

       INTEL/LINUX     4     3       0         1       0          0        0

             Total     4     3       0         1       0          0        0


# condor_q


-- Submitter: rohit.bioinfo.net : <192.168.7.16:37018> : rohit.bioinfo.net
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
 7.0   rohit           8/28 19:41   0+00:00:00 I  0   0.0  hello_world.sh

1 jobs; 1 idle, 0 running, 0 held
[rohit@rohit condor_test]$ condor_q -better-analyze 7.0


-- Submitter: rohit.bioinfo.net : <192.168.7.16:37018> : rohit.bioinfo.net
---
007.000:  Run analysis summary.  Of 4 machines,
    0 are rejected by your job's requirements
    3 reject your job because of their own requirements
    0 match but are serving users with a better priority in the pool
    1 match but reject the job for unknown reasons
    0 match but will not currently preempt their existing job
    0 are available to run your job
      No successful match recorded.
      Last failed match: Fri Aug 28 19:41:27 2009
      Reason for last match failure: no match found

The following attributes are missing from the job ClassAd:

CheckpointPlatform


# My Submit File

Universe       = vanilla
Executable     = hello.bash

input   = /dev/null
output  = hello.out
error   = hello.error

Queue


If there is any mistake in my submit file, please suggest a simple
script that I can submit with the current settings to test whether
jobs run at all.

Regards

Rohit

--
Rohit Farmer
M.Tech Bioinformatics
Department of Bioinformatics
CBAS, AAIDU
Allahabad - 211 007
Ph. No. 9839845093, 9415261403
e-Mail rohit.farmer@xxxxxxxxx
URL http://rohitfarmer.netfirms.com
Blog http://rohitsspace.blogspot.com



_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at: 
https://lists.cs.wisc.edu/archive/condor-users/