
RE: [Condor-users] job requirements

Hello All,

I am having a similar problem. I submitted a bunch of jobs yesterday and
they started running on different machines. I came in this morning and only
my machine was running jobs even though the other machines were Idle. My
machine is running 6.6.6. I decided to update the other machines from 6.4.5
to 6.6.6. The idle jobs found processors and started running. About an hour
later I came back and checked the status. Once again only my machine was
running processes. The other processors were again sitting around unclaimed.
Has this been seen before? Could it possibly have something to do with the
power options set in the control panel (by the way I am running on Windows
2K machines)? Can someone help?

Thanks for your time,

David L. Oakley
Aerospace Engineer
Redstone Arsenal, AL 35898-5252
Email : david.l.oakley@xxxxxxxxxxx
Phone : 256-876-0539 (DSN: 746-0539)
Secure: 256-876-0649 (DSN: 746-0649)
Fax : 256-842-0808 (DSN: 788-0808)

-----Original Message-----
From: Alain Roy [mailto:roy@xxxxxxxxxxx] 
Sent: Monday, August 09, 2004 7:52 PM
To: condor-users@xxxxxxxxxxx
Subject: Re: [Condor-users] job requirements

Fernando Rannou wrote:
>Requirements = (Arch == "INTEL") && (OpSys == "LINUX") && (Disk >=
>DiskUsage) && ((Memory * 1024) >= ImageSize) && (TARGET.FileSystemDomain 
>== MY.FileSystemDomain)
>but the condor submit file has no Requirements!!!
>This seems to be happening for one specific user.

condor_submit adds reasonable default requirements to a job, even when the
submit file has no Requirements line.

Pick one job that has this problem. Pretend it's job 10.0.

Pick one computer that doesn't match. Pretend it's node.example.com.

Run these two commands, substituting the correct identifiers:

   condor_q -l 10.0
   condor_status -l node.example.com

Walk through the requirements and see what can't be true. For instance, if 
the job has DiskUsage of 1000000 and the computer has Disk of 1000, then 
Requirements will be false.
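
If it helps, the same walk-through can be sketched in a few lines of Python. This is only an illustration of checking the Requirements expression quoted above clause by clause; it is not part of Condor. The function name failing_clauses is hypothetical, and every attribute value is made up except the DiskUsage and Disk figures from the example. Plain Python comparisons stand in for ClassAd evaluation, which is a simplification.

```python
def failing_clauses(job, machine):
    """Return the names of the Requirements clauses that evaluate to false.

    `job` and `machine` are plain dicts holding attribute values copied by
    hand from `condor_q -l` and `condor_status -l` output (an assumption of
    this sketch; nothing here talks to Condor).
    """
    checks = {
        "Arch": machine["Arch"] == "INTEL",
        "OpSys": machine["OpSys"] == "LINUX",
        "Disk": machine["Disk"] >= job["DiskUsage"],
        "Memory": machine["Memory"] * 1024 >= job["ImageSize"],
        "FileSystemDomain": machine["FileSystemDomain"] == job["FileSystemDomain"],
    }
    return [name for name, ok in checks.items() if not ok]


# Made-up values, apart from DiskUsage and Disk taken from the example above.
job = {
    "DiskUsage": 1000000,
    "ImageSize": 20000,
    "FileSystemDomain": "submit.example.com",
}
machine = {
    "Arch": "INTEL",
    "OpSys": "LINUX",
    "Disk": 1000,
    "Memory": 512,
    "FileSystemDomain": "node.example.com",
}

print(failing_clauses(job, machine))
```

Any clause the function lists is one to look at more closely in the real output of the two commands.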

My bet is that it will be the FileSystemDomain that is causing your 
problem. If so, there are two possibilities:

   * If you are submitting the job from a shared disk (like NFS) then the
     computers should indicate that they have the same FileSystemDomain.
     Here at the UW, we set FileSystemDomain to be cs.wisc.edu instead of
     the $(FULL_HOSTNAME). For more information, see:


   * If you are submitting the job from a computer that doesn't
     have shared disks, then you'll need to transfer files, so this
     requirement doesn't show up. For more information, see:


I hope this helps.

