
Re: [Condor-users] SCHEDD not running right on upgraded CE with Condor 7.6.6



>>> We have the following line in /etc/sysconfig/condor to point to the system wide
>>> configuration file:
>>> CONDOR_CONFIG="/share/apps/condor/etc/condor_config_7.6.6"
>> And it's also in the environment when you run condor_q? The daemons and the tools have to read the same configuration files. If they don't, condor_q and the other tools will fail in the way that you're seeing.
>> 
> 
> That's it.  It was the environment variable issue.  Setting CONDOR_CONFIG correctly fixed the problem.
> 
> We are now able to get output from the condor_q and condor_status commands.
> 
> Thanks.

Great!
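
For anyone who hits the same thing later, here is a rough sketch of how one might check that a shell sees the same CONDOR_CONFIG as the daemons. It assumes the daemons take the value from a sysconfig-style file with a plain CONDOR_CONFIG="..." line, as in the setup quoted above. The function names are made up for illustration and are not part of Condor.

```shell
# Hypothetical helpers, not part of Condor.

# Print the CONDOR_CONFIG value found in a sysconfig-style file.
# $1: path to the file, e.g. /etc/sysconfig/condor
sysconfig_condor_config() {
    # Take the last CONDOR_CONFIG= line and strip surrounding double quotes.
    sed -n 's/^CONDOR_CONFIG=//p' "$1" | tail -n 1 | tr -d '"'
}

# Compare the file's value with the one in the current environment.
# Returns 0 on a match, 1 otherwise.
check_condor_config() {
    file=${1:-/etc/sysconfig/condor}
    expected=$(sysconfig_condor_config "$file")
    if [ -z "$expected" ]; then
        echo "no CONDOR_CONFIG line found in $file"
        return 1
    fi
    if [ "${CONDOR_CONFIG:-}" = "$expected" ]; then
        echo "match: $expected"
    else
        echo "mismatch: file has '$expected', this shell has '${CONDOR_CONFIG:-<unset>}'"
        return 1
    fi
}
```

Running check_condor_config /etc/sysconfig/condor in the shell where condor_q is failing should show whether the two disagree. If I remember correctly, condor_config_val -config also lists the configuration files the tools are actually reading, which is a useful cross-check.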

> Following problem:
> 
> We see that a bunch of jobs have been scheduled, but they are not executing:

I'm confused because I don't see the jobs. There's nothing in the queue, so nothing is executing? Can you explain in a bit more detail?

-alain

> 
> # cd /wntmp/home
> # ls
> alice        uscms0179  uscms0604  uscms1029  uscms1454  uscms1879  uscms2304
> cdf          uscms0180  uscms0605  uscms1030  uscms1455  uscms1880
>    .
>    .
>    .
> 
> 
> # condor_status
> 
> Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime
> 
> slot1@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     1.000   982  0+04:05:04
> slot2@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     0.760   982  0+04:05:05
> slot3@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     0.000   982  0+04:05:06
> slot4@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     0.000   982  0+04:05:07
> slot5@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     0.000   982  0+04:05:08
> slot6@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     0.000   982  0+04:05:09
> slot7@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     0.000   982  0+04:05:10
> slot8@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     0.000   982  0+04:05:03
> slot1@compute-10-5 LINUX      X86_64 Unclaimed Idle     0.000  1963  6+07:14:53
> slot2@compute-10-5 LINUX      X86_64 Unclaimed Idle     0.000  1963  6+07:15:19
> slot3@compute-10-5 LINUX      X86_64 Unclaimed Idle     0.000  1963  6+07:15:20
> slot4@compute-10-5 LINUX      X86_64 Unclaimed Idle     0.000  1963  6+07:15:21
> slot10@compute-20- LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:07
> slot11@compute-20- LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:08
> slot12@compute-20- LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:09
> slot1@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.420  4024  0+07:44:43
> slot2@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:07
> slot3@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:08
> slot4@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:09
> slot5@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:10
> slot6@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:11
> slot7@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:12
> slot8@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:05
> slot9@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:06
>                     Total Owner Claimed Unclaimed Matched Preempting Backfill
> 
>        X86_64/LINUX    24     8       0        16       0          0        0
> 
>               Total    24     8       0        16       0          0        0
> 
> 
> # condor_q
> 
> 
> -- Submitter: cithep252.ultralight.org : <10.3.255.253:48116> : cithep252.ultralight.org
> ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
> 
> 0 jobs; 0 idle, 0 running, 0 held
> 
> 
> Thanks.
> 
> Steven.