
[Condor-users] condor_view + parallel jobs



Hi All,

I have set up a dedicated HPC cluster running Condor.
I successfully set up the Condor View server, enabling the collector to keep history data. However, parallel jobs do not seem to be recorded correctly in the viewhist files. A job appears in viewhist3.* only while it is in the Idle state, as a job of the DedicatedScheduler.
Once the jobs start to run, they disappear from the file completely.

Here is a part of the viewhist3.0.new file:
1345035660      Total   :       0       1
1345035660      DedicatedScheduler@xxxxxxxxxxxxxxxxxxx  :       0       1
1345035660      nanotio2@xxxxxxxxxx     :       0       0
1345035900      CompMag@xxxxxxxxxx      :       0       0
1345035900      Total   :       0       0
1345035900      DedicatedScheduler@xxxxxxxxxxxxxxxxxxx  :       0       0
1345035900      nanotio2@xxxxxxxxxx     :       0       0
....
1345037820      CompMag@xxxxxxxxxx      :       0       0
1345037820      Total   :       0       0
1345037820      nanotio2@xxxxxxxxxx     :       0       0
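
For anyone who wants to check these records programmatically, here is a minimal Python sketch. It assumes each record is a Unix timestamp, a submitter name, a colon, and two integer counts, and it treats the counts as running and idle jobs, which is only my reading of the excerpt above. The names ViewRecord and parse_viewhist_line are hypothetical helpers, not part of Condor.

```python
from typing import NamedTuple, Optional


class ViewRecord(NamedTuple):
    timestamp: int   # Unix time of the sample
    submitter: str   # e.g. "Total" or "user@domain"
    running: int     # ASSUMPTION: first count is running jobs
    idle: int        # ASSUMPTION: second count is idle jobs


def parse_viewhist_line(line: str) -> Optional[ViewRecord]:
    """Parse one whitespace-separated record; return None if it does not fit."""
    fields = line.split()
    # Expected shape: <timestamp> <submitter> : <count> <count>
    if len(fields) != 5 or fields[2] != ":":
        return None
    try:
        return ViewRecord(int(fields[0]), fields[1], int(fields[3]), int(fields[4]))
    except ValueError:
        # A numeric field was not an integer.
        return None


def active_records(lines):
    """Yield only the records where at least one of the two counts is non-zero."""
    for line in lines:
        record = parse_viewhist_line(line)
        if record is not None and (record.running or record.idle):
            yield record
```

Feeding the excerpt above through active_records leaves only the two records from timestamp 1345035660 with a non-zero second count (Total and DedicatedScheduler), which is the behaviour I am describing.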

and the output of the condor_q command:
975.0   CompMag    8/12 20:06   2+19:32:15 R  0   73.2 condor_openmpi.sh
975.1   CompMag    8/12 20:06   0+00:00:00 R  0   0.0  condor_openmpi.sh
977.0   CompMag    8/13 14:56   2+00:32:25 R  0   73.2 condor_openmpi.sh
977.1   CompMag    8/13 14:56   0+00:00:00 R  0   0.0  condor_openmpi.sh
981.0   nanotio2   8/14 13:51   1+01:31:34 R  0   73.2 condor_openmpi-1.4
981.1   nanotio2   8/14 13:51   0+00:00:00 R  0   0.0  condor_openmpi-1.4
982.0   nanotio2   8/15 05:21   0+10:10:05 R  0   73.2 condor_openmpi-1.4
982.1   nanotio2   8/15 05:21   0+00:00:00 R  0   0.0  condor_openmpi-1.4
983.0   nanotio2   8/15 14:45   0+00:39:21 R  0   73.2 condor_openmpi-1.4
983.1   nanotio2   8/15 14:45   0+00:00:00 R  0   0.0  condor_openmpi-1.4

Are there any configuration settings related to parallel jobs and viewhist that I have missed?

Thank you,

Imre