
Re: [Condor-users] lost jobs



On Apr 7, 2011, at 9:28 AM, Santanu Das wrote:

I see a number of jobs that were submitted to the queue (and eventually failed) and are logged in job_queue.log, but condor_history knows nothing about them. Here is one example, for ClusterId = 604510:
[root@serv07 spool]# cat job_queue.log | sed -n -e '/ClusterId 604510/{x;p;g;$N;N;N;N;N;p;D}'

103 0604510.-1 ClusterId 604510
103 0604510.-1 QDate 1302172359
103 0604510.-1 CompletionDate 0
103 0604510.-1 User "pltlhc15@xxxxxxxxxxxxxxxxxxxxxxxx"
103 0604510.-1 Owner "pltlhc15"
#
[root@serv07 spool]# condor_history 604510 && date
 ID      OWNER            SUBMITTED     RUN_TIME ST   COMPLETED CMD            
Thu Apr  7 15:23:26 BST 2011


Does anyone know why I'm seeing this?

Do you have job history enabled? If 'condor_config_val HISTORY' responds with 'Not defined: HISTORY', then Condor doesn't keep a history of old jobs.

When Condor is keeping a history of old jobs, the oldest jobs are dropped from the history once enough later jobs leave the queue. If you want the history to go back further, you can adjust ENABLE_HISTORY_ROTATION, MAX_HISTORY_LOG, and MAX_HISTORY_ROTATIONS.
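
As a rough sketch, the three settings could be set in a local config file on the submit machine along these lines. The file name (for example condor_config.local) and the specific values are assumptions for illustration, not recommendations; size them for your own spool disk.

```
# Illustrative values only (assumed for this example).

# Rotate the history file rather than letting it grow without bound.
# This is already the default.
ENABLE_HISTORY_ROTATION = True

# Maximum size of the history file, in bytes, before it is rotated.
# The default is 20 MB; this example uses 100 MB.
MAX_HISTORY_LOG = 104857600

# Number of rotated backup history files to keep. The default is 2.
MAX_HISTORY_ROTATIONS = 10
```

After a change like this, running condor_reconfig should get the schedd to pick up the new values. Jobs that have already been dropped from the history won't come back; the larger limits only affect how long later jobs are retained.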

Thanks and regards,
Jaime Frey
UW-Madison Condor Team