
Re: [Condor-users] classad for wall-time



Thanks Ian! I think I see things a bit differently here. First of all, jobs never get preempted here, so they always run on a single machine.
(EnteredCurrentStatus - JobCurrentStartDate) is generally higher than 'RemoteWallClockTime' here. For example, this is what I see for this month, first for (EnteredCurrentStatus - JobCurrentStartDate) and then for RemoteWallClockTime:

[root@serv07 ~]# condor_history -c 'formatTime(EnteredCurrentStatus, "%m") == "04" && AccountingGroup =!= UNDEFINED' \
-format "%s " AccountingGroup -format "%d\n" 'EnteredCurrentStatus - JobCurrentStartDate' | sed 's/\(.*\)\..* \(.*\)/\1 \2/' | \
awk '{sums[$1] = $2 + ($1 in sums ? sums[$1] : 0)} END {for (x in sums) print x,sums[x]}'
group_alice 608721
group_euindia 2092416
group_calice 150833
group_monitor 452541
group_atlas 278858255
group_lhcb 85749166
group_cms 2825308


[root@serv07 ~]# condor_history -c 'formatTime(EnteredCurrentStatus, "%m") == "04" && AccountingGroup =!= UNDEFINED' \
-format "%s " AccountingGroup -format "%d\n" 'RemoteWallClockTime' | sed 's/\(.*\)\..* \(.*\)/\1 \2/' | \
awk '{sums[$1] = $2 + ($1 in sums ? sums[$1] : 0)} END {for (x in sums) print x,sums[x]}'
group_alice 49846
group_euindia 2111122
group_calice 150833
group_monitor 104893
group_atlas 125423594
group_lhcb 81987039
group_cms 2599357
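
The sed/awk stage used in both commands can be factored into one small helper, which makes the two runs easier to compare side by side. This is only a sketch: the function name sum_by_group is hypothetical, and it assumes each input line has the form "group.user seconds", as produced by the -format options above.

```shell
# Hypothetical helper: sum the second column per accounting group.
# Assumes input lines such as "group_atlas.someuser 1234".
sum_by_group() {
    awk '{
        g = $1
        sub(/\.[^.]*$/, "", g)   # drop a trailing ".user" suffix, if any
        sums[g] += $2            # accumulate seconds per group
    } END {
        for (g in sums) print g, sums[g]
    }' | sort                    # awk iteration order is unspecified, so sort
}

# Possible usage (same condor_history options as above):
#   condor_history -c '...' -format "%s " AccountingGroup \
#       -format "%d\n" 'RemoteWallClockTime' | sum_by_group
```

Sorting the output gives both runs the same group order, so the totals can be compared line by line.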

According to your explanation, I assumed that RemoteWallClockTime would be higher than or equal to (EnteredCurrentStatus - JobCurrentStartDate), but that's not the case here. Am I seeing this correctly?

Cheers,
Santanu

    

    

On Wednesday, April 27, 2011 at 12:32 PM, Santanu Das wrote:

Dear all,

What exactly does 'RemoteWallClockTime' tell us? Is it the wall-time? I
thought 'EnteredCurrentStatus - JobCurrentStartDate' would be equivalent
to the wall-time, but in most cases it's not the same as
RemoteWallClockTime. Does anyone have any insight?

It tells you the total wall clock time a job spent on all the machines it ran on. RemoteWallClockTime is cumulative, so if your job runs for a while on one machine, gets preempted, and then starts running again on another machine, it will be higher than (EnteredCurrentStatus - JobCurrentStartDate). It also includes suspension time if the job was suspended on a machine.

It would only be equal to (EnteredCurrentStatus - JobCurrentStartDate) if:

1. All three attributes for the job were just updated;
2. The job had only ever run once on one machine (otherwise RemoteWallClockTime will be higher);
3. The job is currently running on the machine (as soon as the job finishes EnteredCurrentStatus reflects the time the job entered the Completed state, not the time it last started to run on a machine).
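
As a rough illustration of the preemption case described above, here is a toy timeline in shell arithmetic. All timestamps are made up, and the calculation only models the attribute semantics as described in this message; it is not how Condor itself computes these values.

```shell
# Hypothetical timeline, in seconds; every value is invented for illustration.
first_start=1000      # job starts on machine A
preempted_at=1100     # job is preempted after 100 s on machine A
second_start=1500     # job restarts on machine B (JobCurrentStartDate)
completed_at=1700     # job completes (EnteredCurrentStatus)

# Cumulative wall clock time over both runs.
remote_wall_clock_time=$(( (preempted_at - first_start) + (completed_at - second_start) ))

# Span covered by EnteredCurrentStatus - JobCurrentStartDate: the last run only.
last_run_span=$(( completed_at - second_start ))

echo "RemoteWallClockTime: $remote_wall_clock_time"
echo "EnteredCurrentStatus - JobCurrentStartDate: $last_run_span"
```

In this made-up case the cumulative value (300) exceeds the last-run span (200) because the first 100 seconds on machine A are counted only in RemoteWallClockTime.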

Regards,
- Ian

