
[Condor-users] Logfile: Is my VMware job checkpointed or not?



Hello,

I have started a VMware job (the pool PC runs Windows XP with VMware 1.0):

Universe = vm
Executable = any_name_you_like
Log = vm.log
vm_type = vmware
vm_networking = false
vm_checkpoint = true
vm_memory = 64
vmware_dir = /home/myhome/VM
vm_cdrom_files = job
vm_should_transfer_cdrom_files = YES
vmware_should_transfer_files = YES
Queue

The log file "vm.log" has the following content:

------------------------------------------------
000 (016.000.000) 05/10 09:41:45 Job submitted from host: <127.0.0.1:40528>
...
001 (016.000.000) 05/10 09:42:12 Job executing on host: <115.145.228.188:1048>
...
006 (016.000.000) 05/10 09:43:22 Image size of job updated: 34820
...
<snip>
...
006 (016.000.000) 05/10 10:28:23 Image size of job updated: 35200
...
003 (016.000.000) 05/10 10:37:43 Job was checkpointed.
    Usr 0 00:00:00, Sys 0 00:00:22  -  Run Remote Usage
    Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
    69671560  -  Run Bytes Sent By Job For Checkpoint
...
004 (016.000.000) 05/10 10:37:49 Job was evicted.
    (0) Job was not checkpointed.
        Usr 0 00:00:00, Sys 0 00:00:22  -  Run Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
    69671560  -  Run Bytes Sent By Job
    11207530  -  Run Bytes Received By Job
...
001 (016.000.000) 05/10 10:39:31 Job executing on host: <115.145.228.13:1041>
...
004 (016.000.000) 05/10 10:40:44 Job was evicted.
    (0) Job was not checkpointed.
        Usr 0 00:00:00, Sys 0 00:00:20  -  Run Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
    0  -  Run Bytes Sent By Job
    80879088  -  Run Bytes Received By Job
...
001 (016.000.000) 05/10 10:46:07 Job executing on host: <115.145.228.178:1034>
--------------------------------------------

What confuses me is this pair of events, six seconds apart:
   003 (016.000.000) 05/10 10:37:43 Job was checkpointed.
and
   004 (016.000.000) 05/10 10:37:49 Job was evicted.
       (0) Job was not checkpointed.

Was the job checkpointed or not?
Does the next execution start from scratch,
or does it continue from the checkpoint?

Can somebody explain to me what "Job was checkpointed" and
"Job was not checkpointed" mean in the log file?
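
To make the two messages easier to compare, here is a minimal Python sketch that pulls the 003 and 004 events out of a log like the one above. It only assumes the layout visible in my log (a three-digit event code, the job ID in parentheses, a timestamp, a message, indented detail lines, and "..." between events). The names parse_events and checkpoint_related are my own and are not part of Condor, and the script does not interpret what the messages mean.

```python
import re

# Header line of an event, as it appears in the log above:
# "003 (016.000.000) 05/10 10:37:43 Job was checkpointed."
EVENT_RE = re.compile(
    r"^(\d{3}) \((\d+\.\d+\.\d+)\) (\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$"
)

# Excerpt copied from vm.log (only the two events in question).
SAMPLE = """\
003 (016.000.000) 05/10 10:37:43 Job was checkpointed.
    Usr 0 00:00:00, Sys 0 00:00:22  -  Run Remote Usage
    Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
    69671560  -  Run Bytes Sent By Job For Checkpoint
...
004 (016.000.000) 05/10 10:37:49 Job was evicted.
    (0) Job was not checkpointed.
        Usr 0 00:00:00, Sys 0 00:00:22  -  Run Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
    69671560  -  Run Bytes Sent By Job
    11207530  -  Run Bytes Received By Job
...
"""


def parse_events(text):
    """Split the log text into events with code, job ID, time, message, details."""
    events = []
    current = None
    for line in text.splitlines():
        match = EVENT_RE.match(line)
        if match:
            code, job, time, message = match.groups()
            current = {"code": code, "job": job, "time": time,
                       "message": message, "details": []}
            events.append(current)
        elif line.strip() == "...":
            current = None  # "..." closes the current event
        elif current is not None and line.strip():
            current["details"].append(line.strip())
    return events


def checkpoint_related(events):
    """Return (time, code, note) for 003 (checkpoint) and 004 (evict) events.

    For 004 the note is the first detail line, which in my log is the
    "(0) Job was not checkpointed." line; it is reported verbatim.
    """
    rows = []
    for ev in events:
        if ev["code"] == "003":
            rows.append((ev["time"], ev["code"], ev["message"]))
        elif ev["code"] == "004":
            note = ev["details"][0] if ev["details"] else ev["message"]
            rows.append((ev["time"], ev["code"], note))
    return rows


if __name__ == "__main__":
    for time, code, note in checkpoint_related(parse_events(SAMPLE)):
        print(time, code, note)
```

To run it against the real file, replace SAMPLE with the contents of vm.log, for example open("vm.log").read().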

Thank you!

Rob.