[Condor-users] Segfault when resuming from checkpoint



Hi,
I have a problem with jobs that segfault when resuming from a checkpoint after they were evicted.
As far as I can see from the ShadowLog, the last thing that happens is that the state of the "/dev/null" file handle is restored.
That seems to mean that the segfault occurs before execution of the user code is resumed.
I'd be very grateful for any help, because I'm running out of ideas.

We are using:

$CondorVersion: 6.8.4 Feb  1 2007 $
$CondorPlatform: X86_64-LINUX_RHEL3 $

Below are the logs concerning one of the jobs. The segfault can be seen in the last ten lines.

Thanks a lot in advance,
Rudolf


3/29 01:37:21 (98.0) (11138):	STREAM FILE RECEIVED OK (524104735 bytes)
3/29 01:37:22 (98.0) (24744):Reaped child status - pid 11138 exited with status 0
3/29 04:10:03 (98.0) (13886):	STREAM FILE RECEIVED OK (939721759 bytes)
3/29 04:10:05 (98.0) (24744):Reaped child status - pid 13886 exited with status 0
3/29 07:24:06 (98.0) (24744):Read: write(fd=3,core_loc=0x71b000,len=0x1f138000)
3/29 22:56:46 (98.0) (6718):Read: 	offset:       313138
3/29 22:56:46 (98.0) (6718):Read: 	size:         313138
3/29 22:57:36 (98.0) (6718):Read: 	offset:       313138
3/30 11:02:06 (?.?) (31623):Hostname = "<129.69.120.67:9695>", Job = 138.0
3/30 11:02:06 (138.0) (31623):Requesting Primary Starter
3/30 11:02:06 (138.0) (31623):Shadow: Request to run a job was ACCEPTED
3/30 11:02:06 (138.0) (31623):Shadow: RSC_SOCK connected, fd = 17
3/30 11:02:06 (138.0) (31623):Shadow: CLIENT_LOG connected, fd = 18
3/30 11:02:06 (138.0) (31623):My_Filesystem_Domain = "ica1.uni-stuttgart.de"
3/30 11:02:06 (138.0) (31623):My_UID_Domain = "ica1.uni-stuttgart.de"
3/30 11:02:06 (138.0) (31623):	Entering pseudo_get_file_stream
3/30 11:02:06 (138.0) (31623):	file = "/condor/spool/cluster138.ickpt.subproc0"
3/30 11:02:07 (138.0) (31623):Reaped child status - pid 31624 exited with status 0
3/30 11:02:07 (138.0) (31623):Read: User Job - $CondorPlatform: X86_64-LINUX_RHEL3 $
3/30 11:02:07 (138.0) (31623):Read: User Job - $CondorVersion: 6.8.4 Feb  1 2007 $
3/30 11:02:07 (138.0) (31623):Read: Checkpoint file name is "/condor/spool/cluster138.proc0.subproc0"
3/30 11:59:20 (138.0) (31623):Updating suspension info to schedd.
3/30 11:59:20 (138.0) (31623):Read: TISABH Starter: Suspended user job: 1
3/30 12:16:01 (138.0) (31623):Updating suspension info to schedd.
3/30 12:16:01 (138.0) (31623):Read: TISABH Starter: Unsuspended user job.
3/30 12:16:01 (138.0) (31623):Read: Got SIGTSTP
3/30 12:16:01 (138.0) (31623):Read: Saved signal state.
3/30 12:16:01 (138.0) (31623):Read: About to save file state
3/30 12:16:01 (138.0) (31623):Read: CondorFileTable::checkpoint
3/30 12:16:01 (138.0) (31623):Read: OPEN FILE TABLE:
3/30 12:16:01 (138.0) (31623):Read: fd 0
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /dev/null
3/30 12:16:01 (138.0) (31623):Read: 	offset:       0
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x0
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/dev/null
3/30 12:16:01 (138.0) (31623):Read: 	size:         0
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 1
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/condor.out
3/30 12:16:01 (138.0) (31623):Read: 	offset:       0
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x1
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/condor.out
3/30 12:16:01 (138.0) (31623):Read: 	size:         0
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 2
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/condor.err
3/30 12:16:01 (138.0) (31623):Read: 	offset:       37684
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x1
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/condor.err
3/30 12:16:01 (138.0) (31623):Read: 	size:         37684
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 3
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./output/Shear.log
3/30 12:16:01 (138.0) (31623):Read: 	offset:       62864
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x2
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./output/Shear.log
3/30 12:16:01 (138.0) (31623):Read: 	size:         62864
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 4
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./output/Shear3.log
3/30 12:16:01 (138.0) (31623):Read: 	offset:       20866768
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x2
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./output/Shear3.log
3/30 12:16:01 (138.0) (31623):Read: 	size:         20866768
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 5
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./ParticlesReload.0001.dat
3/30 12:16:01 (138.0) (31623):Read: 	offset:       0
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x0
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./ParticlesReload.0001.dat
3/30 12:16:01 (138.0) (31623):Read: 	size:         0
3/30 12:16:01 (138.0) (31623):Read: 	opens:        2
3/30 12:16:01 (138.0) (31623):Read: fd 6
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./FluidParticlesReload.0001.dat
3/30 12:16:01 (138.0) (31623):Read: 	offset:       0
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x0
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./FluidParticlesReload.0001.dat
3/30 12:16:01 (138.0) (31623):Read: 	size:         0
3/30 12:16:01 (138.0) (31623):Read: 	opens:        2
3/30 12:16:01 (138.0) (31623):Read: fd 8
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./output/Complete.0001.conf
3/30 12:16:01 (138.0) (31623):Read: 	offset:       0
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x2
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./output/Complete.0001.conf
3/30 12:16:01 (138.0) (31623):Read: 	size:         0
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 9
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./output/FluidParticles.0001.dat
3/30 12:16:01 (138.0) (31623):Read: 	offset:       0
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x2
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./output/FluidParticles.0001.dat
3/30 12:16:01 (138.0) (31623):Read: 	size:         0
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 10
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./output/Particles.0001.dat
3/30 12:16:01 (138.0) (31623):Read: 	offset:       0
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x2
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./output/Particles.0001.dat
3/30 12:16:01 (138.0) (31623):Read: 	size:         0
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 11
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/output/xballs.log
3/30 12:16:01 (138.0) (31623):Read: 	offset:       112492544
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x1
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/output/xballs.log
3/30 12:16:01 (138.0) (31623):Read: 	size:         112492544
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 12
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./output/Shear2.log
3/30 12:16:01 (138.0) (31623):Read: 	offset:       40544
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x401
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./output/Shear2.log
3/30 12:16:01 (138.0) (31623):Read: 	size:         40544
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 15
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./output/Momentum.log
3/30 12:16:01 (138.0) (31623):Read: 	offset:       8266
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x401
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./output/Momentum.log
3/30 12:16:01 (138.0) (31623):Read: 	size:         8266
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 16
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./output/Energie.log
3/30 12:16:01 (138.0) (31623):Read: 	offset:       2314
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x401
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./output/Energie.log
3/30 12:16:01 (138.0) (31623):Read: 	size:         2314
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 17
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./output/VelocityX.dist
3/30 12:16:01 (138.0) (31623):Read: 	offset:       0
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x2
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./output/VelocityX.dist
3/30 12:16:01 (138.0) (31623):Read: 	size:         0
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 18
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./output/VelocityY.dist
3/30 12:16:01 (138.0) (31623):Read: 	offset:       0
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x2
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./output/VelocityY.dist
3/30 12:16:01 (138.0) (31623):Read: 	size:         0
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 19
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./output/VelocityZ.dist
3/30 12:16:01 (138.0) (31623):Read: 	offset:       0
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x2
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./output/VelocityZ.dist
3/30 12:16:01 (138.0) (31623):Read: 	size:         0
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 20
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./output/PartVelocityX.dist
3/30 12:16:01 (138.0) (31623):Read: 	offset:       18550
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x2
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./output/PartVelocityX.dist
3/30 12:16:01 (138.0) (31623):Read: 	size:         18550
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 21
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./output/PartVelocityY.dist
3/30 12:16:01 (138.0) (31623):Read: 	offset:       18550
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x2
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./output/PartVelocityY.dist
3/30 12:16:01 (138.0) (31623):Read: 	size:         18550
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 22
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./output/PartVelocityZ.dist
3/30 12:16:01 (138.0) (31623):Read: 	offset:       18550
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x2
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./output/PartVelocityZ.dist
3/30 12:16:01 (138.0) (31623):Read: 	size:         18550
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 23
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./output/Vz.log
3/30 12:16:01 (138.0) (31623):Read: 	offset:       239
3/30 12:16:01 (138.0) (31623):Read: 	dups:         1
3/30 12:16:01 (138.0) (31623):Read: 	open flags:   0x401
3/30 12:16:01 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./output/Vz.log
3/30 12:16:01 (138.0) (31623):Read: 	size:         239
3/30 12:16:01 (138.0) (31623):Read: 	opens:        1
3/30 12:16:01 (138.0) (31623):Read: fd 24
3/30 12:16:01 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./ParticlesReload.0001.dat
3/30 12:16:02 (138.0) (31623):Read: 	offset:       0
3/30 12:16:02 (138.0) (31623):Read: 	dups:         1
3/30 12:16:02 (138.0) (31623):Read: 	open flags:   0x2
3/30 12:16:02 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./ParticlesReload.0001.dat
3/30 12:16:02 (138.0) (31623):Read: 	size:         0
3/30 12:16:02 (138.0) (31623):Read: 	opens:        2
3/30 12:16:02 (138.0) (31623):Read: fd 25
3/30 12:16:02 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./FluidParticlesReload.0001.dat
3/30 12:16:02 (138.0) (31623):Read: 	offset:       0
3/30 12:16:02 (138.0) (31623):Read: 	dups:         1
3/30 12:16:02 (138.0) (31623):Read: 	open flags:   0x2
3/30 12:16:02 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./FluidParticlesReload.0001.dat
3/30 12:16:02 (138.0) (31623):Read: 	size:         0
3/30 12:16:02 (138.0) (31623):Read: 	opens:        2
3/30 12:16:02 (138.0) (31623):Read: fd 26
3/30 12:16:02 (138.0) (31623):Read: 	logical name: /data/data0/weeber/scherung/20/./reload.0001.conf
3/30 12:16:02 (138.0) (31623):Read: 	offset:       0
3/30 12:16:02 (138.0) (31623):Read: 	dups:         1
3/30 12:16:02 (138.0) (31623):Read: 	open flags:   0x2
3/30 12:16:02 (138.0) (31623):Read: 	url:          local:/data/data0/weeber/scherung/20/./reload.0001.conf
3/30 12:16:02 (138.0) (31623):Read: 	size:         0
3/30 12:16:02 (138.0) (31623):Read: 	opens:        1
3/30 12:16:02 (138.0) (31623):Read: working dir = /data/data0/weeber/scherung/20
3/30 12:16:07 (138.0) (31623):Read: Done saving file state
3/30 12:16:07 (138.0) (31623):Read: About to update MyImage
3/30 12:16:07 (138.0) (31623):Read: Size of ckpt image = 272360479 bytes
3/30 12:16:07 (138.0) (31623):Read: Checkpointing at 6096 KB/s.
3/30 12:16:07 (138.0) (31623):Read: About to write checkpoint
3/30 12:16:07 (138.0) (31623):Read: Image::Write(): fd -1 file_name /condor/spool/cluster138.proc0.subproc0
3/30 12:16:07 (138.0) (31623):Read: Checkpoint name is "/condor/spool/cluster138.proc0.subproc0"
3/30 12:16:07 (138.0) (31623):Read: Tmp name is "/condor/spool/cluster138.proc0.subproc0.tmp"
3/30 12:16:07 (138.0) (31623):	Entering pseudo_put_file_stream
3/30 12:16:07 (138.0) (31623):	file = "/condor/spool/cluster138.proc0.subproc0.tmp"
3/30 12:16:07 (138.0) (31623):	len = 272360479
3/30 12:16:07 (138.0) (31623):	owner = weeber
3/30 12:16:08 (138.0) (31623):Read: Opened "/condor/spool/cluster138.proc0.subproc0.tmp" via file stream
3/30 12:16:08 (138.0) (31623):Read: Wrote headers OK
3/30 12:16:08 (138.0) (31623):Read: Wrote all SegMaps OK
3/30 12:16:08 (138.0) (31623):Read: write(fd=3,core_loc=0x79c000,len=0x103b2000)
3/30 12:22:31 (138.0) (31623):Read: Wrote Segment[0] of type DATA -> OK
3/30 12:22:31 (138.0) (31623):Read: write(fd=3,core_loc=0x7fffffff2000,len=0xbfff)
3/30 12:22:31 (138.0) (489):	STREAM FILE RECEIVED OK (272360479 bytes)
3/30 12:22:31 (138.0) (31623):Reaped child status - pid 489 exited with status 0
3/30 12:22:32 (138.0) (31623):Read: Wrote Segment[1] of type STACK -> OK
3/30 12:22:32 (138.0) (31623):Read: Wrote all Segments OK
3/30 12:22:32 (138.0) (31623):Read: About to close ckpt fd (3)
3/30 12:22:32 (138.0) (31623):user_time = 9 ticks
3/30 12:22:32 (138.0) (31623):sys_time = 131 ticks
3/30 12:22:32 (138.0) (31623):Read: Closed OK
3/30 12:22:32 (138.0) (31623):Read: About to rename "/condor/spool/cluster138.proc0.subproc0.tmp" to "/condor/spool/cluster138.proc0.subproc0"
3/30 12:22:32 (138.0) (31623):Read: Renamed OK
3/30 12:22:32 (138.0) (31623):Read: USER PROC: CHECKPOINT IMAGE SENT OK
3/30 12:22:32 (138.0) (31623):Read: Ckpt exit
3/30 12:22:34 (138.0) (31623):Shadow: Job 138.0 exited, termsig = 3, coredump = 0, retcode = 0
3/30 12:22:34 (138.0) (31623):Shadow: Job was checkpointed
3/30 12:22:34 (138.0) (31623):user_time = 10 ticks
3/30 12:22:34 (138.0) (31623):sys_time = 131 ticks
3/30 12:22:34 (138.0) (31623):********** Shadow Exiting(101) **********
3/30 12:33:12 (?.?) (788):Hostname = "<129.69.120.64:9683>", Job = 138.0
3/30 12:33:12 (138.0) (788):Requesting Primary Starter
3/30 12:33:12 (138.0) (788):Shadow: Request to run a job was ACCEPTED
3/30 12:33:12 (138.0) (788):Shadow: RSC_SOCK connected, fd = 17
3/30 12:33:12 (138.0) (788):Shadow: CLIENT_LOG connected, fd = 18
3/30 12:33:12 (138.0) (788):My_Filesystem_Domain = "ica1.uni-stuttgart.de"
3/30 12:33:12 (138.0) (788):My_UID_Domain = "ica1.uni-stuttgart.de"
3/30 12:33:12 (138.0) (788):	Entering pseudo_get_file_stream
3/30 12:33:12 (138.0) (788):	file = "/condor/spool/cluster138.ickpt.subproc0"
3/30 12:33:13 (138.0) (788):Reaped child status - pid 791 exited with status 0
3/30 12:33:13 (138.0) (788):Read: condor_restart:
3/30 12:33:13 (138.0) (788):Read: Checkpoint file name is "/condor/spool/cluster138.proc0.subproc0"
3/30 12:33:13 (138.0) (788):	Entering pseudo_get_file_stream
3/30 12:33:13 (138.0) (788):	file = "/condor/spool/cluster138.proc0.subproc0"
3/30 12:33:13 (138.0) (788):Read: Opened "/condor/spool/cluster138.proc0.subproc0" via file stream
3/30 12:33:13 (138.0) (788):Read: Read headers OK
3/30 12:33:13 (138.0) (788):Read: Read SegMap[0](DATA) OK
3/30 12:33:13 (138.0) (788):Read: Read SegMap[1](STACK) OK
3/30 12:33:13 (138.0) (788):Read: Read all SegMaps OK
3/30 12:33:13 (138.0) (788):Read: Found a DATA block, increasing heap from 0x85e000 to 0x10b4e000
3/30 12:33:13 (138.0) (788):Read: About to overwrite 272310272 bytes starting at 0x79c000(DATA)
3/30 12:33:36 (138.0) (788):Reaped child status - pid 792 exited with status 0
3/30 12:33:36 (138.0) (788):Read: About to overwrite 49151 bytes starting at 0x7fffffff2000(STACK)
3/30 12:33:36 (138.0) (788):Read: USER PROC: CHECKPOINT IMAGE RECEIVED OK
3/30 12:33:36 (138.0) (788):Read: Performing an msync() on all dirty pages...
3/30 12:33:36 (138.0) (788):Read: About to restore file state
3/30 12:33:36 (138.0) (788):Read: CondorFileTable::resume
3/30 12:33:36 (138.0) (788):Read: working dir = /data/data0/weeber/scherung/20
3/30 12:33:36 (138.0) (788):Read: OPEN FILE TABLE:
3/30 12:33:36 (138.0) (788):Read: fd 0
3/30 12:33:36 (138.0) (788):Read: 	logical name: /dev/null
3/30 12:33:36 (138.0) (788):Read: 	offset:       0
3/30 12:33:36 (138.0) (788):Shadow: Job 138.0 exited, termsig = 11, coredump = 0, retcode = 0
3/30 12:33:36 (138.0) (788):Shadow: was killed by signal 11.
3/30 12:33:36 (138.0) (788):user_time = 0 ticks
3/30 12:33:36 (138.0) (788):sys_time = 51 ticks
3/30 12:33:36 (138.0) (788):Static Policy: removing job because OnExitRemove has become true
3/30 12:33:36 (138.0) (788):********** Shadow Exiting(102) **********
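
For anyone who wants to pull the same information out of a longer ShadowLog, here is a minimal Python sketch. It assumes only the line layout visible in the excerpt above (timestamp, job id in parentheses, pid in parentheses, then the message). For every "exited, termsig = N" line it reports the last few messages logged under the same pid. The function name, the regular expressions and the `context` parameter are made up for illustration and are not part of Condor.

```python
import re
from collections import defaultdict, deque

# Assumed layout, taken from the excerpt above:
#   "3/30 12:33:36 (138.0) (788):message"
LINE_RE = re.compile(r"^(\d+/\d+ \d+:\d+:\d+) \(([^)]*)\) \((\d+)\):(.*)$")
EXIT_RE = re.compile(r"Job (\S+) exited, termsig = (\d+)")


def last_lines_before_exit(lines, context=3):
    """Return (job, termsig, preceding messages) for each job exit line.

    'preceding messages' are the last `context` messages logged under the
    same pid before the exit line was written.
    """
    recent = defaultdict(lambda: deque(maxlen=context))
    exits = []
    for raw in lines:
        m = LINE_RE.match(raw.rstrip("\n"))
        if not m:
            continue  # skip anything that does not look like a log line
        _stamp, _job, pid, msg = m.groups()
        hit = EXIT_RE.search(msg)
        if hit:
            exits.append((hit.group(1), int(hit.group(2)), list(recent[pid])))
        else:
            recent[pid].append(msg.strip())
    return exits


# A few lines copied from the end of the log above.
sample = [
    "3/30 12:33:36 (138.0) (788):Read: fd 0",
    "3/30 12:33:36 (138.0) (788):Read: \tlogical name: /dev/null",
    "3/30 12:33:36 (138.0) (788):Read: \toffset:       0",
    "3/30 12:33:36 (138.0) (788):Shadow: Job 138.0 exited, termsig = 11, coredump = 0, retcode = 0",
]

for job, sig, before in last_lines_before_exit(sample):
    print(f"job {job} exited with signal {sig}; last messages:")
    for msg in before:
        print("   ", msg)
```

Run against a whole ShadowLog (for example `last_lines_before_exit(open(path))`, with `path` pointing at the log file), it also lists the earlier checkpoint exit with termsig = 3, so the normal eviction and the failed restart can be compared side by side.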