[Condor-users] Job is getting rerun instead of terminated



Hi all,

We have a setup that is meant to terminate all jobs after 12 hours of 
runtime. Most jobs are vanilla universe. But sometimes jobs are evicted 
after 12 hours and then started again on other nodes. In this case the 
user finally killed the job with condor_rm. Other jobs are terminated 
after 12 hours as expected.

Attached are part 3 of our global Condor config and the user's log for 
the restarting job.

Did I miss something? 

Best regards,
  Andreas

-- 
 Andreas Vetter

000 (1810.000.000) 07/08 18:08:29 Job submitted from host: <132.187.47.2:32877>
000 (1810.001.000) 07/08 18:08:29 Job submitted from host: <132.187.47.2:32877>
000 (1810.002.000) 07/08 18:08:29 Job submitted from host: <132.187.47.2:32877>
001 (1810.001.000) 07/08 18:08:53 Job executing on host: <132.187.47.26:32772>
001 (1810.000.000) 07/08 18:08:55 Job executing on host: <132.187.47.26:32772>
001 (1810.002.000) 07/08 18:08:57 Job executing on host: <132.187.47.27:32933>
006 (1810.001.000) 07/08 18:09:01 Image size of job updated: 86528
006 (1810.000.000) 07/08 18:09:03 Image size of job updated: 86532
006 (1810.002.000) 07/08 18:09:05 Image size of job updated: 86528
006 (1810.001.000) 07/08 18:29:01 Image size of job updated: 398596
006 (1810.000.000) 07/08 18:29:03 Image size of job updated: 175752
006 (1810.002.000) 07/08 18:29:05 Image size of job updated: 176452
006 (1810.000.000) 07/08 18:49:03 Image size of job updated: 189684
006 (1810.002.000) 07/08 18:49:05 Image size of job updated: 180176
006 (1810.002.000) 07/08 19:09:05 Image size of job updated: 215680
006 (1810.000.000) 07/08 20:29:03 Image size of job updated: 195932
006 (1810.001.000) 07/08 21:09:02 Image size of job updated: 399296
005 (1810.001.000) 07/08 23:25:14 Job terminated.
004 (1810.000.000) 07/09 06:09:00 Job was evicted.
004 (1810.002.000) 07/09 06:09:01 Job was evicted.
001 (1810.000.000) 07/09 06:15:13 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/09 06:15:14 Job executing on host: <132.187.47.22:32772>
006 (1810.000.000) 07/09 06:35:21 Image size of job updated: 175772
006 (1810.002.000) 07/09 06:35:23 Image size of job updated: 176452
006 (1810.000.000) 07/09 06:55:21 Image size of job updated: 211304
006 (1810.002.000) 07/09 06:55:23 Image size of job updated: 261940
006 (1810.000.000) 07/09 08:35:21 Image size of job updated: 246984
004 (1810.000.000) 07/09 18:15:15 Job was evicted.
004 (1810.002.000) 07/09 18:15:15 Job was evicted.
001 (1810.000.000) 07/09 18:15:27 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/09 18:15:30 Job executing on host: <132.187.47.22:32772>
006 (1810.000.000) 07/09 18:35:36 Image size of job updated: 175752
006 (1810.002.000) 07/09 18:35:38 Image size of job updated: 176452
006 (1810.000.000) 07/09 18:55:36 Image size of job updated: 215672
006 (1810.002.000) 07/09 18:55:38 Image size of job updated: 183500
006 (1810.002.000) 07/09 19:35:38 Image size of job updated: 186236
006 (1810.002.000) 07/09 19:55:38 Image size of job updated: 202884
004 (1810.000.000) 07/10 06:15:29 Job was evicted.
004 (1810.002.000) 07/10 06:15:33 Job was evicted.
001 (1810.000.000) 07/10 06:20:09 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/10 06:20:12 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/10 06:40:17 Image size of job updated: 175752
006 (1810.002.000) 07/10 06:40:20 Image size of job updated: 176452
006 (1810.000.000) 07/10 07:00:17 Image size of job updated: 186164
006 (1810.002.000) 07/10 07:00:20 Image size of job updated: 185256
006 (1810.002.000) 07/10 07:40:20 Image size of job updated: 202484
006 (1810.000.000) 07/10 08:00:17 Image size of job updated: 193272
006 (1810.000.000) 07/10 08:20:17 Image size of job updated: 227648
004 (1810.000.000) 07/10 18:20:10 Job was evicted.
004 (1810.002.000) 07/10 18:20:19 Job was evicted.
001 (1810.000.000) 07/10 18:25:08 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/10 18:25:09 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/10 18:45:16 Image size of job updated: 175752
006 (1810.002.000) 07/10 18:45:17 Image size of job updated: 176452
006 (1810.000.000) 07/10 19:05:16 Image size of job updated: 196240
006 (1810.002.000) 07/10 19:05:17 Image size of job updated: 215680
006 (1810.000.000) 07/10 20:05:22 Image size of job updated: 227648
004 (1810.002.000) 07/11 06:25:11 Job was evicted.
004 (1810.000.000) 07/11 06:25:11 Job was evicted.
001 (1810.000.000) 07/11 06:35:11 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/11 06:35:12 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/11 06:55:19 Image size of job updated: 175752
006 (1810.002.000) 07/11 06:55:20 Image size of job updated: 176452
006 (1810.000.000) 07/11 07:15:19 Image size of job updated: 183600
006 (1810.002.000) 07/11 07:15:20 Image size of job updated: 183500
006 (1810.000.000) 07/11 07:55:20 Image size of job updated: 211288
006 (1810.002.000) 07/11 07:55:20 Image size of job updated: 187440
006 (1810.000.000) 07/11 08:35:20 Image size of job updated: 228156
006 (1810.000.000) 07/11 08:55:20 Image size of job updated: 282148
004 (1810.000.000) 07/11 18:35:15 Job was evicted.
004 (1810.002.000) 07/11 18:35:15 Job was evicted.
001 (1810.000.000) 07/11 18:40:08 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/11 18:40:09 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/11 19:00:16 Image size of job updated: 175772
006 (1810.002.000) 07/11 19:00:17 Image size of job updated: 176452
006 (1810.000.000) 07/11 19:20:16 Image size of job updated: 204788
006 (1810.002.000) 07/11 19:20:17 Image size of job updated: 189692
006 (1810.002.000) 07/11 20:00:17 Image size of job updated: 194028
004 (1810.000.000) 07/12 06:40:08 Job was evicted.
004 (1810.002.000) 07/12 06:40:17 Job was evicted.
001 (1810.000.000) 07/12 06:45:11 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/12 06:45:14 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/12 07:05:19 Image size of job updated: 175752
006 (1810.002.000) 07/12 07:05:22 Image size of job updated: 176452
006 (1810.000.000) 07/12 07:25:19 Image size of job updated: 215672
006 (1810.002.000) 07/12 07:25:22 Image size of job updated: 183500
006 (1810.002.000) 07/12 08:05:22 Image size of job updated: 186236
006 (1810.000.000) 07/12 08:45:19 Image size of job updated: 257004
004 (1810.002.000) 07/12 18:45:16 Job was evicted.
004 (1810.000.000) 07/12 18:45:16 Job was evicted.
001 (1810.000.000) 07/12 18:55:10 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/12 18:55:11 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/12 19:15:18 Image size of job updated: 175752
006 (1810.002.000) 07/12 19:15:19 Image size of job updated: 176452
006 (1810.000.000) 07/12 19:35:18 Image size of job updated: 236872
006 (1810.002.000) 07/12 19:35:19 Image size of job updated: 236880
004 (1810.000.000) 07/13 06:55:12 Job was evicted.
004 (1810.002.000) 07/13 06:55:12 Job was evicted.
001 (1810.000.000) 07/13 07:00:07 Job executing on host: <132.187.47.25:32772>
001 (1810.002.000) 07/13 07:00:10 Job executing on host: <132.187.47.25:32772>
006 (1810.000.000) 07/13 07:20:16 Image size of job updated: 175752
006 (1810.002.000) 07/13 07:20:18 Image size of job updated: 176452
006 (1810.000.000) 07/13 07:40:16 Image size of job updated: 215672
006 (1810.002.000) 07/13 07:40:18 Image size of job updated: 185224
004 (1810.000.000) 07/13 19:00:13 Job was evicted.
004 (1810.002.000) 07/13 19:00:15 Job was evicted.
001 (1810.000.000) 07/13 19:10:08 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/13 19:10:09 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/13 19:30:16 Image size of job updated: 175772
006 (1810.002.000) 07/13 19:30:17 Image size of job updated: 176452
006 (1810.000.000) 07/13 19:50:17 Image size of job updated: 193780
006 (1810.002.000) 07/13 19:50:17 Image size of job updated: 183468
006 (1810.002.000) 07/13 20:30:17 Image size of job updated: 203132
006 (1810.000.000) 07/13 20:50:18 Image size of job updated: 230120
004 (1810.000.000) 07/14 07:10:13 Job was evicted.
004 (1810.002.000) 07/14 07:10:14 Job was evicted.
001 (1810.000.000) 07/14 07:20:08 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/14 07:20:09 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/14 07:40:16 Image size of job updated: 175752
006 (1810.002.000) 07/14 07:40:17 Image size of job updated: 176452
006 (1810.000.000) 07/14 08:00:16 Image size of job updated: 186244
006 (1810.002.000) 07/14 08:00:17 Image size of job updated: 183500
006 (1810.000.000) 07/14 08:40:16 Image size of job updated: 269548
006 (1810.002.000) 07/14 08:40:17 Image size of job updated: 183580
006 (1810.002.000) 07/14 09:00:17 Image size of job updated: 184256
004 (1810.002.000) 07/14 19:20:11 Job was evicted.
004 (1810.000.000) 07/14 19:20:11 Job was evicted.
001 (1810.000.000) 07/14 19:30:07 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/14 19:30:09 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/14 19:50:15 Image size of job updated: 175772
006 (1810.002.000) 07/14 19:50:18 Image size of job updated: 176452
006 (1810.000.000) 07/14 20:10:15 Image size of job updated: 208884
006 (1810.002.000) 07/14 20:10:18 Image size of job updated: 236880
006 (1810.000.000) 07/14 21:30:15 Image size of job updated: 213224
006 (1810.000.000) 07/14 21:50:15 Image size of job updated: 222944
004 (1810.002.000) 07/15 07:30:12 Job was evicted.
004 (1810.000.000) 07/15 07:30:12 Job was evicted.
001 (1810.000.000) 07/15 07:35:11 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/15 07:35:12 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/15 07:55:19 Image size of job updated: 175752
006 (1810.002.000) 07/15 07:55:20 Image size of job updated: 176452
006 (1810.000.000) 07/15 08:15:19 Image size of job updated: 183600
006 (1810.002.000) 07/15 08:15:20 Image size of job updated: 183500
006 (1810.002.000) 07/15 08:55:20 Image size of job updated: 183580
006 (1810.000.000) 07/15 09:15:19 Image size of job updated: 192472
006 (1810.000.000) 07/15 09:35:19 Image size of job updated: 227648
006 (1810.002.000) 07/15 09:35:20 Image size of job updated: 211312
004 (1810.000.000) 07/15 19:35:11 Job was evicted.
004 (1810.002.000) 07/15 19:35:20 Job was evicted.
001 (1810.000.000) 07/15 19:40:09 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/15 19:40:12 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/15 20:00:17 Image size of job updated: 175772
006 (1810.002.000) 07/15 20:00:20 Image size of job updated: 176452
006 (1810.000.000) 07/15 20:20:17 Image size of job updated: 183496
006 (1810.002.000) 07/15 20:20:20 Image size of job updated: 183468
006 (1810.000.000) 07/15 21:00:17 Image size of job updated: 183560
006 (1810.002.000) 07/15 21:00:20 Image size of job updated: 183580
006 (1810.000.000) 07/15 21:20:17 Image size of job updated: 257004
006 (1810.002.000) 07/15 21:20:20 Image size of job updated: 186252
004 (1810.000.000) 07/16 07:40:10 Job was evicted.
004 (1810.002.000) 07/16 07:40:19 Job was evicted.
001 (1810.000.000) 07/16 07:45:09 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/16 07:45:12 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/16 08:05:17 Image size of job updated: 175772
006 (1810.002.000) 07/16 08:05:20 Image size of job updated: 176452
006 (1810.000.000) 07/16 08:25:17 Image size of job updated: 186244
006 (1810.002.000) 07/16 08:25:20 Image size of job updated: 183624
006 (1810.002.000) 07/16 09:05:20 Image size of job updated: 211296
006 (1810.000.000) 07/16 09:25:17 Image size of job updated: 195264
006 (1810.002.000) 07/16 09:25:21 Image size of job updated: 211312
006 (1810.000.000) 07/16 09:45:17 Image size of job updated: 230120
004 (1810.000.000) 07/16 19:45:11 Job was evicted.
004 (1810.002.000) 07/16 19:45:20 Job was evicted.
001 (1810.000.000) 07/16 19:45:32 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/16 19:45:33 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/16 20:05:40 Image size of job updated: 175772
006 (1810.002.000) 07/16 20:05:41 Image size of job updated: 176452
006 (1810.000.000) 07/16 20:25:40 Image size of job updated: 208884
006 (1810.002.000) 07/16 20:25:41 Image size of job updated: 211312
006 (1810.000.000) 07/16 21:05:40 Image size of job updated: 211288
006 (1810.000.000) 07/16 21:45:40 Image size of job updated: 285856
004 (1810.000.000) 07/17 07:45:37 Job was evicted.
004 (1810.002.000) 07/17 07:45:37 Job was evicted.
001 (1810.000.000) 07/17 07:55:10 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/17 07:55:13 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/17 08:15:18 Image size of job updated: 175772
006 (1810.002.000) 07/17 08:15:21 Image size of job updated: 176452
006 (1810.000.000) 07/17 08:35:18 Image size of job updated: 183496
006 (1810.002.000) 07/17 08:35:21 Image size of job updated: 211312
006 (1810.000.000) 07/17 09:15:18 Image size of job updated: 183560
006 (1810.002.000) 07/17 09:15:21 Image size of job updated: 269556
006 (1810.000.000) 07/17 09:35:18 Image size of job updated: 193936
006 (1810.000.000) 07/17 09:55:18 Image size of job updated: 230120
004 (1810.000.000) 07/17 19:55:15 Job was evicted.
004 (1810.002.000) 07/17 19:55:15 Job was evicted.
001 (1810.002.000) 07/17 19:55:28 Job executing on host: <132.187.47.22:32772>
001 (1810.000.000) 07/17 19:55:31 Job executing on host: <132.187.47.21:32816>
006 (1810.002.000) 07/17 20:15:36 Image size of job updated: 176452
006 (1810.000.000) 07/17 20:15:39 Image size of job updated: 175752
006 (1810.002.000) 07/17 20:35:36 Image size of job updated: 183500
006 (1810.000.000) 07/17 20:35:39 Image size of job updated: 215672
006 (1810.002.000) 07/17 21:15:36 Image size of job updated: 186236
006 (1810.002.000) 07/17 21:35:36 Image size of job updated: 211312
004 (1810.002.000) 07/18 07:55:33 Job was evicted.
004 (1810.000.000) 07/18 07:55:34 Job was evicted.
001 (1810.000.000) 07/18 08:00:11 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/18 08:00:14 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/18 08:20:19 Image size of job updated: 175752
006 (1810.002.000) 07/18 08:20:22 Image size of job updated: 176452
006 (1810.000.000) 07/18 08:40:19 Image size of job updated: 211304
006 (1810.002.000) 07/18 08:40:22 Image size of job updated: 185124
006 (1810.002.000) 07/18 09:40:22 Image size of job updated: 189872
006 (1810.000.000) 07/18 10:00:19 Image size of job updated: 227648
006 (1810.002.000) 07/18 10:00:22 Image size of job updated: 215680
004 (1810.000.000) 07/18 20:00:13 Job was evicted.
004 (1810.002.000) 07/18 20:00:22 Job was evicted.
001 (1810.000.000) 07/18 20:05:10 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/18 20:05:12 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/18 20:25:18 Image size of job updated: 175752
006 (1810.002.000) 07/18 20:25:20 Image size of job updated: 176452
006 (1810.000.000) 07/18 20:45:18 Image size of job updated: 215672
006 (1810.002.000) 07/18 20:45:20 Image size of job updated: 194044
006 (1810.000.000) 07/18 22:05:18 Image size of job updated: 224988
006 (1810.002.000) 07/18 22:05:20 Image size of job updated: 211312
004 (1810.000.000) 07/19 08:05:15 Job was evicted.
004 (1810.002.000) 07/19 08:05:15 Job was evicted.
001 (1810.000.000) 07/19 08:15:11 Job executing on host: <132.187.47.21:32816>
001 (1810.002.000) 07/19 08:15:12 Job executing on host: <132.187.47.21:32816>
006 (1810.000.000) 07/19 08:35:19 Image size of job updated: 175772
006 (1810.002.000) 07/19 08:35:20 Image size of job updated: 176452
006 (1810.000.000) 07/19 08:55:19 Image size of job updated: 269564
006 (1810.002.000) 07/19 08:55:20 Image size of job updated: 183500
006 (1810.002.000) 07/19 09:35:20 Image size of job updated: 194160
006 (1810.002.000) 07/19 09:55:20 Image size of job updated: 211312
007 (1810.000.000) 07/19 14:35:25 Shadow exception!
007 (1810.002.000) 07/19 14:35:27 Shadow exception!
001 (1810.000.000) 07/19 15:16:48 Job executing on host: <132.187.47.22:56421>
001 (1810.002.000) 07/19 15:16:49 Job executing on host: <132.187.47.23:42234>
006 (1810.000.000) 07/19 15:36:56 Image size of job updated: 175720
006 (1810.002.000) 07/19 15:36:57 Image size of job updated: 176452
006 (1810.000.000) 07/19 15:56:56 Image size of job updated: 183552
006 (1810.002.000) 07/19 15:56:57 Image size of job updated: 183468
006 (1810.002.000) 07/19 16:16:57 Image size of job updated: 186476
006 (1810.000.000) 07/19 16:36:56 Image size of job updated: 195932
006 (1810.002.000) 07/19 16:56:57 Image size of job updated: 231472
004 (1810.000.000) 07/20 03:16:51 Job was evicted.
004 (1810.002.000) 07/20 03:16:54 Job was evicted.
001 (1810.000.000) 07/20 09:00:49 Job executing on host: <132.187.47.28:37483>
006 (1810.000.000) 07/20 09:20:57 Image size of job updated: 175752
006 (1810.000.000) 07/20 09:40:58 Image size of job updated: 182140
006 (1810.000.000) 07/20 10:00:58 Image size of job updated: 184376
006 (1810.000.000) 07/20 10:40:58 Image size of job updated: 186228
001 (1810.002.000) 07/20 11:07:43 Job executing on host: <132.187.47.27:46794>
006 (1810.000.000) 07/20 11:20:58 Image size of job updated: 214912
006 (1810.002.000) 07/20 11:27:51 Image size of job updated: 176452
006 (1810.002.000) 07/20 12:07:52 Image size of job updated: 236880
006 (1810.000.000) 07/20 12:20:58 Image size of job updated: 222944
005 (1810.000.000) 07/20 12:57:34 Job terminated.
004 (1810.002.000) 07/20 13:04:30 Job was evicted.
009 (1810.002.000) 07/20 13:04:30 Job was aborted by the user.
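
The 12-hour pattern in the log above can be checked mechanically. The following is a minimal sketch, not a Condor tool: the function name `run_durations` and the regular expression are made up for illustration, and because the log timestamps carry no year, a placeholder year is assumed. It pairs each "Job executing" event (001) with the next "evicted" (004) or "terminated" (005) event for the same job and reports the elapsed hours.

```python
import re
from datetime import datetime

# Matches the start of a user-log event line, e.g.
# "004 (1810.000.000) 07/09 06:09:00 Job was evicted."
EVENT_RE = re.compile(r"^(\d{3}) \((\d+\.\d+)\.\d+\) (\d\d/\d\d \d\d:\d\d:\d\d) ")

def run_durations(lines, year=2000):
    """Map each job (cluster.proc) to the hours between every 'executing'
    event (001) and the following 'evicted' (004) or 'terminated' (005).

    The year is an assumption: the log itself records only month/day.
    """
    started = {}
    durations = {}
    for line in lines:
        m = EVENT_RE.match(line)
        if not m:
            continue  # wrapped continuation lines and blanks are skipped
        code, job, stamp = m.groups()
        when = datetime.strptime(f"{year}/{stamp}", "%Y/%m/%d %H:%M:%S")
        if code == "001":
            started[job] = when  # a restart overwrites the earlier start
        elif code in ("004", "005") and job in started:
            hours = (when - started.pop(job)).total_seconds() / 3600
            durations.setdefault(job, []).append(round(hours, 2))
    return durations

# Four events for job 1810.000 copied from the log above.
sample = """\
001 (1810.000.000) 07/08 18:08:55 Job executing on host: <132.187.47.26:32772>
004 (1810.000.000) 07/09 06:09:00 Job was evicted.
001 (1810.000.000) 07/09 06:15:13 Job executing on host: <132.187.47.21:32816>
004 (1810.000.000) 07/09 18:15:15 Job was evicted.
""".splitlines()

print(run_durations(sample))  # → {'1810.000': [12.0, 12.0]}
```

Each eviction in the sample comes 12 hours after the matching start, which is consistent with the 12-hour limit firing on every run rather than once per job.
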
##  This section contains macros that help write legible
##  expressions:
MINUTE		= 60
HOUR		= (60 * $(MINUTE))
StateTimer	= (CurrentTime - EnteredCurrentState)
ActivityTimer	= (CurrentTime - EnteredCurrentActivity)
ActivationTimer = (CurrentTime - JobStart)
LastCkpt	= (CurrentTime - LastPeriodicCheckpoint)

##  The JobUniverse attribute is just an int.  These macros can be
##  used to specify the universe in a human-readable way:
STANDARD	= 1
PVM		= 4
VANILLA		= 5
MPI		= 8
IsPVM           = (TARGET.JobUniverse == $(PVM))
IsMPI           = (TARGET.JobUniverse == $(MPI))
IsVanilla       = (TARGET.JobUniverse == $(VANILLA))
IsStandard      = (TARGET.JobUniverse == $(STANDARD))

NonCondorLoadAvg	= (LoadAvg - CondorLoadAvg)
BackgroundLoad		= 0.3
HighLoad		= 0.5
StartIdleTime		= 15 * $(MINUTE)
ContinueIdleTime	=  5 * $(MINUTE)
MaxSuspendTime		= 10 * $(MINUTE)
MaxVacateTime		= 10 * $(MINUTE)

KeyboardBusy		= (KeyboardIdle < $(MINUTE))
ConsoleBusy		= (ConsoleIdle  < $(MINUTE))
CPUIdle			= ($(NonCondorLoadAvg) <= $(BackgroundLoad))
CPUBusy			= ($(NonCondorLoadAvg) >= $(HighLoad))
KeyboardNotBusy		= ($(KeyboardBusy) == False)

BigJob		= (TARGET.ImageSize >= (50 * 1024))
MediumJob	= (TARGET.ImageSize >= (15 * 1024) && TARGET.ImageSize < (50 * 1024))
SmallJob	= (TARGET.ImageSize <  (15 * 1024))

JustCPU			= ($(CPUBusy) && ($(KeyboardBusy) == False))
MachineBusy		= ($(CPUBusy) || $(KeyboardBusy))

##  The RANK expression controls which jobs this machine prefers to
##  run over others.  Some examples from the manual include:
##    RANK = TARGET.ImageSize
##    RANK = (Owner == "coltrane") + (Owner == "tyner") \
##                  + ((Owner == "garrison") * 10) + (Owner == "jones")
##  By default, RANK is always 0, meaning that all jobs have an equal
##  ranking.
#RANK			= 0


#####################################################################
##  This is where you choose the configuration that you would like to
##  use.  It has no defaults so it must be defined.  We start this
##  file off with the UWCS_* policy.
######################################################################

##  Also here is what is referred to as the TESTINGMODE_*, which is
##  a quick hardwired way to test Condor.
##  Replace UWCS_* with TESTINGMODE_* if you wish to do testing mode.
##  For example:
##  WANT_SUSPEND 		= $(UWCS_WANT_SUSPEND)
##  becomes
##  WANT_SUSPEND 		= $(TESTINGMODE_WANT_SUSPEND)

WANT_SUSPEND 		= $(Magic_WANT_SUSPEND)
WANT_VACATE		= $(Magic_WANT_VACATE)

##  When is this machine willing to start a job? 
START			= $(Magic_START)

##  When to suspend a job?
SUSPEND			= $(Magic_SUSPEND)

##  When to resume a suspended job?
CONTINUE		= $(Magic_CONTINUE)

##  When to nicely stop a job?
##  (as opposed to killing it instantaneously)
PREEMPT			= $(Magic_PREEMPT)

##  When to instantaneously kill a preempting job
##  (e.g. if a job is in the pre-empting stage for too long)
KILL			= $(Magic_KILL)

PERIODIC_CHECKPOINT	= $(Magic_PERIODIC_CHECKPOINT)
PREEMPTION_REQUIREMENTS	= $(Magic_PREEMPTION_REQUIREMENTS)
PREEMPTION_RANK		= $(Magic_PREEMPTION_RANK)
NEGOTIATOR_PRE_JOB_RANK = $(Magic_NEGOTIATOR_PRE_JOB_RANK)
NEGOTIATOR_POST_JOB_RANK = $(Magic_NEGOTIATOR_POST_JOB_RANK)
MaxJobRetirementTime    = $(Magic_MaxJobRetirementTime)

#####################################################################
## This is the Magic Configuration.
#####################################################################
Magic_WANT_SUSPEND	= False
Magic_WANT_VACATE 	= False

# Only start jobs if:
# always
# (NOTE: Condor will only run 1 job at a time on a given resource.
# The reasons Condor might consider running a different job while
# already running one are machine Rank (defined above), and user
# priorities.)
Magic_START	= True

# Suspend jobs if:
# Never!
Magic_SUSPEND = False

# Continue jobs if:
# Always
Magic_CONTINUE = True

# Preempt jobs if they have taken too long 
#Magic_PREEMPT= False
Magic_PREEMPT= ( $(ActivationTimer) > 12 * $(HOUR) )


# Maximum time (in seconds) to wait for a job to finish before kicking
# it off (due to PREEMPT, a higher priority claim, or the startd
# gracefully shutting down).  This is computed from the time the job
# was started, minus any suspension time.  Once the retirement time runs
# out, the usual preemption process will take place.  The job may
# self-limit the retirement time to _less_ than what is given here.
# By default, nice user jobs and standard universe jobs set their
# MaxJobRetirementTime to 0, so they will usually not wait in retirement.

Magic_MaxJobRetirementTime = 0

# Kill jobs if they have taken too long to vacate gracefully
Magic_KILL = (  $(ActivationTimer) > 12 * $(HOUR) \
               || TARGET.ImageSize > 1024 * 1024)

##  Only define vanilla versions of these if you want to make them
##  different from the above settings.
#SUSPEND_VANILLA  = ( $(KeyboardBusy) || \
#       ((CpuBusyTime > 2 * $(MINUTE)) && $(ActivationTimer) > 90) )
#CONTINUE_VANILLA = ( $(CPUIdle) && ($(ActivityTimer) > 10) \
#                     && (KeyboardIdle > $(ContinueIdleTime)) )
#PREEMPT_VANILLA  = ( ((Activity == "Suspended") && \
#                     ($(ActivityTimer) > $(MaxSuspendTime))) \
#                     || (SUSPEND_VANILLA && (WANT_SUSPEND == False)) )
#KILL_VANILLA    = $(ActivityTimer) > $(MaxVacateTime)

##  We use a simple Periodic checkpointing mechanism, but then
##  again we have a very fast network.
Magic_PERIODIC_CHECKPOINT	= $(LastCkpt) > (3 * $(HOUR))

##  You might want to checkpoint a little less often.  A good
##  example of this is below.  For jobs smaller than 60 megabytes, we
##  periodic checkpoint every 6 hours.  For larger jobs, we only
##  checkpoint every 12 hours.
#Magic_PERIODIC_CHECKPOINT	= ( (TARGET.ImageSize < 60000) && \
#			    ($(LastCkpt) > (6 * $(HOUR))) ) || \ 
#			  ( $(LastCkpt) > (12 * $(HOUR)) )

##  The rank expressions used by the negotiator are configured below.
##  This is the order in which ranks are applied by the negotiator:
##    1. NEGOTIATOR_PRE_JOB_RANK
##    2. rank in job ClassAd
##    3. NEGOTIATOR_POST_JOB_RANK
##    4. cause of preemption (0=user priority,1=startd rank,2=no preemption)
##    5. PREEMPTION_RANK

##  The NEGOTIATOR_PRE_JOB_RANK expression overrides all other ranks
##  that are used to pick a match from the set of possibilities.
##  The following expression matches jobs to unclaimed resources
##  whenever possible, regardless of the job-supplied rank.
Magic_NEGOTIATOR_PRE_JOB_RANK = RemoteOwner =?= UNDEFINED

##  The NEGOTIATOR_POST_JOB_RANK expression chooses between
##  resources that are equally preferred by the job.
##  The following example expression steers jobs toward
##  faster machines and tends to fill a cluster of multi-processors
##  breadth-first instead of depth-first.  In this example,
##  the expression is chosen to have no effect when preemption
##  would take place, allowing control to pass on to
##  PREEMPTION_RANK.
#Magic_NEGOTIATOR_POST_JOB_RANK = \
# (RemoteOwner =?= UNDEFINED) * (KFlops - VirtualMachineID)

##  The negotiator will not preempt a job running on a given machine
##  unless the PREEMPTION_REQUIREMENTS expression evaluates to true
##  and the owner of the idle job has a better priority than the owner
##  of the running job.  This expression defaults to true.
Magic_PREEMPTION_REQUIREMENTS = False

##  The PREEMPTION_RANK expression is used in a case where preemption
##  is the only option and all other negotiation ranks are equal.  For
##  example, if the job has no preference, it is usually preferable to
##  preempt a job with a small ImageSize instead of a job with a large
##  ImageSize.  The default is to rank all preemptable matches the
##  same.  However, the negotiator will always prefer to match the job
##  with an idle machine over a preemptable machine, if all other
##  negotiation ranks are equal.
Magic_PREEMPTION_RANK = 0
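
For readers following the timing macros in the config above, the arithmetic behind Magic_PREEMPT and Magic_KILL can be sketched in a few lines of Python. This is only an illustration of the expressions as written, not how Condor evaluates ClassAds; the function names are hypothetical, times are assumed to be in seconds, and ImageSize is assumed to be in the same units as the "Image size of job updated" log entries.

```python
MINUTE = 60
HOUR = 60 * MINUTE  # 3600 seconds, as in the config macros

def activation_timer(current_time, job_start):
    # Mirrors ActivationTimer = (CurrentTime - JobStart)
    return current_time - job_start

def magic_preempt(current_time, job_start):
    # Mirrors Magic_PREEMPT = ( $(ActivationTimer) > 12 * $(HOUR) )
    return activation_timer(current_time, job_start) > 12 * HOUR

def magic_kill(current_time, job_start, image_size):
    # Mirrors Magic_KILL: the same 12-hour test, or a very large image
    return (activation_timer(current_time, job_start) > 12 * HOUR
            or image_size > 1024 * 1024)

limit = 12 * HOUR
print(limit)                              # → 43200
print(magic_preempt(limit, 0))            # → False (exactly 12 h is not "greater")
print(magic_preempt(limit + 1, 0))        # → True
print(magic_kill(limit + 1, 0, 175752))   # → True
```

Both expressions use the same 12-hour threshold, so they become true at the same moment. The sketch models only the two expressions; it does not model what happens to the job after it leaves the machine.
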