
[Condor-users] notification on error



Hello all,

 

Thanks in advance for any input. With the following submit file I am trying to avoid black-hole machines and to keep failed jobs from restarting indefinitely. To accomplish that, I use the following recipes from https://condor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToAdminRecipes:

 

UNIVERSE = VANILLA

job_machine_attrs = Machine

job_machine_attrs_history_length = 5

BlackholeMachine = (target.machine =!= MachineAttrMachine1 && target.machine =!= MachineAttrMachine2)

Arch = (Arch == "INTEL" || Arch == "X86_64")

Windows = (OpSys == "WINDOWS" || OpSys == "WINNT51" || OpSys == "WINNT52" || OpSys == "WINNT60" || OpSys == "WINNT61")

REQUEST_MEMORY = 500

REQUIREMENTS  = $(Arch) && $(Windows) && $(BlackholeMachine) && Target.Memory>=RequestMemory && NumJobStarts == 0

GETENV = TRUE

ON_EXIT_REMOVE = (ExitBySignal == FALSE) && (ExitCode == 0)

PERIODIC_REMOVE = JobStatus == 1 && NumJobStarts > 0

NOTIFICATION = Error

 

Jobs failing due to a signal but get e-mails. When I set up the notification, and it was working, the version was 7.8.0. I recently upgraded to 7.8.4. My log files now show the following lines:

 

004 (12620.000.000) 10/17 10:04:21 Job was evicted.

                (0) Job terminated and was requeued

                                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage

                                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage

                0  -  Run Bytes Sent By Job

                1258214  -  Run Bytes Received By Job

                (1) Normal termination (return value -1073741629)

                The job attribute OnExitRemove expression '( ExitBySignal == false ) && ( ExitCode == 0 )' evaluated to FALSE

...

009 (12620.000.000) 10/17 10:05:11 Job was aborted by the user.

                The job attribute PeriodicRemove expression 'JobStatus == 1 && NumJobStarts > 0' evaluated to TRUE

...
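
The return value in the log above is a large negative number, which on Windows is often easier to look up in hexadecimal. As a minimal sketch (the helper name to_unsigned_hex is hypothetical and not part of Condor), the signed value can be reinterpreted as an unsigned 32-bit number like this:

```python
def to_unsigned_hex(exit_code: int, bits: int = 32) -> str:
    """Reinterpret a signed exit code as unsigned and format it as hex.

    Hypothetical helper for illustration only; it assumes the exit code
    is a 32-bit value, as is usual for Windows process exit codes.
    """
    mask = (1 << bits) - 1          # 0xFFFFFFFF for 32 bits
    return f"0x{exit_code & mask:08X}"


if __name__ == "__main__":
    # Return value taken from the log excerpt above.
    print(to_unsigned_hex(-1073741629))
```

The resulting hex value can then be searched for in the Windows status-code documentation; what the code means for this particular job is not established here.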

 

Did something change in version 7.8.4?

 

Respectfully,

 

Alex Alas