Re: [Condor-users] why my submitted job runs for 2 mins and gets suspended and unsuspended every 10 mins

Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab

On 26/04/07, VIT Students < vit.gridproject@xxxxxxxxx> wrote:

Hi,

I have set up Condor on 2 machines with NFS sharing (Linux Fedora Core 3). The installation and configuration went perfectly. But when I submit a simple example job, sh_loop.cmd, it executes for a couple of minutes, then gets suspended, after some time gets unsuspended again, and is then evicted. Can anyone please help me out with this? I am showing the sh_loop.log file here; I hope it is of some help. This log file is from a single machine. I am facing the same problem with both single and multiple machines.

000 (002.000.000) 04/25 20:02:23 Job submitted from host: <127.0.0.1:32878>
...
001 (002.000.000) 04/25 20:02:25 Job executing on host: <127.0.0.1:32877>
...
010 (002.000.000) 04/25 20:02:30 Job was suspended.
    Number of processes actually suspended: 2
...
011 (002.000.000) 04/25 20:12:31 Job was unsuspended.
...
004 (002.000.000) 04/25 20:12:32 Job was evicted.
    (0) Job was not checkpointed.
        Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
    0 - Run Bytes Sent By Job
    0 - Run Bytes Received By Job
...
001 (002.000.000) 04/25 20:22:29 Job executing on host: <127.0.0.1:32877>
...
010 (002.000.000) 04/25 20:22:33 Job was suspended.
    Number of processes actually suspended: 2
...
011 (002.000.000) 04/25 20:32:34 Job was unsuspended.
...
004 (002.000.000) 04/25 20:32:34 Job was evicted.
    (0) Job was not checkpointed.
        Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
    0 - Run Bytes Sent By Job
    0 - Run Bytes Received By Job
...
001 (002.000.000) 04/25 20:42:28 Job executing on host: <127.0.0.1:32877>
...
010 (002.000.000) 04/25 20:42:33 Job was suspended.
    Number of processes actually suspended: 2
...
011 (002.000.000) 04/25 20:52:36 Job was unsuspended.
...
004 (002.000.000) 04/25 20:52:36 Job was evicted.
    (0) Job was not checkpointed.
        Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
    0 - Run Bytes Sent By Job
    0 - Run Bytes Received By Job
...
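
To make the timing pattern in the log above easier to read, here is a minimal Python sketch that pulls the event header lines out of a job log and prints the seconds between consecutive events. It is only an illustration: the names `EVENT_RE`, `parse_events` and `intervals` are made up for this example, the regular expression assumes the header layout seen in the log above, and the year 2007 is an assumption taken from the date of the post, since the log lines carry no year.

```python
import re
from datetime import datetime

# Assumed header layout, as seen in the log above:
#   <3-digit event code> (<cluster>.<proc>.<subproc>) MM/DD HH:MM:SS <message>
EVENT_RE = re.compile(
    r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\) (\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$"
)

def parse_events(text, year=2007):
    """Return (code, timestamp, message) for every event header line.

    Indented detail lines and the "..." separators do not match the
    pattern and are skipped. The year is an assumption (not in the log).
    """
    events = []
    for line in text.splitlines():
        m = EVENT_RE.match(line)
        if m:
            code, stamp, message = m.group(1), m.group(5), m.group(6)
            when = datetime.strptime(f"{year}/{stamp}", "%Y/%m/%d %H:%M:%S")
            events.append((code, when, message))
    return events

def intervals(events):
    """Return (previous message, next message, seconds between them)."""
    return [
        (a[2], b[2], int((b[1] - a[1]).total_seconds()))
        for a, b in zip(events, events[1:])
    ]

# First run/suspend/evict cycle, copied from the log above.
SAMPLE = """\
001 (002.000.000) 04/25 20:02:25 Job executing on host: <127.0.0.1:32877>
...
010 (002.000.000) 04/25 20:02:30 Job was suspended.
    Number of processes actually suspended: 2
...
011 (002.000.000) 04/25 20:12:31 Job was unsuspended.
...
004 (002.000.000) 04/25 20:12:32 Job was evicted.
"""

if __name__ == "__main__":
    for before, after, secs in intervals(parse_events(SAMPLE)):
        print(f"{secs:5d} s  {before} -> {after}")
```

Run on the first cycle, this reports 5 s between "executing" and "suspended", 601 s between "suspended" and "unsuspended", and 1 s between "unsuspended" and "evicted". The same spacing repeats in the two later cycles, so in this log the job is suspended a few seconds after it starts, not after a couple of minutes.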

