
[Condor-users] job runs twice



I just saw something strange and am curious whether anybody knows why it would have happened.  I have a job that appears to have been run twice by Condor, even though the first run seems to have completed normally.  The job is launched by wrapper scripts, and I thought one of them might have seen bad output and resubmitted it, but both runs have the same cluster/job ID and nothing in the log indicates a resubmission.  I can't reproduce it or see a pattern.

It's a vanilla universe job submitted from a Linux machine, flocked to another pool, and run on a Linux machine there.

 

Here’s the log…

000 (24305.000.000) 10/18 15:29:23 Job submitted from host: <x.y.z.14:50865>
...
001 (24305.000.000) 10/18 15:34:22 Job executing on host: <a.b.c.138:32775>
...
006 (24305.000.000) 10/18 15:34:30 Image size of job updated: 28808
...
005 (24305.000.000) 10/18 15:42:40 Job terminated.
        (1) Normal termination (return value 0)
                Usr 0 00:05:13, Sys 0 00:00:00  -  Run Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
                Usr 0 00:05:13, Sys 0 00:00:00  -  Total Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
        1989  -  Run Bytes Sent By Job
        8242  -  Run Bytes Received By Job
        1989  -  Total Bytes Sent By Job
        8242  -  Total Bytes Received By Job
...
001 (24305.000.000) 10/18 16:25:00 Job executing on host: < a.b.c.112:32778>
...
006 (24305.000.000) 10/18 16:25:08 Image size of job updated: 27176
...
005 (24305.000.000) 10/18 16:33:22 Job terminated.
        (1) Normal termination (return value 0)
                Usr 0 00:05:18, Sys 0 00:00:00  -  Run Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
                Usr 0 00:05:18, Sys 0 00:00:00  -  Total Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
        1989  -  Run Bytes Sent By Job
        8242  -  Run Bytes Received By Job
        1989  -  Total Bytes Sent By Job
        8242  -  Total Bytes Received By Job
...
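
For anyone wanting to check a user log for the same symptom, a rough Python sketch along these lines might help. It assumes event headers look exactly like the ones above (a three-digit event code, then "(cluster.proc.subproc)", then date and time) and simply counts 001 (executing) and 005 (terminated) events per job. The names count_events, jobs_run_more_than_once and SAMPLE_LOG are made up for this illustration and are not part of Condor.

```python
import re
from collections import defaultdict

# Event header format assumed from the log excerpt above:
# "NNN (cluster.proc.subproc) MM/DD HH:MM:SS text"
EVENT_RE = re.compile(
    r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\) (\d\d/\d\d) (\d\d:\d\d:\d\d) (.*)$"
)

EXECUTE = "001"     # "Job executing on host"
TERMINATED = "005"  # "Job terminated."


def count_events(lines):
    """Return {(cluster, proc): {event_code: count}} for all event headers."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in lines:
        m = EVENT_RE.match(line.strip())
        if m:
            code = m.group(1)
            job = (int(m.group(2)), int(m.group(3)))
            counts[job][code] += 1
    return counts


def jobs_run_more_than_once(lines):
    """Return the sorted jobs that logged more than one termination event."""
    counts = count_events(lines)
    return sorted(job for job, c in counts.items() if c[TERMINATED] > 1)


# Hypothetical sample: the event headers from the log above, details trimmed.
SAMPLE_LOG = """\
000 (24305.000.000) 10/18 15:29:23 Job submitted from host: <x.y.z.14:50865>
...
001 (24305.000.000) 10/18 15:34:22 Job executing on host: <a.b.c.138:32775>
...
005 (24305.000.000) 10/18 15:42:40 Job terminated.
        (1) Normal termination (return value 0)
...
001 (24305.000.000) 10/18 16:25:00 Job executing on host: < a.b.c.112:32778>
...
005 (24305.000.000) 10/18 16:33:22 Job terminated.
        (1) Normal termination (return value 0)
...
"""

if __name__ == "__main__":
    print(jobs_run_more_than_once(SAMPLE_LOG.splitlines()))  # prints [(24305, 0)]
```

This only detects the repeat; it says nothing about why the job was started a second time.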

 

Michael.