Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


[Condor-users] out of order condor-g failures in log file

Date: Fri, 9 Apr 2010 18:03:19 -0400
From: Peter Doherty <doherty@xxxxxxxxxxxxxxxxxxx>
Subject: [Condor-users] out of order condor-g failures in log file

I've got a large DAG running with Condor-G on Open Science Grid.

All the nodes are using one log file, and I'm seeing a lot of errors like this:

000 (887642.000.000) 04/09 17:04:23 Job submitted from host: <10.0.10.39:54286>
   DAG Node: 3cb5-00257

018 (887642.000.000) 04/09 17:04:55 Globus job submission failed!
   Reason: 22 the job manager failed to create an internal script argument file

017 (887642.000.000) 04/09 17:05:14 Job submitted to Globus
   RM-Contact: gridgk01.racf.bnl.gov/jobmanager-condor
   JM-Contact: https://gridgk01.racf.bnl.gov:20908/26140/1270847110/
   Can-Restart-JM: 1
...
027 (887642.000.000) 04/09 17:05:14 Job submitted to grid resource
   GridResource: gt2 gridgk01.racf.bnl.gov/jobmanager-condor
   GridJobId: gt2 gridgk01.racf.bnl.gov/jobmanager-condor https://gridgk01.racf.bnl.gov:20908/26140/1270847110/

Shouldn't event 018 (Globus submission failed) be logged after event 017 (job submitted to Globus), not before it?

Some of these jobs then manage to execute and complete, although I have to look further to see if they are running successfully.
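
To see how many jobs show this ordering, the shared log can be scanned for an 018 event that appears before any 017 event for the same job ID. The following is only a minimal sketch, not a Condor tool: the regular expression assumes the event header layout seen in the excerpt above (event code, job ID in parentheses, date, time, text), and the names `parse_events` and `failures_before_submit` are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed header layout, taken from the excerpt above:
#   "<code> (<cluster.proc.subproc>) <MM/DD HH:MM:SS> <text>"
EVENT_HEADER = re.compile(
    r"^(\d{3}) \((\d+\.\d+\.\d+)\) (\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$"
)


def parse_events(log_text):
    """Return (job_id, code, timestamp, text) tuples in file order."""
    events = []
    for line in log_text.splitlines():
        match = EVENT_HEADER.match(line)
        if match:  # indented detail lines and "..." separators are skipped
            code, job_id, stamp, text = match.groups()
            events.append((job_id, code, stamp, text))
    return events


def failures_before_submit(events):
    """List (job_id, timestamp) for each 018 logged before any 017 of that job."""
    seen_submit = defaultdict(bool)
    flagged = []
    for job_id, code, stamp, _text in events:
        if code == "017":
            seen_submit[job_id] = True
        elif code == "018" and not seen_submit[job_id]:
            flagged.append((job_id, stamp))
    return flagged


# Event headers copied from the excerpt above (detail lines omitted).
SAMPLE = """\
000 (887642.000.000) 04/09 17:04:23 Job submitted from host: <10.0.10.39:54286>
018 (887642.000.000) 04/09 17:04:55 Globus job submission failed!
017 (887642.000.000) 04/09 17:05:14 Job submitted to Globus
027 (887642.000.000) 04/09 17:05:14 Job submitted to grid resource
"""

if __name__ == "__main__":
    for job_id, stamp in failures_before_submit(parse_events(SAMPLE)):
        print(f"{job_id}: 018 at {stamp} precedes any 017")
```

To run it against a real log, read the file contents into a string and pass that to `parse_events` in place of `SAMPLE`. The check relies on file order rather than timestamps, since the timestamps in the log have no year and only one-second resolution.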


Best,
Peter

Follow-Ups:
- Re: [Condor-users] out of order condor-g failures in log file
  - From: Alan De Smet
