
Re: [Condor-users] Problems with Condor-C in 7.0.4 [Sec=Unclassified]



On Jul 31, 2008, at 8:24 PM, Troy Robertson wrote:

Thought I would give this another spin as I had no takers last time.
Can anyone else at least confirm or deny my problem with Condor-C on 7.0.4
between Windows submit and Linux central-manager/execute?

-------------------------------------------
Hi all,

I have been having problems with Condor-C for a while now.

With 7.0.3, if I submitted more than one job, the jobs would execute but
then sit on the central manager of the remote pool and not be returned
until I removed the first job. This was found to be a bug and was fixed
in 7.0.4.

I have now upgraded to 7.0.4, but now if I submit a job from a Windows
Personal Pool to the remote Linux pool, a new remote job is created by
the central manager every negotiation cycle but is not run. They stay
on hold and just accumulate, one new job for every cycle, until there are
heaps of them for the one submitted job.

Neither the local GridManager logs nor the remote central manager logs
seem to indicate why this might be the case.

If I try to 'Release' one of the held remote jobs, it goes back on hold.
condor_q -analyze then indicates that it 'Cannot access initial working
directory' on the local submit machine, but I suspect this is a furphy, as
a straight Condor job submitted from the same machine runs fine.

Can anyone please help me run this down?


Ouch. Can you send me the gridmanager log from your submit machine?

Thanks and regards,
Jaime Frey
UW-Madison Condor Team