
[Condor-users] FileLock::obtain(1) failed - errno 9 (Bad file descriptor)



Hi,

	We have a problem with DAGs. When submitting DAG jobs with maxjobs, the maximum number of jobs is submitted and the DAG terminates abnormally immediately after that.
	Of course, no more jobs from that DAG are submitted afterwards.

The only error found in the logs is:

3/9 11:48:58 Successfully created sched universe process
3/9 11:48:58 FileLock::obtain(1) failed - errno 9 (Bad file descriptor) 
3/9 11:48:58 FileLock::obtain(2) failed - errno 9 (Bad file descriptor) 
3/9 11:48:58 Starting add_shadow_birthdate(7647.0) 
3/9 11:48:58 Called reschedule_negotiator()
3/9 11:48:58 Return from HandleReq <reschedule_negotiator>

strace shows that condor_dagman terminates with a segmentation fault.

	Regular jobs work fine.
	We were using 6.8.2 and upgraded to 7.0.1 on Deb 3 (Linux-RHEL3), but the problem persists.


Thanks,
Eddie