----- Original Message ----
From: R. Kent Wenger <wenger@xxxxxxxxxxx>
To: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
Sent: Thursday, 10 July 2008, 19:06:16
Subject: Re: [Condor-users] Dagman job cannot start second node
On Mon, 7 Jul 2008, Vigilant Lionel wrote:
We are running on Condor 7.0.1.
I want to use DAG jobs, so I tested with two simple C++ programs:
un submit file:
Universe = standard
Executable = un
Log = un.log
Output = un.out
Error = un.err
Arguments = 35
Queue
7/7 11:24:37 Bootstrapping...
7/7 11:24:37 Number of pre-completed nodes: 0
7/7 11:24:37 Running in RECOVERY mode...
7/7 11:25:37 FileLock::obtain(1) failed - errno 5 (Input/output error)
7/7 11:25:37 ERROR "Assertion ERROR on (m_is_locked)" at line 1125 in file read_user_log.C
(various details removed above)
Okay, my first question is whether un.log is on a shared filesystem. If
so, is it possible to move it to a place that's on a local disk on your
submit machine?
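
One way to answer the shared-filesystem question on a Linux submit machine is to look up which mount point holds un.log. The following is only a minimal sketch: the function names, the list of "shared" filesystem types, and the example paths are all assumptions made for illustration, and it relies on the Linux-specific /proc/mounts file (mount points containing spaces are not handled).

```python
import os

# Filesystem types treated as shared/network here. This set is an assumption;
# extend it to match the filesystems used at your site.
NETWORK_FS_TYPES = {"nfs", "nfs4", "afs", "cifs", "smbfs", "lustre", "gpfs"}


def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Field layout: device, mount point, filesystem type, options, ...
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the longest mount point containing path."""
    path = os.path.abspath(path) + "/"
    best_mount, best_type = "", None
    for mount_point, fs_type in parse_mounts(mounts_text):
        prefix = mount_point.rstrip("/") + "/"
        # ">=" lets a later mount on the same mount point override an earlier one.
        if path.startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def is_on_shared_fs(path, mounts_text=None):
    """Return True if path appears to live on a network filesystem."""
    if mounts_text is None:
        with open("/proc/mounts") as f:
            mounts_text = f.read()
    return fs_type_for(path, mounts_text) in NETWORK_FS_TYPES


if __name__ == "__main__":
    # "un.log" is the user log named in the submit file above.
    print(is_on_shared_fs("un.log"))
```

If this reports True for un.log, pointing the Log entry of the submit file at a directory on a local disk is the safer first thing to try.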
You *should* also be able to work around this (somewhat dangerously) by
setting ENABLE_USERLOG_LOCKING to false in your configuration, but we just
found a bug with that, which is probably in 7.0.1 (it's known to be in
7.0.2). The fix should be in 7.0.4.
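
For reference, the workaround described above would look roughly like the fragment below in the Condor configuration on the submit machine. The macro name comes from this message; placing it in a local configuration file is an assumption about the setup, and given the bug mentioned above the setting may have no effect before 7.0.4.

```
# Local Condor configuration on the submit machine (file location is an assumption).
# Disables locking of the user log; somewhat dangerous, as noted above.
ENABLE_USERLOG_LOCKING = False
```

DAGMan most likely picks up its configuration when it starts, so the DAG would probably need to be resubmitted after the change.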
Kent Wenger
Condor Team