Re: [Condor-users] DAG questions




One thing to think about is upgrading from a 7.0.x DAGMan to a newer
version -- 7.4.0 (or wait for 7.4.1 if you're on Windows).  You can

We're on 7.2.4 right now. We don't do upgrades on Fridays, but will upgrade to 7.4 on Monday.

Thanks for the -output_dir and -debug pointers -- I read the DAG documentation, but not the command details (man page) and missed seeing them. I was expecting they'd be config file params.

Now I'm more confused about my situation. I've set up a much smaller run with only 3500 nodes in the DAG, but I'm still getting PANIC messages due to lack of file descriptors. An identical submission with only 40 nodes works fine, so I feel that rules out my general configuration and points to either an OS issue or a Condor config issue. I've completely stopped all Condor processes and restarted them.

The only other thing I can currently think of doing is rebooting the machine (not desirable -- it is our OSG CE and SE, plus web server), unless someone has a suggestion for a config change (Condor or OS) that could help.
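For checking the OS side, here is a minimal Python sketch (not part of Condor; the function names are just illustrative) that reports the per-process file descriptor limits and, on Linux, counts the descriptors a process currently has open by listing /proc/<pid>/fd. It assumes a Linux-style /proc filesystem and returns None where that is not available or not readable.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count(pid="self"):
    """Count open descriptors for a process via /proc (Linux only).

    Returns None if /proc/<pid>/fd cannot be read, e.g. on a non-Linux
    system, for a nonexistent PID, or without sufficient permissions.
    """
    path = "/proc/%s/fd" % pid
    try:
        return len(os.listdir(path))
    except OSError:
        return None


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("soft limit:", soft, "hard limit:", hard)
    print("descriptors open in this process:", open_fd_count())
```

Passing the PID of the running condor_dagman process to open_fd_count() (as the same user or as root) would show how close it gets to the soft limit before the PANIC.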

Ian

12/18 15:07:36 argv[12] == "$CondorVersion: 7.2.4 Jun 15 2009 BuildID: 159529 $"
12/18 15:07:40 Duplicate DAGMan PID 487 is no longer alive; this DAGMan should continue.
12/18 15:07:40 Sleeping for 12 seconds to ensure ProcessId uniqueness
12/18 15:07:52 Running in RECOVERY mode...
**** PANIC -- OUT OF FILE DESCRIPTORS at line 796 in dprintf.c

--
Ian Stokes-Rees, Research Associate
SBGrid, Harvard Medical School
http://sbgrid.org