
[Condor-users] Too many popen() calls in DAGMan ?



Dear all,

I'm using DAGMan to run a set of simulations
with different parameters. DAGMan worked well
with a small set of simulations, but when I tried
to run a larger set, it stopped with an error
message like the following in its .dagman.out file:

>9/6 00:45:28 Submitting Condor Job f1s5v13t ...
>9/6 00:45:28 submitting: condor_submit  -a 'dag_node_name = f1s5v13t' -a '+DAGManJobID = 17168' -a 'submit_event_notes = DAG Node: f1s5v13t' -a 'currname = frame1' -a 'prevname = frame0' -a 'ndx = group.ndx' -a '+DAGParentNodeNames = "f0s5v13"' SAMPLE5/VDW13/tpbconv.submit 2>&1
>9/6 00:45:28 condor_submit  -a 'dag_node_name = f1s5v13t' -a '+DAGManJobID = 17168' -a 'submit_event_notes = DAG Node: f1s5v13t' -a 'currname = frame1' -a 'prevname = frame0' -a 'ndx = group.ndx' -a '+DAGParentNodeNames = "f0s5v13"' SAMPLE5/VDW13/tpbconv.submit 2>&1: popen() in submit_try failed!
>9/6 00:45:28 ERROR: submit attempt failed

So I guess that my simulations cause DAGMan to create
too many processes through its popen() calls.
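
One way to see how often this failure occurs is to scan the .dagman.out file for the two message strings quoted above. The following is only a minimal sketch: the function name count_submit_failures is hypothetical, the sample lines are abbreviated copies of the log excerpt, and it assumes every failure is reported with exactly the text "popen() in submit_try failed!".

```python
def count_submit_failures(lines):
    """Return (attempts, popen_failures) for an iterable of .dagman.out lines."""
    attempts = 0
    popen_failures = 0
    for line in lines:
        # DAGMan logs this line each time it starts submitting a node.
        if "Submitting Condor Job" in line:
            attempts += 1
        # This is the failure text seen in the excerpt above.
        if "popen() in submit_try failed!" in line:
            popen_failures += 1
    return attempts, popen_failures


# Abbreviated sample lines (the condor_submit arguments are shortened here).
sample = [
    "9/6 00:45:28 Submitting Condor Job f1s5v13t ...",
    "9/6 00:45:28 condor_submit ... 2>&1: popen() in submit_try failed!",
    "9/6 00:45:28 ERROR: submit attempt failed",
]
print(count_submit_failures(sample))
```

For a real run, the lines of the actual .dagman.out file can be passed in place of the sample list, for example by opening the file and handing the file object to the function.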

Specifically, I'm trying to run 396 molecular dynamics
simulations. Each simulation is divided into 20 time frames
so that an analysis program can be run after each time frame.
Hence my .dag file has 16341 nodes
( = (simulations + analyses) * time frames + additional analyses),
and 396 simulations and 396 analysis programs are submitted
simultaneously to a CONDOR pool with 128 CPUs.
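
The node count can be checked with a few lines of arithmetic. This is a small sketch using only the figures stated above; the number of "additional analysis" nodes is not given in the post, so it is inferred here as the remainder, and the variable names are illustrative.

```python
# Figures stated in the post.
simulations = 396
analyses = 396       # one analysis program per simulation
time_frames = 20
total_nodes = 16341

# Nodes that repeat once per time frame.
per_frame_nodes = (simulations + analyses) * time_frames

# Remainder attributed to the additional analysis nodes (inferred, not stated).
additional_nodes = total_nodes - per_frame_nodes

print(per_frame_nodes, additional_nodes)  # prints: 15840 501
```

So 15840 of the nodes come from the per-frame simulation and analysis jobs, which leaves 501 nodes for the additional analysis.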

Could anybody please tell me whether a set of simulations
of this size can exceed the limits of DAGMan? Or does the
older version of DAGMan in CONDOR 6.7.14 create more processes
than the latest version? (This older version is the one
installed on our system.)

I'm wondering whether I should inspect something else
in the CONDOR log files, or ask our system administrator
to update our CONDOR installation to the latest version.

I'd be very grateful for any hint, advice, or comments.

Thanks in advance.

-- 
Masakatsu Ito 
Nanotechnology Research Center
FUJITSU LABORATORIES LTD.