[Condor-users] Memory issues when running condor jobs:
- Date: Wed, 15 Oct 2008 00:22:12 -0500
- From: Yeye He <heyeye@xxxxxxxxxxx>
- Subject: [Condor-users] Memory issues when running condor jobs:
I am running a very memory-intensive job via Condor, and every time its virtual memory size grows beyond about 1.6 GB, the job is evicted and never picked up again (condor_q -analyze shows that all machines that qualify to run the program reject it). I understand this may have something to do with each machine's local job policy: when the image size of my job exceeds a certain threshold, the job gets evicted.
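For what it's worth, I gather one partial workaround might be to add an explicit memory requirement to my submit file so the job only matches machines advertising enough RAM (I believe the machine ClassAd attribute Memory is physical RAM in MB; the file names here are made up):

```
# Hypothetical submit-file fragment: only match machines that
# advertise at least ~3 GB of physical RAM (Memory is in MB).
universe     = vanilla
executable   = my_program
requirements = (Memory >= 3000)
queue
```

Of course, this only affects matchmaking; it would not stop a machine's local PREEMPT policy from evicting the job once its image size grows.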
One obvious solution on my end is to limit the memory footprint of the program. But the program implements a "naive" approach that someone proposed in a paper, and we are proposing something else to beat it. We are trying to gather data points on a non-trivial dataset to show that we indeed beat it in terms of both efficiency and quality. The problem is that the "naive" approach's memory footprint can easily exceed 2-3 GB, which my 32-bit workstation cannot handle due to address-space limits (I get allocation errors). I don't have access to a 64-bit machine, which is why I am using Condor in the first place.
I am just wondering: without rewriting the current implementation (e.g., by moving some data to disk and removing it from the in-memory structures), is there any workaround for this problem?
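For concreteness, the kind of rewrite I am hoping to avoid would look something like the following minimal Python sketch, using the standard-library shelve module to keep values on disk instead of in memory (the data and key names are made up, not from my actual program):

```python
import os
import shelve
import tempfile

# Spill a large mapping to an on-disk store instead of holding it in RAM.
# shelve keys must be strings; values are pickled transparently.
path = os.path.join(tempfile.mkdtemp(), "spill.db")
store = shelve.open(path)

for i in range(1000):          # stand-in for the real dataset
    store[str(i)] = [i] * 10   # each value lives on disk, not in memory

store.sync()                   # flush pending writes to disk
print(store["42"][:3])         # values are read back on demand
store.close()
```

Restructuring the real in-memory data structures this way is exactly the effort I would like to sidestep if Condor offers a configuration-level workaround.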
Any suggestions are highly appreciated!