[Condor-users] java universe and 32-bit vs 64-bit JVMs

Here's our setup: we're running a grid of 18 blades, each with two
quad-core processors and 16 GB of memory.  Each core is a compute node,
so we have 8 nodes per blade, which should yield approximately 2 GB of
memory per execute node.

Problem:  With a 64-bit JVM, we see all 16 GB of memory utilized, and
all jobs run fine.  In this configuration, the min and max heap sizes
are both set to 1906 MB.  With a 32-bit JVM, we see utilization of about
5-6 GB.  In this configuration, the min and max heap sizes are both set
to 1024 MB.  With the 32-bit JVM, 8 jobs start per blade, but after a
couple of minutes, 2 or 3 of those jobs get suspended.  They remain
suspended for 10 minutes and then get evicted (I know this 10-minute
interval is configurable).  If we kick off a run of 50 jobs, over 30 end
up getting evicted and restarting.  We've tried setting the heap size to
1906 MB in the 32-bit configuration, but then not all of the jobs would
start.
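
For reference, the per-node arithmetic behind these numbers can be
sketched as below.  This is only a rough illustration: the class and
method names are made up for the example, it assumes 16 GB means
16384 MB, and it assumes the memory is split evenly across the 8 nodes
with nothing set aside for the OS.

```java
public class SlotMemoryBudget {
    // Memory available to each execute node when the blade's RAM is
    // divided evenly (assumption: no memory reserved for the OS).
    static long perSlotMb(long totalMb, int slots) {
        return totalMb / slots;
    }

    // Memory left in a node's share for non-heap JVM usage (thread
    // stacks, native libraries, JIT-compiled code) once the heap is reserved.
    static long headroomMb(long perSlotMb, long heapMb) {
        return perSlotMb - heapMb;
    }

    public static void main(String[] args) {
        long totalMb = 16L * 1024; // 16 GB per blade, taken as 16384 MB
        int slots = 8;             // 8 execute nodes per blade
        long perSlot = perSlotMb(totalMb, slots);

        System.out.println("per-node memory: " + perSlot + " MB");
        System.out.println("headroom with 1906 MB heap: "
                + headroomMb(perSlot, 1906) + " MB");
        System.out.println("headroom with 1024 MB heap: "
                + headroomMb(perSlot, 1024) + " MB");
    }
}
```

Under those assumptions, a 1906 MB heap leaves only a small margin
within a node's 2048 MB share, while a 1024 MB heap leaves about half of
the share unused by the heap.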

Question:  Can anyone provide any insight into what might be going on?
I know that each Java process gets its own JVM runtime instance, and so
should be able to address over 3 GB.  I had thought the problem might be
that all of the memory was being mapped through a single JVM instance,
so I was surprised to see the 32-bit setup using more than 4 GB.  I
don't think Condor does anything strange with how the memory is managed,
but I don't know for sure.  It seems to me that this is a memory size
issue, since our problems went away when we used the 64-bit JVM.  That,
however, is not ideal, as we'll be using third-party libraries that are
32-bit.  Any insight would be greatly appreciated.
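
In case it helps anyone reproduce this, here is a minimal sketch of a
check that could be run as a java universe job to confirm which JVM each
node actually picks up and what heap limit it reports.  The class name
is hypothetical.  Note that the "sun.arch.data.model" property is
specific to Sun/HotSpot JVMs and may be missing elsewhere, so the code
falls back to "unknown".

```java
public class JvmInfo {
    // Returns "32" or "64" on Sun/HotSpot JVMs; other JVMs may not
    // define this property, in which case "unknown" is returned.
    static String dataModel() {
        return System.getProperty("sun.arch.data.model", "unknown");
    }

    // Maximum heap the JVM will attempt to use (roughly the -Xmx value), in MB.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("os.arch:    " + System.getProperty("os.arch"));
        System.out.println("data model: " + dataModel() + "-bit");
        System.out.println("max heap:   " + maxHeapMb() + " MB");
    }
}
```

Comparing this output across nodes would at least show whether every
job is running under the intended JVM with the intended heap settings.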