
Re: [Condor-users] Java universe on WinXP, misconfigured?



> As far as I know (please correct me if I'm wrong), the _condor_stdout
> and _condor_stderr format comes from using $(CLUSTER) and $(PROCESS) to
> distinguish output from multiple copies of a job, something I do
> regularly in the vanilla universe, without problems.

After more than a glance at the code, it seems that you are correct in 
this regard.  I do it all the time too; I've just never looked over that
part of the code before.

Anyway, running with D_FULLDEBUG enabled might help shed some 
light on the problem... it just seems so strange that it runs from
the console but not under Condor.  Do you have any Windows submit 
nodes that you can use to submit to a Windows machine? (After 
compiling the code on the Windows machine as well... not that that 
should make a difference either.)
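
If it helps, this is roughly what I mean by enabling full debugging.  It
is only a sketch, assuming the execute machine picks up its settings from
a local config file (the file name and location vary by install, so treat
that part as an assumption on my side):

STARTER_DEBUG = D_FULLDEBUG

Running condor_reconfig on that machine afterwards should not hurt.  Then
resubmit and look at the StarterLog again; it should be a lot chattier
around the JavaProc lines.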

-B

> 
> Thanks,
> 
> Rob
> 
> Ben Burnett wrote:
> > Thanks.  You're right, it does look fairly straightforward.  I've
> > even run it on my machine and it works just fine.  What troubles
> > me are the log file entries yours spits out:
> >
> >> 4/15 16:39:25 Output file:
> >> C:\Progra~1\Condor\execute\dir_3248\_condor_stdout
> >> 4/15 16:39:25 Error file:
> >> C:\Progra~1\Condor\execute\dir_3248\_condor_stderr
> >
> > The only place I find mention of these names is in the Grid Manager,
> > which I'm guessing you are not using, and when error/log file names
> > are not specified--which is strange, since you do.
> >
> > Since you know that the application runs on the machine itself,
> > would it be possible to simplify the submit file to the most
> > basic one possible:
> >
> > universe                = java
> > executable              = JavaTest.class
> > arguments               = JavaTest
> > output                  = out.txt
> > error                   = err.txt
> > log                     = log.txt
> > queue
> >
> > Then clean out the logs, and start Condor fresh.  See if that
> > actually works.
> >
> > -B
> >
> > This
> >> -----Original Message-----
> >> From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-
> >> bounces@xxxxxxxxxxx] On Behalf Of Rob de Graaf
> >> Sent: Tuesday, April 15, 2008 12:45 PM
> >> To: Condor-Users Mail List
> >> Subject: Re: [Condor-users] Java universe on WinXP, misconfigured?
> >>
> >> Hi Ben,
> >>
> >> My submit file is fairly straightforward:
> >>
> >> universe                = java
> >> executable              = JavaTest.class
> >> arguments               = JavaTest
> >> output                  = output/java.$(Cluster).$(Process).out
> >> error                   = error/java.$(Cluster).$(Process).err
> >> log                     = log/java.$(Cluster).$(Process).log
> >>
> >> requirements            = (OpSys == "LINUX" || OpSys == "WINNT51")
> >> should_transfer_files   = YES
> >> when_to_transfer_output = ON_EXIT
> >> notification            = never
> >>
> >> queue 10
> >>
> >> When I submit this, some will start on WinXP nodes, run into this Java
> >> configuration problem, get evicted and restarted on Linux nodes where
> >> they complete normally.
> >>
> >> The application is just hello world:
> >>
> >> public class JavaTest {
> >>          public static void main( String[] args ) {
> >>                  System.out.println("Moo!");
> >>          }
> >> }
> >>
> >> I compiled on a 1.6 JDK with the -target 1.5 option to make it work on the
> >> older JREs we have on the WinXP boxes, and as I said before, it does
> >> run, just not through Condor.
> >>
> >> Thanks!
> >>
> >> Rob
> >>
> >> Ben Burnett wrote:
> >>> Hi Rob:
> >>>
> >>> Could you post the submit file you are using?  I've tried moving Condor
> >>> around and playing with the paths, but nothing seems to spit out the same
> >>> error.  Maybe I can reproduce it with your submission file.
> >>>
> >>> Regards,
> >>> -B
> >>>
> >>>> -----Original Message-----
> >>>> From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-
> >>>> bounces@xxxxxxxxxxx] On Behalf Of Rob de Graaf
> >>>> Sent: Tuesday, April 15, 2008 10:00 AM
> >>>> To: condor-users@xxxxxxxxxxx
> >>>> Subject: [Condor-users] Java universe on WinXP, misconfigured?
> >>>>
> >>>> Hello all,
> >>>>
> >>>> I'm trying to run a Java job on a mixed WinXP / Linux pool. Everything's
> >>>> fine on the Linux nodes, but on the Windows nodes, Java universe seems
> >>>> broken somehow. I get this in the StarterLog:
> >>>>
> >>>> 4/15 16:39:24 Initialized IO Proxy.
> >>>> 4/15 16:39:24 File transfer completed successfully.
> >>>> 4/15 16:39:25 Job 16.0 set to execute immediately
> >>>> 4/15 16:39:25 Starting a JAVA universe job with ID: 16.0
> >>>> 4/15 16:39:25 JavaProc: Cmd=C:\PROGRA~1\Java\jre1.5.0_07\bin\java.exe
> >>>> 4/15 16:39:25 JavaProc: Args=-Xmx501m -classpath
> >>>> C:\Progra~1\Condor/lib;C:\Progra~1\Condor/lib/scimark2lib.jar;.
> >>>> -Dchirp.config=C:\Progra~1\Condor\execute\dir_3248\chirp.config
> >>>> CondorJavaWrapper C:\Progra~1\Condor\execute\dir_3248\jvm.start
> >>>> C:\Progra~1\Condor\execute\dir_3248\jvm.end JavaTest
> >>>> 4/15 16:39:25 Tracking process family by login "condor-reuse-slot1"
> >>>> 4/15 16:39:25 IWD: C:\Progra~1\Condor\execute\dir_3248
> >>>> 4/15 16:39:25 Output file:
> >>>> C:\Progra~1\Condor\execute\dir_3248\_condor_stdout
> >>>> 4/15 16:39:25 Error file:
> >>>> C:\Progra~1\Condor\execute\dir_3248\_condor_stderr
> >>>> 4/15 16:39:25 Renice expr "10" evaluated to 10
> >>>> 4/15 16:39:25 About to exec C:\PROGRA~1\Java\jre1.5.0_07\bin\java.exe
> >>>> -Xmx501m -classpath
> >>>> C:\Progra~1\Condor/lib;C:\Progra~1\Condor/lib/scimark2lib.jar;.
> >>>> -Dchirp.config=C:\Progra~1\Condor\execute\dir_3248\chirp.config
> >>>> CondorJavaWrapper C:\Progra~1\Condor\execute\dir_3248\jvm.start
> >>>> C:\Progra~1\Condor\execute\dir_3248\jvm.end JavaTest
> >>>> 4/15 16:39:25 Create_Process succeeded, pid=2388
> >>>> 4/15 16:39:25 Process exited, pid=2388, status=1
> >>>> 4/15 16:39:25 JavaProc: JVM pid 2388 has finished
> >>>> 4/15 16:39:25 JavaProc: JVM exited normally with code 1
> >>>> 4/15 16:39:25 JavaProc: Wrapper did not leave start record.
> >>>> 4/15 16:39:25 JavaProc: I'll assume Java is misconfigured here.
> >>>> 4/15 16:39:25 JavaProc: unlinking
> >>>> C:\Progra~1\Condor\execute\dir_3248\jvm.start and
> >>>> C:\Progra~1\Condor\execute\dir_3248\jvm.end
> >>>> 4/15 16:39:26 Got SIGQUIT.  Performing fast shutdown.
> >>>> 4/15 16:39:26 ShutdownFast all jobs.
> >>>> 4/15 16:39:26 **** condor_starter (condor_STARTER) EXITING WITH STATUS 0
> >>>>
> >>>> The Java app runs fine on the machine itself, just not through Condor. The
> >>>> path to the JVM is correct and the machine advertises Java capability, as
> >>>> well as the correct version. What is misconfigured? This is Condor version
> >>>> 7.0.1 on Windows XP.
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Rob de Graaf
> >>>>