RE: RE: [Condor-users] Debugging interactive jobs (was Re: Condor and batch Matlab problem)
- Date: Thu, 15 Jul 2004 14:30:24 +0300
- From: Gabriel <gghin@xxxx>
- Subject: RE: RE: [Condor-users] Debugging interactive jobs (was Re: Condor and batch Matlab problem)
Hi, Andrey
I have also tried a Windows 2000 Pro execute machine, thinking that
it might be an XP problem, but I got the same behavior.
The USE_VISIBLE_DESKTOP setting is a debugging facility
introduced in version 6.2. I have a copy of the
Condor manual from April 2004, and it is briefly mentioned on page 375,
Chapter 8.
Regards
Gabriel
"Andrey Kaliazin" <A.Kaliazin@xxxxxxxxxxx> wrote:
>Hi
>
>I have investigated the case with Matlab and can say that it just won't run
>in a non-interactive fashion. It actually does start and even creates the
>log file (the command line is
>matlab.exe -nojvm -nosplash -logfile run.log -r myscript )
>but then apparently terminates instantly without returning control
>to Condor, so the job hangs.
>(I have Windows XP boxes with Matlab 6.5 installed.)
>I have checked, and Matlab did not modify any files in its folder while
>running a simple interactive job.
>So the best solution will probably be to run Octave instead. :-)
>
>By the way, I could not find USE_VISIBLE_DESKTOP anywhere in the Condor
>documentation. Could somebody give a link to it, or is it a hidden feature?
>
>Anyway, I do not quite understand how it is possible to bring up a desktop
>on an execute machine which is not even logged in. And if some user is
>logged in, then how can the condor user create windows on somebody else's
>desktop?
>
>Andrey
>
>> -----Original Message-----
>> From: condor-users-bounces@xxxxxxxxxxx
>> [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Gabriel
>> Sent: Thursday, July 15, 2004 7:57 AM
>> To: Erik Paulson
>> Cc: condor-users@xxxxxxxxxxx
>> Subject: RE: [Condor-users] Debugging interactive jobs (was
>> Re: Condor and batch Matlab problem)
>>
>>
>>
>>
>> Hi
>>
>> I have tried to use the debugging technique you recommended.
>>
>> I have put the USE_VISIBLE_DESKTOP = True option in the
>> condor_config and condor_config.local files on the execute
>> machine.
>>
>> When I submit the script, it runs OK and exits, but no window
>> pops up. (I am using the execute machine as the submit
>> machine.)
>>
>> In the StarterLog of the execute machine I get the following:
>>
>> 7/15 14:14:25 File transfer completed successfully.
>> 7/15 14:14:26 Starting a VANILLA universe job with ID: 52.0
>> 7/15 14:14:26 IWD: c:\Condor/execute\dir_184
>> 7/15 14:14:26 Output file: c:\Condor/execute\dir_184\c.out
>> 7/15 14:14:26 Error file: c:\Condor/execute\dir_184\c.err
>> 7/15 14:14:26 Renice expr "10" evaluated to 10
>> 7/15 14:14:26 About to exec c:\windows\system32\cmd.exe /K
>> 7/15 14:14:26 Create_Process: Unable to use visible desktop
>> 7/15 14:14:26 Create_Process succeeded, pid=3732
>> 7/15 14:14:26 Process exited, pid=3732, status=0
>> 7/15 14:14:26 Got SIGQUIT. Performing fast shutdown.
>> 7/15 14:14:26 ShutdownFast all jobs.
>> 7/15 14:14:26 **** condor_starter (condor_STARTER) EXITING WITH STATUS 0
>> 7/15 14:16:28 ******************************************************
>>
>> It says "Unable to use visible desktop". Is there any other
>> setting I have to change to use this facility?
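
When checking several runs, it can help to pull just that message out of the StarterLog rather than reading the whole file. A minimal sketch, assuming a Unix-like shell is available (for example Cygwin on the execute machine, or a copy of the log on another host); the function name show_desktop_failures is made up for illustration, and the log path is whatever your installation uses:

```shell
#!/bin/sh
# show_desktop_failures LOGFILE
# Prints every StarterLog line in which the starter reports that it could
# not attach the job to the visible desktop. The message text is copied
# from the log excerpt above; -F makes grep match it as a fixed string.
show_desktop_failures() {
    grep -F 'Create_Process: Unable to use visible desktop' "$1"
}
```

Called as `show_desktop_failures /path/to/StarterLog`, it prints the matching lines with their timestamps, so they can be lined up against the job start times in the same log.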
>>
>> Thank you
>> Best Regards
>> Gabriel
>>
>>
>> Erik Paulson <epaulson@xxxxxxxxxxx> wrote:
>> >On Wed, Jul 14, 2004 at 11:29:25AM -0600, bgore@xxxxxxxxxx wrote:
>> >> It could be a security/permissions thing. We had this happen on another
>> >> Matlab-like program. Every time it ran it wanted to update a file in the
>> >> install directory. However, without modification the default Condor local
>> >> account did not have permission to update this file. So, run Matlab
>> >> interactively and look for recently modified files in the Matlab install
>> >> directory. Give update permissions to the local Condor account for those
>> >> files and see if that fixes your issue. ~B
>> >>
>> >
>> >When you run it interactively, you should try to make sure that you're
>> >using the same environment as what the job will see - the best way to do
>> >that is to set
>> >
>> >USE_VISIBLE_DESKTOP = True
>> >
>> >on an _EXECUTE_ machine.
>> >
>> >When a job starts up, it will open a DOS prompt on the desktop and
>> >start executing there if that is set. You can watch it run if you'd like!
>> >
>> >Make your submit file be something like:
>> >
>> >#Executable = matlab.bat
>> >Executable = cmd.exe
>> >Universe = vanilla
>> >#Requirements = ((Arch == "INTEL" && OpSys == "WINNT51"))
>> >Requirements = machine == "host-with-use-visible-desktop-set.your.domain"
>> >should_transfer_files = YES
>> >transfer_executable = false
>> >whenToTransferOutput = ON_EXIT
>> >#transfer_input_files = a.dat,b.dat,test.m
>> >transfer_input_files = a.dat,b.dat,test.m,matlab.bat
>> >environment = PATH=c:\matlab_sv13\bin\win32
>> >#arguments = /r test /logfile log.txt
>> >arguments = /K
>> >log = mat.log
>> >Output = mat.out
>> >Error = mat.err
>> >Queue 1
>> >
>> >Submit your job, then wait at host-with-use-visible-desktop-set.your.domain
>> >and you'll get a cmd.exe window. Now you can debug the job exactly as the
>> >job will see it.
>> >
>> >I'm not 100% sure about the /K argument - I know you need to give something
>> >to cmd.exe to get it to stick around so you can actually type in it.
>> >Hopefully someone who uses Windows can confirm.
>> >
>> >Make sure you watch out for your START expression - if you've got it set
>> >so that typing on the keyboard suspends your job, as soon as you try to
>> >use your window the startd will suspend it. Best to set START=true :)
>> >
>> >We use this approach all the time to debug why jobs won't run under
>> >Condor - it turns out that there are a number of "console" apps that will
>> >decide to pop up a window waiting for the user to click "OK" before they'll
>> >start.
>> >
>> >You can pull a similar approach on Unix with X-Windows:
>> >universe = vanilla
>> >executable = /usr/X11R6/bin/xterm
>> >arguments = -display submitting-host.your.domain:0
>> >queue
>> >
>> >Run 'xhost +' and then submit your job, and when it runs, you'll get an
>> >xterm on your screen that's running on the remote machine, and you can
>> >work interactively on the remote machine. Add in any file transfer
>> >options you need to set things up for your job and you can get a start
>> >on debugging "it runs outside of Condor, but not inside".
>> >
>> >And one last trick - sometimes a program just insists on having an X server
>> >to run under - Open Office comes to mind - there's no way to disable its
>> >screen. In order to get it to work, we used the X Virtual Frame Buffer -
>> >it's like /dev/null for X. (http://www.xfree86.org/4.0.1/Xvfb.1.html)
>> >
>> >To get it to work, I used this submit file:
>> >universe = vanilla
>> >executable = xvfb-run
>> >WhenToTransferOutput = always
>> >transfer_files = always
>> >arguments = xlr8r_linux filename
>> >transfer_input_files = xlr8r_linux, xlr8r_linux.rdb, filename
>> >environment = LD_LIBRARY_PATH=/p/condor/workspaces/epaulson/739/xlr8r_libraries;PATH=/usr/bin:/bin:/s/std/bin:.;HOME=/u/e/p/epaulson
>> >requirements = IsC2Cluster
>> >output = filename.out
>> >error = filename.err
>> >
>> >(xlr8r_linux was a program that invoked OpenOffice)
>> >
>> >xvfb-run was a shell script I found on the net (I think it's from Debian):
>> >
>> >
>> >#!/bin/sh
>> >
>> >chmod 755 xlr8r_linux
>> >set -o xtrace
>> >set -e
>> >
>> >ulimit -c 0
>> ># xvfb-run - run the specified command in a virtual X server
>> >
>> ># This script starts an instance of Xvfb, the "fake" X server, runs a
>> ># command with that server available, and kills the X server when
>> ># done. The return value of the command becomes the return value of
>> ># this script.
>> >#
>> ># If anyone is using this to build a Debian package, make sure the
>> ># package Build-Depends on xvfb, xbase-clients and xfonts-base.
>> >
>> >set -e
>> >
>> >DISPLAYNUM=99
>> >AUTHFILE=$(pwd)/Xauthority
>> >STARTWAIT=3
>> >LISTENTCP="-nolisten tcp"
>> >#unset AUTODISPLAYNUM
>> >
>> >
>> >usage()
>> >{
>> > echo "Usage: $0 [OPTION]... [command]"
>> > echo
>> > echo "run specified X client or command in a virtual X server environment"
>> > echo
>> > echo " -a --auto-displaynum Try to get a free display number, starting at --display-num"
>> > echo " -f --auth-file=FILE File to store auth cookie (default:./Xauthority)"
>> > echo " -n --display-num=NUM Display number to use (default:$DISPLAYNUM)"
>> > echo " -l --listen-tcp Enable TCP port listening in the X server"
>> > echo " -w --wait=DELAY Delay in seconds to wait for Xvfb to start (default:$STARTWAIT)"
>> > echo " -h --help Display this help and exit"
>> >}
>> >
>> >
>> ># Parse command line
>> >ARGS=`getopt --options +af:n:lw:h \
>> > --long auto-displaynum,auth-file:,display-num:,listen-tcp,wait:,help \
>> > --name "$0" -- "$@"`
>> >if [ $? != 0 ] ; then echo "Terminating..." >&2 ; exit 1 ; fi
>> >
>> >eval set -- "$ARGS"
>> >while true ; do
>> > case "$1" in
>> > '-a'|'--auto-displaynum')
>> > AUTODISPLAYNUM=y
>> > ;;
>> > '-f'|'--auth-file')
>> > AUTHFILE="$2"
>> > shift
>> > ;;
>> > '-n'|'--display-num')
>> > DISPLAYNUM="$2"
>> > shift
>> > ;;
>> > '-l'|'--listen-tcp')
>> > LISTENTCP=
>> > ;;
>> > '-w'|'--wait')
>> > STARTWAIT="$2"
>> > shift
>> > ;;
>> > '-h'|'--help')
>> > usage
>> > exit 1
>> > ;;
>> > '--')
>> > # end of options
>> > shift
>> > break
>> > ;;
>> > *)
>> > echo "Internal error!"; exit 1;;
>> > esac
>> >
>> > shift
>> >done
>> >
>> >echo "Starting"
>> >i=$DISPLAYNUM
>> >while [ -f /tmp/.X$i-lock ]; do
>> > echo "Checking $i"
>> > i=$(($i+1))
>> > done
>> >echo $i
>> >DISPLAYNUM=$i
>> >
>> >echo $DISPLAYNUM
>> ># start Xvfb
>> >rm -f "$AUTHFILE"
>> >MCOOKIE=$(mcookie)
>> >XAUTHORITY="$AUTHFILE" xauth add :$DISPLAYNUM . $MCOOKIE > /dev/null
>> >XAUTHORITY="$AUTHFILE" Xvfb :$DISPLAYNUM -screen 0 640x480x8 $LISTENTCP \
>> > > /dev/null &
>> >XVFBPID=$!
>> >sleep $STARTWAIT
>> >
>> >set +e
>> >
>> ># Check that server has not exited
>> >if ! kill -0 $XVFBPID; then
>> > echo "Xvfb server has died" >&2
>> > exit 1
>> >fi
>> >
>> ># start the command and save its exit status
>> >echo "$@"
>> >DISPLAY=:$DISPLAYNUM XAUTHORITY="$AUTHFILE" "$@" 2>&1
>> >RETVAL=$?
>> >set -e
>> >
>> ># kill Xvfb and clean up
>> >kill $XVFBPID
>> >XAUTHORITY="$AUTHFILE" xauth remove :$DISPLAYNUM > /dev/null
>> >rm "$AUTHFILE"
>> >
>> ># return the executed command's exit status
>> >exit $RETVAL
>> >
>> >
>> ># Find free display number by looking at .X-lock files in /tmp
>> >#find-free-display()
>> >#{
>> >#}
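
The commented-out find-free-display stub above is empty; the script does the search inline in the loop after "Starting". A minimal sketch of what such a helper might look like, assuming the /tmp/.X<N>-lock convention the script already relies on. The name find_free_display (underscores, since hyphens are not portable in shell function names) and the optional second argument for the lock directory are illustrative choices, not part of the original script:

```shell
#!/bin/sh
# find_free_display START [LOCKDIR]
# Prints the first display number >= START for which no .X<N>-lock file
# exists in LOCKDIR (default: /tmp, where X servers keep their lock files).
find_free_display() {
    num=$1
    lockdir=${2:-/tmp}
    # Step past every display number that already has a lock file.
    while [ -e "$lockdir/.X${num}-lock" ]; do
        num=$((num + 1))
    done
    echo "$num"
}
```

In the script it could be used as `DISPLAYNUM=$(find_free_display "$DISPLAYNUM")`. The check is not atomic: another X server can claim the same number between the test and the Xvfb start, which is why the script's later `kill -0 $XVFBPID` check is still worth keeping.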
>> >_______________________________________________
>> >Condor-users mailing list
>> >Condor-users@xxxxxxxxxxx
>> >http://lists.cs.wisc.edu/mailman/listinfo/condor-users
>> >
>> >
>> >