RE: [Condor-users] Debugging interactive jobs (was Re: Condor and batch Matlab problem)



Hi

I have investigated the Matlab case and can say that it just won't run
non-interactively. It does start and even creates the log file (the
command line is
matlab.exe -nojvm -nosplash -logfile run.log -r myscript )
but then it apparently terminates immediately without returning control
to Condor, so the job hangs.
(I have Windows XP boxes with Matlab 6.5 installed.)
I have checked, and Matlab did not modify any files in its folder while
running a simple interactive job.
So the best solution will probably be to run Octave instead. :-)

By the way, I could not find USE_VISIBLE_DESKTOP anywhere in the Condor
documentation. Could somebody give a link to it, or is it a hidden feature?
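
For anyone checking the same thing: one way to see whether a macro is
visible to the local configuration at all is condor_config_val, which ships
with Condor. Below is a minimal sketch for a POSIX shell. knob_status is a
hypothetical helper name of mine, and I am assuming that condor_config_val
exits non-zero when the macro is not defined. On a Windows execute machine
the same condor_config_val command can be typed directly at a cmd prompt.

```shell
# Hypothetical helper: report whether a Condor configuration macro is
# defined in the configuration that condor_config_val reads.
# Assumption: condor_config_val exits non-zero for an undefined macro.
knob_status() {
    knob=$1
    if value=$(condor_config_val "$knob" 2>/dev/null); then
        echo "$knob = $value"
    else
        echo "$knob is not defined"
    fi
}
```

Calling knob_status USE_VISIBLE_DESKTOP on the execute machine shows only
what the configuration files say; it does not prove that the starter was
able to act on the setting.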

Anyway, I do not quite understand how it is possible to bring up a desktop
on an execute machine that nobody is logged in to. And if some user is
logged in, how can the condor user create windows on somebody else's
desktop?

Andrey

> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx 
> [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Gabriel
> Sent: Thursday, July 15, 2004 7:57 AM
> To: ErikPaulson
> Cc: condor-users@xxxxxxxxxxx
> Subject: RE: [Condor-users] Debugging interactive jobs (was
> Re: Condor and batch Matlab problem)
> 
> 
>   
> 
>  Hi
> 
>  I have tried to use the debugging technique you recommended.
> 
>  I have put the USE_VISIBLE_DESKTOP = True option in the
> condor_config and condor_config.local files on the execute
> machine.
>
>  When I submit the script, it runs OK and exits, but no window
> pops up. (I am using the execute machine as the submit
> machine.)
> 
>  In the StarterLog of the execute machine I get the following:
> 
> 7/15 14:14:25 File transfer completed successfully.
> 7/15 14:14:26 Starting a VANILLA universe job with ID: 52.0
> 7/15 14:14:26 IWD: c:\Condor/execute\dir_184
> 7/15 14:14:26 Output file: c:\Condor/execute\dir_184\c.out
> 7/15 14:14:26 Error file: c:\Condor/execute\dir_184\c.err
> 7/15 14:14:26 Renice expr "10" evaluated to 10
> 7/15 14:14:26 About to exec c:\windows\system32\cmd.exe /K
> 7/15 14:14:26 Create_Process: Unable to use visible desktop
> 7/15 14:14:26 Create_Process succeeded, pid=3732
> 7/15 14:14:26 Process exited, pid=3732, status=0
> 7/15 14:14:26 Got SIGQUIT.  Performing fast shutdown.
> 7/15 14:14:26 ShutdownFast all jobs.
> 7/15 14:14:26 **** condor_starter (condor_STARTER) EXITING WITH STATUS 0
> 7/15 14:16:28 ******************************************************
> 
>  It says "unable to use visible desktop". Is there any other
> setting I need to change to use this facility?
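
A quick way to spot this in a long StarterLog is to search for the
Create_Process lines quoted above. The following is only a sketch for a
POSIX shell (for example with the log copied to a Unix box);
check_visible_desktop is a hypothetical helper name, and the message text it
looks for is taken verbatim from the log excerpt in this thread.

```shell
# Hypothetical helper: show the Create_Process lines of a StarterLog and
# report whether the starter logged a visible-desktop failure.
# Returns 1 if the failure message is present, 0 otherwise.
check_visible_desktop() {
    log=$1
    # Show the relevant lines; a log without any is not treated as an error.
    grep 'Create_Process' "$log" || true
    if grep -q 'Unable to use visible desktop' "$log"; then
        echo "visible desktop: NOT used"
        return 1
    fi
    echo "visible desktop: no failure logged"
}
```

Usage would be something like check_visible_desktop StarterLog, run from the
directory that holds the log file.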
> 
>  Thank you
>  Best Regards
>  Gabriel
> 
> 
> Erik Paulson <epaulson@xxxxxxxxxxx> wrote:
> >On Wed, Jul 14, 2004 at 11:29:25AM -0600, bgore@xxxxxxxxxx wrote:
> >> It could be a security/permissions thing. We had this happen on
> >> another matlab-like program. Every time it ran it wanted to update a
> >> file in the install directory. However, without modification the
> >> default condor local account did not have permission to update this
> >> file. So, run matlab interactively and look for recently modified
> >> files in the matlab install directory. Give update permissions to
> >> the local condor account for those files and see if that fixes your
> >> issue. ~B
> >> 
> >
> >When you run it interactively, you should try and make sure that you're
> >using the same environment as what the job will see - the best way to
> >do that is to set
> >
> >USE_VISIBLE_DESKTOP = True
> >
> >on an _EXECUTE_ machine.
> >
> >When a job starts up, it will open a DOS prompt on the desktop and
> >start executing there if that is set. You can watch it run if you'd
> >like!
> >
> >Make your submit file be something like:
> >
> >#Executable = matlab.bat
> >Executable = cmd.exe
> >Universe = vanilla
> >#Requirements = ((Arch == "INTEL" && OpSys == "WINNT51"))
> >Requirements = machine == "host-with-use-visible-desktop-set.your.domain"
> >should_transfer_files = YES
> >transfer_executable = false
> >whenToTransferOutput = ON_EXIT
> >#transfer_input_files = a.dat,b.dat,test.m
> >transfer_input_files = a.dat,b.dat,test.m,matlab.bat
> >environment = PATH=c:\matlab_sv13\bin\win32
> >#arguments = /r test /logfile log.txt
> >arguments = /K
> >log = mat.log
> >Output = mat.out
> >Error = mat.err
> >Queue 1
> >
> >Submit your job, then wait at host-with-use-visible-desktop-set.your.domain
> >and you'll get a cmd.exe window. Now you can debug the job exactly as the
> >job will see it.
> >
> >I'm not 100% sure about the /K argument - I know you need to give
> >something to cmd.exe to get it to stick around so you can actually type
> >on it. Hopefully someone who uses Windows can confirm.
> >
> >Make sure you watch out for your START expression - if you've got it set
> >so that typing on the keyboard suspends your job, as soon as you try and
> >use your window the startd will suspend it. Best to set START=true :)
> >
> >We use this approach all the time to debug why jobs won't run under
> >Condor - it turns out that there are a number of "console" apps that will
> >decide to pop up a window waiting for the user to click "OK" before
> >they'll start.
> >
> >You can pull a similar approach on Unix with X-Windows:
> >universe = vanilla
> >executable = /usr/X11R6/bin/xterm
> >arguments = -display submitting-host.your.domain:0
> >queue
> >
> >Run 'xhost +' and then submit your job, and when it runs, you'll get an
> >xterm on your screen that's running on the remote machine, and you can
> >interactively work on the remote machine. Add in any file transfer
> >options you need to set things up for your job and you can get a start
> >on debugging "it runs outside of Condor, but not inside".
> >
> >And one last trick - sometimes a program just insists on having an X
> >server to run under - Open Office comes to mind - there's no way to
> >disable its screen. In order to get it to work, we used the X Virtual
> >Frame Buffer - it's like /dev/null for X.
> >(http://www.xfree86.org/4.0.1/Xvfb.1.html)
> >
> >To get it to work, I used this submit file:
> >universe = vanilla
> >executable = xvfb-run
> >WhenToTransferOutput = always
> >transfer_files = always
> >arguments = xlr8r_linux filename
> >transfer_input_files = xlr8r_linux, xlr8r_linux.rdb, filename 
> >environment = LD_LIBRARY_PATH=/p/condor/workspaces/epaulson/739/xlr8r_libraries;PATH=/usr/bin:/bin:/s/std/bin:.;HOME=/u/e/p/epaulson
> >requirements = IsC2Cluster
> >output = filename.out
> >error = filename.err
> >queue
> >
> >(xlr8r_linux was a program that invoked OpenOffice)
> >
> >xvfb-run was a shell script I found on the net (I think it's from Debian)
> >
> >
> >#!/bin/sh
> >
> >chmod 755 xlr8r_linux
> >set -o xtrace
> >set -e
> >
> >ulimit -c 0
> ># xvfb-run - run the specified command in a virtual X server
> >
> ># This script starts an instance of Xvfb, the "fake" X server, runs a
> ># command with that server available, and kills the X server when
> ># done.  The return value of the command becomes the return value of
> ># this script.
> >#
> ># If anyone is using this to build a Debian package, make sure the
> ># package Build-Depends on xvfb, xbase-clients and xfonts-base.
> >
> >set -e
> >
> >DISPLAYNUM=99
> >AUTHFILE=$(pwd)/Xauthority
> >STARTWAIT=3
> >LISTENTCP="-nolisten tcp"
> >#unset AUTODISPLAYNUM
> >
> >
> >usage()
> >{
> >  echo "Usage: $0 [OPTION]... [command]"
> >  echo
> >  echo "run specified X client or command in a virtual X server environment"
> >  echo
> >  echo "  -a --auto-displaynum      Try to get a free display number, starting at --display-num"
> >  echo "  -f --auth-file=FILE       File to store auth cookie (default:./Xauthority)"
> >  echo "  -n --display-num=NUM      Display number to use (default:$DISPLAYNUM)"
> >  echo "  -l --listen-tcp           Enable TCP port listening in the X server"
> >  echo "  -w --wait=DELAY           Delay in seconds to wait for Xvfb to start (default:$STARTWAIT)"
> >  echo "  -h --help                 Display this help and exit"
> >}
> >
> >
> ># Parse command line
> >ARGS=`getopt --options +af:n:lw:h \
> >	--long auto-displaynum,auth-file:,display-num:,listen-tcp,wait:,help \
> >	--name "$0" -- "$@"`
> >if [ $? != 0 ] ; then echo "Terminating..." >&2 ; exit 1 ; fi
> >
> >eval set -- "$ARGS"
> >while true ; do
> >    case "$1" in
> >      '-a'|'--auto-displaynum')
> >      	    AUTODISPLAYNUM=y
> >      	    ;;
> >      '-f'|'--auth-file')
> >	    AUTHFILE="$2"
> >	    shift
> >	    ;;
> >      '-n'|'--display-num')
> >	    DISPLAYNUM="$2"
> >	    shift
> >	    ;;
> >      '-l'|'--listen-tcp')
> >	    LISTENTCP=
> >	    ;;
> >      '-w'|'--wait')
> >	    STARTWAIT="$2"
> >	    shift
> >	    ;;
> >      '-h'|'--help')
> >	    usage
> >	    exit 1
> >	    ;;
> >      '--')
> >	    # end of options
> >	    shift
> >	    break
> >	    ;;
> >      *)
> >            echo "Internal error!"; exit 1;;
> >    esac
> >
> >    shift
> >done
> >
> >echo "Starting"
> >i=$DISPLAYNUM
> >while [ -f /tmp/.X$i-lock ]; do
> >  echo "Checking $i"
> >  i=$(($i+1))
> >done
> >echo $i
> >DISPLAYNUM=$i
> >
> >echo $DISPLAYNUM
> ># start Xvfb
> >rm -f "$AUTHFILE"
> >MCOOKIE=$(mcookie)
> >XAUTHORITY="$AUTHFILE" xauth add :$DISPLAYNUM . $MCOOKIE > /dev/null
> >XAUTHORITY="$AUTHFILE" Xvfb :$DISPLAYNUM -screen 0 640x480x8 $LISTENTCP \
> >	> /dev/null  &
> >XVFBPID=$!
> >sleep $STARTWAIT
> >
> >set +e
> >
> ># Check that server has not exited
> >if ! kill -0 $XVFBPID; then
> >  echo "Xvfb server has died" >&2
> >  exit 1
> >fi
> >
> ># start the command and save its exit status
> >echo "$@"
> >DISPLAY=:$DISPLAYNUM XAUTHORITY="$AUTHFILE" "$@" 2>&1
> >RETVAL=$?
> >set -e
> >
> ># kill Xvfb and clean up
> >kill $XVFBPID
> >XAUTHORITY="$AUTHFILE" xauth remove :$DISPLAYNUM > /dev/null
> >rm "$AUTHFILE"
> >
> ># return the executed command's exit status
> >exit $RETVAL
> >
> >
> ># Find free display number by looking at .X-lock files in /tmp
> >#find-free-display()
> >#{
> >#}
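
The stub at the end of the script was left empty. Here is one way it might
be filled in, following the same .X<N>-lock check that the "Starting" loop
in the script already does. This is a sketch, not part of the original
script: the name find_free_display (underscores, because hyphens in shell
function names are not portable) and the optional lock-directory argument
are my own assumptions.

```shell
# Hypothetical helper: print the first display number, starting at $1
# (default 99), that has no .X<N>-lock file in directory $2 (default /tmp).
find_free_display() {
    n=${1:-99}
    lockdir=${2:-/tmp}
    while [ -f "$lockdir/.X$n-lock" ]; do
        n=$((n + 1))
    done
    echo "$n"
}

# Example: pick a display number the same way the main script does.
DISPLAYNUM=$(find_free_display 99)
echo "using display :$DISPLAYNUM"
```

Note that this only checks for lock files at the moment it runs. Another X
server could still claim the same number before Xvfb starts, so the "server
has died" check after the sleep is still needed.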
> >_______________________________________________
> >Condor-users mailing list
> >Condor-users@xxxxxxxxxxx
> >http://lists.cs.wisc.edu/mailman/listinfo/condor-users