
RE: [Condor-users] Debugging interactive jobs (was Re: Condor and batch Matlab problem)



  

 Hi

 I have tried the debugging technique you recommended.

 I set USE_VISIBLE_DESKTOP = True in both the condor_config and
condor_config.local files on the execute machine.

 When I submit the job, it runs and exits normally, but no window
pops up. (I am using the execute machine as the submit machine
as well.)

 In the StarterLog of the execute machine I get the following:

7/15 14:14:25 File transfer completed successfully.
7/15 14:14:26 Starting a VANILLA universe job with ID: 52.0
7/15 14:14:26 IWD: c:\Condor/execute\dir_184
7/15 14:14:26 Output file: c:\Condor/execute\dir_184\c.out
7/15 14:14:26 Error file: c:\Condor/execute\dir_184\c.err
7/15 14:14:26 Renice expr "10" evaluated to 10
7/15 14:14:26 About to exec c:\windows\system32\cmd.exe /K
7/15 14:14:26 Create_Process: Unable to use visible desktop
7/15 14:14:26 Create_Process succeeded, pid=3732
7/15 14:14:26 Process exited, pid=3732, status=0
7/15 14:14:26 Got SIGQUIT.  Performing fast shutdown.
7/15 14:14:26 ShutdownFast all jobs.
7/15 14:14:26 **** condor_starter (condor_STARTER) EXITING WITH STATUS 0
7/15 14:16:28 ******************************************************

 The log says "Unable to use visible desktop". Is there another
setting I need to change to use this feature?

 Thank you
 Best Regards
 Gabriel


Erik Paulson <epaulson@xxxxxxxxxxx> wrote:
>On Wed, Jul 14, 2004 at 11:29:25AM -0600, bgore@xxxxxxxxxx wrote:
>> It could be a security/permissions thing. We had this happen on another
>> matlab-like program. Every time it ran it wanted to update a file in the
>> install directory. However, without modification the default condor local
>> account did not have permission to update this file. So, run matlab
>> interactively and look for recently modified files in the matlab install
>> directory. Give update permissions to the local condor account for those
>> files and see if that fixes your issue. ~B
>> 
>
>When you run it interactively, you should try to make sure that you're
>using the same environment as what the job will see - the best way to do
>that is to set
>
>USE_VISIBLE_DESKTOP = True 
>
>on an _EXECUTE_ machine. 
>
>When a job starts up, it will open a DOS prompt on the desktop and
>start executing there if that is set. You can watch it run if you'd like!
>
>Make your submit file be something like:
>
>#Executable = matlab.bat
>Executable = cmd.exe
>Universe = vanilla
>#Requirements = ((Arch == "INTEL" && OpSys == "WINNT51")) 
>Requirements = machine == "host-with-use-visible-desktop-set.your.domain"
>should_transfer_files = YES 
>transfer_executable = false
>whenToTransferOutput = ON_EXIT 
>#transfer_input_files = a.dat,b.dat,test.m 
>transfer_input_files = a.dat,b.dat,test.m,matlab.bat
>environment = PATH=c:\matlab_sv13\bin\win32 
>#arguments = /r test /logfile log.txt 
>arguments = /K
>log = mat.log 
>Output = mat.out 
>Error = mat.err 
>Queue 1
>
>Submit your job, then wait at host-with-use-visible-desktop-set.your.domain
>and you'll get a cmd.exe window. Now you can debug the job exactly as the
>job will see it.
>
>I'm not 100% sure about the /K argument - I know you need to give something
>to cmd.exe to get it to stick around so you can actually type in it.
>Hopefully someone who uses Windows can confirm.
>
>Make sure you watch out for your START expression - if you've got it set
>so that typing on the keyboard suspends your job, as soon as you try and
>use your window the startd will suspend it. Best to set START=true :)
>
>We use this approach all the time to debug why jobs won't run under
>Condor - it turns out that there are a number of "console" apps that will
>decide to pop up a window waiting for the user to click "OK" before they'll
>start.
>
>You can pull a similar approach on Unix with X-Windows:
>universe = vanilla
>executable = /usr/X11R6/bin/xterm
>arguments = -display submitting-host.your.domain:0
>queue
>
>Run 'xhost +' on the submitting host and then submit your job. When it runs,
>you'll get an xterm on your screen that's running on the remote machine, and
>you can work interactively on the remote machine. Add in any file transfer
>options you need to set things up for your job and you can get a start
>on debugging "it runs outside of Condor, but not inside".
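
A minimal shell sketch of the xterm approach above, in case it helps. The
function name write_xterm_submit and the output file name are made up for
illustration, and the host name is a placeholder. It assumes xterm lives at
/usr/X11R6/bin/xterm as in the submit file above, and that the display is
passed with xterm's standard -display option. It only writes the submit file;
running condor_submit on it is left to you.

```shell
#!/bin/sh
# Hypothetical helper: write a submit file that starts an xterm on the
# execute machine and shows it on display 0 of the given host.
#   $1 = host whose X server should show the xterm (placeholder name)
#   $2 = path of the submit file to create
write_xterm_submit() {
    display_host=$1
    submit_file=$2
    cat > "$submit_file" <<EOF
universe = vanilla
executable = /usr/X11R6/bin/xterm
arguments = -display ${display_host}:0
queue
EOF
}

# Example use (not run here):
#   xhost +
#   write_xterm_submit submitting-host.your.domain xterm.sub
#   condor_submit xterm.sub
```

Keeping the host as a parameter makes it easy to reuse the same file from
whichever desktop you happen to be debugging from.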
>
>And one last trick - sometimes a program just insists on having an X server
>to run under - Open Office comes to mind - there's no way to disable its
>screen. In order to get it to work, we used the X Virtual Frame Buffer -
>it's like /dev/null for X. (http://www.xfree86.org/4.0.1/Xvfb.1.html)
>
>To get it to work, I used this submit file:
>universe = vanilla
>executable = xvfb-run
>WhenToTransferOutput = always
>transfer_files = always
>arguments = xlr8r_linux filename
>transfer_input_files = xlr8r_linux, xlr8r_linux.rdb, filename 
>environment = LD_LIBRARY_PATH=/p/condor/workspaces/epaulson/739/xlr8r_libraries;PATH=/usr/bin:/bin:/s/std/bin:.;HOME=/u/e/p/epaulson
>requirements = IsC2Cluster 
>output = filename.out
>error = filename.err
>queue
>
>(xlr8r_linux was a program that invoked OpenOffice)
>
>xvfb-run was a shell script I found on the net (I think it's from Debian)
>
>
>#!/bin/sh
>
>chmod 755 xlr8r_linux
>set -o xtrace
>set -e
>
>ulimit -c 0
># xvfb-run - run the specified command in a virtual X server
>
># This script starts an instance of Xvfb, the "fake" X server, runs a
># command with that server available, and kills the X server when
># done.  The return value of the command becomes the return value of
># this script.
>#
># If anyone is using this to build a Debian package, make sure the
># package Build-Depends on xvfb, xbase-clients and xfonts-base.
>
>set -e
>
>DISPLAYNUM=99
>AUTHFILE=$(pwd)/Xauthority
>STARTWAIT=3
>LISTENTCP="-nolisten tcp"
>#unset AUTODISPLAYNUM
>
>
>usage()
>{
>  echo "Usage: $0 [OPTION]... [command]"
>  echo
>  echo "run specified X client or command in a virtual X server environment"
>  echo
>  echo "  -a --auto-displaynum      Try to get a free display number, starting at --display-num"
>  echo "  -f --auth-file=FILE       File to store auth cookie (default:./Xauthority)"
>  echo "  -n --display-num=NUM      Display number to use (default:$DISPLAYNUM)"
>  echo "  -l --listen-tcp           Enable TCP port listening in the X server"
>  echo "  -w --wait=DELAY           Delay in seconds to wait for Xvfb to start (default:$STARTWAIT)"
>  echo "  -h --help                 Display this help and exit"
>}
>
>
># Parse command line
>ARGS=`getopt --options +af:n:lw:h \
>	--long auto-displaynum,auth-file:,display-num:,listen-tcp,wait:,help \
>	--name "$0" -- "$@"`
>if [ $? != 0 ] ; then echo "Terminating..." >&2 ; exit 1 ; fi
>
>eval set -- "$ARGS"
>while true ; do
>    case "$1" in
>      '-a'|'--auto-displaynum')
>      	    AUTODISPLAYNUM=y
>      	    ;;
>      '-f'|'--auth-file')
>	    AUTHFILE="$2"
>	    shift
>	    ;;
>      '-n'|'--display-num')
>	    DISPLAYNUM="$2"
>	    shift
>	    ;;
>      '-l'|'--listen-tcp')
>	    LISTENTCP=
>	    ;;
>      '-w'|'--wait')
>	    STARTWAIT="$2"
>	    shift
>	    ;;
>      '-h'|'--help')
>	    usage
>	    exit 1
>	    ;;
>      '--')
>	    # end of options
>	    shift
>	    break
>	    ;;
>      *)
>            echo "Internal error!"; exit 1;;
>    esac
>
>    shift
>done
>
>echo "Starting"
>i=$DISPLAYNUM
>while [ -f /tmp/.X$i-lock ]; do
>  echo "Checking $i"
>  i=$(($i+1))
>done
>echo $i
>DISPLAYNUM=$i
>
>echo $DISPLAYNUM
># start Xvfb
>rm -f "$AUTHFILE"
>MCOOKIE=$(mcookie)
>XAUTHORITY="$AUTHFILE" xauth add :$DISPLAYNUM . $MCOOKIE > /dev/null
>XAUTHORITY="$AUTHFILE" Xvfb :$DISPLAYNUM -screen 0 640x480x8 $LISTENTCP \
>	> /dev/null  &
>XVFBPID=$!
>sleep $STARTWAIT
>
>set +e
>
># Check that server has not exited
>if ! kill -0 $XVFBPID; then
>  echo "Xvfb server has died" >&2
>  exit 1
>fi
>
># start the command and save its exit status
>echo "$@"
>DISPLAY=:$DISPLAYNUM XAUTHORITY="$AUTHFILE" "$@" 2>&1
>RETVAL=$?
>set -e
>
># kill Xvfb and clean up
>kill $XVFBPID
>XAUTHORITY="$AUTHFILE" xauth remove :$DISPLAYNUM > /dev/null
>rm "$AUTHFILE"
>
># return the executed command's exit status
>exit $RETVAL
>
>
># Find free display number by looking at .X-lock files in /tmp
>#find-free-display()
>#{
>#}
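
The find-free-display stub above is left empty in the script. One possible way
to fill it in is sketched below, using the same lock-file check as the inline
loop earlier in the script. This is an assumption about what was intended, not
the original author's code. The name uses underscores because hyphens are not
portable in sh function names, and the second argument (the lock directory) is
an addition so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Hypothetical replacement for the empty find-free-display stub.
# Prints the first display number >= $1 for which no X lock file exists.
#   $1 = display number to start from (default 99, as in the script above)
#   $2 = directory holding the .X<N>-lock files (default /tmp)
find_free_display() {
    num=${1:-99}
    lockdir=${2:-/tmp}
    # An X server on display N normally leaves a lock file named .XN-lock
    while [ -f "$lockdir/.X${num}-lock" ]; do
        num=$((num + 1))
    done
    echo "$num"
}

# Example use inside the script (not run here):
#   DISPLAYNUM=$(find_free_display "$DISPLAYNUM")
```

Note that this only checks lock files, so two jobs starting at the same moment
on the same machine could still pick the same number.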
 


