
Re: [Condor-users] [Globus-discuss] Job submission from GT4 to condor 6.7.10 (job never executes)



Hi,
   I am using GT4 with Condor 6.7.10 on a Fedora Core $
machine. I can submit and run a job through GRAM
without involving Condor at all. I can also write a
submit description file and submit it directly with
condor_submit, using this file:
--------------------------------------------
universe       = grid
grid_type      = gt4
executable     = /bin/hostname
log            = ad.log
output         = ad.musti_ouput
error          = ad.error
globusscheduler = https://ucf-7.linuxclass.marist.edu:8443
jobmanager_type = Fork
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
queue
 -------------------------------------

This scenario also works for me:
globusrun-ws -submit -Ft Condor -S -o job.epr -b -c /bin/touch touched_it

It generates a Condor submit description file like this:
---------------------------------------------
#
# description file for condor submission
#
Universe = vanilla
Notification = Never
Executable = /bin/touch
Requirements = OpSys == "LINUX"  && Arch == "INTEL"
Environment = GLOBUS_LOCATION=/usr/local/globus;X509_CERT_DIR=/etc/grid-security/certificates;X509_USER_PROXY=;X509_USER_CERT=;X509_USER_KEY=;HOME=/home/globus;LOGNAME=globus;JAVA_HOME=/usr/lib/jvm/java-1.5.0-sun-1.5.0.03/jre;GLOBUS_GRAM_JOB_HANDLE=https://<ip>:8443/wsrf/services/ManagedExecutableJobService?b7b4c80a-317c-11da-99e0-000d60eb0162;LD_LIBRARY_PATH=
Arguments = touched_it
InitialDir = /home/globus
Input = /dev/null
Log = /usr/local/globus/var/globus-condor.log
log_xml = True
#Extra attributes specified by client

Output = /dev/null
Error = /dev/null
---------------------------------------------------


--------------------PROBLEM-CAUSING SUBMISSION SCENARIO--------------------
However, the job is never executed once I use this command syntax:
globusrun-ws -submit -factory https://<ip>:8443/wsrf/services/ManagedJobFactoryService -Ft Condor -f /usr/local/globus/test/globus_wsrf_gram_service_java_test_unit/test.xml

globusrun-ws -submit -factory https://148.100.51.27:8443/wsrf/services/ManagedJobFactoryService -factory-type Condor -f /usr/local/globus/test/globus_wsrf_gram_service_java_test_unit/test.xml
Submitting job...Done.
Job ID: uuid:885a1146-3186-11da-883e-000d60eb0162
Termination time: 10/01/2005 07:48 GMT
Current job state: Pending

My job stays in the "Pending" state forever. GRAM does create
a Condor submit description file from the job description
(RSL), but the job never executes, and I don't see any errors
on the container side.
----------------RSL-------------------------
<?xml version="1.0" encoding="UTF-8"?>
<job>
    <executable>/bin/hostname</executable>
    <directory>${GLOBUS_USER_HOME}</directory>
    <argument>12</argument>
    <argument>abc</argument>
    <argument>34</argument>
    <argument>pdscaex_instr_GrADS_grads23_28919.cfg</argument>
    <argument>pgwynnel was here</argument>
    <environment>
        <name>PI</name>
        <value>3.141</value>
    </environment>
    <environment>
        <name>GLOBUS_DUROC_SUBJOB_INDEX</name>
        <value>0</value>
    </environment>
    <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
    <count>1</count>
    <jobType>multiple</jobType>
</job>
----------------------------------------------
GRAM generates a submit description file for Condor, which looks like this:
-------------------CLASS_AD-----------------------------
#
# description file for condor submission
#
Universe = vanilla
Notification = Never
Executable = /bin/hostname
Requirements = OpSys == "LINUX"  && Arch == "INTEL"
Environment = PI=3.141;GLOBUS_DUROC_SUBJOB_INDEX=0;GLOBUS_LOCATION=/usr/local/globus;X509_CERT_DIR=/etc/grid-security/certificates;X509_USER_PROXY=;X509_USER_CERT=;X509_USER_KEY=;HOME=/home/globus;LOGNAME=globus;JAVA_HOME=/usr/lib/jvm/java-1.5.0-sun-1.5.0.03/jre;GLOBUS_GRAM_JOB_HANDLE=https://148.100.51.27:8443/wsrf/services/ManagedExecutableJobService?ce468320-3171-11da-8d85-000d60eb0162;LD_LIBRARY_PATH=
Arguments = 12 abc 34 pdscaex_instr_GrADS_grads23_28919.cfg pgwynnel was here
InitialDir = /home/globus
Input = /dev/null
Log = /usr/local/globus/var/globus-condor.log
log_xml = True
#Extra attributes specified by client

Output = /home/globus/stdout
Error = /home/globus/stderr
queue 1
---------------------------------------------------------
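For readers comparing the two listings, the correspondence between the job description and the generated submit file can be sketched in a few lines of Python. This is only an illustration of the mapping visible in this message, not the code GRAM itself runs. The function name `job_to_submit_lines` and the `user_home` parameter are hypothetical, and the fixed `Universe = vanilla` line is simply copied from the generated file above.

```python
import xml.etree.ElementTree as ET


def job_to_submit_lines(job_xml, user_home):
    """Map a <job> description to submit-file lines (illustrative only)."""
    job = ET.fromstring(job_xml)

    def text(tag, default=""):
        # Read one element and expand the ${GLOBUS_USER_HOME} placeholder.
        node = job.find(tag)
        value = node.text if node is not None and node.text else default
        return value.replace("${GLOBUS_USER_HOME}", user_home)

    # Arguments are joined with spaces, so an argument that itself contains
    # spaces ("pgwynnel was here") is no longer distinguishable afterwards.
    args = " ".join(a.text or "" for a in job.findall("argument"))
    # Environment entries become name=value pairs separated by semicolons.
    env = ";".join(
        "{}={}".format(e.findtext("name"), e.findtext("value"))
        for e in job.findall("environment")
    )
    return [
        "Universe = vanilla",
        "Executable = " + text("executable"),
        "Environment = " + env,
        "Arguments = " + args,
        "InitialDir = " + text("directory"),
        "Output = " + text("stdout", "/dev/null"),
        "Error = " + text("stderr", "/dev/null"),
        "queue " + text("count", "1"),
    ]


# A shortened version of the job description from this message.
SAMPLE_JOB = """
<job>
    <executable>/bin/hostname</executable>
    <directory>${GLOBUS_USER_HOME}</directory>
    <argument>12</argument>
    <argument>abc</argument>
    <argument>pgwynnel was here</argument>
    <environment>
        <name>PI</name>
        <value>3.141</value>
    </environment>
    <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
    <count>1</count>
</job>
"""

if __name__ == "__main__":
    for line in job_to_submit_lines(SAMPLE_JOB, "/home/globus"):
        print(line)
```

The real generated file also adds Globus-specific environment variables, Requirements, and the Log settings, which this sketch leaves out.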

Here are the last few entries from the scheduler log file:
------------------------------------------------
9/30 02:57:14 (pid:8885) Activity on stashed negotiator socket
9/30 02:57:14 (pid:8885) Negotiating for owner: KBPSD@xxxxxxxxxxxxxxxxxxxxxxxxxxx
9/30 02:57:14 (pid:8885) Checking consistency running and runnable jobs
9/30 02:57:14 (pid:8885) Tables are consistent
9/30 02:57:15 (pid:8885) Out of servers - 0 jobs matched, 4 jobs idle, 4 jobs rejected
9/30 02:57:15 (pid:8885) Activity on stashed negotiator socket
9/30 02:57:15 (pid:8885) Negotiating for owner: globus@xxxxxxxxxxxxxxxxxxxxxxxxxxx
9/30 02:57:15 (pid:8885) Checking consistency running and runnable jobs
9/30 02:57:15 (pid:8885) Tables are consistent
9/30 02:57:15 (pid:8885) Out of servers - 0 jobs matched, 4 jobs idle, 4 jobs rejected
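
The "Out of servers" lines report how each negotiation cycle ended for an owner. When the log is longer than this excerpt, a small script can tally those lines per owner. The following is a minimal sketch that assumes the two line formats shown above; the function name `summarize_negotiations` is made up for this example, and other Condor versions may word these messages differently.

```python
import re

# Patterns taken from the two kinds of log lines quoted in this message.
OWNER = re.compile(r"Negotiating for owner:\s*(\S+)")
SUMMARY = re.compile(
    r"Out of servers - (\d+) jobs matched, (\d+) jobs idle, (\d+) jobs rejected"
)


def summarize_negotiations(log_text):
    """Return a list of (owner, matched, idle, rejected) tuples."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())

    # Collect both kinds of matches and walk them in order of appearance.
    events = [(m.start(), "owner", m) for m in OWNER.finditer(flat)]
    events += [(m.start(), "summary", m) for m in SUMMARY.finditer(flat)]

    results = []
    owner = None
    for _, kind, m in sorted(events, key=lambda e: e[0]):
        if kind == "owner":
            owner = m.group(1)
        else:
            matched, idle, rejected = (int(g) for g in m.groups())
            results.append((owner, matched, idle, rejected))
    return results


# Two wrapped entries copied from the excerpt above.
SAMPLE_LOG = """\
9/30 02:57:14 (pid:8885) Negotiating for owner:
KBPSD@xxxxxxxxxxxxxxxxxxxxxxxxxxx
9/30 02:57:15 (pid:8885) Out of servers - 0 jobs
matched, 4 jobs idle, 4 jobs rejected
9/30 02:57:15 (pid:8885) Negotiating for owner:
globus@xxxxxxxxxxxxxxxxxxxxxxxxxxx
9/30 02:57:15 (pid:8885) Out of servers - 0 jobs
matched, 4 jobs idle, 4 jobs rejected
"""

if __name__ == "__main__":
    for row in summarize_negotiations(SAMPLE_LOG):
        print(row)
```

For the excerpt above, both owners end each cycle with zero matched jobs and four rejected ones.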

-------------------------------globus-condor.conf--------
/usr/local/globus/etc]$ cat globus-condor.conf
log_path=/usr/local/globus/var/globus-condor.log
-------------------------------------------------

--------------globus-condor.log------------------------
<c>
    <a n="MyType"><s>SubmitEvent</s></a>
    <a n="EventTypeNumber"><i>0</i></a>
    <a n="EventTime"><s>2005-09-30T03:48:05</s></a>
    <a n="Cluster"><i>68</i></a>
    <a n="Proc"><i>0</i></a>
    <a n="Subproc"><i>0</i></a>
    <a n="SubmitHost"><s>&lt;<ip>:59194&gt;</s></a>
</c>
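
Because the log is written with log_xml = True, it can be read with an ordinary XML parser to check whether a submit event was ever followed by an execute event. The sketch below makes several assumptions: that the file is a plain sequence of `<c>` elements like the one above, that an execute event carries the MyType value `ExecuteEvent`, and that wrapping the text in a made-up `<log>` root element is enough to parse it. The address in the sample is a placeholder standing in for the `<ip>` that was masked in this message.

```python
import xml.etree.ElementTree as ET


def read_events(log_text):
    """Parse a sequence of <c> event elements into a list of dicts."""
    # The log has no single root element, so wrap it in one before parsing.
    root = ET.fromstring("<log>" + log_text + "</log>")
    events = []
    for c in root.findall("c"):
        event = {}
        for a in c.findall("a"):
            if len(a) == 0:
                continue  # attribute without a typed value element
            value = a[0]  # e.g. <s>text</s> or <i>42</i>
            if value.tag == "i":
                event[a.get("n")] = int(value.text)
            else:
                event[a.get("n")] = value.text
        events.append(event)
    return events


def never_started(events):
    """Return sorted (Cluster, Proc) pairs submitted but never executed."""
    submitted, executed = set(), set()
    for e in events:
        job = (e.get("Cluster"), e.get("Proc"))
        if e.get("MyType") == "SubmitEvent":
            submitted.add(job)
        elif e.get("MyType") == "ExecuteEvent":  # assumed event name
            executed.add(job)
    return sorted(submitted - executed)


# The event from this message, with a placeholder address (192.0.2.1).
SAMPLE_EVENTS = """
<c>
    <a n="MyType"><s>SubmitEvent</s></a>
    <a n="EventTypeNumber"><i>0</i></a>
    <a n="EventTime"><s>2005-09-30T03:48:05</s></a>
    <a n="Cluster"><i>68</i></a>
    <a n="Proc"><i>0</i></a>
    <a n="Subproc"><i>0</i></a>
    <a n="SubmitHost"><s>&lt;192.0.2.1:59194&gt;</s></a>
</c>
"""

if __name__ == "__main__":
    print(never_started(read_events(SAMPLE_EVENTS)))
```

Run against the single event shown here, the script reports cluster 68, process 0 as submitted but not started, which matches the job staying in the Pending state.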



Analysis and Questions:
=======================
1. Possibly my syntax for submitting the job is wrong (if so,
   could someone please correct me).
2. My machine is not the central manager of the Condor pool.
   Could that be the problem, since I never refer to the
   central manager? I would expect this to be handled
   automatically, because I can submit other jobs without
   naming the central manager.
3. Something may be wrong with my Condor or Globus
   configuration.

I have been stuck on this problem for a while. Perhaps people
are too busy, or perhaps my earlier description was not clear
enough, so please let me know if there is anything I need to
do to fix it. Any help will be appreciated. Thanks in advance.
Mustansar
Marist College, Poughkeepsie, NY


