
Re: [Condor-users] remote job submission with condor, (newbie) question



I think the problem was hostname-related: I was submitting from my laptop, which has localhost as its hostname. I had assumed it worked with IP addresses rather than FQDNs.

I installed everything on a machine with a proper hostname. When I run globus-hostname and globus-gass-server, they now report the right hostname (frank.ultralight.org).


[fvlingen@frank osg]$ globus-hostname
frank.ultralight.org
[fvlingen@frank osg]$ globus-gass-server
https://frank.ultralight.org:33422

I then did:
voms-proxy-init
condor_master
condor_submit test2.jdl (where the jdl file contains the lines below)

universe = globus
globusscheduler = t2cms02.sdsc.edu/jobmanager-fork
executable = /bin/hostname
output = testing2.out
error = testing2.err
log = testing2.log
notification = never
queue


When I run condor_q, the job stays idle for a long time.
How can I view the queue on the remote site? I tried:

condor_q -name t2cms02.sdsc.edu/jobmanager-fork
Error: Collector has no record of schedd/submitter

Frank.

PS: Is there a tutorial that shows how to do remote job submission and how to troubleshoot when a submission does not do what you expected? I looked at the Condor site
but found only examples that submit to local queues.




Jaime Frey wrote:

On Jan 10, 2006, at 1:13 PM, Frank van Lingen wrote:

I am new to using condor and trying to submit a job to a condor site:

This is what I did:
-installed the vdt client (1.3.10)
-generated a proxy using voms-proxy-init
-did a "condor_submit  test.jdl" where the contents of test.jdl is:

universe = globus
Executable = /bin/date
globusscheduler = t2cms02.sdsc.edu/jobmanager-fork
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
Output  = /home/fvlingen/condor_test/output.out
Error = /home/fvlingen/condor_test/error.err
Log = /home/fvlingen/condor_test/user.log
queue

If I do not run "condor_schedd" first, I get an error saying the local
scheduler was not found. If I run "condor_schedd" before condor_submit,
it works, but it seems to submit to the scheduler on my laptop (log
output below), which is not what I specified in my jdl file:

000 (002.000.000) 01/10 10:34:34 Job submitted from host: <127.0.0.1:32771>
...
018 (002.000.000) 01/10 10:36:51 Globus job submission failed!
   Reason: 43 the job manager failed to stage the executable
...
012 (002.000.000) 01/10 10:36:58 Job was held.
Globus error 43: the job manager failed to stage the executable
       Code 2 Subcode 43


Using the -r option does not seem to help either:

condor_submit -r t2cms02.sdsc.edu/jobmanager-fork test.jdl
ERROR: Can't find address of schedd t2cms02.sdsc.edu/jobmanager-fork

I am probably doing something trivially wrong. I looked at the
condor_submit man pages and examples of submitting a job, but they
seem to be based around local submission.

Where can I find information on remote job submission?

When you use Condor, you submit your jobs to a condor_schedd daemon, usually one running on your local machine. It will then forward the job to an appropriate destination (in this case, t2cms02.sdsc.edu/jobmanager-fork). Your job is being held because the transfer of the job's executable from your machine to t2cms02.sdsc.edu is failing. This is usually a networking issue (incomplete hostname, firewall, etc.).

Try running globus-gass-server and look at the URL it prints. Is the hostname incomplete? Is the port one that's being blocked by a firewall?
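
As a rough illustration of the check described above, the following Python sketch takes the URL printed by globus-gass-server and flags the obvious hostname problems. The function name inspect_gass_url and the specific rules (localhost, 127.x addresses, names without a dot) are assumptions made for this example; they are not part of Condor or Globus. It cannot tell whether a firewall blocks the port, since that has to be tested from the remote site.

```python
from urllib.parse import urlparse


def inspect_gass_url(url):
    """Return (hostname, port, problems) for a URL printed by globus-gass-server.

    Hypothetical helper: the rules below are simple heuristics, not an
    official Condor or Globus check.
    """
    parsed = urlparse(url.strip())
    host = parsed.hostname or ""
    port = parsed.port
    problems = []

    if host in ("", "localhost") or host.startswith("127."):
        # The remote job manager cannot reach a loopback name or address.
        problems.append("hostname is only reachable locally")
    elif "." not in host:
        # A short name such as "frank" is unlikely to resolve remotely.
        problems.append("hostname is not fully qualified")

    if port is None:
        problems.append("no port in URL")

    return host, port, problems


if __name__ == "__main__":
    # URL taken from the globus-gass-server output earlier in this thread.
    host, port, problems = inspect_gass_url("https://frank.ultralight.org:33422")
    print(host, port, problems or "no obvious hostname problem")
```

If the list of problems is empty, the remaining thing to verify is that the reported port is reachable from the remote site.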

+--------------------------------+-----------------------------------+
|           Jaime Frey           | I used to be a heavy gambler.     |
|       jfrey@xxxxxxxxxxx        | But now I just make mental bets.  |
| http://www.cs.wisc.edu/~jfrey/ | That's how I lost my mind.        |
+--------------------------------+-----------------------------------+


_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users


--
--------------------------------------
Frank van Lingen
California Institute of Technology
CA 91125 Pasadena
United States
--------------------------------------
Mail Code:356-48
bld      :340 Lauritsen
email    :fvlingen@xxxxxxxxxxx
im(aim)  :marcellus0872
tel      :(+1) 626 395 3862
mobile   :(+1) 310 968 5584
url      :http://www.van-lingen.name/
--------------------------------------