
Re: [Condor-users] remote job submission with condor, (newbie) question



If I submit something using globus-job-run, it works:

[fvlingen@frank osg]$ globus-job-run t2cms02.sdsc.edu/jobmanager-fork /bin/hostname
t2cms02.sdsc.edu

However, if I use condor_submit with the attributes below, the job sits idle in my queue without any error or output messages; the log only records that the job has been submitted. Is there something else I need to pass to condor_submit?


#####condor job attributes I used:
universe = globus
globusscheduler = t2cms02.sdsc.edu/jobmanager-fork
executable = /bin/hostname
output = testing2.out
error = testing2.err
log = testing2.log
notification = never
queue

Frank.


Frank van Lingen wrote:

I think the problem was hostname-related, as I was submitting from my laptop, which has localhost as its hostname. I thought it worked with IP addresses rather than FQDNs.

I installed everything on a machine with a proper hostname. When I run the commands globus-hostname and globus-gass-server, they report the right hostname (frank.ultralight.org).


[fvlingen@frank osg]$ globus-hostname
frank.ultralight.org
[fvlingen@frank osg]$ globus-gass-server
https://frank.ultralight.org:33422

I then did:
voms-proxy-init
condor_master
condor_submit test2.jdl (where the jdl file contains the lines below)

universe = globus
globusscheduler = t2cms02.sdsc.edu/jobmanager-fork
executable = /bin/hostname
output = testing2.out
error = testing2.err
log = testing2.log
notification = never
queue


When I run condor_q, the job stays idle for a long time.
How can I view the queue on the remote Condor site? I tried:

condor_q -name t2cms02.sdsc.edu/jobmanager-fork
Error: Collector has no record of schedd/submitter

Frank.

PS: Is there a tutorial that shows how to do remote job submissions and how to troubleshoot when a submission does not do what you expected? I looked at the Condor site but found only examples that submit to local queues.




Jaime Frey wrote:

On Jan 10, 2006, at 1:13 PM, Frank van Lingen wrote:



I am new to using Condor and am trying to submit a job to a Condor site.

This is what I did:
-installed the VDT client (1.3.10)
-generated a proxy using voms-proxy-init
-ran "condor_submit test.jdl", where the contents of test.jdl are:

universe = globus
Executable = /bin/date
globusscheduler = t2cms02.sdsc.edu/jobmanager-fork
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
Output = /home/fvlingen/condor_test/output.out
Error = /home/fvlingen/condor_test/error.err
Log = /home/fvlingen/condor_test/user.log
queue

If I do not run the command "condor_schedd", I get an error: local
scheduler not found. If I run "condor_schedd" before condor_submit,
it works, but it seems to submit to the scheduler on my laptop (the log
output is below), which is not what I specified in my jdl file:

000 (002.000.000) 01/10 10:34:34 Job submitted from host: <127.0.0.1:32771>
...
018 (002.000.000) 01/10 10:36:51 Globus job submission failed!
Reason: 43 the job manager failed to stage the executable
...
012 (002.000.000) 01/10 10:36:58 Job was held.
Globus error 43: the job manager failed to stage the executable
Code 2 Subcode 43
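
[Editor's note] The log excerpt above can be scanned mechanically for the two details that matter here: which host the job was submitted from, and which events followed. The following is a minimal sketch in Python. It assumes the event-header layout seen in this excerpt only (three-digit event code, job id in parentheses, date, time, message); the names parse_events and submitted_from_loopback are hypothetical helpers for illustration, not part of Condor.

```python
import re

# Header layout assumed from the excerpt in this thread, e.g.
# "000 (002.000.000) 01/10 10:34:34 Job submitted from host: <127.0.0.1:32771>"
EVENT_RE = re.compile(
    r"^(?P<code>\d{3}) \((?P<job>\d+\.\d+\.\d+)\) "
    r"(?P<date>\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<text>.*)$"
)

SAMPLE_LOG = """\
000 (002.000.000) 01/10 10:34:34 Job submitted from host: <127.0.0.1:32771>
...
018 (002.000.000) 01/10 10:36:51 Globus job submission failed!
Reason: 43 the job manager failed to stage the executable
...
012 (002.000.000) 01/10 10:36:58 Job was held.
Globus error 43: the job manager failed to stage the executable
Code 2 Subcode 43
"""


def parse_events(log_text):
    """Return one dict per event header line; detail lines are skipped."""
    events = []
    for line in log_text.splitlines():
        match = EVENT_RE.match(line.strip())
        if match:
            events.append(match.groupdict())
    return events


def submitted_from_loopback(events):
    """True if a submit event (code 000) names a 127.x.x.x submit host."""
    return any(
        ev["code"] == "000" and "<127." in ev["text"] for ev in events
    )


events = parse_events(SAMPLE_LOG)
for ev in events:
    print(ev["code"], ev["job"], ev["text"])
if submitted_from_loopback(events):
    print("warning: job was submitted from a loopback address")
```

A submit host of 127.0.0.1 means the remote side is handed an address it cannot connect back to, which is consistent with the staging failure reported in the later events.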


Using the -r option does not seem to help either:

condor_submit -r t2cms02.sdsc.edu/jobmanager-fork test.jdl
ERROR: Can't find address of schedd t2cms02.sdsc.edu/jobmanager-fork

I am probably doing something trivially wrong. I looked at the
condor_submit man pages and examples of submitting a job, but they
seem to be based around local submission.

Where can I find information on remote job submission?


When you use Condor, you submit your jobs to a condor_schedd daemon, usually one running on your local machine. It will then forward the job to an appropriate destination (in this case, t2cms02.sdsc.edu/jobmanager-fork). Your job is being held because the transfer of the job's executable from your machine to t2cms02.sdsc.edu is failing. This is usually a networking issue (incomplete hostname, firewall, etc.).

Try running globus-gass-server and look at the URL it prints. Is the hostname incomplete? Is the port one that's being blocked by a firewall?
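
[Editor's note] The hostname half of this check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: check_gass_url is a hypothetical helper, not a Globus or Condor tool, and it inspects the URL text only. It cannot tell whether a firewall blocks the port; that still has to be tested from the remote site.

```python
import ipaddress
from urllib.parse import urlparse


def check_gass_url(url):
    """Return a list of warnings about the host and port in a GASS URL."""
    warnings = []
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if not host:
        return ["no hostname found in URL"]

    # Distinguish literal IP addresses from names.
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        addr = None

    if addr is not None:
        if addr.is_loopback:
            warnings.append("loopback address: unreachable from the remote site")
    elif host == "localhost" or host.startswith("localhost."):
        warnings.append("hostname is localhost: unreachable from the remote site")
    elif "." not in host:
        warnings.append("hostname is not fully qualified")

    if parsed.port is None:
        warnings.append("no explicit port in URL")
    return warnings


# URLs in the form printed by globus-gass-server; the first is from this thread.
for url in ("https://frank.ultralight.org:33422", "https://localhost:33422"):
    problems = check_gass_url(url)
    print(url, "->", problems if problems else "looks reachable by name")
```

An empty list means only that the name looks usable; the port in the URL must still be open to inbound connections from t2cms02.sdsc.edu for staging to succeed.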

+--------------------------------+-----------------------------------+
| Jaime Frey | I used to be a heavy gambler. |
| jfrey@xxxxxxxxxxx | But now I just make mental bets. |
| http://www.cs.wisc.edu/~jfrey/ | That's how I lost my mind. |
+--------------------------------+-----------------------------------+







--
--------------------------------------
Frank van Lingen
California Institute of Technology
CA 91125 Pasadena
United States
--------------------------------------
Mail Code:356-48
bld      :340 Lauritsen
email    :fvlingen@xxxxxxxxxxx
im(aim)  :marcellus0872
tel      :(+1) 626 395 3862
mobile   :(+1) 310 968 5584
url      :http://www.van-lingen.name/
--------------------------------------