
Re: [Condor-users] submit to a remote pool, from a linux machine, to an osx manager

You can set Arch and OpSys in the submit file.


"Arch and OpSys are set equal to the Arch and OpSys of the submit machine. In other words: unless you request otherwise, Condor will give your job machines with the same architecture and operating system version as the machine running condor_submit."

I don't know of a way to run a Linux program on a BSD system (unless the port for Linux binaries is installed; Mac OS X is BSD-based), but I have never really tested it. I always run BSD binaries on BSD and Linux binaries on Linux.
I think you should do the same: get two binaries, one for Mac and one for Linux, and submit each with Arch (and OpSys) set to the matching platform.
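
As a rough sketch of that idea, one submit file can also pick the binary per platform. This assumes two hypothetical executables named myprog.OSX.INTEL and myprog.LINUX.X86_64 next to the submit file. The Arch and OpSys strings are the ones mentioned in this thread; condor_status shows what your machines actually advertise.

```
# Hypothetical submit file; all file names are placeholders.
universe     = vanilla
# $$(OpSys) and $$(Arch) are replaced with the values of the matched machine
executable   = myprog.$$(OpSys).$$(Arch)
requirements = ((Arch == "INTEL") && (OpSys == "OSX")) || ((Arch == "X86_64") && (OpSys == "LINUX"))
output       = myprog.out
error        = myprog.err
log          = myprog.log
queue
```

For a Mac-only pool, the requirements line can be cut down to the first clause.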

On Thu, Sep 16, 2010 at 3:43 PM, Peter Doherty <doherty@xxxxxxxxxxxxxxxxxxx> wrote:

On Sep 16, 2010, at 15:07 , Peter Doherty wrote:


Okay, so I've got an all-Mac cluster (Intel Xeons running OS X 10.5).  I set up the head node as a central manager running the schedd, collector, etc.
It works pretty well.  But I've got a Linux box that I want to submit jobs from, so I installed Condor on it and set the condor_config variables to point to the remote pool.
No daemons run on the Linux box; I just changed CONDOR_HOST to reference the Mac central manager.

I can use condor_submit and run jobs, and everything works great.
But when I want to submit a DAG, I run into problems.
1.) The DAGMan scheduler universe job has requirements (OpSys, Arch) that look for a LINUX X86_64 machine, but the CM is INTEL, OSX.
2.) The CMD it wants to run is the Linux binary of condor_dagman (because it pulled the config from the Linux box, I presume).
So the job just gets stuck eternally in the queue.


In true mailing list form, I made some headway shortly after sending this email.
I changed ARCH and OPSYS in the condor_config on the local box, so the job now gets matched and runs. (I'm sure there is a more proper way to do this, however.)
But then the job fails to run, because it's a Linux binary on OS X.

So I found the -dagman flag for condor_submit_dag and passed it the full path to the OS X binary, but then I get an error in the job.dag.dagman.out file showing that dagman started, ran, and then exited because -Dagman is an unknown option.  So it looks like condor_submit_dag passed its option on to condor_dagman, which doesn't support the flag?  Still, the fact that it produced the error tells me that condor_dagman at least started and ran on the CM.

At least I'm getting closer.

Edier Alberto Zapata Hernández
Systems Engineering student
Universidad de Valle