Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


[Condor-users] submit to a remote pool, from a linux machine, to an osx manager

Date: Thu, 16 Sep 2010 15:07:25 -0400
From: Peter Doherty <doherty@xxxxxxxxxxxxxxxxxxx>
Subject: [Condor-users] submit to a remote pool, from a linux machine, to an osx manager

Hi,

Okay, so I've got an all mac (intel xeons, running OS X 10.5) cluster. I set up the headnode as a central manager running the schedd/collector, etc. It works pretty good. But I've got a linux box that I want to use to submit jobs from. So I installed condor on the linux box and set the condor_config variables to point to the remote pool. No daemons run on the linux box, I just changed CONDOR_HOST to reference the mac central manager.


I can use condor_submit and run jobs, and everything works great.
But when I want to submit a DAG, I run into problems.

1.) the dagman scheduler universe job has requirements (OpSys, Arch) that are looking for a LINUX X86_64 machine, but the CM is INTEL, OSX.
2.) the CMD it wants to run is the linux binary of condor_dagman (because it pulled the config from the linux box i presume)

So the job just gets stuck eternally in the queue.
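
The mismatch in 1.) can be illustrated with a toy model of the requirements check. This is only a sketch in Python: the attribute values are the ones mentioned above, the exact strings the Mac advertises (here "OSX" and "INTEL") are assumed, and `matches` is a hypothetical helper, not part of any Condor API or the real ClassAd evaluator.

```python
# Toy model of the requirements check; not the real ClassAd evaluator.

def matches(requirements: dict, machine_ad: dict) -> bool:
    """Return True only if every required attribute equals the machine's value."""
    return all(machine_ad.get(attr) == value
               for attr, value in requirements.items())

# Requirements attached to the DAGMan job when it is submitted from the linux box.
dagman_requirements = {"OpSys": "LINUX", "Arch": "X86_64"}

# What the mac central manager is assumed to advertise.
mac_cm_ad = {"OpSys": "OSX", "Arch": "INTEL"}

print(matches(dagman_requirements, mac_cm_ad))  # False, so the job stays idle
```

As long as both attributes disagree, no amount of waiting changes the result, which is consistent with the job sitting in the queue forever.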

I was originally going to go hack up condor_config and I was looking for something like DAGMAN_EXE= so I could set it to the osx binary, but I don't see that option listed.

that would address #2.

The first one, well, I think I can fix that with something in SUBMIT_EXPR, right? Will that work with condor_submit_dag?
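
One possible workaround for both points, sketched here in Python and untested against a real pool: have condor_submit_dag write the DAGMan submit file without submitting it (its -no_submit option), rewrite the executable and requirements lines, and then hand the file to condor_submit. The function name `patch_dagman_submit`, the sample file contents, the path `/path/to/osx/condor_dagman`, and the "OSX"/"INTEL" strings are all assumptions for illustration, and whether condor_submit accepts the patched file in this remote setup is not verified.

```python
import re

def patch_dagman_submit(text: str, dagman_path: str, requirements: str) -> str:
    """Return the submit description with executable and requirements replaced.

    Existing executable/requirements lines are dropped, and the new ones are
    inserted just before the first 'queue' line (or appended if there is none).
    """
    override = re.compile(r"\s*(executable|requirements)\s*=", re.IGNORECASE)
    queue = re.compile(r"\s*queue\b", re.IGNORECASE)
    new_lines = [f"executable = {dagman_path}",
                 f"requirements = {requirements}"]
    out = []
    inserted = False
    for line in text.splitlines():
        if override.match(line):
            continue  # drop the value written on the linux box
        if queue.match(line) and not inserted:
            out.extend(new_lines)
            inserted = True
        out.append(line)
    if not inserted:
        out.extend(new_lines)
    return "\n".join(out) + "\n"

# Hypothetical, heavily shortened stand-in for a generated .condor.sub file.
sample = (
    "universe = scheduler\n"
    "executable = /usr/bin/condor_dagman\n"
    "log = my.dag.dagman.log\n"
    "queue\n"
)

patched = patch_dagman_submit(
    sample,
    "/path/to/osx/condor_dagman",             # hypothetical path on the mac CM
    '(OpSys == "OSX") && (Arch == "INTEL")',  # assumed attribute values
)
print(patched)
```

Only the two lines in question are touched; everything else that condor_submit_dag generated is passed through unchanged.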

Am I just barking up the wrong tree here? Should I provision a linux CM for the pool? (I'd rather not, I've got limited resources to work with)


One more question, for anyone still reading.

On my linux clusters, if I do something like reboot the CM, the nodes all reconnect once the CM is running again. When I reboot the Mac CM, the nodes get into a funny state. A 'killall -HUP condor_master' on the nodes fixes it right up, but short of that, they don't reconnect to the CM.



Thanks.
Peter
