
Re: [HTCondor-users] Parallel job with flocking, condor_tail does not work, upload/download to/from a running job, slots in Claimed-Idle state, ...



Indeed looks straightforward! I've tried this trick with the substitution "$(OPSYS)", unfortunately it does not seem to work. It simply expands to the OS of the host where the job is submitted.

Note the two dollar signs: $$(ARCH). The second dollar sign means "don't evaluate until I've been matched", which should address the problem of the value being the submit node's ARCH (or OPSYS) instead of the execute node's.
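
For example, something along these lines (an untested sketch based on your submit file; the per-platform wrapper names are hypothetical and would need to exist in the submit directory):

executable     = wrapper.$$(OpSys).$$(Arch)
arguments      = ping -c444 127.0.0.1
universe       = vanilla
queue

Because of the double dollar signs, HTCondor fills in OpSys and Arch from the matched machine's ad, not from the submit node.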

Interesting idea, I shall try this. Where can I read about "startd flocking"? Is there some recipe? Probably I simply did not read the documentation carefully enough, but I cannot find a word about this.

Huh. Neither can I. IIRC, the idea is to set COLLECTOR_HOST on the startd to a list of collectors -- the collector for each pool you want to have access to the startd. There are a lot of details I'm forgetting, I'm sure.
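
Very roughly, in the startd's configuration it would look something like this (the host names are placeholders, and the security/ALLOW settings each pool would need are omitted):

# Advertise this startd to the collectors of both pools
COLLECTOR_HOST = collector.pool-a.example.org, collector.pool-b.example.org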

I've tried -- no luck. Here is the simple submit file I used for this:

executable     = wrapper.sh
arguments      = ping -c444 127.0.0.1
universe       = vanilla
requirements   = OpSys == "LINUX"
queue

How can I debug this?

	Try turning on the output and error logs?
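
	For instance, adding something like this to the submit file should capture the job's stdout/stderr and the event log:

	output = job.$(Cluster).$(Process).out
	error  = job.$(Cluster).$(Process).err
	log    = job.$(Cluster).$(Process).log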

- ToddM