
Re: [HTCondor-users] HTCondor Schedd and docker



Hi Todd,

Thanks, that fixes the problem that I had (trying to query the schedd from outside the container), provided the hostname of the container is set to be the same as that of the actual host. If you just use the default 'random' hostname, things don't quite work even inside the container: condor_status says there's a schedd with the hostname of the container [1], but if you try to query it by name with condor_q, it doesn't work [2] (which is what Iain saw). However, if you just run condor_q with no arguments, it works [3].

Regards,
Andrew.

[1]
[root@a49d0657b0f0 /]# condor_status -schedd
Name                 Machine    TotalRunningJobs TotalIdleJobs TotalHeldJobs

a49d0657b0f0         a49d0657b0                0             0              0
                      TotalRunningJobs      TotalIdleJobs      TotalHeldJobs


               Total                 0                  0                  0

[2]
[root@a49d0657b0f0 /]# condor_q -name a49d0657b0f0
Error: Collector has no record of schedd/submitter

[3]
[root@a49d0657b0f0 /]# condor_q


-- Schedd: a49d0657b0f0 : <172.17.0.44:58869>
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD

0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended



________________________________________
From: HTCondor-users [htcondor-users-bounces@xxxxxxxxxxx] on behalf of Todd Tannenbaum [tannenba@xxxxxxxxxxx]
Sent: Friday, December 18, 2015 2:08 PM
To: HTCondor-Users Mail List
Subject: Re: [HTCondor-users] HTCondor Schedd and docker

Reading the below makes me think you want to set the knob TCP_FORWARDING_HOST in the condor config inside the container to the IP address of the physical host. See the manual for info on this config knob. Note that some versions of HTCondor had a bug dealing with this setting; it has been fixed as of HTCondor v8.4.2.
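
As a minimal sketch of that setting: the fragment below assumes the container reads a local config file such as /etc/condor/condor_config.local, and that the physical host's address is 203.0.113.10. Both the path and the address are placeholders for illustration, not values from this thread.

```
# HTCondor config inside the container (the file path is an assumption).
# Advertise the physical host's IP instead of the container's private
# 172.17.x.x address, so that clients outside the container can connect.
# 203.0.113.10 is a placeholder; substitute the physical host's real IP.
TCP_FORWARDING_HOST = 203.0.113.10
```

The daemons need to pick up the new value afterwards; restarting them is the safe option. Port 9618 still has to be exposed from the container, as in the original setup.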

Hope the above helps
Todd

Sent from my iPhone

> On Dec 18, 2015, at 6:44 AM, "andrew.lahiff@xxxxxxxxxx" <andrew.lahiff@xxxxxxxxxx> wrote:
>
> Hi Iain,
>
> I think condor_q tries to contact the schedd using the container's IP address, which of course isn't accessible from outside:
>
> Fetching job ads...
> -- Failed to fetch ads from: <172.17.0.42:9618?addrs=172.17.0.42-9618&noUDP&sock=78_147a_4> : <hostname>
> CEDAR:6001:Failed to connect to <172.17.0.42:9618?addrs=172.17.0.42-9618&noUDP&sock=78_147a_4>
>
> Regards,
> Andrew.
>
> ________________________________________
> From: HTCondor-users [htcondor-users-bounces@xxxxxxxxxxx] on behalf of Iain Bradford Steers [iain.steers@xxxxxxx]
> Sent: Friday, December 18, 2015 12:15 PM
> To: HTCondor-Users Mail List
> Subject: Re: [HTCondor-users] HTCondor Schedd and docker
>
> Hi Andrew,
>
> Excellent, means it’s not just that I’ve screwed up my container networking!
>
> Yeah it’s a bit of a weird one.
>
> Doing
>
> condor_q -name 11e3ad4f744a
>
> and then looking at the CollectorLog, it looks like the query that used the Name attribute of the Schedd didn’t get the name passed through:
>
> 12/18/15 12:08:12 (Sending 0 ads in response to query)
> 12/18/15 12:08:12 Query info: matched=0; skipped=1; query_time=0.000041; send_time=0.000049; type=Scheduler; requirements={( ( Name == "" ) )}; peer=<##ip##:27811>; projection={}
>
> I’m wondering if it’s something to do with how condor is internally forming the requirements to send to the collector, however I haven’t spotted it in the debug output yet.
>
> Cheers, Iain
>
>> On Dec 18, 2015, at 12:02, andrew.lahiff@xxxxxxxxxx wrote:
>>
>> Hi Iain,
>>
>> I can reproduce the same problem, but I'm not sure how to fix it - having a collector, negotiator and startds in containers seems to be fairly straightforward (and works), but a schedd is a bit harder.
>>
>> Regards,
>> Andrew.
>>
>> ________________________________
>> From: HTCondor-users [htcondor-users-bounces@xxxxxxxxxxx] on behalf of Iain Bradford Steers [iain.steers@xxxxxxx]
>> Sent: Wednesday, December 16, 2015 11:48 AM
>> To: htcondor-users@xxxxxxxxxxx
>> Subject: [HTCondor-users] HTCondor Schedd and docker
>>
>> Hi All,
>>
>> Seeing as it's almost the holidays I decided to have some fun with docker and a single schedd.
>>
>> I've built and run a container that runs condor with the following custom config.
>>
>> DAEMON_LIST = MASTER, COLLECTOR, SCHEDD
>> USE_SHARED_PORT = True
>>
>> The container works and condor appears to work. I've exposed the usual port 9618 as well. Using bash inside the container I can query both the collector and schedd.
>>
>> However, from outside the container I only appear to be able to query the collector (1).
>>
>> condor_q with multiple attempts at getting -name right just tells me that the collector has no knowledge of the schedd. I've removed the -pool ip as it's publicly routable i.e.
>>
>> outside]# condor_q -name 11e3ad4f744a
>> Error: Collector has no record of schedd/submitter.
>>
>> Has anyone else tried this before?
>>
>> Nothing urgent here, just decided to have some fun and a poke around.
>>
>> Cheers, Iain
>>
>> (1)
>> outside]# condor_status -schedds
>> Name                 Machine    TotalRunningJobs TotalIdleJobs TotalHeldJobs
>>
>> 11e3ad4f744a         11e3ad4f74                0             0              0
>>                     TotalRunningJobs      TotalIdleJobs      TotalHeldJobs
>>
>>
>>              Total                 0                  0                  0
>>
>>
>> _______________________________________________
>> HTCondor-users mailing list
>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>>
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/htcondor-users/
>
>
