
Re: [HTCondor-users] HTCondor Schedd and docker



After setting:

SCHEDD_NAME = myschedd

condor_q works inside the container, both when specifying a name and without, e.g.

[root@60d2cbbd1b3f /]# condor_q myschedd


-- Schedd: myschedd@60d2cbbd1b3f : <172.17.0.45:47149>
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD

0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended

however it still doesn't work from outside the container:

[root@vm91 ~]# condor_q -pool <hostname> -name myschedd
Error: Collector has no record of schedd/submitter

Regards,
Andrew.
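
Note on the addresses above: earlier in this thread it was observed that condor_q tries to contact the schedd on the container's own IP address, which is not accessible from outside. The following is a minimal Python sketch, not part of HTCondor, for checking whether an address copied from condor_q output falls in a private range such as the 172.17.x.x addresses shown here. The function names parse_address and uses_private_address are hypothetical, and the pattern assumes the IPv4 "<ip:port?params>" form seen in this thread only.

```python
import ipaddress
import re

# Matches the "<ip:port>" or "<ip:port?params>" form seen in this thread (IPv4 only).
ADDRESS_RE = re.compile(
    r"<(?P<host>\d{1,3}(?:\.\d{1,3}){3}):(?P<port>\d+)(?:\?[^>]*)?>"
)


def parse_address(text):
    """Return (ip, port) for the first <ip:port...> address found in text."""
    match = ADDRESS_RE.search(text)
    if match is None:
        raise ValueError("no <ip:port> address found")
    return ipaddress.ip_address(match.group("host")), int(match.group("port"))


def uses_private_address(text):
    """True if the advertised address is in a private range, so hosts
    outside that network generally cannot connect to it directly."""
    ip, _port = parse_address(text)
    return ip.is_private


if __name__ == "__main__":
    banner = "-- Schedd: myschedd@60d2cbbd1b3f : <172.17.0.45:47149>"
    ip, port = parse_address(banner)
    print(ip, port, uses_private_address(banner))
```

A private address here only indicates that direct connections from outside are unlikely to work; it does not by itself explain the "Collector has no record of schedd/submitter" error.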


________________________________________
From: HTCondor-users [htcondor-users-bounces@xxxxxxxxxxx] on behalf of Krieger, Donald N. [kriegerd@xxxxxxxx]
Sent: Friday, December 18, 2015 3:06 PM
To: 'HTCondor-Users Mail List'
Subject: Re: [HTCondor-users] HTCondor Schedd and docker

Can you relax that naming limitation with:
   condor_schedd -local-name WhatEverYouWant ?

Best regards,

Don


Don Krieger, Ph.D.
Department of Neurological Surgery
University of Pittsburgh
(412)648-9654 Office
(412)521-4431 Cell/Text

> -----Original Message-----
> From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf
> Of andrew.lahiff@xxxxxxxxxx
> Sent: Friday, December 18, 2015 9:36 AM
> To: htcondor-users@xxxxxxxxxxx
> Subject: Re: [HTCondor-users] HTCondor Schedd and docker
>
> Hi Todd,
>
> Thanks, that fixes the problem that I had (trying to query the schedd from
> outside the container), provided the hostname of the container is set to be the
> same as that of the actual host. If you just use the default 'random' hostname,
> even inside the container things don't quite work: condor_status says there's a
> schedd with the hostname of the container [1], but if you try to query that with
> condor_q it doesn't work [2] (which is what Iain saw). However if you just run
> condor_q it works [3].
>
> Regards,
> Andrew.
>
> [1]
> [root@a49d0657b0f0 /]# condor_status -schedd
> Name                 Machine    TotalRunningJobs TotalIdleJobs TotalHeldJobs
>
> a49d0657b0f0         a49d0657b0                0             0              0
>                       TotalRunningJobs      TotalIdleJobs      TotalHeldJobs
>
>
>                Total                 0                  0                  0
>
> [2]
> [root@a49d0657b0f0 /]# condor_q -name a49d0657b0f0
> Error: Collector has no record of schedd/submitter
>
> [3]
> [root@a49d0657b0f0 /]# condor_q
>
>
> -- Schedd: a49d0657b0f0 : <172.17.0.44:58869>
>  ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
>
> 0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
>
>
>
> ________________________________________
> From: HTCondor-users [htcondor-users-bounces@xxxxxxxxxxx] on behalf of
> Todd Tannenbaum [tannenba@xxxxxxxxxxx]
> Sent: Friday, December 18, 2015 2:08 PM
> To: HTCondor-Users Mail List
> Subject: Re: [HTCondor-users] HTCondor Schedd and docker
>
> Reading the below makes me think you want to set knob
> TCP_FORWARDING_HOST on the condor config inside the container to be the IP
> address of the physical host. See the manual for info on this config knob. Note
> that some versions of HTCondor had a bug dealing with this setting that has
> been fixed as of HTCondor v8.4.2.
>
> Hope the above helps
> Todd
>
> Sent from my iPhone
>
> > On Dec 18, 2015, at 6:44 AM, "andrew.lahiff@xxxxxxxxxx"
> <andrew.lahiff@xxxxxxxxxx> wrote:
> >
> > Hi Iain,
> >
> > I think condor_q tries to contact the schedd using the container's IP address,
> which of course isn't accessible from outside:
> >
> > Fetching job ads...
> > -- Failed to fetch ads from:
> > <172.17.0.42:9618?addrs=172.17.0.42-9618&noUDP&sock=78_147a_4> :
> > <hostname> CEDAR:6001:Failed to connect to
> > <172.17.0.42:9618?addrs=172.17.0.42-9618&noUDP&sock=78_147a_4>
> >
> > Regards,
> > Andrew.
> >
> > ________________________________________
> > From: HTCondor-users [htcondor-users-bounces@xxxxxxxxxxx] on behalf of
> > Iain Bradford Steers [iain.steers@xxxxxxx]
> > Sent: Friday, December 18, 2015 12:15 PM
> > To: HTCondor-Users Mail List
> > Subject: Re: [HTCondor-users] HTCondor Schedd and docker
> >
> > Hi Andrew,
> >
> > Excellent, means it's not just that I've screwed up my container networking!
> >
> > Yeah it's a bit of a weird one.
> >
> > Doing
> >
> > condor_q -name 11e3ad4f744a
> >
> > Looking at the CollectorLog it looks like the query that used the Name
> > attribute of the Schedd didn't get passed through
> >
> > 12/18/15 12:08:12 (Sending 0 ads in response to query)
> > 12/18/15 12:08:12 Query info: matched=0; skipped=1;
> > query_time=0.000041; send_time=0.000049; type=Scheduler;
> > requirements={( ( Name == "" ) )}; peer=<##ip##:27811>; projection={}
> >
> > I'm wondering if it's something to do with how condor is internally forming
> the requirements to send to the collector, however I haven't spotted it in the
> debug output yet.
> >
> > Cheers, Iain
> >
> >> On Dec 18, 2015, at 12:02, andrew.lahiff@xxxxxxxxxx wrote:
> >>
> >> Hi Iain,
> >>
> >> I can reproduce the same problem, but I'm not sure how to fix it - having a
> collector, negotiator and startds in containers seems to be fairly
> straightforward (and works), but a schedd is a bit harder.
> >>
> >> Regards,
> >> Andrew.
> >>
> >> ________________________________
> >> From: HTCondor-users [htcondor-users-bounces@xxxxxxxxxxx] on behalf
> >> of Iain Bradford Steers [iain.steers@xxxxxxx]
> >> Sent: Wednesday, December 16, 2015 11:48 AM
> >> To: htcondor-users@xxxxxxxxxxx
> >> Subject: [HTCondor-users] HTCondor Schedd and docker
> >>
> >> Hi All,
> >>
> >> Seeing as it's almost the holidays I decided to have some fun with docker
> and a single schedd.
> >>
> >> I've built and run a container that runs condor with the following custom
> config.
> >>
> >> DAEMON_LIST = MASTER, COLLECTOR, SCHEDD
> >> USE_SHARED_PORT = True
> >>
> >> The container works and condor appears to work. I've exposed the usual port
> 9618 as well. Using bash inside the container I can query both the collector and
> schedd.
> >>
> >> However from outside the container I only appear able to query the
> >> collector.(1)
> >>
> >> condor_q with multiple attempts at getting -name right just tells me that the
> collector has no knowledge of the schedd. I've removed the -pool ip as it's
> publicly routable i.e.
> >>
> >> outside]# condor_q -name 11e3ad4f744a
> >> Error: Collector has no record of schedd/submitter.
> >>
> >> Has anyone else tried this before?
> >>
> >> Nothing urgent here, just decided to have some fun and a poke around.
> >>
> >> Cheers, Iain
> >>
> >> (1)
> >> outside]# condor_status -schedds
> >> Name                 Machine    TotalRunningJobs TotalIdleJobs TotalHeldJobs
> >>
> >> 11e3ad4f744a         11e3ad4f74                0             0              0
> >>                     TotalRunningJobs      TotalIdleJobs      TotalHeldJobs
> >>
> >>
> >>              Total                 0                  0                  0
> >>
> >>
> >> _______________________________________________
> >> HTCondor-users mailing list
> >> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
> >> with a
> >> subject: Unsubscribe
> >> You can also unsubscribe by visiting
> >> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
> >>
> >> The archives can be found at:
> >> https://lists.cs.wisc.edu/archive/htcondor-users/
> >
> >
>
>
