
Re: [HTCondor-users] Problems with submit node in version 8.6.0



I believe this problem reflects the same underlying issue as this previous report:

https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=5857,4

--
Tom Downes
Senior Scientist and Data Center Manager
Center for Gravitation, Cosmology and Astrophysics
University of Wisconsin-Milwaukee
414.229.2678

On Wed, Feb 1, 2017 at 11:35 AM, Todd Tannenbaum <tannenba@xxxxxxxxxxx> wrote:
On 2/1/2017 9:13 AM, Feldt, Andrew N. wrote:
Zach,

Update - I had neglected to try the "-allusers" flag - with this flag I
find that I can run condor_q from a non-submit node without any problem
(and without the fix you suggest) in our environment. I.e. "condor_q
-allusers -g" works just fine for us.

Andy

So if you want, you can make condor_q behave as if "-allusers" were given by default by placing the following into your condor_config file:

 CONDOR_Q_ONLY_MY_JOBS = False

regards,
Todd




On Feb 1, 2017, at 9:04 AM, Feldt, Andrew N. <afeldt@xxxxxx> wrote:


Zach,

Unfortunately, this does not work in all environments. We are an
NFS/NIS set of systems. Without the fix you have suggested, condor_q
works for the submitter or a queue super user directly from the
submitting host but not from any other node (where one would have to
use the -g flag). However, after setting the two variables below,
condor_q does not even work from the submitting node.

An analysis of the logs (without the fix) shows that "condor_q -g"
from a non-submitting node fails because no authentication method
works. We use "FS" authentication which works for 8.4 in our NFS
environment on RHEL 7.3 with SELinux enabled, but fails for 8.6.0.

Andy

*Andy Feldt*
/Senior System Support Programmer/
/Affiliate Assistant Professor/
Homer L. Dodge Department of Physics & Astronomy
The University of Oklahoma

On Feb 1, 2017, at 6:27 AM, Zach Miller <zmiller@xxxxxxxxxxx> wrote:

This is indeed a problem, related to the new behavior of having
condor_q show only the jobs for the user running condor_q.

Unfortunately, it's not as simple as running "condor_q -allusers" to
return to the old behavior. For the time being, if you want to use
8.6.0 you'll need to change:

   SEC_DEFAULT_AUTHENTICATION = OPTIONAL
   SEC_DEFAULT_NEGOTIATION = OPTIONAL

This does add some overhead in terms of network traffic, but it does
work. We'll take a deeper look at how to address this, but in the
meantime I just wanted to put that workaround out there.


Cheers,
-zach


On 2/1/17, 12:22 AM, "HTCondor-users on behalf of
Greg.Hitchen@xxxxxxxx" <htcondor-users-bounces@xxxxxxc.edu on behalf of
Greg.Hitchen@xxxxxxxx> wrote:

Hi All

Thought I'd have a look at the latest stable release but have had
some issues/problems.
I have tried this on win7 32 bit with 32 bit condor 8.6.0, win7 64
bit with 32 bit condor 8.6.0,
win2008 server 64 bit with 32 bit condor 8.6.0, and win2008 server
64 bit with 64 bit condor 8.6.0.
Lastly I have tried it on Ubuntu1604 64 bit.
I have the same problem with all of the above combinations.

Security: host based

I'm testing the 8.6.0 version using the same condor_config files
that work for 8.4.4 (and previous versions).

If I try a condor_q command on the submit node for 8.6.0 I get:

condor_q
-- Failed to fetch ads from:
<152.83.64.39:9618?addrs=152.83.64.39-9618&noUDP&sock=4488_72cc_3> :
WIN2008-GJH9-VC.nexus.csiro.au

If I run the 8.4.4 condor_q executable it works and I get:

c:\condor-8.4.4\bin\condor_q
-- Schedd: WIN2008-GJH9-VC.nexus.csiro.au : <152.83.64.39:9618?...

 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD

0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended

In fact I can run the 8.4.4 version of condor_q from any other
submit node using the -name option
and that works OK as well.

Turning on FULL_DEBUG gives the following in the schedd log file:
01/31/17 15:27:33 Calling Handler <DaemonCommandProtocol::WaitForSocketData> (6)
01/31/17 15:27:33 Failed to read end of message from <152.83.64.39:60746>; 310 untouched bytes.
01/31/17 15:27:33 AUTHENTICATE: handshake failed!
01/31/17 15:27:33 DC_AUTHENTICATE: Our security policy is invalid!
01/31/17 15:27:33 Return from Handler <DaemonCommandProtocol::WaitForSocketData> 0.000196s
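
To pick these failures out of a longer schedd log, the two AUTHENTICATE messages can be searched for directly. What follows is only a minimal Python sketch: the function name auth_failures is hypothetical, the only message strings it knows are the two shown in the excerpt above, and it assumes each log line starts with a date and a time.

```python
# Message fragments copied from the schedd log excerpt in this thread.
FAILURE_MARKERS = (
    "AUTHENTICATE: handshake failed!",
    "DC_AUTHENTICATE: Our security policy is invalid!",
)


def auth_failures(log_lines):
    """Yield (timestamp, marker) for each line containing a known marker."""
    for line in log_lines:
        for marker in FAILURE_MARKERS:
            if marker in line:
                # Assumption: the line begins with "MM/DD/YY HH:MM:SS ".
                timestamp = " ".join(line.split()[:2])
                yield timestamp, marker
                break


if __name__ == "__main__":
    sample = [
        "01/31/17 15:27:33 Failed to read end of message from "
        "<152.83.64.39:60746>; 310 untouched bytes.",
        "01/31/17 15:27:33 AUTHENTICATE: handshake failed!",
        "01/31/17 15:27:33 DC_AUTHENTICATE: Our security policy is invalid!",
    ]
    for timestamp, marker in auth_failures(sample):
        print(timestamp, marker)
```

Passing an open log file instead of the sample list works the same way, since the function only iterates over lines.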

The above bit about "Failed to read end of message" seems strange?
As do the AUTHENTICATE lines. For reference we have the following
config lines, some of which should turn off authentication?
QUEUE_ALL_USERS_TRUSTED = True
SEC_DEFAULT_AUTHENTICATION = NEVER
SEC_DEFAULT_AUTHENTICATION_METHODS = CLAIMTOBE
SEC_DEFAULT_AUTHENTICATION_TIMEOUT = 20
SEC_DEFAULT_NEGOTIATION = NEVER
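
The workaround quoted earlier in this thread changes SEC_DEFAULT_AUTHENTICATION and SEC_DEFAULT_NEGOTIATION to OPTIONAL, so it can help to list which such settings are currently NEVER. The following is a minimal Python sketch, not an official tool: the function name find_never_settings is hypothetical, and it assumes the input is plain "NAME = VALUE" text, one setting per line (for example, saved configuration output).

```python
import re

# Matches lines such as "SEC_DEFAULT_AUTHENTICATION = NEVER".
# Names with a further suffix (e.g. ..._AUTHENTICATION_METHODS) do not match.
_SETTING = re.compile(
    r"^\s*(SEC_\w+_(?:AUTHENTICATION|NEGOTIATION))\s*=\s*(\S+)\s*$"
)


def find_never_settings(config_text):
    """Return the SEC_*_AUTHENTICATION / SEC_*_NEGOTIATION names set to NEVER."""
    hits = []
    for line in config_text.splitlines():
        match = _SETTING.match(line)
        if match and match.group(2).upper() == "NEVER":
            hits.append(match.group(1))
    return hits


if __name__ == "__main__":
    # Sample text taken from the config lines quoted in this message.
    sample = """\
QUEUE_ALL_USERS_TRUSTED = True
SEC_DEFAULT_AUTHENTICATION = NEVER
SEC_DEFAULT_AUTHENTICATION_METHODS = CLAIMTOBE
SEC_DEFAULT_AUTHENTICATION_TIMEOUT = 20
SEC_DEFAULT_NEGOTIATION = NEVER
"""
    for name in find_never_settings(sample):
        print(f"{name} is NEVER; the workaround in this thread uses OPTIONAL")
```

The sketch only reports settings; whether changing them is appropriate depends on the site's security policy.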

If I install 8.6.0 from the msi windows installer file (rather
than just unzipping), and therefore use
the generated condor_config file, then condor_q works OK.

I have painstakingly compared the output from condor_config_val
-dump -verbose for 8.6.0
using both our 8.4.4 config files and the 8.6.0 generated config
files and cannot see many
differences, and those don't "appear" to be important.

I've tried using the generated condor_config and renaming our
8.4.4 condor_config to condor_config.local
but then the condor_q error is back again.

Any help/suggestions greatly appreciated.

Cheers

Greg

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxx.edu with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/






--
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing   Department of Computer Sciences
HTCondor Technical Lead                1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                  Madison, WI 53706-1685
