Re: [HTCondor-users] [htcondor-admin #90892] UPDATE: RE: Error when submitting runs



Good morning,

Nothing has changed on the system lately, and all disks in use have plenty of room. The error was originally:

"Shadow exception. 
Error from slot*@compute-*-*.local. Failed to execute
Errno=13: Permission denied"

It then changed after I rebooted the head node and all 9 compute nodes. I've also restarted condor on all nodes. The answers to your questions are below; thanks for your time!

How would I determine which version I am running?

The error occurs when starting any job. 

How would I determine which HTCondor binary is producing the error?

The error isn't specific to any job.

We've always issued the commands from the head node.

Yes, the error repeats every time, regardless of the job.

Thanks for anything you can do for me. I'm just the Systems Admin here; I don't use Condor the way my users/customers do.

Nate Mobley
Millennium Engineering & Integration Company
ISSO/Systems Administrator
Desk: 256-489-7847
Cell (Voice Only): 256-655-5570
MEI Help Desk: 703-413-7771
nmobley@xxxxxxxxxxxxxx
www.meicompany.com

-----Original Message-----
From: Todd Miller via RT [mailto:spam@xxxxxxxxxxx] 
Sent: Tuesday, December 19, 2017 2:25 PM
To: Mobley, Nate (Millennium) <nmobley@xxxxxxxxxxxxxx>
Subject: [crt.cs.wisc.edu #90891] *****SPAM***** Re: [htcondor-admin #90892] UPDATE: RE: Error when submitting runs

Spam detection software, running on the system "ninja.cs.wisc.edu", has identified this incoming email as possible spam.  The original message has been attached to this so you can view it or label similar future email.  If you have any questions, see lab@xxxxxxxxxxx for details.

Content preview:  As far as I can tell, 'forrtl' does not appear in the current
   HTCondor code base. "No space left on device" certainly suggests that you're
   running out of disk space somewhere along the way, but without considerably
   more context, that's all I can say. (For instance: what version of HTCondor
   are you running? When is the error occurring? Which HTCondor binary is producing
   the error? Is it related to a specific job? Is it related to a specific machine?
   Can you reproduce the problem?) [...] 

Content analysis details:   (6.1 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.3 RCVD_IN_RP_RNBL        RBL: Relay in RNBL,
                            https://senderscore.org/blacklistlookup/
                           [69.130.245.124 listed in bl.score.senderscore.com]
 3.6 RCVD_IN_PBL            RBL: Received via a relay in Spamhaus PBL
                            [69.130.245.124 listed in zen.spamhaus.org]
 1.3 RDNS_NONE              Delivered to internal network by a host with no rDNS