
[HTCondor-users] Update: RE: Error: "forrtl: no space left on device"



Update: I worked all day on the issue where my runs crash because there is no space left on the partition that holds “/var/opt/condor/execute”. I created a symbolic link pointing “/var/opt/condor/execute” to “export/condor_configure/execute” and rebooted, but nothing changed. I also made sure that both condor_config and condor_config.local specify “EXECUTE = $ /export/condor_output/execute”. I’m at a loss as to why I cannot force this temp data to be written to “/export/condor_output/execute”. Thanks for any advice.
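
For reference, one way to see which directory is actually in effect is to ask HTCondor for the current value of EXECUTE and resolve any symbolic links in it. This is only a rough sketch: it assumes condor_config_val is on the PATH, and the helper name effective_execute_dir is made up for illustration. It returns None when the tool is missing or the query fails.

```python
import os
import shutil
import subprocess


def effective_execute_dir():
    """Return the resolved EXECUTE directory, or None if it cannot be queried."""
    tool = shutil.which("condor_config_val")
    if tool is None:
        return None  # HTCondor command-line tools are not on PATH
    result = subprocess.run([tool, "EXECUTE"], capture_output=True, text=True)
    if result.returncode != 0:
        return None  # EXECUTE is not defined, or the query failed
    # realpath follows symbolic links, so a linked directory reports its target
    return os.path.realpath(result.stdout.strip())


if __name__ == "__main__":
    print(effective_execute_dir())
```

If the printed path is still under /var/opt/condor, the value set in the config files is probably not the one taking effect.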

 

Nate Mobley

Millennium Engineering & Integration Company

ISSO/Systems Administrator

Desk: 256-489-7847

Cell (Voice Only): 256-655-5570

MEI Help Desk:  703-413-7771

nmobley@xxxxxxxxxxxxxx

www.meicompany.com

 

From: Mobley, Nate (Millennium)
Sent: Thursday, January 04, 2018 9:51 AM
To: 'htcondor-admin@xxxxxxxxxxx' <htcondor-admin@xxxxxxxxxxx>; 'htcondor-users@xxxxxxxxxxx' <htcondor-users@xxxxxxxxxxx>
Subject: Error: "forrtl: no space left on device"

 

Good morning,

 

Thank you for the assistance you’ve all provided in the last couple of weeks with the shadow exception error we were receiving; I determined what was going on there and corrected that issue.

 

However, I’m back to my “original” issue, where the _condor.log file states “Job was evicted” and the _condor.error file states “forrtl: no space left on device” and “forrtl: severe (38): error during write, unit 2, file /var/opt/condor/execute/some/file”.

 

The partition that “/var/opt/condor/execute/some/file” is being written to is very small, with less than 1 GB available, so this error makes sense. What doesn’t make sense is why HTCondor is still trying to write to “/var/opt/condor/execute” when I set the EXECUTE directory to “/export/condor/execute”, which is on a much larger partition. I specified this in my condor_config.local file in May of 2017, and that cleared up all of the issues I was having then, until just recently.
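
As a quick way to confirm how much room each partition has, something like the sketch below could be run on the execute node. The helper name free_gib is made up for illustration, and the two paths are simply the ones mentioned above; any path that does not exist on the host is skipped.

```python
import os
import shutil


def free_gib(path):
    """Free space, in GiB, on the partition that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    # Paths taken from this message; skip any that are absent on this host
    for path in ("/var/opt/condor/execute", "/export/condor/execute"):
        if os.path.exists(path):
            print(f"{path}: {free_gib(path):.2f} GiB free")
```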

 

Can I easily set up a symbolic link to tell any data being written to “/var/opt/condor/execute” to redirect to “/export/condor/execute”? I’m not very familiar with symbolic links. Or is there somewhere else where I need to tell HTCondor to write its temp data to “/export/condor/execute”?
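
To illustrate what a symbolic link does here, the sketch below builds a throwaway copy of the two directory trees under a temporary directory, links the "small" execute path to the "large" one, and writes a file through the link. Everything in it is a stand-in (the function name demo_symlink and the temporary layout are assumptions for illustration), and it assumes a POSIX system. On the command line the equivalent is `ln -s TARGET LINK`, where LINK must not already exist as a directory, otherwise the link is created inside that directory.

```python
import os
import tempfile


def demo_symlink():
    """Write through a symlink and report where the file really lands."""
    with tempfile.TemporaryDirectory() as root:
        # Stand-ins for the large and small partitions
        big = os.path.join(root, "export", "condor", "execute")
        small_parent = os.path.join(root, "var", "opt", "condor")
        os.makedirs(big)
        os.makedirs(small_parent)

        # The link takes the place of the old execute directory
        link = os.path.join(small_parent, "execute")
        os.symlink(big, link)

        # A write through the link ends up in the target directory
        with open(os.path.join(link, "some_file"), "w") as fh:
            fh.write("scratch data\n")

        landed_in_target = os.path.exists(os.path.join(big, "some_file"))
        same_place = os.path.realpath(link) == os.path.realpath(big)
        return landed_in_target, same_place


if __name__ == "__main__":
    print(demo_symlink())
```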

 

Thank you for any assistance you can provide. My customers need to be able to run this data ASAP for a report due next week, so a workaround or quick fix would be ideal for this week, until I can address the real issue (partition sizes) at a later time.

 

Cheers

 

Nate Mobley
