
Re: [HTCondor-users] Condor and Ceph



* On 25 Sep 2015, Lidie Stephen wrote: 
> Condor was running fine until we moved user home folders from a traditional NFS server to a Ceph-based server.  Now a condor_submit gives:
> 
> 	[lusol@condor condor]$ condor_submit submit.job
> 	Submitting job(s)
> 	ERROR: Can't open "/home/lusol/condor/out.0"  with flags 01101 (Value too large for defined data type)
> 
> Has anyone a clue as to what might be happening?  If I specify that output and error files go to /tmp rather than the Ceph volume, then condor_submit works normally.

I don't have an immediate answer, but we submit from ceph routinely with
no known workarounds, so there is hope.  Our CephFS is only 400 TB --
it's possible that a larger FS would trigger integer overflow issues, I
guess.

The only thing I can think of that Ceph does differently with respect to
stat() is that a directory's st_size reflects the volume of data anywhere
underneath that point in the filesystem, rather than the conventional
space required to store the directory entries.  Do any of the
directories up from this file contain disturbingly vast amounts of data?
(Try ls -ld on each directory up from out.0.)
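
If checking each level by hand is tedious, here is a rough Python sketch
of the same ls -ld walk.  It prints st_size for every directory above a
given file and flags any value larger than 2^31-1.  That threshold is my
assumption (a signed 32-bit size field somewhere in the path); nothing in
this thread confirms that is the cause.  The names parent_dirs and report
are made up for this example.

```python
import os
import sys

# Largest value a signed 32-bit size field can hold. That a 32-bit field
# is involved at all is an assumption, not something confirmed here.
INT32_MAX = 2**31 - 1


def parent_dirs(path):
    """Yield each directory above `path`, nearest first, ending at the root."""
    current = os.path.dirname(os.path.abspath(path))
    while True:
        yield current
        parent = os.path.dirname(current)
        if parent == current:  # dirname of the root is the root itself
            return
        current = parent


def report(path, limit=INT32_MAX):
    """Return (directory, st_size, exceeds_limit) for each parent directory.

    Directories that cannot be stat()ed are skipped.
    """
    rows = []
    for d in parent_dirs(path):
        try:
            size = os.stat(d).st_size
        except OSError:
            continue
        rows.append((d, size, size > limit))
    return rows


if __name__ == "__main__":
    # Default to the path from the original error message.
    target = sys.argv[1] if len(sys.argv) > 1 else "/home/lusol/condor/out.0"
    for d, size, too_big in report(target):
        flag = "  <-- larger than 2^31-1" if too_big else ""
        print(f"{size:>20}  {d}{flag}")
```

Run it with the path of the output file as the only argument.  On a
conventional filesystem every directory should report a small size; on
CephFS the numbers grow toward the top of the tree.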

We're using ceph 0.80.7.

-- 
       David Champion • dgc@xxxxxxxxxxxx • University of Chicago
