
Re: [Condor-users] Nodes cannot start condor



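Filesystem error :-)

That would do it. Once ext3 aborts the journal and the kernel remounts the filesystem read-only, Condor cannot open its logs for writing, whatever the file permissions say.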
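
For anyone who hits the same startup message: the errno 30 in the quoted output is EROFS ("Read-only file system") on Linux, which is a different failure from a permissions problem (EACCES or EPERM). The sketch below is only an illustration of how to tell the two apart. The function names `explain_open_failure` and `classify_errno` are made up for this example and are not part of Condor.

```python
import errno
import os


def classify_errno(code):
    """Map an errno value from a failed open() to a short diagnosis."""
    if code == errno.EROFS:
        # The filesystem itself is mounted read-only; mode bits are irrelevant.
        return "read-only file system: check the kernel log, not the permissions"
    if code in (errno.EACCES, errno.EPERM):
        return "permission denied: check ownership and mode bits"
    if code == errno.ENOENT:
        return "no such file or directory"
    return "other error: " + os.strerror(code)


def explain_open_failure(path):
    """Try to open an existing file for appending and report why it fails."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    except OSError as exc:
        return classify_errno(exc.errno)
    os.close(fd)
    return "ok: file can be opened for writing"


if __name__ == "__main__":
    # Path taken from the error message in this thread.
    print(explain_open_failure("/var/log/condor/MasterLog"))
```

Run as the condor user, this does roughly what the suggested `touch` test does, but it also says which kind of failure occurred.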

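The kernel messages quoted below follow a recognisable pattern: EXT3-fs errors, a journal abort, then a read-only remount. As a rough sketch, assuming the message formats look like the ones in this thread (other kernels and filesystems word them differently), saved `dmesg` output could be summarised like this. The helper name `summarize_kernel_log` is hypothetical.

```python
import re

# Patterns modelled on the kernel messages quoted in this thread.
EXT3_ERROR = re.compile(r"EXT3-fs error \(device (?P<dev>[^)]+)\)")
JOURNAL_ABORT = re.compile(r"Aborting journal on device (?P<dev>\S+?)\.?$")
REMOUNT_RO = "Remounting filesystem read-only"


def summarize_kernel_log(text):
    """Count EXT3 errors per device and note journal aborts and ro remounts."""
    errors = {}
    aborted = set()
    remounted = False
    for line in text.splitlines():
        line = line.strip()
        match = EXT3_ERROR.search(line)
        if match:
            dev = match.group("dev")
            errors[dev] = errors.get(dev, 0) + 1
        match = JOURNAL_ABORT.search(line)
        if match:
            aborted.add(match.group("dev"))
        if REMOUNT_RO in line:
            remounted = True
    return {
        "errors": errors,
        "journal_aborted": sorted(aborted),
        "remounted_read_only": remounted,
    }
```

The text can come from a saved log file or from `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`. Reading the kernel log may need elevated privileges on some systems.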

On Tue, Jul 13, 2010 at 6:55 AM, David McKechan
<david.mckechan@xxxxxxxxxxxxxx> wrote:
> Hi,
>
> On Tue, Jul 13, 2010 at 11:45 AM, Mag Gam <magawake@xxxxxxxxx> wrote:
>> as the condor user, can you
>> cd /var/log/condor && touch hi
>
> Thanks for this. I also had a reply off-list and it was suggested I
> look at dmesg. I got the following output:
> ..
> condor_exec.467[28638]: segfault at 0000000000000010 rip 0000000000814de7 rsp 00007fffffff3850 error 6
> lalapps_BankEff[25683]: segfault at 0000000000000078 rip 0000000000442831 rsp 00007fffbb4e3a10 error 4
> lalapps_BankEff[25698]: segfault at 0000000000000078 rip 0000000000442831 rsp 00007fff1253e640 error 4
> EXT3-fs error (device sda1): ext3_free_blocks: Freeing blocks not in datazone - block = 931161345, count = 1
> Aborting journal on device sda1.
> ext3_abort called.
> EXT3-fs error (device sda1): ext3_journal_start_sb: Detected aborted journal
> Remounting filesystem read-only
> EXT3-fs error (device sda1): ext3_free_blocks: Freeing blocks not in datazone - block = 847167968, count = 1
> EXT3-fs error (device sda1): ext3_free_blocks: Freeing blocks not in datazone - block = 1288987090, count = 1
> EXT3-fs error (device sda1): ext3_free_blocks: Freeing blocks not in datazone - block = 1074310421, count = 1
> EXT3-fs error (device sda1): ext3_free_blocks: Freeing blocks not in
> ..
>
> which suggested something weird and I decided to rebuild the node.
>
> Thanks,
> David
>
>> On Tue, Jul 13, 2010 at 3:37 AM, David McKechan
>> <david.mckechan@xxxxxxxxxxxxxx> wrote:
>>> Hi,
>>>
>>> I've had this incident twice in the last week on separate nodes. The
>>> node drops out of the condor pool and then I cannot start condor on
>>> it. I get the following error:
>>>
>>> [node3 ~]$ /etc/init.d/condor start
>>> Starting up Condor...Can't open "/var/log/condor/MasterLog"
>>> dprintf() had a fatal error in pid 27088
>>> Can't open "/var/log/condor/MasterLog"
>>> errno: 30 (Read-only file system)
>>> euid: 106119, ruid: 106119
>>> done.
>>>
>>> Although the file should be writable by the condor user:
>>> [node3 ~]$ ls -hlrt /var/log/condor/
>>> total 36M
>>> -rw-r--r-- 1 condor condor 1.8K Jul  5 15:09 StarterLog
>>> -rw-r--r-- 1 condor condor 9.6M Jul 11 06:59 StartLog.old
>>> -rw-r--r-- 1 condor condor 5.2M Jul 12 00:03 StarterLog.boinc
>>> -rw------- 1 condor condor    0 Jul 12 23:09 InstanceLock
>>> -rw-r--r-- 1 condor condor 148K Jul 13 05:48 MasterLog
>>> -rw-r--r-- 1 condor condor 5.3M Jul 13 06:00 StarterLog.slot2
>>> -rw-r--r-- 1 condor condor 8.5M Jul 13 06:03 StarterLog.slot1
>>> prw------- 1 condor condor    0 Jul 13 06:03 procd_pipe.STARTD.watchdog
>>> -rw-r--r-- 1 condor condor 5.3M Jul 13 06:03 StartLog
>>> prw------- 1 condor condor    0 Jul 13 06:03 procd_pipe.STARTD
>>> -rw-r--r-- 1 condor condor 1.3M Jul 13 06:03 CkptServerLog
>>>
>>> Can anyone help me?
>>>
>>> Thanks,
>>> David
>>> --
>>> Help me raise money for Alzheimer Scotland - http://www.waitup.org.uk
>>> _______________________________________________
>>> Condor-users mailing list
>>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>>> subject: Unsubscribe
>>> You can also unsubscribe by visiting
>>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>>
>>> The archives can be found at:
>>> https://lists.cs.wisc.edu/archive/condor-users/
>>>