Bug #3055

closed

HAMMER2 crash + LK_RELEASE fail

Added by arcade@b1t.name over 7 years ago. Updated over 3 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
VFS subsystem
Target version:
Start date:
09/14/2017
Due date:
% Done:

0%

Estimated time:

Description

This happens when the last 'cleanup' was run too long ago.

kern_rename actually happened hours before the actual crash.


Files

core.txt.33 (178 KB) arcade@b1t.name, 09/14/2017 12:34 PM
core.txt.34 (179 KB) arcade@b1t.name, 09/18/2017 09:07 AM
core.txt.38 (206 KB) arcade@b1t.name, 09/23/2017 02:51 PM
core.txt.39 (282 KB) arcade@b1t.name, 10/12/2017 01:10 AM
core.txt.40 (185 KB) arcade@b1t.name, 10/12/2017 01:10 AM
core.txt.44 (213 KB) arcade@b1t.name, 10/17/2017 01:21 PM
core.txt.46 (282 KB) arcade@b1t.name, 11/01/2017 02:14 PM
core.txt.47 (215 KB) arcade@b1t.name, 11/01/2017 08:22 PM
core.txt.51 (280 KB) arcade@b1t.name, 11/28/2017 12:13 AM
core.txt.52 (295 KB) arcade@b1t.name, 11/28/2017 12:13 AM
Actions #1

Updated by arcade@b1t.name over 7 years ago

PS: This is without latest kern_mutex.c changes.

Actions #2

Updated by dillon over 7 years ago

  • Status changed from New to In Progress
  • Assignee set to dillon

Was the filesystem full at the time this ran? There is an error path that is not being checked properly in hammer2_chain_indirect_maintenance() for the situation where the filesystem has become full. I will commit error processing for that part of the code right now. If it still panics (versus just throwing an error on the kernel console), I'll need a backtrace from kgdb.

-Matt

Actions #3

Updated by arcade@b1t.name over 7 years ago

The filesystem was close to 90% full, with 20% being "jettisonable". I'm not sure about this one, actually, as the FS was created more than a few weeks ago and may contain some older discrepancies. I can recreate the FS from scratch and retest if that is required.

Anyway, running "hammer cleanup" twice makes the FS stable again. Without cleanup the host can't even boot due to problems writing data:

strategy_xop_write: error 32 loff=0000000057480000
strategy_xop_write: error 32 loff=00000000583f0000
strategy_xop_write: error 32 loff=000000005a680000
strategy_xop_write: error 32 loff=0000000063f50000
strategy_xop_write: error 32 loff=00000000677c0000
strategy_xop_write: error 32 loff=000000006aa90000
strategy_xop_write: error 32 loff=0000000073020000
strategy_xop_write: error 32 loff=0000000076cf0000
strategy_xop_write: error 32 loff=000000007f660000
strategy_xop_write: error 32 loff=0000000084430000
strategy_xop_write: error 32 loff=0000000087450000
strategy_xop_write: error 32 loff=0000000088e00000
strategy_xop_write: error 32 loff=0000000093ac0000
strategy_xop_write: error 32 loff=0000000093b90000

Or just crashes again.
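
For anyone triaging similar console output, the lines above are regular enough to parse mechanically. A minimal sketch, assuming the exact `strategy_xop_write: error N loff=HEX` layout shown in this comment; `parse_xop_write_line` is a hypothetical helper, not part of HAMMER2 or any DragonFly tool:

```c
#include <stdio.h>

/*
 * Hypothetical helper: parse one console line of the form
 *   strategy_xop_write: error 32 loff=0000000057480000
 * Returns 1 on success and fills *err and *loff, 0 otherwise.
 */
static int
parse_xop_write_line(const char *line, int *err, unsigned long long *loff)
{
	/* %llx reads the zero-padded hexadecimal logical offset. */
	return sscanf(line, "strategy_xop_write: error %d loff=%llx",
	    err, loff) == 2;
}
```

Collecting the parsed offsets makes it easier to check whether the failing writes cluster in one region or are spread across the file.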

Actions #4

Updated by arcade@b1t.name over 7 years ago

Happened again. Alas, my kernel was built without DEBUG so kgdb output is pretty useless. Will try replicating one more time.

Actions #5

Updated by arcade@b1t.name about 7 years ago

Happened again. Pool was 93% full (9G free).

Actions #6

Updated by arcade@b1t.name about 7 years ago

A few more crashes. 7G free space reached...

Actions #7

Updated by dillon about 7 years ago

I think I see what is going on. I did not completely instrument error handling for some of these failure cases (when the media becomes full). Several calls to hammer2_chain_delete() are not processing the returned error code and that may be leading to these assertions.

I am working on instrumenting these and will commit an update this afternoon to master and release.

-Matt
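
The pattern described in this comment, a caller dropping the error returned by a delete operation, can be illustrated outside the kernel. This is a minimal userland sketch and not the HAMMER2 code: `fake_chain`, `fake_chain_delete`, `unlink_unchecked`, `unlink_checked`, and the `FAKE_ENOSPC` value are all hypothetical stand-ins.

```c
#define FAKE_ENOSPC 1	/* hypothetical error code, not HAMMER2's */

struct fake_chain {
	int deleted;	/* set to 1 once the delete has really happened */
};

/* Hypothetical stand-in for a delete that fails when the media is full. */
static int
fake_chain_delete(struct fake_chain *chain, int media_full)
{
	if (media_full)
		return FAKE_ENOSPC;	/* nothing was changed */
	chain->deleted = 1;
	return 0;
}

/* Problem shape: the return value is dropped, the caller assumes success. */
static int
unlink_unchecked(struct fake_chain *chain, int media_full)
{
	(void)fake_chain_delete(chain, media_full);
	return 0;	/* reports success even if the delete failed */
}

/* Checked shape: the error is propagated to the caller. */
static int
unlink_checked(struct fake_chain *chain, int media_full)
{
	int error;

	error = fake_chain_delete(chain, media_full);
	if (error)
		return error;	/* caller can fail the operation cleanly */
	return 0;
}
```

In the unchecked variant the caller proceeds as if the chain were gone while its state is unchanged, which is the kind of inconsistency a later assertion could trip over.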

Actions #8

Updated by arcade@b1t.name about 7 years ago

Another crash...

Actions #9

Updated by arcade@b1t.name about 7 years ago

This situation was triggered a few more times, but there was no crash. The host was getting slow but was slowly getting through.

Actions #10

Updated by arcade@b1t.name about 7 years ago

I knew I could do it.

Actions #11

Updated by arcade@b1t.name about 7 years ago

And here's something new:

panic: assertion "chain->bref.key == base[i].key" failed in hammer2_combined_find at /usr/src/sys/vfs/hammer2/hammer2_chain.c:4968

Actions #12

Updated by arcade@b1t.name about 7 years ago

Fresh cores arrived.

Actions #13

Updated by liweitianux over 5 years ago

Hello. Is this issue resolved with the latest master/release? HAMMER2 has gained significant improvements. Thank you.

Actions #14

Updated by arcade@b1t.name over 3 years ago

  • Status changed from In Progress to Closed

I guess not applicable anymore.
