Bug #1418

closed

HAMMER and problem with external drive(s)

Added by joelkp almost 15 years ago. Updated almost 15 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:

Description

The problems begin some time after mount, at the log line beginning with "(da8".
This is a new version 2 HAMMER partition, created and used on a freshly updated
system on a ThinkPad X31 laptop. The same kind of error and messages have also
appeared with a different external drive (older HAMMER partition, older
installation) on a different DragonFly install, when running that install's even
older installation CD (an old 09-02-06 snapshot) live, and when trying to install
with HAMMER onto that other external drive (for later manual adjustment and use
in the laptop).

No such problem with other filesystems, no problem (ever) with HAMMER on
non-external drives.

The log below runs up to the first incident. The messages with which the problem
begins keep repeating on any further use. Unmounting and re-mounting buys some
time (though much of the data written during the previous mount is lost) before
it happens again in the same way.

Everything works until the
"(da8:umass-sim0:0:0:0): SYNCHRONIZE CACHE. CDB: 35 0 0 0 0 0 0 0 0 0
(da8:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(da8:umass-sim0:0:0:0): SCSI Status: Check Condition
(da8:umass-sim0:0:0:0): ILLEGAL REQUEST asc:20,0
(da8:umass-sim0:0:0:0): Invalid command operation code
(da8:umass-sim0:0:0:0): Unretryable error"

starts repeating, followed by

"HAMMER: Critical error inode=-1 while flushing meta-data"

and

"HAMMER: Forcing read-only mode".

Console log for fluffc.localhost
umass0: <Generic USB Storage Device, class 0/0, rev 2.00/0.00, addr 2> on uhub3
umass0: Get Max Lun not supported (TIMEOUT)
da8 at umass-sim0 bus 0 target 0 lun 0
da8: <WDC WD40 WD-WMAMY1717283 6A04> Fixed Direct Access SCSI-2 device
da8: 40.000MB/s transfers
da8: 381554MB (781422768 512 byte sectors: 255H 63S/T 48641C)
Jul 8 23:06:57 fluffc su: joel to root on /dev/ttyp1
(da8:umass-sim0:0:0:0): SYNCHRONIZE CACHE. CDB: 35 0 0 0 0 0 0 0 0 0
(da8:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(da8:umass-sim0:0:0:0): SCSI Status: Check Condition
(da8:umass-sim0:0:0:0): ILLEGAL REQUEST asc:20,0
(da8:umass-sim0:0:0:0): Invalid command operation code
(da8:umass-sim0:0:0:0): Unretryable error
(da8:umass-sim0:0:0:0): SYNCHRONIZE CACHE. CDB: 35 0 0 0 0 0 0 0 0 0
(da8:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(da8:umass-sim0:0:0:0): SCSI Status: Check Condition
(da8:umass-sim0:0:0:0): ILLEGAL REQUEST asc:20,0
(da8:umass-sim0:0:0:0): Invalid command operation code
(da8:umass-sim0:0:0:0): Unretryable error
(da8:umass-sim0:0:0:0): SYNCHRONIZE CACHE. CDB: 35 0 0 0 0 0 0 0 0 0
(da8:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(da8:umass-sim0:0:0:0): SCSI Status: Check Condition
(da8:umass-sim0:0:0:0): ILLEGAL REQUEST asc:20,0
(da8:umass-sim0:0:0:0): Invalid command operation code
(da8:umass-sim0:0:0:0): Unretryable error
HAMMER: Critical error inode=-1 while flushing meta-data
HAMMER: Forcing read-only mode
HAMMER: Critical write error during flush, refusing to sync UNDO FIFO
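
The repeating block in the log above can be decoded mechanically. What follows is a minimal Python sketch and not part of the original report: the function name `decode_cam_errors` and the two small lookup tables are illustrative assumptions, and the sketch assumes that the CDB bytes and the asc values are printed in hexadecimal. In the SCSI command set, opcode 0x35 is SYNCHRONIZE CACHE (10) and additional sense code 20,0 means the device does not recognize the opcode, which agrees with the "Invalid command operation code" text the kernel printed.

```python
import re

# Tiny lookup tables covering only what appears in this report.
OPCODES = {0x35: "SYNCHRONIZE CACHE(10)"}
ASC_ASCQ = {(0x20, 0x00): "INVALID COMMAND OPERATION CODE"}

# "(da8:umass-sim0:0:0:0): SYNCHRONIZE CACHE. CDB: 35 0 0 ..."
CDB_RE = re.compile(r"^\((\w+):[^)]*\): .*CDB: ([0-9a-fA-F ]+)$")
# "(da8:umass-sim0:0:0:0): ILLEGAL REQUEST asc:20,0"
ASC_RE = re.compile(r"^\((\w+):[^)]*\): .*asc:([0-9a-fA-F]+),([0-9a-fA-F]+)")


def decode_cam_errors(lines):
    """Group CAM error lines into one record per failed command."""
    events = []
    current = None
    for raw in lines:
        line = raw.strip()
        m = CDB_RE.match(line)
        if m:
            # A CDB line starts a new failed-command record.
            cdb = [int(b, 16) for b in m.group(2).split()]
            current = {
                "device": m.group(1),
                "opcode": cdb[0],
                "command": OPCODES.get(cdb[0], "unknown"),
                "cdb_length": len(cdb),
                "sense": None,
            }
            events.append(current)
            continue
        m = ASC_RE.match(line)
        if m and current is not None:
            # Attach the sense code to the most recent command.
            key = (int(m.group(2), 16), int(m.group(3), 16))
            current["sense"] = ASC_ASCQ.get(key, "unknown")
    return events


if __name__ == "__main__":
    import sys
    for event in decode_cam_errors(sys.stdin):
        print(event)
```

Feeding the console log to the script on standard input yields one record per failed command, so the three repetitions above would show up as three records for da8.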

Actions #1

Updated by joelkp almost 15 years ago

After I updated the system, that commit fixed this issue. However, when copying a
relatively large directory, I now ran into a second issue instead. This one takes
longer to appear, perhaps about a minute of copying, after which there is a panic.

If I am in the Xorg desktop, then in the time left before the reboot I see in
xconsole a similar inode error message (this time with a large number as the id),
followed by another HAMMER-related message.

The last time, I did it on the ttyv0 console and the debugger came up, showing
"Stopped at hammer_btree_extract+0xf: movl 0x3c(%eax),%edx"

Thereafter (hopefully of some use):

db> trace
hammer_btree_extract(c8bfcc9c,1,ffffffff,0,c0310) at hammer_btree_extract+0xf
hammer_ip_next(c8bfcc9c,20000,80000000,18000,0) at hammer_ip_next+0x4d7
hammer_ip_delete_range(c8bfcc9c,c8c47db8,10000,0,ffffffff) at hammer_ip_delete_range+0x22f
hammer_sync_inode(c8bf011c,c8c48db8,c8bf0000) at hammer_sync_inode+0x18f
hammer_flusher_slave_thread(c0dfdf78,0,0,0,0) at hammer_flusher_slave_thread+0x70
lwkt_exit() at lwkt_exit

Actions #2

Updated by dillon almost 15 years ago

There needs to be more context. What's the exact console output leading up to
that debugger entry? If there were write errors unrelated to the synchronize
cache issue then that could certain cascade into a panic. If it just dropped
into the debugger on its own it would also have generated a Debugger message or
a panic message indicating why it did it.

I have fairly low confidence regarding USB-connected storage in general. There
are probably bugs in the USB driver causing these problems.

-Matt

Actions #3

Updated by joelkp almost 15 years ago

I just thought of the (in hindsight obvious) step of logging into single-user
mode to get the full output in one place; however, when I then tried to reproduce
it, the HAMMER errors were where it ended.

The second error message was merely the "Forcing read-only mode" one.
Over multiple attempts I noticed that the inode number in the first message was
always the same (4295132199), and I wondered whether the problem could be due to
filesystem corruption caused by writes made after the now-fixed first problem was
triggered (but, I suppose, before it was detected). So I tried reformatting the
USB-connected drive, and then it, and further data copying, worked. So I suppose
the problem is resolved.

Actions #4

Updated by dillon almost 15 years ago

:New submission from Joel K. Pettersson <>:
:
:Problems beginning some time after mount on line beginning with "(da8". This is
:a new version 2 hammer partition made and used on freshly updated system on
:ThinkPad X31 laptop. Same kind of error and messages have appeared with a

Yah, the problem is the usb device can't handle a cache flush
request. HAMMER ignores the failure, but unfortunately the
failed cache sync probably bricks the usb port.
This has hit a number of people.  I am going to change the
code to disable cache flushes for usb-attached devices
by default.
-Matt
Matthew Dillon
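
The change described in the comment above amounts to a per-device policy decision: skip cache flushes on USB-attached devices unless told otherwise. As a rough illustration only, here is a small Python model of that default. It is not DragonFly kernel code, and the names `Disk`, `bus`, `flush_override`, and `should_flush_cache` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Disk:
    name: str
    bus: str  # hypothetical attachment label, e.g. "usb" or "ata"
    # Hypothetical administrator override: True or False forces the
    # behaviour, None means "use the default for this bus".
    flush_override: Optional[bool] = None


def should_flush_cache(disk: Disk) -> bool:
    """Decide whether to send a cache flush to this disk."""
    if disk.flush_override is not None:
        return disk.flush_override
    # Default modelled on the comment above: no flushes over USB.
    return disk.bus != "usb"


if __name__ == "__main__":
    for d in (Disk("da8", "usb"), Disk("ad0", "ata"),
              Disk("da9", "usb", flush_override=True)):
        print(d.name, should_flush_cache(d))
```

Keeping an explicit override in the model reflects the words "by default" in the comment: a USB device that does handle cache flushes could still be opted back in.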