Bug #1504

closed

hammer crash on cleanups

Added by corecode over 15 years ago. Updated almost 11 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: VFS subsystem
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:

Description

Redirecting to bugs@

Eugene wrote:

Hello All.
I get an intermittently reproducible panic while running cleanups on a
mirror built with hammer mirror-stream.
I'm running a system built from -DEVELOPMENT sources from August 9th
(it uses a SILI device) with the following layout:

/dev/da0s1a       /           ufs     rw  1 1
/dev/da0s2b       none        swap    sw  0 0
/dev/da0s2h       /HAMMER0    hammer  rw  2 2
/HAMMER0/pfs/var  /var        null    rw  2 2
/dev/da1s1a       /mirror     ufs     rw  1 1  # not used
/dev/da1s2b       none        swap    sw  0 0
/dev/da1s2h       /HAMMER1    hammer  rw  2 2
/HAMMER1/pfs/var  /mirrorvar  null    rw  2 2
proc              /proc       procfs  rw  0 0

A kernel panic occurs when the daily cleanups run on the slave side of the
mirror. I managed to reproduce it manually by running the %hammer reblock
and %hammer rebalance commands on a slave pfs after a day or two of uptime.
When I run %hammer cleanup immediately after boot, it always runs fine and
finishes with no error.
I've attached a screenshot of the latest panic I got while running
cleanup.

Is there any solution for this problem?

Which kernel version are you running? Please post a uname -a output.

Also, please configure a dumpdev and capture a crash dump.

cheers
simon

Actions #1

Updated by dfuser over 15 years ago

Thank you for the reply.

Here it is:
DragonFly diana.medcom.com.ua 2.3.2-DEVELOPMENT DragonFly
2.3.2-DEVELOPMENT #0: Sun Aug 9 16:51:26 GMT 2009
:/usr/obj/usr/src/sys/CUSTOM i386

The system was upgraded from DragonFly 1.8-Release.

Also, please configure a dumpdev and capture a crash dump.

O.K. Now I have to wait a few days for the panic to happen again.
;(

Actions #2

Updated by corecode over 15 years ago

Best update to latest master, there were many fixes since August.

cheers
simon

Actions #3

Updated by dfuser over 15 years ago

Hello again.

Before updating, I decided to run hammer reblock on the slave pfs with
%hammer reblock /mirrorvar
without remounting it first, and got the following (reported by kgdb):

Unread portion of the kernel message buffer:
panic: assertion: cursor->trans->sync_lock_refs > 0 in hammer_recover_cursor
Trace beginning at frame 0xd743e714
panic(d743e738,d743e7b8,d743e89c,d743e814,d743e744) at panic+0x8c
panic(c059aed8,c060299c,c05826fc,d743e7b8,d743e9ac) at panic+0x8c
hammer_recover_cursor(d743e7b8,0,0,5,d743e778) at hammer_recover_cursor+0x2c
hammer_ioc_mirror_write(d743ea84,d2c52550,c2a72998,c2b13fa8,1) at hammer_ioc_mirror_write+0x928
hammer_ioctl(d2c52550,c0c46808,c2a72998,1,c29a7768) at hammer_ioctl+0x8f2
hammer_vop_ioctl(d743eae0,c0664560,c2ae8b50,d60e9b00,46) at hammer_vop_ioctl+0x2f
vop_ioctl(c2ae8b50,c2b13fa8,c0c46808,c2a72998,1) at vop_ioctl+0x38
vn_ioctl(d0291148,c0c46808,c2a72998,c29a7768,d0291148) at vn_ioctl+0xdd
mapped_ioctl(3,c0c46808,bfbff794,0,d743ed34) at mapped_ioctl+0x3e1
sys_ioctl(d743ecf0,6,24de3699,0,d2aecfd8) at sys_ioctl+0x16
syscall2(d743ed40) at syscall2+0x1ef
Xint0x80_syscall() at Xint0x80_syscall+0x36
(kgdb) backtrace
#0 dumpsys () at ./machine/thread.h:83
#1 0xc0332666 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:375
#2 0xc0332787 in panic (fmt=0xc059e8fc "from debugger")
at /usr/src/sys/kern/kern_shutdown.c:801
#3 0xc0182745 in db_panic (addr=-1068281776, have_addr=0, count=-1,
modif=0xd743e5c4 "") at /usr/src/sys/ddb/db_command.c:447
#4 0xc0182db0 in db_command_loop () at /usr/src/sys/ddb/db_command.c:343
#5 0xc0185380 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_trap.c:71
#6 0xc05351bc in kdb_trap (type=3, code=0, regs=0xd743e6c0)
at /usr/src/sys/platform/pc32/i386/db_interface.c:152
#7 0xc0543693 in trap (frame=0xd743e6c0)
at /usr/src/sys/platform/pc32/i386/trap.c:797
#8 0xc0535ef7 in calltrap ()
at /usr/src/sys/platform/pc32/i386/exception.s:785
#9 0xc0535050 in Debugger (msg=0xc05b6499 "panic") at ./cpu/cpufunc.h:73
#10 0xc033277e in panic (fmt=0xc059aed8 "assertion: %s in %s")
at /usr/src/sys/kern/kern_shutdown.c:799
#11 0xc04a1dc4 in hammer_recover_cursor (cursor=0xd743e7b8)
at /usr/src/sys/vfs/hammer/hammer_cursor.c:582
#12 0xc04ab39d in hammer_ioc_mirror_write (trans=0xd743ea84, ip=0xd2c52550,
mirror=0xc2a72998) at /usr/src/sys/vfs/hammer/hammer_mirror.c:457
#13 0xc04aa686 in hammer_ioctl (ip=0xd2c52550, com=3234097160,
data=0xc2a72998 "", fflag=1, cred=0xc29a7768)
at /usr/src/sys/vfs/hammer/hammer_ioctl.c:134
#14 0xc04ba16d in hammer_vop_ioctl (ap=0xd743eae0)
at /usr/src/sys/vfs/hammer/hammer_vnops.c:2191
#15 0xc0387c4e in vop_ioctl (ops=0xc2ae8b50, vp=0xc2b13fa8,
command=3234097160, data=0xc2a72998 "", fflag=1, cred=0xc29a7768)
at /usr/src/sys/kern/vfs_vopops.c:372
#16 0xc0387018 in vn_ioctl (fp=0xd0291148, com=3234097160,
data=0xc2a72998 "",
ucred=0xc29a7768) at /usr/src/sys/kern/vfs_vnops.c:1120
#17 0xc035295b in mapped_ioctl (fd=3, com=3234097160,
uspc_data=0xbfbff794 <Address 0xbfbff794 out of bounds>, map=0x0)
at /usr/src/sys/sys/file2.h:87
#18 0xc03529e3 in sys_ioctl (uap=0xd743ecf0)
at /usr/src/sys/kern/sys_generic.c:525
#19 0xc0543033 in syscall2 (frame=0xd743ed40)
at /usr/src/sys/platform/pc32/i386/trap.c:1339
#20 0xc0535fa6 in Xint0x80_syscall ()
at /usr/src/sys/platform/pc32/i386/exception.s:876
#21 0x080552d7 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(kgdb) list *0xc04a1dc4
0xc04a1dc4 is in hammer_recover_cursor
(/usr/src/sys/vfs/hammer/hammer_cursor.c:587).
582 KKASSERT(cursor->trans->sync_lock_refs > 0);
583
584 /*
585 * Wait for the deadlock to clear
586 */
587 if (cursor->deadlk_node) {
588 hammer_lock_ex_ident(&cursor->deadlk_node->lock, "hmrdlk");
589 hammer_unlock(&cursor->deadlk_node->lock);
590 hammer_rel_node(cursor->deadlk_node);
591 cursor->deadlk_node = NULL;
(kgdb) list *0xc04ab39d
0xc04ab39d is in hammer_ioc_mirror_write
(/usr/src/sys/vfs/hammer/hammer_mirror.c:458).
453 * for the next loop.
454 */
455 if (error == EDEADLK) {
456 while (error == EDEADLK) {
457 hammer_recover_cursor(&cursor);
458 error = hammer_cursor_upgrade(&cursor);
459 }
460 } else {
461 if (error == EALREADY)
462 error = 0;
(kgdb) list *0xc04aa686
0xc04aa686 is in hammer_ioctl (/usr/src/sys/vfs/hammer/hammer_ioctl.c:134).
129 (struct hammer_ioc_mirror_rw *)data);
130 }
131 break;
132 case HAMMERIOC_MIRROR_WRITE:
133 if (error == 0) {
134 error = hammer_ioc_mirror_write(&trans, ip,
135 (struct hammer_ioc_mirror_rw *)data);
136 }
137 break;
138 case HAMMERIOC_GET_VERSION:
(kgdb) list *0xc04ba16d
0xc04ba16d is in hammer_vop_ioctl
(/usr/src/sys/vfs/hammer/hammer_vnops.c:2193).
2188 struct hammer_inode *ip = ap->a_vp->v_data;
2189
2190 ++hammer_stats_file_iopsr;
2191 return(hammer_ioctl(ip, ap->a_command, ap->a_data,
2192 ap->a_fflag, ap->a_cred));
2193 }
2194
2195 static
2196 int
2197 hammer_vop_mountctl(struct vop_mountctl_args *ap)
(kgdb) list *0xc0387c4e
0xc0387c4e is in vop_ioctl (/usr/src/sys/kern/vfs_vopops.c:374).
369 ap.a_fflag = fflag;
370 ap.a_cred = cred;
371
372 DO_OPS(ops, error, &ap, vop_ioctl);
373 return(error);
374 }
375
376 int
377 vop_poll(struct vop_ops *ops, struct vnode *vp, int events,
struct ucred *cred)
378 {
(kgdb) list *0xc0387018
0xc0387018 is in vn_ioctl (/usr/src/sys/kern/vfs_vnops.c:1120).
1115 }
1116 *(int *)data = dev_dflags(vp->v_rdev) & D_TYPEMASK;
1117 error = 0;
1118 break;
1119 }
1120 error = VOP_IOCTL(vp, com, data, fp->f_flag, ucred);
1121 if (error == 0 && com == TIOCSCTTY) {
1122 struct proc *p = curthread->td_proc;
1123 struct session *sess;
1124
(kgdb)

Best update to latest master, there were many fixes since August.

Now I'm planning to update the system and see whether the crashes
continue. Please tell me what else I can dig out of the crash dumps, or
how to send the dumps to you.

Actions #4

Updated by dfuser over 15 years ago

Hello again.

I now have an updated kernel built from the 13-Sep-2009 sources:
DragonFly diana.medcom.com.ua 2.3.2-DEVELOPMENT DragonFly
2.3.2-DEVELOPMENT #1: Tue Sep 15 17:23:22 EEST 2009
:/usr/obj/usr/src/sys/CUSTOM i386

but the "panic" message greeted me this morning:

panic: assertion: cursor->trans->sync_lock_refs > 0 in hammer_recover_cursor
Trace beginning at frame 0xd75ca708
panic(d75ca72c,d75ca7a8,d75ca88c,d75ca804,d75ca738) at panic+0x8c
panic(c05ac924,c0617e20,c059441c,d75ca7a8,d75ca99c) at panic+0x8c
hammer_recover_cursor(d75ca7a8,b,8,d75ca768,c04acd05) at hammer_recover_cursor+0x2c
hammer_ioc_mirror_write(d75caa74,d2d28550,c2a727f8,d75ca9d8,c0394328) at hammer_ioc_mirror_write+0x947
hammer_ioctl(d2d28550,c0c46808,c2a727f8,1,c29a7768) at hammer_ioctl+0x8f8
hammer_vop_ioctl(d75caad0,c067a1e0,d4ae7050,d75caaec,c0345eaa) at hammer_vop_ioctl+0x2f
vop_ioctl(d4ae7050,c2b145e8,c0c46808,c2a727f8,1) at vop_ioctl+0x3e
vn_ioctl(d0291928,c0c46808,c2a727f8,c29a7768,d75cacf0) at vn_ioctl+0xe0
mapped_ioctl(3,c0c46808,bfbffa14,0,d75cacf0) at mapped_ioctl+0x3e7
sys_ioctl(d75cacf0,6,1b88a2c3,0,d2aed218) at sys_ioctl+0x17
syscall2(d75cad40) at syscall2+0x1ef
Xint0x80_syscall() at Xint0x80_syscall+0x36
Debugger("panic")

A partial backtrace follows:

#11 0xc04b0f8c in hammer_recover_cursor (cursor=0xd75ca7a8)
at /usr/src/sys/vfs/hammer/hammer_cursor.c:591
591 KKASSERT(cursor->trans->sync_lock_refs > 0);
(kgdb) list
586 hammer_recover_cursor(hammer_cursor_t cursor)
587 {
588 int error;
589
590 hammer_unlock_cursor(cursor);
591 KKASSERT(cursor->trans->sync_lock_refs > 0);
592
593 /*
594 * Wait for the deadlock to clear
595 */

(kgdb) print cursor
$1 = (hammer_cursor_t) 0xd75ca7a8
(kgdb) print cursor->trans
$2 = (hammer_transaction_t) 0xd75caa74
(kgdb) print cursor->trans->sync_lock_refs
$3 = 0
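
The kgdb output above shows the counter at 0 when the assertion expected it to be positive. As a rough illustration only, here is a toy user-space model of that invariant and of the retry loop seen in the hammer_mirror.c listing. It is not HAMMER code: every `toy_` name is hypothetical, and the model says nothing about why the count reached 0 in the real kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; only the one field matters here. */
struct toy_trans {
    int sync_lock_refs;     /* how many references to the sync lock are held */
};

struct toy_cursor {
    struct toy_trans *trans;
};

/* True when the condition asserted in hammer_recover_cursor() would hold. */
static bool
toy_recover_allowed(const struct toy_cursor *cursor)
{
    assert(cursor != NULL && cursor->trans != NULL);
    return cursor->trans->sync_lock_refs > 0;
}

/*
 * Toy retry loop shaped like the one in the listing: while the operation
 * reports EDEADLK, recover the cursor and try again. `deadlocks` is the
 * number of simulated EDEADLK results before success.
 * Returns 0 on success, or -1 where the real kernel would panic.
 */
static int
toy_retry_on_deadlock(struct toy_cursor *cursor, int deadlocks)
{
    int error = deadlocks > 0 ? EDEADLK : 0;

    while (error == EDEADLK) {
        if (!toy_recover_allowed(cursor))
            return -1;      /* kernel: the KKASSERT fires here */
        deadlocks--;
        error = deadlocks > 0 ? EDEADLK : 0;
    }
    return error;
}
```

With a transaction whose `sync_lock_refs` is 1, `toy_retry_on_deadlock(&cursor, 3)` returns 0; with a count of 0 and at least one simulated deadlock it returns -1, which corresponds to the state captured in the dump.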

#12 0xc04ba517 in hammer_ioc_mirror_write (trans=0xd75caa74, ip=0xd2d28550,
mirror=0xc2a727f8) at /usr/src/sys/vfs/hammer/hammer_mirror.c:469
469 hammer_recover_cursor(&cursor);
(kgdb) list
464 * Retry the current record on deadlock, otherwise setup
465 * for the next loop.
466 */
467 if (error == EDEADLK) {
468 while (error == EDEADLK) {
469 hammer_recover_cursor(&cursor);
470 error = hammer_cursor_upgrade(&cursor);
471 }
472 } else {
473 if (error == EALREADY)

#13 0xc04b97dc in hammer_ioctl (ip=0xd2d28550, com=3234097160,
data=0xc2a727f8 "", fflag=1, cred=0xc29a7768)
at /usr/src/sys/vfs/hammer/hammer_ioctl.c:134
134 error = hammer_ioc_mirror_write(&trans, ip,
(kgdb) list
129 (struct hammer_ioc_mirror_rw *)data);
130 }
131 break;
132 case HAMMERIOC_MIRROR_WRITE:
133 if (error == 0) {
134 error = hammer_ioc_mirror_write(&trans, ip,
135 (struct hammer_ioc_mirror_rw *)data);
136 }
137 break;
138 case HAMMERIOC_GET_VERSION:

#14 0xc04c94de in hammer_vop_ioctl (ap=0xd75caad0)
at /usr/src/sys/vfs/hammer/hammer_vnops.c:2305
2305 return(hammer_ioctl(ip, ap->a_command, ap->a_data,
(kgdb) list
2300 hammer_vop_ioctl(struct vop_ioctl_args *ap)
2301 {
2302 struct hammer_inode *ip = ap->a_vp->v_data;
2303
2304 ++hammer_stats_file_iopsr;
2305 return(hammer_ioctl(ip, ap->a_command, ap->a_data,
2306 ap->a_fflag, ap->a_cred));
2307 }
2308

#15 0xc03940ae in vop_ioctl (ops=0xd4ae7050, vp=0xc2b145e8,
command=3234097160, data=0xc2a727f8 "", fflag=1, cred=0xc29a7768,
msg=0xd75cacf0) at /usr/src/sys/kern/vfs_vopops.c:376
376 DO_OPS(ops, error, &ap, vop_ioctl);
(kgdb) list
371 ap.a_data = data;
372 ap.a_fflag = fflag;
373 ap.a_cred = cred;
374 ap.a_sysmsg = msg;
375
376 DO_OPS(ops, error, &ap, vop_ioctl);
377 return(error);
378 }
379

#16 0xc0393bed in vn_ioctl (fp=0xd0291928, com=3234097160,
data=0xc2a727f8 "",
ucred=0xc29a7768, msg=0xd75cacf0) at /usr/src/sys/kern/vfs_vnops.c:938
938 error = VOP_IOCTL(vp, com, data, fp->f_flag, ucred, msg);
(kgdb) list
933 }
934 *(int *)data = dev_dflags(vp->v_rdev) & D_TYPEMASK;
935 error = 0;
936 break;
937 }
938 error = VOP_IOCTL(vp, com, data, fp->f_flag, ucred, msg);
939 if (error == 0 && com == TIOCSCTTY) {
940 struct proc *p = curthread->td_proc;
941 struct session *sess;

Actions #5

Updated by dennis.melentyev over 15 years ago

Hi!

I had something very similar on a 95% full HAMMER FS at filesystem version 1.
After upgrading it to v2 the panics went away.

dennis@dfly (xterm) > uname -srip
DragonFly 2.3.1-DEVELOPMENT i386 GENERIC
As of 05/Jul/2009

Like:
# hammer version-upgrade /mnt/big_plate 2

Also here are some interesting lines in /var/log/messages, possibly
indicating corrupted media:

Sep 16 01:22:42 dfly kernel: ad4: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=92046464
Sep 16 01:26:18 dfly syslogd: kernel boot file is /boot/kernel
Sep 16 01:26:18 dfly kernel: ad4: FAILURE - device detached
Sep 16 01:26:18 dfly kernel: subdisk4: detached
Sep 16 01:26:18 dfly kernel: HAMMER: Critical error inode=-1 while flushing meta-data
Sep 16 01:26:18 dfly kernel: HAMMER: Forcing read-only mode
Sep 16 01:26:18 dfly kernel: HAMMER: Critical error inode=-1 while flushing meta-data
Sep 16 01:26:18 dfly last message repeated 6 times
Sep 16 01:26:18 dfly kernel: ad4: detached
......
Sep 16 01:26:18 dfly kernel: Fatal trap 12: page fault while in kernel mode
Sep 16 01:26:18 dfly kernel: fault virtual address = 0xa4
Sep 16 01:26:18 dfly kernel: fault code = supervisor write, page not present
......
Sep 16 01:26:18 dfly kernel: processor eflags = interrupt enabled, resume, IOPL = 0
Sep 16 01:26:18 dfly kernel: current process = Idle
Sep 16 01:26:18 dfly kernel: current thread = pri 46 (CRIT)
......
Sep 16 01:26:18 dfly kernel:
Sep 16 01:26:18 dfly kernel: syncing disks... 439 HAMMER: Critical error inode=-1 while flushing meta-data

Actions #6

Updated by corecode over 15 years ago

Eugene wrote:

Hello again.

Simon 'corecode' Schubert wrote:

Best update to latest master, there were many fixes since August.

Now I've an updated kernel with sources from 13-sep-2009:
DragonFly diana.medcom.com.ua 2.3.2-DEVELOPMENT DragonFly
2.3.2-DEVELOPMENT #1: Tue Sep 15 17:23:22 EEST 2009
:/usr/obj/usr/src/sys/CUSTOM i386

Are you sure this is new code? There is no version tag. These days uname
output should look like this:

DragonFly sweatshorts 2.3.2-DEVELOPMENT DragonFly
v2.3.2.930.g702a9-DEVELOPMENT #39: Tue Sep 15 00:53:17 CEST 2009
corecode@sweatshorts:/usr/obj/usr/src/sys/SWEATSHORTS i386
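
A quick way to check a version string for such a tag can be sketched in C. This is only a heuristic built from the single example above: the helper name `has_git_version_tag` is hypothetical, and the pattern it looks for (a token starting with `v` plus a digit and containing `.g` followed by a hex digit) is an assumption, not a documented format.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns true if `version` contains a
 * whitespace-delimited token that starts with 'v' and a digit and also
 * contains ".g" followed by a hex digit, as in
 * "v2.3.2.930.g702a9-DEVELOPMENT".
 */
static bool
has_git_version_tag(const char *version)
{
    assert(version != NULL);

    for (const char *p = version; *p != '\0'; p++) {
        /* A candidate token begins at the string start or after whitespace. */
        bool at_token_start =
            (p == version) || isspace((unsigned char)p[-1]);

        if (!at_token_start || p[0] != 'v' || !isdigit((unsigned char)p[1]))
            continue;

        /* Scan to the end of this token looking for ".g<hex digit>". */
        for (const char *q = p + 1;
             *q != '\0' && !isspace((unsigned char)*q); q++) {
            if (q[0] == '.' && q[1] == 'g' && isxdigit((unsigned char)q[2]))
                return true;
        }
    }
    return false;
}
```

The string to test could come from the `version` field that uname(2) fills in via `struct utsname`; the two uname outputs quoted in this thread give one positive and one negative case.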

cheers
simon

Actions #7

Updated by dfuser over 15 years ago

Hello.
Dennis Melentyev wrote:

Hi!

I had something very similar on a 95% full HAMMER FS at filesystem version 1.
After upgrading it to v2 the panics went away.

dennis@dfly (xterm) > uname -srip
DragonFly 2.3.1-DEVELOPMENT i386 GENERIC
As of 05/Jul/2009

Like:
# hammer version-upgrade /mnt/big_plate 2

Also here are some interesting lines in /var/log/messages, possibly
indicating corrupted media:

As far as I can tell, I already have version 2 of the filesystem:

diana# hammer version /mirrorvar
min=1 wip=3 max=2 current=2 description="2.3 - New directory entry layout"
available versions:
1 NORM 2.0 - First HAMMER release
2 NORM 2.3 - New directory entry layout

and the logfile shows nothing that indicates disk problems. The filesystem
also has a fairly low percentage of space in use (3% after the daily
cleanup, about 15% before).

Actions #8

Updated by dfuser over 15 years ago

Sorry, I updated not to [src-master.tar.bz2 (15-Sep-2009 17:07)]
but to [src-Devel.tar.bz2 (13-Sep-2009 15:05)].
I'll try to update again now.

Actions #9

Updated by corecode over 15 years ago

Oh, never mind. The version output only works if you are using git (which you should).

cheers
simon

Actions #10

Updated by dennis.melentyev over 15 years ago

Hi!

OK, this means we have (or had) different problems. I upgraded to the latest
master today (almost equal to 2.4-Release). That's all the help I can offer.

Side note: Much better overall experience. No stalls on rl0 interface
(was not sure about the source of the problem) and smoother
everything. Thanks, folks!

Actions #11

Updated by dillon over 15 years ago

:Hi!
:
:I had something very similar on 95% full hammer FS with version of FS = 1.
:After upgrading it to v2 panics went away.
:
:...
:Also here are some interesting lines in /var/log/messages, possibly
:indicating corrupted media:
:
:Sep 16 01:22:42 dfly kernel: ad4: WARNING - WRITE_DMA UDMA ICRC error
:(retrying request) LBA=92046464
:Sep 16 01:26:18 dfly syslogd: kernel boot file is /boot/kernel
:Sep 16 01:26:18 dfly kernel: ad4: FAILURE - device detached
:Sep 16 01:26:18 dfly kernel: subdisk4: detached
:Sep 16 01:26:18 dfly kernel: HAMMER: Critical error
:...
:--
:Dennis Melentyev

Gotta be two different things.  Basically ad4 had a DMA error and
detached and the HAMMER filesystem, being unable to write, threw a fit.
That would be an issue with NATA.  HAMMER was just saving itself.

That sort of UDMA failure is usually indicative of an IDE wiring
issue.

We did fix a bug in HAMMER related to very rare incorrect B-Tree
deletions which would lead to lost inodes and potentially panics.
Anything like that should no longer occur.

-Matt
Matthew Dillon

Actions #12

Updated by dennis.melentyev about 15 years ago

Wiring was one possibility, and an underpowered drive could be
another (P4/2.4 with 3 HDDs on a no-name 300W PSU). I realized that only a
few days ago: mbmon's 12P showed 10.5+ volts until the PSU was
replaced with a 400W one.

PS. That also saves my ears in the night, thanks to 120mm cooler :)

Actions #13

Updated by tuxillo over 13 years ago

Hi Dennis,

From your last email I would assume you fixed it by replacing either the wire or
the PSU (or both). Is that correct?

Cheers,
Antonio Huete

Actions #14

Updated by tuxillo almost 11 years ago

  • Description updated (diff)
  • Category set to VFS subsystem
  • Assignee changed from 0 to tuxillo
  • Target version set to 3.8

Grab.

Actions #15

Updated by tuxillo almost 11 years ago

  • Status changed from New to Closed

- There was a fix by Matt for related issues.
- DMA errors that could indicate cable issues, if not problems in the drive itself.

Closing this one.
