Bug #2402
Showstopper panics for Release 3.2 (closed)
Description
This is the list of panics we've been accumulating. i386 and UFS are hit particularly hard.
This is a good list of items to fix before the next release.
#2296: panic: assertion "m->wire_count > 0" failed in pmap_unwire_pte at /usr/src/sys/platform/pc32/i386/pmap.c:1091
core available (~marino/crash, ~thomas/crash)
(carried over from 3.0.1, 3.0.2, 3.0.3 showstopper lists)
#2364: panic: lockmgr: locking against myself
core available (~marino/crash)
#2374: Panic where softdep_update_inodeblock() called bwrite() with a NULL buffer
core available (~marino/crash), uploaded today
#2374: panic: flush_pagedep_deps: MKDIR_BODY
core available (~marino/crash)
#2370: panic: ffs_valloc: dup alloc
core available (~marino/crash)
#2350 panic: assertion "m->flags & PG_BUSY" failed in vm_page_protect at /usr/src/sys/vm/vm_page.h:532
core available (~pavalos/crash)
Leftover from 3.0.3 showstopper:
#2353 panic: assertion "gd->gd_spinlocks_wr == 0" failed in bsd4_schedulerclock
core available (~jaydg/crash)
#2388 panic: lockmgr: LK_RELEASE: no lock held
No core.
#2399 Panic on lwkt_reltoken from vm_mmap
core available (limited use)
Leftover from 3.0.1 showstopper:
#2284 panic: general protection fault (3.0 showstopper)
core available on ylem/var/crash, request to put on leaf didn't happen (?)
Other panics:
#2352 panic: Bad link elm 0xffffffe0a3775670 next->prev != elm
core available (~jaydg/crash)
#2369 panic: Bad link elm 0xffffffe07edf6068 next->prev != elm
core available (~jaydg/crash)
#2355 panic: rtrequest1_msghandler: rtrequest table error was cpu4, err 17
core available (~jaydg/crash)
#2083 panic: zone: entry not free
core might be available
#2358 panic: hammer: insufficient undo FIFO space!
NO CORE
#2345 panic: assertion "len <= nmp->nm_size" failed in nfs_writerpc_bio at ....
NO CORE
#2300 EHCI module unload panic
Core supposedly available on request
Updated by marino over 12 years ago
Typo: Issue 2083 --> #2084 panic: zone: entry not free
Updated by dillon over 12 years ago
Here's a patch that will hopefully help with, or at least narrow down, some of the softupdates issues. I found two major issues while perusing the softupdates code.
First, sema_get() and sema_release() are not MP safe when called without an interlock.
Second, getdirtybuf() improperly retries after releasing and reacquiring &lk. If this function cannot obtain the buffer lock before releasing &lk, it MUST return failure. The blocking buffer lock it obtains after releasing &lk exists only so that the caller's retry loop doesn't live-lock. Even if that second lock attempt succeeds, the buffer itself may no longer be legally associated with the softdep work item, because the instant &lk is released that work item can get ripped up. The buffer cache pointer itself is type-stable, but not work-item stable.
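For readers following along, the getdirtybuf() rule described above can be sketched in user space roughly as follows. This is only an illustration of the locking pattern, not the patch itself: struct sketch_buf, sketch_getdirtybuf() and the pthread mutexes are hypothetical stand-ins for the kernel's buffer lock and the softdep &lk interlock.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer; not the kernel's struct buf. */
struct sketch_buf {
    pthread_mutex_t lock;   /* plays the role of the buffer lock */
};

/* Global interlock, standing in for the softdep &lk. */
static pthread_mutex_t lk = PTHREAD_MUTEX_INITIALIZER;

/*
 * Must be called with lk held.
 *
 * Returns true, with bp->lock held and lk never dropped, only when the
 * buffer lock was obtained without blocking.
 *
 * Otherwise lk is dropped, the function blocks on the buffer lock so the
 * caller's retry loop does not spin, releases the buffer lock again,
 * reacquires lk and returns false. The caller must then re-look-up its
 * work item, because it may have been torn down while lk was released.
 */
static bool
sketch_getdirtybuf(struct sketch_buf *bp)
{
    if (bp == NULL)
        return false;

    /* Fast path: non-blocking attempt while still holding lk. */
    if (pthread_mutex_trylock(&bp->lock) == 0)
        return true;

    /* Slow path: wait for the current holder, but never keep the buffer. */
    pthread_mutex_unlock(&lk);
    pthread_mutex_lock(&bp->lock);
    pthread_mutex_unlock(&bp->lock);
    pthread_mutex_lock(&lk);
    return false;
}
```

A caller in this sketch would loop: look up the work item under lk, call sketch_getdirtybuf() on its buffer, and start the lookup over whenever false is returned, rather than trusting the buffer pointer it held before lk was dropped.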
Updated by dillon over 12 years ago
Here is a second patch to hopefully fix the list-related panics in exit. What I believe is happening is that a threaded program is wait*()'ing for exiting children from several threads at once. This can race inside kern_wait() due to sub-tokens blocking and breaking q->p_token (on the parent). The candidate children have to be further serialized, plus we also have to double-check that the conditions are still valid and the child is still associated with the same parent.
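The "double-check after blocking" idea can be sketched like this. It is a simplified single-threaded model, not the actual kern_wait() change: struct sketch_proc, the reaping flag and sketch_claim_zombie() are hypothetical names, and the sketch assumes the caller already holds whatever lock serializes access to the child.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_state { SKETCH_ACTIVE, SKETCH_ZOMBIE };

/* Hypothetical stand-in for a process; not the kernel's struct proc. */
struct sketch_proc {
    struct sketch_proc *parent;   /* current parent, may change on reparenting */
    enum sketch_state   state;
    bool                reaping;  /* set by the single waiter that claimed it */
};

/*
 * Called after the waiter has acquired the child's lock, which may have
 * blocked and let other waiters run in the meantime. Every condition that
 * made the child a candidate is checked again before it is claimed.
 * Returns true if this waiter now owns the reaping of the child.
 */
static bool
sketch_claim_zombie(struct sketch_proc *parent, struct sketch_proc *child)
{
    if (child->parent != parent)
        return false;             /* reparented while we were blocked */
    if (child->state != SKETCH_ZOMBIE)
        return false;             /* no longer (or not yet) reapable */
    if (child->reaping)
        return false;             /* another thread got there first */

    child->reaping = true;
    return true;
}
```

In this model, a waiter that gets false simply restarts its scan of the parent's children instead of continuing to unlink a child that another thread may already be tearing down.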
Updated by tuxillo over 2 years ago
- Description updated (diff)
- Category set to Other
- Status changed from New to Closed
- Assignee set to tuxillo
3.2 was released long ago.