Bug #3127

DragonFly 5.1: panic: assertion "count & TOK_COUNTMASK" failed in _lwkt_reltokref at /usr/src/sys/kern/lwkt_token.c:410

Added by peeter over 1 year ago. Updated 4 months ago.

This is DragonFly 5.1, and the panic seems to originate in the ACPI subsystem. It is the same machine as in an earlier bug report, where the bug was also ACPI-related.

Version String: DragonFly v5.1.0.1072.g51a52-DEVELOPMENT

panic with 1 spinlocks held
panic: assertion "count & TOK_COUNTMASK" failed in _lwkt_reltokref at /usr/src/sys/kern/lwkt_token.c:410

(kgdb) #0 _get_mycpu () at ./machine/thread.h:69
#1 panic (
fmt=fmt@entry=0xffffffff80bb7298 "assertion \"%s\" failed in %s at %s:%u")
at /usr/src/sys/kern/kern_shutdown.c:861
#2 0xffffffff80611a80 in _lwkt_reltokref (td=0xfffff80223a39780,
ref=<optimized out>) at /usr/src/sys/kern/lwkt_token.c:410
#3 lwkt_relalltokens (td=td@entry=0xfffff80223a39780)
at /usr/src/sys/kern/lwkt_token.c:506
#4 0xffffffff805f8e1a in panic (
fmt=fmt@entry=0xffffffff80beed38 "lwkt_switch: still holding %d exclusive spinlocks!") at /usr/src/sys/kern/kern_shutdown.c:814
#5 0xffffffff8060eafc in lwkt_switch ()
at /usr/src/sys/kern/lwkt_thread.c:645
#6 0xffffffff80618498 in tsleep (
ident=ident@entry=0xffffffff817f0d00 <kernel_map+128>,
flags=flags@entry=1024, wmesg=<optimized out>, timo=timo@entry=0)
at /usr/src/sys/kern/kern_synch.c:716
#7 0xffffffff805e80b0 in lockmgr_exclusive (
lkp=lkp@entry=0xffffffff817f0d00 <kernel_map+128>, flags=flags@entry=2)
at /usr/src/sys/kern/kern_lock.c:382
#8 0xffffffff80901e39 in lockmgr (flags=2,
lkp=0xffffffff817f0d00 <kernel_map+128>) at /usr/src/sys/sys/lock.h:271
#9 vm_map_remove (map=0xffffffff817f0c80 <kernel_map>,
start=18446735302252429312, end=18446735302252560384)
at /usr/src/sys/vm/vm_map.c:3112
#10 0xffffffff805f393b in kmem_slab_free (ptr=<optimized out>,
size=<optimized out>) at /usr/src/sys/kern/kern_slaballoc.c:1670
#11 0xffffffff805f4827 in kmalloc (size=size@entry=88,
type=type@entry=0xffffffff81391e00 <M_ACPITASK>, flags=flags@entry=4866)
at /usr/src/sys/kern/kern_slaballoc.c:732
#12 0xffffffff8097356c in AcpiOsExecute (Type=Type@entry=OSL_GPE_HANDLER,
Function=Function@entry=0xffffffff80979e20 <AcpiEvAsynchExecuteGpeMethod>, Context=Context@entry=0xfffff8037ae17290)
at /usr/src/sys/dev/acpica/Osd/OsdSchedule.c:138
#13 0xffffffff80979fc6 in AcpiEvGpeDispatch (GpeDevice=0xfffff8007a3a3658,
GpeEventInfo=0xfffff8037ae17290, GpeNumber=102)
at /usr/src/sys/contrib/dev/acpica/source/components/events/evgpe.c:993
#14 0xffffffff8097a15c in AcpiEvGpeDetect (
at /usr/src/sys/contrib/dev/acpica/source/components/events/evgpe.c:698
#15 0xffffffff8097c2d9 in AcpiEvSciXruptHandler (Context=0xfffff8007a323ee0)
at /usr/src/sys/contrib/dev/acpica/source/components/events/evsci.c:261
#16 0xffffffff80972de1 in InterruptWrapper (arg=0xfffff8007a323ee0)
at /usr/src/sys/dev/acpica/Osd/OsdInterrupt.c:158
#17 0xffffffff805c6619 in ithread_handler (arg=<optimized out>)
at /usr/src/sys/kern/kern_intr.c:899
#18 0xffffffff8060b710 in ?? () at /usr/src/sys/kern/lwkt_thread.c:1748
#19 0x0000000000000000 in ?? ()
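A note for anyone reading frame #9: kgdb prints the `start` and `end` arguments of `vm_map_remove` in decimal. The sketch below is a small userspace helper, not kernel code, and the names `range_size` and `print_range` are made up for illustration. It prints the two values in hex and computes the size of the range being removed.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Size in bytes of the half-open address range [start, end). */
static uint64_t range_size(uint64_t start, uint64_t end)
{
    return end - start;
}

/* Print the range in hex together with its size (hypothetical helper). */
static void print_range(uint64_t start, uint64_t end)
{
    printf("start=0x%016" PRIx64 " end=0x%016" PRIx64 " size=%" PRIu64 " bytes\n",
           start, end, range_size(start, end));
}
```

Calling `print_range(18446735302252429312ULL, 18446735302252560384ULL)` with the values from frame #9 reports a size of 131072 bytes (128 KiB).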



Updated by peeter over 1 year ago

The core dump is also on leaf, in crash/2018-21-MARCH_acpi.


Updated by dillon over 1 year ago

  • Status changed from New to In Progress
  • Assignee set to dillon

How repeatable is this bug? It looks like ACPI is trying to hold a spin lock across a sleep. I think it's a problem with AcpiOsAcquireLock() and friends, so I put together a quick patch.

This is a bit of a problem because AcpiOsAcquireLock() is supposed to be a spinlock, and can be used from the idle thread (which is not allowed to block). So the patch is a bit of a hack. But it might do the job.
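For readers unfamiliar with the failure mode, here is a minimal userspace sketch of "holding a spin lock across a sleep". It is a toy model only: it is not the patch mentioned above (which is not reproduced in this thread), and `toy_thread`, `toy_spin_lock`, `toy_may_sleep` and the other names are hypothetical, not DragonFly kernel APIs. The model tracks how many spinlocks a thread holds and refuses to "sleep" while the count is nonzero, which is the condition behind the "still holding %d exclusive spinlocks!" message in frame #4.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for per-thread state; NOT the kernel's struct thread. */
struct toy_thread {
    int spinlocks_held;   /* number of spinlocks currently held */
};

static void toy_spin_lock(struct toy_thread *td)   { td->spinlocks_held++; }
static void toy_spin_unlock(struct toy_thread *td) { td->spinlocks_held--; }

/*
 * Model of a blocking call. Returns false where a real kernel would
 * panic: a thread must not switch away while holding a spinlock.
 */
static bool toy_may_sleep(const struct toy_thread *td)
{
    return td->spinlocks_held == 0;
}

/* Buggy pattern: a call that may block is made with the spinlock held. */
static bool toy_buggy_path(struct toy_thread *td)
{
    toy_spin_lock(td);
    bool ok = toy_may_sleep(td);   /* stands in for an allocation that sleeps */
    toy_spin_unlock(td);
    return ok;
}

/* Safe pattern: do the blocking work first, then take the spinlock. */
static bool toy_safe_path(struct toy_thread *td)
{
    bool ok = toy_may_sleep(td);   /* blocking work, no spinlock held */
    toy_spin_lock(td);
    /* short, non-blocking critical section goes here */
    toy_spin_unlock(td);
    return ok;
}
```

In the backtrace, the blocking call is the `kmalloc` in frame #11, which ends up in `tsleep` (frame #6) by way of `kmem_slab_free`, `vm_map_remove` and `lockmgr`. In the toy model, `toy_buggy_path` returns false and `toy_safe_path` returns true.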



Updated by peeter over 1 year ago

Many thanks---I'll apply the patch later tonight and report back.

It's not easily repeatable, i.e. I have no idea what exactly triggers it. But the machine will almost certainly panic within about two to four days of uptime.


Updated by peeter over 1 year ago

Hello, the machine has an uptime of four days now. I'd say the patch has made a difference. I'll report back once it reaches 14 days of uptime; then we'll know for sure.


Updated by peeter over 1 year ago

Hi again. The machine now has an uptime of 15 days in use as a desktop, so I believe the issue can be closed.


Updated by liweitianux 4 months ago

  • Status changed from In Progress to Resolved

The reporter said this issue had been resolved.

