Bug #2117
openACPI and/or bce(4) problem with 2.11.0.673.g0d557 on HP DL380 G6
Description
I have a standard HP ProLiant DL380 G6 server with a built-in quad-port Broadcom
NIC. 2.10 didn't have the updated bce driver, so I installed the 2.11.0.673
snapshot to get connectivity.
First, the ACPI error (also present in 2.10):
[ACPI Debug] String [0xB] "_TMP Method"
This message repeats 60 times every 10 minutes. I have no idea what it means;
googling for it only points me at a NetBSD discussion from 2009.
Secondly, the bce driver (or perhaps atapci?):
interrupt                    total    rate
sio2                             0       0
sio0                             0       0
acpi0                        12125       0
bce0                       1547359      26
bce1/atapci0            2293301893   39875 <-- ouch?
bce2                             0       0
bce3                             0       0
uhci0/ehci0                      1       0
uhci2/uhci4                     34       0
uhci1/uhci3                     44       0
ciss0                       267683       4
swi_siopoll                      0       0
swi_cambio                  267762       4
swi_vm                           0       0
swi_taskq/swi_mp_taskq          25       0
Total                   2295396926   39911
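For triage, the table above can be reduced to the single busiest interrupt source. The following is a minimal Python sketch and not part of the original report: the helper names `parse_interrupts` and `busiest` are made up for illustration, and it assumes the `name total rate` column layout shown above.

```python
import re

# Interrupt table as pasted in the report (name, total, rate).
SAMPLE = """\
interrupt total rate
sio2 0 0
sio0 0 0
acpi0 12125 0
bce0 1547359 26
bce1/atapci0 2293301893 39875 <-- ouch?
bce2 0 0
bce3 0 0
uhci0/ehci0 1 0
uhci2/uhci4 34 0
uhci1/uhci3 44 0
ciss0 267683 4
swi_siopoll 0 0
swi_cambio 267762 4
swi_vm 0 0
swi_taskq/swi_mp_taskq 25 0
Total 2295396926 39911
"""

# A name, two integer columns, and an optional trailing "<-- note".
ROW_RE = re.compile(r"^(.+?)\s+(\d+)\s+(\d+)\s*(?:<--.*)?$")


def parse_interrupts(text):
    """Return a list of (name, total, rate), skipping header and Total."""
    rows = []
    for line in text.splitlines():
        m = ROW_RE.match(line.strip())
        if not m or m.group(1).lower() == "total":
            continue
        rows.append((m.group(1), int(m.group(2)), int(m.group(3))))
    return rows


def busiest(rows):
    """Return (name, share) for the source with the highest total count."""
    grand_total = sum(total for _, total, _ in rows)
    name, total, _ = max(rows, key=lambda r: r[1])
    return name, total / grand_total


if __name__ == "__main__":
    name, share = busiest(parse_interrupts(SAMPLE))
    print(f"{name}: {share:.2%} of all interrupts")
```

On the figures above this picks bce1/atapci0, which accounts for more than 99.9% of all interrupts counted.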
The weird part is that I don't have any ATA devices in use; there's only a
CD-ROM. bce1 isn't configured or marked up; only bce0 is in use.
The deal breaker here is that I can't do anything disk intensive without getting
a crash. I tried updating pkgsrc yesterday, and here are two examples:
- Signal 10
Stop in /usr.
[snip]
Bus error (core dumped)
- Error code 1
Stop in /usr.
[snip]
While getting these errors, messages like this flooded dmesg:
intr 16 at 40001/40000 hz, livelocked limit engaged!
[ACPI Debug] String [0xB] "_TMP Method"
intr 16 at 882/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
pid 34805 (git), uid 0: exited on signal 10 (core dumped)
intr 16 at 3225/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 751/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 765/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 795/20000 hz, livelock removed
[ACPI Debug] String [0xB] "_TMP Method"
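A quick way to quantify the flood is to count how often the livelock limiter engaged per IRQ. This is a rough Python sketch, assuming the messages keep exactly the format shown above; `count_livelocks` is a hypothetical helper name, not an existing tool.

```python
import re
from collections import Counter

# dmesg excerpt as pasted in the report.
DMESG = """\
intr 16 at 40001/40000 hz, livelocked limit engaged!
[ACPI Debug] String [0xB] "_TMP Method"
intr 16 at 882/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
pid 34805 (git), uid 0: exited on signal 10 (core dumped)
intr 16 at 3225/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 751/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 765/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 795/20000 hz, livelock removed
[ACPI Debug] String [0xB] "_TMP Method"
"""

# Captures the IRQ number and whether the limiter engaged or was removed.
LIVELOCK_RE = re.compile(
    r"intr (\d+) at \d+/\d+ hz, (livelocked limit engaged!|livelock removed)"
)


def count_livelocks(log):
    """Count 'livelocked limit engaged' events per IRQ number."""
    engaged = Counter()
    for m in LIVELOCK_RE.finditer(log):
        if m.group(2).startswith("livelocked"):
            engaged[int(m.group(1))] += 1
    return engaged


if __name__ == "__main__":
    for irq, n in count_livelocks(DMESG).items():
        print(f"irq {irq}: limiter engaged {n} times")
```

For the excerpt above this reports five engagements, all on IRQ 16.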
I'm not familiar with debugging this, so please let me know if you need more
info. I can also put the server in the DMZ and give a developer SSH access if
needed.
Updated by pauska over 13 years ago
I forgot to mention that there is a Raritan KVM attached which emulates storage
devices (for mounting ISOs etc. via the KVM client). Could this be the cause of
the atapci interrupts?
Updated by pauska over 13 years ago
Update: the interrupt storm went away after disabling SATA in the BIOS.
Updated by pauska over 13 years ago
Update2: The interrupt storm came back on irq16/bce0 after doing heavy downloads
via git.
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 827/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 1223/20000 hz, livelock removed
[ACPI Debug] String [0xB] "_TMP Method"
intr 16 at 40001/40000 hz, livelocked limit engaged!
[ACPI Debug] String [0xB] "_TMP Method"
intr 16 at 2838/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
[ACPI Debug] String [0xB] "_TMP Method"
intr 16 at 2234/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 2685/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 790/20000 hz, livelock removed
[ACPI Debug] String [0xB] "_TMP Method"
[ACPI Debug] String [0xB] "_TMP Method"
[ACPI Debug] String [0xB] "_TMP Method"
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 825/20000 hz, livelock removed
[ACPI Debug] String [0xB] "_TMP Method"
[ACPI Debug] String [0xB] "_TMP Method"
[ACPI Debug] String [0xB] "_TMP Method"
[ACPI Debug] String [0xB] "_TMP Method"
[ACPI Debug] String [0xB] "_TMP Method"
[ACPI Debug] String [0xB] "_TMP Method"
seg-fault ft=0002 ff=000c addr=0x7fffffbffff8 rip=0x4e406a pid=814 p_comm=git
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 1169/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 954/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
pid 814 (git), uid 0: exited on signal 10 (core dumped)
intr 16 at 551/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 760/20000 hz, livelock removed
[ACPI Debug] String [0xB] "_TMP Method"
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 772/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 911/20000 hz, livelock removed
interrupt                          total  rate
irq3: sio2                             0     0
irq4: sio0                             0     0
irq9: acpi0                        12113    12
irq16: bce0                      1468203  1478
irq17: bce1/atapci0                    0     0
irq18: bce2                            0     0
irq19: bce3                            0     0
irq20: uhci0/ehci0                     1     0
irq22: uhci2/uhci4                     1     0
irq23: uhci1/uhci3                     1     0
irq28: ciss0                      185332   186
irq192: swi_siopoll                    0     0
irq195: swi_cambio                185396   186
irq196: swi_vm                         0     0
irq197: swi_taskq/swi_mp_taskq         0     0
Total                            1851047  1864
Updated by pauska over 13 years ago
Update3: Disabling ACPI in the boot loader made it even worse: now the NIC won't
come up (getting "bce0: Watchdog timeout occured, resetting!" repeatedly in the
console).
Updated by sepherosa over 13 years ago
http://leaf.dragonflybsd.org/~sephe/if_bce.c.diff
Please test the above patch.