Bug #2117
openACPI and/or bce(4) problem with 2.11.0.673.g0d557 on HP DL380 G6
Description
I have a standard HP ProLiant DL380 G6 server with a built-in quad-port Broadcom NIC.
2.10 didn't have the updated bce drivers, so I installed the 2.11.0.673 snapshot
to get connectivity.
First, the ACPI error (also present in 2.10):
[ACPI Debug]  String [0xB] "_TMP Method"
This message repeats 60 times every 10 minutes. I have no idea what it means;
googling for it only points me at a NetBSD discussion from 2009.
Second, the bce driver (or perhaps atapci?):
interrupt                    total       rate
sio2                             0          0
sio0                             0          0
acpi0                        12125          0
bce0                       1547359         26
bce1/atapci0            2293301893      39875 <-- ouch?
bce2                             0          0
bce3                             0          0
uhci0/ehci0                      1          0
uhci2/uhci4                     34          0
uhci1/uhci3                     44          0
ciss0                       267683          4
swi_siopoll                      0          0
swi_cambio                  267762          4
swi_vm                           0          0
swi_taskq/swi_mp_taskq          25          0
Total                   2295396926      39911
The weird part is that I don't have any ATA devices in use; there's only a
CD-ROM. bce1 isn't configured or marked up; only bce0 is in use.
The deal breaker here is that I can't do anything disk-intensive without getting
a crash. I tried updating pkgsrc yesterday, and here are two examples:
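For anyone triaging a similar table, here is a minimal sketch of how the per-source rates could be scanned for an interrupt storm. It assumes the table is `vmstat -i`-style output (the report does not name the command). The sample rows are copied from the table above, and the function names and the 10000/s threshold are hypothetical choices for illustration, not anything DragonFly defines.

```python
# Sample rows copied from the report; the "<-- ouch?" annotation is dropped.
SAMPLE = """\
interrupt                    total       rate
sio2                             0          0
acpi0                        12125          0
bce0                       1547359         26
bce1/atapci0            2293301893      39875
ciss0                       267683          4
Total                   2295396926      39911
"""


def parse_interrupts(text):
    """Return (name, total, rate) tuples for each interrupt source row."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # A data row ends in two integers: total count and rate per second.
        if len(parts) < 3 or not (parts[-2].isdigit() and parts[-1].isdigit()):
            continue
        # Names may contain spaces in the "irq16: bce0" format.
        name = " ".join(parts[:-2])
        if name.lower() == "total":
            continue  # skip the summary row
        rows.append((name, int(parts[-2]), int(parts[-1])))
    return rows


def storm_candidates(rows, rate_threshold=10000):
    """Pick sources whose rate meets an (assumed) storm threshold."""
    return [row for row in rows if row[2] >= rate_threshold]


rows = parse_interrupts(SAMPLE)
suspects = storm_candidates(rows)
for name, total, rate in suspects:
    print(f"{name}: {rate} interrupts/s, {total} total")
```

Because a shared line such as bce1/atapci0 is reported as one source, this only narrows the problem down to the IRQ, not to the device behind it.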
- Signal 10
 Stop in /usr.
 [snip]
Bus error (core dumped)
- Error code 1
 Stop in /usr.
 [snip]
While I was getting these errors, messages like these flooded dmesg:
intr 16 at 40001/40000 hz, livelocked limit engaged!
[ACPI Debug]  String [0xB] "_TMP Method" 
intr 16 at 882/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
pid 34805 (git), uid 0: exited on signal 10 (core dumped)
intr 16 at 3225/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 751/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 765/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 795/20000 hz, livelock removed
[ACPI Debug]  String [0xB] "_TMP Method"
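Since the log is noisy, a small script can summarize how often the livelock limiter toggles per IRQ. This is only a sketch: the regular expression is inferred from the message format shown above, the sample lines are copied from this report, and the names `LIVELOCK_RE` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Lines copied from the dmesg excerpt in this report.
SAMPLE_DMESG = """\
intr 16 at 40001/40000 hz, livelocked limit engaged!
[ACPI Debug]  String [0xB] "_TMP Method"
intr 16 at 882/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
pid 34805 (git), uid 0: exited on signal 10 (core dumped)
intr 16 at 3225/20000 hz, livelock removed
"""

# Pattern inferred from the messages above: IRQ number, measured rate,
# limit, and which of the two events was logged.
LIVELOCK_RE = re.compile(
    r"^intr (\d+) at (\d+)/(\d+) hz, "
    r"(livelocked limit engaged!|livelock removed)"
)


def summarize(text):
    """Count 'engaged' and 'removed' events per IRQ number."""
    engaged, removed = Counter(), Counter()
    for line in text.splitlines():
        match = LIVELOCK_RE.match(line.strip())
        if not match:
            continue  # ACPI debug lines, crash notices, etc.
        irq = int(match.group(1))
        if match.group(4).startswith("livelocked"):
            engaged[irq] += 1
        else:
            removed[irq] += 1
    return engaged, removed


engaged, removed = summarize(SAMPLE_DMESG)
for irq in sorted(engaged):
    print(f"irq {irq}: engaged {engaged[irq]}x, removed {removed[irq]}x")
```

Feeding it the full `dmesg` output instead of the sample would show whether any IRQ other than 16 is affected.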
I'm not familiar with debugging this, so please let me know if you need more 
info. I can also put the server in the DMZ and give a developer SSH access if 
needed.
       Updated by pauska about 14 years ago
I forgot to mention that there is a Raritan KVM attached which emulates storage
devices (for mounting ISOs etc. via the KVM client). Could this be the cause of
the atapci interrupts?
       Updated by pauska about 14 years ago
Update: the interrupt storm went away after disabling S-ATA in the BIOS.
       Updated by pauska about 14 years ago
Update2: The interrupt storm came back on irq16/bce0 after doing heavy downloads
via git.
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 827/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 1223/20000 hz, livelock removed
[ACPI Debug]  String [0xB] "_TMP Method" 
intr 16 at 40001/40000 hz, livelocked limit engaged!
[ACPI Debug]  String [0xB] "_TMP Method" 
intr 16 at 2838/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
[ACPI Debug]  String [0xB] "_TMP Method" 
intr 16 at 2234/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 2685/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 790/20000 hz, livelock removed
[ACPI Debug]  String [0xB] "_TMP Method" 
[ACPI Debug]  String [0xB] "_TMP Method" 
[ACPI Debug]  String [0xB] "_TMP Method" 
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 825/20000 hz, livelock removed
[ACPI Debug]  String [0xB] "_TMP Method" 
[ACPI Debug]  String [0xB] "_TMP Method" 
[ACPI Debug]  String [0xB] "_TMP Method" 
[ACPI Debug]  String [0xB] "_TMP Method" 
[ACPI Debug]  String [0xB] "_TMP Method" 
[ACPI Debug]  String [0xB] "_TMP Method" 
seg-fault ft=0002 ff=000c addr=0x7fffffbffff8 rip=0x4e406a pid=814 p_comm=git
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 1169/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 954/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
pid 814 (git), uid 0: exited on signal 10 (core dumped)
intr 16 at 551/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 760/20000 hz, livelock removed
[ACPI Debug]  String [0xB] "_TMP Method" 
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 772/20000 hz, livelock removed
intr 16 at 40001/40000 hz, livelocked limit engaged!
intr 16 at 911/20000 hz, livelock removed
interrupt                            total       rate
irq3: sio2                               0          0
irq4: sio0                               0          0
irq9: acpi0                          12113         12
irq16: bce0                        1468203       1478
irq17: bce1/atapci0                      0          0
irq18: bce2                              0          0
irq19: bce3                              0          0
irq20: uhci0/ehci0                       1          0
irq22: uhci2/uhci4                       1          0
irq23: uhci1/uhci3                       1          0
irq28: ciss0                        185332        186
irq192: swi_siopoll                      0          0
irq195: swi_cambio                  185396        186
irq196: swi_vm                           0          0
irq197: swi_taskq/swi_mp_taskq           0          0
Total                              1851047       1864
       Updated by pauska about 14 years ago
Update3: Disabling ACPI in the boot loader made it even worse; now the NIC won't
come up (getting "bce0: Watchdog timeout occured, resetting!" repeatedly in the
console).
       Updated by sepherosa about 14 years ago
http://leaf.dragonflybsd.org/~sephe/if_bce.c.diff
Please test the above patch.