Bug #933

em(4) hardware error after ACPI suspend

Added by matthias almost 7 years ago. Updated almost 2 years ago.

Status: Feedback
Start date:
Priority: Normal
Due date:
Assignee: sepherosa
% Done: 0%
Category: -
Target version: -

Description

Hi,

since today I have an IBM Thinkpad T42 here. The machine is equipped
with an em(4) NIC and an ath(4) wireless NIC. After testing ACPI suspend
to RAM (state 3), which works flawlessly, I noticed some errors right
after the resume:

device_probe_and_attach: em0 attach returned 5
em0: <Intel(R) PRO/1000 Network Connection, Version - 6.2.9> port
0x8000-0x803f mem 0xc0200000-0xc020ffff,0xc0220000-0xc023ffff irq 11 at
device 1.0 on pci2
can't re-use a leaf (debug_info)!
can't re-use a leaf (stats)!
can't re-use a leaf (rx_int_delay)!
can't re-use a leaf (tx_int_delay)!
can't re-use a leaf (rx_abs_int_delay)!
can't re-use a leaf (tx_abs_int_delay)!
can't re-use a leaf (int_throttle_ceil)!
can't re-use a leaf (rxd)!
can't re-use a leaf (txd)!
em0: The EEPROM Checksum Is Not Valid
em0: Unable to initialize the hardware
device_probe_and_attach: em0 attach returned 5

Why does the EEPROM checksum change during a suspend/resume cycle? Is
it possible to suspend with this card? This error also happens if I
unload the module right before the suspend and load it again after
the resume.

An error also happens with the ath(4). If I don't unload the driver,
the card no longer gets an IP address from the DHCP server.
Loading/Unloading etc won't help.

A third error happens with USB. If I don't unload the USB module,
I see a continuous stream of the following messages:

Jan 26 19:44:26 jupiter acpi: resumed at 20080126 19:44:26
Jan 26 19:44:27 jupiter kernel: uhub0: port 1 reset failed
Jan 26 19:44:27 jupiter kernel: uhub1: port 1 reset failed
Jan 26 19:44:27 jupiter kernel: uhub2: port 1 reset failed
Jan 26 19:44:28 jupiter kernel: uhub0: port 2 reset failed
Jan 26 19:44:28 jupiter kernel: uhub1: port 2 reset failed
Jan 26 19:44:28 jupiter kernel: uhub2: port 2 reset failed
Jan 26 19:44:30 jupiter kernel: uhub0: port 1 reset failed
Jan 26 19:44:30 jupiter kernel: uhub1: port 1 reset failed
Jan 26 19:44:30 jupiter kernel: uhub2: port 1 reset failed
[...]

Only rebooting the machine helps. The relevant information is appended:

DragonFly jupiter 1.11.0-DEVELOPMENT DragonFly 1.11.0-DEVELOPMENT #1:
Sat Jan 26 20:05:28 CET 2008

ath0@pci2:2:0: class=0x020000 card=0x833117ab chip=0x1014168c rev=0x01
hdr=0x00
vendor = 'Atheros Communications Inc.'
device = 'AR5212 Atheros AR5212 802.11abg wireless

em0@pci2:1:0: class=0x020000 card=0x05491014 chip=0x101e8086 rev=0x03
hdr=0x00
vendor = 'Intel Corporation'
device = '82540EP Gigabit Ethernet Controller (Mobile)'

Regards

Matthias

History

#1 Updated by sepherosa almost 7 years ago

Intel does only one EEPROM checksum, in em_attach(), in their new
driver, but I don't think that will solve your problem; I suspect all
of the devices' BARs are completely trashed after suspend/resume. Could
you apply the following patch and post the debug prints?
http://leaf.dragonflybsd.org/~sephe/em_test.diff

The read-back value should not be 0xffffffff.
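
For context on that check: on PCI, a read that no device claims (for instance because the device is not decoding its memory BAR) normally completes as all ones. Below is a minimal sketch of the test in plain C. The helper name `reg_read_looks_dead()` is hypothetical and is not taken from the em(4) driver or from the patch linked above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of em(4)): a 32-bit register read that
 * comes back as all ones usually means no device answered the access.
 */
static bool
reg_read_looks_dead(uint32_t value)
{
	return value == UINT32_C(0xffffffff);
}
```

A driver could run such a check on the first register it reads after resume and bail out early instead of continuing with garbage values.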

#2 Updated by matthias almost 7 years ago

Hi sephe,

I've applied the patch, but I don't see any status output. The only
change is the following additional line after resume:

Jan 27 10:07:57 jupiter kernel: em0: Memory Access and/or Bus Master
bits were not set

Cheers

Matthias
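
The message above refers to two bits in the PCI command register (config space offset 0x04): Memory Space enable (bit 1) and Bus Master enable (bit 2). A small sketch of what "not set" means and how a driver might turn the bits back on, in plain C. The macro and function names here are made up for illustration and are not the ones used in the em(4) source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined for the PCI command register (offset 0x04). */
#define CMD_MEM_SPACE	UINT16_C(0x0002)	/* device responds to memory accesses */
#define CMD_BUS_MASTER	UINT16_C(0x0004)	/* device may initiate DMA */

/* Hypothetical helper: true when both bits are set in the command word. */
static bool
cmd_word_ok(uint16_t cmd)
{
	uint16_t need = (uint16_t)(CMD_MEM_SPACE | CMD_BUS_MASTER);

	return (cmd & need) == need;
}

/* Hypothetical helper: return the command word with both bits turned on. */
static uint16_t
cmd_word_fixup(uint16_t cmd)
{
	return (uint16_t)(cmd | CMD_MEM_SPACE | CMD_BUS_MASTER);
}
```

If the command word was reset across suspend, both bits would read as zero, which would be consistent with the warning, although the log alone does not prove that this is what happened here.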

#3 Updated by sepherosa almost 7 years ago

I don't know much about how ACPI suspend/resume works, but it
looks like it calls the device attach routine. The debug log above is
benign. Does it mean you have to manually ifconfig em0 up again
after resuming?

Best Regards,
sephe

#4 Updated by matthias almost 7 years ago

ifconfig after a resume will not help. The device is completely
unusable after the resume. I don't even see an em0 device after a
resume.

Regards

Matthias

#5 Updated by Johannes.Hofmann almost 7 years ago

Hi,

don't want to discourage you, but I've given up on suspend/resume on
my T42p. And I don't even miss it :-)
It basically works, but then there are tons of minor issues.

Now I use hw.acpi.lid_switch_state=S5 to shut down when I close the lid,
and booting is pretty fast anyway.

Johannes

#6 Updated by matthias almost 7 years ago

Heho,

Yeah, that's right, but I'll continue to use it anyway. Unloading ath(4)
and usb works fine, and since I use wireless LAN most of the time I can
live with that em(4) issue. And who knows, maybe someone fixes it
someday :)

That would be my second choice :)

regards

Matthias

#7 Updated by sepherosa almost 7 years ago

Try the following one:
http://leaf.dragonflybsd.org/~sephe/em_d0.diff

Best Regards,
sephe

#8 Updated by matthias almost 7 years ago

Hi

Doesn't solve the EEPROM checksum problem, but printing the status
output works:

Before suspend:

em0: <Intel(R) PRO/1000 Network Connection, Version - 6.2.9> port
0x8000-0x803f mem 0xc0200000-0xc020ffff,0xc0240000-0xc025ffff
irq 11 at device 1.0 on pci2
STATUS 0xc380

After resume:

em0: <Intel(R) PRO/1000 Network Connection, Version - 6.2.9> port
0x8000-0x803f mem 0xc0200000-0xc020ffff,0xc0240000-0xc025ffff irq 11 at
device 1.0 on pci2
STATUS 0xffffffff
em0: The EEPROM Checksum Is Not Valid
em0: Unable to initialize the hardware

I assume that's bad :)

Regards

Matthias
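
A note on why an all-ones STATUS read and the checksum error probably go together: as far as I know, the Intel PRO/1000 code sums the first 64 16-bit EEPROM words (0x00 through 0x3f, checksum word included) and expects the result to be 0xBABA. The sketch below shows that arithmetic in plain C. The function names are hypothetical, and it assumes that EEPROM words read back as 0xffff when the device does not respond, which the log does not show directly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EEPROM_WORDS	64			/* words 0x00..0x3f, checksum word included */
#define EEPROM_SUM	UINT16_C(0xbaba)	/* expected 16-bit sum */

/* Sum n EEPROM words modulo 2^16. */
static uint16_t
eeprom_sum(const uint16_t *words, size_t n)
{
	uint16_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum = (uint16_t)(sum + words[i]);
	return sum;
}

/* True when the 64-word image sums to the expected constant. */
static bool
eeprom_checksum_valid(const uint16_t words[EEPROM_WORDS])
{
	return eeprom_sum(words, EEPROM_WORDS) == EEPROM_SUM;
}
```

With every word reading 0xffff the sum is 64 * 0xffff modulo 2^16 = 0xffc0, which is not 0xbaba. So the check would fail even if the EEPROM contents themselves did not change across the suspend/resume cycle.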

#9 Updated by sepherosa over 6 years ago

hand over to myself

#10 Updated by tuxillo almost 2 years ago

  • Description updated (diff)
  • Status changed from New to Feedback
  • Assignee set to sepherosa

Hi Sephe,

I've assigned it to you (as per your last comment).
I would suppose this is not relevant anymore since em(4) has been updated several times in the last 4 years.

Cheers,
Antonio Huete
