Bug #728

Inspiron 9400 freezes when enabling bfe0

Added by joerg1 over 7 years ago. Updated over 7 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date:
Due date:
% Done: 0%

Description

Hi,

I just got myself a brand new Dell Inspiron 9400 notebook (Intel dual
core). I installed 1.9-Devel without problems, but noticed that the
keyboard, or rather the console, freezes with ACPI enabled.

That is not a big deal, since booting without ACPI does the trick.
What really bothers me is this: when I try to "ifconfig up" the internal
network card (Broadcom chipset), the whole system freezes immediately.

The same holds true after starting dhclient; in fact, any attempt to
bring bfe0 up causes a freeze.

I admit I was too lazy to search this group to see whether this is a
known issue, so if it is, maybe somebody can point me in the right
direction. From glancing through the topics, I believe it might be
related to interrupt routing...?

I flashed the notebook with the latest available BIOS (A09), but still
everything freezes as soon as the network card gets involved (tried
1.9-Devel & 1.8.1-REL).

Anyway, if this is not a known issue, I can provide any dump files
that might help solve this annoying problem, just let me know what you
need & what else I can do/try.

Thanx in advance

--j

P.S. I tried the whole common variety of things-to-try, like disabling
dual-core in BIOS, etc. without success.

busdma_machdep_diff.txt (764 Bytes) sepherosa, 07/13/2007 04:15 AM

History

#1 Updated by dillon over 7 years ago

Hmm. We don't have polling support in that driver or I'd suggest
trying to run it in polling mode.

Try an SMP build without APIC_IO.

Have you tried the FreeBSD livecd to see if FreeBSD can operate the
network interface?

This might require some significant debugging to figure out. A
complete freeze is either an interrupt storm (which is unlikely
since we have code to detect storms), or a code loop in the driver,
or a bus deadlock or bus fatal error due to improper programming of
the device. Sprinkling a bunch of kprintf()'s in the device code
paths might help narrow down the issue, and comparing the driver
against the driver in FreeBSD might yield fixes or other issues
that we need to apply.

-Matt
Matthew Dillon

#2 Updated by joerg1 over 7 years ago

Hi,

Yeah, I just tried the latest FreeSBIE live CD (2.0.1), which boots
up okay without ACPI...and bfe0 is also working the way it should!

I can UP and DOWN the interface without problems, and it receives an
inet address from my router's DHCP server. Assigning an address
manually via ifconfig also works fine, the system doesn't freeze at
all in any case.

So I think I should start comparing both bfe implementations, I guess.

Enjoy the show

--j

#3 Updated by sepherosa over 7 years ago

How much physical memory does your laptop have?

Best Regards,
sephe

#4 Updated by janslik over 7 years ago

There's a total of 2048MB of physical memory inside the box, which should be
sufficient... :)

--j

#5 Updated by sepherosa over 7 years ago

bfe(4)'s DMA configuration is problematic for boxes with more than
1 GB of physical memory. I will give you a patch to try after I am off work.

Best Regards,
sephe

#6 Updated by janslik over 7 years ago

Ah! Okay, that sounds like it's worth a try...I'll test your patch as soon as I
get it and after I am off work, too. :-)

--j

#7 Updated by sepherosa over 7 years ago

Please test:
http://leaf.dragonflybsd.org/~sephe/bfe_dma.diff

Best Regards,
sephe

#8 Updated by joerg1 over 7 years ago

Hi!

Just tried that, but to no avail: trying to boot the machine
generates millions of "DB>" debugger prompt lines at light speed,
with no chance to abort or to type anything useful like "reset".

Disabling the dual core feature in BIOS leads to the same result as
described above.

And, a kernel built with APIC_IO and SMP both enabled won't even boot
at all, the machine freezes almost immediately -- no prompts, no
messages, no useful hints, and no debugger at all.

I will now rebuild the kernel with SMP and APIC_IO disabled and
Sephe's bfe-stuff (thanks!) patched in.

Stay tuned

--j

#9 Updated by joerg1 over 7 years ago

Hi Sephe,

your fix for the bfe network interface does the trick...almost! :-)

I can now access /dev/bfe0 without causing the machine to freeze, but
unfortunately, there's no way to communicate over the network; the
system displays "bfe0: bfe_encap bus_dmamap_load failed: 36" error
messages when the interface is ifconfig'd UP or dhclient is started.

Anyway, I believe it's true the bfe driver has some serious problems
with machines that have >1GB of RAM -- your fix proves that.

And unless you have a fix for the bus_dmamap_load issue described
above at hand, I'll further investigate.

Thanx so far

--j

#10 Updated by joerg1 over 7 years ago

Hi,

I finally found a workaround for getting the bfe interface up and
running: I built the kernel with the "original" (unpatched) if_bfe.c
and added

hw.physmem="1G"

to /boot/loader.conf. This way the bfe interface works just fine!

I know this isn't a fix, but I take this as a starting point for
further debugging the bfe driver.

Enjoy the show

--j

#11 Updated by dillon over 7 years ago

This implies that the BFE driver may have limited DMA range, which
we can program for in the busdma setup for bfe.

-Matt
Matthew Dillon

#12 Updated by dillon over 7 years ago

: This implies that the BFE driver may have limited DMA range, which
: we can program for in the busdma setup for bfe.
:

DING! The FreeBSD driver limits DMA to the low 1 GB of RAM.

Try this patch.

-Matt
Matthew Dillon

Index: if_bfe.c
===================================================================
RCS file: /cvs/src/sys/dev/netif/bfe/if_bfe.c,v
retrieving revision 1.30
diff -u -p -r1.30 if_bfe.c
--- if_bfe.c 25 Oct 2006 20:55:56 -0000 1.30
+++ if_bfe.c 13 Jul 2007 01:04:05 -0000
@@ -192,7 +192,7 @@
     /* parent tag */
     error = bus_dma_tag_create(NULL,          /* parent */
                     PAGE_SIZE, 0,             /* alignment, boundary */
-                    BUS_SPACE_MAXADDR_32BIT,  /* lowaddr */
+                    0x3FFFFFFF,               /* lowaddr */
                     BUS_SPACE_MAXADDR,        /* highaddr */
                     NULL, NULL,               /* filter, filterarg */
                     MAXBSIZE,                 /* maxsize */
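
For readers following the patch: 0x3FFFFFFF is 2^30 - 1, the last byte address inside the first 1 GB, so the new lowaddr describes a device that can only drive 30 address bits instead of 32. Below is a minimal sketch in C of that arithmetic. The helper names dma_reachable() and addr_bits() and the BFE_DMA_LOWADDR macro are hypothetical and exist only for illustration; they are not part of if_bfe.c or of the busdma API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Limit taken from the patch: highest physical address assumed reachable. */
#define BFE_DMA_LOWADDR 0x3FFFFFFFULL

/* Hypothetical helper: true if the whole buffer lies at or below the limit. */
static bool
dma_reachable(uint64_t paddr, uint64_t len)
{
	if (paddr > BFE_DMA_LOWADDR)
		return (false);
	if (len == 0)
		return (true);
	/* Compare against the remaining room to avoid overflow in paddr + len. */
	return (len - 1 <= BFE_DMA_LOWADDR - paddr);
}

/* Hypothetical helper: number of address bits needed to express maxaddr. */
static unsigned
addr_bits(uint64_t maxaddr)
{
	unsigned bits = 0;

	while (maxaddr != 0) {
		bits++;
		maxaddr >>= 1;
	}
	return (bits);
}
```

Under this model, addr_bits(0x3FFFFFFF) is 30 while a full 32-bit limit needs 32, and any buffer that starts at or crosses 0x40000000 fails dma_reachable(). On a machine with 2048 MB of RAM that covers the upper half of physical memory, which would be consistent with the hw.physmem="1G" workaround reported earlier in this thread.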

#13 Updated by corecode over 7 years ago

Who builds such hardware? Nowadays? How can you just forget two address lines?

cheers
simon

#14 Updated by sepherosa over 7 years ago

I think I included this in my patch :)
My question is:
Should EINPROGRESS from bus_dmamap_load() be taken as a fatal error?

Best Regards,
sephe

#15 Updated by dillon over 7 years ago

:I think I included this in my patch :)
:My question is:
:Should EINPROGRESS from bus_dmamap_load() be taken as fatal error?
:
:Best Regards,
:sephe

No, it's not a fatal error. It just means the resources could not
be immediately allocated in the bounce map. The callback function
is called when the buffer gets mapped (bfe_dma_map in this case, I
think).

-Matt
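
The deferred-callback behaviour described above can be sketched with a small user-space mock in C. Everything here is an assumption made for illustration: struct mock_map, mock_dmamap_load(), mock_bounce_page_freed(), count_mapped() and load_is_fatal() are hypothetical names, and the mock only imitates the idea of "no bounce page free right now, call back later"; it is not the kernel's bus_dmamap_load() or its real signature. Note that EINPROGRESS is errno 36 on the BSDs, which appears to be the 36 in the "bus_dmamap_load failed: 36" message reported earlier.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type, loosely modeled on a busdma load callback. */
typedef void (*load_cb_t)(void *arg, int error);

/* Mock state standing in for a bounce-page pool; not the kernel's. */
struct mock_map {
	int		free_bounce_pages;
	load_cb_t	pending_cb;
	void		*pending_arg;
};

/*
 * Mock load: completes at once when a bounce page is free, otherwise
 * remembers the request and returns EINPROGRESS.
 */
static int
mock_dmamap_load(struct mock_map *m, load_cb_t cb, void *arg)
{
	if (m->free_bounce_pages > 0) {
		m->free_bounce_pages--;
		cb(arg, 0);
		return (0);
	}
	m->pending_cb = cb;
	m->pending_arg = arg;
	return (EINPROGRESS);
}

/* Mock: a bounce page became available, so run the deferred callback. */
static void
mock_bounce_page_freed(struct mock_map *m)
{
	if (m->pending_cb != NULL) {
		load_cb_t cb = m->pending_cb;

		m->pending_cb = NULL;
		cb(m->pending_arg, 0);
	} else {
		m->free_bounce_pages++;
	}
}

/* Example callback: counts successfully mapped buffers. */
static void
count_mapped(void *arg, int error)
{
	if (error == 0)
		(*(int *)arg)++;
}

/* Caller-side policy: EINPROGRESS means "deferred", not "failed". */
static int
load_is_fatal(int error)
{
	return (error != 0 && error != EINPROGRESS);
}
```

In this mock, a load issued against an empty pool returns EINPROGRESS and the counter stays at zero until mock_bounce_page_freed() runs the pending callback. The caveat is that if bounce pages are never allocated at all, the callback never fires, so treating EINPROGRESS as non-fatal only helps when the pool can actually be refilled.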

#16 Updated by sepherosa over 7 years ago

Please apply the attached patch too. Looks like bounce pages are not
allocated at all.

Best Regards,
sephe

#17 Updated by dillon over 7 years ago

:Who builds such hardware? Nowadays? How can you just forget two address lines?
:
:cheers
: simon

It's probably a recycled hardware design. Most of these vendors
are piss-poor designers.

-Matt

#18 Updated by sepherosa over 7 years ago

lol. Some of their modern chips still have a 2 GB limit :P

#19 Updated by janslik over 7 years ago

Yeah,

both of Sephe's patches together clearly do the trick, bfe0 is up and running
the way it should! :-)

Thanx Sephe & Matt.

#20 Updated by dillon over 7 years ago

:Jörg Anslik added the comment:
:
:Yeah,
:
:both of Sephe's patches together clearly do the trick, bfe0 is up and running
:the way it should! :-)
:
:Thanx Sephe & Matt.

Excellent. Commit that stuff, Sephe!

-Matt
Matthew Dillon

#21 Updated by hasso over 7 years ago

It's committed, so closing bug.
