Bug #259

nfe(4) for nVidia GigE

Added by sepherosa about 8 years ago. Updated almost 8 years ago.

Status: Closed
Start date:
Priority: Low
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -

Description

Hi all,

I have ported nfe(4) from OpenBSD. It supports various nVidia GigE
controllers and is open source:
http://leaf.dragonflybsd.org/~sephe/nfe.tbz

If you want to test this driver, please apply the following patch first:
http://leaf.dragonflybsd.org/~sephe/mii.diff

I have tested this driver with an MCP51, which is on a "GA-M51GM-S2G 10B"
motherboard.

I added a sysctl node for this driver, hw.nfe0.encap_delay, which is
set to 0 by default. On my MCP51, however, I have to set it to 3 so that
nfe0 can transmit through a 3600-second iperf test without a "watchdog
timeout".
Hopefully you won't need it.

Best Regards,
sephe

History

#1 Updated by sepherosa about 8 years ago

On 7/25/06, Sepherosa Ziehau <> wrote:
> Hi all,
>
> I have ported nfe(4) from OpenBSD, which supports various nVidia GigE
> and is open source:
> http://leaf.dragonflybsd.org/~sephe/nfe.tbz
>

I have updated it and will commit the updated version next week if
there are no objections.

Please review/test it.

Best Regards,
sephe

#2 Updated by dillon about 8 years ago

:On 7/25/06, Sepherosa Ziehau <> wrote:
:> Hi all,
:>
:> I have ported nfe(4) from OpenBSD, which supports various nVidia GigE
:> and is open source:
:> http://leaf.dragonflybsd.org/~sephe/nfe.tbz
:>
:
:I have updated it, and will commit this updated version next week, if
:no objection comes.
:
:Please review/test it.
:
:Best Regards,
:sephe
:
:--
:Live Free or Die

Oooh. Very nice. And I happen to have test boxes with nVidia chipsets
that the driver recognizes!

I got an assertion when I tried to ifconfig it up. I think the issue
is simply that the serializer has already been acquired when nfe_jalloc()
or nfe_jfree() are called (since you also specified it when you
installed the interrupt vector). If I remove the serializer calls
from those two routines I can ifconfig it up.

-Matt
Matthew Dillon

agp0: <NVIDIA Generic AGP Controller> mem 0xf4000000-0xf7ffffff at device 0.0 on pci0
agp0: Unable to find NVIDIA Memory Controller 1.
device_probe_and_attach: agp0 attach returned 19
nfe0: <NVIDIA nForce3 Gigabit Ethernet> port 0xfc00-0xfc07 mem 0xfdffc000-0xfdffcfff irq 10 at device 5.0 on pci0
nfe0: MAC address: 00:30:1b:b8:a9:f5
miibus0: <MII bus> on nfe0
e1000phy0: <Marvell 88E1111 Gigabit PHY> on miibus0
e1000phy0: 1000baseTX-FDX, 100baseTX-FDX, 100baseTX, 10baseTX-FDX, 10baseTX, auto
panic: assertion: s->last_td != curthread in lwkt_serialize_enter
mp_lock = 00000001; cpuid = 1; lapic.id = 01000000
Trace beginning at frame 0xd5f2999c
panic(c05337b4,1000000,c0522050,d5f299cc,d5decf00) at panic+0x17f
panic(c0522050,c0533f47,c0509c57,2,d5decf00) at panic+0x17f
lwkt_serialize_enter(d60d6a54,d60d68b8,d5decf00,d60d72e0,d5f29a20) at lwkt_serialize_enter+0x3a
nfe_jalloc(d60d68b8,1,0,c06570c0,d60d880c) at nfe_jalloc+0x1d
nfe_newbuf_jumbo(d60d68b8,d60d72e0,0,1,d60d68b8) at nfe_newbuf_jumbo+0x4f
nfe_init_rx_ring(d60d68b8,d60d72e0,c0453b96,d5eb29c0,8020690c) at nfe_init_rx_ring+0x35
nfe_init(d60d68b8,0,0,d5bf6828,8020690c) at nfe_init+0x4a
ether_ioctl(d60d68b8,8020690c,d5bf6828,d5f29ab0,0) at ether_ioctl+0x8f
nfe_ioctl(d60d68b8,8020690c,d5bf6828,0,0) at nfe_ioctl+0x136
in_ifinit(d60d68b8,d5bf6828,d5f29c24,0,c02b6ded) at in_ifinit+0x1c7
in_control(d2fe7340,8040691a,d5f29c14,d60d68b8,cecb6800) at in_control+0x896
so_pru_control(d2fe7340,8040691a,d5f29c14,d60d68b8,cecb6800) at so_pru_control+0x39
ifioctl(d2fe7340,8040691a,d5f29c14,c2465388,0) at ifioctl+0xace
soo_ioctl(d58c1c80,8040691a,d5f29c14,c2465388,2808d000) at soo_ioctl+0x19c
mapped_ioctl(3,8040691a,808d000,0,d5f29d40) at mapped_ioctl+0x4b0
sys_ioctl(d5f29cf4,d5f29cfc,c,2808f000,1) at sys_ioctl+0x2a
syscall2(2f,2f,2f,808a3a0,0) at syscall2+0x24c
Xint0x80_syscall() at Xint0x80_syscall+0x2a
Debugger("panic")

CPU1 stopping CPUs: 0x00000001
stopped
Stopped at Debugger+0x44: movb $0,in_Debugger.0
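
The assertion in the trace is what a non-recursive serializer reports when the thread that already holds it tries to enter it again, which matches Matt's reading of nfe_jalloc(). The following is only a toy user-space model of that situation: `toy_serializer`, `toy_enter()`, `toy_exit()` and the two `jalloc_*` helpers are hypothetical names, not the lwkt_serialize_* implementation or the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a non-recursive serializer (hypothetical, user space). */
struct toy_serializer {
    const void *owner;      /* token of the holding thread, NULL if free */
};

enum toy_result { TOY_OK, TOY_RECURSION, TOY_BUSY };

static enum toy_result toy_enter(struct toy_serializer *s, const void *curthread)
{
    if (s->owner == curthread)
        return TOY_RECURSION;   /* the kernel asserts in this case */
    if (s->owner != NULL)
        return TOY_BUSY;        /* a real serializer would block here */
    s->owner = curthread;
    return TOY_OK;
}

static void toy_exit(struct toy_serializer *s)
{
    s->owner = NULL;
}

/* Problem shape: the helper enters the serializer its caller already holds. */
static bool jalloc_locking(struct toy_serializer *s, const void *td)
{
    if (toy_enter(s, td) != TOY_OK)
        return false;
    /* ... take a buffer from the jumbo pool ... */
    toy_exit(s);
    return true;
}

/* Shape after removing the calls: the caller is required to hold it. */
static bool jalloc_caller_holds(const struct toy_serializer *s, const void *td)
{
    return s->owner == td;
    /* ... take a buffer from the jumbo pool ... */
}
```

In this model, calling `jalloc_locking()` while the same thread holds the serializer fails, whereas `jalloc_caller_holds()` succeeds. A separate serializer for the buffer pool is another way out of the recursion.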

#3 Updated by sepherosa about 8 years ago

On 8/21/06, Matthew Dillon <> wrote:
>
> :On 7/25/06, Sepherosa Ziehau <> wrote:
> :> Hi all,
> :>
> :> I have ported nfe(4) from OpenBSD, which supports various nVidia GigE
> :> and is open source:
> :> http://leaf.dragonflybsd.org/~sephe/nfe.tbz
> :>
> :
> :I have updated it, and will commit this updated version next week, if
> :no objection comes.
> :
> :Please review/test it.
> :
> :Best Regards,
> :sephe
> :
> :--
> :Live Free or Die
>
> Oooh. Very nice. And I happen to have test boxes with nVidia chipsets
> that the driver recognizes!
>
> I got an assertion when I tried to ifconfig it up. I think the issue
> is simply that the serializer has already been acquired when nfe_jalloc()
> or nfe_jfree() are called (since you also specified it when you
> installed the interrupt vector). If I remove the serializer calls
> from those two routines I can ifconfig it up.

I have added an additional serializer to protect the jumbo buffer pool.
Please refetch and test it again.

Best Regards,
sephe

#4 Updated by dillon about 8 years ago

:I have added additional serializer to protect jumbo buffer pool,
:please refetch and test it again
:
:Best Regards,
:sephe

I haven't fetched this yet. I am having a watchdog timeout issue with
the old version. It doesn't happen immediately but pkgbox's new
motherboard is using the new NFE driver (the one I modified) and is
getting timeout errors.

I am implementing polling in the driver code right now to try to
determine whether it is the MAC or the interrupt causing the problem.

-Matt
Matthew Dillon

#5 Updated by dillon about 8 years ago

I found one problem, but I don't know if it will fix the TX watchdog
timeouts.

nfe_encap() is totally broken. It is setting each ring segment to
NFE_TX_VALID as it goes, before it finishes writing out all the segments.
In fact, it seems to be setting NFE_TX_VALID before it sets the LASTFRAG
flag! It's amazing that it works at all.
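
One ordering that avoids the problem described above is to fill in every descriptor of a packet first and to set the valid bit on the packet's first descriptor only at the very end. The sketch below illustrates that idea on a plain array. It is not the driver's nfe_encap(): the `tx_desc` layout, the bit values of `TX_VALID` and `TX_LASTFRAG`, and the function name `encap_valid_last()` are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE   8
#define TX_VALID    0x8000u     /* hypothetical bit values, illustration only */
#define TX_LASTFRAG 0x0001u

struct tx_desc {                /* hypothetical descriptor layout */
    uint32_t addr;
    uint16_t len;
    uint16_t flags;
};

/*
 * Queue one packet of nseg (>= 1) fragments starting at slot `first`.
 * Returns the index of the slot following the packet.
 */
static int encap_valid_last(struct tx_desc *ring, int first,
                            const uint32_t *addr, const uint16_t *len,
                            int nseg)
{
    int i, idx = first;

    for (i = 0; i < nseg; i++) {
        idx = (first + i) % RING_SIZE;
        ring[idx].addr = addr[i];
        ring[idx].len = len[i];
        /*
         * Later fragments can be marked valid immediately, assuming the
         * hardware stops at the first descriptor without TX_VALID and
         * so cannot reach them yet.
         */
        ring[idx].flags = (i == 0) ? 0 : TX_VALID;
    }
    ring[idx].flags |= TX_LASTFRAG;     /* idx is now the last fragment */

    /* A real driver needs a write barrier or DMA sync at this point. */
    ring[first].flags |= TX_VALID;      /* hand the whole packet over */
    return (idx + 1) % RING_SIZE;
}
```

With this ordering the hardware either sees none of the packet or all of it, including the last-fragment flag.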

I am going to try to fix it.

-Matt
Matthew Dillon

#6 Updated by dillon about 8 years ago

: I found one problem, but I don't know if it will fix the TX watchdog
: timeouts.
:
: nfe_encap() is totally broken. It is setting each ring segment to
: NFE_TX_VALID as it goes, before it finishes writing out all the segments.
: In fact, it seems to be setting NFE_TX_VALID before it sets the LASTFRAG
: flag! It's amazing that it works at all.
:
: I am going to try to fix it.

Ok, there are two issues. The first issue is that NFE_TX_VALID is
being set too early in nfe_encap(). The second issue I found from
browsing OpenBSD/NetBSD postings on the subject. It appears that
sometimes the device simply does not generate an interrupt on TX
completion. Nobody seems to know why.

I have put the updated NFE driver code here. There are two changes.
First, I fix the NFE_TX_VALID issue. Second, I poll for transmit
completion in the watchdog timeout and don't reset the interface if
it finds transmit packets to retire.
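
The watchdog change described above can be sketched roughly as follows: retire whatever the hardware has already finished, and fall back to a reset only if nothing could be retired. This is a simplified model, not the code in the archive. The `tx_ring` structure, the `TX_VALID` value and the names `tx_drain()` and `tx_watchdog()` are hypothetical, and the model assumes the hardware clears the valid bit when it completes a descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8
#define TX_VALID  0x8000u   /* assumed: set by driver, cleared by hardware */

struct tx_ring {            /* hypothetical bookkeeping for the TX ring */
    uint16_t flags[RING_SIZE];
    int next;               /* oldest descriptor not yet retired */
    int queued;             /* descriptors handed over and not yet retired */
};

enum wd_action { WD_NOTHING, WD_RECOVERED, WD_RESET };

/* Retire descriptors the hardware is done with; return how many. */
static int tx_drain(struct tx_ring *r)
{
    int done = 0;

    while (r->queued > 0 && !(r->flags[r->next] & TX_VALID)) {
        r->next = (r->next + 1) % RING_SIZE;
        r->queued--;
        done++;
    }
    return done;
}

/*
 * Watchdog: poll for completions first. *kick tells the caller to
 * restart the transmitter when work is still pending after the drain.
 */
static enum wd_action tx_watchdog(struct tx_ring *r, bool *kick)
{
    *kick = false;
    if (r->queued == 0)
        return WD_NOTHING;
    if (tx_drain(r) > 0) {
        *kick = (r->queued > 0);
        return WD_RECOVERED;    /* a TX interrupt was missed; no reset */
    }
    return WD_RESET;            /* nothing completed: reinitialize */
}
```

The point of the design is that a lost TX completion interrupt no longer escalates into a full interface reset.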

fetch http://apollo.backplane.com/DFlyMisc/if_nfe01.tgz

Sephe, please integrate both items with your serializer changes. The
above archive is based on my original modified driver, not the one
where you fixed the jumbo packet serialization.

--

Also for some reason the NFE interface interferes with my 3ware card
(twa). Both are on IRQ 10. Both work most of the time, but sometimes
the NFE interface causes an interrupt from the 3ware to be lost. I
have no idea why this happens but I'm guessing it is motherboard related.

Turning on emergency interrupt polling (kern.emergency_intr_enable)
and leaving the emergency polling frequency at 10hz seems to solve
that problem. It's a terrible hack but it works.

-Matt
Matthew Dillon

#7 Updated by dillon about 8 years ago

: Sephe, please integrate both items with your serializer changes. The
: above archive is based on my original modified driver, not the one
: where you fixed the jumbo packet serialization.

oh, p.s. I also implemented polling for the driver.

-Matt

#8 Updated by sepherosa about 8 years ago

On 8/27/06, Matthew Dillon <> wrote:
> I found one problem, but I don't know if it will fix the TX watchdog
> timeouts.
>
> nfe_encap() is totally broken. It is setting each ring segment to
> NFE_TX_VALID as it goes, before it finishes writing out all the segments.
> In fact, it seems to be setting NFE_TX_VALID before it sets the LASTFRAG
> flag! It's amazing that it works at all.

Yeah, that's it!! Thank you very much!! I didn't figure out the real
cause in the old version and instead got sidetracked into adding a
delay in nfe_encap() :-P

Best Regards,
sephe

#9 Updated by dillon about 8 years ago

:Yeah, that's it!! Thank you very much!! I didn't figure out the real
:cause in the old version, but instead went to the sidetrack: adding a
:delay in nfe_encap() :-P
:
:Best Regards,
:sephe

It would be nice if we could determine which fix was the one that
fixed your MB. Insofar as I can tell there are three possible
causes for the watchdog timeouts.

(1) The hardware races the setting of NFE_TX_VALID in the second ring
buffer of a multi-buffer TX DMA. That is, the hardware is actively
transmitting a prior packet and the driver starts laying down the
new packet, and the hardware starts trying to transmit the new
packet before the driver can finish laying it down. This is due
to the driver improperly setting NFE_TX_VALID on the first ring
buffer in the new packet before finishing setting up all the ring
buffers.

Your delay had the effect of allowing the hardware to finish up
all the TX ring buffers and thus be quiescent when new packets
get queued, avoiding the race. Insofar as I can tell when you
KICK the hardware it runs TX ring buffers until it sees one
without NFE_TX_VALID set, then it goes quiescent until the next
KICK.

This is solved by the encap code fixes.

(2) The hardware fails to generate a TX completion interrupt. The
watchdog comes along and decides to reset the interface.

This is solved by the fixes in the watchdog code which first attempt
to drain the TX ring and then KICK it again before giving up and
resetting the interface (which doesn't solve the problem anyhow, it
appears). The KICK seemed to get TX completion interrupts working
again.

(3) The hardware interferes with other devices on the same IRQ. This
one really has me puzzled. I can't imagine how NFE can interfere
with TWA but it does! My system actually *LOST* an interrupt from
TWA and the I/O subsystem locked up on the disk. Maybe this is an
interrupt routing issue of some sort. I don't quite understand how
the system can assign IRQ 10 to an external PCI card (TWA) *AND* also
the motherboard NFE interface. They are on two different PCI busses.
It should be impossible.
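
Cause (1) can be made concrete with a small model of the hardware side, based only on the behaviour described above: after a KICK the hardware runs descriptors until it meets one without NFE_TX_VALID. The function `hw_run()` and the bit values are assumptions for illustration, not taken from the driver or from any nVidia documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE   8
#define TX_VALID    0x8000u     /* hypothetical bit values */
#define TX_LASTFRAG 0x0001u

/*
 * Model of the transmitter: starting at `start`, consume descriptors
 * while TX_VALID is set. Returns how many were consumed and sets
 * *mid_packet if it stopped before reaching a last-fragment flag.
 */
static int hw_run(const uint16_t *flags, int start, bool *mid_packet)
{
    int n = 0;

    *mid_packet = false;
    while (n < RING_SIZE && (flags[(start + n) % RING_SIZE] & TX_VALID)) {
        *mid_packet = !(flags[(start + n) % RING_SIZE] & TX_LASTFRAG);
        n++;
    }
    return n;
}
```

If the first descriptor of a two-fragment packet is marked valid before the second one is written, `hw_run()` consumes one descriptor and stops in the middle of the packet. If the first descriptor is marked valid last, the hardware consumes either nothing or the complete packet.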

-Matt
Matthew Dillon

#10 Updated by sepherosa about 8 years ago

On 8/27/06, Matthew Dillon <> wrote:
>
> :Yeah, that's it!! Thank you very much!! I didn't figure out the real
> :cause in the old version, but instead went to the sidetrack: adding a
> :delay in nfe_encap() :-P
> :
> :Best Regards,
> :sephe
>
> It would be nice if we could determine which fix was the one that
> fixed your MB. Insofar as I can tell there are three possible
> causes for the watchdog timeouts.
>
> (1) The hardware races the setting of NFE_TX_VALID in the second ring
> buffer of a multi-buffer TX DMA. That is, the hardware is actively
> transmitting a prior packet and the driver starts laying down the
> new packet, and the hardware starts trying to transmit the new
> packet before the driver can finish laying it down. This is due
> to the driver improperly setting NFE_TX_VALID on the first ring
> buffer in the new packet before finishing setting up all the ring
> buffers.
>
> Your delay had the effect of allowing the hardware to finish up
> all the TX ring buffers and thus be quiescent when new packets
> get queued, avoiding the race. Insofar as I can tell when you
> KICK the hardware it runs TX ring buffers until it sees one
> without NFE_TX_VALID set, then it goes quiescent until the next
> KICK.
>
> This is solved by the encap code fixes.

This one fixes my MB's watchdog timeout problem :-)

>
> (2) The hardware fails to generate a TX completion interrupt. The
> watchdog comes along and decides to reset the interface.
>
> This is solved by the fixes in the watchdog code which first attempt
> to drain the TX ring and then KICK it again before giving up and
> resetting the interface (which doesn't solve the problem anyhow, it
> appears). The KICK seemed to get TX completion interrupts working
> again.

Do you mean that after the KICK in the watchdog handler, normal TX
interrupt behaviour is restored? Hmm, IMHO that means our TX descriptors
are set up properly, but some unknown registers are not, or it may be a
hardware bug :P

Best Regards,
sephe

#11 Updated by dillon about 8 years ago

:Do you mean after the KICK in watchdog handler, normal TX intr
:behaviour restores? mmm, IMHO, that means our TX descs are setup
:properly, but some unknown registers are not setup properly, or it may
:be a hardware bug :P
:
:Best Regards,
:sephe

Yes, that is what seems to happen. When the watchdog just has the
reinit code, my system never recovers. I just get a continuous stream
of watchdog timeouts. But with the KICK code in there, it recovers
instantly and doesn't have to reinit. Something definitely is not
being initialized properly.

-Matt
Matthew Dillon

#12 Updated by sepherosa almost 8 years ago

committed
