Bug #1774

open

New IP header cleanup branch available for testing

Added by dillon over 14 years ago. Updated over 2 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Description

My leaf repo now has a branch called 'iphdr'. Here's the git remote setup:

[remote "dillon"]
        url = git://leaf.dragonflybsd.org/~dillon/dragonfly.git
        fetch = +refs/heads/*:refs/remotes/dillon/*

$ git fetch dillon
$ git branch iphdr dillon/iphdr
$ git checkout iphdr

I have done basic testing with ipfw, pf, ipv4 fragmentation, ipv4 TCP, a little bridging, if_tap, and ipv4 UDP.
Tons of other things haven't been tested yet: ipv6, bpf filters, carp, more extensive ipfw2 and pf, and so forth. Don't bother with IPSEC; it has other issues which will have to be solved in the next stage.
Only do testing if you have some experience with the interfaces in question.
This branch contains the network byte ordering and adjustment fixes for ip_len, ip_off, and a large chunk of the protosw cleanup. I am going to leave the branch alone except for bug fixes.
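As a rough illustration of what keeping ip_len and ip_off in network byte order means for code that reads them, here is a minimal userland sketch. It is not taken from the iphdr branch; the `ex_` names, the struct, and the constants are hypothetical, and it only assumes the standard IPv4 header layout (total length at byte offset 2, flags and fragment offset at byte offset 6).

```c
#include <arpa/inet.h>  /* ntohs */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for illustration; not from the iphdr branch. */
#define EX_IP_MF      0x2000u  /* "more fragments" flag in ip_off */
#define EX_IP_OFFMASK 0x1fffu  /* fragment offset, in 8-byte units */

struct ex_ip_info {
    uint16_t total_len;   /* host order, in bytes */
    uint16_t frag_off;    /* host order, in bytes */
    int      more_frags;  /* nonzero if the MF flag is set */
};

/*
 * Read ip_len and ip_off from a raw IPv4 header. The packet itself is
 * left in network byte order; conversion happens only on the local copies.
 */
static int
ex_ip_parse(const uint8_t *pkt, size_t len, struct ex_ip_info *out)
{
    uint16_t ip_len_n, ip_off_n, off_h;

    if (len < 20 || (pkt[0] >> 4) != 4)
        return -1;                          /* too short or not IPv4 */
    memcpy(&ip_len_n, pkt + 2, sizeof(ip_len_n));
    memcpy(&ip_off_n, pkt + 6, sizeof(ip_off_n));
    off_h = ntohs(ip_off_n);
    out->total_len = ntohs(ip_len_n);
    out->frag_off = (uint16_t)((off_h & EX_IP_OFFMASK) << 3);
    out->more_frags = (off_h & EX_IP_MF) != 0;
    return 0;
}
```

Because the header is never rewritten in place, every consumer sees the same on-wire representation and converts at the point of use.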

--

I am now starting work on another branch that will contain a rewrite of the whole ether/ip demux and dispatch mechanic. All protocols will become per-cpu and there will be just one protocol support thread for each cpu. NETISRs will go away and be replaced with a more direct netmsg queueing operation. The toeplitz hash will be integrated into the demux mechanic (probably integrated with ip_lengthcheck()), so the cpu switch will occur even before a protocol is handed the packet. Fast-forwarding will still occur in the demux and be able to bypass netmsg queueing.
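To make the idea of hashing a flow to a cpu before protocol dispatch concrete, here is a minimal userland sketch of a Toeplitz hash over a flow tuple, followed by a simple mapping to a cpu index. It is not the planned kernel code: the `ex_` names are hypothetical, the key is supplied by the caller, and the modulo mapping is an assumption, since the post does not say how the hash will be mapped to a cpu.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toeplitz hash: for each set bit of the input (most significant bit
 * first), XOR in the 32-bit window of the key starting at that bit.
 * The key must be at least 4 bytes longer than the input.
 */
static uint32_t
ex_toeplitz(const uint8_t *key, size_t keylen, const uint8_t *data, size_t len)
{
    uint32_t hash = 0;
    uint32_t window;
    size_t i;
    int b;

    assert(keylen >= len + 4);
    window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
             ((uint32_t)key[2] << 8) | (uint32_t)key[3];
    for (i = 0; i < len; i++) {
        for (b = 7; b >= 0; b--) {
            if (data[i] & (1u << b))
                hash ^= window;
            /* Slide the window one bit further along the key. */
            window <<= 1;
            if (key[i + 4] & (1u << b))
                window |= 1;
        }
    }
    return hash;
}

/* Assumed mapping from hash to cpu index; ncpus must be nonzero. */
static unsigned
ex_pick_cpu(uint32_t hash, unsigned ncpus)
{
    return hash % ncpus;
}
```

A caller would typically pass the source address, destination address, and ports in network byte order as `data`, so that all packets of one flow land on the same cpu.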

Hopefully, when all is said and done, the combination of the stage 1 fixes now available and this stage 2 work will give us a huge degree of flexibility with regard to managing packet mbufs, including an ability to trivially requeue them (something IPSEC and other tunnel implementations need to be able to do).

This second stage will probably take a week at least.

-Matt

Actions #1

Updated by pavalos over 14 years ago

I'm seeing differing behavior, but basically the machine deadlocks after I
login via SSH. Sometimes I get a panic, and sometimes it just deadlocks.
Here's a panic I got:

LWKT_WAIT_IPIQ WARNING! 0 wait 1 (-1)
panic: LWKT_WAIT_IPIQ
mp_lock = 00000001; cpuid = 0
Trace beginning at frame 0xd7f1ec70
panic(ffffffff) at panic+0x14f
panic(c0382e3d,297,ff808000,d7cd2e10,ff800000) at panic+0x14f
lwkt_wait_ipiq(ff808000,8e6a,ff808000,c01df72c,d7cd2e10,0) at
lwkt_wait_ipiq+0xd7
callout_stop(d7cd2e10,ff800000,246,d7ba78d0,d7ec83cc) at callout_stop+0xd8
ahd_done(d7e2b4b8,d7cd2db8) at ahd_done+0xaf
ahd_run_qoutfifo(d7e2b4b8,0,ff800000,d7f1ed84,c01a463d) at
ahd_run_qoutfifo+0xd5
ahd_platform_intr(d7e2b4b8,0) at ahd_platform_intr+0x10d
ithread_handler(a,0,0,0,0) at ithread_handler+0x171
lwkt_exit() at lwkt_exit
boot() called on cpu#0
Uptime: 1m33s

It froze right after that, and I was unable to get a vmcore. After that, I rebooted and
it just froze right after I logged in via SSH. Had to power cycle to get it
back. The machine is completely unusable running your iphdr branch.

Actions #2

Updated by dillon over 14 years ago

:Peter Avalos added the comment:
:
:I'm seeing differing behavior, but basically the machine deadlocks after I
:login via SSH. Sometimes I get a panic, and sometimes it just deadlocks.
:Here's a panic I got:

Ugh.  This panic is completely outside the networking paths, which
kinda implies maybe memory corruption somewhere. This is going to
be difficult to track down.
-Matt
Matthew Dillon
Actions #3

Updated by pavalos over 14 years ago

On Sun, May 30, 2010 at 09:00:36PM -0700, Matthew Dillon wrote:

> :Peter Avalos added the comment:
> :
> :I'm seeing differing behavior, but basically the machine deadlocks after I
> :login via SSH. Sometimes I get a panic, and sometimes it just deadlocks.
> :Here's a panic I got:
>
> Ugh. This panic is completely outside the networking paths, which
> kinda implies maybe memory corruption somewhere. This is going to
> be difficult to track down.

Just to be clear, with a normal master, the box runs fine. Let me know
what I can do to help.

--Peter

Actions #4

Updated by thomas.nikolajsen over 14 years ago

Works here.

I have tested it on UP & SMP systems, using e.g. NFS & SSH, and
haven't seen any issues.

A minor thing is that kernels don't build on the iphdr branch;
git cherry-pick 3e2720774f6ba4b654a4212f2b8d9dc65585fba7 fixes this
(http://leaf.dragonflybsd.org/mailarchive/commits/2010-05/msg00082.html).

Actions #5

Updated by dillon over 14 years ago

:Thomas Nikolajsen added the comment:
:
:Works here.
:
:I have tested it on UP & SMP systems, using e.g NFS & SSH,
:haven't seen any issues.
:
:A minor thing is that kernels doesn't build on iphdr branch:
:git cherry-pick 3e2720774f6ba4b654a4212f2b8d9dc65585fba7 fixes this
:(http://leaf.dragonflybsd.org/mailarchive/commits/2010-05/msg00082.html).

Ok, I'll merge the latest crater/master into iphdr so it is synced
up.
-Matt
Matthew Dillon
Actions #6

Updated by dillon over 14 years ago

I have re-merged master into my iphdr branch on leaf. It could use
some testing again.

I haven't found anything that would explain the corruption Peter
originally reported in this bug. Is it possible that it might have
been a kernel misbuild?
In any case, the branch is synchronized with head now and I would like
to have another go at testing.
-Matt
Matthew Dillon
Actions #7

Updated by thomas.nikolajsen over 14 years ago

Works here;
minor testing, build(7) using NFS, w/ SMP kernel.

-thomas
Actions #8

Updated by sjg about 14 years ago

This has been on my test box for a couple of days now, with no failures or odd
behavior to report.

dhclient exercises bpf, so it works (the basics, anyway).

Actions #9

Updated by tuxillo over 2 years ago

  • Description updated (diff)
  • Assignee deleted (0)