Bug #360

bridge_pfil: m_pullup failed

Added by bastyaelvtars over 7 years ago. Updated over 7 years ago.

Status: Closed
Start date:
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -

Description

OK, another issue here:

Oct 25 11:28:16 fw kernel: bridge_pfil: m_pullup failed

I get loads of such messages in the system log. I have no reports of network
outages or anything, I just get these messages. Question is: any way to debug?

History

#1 Updated by corecode over 7 years ago

sounds like too small packets? or at least something like that, i guess.
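
For context, here is a rough userland sketch of the "too small packet" case. m_pullup() asks for the first N bytes of a packet to be made contiguous, and that cannot succeed if the whole chain holds fewer than N bytes. The names `struct frag` and `pullup_sketch` are made up for illustration; this is not the kernel's mbuf code, and the real m_pullup() can also fail for other reasons, such as a failed allocation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an mbuf: one fragment of a packet. */
struct frag {
    const unsigned char *data;
    size_t len;
    const struct frag *next;
};

/*
 * Copy the first `want` bytes of the chain into `out`, which must have
 * room for `want` bytes.  Returns 0 on success, or -1 if the whole chain
 * holds fewer than `want` bytes (the "packet too small" case).
 */
static int
pullup_sketch(const struct frag *f, size_t want, unsigned char *out)
{
    size_t got = 0;

    for (; f != NULL && got < want; f = f->next) {
        size_t n = f->len;

        if (n > want - got)
            n = want - got;
        memcpy(out + got, f->data, n);
        got += n;
    }
    return (got == want) ? 0 : -1;
}
```

With a 5-byte chain, asking for 4 bytes works, while asking for a full 20-byte IPv4 header fails.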

#2 Updated by geekgod over 7 years ago

Are you still experiencing this problem? If so you could change the
printf to:

printf("%s: m_pullup %d failed\n", __func__, i);

This will give us a better idea of what is happening and may help
discover the problem.

Scott

#3 Updated by bastyaelvtars over 7 years ago

There have not been any syslog entries regarding this since the 26th of
Oct. Anyway, I have done the thing, maybe we can get some info.

#4 Updated by dillon over 7 years ago

:There have not been any syslog entries regarding this since the 26th of
:Oct. Anyway, I have done the thing, maybe we can get some info.

My guess is that they were bogus packets generated from a particular
source, either accidentally or intentionally corrupted packets.

-Matt
Matthew Dillon

#5 Updated by bastyaelvtars over 7 years ago

Think you should mark this irreproducible or so, if there are no such
messages within a few days.

#6 Updated by cedric over 7 years ago

Gergo Szakal wrote:

Sorry for the suggestion, but could it be bad hardware?

I'm saying that because of the pfr_update_stats problem
that you also have. This assertion means the following:

1) You have a block rule with a table.

table <foo> { 127/8 10/8 127/12 192.168 }
block on $ext_if from <foo>

2) A packet from, say, 10.0.0.1 arrives on ext_if.

3) PF selects the block rule and drops the packet.

4) PF calls pfr_update_stats to increase the packet
counter of the selected block, in that case 10/8.

5) For some reason, the packet does NOT find any
matching block, as if 10/8 had been removed from
the table between steps 3 and 4.
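
Steps 4 and 5 can be sketched roughly as follows. This is only an illustration under stated assumptions: `struct tbl_entry`, `tbl_lookup` and `tbl_update_stats` are hypothetical names, the table is a flat array rather than whatever structure PF really uses, and addresses are plain host-order integers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: network, prefix length, packet counter. */
struct tbl_entry {
    uint32_t      net;      /* network address, host byte order */
    int           plen;     /* prefix length, 0..32 */
    unsigned long packets;  /* packets matched so far */
};

static uint32_t
plen_mask(int plen)
{
    return (plen == 0) ? 0 : 0xffffffffu << (32 - plen);
}

/* Return the most specific entry containing addr, or NULL if none does. */
static struct tbl_entry *
tbl_lookup(struct tbl_entry *tbl, size_t n, uint32_t addr)
{
    struct tbl_entry *best = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        uint32_t mask = plen_mask(tbl[i].plen);

        if ((addr & mask) == (tbl[i].net & mask) &&
            (best == NULL || tbl[i].plen > best->plen))
            best = &tbl[i];
    }
    return best;
}

/*
 * Step 4: bump the counter of the block that matched.
 * A return of -1 is the step 5 situation: the rule matched earlier,
 * but no block contains the address any more.
 */
static int
tbl_update_stats(struct tbl_entry *tbl, size_t n, uint32_t addr)
{
    struct tbl_entry *e = tbl_lookup(tbl, n, addr);

    if (e == NULL)
        return -1;
    e->packets++;
    return 0;
}
```

If 10/8 is in the table, an update for 10.0.0.1 succeeds; if the entry has gone missing between the rule match and the update, the second lookup comes back empty, which is the condition the assertion complains about.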

There could be four reasons for that, actually:

a) Bug in PF, but other people should see that too.
Are there other people still seeing that?

b) Race condition: incorrect locking in the PF port to DFly
causing table changes to occur in the midst of pf_test().
But you say that you didn't change the table content anyway.

c) Memory corruption due to bad hardware.

d) Memory corruption due to unrelated OS bug.

Cedric

#7 Updated by dillon over 7 years ago

:> Think you should mark this irreproducible or so, if there are no such
:> messages within a few days.
:
:Sorry for the suggestion, but could it be bad hardware?

It's unlikely to be bad hardware within the PC. It could be any
number of things but the most likely cause is that the packet was
constructed that way.

The only way to tell for sure would be to add some code to the kernel
to dump the packet out along with the error message and then examine
the packet(s) looking for commonality (such as they all come from the
same originating IP), then use tcpdump to monitor for the pattern
and diagnose where it is coming from.
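
A minimal sketch of the kind of helper such a dump could use, written as plain userland C so it can be tried outside the kernel. The function names `hex_format` and `ipv4_src` are assumptions for illustration, and the buffer is assumed to start at the IPv4 header; wiring something like this into bridge_pfil is left open here.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Format up to `len` bytes of `p` as space-separated hex into `out`.
 * Returns the number of bytes formatted, which is less than `len`
 * when `out` is too small to hold them all.
 */
static size_t
hex_format(const unsigned char *p, size_t len, char *out, size_t outsz)
{
    size_t i, off = 0;

    if (outsz == 0)
        return 0;
    out[0] = '\0';
    for (i = 0; i < len; i++) {
        /* two hex digits and a NUL, plus a separator after the first byte */
        size_t need = (i == 0) ? 3 : 4;

        if (outsz - off < need)
            break;
        if (i == 0)
            off += (size_t)snprintf(out + off, outsz - off, "%02x", p[i]);
        else
            off += (size_t)snprintf(out + off, outsz - off, " %02x", p[i]);
    }
    return i;
}

/*
 * Copy the IPv4 source address (header bytes 12..15) into `src`.
 * Returns -1 if the buffer is shorter than a minimal 20-byte header.
 */
static int
ipv4_src(const unsigned char *p, size_t len, unsigned char src[4])
{
    if (len < 20)
        return -1;
    memcpy(src, p + 12, 4);
    return 0;
}
```

Logging the hex bytes next to the error message, and the source address whenever the header is long enough to contain one, would give the commonality check something to work with.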

It also depends on the type of traffic being bridged. Any environment
exposed to the outside world will *ALWAYS* see weird, malformed, or
corrupted packets, both accidental and intentional. Internal
environments will also often see weird packets, particularly from
Windows boxes (and double particularly if they are infected with a
few viruses, which many Windows boxes are to some degree or other).

-Matt

#8 Updated by bastyaelvtars over 7 years ago

I have not had this for a long time, and it cannot be reproduced on a UP
machine with the same NICs (the machine dropping these is an MP one).
