Bug #309

Another panic in 1.6.x

Added by elekktretterr over 8 years ago. Updated about 8 years ago.

Status: Closed
Start date:
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -
Description

I think I have another PF panic. Unfortunately I can't get a backtrace,
although other people have provided backtraces. The screen *seems* to
be scrolling like this:
db>
db>
db>
db>
....

at a very fast rate. I plugged in a keyboard but it isn't responding. Why
is this bug still not fixed? I've just had 20 days of uptime and this is
the second time it has happened in 2 months. If there is such a thing as
a *grave* bug, this is one.

Petr

History

#1 Updated by corecode over 8 years ago

duplicate

#2 Updated by corecode over 8 years ago

Petr Janda wrote:
> at a very fast rate. I plugged in keyboard but it isnt responding. Why
> is this bug still not fixed? Ive just had 20 days uptime and this is 2nd
> time it happened in 2 months. If there is a such thing as a *gravely*
> bug, this is one.

Yes, we know, but nobody has been able to find the bug yet.

cheers
simon

#3 Updated by geekgod over 8 years ago

Petr Janda wrote:
[snip]
> Why
> is this bug still not fixed? Ive just had 20 days uptime and this is 2nd
> time it happened in 2 months. If there is a such thing as a *gravely*
> bug, this is one.

It is incredibly difficult to locate these types of bugs without a
backtrace / coredump. Hopefully next time db> will be responsive and
you can either get a backtrace or call doadump to create a corefile.

Scott

#4 Updated by elekktretterr over 8 years ago

Any ideas on why db was doing what it was doing instead of allowing me
to get a backtrace?

Petr

Scott Ullrich wrote:
> Petr Janda wrote:
> [snip]
>> Why is this bug still not fixed? Ive just had 20 days uptime and this
>> is 2nd time it happened in 2 months. If there is a such thing as a
>> *gravely* bug, this is one.
>
> It is incredibly difficult to locate these types of bugs without a
> backtrace / coredump. Hopefully next time db> will be responsive and
> you can either get a backtrace or call doadump to create a corefile.
>
> Scott
>
>

#5 Updated by corecode over 8 years ago

[TOFU reformatted]
Petr Janda wrote:
>>> Why is this bug still not fixed? Ive just had 20 days uptime and this
>>> is 2nd time it happened in 2 months. If there is a such thing as a
>>> *gravely* bug, this is one.
>> It is incredibly difficult to locate these types of bugs without a
>> backtrace / coredump. Hopefully next time db> will be responsive and
>> you can either get a backtrace or call doadump to create a corefile.
> Any ideas on why db was doing what it was doing instead of allowing me
> to get a backtrace?

Actually, I think we already have about 10 backtraces. It's just that the bug is not straightforward to find, so don't worry about that.

cheers
simon

#6 Updated by elekktretterr over 8 years ago

Have you tried consulting the PF devs?

Petr

Simon 'corecode' Schubert wrote:
> [TOFU reformatted]
> Petr Janda wrote:
>>>> Why is this bug still not fixed? Ive just had 20 days uptime and
>>>> this is 2nd time it happened in 2 months. If there is a such thing
>>>> as a *gravely* bug, this is one.
>>> It is incredibly difficult to locate these types of bugs without a
>>> backtrace / coredump. Hopefully next time db> will be responsive
>>> and you can either get a backtrace or call doadump to create a
>>> corefile.
>> Any ideas on why db was doing what it was doing instead of allowing
>> me to get a backtrace?
>
> actually I think we already have about 10 backtraces. It's just that
> the bug is not straightforward to find, so don't worry about that.
>
> cheers
> simon
>

#7 Updated by swildner over 8 years ago

Petr Janda wrote:
> Have you tried consulting the PF devs?

Hm? Why?

If we're talking about the

db>
db>
db>
..

issue here then my impression is that it has to do with the USB keyboard
I/O not being handled properly.

Sascha

#8 Updated by elekktretterr over 8 years ago

Well, from what I could gather, DragonFly doesn't have a developer who
knows the PF code inside and out, or is that not so? (This is not meant
as an inflammatory remark by any means.)

Perhaps the PF devs could look at the backtrace and point to
something that explains why PF is doing what it's doing.

#9 Updated by corecode over 8 years ago

Petr Janda wrote:
> Have you tried consulting the PF devs?

Of course. Nobody could tell us the cause; it is not a known problem. Something damages the state tables.

cheers
simon

#10 Updated by elekktretterr over 8 years ago

OK, I see, thanks.

Someone is porting PF from OpenBSD 3.9. What are the chances that this
bug will not happen then? If the chances are not good, perhaps some hack
could be considered that at least stops the kernel from panicking every
so often. What I find quite intriguing is that it happens only after
around 3 weeks of uptime. Why not earlier or later? I'd even rather
sacrifice some functionality as long as it prevents the kernel from
panicking.

Petr

Simon 'corecode' Schubert wrote:
> Petr Janda wrote:
>> Have you tried consulting the PF devs?
>
> of course. nobody could tell us the cause, it is not a known
> problem. something damages the state tables.
>
> cheers
> simon
>

#11 Updated by bastyaelvtars over 8 years ago

Simon 'corecode' Schubert wrote:
> Petr Janda wrote:
>> Have you tried consulting the PF devs?
>
> of course. nobody could tell us the cause, it is not a known problem.
> something damages the state tables.

Guys, next week I will deploy a filtering bridge running 1.6.1. 20-30k
states are to be expected. I hope I can crash it and tell you what is
wrong. Petr, could you show me your rules file? I recall having freezes
and device incompatibilities with PF under OpenBSD 3.7 (I use 3.8 and 3.9
now), so maybe we have something in common.

#12 Updated by elekktretterr over 8 years ago

My pf.conf is just a simple one:

ext_if="fxp0"

table <ssh-bruteforce>
block drop in quick on $ext_if from <ssh-bruteforce>

block in
pass out keep state

pass quick on { lo }
antispoof quick for { lo, fxp0 }

#pass in on $ext_if proto tcp to ($ext_if) port ssh \
# flags S/SA keep state \
# (max-src-conn-rate 3/30, overload <ssh-bruteforce> flush global)

pass in on $ext_if proto tcp to ($ext_if) port { ssh, smtp, imap, http,
domain } keep state
pass in on $ext_if proto udp to ($ext_if) port { domain } keep state

The commented-out section blocks script kiddies; unfortunately it doesn't
work in our PF version, which is why it's commented out.

Petr

Gergo Szakal wrote:
> Simon 'corecode' Schubert wrote:
>> Petr Janda wrote:
>>> Have you tried consulting the PF devs?
>>
>> of course. nobody could tell us the cause, it is not a known
>> problem. something damages the state tables.
>
> Guys, next week I will deploy a filtering bridge running 1.6.1. 20-30k
> states are expectable. Hope I can crash it and tell you what is wrong.
> Petr, could you show me your rules file? I recall having freeezes and
> device incompatibilities if PF under OpenBSD 3.7 (I use 3.8 and 3.9
> now) and maybe we have something in common.
>
>

#13 Updated by bastyaelvtars over 8 years ago

Petr Janda wrote:
> My pf.conf is just a simple one:
>
> ext_if="fxp0"
>
> table <ssh-bruteforce>
> block drop in quick on $ext_if from <ssh-bruteforce>
>
> block in
> pass out keep state
>
> pass quick on { lo }
> antispoof quick for { lo, fxp0 }
>
> #pass in on $ext_if proto tcp to ($ext_if) port ssh \
> # flags S/SA keep state \
> # (max-src-conn-rate 3/30, overload <ssh-bruteforce> flush global)
>
> pass in on $ext_if proto tcp to ($ext_if) port { ssh, smtp, imap, http,
> domain } keep state
> pass in on $ext_if proto udp to ($ext_if) port { domain } keep state
>
> The commented section blocks script kiddies, unfortunately it doesnt
> work in our PF version. Hence why its commented.
>

Don't you have any configuration section at all? (directives starting
with set)

#14 Updated by elekktretterr over 8 years ago

That's all I have.

Gergo Szakal wrote:
> Petr Janda wrote:
>> My pf.conf is just a simple one:
>>
>> ext_if="fxp0"
>>
>> table <ssh-bruteforce>
>> block drop in quick on $ext_if from <ssh-bruteforce>
>>
>> block in
>> pass out keep state
>>
>> pass quick on { lo }
>> antispoof quick for { lo, fxp0 }
>>
>> #pass in on $ext_if proto tcp to ($ext_if) port ssh \
>> # flags S/SA keep state \
>> # (max-src-conn-rate 3/30, overload <ssh-bruteforce> flush global)
>>
>> pass in on $ext_if proto tcp to ($ext_if) port { ssh, smtp, imap,
>> http, domain } keep state
>> pass in on $ext_if proto udp to ($ext_if) port { domain } keep state
>>
>> The commented section blocks script kiddies, unfortunately it doesnt
>> work in our PF version. Hence why its commented.
>>
>
> Don't you have any configuration section at all? (directives starting
> with set)
>
>
