Bug #885

Strange MXCSR messages?

Added by joerg1 almost 7 years ago. Updated over 6 years ago.

Status: Closed
Priority: Normal
Assignee: -
% Done: 0%
Category: -
Target version: -

Description

Hi,

I enabled "nata" support on one of my boxes lately, and now my
/var/log/messages gets flooded with entries like this:

[...]
kernel: pid 54403 (icecast) signal return from user: illegal FP MXCSR ffff0010
last message repeated 24 times

kernel: pid 9033 (mysqld) signal return from user: illegal FP MXCSR ffff0010
last message repeated 7 times
[...]

This seems to go on and on forever, but besides this spam from
syslogd, everything works okay.

I re-built the icecast and mysql packages and re-installed them, but
these messages still keep coming up.

I also found it's pretty hard to get information about this "MXCSR"
stuff, so if someone could give me a short explanation...? As far as I
can tell, it's some kind of register stuff.

Like I said, everything works fine, but it's annoying anyway, and I
don't believe adjusting syslogd's log level is the right way to fix
this -- although I don't believe it's a bug, either.

Enjoy the show

--j

History

#1 Updated by TGEN almost 7 years ago

Is the problem not encountered if you recompile from the exact same
sources, but with old ATA instead of NATA?

MXCSR is the SSE control/status register, and that value means the
Underflow Flag bit is set. The other 1 bits are reserved ones, which
should *NOT* be set to 1. Attempting to set them to 1 will result in a
general protection exception, which is probably what you're seeing here.
My guess is, the recent changes to how the signal code in the kernel
saves and restores the MXCSR register, introduced this bug, and not
NATA. I'll have a look later at the code in question.
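
To make the bit reading above concrete, here is a minimal sketch that decodes an MXCSR value. The names decode_mxcsr, MXCSR_BITS and RESERVED_MASK are made up for illustration and are not taken from any kernel source; the bit positions follow the documented MXCSR layout (bit 6, DAZ, is only valid on CPUs whose MXCSR_MASK reports it).

```python
# Names of MXCSR bits 0-15; bits 16-31 are reserved and must be zero.
MXCSR_BITS = [
    "IE",   # bit 0: invalid-operation flag
    "DE",   # bit 1: denormal flag
    "ZE",   # bit 2: divide-by-zero flag
    "OE",   # bit 3: overflow flag
    "UE",   # bit 4: underflow flag
    "PE",   # bit 5: precision flag
    "DAZ",  # bit 6: denormals-are-zeros (CPU dependent)
    "IM", "DM", "ZM", "OM", "UM", "PM",  # bits 7-12: exception masks
    "RC0", "RC1",                        # bits 13-14: rounding control
    "FZ",   # bit 15: flush-to-zero
]
RESERVED_MASK = 0xFFFF0000  # hypothetical constant name for bits 16-31


def decode_mxcsr(value):
    """Return (names of set bits 0-15, reserved bits that are set)."""
    names = [name for bit, name in enumerate(MXCSR_BITS) if value & (1 << bit)]
    return names, value & RESERVED_MASK


names, reserved = decode_mxcsr(0xFFFF0010)
print(names, hex(reserved))  # ['UE'] 0xffff0000
```

A non-zero second element means the value cannot be loaded into MXCSR without faulting, which matches the "illegal FP MXCSR" wording in the log.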

Cheers,
--
Thomas E. Spanjaard

#2 Updated by dillon almost 7 years ago

:Joerg Anslik wrote:
:> I enabled "nata" support on one of my boxes lately, and now my
:> /var/log/messages gets flooded with entries like this:
:
:Is the problem not encountered if you recompile from the exact same
:sources, but with old ATA instead of NATA?
:
:> [...]
:> kernel: pid 54403 (icecast) signal return from user: illegal FP MXCSR
:> ffff0010
:> last message repeated 24 times
:> kernel: pid 9033 (mysqld) signal return from user: illegal FP MXCSR
:> ffff0010
:> last message repeated 7 times
:> [...]
:> I also found it's pretty hard to get information about this "MXCSR"
:> stuff, so if someone could give me a short explanation...? As far as I
:> can tell, it's some kind of register stuff.
:
:MXCSR is the SSE control/status register, and that value means the
:Underflow Flag bit is set. The other 1 bits are reserved ones, which
:should *NOT* be set to 1. Attempting to set them to 1 will result in a
:general protection exception, which is probably what you're seeing here.
:My guess is, the recent changes to how the signal code in the kernel
:saves and restores the MXCSR register, introduced this bug, and not
:NATA. I'll have a look later at the code in question.
:
:Cheers,
:--
: Thomas E. Spanjaard
:

Yah, it's unrelated to nata. I'll get mysqld built up and try to
figure out what is messing up the signal stack. firefox and gtk
have the same problem.

-Matt

#3 Updated by joerg1 almost 7 years ago

Okay,

I see, so it's probably just a coincidence... I re-built the kernel
with the NATA configuration in place, having cvsup'd the last two
weeks (or so) of changes beforehand.

I never saw these icecast or mysqld messages before in the messages
logfile.

--j

#4 Updated by dillon almost 7 years ago

::should *NOT* be set to 1. Attempting to set them to 1 will result in a
::general protection exception, which is probably what you're seeing here.
::My guess is, the recent changes to how the signal code in the kernel
::saves and restores the MXCSR register, introduced this bug, and not
::NATA. I'll have a look later at the code in question.
::
::Cheers,
::--
:: Thomas E. Spanjaard
::
:
: Yah, it's unrelated to nata. I'll get mysqld built up and try to
: figure out what is messing up the signal stack. firefox and gtk
: have the same problem.
:
: -Matt

Ok, I figured it out. It's libc_r's thread code. I have to implement
the FP save format field so libc_r uses the correct fxsave or fnsave
instruction.

libc_r was saving and restoring with fnsave and frstor, and the
kernel is using fxsave and fxrstor.
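
A rough sketch of why mixing the two formats yields a bogus MXCSR: in the fxsave layout MXCSR lives at byte offset 24, while in the 32-bit fnsave layout that same offset holds the x87 data pointer selector with its upper 16 bits reserved. The helper names and the field values below are assumptions picked for illustration (chosen so the result matches the logged value); the thread does not confirm which field actually produced ffff0010.

```python
import struct

FXSAVE_MXCSR_OFFSET = 24  # MXCSR occupies bytes 24-27 of the fxsave area


def build_fnsave_frame(fcw, fsw, ftw, fip, fcs, fdp, fds, reserved_hi=0xFFFF):
    """Pack a 108-byte fnsave-style frame (32-bit protected-mode layout).

    reserved_hi is an assumption: the value found in the reserved upper
    halves of the 16-bit fields.
    """
    def slot(low16):
        return (reserved_hi << 16) | (low16 & 0xFFFF)

    header = struct.pack(
        "<7I",
        slot(fcw),     # offset 0:  control word
        slot(fsw),     # offset 4:  status word
        slot(ftw),     # offset 8:  tag word
        fip,           # offset 12: instruction pointer offset
        fcs & 0xFFFF,  # offset 16: instruction pointer selector (opcode bits left 0)
        fdp,           # offset 20: data pointer offset
        slot(fds),     # offset 24: data pointer selector
    )
    return header + bytes(80)  # eight 10-byte x87 registers, zeroed here


def read_mxcsr_as_fxsave(frame):
    """Read the dword where an fxrstor-style consumer expects MXCSR."""
    return struct.unpack_from("<I", frame, FXSAVE_MXCSR_OFFSET)[0]


# Illustrative values only, not measured on a real system.
frame = build_fnsave_frame(fcw=0x037F, fsw=0, ftw=0xFFFF,
                           fip=0, fcs=0, fdp=0, fds=0x0010)
print(len(frame), hex(read_mxcsr_as_fxsave(frame)))  # 108 0xffff0010
```

Whatever the real field contents were, the point is that an fnsave frame has no MXCSR at all, so anything restored from that offset is unrelated data.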

I'm committing fixes right now. Both the kernel and libc_r must be
recompiled.

-Matt
Matthew Dillon

#5 Updated by dillon almost 7 years ago

Ok, it should now be fixed. I removed the fnsave/fxsave instructions
from libc_r entirely... fxsave can't be used there anyway because
the FP state on the signal stack is not 16-byte aligned (it was causing
an infinite loop, which is another issue but one I'm not going to worry
about right now).
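
For reference, the alignment condition mentioned above can be sketched like this. fxsave needs its memory operand on a 16-byte boundary; the function names and the example address are hypothetical and only show the arithmetic, not the actual libc_r or kernel code.

```python
FXSAVE_ALIGNMENT = 16  # fxsave/fxrstor operands must be 16-byte aligned


def is_aligned(addr, alignment=FXSAVE_ALIGNMENT):
    """True if addr is a multiple of alignment (a power of two)."""
    return addr & (alignment - 1) == 0


def align_down(addr, alignment=FXSAVE_ALIGNMENT):
    """Round addr down to the previous aligned address (stacks grow down)."""
    return addr & ~(alignment - 1)


save_area = 0xBFBFE9C4  # hypothetical address of an FP save area on a signal stack
print(is_aligned(save_area))       # False
print(hex(align_down(save_area)))  # 0xbfbfe9c0
```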

Now the kernel is 100% responsible for saving and restoring the FP state.

Now that it's fixed, I also changed the kernel to send SIGFPE to a
user program that messes up the MXCSR.

-Matt
Matthew Dillon

#6 Updated by corecode almost 7 years ago

Ah, so it was our threading lib that was producing the erroneous data?

That would explain why it happened only for threaded programs (so far) :)

cheers
simon

#7 Updated by dillon almost 7 years ago

:Ah, so it was our threading lib that was producing the erroneous data?
:
:That would explain why it happened only for threaded programs (so far) :)
:
:cheers
: simon

Yup. libc_r was running fnsave (387 format save frame) into the FP
save area of the signal context. The kernel tried to restore it
with fxrstor (SSE format save frame).

Since the kernel now saves and restores the FP state from the signal
context, libc_r no longer has to do it, so ripping the code out of
libc_r solves the problem.

-Matt
Matthew Dillon

#8 Updated by c.turner almost 7 years ago

Has anyone checked out the firefox / moused thing since this fix?

I noticed that '/usr/pkg/lib/firefox/firefox-bin' is linked to libc_r ..

I'm still, to my shame, trying to update my system to test last
week's fix.. stupid year-closing accounting :)

#9 Updated by josepht almost 7 years ago

The fix works for me.

Joe

#10 Updated by c.turner almost 7 years ago

woohoo! thanks for picking up my slack..

hopefully in a day or two I'll experience simultaneous 2 dimensional and
text-mode mouse bliss!

cheers

- Chris
