Bug #676

SB600 HDA audio (Realtek ALC883) glitch - IRQ sharing?

Added by floid over 7 years ago. Updated about 7 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date:
Due date:
% Done: 0%

Description

Now that NATA works, I think my sound problems are getting easier to diagnose...

Basically, trying to play sound has worked, but with a constant "crunch" /
"crackling" that sounded vaguely like a mis-sized buffer somewhere or horrible
clipping -- pretty tough to explain, let's say it occurs at a rate of something
like 3 to 6Hz. (Playing with morse, I can get 'e' to sound like anything from 1
to 7 'dit's with extra garbage.) I was assuming the HDA driver was just immature.

Well, I run X.org and use xmms and mplayer to test, and it turns out that moving
the mouse temporarily cures the problem. Swirl mouse around, playback is
essentially clear; stop and the crunchies come back.

Looks like the OHCI controller the mouse is on, atapci1 and the sound device all
share IRQ 3 right now.

% dmesg | grep "irq 3"
IOAPIC #0 intpin 16 -> irq 3
ohci0: <OHCI (generic) USB controller> [tentative] mem 0xff6fe000-0xff6fefff irq 3 at device 19.0 on pci0
ohci0: <OHCI (generic) USB controller> [attached!] mem 0xff6fe000-0xff6fefff irq 3 at device 19.0 on pci0
atapci1: <ATI SB600 UDMA133 controller> [tentative] port 0xff00-0xff0f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 irq 3 at device 20.1 on pci0
atapci1: <ATI SB600 UDMA133 controller> [attached!] port 0xff00-0xff0f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 irq 3 at device 20.1 on pci0
pci0: <unknown card> (vendor=0x1002, dev=0x4383) at 20.2 irq 3
sio1 [tentative] failed to probe at port 0x2f8-0x2ff irq 3 on isa0
pcm0: <ATI SB600 High Definition Audio Controller> [tentative] mem 0xff6f4000-0xff6f7fff irq 3 at device 20.2 on pci0
pcm0: <ATI SB600 High Definition Audio Controller> [attached!] mem 0xff6f4000-0xff6f7fff irq 3 at device 20.2 on pci0
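
To check for IRQ sharing without eyeballing grep output, the device name and IRQ number can be pulled out of each attach message. What follows is only a minimal sketch in C, assuming each dmesg message sits on a single unwrapped line and carries its interrupt as " irq N"; parse_irq_line is a hypothetical helper written for this note, not part of any DragonFly tool.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy the device name (the text before the first
 * ':' or ' ') into name[] and return the number that follows " irq ".
 * Returns -1 if the line carries no "irq N" field.
 */
int
parse_irq_line(const char *line, char *name, size_t namelen)
{
	const char *p;
	size_t n;

	assert(line != NULL && name != NULL && namelen > 0);

	/* Device name: everything up to the first ':' or space. */
	n = strcspn(line, ": ");
	if (n >= namelen)
		n = namelen - 1;
	memcpy(name, line, n);
	name[n] = '\0';

	/* IRQ number: the digits following the token " irq ". */
	p = strstr(line, " irq ");
	if (p == NULL)
		return (-1);
	p += strlen(" irq ");
	if (!isdigit((unsigned char)*p))
		return (-1);
	return ((int)strtol(p, NULL, 10));
}
```

Feeding every dmesg line through this and grouping the names by the returned number would list ohci0, atapci1 and pcm0 together under IRQ 3 for the output above.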

The <unknown card> at 20.2 on pci0 is the device that later attaches as pcm0; I'm
loading snd_driver.ko after boot. Athlon 64x2, SMP, APIC_IO, kernel config again
attached in the bug tracker ("MUSTELID2007").

USB mouse:
ums0: Microsoft Microsoft 3-Button Mouse with IntelliEye(TM), rev 1.10/3.00,
addr 3, iclass 3/1
ums0: 3 buttons and Z dir.

Unplugging the mouse doesn't change anything, unplugging atapci1 would be
difficult as it's on the same chip. The BIOS setup for this board doesn't
provide knobs for interrupt assignment or anything sound related (other than
enable/disable, maybe).

This has been a problem with every kernel I've tried, not sure if the mouse
trick was apparent until now. I need to see what happens without X.org or
moused running.

...

Thoughts?

MUSTELID2007 (10.2 KB) floid, 06/03/2007 11:40 PM

dmesg.NOAPIC.4-Jun-07 (13.7 KB) floid, 06/05/2007 02:00 AM

loader.conf(4-Jun-07) (153 Bytes) floid, 06/05/2007 02:02 AM

dmesg.NOAPIC.10-Jun-07 (58.6 KB) floid, 06/10/2007 09:27 PM

History

#1 Updated by dillon over 7 years ago

:Unplugging the mouse doesn't change anything, unplugging atapci1 would be
:difficult as it's on the same chip. The BIOS setup for this board doesn't
:provide knobs for interrupt assignment or anything sound related (other than
:enable/disable, maybe).
:
:This has been a problem with every kernel I've tried, not sure if the mouse
:trick was apparent until now. I need to see what happens without X.org or
:moused running.
:
:...
:
:Thoughts?

Try turning on emergency interrupt polling and see if that solves the
sound problem.

sysctl kern.emergency_intr_enable=1
sysctl kern.emergency_intr_freq=20

The real issue here is that our interrupt routing is horribly out of
date, and it isn't an easy thing to port from the current best
implementation available, which is in FreeBSD.

-Matt

#2 Updated by floid over 7 years ago

Tried the suggested:
> sysctl kern.emergency_intr_enable=1
> sysctl kern.emergency_intr_freq=20

...both at runtime and as loader.conf variables (quoted appropriately, confirmed
set), didn't notice any impact. While live, I tried 'freq's from 1 to 500. But
see below...

> The real issue here is that our interrupt routing is horribly out of
> date, and it isn't an easy thing to port from the current best
> implementation available, which is in FreeBSD.

This got me to open my eyes and notice that the BIOS configuration (as flashes
on the screen during the boot process) says there's a 'Multimedia Device' on IRQ
15. I should probably take a snapshot or something, as I can't remember if it
was assigning *anything* to 3.

The only thing dmesg has to say about IRQ 15 is:
APIC_IO: MP table broken: IRQ 15 not ISA when IRQ 14 is!

#3 Updated by dillon about 7 years ago

:Tried the suggested:
:> sysctl kern.emergency_intr_enable=1
:> sysctl kern.emergency_intr_freq=20
:
:...both at runtime and as loader.conf variables (quoted appropriately, confirmed
:set), didn't notice any impact. While live, I tried 'freq's from 1 to 500. But
:see below...

Try compiling up a SMP kernel without APIC_IO, see if that works.

:This got me to open my eyes and notice that the BIOS configuration (as flashes
:on the screen during the boot process) says there's a 'Multimedia Device' on IRQ
:15. I should probably take a snapshot or something, as I can't remember if it
:was assigning *anything* to 3.
:
:The only thing dmesg has to say about IRQ 15 is:
: APIC_IO: MP table broken: IRQ 15 not ISA when IRQ 14 is!

A lot of BIOSes have bugged MP tables. IRQ 14 and 15 are fixed IRQs
used for the IDE controller when it runs in compatible mode (which is
almost always). If IRQ 14 is specified as being an ISA interrupt,
then IRQ 15 had better be too.

-Matt
Matthew Dillon
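
The rule behind that "MP table broken" warning can be written down as a small consistency check. This is only a sketch of the rule as stated above, not the kernel's actual MP-table code; the enum, the array layout and the function name are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bus classification for an MP-table interrupt entry. */
enum irq_bus { IRQ_BUS_NONE, IRQ_BUS_ISA, IRQ_BUS_PCI };

/*
 * Returns true when the table is consistent for the legacy IDE pair:
 * if IRQ 14 is marked ISA, IRQ 15 must be marked ISA as well.
 * bus[] is indexed by IRQ number and must cover at least IRQs 0..15.
 */
bool
ide_irq_pair_consistent(const enum irq_bus *bus, size_t nirqs)
{
	assert(bus != NULL && nirqs >= 16);

	if (bus[14] == IRQ_BUS_ISA && bus[15] != IRQ_BUS_ISA)
		return (false);
	return (true);
}
```

A table that marks IRQ 14 as ISA but IRQ 15 as something else fails the check, which is the situation the dmesg warning describes.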

#4 Updated by floid about 7 years ago

> Try compiling up a SMP kernel without APIC_IO, see if that works.

Will do tonight.

> A lot of BIOSes have bugged MP tables. IRQ 14 and 15 are fixed IRQs
> used for the IDE controller when it runs in compatible mode (which is
> almost always). If IRQ 14 is specified as being an ISA interrupt,
> then IRQ 15 had better be too.

Aha! Per my last jabber in the NATA bug (566), the SB600 is unusual and
apparently really only has one legacy channel. But does that impact anything
other than the warning?

I'll have to see if telling the BIOS to make the SATA controller(s) come up in
"Legacy IDE" mode changes the table; per
http://bugs.dragonflybsd.org/file264/dmesg.1.9PREVIEW-NATA2 , using "Native IDE"
didn't obey convention. But of course even the legacy port is coming up on 3.
That dmesg is stale, but the IRQs haven't changed.

#5 Updated by floid about 7 years ago

Looks like a kernel without APIC_IO shows up a new problem; I didn't realize
panics would ever record in the dmesg file at all, but most of this one did:

[All fine until...]
usb0.ohci0.pci0.pcib0.legacypci0.nexus0.root0
usb0: <OHCI (generic) USB controller> [tentative] on ohci0
usb0: USB revision 1.0
uhub0: 2 ports with 2 removable, self powered

Fatal trap 12: page fault while in kernel mode
mp_lock = 00000000; cpuid = 0; lapic.id = 00000000
fault virtual address = 0x0
fault code = supervisor write, page not present
instruction pointer = 0x8:0xc0489f4c
stack pointer = 0x10:0xc3c10d34
frame pointer = 0x10:0xc3c10d44
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = Idle
current thread = pri 46 (CRIT)
<- SMP: XXX

It stopped in usb_add_task(), apparently.

Full dmesg to that point will be attached to #676.

#6 Updated by floid about 7 years ago

That was also a quickkernel, in case I shouldn't have done that.

#7 Updated by dillon about 7 years ago

:Joe "Floid" Kanowitz added the comment:
:
:That was also a quickkernel, in case I shouldn't have done that.

Don't think that would have hurt, but it's always a good idea to
do a complete rebuild if you aren't sure.

Do you have a USB keyboard?

In any case, please try this patch. It creates the event queues
before the early discovery of the USB bus instead of after. I have
placed the patch here:

http://apollo.backplane.com/DFlyMisc/usb01.patch

-Matt
Matthew Dillon

#8 Updated by elekktretterr about 7 years ago

Talking about USB keyboards, something should be done to get USB keyboards
to work in single-user mode. (An old issue, but still a total pain!) I
simply get gibberish when I boot into single-user mode and try typing
anything.

#9 Updated by dillon about 7 years ago

:Talking about USB keyboards, something should be done to get USB keyboards
:to work in single-user mode. (old issue but still total pain!) I just
:simple get jibberish when i boot into single-user mode and try typing
:anything.

If you get gibberish, it's the keyboard mapping mode. Hmm. I thought we
fixed that!

Just as a quick test, try changing this bit in
/usr/src/sys/dev/usbmisc/ukbd/ukbd.c:

	/*
	 * Initialize the translation mode only if we are not
	 * reattaching to an already open keyboard (e.g. console).
	 * Otherwise we might rip the translation mode out from
	 * under X.
	 */
	if (!KBD_IS_CONFIGURED(kbd))
		state->ks_mode = K_XLATE;

To just:

	state->ks_mode = K_XLATE;

This will break X, so it's just a quick test to see if that's the issue.

-Matt
Matthew Dillon

#10 Updated by elekktretterr about 7 years ago

This indeed fixed it. But guess what: I'm typing this under X (modular xorg
7.2, started via startx), and it doesn't seem to have broken it!

Petr

#11 Updated by dillon about 7 years ago

:> This will break X, so its just a quick test to see if that's the
:> issue.
:
:This indeed fixed it. But guess what, Im typing this under X (modular xorg
:7,2, started via startx), it seems to have not broken it!
:
:Petr

I think I see what's going on here. The USB keyboard's translation
mode is set on boot... I just tested that, and it works. But if the
USB keyboard disconnects and reconnects after the device probe, the
translation mode is lost due to that bit of code and you get garbage.

I will commit a fix to HEAD which remembers the translation mode and
restores it if a USB keyboard disconnects and reconnects.

-Matt
Matthew Dillon
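
The idea of remembering the translation mode across a disconnect can be sketched as follows. This is not the fix that was committed to HEAD, only an illustration of the approach described above; the structure, the mode values and the function names are hypothetical stand-ins rather than the real ukbd code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in mode values; the real driver takes K_XLATE etc. from its headers. */
enum kb_mode { MODE_RAW, MODE_XLATE };

/* Hypothetical, simplified keyboard state; not the real ukbd structures. */
struct kb_state {
	enum kb_mode mode;        /* current translation mode */
	enum kb_mode saved_mode;  /* mode remembered across a disconnect */
	bool         have_saved;  /* saved_mode is valid */
};

/* On detach, remember the mode that was in effect. */
void
kb_detach(struct kb_state *st)
{
	assert(st != NULL);
	st->saved_mode = st->mode;
	st->have_saved = true;
}

/*
 * On attach, restore the remembered mode if there is one; otherwise
 * this is a first attach and the default translated mode applies.
 */
void
kb_attach(struct kb_state *st)
{
	assert(st != NULL);
	st->mode = st->have_saved ? st->saved_mode : MODE_XLATE;
}
```

With this shape, a console keyboard that reconnects comes back in translated mode, while a keyboard that X had switched to another mode gets that mode back instead of having it reset underneath X.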

#12 Updated by floid about 7 years ago

No USB keyboard for me, though if there's any reason to get masochistic I have a
PS/2 to USB adapter in a KVM.

Tried usb01.patch; unfortunately, problems remain.

I had the following on uhub0 for the first boot attempt:
ugen0: CPS UPS AE485, rev 1.10/0.01, addr 2
ums0: Microsoft Microsoft 3-Button Mouse with IntelliEye(TM), rev 1.10/3.00,
addr 3, iclass 3/1

I should've taken better notes -- the kernel enumerated either usb0 or uhub0
before quietly locking up, with the block cursor on a new line and the keyboard
unresponsive.

For the second boot, I unplugged both devices and things went smoothly until the
EHCI driver loaded, with a page fault after enumerating its uhub5: 10 ports with
10 removable, self powered.

More debugging-by-photograph, this time with meat-based OCR:

Fatal trap 12: page fault while in kernel mode
mp_lock = 00000000; cpuid = 0; lapic.id = 00000000
fault virtual address = 0x58
fault code = supervisor read, page not present
instruction pointer = 0x8:0xC048A873
stack pointer = 0x10:0xDA3D3D58
frame pointer = 0x10:0xDA3D3D64
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = Idle
current thread = pri 44 (CRIT)

Stopped at usb_discover+0x70: movl 0x58(%eax),%edx

Backtrace:
usb_discover(c3d0a610,1f4,0,0,c062d600) at usb_discover+0x70
usb_event_thread(c3d0a610,0,0,0,0) at usb_event_thread+0x48
kthread_exit() at kthread_exit
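
A note on reading the trap: the faulting instruction loads from 0x58(%eax) and the fault virtual address is 0x58, so %eax must have held 0, i.e. the code dereferenced a NULL pointer to reach a field at offset 0x58. A minimal sketch of that arithmetic in C; base_from_fault is a hypothetical helper, and which structure and field sit at that offset is not something the trap output tells us.

```c
#include <assert.h>
#include <stdint.h>

/*
 * For a memory access of the form disp(%reg), the effective address is
 * reg + disp, so the register value is the fault address minus disp.
 */
uintptr_t
base_from_fault(uintptr_t fault_addr, uintptr_t disp)
{
	assert(fault_addr >= disp);
	return (fault_addr - disp);
}
```

Here base_from_fault(0x58, 0x58) gives 0, which matches a NULL structure pointer inside usb_discover().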

...

Of course, I forgot to disable ehci in loader.conf and see if the sound bug
improved before reporting all this.

===[OFF_TOPIC_PADDING]

Unrelated to *this* bug, some miscellaneous information for the search engines
to pick up:

- Modular X.org ("1.2.0") doesn't like me, as evidenced by the server dying on
signal 11 and/or demonstrating a problem with WaitForSomething() essentially
identical to the one Jeremy C. Reed reported a year ago:

http://lists.freedesktop.org/archives/xorg/2006-March/013678.html
http://leaf.dragonflybsd.org/mailarchive/users/2006-03/msg00082.html

Since the new ati driver, and pretty much every 6.n.n version, doesn't seem to
have improved at autoconfiguration of DVI monitors (or, 9 times out of 10, at
working at all), and the most interesting information I could find had me
sending a note to AMD's Investor Relations that fglrx doesn't cover everyone.

- Every component of Gnome 2.18.1 from pkgsrc-current builds (hooray!) except
for gnome-user-doc. The problem with that seems to be a Python type error, I
need to investigate.

- With BUILD_MAKE_FLAGS=-j 6 set, some of the Gnome packages outside gnome-base
do get into races that have gmake stop on a 'write error.' The really lazy way
to approach this was just to restart the bmake every time where it left off.

- The Gnome System Monitor panel applet (2.18.0) shows a constant "IOWait"
utilization taking up 50% of the "Processor" -- it doesn't break it down by CPU.
This even when top shows the system idle with <1% utilization in all categories.

===[END OFF_TOPIC_PADDING]

-Joe "Floid" Kanowitz

#13 Updated by floid about 7 years ago

Whups. Typo in my off-topic rant, wouldn't want to leave it hanging:

"Since the new ati driver, and pretty much every 6.n.n version, doesn't seem to
have improved at autoconfiguration of DVI monitors (or, 9 times out of 10, at
working at all), and the most interesting information I could find had me
sending a note to AMD's Investor Relations that fglrx doesn't cover everyone, *I
give up for now.*" Have to pick my battles, I'd best help get the kernel
working first. :)

Another forgotten, off-topic factoid: As with NetBSD, you have to disable the
uhid driver (device uhid) to use 'nut' (sysutils/ups-nut, sysutils/ups-nut-usb)
with a USB HID UPS. The userspace newhidups driver uses libusb to support HID
devices directly, and needs access through the generic (device ugen) kernel
driver, which doesn't seem to work if the uhid kernel driver 'claims' the device.

A uhid-aware 'upscontrol' may become a project for me, since it really shouldn't
take hours of configuration to mute a UPS that lasts 30 minutes.

#14 Updated by floid about 7 years ago

*On* topic:

Pulled out the USB devices, pulled ehci from loader.conf, booted perfectly.
pcm0 came up on IRQ 15, where the BIOS had it.

No luck; it got worse. Again, I should probably make audio samples, but the
sound changed from the 'annoying crackle' to a pretty consistent throb/echo:
less like fast "cat /dev/urandom > /dev/dsp" static, more like something that is
both getting retriggered/looping and introducing noise.

An 'e' in morse -p might get something like
"bibibip(chush)(chush)(chush)(chush);" the string 'test' would probably take at
least a minute to conclude, and it took at least a good ten seconds to get it to
stop after ^C.

Tried a 48 kHz .wav out of curiosity; just as bad. Tried enabling the emergency
interrupt polling sysctl while interactive, with no impact.

Without a mouse, keyboard input did seem to affect it a bit, but not to the
point of clearing it up as observed with the APIC_IO kernel and the mouse-shakes.

Verbose dmesg:
http://bugs.dragonflybsd.org/file276/dmesg.NOAPIC.10-Jun-07

The HDA debugging messages on the tail are from my first invocation of mplayer
on a MP3, I think. It does spit something out every time the device is accessed.

#15 Updated by hasso about 7 years ago

Try applying http://leaf.dragonflybsd.org/~hasso/sound-update-from-fbsd6.patch
and, if it doesn't solve your problem, try enabling polling by setting the
hw.snd.pcm0.polling sysctl to 1.

#16 Updated by floid about 7 years ago

Hasso Tepper said:
> Try to apply http://leaf.dragonflybsd.org/~hasso/sound-update-from-fbsd6.patch
> and if it doesn't solve your problem, try enable polling via setting
> hw.snd.pcm0.polling sysctl to 1.

I confess I had some trouble applying the patch -- some filename issues, maybe
my own inexperience with pointing `patch` to the destination path -- but after
the import, sound from HEAD now works perfectly. Thanks again! :)

[This with SMP and APIC_IO on, and the device happy on IRQ 3, for those keeping
track.]

Anyone know if FreeBSD had, or was likely to have had, the same problem with the
earlier code, or was it specific to the initial DragonFly port? Just asking for
educational purposes, in case I try to learn myself something by reading all the
diffs.
