Bug #2653

Timer DELAY hangs boot on Lenovo S10 Intel Atom N270 with acpi enabled

Added by davshao 5 months ago. Updated 2 months ago.

Status: New
Start date: 03/05/2014
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -

Description

On an i386 Lenovo S10 netbook with Intel Atom N270 and acpi enabled, boot hangs after:

acpi0.nexus0.root0
acpi0: <LENOVO CB-01> [tentative] on motherboard
ACPI: All ACPI Tables successfully acquired
ACPI FADT: SCI testing interrupt mode ...
ACPI FADT: SCI testing level/high
IOAPIC: irq 9, gsi 9 edge/high -> level/high

Brute force debugging with kprintf shows that commenting out the
DELAY(100 * 1000);
in function acpi_sci_test() of file sys/platform/pc32/acpica/acpi_fadt.c

enables boot to at least progress to the end of function call
acpi_sci_config();
in function AcpiOsInstallInterruptHandler() in
file sys/dev/acpica/Osd/OsdInterrupt.c
(after which at some point booting hangs again).

I can only speculate this may have some relation to the thread
"Time keeping Issues with the low-resolution TSC timecounter"
on the FreeBSD current mailing list around June 2011. For example:

http://lists.freebsd.org/pipermail/freebsd-current/2011-June/025319.html

"Somewhere from an Intel manual, I think I read TSC stops when DPSLP#
pin is asserted for Core/Core2/Atom processors and I guess that means
entering C3 stops TSC. :-("

Attached is a dmesg from an acpi-disabled successful boot of the machine.

lenovo_s10_dmesg.txt (36.5 KB) davshao, 03/05/2014 09:36 AM

History

#1 Updated by swildner 5 months ago

Does it make a difference when you put hw.tsc_cputimer_enable=0 in /boot/loader.conf?

#2 Updated by sepherosa 5 months ago

On Fri, Mar 7, 2014 at 6:50 PM, <> wrote:
> Does it make a difference when you put hw.tsc_cputimer_enable=0 in /boot/loader.conf?

I don't think the TSC is used as the cputimer on the N270 here. It is still
the i8254, since the ACPI timer and HPET are not yet probed. The CPU is
probably choked by a high interrupt rate while the SCI mode is being tested.
As for the later hang, it may be caused by BIOS-triggered C1E, which could
choke the local APIC timer. Please try the following tunables (you probably
want all of them):

hw.lapic_timer_enable="0"
hw.acpi.sci.trigger="level"
hw.acpi.sci.polarity="low"

Best Regards,
sephe

--
Tomorrow Will Never Die

#3 Updated by davshao 5 months ago

The above suggestions did not work, but what did work was blasting away with kprintfs until I found enough DELAYs to replace with ... something. Fortunately, only two further DELAYs were blocking, and now, for the first time ever, the Lenovo S10 with the i386 Intel Atom N270 boots with acpi enabled. Thanks for the feedback.

The two further places to replace DELAYs are

1) line 519 in function acpi_attach() in file sys/dev/acpica/acpi.c

DELAY(5000);
cputimer_intr_pmfixup();

2) line 454 in function pcireg_cfgopen() in file sys/bus/pci/i386/pci_cfgreg.c

outl(CONF1_ADDR_PORT, CONF1_ENABLE_CHK);
DELAY(1);
mode1res = inl(CONF1_ADDR_PORT);

Now the question becomes: if DELAY() doesn't work (I suspect, for reasons related to Bug #2652, that the problem involves lwkt_switch()),
and something like tsleep() can't be used in acpi code, then what should be used for delays?

Right now my embarrassing answer, and why I have not included a complete patch, is I am simply incrementing an integer some number of times after which I kprintf it, hoping this prevents the compiler from optimizing the operations away.

I am sure there must be a far better way, can someone suggest it?
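
For illustration only, the counting workaround described above can be written without the trailing kprintf by declaring the counter volatile, which obliges the compiler to perform every increment. This is a minimal userland sketch, not code from the DragonFly tree: the function name spin_iterations() is hypothetical, and the iteration count is not calibrated to any unit of time, so the real delay depends on CPU speed.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the DragonFly source): spin for a
 * fixed number of iterations and return how many were performed.
 * The counter is volatile, so each read and write is an observable
 * access that the compiler is not allowed to optimize away.
 */
uint64_t
spin_iterations(uint64_t iterations)
{
	volatile uint64_t counter = 0;

	while (counter < iterations)
		counter = counter + 1;

	return counter;
}
```

A call such as spin_iterations(100000) burns time without printing anything. It remains a crude stopgap, because nothing ties the loop count to microseconds the way DELAY() is meant to.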

#4 Updated by sepherosa 5 months ago

On Sat, Mar 8, 2014 at 3:05 PM, <> wrote:
> I am sure there must be a far better way, can someone suggest it?

Thank you for the information, that's quite helpful.

It looks like the i8254 is not working on your box when ACPI is enabled. Could
you help test the following patch:
http://leaf.dragonflybsd.org/~sephe/tsc_delay.diff

With the patch, DELAY will switch to the TSC if the systimer hangs.
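
The patch itself is not reproduced in this thread, so the following is only a rough userland sketch of the general idea of falling back to a second counter when the first one stops advancing. Every name here (counter_fn, delay_with_fallback, stuck_counter, ticking_counter, stall_limit) is hypothetical, and the two counters are simulated stand-ins rather than real i8254 or TSC reads.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical abstraction: any free-running counter that can be polled. */
typedef uint64_t (*counter_fn)(void);

/*
 * Wait until the primary counter has advanced by primary_ticks. If the
 * primary counter returns the same value stall_limit times in a row, it
 * is assumed to be hung and the wait is redone on the fallback counter.
 * Returns true if the fallback counter had to be used.
 */
bool
delay_with_fallback(counter_fn primary, counter_fn fallback,
    uint64_t primary_ticks, uint64_t fallback_ticks, unsigned stall_limit)
{
	uint64_t start = primary();
	uint64_t last = start;
	unsigned stalled = 0;

	for (;;) {
		uint64_t now = primary();

		if (now - start >= primary_ticks)
			return false;	/* primary counter completed the wait */
		if (now == last) {
			if (++stalled >= stall_limit)
				break;	/* primary counter looks hung */
		} else {
			stalled = 0;
			last = now;
		}
	}

	/* The primary counter never moved: busy-wait on the fallback. */
	uint64_t fstart = fallback();
	while (fallback() - fstart < fallback_ticks)
		;
	return true;
}

/* Simulated counters for demonstration only. */
uint64_t
stuck_counter(void)
{
	return 42;		/* never advances, like a hung timer */
}

uint64_t
ticking_counter(void)
{
	static uint64_t ticks;

	return ++ticks;		/* advances by one on every read */
}
```

With these stand-ins, delay_with_fallback(stuck_counter, ticking_counter, 100, 50, 1000) gives up on the stuck counter after 1000 identical reads and finishes on the ticking one. A real implementation would also need to convert between the two counters' frequencies, which this sketch sidesteps by taking a separate tick count for each.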


#5 Updated by davshao 4 months ago

Unfortunately, the tsc_delay.diff patch does not solve the problem, and booting once again hangs after:

hpt27xx: RocketRAID 27xx controller driver v1.0 (Mar 13 2014 22:47:26)

Previous explorations with kprintf seemed to show possibly related problems occurred at lwkt_switch().

On a happier note, using the working patch with master through

commit 7dadaa2286bb268725a0b6255ad1832de28f1a61
Date: Thu Mar 13 19:32:20 2014 +0100

Gnargh, fix typo.

the Lenovo S10 i386 Intel Atom N270 for the first time ever now shuts down correctly instead of hanging after the filesystem is synched.

#6 Updated by sepherosa 4 months ago

On Fri, Mar 14, 2014 at 3:02 PM, <> wrote:
> Unfortunately, the tsc_delay.diff patch does not solve the problem, and booting once again hangs after:
>
> hpt27xx: RocketRAID 27xx controller driver v1.0 (Mar 13 2014 22:47:26)
>
> Previous explorations with kprintf seemed to show possibly related problems occurred at lwkt_switch().

Do you mean the lwkt_switch() in DELAY? I don't think the
lwkt_switch() in DELAY will be executed.

>
> On a happier note, using the working patch with master through

What's the working patch? Could you post it?

Thanks,
sephe


#7 Updated by swildner 2 months ago

Is this issue still happening to you? If not, can you close this ticket please?
