Bug #1402

ahci: attachment to atapi device

Added by ftigeot almost 5 years ago. Updated almost 5 years ago.

Status:ClosedStart date:
Priority:NormalDue date:
Assignee:-% Done:

0%

Category:-
Target version:-

Description

Hi,

I just tested the new ahci driver on a Core 2 machine with ICH7 chipset.
It attachs to a SATA DVD burner, which I believe should be left to the nata
driver.
Kernel is 2.3.1 GENERIC with SMP enabled (no apic).

Details of chipset (pciconf -lv):

atapci0@pci0:31:1: class=0x01018a card=0x58428086 chip=0x27df8086 rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
device = '82801G (ICH7 Family) Ultra ATA Storage Controller'
class = mass storage
subclass = ATA
atapci1@pci0:31:2: class=0x010601 card=0x58428086 chip=0x27c18086 rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
device = '82801GB I/O Controller Hub SATA cc=AHCI'
class = mass storage
subclass = SATA

Parts of dmesg:

ahci0: <AHCI-PCI-SATA> port 0x30a0-0x30af,0x30e0,0x30e3,0x30c0,0x30c7,0x30e4,0x30e7,0x30c8,0x30cf mem 0xd2204000-0xd22043ff irq 9 at device 31.2 on pci0
match generic sata
ahci0: ahci 1.1 capabilities 0xc320ff03 <S64A,NCQ,SAL,SCL0,PMD,SSC,PSC>, 4 ports, 32 tags/port,gen 1 (1.5Gbps) and 2 (3Gbps)
ahci0.0: sig 00000101
ahci0.0: START SOFTRESET 4c017<HPCP,CR,FR,FRE,POD,SUD,ST>
ahci0.1: sig eb140101
ahci0.1: START SOFTRESET 4c017<HPCP,CR,FR,FRE,POD,SUD,ST>
ahci0.0: sig 00000101
ahci0.1: sig eb140101
ahci0.0: END SOFTRESET
ahci0.1: END SOFTRESET
ahci0.0: Found DISK "WDC WD4000YS-01MPB1 09.02E09" serial=" WD-WCANU207346"
ahci0.0: tags=32/32 satacaps=0706 satafeat=0044 capacity=381554.08MB
ahci0.0: f85=7469 f86=3c41 f87=4163 WC=enabled RA=enabled SEC=frozen
ahci0.1: Found ATAPI "ASUS DRW-1814BLT 1.04 " serial=" "
ahci0.1: tags=0/32 satacaps=0602 satafeat=0028 capacity=0.00MB
ahci0.1: f85=0000 f86=0000 f87=0000 WC=notsupp RA=notsupp SEC=notsupp
ahci0.1: cmd 0: error
ahci0.3: sig 00000101
ahci0.3: START SOFTRESET 4c017<HPCP,CR,FR,FRE,POD,SUD,ST>
ahci0.3: sig 00000101
ahci0.3: END SOFTRESET
ahci0.2: Port appears to be unplugged
ahci0.2: No device detected
ahci0.3: Found DISK "WDC WD5002ABYS-01B1B0 02.03B02" serial=" WD-WCASY3848290"
ahci0.0: tags=32/32 satacaps=0506 satafeat=0044 capacity=476940.02MB
ahci0.0: f85=7469 f86=be41 f87=4163 WC=enabled RA=enabled SEC=frozen

[...]

da0 at ahci0 bus 0 target 0 lun 0
da0: <SATA WDC WD4000YS-01M 09.0> Fixed Direct Access SCSI-4 device
da0: 300 MB/s transfers
da0: 381554MB (781422768 512 byte sectors: 255H 63S/T 48641C)
da1 at ahci0 bus 3 target 0 lun 0
da1: 300 MB/s transfers
da1: 476940MB (976773168 512 byte sectors: 255H 63S/T 60801C)

[...]

Mounting root from ufs:da0s1a
cd0 at ahci0 bus 1 target 0 lun 0
cd0: <ASUS DRW-1814BLT 1.04> Removale CD-ROM SCSI-0 device
cd0: 150.000MB/s transfers
cd0: cd present [490912 x 2048 byte records]
ahci0.0: CMD TIMEOUT state=5 slot=22
cmd-reg 0x4d617<HPCP,CR,FR,FRE,POD,SUD,ST>
sactive=00400000 active=00000000 expired=00000000
sact=000000000 ci=00000000
ahci0.0: disk rw: timeout

This last part (timeout) repeats forever, with about a 10 second pause between
each printing.

History

#1 Updated by ftigeot almost 5 years ago

On Mon, Jun 15, 2009 at 09:31:09AM -0700, Matthew Dillon wrote:
>
> : I'm not sure what is going on there but get the latest git and try
> : again. It will be less verbose and shouldn't try to double-ident
> : ahci0.0. I'm hoping that is the cause of the forever-repeating
> : timeouts.

I just booted a fresh kernel. cd0 still attachs to ahci:

ahci0.0: Found DISK "WDC WD4000YS-01MPB1 09.02E09" serial=" WD-WCANU2073446"
ahci0.0: tags=32/32 satacap=0706 satafea=0044 NCQ=YES capacity=381554.08MB
ahci0.1: Found ATAPI "ASUS DRW-1814BLT 1.04 " serial=" "
ahci0.1: tags=0/32 satacap=602 satafea=0028 NCQ=NO capacity=0.00MB
ahci0.2: No device detected
ahci0.3: Found disk "WDC WD5002ABYS-01B1B0 02.03B02" serial=" WD-WCASY3848290"
ahci0.3: tags=32/32 satacap0506 satafea=0044 NCQ=YES capacity=476940.02MB"
[...]
mountroot> ufs:da0s1a
Mounting root from ufs:da0s1a
cd0 at ahci0 bus 1 target 0 lun 0
cd0: <ASUS DRW-1814BLT 1.04> Removable CD-ROM SCSI-0 device
cd0: 150.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
ahci0.0: CMD TIMEOUT state=5 slot=22
cmd-reg 0x4d617<HPcP,CR,FR,FRE,POD,SUD,ST>
sactive=004000000 active=00000000 expired=00000000
sact=000000000 ci=00000000
ahci0.0: disk_rw: timeout

[timeout loop]

I then tried again after unpluging the burner:

atapci0: <Intel ICH7 UDMA100 controller> port 0x30b0-0x30bf,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 irq 10 at device 31.1 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
match generic sata
match generic sata
ahci0: <AHCI-PCI-SATA> port 0x30a0-0x30af,0x30e0-0x30e3,0x30c0-0x30c7,0x30e4-0x30e7,0x30c8-0x30cf mem 0xd2204000-0xd22043ff irq 9 at device 31.2 on pci0
match generic sata
ahci0: AHCI 1.1 capabilities 0xc320ff03<S64A,NCQ,SAL,SCLO,PMD,SSC,PSC>, 4 ports, 32 tags/port, gen 1 (1.5Gbps) and 2 (3Gbps)
ahci0.1: No device detected
ahci0: Found DISK "WDC WD4000YS-01MPB1 09.02E09" serial=" WD-WCANU2073446"
ahci0.0: tags=32/32 satacap=0706 satafea=0044 NCQ=YES capacity=381554.08MB
ahci0.1: Found ATAPI "ASUS DRW-1814BLT 1.04 " serial=" "
ahci0.1: tags=0/32 satacap=602 satafea=0028 NCQ=NO capacity=0.00MB
ahci0.2: No device detected
ahci0.3: Found disk "WDC WD5002ABYS-01B1B0 02.03B02" serial=" WD-WCASY3848290"
ahci0.3: tags=32/32 satacap0506 satafea=0044 NCQ=YES capacity=476940.02MB"
[...]
da0 at ahci0 bus 0 target 0 lun 0
da0: <SATA WDC WD4000YS-01M 09.0> Fixed Direct Access SCSI-4 device
da0: 300.000MB/s transfers
da0: 381554MB (781422768 512 byte sectors: 255H 63S/T 48641C)
da1 at ahci0 bus 3 target 0 lun 0
da1: <SATA WDC WD5002ABYS-0 02.0> Fixed Direct Access SCSI-4 device
da1: 300.000MB/s transfers
da1: 476940MB (976773168 512 byte sectors: 255H 63S/T 60801C)
[...]
mountroot> ufs:da0s1a
Mounting root from ufs:da0s1a
cd0 at ahci0 bus 1 target 0 lun 0
cd0: <ASUS DRW-1814BLT 1.04> Removable CD-ROM SCSI-0 device
cd0: 150.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present - tray
closed
ahci0.0: CMD TIMEOUT state=5 slot=22
cmd-reg 0x4d617<HPcP,CR,FR,FRE,POD,SUD,ST>
sactive=004000000 active=00000000 expired=00000000
sact=000000000 ci=00000000
ahci0.0: disk_rw: timeout

The CMD TIMEOUT lines repeat a few times, the "slot=" part going from 22 to 26

It ends with these two lines:
da0: reading primary partition table: error accessing offset 000000000000 for 512
Root mount failed: 5

> I'm going to assume those last two tags=/f86= where actually from
> the other disk and you mis-typed them.

It's quite possible. As you guessed, the entire dmesg was hand-typed.

> If that's the case it might be an issue of the drive or the controller
> not being able to support SATA-300. It may help if you post what the
> NATA driver reports for those attachments (set hint.ahic.disabled from
> the boot loader).

Here you go:

atapci0: <Intel ICH7 UDMA100 controller> port 0x30b0-0x30bf,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 irq 10 at device 31.1 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
atapci1: <Intel AHCI controller> port 0x30a0-0x30af,0x30e0-0x30e3,0x30c0-0x30c7,0x30e4-0x30e7,0x30c8-0x30cf mem 0xd2204000-0xd22043ff irq 9 at device 31.2 on pci0
atapci1: AHCI Version 01.10 controller with 4 ports detected
ata2: <ATA channel 0> on atapci1
intr 9 at 40001/40000 hz, livelocked limit engaged!
ad4: 381554MB <WDC WD4000YS-01MPB1 09.02E09> at ata2-master SATA300
ata3: <ATA channel 1> on atapci1
ata4: <ATA channel 2> on atapci1
ata5: <ATA channel 3> on atapci1
ad10: 476940MB <WDC WD5002ABYS-01B1B0 02.03B02> at ata5-master SATA300

[...]

Mounting root from ufs:/dev/ad4s1a

#2 Updated by ftigeot almost 5 years ago

On Mon, Jun 15, 2009 at 03:25:01PM -0700, Matthew Dillon wrote:
> Is that WDC WD4000YS on ahci0.0 a 2.5" drive? Is it a portable HD
> with an external connector or is it a bare bones HD screwed into the
> machine?
>
> If its a portable HD does it have an E-sata connector? And does it
> have any other connectors? (Firewire, usb, etc).
>
> Is this a laptop?

No, this is a desktop machine.

All disks are 3.5" Western-Digital RAID-Edition models.
http://www.wdc.com/en/products/Products.asp?DriveID=489

They are similar in apparence to desktop drives, and have a regular SATA
connector.

#3 Updated by ftigeot almost 5 years ago

On Mon, Jun 15, 2009 at 03:07:36PM -0700, Matthew Dillon wrote:
>
> :ahci0.0: CMD TIMEOUT state=5 slot=22
> : cmd-reg 0x4d617<HPcP,CR,FR,FRE,POD,SUD,ST>
> : sactive=004000000 active=00000000 expired=00000000
> : sact=000000000 ci=00000000
> :ahci0.0: disk_rw: timeout
>
> Are you absolutely sure these TIMEOUT's are coming from ahci0.0 ?

Absolutely.

> :cd0 at ahci0 bus 1 target 0 lun 0
> :cd0: <ASUS DRW-1814BLT 1.04> Removable CD-ROM SCSI-0 device
> :cd0: 150.000MB/s transfers
> :cd0: Attempt to query device size failed: NOT READY, Medium not present - tray
> :closed
>
> Wait, you said you unplugged the burner. Why is still being probed?
> Do you have multiple CD/DVD drives? Make sure they are all unplugged.

My bad. I must have mixed parts of the two different dmesg logs.
In the end, they were identical : CMD TIMEOUT on ahci0.0

> Ok, I'd like you to sync the git repo again, I added some additional
> output to the TIMEOUT message.
>
> You don't have to type everything in again :-), but please tell me what
> the STS line in the CMD TIMEOUT message says (that I just added to the
> driver code).

Ok, here is a new dmesg (still at 3Gb/s).

ahci0.0: Found DISK "WDC WD4000YS ...
ahci0.0: tags=32/32 satacap= ...
ahci0.0: f85=7469 f86=3c41 f87=4163 WC=enabled RA=enabled SEC=frozen
[...]
ahci0.3: Found DISK "WDC WD5002ABYS ...
ahci0.3: tags=32/32 ...
ahci0.3: f85=7469 f86=be41 f87=4163 WC=enabled RA=enabled SEC=frozen
[...]
Mounting root from ufs:da0s1a
ahci0.0: CMD TIMEOUT state=5 slot=22
cmd-reg=0x4d617<HPCP,CR,FR,FRE,POD,SUD,ST>
sactive=00400000 active=00000000
[...]
STS=40
ahci0.0: disk_rw: timeout

I'll post a new dmesg with drives forced at 1.5Gb/s ASAP.

#4 Updated by ftigeot almost 5 years ago

On Tue, Jun 16, 2009 at 09:26:44AM +0200, Francois Tigeot wrote:
> On Mon, Jun 15, 2009 at 03:07:36PM -0700, Matthew Dillon wrote:
>
> > Ok, I'd like you to sync the git repo again, I added some additional
> > output to the TIMEOUT message.
> >
> > You don't have to type everything in again :-), but please tell me what
> > the STS line in the CMD TIMEOUT message says (that I just added to the
> > driver code).
>
> Ok, here is a new dmesg (still at 3Gb/s).
>
> ahci0.0: Found DISK "WDC WD4000YS ...
> ahci0.0: tags=32/32 satacap= ...
> ahci0.0: f85=7469 f86=3c41 f87=4163 WC=enabled RA=enabled SEC=frozen
> [...]
> ahci0.3: Found DISK "WDC WD5002ABYS ...
> ahci0.3: tags=32/32 ...
> ahci0.3: f85=7469 f86=be41 f87=4163 WC=enabled RA=enabled SEC=frozen
> [...]
> Mounting root from ufs:da0s1a
> ahci0.0: CMD TIMEOUT state=5 slot=22
> cmd-reg=0x4d617<HPCP,CR,FR,FRE,POD,SUD,ST>
> sactive=00400000 active=00000000
> [...]
> STS=40
> ahci0.0: disk_rw: timeout

With hint.ahci.force150, results are identical.

#5 Updated by ftigeot almost 5 years ago

On Mon, Jun 15, 2009 at 04:07:42PM -0700, Matthew Dillon wrote:
> I add yet another kenv for you to try.
>
> set hint.ahci.nofeatures in /boot/loader.conf or on the boot loader
> command line.

No change: the first disk still timeouts when mounting /

> I also enhanced the output to indicate which setup command was generating
> the CMD TIMEOUT. It should say just after the CMD TIMEOUT stuff.
>
> e.g.
>
> CMD TIMEOUT ...
> ....
> ...
> ...
> STS=...
> ahcix.x: Unable to set BLAHBLAH
>
> I'd like to know which setup command it was failing on, if it was
> failing on one, and whether setting hint.ahci.nofeatures fixes the
> problem.

There's no change either; the Unable to set ... message is not printed. I
only get this instead:

ahci0.0: disk_rw: timeout

#6 Updated by ftigeot almost 5 years ago

On Tue, Jun 16, 2009 at 12:07:50PM -0700, Matthew Dillon wrote:
> :> ahci0.0: CMD TIMEOUT state=5 slot=22
> :> cmd-reg=0x4d617<HPCP,CR,FR,FRE,POD,SUD,ST>
> :> sactive=00400000 active=00000000
> :> [...]
> :> STS=40
> :> ahci0.0: disk_rw: timeout
> :
> :With hint.ahci.force150, results are identical.
>
> Hmm. I'm a bit at a loss.
>
> Update again. I added some defensive bzero()ing of the FIS to the AHCI
> driver. Maybe there was some cruft in the read command it tried to
> construct.

I just tried again. No change, the disk still timeouts.

> Do any other disks work there (get past boot)? Looking on Google
> those 4000YS's had firmware issues. It's worth a shot. Maybe they
> are expecting some field to be set in the command that I'm not setting.

From what I found, the vendor forgot to enable TLER; this shouldn't be a cause
of trouble for normal operation.

I also tried to mount root from the second disk (WD5002ABYS); it fails with the
same error.

#7 Updated by ftigeot almost 5 years ago

On Wed, Jun 17, 2009 at 01:02:40AM -0700, Matthew Dillon wrote:
> I think I have an idea as to the problem. I think it may be an interrupt
> issue.
>
> Try turning on emergency interrupt polling in the loader. I think
> this is the format (maybe needs equal signs, not sure).
>
> kern.emergency_intr_enable=1
> kern.emergency_intr_freq=100

I tried to enable these sysctls and changed all disk-related BIOS options
I could find.

Still no go.

#8 Updated by elekktretterr almost 5 years ago

Hi Matt,

I get the same problem on my Fujitsu Lifebook laptop.

Petr

On Wed, 17 Jun 2009 05:07:50 Matthew Dillon wrote:
> :> ahci0.0: CMD TIMEOUT state=5 slot=22
> :> cmd-reg=0x4d617<HPCP,CR,FR,FRE,POD,SUD,ST>
> :> sactive=00400000 active=00000000
> :> [...]
> :> STS=40
> :> ahci0.0: disk_rw: timeout
> :
> :With hint.ahci.force150, results are identical.
> :
> :--
> :Francois Tigeot
>
> Hmm. I'm a bit at a loss.
>
> Update again. I added some defensive bzero()ing of the FIS to the AHCI
> driver. Maybe there was some cruft in the read command it tried to
> construct.
>
> Do any other disks work there (get past boot)? Looking on Google
> those 4000YS's had firmware issues. It's worth a shot. Maybe they
> are expecting some field to be set in the command that I'm not setting.
>
> -Matt
> Matthew Dillon
> <>

#9 Updated by elekktretterr almost 5 years ago

bottom of the page... i meant.

Petr

On Thu, 18 Jun 2009 09:52:11 Matthew Dillon wrote:
> :Hi Matt,
> :
> :I get the same problem on my Fujitsu Lifebook laptop.
> :
> :Petr
>
> Ok, back on bugs@ I guess. Try this:
>
> fetch http://apollo.backplane.com/DFlyMisc/ahci02.patch
>
> And see if any "IS ..." lines print out, and/or if it works
> with the patch.
>
> -Matt

#10 Updated by elekktretterr almost 5 years ago

How do I boot my laptop back?

I changed my fstab to da0s.... When I boot with option 6 (without AHCI), i set
mountroot to /dev/ad4s3a, which takes me to single usermode. But from there I
cannot edit the fstab because... Did I mention vi is impossible to use from
single user mode?

I mount the /usr and /tmp partitions to get access to vi, but when i try to
open anything, it doesnt show anything meaningful. (some gibberish at the
bottom of the text)

Petr

On Thu, 18 Jun 2009 09:52:11 Matthew Dillon wrote:
> :Hi Matt,
> :
> :I get the same problem on my Fujitsu Lifebook laptop.
> :
> :Petr
>
> Ok, back on bugs@ I guess. Try this:
>
> fetch http://apollo.backplane.com/DFlyMisc/ahci02.patch
>
> And see if any "IS ..." lines print out, and/or if it works
> with the patch.
>
> -Matt

#11 Updated by swildner almost 5 years ago

Matthew Dillon schrieb:
> :How do I boot my laptop back?
> :
> :I changed my fstab to da0s.... When I boot with option 6 (without AHCI), i set
> :mountroot to /dev/ad4s3a, which takes me to single usermode. But from there I
> :cannot edit the fstab because... Did I mention vi is impossible to use from
> :single user mode?
> :
> :I mount the /usr and /tmp partitions to get access to vi, but when i try to
> :open anything, it doesnt show anything meaningful. (some gibberish at the
> :bottom of the text)
> :
> :Petr
>
> You almost got it. The termcap is broken. So you also need to do,
> in addition to mounting the stuff:
>
> TERM=cons25
> export TERM
>
> Then vi will work.

We also have mined(1) which is very useful for such cases.

Regards,
Sascha

#12 Updated by elekktretterr almost 5 years ago

> You almost got it. The termcap is broken. So you also need to do,
> in addition to mounting the stuff:
>
> TERM=cons25
> export TERM
>
> Then vi will work.

Hi Matt,

I applied the patch. Removed the sys/compile/FUJITSU directory, ran config
FUJITSU, and then make clean && make depend && make && make install.

But the error is same

Petr

#13 Updated by ftigeot almost 5 years ago

On Fri, Jun 19, 2009 at 12:03:53AM -0700, Matthew Dillon wrote:
> Ok, ahci should work better with ICH7.

This is an understatement.
It works fine for me now, and seems to take the load much better than the
old nata driver.
I'm running 4 different find / and a cvs update of pkgsrc in parallel and
there's no hang or perceptible slow-down on disk accesses. New applications
seem to launch as fast as if nothing was accessing the disks.

I'm happy now :-)

#14 Updated by ftigeot almost 5 years ago

On Fri, Jun 19, 2009 at 11:22:13AM -0700, Matthew Dillon wrote:
>
> :I'm running 4 different find / and a cvs update of pkgsrc in parallel and
> :there's no hang or perceptible slow-down on disk accesses. New applications
> :seem to launch as fast as if nothing was accessing the disks.
>
> Did you have to use the emergency interrupt polling or ahci/non-ahci
> combination? Or does the interrupt work for real?

Nope, I did nothing special. The ahci driver works out of the box without any
emergency interrupt stuff.

#15 Updated by elekktretterr almost 5 years ago

> Excellent!
>
> Did you have to use the emergency interrupt polling or ahci/non-ahci
> combination? Or does the interrupt work for real?
>
> I also fixed some console spew this morning (that Sephe reported),
> just in case your attachment is NCQ-capable.
>

Hi Matt,

It also works here now, and first impression it seems pretty speedy (boot
seems to take less long)

Cheers,
Petr

#16 Updated by ftigeot almost 5 years ago

On Mon, Jun 15, 2009 at 09:47:54AM +0200, Francois Tigeot wrote:
>
> I just tested the new ahci driver on a Core 2 machine with ICH7 chipset.
> It attachs to a SATA DVD burner, which I believe should be left to the nata
> driver.
> Kernel is 2.3.1 GENERIC with SMP enabled (no apic).

I just plugged the CD burner back.
It still attaches to ahci, but now works fine.

dmesg extract:

Mounting root from ufs:/dev/da0s1a
ahci0.1: TFES slot 11 ci_saved = 00000800
ahci0.1: Copying error rfis slot 11
ahci0.1: WARNING! Unload prdbc resid was zero! tag=11
cd0 at ahci0 bus 1 target 0 lun 0
cd0: <ASUS DRW-1814BLT 1.04> Removable CD-ROM SCSI-0 device
cd0: 150.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed

It is much faster than with the nata driver. I measured about a 4x speedup
when burner regular CD images.

#17 Updated by dillon almost 5 years ago

:Hi,
:
:I just tested the new ahci driver on a Core 2 machine with ICH7 chipset.
:It attachs to a SATA DVD burner, which I believe should be left to the nata
:driver.
:Kernel is 2.3.1 GENERIC with SMP enabled (no apic).

That part is correct. The SATA DVD burner is on a SATA port. You
have to use cdrecord (instead of burncd) to burn via cd0.

:ahci0.0: sig 00000101
:ahci0.0: START SOFTRESET 4c017<HPCP,CR,FR,FRE,POD,SUD,ST>
:ahci0.0: sig 00000101
:ahci0.0: END SOFTRESET
:ahci0.0: Found DISK "WDC WD4000YS-01MPB1 09.02E09" serial=" WD-WCANU207346"
:ahci0.0: tags=32/32 satacaps=0706 satafeat=0044 capacity=381554.08MB
:ahci0.0: f85=7469 f86=3c41 f87=4163 WC=enabled RA=enabled SEC=frozen
:...
:ahci0.0: tags=32/32 satacaps=0506 satafeat=0044 capacity=476940.02MB
:ahci0.0: f85=7469 f86=be41 f87=4163 WC=enabled RA=enabled SEC=frozen
:
:ahci0.0: CMD TIMEOUT state=5 slot=22
: cmd-reg 0x4d617<HPCP,CR,FR,FRE,POD,SUD,ST>
: sactive=00400000 active=00000000 expired=00000000
: sact=000000000 ci=00000000
:ahci0.0: disk rw: timeout
:
:This last part (timeout) repeats forever, with about a 10 second pause between
:each printing.
:
:--
:Francois Tigeot

I'm not sure what is going on there but get the latest git and try
again. It will be less verbose and shouldn't try to double-ident
ahci0.0. I'm hoping that is the cause of the forever-repeating
timeouts.

-Matt
Matthew Dillon
<>

#18 Updated by dillon almost 5 years ago

::ahci0.0: tags=32/32 satacaps=0706 satafeat=0044 capacity=381554.08MB
::ahci0.0: f85=7469 f86=3c41 f87=4163 WC=enabled RA=enabled SEC=frozen
::...
::ahci0.0: tags=32/32 satacaps=0506 satafeat=0044 capacity=476940.02MB
::ahci0.0: f85=7469 f86=be41 f87=4163 WC=enabled RA=enabled SEC=frozen
::
::This last part (timeout) repeats forever, with about a 10 second pause between
::each printing.
::
::--
::Francois Tigeot
:
: I'm not sure what is going on there but get the latest git and try
: again. It will be less verbose and shouldn't try to double-ident
: ahci0.0. I'm hoping that is the cause of the forever-repeating
: timeouts.

I'm going to assume those last two tags=/f86= where actually from
the other disk and you mis-typed them.

If that's the case it might be an issue of the drive or the controller
not being able to support SATA-300. It may help if you post what the
NATA driver reports for those attachments (set hint.ahic.disabled from
the boot loader).

I am also going to add a new kenv called hint.ahci.force150 which will
force it to negotiate at a slower speed. That's being committed as
we speak.

-Matt
Matthew Dillon
<>

#19 Updated by dillon almost 5 years ago

:ahci0.0: CMD TIMEOUT state=5 slot=22
: cmd-reg 0x4d617<HPcP,CR,FR,FRE,POD,SUD,ST>
: sactive=004000000 active=00000000 expired=00000000
: sact=000000000 ci=00000000
:ahci0.0: disk_rw: timeout

Are you absolutely sure these TIMEOUT's are coming from ahci0.0 ?

:I then tried again after unpluging the burner:
:...
:
:[...]
:mountroot> ufs:da0s1a
:Mounting root from ufs:da0s1a
:cd0 at ahci0 bus 1 target 0 lun 0
:cd0: <ASUS DRW-1814BLT 1.04> Removable CD-ROM SCSI-0 device
:cd0: 150.000MB/s transfers
:cd0: Attempt to query device size failed: NOT READY, Medium not present - tray
:closed

Wait, you said you unplugged the burner. Why is still being probed?
Do you have multiple CD/DVD drives? Make sure they are all unplugged.

:It ends with these two lines:
:da0: reading primary partition table: error accessing offset 000000000000 for 512
:Root mount failed: 5

:Here you go:
:ad4: 381554MB <WDC WD4000YS-01MPB1 09.02E09> at ata2-master SATA300
:ad10: 476940MB <WDC WD5002ABYS-01B1B0 02.03B02> at ata5-master SATA300
:--
:Francois Tigeot

NATA is probing them at 3 GBits, so that probably isn't the problem.
Stil, I added a feature to the driver to force 150, add this to your
loader.conf or set from the boot loader:

set hint.ahci.force150

Ok, I'd like you to sync the git repo again, I added some additional
output to the TIMEOUT message.

You don't have to type everything in again :-), but please tell me what
the STS line in the CMD TIMEOUT message says (that I just added to the
driver code).

-Matt
Matthew Dillon
<>

#20 Updated by dillon almost 5 years ago

:> ahci0.0: CMD TIMEOUT state=5 slot=22
:> cmd-reg=0x4d617<HPCP,CR,FR,FRE,POD,SUD,ST>
:> sactive=00400000 active=00000000
:> [...]
:> STS=40
:> ahci0.0: disk_rw: timeout
:
:With hint.ahci.force150, results are identical.
:
:--
:Francois Tigeot

Hmm. I'm a bit at a loss.

Update again. I added some defensive bzero()ing of the FIS to the AHCI
driver. Maybe there was some cruft in the read command it tried to
construct.

Do any other disks work there (get past boot)? Looking on Google
those 4000YS's had firmware issues. It's worth a shot. Maybe they
are expecting some field to be set in the command that I'm not setting.

-Matt
Matthew Dillon
<>

#21 Updated by dillon almost 5 years ago

Check your BIOS settings. See if there is an AHCI selection.
Maybe there's an issue somewhere. Also check HDD options, the
BIOS might be doing something funny with the disk that I'm not
catching. Also make sure RAID options are turned off.

-Matt

#22 Updated by dillon almost 5 years ago

:Hi Matt,
:
:I get the same problem on my Fujitsu Lifebook laptop.
:
:Petr

Ok, back on bugs@ I guess. Try this:

fetch http://apollo.backplane.com/DFlyMisc/ahci02.patch

And see if any "IS ..." lines print out, and/or if it works
with the patch.

-Matt

#23 Updated by dillon almost 5 years ago

:How do I boot my laptop back?
:
:I changed my fstab to da0s.... When I boot with option 6 (without AHCI), i set
:mountroot to /dev/ad4s3a, which takes me to single usermode. But from there I
:cannot edit the fstab because... Did I mention vi is impossible to use from
:single user mode?
:
:I mount the /usr and /tmp partitions to get access to vi, but when i try to
:open anything, it doesnt show anything meaningful. (some gibberish at the
:bottom of the text)
:
:Petr

You almost got it. The termcap is broken. So you also need to do,
in addition to mounting the stuff:

TERM=cons25
export TERM

Then vi will work.

-Matt

#24 Updated by dillon almost 5 years ago

Ok guys, also try this patch and see if it prints out
that BIOS disabled line on the console. In addition to the
other patch.

And after that, maybe try uncommenting the pci_write_config
for 0x92... but not after testing the patch as-is.

-Matt
Matthew Dillon
<>

diff --git a/sys/dev/disk/ahci/ahci_attach.c b/sys/dev/disk/ahci/ahci_attach.c
index 68f4059..67551aa 100644
--- a/sys/dev/disk/ahci/ahci_attach.c
+++ b/sys/dev/disk/ahci/ahci_attach.c
@@ -159,6 +159,15 @@ ahci_pci_attach(device_t dev)
int error;
const char *revision;

+/*
+ pci_write_config(dev, 0x92, pci_read_config(dev, 0x92, 2) | 0x0F, 2);
+*/
+ if (pci_read_config(dev, PCIR_COMMAND, 2) & 0x0400) {
+ device_printf(dev, "BIOS disabled PCI interrupt, re-enabling\n");
+ pci_write_config(dev, PCIR_COMMAND,
+ pci_read_config(dev, PCIR_COMMAND, 2) & ~0x0400, 2);
+ }
+
/*
* Map the AHCI controller's IRQ and BAR(5) (hardware registers)
*/

#25 Updated by dillon almost 5 years ago

Everything playing with the AHCI driver needs to check their BIOSes.
If there is an AHCI setting for SATA, then it has to be set. Otherwise
the AHCI controller might probe but might still be in compatibility
mode.

-Matt

#26 Updated by dillon almost 5 years ago

I've got some of it figured out. I loaded up Virtual Box on my windows
box.

I figured out part of the interrupt issue. VB and the Intel ICH
chipsets do not have Async Notification. Because of that I wasn't
setting an interrupt bit called SDBE which is apparently used by the
Intel chipset to notify when a NCQ command completes.

That explains why it got past the probe and then timed-out trying
to read from the disk.

I'm working on some other issues related to VBOX, then I'll commit the
whole mess and people can try the driver again. I hope to commit
this evening.

-Matt
Matthew Dillon
<>

#27 Updated by dillon almost 5 years ago

Ok, ahci should work better with ICH7. However, for the case
where the interrupt is not routed properly there isn't much I
can do in the driver itself. Emergency interrupt polling should
work now, though.

-Matt
Matthew Dillon
<>

#28 Updated by dillon almost 5 years ago

:This is an understatement.
:It works fine for me now, and seems to take the load much better than the
:old nata driver.
:I'm running 4 different find / and a cvs update of pkgsrc in parallel and
:there's no hang or perceptible slow-down on disk accesses. New applications
:seem to launch as fast as if nothing was accessing the disks.
:
:I'm happy now :-)
:
:--
:Francois Tigeot

Excellent!

Did you have to use the emergency interrupt polling or ahci/non-ahci
combination? Or does the interrupt work for real?

I also fixed some console spew this morning (that Sephe reported),
just in case your attachment is NCQ-capable.

-Matt
Matthew Dillon
<>

#29 Updated by dillon almost 5 years ago

:Hi Matt,
:
:It also works here now, and first impression it seems pretty speedy (boot
:seems to take less long)
:
:Cheers,
:Petr

Yah, it probes all the ports in parallel. Makes a big difference on
modern motherboards which have tons of sata ports. I could make it
even faster, there are some delays at the end that could be removed,
but I'm not going to worry about it atm. The probe code is fairly
sensitive due to device variations.

-Matt
Matthew Dillon
<>

#30 Updated by dillon almost 5 years ago

Is that WDC WD4000YS on ahci0.0 a 2.5" drive? Is it a portable HD
with an external connector or is it a bare bones HD screwed into the
machine?

If its a portable HD does it have an E-sata connector? And does it
have any other connectors? (Firewire, usb, etc).

Is this a laptop?

-Matt

#31 Updated by dillon almost 5 years ago

I add yet another kenv for you to try.

set hint.ahci.nofeatures in /boot/loader.conf or on the boot loader
command line.

I also enhanced the output to indicate which setup command was generating
the CMD TIMEOUT. It should say just after the CMD TIMEOUT stuff.

e.g.

CMD TIMEOUT ...
....
...
...
STS=...
ahcix.x: Unable to set BLAHBLAH

I'd like to know which setup command it was failing on, if it was
failing on one, and whether setting hint.ahci.nofeatures fixes the
problem.

-Matt

#32 Updated by dillon almost 5 years ago

I think I have an idea as to the problem. I think it may be an interrupt
issue.

Try turning on emergency interrupt polling in the loader. I think
this is the format (maybe needs equal signs, not sure).

kern.emergency_intr_enable=1
kern.emergency_intr_freq=100

-Matt

#33 Updated by dillon almost 5 years ago

We'll go private email, I have a few more things up my sleave.

-Matt

#34 Updated by dillon almost 5 years ago

:On Mon, Jun 15, 2009 at 09:47:54AM +0200, Francois Tigeot wrote:
:>
:> I just tested the new ahci driver on a Core 2 machine with ICH7 chipset.
:> It attachs to a SATA DVD burner, which I believe should be left to the nata
:> driver.
:> Kernel is 2.3.1 GENERIC with SMP enabled (no apic).
:
:I just plugged the CD burner back.
:It still attaches to ahci, but now works fine.
:
:dmesg extract:
:
:Mounting root from ufs:/dev/da0s1a
:ahci0.1: TFES slot 11 ci_saved = 00000800
:ahci0.1: Copying error rfis slot 11
:ahci0.1: WARNING! Unload prdbc resid was zero! tag=11
:cd0 at ahci0 bus 1 target 0 lun 0
:cd0: <ASUS DRW-1814BLT 1.04> Removable CD-ROM SCSI-0 device
:cd0: 150.000MB/s transfers
:cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
:
:It is much faster than with the nata driver. I measured about a 4x speedup
:when burner regular CD images.
:
:--
:Francois Tigeot

Its good that it works. I think it may only be faster because the
cdrecord software is probing the burn speed more effectively. burncd
never could do that very well.

-Matt
Matthew Dillon
<>

Also available in: Atom PDF