Bug #1168

Strange taskqueue timeouts

Added by mneumann almost 6 years ago. Updated almost 3 years ago.

Status: Closed
Start date:
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -
Description

Hi,

I just rsynced my hard disk to a DragonFly box. After that, the box became more and more unusable (at first I could still
ping and ssh into it, then after a few seconds this didn't work anymore either). It's a pretty "old" DragonFly I am using,
so I should probably compile a more recent kernel and the problem will go away. But maybe it's something worse, so I'd better
report it...

Regards,

Michael

uname:

DragonFly 2.1.0-DEVELOPMENT DragonFly 2.1.0-DEVELOPMENT #1: Thu Aug 21 14:07:25 CEST 2008
:/usr/src/sys/compile/GENERIC i386

dmesg output:

swap_pager_getswapspace: failed
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=373340928
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - READ_DMA48 retrying (0 retries left) LBA=373340928
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - READ_DMA retrying (1 retry left) LBA=8996640
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: FAILURE - READ_DMA48 timed out LBA=373340928
ar0: FAILURE - SPAN array broken
ad4: WARNING - WRITE_DMA taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - READ_DMA retrying (0 retries left) LBA=8996640
ad4: WARNING - WRITE_DMA48 freeing taskqueue zombie request
Waiting (max 60 seconds) for system thread vnlru to stop...stopped
Waiting (max 60 seconds) for system thread bufdaemon to stop...stopped
Waiting (max 60 seconds) for system thread bufdaemon_hw to stop...stopped
Waiting (max 60 seconds) for system thread syncer to stop...stopped

syncing disks... 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
done
unmount(0xd24c7158,hammer): Forced unmount: 4 namecache references still present
unmount(0xd24c7158,hammer): Forced unmount: 9 process references still present

History

#1 Updated by jdc almost 6 years ago

This is a known problem with the FreeBSD ATA layer. No one's been
able to figure out what the true cause is, but sometimes these messages
can indicate disk errors if the LBA shown is consistent (in your case, it
looks like it may be).
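
One way to check whether the LBA is consistent is to count how often each LBA shows up in the TIMEOUT and FAILURE lines. The following is only a rough sketch, not an existing tool: the function name count_lbas and the regular expression are made up for illustration, and the pattern assumes the message layout seen in the dmesg output of this report.

```python
import re
from collections import Counter

# Assumed message layout, taken from the dmesg output in this report:
#   ad4: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=373340928
#   ad4: FAILURE - READ_DMA48 timed out LBA=373340928
LINE_RE = re.compile(
    r"^(?P<dev>\w+): (?P<kind>TIMEOUT|FAILURE) - (?P<cmd>\S+) .*LBA=(?P<lba>\d+)\s*$"
)


def count_lbas(dmesg_text):
    """Count TIMEOUT/FAILURE lines per (device, LBA) pair.

    Lines without an LBA (for example the taskqueue WARNING lines)
    are ignored.
    """
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group("dev"), int(match.group("lba")))] += 1
    return counts


# A shortened excerpt of the log from the description.
SAMPLE = """\
ad4: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=373340928
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - READ_DMA48 retrying (0 retries left) LBA=373340928
ad4: TIMEOUT - READ_DMA retrying (1 retry left) LBA=8996640
ad4: FAILURE - READ_DMA48 timed out LBA=373340928
ar0: FAILURE - SPAN array broken
ad4: TIMEOUT - READ_DMA retrying (0 retries left) LBA=8996640
"""

if __name__ == "__main__":
    for (dev, lba), n in count_lbas(SAMPLE).most_common():
        print(f"{dev} LBA={lba}: {n} event(s)")
```

If a handful of LBAs account for all the events, as they do in this excerpt, that points more towards the disk itself than towards the driver. A wide scatter of unrelated LBAs would make a bad sector less likely.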

Prior to my recent departure from the FreeBSD Project, I documented this
common problem. And no, there is no fix for it; it requires a large
amount of time and money (replacing hardware) to troubleshoot.

http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting

#2 Updated by sepherosa almost 6 years ago

Put kern.intr_mpsafe="0" in loader.conf if you don't plan to upgrade
immediately. Your source tree may have been checked out between the time I
turned on kern.intr_mpsafe and the time I turned off INTR_MPSAFE in
nata.

Best Regards,
sephe

#3 Updated by mneumann almost 6 years ago

Thanks! I will try a recent kernel instead.

Regards,

Michael

#4 Updated by mneumann almost 6 years ago

Strangely, after a reboot or reset the disk was completely missing.
There was no ad4 (or ar0) any more. It appeared again after powering down
the box completely.

Thanks! I'll take a close look if this kind of problem appears again.

Regards,

Michael

#5 Updated by dillon almost 6 years ago

:Too strange that after a reboot or reset, the disk was completely missing.
:There was no ad4 (or ar0) any more. It appeared again, after powering down
:the box completely.
:
:> Prior to my recent departure from the FreeBSD Project, I documented this
:> common problem. And no, there is no fix for it; it requires a large
:> amount of time and money (replacing hardware) to troubleshoot.
:
:Thanks! I'll take a close look if this kind of problems appear again.
:
:Regards,
:
: Michael

Check that all the fans are working and the disk hasn't overheated, too.
One thing that a big rdist might do is exercise the cpu and disk enough
to overheat the system if it was not otherwise well fanned.

-Matt
Matthew Dillon

#6 Updated by corecode over 5 years ago

Did this go away?

#7 Updated by ftigeot almost 3 years ago

  • Description updated (diff)
  • Status changed from New to Resolved
  • Assignee deleted (0)

Closing due to lack of recent feedback.

#8 Updated by ftigeot almost 3 years ago

  • Status changed from Resolved to Closed
