Bug #2249

closed

deadlock under high i/o load (e.g. hammer reblock)

Added by rumcic about 13 years ago. Updated almost 13 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:

Description

Running an almost-latest master (the last commit from upstream should be
0e84c0db0d0bcd6d681bc30fd16e96c26bb92db2).

Whenever the nightly cron runs, a few minutes after hammer cleanup starts doing
its job the machine becomes mostly unresponsive (most processes and most ssh
connections stop responding, but strangely pf still continues doing its job).

Fortunately debugging through serial console still worked, so the dump is
available at leaf:~rumko/crash/deadlock/*.0 .
--
Please do not CC me, since I already receive everything from these MLs.

Regards,
Rumko

Actions #1

Updated by tuxillo about 13 years ago

Hi Rumko,

I see your kern.0 is 7.5K, bad upload or bad dump?

-rw-r--r-- 1 rumko wheel 231K Nov 28 13:51 core.txt.0
-rw------- 1 rumko wheel 465B Nov 28 13:50 info.0
-rw-r--r-- 1 rumko wheel 7.5K Nov 28 13:50 kern.0
-rw------- 1 rumko wheel 407M Nov 28 13:51 vmcore.0

Thanks,
Antonio Huete

Actions #2

Updated by rumcic about 13 years ago

Antonio M. Huete Jimenez via Redmine wrote:

> Issue #2249 has been updated by Antonio M. Huete Jimenez.
>
> Hi Rumko,
>
> I see your kern.0 is 7.5K, bad upload or bad dump?

Bad dump, I'd guess. I copied the kernel over from /boot and it seems to be OK now.

> -rw-r--r-- 1 rumko wheel 231K Nov 28 13:51 core.txt.0
> -rw------- 1 rumko wheel 465B Nov 28 13:50 info.0
> -rw-r--r-- 1 rumko wheel 7.5K Nov 28 13:50 kern.0
> -rw------- 1 rumko wheel 407M Nov 28 13:51 vmcore.0
>
> Thanks,
> Antonio Huete

<snip>
--
Please do not CC me, since I already receive everything from these MLs.

Regards,
Rumko

Actions #3

Updated by tuxillo about 13 years ago

Rumko,

It is indeed a deadlock. I'm also seeing it during 'hammer cleanup', but on x86_64. Explanation from Matt:

21:21 <@dillon> the pageout daemon deadlock is because the hammer backend locks hammer inodes and the pageout daemon frontend can only detect locked vnodes
21:21 <@dillon> so the pageout daemon will happily lock a vnode and then issue the pageout request to the hammer backend and cause the hammer backend to get stuck on the inode (waiting for new memory)

It doesn't seem to have an easy solution.

Cheers,
Antonio Huete
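
The deadlock Matt describes in the comment above can be pictured as a cycle in a wait-for graph. What follows is only a toy sketch in Python, not kernel code: the names `find_wait_cycle`, `model_pageout`, `pageout_daemon` and `hammer_backend` are made up for illustration, and the `frontend_sees_inode_lock` switch is a hypothetical what-if rather than a description of any actual fix. It assumes the backend already holds the inode lock and is waiting for memory that only the pageout daemon can free.

```python
def find_wait_cycle(waits_for):
    """Return the nodes of a cycle in a wait-for graph, or None.

    waits_for maps each blocked party to the single party it waits on.
    """
    for start in waits_for:
        path = [start]
        node = waits_for.get(start)
        # Follow the chain until it ends or revisits a node on the path.
        while node is not None and node not in path:
            path.append(node)
            node = waits_for.get(node)
        if node == start:
            return path
    return None


def model_pageout(frontend_sees_inode_lock):
    """Toy model of one pageout attempt against a HAMMER-backed page."""
    # Assumed starting state: the backend holds the inode lock and is
    # blocked waiting for memory, which only the pageout daemon frees.
    inode_locked_by = "hammer_backend"
    vnode_locked_by = None
    waits_for = {"hammer_backend": "pageout_daemon"}

    # The frontend only skips the page if it can see a lock in its way.
    vnode_busy = vnode_locked_by is not None
    inode_busy = frontend_sees_inode_lock and inode_locked_by is not None
    if not vnode_busy and not inode_busy:
        vnode_locked_by = "pageout_daemon"
        # The pageout request now needs the inode lock held by the backend.
        waits_for["pageout_daemon"] = inode_locked_by

    return find_wait_cycle(waits_for)


if __name__ == "__main__":
    # Frontend checks vnode locks only: the two parties wait on each other.
    print(model_pageout(frontend_sees_inode_lock=False))
    # Hypothetical frontend that also sees inode locks: no cycle forms.
    print(model_pageout(frontend_sees_inode_lock=True))
```

In this model the first call reports a cycle between the two parties, while the second reports none because the page is simply skipped. The real kernel involves many more locks and threads; the sketch only shows why a lock that the frontend cannot see is enough to close the loop.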

Actions #4

Updated by tuxillo almost 13 years ago

  • Status changed from New to In Progress

Rumko, Jan,

Matt pushed a fix for this; I'm about to try it myself on my main box.

Commit is 55b50bd522537a7b4e0810aa4cab05ad355d1381

Cheers,
Antonio Huete

Actions #5

Updated by rumcic almost 13 years ago

  • Status changed from In Progress to Resolved

The fix seems to do its job; I cannot get it to deadlock again.

Actions #6

Updated by alexh almost 13 years ago

  • Status changed from Resolved to Closed