Bug #752

system freeze on objcache_get

Added by corecode over 6 years ago. Updated over 6 years ago.

Status: Closed
Start date:
Priority: Urgent
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -

Description

[resend to get context in the bug tracker]

hey,

on the build machine I just had a de-facto system freeze where all processes would get stuck in objcache_get (^T told me). I couldn't get more information, unfortunately :/ But I could run reboot to get the machine going again, thankfully.

The box was running two and a half days, basically building non-stop. When I tried to set up another machine I tried duplicating the filesystems with cpdup over ssh and this was REAL slow, until I suspended all build processes. I don't know, maybe that has something to do with the objcache stuff? (would hint me towards mbuf or vfs). Sorry that I don't have more information.

Maybe we should find a way to make objcache_get interruptible (for instance if M_NULLOK is passed?)

cheers
simon

execve-leak.diff (2.04 KB) corecode, 07/30/2007 02:15 PM

History

#1 Updated by corecode over 6 years ago

More details on that: It just happened again.

This time I managed to get into kgdb. Details follow:

(kgdb) bt
#0 lwkt_switch () at thread2.h:177
#1 0xc02fc071 in tsleep (ident=0xc66ce15c, flags=0, wmesg=0xc0583792 "objcache_get", timo=0)
at /usr/src/src/sys/kern/kern_synch.c:473
#2 0xc02fc362 in msleep (ident=0x0, spin=0xc66ce168, flags=0, wmesg=0x0, timo=0)
at /usr/src/src/sys/kern/kern_synch.c:602
#3 0xc02e5ad9 in objcache_get (oc=0xc66ce138, ocflags=2)
at /usr/src/src/sys/kern/kern_objcache.c:432
#4 0xc02de60b in exec_copyin_args (args=0xed6edc50,
fname=0x2810a000 <Error reading address 0x2810a000: Bad address>, segflg=PATH_USERSPACE,
argv=0x28101af0, envv=0x28105700) at /usr/src/src/sys/kern/kern_exec.c:734
#5 0xc02de0b5 in sys_execve (uap=0xed6edcf8) at /usr/src/src/sys/kern/kern_exec.c:525
#6 0xc05265c9 in syscall2 (frame=0xed6edd40) at /usr/src/src/sys/platform/pc32/i386/trap.c:1340
#7 0xc050e025 in Xint0x80_syscall () at /usr/src/src/sys/platform/pc32/i386/exception.s:872

(kgdb) fra 4
#4 0xc02de60b in exec_copyin_args (args=0xed6edc50,
fname=0x2810a000 <Error reading address 0x2810a000: Bad address>, segflg=PATH_USERSPACE,
argv=0x28101af0, envv=0x28105700) at /usr/src/src/sys/kern/kern_exec.c:734
734 args->buf = objcache_get(exec_objcache, M_WAITOK);

(kgdb) p *exec_objcache
$2 = {name = 0xc64e07a0 "exec-args", ctor = 0xc02e542c <null_ctor>, dtor = 0xc02e5427 <null_dtor>,
privdata = 0x0, alloc = 0xc02e5b4b <objcache_malloc_alloc>,
free = 0xc02e5b6f <objcache_malloc_free>, allocator_args = 0xc65a1180, oc_next = {
sle_next = 0xc66ce088}, exhausted = 1, depot = {{fullmagazines = {slh_first = 0x0},
emptymagazines = {slh_first = 0x0}, magcapacity = 2, spin = {lock = 0},
unallocated_objects = 0, waiting = 35, contested = 0}}, cache_percpu = 0xc66ce178}

#######################
First issue: Why is magcapacity == 2? Why are there no empty magazines?
#######################

(kgdb) p exec_objcache->cache_percpu[0]
$3 = {loaded_magazine = 0xc65008b8, previous_magazine = 0xc65008d0, gets_cumulative = 3092800,
gets_null = 0, puts_cumulative = 3092782, puts_othercluster = 0, waiting = 0}
(kgdb) p *exec_objcache->cache_percpu[0]->loaded_magazine
$5 = {rounds = 1, capacity = 2, cleaning = 0, nextmagazine = {sle_next = 0x0},
objects = 0xc65008c8}
(kgdb) p *exec_objcache->cache_percpu[0]->previous_magazine
$6 = {rounds = 0, capacity = 2, cleaning = 0, nextmagazine = {sle_next = 0x0},
objects = 0xc65008e0}

#######################
Seems the magazines are indeed just 2 rounds long. However, the loaded magazine has a round in, so why didn't the process get woken up?

I'll look into the exec path, there might be a leak as well.

cheers
simon

#2 Updated by corecode over 6 years ago

Solution: all processes block on CPU3, which does not have any rounds.

In total, there are just 5 rounds in all magazines: one each in loaded_magazine on CPUs 0,1,2 and two in previous_magazine on CPU2.
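To illustrate why a process on CPU3 sleeps while other CPUs still hold rounds, here is a minimal userspace model of a per-CPU magazine lookup. It is a sketch, not the code in kern_objcache.c: the field names are taken from the kgdb dump above, but the struct layouts, the `model_get` function, the `get_result` enum and the depot counter `full_magazines` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model; names follow the kgdb dump, layouts are assumed. */
struct magazine {
    int rounds;     /* objects currently cached in this magazine */
    int capacity;   /* magcapacity, 2 in the dump */
};

struct percpu_cache {
    struct magazine *loaded_magazine;
    struct magazine *previous_magazine;
};

struct depot {
    int full_magazines;       /* hypothetical count; the real depot keeps a list */
    int unallocated_objects;  /* objects that may still be allocated fresh */
};

enum get_result { GET_FROM_CACHE, GET_FROM_DEPOT, GET_ALLOCATED, GET_WOULD_BLOCK };

enum get_result model_get(struct percpu_cache *cpu, struct depot *dp)
{
    assert(cpu != NULL && dp != NULL);

    /* 1. Fast path: take a round from this CPU's loaded magazine. */
    if (cpu->loaded_magazine->rounds > 0) {
        cpu->loaded_magazine->rounds--;
        return GET_FROM_CACHE;
    }
    /* 2. Swap in the previous magazine if it still has rounds. */
    if (cpu->previous_magazine->rounds > 0) {
        struct magazine *tmp = cpu->loaded_magazine;
        cpu->loaded_magazine = cpu->previous_magazine;
        cpu->previous_magazine = tmp;
        cpu->loaded_magazine->rounds--;
        return GET_FROM_CACHE;
    }
    /* 3. Model trading the empty loaded magazine for a full one
     *    from the depot, then taking one round from it. */
    if (dp->full_magazines > 0) {
        dp->full_magazines--;
        cpu->loaded_magazine->rounds = cpu->loaded_magazine->capacity - 1;
        return GET_FROM_DEPOT;
    }
    /* 4. Allocate a fresh object while the limit allows it. */
    if (dp->unallocated_objects > 0) {
        dp->unallocated_objects--;
        return GET_ALLOCATED;
    }
    /* 5. Nothing reachable from this CPU: an M_WAITOK caller sleeps here,
     *    even if other CPUs still have rounds in their own magazines. */
    return GET_WOULD_BLOCK;
}
```

In this model a get never looks into another CPU's magazines, so with an empty depot and `unallocated_objects == 0` a caller on a CPU with no rounds blocks regardless of what the other CPUs cache.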

So my question now is: where did all the allocated objects go?

objcache(exec-args): too small for ncpus, adjusting cluster_limit 16->64

So there should be 59 more objects available for allocation! I checked all processes. None is blocked somewhere in kern_exec, so somehow objects must be leaking!

cheers
simon

#3 Updated by kmb810 over 6 years ago

just to add to debugging info:
when system boots up:
top -S outputs:

Inactive Mem: 8M, Free Mem: 1810M

I just did a cvs update -A -d on a checked out copy of src. cvs tree
is already on the local machine.

top -S outputs:

Inactive Mem: 465M, Free Mem: 1312M

is there a memory leak? or is it stored in objcache?

cheers
kmb

#4 Updated by corecode over 6 years ago

please do not hijack my thread, your data seems unrelated.

it probably also tells you about wired, buffer and active, right?

i think you're not reading the data correctly.

cheers
simon

#5 Updated by corecode over 6 years ago

Answer: they were never freed:

There were exactly 59 fewer frees than there were allocs (checked by looking at gets_cumulative and puts_cumulative).
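The bookkeeping behind that check can be sketched as follows. The struct and both function names are hypothetical; only the counter names and the numbers (cluster_limit 64, 5 cached rounds, CPU0's counters) come from the dumps in this thread. The counters of the other CPUs are not shown here, so the per-CPU values have to be read from kgdb on the affected machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the two per-CPU counters used in the check. */
struct percpu_counters {
    unsigned long gets_cumulative;
    unsigned long puts_cumulative;
};

/*
 * Objects handed out and not yet returned, summed over all CPUs.
 * A single CPU's difference may be negative when an object was taken
 * on one CPU and returned on another, so sum as signed values.
 * gets_null is ignored here; it is 0 in the dump.
 */
long outstanding_objects(const struct percpu_counters *c, size_t ncpus)
{
    long total = 0;

    assert(c != NULL || ncpus == 0);
    for (size_t i = 0; i < ncpus; i++)
        total += (long)c[i].gets_cumulative - (long)c[i].puts_cumulative;
    return total;
}

/* Objects that are neither cached in a magazine nor available to allocate. */
long unaccounted_objects(long cluster_limit, long rounds_in_magazines,
                         long unallocated_objects)
{
    return cluster_limit - rounds_in_magazines - unallocated_objects;
}
```

With the values from the thread, `unaccounted_objects(64, 5, 0)` is 59, and CPU0 alone contributes 3092800 - 3092782 = 18 to the sum of outstanding objects.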

So, where'd it go?

if (error == 0) {
    error = exec_copyin_args(&args, uap->fname, PATH_USERSPACE,
                             uap->argv, uap->envv);
}
if (error == 0)
    error = kern_execve(&nd, &args);
nlookup_done(&nd);
exec_free_args(&args);

must be in kern_execve, right? but how so?

in kern_execve...

exec_fail:
    /*
     * we're done here, clear P_INEXEC if we were the ones that
     * set it. Otherwise if vmspace_destroyed is still set we
     * raced another thread and that thread is responsible for
     * clearing it.
     */
    if (imgp->vmspace_destroyed & 2)
        p->p_flag &= ~P_INEXEC;
    if (imgp->vmspace_destroyed) {
        /* sorry, no more process anymore. exit gracefully */
        exit1(W_EXITCODE(0, SIGABRT)); <<<<<<<< BIATCH
        /* NOT REACHED */
        return(0);
    } else {
        return(error);
    }
}

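there we go: exit1() never returns, so the exec_free_args() in sys_execve is never reached and the args buffer is never put back. turns out cperciva fixed that in freebsd in 2005, rev. 1.277. fix will arrive now in dragonfly as well.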
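The failure pattern can be modelled in userspace as below. This is not the attached execve-leak.diff (its contents are not reproduced in this thread) and not the FreeBSD change; it only assumes that one plausible fix is to release the buffer before the call that never returns. All `model_*` names and the `buffers_outstanding` counter are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the number of objects currently held out of exec_objcache. */
int buffers_outstanding;

struct model_args {
    bool has_buf;   /* true while this exec holds an args buffer */
};

void model_copyin_args(struct model_args *a)
{
    assert(!a->has_buf);
    a->has_buf = true;
    buffers_outstanding++;
}

void model_free_args(struct model_args *a)
{
    if (a->has_buf) {
        a->has_buf = false;
        buffers_outstanding--;
    }
}

/*
 * Returns true when the process "exited" inside exec. That models
 * exit1(), which never returns to the caller in the kernel.
 */
bool model_kern_execve(struct model_args *a, bool vmspace_destroyed,
                       bool free_before_exit)
{
    if (vmspace_destroyed) {
        if (free_before_exit)
            model_free_args(a);   /* assumed shape of a fix */
        return true;
    }
    return false;
}

/* Returns the number of buffers still outstanding after the call. */
int model_sys_execve(bool vmspace_destroyed, bool free_before_exit)
{
    struct model_args a = { false };

    model_copyin_args(&a);
    if (model_kern_execve(&a, vmspace_destroyed, free_before_exit))
        return buffers_outstanding;   /* process gone: cleanup below is skipped */
    model_free_args(&a);
    return buffers_outstanding;
}
```

Each failed exec with a destroyed vmspace leaves one more buffer outstanding unless it is freed before the exit, which matches a pool that slowly drains on a machine that execs non-stop.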

cheers + hope to see you again in this cinema.
simon

#6 Updated by corecode over 6 years ago

see attached patch. pls review + comment.

cheers
simon

#7 Updated by dillon over 6 years ago

:More details on that: It just happened again.
:
:This time I managed to get into kgdb. Details follow:
:
:...
: at /usr/src/src/sys/kern/kern_objcache.c:432
:#4 0xc02de60b in exec_copyin_args (args=0xed6edc50,
: fname=0x2810a000 <Error reading address 0x2810a000: Bad address>, segflg=PATH_USERSPACE,
: argv=0x28101af0, envv=0x28105700) at /usr/src/src/sys/kern/kern_exec.c:734
:#5 0xc02de0b5 in sys_execve (uap=0xed6edcf8) at /usr/src/src/sys/kern/kern_exec.c:525
:#6 0xc05265c9 in syscall2 (frame=0xed6edd40) at /usr/src/src/sys/platform/pc32/i386/trap.c:1340
:#7 0xc050e025 in Xint0x80_syscall () at /usr/src/src/sys/platform/pc32/i386/exception.s:872
:
:First issue: Why is magcapacity == 2? Why are there no empty magazines?

Those are very large buffers and we don't want to keep too many of
them cached. Each cpu will have at least two active magazines
plus there is a slop of one more (so there are always some free
magazines in the pool), so we wind up with four buffers per cpu with
another four in the common pool. This is plenty for exec.

:Seems the magazines are indeed just 2 rounds long. However, the loaded magazine has a round in, so why didn't the process get woken up?
:
:I'll look into the exec path, there might be a leak as well.
:
:cheers
: simon

That is a very good question, and it looks like you tracked it
down in a later email! Your patch looks good, commit it!

Don't worry about committing to the release branch, I'll synchronize
most of the work in HEAD to RELEASE later on today.

-Matt
Matthew Dillon

#8 Updated by dillon over 6 years ago

:> just to add to debugging info:
:> when system boots up:
:
:please do not hijack my thread, your data seems unrelated.

Ha ha... Simon is on a roll here!

:> top -S outputs:
:>
:> Inactive Mem: 8M, Free Mem: 1810M
:
:it probably also tells you about wired, buffer and active, right?
:
:> is there a memory leak? or is it stored in objcache?
:
:i think you're not reading the data correctly.
:
:cheers
: simon

There's no leak (in KM's post). The OS keeps as few 'free' pages
around as it can. As many as possible are used to cache filesystem
data.

Simon found the leak in the linux exec code, nice!

-Matt
Matthew Dillon
