Bug #622

cvs commit: src/lib Makefile src/lib/libpthread Makefile README dummy.c

Added by aoiko almost 7 years ago. Updated almost 7 years ago.

Status:ClosedStart date:
Priority:LowDue date:
Assignee:-% Done:

0%

Category:-
Target version:-

Description

On Tuesday 17 April 2007 15:42, Simon 'corecode' Schubert wrote:
[...]
> Known issues (of libthread_xu/system):
> Mozilla stuff doesn't work. they use sched_get_priority_min/max() and
> pthread_setsched_prio(). units disagree (former reporting -20/20, later
> expecting 0-32 or so). this has to be streamlined somehow.
>
> Unfortunately I won't have time to do this in the next month because I am
> on tour, so I'd be grateful if somebody else could jump in. It's not much
> work left!

The first attached patch (lwp_prio) adds a new syscall

int lwp_rtprio (int, pid_t, lwpid_t, struct rtprio *);

that monkeys rtprio(2) and works on lwps. The other patch brings in code from
freebsd (+ some small changes to /not/ silently accept any priority value for
SCHED_OTHER) in libthread_xu, so that now my test program can only set
priority values within sched_get_priority_{min,max} and smaller/larger values
fail. The values are passed on to the kernel using lwp_rtprio() but the
kernel side is not well tested.

Since, AFAICT lwpid can be zero lwp_rtprio() uses pid 0 for "current process"
but lwpid -1 for "current thread", which IMO is ugly. Perhaps we should also
reserve lwpid 0? Alternatively, I can change it to -1 for both. Then again,
very few programs are ever going to use this. Opinions are welcome.

Don't know if this fixes thunderbird; can someone who has it linked against
libpthread switch the symlink and test?

Aggelos

lwp_rtprio.patch Magnifier (3.47 KB) aoiko, 04/30/2007 12:49 PM

xu-schedparams-checkother.patch Magnifier (9.06 KB) aoiko, 04/30/2007 12:49 PM

ktrace.patch Magnifier (1.21 KB) josepht, 05/04/2007 07:04 PM

kdump.patch Magnifier (535 Bytes) josepht, 05/04/2007 07:04 PM

History

#1 Updated by aoiko almost 7 years ago

On Monday 30 April 2007 15:44, Aggelos Economopoulos wrote:
[...]
> Don't know if this fixes thunderbird; can someone who has it linked against
> libpthread switch the symlink and test?

I've built the pkgsrc 2007Q1 thunderbird but I can't reproduce a crash on
startup. If its dot dirs are available, thunderbird seems to work with
libc_r, vanilla libthread_xu and libthread_xu + prio patch. However, if you
remove the dot dirs and start thunderbird with the symlink pointing to
libthread_xu, it gets stuck waiting for two locks (presumably two threads do
the waiting, ktrace is not lwp aware so doesn't help much). Kill it, and next
time it starts up just fine. Given that this doesn't happen using libc_r,
chances are the locking code in libthread_xu could use some debugging. This
would be fun to do (especially if it turns out the bug is in the mozilla
code :P), but I can't volunteer.

That said, I think the two patches I posted can safely go in. Please commit
while they still apply :)

Thanks,
Aggelos

#2 Updated by dillon almost 7 years ago

:I've built the pkgsrc 2007Q1 thunderbird but I can't reproduce a crash on
:startup. If its dot dirs are available, thunderbird seems to work with
:libc_r, vanilla libthread_xu and libthread_xu + prio patch. However, if you
:remove the dot dirs and start thunderbird with the symlink pointing to
:libthread_xu, it gets stuck waiting for two locks (presumably two threads do
:the waiting, ktrace is not lwp aware so doesn't help much). Kill it, and next
:time it starts up just fine. Given that this doesn't happen using libc_r,
:chances are the locking code in libthread_xu could use some debugging. This
:would be fun to do (especially if it turns out the bug is in the mozilla
:code :P), but I can't volunteer.
:
:That said, I think the two patches I posted can safely go in. Please commit
:while they still apply :)
:
:Thanks,
:Aggelos

Committed!

I can't test the lock stuff any time soon, but if ktrace does not
already have a field for the LWP id and someone would like to add
it to the kernel and support to the userland utility, submit a patch
any time!

-Matt
Matthew Dillon
<>

#3 Updated by josepht almost 7 years ago

Here's my attempt.

Joe

#4 Updated by dillon almost 7 years ago

:Here's my attempt.
:
:Joe

Looks good! I haven't had a chance to test it yet (got a ton
of uncommitted stuff I have to commit first).

If you'd like to use the time to make a slight change, actually two
small changes, that would be great. See here where you print the TID ?

:- col = printf("%6d %-8.*s ", kth->ktr_pid, MAXCOMLEN, kth->ktr_comm);
:+ col = printf("%6d %6d %-8.*s ", kth->ktr_pid, kth->ktr_tid, MAXCOMLEN, kth->ktr_comm);

What I would like is it to print the PID and TID as %6d:%-6d instead,
and only print the PID (i.e. just %6d) if the TID is 0.

Yah, yah, I know.. its simple stuff. But it actually is beneficial
having you do it instead of me at the moment, I'm hip deep in reworking
vnode ref counting to use SYSREF.

-Matt
Matthew Dillon
<>

#5 Updated by aoiko almost 7 years ago

On Saturday 05 May 2007 09:52, Matthew Dillon wrote:
[...]
> What I would like is it to print the PID and TID as %6d:%-6d instead,
> and only print the PID (i.e. just %6d) if the TID is 0.

Why? Is it so that a kdump for a single-threaded process will not show
redundant data?

If so, I'd prefer a command line flag to toggle display of the lwpid on and
off (or at least to force display of the lwpid). What you're suggesting would
unnecessarily complicate parsing of kdump output.

I wish more people would comment on stuff like that (instead of bikesheding).
About 90% of the source is "simple", but attention to detail makes the
difference between a system that occasionally gets in your way and one that
doesn't.

Aggelos

#6 Updated by dillon almost 7 years ago

:Why? Is it so that a kdump for a single-threaded process will not show
:redundant data?
:
:If so, I'd prefer a command line flag to toggle display of the lwpid on and
:off (or at least to force display of the lwpid). What you're suggesting would
:unnecessarily complicate parsing of kdump output.

If the pid:tid is separated with a colon instead of whitespace it
will not be difficult to parse it with or without the tid. Screen
space is already at a premium with the kdump output, I don't want
to add the additional output unless the program is actually threaded.

There is no easy way to determine whether the program is threaded
or not short of adding a flag to say 'yah, this program did some LWP
ops at some point so treat it as if it were threaded'. This could
be done by inserting another 'short' field in the ktrace structure
(see that 'short' declaration in the structure? The structure is
probably being realigned after that declaration so we might as well
insert another 'short' there for a flags field).

The only other way, the quick solution which I originally suggested,
would be to only output the TID if it is non-zero

An option could be added to force the TID to be displayed regardless,
but the default should be to determine TID display automatically.

-Matt
Matthew Dillon
<>

#7 Updated by corecode almost 7 years ago

Oh, that might be easy. I'd track this in kdump, not in the kernel. Just
remember if there was ever a tid != 0 for a particular proc and if so, start
using the extended output. I recall that strace is doing something like this
when tracing procs with fork tracing switched on. Maybe also not using 6
digits unconditionaly (dunno, just skimmed the patch) but scaling up to the
needed amount could help.

cheers
simon

#8 Updated by aoiko almost 7 years ago

You'd have to actively try in order to make it difficult to parse in any
case :) While this doesn't break cut, it needs a bit of extra attention when
you're using a regex.

I think a more future proof solution is to allow the default output to be
arbitrarily fancy, but have an option that produces fixed-form output so you
can trivially parse the data in your favourite scripting language.

Aggelos

#9 Updated by dillon almost 7 years ago

All committed. I added some code to detect threading and made the
display a bit more dynamic, but it could probably use some more
polishing.

-Matt

Also available in: Atom PDF