Bug #1003

Qt 4.4 QtConcurrent and libthread_xu

Added by hasso about 6 years ago. Updated over 1 year ago.

Status:ClosedStart date:
Priority:NormalDue date:
Assignee:-% Done:

0%

Category:-
Target version:-

Description

Qt 4.4 integrates QT Concurrent [1] template library which provides
high-level and very easy to use API for writing multi-threaded programs.

The problem is that all these programs crash (segfault) rather "reliably"
with libthread_xu while there is no single problem with libc_r. I even
managed to deadlock whole machine with one example program, but can not
repeat it for now (I have single desktop at the moment I need to work
with ;).

I made static binaries from trivial example program you can download
here - http://leaf.dragonflybsd.org/~hasso/qt44-threaded-tests.tar.bz2.

runfunction-lr is compiled with libc_r, runfunction-xu with libthread_xu.
runfunction-xu crashing backtrace is always the same:

#0 __pthread_mutex_trylock (m=0x0)
at /home/hasso/dragonfly-src/lib/libthread_xu/thread/thr_mutex.c:292
292 if (__predict_false(*m == NULL)) {
(gdb) bt
#0 __pthread_mutex_trylock (m=0x0)
at /home/hasso/dragonfly-src/lib/libthread_xu/thread/thr_mutex.c:292
#1 0x0814157e in g_mutex_trylock_posix_impl ()
#2 0x0815186f in g_slice_alloc ()
#3 0x0815ca7a in g_ptr_array_sized_new ()
#4 0x0815cab5 in g_ptr_array_new ()
#5 0x0814b5ec in g_main_context_new ()
#6 0x080fd298 in QEventDispatcherGlibPrivate (this=0x281fb400,
context=0x0)
at kernel/qeventdispatcher_glib.cpp:241
#7 0x080fd4e7 in QEventDispatcherGlib (this=0x281f9260, parent=0x0) at
kernel/qeventdispatcher_glib.cpp:268
#8 0x08057e1a in QThreadPrivate::createEventDispatcher (data=0x281fa740)
at thread/qthread_unix.cpp:161
#9 0x08057f0c in QThreadPrivate::start (arg=0x281f91c0) at
thread/qthread_unix.cpp:185
#10 0x08145017 in thread_start (arg=0x281f6200)
at /home/hasso/dragonfly-src/lib/libthread_xu/thread/thr_create.c:239
#11 0x00000000 in ?? ()

[1] - http://labs.trolltech.com/blogs/category/labs/threads/qt-concurrent/

History

#1 Updated by corecode about 6 years ago

Well, somebody seems to pass a null pointer here.

could you compile glib with -g and also include their sources?

cheers
simon

#2 Updated by dillon about 6 years ago

Something in that chain of calls is passing a NULL to
__pthread_mutex_trylock().

In libc_r we have this:

int
_pthread_mutex_trylock(pthread_mutex_t * mutex)
{
struct pthread *curthread = _get_curthread();
int ret = 0;

if (mutex == NULL)
ret = EINVAL;
...
}

In libthread_xu it assumes non-NULL and will crash.

Try this patch. It will do the same check that libc_r does. I'm
not convinced that Qt isn't broken, though, Qt shouldn't be passing
NULL to the mutex functions, it should be passing the address of
a pthread_mutex_t which itself can be NULL, but it should be passing
NULL.

-Matt

Index: thread/thr_mutex.c
===================================================================
RCS file: /cvs/src/lib/libthread_xu/thread/thr_mutex.c,v
retrieving revision 1.14
diff -u -p -r1.14 thr_mutex.c
--- thread/thr_mutex.c 13 Apr 2006 11:53:39 -0000 1.14
+++ thread/thr_mutex.c 7 May 2008 19:18:04 -0000
@@ -285,6 +285,8 @@ {
struct pthread *curthread = tls_get_curthread();
int ret;

+ if (__predict_false(m == NULL))
+ return(EINVAL);
/*
* If the mutex is statically initialized, perform the dynamic
* initialization:
@@ -372,12 +374,14 @@ int ret;

_thr_check_init();

- curthread = tls_get_curthread();
+ if (__predict_false(m == NULL))
+ return(EINVAL);

/*
* If the mutex is statically initialized, perform the dynamic
* initialization:
*/
+ curthread = tls_get_curthread();
if (__predict_false(*m == NULL)) {
ret = init_static(curthread, m);
if (__predict_false(ret))
@@ -394,12 +398,14 @@ int ret;

_thr_check_init();

- curthread = tls_get_curthread();
+ if (__predict_false(m == NULL))
+ return(EINVAL);

/*
* If the mutex is statically initialized, perform the dynamic
* initialization marking it private (delete safe):
*/
+ curthread = tls_get_curthread();
if (__predict_false(*m == NULL)) {
ret = init_static_private(curthread, m);
if (__predict_false(ret))
@@ -417,12 +423,14 @@ int ret;

_thr_check_init();

- curthread = tls_get_curthread();
+ if (__predict_false(m == NULL))
+ return(EINVAL);

/*
* If the mutex is statically initialized, perform the dynamic
* initialization:
*/
+ curthread = tls_get_curthread();
if (__predict_false(*m == NULL)) {
ret = init_static(curthread, m);
if (__predict_false(ret))
@@ -440,6 +448,9 @@ int ret;

_thr_check_init();

+ if (__predict_false(m == NULL))
+ return(EINVAL);
+
curthread = tls_get_curthread();

/*
@@ -457,6 +468,8 @@
int
_pthread_mutex_unlock(pthread_mutex_t *m)
{
+ if (__predict_false(m == NULL))
+ return(EINVAL);
return (mutex_unlock_common(m));
}

@@ -556,7 +569,6 @@ struct pthread_mutex *m;

if (__predict_false((m = *mutex)== NULL))
return (EINVAL);
-
if (__predict_false(m->m_owner != curthread))
return (EPERM);

@@ -600,9 +612,10 @@ {
struct pthread *curthread = tls_get_curthread();
struct pthread_mutex *m;

- if (__predict_false((m = *mutex)== NULL))
+ if (__predict_false(mutex == NULL))
+ return (EINVAL);
+ if (__predict_false((m = *mutex) == NULL))
return (EINVAL);
-
if (__predict_false(m->m_owner != curthread))
return (EPERM);

#3 Updated by hasso about 6 years ago

Compiled glib2 (2.14.6 in 2008Q1) distfile from source:

Program terminated with signal 11, Segmentation fault.
#0 __pthread_mutex_trylock (m=0x0)
at /home/hasso/dragonfly-src/lib/libthread_xu/thread/thr_mutex.c:292
292 if (__predict_false(*m == NULL)) {
(gdb) bt
#0 __pthread_mutex_trylock (m=0x0)
at /home/hasso/dragonfly-src/lib/libthread_xu/thread/thr_mutex.c:292
#1 0x081415f2 in g_mutex_trylock_posix_impl (mutex=0x0) at
gthread-posix.c:187
#2 0x08151b6b in IA__g_slice_alloc (mem_size=12) at gslice.c:395
#3 0x0815cd76 in IA__g_ptr_array_sized_new (reserved_size=0) at
garray.c:372
#4 0x0815cdb1 in IA__g_ptr_array_new () at garray.c:366
#5 0x0814b65c in IA__g_main_context_new () at gmain.c:757
#6 0x080fd298 in QEventDispatcherGlibPrivate (this=0x281fb400,
context=0x0)
at kernel/qeventdispatcher_glib.cpp:241
#7 0x080fd4e7 in QEventDispatcherGlib (this=0x281f9260, parent=0x0) at
kernel/qeventdispatcher_glib.cpp:268
#8 0x08057e1a in QThreadPrivate::createEventDispatcher (data=0x281fa740)
at thread/qthread_unix.cpp:161
#9 0x08057f0c in QThreadPrivate::start (arg=0x281f91b0) at
thread/qthread_unix.cpp:185
#10 0x08145087 in thread_start (arg=0x281f6200)
at /home/hasso/dragonfly-src/lib/libthread_xu/thread/thr_create.c:239
#11 0x00000000 in ?? ()

#4 Updated by corecode about 6 years ago

From libc_r:

int
_pthread_mutex_trylock(pthread_mutex_t * mutex)
{
struct pthread *curthread = _get_curthread();
int ret = 0;

if (mutex == NULL)
ret = EINVAL;

/*
* If the mutex is statically initialized, perform the dynamic
* initialization:
*/
else if (*mutex != NULL || (ret = init_static(mutex)) == 0) {
/*

from libthread_xu:

int
__pthread_mutex_trylock(pthread_mutex_t *m)
{
struct pthread *curthread = tls_get_curthread();
int ret;

/*
* If the mutex is statically initialized, perform the dynamic
* initialization:
*/
if (__predict_false(*m == NULL)) {

Not sure if we are required to catch NULL pointers? They shouldn't pass
them in the first place...

cheers
simon

#5 Updated by aoiko about 6 years ago

On Wednesday 07 May 2008, Matthew Dillon wrote:
[...]
> In libthread_xu it assumes non-NULL and will crash.
>
> Try this patch. It will do the same check that libc_r does. I'm
> not convinced that Qt isn't broken, though, Qt shouldn't be passing
> NULL to the mutex functions, it should be passing the address of
> a pthread_mutex_t which itself can be NULL, but it should be passing
> NULL.

Still, it's bad form for a library function to crash just because somebody
called it with an obviously invalid argument.

Aggelos

#6 Updated by Johannes.Hofmann about 6 years ago

Not sure about that. There are many illeagal pointers that can't be catched
so why treat NULL differently?
And a crash with an obvious stack trace shows that there is a problem that
should be fixed.

Johannes

#7 Updated by aoiko about 6 years ago

Because NULL is guaranteed not to point to an object.

While an error of EINVAL is hard to debug?

Aggelos

#8 Updated by corecode about 6 years ago

This is true. Fair enough, we can add a check. I still wonder why they
are passing NULL in the first place.

Yes, most people seem to ignore error messages returned by mutex functions
anyways, or so I've heard :)

cheers
simon

#9 Updated by dillon about 6 years ago

Ultimately it comes down to what does the standard intend?

In my years of programming I've found that it is generally better to
enforce the stated intention by crashing the program (or allowing it
to crash) rather then returning an error code. It results in far better
code quality from the people using the API. Unless the standard
specifically says that NULL is ok, then the problem is with the Qt
library and not with the threads library.

That said, from an administrative point of view I want to have maximum
compatibility between libthread_xu and libpthread, because we aren't
really in the business of policing other people's work.

So I am willing to commit the change, even if the standard doesn't allow
NULL.

-Matt

#10 Updated by aoiko about 6 years ago

[...]
> So I am willing to commit the change, even if the standard doesn't allow
> NULL.

Well, SUSv3 says

"The pthread_mutex_lock(), pthread_mutex_trylock(), and pthread_mutex_unlock()
functions may fail if:
[EINVAL]
The value specified by mutex does not refer to an initialized mutex object."

So it definitely allows NULL and seems to encourage extra mtx->magic checks.

Aggelos

#11 Updated by corecode about 6 years ago

The standard says:

int pthread_mutex_lock(pthread_mutex_t *mutex);
int pthread_mutex_trylock(pthread_mutex_t *mutex);
int pthread_mutex_unlock(pthread_mutex_t *mutex);

Note that none of the parameters has the modifier "restricted". That
means that they may be NULL.

The pthread_mutex_lock(), pthread_mutex_trylock(), and
pthread_mutex_unlock() functions may fail if:

[EINVAL]
The value specified by mutex does not refer to an initialized mutex
object.

Now, of course we cannot detect every single value that might not refer to
an initialized mutex object (unless we caught exceptions, etc.), but
returning EINVAL on a NULL pointer for sure isn't the wrong thing to do.

cheers
simon

#12 Updated by hasso about 6 years ago

Yep, but ...

It just makes backtraces huge. Obviously passing NULL to
__pthread_mutex_trylock() isn't the problem, but probably just
consequence.

Qt doesn't pass NULL, it's glib. It doesn't mean that problem may not be
in Qt though. Just to be sure I tested these examples in some other
platforms and all these examples work fine in Linux, OpenBSD and FreeBSD.

I updated static binaries tarball to include debug info from glib as well.

http://leaf.dragonflybsd.org/%7Ehasso/qt44-threaded-tests.tar.bz2

#13 Updated by hasso about 6 years ago

BTW, seems that static binaries don't give enough info though. With
dynamic binaries I almost always get the error:

GThread-ERROR **: GThread system may only be initialized once.

And I also managed to deadlock the system again. Unfortunately I couldn't
get any info from system. Something very fishy is going on.

#14 Updated by corecode about 6 years ago

Both don't fail/crash here.

cheers
simon

#15 Updated by corecode about 6 years ago

checking with linux (ubuntu hardy), passing a NULL pointer to
pthread_mutex_trylock() crashes, so I don't think we have to do the checks
as well. The root of the problem seems to be somewhere else.

cheers
simon

#16 Updated by dillon about 6 years ago

I'm going to go ahead and commit the NULL checks for the mutexes anyway,
just to get those off my table.

-Matt
Matthew Dillon
<>

#17 Updated by hasso about 6 years ago

Original issue seems to be in Qt though. My guess is that it's the
http://trolltech.com/developer/task-tracker/index_html?method=entry&amp;id=210246 .
It's not confirmed yet by Trolltech, but sounds like it is. Compiling Qt-4.4.0
without glib support fixes things as well. libc_r isn't probably affected due
to nature of its' threads.

I'm closing bug for now.

#18 Updated by tuxillo over 1 year ago

  • Description updated (diff)
  • Status changed from New to Closed
  • Assignee deleted (0)

Effectively closing the ticket, as the caller meant to do so.

Also available in: Atom PDF