Bug #1003
closedQt 4.4 QtConcurrent and libthread_xu
0%
Description
Qt 4.4 integrates QT Concurrent [1] template library which provides 
high-level and very easy to use API for writing multi-threaded programs.
The problem is that all these programs crash (segfault) rather "reliably" 
with libthread_xu while there is no single problem with libc_r. I even 
managed to deadlock whole machine with one example program, but can not 
repeat it for now (I have single desktop at the moment I need to work 
with ;).
I made static binaries from trivial example program you can download 
here - http://leaf.dragonflybsd.org/~hasso/qt44-threaded-tests.tar.bz2.
runfunction-lr is compiled with libc_r, runfunction-xu with libthread_xu.
runfunction-xu crashing backtrace is always the same:
#0  _pthread_mutex_trylock (m=0x0) 
at /home/hasso/dragonfly-src/lib/libthread_xu/thread/thr_mutex.c:292
292             if (_predict_false(*m == NULL)) {
(gdb) bt
#0  __pthread_mutex_trylock (m=0x0) 
at /home/hasso/dragonfly-src/lib/libthread_xu/thread/thr_mutex.c:292
#1  0x0814157e in g_mutex_trylock_posix_impl ()
#2  0x0815186f in g_slice_alloc ()
#3  0x0815ca7a in g_ptr_array_sized_new ()
#4  0x0815cab5 in g_ptr_array_new ()
#5  0x0814b5ec in g_main_context_new ()
#6  0x080fd298 in QEventDispatcherGlibPrivate (this=0x281fb400, 
context=0x0)
    at kernel/qeventdispatcher_glib.cpp:241
#7  0x080fd4e7 in QEventDispatcherGlib (this=0x281f9260, parent=0x0) at 
kernel/qeventdispatcher_glib.cpp:268
#8  0x08057e1a in QThreadPrivate::createEventDispatcher (data=0x281fa740) 
at thread/qthread_unix.cpp:161
#9  0x08057f0c in QThreadPrivate::start (arg=0x281f91c0) at 
thread/qthread_unix.cpp:185
#10 0x08145017 in thread_start (arg=0x281f6200)
    at /home/hasso/dragonfly-src/lib/libthread_xu/thread/thr_create.c:239
#11 0x00000000 in ?? ()
[1] - http://labs.trolltech.com/blogs/category/labs/threads/qt-concurrent/
      
      Updated by corecode over 17 years ago
      
    
    Well, somebody seems to pass a null pointer here.
could you compile glib with -g and also include their sources?
cheers
   simon
      
      Updated by dillon over 17 years ago
      
    
    Something in that chain of calls is passing a NULL to
    __pthread_mutex_trylock().
In libc_r we have this:
	int
_pthread_mutex_trylock(pthread_mutex_t * mutex)
{
        struct pthread  *curthread = _get_curthread();
        int     ret = 0;
if (mutex == NULL)
                ret = EINVAL;
    ...
}
	In libthread_xu it assumes non-NULL and will crash.
	Try this patch.  It will do the same check that libc_r does.  I'm
    not convinced that Qt isn't broken, though, Qt shouldn't be passing
    NULL to the mutex functions, it should be passing the address of
    a pthread_mutex_t which itself can be NULL, but it should be passing
    NULL.
	-Matt
	Index: thread/thr_mutex.c
===================================================================
RCS file: /cvs/src/lib/libthread_xu/thread/thr_mutex.c,v
retrieving revision 1.14
diff u -p -r1.14 thr_mutex.c thread/thr_mutex.c    13 Apr 2006 11:53:39 -0000    1.14
--
+++ thread/thr_mutex.c    7 May 2008 19:18:04 -0000@ -285,6 +285,8 @ {
     struct pthread *curthread = tls_get_curthread();
     int ret;
+    if (__predict_false(m == NULL))
+        return(EINVAL);
     /*
      * If the mutex is statically initialized, perform the dynamic
      * initialization:@ -372,12 +374,14 @     int    ret;
_thr_check_init();
	-    curthread = tls_get_curthread();
+    if (__predict_false(m == NULL))
+        return(EINVAL);
/*
 * If the mutex is statically initialized, perform the dynamic
 * initialization:
      */
+    curthread = tls_get_curthread();
     if (_predict_false(*m == NULL)) {
         ret = init_static(curthread, m);
         if (_predict_false(ret))
@ -394,12 +398,14 @     int    ret;
	_thr_check_init();
	-    curthread = tls_get_curthread();
+    if (__predict_false(m == NULL))
+        return(EINVAL);
/*
 * If the mutex is statically initialized, perform the dynamic
 * initialization marking it private (delete safe):
      */
+    curthread = tls_get_curthread();
     if (_predict_false(*m == NULL)) {
         ret = init_static_private(curthread, m);
         if (_predict_false(ret))
@ -417,12 +423,14 @     int    ret;
	_thr_check_init();
	-    curthread = tls_get_curthread();
+    if (__predict_false(m == NULL))
+        return(EINVAL);
/*
 * If the mutex is statically initialized, perform the dynamic
 * initialization:
      */
+    curthread = tls_get_curthread();
     if (_predict_false(*m == NULL)) {
         ret = init_static(curthread, m);
         if (_predict_false(ret))
@ -440,6 +448,9 @     int    ret;
	_thr_check_init();
	+    if (__predict_false(m == NULL))
+        return(EINVAL);
+
     curthread = tls_get_curthread();
/*
@ -457,6 +468,8 @ 
 int
 _pthread_mutex_unlock(pthread_mutex_t *m)
{
+    if (__predict_false(m  NULL))
+        return(EINVAL);
     return (mutex_unlock_common(m));
 }
	@ -556,7 +569,6 @     struct pthread_mutex *m;
if (_predict_false((m = *mutex) NULL))
         return (EINVAL);
     if (_predict_false(m>m_owner != curthread))
         return (EPERM);
	@ -600,9 +612,10 @ {
     struct pthread *curthread = tls_get_curthread();
     struct pthread_mutex *m;
-    if (_predict_false((m = *mutex)== NULL))
+    if (_predict_false(mutex  NULL))
+        return (EINVAL);
+    if (_predict_false((m = *mutex)  NULL))
         return (EINVAL);_predict_false(m>m_owner != curthread))
     if (
         return (EPERM);
      
      Updated by hasso over 17 years ago
      
    
    Compiled glib2 (2.14.6 in 2008Q1) distfile from source:
Program terminated with signal 11, Segmentation fault.
#0  _pthread_mutex_trylock (m=0x0) 
at /home/hasso/dragonfly-src/lib/libthread_xu/thread/thr_mutex.c:292
292             if (_predict_false(*m == NULL)) {
(gdb) bt
#0  _pthread_mutex_trylock (m=0x0) 
at /home/hasso/dragonfly-src/lib/libthread_xu/thread/thr_mutex.c:292
#1  0x081415f2 in g_mutex_trylock_posix_impl (mutex=0x0) at 
gthread-posix.c:187
#2  0x08151b6b in IA_g_slice_alloc (mem_size=12) at gslice.c:395
#3  0x0815cd76 in IA__g_ptr_array_sized_new (reserved_size=0) at 
garray.c:372
#4  0x0815cdb1 in IA__g_ptr_array_new () at garray.c:366
#5  0x0814b65c in IA__g_main_context_new () at gmain.c:757
#6  0x080fd298 in QEventDispatcherGlibPrivate (this=0x281fb400, 
context=0x0)
    at kernel/qeventdispatcher_glib.cpp:241
#7  0x080fd4e7 in QEventDispatcherGlib (this=0x281f9260, parent=0x0) at 
kernel/qeventdispatcher_glib.cpp:268
#8  0x08057e1a in QThreadPrivate::createEventDispatcher (data=0x281fa740) 
at thread/qthread_unix.cpp:161
#9  0x08057f0c in QThreadPrivate::start (arg=0x281f91b0) at 
thread/qthread_unix.cpp:185
#10 0x08145087 in thread_start (arg=0x281f6200)
    at /home/hasso/dragonfly-src/lib/libthread_xu/thread/thr_create.c:239
#11 0x00000000 in ?? ()
      
      Updated by corecode over 17 years ago
      
    
    From libc_r:
int
_pthread_mutex_trylock(pthread_mutex_t * mutex)
{
    struct pthread    *curthread = _get_curthread();
    int    ret = 0;
if (mutex == NULL)
        ret = EINVAL;
	/*
 * If the mutex is statically initialized, perform the dynamic
 * initialization:
     /
    else if (*mutex != NULL || (ret = init_static(mutex)) == 0) {
        /
	from libthread_xu:
int
__pthread_mutex_trylock(pthread_mutex_t *m)
{
    struct pthread *curthread = tls_get_curthread();
    int ret;
/*
 * If the mutex is statically initialized, perform the dynamic
 * initialization:
     */
    if (__predict_false(*m == NULL)) {
	Not sure if we are required to catch NULL pointers?  They shouldn't pass 
them in the first place...
cheers
   simon
      
      Updated by aoiko over 17 years ago
      
    
    On Wednesday 07 May 2008, Matthew Dillon wrote:
[...]
In libthread_xu it assumes non-NULL and will crash.
Try this patch. It will do the same check that libc_r does. I'm
not convinced that Qt isn't broken, though, Qt shouldn't be passing
NULL to the mutex functions, it should be passing the address of
a pthread_mutex_t which itself can be NULL, but it should be passing
NULL.
Still, it's bad form for a library function to crash just because somebody
called it with an obviously invalid argument.
Aggelos
      
      Updated by Johannes.Hofmann over 17 years ago
      
    
    Not sure about that. There are many illeagal pointers that can't be catched
so why treat NULL differently? 
And a crash with an obvious stack trace shows that there is a problem that
should be fixed.
Johannes
      
      Updated by aoiko over 17 years ago
      
    
    Because NULL is guaranteed not to point to an object.
While an error of EINVAL is hard to debug?
Aggelos
      
      Updated by corecode over 17 years ago
      
    
    This is true.  Fair enough, we can add a check.  I still wonder why they 
are passing NULL in the first place.
Yes, most people seem to ignore error messages returned by mutex functions 
anyways, or so I've heard :)
cheers
   simon
      
      Updated by dillon over 17 years ago
      
    
    Ultimately it comes down to what does the standard intend?
In my years of programming I've found that it is generally better to
    enforce the stated intention by crashing the program (or allowing it
    to crash) rather then returning an error code.  It results in far better
    code quality from the people using the API.  Unless the standard
    specifically says that NULL is ok, then the problem is with the Qt
    library and not with the threads library.
	That said, from an administrative point of view I want to have maximum
    compatibility between libthread_xu and libpthread, because we aren't
    really in the business of policing other people's work.
	So I am willing to commit the change, even if the standard doesn't allow
    NULL.
	-Matt
      
      Updated by aoiko over 17 years ago
      
    
    [...]
So I am willing to commit the change, even if the standard doesn't allow
NULL.
Well, SUSv3 says
"The pthread_mutex_lock(), pthread_mutex_trylock(), and pthread_mutex_unlock()
functions may fail if:
[EINVAL]
The value specified by mutex does not refer to an initialized mutex object."
So it definitely allows NULL and seems to encourage extra mtx->magic checks.
Aggelos
      
      Updated by corecode over 17 years ago
      
    
    The standard says:
int pthread_mutex_lock(pthread_mutex_t *mutex);
int pthread_mutex_trylock(pthread_mutex_t *mutex);
int pthread_mutex_unlock(pthread_mutex_t *mutex);
Note that none of the parameters has the modifier "restricted".  That 
means that they may be NULL.
The pthread_mutex_lock(), pthread_mutex_trylock(), and 
pthread_mutex_unlock() functions may fail if:
[EINVAL]
     The value specified by mutex does not refer to an initialized mutex 
object.
Now, of course we cannot detect every single value that might not refer to 
an initialized mutex object (unless we caught exceptions, etc.), but 
returning EINVAL on a NULL pointer for sure isn't the wrong thing to do.
cheers
   simon
      
      Updated by hasso over 17 years ago
      
    
    Yep, but ...
It just makes backtraces huge. Obviously passing NULL to 
__pthread_mutex_trylock() isn't the problem, but probably just 
consequence.
Qt doesn't pass NULL, it's glib. It doesn't mean that problem may not be 
in Qt though. Just to be sure I tested these examples in some other 
platforms and all these examples work fine in Linux, OpenBSD and FreeBSD.
I updated static binaries tarball to include debug info from glib as well.
http://leaf.dragonflybsd.org/%7Ehasso/qt44-threaded-tests.tar.bz2
      
      Updated by hasso over 17 years ago
      
    
    BTW, seems that static binaries don't give enough info though. With 
dynamic binaries I almost always get the error:
GThread-ERROR **: GThread system may only be initialized once.
And I also managed to deadlock the system again. Unfortunately I couldn't 
get any info from system. Something very fishy is going on.
      
      Updated by corecode over 17 years ago
      
    
    checking with linux (ubuntu hardy), passing a NULL pointer to 
pthread_mutex_trylock() crashes, so I don't think we have to do the checks 
as well.  The root of the problem seems to be somewhere else.
cheers
   simon
      
      Updated by dillon over 17 years ago
      
    
    I'm going to go ahead and commit the NULL checks for the mutexes anyway,
    just to get those off my table.
-Matt
                    Matthew Dillon 
                    <dillon@backplane.com>
      
      Updated by hasso over 17 years ago
      
    
    Original issue seems to be in Qt though. My guess is that it's the 
http://trolltech.com/developer/task-tracker/index_html?method=entry&id=210246 .
It's not confirmed yet by Trolltech, but sounds like it is. Compiling Qt-4.4.0 
without glib support fixes things as well. libc_r isn't probably affected due 
to nature of its' threads.
I'm closing bug for now.
      
      Updated by tuxillo over 12 years ago
      
    
    - Description updated (diff)
 - Status changed from New to Closed
 - Assignee deleted (
0) 
Effectively closing the ticket, as the caller meant to do so.