
#885 closed defect (obsolete)

segmentation fault when connecting to icq

Reported by: Daniel
Owned by:
Priority: normal
Milestone:
Component: OSCAR
Version: devel
Keywords:
Cc:
IRC client+version: Client-independent
Operating System: Public server
OS version/distro: OpenBSD 5.0

Description

Hi!

This segfault happens while I'm connecting to ICQ. I'm using bzr revision #868 on OpenBSD.

If you need more information, I'll gladly provide it.

-- Daniel

Attachments (0)

Change History (15)

comment:1 Changed at 2012-01-02T21:26:16Z by wilmer

At least more duplicate bugs don't add useful info.. :-P

What I need is a backtrace. Right now I have no idea where this segfault is happening.

comment:2 Changed at 2012-01-02T21:55:16Z by wilmer

I saw your stacktrace while cleaning up the spamfilter here (which is clearly too trigger happy, ARGH!!):

GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-openbsd5.0"...
Core was generated by `bitlbee'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libpthread.so.13.1...done.
Loaded symbols for /usr/lib/libpthread.so.13.1
Reading symbols from /usr/local/lib/libgmodule-2.0.so.2800.0...done.
Loaded symbols for /usr/local/lib/libgmodule-2.0.so.2800.0
Reading symbols from /usr/local/lib/libglib-2.0.so.2800.0...done.
Loaded symbols for /usr/local/lib/libglib-2.0.so.2800.0
Reading symbols from /usr/local/lib/libintl.so.5.0...done.
Loaded symbols for /usr/local/lib/libintl.so.5.0
Reading symbols from /usr/local/lib/libiconv.so.6.0...done.
Loaded symbols for /usr/local/lib/libiconv.so.6.0
Reading symbols from /usr/local/lib/libgnutls.so.17.1...done.
Loaded symbols for /usr/local/lib/libgnutls.so.17.1
Reading symbols from /usr/local/lib/libgcrypt.so.15.0...done.
Loaded symbols for /usr/local/lib/libgcrypt.so.15.0
Reading symbols from /usr/local/lib/libgpg-error.so.3.1...done.
Loaded symbols for /usr/local/lib/libgpg-error.so.3.1
Reading symbols from /usr/local/lib/libotr.so.3.2...done.
Loaded symbols for /usr/local/lib/libotr.so.3.2
Symbols already loaded for /usr/lib/libpthread.so.13.1
Reading symbols from /usr/lib/libc.so.60.1...done.
Loaded symbols for /usr/lib/libc.so.60.1
Reading symbols from /usr/local/lib/libpcre.so.2.4...done.
Loaded symbols for /usr/local/lib/libpcre.so.2.4
Reading symbols from /usr/local/lib/libtasn1.so.2.0...done.
Loaded symbols for /usr/local/lib/libtasn1.so.2.0
Reading symbols from /usr/local/lib/libhogweed.so.0.0...done.
Loaded symbols for /usr/local/lib/libhogweed.so.0.0
Reading symbols from /usr/local/lib/libnettle.so.0.0...done.
Loaded symbols for /usr/local/lib/libnettle.so.0.0
Reading symbols from /usr/local/lib/libgmp.so.9.0...done.
Loaded symbols for /usr/local/lib/libgmp.so.9.0
Reading symbols from /usr/lib/libz.so.4.1...done.
Loaded symbols for /usr/lib/libz.so.4.1
Reading symbols from /usr/libexec/ld.so...done.
Loaded symbols for /usr/libexec/ld.so
#0  aim_callhandler (sess=0x7fd5f900, conn=0x8a9f6b80, family=65535, type=65535) at rxhandlers.c:277
277                     if ((cur->family == family) && (cur->type == type))
(gdb) list
272
273             if (!conn)
274                     return NULL;
275
276             for (cur = (struct aim_rxcblist_s *)conn->handlerlist; cur; cur = cur->next) {
277                     if ((cur->family == family) && (cur->type == type))
278                             return cur->handler;
279             }
280
281             if (type == AIM_CB_SPECIAL_DEFAULT) {
(gdb) bt
#0  aim_callhandler (sess=0x7fd5f900, conn=0x8a9f6b80, family=65535, type=65535) at rxhandlers.c:277
#1  0x1c04d210 in snachandler (sess=0x7fd5f900, mod=0x8a9f6dc0, rx=0x7c269900, snac=0xcfbd1ab0, bs=0x7c269908)
    at misc.c:377
#2  0x1c04e272 in consumenonsnac (sess=0x7fd5f900, rx=0x7c269900, family=Variable "family" is not available.
) at rxhandlers.c:144
#3  0x1c04e393 in aim_rxdispatch (sess=0x7fd5f900) at rxhandlers.c:359
#4  0x1c055b06 in oscar_callback (data=0x8a9f6b80, source=13, condition=Variable "condition" is not available.
) at oscar.c:292
#5  0x1c026e23 in gaim_io_invoke (source=0x8a9f60c0, condition=Variable "condition" is not available.
) at events_glib.c:85
#6  0x0fb07dfd in g_io_channel_unix_get_fd () from /usr/local/lib/libglib-2.0.so.2800.0
#7  0x0fabf397 in g_main_context_dispatch () from /usr/local/lib/libglib-2.0.so.2800.0
#8  0x0fac343e in g_main_context_prepare () from /usr/local/lib/libglib-2.0.so.2800.0
#9  0x0fac3847 in g_main_loop_run () from /usr/local/lib/libglib-2.0.so.2800.0
#10 0x1c026e93 in b_main_run () at events_glib.c:64
#11 0x1c024f62 in main (argc=Cannot access memory at address 0xffff
) at unix.c:177
(gdb) bt full
#0  aim_callhandler (sess=0x7fd5f900, conn=0x8a9f6b80, family=65535, type=65535) at rxhandlers.c:277
        cur = (struct aim_rxcblist_s *) 0xdfdfdfdf
#1  0x1c04d210 in snachandler (sess=0x7fd5f900, mod=0x8a9f6dc0, rx=0x7c269900, snac=0xcfbd1ab0, bs=0x7c269908)
    at misc.c:377
        userfunc = Variable "userfunc" is not available.
(gdb)

comment:3 Changed at 2012-01-02T22:00:05Z by wilmer

Hm. Could you also show me the output of "print cur" ?

comment:4 Changed at 2012-01-03T09:02:08Z by Daniel <leva@…>

(gdb) print cur
$1 = (struct aim_rxcblist_s *) 0xdfdfdfdf

comment:5 Changed at 2012-01-03T23:47:13Z by wilmer

Ew.

And what about conn->handlerlist ? Also, what about "connections" or "get_connections()"?

This is old inherited code from Gaim 0.58, I'm not sure what's up here..

comment:6 Changed at 2012-01-04T05:39:55Z by brynet@…

OpenBSD's malloc(3) has a MALLOC_OPTIONS environment variable that allows behaviour to be altered, which is used for finding bugs.

Here is the description for "J":

     J       ``Junk''.  Fill some junk into the area allocated.  Currently
             junk is bytes of 0xd0 when allocating; this is pronounced
             ``Duh''.  :-) Freed chunks are filled with 0xdf.

http://www.openbsd.org/cgi-bin/man.cgi?query=malloc&manpath=OpenBSD+Current&format=html
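
As a rough illustration of how the junk pattern quoted above can be recognized, here is a minimal sketch in C. The helper `looks_like_freed_junk` is hypothetical (it is not part of bitlbee or libc); it only checks whether every byte of a value equals 0xdf, the fill byte the man page gives for freed chunks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fill byte that OpenBSD's malloc(3) writes into freed chunks when junking
 * is enabled, per the man page excerpt quoted above. */
#define FREED_JUNK_BYTE 0xdfu

/* Hypothetical helper: returns true when the low `width` bytes of `value`
 * all equal the freed-junk byte. */
bool looks_like_freed_junk(uintptr_t value, size_t width)
{
    assert(width > 0 && width <= sizeof value);

    for (size_t i = 0; i < width; i++) {
        if (((value >> (8 * i)) & 0xffu) != FREED_JUNK_BYTE)
            return false;
    }
    return true;
}
```

With a width of 4 (the backtrace is from an i386 build), the value 0xdfdfdfdf reported for `cur` matches the pattern, while an ordinary heap address does not.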

comment:7 Changed at 2012-01-04T05:44:27Z by brynet@…

To be more precise, it looks like a use-after-free bug in bitlbee.
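
For readers unfamiliar with this class of bug, the following is a minimal sketch of the general pattern and one common defence, not a diagnosis of the actual cause here. All names (`struct handler_node`, `struct connection`, `add_handler`, `clear_handlers`, `find_handler`, `noop_handler`) are hypothetical stand-ins; only the shape of the lookup loop is taken from the listing in the backtrace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int (*handler_fn)(void);

/* Hypothetical stand-ins for the list node and connection seen in the
 * backtrace; the real bitlbee definitions are not reproduced here. */
struct handler_node {
    unsigned short family;
    unsigned short type;
    handler_fn handler;
    struct handler_node *next;
};

struct connection {
    struct handler_node *handlerlist;
};

/* Placeholder handler so the sketch has something to register. */
int noop_handler(void)
{
    return 0;
}

/* Prepend a handler; returns 0 on success, -1 if allocation fails. */
int add_handler(struct connection *conn, unsigned short family,
                unsigned short type, handler_fn fn)
{
    assert(conn != NULL);

    struct handler_node *node = malloc(sizeof *node);
    if (!node)
        return -1;
    node->family = family;
    node->type = type;
    node->handler = fn;
    node->next = conn->handlerlist;
    conn->handlerlist = node;
    return 0;
}

/* Free every node and reset the head, so a later lookup sees an empty
 * list instead of walking freed (junk-filled) memory. */
void clear_handlers(struct connection *conn)
{
    assert(conn != NULL);

    struct handler_node *cur = conn->handlerlist;
    while (cur) {
        struct handler_node *next = cur->next; /* read before free */
        free(cur);
        cur = next;
    }
    conn->handlerlist = NULL;
}

/* Same shape as the lookup loop shown in the gdb listing. */
handler_fn find_handler(const struct connection *conn,
                        unsigned short family, unsigned short type)
{
    if (!conn)
        return NULL;
    for (const struct handler_node *cur = conn->handlerlist; cur; cur = cur->next) {
        if (cur->family == family && cur->type == type)
            return cur->handler;
    }
    return NULL;
}
```

If the head pointer were left untouched after the nodes were freed, the loop in `find_handler` would read `cur->next` from freed memory, which with junking enabled would plausibly yield a value like 0xdfdfdfdf.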

comment:8 Changed at 2012-01-04T08:07:36Z by Daniel <leva@…>

Of course, I'm sorry, I should've noticed it. I'm using the '/etc/malloc.conf@ -> S' measures for malloc(3) on OpenBSD.

The requested info:

(gdb) print conn->handlerlist
$1 = (void *) 0x46140210
(gdb) print connections
$2 = (GSList *) 0x7c377148
(gdb) print get_connections  
$3 = {GSList *()} 0x1c031270 <get_connections>

comment:9 Changed at 2012-01-04T09:52:13Z by Wilmer van der Gaast <wilmer@…>

Ah yes, I already figured 0xdf would stand for double-free.

I assume valgrind works on OpenBSD? It'd probably be easier to find the root cause that way.

comment:10 Changed at 2012-01-04T11:33:39Z by Daniel

Nope, unfortunately we don't have valgrind.

comment:11 Changed at 2012-01-04T22:52:36Z by wilmer

D'oh, that doesn't make this easier. I don't see any obvious frees that may have caused this.

Does printing conn->handlerlist->next->next->... eventually show this 0xdfdfdfdf?

I don't really understand what the 0xffff SNAC is, it's not described on http://iserverd.khstu.ru/oscar/families.html . It seems to be artificial, for whatever the reason may be. If you comment out "consumenonsnac(sess, cur, 0xffff, 0xffff); /* last chance! */" in rxhandlers.c you may not get this crash ... but I guess stuff will break some other way.

Can you reproduce this crash on testing.bitlbee.org?

comment:12 Changed at 2012-01-05T07:44:19Z by Daniel <leva@…>

I cannot reproduce this on testing.bitlbee.org, because it connects right away.

On my server, bitlbee cannot connect to ICQ, and the crash happens during one of the retries. Sometimes it survives 2-3 connection retry attempts, sometimes it crashes on the first. So it seems that if bitlbee can connect to ICQ, there is no problem.

Of course, disabling malloc(3)'s double free check masks the problem.. :)

comment:13 Changed at 2012-01-05T19:40:02Z by Daniel <leva@…>

I only get "Attempt to dereference a generic pointer." when trying to print conn->handlerlist->next. I don't have the required gdb-fu, but my searches suggest that the print command is missing a cast; I couldn't figure out how to get gdb to print the ->next item, though :\

comment:14 Changed at 2012-03-25T15:17:14Z by wilmer

Hrmm, it seems I've dropped the ball here, sorry. testing.bitlbee.org:6668 runs inside valgrind, which should offer double-free checks as well and is easier to debug because it keeps track of the original malloc/free, etc.

As for type-casting, I think (struct aim_rxcblist_s *) is all you need, maybe with another * in front of it, maybe some more parentheses.
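
To make the cast concrete, here is a small C sketch of walking a list whose head is stored as a `void *`. The field names come from the source listing in the backtrace, but the field types, `struct conn_sketch`, and `nth_handler` are assumptions made for illustration. The same cast expression would probably work in gdb, along the lines of `print *((struct aim_rxcblist_s *) conn->handlerlist)->next`.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in: the field names appear in the listing above, but
 * the field types and layout here are assumptions. */
struct aim_rxcblist_s {
    unsigned short family;
    unsigned short type;
    void *handler;
    struct aim_rxcblist_s *next;
};

/* Hypothetical connection type with a generic head pointer, matching the
 * (void *) that gdb reported for conn->handlerlist. */
struct conn_sketch {
    void *handlerlist;
};

/* Return the n-th node (0-based), or NULL if the list is shorter. */
struct aim_rxcblist_s *nth_handler(const struct conn_sketch *conn, size_t n)
{
    assert(conn != NULL);

    /* conn->handlerlist->next would not compile: a void * cannot be
     * dereferenced, so it is cast to the node type first. */
    struct aim_rxcblist_s *cur = (struct aim_rxcblist_s *)conn->handlerlist;
    while (cur && n-- > 0)
        cur = cur->next;
    return cur;
}
```

Stepping through the nodes one at a time like this is the manual equivalent of repeatedly appending `->next` in the debugger.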

comment:15 Changed at 2015-10-31T11:37:14Z by dx

Resolution: obsolete
Status: new → closed

Somehow, I didn't see this bug before. Looks interesting but I can't find a way to reproduce it.

Grabbed an OpenBSD 5.0 VM and git revision 6451d2704fd0742680b485fb1d3690e251860073 (equivalent to bzr 868), set MALLOC_OPTIONS=S, spent a while with a tcp killer, random packet loss, and interrupted logins, and... nothing.

Gave up after a while and just started trying the same thing in Linux under valgrind. Not a single invalid read. It did find a bunch of memory leaks, but I'm not going to debug memory leaks of bitlbee 3.0.4.

Given that this ticket is old, the debug information provided is not enough, and it seems to rely on very specific network conditions (or something else I might be missing, maybe even protocol details that changed since then), I'm closing it, as it's not going anywhere.

If this somehow still happens, feel free to reopen.
