Modify

#790 closed defect (fixed)

Crash in ssl_gnutls.c

Reported by: anonymous Owned by:
Priority: normal Milestone:
Component: BitlBee Version: devel
Keywords: Cc:
IRC client+version: Client-independent Operating System: Public server
OS version/distro: Ubuntu 10.10 x86-64

Description

Bitlbee (running bzr-788) crashes in ssl_gnutls.c under unknown conditions. Backtrace follows:

Program received signal SIGSEGV, Segmentation fault.

0x0000000001286210 in ?? ()

(gdb) bt

#0 0x0000000001286210 in ?? ()

#1 0x0000000000439a86 in ssl_connected (data=0x12f4da0, source=-1, cond=B_EV_IO_READ) at ssl_gnutls.c:128

#2 0x000000000043752f in gaim_io_connected (data=0x12a5610, source=13, cond=B_EV_IO_WRITE) at proxy.c:97

#3 0x0000000000431a88 in gaim_io_invoke (source=0x1279150, condition=G_IO_NVAL, data=0x12f29c0) at events_glib.c:85

#4 0x00007f108daf3342 in g_main_context_dispatch () from /lib/libglib-2.0.so.0

#5 0x00007f108daf72a8 in ?? () from /lib/libglib-2.0.so.0

#6 0x00007f108daf77b5 in g_main_loop_run () from /lib/libglib-2.0.so.0

#7 0x0000000000431a0b in b_main_run () at events_glib.c:64

#8 0x000000000042f2e6 in main (argc=6, argv=0x7fffb58d4188) at unix.c:177

Hope this is somewhat useful, I get crashes from this about once a week.

Attachments (0)

Change History (10)

comment:1 Changed at 2011-05-05T12:43:36Z by anonymous

Correction: I'm running bzr-779!

comment:2 Changed at 2011-05-12T08:19:22Z by wilmer

So you don't know what it was doing at the time?

It looks like it failed to connect to some SSL server. Could be any IM account, or authentication for one. Or maybe a Twitter API poll. The backtrace won't show that because this is all async I/O.

If on the next crash you could type something like "print *ssl_connected::conn" that may help in finding out what's going on.

comment:3 Changed at 2011-05-12T12:26:21Z by anonymous

I was asleep at the time. I still have the gdb session up:

(gdb) print *ssl_connected::conn $1 = {func = 0x12843f0, data = 0x1286210, fd = 1667591790, established = 1735289204, inpa = 544106784, session = 0x2e73646e6f636573,

xcred = 0x2e}

I'll hop on #bitlbee this morning and see if you're there.

comment:4 Changed at 2011-06-06T16:28:48Z by anonymous

Hi, I ran the process in valgrind and it crashed again. The cause of the crash was a loss of network connectivity. Attached are the valgrind logs, running the latest bzr branch FYI (I believe the crash was in pid 32726, I'm not sure what the other pids were from):

==32716== Memcheck, a memory error detector
==32716== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==32716== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
==32716== Command: /usr/local/sbin/bitlbee -p 6667 -P /var/run/bitlbee.pid -F
==32716== Parent PID: 8526
==32716==
==32716==
==32716== HEAP SUMMARY:
==32716==     in use at exit: 29,248 bytes in 397 blocks
==32716==   total heap usage: 2,639 allocs, 2,242 frees, 363,762 bytes allocated
==32716==
==32716== LEAK SUMMARY:
==32716==    definitely lost: 0 bytes in 0 blocks
==32716==    indirectly lost: 0 bytes in 0 blocks
==32716==      possibly lost: 3,301 bytes in 21 blocks
==32716==    still reachable: 25,947 bytes in 376 blocks
==32716==         suppressed: 0 bytes in 0 blocks
==32716== Rerun with --leak-check=full to see details of leaked memory
==32716==
==32716== For counts of detected and suppressed errors, rerun with: -v
==32716== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)
==8441== Invalid write of size 8
==8441==    at 0x41B296: bee_irc_channel_chat_part (irc_im.c:910)
==8441==    by 0x41E14A: irc_cmd_part (irc_commands.c:234)
==8441==    by 0x41FA41: irc_exec (irc_commands.c:779)
==8441==    by 0x417813: irc_process (irc.c:393)
==8441==    by 0x411F05: bitlbee_io_current_client_read (bitlbee.c:226)
==8441==    by 0x431B97: gaim_io_invoke (events_glib.c:85)
==8441==    by 0x527A341: g_main_context_dispatch (in /lib/libglib-2.0.so.0.2600.1)
==8441==    by 0x527E2A7: ??? (in /lib/libglib-2.0.so.0.2600.1)
==8441==    by 0x527E7B4: g_main_loop_run (in /lib/libglib-2.0.so.0.2600.1)
==8441==    by 0x431B1A: b_main_run (events_glib.c:64)
==8441==    by 0x42F3F5: main (unix.c:177)
==8441==  Address 0x30 is not stack'd, malloc'd or (recently) free'd
==8441==
==8441==
==8441== HEAP SUMMARY:
==8441==     in use at exit: 160,147 bytes in 1,910 blocks
==8441==   total heap usage: 18,686 allocs, 16,776 frees, 1,164,958 bytes allocated
==8441==
==8398== Invalid write of size 8
==8398==    at 0x41B296: bee_irc_channel_chat_part (irc_im.c:910)
==8398==    by 0x41E14A: irc_cmd_part (irc_commands.c:234)
==8398==    by 0x41FA41: irc_exec (irc_commands.c:779)
==8398==    by 0x417813: irc_process (irc.c:393)
==8398==    by 0x411F05: bitlbee_io_current_client_read (bitlbee.c:226)
==8398==    by 0x431B97: gaim_io_invoke (events_glib.c:85)
==8398==    by 0x527A341: g_main_context_dispatch (in /lib/libglib-2.0.so.0.2600.1)
==8398==    by 0x527E2A7: ??? (in /lib/libglib-2.0.so.0.2600.1)
==8398==    by 0x527E7B4: g_main_loop_run (in /lib/libglib-2.0.so.0.2600.1)
==8398==    by 0x431B1A: b_main_run (events_glib.c:64)
==8398==    by 0x42F3F5: main (unix.c:177)
==8398==  Address 0x30 is not stack'd, malloc'd or (recently) free'd
==8398==
==8441== LEAK SUMMARY:
==8441==    definitely lost: 60 bytes in 1 blocks
==8441==    indirectly lost: 240 bytes in 10 blocks
==8441==      possibly lost: 2,853 bytes in 21 blocks
==8441==    still reachable: 156,994 bytes in 1,878 blocks
==8441==         suppressed: 0 bytes in 0 blocks
==8441== Rerun with --leak-check=full to see details of leaked memory
==8441==
==8441== For counts of detected and suppressed errors, rerun with: -v
==8441== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)
==8398==
==8398== HEAP SUMMARY:
==8398==     in use at exit: 160,147 bytes in 1,910 blocks
==8398==   total heap usage: 18,840 allocs, 16,930 frees, 1,172,119 bytes allocated
==8398==
==8398== LEAK SUMMARY:
==8398==    definitely lost: 60 bytes in 1 blocks
==8398==    indirectly lost: 240 bytes in 10 blocks
==8398==      possibly lost: 4,673 bytes in 54 blocks
==8398==    still reachable: 155,174 bytes in 1,845 blocks
==8398==         suppressed: 0 bytes in 0 blocks
==8398== Rerun with --leak-check=full to see details of leaked memory
==8398==
==8398== For counts of detected and suppressed errors, rerun with: -v
==8398== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)
==32726== Invalid read of size 4
==32726==    at 0x414A2A: ipc_child_cmd_opermsg (ipc.c:276)
==32726==    by 0x415451: ipc_command_exec (ipc.c:536)
==32726==    by 0x4157E3: ipc_child_read (ipc.c:640)
==32726==    by 0x431B97: gaim_io_invoke (events_glib.c:85)
==32726==    by 0x527A341: g_main_context_dispatch (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x527E2A7: ??? (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x527E7B4: g_main_loop_run (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x431B1A: b_main_run (events_glib.c:64)
==32726==    by 0x42F3F5: main (unix.c:177)
==32726==  Address 0x6eef264 is 4 bytes inside a block of size 176 free'd
==32726==    at 0x4C27D71: free (vg_replace_malloc.c:366)
==32726==    by 0x4174BC: irc_free (irc.c:281)
==32726==    by 0x41719B: irc_abort (irc.c:210)
==32726==    by 0x411DB9: bitlbee_io_current_client_read (bitlbee.c:199)
==32726==    by 0x431B97: gaim_io_invoke (events_glib.c:85)
==32726==    by 0x527A341: g_main_context_dispatch (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x527E2A7: ??? (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x527E7B4: g_main_loop_run (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x431B1A: b_main_run (events_glib.c:64)
==32726==    by 0x42F3F5: main (unix.c:177)
==32726==
==32726== Invalid read of size 1
==32726==    at 0x4C28E95: __GI_strchr (mc_replace_strmem.c:144)
==32726==    by 0x414A48: ipc_child_cmd_opermsg (ipc.c:279)
==32726==    by 0x415451: ipc_command_exec (ipc.c:536)
==32726==    by 0x4157E3: ipc_child_read (ipc.c:640)
==32726==    by 0x431B97: gaim_io_invoke (events_glib.c:85)
==32726==    by 0x527A341: g_main_context_dispatch (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x527E2A7: ??? (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x527E7B4: g_main_loop_run (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x431B1A: b_main_run (events_glib.c:64)
==32726==    by 0x42F3F5: main (unix.c:177)
==32726==  Address 0x6eef2b0 is 80 bytes inside a block of size 176 free'd
==32726==    at 0x4C27D71: free (vg_replace_malloc.c:366)
==32726==    by 0x4174BC: irc_free (irc.c:281)
==32726==    by 0x41719B: irc_abort (irc.c:210)
==32726==    by 0x411DB9: bitlbee_io_current_client_read (bitlbee.c:199)
==32726==    by 0x431B97: gaim_io_invoke (events_glib.c:85)
==32726==    by 0x527A341: g_main_context_dispatch (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x527E2A7: ??? (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x527E7B4: g_main_loop_run (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x431B1A: b_main_run (events_glib.c:64)
==32726==    by 0x42F3F5: main (unix.c:177)
==32726==
==32726== Invalid read of size 1
==32726==    at 0x4C28EB6: __GI_strchr (mc_replace_strmem.c:144)
==32726==    by 0x414A48: ipc_child_cmd_opermsg (ipc.c:279)
==32726==    by 0x415451: ipc_command_exec (ipc.c:536)
==32726==    by 0x4157E3: ipc_child_read (ipc.c:640)
==32726==    by 0x431B97: gaim_io_invoke (events_glib.c:85)
==32726==    by 0x527A341: g_main_context_dispatch (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x527E2A7: ??? (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x527E7B4: g_main_loop_run (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x431B1A: b_main_run (events_glib.c:64)
==32726==    by 0x42F3F5: main (unix.c:177)
==32726==  Address 0x6eef2b1 is 81 bytes inside a block of size 176 free'd
==32726==    at 0x4C27D71: free (vg_replace_malloc.c:366)
==32726==    by 0x4174BC: irc_free (irc.c:281)
==32726==    by 0x41719B: irc_abort (irc.c:210)
==32726==    by 0x411DB9: bitlbee_io_current_client_read (bitlbee.c:199)
==32726==    by 0x431B97: gaim_io_invoke (events_glib.c:85)
==32726==    by 0x527A341: g_main_context_dispatch (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x527E2A7: ??? (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x527E7B4: g_main_loop_run (in /lib/libglib-2.0.so.0.2600.1)
==32726==    by 0x431B1A: b_main_run (events_glib.c:64)
==32726==    by 0x42F3F5: main (unix.c:177)
==32726==
==32726==
==32726== HEAP SUMMARY:
==32726==     in use at exit: 118,534 bytes in 561 blocks
==32726==   total heap usage: 3,333,335 allocs, 3,332,774 frees, 649,152,224 bytes allocated
==32726==
==32726== LEAK SUMMARY:
==32726==    definitely lost: 82,286 bytes in 326 blocks
==32726==    indirectly lost: 501 bytes in 48 blocks
==32726==      possibly lost: 18,701 bytes in 102 blocks
==32726==    still reachable: 17,046 bytes in 85 blocks
==32726==         suppressed: 0 bytes in 0 blocks
==32726== Rerun with --leak-check=full to see details of leaked memory
==32726==
==32726== For counts of detected and suppressed errors, rerun with: -v
==32726== ERROR SUMMARY: 5 errors from 3 contexts (suppressed: 4 from 4)
==29300==
==29300== HEAP SUMMARY:
==29300==     in use at exit: 21,247 bytes in 127 blocks
==29300==   total heap usage: 6,907 allocs, 6,780 frees, 643,024 bytes allocated
==29300==
==29300== LEAK SUMMARY:
==29300==    definitely lost: 316 bytes in 2 blocks
==29300==    indirectly lost: 240 bytes in 10 blocks
==29300==      possibly lost: 3,821 bytes in 32 blocks
==29300==    still reachable: 16,870 bytes in 83 blocks
==29300==         suppressed: 0 bytes in 0 blocks
==29300== Rerun with --leak-check=full to see details of leaked memory
==29300==
==29300== For counts of detected and suppressed errors, rerun with: -v
==29300== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)


comment:5 Changed at 2011-06-10T22:04:35Z by anonymous

Hi, back again. I think the above log from valgrind is related to another crash. This is what I'm seeing with the SSL-related functions. These occur in the same place as noted by gdb:

==30505== Invalid read of size 8
==30505==    at 0x439B7E: ssl_connected (ssl_gnutls.c:128)
==30505==    by 0x43846C: s5_canread_again (proxy.c:385)
==30505==    by 0x431B97: gaim_io_invoke (events_glib.c:85)
==30505==    by 0x527A341: g_main_context_dispatch (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x527E2A7: ??? (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x527E7B4: g_main_loop_run (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x431B1A: b_main_run (events_glib.c:64)
==30505==    by 0x42F3F5: main (unix.c:177)
==30505==  Address 0x815f2a0 is 0 bytes inside a block of size 48 free'd
==30505==    at 0x4C27D71: free (vg_replace_malloc.c:366)
==30505==    by 0x439F27: ssl_disconnect (ssl_gnutls.c:246)
==30505==    by 0x44EB7A: jabber_logout (jabber.c:304)
==30505==    by 0x43F3F2: imc_logout (nogaim.c:342)
==30505==    by 0x43CB6B: account_on_timeout (account.c:376)
==30505==    by 0x527AB1A: ??? (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x527A341: g_main_context_dispatch (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x527E2A7: ??? (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x527E7B4: g_main_loop_run (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x431B1A: b_main_run (events_glib.c:64)
==30505==    by 0x42F3F5: main (unix.c:177)
==30505==
==30505== Invalid read of size 8
==30505==    at 0x439B85: ssl_connected (ssl_gnutls.c:128)
==30505==    by 0x43846C: s5_canread_again (proxy.c:385)
==30505==    by 0x431B97: gaim_io_invoke (events_glib.c:85)
==30505==    by 0x527A341: g_main_context_dispatch (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x527E2A7: ??? (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x527E7B4: g_main_loop_run (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x431B1A: b_main_run (events_glib.c:64)
==30505==    by 0x42F3F5: main (unix.c:177)
==30505==  Address 0x815f2a8 is 8 bytes inside a block of size 48 free'd
==30505==    at 0x4C27D71: free (vg_replace_malloc.c:366)
==30505==    by 0x439F27: ssl_disconnect (ssl_gnutls.c:246)
==30505==    by 0x44EB7A: jabber_logout (jabber.c:304)
==30505==    by 0x43F3F2: imc_logout (nogaim.c:342)
==30505==    by 0x43CB6B: account_on_timeout (account.c:376)
==30505==    by 0x527AB1A: ??? (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x527A341: g_main_context_dispatch (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x527E2A7: ??? (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x527E7B4: g_main_loop_run (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x431B1A: b_main_run (events_glib.c:64)
==30505==    by 0x42F3F5: main (unix.c:177)
==30505==

==30505== Invalid free() / delete / delete[]
==30505==    at 0x4C27D71: free (vg_replace_malloc.c:366)
==30505==    by 0x439BA1: ssl_connected (ssl_gnutls.c:129)
==30505==    by 0x43846C: s5_canread_again (proxy.c:385)
==30505==    by 0x431B97: gaim_io_invoke (events_glib.c:85)
==30505==    by 0x527A341: g_main_context_dispatch (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x527E2A7: ??? (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x527E7B4: g_main_loop_run (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x431B1A: b_main_run (events_glib.c:64)
==30505==    by 0x42F3F5: main (unix.c:177)
==30505==  Address 0x815f2a0 is 0 bytes inside a block of size 48 free'd
==30505==    at 0x4C27D71: free (vg_replace_malloc.c:366)
==30505==    by 0x439F27: ssl_disconnect (ssl_gnutls.c:246)
==30505==    by 0x44EB7A: jabber_logout (jabber.c:304)
==30505==    by 0x43F3F2: imc_logout (nogaim.c:342)
==30505==    by 0x43CB6B: account_on_timeout (account.c:376)
==30505==    by 0x527AB1A: ??? (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x527A341: g_main_context_dispatch (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x527E2A7: ??? (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x527E7B4: g_main_loop_run (in /lib/libglib-2.0.so.0.2600.1)
==30505==    by 0x431B1A: b_main_run (events_glib.c:64)
==30505==    by 0x42F3F5: main (unix.c:177)
==30505==

comment:6 Changed at 2011-06-11T17:55:52Z by wilmer

That last "Invalid free()" is very interesting and should be the real problem here.

account_on_timeout in the backtrace may also give me a good clue on how to reproduce this. Thanks! I hope to have a chance/time to look at this soonish..

comment:7 Changed at 2011-08-01T12:58:04Z by anonymous

Hi Wilmer,

I'm pretty sure I know the cause of this. Last night, the jabber.org server quit responding. After the final reconnect following the 900 second timeout, the bitlbee process died. This has happened before when my server has lost network connectivity. It appears that losing connectivity to an SSL-connected Jabber server is the cause. Hopefully you can take a look at this again when you have the time.

Thanks!

comment:8 Changed at 2011-10-28T19:50:25Z by anonymous

Is https://bugzilla.redhat.com/show_bug.cgi?id=749895 the same as this? it looks somewhat similar to my non-coder eyes. Bitlbee is crashing every few minutes for me with that trace.

comment:9 Changed at 2011-12-19T20:47:57Z by wilmer

Original submitter: Now that I look at this more precisely, I know what's going on, and this is a known issue with GLib-based event handling. If you use libevent, you should not have this problem. What happens is that BitlBee gives up on a connection, closes it, but then a few moments later the event handler for it still gets called. With libevent, event handlers do get killed properly.

I just found an easy way to fix this for glib too though. This crash should now be gone. Apologies for taking so long. If I were using glib on public servers I would've noticed this one a lot more often myself, probably.

changeset:devel,852.

And no, the RedHat/NSS bug is unrelated. I'd comment on that but but I don't have an account there..

comment:10 Changed at 2011-12-19T20:48:06Z by wilmer

Resolution: fixed
Status: newclosed

Modify Ticket

Action
as closed The ticket will remain with no owner.
The resolution will be deleted. Next status will be 'reopened'.

Add Comment


E-mail address and name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.