Opened at 2007-11-27T11:32:14Z
Closed at 2007-12-02T11:07:54Z
#330 closed defect (fixed)
Segfault when sending 'ç' (c-cedilla) over jabber
Reported by: | | Owned by: | |
---|---|---|---|
Priority: | critical | Milestone: | |
Component: | Jabber | Version: | 1.1.1dev |
Keywords: | segfault jabber special char | Cc: | |
IRC client+version: | Client-independent | Operating System: | Linux |
OS version/distro: | Debian sid | | |
Description
BitlBee version 1.1.1dev segfaults when sending a ç (c-cedilla) over Jabber. Backtrace:
#0 0xb7ef64e7 in g_markup_escape_text () from /usr/lib/libglib-2.0.so.0
#1 0x08078197 in xt_to_string_real (node=0x80cb220, str=0x80c4630) at xmltree.c:282
#2 0x080781d4 in xt_to_string_real (node=0x80dcf40, str=0x80c4630) at xmltree.c:288
#3 0x08078229 in xt_to_string (node=0x80dcf40) at xmltree.c:299
#4 0x08071331 in jabber_write_packet (ic=0x80caba8, node=0x80dcf40) at io.c:35
#5 0x080749ee in jabber_buddy_msg (ic=0x80caba8, who=0x80dd238 "anonymous@server", message=0x80cb3ef "ç", flags=0) at jabber.c:302
#6 0x0806b26e in imc_buddy_msg (ic=0x80caba8, handle=0x80dd238 "anonymous@server", msg=0x80cb3ef "ç", flags=0) at nogaim.c:975
#7 0x0805a998 in buddy_send_handler (irc=0x80c7578, u=0x80dc160, msg=0x80cb3ef "ç", flags=0) at irc.c:1079
#8 0x0805a6b2 in irc_send (irc=0x80c7578, nick=0x80cb3e8 "renzo", s=0x80cb3ef "ç", flags=0) at irc.c:1006
#9 0x0805b9bd in irc_cmd_privmsg (irc=0x80c7578, cmd=0x80c9ff8) at irc_commands.c:265
#10 0x0805ca5a in irc_exec (irc=0x80c7578, cmd=0x80c9ff8) at irc_commands.c:650
#11 0x08058991 in irc_process (irc=0x80c7578) at irc.c:331
#12 0x0805544e in bitlbee_io_current_client_read (data=0x80c7578, fd=8, cond=GAIM_INPUT_READ) at bitlbee.c:181
#13 0x08063c17 in gaim_io_invoke (source=0x80c7610, condition=G_IO_IN, data=0x80c7600) at events_glib.c:84
#14 0xb7f214ed in ?? () from /usr/lib/libglib-2.0.so.0
#15 0x080c7610 in ?? ()
#16 0x00000001 in ?? ()
#17 0x080c7600 in ?? ()
#18 0xb7f6277c in ?? () from /usr/lib/libglib-2.0.so.0
#19 0xbfcc76dc in ?? ()
#20 0x080c7668 in ?? ()
#21 0xbfcc76f8 in ?? ()
#22 0xb7ef21c6 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
Offending code:
(gdb) x/10i $eip
0xb7ef64e7 <g_markup_escape_text+87>: movzbl (%esi),%eax
0xb7ef64ea <g_markup_escape_text+90>: mov -0x10(%ebp),%edx
0xb7ef64ed <g_markup_escape_text+93>: movsbl (%eax,%edx,1),%eax
0xb7ef64f1 <g_markup_escape_text+97>: lea (%eax,%esi,1),%edi
Registers:
eax 0x80c44b0 135021744
ecx 0x80c44b0 135021744
edx 0x21ce9 138473
ebx 0xb7f6277c -1208604804
esp 0xbfcc6fa0 0xbfcc6fa0
ebp 0xbfcc6fc8 0xbfcc6fc8
esi 0x80fe000 135258112
edi 0x80fe000 135258112
eip 0xb7ef64e7 0xb7ef64e7 <g_markup_escape_text+87>
eflags 0x10287 [ CF PF SF IF RF ]
Attachments (0)
Change History (8)
comment:1 Changed at 2007-11-27T11:36:56Z by
comment:2 Changed at 2007-11-27T11:47:30Z by
Additional details: if my client is set up to send chars in UTF-8, it doesn't crash. It looks like the crash is due to invalid UTF-8 text being passed to g_markup_escape_text.
comment:3 Changed at 2007-12-01T16:42:16Z by
This one sucks and I can reproduce it. :-( I hope I can do something with this Valgrind output...
comment:4 Changed at 2007-12-01T17:13:00Z by
Bah, this is really just a GLib bug. This little program:
#include <string.h>
#include <stdio.h>
#include <glib.h>

int main()
{
    gchar *dommeglib = "Hállø!\n";

    printf( "%s\n", g_markup_escape_text( dommeglib, strlen( dommeglib ) ) );
}
Generates a segfault. (Debian GLib 2.14.1-5)
comment:5 Changed at 2007-12-01T18:57:46Z by
Brilliant! g_markup_printf_escaped works properly. I'd say this is a bug, even though official GNOME software is working around this (their own) bug...
comment:6 Changed at 2007-12-01T21:22:35Z by
Urgh, never mind, actually g_markup_printf_escaped sucks too, just that I used a slightly different test string.
comment:7 Changed at 2007-12-01T23:13:51Z by
Camino is a piece of crap and ate my long comment here. Here's a summary:
In short, there's some code that parses UTF-8 chars to make sure they're copied as-is and parts of it aren't converted to &#xxxx; sequences. It can easily skip past the end of the string if there's some invalid UTF-8 sequence at the end of the string, and even better, when that happens it will walk through the rest of your RAM until SIGSEGV attacks, because of: while (p != end). while (p < end) would be so much safer... (Although still not fully reliable, yes.)
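The failure mode described above can be sketched in plain C. This is not GLib's code: `utf8_seq_len`, `utf8_walk_overshoots` and `utf8_complete_prefix` are hypothetical names, and the length rule is a simplified reading of the UTF-8 lead byte. Assuming the client sent ISO-8859-1, ç arrives as the single byte 0xE7, which a UTF-8 parser reads as the lead byte of a three-byte sequence, so a walk that jumps by the declared length lands past the end and a `p != end` test never fires.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sequence length implied by a UTF-8 lead byte.
 * Simplified: bytes that cannot start a sequence count as length 1. */
static size_t utf8_seq_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Simulates a walk that trusts the declared length. Returns 1 if the
 * position jumps past len without ever being equal to it, which is the
 * case where a `p != end` loop would keep running. No byte beyond len
 * is read here; only the index arithmetic is simulated. */
static int utf8_walk_overshoots(const char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL);
    while (i < len)
        i += utf8_seq_len((unsigned char)s[i]);
    return i != len;
}

/* Bounded walk: returns how many leading bytes form complete sequences,
 * stopping before a sequence whose declared length would cross len. */
static size_t utf8_complete_prefix(const char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL);
    while (i < len) {
        size_t n = utf8_seq_len((unsigned char)s[i]);
        if (n > len - i)
            break;          /* truncated sequence at the end of the buffer */
        i += n;
    }
    return i;
}
```

The bounded version only guards against running off the end. It does not check continuation bytes, so it is not a full validator, which matches the "still not fully reliable" remark.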
Anyway, I guess I'll just have to fix this by refusing messages that aren't encoded in the right charset, since this is not the only place where charset mismatches cause ugly behaviour.
comment:8 Changed at 2007-12-02T11:07:54Z by
Resolution: | → fixed |
---|---|
Status: | new → closed |
I'm pushing a change now that makes BitlBee reject lines that caused iconv failures. So instead of a crash, you'd get this error message now:
ERROR: Charset mismatch detected. The charset setting is currently set to utf-8, so please make sure your IRC client will send and accept text in that charset, or tell BitlBee which charset to expect by changing the charset setting. See `help set charset' for more information. Your message was ignored.
I hope this is clear enough. :-)
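For illustration, a check along these lines could look like the sketch below. It is not the code that was pushed: `line_converts_cleanly` is a hypothetical helper, and it assumes the POSIX `iconv()` interface, treating any conversion error other than a full output buffer as a charset mismatch.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper, not BitlBee's actual code: returns 1 if `line`
 * converts cleanly from charset `from` to UTF-8, 0 otherwise. */
static int line_converts_cleanly(const char *from, const char *line, size_t len)
{
    iconv_t cd;
    char buf[256];
    char *in = (char *)line;    /* iconv() takes char **, input is not modified */
    size_t in_left = len;
    int ok = 1;

    assert(from != NULL && line != NULL);

    cd = iconv_open("UTF-8", from);
    if (cd == (iconv_t)-1)
        return 0;               /* unknown charset name */

    while (in_left > 0) {
        char *out = buf;
        size_t out_left = sizeof(buf);

        /* The converted text is discarded; only success or failure matters.
         * E2BIG just means buf filled up, so loop again with a fresh buffer. */
        if (iconv(cd, &in, &in_left, &out, &out_left) == (size_t)-1
            && errno != E2BIG) {
            ok = 0;             /* EILSEQ (invalid) or EINVAL (truncated) */
            break;
        }
    }

    iconv_close(cd);
    return ok;
}
```

A caller would drop the line and send the error message when this returns 0, instead of handing the bytes on to the XML escaping code.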