Ticket #53 (new enhancement)

Opened 9 years ago

Last modified 4 years ago

Remote encoding...

Reported by: Clojster Owned by: jelmer
Priority: wishlist Milestone:
Component: OSCAR Version: 1.0
Keywords: charset oscar Cc:
IRC client+version: Client-independent Operating System: Linux
OS version/distro:

Description

Hi, it'd be great if BitlBee supported changing the remote charset. Not all ICQ users use UTF-8, so conversion to my local charset produces messy text. To be more specific, I'd like to change the charset in which BitlBee receives messages (CP1250 in my case, as most of my friends use Windows). A further step would be to set the remote charset per user, per group, etc.

Attachments

bitlbee-recode.diff (2.6 KB) - added by waker@… 8 years ago.
patch which adds recoding capabilities to bitlbee
bitlbee-recode.2.diff (5.7 KB) - added by wakeroid@… 8 years ago.
patch which adds recoding capabilities to bitlbee (updated)
bitlbee-recode-0.4.diff (7.4 KB) - added by waker 8 years ago.
oscar-recode-0.4
bitlbee-recode-0.5.diff (7.2 KB) - added by waker 8 years ago.
partial support for recoding offline messages (see http://bugs.bitlbee.org/bitlbee/ticket/221)
8bit-charset.diff (24.6 KB) - added by darkk 8 years ago.
patch for configurable encoding for icq accounts
bitlbee-recode-0.6.diff (9.1 KB) - added by waker 8 years ago.
added recoding of user info for 1.0.3
bitlbee-recode-0.6.1.diff (8.7 KB) - added by newman 7 years ago.
patch against 1.1.1dev, not fully functional (regression)
bitlbee-recode-0.6.2.patch (7.3 KB) - added by hmich 7 years ago.
patch against bitlbee 1.2
bitlbee-recode-0.6.3.patch (7.1 KB) - added by newman 6 years ago.
Runs OK with bitlbee-1.2.3, if anyone else, after three years, still cares…

Change History

comment:1 Changed 9 years ago by wilmer

  • Component changed from BitlBee to OSCAR

Most likely this is only a problem with OSCAR, so I'll reassign this. Per-buddy will be extremely nasty, I hope we can avoid that. So not all recent ICQ clients support Unicode (UTF-16, actually, not UTF-8) yet?

comment:2 Changed 9 years ago by Clojster

Yes, I agree that this is an OSCAR problem... And "no". Windows clients send messages in CP1250 (in my country, where Czech is the default language), but they can receive messages in UTF-8 with no problem. The best thing would be if BitlBee detected the encoding in which a message was sent and converted it to the local charset accordingly. But I don't know if that's even possible... does the OSCAR protocol send some info about the encoding in each message?
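If the protocol gave no encoding hint at all, one common fallback is a heuristic: try a strict UTF-8 decode first and only fall back to the legacy charset when that fails. The sketch below illustrates that idea in Python; it is not taken from BitlBee or from any patch on this ticket, and the function name `guess_decode` and the `cp1250` default are assumptions chosen to match the report.

```python
def guess_decode(raw: bytes, fallback: str = "cp1250") -> str:
    """Decode raw bytes, preferring UTF-8 and falling back to a legacy charset.

    Hypothetical helper: accented text in a single-byte charset is rarely
    valid UTF-8, so a strict UTF-8 decode usually fails for it.
    """
    try:
        return raw.decode("utf-8")  # strict by default
    except UnicodeDecodeError:
        # Some byte values are undefined in cp1250, so replace rather than raise.
        return raw.decode(fallback, errors="replace")


text = "no vidím, že už tě tak napůl mají"
print(guess_decode(text.encode("utf-8")) == text)
print(guess_decode(text.encode("cp1250")) == text)
```

The heuristic can misfire on short messages whose legacy bytes happen to form valid UTF-8, which is one reason an explicit per-message flag or a user setting is more reliable.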

comment:3 Changed 9 years ago by wilmer

  • Keywords charset oscar added
  • Owner set to jelmer

Oops, yes, it seems I didn't read your report very well. In that case, it should be possible for the ICQ code to recognize the charset and convert it to UTF-8 (which is the internal charset for BitlBee) automatically. I hope Jelmer will be able to figure this out from the right specs? :-)

comment:4 Changed 9 years ago by Clojster

Wow, that would be GREAT! Now I got weird message... One of my contacts wrote me something which produced that messy text (he uses standard Mirabilis ICQ5) and last word of that message was the encoding. But I don't know why was it there because it appeared only in that one message... The message was as follows: "no vidím, Âe u tì tak napñl mají [cp1250]" Correct message would be: "no vidím, že už tě tak napůl mají" I don't think this really helps, but... whatever...
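The kind of garbling described here can be illustrated with a small Python sketch: the same CP1250 bytes are decoded once under a wrong assumption (ISO-8859-1) and once under the right one. This is only a demonstration of the mechanism; it does not reproduce the exact garbled string quoted above, since the conversion path BitlBee and the terminal actually took is not known from the ticket.

```python
correct = "no vidím, že už tě tak napůl mají"

# What a CP1250 client would put on the wire.
wire = correct.encode("cp1250")

# Receiver wrongly assumes ISO-8859-1: every byte maps to *some* character,
# so nothing fails, but the Czech letters come out wrong.
misread = wire.decode("iso-8859-1")

# Receiver knows the sender's charset: the text round-trips.
reread = wire.decode("cp1250")

print(misread == correct)
print(reread == correct)
```

Letters that share a code point in both charsets (such as "í") survive the wrong decode, which is why such messages tend to look only partly broken.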

comment:5 Changed 9 years ago by wilmer

Okay, I just asked Jelmer about this, I'll post it here FYI:

15:04:27     jelmer| wilmer: Yeah, but I'll postpone that until I've got the
                     Win32 port up and running.
15:04:42     jelmer| wilmer: I'd like to get it right in my oscar rewrite
                     rather than fixing it in the current implementation.

So I hope you can wait for just another while, and it'll work! :-)

comment:6 Changed 9 years ago by Clojster

Well, what else can I do than "wait" :)) But I'm glad it will work someday. BTW: You guys are doing great work! This is what I was looking for for a long time. And as soon as you implement those "groups" features and filetransfers, this piece of software will be flawless :) Keep up great work and I hope to see new version with correct oscar charsets soon ;)

comment:7 Changed 9 years ago by wilmer

BTW, if you don't feel that much like waiting, there might be a temporary solution, at least if you use only ICQ. You can disable charset conversion by setting charset to none, and then BitlBee shouldn't do any translation at all. Then just talk cp1250 (IIRC?) to BitlBee and BitlBee will just pass it as-is.

I'm not sure if it'll work, but it might just be a solution for now. Good luck!
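As a rough illustration of what "charset none" would mean, the Python sketch below models the conversion step between BitlBee's internal text and the IRC client. It is an assumption-laden toy, not BitlBee code: `to_client` is a hypothetical name, and the real implementation is C.

```python
def to_client(message: bytes, charset: str) -> bytes:
    """Convert an internal UTF-8 message to the user's charset.

    Hypothetical model: with charset "none" the bytes are passed through
    untouched, whatever encoding they happen to be in.
    """
    if charset.lower() == "none":
        return message
    return message.decode("utf-8").encode(charset, errors="replace")


cp1250_bytes = "že".encode("cp1250")

# Pass-through: CP1250 bytes from a contact reach the IRC client as CP1250.
print(to_client(cp1250_bytes, "none") == cp1250_bytes)

# Normal mode: internal UTF-8 is converted to the configured charset.
print(to_client("že".encode("utf-8"), "cp1250") == cp1250_bytes)
```

The pass-through case only displays correctly if the IRC client and terminal themselves interpret bytes as CP1250.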

comment:8 Changed 9 years ago by Clojster

Thanks for advice, but I think I'd rather wait... Because if I understand it well, I will have to change terminal fonts to some CP1250, locales to 1250 etc... or am I wrong?

comment:9 Changed 8 years ago by anonymous

hi! i wrote a little patch for bitlbee, which allows setting the remote encoding and recodes the message if it's not in unicode.

patch adds a new set-variable "oscar_recode_charset", which controls recoding behavior (original code just uses iso8859-1).

so, for cp1251 u can simply type set oscar_recode_charset cp1251

i successfully tested it with irssi and russian cp1251 encoding.

see attach, and thanks for such a great piece of software!

WBR, Alexey "waker" Yakovenko <waker@…>
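The behavior described in this comment (recode only when the message is not Unicode, using a configurable charset instead of a fixed ISO-8859-1) can be sketched as follows. This is an illustrative Python model, not the attached C patch: `recode_incoming` and its parameters are hypothetical, and the use of big-endian UTF-16 for the Unicode case is an assumption.

```python
def recode_incoming(payload: bytes, is_unicode: bool,
                    recode_charset: str = "iso-8859-1") -> str:
    """Turn an incoming message payload into text.

    is_unicode     -- whether the sender flagged the message as Unicode
    recode_charset -- stand-in for the oscar_recode_charset setting
    """
    if is_unicode:
        # Assumption: Unicode payloads are UTF-16, big-endian.
        return payload.decode("utf-16-be", errors="replace")
    # Not Unicode: trust the user's setting rather than a fixed charset.
    return payload.decode(recode_charset, errors="replace")


greeting = "привет"
print(recode_incoming(greeting.encode("cp1251"), False, "cp1251") == greeting)
print(recode_incoming(greeting.encode("utf-16-be"), True) == greeting)
```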

Changed 8 years ago by waker@…

patch which adds recoding capabilities to bitlbee

Changed 8 years ago by wakeroid@…

patch which adds recoding capabilities to bitlbee (updated)

comment:10 Changed 8 years ago by wilmer

Hmmm, nice. Wouldn't it maybe be nice to also make this patch somehow send a flag to indicate the charset used to encode the message? I don't know how easy this is though, I haven't read the OSCAR "specs" very well yet...

Implementing this would probably be easier in the storage-xml branch by the way, since it adds support for per-account settings. So then you can just type something like "account set oscar/charset CP1250" and you're done. And you can set it per-account, if you want.
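To make the per-account idea concrete, here is a minimal Python sketch of accounts that each carry their own settings, with a fallback when a setting is absent. The class, the field names, the `charset` key and the `utf-8` default are all hypothetical; the storage-xml branch's actual structures are not shown in this ticket.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Account:
    """Hypothetical IM account with its own settings table."""
    protocol: str
    handle: str
    settings: Dict[str, str] = field(default_factory=dict)


def account_charset(account: Account, default: str = "utf-8") -> str:
    """Return the account's charset setting, or the default if unset."""
    return account.settings.get("charset", default)


icq = Account("oscar", "123456")
icq.settings["charset"] = "CP1250"   # roughly what "account set oscar/charset CP1250" would do
jabber = Account("jabber", "someone@example.org")

print(account_charset(icq))
print(account_charset(jabber))
```

Keeping the setting on the account means the protocol module never has to read a BitlBee-wide value.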

comment:11 Changed 8 years ago by waker

unfortunately i don't have enough expertise in bitlbee hacking to implement such stuff.

however the good news is that the patch performs extremely well for me. i tested it with miranda, icq2003a (mirabilis client), qip and centericq, and the only buggy client was &rq, which can't _receive_ utf8 text by default, though it should be possible to fix that using &rq's settings, and it's unrelated to the job of my patch.

question is why would one need different oscar charsets for 2 accounts on same machine? though it can be done easily i think..

comment:12 Changed 8 years ago by wilmer

With storage-xml it's the easiest. I try to keep BitlBee-wide settings completely out of the IM-modules (there are only some references to the debug setting at some places) and instead introduced per-account settings in that branch.

One advantage of having different charset settings per account could be so that you could, if necessary, have a separate account with a different charset for people who use a different charset.

Sure, it's hackish, but isn't having to use different charsets for different people hackish in general? ;-)

I'll give a shot at a storage-xml port some day then (not too much development time for the next few weeks though, unfortunately).

comment:13 Changed 8 years ago by anonymous

i've added support for recoding outgoing messages, and going to add recoding of offline messages (broken too). will post diff in 1-2 days.

after that i'll test it for some days, and if it'll work i gonna checkout latest cvs and experiment with xml-whatever branch (really hate xml! why u wanna use it?!)

comment:14 Changed 8 years ago by wilmer

Sounds good!

And for XML, it's a pretty decent format for this kind of thing. Users usually won't have to edit the files by hand (I'm not a big fan of editing XML conffiles by hand myself either) so it doesn't matter that much.

And also XML is pretty easy to parse because there are enough parsers available. It's certainly (in many ways) a huge improvement over the old format.

comment:15 Changed 8 years ago by waker

here we go.. updated patch for recoding, recodes both incoming and outgoing messages as well as offline messages.

Changed 8 years ago by waker

oscar-recode-0.4

Changed 8 years ago by waker

partial support for recoding offline messages (see http://bugs.bitlbee.org/bitlbee/ticket/221)

comment:16 Changed 8 years ago by Clojster

Wow, that's great to see that someone is actually doing something about this... It would be great though, if you guys added this patch to the next release... what do you think?

Changed 8 years ago by darkk

patch for configurable encoding for icq accounts

Changed 8 years ago by waker

added recoding of user info for 1.0.3

comment:17 Changed 7 years ago by anonymous

Could somebody please port these patches to current 1.1.1dev? Thank you in advance!

comment:18 Changed 7 years ago by newman

Could someone please make it (= bitlbee-recode) work with the recent 1.1.1dev version? Thanks.

comment:19 Changed 7 years ago by newman

Attaching patch for v1.1.1dev. It's not possible to change oscar_recode_charset via set (set oscar_recode_charset iso-8859-2), it's hardcoded in patch.

Until fixed, do

%s/cp1250/iso88590-1/g

on patch, for example.

The problem is


assam bitlbee-1.1.1dev-new # make

  • Compiling irc.c

irc.c: In function 'irc_new':

irc.c:112: warning: passing argument 4 of 'set_add' from incompatible pointer type

make[1]: Entering directory `/usr/src/bitlbee-1.1.1dev-new/lib'

  • Compiling misc.c

please, anyone fix it. I was not able to figure it out.

Changed 7 years ago by newman

patch against 1.1.1dev, not fully functional (regression)

comment:20 Changed 7 years ago by newman

tested finally. hardcoded charset works as intended. setting via set is not possible, please fix.

comment:21 Changed 7 years ago by newman

Running for several weeks and seems OK to me. Once, after some weeks of continuous running, BitlBee stopped recoding; restarting the service fixed it.

comment:22 follow-up: ↓ 24 Changed 7 years ago by wilmer

Hmm, instead of having a hardcoded setting, this patch should be able to use per-account settings now. Actually I should probably apply the patch to the main tree like that.

comment:23 Changed 7 years ago by wilmer

BTW, is it a good idea to set the AIM_IMFLAGS_ISO_8859_1 flag while in fact the message isn't really coded in that charset?
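One way to avoid mislabelling is to pick the narrowest label that is actually true for the outgoing text and fall back to Unicode otherwise. The Python sketch below shows that selection logic only; it is not what the patch does, `encode_outgoing` is a hypothetical name, the string labels merely stand in for the real message flags, and big-endian UTF-16 for the Unicode case is an assumption.

```python
from typing import Tuple


def encode_outgoing(text: str) -> Tuple[str, bytes]:
    """Return (label, payload) where the label truthfully describes the payload."""
    for label, codec in (("ascii", "ascii"), ("iso-8859-1", "iso-8859-1")):
        try:
            return label, text.encode(codec)
        except UnicodeEncodeError:
            continue  # text does not fit this charset, try the next one
    # Anything else goes out as Unicode (assumed UTF-16, big-endian).
    return "unicode", text.encode("utf-16-be")


for sample in ("hello", "café", "že už tě"):
    label, payload = encode_outgoing(sample)
    print(sample, label, len(payload))
```

Under this scheme Czech or Russian text would never be sent with an ISO-8859-1 label; whether every receiving client copes with Unicode messages is a separate question that this sketch does not answer.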

comment:24 in reply to: ↑ 22 Changed 7 years ago by newman

Replying to wilmer:

Hmm, instead of having a hardcoded setting, this patch should be able to use per-account settings now. Actually I should probably apply the patch to the main tree like that.

Yup, right. See the warning while compiling

  • Compiling irc.c

irc.c: In function 'irc_new':

irc.c:112: warning: passing argument 4 of 'set_add' from incompatible pointer type

-- that should be the cause of the hardcoding problem, but at the time I didn't know how to fix it.

Replying to wilmer:

BTW, is it a good idea to set the AIM_IMFLAGS_ISO_8859_1 flag while in fact the message isn't really coded in that charset?

I really do not know. I just reworked the patch so that it applies and compiles cleanly; I'm not that familiar with the code.

Please report back when patch pushed, so I can check out recent bzr.

comment:25 Changed 7 years ago by newman

What's up? Is it already upstream?

Changed 7 years ago by hmich

patch against bitlbee 1.2

comment:26 Changed 6 years ago by newman

So what? Please review and push this patch into upstream.

comment:27 Changed 6 years ago by wilmer

Your attitude is broken. Please review and push it into your brain.

(And yes, this will happen at some point.)

Changed 6 years ago by newman

Runs OK with bitlbee-1.2.3, if anyone else, after three years, still cares...

comment:28 Changed 6 years ago by newman

Sorry for the previous tone, but it's frustrating to have it for three years unfixed. Patch for recent version attached.

comment:29 Changed 6 years ago by wilmer

The patch still touches BitlBee's irc structure to read this setting, protocol modules really should use their own set_t now... I may try to do this myself, but don't know when I'll have time for that.

comment:30 Changed 4 years ago by anonymous

I ported the latest patch by newman to the current bzr version. But it won't build:

  • Compiling oscar.c

oscar.c: In function ‘get_oscar_recode_charset’:
oscar.c:978:26: error: ‘struct im_connection’ has no member named ‘irc’
make[2]: * [oscar.o] Error 1
make[2]: Leaving directory `/home/virus_found/abs/bitlbee/src/bitlbee-build/protocols/oscar'
make[1]: * [oscar] Error 2
make[1]: Leaving directory `/home/virus_found/abs/bitlbee/src/bitlbee-build/protocols'
make: * [protocols] Error 2

comment:31 Changed 4 years ago by Wilmer van der Gaast <wilmer@…>

Try replacing it with bee. Structs got moved around a little bit.

comment:32 Changed 4 years ago by anonymous

Thank you, it builds fine now. But I've yet to test it. If someone is interested, a patch that applies to bzr but is untested is here: http://sprunge.us/YDOI
