#640 closed defect (fixed)

Twitter: DOS newline not ignored

Reported by: ilf@…
Owned by: geert
Priority: normal
Milestone:
Component: Twitter
Version: devel
Keywords:
Cc:
IRC client+version: Client-independent
Operating System: Public server
OS version/distro:

Description

Unfortunately, some Tweets include DOS newlines (^M).

These are not visible on the web: https://twitter.com/EFF/status/17525392536

But they are visible in the XML: wget "https://api.twitter.com/1/statuses/show.xml?id=17525392536"

  <text>ACLU: America is riddled with politically motivated surveillance^M
http://eff.org/r.m8W</text>

Clients like irssi interpret that newline as the end of one message, starting a new one after ^M.

Probably BitlBee should strip these.
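
To illustrate the request: IRC delimits messages with CR-LF, so a CR or LF inside the tweet text ends the message early. The following is a minimal Python sketch of the idea, not BitlBee's actual code (BitlBee is written in C). The function name `strip_line_breaks` is hypothetical, and replacing each break with a single space is an assumption about the desired behaviour.

```python
def strip_line_breaks(text):
    """Replace every CRLF, lone CR, or lone LF with one space."""
    # Handle CRLF first so that it becomes one space rather than two.
    return text.replace("\r\n", " ").replace("\r", " ").replace("\n", " ")


# The tweet text from the XML above, with the DOS newline written out.
tweet = ("ACLU: America is riddled with politically motivated surveillance\r\n"
         "http://eff.org/r.m8W")
print(strip_line_breaks(tweet))
```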

Attachments (1)

twitter-strip_newslines.patch (2.5 KB) - added by Daniel Albers <daniel@…> at 2012-01-13T17:55:27Z.


Change History (8)

comment:1 Changed at 2010-07-02T08:27:45Z by ilf

Ok, so the ^M doesn't seem to be the problem, but the CRLF.

Twitter itself says:

Do not depend on a newline to delimit a complete JSON message, as text fields may also contain newlines.

http://apiwiki.twitter.com/User-Stream-Implementation-Suggestions

So a new line should not be a new message.

There's intentional use of that though, which only works if every line starts at the same point:

https://twitter.com/tw1tt3rart/status/17526089918

I suggest fixing newlines, so they do not create new messages, but a new line in a message. While also adding an option to disable line breaks in Tweets completely. I don't want to have them :)

comment:2 Changed at 2010-07-03T08:35:51Z by Wilmer van der Gaast <wilmer@…>

> I suggest fixing newlines, so they do not create new messages, but a new
> line in a message. While also adding an option to disable line breaks in
> Tweets completely. I don't want to have them :)


So as you may know, the IRC protocol does not support multiline
messages, so what you're requesting here is not possible.

(Except for the option of course, that one *is* possible.)

Also, the twitterart thing, does it really use newlines, or does it just
depend on exactly the right linewrapping? It seems to be the latter.
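
As a rough illustration of the protocol constraint: since every IRC message ends at CR-LF, the closest a gateway can get to a multiline tweet is one PRIVMSG per line. This Python sketch is only an assumption about how that could look. The helper `to_privmsgs`, the sender `eff`, and the target `#twitter` are hypothetical and not taken from BitlBee.

```python
def to_privmsgs(sender, target, text):
    """Turn possibly multiline text into one IRC PRIVMSG per non-empty line."""
    # str.splitlines() treats CRLF, CR and LF (and a few other
    # separators) as line boundaries.
    lines = [line for line in text.splitlines() if line.strip()]
    return [":%s PRIVMSG %s :%s\r\n" % (sender, target, line) for line in lines]


for message in to_privmsgs("eff", "#twitter", "first line\r\nsecond line"):
    print(repr(message))
```

Each list element is a complete IRC message, which is why a client such as irssi shows the tweet as several separate messages.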

comment:3 Changed at 2010-07-03T10:04:38Z by ilf

Of course. I remembered trigger.pl or twirssi, which use irssi newlines, but BitlBee needs to speak IRC.

The TwitterArt thing does indeed use newlines on purpose, see https://api.twitter.com/1/statuses/show.xml?id=17526089918

That the line width works on https://twitter.com/ does not mean it works on clients, too.

But yeah, I'd like an option to ignore newlines. My impression is most clients do this, so maybe this should be default, too.

comment:4 Changed at 2011-02-15T10:28:28Z by überRegenbogen

Some of the text graphics tweets do use newlines; but the only place in the web interface that they matter is when viewing the individual tweet with the mobile interface—where they become <br /> in the page source. In the new non-mobile web interface and in lists they are passed as raw newlines, and (per normal HTML behaviour) they and any adjacent whitespace characters are rendered as a single space. (I haven't checked the behaviour of the old non-mobile web interface; but all of this is only of auxiliary importance anyway.)

I also see tweets (particularly from TechMech) with gobs of tab characters as well as newlines, causing an annoying multi-line mess.

What I would like to see is an option to observe HTML whitespace rules, i.e. convert each group of whitespace characters (space|tab|cr|lf|etc) into a single space. (A further option to override this when certain characters, e.g. block graphics, are present might be nice; but that is a luxury of far less importance.)
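
A minimal Python sketch of this suggestion, under stated assumptions: the function name `collapse_whitespace` and the `keep_block_graphics` flag are hypothetical, and "block graphics" is taken to mean the Unicode Block Elements range U+2580 to U+259F, which the comment does not specify.

```python
import re

# Space, tab, CR, LF, form feed and vertical tab.
_WHITESPACE_RUN = re.compile(r"[ \t\r\n\f\v]+")


def collapse_whitespace(text, keep_block_graphics=False):
    """Convert each run of whitespace characters into a single space."""
    if keep_block_graphics and any("\u2580" <= ch <= "\u259f" for ch in text):
        # Text art depends on its line breaks, so leave it untouched.
        return text
    return _WHITESPACE_RUN.sub(" ", text)


print(collapse_whitespace("tabs\t\t\tand\r\nnewlines"))
```

Leading and trailing whitespace is collapsed to one space rather than removed. Trimming it would be a separate decision.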

Changed at 2012-01-13T17:55:27Z by Daniel Albers <daniel@…>

comment:5 Changed at 2012-01-13T17:59:23Z by Daniel Albers <daniel@…>

Hi,

although I prefer the oneliners, I thought both behaviors could be desirable, so I wrote a tiny patch (using bzr send -o for lack of git format-patch :) that adds the boolean setting 'strip_newlines' to the Twitter protocol.

Usage:

acc <twitter acc #> set strip_newlines true

Cheers, Daniel
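
For readers without the attachment, here is a hedged Python sketch of how a boolean setting like this could gate the behaviour. It is not the attached patch: the `settings` dictionary and the function `format_tweet` are hypothetical, and defaulting to "off" when the setting is absent is an assumption.

```python
def format_tweet(text, settings):
    """Return the tweet text, flattened to one line if strip_newlines is on."""
    if settings.get("strip_newlines", False):
        # Treat CRLF as a single break, then replace any remaining CR or LF.
        for line_break in ("\r\n", "\r", "\n"):
            text = text.replace(line_break, " ")
    return text


account_settings = {"strip_newlines": True}
print(format_tweet("line one\r\nline two", account_settings))
```

With the setting off, the text is returned unchanged, so tweets that use line breaks on purpose keep them.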

comment:6 Changed at 2012-01-30T21:41:35Z by wilmer

Resolution: fixed
Status: new → closed

comment:7 Changed at 2012-01-30T21:42:18Z by wilmer

And yes, this should be a setting. I don't mind the newlines and in some tweets they're useful/improve readability/etc.
