Bug #2370

<meta charset="windows-1251"> is not handled

Added by Wain . over 6 years ago. Updated over 6 years ago.

Status: Fixed
Start date: 11/15/2014
Priority: Normal
Due date:
Assignee: Andreas Smas
% Done: 100%
Category: API
Target version: 4.8
Found in version: 4.7.478
Platform: Linux

Description

I am currently developing a plugin for rutracker.org and found what is probably a bug.
Looking at the Showtime sources on GitHub (though I'm not really good at C), I assume the .toString() method of the HTTP response object has 2 ways of detecting the page encoding:
1. Using 'Content-Type' response header
2. Using <meta http-equiv=... charset="windows-1251"> tag.
However, some sites (including http://rutracker.org) use a separate meta tag to set the encoding, like this:
<meta charset="windows-1251">
Showtime can't detect it, so all plugin text is displayed incorrectly.
I don't really know whether it's a valid way of doing things, but I think an additional check should be added for such a tag.
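
To illustrate the three cases above, here is a minimal sketch of how such a check could look in plain JavaScript. It is not Showtime's actual implementation (that lives in C); the function name detectCharset and the 4096-character scan limit are assumptions made only for this example.

```javascript
// Hypothetical helper, not part of Showtime: guess a page's charset from
// the Content-Type header value and the beginning of the HTML text.
function detectCharset(contentType, html) {
  // 1. charset parameter of the Content-Type response header
  var m = /charset\s*=\s*["']?([^\s;"']+)/i.exec(contentType || '');
  if (m) return m[1].toLowerCase();

  // 2 and 3. Scan <meta> tags near the top of the document. The same
  // pattern covers both the http-equiv form (charset inside the content
  // attribute) and the standalone <meta charset="..."> form.
  var head = (html || '').slice(0, 4096); // assumed limit for this sketch
  var metaRe = /<meta\b[^>]*>/gi;
  var tag;
  while ((tag = metaRe.exec(head)) !== null) {
    m = /charset\s*=\s*["']?([^\s;"'\/>]+)/i.exec(tag[0]);
    if (m) return m[1].toLowerCase();
  }

  // Nothing found: the caller has to pick a default.
  return null;
}
```

The header is checked first and the meta tags only serve as a fallback; whether Showtime uses the same order is not something I have verified.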

Associated revisions

Revision 45f10406
Added by Andreas Smas over 6 years ago

ecmascript/http: Add convertFromEncoding() to response object

Fixes #2370

Change included in version 4.7.567

History

#1 Updated by Andreas Smas over 6 years ago

  • Status changed from New to Need feedback

Maybe it would be best if the response object got a new method like:

.convertFromEncoding() which takes one argument (the character encoding)

What do you think about that?

I'm a bit opposed to adding a lot of weird logic to Showtime to handle the various websites out there.

#2 Updated by Wain . over 6 years ago

Andreas Öman wrote:

Maybe it would be best if the response object got a new method like:

.convertFromEncoding() which takes one argument (the character encoding)

What do you think about that?

I'm a bit opposed to adding a lot of weird logic to Showtime to handle the various websites out there.

Sure, this would be great and solve the problem.
By the way, torrent support is an awesome feature!

#3 Updated by Leonid Protasov over 6 years ago

  • Subject changed from Showtime cannot detect corrent page encoding in certain cases to <meta charset="windows-1251"> is not handled

#4 Updated by Andreas Smas over 6 years ago

  • Target version set to 4.8

#5 Updated by Andreas Smas over 6 years ago

  • Status changed from Need feedback to Fixed
  • % Done changed from 0 to 100

#6 Updated by Andreas Smas over 6 years ago

example:

var http = require('showtime/http');

var x = http.request('http://rutracker.org/', {
  debug: true
});

print(x.convertFromEncoding('windows-1251'));
