Bug #1944

ST won't autodetect ANSI (cp1251) in icymeta

Added by Leonid Protasov about 7 years ago. Updated about 7 years ago.

Status: Fixed
Start date: 02/08/2014
Priority: Normal
Due date:
Assignee: Andreas Smas
% Done: 100%
Category: Audio
Target version: 4.6
Found in version: Latest
Platform: Linux

Description

icymeta [DEBUG]: 0x000000: 53 74 72 65 61 6d 54 69  74 6c 65 3d 27 c1 e8 eb    StreamTitle='...
icymeta [DEBUG]: 0x000010: e0 ed 20 c4 e8 ec e0 20  2d 20 c4 ee f2 ff ed e8    .. .... - ......
icymeta [DEBUG]: 0x000020: f1 fc 27 3b                                         ..';

That corresponds to: Билан Дима - Дотянись
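
As a quick check of the claim above, the dumped bytes can be decoded by hand. This is only a sketch in Python, not code from the player; the variable names are made up for illustration.

```python
# Payload bytes copied from the hex dump above (whitespace is ignored by fromhex).
payload = bytes.fromhex(
    "53 74 72 65 61 6d 54 69 74 6c 65 3d 27 c1 e8 eb "
    "e0 ed 20 c4 e8 ec e0 20 2d 20 c4 ee f2 ff ed e8 "
    "f1 fc 27 3b"
)

# Decoding as cp1251 (Windows-1251) yields readable Cyrillic text.
as_cp1251 = payload.decode("cp1251")
print(as_cp1251)

# The same bytes are not valid UTF-8, so a strict UTF-8 decode fails.
try:
    payload.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False
print("valid UTF-8:", is_valid_utf8)
```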

I wonder why that is not detected properly.

For debugging icymeta, it would be nice to log a line showing which charset the autodetection picked.
That output could be moved under settings:dev so it does not flood the log.

History

#1 Updated by Leonid Protasov about 7 years ago

This one is decoded properly because it is UTF-8, but text in ANSI and similar encodings is not. It looks like it should work the same way as for subtitles: if the title is not UTF-8, detect the codepage and convert the title to UTF-8.

00:04:39.450: icymeta [DEBUG]:0x000000: 53 74 72 65 61 6d 54 69  74 6c 65 3d 27 d0 9a d1    StreamTitle='...
00:04:39.464: icymeta [DEBUG]:0x000010: 80 d0 b5 d0 bc d0 b0 d1  82 d0 be d1 80 d0 b8 d0    ................
00:04:39.486: icymeta [DEBUG]:0x000020: b9 20 2d 20 d0 a0 d0 b5  d0 b0 d0 bd d0 b8 d0 bc    . - ............
00:04:39.500: icymeta [DEBUG]:0x000030: d0 b0 d1 86 d0 b8 d0 be  d0 bd d0 bd d0 b0 d1 8f    ................
00:04:39.534: icymeta [DEBUG]:0x000040: 20 d0 9c d0 b0 d1 88 d0  b8 d0 bd d0 b0 27 3b        ............';
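
The "UTF-8 first, otherwise fall back" idea from this comment can be sketched roughly as follows. This is an assumption-laden illustration in Python, not the player's actual code: `decode_icy_title` and its `fallback` parameter are hypothetical names, and a real implementation would run charset detection instead of using a fixed fallback codepage.

```python
def decode_icy_title(raw: bytes, fallback: str = "cp1251") -> str:
    """Decode ICY metadata text, preferring UTF-8 (hypothetical helper)."""
    try:
        # Strict decode: any invalid byte sequence means the text is not UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: a fixed fallback codepage stands in for autodetection here.
        return raw.decode(fallback, errors="replace")


# The same word encoded two ways decodes to the same string.
from_utf8 = decode_icy_title("Дима".encode("utf-8"))
from_ansi = decode_icy_title("Дима".encode("cp1251"))
print(from_utf8, from_ansi)
```

The strict UTF-8 check is reliable because Cyrillic text in a single-byte codepage almost never forms valid UTF-8 sequences; picking the right single-byte codepage afterwards is the hard part.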

#2 Updated by Leonid Protasov about 7 years ago

Hmm, now I see that it does. The problem is that ANSI detection does not always pick the right codepage. For example, this title is ANSI and was decoded properly:

icymeta [DEBUG]: 0x000000: 53 74 72 65 61 6d 54 69  74 6c 65 3d 27 ce ea e5    StreamTitle='...
icymeta [DEBUG]: 0x000010: e0 ed 20 c5 eb fc e7 e8  20 2d 20 d1 f2 f0 b3 eb    .. ..... - .....
icymeta [DEBUG]: 0x000020: ff e9 27 3b 53 74 72 65  61 6d 55 72 6c 3d 27 68    ..';

#3 Updated by Leonid Protasov about 7 years ago

  • Status changed from New to Fixed
  • % Done changed from 0 to 100

I believe it can only be fixed by setting the codepage manually in settings :(

#4 Updated by Andreas Smas about 7 years ago

It's the exact same code that's used to convert. The problem is that there are too few characters so the scoring algorithm probably mis-detects it as something else.
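
To illustrate why a short title is hard to classify, here is a small Python sketch. It does not reproduce the player's scoring algorithm; it only shows that the same few bytes decode without error under several single-byte codepages, so a detector has nothing but character statistics to go on. The candidate list is an arbitrary choice for this example.

```python
# First ten title bytes from the original report ("Билан Дима" in cp1251).
title = bytes.fromhex("c1 e8 eb e0 ed 20 c4 e8 ec e0")

# Arbitrary set of single-byte codepages chosen for illustration.
candidates = ["cp1251", "koi8_r", "iso8859_5", "cp1252"]

decodings = {}
for name in candidates:
    try:
        decodings[name] = title.decode(name)
    except UnicodeDecodeError:
        # A codepage with no mapping for some byte would be ruled out here.
        pass

# Every candidate accepts the bytes, each producing different text.
for name, text in decodings.items():
    print(f"{name:10s} {text}")
```

Since none of the candidates can be rejected outright, the choice rests on how plausible each decoded text looks, and with only a handful of letters that estimate is noisy.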
