Official BS.Player forums

Brdja 10th October 2004 11:11 PM

Unicode for subtitles
 
I haven't found a way to show Unicode subtitles in BSP, and I think there is no such option [I'm not sure, correct me if I'm wrong].

So here is my suggestion/request: make it possible for BSP to show subtitles in SUB/SRT [or any other] format that are encoded in UTF-8.

mmthor 26th October 2004 07:16 AM

Yup! I agree. Support for Unicode (UTF-8) in text files is important. Not only would we no longer have to choose a font and script, which means less trouble for users, but most importantly it allows different languages to be displayed at the same time.

goulo 18th December 2004 08:14 PM

yes, please fix UTF-8
 
I was initially drawn to BSPlayer specifically because of reports that it handles UTF-8 files, so it's a bummer that it doesn't. (Did it at one time, and then the UTF-8 code broke? I have read info from a user who says he used to be able to do UTF-8 .srt files and now they no longer work.) Having to use a specific ISO-8859 code page is a real hassle and greatly restricts the font options.

Preferably the UTF-8 support would not require a BOM at the start of the file, either.

mmthor 21st December 2004 11:52 PM

There were reports that BSPlayer was able to handle UTF-8 files? Really? I've only been using it since last year, but I've never heard of it being able to do so. Too bad.

On the other hand, I understand that a lot of people dislike BOM. But even if it's not required, I hope BSplayer won't ignore it or misinterpret it if it's present.
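A minimal sketch of the BOM-tolerant behaviour discussed here, in Python rather than in BSPlayer's own code (which is not shown in this thread). The function name `read_subtitle_text` is hypothetical; the `utf-8-sig` codec is part of Python's standard library and strips a leading BOM if one is present, while accepting files without one.

```python
import codecs


def read_subtitle_text(raw: bytes) -> str:
    """Decode UTF-8 subtitle bytes, accepting but not requiring a BOM.

    Hypothetical helper for illustration only.
    """
    # 'utf-8-sig' removes a leading EF BB BF if present and otherwise
    # behaves like plain 'utf-8'.
    return raw.decode("utf-8-sig")


line = "Saluton, ĉiuj!"
with_bom = codecs.BOM_UTF8 + line.encode("utf-8")
without_bom = line.encode("utf-8")

# Both variants decode to the same text, with no stray BOM character.
print(read_subtitle_text(with_bom) == read_subtitle_text(without_bom))
```

Decoding this way means the BOM is neither required nor shown as a stray character at the start of the first subtitle.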

Or maybe a simpler solution is to use a new extension. Why are subtitle files called SRT, by the way? If S stands for "sub" and T stands for "title", what does R stand for??

I would suggest using .ust as the extension for UTF-8 subtitle files (UTF-8 only, not UCS-2, UCS-4 or UTF-16). How does this sound?

Brdja 22nd December 2004 03:44 PM

Quote:

Originally Posted by mmthor
Or maybe a simpler solution is to use a new extension. Why are subtitle files called SRT, by the way? If S stands for "sub" and T stands for "title", what does R stand for??

SubRip Title format...

Quote:

Originally Posted by mmthor
I would suggest using .ust as the extension for UTF-8 subtitle files (UTF-8 only, not UCS-2, UCS-4 or UTF-16). How does this sound?

MPlayer uses the .UTF extension for subtitles in the UTF-8 codeset, and then determines which format the subtitles are in [MicroDVD (sub), SubRip (srt), etc.]. I think this would be a good solution for BSPlayer too.
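Determining the format from the content rather than the extension could look roughly like the sketch below. It is written in Python for illustration and is not MPlayer's or BSPlayer's actual detection code. It assumes the usual layouts of the two formats named above: MicroDVD lines begin with `{start_frame}{end_frame}`, and SubRip cues carry a timing line such as `00:00:01,000 --> 00:00:04,000`. The function and pattern names are hypothetical.

```python
import re

# MicroDVD lines look like: {start_frame}{end_frame}Text
MICRODVD_LINE = re.compile(r"^\{\d+\}\{\d+\}")
# SubRip timing lines look like: 00:00:01,000 --> 00:00:04,000
SUBRIP_TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2},\d{3}\s*-->\s*\d{2}:\d{2}:\d{2},\d{3}"
)


def sniff_subtitle_format(text: str) -> str:
    """Guess the subtitle format from already-decoded text (hypothetical helper)."""
    for line in text.splitlines():
        line = line.strip()
        if MICRODVD_LINE.match(line):
            return "microdvd"
        if SUBRIP_TIMING.match(line):
            return "subrip"
    return "unknown"


print(sniff_subtitle_format("{0}{25}Hello\n{30}{60}World"))
print(sniff_subtitle_format("1\n00:00:01,000 --> 00:00:04,000\nHello\n"))
```

With this split, the extension only needs to signal the encoding, and the format is worked out from the first recognisable line.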

mmthor 24th December 2004 08:39 AM

SubRip Title !? :? Ripped Sub Title would make a lot more sense. Anyway...

I don't know what mplayer is. But .utf (or .utf8) just means text file in Unicode UTF-8. In other words, calling it .utf is just like renaming a .srt extension back to .txt. We can't say it's wrong, but it's certainly not precise enough.

Brdja 25th December 2004 03:09 PM

Quote:

Originally Posted by mmthor
SubRip Title !? :? Ripped Sub Title would make a lot more sense. Anyway...

But then that would not be SRT ;) Maybe there is some software called SubRip for ripping subtitles from DVDs, and that is where the srt extension comes from...

Quote:

Originally Posted by mmthor
I don't know what mplayer is.

MPlayer is the best media player for GNU/Linux. There is a win32 port too, but this port doesn't have a GUI :(

Quote:

Originally Posted by mmthor
But .utf (or .utf8) just means text file in Unicode UTF-8. In other words, calling it .utf is just like renaming a .srt extension back to .txt. We can't say it's wrong, but it's certainly not precise enough.

In my opinion just .utf is better. I think that most players automatically recognize the format of a subtitle file no matter what its extension is.

But the best solution seems to be no new extensions at all, just automatic recognition of whether a .sub, .srt, etc. file is in the UTF-8 codeset, without any inline identifiers. I do not know how to do that, but I know it is possible. UltraEdit does this UTF identification very well...
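One common way to do this kind of recognition without any marker in the file is simply to try a strict UTF-8 decode and fall back to a legacy code page if it fails. The Python sketch below shows that heuristic; it is not a description of how UltraEdit does it, which this thread does not say. The function name and the `cp1252` fallback are assumptions chosen for the example.

```python
from typing import Tuple


def decode_subtitle_bytes(raw: bytes, fallback: str = "cp1252") -> Tuple[str, str]:
    """Return (text, encoding_used) for raw subtitle bytes.

    Hypothetical helper: tries strict UTF-8 first, then a legacy code page.
    """
    try:
        # Strict decoding fails on byte sequences that are not valid UTF-8.
        # 'utf-8-sig' also accepts (and strips) an optional BOM.
        return raw.decode("utf-8-sig"), "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8, so treat it as the configured 8-bit code page.
        return raw.decode(fallback, errors="replace"), fallback


print(decode_subtitle_bytes("ĉiam".encode("utf-8")))
print(decode_subtitle_bytes("café".encode("cp1252")))
```

The heuristic relies on the fact that text in an 8-bit code page containing accented letters is very unlikely to form valid UTF-8 by accident, while pure ASCII files decode identically either way.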

adicoto 25th December 2004 07:42 PM

SubRip is software for ripping subtitles from DVD images to text files; that is where the SRT extension and its specific format come from. SUB is from MicroDVD, another specific format.
All modern players can work out what type of subtitle a file contains, no matter what extension is used.
And except for ssa, the most widely used subtitle files are text-based ANSI files.
Why so much fuss about UTF? I have never had a subtitle fail to work correctly in BSPlayer, no matter whether it was srt, txt, or sub.

adicoto 25th December 2004 08:05 PM

Just went to learn something more about UTF. Visited:

http://www.macchiato.com/unicode/Uni...criptions.html

It seems I can't read these correctly:

Canadian aboriginal
Cherokee
Deseret
Ethiopic
Khmer
Ogham
Runic and
Sinhala

These characters are not supported in txt format. So the file format would need to be changed, wouldn't it? All software for editing subtitles would need to be adjusted, no?
Windows can't read them. If Windows can't, will BSPlayer be able to display them?

goulo 28th December 2004 02:01 AM

Unicode, UTF-8, etc.
 
Windows most certainly does handle UTF-8 files, as do many modern applications. All modern web browsers easily display webpages with charset=utf-8. UTF-8 has become arguably the best solution to handling non-ASCII Unicode text. Any ASCII characters continue to be just 1 byte in UTF-8 (so any plain ASCII file is trivially also UTF-8), while non-ASCII characters are encoded with 2 or more bytes. Any Unicode character is representable in UTF-8.
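The byte counts are easy to check. The short Python sketch below (the helper name `utf8_length` is just for illustration) encodes one ASCII letter, one Esperanto letter, one Runic letter, and one Deseret letter, and also confirms that an ASCII-only string produces identical bytes in ASCII and in UTF-8.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


samples = [
    "A",           # U+0041, plain ASCII
    "\u0109",      # U+0109, Esperanto c with circumflex
    "\u16a0",      # U+16A0, a Runic letter
    "\U00010400",  # U+10400, a Deseret letter
]

for ch in samples:
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s)")

# An ASCII-only file is byte-for-byte the same when saved as UTF-8.
print("plain ASCII".encode("ascii") == "plain ASCII".encode("utf-8"))
```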

I personally am interested in this for making subtitles in Esperanto. Currently with BSPlayer, as far as I know, I must use Latin-3 (aka ISO-8859-3 aka South European) coding, which limits the number of fonts available to me and is a generally less appealing older encoding method. UTF-8 nicely handles all of Unicode instead of forcing you to use different encodings for different languages. UTF-8 thus also permits different languages to be mixed together, which is not possible with Latin-3 and the like: they are all 1-byte encodings and so can represent only 256 characters instead of all Unicode characters. That sucks if, e.g., some character has a French name or German name or whatever and needs letters not in Latin-3.
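The limitation can be demonstrated with Python's built-in `iso-8859-3` codec. In this sketch the helper `fits_in` and the sample strings are made up for illustration, and Cyrillic is used as the second language simply because it is certain to lie outside Latin-3.

```python
def fits_in(text: str, encoding: str) -> bool:
    """True if every character of text can be represented in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


esperanto = "Eĥoŝanĝo ĉiuĵaŭde"
mixed = esperanto + " / Привет"

print(fits_in(esperanto, "iso-8859-3"))  # Esperanto letters are in Latin-3
print(fits_in(mixed, "iso-8859-3"))      # Cyrillic is not
print(fits_in(mixed, "utf-8"))           # UTF-8 covers both
```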

Markus Kuhn has a nice FAQ about Unicode, UTF-8, and all that:
http://www.cl.cam.ac.uk/~mgk25/unicode.html

BTW, as Brdja observes, UltraEdit certainly easily detects if a file is UTF-8 or not. Yes, a program that was written assuming all characters are 1-byte will need some rewriting. But it's not a fundamentally hard problem to process UTF-8 text.


All times are GMT +1.