Google Blogoscoped

Forum

Unicode Most Popular Encoding

James Xuan [PersonRank 10]

Monday, May 5, 2008
16 years ago · 4,824 views

All hail Unicode!

Peteris Krumins [PersonRank 1]

16 years ago #

Unicode is not an encoding, though.

Tony Ruscoe [PersonRank 10]

16 years ago #

Peteris, what is it then? I always thought it was an encoding standard. And so does the Unicode website:

<< The Unicode Standard is the universal character encoding standard used for representation of text for computer processing. >>

From: http://www.unicode.org/standard/principles.html

Martin Porcheron [PersonRank 10]

16 years ago #

I was always under the impression that UTF was a character encoding and Unicode was just the standard.

Marcin Sochacki (Wanted) [PersonRank 10]

16 years ago #

I think both are encodings:

Unicode defines how to encode text into numbers (code points).

UTF-(7,8,16,32) define how to encode a series of Unicode code points into a file (this deals with issues like legacy 7-bit systems, endianness, visual compatibility with ASCII, etc.)
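
To make the two layers concrete, here is a minimal Python sketch. The sample string is an arbitrary choice for illustration, not something from the thread. It first maps characters to code points, then maps those code points to bytes with a few UTF encodings, which is where ASCII compatibility and byte order show up.

```python
# Illustrative sketch: the sample text is an arbitrary assumption.
text = "A\u00e9\u20ac"  # "A", "é" (U+00E9), "€" (U+20AC)

# Layer 1: each character corresponds to a Unicode code point (a number).
code_points = [ord(ch) for ch in text]

# Layer 2: a UTF encoding serializes those code points as bytes.
utf8 = text.encode("utf-8")         # ASCII characters stay single bytes
utf16_le = text.encode("utf-16-le")  # 16-bit units, little-endian
utf16_be = text.encode("utf-16-be")  # same units, big-endian byte order
utf32_le = text.encode("utf-32-le")  # fixed 4 bytes per code point

print([f"U+{cp:04X}" for cp in code_points])
for name, data in [("UTF-8", utf8), ("UTF-16LE", utf16_le),
                   ("UTF-16BE", utf16_be), ("UTF-32LE", utf32_le)]:
    print(f"{name:9} {len(data):2} bytes  {data.hex(' ')}")
```

The code points are identical in every case; only the byte sequences differ, which is the distinction being argued about in this thread.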

Peteris Krumins [PersonRank 1]

16 years ago #

As I understand it, Unicode is just a standard that defines all the possible characters in existence (by assigning each of them a "code point" number). It does not say how they will be represented in computer memory; it just defines them.

An encoding actually defines how the code points will be represented in memory.

Jallan [PersonRank 0]

16 years ago #

And this standard also defines how they can be variously represented officially in computer memory. Obviously you understood wrongly. Read the standard.

Peteris Krumins [PersonRank 1]

16 years ago #

Unicode itself is a standard that defines several of those encodings.

George R [PersonRank 10]

16 years ago #

According to the chart, if you combine the entries for ASCII and the similar Western European encoding, they are about twice as frequent as Unicode.

http://en.wikipedia.org/wiki/How_to_Lie_with_Statistics
http://en.wikipedia.org/wiki/Lies,_damned_lies,_and_statistics

TOMHTML [PersonRank 10]

16 years ago #

BTW, Blogger with FTP is not really ISO compliant... :-/

Unicode Guy [PersonRank 0]

16 years ago #

Web servers provide bytes, not Unicode characters. Peteris is 100% correct, and the rest of you should [attack removed]
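
The "bytes, not characters" point can be sketched in a few lines of Python. The response body below is a made-up example, not real server output: the same bytes only become text once a charset is chosen, and a wrong choice either garbles the text silently or fails outright.

```python
# Hypothetical response body: the bytes a server might send for "café €".
body = "caf\u00e9 \u20ac".encode("utf-8")

# Decoding with the charset the bytes were written in recovers the text.
as_utf8 = body.decode("utf-8")

# Latin-1 accepts any byte value, so a wrong guess never raises an error;
# it just produces different (garbled) characters.
as_latin1 = body.decode("latin-1")

# ASCII only covers bytes 0-127, so the same body is rejected.
try:
    body.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(len(body), "bytes on the wire")
print("utf-8   ->", repr(as_utf8))
print("latin-1 ->", repr(as_latin1))
print("ascii decodable:", ascii_ok)
```

This is why a page needs to declare its charset (for example in the Content-Type header): the bytes alone do not say which encoding produced them.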
