Unicode
Created | Updated Jan 28, 2002
Apologies for the extreme roughness of what follows. It is very late, and I promise to improve it soon.
To quote the 'What is Unicode?' page on the Unicode site:
Unicode provides a unique number for every character,
no matter what the platform,
no matter what the program,
no matter what the language.
History
Every letter and symbol you use on your computer is stored by your computer as a number. For example, the Roman letter 'A' will almost certainly be represented as the number 65. A group of such mappings between characters and numbers is called a character set.
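As a small illustration of the idea, here is a sketch in Python, whose built-in ord() and chr() functions convert between a character and its number. The toy_charset table is made up for this example and is not a real character set.

```python
# Characters are stored as numbers; ord() and chr() expose the mapping.
letter = "A"
number = ord(letter)        # character -> number
print(number)               # 65
print(chr(number))          # number -> character, prints A

# A tiny, made-up "character set": just a table from numbers to characters.
toy_charset = {65: "A", 66: "B", 67: "C"}
decoded = "".join(toy_charset[n] for n in [67, 65, 66])
print(decoded)              # CAB
```

A real character set is the same kind of table, only larger and agreed upon by everyone who uses it.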
There are hundreds of such character sets in use around the world today, the most common being variations on ISO-8859-1.
Most character sets have a maximum of 256 elements, because most computers store each character in a single byte of 8 binary bits, giving a range from 0 to 255. This is a problem for countries such as Japan and China, whose written languages use far more than 256 characters. Multi-byte character sets such as Shift-JIS were devised to get around the limit, but they bring problems of their own.
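To make the 256 limit concrete, here is a rough sketch in Python. It assumes the standard shift_jis codec that ships with Python, and the kanji used is just an arbitrary example.

```python
# One byte is 8 bits, so it can hold 2**8 = 256 distinct values (0 to 255).
values_per_byte = 2 ** 8
print(values_per_byte)          # 256

# Python refuses to build a byte holding anything larger than 255.
try:
    bytes([256])
    fits_in_a_byte = True
except ValueError:
    fits_in_a_byte = False
print(fits_in_a_byte)           # False

# Shift-JIS copes by spending two bytes on each kanji.
kanji = "日"
encoded = kanji.encode("shift_jis")
print(len(encoded))             # 2
```

Software that assumes one byte per character will miscount or split such text, which is one source of trouble with multi-byte character sets.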
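Character sets also conflict with one another, because the same number can stand for different characters on different systems: e.g. 141 is the TM symbol on an Acorn, but something else on a PC.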
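The conflict over 141 can be reproduced with a short Python sketch. Python has no Acorn codec, so the three codecs below (cp437, mac_roman and cp1252) are stand-ins chosen for illustration only; the point is simply that one byte gives three different outcomes.

```python
raw = bytes([141])  # the single byte 141 (0x8D)

# Codecs chosen purely for illustration; Python ships no Acorn codec.
as_dos = raw.decode("cp437")        # original IBM PC code page
as_mac = raw.decode("mac_roman")    # classic Mac OS
print(as_dos, as_mac)

# Windows-1252 leaves 141 unassigned, so decoding it fails outright.
try:
    raw.decode("cp1252")
    windows_ok = True
except UnicodeDecodeError:
    windows_ok = False
print(windows_ok)                   # False
```

Nothing in the byte itself says which table to use, so a file that travels without that information can turn into nonsense on arrival.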
The Solution
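The solution is ISO/IEC 10646 and Unicode, two standards that share the same characters and the same numbers; Unicode is the one that most software actually implements. Unicode has room for over a million elements, can be stored in several encoding formats (such as UTF-8 and UTF-16), has no conflicts because each character has exactly one number, and is universally recognised.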
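Here is a minimal Python sketch of one number per character, stored in more than one encoding format. The three sample characters are arbitrary choices for this example, and the big-endian codec variants are used only so that no byte order mark appears in the output.

```python
import unicodedata

# Each character has exactly one Unicode number (its code point),
# but that number can be written out as bytes in several formats.
for ch in ["A", "™", "日"]:
    code_point = ord(ch)
    print(f"U+{code_point:04X}", unicodedata.name(ch))
    print("  UTF-8 :", ch.encode("utf-8").hex())
    print("  UTF-16:", ch.encode("utf-16-be").hex())
    print("  UTF-32:", ch.encode("utf-32-be").hex())
```

The code point stays the same whichever format is used; only the bytes on disk change. In UTF-8 the letter 'A' is still the single byte 65, which is why plain ASCII text is already valid UTF-8.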