China internet resources: Chinese character sets GB and Big5 |
|---|
Chinese character sets

The two most commonly used character sets for Chinese are:

- GB (used in mainland China and associated with simplified characters)
- Big5 (used in Taiwan and Hong Kong and associated with traditional characters)

A new universal character set called Unicode is gradually coming into use, but almost all Chinese web pages use either GB or Big5.

What's a character set?

For writing English, the most common character set is ASCII. This is a 7-bit character set, which means it can represent at most 128 separate symbols. It includes capital letters, small letters, digits, common punctuation symbols, and so on. For instance, the capital letter A is code 65, stored in a byte as 01000001.

On the Internet, most web pages can be written using the ISO-8859-1 character set. This is an 8-bit character set, which means it can represent at most 256 separate symbols. It includes all of ASCII plus the accented letters used by Western European languages.

Because Chinese contains thousands of different characters, it requires a double-byte (or 16-bit) character set, such as GB or Big5 or Unicode. Two bytes can represent at most 65536 separate symbols.

Technical details

Each byte of a double-byte GB character is in the hexadecimal range a1 to fe. For Big5, the first byte is in the range a1 to fe, while the second byte is in the range 40 to 7e or a1 to fe. In both sets the first byte of every Chinese character has its high bit set, whereas every ASCII byte is 7f or below, so a program reading the text from the start can always tell where a double-byte character begins. Thus, 7-bit ASCII text can be intermixed freely within a GB or Big5 text. However, 8-bit accented European letters cannot be mixed in this way, because ISO-8859-1 places them in the range c0 to ff, which overlaps the bytes used for Chinese characters, and most current browsers unfortunately cannot display European languages and Chinese on the same web page. |