'CJK' is a collective term for
Chinese,
Japanese, and
Korean, which constitute the main
East Asian languages. The term is used in the field of
software and communications
internationalization.
The term 'CJKV' means CJK plus
Vietnamese, which was written with
Chinese characters (
chữ nôm) before adopting ''quốc ngữ'' (see
Vietnamese alphabet).
These languages share one characteristic: their
writing systems completely or partly use
Chinese characters—
hànzì in Chinese,
kanji in Japanese, and
hanja in Korean. Chinese is written only in Chinese characters; general literacy requires about 4,000 characters, and reasonably complete coverage requires up to 40,000. Japanese uses fewer characters: general literacy in Japan can be expected with about 2,000 characters, together with two
syllabaries. The use of Chinese characters in Korea is becoming increasingly rare, although idiosyncratic use of Chinese characters in proper names requires knowledge (and therefore availability) of many more characters.
The number of characters required for complete coverage of all these languages' needs cannot fit in the 256-character code space of 8-bit
encodings, requiring at least a 16-bit fixed-width
character encoding or multi-byte variable-length encodings. The 16-bit fixed-width encodings, such as
Unicode before version 2.0, are now deprecated for two reasons: more characters must be encoded than a 16-bit encoding can accommodate (Unicode 5.0 has some 70,000 Han characters), and the Chinese government requires that software in China support the
GB18030 character set.
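To make the size argument concrete, the following Python sketch compares how many bytes a few characters occupy in different encodings. The sample characters and the helper name `encoded_lengths` are chosen here for illustration only; the codec names are those of Python's standard library.

```python
# Illustrative sketch: byte lengths of sample characters in three encodings.
# The samples and the helper name are assumptions made for this example.

SAMPLES = {
    "ASCII letter": "A",
    "Han character inside the 16-bit range": "\u4e2d",       # 中
    "Han character outside the 16-bit range": "\U00020000",  # CJK Extension B
}

CODECS = ("utf-8", "utf-16-be", "gb18030")


def encoded_lengths(ch):
    """Return {codec name: number of bytes needed to encode ch}."""
    return {codec: len(ch.encode(codec)) for codec in CODECS}


for label, ch in SAMPLES.items():
    print(f"{label}: U+{ord(ch):04X} {encoded_lengths(ch)}")
```

A code point above U+FFFF cannot be stored in a single 16-bit unit, which is why UTF-16 falls back to a pair of units (four bytes) for the last sample, and why GB18030 also uses a four-byte form for it.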
Although CJK encodings have common character sets, the encodings often used to represent them have been developed separately by different East Asian governments and software companies, and are mutually incompatible.
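The incompatibility can be seen by encoding one shared Han character with several legacy codecs. This is a minimal sketch using Python's standard codecs; the choice of character (U+4E2D) and the helper name `encode_all` are assumptions made for illustration.

```python
# Illustrative sketch: one Han character (U+4E2D, 中) in several legacy encodings.
TEXT = "\u4e2d"
LEGACY_CODECS = ["big5", "gb2312", "shift_jis", "euc_jp", "euc_kr"]


def encode_all(text):
    """Return {codec name: encoded bytes} for each legacy codec."""
    return {name: text.encode(name) for name in LEGACY_CODECS}


encoded = encode_all(TEXT)
for name, raw in encoded.items():
    print(f"{name:10} {raw.hex()}")

# Reading GB2312 bytes as if they were Big5 raises no error here,
# but it does not give back the original character either.
misread = encoded["gb2312"].decode("big5", errors="replace")
print(misread == TEXT)
```

Each codec produces a different byte sequence for the same character, so a program must know which encoding a byte stream uses before it can interpret it.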
Unicode has attempted, with some controversy, to unify the character sets in a process known as
Han unification.
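One visible effect of unification is that byte sequences from different national encodings decode to a single Unicode code point when they represent the same character. The sketch below assumes the listed byte values for the character 日 (U+65E5) and uses Python's standard codecs; the table and helper names are hypothetical.

```python
# Illustrative sketch: the character 日 as stored in four legacy encodings.
LEGACY_BYTES = {
    "shift_jis": b"\x93\xfa",
    "euc_jp": b"\xc6\xfc",
    "gb2312": b"\xc8\xd5",
    "big5": b"\xa4\xe9",
}


def decode_all(table):
    """Decode each byte string with its own codec."""
    return {codec: raw.decode(codec) for codec, raw in table.items()}


decoded = decode_all(LEGACY_BYTES)
for codec, ch in decoded.items():
    print(f"{codec:10} U+{ord(ch):04X}")
```

All four entries map to the same code point, so any regional difference in how the glyph is drawn has to be handled by fonts or language tagging rather than by the character code.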
CJK character encodings should include, at a minimum, Han characters plus language-specific phonetic scripts such as
pinyin,
bopomofo,
hiragana,
katakana, and
hangul.
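As a rough way to check which of these scripts a piece of text draws on, Unicode character names can be inspected. This is only a sketch: the prefix list and the function name `script_hint` are assumptions for this example, and real script detection would normally rely on Unicode script properties rather than on names.

```python
import unicodedata

# Name prefixes used as a crude script hint (an assumption of this sketch).
PREFIXES = ("CJK UNIFIED IDEOGRAPH", "HIRAGANA", "KATAKANA", "HANGUL", "BOPOMOFO")


def script_hint(ch):
    """Return the matching name prefix for ch, or 'OTHER' if none matches."""
    name = unicodedata.name(ch, "")
    for prefix in PREFIXES:
        if name.startswith(prefix):
            return prefix
    return "OTHER"


for ch in "\u4e2d\u3042\u30a2\ud55c\u3105\u0101":  # 中 あ ア 한 ㄅ ā
    print(f"U+{ord(ch):04X} {script_hint(ch)}")
```

Pinyin is written with Latin letters and tone diacritics, so the last sample falls under "OTHER" in this sketch.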
CJK character encodings include:
★
Big5
★
EUC-JP
★
EUC-KR
★
GB18030 (the mandated standard in the
People's Republic of China)
★
GB2312
★
ISO 2022-JP
★
KS C 5861
★
Shift-JIS
★
Unicode
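Several entries in this list encode the same underlying character set in different ways. As a hedged illustration, the sketch below encodes the word 日本 with three Japanese encodings available in Python's standard library; the helper name `encode_japanese` is hypothetical.

```python
# Illustrative sketch: three byte-level representations of the same Japanese text.
TEXT = "\u65e5\u672c"  # 日本


def encode_japanese(text):
    """Return {codec name: encoded bytes} for three Japanese encodings."""
    return {name: text.encode(name) for name in ("iso2022_jp", "euc_jp", "shift_jis")}


encoded = encode_japanese(TEXT)
for name, raw in encoded.items():
    print(f"{name:11} {raw!r}")

iso = encoded["iso2022_jp"]
# ISO-2022-JP stays within 7 bits and switches character sets with
# three-byte escape sequences at the start and end of this string.
print(all(b < 0x80 for b in iso))

# For these characters, EUC-JP is the same two-byte code with the high bit set.
payload = iso[3:-3]
print(bytes(b & 0x7F for b in encoded["euc_jp"]) == payload)
```

ISO 2022-JP is stateful (the meaning of a byte depends on the preceding escape sequence), whereas EUC-JP and Shift-JIS signal multi-byte characters through the values of the bytes themselves.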
The CJK character sets take up the bulk of the
Unicode code space. There is much controversy among Japanese experts on Chinese characters about the desirability and technical merit of the Han unification process used to map multiple Chinese and Japanese character sets into a single set of unified characters.
Chinese and Japanese can be written both
left-to-right and top-to-bottom, but are usually considered left-to-right scripts when discussing encoding issues.
See also
★
Chinese character encoding
★
Han unification
★
Chinese input methods for computers
★
Japanese language and computers
★
Korean language and computers
★
Variable-width encoding
★
Complex Text Layout languages (CTL)
★
CJK strokes
★
Horizontal and vertical writing in East Asian scripts
★
Graphics tablet
External links
★
CJKV: A Brief Introduction
★ http://www.praxagora.com/lunde/cjk_inf.html