Diffstat (limited to 'Documentation/unicode.txt')
-rw-r--r-- Documentation/unicode.txt 175
1 files changed, 0 insertions, 175 deletions
diff --git a/Documentation/unicode.txt b/Documentation/unicode.txt
deleted file mode 100644
index 4a33f81cadb1..000000000000
--- a/Documentation/unicode.txt
+++ /dev/null
@@ -1,175 +0,0 @@
- Last update: 2005-01-17, version 1.4
-
-This file is maintained by H. Peter Anvin <unicode@lanana.org> as part
-of the Linux Assigned Names And Numbers Authority (LANANA) project.
-The current version can be found at:
-
- http://www.lanana.org/docs/unicode/unicode.txt
-
- ------------------------
-
-The Linux kernel code has been rewritten to use Unicode to map
-characters to fonts. By downloading a single Unicode-to-font table,
-both the eight-bit character sets and UTF-8 mode are changed to use
-the font as indicated.
-
-This changes the semantics of the eight-bit character tables subtly.
-The four character tables are now:
-
-Map symbol   Map name                    Escape code (G0)
-
-LAT1_MAP     Latin-1 (ISO 8859-1)        ESC ( B
-GRAF_MAP     DEC VT100 pseudographics    ESC ( 0
-IBMPC_MAP    IBM code page 437           ESC ( U
-USER_MAP     User defined                ESC ( K
-
-In particular, ESC ( U is no longer "straight to font", since the font
-might be completely different from the IBM character set. This
-permits, for example, the use of block graphics even with a Latin-1
-font loaded.
-
-Note that although these codes are similar to ISO 2022, neither the
-codes nor their uses match ISO 2022; Linux has two 8-bit codes (G0 and
-G1), whereas ISO 2022 has four 7-bit codes (G0-G3).
-
-In accordance with the Unicode standard/ISO 10646 the range U+F000 to
-U+F8FF has been reserved for OS-wide allocation (the Unicode Standard
-refers to this as a "Corporate Zone"; since this is inaccurate for
-Linux, we call it the "Linux Zone"). U+F000 was picked as the starting
-point since it lets the direct-mapping area start on a boundary aligned
-to a large power of two (in case 1024- or 2048-character fonts ever
-become necessary). This leaves U+E000 to U+EFFF as the End User Zone.
-
-[v1.2]: The Unicode range from U+F000 up to U+F7FF has been
-hard-coded to map directly to the loaded font, bypassing the
-translation table. The user-defined map now defaults to U+F000 to
-U+F0FF, emulating the previous behaviour. In practice, this range
-might be shorter; for example, vgacon can only handle 256-character
-(U+F000..U+F0FF) or 512-character (U+F000..U+F1FF) fonts.
-
-
-Actual characters assigned in the Linux Zone
---------------------------------------------
-
-In addition, the following characters not present in Unicode 1.1.4
-have been defined; these are used by the DEC VT graphics map. [v1.2]
-THIS USE IS OBSOLETE AND SHOULD NO LONGER BE USED; PLEASE SEE BELOW.
-
-U+F800 DEC VT GRAPHICS HORIZONTAL LINE SCAN 1
-U+F801 DEC VT GRAPHICS HORIZONTAL LINE SCAN 3
-U+F803 DEC VT GRAPHICS HORIZONTAL LINE SCAN 7
-U+F804 DEC VT GRAPHICS HORIZONTAL LINE SCAN 9
-
-The DEC VT220 uses a 6x10 character matrix, and these characters form
-a smooth progression in the DEC VT graphics character set. I have
-omitted the scan 5 line, since it is also used as a block-graphics
-character, and hence has been coded as U+2500 FORMS LIGHT HORIZONTAL.
-
-[v1.3]: These characters have been officially added to Unicode 3.2.0;
-they are added at U+23BA, U+23BB, U+23BC, U+23BD. Linux now uses the
-new values.
-
-[v1.2]: The following characters have been added to represent common
-keyboard symbols that are unlikely to ever be added to Unicode proper
-since they are horribly vendor-specific. This, of course, is an
-excellent example of horrible design.
-
-U+F810 KEYBOARD SYMBOL FLYING FLAG
-U+F811 KEYBOARD SYMBOL PULLDOWN MENU
-U+F812 KEYBOARD SYMBOL OPEN APPLE
-U+F813 KEYBOARD SYMBOL SOLID APPLE
-
-Klingon language support
-------------------------
-
-In 1996, Linux was the first operating system in the world to add
-support for the artificial language Klingon, created by Marc Okrand
-for the "Star Trek" television series. This encoding was later
-adopted by the ConScript Unicode Registry and proposed (but ultimately
-rejected) for inclusion in Unicode Plane 1. Thus, it remains as a
-Linux/CSUR private assignment in the Linux Zone.
-
-This encoding has been endorsed by the Klingon Language Institute.
-For more information, contact them at:
-
- http://www.kli.org/
-
-Since the characters at the beginning of the Linux Zone have been more
-of the dingbats/symbols/forms type and this is a language, I have
-located it at the end, on a 16-cell boundary in keeping with standard
-Unicode practice.
-
-NOTE: This range is now officially managed by the ConScript Unicode
-Registry. The normative reference is at:
-
- http://www.evertype.com/standards/csur/klingon.html
-
-Klingon has an alphabet of 26 characters, a positional numeric writing
-system with 10 digits, and is written left-to-right, top-to-bottom.
-
-Several glyph forms for the Klingon alphabet have been proposed.
-However, since the set of symbols appears to be consistent throughout,
-with only the actual shapes being different, in keeping with standard
-Unicode practice these differences are considered font variants.
-
-U+F8D0 KLINGON LETTER A
-U+F8D1 KLINGON LETTER B
-U+F8D2 KLINGON LETTER CH
-U+F8D3 KLINGON LETTER D
-U+F8D4 KLINGON LETTER E
-U+F8D5 KLINGON LETTER GH
-U+F8D6 KLINGON LETTER H
-U+F8D7 KLINGON LETTER I
-U+F8D8 KLINGON LETTER J
-U+F8D9 KLINGON LETTER L
-U+F8DA KLINGON LETTER M
-U+F8DB KLINGON LETTER N
-U+F8DC KLINGON LETTER NG
-U+F8DD KLINGON LETTER O
-U+F8DE KLINGON LETTER P
-U+F8DF KLINGON LETTER Q
- - Written <q> in standard Okrand Latin transliteration
-U+F8E0 KLINGON LETTER QH
- - Written <Q> in standard Okrand Latin transliteration
-U+F8E1 KLINGON LETTER R
-U+F8E2 KLINGON LETTER S
-U+F8E3 KLINGON LETTER T
-U+F8E4 KLINGON LETTER TLH
-U+F8E5 KLINGON LETTER U
-U+F8E6 KLINGON LETTER V
-U+F8E7 KLINGON LETTER W
-U+F8E8 KLINGON LETTER Y
-U+F8E9 KLINGON LETTER GLOTTAL STOP
-
-U+F8F0 KLINGON DIGIT ZERO
-U+F8F1 KLINGON DIGIT ONE
-U+F8F2 KLINGON DIGIT TWO
-U+F8F3 KLINGON DIGIT THREE
-U+F8F4 KLINGON DIGIT FOUR
-U+F8F5 KLINGON DIGIT FIVE
-U+F8F6 KLINGON DIGIT SIX
-U+F8F7 KLINGON DIGIT SEVEN
-U+F8F8 KLINGON DIGIT EIGHT
-U+F8F9 KLINGON DIGIT NINE
-
-U+F8FD KLINGON COMMA
-U+F8FE KLINGON FULL STOP
-U+F8FF KLINGON SYMBOL FOR EMPIRE
-
-Other Fictional and Artificial Scripts
---------------------------------------
-
-Since the assignment of the Klingon Linux Unicode block, a registry of
-fictional and artificial scripts has been established by John Cowan
-<jcowan@reutershealth.com> and Michael Everson <everson@evertype.com>.
-The ConScript Unicode Registry is accessible at:
-
- http://www.evertype.com/standards/csur/
-
-The ranges used fall at the low end of the End User Zone and hence
-cannot be normatively assigned, but it is recommended that people who
-wish to encode fictional scripts use these codes, in the interest of
-interoperability. For Klingon, CSUR has adopted the Linux encoding.
-The CSUR people are driving the addition of Tengwar and Cirth to
-Unicode Plane 1; the addition of Klingon to Unicode Plane 1 has been
-rejected, and so the above encoding remains official.
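
The zones referred to throughout the deleted file (End User Zone, Linux Zone, the direct-to-font range, and the Klingon block) can be summarised in a short Python sketch. This is illustrative only and is not kernel code: the names `classify` and `direct_font_index` are hypothetical, and the Klingon range is taken as the whole U+F8D0..U+F8FF span of the list above, including its unassigned cells. The `font_size` parameter is an assumption that models the 256- and 512-character font limits mentioned for vgacon.

```python
# Illustrative sketch; identifiers are hypothetical, not kernel names.
END_USER_ZONE = range(0xE000, 0xF000)  # U+E000..U+EFFF
LINUX_ZONE = range(0xF000, 0xF900)     # U+F000..U+F8FF
DIRECT_FONT = range(0xF000, 0xF800)    # U+F000..U+F7FF, bypasses the translation table
KLINGON = range(0xF8D0, 0xF900)        # U+F8D0..U+F8FF (includes unassigned cells)


def classify(cp: int) -> str:
    """Name the zone a code point falls in, per the text above."""
    if cp in END_USER_ZONE:
        return "end-user"
    if cp in DIRECT_FONT:
        return "direct-font"
    if cp in KLINGON:
        return "klingon"
    if cp in LINUX_ZONE:
        return "linux-zone"
    return "other"


def direct_font_index(cp: int, font_size: int = 256):
    """Glyph index for a direct-mapped code point, or None if it is
    outside the direct range or beyond the loaded font's size."""
    if cp not in DIRECT_FONT:
        return None
    index = cp - 0xF000
    return index if index < font_size else None
```

For example, `classify(0xF8D0)` reports the Klingon block, while `direct_font_index(0xF100)` yields `None` for a 256-character font, since only U+F000..U+F0FF is usable in that case.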