| Commit message | Author | Age | Files | Lines |
| |
functions and delete a macro that is used in only one place;
no functional change.
This completes the audit of the citrus directory (only 3 files left).
| |
leave that decision to the compiler; no functional change
| |
- delete unused headers
- add missing function prototype
- delete needless casts of return values
- KNF: return is not a function
- KNF: do not use a pointer as a boolean
- consistent wording in comments: s/octets/bytes/
OK gcc: no object change after strip -g
| |
Declare functions rather than generating declarations with macros.
Just call functions rather than maintaining function pointer tables.
Purge unused arguments. Simplify mbstate_t casting.
Garbage collect one empty and one unused function.
As a bonus, make mbsinit(3) work at all; it returned garbage
in the past due to a missing cast when passing mbstate_t.
Apart from that, no functional change.
No libc bump needed; only private functions are removed or
change prototype, and only private structs change size.
OK stsp@ mpi@; deraadt@ likes the general direction.
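The mbsinit(3) behaviour mentioned above can be checked from the caller's side. The following is only a sketch of the documented interface, not the libc code touched by this commit; the helper name fresh_state_is_initial is made up for illustration. A zero-filled mbstate_t describes the initial conversion state, so mbsinit(3) is expected to return non-zero for it.

```c
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of libc): report whether a freshly
 * zeroed mbstate_t is recognized as the initial conversion state.
 */
static int
fresh_state_is_initial(void)
{
	mbstate_t st;

	/* A zero-filled mbstate_t is the initial conversion state. */
	memset(&st, 0, sizeof(st));

	/* mbsinit(3) returns non-zero for an initial state. */
	return mbsinit(&st) != 0;
}
```

A caller that gets 0 here is looking at the kind of failure the commit describes.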
| |
By definition, the range of valid Unicode code points is the union of
U+0000..U+D7FF and U+E000..U+10FFFF (see Unicode 8.0.0, chapter 3.9).
In UTF-16, the encoded values that would represent U+D800..U+DFFF are
used for surrogate pairs. UTF-8 has no concept of surrogate pairs;
attempting to treat them as regular code points violates the standard
and makes no sense besides.
ok stsp@
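The range rule quoted above fits in a small predicate. This is a minimal sketch under the stated definition, not the actual citrus UTF-8 code; the function name is_valid_code_point is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the libc implementation): true if cp lies
 * in U+0000..U+D7FF or U+E000..U+10FFFF.
 */
static bool
is_valid_code_point(uint32_t cp)
{
	/* Nothing above U+10FFFF is a code point. */
	if (cp > 0x10FFFF)
		return false;

	/* U+D800..U+DFFF are reserved for UTF-16 surrogate pairs. */
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return false;

	return true;
}
```

Both a decoder and an encoder can apply the same check, so that surrogate values are neither accepted from nor emitted as UTF-8.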
| |
doing ASCII handling once rather than twice, and using <= rather
than ((&~)==) obfuscation (which already caused a bug in the past).
No functional change.
Joint work with and OK stsp@ semarie@ bentley@
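To illustrate the style point about <= comparisons, here is a sketch of how the length of a UTF-8 encoding might be chosen with plain range tests. It is not the committed code; the name utf8_encoded_len and the convention of returning 0 for values that cannot be encoded are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the libc implementation): number of bytes
 * needed to encode cp in UTF-8, or 0 if cp cannot be encoded.
 *
 * Each test is a plain upper bound.  The mask form of the first test
 * would be (cp & ~0x7f) == 0, which says the same thing less readably.
 */
static size_t
utf8_encoded_len(uint32_t cp)
{
	if (cp <= 0x7F)		/* ASCII, handled exactly once */
		return 1;
	if (cp <= 0x7FF)
		return 2;
	if (cp <= 0xFFFF) {
		/* UTF-16 surrogates have no UTF-8 encoding. */
		if (cp >= 0xD800 && cp <= 0xDFFF)
			return 0;
		return 3;
	}
	if (cp <= 0x10FFFF)
		return 4;
	return 0;		/* beyond the range allowed by RFC 3629 */
}
```

Because the tests are ordered by increasing upper bound, each branch only needs one comparison and the boundaries can be read directly off the code.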
| |
Fixing a regression in wcrtomb(3) found with the mandoc testsuite
that was caused by the last commit.
OK semarie@ bentley@
| |
RFC 3629 (limiting the range of UTF-8 to 0x10FFFF).
It is the counterpart of a previous commit correcting mbrtowc(3).
Problem spotted by stsp.
ok bentley@ stsp@
| |
ok stsp@
| |
ok stsp@
| |
symbols that are no longer exported. (This improves the generated code.)
ok deraadt@
| |
review by millert, binary checking process with doug, concept with guenther
| |
so stop rejecting them in our citrus UTF-8 parser.
This is a common misinterpretation of the Unicode standard which resulted
in a corrigendum last year: http://www.unicode.org/versions/corrigendum9.html
Pointed out by jilles@freebsd (via pfg@freebsd), thanks!
| |
Patch by Vladimir Támara Patiño <vtamara@pasosdeJesus.org>
ok mpi millert
| |
wcrtomb() must pretend to store one byte (NUL-terminator) in this case.
Patch by Vladimir Tamara Patino. ok guenther
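Assuming "this case" refers to a null destination pointer, which is how the C standard describes wcrtomb(), the expected behaviour can be sketched as follows. The helper name wcrtomb_null_buffer is made up for illustration. With a null destination, the call is specified to behave like wcrtomb(buf, L'\0', ps) on an internal buffer, so in the initial conversion state it reports one byte regardless of the wide character passed in.

```c
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of libc): call wcrtomb(3) with a null
 * destination, starting from the initial conversion state.
 */
static size_t
wcrtomb_null_buffer(wchar_t wc)
{
	mbstate_t st;

	memset(&st, 0, sizeof(st));

	/*
	 * With s == NULL the wc argument is ignored and the call acts as
	 * if a NUL terminator were stored in an internal buffer.
	 */
	return wcrtomb(NULL, wc, &st);
}
```

A return value of 0 from this helper would indicate the bug the patch addresses.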
| |
ok guenther millert kettenis
| |
Bulk build test by naddy.
| |
of wide characters. This will fix a problem of uim-fep pre-edit display.
OK stsp@
| |
code positions U+D800 to U+DFFF (UTF-16 surrogates), U+FFFE, and U+FFFF.
http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
http://unicode.org/faq/utf_bom.html#utf8-4
ok phessler, millert, miod, deraadt
| |
is supposed to ignore the 'n' parameter and return the number of wide
characters needed to represent the given multi-byte character sequence.
However, in the special case where 'pwcs' is NULL and 'n' is zero, our
mbsrtowcs() implementation for single-byte locales mistakenly returned zero.
Before the UTF-8 locale was added, this bug was invisible to callers of
mbstowcs() because mbstowcs() handled this special case itself.
But our new mbstowcs() implementation simply forwards to the locale-specific
mbsrtowcs() implementation and expects it to do the right thing.
The "awesome" window manager's "Run:" command prompt uses mbstowcs() to
measure how many (possibly multi-byte) characters a user has typed, and
due to this bug would always be tricked into thinking the user had entered
zero characters when a single-byte locale was used.
Found after prodding by dcoppa.
ok deraadt sthen espie
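The calling pattern described above, measuring a string by passing a null destination and a zero count, looks roughly like this. It is a sketch of the caller's side only, not of the fixed mbsrtowcs() implementation; the wrapper name count_wide_chars is an assumption. Accepting a null destination in mbstowcs(3) is a POSIX extension rather than ISO C.

```c
#include <stdlib.h>

/*
 * Hypothetical wrapper (not part of libc): number of wide characters
 * needed to represent the multi-byte string s in the current locale,
 * not counting the terminator, or (size_t)-1 on an invalid sequence.
 */
static size_t
count_wide_chars(const char *s)
{
	/* With a null destination the count argument is ignored. */
	return mbstowcs(NULL, s, 0);
}
```

In a single-byte locale the result equals the byte length of the string, so a return value of zero for non-empty input is the symptom the commit message reports.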
| |
add missing prototype
ok stsp@
| |
conversion interfaces of libc (mbrtowc(3) and friends) with new
implementations that internally call an API based on NetBSD's citrus.
This allows us to support locales with multi-byte character encodings.
Provide two implementations of the citrus-based API: one based on the old
single-byte placeholders for use with our existing single-byte character
locales (C, ISO8859-*, KOI8, CP1251, etc.), and one that provides support
for UTF-8 encoded characters (code based on FreeBSD's implementation).
Install the en_US.UTF-8 ctype locale support file, and allow the UTF-8
ctype locale to be enabled via setlocale(3) (export LC_CTYPE='en_US.UTF-8').
A lot of programs, especially from ports, will now start using UTF-8 if the
UTF-8 locale is enabled. Use at your own risk, and please report any breakage.
Note that ncurses-based programs cannot display UTF-8 right now; this is being
worked on.
To prevent install media growth, add vfprintf(3) and mbrtowc(3) to libstubs.
The mbrtowc stub was copied unchanged from its old single-byte placeholder.
vfprintf.c doesn't need to be copied, just put in .PATH (hint by fgsch@).
Testing by myself, naddy, sthen, nicm, espie, armani, Dmitrij D. Czarkoff.
ok matthieu espie millert sthen nicm deraadt
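As a rough illustration of what enabling the UTF-8 ctype locale means for a program, the sketch below selects en_US.UTF-8 through setlocale(3) and decodes one character with mbrtowc(3). The helper name first_char_len_utf8 and its -1 error convention are assumptions made for this example; whether the locale is available depends on the system, so the code treats a failed setlocale(3) as an error rather than assuming success.

```c
#include <locale.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of libc): number of bytes consumed by
 * the first character of s under the en_US.UTF-8 ctype locale, or -1
 * if the locale cannot be enabled or no complete, valid character is
 * found.  Note that it changes the process-wide LC_CTYPE setting.
 */
static int
first_char_len_utf8(const char *s)
{
	mbstate_t st;
	wchar_t wc;
	size_t n;

	if (setlocale(LC_CTYPE, "en_US.UTF-8") == NULL)
		return -1;

	memset(&st, 0, sizeof(st));
	n = mbrtowc(&wc, s, strlen(s), &st);

	/* (size_t)-1: invalid sequence, (size_t)-2: incomplete sequence */
	if (n == (size_t)-1 || n == (size_t)-2)
		return -1;

	return (int)n;
}
```

For the two-byte sequence "\xc3\xa9" (U+00E9) the helper should report 2 when the locale is available; under a single-byte locale the same input would be consumed one byte at a time.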
|
okay deraadt@