1 files changed, 586 insertions, 453 deletions
diff --git a/lib/libc/stdio/printf.3 b/lib/libc/stdio/printf.3
index aafb6f5a409..3ca5645f9ff 100644
--- a/lib/libc/stdio/printf.3
+++ b/lib/libc/stdio/printf.3
@@ -1,4 +1,4 @@
-.\"	$OpenBSD: printf.3,v 1.85 2020/07/06 17:24:59 schwarze Exp $
+.\"	$OpenBSD: printf.3,v 1.86 2020/07/10 14:43:18 schwarze Exp $
 .\"
 .\" Copyright (c) 1990, 1991, 1993
 .\"	The Regents of the University of California.  All rights reserved.
@@ -33,7 +33,7 @@
 .\"
 .\"     @(#)printf.3	8.1 (Berkeley) 6/4/93
 .\"
-.Dd $Mdocdate: July 6 2020 $
+.Dd $Mdocdate: July 10 2020 $
 .Dt PRINTF 3
 .Os
 .Sh NAME
@@ -159,487 +159,598 @@ characters (not
 which are copied unchanged to the output stream,
 and conversion specifications, each of which results
 in fetching zero or more subsequent arguments.
-Each conversion specification is introduced by the character
-.Cm % .
 The arguments must correspond properly (after type promotion)
-with the conversion specifier.
-After the
-.Cm % ,
-the following appear in sequence:
-.Bl -bullet
-.It
-An optional field, consisting of a decimal digit string followed by a
-.Cm $
-specifying the next argument to access.
-If this field is not provided, the argument following the last
-argument accessed will be used.
-Arguments are numbered starting at
-.Cm 1 .
-.It
-Zero or more of the following flags:
-.Bl -hyphen
-.It
-A hash
-.Sq Cm #
-character
-specifying that the value should be converted to an
-.Dq alternate form .
-For
-.Cm o
-conversions, the precision of the number is increased to force the first
-character of the output string to a zero (except if a zero value is printed
-with an explicit precision of zero).
-For
-.Cm x
-and
-.Cm X
-conversions, a non-zero result has the string
-.Ql 0x
-(or
-.Ql 0X
-for
-.Cm X
-conversions) prepended to it.
-For
-.Cm a ,
-.Cm A ,
-.Cm e ,
-.Cm E ,
-.Cm f ,
-.Cm F ,
-.Cm g ,
-and
-.Cm G
-conversions, the result will always contain a decimal point, even if no
-digits follow it (normally, a decimal point appears in the results of
-those conversions only if a digit follows).
-For
-.Cm g
-and
-.Cm G
-conversions, trailing zeros are not removed from the result as they
-would otherwise be.
-For all other formats, behaviour is undefined.
-.It
-A zero
-.Sq Cm \&0
-character specifying zero padding.
-For all conversions except
-.Cm n ,
-the converted value is padded on the left with zeros rather than blanks.
-If a precision is given with a numeric conversion
-.Pf ( Cm d ,
-.Cm i ,
-.Cm o ,
-.Cm u ,
-.Cm x ,
-and
-.Cm X ) ,
-the
-.Sq Cm \&0
-flag is ignored.
-.It
-A negative field width flag
-.Sq Cm \-
-indicates the converted value is to be left adjusted on the field boundary.
-Except for
-.Cm n
-conversions, the converted value is padded on the right with blanks,
-rather than on the left with blanks or zeros.
-A
-.Sq Cm \-
-overrides a
-.Sq Cm \&0
-if both are given.
-.It
-A space, specifying that a blank should be left before a positive number
-produced by a signed conversion
-.Pf ( Cm d ,
-.Cm a ,
-.Cm A ,
-.Cm e ,
-.Cm E ,
-.Cm f ,
-.Cm F ,
-.Cm g ,
-.Cm G ,
-or
-.Cm i ) .
-.It
-A
-.Sq Cm +
-character specifying that a sign always be placed before a
-number produced by a signed conversion.
-A
-.Sq Cm +
-overrides a space if both are used.
+with the conversion specifiers.
+.Pp
+The overall syntax of a conversion specification is:
+.Bd -filled -offset indent
+.Sm off
+.Cm %
+.Op Ar argno Cm $
+.Op Ar flags
+.Op Ar width
+.Op . Ar precision
+.Op Ar size
+.Ar conversion
+.Sm on
+.Ed
+.Pp
+Not all combinations of these parts are meaningful;
+see the description of the individual
+.Ar conversion
+specifiers for details.
+.Pp
+The parts of a conversion specification are as follows:
+.Bl -tag -width Ds
+.It Cm %
+A literal percent character begins a conversion specification.
+.It Ar argno Ns Cm $
+An unsigned decimal digit string followed by a dollar character
+specifies the index of the next argument to access.
+By default, the argument following the last argument accessed is used.
+Arguments are numbered starting at 1.
+.It Ar flags
+Zero or more of the following flag characters can be given:
+.Bl -tag -width 11n
+.It Cm # Pq hash
+Use an alternate form for the output.
+The effect differs depending on the conversion specifier.
+.It So \~ Sc Pq space
+For signed conversions, print a space character before a positive number.
+.It Cm + Pq plus
+For signed conversions, always print a sign before the number,
+even if it is positive.
+This overrides the space flag if both are specified.
+.It Cm 0 Pq zero
+Pad numbers with leading zeros instead of space characters
+to fill the field
+.Ar width .
+This flag is ignored if the
+.Ar precision
+modifier is also given, which in this case specifies
+.Ar mindigits .
+.It Cm \- Pq minus
+Left adjust: pad to the field
+.Ar width
+with space characters on the right rather than on the left.
+This overrides the
+.Sq Cm 0
+flag if both are specified.
 .El
-.It
-An optional decimal digit string specifying a minimum field width.
-If the converted value has fewer characters than the field width, it will
-be padded with spaces on the left (or right, if the left-adjustment
-flag has been given) to fill out
-the field width.
-.It
-An optional precision, in the form of a period
-.Sq Cm \&.
-followed by an
-optional digit string.
-If the digit string is omitted, the precision is taken as zero.
-This gives the minimum number of digits to appear for
-.Cm d ,
-.Cm i ,
-.Cm o ,
-.Cm u ,
-.Cm x ,
-and
-.Cm X
-conversions, the number of digits to appear after the decimal-point for
-.Cm a ,
-.Cm A ,
-.Cm e ,
-.Cm E ,
-.Cm f ,
-and
-.Cm F
-conversions, the maximum number of significant digits for
-.Cm g
-and
-.Cm G
-conversions, or the maximum number of characters to be printed from a
-string for
-.Cm s
-conversions.
-.It
-An optional length modifier, that specifies the size of the argument.
-The following length modifiers are valid for the
-.Cm d , i , n ,
-.Cm o , u , x ,
+.It Ar width
+An unsigned decimal digit string specifies a minimum field width in bytes.
+Unless the
+.Sq Cm 0
 or
-.Cm X
-conversions:
-.Bl -column "(deprecated)" "signed char" "unsigned long long" "long long *"
-.It Sy Modifier Ta Sy "d, i" Ta Sy "o, u, x, X" Ta Sy n
-.It hh Ta "signed char" Ta "unsigned char" Ta "signed char *"
-.It h Ta short Ta "unsigned short" Ta "short *"
-.It "l (ell)" Ta long Ta "unsigned long" Ta "long *"
-.It "ll (ell ell)" Ta "long long" Ta "unsigned long long" Ta "long long *"
-.It j Ta intmax_t Ta uintmax_t Ta "intmax_t *"
-.It t Ta ptrdiff_t Ta (see note) Ta "ptrdiff_t *"
-.It z Ta "(see note)" Ta size_t Ta "(see note)"
-.It "q (deprecated)" Ta quad_t Ta u_quad_t Ta "quad_t *"
-.El
+.Sq Cm \-
+flag is given, the value is right adjusted in the field and
+padded with space characters on the left.
+By default, no padding is added.
+In no case does a non-existent or small field
+.Ar width
+cause truncation of a field; if the result of a conversion is wider
+than the field width, the field is expanded to contain the conversion
+result.
+.It Pf . Ar precision
+The meaning of an unsigned decimal digit string prefixed with a
+period character depends on the conversion specifier:
+it provides the minimum number of digits for integer conversions,
+of decimals for some floating point conversions and of significant
+digits for others, or the maximum number of bytes to print for
+string conversions.
 .Pp
-Note:
-the
-.Cm t
-modifier, when applied to an
-.Cm o , u , x ,
-or
-.Cm X
-conversion, indicates that the argument is of an unsigned type
-equivalent in size to a
-.Vt ptrdiff_t .
-The
-.Cm z
-modifier, when applied to a
-.Cm d
-or
-.Cm i
-conversion, indicates that the argument is of a signed type equivalent in
-size to a
-.Vt size_t .
-Similarly, when applied to an
-.Cm n
-conversion, it indicates that the argument is a pointer to a signed type
-equivalent in size to a
-.Vt size_t .
-.Pp
-The following length modifiers are valid for the
-.Cm a ,
-.Cm A ,
-.Cm e ,
-.Cm E ,
-.Cm f ,
-.Cm F ,
-.Cm g ,
+A field
+.Ar width
 or
-.Cm G
-conversions:
-.Bl -column "Modifier" "e, E, f, F, g, G"
-.It Sy Modifier Ta Sy "e, E, f, F, g, G"
-.It "l (ell)" Ta double (ignored: same behavior as without it)
-.It L Ta "long double"
+.Ar precision ,
+or both, may alternatively be indicated as
+.Cm * Ns Op Ar argno Ns Cm $ ,
+i.e. as an asterisk optionally followed
+by an unsigned decimal digit string and a dollar sign.
+In this case, an additional
+.Vt int
+argument supplies the field width or precision.
+If a single conversion specification tries to use arguments
+both with and without
+.Ar argno Ns Cm $
+modifiers, the result is undefined.
+.It Ar size
+An argument size modifier.
+The syntax, the precise meaning, and the default size of the argument
+depend on the following
+.Ar conversion
+character.
+.It Ar conversion
+Each conversion specification ends with a conversion specifier,
+which is a single letter determining which argument type is expected
+and how it is formatted.
 .El
 .Pp
-The following length modifier is valid for the
-.Cm c
-or
-.Cm s
-conversions:
-.Bl -column "Modifier" "wint_t" "wchar_t *"
-.It Sy Modifier Ta Sy c Ta Sy s
-.It "l (ell)" Ta wint_t Ta "wchar_t *"
-.El
-.It
-A character that specifies the type of conversion to be applied.
-.El
+The conversion specifiers are:
+.Bl -tag -width Ds
+.It Cm %a
+.Sm off
+.Cm %
+.Op Ar argno Cm $
+.Op Cm #
+.Op Cm \~ | +
+.Op Cm \- | 0
+.Op Ar width
+.Op . Ar hexadecimals
+.Op Cm L | l
+.Cm a
+.Sm on
 .Pp
-A field width or precision, or both, may be indicated by
-an asterisk
-.Ql *
-or an asterisk followed by one or more decimal digits and a
-.Ql $
-instead of a
-digit string.
-In this case, an
-.Li int
-argument supplies the field width or precision.
-A negative field width is treated as a left adjustment flag followed by a
-positive field width; a negative precision is treated as though it were
-missing.
-If a single format directive mixes positional (nn$) and
-non-positional arguments, the results are undefined.
-.Pp
-The conversion specifiers and their meanings are:
-.Bl -tag -width "diouxX"
-.It Cm diouxX
 The
-.Li int
-(or appropriate variant) argument is converted to signed decimal
-.Pf ( Cm d
+.Vt double
+argument is converted to the hexadecimal notation
+.Sm off
+.Oo \- Oc Sy 0x No h.hhh Sy p No \(+-d
+.Sm on
+with one digit before the hexadecimal point.
+If specified, the number is rounded to
+.Ar hexadecimals
+after the hexadecimal point; otherwise,
+enough digits are printed to represent it exactly.
+The hexadecimal point is only printed if at least one digit follows it
+or if the
+.Sq Cm #
+flag is given.
+.Pp
+The exponent is expressed in base 2, not in base 16.
+Consequently, there are multiple ways to represent a number in this format.
+For example, 0x3.24p+0, 0x6.48p-1, and 0xc.9p-2 are all equivalent.
+The format chosen depends on the internal representation of the
+number, but the implementation guarantees that the length of the
+mantissa is minimized.
+Zeroes are always represented with a mantissa of
+.Ql 0
+(preceded by a sign if appropriate) and an exponent of
+.Ql +0 .
+.Pp
+If the argument is infinity, it is converted to
+.Ql [-]inf .
+If the argument is not-a-number (NaN), it is converted to
+.Ql [-]nan .
+.Pp
+.Cm %La
+is similar to
+.Cm %a
+except that it takes an argument of
+.Vt long double .
+.Cm %la Pq ell a
+is an alias for
+.Cm %a .
+.It Cm \&%A
+Identical to
+.Cm %a
+except that upper case is used, i.e.\&
+.Ql 0X
+for the prefix,
+.Ql 0123456789ABCDEF
+for the digits,
+.Ql P
+to introduce the exponent,
 and
-.Cm i ) ,
-unsigned octal
-.Pq Cm o ,
-unsigned decimal
-.Pq Cm u ,
-or unsigned hexadecimal
-.Pf ( Cm x
+.Ql [-]INF
 and
-.Cm X )
-notation.
-The letters
-.Cm abcdef
-are used for
-.Cm x
-conversions; the letters
-.Cm ABCDEF
-are used for
-.Cm X
-conversions.
-The precision, if any, gives the minimum number of digits that must
-appear; if the converted value requires fewer digits, it is padded on
-the left with zeros.
-.It Cm DOU
-The
-.Li long int
-argument is converted to signed decimal, unsigned octal, or unsigned
-decimal, as if the format had been
-.Cm ld ,
-.Cm lo ,
-or
-.Cm lu
-respectively.
-These conversion characters are deprecated, and will eventually disappear.
-.It Cm eE
+.Ql [-]NAN
+for infinity and not-a-number, respectively.
+.It Cm %c
+.Sm off
+.Cm %
+.Op Ar argno Cm $
+.Op Cm \-
+.Op Ar width
+.Cm c
+.Sm on
+.Pp
 The
-.Li double
-argument is rounded and converted in the style
+.Vt int
+argument is converted to an
+.Vt unsigned char ,
+and the resulting single-byte character is written, with optional padding.
+.It Cm %lc
 .Sm off
-.Pf [\-]d Cm \&. No ddd Cm e No \(+-dd
+.Cm %
+.Op Ar argno Cm $
+.Op Cm \-
+.Op Ar width
+.Cm lc
 .Sm on
-where there is one digit before the
-decimal-point character
-and the number of digits after it is equal to the precision;
-if the precision is missing,
-it is taken as 6; if the precision is
-zero, no decimal-point character appears.
-An
-.Cm E
-conversion uses the letter
-.Cm E
-(rather than
-.Cm e )
-to introduce the exponent.
-The exponent always contains at least two digits; if the value is zero,
-the exponent is 00.
-.Pp
-If the argument is infinity, it will be converted to [-]inf
-.Pq Cm e
-or [-]INF
-.Pq Cm E ,
-respectively.
-If the argument is not-a-number (NaN), it will be converted to
-[-]nan
-.Pq Cm e
-or [-]NAN
-.Pq Cm E ,
-respectively.
-.It Cm fF
+.Pp
 The
-.Li double
-argument is rounded and converted to decimal notation in the style
+.Vt wint_t
+argument is converted to a multibyte character according to the current
+.Dv LC_CTYPE
+.Xr locale 1 ,
+and that character is written.
+For example, under a UTF-8 locale on
+.Ox ,
+.Ql printf("%lc", 0x03c0)
+writes the greek letter pi, whereas the same call fails
+under the default POSIX locale.
+Padding assures at least
+.Ar width
+bytes are printed; the number of characters printed may be smaller,
+and the number of display columns occupied may be smaller or larger.
+.It Cm %d
 .Sm off
-.Pf [-]ddd Cm \&. No ddd ,
+.Cm %
+.Op Ar argno Cm $
+.Op Cm \~ | +
+.Op Cm \- | 0
+.Op Ar width
+.Op . Ar mindigits
+.Op Ar size
+.Cm d
 .Sm on
-where the number of digits after the decimal-point character
-is equal to the precision specification.
-If the precision is missing, it is taken as 6; if the precision is
-explicitly zero, no decimal-point character appears.
-If a decimal point appears, at least one digit appears before it.
 .Pp
-If the argument is infinity, it will be converted to [-]inf
-.Pq Cm f
-or [-]INF
-.Pq Cm F ,
-respectively.
-If the argument is not-a-number (NaN), it will be converted to
-[-]nan
-.Pq Cm f
-or [-]NAN
-.Pq Cm F ,
-respectively.
-.It Cm gG
 The
-.Li double
-argument is converted in style
-.Cm f
-or
-.Cm e
-for
-.Cm g ,
-or
-.Cm F
-or
-.Cm E
-for
-.Cm G
-.Pq general floating point notation .
-The precision specifies the number of significant digits.
-If the precision is missing, 6 digits are given; if the precision is zero,
-it is treated as 1.
-Style
+.Vt int
+argument is converted to signed decimal notation.
+If specified, at least
+.Ar mindigits
+are printed, padding with leading zeros if needed.
+The following are similar to
+.Cm %d
+except that they take an argument of a different size:
+.Bl -column %hhd
+.It Cm %hhd Ta Vt signed char
+.It Cm %hd  Ta Vt signed short
+.It Cm %d   Ta Vt signed int
+.It Cm %ld  Ta Vt signed long Pq percent ell dee
+.It Cm %lld Ta Vt signed long long Pq percent ell ell dee
+.It Cm %jd  Ta Vt intmax_t
+.It Cm %td  Ta Vt ptrdiff_t
+.It Cm %zd  Ta Vt ssize_t
+.It Cm %qd  Ta Vt quad_t Pq deprecated
+.El
+.It Cm \&%D
+A deprecated alias for
+.Cm %ld .
+.It Cm %e
+.Sm off
+.Cm %
+.Op Ar argno Cm $
+.Op Cm #
+.Op Cm \~ | +
+.Op Cm \- | 0
+.Op Ar width
+.Op . Ar decimals
+.Op Cm L | l
 .Cm e
-is used if the exponent from its conversion is less than -4 or greater than
-or equal to the precision.
-Trailing zeros are removed from the fractional part of the result; a
-decimal point appears only if it is followed by at least one digit.
-.Pp
-If the argument is infinity, it will be converted to [-]inf
-.Pq Cm g
-or [-]INF
-.Pq Cm G ,
-respectively.
-If the argument is not-a-number (NaN), it will be converted to
-[-]nan
-.Pq Cm g
-or [-]NAN
-.Pq Cm G ,
-respectively.
-.It Cm aA
+.Sm on
+.Pp
 The
-.Li double
-argument is rounded and converted to hexadecimal notation in the style
+.Vt double
+argument is rounded and converted to the scientific notation
+.Pf [\-]d.dddddd Sy e Ns \(+-dd
+with one digit before the decimal point and
+.Ar decimals ,
+or six digits by default, after it.
+If
+.Ar decimals
+is zero and the
+.Sq Cm #
+flag is not given, the decimal point is omitted.
+The exponent always contains at least two digits; if the value is zero,
+the exponent is
+.Ql +00 .
+If the argument is infinity, it is converted to
+.Ql [-]inf .
+If the argument is not-a-number (NaN), it is converted to
+.Ql [-]nan .
+.Pp
+.Cm %Le
+is similar to
+.Cm %e
+except that it takes an argument of
+.Vt long double .
+.Cm %le Pq ell e
+is an alias for
+.Cm %e .
+.It Cm \&%E
+Identical to
+.Cm %e
+except that upper case is used, i.e.\&
+.Ql E
+instead of
+.Ql e
+to introduce the exponent and
+.Ql [-]INF
+and
+.Ql [-]NAN
+for infinity and not-a-number, respectively.
+.It Cm %f
 .Sm off
-.Pf [\-]0xh Cm \&. No hhh Cm p No [\(+-]d
+.Cm %
+.Op Ar argno Cm $
+.Op Cm #
+.Op Cm \~ | +
+.Op Cm \- | 0
+.Op Ar width
+.Op . Ar decimals
+.Op Cm L | l
+.Cm f
 .Sm on
-where the number of digits after the hexadecimal-point character
-is equal to the precision specification.
-If the precision is missing, it is taken as enough to represent
-the floating-point number exactly, and no rounding occurs.
-If the precision is zero, no hexadecimal-point character appears.
+.Pp
 The
-.Cm p
-is a literal character
-.Ql p ,
-and the exponent consists of a positive or negative sign
-followed by a decimal number representing an exponent of 2.
+.Vt double
+argument is rounded and converted to the decimal notation [\-]ddd.dddddd with
+.Ar decimals ,
+or six digits by default, after the decimal point.
+If
+.Ar decimals
+is zero and the
+.Sq Cm #
+flag is not given, the decimal point is omitted.
+If a decimal point appears, at least one digit appears before it.
+If the argument is infinity, it is converted to
+.Ql [-]inf .
+If the argument is not-a-number (NaN), it is converted to
+.Ql [-]nan .
+.Pp
+.Cm %Lf
+is similar to
+.Cm %f
+except that it takes an argument of
+.Vt long double .
+.Cm %lf Pq ell eff
+is an alias for
+.Cm %f .
+.It Cm \&%F
+Identical to
+.Cm %f
+except that upper case is used, i.e.\&
+.Ql [-]INF
+and
+.Ql [-]NAN
+for infinity and not-a-number, respectively.
+.It Cm %g
+.Sm off
+.Cm %
+.Op Ar argno Cm $
+.Op Cm #
+.Op Cm \~ | +
+.Op Cm \- | 0
+.Op Ar width
+.Op . Ar significant
+.Op Cm L | l
+.Cm g
+.Sm on
+.Pp
 The
-.Cm A
-conversion uses the prefix
-.Dq Li 0X
-(rather than
-.Dq Li 0x ) ,
-the letters
-.Dq Li ABCDEF
-(rather than
-.Dq Li abcdef )
-to represent the hex digits, and the letter
-.Ql P
-(rather than
-.Ql p )
-to separate the mantissa and exponent.
-.Pp
-Note that there may be multiple valid ways to represent floating-point
-numbers in this hexadecimal format.
-For example,
-.Li 0x3.24p+0 , 0x6.48p-1
+.Vt double
+argument is converted in style
+.Cm %f
+or
+.Cm %e
+.Pq general floating point notation
+with
+.Ar significant
+digits, or six significant digits by default.
+If
+.Ar significant
+is zero, one is used instead.
+Style
+.Cm %e
+is used if the exponent from its conversion is less than \-4
+or greater than or equal to
+.Ar significant .
+Unless the
+.Sq Cm #
+flag is given, trailing zeros are removed from the fractional
+part of the result, and the decimal point only appears if it is
+followed by at least one digit.
+.Pp
+.Cm %Lg
+is similar to
+.Cm %g
+except that it takes an argument of
+.Vt long double .
+.Cm %lg Pq ell gee
+is an alias for
+.Cm %g .
+.It Cm \&%G
+Identical to
+.Cm %g
+except that upper case is used, i.e.\&
+.Ql E
+instead of
+.Ql e
+to introduce the exponent and
+.Ql [-]INF
 and
-.Li 0xc.9p-2
-are all equivalent.
-The format chosen depends on the internal representation of the
-number, but the implementation guarantees that the length of the
-mantissa will be minimized.
-Zeroes are always represented with a mantissa of 0 (preceded by a
-.Ql -
-if appropriate) and an exponent of
-.Li +0 .
-.Pp
-If the argument is infinity, it will be converted to [-]inf
-.Pq Cm a
-or [-]INF
-.Pq Cm A ,
-respectively.
-If the argument is not-a-number (NaN), it will be converted to
-[-]nan
-.Pq Cm a
-or [-]NAN
-.Pq Cm A ,
-respectively.
-.It Cm c
+.Ql [-]NAN
+for infinity and not-a-number, respectively.
+.It Cm %i
+An alias for
+.Cm %d ,
+supporting the same modifiers.
+.It Cm %n
+.Sm off
+.Cm %
+.Op Ar argno Cm $
+.Op Ar size
+.Cm n
+.Sm on
+.Pp
+The number of bytes written so far is stored into the signed integer
+variable indicated by the pointer argument.
+No argument is converted.
+.Pp
+Make sure the
+.Ar size
+modifier matches the type of the pointer passed:
+.Bl -column %hhn
+.It Cm %hhn Ta Vt signed char *
+.It Cm %hn  Ta Vt signed short *
+.It Cm %n   Ta Vt signed int *
+.It Cm %ln  Ta Vt signed long * Pq percent ell dee
+.It Cm %lln Ta Vt signed long long * Pq percent ell ell dee
+.It Cm %jn  Ta Vt intmax_t *
+.It Cm %tn  Ta Vt ptrdiff_t *
+.It Cm %zn  Ta Vt ssize_t *
+.It Cm %qn  Ta Vt quad_t * Pq deprecated
+.El
+.It Cm %o
+.Sm off
+.Cm %
+.Op Ar argno Cm $
+.Op Cm #
+.Op Cm \- | 0
+.Op Ar width
+.Op . Ar mindigits
+.Op Ar size
+.Cm o
+.Sm on
+.Pp
+Similar to
+.Cm %u
+except that the
+.Vt unsigned int
+argument is converted to unsigned octal notation.
+If the
+.Sq Cm #
+flag is given,
+.Ar mindigits
+is increased such that the first digit printed is a zero,
+except if a zero value is printed with an explicit
+.Ar mindigits
+of zero.
+.It Cm \&%O
+A deprecated alias for
+.Cm %lo .
+.It Cm %p
 The
-.Li int
-argument is converted to an
-.Li unsigned char ,
-and the resulting character is written.
-.It Cm s
+.Vt void *
+pointer argument is printed in hexadecimal, similar to
+.Cm %#x
+or
+.Cm %#lx
+depending on the size of pointers.
+.It Cm %s
+.Sm off
+.Cm %
+.Op Ar argno Cm $
+.Op Cm \-
+.Op Ar width
+.Op . Ar maxbytes
+.Cm s
+.Sm on
+.Pp
+Characters from the
+.Vt char * Pq string
+argument are written up to (but not including) a terminating NUL character.
+If
+.Ar maxbytes
+is specified, at most
+.Ar maxbytes
+bytes are written; in that case, no NUL character needs to be present.
+.It Cm %ls
+.Sm off
+.Cm %
+.Op Ar argno Cm $
+.Op Cm \-
+.Op Ar width
+.Op . Ar maxbytes
+.Cm ls
+.Sm on
+.Pp
 The
-.Li char *
-argument is expected to be a pointer to an array of character type (pointer
-to a string).
-Characters from the array are written up to (but not including)
-a terminating NUL character;
-if a precision is specified, no more than the number specified are
-written.
-If a precision is given, no NUL character need be present;
-if the precision is not specified, or is greater than the size
-of the array, the array must contain a terminating NUL character.
-.It Cm p
+.Vt wchar_t * Pq wide character string
+argument is converted to a multibyte character string
+according to the current
+.Dv LC_CTYPE
+.Xr locale 1
+up to (but not including) a terminating NUL character,
+and that multibyte character string is written.
+If
+.Ar maxbytes
+is specified, at most
+.Ar maxbytes
+bytes are written; in that case, no NUL character needs to be present.
+If a multibyte character does not fit into the rest of
+.Ar maxbytes ,
+it is omitted together with the rest of the argument string;
+partial characters are not written.
+Locale dependency and padding work in the same way as for
+.Cm %lc .
+.It Cm %u
+.Sm off
+.Cm %
+.Op Ar argno Cm $
+.Op Cm \- | 0
+.Op Ar width
+.Op . Ar mindigits
+.Op Ar size
+.Cm u
+.Sm on
+.Pp
 The
-.Li void *
-pointer argument is printed in hexadecimal (as if by
-.Ql %#x
-or
-.Ql %#lx ) .
-.It Cm n
-The number of characters written so far is stored into the
-integer indicated by the
-.Li int *
-(or variant) pointer argument.
-No argument is converted.
-.It Cm %
-A
-.Ql %
+.Vt unsigned int
+argument is converted to unsigned decimal notation.
+If specified, at least
+.Ar mindigits
+are printed, padding with leading zeros if needed.
+The following are similar to
+.Cm %u
+except that they take an argument of a different size:
+.Bl -column %hhu
+.It Cm %hhu Ta Vt unsigned char
+.It Cm %hu  Ta Vt unsigned short
+.It Cm %u   Ta Vt unsigned int
+.It Cm %lu  Ta Vt unsigned long Pq percent ell u
+.It Cm %llu Ta Vt unsigned long long Pq percent ell ell u
+.It Cm %ju  Ta Vt uintmax_t
+.It Cm %tu  Ta unsigned type of same size as Vt ptrdiff_t
+.It Cm %zu  Ta Vt size_t
+.It Cm %qu  Ta Vt u_quad_t Pq deprecated
+.El
+.It Cm \&%U
+A deprecated alias for
+.Cm %lu .
+.It Cm %x
+.Sm off
+.Cm %
+.Op Ar argno Cm $
+.Op Cm #
+.Op Cm \- | 0
+.Op Ar width
+.Op . Ar mindigits
+.Op Ar size
+.Cm x
+.Sm on
+.Pp
+Similar to
+.Cm %u
+except that the
+.Vt unsigned int
+argument is converted to unsigned hexadecimal notation using the digits
+.Ql 0123456789abcdef .
+If the
+.Sq Cm #
+flag is given, the string
+.Ql 0x
+is prepended unless the value is zero.
+.It Cm \&%X
+Identical to
+.Cm %x
+except that upper case is used, i.e.\&
+.Ql 0X
+for the optional prefix and
+.Ql 0123456789ABCDEF
+for the digits.
+.It Cm %%
+A single percent sign
+.Pq Ql %
 is written.
 No argument is converted.
 The complete conversion specification is
-.Ql %% .
+.Ql %% ;
+no modifiers can be inserted between the two percent signs.
 .El
-.Pp
-In no case does a non-existent or small field width cause truncation of
-a field; if the result of a conversion is wider than the field width, the
-field is expanded to contain the conversion result.
 .Sh RETURN VALUES
 For all these functions if an output or encoding error occurs, a value
 less than 0 is returned.
@@ -657,7 +768,7 @@ The
 and
 .Fn vasprintf
 functions
-return the number of characters printed
+return the number of bytes printed
 (not including the trailing
 .Ql \e0
 used to end output to strings).
@@ -666,7 +777,7 @@ The
 .Fn snprintf
 and
 .Fn vsnprintf
-functions return the number of characters that would have
+functions return the number of bytes that would have
 been output if the
 .Fa size
 were unlimited
@@ -683,7 +794,7 @@ The
 .Fn asprintf
 and
 .Fn vasprintf
-functions return the number of characters that were output
+functions return the number of bytes that were output
 to the newly allocated string
 (excluding the final
 .Ql \e0 ) .
@@ -706,6 +817,28 @@ is set to the
 pointer, but other implementations may leave
 .Fa ret
 unchanged.
+.Sh ENVIRONMENT
+.Bl -tag -width LC_CTYPE
+.It Ev LC_CTYPE
+The character encoding
+.Xr locale 1 .
+It decides which
+.Vt wchar_t
+values represent valid wide characters for the
+.Cm %lc
+and
+.Cm %ls
+conversion specifiers and how they are encoded into multibyte characters.
+If unset or set to
+.Qq C ,
+.Qq POSIX ,
+or an unsupported value,
+.Cm %lc
+and
+.Cm %ls
+only work correctly for ASCII characters
+and fail for arguments greater than 255.
+.El
 .Sh EXAMPLES
 To print a date and time in the form `Sunday, July 3, 10:02',
 where