| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
a volatile sig_atomic_t variable, and then processing events in the mainloop.
But only one variable was used for 3 signals, with |= bit operations which
are interruptible by signals! Rewrite the code to use 3 independent variables
and clean up how the mainloop observes indications.
ok schwarze
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ligatures: it was incomplete (only for the Arabic script and only
for the single ligature LAM WITH ALEF) and it was implemented in a
way that is unsustainable (with a static table inside less).
If we ever want ligature support, we are better off making a fresh
start. However, for languages like Arabic and Persian, even that
wouldn't really be useful without having bidirectional support first.
OK millert@
(and also considering comments from Mohammadreza Abdollahzadeh,
Evan Silberman, and benno@)
|
|
|
|
|
|
|
| |
decoding a UTF-8 multibyte character to the left of a given byte -
is already needed at three places in line.c and will also be needed
for cleanup work in cmdbuf.c in the future.
OK millert@
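
As a rough illustration of the operation mentioned above, finding the character to the left of a given byte in UTF-8 amounts to stepping back over continuation bytes. The sketch below is not the function added by the commit; the name utf8_prev_start() and its interface are assumptions, and it only locates the start of the character without validating or decoding it.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the index where the UTF-8 character
 * ending just before buf[pos] begins.  Continuation bytes have the
 * bit pattern 10xxxxxx; a character has at most three of them.
 */
size_t
utf8_prev_start(const char *buf, size_t pos)
{
	size_t i, steps;

	if (pos == 0)
		return 0;
	i = pos - 1;
	steps = 0;
	while (i > 0 && steps < 3 &&
	    ((unsigned char)buf[i] & 0xC0) == 0x80) {
		i--;
		steps++;
	}
	return i;
}
```

In a UTF-8 locale, the bytes from the returned index up to pos could then be passed to mbtowc(3) to obtain the wide character.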
|
|
|
|
|
|
|
|
| |
- history trim
- sundry
diff from evan silberman;
tweaked/ok by schwarze and deraadt
|
|
|
|
| |
problem reported by George Brown <321 dot george at gmail dot com> on tech@.
|
|
|
|
|
|
| |
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
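
A small sketch of the stricter calling convention, assuming the message refers to the usual rule that a failing system call returns exactly -1 and sets errno only then. The wrapper name open_errno() is made up for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: try to open a file read-only.
 * Returns 0 on success, or the errno value on failure.
 */
int
open_errno(const char *path)
{
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)		/* compare against -1, not "< 0" */
		return errno;	/* errno is meaningful only here */
	close(fd);
	return 0;
}
```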
|
|
|
|
|
|
|
| |
(very sloppy specification) leaves an undefined value in *ret, so it is
wrong to inspect it, the error condition is enough.
discussed a little with nicm, and then much more with millert until we
were exasperated
|
|
|
|
|
| |
Most of these are correct just as '. A few benefit from Ql or \(aq.
But if in doubt, just use '.
|
| |
|
|
|
|
| |
and get_wchar() static for now - until they can be deleted
|
|
|
|
| |
with the standard isascii(3)
|
|
|
|
|
| |
This also allows deleting the buggy, now unused function put_wchar().
OK millert@
|
|
|
|
|
|
| |
This function is only ever called with constant ASCII string arguments,
so actually it doesn't need any UTF-8 handling whatsoever.
OK millert@
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Use the standard functions mbtowc(3), wcwidth(3), iscntrl(3) instead
of bad functions like get_wchar(), utf_len(), is_wide_char(),
is_composing_char(), is_combining_char(), control_char().
If only half of a double-width character is shifted off screen, do not
inspect anything following it because that clearly remains on-screen.
Improve and add comments.
OK millert@
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Employ the usual form of an mbtowc(3) loop, eliminating two calls
to the bad function step_char() and reducing the number of nested
loops by one. This also removes the last caller of the bad function
binary_char(), which is consequently deleted.
While here, count ASCII C0 non-whitespace control characters as
binary (except backspace and, with -R only, escape).
OK millert@
|
|
|
|
|
|
|
| |
Use the standard function mbrtowc(3) to distinguish valid, incomplete,
and invalid multibyte characters, getting rid of five calls to functions
and macros that we want to phase out, and of one goto. Add comments.
OK millert@.
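
For illustration, the three cases can be told apart from the return value of mbrtowc(3) alone. This is a sketch under assumptions, not the code from the commit: the enum and the function classify_mb() are hypothetical names.

```c
#include <string.h>
#include <wchar.h>

enum ch_kind { CH_VALID, CH_INCOMPLETE, CH_INVALID };	/* hypothetical */

/*
 * Classify the multibyte character at the start of s (len bytes
 * available).  On CH_VALID, *wc holds the character and *consumed
 * its length in bytes.
 */
enum ch_kind
classify_mb(const char *s, size_t len, wchar_t *wc, size_t *consumed)
{
	mbstate_t st;
	size_t n;

	memset(&st, 0, sizeof(st));	/* start in the initial state */
	n = mbrtowc(wc, s, len, &st);
	if (n == (size_t)-1) {		/* encoding error */
		*consumed = 1;
		return CH_INVALID;
	}
	if (n == (size_t)-2) {		/* more bytes are needed */
		*consumed = len;
		return CH_INCOMPLETE;
	}
	*consumed = (n == 0) ? 1 : n;	/* n == 0 means a NUL byte */
	return CH_VALID;
}
```

Which byte sequences count as valid depends on the current LC_CTYPE locale.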
|
|
|
|
|
|
|
|
| |
When looking for uppercase characters, iterate over multibyte
characters with the standard function mbtowc(3) rather than with
the buggy and outdated step_char(), skipping invalid bytes,
and correctly use iswupper(3) instead of the inapplicable isupper(3).
OK stsp@
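
The approach described above might look roughly like this. It is a sketch, not the committed code: the function name contains_upper() is an assumption, and the result depends on the current LC_CTYPE locale.

```c
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: return 1 if the first len bytes of s contain
 * an uppercase character, 0 otherwise.  Invalid bytes are skipped.
 */
int
contains_upper(const char *s, size_t len)
{
	wchar_t wc;
	int n;

	(void)mbtowc(NULL, NULL, 0);		/* reset internal state */
	while (len > 0) {
		n = mbtowc(&wc, s, len);
		if (n == -1) {
			(void)mbtowc(NULL, NULL, 0);	/* reset after failure */
			s++;			/* skip one invalid byte */
			len--;
			continue;
		}
		if (n == 0)
			n = 1;			/* step over a NUL byte */
		if (iswupper((wint_t)wc))
			return 1;
		s += n;
		len -= (size_t)n;
	}
	return 0;
}
```

Using iswupper(3) on the decoded wide character avoids passing individual bytes of a multibyte character to isupper(3).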
|
|
|
|
|
|
|
|
| |
* get_wchar() -> mbtowc(3)
* is_composing_char() || is_combining_char() -> wcwidth(3)
* control_char() -> !isprint(3)
* is_ubin_char() -> !iswprint(3)
OK millert@
|
|
|
|
|
|
|
|
|
| |
Use wchar_t instead of LWCHAR and mbtowc(3) instead of step_char().
Play it safe and handle all error cases, even in the arguably unlikely
case that linebuf[] contains UTF-8 encoding errors.
Reset mbtowc(3) internal state after failure for portability,
also in one place where mbtowc(3) was already introduced earlier.
OK nicm@
|
|
|
|
| |
tweak and OK millert@
|
|
|
|
|
|
|
|
|
|
| |
a call to the flawed function step_char(-1), using the standard
function mbtowc(3) instead.
Merge in in_ansi_esc_seq(), simplifying the code, and make the
related functions is_ansi_end() and is_ascii_char() static because
they are used in line.c only.
OK nicm@, and no opposition when shown on tech@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for ANSI escape sequences introduced by an 8-bit CSI (e.g. "\23343m")
because these are neither compatible with UTF-8 nor strictly
compatible with pure ASCII and for those introduced by an UTF-8 CSI
(e.g. "\302\23343m") because not even xterm(1) supports them at
all, not even with a non-default configuration, because both forms
are very rarely used, if at all, and because the current code trying
to support them doesn't even appear to work according to my tests.
Full support for the ESC-[ CSI (e.g. "\033[43m") remains.
Tweaks and OK millert@, OK nicm@,
and sthen@ agrees with the general direction.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the standard function wcwidth(3) instead of several hand-rolled
functions accessing outdated local character tables, making this
part of the code conform to our in-tree Unicode 10.
Of course, with the current hand-rolled (and buggy) UTF-8 parser
contained in less(1), this only works if wchar_t stores UCS-4 values
and is more than 31 bits wide, but both will always be true on
OpenBSD, and ultimately, we shall switch to mbtowc(3) for parsing
anyway, lifting these restrictions.
The existence of the outdated character tables was originally
called out by Evan Silberman on bugs@.
OK stsp@
|
|
|
|
| |
ok deraadt@
|
|
|
|
|
|
| |
As guenther@ said "STOP SPLITTING ANYTHING BUT $LESS ON '$' !".
anton@ came up with the same diff. ok nicm@
|
|
|
|
|
|
| |
As guenther@ said "fix whatever led to the \337 x 16 crap".
anton@ came up with the same diff. ok nicm@
|
|
|
|
| |
from Scott Cheloha who's pushing this upstream. ok tb@
|
|
|
|
|
| |
Based on a smaller diff from Jesper Wallin <jesper at ifconfig dot se>.
OK deraadt@
|
|
|
|
|
|
|
|
|
|
| |
bounds prior to calling regexec(). In this inverted scenario a match is found when
regexec() returns false causing the bounds to not be updated. This is
problematic since the bounds will then refer to a previous match and future
pointer arithmetic will eventually be off which is manifested in a SIGSEGV.
Issue reported by Larry Hynes on tech@
ok martijn@ tb@
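
One way to picture the problem and the fix: if the match bounds are set to a known value before regexec() is called, an inverted match (where regexec() reports no match) can never leave bounds from an earlier search behind. The sketch below assumes this reading of the message; match_line() and its parameters are hypothetical and do not mirror the actual less(1) code.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: search line with a compiled pattern.
 * With invert set, a line counts as matching when the pattern
 * does NOT occur in it.  Returns 1 on a match, 0 otherwise.
 * *sp and *ep receive the match bounds, or NULL if there are none.
 */
int
match_line(const regex_t *re, const char *line, int invert,
    const char **sp, const char **ep)
{
	regmatch_t rm;
	int found;

	/* Reset the bounds first so stale values never survive. */
	*sp = NULL;
	*ep = NULL;

	found = (regexec(re, line, 1, &rm, 0) == 0);
	if (found && !invert) {
		*sp = line + rm.rm_so;
		*ep = line + rm.rm_eo;
	}
	return invert ? !found : found;
}
```

Callers that do pointer arithmetic on the bounds can then check for NULL instead of trusting whatever a previous call stored.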
|
|
|
|
|
|
| |
the default.
okay millert@
|
|
|
|
| |
From Anton Lindqvist. OK tobias@ nicm@
|
|
|
|
|
|
|
|
| |
characters and top-bit-set nonprintable characters (so both iscntrl()
and !isprint()), fixes behaviour broken in r1.15/r1.16, noticed by
deraadt@.
ok deraadt tedu
|
|
|
|
| |
ok millert and nicm a while ago
|
|
|
|
|
| |
clear the status before printing content on the last line of the screen.
OK millert@ tom@
|
|
|
|
| |
is set to "*". Patch from Tobias Stoeckmann. OK tb@
|
|
|
|
|
| |
to prevent printing the calculating message over and over.
from Hugo Villeneuve
|
|
|
|
|
|
| |
* Consistently use "character encoding locale" as suggested by stsp@.
* Resolve various gratuitous wording variations.
OK jmc@.
|
| |
|
|
|
|
|
| |
more(1) options, so it is possible to change them using MORE. From Ross
L Richardson. ok deraadt millert
|
|
|
|
|
| |
from calling exit() when given an unknown terminal type.
From Anton Lindqvist, who also upstreamed the fix.
|
|
|
|
|
|
|
|
|
|
|
| |
can either mean an underlined underscore or a bold underscore. This
ambiguity can be 'resolved' by taking the state of the surrounding text
into account. If surrounded by bold text, the result should probably be
bold and likewise for underlined. less(1) previously only looked at the
preceding text and ul(1) didn't examine the context at all.
tweaks and ok schwarze
ok tb (on a previous version of the diff)
|
|
|
|
| |
ok nicm@
|
|
|
|
|
|
|
|
|
| |
I'm discussing with deraadt@ whether it's a good idea to convert some of
these to functions. The one changed by this commit probably isn't
eligible because it defines only a for loop's condition, but many others
in less(1) should probably be converted.
ok millert@
|
|
|
|
| |
ok nicm, tb
|
|
|
|
| |
from Michael Reed
|
|
|
|
| |
from Michael Reed
|
|
|
|
| |
ok nicm
|