| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
escape sequences just like it was earlier implemented for -Thtml.
Do not let control characters other than ASCII 9 (horizontal tab)
propagate to the output, even though groff allows them; but that
really doesn't look like a great idea.
Let mchars_num2char() return int such that we can distinguish invalid \N
syntax from \N'0'. This also reduces the danger of signed char issues
popping up.
|
|
|
|
|
|
|
|
| |
validity of character escape names and warn about unknown ones.
This requires mchars_spec2cp() to report unknown names again.
Fortunately, that doesn't require changing the calling code because
according to groff, invalid character escapes should not produce
output anyway, and now that we warn about them, that's fine.
|
|
|
|
|
| |
Accept only 0xXXXX, 0xYXXXX, 0x10XXXX with Y != 0.
This simplifies mchars_num2uc().
|
|
|
|
|
|
|
|
|
|
| |
In UTF-8 output, do not print anything if mchars_spec2cp() returns 0.
In particular, this repairs handling of zero-width spaces (\&).
While here, let mchars_spec2cp() return 0xFFFD instead of -1
if the character is not found, simplifying the using code.
In HTML output, do not print obfuscated ASCII characters and
do not test for one-char escapes, mchars_spec2cp() already does that.
|
|
|
|
|
|
|
|
| |
sequences above codepoint 512 by doing a reverse lookup in the
existing mandoc_char(7) character table.
Again, groff isn't smart enough to do this and silently discards such
escape sequences without printing anything.
|
|
|
|
|
|
|
|
|
|
|
|
| |
code points, provide ASCII approximations. This is already much better
than what groff does, which prints nothing for most code points.
A few minor fixes while here:
* Handle Unicode escape sequences in the ASCII range.
* In case of errors, use the REPLACEMENT CHARACTER U+FFFD for -Tutf8
and the string "<?>" for -Tascii output.
* Handle all one-character escape sequences in mchars_spec2{cp,str}()
and remove the workarounds on the higher level.
|
|
|
|
|
|
|
|
|
|
| |
After decoding numeric (\N) and one-character (\<, \> etc.)
character escape sequences, do not forget to HTML-encode the
resulting ASCII character. Malicious manuals were able to smuggle
XSS content by roff-escaping the HTML-special characters they need.
That's a classic bug type in many web applications, actually... :-(
Found myself while auditing the HTML formatter for safe output handling.
|
|
|
|
|
| |
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change
|
|
|
|
|
|
|
| |
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.
|
|
|
|
|
|
|
| |
documented in the Ossanna-Kernighan-Ritter troff manual
and also supported by groff.
Missing feature reported by Steffen Nurpmeso <sdaoden at gmail dot com>.
|
|
|
|
|
|
|
|
|
|
|
| |
* Parsing macro arguments has to be done in copy mode,
which implies replacing "\t" by a literal tab character.
* Otherwise, render "\t" as the empty string, not as a 't' character.
This fixes formatting of the distfile example in the oldrdist(1) manual.
This also shows up in the unzip(1) manual as one of several issues
preventing the removal of USE_GROFF from the archivers/unzip port.
Thanks to espie@ for attracting my attention to the unzip(1) manual.
|
|
|
|
|
|
|
| |
data pointed to, pass the size of the right pointer type to calloc;
cosmetic issue reported by Ulrich Spoerlein <uqs@spoerlein.net>
found in Coverity Scan CID 978734.
No binary change - ok cmp(1).
|
| |
|
|
|
|
|
|
|
|
| |
adding an implementation of the eqn(7) language
by kristaps@
So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.
|
|
|
|
|
| |
the hash chain or an extra function for checking matches;
from kristaps@
|
|
|
|
|
|
| |
* Do not pass integers outside the ASCII range to isprint().
* Make sure escaped characters are really printed verbatim
when the escape sequence has no special meaning.
|
|
|
|
|
|
|
|
|
| |
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.
|
|
|
|
|
|
|
|
|
|
| |
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes
|
|
|
|
|
|
|
| |
Don't use it in new manuals, it is inherently non-portable, but we
need it for backward-compatibility with existing manuals, for example
in Xenocara driver pages.
ok kristaps@ matthieu@ jmc@
|
|
|
|
| |
there are still a few bugs, but fixing these will be easier in tree.
|
|
|
|
| |
from kristaps@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ignore embedded escapes and mathematical roff subexpressions.
In roff copy mode, resolve "\\" to '\'.
Allow ".xx\}" where xx is a macro to close roff conditional scope.
Mandoc now handles the special character definitions in the pod2man(1)
preamble, so remove the explicit redefinitions in chars.c/chars.in.
From kristaps@.
I have checked that this causes no relevant change to the Perl manuals.
The only change introduced is that some non-ASCII characters rendered
incorrectly before are now rendered incorrectly in a different way.
For example, e accent aigu was "e", now is "e'"
and c cedille was "c", now is "c,".
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.
|
|
|
|
|
|
| |
It is always defined in the preamble using .ds when used in manuals.
Since we now support .ds, it is no longer necessary to provide it.
Triggered by a bug report from Thomas Jeunet, patch by kristaps@.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.
Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages
While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
at least one hyphen, we already had support for breaking the line a the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.
Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.
idea and coding by kristaps@
Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.
|
|
|
|
|
|
|
| |
* much improved pod2man support and low-level roff robustness
* have -Tlint imply -Wall and -fstrict
* use fewer macros and more enum in libman
* and various bug fixes
|
|
|
|
|
|
|
|
|
| |
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements
|
|
|
|
|
|
|
|
|
| |
- new function a2roffdeco
- font modes (\f) only affect the current stack point
- implement scaling (\s)
- implement space suppression (\c)
- implement non-breaking space (\~) in -Tascii
- many manual improvements
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features
portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- iuse argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation
simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/
|
|
|
|
| |
as mentioned in the preceding manual commit (oops)
|
|
tables and the supporting infrastructure, mostly in preparation for
HTML output support
|