| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
This is of some relevance because the pod2man(1) preamble abuses it
for the icelandic letter Thorn, instead of simply using \(TP and \(Tp.
Missing feature found by sthen@ in DateTime::Locale::is_IS(3p).
|
|
|
|
| |
without the required subsequent argument; found by jsg@ with afl.
|
|
|
|
| |
patch from Jan Stary <hans at stare dot cz> some time ago.
|
|
|
|
| |
and a few missing <sys/types.h> inclusions; no code change.
|
|
|
|
|
| |
Accept only 0xXXXX, 0xYXXXX, 0x10XXXX with Y != 0.
This simplifies mchars_num2uc().
|
|
|
|
|
|
|
| |
Require exactly 4, 5 or 6 hex digits and allow nothing else.
This avoids mishandling stuff like \[ua] and \C'uA' as Unicode
and also fixes underlining in eqn(7) -Thtml output which uses \[ul].
Problem found and semantics suggested by kristaps@.
|
|
|
|
|
|
|
| |
Fix a corner case where \H<nil> (where <nil> is the \0 character) would
cause mandoc_escape() to read past the end of an allocated string.
Found when a script scanning of all Mac OSX manuals accidentally also
scanned binary (gzip'd) files, discussed with schwarze@ on tech@mdocml.
|
|
|
|
|
| |
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.
|
|
|
|
|
|
|
| |
* Repair detection of invalid delimiters.
* Discard the invalid delimiter together with the invalid sequence.
Note to self: In general, strchr("\0...", c) is a thoroughly bad idea.
|
|
|
|
|
|
|
|
|
| |
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.
|
|
|
|
|
|
|
|
|
|
|
| |
So far, this covers all WARNINGs related to the prologue.
1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue
Started on the plane back from Ottawa.
|
|
|
|
|
| |
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change
|
|
|
|
|
|
|
|
|
|
|
| |
partially implement the \w (measure text width) escape sequence
in a way that makes them usable in numerical expressions and in
conditional requests, similar to how \n (interpolate number register)
and \* (expand user-defined string) are implemented.
This lets mandoc(1) handle the baroque low-level roff code
found at the beginning of the ggrep(1) manual.
Thanks to pascal@ for the report.
|
|
|
|
| |
Needed for example by the new Perl pod2man(1) preamble.
|
|
|
|
|
|
|
| |
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.
|
|
|
|
|
| |
Found by Thomas Klausner <wiz at NetBSD dot org> using clang.
No functional change.
|
|
|
|
| |
No functional change.
|
|
|
|
|
|
|
|
|
|
|
| |
mapped to ESCAPE_NUMBERED (which is for \N and only for \N), that
made no sense at all. Properly remap them to ESCAPE_IGNORE.
While here, move \B and \w from the group taking number arguments
to the group taking string arguments; right now, that doesn't imply
any functional change, but if we ever go ahead and implement a
parser for roff(7) numerical expressions, it will suddenly start
to matter, and cause confusion.
|
|
|
|
|
|
| |
und \u (move half line up). Found by bentley@ in some DocBook crap.
Surprisingly, these two do actually occur in our terminfo(5),
so this patch reduces groff-mandoc differences in base by 0.03%.
|
|
|
|
| |
suggested by Thomas Klausner <wiz @ NetBSD dot org>.
|
|
|
|
|
|
|
|
|
|
| |
It is already documented in the Heirloom troff manual,
and groff handles it as well.
Bug reported by Bjarni Ingi Gislason <bjarniig at rhi dot hi dot is>
on <bug-groff at gnu dot org>. Well, admittedly, that bug was reported
against groff, but mandoc was even more broken than groff with respect
to this syntax...
|
|
|
|
|
|
|
| |
- avoid bad qualifier casting in roff.c, roff_parsetext()
by changing the mandoc_escape arguments to "const char const **"
- avoid bad qualifier casting in mandocdb.c, index_merge()
- garbage collect a few unused variables elsewhere
|
|
|
|
|
| |
This improves the formatting of about 40 base manuals
and reduces groff-mandoc formatting differences in base by about 5%.
|
|
|
|
|
|
|
|
|
|
|
| |
* Parsing macro arguments has to be done in copy mode,
which implies replacing "\t" by a literal tab character.
* Otherwise, render "\t" as the empty string, not as a 't' character.
This fixes formatting of the distfile example in the oldrdist(1) manual.
This also shows up in the unzip(1) manual as one of several issues
preventing the removal of USE_GROFF from the archivers/unzip port.
Thanks to espie@ for attracting my attention to the unzip(1) manual.
|
|
|
|
| |
Needed for sqlite3(1) as reported by espie@.
|
|
|
|
|
|
|
|
|
|
|
| |
profit of the occasion to pull out some spaghetti, that is,
three confusing variables and fourteen pointless assignments
among them; instead, always operate on the official pointers
**start, **end, and *sz, each of which conveys an obvious meaning.
No functional change intended, and the new tests confirm that
everything still (err...) "works", as far as that word can be
applied to the kind of roff(7) mock-up code i'm polishing here.
|
|
|
|
|
|
|
|
|
|
|
|
| |
in particular when the inner escapes are preceded or followed by other terms.
While doing so, remove lots of bogus code that was trying to make pointless
distinctions between numeric and non-numeric escape sequences, while both
actually share the same syntax and we ignore the semantics anyway.
This prevents some of the strings defined in the pod2man(1) preamble
from producing garbage output, in particular in scandinavian words.
Of course, proper rendering of scandinavian national characters
cannot be expected even with these fixes.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
character without advancing the cursor position; implement it to
simply skip the next character, as it will usually be overwritten.
With this change, the pod2man(1) preamble user-defined string \*:,
intended to render as a diaeresis or umlaut diacritic above the
preceding character, is rendered in a slightly less ugly way,
though still not correctly. It was rendered as "z.." and is now
rendered as ".".
Given that the definition of \*: uses elaborate manual \h positioning,
there is little chance for mandoc(1) to ever render it correctly,
but at least we can refrain from printing out a spurious "z", and
we can make the \z do something semi-reasonable for easier cases.
|
|
|
|
|
|
|
|
|
|
|
| |
They have been considered valid in the past, but were reformatted
to the mdoc(7) "Month day, year" style.
To make page footers more similar to groff, no longer reformat them,
just print them as they are.
This doesn't change anything with respect to what's considered valid
or what is warned about.
Putting this in now such that i can improve the unit test suite.
|
|
|
|
|
|
| |
Helps with some real-world manuals that (incorrectly)
assume that the C font family means "constant width".
Suggested by Andreas Vogele, patch by kristaps@.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If \N is followed by a digit, ignore \N and the digit.
If \N is followed by a non-digit, the next non-digit
ends the character number; the two delimiters need not match.
Kristaps calls that "gross, but not our fault".
This fixes most of src/regress/usr.bin/mandoc/char/N/basic.in, except
that handling of non-printable characters still differs from groff.
For now, i'm fixing \N only. Other escapes taking numeric arguments
may or may not need similar handling, but \N is by far the most
important for practical purposes.
ok kristaps@
|
|
|
|
|
|
|
|
| |
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring
regressions are so minor that it's better to get this in
and fix them in the tree
|
|
|
|
|
|
|
|
| |
adding an implementation of the eqn(7) language
by kristaps@
So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.
|
|
|
|
|
|
|
|
|
| |
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.
|
|
|
|
|
|
|
|
|
|
| |
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes
|
|
|
|
|
| |
found by Ulrich Spoerlein <uqs at freebsd> using the clang static analyzer;
"ok, but please document the numbers" kristaps@
|
|
|
|
|
|
|
|
|
|
|
|
| |
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date though verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
argument parsing (in man_argv.c, man_args()), both having different bugs,
to use one common macro argument parser (in mandoc.c, mandoc_getarg()),
because from the point of view of roff, man macros are just roff macros,
hence their arguments are parsed in exactly the same way.
While doing so, fix these bugs:
* Escaped blanks (i.e. those preceded by an odd number of backslashes)
were mishandled as argument separators in unquoted arguments to
user-defined roff macros.
* Unescaped blanks preceded by an even number of backslashes were not
recognized as argument separators in unquoted arguments to man macros.
* Escaped backslashes (i.e. pairs of backslashes) were not reduced
to single backslashes both in unquoted and quoted arguments both
to user-defined roff macros and to man macros.
* Escaped quotes (i.e. pairs of quotes inside quoted arguments) were
not reduced to single quotes in man macros.
OK kristaps@
Note that mdoc macro argument parsing is yet another beast for no good
reason and is probably afflicted by similar bugs. But i don't attempt
to fix that right now because it is intricately entangled with lots of
unrelated high-level mdoc(7) functionality, like delimiter handling and
column list phrase handling. Disentagling that would waste too much
time now.
|
|
|
|
|
|
|
|
|
|
|
|
| |
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact in specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ignore embedded escapes and mathematical roff subexpressions.
In roff copy mode, resolve "\\" to '\'.
Allow ".xx\}" where xx is a macro to close roff conditional scope.
Mandoc now handles the special character definitions in the pod2man(1)
preamble, so remove the explicit redefinitions in chars.c/chars.in.
From kristaps@.
I have checked that this causes no relevant change to the Perl manuals.
The only change introduced is that some non-ASCII characters rendered
incorrectly before are now rendered incorrectly in a different way.
For example, e accent aigu was "e", now is "e'"
and c cedille was "c", now is "c,".
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.
|
|
|
|
|
|
|
| |
just in the same way as \s (font size) escapes, and handle \s escapes
not having an explicit plus or minus sign.
This fixes the very ugly rendering of "C++" in gcc(1).
Reported by Thomas Jeunet, fixed by kristaps@.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
should not flag the end of a sentence if:
1) The punctuation is followed by closing delimiters
and not preceded by alphanumeric characters, like in
"There is no full stop (.) in this sentence"
or
2) The punctuation is a child of a macro
and not preceded by alphanumeric characters, like in
"There is no full stop
.Pq \&.
in this sentence"
jmc@ and sobrado@ like this
|
|
|
|
|
|
|
|
|
|
|
|
| |
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.
Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages
While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
at least one hyphen, we already had support for breaking the line a the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.
Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.
idea and coding by kristaps@
Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.
|
|
|
|
|
|
| |
* recognize the end of quoted sentences, and of those in parantheses
* detect EOS in append_delims, so it works after all macros
by kristaps@
|
|
|
|
| |
from bsd.lv mandoc.c 1.13 and mdoc_macro.c 1.64
|