path: root/usr.bin/mandoc/html.c
Commit message  [Author, Age, Files, Lines]
...
* Switch HTML output to polyglot HTML5; have only one single -Thtml mode.  [schwarze, 2014-10-07, 1 file, -58/+38]
| | | | | | | | Replace hard-coded widths and alignments with a minimal embedded stylesheet. Do not use <p> because it cannot appear inside block macros. Remove the "summary" attribute because it is not HTML5. Written by kristaps@ some months ago, finished during EuroBSDCon.
* Revert previous, as requested by kristaps@.  [schwarze, 2014-08-14, 1 file, -2/+1]
| | | | | | | | | | | | | The .Bf block can contain subblocks, so it has to render as an element that can contain flow content. But <em> cannot contain flow content, only phrasing content. Rendering .Em and .Bf differently would be unfortunate, and closing out .Bf before subblocks and re-opening it afterwards would merely complicate both the C code of the program and the generated HTML code. Besides, converting .Em to semantic HTML markup would require some content to be put into <em> and some into <i>, but we cannot automatically distinguish which is which, so strictly speaking, we can't use semantic HTML here but have to fall back to physical markup. Wonders of HTML...
* Begin cleanup of scaling units.  [schwarze, 2014-08-13, 1 file, -1/+3]
| | | | | | | | | | | | Note that we use 240u := 1i for all devices, even -Tps and -Tpdf. Big fix of -Tascii rendering of f, m, and u. Small fix of -Tascii rendering of c. Big fix of -Thtml rendering of u. Big fix of -Tps rendering of m, p, and u. Clarify -Tps rendering of c. Correct documentation of scaling units, in particular with respect to u. This for example improves rendering of the OpenGL manuals. Joint work with kristaps@.
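The fixed ratio stated in this commit (240u := 1i on every output device) can be illustrated with a minimal sketch. The function and macro names below are hypothetical and are not taken from mandoc; only the ratio itself comes from the commit message.

```c
#include <assert.h>

/* Basic units per inch, as stated in the commit message: 240u := 1i. */
#define SKETCH_UNITS_PER_INCH 240.0

/* Convert roff basic units (u) to inches. */
double
sketch_units_to_inches(double units)
{
	return units / SKETCH_UNITS_PER_INCH;
}

/* Convert inches (i) to roff basic units (u). */
double
sketch_inches_to_units(double inches)
{
	assert(inches >= 0.0);
	return inches * SKETCH_UNITS_PER_INCH;
}
```

Keeping one ratio for all devices means a width given in u converts the same way in every output mode.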
* Use <em> for .Em and .Bf -emphasis.  [schwarze, 2014-08-13, 1 file, -1/+2]
| | | | | | | | | | | | | | | | | The vast majority of .Em in real-world manuals is stress emphasis, for which <em> is the correct markup. Admittedly, there are some instances of .Em usage for alternate quality, for which <i> would be a better match. Most of these are technical terms that neither allow semantic markup nor are keywords - for the latter, .Sy would be preferable. A typical example is that the shell breaks input into .Em words . Alternate voice or mood, which would also require <i>, is almost absent from manuals. We cannot satisfy both stress emphasis and alternate quality, so pick the one that fits more often and looks less wrong when off. Patch from Guy Harris <guy at alum dot mit dot edu>. ok bentley@ joerg@NetBSD
* Security fix:  [schwarze, 2014-07-23, 1 file, -27/+38]
| | | | | | | | | | After decoding numeric (\N) and one-character (\<, \> etc.) character escape sequences, do not forget to HTML-encode the resulting ASCII character. Malicious manuals were able to smuggle XSS content by roff-escaping the HTML-special characters they need. That's a classic bug type in many web applications, actually... :-( Found myself while auditing the HTML formatter for safe output handling.
* Security fix:  [schwarze, 2014-07-22, 1 file, -2/+5]
| | | | | | | | | | The function print_encode() is used both for plain text and for quoted attribute values. Escape the '"' character such that malicious manuals cannot pull off XSS attacks using malformed .Lk, .Mt, .%U, and .UR macros (and maybe others) to trigger the latter case. In the former case, escaping does no harm. Issue found by Sebastien Marie <semarie-openbsd at latrappe dot fr>.
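The idea behind both security fixes, encoding every HTML-special character (including '"') regardless of how it entered the text, might look roughly like the sketch below. This is not mandoc's print_encode(); the names sketch_entity() and sketch_html_escape() and the fixed-buffer interface are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the entity for an HTML-special character, or NULL if none is needed. */
const char *
sketch_entity(char c)
{
	switch (c) {
	case '<':
		return "&lt;";
	case '>':
		return "&gt;";
	case '&':
		return "&amp;";
	case '"':
		return "&quot;";
	default:
		return NULL;
	}
}

/*
 * Escape src into dst (capacity dstsz, including the terminating NUL).
 * Returns 0 on success and -1 if dst is too small; in the failure case
 * the content of dst is unspecified, so nothing is silently truncated.
 */
int
sketch_html_escape(const char *src, char *dst, size_t dstsz)
{
	size_t used = 0;

	assert(src != NULL && dst != NULL);
	for (; *src != '\0'; src++) {
		const char *ent = sketch_entity(*src);
		size_t len = ent != NULL ? strlen(ent) : 1;

		if (used + len + 1 > dstsz)
			return -1;
		if (ent != NULL)
			memcpy(dst + used, ent, len);
		else
			dst[used] = *src;
		used += len;
	}
	if (used + 1 > dstsz)
		return -1;
	dst[used] = '\0';
	return 0;
}
```

Escaping '"' in plain text as well as in attribute values costs nothing, which matches the remark in the commit message that escaping does no harm in the plain-text case.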
* Audit strlcpy(3)/strlcat(3) usage.  [schwarze, 2014-04-23, 1 file, -1/+7]
| | | | | | | | | | | | | * Repair three instances of silent truncation, use asprintf(3). * Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+... to use asprintf(3) instead to make them less error prone. * Cast the return value of four instances where the destination buffer is known to be large enough to (void). * Completely remove three useless instances of strlcpy(3)/strlcat(3). * Mark two places in -Thtml with XXX that can cause information loss and crashes but are not easy to fix, requiring design changes of some internal interfaces. * The file mandocdb.c remains to be audited.
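The point of replacing strlen(3)+malloc(3)+strlcpy(3)+strlcat(3) chains is that the buffer size is computed from the same format that fills it, so truncation cannot happen. Because asprintf(3) is not part of ISO C, the sketch below swaps in a two-pass snprintf(3) to the same effect; sketch_join() and its "dir/name" format are hypothetical and do not appear in the commit.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Build "<dir>/<name>" in freshly allocated memory.
 * Returns NULL on failure; the caller frees the result.
 */
char *
sketch_join(const char *dir, const char *name)
{
	char *buf;
	int need;

	assert(dir != NULL && name != NULL);
	/* First pass: measure the result without writing anything. */
	need = snprintf(NULL, 0, "%s/%s", dir, name);
	if (need < 0)
		return NULL;
	buf = malloc((size_t)need + 1);
	if (buf == NULL)
		return NULL;
	/* Second pass: the buffer is exactly large enough. */
	snprintf(buf, (size_t)need + 1, "%s/%s", dir, name);
	return buf;
}
```

On systems that provide asprintf(3), the two passes and the malloc(3) collapse into a single call with the same format string.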
* KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,  [schwarze, 2014-04-20, 1 file, -67/+59]
| | | | | remove trailing whitespace and blanks before tabs, improve some indenting; no functional change
* The files mandoc.c and mandoc.h contained both specialised low-level  [schwarze, 2014-03-21, 1 file, -1/+2]
| | | | | | | functions used for multiple languages (mdoc, man, roff), for example mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary functions. Split the auxiliaries out into their own file and header. While here, do some #include cleanup.
* Implement the \: (optional line break) escape sequence,  [schwarze, 2014-01-22, 1 file, -3/+9]
| | | | | | | documented in the Ossanna-Kernighan-Ritter troff manual and also supported by groff. Missing feature reported by Steffen Nurpmeso <sdaoden at gmail dot com>.
* Fix one case where a non-literal is used as format string.  [schwarze, 2014-01-05, 1 file, -2/+2]
| | | | | Fix another case where a variable is formatted using the wrong type. Patch from Joerg Sonnenberger <joerg@NetBSD>.
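Both bug classes named in this commit are easy to show in isolation. The following sketch uses hypothetical helper names and is not the code that was patched: the first function treats a message as data rather than as a format, the second uses the conversion that matches the argument type.

```c
#include <assert.h>
#include <stdio.h>

/* Print a message that may contain '%': it is data, never the format. */
int
sketch_print_msg(FILE *fp, const char *msg)
{
	assert(fp != NULL && msg != NULL);
	return fprintf(fp, "%s", msg);
}

/* Format a size_t with the matching conversion, %zu. */
int
sketch_format_count(char *buf, size_t bufsz, size_t count)
{
	assert(buf != NULL && bufsz > 0);
	return snprintf(buf, bufsz, "%zu", count);
}
```

Passing msg directly as the format argument would let a stray "%s" in the message read arguments that were never supplied.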
* Implement the roff(7) font-escape sequence \f(BI "bold+italic".  [schwarze, 2013-08-08, 1 file, -10/+35]
| | | | | This improves the formatting of about 40 base manuals and reduces groff-mandoc formatting differences in base by about 5%.
* Implement the roff \z escape sequence, intended to output the next  [schwarze, 2012-05-28, 1 file, -25/+58]
| | | | | | | | | | | | | | | | character without advancing the cursor position; implement it to simply skip the next character, as it will usually be overwritten. With this change, the pod2man(1) preamble user-defined string \*:, intended to render as a diaeresis or umlaut diacritic above the preceding character, is rendered in a slightly less ugly way, though still not correctly. It was rendered as "z.." and is now rendered as ".". Given that the definition of \*: uses elaborate manual \h positioning, there is little chance for mandoc(1) to ever render it correctly, but at least we can refrain from printing out a spurious "z", and we can make the \z do something semi-reasonable for easier cases.
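The simplification described here, dropping the character that follows \z instead of overprinting it, can be sketched as a plain string filter. sketch_apply_z() is a hypothetical name, and the sketch deliberately ignores every other escape sequence (for instance, it does not recognise an escaped backslash in front of a z), which a real roff parser has to handle.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy src to dst, dropping each "\z" together with the one character
 * that follows it.  dst must have room for strlen(src) + 1 bytes.
 */
void
sketch_apply_z(const char *src, char *dst)
{
	assert(src != NULL && dst != NULL);
	while (*src != '\0') {
		if (src[0] == '\\' && src[1] == 'z') {
			src += 2;	/* skip the escape itself */
			if (*src != '\0')
				src++;	/* skip the character it applies to */
			continue;
		}
		*dst++ = *src++;
	}
	*dst = '\0';
}
```

With this rule the input \z.. yields a single dot, in line with the "." rendering mentioned in the commit message.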
* Sync to version 1.12.0; all code by kristaps@:  [schwarze, 2011-10-09, 1 file, -3/+7]
| | | | | | | | Implement .Rv in -Tman. Let -man -Tman work a bit like cat(1). Add the -Ofragment option to -T[x]html. Minor fixes in -T[x]html. Lots of apropos(1) and -Tman code cleanup.
* clean up .HP, .IP, .TP, .nf, and \c handling in -T[x]html;  [schwarze, 2011-07-08, 1 file, -2/+4]
| | | | from kristaps@
* Sync to bsd.lv (all coded by kristaps@):  [schwarze, 2011-07-05, 1 file, -2/+1]
| | | | | | | | | | - mdoc(7): fix an assertion if the first line after .Bd -column starts with a blank, and some simplifications in mdoc_argv.c - man(7): literal mode ends at .SH and .SS (bug reported by naddy@) - allow .RS/.RE blocks to nest (bug reported by dcoppa@ and gsoares@) - improve vertical spacing of man(7) blocks - roff(7): clear user-defined strings when starting a new file - correct ID tags in -T[x]html
* Merge release 1.11.3, almost all code by kristaps@:  [schwarze, 2011-05-29, 1 file, -212/+129]
| | | | | | | | | * Unicode output support (no Unicode input yet, though). * Refactoring: completely handle predefined strings in roff.c. - New function mandoc_escape() replaces a2roffdeco() and mandoc_special(). - Start using mandoc_getarg() in mdoc_argv.c. - Clean up parsing of delimiters in mdoc(7). * And many minor fixes and lots of cleanup.
* Merge version 1.11.1:  [schwarze, 2011-04-24, 1 file, -10/+1]
| | | | | | | | | | | | | | Again lots of cleanup and maintenance work by kristaps@. - simplify error reporting: less function pointers, more mandoc_[v]msg - main: split document parsing out of main.c into read.c - roff, mdoc, man: improved recognition of control characters - roff: better handling of if/else stack overflows - roff: add some predefined strings for backward compatibility - mdoc, man: empty sections are not errors - mdoc: move delimiter handling to libmdoc - some header restructuring and some minor features and fixes This merge causes two minor regressions that i will fix in separate commits right afterwards.
* Merge version 1.10.10:  [schwarze, 2011-04-21, 1 file, -50/+9]
| | | | | | | | | | lots of cleanup and maintenance work by kristaps@. - move some main.c globals into struct curparse - move mandoc_*alloc to mandoc.h such that all code can use them - make mandoc_isdelim available to formatting frontends - dissolve mdoc_strings.c, move the code where it is used - make all error reporting functions void, their return values were useless - and various minor cleanups and fixes
* Implement the \N'number' (numbered character) roff escape sequence.  [schwarze, 2011-01-30, 1 file, -1/+17]
| | | | | | | Don't use it in new manuals, it is inherently non-portable, but we need it for backward-compatibility with existing manuals, for example in Xenocara driver pages. ok kristaps@ matthieu@ jmc@
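Parsing the argument of \N'number' amounts to checking the quotes and converting the digits in between. The sketch below is a guess at a minimal parser, not mandoc's implementation; the function name and the upper limit of 255 are assumptions chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse the text that follows "\N", expected to look like 'number'.
 * Returns the character code, or -1 if the argument is malformed
 * or larger than 255 (a limit assumed for this sketch).
 */
int
sketch_parse_numbered(const char *arg)
{
	char *end;
	long val;

	assert(arg != NULL);
	if (arg[0] != '\'')
		return -1;
	if (arg[1] < '0' || arg[1] > '9')
		return -1;	/* reject signs, blanks and empty input */
	errno = 0;
	val = strtol(arg + 1, &end, 10);
	if (errno != 0 || *end != '\'' || val > 255)
		return -1;
	return (int)val;
}
```

The decoded value is only a character code; for HTML output it still has to be encoded before printing.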
* Merge from bsd.lv, original commit message by kristaps@:  [schwarze, 2011-01-16, 1 file, -2/+15]
| | | | | | | | | | Change how -Thtml behaves with tables: use multiple rows, with widths set by COL, until an external macro is encountered. At this point in time, close out the table and process the macro. When the first table row is again re-encountered, re-start the table. This requires a bit of tracking added to "struct html", but the change is very small and follows the logic of meta-fonts. This all follows a bug-report by joerg@.
* In case an ID attribute is written in pieces, only protect the first  [schwarze, 2010-12-27, 1 file, -10/+14]
| | | | | | | | piece with a prepended 'x', not each piece, such that quoted and unquoted .Sh, .Ss, and .Sx arguments are compatible with each other. Fixing a bug reported by Nicolas Joly <njoly at NetBSD dot org>, avoiding a regression in my first patch as pointed out by njoly as well. "feel free to do so" kristaps@
* Yet another batch of -Thtml polishing from kristaps@:  [schwarze, 2010-12-25, 1 file, -29/+27]
| | | | | | In particular, use <SMALL> for .SM and <CODE> for .Dl. Use <B> for bold and <I> for italic in general. Also call this mandoc 1.10.8 now, as it is functionally equivalent, even though one set of refactoring patches has not been merged yet because it conflicts with our tbl(1) handling.
* More small -Thtml improvements by kristaps@,  [schwarze, 2010-12-22, 1 file, -15/+17]
| | | | | | | | in particular, use <B>, <I> and <U> where appropriate. Provide relative widths for header and footer lines. Manuals: More concise short descriptions of output modes. Correct a few places still talking about CSS2 to say CSS1. Code examples should use .Dl, not .D1.
* Significant improvements to -Thtml by kristaps@:  [schwarze, 2010-12-19, 1 file, -8/+12]
| | | | | | Use less <DIV>, use more <H1>, <H2>, <P>, <BR>, <PRE>, <UL>, <OL>, <DL> etc. Triggered by input from Will Backman. Remove CSS2 note in mandoc.1, which is no longer true.
* * need a space before .No even if it starts with a closing delimiter  [schwarze, 2010-10-01, 1 file, -1/+3]
| | | | | | | * slightly simplify .Pf *_IGNDELIM code, and share part of it with .No * do not let opening delimiters fall out of the front of .Ns (from kristaps@) This fixes a few spacing issues in csh(1) and ksh(1). OK kristaps@
* Merge the last bits of 1.10.6 (released today), most were already in:  [schwarze, 2010-09-27, 1 file, -3/+3]
| | | | | | | | | | | * ignore double-.Pp * ignore .Pp before .Bd and .Bl (unless -compact is specified) * avoid double blank line upon .Pp, .br and friends in literal context * cast enums to int when passing them to exit(3) to please lint(1) While merging, fix a regression introduced by kristaps@: Outside literal mode, double blank lines must both be printed. To achieve this again after kristaps@ improvements in 1.10.6, treat such blank lines as .sp (instead of .Pp as in 1.10.5) and drop .Pp before .sp just like dropping .Pp before .Pp.
* Implement a simple, consistent user interface for error handling.  [schwarze, 2010-08-20, 1 file, -3/+3]
| | | | | | | | | | | | | | | | | We now have sufficient practical experience to know what we want, so this is intended to be final: - provide -Wlevel (warning, error or fatal) to select what you care about - provide -Wstop to stop after parsing a file with warnings you care about - provide consistent exit status codes for those warnings you care about - fully document what warnings, errors and fatal errors mean - remove all other cruft from the user interface, less is more: - remove all -f knobs along with the whole -f option - remove the old -Werror because calling warnings "fatal" is silly - always finish parsing each file, unless fatal errors prevent that This commit also includes a couple of related simplifications behind the scenes regarding error handling. Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and Sascha Wildner (DragonFly BSD) agree with the general direction.
* Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.  [schwarze, 2010-07-31, 1 file, -1/+3]
| | | | | | | | | | | | | NOT including Kristaps' .Bd -literal changes which cause regressions. Features: * -Tpdf now fully working Bugfixes: * proper handling of quoted strings by .ds in roff(7) * allow empty .Dd * make .Sm start no-spacing after the first output word * underline .Ad * minor fixes in -Thtml and some optimisations in terminal output.
* Sync to bsd.lv; in particular, pull in lots of bug fixes.  [schwarze, 2010-07-25, 1 file, -13/+31]
| | | | | | | | | | | | | | | | | | | | | new features: * support the .in macro in man(7) * support minimal PDF output * support .Sm in mdoc(7) HTML output * support .Vb and .nf in man(7) HTML output * complete the mdoc(7) manual bug fixes: * do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@ * avoid double blank lines related to man(7) .sp and .br * let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@ * let "\ " produce a non-breaking space; reported by deraadt@ * discard \m colour escape sequences; reported by J.C. Roberts * map undefined 1-character-escapes to the literal character itself maintenance: * express mdoc(7) arguments in terms of an enum for additional type-safety * simplify mandoc_special() and a2roffdeco() * use strcspn in term_word() in place of a manual loop * minor optimisations in the -Tps and -Thtml formatting frontends
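One maintenance item above replaces a manual scanning loop with strcspn(3). A minimal sketch of that pattern follows; the helper name and the choice of the backslash as the only stop character are assumptions, since the commit message does not say which characters term_word() looks for.

```c
#include <assert.h>
#include <string.h>

/* Length of the leading run of s that contains no backslash. */
size_t
sketch_plain_len(const char *s)
{
	assert(s != NULL);
	return strcspn(s, "\\");
}
```

The caller can then emit that many bytes in one step and handle the escape sequence, if any, that starts right after them.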
* Merge release 1.10.4 (all code by kristaps@), providing four new features:  [schwarze, 2010-07-13, 1 file, -13/+27]
| | | | | | | | | | 1) Proper .Bk support: allow output line breaks at input line breaks, but keep input lines together in the output, finally fixing synopses like aucat(1), mail(1) and tmux(1). 2) Mostly finished -Tps (PostScript) output. 3) Implement -Thtml output for .Nm blocks and .Bk -words. 4) Allow iterative interpolation of user-defined roff(7) strings. Also contains some minor bugfixes and some performance improvements.
* Remove "pt" from struct roffsu, as CSS (the only reason it was there) is  [schwarze, 2010-06-27, 1 file, -6/+6]
| | | | | | unclear about which units accept floats/integers, which leads me to assume that it handles either and rounds as appropriate. from kristaps@
* Merge more bits that will be going into 1.10.1:  [schwarze, 2010-06-08, 1 file, -3/+1]
| | | | | | | | | | | | Clean up vertical spacing in the SYNOPSIS, making the code much more systematic; this doesn't solve all SYNOPSIS problems yet, in particular not those related to keeps, indentation and the low-level .nr roff instruction, but it's a nice step forward and i couldn't find relevant regressions. (from kristaps) Besides, * make the output width configurable (default: -Owidth=80) (kristaps) * use mmap with MAP_SHARED (from Joerg Sonnenberger)
* When a word does not fully fit onto the output line, but it contains  [schwarze, 2010-05-26, 1 file, -16/+15]
| | | | | | | | | | | | | | | | | | | | | | | at least one hyphen, we already had support for breaking the line at the last fitting hyphen. This patch improves this functionality by only breaking at hyphens in free-form text, and by not breaking at hyphens * at the beginning or end of a word or * immediately preceded or followed by another hyphen or * escaped by a preceding backslash. Before this patch, differences in break-at-hyphen support were one of the major sources of noise in automatic comparisons to mdoc(7) groff output. Now, the remaining differences are hard to find among the noise coming from other sources. Where there are still differences, what we do seems to be better than what groff does, see e.g. the chio(1) exchange and position commands for one of the now rare examples. idea and coding by kristaps@ Besides, this was the last substantial code difference left between bsd.lv and openbsd.org. We are now in full sync.
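The three exclusions listed in this commit translate directly into a predicate on a word and a position. The sketch below is an illustration only: sketch_can_break_at() is a hypothetical name, and the further condition that breaking applies to free-form text only is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if a line break is allowed after the hyphen at word[i], else 0. */
int
sketch_can_break_at(const char *word, size_t i)
{
	size_t len;

	assert(word != NULL);
	len = strlen(word);
	if (i >= len || word[i] != '-')
		return 0;
	/* Not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return 0;
	/* Not immediately preceded or followed by another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;
	/* Not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return 0;
	return 1;
}
```

A formatter would scan a word from the right for the last position that both fits on the line and passes this check.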
* merge 1.9.24, keeping local patches; some changes:  [schwarze, 2010-05-14, 1 file, -2/+6]
| | | | | | | | | * preserve multiple consecutive space characters in input * do not restrict .Cd and .Rv to certain sections (requested by Joerg) * do not run lookup() on quoted words * enum return types for mdoc_args and mdoc_argv * fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel) * various lint and manual fixes
* Merge the good parts of 1.9.23,  [schwarze, 2010-04-07, 1 file, -6/+1]
| | | | | | | | | | | | | | | | | | | | | | | | | | | | avoid the bad parts of 1.9.23, and keep local patches. Input in general: * Basic handling of roff-style font escapes \f, \F. * Quoted punctuation does not count as punctuation. mdoc(7) parser: * Make .Pf callable; noted by Claus Assmann. * Let .Bd and .Bl ignore unknown arguments; noted by deraadt@. * Do not warn when .Er is used outside certain sections. * Replace mdoc_node_free[list] by mdoc_node_delete. * Replace #define by enum for rew*() return values. man(7) parser: * When .TH is missing, use default section and date. Output in general: * Curly braces do not count as punctuation. * No space after .Fl w/o args when a macro follows on the same line. HTML output: * Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT(). * Print whitespace after, not before .Vt .Fn .Ft .Fo. Checked that all manuals in base still build.
* sync to release 1.9.15:  [schwarze, 2010-02-18, 1 file, -20/+112]
| | | | | | | | | * corrected .Vt handling (spotted by Joerg Sonnenberger) * corrected .Xr argument handling (based on my patch) * removed \\ escape sequence (because it is for low-level roff only) * warn about trailing whitespace (suggested by jmc@) * -Txhtml support * and some general cleanup and doc improvements
* sync to 1.9.14: rewrite escape sequence handling:  [schwarze, 2009-12-24, 1 file, -118/+108]
| | | | | | | | | - new function a2roffdeco - font modes (\f) only affect the current stack point - implement scaling (\s) - implement space suppression (\c) - implement non-breaking space (\~) in -Tascii - many manual improvements
* sync to 1.9.13: minor fixes:  [schwarze, 2009-12-23, 1 file, -25/+22]
| | | | | | | | | | | | | | | | | | correctness/functionality: - bugfix: properly ignore lines with only a dot in -man - bugfix: .Bl -ohang doesn't allow -width, warn about this - improve date string handling by new function mandoc_a2time - some HTML improvements - significant documentation additions in man.7 and mdoc.7 portability: - replace __dead by __attribute__((noreturn)) - bugfix: correct .Dx rendering - some more library names for NetBSD simplicity: - replace hand-rolled putchar(3)-loops by fwrite(3) - replace single-character printf(3) by putchar(3)
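The two simplifications at the end of this list follow a common pattern: hand a whole buffer to fwrite(3) instead of looping over it, and use a character-output call instead of printf(3) for a single character. A minimal sketch with hypothetical helper names; putc(3) is used here as the stream-taking sibling of putchar(3).

```c
#include <assert.h>
#include <stdio.h>

/* Write len bytes of buf to fp in one call; returns the number written. */
size_t
sketch_emit(FILE *fp, const char *buf, size_t len)
{
	assert(fp != NULL && buf != NULL);
	return fwrite(buf, 1, len, fp);
}

/* Write a single character without going through a format string. */
int
sketch_emit_char(FILE *fp, int c)
{
	assert(fp != NULL);
	return putc(c, fp);
}
```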
* sync to 1.9.12, mostly portability and refactoring:  [schwarze, 2009-12-22, 1 file, -14/+44]
| | | | | | | | | | | | | | | | | | | correctness/functionality: - bugfix: do not die when overstep hits the right margin - new option: -fign-escape - and various HTML features portability: - replace bzero(3) by memset(3), which is ANSI C - replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C - use argv[0] instead of __progname - add time.h to various files for FreeBSD compilation simplicity: - do not allocate header/footer data dynamically in *_term.c - provide and use malloc frontends that error out on failure for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/
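A malloc frontend that errors out on failure, written with the portable calls this commit switches to (perror(3), exit(3), memset(3)), could look like the sketch below. The names sketch_malloc() and sketch_zalloc() and the EXIT_FAILURE status are assumptions; the commit message does not give the actual names or exit codes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Allocate size bytes or terminate the program; never returns NULL. */
void *
sketch_malloc(size_t size)
{
	void *p;

	assert(size > 0);
	p = malloc(size);
	if (p == NULL) {
		perror("malloc");
		exit(EXIT_FAILURE);
	}
	return p;
}

/* Same, but the memory is cleared with memset(3) rather than bzero(3). */
void *
sketch_zalloc(size_t size)
{
	void *p = sketch_malloc(size);

	memset(p, 0, size);
	return p;
}
```

Callers can then use the returned pointer without a NULL check at every allocation site.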
* sync to 1.9.11: adapt printing of dates to groff conventions,  [schwarze, 2009-10-27, 1 file, -17/+13]
| | | | | NetBSD portability fixes and some minor bugfixes and feature enhancements; also checked that my hyphenation code still works on top of this
* sync to 1.9.9, featuring:  [schwarze, 2009-10-21, 1 file, -0/+649]
* -Thtml output mode * roff scaling units * and some minor fixes for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/