path: root/usr.bin/mandoc/mdoc_macro.c
Commit message, Author, Age, Files, Lines
* Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place as defining a term. Please only use it when automatic tagging does not work. Manual page authors will not be required to add the new macro; using it remains optional. HTML output is still rudimentary in this version and will be polished later. Thanks to kn@ for reminding me that i have been considering since BSDCan 2014 whether something like this might be useful. Given that possibilities of making automatic tagging better are running out and there are still several situations where automatic tagging cannot do the job, i think the time is now ripe. Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.
  (schwarze, 2020-01-19, 1 file, -2/+3)
* Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro: without an argument, use the empty string, and always concatenate all arguments, no matter their number. This allows reducing the number of arguments of mandoc_normdate() and some other simplifications, at the same time polishing some error messages by adding the name of the macro in question.
  (schwarze, 2020-01-19, 1 file, -2/+2)
* Represent mdoc(7) .Pp (and .sp, and some SYNOPSIS and .Rs features) by the <p> HTML element and use the html_fillmode() mechanism for .Bd -unfilled, just like it was done for man(7) earlier, finally getting rid both of the horrible <div class="Pp"></div> hack and of the worst HTML syntax violations caused by nested displays. Care is needed because in some situations, paragraphs have to remain open across several subsequent macros, whereas in other situations, they must get closed together with a block containing them. Some implementation details include:
  * Always close paragraphs before emitting HTML flow content.
  * Let html_close_paragraph() also close <pre> for extra safety.
  * Drop the old, now unused function print_paragraph().
  * Minor adjustments in the top-level man(7) node formatter for symmetry.
  * Bugfix: .Ss heads suspend no-fill mode, even though .Ss doesn't end it.
  * Bugfix: give up on .Op semantic markup for now, see the comment.
  (schwarze, 2019-01-07, 1 file, -3/+14)
* Correctly set the ROFF_NOFILL parser flag for .Bd .Ed .Sh, such that children and later siblings get correct NODE_NOFILL assignments. This doesn't change rendering yet but prepares for future rendering improvements.
  (schwarze, 2019-01-01, 1 file, -4/+46)
* Cleanup, minus 15 LOC, no functional change: Simplify the way the man(7) and mdoc(7) validators are called. Reset the parser state with a common function before calling them. There is no need to again reset the parser state afterwards; the parsers are no longer used after validation. This allows getting rid of man_node_validate() and mdoc_node_validate() as separate functions.
  (schwarze, 2018-12-31, 1 file, -2/+1)
* Cleanup, no functional change: The struct roff_man used to be a bad mixture of internal parser state and public parsing results. Move the public results to the parsing result struct roff_meta, which is already public. Move the rest of struct roff_man to the parser-internal header roff_int.h. Since the validators need access to the parser state, call them from the top level parser during mparse_result() rather than from the main programs, also reducing code duplication. This keeps parser internal state out of the main programs (five in mandoc portable) and out of eight formatters.
  (schwarze, 2018-12-30, 1 file, -2/+2)
* Rename mandoc_getarg() to roff_getarg() and pass it the roff parser struct as an argument such that after copy-in, it can call roff_expand() once again, which used to be called roff_res() before this. This fixes a subtle low-level roff(7) parsing bug reported by Fabio Scotoni <fabio at esse dot ch> in the 4.4BSD-Lite2 mdoc.samples(7) manual page, because that page used an escaped escape sequence in a macro argument. To expand escaped escape sequences in quoted mdoc(7) arguments, too, stop bypassing the call to roff_getarg() in mdoc_argv.c, function args() for this case. This does not solve the case of escaped escape sequences in quoted .Bl -column phrases yet. Because roff_expand() can make the string longer, roff_getarg() can no longer operate in-place but needs to malloc(3) the returned string. In the high-level parsers, free(3) that string after processing it.
  (schwarze, 2018-12-21, 1 file, -23/+66)
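The ownership change in this commit (an argument parser that returns a malloc(3)ed string instead of editing the input in place) can be sketched roughly as below. This is not mandoc's code: expand_arg() and its toy expansion rule are hypothetical and only illustrate why a result that may grow needs its own buffer, and why the caller then has to free(3) it.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for an expanding argument parser.
 * Toy rule (NOT roff syntax): every '$' in the input becomes "USD",
 * so the result can be longer than the input and cannot be built
 * in place.  Returns a malloc(3)ed string, or NULL on failure;
 * the caller must free(3) it.
 */
char *
expand_arg(const char *in)
{
	size_t		 need = 1;	/* room for the terminating NUL */
	const char	*p;
	char		*out, *q;

	/* First pass: compute the size of the expanded string. */
	for (p = in; *p != '\0'; p++)
		need += (*p == '$') ? 3 : 1;

	if ((out = malloc(need)) == NULL)
		return NULL;

	/* Second pass: copy, expanding as we go. */
	for (p = in, q = out; *p != '\0'; p++) {
		if (*p == '$') {
			memcpy(q, "USD", 3);
			q += 3;
		} else
			*q++ = *p;
	}
	*q = '\0';
	return out;
}
```

A caller would use the returned string and then release it, for example `char *s = expand_arg("a$b"); ... free(s);`.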
* Almost mechanical diff to remove the "struct mparse *" argument from mandoc_msg(), where it is no longer used. While here, rename mandoc_vmsg() to mandoc_msg() and retire the old version: There is really no point in having another function merely to save "%s" in a few places. Minus 140 lines of code.
  (schwarze, 2018-12-14, 1 file, -42/+33)
* Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all combinations are handled, and are handled in a systematic manner. This resolves some erratic duplicate handling, handles a number of missing cases, and improves diagnostics in various respects. Move validation of .br and .sp to the roff validation module rather than doing that twice in the mdoc and man validation modules. Move the node relinking function to the roff library where it belongs. In validation functions, only look at the node itself, at previous nodes, and at descendants, not at following nodes or ancestors, such that only nodes are inspected which are already validated.
  (schwarze, 2018-12-04, 1 file, -2/+2)
* Remove more pointer arithmetic passing via regions outside the array that is undefined according to the C standard. Robert Elz <kre at munnari dot oz dot au> pointed out i wasn't quite done yet.
  (schwarze, 2018-08-17, 1 file, -22/+30)
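As a reminder of the kind of problem this commit addresses: in C, merely computing a pointer that lies before the start of an array is undefined, even if it is never dereferenced. The sketch below is not taken from mdoc_macro.c; count_preceding_backslashes() is a hypothetical helper that shows one common way to scan backwards without ever forming such a pointer.

```c
#include <stddef.h>

/*
 * Count the backslashes immediately preceding buf[pos].
 * Hypothetical helper for illustration only.
 *
 * A backwards loop over a pointer that is decremented until it
 * compares less than buf would end by computing buf - 1, which is
 * undefined in C even though the pointer is never dereferenced.
 * Counting down an index that stops at zero avoids forming such
 * a pointer.
 */
size_t
count_preceding_backslashes(const char *buf, size_t pos)
{
	size_t	 n = 0;

	while (pos > 0 && buf[pos - 1] == '\\') {
		pos--;
		n++;
	}
	return n;
}
```

The index is checked against zero before it is used, so the loop terminates cleanly at the start of the buffer.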
* Macro argument quoting does not prevent recognition of punctuation and of called macros. This bug affects almost all macros, and fixing it simplifies the code. It is amazing that the bogus ARGS_QWORD feature got implemented in the first place, and then carried along for more than eight years without anybody ever noticing that it was pointless. Reported by Leah Neukirchen <leah at vuxu dot org>, found on Void Linux.
  (schwarze, 2017-05-30, 1 file, -23/+17)
* Move .sp to the roff modules. Enough infrastructure is in place now that this actually saves code: -70 LOC.
  (schwarze, 2017-05-05, 1 file, -5/+3)
* move .ll to the roff modules
  (schwarze, 2017-05-05, 1 file, -3/+2)
* Parser reorg: Generate the first node on the roff level: .br
  Fix some column numbers in diagnostic messages while here.
  (schwarze, 2017-05-04, 1 file, -7/+6)
* Parser unification: use nice ohashes for all three request and macro tables; no functional change, minus two source files, minus 200 lines of code.
  (schwarze, 2017-04-29, 1 file, -2/+2)
* Continue parser unification:
  * Make enum rofft an internal interface as enum roff_tok in "roff.h".
  * Represent mdoc and man macros in enum roff_tok.
  * Make TOKEN_NONE a proper enum value and use it throughout.
  * Put the prologue macros first in the macro tables.
  * Unify mdoc_macroname[] and man_macroname[] into roff_name[].
  (schwarze, 2017-04-24, 1 file, -43/+42)
* Fix handling of trailing punctuation in .Lk. This macro is unusual in so far as trailing punctuation needs to remain inside the scope because it must be inside, not after the display of long URIs in terminal output mode. Improves formatting of fw_update(1), help(1), less(1), sendbug(1), acx(4), inet6(4), ipsec(4), oce(4), isakmpd.conf(5), afterboot(8), release(8), traceroute(8).
  (schwarze, 2017-04-17, 1 file, -2/+6)
* Fix block scoping error if an explicit block is broken by two implicit blocks (.Aq Bq Po .Pc) that left the outer breaker open and could in exceptional cases, like between .Bl and .It, cause tree corruption leading to NULL dereference. Found by tb@ with afl(1). While here, do not mark intermediate ENDBODY markers as broken.
  (schwarze, 2017-02-16, 1 file, -3/+6)
* Remove the ENDBODY_NOSPACE flag, simplifying the code. Comparing to groff output, it appears that all cases where it was used and made a difference actually require the opposite, ENDBODY_SPACE. I have no idea why i added it back in 2010; maybe to compensate for some other bug that has long been fixed.
  (schwarze, 2017-02-16, 1 file, -4/+4)
* Never look for broken blocks inside blocks that are already closed. Fixes the last of the tree corruptions sometimes causing NULL dereference reported by tb@; this one triggered in cases like: .Bl -column .It Pq Ta
  (schwarze, 2017-02-11, 1 file, -4/+5)
* Do not prematurely close .Nd containing a broken child. Fixes tree corruption leading to NULL dereference in insane cases like .Oo Oo .Nd .Pq Oc .Oc Oc found by tb@ with afl(1).
  (schwarze, 2017-02-11, 1 file, -3/+9)
* Do not prematurely mark intermediate blocks as broken while scanning backwards. Only do so when a block is found that is actually broken. Logic error found while investigating crashes reported by tb@.
  (schwarze, 2017-02-11, 1 file, -10/+17)
* For child macros of block-end macros, only scan backwards for pending breakers unless the parent of the block is already closed. While the scanning is needed in cases like ".Ac Bo" for broken Ao, it is useless and crashy in cases like ".Ac Bc" for non-broken Ao. This fixes a NULL pointer dereference that tb@ found with afl(1).
  (schwarze, 2017-02-10, 1 file, -7/+8)
* Oops, the previous commit unintentionally included this file. The intended commit message for rev. 1.167 is: In the SYNOPSIS, .Nm blocks can get broken if one of their children gets broken. In that case, mark them as BROKEN and ENDED and make sure they get closed out together with the child. Fixes tree corruption leading to a NULL dereference found by tb@ with afl(1) in: .Sh SYNOPSIS .Bl .Oo .Nm .Bk .Oc .It (where .Bk is the child and .Oo is the breaker). A simpler form of the same corruption (without crash) is visible in: .Sh SYNOPSIS .Ao .Nm .Bo .Ac .Bc text where the text ended up inside the .Nm (child .Bo, breaker .Ao).
  (schwarze, 2017-02-10, 1 file, -1/+1)
* In -Ttree output mode, show the BROKEN node flag and provide a -Onoval output option to show the unvalidated tree.
  (schwarze, 2017-02-10, 1 file, -7/+15)
* unify names of AST node flags; no change of cpp output
  (schwarze, 2017-01-10, 1 file, -31/+31)
* When a mismatching end macro occurs while at least two nested blocks are open, all except the innermost open block got a bogus MDOC_ENDED marker, in some situations triggering segfaults down the road which tb@ found with afl(1). Fix the logic error by figuring out up front whether an end macro has a matching body, and if it hasn't, don't mark any blocks as broken.
  (schwarze, 2016-08-20, 1 file, -14/+23)
* When scanning upwards for a column list to put a .Ta macro in, ignore body end markers of lists breaking other blocks. Fixing a logical error that caused a NULL deref found by tb@ with afl(1).
  (schwarze, 2016-08-20, 1 file, -2/+2)
* Even after switching from a pending head to the body, we have to continue scanning upwards, because the enclosing block might already be pending as well, e.g. .Bl .Bl .It Bo .El .It. Tree corruption leading to a later NULL deref found by tb@ with afl(1).
  (schwarze, 2016-08-13, 1 file, -2/+2)
* In order to become able to generate syntax tree nodes on the roff(7) level, validation must be separated from parsing and rewinding. This first big step moves calling of the mdoc(7) post_*() functions out of the parser loop into their own mdoc_validate() pass, while using a new mdoc_state() module to make syntax tree state handling available to both the parser loop and the validation pass.
  (schwarze, 2015-10-20, 1 file, -13/+7)
* Very tricky diff to fix macro interpretation and spacing around tabs in .Bl -column; it took me more than a day to get this right. Triggered by a loosely related bug report from tim@. The lesson for you is: Use .Ta macros in .Bl -column, avoid tabs, or you are in for surprises: The last word before a tab is not interpreted as a macro (unless there is a blank in between), the first word after a tab isn't either (unless there is a blank in between), and a blank after a tab causes a leading blank in the respective output cell. Yes, "blank", "tab", "blank tab" and "tab blank" all have different semantics; if you write code relying on that, good luck maintaining it afterwards...
  (schwarze, 2015-10-17, 1 file, -23/+29)
* When blk_full() handles an .It line in .Bl -column and indirectly calls phrase_ta() to handle a .Ta child macro, advance the body pointer accordingly, such that a subsequent tab character rewinds the right body block and doesn't fail an assertion. That happened when there was nothing between the .Ta and the tab character. Bug reported by tim@ some time ago.
  (schwarze, 2015-10-15, 1 file, -1/+6)
* To make the code more readable, delete 283 /* FALLTHROUGH */ comments that were right between two adjacent case statements. Keep only those 24 where the first case actually executes some code before falling through to the next case.
  (schwarze, 2015-10-12, 1 file, -11/+1)
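The convention described in this commit might look roughly like the following. The function punct_seen() is hypothetical and exists only to show the two situations: adjacent case labels that need no comment, and a case that runs code before falling through, where the comment is kept.

```c
/*
 * Hypothetical example, for illustration only.
 * Returns 1 if c is punctuation handled here, 0 otherwise,
 * and counts sentence-ending characters in *nends.
 */
int
punct_seen(int c, int *nends)
{
	switch (c) {
	case '.':
	case '!':
	case '?':	/* adjacent labels: no comment needed above */
		(*nends)++;
		/* FALLTHROUGH */
	case ',':
	case ';':
		return 1;
	default:
		return 0;
	}
}
```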
* modernize style: "return" is not a function; ok cmp(1)
  (schwarze, 2015-10-06, 1 file, -25/+25)
* /* NOTREACHED */ after abort() is silly, delete it
  (schwarze, 2015-09-26, 1 file, -2/+1)
* mdoc_valid_post() may indirectly call roff_node_unlink() which may set ROFF_NEXT_CHILD, which is desirable for the final call to mdoc_valid_post() - in case the target itself gets deleted, the parse point may need this adjustment - but not for the intermediate calls - if intermediate nodes get deleted, that mustn't clobber the parse point. So move setting ROFF_NEXT_SIBLING to the proper place in rew_last(). This fixes the assertion failure in jsg@'s afl test case 108/Apr27.
  (schwarze, 2015-05-01, 1 file, -2/+2)
* Setting the "last" member of struct roff_node was done at an extremely weird place. Move it to the obviously correct place. Surprisingly, this didn't cause any misformatting in the test suite or in any base system manuals, but i cannot believe the code was really correct for all conceivable input, and it would be very hard to verify. At the very least, it cannot have worked for man(7).
  (schwarze, 2015-05-01, 1 file, -4/+2)
* Minor bug fix: When .Pp rewinds .Nm, rewind the whole block, not just the body. In some unusual edge cases, this caused the .Pp to become a sibling of the .Nm body inside the .Nm block.
  (schwarze, 2015-05-01, 1 file, -2/+2)
* If a block body gets broken, that's no good reason to extend the scope of the end macro. Instead, only keep the tail scope open if the end macro calls an explicit macro and actually breaks that. This corrects syntax tree structure and fixes an assertion found by jsg@ with afl (test case 098/Apr27).
  (schwarze, 2015-04-29, 1 file, -2/+4)
* Do not mark a block with the MDOC_BROKEN flag if it merely contains a mismatching explicit end macro without actually being broken. Avoids a subsequent upward search for the non-existent breaker ending up in a NULL pointer access; afl test case 005/Apr27 from jsg@.
  (schwarze, 2015-04-29, 1 file, -1/+3)
* Get rid of two empty wrapper functions. No functional change.
  (schwarze, 2015-04-23, 1 file, -2/+2)
* Avoid a use after free when the target node is deleted during validation. Bug reported by jsg@.
  (schwarze, 2015-04-21, 1 file, -13/+16)
* Unify trickier node handling functions.
  * man_elem_alloc() -> roff_elem_alloc()
  * man_block_alloc() -> roff_block_alloc()
  The functions mdoc_elem_alloc() and mdoc_block_alloc() remain for now because they need to do mdoc(7)-specific argument processing.
  (schwarze, 2015-04-19, 1 file, -4/+4)
* Unify some node handling functions that use TOKEN_NONE.
  * mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
  * mdoc_word_append(), man_word_append() -> roff_word_append()
  * mdoc_addspan(), man_addspan() -> roff_addtbl()
  * mdoc_addeqn(), man_addeqn() -> roff_addeqn()
  Minus 50 lines of code, no functional change.
  (schwarze, 2015-04-19, 1 file, -3/+3)
* Decouple the token code for "no request or macro" from the individual high-level parsers to allow further unification of functions that only need to recognize this code, but that don't care about different high-level macrosets beyond that.
  (schwarze, 2015-04-19, 1 file, -18/+19)
* Unify node handling functions:
  * node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
  * node_append() for mdoc and man_node_append() -> roff_node_append()
  * mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
  * mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
  * mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
  * mdoc_node_free() and man_node_free() -> roff_node_free()
  * mdoc_node_delete() and man_node_delete() -> roff_node_delete()
  Minus 130 lines of code, no functional change.
  (schwarze, 2015-04-19, 1 file, -17/+18)
* Replace the structs mdoc and man by a unified struct roff_man. Almost completely mechanical, no functional change. Written on the train from Exeter to London returning from p2k15.
  (schwarze, 2015-04-18, 1 file, -21/+22)
* If a partial explicit block extending to the next input line follows the end macro of a broken block, put all of it into the breaking block. Needed for example by mutella(1).
  (schwarze, 2015-04-05, 1 file, -4/+16)
* Reduce code duplication, no functional change: Both partial and full implicit blocks can break explicit blocks. Put the code to handle both cases into a common function.
  (schwarze, 2015-04-05, 1 file, -51/+45)
* Arguments to end macros of broken partial explicit blocks must go inside the breaking block. For example, in .It Ic cmd Oo .Ar optional_arg Oc Ar mandatory_arg the mandatory_arg is still inside the .It block. Used for example by mutella(1).
  (schwarze, 2015-04-05, 1 file, -10/+8)