| author | 2016-09-03 22:46:54 +0000 | |
|---|---|---|
| committer | 2016-09-03 22:46:54 +0000 | |
| commit | b5500b9ca0102f1ccaf32f0e77e96d0739aded9b (patch) | |
| tree | e1b7ebb5a0231f9e6d8d3f6f719582cebd64dc98 /gnu/llvm/docs/TableGen | |
| parent | clarify purpose of src/gnu/ directory. (diff) | |
Use the space freed up by sparc and zaurus to import LLVM.
ok hackroom@
Diffstat (limited to 'gnu/llvm/docs/TableGen')
| -rw-r--r-- | gnu/llvm/docs/TableGen/BackEnds.rst | 427 | ||||
| -rw-r--r-- | gnu/llvm/docs/TableGen/Deficiencies.rst | 31 | ||||
| -rw-r--r-- | gnu/llvm/docs/TableGen/LangIntro.rst | 615 | ||||
| -rw-r--r-- | gnu/llvm/docs/TableGen/LangRef.rst | 388 | ||||
| -rw-r--r-- | gnu/llvm/docs/TableGen/index.rst | 308 |
5 files changed, 1769 insertions, 0 deletions
diff --git a/gnu/llvm/docs/TableGen/BackEnds.rst b/gnu/llvm/docs/TableGen/BackEnds.rst new file mode 100644 index 00000000000..e8544b65216 --- /dev/null +++ b/gnu/llvm/docs/TableGen/BackEnds.rst @@ -0,0 +1,427 @@ +================= +TableGen BackEnds +================= + +.. contents:: + :local: + +Introduction +============ + +TableGen backends are at the core of TableGen's functionality. The source files +provide the semantics to a generated (in memory) structure, but it's up to the +backend to print this out in a way that is meaningful to the user (normally a +C program including a file or a textual list of warnings, options and error +messages). + +TableGen is used by both LLVM and Clang with very different goals. LLVM uses it +as a way to automate the generation of massive amounts of information regarding +instructions, schedules, cores and architecture features. Some backends generate +output that is consumed by more than one source file, so they need to be created +in a way that is easy to use pre-processor tricks. Some backends can also print +C code structures, so that they can be directly included as-is. + +Clang, on the other hand, uses it mainly for diagnostic messages (errors, +warnings, tips) and attributes, so more on the textual end of the scale. + +LLVM BackEnds +============= + +.. warning:: + This document is raw. Each section below needs three sub-sections: description + of its purpose with a list of users, output generated from generic input, and + finally why it needed a new backend (in case there's something similar). + +Overall, each backend will take the same TableGen file type and transform into +similar output for different targets/uses. There is an implicit contract between +the TableGen files, the back-ends and their users. + +For instance, a global contract is that each back-end produces macro-guarded +sections. 
Based on whether the file is included by a header or a source file, +or even in which context of each file the include is being used, you have +to define a macro just before including it, to get the right output: + +.. code-block:: c++ + + #define GET_REGINFO_TARGET_DESC + #include "ARMGenRegisterInfo.inc" + +And just part of the generated file would be included. This is useful if +you need the same information in multiple formats (instantiation, initialization, +getter/setter functions, etc) from the same source TableGen file without having +to re-compile the TableGen file multiple times. + +Sometimes, multiple macros might be defined before the same include file to +output multiple blocks: + +.. code-block:: c++ + + #define GET_REGISTER_MATCHER + #define GET_SUBTARGET_FEATURE_NAME + #define GET_MATCHER_IMPLEMENTATION + #include "ARMGenAsmMatcher.inc" + +The macros will be undef'd automatically as they're used, in the include file. + +On all LLVM back-ends, the ``llvm-tblgen`` binary will be executed on the root +TableGen file ``<Target>.td``, which should include all others. This guarantees +that all information needed is accessible, and that no duplication is needed +in the TableGen files. + +CodeEmitter +----------- + +**Purpose**: CodeEmitterGen uses the descriptions of instructions and their fields to +construct an automated code emitter: a function that, given a MachineInstr, +returns the (currently, 32-bit unsigned) value of the instruction. + +**Output**: C++ code, implementing the target's CodeEmitter +class by overriding the virtual functions as ``<Target>CodeEmitter::function()``. + +**Usage**: Used to include directly at the end of ``<Target>MCCodeEmitter.cpp``. + +RegisterInfo +------------ + +**Purpose**: This tablegen backend is responsible for emitting a description of a target +register file for a code generator. It uses instances of the Register, +RegisterAliases, and RegisterClass classes to gather this information. 
+ +**Output**: C++ code with enums and structures representing the register mappings, +properties, masks, etc. + +**Usage**: Both on ``<Target>BaseRegisterInfo`` and ``<Target>MCTargetDesc`` (headers +and source files) with macros defining in which they are for declaration vs. +initialization issues. + +InstrInfo +--------- + +**Purpose**: This tablegen backend is responsible for emitting a description of the target +instruction set for the code generator. (what are the differences from CodeEmitter?) + +**Output**: C++ code with enums and structures representing the register mappings, +properties, masks, etc. + +**Usage**: Both on ``<Target>BaseInstrInfo`` and ``<Target>MCTargetDesc`` (headers +and source files) with macros defining in which they are for declaration vs. + +AsmWriter +--------- + +**Purpose**: Emits an assembly printer for the current target. + +**Output**: Implementation of ``<Target>InstPrinter::printInstruction()``, among +other things. + +**Usage**: Included directly into ``InstPrinter/<Target>InstPrinter.cpp``. + +AsmMatcher +---------- + +**Purpose**: Emits a target specifier matcher for +converting parsed assembly operands in the MCInst structures. It also +emits a matcher for custom operand parsing. Extensive documentation is +written on the ``AsmMatcherEmitter.cpp`` file. + +**Output**: Assembler parsers' matcher functions, declarations, etc. + +**Usage**: Used in back-ends' ``AsmParser/<Target>AsmParser.cpp`` for +building the AsmParser class. + +Disassembler +------------ + +**Purpose**: Contains disassembler table emitters for various +architectures. Extensive documentation is written on the +``DisassemblerEmitter.cpp`` file. + +**Output**: Decoding tables, static decoding functions, etc. + +**Usage**: Directly included in ``Disassembler/<Target>Disassembler.cpp`` +to cater for all default decodings, after all hand-made ones. + +PseudoLowering +-------------- + +**Purpose**: Generate pseudo instruction lowering. 
+ +**Output**: Implements ``ARMAsmPrinter::emitPseudoExpansionLowering()``. + +**Usage**: Included directly into ``<Target>AsmPrinter.cpp``. + +CallingConv +----------- + +**Purpose**: Responsible for emitting descriptions of the calling +conventions supported by this target. + +**Output**: Implement static functions to deal with calling conventions +chained by matching styles, returning false on no match. + +**Usage**: Used in ISelLowering and FastIsel as function pointers to +implementation returned by a CC selection function. + +DAGISel +------- + +**Purpose**: Generate a DAG instruction selector. + +**Output**: Creates huge functions for automating DAG selection. + +**Usage**: Included in ``<Target>ISelDAGToDAG.cpp`` inside the target's +implementation of ``SelectionDAGISel``. + +DFAPacketizer +------------- + +**Purpose**: This class parses the Schedule.td file and produces an API that +can be used to reason about whether an instruction can be added to a packet +on a VLIW architecture. The class internally generates a deterministic finite +automaton (DFA) that models all possible mappings of machine instructions +to functional units as instructions are added to a packet. + +**Output**: Scheduling tables for GPU back-ends (Hexagon, AMD). + +**Usage**: Included directly on ``<Target>InstrInfo.cpp``. + +FastISel +-------- + +**Purpose**: This tablegen backend emits code for use by the "fast" +instruction selection algorithm. See the comments at the top of +lib/CodeGen/SelectionDAG/FastISel.cpp for background. This file +scans through the target's tablegen instruction-info files +and extracts instructions with obvious-looking patterns, and it emits +code to look up these instructions by type and operator. + +**Output**: Generates ``Predicate`` and ``FastEmit`` methods. + +**Usage**: Implements private methods of the targets' implementation +of ``FastISel`` class. + +Subtarget +--------- + +**Purpose**: Generate subtarget enumerations. 
+ +**Output**: Enums, globals, local tables for sub-target information. + +**Usage**: Populates ``<Target>Subtarget`` and +``MCTargetDesc/<Target>MCTargetDesc`` files (both headers and source). + +Intrinsic +--------- + +**Purpose**: Generate (target) intrinsic information. + +OptParserDefs +------------- + +**Purpose**: Print enum values for a class. + +CTags +----- + +**Purpose**: This tablegen backend emits an index of definitions in ctags(1) +format. A helper script, utils/TableGen/tdtags, provides an easier-to-use +interface; run 'tdtags -H' for documentation. + +Clang BackEnds +============== + +ClangAttrClasses +---------------- + +**Purpose**: Creates Attrs.inc, which contains semantic attribute class +declarations for any attribute in ``Attr.td`` that has not set ``ASTNode = 0``. +This file is included as part of ``Attr.h``. + +ClangAttrParserStringSwitches +----------------------------- + +**Purpose**: Creates AttrParserStringSwitches.inc, which contains +StringSwitch::Case statements for parser-related string switches. Each switch +is given its own macro (such as ``CLANG_ATTR_ARG_CONTEXT_LIST``, or +``CLANG_ATTR_IDENTIFIER_ARG_LIST``), which is expected to be defined before +including AttrParserStringSwitches.inc, and undefined after. + +ClangAttrImpl +------------- + +**Purpose**: Creates AttrImpl.inc, which contains semantic attribute class +definitions for any attribute in ``Attr.td`` that has not set ``ASTNode = 0``. +This file is included as part of ``AttrImpl.cpp``. + +ClangAttrList +------------- + +**Purpose**: Creates AttrList.inc, which is used when a list of semantic +attribute identifiers is required. For instance, ``AttrKinds.h`` includes this +file to generate the list of ``attr::Kind`` enumeration values. This list is +separated out into multiple categories: attributes, inheritable attributes, and +inheritable parameter attributes. 
This categorization happens automatically +based on information in ``Attr.td`` and is used to implement the ``classof`` +functionality required for ``dyn_cast`` and similar APIs. + +ClangAttrPCHRead +---------------- + +**Purpose**: Creates AttrPCHRead.inc, which is used to deserialize attributes +in the ``ASTReader::ReadAttributes`` function. + +ClangAttrPCHWrite +----------------- + +**Purpose**: Creates AttrPCHWrite.inc, which is used to serialize attributes in +the ``ASTWriter::WriteAttributes`` function. + +ClangAttrSpellings +--------------------- + +**Purpose**: Creates AttrSpellings.inc, which is used to implement the +``__has_attribute`` feature test macro. + +ClangAttrSpellingListIndex +-------------------------- + +**Purpose**: Creates AttrSpellingListIndex.inc, which is used to map parsed +attribute spellings (including which syntax or scope was used) to an attribute +spelling list index. These spelling list index values are internal +implementation details exposed via +``AttributeList::getAttributeSpellingListIndex``. + +ClangAttrVisitor +------------------- + +**Purpose**: Creates AttrVisitor.inc, which is used when implementing +recursive AST visitors. + +ClangAttrTemplateInstantiate +---------------------------- + +**Purpose**: Creates AttrTemplateInstantiate.inc, which implements the +``instantiateTemplateAttribute`` function, used when instantiating a template +that requires an attribute to be cloned. + +ClangAttrParsedAttrList +----------------------- + +**Purpose**: Creates AttrParsedAttrList.inc, which is used to generate the +``AttributeList::Kind`` parsed attribute enumeration. + +ClangAttrParsedAttrImpl +----------------------- + +**Purpose**: Creates AttrParsedAttrImpl.inc, which is used by +``AttributeList.cpp`` to implement several functions on the ``AttributeList`` +class. This functionality is implemented via the ``AttrInfoMap ParsedAttrInfo`` +array, which contains one element per parsed attribute object. 
+ +ClangAttrParsedAttrKinds +------------------------ + +**Purpose**: Creates AttrParsedAttrKinds.inc, which is used to implement the +``AttributeList::getKind`` function, mapping a string (and syntax) to a parsed +attribute ``AttributeList::Kind`` enumeration. + +ClangAttrDump +------------- + +**Purpose**: Creates AttrDump.inc, which dumps information about an attribute. +It is used to implement ``ASTDumper::dumpAttr``. + +ClangDiagsDefs +-------------- + +Generate Clang diagnostics definitions. + +ClangDiagGroups +--------------- + +Generate Clang diagnostic groups. + +ClangDiagsIndexName +------------------- + +Generate Clang diagnostic name index. + +ClangCommentNodes +----------------- + +Generate Clang AST comment nodes. + +ClangDeclNodes +-------------- + +Generate Clang AST declaration nodes. + +ClangStmtNodes +-------------- + +Generate Clang AST statement nodes. + +ClangSACheckers +--------------- + +Generate Clang Static Analyzer checkers. + +ClangCommentHTMLTags +-------------------- + +Generate efficient matchers for HTML tag names that are used in documentation comments. + +ClangCommentHTMLTagsProperties +------------------------------ + +Generate efficient matchers for HTML tag properties. + +ClangCommentHTMLNamedCharacterReferences +---------------------------------------- + +Generate function to translate named character references to UTF-8 sequences. + +ClangCommentCommandInfo +----------------------- + +Generate command properties for commands that are used in documentation comments. + +ClangCommentCommandList +----------------------- + +Generate list of commands that are used in documentation comments. + +ArmNeon +------- + +Generate arm_neon.h for clang. + +ArmNeonSema +----------- + +Generate ARM NEON sema support for clang. + +ArmNeonTest +----------- + +Generate ARM NEON tests for clang. + +AttrDocs +-------- + +**Purpose**: Creates ``AttributeReference.rst`` from ``AttrDocs.td``, and is +used for documenting user-facing attributes. 
+ +How to write a back-end +======================= + +TODO. + +Until we get a step-by-step HowTo for writing TableGen backends, you can at +least grab the boilerplate (build system, new files, etc.) from Clang's +r173931. + +TODO: How they work, how to write one. This section should not contain details +about any particular backend, except maybe ``-print-enums`` as an example. This +should highlight the APIs in ``TableGen/Record.h``. + diff --git a/gnu/llvm/docs/TableGen/Deficiencies.rst b/gnu/llvm/docs/TableGen/Deficiencies.rst new file mode 100644 index 00000000000..a00aecd342d --- /dev/null +++ b/gnu/llvm/docs/TableGen/Deficiencies.rst @@ -0,0 +1,31 @@ +===================== +TableGen Deficiencies +===================== + +.. contents:: + :local: + +Introduction +============ + +Despite being very generic, TableGen has some deficiencies that have been +pointed out numerous times. The common theme is that, while TableGen allows +you to build Domain-Specific-Languages, the final languages that you create +lack the power of other DSLs, which in turn considerably increases the size +and complexity of TableGen files. + +At the same time, TableGen allows you to create virtually any meaning of +the basic concepts via custom-made back-ends, which can pervert the original +design and make it very hard for newcomers to understand it. + +There are some in favour of extending the semantics even more, but making sure +back-ends adhere to strict rules. Others suggest we should move to more +powerful DSLs designed with specific purposes, or even re-using existing +DSLs. + +Known Problems +============== + +TODO: Add here frequently asked questions about why TableGen doesn't do +what you want, how it might, and how we could extend/restrict it to +be more user friendly. 
diff --git a/gnu/llvm/docs/TableGen/LangIntro.rst b/gnu/llvm/docs/TableGen/LangIntro.rst new file mode 100644 index 00000000000..a148634e3ed --- /dev/null +++ b/gnu/llvm/docs/TableGen/LangIntro.rst @@ -0,0 +1,615 @@ +============================== +TableGen Language Introduction +============================== + +.. contents:: + :local: + +.. warning:: + This document is extremely rough. If you find something lacking, please + fix it, file a documentation bug, or ask about it on llvm-dev. + +Introduction +============ + +This document is not meant to be a normative spec about the TableGen language +in and of itself (i.e. how to understand a given construct in terms of how +it affects the final set of records represented by the TableGen file). For +the formal language specification, see :doc:`LangRef`. + +TableGen syntax +=============== + +TableGen doesn't care about the meaning of data (that is up to the backend to +define), but it does care about syntax, and it enforces a simple type system. +This section describes the syntax and the constructs allowed in a TableGen file. + +TableGen primitives +------------------- + +TableGen comments +^^^^^^^^^^^^^^^^^ + +TableGen supports C++ style "``//``" comments, which run to the end of the +line, and it also supports **nestable** "``/* */``" comments. + +.. _TableGen type: + +The TableGen type system +^^^^^^^^^^^^^^^^^^^^^^^^ + +TableGen files are strongly typed, in a simple (but complete) type-system. +These types are used to perform automatic conversions, check for errors, and to +help interface designers constrain the input that they allow. Every `value +definition`_ is required to have an associated type. + +TableGen supports a mixture of very low-level types (such as ``bit``) and very +high-level types (such as ``dag``). This flexibility is what allows it to +describe a wide range of information conveniently and compactly. The TableGen +types are: + +``bit`` + A 'bit' is a boolean value that can hold either 0 or 1. 
+ +``int`` + The 'int' type represents a simple 32-bit integer value, such as 5. + +``string`` + The 'string' type represents an ordered sequence of characters of arbitrary + length. + +``bits<n>`` + A 'bits' type is an arbitrary, but fixed, size integer that is broken up + into individual bits. This type is useful because it can handle some bits + being defined while others are undefined. + +``list<ty>`` + This type represents a list whose elements are some other type. The + contained type is arbitrary: it can even be another list type. + +Class type + Specifying a class name in a type context means that the defined value must + be a subclass of the specified class. This is useful in conjunction with + the ``list`` type, for example, to constrain the elements of the list to a + common base class (e.g., a ``list<Register>`` can only contain definitions + derived from the "``Register``" class). + +``dag`` + This type represents a nestable directed graph of elements. + +To date, these types have been sufficient for describing things that TableGen +has been used for, but it is straight-forward to extend this list if needed. + +.. _TableGen expressions: + +TableGen values and expressions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +TableGen allows for a pretty reasonable number of different expression forms +when building up values. These forms allow the TableGen file to be written in a +natural syntax and flavor for the application. The current expression forms +supported include: + +``?`` + uninitialized field + +``0b1001011`` + binary integer value. + Note that this is sized by the number of bits given and will not be + silently extended/truncated. + +``07654321`` + octal integer value (indicated by a leading 0) + +``7`` + decimal integer value + +``0x7F`` + hexadecimal integer value + +``"foo"`` + string value + +``[{ ... }]`` + usually called a "code fragment", but is just a multiline string literal + +``[ X, Y, Z ]<type>`` + list value. 
<type> is the type of the list element and is usually optional. + In rare cases, TableGen is unable to deduce the element type in which case + the user must specify it explicitly. + +``{ a, b, 0b10 }`` + initializer for a "bits<4>" value. + 1-bit from "a", 1-bit from "b", 2-bits from 0b10. + +``value`` + value reference + +``value{17}`` + access to one bit of a value + +``value{15-17}`` + access to multiple bits of a value + +``DEF`` + reference to a record definition + +``CLASS<val list>`` + reference to a new anonymous definition of CLASS with the specified template + arguments. + +``X.Y`` + reference to the subfield of a value + +``list[4-7,17,2-3]`` + A slice of the 'list' list, including elements 4,5,6,7,17,2, and 3 from it. + Elements may be included multiple times. + +``foreach <var> = [ <list> ] in { <body> }`` + +``foreach <var> = [ <list> ] in <def>`` + Replicate <body> or <def>, replacing instances of <var> with each value + in <list>. <var> is scoped at the level of the ``foreach`` loop and must + not conflict with any other object introduced in <body> or <def>. Currently + only ``def``\s are expanded within <body>. + +``foreach <var> = 0-15 in ...`` + +``foreach <var> = {0-15,32-47} in ...`` + Loop over ranges of integers. The braces are required for multiple ranges. + +``(DEF a, b)`` + a dag value. The first element is required to be a record definition, the + remaining elements in the list may be arbitrary other values, including + nested ```dag``' values. + +``!listconcat(a, b, ...)`` + A list value that is the result of concatenating the 'a' and 'b' lists. + The lists must have the same element type. + More than two arguments are accepted with the result being the concatenation + of all the lists given. + +``!strconcat(a, b, ...)`` + A string value that is the result of concatenating the 'a' and 'b' strings. + More than two arguments are accepted with the result being the concatenation + of all the strings given. 
+ +``str1#str2`` + "#" (paste) is a shorthand for !strconcat. It may concatenate things that + are not quoted strings, in which case an implicit !cast<string> is done on + the operand of the paste. + +``!cast<type>(a)`` + A symbol of type *type* obtained by looking up the string 'a' in the symbol + table. If the type of 'a' does not match *type*, TableGen aborts with an + error. !cast<string> is a special case in that the argument must be an + object defined by a 'def' construct. + +``!subst(a, b, c)`` + If 'a' and 'b' are of string type or are symbol references, substitute 'b' + for 'a' in 'c.' This operation is analogous to $(subst) in GNU make. + +``!foreach(a, b, c)`` + For each member of dag or list 'b' apply operator 'c.' 'a' is a dummy + variable that should be declared as a member variable of an instantiated + class. This operation is analogous to $(foreach) in GNU make. + +``!head(a)`` + The first element of list 'a.' + +``!tail(a)`` + The 2nd-N elements of list 'a.' + +``!empty(a)`` + An integer {0,1} indicating whether list 'a' is empty. + +``!if(a,b,c)`` + 'b' if the result of 'int' or 'bit' operator 'a' is nonzero, 'c' otherwise. + +``!eq(a,b)`` + 'bit 1' if string a is equal to string b, 0 otherwise. This only operates + on string, int and bit objects. Use !cast<string> to compare other types of + objects. + +``!shl(a,b)`` ``!srl(a,b)`` ``!sra(a,b)`` ``!add(a,b)`` ``!and(a,b)`` + The usual binary and arithmetic operators. + +Note that all of the values have rules specifying how they convert to values +for different types. These rules allow you to assign a value like "``7``" +to a "``bits<4>``" value, for example. + +Classes and definitions +----------------------- + +As mentioned in the :doc:`introduction <index>`, classes and definitions (collectively known as +'records') in TableGen are the main high-level unit of information that TableGen +collects. 
Records are defined with a ``def`` or ``class`` keyword, the record +name, and an optional list of "`template arguments`_". If the record has +superclasses, they are specified as a comma separated list that starts with a +colon character ("``:``"). If `value definitions`_ or `let expressions`_ are +needed for the class, they are enclosed in curly braces ("``{}``"); otherwise, +the record ends with a semicolon. + +Here is a simple TableGen file: + +.. code-block:: llvm + + class C { bit V = 1; } + def X : C; + def Y : C { + string Greeting = "hello"; + } + +This example defines two definitions, ``X`` and ``Y``, both of which derive from +the ``C`` class. Because of this, they both get the ``V`` bit value. The ``Y`` +definition also gets the Greeting member as well. + +In general, classes are useful for collecting together the commonality between a +group of records and isolating it in a single place. Also, classes permit the +specification of default values for their subclasses, allowing the subclasses to +override them as they wish. + +.. _value definition: +.. _value definitions: + +Value definitions +^^^^^^^^^^^^^^^^^ + +Value definitions define named entries in records. A value must be defined +before it can be referred to as the operand for another value definition or +before the value is reset with a `let expression`_. A value is defined by +specifying a `TableGen type`_ and a name. If an initial value is available, it +may be specified after the type with an equal sign. Value definitions require +terminating semicolons. + +.. _let expression: +.. _let expressions: +.. _"let" expressions within a record: + +'let' expressions +^^^^^^^^^^^^^^^^^ + +A record-level let expression is used to change the value of a value definition +in a record. This is primarily useful when a superclass defines a value that a +derived class or definition wants to override. 
Let expressions consist of the +'``let``' keyword followed by a value name, an equal sign ("``=``"), and a new +value. For example, a new class could be added to the example above, redefining +the ``V`` field for all of its subclasses: + +.. code-block:: llvm + + class D : C { let V = 0; } + def Z : D; + +In this case, the ``Z`` definition will have a zero value for its ``V`` value, +despite the fact that it derives (indirectly) from the ``C`` class, because the +``D`` class overrode its value. + +.. _template arguments: + +Class template arguments +^^^^^^^^^^^^^^^^^^^^^^^^ + +TableGen permits the definition of parameterized classes as well as normal +concrete classes. Parameterized TableGen classes specify a list of variable +bindings (which may optionally have defaults) that are bound when used. Here is +a simple example: + +.. code-block:: llvm + + class FPFormat<bits<3> val> { + bits<3> Value = val; + } + def NotFP : FPFormat<0>; + def ZeroArgFP : FPFormat<1>; + def OneArgFP : FPFormat<2>; + def OneArgFPRW : FPFormat<3>; + def TwoArgFP : FPFormat<4>; + def CompareFP : FPFormat<5>; + def CondMovFP : FPFormat<6>; + def SpecialFP : FPFormat<7>; + +In this case, template arguments are used as a space efficient way to specify a +list of "enumeration values", each with a "``Value``" field set to the specified +integer. + +The more esoteric forms of `TableGen expressions`_ are useful in conjunction +with template arguments. As an example: + +.. code-block:: llvm + + class ModRefVal<bits<2> val> { + bits<2> Value = val; + } + + def None : ModRefVal<0>; + def Mod : ModRefVal<1>; + def Ref : ModRefVal<2>; + def ModRef : ModRefVal<3>; + + class Value<ModRefVal MR> { + // Decode some information into a more convenient format, while providing + // a nice interface to the user of the "Value" class. + bit isMod = MR.Value{0}; + bit isRef = MR.Value{1}; + + // other stuff... 
+ } + + // Example uses + def bork : Value<Mod>; + def zork : Value<Ref>; + def hork : Value<ModRef>; + +This is obviously a contrived example, but it shows how template arguments can +be used to decouple the interface provided to the user of the class from the +actual internal data representation expected by the class. In this case, +running ``llvm-tblgen`` on the example prints the following definitions: + +.. code-block:: llvm + + def bork { // Value + bit isMod = 1; + bit isRef = 0; + } + def hork { // Value + bit isMod = 1; + bit isRef = 1; + } + def zork { // Value + bit isMod = 0; + bit isRef = 1; + } + +This shows that TableGen was able to dig into the argument and extract a piece +of information that was requested by the designer of the "Value" class. For +more realistic examples, please see existing users of TableGen, such as the X86 +backend. + +Multiclass definitions and instances +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +While classes with template arguments are a good way to factor commonality +between two instances of a definition, multiclasses allow a convenient notation +for defining multiple definitions at once (instances of implicitly constructed +classes). For example, consider a 3-address instruction set whose instructions +come in two forms: "``reg = reg op reg``" and "``reg = reg op imm``" +(e.g. SPARC). In this case, you'd like to specify in one place that this +commonality exists, then in a separate place indicate what all the ops are. + +Here is an example TableGen fragment that shows this idea: + +.. code-block:: llvm + + def ops; + def GPR; + def Imm; + class inst<int opc, string asmstr, dag operandlist>; + + multiclass ri_inst<int opc, string asmstr> { + def _rr : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), + (ops GPR:$dst, GPR:$src1, GPR:$src2)>; + def _ri : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), + (ops GPR:$dst, GPR:$src1, Imm:$src2)>; + } + + // Instantiations of the ri_inst multiclass. 
+ defm ADD : ri_inst<0b111, "add">; + defm SUB : ri_inst<0b101, "sub">; + defm MUL : ri_inst<0b100, "mul">; + ... + +The name of the resultant definitions has the multidef fragment names appended +to them, so this defines ``ADD_rr``, ``ADD_ri``, ``SUB_rr``, etc. A defm may +inherit from multiple multiclasses, instantiating definitions from each +multiclass. Using a multiclass this way is exactly equivalent to instantiating +the classes multiple times yourself, e.g. by writing: + +.. code-block:: llvm + + def ops; + def GPR; + def Imm; + class inst<int opc, string asmstr, dag operandlist>; + + class rrinst<int opc, string asmstr> + : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), + (ops GPR:$dst, GPR:$src1, GPR:$src2)>; + + class riinst<int opc, string asmstr> + : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), + (ops GPR:$dst, GPR:$src1, Imm:$src2)>; + + // Instantiations of the ri_inst multiclass. + def ADD_rr : rrinst<0b111, "add">; + def ADD_ri : riinst<0b111, "add">; + def SUB_rr : rrinst<0b101, "sub">; + def SUB_ri : riinst<0b101, "sub">; + def MUL_rr : rrinst<0b100, "mul">; + def MUL_ri : riinst<0b100, "mul">; + ... + +A ``defm`` can also be used inside a multiclass providing several levels of +multiclass instantiations. + +.. code-block:: llvm + + class Instruction<bits<4> opc, string Name> { + bits<4> opcode = opc; + string name = Name; + } + + multiclass basic_r<bits<4> opc> { + def rr : Instruction<opc, "rr">; + def rm : Instruction<opc, "rm">; + } + + multiclass basic_s<bits<4> opc> { + defm SS : basic_r<opc>; + defm SD : basic_r<opc>; + def X : Instruction<opc, "x">; + } + + multiclass basic_p<bits<4> opc> { + defm PS : basic_r<opc>; + defm PD : basic_r<opc>; + def Y : Instruction<opc, "y">; + } + + defm ADD : basic_s<0xf>, basic_p<0xf>; + ... + + // Results + def ADDPDrm { ... + def ADDPDrr { ... + def ADDPSrm { ... + def ADDPSrr { ... + def ADDSDrm { ... + def ADDSDrr { ... + def ADDY { ... + def ADDX { ... 
+ +``defm`` declarations can inherit from classes too, the rule to follow is that +the class list must start after the last multiclass, and there must be at least +one multiclass before them. + +.. code-block:: llvm + + class XD { bits<4> Prefix = 11; } + class XS { bits<4> Prefix = 12; } + + class I<bits<4> op> { + bits<4> opcode = op; + } + + multiclass R { + def rr : I<4>; + def rm : I<2>; + } + + multiclass Y { + defm SS : R, XD; + defm SD : R, XS; + } + + defm Instr : Y; + + // Results + def InstrSDrm { + bits<4> opcode = { 0, 0, 1, 0 }; + bits<4> Prefix = { 1, 1, 0, 0 }; + } + ... + def InstrSSrr { + bits<4> opcode = { 0, 1, 0, 0 }; + bits<4> Prefix = { 1, 0, 1, 1 }; + } + +File scope entities +------------------- + +File inclusion +^^^^^^^^^^^^^^ + +TableGen supports the '``include``' token, which textually substitutes the +specified file in place of the include directive. The filename should be +specified as a double quoted string immediately after the '``include``' keyword. +Example: + +.. code-block:: llvm + + include "foo.td" + +'let' expressions +^^^^^^^^^^^^^^^^^ + +"Let" expressions at file scope are similar to `"let" expressions within a +record`_, except they can specify a value binding for multiple records at a +time, and may be useful in certain other cases. File-scope let expressions are +really just another way that TableGen allows the end-user to factor out +commonality from the records. + +File-scope "let" expressions take a comma-separated list of bindings to apply, +and one or more records to bind the values in. Here are some examples: + +.. code-block:: llvm + + let isTerminator = 1, isReturn = 1, isBarrier = 1, hasCtrlDep = 1 in + def RET : I<0xC3, RawFrm, (outs), (ins), "ret", [(X86retflag 0)]>; + + let isCall = 1 in + // All calls clobber the non-callee saved registers... 
+ let Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0, + MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, + XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, EFLAGS] in { + def CALLpcrel32 : Ii32<0xE8, RawFrm, (outs), (ins i32imm:$dst,variable_ops), + "call\t${dst:call}", []>; + def CALL32r : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops), + "call\t{*}$dst", [(X86call GR32:$dst)]>; + def CALL32m : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops), + "call\t{*}$dst", []>; + } + +File-scope "let" expressions are often useful when a couple of definitions need +to be added to several records, and the records do not otherwise need to be +opened, as in the case with the ``CALL*`` instructions above. + +It's also possible to use "let" expressions inside multiclasses, providing more +ways to factor out commonality from the records, especially when using several +levels of multiclass instantiations. This also avoids the need to use "let" +expressions within subsequent records inside a multiclass. + +.. code-block:: llvm + + multiclass basic_r<bits<4> opc> { + let Predicates = [HasSSE2] in { + def rr : Instruction<opc, "rr">; + def rm : Instruction<opc, "rm">; + } + let Predicates = [HasSSE3] in + def rx : Instruction<opc, "rx">; + } + + multiclass basic_ss<bits<4> opc> { + let IsDouble = 0 in + defm SS : basic_r<opc>; + + let IsDouble = 1 in + defm SD : basic_r<opc>; + } + + defm ADD : basic_ss<0xf>; + +Looping +^^^^^^^ + +TableGen supports the '``foreach``' block, which textually replicates the loop +body, substituting iterator values for iterator references in the body. +Example: + +.. code-block:: llvm + + foreach i = [0, 1, 2, 3] in { + def R#i : Register<...>; + def F#i : Register<...>; + } + +This will create objects ``R0``, ``R1``, ``R2`` and ``R3``, as well as ``F0``, +``F1``, ``F2`` and ``F3``. ``foreach`` blocks may be nested. If there is only +one item in the body the braces may be elided: + +..
code-block:: llvm + + foreach i = [0, 1, 2, 3] in + def R#i : Register<...>; + +Code Generator backend info +=========================== + +Expressions used by code generator to describe instructions and isel patterns: + +``(implicit a)`` + an implicitly defined physical register. This tells the dag instruction + selection emitter the input pattern's extra definitions matches implicit + physical register definitions. + diff --git a/gnu/llvm/docs/TableGen/LangRef.rst b/gnu/llvm/docs/TableGen/LangRef.rst new file mode 100644 index 00000000000..27b2c8beaa6 --- /dev/null +++ b/gnu/llvm/docs/TableGen/LangRef.rst @@ -0,0 +1,388 @@ +=========================== +TableGen Language Reference +=========================== + +.. contents:: + :local: + +.. warning:: + This document is extremely rough. If you find something lacking, please + fix it, file a documentation bug, or ask about it on llvm-dev. + +Introduction +============ + +This document is meant to be a normative spec about the TableGen language +in and of itself (i.e. how to understand a given construct in terms of how +it affects the final set of records represented by the TableGen file). If +you are unsure if this document is really what you are looking for, please +read the :doc:`introduction to TableGen <index>` first. + +Notation +======== + +The lexical and syntax notation used here is intended to imitate +`Python's`_. In particular, for lexical definitions, the productions +operate at the character level and there is no implied whitespace between +elements. The syntax definitions operate at the token level, so there is +implied whitespace between tokens. + +.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation + +Lexical Analysis +================ + +TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``) +comments. + +The following is a listing of the basic punctuation tokens:: + + - + [ ] { } ( ) < > : ; . = ? 
# + +Numeric literals take one of the following forms: + +.. TableGen actually will lex some pretty strange sequences and interpret + them as numbers. What is shown here is an attempt to approximate what it + "should" accept. + +.. productionlist:: + TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger` + DecimalInteger: ["+" | "-"] ("0"..."9")+ + HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+ + BinInteger: "0b" ("0" | "1")+ + +One aspect to note is that the :token:`DecimalInteger` token *includes* the +``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as +most languages do. + +Also note that :token:`BinInteger` creates a value of type ``bits<n>`` +(where ``n`` is the number of bits). This will implicitly convert to +integers when needed. + +TableGen has identifier-like tokens: + +.. productionlist:: + ualpha: "a"..."z" | "A"..."Z" | "_" + TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")* + TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")* + +Note that unlike most languages, TableGen allows :token:`TokIdentifier` to +begin with a number. In case of ambiguity, a token will be interpreted as a +numeric literal rather than an identifier. + +TableGen also has two string-like literals: + +.. productionlist:: + TokString: '"' <non-'"' characters and C-like escapes> '"' + TokCodeFragment: "[{" <shortest text not containing "}]"> "}]" + +:token:`TokCodeFragment` is essentially a multiline string literal +delimited by ``[{`` and ``}]``. + +.. note:: + The current implementation accepts the following C-like escapes:: + + \\ \' \" \t \n + +TableGen also has the following keywords:: + + bit bits class code dag + def foreach defm field in + int let list multiclass string + +TableGen also has "bang operators" which have a +wide variety of meanings: + +..
productionlist:: + BangOperator: one of + :!eq !if !head !tail !con + :!add !shl !sra !srl !and + :!cast !empty !subst !foreach !listconcat !strconcat + +Syntax +====== + +TableGen has an ``include`` mechanism. It does not play a role in the +syntax per se, since it is lexically replaced with the contents of the +included file. + +.. productionlist:: + IncludeDirective: "include" `TokString` + +TableGen's top-level production consists of "objects". + +.. productionlist:: + TableGenFile: `Object`* + Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach` + +``class``\es +------------ + +.. productionlist:: + Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody` + +A ``class`` declaration creates a record which other records can inherit +from. A class can be parametrized by a list of "template arguments", whose +values can be used in the class body. + +A given class can only be defined once. A ``class`` declaration is +considered to define the class if any of the following is true: + +.. break ObjectBody into its constituents so that they are present here? + +#. The :token:`TemplateArgList` is present. +#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty. +#. The :token:`BaseClassList` in the :token:`ObjectBody` is present. + +You can declare an empty class by giving an empty :token:`TemplateArgList` +and an empty :token:`ObjectBody`. This can serve as a restricted form of +forward declaration: note that records deriving from the forward-declared +class will inherit no fields from it since the record expansion is done +when the record is parsed. + +.. productionlist:: + TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">" + +Declarations +------------ + +.. Omitting mention of arcane "field" prefix to discourage its use. + +The declaration syntax is pretty much what you would expect as a C++ +programmer. + +..
productionlist:: + Declaration: `Type` `TokIdentifier` ["=" `Value`] + +It assigns the value to the identifier. + +Types +----- + +.. productionlist:: + Type: "string" | "code" | "bit" | "int" | "dag" + :| "bits" "<" `TokInteger` ">" + :| "list" "<" `Type` ">" + :| `ClassID` + ClassID: `TokIdentifier` + +Both ``string`` and ``code`` correspond to the string type; the difference +is purely to indicate programmer intention. + +The :token:`ClassID` must identify a class that has been previously +declared or defined. + +Values +------ + +.. productionlist:: + Value: `SimpleValue` `ValueSuffix`* + ValueSuffix: "{" `RangeList` "}" + :| "[" `RangeList` "]" + :| "." `TokIdentifier` + RangeList: `RangePiece` ("," `RangePiece`)* + RangePiece: `TokInteger` + :| `TokInteger` "-" `TokInteger` + :| `TokInteger` `TokInteger` + +The peculiar last form of :token:`RangePiece` is due to the fact that the +"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as +two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``, +instead of "1", "-", and "5". +The :token:`RangeList` can be thought of as specifying a "list slice" in some +contexts. + + +:token:`SimpleValue` has a number of forms: + + +.. productionlist:: + SimpleValue: `TokIdentifier` + +The value will be the variable referenced by the identifier. It can be one +of: + +.. The code for this is exceptionally abstruse. These examples are a + best-effort attempt.
+ +* name of a ``def``, such as the use of ``Bar`` in:: + + def Bar : SomeClass { + int X = 5; + } + + def Foo { + SomeClass Baz = Bar; + } + +* value local to a ``def``, such as the use of ``Bar`` in:: + + def Foo { + int Bar = 5; + int Baz = Bar; + } + +* a template arg of a ``class``, such as the use of ``Bar`` in:: + + class Foo<int Bar> { + int Baz = Bar; + } + +* value local to a ``multiclass``, such as the use of ``Bar`` in:: + + multiclass Foo { + int Bar = 5; + int Baz = Bar; + } + +* a template arg to a ``multiclass``, such as the use of ``Bar`` in:: + + multiclass Foo<int Bar> { + int Baz = Bar; + } + +.. productionlist:: + SimpleValue: `TokInteger` + +This represents the numeric value of the integer. + +.. productionlist:: + SimpleValue: `TokString`+ + +Multiple adjacent string literals are concatenated like in C/C++. The value +is the concatenation of the strings. + +.. productionlist:: + SimpleValue: `TokCodeFragment` + +The value is the string value of the code fragment. + +.. productionlist:: + SimpleValue: "?" + +``?`` represents an "unset" initializer. + +.. productionlist:: + SimpleValue: "{" `ValueList` "}" + ValueList: [`ValueListNE`] + ValueListNE: `Value` ("," `Value`)* + +This represents a sequence of bits, as would be used to initialize a +``bits<n>`` field (where ``n`` is the number of bits). + +.. productionlist:: + SimpleValue: `ClassID` "<" `ValueListNE` ">" + +This generates a new anonymous record definition (as would be created by an +unnamed ``def`` inheriting from the given class with the given template +arguments) and the value is the value of that record definition. + +.. productionlist:: + SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"] + +A list initializer. The optional :token:`Type` can be used to indicate a +specific element type, otherwise the element type will be deduced from the +given values. + +.. 
The initial `DagArg` of the dag must start with an identifier or + !cast, but this is more of an implementation detail and so for now just + leave it out. + +.. productionlist:: + SimpleValue: "(" `DagArg` `DagArgList` ")" + DagArgList: `DagArg` ("," `DagArg`)* + DagArg: `Value` [":" `TokVarName`] | `TokVarName` + +The initial :token:`DagArg` is called the "operator" of the dag. + +.. productionlist:: + SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")" + +Bodies +------ + +.. productionlist:: + ObjectBody: `BaseClassList` `Body` + BaseClassList: [":" `BaseClassListNE`] + BaseClassListNE: `SubClassRef` ("," `SubClassRef`)* + SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"] + DefmID: `TokIdentifier` + +The version with the :token:`MultiClassID` is only valid in the +:token:`BaseClassList` of a ``defm``. +The :token:`MultiClassID` should be the name of a ``multiclass``. + +.. put this somewhere else + +It is after parsing the base class list that the "let stack" is applied. + +.. productionlist:: + Body: ";" | "{" BodyList "}" + BodyList: BodyItem* + BodyItem: `Declaration` ";" + :| "let" `TokIdentifier` [`RangeList`] "=" `Value` ";" + +The ``let`` form allows overriding the value of an inherited field. + +``def`` +------- + +.. TODO:: + There can be pastes in the names here, like ``#NAME#``. Look into that + and document it (it boils down to ParseIDValue with IDParseMode == + ParseNameMode). ParseObjectName calls into the general ParseValue, with + the only difference from "arbitrary expression parsing" being IDParseMode + == Mode. + +.. productionlist:: + Def: "def" `TokIdentifier` `ObjectBody` + +Defines a record whose name is given by the :token:`TokIdentifier`. The +fields of the record are inherited from the base classes and defined in the +body. + +Special handling occurs if this ``def`` appears inside a ``multiclass`` or +a ``foreach``. + +``defm`` +-------- + +..
productionlist:: + Defm: "defm" `TokIdentifier` ":" `BaseClassListNE` ";" + +Note that in the :token:`BaseClassList`, all of the ``multiclass``'s must +precede any ``class``'s that appear. + +``foreach`` +----------- + +.. productionlist:: + Foreach: "foreach" `Declaration` "in" "{" `Object`* "}" + :| "foreach" `Declaration` "in" `Object` + +The value assigned to the variable in the declaration is iterated over and +the object or object list is reevaluated with the variable set at each +iterated value. + +Top-Level ``let`` +----------------- + +.. productionlist:: + Let: "let" `LetList` "in" "{" `Object`* "}" + :| "let" `LetList` "in" `Object` + LetList: `LetItem` ("," `LetItem`)* + LetItem: `TokIdentifier` [`RangeList`] "=" `Value` + +This is effectively equivalent to ``let`` inside the body of a record +except that it applies to multiple records at a time. The bindings are +applied at the end of parsing the base classes of a record. + +``multiclass`` +-------------- + +.. productionlist:: + MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`] + : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}" + BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)* + MultiClassID: `TokIdentifier` + MultiClassObject: `Def` | `Defm` | `Let` | `Foreach` diff --git a/gnu/llvm/docs/TableGen/index.rst b/gnu/llvm/docs/TableGen/index.rst new file mode 100644 index 00000000000..9526240d54f --- /dev/null +++ b/gnu/llvm/docs/TableGen/index.rst @@ -0,0 +1,308 @@ +======== +TableGen +======== + +.. contents:: + :local: + +.. toctree:: + :hidden: + + BackEnds + LangRef + LangIntro + Deficiencies + +Introduction +============ + +TableGen's purpose is to help a human develop and maintain records of +domain-specific information. Because there may be a large number of these +records, it is specifically designed to allow writing flexible descriptions and +for common features of these records to be factored out. 
This reduces the +amount of duplication in the description, reduces the chance of error, and makes +it easier to structure domain-specific information. + +The core part of TableGen parses a file, instantiates the declarations, and +hands the result off to a domain-specific `backend`_ for processing. + +The current major users of TableGen are :doc:`../CodeGenerator` +and the +`Clang diagnostics and attributes <http://clang.llvm.org/docs/UsersManual.html#controlling-errors-and-warnings>`_. + +Note that if you work on TableGen much and use emacs or vim, you can find +an emacs "TableGen mode" and a vim language file in the ``llvm/utils/emacs`` and +``llvm/utils/vim`` directories of your LLVM distribution, respectively. + +.. _intro: + + +The TableGen program +==================== + +TableGen files are interpreted by the TableGen program, `llvm-tblgen`, available +in your build directory under `bin`. It is not installed in the system (or where +your sysroot is set to), since it has no use beyond LLVM's build process. + +Running TableGen +---------------- + +TableGen runs just like any other LLVM tool. The first (optional) argument +specifies the file to read. If a filename is not specified, ``llvm-tblgen`` +reads from standard input. + +To be useful, one of the `backends`_ must be used. These backends are +selectable on the command line (type '``llvm-tblgen -help``' for a list). For +example, to get a list of all of the definitions that subclass a particular type +(which can be useful for building up an enum list of these records), use the +``-print-enums`` option: + +..
code-block:: bash + + $ llvm-tblgen X86.td -print-enums -class=Register + AH, AL, AX, BH, BL, BP, BPL, BX, CH, CL, CX, DH, DI, DIL, DL, DX, EAX, EBP, EBX, + ECX, EDI, EDX, EFLAGS, EIP, ESI, ESP, FP0, FP1, FP2, FP3, FP4, FP5, FP6, IP, + MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, R10, R10B, R10D, R10W, R11, R11B, R11D, + R11W, R12, R12B, R12D, R12W, R13, R13B, R13D, R13W, R14, R14B, R14D, R14W, R15, + R15B, R15D, R15W, R8, R8B, R8D, R8W, R9, R9B, R9D, R9W, RAX, RBP, RBX, RCX, RDI, + RDX, RIP, RSI, RSP, SI, SIL, SP, SPL, ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7, + XMM0, XMM1, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, XMM2, XMM3, XMM4, XMM5, + XMM6, XMM7, XMM8, XMM9, + + $ llvm-tblgen X86.td -print-enums -class=Instruction + ABS_F, ABS_Fp32, ABS_Fp64, ABS_Fp80, ADC32mi, ADC32mi8, ADC32mr, ADC32ri, + ADC32ri8, ADC32rm, ADC32rr, ADC64mi32, ADC64mi8, ADC64mr, ADC64ri32, ADC64ri8, + ADC64rm, ADC64rr, ADD16mi, ADD16mi8, ADD16mr, ADD16ri, ADD16ri8, ADD16rm, + ADD16rr, ADD32mi, ADD32mi8, ADD32mr, ADD32ri, ADD32ri8, ADD32rm, ADD32rr, + ADD64mi32, ADD64mi8, ADD64mr, ADD64ri32, ... + +The default backend prints out all of the records. + +If you plan to use TableGen, you will most likely have to write a `backend`_ +that extracts the information specific to what you need and formats it in the +appropriate way. + +Example +------- + +With no other arguments, `llvm-tblgen` parses the specified file and prints out all +of the classes, then all of the definitions. This is a good way to see what the +various definitions expand to fully. Running this on the ``X86.td`` file prints +this (at the time of this writing): + +.. code-block:: llvm + + ... 
+ def ADD32rr { // Instruction X86Inst I + string Namespace = "X86"; + dag OutOperandList = (outs GR32:$dst); + dag InOperandList = (ins GR32:$src1, GR32:$src2); + string AsmString = "add{l}\t{$src2, $dst|$dst, $src2}"; + list<dag> Pattern = [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]; + list<Register> Uses = []; + list<Register> Defs = [EFLAGS]; + list<Predicate> Predicates = []; + int CodeSize = 3; + int AddedComplexity = 0; + bit isReturn = 0; + bit isBranch = 0; + bit isIndirectBranch = 0; + bit isBarrier = 0; + bit isCall = 0; + bit canFoldAsLoad = 0; + bit mayLoad = 0; + bit mayStore = 0; + bit isImplicitDef = 0; + bit isConvertibleToThreeAddress = 1; + bit isCommutable = 1; + bit isTerminator = 0; + bit isReMaterializable = 0; + bit isPredicable = 0; + bit hasDelaySlot = 0; + bit usesCustomInserter = 0; + bit hasCtrlDep = 0; + bit isNotDuplicable = 0; + bit hasSideEffects = 0; + InstrItinClass Itinerary = NoItinerary; + string Constraints = ""; + string DisableEncoding = ""; + bits<8> Opcode = { 0, 0, 0, 0, 0, 0, 0, 1 }; + Format Form = MRMDestReg; + bits<6> FormBits = { 0, 0, 0, 0, 1, 1 }; + ImmType ImmT = NoImm; + bits<3> ImmTypeBits = { 0, 0, 0 }; + bit hasOpSizePrefix = 0; + bit hasAdSizePrefix = 0; + bits<4> Prefix = { 0, 0, 0, 0 }; + bit hasREX_WPrefix = 0; + FPFormat FPForm = ?; + bits<3> FPFormBits = { 0, 0, 0 }; + } + ... + +This definition corresponds to the 32-bit register-register ``add`` instruction +of the x86 architecture. ``def ADD32rr`` defines a record named +``ADD32rr``, and the comment at the end of the line indicates the superclasses +of the definition. The body of the record contains all of the data that +TableGen assembled for the record, indicating that the instruction is part of +the "X86" namespace, the pattern indicating how the instruction is selected by +the code generator, that it is a two-address instruction, has a particular +encoding, etc. 
The contents and semantics of the information in the record are +specific to the needs of the X86 backend, and are only shown as an example. + +As you can see, a lot of information is needed for every instruction supported +by the code generator, and specifying it all manually would be unmaintainable, +prone to bugs, and tiring to do in the first place. Because we are using +TableGen, all of the information was derived from the following definition: + +.. code-block:: llvm + + let Defs = [EFLAGS], + isCommutable = 1, // X = ADD Y,Z --> X = ADD Z,Y + isConvertibleToThreeAddress = 1 in // Can transform into LEA. + def ADD32rr : I<0x01, MRMDestReg, (outs GR32:$dst), + (ins GR32:$src1, GR32:$src2), + "add{l}\t{$src2, $dst|$dst, $src2}", + [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>; + +This definition makes use of the custom class ``I`` (extended from the custom +class ``X86Inst``), which is defined in the X86-specific TableGen file, to +factor out the common features that instructions of its class share. A key +feature of TableGen is that it allows the end-user to define the abstractions +they prefer to use when describing their information. + +Each ``def`` record has a special entry called "NAME". This is the name of the +record ("``ADD32rr``" above). In the general case ``def`` names can be formed +from various kinds of string processing expressions and ``NAME`` resolves to the +final value obtained after resolving all of those expressions. The user may +refer to ``NAME`` anywhere she desires to use the ultimate name of the ``def``. +``NAME`` should not be defined anywhere else in user code to avoid conflicts. + +Syntax +====== + +TableGen has a syntax that is loosely based on C++ templates, with built-in +types and specification. In addition, TableGen's syntax introduces some +automation concepts like multiclass, foreach, let, etc. 
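
To make the comparison with C++ templates a little more concrete, here is a
minimal sketch. The names ``Reg``, ``R0`` and ``SP`` and all field names are
hypothetical and chosen only for illustration; they are not taken from any
LLVM target description.

.. code-block:: llvm

  // A class acts like a template: its arguments are substituted into the
  // body. (Hypothetical class, not part of any real target.)
  class Reg<string n, int enc> {
    string AsmName = n;
    int Encoding = enc;
    bit IsReserved = 0;
  }

  // A definition instantiates the class with concrete values.
  def R0 : Reg<"r0", 0>;

  // A file-scope 'let' overrides an inherited field for the records it
  // covers, here marking one register as reserved.
  let IsReserved = 1 in
  def SP : Reg<"sp", 13>;

Feeding a file like this to ``llvm-tblgen`` with no backend option is an easy
way to inspect the fully expanded ``R0`` and ``SP`` records.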
+ +Basic concepts +-------------- + +TableGen files consist of two key parts: 'classes' and 'definitions', both of +which are considered 'records'. + +**TableGen records** have a unique name, a list of values, and a list of +superclasses. The list of values is the main data that TableGen builds for each +record; it is this that holds the domain-specific information for the +application. The interpretation of this data is left to a specific `backend`_, +but the structure and format rules are taken care of and are fixed by +TableGen. + +**TableGen definitions** are the concrete form of 'records'. These generally do +not have any undefined values, and are marked with the '``def``' keyword. + +.. code-block:: llvm + + def FeatureFPARMv8 : SubtargetFeature<"fp-armv8", "HasFPARMv8", "true", + "Enable ARMv8 FP">; + +In this example, FeatureFPARMv8 is a ``SubtargetFeature`` record initialised +with some values. The classes themselves are defined with the ``class`` +keyword, either in the same file or in an included one. Most target +TableGen files include the generic ones in ``include/llvm/Target``. + +**TableGen classes** are abstract records that are used to build and describe +other records. These classes allow the end-user to build abstractions for +either the domain they are targeting (such as "Register", "RegisterClass", and +"Instruction" in the LLVM code generator) or for the implementor to help factor +out common properties of records (such as "FPInst", which is used to represent +floating point instructions in the X86 backend). TableGen keeps track of all of +the classes that are used to build up a definition, so the backend can find all +definitions of a particular class, such as "Instruction". + +..
code-block:: llvm + + class ProcNoItin<string Name, list<SubtargetFeature> Features> + : Processor<Name, NoItineraries, Features>; + +Here, the class ProcNoItin, receiving parameters `Name` of type `string` and +a list of target features is specializing the class Processor by passing the +arguments down as well as hard-coding NoItineraries. + +**TableGen multiclasses** are groups of abstract records that are instantiated +all at once. Each instantiation can result in multiple TableGen definitions. +If a multiclass inherits from another multiclass, the definitions in the +sub-multiclass become part of the current multiclass, as if they were declared +in the current multiclass. + +.. code-block:: llvm + + multiclass ro_signed_pats<string T, string Rm, dag Base, dag Offset, dag Extend, + dag address, ValueType sty> { + def : Pat<(i32 (!cast<SDNode>("sextload" # sty) address)), + (!cast<Instruction>("LDRS" # T # "w_" # Rm # "_RegOffset") + Base, Offset, Extend)>; + + def : Pat<(i64 (!cast<SDNode>("sextload" # sty) address)), + (!cast<Instruction>("LDRS" # T # "x_" # Rm # "_RegOffset") + Base, Offset, Extend)>; + } + + defm : ro_signed_pats<"B", Rm, Base, Offset, Extend, + !foreach(decls.pattern, address, + !subst(SHIFT, imm_eq0, decls.pattern)), + i8>; + + + +See the :doc:`TableGen Language Introduction <LangIntro>` for more generic +information on the usage of the language, and the +:doc:`TableGen Language Reference <LangRef>` for more in-depth description +of the formal language specification. + +.. _backend: +.. _backends: + +TableGen backends +================= + +TableGen files have no real meaning without a back-end. The default operation +of running ``llvm-tblgen`` is to print the information in a textual format, but +that's only useful for debugging of the TableGen files themselves. The power +in TableGen is, however, to interpret the source files into an internal +representation that can be generated into anything you want. 
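
As a rough illustration, consider the small input below. ``Color`` and the
three definitions are hypothetical and exist only for this sketch. Run through
``llvm-tblgen`` with no backend option, the file would simply be printed back
as records, whereas a purpose-built backend could turn the same three records
into, for example, an enum or a lookup table.

.. code-block:: llvm

  // Hypothetical record class; a backend decides what these fields mean.
  class Color<int r, int g, int b> {
    int Red = r;
    int Green = g;
    int Blue = b;
  }

  // Three concrete records that a backend could walk, since TableGen keeps
  // track of every definition deriving from Color.
  def Crimson : Color<220, 20, 60>;
  def Teal : Color<0, 128, 128>;
  def Gold : Color<255, 215, 0>;

The input stays the same in both cases; only the backend chosen on the command
line changes what is emitted.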
+ +Current usage of TableGen is to create huge include files with tables that you +can either include directly (if the output is in the language you're coding in) +or use in pre-processing via macros surrounding the include of the file. + +Direct output can be used if the back-end already prints a table in C format +or if the output is just a list of strings (for error and warning messages). +Pre-processed output should be used if the same information needs to be used +in different contexts (like Instruction names), so your back-end should print +a meta-information list that can be shaped into different compile-time formats. + +See the `TableGen BackEnds <BackEnds.html>`_ for more information. + +TableGen Deficiencies +===================== + +Despite being very generic, TableGen has some deficiencies that have been +pointed out numerous times. The common theme is that, while TableGen allows +you to build Domain-Specific-Languages, the final languages that you create +lack the power of other DSLs, which in turn considerably increases the size +and complexity of TableGen files. + +At the same time, TableGen allows you to create virtually any meaning of +the basic concepts via custom-made back-ends, which can pervert the original +design and make it very hard for newcomers to understand the evil TableGen +file. + +There are some in favour of extending the semantics even more, but making sure +back-ends adhere to strict rules. Others are suggesting we should move to fewer, +more powerful DSLs designed with specific purposes, or even re-using existing +DSLs. + +Either way, this is a discussion that will likely span across several years, +if not decades. You can read more in the `TableGen Deficiencies <Deficiencies.html>`_ +document.
