summaryrefslogtreecommitdiffstats
path: root/gnu/llvm/docs/SourceLevelDebugging.rst
diff options
context:
space:
mode:
authorpascal <pascal@openbsd.org>2016-09-03 22:46:54 +0000
committerpascal <pascal@openbsd.org>2016-09-03 22:46:54 +0000
commitb5500b9ca0102f1ccaf32f0e77e96d0739aded9b (patch)
treee1b7ebb5a0231f9e6d8d3f6f719582cebd64dc98 /gnu/llvm/docs/SourceLevelDebugging.rst
parentclarify purpose of src/gnu/ directory. (diff)
downloadwireguard-openbsd-b5500b9ca0102f1ccaf32f0e77e96d0739aded9b.tar.xz
wireguard-openbsd-b5500b9ca0102f1ccaf32f0e77e96d0739aded9b.zip
Use the space freed up by sparc and zaurus to import LLVM.
ok hackroom@
Diffstat (limited to 'gnu/llvm/docs/SourceLevelDebugging.rst')
-rw-r--r--gnu/llvm/docs/SourceLevelDebugging.rst1335
1 files changed, 1335 insertions, 0 deletions
diff --git a/gnu/llvm/docs/SourceLevelDebugging.rst b/gnu/llvm/docs/SourceLevelDebugging.rst
new file mode 100644
index 00000000000..270c44eb50b
--- /dev/null
+++ b/gnu/llvm/docs/SourceLevelDebugging.rst
@@ -0,0 +1,1335 @@
+================================
+Source Level Debugging with LLVM
+================================
+
+.. contents::
+ :local:
+
+Introduction
+============
+
+This document is the central repository for all information pertaining to debug
+information in LLVM. It describes the :ref:`actual format that the LLVM debug
+information takes <format>`, which is useful for those interested in creating
+front-ends or dealing directly with the information. Further, this document
+provides specific examples of what debug information for C/C++ looks like.
+
+Philosophy behind LLVM debugging information
+--------------------------------------------
+
+The idea of the LLVM debugging information is to capture how the important
+pieces of the source-language's Abstract Syntax Tree map onto LLVM code.
+Several design aspects have shaped the solution that appears here. The
+important ones are:
+
+* Debugging information should have very little impact on the rest of the
+ compiler. No transformations, analyses, or code generators should need to
+ be modified because of debugging information.
+
+* LLVM optimizations should interact in :ref:`well-defined and easily described
+ ways <intro_debugopt>` with the debugging information.
+
+* Because LLVM is designed to support arbitrary programming languages,
+ LLVM-to-LLVM tools should not need to know anything about the semantics of
+ the source-level-language.
+
+* Source-level languages are often **widely** different from one another.
+ LLVM should not put any restrictions of the flavor of the source-language,
+ and the debugging information should work with any language.
+
+* With code generator support, it should be possible to use an LLVM compiler
+ to compile a program to native machine code and standard debugging
+ formats. This allows compatibility with traditional machine-code level
+ debuggers, like GDB or DBX.
+
+The approach used by the LLVM implementation is to use a small set of
+:ref:`intrinsic functions <format_common_intrinsics>` to define a mapping
+between LLVM program objects and the source-level objects. The description of
+the source-level program is maintained in LLVM metadata in an
+:ref:`implementation-defined format <ccxx_frontend>` (the C/C++ front-end
+currently uses working draft 7 of the `DWARF 3 standard
+<http://www.eagercon.com/dwarf/dwarf3std.htm>`_).
+
+When a program is being debugged, a debugger interacts with the user and turns
+the stored debug information into source-language specific information. As
+such, a debugger must be aware of the source-language, and is thus tied to a
+specific language or family of languages.
+
+Debug information consumers
+---------------------------
+
+The role of debug information is to provide meta information normally stripped
+away during the compilation process. This meta information provides an LLVM
+user a relationship between generated code and the original program source
+code.
+
+Currently, debug information is consumed by DwarfDebug to produce dwarf
+information used by the gdb debugger. Other targets could use the same
+information to produce stabs or other debug forms.
+
+It would also be reasonable to use debug information to feed profiling tools
+for analysis of generated code, or, tools for reconstructing the original
+source from generated code.
+
+TODO - expound a bit more.
+
+.. _intro_debugopt:
+
+Debugging optimized code
+------------------------
+
+An extremely high priority of LLVM debugging information is to make it interact
+well with optimizations and analysis. In particular, the LLVM debug
+information provides the following guarantees:
+
+* LLVM debug information **always provides information to accurately read
+ the source-level state of the program**, regardless of which LLVM
+ optimizations have been run, and without any modification to the
+ optimizations themselves. However, some optimizations may impact the
+ ability to modify the current state of the program with a debugger, such
+ as setting program variables, or calling functions that have been
+ deleted.
+
+* As desired, LLVM optimizations can be upgraded to be aware of the LLVM
+ debugging information, allowing them to update the debugging information
+ as they perform aggressive optimizations. This means that, with effort,
+ the LLVM optimizers could optimize debug code just as well as non-debug
+ code.
+
+* LLVM debug information does not prevent optimizations from
+ happening (for example inlining, basic block reordering/merging/cleanup,
+ tail duplication, etc).
+
+* LLVM debug information is automatically optimized along with the rest of
+ the program, using existing facilities. For example, duplicate
+ information is automatically merged by the linker, and unused information
+ is automatically removed.
+
+Basically, the debug information allows you to compile a program with
+"``-O0 -g``" and get full debug information, allowing you to arbitrarily modify
+the program as it executes from a debugger. Compiling a program with
+"``-O3 -g``" gives you full debug information that is always available and
+accurate for reading (e.g., you get accurate stack traces despite tail call
+elimination and inlining), but you might lose the ability to modify the program
+and call functions where were optimized out of the program, or inlined away
+completely.
+
+:ref:`LLVM test suite <test-suite-quickstart>` provides a framework to test
+optimizer's handling of debugging information. It can be run like this:
+
+.. code-block:: bash
+
+ % cd llvm/projects/test-suite/MultiSource/Benchmarks # or some other level
+ % make TEST=dbgopt
+
+This will test impact of debugging information on optimization passes. If
+debugging information influences optimization passes then it will be reported
+as a failure. See :doc:`TestingGuide` for more information on LLVM test
+infrastructure and how to run various tests.
+
+.. _format:
+
+Debugging information format
+============================
+
+LLVM debugging information has been carefully designed to make it possible for
+the optimizer to optimize the program and debugging information without
+necessarily having to know anything about debugging information. In
+particular, the use of metadata avoids duplicated debugging information from
+the beginning, and the global dead code elimination pass automatically deletes
+debugging information for a function if it decides to delete the function.
+
+To do this, most of the debugging information (descriptors for types,
+variables, functions, source files, etc) is inserted by the language front-end
+in the form of LLVM metadata.
+
+Debug information is designed to be agnostic about the target debugger and
+debugging information representation (e.g. DWARF/Stabs/etc). It uses a generic
+pass to decode the information that represents variables, types, functions,
+namespaces, etc: this allows for arbitrary source-language semantics and
+type-systems to be used, as long as there is a module written for the target
+debugger to interpret the information.
+
+To provide basic functionality, the LLVM debugger does have to make some
+assumptions about the source-level language being debugged, though it keeps
+these to a minimum. The only common features that the LLVM debugger assumes
+exist are `source files <LangRef.html#difile>`_, and `program objects
+<LangRef.html#diglobalvariable>`_. These abstract objects are used by a
+debugger to form stack traces, show information about local variables, etc.
+
+This section of the documentation first describes the representation aspects
+common to any source-language. :ref:`ccxx_frontend` describes the data layout
+conventions used by the C and C++ front-ends.
+
+Debug information descriptors are `specialized metadata nodes
+<LangRef.html#specialized-metadata>`_, first-class subclasses of ``Metadata``.
+
+.. _format_common_intrinsics:
+
+Debugger intrinsic functions
+----------------------------
+
+LLVM uses several intrinsic functions (name prefixed with "``llvm.dbg``") to
+provide debug information at various points in generated code.
+
+``llvm.dbg.declare``
+^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: llvm
+
+ void @llvm.dbg.declare(metadata, metadata, metadata)
+
+This intrinsic provides information about a local element (e.g., variable).
+The first argument is metadata holding the alloca for the variable. The second
+argument is a `local variable <LangRef.html#dilocalvariable>`_ containing a
+description of the variable. The third argument is a `complex expression
+<LangRef.html#diexpression>`_.
+
+``llvm.dbg.value``
+^^^^^^^^^^^^^^^^^^
+
+.. code-block:: llvm
+
+ void @llvm.dbg.value(metadata, i64, metadata, metadata)
+
+This intrinsic provides information when a user source variable is set to a new
+value. The first argument is the new value (wrapped as metadata). The second
+argument is the offset in the user source variable where the new value is
+written. The third argument is a `local variable
+<LangRef.html#dilocalvariable>`_ containing a description of the variable. The
+third argument is a `complex expression <LangRef.html#diexpression>`_.
+
+Object lifetimes and scoping
+============================
+
+In many languages, the local variables in functions can have their lifetimes or
+scopes limited to a subset of a function. In the C family of languages, for
+example, variables are only live (readable and writable) within the source
+block that they are defined in. In functional languages, values are only
+readable after they have been defined. Though this is a very obvious concept,
+it is non-trivial to model in LLVM, because it has no notion of scoping in this
+sense, and does not want to be tied to a language's scoping rules.
+
+In order to handle this, the LLVM debug format uses the metadata attached to
+llvm instructions to encode line number and scoping information. Consider the
+following C fragment, for example:
+
+.. code-block:: c
+
+ 1. void foo() {
+ 2. int X = 21;
+ 3. int Y = 22;
+ 4. {
+ 5. int Z = 23;
+ 6. Z = X;
+ 7. }
+ 8. X = Y;
+ 9. }
+
+Compiled to LLVM, this function would be represented like this:
+
+.. code-block:: llvm
+
+ ; Function Attrs: nounwind ssp uwtable
+ define void @foo() #0 !dbg !4 {
+ entry:
+ %X = alloca i32, align 4
+ %Y = alloca i32, align 4
+ %Z = alloca i32, align 4
+ call void @llvm.dbg.declare(metadata i32* %X, metadata !11, metadata !13), !dbg !14
+ store i32 21, i32* %X, align 4, !dbg !14
+ call void @llvm.dbg.declare(metadata i32* %Y, metadata !15, metadata !13), !dbg !16
+ store i32 22, i32* %Y, align 4, !dbg !16
+ call void @llvm.dbg.declare(metadata i32* %Z, metadata !17, metadata !13), !dbg !19
+ store i32 23, i32* %Z, align 4, !dbg !19
+ %0 = load i32, i32* %X, align 4, !dbg !20
+ store i32 %0, i32* %Z, align 4, !dbg !21
+ %1 = load i32, i32* %Y, align 4, !dbg !22
+ store i32 %1, i32* %X, align 4, !dbg !23
+ ret void, !dbg !24
+ }
+
+ ; Function Attrs: nounwind readnone
+ declare void @llvm.dbg.declare(metadata, metadata, metadata) #1
+
+ attributes #0 = { nounwind ssp uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
+ attributes #1 = { nounwind readnone }
+
+ !llvm.dbg.cu = !{!0}
+ !llvm.module.flags = !{!7, !8, !9}
+ !llvm.ident = !{!10}
+
+ !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 3.7.0 (trunk 231150) (llvm/trunk 231154)", isOptimized: false, runtimeVersion: 0, emissionKind: 1, enums: !2, retainedTypes: !2, subprograms: !3, globals: !2, imports: !2)
+ !1 = !DIFile(filename: "/dev/stdin", directory: "/Users/dexonsmith/data/llvm/debug-info")
+ !2 = !{}
+ !3 = !{!4}
+ !4 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 1, type: !5, isLocal: false, isDefinition: true, scopeLine: 1, isOptimized: false, variables: !2)
+ !5 = !DISubroutineType(types: !6)
+ !6 = !{null}
+ !7 = !{i32 2, !"Dwarf Version", i32 2}
+ !8 = !{i32 2, !"Debug Info Version", i32 3}
+ !9 = !{i32 1, !"PIC Level", i32 2}
+ !10 = !{!"clang version 3.7.0 (trunk 231150) (llvm/trunk 231154)"}
+ !11 = !DILocalVariable(name: "X", scope: !4, file: !1, line: 2, type: !12)
+ !12 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
+ !13 = !DIExpression()
+ !14 = !DILocation(line: 2, column: 9, scope: !4)
+ !15 = !DILocalVariable(name: "Y", scope: !4, file: !1, line: 3, type: !12)
+ !16 = !DILocation(line: 3, column: 9, scope: !4)
+ !17 = !DILocalVariable(name: "Z", scope: !18, file: !1, line: 5, type: !12)
+ !18 = distinct !DILexicalBlock(scope: !4, file: !1, line: 4, column: 5)
+ !19 = !DILocation(line: 5, column: 11, scope: !18)
+ !20 = !DILocation(line: 6, column: 11, scope: !18)
+ !21 = !DILocation(line: 6, column: 9, scope: !18)
+ !22 = !DILocation(line: 8, column: 9, scope: !4)
+ !23 = !DILocation(line: 8, column: 7, scope: !4)
+ !24 = !DILocation(line: 9, column: 3, scope: !4)
+
+
+This example illustrates a few important details about LLVM debugging
+information. In particular, it shows how the ``llvm.dbg.declare`` intrinsic and
+location information, which are attached to an instruction, are applied
+together to allow a debugger to analyze the relationship between statements,
+variable definitions, and the code used to implement the function.
+
+.. code-block:: llvm
+
+ call void @llvm.dbg.declare(metadata i32* %X, metadata !11, metadata !13), !dbg !14
+ ; [debug line = 2:7] [debug variable = X]
+
+The first intrinsic ``%llvm.dbg.declare`` encodes debugging information for the
+variable ``X``. The metadata ``!dbg !14`` attached to the intrinsic provides
+scope information for the variable ``X``.
+
+.. code-block:: llvm
+
+ !14 = !DILocation(line: 2, column: 9, scope: !4)
+ !4 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 1, type: !5,
+ isLocal: false, isDefinition: true, scopeLine: 1,
+ isOptimized: false, variables: !2)
+
+Here ``!14`` is metadata providing `location information
+<LangRef.html#dilocation>`_. In this example, scope is encoded by ``!4``, a
+`subprogram descriptor <LangRef.html#disubprogram>`_. This way the location
+information attached to the intrinsics indicates that the variable ``X`` is
+declared at line number 2 at a function level scope in function ``foo``.
+
+Now lets take another example.
+
+.. code-block:: llvm
+
+ call void @llvm.dbg.declare(metadata i32* %Z, metadata !17, metadata !13), !dbg !19
+ ; [debug line = 5:9] [debug variable = Z]
+
+The third intrinsic ``%llvm.dbg.declare`` encodes debugging information for
+variable ``Z``. The metadata ``!dbg !19`` attached to the intrinsic provides
+scope information for the variable ``Z``.
+
+.. code-block:: llvm
+
+ !18 = distinct !DILexicalBlock(scope: !4, file: !1, line: 4, column: 5)
+ !19 = !DILocation(line: 5, column: 11, scope: !18)
+
+Here ``!19`` indicates that ``Z`` is declared at line number 5 and column
+number 0 inside of lexical scope ``!18``. The lexical scope itself resides
+inside of subprogram ``!4`` described above.
+
+The scope information attached with each instruction provides a straightforward
+way to find instructions covered by a scope.
+
+.. _ccxx_frontend:
+
+C/C++ front-end specific debug information
+==========================================
+
+The C and C++ front-ends represent information about the program in a format
+that is effectively identical to `DWARF 3.0
+<http://www.eagercon.com/dwarf/dwarf3std.htm>`_ in terms of information
+content. This allows code generators to trivially support native debuggers by
+generating standard dwarf information, and contains enough information for
+non-dwarf targets to translate it as needed.
+
+This section describes the forms used to represent C and C++ programs. Other
+languages could pattern themselves after this (which itself is tuned to
+representing programs in the same way that DWARF 3 does), or they could choose
+to provide completely different forms if they don't fit into the DWARF model.
+As support for debugging information gets added to the various LLVM
+source-language front-ends, the information used should be documented here.
+
+The following sections provide examples of a few C/C++ constructs and the debug
+information that would best describe those constructs. The canonical
+references are the ``DIDescriptor`` classes defined in
+``include/llvm/IR/DebugInfo.h`` and the implementations of the helper functions
+in ``lib/IR/DIBuilder.cpp``.
+
+C/C++ source file information
+-----------------------------
+
+``llvm::Instruction`` provides easy access to metadata attached with an
+instruction. One can extract line number information encoded in LLVM IR using
+``Instruction::getDebugLoc()`` and ``DILocation::getLine()``.
+
+.. code-block:: c++
+
+ if (DILocation *Loc = I->getDebugLoc()) { // Here I is an LLVM instruction
+ unsigned Line = Loc->getLine();
+ StringRef File = Loc->getFilename();
+ StringRef Dir = Loc->getDirectory();
+ }
+
+C/C++ global variable information
+---------------------------------
+
+Given an integer global variable declared as follows:
+
+.. code-block:: c
+
+ int MyGlobal = 100;
+
+a C/C++ front-end would generate the following descriptors:
+
+.. code-block:: llvm
+
+ ;;
+ ;; Define the global itself.
+ ;;
+ @MyGlobal = global i32 100, align 4
+
+ ;;
+ ;; List of debug info of globals
+ ;;
+ !llvm.dbg.cu = !{!0}
+
+ ;; Some unrelated metadata.
+ !llvm.module.flags = !{!6, !7}
+
+ ;; Define the compile unit.
+ !0 = !DICompileUnit(language: DW_LANG_C99, file: !1,
+ producer:
+ "clang version 3.7.0 (trunk 231150) (llvm/trunk 231154)",
+ isOptimized: false, runtimeVersion: 0, emissionKind: 1,
+ enums: !2, retainedTypes: !2, subprograms: !2, globals:
+ !3, imports: !2)
+
+ ;;
+ ;; Define the file
+ ;;
+ !1 = !DIFile(filename: "/dev/stdin",
+ directory: "/Users/dexonsmith/data/llvm/debug-info")
+
+ ;; An empty array.
+ !2 = !{}
+
+ ;; The Array of Global Variables
+ !3 = !{!4}
+
+ ;;
+ ;; Define the global variable itself.
+ ;;
+ !4 = !DIGlobalVariable(name: "MyGlobal", scope: !0, file: !1, line: 1,
+ type: !5, isLocal: false, isDefinition: true,
+ variable: i32* @MyGlobal)
+
+ ;;
+ ;; Define the type
+ ;;
+ !5 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
+
+ ;; Dwarf version to output.
+ !6 = !{i32 2, !"Dwarf Version", i32 2}
+
+ ;; Debug info schema version.
+ !7 = !{i32 2, !"Debug Info Version", i32 3}
+
+C/C++ function information
+--------------------------
+
+Given a function declared as follows:
+
+.. code-block:: c
+
+ int main(int argc, char *argv[]) {
+ return 0;
+ }
+
+a C/C++ front-end would generate the following descriptors:
+
+.. code-block:: llvm
+
+ ;;
+ ;; Define the anchor for subprograms.
+ ;;
+ !4 = !DISubprogram(name: "main", scope: !1, file: !1, line: 1, type: !5,
+ isLocal: false, isDefinition: true, scopeLine: 1,
+ flags: DIFlagPrototyped, isOptimized: false,
+ variables: !2)
+
+ ;;
+ ;; Define the subprogram itself.
+ ;;
+ define i32 @main(i32 %argc, i8** %argv) !dbg !4 {
+ ...
+ }
+
+Debugging information format
+============================
+
+Debugging Information Extension for Objective C Properties
+----------------------------------------------------------
+
+Introduction
+^^^^^^^^^^^^
+
+Objective C provides a simpler way to declare and define accessor methods using
+declared properties. The language provides features to declare a property and
+to let compiler synthesize accessor methods.
+
+The debugger lets developer inspect Objective C interfaces and their instance
+variables and class variables. However, the debugger does not know anything
+about the properties defined in Objective C interfaces. The debugger consumes
+information generated by compiler in DWARF format. The format does not support
+encoding of Objective C properties. This proposal describes DWARF extensions to
+encode Objective C properties, which the debugger can use to let developers
+inspect Objective C properties.
+
+Proposal
+^^^^^^^^
+
+Objective C properties exist separately from class members. A property can be
+defined only by "setter" and "getter" selectors, and be calculated anew on each
+access. Or a property can just be a direct access to some declared ivar.
+Finally it can have an ivar "automatically synthesized" for it by the compiler,
+in which case the property can be referred to in user code directly using the
+standard C dereference syntax as well as through the property "dot" syntax, but
+there is no entry in the ``@interface`` declaration corresponding to this ivar.
+
+To facilitate debugging, these properties we will add a new DWARF TAG into the
+``DW_TAG_structure_type`` definition for the class to hold the description of a
+given property, and a set of DWARF attributes that provide said description.
+The property tag will also contain the name and declared type of the property.
+
+If there is a related ivar, there will also be a DWARF property attribute placed
+in the ``DW_TAG_member`` DIE for that ivar referring back to the property TAG
+for that property. And in the case where the compiler synthesizes the ivar
+directly, the compiler is expected to generate a ``DW_TAG_member`` for that
+ivar (with the ``DW_AT_artificial`` set to 1), whose name will be the name used
+to access this ivar directly in code, and with the property attribute pointing
+back to the property it is backing.
+
+The following examples will serve as illustration for our discussion:
+
+.. code-block:: objc
+
+ @interface I1 {
+ int n2;
+ }
+
+ @property int p1;
+ @property int p2;
+ @end
+
+ @implementation I1
+ @synthesize p1;
+ @synthesize p2 = n2;
+ @end
+
+This produces the following DWARF (this is a "pseudo dwarfdump" output):
+
+.. code-block:: none
+
+ 0x00000100: TAG_structure_type [7] *
+ AT_APPLE_runtime_class( 0x10 )
+ AT_name( "I1" )
+ AT_decl_file( "Objc_Property.m" )
+ AT_decl_line( 3 )
+
+ 0x00000110 TAG_APPLE_property
+ AT_name ( "p1" )
+ AT_type ( {0x00000150} ( int ) )
+
+ 0x00000120: TAG_APPLE_property
+ AT_name ( "p2" )
+ AT_type ( {0x00000150} ( int ) )
+
+ 0x00000130: TAG_member [8]
+ AT_name( "_p1" )
+ AT_APPLE_property ( {0x00000110} "p1" )
+ AT_type( {0x00000150} ( int ) )
+ AT_artificial ( 0x1 )
+
+ 0x00000140: TAG_member [8]
+ AT_name( "n2" )
+ AT_APPLE_property ( {0x00000120} "p2" )
+ AT_type( {0x00000150} ( int ) )
+
+ 0x00000150: AT_type( ( int ) )
+
+Note, the current convention is that the name of the ivar for an
+auto-synthesized property is the name of the property from which it derives
+with an underscore prepended, as is shown in the example. But we actually
+don't need to know this convention, since we are given the name of the ivar
+directly.
+
+Also, it is common practice in ObjC to have different property declarations in
+the @interface and @implementation - e.g. to provide a read-only property in
+the interface,and a read-write interface in the implementation. In that case,
+the compiler should emit whichever property declaration will be in force in the
+current translation unit.
+
+Developers can decorate a property with attributes which are encoded using
+``DW_AT_APPLE_property_attribute``.
+
+.. code-block:: objc
+
+ @property (readonly, nonatomic) int pr;
+
+.. code-block:: none
+
+ TAG_APPLE_property [8]
+ AT_name( "pr" )
+ AT_type ( {0x00000147} (int) )
+ AT_APPLE_property_attribute (DW_APPLE_PROPERTY_readonly, DW_APPLE_PROPERTY_nonatomic)
+
+The setter and getter method names are attached to the property using
+``DW_AT_APPLE_property_setter`` and ``DW_AT_APPLE_property_getter`` attributes.
+
+.. code-block:: objc
+
+ @interface I1
+ @property (setter=myOwnP3Setter:) int p3;
+ -(void)myOwnP3Setter:(int)a;
+ @end
+
+ @implementation I1
+ @synthesize p3;
+ -(void)myOwnP3Setter:(int)a{ }
+ @end
+
+The DWARF for this would be:
+
+.. code-block:: none
+
+ 0x000003bd: TAG_structure_type [7] *
+ AT_APPLE_runtime_class( 0x10 )
+ AT_name( "I1" )
+ AT_decl_file( "Objc_Property.m" )
+ AT_decl_line( 3 )
+
+ 0x000003cd TAG_APPLE_property
+ AT_name ( "p3" )
+ AT_APPLE_property_setter ( "myOwnP3Setter:" )
+ AT_type( {0x00000147} ( int ) )
+
+ 0x000003f3: TAG_member [8]
+ AT_name( "_p3" )
+ AT_type ( {0x00000147} ( int ) )
+ AT_APPLE_property ( {0x000003cd} )
+ AT_artificial ( 0x1 )
+
+New DWARF Tags
+^^^^^^^^^^^^^^
+
++-----------------------+--------+
+| TAG | Value |
++=======================+========+
+| DW_TAG_APPLE_property | 0x4200 |
++-----------------------+--------+
+
+New DWARF Attributes
+^^^^^^^^^^^^^^^^^^^^
+
++--------------------------------+--------+-----------+
+| Attribute | Value | Classes |
++================================+========+===========+
+| DW_AT_APPLE_property | 0x3fed | Reference |
++--------------------------------+--------+-----------+
+| DW_AT_APPLE_property_getter | 0x3fe9 | String |
++--------------------------------+--------+-----------+
+| DW_AT_APPLE_property_setter | 0x3fea | String |
++--------------------------------+--------+-----------+
+| DW_AT_APPLE_property_attribute | 0x3feb | Constant |
++--------------------------------+--------+-----------+
+
+New DWARF Constants
+^^^^^^^^^^^^^^^^^^^
+
++--------------------------------------+-------+
+| Name | Value |
++======================================+=======+
+| DW_APPLE_PROPERTY_readonly | 0x01 |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_getter | 0x02 |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_assign | 0x04 |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_readwrite | 0x08 |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_retain | 0x10 |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_copy | 0x20 |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_nonatomic | 0x40 |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_setter | 0x80 |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_atomic | 0x100 |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_weak | 0x200 |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_strong | 0x400 |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_unsafe_unretained | 0x800 |
++--------------------------------+-----+-------+
+
+Name Accelerator Tables
+-----------------------
+
+Introduction
+^^^^^^^^^^^^
+
+The "``.debug_pubnames``" and "``.debug_pubtypes``" formats are not what a
+debugger needs. The "``pub``" in the section name indicates that the entries
+in the table are publicly visible names only. This means no static or hidden
+functions show up in the "``.debug_pubnames``". No static variables or private
+class variables are in the "``.debug_pubtypes``". Many compilers add different
+things to these tables, so we can't rely upon the contents between gcc, icc, or
+clang.
+
+The typical query given by users tends not to match up with the contents of
+these tables. For example, the DWARF spec states that "In the case of the name
+of a function member or static data member of a C++ structure, class or union,
+the name presented in the "``.debug_pubnames``" section is not the simple name
+given by the ``DW_AT_name attribute`` of the referenced debugging information
+entry, but rather the fully qualified name of the data or function member."
+So the only names in these tables for complex C++ entries is a fully
+qualified name. Debugger users tend not to enter their search strings as
+"``a::b::c(int,const Foo&) const``", but rather as "``c``", "``b::c``" , or
+"``a::b::c``". So the name entered in the name table must be demangled in
+order to chop it up appropriately and additional names must be manually entered
+into the table to make it effective as a name lookup table for debuggers to
+use.
+
+All debuggers currently ignore the "``.debug_pubnames``" table as a result of
+its inconsistent and useless public-only name content making it a waste of
+space in the object file. These tables, when they are written to disk, are not
+sorted in any way, leaving every debugger to do its own parsing and sorting.
+These tables also include an inlined copy of the string values in the table
+itself making the tables much larger than they need to be on disk, especially
+for large C++ programs.
+
+Can't we just fix the sections by adding all of the names we need to this
+table? No, because that is not what the tables are defined to contain and we
+won't know the difference between the old bad tables and the new good tables.
+At best we could make our own renamed sections that contain all of the data we
+need.
+
+These tables are also insufficient for what a debugger like LLDB needs. LLDB
+uses clang for its expression parsing where LLDB acts as a PCH. LLDB is then
+often asked to look for type "``foo``" or namespace "``bar``", or list items in
+namespace "``baz``". Namespaces are not included in the pubnames or pubtypes
+tables. Since clang asks a lot of questions when it is parsing an expression,
+we need to be very fast when looking up names, as it happens a lot. Having new
+accelerator tables that are optimized for very quick lookups will benefit this
+type of debugging experience greatly.
+
+We would like to generate name lookup tables that can be mapped into memory
+from disk, and used as is, with little or no up-front parsing. We would also
+be able to control the exact content of these different tables so they contain
+exactly what we need. The Name Accelerator Tables were designed to fix these
+issues. In order to solve these issues we need to:
+
+* Have a format that can be mapped into memory from disk and used as is
+* Lookups should be very fast
+* Extensible table format so these tables can be made by many producers
+* Contain all of the names needed for typical lookups out of the box
+* Strict rules for the contents of tables
+
+Table size is important and the accelerator table format should allow the reuse
+of strings from common string tables so the strings for the names are not
+duplicated. We also want to make sure the table is ready to be used as-is by
+simply mapping the table into memory with minimal header parsing.
+
+The name lookups need to be fast and optimized for the kinds of lookups that
+debuggers tend to do. Optimally we would like to touch as few parts of the
+mapped table as possible when doing a name lookup and be able to quickly find
+the name entry we are looking for, or discover there are no matches. In the
+case of debuggers we optimized for lookups that fail most of the time.
+
+Each table that is defined should have strict rules on exactly what is in the
+accelerator tables and documented so clients can rely on the content.
+
+Hash Tables
+^^^^^^^^^^^
+
+Standard Hash Tables
+""""""""""""""""""""
+
+Typical hash tables have a header, buckets, and each bucket points to the
+bucket contents:
+
+.. code-block:: none
+
+ .------------.
+ | HEADER |
+ |------------|
+ | BUCKETS |
+ |------------|
+ | DATA |
+ `------------'
+
+The BUCKETS are an array of offsets to DATA for each hash:
+
+.. code-block:: none
+
+ .------------.
+ | 0x00001000 | BUCKETS[0]
+ | 0x00002000 | BUCKETS[1]
+ | 0x00002200 | BUCKETS[2]
+ | 0x000034f0 | BUCKETS[3]
+ | | ...
+ | 0xXXXXXXXX | BUCKETS[n_buckets]
+ '------------'
+
+So for ``bucket[3]`` in the example above, we have an offset into the table
+0x000034f0 which points to a chain of entries for the bucket. Each bucket must
+contain a next pointer, full 32 bit hash value, the string itself, and the data
+for the current string value.
+
+.. code-block:: none
+
+ .------------.
+ 0x000034f0: | 0x00003500 | next pointer
+ | 0x12345678 | 32 bit hash
+ | "erase" | string value
+ | data[n] | HashData for this bucket
+ |------------|
+ 0x00003500: | 0x00003550 | next pointer
+ | 0x29273623 | 32 bit hash
+ | "dump" | string value
+ | data[n] | HashData for this bucket
+ |------------|
+ 0x00003550: | 0x00000000 | next pointer
+ | 0x82638293 | 32 bit hash
+ | "main" | string value
+ | data[n] | HashData for this bucket
+ `------------'
+
+The problem with this layout for debuggers is that we need to optimize for the
+negative lookup case where the symbol we're searching for is not present. So
+if we were to lookup "``printf``" in the table above, we would make a 32 hash
+for "``printf``", it might match ``bucket[3]``. We would need to go to the
+offset 0x000034f0 and start looking to see if our 32 bit hash matches. To do
+so, we need to read the next pointer, then read the hash, compare it, and skip
+to the next bucket. Each time we are skipping many bytes in memory and
+touching new cache pages just to do the compare on the full 32 bit hash. All
+of these accesses then tell us that we didn't have a match.
+
+Name Hash Tables
+""""""""""""""""
+
+To solve the issues mentioned above we have structured the hash tables a bit
+differently: a header, buckets, an array of all unique 32 bit hash values,
+followed by an array of hash value data offsets, one for each hash value, then
+the data for all hash values:
+
+.. code-block:: none
+
+ .-------------.
+ | HEADER |
+ |-------------|
+ | BUCKETS |
+ |-------------|
+ | HASHES |
+ |-------------|
+ | OFFSETS |
+ |-------------|
+ | DATA |
+ `-------------'
+
+The ``BUCKETS`` in the name tables are an index into the ``HASHES`` array. By
+making all of the full 32 bit hash values contiguous in memory, we allow
+ourselves to efficiently check for a match while touching as little memory as
+possible. Most often checking the 32 bit hash values is as far as the lookup
+goes. If it does match, it usually is a match with no collisions. So for a
+table with "``n_buckets``" buckets, and "``n_hashes``" unique 32 bit hash
+values, we can clarify the contents of the ``BUCKETS``, ``HASHES`` and
+``OFFSETS`` as:
+
+.. code-block:: none
+
+ .-------------------------.
+ | HEADER.magic | uint32_t
+ | HEADER.version | uint16_t
+ | HEADER.hash_function | uint16_t
+ | HEADER.bucket_count | uint32_t
+ | HEADER.hashes_count | uint32_t
+ | HEADER.header_data_len | uint32_t
+ | HEADER_DATA | HeaderData
+ |-------------------------|
+ | BUCKETS | uint32_t[n_buckets] // 32 bit hash indexes
+ |-------------------------|
+ | HASHES | uint32_t[n_hashes] // 32 bit hash values
+ |-------------------------|
+ | OFFSETS | uint32_t[n_hashes] // 32 bit offsets to hash value data
+ |-------------------------|
+ | ALL HASH DATA |
+ `-------------------------'
+
+So taking the exact same data from the standard hash example above we end up
+with:
+
+.. code-block:: none
+
+ .------------.
+ | HEADER |
+ |------------|
+ | 0 | BUCKETS[0]
+ | 2 | BUCKETS[1]
+ | 5 | BUCKETS[2]
+ | 6 | BUCKETS[3]
+ | | ...
+ | ... | BUCKETS[n_buckets]
+ |------------|
+ | 0x........ | HASHES[0]
+ | 0x........ | HASHES[1]
+ | 0x........ | HASHES[2]
+ | 0x........ | HASHES[3]
+ | 0x........ | HASHES[4]
+ | 0x........ | HASHES[5]
+ | 0x12345678 | HASHES[6] hash for BUCKETS[3]
+ | 0x29273623 | HASHES[7] hash for BUCKETS[3]
+ | 0x82638293 | HASHES[8] hash for BUCKETS[3]
+ | 0x........ | HASHES[9]
+ | 0x........ | HASHES[10]
+ | 0x........ | HASHES[11]
+ | 0x........ | HASHES[12]
+ | 0x........ | HASHES[13]
+ | 0x........ | HASHES[n_hashes]
+ |------------|
+ | 0x........ | OFFSETS[0]
+ | 0x........ | OFFSETS[1]
+ | 0x........ | OFFSETS[2]
+ | 0x........ | OFFSETS[3]
+ | 0x........ | OFFSETS[4]
+ | 0x........ | OFFSETS[5]
+ | 0x000034f0 | OFFSETS[6] offset for BUCKETS[3]
+ | 0x00003500 | OFFSETS[7] offset for BUCKETS[3]
+ | 0x00003550 | OFFSETS[8] offset for BUCKETS[3]
+ | 0x........ | OFFSETS[9]
+ | 0x........ | OFFSETS[10]
+ | 0x........ | OFFSETS[11]
+ | 0x........ | OFFSETS[12]
+ | 0x........ | OFFSETS[13]
+ | 0x........ | OFFSETS[n_hashes]
+ |------------|
+ | |
+ | |
+ | |
+ | |
+ | |
+ |------------|
+ 0x000034f0: | 0x00001203 | .debug_str ("erase")
+ | 0x00000004 | A 32 bit array count - number of HashData with name "erase"
+ | 0x........ | HashData[0]
+ | 0x........ | HashData[1]
+ | 0x........ | HashData[2]
+ | 0x........ | HashData[3]
+ | 0x00000000 | String offset into .debug_str (terminate data for hash)
+ |------------|
+ 0x00003500: | 0x00001203 | String offset into .debug_str ("collision")
+ | 0x00000002 | A 32 bit array count - number of HashData with name "collision"
+ | 0x........ | HashData[0]
+ | 0x........ | HashData[1]
+ | 0x00001203 | String offset into .debug_str ("dump")
+ | 0x00000003 | A 32 bit array count - number of HashData with name "dump"
+ | 0x........ | HashData[0]
+ | 0x........ | HashData[1]
+ | 0x........ | HashData[2]
+ | 0x00000000 | String offset into .debug_str (terminate data for hash)
+ |------------|
+ 0x00003550: | 0x00001203 | String offset into .debug_str ("main")
+ | 0x00000009 | A 32 bit array count - number of HashData with name "main"
+ | 0x........ | HashData[0]
+ | 0x........ | HashData[1]
+ | 0x........ | HashData[2]
+ | 0x........ | HashData[3]
+ | 0x........ | HashData[4]
+ | 0x........ | HashData[5]
+ | 0x........ | HashData[6]
+ | 0x........ | HashData[7]
+ | 0x........ | HashData[8]
+ | 0x00000000 | String offset into .debug_str (terminate data for hash)
+ `------------'
+
+So we still have all of the same data, we just organize it more efficiently for
+debugger lookup. If we repeat the same "``printf``" lookup from above, we
+would hash "``printf``" and find it matches ``BUCKETS[3]`` by taking the 32 bit
+hash value and modulo it by ``n_buckets``. ``BUCKETS[3]`` contains "6" which
+is the index into the ``HASHES`` table. We would then compare any consecutive
+32 bit hashes values in the ``HASHES`` array as long as the hashes would be in
+``BUCKETS[3]``. We do this by verifying that each subsequent hash value modulo
+``n_buckets`` is still 3. In the case of a failed lookup we would access the
+memory for ``BUCKETS[3]``, and then compare a few consecutive 32 bit hashes
+before we know that we have no match. We don't end up marching through
+multiple words of memory and we really keep the number of processor data cache
+lines being accessed as small as possible.
+
+The string hash that is used for these lookup tables is the Daniel J.
+Bernstein hash which is also used in the ELF ``GNU_HASH`` sections. It is a
+very good hash for all kinds of names in programs with very few hash
+collisions.
+
+Empty buckets are designated by using an invalid hash index of ``UINT32_MAX``.
+
+Details
+^^^^^^^
+
+These name hash tables are designed to be generic where specializations of the
+table get to define additional data that goes into the header ("``HeaderData``"),
+how the string value is stored ("``KeyType``") and the content of the data for each
+hash value.
+
+Header Layout
+"""""""""""""
+
+The header has a fixed part, and the specialized part. The exact format of the
+header is:
+
+.. code-block:: c
+
+ struct Header
+ {
+ uint32_t magic; // 'HASH' magic value to allow endian detection
+ uint16_t version; // Version number
+ uint16_t hash_function; // The hash function enumeration that was used
+ uint32_t bucket_count; // The number of buckets in this hash table
+ uint32_t hashes_count; // The total number of unique hash values and hash data offsets in this table
+ uint32_t header_data_len; // The bytes to skip to get to the hash indexes (buckets) for correct alignment
+ // Specifically the length of the following HeaderData field - this does not
+ // include the size of the preceding fields
+ HeaderData header_data; // Implementation specific header data
+ };
+
+The header starts with a 32 bit "``magic``" value which must be ``'HASH'``
+encoded as an ASCII integer. This allows the detection of the start of the
+hash table and also allows the table's byte order to be determined so the table
+can be correctly extracted. The "``magic``" value is followed by a 16 bit
+``version`` number which allows the table to be revised and modified in the
+future. The current version number is 1. ``hash_function`` is a ``uint16_t``
+enumeration that specifies which hash function was used to produce this table.
+The current values for the hash function enumerations include:
+
+.. code-block:: c
+
+ enum HashFunctionType
+ {
+ eHashFunctionDJB = 0u, // Daniel J Bernstein hash function
+ };
+
+``bucket_count`` is a 32 bit unsigned integer that represents how many buckets
+are in the ``BUCKETS`` array. ``hashes_count`` is the number of unique 32 bit
+hash values that are in the ``HASHES`` array, and is the same number of offsets
+are contained in the ``OFFSETS`` array. ``header_data_len`` specifies the size
+in bytes of the ``HeaderData`` that is filled in by specialized versions of
+this table.
+
+Fixed Lookup
+""""""""""""
+
+The header is followed by the buckets, hashes, offsets, and hash value data.
+
+.. code-block:: c
+
+ struct FixedTable
+ {
+ uint32_t buckets[Header.bucket_count]; // An array of hash indexes into the "hashes[]" array below
+ uint32_t hashes [Header.hashes_count]; // Every unique 32 bit hash for the entire table is in this table
+ uint32_t offsets[Header.hashes_count]; // An offset that corresponds to each item in the "hashes[]" array above
+ };
+
+``buckets`` is an array of 32 bit indexes into the ``hashes`` array. The
+``hashes`` array contains all of the 32 bit hash values for all names in the
+hash table. Each hash in the ``hashes`` table has an offset in the ``offsets``
+array that points to the data for the hash value.
+
+This table setup makes it very easy to repurpose these tables to contain
+different data, while keeping the lookup mechanism the same for all tables.
+This layout also makes it possible to save the table to disk and map it in
+later and do very efficient name lookups with little or no parsing.
+
+DWARF lookup tables can be implemented in a variety of ways and can store a lot
+of information for each name. We want to make the DWARF tables extensible and
+able to store the data efficiently so we have used some of the DWARF features
+that enable efficient data storage to define exactly what kind of data we store
+for each name.
+
+The ``HeaderData`` contains a definition of the contents of each HashData chunk.
+We might want to store an offset to all of the debug information entries (DIEs)
+for each name. To keep things extensible, we create a list of items, or
+Atoms, that are contained in the data for each name. First comes the type of
+the data in each atom:
+
+.. code-block:: c
+
+ enum AtomType
+ {
+ eAtomTypeNULL = 0u,
+ eAtomTypeDIEOffset = 1u, // DIE offset, check form for encoding
+ eAtomTypeCUOffset = 2u, // DIE offset of the compiler unit header that contains the item in question
+ eAtomTypeTag = 3u, // DW_TAG_xxx value, should be encoded as DW_FORM_data1 (if no tags exceed 255) or DW_FORM_data2
+ eAtomTypeNameFlags = 4u, // Flags from enum NameFlags
+ eAtomTypeTypeFlags = 5u, // Flags from enum TypeFlags
+ };
+
+The enumeration values and their meanings are:
+
+.. code-block:: none
+
+ eAtomTypeNULL - a termination atom that specifies the end of the atom list
+ eAtomTypeDIEOffset - an offset into the .debug_info section for the DWARF DIE for this name
+ eAtomTypeCUOffset - an offset into the .debug_info section for the CU that contains the DIE
+ eAtomTypeDIETag - The DW_TAG_XXX enumeration value so you don't have to parse the DWARF to see what it is
+ eAtomTypeNameFlags - Flags for functions and global variables (isFunction, isInlined, isExternal...)
+ eAtomTypeTypeFlags - Flags for types (isCXXClass, isObjCClass, ...)
+
+Then we allow each atom type to define the atom type and how the data for each
+atom type data is encoded:
+
+.. code-block:: c
+
+ struct Atom
+ {
+ uint16_t type; // AtomType enum value
+ uint16_t form; // DWARF DW_FORM_XXX defines
+ };
+
+The ``form`` type above is from the DWARF specification and defines the exact
+encoding of the data for the Atom type. See the DWARF specification for the
+``DW_FORM_`` definitions.
+
+.. code-block:: c
+
+ struct HeaderData
+ {
+ uint32_t die_offset_base;
+ uint32_t atom_count;
+ Atoms atoms[atom_count0];
+ };
+
+``HeaderData`` defines the base DIE offset that should be added to any atoms
+that are encoded using the ``DW_FORM_ref1``, ``DW_FORM_ref2``,
+``DW_FORM_ref4``, ``DW_FORM_ref8`` or ``DW_FORM_ref_udata``. It also defines
+what is contained in each ``HashData`` object -- ``Atom.form`` tells us how large
+each field will be in the ``HashData`` and the ``Atom.type`` tells us how this data
+should be interpreted.
+
+For the current implementations of the "``.apple_names``" (all functions +
+globals), the "``.apple_types``" (names of all types that are defined), and
+the "``.apple_namespaces``" (all namespaces), we currently set the ``Atom``
+array to be:
+
+.. code-block:: c
+
+ HeaderData.atom_count = 1;
+ HeaderData.atoms[0].type = eAtomTypeDIEOffset;
+ HeaderData.atoms[0].form = DW_FORM_data4;
+
+This defines the contents to be the DIE offset (eAtomTypeDIEOffset) that is
+encoded as a 32 bit value (DW_FORM_data4). This allows a single name to have
+multiple matching DIEs in a single file, which could come up with an inlined
+function for instance. Future tables could include more information about the
+DIE such as flags indicating if the DIE is a function, method, block,
+or inlined.
+
+The KeyType for the DWARF table is a 32 bit string table offset into the
+".debug_str" table. The ".debug_str" is the string table for the DWARF which
+may already contain copies of all of the strings. This helps make sure, with
+help from the compiler, that we reuse the strings between all of the DWARF
+sections and keeps the hash table size down. Another benefit to having the
+compiler generate all strings as DW_FORM_strp in the debug info, is that
+DWARF parsing can be made much faster.
+
+After a lookup is made, we get an offset into the hash data. The hash data
+needs to be able to deal with 32 bit hash collisions, so the chunk of data
+at the offset in the hash data consists of a triple:
+
+.. code-block:: c
+
+ uint32_t str_offset
+ uint32_t hash_data_count
+ HashData[hash_data_count]
+
+If "str_offset" is zero, then the bucket contents are done. 99.9% of the
+hash data chunks contain a single item (no 32 bit hash collision):
+
+.. code-block:: none
+
+ .------------.
+ | 0x00001023 | uint32_t KeyType (.debug_str[0x0001023] => "main")
+ | 0x00000004 | uint32_t HashData count
+ | 0x........ | uint32_t HashData[0] DIE offset
+ | 0x........ | uint32_t HashData[1] DIE offset
+ | 0x........ | uint32_t HashData[2] DIE offset
+ | 0x........ | uint32_t HashData[3] DIE offset
+ | 0x00000000 | uint32_t KeyType (end of hash chain)
+ `------------'
+
+If there are collisions, you will have multiple valid string offsets:
+
+.. code-block:: none
+
+ .------------.
+ | 0x00001023 | uint32_t KeyType (.debug_str[0x0001023] => "main")
+ | 0x00000004 | uint32_t HashData count
+ | 0x........ | uint32_t HashData[0] DIE offset
+ | 0x........ | uint32_t HashData[1] DIE offset
+ | 0x........ | uint32_t HashData[2] DIE offset
+ | 0x........ | uint32_t HashData[3] DIE offset
+ | 0x00002023 | uint32_t KeyType (.debug_str[0x0002023] => "print")
+ | 0x00000002 | uint32_t HashData count
+ | 0x........ | uint32_t HashData[0] DIE offset
+ | 0x........ | uint32_t HashData[1] DIE offset
+ | 0x00000000 | uint32_t KeyType (end of hash chain)
+ `------------'
+
+Current testing with real world C++ binaries has shown that there is around 1
+32 bit hash collision per 100,000 name entries.
+
+Contents
+^^^^^^^^
+
+As we said, we want to strictly define exactly what is included in the
+different tables. For DWARF, we have 3 tables: "``.apple_names``",
+"``.apple_types``", and "``.apple_namespaces``".
+
+"``.apple_names``" sections should contain an entry for each DWARF DIE whose
+``DW_TAG`` is a ``DW_TAG_label``, ``DW_TAG_inlined_subroutine``, or
+``DW_TAG_subprogram`` that has address attributes: ``DW_AT_low_pc``,
+``DW_AT_high_pc``, ``DW_AT_ranges`` or ``DW_AT_entry_pc``. It also contains
+``DW_TAG_variable`` DIEs that have a ``DW_OP_addr`` in the location (global and
+static variables). All global and static variables should be included,
+including those scoped within functions and classes. For example using the
+following code:
+
+.. code-block:: c
+
+ static int var = 0;
+
+ void f ()
+ {
+ static int var = 0;
+ }
+
+Both of the static ``var`` variables would be included in the table. All
+functions should emit both their full names and their basenames. For C or C++,
+the full name is the mangled name (if available) which is usually in the
+``DW_AT_MIPS_linkage_name`` attribute, and the ``DW_AT_name`` contains the
+function basename. If global or static variables have a mangled name in a
+``DW_AT_MIPS_linkage_name`` attribute, this should be emitted along with the
+simple name found in the ``DW_AT_name`` attribute.
+
+"``.apple_types``" sections should contain an entry for each DWARF DIE whose
+tag is one of:
+
+* DW_TAG_array_type
+* DW_TAG_class_type
+* DW_TAG_enumeration_type
+* DW_TAG_pointer_type
+* DW_TAG_reference_type
+* DW_TAG_string_type
+* DW_TAG_structure_type
+* DW_TAG_subroutine_type
+* DW_TAG_typedef
+* DW_TAG_union_type
+* DW_TAG_ptr_to_member_type
+* DW_TAG_set_type
+* DW_TAG_subrange_type
+* DW_TAG_base_type
+* DW_TAG_const_type
+* DW_TAG_file_type
+* DW_TAG_namelist
+* DW_TAG_packed_type
+* DW_TAG_volatile_type
+* DW_TAG_restrict_type
+* DW_TAG_interface_type
+* DW_TAG_unspecified_type
+* DW_TAG_shared_type
+
+Only entries with a ``DW_AT_name`` attribute are included, and the entry must
+not be a forward declaration (``DW_AT_declaration`` attribute with a non-zero
+value). For example, using the following code:
+
+.. code-block:: c
+
+ int main ()
+ {
+ int *b = 0;
+ return *b;
+ }
+
+We get a few type DIEs:
+
+.. code-block:: none
+
+ 0x00000067: TAG_base_type [5]
+ AT_encoding( DW_ATE_signed )
+ AT_name( "int" )
+ AT_byte_size( 0x04 )
+
+ 0x0000006e: TAG_pointer_type [6]
+ AT_type( {0x00000067} ( int ) )
+ AT_byte_size( 0x08 )
+
+The DW_TAG_pointer_type is not included because it does not have a ``DW_AT_name``.
+
+"``.apple_namespaces``" section should contain all ``DW_TAG_namespace`` DIEs.
+If we run into a namespace that has no name this is an anonymous namespace, and
+the name should be output as "``(anonymous namespace)``" (without the quotes).
+Why? This matches the output of the ``abi::cxa_demangle()`` that is in the
+standard C++ library that demangles mangled names.
+
+
+Language Extensions and File Format Changes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Objective-C Extensions
+""""""""""""""""""""""
+
+"``.apple_objc``" section should contain all ``DW_TAG_subprogram`` DIEs for an
+Objective-C class. The name used in the hash table is the name of the
+Objective-C class itself. If the Objective-C class has a category, then an
+entry is made for both the class name without the category, and for the class
+name with the category. So if we have a DIE at offset 0x1234 with a name of
+method "``-[NSString(my_additions) stringWithSpecialString:]``", we would add
+an entry for "``NSString``" that points to DIE 0x1234, and an entry for
+"``NSString(my_additions)``" that points to 0x1234. This allows us to quickly
+track down all Objective-C methods for an Objective-C class when doing
+expressions. It is needed because of the dynamic nature of Objective-C where
+anyone can add methods to a class. The DWARF for Objective-C methods is also
+emitted differently from C++ classes where the methods are not usually
+contained in the class definition, they are scattered about across one or more
+compile units. Categories can also be defined in different shared libraries.
+So we need to be able to quickly find all of the methods and class functions
+given the Objective-C class name, or quickly find all methods and class
+functions for a class + category name. This table does not contain any
+selector names, it just maps Objective-C class names (or class names +
+category) to all of the methods and class functions. The selectors are added
+as function basenames in the "``.debug_names``" section.
+
+In the "``.apple_names``" section for Objective-C functions, the full name is
+the entire function name with the brackets ("``-[NSString
+stringWithCString:]``") and the basename is the selector only
+("``stringWithCString:``").
+
+Mach-O Changes
+""""""""""""""
+
+The sections names for the apple hash tables are for non-mach-o files. For
+mach-o files, the sections should be contained in the ``__DWARF`` segment with
+names as follows:
+
+* "``.apple_names``" -> "``__apple_names``"
+* "``.apple_types``" -> "``__apple_types``"
+* "``.apple_namespaces``" -> "``__apple_namespac``" (16 character limit)
+* "``.apple_objc``" -> "``__apple_objc``"
+