Diffstat (limited to 'Documentation/mm/damon')
-rw-r--r-- | Documentation/mm/damon/api.rst | 20
-rw-r--r-- | Documentation/mm/damon/design.rst | 804
-rw-r--r-- | Documentation/mm/damon/faq.rst | 27
-rw-r--r-- | Documentation/mm/damon/index.rst | 45
-rw-r--r-- | Documentation/mm/damon/maintainer-profile.rst | 105
-rw-r--r-- | Documentation/mm/damon/monitoring_intervals_tuning_example.rst | 247
6 files changed, 1248 insertions, 0 deletions
diff --git a/Documentation/mm/damon/api.rst b/Documentation/mm/damon/api.rst new file mode 100644 index 000000000000..08f34df45523 --- /dev/null +++ b/Documentation/mm/damon/api.rst @@ -0,0 +1,20 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============= +API Reference +============= + +Kernel space programs can use every feature of DAMON using below APIs. All you +need to do is including ``damon.h``, which is located in ``include/linux/`` of +the source tree. + +Structures +========== + +.. kernel-doc:: include/linux/damon.h + + +Functions +========= + +.. kernel-doc:: mm/damon/core.c diff --git a/Documentation/mm/damon/design.rst b/Documentation/mm/damon/design.rst new file mode 100644 index 000000000000..ddc50db3afa4 --- /dev/null +++ b/Documentation/mm/damon/design.rst @@ -0,0 +1,804 @@ +.. SPDX-License-Identifier: GPL-2.0 + +====== +Design +====== + + +.. _damon_design_execution_model_and_data_structures: + +Execution Model and Data Structures +=================================== + +The monitoring-related information including the monitoring request +specification and DAMON-based operation schemes are stored in a data structure +called DAMON ``context``. DAMON executes each context with a kernel thread +called ``kdamond``. Multiple kdamonds could run in parallel, for different +types of monitoring. + +To know how user-space can do the configurations and start/stop DAMON, refer to +:ref:`DAMON sysfs interface <sysfs_interface>` documentation. 
+ + +Overall Architecture +==================== + +DAMON subsystem is configured with three layers including + +- :ref:`Operations Set <damon_operations_set>`: Implements fundamental + operations for DAMON that depends on the given monitoring target + address-space and available set of software/hardware primitives, +- :ref:`Core <damon_core_logic>`: Implements core logics including monitoring + overhead/accuracy control and access-aware system operations on top of the + operations set layer, and +- :ref:`Modules <damon_modules>`: Implements kernel modules for various + purposes that provides interfaces for the user space, on top of the core + layer. + + +.. _damon_operations_set: + +Operations Set Layer +==================== + +.. _damon_design_configurable_operations_set: + +For data access monitoring and additional low level work, DAMON needs a set of +implementations for specific operations that are dependent on and optimized for +the given target address space. For example, below two operations for access +monitoring are address-space dependent. + +1. Identification of the monitoring target address range for the address space. +2. Access check of specific address range in the target space. + +DAMON consolidates these implementations in a layer called DAMON Operations +Set, and defines the interface between it and the upper layer. The upper layer +is dedicated for DAMON's core logics including the mechanism for control of the +monitoring accuracy and the overhead. + +Hence, DAMON can easily be extended for any address space and/or available +hardware features by configuring the core logic to use the appropriate +operations set. If there is no available operations set for a given purpose, a +new operations set can be implemented following the interface between the +layers. + +For example, physical memory, virtual memory, swap space, those for specific +processes, NUMA nodes, files, and backing memory devices would be supportable. 
+Also, if some architectures or devices support special optimized access check +features, those will be easily configurable. + +DAMON currently provides below three operation sets. Below two subsections +describe how those work. + + - vaddr: Monitor virtual address spaces of specific processes + - fvaddr: Monitor fixed virtual address ranges + - paddr: Monitor the physical address space of the system + +To know how user-space can do the configuration via :ref:`DAMON sysfs interface +<sysfs_interface>`, refer to :ref:`operations <sysfs_context>` file part of the +documentation. + + + .. _damon_design_vaddr_target_regions_construction: + +VMA-based Target Address Range Construction +------------------------------------------- + +A mechanism of ``vaddr`` DAMON operations set that automatically initializes +and updates the monitoring target address regions so that entire memory +mappings of the target processes can be covered. + +This mechanism is only for the ``vaddr`` operations set. In cases of +``fvaddr`` and ``paddr`` operation sets, users are asked to manually set the +monitoring target address ranges. + +Only small parts in the super-huge virtual address space of the processes are +mapped to the physical memory and accessed. Thus, tracking the unmapped +address regions is just wasteful. However, because DAMON can deal with some +level of noise using the adaptive regions adjustment mechanism, tracking every +mapping is not strictly required but could even incur a high overhead in some +cases. That said, too huge unmapped areas inside the monitoring target should +be removed to not take the time for the adaptive mechanism. + +For the reason, this implementation converts the complex mappings to three +distinct regions that cover every mapped area of the address space. The two +gaps between the three regions are the two biggest unmapped areas in the given +address space. 
The two biggest unmapped areas would be the gap between the +heap and the uppermost mmap()-ed region, and the gap between the lowermost +mmap()-ed region and the stack in most of the cases. Because these gaps are +exceptionally huge in usual address spaces, excluding these will be sufficient +to make a reasonable trade-off. Below shows this in detail:: + + <heap> + <BIG UNMAPPED REGION 1> + <uppermost mmap()-ed region> + (small mmap()-ed regions and munmap()-ed regions) + <lowermost mmap()-ed region> + <BIG UNMAPPED REGION 2> + <stack> + + +PTE Accessed-bit Based Access Check +----------------------------------- + +Both of the implementations for physical and virtual address spaces use PTE +Accessed-bit for basic access checks. Only one difference is the way of +finding the relevant PTE Accessed bit(s) from the address. While the +implementation for the virtual address walks the page table for the target task +of the address, the implementation for the physical address walks every page +table having a mapping to the address. In this way, the implementations find +and clear the bit(s) for next sampling target address and checks whether the +bit(s) set again after one sampling period. This could disturb other kernel +subsystems using the Accessed bits, namely Idle page tracking and the reclaim +logic. DAMON does nothing to avoid disturbing Idle page tracking, so handling +the interference is the responsibility of sysadmins. However, it solves the +conflict with the reclaim logic using ``PG_idle`` and ``PG_young`` page flags, +as Idle page tracking does. + + +.. _damon_core_logic: + +Core Logics +=========== + +.. _damon_design_monitoring: + +Monitoring +---------- + +Below four sections describe each of the DAMON core mechanisms and the five +monitoring attributes, ``sampling interval``, ``aggregation interval``, +``update interval``, ``minimum number of regions``, and ``maximum number of +regions``. 
+ +To know how user-space can set the attributes via :ref:`DAMON sysfs interface +<sysfs_interface>`, refer to :ref:`monitoring_attrs <sysfs_monitoring_attrs>` +part of the documentation. + + +Access Frequency Monitoring +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The output of DAMON says what pages are how frequently accessed for a given +duration. The resolution of the access frequency is controlled by setting +``sampling interval`` and ``aggregation interval``. In detail, DAMON checks +access to each page per ``sampling interval`` and aggregates the results. In +other words, counts the number of the accesses to each page. After each +``aggregation interval`` passes, DAMON calls callback functions that previously +registered by users so that users can read the aggregated results and then +clears the results. This can be described in below simple pseudo-code:: + + while monitoring_on: + for page in monitoring_target: + if accessed(page): + nr_accesses[page] += 1 + if time() % aggregation_interval == 0: + for callback in user_registered_callbacks: + callback(monitoring_target, nr_accesses) + for page in monitoring_target: + nr_accesses[page] = 0 + sleep(sampling interval) + +The monitoring overhead of this mechanism will arbitrarily increase as the +size of the target workload grows. + + +.. _damon_design_region_based_sampling: + +Region Based Sampling +~~~~~~~~~~~~~~~~~~~~~ + +To avoid the unbounded increase of the overhead, DAMON groups adjacent pages +that assumed to have the same access frequencies into a region. As long as the +assumption (pages in a region have the same access frequencies) is kept, only +one page in the region is required to be checked. Thus, for each ``sampling +interval``, DAMON randomly picks one page in each region, waits for one +``sampling interval``, checks whether the page is accessed meanwhile, and +increases the access frequency counter of the region if so. The counter is +called ``nr_accesses`` of the region. 
Therefore, the monitoring overhead is +controllable by setting the number of regions. DAMON allows users to set the +minimum and the maximum number of regions for the trade-off. + +This scheme, however, cannot preserve the quality of the output if the +assumption is not guaranteed. + + +.. _damon_design_adaptive_regions_adjustment: + +Adaptive Regions Adjustment +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Even if the initial monitoring target regions are well constructed to +fulfill the assumption (pages in the same region have similar access frequencies), +the data access pattern can be dynamically changed. This will result in low +monitoring quality. To keep the assumption as much as possible, DAMON +adaptively merges and splits each region based on their access frequency. + +For each ``aggregation interval``, it compares the access frequencies +(``nr_accesses``) of adjacent regions. If the difference is small, and if the +sum of the two regions' sizes is smaller than the size of total regions divided +by the ``minimum number of regions``, DAMON merges the two regions. If the +resulting number of total regions is still higher than ``maximum number of +regions``, it repeats the merging with an increasing access frequency difference +threshold until the upper-limit of the number of regions is met, or the +threshold becomes higher than the possible maximum value (``aggregation interval`` +divided by ``sampling interval``). Then, after it reports and clears the +aggregated access frequency of each region, it splits each region into two or +three regions if the total number of regions will not exceed the user-specified +maximum number of regions after the split. + +In this way, DAMON provides its best-effort quality and minimal overhead while +keeping the bounds users set for their trade-off. + + +.. _damon_design_age_tracking: + +Age Tracking +~~~~~~~~~~~~ + +By analyzing the monitoring results, users can also find how long the current +access pattern of a region has been maintained.
That could be used for good +understanding of the access pattern. For example, page placement algorithm +utilizing both the frequency and the recency could be implemented using that. +To make such access pattern maintained period analysis easier, DAMON maintains +yet another counter called ``age`` in each region. For each ``aggregation +interval``, DAMON checks if the region's size and access frequency +(``nr_accesses``) has significantly changed. If so, the counter is reset to +zero. Otherwise, the counter is increased. + + +Dynamic Target Space Updates Handling +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The monitoring target address range could dynamically changed. For example, +virtual memory could be dynamically mapped and unmapped. Physical memory could +be hot-plugged. + +As the changes could be quite frequent in some cases, DAMON allows the +monitoring operations to check dynamic changes including memory mapping changes +and applies it to monitoring operations-related data structures such as the +abstracted monitoring target memory area only for each of a user-specified time +interval (``update interval``). + +User-space can get the monitoring results via DAMON sysfs interface and/or +tracepoints. For more details, please refer to the documentations for +:ref:`DAMOS tried regions <sysfs_schemes_tried_regions>` and :ref:`tracepoint`, +respectively. + + +.. _damon_design_monitoring_params_tuning_guide: + +Monitoring Parameters Tuning Guide +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In short, set ``aggregation interval`` to capture meaningful amount of accesses +for the purpose. The amount of accesses can be measured using ``nr_accesses`` +and ``age`` of regions in the aggregated monitoring results snapshot. The +default value of the interval, ``100ms``, turns out to be too short in many +cases. Set ``sampling interval`` proportional to ``aggregation interval``. By +default, ``1/20`` is recommended as the ratio. 
+ +``Aggregation interval`` should be set as the time interval that the workload +can make an amount of accesses for the monitoring purpose, within the interval. +If the interval is too short, only small number of accesses are captured. As a +result, the monitoring results look everything is samely accessed only rarely. +For many purposes, that would be useless. If it is too long, however, the time +to converge regions with the :ref:`regions adjustment mechanism +<damon_design_adaptive_regions_adjustment>` can be too long, depending on the +time scale of the given purpose. This could happen if the workload is actually +making only rare accesses but the user thinks the amount of accesses for the +monitoring purpose too high. For such cases, the target amount of access to +capture per ``aggregation interval`` should carefully reconsidered. Also, note +that the captured amount of accesses is represented with not only +``nr_accesses``, but also ``age``. For example, even if every region on the +monitoring results show zero ``nr_accesses``, regions could still be +distinguished using ``age`` values as the recency information. + +Hence the optimum value of ``aggregation interval`` depends on the access +intensiveness of the workload. The user should tune the interval based on the +amount of access that captured on each aggregated snapshot of the monitoring +results. + +Note that the default value of the interval is 100 milliseconds, which is too +short in many cases, especially on large systems. + +``Sampling interval`` defines the resolution of each aggregation. If it is set +too large, monitoring results will look like every region was samely rarely +accessed, or samely frequently accessed. That is, regions become +undistinguishable based on access pattern, and therefore the results will be +useless in many use cases. If ``sampling interval`` is too small, it will not +degrade the resolution, but will increase the monitoring overhead. 
If it is +appropriate enough to provide a resolution of the monitoring results that is +sufficient for the given purpose, it shouldn't be unnecessarily further +lowered. It is recommended to be set proportional to ``aggregation interval``. +By default, the ratio is set as ``1/20``, and it is still recommended. + +Based on the manual tuning guide, DAMON provides a more intuitive knob-based +intervals auto tuning mechanism. Please refer to :ref:`the design document of +the feature <damon_design_monitoring_intervals_autotuning>` for detail. + +Refer to the below document for an example tuning based on the above guide. + +.. toctree:: + :maxdepth: 1 + + monitoring_intervals_tuning_example + + +.. _damon_design_monitoring_intervals_autotuning: + +Monitoring Intervals Auto-tuning +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +DAMON provides automatic tuning of the ``sampling interval`` and ``aggregation +interval`` based on the :ref:`tuning guide idea +<damon_design_monitoring_params_tuning_guide>`. The tuning mechanism allows +users to set the aimed amount of access events to observe via DAMON within +a given time interval. The target can be specified by the user as a ratio of +DAMON-observed access events to the theoretical maximum amount of the events +(``access_bp``) that is measured within a given number of aggregations +(``aggrs``). + +The DAMON-observed access events are calculated in byte granularity based on +DAMON :ref:`region assumption <damon_design_region_based_sampling>`. For +example, if a region of size ``X`` bytes of ``Y`` ``nr_accesses`` is found, it +means ``X * Y`` access events are observed by DAMON. Theoretical maximum +access events for the region is calculated in the same way, but replacing ``Y`` +with the theoretical maximum ``nr_accesses``, which can be calculated as +``aggregation interval / sampling interval``.
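The byte-granularity calculation described above can be sketched as below. This is only an illustrative sketch and not DAMON's implementation: the function name ``access_events_bp``, the ``(size, nr_accesses)`` tuple representation of regions, and the use of a single aggregated snapshot instead of ``aggrs`` aggregations are assumptions made for the example. The result is expressed in bp (1/10,000), assuming that is the unit ``access_bp`` uses.

```python
def access_events_bp(regions, aggregation_interval_us, sampling_interval_us):
    """Return the observed-to-maximum access events ratio in bp (1/10,000).

    regions: iterable of (size_in_bytes, nr_accesses) tuples, a hypothetical
    representation of one aggregated monitoring results snapshot.
    """
    # Theoretical maximum nr_accesses of a region within one aggregation.
    max_nr_accesses = aggregation_interval_us // sampling_interval_us
    observed = 0
    maximum = 0
    for size, nr_accesses in regions:
        observed += size * nr_accesses      # X * Y access events
        maximum += size * max_nr_accesses   # X * (theoretical maximum Y)
    if maximum == 0:
        return 0
    return observed * 10000 // maximum


# One fully accessed 4 KiB region and one idle 4 KiB region,
# with 100 ms aggregation interval and 5 ms sampling interval.
snapshot = [(4096, 20), (4096, 0)]
print(access_events_bp(snapshot, 100_000, 5_000))  # prints 5000
```

With the 4% target mentioned in this section, such a snapshot (50%) would be far above the target, so the mechanism described here would change the intervals accordingly.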
+ +The mechanism calculates the ratio of access events for ``aggrs`` aggregations, +and increases or decreases the ``sampling interval`` and ``aggregation +interval`` in the same ratio, if the observed access ratio is lower or higher than +the target, respectively. The ratio of the intervals change is decided in +proportion to the distance between current samples ratio and the target ratio. + +The user can further set the minimum and maximum ``sampling interval`` that can +be set by the tuning mechanism using two parameters (``min_sample_us`` and +``max_sample_us``). Because the tuning mechanism always changes ``sampling interval`` +and ``aggregation interval`` in the same ratio, the minimum and maximum +``aggregation interval`` after each of the tuning changes are automatically set +together. + +The tuning is turned off by default, and needs to be set explicitly by the user. +As a rule of thumb based on the Pareto principle, a 4% access samples ratio target +is recommended. Note that the Pareto principle (80/20 rule) is applied twice. +That is, it assumes a 4% (20% of 20%) DAMON-observed access events ratio (source) +captures 64% (80% multiplied by 80%) of real access events (outcomes). + +To know how user-space can use this feature via :ref:`DAMON sysfs interface +<sysfs_interface>`, refer to :ref:`intervals_goal <sysfs_scheme>` part of +the documentation. + + +.. _damon_design_damos: + +Operation Schemes +----------------- + +One common purpose of data access monitoring is access-aware system efficiency +optimizations. For example, + + paging out memory regions that are not accessed for more than two minutes + +or + + using THP for memory regions that are larger than 2 MiB and showing a high + access frequency for more than one minute. + +One straightforward approach for such schemes would be profile-guided +optimizations.
That is, getting data access monitoring results of the +workloads or the system using DAMON, finding memory regions of special +characteristics by profiling the monitoring results, and making system +operation changes for the regions. The changes could be made by modifying or +providing advice to the software (the application and/or the kernel), or +reconfiguring the hardware. Both offline and online approaches could be +available. + +Among those, providing advice to the kernel at runtime would be flexible and +effective, and therefore widely be used. However, implementing such schemes +could impose unnecessary redundancy and inefficiency. The profiling could be +redundant if the type of interest is common. Exchanging the information +including monitoring results and operation advice between kernel and user +spaces could be inefficient. + +To allow users to reduce such redundancy and inefficiencies by offloading the +works, DAMON provides a feature called Data Access Monitoring-based Operation +Schemes (DAMOS). It lets users specify their desired schemes at a high +level. For such specifications, DAMON starts monitoring, finds regions having +the access pattern of interest, and applies the user-desired operation actions +to the regions, for every user-specified time interval called +``apply_interval``. + +To know how user-space can set ``apply_interval`` via :ref:`DAMON sysfs +interface <sysfs_interface>`, refer to :ref:`apply_interval_us <sysfs_scheme>` +part of the documentation. + + +.. _damon_design_damos_action: + +Operation Action +~~~~~~~~~~~~~~~~ + +The management action that the users desire to apply to the regions of their +interest. For example, paging out, prioritizing for next reclamation victim +selection, advising ``khugepaged`` to collapse or split, or doing nothing but +collecting statistics of the regions. 
+ +The list of supported actions is defined in DAMOS, but the implementation of +each action is in the DAMON operations set layer because the implementation +normally depends on the monitoring target address space. For example, the code +for paging specific virtual address ranges out would be different from that for +physical address ranges. And the monitoring operations implementation sets are +not mandated to support all actions of the list. Hence, the availability of +specific DAMOS action depends on what operations set is selected to be used +together. + +The list of the supported actions, their meaning, and DAMON operations sets +that supports each action are as below. + + - ``willneed``: Call ``madvise()`` for the region with ``MADV_WILLNEED``. + Supported by ``vaddr`` and ``fvaddr`` operations set. + - ``cold``: Call ``madvise()`` for the region with ``MADV_COLD``. + Supported by ``vaddr`` and ``fvaddr`` operations set. + - ``pageout``: Reclaim the region. + Supported by ``vaddr``, ``fvaddr`` and ``paddr`` operations set. + - ``hugepage``: Call ``madvise()`` for the region with ``MADV_HUGEPAGE``. + Supported by ``vaddr`` and ``fvaddr`` operations set. + - ``nohugepage``: Call ``madvise()`` for the region with ``MADV_NOHUGEPAGE``. + Supported by ``vaddr`` and ``fvaddr`` operations set. + - ``lru_prio``: Prioritize the region on its LRU lists. + Supported by ``paddr`` operations set. + - ``lru_deprio``: Deprioritize the region on its LRU lists. + Supported by ``paddr`` operations set. + - ``migrate_hot``: Migrate the regions prioritizing warmer regions. + Supported by ``paddr`` operations set. + - ``migrate_cold``: Migrate the regions prioritizing colder regions. + Supported by ``paddr`` operations set. + - ``stat``: Do nothing but count the statistics. + Supported by all operations sets. + +Applying the actions except ``stat`` to a region is considered as changing the +region's characteristics. 
Hence, DAMOS resets the age of regions when any such +actions are applied to those. + +To know how user-space can set the action via :ref:`DAMON sysfs interface +<sysfs_interface>`, refer to :ref:`action <sysfs_scheme>` part of the +documentation. + + +.. _damon_design_damos_access_pattern: + +Target Access Pattern +~~~~~~~~~~~~~~~~~~~~~ + +The access pattern of the schemes' interest. The patterns are constructed with +the properties that DAMON's monitoring results provide, specifically the size, +the access frequency, and the age. Users can describe their access pattern of +interest by setting minimum and maximum values of the three properties. If a +region's three properties are in the ranges, DAMOS classifies it as one of the +regions that the scheme is having an interest in. + +To know how user-space can set the access pattern via :ref:`DAMON sysfs +interface <sysfs_interface>`, refer to :ref:`access_pattern +<sysfs_access_pattern>` part of the documentation. + + +.. _damon_design_damos_quotas: + +Quotas +~~~~~~ + +DAMOS upper-bound overhead control feature. DAMOS could incur high overhead if +the target access pattern is not properly tuned. For example, if a huge memory +region having the access pattern of interest is found, applying the scheme's +action to all pages of the huge region could consume unacceptably large system +resources. Preventing such issues by tuning the access pattern could be +challenging, especially if the access patterns of the workloads are highly +dynamic. + +To mitigate that situation, DAMOS provides an upper-bound overhead control +feature called quotas. It lets users specify an upper limit of time that DAMOS +can use for applying the action, and/or a maximum bytes of memory regions that +the action can be applied within a user-specified time duration. + +To know how user-space can set the basic quotas via :ref:`DAMON sysfs interface +<sysfs_interface>`, refer to :ref:`quotas <sysfs_quotas>` part of the +documentation. + + +.. 
_damon_design_damos_quotas_prioritization: + +Prioritization +^^^^^^^^^^^^^^ + +A mechanism for making a good decision under the quotas. When the action +cannot be applied to all regions of interest due to the quotas, DAMOS +prioritizes regions and applies the action to only regions having high enough +priorities so that it will not exceed the quotas. + +The prioritization mechanism should be different for each action. For example, +rarely accessed (colder) memory regions would be prioritized for page-out +scheme action. In contrast, the colder regions would be deprioritized for huge +page collapse scheme action. Hence, the prioritization mechanisms for each +action are implemented in each DAMON operations set, together with the actions. + +Though the implementation is up to the DAMON operations set, it would be common +to calculate the priority using the access pattern properties of the regions. +Some users would want the mechanisms to be personalized for their specific +case. For example, some users would want the mechanism to weigh the recency +(``age``) more than the access frequency (``nr_accesses``). DAMOS allows users +to specify the weight of each access pattern property and passes the +information to the underlying mechanism. Nevertheless, how and even whether +the weight will be respected are up to the underlying prioritization mechanism +implementation. + +To know how user-space can set the prioritization weights via :ref:`DAMON sysfs +interface <sysfs_interface>`, refer to :ref:`weights <sysfs_quotas>` part of +the documentation. + + +.. _damon_design_damos_quotas_auto_tuning: + +Aim-oriented Feedback-driven Auto-tuning +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Automatic feedback-driven quota tuning. Instead of setting the absolute quota +value, users can specify the metric of their interest, and what target value +they want the metric value to be. DAMOS then automatically tunes the +aggressiveness (the quota) of the corresponding scheme. 
For example, if DAMOS +is under-achieving the goal, DAMOS automatically increases the quota. If DAMOS +is over-achieving the goal, it decreases the quota. + +The goal can be specified with four parameters, namely ``target_metric``, +``target_value``, ``current_value`` and ``nid``. The auto-tuning mechanism +tries to make ``current_value`` of ``target_metric`` be the same as +``target_value``. + +- ``user_input``: User-provided value. Users could use any metric that they + are interested in for the value. User space main workload's latency or + throughput, system metrics like free memory ratio or memory pressure stall + time (PSI) could be examples. Note that users should explicitly set + ``current_value`` on their own in this case. In other words, users should + repeatedly provide the feedback. +- ``some_mem_psi_us``: System-wide ``some`` memory pressure stall information + in microseconds that is measured from last quota reset to next quota reset. + DAMOS does the measurement on its own, so only ``target_value`` needs to be + set by users at the initial time. In other words, DAMOS does self-feedback. +- ``node_mem_used_bp``: Specific NUMA node's used memory ratio in bp (1/10,000). +- ``node_mem_free_bp``: Specific NUMA node's free memory ratio in bp (1/10,000). + +``nid`` is required for only ``node_mem_used_bp`` and +``node_mem_free_bp``, to point to the specific NUMA node. + +To know how user-space can set the tuning goal metric, the target value, and/or +the current value via :ref:`DAMON sysfs interface <sysfs_interface>`, refer to +:ref:`quota goals <sysfs_schemes_quota_goals>` part of the documentation. + + +.. _damon_design_damos_watermarks: + +Watermarks +~~~~~~~~~~ + +Conditional DAMOS (de)activation automation. Users might want DAMOS to run +only under certain situations. For example, when a sufficient amount of free +memory is guaranteed, running a scheme for proactive reclamation would only +consume unnecessary system resources.
To avoid such consumption, the user would +need to manually monitor some metrics such as free memory ratio, and turn +DAMON/DAMOS on or off. + +DAMOS allows users to offload such works using three watermarks. It allows the +users to configure the metric of their interest, and three watermark values, +namely high, middle, and low. If the value of the metric becomes above the +high watermark or below the low watermark, the scheme is deactivated. If the +metric becomes below the mid watermark but above the low watermark, the scheme +is activated. If all schemes are deactivated by the watermarks, the monitoring +is also deactivated. In this case, the DAMON worker thread only periodically +checks the watermarks and therefore incurs nearly zero overhead. + +To know how user-space can set the watermarks via :ref:`DAMON sysfs interface +<sysfs_interface>`, refer to :ref:`watermarks <sysfs_watermarks>` part of the +documentation. + + +.. _damon_design_damos_filters: + +Filters +~~~~~~~ + +Non-access pattern-based target memory regions filtering. If users run +self-written programs or have good profiling tools, they could know something +more than the kernel, such as future access patterns or some special +requirements for specific types of memory. For example, some users may know +only anonymous pages can impact their program's performance. They can also +have a list of latency-critical processes. + +To let users optimize DAMOS schemes with such special knowledge, DAMOS provides +a feature called DAMOS filters. The feature allows users to set an arbitrary +number of filters for each scheme. Each filter specifies + +- a type of memory (``type``), +- whether it is for the memory of the type or all except the type + (``matching``), and +- whether it is to allow (include) or reject (exclude) applying + the scheme's action to the memory (``allow``). 
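The three properties of a filter listed above can be sketched as a small data structure. This is a minimal sketch for illustration only, not the kernel's data structure: the ``Filter`` class, the ``filter_decision()`` helper, and the representation of a page as a set of type names are all hypothetical. It shows only how ``matching`` selects which memory a single filter covers and how ``allow`` turns that into a decision.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Filter:
    type: str       # type of memory, e.g. "anon" or "young"
    matching: bool  # True: for memory of the type; False: for all except the type
    allow: bool     # True: allow (include); False: reject (exclude)


def filter_decision(filt: Filter, page_types: Set[str]) -> Optional[bool]:
    """Return True (allow), False (reject), or None if the filter does not
    cover the page. page_types is a hypothetical set of the page's types."""
    is_of_type = filt.type in page_types
    if is_of_type != filt.matching:
        return None  # this filter says nothing about the page
    return filt.allow


allow_anon = Filter(type="anon", matching=True, allow=True)
reject_non_anon = Filter(type="anon", matching=False, allow=False)
print(filter_decision(allow_anon, {"anon"}))       # prints True
print(filter_decision(reject_non_anon, {"file"}))  # prints False
print(filter_decision(reject_non_anon, {"anon"}))  # prints None
```

Returning ``None`` for uncovered pages keeps the single-filter decision separate from what happens when several filters are installed.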
+ +For efficient handling of filters, some types of filters are handled by the +core layer, while others are handled by the operations set. In the latter case, +hence, support of the filter types depends on the DAMON operations set. In +case of the core layer-handled filters, the memory regions that are excluded by the +filter are not counted as regions that the scheme has tried. In contrast, if +a memory region is filtered by an operations set layer-handled filter, it is +counted as the scheme has tried. This difference affects the statistics. + +When multiple filters are installed, the group of filters that are handled by the +core layer are evaluated first. After that, the group of filters that are handled +by the operations layer are evaluated. Filters in each of the groups are +evaluated in the installed order. If a part of memory is matched to one of the +filters, next filters are ignored. If the part passes through the filters +evaluation stage because it is not matched to any of the filters, applying the +scheme's action to it depends on the last filter's allowance type. If the last +filter was for allowing, the part of memory will be rejected, and vice versa. + +For example, let's assume 1) a filter for allowing anonymous pages and 2) +another filter for rejecting young pages are installed in the order. If a page +of a region that is eligible to apply the scheme's action is an anonymous page, +the scheme's action will be applied to the page regardless of whether it is +young or not, since it matches with the first allow-filter. If the page is +not anonymous but young, the scheme's action will not be applied, since the +second reject-filter blocks it. If the page is neither anonymous nor young, +the page will pass through the filters evaluation stage since there is no +matching filter, and the action will be applied to the page. + +Below ``type`` of filters are currently supported. + +- Core layer handled + - addr + - Applied to pages that belong to a given address range.
+ - target + - Applied to pages that belong to a given DAMON monitoring target. +- Operations layer handled, supported by only ``paddr`` operations set. + - anon + - Applied to pages that contain data that is not stored in files. + - active + - Applied to active pages. + - memcg + - Applied to pages that belong to a given cgroup. + - young + - Applied to pages that are accessed after the last access check from the + scheme. + - hugepage_size + - Applied to pages that are managed in a given size range. + - unmapped + - Applied to pages that are unmapped. + +To know how user-space can set the filters via :ref:`DAMON sysfs interface +<sysfs_interface>`, refer to :ref:`filters <sysfs_filters>` part of the +documentation. + +.. _damon_design_damos_stat: + +Statistics +~~~~~~~~~~ + +The statistics of DAMOS behaviors that are designed to help monitoring, tuning and +debugging of DAMOS. + +DAMOS accounts the following statistics for each scheme, from the beginning of the +scheme's execution. + +- ``nr_tried``: Total number of regions that the scheme is tried to be applied. +- ``sz_tried``: Total size of regions that the scheme is tried to be applied. +- ``sz_ops_filter_passed``: Total bytes that passed operations set + layer-handled DAMOS filters. +- ``nr_applied``: Total number of regions that the scheme is applied. +- ``sz_applied``: Total size of regions that the scheme is applied. +- ``qt_exceeds``: Total number of times the quota of the scheme has exceeded. + +"A scheme is tried to be applied to a region" means DAMOS core logic determined +the region is eligible to apply the scheme's :ref:`action +<damon_design_damos_action>`. The :ref:`access pattern +<damon_design_damos_access_pattern>`, :ref:`quotas +<damon_design_damos_quotas>`, :ref:`watermarks +<damon_design_damos_watermarks>`, and :ref:`filters +<damon_design_damos_filters>` that are handled by the core logic could affect this.
+The core logic will only ask the underlying :ref:`operation set +<damon_operations_set>` to apply the action to the region, so whether the +action is really applied or not is unclear. That's why it is called "tried". + +"A scheme is applied to a region" means the :ref:`operation set +<damon_operations_set>` has applied the action to at least a part of the +region. The :ref:`filters <damon_design_damos_filters>` that are handled by the +operation set, and the types of the :ref:`action <damon_design_damos_action>` +and the pages of the region can affect this. For example, if a filter is set +to exclude anonymous pages and the region has only anonymous pages, or if the +action is ``pageout`` while all pages of the region are unreclaimable, applying +the action to the region will fail. + +To know how user-space can read the stats via :ref:`DAMON sysfs interface +<sysfs_interface>`, refer to :ref:`stats <sysfs_stats>` part of the +documentation. + +Regions Walking +~~~~~~~~~~~~~~~ + +DAMOS feature allowing users to access each region that a DAMOS action has just +been applied to. Using this feature, DAMON :ref:`API <damon_design_api>` allows users +to access full properties of the regions including the access monitoring results +and amount of the region's internal memory that passed the DAMOS filters. +:ref:`DAMON sysfs interface <sysfs_interface>` also allows users to read the data +via special :ref:`files <sysfs_schemes_tried_regions>`. + +.. _damon_design_api: + +Application Programming Interface +--------------------------------- + +The programming interface for kernel space data access-aware applications. +DAMON is a framework, so it does nothing by itself. Instead, it only helps +other kernel components such as subsystems and modules build their data +access-aware applications using DAMON's core features. For this, DAMON exposes +all its features to other kernel components via its application programming +interface, namely ``include/linux/damon.h``.
Please refer to the API +:doc:`document </mm/damon/api>` for details of the interface. + + +.. _damon_modules: + +Modules +======= + +Because the core of DAMON is a framework for kernel components, it doesn't +provide any direct interface for the user space. Such interfaces should be +implemented by each DAMON API user kernel components, instead. DAMON subsystem +itself implements such DAMON API user modules, which are supposed to be used +for general purpose DAMON control and special purpose data access-aware system +operations, and provides stable application binary interfaces (ABI) for the +user space. The user space can build their efficient data access-aware +applications using the interfaces. + + +General Purpose User Interface Modules +-------------------------------------- + +DAMON modules that provide user space ABIs for general purpose DAMON usage in +runtime. + +Like many other ABIs, the modules create files on pseudo file systems like +'sysfs', allow users to specify their requests to and get the answers from +DAMON by writing to and reading from the files. As a response to such I/O, +DAMON user interface modules control DAMON and retrieve the results as user +requested via the DAMON API, and return the results to the user-space. + +The ABIs are designed to be used for user space applications development, +rather than human beings' fingers. Human users are recommended to use such +user space tools. One such Python-written user space tool is available at +Github (https://github.com/damonitor/damo), Pypi +(https://pypistats.org/packages/damo), and Fedora +(https://packages.fedoraproject.org/pkgs/python-damo/damo/). + +Currently, one module for this type, namely 'DAMON sysfs interface' is +available. Please refer to the ABI :ref:`doc <sysfs_interface>` for details of +the interfaces. + + +Special-Purpose Access-aware Kernel Modules +------------------------------------------- + +DAMON modules that provide user space ABI for specific purpose DAMON usage. 
+ +DAMON user interface modules are for full control of all DAMON features in +runtime. For each special-purpose system-wide data access-aware system +operations such as proactive reclamation or LRU lists balancing, the interfaces +could be simplified by removing unnecessary knobs for the specific purpose, and +extended for boot-time and even compile time control. Default values of DAMON +control parameters for the usage would also need to be optimized for the +purpose. + +To support such cases, yet more DAMON API user kernel modules that provide more +simple and optimized user space interfaces are available. Currently, two +modules for proactive reclamation and LRU lists manipulation are provided. For +more detail, please read the usage documents for those +(:doc:`/admin-guide/mm/damon/reclaim` and +:doc:`/admin-guide/mm/damon/lru_sort`). diff --git a/Documentation/mm/damon/faq.rst b/Documentation/mm/damon/faq.rst new file mode 100644 index 000000000000..3279dc7a8211 --- /dev/null +++ b/Documentation/mm/damon/faq.rst @@ -0,0 +1,27 @@ +.. SPDX-License-Identifier: GPL-2.0 + +========================== +Frequently Asked Questions +========================== + +Does DAMON support virtual memory only? +======================================= + +No. The core of the DAMON is address space independent. The address space +specific monitoring operations including monitoring target regions +constructions and actual access checks can be implemented and configured on the +DAMON core by the users. In this way, DAMON users can monitor any address +space with any access check technique. + +Nonetheless, DAMON provides vma/rmap tracking and PTE Accessed bit check based +implementations of the address space dependent functions for the virtual memory +and the physical memory by default, for a reference and convenient use. + + +Can I simply monitor page granularity? +====================================== + +Yes. 
You can do so by setting the ``min_nr_regions`` attribute higher than the +working set size divided by the page size. Because the monitoring target +regions size is forced to be ``>=page size``, the region split will have no +effect. diff --git a/Documentation/mm/damon/index.rst b/Documentation/mm/damon/index.rst new file mode 100644 index 000000000000..31c1fa955b3d --- /dev/null +++ b/Documentation/mm/damon/index.rst @@ -0,0 +1,45 @@ +.. SPDX-License-Identifier: GPL-2.0 + +================================================================ +DAMON: Data Access MONitoring and Access-aware System Operations +================================================================ + +DAMON is a Linux kernel subsystem that provides a framework for data access +monitoring and the monitoring results based system operations. The core +monitoring :ref:`mechanisms <damon_design_monitoring>` of DAMON make it + + - *accurate* (the monitoring output is useful enough for DRAM level memory + management; it might not be appropriate for CPU cache levels, though), + - *light-weight* (the monitoring overhead is low enough to be applied online), + and + - *scalable* (the upper-bound of the overhead is in constant range regardless + of the size of target workloads). + +Using this framework, therefore, the kernel can operate the system in an +access-aware fashion. Because the features are also exposed to the :doc:`user +space </admin-guide/mm/damon/index>`, users who have special information about +their workloads can write personalized applications for better understanding +and optimizations of their workloads and systems. + +For easier development of such systems, DAMON provides a feature called +:ref:`DAMOS <damon_design_damos>` (DAMon-based Operation Schemes) in addition +to the monitoring. Using the feature, DAMON users in both kernel and :doc:`user +spaces </admin-guide/mm/damon/index>` can do access-aware system operations +with no code but simple configurations. + +.. 
toctree:: + :maxdepth: 2 + + faq + design + api + maintainer-profile + +To utilize and control DAMON from the user-space, please refer to the +administration :doc:`guide </admin-guide/mm/damon/index>`. + +If you prefer academic papers for reading and citations, please use the papers +from `HPDC'22 <https://dl.acm.org/doi/abs/10.1145/3502181.3531466>`_ and +`Middleware19 Industry <https://dl.acm.org/doi/abs/10.1145/3366626.3368125>`_ . +Note that those cover DAMON implementations in Linux v5.16 and v5.15, +respectively. diff --git a/Documentation/mm/damon/maintainer-profile.rst b/Documentation/mm/damon/maintainer-profile.rst new file mode 100644 index 000000000000..ce3e98458339 --- /dev/null +++ b/Documentation/mm/damon/maintainer-profile.rst @@ -0,0 +1,105 @@ +.. SPDX-License-Identifier: GPL-2.0 + +DAMON Maintainer Entry Profile +============================== + +The DAMON subsystem covers the files that are listed in 'DATA ACCESS MONITOR' +section of 'MAINTAINERS' file. + +The mailing lists for the subsystem are damon@lists.linux.dev and +linux-mm@kvack.org. Patches should be made against the `mm-unstable tree +<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ whenever possible and posted +to the mailing lists. + +SCM Trees +--------- + +There are multiple Linux trees for DAMON development. Patches under +development or testing are queued in `damon/next +<https://git.kernel.org/sj/h/damon/next>`_ by the DAMON maintainer. +Sufficiently reviewed patches will be queued in `mm-unstable +<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ by the memory management +subsystem maintainer. After more sufficient tests, the patches will be queued +in `mm-stable <https://git.kernel.org/akpm/mm/h/mm-stable>`_, and finally +pull-requested to the mainline by the memory management subsystem maintainer. + +Note again the patches for `mm-unstable tree +<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ are queued by the memory +management subsystem maintainer. 
If the patches require some patches in +`damon/next tree <https://git.kernel.org/sj/h/damon/next>`_ which are not yet merged +in mm-unstable, please make sure the requirement is clearly specified. + +Submit checklist addendum +------------------------- + +When making DAMON changes, you should do the following. + +- Build change-related outputs including kernel and documents. +- Ensure the builds introduce no new errors or warnings. +- Run and ensure no new failures for DAMON `selftests + <https://github.com/damonitor/damon-tests/blob/master/corr/run.sh#L49>`_ and + `kunittests + <https://github.com/damonitor/damon-tests/blob/master/corr/tests/kunit.sh>`_. + +Further doing the following and sharing the results will be helpful. + +- Run `damon-tests/corr + <https://github.com/damonitor/damon-tests/tree/master/corr>`_ for normal + changes. +- Run `damon-tests/perf + <https://github.com/damonitor/damon-tests/tree/master/perf>`_ for performance + changes. + +Key cycle dates +--------------- + +Patches can be sent anytime. Key cycle dates of the `mm-unstable +<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ and `mm-stable +<https://git.kernel.org/akpm/mm/h/mm-stable>`_ trees depend on the memory +management subsystem maintainer. + +Review cadence +-------------- + +The DAMON maintainer works during the usual work hours (09:00 to 17:00, +Mon-Fri) in PT (Pacific Time). The response to patches will occasionally be +slow. Do not hesitate to send a ping if you have not heard back within a week +of sending a patch. + +Mailing tool +------------ + +Like many other Linux kernel subsystems, DAMON uses the mailing lists +(damon@lists.linux.dev and linux-mm@kvack.org) as the major communication +channel. There is a simple tool called `HacKerMaiL +<https://github.com/damonitor/hackermail>`_ (``hkml``), which is for people who +are not very familiar with the mailing lists based communication.
The tool +could be particularly helpful for DAMON community members since it is developed +and maintained by the DAMON maintainer. The tool is also officially announced to +support DAMON and general Linux kernel development workflow. + +In other words, `hkml <https://github.com/damonitor/hackermail>`_ is a mailing +tool for the DAMON community, which the DAMON maintainer is committed to support. +Please feel free to try and report issues or feature requests for the tool to +the maintainer. + +Community meetup +---------------- + +DAMON community is maintaining two bi-weekly meetup series for community +members who prefer synchronous conversations over mails. + +The first one is for any discussion between every community member. No +reservation is needed. + +The second one is for discussions on specific topics between restricted +members including the maintainer. The maintainer shares the available time +slots, and attendees should reserve one of those at least 24 hours before the +time slot, by reaching out to the maintainer. + +Schedules and available reservation time slots are available at the Google `doc +<https://docs.google.com/document/d/1v43Kcj3ly4CYqmAkMaZzLiM2GEnWfgdGbZAH3mi2vpM/edit?usp=sharing>`_. +There is also a public Google `calendar +<https://calendar.google.com/calendar/u/0?cid=ZDIwOTA4YTMxNjc2MDQ3NTIyMmUzYTM5ZmQyM2U4NDA0ZGIwZjBiYmJlZGQxNDM0MmY4ZTRjOTE0NjdhZDRiY0Bncm91cC5jYWxlbmRhci5nb29nbGUuY29t>`_ +that has the events. Anyone can subscribe to it. The DAMON maintainer will also +provide periodic reminders to the mailing list (damon@lists.linux.dev). diff --git a/Documentation/mm/damon/monitoring_intervals_tuning_example.rst b/Documentation/mm/damon/monitoring_intervals_tuning_example.rst new file mode 100644 index 000000000000..7207cbed591f --- /dev/null +++ b/Documentation/mm/damon/monitoring_intervals_tuning_example.rst @@ -0,0 +1,247 @@ +.. 
SPDX-License-Identifier: GPL-2.0 + +=================================================== +DAMON Monitoring Interval Parameters Tuning Example +=================================================== + +DAMON's monitoring parameters need tuning based on given workload and the +monitoring purpose. There is a :ref:`tuning guide +<damon_design_monitoring_params_tuning_guide>` for that. This document +provides an example tuning based on the guide. + +Setup +===== + +For the example below, DAMON of Linux kernel v6.11 and `damo +<https://github.com/damonitor/damo>`_ (DAMON user-space tool) v2.5.9 were used to +monitor and visualize access patterns on the physical address space of a system +running a real-world server workload. + +5ms/100ms intervals: Too Short Interval +======================================= + +Let's start by capturing the access pattern snapshot on the physical address +space of the system using DAMON, with the default interval parameters (5 +milliseconds and 100 milliseconds for the sampling and the aggregation +intervals, respectively). Wait ten minutes between the start of DAMON and +the capturing of the snapshot, to show meaningful time-wise access patterns. +:: + + # damo start + # sleep 600 + # damo record --snapshot 0 1 + # damo stop + +Then, list the DAMON-found regions of different access patterns, sorted by the +"access temperature". "Access temperature" is a metric representing the +access-hotness of a region. It is calculated as a weighted sum of the access +frequency and the age of the region. If the access frequency is 0 %, the +temperature is multiplied by minus one. That is, if a region is not accessed, +it gets a negative temperature, which gets lower the longer the region stays unaccessed. +The sorting is in temperature-ascending order, so the region at the top of the +list is the coldest, and the one at the bottom is the hottest one.
:: + + # damo report access --sort_regions_by temperature + 0 addr 16.052 GiB size 5.985 GiB access 0 % age 5.900 s # coldest + 1 addr 22.037 GiB size 6.029 GiB access 0 % age 5.300 s + 2 addr 28.065 GiB size 6.045 GiB access 0 % age 5.200 s + 3 addr 10.069 GiB size 5.983 GiB access 0 % age 4.500 s + 4 addr 4.000 GiB size 6.069 GiB access 0 % age 4.400 s + 5 addr 62.008 GiB size 3.992 GiB access 0 % age 3.700 s + 6 addr 56.795 GiB size 5.213 GiB access 0 % age 3.300 s + 7 addr 39.393 GiB size 6.096 GiB access 0 % age 2.800 s + 8 addr 50.782 GiB size 6.012 GiB access 0 % age 2.800 s + 9 addr 34.111 GiB size 5.282 GiB access 0 % age 2.300 s + 10 addr 45.489 GiB size 5.293 GiB access 0 % age 1.800 s # hottest + total size: 62.000 GiB + +The list shows no seemingly hot regions, and only minimum access pattern +diversity. Every region has zero access frequency. The number of regions is +11, close to the default ``min_nr_regions`` value of 10. Size of each region is also +nearly identical. We can suspect this is because the “adaptive regions adjustment” +mechanism was not working well. As the guide suggested, we can get relative +hotness of regions using ``age`` as the recency information. That would be +better than nothing, but given the fact that the longest age is only about 6 +seconds while we waited about ten minutes, it is unclear how useful this will +be. + +The histogram of total region size per temperature range +also shows no interesting distribution pattern.
:: + + # damo report access --style temperature-sz-hist + <temperature> <total size> + [-,590,000,000, -,549,000,000) 5.985 GiB |********** | + [-,549,000,000, -,508,000,000) 12.074 GiB |********************| + [-,508,000,000, -,467,000,000) 0 B | | + [-,467,000,000, -,426,000,000) 12.052 GiB |********************| + [-,426,000,000, -,385,000,000) 0 B | | + [-,385,000,000, -,344,000,000) 3.992 GiB |******* | + [-,344,000,000, -,303,000,000) 5.213 GiB |********* | + [-,303,000,000, -,262,000,000) 12.109 GiB |********************| + [-,262,000,000, -,221,000,000) 5.282 GiB |********* | + [-,221,000,000, -,180,000,000) 0 B | | + [-,180,000,000, -,139,000,000) 5.293 GiB |********* | + total size: 62.000 GiB + +In short, the parameters provide poor quality monitoring results for hot +regions detection. According to the :ref:`guide +<damon_design_monitoring_params_tuning_guide>`, this is due to the too short +aggregation interval. + +100ms/2s intervals: Starts Showing Small Hot Regions +==================================================== + +Following the guide, increase the intervals 20 times (100 milliseconds and 2 +seconds for sampling and aggregation intervals, respectively).
:: + + # damo start -s 100ms -a 2s + # sleep 600 + # damo record --snapshot 0 1 + # damo stop + # damo report access --sort_regions_by temperature + 0 addr 10.180 GiB size 6.117 GiB access 0 % age 7 m 8 s # coldest + 1 addr 49.275 GiB size 6.195 GiB access 0 % age 6 m 14 s + 2 addr 62.421 GiB size 3.579 GiB access 0 % age 6 m 4 s + 3 addr 40.154 GiB size 6.127 GiB access 0 % age 5 m 40 s + 4 addr 16.296 GiB size 6.182 GiB access 0 % age 5 m 32 s + 5 addr 34.254 GiB size 5.899 GiB access 0 % age 5 m 24 s + 6 addr 46.281 GiB size 2.995 GiB access 0 % age 5 m 20 s + 7 addr 28.420 GiB size 5.835 GiB access 0 % age 5 m 6 s + 8 addr 4.000 GiB size 6.180 GiB access 0 % age 4 m 16 s + 9 addr 22.478 GiB size 5.942 GiB access 0 % age 3 m 58 s + 10 addr 55.470 GiB size 915.645 MiB access 0 % age 3 m 6 s + 11 addr 56.364 GiB size 6.056 GiB access 0 % age 2 m 8 s + 12 addr 56.364 GiB size 4.000 KiB access 95 % age 16 s + 13 addr 49.275 GiB size 4.000 KiB access 100 % age 8 m 24 s # hottest + total size: 62.000 GiB + # damo report access --style temperature-sz-hist + <temperature> <total size> + [-42,800,000,000, -33,479,999,000) 22.018 GiB |***************** | + [-33,479,999,000, -24,159,998,000) 27.090 GiB |********************| + [-24,159,998,000, -14,839,997,000) 6.836 GiB |****** | + [-14,839,997,000, -5,519,996,000) 6.056 GiB |***** | + [-5,519,996,000, 3,800,005,000) 4.000 KiB |* | + [3,800,005,000, 13,120,006,000) 0 B | | + [13,120,006,000, 22,440,007,000) 0 B | | + [22,440,007,000, 31,760,008,000) 0 B | | + [31,760,008,000, 41,080,009,000) 0 B | | + [41,080,009,000, 50,400,010,000) 0 B | | + [50,400,010,000, 59,720,011,000) 4.000 KiB |* | + total size: 62.000 GiB + +DAMON found two distinct 4 KiB regions that are pretty hot. The regions are also +well aged. The hottest 4 KiB region kept its access frequency for about +8 minutes, and the coldest region stayed unaccessed for about 7 minutes. +The distribution on the histogram also appears to show a pattern.
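The "access temperature" used for sorting in the listing above can be sketched in a few lines. This is only an illustrative sketch, not the code of ``damo``: the function name, the use of microseconds for the age, and the weights of 100 for both terms are assumptions, chosen because they reproduce the lowest and highest histogram bounds shown above.

```python
def access_temperature(access_percent: int, age_us: int,
                       freq_weight: int = 100, age_weight: int = 100) -> int:
    """Weighted sum of access frequency and age; negated when never accessed.

    The weights and the microsecond unit are assumptions for illustration.
    """
    score = freq_weight * access_percent + age_weight * age_us
    # A region with 0 % access frequency gets a negative temperature,
    # so a longer idle age makes it colder.
    return -score if access_percent == 0 else score


# Coldest and hottest regions of the 100ms/2s listing above.
coldest = access_temperature(0, (7 * 60 + 8) * 1_000_000)     # 0 %, age 7 m 8 s
hottest = access_temperature(100, (8 * 60 + 24) * 1_000_000)  # 100 %, age 8 m 24 s
print(coldest, hottest)
```

Under these assumed weights, the two values coincide with the lower bounds of the first and last histogram bins above.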
+ +In particular, the finding of the 4 KiB regions among the 62 GiB total memory +shows DAMON’s adaptive regions adjustment is working as designed. + +Still, the number of regions is close to the ``min_nr_regions``, and sizes of +cold regions are similar. The result is apparently improved, but it still has +room for improvement. + +400ms/8s intervals: Pretty Improved Results +=========================================== + +Increase the intervals four times (400 milliseconds and 8 seconds +for sampling and aggregation intervals, respectively). :: + + # damo start -s 400ms -a 8s + # sleep 600 + # damo record --snapshot 0 1 + # damo stop + # damo report access --sort_regions_by temperature + 0 addr 64.492 GiB size 1.508 GiB access 0 % age 6 m 48 s # coldest + 1 addr 21.749 GiB size 5.674 GiB access 0 % age 6 m 8 s + 2 addr 27.422 GiB size 5.801 GiB access 0 % age 6 m + 3 addr 49.431 GiB size 8.675 GiB access 0 % age 5 m 28 s + 4 addr 33.223 GiB size 5.645 GiB access 0 % age 5 m 12 s + 5 addr 58.321 GiB size 6.170 GiB access 0 % age 5 m 4 s + [...] + 25 addr 6.615 GiB size 297.531 MiB access 15 % age 0 ns + 26 addr 9.513 GiB size 12.000 KiB access 20 % age 0 ns + 27 addr 9.511 GiB size 108.000 KiB access 25 % age 0 ns + 28 addr 9.513 GiB size 20.000 KiB access 25 % age 0 ns + 29 addr 9.511 GiB size 12.000 KiB access 30 % age 0 ns + 30 addr 9.520 GiB size 4.000 KiB access 40 % age 0 ns + [...] 
+ 41 addr 9.520 GiB size 4.000 KiB access 80 % age 56 s + 42 addr 9.511 GiB size 12.000 KiB access 100 % age 6 m 16 s + 43 addr 58.321 GiB size 4.000 KiB access 100 % age 6 m 24 s + 44 addr 9.512 GiB size 4.000 KiB access 100 % age 6 m 48 s + 45 addr 58.106 GiB size 4.000 KiB access 100 % age 6 m 48 s # hottest + total size: 62.000 GiB + # damo report access --style temperature-sz-hist + <temperature> <total size> + [-40,800,000,000, -32,639,999,000) 21.657 GiB |********************| + [-32,639,999,000, -24,479,998,000) 17.938 GiB |***************** | + [-24,479,998,000, -16,319,997,000) 16.885 GiB |**************** | + [-16,319,997,000, -8,159,996,000) 586.879 MiB |* | + [-8,159,996,000, 5,000) 4.946 GiB |***** | + [5,000, 8,160,006,000) 260.000 KiB |* | + [8,160,006,000, 16,320,007,000) 0 B | | + [16,320,007,000, 24,480,008,000) 0 B | | + [24,480,008,000, 32,640,009,000) 0 B | | + [32,640,009,000, 40,800,010,000) 16.000 KiB |* | + [40,800,010,000, 48,960,011,000) 8.000 KiB |* | + total size: 62.000 GiB + +The number of regions having different access patterns has significantly +increased. Size of each region is also more varied. Total size of non-zero +access frequency regions has also significantly increased. Maybe this is already +good enough to make some meaningful memory management efficiency changes. + +800ms/16s intervals: Another bias +================================= + +Further double the intervals (800 milliseconds and 16 seconds for sampling +and aggregation intervals, respectively). The results are further improved for +hot regions detection, but cold regions detection starts to look degraded.
:: + + # damo start -s 800ms -a 16s + # sleep 600 + # damo record --snapshot 0 1 + # damo stop + # damo report access --sort_regions_by temperature + 0 addr 64.781 GiB size 1.219 GiB access 0 % age 4 m 48 s + 1 addr 24.505 GiB size 2.475 GiB access 0 % age 4 m 16 s + 2 addr 26.980 GiB size 504.273 MiB access 0 % age 4 m + 3 addr 29.443 GiB size 2.462 GiB access 0 % age 4 m + 4 addr 37.264 GiB size 5.645 GiB access 0 % age 4 m + 5 addr 31.905 GiB size 5.359 GiB access 0 % age 3 m 44 s + [...] + 20 addr 8.711 GiB size 40.000 KiB access 5 % age 2 m 40 s + 21 addr 27.473 GiB size 1.970 GiB access 5 % age 4 m + 22 addr 48.185 GiB size 4.625 GiB access 5 % age 4 m + 23 addr 47.304 GiB size 902.117 MiB access 10 % age 4 m + 24 addr 8.711 GiB size 4.000 KiB access 100 % age 4 m + 25 addr 20.793 GiB size 3.713 GiB access 5 % age 4 m 16 s + 26 addr 8.773 GiB size 4.000 KiB access 100 % age 4 m 16 s + total size: 62.000 GiB + # damo report access --style temperature-sz-hist + <temperature> <total size> + [-28,800,000,000, -23,359,999,000) 12.294 GiB |***************** | + [-23,359,999,000, -17,919,998,000) 9.753 GiB |************* | + [-17,919,998,000, -12,479,997,000) 15.131 GiB |********************| + [-12,479,997,000, -7,039,996,000) 0 B | | + [-7,039,996,000, -1,599,995,000) 7.506 GiB |********** | + [-1,599,995,000, 3,840,006,000) 6.127 GiB |********* | + [3,840,006,000, 9,280,007,000) 0 B | | + [9,280,007,000, 14,720,008,000) 136.000 KiB |* | + [14,720,008,000, 20,160,009,000) 40.000 KiB |* | + [20,160,009,000, 25,600,010,000) 11.188 GiB |*************** | + [25,600,010,000, 31,040,011,000) 4.000 KiB |* | + total size: 62.000 GiB + +It found more non-zero access frequency regions. The number of regions is still +much higher than the ``min_nr_regions``, but it is reduced from that of the +previous setup. And apparently the distribution seems a bit biased toward hot +regions.
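All four setups tried above keep the aggregation interval at 20 times the sampling interval. The short sketch below makes that arithmetic explicit. It assumes that the reported access frequency is the share of sampling checks within one aggregation interval that observed an access; that interpretation is an assumption of this sketch rather than something stated in this example, and the helper names are hypothetical.

```python
# (sampling interval, aggregation interval) in milliseconds, as used above.
SETUPS_MS = [(5, 100), (100, 2_000), (400, 8_000), (800, 16_000)]


def samples_per_aggregation(sampling_ms: int, aggregation_ms: int) -> int:
    """Number of access checks that fit into one aggregation interval."""
    return aggregation_ms // sampling_ms


def frequency_step_percent(sampling_ms: int, aggregation_ms: int) -> float:
    """Smallest access frequency difference that can be expressed, in percent.

    Assumes the frequency is (positive checks / total checks) per aggregation.
    """
    return 100 / samples_per_aggregation(sampling_ms, aggregation_ms)


for sampling_ms, aggregation_ms in SETUPS_MS:
    print(sampling_ms, aggregation_ms,
          samples_per_aggregation(sampling_ms, aggregation_ms),
          frequency_step_percent(sampling_ms, aggregation_ms))
```

Under that assumption, changing both intervals together keeps the frequency resolution the same and only changes how much wall-clock time each aggregated snapshot covers.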
+ +Conclusion +========== + +With the above experimental tuning results, we can conclude that the theory and the +guide make sense for at least this workload, and could be applied to similar +cases.