diff options
Diffstat (limited to 'Documentation/admin-guide/mm/damon')
-rw-r--r-- | Documentation/admin-guide/mm/damon/index.rst | 9 | ||||
-rw-r--r-- | Documentation/admin-guide/mm/damon/lru_sort.rst | 294 | ||||
-rw-r--r-- | Documentation/admin-guide/mm/damon/reclaim.rst | 44 | ||||
-rw-r--r-- | Documentation/admin-guide/mm/damon/start.rst | 13 | ||||
-rw-r--r-- | Documentation/admin-guide/mm/damon/usage.rst | 619 |
5 files changed, 894 insertions, 85 deletions
diff --git a/Documentation/admin-guide/mm/damon/index.rst b/Documentation/admin-guide/mm/damon/index.rst index 61aff88347f3..33d37bb2fb4e 100644 --- a/Documentation/admin-guide/mm/damon/index.rst +++ b/Documentation/admin-guide/mm/damon/index.rst @@ -1,10 +1,10 @@ .. SPDX-License-Identifier: GPL-2.0 -======================== -Monitoring Data Accesses -======================== +========================== +DAMON: Data Access MONitor +========================== -:doc:`DAMON </vm/damon/index>` allows light-weight data access monitoring. +:doc:`DAMON </mm/damon/index>` allows light-weight data access monitoring. Using DAMON, users can analyze the memory access patterns of their systems and optimize those. @@ -14,3 +14,4 @@ optimize those. start usage reclaim + lru_sort diff --git a/Documentation/admin-guide/mm/damon/lru_sort.rst b/Documentation/admin-guide/mm/damon/lru_sort.rst new file mode 100644 index 000000000000..c09cace80651 --- /dev/null +++ b/Documentation/admin-guide/mm/damon/lru_sort.rst @@ -0,0 +1,294 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============================= +DAMON-based LRU-lists Sorting +============================= + +DAMON-based LRU-lists Sorting (DAMON_LRU_SORT) is a static kernel module that +aimed to be used for proactive and lightweight data access pattern based +(de)prioritization of pages on their LRU-lists for making LRU-lists a more +trusworthy data access pattern source. + +Where Proactive LRU-lists Sorting is Required? +============================================== + +As page-granularity access checking overhead could be significant on huge +systems, LRU lists are normally not proactively sorted but partially and +reactively sorted for special events including specific user requests, system +calls and memory pressure. As a result, LRU lists are sometimes not so +perfectly prepared to be used as a trustworthy access pattern source for some +situations including reclamation target pages selection under sudden memory +pressure. + +Because DAMON can identify access patterns of best-effort accuracy while +inducing only user-specified range of overhead, proactively running +DAMON_LRU_SORT could be helpful for making LRU lists more trustworthy access +pattern source with low and controlled overhead. + +How It Works? +============= + +DAMON_LRU_SORT finds hot pages (pages of memory regions that showing access +rates that higher than a user-specified threshold) and cold pages (pages of +memory regions that showing no access for a time that longer than a +user-specified threshold) using DAMON, and prioritizes hot pages while +deprioritizing cold pages on their LRU-lists. To avoid it consuming too much +CPU for the prioritizations, a CPU time usage limit can be configured. Under +the limit, it prioritizes and deprioritizes more hot and cold pages first, +respectively. System administrators can also configure under what situation +this scheme should automatically activated and deactivated with three memory +pressure watermarks. + +Its default parameters for hotness/coldness thresholds and CPU quota limit are +conservatively chosen. That is, the module under its default parameters could +be widely used without harm for common situations while providing a level of +benefits for systems having clear hot/cold access patterns under memory +pressure while consuming only a limited small portion of CPU time. + +Interface: Module Parameters +============================ + +To use this feature, you should first ensure your system is running on a kernel +that is built with ``CONFIG_DAMON_LRU_SORT=y``. + +To let sysadmins enable or disable it and tune for the given system, +DAMON_LRU_SORT utilizes module parameters. That is, you can put +``damon_lru_sort.<parameter>=<value>`` on the kernel boot command line or write +proper values to ``/sys/modules/damon_lru_sort/parameters/<parameter>`` files. + +Below are the description of each parameter. + +enabled +------- + +Enable or disable DAMON_LRU_SORT. + +You can enable DAMON_LRU_SORT by setting the value of this parameter as ``Y``. +Setting it as ``N`` disables DAMON_LRU_SORT. Note that DAMON_LRU_SORT could do +no real monitoring and LRU-lists sorting due to the watermarks-based activation +condition. Refer to below descriptions for the watermarks parameter for this. + +commit_inputs +------------- + +Make DAMON_LRU_SORT reads the input parameters again, except ``enabled``. + +Input parameters that updated while DAMON_LRU_SORT is running are not applied +by default. Once this parameter is set as ``Y``, DAMON_LRU_SORT reads values +of parametrs except ``enabled`` again. Once the re-reading is done, this +parameter is set as ``N``. If invalid parameters are found while the +re-reading, DAMON_LRU_SORT will be disabled. + +hot_thres_access_freq +--------------------- + +Access frequency threshold for hot memory regions identification in permil. + +If a memory region is accessed in frequency of this or higher, DAMON_LRU_SORT +identifies the region as hot, and mark it as accessed on the LRU list, so that +it could not be reclaimed under memory pressure. 50% by default. + +cold_min_age +------------ + +Time threshold for cold memory regions identification in microseconds. + +If a memory region is not accessed for this or longer time, DAMON_LRU_SORT +identifies the region as cold, and mark it as unaccessed on the LRU list, so +that it could be reclaimed first under memory pressure. 120 seconds by +default. + +quota_ms +-------- + +Limit of time for trying the LRU lists sorting in milliseconds. + +DAMON_LRU_SORT tries to use only up to this time within a time window +(quota_reset_interval_ms) for trying LRU lists sorting. This can be used +for limiting CPU consumption of DAMON_LRU_SORT. If the value is zero, the +limit is disabled. + +10 ms by default. + +quota_reset_interval_ms +----------------------- + +The time quota charge reset interval in milliseconds. + +The charge reset interval for the quota of time (quota_ms). That is, +DAMON_LRU_SORT does not try LRU-lists sorting for more than quota_ms +milliseconds or quota_sz bytes within quota_reset_interval_ms milliseconds. + +1 second by default. + +wmarks_interval +--------------- + +The watermarks check time interval in microseconds. + +Minimal time to wait before checking the watermarks, when DAMON_LRU_SORT is +enabled but inactive due to its watermarks rule. 5 seconds by default. + +wmarks_high +----------- + +Free memory rate (per thousand) for the high watermark. + +If free memory of the system in bytes per thousand bytes is higher than this, +DAMON_LRU_SORT becomes inactive, so it does nothing but periodically checks the +watermarks. 200 (20%) by default. + +wmarks_mid +---------- + +Free memory rate (per thousand) for the middle watermark. + +If free memory of the system in bytes per thousand bytes is between this and +the low watermark, DAMON_LRU_SORT becomes active, so starts the monitoring and +the LRU-lists sorting. 150 (15%) by default. + +wmarks_low +---------- + +Free memory rate (per thousand) for the low watermark. + +If free memory of the system in bytes per thousand bytes is lower than this, +DAMON_LRU_SORT becomes inactive, so it does nothing but periodically checks the +watermarks. 50 (5%) by default. + +sample_interval +--------------- + +Sampling interval for the monitoring in microseconds. + +The sampling interval of DAMON for the cold memory monitoring. Please refer to +the DAMON documentation (:doc:`usage`) for more detail. 5ms by default. + +aggr_interval +------------- + +Aggregation interval for the monitoring in microseconds. + +The aggregation interval of DAMON for the cold memory monitoring. Please +refer to the DAMON documentation (:doc:`usage`) for more detail. 100ms by +default. + +min_nr_regions +-------------- + +Minimum number of monitoring regions. + +The minimal number of monitoring regions of DAMON for the cold memory +monitoring. This can be used to set lower-bound of the monitoring quality. +But, setting this too high could result in increased monitoring overhead. +Please refer to the DAMON documentation (:doc:`usage`) for more detail. 10 by +default. + +max_nr_regions +-------------- + +Maximum number of monitoring regions. + +The maximum number of monitoring regions of DAMON for the cold memory +monitoring. This can be used to set upper-bound of the monitoring overhead. +However, setting this too low could result in bad monitoring quality. Please +refer to the DAMON documentation (:doc:`usage`) for more detail. 1000 by +defaults. + +monitor_region_start +-------------------- + +Start of target memory region in physical address. + +The start physical address of memory region that DAMON_LRU_SORT will do work +against. By default, biggest System RAM is used as the region. + +monitor_region_end +------------------ + +End of target memory region in physical address. + +The end physical address of memory region that DAMON_LRU_SORT will do work +against. By default, biggest System RAM is used as the region. + +kdamond_pid +----------- + +PID of the DAMON thread. + +If DAMON_LRU_SORT is enabled, this becomes the PID of the worker thread. Else, +-1. + +nr_lru_sort_tried_hot_regions +----------------------------- + +Number of hot memory regions that tried to be LRU-sorted. + +bytes_lru_sort_tried_hot_regions +-------------------------------- + +Total bytes of hot memory regions that tried to be LRU-sorted. + +nr_lru_sorted_hot_regions +------------------------- + +Number of hot memory regions that successfully be LRU-sorted. + +bytes_lru_sorted_hot_regions +---------------------------- + +Total bytes of hot memory regions that successfully be LRU-sorted. + +nr_hot_quota_exceeds +-------------------- + +Number of times that the time quota limit for hot regions have exceeded. + +nr_lru_sort_tried_cold_regions +------------------------------ + +Number of cold memory regions that tried to be LRU-sorted. + +bytes_lru_sort_tried_cold_regions +--------------------------------- + +Total bytes of cold memory regions that tried to be LRU-sorted. + +nr_lru_sorted_cold_regions +-------------------------- + +Number of cold memory regions that successfully be LRU-sorted. + +bytes_lru_sorted_cold_regions +----------------------------- + +Total bytes of cold memory regions that successfully be LRU-sorted. + +nr_cold_quota_exceeds +--------------------- + +Number of times that the time quota limit for cold regions have exceeded. + +Example +======= + +Below runtime example commands make DAMON_LRU_SORT to find memory regions +having >=50% access frequency and LRU-prioritize while LRU-deprioritizing +memory regions that not accessed for 120 seconds. The prioritization and +deprioritization is limited to be done using only up to 1% CPU time to avoid +DAMON_LRU_SORT consuming too much CPU time for the (de)prioritization. It also +asks DAMON_LRU_SORT to do nothing if the system's free memory rate is more than +50%, but start the real works if it becomes lower than 40%. If DAMON_RECLAIM +doesn't make progress and therefore the free memory rate becomes lower than +20%, it asks DAMON_LRU_SORT to do nothing again, so that we can fall back to +the LRU-list based page granularity reclamation. :: + + # cd /sys/modules/damon_lru_sort/parameters + # echo 500 > hot_thres_access_freq + # echo 120000000 > cold_min_age + # echo 10 > quota_ms + # echo 1000 > quota_reset_interval_ms + # echo 500 > wmarks_high + # echo 400 > wmarks_mid + # echo 200 > wmarks_low + # echo Y > enabled diff --git a/Documentation/admin-guide/mm/damon/reclaim.rst b/Documentation/admin-guide/mm/damon/reclaim.rst index fb9def3a7355..4f1479a11e63 100644 --- a/Documentation/admin-guide/mm/damon/reclaim.rst +++ b/Documentation/admin-guide/mm/damon/reclaim.rst @@ -48,12 +48,6 @@ DAMON_RECLAIM utilizes module parameters. That is, you can put ``damon_reclaim.<parameter>=<value>`` on the kernel boot command line or write proper values to ``/sys/modules/damon_reclaim/parameters/<parameter>`` files. -Note that the parameter values except ``enabled`` are applied only when -DAMON_RECLAIM starts. Therefore, if you want to apply new parameter values in -runtime and DAMON_RECLAIM is already enabled, you should disable and re-enable -it via ``enabled`` parameter file. Writing of the new values to proper -parameter values should be done before the re-enablement. - Below are the description of each parameter. enabled @@ -66,6 +60,17 @@ Setting it as ``N`` disables DAMON_RECLAIM. Note that DAMON_RECLAIM could do no real monitoring and reclamation due to the watermarks-based activation condition. Refer to below descriptions for the watermarks parameter for this. +commit_inputs +------------- + +Make DAMON_RECLAIM reads the input parameters again, except ``enabled``. + +Input parameters that updated while DAMON_RECLAIM is running are not applied +by default. Once this parameter is set as ``Y``, DAMON_RECLAIM reads values +of parametrs except ``enabled`` again. Once the re-reading is done, this +parameter is set as ``N``. If invalid parameters are found while the +re-reading, DAMON_RECLAIM will be disabled. + min_age ------- @@ -208,6 +213,31 @@ PID of the DAMON thread. If DAMON_RECLAIM is enabled, this becomes the PID of the worker thread. Else, -1. +nr_reclaim_tried_regions +------------------------ + +Number of memory regions that tried to be reclaimed by DAMON_RECLAIM. + +bytes_reclaim_tried_regions +--------------------------- + +Total bytes of memory regions that tried to be reclaimed by DAMON_RECLAIM. + +nr_reclaimed_regions +-------------------- + +Number of memory regions that successfully be reclaimed by DAMON_RECLAIM. + +bytes_reclaimed_regions +----------------------- + +Total bytes of memory regions that successfully be reclaimed by DAMON_RECLAIM. + +nr_quota_exceeds +---------------- + +Number of times that the time/space quota limits have exceeded. + Example ======= @@ -232,4 +262,4 @@ granularity reclamation. :: .. [1] https://research.google/pubs/pub48551/ .. [2] https://lwn.net/Articles/787611/ -.. [3] https://www.kernel.org/doc/html/latest/vm/free_page_reporting.html +.. [3] https://www.kernel.org/doc/html/latest/mm/free_page_reporting.html diff --git a/Documentation/admin-guide/mm/damon/start.rst b/Documentation/admin-guide/mm/damon/start.rst index 4d5ca2c46288..9f88afc734da 100644 --- a/Documentation/admin-guide/mm/damon/start.rst +++ b/Documentation/admin-guide/mm/damon/start.rst @@ -29,16 +29,9 @@ called DAMON Operator (DAMO). It is available at https://github.com/awslabs/damo. The examples below assume that ``damo`` is on your ``$PATH``. It's not mandatory, though. -Because DAMO is using the debugfs interface (refer to :doc:`usage` for the -detail) of DAMON, you should ensure debugfs is mounted. Mount it manually as -below:: - - # mount -t debugfs none /sys/kernel/debug/ - -or append the following line to your ``/etc/fstab`` file so that your system -can automatically mount debugfs upon booting:: - - debugfs /sys/kernel/debug debugfs defaults 0 0 +Because DAMO is using the sysfs interface (refer to :doc:`usage` for the +detail) of DAMON, you should ensure :doc:`sysfs </filesystems/sysfs>` is +mounted. Recording Data Access Patterns diff --git a/Documentation/admin-guide/mm/damon/usage.rst b/Documentation/admin-guide/mm/damon/usage.rst index ed96bbf0daff..b47b0cbbd491 100644 --- a/Documentation/admin-guide/mm/damon/usage.rst +++ b/Documentation/admin-guide/mm/damon/usage.rst @@ -4,49 +4,412 @@ Detailed Usages =============== -DAMON provides below three interfaces for different users. +DAMON provides below interfaces for different users. - *DAMON user space tool.* - This is for privileged people such as system administrators who want a - just-working human-friendly interface. Using this, users can use the DAMON’s - major features in a human-friendly way. It may not be highly tuned for - special cases, though. It supports both virtual and physical address spaces - monitoring. + `This <https://github.com/awslabs/damo>`_ is for privileged people such as + system administrators who want a just-working human-friendly interface. + Using this, users can use the DAMON’s major features in a human-friendly way. + It may not be highly tuned for special cases, though. It supports both + virtual and physical address spaces monitoring. For more detail, please + refer to its `usage document + <https://github.com/awslabs/damo/blob/next/USAGE.md>`_. +- *sysfs interface.* + :ref:`This <sysfs_interface>` is for privileged user space programmers who + want more optimized use of DAMON. Using this, users can use DAMON’s major + features by reading from and writing to special sysfs files. Therefore, + you can write and use your personalized DAMON sysfs wrapper programs that + reads/writes the sysfs files instead of you. The `DAMON user space tool + <https://github.com/awslabs/damo>`_ is one example of such programs. It + supports both virtual and physical address spaces monitoring. Note that this + interface provides only simple :ref:`statistics <damos_stats>` for the + monitoring results. For detailed monitoring results, DAMON provides a + :ref:`tracepoint <tracepoint>`. - *debugfs interface.* - This is for privileged user space programmers who want more optimized use of - DAMON. Using this, users can use DAMON’s major features by reading - from and writing to special debugfs files. Therefore, you can write and use - your personalized DAMON debugfs wrapper programs that reads/writes the - debugfs files instead of you. The DAMON user space tool is also a reference - implementation of such programs. It supports both virtual and physical - address spaces monitoring. + :ref:`This <debugfs_interface>` is almost identical to :ref:`sysfs interface + <sysfs_interface>`. This will be removed after next LTS kernel is released, + so users should move to the :ref:`sysfs interface <sysfs_interface>`. - *Kernel Space Programming Interface.* - This is for kernel space programmers. Using this, users can utilize every - feature of DAMON most flexibly and efficiently by writing kernel space - DAMON application programs for you. You can even extend DAMON for various - address spaces. + :doc:`This </mm/damon/api>` is for kernel space programmers. Using this, + users can utilize every feature of DAMON most flexibly and efficiently by + writing kernel space DAMON application programs for you. You can even extend + DAMON for various address spaces. For detail, please refer to the interface + :doc:`document </mm/damon/api>`. -Nevertheless, you could write your own user space tool using the debugfs -interface. A reference implementation is available at -https://github.com/awslabs/damo. If you are a kernel programmer, you could -refer to :doc:`/vm/damon/api` for the kernel space programming interface. For -the reason, this document describes only the debugfs interface +.. _sysfs_interface: + +sysfs Interface +=============== + +DAMON sysfs interface is built when ``CONFIG_DAMON_SYSFS`` is defined. It +creates multiple directories and files under its sysfs directory, +``<sysfs>/kernel/mm/damon/``. You can control DAMON by writing to and reading +from the files under the directory. + +For a short example, users can monitor the virtual address space of a given +workload as below. :: + + # cd /sys/kernel/mm/damon/admin/ + # echo 1 > kdamonds/nr_kdamonds && echo 1 > kdamonds/0/contexts/nr_contexts + # echo vaddr > kdamonds/0/contexts/0/operations + # echo 1 > kdamonds/0/contexts/0/targets/nr_targets + # echo $(pidof <workload>) > kdamonds/0/contexts/0/targets/0/pid_target + # echo on > kdamonds/0/state + +Files Hierarchy +--------------- + +The files hierarchy of DAMON sysfs interface is shown below. In the below +figure, parents-children relations are represented with indentations, each +directory is having ``/`` suffix, and files in each directory are separated by +comma (","). :: + + /sys/kernel/mm/damon/admin + │ kdamonds/nr_kdamonds + │ │ 0/state,pid + │ │ │ contexts/nr_contexts + │ │ │ │ 0/avail_operations,operations + │ │ │ │ │ monitoring_attrs/ + │ │ │ │ │ │ intervals/sample_us,aggr_us,update_us + │ │ │ │ │ │ nr_regions/min,max + │ │ │ │ │ targets/nr_targets + │ │ │ │ │ │ 0/pid_target + │ │ │ │ │ │ │ regions/nr_regions + │ │ │ │ │ │ │ │ 0/start,end + │ │ │ │ │ │ │ │ ... + │ │ │ │ │ │ ... + │ │ │ │ │ schemes/nr_schemes + │ │ │ │ │ │ 0/action + │ │ │ │ │ │ │ access_pattern/ + │ │ │ │ │ │ │ │ sz/min,max + │ │ │ │ │ │ │ │ nr_accesses/min,max + │ │ │ │ │ │ │ │ age/min,max + │ │ │ │ │ │ │ quotas/ms,bytes,reset_interval_ms + │ │ │ │ │ │ │ │ weights/sz_permil,nr_accesses_permil,age_permil + │ │ │ │ │ │ │ watermarks/metric,interval_us,high,mid,low + │ │ │ │ │ │ │ stats/nr_tried,sz_tried,nr_applied,sz_applied,qt_exceeds + │ │ │ │ │ │ ... + │ │ │ │ ... + │ │ ... + +Root +---- + +The root of the DAMON sysfs interface is ``<sysfs>/kernel/mm/damon/``, and it +has one directory named ``admin``. The directory contains the files for +privileged user space programs' control of DAMON. User space tools or deamons +having the root permission could use this directory. + +kdamonds/ +--------- + +The monitoring-related information including request specifications and results +are called DAMON context. DAMON executes each context with a kernel thread +called kdamond, and multiple kdamonds could run in parallel. + +Under the ``admin`` directory, one directory, ``kdamonds``, which has files for +controlling the kdamonds exist. In the beginning, this directory has only one +file, ``nr_kdamonds``. Writing a number (``N``) to the file creates the number +of child directories named ``0`` to ``N-1``. Each directory represents each +kdamond. + +kdamonds/<N>/ +------------- + +In each kdamond directory, two files (``state`` and ``pid``) and one directory +(``contexts``) exist. + +Reading ``state`` returns ``on`` if the kdamond is currently running, or +``off`` if it is not running. Writing ``on`` or ``off`` makes the kdamond be +in the state. Writing ``commit`` to the ``state`` file makes kdamond reads the +user inputs in the sysfs files except ``state`` file again. Writing +``update_schemes_stats`` to ``state`` file updates the contents of stats files +for each DAMON-based operation scheme of the kdamond. For details of the +stats, please refer to :ref:`stats section <sysfs_schemes_stats>`. + +If the state is ``on``, reading ``pid`` shows the pid of the kdamond thread. + +``contexts`` directory contains files for controlling the monitoring contexts +that this kdamond will execute. + +kdamonds/<N>/contexts/ +---------------------- + +In the beginning, this directory has only one file, ``nr_contexts``. Writing a +number (``N``) to the file creates the number of child directories named as +``0`` to ``N-1``. Each directory represents each monitoring context. At the +moment, only one context per kdamond is supported, so only ``0`` or ``1`` can +be written to the file. + +contexts/<N>/ +------------- + +In each context directory, two files (``avail_operations`` and ``operations``) +and three directories (``monitoring_attrs``, ``targets``, and ``schemes``) +exist. + +DAMON supports multiple types of monitoring operations, including those for +virtual address space and the physical address space. You can get the list of +available monitoring operations set on the currently running kernel by reading +``avail_operations`` file. Based on the kernel configuration, the file will +list some or all of below keywords. + + - vaddr: Monitor virtual address spaces of specific processes + - fvaddr: Monitor fixed virtual address ranges + - paddr: Monitor the physical address space of the system + +Please refer to :ref:`regions sysfs directory <sysfs_regions>` for detailed +differences between the operations sets in terms of the monitoring target +regions. + +You can set and get what type of monitoring operations DAMON will use for the +context by writing one of the keywords listed in ``avail_operations`` file and +reading from the ``operations`` file. + +contexts/<N>/monitoring_attrs/ +------------------------------ + +Files for specifying attributes of the monitoring including required quality +and efficiency of the monitoring are in ``monitoring_attrs`` directory. +Specifically, two directories, ``intervals`` and ``nr_regions`` exist in this +directory. + +Under ``intervals`` directory, three files for DAMON's sampling interval +(``sample_us``), aggregation interval (``aggr_us``), and update interval +(``update_us``) exist. You can set and get the values in micro-seconds by +writing to and reading from the files. + +Under ``nr_regions`` directory, two files for the lower-bound and upper-bound +of DAMON's monitoring regions (``min`` and ``max``, respectively), which +controls the monitoring overhead, exist. You can set and get the values by +writing to and rading from the files. + +For more details about the intervals and monitoring regions range, please refer +to the Design document (:doc:`/mm/damon/design`). + +contexts/<N>/targets/ +--------------------- + +In the beginning, this directory has only one file, ``nr_targets``. Writing a +number (``N``) to the file creates the number of child directories named ``0`` +to ``N-1``. Each directory represents each monitoring target. + +targets/<N>/ +------------ + +In each target directory, one file (``pid_target``) and one directory +(``regions``) exist. + +If you wrote ``vaddr`` to the ``contexts/<N>/operations``, each target should +be a process. You can specify the process to DAMON by writing the pid of the +process to the ``pid_target`` file. + +.. _sysfs_regions: + +targets/<N>/regions +------------------- + +When ``vaddr`` monitoring operations set is being used (``vaddr`` is written to +the ``contexts/<N>/operations`` file), DAMON automatically sets and updates the +monitoring target regions so that entire memory mappings of target processes +can be covered. However, users could want to set the initial monitoring region +to specific address ranges. + +In contrast, DAMON do not automatically sets and updates the monitoring target +regions when ``fvaddr`` or ``paddr`` monitoring operations sets are being used +(``fvaddr`` or ``paddr`` have written to the ``contexts/<N>/operations``). +Therefore, users should set the monitoring target regions by themselves in the +cases. + +For such cases, users can explicitly set the initial monitoring target regions +as they want, by writing proper values to the files under this directory. + +In the beginning, this directory has only one file, ``nr_regions``. Writing a +number (``N``) to the file creates the number of child directories named ``0`` +to ``N-1``. Each directory represents each initial monitoring target region. + +regions/<N>/ +------------ + +In each region directory, you will find two files (``start`` and ``end``). You +can set and get the start and end addresses of the initial monitoring target +region by writing to and reading from the files, respectively. + +contexts/<N>/schemes/ +--------------------- + +For usual DAMON-based data access aware memory management optimizations, users +would normally want the system to apply a memory management action to a memory +region of a specific access pattern. DAMON receives such formalized operation +schemes from the user and applies those to the target memory regions. Users +can get and set the schemes by reading from and writing to files under this +directory. + +In the beginning, this directory has only one file, ``nr_schemes``. Writing a +number (``N``) to the file creates the number of child directories named ``0`` +to ``N-1``. Each directory represents each DAMON-based operation scheme. + +schemes/<N>/ +------------ + +In each scheme directory, four directories (``access_pattern``, ``quotas``, +``watermarks``, and ``stats``) and one file (``action``) exist. + +The ``action`` file is for setting and getting what action you want to apply to +memory regions having specific access pattern of the interest. The keywords +that can be written to and read from the file and their meaning are as below. + + - ``willneed``: Call ``madvise()`` for the region with ``MADV_WILLNEED`` + - ``cold``: Call ``madvise()`` for the region with ``MADV_COLD`` + - ``pageout``: Call ``madvise()`` for the region with ``MADV_PAGEOUT`` + - ``hugepage``: Call ``madvise()`` for the region with ``MADV_HUGEPAGE`` + - ``nohugepage``: Call ``madvise()`` for the region with ``MADV_NOHUGEPAGE`` + - ``lru_prio``: Prioritize the region on its LRU lists. + - ``lru_deprio``: Deprioritize the region on its LRU lists. + - ``stat``: Do nothing but count the statistics + +schemes/<N>/access_pattern/ +--------------------------- + +The target access pattern of each DAMON-based operation scheme is constructed +with three ranges including the size of the region in bytes, number of +monitored accesses per aggregate interval, and number of aggregated intervals +for the age of the region. + +Under the ``access_pattern`` directory, three directories (``sz``, +``nr_accesses``, and ``age``) each having two files (``min`` and ``max``) +exist. You can set and get the access pattern for the given scheme by writing +to and reading from the ``min`` and ``max`` files under ``sz``, +``nr_accesses``, and ``age`` directories, respectively. + +schemes/<N>/quotas/ +------------------- + +Optimal ``target access pattern`` for each ``action`` is workload dependent, so +not easy to find. Worse yet, setting a scheme of some action too aggressive +can cause severe overhead. To avoid such overhead, users can limit time and +size quota for each scheme. In detail, users can ask DAMON to try to use only +up to specific time (``time quota``) for applying the action, and to apply the +action to only up to specific amount (``size quota``) of memory regions having +the target access pattern within a given time interval (``reset interval``). + +When the quota limit is expected to be exceeded, DAMON prioritizes found memory +regions of the ``target access pattern`` based on their size, access frequency, +and age. For personalized prioritization, users can set the weights for the +three properties. + +Under ``quotas`` directory, three files (``ms``, ``bytes``, +``reset_interval_ms``) and one directory (``weights``) having three files +(``sz_permil``, ``nr_accesses_permil``, and ``age_permil``) in it exist. + +You can set the ``time quota`` in milliseconds, ``size quota`` in bytes, and +``reset interval`` in milliseconds by writing the values to the three files, +respectively. You can also set the prioritization weights for size, access +frequency, and age in per-thousand unit by writing the values to the three +files under the ``weights`` directory. + +schemes/<N>/watermarks/ +----------------------- + +To allow easy activation and deactivation of each scheme based on system +status, DAMON provides a feature called watermarks. The feature receives five +values called ``metric``, ``interval``, ``high``, ``mid``, and ``low``. The +``metric`` is the system metric such as free memory ratio that can be measured. +If the metric value of the system is higher than the value in ``high`` or lower +than ``low`` at the memoent, the scheme is deactivated. If the value is lower +than ``mid``, the scheme is activated. + +Under the watermarks directory, five files (``metric``, ``interval_us``, +``high``, ``mid``, and ``low``) for setting each value exist. You can set and +get the five values by writing to the files, respectively. + +Keywords and meanings of those that can be written to the ``metric`` file are +as below. + + - none: Ignore the watermarks + - free_mem_rate: System's free memory rate (per thousand) + +The ``interval`` should written in microseconds unit. + +.. _sysfs_schemes_stats: + +schemes/<N>/stats/ +------------------ + +DAMON counts the total number and bytes of regions that each scheme is tried to +be applied, the two numbers for the regions that each scheme is successfully +applied, and the total number of the quota limit exceeds. This statistics can +be used for online analysis or tuning of the schemes. + +The statistics can be retrieved by reading the files under ``stats`` directory +(``nr_tried``, ``sz_tried``, ``nr_applied``, ``sz_applied``, and +``qt_exceeds``), respectively. The files are not updated in real time, so you +should ask DAMON sysfs interface to updte the content of the files for the +stats by writing a special keyword, ``update_schemes_stats`` to the relevant +``kdamonds/<N>/state`` file. + +Example +~~~~~~~ + +Below commands applies a scheme saying "If a memory region of size in [4KiB, +8KiB] is showing accesses per aggregate interval in [0, 5] for aggregate +interval in [10, 20], page out the region. For the paging out, use only up to +10ms per second, and also don't page out more than 1GiB per second. Under the +limitation, page out memory regions having longer age first. Also, check the +free memory rate of the system every 5 seconds, start the monitoring and paging +out when the free memory rate becomes lower than 50%, but stop it if the free +memory rate becomes larger than 60%, or lower than 30%". :: + + # cd <sysfs>/kernel/mm/damon/admin + # # populate directories + # echo 1 > kdamonds/nr_kdamonds; echo 1 > kdamonds/0/contexts/nr_contexts; + # echo 1 > kdamonds/0/contexts/0/schemes/nr_schemes + # cd kdamonds/0/contexts/0/schemes/0 + # # set the basic access pattern and the action + # echo 4096 > access_pattern/sz/min + # echo 8192 > access_pattern/sz/max + # echo 0 > access_pattern/nr_accesses/min + # echo 5 > access_pattern/nr_accesses/max + # echo 10 > access_pattern/age/min + # echo 20 > access_pattern/age/max + # echo pageout > action + # # set quotas + # echo 10 > quotas/ms + # echo $((1024*1024*1024)) > quotas/bytes + # echo 1000 > quotas/reset_interval_ms + # # set watermark + # echo free_mem_rate > watermarks/metric + # echo 5000000 > watermarks/interval_us + # echo 600 > watermarks/high + # echo 500 > watermarks/mid + # echo 300 > watermarks/low + +Please note that it's highly recommended to use user space tools like `damo +<https://github.com/awslabs/damo>`_ rather than manually reading and writing +the files as above. Above is only for an example. + +.. _debugfs_interface: debugfs Interface ================= -DAMON exports five files, ``attrs``, ``target_ids``, ``init_regions``, -``schemes`` and ``monitor_on`` under its debugfs directory, -``<debugfs>/damon/``. +.. note:: + + DAMON debugfs interface will be removed after next LTS kernel is released, so + users should move to the :ref:`sysfs interface <sysfs_interface>`. + +DAMON exports eight files, ``attrs``, ``target_ids``, ``init_regions``, +``schemes``, ``monitor_on``, ``kdamond_pid``, ``mk_contexts`` and +``rm_contexts`` under its debugfs directory, ``<debugfs>/damon/``. Attributes ---------- Users can get and set the ``sampling interval``, ``aggregation interval``, -``regions update interval``, and min/max number of monitoring target regions by +``update interval``, and min/max number of monitoring target regions by reading from and writing to the ``attrs`` file. To know about the monitoring -attributes in detail, please refer to the :doc:`/vm/damon/design`. For +attributes in detail, please refer to the :doc:`/mm/damon/design`. For example, below commands set those values to 5 ms, 100 ms, 1,000 ms, 10 and 1000, and then check it again:: @@ -105,24 +468,28 @@ In such cases, users can explicitly set the initial monitoring target regions as they want, by writing proper values to the ``init_regions`` file. Each line of the input should represent one region in below form.:: - <target id> <start address> <end address> + <target idx> <start address> <end address> -The ``target id`` should already in ``target_ids`` file, and the regions should -be passed in address order. For example, below commands will set a couple of -address ranges, ``1-100`` and ``100-200`` as the initial monitoring target -region of process 42, and another couple of address ranges, ``20-40`` and -``50-100`` as that of process 4242.:: +The ``target idx`` should be the index of the target in ``target_ids`` file, +starting from ``0``, and the regions should be passed in address order. For +example, below commands will set a couple of address ranges, ``1-100`` and +``100-200`` as the initial monitoring target region of pid 42, which is the +first one (index ``0``) in ``target_ids``, and another couple of address +ranges, ``20-40`` and ``50-100`` as that of pid 4242, which is the second one +(index ``1``) in ``target_ids``.:: # cd <debugfs>/damon - # echo "42 1 100 - 42 100 200 - 4242 20 40 - 4242 50 100" > init_regions + # cat target_ids + 42 4242 + # echo "0 1 100 + 0 100 200 + 1 20 40 + 1 50 100" > init_regions Note that this sets the initial monitoring target regions only. In case of virtual memory monitoring, DAMON will automatically updates the boundary of the -regions after one ``regions update interval``. Therefore, users should set the -``regions update interval`` large enough in this case, if they don't want the +regions after one ``update interval``. Therefore, users should set the +``update interval`` large enough in this case, if they don't want the update. @@ -131,24 +498,38 @@ Schemes For usual DAMON-based data access aware memory management optimizations, users would simply want the system to apply a memory management action to a memory -region of a specific size having a specific access frequency for a specific -time. DAMON receives such formalized operation schemes from the user and -applies those to the target processes. It also counts the total number and -size of regions that each scheme is applied. This statistics can be used for -online analysis or tuning of the schemes. +region of a specific access pattern. DAMON receives such formalized operation +schemes from the user and applies those to the target processes. Users can get and set the schemes by reading from and writing to ``schemes`` debugfs file. Reading the file also shows the statistics of each scheme. To -the file, each of the schemes should be represented in each line in below form: +the file, each of the schemes should be represented in each line in below +form:: + + <target access pattern> <action> <quota> <watermarks> + +You can disable schemes by simply writing an empty string to the file. - min-size max-size min-acc max-acc min-age max-age action +Target Access Pattern +~~~~~~~~~~~~~~~~~~~~~ -Note that the ranges are closed interval. Bytes for the size of regions -(``min-size`` and ``max-size``), number of monitored accesses per aggregate -interval for access frequency (``min-acc`` and ``max-acc``), number of -aggregate intervals for the age of regions (``min-age`` and ``max-age``), and a -predefined integer for memory management actions should be used. The supported -numbers and their meanings are as below. +The ``<target access pattern>`` is constructed with three ranges in below +form:: + + min-size max-size min-acc max-acc min-age max-age + +Specifically, bytes for the size of regions (``min-size`` and ``max-size``), +number of monitored accesses per aggregate interval for access frequency +(``min-acc`` and ``max-acc``), number of aggregate intervals for the age of +regions (``min-age`` and ``max-age``) are specified. Note that the ranges are +closed interval. + +Action +~~~~~~ + +The ``<action>`` is a predefined integer for memory management actions, which +DAMON will apply to the regions having the target access pattern. The +supported numbers and their meanings are as below. - 0: Call ``madvise()`` for the region with ``MADV_WILLNEED`` - 1: Call ``madvise()`` for the region with ``MADV_COLD`` @@ -157,20 +538,82 @@ numbers and their meanings are as below. - 4: Call ``madvise()`` for the region with ``MADV_NOHUGEPAGE`` - 5: Do nothing but count the statistics -You can disable schemes by simply writing an empty string to the file. For -example, below commands applies a scheme saying "If a memory region of size in -[4KiB, 8KiB] is showing accesses per aggregate interval in [0, 5] for aggregate -interval in [10, 20], page out the region", check the entered scheme again, and -finally remove the scheme. :: +Quota +~~~~~ - # cd <debugfs>/damon - # echo "4096 8192 0 5 10 20 2" > schemes - # cat schemes - 4096 8192 0 5 10 20 2 0 0 - # echo > schemes +Optimal ``target access pattern`` for each ``action`` is workload dependent, so +not easy to find. Worse yet, setting a scheme of some action too aggressive +can cause severe overhead. To avoid such overhead, users can limit time and +size quota for the scheme via the ``<quota>`` in below form:: + + <ms> <sz> <reset interval> <priority weights> + +This makes DAMON to try to use only up to ``<ms>`` milliseconds for applying +the action to memory regions of the ``target access pattern`` within the +``<reset interval>`` milliseconds, and to apply the action to only up to +``<sz>`` bytes of memory regions within the ``<reset interval>``. Setting both +``<ms>`` and ``<sz>`` zero disables the quota limits. + +When the quota limit is expected to be exceeded, DAMON prioritizes found memory +regions of the ``target access pattern`` based on their size, access frequency, +and age. For personalized prioritization, users can set the weights for the +three properties in ``<priority weights>`` in below form:: -The last two integers in the 4th line of above example is the total number and -the total size of the regions that the scheme is applied. + <size weight> <access frequency weight> <age weight> + +Watermarks +~~~~~~~~~~ + +Some schemes would need to run based on current value of the system's specific +metrics like free memory ratio. For such cases, users can specify watermarks +for the condition.:: + + <metric> <check interval> <high mark> <middle mark> <low mark> + +``<metric>`` is a predefined integer for the metric to be checked. The +supported numbers and their meanings are as below. + + - 0: Ignore the watermarks + - 1: System's free memory rate (per thousand) + +The value of the metric is checked every ``<check interval>`` microseconds. + +If the value is higher than ``<high mark>`` or lower than ``<low mark>``, the +scheme is deactivated. If the value is lower than ``<mid mark>``, the scheme +is activated. + +.. _damos_stats: + +Statistics +~~~~~~~~~~ + +It also counts the total number and bytes of regions that each scheme is tried +to be applied, the two numbers for the regions that each scheme is successfully +applied, and the total number of the quota limit exceeds. This statistics can +be used for online analysis or tuning of the schemes. + +The statistics can be shown by reading the ``schemes`` file. Reading the file +will show each scheme you entered in each line, and the five numbers for the +statistics will be added at the end of each line. + +Example +~~~~~~~ + +Below commands applies a scheme saying "If a memory region of size in [4KiB, +8KiB] is showing accesses per aggregate interval in [0, 5] for aggregate +interval in [10, 20], page out the region. For the paging out, use only up to +10ms per second, and also don't page out more than 1GiB per second. Under the +limitation, page out memory regions having longer age first. Also, check the +free memory rate of the system every 5 seconds, start the monitoring and paging +out when the free memory rate becomes lower than 50%, but stop it if the free +memory rate becomes larger than 60%, or lower than 30%".:: + + # cd <debugfs>/damon + # scheme="4096 8192 0 5 10 20 2" # target access pattern and action + # scheme+=" 10 $((1024*1024*1024)) 1000" # quotas + # scheme+=" 0 0 100" # prioritization weights + # scheme+=" 1 5000000 600 500 300" # watermarks + # echo "$scheme" > schemes Turning On/Off @@ -195,6 +638,54 @@ the monitoring is turned on. If you write to the files while DAMON is running, an error code such as ``-EBUSY`` will be returned. +Monitoring Thread PID +--------------------- + +DAMON does requested monitoring with a kernel thread called ``kdamond``. You +can get the pid of the thread by reading the ``kdamond_pid`` file. When the +monitoring is turned off, reading the file returns ``none``. :: + + # cd <debugfs>/damon + # cat monitor_on + off + # cat kdamond_pid + none + # echo on > monitor_on + # cat kdamond_pid + 18594 + + +Using Multiple Monitoring Threads +--------------------------------- + +One ``kdamond`` thread is created for each monitoring context. You can create +and remove monitoring contexts for multiple ``kdamond`` required use case using +the ``mk_contexts`` and ``rm_contexts`` files. + +Writing the name of the new context to the ``mk_contexts`` file creates a +directory of the name on the DAMON debugfs directory. The directory will have +DAMON debugfs files for the context. :: + + # cd <debugfs>/damon + # ls foo + # ls: cannot access 'foo': No such file or directory + # echo foo > mk_contexts + # ls foo + # attrs init_regions kdamond_pid schemes target_ids + +If the context is not needed anymore, you can remove it and the corresponding +directory by putting the name of the context to the ``rm_contexts`` file. :: + + # echo foo > rm_contexts + # ls foo + # ls: cannot access 'foo': No such file or directory + +Note that ``mk_contexts``, ``rm_contexts``, and ``monitor_on`` files are in the +root directory only. + + +.. _tracepoint: + Tracepoint for Monitoring Results ================================= |