From f4840ccfca25db225b3371a8f7b5770febee87c5 Mon Sep 17 00:00:00 2001 From: Johannes Weiner Date: Thu, 19 May 2022 14:08:53 -0700 Subject: zswap: memcg accounting Applications can currently escape their cgroup memory containment when zswap is enabled. This patch adds per-cgroup tracking and limiting of zswap backend memory to rectify this. The existing cgroup2 memory.stat file is extended to show zswap statistics analogous to what's in meminfo and vmstat. Furthermore, two new control files, memory.zswap.current and memory.zswap.max, are added to allow tuning zswap usage on a per-workload basis. This is important since not all workloads benefit from zswap equally; some even suffer compared to disk swap when memory contents don't compress well. The optimal size of the zswap pool, and the threshold for writeback, also depends on the size of the workload's warm set. The implementation doesn't use a traditional page_counter transaction. zswap is unconventional as a memory consumer in that we only know the amount of memory to charge once expensive compression has occurred. If zwap is disabled or the limit is already exceeded we obviously don't want to compress page upon page only to reject them all. Instead, the limit is checked against current usage, then we compress and charge. This allows some limit overrun, but not enough to matter in practice. [hannes@cmpxchg.org: fix for CONFIG_SLOB builds] Link: https://lkml.kernel.org/r/YnwD14zxYjUJPc2w@cmpxchg.org [hannes@cmpxchg.org: opt out of cgroups v1] Link: https://lkml.kernel.org/r/Yn6it9mBYFA+/lTb@cmpxchg.org Link: https://lkml.kernel.org/r/20220510152847.230957-7-hannes@cmpxchg.org Signed-off-by: Johannes Weiner Cc: Michal Hocko Cc: Roman Gushchin Cc: Shakeel Butt Cc: Seth Jennings Cc: Dan Streetman Cc: Minchan Kim Signed-off-by: Andrew Morton --- Documentation/admin-guide/cgroup-v2.rst | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) (limited to 'Documentation/admin-guide/cgroup-v2.rst') diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst index 07ef3bf1448e..08ae8189e6cb 100644 --- a/Documentation/admin-guide/cgroup-v2.rst +++ b/Documentation/admin-guide/cgroup-v2.rst @@ -1354,6 +1354,12 @@ PAGE_SIZE multiple when read back. Amount of cached filesystem data that is swap-backed, such as tmpfs, shm segments, shared anonymous mmap()s + zswap + Amount of memory consumed by the zswap compression backend. + + zswapped + Amount of application memory swapped out to zswap. + file_mapped Amount of cached filesystem data mapped with mmap() @@ -1544,6 +1550,21 @@ PAGE_SIZE multiple when read back. higher than the limit for an extended period of time. This reduces the impact on the workload and memory management. + memory.zswap.current + A read-only single value file which exists on non-root + cgroups. + + The total amount of memory consumed by the zswap compression + backend. + + memory.zswap.max + A read-write single value file which exists on non-root + cgroups. The default is "max". + + Zswap usage hard limit. If a cgroup's zswap pool reaches this + limit, it will refuse to take any more stores before existing + entries fault back in or are written out to disk. + memory.pressure A read-only nested-keyed file. -- cgit v1.2.3-59-g8ed1b