<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-dev/drivers/dax, branch master</title>
<subtitle>Linux kernel development work - see feature branches</subtitle>
<id>https://git.zx2c4.com/linux-dev/atom/drivers/dax?h=master</id>
<link rel='self' href='https://git.zx2c4.com/linux-dev/atom/drivers/dax?h=master'/>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/'/>
<updated>2022-10-15T01:41:41Z</updated>
<entry>
<title>Merge tag 'libnvdimm-for-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm</title>
<updated>2022-10-15T01:41:41Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-10-15T01:41:41Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=19d17ab7c68b62180e0537f92400a6f798019775'/>
<id>urn:sha1:19d17ab7c68b62180e0537f92400a6f798019775</id>
<content type='text'>
Pull nvdimm updates from Dan Williams:
 "Some small cleanups and fixes in and around the nvdimm subsystem. The
  most significant change is a regression fix for nvdimm namespace
  (volume) creation when the namespace size is smaller than 2MB/

  Summary:

   - Fix nvdimm namespace creation on platforms that do not publish
     associated 'DIMM' metadata for a persistent memory region.

   - Miscellaneous fixes and cleanups"

* tag 'libnvdimm-for-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  ACPI: HMAT: Release platform device in case of platform_device_add_data() fails
  dax: Remove usage of the deprecated ida_simple_xxx API
  libnvdimm/region: Allow setting align attribute on regions without mappings
  nvdimm/namespace: Fix comment typo
  nvdimm: make __nvdimm_security_overwrite_query static
  nvdimm/region: Fix kernel-doc
  nvdimm/namespace: drop unneeded temporary variable in size_store()
  nvdimm/namespace: return uuid_null only once in nd_dev_to_uuid()
</content>
</entry>
<entry>
<title>Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2022-10-11T00:53:04Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-10-11T00:53:04Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=27bc50fc90647bbf7b734c3fc306a5e61350da53'/>
<id>urn:sha1:27bc50fc90647bbf7b734c3fc306a5e61350da53</id>
<content type='text'>
Pull MM updates from Andrew Morton:

 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
   linux-next for a couple of months without, to my knowledge, any
   negative reports (or any positive ones, come to that).

 - Also the Maple Tree from Liam Howlett. An overlapping range-based
   tree for vmas. It it apparently slightly more efficient in its own
   right, but is mainly targeted at enabling work to reduce mmap_lock
   contention.

   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.

   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   at [1]. This has yet to be addressed due to Liam's unfortunately
   timed vacation. He is now back and we'll get this fixed up.

 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
   clang-generated instrumentation to detect used-unintialized bugs down
   to the single bit level.

   KMSAN keeps finding bugs. New ones, as well as the legacy ones.

 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.

 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
   support file/shmem-backed pages.

 - userfaultfd updates from Axel Rasmussen

 - zsmalloc cleanups from Alexey Romanov

 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
   memory-failure

 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.

 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.

 - memcg cleanups from Kairui Song.

 - memcg fixes and cleanups from Johannes Weiner.

 - Vishal Moola provides more folio conversions

 - Zhang Yi removed ll_rw_block() :(

 - migration enhancements from Peter Xu

 - migration error-path bugfixes from Huang Ying

 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths. For optimizations by PMEM drivers, DRM
   drivers, etc.

 - vma merging improvements from Jakub Matěn.

 - NUMA hinting cleanups from David Hildenbrand.

 - xu xin added aditional userspace visibility into KSM merging
   activity.

 - THP &amp; KSM code consolidation from Qi Zheng.

 - more folio work from Matthew Wilcox.

 - KASAN updates from Andrey Konovalov.

 - DAMON cleanups from Kaixu Xia.

 - DAMON work from SeongJae Park: fixes, cleanups.

 - hugetlb sysfs cleanups from Muchun Song.

 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.

Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]

* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
  hugetlb: allocate vma lock for all sharable vmas
  hugetlb: take hugetlb vma_lock when clearing vma_lock-&gt;vma pointer
  hugetlb: fix vma lock handling during split vma and range unmapping
  mglru: mm/vmscan.c: fix imprecise comments
  mm/mglru: don't sync disk for each aging cycle
  mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
  mm: memcontrol: use do_memsw_account() in a few more places
  mm: memcontrol: deprecate swapaccounting=0 mode
  mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
  mm/secretmem: remove reduntant return value
  mm/hugetlb: add available_huge_pages() func
  mm: remove unused inline functions from include/linux/mm_inline.h
  selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
  selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
  selftests/vm: add thp collapse shmem testing
  selftests/vm: add thp collapse file and tmpfs testing
  selftests/vm: modularize thp collapse memory operations
  selftests/vm: dedup THP helpers
  mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
  mm/madvise: add file and shmem support to MADV_COLLAPSE
  ...
</content>
</entry>
<entry>
<title>ACPI: HMAT: Release platform device in case of platform_device_add_data() fails</title>
<updated>2022-09-30T00:54:29Z</updated>
<author>
<name>Lin Yujun</name>
<email>linyujun809@huawei.com</email>
</author>
<published>2022-09-14T03:37:55Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=6a02124c87f0b61dcaaeb65e7fd406d8afb40fd4'/>
<id>urn:sha1:6a02124c87f0b61dcaaeb65e7fd406d8afb40fd4</id>
<content type='text'>
The platform device is not released when platform_device_add_data()
fails. And platform_device_put() perfom one more pointer check than
put_device() to check for errors in the 'pdev' pointer.

Use platform_device_put() to release platform device in
platform_device_add()/platform_device_add_data()/
platform_device_add_resources() error case.

Fixes: c01044cc8191 ("ACPI: HMAT: refactor hmat_register_target_device to hmem_register_device")
Signed-off-by: Lin Yujun &lt;linyujun809@huawei.com&gt;
Link: https://lore.kernel.org/r/20220914033755.99924-1-linyujun809@huawei.com
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
<entry>
<title>dax: Remove usage of the deprecated ida_simple_xxx API</title>
<updated>2022-09-30T00:29:27Z</updated>
<author>
<name>Bo Liu</name>
<email>liubo03@inspur.com</email>
</author>
<published>2022-09-26T01:26:35Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=0f702033a64bd3adcd57c9d5cf91ea64c08fad42'/>
<id>urn:sha1:0f702033a64bd3adcd57c9d5cf91ea64c08fad42</id>
<content type='text'>
ida_alloc_max() makes it clear that the second argument is inclusive,
and the alloc/free terminology is more idiomatic and symmetric then
get/remove.

Signed-off-by: Bo Liu &lt;liubo03@inspur.com&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Link: https://lore.kernel.org/r/20220926012635.3205-1-liubo03@inspur.com
[djbw: reword changelog]
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
<entry>
<title>mm/demotion/dax/kmem: set node's abstract distance to MEMTIER_DEFAULT_DAX_ADISTANCE</title>
<updated>2022-09-27T02:46:11Z</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.ibm.com</email>
</author>
<published>2022-08-18T13:10:36Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=7b88bda3761b95856cf97822efe8281c8100067b'/>
<id>urn:sha1:7b88bda3761b95856cf97822efe8281c8100067b</id>
<content type='text'>
By default, all nodes are assigned to the default memory tier which is the
memory tier designated for nodes with DRAM

Set dax kmem device node's tier to slower memory tier by assigning
abstract distance to MEMTIER_DEFAULT_DAX_ADISTANCE.  Low-level drivers
like papr_scm or ACPI NFIT can initialize memory device type to a more
accurate value based on device tree details or HMAT.  If the kernel
doesn't find the memory type initialized, a default slower memory type is
assigned by the kmem driver.

[aneesh.kumar@linux.ibm.com: assign correct memory type for multiple dax devices with the same node affinity]
  Link: https://lkml.kernel.org/r/20220826100224.542312-1-aneesh.kumar@linux.ibm.com
Link: https://lkml.kernel.org/r/20220818131042.113280-5-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Reviewed-by: "Huang, Ying" &lt;ying.huang@intel.com&gt;
Acked-by: Wei Xu &lt;weixugc@google.com&gt;
Cc: Alistair Popple &lt;apopple@nvidia.com&gt;
Cc: Bharata B Rao &lt;bharata@amd.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Hesham Almatary &lt;hesham.almatary@huawei.com&gt;
Cc: Jagdish Gediya &lt;jvgediya.oss@gmail.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Tim Chen &lt;tim.c.chen@intel.com&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: SeongJae Park &lt;sj@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>devdax: Fix soft-reservation memory description</title>
<updated>2022-09-25T01:05:53Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2022-09-23T22:05:56Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=67feaba413ec68daf4124e9870878899b4ed9a0e'/>
<id>urn:sha1:67feaba413ec68daf4124e9870878899b4ed9a0e</id>
<content type='text'>
The "hmem" platform-devices that are created to represent the
platform-advertised "Soft Reserved" memory ranges end up inserting a
resource that causes the iomem_resource tree to look like this:

340000000-43fffffff : hmem.0
  340000000-43fffffff : Soft Reserved
    340000000-43fffffff : dax0.0

This is because insert_resource() reparents ranges when they completely
intersect an existing range.

This matters because code that uses region_intersects() to scan for a
given IORES_DESC will only check that top-level 'hmem.0' resource and
not the 'Soft Reserved' descendant.

So, to support EINJ (via einj_error_inject()) to inject errors into
memory hosted by a dax-device, be sure to describe the memory as
IORES_DESC_SOFT_RESERVED. This is a follow-on to:

commit b13a3e5fd40b ("ACPI: APEI: Fix _EINJ vs EFI_MEMORY_SP")

...that fixed EINJ support for "Soft Reserved" ranges in the first
instance.

Fixes: 262b45ae3ab4 ("x86/efi: EFI soft reservation to E820 enumeration")
Reported-by: Ricardo Sandoval Torres &lt;ricardo.sandoval.torres@intel.com&gt;
Tested-by: Ricardo Sandoval Torres &lt;ricardo.sandoval.torres@intel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Omar Avelar &lt;omar.avelar@intel.com&gt;
Cc: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Cc: Mark Gross &lt;markgross@kernel.org&gt;
Link: https://lore.kernel.org/r/166397075670.389916.7435722208896316387.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
<entry>
<title>dax: introduce holder for dax_device</title>
<updated>2022-07-18T00:14:30Z</updated>
<author>
<name>Shiyang Ruan</name>
<email>ruansy.fnst@fujitsu.com</email>
</author>
<published>2022-06-03T05:37:25Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=8012b866085523758780850087102421dbcce522'/>
<id>urn:sha1:8012b866085523758780850087102421dbcce522</id>
<content type='text'>
Patch series "v14 fsdax-rmap + v11 fsdax-reflink", v2.

The patchset fsdax-rmap is aimed to support shared pages tracking for
fsdax.

It moves owner tracking from dax_assocaite_entry() to pmem device driver,
by introducing an interface -&gt;memory_failure() for struct pagemap.  This
interface is called by memory_failure() in mm, and implemented by pmem
device.

Then call holder operations to find the filesystem which the corrupted
data located in, and call filesystem handler to track files or metadata
associated with this page.

Finally we are able to try to fix the corrupted data in filesystem and do
other necessary processing, such as killing processes who are using the
files affected.

The call trace is like this:
memory_failure()
|* fsdax case
|------------
|pgmap-&gt;ops-&gt;memory_failure()      =&gt; pmem_pgmap_memory_failure()
| dax_holder_notify_failure()      =&gt;
|  dax_device-&gt;holder_ops-&gt;notify_failure() =&gt;
|                                     - xfs_dax_notify_failure()
|  |* xfs_dax_notify_failure()
|  |--------------------------
|  |   xfs_rmap_query_range()
|  |    xfs_dax_failure_fn()
|  |    * corrupted on metadata
|  |       try to recover data, call xfs_force_shutdown()
|  |    * corrupted on file data
|  |       try to recover data, call mf_dax_kill_procs()
|* normal case
|-------------
|mf_generic_kill_procs()


The patchset fsdax-reflink attempts to add CoW support for fsdax, and
takes XFS, which has both reflink and fsdax features, as an example.

One of the key mechanisms needed to be implemented in fsdax is CoW.  Copy
the data from srcmap before we actually write data to the destination
iomap.  And we just copy range in which data won't be changed.

Another mechanism is range comparison.  In page cache case, readpage() is
used to load data on disk to page cache in order to be able to compare
data.  In fsdax case, readpage() does not work.  So, we need another
compare data with direct access support.

With the two mechanisms implemented in fsdax, we are able to make reflink
and fsdax work together in XFS.


This patch (of 14):

To easily track filesystem from a pmem device, we introduce a holder for
dax_device structure, and also its operation.  This holder is used to
remember who is using this dax_device:

 - When it is the backend of a filesystem, the holder will be the
   instance of this filesystem.
 - When this pmem device is one of the targets in a mapped device, the
   holder will be this mapped device.  In this case, the mapped device
   has its own dax_device and it will follow the first rule.  So that we
   can finally track to the filesystem we needed.

The holder and holder_ops will be set when filesystem is being mounted,
or an target device is being activated.

Link: https://lkml.kernel.org/r/20220603053738.1218681-1-ruansy.fnst@fujitsu.com
Link: https://lkml.kernel.org/r/20220603053738.1218681-2-ruansy.fnst@fujitsu.com
Signed-off-by: Shiyang Ruan &lt;ruansy.fnst@fujitsu.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Dan Williams &lt;dan.j.wiliams@intel.com&gt;
Reviewed-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Jane Chu &lt;jane.chu@oracle.com&gt;
Cc: Goldwyn Rodrigues &lt;rgoldwyn@suse.de&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Cc: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
Cc: Ritesh Harjani &lt;riteshh@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>dax: add .recovery_write dax_operation</title>
<updated>2022-05-16T20:37:59Z</updated>
<author>
<name>Jane Chu</name>
<email>jane.chu@oracle.com</email>
</author>
<published>2022-04-22T22:45:06Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=047218ec904da19c45c4a70274fc3f818a1fcba1'/>
<id>urn:sha1:047218ec904da19c45c4a70274fc3f818a1fcba1</id>
<content type='text'>
Introduce dax_recovery_write() operation. The function is used to
recover a dax range that contains poison. Typical use case is when
a user process receives a SIGBUS with si_code BUS_MCEERR_AR
indicating poison(s) in a dax range, in response, the user process
issues a pwrite() to the page-aligned dax range, thus clears the
poison and puts valid data in the range.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jane Chu &lt;jane.chu@oracle.com&gt;
Link: https://lore.kernel.org/r/20220422224508.440670-6-jane.chu@oracle.com
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
<entry>
<title>dax: introduce DAX_RECOVERY_WRITE dax access mode</title>
<updated>2022-05-16T20:35:56Z</updated>
<author>
<name>Jane Chu</name>
<email>jane.chu@oracle.com</email>
</author>
<published>2022-05-13T22:10:58Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=e511c4a3d2a1f64aafc1f5df37a2ffcf7ef91b55'/>
<id>urn:sha1:e511c4a3d2a1f64aafc1f5df37a2ffcf7ef91b55</id>
<content type='text'>
Up till now, dax_direct_access() is used implicitly for normal
access, but for the purpose of recovery write, dax range with
poison is requested.  To make the interface clear, introduce
	enum dax_access_mode {
		DAX_ACCESS,
		DAX_RECOVERY_WRITE,
	}
where DAX_ACCESS is used for normal dax access, and
DAX_RECOVERY_WRITE is used for dax recovery write.

Suggested-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Jane Chu &lt;jane.chu@oracle.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Mike Snitzer &lt;snitzer@redhat.com&gt;
Reviewed-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Link: https://lore.kernel.org/r/165247982851.52965.11024212198889762949.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'dax-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm</title>
<updated>2022-03-25T01:12:09Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-03-25T01:12:09Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=f0614eefbf829a2914ac9a82cb8bbeaf1af28f9d'/>
<id>urn:sha1:f0614eefbf829a2914ac9a82cb8bbeaf1af28f9d</id>
<content type='text'>
Pull DAX updates from Dan Williams:
 "Andrew has been shepherding major dax features that touch the core -mm
  through his tree, but I still collect the dax updates that are core-mm
  independent.

   - Fix a crash due to a missing rcu_barrier() in dax_fs_exit()

   - Fix two miscellaneous doc issues"

* tag 'dax-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  dax: Fix missing kdoc for dax_device
  dax: make sure inodes are flushed before destroy cache
  fsdax: fix function description
</content>
</entry>
</feed>
