<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-dev/ipc, branch linus/master</title>
<subtitle>Linux kernel development work - see feature branches</subtitle>
<id>https://git.zx2c4.com/linux-dev/atom/ipc?h=linus%2Fmaster</id>
<link rel='self' href='https://git.zx2c4.com/linux-dev/atom/ipc?h=linus%2Fmaster'/>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/'/>
<updated>2022-06-03T22:54:57Z</updated>
<entry>
<title>Merge tag 'per-namespace-ipc-sysctls-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace</title>
<updated>2022-06-03T22:54:57Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-06-03T22:54:57Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=1888e9b4bb78c88514b24ecafa9e4e4faf761747'/>
<id>urn:sha1:1888e9b4bb78c88514b24ecafa9e4e4faf761747</id>
<content type='text'>
Pull ipc sysctl namespace updates from Eric Biederman:
 "This updates the ipc sysctls so that they are fundamentally per ipc
  namespace. Previously these sysctls depended upon a hack to simulate
  being per ipc namespace by looking up the ipc namespace in read or
  write. With this set of changes the ipc sysctls are registered per ipc
  namespace and open looks up the ipc namespace.

  Not only does this series of changes ensure the traditional binding at
  open time happens, but it sets a foundation for being able to relax
  the permission checks to allow a user namespace root to change the ipc
  sysctls for an ipc namespace that the user namespace root requires. To
  do this requires the ipc namespace to be known at open time"

* tag 'per-namespace-ipc-sysctls-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  ipc: Remove extra braces
  ipc: Check permissions for checkpoint_restart sysctls at open time
  ipc: Remove extra1 field abuse to pass ipc namespace
  ipc: Use the same namespace to modify and validate
  ipc: Store ipc sysctls in the ipc namespace
  ipc: Store mqueue sysctls in the ipc namespace
</content>
</entry>
<entry>
<title>ipc/mqueue: use get_tree_nodev() in mqueue_get_tree()</title>
<updated>2022-05-10T01:29:21Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2022-05-10T01:29:21Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=d60c4d01a98bc1942dba6e3adc02031f5519f94b'/>
<id>urn:sha1:d60c4d01a98bc1942dba6e3adc02031f5519f94b</id>
<content type='text'>
When running the stress-ng clone benchmark with multiple testing threads,
it was found that there was significant spinlock contention in sget_fc().
The contended spinlock was the sb_lock.  It is under heavy contention
because of the following code in the critical section of sget_fc():

  hlist_for_each_entry(old, &amp;fc-&gt;fs_type-&gt;fs_supers, s_instances) {
      if (test(old, fc))
          goto share_extant_sb;
  }

After testing with added instrumentation code, it was found that the
benchmark could generate thousands of ipc namespaces with the
corresponding number of entries in the mqueue's fs_supers list where the
namespaces are the key for the search.  This leads to excessive time in
scanning the list for a match.

Looking back at the mqueue calling sequence leading to sget_fc():

  mq_init_ns()
  =&gt; mq_create_mount()
  =&gt; fc_mount()
  =&gt; vfs_get_tree()
  =&gt; mqueue_get_tree()
  =&gt; get_tree_keyed()
  =&gt; vfs_get_super()
  =&gt; sget_fc()

Currently, mq_init_ns() is the only mqueue function that will indirectly
call mqueue_get_tree() with a newly allocated ipc namespace as the key for
searching.  As a result, there will never be a match with the existing ipc
namespaces stored in the mqueue's fs_supers list.

So using get_tree_keyed() to do an existing ipc namespace search is just a
waste of time.  Instead, we could use get_tree_nodev() to eliminate the
useless search.  By doing so, we can greatly reduce the sb_lock hold time
and avoid the spinlock contention problem in case a large number of ipc
namespaces are present.

Of course, if the code is modified in the future to allow
mqueue_get_tree() to be called with an existing ipc namespace instead of a
new one, we will have to use get_tree_keyed() in this case.
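
The cost difference described above can be illustrated with a toy model.
This is only a sketch: the Python functions below are hypothetical
stand-ins that count key comparisons, not the kernel's get_tree_keyed()
or get_tree_nodev(), and they assume every mount uses a brand-new key,
as mq_init_ns() does.

```python
# Toy model only: these names are hypothetical stand-ins, not kernel APIs.

def get_tree_keyed_model(fs_supers, key):
    """Scan every existing superblock for a matching key before adding one."""
    comparisons = 0
    for existing in fs_supers:
        comparisons += 1
        if existing == key:
            return comparisons  # a match would share the existing superblock
    fs_supers.append(key)
    return comparisons


def get_tree_nodev_model(fs_supers, key):
    """Always create a new superblock; the list is never searched."""
    fs_supers.append(key)
    return 0


def total_comparisons(mount, namespaces):
    """Mount once per new namespace and add up the comparisons performed."""
    fs_supers = []
    return sum(mount(fs_supers, ns) for ns in range(namespaces))


keyed_total = total_comparisons(get_tree_keyed_model, 1000)
nodev_total = total_comparisons(get_tree_nodev_model, 1000)
print(keyed_total, nodev_total)
```

With keys that never match, the keyed model's work grows quadratically
with the number of namespaces, while the nodev model does no searching.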

The following stress-ng clone benchmark command was run on a 2-socket
48-core Intel system:

./stress-ng --clone 32 --verbose --oomable --metrics-brief -t 20

The "bogo ops/s" increased from 5948.45 before the patch to 9137.06 after
the patch. This is a performance increase of about 54%.
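
As a quick check of the figure above, the relative increase can be
recomputed from the two reported rates. The variable names are
illustrative only.

```python
before = 5948.45  # bogo ops/s before the patch, as reported above
after = 9137.06   # bogo ops/s after the patch, as reported above

# Relative improvement, expressed as a percentage of the original rate.
increase_percent = (after - before) / before * 100
print(round(increase_percent))
```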

Link: https://lkml.kernel.org/r/20220121172315.19652-1-longman@redhat.com
Fixes: 935c6912b198 ("ipc: Convert mqueue fs to fs_context")
Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc: update semtimedop() to use hrtimer</title>
<updated>2022-05-10T01:29:20Z</updated>
<author>
<name>Prakash Sangappa</name>
<email>prakash.sangappa@oracle.com</email>
</author>
<published>2022-05-10T01:29:20Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=49c9dd0df65d547a58642d2f717eeb560e1db140'/>
<id>urn:sha1:49c9dd0df65d547a58642d2f717eeb560e1db140</id>
<content type='text'>
semtimedop() should be converted to use hrtimer, as has been done for
most of the system calls with timeouts.  This system call already takes a
struct timespec as an argument and can therefore provide finer granularity
timed wait.

Link: https://lkml.kernel.org/r/1651187881-2858-1-git-send-email-prakash.sangappa@oracle.com
Signed-off-by: Prakash Sangappa &lt;prakash.sangappa@oracle.com&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Reviewed-by: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc/sem: remove redundant assignments</title>
<updated>2022-05-10T01:29:20Z</updated>
<author>
<name>Michal Orzel</name>
<email>michalorzel.eng@gmail.com</email>
</author>
<published>2022-05-10T01:29:20Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=0e900029655327bb5326ced02eff97667a079039'/>
<id>urn:sha1:0e900029655327bb5326ced02eff97667a079039</id>
<content type='text'>
Get rid of redundant assignments which end up in values not being
read either because they are overwritten or the function ends.

Reported by clang-tidy [deadcode.DeadStores]

Link: https://lkml.kernel.org/r/20220409101933.207157-1-michalorzel.eng@gmail.com
Signed-off-by: Michal Orzel &lt;michalorzel.eng@gmail.com&gt;
Reviewed-by: Tom Rix &lt;trix@redhat.com&gt;
Reviewed-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Cc: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc: Remove extra braces</title>
<updated>2022-05-03T22:25:58Z</updated>
<author>
<name>Alexey Gladkov</name>
<email>legion@kernel.org</email>
</author>
<published>2022-05-03T13:39:57Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=38cd5b12b7854941ede1954cf5a2393eb94b5d37'/>
<id>urn:sha1:38cd5b12b7854941ede1954cf5a2393eb94b5d37</id>
<content type='text'>
Fix coding style. In the previous commit, I added braces because,
in addition to changing .data, .extra1 also changed. Now this is not
needed.

Fixes: 1f5c135ee509 ("ipc: Store ipc sysctls in the ipc namespace")
Signed-off-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Link: https://lkml.kernel.org/r/37687827f630bc150210f5b8abeeb00f1336814e.1651584847.git.legion@kernel.org
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>ipc: Check permissions for checkpoint_restart sysctls at open time</title>
<updated>2022-05-03T22:25:58Z</updated>
<author>
<name>Alexey Gladkov</name>
<email>legion@kernel.org</email>
</author>
<published>2022-05-03T13:39:56Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=0889f44e281034e180daa6daf3e2d57c012452d4'/>
<id>urn:sha1:0889f44e281034e180daa6daf3e2d57c012452d4</id>
<content type='text'>
As Eric Biederman pointed out, it is possible not to use a custom
proc_handler and check permissions for every write, but to use a
.permission handler. That will allow the checkpoint_restart sysctls to
perform all of their permission checks at open time, and not need any
other special code.

Link: https://lore.kernel.org/lkml/87czib9g38.fsf@email.froward.int.ebiederm.org/
Fixes: 1f5c135ee509 ("ipc: Store ipc sysctls in the ipc namespace")
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Link: https://lkml.kernel.org/r/65fa8459803830608da4610a39f33c76aa933eb9.1651584847.git.legion@kernel.org
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>ipc: Remove extra1 field abuse to pass ipc namespace</title>
<updated>2022-05-03T22:25:58Z</updated>
<author>
<name>Alexey Gladkov</name>
<email>legion@kernel.org</email>
</author>
<published>2022-05-03T13:39:55Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=dd141a4955d5ebbb3f4c7996796e86a3ac9ed57f'/>
<id>urn:sha1:dd141a4955d5ebbb3f4c7996796e86a3ac9ed57f</id>
<content type='text'>
Eric Biederman pointed out that using .extra1 to pass ipc namespace
looks like an ugly hack and there is a better solution. We can get the
ipc_namespace using the .data field.

Link: https://lore.kernel.org/lkml/87czib9g38.fsf@email.froward.int.ebiederm.org/
Fixes: 1f5c135ee509 ("ipc: Store ipc sysctls in the ipc namespace")
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Link: https://lkml.kernel.org/r/93df64a8fe93ba20ebbe1d9f8eda484b2f325426.1651584847.git.legion@kernel.org
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>ipc: Use the same namespace to modify and validate</title>
<updated>2022-05-03T22:25:58Z</updated>
<author>
<name>Alexey Gladkov</name>
<email>legion@kernel.org</email>
</author>
<published>2022-05-03T13:39:54Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=def7343ff03bbb36ce7a34dcb19cab599f0da446'/>
<id>urn:sha1:def7343ff03bbb36ce7a34dcb19cab599f0da446</id>
<content type='text'>
In commit 1f5c135ee509 ("ipc: Store ipc sysctls in the ipc namespace") I
missed that in addition to the modification of sem_ctls[3], the change
is validated. This validation must occur in the same namespace.

Link: https://lore.kernel.org/lkml/875ymnvryb.fsf@email.froward.int.ebiederm.org/
Fixes: 1f5c135ee509 ("ipc: Store ipc sysctls in the ipc namespace")
Signed-off-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Link: https://lkml.kernel.org/r/b3cb9a25cce6becbef77186bc1216071a08a969b.1651584847.git.legion@kernel.org
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>fs: allocate inode by using alloc_inode_sb()</title>
<updated>2022-03-22T22:57:03Z</updated>
<author>
<name>Muchun Song</name>
<email>songmuchun@bytedance.com</email>
</author>
<published>2022-03-22T21:41:03Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=fd60b28842df833477c42da6a6d63d0d114a5fcc'/>
<id>urn:sha1:fd60b28842df833477c42da6a6d63d0d114a5fcc</id>
<content type='text'>
The inode allocation is supposed to use alloc_inode_sb(), so convert
kmem_cache_alloc() of all filesystems to alloc_inode_sb().

Link: https://lkml.kernel.org/r/20220228122126.37293-5-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Acked-by: Theodore Ts'o &lt;tytso@mit.edu&gt;		[ext4]
Acked-by: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Alex Shi &lt;alexs@kernel.org&gt;
Cc: Anna Schumaker &lt;Anna.Schumaker@Netapp.com&gt;
Cc: Chao Yu &lt;chao@kernel.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Fam Zheng &lt;fam.zheng@bytedance.com&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Kari Argillander &lt;kari.argillander@gmail.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Cc: Vladimir Davydov &lt;vdavydov.dev@gmail.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Cc: Xiongchun Duan &lt;duanxiongchun@bytedance.com&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc: Store ipc sysctls in the ipc namespace</title>
<updated>2022-03-08T19:39:40Z</updated>
<author>
<name>Alexey Gladkov</name>
<email>legion@kernel.org</email>
</author>
<published>2022-02-14T18:18:15Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=1f5c135ee509e89e0cc274333a65f73c62cb16e5'/>
<id>urn:sha1:1f5c135ee509e89e0cc274333a65f73c62cb16e5</id>
<content type='text'>
The ipc sysctls are not available for modification inside the user
namespace. Following the mqueue sysctls, we changed the implementation
to be more userns friendly.

So far, the changes do not provide additional access to files. This
will be done in a future patch.

Signed-off-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Link: https://lkml.kernel.org/r/be6f9d014276f4dddd0c3aa05a86052856c1c555.1644862280.git.legion@kernel.org
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
</feed>
