<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-dev/net/9p, branch master</title>
<subtitle>Linux kernel development work - see feature branches</subtitle>
<id>https://git.zx2c4.com/linux-dev/atom/net/9p?h=master</id>
<link rel='self' href='https://git.zx2c4.com/linux-dev/atom/net/9p?h=master'/>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/'/>
<updated>2022-10-07T12:23:09Z</updated>
<entry>
<title>net/9p: clarify trans_fd parse_opt failure handling</title>
<updated>2022-10-07T12:23:09Z</updated>
<author>
<name>Li Zhong</name>
<email>floridsleeves@gmail.com</email>
</author>
<published>2022-09-21T21:09:21Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=a8e633c604476e24d26a636582c0f5bdb421e70d'/>
<id>urn:sha1:a8e633c604476e24d26a636582c0f5bdb421e70d</id>
<content type='text'>
This parse_opts will set invalid opts.rfd/wfd in case of failure which
we already check, but it is not clear for readers that parse_opts error
are handled in p9_fd_create: clarify this by explicitely checking the
return value.

Link: https://lkml.kernel.org/r/20220921210921.1654735-1-floridsleeves@gmail.com
Signed-off-by: Li Zhong &lt;floridsleeves@gmail.com&gt;
[Dominique: reworded commit message to clarify this is NOOP]
Signed-off-by: Dominique Martinet &lt;asmadeus@codewreck.org&gt;
</content>
</entry>
<entry>
<title>net/9p: add __init/__exit annotations to module init/exit funcs</title>
<updated>2022-10-07T12:23:09Z</updated>
<author>
<name>Xiu Jianfeng</name>
<email>xiujianfeng@huawei.com</email>
</author>
<published>2022-09-09T10:35:46Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=0664c63af16dceb4b40a9825e738136a2dac0260'/>
<id>urn:sha1:0664c63af16dceb4b40a9825e738136a2dac0260</id>
<content type='text'>
xen transport was missing annotations

Link: https://lkml.kernel.org/r/20220909103546.73015-1-xiujianfeng@huawei.com
Signed-off-by: Xiu Jianfeng &lt;xiujianfeng@huawei.com&gt;
Signed-off-by: Dominique Martinet &lt;asmadeus@codewreck.org&gt;
</content>
</entry>
<entry>
<title>net/9p: use a dedicated spinlock for trans_fd</title>
<updated>2022-10-07T12:22:48Z</updated>
<author>
<name>Dominique Martinet</name>
<email>asmadeus@codewreck.org</email>
</author>
<published>2022-09-04T11:17:49Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=296ab4a813841ba1d5f40b03190fd1bd8f25aab0'/>
<id>urn:sha1:296ab4a813841ba1d5f40b03190fd1bd8f25aab0</id>
<content type='text'>
Shamelessly copying the explanation from Tetsuo Handa's suggested
patch[1] (slightly reworded):
syzbot is reporting inconsistent lock state in p9_req_put()[2],
for p9_tag_remove() from p9_req_put() from IRQ context is using
spin_lock_irqsave() on "struct p9_client"-&gt;lock but trans_fd
(not from IRQ context) is using spin_lock().

Since the locks actually protect different things in client.c and in
trans_fd.c, just replace trans_fd.c's lock by a new one specific to the
transport (client.c's protect the idr for fid/tag allocations,
while trans_fd.c's protects its own req list and request status field
that acts as the transport's state machine)

Link: https://lore.kernel.org/r/20220904112928.1308799-1-asmadeus@codewreck.org
Link: https://lkml.kernel.org/r/2470e028-9b05-2013-7198-1fdad071d999@I-love.SAKURA.ne.jp [1]
Link: https://syzkaller.appspot.com/bug?extid=2f20b523930c32c160cc [2]
Reported-by: syzbot &lt;syzbot+2f20b523930c32c160cc@syzkaller.appspotmail.com&gt;
Reported-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Reviewed-by: Christian Schoenebeck &lt;linux_oss@crudebyte.com&gt;
Signed-off-by: Dominique Martinet &lt;asmadeus@codewreck.org&gt;
</content>
</entry>
<entry>
<title>9p/trans_fd: always use O_NONBLOCK read/write</title>
<updated>2022-10-07T00:59:36Z</updated>
<author>
<name>Tetsuo Handa</name>
<email>penguin-kernel@I-love.SAKURA.ne.jp</email>
</author>
<published>2022-08-26T15:27:46Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=ef575281b21e9a34dfae544a187c6aac2ae424a9'/>
<id>urn:sha1:ef575281b21e9a34dfae544a187c6aac2ae424a9</id>
<content type='text'>
syzbot is reporting hung task at p9_fd_close() [1], for p9_mux_poll_stop()
 from p9_conn_destroy() from p9_fd_close() is failing to interrupt already
started kernel_read() from p9_fd_read() from p9_read_work() and/or
kernel_write() from p9_fd_write() from p9_write_work() requests.

Since p9_socket_open() sets O_NONBLOCK flag, p9_mux_poll_stop() does not
need to interrupt kernel_read()/kernel_write(). However, since p9_fd_open()
does not set O_NONBLOCK flag, but pipe blocks unless signal is pending,
p9_mux_poll_stop() needs to interrupt kernel_read()/kernel_write() when
the file descriptor refers to a pipe. In other words, pipe file descriptor
needs to be handled as if socket file descriptor.

We somehow need to interrupt kernel_read()/kernel_write() on pipes.

A minimal change, which this patch is doing, is to set O_NONBLOCK flag
 from p9_fd_open(), for O_NONBLOCK flag does not affect reading/writing
of regular files. But this approach changes O_NONBLOCK flag on userspace-
supplied file descriptors (which might break userspace programs), and
O_NONBLOCK flag could be changed by userspace. It would be possible to set
O_NONBLOCK flag every time p9_fd_read()/p9_fd_write() is invoked, but still
remains small race window for clearing O_NONBLOCK flag.

If we don't want to manipulate O_NONBLOCK flag, we might be able to
surround kernel_read()/kernel_write() with set_thread_flag(TIF_SIGPENDING)
and recalc_sigpending(). Since p9_read_work()/p9_write_work() works are
processed by kernel threads which process global system_wq workqueue,
signals could not be delivered from remote threads when p9_mux_poll_stop()
 from p9_conn_destroy() from p9_fd_close() is called. Therefore, calling
set_thread_flag(TIF_SIGPENDING)/recalc_sigpending() every time would be
needed if we count on signals for making kernel_read()/kernel_write()
non-blocking.

Link: https://lkml.kernel.org/r/345de429-a88b-7097-d177-adecf9fed342@I-love.SAKURA.ne.jp
Link: https://syzkaller.appspot.com/bug?extid=8b41a1365f1106fd0f33 [1]
Reported-by: syzbot &lt;syzbot+8b41a1365f1106fd0f33@syzkaller.appspotmail.com&gt;
Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Tested-by: syzbot &lt;syzbot+8b41a1365f1106fd0f33@syzkaller.appspotmail.com&gt;
Reviewed-by: Christian Schoenebeck &lt;linux_oss@crudebyte.com&gt;
[Dominique: add comment at Christian's suggestion]
Signed-off-by: Dominique Martinet &lt;asmadeus@codewreck.org&gt;
</content>
</entry>
<entry>
<title>net/9p: allocate appropriate reduced message buffers</title>
<updated>2022-10-04T22:05:41Z</updated>
<author>
<name>Christian Schoenebeck</name>
<email>linux_oss@crudebyte.com</email>
</author>
<published>2022-07-15T21:33:56Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=60ece0833b6c2bc1465eb2803fec20b670e2ee93'/>
<id>urn:sha1:60ece0833b6c2bc1465eb2803fec20b670e2ee93</id>
<content type='text'>
So far 'msize' was simply used for all 9p message types, which is far
too much and slowed down performance tremendously with large values
for user configurable 'msize' option.

Let's stop this waste by using the new p9_msg_buf_size() function for
allocating more appropriate, smaller buffers according to what is
actually sent over the wire.

Only exception: RDMA transport is currently excluded from this message
size optimization - for its response buffers that is - as RDMA transport
would not cope with it, due to its response buffers being pulled from a
shared pool. [1]

Link: https://lore.kernel.org/all/Ys3jjg52EIyITPua@codewreck.org/ [1]
Link: https://lkml.kernel.org/r/3f51590535dc96ed0a165b8218c57639cfa5c36c.1657920926.git.linux_oss@crudebyte.com
Signed-off-by: Christian Schoenebeck &lt;linux_oss@crudebyte.com&gt;
Signed-off-by: Dominique Martinet &lt;asmadeus@codewreck.org&gt;
</content>
</entry>
<entry>
<title>net/9p: add 'pooled_rbuffers' flag to struct p9_trans_module</title>
<updated>2022-10-04T22:05:41Z</updated>
<author>
<name>Christian Schoenebeck</name>
<email>linux_oss@crudebyte.com</email>
</author>
<published>2022-07-15T21:33:09Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=01d205d936ae18532e14814808592b926aacc6d5'/>
<id>urn:sha1:01d205d936ae18532e14814808592b926aacc6d5</id>
<content type='text'>
This is a preparatory change for the subsequent patch: the RDMA
transport pulls the buffers for its 9p response messages from a
shared pool. [1] So this case has to be considered when choosing
an appropriate response message size in the subsequent patch.

Link: https://lore.kernel.org/all/Ys3jjg52EIyITPua@codewreck.org/ [1]
Link: https://lkml.kernel.org/r/79d24310226bc4eb037892b5c097ec4ad4819a03.1657920926.git.linux_oss@crudebyte.com
Signed-off-by: Christian Schoenebeck &lt;linux_oss@crudebyte.com&gt;
Signed-off-by: Dominique Martinet &lt;asmadeus@codewreck.org&gt;
</content>
</entry>
<entry>
<title>net/9p: add p9_msg_buf_size()</title>
<updated>2022-10-04T22:05:41Z</updated>
<author>
<name>Christian Schoenebeck</name>
<email>linux_oss@crudebyte.com</email>
</author>
<published>2022-07-15T21:32:34Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=1effdbf94a728b74b23a24ce7b6f1d1d9a2480a4'/>
<id>urn:sha1:1effdbf94a728b74b23a24ce7b6f1d1d9a2480a4</id>
<content type='text'>
This new function calculates a buffer size suitable for holding the
intended 9p request or response. For rather small message types (which
applies to almost all 9p message types actually) simply use hard coded
values. For some variable-length and potentially large message types
calculate a more precise value according to what data is actually
transmitted to avoid unnecessarily huge buffers.

So p9_msg_buf_size() divides the individual 9p message types into 3
message size categories:

  - dynamically calculated message size (i.e. potentially large)
  - 8k hard coded message size
  - 4k hard coded message size

As for the latter two hard coded message types: for most 9p message
types it is pretty obvious whether they would always fit into 4k or
8k. But for some of them it depends on the maximum directory entry
name length allowed by OS and filesystem for determining into which
of the two size categories they would fit into. Currently Linux
supports directory entry names up to NAME_MAX (255), however when
comparing the limitation of individual filesystems, ReiserFS
theoretically supports up to slightly below 4k long names. So in
order to make this code more future proof, and as revisiting it
later on is a bit tedious and has the potential to miss out details,
the decision [1] was made to take 4k as basis as for max. name length.

Link: https://lkml.kernel.org/r/bd6be891cf67e867688e8c8796d06408bfafa0d9.1657920926.git.linux_oss@crudebyte.com
Link: https://lore.kernel.org/all/5564296.oo812IJUPE@silver/ [1]
Signed-off-by: Christian Schoenebeck &lt;linux_oss@crudebyte.com&gt;
Signed-off-by: Dominique Martinet &lt;asmadeus@codewreck.org&gt;
</content>
</entry>
<entry>
<title>net/9p: split message size argument into 't_size' and 'r_size' pair</title>
<updated>2022-10-04T22:05:41Z</updated>
<author>
<name>Christian Schoenebeck</name>
<email>linux_oss@crudebyte.com</email>
</author>
<published>2022-07-15T21:32:28Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=e7c6219778e46143ee9e68a25febac10a66383ae'/>
<id>urn:sha1:e7c6219778e46143ee9e68a25febac10a66383ae</id>
<content type='text'>
Refactor 'max_size' argument of p9_tag_alloc() and 'req_size' argument
of p9_client_prepare_req() both into a pair of arguments 't_size' and
'r_size' respectively to allow handling the buffer size for request and
reply separately from each other.

Link: https://lkml.kernel.org/r/9431a25fe4b37fd12cecbd715c13af71f701f220.1657920926.git.linux_oss@crudebyte.com
Signed-off-by: Christian Schoenebeck &lt;linux_oss@crudebyte.com&gt;
Signed-off-by: Dominique Martinet &lt;asmadeus@codewreck.org&gt;
</content>
</entry>
<entry>
<title>9p: trans_fd/p9_conn_cancel: drop client lock earlier</title>
<updated>2022-10-04T22:05:40Z</updated>
<author>
<name>Dominique Martinet</name>
<email>asmadeus@codewreck.org</email>
</author>
<published>2022-08-17T05:58:44Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=52f1c45dde9136f964d63a77d19826c8a74e2c7f'/>
<id>urn:sha1:52f1c45dde9136f964d63a77d19826c8a74e2c7f</id>
<content type='text'>
syzbot reported a double-lock here and we no longer need this
lock after requests have been moved off to local list:
just drop the lock earlier.

Link: https://lkml.kernel.org/r/20220904064028.1305220-1-asmadeus@codewreck.org
Reported-by: syzbot+50f7e8d06c3768dd97f3@syzkaller.appspotmail.com
Signed-off-by: Dominique Martinet &lt;asmadeus@codewreck.org&gt;
Tested-by: Schspa Shi &lt;schspa@gmail.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2022-08-09T03:04:35Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-08-09T03:04:35Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=f30adc0d332fdfe5315cb98bd6a7ff0d5cf2aa38'/>
<id>urn:sha1:f30adc0d332fdfe5315cb98bd6a7ff0d5cf2aa38</id>
<content type='text'>
Pull more iov_iter updates from Al Viro:

 - more new_sync_{read,write}() speedups - ITER_UBUF introduction

 - ITER_PIPE cleanups

 - unification of iov_iter_get_pages/iov_iter_get_pages_alloc and
   switching them to advancing semantics

 - making ITER_PIPE take high-order pages without splitting them

 - handling copy_page_from_iter() for high-order pages properly

* tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (32 commits)
  fix copy_page_from_iter() for compound destinations
  hugetlbfs: copy_page_to_iter() can deal with compound pages
  copy_page_to_iter(): don't split high-order page in case of ITER_PIPE
  expand those iov_iter_advance()...
  pipe_get_pages(): switch to append_pipe()
  get rid of non-advancing variants
  ceph: switch the last caller of iov_iter_get_pages_alloc()
  9p: convert to advancing variant of iov_iter_get_pages_alloc()
  af_alg_make_sg(): switch to advancing variant of iov_iter_get_pages()
  iter_to_pipe(): switch to advancing variant of iov_iter_get_pages()
  block: convert to advancing variants of iov_iter_get_pages{,_alloc}()
  iov_iter: advancing variants of iov_iter_get_pages{,_alloc}()
  iov_iter: saner helper for page array allocation
  fold __pipe_get_pages() into pipe_get_pages()
  ITER_XARRAY: don't open-code DIV_ROUND_UP()
  unify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts
  unify xarray_get_pages() and xarray_get_pages_alloc()
  unify pipe_get_pages() and pipe_get_pages_alloc()
  iov_iter_get_pages(): sanity-check arguments
  iov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper
  ...
</content>
</entry>
</feed>
