<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-dev/io_uring, branch master</title>
<subtitle>Linux kernel development work - see feature branches</subtitle>
<id>https://git.zx2c4.com/linux-dev/atom/io_uring?h=master</id>
<link rel='self' href='https://git.zx2c4.com/linux-dev/atom/io_uring?h=master'/>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/'/>
<updated>2022-11-11T16:59:27Z</updated>
<entry>
<title>io_uring/poll: lockdep-annotate io_poll_req_insert_locked</title>
<updated>2022-11-11T16:59:27Z</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2022-11-11T16:51:30Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=5576035f15dfcc6cb1cec236db40c2c0733b0ba4'/>
<id>urn:sha1:5576035f15dfcc6cb1cec236db40c2c0733b0ba4</id>
<content type='text'>
Add a lockdep annotation in io_poll_req_insert_locked().

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/8115d8e702733754d0aea119e9b5bb63d1eb8b24.1668184658.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/poll: fix double poll req-&gt;flags races</title>
<updated>2022-11-11T16:59:27Z</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2022-11-11T16:51:29Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=30a33669fa21cd3dc7d92a00ba736358059014b7'/>
<id>urn:sha1:30a33669fa21cd3dc7d92a00ba736358059014b7</id>
<content type='text'>
io_poll_double_prepare()            | io_poll_wake()
                                    | poll-&gt;head = NULL
smp_load(&amp;poll-&gt;head); /* NULL */   |
flags = req-&gt;flags;                 |
                                    | req-&gt;flags &amp;= ~SINGLE_POLL;
req-&gt;flags = flags | DOUBLE_POLL    |

The idea behind io_poll_double_prepare() is to serialise with the
first poll entry by taking the wq lock. However, when we fail to grab
that lock it is not safe to assume that io_poll_wake() is not running,
so we may race with it when modifying req-&gt;flags.

Skip double poll setup if that happens. It's ok because the first poll
entry will only be removed when it's definitely completing, e.g.
pollfree or oneshot with a valid mask.

Fixes: 49f1c68e048f1 ("io_uring: optimise submission side poll_refs")
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/b7fab2d502f6121a7d7b199fe4d914a43ca9cdfd.1668184658.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring: check for rollover of buffer ID when providing buffers</title>
<updated>2022-11-10T18:07:41Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2022-11-10T17:50:55Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=3851d25c75ed03117268a8feb34adca5a843a126'/>
<id>urn:sha1:3851d25c75ed03117268a8feb34adca5a843a126</id>
<content type='text'>
We already check that the chosen starting offset for the buffer IDs
fits within an unsigned short, as 65535 is the maximum value for a
provided buffer. But if the caller asks to add N buffers at offset M,
and M + N would exceed the range of the unsigned short, we simply add
the buffers with the ID wrapping around.

This is not necessarily a bug and could in fact be a valid use case, but
it seems confusing and inconsistent with the initial check on the
starting offset. Let's check for the wrap consistently, and fail the
addition if we would need to wrap.
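A minimal sketch of the consistent wrap check described above (a Python
stand-in for the kernel C; the names and the error value are illustrative,
not the kernel's):

```python
USHRT_MAX = 65535  # provided buffer IDs must fit in an unsigned short

def check_add_buffers(bgid, nbufs):
    """Reject an addition whose last buffer ID would wrap past USHRT_MAX."""
    # previously the IDs would silently wrap; now the addition errors out
    if bgid + nbufs - 1 > USHRT_MAX:
        return -1  # stands in for the kernel returning an error code
    return 0
```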

Reported-by: Olivier Langlois &lt;olivier@trillion01.com&gt;
Link: https://github.com/axboe/liburing/issues/726
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring: calculate CQEs from the user visible value</title>
<updated>2022-11-08T17:36:15Z</updated>
<author>
<name>Dylan Yudaken</name>
<email>dylany@meta.com</email>
</author>
<published>2022-11-08T15:30:16Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=0fc8c2acbfc789a977a50a4a9812a8e4b37958ce'/>
<id>urn:sha1:0fc8c2acbfc789a977a50a4a9812a8e4b37958ce</id>
<content type='text'>
io_cqring_wait() (and its wake function io_has_work()) used cached_cq_tail
in order to calculate the number of CQEs. cached_cq_tail is set strictly
before the user visible rings-&gt;cq.tail.

However, as far as userspace is concerned, if io_uring_enter(2) is called
with a minimum number of events, it will verify the count by checking
rings-&gt;cq.tail.

It is therefore possible for io_uring_enter(2) to return early with fewer
events visible to the user.

Instead, make the wait functions read from the user visible value, so there
will be no discrepancy.
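The counting itself is plain ring arithmetic; a sketch (illustrative, not
the kernel's code) of deriving the ready count from a tail and head index:

```python
MASK = 2**32  # ring indices are u32 and may wrap around

def cqes_ready(cq_tail, cq_head):
    # unsigned 32-bit subtraction handles tail wraparound; the fix is
    # simply to feed the user visible tail into this, not the cached one
    return (cq_tail - cq_head) % MASK
```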

This can eventually be triggered by the following liburing reproducer
(made self-contained here; N is an arbitrary batch size):

#include &lt;assert.h&gt;
#include &lt;stdbool.h&gt;
#include &lt;liburing.h&gt;

#define N 8 /* any submission batch size */

int main(void)
{
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;
	unsigned int cqe_ready;
	struct io_uring ring;
	int ret, i;

	ret = io_uring_queue_init(N, &amp;ring, 0);
	assert(!ret);
	while (true) {
		for (i = 0; i &lt; N; i++) {
			sqe = io_uring_get_sqe(&amp;ring);
			io_uring_prep_nop(sqe);
			sqe-&gt;flags |= IOSQE_ASYNC;
		}
		ret = io_uring_submit(&amp;ring);
		assert(ret == N);

		do {
			ret = io_uring_wait_cqes(&amp;ring, &amp;cqe, N, NULL, NULL);
		} while (ret == -EINTR);
		cqe_ready = io_uring_cq_ready(&amp;ring);
		assert(!ret);
		assert(cqe_ready == N);
		io_uring_cq_advance(&amp;ring, N);
	}
}

Fixes: ad3eb2c89fb2 ("io_uring: split overflow state into SQ and CQ side")
Signed-off-by: Dylan Yudaken &lt;dylany@meta.com&gt;
Link: https://lore.kernel.org/r/20221108153016.1854297-1-dylany@meta.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring: unlock if __io_run_local_work locked inside</title>
<updated>2022-10-27T15:52:12Z</updated>
<author>
<name>Dylan Yudaken</name>
<email>dylany@meta.com</email>
</author>
<published>2022-10-27T14:44:29Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=b3026767e15b488860d4bbf1649d69612bab2c25'/>
<id>urn:sha1:b3026767e15b488860d4bbf1649d69612bab2c25</id>
<content type='text'>
It is possible for task work (tw) to lock the ring, and this was not
propagated out to io_run_local_work(). This can cause an unlock to be
missed.

Instead pass a pointer to locked into __io_run_local_work.
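A toy model of the pattern (a Python stand-in for the C code; names are
illustrative): the inner helper reports the lock state back so the caller
can release a lock that task work took on its behalf.

```python
class Ring:
    def __init__(self):
        self.locked = False

def run_task_work(ring, locked):
    # a task-work callback may take the ring lock itself
    if not locked:
        ring.locked = True
        locked = True
    return locked  # report the new state instead of discarding it

def io_run_local_work(ring):
    locked = run_task_work(ring, locked=False)
    if locked:
        ring.locked = False  # the unlock that was previously missed
```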

Fixes: 8ac5d85a89b4 ("io_uring: add local task_work run helper that is entered locked")
Signed-off-by: Dylan Yudaken &lt;dylany@meta.com&gt;
Link: https://lore.kernel.org/r/20221027144429.3971400-3-dylany@meta.com
[axboe: WARN_ON() -&gt; WARN_ON_ONCE() and add a minor comment]
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring: use io_run_local_work_locked helper</title>
<updated>2022-10-27T15:51:47Z</updated>
<author>
<name>Dylan Yudaken</name>
<email>dylany@meta.com</email>
</author>
<published>2022-10-27T14:44:28Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=8de11cdc96bf58b324c59a28512eb9513fd02553'/>
<id>urn:sha1:8de11cdc96bf58b324c59a28512eb9513fd02553</id>
<content type='text'>
Prefer the io_run_local_work_locked() helper for consistency.

Signed-off-by: Dylan Yudaken &lt;dylany@meta.com&gt;
Link: https://lore.kernel.org/r/20221027144429.3971400-2-dylany@meta.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/net: fail zc sendmsg when unsupported by socket</title>
<updated>2022-10-22T14:43:03Z</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2022-10-21T10:16:41Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=cc767e7c6913f770741d9fad1efa4957c2623744'/>
<id>urn:sha1:cc767e7c6913f770741d9fad1efa4957c2623744</id>
<content type='text'>
The previous patch fails zerocopy send requests for protocols that don't
support it; do the same for zerocopy sendmsg.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/0854e7bb4c3d810a48ec8b5853e2f61af36a0467.1666346426.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/net: fail zc send when unsupported by socket</title>
<updated>2022-10-22T14:43:03Z</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2022-10-21T10:16:40Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=edf81438799ccead7122948446d7e44b083e788d'/>
<id>urn:sha1:edf81438799ccead7122948446d7e44b083e788d</id>
<content type='text'>
If a protocol doesn't support zerocopy, it will silently fall back to
copying. This type of behaviour has always been a source of trouble,
so it's better to fail such requests instead.
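A sketch of the resulting behaviour (illustrative only; the real check
lives in the kernel's networking code, and the names here are not its):

```python
EOPNOTSUPP = 95  # errno value for "operation not supported"

def prep_send_zc(sock_supports_zerocopy):
    # fail the request rather than silently falling back to copying
    if not sock_supports_zerocopy:
        return -EOPNOTSUPP
    return 0
```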

Cc: &lt;stable@vger.kernel.org&gt; # 6.0
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/2db3c7f16bb6efab4b04569cd16e6242b40c5cb3.1666346426.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io-wq: Fix memory leak in worker creation</title>
<updated>2022-10-20T12:48:59Z</updated>
<author>
<name>Rafael Mendonca</name>
<email>rafaelmendsr@gmail.com</email>
</author>
<published>2022-10-20T01:47:09Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=996d3efeb091c503afd3ee6b5e20eabf446fd955'/>
<id>urn:sha1:996d3efeb091c503afd3ee6b5e20eabf446fd955</id>
<content type='text'>
If the CPU mask allocation for a node fails, then the memory allocated for
the 'io_wqe' struct of the current node doesn't get freed on the error
handling path, since it has not yet been added to the 'wqes' array.

This was spotted when fuzzing v6.1-rc1 with Syzkaller:
BUG: memory leak
unreferenced object 0xffff8880093d5000 (size 1024):
  comm "syz-executor.2", pid 7701, jiffies 4295048595 (age 13.900s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [&lt;00000000cb463369&gt;] __kmem_cache_alloc_node+0x18e/0x720
    [&lt;00000000147a3f9c&gt;] kmalloc_node_trace+0x2a/0x130
    [&lt;000000004e107011&gt;] io_wq_create+0x7b9/0xdc0
    [&lt;00000000c38b2018&gt;] io_uring_alloc_task_context+0x31e/0x59d
    [&lt;00000000867399da&gt;] __io_uring_add_tctx_node.cold+0x19/0x1ba
    [&lt;000000007e0e7a79&gt;] io_uring_setup.cold+0x1b80/0x1dce
    [&lt;00000000b545e9f6&gt;] __x64_sys_io_uring_setup+0x5d/0x80
    [&lt;000000008a8a7508&gt;] do_syscall_64+0x5d/0x90
    [&lt;000000004ac08bec&gt;] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: 0e03496d1967 ("io-wq: use private CPU mask")
Cc: stable@vger.kernel.org
Signed-off-by: Rafael Mendonca &lt;rafaelmendsr@gmail.com&gt;
Link: https://lore.kernel.org/r/20221020014710.902201-1-rafaelmendsr@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/msg_ring: Fix NULL pointer dereference in io_msg_send_fd()</title>
<updated>2022-10-19T19:33:33Z</updated>
<author>
<name>Harshit Mogalapalli</name>
<email>harshit.m.mogalapalli@oracle.com</email>
</author>
<published>2022-10-19T17:12:18Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=16bbdfe5fb0e78e0acb13e45fc127e9a296913f2'/>
<id>urn:sha1:16bbdfe5fb0e78e0acb13e45fc127e9a296913f2</id>
<content type='text'>
Syzkaller produced the below call trace:

 BUG: KASAN: null-ptr-deref in io_msg_ring+0x3cb/0x9f0
 Write of size 8 at addr 0000000000000070 by task repro/16399

 CPU: 0 PID: 16399 Comm: repro Not tainted 6.1.0-rc1 #28
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7
 Call Trace:
  &lt;TASK&gt;
  dump_stack_lvl+0xcd/0x134
  ? io_msg_ring+0x3cb/0x9f0
  kasan_report+0xbc/0xf0
  ? io_msg_ring+0x3cb/0x9f0
  kasan_check_range+0x140/0x190
  io_msg_ring+0x3cb/0x9f0
  ? io_msg_ring_prep+0x300/0x300
  io_issue_sqe+0x698/0xca0
  io_submit_sqes+0x92f/0x1c30
  __do_sys_io_uring_enter+0xae4/0x24b0
....
 RIP: 0033:0x7f2eaf8f8289
 RSP: 002b:00007fff40939718 EFLAGS: 00000246 ORIG_RAX: 00000000000001aa
 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2eaf8f8289
 RDX: 0000000000000000 RSI: 0000000000006f71 RDI: 0000000000000004
 RBP: 00007fff409397a0 R08: 0000000000000000 R09: 0000000000000039
 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004006d0
 R13: 00007fff40939880 R14: 0000000000000000 R15: 0000000000000000
  &lt;/TASK&gt;
 Kernel panic - not syncing: panic_on_warn set ...

We don't have a NULL check on file_ptr in the io_msg_send_fd() function,
so when file_ptr is NULL, src_file is also NULL, and get_file()
dereferences a NULL pointer, leading to the above crash.

Add a NULL check to fix this issue.
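The fix amounts to an early-return guard; a sketch (a Python stand-in for
the C code, with illustrative names and error value):

```python
EBADF = 9  # errno for a bad file descriptor

def msg_send_fd(file_ptr):
    # the fixed-file slot may be empty, leaving file_ptr unset
    if file_ptr is None:
        return -EBADF  # fail instead of dereferencing a NULL src_file
    return 0  # the real code proceeds to get_file() from here
```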

Fixes: e6130eba8a84 ("io_uring: add support for passing fixed file descriptors")
Reported-by: syzkaller &lt;syzkaller@googlegroups.com&gt;
Signed-off-by: Harshit Mogalapalli &lt;harshit.m.mogalapalli@oracle.com&gt;
Link: https://lore.kernel.org/r/20221019171218.1337614-1-harshit.m.mogalapalli@oracle.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
