<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-dev/kernel/bpf, branch master</title>
<subtitle>Linux kernel development work - see feature branches</subtitle>
<id>https://git.zx2c4.com/linux-dev/atom/kernel/bpf?h=master</id>
<link rel='self' href='https://git.zx2c4.com/linux-dev/atom/kernel/bpf?h=master'/>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/'/>
<updated>2022-11-03T23:24:12Z</updated>
<entry>
<title>bpf: Fix wrong reg type conversion in release_reference()</title>
<updated>2022-11-03T23:24:12Z</updated>
<author>
<name>Youlin Li</name>
<email>liulin063@gmail.com</email>
</author>
<published>2022-11-03T09:34:39Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=f1db20814af532f85e091231223e5e4818e8464b'/>
<id>urn:sha1:f1db20814af532f85e091231223e5e4818e8464b</id>
<content type='text'>
Some helper functions will allocate memory. To avoid memory leaks, the
verifier requires the eBPF program to release these memories by calling
the corresponding helper functions.

When a resource is released, all pointer registers corresponding to the
resource should be invalidated. The verifier use release_references() to
do this job, by apply  __mark_reg_unknown() to each relevant register.

It will give these registers the type of SCALAR_VALUE. A register that
will contain a pointer value at runtime, but of type SCALAR_VALUE, which
may allow the unprivileged user to get a kernel pointer by storing this
register into a map.

Using __mark_reg_not_init() while NOT allow_ptr_leaks can mitigate this
problem.

Fixes: fd978bf7fd31 ("bpf: Add reference tracking to verifier")
Signed-off-by: Youlin Li &lt;liulin063@gmail.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/20221103093440.3161-1-liulin063@gmail.com
</content>
</entry>
<entry>
<title>bpf, verifier: Fix memory leak in array reallocation for stack state</title>
<updated>2022-11-01T13:29:16Z</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2022-10-29T02:54:30Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=42378a9ca55347102bbf86708776061d8fe3ece2'/>
<id>urn:sha1:42378a9ca55347102bbf86708776061d8fe3ece2</id>
<content type='text'>
If an error (NULL) is returned by krealloc(), callers of realloc_array()
were setting their allocation pointers to NULL, but on error krealloc()
does not touch the original allocation. This would result in a memory
resource leak. Instead, free the old allocation on the error handling
path.

The memory leak information is as follows as also reported by Zhengchao:

  unreferenced object 0xffff888019801800 (size 256):
  comm "bpf_repo", pid 6490, jiffies 4294959200 (age 17.170s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [&lt;00000000b211474b&gt;] __kmalloc_node_track_caller+0x45/0xc0
    [&lt;0000000086712a0b&gt;] krealloc+0x83/0xd0
    [&lt;00000000139aab02&gt;] realloc_array+0x82/0xe2
    [&lt;00000000b1ca41d1&gt;] grow_stack_state+0xfb/0x186
    [&lt;00000000cd6f36d2&gt;] check_mem_access.cold+0x141/0x1341
    [&lt;0000000081780455&gt;] do_check_common+0x5358/0xb350
    [&lt;0000000015f6b091&gt;] bpf_check.cold+0xc3/0x29d
    [&lt;000000002973c690&gt;] bpf_prog_load+0x13db/0x2240
    [&lt;00000000028d1644&gt;] __sys_bpf+0x1605/0x4ce0
    [&lt;00000000053f29bd&gt;] __x64_sys_bpf+0x75/0xb0
    [&lt;0000000056fedaf5&gt;] do_syscall_64+0x35/0x80
    [&lt;000000002bd58261&gt;] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: c69431aab67a ("bpf: verifier: Improve function state reallocation")
Reported-by: Zhengchao Shao &lt;shaozhengchao@huawei.com&gt;
Reported-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reviewed-by: Bill Wendling &lt;morbo@google.com&gt;
Cc: Lorenz Bauer &lt;oss@lmb.io&gt;
Link: https://lore.kernel.org/bpf/20221029025433.2533810-1-keescook@chromium.org
</content>
</entry>
<entry>
<title>Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf</title>
<updated>2022-10-24T17:32:01Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2022-10-24T17:32:00Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=e28c44450b14474009a7ac84eb2bd631357c9635'/>
<id>urn:sha1:e28c44450b14474009a7ac84eb2bd631357c9635</id>
<content type='text'>
Alexei Starovoitov says:

====================
pull-request: bpf 2022-10-23

We've added 7 non-merge commits during the last 18 day(s) which contain
a total of 8 files changed, 69 insertions(+), 5 deletions(-).

The main changes are:

1) Wait for busy refill_work when destroying bpf memory allocator, from Hou.

2) Allow bpf_user_ringbuf_drain() callbacks to return 1, from David.

3) Fix dispatcher patchable function entry to 5 bytes nop, from Jiri.

4) Prevent decl_tag from being referenced in func_proto, from Stanislav.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Use __llist_del_all() whenever possbile during memory draining
  bpf: Wait for busy refill_work when destroying bpf memory allocator
  bpf: Fix dispatcher patchable function entry to 5 bytes nop
  bpf: prevent decl_tag from being referenced in func_proto
  selftests/bpf: Add reproducer for decl_tag in func_proto return type
  selftests/bpf: Make bpf_user_ringbuf_drain() selftest callback return 1
  bpf: Allow bpf_user_ringbuf_drain() callbacks to return 1
====================

Link: https://lore.kernel.org/r/20221023192244.81137-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Use __llist_del_all() whenever possbile during memory draining</title>
<updated>2022-10-22T02:17:38Z</updated>
<author>
<name>Hou Tao</name>
<email>houtao1@huawei.com</email>
</author>
<published>2022-10-21T11:49:13Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=fa4447cb73b2bfe7175f1b7ffdc70580fcfcb991'/>
<id>urn:sha1:fa4447cb73b2bfe7175f1b7ffdc70580fcfcb991</id>
<content type='text'>
Except for waiting_for_gp list, there are no concurrent operations on
free_by_rcu, free_llist and free_llist_extra lists, so use
__llist_del_all() instead of llist_del_all(). waiting_for_gp list can be
deleted by RCU callback concurrently, so still use llist_del_all().

Acked-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Signed-off-by: Hou Tao &lt;houtao1@huawei.com&gt;
Link: https://lore.kernel.org/r/20221021114913.60508-3-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Wait for busy refill_work when destroying bpf memory allocator</title>
<updated>2022-10-22T02:17:38Z</updated>
<author>
<name>Hou Tao</name>
<email>houtao1@huawei.com</email>
</author>
<published>2022-10-21T11:49:12Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=3d05818707bb2cf133ccdcd29f2d5585b5bc1298'/>
<id>urn:sha1:3d05818707bb2cf133ccdcd29f2d5585b5bc1298</id>
<content type='text'>
A busy irq work is an unfinished irq work and it can be either in the
pending state or in the running state. When destroying bpf memory
allocator, refill_work may be busy for PREEMPT_RT kernel in which irq
work is invoked in a per-CPU RT-kthread. It is also possible for kernel
with arch_irq_work_has_interrupt() being false (e.g. 1-cpu arm32 host or
mips) and irq work is inovked in timer interrupt.

The busy refill_work leads to various issues. The obvious one is that
there will be concurrent operations on free_by_rcu and free_list between
irq work and memory draining. Another one is call_rcu_in_progress will
not be reliable for the checking of pending RCU callback because
do_call_rcu() may have not been invoked by irq work yet. The other is
there will be use-after-free if irq work is freed before the callback
of irq work is invoked as shown below:

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor instruction fetch in kernel mode
 #PF: error_code(0x0010) - not-present page
 PGD 12ab94067 P4D 12ab94067 PUD 1796b4067 PMD 0
 Oops: 0010 [#1] PREEMPT_RT SMP
 CPU: 5 PID: 64 Comm: irq_work/5 Not tainted 6.0.0-rt11+ #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 RIP: 0010:0x0
 Code: Unable to access opcode bytes at 0xffffffffffffffd6.
 RSP: 0018:ffffadc080293e78 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: ffffcdc07fb6a388 RCX: ffffa05000a2e000
 RDX: ffffa05000a2e000 RSI: ffffffff96cc9827 RDI: ffffcdc07fb6a388
 ......
 Call Trace:
  &lt;TASK&gt;
  irq_work_single+0x24/0x60
  irq_work_run_list+0x24/0x30
  run_irq_workd+0x23/0x30
  smpboot_thread_fn+0x203/0x300
  kthread+0x126/0x150
  ret_from_fork+0x1f/0x30
  &lt;/TASK&gt;

Considering the ease of concurrency handling, no overhead for
irq_work_sync() under non-PREEMPT_RT kernel and has-irq-work-interrupt
kernel and the short wait time used for irq_work_sync() under PREEMPT_RT
(When running two test_maps on PREEMPT_RT kernel and 72-cpus host, the
max wait time is about 8ms and the 99th percentile is 10us), just using
irq_work_sync() to wait for busy refill_work to complete before memory
draining and memory freeing.

Fixes: 7c8199e24fa0 ("bpf: Introduce any context BPF specific memory allocator.")
Acked-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Signed-off-by: Hou Tao &lt;houtao1@huawei.com&gt;
Link: https://lore.kernel.org/r/20221021114913.60508-2-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Fix dispatcher patchable function entry to 5 bytes nop</title>
<updated>2022-10-21T01:57:51Z</updated>
<author>
<name>Jiri Olsa</name>
<email>jolsa@kernel.org</email>
</author>
<published>2022-10-18T07:59:34Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=dbe69b29988465b011f198f2797b1c2b6980b50e'/>
<id>urn:sha1:dbe69b29988465b011f198f2797b1c2b6980b50e</id>
<content type='text'>
The patchable_function_entry(5) might output 5 single nop
instructions (depends on toolchain), which will clash with
bpf_arch_text_poke check for 5 bytes nop instruction.

Adding early init call for dispatcher that checks and change
the patchable entry into expected 5 nop instruction if needed.

There's no need to take text_mutex, because we are using it
in early init call which is called at pre-smp time.

Fixes: ceea991a019c ("bpf: Move bpf_dispatcher function out of ftrace locations")
Signed-off-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20221018075934.574415-1-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'cgroup-for-6.1-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2022-10-18T01:52:43Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-10-18T01:52:43Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=bb1a1146467ad812bb65440696df0782e2bc63c8'/>
<id>urn:sha1:bb1a1146467ad812bb65440696df0782e2bc63c8</id>
<content type='text'>
Pull cgroup fixes from Tejun Heo:

 - Fix a recent regression where a sleeping kernfs function is called
   with css_set_lock (spinlock) held

 - Revert the commit to enable cgroup1 support for cgroup_get_from_fd/file()

   Multiple users assume that the lookup only works for cgroup2 and
   breaks when fed a cgroup1 file. Instead, introduce a separate set of
   functions to lookup both v1 and v2 and use them where the user
   explicitly wants to support both versions.

 - Compat update for tools/perf/util/bpf_skel/bperf_cgroup.bpf.c.

 - Add Josef Bacik as a blkcg maintainer.

* tag 'cgroup-for-6.1-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  blkcg: Update MAINTAINERS entry
  mm: cgroup: fix comments for get from fd/file helpers
  perf stat: Support old kernels for bperf cgroup counting
  bpf: cgroup_iter: support cgroup1 using cgroup fd
  cgroup: add cgroup_v1v2_get_from_[fd/file]()
  Revert "cgroup: enable cgroup_get_from_file() on cgroup1"
  cgroup: Reorganize css_set_lock and kernfs path processing
</content>
</entry>
<entry>
<title>bpf: prevent decl_tag from being referenced in func_proto</title>
<updated>2022-10-17T17:56:12Z</updated>
<author>
<name>Stanislav Fomichev</name>
<email>sdf@google.com</email>
</author>
<published>2022-10-15T00:24:44Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=ea68376c8bed5cd156900852aada20c3a0874d17'/>
<id>urn:sha1:ea68376c8bed5cd156900852aada20c3a0874d17</id>
<content type='text'>
Syzkaller was able to hit the following issue:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 3609 at kernel/bpf/btf.c:1946
btf_type_id_size+0x2d5/0x9d0 kernel/bpf/btf.c:1946
Modules linked in:
CPU: 0 PID: 3609 Comm: syz-executor361 Not tainted
6.0.0-syzkaller-02734-g0326074ff465 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 09/22/2022
RIP: 0010:btf_type_id_size+0x2d5/0x9d0 kernel/bpf/btf.c:1946
Code: ef e8 7f 8e e4 ff 41 83 ff 0b 77 28 f6 44 24 10 18 75 3f e8 6d 91
e4 ff 44 89 fe bf 0e 00 00 00 e8 20 8e e4 ff e8 5b 91 e4 ff &lt;0f&gt; 0b 45
31 f6 e9 98 02 00 00 41 83 ff 12 74 18 e8 46 91 e4 ff 44
RSP: 0018:ffffc90003cefb40 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000
RDX: ffff8880259c0000 RSI: ffffffff81968415 RDI: 0000000000000005
RBP: ffff88801270ca00 R08: 0000000000000005 R09: 000000000000000e
R10: 0000000000000011 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000011 R14: ffff888026ee6424 R15: 0000000000000011
FS:  000055555641b300(0000) GS:ffff8880b9a00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000f2e258 CR3: 000000007110e000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 &lt;TASK&gt;
 btf_func_proto_check kernel/bpf/btf.c:4447 [inline]
 btf_check_all_types kernel/bpf/btf.c:4723 [inline]
 btf_parse_type_sec kernel/bpf/btf.c:4752 [inline]
 btf_parse kernel/bpf/btf.c:5026 [inline]
 btf_new_fd+0x1926/0x1e70 kernel/bpf/btf.c:6892
 bpf_btf_load kernel/bpf/syscall.c:4324 [inline]
 __sys_bpf+0xb7d/0x4cf0 kernel/bpf/syscall.c:5010
 __do_sys_bpf kernel/bpf/syscall.c:5069 [inline]
 __se_sys_bpf kernel/bpf/syscall.c:5067 [inline]
 __x64_sys_bpf+0x75/0xb0 kernel/bpf/syscall.c:5067
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f0fbae41c69
Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89
f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 &lt;48&gt; 3d 01
f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffc8aeb6228 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0fbae41c69
RDX: 0000000000000020 RSI: 0000000020000140 RDI: 0000000000000012
RBP: 00007f0fbae05e10 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000ffffffff R11: 0000000000000246 R12: 00007f0fbae05ea0
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 &lt;/TASK&gt;

Looks like it tries to create a func_proto which return type is
decl_tag. For the details, see Martin's spot on analysis in [0].

0: https://lore.kernel.org/bpf/CAKH8qBuQDLva_hHxxBuZzyAcYNO4ejhovz6TQeVSk8HY-2SO6g@mail.gmail.com/T/#mea6524b3fcd6298347432226e81b1e6155efc62c

Cc: Yonghong Song &lt;yhs@fb.com&gt;
Cc: Martin KaFai Lau &lt;martin.lau@kernel.org&gt;
Fixes: bd16dee66ae4 ("bpf: Add BTF_KIND_DECL_TAG typedef support")
Reported-by: syzbot+d8bd751aef7c6b39a344@syzkaller.appspotmail.com
Signed-off-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Acked-by: Yonghong Song &lt;yhs@fb.com&gt;
Link: https://lore.kernel.org/r/20221015002444.2680969-2-sdf@google.com
Signed-off-by: Martin KaFai Lau &lt;martin.lau@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Allow bpf_user_ringbuf_drain() callbacks to return 1</title>
<updated>2022-10-13T15:27:37Z</updated>
<author>
<name>David Vernet</name>
<email>void@manifault.com</email>
</author>
<published>2022-10-12T23:20:14Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=c92a7a52243871eb10e0ea2260685def47fb5094'/>
<id>urn:sha1:c92a7a52243871eb10e0ea2260685def47fb5094</id>
<content type='text'>
The bpf_user_ringbuf_drain() helper function allows a BPF program to
specify a callback that is invoked when draining entries from a
BPF_MAP_TYPE_USER_RINGBUF ring buffer map. The API is meant to allow the
callback to return 0 if it wants to continue draining samples, and 1 if
it's done draining. Unfortunately, bpf_user_ringbuf_drain() landed shortly
after commit 1bfe26fb0827 ("bpf: Add verifier support for custom
callback return range"), which changed the default behavior of callbacks
to only support returning 0.

This patch corrects that oversight by allowing bpf_user_ringbuf_drain()
callbacks to return 0 or 1. A follow-on patch will update the
user_ringbuf selftests to also return 1 from a bpf_user_ringbuf_drain()
callback to prevent this from regressing in the future.

Fixes: 205715673844 ("bpf: Add bpf_user_ringbuf_drain() helper")
Signed-off-by: David Vernet &lt;void@manifault.com&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20221012232015.1510043-2-void@manifault.com
</content>
</entry>
<entry>
<title>treewide: use get_random_u32() when possible</title>
<updated>2022-10-11T23:42:58Z</updated>
<author>
<name>Jason A. Donenfeld</name>
<email>Jason@zx2c4.com</email>
</author>
<published>2022-10-05T15:43:22Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=a251c17aa558d8e3128a528af5cf8b9d7caae4fd'/>
<id>urn:sha1:a251c17aa558d8e3128a528af5cf8b9d7caae4fd</id>
<content type='text'>
The prandom_u32() function has been a deprecated inline wrapper around
get_random_u32() for several releases now, and compiles down to the
exact same code. Replace the deprecated wrapper with a direct call to
the real function. The same also applies to get_random_int(), which is
just a wrapper around get_random_u32(). This was done as a basic find
and replace.

Reviewed-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt; # for ext4
Acked-by: Toke Høiland-Jørgensen &lt;toke@toke.dk&gt; # for sch_cake
Acked-by: Chuck Lever &lt;chuck.lever@oracle.com&gt; # for nfsd
Acked-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Acked-by: Mika Westerberg &lt;mika.westerberg@linux.intel.com&gt; # for thunderbolt
Acked-by: Darrick J. Wong &lt;djwong@kernel.org&gt; # for xfs
Acked-by: Helge Deller &lt;deller@gmx.de&gt; # for parisc
Acked-by: Heiko Carstens &lt;hca@linux.ibm.com&gt; # for s390
Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
</content>
</entry>
</feed>
