<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-dev/arch/arm64/net, branch master</title>
<subtitle>Linux kernel development work - see feature branches</subtitle>
<id>https://git.zx2c4.com/linux-dev/atom/arch/arm64/net?h=master</id>
<link rel='self' href='https://git.zx2c4.com/linux-dev/atom/arch/arm64/net?h=master'/>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/'/>
<updated>2022-09-07T02:51:14Z</updated>
<entry>
<title>bpf: arm64: No support of struct argument in trampoline programs</title>
<updated>2022-09-07T02:51:14Z</updated>
<author>
<name>Yonghong Song</name>
<email>yhs@fb.com</email>
</author>
<published>2022-08-31T15:27:02Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=eb707dde264af5eb0271156d7fbd59133fa02cac'/>
<id>urn:sha1:eb707dde264af5eb0271156d7fbd59133fa02cac</id>
<content type='text'>
ARM64 does not support struct arguments for trampoline-based
bpf programs yet.
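
A guard of roughly this shape is what the change implies (a sketch
only, not the verbatim arm64 diff; BTF_FMODEL_STRUCT_ARG is the
generic flag marking struct arguments in the BTF function model):

  /* Sketch: reject struct args until the arm64 trampoline learns
   * to pass them.
   */
  for (i = 0; i &lt; MAX_BPF_FUNC_ARGS; i++) {
          if (m-&gt;arg_flags[i] &amp; BTF_FMODEL_STRUCT_ARG)
                  return -ENOTSUPP;
  }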

Signed-off-by: Yonghong Song &lt;yhs@fb.com&gt;
Link: https://lore.kernel.org/r/20220831152702.2079066-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf, arm64: Fix bpf trampoline instruction endianness</title>
<updated>2022-08-10T14:50:57Z</updated>
<author>
<name>Xu Kuohai</name>
<email>xukuohai@huawei.com</email>
</author>
<published>2022-08-08T04:07:35Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=aada476655461a9ab491d8298a415430cdd10278'/>
<id>urn:sha1:aada476655461a9ab491d8298a415430cdd10278</id>
<content type='text'>
The sparse tool complains as follows:

arch/arm64/net/bpf_jit_comp.c:1684:16:
	warning: incorrect type in assignment (different base types)
arch/arm64/net/bpf_jit_comp.c:1684:16:
	expected unsigned int [usertype] *branch
arch/arm64/net/bpf_jit_comp.c:1684:16:
	got restricted __le32 [usertype] *
arch/arm64/net/bpf_jit_comp.c:1700:52:
	error: subtraction of different types can't work (different base
	types)
arch/arm64/net/bpf_jit_comp.c:1734:29:
	warning: incorrect type in assignment (different base types)
arch/arm64/net/bpf_jit_comp.c:1734:29:
	expected unsigned int [usertype] *
arch/arm64/net/bpf_jit_comp.c:1734:29:
	got restricted __le32 [usertype] *
arch/arm64/net/bpf_jit_comp.c:1918:52:
	error: subtraction of different types can't work (different base
	types)

This is because the variable branch in function invoke_bpf_prog and the
variable branches in function prepare_trampoline are defined as type
u32 *, which conflicts with ctx-&gt;image's type __le32 *, so sparse complains
whenever assignments or arithmetic operations are performed between these
two variables and ctx-&gt;image.

Since arm64 instructions are always little-endian, change the type of
these two variables to __le32 * and call cpu_to_le32() to convert each
instruction to little-endian before writing it to memory. This is also
in line with emit(), which does cpu_to_le32() internally as well.
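
A minimal sketch of the resulting pattern (context abridged; the
actual change touches the branch bookkeeping in invoke_bpf_prog()
and prepare_trampoline()):

  /* ctx-&gt;image is __le32 *, so pointers into it must carry the same
   * type, and stores must convert explicitly: arm64 instructions are
   * little-endian regardless of the kernel's data endianness.
   */
  __le32 *branch = &amp;ctx-&gt;image[ctx-&gt;idx];   /* was: u32 *branch */

  *branch = cpu_to_le32(insn);              /* explicit LE store */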

Fixes: efc9909fdce0 ("bpf, arm64: Add bpf trampoline for arm64")
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Xu Kuohai &lt;xukuohai@huawei.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reviewed-by: Jean-Philippe Brucker &lt;jean-philippe@linaro.org&gt;
Link: https://lore.kernel.org/bpf/20220808040735.1232002-1-xukuohai@huawei.com
</content>
</entry>
<entry>
<title>bpf, arm64: Allocate program buffer using kvcalloc instead of kcalloc</title>
<updated>2022-08-08T14:42:43Z</updated>
<author>
<name>Aijun Sun</name>
<email>aijun.sun@unisoc.com</email>
</author>
<published>2022-08-04T02:54:42Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=19f68ed6dc90c93daf7e54d3350ea67fead7cbbf'/>
<id>urn:sha1:19f68ed6dc90c93daf7e54d3350ea67fead7cbbf</id>
<content type='text'>
It is not necessary to allocate physically contiguous memory for the
BPF program buffer with kcalloc. When the BPF program is larger than
the memory page size, kcalloc allocates multiple memory pages from the
buddy system. If the device cannot provide sufficient contiguous
memory, for example on low-end Android devices [0], memory allocation
for the BPF program is likely to fail.
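
A sketch of the pattern (variable names illustrative): kvcalloc()
attempts a kmalloc-style allocation first and transparently falls
back to vmalloc when a high-order contiguous request cannot be
satisfied; kvfree() releases either kind of buffer.

  /* was: buf = kcalloc(image_size, sizeof(u8), GFP_KERNEL); */
  buf = kvcalloc(image_size, sizeof(u8), GFP_KERNEL);
  if (!buf)
          return -ENOMEM;
  ...
  kvfree(buf);    /* matches both allocation paths */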

Test cases in lib/test_bpf.c all pass on ARM64 QEMU.

[0]
  AndroidTestSuit: page allocation failure: order:4,
  mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=foreground,mems_allowed=0
  Call trace:
   dump_stack+0xa4/0x114
   warn_alloc+0xf8/0x14c
   __alloc_pages_slowpath+0xac8/0xb14
   __alloc_pages_nodemask+0x194/0x3d0
   kmalloc_order_trace+0x44/0x1e8
   __kmalloc+0x29c/0x66c
   bpf_int_jit_compile+0x17c/0x568
   bpf_prog_select_runtime+0x4c/0x1b0
   bpf_prepare_filter+0x5fc/0x6bc
   bpf_prog_create_from_user+0x118/0x1c0
   seccomp_set_mode_filter+0x1c4/0x7cc
   __do_sys_prctl+0x380/0x1424
   __arm64_sys_prctl+0x20/0x2c
   el0_svc_common+0xc8/0x22c
   el0_svc_handler+0x1c/0x28
   el0_svc+0x8/0x100

Signed-off-by: Aijun Sun &lt;aijun.sun@unisoc.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/20220804025442.22524-1-aijun.sun@unisoc.com
</content>
</entry>
<entry>
<title>bpf, arm64: Fix compile error in dummy_tramp()</title>
<updated>2022-07-21T22:21:16Z</updated>
<author>
<name>Xu Kuohai</name>
<email>xukuohai@huawei.com</email>
</author>
<published>2022-07-21T12:13:19Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=339ed900b3079b6ea21df55f368b1ccba17cd5e6'/>
<id>urn:sha1:339ed900b3079b6ea21df55f368b1ccba17cd5e6</id>
<content type='text'>
dummy_tramp() uses "lr" to refer to the x30 register, but some
assemblers do not recognize "lr" and report a build failure:

/tmp/cc52xO0c.s: Assembler messages:
/tmp/cc52xO0c.s:8: Error: operand 1 should be an integer register -- `mov lr,x9'
/tmp/cc52xO0c.s:7: Error: undefined symbol lr used as an immediate value
make[2]: *** [scripts/Makefile.build:250: arch/arm64/net/bpf_jit_comp.o] Error 1
make[1]: *** [scripts/Makefile.build:525: arch/arm64/net] Error 2

So replace "lr" with "x30" to fix it.
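
A sketch of the fix in the inline assembly (body abridged; x30 is the
architectural name of the link register, which every assembler
accepts):

  asm (
  "     .global dummy_tramp\n"
  "dummy_tramp:\n"
  "     mov x10, x30\n"    /* was: mov x10, lr */
  "     mov x30, x9\n"     /* was: mov lr, x9  */
  "     ret x10\n"
  );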

Fixes: b2ad54e1533e ("bpf, arm64: Implement bpf_arch_text_poke() for arm64")
Reported-by: Jon Hunter &lt;jonathanh@nvidia.com&gt;
Signed-off-by: Xu Kuohai &lt;xukuohai@huawei.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Tested-by: Jon Hunter &lt;jonathanh@nvidia.com&gt;
Reviewed-by: Jean-Philippe Brucker &lt;jean-philippe@linaro.org&gt;
Link: https://lore.kernel.org/bpf/20220721121319.2999259-1-xukuohai@huaweicloud.com
</content>
</entry>
<entry>
<title>bpf, arm64: Mark dummy_tramp as global</title>
<updated>2022-07-14T14:57:26Z</updated>
<author>
<name>Nathan Chancellor</name>
<email>nathan@kernel.org</email>
</author>
<published>2022-07-13T17:35:03Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=33f32e5072b6cc84d1b130a3ad485849bcec907a'/>
<id>urn:sha1:33f32e5072b6cc84d1b130a3ad485849bcec907a</id>
<content type='text'>
When building with clang + CONFIG_CFI_CLANG=y, the following error
occurs at link time:

  ld.lld: error: undefined symbol: dummy_tramp

dummy_tramp is declared globally in C but its definition in inline
assembly does not use .global, which prevents clang from properly
resolving the references to it when creating the CFI jump tables.

Mark dummy_tramp as global so that the reference can be properly
resolved.
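
A sketch of the change (definition body abridged): the symbol gains a
.global directive so the C-side reference resolves at link time.

  asm (
  "     .pushsection .text, \"ax\", @progbits\n"
  "     .global dummy_tramp\n"          /* the added directive */
  "     .type dummy_tramp, %function\n"
  "dummy_tramp:\n"
  "     ret\n"                          /* body abridged */
  "     .popsection\n"
  );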

Fixes: b2ad54e1533e ("bpf, arm64: Implement bpf_arch_text_poke() for arm64")
Suggested-by: Sami Tolvanen &lt;samitolvanen@google.com&gt;
Signed-off-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reviewed-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Link: https://github.com/ClangBuiltLinux/linux/issues/1661
Link: https://lore.kernel.org/bpf/20220713173503.3889486-1-nathan@kernel.org
</content>
</entry>
<entry>
<title>bpf, arm64: Add bpf trampoline for arm64</title>
<updated>2022-07-11T19:08:08Z</updated>
<author>
<name>Xu Kuohai</name>
<email>xukuohai@huawei.com</email>
</author>
<published>2022-07-11T15:08:23Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=efc9909fdce00a827a37609628223cd45bf95d0b'/>
<id>urn:sha1:efc9909fdce00a827a37609628223cd45bf95d0b</id>
<content type='text'>
This is the arm64 version of commit fec56f5890d9 ("bpf: Introduce BPF
trampoline"). A bpf trampoline converts native calling convention to bpf
calling convention and is used to implement various bpf features, such
as fentry, fexit, fmod_ret and struct_ops.

This patch does essentially the same thing that the bpf trampoline
does on x86.
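
For context, a minimal fentry program of the kind this trampoline
makes attachable on arm64, as one would write it with libbpf
(illustrative only, not part of the patch):

  #include "vmlinux.h"
  #include &lt;bpf/bpf_helpers.h&gt;
  #include &lt;bpf/bpf_tracing.h&gt;

  SEC("fentry/tcp_v4_connect")
  int BPF_PROG(on_connect, struct sock *sk)
  {
          /* runs at function entry via the bpf trampoline */
          return 0;
  }

  char LICENSE[] SEC("license") = "GPL";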

Tested on Raspberry Pi 4B and qemu:

 #18 /1     bpf_tcp_ca/dctcp:OK
 #18 /2     bpf_tcp_ca/cubic:OK
 #18 /3     bpf_tcp_ca/invalid_license:OK
 #18 /4     bpf_tcp_ca/dctcp_fallback:OK
 #18 /5     bpf_tcp_ca/rel_setsockopt:OK
 #18        bpf_tcp_ca:OK
 #51 /1     dummy_st_ops/dummy_st_ops_attach:OK
 #51 /2     dummy_st_ops/dummy_init_ret_value:OK
 #51 /3     dummy_st_ops/dummy_init_ptr_arg:OK
 #51 /4     dummy_st_ops/dummy_multiple_args:OK
 #51        dummy_st_ops:OK
 #57 /1     fexit_bpf2bpf/target_no_callees:OK
 #57 /2     fexit_bpf2bpf/target_yes_callees:OK
 #57 /3     fexit_bpf2bpf/func_replace:OK
 #57 /4     fexit_bpf2bpf/func_replace_verify:OK
 #57 /5     fexit_bpf2bpf/func_sockmap_update:OK
 #57 /6     fexit_bpf2bpf/func_replace_return_code:OK
 #57 /7     fexit_bpf2bpf/func_map_prog_compatibility:OK
 #57 /8     fexit_bpf2bpf/func_replace_multi:OK
 #57 /9     fexit_bpf2bpf/fmod_ret_freplace:OK
 #57        fexit_bpf2bpf:OK
 #237       xdp_bpf2bpf:OK

Signed-off-by: Xu Kuohai &lt;xukuohai@huawei.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reviewed-by: Jean-Philippe Brucker &lt;jean-philippe@linaro.org&gt;
Acked-by: Song Liu &lt;songliubraving@fb.com&gt;
Acked-by: KP Singh &lt;kpsingh@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20220711150823.2128542-5-xukuohai@huawei.com
</content>
</entry>
<entry>
<title>bpf, arm64: Implement bpf_arch_text_poke() for arm64</title>
<updated>2022-07-11T19:08:01Z</updated>
<author>
<name>Xu Kuohai</name>
<email>xukuohai@huawei.com</email>
</author>
<published>2022-07-11T15:08:22Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=b2ad54e1533e91449cb2a371e034942bd7882b58'/>
<id>urn:sha1:b2ad54e1533e91449cb2a371e034942bd7882b58</id>
<content type='text'>
Implement bpf_arch_text_poke() for arm64, so bpf prog or bpf trampoline
can be patched with it.

When the target address is NULL, the original instruction is patched to
a NOP.

When the target address and the source address are within the branch
range, the original instruction is patched to a bl instruction to the
target address directly.

To support attaching bpf trampolines to both regular kernel functions
and bpf progs, we follow the ftrace patchsite approach for bpf prog.
That is, two instructions are inserted at the beginning of the bpf
prog: the first saves the return address to x9, and the second is a
nop which will be patched to a bl instruction when a bpf trampoline is
attached.

However, when a bpf trampoline is attached to a bpf prog, the distance
between the target address and the source address may exceed 128MB,
the maximum branch range, because the bpf trampoline and the bpf prog
are allocated separately with vmalloc. So long jumps must be handled.

When a bpf prog is constructed, a plt pointing to the empty trampoline
dummy_tramp is placed at the end:

        bpf_prog:
                mov x9, lr
                nop // patchsite
                ...
                ret

        plt:
                ldr x10, target
                br x10
        target:
                .quad dummy_tramp // plt target

This is also the state when no trampoline is attached.

When a short-jump bpf trampoline is attached, the patchsite is patched to
a bl instruction to the trampoline directly:

        bpf_prog:
                mov x9, lr
                bl &lt;short-jump bpf trampoline address&gt; // patchsite
                ...
                ret

        plt:
                ldr x10, target
                br x10
        target:
                .quad dummy_tramp // plt target

When a long-jump bpf trampoline is attached, the plt target is filled with
the trampoline address and the patchsite is patched to a bl instruction to
the plt:

        bpf_prog:
                mov x9, lr
                bl plt // patchsite
                ...
                ret

        plt:
                ldr x10, target
                br x10
        target:
                .quad &lt;long-jump bpf trampoline address&gt;

dummy_tramp is used to prevent another CPU from jumping to an unknown
location during patching, which makes the patching process simpler.

The patching process is as follows (a condensed code sketch follows
the list):

1. when neither the old address nor the new address is a long jump, the
   patchsite is replaced with a bl to the new address, or a nop if the new
   address is NULL;

2. when the old address is not a long jump but the new one is, the
   branch target address is written to the plt first, then the patchsite
   is replaced with a bl instruction to the plt;

3. when the old address is a long jump but the new one is not, the address
   of dummy_tramp is written to the plt first, then the patchsite is
   replaced with a bl to the new address, or a nop if the new address is
   NULL;

4. when both the old address and the new address are long jumps, the
   new address is written to the plt and the patchsite is not changed.
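
A condensed sketch of that logic (helper names illustrative; the real
bpf_arch_text_poke() also synthesizes and validates the old and new
instructions before patching):

  /* Keep the plt usable at all times: it holds either the long-jump
   * target or dummy_tramp, never garbage.
   */
  if (is_long_jump(ip, new_addr))
          plt_write(plt, new_addr);       /* cases 2 and 4 */
  else if (is_long_jump(ip, old_addr))
          plt_write(plt, dummy_tramp);    /* case 3 */

  if (!is_long_jump(ip, new_addr))        /* cases 1 and 3 */
          poke(ip, new_addr ? gen_bl(ip, new_addr) : gen_nop());
  else if (!is_long_jump(ip, old_addr))   /* case 2 */
          poke(ip, gen_bl(ip, plt));
  /* case 4: the patchsite already branches via the plt */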

Signed-off-by: Xu Kuohai &lt;xukuohai@huawei.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reviewed-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Reviewed-by: KP Singh &lt;kpsingh@kernel.org&gt;
Reviewed-by: Jean-Philippe Brucker &lt;jean-philippe@linaro.org&gt;
Acked-by: Song Liu &lt;songliubraving@fb.com&gt;
Link: https://lore.kernel.org/bpf/20220711150823.2128542-4-xukuohai@huawei.com
</content>
</entry>
<entry>
<title>bpf, arm64: Keep tail call count across bpf2bpf calls</title>
<updated>2022-06-21T16:52:14Z</updated>
<author>
<name>Jakub Sitnicki</name>
<email>jakub@cloudflare.com</email>
</author>
<published>2022-06-17T10:57:35Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=d4609a5d8c70d21b4a3f801cf896a3c16c613fe1'/>
<id>urn:sha1:d4609a5d8c70d21b4a3f801cf896a3c16c613fe1</id>
<content type='text'>
Today, doing a BPF tail call after a BPF to BPF call, that is, from a
subprogram, is allowed only by the x86-64 BPF JIT. Mixing these features
requires support from the JIT: the tail call count has to be tracked
through BPF to BPF calls, as well as through BPF tail calls, to prevent
unbounded chains of tail calls.

The arm64 BPF JIT stores the tail call count (TCC) in a dedicated
register (X26). This makes it easier to support bpf2bpf calls mixed with
tail calls than on the x86 platform.

In order to keep the tail call count intact throughout bpf2bpf calls, all
we need to do is tweak the program prologue generator. When emitting the
prologue for a subprogram, we skip the block that initializes the tail
call count and emits a jump pad for the tail call.
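
A sketch of the prologue tweak (predicate name illustrative; the JIT
can tell the main program from a subprogram at emit time):

  /* Only the main program owns TCC setup and the tail-call jump
   * pad; subprograms inherit x26 as-is.
   */
  if (is_main_prog) {
          emit(A64_MOVZ(1, tcc, 0, 0), ctx);      /* x26 = 0 */
          emit_bti(A64_BTI_J, ctx);               /* jump pad */
  }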

With this change, a sample execution flow where a bpf2bpf call is followed
by a tail call would look like so:

int entry(struct __sk_buff *skb):
   0xffffffc0090151d4:  paciasp
   0xffffffc0090151d8:  stp     x29, x30, [sp, #-16]!
   0xffffffc0090151dc:  mov     x29, sp
   0xffffffc0090151e0:  stp     x19, x20, [sp, #-16]!
   0xffffffc0090151e4:  stp     x21, x22, [sp, #-16]!
   0xffffffc0090151e8:  stp     x25, x26, [sp, #-16]!
   0xffffffc0090151ec:  stp     x27, x28, [sp, #-16]!
   0xffffffc0090151f0:  mov     x25, sp
   0xffffffc0090151f4:  mov     x26, #0x0                       // &lt;- init TCC only
   0xffffffc0090151f8:  bti     j                               //    in main prog
   0xffffffc0090151fc:  sub     x27, x25, #0x0
   0xffffffc009015200:  sub     sp, sp, #0x10
   0xffffffc009015204:  mov     w1, #0x0
   0xffffffc009015208:  mov     x10, #0xffffffffffffffff
   0xffffffc00901520c:  strb    w1, [x25, x10]
   0xffffffc009015210:  mov     x10, #0xffffffffffffd25c
   0xffffffc009015214:  movk    x10, #0x902, lsl #16
   0xffffffc009015218:  movk    x10, #0xffc0, lsl #32
   0xffffffc00901521c:  blr     x10 -------------------.        // bpf2bpf call
   0xffffffc009015220:  add     x7, x0, #0x0 &lt;-------------.
   0xffffffc009015224:  add     sp, sp, #0x10          |   |
   0xffffffc009015228:  ldp     x27, x28, [sp], #16    |   |
   0xffffffc00901522c:  ldp     x25, x26, [sp], #16    |   |
   0xffffffc009015230:  ldp     x21, x22, [sp], #16    |   |
   0xffffffc009015234:  ldp     x19, x20, [sp], #16    |   |
   0xffffffc009015238:  ldp     x29, x30, [sp], #16    |   |
   0xffffffc00901523c:  add     x0, x7, #0x0           |   |
   0xffffffc009015240:  autiasp                        |   |
   0xffffffc009015244:  ret                            |   |
                                                       |   |
int subprog_tail(struct __sk_buff *skb):               |   |
   0xffffffc00902d25c:  paciasp &lt;----------------------'   |
   0xffffffc00902d260:  stp     x29, x30, [sp, #-16]!      |
   0xffffffc00902d264:  mov     x29, sp                    |
   0xffffffc00902d268:  stp     x19, x20, [sp, #-16]!      |
   0xffffffc00902d26c:  stp     x21, x22, [sp, #-16]!      |
   0xffffffc00902d270:  stp     x25, x26, [sp, #-16]!      |
   0xffffffc00902d274:  stp     x27, x28, [sp, #-16]!      |
   0xffffffc00902d278:  mov     x25, sp                    |
   0xffffffc00902d27c:  sub     x27, x25, #0x0             |
   0xffffffc00902d280:  sub     sp, sp, #0x10              |    // &lt;- end of prologue, notice:
   0xffffffc00902d284:  add     x19, x0, #0x0              |    //    1) TCC not touched, and
   0xffffffc00902d288:  mov     w0, #0x1                   |    //    2) no tail call jump pad
   0xffffffc00902d28c:  mov     x10, #0xfffffffffffffffc   |
   0xffffffc00902d290:  str     w0, [x25, x10]             |
   0xffffffc00902d294:  mov     x20, #0xffffff80ffffffff   |
   0xffffffc00902d298:  movk    x20, #0xc033, lsl #16      |
   0xffffffc00902d29c:  movk    x20, #0x4e00               |
   0xffffffc00902d2a0:  add     x0, x19, #0x0              |
   0xffffffc00902d2a4:  add     x1, x20, #0x0              |
   0xffffffc00902d2a8:  mov     x2, #0x0                   |
   0xffffffc00902d2ac:  mov     w10, #0x24                 |
   0xffffffc00902d2b0:  ldr     w10, [x1, x10]             |
   0xffffffc00902d2b4:  add     w2, w2, #0x0               |
   0xffffffc00902d2b8:  cmp     w2, w10                    |
   0xffffffc00902d2bc:  b.cs    0xffffffc00902d2f8         |
   0xffffffc00902d2c0:  mov     w10, #0x21                 |
   0xffffffc00902d2c4:  cmp     x26, x10                   |    // TCC &gt;= MAX_TAIL_CALL_CNT?
   0xffffffc00902d2c8:  b.cs    0xffffffc00902d2f8         |
   0xffffffc00902d2cc:  add     x26, x26, #0x1             |    // TCC++
   0xffffffc00902d2d0:  mov     w10, #0x110                |
   0xffffffc00902d2d4:  add     x10, x1, x10               |
   0xffffffc00902d2d8:  lsl     x11, x2, #3                |
   0xffffffc00902d2dc:  ldr     x11, [x10, x11]            |
   0xffffffc00902d2e0:  cbz     x11, 0xffffffc00902d2f8    |
   0xffffffc00902d2e4:  mov     w10, #0x30                 |
   0xffffffc00902d2e8:  ldr     x10, [x11, x10]            |
   0xffffffc00902d2ec:  add     x10, x10, #0x24            |
   0xffffffc00902d2f0:  add     sp, sp, #0x10              |    // &lt;- destroy just current
   0xffffffc00902d2f4:  br      x10 ---------------------. |    //    BPF stack frame
   0xffffffc00902d2f8:  mov     x10, #0xfffffffffffffffc | |    //    before the tail call
   0xffffffc00902d2fc:  ldr     w7, [x25, x10]           | |
   0xffffffc00902d300:  add     sp, sp, #0x10            | |
   0xffffffc00902d304:  ldp     x27, x28, [sp], #16      | |
   0xffffffc00902d308:  ldp     x25, x26, [sp], #16      | |
   0xffffffc00902d30c:  ldp     x21, x22, [sp], #16      | |
   0xffffffc00902d310:  ldp     x19, x20, [sp], #16      | |
   0xffffffc00902d314:  ldp     x29, x30, [sp], #16      | |
   0xffffffc00902d318:  add     x0, x7, #0x0             | |
   0xffffffc00902d31c:  autiasp                          | |
   0xffffffc00902d320:  ret                              | |
                                                         | |
int classifier_0(struct __sk_buff *skb):                 | |
   0xffffffc008ff5874:  paciasp                          | |
   0xffffffc008ff5878:  stp     x29, x30, [sp, #-16]!    | |
   0xffffffc008ff587c:  mov     x29, sp                  | |
   0xffffffc008ff5880:  stp     x19, x20, [sp, #-16]!    | |
   0xffffffc008ff5884:  stp     x21, x22, [sp, #-16]!    | |
   0xffffffc008ff5888:  stp     x25, x26, [sp, #-16]!    | |
   0xffffffc008ff588c:  stp     x27, x28, [sp, #-16]!    | |
   0xffffffc008ff5890:  mov     x25, sp                  | |
   0xffffffc008ff5894:  mov     x26, #0x0                | |
   0xffffffc008ff5898:  bti     j &lt;----------------------' |
   0xffffffc008ff589c:  sub     x27, x25, #0x0             |
   0xffffffc008ff58a0:  sub     sp, sp, #0x0               |
   0xffffffc008ff58a4:  mov     x0, #0xffffffc0ffffffff    |
   0xffffffc008ff58a8:  movk    x0, #0x8fc, lsl #16        |
   0xffffffc008ff58ac:  movk    x0, #0x6000                |
   0xffffffc008ff58b0:  mov     w1, #0x1                   |
   0xffffffc008ff58b4:  str     w1, [x0]                   |
   0xffffffc008ff58b8:  mov     w7, #0x0                   |
   0xffffffc008ff58bc:  mov     sp, sp                     |
   0xffffffc008ff58c0:  ldp     x27, x28, [sp], #16        |
   0xffffffc008ff58c4:  ldp     x25, x26, [sp], #16        |
   0xffffffc008ff58c8:  ldp     x21, x22, [sp], #16        |
   0xffffffc008ff58cc:  ldp     x19, x20, [sp], #16        |
   0xffffffc008ff58d0:  ldp     x29, x30, [sp], #16        |
   0xffffffc008ff58d4:  add     x0, x7, #0x0               |
   0xffffffc008ff58d8:  autiasp                            |
   0xffffffc008ff58dc:  ret -------------------------------'

Signed-off-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/20220617105735.733938-3-jakub@cloudflare.com
</content>
</entry>
<entry>
<title>bpf, arm64: Clear prog-&gt;jited_len along prog-&gt;jited</title>
<updated>2022-06-07T17:40:53Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2022-05-31T21:51:13Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=10f3b29c65bb2fe0d47c2945cd0b4087be1c5218'/>
<id>urn:sha1:10f3b29c65bb2fe0d47c2945cd0b4087be1c5218</id>
<content type='text'>
syzbot reported an illegal copy_to_user() attempt
from bpf_prog_get_info_by_fd() [1]

There is no repro for this bug yet, but I think that
commit 0aef499f3172 ("mm/usercopy: Detect vmalloc overruns")
is exposing a prior bug in bpf arm64.

bpf_prog_get_info_by_fd() looks at prog-&gt;jited_len
to determine if the JIT image can be copied out to user space.

My theory is that syzbot managed to get a prog where prog-&gt;jited_len
has been set to 43, while prog-&gt;bpf_func has been cleared.

It is not clear why copy_to_user(uinsns, NULL, ulen) is triggering
this particular warning.

I thought find_vm_area(NULL) would not find a vm_struct.
As we do not hold the vmap_area_lock spinlock, it is possible
that the found vm_struct was garbage.
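
A sketch of the fix the subject line describes, assuming the usual JIT
error path (surrounding code abridged):

  /* JIT failed: discard the image and leave no stale length behind
   * for bpf_prog_get_info_by_fd() to trust.
   */
  prog-&gt;bpf_func = NULL;
  prog-&gt;jited = 0;
  prog-&gt;jited_len = 0;    /* the added clear */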

[1]
usercopy: Kernel memory exposure attempt detected from vmalloc (offset 792633534417210172, size 43)!
kernel BUG at mm/usercopy.c:101!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 25002 Comm: syz-executor.1 Not tainted 5.18.0-syzkaller-10139-g8291eaafed36 #0
Hardware name: linux,dummy-virt (DT)
pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : usercopy_abort+0x90/0x94 mm/usercopy.c:101
lr : usercopy_abort+0x90/0x94 mm/usercopy.c:89
sp : ffff80000b773a20
x29: ffff80000b773a30 x28: faff80000b745000 x27: ffff80000b773b48
x26: 0000000000000000 x25: 000000000000002b x24: 0000000000000000
x23: 00000000000000e0 x22: ffff80000b75db67 x21: 0000000000000001
x20: 000000000000002b x19: ffff80000b75db3c x18: 00000000fffffffd
x17: 2820636f6c6c616d x16: 76206d6f72662064 x15: 6574636574656420
x14: 74706d6574746120 x13: 2129333420657a69 x12: 73202c3237313031
x11: 3237313434333533 x10: 3336323937207465 x9 : 657275736f707865
x8 : ffff80000a30c550 x7 : ffff80000b773830 x6 : ffff80000b773830
x5 : 0000000000000000 x4 : ffff00007fbbaa10 x3 : 0000000000000000
x2 : 0000000000000000 x1 : f7ff000028fc0000 x0 : 0000000000000064
Call trace:
 usercopy_abort+0x90/0x94 mm/usercopy.c:89
 check_heap_object mm/usercopy.c:186 [inline]
 __check_object_size mm/usercopy.c:252 [inline]
 __check_object_size+0x198/0x36c mm/usercopy.c:214
 check_object_size include/linux/thread_info.h:199 [inline]
 check_copy_size include/linux/thread_info.h:235 [inline]
 copy_to_user include/linux/uaccess.h:159 [inline]
 bpf_prog_get_info_by_fd.isra.0+0xf14/0xfdc kernel/bpf/syscall.c:3993
 bpf_obj_get_info_by_fd+0x12c/0x510 kernel/bpf/syscall.c:4253
 __sys_bpf+0x900/0x2150 kernel/bpf/syscall.c:4956
 __do_sys_bpf kernel/bpf/syscall.c:5021 [inline]
 __se_sys_bpf kernel/bpf/syscall.c:5019 [inline]
 __arm64_sys_bpf+0x28/0x40 kernel/bpf/syscall.c:5019
 __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
 invoke_syscall+0x48/0x114 arch/arm64/kernel/syscall.c:52
 el0_svc_common.constprop.0+0x44/0xec arch/arm64/kernel/syscall.c:142
 do_el0_svc+0xa0/0xc0 arch/arm64/kernel/syscall.c:206
 el0_svc+0x44/0xb0 arch/arm64/kernel/entry-common.c:624
 el0t_64_sync_handler+0x1ac/0x1b0 arch/arm64/kernel/entry-common.c:642
 el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:581
Code: aa0003e3 d00038c0 91248000 97fff65f (d4210000)

Fixes: db496944fdaa ("bpf: arm64: add JIT support for multi-function programs")
Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Song Liu &lt;songliubraving@fb.com&gt;
Link: https://lore.kernel.org/bpf/20220531215113.1100754-1-eric.dumazet@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf, arm64: Sign return address for JITed code</title>
<updated>2022-04-05T22:04:22Z</updated>
<author>
<name>Xu Kuohai</name>
<email>xukuohai@huawei.com</email>
</author>
<published>2022-04-02T07:39:42Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=042152c27c3bc3e20882f75c289ced32331f4010'/>
<id>urn:sha1:042152c27c3bc3e20882f75c289ced32331f4010</id>
<content type='text'>
Sign return address for JITed code when the kernel is built with pointer
authentication enabled:

1. Sign LR with the paciasp instruction before LR is pushed to the stack.
   Since paciasp acts like a landing pad for function entry, there is no
   need to insert a bti instruction before paciasp.

2. Authenticate LR with the autiasp instruction after LR is popped from
   the stack.

For BPF tail call, the stack frame constructed by the caller is reused by
the callee. That is, the stack frame is constructed by the caller and
destructed by the callee. Thus LR is signed and pushed to the stack in the
caller's prologue, and popped from the stack and authenticated in the
callee's epilogue.

For BPF2BPF call, the caller and callee construct their own stack frames,
and sign and authenticate their own LRs.
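
A sketch condensing points 1 and 2 above (emit helpers and macro
names as assumed from bpf_jit_comp.c; exact config guards abridged):

  /* prologue: sign LR, then save it */
  emit(A64_PACIASP, ctx);
  emit(A64_PUSH(A64_FP, A64_LR, A64_SP), ctx);
  ...
  /* epilogue: restore LR, then authenticate it */
  emit(A64_POP(A64_FP, A64_LR, A64_SP), ctx);
  emit(A64_AUTIASP, ctx);
  emit(A64_RET(A64_LR), ctx);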

Signed-off-by: Xu Kuohai &lt;xukuohai@huawei.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://events.static.linuxfound.org/sites/events/files/slides/slides_23.pdf
Link: https://lore.kernel.org/bpf/20220402073942.3782529-1-xukuohai@huawei.com
</content>
</entry>
</feed>
