Age | Commit message (Collapse) | Author | Files | Lines |
|
Convert internal parts of the SQ managment to the region API.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1fb73ced6b835cb319ab0fe1dc0b2e982a9a5650.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
A preparation patch, pass the context to io_register_free_rings.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c1865fd2b3d4db22d1a1aac7dd06ea22cb990834.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The patch implements mmap for the param region and enables the kernel
allocation mode. Internally it uses a fixed mmap offset, however the
user has to use the offset returned in
struct io_uring_region_desc::mmap_offset.
Note, mmap doesn't and can't take ->uring_lock and the region / ring
lookup is protected by ->mmap_lock, and it's directly peeking at
ctx->param_region. We can't protect io_create_region() with the
mmap_lock as it'd deadlock, which is why io_create_region_mmap_safe()
initialises it for us in a temporary variable and then publishes it
with the lock taken. It's intentionally decoupled from main region
helpers, and in the future we might want to have a list of active
regions, which then could be protected by the ->mmap_lock.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/0f1212bd6af7fb39b63514b34fae8948014221d1.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Allow the kernel to allocate memory for a region. That's the classical
way SQ/CQ are allocated. It's not yet useful to user space as there
is no way to mmap it, which is why it's explicitly disabled in
io_register_mem_region().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/7b8c40e6542546bbf93f4842a9a42a7373b81e0d.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Kernel allocated compound pages will have just one reference for the
entire page array, add a flag telling io_free_region about that.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a7abfa7535e9728d5fcade29a1ea1605ec2c04ce.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In preparation to adding kernel allocated regions extract a new helper
that pins user pages.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a17d7c39c3de4266b66b75b2dcf768150e1fc618.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We don't need to vmap if memory is already physically contiguous. There
are two important cases it covers: PAGE_SIZE regions and huge pages.
Use io_check_coalesce_buffer() to get the number of contiguous folios.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d5240af23064a824c29d14d2406f1ae764bf4505.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Regions are going to become more complex with allocation options and
optimisations, I want to split initialisation into steps and for that it
needs a sane fail path. Reuse io_free_region(), it's smart enough to
undo only what's needed and leaves the structure in a consistent state.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b853b4ec407cc80d033d021bdd2c14e22378fc78.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Move memory accounting before page pinning. It shouldn't even try to pin
pages if it's not allowed, and accounting is also relatively
inexpensive. It also give a better code structure as we do generic
accounting and then can branch for different mapping types.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1e242b8038411a222e8b269d35e021fa5015289f.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In preparation to kernel allocated regions add a flag telling if
the region contains user pinned pages or not.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/0dc91564642654405bab080b7ec911cb4a43ec6e.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add internal flags for struct io_mapped_region. The first flag we need
is IO_REGION_F_VMAPPED, that indicates that the pointer has to be
unmapped on region destruction. For now all regions are vmap'ed, so it's
set unconditionally.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/5a3d8046a038da97c0f8a8c8f1733fa3fc689d31.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_try_coalesce_buffer() is a useful helper collecting useful info about
a set of pages, I want to reuse it for analysing ring/etc. mappings. I
don't need the entire thing and only interested if it can be coalesced
into a single page, but that's better than duplicating the parsing.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/353b447953cd5d34c454a7d909bb6024c391d6e2.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
->resize_lock is used for resizing rings, but it's a good idea to reuse
it in other cases as well. Rename it into mmap_lock as it's protects
from races with mmap.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/68f705306f3ac4d2fb999eb80ea1615015ce9f7f.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
|
|
Pull KVM x86 fixes from Paolo Bonzini:
- Disable AVIC on SNP-enabled systems that don't allow writes to the
virtual APIC page, as such hosts will hit unexpected RMP #PFs in the
host when running VMs of any flavor.
- Fix a WARN in the hypercall completion path due to KVM trying to
determine if a guest with protected register state is in 64-bit mode
(KVM's ABI is to assume such guests only make hypercalls in 64-bit
mode).
- Allow the guest to write to supported bits in MSR_AMD64_DE_CFG to fix
a regression with Windows guests, and because KVM's read-only
behavior appears to be entirely made up.
- Treat TDP MMU faults as spurious if the faulting access is allowed
given the existing SPTE. This fixes a benign WARN (other than the
WARN itself) due to unexpectedly replacing a writable SPTE with a
read-only SPTE.
- Emit a warning when KVM is configured with ignore_msrs=1 and also to
hide the MSRs that the guest is looking for from the kernel logs.
ignore_msrs can trick guests into assuming that certain processor
features are present, and this in turn leads to bogus bug reports.
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: let it be known that ignore_msrs is a bad idea
KVM: VMX: don't include '<linux/find.h>' directly
KVM: x86/mmu: Treat TDP MMU faults as spurious if access is already allowed
KVM: SVM: Allow guest writes to set MSR_AMD64_DE_CFG bits
KVM: x86: Play nice with protected guests in complete_hypercall_exit()
KVM: SVM: Disable AVIC on SNP-enabled system without HvInUseWrAllowed feature
|
|
KVM x86 fixes for 6.13:
- Disable AVIC on SNP-enabled systems that don't allow writes to the virtual
APIC page, as such hosts will hit unexpected RMP #PFs in the host when
running VMs of any flavor.
- Fix a WARN in the hypercall completion path due to KVM trying to determine
if a guest with protected register state is in 64-bit mode (KVM's ABI is to
assume such guests only make hypercalls in 64-bit mode).
- Allow the guest to write to supported bits in MSR_AMD64_DE_CFG to fix a
regression with Windows guests, and because KVM's read-only behavior appears
to be entirely made up.
- Treat TDP MMU faults as spurious if the faulting access is allowed given the
existing SPTE. This fixes a benign WARN (other than the WARN itself) due to
unexpectedly replacing a writable SPTE with a read-only SPTE.
|
|
When running KVM with ignore_msrs=1 and report_ignored_msrs=0, the user has
no clue that that the guest is being lied to. This may cause bug reports
such as https://gitlab.com/qemu-project/qemu/-/issues/2571, where enabling
a CPUID bit in QEMU caused Linux guests to try reading MSR_CU_DEF_ERR; and
being lied about the existence of MSR_CU_DEF_ERR caused the guest to assume
other things about the local APIC which were not true:
Sep 14 12:02:53 kernel: mce: [Firmware Bug]: Your BIOS is not setting up LVT offset 0x2 for deferred error IRQs correctly.
Sep 14 12:02:53 kernel: unchecked MSR access error: RDMSR from 0x852 at rIP: 0xffffffffb548ffa7 (native_read_msr+0x7/0x40)
Sep 14 12:02:53 kernel: Call Trace:
...
Sep 14 12:02:53 kernel: native_apic_msr_read+0x20/0x30
Sep 14 12:02:53 kernel: setup_APIC_eilvt+0x47/0x110
Sep 14 12:02:53 kernel: mce_amd_feature_init+0x485/0x4e0
...
Sep 14 12:02:53 kernel: [Firmware Bug]: cpu 0, try to use APIC520 (LVT offset 2) for vector 0xf4, but the register is already in use for vector 0x0 on this cpu
Without reported_ignored_msrs=0 at least the host kernel log will contain
enough information to avoid going on a wild goose chase. But if reports
about individual MSR accesses are being silenced too, at least complain
loudly the first time a VM is started.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The header clearly states that it does not want to be included directly,
only via '<linux/bitmap.h>'. Replace the include accordingly.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Message-ID: <20241217070539.2433-2-wsa+renesas@sang-engineering.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Pull devicetree fixes from Rob Herring:
- Disable #address-cells/#size-cells warning on coreboot (Chromebooks)
platforms
- Add missing root #address-cells/#size-cells in default empty DT
- Fix uninitialized variable in of_irq_parse_one()
- Fix interrupt-map cell length check in of_irq_parse_imap_parent()
- Fix refcount handling in __of_get_dma_parent()
- Fix error path in of_parse_phandle_with_args_map()
- Fix dma-ranges handling with flags cells
- Drop explicit fw_devlink handling of 'interrupt-parent'
- Fix "compression" typo in fixed-partitions binding
- Unify "fsl,liodn" property type definitions
* tag 'devicetree-fixes-for-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of: Add coreboot firmware to excluded default cells list
of/irq: Fix using uninitialized variable @addr_len in API of_irq_parse_one()
of/irq: Fix interrupt-map cell length check in of_irq_parse_imap_parent()
of: Fix refcount leakage for OF node returned by __of_get_dma_parent()
of: Fix error path in of_parse_phandle_with_args_map()
dt-bindings: mtd: fixed-partitions: Fix "compression" typo
of: Add #address-cells/#size-cells in the device-tree root empty node
dt-bindings: Unify "fsl,liodn" type definitions
of: address: Preserve the flags portion on 1:1 dma-ranges mapping
of/unittest: Add empty dma-ranges address translation tests
of: property: fw_devlink: Do not use interrupt-parent directly
|
|
Pull SoC fixes from Arnd Bergmann:
"Two more small fixes, correcting the cacheline size on Raspberry Pi 5
and fixing a logic mistake in the microchip mpfs firmware driver"
* tag 'soc-fixes-6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
arm64: dts: broadcom: Fix L2 linesize for Raspberry Pi 5
firmware: microchip: fix UL_IAP lock check in mpfs_auto_update_state()
|
|
Pull misc fixes from Andrew Morton:
"25 hotfixes. 16 are cc:stable. 19 are MM and 6 are non-MM.
The usual bunch of singletons and doubletons - please see the relevant
changelogs for details"
* tag 'mm-hotfixes-stable-2024-12-21-12-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (25 commits)
mm: huge_memory: handle strsep not finding delimiter
alloc_tag: fix set_codetag_empty() when !CONFIG_MEM_ALLOC_PROFILING_DEBUG
alloc_tag: fix module allocation tags populated area calculation
mm/codetag: clear tags before swap
mm/vmstat: fix a W=1 clang compiler warning
mm: convert partially_mapped set/clear operations to be atomic
nilfs2: fix buffer head leaks in calls to truncate_inode_pages()
vmalloc: fix accounting with i915
mm/page_alloc: don't call pfn_to_page() on possibly non-existent PFN in split_large_buddy()
fork: avoid inappropriate uprobe access to invalid mm
nilfs2: prevent use of deleted inode
zram: fix uninitialized ZRAM not releasing backing device
zram: refuse to use zero sized block device as backing device
mm: use clear_user_(high)page() for arch with special user folio handling
mm: introduce cpu_icache_is_aliasing() across all architectures
mm: add RCU annotation to pte_offset_map(_lock)
mm: correctly reference merged VMA
mm: use aligned address in copy_user_gigantic_page()
mm: use aligned address in clear_gigantic_page()
mm: shmem: fix ShmemHugePages at swapout
...
|
|
My tests run an allyesconfig build and it failed with the following errors:
LD [M] samples/kfifo/dma-example.ko
ld.lld: error: undefined symbol: nec7210_board_reset
ld.lld: error: undefined symbol: nec7210_read
ld.lld: error: undefined symbol: nec7210_write
It appears that some modules call the function nec7210_board_reset()
that is defined in nec7210.c. In an allyesconfig build, these other
modules are built in. But the file that holds nec7210_board_reset()
has:
obj-m += nec7210.o
Where that "-m" means it only gets built as a module. With the other
modules built in, they have no access to nec7210_board_reset() and the build
fails.
This isn't the only function. After fixing that one, I hit another:
ld.lld: error: undefined symbol: push_gpib_event
ld.lld: error: undefined symbol: gpib_match_device_path
Where push_gpib_event() was also used outside of the file it was defined
in, and that file too only was built as a module.
Since the directory that nec7210.c is only traversed when
CONFIG_GPIB_NEC7210 is set, and the directory with gpib_common.c is only
traversed when CONFIG_GPIB_COMMON is set, use those configs as the
option to build those modules. When it is an allyesconfig, then they
will both be built in and their functions will be available to the other
modules that are also built in.
Fixes: 3ba84ac69b53e ("staging: gpib: Add nec7210 GPIB chip driver")
Fixes: 9dde4559e9395 ("staging: gpib: Add GPIB common core driver")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull Kbuild fixes from Masahiro Yamada:
- Remove stale code in usr/include/headers_check.pl
- Fix issues in the user-mode-linux Debian package
- Fix false-positive "export twice" errors in modpost
* tag 'kbuild-fixes-v6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
modpost: distinguish same module paths from different dump files
kbuild: deb-pkg: Do not install maint scripts for arch 'um'
kbuild: deb-pkg: add debarch for ARCH=um
kbuild: Drop support for include/asm-<arch> in headers_check.pl
|
|
Pull BPF fixes from Daniel Borkmann:
- Fix inlining of bpf_get_smp_processor_id helper for !CONFIG_SMP
systems (Andrea Righi)
- Fix BPF USDT selftests helper code to use asm constraint "m" for
LoongArch (Tiezhu Yang)
- Fix BPF selftest compilation error in get_uprobe_offset when
PROCMAP_QUERY is not defined (Jerome Marchand)
- Fix BPF bpf_skb_change_tail helper when used in context of BPF
sockmap to handle negative skb header offsets (Cong Wang)
- Several fixes to BPF sockmap code, among others, in the area of
socket buffer accounting (Levi Zim, Zijian Zhang, Cong Wang)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Test bpf_skb_change_tail() in TC ingress
selftests/bpf: Introduce socket_helpers.h for TC tests
selftests/bpf: Add a BPF selftest for bpf_skb_change_tail()
bpf: Check negative offsets in __bpf_skb_min_len()
tcp_bpf: Fix copied value in tcp_bpf_sendmsg
skmsg: Return copied bytes in sk_msg_memcopy_from_iter
tcp_bpf: Add sk_rmem_alloc related logic for tcp_bpf ingress redirection
tcp_bpf: Charge receive socket buffer in bpf_tcp_ingress()
selftests/bpf: Fix compilation error in get_uprobe_offset()
selftests/bpf: Use asm constraint "m" for LoongArch
bpf: Fix bpf_get_smp_processor_id() on !CONFIG_SMP
|
|
Pull media fixes from Mauro Carvalho Chehab:
- fix a clang build issue with mediatec vcodec
- add missing variable initialization to dib3000mb write function
* tag 'media/v6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: mediatek: vcodec: mark vdec_vp9_slice_map_counts_eob_coef noinline
media: dvb-frontends: dib3000mb: fix uninit-value in dib3000_write_reg
|
|
Pull PCI fixes from Krzysztof Wilczyński:
"Two small patches that are important for fixing boot time hang on
Intel JHL7540 'Titan Ridge' platforms equipped with a Thunderbolt
controller.
The boot time issue manifests itself when a PCI Express bandwidth
control is unnecessarily enabled on the Thunderbolt controller
downstream ports, which only supports a link speed of 2.5 GT/s in
accordance with USB4 v2 specification (p. 671, sec. 11.2.1, "PCIe
Physical Layer Logical Sub-block").
As such, there is no need to enable bandwidth control on such
downstream port links, which also works around the issue.
Both patches were tested by the original reporter on the hardware on
which the failure origin golly manifested itself. Both fixes were
proven to resolve the reported boot hang issue, and both patches have
been in linux-next this week with no reported problems"
* tag 'pci-v6.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI/bwctrl: Enable only if more than one speed is supported
PCI: Honor Max Link Speed when determining supported speeds
|
|
Pull power management fixes from Rafael Wysocki:
"These fix some amd-pstate driver issues:
- Detect preferred core support in amd-pstate before driver
registration to avoid initialization ordering issues (K Prateek
Nayak)
- Fix issues with with boost numerator handling in amd-pstate leading
to inconsistently programmed CPPC max performance values (Mario
Limonciello)"
* tag 'pm-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq/amd-pstate: Use boost numerator for upper bound of frequencies
cpufreq/amd-pstate: Store the boost numerator as highest perf again
cpufreq/amd-pstate: Detect preferred core support before driver registration
|
|
Pull thermal control fixes from Rafael Wysocki:
"Fix two issues with the user thermal thresholds feature introduced in
this development cycle (Daniel Lezcano)"
* tag 'thermal-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal/thresholds: Fix boundaries and detection routine
thermal/thresholds: Fix uapi header macros leading to a compilation error
|
|
Pull ACPI fix from Rafael Wysocki:
"Unbreak ACPI EC support on LoongArch that has been broken earlier in
this development cycle (Huacai Chen)"
* tag 'acpi-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: EC: Enable EC support on LoongArch by default
|
|
Pull smb client fixes from Steve French:
- fix regression in display of write stats
- fix rmmod failure with network namespaces
- two minor cleanups
* tag '6.13-rc3-SMB3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: fix bytes written value in /proc/fs/cifs/Stats
smb: client: fix TCP timers deadlock after rmmod
smb: client: Deduplicate "select NETFS_SUPPORT" in Kconfig
smb: use macros instead of constants for leasekey size and default cifsattrs value
|
|
Pull NFS client fixes from Trond Myklebust:
- NFS/pnfs: Fix a live lock between recalled layouts and layoutget
- Fix a build warning about an undeclared symbol 'nfs_idmap_cache_timeout'
* tag 'nfs-for-6.13-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
fs/nfs: fix missing declaration of nfs_idmap_cache_timeout
NFS/pnfs: Fix a live lock between recalled layouts and layoutget
|
|
Pull ceph fixes from Ilya Dryomov:
"A handful of important CephFS fixes from Max, Alex and myself: memory
corruption due to a buffer overrun, potential infinite loop and
several memory leaks on the error paths. All but one marked for
stable"
* tag 'ceph-for-6.13-rc4' of https://github.com/ceph/ceph-client:
ceph: allocate sparse_ext map only for sparse reads
ceph: fix memory leak in ceph_direct_read_write()
ceph: improve error handling and short/overflow-read logic in __ceph_sync_read()
ceph: validate snapdirname option length when mounting
ceph: give up on paths longer than PATH_MAX
ceph: fix memory leaks in __ceph_sync_read()
|
|
Since commit 13b25489b6f8 ("kbuild: change working directory to external
module directory with M="), module paths are always relative to the top
of the external module tree.
The module paths recorded in Module.symvers are no longer globally unique
when they are passed via KBUILD_EXTRA_SYMBOLS for building other external
modules, which may result in false-positive "exported twice" errors.
Such errors should not occur because external modules should be able to
override in-tree modules.
To address this, record the dump file path in struct module and check it
when searching for a module.
Fixes: 13b25489b6f8 ("kbuild: change working directory to external module directory with M=")
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Closes: https://lore.kernel.org/all/eb21a546-a19c-40df-b821-bbba80f19a3d@nvidia.com/
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
|
|
Stop installing Debian maintainer scripts when building a
user-mode-linux Debian package.
Debian maintainer scripts are used for e.g. requesting rebuilds of
initrd, rebuilding DKMS modules and updating of grub configuration. As
all of this is not relevant for UML but also may lead to failures while
processing the kernel hooks, do no more install maintainer scripts for
the UML package.
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Nicolas Schier <nicolas@fjasle.eu>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
'make ARCH=um bindeb-pkg' shows the following warning.
$ make ARCH=um bindeb-pkg
[snip]
GEN debian
** ** ** WARNING ** ** **
Your architecture doesn't have its equivalent
Debian userspace architecture defined!
Falling back to the current host architecture (amd64).
Please add support for um to ./scripts/package/mkdebian ...
This commit hard-codes i386/amd64 because UML is only supported for x86.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
|
|
"include/asm-<arch>" was replaced by "arch/<arch>/include/asm" a long
time ago. All assembler header files are now included using
"#include <asm/*>", so there is no longer a need to rewrite paths.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Similarly to the previous test, we also need a test case to cover
positive offsets as well, TC is an excellent hook for this.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Zijian Zhang <zijianzhang@bytedance.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20241213034057.246437-5-xiyou.wangcong@gmail.com
|
|
Pull socket helpers out of sockmap_helpers.h so that they can be reused
for TC tests as well. This prepares for the next patch.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20241213034057.246437-4-xiyou.wangcong@gmail.com
|
|
As requested by Daniel, we need to add a selftest to cover
bpf_skb_change_tail() cases in skb_verdict. Here we test trimming,
growing and error cases, and validate its expected return values and the
expected sizes of the payload.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20241213034057.246437-3-xiyou.wangcong@gmail.com
|
|
skb_network_offset() and skb_transport_offset() can be negative when
they are called after we pull the transport header, for example, when
we use eBPF sockmap at the point of ->sk_data_ready().
__bpf_skb_min_len() uses an unsigned int to get these offsets, this
leads to a very large number which then causes bpf_skb_change_tail()
failed unexpectedly.
Fix this by using a signed int to get these offsets and ensure the
minimum is at least zero.
Fixes: 5293efe62df8 ("bpf: add bpf_skb_change_tail helper")
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20241213034057.246437-2-xiyou.wangcong@gmail.com
|
|
Pull arm64 fix from Catalin Marinas:
"Fix a sparse warning in the arm64 signal code dealing with the user
shadow stack register, GCSPR_EL0"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/signal: Silence sparse warning storing GCSPR_EL0
|
|
bpf kselftest sockhash::test_txmsg_cork_hangs in test_sockmap.c triggers a
kernel NULL pointer dereference:
BUG: kernel NULL pointer dereference, address: 0000000000000008
? __die_body+0x6e/0xb0
? __die+0x8b/0xa0
? page_fault_oops+0x358/0x3c0
? local_clock+0x19/0x30
? lock_release+0x11b/0x440
? kernelmode_fixup_or_oops+0x54/0x60
? __bad_area_nosemaphore+0x4f/0x210
? mmap_read_unlock+0x13/0x30
? bad_area_nosemaphore+0x16/0x20
? do_user_addr_fault+0x6fd/0x740
? prb_read_valid+0x1d/0x30
? exc_page_fault+0x55/0xd0
? asm_exc_page_fault+0x2b/0x30
? splice_to_socket+0x52e/0x630
? shmem_file_splice_read+0x2b1/0x310
direct_splice_actor+0x47/0x70
splice_direct_to_actor+0x133/0x300
? do_splice_direct+0x90/0x90
do_splice_direct+0x64/0x90
? __ia32_sys_tee+0x30/0x30
do_sendfile+0x214/0x300
__se_sys_sendfile64+0x8e/0xb0
__x64_sys_sendfile64+0x25/0x30
x64_sys_call+0xb82/0x2840
do_syscall_64+0x75/0x110
entry_SYSCALL_64_after_hwframe+0x4b/0x53
This is caused by tcp_bpf_sendmsg() returning a larger value(12289) than
size (8192), which causes the while loop in splice_to_socket() to release
an uninitialized pipe buf.
The underlying cause is that this code assumes sk_msg_memcopy_from_iter()
will copy all bytes upon success but it actually might only copy part of
it.
This commit changes it to use the real copied bytes.
Signed-off-by: Levi Zim <rsworktech@outlook.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Björn Töpel <bjorn@kernel.org>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20241130-tcp-bpf-sendmsg-v1-2-bae583d014f3@outlook.com
|
|
Previously sk_msg_memcopy_from_iter returns the copied bytes from the
last copy_from_iter{,_nocache} call upon success.
This commit changes it to return the total number of copied bytes on
success.
Signed-off-by: Levi Zim <rsworktech@outlook.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Björn Töpel <bjorn@kernel.org>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20241130-tcp-bpf-sendmsg-v1-1-bae583d014f3@outlook.com
|
|
Pull hwmon fixes from Guenter Roeck:
- Fix reporting of negative temperature, current, and voltage values in
the tmp513 driver
* tag 'hwmon-for-v6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (tmp513) Fix interpretation of values of Temperature Result and Limit Registers
hwmon: (tmp513) Fix Current Register value interpretation
hwmon: (tmp513) Fix interpretation of values of Shunt Voltage and Limit Registers
|
|
Google Juniper and other Chromebook platforms have a very old bootloader
which populates /firmware node without proper address/size-cells leading
to warnings:
Missing '#address-cells' in /firmware
WARNING: CPU: 0 PID: 1 at drivers/of/base.c:106 of_bus_n_addr_cells+0x90/0xf0
Modules linked in:
CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0 #1 933ab9971ff4d5dc58cb378a96f64c7f72e3454d
Hardware name: Google juniper sku16 board (DT)
...
Missing '#size-cells' in /firmware
WARNING: CPU: 0 PID: 1 at drivers/of/base.c:133 of_bus_n_size_cells+0x90/0xf0
Modules linked in:
CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.12.0 #1 933ab9971ff4d5dc58cb378a96f64c7f72e3454d
Tainted: [W]=WARN
Hardware name: Google juniper sku16 board (DT)
These platform won't receive updated bootloader/firmware, so add an
exclusion for platforms with a "coreboot" compatible node. While this is
wider than necessary, that's the easiest fix and it doesn't doesn't
matter if we miss checking other platforms using coreboot.
We may revisit this later and address with a fixup to the DT itself.
Reported-by: Sasha Levin <sashal@kernel.org>
Closes: https://lore.kernel.org/all/Z0NUdoG17EwuCigT@sashalap/
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Chen-Yu Tsai <wenst@chromium.org>
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Pull block fixes from Jens Axboe:
- Minor cleanups for bdev/nvme using the helpers introduced
- Revert of a deadlock fix that still needs more work
- Fix a UAF of hctx in the cpu hotplug code
* tag 'block-6.13-20241220' of git://git.kernel.dk/linux:
block: avoid to reuse `hctx` not removed from cpuhp callback list
block: Revert "block: Fix potential deadlock while freezing queue and acquiring sysfs_lock"
nvme: use blk_validate_block_size() for max LBA check
block/bdev: use helper for max block size check
|
|
Pull io_uring fixes from Jens Axboe:
- Fix for a file ref leak for registered ring fds
- Turn the ->timeout_lock into a raw spinlock, as it nests under the
io-wq lock which is a raw spinlock as it's called from the scheduler
side
- Limit ring resizing to DEFER_TASKRUN for now. We will broaden this in
the future, but for now, ensure that it's only feasible on rings with
a single user
- Add sanity check for io-wq enqueuing
* tag 'io_uring-6.13-20241220' of git://git.kernel.dk/linux:
io_uring: check if iowq is killed before queuing
io_uring/register: limit ring resizing to DEFER_TASKRUN
io_uring: Fix registered ring file refcount leak
io_uring: make ctx->timeout_lock a raw spinlock
|
|
Pull USB / Thunderbolt fixes from Greg KH:
"Here are some important, and small, fixes for USB and Thunderbolt
issues that have come up in the -rc releases. And some new device ids
for good measure. Included in here are:
- Much reported xhci bugfix for usb-storage devices (and other
devices as well, tripped me up on a video camera)
- thunderbolt fixes for some small reported issues
- new usb-serial device ids
All of these have been in linux-next this week with no reported issues"
* tag 'usb-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: xhci: fix ring expansion regression in 6.13-rc1
xhci: Turn NEC specific quirk for handling Stop Endpoint errors generic
thunderbolt: Improve redrive mode handling
USB: serial: option: add Telit FE910C04 rmnet compositions
USB: serial: option: add MediaTek T7XX compositions
USB: serial: option: add Netprisma LCUK54 modules for WWAN Ready
USB: serial: option: add MeiG Smart SLM770A
USB: serial: option: add TCL IK512 MBIM & ECM
thunderbolt: Don't display nvm_version unless upgrade supported
thunderbolt: Add support for Intel Panther Lake-M/P
|
|
Pull spi fix from Mark Brown:
"A fix for the remove path of the Rockchip driver, the code was just
clearly and obviously wrong"
* tag 'spi-fix-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: rockchip-sfc: Fix error in remove progress
|
|
Pull regulator fix from Mark Brown:
"The recently added regulator-uv-survival-time-ms property was renamed
during the review of the series that added it, but unfortunately only
in the DT binding and not in the code that parses the binding.
This brings the code in line with the binding, if someone started
using the original name we can add compat support for it but there's
nothing upstream yet and it's a very niche feature so hopefully not"
* tag 'regulator-fix-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: rename regulator-uv-survival-time-ms according to DT binding
|