|
After running once, the for_each_trip_desc() loop in
bang_bang_manage() is pure needless overhead because it is not going to
make any changes unless a new cooling device has been bound to one of
the trips in the thermal zone or the system is resuming from sleep.
For this reason, make bang_bang_manage() set governor_data for the
thermal zone and check it upfront to decide whether or not it needs to
do anything.
However, governor_data needs to be reset in some cases to let
bang_bang_manage() know that it should walk the trips again, so add an
.update_tz() callback to the governor and make the core additionally
invoke it during system resume.
To avoid affecting the other users of that callback unnecessarily, add
a special notification reason for system resume, THERMAL_TZ_RESUME, and
also pass it to __thermal_zone_device_update() called during system
resume for consistency.
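
The caching scheme described above can be sketched in userspace C. This is only an illustration under stated assumptions: struct zone, manage(), update_tz() and the enum values are hypothetical stand-ins, not the thermal core's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
enum notify_reason { ZONE_BINDING_CHANGED, ZONE_RESUME };

struct zone {
	void *governor_data;   /* non-NULL once the trips have been walked */
	int trip_walks;        /* counts full walks, for illustration only */
};

/* Walk the trips only if nothing has been cached since the last reset. */
static void manage(struct zone *tz)
{
	if (tz->governor_data)
		return;

	tz->trip_walks++;          /* stands in for the per-trip loop */
	tz->governor_data = tz;    /* any non-NULL marker will do */
}

/* Reset the marker so that the next manage() call walks the trips again. */
static void update_tz(struct zone *tz, enum notify_reason reason)
{
	(void)reason;   /* both reasons modeled here invalidate the cache */
	tz->governor_data = NULL;
}
```

Repeated manage() calls are then cheap until update_tz() runs, which is the behavior the commit message asks for.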
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Kästle <peter@piie.net>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Cc: 6.10+ <stable@vger.kernel.org> # 6.10+
Link: https://patch.msgid.link/2285575.iZASKD2KPV@rjwysocki.net
|
|
After recent changes, the Bang-bang governor may not adjust the
initial configuration of cooling devices to the actual situation.
Namely, if a cooling device bound to a certain trip point starts in
the "on" state and the thermal zone temperature is below the threshold
of that trip point, the trip point may never be crossed on the way up
in which case the state of the cooling device will never be adjusted
because the thermal core will never invoke the governor's
.trip_crossed() callback. [Note that there is no issue if the zone
temperature is at the trip threshold or above it to start with because
.trip_crossed() will be invoked then to indicate the start of thermal
mitigation for the given trip.]
To address this, add a .manage() callback to the Bang-bang governor
and use it to ensure that all of the thermal instances managed by the
governor have been initialized properly and the states of all of the
cooling devices involved have been adjusted to the current zone
temperature as appropriate.
Fixes: 530c932bdf75 ("thermal: gov_bang_bang: Use .trip_crossed() instead of .throttle()")
Link: https://lore.kernel.org/linux-pm/1bfbbae5-42b0-4c7d-9544-e98855715294@piie.net/
Cc: 6.10+ <stable@vger.kernel.org> # 6.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Kästle <peter@piie.net>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Link: https://patch.msgid.link/8419356.T7Z3S40VBb@rjwysocki.net
|
|
Move the setting of the thermal instance target state from
bang_bang_control() into a separate function that will be also called
in a different place going forward.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Kästle <peter@piie.net>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Cc: 6.10+ <stable@vger.kernel.org> # 6.10+
Link: https://patch.msgid.link/3313587.aeNJFYEL58@rjwysocki.net
|
|
Instead of clearing the "updated" flag for each cooling device
affected by the trip point crossing in bang_bang_control() and
walking all thermal instances to run thermal_cdev_update() for all
of the affected cooling devices, call __thermal_cdev_update()
directly for each of them.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Kästle <peter@piie.net>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Cc: 6.10+ <stable@vger.kernel.org> # 6.10+
Link: https://patch.msgid.link/13583081.uLZWGnKmhe@rjwysocki.net
|
|
Lockdep reported a warning in Linux version 6.6:
[ 414.344659] ================================
[ 414.345155] WARNING: inconsistent lock state
[ 414.345658] 6.6.0-07439-gba2303cacfda #6 Not tainted
[ 414.346221] --------------------------------
[ 414.346712] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[ 414.347545] kworker/u10:3/1152 [HC0[0]:SC0[0]:HE0:SE1] takes:
[ 414.349245] ffff88810edd1098 (&sbq->ws[i].wait){+.?.}-{2:2}, at: blk_mq_dispatch_rq_list+0x131c/0x1ee0
[ 414.351204] {IN-SOFTIRQ-W} state was registered at:
[ 414.351751] lock_acquire+0x18d/0x460
[ 414.352218] _raw_spin_lock_irqsave+0x39/0x60
[ 414.352769] __wake_up_common_lock+0x22/0x60
[ 414.353289] sbitmap_queue_wake_up+0x375/0x4f0
[ 414.353829] sbitmap_queue_clear+0xdd/0x270
[ 414.354338] blk_mq_put_tag+0xdf/0x170
[ 414.354807] __blk_mq_free_request+0x381/0x4d0
[ 414.355335] blk_mq_free_request+0x28b/0x3e0
[ 414.355847] __blk_mq_end_request+0x242/0xc30
[ 414.356367] scsi_end_request+0x2c1/0x830
[ 414.356863] scsi_io_completion+0x177/0x1610
[ 414.357379] scsi_complete+0x12f/0x260
[ 414.357856] blk_complete_reqs+0xba/0xf0
[ 414.358338] __do_softirq+0x1b0/0x7a2
[ 414.358796] irq_exit_rcu+0x14b/0x1a0
[ 414.359262] sysvec_call_function_single+0xaf/0xc0
[ 414.359828] asm_sysvec_call_function_single+0x1a/0x20
[ 414.360426] default_idle+0x1e/0x30
[ 414.360873] default_idle_call+0x9b/0x1f0
[ 414.361390] do_idle+0x2d2/0x3e0
[ 414.361819] cpu_startup_entry+0x55/0x60
[ 414.362314] start_secondary+0x235/0x2b0
[ 414.362809] secondary_startup_64_no_verify+0x18f/0x19b
[ 414.363413] irq event stamp: 428794
[ 414.363825] hardirqs last enabled at (428793): [<ffffffff816bfd1c>] ktime_get+0x1dc/0x200
[ 414.364694] hardirqs last disabled at (428794): [<ffffffff85470177>] _raw_spin_lock_irq+0x47/0x50
[ 414.365629] softirqs last enabled at (428444): [<ffffffff85474780>] __do_softirq+0x540/0x7a2
[ 414.366522] softirqs last disabled at (428419): [<ffffffff813f65ab>] irq_exit_rcu+0x14b/0x1a0
[ 414.367425]
other info that might help us debug this:
[ 414.368194] Possible unsafe locking scenario:
[ 414.368900] CPU0
[ 414.369225] ----
[ 414.369548] lock(&sbq->ws[i].wait);
[ 414.370000] <Interrupt>
[ 414.370342] lock(&sbq->ws[i].wait);
[ 414.370802]
*** DEADLOCK ***
[ 414.371569] 5 locks held by kworker/u10:3/1152:
[ 414.372088] #0: ffff88810130e938 ((wq_completion)writeback){+.+.}-{0:0}, at: process_scheduled_works+0x357/0x13f0
[ 414.373180] #1: ffff88810201fdb8 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: process_scheduled_works+0x3a3/0x13f0
[ 414.374384] #2: ffffffff86ffbdc0 (rcu_read_lock){....}-{1:2}, at: blk_mq_run_hw_queue+0x637/0xa00
[ 414.375342] #3: ffff88810edd1098 (&sbq->ws[i].wait){+.?.}-{2:2}, at: blk_mq_dispatch_rq_list+0x131c/0x1ee0
[ 414.376377] #4: ffff888106205a08 (&hctx->dispatch_wait_lock){+.-.}-{2:2}, at: blk_mq_dispatch_rq_list+0x1337/0x1ee0
[ 414.378607]
stack backtrace:
[ 414.379177] CPU: 0 PID: 1152 Comm: kworker/u10:3 Not tainted 6.6.0-07439-gba2303cacfda #6
[ 414.380032] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 414.381177] Workqueue: writeback wb_workfn (flush-253:0)
[ 414.381805] Call Trace:
[ 414.382136] <TASK>
[ 414.382429] dump_stack_lvl+0x91/0xf0
[ 414.382884] mark_lock_irq+0xb3b/0x1260
[ 414.383367] ? __pfx_mark_lock_irq+0x10/0x10
[ 414.383889] ? stack_trace_save+0x8e/0xc0
[ 414.384373] ? __pfx_stack_trace_save+0x10/0x10
[ 414.384903] ? graph_lock+0xcf/0x410
[ 414.385350] ? save_trace+0x3d/0xc70
[ 414.385808] mark_lock.part.20+0x56d/0xa90
[ 414.386317] mark_held_locks+0xb0/0x110
[ 414.386791] ? __pfx_do_raw_spin_lock+0x10/0x10
[ 414.387320] lockdep_hardirqs_on_prepare+0x297/0x3f0
[ 414.387901] ? _raw_spin_unlock_irq+0x28/0x50
[ 414.388422] trace_hardirqs_on+0x58/0x100
[ 414.388917] _raw_spin_unlock_irq+0x28/0x50
[ 414.389422] __blk_mq_tag_busy+0x1d6/0x2a0
[ 414.389920] __blk_mq_get_driver_tag+0x761/0x9f0
[ 414.390899] blk_mq_dispatch_rq_list+0x1780/0x1ee0
[ 414.391473] ? __pfx_blk_mq_dispatch_rq_list+0x10/0x10
[ 414.392070] ? sbitmap_get+0x2b8/0x450
[ 414.392533] ? __blk_mq_get_driver_tag+0x210/0x9f0
[ 414.393095] __blk_mq_sched_dispatch_requests+0xd99/0x1690
[ 414.393730] ? elv_attempt_insert_merge+0x1b1/0x420
[ 414.394302] ? __pfx___blk_mq_sched_dispatch_requests+0x10/0x10
[ 414.394970] ? lock_acquire+0x18d/0x460
[ 414.395456] ? blk_mq_run_hw_queue+0x637/0xa00
[ 414.395986] ? __pfx_lock_acquire+0x10/0x10
[ 414.396499] blk_mq_sched_dispatch_requests+0x109/0x190
[ 414.397100] blk_mq_run_hw_queue+0x66e/0xa00
[ 414.397616] blk_mq_flush_plug_list.part.17+0x614/0x2030
[ 414.398244] ? __pfx_blk_mq_flush_plug_list.part.17+0x10/0x10
[ 414.398897] ? writeback_sb_inodes+0x241/0xcc0
[ 414.399429] blk_mq_flush_plug_list+0x65/0x80
[ 414.399957] __blk_flush_plug+0x2f1/0x530
[ 414.400458] ? __pfx___blk_flush_plug+0x10/0x10
[ 414.400999] blk_finish_plug+0x59/0xa0
[ 414.401467] wb_writeback+0x7cc/0x920
[ 414.401935] ? __pfx_wb_writeback+0x10/0x10
[ 414.402442] ? mark_held_locks+0xb0/0x110
[ 414.402931] ? __pfx_do_raw_spin_lock+0x10/0x10
[ 414.403462] ? lockdep_hardirqs_on_prepare+0x297/0x3f0
[ 414.404062] wb_workfn+0x2b3/0xcf0
[ 414.404500] ? __pfx_wb_workfn+0x10/0x10
[ 414.404989] process_scheduled_works+0x432/0x13f0
[ 414.405546] ? __pfx_process_scheduled_works+0x10/0x10
[ 414.406139] ? do_raw_spin_lock+0x101/0x2a0
[ 414.406641] ? assign_work+0x19b/0x240
[ 414.407106] ? lock_is_held_type+0x9d/0x110
[ 414.407604] worker_thread+0x6f2/0x1160
[ 414.408075] ? __kthread_parkme+0x62/0x210
[ 414.408572] ? lockdep_hardirqs_on_prepare+0x297/0x3f0
[ 414.409168] ? __kthread_parkme+0x13c/0x210
[ 414.409678] ? __pfx_worker_thread+0x10/0x10
[ 414.410191] kthread+0x33c/0x440
[ 414.410602] ? __pfx_kthread+0x10/0x10
[ 414.411068] ret_from_fork+0x4d/0x80
[ 414.411526] ? __pfx_kthread+0x10/0x10
[ 414.411993] ret_from_fork_asm+0x1b/0x30
[ 414.412489] </TASK>
When interrupts are turned on while a lock taken with spin_lock_irq() is
still held, lockdep throws a warning because of a potential deadlock.
blk_mq_prep_dispatch_rq
 blk_mq_get_driver_tag
  __blk_mq_get_driver_tag
   __blk_mq_alloc_driver_tag
    blk_mq_tag_busy -> tag is already busy
    // failed to get driver tag
 blk_mq_mark_tag_wait
  spin_lock_irq(&wq->lock) -> lock A (&sbq->ws[i].wait)
  __add_wait_queue(wq, wait) -> wait queue active
  blk_mq_get_driver_tag
   __blk_mq_tag_busy
-> 1) tag must be idle, which means there can't be inflight IO
    spin_lock_irq(&tags->lock) -> lock B (hctx->tags)
    spin_unlock_irq(&tags->lock) -> unlock B, turn on interrupt accidentally
-> 2) context must be preempted by IO interrupt to trigger deadlock.
As shown above, the deadlock is not possible in theory, but the warning
still needs to be fixed.
Fix it by using spin_lock_irqsave() instead of spin_lock_irq() to take lock B.
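
The difference between the two locking variants can be modeled in userspace with a single flag standing in for the local interrupt-enable state. This is a toy model, not kernel code: the model_* helpers are hypothetical and do no actual locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one flag stands in for the CPU's local interrupt-enable state. */
static bool irqs_enabled = true;

static void model_lock_irq(void)   { irqs_enabled = false; }
static void model_unlock_irq(void) { irqs_enabled = true; }  /* unconditional */

static bool model_lock_irqsave(void)
{
	bool flags = irqs_enabled;   /* remember the caller's state */

	irqs_enabled = false;
	return flags;
}

static void model_unlock_irqrestore(bool flags)
{
	irqs_enabled = flags;        /* put back whatever the caller had */
}

/* Inner section uses the _irq variants: interrupts come back on too early. */
static bool nested_with_irq(void)
{
	bool enabled_while_outer_held;

	model_lock_irq();            /* outer lock A */
	model_lock_irq();            /* inner lock B */
	model_unlock_irq();          /* unlock B re-enables interrupts */
	enabled_while_outer_held = irqs_enabled;
	model_unlock_irq();          /* unlock A */
	return enabled_while_outer_held;
}

/* Inner section uses irqsave/irqrestore: the outer state is preserved. */
static bool nested_with_irqsave(void)
{
	bool enabled_while_outer_held, flags;

	model_lock_irq();            /* outer lock A */
	flags = model_lock_irqsave();
	model_unlock_irqrestore(flags);
	enabled_while_outer_held = irqs_enabled;
	model_unlock_irq();          /* unlock A */
	return enabled_while_outer_held;
}
```

In this model nested_with_irq() reports interrupts enabled while lock A is still held, which is the state lockdep complains about, while nested_with_irqsave() does not.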
Fixes: 4f1731df60f9 ("blk-mq: fix potential io hang by wrong 'wake_batch'")
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240815024736.2040971-1-lilingfeng@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
read_balance() will avoid reading from slow disks as much as possible,
however, if valid data only lands in slow disks, and a new normal disk
is still in recovery, unrecovered data can be read:
raid1_read_request
 read_balance
  raid1_should_read_first
  -> return false
  choose_best_rdev
  -> normal disk is not recovered, return -1
  choose_bb_rdev
  -> missing the checking of recovery, return the normal disk
 -> read unrecovered data
The root cause is that the recovery check is missing in
choose_bb_rdev(). Add that check to fix the problem.
Also fix a similar problem in choose_slow_rdev().
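
A minimal sketch of such a recovery check, assuming a simplified mirror member that records how far recovery has progressed. struct member and both helpers are hypothetical and much simpler than the md/raid1 structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a mirror member. */
struct member {
	bool in_sync;                   /* fully recovered */
	unsigned long recovery_offset;  /* sectors below this are valid */
};

/* A member can serve [sector, sector + len) only if that range is recovered. */
static bool range_is_readable(const struct member *m,
			      unsigned long sector, unsigned long len)
{
	return m->in_sync || m->recovery_offset >= sector + len;
}

/* Return the index of the first member that can serve the read, or -1. */
static int choose_member(const struct member *disks, size_t n,
			 unsigned long sector, unsigned long len)
{
	for (size_t i = 0; i < n; i++)
		if (range_is_readable(&disks[i], sector, len))
			return (int)i;
	return -1;
}
```

Every selection path has to apply the same readability test; leaving it out of one fallback path is the kind of gap the commit describes.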
Cc: stable@vger.kernel.org
Fixes: 9f3ced792203 ("md/raid1: factor out choose_bb_rdev() from read_balance()")
Fixes: dfa8ecd167c1 ("md/raid1: factor out choose_slow_rdev() from read_balance()")
Reported-and-tested-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Closes: https://lore.kernel.org/all/9952f532-2554-44bf-b906-4880b2e88e3a@o2.pl/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20240803091137.3197008-1-yukuai1@huaweicloud.com
Signed-off-by: Song Liu <song@kernel.org>
|
|
The out-of-bounds access is reported by UBSAN:
[ 0.000000] UBSAN: array-index-out-of-bounds in ../arch/riscv/kernel/vendor_extensions.c:41:66
[ 0.000000] index -1 is out of range for type 'riscv_isavendorinfo [32]'
[ 0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2ubuntu-defconfig #2
[ 0.000000] Hardware name: riscv-virtio,qemu (DT)
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff94e078ba>] dump_backtrace+0x32/0x40
[ 0.000000] [<ffffffff95c83c1a>] show_stack+0x38/0x44
[ 0.000000] [<ffffffff95c94614>] dump_stack_lvl+0x70/0x9c
[ 0.000000] [<ffffffff95c94658>] dump_stack+0x18/0x20
[ 0.000000] [<ffffffff95c8bbb2>] ubsan_epilogue+0x10/0x46
[ 0.000000] [<ffffffff95485a82>] __ubsan_handle_out_of_bounds+0x94/0x9c
[ 0.000000] [<ffffffff94e09442>] __riscv_isa_vendor_extension_available+0x90/0x92
[ 0.000000] [<ffffffff94e043b6>] riscv_cpufeature_patch_func+0xc4/0x148
[ 0.000000] [<ffffffff94e035f8>] _apply_alternatives+0x42/0x50
[ 0.000000] [<ffffffff95e04196>] apply_boot_alternatives+0x3c/0x100
[ 0.000000] [<ffffffff95e05b52>] setup_arch+0x85a/0x8bc
[ 0.000000] [<ffffffff95e00ca0>] start_kernel+0xa4/0xfb6
The dereferencing using cpu should actually not happen, so remove it.
Fixes: 23c996fc2bc1 ("riscv: Extend cpufeature.c to detect vendor extensions")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240814192619.276794-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Trusted keys unseal the key blob on load, but keep the sealed payload in
the blob field so that every subsequent read (export) will simply
convert this field to hex and send it to userspace.
With DCP-based trusted keys, we decrypt the blob encryption key (BEK)
in the kernel due to hardware limitations and then decrypt the blob payload.
BEK decryption is done in-place, which means that the trusted key blob
field is modified and consequently holds the BEK in plain text.
Every subsequent read of that key thus sends the plain text BEK instead
of the encrypted BEK to userspace.
This issue only occurs when importing a trusted DCP-based key and
then exporting it again. This should rarely happen as the common use cases
are to either create a new trusted key and export it, or import a key
blob and then just use it without exporting it again.
Fix this by performing BEK decryption and encryption in a dedicated
buffer. Furthermore, always wipe the plain text BEK buffer to prevent
leaking the key via uninitialized memory.
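
The pattern of decrypting into a dedicated buffer and wiping it afterwards can be sketched as follows. Everything here is an assumption made for illustration: toy_crypt() is a placeholder XOR, not the DCP operation, and KEY_LEN, wipe() and use_key() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define KEY_LEN 16

/* Placeholder cipher (XOR with a fixed byte); NOT real cryptography. */
static void toy_crypt(const unsigned char *in, unsigned char *out, size_t len)
{
	for (size_t i = 0; i < len; i++)
		out[i] = in[i] ^ 0x5a;
}

/* Wipe through a volatile pointer so the stores are not optimized away. */
static void wipe(void *buf, size_t len)
{
	volatile unsigned char *p = buf;

	while (len--)
		*p++ = 0;
}

/*
 * Use the key stored encrypted in blob_key without modifying the blob.
 * The plain text only ever lives in a local buffer that is wiped on exit.
 */
static unsigned char use_key(const unsigned char blob_key[KEY_LEN])
{
	unsigned char plain[KEY_LEN];
	unsigned char digest = 0;

	toy_crypt(blob_key, plain, KEY_LEN);
	for (size_t i = 0; i < KEY_LEN; i++)
		digest ^= plain[i];   /* stands in for decrypting the payload */
	wipe(plain, sizeof(plain));
	return digest;
}
```

Because the blob is only read, a later export sees the same encrypted bytes that were imported.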
Cc: stable@vger.kernel.org # v6.10+
Fixes: 2e8a0f40a39c ("KEYS: trusted: Introduce NXP DCP-backed trusted keys")
Signed-off-by: David Gstir <david@sigma-star.at>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
The DCP trusted key type uses the wrong helper function to store
the blob's payload length, which can lead to the wrong byte order
being used if this ever runs on big endian architectures.
Fix this by using the correct helper function.
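
The commit message does not name the helper, so the following is only a generic sketch of storing a 32-bit length in a fixed little-endian layout regardless of the host byte order. store_le32() and load_le32() are hypothetical names, not kernel helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Store v as little-endian bytes, independent of the host byte order. */
static void store_le32(uint32_t v, unsigned char out[4])
{
	out[0] = (unsigned char)(v & 0xff);
	out[1] = (unsigned char)((v >> 8) & 0xff);
	out[2] = (unsigned char)((v >> 16) & 0xff);
	out[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Read the value back from its little-endian byte representation. */
static uint32_t load_le32(const unsigned char in[4])
{
	return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
	       ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}
```

Using a load-style conversion where a store is needed happens to work on little-endian hosts, which is why this class of bug tends to show up only in static analysis or on big-endian machines.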
Cc: stable@vger.kernel.org # v6.10+
Fixes: 2e8a0f40a39c ("KEYS: trusted: Introduce NXP DCP-backed trusted keys")
Suggested-by: Richard Weinberger <richard@nod.at>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202405240610.fj53EK0q-lkp@intel.com/
Signed-off-by: David Gstir <david@sigma-star.at>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
With CONFIG_LTO_CLANG=y, the compiler may add a .llvm.<hash> suffix to
function names to avoid duplication. APIs like kallsyms_lookup_name()
and kallsyms_on_each_match_symbol() try to match these symbol names
without the .llvm.<hash> suffix, e.g., match "c_stop" with symbol
c_stop.llvm.17132674095431275852. This turned out to be problematic
for use cases that require an exact match, for example, livepatch.
Fix this by making the APIs match symbols exactly.
Also clean up kallsyms_selftests accordingly.
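
The two matching policies can be contrasted in a few lines of userspace C. match_stripped() and match_exact() are hypothetical helpers written for this illustration, not the kallsyms implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Old-style comparison: ignore a trailing ".llvm.<hash>" on the symbol. */
static bool match_stripped(const char *symbol, const char *query)
{
	const char *suffix = strstr(symbol, ".llvm.");
	size_t len = suffix ? (size_t)(suffix - symbol) : strlen(symbol);

	return strlen(query) == len && strncmp(symbol, query, len) == 0;
}

/* Exact comparison: the suffix is part of the name. */
static bool match_exact(const char *symbol, const char *query)
{
	return strcmp(symbol, query) == 0;
}
```

With stripping, a query for "c_stop" cannot distinguish the suffixed symbol from an unrelated plain "c_stop", which is the ambiguity that hurts livepatch.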
Signed-off-by: Song Liu <song@kernel.org>
Fixes: 8cc32a9bbf29 ("kallsyms: strip LTO-only suffixes from promoted global functions")
Tested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20240807220513.3100483-3-song@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Cleaning up the symbols causes various issues afterwards. Let's sort
the list based on original name.
Signed-off-by: Song Liu <song@kernel.org>
Fixes: 8cc32a9bbf29 ("kallsyms: strip LTO-only suffixes from promoted global functions")
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20240807220513.3100483-2-song@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
The 'device_name' array does not exist outside the scope of the
'overflow_allocation_test' function. However, it is used as the
driver name when 'kunit_driver_create' is called from
'kunit_device_register'. This produces a kernel panic with KASAN
enabled.
Since this variable is used in one place only, remove it and pass the
device name to kunit_device_register directly as an ASCII string.
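
The lifetime issue can be illustrated with a registry that keeps the caller's pointer instead of copying the string. The names below (fake_driver, fake_register, the device name literal) are hypothetical and only model the situation; they are not the KUnit API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry that keeps the caller's pointer without copying. */
struct fake_driver {
	const char *name;
};

static struct fake_driver registered;

static void fake_register(const char *name)
{
	assert(name != NULL);
	registered.name = name;   /* pointer must outlive the registration */
}

/* Safe: a string literal has static storage duration. */
static void setup_with_literal(void)
{
	fake_register("overflow-test-device");
}

/*
 * Unsafe variant, shown for contrast and never called here:
 *
 *     char device_name[] = "overflow-test-device";
 *     fake_register(device_name);   // dangles once the function returns
 */

static bool registered_as(const char *name)
{
	return registered.name && strcmp(registered.name, name) == 0;
}
```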
Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Reviewed-by: David Gow <davidgow@google.com>
Link: https://lore.kernel.org/r/20240815000431.401869-1-ivan.orlov0322@gmail.com
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Avoid GT TLB invalidation timeouts by holding a PM ref when
invalidations are inflight.
v2:
- Drop PM ref before signaling fence (CI)
v3:
- Move invalidation_fence_signal helper in tlb timeout to previous
patch (Matthew Auld)
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240719172905.1527927-4-matthew.brost@intel.com
(cherry picked from commit 0a382f9bc5dc4744a33970a5ed4df8f9c702ee9e)
Requires: 46209ce5287b ("drm/xe: Add xe_gt_tlb_invalidation_fence_init
helper")
Requires: 0e414ab036e0 ("drm/xe: Drop xe_gt_tlb_invalidation_wait")
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Having two methods to wait on GT TLB invalidations is not ideal. Remove
xe_gt_tlb_invalidation_wait and only use GT TLB invalidation fences.
In addition to two methods being less than ideal, once GT TLB
invalidations are coalesced the seqno cannot be assigned during
xe_gt_tlb_invalidation_ggtt/range. Thus xe_gt_tlb_invalidation_wait
would not have a seqno to wait on. A fence however can be armed and
later signaled.
v3:
- Add explanation about coalescing to commit message
v4:
- Don't put dma fence if defined on stack (CI)
v5:
- Initialize ret to zero (CI)
v6:
- Use invalidation_fence_signal helper in tlb timeout (Matthew Auld)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240719172905.1527927-3-matthew.brost@intel.com
(cherry picked from commit 61ac035361ae555ee5a17a7667fe96afdde3d59a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Other layers should not be touching struct xe_gt_tlb_invalidation_fence
directly, add helper for initialization.
v2:
- Add dma_fence_get and list init to xe_gt_tlb_invalidation_fence_init
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240719172905.1527927-2-matthew.brost@intel.com
(cherry picked from commit a522b285c6b4b611406d59612a8d7241714d2e31)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
When validating VF config on the media GT, we may wrongly report
that the VF is already partially configured on it, as we consider GGTT
and LMEM provisioning done on the primary GT (since both GGTT and
LMEM are tile-level resources, not GT-level ones).
This will cause VF auto-provisioning to be skipped on the media GT and
as a result will block the VF from successfully initializing that GT.
Fix that by considering GGTT and LMEM configurations only when
checking if a VF provisioning is complete, and omit GGTT and LMEM
when reporting empty/partial provisioning.
Fixes: 234670cea9a2 ("drm/xe/pf: Skip fair VFs provisioning if already provisioned")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240806180516.618-1-michal.wajdeczko@intel.com
(cherry picked from commit 5bdacb0907c1f531995b6ba47b832ac3a0182ae9)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Take PM ref when any G2H are outstanding, drop when none are
outstanding.
To safely ensure we have PM ref when in the GuC CT layer, a PM ref needs
to be held when scheduler messages are pending too.
v2:
- Add outer PM protections to xe_file_close (CI)
v3:
- Only take PM ref 0->1 and drop on 1->0 (Matthew Auld)
v4:
- Add assert to G2H increment function
v5:
- Rebase
v6:
- Declare xe as local variable in xe_file_close (CI)
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240719172905.1527927-5-matthew.brost@intel.com
(cherry picked from commit d930c19fdff3109e97b610fa10943b7602efcabd)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
We should use the number of actual entries stored in the runtime
register buffer, not the maximum number of entries that this buffer
can hold, otherwise bsearch() may fail and we may miss the data and
wrongly report unexpected access to some registers.
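
A short userspace sketch of the same pitfall with the C library's bsearch(): the element count passed in must be the number of populated, sorted entries, not the capacity of the buffer. The structure and names here are hypothetical, not the xe driver's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_REGS 8

struct reg_entry {
	unsigned int offset;
	unsigned int value;
};

struct reg_buf {
	struct reg_entry regs[MAX_REGS];  /* capacity */
	size_t num_regs;                  /* entries stored, sorted by offset */
};

static int cmp_offset(const void *key, const void *elem)
{
	unsigned int k = *(const unsigned int *)key;
	unsigned int o = ((const struct reg_entry *)elem)->offset;

	return (k > o) - (k < o);
}

/* Search only the populated prefix; a zeroed tail would break the ordering. */
static const struct reg_entry *lookup(const struct reg_buf *buf,
				      unsigned int offset)
{
	return bsearch(&offset, buf->regs, buf->num_regs,
		       sizeof(buf->regs[0]), cmp_offset);
}
```

Passing MAX_REGS instead of num_regs would include unused zeroed slots after the real entries, so the array would no longer be sorted and the search could miss entries that are present.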
Fixes: 4edadc41a3a4 ("drm/xe/vf: Use register values obtained from the PF")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240718203155.486-1-michal.wajdeczko@intel.com
(cherry picked from commit ad16682db18f4414e53bba1ce0db75b08bdc4dff)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
xe_file_close triggers an asynchronous queue cleanup and then frees up
the xef object. Since queue cleanup flushes all pending jobs and the KMD
stores client usage stats into the xef object after jobs are flushed, we
see a use-after-free for the xef object. Resolve this by taking a
reference to xef from xe_exec_queue.
While at it, revert an earlier change that contained a partial work
around for this issue.
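
The ownership rule can be sketched with a plain reference count. This is a simplified, non-atomic userspace model with hypothetical names; kernel code would normally use a proper refcount primitive such as kref rather than a bare int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct file_obj {
	int refcount;
	bool destroyed;           /* stands in for the object being freed */
	unsigned long run_ticks;  /* usage stats written during cleanup */
};

static void file_get(struct file_obj *f)
{
	f->refcount++;
}

static void file_put(struct file_obj *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0)
		f->destroyed = true;
}

struct queue {
	struct file_obj *owner;
};

/* The queue pins its owner for as long as it may write stats into it. */
static void queue_init(struct queue *q, struct file_obj *f)
{
	file_get(f);
	q->owner = f;
}

/* Asynchronous cleanup: record stats first, then drop the queue's ref. */
static void queue_cleanup(struct queue *q, unsigned long ticks)
{
	q->owner->run_ticks += ticks;   /* safe: the queue still holds a ref */
	file_put(q->owner);
	q->owner = NULL;
}
```

Closing the file then only drops the file's own reference; the object goes away when the last queue has finished its cleanup.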
v2:
- Take a ref to xef even for the VM bind queue (Matt)
- Squash patches relevant to that fix and work around (Lucas)
v3: Fix typo (Lucas)
Fixes: ce62827bc294 ("drm/xe: Do not access xe file when updating exec queue run_ticks")
Fixes: 6109f24f87d7 ("drm/xe: Add helper to accumulate exec queue runtime")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1908
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240718210548.3580382-5-umesh.nerlige.ramappa@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 2149ded63079449b8dddf9da38392632f155e6b5)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Take a reference to xef when user creates the VM and put the reference
when user destroys the VM.
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240718210548.3580382-4-umesh.nerlige.ramappa@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit a2387e69493df3de706f14e4573ee123d23d5d34)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Add ref counting for xe_file.
v2:
- Add kernel doc for exported functions (Matt)
- Instead of xe_file_destroy, export the get/put helpers (Lucas)
v3: Fixup the kernel-doc format and description (Matt, Lucas)
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240718210548.3580382-3-umesh.nerlige.ramappa@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit ce8c161cbad43f4056451e541f7ae3471d0cca12)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
In order to make xe_file ref counted, move destruction of xe_file
members to a helper.
v2: Move xe_vm_close_and_put back into xe_file_close (Matt)
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240718210548.3580382-2-umesh.nerlige.ramappa@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 3d0c4a62cc553c6ffde4cb11620eba991e770665)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Fail invalid addresses during user fence creation.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240717140429.1396820-1-matthew.brost@intel.com
(cherry picked from commit 0fde907da2d5fd4da68845e96c6842497159c858)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
pci_request_regions is called to request PCI I/O and memory resources
when the driver is initialized. Therefore, when the driver is removed,
pci_release_regions should be used to release both PCI I/O and memory
resources, instead of pci_release_mem_regions, which releases memory
resources only.
Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When querying the SSU register info, the driver loops tnl_num times.
However, tnl_num comes from hardware while the length of the array is
fixed. To avoid an out-of-bounds array access, make sure the loop count
is not greater than the length of the array.
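
A minimal sketch of clamping a device-reported count to a fixed array length. REG_SLOTS and collect_regs() are hypothetical; the real driver's array size and register layout are not given in the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define REG_SLOTS 8   /* hypothetical fixed array length */

/* Copy at most REG_SLOTS values even if the device reports a larger count. */
static size_t collect_regs(unsigned int reported, const unsigned int *src,
			   unsigned int dst[REG_SLOTS])
{
	size_t n = reported < REG_SLOTS ? reported : REG_SLOTS;

	for (size_t i = 0; i < n; i++)
		dst[i] = src[i];
	return n;
}
```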
Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Configuring TC during the reset process may cause a deadlock. The flow
is as follows:
               pf reset start
                     │
                     ▼
                   ......
setup tc             │
    │                ▼
    ▼          DOWN: napi_disable()
napi_disable()(skip) │
    │                │
    ▼                ▼
  ......           ......
    │                │
    ▼                │
napi_enable()        │
                     ▼
               UINIT: netif_napi_del()
                     │
                     ▼
                   ......
                     │
                     ▼
               INIT: netif_napi_add()
                     │
                     ▼
                   ......                global reset start
                     │                          │
                     ▼                          ▼
               UP: napi_enable()(skip)        ......
                     │                          │
                     ▼                          ▼
                   ......                 napi_disable()
During reset, the driver will DOWN the port and then UINIT it. In this
case, the setup tc process will UP the port before UINIT, which causes
the problem. Add a DOWN step in UINIT to fix it.
Fixes: bb6b94a896d4 ("net: hns3: Add reset interface implementation in client")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Consider the following case: the user changes the speed and then resets
the net interface. Before the hw has changed the speed successfully, the
driver gets the old speed from the hw via the timer task. After the
reset, the previous speed is configured to the hw. As a result, the new
speed is configured successfully but lost after the PF reset. The
following diagram shows this more directly.
 +------+              +----+                 +----+
 | USER |              | PF |                 | HW |
 +---+--+              +-+--+                 +-+--+
     |  ethtool -s 100G  |                      |
     +------------------>|   set speed 100G     |
     |                   +--------------------->|
     |                   |  set successfully    |
     |                   |<---------------------+---+
     |                   |query cfg (timer task)|   |
     |                   +--------------------->|   | handle speed
     |                   |     return 200G      |   | changing event
     |  ethtool --reset  |<---------------------+   | (100G)
     +------------------>|  cfg previous speed  |<--+
     |                   |  after reset (200G)  |
     |                   +--------------------->|
     |                   |                      +---+
     |                   |query cfg (timer task)|   |
     |                   +--------------------->|   | handle speed
     |                   |     return 100G      |   | changing event
     |                   |<---------------------+   | (200G)
     |                   |                      |<--+
     |                   |query cfg (timer task)|
     |                   +--------------------->|
     |                   |     return 200G      |
     |                   |<---------------------+
     |                   |                      |
     v                   v                      v
This patch saves the new speed if the hw changes speed successfully;
the saved speed is then used after a successful reset.
Fixes: 2d03eacc0b7e ("net: hns3: Only update mac configuation when necessary")
Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently, if an hns3 PF or VF FLR reset fails after five retries,
the reset done process will directly release the semaphore,
which has already been released in hclge_reset_prepare_general.
This will cause the down operation to fail.
So this patch fixes it by adding a reset state check. The up operation is
only called after a successful PF FLR reset.
Fixes: 8627bdedc435 ("net: hns3: refactor the precedure of PF FLR")
Fixes: f28368bb4542 ("net: hns3: refactor the procedure of VF FLR")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When deleting netns, it is possible to still have some tasks running,
e.g. background tasks like tcpdump running in the background, not
stopped because the test has been interrupted.
Before deleting the netns, it is then safer to kill all attached PIDs,
if any. That should reduce some noise after the end of some tests, and
help with the debugging of some issues. That's why this modification is
seen as a "fix".
Fixes: 25ae948b4478 ("selftests/net: add lib.sh")
Acked-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20240813-upstream-net-20240813-selftests-net-lib-kill-v1-1-27b689b248b8@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Fix an issue where `devm_regulator_register()` would fail for PSE
controllers that do not support current limit control, such as simple
GPIO-based controllers like the podl-pse-regulator. The
`REGULATOR_CHANGE_CURRENT` flag and `max_uA` constraint are now
conditionally set only if the `pi_set_current_limit` operation is
supported. This change prevents the regulator registration routine from
attempting to call `pse_pi_set_current_limit()`, which would return
`-EOPNOTSUPP` and cause the registration to fail.
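The conditional setup described above can be sketched roughly as follows.
This is a minimal illustration, not the driver code: the struct names, the
flag macros and fill_constraints() are hypothetical stand-ins for the
regulator constraints and the PSE controller ops.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the regulator "change" permission flags. */
#define CHANGE_STATUS_SKETCH  (1u << 0)
#define CHANGE_CURRENT_SKETCH (1u << 1)

/* Hypothetical controller ops; a GPIO-only controller leaves this NULL. */
struct pse_ops_sketch {
	int (*pi_set_current_limit)(int id, int max_uA);
};

struct constraints_sketch {
	unsigned int valid_ops_mask;
	int max_uA;
};

/* Advertise current control only when the controller implements it. */
void fill_constraints(const struct pse_ops_sketch *ops,
		      struct constraints_sketch *c, int max_uA)
{
	c->valid_ops_mask = CHANGE_STATUS_SKETCH;
	c->max_uA = 0;
	if (ops->pi_set_current_limit) {
		c->valid_ops_mask |= CHANGE_CURRENT_SKETCH;
		c->max_uA = max_uA;
	}
}

/* Stub used to model a controller that does support current limits. */
int set_limit_stub(int id, int max_uA)
{
	(void)id;
	(void)max_uA;
	return 0;
}
```

With a NULL pi_set_current_limit, no current constraint is advertised, so
nothing would later try to apply one.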
Fixes: 4a83abcef5f4f ("net: pse-pd: Add new power limit get and set c33 features")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Tested-by: Kyle Swenson <kyle.swenson@est.tech>
Link: https://patch.msgid.link/20240813073719.2304633-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
HDMI analyser shows that the AVI infoframe is no longer being sent.
The switch to the HDMI connector api should have used the frame content
which is now given in the buffer parameter, but instead still uses the
(now) empty and superfluous packed_frame variable.
Fix it.
Fixes: 65548c8ff0ab ("drm/rockchip: inno_hdmi: Switch to HDMI connector")
Signed-off-by: Alex Bee <knaerzche@gmail.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240805110855.274140-2-knaerzche@gmail.com
|
|
Commit 94833addfaba ("net: thunderx: Unembed netdev structure") had
a go at dynamically allocating the netdev structures for the thunderx_bgx
driver. This change results in my ThunderX box catching fire (to be fair,
it is what it does best).
The issues with this change are that:
- bgx_lmac_enable() is called *after* bgx_acpi_register_phy() and
bgx_init_of_phy(), both expecting netdev to be a valid pointer.
- bgx_init_of_phy() populates the MAC addresses for *all* LMACs
attached to a given BGX instance, and thus needs netdev for each of
them to have been allocated.
There are a few things to be said about how the driver mixes LMAC and
BGX states which leads to this sorry state, but that's beside the point.
To address this, go back to a situation where all netdev structures
are allocated before the driver starts relying on them, and move the
freeing of these structures to driver removal. Someone brave enough
can always go and restructure the driver if they want.
Fixes: 94833addfaba ("net: thunderx: Unembed netdev structure")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Breno Leitao <leitao@debian.org>
Cc: Sunil Goutham <sgoutham@marvell.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20240812141322.1742918-1-maz@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
CMIS 5.2 standard section 9.4.2 defines four types of supported firmware
update mechanism: None, only LPL, only EPL, both LPL and EPL.
Currently, only the LPL (Local Payload) type of write firmware block is
supported. However, if the module supports both LPL and EPL the flashing
process wrongly fails for not supporting LPL.
Fix that, by allowing the write mechanism to be LPL or both LPL and
EPL.
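A minimal sketch of the relaxed check, for illustration only. The enum
names and numeric values below are assumptions made for this example and
are not taken from the ethtool code or quoted from the CMIS text.

```c
#include <stdbool.h>

/* Hypothetical names; values are assumed for illustration only. */
enum fw_write_mechanism_sketch {
	FW_WRITE_NONE = 0x00,
	FW_WRITE_LPL  = 0x01,
	FW_WRITE_EPL  = 0x10,
	FW_WRITE_BOTH = 0x11,
};

/* Old shape: only an exact "LPL only" advertisement was accepted. */
bool lpl_usable_old(enum fw_write_mechanism_sketch m)
{
	return m == FW_WRITE_LPL;
}

/* New shape: LPL is usable when advertised alone or together with EPL. */
bool lpl_usable_new(enum fw_write_mechanism_sketch m)
{
	return m == FW_WRITE_LPL || m == FW_WRITE_BOTH;
}
```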
Fixes: c4f78134d45c ("ethtool: cmis_fw_update: add a layer for supporting firmware update using CDB")
Reported-by: Vladyslav Mykhaliuk <vmykhaliuk@nvidia.com>
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20240812140824.3718826-1-danieller@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
After a vsock socket has been added to a BPF sockmap, its prot->recvmsg
has been replaced with vsock_bpf_recvmsg(). Thus the following
recursion could happen:
vsock_bpf_recvmsg()
-> __vsock_recvmsg()
-> vsock_connectible_recvmsg()
-> prot->recvmsg()
-> vsock_bpf_recvmsg() again
We need to fix it by calling the original ->recvmsg() without any BPF
sockmap logic in __vsock_recvmsg().
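A toy model of the fixed call shape, assuming the socket keeps a pointer to
its original handler. All names here are hypothetical and the functions only
count call depth; this is not the vsock code.

```c
struct sock_sketch;
typedef int (*recvmsg_fn)(struct sock_sketch *sk, int depth);

struct sock_sketch {
	recvmsg_fn prot_recvmsg; /* may be replaced by the sockmap hook */
	recvmsg_fn orig_recvmsg; /* original handler, no sockmap logic */
};

/* Original handler: terminates the chain and reports the depth reached. */
int plain_recvmsg(struct sock_sketch *sk, int depth)
{
	(void)sk;
	return depth;
}

/*
 * Inner receive path. Calling sk->prot_recvmsg here would re-enter
 * bpf_recvmsg_sketch() forever once the hook is installed; calling the
 * original handler directly breaks the cycle.
 */
int inner_recvmsg(struct sock_sketch *sk, int depth)
{
	return sk->orig_recvmsg(sk, depth + 1);
}

/* Sockmap-style hook that falls through to the inner receive path. */
int bpf_recvmsg_sketch(struct sock_sketch *sk, int depth)
{
	return inner_recvmsg(sk, depth + 1);
}
```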
Fixes: 634f1a7110b4 ("vsock: support sockmap")
Reported-by: syzbot+bdb4bd87b5e22058e2a4@syzkaller.appspotmail.com
Tested-by: syzbot+bdb4bd87b5e22058e2a4@syzkaller.appspotmail.com
Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20240812022153.86512-1-xiyou.wangcong@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently, kasan_init_sw_tags() is called before setup_per_cpu_areas(),
so per_cpu(prng_state, cpu) accesses the same address regardless of the
value of "cpu", and the same seed value gets copied to the percpu area
for every CPU. Fix this by moving the call to smp_prepare_boot_cpu(),
which is the first architecture hook after setup_per_cpu_areas().
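The effect can be modelled with a toy per-CPU lookup, assuming per-CPU
offsets are all zero until the areas are set up. The names below are
hypothetical and much simpler than the kernel's per-CPU machinery.

```c
#include <stddef.h>
#include <stdint.h>

#define NCPUS_SKETCH 4

uint32_t percpu_storage[NCPUS_SKETCH];
size_t percpu_offset[NCPUS_SKETCH]; /* all zero until areas are set up */

/* Resolve the slot for a CPU through its offset. */
uint32_t *per_cpu_slot(int cpu)
{
	return &percpu_storage[percpu_offset[cpu]];
}

/* Models per-CPU area setup: each CPU gets its own offset. */
void setup_areas(void)
{
	for (int i = 0; i < NCPUS_SKETCH; i++)
		percpu_offset[i] = (size_t)i;
}

/* Seeds every CPU's slot with a distinct value. */
void seed_all(void)
{
	for (int cpu = 0; cpu < NCPUS_SKETCH; cpu++)
		*per_cpu_slot(cpu) = 100u + (uint32_t)cpu;
}
```

Calling seed_all() before setup_areas() writes every seed to slot 0;
calling it afterwards gives each CPU a distinct seed.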
Fixes: 3c9e3aa11094 ("kasan: add tag related helper functions")
Fixes: 3f41b6093823 ("kasan: fix random seed generation for tag-based mode")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/r/20240814091005.969756-1-samuel.holland@sifive.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Change expected_buf from (const void *) to (const char *)
in function __recvpair().
This change fixes the below warnings during test compilation:
```
In file included from msg_oob.c:14:
msg_oob.c: In function ‘__recvpair’:
../../kselftest_harness.h:106:40: warning: format ‘%s’ expects argument
of type ‘char *’,but argument 6 has type ‘const void *’ [-Wformat=]
../../kselftest_harness.h:101:17: note: in expansion of macro ‘__TH_LOG’
msg_oob.c:235:17: note: in expansion of macro ‘TH_LOG’
../../kselftest_harness.h:106:40: warning: format ‘%s’ expects argument
of type ‘char *’,but argument 6 has type ‘const void *’ [-Wformat=]
../../kselftest_harness.h:101:17: note: in expansion of macro ‘__TH_LOG’
msg_oob.c:259:25: note: in expansion of macro ‘TH_LOG’
```
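A small standalone illustration of the type change, assuming a helper that
formats the expected buffer with "%s". format_expected() is a hypothetical
name and is not part of msg_oob.c.

```c
#include <stdio.h>
#include <string.h>

/*
 * With "const void *expected_buf", passing the pointer to "%s" makes
 * the compiler emit the -Wformat warning. Declaring it "const char *"
 * matches what "%s" expects.
 */
int format_expected(char *out, size_t out_len, const char *expected_buf)
{
	return snprintf(out, out_len, "expected: %s", expected_buf);
}
```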
Fixes: d098d77232c3 ("selftest: af_unix: Add msg_oob.c.")
Signed-off-by: Abhinav Jain <jain.abhinav177@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20240814080743.1156166-1-jain.abhinav177@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Objects' dump callbacks are not concurrency-safe per se with the reset bit
set. If two CPUs perform a reset at the same time, at least counter and
quota objects suffer from value underrun.
Prevent this by introducing dedicated locking callbacks for nfnetlink
and the asynchronous dump handling to serialize access.
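The underrun can be illustrated with a single-threaded simulation of the
interleaving, assuming a reset is implemented as "read the value, then
subtract what was read". The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical counter object; not the nf_tables structure. */
struct counter_sketch {
	int64_t packets;
};

/* Step 1 of a dump-and-reset: read the value to report. */
int64_t reset_read(const struct counter_sketch *c)
{
	return c->packets;
}

/* Step 2: subtract what was reported. */
void reset_subtract(struct counter_sketch *c, int64_t reported)
{
	c->packets -= reported;
}

/* Two unserialized resets: both read before either subtracts. */
int64_t racy_double_reset(struct counter_sketch *c)
{
	int64_t a = reset_read(c);
	int64_t b = reset_read(c);

	reset_subtract(c, a);
	reset_subtract(c, b);
	return c->packets;
}

/* Serialized resets: the second starts only after the first is done. */
int64_t serialized_double_reset(struct counter_sketch *c)
{
	reset_subtract(c, reset_read(c));
	reset_subtract(c, reset_read(c));
	return c->packets;
}
```

Starting from 10 packets, the racy interleaving leaves the counter at -10,
while the serialized version leaves it at 0.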
Fixes: 43da04a593d8 ("netfilter: nf_tables: atomic dump and reset for stateful objects")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Outsource the reply skb preparation for non-dump getrule requests into a
distinct function. Prep work for object reset locking.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
In theory, dumpreset may fail and invalidate the preceding log message.
Fix this and use the occasion to prepare for object reset locking, which
benefits from a few unrelated changes:
* Add an early call to nfnetlink_unicast if not resetting which
effectively skips the audit logging but also unindents it.
* Extract the table's name from the netlink attribute (which is verified
via earlier table lookup) to not rely upon validity of the looked up
table pointer.
* Do not use local variable family, it will vanish.
Fixes: 8e6cf365e1d5 ("audit: log nftables configuration change events")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Trigger cloned skbs leaving softirq protection.
This triggers a splat without the preceding change
("netfilter: nf_queue: drop packets with cloned unconfirmed
conntracks"):
WARNING: at net/netfilter/nf_conntrack_core.c:1198 __nf_conntrack_confirm..
because local delivery and forwarding will race for confirmation.
Based on a reproducer script from Yi Chen.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Conntrack assumes an unconfirmed entry (not yet committed to global hash
table) has a refcount of 1 and is not visible to other cores.
With multicast forwarding this assumption breaks down because such
skbs get cloned after being picked up, i.e. ct->use refcount is > 1.
Likewise, bridge netfilter will clone broad/multicast frames and
all frames in case they need to be flood-forwarded during learning
phase.
For ip multicast forwarding or plain bridge flood-forward this will
"work" because packets don't leave softirq and are implicitly
serialized.
With nfqueue this no longer holds true, the packets get queued
and can be reinjected in arbitrary ways.
Disable this feature, I see no other solution.
After this patch, nfqueue cannot queue packets except the last
multicast/broadcast packet.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Fix missing initialisation of extack in flow offload.
Fixes: c29f74e0df7a ("netfilter: nf_flow_table: hardware offload support")
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Add missing extack initialisation when ACKing BATCH_BEGIN and BATCH_END.
Fixes: bf2ac490d28c ("netfilter: nfnetlink: Handle ACK flags for batch messages")
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
In preparation for misaligned vector performance hwprobe keys, rename
the hwprobe key values associated with misaligned scalar accesses to
include the term SCALAR. Leave the old defines in place to maintain
source compatibility.
This change is intended to be a functional no-op.
Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240809214444.3257596-3-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
RISCV_HWPROBE_KEY_CPUPERF_0 was mistakenly flagged as a bitmask in
hwprobe_key_is_bitmask(), when in reality it was an enum value. This
causes problems when used in conjunction with RISCV_HWPROBE_WHICH_CPUS,
since SLOW, FAST, and EMULATED have values whose bits overlap with
each other. If the caller asked for the set of CPUs that was SLOW or
EMULATED, the returned set would also include CPUs that were FAST.
Introduce a new hwprobe key, RISCV_HWPROBE_KEY_MISALIGNED_PERF, which
returns the same values in response to a direct query (with no flags),
but is properly handled as an enumerated value. As a result, SLOW,
FAST, and EMULATED are all correctly treated as distinct values under
the new key when queried with the WHICH_CPUS flag.
Leave the old key in place to avoid disturbing applications which may
have already come to rely on the key, with or without its broken
behavior with respect to the WHICH_CPUS flag.
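The difference between the two matching rules can be sketched as below. The
enum names and numeric values are assumptions chosen so that the bit
patterns overlap as described; they are not copied from the UAPI header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values: an enumeration packed into the low bits. */
enum misaligned_perf_sketch {
	PERF_UNKNOWN  = 0,
	PERF_EMULATED = 1,
	PERF_SLOW     = 2,
	PERF_FAST     = 3,
};

/* Bitmask semantics: a CPU matches if every requested bit is set. */
bool match_as_bitmask(uint64_t cpu_value, uint64_t wanted)
{
	return (cpu_value & wanted) == wanted;
}

/* Enum semantics: a CPU matches only on exact equality. */
bool match_as_enum(uint64_t cpu_value, uint64_t wanted)
{
	return cpu_value == wanted;
}
```

Under bitmask semantics a FAST CPU also satisfies a query for SLOW or for
EMULATED, because its bits are a superset of theirs; under enum semantics
it does not.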
Fixes: e178bf146e4b ("RISC-V: hwprobe: Introduce which-cpus flag")
Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240809214444.3257596-2-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Currently, only acpi_early_node_map[0] is initialized to NUMA_NO_NODE.
To ensure all the values are properly initialized, initialize
all of them to NUMA_NO_NODE.
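A standalone illustration of the initializer pitfall. The array names and
size are hypothetical, and the range designator is a GNU C extension
(accepted by GCC and Clang), shown here as one possible way to initialize
every element.

```c
#define NUMA_NO_NODE_SKETCH (-1)
#define NR_CPUS_SKETCH 4

/* Only element 0 is set to -1; the remaining elements are zero. */
int partly_initialized[NR_CPUS_SKETCH] = { NUMA_NO_NODE_SKETCH };

/* GNU C range designator: every element is set to -1. */
int fully_initialized[NR_CPUS_SKETCH] = {
	[0 ... NR_CPUS_SKETCH - 1] = NUMA_NO_NODE_SKETCH
};
```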
Fixes: eabd9db64ea8 ("ACPI: RISCV: Add NUMA support based on SRAT and SLIT")
Reported-by: Andrew Jones <ajones@ventanamicro.com>
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://lore.kernel.org/r/0d362a8ae50558b95685da4c821b2ae9e8cf78be.1722828421.git.haibo1.xu@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
With an XIP kernel, kernel_map.size is set to only the size of the data part
of the kernel. This is inconsistent with a "normal" kernel, which sets it to
the size of the entire kernel.
More importantly, XIP kernel fails to boot if CONFIG_DEBUG_VIRTUAL is
enabled, because there are checks on virtual addresses with the assumption
that kernel_map.size is the size of the entire kernel (these checks are in
arch/riscv/mm/physaddr.c).
Change XIP's kernel_map.size to be the size of the entire kernel.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: <stable@vger.kernel.org> # v6.1+
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240508191917.2892064-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Otherwise when the tracer changes syscall number to -1, the kernel fails
to initialize a0 with -ENOSYS and subsequently fails to return the error
code of the failed syscall to userspace. For example, it will break
strace syscall tampering.
Fixes: 52449c17bdd1 ("riscv: entry: set a0 = -ENOSYS only when syscall != -1")
Reported-by: "Dmitry V. Levin" <ldv@strace.io>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
Link: https://lore.kernel.org/r/20240627142338.5114-2-CoelacanthusHex@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Commit 264640fc2c5f4 ("ipv6: distinguish frag queues by device
for multicast and link-local packets") modified the ipv6 fragment
reassembly logic to distinguish frag queues by device for multicast
and link-local packets but in fact only the main reassembly code
limits the use of the device to those address types and the netfilter
reassembly code uses the device for all packets.
This means that if fragments of a packet arrive on different interfaces
then netfilter will fail to reassemble them and the fragments will be
expired without going any further through the filters.
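A simplified sketch of the intended keying rule, assuming the queue key
carries the ingress device only for multicast and link-local destinations.
The struct and helpers are hypothetical; a real key also contains the
addresses.

```c
#include <stdbool.h>

/* Hypothetical, reduced fragment queue key. */
struct frag_key_sketch {
	unsigned int id;
	int iif;
};

/* Include the ingress device only for multicast/link-local packets. */
struct frag_key_sketch make_key(unsigned int id, int ifindex,
				bool mcast_or_linklocal)
{
	struct frag_key_sketch key = {
		.id = id,
		.iif = mcast_or_linklocal ? ifindex : 0,
	};

	return key;
}

/* Two fragments share a reassembly queue when their keys are equal. */
bool same_queue(struct frag_key_sketch a, struct frag_key_sketch b)
{
	return a.id == b.id && a.iif == b.iif;
}
```

With this rule, fragments of a global unicast packet that arrive on
different interfaces still land in the same queue.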
Fixes: 648700f76b03 ("inet: frags: use rhashtables for reassembly units")
Signed-off-by: Tom Hughes <tom@compton.nu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
"INVALID" is misspelt in "SEV_RET_INAVLID_CONFIG". Since this is part of
the UAPI, keep the current definition and add a new one with the fix.
Fix-suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Amit Shah <amit.shah@amd.com>
Message-ID: <20240814083113.21622-1-amit@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|