| Age | Commit message (Collapse) | Author | Files | Lines |
|
The *_gpl section are not being used populated by modpost anymore. Hence
the module loader doesn't need to find and process these sections in
modules.
This patch also simplifies symbol finding logic in module loader since
*_gpl sections don't have to be searched anymore.
Signed-off-by: Siddharth Nayyar <sidnayyar@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
|
|
Read kflagstab section for vmlinux and modules to determine whether
kernel symbols are GPL only.
This patch eliminates the need for fragmenting the ksymtab for infering
the value of GPL-only symbol flag, henceforth stop populating *_gpl
versions of the ksymtab and kcrctab in modpost.
Signed-off-by: Siddharth Nayyar <sidnayyar@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
|
|
This patch adds the ability to create entries for kernel symbol flag
bitsets in kflagstab. Modpost populates only the GPL-only flag for now.
Signed-off-by: Siddharth Nayyar <sidnayyar@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
|
|
The core architectural issue with kernel symbol flags is our reliance on
splitting the main symbol table, ksymtab. To handle a single boolean
property, such as GPL-only, all exported symbols are split across two
separate tables: __ksymtab and __ksymtab_gpl.
This design forces the module loader to perform a separate search on
each of these tables for every symbol it needs, for vmlinux and for all
previously loaded modules.
This approach is fundamentally not scalable. If we were to introduce a
second flag, we would need four distinct symbol tables. For n boolean
flags, this model requires an exponential growth to 2^n tables,
dramatically increasing complexity.
Another consequence of this fragmentation is degraded performance. For
example, a binary search on the symbol table of vmlinux, that would take
only 14 comparison steps (assuming ~2^14 or 16K symbols) in a unified
table, can require up to 26 steps when spread across two tables
(assuming both tables have ~2^13 symbols). This performance penalty
worsens as more flags are added.
To address this, symbol flags is an enumeration used to represent flags
as a bitset, for example a flag to tell if a symbol is GPL only.
The said bitset is introduced in subsequent patches and will contain
values of kernel symbol flags. These bitset will then be used to infer
flag values rather than fragmenting ksymtab for separating symbols with
different flag values, thereby eliminating the need to fragment the
ksymtab.
Link: https://lore.kernel.org/r/20260326-kflagstab-v5-0-fa0796fe88d9@google.com
Signed-off-by: Siddharth Nayyar <sidnayyar@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
[Sami: Updated the commit message to explain the use case for the series.]
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
|
|
Recently, tracepoints were switched from using disabled preemption
(which acts as RCU read section) to SRCU-fast when they are not
faultable. This means that to do a proper grace period wait for programs
running in such tracepoints, we must use SRCU's grace period wait.
This is only for non-faultable tracepoints, faultable ones continue
using RCU Tasks Trace.
However, bpf_link_free() currently does call_rcu() for all cases when
the link is non-sleepable (hence, for tracepoints, non-faultable). Fix
this by doing a call_srcu() grace period wait.
As far RCU Tasks Trace gp -> RCU gp chaining is concerned, it is deemed
unnecessary for tracepoint programs. The link and program are either
accessed under RCU Tasks Trace protection, or SRCU-fast protection now.
The earlier logic of chaining both RCU Tasks Trace and RCU gp waits was
to generalize the logic, even if it conceded an extra RCU gp wait,
however that is unnecessary for tracepoints even before this change.
In practice no cost was paid since rcu_trace_implies_rcu_gp() was always
true. Hence we need not chaining any RCU gp after the SRCU gp.
For instance, in the non-faultable raw tracepoint, the RCU read section
of the program in __bpf_trace_run() is enclosed in the SRCU gp, likewise
for faultable raw tracepoint, the program is under the RCU Tasks Trace
protection. Hence, the outermost scope can be waited upon to ensure
correctness.
Also, sleepable programs cannot be attached to non-faultable
tracepoints, so whenever program or link is sleepable, only RCU Tasks
Trace protection is being used for the link and prog.
Fixes: a46023d5616e ("tracing: Guard __DECLARE_TRACE() use of __DO_TRACE_CALL() with SRCU-fast")
Reviewed-by: Sun Jian <sun.jian.kdev@gmail.com>
Reviewed-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20260331211021.1632902-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
With CONFIG_UBSAN_INTEGER_WRAP being replaced by Overflow Behavior
Types, remove the __signed_wrap function annotation as it is already
unused, and any future work here will use OBT annotations instead.
Link: https://patch.msgid.link/20260331163725.2765789-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Pull cgroup fixes from Tejun Heo:
- Fix cgroup rmdir racing with dying tasks.
Deferred task cgroup unlink introduced a window where cgroup.procs
is empty but the cgroup is still populated, causing rmdir to fail
with -EBUSY and selftest failures.
Make rmdir wait for dying tasks to fully leave and fix selftests to
not depend on synchronous populated updates.
- Fix cpuset v1 task migration failure from empty cpusets under strict
security policies.
When CPU hotplug removes the last CPU from a v1 cpuset, tasks must be
migrated to an ancestor without a security_task_setscheduler() check
that would block the migration.
* tag 'cgroup-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup/cpuset: Skip security check for hotplug induced v1 task migration
cgroup/cpuset: Simplify setsched decision check in task iteration loop of cpuset_can_attach()
cgroup: Fix cgroup_drain_dying() testing the wrong condition
selftests/cgroup: Don't require synchronous populated update on task exit
cgroup: Wait for dying tasks to leave on rmdir
|
|
Pull udf fix from Jan Kara:
"Fix for a race in UDF that can lead to memory corruption"
* tag 'fs_for_v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Fix race between file type conversion and writeback
mpage: Provide variant of mpage_writepages() with own optional folio handler
|
|
Convert struct mem_ctl_info to use flex array and use the new flex array
helpers to enable runtime bounds checking, including annotating the array
length member with __counted_by() for extra runtime analysis when requested.
Move memcpy() after the counter assignment so that it is initialized before
the first reference to the flex array, as the new attribute requires.
[ bp: Heavily massage commit message. ]
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://patch.msgid.link/20260327024828.7377-1-rosenp@gmail.com
|
|
Some utility functions on sched_dl_entity can be useful outside of
deadline.c , for instance for modelling, without relying on raw
structure fields.
Move functions like dl_task_of and dl_is_implicit to deadline.h to make
them available outside.
Acked-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lore.kernel.org/r/20260330111010.153663-12-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
|
|
RV deterministic and hybrid automata currently only support global,
per-cpu and per-task monitors. It isn't possible to write a model that
would follow some different type of object, like a deadline entity or a
lock.
Define the generic per-object monitor implementation which shares part
of the implementation with the per-task monitors.
The user needs to provide an id for the object (e.g. pid for tasks) and
define the data type for the monitor_target (e.g. struct task_struct *
for tasks). Both are supplied to the event handlers, as the id may not
be easily available in the target.
The monitor storage (e.g. the rv monitor, pointer to the target, etc.)
is stored in a hash table indexed by id. Monitor storage objects are
automatically allocated unless specified otherwise (e.g. if the creation
context is unsafe for allocation).
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260330111010.153663-9-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
|
|
Deterministic automata define which events are allowed in every state,
but cannot define more sophisticated constraint taking into account the
system's environment (e.g. time or other states not producing events).
Add the Hybrid Automata monitor type as an extension of Deterministic
automata where each state transition is validating a constraint on a
finite number of environment variables.
Hybrid automata can be used to implement timed automata, where the
environment variables are clocks.
Also implement the necessary functionality to handle clock constraints
(ns or jiffy granularity) on state and events.
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260330111010.153663-3-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
|
|
Allows uniwill-laptop feature work that depends on changes in the fixes
branch proceed.
|
|
The TCG Opal device could enter a state where no new session can be
created, blocking even Discovery or PSID reset. While a power cycle
or waiting for the timeout should work, there is another possibility
for recovery: using the Stack Reset command.
The Stack Reset command is defined in the TCG Storage Architecture Core
Specification and is mandatory for all Opal devices (see Section 3.3.6
of the Opal SSC specification).
This patch implements the Stack Reset command. Sending it should clear
all active sessions immediately, allowing subsequent commands to run
successfully. While it is a TCG transport layer command, the Linux
kernel implements only Opal ioctls, so it makes sense to use the
IOC_OPAL ioctl interface.
The Stack Reset takes no arguments; the response can be success or pending.
If the command reports a pending state, userspace can try to repeat it;
in this case, the code returns -EBUSY.
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Reviewed-by: Ondrej Kozina <okozina@redhat.com>
Link: https://patch.msgid.link/20260310095349.411287-1-gmazyland@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
|
|
Now that dev_get_cma_area() is no longer inline, we don't have any user
of dma_contiguous_default_area() outside of contiguous.c so we can make
it static.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20260331-dma-buf-heaps-as-modules-v4-3-e18fda504419@kernel.org
|
|
As we try to enable dma-buf heaps, and the CMA one in particular, to
compile as modules, we need to export dev_get_cma_area(). It's currently
implemented as an inline function that returns either the content of
device->cma_area or dma_contiguous_default_area.
Thus, it means we need to export dma_contiguous_default_area, which
isn't really something we want any module to have access to.
Instead, let's make dev_get_cma_area() a proper function we will be able
to export so we can avoid exporting dma_contiguous_default_area.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20260331-dma-buf-heaps-as-modules-v4-2-e18fda504419@kernel.org
|
|
The CMA heap instantiation was initially developed by having the
contiguous DMA code call into the CMA heap to create a new instance
every time a reserved memory area is probed.
Turning the CMA heap into a module would create a dependency of the
kernel on a module, which doesn't work.
Let's turn the logic around and do the opposite: store all the reserved
memory CMA regions into the contiguous DMA code, and provide an iterator
for the heap to use when it probes.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20260331-dma-buf-heaps-as-modules-v4-1-e18fda504419@kernel.org
|
|
Add of_find_serdev_controller_by_node() API to find the serdev controller
device associated with the devicetree node.
Tested-by: Hans de Goede <johannes.goede@oss.qualcomm.com> # ThinkPad T14s gen6 (arm64)
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20260326-pci-m2-e-v7-2-43324a7866e6@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
|
|
If these helpers receive the 'const struct device' pointer, then the const
qualifier will get dropped, leading to below warning:
warning: passing argument 1 of ‘to_serdev_device_driver’ discards 'const'
qualifier from pointer target type [-Wdiscarded-qualifiers]
This is not an issue as of now, but with the future commits adding serdev
device based driver matching, this warning will get triggered. Hence,
convert these helpers to macros so that the qualifier get preserved and
also use container_of_const() as container_of() is deprecated.
Tested-by: Hans de Goede <johannes.goede@oss.qualcomm.com> # ThinkPad T14s gen6 (arm64)
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20260326-pci-m2-e-v7-1-43324a7866e6@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
|
|
Nobody defines struct ina2xx_platform_data. Remove platform data support
from the drivers which still have it (it's effectively dead code) and
remove the header.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://lore.kernel.org/r/20260326-drop-ina2xx-pdata-v1-1-c159437bb2df@oss.qualcomm.com
[groeck: Fixed continuation line alignment]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
When debugfs is disabled, the hisilicon driver now fails to build:
drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c: In function 'hisi_acc_vfio_debug_init':
drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c:1671:62: error: 'struct vfio_device' has no member named 'debug_root'
1671 | vfio_dev_migration = debugfs_lookup("migration", vdev->debug_root);
| ^~
The driver otherwise relies on dead-code elimination, but this reference
fails. The single struct member is not going to make much of a difference
for memory consumption, so just keep this visible unconditionally.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: b398f91779b8 ("hisi_acc_vfio_pci: register debugfs for hisilicon migration driver")
Link: https://lore.kernel.org/r/20260327165521.3779707-1-arnd@kernel.org
Signed-off-by: Alex Williamson <alex@shazbot.org>
|
|
Similarly to the previous commit, this renames the somewhat confusingly
named function. But in this case, it was at least less confusing: the
__copy_from_user_inatomic_nocache is indeed copying from user memory,
and it is indeed ok to be used in an atomic context, so it will not warn
about it.
But the previous commit also removed the NTB mis-use of the
__copy_from_user_inatomic_nocache() function, and as a result every
call-site is now _actually_ doing a real user copy. That means that we
can now do the proper user pointer verification too.
End result: add proper address checking, remove the double underscores,
and change the "nocache" to "nontemporal" to more accurately describe
what this x86-only function actually does. It might be worth noting
that only the target is non-temporal: the actual user accesses are
normal memory accesses.
Also worth noting is that non-x86 targets (and on older 32-bit x86 CPU's
before XMM2 in the Pentium III) we end up just falling back on a regular
user copy, so nothing can actually depend on the non-temporal semantics,
but that has always been true.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Linux 7.0-rc6
Requested by a few people on irc to resolve conflicts in other tress.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
The Power Management Quality of Service (PM QoS) allows to
aggregate constraints from multiple entities. It is currently
used to manage the min/max frequency of a given policy.
Frequency constraints can come for instance from:
- Thermal framework: acpi_thermal_cpufreq_init()
- Firmware: _PPC objects: acpi_processor_ppc_init()
- User: by setting policyX/scaling_[min|max]_freq
The minimum of the max frequency constraints is used to compute
the resulting maximum allowed frequency.
When enabling boost frequencies, the same frequency request object
(policy->max_freq_req) as to handle requests from users is used.
As a result, when setting:
- scaling_max_freq
- boost
The last sysfs file used overwrites the request from the other
sysfs file.
To avoid this, create a per-policy boost_freq_req to save the boost
constraints instead of overwriting the last scaling_max_freq
constraint.
policy_set_boost() calls the cpufreq set_boost callback.
Update the newly added boost_freq_req request from there:
- whenever boost is toggled
- to cover all possible paths
In the existing .set_boost() callbacks:
- Don't update policy->max as this is done through the qos notifier
cpufreq_notifier_max() which calls cpufreq_set_policy().
- Remove freq_qos_update_request() calls as the qos request is now
done in policy_set_boost() and updates the new boost_freq_req
$ ## Init state
scaling_max_freq:1000000
cpuinfo_max_freq:1000000
$ echo 700000 > scaling_max_freq
scaling_max_freq:700000
cpuinfo_max_freq:1000000
$ echo 1 > ../boost
scaling_max_freq:1200000
cpuinfo_max_freq:1200000
$ echo 800000 > scaling_max_freq
scaling_max_freq:800000
cpuinfo_max_freq:1200000
$ ## Final step:
$ ## Without the patches:
$ echo 0 > ../boost
scaling_max_freq:1000000
cpuinfo_max_freq:1000000
$ ## With the patches:
$ echo 0 > ../boost
scaling_max_freq:800000
cpuinfo_max_freq:1000000
Note:
cpufreq_frequency_table_cpuinfo() updates policy->min
and max from:
A.
cpufreq_boost_set_sw()
\-cpufreq_frequency_table_cpuinfo()
B.
cpufreq_policy_online()
\-cpufreq_table_validate_and_sort()
\-cpufreq_frequency_table_cpuinfo()
Keep these updates as some drivers expect policy->min and
max to be set through B.
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20260326204404.1401849-3-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Now that RCU Tasks Trace is implemented in terms of SRCU-fast, the fact
that each SRCU-fast grace period implies at least two RCU grace periods
in turn means that each RCU Tasks Trace grace period implies at least
two grace periods. This commit therefore updates the documentation
accordingly.
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
|
|
Typo fix in srcu_read_unlock_fast() header comment.
Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
|
|
The SRCU_READ_FLAVOR_FAST and SRCU_READ_FLAVOR_FAST_UPDOWN comments
need repair. The former fails to not that SRCU-fast can be used in NMI
handlers, and the latter says that it goes with srcu_read_lock_fast()
when it really goes with srcu_read_lock_fast_updown(). This commit
therefore fixes both comments.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
|
|
This commit adds a trivial textbook implementation of preemptible RCU
to rcutorture ("torture_type=trivial-preempt"), similar in spirit to the
existing "torture_type=trivial" textbook implementation of non-preemptible
RCU. Neither trivial RCU implementation has any value for production use,
and are intended only to keep Paul honest in his introductory writings
and presentations.
[ paulmck: Apply kernel test robot feedback. ]
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
|
|
Let's bring v7.0-rc6 to the -next branch, so we can merge the DMA
attributes fix [1] without merge conflicts.
[1] https://lore.kernel.org/all/20260323-umem-dma-attrs-v1-1-d6890f2e6a1e@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
* master: (1688 commits)
Linux 7.0-rc6
...
|
|
Merge the new helpers in UBWC driver through a topic branch, to allow
them to be shared with display and video branches as well.
|
|
Currently the database stores macrotile_mode in the data. However it
can be derived from the rest of the data: it should be used for UBWC
encoding >= 3.0 except for several corner cases (SM8150 and SC8180X).
The ubwc_bank_spread field seems to be based on the impreside data we
had for the MDSS and DPU programming. In some cases UBWC engine inside
the display controller doesn't need to program it, although bank spread
is to be enabled.
Bank swizzle is also currently stored as is, but it is almost standard
(banks 1-3 for UBWC 1.0 and 2-3 for other versions), the only exception
being Lemans (it uses only bank 3).
Add helpers returning values from the config for now. They will be
rewritten later, in a separate series, but having the helper now
simplifies refacroring the code later.
Tested-by: Wangao Wang <wangao.wang@oss.qualcomm.com>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260125-iris-ubwc-v4-2-1ff30644ac81@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
MDSS and GPU drivers use different approaches to get min_acc length.
Add helper function that can be used by all the drivers.
The helper reflects our current best guess, it blindly copies the
approach adopted by the MDSS drivers and it matches current values
selected by the GPU driver.
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Acked-by: Bjorn Andersson <andersson@kernel.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Dikshita Agarwal <dikshita.agarwal@oss.qualcomm.com>
Tested-by: Wangao Wang <wangao.wang@oss.qualcomm.com>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260125-iris-ubwc-v4-1-1ff30644ac81@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
Merge branch 'for-7.0' of
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into
asoc-7.1 for both ASoC and general bug fixes to support testing.
|
|
Add a helper function to find the selector for a given value in a linear
range array. The selector should be such that the value it represents
should be higher or equal to the given value.
Signed-off-by: Amit Sunil Dhamne <amitsd@google.com>
Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/20260325-max77759-charger-v9-4-4486dd297adc@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add register bitmasks for charger function.
In addition split the charger IRQs further such that each bit represents
an IRQ downstream of charger regmap irq chip. In addition populate the
ack_base to offload irq ack to the regmap irq chip framework.
Signed-off-by: Amit Sunil Dhamne <amitsd@google.com>
Reviewed-by: André Draszik <andre.draszik@linaro.org>
Link: https://patch.msgid.link/20260325-max77759-charger-v9-3-4486dd297adc@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
In case of insufficient bandwidth usb_submit_urb()
returns -ENOSPC. Translating this to -EIO is not
optimal. There are insufficient resources not
an error. EBUSY is a better fit.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Link: https://patch.msgid.link/20260325145537.372993-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
famfs needs to look up a dax_device by dev_t when resolving fmap
entries that reference character dax devices.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: John Groves <john@groves.net>
Link: https://patch.msgid.link/0100019d311daab5-bb212f0b-4e05-4668-bf53-d76fab56be68-000000@email.amazonses.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
|
|
The fs_dax_get() function should be called by fs-dax file systems after
opening a fsdev dax device. This adds holder_operations, which provides
a memory failure callback path and effects exclusivity between callers
of fs_dax_get().
fs_dax_get() is specific to fsdev_dax, so it checks the driver type
(which required touching bus.[ch]). fs_dax_get() fails if fsdev_dax is
not bound to the memory.
This function serves the same role as fs_dax_get_by_bdev(), which dax
file systems call after opening the pmem block device.
This can't be located in fsdev.c because struct dax_device is opaque
there.
This will be called by fs/fuse/famfs.c in a subsequent commit.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: John Groves <john@groves.net>
Link: https://patch.msgid.link/0100019d311d8750-75395c22-031b-4d5f-aebe-790dca656b87-000000@email.amazonses.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
|
|
Add a new dax_set_ops() function that allows drivers to set the
dax_operations after the dax_device has been allocated. This is needed
for fsdev_dax where the operations need to be set during probe and
cleared during unbind.
The fsdev driver uses devm_add_action_or_reset() for cleanup consistency,
avoiding the complexity of mixing devm-managed resources with manual
cleanup in a remove() callback. This ensures cleanup happens automatically
in the correct reverse order when the device is unbound.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: John Groves <john@groves.net>
Link: https://patch.msgid.link/0100019d311d65a0-b9c1419e-f3a0-4afd-b0bd-848f18ff5950-000000@email.amazonses.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
|
|
Both fs/dax.c:dax_folio_put() and drivers/dax/fsdev.c:
fsdev_clear_folio_state() (the latter coming in the next commit after this
one) contain nearly identical code to reset a compound DAX folio back to
order-0 pages. Factor this out into a shared helper function.
The new dax_folio_reset_order() function:
- Clears the folio's mapping and share count
- Resets compound folio state via folio_reset_order()
- Clears PageHead and compound_head for each sub-page
- Restores the pgmap pointer for each resulting order-0 folio
- Returns the original folio order (for callers that need to advance by
that many pages)
Two intentional differences from the original dax_folio_put() logic:
1. folio->share is cleared unconditionally. This is correct because the DAX
subsystem maintains the invariant that share != 0 only when
mapping == NULL (enforced by dax_folio_make_shared()). dax_folio_put()
ensures share has reached zero before calling this helper, so the
unconditional clear is safe.
2. folio->pgmap is now explicitly restored for order-0 folios. For the
dax_folio_put() caller this is a no-op (reads and writes back the same
field). It is intentional for the upcoming fsdev_clear_folio_state()
caller, which converts previously-compound folios and needs pgmap
re-established for all pages regardless of order.
This simplifies fsdev_clear_folio_state() from ~50 lines to ~15 lines.
Suggested-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: John Groves <john@groves.net>
Link: https://patch.msgid.link/0100019d311cc6b9-5be7428a-7f16-4774-8f90-a44b88ac5660-000000@email.amazonses.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
|
|
The AVERAGE_POWER primitive and RAPL_PRIMITIVE_DERIVED flag are not
used anywhere in the code.
Remove them to simplify the primitive handling logic.
No functional changes.
Co-developed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20260313185333.2370733-2-sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Use the correct function parameter names in kernel-doc comments to
avoid these warnings:
Warning: include/linux/powercap.h:254 function parameter 'name' not
described in 'powercap_register_control_type'
Warning: include/linux/powercap.h:298 function parameter 'nr_constraints'
not described in 'powercap_register_zone'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20260312051444.685136-1-rdunlap@infradead.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The SEV firmware has support to disable SNP during an SNP_SHUTDOWN_EX command.
Verify that this support is available and set the flag so that SNP is disabled
when it is not being used.
In cases where SNP is disabled, skip the call to amd_iommu_snp_disable(), as
all of the IOMMU pages have already been made shared. Also skip the panic
case, since snp_shutdown() does IPIs.
Signed-off-by: Tycho Andersen (AMD) <tycho@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://patch.msgid.link/20260324161301.1353976-7-tycho@kernel.org
|
|
Boris needs 7.0-rc6 for a shmem helper fix.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
|
Correct a function parameter name (s/page/folio/) and add function
return value sections for multiple functions to eliminate
kernel-doc warnings:
Warning: include/linux/sunrpc/xdr.h:298 function parameter 'folio' not
described in 'xdr_set_scratch_folio'
Warning: include/linux/sunrpc/xdr.h:337 No description found for return
value of 'xdr_stream_remaining'
Warning: include/linux/sunrpc/xdr.h:357 No description found for return
value of 'xdr_align_size'
Warning: include/linux/sunrpc/xdr.h:374 No description found for return
value of 'xdr_pad_size'
Warning: include/linux/sunrpc/xdr.h:387 No description found for return
value of 'xdr_stream_encode_item_present'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Previously, Write chunk RDMA Writes were posted via a separate
ib_post_send() call with their own completion handler. Each Write
chunk incurred a doorbell and generated a completion event.
Link Write chunk WRs onto the RPC Reply's Send WR chain so that a
single ib_post_send() call posts both the RDMA Writes and the Send
WR. A single completion event signals that all operations have
finished. This reduces both doorbell rate and completion rate, as
well as eliminating the latency of a round-trip between the Write
chunk completion and the subsequent Send WR posting.
The lifecycle of Write chunk resources changes: previously, the
svc_rdma_write_done() completion handler released Write chunk
resources when RDMA Writes completed. With WR chaining, resources
remain live until the Send completion. A new sc_write_info_list
tracks Write chunk metadata attached to each Send context, and
svc_rdma_write_chunk_release() frees these resources when the
Send context is released.
The svc_rdma_write_done() handler now handles only error cases.
On success it returns immediately since the Send completion handles
resource release. On failure (WR flush), it closes the connection
to signal to the client that the RPC Reply is incomplete.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
When the Send Queue fills, multiple threads may wait for SQ slots.
The previous implementation had no ordering guarantee, allowing
starvation when one thread repeatedly acquires slots while others
wait indefinitely.
Introduce a ticket-based fair queuing system. Each waiter takes a
ticket number and is served in FIFO order. This ensures forward
progress for all waiters when SQ capacity is constrained.
The implementation has two phases:
1. Fast path: attempt to reserve SQ slots without waiting
2. Slow path: take a ticket, wait for turn, then wait for slots
The ticket system adds two atomic counters to the transport:
- sc_sq_ticket_head: next ticket to issue
- sc_sq_ticket_tail: ticket currently being served
A dedicated wait queue (sc_sq_ticket_wait) handles ticket
ordering, separate from sc_send_wait which handles SQ capacity.
This separation ensures that send completions (the high-frequency
wake source) wake only the current ticket holder rather than all
queued waiters. Ticket handoff wakes only the ticket wait queue,
and each ticket holder that exits via connection close propagates
the wake to the next waiter in line.
When a waiter successfully reserves slots, it advances the tail
counter and wakes the next waiter. This creates an orderly handoff
that prevents starvation while maintaining good throughput on the
fast path when contention is low.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
svc_alloc_arg() invokes alloc_pages_bulk() with the full rq_maxpages
count (~259 for 1MB messages) for the rq_respages array, causing a
full-array scan despite most slots holding valid pages.
svc_rqst_release_pages() NULLs only the range
[rq_respages, rq_next_page)
after each RPC, so only that range contains NULL entries. Limit the
rq_respages fill in svc_alloc_arg() to that range instead of
scanning the full array.
svc_init_buffer() initializes rq_next_page to span the entire
rq_respages array, so the first svc_alloc_arg() call fills all
slots.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The rq_pages array holds pages allocated for incoming RPC requests.
Two transport receive paths NULL entries in rq_pages to prevent
svc_rqst_release_pages() from freeing pages that the transport has
taken ownership of:
- svc_tcp_save_pages() moves partial request data pages to
svsk->sk_pages during multi-fragment TCP reassembly.
- svc_rdma_clear_rqst_pages() moves request data pages to
head->rc_pages because they are targets of active RDMA Read WRs.
A new rq_pages_nfree field in struct svc_rqst records how many
entries were NULLed. svc_alloc_arg() uses it to refill only those
entries rather than scanning the full rq_pages array. In steady
state, the transport NULLs a handful of entries per RPC, so the
allocator visits only those entries instead of the full ~259 slots
(for 1MB messages).
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|