Age | Commit message (Collapse) | Author | Files | Lines |
|
node_to_pxm() is used by hv_numa_node_to_pxm_info().
That helper will be used by Hyper-V root partition module code
when CONFIG_MSHV_ROOT=m.
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Tianyu Lan <tiala@microsoft.com>
Link: https://lore.kernel.org/r/1741980536-3865-6-git-send-email-nunodasneves@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1741980536-3865-6-git-send-email-nunodasneves@linux.microsoft.com>
|
|
Factor out the check for enabling auto eoi, to be reused in root
partition code.
No functional changes.
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Tianyu Lan <tiala@microsoft.com>
Link: https://lore.kernel.org/r/1741980536-3865-5-git-send-email-nunodasneves@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1741980536-3865-5-git-send-email-nunodasneves@linux.microsoft.com>
|
|
These non-nested msr and fast hypercall functions are present in x86,
but they must be available in both architectures for the root partition
driver code.
While at it, remove the redundant 'extern' keywords from the
hv_do_hypercall() variants in asm-generic/mshyperv.h.
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Reviewed-by: Roman Kisel <romank@linux.microsoft.com>
Link: https://lore.kernel.org/r/1741980536-3865-4-git-send-email-nunodasneves@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1741980536-3865-4-git-send-email-nunodasneves@linux.microsoft.com>
|
|
Extend the "ms_hyperv_info" structure to include a new field,
"ext_features", for capturing extended Hyper-V features.
Update the "ms_hyperv_init_platform" function to retrieve these features
using the cpuid instruction and include them in the informational output.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Reviewed-by: Roman Kisel <romank@linux.microsoft.com>
Reviewed-by: Tianyu Lan <tiala@microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/1741980536-3865-3-git-send-email-nunodasneves@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1741980536-3865-3-git-send-email-nunodasneves@linux.microsoft.com>
|
|
Introduce hv_status_printk() macros as a convenience to log hypercall
errors, formatting them with the status code (HV_STATUS_*) as a raw hex
value and also as a string, which saves some time while debugging.
Create a table of HV_STATUS_ codes with strings and mapped errnos, and
use it for hv_result_to_string() and hv_result_to_errno().
Use the new hv_status_printk()s in hv_proc.c, hyperv-iommu.c, and
irqdomain.c hypercalls to aid debugging in the root partition.
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Link: https://lore.kernel.org/r/1741980536-3865-2-git-send-email-nunodasneves@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1741980536-3865-2-git-send-email-nunodasneves@linux.microsoft.com>
|
|
snp_set_vmsa() returns 0 as success result and so fix it.
Cc: stable@vger.kernel.org
Fixes: 44676bb9d566 ("x86/hyperv: Add smp support for SEV-SNP guest")
Signed-off-by: Tianyu Lan <tiala@microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/20250313085217.45483-1-ltykernel@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250313085217.45483-1-ltykernel@gmail.com>
|
|
The kernel runs as a firmware in the VTL mode, and the only way
to restart in the VTL mode on x86 is to triple fault. Thus, one
has to always supply "reboot=t" on the kernel command line in the
VTL mode, and missing that renders rebooting not working.
Define the machine restart callback to always use the triple
fault to provide the robust configuration by default.
Signed-off-by: Roman Kisel <romank@linux.microsoft.com>
Link: https://lore.kernel.org/r/20250227214728.15672-3-romank@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250227214728.15672-3-romank@linux.microsoft.com>
|
|
By default, X86(-64) systems use the emergecy restart routine
in the course of which the code unconditionally writes to
the physical address of 0x472 to indicate the boot mode
to the firmware (BIOS or UEFI).
When the kernel itself runs as a firmware in the VTL mode,
that write corrupts the memory of the guest upon emergency
restarting. Preserving the state intact in that situation
is important for debugging, at least.
Define the specialized machine callback to avoid that write
and use the triple fault to perform emergency restart.
Signed-off-by: Roman Kisel <romank@linux.microsoft.com>
Link: https://lore.kernel.org/r/20250227214728.15672-2-romank@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250227214728.15672-2-romank@linux.microsoft.com>
|
|
The union vmpacket_largest_possible_header and several structs have not
been used for a long time afaict - remove them.
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://lore.kernel.org/r/20250311091634.494888-2-thorsten.blum@linux.dev
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250311091634.494888-2-thorsten.blum@linux.dev>
|
|
CONFIG_MSHV_ROOT allows kernels built to run as a normal Hyper-V guest
to exclude the root partition code, which is expected to grow
significantly over time.
This option is a tristate so future driver code can be built as a
(m)odule, allowing faster development iteration cycles.
If CONFIG_MSHV_ROOT is disabled, don't compile hv_proc.c, and stub
hv_root_partition() to return false unconditionally. This allows the
compiler to optimize away root partition code blocks since they will
be disabled at compile time.
In the case of booting as root partition *without* CONFIG_MSHV_ROOT
enabled, print a critical error (the kernel will likely crash).
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/1740167795-13296-4-git-send-email-nunodasneves@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1740167795-13296-4-git-send-email-nunodasneves@linux.microsoft.com>
|
|
Introduce hv_curr_partition_type to store the partition type
as an enum.
Right now this is limited to guest or root partition, but there will
be other kinds in future and the enum is easily extensible.
Set up hv_curr_partition_type early in Hyper-V initialization with
hv_identify_partition_type(). hv_root_partition() just queries this
value, and shouldn't be called before that.
Making this check into a function sets the stage for adding a config
option to gate the compilation of root partition code. In particular,
hv_root_partition() can be stubbed out always be false if root
partition support isn't desired.
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/1740167795-13296-3-git-send-email-nunodasneves@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1740167795-13296-3-git-send-email-nunodasneves@linux.microsoft.com>
|
|
Return linux-friendly error codes from hypercall helper functions,
which allows them to be used more flexibly.
Introduce hv_result_to_errno() for this purpose, which also handles
the special value U64_MAX returned from hv_do_hypercall().
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/1740167795-13296-2-git-send-email-nunodasneves@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1740167795-13296-2-git-send-email-nunodasneves@linux.microsoft.com>
|
|
Currently, it is tedious to offline CPUs in a Hyper-V VM since CPUs may
have VMBus channels attached to them that a user would have to manually
rebind elsewhere. So, as made mention of in
commit d570aec0f2154 ("Drivers: hv: vmbus: Synchronize init_vp_index()
vs. CPU hotplug"), rebind channels associated with CPUs that a user is
trying to offline to a new "randomly" selected CPU.
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Michael Kelley <mhklinux@outlook.com>
Cc: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/20250117203309.192072-3-hamzamahfooz@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250117203309.192072-3-hamzamahfooz@linux.microsoft.com>
|
|
The core functionality in target_cpu_store() is also needed in a
subsequent patch for automatically changing the CPU when taking
a CPU offline. As such, factor out the body of target_cpu_store()
into new function vmbus_channel_set_cpu() that can also be used
elsewhere.
No functional change is intended.
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Michael Kelley <mhklinux@outlook.com>
Cc: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/20250117203309.192072-2-hamzamahfooz@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250117203309.192072-2-hamzamahfooz@linux.microsoft.com>
|
|
If CONFIG_HYPERV=m, lockdep_assert_cpus_held() is undefined for HyperV.
So, export the function so that GPL drivers can use it more broadly.
Cc: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20250117203309.192072-1-hamzamahfooz@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250117203309.192072-1-hamzamahfooz@linux.microsoft.com>
|
|
These helpers are not specific to x86_64 and will be needed by common code.
Remove some unnecessary #includes.
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Link: https://lore.kernel.org/r/1738955002-20821-3-git-send-email-nunodasneves@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1738955002-20821-3-git-send-email-nunodasneves@linux.microsoft.com>
|
|
Move hv_current_partition_id and hv_get_partition_id() to hv_common.c,
and call hv_get_partition_id() on arm64 in hyperv_init(). These aren't
specific to x86_64 and will be needed by common code.
Set hv_current_partition_id to HV_PARTITION_ID_SELF by default.
Rename struct hv_get_partition_id to hv_output_get_partition_id, to
make it distinct from the function hv_get_partition_id(), and match
the original Hyper-V struct name.
Remove the BUG()s. Failing to get the id need not crash the machine.
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/1738955002-20821-2-git-send-email-nunodasneves@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1738955002-20821-2-git-send-email-nunodasneves@linux.microsoft.com>
|
|
|
|
Commit 3775fc538f53 ("PM: sleep: core: Synchronize runtime PM status of
parents and children") exposed an issue related to simple_pm_bus_pm_ops
that uses pm_runtime_force_suspend() and pm_runtime_force_resume() as
bus type PM callbacks for the noirq phases of system-wide suspend and
resume.
The problem is that pm_runtime_force_suspend() does not distinguish
runtime-suspended devices from devices for which runtime PM has never
been enabled, so if it sees a device with runtime PM status set to
RPM_ACTIVE, it will assume that runtime PM is enabled for that device
and so it will attempt to suspend it with the help of its runtime PM
callbacks which may not be ready for that. As it turns out, this
causes simple_pm_bus_runtime_suspend() to crash due to a NULL pointer
dereference.
Another problem related to the above commit and simple_pm_bus_pm_ops is
that setting runtime PM status of a device handled by the latter to
RPM_ACTIVE will actually prevent it from being resumed because
pm_runtime_force_resume() only resumes devices with runtime PM status
set to RPM_SUSPENDED.
To mitigate these issues, do not allow power.set_active to propagate
beyond the parent of the device with DPM_FLAG_SMART_SUSPEND set that
will need to be resumed, which should be a sufficient stop-gap for the
time being, but they will need to be properly addressed in the future
because in general during system-wide resume it is necessary to resume
all devices in a dependency chain in which at least one device is going
to be resumed.
Fixes: 3775fc538f53 ("PM: sleep: core: Synchronize runtime PM status of parents and children")
Closes: https://lore.kernel.org/linux-pm/1c2433d4-7e0f-4395-b841-b8eac7c25651@nvidia.com/
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/6137505.lOV4Wx5bFT@rjwysocki.net
|
|
The code was restructured where the function graph notrace code, that
would not trace a function and all its children is done by setting a
NOTRACE flag when the function that is not to be traced is hit.
There's a TRACE_GRAPH_NOTRACE_BIT which defines the bit in the flags and a
TRACE_GRAPH_NOTRACE which is the mask with that bit set. But the
restructuring used TRACE_GRAPH_NOTRACE_BIT when it should have used
TRACE_GRAPH_NOTRACE.
For example:
# cd /sys/kernel/tracing
# echo set_track_prepare stack_trace_save > set_graph_notrace
# echo function_graph > current_tracer
# cat trace
[..]
0) | __slab_free() {
0) | free_to_partial_list() {
0) | arch_stack_walk() {
0) | __unwind_start() {
0) 0.501 us | get_stack_info();
Where a non filter trace looks like:
# echo > set_graph_notrace
# cat trace
0) | free_to_partial_list() {
0) | set_track_prepare() {
0) | stack_trace_save() {
0) | arch_stack_walk() {
0) | __unwind_start() {
Where the filter should look like:
# cat trace
0) | free_to_partial_list() {
0) | _raw_spin_lock_irqsave() {
0) 0.350 us | preempt_count_add();
0) 0.351 us | do_raw_spin_lock();
0) 2.440 us | }
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/20250208001511.535be150@batman.local.home
Fixes: b84214890a9bc ("function_graph: Move graph notrace bit to shadow stack global var")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
-Wenum-enum-conversion was strengthened in clang-19 to warn for C, which
caused the kernel to move it to W=1 in commit 75b5ab134bb5 ("kbuild:
Move -Wenum-{compare-conditional,enum-conversion} into W=1") because
there were numerous instances that would break builds with -Werror.
Unfortunately, this is not a full solution, as more and more developers,
subsystems, and distributors are building with W=1 as well, so they
continue to see the numerous instances of this warning.
Since the move to W=1, there have not been many new instances that have
appeared through various build reports and the ones that have appeared
seem to be following similar existing patterns, suggesting that most
instances of this warning will not be real issues. The only alternatives
for silencing this warning are adding casts (which is generally seen as
an ugly practice) or refactoring the enums to macro defines or a unified
enum (which may be undesirable because of type safety in other parts of
the code).
Move the warning to W=2, where warnings that occur frequently but may be
relevant should reside.
Cc: stable@vger.kernel.org
Fixes: 75b5ab134bb5 ("kbuild: Move -Wenum-{compare-conditional,enum-conversion} into W=1")
Link: https://lore.kernel.org/ZwRA9SOcOjjLJcpi@google.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
While attempting to build a Debian packages with CC="ccache gcc", I
saw the following error as builddeb builds linux-headers-$KERNELVERSION:
make HOSTCC=ccache gcc VPATH= srcroot=. -f ./scripts/Makefile.build obj=debian/linux-headers-6.14.0-rc1/usr/src/linux-headers-6.14.0-rc1/scripts
make[6]: *** No rule to make target 'gcc'. Stop.
Upon investigation, it seems that one instance of $(CC) variable reference
in ./scripts/package/install-extmod-build was missing quotation marks,
causing the above error.
Add the missing quotation marks around $(CC) to fix build.
Fixes: 5f73e7d0386d ("kbuild: refactor cross-compiling linux-headers package")
Co-developed-by: Mingcong Bai <jeffbai@aosc.io>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: WangYuli <wangyuli@uniontech.com>
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
I no longer have any faith left in the kernel development process or
community management approach.
Apple/ARM platform development will continue downstream. If I feel like
sending some patches upstream in the future myself for whatever subtree
I may, or I may not. Anyone who feels like fighting the upstreaming
fight themselves is welcome to do so.
Signed-off-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
I need to filter my emails better, switch to pavel@kernel.org address
to help with that.
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This costs a strlen() call when instatianating a symlink.
Preferably it would be hidden behind VFS_WARN_ON (or compatible), but
there is no such facility at the moment. With the facility in place the
call can be patched out in production kernels.
In the meantime, since the cost is being paid unconditionally, use the
result to a fixup the bad caller.
This is not expected to persist in the long run (tm).
Sample splat:
bad length passed for symlink [/tmp/syz-imagegen43743633/file0/file0] (got 131109, expected 37)
[rest of WARN blurp goes here]
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://lore.kernel.org/r/20250204213207.337980-1-mjguzik@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Pidfs supports extensible and non-extensible ioctls. The extensible
ioctls need to check for the ioctl number itself not just the ioctl
command otherwise both backward- and forward compatibility are broken.
The pidfs ioctl handler also needs to look at the type of the ioctl
command to guard against cases where "[...] a daemon receives some
random file descriptor from a (potentially less privileged) client and
expects the FD to be of some specific type, it might call ioctl() on
this FD with some type-specific command and expect the call to fail if
the FD is of the wrong type; but due to the missing type check, the
kernel instead performs some action that userspace didn't expect."
(cf. [1]]
Link: https://lore.kernel.org/r/20250204-work-pidfs-ioctl-v1-1-04987d239575@kernel.org
Link: https://lore.kernel.org/r/CAG48ez2K9A5GwtgqO31u9ZL292we8ZwAA=TJwwEv7wRuJ3j4Lw@mail.gmail.com [1]
Fixes: 8ce352818820 ("pidfs: check for valid ioctl commands")
Acked-by: Luca Boccassi <luca.boccassi@gmail.com>
Reported-by: Jann Horn <jannh@google.com>
Cc: stable@vger.kernel.org # v6.13; please backport with 8ce352818820 ("pidfs: check for valid ioctl commands")
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
After introducing pre-content events, we had a regression related to
disabling huge faults on files that should never have pre-content events
enabled.
This happened because the default f_mode of allocated files (0) does
not disable pre-content events.
Pre-content events are disabled in file_set_fsnotify_mode_by_watchers()
but internal files may not get to call this helper.
Initialize f_mode to disable permission and pre-content events for all
files and if needed they will be enabled for the callers of
file_set_fsnotify_mode_by_watchers().
Fixes: 20bf82a898b6 ("mm: don't allow huge faults for files with pre content watches")
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Closes: https://lore.kernel.org/linux-fsdevel/20250131121703.1e4d00a7.alex.williamson@redhat.com/
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://lore.kernel.org/r/20250203223205.861346-4-amir73il@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
STATMOUNT_MNT_OPTS can actually be missing if there are no options. This
is a change of behavior since 75ead69a7173 ("fs: don't let statmount return
empty strings").
The other checks shouldn't actually trigger, but add them for correctness
and for easier debugging if the test fails.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Link: https://lore.kernel.org/r/20250129160641.35485-1-mszeredi@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Most pseudo files are not applicable for fsnotify events at all,
let alone to the new pre-content events.
Disable notifications to all files allocated with alloc_file_pseudo()
and enable legacy inotify events for the specific cases of pipe and
socket, which have known users of inotify events.
Pre-content events are also kept disabled for sockets and pipes.
Fixes: 20bf82a898b6 ("mm: don't allow huge faults for files with pre content watches")
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Closes: https://lore.kernel.org/linux-fsdevel/20250131121703.1e4d00a7.alex.williamson@redhat.com/
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/linux-fsdevel/CAHk-=wi2pThSVY=zhO=ZKxViBj5QCRX-=AS2+rVknQgJnHXDFg@mail.gmail.com/
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://lore.kernel.org/r/20250203223205.861346-3-amir73il@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Prepending security options was made conditional on sb->s_op->show_options,
but security options are independent of sb options.
Fixes: 056d33137bf9 ("fs: prepend statmount.mnt_opts string with security_sb_mnt_opts()")
Fixes: f9af549d1fd3 ("fs: export mount options via statmount()")
Cc: stable@vger.kernel.org # v6.11
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Link: https://lore.kernel.org/r/20250129151253.33241-1-mszeredi@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The FMODE_NONOTIFY_* bits are a 2-bits mode. Open coding manipulation
of those bits is risky. Use an accessor file_set_fsnotify_mode() to
set the mode.
Rename file_set_fsnotify_mode() => file_set_fsnotify_mode_from_watchers()
to make way for the simple accessor name.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://lore.kernel.org/r/20250203223205.861346-2-amir73il@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
All users of lockref_init() now initialize the count to 1, so hardcode
that and remove the count argument.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Link: https://lore.kernel.org/r/20250130135624.1899988-4-agruenba@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
In qd_alloc(), initialize the lockref count to 1 to cover the common
case. Compensate for that in gfs2_quota_init() by adjusting the count
back down to 0; this only occurs when mounting the filesystem rw.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Link: https://lore.kernel.org/r/20250130135624.1899988-3-agruenba@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Move the initialization of gl_lockref from gfs2_init_glock_once() to
gfs2_glock_get(). This allows to use lockref_init() there.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Link: https://lore.kernel.org/r/20250130135624.1899988-2-agruenba@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Just like it's normal for unset values to be zero, unset strings should be
empty instead of containing random values.
It seems to be a typical mistake that the mask returned by statmount is not
checked, which can result in various bugs.
With this fix, these bugs are prevented, since it is highly likely that
userspace would just want to turn the missing mask case into an empty
string anyway (most of the recently found cases are of this type).
Link: https://lore.kernel.org/all/CAJfpegsVCPfCn2DpM8iiYSS5DpMsLB8QBUCHecoj6s0Vxf4jzg@mail.gmail.com/
Fixes: 68385d77c05b ("statmount: simplify string option retrieval")
Fixes: 46eae99ef733 ("add statmount(2) syscall")
Cc: stable@vger.kernel.org # v6.8
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Link: https://lore.kernel.org/r/20250130121500.113446-1-mszeredi@redhat.com
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Building with GCC 15 results in build error
fs/vboxsf/super.c:24:54: error: initializer-string for array of ‘unsigned char’ is too long [-Werror=unterminated-string-initialization]
24 | static const unsigned char VBSF_MOUNT_SIGNATURE[4] = "\000\377\376\375";
| ^~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Due to GCC having enabled -Werror=unterminated-string-initialization[0]
by default. Separately initializing each array element of
VBSF_MOUNT_SIGNATURE to ensure NUL termination, thus satisfying GCC 15
and fixing the build error.
[0]: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wno-unterminated-string-initialization
Signed-off-by: Brahmajit Das <brahmajit.xyz@gmail.com>
Link: https://lore.kernel.org/r/20250121162648.1408743-1-brahmajit.xyz@gmail.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Clang static checker(scan-build) warning:
fs/stat.c:287:21: warning: The left expression of the compound assignment is
an uninitialized value. The computed value will also be garbage.
287 | stat->result_mask |= STATX_MNT_ID_UNIQUE;
| ~~~~~~~~~~~~~~~~~ ^
fs/stat.c:290:21: warning: The left expression of the compound assignment is
an uninitialized value. The computed value will also be garbage.
290 | stat->result_mask |= STATX_MNT_ID;
When vfs_getattr() failed because of security_inode_getattr(), 'stat' is
uninitialized. In this case, there is a harmless garbage problem in
vfs_statx_path(). It's better to return error directly when
vfs_getattr() failed, avoiding garbage value and more clearly.
Signed-off-by: Su Hui <suhui@nfschina.com>
Link: https://lore.kernel.org/r/20250119025946.1168957-1-suhui@nfschina.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Before attaching a new root to the old root, the children counter of the
new root is checked to verify that only the upcoming CPU's top group have
been connected to it. However since the recently added commit b729cc1ec21a
("timers/migration: Fix another race between hotplug and idle entry/exit")
this check is not valid anymore because the old root is pre-accounted
as a child to the new root. Therefore after connecting the upcoming
CPU's top group to the new root, the children count to be expected must
be 2 and not 1 anymore.
This omission results in the old root to not be connected to the new
root. Then eventually the system may run with more than one top level,
which defeats the purpose of a single idle migrator.
Also the old root is pre-accounted but not connected upon the new root
creation. But it can be connected to the new root later on. Therefore
the old root may be accounted twice to the new root. The propagation of
such overcommit can end up creating a double final top-level root with a
groupmask incorrectly initialized. Although harmless given that the final
top level roots will never have a parent to walk up to, this oddity
opportunistically reported the core issue:
WARNING: CPU: 8 PID: 0 at kernel/time/timer_migration.c:543 tmigr_requires_handle_remote
CPU: 8 UID: 0 PID: 0 Comm: swapper/8
RIP: 0010:tmigr_requires_handle_remote
Call Trace:
<IRQ>
? tmigr_requires_handle_remote
? hrtimer_run_queues
update_process_times
tick_periodic
tick_handle_periodic
__sysvec_apic_timer_interrupt
sysvec_apic_timer_interrupt
</IRQ>
Fix the problem by taking the old root into account in the children count
of the new root so the connection is not omitted.
Also warn when more than one top level group exists to better detect
similar issues in the future.
Fixes: b729cc1ec21a ("timers/migration: Fix another race between hotplug and idle entry/exit")
Reported-by: Matt Fleming <mfleming@cloudflare.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250205160220.39467-1-frederic@kernel.org
|
|
The space separator was factored out from the multiple chip name prints,
but several irq_chip::irq_print_chip() callbacks still print a leading
space. Remove the superfluous double spaces.
Fixes: 9d9f204bdf7243bf ("genirq/proc: Add missing space separator back")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/893f7e9646d8933cd6786d5a1ef3eb076d263768.1738764803.git.geert+renesas@glider.be
|
|
Previously, bch2_bkey_sectors_need_rebalance() called
bch2_target_accepts_data(), checking whether the target is writable.
However, this means that adding or removing devices from a target would
change the value of bch2_bkey_sectors_need_rebalance() for an existing
extent; this needs to be invariant so that the extent trigger can
correctly maintain rebalance_work accounting.
Instead, check target_accepts_data() in io_opts_to_rebalance_opts(),
before creating the bch_extent_rebalance entry.
This fixes (one?) cause of rebalance_work accounting being off.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Spotted by sparse.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The discard path is supposed to issue journal flushes when there's too
many buckets empty buckets that need a journal commit before they can be
written to again, but at some point this code seems to have been lost.
Bring it back with a new optimization to make sure we don't issue too
many journal flushes: the journal now tracks the sequence number of the
most recent flush in progress, which the discard path uses when deciding
which buckets need a journal flush.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
In the previous commit b3d82c2f2761, code was added to prevent journal sequence
overflow. Among them, the code added to journal_entry_open() uses the
bch2_fs_fatal_err_on() function to handle errors.
However, __journal_res_get() , which calls journal_entry_open() , calls
journal_entry_open() while holding journal->lock , but bch2_fs_fatal_err_on()
internally tries to acquire journal->lock , which results in a deadlock.
So we need to add a locked helper to handle fatal errors even when the
journal->lock is held.
Fixes: b3d82c2f2761 ("bcachefs: Guard against journal seq overflow")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
For some unknown reason, checks on struct bkey_s_c_snapshot and struct
bkey_s_c_snapshot_tree pointers are missing.
Therefore, I think it would be appropriate to fix the incorrect pointer checking
through this patch.
Fixes: 4bd06f07bcb5 ("bcachefs: Fixes for snapshot_tree.master_subvol")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add an (initial?) patch submission checklist, focusing mainly on
testing.
Yes, all patches must be tested, and that starts (but does not end) with
the patch author.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The destination argument of memtostr*() and strtomem*() must be a
fixed-size char array at compile time, so there is no need to use
__builtin_object_size() (which is useful for when an argument is
either a pointer or unknown). Instead use ARRAY_SIZE(), which has the
benefit of working around a bug in Clang (fixed[1] in 15+) that got
__builtin_object_size() wrong sometimes.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202501310832.kiAeOt2z-lkp@intel.com/
Suggested-by: Kent Overstreet <kent.overstreet@linux.dev>
Link: https://github.com/llvm/llvm-project/commit/d8e0a6d5e9dd2311641f9a8a5d2bf90829951ddc [1]
Tested-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
In preparation for adding stricter type checking to the str/mem*()
helpers, provide a way to check that a variable is a byte array
via __must_be_byte_array().
Suggested-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
The C kernel helpers for evaluating C Strings were positioned where they
were visible to assembly inclusion, which was not intended. Move them
into the kernel and C-only area of the header so future changes won't
confuse the assembler.
Fixes: d7a516c6eeae ("compiler.h: Fix undefined BUILD_BUG_ON_ZERO()")
Fixes: 559048d156ff ("string: Check for "nonstring" attribute on strscpy() arguments")
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
When using Rust on the x86 architecture, we are currently using the
unstable target.json feature to specify the compilation target. Rustc is
going to change how softfloat is specified in the target.json file on
x86, thus update generate_rust_target.rs to specify softfloat using the
new option.
Note that if you enable this parameter with a compiler that does not
recognize it, then that triggers a warning but it does not break the
build.
[ For future reference, this solves the following error:
RUSTC L rust/core.o
error: Error loading target specification: target feature
`soft-float` is incompatible with the ABI but gets enabled in
target spec. Run `rustc --print target-list` for a list of
built-in targets
- Miguel ]
Cc: <stable@vger.kernel.org> # Needed in 6.12.y and 6.13.y only (Rust is pinned in older LTSs).
Link: https://github.com/rust-lang/rust/pull/136146
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com> # for x86
Link: https://lore.kernel.org/r/20250203-rustc-1-86-x86-softfloat-v1-1-220a72a5003e@google.com
[ Added 6.13.y too to Cc: stable tag and added reasoning to avoid
over-backporting. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
The uretprobe syscall is implemented as a performance enhancement on
x86_64 by having the kernel inject a call to it on function exit; User
programs cannot call this system call explicitly.
As such, this syscall is considered a kernel implementation detail and
should not be filtered by seccomp.
Enhance the seccomp bpf test suite to check that uretprobes can be
attached to processes without the killing the process regardless of
seccomp policy.
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Link: https://lore.kernel.org/r/20250202162921.335813-3-eyal.birger@gmail.com
[kees: Skip archs without __NR_uretprobe]
Signed-off-by: Kees Cook <kees@kernel.org>
|