aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2025-01-03cxl/pmem: Remove is_cxl_nvdimm_bridge()Zijun Hu2-7/+0
Remove is_cxl_nvdimm_bridge() which has no caller now. Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20241224-const_dfc_done-v5-11-6623037414d4@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-03cxl/pmem: Replace match_nvdimm_bridge() with API device_match_type()Zijun Hu1-6/+3
Static match_nvdimm_bridge(), as matching function of device_find_child() matches a device with device type @cxl_nvdimm_bridge_type, and its task can be simplified by the recently introduced API device_match_type(). Replace match_nvdimm_bridge() usage with device_match_type(). Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20241224-const_dfc_done-v5-10-6623037414d4@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-03driver core: Introduce an device matching API device_match_type()Zijun Hu2-0/+7
Introduce device_match_type() for purposes below: - Test if a device matches with a specified device type. - As argument of various device finding APIs to find a device with specified type. device_find_child() will use it to simplify operations later. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241224-const_dfc_done-v5-9-6623037414d4@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-03gpio: sim: Remove gpio_sim_dev_match_fwnode()Zijun Hu1-6/+1
gpio_sim_dev_match_fwnode() is a simple wrapper of API device_match_fwnode(). Remove the needless wrapper and use the API instead. Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241224-const_dfc_done-v5-8-6623037414d4@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-03slimbus: core: Remove of_slim_match_dev()Zijun Hu1-9/+1
static of_slim_match_dev() has same function as API device_match_of_node(). Remove the former and use the later instead. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241224-const_dfc_done-v5-7-6623037414d4@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-03driver core: Remove match_any()Zijun Hu1-6/+1
Static match_any() is now exactly same as API device_match_any(). Remove the former and use the later instead. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241224-const_dfc_done-v5-6-6623037414d4@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-03driver core: Simplify API device_find_child_by_name() implementationZijun Hu1-12/+1
Simplify device_find_child_by_name() implementation by both existing API device_find_child() and device_match_name(). Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241224-const_dfc_done-v5-5-6623037414d4@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-03driver core: Constify API device_find_child() and adapt for various usagesZijun Hu28-62/+66
Constify the following API: struct device *device_find_child(struct device *dev, void *data, int (*match)(struct device *dev, void *data)); To : struct device *device_find_child(struct device *dev, const void *data, device_match_t match); typedef int (*device_match_t)(struct device *dev, const void *data); with the following reasons: - Protect caller's match data @*data which is for comparison and lookup and the API does not actually need to modify @*data. - Make the API's parameters (@match)() and @data have the same type as all of other device finding APIs (bus|class|driver)_find_device(). - All kinds of existing device match functions can be directly taken as the API's argument, they were exported by driver core. Constify the API and adapt for various existing usages. BTW, various subsystem changes are squashed into this commit to meet 'git bisect' requirement, and this commit has the minimal and simplest changes to complement squashing shortcoming, and that may bring extra code improvement. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Acked-by: Uwe Kleine-König <ukleinek@kernel.org> # for drivers/pwm Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Link: https://lore.kernel.org/r/20241224-const_dfc_done-v5-4-6623037414d4@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-03bus: fsl-mc: Constify fsl_mc_device_match()Zijun Hu1-2/+2
fsl_mc_device_match() does not modify caller's inputs. To prepare for constifying API device_find_child() later. Constify this comparison function by simply changing its parameter types to const pointer. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241224-const_dfc_done-v5-3-6623037414d4@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-03slimbus: core: Constify slim_eaddr_equal()Zijun Hu1-1/+2
bool slim_eaddr_equal(struct slim_eaddr *a, struct slim_eaddr *b) does not modify @*a or @*b. To prepare for constifying API device_find_child() later. Constify this comparison function by simply changing its parameter type to 'const struct slim_eaddr *'. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241224-const_dfc_done-v5-2-6623037414d4@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-03libnvdimm: Replace namespace_match() with device_find_child_by_name()Zijun Hu1-8/+1
Simplify nd_namespace_store() implementation by using device_find_child_by_name(). Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241224-const_dfc_done-v5-1-6623037414d4@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-24drivers: base: test: Add ...find_device_by...(... NULL) testsBrian Norris1-1/+40
We recently updated these device_match*() (and therefore, various *find_device_by*()) functions to return a consistent 'false' value when trying to match a NULL handle. Add tests for this. This provides regression-testing coverage for the sorts of bugs that underly commit 5c8418cf4025 ("PCI/pwrctrl: Unregister platform device only if one actually exists"). Reviewed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Brian Norris <briannorris@chromium.org> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: David Gow <davidgow@google.com> Link: https://lore.kernel.org/r/20241216201148.535115-4-briannorris@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-24drivers: base: test: Enable device model tests with KUNIT_ALL_TESTSBrian Norris1-0/+1
Per commit bebe94b53eb7 ("drivers: base: default KUNIT_* fragments to KUNIT_ALL_TESTS"), it seems like we should default to KUNIT_ALL_TESTS. This enables these platform_device tests for common configurations, such as with: ./tools/testing/kunit/kunit.py run Signed-off-by: Brian Norris <briannorris@chromium.org> Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Maxime Ripard <mripard@kernel.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20241216201148.535115-3-briannorris@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-24drivers: base: Don't match devices with NULL of_node/fwnode/etcBrian Norris1-4/+4
of_find_device_by_node(), bus_find_device_by_of_node(), bus_find_device_by_fwnode(), ..., all produce arbitrary results when provided with a NULL of_node, fwnode, ACPI handle, etc. This is counterintuitive, and the source of a few bugs, such as the one fixed by commit 5c8418cf4025 ("PCI/pwrctrl: Unregister platform device only if one actually exists"). It's hard to imagine a good reason that these device_match_*() APIs should return 'true' for a NULL argument. Augment these to return 0 (false). Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Brian Norris <briannorris@chromium.org> Acked-by: David Gow <davidgow@google.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20241216201148.535115-2-briannorris@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-24kheaders: Simplify attribute through __BIN_ATTR_SIMPLE_RO()Thomas Weißschuh1-16/+3
The utility macro from the sysfs core is sufficient to implement this attribute. Make use of it. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20241221-sysfs-const-bin_attr-kheaders-v2-1-8205538aa012@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-20MAINTAINERS: add Danilo to DRIVER COREDanilo Krummrich1-0/+1
Let's keep an eye on the Rust abstractions; add myself as reviewer to DRIVER CORE. Signed-off-by: Danilo Krummrich <dakr@kernel.org> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Tested-by: Fabien Parent <fabien.parent@linaro.org> Link: https://lore.kernel.org/r/20241219170425.12036-17-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-20samples: rust: add Rust platform sample driverDanilo Krummrich5-0/+66
Add a sample Rust platform driver illustrating the usage of the platform bus abstractions. This driver probes through either a match of device / driver name or a match within the OF ID table. Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Tested-by: Fabien Parent <fabien.parent@linaro.org> Link: https://lore.kernel.org/r/20241219170425.12036-16-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-20rust: platform: add basic platform device / driver abstractionsDanilo Krummrich6-0/+215
Implement the basic platform bus abstractions required to write a basic platform driver. This includes the following data structures: The `platform::Driver` trait represents the interface to the driver and provides `platform::Driver::probe` for the driver to implement. The `platform::Device` abstraction represents a `struct platform_device`. In order to provide the platform bus specific parts to a generic `driver::Registration` the `driver::RegistrationOps` trait is implemented by `platform::Adapter`. Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Link: https://lore.kernel.org/r/20241219170425.12036-15-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-20rust: driver: implement `Adapter`Danilo Krummrich2-1/+58
In order to not duplicate code in bus specific implementations (e.g. platform), implement a generic `driver::Adapter` to represent the connection of matched drivers and devices. Bus specific `Adapter` implementations can simply implement this trait to inherit generic functionality, such as matching OF or ACPI device IDs and ID table entries. Suggested-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Link: https://lore.kernel.org/r/20241219170425.12036-14-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-20rust: of: add `of::DeviceId` abstractionDanilo Krummrich3-0/+62
`of::DeviceId` is an abstraction around `struct of_device_id`. This is used by subsequent patches, in particular the platform bus abstractions, to create OF device ID tables. Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Tested-by: Fabien Parent <fabien.parent@linaro.org> Link: https://lore.kernel.org/r/20241219170425.12036-13-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-20samples: rust: add Rust PCI sample driverDanilo Krummrich4-0/+123
This commit adds a sample Rust PCI driver for QEMU's "pci-testdev" device. To enable this device QEMU has to be called with `-device pci-testdev`. The same driver shows how to use the PCI device / driver abstractions, as well as how to request and map PCI BARs, including a short sequence of MMIO operations. Signed-off-by: Danilo Krummrich <dakr@kernel.org> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Link: https://lore.kernel.org/r/20241219170425.12036-12-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-20rust: pci: implement I/O mappable `pci::Bar`Danilo Krummrich1-1/+141
Implement `pci::Bar`, `pci::Device::iomap_region` and `pci::Device::iomap_region_sized` to allow for I/O mappings of PCI BARs. To ensure that a `pci::Bar`, and hence the I/O memory mapping, can't out-live the PCI device, the `pci::Bar` type is always embedded into a `Devres` container, such that the `pci::Bar` is revoked once the device is unbound and hence the I/O mapped memory is unmapped. A `pci::Bar` can be requested with (`pci::Device::iomap_region_sized`) or without (`pci::Device::iomap_region`) a const generic representing the minimal requested size of the I/O mapped memory region. In case of the latter only runtime checked I/O reads / writes are possible. Co-developed-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Link: https://lore.kernel.org/r/20241219170425.12036-11-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-20rust: pci: add basic PCI device / driver abstractionsDanilo Krummrich6-0/+315
Implement the basic PCI abstractions required to write a basic PCI driver. This includes the following data structures: The `pci::Driver` trait represents the interface to the driver and provides `pci::Driver::probe` for the driver to implement. The `pci::Device` abstraction represents a `struct pci_dev` and provides abstractions for common functions, such as `pci::Device::set_master`. In order to provide the PCI specific parts to a generic `driver::Registration` the `driver::RegistrationOps` trait is implemented by `pci::Adapter`. `pci::DeviceId` implements PCI device IDs based on the generic `device_id::RawDevceId` abstraction. Co-developed-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Link: https://lore.kernel.org/r/20241219170425.12036-10-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-20rust: add devres abstractionDanilo Krummrich5-0/+191
Add a Rust abstraction for the kernel's devres (device resource management) implementation. The Devres type acts as a container to manage the lifetime and accessibility of device bound resources. Therefore it registers a devres callback and revokes access to the resource on invocation. Users of the Devres abstraction can simply free the corresponding resources in their Drop implementation, which is invoked when either the Devres instance goes out of scope or the devres callback leads to the resource being revoked, which implies a call to drop_in_place(). Signed-off-by: Danilo Krummrich <dakr@kernel.org> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Link: https://lore.kernel.org/r/20241219170425.12036-9-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-20rust: add `io::{Io, IoRaw}` base typesDanilo Krummrich4-0/+363
I/O memory is typically either mapped through direct calls to ioremap() or subsystem / bus specific ones such as pci_iomap(). Even though subsystem / bus specific functions to map I/O memory are based on ioremap() / iounmap() it is not desirable to re-implement them in Rust. Instead, implement a base type for I/O mapped memory, which generically provides the corresponding accessors, such as `Io::readb` or `Io:try_readb`. `Io` supports an optional const generic, such that a driver can indicate the minimal expected and required size of the mapping at compile time. Correspondingly, calls to the 'non-try' accessors, support compile time checks of the I/O memory offset to read / write, while the 'try' accessors, provide boundary checks on runtime. `IoRaw` is meant to be embedded into a structure (e.g. pci::Bar or io::IoMem) which creates the actual I/O memory mapping and initializes `IoRaw` accordingly. To ensure that I/O mapped memory can't out-live the device it may be bound to, subsystems must embed the corresponding I/O memory type (e.g. pci::Bar) into a `Devres` container, such that it gets revoked once the device is unbound. Reviewed-by: Alice Ryhl <aliceryhl@google.com> Tested-by: Daniel Almeida <daniel.almeida@collabora.com> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Link: https://lore.kernel.org/r/20241219170425.12036-8-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-20rust: add `Revocable` typeWedson Almeida Filho2-0/+220
Revocable allows access to objects to be safely revoked at run time. This is useful, for example, for resources allocated during device probe; when the device is removed, the driver should stop accessing the device resources even if another state is kept in memory due to existing references (i.e., device context data is ref-counted and has a non-zero refcount after removal of the device). Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com> Co-developed-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Link: https://lore.kernel.org/r/20241219170425.12036-7-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-20rust: types: add `Opaque::pin_init`Danilo Krummrich1-0/+11
Analogous to `Opaque::new` add `Opaque::pin_init`, which instead of a value `T` takes a `PinInit<T>` and returns a `PinInit<Opaque<T>>`. Reviewed-by: Alice Ryhl <aliceryhl@google.com> Suggested-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Tested-by: Fabien Parent <fabien.parent@linaro.org> Link: https://lore.kernel.org/r/20241219170425.12036-6-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-20rust: add rcu abstractionWedson Almeida Filho5-0/+63
Add a simple abstraction to guard critical code sections with an rcu read lock. Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com> Co-developed-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Tested-by: Fabien Parent <fabien.parent@linaro.org> Link: https://lore.kernel.org/r/20241219170425.12036-5-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-20rust: implement `IdArray`, `IdTable` and `RawDeviceId`Danilo Krummrich3-0/+172
Most subsystems use some kind of ID to match devices and drivers. Hence, we have to provide Rust drivers an abstraction to register an ID table for the driver to match. Generally, those IDs are subsystem specific and hence need to be implemented by the corresponding subsystem. However, the `IdArray`, `IdTable` and `RawDeviceId` types provide a generalized implementation that makes the life of subsystems easier to do so. Co-developed-by: Wedson Almeida Filho <wedsonaf@gmail.com> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com> Co-developed-by: Gary Guo <gary@garyguo.net> Signed-off-by: Gary Guo <gary@garyguo.net> Co-developed-by: Fabien Parent <fabien.parent@linaro.org> Signed-off-by: Fabien Parent <fabien.parent@linaro.org> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Tested-by: Fabien Parent <fabien.parent@linaro.org> Link: https://lore.kernel.org/r/20241219170425.12036-4-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-20rust: implement generic driver registrationDanilo Krummrich3-0/+119
Implement the generic `Registration` type and the `RegistrationOps` trait. The `Registration` structure is the common type that represents a driver registration and is typically bound to the lifetime of a module. However, it doesn't implement actual calls to the kernel's driver core to register drivers itself. Instead the `RegistrationOps` trait is provided to subsystems, which have to implement `RegistrationOps::register` and `RegistrationOps::unregister`. Subsystems have to provide an implementation for both of those methods where the subsystem specific variants to register / unregister a driver have to implemented. For instance, the PCI subsystem would call __pci_register_driver() from `RegistrationOps::register` and pci_unregister_driver() from `DrvierOps::unregister`. Co-developed-by: Wedson Almeida Filho <wedsonaf@gmail.com> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Tested-by: Fabien Parent <fabien.parent@linaro.org> Link: https://lore.kernel.org/r/20241219170425.12036-3-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-20rust: module: add trait `ModuleMetadata`Danilo Krummrich2-0/+10
In order to access static metadata of a Rust kernel module, add the `ModuleMetadata` trait. In particular, this trait provides the name of a Rust kernel module as specified by the `module!` macro. Signed-off-by: Danilo Krummrich <dakr@kernel.org> Tested-by: Dirk Behme <dirk.behme@de.bosch.com> Tested-by: Fabien Parent <fabien.parent@linaro.org> Link: https://lore.kernel.org/r/20241219170425.12036-2-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-16rust: miscdevice: add fops->show_fdinfo() hookAlice Ryhl1-0/+34
File descriptors should generally provide a fops->show_fdinfo() hook for debugging purposes. Thus, add such a hook to the miscdevice abstractions. Signed-off-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20241203-miscdevice-showfdinfo-v1-1-7e990732d430@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-16samples: rust_misc_device: Provide an example C program to exercise functionalityLee Jones1-0/+90
Here is the expected output (manually spliced together): USERSPACE: Opening /dev/rust-misc-device for reading and writing KERNEL: rust_misc_device: Opening Rust Misc Device Sample USERSPACE: Calling Hello KERNEL: rust_misc_device: IOCTLing Rust Misc Device Sample KERNEL: rust_misc_device: -> Hello from the Rust Misc Device USERSPACE: Fetching initial value KERNEL: rust_misc_device: IOCTLing Rust Misc Device Sample KERNEL: rust_misc_device: -> Copying data to userspace (value: 0) USERSPACE: Submitting new value (1) KERNEL: rust_misc_device: IOCTLing Rust Misc Device Sample KERNEL: rust_misc_device: -> Copying data from userspace (value: 1) USERSPACE: Fetching new value KERNEL: rust_misc_device: IOCTLing Rust Misc Device Sample KERNEL: rust_misc_device: -> Copying data to userspace (value: 1) USERSPACE: Attempting to call in to an non-existent IOCTL KERNEL: rust_misc_device: IOCTLing Rust Misc Device Sample KERNEL: rust_misc_device: -> IOCTL not recognised: 20992 USERSPACE: ioctl: Succeeded to fail - this was expected: Inappropriate ioctl for device USERSPACE: Closing /dev/rust-misc-device KERNEL: rust_misc_device: Exiting the Rust Misc Device Sample USERSPACE: Success Signed-off-by: Lee Jones <lee@kernel.org> Link: https://lore.kernel.org/r/20241213134715.601415-6-lee@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-16MAINTAINERS: Add Rust Misc Sample to MISC entryLee Jones1-0/+1
Signed-off-by: Lee Jones <lee@kernel.org> Link: https://lore.kernel.org/r/20241213134715.601415-5-lee@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-16samples: rust_misc_device: Demonstrate additional get/set value functionalityLee Jones1-14/+75
Expand the complexity of the sample driver by providing the ability to get and set an integer. The value is protected by a mutex. Signed-off-by: Lee Jones <lee@kernel.org> Link: https://lore.kernel.org/r/20241213134715.601415-4-lee@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-16samples: rust: Provide example using the new Rust MiscDevice abstractionLee Jones3-0/+98
This sample driver demonstrates the following basic operations: * Register a Misc Device * Create /dev/rust-misc-device * Provide open call-back for the aforementioned character device * Operate on the character device via a simple ioctl() * Provide close call-back for the character device Signed-off-by: Lee Jones <lee@kernel.org> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20241213134715.601415-3-lee@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-16Documentation: ioctl-number: Carve out some identifiers for use by sample driversLee Jones1-0/+1
32 IDs should be plenty (at yet, not too greedy) since a lot of sample drivers will use their own subsystem allocations. Sample drivers predominately reside in <KERNEL_ROOT>/samples, but there should be no issue with in-place example drivers using them. Signed-off-by: Lee Jones <lee@kernel.org> Link: https://lore.kernel.org/r/20241213134715.601415-2-lee@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-16rust: miscdevice: Provide accessor to pull out miscdevice::this_deviceLee Jones1-0/+11
There are situations where a pointer to a `struct device` will become necessary (e.g. for calling into dev_*() functions). This accessor allows callers to pull this out from the `struct miscdevice`. Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alice Ryhl <aliceryhl@google.com> Tested-by: Lee Jones <lee@kernel.org> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20241210-miscdevice-file-param-v3-3-b2a79b666dc5@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-16rust: miscdevice: access the `struct miscdevice` from fops->open()Alice Ryhl1-8/+22
Providing access to the underlying `struct miscdevice` is useful for various reasons. For example, this allows you access the miscdevice's internal `struct device` for use with the `dev_*` printing macros. Note that since the underlying `struct miscdevice` could get freed at any point after the fops->open() call (if misc_deregister is called), only the open call is given access to it. To use `dev_*` printing macros from other fops hooks, take a refcount on `miscdevice->this_device` to keep it alive. See the linked thread for further discussion on the lifetime of `struct miscdevice`. Link: https://lore.kernel.org/r/2024120951-botanist-exhale-4845@gregkh Signed-off-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Lee Jones <lee@kernel.org> Tested-by: Lee Jones <lee@kernel.org> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20241210-miscdevice-file-param-v3-2-b2a79b666dc5@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-16rust: miscdevice: access file in fopsAlice Ryhl1-6/+25
This allows fops to access information about the underlying struct file for the miscdevice. For example, the Binder driver needs to inspect the O_NONBLOCK flag inside the fops->ioctl() hook. Signed-off-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Lee Jones <lee@kernel.org> Tested-by: Lee Jones <lee@kernel.org> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20241210-miscdevice-file-param-v3-1-b2a79b666dc5@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-15Linux 6.13-rc3Linus Torvalds1-1/+1
2024-12-14bpf: Avoid deadlock caused by nested kprobe and fentry bpf programsPriya Bala Govindasamy1-0/+6
BPF program types like kprobe and fentry can cause deadlocks in certain situations. If a function takes a lock and one of these bpf programs is hooked to some point in the function's critical section, and if the bpf program tries to call the same function and take the same lock it will lead to deadlock. These situations have been reported in the following bug reports. In percpu_freelist - Link: https://lore.kernel.org/bpf/CAADnVQLAHwsa+2C6j9+UC6ScrDaN9Fjqv1WjB1pP9AzJLhKuLQ@mail.gmail.com/T/ Link: https://lore.kernel.org/bpf/CAPPBnEYm+9zduStsZaDnq93q1jPLqO-PiKX9jy0MuL8LCXmCrQ@mail.gmail.com/T/ In bpf_lru_list - Link: https://lore.kernel.org/bpf/CAPPBnEajj+DMfiR_WRWU5=6A7KKULdB5Rob_NJopFLWF+i9gCA@mail.gmail.com/T/ Link: https://lore.kernel.org/bpf/CAPPBnEZQDVN6VqnQXvVqGoB+ukOtHGZ9b9U0OLJJYvRoSsMY_g@mail.gmail.com/T/ Link: https://lore.kernel.org/bpf/CAPPBnEaCB1rFAYU7Wf8UxqcqOWKmRPU1Nuzk3_oLk6qXR7LBOA@mail.gmail.com/T/ Similar bugs have been reported by syzbot. In queue_stack_maps - Link: https://lore.kernel.org/lkml/0000000000004c3fc90615f37756@google.com/ Link: https://lore.kernel.org/all/20240418230932.2689-1-hdanton@sina.com/T/ In lpm_trie - Link: https://lore.kernel.org/linux-kernel/00000000000035168a061a47fa38@google.com/T/ In ringbuf - Link: https://lore.kernel.org/bpf/20240313121345.2292-1-hdanton@sina.com/T/ Prevent kprobe and fentry bpf programs from attaching to these critical sections by removing CC_FLAGS_FTRACE for percpu_freelist.o, bpf_lru_list.o, queue_stack_maps.o, lpm_trie.o, ringbuf.o files. The bugs reported by syzbot are due to tracepoint bpf programs being called in the critical sections. This patch does not aim to fix deadlocks caused by tracepoint programs. However, it does prevent deadlocks from occurring in similar situations due to kprobe and fentry programs. Signed-off-by: Priya Bala Govindasamy <pgovind2@uci.edu> Link: https://lore.kernel.org/r/CAPPBnEZpjGnsuA26Mf9kYibSaGLm=oF6=12L21X1GEQdqjLnzQ@mail.gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-13selftests/bpf: Add tests for raw_tp NULL argsKumar Kartikeya Dwivedi2-0/+27
Add tests to ensure that arguments are correctly marked based on their specified positions, and whether they get marked correctly as maybe null. For modules, all tracepoint parameters should be marked PTR_MAYBE_NULL by default. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241213221929.3495062-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-13bpf: Augment raw_tp arguments with PTR_MAYBE_NULLKumar Kartikeya Dwivedi2-10/+147
Arguments to a raw tracepoint are tagged as trusted, which carries the semantics that the pointer will be non-NULL. However, in certain cases, a raw tracepoint argument may end up being NULL. More context about this issue is available in [0]. Thus, there is a discrepancy between the reality, that raw_tp arguments can actually be NULL, and the verifier's knowledge, that they are never NULL, causing explicit NULL check branch to be dead code eliminated. A previous attempt [1], i.e. the second fixed commit, was made to simulate symbolic execution as if in most accesses, the argument is a non-NULL raw_tp, except for conditional jumps. This tried to suppress branch prediction while preserving compatibility, but surfaced issues with production programs that were difficult to solve without increasing verifier complexity. A more complete discussion of issues and fixes is available at [2]. Fix this by maintaining an explicit list of tracepoints where the arguments are known to be NULL, and mark the positional arguments as PTR_MAYBE_NULL. Additionally, capture the tracepoints where arguments are known to be ERR_PTR, and mark these arguments as scalar values to prevent potential dereference. Each hex digit is used to encode NULL-ness (0x1) or ERR_PTR-ness (0x2), shifted by the zero-indexed argument number x 4. This can be represented as follows: 1st arg: 0x1 2nd arg: 0x10 3rd arg: 0x100 ... and so on (likewise for ERR_PTR case). In the future, an automated pass will be used to produce such a list, or insert __nullable annotations automatically for tracepoints. Each compilation unit will be analyzed and results will be collated to find whether a tracepoint pointer is definitely not null, maybe null, or an unknown state where verifier conservatively marks it PTR_MAYBE_NULL. A proof of concept of this tool from Eduard is available at [3]. Note that in case we don't find a specification in the raw_tp_null_args array and the tracepoint belongs to a kernel module, we will conservatively mark the arguments as PTR_MAYBE_NULL. This is because unlike for in-tree modules, out-of-tree module tracepoints may pass NULL freely to the tracepoint. We don't protect against such tracepoints passing ERR_PTR (which is uncommon anyway), lest we mark all such arguments as SCALAR_VALUE. While we are it, let's adjust the test raw_tp_null to not perform dereference of the skb->mark, as that won't be allowed anymore, and make it more robust by using inline assembly to test the dead code elimination behavior, which should still stay the same. [0]: https://lore.kernel.org/bpf/ZrCZS6nisraEqehw@jlelli-thinkpadt14gen4.remote.csb [1]: https://lore.kernel.org/all/20241104171959.2938862-1-memxor@gmail.com [2]: https://lore.kernel.org/bpf/20241206161053.809580-1-memxor@gmail.com [3]: https://github.com/eddyz87/llvm-project/tree/nullness-for-tracepoint-params Reported-by: Juri Lelli <juri.lelli@redhat.com> # original bug Reported-by: Manu Bretelle <chantra@meta.com> # bugs in masking fix Fixes: 3f00c5239344 ("bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs") Fixes: cb4158ce8ec8 ("bpf: Mark raw_tp arguments with PTR_MAYBE_NULL") Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Co-developed-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241213221929.3495062-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-13bpf: Revert "bpf: Mark raw_tp arguments with PTR_MAYBE_NULL"Kumar Kartikeya Dwivedi4-87/+9
This patch reverts commit cb4158ce8ec8 ("bpf: Mark raw_tp arguments with PTR_MAYBE_NULL"). The patch was well-intended and meant to be as a stop-gap fixing branch prediction when the pointer may actually be NULL at runtime. Eventually, it was supposed to be replaced by an automated script or compiler pass detecting possibly NULL arguments and marking them accordingly. However, it caused two main issues observed for production programs and failed to preserve backwards compatibility. First, programs relied on the verifier not exploring == NULL branch when pointer is not NULL, thus they started failing with a 'dereference of scalar' error. Next, allowing raw_tp arguments to be modified surfaced the warning in the verifier that warns against reg->off when PTR_MAYBE_NULL is set. More information, context, and discusson on both problems is available in [0]. Overall, this approach had several shortcomings, and the fixes would further complicate the verifier's logic, and the entire masking scheme would have to be removed eventually anyway. Hence, revert the patch in preparation of a better fix avoiding these issues to replace this commit. [0]: https://lore.kernel.org/bpf/20241206161053.809580-1-memxor@gmail.com Reported-by: Manu Bretelle <chantra@meta.com> Fixes: cb4158ce8ec8 ("bpf: Mark raw_tp arguments with PTR_MAYBE_NULL") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241213221929.3495062-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-13ARC: build: Try to guess GCC variant of cross compilerLeon Romanovsky1-1/+1
ARC GCC compiler is packaged starting from Fedora 39i and the GCC variant of cross compile tools has arc-linux-gnu- prefix and not arc-linux-. This is causing that CROSS_COMPILE variable is left unset. This change allows builds without need to supply CROSS_COMPILE argument if distro package is used. Before this change: $ make -j 128 ARCH=arc W=1 drivers/infiniband/hw/mlx4/ gcc: warning: ‘-mcpu=’ is deprecated; use ‘-mtune=’ or ‘-march=’ instead gcc: error: unrecognized command-line option ‘-mmedium-calls’ gcc: error: unrecognized command-line option ‘-mlock’ gcc: error: unrecognized command-line option ‘-munaligned-access’ [1] https://packages.fedoraproject.org/pkgs/cross-gcc/gcc-arc-linux-gnu/index.html Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2024-12-13KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module initSean Christopherson3-5/+29
Snapshot the output of CPUID.0xD.[1..n] during kvm.ko initiliaization to avoid the overead of CPUID during runtime. The offset, size, and metadata for CPUID.0xD.[1..n] sub-leaves does not depend on XCR0 or XSS values, i.e. is constant for a given CPU, and thus can be cached during module load. On Intel's Emerald Rapids, CPUID is *wildly* expensive, to the point where recomputing XSAVE offsets and sizes results in a 4x increase in latency of nested VM-Enter and VM-Exit (nested transitions can trigger xstate_required_size() multiple times per transition), relative to using cached values. The issue is easily visible by running `perf top` while triggering nested transitions: kvm_update_cpuid_runtime() shows up at a whopping 50%. As measured via RDTSC from L2 (using KVM-Unit-Test's CPUID VM-Exit test and a slightly modified L1 KVM to handle CPUID in the fastpath), a nested roundtrip to emulate CPUID on Skylake (SKX), Icelake (ICX), and Emerald Rapids (EMR) takes: SKX 11650 ICX 22350 EMR 28850 Using cached values, the latency drops to: SKX 6850 ICX 9000 EMR 7900 The underlying issue is that CPUID itself is slow on ICX, and comically slow on EMR. The problem is exacerbated on CPUs which support XSAVES and/or XSAVEC, as KVM invokes xstate_required_size() twice on each runtime CPUID update, and because there are more supported XSAVE features (CPUID for supported XSAVE feature sub-leafs is significantly slower). SKX: CPUID.0xD.2 = 348 cycles CPUID.0xD.3 = 400 cycles CPUID.0xD.4 = 276 cycles CPUID.0xD.5 = 236 cycles <other sub-leaves are similar> EMR: CPUID.0xD.2 = 1138 cycles CPUID.0xD.3 = 1362 cycles CPUID.0xD.4 = 1068 cycles CPUID.0xD.5 = 910 cycles CPUID.0xD.6 = 914 cycles CPUID.0xD.7 = 1350 cycles CPUID.0xD.8 = 734 cycles CPUID.0xD.9 = 766 cycles CPUID.0xD.10 = 732 cycles CPUID.0xD.11 = 718 cycles CPUID.0xD.12 = 734 cycles CPUID.0xD.13 = 1700 cycles CPUID.0xD.14 = 1126 cycles CPUID.0xD.15 = 898 cycles CPUID.0xD.16 = 716 cycles CPUID.0xD.17 = 748 cycles CPUID.0xD.18 = 776 cycles Note, updating runtime CPUID information multiple times per nested transition is itself a flaw, especially since CPUID is a mandotory intercept on both Intel and AMD. E.g. KVM doesn't need to ensure emulated CPUID state is up-to-date while running L2. That flaw will be fixed in a future patch, as deferring runtime CPUID updates is more subtle than it appears at first glance, the benefits aren't super critical to have once the XSAVE issue is resolved, and caching CPUID output is desirable even if KVM's updates are deferred. Cc: Jim Mattson <jmattson@google.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20241211013302.1347853-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-12-13block: Fix potential deadlock while freezing queue and acquiring sysfs_lockNilay Shroff3-23/+26
For storing a value to a queue attribute, the queue_attr_store function first freezes the queue (->q_usage_counter(io)) and then acquire ->sysfs_lock. This seems not correct as the usual ordering should be to acquire ->sysfs_lock before freezing the queue. This incorrect ordering causes the following lockdep splat which we are able to reproduce always simply by accessing /sys/kernel/debug file using ls command: [ 57.597146] WARNING: possible circular locking dependency detected [ 57.597154] 6.12.0-10553-gb86545e02e8c #20 Tainted: G W [ 57.597162] ------------------------------------------------------ [ 57.597168] ls/4605 is trying to acquire lock: [ 57.597176] c00000003eb56710 (&mm->mmap_lock){++++}-{4:4}, at: __might_fault+0x58/0xc0 [ 57.597200] but task is already holding lock: [ 57.597207] c0000018e27c6810 (&sb->s_type->i_mutex_key#3){++++}-{4:4}, at: iterate_dir+0x94/0x1d4 [ 57.597226] which lock already depends on the new lock. [ 57.597233] the existing dependency chain (in reverse order) is: [ 57.597241] -> #5 (&sb->s_type->i_mutex_key#3){++++}-{4:4}: [ 57.597255] down_write+0x6c/0x18c [ 57.597264] start_creating+0xb4/0x24c [ 57.597274] debugfs_create_dir+0x2c/0x1e8 [ 57.597283] blk_register_queue+0xec/0x294 [ 57.597292] add_disk_fwnode+0x2e4/0x548 [ 57.597302] brd_alloc+0x2c8/0x338 [ 57.597309] brd_init+0x100/0x178 [ 57.597317] do_one_initcall+0x88/0x3e4 [ 57.597326] kernel_init_freeable+0x3cc/0x6e0 [ 57.597334] kernel_init+0x34/0x1cc [ 57.597342] ret_from_kernel_user_thread+0x14/0x1c [ 57.597350] -> #4 (&q->debugfs_mutex){+.+.}-{4:4}: [ 57.597362] __mutex_lock+0xfc/0x12a0 [ 57.597370] blk_register_queue+0xd4/0x294 [ 57.597379] add_disk_fwnode+0x2e4/0x548 [ 57.597388] brd_alloc+0x2c8/0x338 [ 57.597395] brd_init+0x100/0x178 [ 57.597402] do_one_initcall+0x88/0x3e4 [ 57.597410] kernel_init_freeable+0x3cc/0x6e0 [ 57.597418] kernel_init+0x34/0x1cc [ 57.597426] ret_from_kernel_user_thread+0x14/0x1c [ 57.597434] -> #3 (&q->sysfs_lock){+.+.}-{4:4}: [ 57.597446] __mutex_lock+0xfc/0x12a0 [ 57.597454] queue_attr_store+0x9c/0x110 [ 57.597462] sysfs_kf_write+0x70/0xb0 [ 57.597471] kernfs_fop_write_iter+0x1b0/0x2ac [ 57.597480] vfs_write+0x3dc/0x6e8 [ 57.597488] ksys_write+0x84/0x140 [ 57.597495] system_call_exception+0x130/0x360 [ 57.597504] system_call_common+0x160/0x2c4 [ 57.597516] -> #2 (&q->q_usage_counter(io)#21){++++}-{0:0}: [ 57.597530] __submit_bio+0x5ec/0x828 [ 57.597538] submit_bio_noacct_nocheck+0x1e4/0x4f0 [ 57.597547] iomap_readahead+0x2a0/0x448 [ 57.597556] xfs_vm_readahead+0x28/0x3c [ 57.597564] read_pages+0x88/0x41c [ 57.597571] page_cache_ra_unbounded+0x1ac/0x2d8 [ 57.597580] filemap_get_pages+0x188/0x984 [ 57.597588] filemap_read+0x13c/0x4bc [ 57.597596] xfs_file_buffered_read+0x88/0x17c [ 57.597605] xfs_file_read_iter+0xac/0x158 [ 57.597614] vfs_read+0x2d4/0x3b4 [ 57.597622] ksys_read+0x84/0x144 [ 57.597629] system_call_exception+0x130/0x360 [ 57.597637] system_call_common+0x160/0x2c4 [ 57.597647] -> #1 (mapping.invalidate_lock#2){++++}-{4:4}: [ 57.597661] down_read+0x6c/0x220 [ 57.597669] filemap_fault+0x870/0x100c [ 57.597677] xfs_filemap_fault+0xc4/0x18c [ 57.597684] __do_fault+0x64/0x164 [ 57.597693] __handle_mm_fault+0x1274/0x1dac [ 57.597702] handle_mm_fault+0x248/0x484 [ 57.597711] ___do_page_fault+0x428/0xc0c [ 57.597719] hash__do_page_fault+0x30/0x68 [ 57.597727] do_hash_fault+0x90/0x35c [ 57.597736] data_access_common_virt+0x210/0x220 [ 57.597745] _copy_from_user+0xf8/0x19c [ 57.597754] sel_write_load+0x178/0xd54 [ 57.597762] vfs_write+0x108/0x6e8 [ 57.597769] ksys_write+0x84/0x140 [ 57.597777] system_call_exception+0x130/0x360 [ 57.597785] system_call_common+0x160/0x2c4 [ 57.597794] -> #0 (&mm->mmap_lock){++++}-{4:4}: [ 57.597806] __lock_acquire+0x17cc/0x2330 [ 57.597814] lock_acquire+0x138/0x400 [ 57.597822] __might_fault+0x7c/0xc0 [ 57.597830] filldir64+0xe8/0x390 [ 57.597839] dcache_readdir+0x80/0x2d4 [ 57.597846] iterate_dir+0xd8/0x1d4 [ 57.597855] sys_getdents64+0x88/0x2d4 [ 57.597864] system_call_exception+0x130/0x360 [ 57.597872] system_call_common+0x160/0x2c4 [ 57.597881] other info that might help us debug this: [ 57.597888] Chain exists of: &mm->mmap_lock --> &q->debugfs_mutex --> &sb->s_type->i_mutex_key#3 [ 57.597905] Possible unsafe locking scenario: [ 57.597911] CPU0 CPU1 [ 57.597917] ---- ---- [ 57.597922] rlock(&sb->s_type->i_mutex_key#3); [ 57.597932] lock(&q->debugfs_mutex); [ 57.597940] lock(&sb->s_type->i_mutex_key#3); [ 57.597950] rlock(&mm->mmap_lock); [ 57.597958] *** DEADLOCK *** [ 57.597965] 2 locks held by ls/4605: [ 57.597971] #0: c0000000137c12f8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0xcc/0x154 [ 57.597989] #1: c0000018e27c6810 (&sb->s_type->i_mutex_key#3){++++}-{4:4}, at: iterate_dir+0x94/0x1d4 Prevent the above lockdep warning by acquiring ->sysfs_lock before freezing the queue while storing a queue attribute in queue_attr_store function. Later, we also found[1] another function __blk_mq_update_nr_ hw_queues where we first freeze queue and then acquire the ->sysfs_lock. So we've also updated lock ordering in __blk_mq_update_nr_hw_queues function and ensured that in all code paths we follow the correct lock ordering i.e. acquire ->sysfs_lock before freezing the queue. [1] https://lore.kernel.org/all/CAFj5m9Ke8+EHKQBs_Nk6hqd=LGXtk4mUxZUN5==ZcCjnZSBwHw@mail.gmail.com/ Reported-by: kjain@linux.ibm.com Fixes: af2814149883 ("block: freeze the queue in queue_attr_store") Tested-by: kjain@linux.ibm.com Cc: hch@lst.de Cc: axboe@kernel.dk Cc: ritesh.list@gmail.com Cc: ming.lei@redhat.com Cc: gjoyce@linux.ibm.com Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241210144222.1066229-1-nilay@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-13irqchip/gic-v3: Work around insecure GIC integrationsMarc Zyngier1-1/+16
It appears that the relatively popular RK3399 SoC has been put together using a large amount of illicit substances, as experiments reveal that its integration of GIC500 exposes the *secure* programming interface to non-secure. This has some pretty bad effects on the way priorities are handled, and results in a dead machine if booting with pseudo-NMI enabled (irqchip.gicv3_pseudo_nmi=1) if the kernel contains 18fdb6348c480 ("arm64: irqchip/gic-v3: Select priorities at boot time"), which relies on the priorities being programmed using the NS view. Let's restore some sanity by going one step further and disable security altogether in this case. This is not any worse, and puts us in a mode where priorities actually make some sense. Huge thanks to Mark Kettenis who initially identified this issue on OpenBSD, and to Chen-Yu Tsai who reported the problem in Linux. Fixes: 18fdb6348c480 ("arm64: irqchip/gic-v3: Select priorities at boot time") Reported-by: Mark Kettenis <mark.kettenis@xs4all.nl> Reported-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Chen-Yu Tsai <wens@csie.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241213141037.3995049-1-maz@kernel.org
2024-12-13irqchip/gic: Correct declaration of *percpu_base pointer in union gic_baseUros Bizjak1-1/+1
percpu_base is used in various percpu functions that expect variable in __percpu address space. Correct the declaration of percpu_base to void __iomem * __percpu *percpu_base; to declare the variable as __percpu pointer. The patch fixes several sparse warnings: irq-gic.c:1172:44: warning: incorrect type in assignment (different address spaces) irq-gic.c:1172:44: expected void [noderef] __percpu *[noderef] __iomem *percpu_base irq-gic.c:1172:44: got void [noderef] __iomem *[noderef] __percpu * ... irq-gic.c:1231:43: warning: incorrect type in argument 1 (different address spaces) irq-gic.c:1231:43: expected void [noderef] __percpu *__pdata irq-gic.c:1231:43: got void [noderef] __percpu *[noderef] __iomem *percpu_base There were no changes in the resulting object files. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/all/20241213145809.2918-2-ubizjak@gmail.com