linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2022-10-14	scripts/clang-tools: Convert clang-tidy args to list	Guru Das Srinagesh	1	-5/+6
	Convert list of clang-tidy arguments to a list for ease of adding to them and extending them as required. Signed-off-by: Guru Das Srinagesh <quic_gurus@quicinc.com> Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-10-14	modpost: put modpost options before argument	Richard Acayan	1	-1/+1
	The musl implementation of getopt stops looking for options after the first non-option argument. Put the options before the non-option argument so environments using musl can still build the kernel and modules. Fixes: f73edc8951b2 ("kbuild: unify two modpost invocations") Link: https://git.musl-libc.org/cgit/musl/tree/src/misc/getopt.c?h=dc9285ad1dc19349c407072cc48ba70dab86de45#n44 Signed-off-by: Richard Acayan <mailingradian@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-10-13	kbuild: Stop including vmlinux.bz2 in the rpm's	Zack Rusin	1	-2/+0
	vmlinux.bz2 was added to the rpm packages in 2009 in the fc370ecfdb37 ("kbuild: add vmlinux to kernel rpm") but seemingly hasn't been used since. Originally this should have been split up in a seperate debugging package because it massively increases the size of the generated rpm's e.g. kernel rpm built using binrpm-pkg on Fedora 36 default 5.19.8 kernel config and localmodconfig is ~255MB with vmlinux.bz2 and only ~65MB without it. Make the kernel built rpms about 4x smaller by not including the unused vmlinux.bz2 in them. Signed-off-by: Zack Rusin <zackr@vmware.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-10-13	Kconfig.debug: add toolchain checks for DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT	Masahiro Yamada	1	-0/+1
	CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT does not give explicit -gdwarf-* flag. The actual DWARF version is up to the toolchain. The combination of GCC and GAS works fine, and Clang with the integrated assembler is good too. The combination of Clang and GAS is tricky, but at least, the -g flag works for Clang <=13, which defaults to DWARF v4. Clang 14 switched its default to DWARF v5. Now, CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT has the same issue as addressed by commit 98cd6f521f10 ("Kconfig: allow explicit opt in to DWARF v5"). CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y for Clang >= 14 and GAS < 2.35 produces a ton of errors like follows: /tmp/main-c2741c.s: Assembler messages: /tmp/main-c2741c.s:109: Error: junk at end of line, first unrecognized character is `"' /tmp/main-c2741c.s:109: Error: file number less than one Add 'depends on' to check toolchains. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org>
2022-10-13	Kconfig.debug: simplify the dependency of DEBUG_INFO_DWARF4/5	Masahiro Yamada	1	-2/+2
	Commit c0a5c81ca9be ("Kconfig.debug: drop GCC 5+ version check for DWARF5") could have cleaned up the code a bit more. "CC_IS_CLANG &&" is unneeded. No functional change is intended. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org>
2022-10-10	module: tracking: Keep a record of tainted unloaded modules only	Aaron Tomlin	1	-0/+3
	This ensures that no module record/or entry is added to the unloaded_tainted_modules list if it does not carry a taint. Reported-by: Alexey Dobriyan <adobriyan@gmail.com> Fixes: 99bd9956551b ("module: Introduce module unload taint tracking") Signed-off-by: Aaron Tomlin <atomlin@redhat.com> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-10-07	virtio_pci: don't try to use intxif pin is zero	Angus Chen	1	-0/+3
	The background is that we use dpu in cloud computing,the arch is x86,80 cores. We will have a lots of virtio devices,like 512 or more. When we probe about 200 virtio_blk devices,it will fail and the stack is printed as follows: [25338.485128] virtio-pci 0000:b3:00.0: virtio_pci: leaving for legacy driver [25338.496174] genirq: Flags mismatch irq 0. 00000080 (virtio418) vs. 00015a00 (timer) [25338.503822] CPU: 20 PID: 5431 Comm: kworker/20:0 Kdump: loaded Tainted: G OE --------- - - 4.18.0-305.30.1.el8.x86_64 [25338.516403] Hardware name: Inspur NF5280M5/YZMB-00882-10E, BIOS 4.1.21 08/25/2021 [25338.523881] Workqueue: events work_for_cpu_fn [25338.528235] Call Trace: [25338.530687] dump_stack+0x5c/0x80 [25338.534000] __setup_irq.cold.53+0x7c/0xd3 [25338.538098] request_threaded_irq+0xf5/0x160 [25338.542371] vp_find_vqs+0xc7/0x190 [25338.545866] init_vq+0x17c/0x2e0 [virtio_blk] [25338.550223] ? ncpus_cmp_func+0x10/0x10 [25338.554061] virtblk_probe+0xe6/0x8a0 [virtio_blk] [25338.558846] virtio_dev_probe+0x158/0x1f0 [25338.562861] really_probe+0x255/0x4a0 [25338.566524] ? __driver_attach_async_helper+0x90/0x90 [25338.571567] driver_probe_device+0x49/0xc0 [25338.575660] bus_for_each_drv+0x79/0xc0 [25338.579499] __device_attach+0xdc/0x160 [25338.583337] bus_probe_device+0x9d/0xb0 [25338.587167] device_add+0x418/0x780 [25338.590654] register_virtio_device+0x9e/0xe0 [25338.595011] virtio_pci_probe+0xb3/0x140 [25338.598941] local_pci_probe+0x41/0x90 [25338.602689] work_for_cpu_fn+0x16/0x20 [25338.606443] process_one_work+0x1a7/0x360 [25338.610456] ? create_worker+0x1a0/0x1a0 [25338.614381] worker_thread+0x1cf/0x390 [25338.618132] ? create_worker+0x1a0/0x1a0 [25338.622051] kthread+0x116/0x130 [25338.625283] ? kthread_flush_work_fn+0x10/0x10 [25338.629731] ret_from_fork+0x1f/0x40 [25338.633395] virtio_blk: probe of virtio418 failed with error -16 The log : "genirq: Flags mismatch irq 0. 00000080 (virtio418) vs. 00015a00 (timer)" was printed because of the irq 0 is used by timer exclusive,and when vp_find_vqs call vp_find_vqs_msix and returns false twice (for whatever reason), then it will call vp_find_vqs_intx as a fallback. Because vp_dev->pci_dev->irq is zero, we request irq 0 with flag IRQF_SHARED, and get a backtrace like above. According to PCI spec about "Interrupt Pin" Register (Offset 3Dh): "The Interrupt Pin register is a read-only register that identifies the legacy interrupt Message(s) the Function uses. Valid values are 01h, 02h, 03h, and 04h that map to legacy interrupt Messages for INTA, INTB, INTC, and INTD respectively. A value of 00h indicates that the Function uses no legacy interrupt Message(s)." So if vp_dev->pci_dev->pin is zero, we should not request legacy interrupt. Signed-off-by: Angus Chen <angus.chen@jaguarmicro.com> Suggested-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220930000915.548-1-angus.chen@jaguarmicro.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07	vDPA: conditionally read MTU and MAC in dev cfg space	Zhu Lingshan	1	-8/+29
	The spec says: mtu only exists if VIRTIO_NET_F_MTU is set The mac address field always exists (though is only valid if VIRTIO_NET_F_MAC is set) So vdpa_dev_net_config_fill() should read MTU and MAC conditionally on the feature bits. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220929014555.112323-7-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07	vDPA: fix spars cast warning in vdpa_dev_net_mq_config_fill	Zhu Lingshan	1	-1/+2
	This commit fixes spars warnings: cast to restricted __le16 in function vdpa_dev_net_mq_config_fill() Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220929014555.112323-6-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07	vDPA: check virtio device features to detect MQ	Zhu Lingshan	1	-1/+1
	vdpa_dev_net_mq_config_fill() should checks device features for MQ than driver features. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220929014555.112323-5-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07	vDPA: check VIRTIO_NET_F_RSS for max_virtqueue_paris's presence	Zhu Lingshan	1	-3/+3
	virtio 1.2 spec says: max_virtqueue_pairs only exists if VIRTIO_NET_F_MQ or VIRTIO_NET_F_RSS is set. So when reporint MQ to userspace, it should check both VIRTIO_NET_F_MQ and VIRTIO_NET_F_RSS. unused parameter struct vdpa_device *vdev is removed Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220929014555.112323-4-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07	vDPA: only report driver features if FEATURES_OK is set	Zhu Lingshan	1	-6/+14
	This commit reports driver features to user space only after FEATURES_OK is features negotiation is done. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20220929014555.112323-3-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07	vDPA: allow userspace to query features of a vDPA device	Zhu Lingshan	2	-5/+15
	This commit adds a new vDPA netlink attribution VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES. Userspace can query features of vDPA devices through this new attr. This commit invokes vdpa_config_ops.get_config() rather than vdpa_get_config_unlocked() to read the device config spcae, so no races in vdpa_set_features_unlocked() Userspace tool iproute2 example: $ vdpa dev config show vdpa0 vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs 4 mtu 1500 negotiated_features MRG_RXBUF CTRL_VQ MQ VERSION_1 ACCESS_PLATFORM dev_features MTU MAC MRG_RXBUF CTRL_VQ MQ ANY_LAYOUT VERSION_1 ACCESS_PLATFORM Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20220929014555.112323-2-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07	virtio_blk: add SECURE ERASE command support	Alvaro Karsz	2	-18/+111
	Support for the VIRTIO_BLK_F_SECURE_ERASE VirtIO feature. A device that offers this feature can receive VIRTIO_BLK_T_SECURE_ERASE commands. A device which supports this feature has the following fields in the virtio config: - max_secure_erase_sectors - max_secure_erase_seg - secure_erase_sector_alignment max_secure_erase_sectors and secure_erase_sector_alignment are expressed in 512-byte units. Every secure erase command has the following fields: - sectors: The starting offset in 512-byte units. - num_sectors: The number of sectors. Signed-off-by: Alvaro Karsz <alvaro.karsz@solid-run.com> Message-Id: <20220921082729.2516779-1-alvaro.karsz@solid-run.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-07	vp_vdpa: support feature provisioning	Jason Wang	1	-2/+20
	This patch allows the device features to be provisioned via netlink. This is done by: 1) validating the provisioned features to be a subset of the parent features. 2) clearing the features that is not wanted by the userspace For example: # vdpa mgmtdev show pci/0000:02:00.0: supported_classes net max_supported_vqs 3 dev_features CSUM GUEST_CSUM CTRL_GUEST_OFFLOADS MAC GUEST_TSO4 GUEST_TSO6 GUEST_ECN GUEST_UFO HOST_TSO4 HOST_TSO6 HOST_ECN HOST_UFO MRG_RXBUF STATUS CTRL_VQ CTRL_RX CTRL_VLAN CTRL_RX_EXTRA GUEST_ANNOUNCE CTRL_MAC_ADDR RING_INDIRECT_DESC RING_EVENT_IDX VERSION_1 ACCESS_PLATFORM 1) provision vDPA device with all features that are supported by the virtio-pci # vdpa dev add name dev1 mgmtdev pci/0000:02:00.0 # vdpa dev config show dev1: mac 52:54:00:12:34:56 link up link_announce false mtu 65535 negotiated_features CSUM GUEST_CSUM CTRL_GUEST_OFFLOADS MAC GUEST_TSO4 GUEST_TSO6 GUEST_ECN GUEST_UFO HOST_TSO4 HOST_TSO6 HOST_ECN HOST_UFO MRG_RXBUF STATUS CTRL_VQ CTRL_RX CTRL_VLAN GUEST_ANNOUNCE CTRL_MAC_ADDR RING_INDIRECT_DESC RING_EVENT_IDX VERSION_1 ACCESS_PLATFORM 2) provision vDPA device with a subset of the features # vdpa dev add name dev1 mgmtdev pci/0000:02:00.0 device_features 0x300020000 # dev1: mac 52:54:00:12:34:56 link up link_announce false mtu 65535 negotiated_features CTRL_VQ VERSION_1 ACCESS_PLATFORM Reviewed-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220927074810.28627-4-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07	vdpa_sim_net: support feature provisioning	Jason Wang	4	-5/+17
	This patch implements features provisioning for vdpa_sim_net. 1) validating the provisioned features to be a subset of the parent features. 2) clearing the features that is not wanted by the userspace For example: vdpasim_net: supported_classes net max_supported_vqs 3 dev_features MTU MAC CTRL_VQ CTRL_MAC_ADDR ANY_LAYOUT VERSION_1 ACCESS_PLATFORM 1) provision vDPA device with all features that are supported by the net simulator dev1: mac 00:00:00:00:00:00 link up link_announce false mtu 1500 negotiated_features MTU MAC CTRL_VQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM 2) provision vDPA device with a subset of the features dev1: mac 00:00:00:00:00:00 link up link_announce false mtu 1500 negotiated_features CTRL_VQ VERSION_1 ACCESS_PLATFORM Reviewed-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220927074810.28627-3-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-10-07	vdpa: device feature provisioning	Jason Wang	3	-0/+8
	This patch allows the device features to be provisioned through netlink. A new attribute is introduced to allow the userspace to pass a 64bit device features during device adding. This provides several advantages: - Allow to provision a subset of the features to ease the cross vendor live migration. - Better debug-ability for vDPA framework and parent. Reviewed-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220927074810.28627-2-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07	virtio-net: use mtu size as buffer length for big packets	Gavin Li	1	-13/+24
	Currently add_recvbuf_big() allocates MAX_SKB_FRAGS segments for big packets even when GUEST_* offloads are not present on the device. However, if guest GSO is not supported, it would be sufficient to allocate segments to cover just up the MTU size and no further. Allocating the maximum amount of segments results in a large waste of buffer space in the queue, which limits the number of packets that can be buffered and can result in reduced performance. Therefore, if guest GSO is not supported, use the MTU to calculate the optimal amount of segments required. Below is the iperf TCP test results over a Mellanox NIC, using vDPA for 1 VQ, queue size 1024, before and after the change, with the iperf server running over the virtio-net interface. MTU(Bytes)/Bandwidth (Gbit/s) Before After 1500 22.5 22.4 9000 12.8 25.9 And result of queue size 256. MTU(Bytes)/Bandwidth (Gbit/s) Before After 9000 2.15 11.9 With this patch no degradation is observed with multiple below tests and feature bit combinations. Results are summarized below for q depth of 1024. Interface MTU is 1500 if MTU feature is disabled. MTU is set to 9000 in other tests. Features/ Bandwidth (Gbit/s) Before After mtu off 20.1 20.2 mtu/indirect on 17.4 17.3 mtu/indirect/packed on 17.2 17.2 Signed-off-by: Gavin Li <gavinl@nvidia.com> Reviewed-by: Gavi Teitz <gavi@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Message-Id: <20220914144911.56422-3-gavinl@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-10-07	virtio-net: introduce and use helper function for guest gso support checks	Gavin Li	1	-4/+9
	Probe routine is already several hundred lines. Use helper function for guest gso support check. Signed-off-by: Gavin Li <gavinl@nvidia.com> Reviewed-by: Gavi Teitz <gavi@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Message-Id: <20220914144911.56422-2-gavinl@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07	virtio: drop vp_legacy_set_queue_size	Michael S. Tsirkin	1	-2/+0
	There's actually no way to set queue size on legacy virtio pci. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220815220447.155860-1-mst@redhat.com>
2022-10-07	virtio_ring: make vring_alloc_queue_packed prettier	Deming Wang	1	-3/+3
	Add some spaces to vring_alloc_queue(make it look prettier). Signed-off-by: Deming Wang <wangdeming@inspur.com> Message-Id: <20220926183306.4535-1-wangdeming@inspur.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07	virtio_ring: split: Operators use unified style	Deming Wang	1	-1/+1
	The operators of vring_alloc_queue_split should use the unified style.Add space for the '\|' ,make it be looked more pretty. Signed-off-by: Deming Wang <wangdeming@inspur.com> Message-Id: <20220926022202.1516-1-wangdeming@inspur.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07	vhost: add __init/__exit annotations to module init/exit funcs	Xiu Jianfeng	1	-2/+2
	Add missing __init/__exit annotations to module init/exit funcs. Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Message-Id: <20220917083803.21521-1-xiujianfeng@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07	KVM: PPC: Book3S HV: Fix stack frame regs marker	Nicholas Piggin	1	-1/+1
	The hard-coded marker is out of date now, fix it using the nice define. Fixes: 17773afdcd15 ("powerpc/64: use 32-bit immediate for STACK_FRAME_REGS_MARKER") Reported-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221006143345.129077-1-npiggin@gmail.com
2022-10-06	riscv: enable THP_SWAP for RV64	Jisheng Zhang	1	-0/+1
	I have a Sipeed Lichee RV dock board which only has 512MB DDR, so memory optimizations such as swap on zram are helpful. As is seen in commit d0637c505f8a ("arm64: enable THP_SWAP for arm64") and commit bd4c82c22c367e ("mm, THP, swap: delay splitting THP after swapped out"), THP_SWAP can improve the swap throughput significantly. Enable THP_SWAP for RV64, testing the micro-benchmark which is introduced by commit d0637c505f8a ("arm64: enable THP_SWAP for arm64") shows below numbers on the Lichee RV dock board: swp out bandwidth w/o patch: 66908 bytes/ms (mean of 10 tests) swp out bandwidth w/ patch: 322638 bytes/ms (mean of 10 tests) Improved by 382%! Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20220829145742.3139-1-jszhang@kernel.org/ Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-06	RISC-V: Print SSTC in canonical order	Palmer Dabbelt	1	-1/+1
	This got out of order during a merge conflict, fix it by putting the entries in the correct order. Fixes: 7ab52f75a9cf ("RISC-V: Add Sstc extension support") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220920204518.10988-1-palmer@rivosinc.com/ Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-07	Revert "drm/sched: Use parent fence instead of finished"	Dave Airlie	1	-2/+2
	This reverts commit e4dc45b1848bc6bcac31eb1b4ccdd7f6718b3c86. This is causing instability on Linus' desktop, and I'm seeing oops with VK CTS runs. netconsole got me the following oops: [ 1234.778760] BUG: kernel NULL pointer dereference, address: 0000000000000088 [ 1234.778782] #PF: supervisor read access in kernel mode [ 1234.778787] #PF: error_code(0x0000) - not-present page [ 1234.778791] PGD 0 P4D 0 [ 1234.778798] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 1234.778803] CPU: 7 PID: 805 Comm: systemd-journal Not tainted 6.0.0+ #2 [ 1234.778809] Hardware name: System manufacturer System Product Name/PRIME X370-PRO, BIOS 5603 07/28/2020 [ 1234.778813] RIP: 0010:drm_sched_job_done.isra.0+0xc/0x140 [gpu_sched] [ 1234.778828] Code: aa 0f 1d ce e9 57 ff ff ff 48 89 d7 e8 9d 8f 3f ce e9 4a ff ff ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 55 53 48 89 fb <48> 8b af 88 00 00 00 f0 ff 8d f0 00 00 00 48 8b 85 80 01 00 00 f0 [ 1234.778834] RSP: 0000:ffffabe680380de0 EFLAGS: 00010087 [ 1234.778839] RAX: ffffffffc04e9230 RBX: 0000000000000000 RCX: 0000000000000018 [ 1234.778897] RDX: 00000ba278e8977a RSI: ffff953fb288b460 RDI: 0000000000000000 [ 1234.778901] RBP: ffff953fb288b598 R08: 00000000000000e0 R09: ffff953fbd98b808 [ 1234.778905] R10: 0000000000000000 R11: ffffabe680380ff8 R12: ffffabe680380e00 [ 1234.778908] R13: 0000000000000001 R14: 00000000ffffffff R15: ffff953fbd9ec458 [ 1234.778912] FS: 00007f35e7008580(0000) GS:ffff95428ebc0000(0000) knlGS:0000000000000000 [ 1234.778916] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1234.778919] CR2: 0000000000000088 CR3: 000000010147c000 CR4: 00000000003506e0 [ 1234.778924] Call Trace: [ 1234.778981] <IRQ> [ 1234.778989] dma_fence_signal_timestamp_locked+0x6a/0xe0 [ 1234.778999] dma_fence_signal+0x2c/0x50 [ 1234.779005] amdgpu_fence_process+0xc8/0x140 [amdgpu] [ 1234.779234] sdma_v3_0_process_trap_irq+0x70/0x80 [amdgpu] [ 1234.779395] amdgpu_irq_dispatch+0xa9/0x1d0 [amdgpu] [ 1234.779609] amdgpu_ih_process+0x80/0x100 [amdgpu] [ 1234.779783] amdgpu_irq_handler+0x1f/0x60 [amdgpu] [ 1234.779940] __handle_irq_event_percpu+0x46/0x190 [ 1234.779946] handle_irq_event+0x34/0x70 [ 1234.779949] handle_edge_irq+0x9f/0x240 [ 1234.779954] __common_interrupt+0x66/0x100 [ 1234.779960] common_interrupt+0xa0/0xc0 [ 1234.779965] </IRQ> [ 1234.779968] <TASK> [ 1234.779971] asm_common_interrupt+0x22/0x40 [ 1234.779976] RIP: 0010:finish_mkwrite_fault+0x22/0x110 [ 1234.779981] Code: 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 41 55 41 54 55 48 89 fd 53 48 8b 07 f6 40 50 08 0f 84 eb 00 00 00 48 8b 45 30 48 8b 18 <48> 89 df e8 66 bd ff ff 48 85 c0 74 0d 48 89 c2 83 e2 01 48 83 ea [ 1234.779985] RSP: 0000:ffffabe680bcfd78 EFLAGS: 00000202 Revert it for now and figure it out later. Signed-off-by: Dave Airlie <airlied@redhat.com>
2022-10-06	random: clear new batches when bringing new CPUs online	Jason A. Donenfeld	2	-12/+18
	The commit that added the new get_random_{u8,u16}() functions neglected to update the code that clears the batches when bringing up a new CPU. It also forgot a few comments and helper defines, so add those in too. Fixes: 585cd5fe9f73 ("random: add 8-bit and 16-bit batches") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-06	ftrace: Create separate entry in MAINTAINERS for function hooks	Steven Rostedt (Google)	1	-5/+14
	The function hooks (ftrace) is a completely different subsystem from the general tracing. It manages how to attach callbacks to most functions in the kernel. It is also used by live kernel patching. It really is not part of tracing, although tracing uses it. Create a separate entry for FUNCTION HOOKS (FTRACE) to be separate from tracing itself in the MAINTAINERS file. Perhaps it should be moved out of the kernel/trace directory, but that's for another time. Link: https://lkml.kernel.org/r/20221006144439.459272364@goodmis.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-10-06	tracing: Update MAINTAINERS to reflect new tracing git repo	Steven Rostedt (Google)	1	-2/+2
	The tracing git repo will no longer be housed in my personal git repo, but instead live in trace/linux-trace.git. Update the MAINTAINERS file appropriately. Link: https://lkml.kernel.org/r/20221006144439.282193367@goodmis.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-10-07	powerpc: Don't add __powerpc_ prefix to syscall entry points	Michael Ellerman	4	-16/+10
	When using syscall wrappers the __SYSCALL_DEFINEx() and related macros add a "__powerpc_" prefix to all syscall entry points. So for example sys_mmap becomes __powerpc_sys_mmap. This risks breaking workflows and tools that expect the old naming scheme. At a minimum setting a breakpoint on eg. sys_mmap with gdb no longer works. There seems to be no compelling reason to add the "__powerpc_" prefix, other than that it follows what some other arches do (x86, arm64, s390). But unlike other arches powerpc doesn't always enable syscall wrappers, so the syscall entry points can change name depending on CONFIG options. For those reasons drop the "__powerpc_" prefix, reverting to the existing naming. Doing so reveals two prototypes in signal.h that have the incorrect type when syscall wrappers are enabled. There are already prototypes for both functions in syscalls.h, so drop the ones from signal.h. Fixes: 7e92e01b7245 ("powerpc: Provide syscall wrapper") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221006135940.1223988-1-mpe@ellerman.id.au
2022-10-06	sched/core: Merge cpumask_andnot()+for_each_cpu() into for_each_cpu_andnot()	Valentin Schneider	1	-4/+1
	This removes the second use of the sched_core_mask temporary mask. Suggested-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
2022-10-06	lib/test_cpumask: Add for_each_cpu_and(not) tests	Valentin Schneider	1	-0/+19
	Following the recent introduction of for_each_andnot(), add some tests to ensure for_each_cpu_and(not) results in the same as iterating over the result of cpumask_and(not)(). Suggested-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
2022-10-06	cpumask: Introduce for_each_cpu_andnot()	Valentin Schneider	2	-0/+23
	for_each_cpu_and() is very convenient as it saves having to allocate a temporary cpumask to store the result of cpumask_and(). The same issue applies to cpumask_andnot() which doesn't actually need temporary storage for iteration purposes. Following what has been done for for_each_cpu_and(), introduce for_each_cpu_andnot(). Signed-off-by: Valentin Schneider <vschneid@redhat.com>
2022-10-06	lib/find_bit: Introduce find_next_andnot_bit()	Valentin Schneider	2	-0/+42
	In preparation of introducing for_each_cpu_andnot(), add a variant of find_next_bit() that negate the bits in @addr2 when ANDing them with the bits in @addr1. Signed-off-by: Valentin Schneider <vschneid@redhat.com>
2022-10-06	ARM/dma-mapping: remove the dma_coherent member of struct dev_archdata	Christoph Hellwig	2	-4/+1
	Since commit ae626eb97376 ("ARM/dma-mapping: use dma-direct unconditionally") only the dma_coherent flag in struct device is used, so remove the now write only flag in struct dev_archdata. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-10-06	ARM/dma-mappіng: don't override ->dma_coherent when set from a bus notifier	Christoph Hellwig	1	-2/+10
	Commit ae626eb97376 ("ARM/dma-mapping: use dma-direct unconditionally") caused a regression on the mvebu platform, wherein devices that are dma-coherent are marked as dma-noncoherent, because although mvebu_hwcc_notifier() after that commit still marks then as coherent, the arm_coherent_dma_ops() function, which is called later, overwrites this setting, since it is being called from drivers/of/device.c with coherency parameter determined by of_dma_is_coherent(), and the device-trees do not declare the 'dma-coherent' property. Fix this by defaulting never clearing the dma_coherent flag in arm_coherent_dma_ops(). Fixes: ae626eb97376 ("ARM/dma-mapping: use dma-direct unconditionally") Reported-by: Marek Behún <kabel@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Tested-by: Marek Behún <kabel@kernel.org>
2022-10-05	mailbox: qcom-ipcc: flag IRQ NO_THREAD	Eric Chanudet	1	-1/+2
	PREEMPT_RT forces qcom-ipcc's handler to be threaded with interrupts enabled, which triggers a warning in __handle_irq_event_percpu(): irq 173 handler irq_default_primary_handler+0x0/0x10 enabled interrupts WARNING: CPU: 0 PID: 77 at kernel/irq/handle.c:161 __handle_irq_event_percpu+0x4c4/0x4d0 Mark it IRQF_NO_THREAD to avoid running the handler in a threaded context with threadirqs or PREEMPT_RT enabled. Signed-off-by: Eric Chanudet <echanude@redhat.com> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2022-10-05	mailbox: pcc: Fix spelling mistake "Plaform" -> "Platform"	Colin Ian King	1	-1/+1
	There is a spelling mistake in a pr_err message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2022-10-05	mailbox: bcm-ferxrm-mailbox: Fix error check for dma_map_sg	Jack Wang	1	-4/+4
	dma_map_sg return 0 on error, fix the error check, and return -EIO to caller. Fixes: dbc049eee730 ("mailbox: Add driver for Broadcom FlexRM ring manager") Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2022-10-05	mailbox: qcom-apcs-ipc: add IPQ8074 APSS clock support	Robert Marko	1	-1/+1
	IPQ8074 has the APSS clock controller utilizing the same register space as the APCS, so provide access to the APSS utilizing a child device like IPQ6018. IPQ6018 and IPQ8074 use the same controller and driver, so just utilize IPQ6018 match data for IPQ8074. Signed-off-by: Robert Marko <robimarko@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2022-10-05	dt-bindings: mailbox: qcom: correct clocks for IPQ6018 and IPQ8074	Robert Marko	1	-12/+34
	IPQ6018 APSS driver is registered by APCS as they share the same register space, and it uses "pll" and "xo" as inputs. Correct the allowed clocks for IPQ6018 and IPQ8074 as they share the same driver to allow "pll" and "xo" as clock-names. Signed-off-by: Robert Marko <robimarko@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2022-10-05	dt-bindings: mailbox: qcom: set correct #clock-cells	Robert Marko	1	-1/+16
	IPQ6018 and IPQ8074 require #clock-cells to be set to 1 as their APSS clock driver provides multiple clock outputs. So allow setting 1 as #clock-cells and check that its set to 1 for IPQ6018 and IPQ8074, check others for 0 as its currently. Signed-off-by: Robert Marko <robimarko@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2022-10-05	mailbox: mpfs: account for mbox offsets while sending	Conor Dooley	1	-4/+3
	The mailbox offset is not only used for receiving messages, but it is also used by messages sent to the system controller by Linux that have a payload, such as the "digital signature service". It is also overloaded by certain other services (reprogramming of the FPGA fabric, see Link:) to have a meaning other than the offset the system controller should read from. When the driver was written, no such services of the latter type were in use & those of the former used an offset of zero so this has gone un-noticed. Link: https://www.microsemi.com/document-portal/doc_download/1245815-polarfire-fpga-and-polarfire-soc-fpga-system-services-user-guide # Section 5.2 Fixes: 83d7b1560810 ("mbox: add polarfire soc system controller mailbox") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2022-10-05	mailbox: mpfs: fix handling of the reg property	Conor Dooley	1	-10/+14
	The "data" region of the PolarFire SoC's system controller mailbox is not one continuous register space - the system controller's QSPI sits between the control and data registers. Split the "data" reg into two parts: "data" & "control". Optionally get the "data" register address from the 3rd reg property in the devicetree & fall back to using the old base + MAILBOX_REG_OFFSET that the current code uses. Fixes: 83d7b1560810 ("mbox: add polarfire soc system controller mailbox") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2022-10-05	dt-bindings: mailbox: fix the mpfs' reg property	Conor Dooley	1	-4/+11
	The "data" region of the PolarFire SoC's system controller mailbox is not one continuous register space - the system controller's QSPI sits between the control and data registers. Split the "data" reg into two parts: "data" & "control". Fixes: 213556235526 ("dt-bindings: soc/microchip: update syscontroller compatibles") Fixes: ed9543d6f2c4 ("dt-bindings: add bindings for polarfire soc mailbox") Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2022-10-05	mailbox: imx: fix RST channel support	Peng Fan	1	-5/+5
	Because IMX_MU_xCR_MAX was increased to 5, some mu cfgs were not updated to include the CR register. Add the missed CR register to xcr array. Fixes: 82ab513baed5 ("mailbox: imx: support RST channel") Reported-by: Liu Ying <victor.liu@nxp.com> Signed-off-by: Peng Fan <peng.fan@nxp.com> Tested-by: Liu Ying <victor.liu@nxp.com> # i.MX8qm/qxp MEK boards boot Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2022-10-05	tracing: Do not free snapshot if tracer is on cmdline	Steven Rostedt (Google)	1	-4/+6
	The ftrace_boot_snapshot and alloc_snapshot cmdline options allocate the snapshot buffer at boot up for use later. The ftrace_boot_snapshot in particular requires the snapshot to be allocated because it will take a snapshot at the end of boot up allowing to see the traces that happened during boot so that it's not lost when user space takes over. When a tracer is registered (started) there's a path that checks if it requires the snapshot buffer or not, and if it does not and it was allocated it will do a synchronization and free the snapshot buffer. This is only required if the previous tracer was using it for "max latency" snapshots, as it needs to make sure all max snapshots are complete before freeing. But this is only needed if the previous tracer was using the snapshot buffer for latency (like irqoff tracer and friends). But it does not make sense to free it, if the previous tracer was not using it, and the snapshot was allocated by the cmdline parameters. This basically takes away the point of allocating it in the first place! Note, the allocated snapshot worked fine for just trace events, but fails when a tracer is enabled on the cmdline. Further investigation, this goes back even further and it does not require a tracer on the cmdline to fail. Simply enable snapshots and then enable a tracer, and it will remove the snapshot. Link: https://lkml.kernel.org/r/20221005113757.041df7fe@gandalf.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@vger.kernel.org Fixes: 45ad21ca5530 ("tracing: Have trace_array keep track if snapshot buffer is allocated") Reported-by: Ross Zwisler <zwisler@kernel.org> Tested-by: Ross Zwisler <zwisler@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-10-05	ftrace: Still disable enabled records marked as disabled	Steven Rostedt (Google)	1	-4/+16
	Weak functions started causing havoc as they showed up in the "available_filter_functions" and this confused people as to why some functions marked as "notrace" were listed, but when enabled they did nothing. This was because weak functions can still have fentry calls, and these addresses get added to the "available_filter_functions" file. kallsyms is what converts those addresses to names, and since the weak functions are not listed in kallsyms, it would just pick the function before that. To solve this, there was a trick to detect weak functions listed, and these records would be marked as DISABLED so that they do not get enabled and are mostly ignored. As the processing of the list of all functions to figure out what is weak or not can take a long time, this process is put off into a kernel thread and run in parallel with the rest of start up. Now the issue happens whet function tracing is enabled via the kernel command line. As it starts very early in boot up, it can be enabled before the records that are weak are marked to be disabled. This causes an issue in the accounting, as the weak records are enabled by the command line function tracing, but after boot up, they are not disabled. The ftrace records have several accounting flags and a ref count. The DISABLED flag is just one. If the record is enabled before it is marked DISABLED it will get an ENABLED flag and also have its ref counter incremented. After it is marked for DISABLED, neither the ENABLED flag nor the ref counter is cleared. There's sanity checks on the records that are performed after an ftrace function is registered or unregistered, and this detected that there were records marked as ENABLED with ref counter that should not have been. Note, the module loading code uses the DISABLED flag as well to keep its functions from being modified while its being loaded and some of these flags may get set in this process. So changing the verification code to ignore DISABLED records is a no go, as it still needs to verify that the module records are working too. Also, the weak functions still are calling a trampoline. Even though they should never be called, it is dangerous to leave these weak functions calling a trampoline that is freed, so they should still be set back to nops. There's two places that need to not skip records that have the ENABLED and the DISABLED flags set. That is where the ftrace_ops is processed and sets the records ref counts, and then later when the function itself is to be updated, and the ENABLED flag gets removed. Add a helper function "skip_record()" that returns true if the record has the DISABLED flag set but not the ENABLED flag. Link: https://lkml.kernel.org/r/20221005003809.27d2b97b@gandalf.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@vger.kernel.org Fixes: b39181f7c6907 ("ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-10-05	riscv: compat: s/failed/unsupported if compat mode isn't supported	Jisheng Zhang	1	-1/+1
	When compat mode isn't supported(I believe this is the most case now), kernel will emit somthing as: [ 0.050407] riscv: ELF compat mode failed This msg may make users think there's something wrong with the kernel itself, replace "failed" with "unsupported" to make it clear. In fact this is the real compat_mode_supported meaning. After the patch, the msg would be: [ 0.050407] riscv: ELF compat mode unsupported Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Acked-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20220821141819.3804-1-jszhang@kernel.org/ Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>