aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/testing/selftests/arm64/mte/check_ksm_options.c (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2024-03-19wireguard: netlink: access device through ctx instead of peerJason A. Donenfeld1-2/+2
The previous commit fixed a bug that led to a NULL peer->device being dereferenced. It's actually easier and faster performance-wise to instead get the device from ctx->wg. This semantically makes more sense too, since ctx->wg->peer_allowedips.seq is compared with ctx->allowedips_seq, basing them both in ctx. This also acts as a defence in depth provision against freed peers. Cc: stable@vger.kernel.org Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19wireguard: netlink: check for dangling peer via is_dead instead of empty listJason A. Donenfeld1-3/+3
If all peers are removed via wg_peer_remove_all(), rather than setting peer_list to empty, the peer is added to a temporary list with a head on the stack of wg_peer_remove_all(). If a netlink dump is resumed and the cursored peer is one that has been removed via wg_peer_remove_all(), it will iterate from that peer and then attempt to dump freed peers. Fix this by instead checking peer->is_dead, which was explictly created for this purpose. Also move up the device_update_lock lockdep assertion, since reading is_dead relies on that. It can be reproduced by a small script like: echo "Setting config..." ip link add dev wg0 type wireguard wg setconf wg0 /big-config ( while true; do echo "Showing config..." wg showconf wg0 > /dev/null done ) & sleep 4 wg setconf wg0 <(printf "[Peer]\nPublicKey=$(wg genkey)\n") Resulting in: BUG: KASAN: slab-use-after-free in __lock_acquire+0x182a/0x1b20 Read of size 8 at addr ffff88811956ec70 by task wg/59 CPU: 2 PID: 59 Comm: wg Not tainted 6.8.0-rc2-debug+ #5 Call Trace: <TASK> dump_stack_lvl+0x47/0x70 print_address_description.constprop.0+0x2c/0x380 print_report+0xab/0x250 kasan_report+0xba/0xf0 __lock_acquire+0x182a/0x1b20 lock_acquire+0x191/0x4b0 down_read+0x80/0x440 get_peer+0x140/0xcb0 wg_get_device_dump+0x471/0x1130 Cc: stable@vger.kernel.org Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Reported-by: Lillian Berry <lillian@star-ark.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19wireguard: device: remove generic .ndo_get_stats64Breno Leitao1-1/+0
Commit 3e2f544dd8a33 ("net: get stats64 if device if driver is configured") moved the callback to dev_get_tstats64() to net core, so, unless the driver is doing some custom stats collection, it does not need to set .ndo_get_stats64. Since this driver is now relying in NETDEV_PCPU_STAT_TSTATS, then, it doesn't need to set the dev_get_tstats64() generic .ndo_get_stats64 function pointer. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19wireguard: device: leverage core stats allocatorBreno Leitao1-8/+2
With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core and convert veth & vrf"), stats allocation could be done on net core instead of in this driver. With this new approach, the driver doesn't have to bother with error handling (allocation failure checking, making sure free happens in the right spot, etc). This is core responsibility now. Remove the allocation in this driver and leverage the network core allocation instead. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19wireguard: receive: annotate data-race around receiving_counter.counterNikita Zhandarovich1-3/+3
Syzkaller with KCSAN identified a data-race issue when accessing keypair->receiving_counter.counter. Use READ_ONCE() and WRITE_ONCE() annotations to mark the data race as intentional. BUG: KCSAN: data-race in wg_packet_decrypt_worker / wg_packet_rx_poll write to 0xffff888107765888 of 8 bytes by interrupt on cpu 0: counter_validate drivers/net/wireguard/receive.c:321 [inline] wg_packet_rx_poll+0x3ac/0xf00 drivers/net/wireguard/receive.c:461 __napi_poll+0x60/0x3b0 net/core/dev.c:6536 napi_poll net/core/dev.c:6605 [inline] net_rx_action+0x32b/0x750 net/core/dev.c:6738 __do_softirq+0xc4/0x279 kernel/softirq.c:553 do_softirq+0x5e/0x90 kernel/softirq.c:454 __local_bh_enable_ip+0x64/0x70 kernel/softirq.c:381 __raw_spin_unlock_bh include/linux/spinlock_api_smp.h:167 [inline] _raw_spin_unlock_bh+0x36/0x40 kernel/locking/spinlock.c:210 spin_unlock_bh include/linux/spinlock.h:396 [inline] ptr_ring_consume_bh include/linux/ptr_ring.h:367 [inline] wg_packet_decrypt_worker+0x6c5/0x700 drivers/net/wireguard/receive.c:499 process_one_work kernel/workqueue.c:2633 [inline] ... read to 0xffff888107765888 of 8 bytes by task 3196 on cpu 1: decrypt_packet drivers/net/wireguard/receive.c:252 [inline] wg_packet_decrypt_worker+0x220/0x700 drivers/net/wireguard/receive.c:501 process_one_work kernel/workqueue.c:2633 [inline] process_scheduled_works+0x5b8/0xa30 kernel/workqueue.c:2706 worker_thread+0x525/0x730 kernel/workqueue.c:2787 ... Fixes: a9e90d9931f3 ("wireguard: noise: separate receive counter from send counter") Reported-by: syzbot+d1de830e4ecdaac83d89@syzkaller.appspotmail.com Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19net: move dev->state into net_device_read_txrx groupEric Dumazet3-3/+4
dev->state can be read in rx and tx fast paths. netif_running() which needs dev->state is called from - enqueue_to_backlog() [RX path] - __dev_direct_xmit() [TX path] Fixes: 43a71cd66b9c ("net-device: reorganize net_device fast path variables") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Coco Li <lixiaoyan@google.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240314200845.3050179-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19virtio_net: rename free_old_xmit_skbs to free_old_xmitXuan Zhuo1-5/+5
Since free_old_xmit_skbs not only deals with skb, but also xdp frame and subsequent added xsk, so change the name of this function to free_old_xmit. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20240229072044.77388-19-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19virtio_net: unify the code for recycling the xmit ptrXuan Zhuo1-43/+39
There are two completely similar and independent implementations. This is inconvenient for the subsequent addition of new types. So extract a function from this piece of code and call this function uniformly to recover old xmit ptr. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20240229072044.77388-18-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19virtio-net: add cond_resched() to the command waiting loopJason Wang1-1/+3
Adding cond_resched() to the command waiting loop for a better co-operation with the scheduler. This allows to give CPU a breath to run other task(workqueue) instead of busy looping when preemption is not allowed on a device whose CVQ might be slow. Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230720083839.481487-3-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
2024-03-19virtio-net: convert rx mode setting to use workqueueJason Wang1-3/+52
This patch convert rx mode setting to be done in a workqueue, this is a must for allow to sleep when waiting for the cvq command to response since current code is executed under addr spin lock. Note that we need to disable and flush the workqueue during freeze, this means the rx mode setting is lost after resuming. This is not the bug of this patch as we never try to restore rx mode setting during resume. Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230720083839.481487-2-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
2024-03-19virtio: packed: fix unmap leak for indirect desc tableXuan Zhuo1-3/+3
When use_dma_api and premapped are true, then the do_unmap is false. Because the do_unmap is false, vring_unmap_extra_packed is not called by detach_buf_packed. if (unlikely(vq->do_unmap)) { curr = id; for (i = 0; i < state->num; i++) { vring_unmap_extra_packed(vq, &vq->packed.desc_extra[curr]); curr = vq->packed.desc_extra[curr].next; } } So the indirect desc table is not unmapped. This causes the unmap leak. So here, we check vq->use_dma_api instead. Synchronously, dma info is updated based on use_dma_api judgment This bug does not occur, because no driver use the premapped with indirect. Fixes: b319940f83c2 ("virtio_ring: skip unmap for premapped") Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Message-Id: <20240223071833.26095-1-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vDPA: report virtio-blk flush info to user spaceZhu Lingshan2-0/+15
This commit reports whether a virtio-blk device support cache flush command to user space Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-11-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vDPA: report virtio-block read-only info to user spaceZhu Lingshan2-0/+15
This commit report read-only information of virtio-blk devices to user space. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-10-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vDPA: report virtio-block write zeroes configuration to user spaceZhu Lingshan2-0/+25
This commits reports write zeroes configuration of virtio-block devices to user space, includes: 1)maximum write zeroes sectors size 2)maximum write zeroes segment number Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-9-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vDPA: report virtio-block discarding configuration to user spaceZhu Lingshan2-0/+29
This commit reports virtio-blk discarding configuration to user space,includes: 1) the maximum discard sectors 2) maximum number of discard segments for the block driver to use 3) the alignment for splitting a discarding request Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-8-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vDPA: report virtio-block topology info to user spaceZhu Lingshan2-0/+36
This commit allows vDPA reporting topology information of virtio-blk devices to user space, includes: 1) the number of logical blocks per physical block 2) offset of first aligned logical block 3) suggested minimum I/O size in blocks 4) optimal (suggested maximum) I/O size in blocks Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-7-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vDPA: report virtio-block MQ info to user spaceZhu Lingshan2-0/+17
This commits allows vDPA reporting virtio-block multi-queue configuration to user sapce. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-6-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vDPA: report virtio-block max segments in a request to user spaceZhu Lingshan2-0/+18
This commit allows vDPA reporting the maximum number of segments in a request of virtio-block devices to user space. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-5-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vDPA: report virtio-block block-size to user spaceZhu Lingshan2-0/+19
This commit allows reporting the block size of a virtio-block device to user space. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-4-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vDPA: report virtio-block max segment size to user spaceZhu Lingshan2-0/+18
This commit allows reporting the max size of any single segment of virtio-block devices to user space. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-3-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vDPA: report virtio-block capacity to user spaceZhu Lingshan3-0/+38
This commit allows userspace to query capacity of a virtio-block device. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-2-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19virtio: make virtio_bus constRicardo B. Marliere1-1/+1
Now that the driver core can properly handle constant struct bus_type, move the virtio_bus variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Message-Id: <20240204-bus_cleanup-virtio-v1-1-3bcb2212aaa0@marliere.net> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Jason Wang <jasowang@redhat.com>
2024-03-19vdpa: make vdpa_bus constRicardo B. Marliere1-1/+1
Now that the driver core can properly handle constant struct bus_type, move the vdpa_bus variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Message-Id: <20240204-bus_cleanup-vdpa-v1-1-1745eccb0a5c@marliere.net> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-19vDPA/ifcvf: implement vdpa_config_ops.get_vq_num_minZhu Lingshan2-0/+7
IFCVF HW supports operation with vq size less than the max size, as the spec required. This commit implements vdpa_config_ops.get_vq_num_min to report the minimal size of the virtqueues, which gives vDPA framework a chance to reduce the vring size. We need at least one descriptor to be functional, but it is better no less than 64 to meet ceratin performance requirements. Actually the framework would allocate at least a PAGE for the vq. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-11-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vDPA/ifcvf: get_max_vq_size to return max sizeZhu Lingshan1-5/+1
Since we already implemented vdpa_config_ops.get_vq_size, so get_max_vq_size can return the acutal max size of the virtqueues other than the max allowed safe size. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-10-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19virtio_vdpa: create vqs with the actual sizeZhu Lingshan1-1/+4
The size of a virtqueue is a per vq configuration, this commit allows virtio_vdpa to create virtqueues with the actual size of a specific vq size that supported by the backend device. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-9-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vduse: implement vdpa_config_ops.get_vq_size for vduseZhu Lingshan1-0/+12
This commit implements get_vq_size for vdpa_config_ops. This new interface is used to report per vq size. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-8-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vdpa_sim: implement vdpa_config_ops.get_vq_size for vDPA simulatorZhu Lingshan1-0/+12
This commit implements vdpa_config_ops.get_vq_size for vDPA simulator, this new interface can help report per vq size. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-7-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19eni_vdpa: implement vdpa_config_ops.get_vq_sizeZhu Lingshan1-0/+8
This commit implements get_vq_size which report per vq size in vdpa_config_ops Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-6-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vp_vdpa: implement vdpa_config_ops.get_vq_sizeZhu Lingshan1-0/+8
This commit implements vdpa_config_ops.get_vq_size in vp_vdpa, which reports per virtqueue size. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-5-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vDPA/ifcvf: implement vdpa_config_ops.get_vq_sizeZhu Lingshan3-1/+14
This commit implements vdpa_ops.get_vq_size to report the size of a specific virtqueue. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-4-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vDPA: introduce get_vq_size to vdpa_config_opsZhu Lingshan2-0/+13
This commit introduces a new interface get_vq_size to vDPA config ops, this new interface intends to report the size of a specific virtqueue Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-3-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vhost-vdpa: uapi to support reporting per vq sizeZhu Lingshan1-0/+7
The size of a virtqueue is a per vq configuration. This commit introduce a new ioctl uAPI to support this flexibility. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-2-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vdpa/pds: fixes for VF vdpa flr-aer handlingShannon Nelson3-4/+19
This addresses a couple of things found while testing the FLR and AER handling with the VFs. - release irqs before calling vp_modern_remove() - make sure we have a valid struct pointer before using it to release irqs - make sure the FW is alive before trying to add a new device Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Message-Id: <20240220011050.30913-1-shannon.nelson@amd.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vduse: implement DMA sync callbacksMaxime Coquelin3-3/+54
Since commit 295525e29a5b ("virtio_net: merge dma operations when filling mergeable buffers"), VDUSE device require support for DMA's .sync_single_for_cpu() operation as the memory is non-coherent between the device and CPU because of the use of a bounce buffer. This patch implements both .sync_single_for_cpu() and .sync_single_for_device() callbacks, and also skip bounce buffer copies during DMA map and unmap operations if the DMA_ATTR_SKIP_CPU_SYNC attribute is set to avoid extra copies of the same buffer. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Message-Id: <20240219170606.587290-1-maxime.coquelin@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vdpa/mlx5: Allow CVQ size changesJonah Palmer1-4/+9
The MLX driver was not updating its control virtqueue size at set_vq_num and instead always initialized to MLX5_CVQ_MAX_ENT (16) at setup_cvq_vring. Qemu would try to set the size to 64 by default, however, because the CVQ size always was initialized to 16, an error would be thrown when sending >16 control messages (as used-ring entry 17 is initialized to 0). For example, starting a guest with x-svq=on and then executing the following command would produce the error below: # for i in {1..20}; do ifconfig eth0 hw ether XX:xx:XX:xx:XX:XX; done qemu-system-x86_64: Insufficient written data (0) [ 435.331223] virtio_net virtio0: Failed to set mac address by vq command. SIOCSIFHWADDR: Invalid argument Acked-by: Dragos Tatulea <dtatulea@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240216142502.78095-1-jonah.palmer@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Fixes: 5262912ef3cf ("vdpa/mlx5: Add support for control VQ and MAC setting")
2024-03-19vdpa: skip suspend/resume ops if not DRIVER_OKSteve Sistare1-0/+6
If a vdpa device is not in state DRIVER_OK, then there is no driver state to preserve, so no need to call the suspend and resume driver ops. Suggested-by: Eugenio Perez Martin <eperezma@redhat.com>" Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Message-Id: <1707834358-165470-1-git-send-email-steven.sistare@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
2024-03-19virtio: reenable config if freezing device failedDavid Hildenbrand1-1/+3
Currently, we don't reenable the config if freezing the device failed. For example, virtio-mem currently doesn't support suspend+resume, and trying to freeze the device will always fail. Afterwards, the device will no longer respond to resize requests, because it won't get notified about config changes. Let's fix this by re-enabling the config if freezing fails. Fixes: 22b7050a024d ("virtio: defer config changed notifications") Cc: <stable@kernel.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240213135425.795001-1-david@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vdpa_sim: reset must not runSteve Sistare1-1/+2
vdpasim_do_reset sets running to true, which is wrong, as it allows vdpasim_kick_vq to post work requests before the device has been configured. To fix, do not set running until VIRTIO_CONFIG_S_DRIVER_OK is set. Fixes: 0c89e2a3a9d0 ("vdpa_sim: Implement suspend vdpa op") Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <1707517807-137331-1-git-send-email-steven.sistare@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19virtio: uapi: Drop __packed attribute in linux/virtio_pci.hSuzuki K Poulose1-5/+5
Commit 92792ac752aa ("virtio-pci: Introduce admin command sending function") added "__packed" structures to UAPI header linux/virtio_pci.h. This triggers build failures in the consumer userspace applications without proper "definition" of __packed (e.g., kvmtool build fails). Moreover, the structures are already packed well, and doesn't need explicit packing, similar to the rest of the structures in all virtio_* headers. Remove the __packed attribute. Fixes: 92792ac752aa ("virtio-pci: Introduce admin command sending function") Cc: Feng Liu <feliu@nvidia.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Yishai Hadas <yishaih@nvidia.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Message-Id: <20240125232039.913606-1-suzuki.poulose@arm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19vhost: Added pad cleanup if vnet_hdr is not present.Andrew Melnychenko1-0/+3
When the Qemu launched with vhost but without tap vnet_hdr, vhost tries to copy vnet_hdr from socket iter with size 0 to the page that may contain some trash. That trash can be interpreted as unpredictable values for vnet_hdr. That leads to dropping some packets and in some cases to stalling vhost routine when the vhost_net tries to process packets and fails in a loop. Qemu options: -netdev tap,vhost=on,vnet_hdr=off,... Signed-off-by: Andrew Melnychenko <andrew@daynix.com> Message-Id: <20240115194840.1183077-1-andrew@daynix.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-18bcachefs: Fix lost wakeup on journal shutdownKent Overstreet1-6/+6
We need to check for journal shutdown first in __journal_res_get() - after the journal is shutdown, j->watermark won't be changing anymore. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-18bcachefs; Fix deadlock in bch2_btree_update_start()Kent Overstreet1-4/+9
BCH_TRANS_COMMIT_journal_reclaim with watermark != BCH_WATERMARK_reclaim means nonblocking, and we need the journal_res_get() in btree_update_start() to respect that. In a future refactoring we'll be deleting BCH_TRANS_COMMIT_journal_reclaim and replacing it with an explicit BCH_TRANS_COMMIT_nonblocking. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-18ksmbd: remove module versionNamjae Jeon2-3/+0
ksmbd module version marking is not needed. Since there is a Linux kernel version, there is no point in increasing it anymore. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-18ksmbd: fix potencial out-of-bounds when buffer offset is invalidNamjae Jeon2-29/+42
I found potencial out-of-bounds when buffer offset fields of a few requests is invalid. This patch set the minimum value of buffer offset field to ->Buffer offset to validate buffer length. Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-18x86/hyperv: Use Hyper-V entropy to seed guest random number generatorMichael Kelley4-0/+74
A Hyper-V host provides its guest VMs with entropy in a custom ACPI table named "OEM0". The entropy bits are updated each time Hyper-V boots the VM, and are suitable for seeding the Linux guest random number generator (rng). See a brief description of OEM0 in [1]. Generation 2 VMs on Hyper-V use UEFI to boot. Existing EFI code in Linux seeds the rng with entropy bits from the EFI_RNG_PROTOCOL. Via this path, the rng is seeded very early during boot with good entropy. The ACPI OEM0 table provided in such VMs is an additional source of entropy. Generation 1 VMs on Hyper-V boot from BIOS. For these VMs, Linux doesn't currently get any entropy from the Hyper-V host. While this is not fundamentally broken because Linux can generate its own entropy, using the Hyper-V host provided entropy would get the rng off to a better start and would do so earlier in the boot process. Improve the rng seeding for Generation 1 VMs by having Hyper-V specific code in Linux take advantage of the OEM0 table to seed the rng. For Generation 2 VMs, use the OEM0 table to provide additional entropy beyond the EFI_RNG_PROTOCOL. Because the OEM0 table is custom to Hyper-V, parse it directly in the Hyper-V code in the Linux kernel and use add_bootloader_randomness() to add it to the rng. Once the entropy bits are read from OEM0, zero them out in the table so they don't appear in /sys/firmware/acpi/tables/OEM0 in the running VM. The zero'ing is done out of an abundance of caution to avoid potential security risks to the rng. Also set the OEM0 data length to zero so a kexec or other subsequent use of the table won't try to use the zero'ed bits. [1] https://download.microsoft.com/download/1/c/9/1c9813b8-089c-4fef-b2ad-ad80e79403ba/Whitepaper%20-%20The%20Windows%2010%20random%20number%20generation%20infrastructure.pdf Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Link: https://lore.kernel.org/r/20240318155408.216851-1-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240318155408.216851-1-mhklinux@outlook.com>
2024-03-18x86/hyperv: Cosmetic changes for hv_spinlock.cPurna Pavan Chandra Aekkaladevi1-1/+2
Fix issues reported by checkpatch.pl script for hv_spinlock.c file. - Place __initdata after variable name - Add missing blank line after enum declaration No functional changes intended. Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Link: https://lore.kernel.org/r/1710763751-14137-1-git-send-email-paekkaladevi@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1710763751-14137-1-git-send-email-paekkaladevi@linux.microsoft.com>
2024-03-18btrfs: do not skip re-registration for the mounted deviceAnand Jain1-11/+47
There are reports that since version 6.7 update-grub fails to find the device of the root on systems without initrd and on a single device. This looks like the device name changed in the output of /proc/self/mountinfo: 6.5-rc5 working 18 1 0:16 / / rw,noatime - btrfs /dev/sda8 ... 6.7 not working: 17 1 0:15 / / rw,noatime - btrfs /dev/root ... and "update-grub" shows this error: /usr/sbin/grub-probe: error: cannot find a device for / (is /dev mounted?) This looks like it's related to the device name, but grub-probe recognizes the "/dev/root" path and tries to find the underlying device. However there's a special case for some filesystems, for btrfs in particular. The generic root device detection heuristic is not done and it all relies on reading the device infos by a btrfs specific ioctl. This ioctl returns the device name as it was saved at the time of device scan (in this case it's /dev/root). The change in 6.7 for temp_fsid to allow several single device filesystem to exist with the same fsid (and transparently generate a new UUID at mount time) was to skip caching/registering such devices. This also skipped mounted device. One step of scanning is to check if the device name hasn't changed, and if yes then update the cached value. This broke the grub-probe as it always read the device /dev/root and couldn't find it in the system. A temporary workaround is to create a symlink but this does not survive reboot. The right fix is to allow updating the device path of a mounted filesystem even if this is a single device one. In the fix, check if the device's major:minor number matches with the cached device. If they do, then we can allow the scan to happen so that device_list_add() can take care of updating the device path. The file descriptor remains unchanged. This does not affect the temp_fsid feature, the UUID of the mounted filesystem remains the same and the matching is based on device major:minor which is unique per mounted filesystem. This covers the path when the device (that exists for all mounted devices) name changes, updating /dev/root to /dev/sdx. Any other single device with filesystem and is not mounted is still skipped. Note that if a system is booted and initial mount is done on the /dev/root device, this will be the cached name of the device. Only after the command "btrfs device scan" it will change as it triggers the rename. The fix was verified by users whose systems were affected. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=218353 Link: https://lore.kernel.org/lkml/CAKLYgeJ1tUuqLcsquwuFqjDXPSJpEiokrWK2gisPKDZLs8Y2TQ@mail.gmail.com/ Fixes: bc27d6f0aa0e ("btrfs: scan but don't register device on single device filesystem") CC: stable@vger.kernel.org # 6.7+ Tested-by: Alex Romosan <aromosan@gmail.com> Tested-by: CHECK_1234543212345@protonmail.com Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-19kbuild: rpm-pkg: add dtb files in kernel rpmJose Ignacio Tornos Martinez1-0/+13
Some architectures, like aarch64 ones, need a dtb file to configure the hardware. The default dtb file can be preloaded from u-boot, but the final and/or more complete dtb file needs to be able to be loaded later from rootfs. Add the possible dtb files to the kernel rpm and mimic Fedora shipping process, storing the dtb files in the module directory. These dtb files will be copied to /boot directory by the install scripts, but add fallback just in case, checking if the content in /boot directory is correct. Mark the files installed to /boot as %ghost to make sure they will be removed when the package is uninstalled. Tested with Fedora Rawhide (x86_64 and aarch64) with dnf and rpm tools. In addition, fallback was also tested after modifying the install scripts. Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-03-19kconfig: remove unneeded menu_is_visible() call in conf_write_defconfig()Masahiro Yamada1-4/+1
When the condition 'sym == NULL' is met, the code will reach the 'next_menu' label regardless of the return value from menu_is_visible(). menu_is_visible() calculates some symbol values as a side-effect, for instance by calling expr_calc_value(menu->visibility), but all the symbol values will be calculated eventually. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>