aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2022-09-01audit_init_parent(): constify pathAl Viro1-1/+1
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-09-01__io_setxattr(): constify pathAl Viro1-1/+1
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-09-01do_proc_readlink(): constify pathAl Viro1-1/+1
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-09-01overlayfs: constify pathAl Viro8-40/+40
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-09-01fs/notify: constify pathAl Viro3-5/+5
Reviewed-by: Matthew Bobrowski <repnop@google.com> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-09-01may_linkat(): constify pathAl Viro2-2/+2
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-09-01do_sys_name_to_handle(): constify pathAl Viro1-1/+1
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-09-01->getprocattr(): attribute name is const char *, TYVM...Al Viro7-9/+9
cast of ->d_name.name to char * is completely wrong - nothing is allowed to modify its contents. Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Acked-by: Paul Moore <paul@paul-moore.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-09-01fs: fix UAF/GPF bug in nilfs_mdt_destroyDongliang Mu1-4/+3
In alloc_inode, inode_init_always() could return -ENOMEM if security_inode_alloc() fails, which causes inode->i_private uninitialized. Then nilfs_is_metadata_file_inode() returns true and nilfs_free_inode() wrongly calls nilfs_mdt_destroy(), which frees the uninitialized inode->i_private and leads to crashes(e.g., UAF/GPF). Fix this by moving security_inode_alloc just prior to this_cpu_inc(nr_inodes) Link: https://lkml.kernel.org/r/CAFcO6XOcf1Jj2SeGt=jJV59wmhESeSKpfR0omdFRq+J9nD1vfQ@mail.gmail.com Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com> Reported-by: Hao Sun <sunhao.th@gmail.com> Reported-by: Jiacheng Xu <stitch@zju.edu.cn> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-09-01hisi_acc_vfio_pci: Correct the function prefix for hssi_acc_drvdata()Shameer Kolothum1-10/+10
Commit 91be0bd6c6cf("vfio/pci: Have all VFIO PCI drivers store the vfio_pci_core_device in drvdata") introduced a helper function to retrieve the drvdata but used "hssi" instead of "hisi" for the function prefix. Correct that and also while at it, moved the function a bit down so that it's close to other hisi_ prefixed functions. No functional changes. Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220831085943.993-1-shameerali.kolothum.thodi@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio: Remove vfio_group dev_counterJason Gunthorpe1-6/+3
This counts the number of devices attached to a vfio_group, ie the number of items in the group->device_list. It is only read in vfio_pin_pages(), as some kind of protection against limitations in type1. However, with all the code cleanups in this area, now that vfio_pin_pages() accepts a vfio_device directly it is redundant. All drivers are already calling vfio_register_emulated_iommu_dev() which directly creates a group specifically for the device and thus it is guaranteed that there is a singleton group. Leave a note in the comment about this requirement and remove the logic. Reviewed-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/0-v2-d4374a7bf0c9+c4-vfio_dev_counter_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio/pci: Implement VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUPAbhishek Sahu2-3/+61
This patch implements VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP device feature. In the VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY, if there is any access for the VFIO device on the host side, then the device will be moved out of the low power state without the user's guest driver involvement. Once the device access has been finished, then the host can move the device again into low power state. With the low power entry happened through VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP, the device will not be moved back into the low power state and a notification will be sent to the user by triggering wakeup eventfd. vfio_pci_core_pm_entry() will be called for both the variants of low power feature entry so add an extra argument for wakeup eventfd context and store locally in 'struct vfio_pci_core_device'. For the entry happened without wakeup eventfd, all the exit related handling will be done by the LOW_POWER_EXIT device feature only. When the LOW_POWER_EXIT will be called, then the vfio core layer vfio_device_pm_runtime_get() will increment the usage count and will resume the device. In the driver runtime_resume callback, the 'pm_wake_eventfd_ctx' will be NULL. Then vfio_pci_core_pm_exit() will call vfio_pci_runtime_pm_exit() and all the exit related handling will be done. For the entry happened with wakeup eventfd, in the driver resume callback, eventfd will be triggered and all the exit related handling will be done. When vfio_pci_runtime_pm_exit() will be called by vfio_pci_core_pm_exit(), then it will return early. But if the runtime suspend has not happened on the host side, then all the exit related handling will be done in vfio_pci_core_pm_exit() only. Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com> Link: https://lore.kernel.org/r/20220829114850.4341-6-abhsahu@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio/pci: Implement VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY/EXITAbhishek Sahu2-8/+146
Currently, if the runtime power management is enabled for vfio-pci based devices in the guest OS, then the guest OS will do the register write for PCI_PM_CTRL register. This write request will be handled in vfio_pm_config_write() where it will do the actual register write of PCI_PM_CTRL register. With this, the maximum D3hot state can be achieved for low power. If we can use the runtime PM framework, then we can achieve the D3cold state (on the supported systems) which will help in saving maximum power. 1. D3cold state can't be achieved by writing PCI standard PM config registers. This patch implements the following newly added low power related device features: - VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY - VFIO_DEVICE_FEATURE_LOW_POWER_EXIT The VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY feature will allow the device to make use of low power platform states on the host while the VFIO_DEVICE_FEATURE_LOW_POWER_EXIT will prevent further use of those power states. 2. The vfio-pci driver uses runtime PM framework for low power entry and exit. On the platforms where D3cold state is supported, the runtime PM framework will put the device into D3cold otherwise, D3hot or some other power state will be used. There are various cases where the device will not go into the runtime suspended state. For example, - The runtime power management is disabled on the host side for the device. - The user keeps the device busy after calling LOW_POWER_ENTRY. - There are dependent devices that are still in runtime active state. For these cases, the device will be in the same power state that has been configured by the user through PCI_PM_CTRL register. 3. The hypervisors can implement virtual ACPI methods. For example, in guest linux OS if PCI device ACPI node has _PR3 and _PR0 power resources with _ON/_OFF method, then guest linux OS invokes the _OFF method during D3cold transition and then _ON during D0 transition. The hypervisor can tap these virtual ACPI calls and then call the low power device feature IOCTL. 4. The 'pm_runtime_engaged' flag tracks the entry and exit to runtime PM. This flag is protected with 'memory_lock' semaphore. 5. All the config and other region access are wrapped under pm_runtime_resume_and_get() and pm_runtime_put(). So, if any device access happens while the device is in the runtime suspended state, then the device will be resumed first before access. Once the access has been finished, then the device will again go into the runtime suspended state. 6. The memory region access through mmap will not be allowed in the low power state. Since __vfio_pci_memory_enabled() is a common function, so check for 'pm_runtime_engaged' has been added explicitly in vfio_pci_mmap_fault() to block only mmap'ed access. Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com> Link: https://lore.kernel.org/r/20220829114850.4341-5-abhsahu@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio/pci: Mask INTx during runtime suspendAbhishek Sahu4-6/+41
This patch adds INTx handling during runtime suspend/resume. All the suspend/resume related code for the user to put the device into the low power state will be added in subsequent patches. The INTx lines may be shared among devices. Whenever any INTx interrupt comes for the VFIO devices, then vfio_intx_handler() will be called for each device sharing the interrupt. Inside vfio_intx_handler(), it calls pci_check_and_mask_intx() and checks if the interrupt has been generated for the current device. Now, if the device is already in the D3cold state, then the config space can not be read. Attempt to read config space in D3cold state can cause system unresponsiveness in a few systems. To prevent this, mask INTx in runtime suspend callback, and unmask the same in runtime resume callback. If INTx has been already masked, then no handling is needed in runtime suspend/resume callbacks. 'pm_intx_masked' tracks this, and vfio_pci_intx_mask() has been updated to return true if the INTx vfio_pci_irq_ctx.masked value is changed inside this function. For the runtime suspend which is triggered for the no user of VFIO device, the 'irq_type' will be VFIO_PCI_NUM_IRQS and these callbacks won't do anything. The MSI/MSI-X are not shared so similar handling should not be needed for MSI/MSI-X. vfio_msihandler() triggers eventfd_signal() without doing any device-specific config access. When the user performs any config access or IOCTL after receiving the eventfd notification, then the device will be moved to the D0 state first before servicing any request. Another option was to check this flag 'pm_intx_masked' inside vfio_intx_handler() instead of masking the interrupts. This flag is being set inside the runtime_suspend callback but the device can be in non-D3cold state (for example, if the user has disabled D3cold explicitly by sysfs, the D3cold is not supported in the platform, etc.). Also, in D3cold supported case, the device will be in D0 till the PCI core moves the device into D3cold. In this case, there is a possibility that the device can generate an interrupt. Adding check in the IRQ handler will not clear the IRQ status and the interrupt line will still be asserted. This can cause interrupt flooding. Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com> Link: https://lore.kernel.org/r/20220829114850.4341-4-abhsahu@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio: Increment the runtime PM usage count during IOCTL callAbhishek Sahu1-3/+49
The vfio-pci based drivers will have runtime power management support where the user can put the device into the low power state and then PCI devices can go into the D3cold state. If the device is in the low power state and the user issues any IOCTL, then the device should be moved out of the low power state first. Once the IOCTL is serviced, then it can go into the low power state again. The runtime PM framework manages this with help of usage count. One option was to add the runtime PM related API's inside vfio-pci driver but some IOCTL (like VFIO_DEVICE_FEATURE) can follow a different path and more IOCTL can be added in the future. Also, the runtime PM will be added for vfio-pci based drivers variant currently, but the other VFIO based drivers can use the same in the future. So, this patch adds the runtime calls runtime-related API in the top-level IOCTL function itself. For the VFIO drivers which do not have runtime power management support currently, the runtime PM API's won't be invoked. Only for vfio-pci based drivers currently, the runtime PM API's will be invoked to increment and decrement the usage count. In the vfio-pci drivers also, the variant drivers can opt-out by incrementing the usage count during device-open. The pm_runtime_resume_and_get() checks the device current status and will return early if the device is already in the ACTIVE state. Taking this usage count incremented while servicing IOCTL will make sure that the user won't put the device into the low power state when any other IOCTL is being serviced in parallel. Let's consider the following scenario: 1. Some other IOCTL is called. 2. The user has opened another device instance and called the IOCTL for low power entry. 3. The low power entry IOCTL moves the device into the low power state. 4. The other IOCTL finishes. If we don't keep the usage count incremented then the device access will happen between step 3 and 4 while the device has already gone into the low power state. The pm_runtime_resume_and_get() will be the first call so its error should not be propagated to user space directly. For example, if pm_runtime_resume_and_get() can return -EINVAL for the cases where the user has passed the correct argument. So the pm_runtime_resume_and_get() errors have been masked behind -EIO. Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com> Link: https://lore.kernel.org/r/20220829114850.4341-3-abhsahu@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio: Add the device features for the low power entry and exitAbhishek Sahu1-0/+56
This patch adds the following new device features for the low power entry and exit in the header file. The implementation for the same will be added in the subsequent patches. - VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY - VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP - VFIO_DEVICE_FEATURE_LOW_POWER_EXIT For vfio-pci based devices, with the standard PCI PM registers, all power states cannot be achieved. The platform-based power management needs to be involved to go into the lowest power state. For doing low power entry and exit with platform-based power management, these device features can be used. The entry device feature has two variants. These two variants are mainly to support the different behaviour for the low power entry. If there is any access for the VFIO device on the host side, then the device will be moved out of the low power state without the user's guest driver involvement. Some devices (for example NVIDIA VGA or 3D controller) require the user's guest driver involvement for each low-power entry. In the first variant, the host can return the device to low power automatically. The device will continue to attempt to reach low power until the low power exit feature is called. In the second variant, if the device exits low power due to an access, the host kernel will signal the user via the provided eventfd and will not return the device to low power without a subsequent call to one of the low power entry features. A call to the low power exit feature is optional if the user provided eventfd is signaled. These device features only support VFIO_DEVICE_FEATURE_SET and VFIO_DEVICE_FEATURE_PROBE operations. Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com> Link: https://lore.kernel.org/r/20220829114850.4341-2-abhsahu@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio: Split VFIO_GROUP_GET_STATUS into a functionJason Gunthorpe1-31/+30
This is the last sizable implementation in vfio_group_fops_unl_ioctl(), move it to a function so vfio_group_fops_unl_ioctl() is emptied out. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/8-v2-0f9e632d54fb+d6-vfio_ioctl_split_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio: Follow the naming pattern for vfio_group_ioctl_unset_container()Jason Gunthorpe1-11/+16
Make it clear that this is the body of the ioctl. Fold the locking into the function so it is self contained like the other ioctls. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/7-v2-0f9e632d54fb+d6-vfio_ioctl_split_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio: Fold VFIO_GROUP_SET_CONTAINER into vfio_group_set_container()Jason Gunthorpe1-27/+24
No reason to split it up like this, just have one function to process the ioctl. Move the lock into the function as well to avoid having a lockdep annotation. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/6-v2-0f9e632d54fb+d6-vfio_ioctl_split_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio: Fold VFIO_GROUP_GET_DEVICE_FD into vfio_group_get_device_fd()Jason Gunthorpe1-13/+11
No reason to split it up like this, just have one function to process the ioctl. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/5-v2-0f9e632d54fb+d6-vfio_ioctl_split_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio-pci: Replace 'void __user *' with proper types in the ioctl functionsJason Gunthorpe1-30/+28
This makes the code clearer and replaces a few places trying to access a flex array with an actual flex array. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4-v2-0f9e632d54fb+d6-vfio_ioctl_split_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio-pci: Re-indent what was vfio_pci_core_ioctl()Jason Gunthorpe1-361/+348
Done mechanically with: $ git clang-format-14 -i --lines 675:1210 drivers/vfio/pci/vfio_pci_core.c And manually reflow the multi-line comments clang-format doesn't fix. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v2-0f9e632d54fb+d6-vfio_ioctl_split_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio-pci: Break up vfio_pci_core_ioctl() into one function per ioctlJason Gunthorpe1-29/+68
500 lines is a bit long for a single function, move the bodies of each ioctl into separate functions and leave behind a switch statement to dispatch them. This patch just adds the function declarations and does not fix the indenting. The next patch will restore the indenting. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v2-0f9e632d54fb+d6-vfio_ioctl_split_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio-pci: Fix vfio_pci_ioeventfd() to return intJason Gunthorpe2-4/+4
This only returns 0 or -ERRNO, it should return int like all the other ioctl dispatch functions. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1-v2-0f9e632d54fb+d6-vfio_ioctl_split_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio/pci: Simplify the is_intx/msi/msix/etc definesJason Gunthorpe3-8/+18
Only three of these are actually used, simplify to three inline functions, and open code the if statement in vfio_pci_config.c. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/3-v2-1bd95d72f298+e0e-vfio_pci_priv_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio/pci: Rename vfio_pci_register_dev_region()Jason Gunthorpe3-12/+12
As this is part of the vfio_pci_core component it should be called vfio_pci_core_register_dev_region() like everything else exported from this module. Suggested-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/2-v2-1bd95d72f298+e0e-vfio_pci_priv_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01vfio/pci: Split linux/vfio_pci_core.hJason Gunthorpe9-140/+145
The header in include/linux should have only the exported interface for other vfio_pci modules to use. Internal definitions for vfio_pci.ko should be in a "priv" header along side the .c files. Move the internal declarations out of vfio_pci_core.h. They either move to vfio_pci_priv.h or to the C file that is the only user. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/1-v2-1bd95d72f298+e0e-vfio_pci_priv_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01bpf: Remove useless else ifJiapeng Chong1-3/+1
The assignment of the else and else if branches is the same, so the else if here is redundant, so we remove it and add a comment to make the code here readable. ./kernel/bpf/cgroup_iter.c:81:6-8: WARNING: possible condition with no effect (if == else). Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2016 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Link: https://lore.kernel.org/r/20220831021618.86770-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-09-02Merge tag 'drm-msm-fixes-2022-08-27' of https://gitlab.freedesktop.org/drm/msm into drm-fixesDave Airlie7-4/+17
Fixes for v6.0 - Fix for inconsistent indenting in function msm_dsi_dphy_timing_calc_v3. This fixes a smatch warning reported by kbot - Fix to make eDP the first connector in the connected list. This was mainly done to address a screen corruption issue we were seeing on sc7280 boards which have eDP as the primary display. The corruption itself is from usermode but we decided to fix it this way because things work correct with the primary display as the first one for usermode - Fix to populate intf_cfg correctly before calling reset_intf_cfg(). Without this, the display pipeline is not torn down correctly for writeback - Specify the correct number of DSI regulators for SDM660. It should have been 1 but 2 was mentioned - Specify the correct number of DSI regulators for MSM8996. It should have been 3 but 2 was mentioned - Fix for removing DP_RECOVERED_CLOCK_OUT_EN bit for tps4 link training for DP. This was causing link training failures and hence no display for a specific DP to HDMI cable on chromebooks - Fix probe-deferral crash in gpu devfreq - Fix gpu debugfs deadlock Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGtuY=jd44itwTkLXVqhnoKgY0BswPTrxDTxCiPG3WbmLA@mail.gmail.com
2022-09-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski331-1417/+2531
tools/testing/selftests/net/.gitignore sort the net-next version and use it Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-02Merge tag 'amd-drm-fixes-6.0-2022-08-31' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixesDave Airlie23-96/+234
amd-drm-fixes-6.0-2022-08-31: amdgpu: - FRU error message fix - MES 11 updates - DCN 3.2.x fixes - DCN 3.1.4 fixes - Fix possible use after free in CS IOCTL - SMU 13.0.x fixes - Fix iolink reporting on devices with direct connections to CPU - GFX10 tap delay firmware fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220831212312.5921-1-alexander.deucher@amd.com
2022-09-02Merge tag 'drm-misc-fixes-2022-08-31' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixesDave Airlie1-1/+2
Short summary of fixes pull: * dma-buf/dma-resv: Fence-handling fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/Yw+pZnEbPxkJ1nHa@linux-uq9g.fritz.box
2022-09-01Merge branch 'Use this_cpu_xxx for preemption-safety'Martin KaFai Lau8-28/+191
Hou Tao says: ==================== From: Hou Tao <houtao1@huawei.com> Hi, The patchset aims to make the update of per-cpu prog->active and per-cpu bpf_task_storage_busy being preemption-safe. The problem is on same architectures (e.g. arm64), __this_cpu_{inc|dec|inc_return} are neither preemption-safe nor IRQ-safe, so under fully preemptible kernel the concurrent updates on these per-cpu variables may be interleaved and the final values of these variables may be not zero. Patch 1 & 2 use the preemption-safe per-cpu helpers to manipulate prog->active and bpf_task_storage_busy. Patch 3 & 4 add a test case in map_tests to show the concurrent updates on the per-cpu bpf_task_storage_busy by using __this_cpu_{inc|dec} are not atomic. Comments are always welcome. Regards, Tao Change Log: v2: * Patch 1: update commit message to indicate the problem is only possible for fully preemptible kernel * Patch 2: a new patch which fixes the problem for prog->active * Patch 3 & 4: move it to test_maps and make it depend on CONFIG_PREEMPT v1: https://lore.kernel.org/bpf/20220829142752.330094-1-houtao@huaweicloud.com/ ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-09-01selftests/bpf: Test concurrent updates on bpf_task_storage_busyHou Tao2-0/+161
Under full preemptible kernel, task local storage lookup operations on the same CPU may update per-cpu bpf_task_storage_busy concurrently. If the update of bpf_task_storage_busy is not preemption safe, the final value of bpf_task_storage_busy may become not-zero forever and bpf_task_storage_trylock() will always fail. So add a test case to ensure the update of bpf_task_storage_busy is preemption safe. Will skip the test case when CONFIG_PREEMPT is disabled, and it can only reproduce the problem probabilistically. By increasing TASK_STORAGE_MAP_NR_LOOP and running it under ARM64 VM with 4-cpus, it takes about four rounds to reproduce: > test_maps is modified to only run test_task_storage_map_stress_lookup() $ export TASK_STORAGE_MAP_NR_THREAD=256 $ export TASK_STORAGE_MAP_NR_LOOP=81920 $ export TASK_STORAGE_MAP_PIN_CPU=1 $ time ./test_maps test_task_storage_map_stress_lookup(135):FAIL:bad bpf_task_storage_busy got -2 real 0m24.743s user 0m6.772s sys 0m17.966s Signed-off-by: Hou Tao <houtao1@huawei.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20220901061938.3789460-5-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-09-01selftests/bpf: Move sys_pidfd_open() into task_local_storage_helpers.hHou Tao3-18/+20
sys_pidfd_open() is defined twice in both test_bprm_opts.c and test_local_storage.c, so move it to a common header file. And it will be used in map_tests as well. Signed-off-by: Hou Tao <houtao1@huawei.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20220901061938.3789460-4-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-09-01bpf: Use this_cpu_{inc_return|dec} for prog->activeHou Tao1-4/+4
Both __this_cpu_inc_return() and __this_cpu_dec() are not preemption safe and now migrate_disable() doesn't disable preemption, so the update of prog-active is not atomic and in theory under fully preemptible kernel recurisve prevention may do not work. Fixing by using the preemption-safe and IRQ-safe variants. Fixes: ca06f55b9002 ("bpf: Add per-program recursion prevention mechanism") Signed-off-by: Hou Tao <houtao1@huawei.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20220901061938.3789460-3-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-09-01bpf: Use this_cpu_{inc|dec|inc_return} for bpf_task_storage_busyHou Tao2-6/+6
Now migrate_disable() does not disable preemption and under some architectures (e.g. arm64) __this_cpu_{inc|dec|inc_return} are neither preemption-safe nor IRQ-safe, so for fully preemptible kernel concurrent lookups or updates on the same task local storage and on the same CPU may make bpf_task_storage_busy be imbalanced, and bpf_task_storage_trylock() on the specific cpu will always fail. Fixing it by using this_cpu_{inc|dec|inc_return} when manipulating bpf_task_storage_busy. Fixes: bc235cdb423a ("bpf: Prevent deadlock from recursive bpf_task_storage_[get|delete]") Signed-off-by: Hou Tao <houtao1@huawei.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20220901061938.3789460-2-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-09-01drm/amd/display: fix indentation in commit_planes_for_stream()Hamza Mahfooz1-3/+4
Address the following warning: drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:3508:9: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation] 3508 | if (update_type != UPDATE_TYPE_FAST) | ^~ drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:3510:17: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’ 3510 | if (update_type != UPDATE_TYPE_FAST) | ^~ Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-01drm/amd/display: remove a duplicated declarationAsher Song1-1/+1
There are duplicated declarations of i, remove one of those. Signed-off-by: Asher Song <Asher.Song@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-01drm/amdgpu/vcn: Add MMSCH v4_0 support for sriovJane Jian1-0/+140
These structures are basically ported from MMSCH v3_0, besides, added RB and RB4 enablement flag to support unified queue Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-01drm/amdgpu/vcn: Add sriov VCN v4_0 unified queue supportJane Jian3-15/+300
Enable unified queue support for sriov, abandon all previous multi-queue settings Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-01drm/amdgpu/vcn: Add vcn/vcn1 in white list to load its firmware under sriovJane Jian1-1/+3
Previously since vcn0/vcn1 are not enabled, loading firmware is skipped. Now add firmware loading back since vcn0/vcn1 has already been enabled on sriov Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-01drm/amdgpu/vcn: Disable CG/PG for SRIOVJane Jian1-0/+5
For sriov, CG and MG are controlled from hypervisor side, no need to manage them again in ip init Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-01drm/amd: Skip smu_v13 register irq on SRIOV VFYifan Zha1-0/+9
[Why] VF do not need to handle SMU IRQ state. L1 Policy will block VF access THM_THERMAL_INT_CTRL and MP1_SMN_IH_SW_INT/CNTL. [How] Skip smu_v13 init register_irq_handler under SRIOV VF. And add irq_src check in enable/disable thermal alert to avoid thermal alert enable/disable fail. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-01drm/admgpu: Skip CG/PG on SOC21 under SRIOV VFYifan Zha1-0/+4
[Why] There is no CG(Clock Gating)/PG(Power Gating) requirement on SRIOV VF. For multi VF, VF should not enable any CG/PG features. For one VF, PF will program CG/PG related registers. [How] Do not set any cg/pg flag bit at early init under sriov. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-01drm/amdgpu: Use RLCG to program GRBM_GFX_CNTL during full access timeYifan Zha1-1/+1
[Why] KIQ register init requires GRBM_GFX_CNTL to select KIQ. [How] As RLCG accessing registers will save the data of GRBM_GFX_CNTL and restore it. Use RLCG indirect accessing register method to select grbm instead of mmio directly access. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-01drm/amdgpu: Skip program SDMA0_SEM_WAIT_FAIL_TIMER_CNTL under SRIOV VFYifan Zha1-1/+2
[Why] As SDMA0_SEM_WAIT_FAIL_TIMER_CNTL is a PF-only register, L1 would block this register for VF access. [How] VF do not program it. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-01drm/amdgpu: Skip the VRAM base offset on SRIOVYifan Zha1-1/+4
[Why] As VF cannot read MMMC_VM_FB_OFFSET with L1 Policy(read 0xffffffff). It leads to driver get the incorrect vram base offset. [How] Since SR-IOV is dGPU only, skip reading this register and set the fb_offest to 0. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-01drm/amdgpu: skip "Issue additional private vm invalidation to MMHUB" on SRIOVYifan Zha1-1/+2
[Why] vm_l2_bank_select_reserved_cid2 is a PF_only register that cannot be programmed by VF. This feature is only support HDP using GPUVM page tables to access FB memory which should be disabled on SRIOV. [How] Disable the feature on VF. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-01drm/amdgpu: Skip the program of MMMC_VM_AGP_* in SRIOV on MMHUB v3_0_0Yifan Zha1-5/+5
[Why] VF should not program these registers, the value were defined in the host. [How] Skip writing them in SRIOV environment and program them on host side. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>