linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2021-06-25	dev_forward_skb: do not scrub skb mark within the same name space	Nicolas Dichtel	1	-1/+1
	The goal is to keep the mark during a bpf_redirect(), like it is done for legacy encapsulation / decapsulation, when there is no x-netns. This was initially done in commit 213dd74aee76 ("skbuff: Do not scrub skb mark within the same name space"). When the call to skb_scrub_packet() was added in dev_forward_skb() (commit 8b27f27797ca ("skb: allow skb_scrub_packet() to be used by tunnels")), the second argument (xnet) was set to true to force a call to skb_orphan(). At this time, the mark was always cleanned up by skb_scrub_packet(), whatever xnet value was. This call to skb_orphan() was removed later in commit 9c4c325252c5 ("skbuff: preserve sock reference when scrubbing the skb."). But this 'true' stayed here without any real reason. Let's correctly set xnet in ____dev_forward_skb(), this function has access to the previous interface and to the new interface. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-25	userfaultfd: uapi: fix UFFDIO_CONTINUE ioctl request definition	Gleb Fotengauer-Malinovskiy	1	-2/+2
	This ioctl request reads from uffdio_continue structure written by userspace which justifies _IOC_WRITE flag. It also writes back to that structure which justifies _IOC_READ flag. See NOTEs in include/uapi/asm-generic/ioctl.h for more information. Fixes: f619147104c8 ("userfaultfd: add UFFDIO_CONTINUE ioctl") Signed-off-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com> Reviewed-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-25	kunit: Support skipped tests	David Gow	1	-6/+67
	The kunit_mark_skipped() macro marks the current test as "skipped", with the provided reason. The kunit_skip() macro will mark the test as skipped, and abort the test. The TAP specification supports this "SKIP directive" as a comment after the "ok" / "not ok" for a test. See the "Directives" section of the TAP spec for details: https://testanything.org/tap-specification.html#directives The 'success' field for KUnit tests is replaced with a kunit_status enum, which can be SUCCESS, FAILURE, or SKIPPED, combined with a 'status_comment' containing information on why a test was skipped. A new 'kunit_status' test suite is added to test this. Signed-off-by: David Gow <davidgow@google.com> Tested-by: Marco Elver <elver@google.com> Reviewed-by: Daniel Latypov <dlatypov@google.com> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-06-25	kunit: introduce kunit_kmalloc_array/kunit_kcalloc() helpers	Daniel Latypov	1	-4/+32
	Add in: * kunit_kmalloc_array() and wire up kunit_kmalloc() to be a special case of it. * kunit_kcalloc() for symmetry with kunit_kzalloc() This should using KUnit more natural by making it more similar to the existing *alloc() APIs. And while we shouldn't necessarily be writing unit tests where overflow should be a concern, it can't hurt to be safe. Signed-off-by: Daniel Latypov <dlatypov@google.com> Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-06-25	Merge tag 'ceph-for-5.13-rc8' of https://github.com/ceph/ceph-client	Linus Torvalds	1	-1/+3
	Pull ceph fixes from Ilya Dryomov: "Two regression fixes from the merge window: one in the auth code affecting old clusters and one in the filesystem for proper propagation of MDS request errors. Also included a locking fix for async creates, marked for stable" * tag 'ceph-for-5.13-rc8' of https://github.com/ceph/ceph-client: libceph: set global_id as soon as we get an auth ticket libceph: don't pass result into ac->ops->handle_reply() ceph: fix error handling in ceph_atomic_open and ceph_lookup ceph: must hold snap_rwsem when filling inode for async create
2021-06-25	Merge tag 'kvmarm-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD	Paolo Bonzini	34	-109/+258
	KVM/arm64 updates for v5.14. - Add MTE support in guests, complete with tag save/restore interface - Reduce the impact of CMOs by moving them in the page-table code - Allow device block mappings at stage-2 - Reduce the footprint of the vmemmap in protected mode - Support the vGIC on dumb systems such as the Apple M1 - Add selftest infrastructure to support multiple configuration and apply that to PMU/non-PMU setups - Add selftests for the debug architecture - The usual crop of PMU fixes
2021-06-25	i2c: core-smbus: Expose PEC calculate function for generic use	Quan Nguyen	1	-0/+1
	Expose the PEC calculation i2c_smbus_pec() for generic use. Signed-off-by: Quan Nguyen <quan@os.amperecomputing.com> Acked-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-06-25	Merge branches 'iommu/fixes', 'arm/rockchip', 'arm/smmu', 'x86/vt-d', 'x86/amd', 'virtio' and 'core' into next	Joerg Roedel	8	-31/+110

2021-06-25	Merge remote-tracking branch 'spi/for-5.14' into spi-next	Mark Brown	7	-35/+156

2021-06-25	Merge remote-tracking branch 'asoc/for-5.14' into asoc-next	Mark Brown	8	-6/+232

2021-06-25	Merge remote-tracking branch 'asoc/for-5.13' into asoc-linus	Mark Brown	1	-0/+3

2021-06-25	iommu/dma: Pass address limit rather than size to iommu_setup_dma_ops()	Jean-Philippe Brucker	1	-2/+2
	Passing a 64-bit address width to iommu_setup_dma_ops() is valid on virtual platforms, but isn't currently possible. The overflow check in iommu_dma_init_domain() prevents this even when @dma_base isn't 0. Pass a limit address instead of a size, so callers don't have to fake a size to work around the check. The base and limit parameters are being phased out, because: * they are redundant for x86 callers. dma-iommu already reserves the first page, and the upper limit is already in domain->geometry. * they can now be obtained from dev->dma_range_map on Arm. But removing them on Arm isn't completely straightforward so is left for future work. As an intermediate step, simplify the x86 callers by passing dummy limits. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20210618152059.1194210-5-jean-philippe@linaro.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-25	ACPI: Add driver for the VIOT table	Jean-Philippe Brucker	1	-0/+19
	The ACPI Virtual I/O Translation Table describes topology of para-virtual platforms, similarly to vendor tables DMAR, IVRS and IORT. For now it describes the relation between virtio-iommu and the endpoints it manages. Three steps are needed to configure DMA of endpoints: (1) acpi_viot_init(): parse the VIOT table, find or create the fwnode associated to each vIOMMU device. This needs to happen after acpi_scan_init(), because it relies on the struct device and their fwnode to be available. (2) When probing the vIOMMU device, the driver registers its IOMMU ops within the IOMMU subsystem. This step doesn't require any intervention from the VIOT driver. (3) viot_iommu_configure(): before binding the endpoint to a driver, find the associated IOMMU ops. Register them, along with the endpoint ID, into the device's iommu_fwspec. If step (3) happens before step (2), it is deferred until the IOMMU is initialized, then retried. Tested-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20210618152059.1194210-4-jean-philippe@linaro.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-25	ACPI: Move IOMMU setup code out of IORT	Jean-Philippe Brucker	2	-5/+6
	Extract the code that sets up the IOMMU infrastructure from IORT, since it can be reused by VIOT. Move it one level up into a new acpi_iommu_configure_id() function, which calls the IORT parsing function which in turn calls the acpi_iommu_fwspec_init() helper. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20210618152059.1194210-3-jean-philippe@linaro.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-25	ACPI: arm64: Move DMA setup operations out of IORT	Jean-Philippe Brucker	2	-3/+6
	Extract generic DMA setup code out of IORT, so it can be reused by VIOT. Keep it in drivers/acpi/arm64 for now, since it could break x86 platforms that haven't run this code so far, if they have invalid tables. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20210618152059.1194210-2-jean-philippe@linaro.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-25	HID: core: Add hid_hw_may_wakeup() function	Hans de Goede	1	-0/+18
	Add a hid_hw_may_wakeup() function, which is the equivalent of device_may_wakeup() for hid devices. In most cases this just returns device_may_wakeup(hdev->dev.parent), but for some ll-drivers this is not correct. E.g. usb_hid_driver instantiated hid devices have their parent set to the usb-interface to which the usb_hid_driver is bound, but the power/wakeup* sysfs attributes are part of the usb-device, which is the usb-interface's parent. For these special cases a new may_wakeup callback is added to hid_ll_driver, so that ll-drivers can override the default behavior. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2021-06-25	spi: core: add dma_map_dev for dma device	Vinod Koul	1	-0/+1
	Some controllers like qcom geni need the parent device to be used for dma mapping, so add a dma_map_dev field and let drivers fill this to be used as mapping device Signed-off-by: Vinod Koul <vkoul@kernel.org> Link: https://lore.kernel.org/r/20210625052213.32260-4-vkoul@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-25	tty: make linux/tty_flip.h self-contained	Jiri Slaby	1	-0/+2
	If someone includes linux/tty_flip.h before linux/tty.h, they see many compiler errors like: include/linux/tty_flip.h:23:30: error: invalid use of undefined type 'struct tty_port' include/linux/tty_flip.h:26:14: error: invalid use of undefined type 'struct tty_buffer' tty_flip.h actually lexicographically sorts before tty.h. So if people sort includes (as I tried in amiserial), the compilation suddenly breaks. Solve this by including linux/tty.h from linux/tty_flip.h, so that everything is defined as needed. Another alternative would be to uninline tty_insert_flip_char and just insert forward declarations of tty_port and tty_buffer structs into tty_flip.h as that inline is the only real user. But that would mean slowing down the fast path without any good reason. (Provided the fix is that easy and there were no real problems with this until now.) Signed-off-by: Jiri Slaby <jslaby@suse.cz> Link: https://lore.kernel.org/r/20210625073511.4514-1-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-24	mm, futex: fix shared futex pgoff on shmem huge page	Hugh Dickins	2	-22/+7
	If more than one futex is placed on a shmem huge page, it can happen that waking the second wakes the first instead, and leaves the second waiting: the key's shared.pgoff is wrong. When 3.11 commit 13d60f4b6ab5 ("futex: Take hugepages into account when generating futex_key"), the only shared huge pages came from hugetlbfs, and the code added to deal with its exceptional page->index was put into hugetlb source. Then that was missed when 4.8 added shmem huge pages. page_to_pgoff() is what others use for this nowadays: except that, as currently written, it gives the right answer on hugetlbfs head, but nonsense on hugetlbfs tails. Fix that by calling hugetlbfs-specific hugetlb_basepage_index() on PageHuge tails as well as on head. Yes, it's unconventional to declare hugetlb_basepage_index() there in pagemap.h, rather than in hugetlb.h; but I do not expect anything but page_to_pgoff() ever to need it. [akpm@linux-foundation.org: give hugetlb_basepage_index() prototype the correct scope] Link: https://lkml.kernel.org/r/b17d946b-d09-326e-b42a-52884c36df32@google.com Fixes: 800d8c63b2e9 ("shmem: add huge pages support") Reported-by: Neel Natu <neelnatu@google.com> Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Zhang Yi <wetpzy@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24	mm/vmalloc: add vmalloc_no_huge	Claudio Imbrenda	1	-0/+1
	Patch series "mm: add vmalloc_no_huge and use it", v4. Add vmalloc_no_huge() and export it, so modules can allocate memory with small pages. Use the newly added vmalloc_no_huge() in KVM on s390 to get around a hardware limitation. This patch (of 2): Commit 121e6f3258fe3 ("mm/vmalloc: hugepage vmalloc mappings") added support for hugepage vmalloc mappings, it also added the flag VM_NO_HUGE_VMAP for __vmalloc_node_range to request the allocation to be performed with 0-order non-huge pages. This flag is not accessible when calling vmalloc, the only option is to call directly __vmalloc_node_range, which is not exported. This means that a module can't vmalloc memory with small pages. Case in point: KVM on s390x needs to vmalloc a large area, and it needs to be mapped with non-huge pages, because of a hardware limitation. This patch adds the function vmalloc_no_huge, which works like vmalloc, but it is guaranteed to always back the mapping using small pages. This new function is exported, therefore it is usable by modules. [akpm@linux-foundation.org: whitespace fixes, per Christoph] Link: https://lkml.kernel.org/r/20210614132357.10202-1-imbrenda@linux.ibm.com Link: https://lkml.kernel.org/r/20210614132357.10202-2-imbrenda@linux.ibm.com Fixes: 121e6f3258fe3 ("mm/vmalloc: hugepage vmalloc mappings") Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24	blk: Fix lock inversion between ioc lock and bfqd lock	Jan Kara	1	-1/+2
	Lockdep complains about lock inversion between ioc->lock and bfqd->lock: bfqd -> ioc: put_io_context+0x33/0x90 -> ioc->lock grabbed blk_mq_free_request+0x51/0x140 blk_put_request+0xe/0x10 blk_attempt_req_merge+0x1d/0x30 elv_attempt_insert_merge+0x56/0xa0 blk_mq_sched_try_insert_merge+0x4b/0x60 bfq_insert_requests+0x9e/0x18c0 -> bfqd->lock grabbed blk_mq_sched_insert_requests+0xd6/0x2b0 blk_mq_flush_plug_list+0x154/0x280 blk_finish_plug+0x40/0x60 ext4_writepages+0x696/0x1320 do_writepages+0x1c/0x80 __filemap_fdatawrite_range+0xd7/0x120 sync_file_range+0xac/0xf0 ioc->bfqd: bfq_exit_icq+0xa3/0xe0 -> bfqd->lock grabbed put_io_context_active+0x78/0xb0 -> ioc->lock grabbed exit_io_context+0x48/0x50 do_exit+0x7e9/0xdd0 do_group_exit+0x54/0xc0 To avoid this inversion we change blk_mq_sched_try_insert_merge() to not free the merged request but rather leave that upto the caller similarly to blk_mq_sched_try_merge(). And in bfq_insert_requests() we make sure to free all the merged requests after dropping bfqd->lock. Fixes: aee69d78dec0 ("block, bfq: introduce the BFQ-v0 I/O scheduler as an extra scheduler") Reviewed-by: Ming Lei <ming.lei@redhat.com> Acked-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210623093634.27879-3-jack@suse.cz Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-24	kvm: x86: Allow userspace to handle emulation errors	Aaron Lewis	1	-0/+23
	Add a fallback mechanism to the in-kernel instruction emulator that allows userspace the opportunity to process an instruction the emulator was unable to. When the in-kernel instruction emulator fails to process an instruction it will either inject a #UD into the guest or exit to userspace with exit reason KVM_INTERNAL_ERROR. This is because it does not know how to proceed in an appropriate manner. This feature lets userspace get involved to see if it can figure out a better path forward. Signed-off-by: Aaron Lewis <aaronlewis@google.com> Reviewed-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210510144834.658457-2-aaronlewis@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24	KVM: debugfs: Reuse binary stats descriptors	Jing Zhang	1	-16/+1
	To remove code duplication, use the binary stats descriptors in the implementation of the debugfs interface for statistics. This unifies the definition of statistics for the binary and debugfs interfaces. Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-8-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24	KVM: stats: Support binary stats retrieval for a VCPU	Jing Zhang	1	-1/+12
	Add a VCPU ioctl to get a statistics file descriptor by which a read functionality is provided for userspace to read out VCPU stats header, descriptors and data. Define VCPU statistics descriptors and header for all architectures. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> #arm64 Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-5-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24	KVM: stats: Support binary stats retrieval for a VM	Jing Zhang	1	-0/+6
	Add a VM ioctl to get a statistics file descriptor by which a read functionality is provided for userspace to read out VM stats header, descriptors and data. Define VM statistics descriptors and header for all architectures. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> #arm64 Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-4-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24	sctp: do black hole detection in search complete state	Xin Long	1	-1/+2
	Currently the PLPMUTD probe will stop for a long period (interval * 30) after it enters search complete state. If there's a pmtu change on the route path, it takes a long time to be aware if the ICMP TooBig packet is lost or filtered. As it says in rfc8899#section-4.3: "A DPLPMTUD method MUST NOT rely solely on this method." (ICMP PTB message). This patch is to enable the other method for search complete state: "A PL can use the DPLPMTUD probing mechanism to periodically generate probe packets of the size of the current PLPMTU." With this patch, the probe will continue with the current pmtu every 'interval' until the PMTU_RAISE_TIMER 'timeout', which we implement by adding raise_count to raise the probe size when it counts to 30 and removing the SCTP_PL_COMPLETE check for PMTU_RAISE_TIMER. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24	net: macsec: fix the length used to copy the key for offloading	Antoine Tenart	1	-1/+1
	The key length used when offloading macsec to Ethernet or PHY drivers was set to MACSEC_KEYID_LEN (16), which is an issue as: - This was never meant to be the key length. - The key length can be > 16. Fix this by using MACSEC_MAX_KEY_LEN to store the key (the max length accepted in uAPI) and secy->key_len to copy it. Fixes: 3cf3227a21d1 ("net: macsec: hardware offloading infrastructure") Reported-by: Lior Nahmanson <liorna@nvidia.com> Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24	PCI: Export pci_dev_trylock() and pci_dev_unlock()	Luis Chamberlain	1	-0/+3
	Other places in the kernel use this form, and so just provide a common path for it. Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20210623022824.308041-2-mcgrof@kernel.org Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-06-24	libceph: set global_id as soon as we get an auth ticket	Ilya Dryomov	1	-1/+3
	Commit 61ca49a9105f ("libceph: don't set global_id until we get an auth ticket") delayed the setting of global_id too much. It is set only after all tickets are received, but in pre-nautilus clusters an auth ticket and the service tickets are obtained in separate steps (for a total of three MAuth replies). When the service tickets are requested, global_id is used to build an authorizer; if global_id is still 0 we never get them and fail to establish the session. Moving the setting of global_id into protocol implementations. This way global_id can be set exactly when an auth ticket is received, not sooner nor later. Fixes: 61ca49a9105f ("libceph: don't set global_id until we get an auth ticket") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2021-06-24	libceph: don't pass result into ac->ops->handle_reply()	Ilya Dryomov	1	-1/+1
	There is no result to pass in msgr2 case because authentication failures are reported through auth_bad_method frame and in MAuth case an error is returned immediately. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2021-06-24	net: mdiobus: fix fwnode_mdbiobus_register() fallback case	Marcin Wojtas	1	-1/+1
	The fallback case of fwnode_mdbiobus_register() (relevant for !CONFIG_FWNODE_MDIO) was defined with wrong argument name, causing a compilation error. Fix that. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24	net: retrieve netns cookie via getsocketopt	Martynas Pumputis	1	-0/+2
	It's getting more common to run nested container environments for testing cloud software. One of such examples is Kind [1] which runs a Kubernetes cluster in Docker containers on a single host. Each container acts as a Kubernetes node, and thus can run any Pod (aka container) inside the former. This approach simplifies testing a lot, as it eliminates complicated VM setups. Unfortunately, such a setup breaks some functionality when cgroupv2 BPF programs are used for load-balancing. The load-balancer BPF program needs to detect whether a request originates from the host netns or a container netns in order to allow some access, e.g. to a service via a loopback IP address. Typically, the programs detect this by comparing netns cookies with the one of the init ns via a call to bpf_get_netns_cookie(NULL). However, in nested environments the latter cannot be used given the Kubernetes node's netns is outside the init ns. To fix this, we need to pass the Kubernetes node netns cookie to the program in a different way: by extending getsockopt() with a SO_NETNS_COOKIE option, the orchestrator which runs in the Kubernetes node netns can retrieve the cookie and pass it to the program instead. Thus, this is following up on Eric's commit 3d368ab87cf6 ("net: initialize net->net_cookie at netns setup") to allow retrieval via SO_NETNS_COOKIE. This is also in line in how we retrieve socket cookie via SO_COOKIE. [1] https://kind.sigs.k8s.io/ Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Martynas Pumputis <m@lambda.lt> Cc: Eric Dumazet <edumazet@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24	block: pass a gendisk to bdev_disk_changed	Christoph Hellwig	1	-1/+1
	bdev_disk_changed can only operate on whole devices. Make that clear by passing a gendisk instead of the struct block_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210624123240.441814-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-24	block: move bdev_disk_changed	Christoph Hellwig	1	-1/+0
	Move bdev_disk_changed to block/partitions/core.c, together with the rest of the partition scanning code. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210624123240.441814-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-24	xdp: Add proper __rcu annotations to redirect map entries	Toke Høiland-Jørgensen	2	-6/+4
	XDP_REDIRECT works by a three-step process: the bpf_redirect() and bpf_redirect_map() helpers will lookup the target of the redirect and store it (along with some other metadata) in a per-CPU struct bpf_redirect_info. Next, when the program returns the XDP_REDIRECT return code, the driver will call xdp_do_redirect() which will use the information thus stored to actually enqueue the frame into a bulk queue structure (that differs slightly by map type, but shares the same principle). Finally, before exiting its NAPI poll loop, the driver will call xdp_do_flush(), which will flush all the different bulk queues, thus completing the redirect. Pointers to the map entries will be kept around for this whole sequence of steps, protected by RCU. However, there is no top-level rcu_read_lock() in the core code; instead drivers add their own rcu_read_lock() around the XDP portions of the code, but somewhat inconsistently as Martin discovered[0]. However, things still work because everything happens inside a single NAPI poll sequence, which means it's between a pair of calls to local_bh_disable()/local_bh_enable(). So Paul suggested[1] that we could document this intention by using rcu_dereference_check() with rcu_read_lock_bh_held() as a second parameter, thus allowing sparse and lockdep to verify that everything is done correctly. This patch does just that: we add an __rcu annotation to the map entry pointers and remove the various comments explaining the NAPI poll assurance strewn through devmap.c in favour of a longer explanation in filter.c. The goal is to have one coherent documentation of the entire flow, and rely on the RCU annotations as a "standard" way of communicating the flow in the map code (which can additionally be understood by sparse and lockdep). The RCU annotation replacements result in a fairly straight-forward replacement where READ_ONCE() becomes rcu_dereference_check(), WRITE_ONCE() becomes rcu_assign_pointer() and xchg() and cmpxchg() gets wrapped in the proper constructs to cast the pointer back and forth between __rcu and __kernel address space (for the benefit of sparse). The one complication is that xskmap has a few constructions where double-pointers are passed back and forth; these simply all gain __rcu annotations, and only the final reference/dereference to the inner-most pointer gets changed. With this, everything can be run through sparse without eliciting complaints, and lockdep can verify correctness even without the use of rcu_read_lock() in the drivers. Subsequent patches will clean these up from the drivers. [0] https://lore.kernel.org/bpf/20210415173551.7ma4slcbqeyiba2r@kafai-mbp.dhcp.thefacebook.com/ [1] https://lore.kernel.org/bpf/20210419165837.GA975577@paulmck-ThinkPad-P17-Gen-1/ Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210624160609.292325-6-toke@redhat.com
2021-06-24	rcu: Create an unrcu_pointer() to remove __rcu from a pointer	Paul E. McKenney	1	-0/+14
	The xchg() and cmpxchg() functions are sometimes used to carry out RCU updates. Unfortunately, this can result in sparse warnings for both the old-value and new-value arguments, as well as for the return value. The arguments can be dealt with using RCU_INITIALIZER(): old_p = xchg(&p, RCU_INITIALIZER(new_p)); But a sparse warning still remains due to assigning the __rcu pointer returned from xchg to the (most likely) non-__rcu pointer old_p. This commit therefore provides an unrcu_pointer() macro that strips the __rcu. This macro can be used as follows: old_p = unrcu_pointer(xchg(&p, RCU_INITIALIZER(new_p))); Reported-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210624160609.292325-2-toke@redhat.com
2021-06-24	KVM: stats: Add fd-based API to read binary stats data	Jing Zhang	3	-2/+155
	This commit defines the API for userspace and prepare the common functionalities to support per VM/VCPU binary stats data readings. The KVM stats now is only accessible by debugfs, which has some shortcomings this change series are supposed to fix: 1. The current debugfs stats solution in KVM could be disabled when kernel Lockdown mode is enabled, which is a potential rick for production. 2. The current debugfs stats solution in KVM is organized as "one stats per file", it is good for debugging, but not efficient for production. 3. The stats read/clear in current debugfs solution in KVM are protected by the global kvm_lock. Besides that, there are some other benefits with this change: 1. All KVM VM/VCPU stats can be read out in a bulk by one copy to userspace. 2. A schema is used to describe KVM statistics. From userspace's perspective, the KVM statistics are self-describing. 3. With the fd-based solution, a separate telemetry would be able to read KVM stats in a less privileged environment. 4. After the initial setup by reading in stats descriptors, a telemetry only needs to read the stats data itself, no more parsing or setup is needed. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> #arm64 Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-3-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24	KVM: stats: Separate generic stats from architecture specific ones	Jing Zhang	1	-0/+12
	Generic KVM stats are those collected in architecture independent code or those supported by all architectures; put all generic statistics in a separate structure. This ensures that they are defined the same way in the statistics API which is being added, removing duplication among different architectures in the declaration of the descriptors. No functional change intended. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-2-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24	fs: remove bdev_try_to_free_page callback	Zhang Yi	1	-1/+0
	After remove the unique user of sop->bdev_try_to_free_page() callback, we could remove the callback and the corresponding blkdev_releasepage() at all. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210610112440.3438139-9-yi.zhang@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-06-24	jbd2,ext4: add a shrinker to release checkpointed buffers	Zhang Yi	2	-0/+127
	Current metadata buffer release logic in bdev_try_to_free_page() have a lot of use-after-free issues when umount filesystem concurrently, and it is difficult to fix directly because ext4 is the only user of s_op->bdev_try_to_free_page callback and we may have to add more special refcount or lock that is only used by ext4 into the common vfs layer, which is unacceptable. One better solution is remove the bdev_try_to_free_page callback, but the real problem is we cannot easily release journal_head on the checkpointed buffer, so try_to_free_buffers() cannot release buffers and page under memory pressure, which is more likely to trigger out-of-memory. So we cannot remove the callback directly before we find another way to release journal_head. This patch introduce a shrinker to free journal_head on the checkpointed transaction. After the journal_head got freed, try_to_free_buffers() could free buffer properly. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210610112440.3438139-6-yi.zhang@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-06-24	jbd2: ensure abort the journal if detect IO error when writing original buffer back	Zhang Yi	1	-0/+11
	Although we merged c044f3d8360 ("jbd2: abort journal if free a async write error metadata buffer"), there is a race between jbd2_journal_try_to_free_buffers() and jbd2_journal_destroy(), so the jbd2_log_do_checkpoint() may still fail to detect the buffer write io error flag which may lead to filesystem inconsistency. jbd2_journal_try_to_free_buffers() ext4_put_super() jbd2_journal_destroy() __jbd2_journal_remove_checkpoint() detect buffer write error jbd2_log_do_checkpoint() jbd2_cleanup_journal_tail() <--- lead to inconsistency jbd2_journal_abort() Fix this issue by introducing a new atomic flag which only have one JBD2_CHECKPOINT_IO_ERROR bit now, and set it in __jbd2_journal_remove_checkpoint() when freeing a checkpoint buffer which has write_io_error flag. Then jbd2_journal_destroy() will detect this mark and abort the journal to prevent updating log tail. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210610112440.3438139-3-yi.zhang@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-06-24	stm class: Spelling fix	Randy Dunlap	1	-1/+1
	Drop the repeated word "the" in a comment. Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [alexander.shishkin: fixed the commit message] Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Link: https://lore.kernel.org/r/20210621151246.31891-2-alexander.shishkin@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-24	HID: input: Add support for Programmable Buttons	Thomas Weißschuh	1	-0/+1
	Map them to KEY_MACRO# event codes. These buttons are defined by HID as follows: "The user defines the function of these buttons to control software applications or GUI objects." This matches the semantics of the KEY_MACRO# input event codes that Linux supports. Also add support for HID "Named Array" collections. Also add hid-debug support for KEY_MACRO#. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2021-06-24	drm/simple-helper: drm_gem_simple_display_pipe_prepare_fb as default	Daniel Vetter	1	-2/+5
	It's tedious to review this all the time, and my audit showed that arcpgu actually forgot to set this. Make this the default and stop worrying. Again I sprinkled WARN_ON_ONCE on top to make sure we don't have strange combinations of hooks: cleanup_fb without prepare_fb doesn't make sense, and since simpler drivers are all new they better be GEM based drivers. v2: Warn and bail when it's _not_ a GEM driver (Noralf) v3: It's neither ... nor, not not (Sam) Acked-by: Sam Ravnborg <sam@ravnborg.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Noralf Trønnes <noralf@tronnes.org> Acked-by: Noralf Trønnes <noralf@tronnes.org> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210623162456.3373469-1-daniel.vetter@ffwll.ch
2021-06-24	drm/vram-helpers: Create DRM_GEM_VRAM_PLANE_HELPER_FUNCS	Daniel Vetter	1	-0/+12
	Like we have for the shadow helpers too, and roll it out to drivers. Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Tian Tao <tiantao6@hisilicon.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Tian Tao <tiantao6@hisilicon.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210622165511.3169559-11-daniel.vetter@ffwll.ch
2021-06-24	drm/atomic-helper: make drm_gem_plane_helper_prepare_fb the default	Daniel Vetter	1	-2/+5
	There's a bunch of atomic drivers who don't do this quite correctly, luckily most of them aren't in wide use or people would have noticed the tearing. By making this the default we avoid the constant audit pain and can additionally remove a ton of lines from vfuncs for a bit more clarity in smaller drivers. While at it complain if there's a cleanup_fb hook but no prepare_fb hook, because that makes no sense. I haven't found any driver which violates this, but better safe than sorry. Subsequent patches will reap the benefits. v2: It's neither ... nor, not not (Sam) Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210623162200.3372056-1-daniel.vetter@ffwll.ch
2021-06-24	dma-buf: Document dma-buf implicit fencing/resv fencing rules	Daniel Vetter	1	-0/+34
	Docs for struct dma_resv are fairly clear: "A reservation object can have attached one exclusive fence (normally associated with write operations) or N shared fences (read operations)." https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects Furthermore a review across all of upstream. First of render drivers and how they set implicit fences: - nouveau follows this contract, see in validate_fini_no_ticket() nouveau_bo_fence(nvbo, fence, !!b->write_domains); and that last boolean controls whether the exclusive or shared fence slot is used. - radeon follows this contract by setting p->relocs[i].tv.num_shared = !r->write_domain; in radeon_cs_parser_relocs(), which ensures that the call to ttm_eu_fence_buffer_objects() in radeon_cs_parser_fini() will do the right thing. - vmwgfx seems to follow this contract with the shotgun approach of always setting ttm_val_buf->num_shared = 0, which means ttm_eu_fence_buffer_objects() will only use the exclusive slot. - etnaviv follows this contract, as can be trivially seen by looking at submit_attach_object_fences() - i915 is a bit a convoluted maze with multiple paths leading to i915_vma_move_to_active(). Which sets the exclusive flag if EXEC_OBJECT_WRITE is set. This can either come as a buffer flag for softpin mode, or through the write_domain when using relocations. It follows this contract. - lima follows this contract, see lima_gem_submit() which sets the exclusive fence when the LIMA_SUBMIT_BO_WRITE flag is set for that bo - msm follows this contract, see msm_gpu_submit() which sets the exclusive flag when the MSM_SUBMIT_BO_WRITE is set for that buffer - panfrost follows this contract with the shotgun approach of just always setting the exclusive fence, see panfrost_attach_object_fences(). Benefits of a single engine I guess - v3d follows this contract with the same shotgun approach in v3d_attach_fences_and_unlock_reservation(), but it has at least an XXX comment that maybe this should be improved - v4c uses the same shotgun approach of always setting an exclusive fence, see vc4_update_bo_seqnos() - vgem also follows this contract, see vgem_fence_attach_ioctl() and the VGEM_FENCE_WRITE. This is used in some igts to validate prime sharing with i915.ko without the need of a 2nd gpu - vritio follows this contract again with the shotgun approach of always setting an exclusive fence, see virtio_gpu_array_add_fence() This covers the setting of the exclusive fences when writing. Synchronizing against the exclusive fence is a lot more tricky, and I only spot checked a few: - i915 does it, with the optional EXEC_OBJECT_ASYNC to skip all implicit dependencies (which is used by vulkan) - etnaviv does this. Implicit dependencies are collected in submit_fence_sync(), again with an opt-out flag ETNA_SUBMIT_NO_IMPLICIT. These are then picked up in etnaviv_sched_dependency which is the drm_sched_backend_ops->dependency callback. - v4c seems to not do much here, maybe gets away with it by not having a scheduler and only a single engine. Since all newer broadcom chips than the OG vc4 use v3d for rendering, which follows this contract, the impact of this issue is fairly small. - v3d does this using the drm_gem_fence_array_add_implicit() helper, which then it's drm_sched_backend_ops->dependency callback v3d_job_dependency() picks up. - panfrost is nice here and tracks the implicit fences in panfrost_job->implicit_fences, which again the drm_sched_backend_ops->dependency callback panfrost_job_dependency() picks up. It is mildly questionable though since it only picks up exclusive fences in panfrost_acquire_object_fences(), but not buggy in practice because it also always sets the exclusive fence. It should pick up both sets of fences, just in case there's ever going to be a 2nd gpu in a SoC with a mali gpu. Or maybe a mali SoC with a pcie port and a real gpu, which might actually happen eventually. A bug, but easy to fix. Should probably use the drm_gem_fence_array_add_implicit() helper. - lima is nice an easy, uses drm_gem_fence_array_add_implicit() and the same schema as v3d. - msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT, but because it doesn't use the drm/scheduler it handles fences from the wrong context with a synchronous dma_fence_wait. See submit_fence_sync() leading to msm_gem_sync_object(). Investing into a scheduler might be a good idea. - all the remaining drivers are ttm based, where I hope they do appropriately obey implicit fences already. I didn't do the full audit there because a) not follow the contract would confuse ttm quite well and b) reading non-standard scheduler and submit code which isn't based on drm/scheduler is a pain. Onwards to the display side. - Any driver using the drm_gem_plane_helper_prepare_fb() helper will correctly. Overwhelmingly most drivers get this right, except a few totally dont. I'll follow up with a patch to make this the default and avoid a bunch of bugs. - I didn't audit the ttm drivers, but given that dma_resv started there I hope they get this right. In conclusion this IS the contract, both as documented and overwhelmingly implemented, specically as implemented by all render drivers except amdgpu. Amdgpu tried to fix this already in commit 049aca4363d8af87cab8d53de5401602db3b9999 Author: Christian König <christian.koenig@amd.com> Date: Wed Sep 19 16:54:35 2018 +0200 drm/amdgpu: fix using shared fence for exported BOs v2 but this fix falls short on a number of areas: - It's racy, by the time the buffer is shared it might be too late. To make sure there's definitely never a problem we need to set the fences correctly for any buffer that's potentially exportable. - It's breaking uapi, dma-buf fds support poll() and differentitiate between, which was introduced in commit 9b495a5887994a6d74d5c261d012083a92b94738 Author: Maarten Lankhorst <maarten.lankhorst@canonical.com> Date: Tue Jul 1 12:57:43 2014 +0200 dma-buf: add poll support, v3 - Christian König wants to nack new uapi building further on this dma_resv contract because it breaks amdgpu, quoting "Yeah, and that is exactly the reason why I will NAK this uAPI change. "This doesn't works for amdgpu at all for the reasons outlined above." https://lore.kernel.org/dri-devel/f2eb6751-2f82-9b23-f57e-548de5b729de@gmail.com/ Rejecting new development because your own driver is broken and violates established cross driver contracts and uapi is really not how upstream works. Now this patch will have a severe performance impact on anything that runs on multiple engines. So we can't just merge it outright, but need a bit a plan: - amdgpu needs a proper uapi for handling implicit fencing. The funny thing is that to do it correctly, implicit fencing must be treated as a very strange IPC mechanism for transporting fences, where both setting the fence and dependency intercepts must be handled explicitly. Current best practices is a per-bo flag to indicate writes, and a per-bo flag to to skip implicit fencing in the CS ioctl as a new chunk. - Since amdgpu has been shipping with broken behaviour we need an opt-out flag from the butchered implicit fencing model to enable the proper explicit implicit fencing model. - for kernel memory fences due to bo moves at least the i915 idea is to use ttm_bo->moving. amdgpu probably needs the same. - since the current p2p dma-buf interface assumes the kernel memory fence is in the exclusive dma_resv fence slot we need to add a new fence slot for kernel fences, which must never be ignored. Since currently only amdgpu supports this there's no real problem here yet, until amdgpu gains a NO_IMPLICIT CS flag. - New userspace needs to ship in enough desktop distros so that users wont notice the perf impact. I think we can ignore LTS distros who upgrade their kernels but not their mesa3d snapshot. - Then when this is all in place we can merge this patch here. What is not a solution to this problem here is trying to make the dma_resv rules in the kernel more clever. The fundamental issue here is that the amdgpu CS uapi is the least expressive one across all drivers (only equalled by panfrost, which has an actual excuse) by not allowing any userspace control over how implicit sync is conducted. Until this is fixed it's completely pointless to make the kernel more clever to improve amdgpu, because all we're doing is papering over this uapi design issue. amdgpu needs to attain the status quo established by other drivers first, once that's achieved we can tackle the remaining issues in a consistent way across drivers. v2: Bas pointed me at AMDGPU_GEM_CREATE_EXPLICIT_SYNC, which I entirely missed. This is great because it means the amdgpu specific piece for proper implicit fence handling exists already, and that since a while. The only thing that's now missing is - fishing the implicit fences out of a shared object at the right time - setting the exclusive implicit fence slot at the right time. Jason has a patch series to fill that gap with a bunch of generic ioctl on the dma-buf fd: https://lore.kernel.org/dri-devel/20210520190007.534046-1-jason@jlekstrand.net/ v3: Since Christian has fixed amdgpu now in commit 8c505bdc9c8b955223b054e34a0be9c3d841cd20 (drm-misc/drm-misc-next) Author: Christian König <christian.koenig@amd.com> Date: Wed Jun 9 13:51:36 2021 +0200 drm/amdgpu: rework dma_resv handling v3 Use the audit covered in this commit message as the excuse to update the dma-buf docs around dma_buf.resv usage across drivers. Since dynamic importers have different rules also hammer these in again while we're at it. v4: - Add the missing "through the device" in the dynamic section that I overlooked. - Fix a kerneldoc markup mistake, the link didn't connect v5: - A few s/should/must/ to make clear what must be done (if the driver does implicit sync) and what's more a maybe (Daniel Stone) - drop all the example api discussion, that needs to be expanded, clarified and put into a new chapter in drm-uapi.rst (Daniel Stone) Cc: Daniel Stone <daniel@fooishbar.org> Acked-by: Daniel Stone <daniel@fooishbar.org> Reviewed-by: Dave Airlie <airlied@redhat.com> (v4) Reviewed-by: Christian König <christian.koenig@amd.com> (v3) Cc: mesa-dev@lists.freedesktop.org Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: Dave Airlie <airlied@gmail.com> Cc: Rob Clark <robdclark@chromium.org> Cc: Kristian H. Kristensen <hoegsberg@google.com> Cc: Michel Dänzer <michel@daenzer.net> Cc: Daniel Stone <daniels@collabora.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Deepak R Varma <mh12gx2825@gmail.com> Cc: Chen Li <chenli@uniontech.com> Cc: Kevin Wang <kevin1.wang@amd.com> Cc: Dennis Li <Dennis.Li@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210624125246.166721-1-daniel.vetter@ffwll.ch
2021-06-24	dma-buf: Switch to inline kerneldoc	Daniel Vetter	1	-26/+90
	Also review & update everything while we're at it. This is prep work to smash a ton of stuff into the kerneldoc for @resv. v2: Move the doc for sysfs_entry.attachment_uid to the right place too (Sam) Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: Christian König <christian.koenig@amd.com> Cc: Sam Ravnborg <sam@ravnborg.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Nirmoy Das <nirmoy.das@amd.com> Cc: Deepak R Varma <mh12gx2825@gmail.com> Cc: Chen Li <chenli@uniontech.com> Cc: Kevin Wang <kevin1.wang@amd.com> Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20210623161712.3370885-1-daniel.vetter@ffwll.ch
2021-06-24	Merge branch 'for-next/smccc' into for-next/core	Will Deacon	1	-2/+86
	Add support for versions v1.2 and 1.3 of the SMC calling convention. * for-next/smccc: arm64: smccc: Support SMCCC v1.3 SVE register saving hint arm64: smccc: Add support for SMCCCv1.2 extended input/output registers
2021-06-24	Merge branch 'for-next/perf' into for-next/core	Will Deacon	2	-33/+8
	PMU driver cleanups for managing IRQ affinity and exposing event attributes via sysfs. * for-next/perf: (36 commits) drivers/perf: fix the missed ida_simple_remove() in ddr_perf_probe() perf/arm-cmn: Fix invalid pointer when access dtc object sharing the same IRQ number arm64: perf: Simplify EVENT ATTR macro in perf_event.c drivers/perf: Simplify EVENT ATTR macro in fsl_imx8_ddr_perf.c drivers/perf: Simplify EVENT ATTR macro in xgene_pmu.c drivers/perf: Simplify EVENT ATTR macro in qcom_l3_pmu.c drivers/perf: Simplify EVENT ATTR macro in qcom_l2_pmu.c drivers/perf: Simplify EVENT ATTR macro in SMMU PMU driver perf: Add EVENT_ATTR_ID to simplify event attributes perf/smmuv3: Don't trample existing events with global filter perf/hisi: Constify static attribute_group structs perf: qcom: Remove redundant dev_err call in qcom_l3_cache_pmu_probe() drivers/perf: hisi: Fix data source control arm64: perf: Add more support on caps under sysfs perf: qcom_l2_pmu: move to use request_irq by IRQF_NO_AUTOEN flag arm_pmu: move to use request_irq by IRQF_NO_AUTOEN flag perf: arm_spe: use DEVICE_ATTR_RO macro perf: xgene_pmu: use DEVICE_ATTR_RO macro perf: qcom: use DEVICE_ATTR_RO macro perf: arm_pmu: use DEVICE_ATTR_RO macro ...