wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2023-06-23	PCI: endpoint: Pass EPF device ID to the probe function	Manivannan Sadhasivam	5	-9/+17
	Currently, the EPF probe function doesn't get the device ID argument needed to correctly identify the device table ID of the EPF device. When multiple entries are added to the "struct pci_epf_device_id" table, the probe function needs to identify the correct one. This is achieved by modifying the pci_epf_match_id() function to return the match ID pointer and passing it to the driver's probe function. pci_epf_device_match() function can return bool based on the return value of pci_epf_match_id(). Link: https://lore.kernel.org/r/20230602114756.36586-3-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kishon Vijay Abraham I <kishon@kernel.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
2023-06-23	PCI: endpoint: Add missing documentation about the MSI/MSI-X range	Manivannan Sadhasivam	1	-2/+2
	Both pci_epc_raise_irq() and pci_epc_map_msi_irq() APIs expect the MSI/MSI-X vectors to start from 1 but it is not documented. Add the range info to the kdoc of the APIs to make it clear. Link: https://lore.kernel.org/r/20230602114756.36586-2-manivannan.sadhasivam@linaro.org Fixes: 5e8cb4033807 ("PCI: endpoint: Add EP core layer to enable EP controller and EP functions") Fixes: 87d5972e476f ("PCI: endpoint: Add pci_epc_ops to map MSI IRQ") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
2023-06-23	PCI: endpoint: Improve pci_epf_type_add_cfs()	Damien Le Moal	1	-1/+1
	pci_epf_type_add_cfs() should not be called with an unbound EPF device, that is, an epf device with epf->driver not set. For such case, replace the NULL return in pci_epf_type_add_cfs() with a clear ERR_PTR(-ENODEV) pointer error return. Link: https://lore.kernel.org/r/20230515074348.595704-2-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivami <manivannan.sadhasivam@linaro.org>
2023-06-23	PCI: endpoint: functions/pci-epf-test: Fix dma_chan direction	Yoshihiro Shimoda	1	-1/+1
	In pci_epf_test_init_dma_chan() epf_test->dma_chan_rx is assigned from dma_request_channel() with DMA_DEV_TO_MEM as filter.dma_mask. However, in pci_epf_test_data_transfer() if the dir is DMA_DEV_TO_MEM, epf->dma_chan_rx should be used but instead we are using epf_test->dma_chan_tx. Fix it. Link: https://lore.kernel.org/r/20230412063447.2841177-1-yoshihiro.shimoda.uh@renesas.com Fixes: 8353813c88ef ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities") Tested-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Frank Li <Frank.Li@nxp.com>
2023-06-23	misc: pci_endpoint_test: Simplify pci_endpoint_test_msi_irq()	Damien Le Moal	1	-8/+4
	Simplify the code of pci_endpoint_test_msi_irq() by correctly using booleans: remove the msix comparison to false as that variable is already a boolean, and directly return the result of the comparison of the raised interrupt number. Link: https://lore.kernel.org/r/20230415023542.77601-18-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
2023-06-23	misc: pci_endpoint_test: Do not write status in IRQ handler	Damien Le Moal	1	-3/+0
	pci_endpoint_test_irqhandler() always rewrites the status register when an IRQ is raised, either as-is if STATUS_IRQ_RAISED is not set, or with STATUS_IRQ_RAISED cleared if that flag is set. The first case creates a race window with the endpoint side, meaning that the host side test driver may end up reading what it just wrote, thus losing the real status as set by the endpoint side before raising the next interrupt. This can prevent detecting that the STATUS_IRQ_RAISED flag was set by the endpoint. Remove this race window by not clearing the STATUS_IRQ_RAISED status flag and not rewriting that register for every IRQ received. Link: https://lore.kernel.org/r/20230415023542.77601-17-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
2023-06-23	misc: pci_endpoint_test: Re-init completion for every test	Damien Le Moal	1	-0/+4
	The irq_raised completion used to detect the end of a test case is initialized when the test device is probed, but never reinitialized again before a test case. As a result, the irq_raised completion synchronization is effective only for the first ioctl test case executed. Any subsequent call to wait_for_completion() by another ioctl() call will immediately return, potentially too early, leading to false positive failures. Fix this by reinitializing the irq_raised completion before starting a new ioctl() test command. Link: https://lore.kernel.org/r/20230415023542.77601-16-dlemoal@kernel.org Fixes: 2c156ac71c6b ("misc: Add host side PCI driver for PCI test function device") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Cc: stable@vger.kernel.org
2023-06-23	misc: pci_endpoint_test: Free IRQs before removing the device	Damien Le Moal	1	-3/+3
	In pci_endpoint_test_remove(), freeing the IRQs after removing the device creates a small race window for IRQs to be received with the test device memory already released, causing the IRQ handler to access invalid memory, resulting in an oops. Free the device IRQs before removing the device to avoid this issue. Link: https://lore.kernel.org/r/20230415023542.77601-15-dlemoal@kernel.org Fixes: e03327122e2c ("pci_endpoint_test: Add 2 ioctl commands") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Cc: stable@vger.kernel.org
2023-06-23	PCI: epf-test: Simplify transfers result print	Damien Le Moal	1	-25/+14
	In pci_epf_test_print_rate(), instead of open coding a reduction loop to allow for a division by a 32-bits ns value, simply use div64_u64() to calculate the transfer rate. To match the printed unit of KB/s, this calculation divides the rate by 1000 instead of 1024 (that would be KiB/s unit). Change the format of the results printed by pci_epf_test_print_rate() to be more compact without the double new line. Also use dev_info() instead of pr_info(). Link: https://lore.kernel.org/r/20230415023542.77601-14-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
2023-06-23	PCI: epf-test: Simplify DMA support checks	Damien Le Moal	1	-30/+15
	There is no need to have each read, write and copy test functions check for the FLAG_USE_DMA flag against the DMA support status indicated by epf_test->dma_supported. Move this test to the command handler function pci_epf_test_cmd_handler() to check once for all cases. Link: https://lore.kernel.org/r/20230415023542.77601-13-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
2023-06-23	PCI: epf-test: Cleanup request result handling	Damien Le Moal	1	-25/+21
	Each of the test functions pci_epf_test_write(), pci_epf_test_read() and pci_epf_test_copy() return an int result which is used by pci_epf_test_cmd_handler() to set a success or error bit in the request status. In the spirit of keeping the processing of each test case self-contained within its own test function, move the request status field update from pci_epf_test_cmd_handler() to each of these test functions and change these functions declaration to returning void. Link: https://lore.kernel.org/r/20230415023542.77601-12-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
2023-06-23	PCI: epf-test: Cleanup pci_epf_test_cmd_handler()	Damien Le Moal	1	-16/+14
	Command codes are never combined together as flags into a single value. Thus we can replace the series of "if" tests in pci_epf_test_cmd_handler() with a cleaner switch-case statement. This also allows checking that we got a valid command and print an error message if we did not. Link: https://lore.kernel.org/r/20230415023542.77601-11-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
2023-06-23	PCI: epf-test: Improve handling of command and status registers	Damien Le Moal	1	-4/+9
	The pci-epf-test driver uses the test register BAR memory directly to get and execute a test registers set by the RC side and defined using a struct pci_epf_test_reg. This direct use relies on using the register BAR address as a pointer to a struct pci_epf_test_reg to execute the test case and to send back the test result through the status field of struct pci_epf_test_reg. In practice, the status field is always updated before an interrupt is raised in pci_epf_test_raise_irq(), to ensure that the RC side sees the updated status when receiving an interrupt. However, such assignment direct access does not ensure that changes to the status register make it to memory, and so visible to the host, before an interrupt is raised, thus potentially resulting in the RC host not seeing the correct status result for a test. Avoid this potential problem by using READ_ONCE()/WRITE_ONCE() when accessing the command and status fields of a pci_epf_test_reg structure. This ensure that a test start (pci_epf_test_cmd_handler() function) and completion (with the function pci_epf_test_raise_irq()) achieve a correct synchronization with the MMIO register accesses on the RC host. Link: https://lore.kernel.org/r/20230415023542.77601-10-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
2023-06-23	PCI: epf-test: Simplify IRQ test commands execution	Damien Le Moal	1	-26/+17
	For the commands COMMAND_RAISE_LEGACY_IRQ, COMMAND_RAISE_MSI_IRQ and COMMAND_RAISE_MSIX_IRQ, the function pci_epf_test_cmd_handler() sets the STATUS_IRQ_RAISED status flag and calls the epc function pci_epc_raise_irq() directly. However, this is also exactly what the pci_epf_test_raise_irq() function does. Avoid duplicating these operations by directly using pci_epf_test_raise_irq() for the IRQ test commands. It is OK to do so as the host side endpoint test driver always set the correct IRQ type for the IRQ test commands. At the same time, move the IRQ number check done for the COMMAND_RAISE_MSI_IRQ and COMMAND_RAISE_MSIX_IRQ commands to pci_epf_test_raise_irq(), to also check the IRQ number requested by the host for other test commands. This significantly simplifies pci_epf_test_cmd_handler(). Link: https://lore.kernel.org/r/20230415023542.77601-9-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
2023-06-23	PCI: epf-test: Simplify pci_epf_test_raise_irq()	Damien Le Moal	1	-13/+8
	Change the interface of the function pci_epf_test_raise_irq() to directly pass a pointer to the struct pci_epf_test_reg defining the test being executed. This avoids the need for grabbing this pointer using the register BAR address and simplifies the call sites as the IRQ type and IRQ numbers do not have to be passed as arguments. Link: https://lore.kernel.org/r/20230415023542.77601-8-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
2023-06-23	PCI: epf-test: Simplify read/write/copy test functions	Damien Le Moal	1	-12/+9
	The function pci_epf_test_cmd_handler() uses the register BAR address as a pointer to a struct pci_epf_test_reg to determine the command sent by the host and to execute the test function accordingly. There is no need for doing this assignment again in each of the read, write and copy test functions. We can simply pass the reg pointer as an argument to the functions pci_epf_test_write(), pci_epf_test_read() and pci_epf_test_copy(). Link: https://lore.kernel.org/r/20230415023542.77601-7-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
2023-06-23	PCI: epf-test: Use dmaengine_submit() to initiate DMA transfer	Damien Le Moal	1	-1/+1
	Instead of an open coded call to the tx_submit() operation of struct dma_async_tx_descriptor, use the helper function dmaengine_submit(). No functional change is introduced with this. Link: https://lore.kernel.org/r/20230415023542.77601-6-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
2023-06-23	PCI: epf-test: Fix DMA transfer completion detection	Damien Le Moal	1	-11/+27
	pci_epf_test_data_transfer() and pci_epf_test_dma_callback() are not handling DMA transfer completion correctly, leading to completion notifications to the RC side that are too early. This problem can be detected when the RC side is running an IOMMU with messages such as: pci-endpoint-test 0000:0b:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x001c address=0xfff00000 flags=0x0000] When running the pcitest.sh tests: the address used for a previous test transfer generates the above error while the next test transfer is running. Fix this by testing the DMA transfer status in pci_epf_test_dma_callback() and notifying the completion only when the transfer status is DMA_COMPLETE or DMA_ERROR. Furthermore, in pci_epf_test_data_transfer(), be paranoid and check again the transfer status and always call dmaengine_terminate_sync() before returning. Link: https://lore.kernel.org/r/20230415023542.77601-5-dlemoal@kernel.org Fixes: 8353813c88ef ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Cc: stable@vger.kernel.org
2023-06-23	PCI: epf-test: Fix DMA transfer completion initialization	Damien Le Moal	1	-1/+1
	Reinitialize the transfer_complete DMA transfer completion before calling tx_submit(), to avoid seeing the DMA transfer complete before the completion is initialized, thus potentially losing the completion notification. Link: https://lore.kernel.org/r/20230415023542.77601-4-dlemoal@kernel.org Fixes: 8353813c88ef ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Cc: stable@vger.kernel.org
2023-06-23	PCI: endpoint: Move pci_epf_type_add_cfs() code	Damien Le Moal	3	-34/+36
	pci_epf_type_add_cfs() is called only from pci_ep_cfs_add_type_group() in drivers/pci/endpoint/pci-ep-cfs.c, so there is no need to export this function. Move its code from pci-epf-core.c to pci-ep-cfs.c as a static function. Link: https://lore.kernel.org/r/20230415023542.77601-3-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
2023-06-23	PCI: endpoint: Automatically create a function specific attributes group	Damien Le Moal	3	-37/+29
	A PCI endpoint function driver can define function specific attributes under its function configfs directory using the add_cfs() endpoint driver operation. This is done by tying up the mkdir operation for the function configfs directory to a call to the add_cfs() operation. However, there are no checks preventing the user from repeatedly creating function specific attribute directories with different names, resulting in the same endpoint specific attributes group being added multiple times, which also result in an invalid reference counting for the attribute groups. E.g., using the pci-epf-ntb function driver as an example, the user creates the function as follows: $ modprobe pci-epf-ntb $ cd /sys/kernel/config/pci_ep/functions/pci_epf_ntb $ mkdir func0 $ tree func0 func0/ \|-- baseclass_code \|-- cache_line_size \|-- ... `-- vendorid $ mkdir func0/attrs $ tree func0 func0/ \|-- attrs \| \|-- db_count \| \|-- mw1 \| \|-- mw2 \| \|-- mw3 \| \|-- mw4 \| \|-- num_mws \| `-- spad_count \|-- baseclass_code \|-- cache_line_size \|-- ... `-- vendorid At this point, the function can be started by linking the EP controller. However, if the user mistakenly creates again a directory: $ mkdir func0/attrs2 $ tree func0 func0/ \|-- attrs \| \|-- db_count \| \|-- mw1 \| \|-- mw2 \| \|-- mw3 \| \|-- mw4 \| \|-- num_mws \| `-- spad_count \|-- attrs2 \| \|-- db_count \| \|-- mw1 \| \|-- mw2 \| \|-- mw3 \| \|-- mw4 \| \|-- num_mws \| `-- spad_count \|-- baseclass_code \|-- cache_line_size \|-- ... `-- vendorid The endpoint function specific attributes are duplicated and cause a crash when the endpoint function device is torn down: refcount_t: addition on 0; use-after-free. WARNING: CPU: 2 PID: 834 at lib/refcount.c:25 refcount_warn_saturate+0xc8/0x144 CPU: 2 PID: 834 Comm: rmdir Not tainted 6.3.0-rc1 #1 Hardware name: Pine64 RockPro64 v2.1 (DT) pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) ... Call trace: refcount_warn_saturate+0xc8/0x144 config_item_get+0x7c/0x80 configfs_rmdir+0x17c/0x30c vfs_rmdir+0x8c/0x204 do_rmdir+0x158/0x184 __arm64_sys_unlinkat+0x64/0x80 invoke_syscall+0x48/0x114 ... Fix this by modifying pci_epf_cfs_work() to execute the new function pci_ep_cfs_add_type_group() which itself calls pci_epf_type_add_cfs() to obtain the function specific attribute group and the group name (directory name) from the endpoint function driver. If the function driver defines an attribute group, pci_ep_cfs_add_type_group() then proceeds to register this group using configfs_register_group(), thus automatically exposing the function type specific configfs attributes to the user. E.g.: $ modprobe pci-epf-ntb $ cd /sys/kernel/config/pci_ep/functions/pci_epf_ntb $ mkdir func0 $ tree func0 func0/ \|-- baseclass_code \|-- cache_line_size \|-- ... \|-- pci_epf_ntb.0 \| \|-- db_count \| \|-- mw1 \| \|-- mw2 \| \|-- mw3 \| \|-- mw4 \| \|-- num_mws \| `-- spad_count \|-- primary \|-- ... `-- vendorid With this change, there is no need for the user to create or delete directories in the endpoint function attributes directory. The pci_epf_type_group_ops group operations are thus removed. Also update the documentation for the pci-epf-ntb and pci-epf-vntb function drivers to reflect this change, removing the explanations showing the need to manually create the sub-directory for the function specific attributes. Link: https://lore.kernel.org/r/20230415023542.77601-2-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
2023-06-23	PCI: endpoint: Fix a Kconfig prompt of vNTB driver	Shunsuke Mie	1	-1/+1
	vNTB driver and NTB driver have same Kconfig prompt. Changed to make it distinguishable. Link: https://lore.kernel.org/r/20230202103832.2038286-1-mie@igel.co.jp Fixes: e35f56bb0330 ("PCI: endpoint: Support NTB transfer between RC and EP") Signed-off-by: Shunsuke Mie <mie@igel.co.jp> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
2023-05-07	Linux 6.4-rc1	Linus Torvalds	1	-2/+2

2023-05-06	Revert "perf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL"	Arnaldo Carvalho de Melo	6	-20/+14
	This reverts commit a980755beb5aca9002e1c95ba519b83a44242b5b. We need to better polish building with BPF skels, so revert back to making it an experimental feature that has to be explicitely enabled using BUILD_BPF_SKEL=1. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-06	Revert "perf build: Warn for BPF skeletons if endian mismatches"	Arnaldo Carvalho de Melo	1	-10/+7
	This reverts commit 51924ae69eea5bc90b5da525fbcf4bbd5f8551b3. We need to better polish building with BPF skels, so revert back to making it an experimental feature that has to be explicitely enabled using BUILD_BPF_SKEL=1. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-06	dmapool: link blocks across pages	Keith Busch	1	-127/+130
	The allocated dmapool pages are never freed for the lifetime of the pool. There is no need for the two level list+stack lookup for finding a free block since nothing is ever removed from the list. Just use a simple stack, reducing time complexity to constant. The implementation inserts the stack linking elements and the dma handle of the block within itself when freed. This means the smallest possible dmapool block is increased to at most 16 bytes to accommodate these fields, but there are no exisiting users requesting a dma pool smaller than that anyway. Removing the list has a significant change in performance. Using the kernel's micro-benchmarking self test: Before: # modprobe dmapool_test dmapool test: size:16 blocks:8192 time:57282 dmapool test: size:64 blocks:8192 time:172562 dmapool test: size:256 blocks:8192 time:789247 dmapool test: size:1024 blocks:2048 time:371823 dmapool test: size:4096 blocks:1024 time:362237 After: # modprobe dmapool_test dmapool test: size:16 blocks:8192 time:24997 dmapool test: size:64 blocks:8192 time:26584 dmapool test: size:256 blocks:8192 time:33542 dmapool test: size:1024 blocks:2048 time:9022 dmapool test: size:4096 blocks:1024 time:6045 The module test allocates quite a few blocks that may not accurately represent how these pools are used in real life. For a more marco level benchmark, running fio high-depth + high-batched on nvme, this patch shows submission and completion latency reduced by ~100usec each, 1% IOPs improvement, and perf record's time spent in dma_pool_alloc/free were reduced by half. [kbusch@kernel.org: push new blocks in ascending order] Link: https://lkml.kernel.org/r/20230221165400.1595247-1-kbusch@meta.com Link: https://lkml.kernel.org/r/20230126215125.4069751-12-kbusch@meta.com Fixes: 2d55c16c0c54 ("dmapool: create/destroy cleanup") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-06	dmapool: don't memset on free twice	Keith Busch	1	-2/+2
	If debug is enabled, dmapool will poison the range, so no need to clear it to 0 immediately before writing over it. Link: https://lkml.kernel.org/r/20230126215125.4069751-11-kbusch@meta.com Fixes: 2d55c16c0c54 ("dmapool: create/destroy cleanup") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-06	dmapool: simplify freeing	Keith Busch	1	-16/+6
	The actions for busy and not busy are mostly the same, so combine these and remove the unnecessary function. Also, the pool is about to be freed so there's no need to poison the page data since we only check for poison on alloc, which can't be done on a freed pool. Link: https://lkml.kernel.org/r/20230126215125.4069751-10-kbusch@meta.com Fixes: 2d55c16c0c54 ("dmapool: create/destroy cleanup") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-06	dmapool: consolidate page initialization	Keith Busch	1	-4/+3
	Various fields of the dma pool are set in different places. Move it all to one function. Link: https://lkml.kernel.org/r/20230126215125.4069751-9-kbusch@meta.com Fixes: 2d55c16c0c54 ("dmapool: create/destroy cleanup") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-06	dmapool: rearrange page alloc failure handling	Keith Busch	1	-7/+9
	Handle the error in a condition so the good path can be in the normal flow. Link: https://lkml.kernel.org/r/20230126215125.4069751-8-kbusch@meta.com Fixes: 2d55c16c0c54 ("dmapool: create/destroy cleanup") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-06	dmapool: move debug code to own functions	Keith Busch	1	-51/+77
	Clean up the normal path by moving the debug code outside it. Link: https://lkml.kernel.org/r/20230126215125.4069751-7-kbusch@meta.com Fixes: 2d55c16c0c54 ("dmapool: create/destroy cleanup") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-06	dmapool: speedup DMAPOOL_DEBUG with init_on_alloc	Tony Battersby	1	-1/+1
	Avoid double-memset of the same allocated memory in dma_pool_alloc() when both DMAPOOL_DEBUG is enabled and init_on_alloc=1. Link: https://lkml.kernel.org/r/20230126215125.4069751-6-kbusch@meta.com Fixes: 2d55c16c0c54 ("dmapool: create/destroy cleanup") Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-06	dmapool: cleanup integer types	Tony Battersby	1	-8/+11
	To represent the size of a single allocation, dmapool currently uses 'unsigned int' in some places and 'size_t' in other places. Standardize on 'unsigned int' to reduce overhead, but use 'size_t' when counting all the blocks in the entire pool. Link: https://lkml.kernel.org/r/20230126215125.4069751-5-kbusch@meta.com Fixes: 2d55c16c0c54 ("dmapool: create/destroy cleanup") Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-06	dmapool: use sysfs_emit() instead of scnprintf()	Tony Battersby	1	-16/+7
	Use sysfs_emit instead of scnprintf, snprintf or sprintf. Link: https://lkml.kernel.org/r/20230126215125.4069751-4-kbusch@meta.com Fixes: 2d55c16c0c54 ("dmapool: create/destroy cleanup") Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-06	dmapool: remove checks for dev == NULL	Tony Battersby	1	-31/+14
	dmapool originally tried to support pools without a device because dma_alloc_coherent() supports allocations without a device. But nobody ended up using dma pools without a device, and trying to do so will result in an oops. So remove the checks for pool->dev == NULL since they are unneeded bloat. [kbusch@kernel.org: add check for null dev on create] Link: https://lkml.kernel.org/r/20230126215125.4069751-3-kbusch@meta.com Fixes: 2d55c16c0c54 ("dmapool: create/destroy cleanup") Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-06	nfs: fix mis-merged __filemap_get_folio() error check	Linus Torvalds	1	-1/+1
	Fix another case of an incorrect check for the returned 'folio' value from __filemap_get_folio(). The failure case used to return NULL, but was changed by commit 66dabbb65d67 ("mm: return an ERR_PTR from __filemap_get_folio"). But in the meantime, commit ec108d3cc766 ("NFS: Convert readdir page array functions to use a folio") added a new user of that function. And my merge of the two did not fix this up correctly. The ext4 merge had the same issue, but that one had been caught in linux-next and got properly fixed while merging. Fixes: 0127f25b5dfc ("Merge tag 'nfs-for-6.4-1' of git://git.linux-nfs.org/projects/anna/linux-nfs") Cc: Anna Schumaker <anna@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-05-06	afs: fix the afs_dir_get_folio return value	Christoph Hellwig	1	-3/+4
	Keep returning NULL on failure instead of letting an ERR_PTR escape to callers that don't expect it. Link: https://lkml.kernel.org/r/20230503154526.1223095-2-hch@lst.de Fixes: 66dabbb65d67 ("mm: return an ERR_PTR from __filemap_get_folio") Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: David Howells <dhowells@redhat.com> Tested-by: David Howells <dhowells@redhat.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-06	nilfs2: do not write dirty data after degenerating to read-only	Ryusuke Konishi	1	-1/+4
	According to syzbot's report, mark_buffer_dirty() called from nilfs_segctor_do_construct() outputs a warning with some patterns after nilfs2 detects metadata corruption and degrades to read-only mode. After such read-only degeneration, page cache data may be cleared through nilfs_clear_dirty_page() which may also clear the uptodate flag for their buffer heads. However, even after the degeneration, log writes are still performed by unmount processing etc., which causes mark_buffer_dirty() to be called for buffer heads without the "uptodate" flag and causes the warning. Since any writes should not be done to a read-only file system in the first place, this fixes the warning in mark_buffer_dirty() by letting nilfs_segctor_do_construct() abort early if in read-only mode. This also changes the retry check of nilfs_segctor_write_out() to avoid unnecessary log write retries if it detects -EROFS that nilfs_segctor_do_construct() returned. Link: https://lkml.kernel.org/r/20230427011526.13457-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+2af3bc9585be7f23f290@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=2af3bc9585be7f23f290 Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-06	mm: do not reclaim private data from pinned page	Jan Kara	1	-0/+10
	If the page is pinned, there's no point in trying to reclaim it. Furthermore if the page is from the page cache we don't want to reclaim fs-private data from the page because the pinning process may be writing to the page at any time and reclaiming fs private info on a dirty page can upset the filesystem (see link below). Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz Link: https://lkml.kernel.org/r/20230428124140.30166-1-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-06	nilfs2: fix infinite loop in nilfs_mdt_get_block()	Ryusuke Konishi	1	-4/+12
	If the disk image that nilfs2 mounts is corrupted and a virtual block address obtained by block lookup for a metadata file is invalid, nilfs_bmap_lookup_at_level() may return the same internal return code as -ENOENT, meaning the block does not exist in the metadata file. This duplication of return codes confuses nilfs_mdt_get_block(), causing it to read and create a metadata block indefinitely. In particular, if this happens to the inode metadata file, ifile, semaphore i_rwsem can be left held, causing task hangs in lock_mount. Fix this issue by making nilfs_bmap_lookup_at_level() treat virtual block address translation failures with -ENOENT as metadata corruption instead of returning the error code. Link: https://lkml.kernel.org/r/20230430193046.6769-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+221d75710bde87fa0e97@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=221d75710bde87fa0e97 Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-06	mm/mmap/vma_merge: always check invariants	Lorenzo Stoakes	1	-5/+5
	We may still have inconsistent input parameters even if we choose not to merge and the vma_merge() invariant checks are useful for checking this with no production runtime cost (these are only relevant when CONFIG_DEBUG_VM is specified). Therefore, perform these checks regardless of whether we merge. This is relevant, as a recent issue (addressed in commit "mm/mempolicy: Correctly update prev when policy is equal on mbind") in the mbind logic was only picked up in the 6.2.y stable branch where these assertions are performed prior to determining mergeability. Had this remained the same in mainline this issue may have been picked up faster, so moving forward let's always check them. Link: https://lkml.kernel.org/r/df548a6ae3fa135eec3b446eb3dae8eb4227da97.1682885809.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-06	filemap: Handle error return from __filemap_get_folio()	Matthew Wilcox	1	-1/+1
	Smatch reports that filemap_fault() was missed in the conversion of __filemap_get_folio() error returns from NULL to ERR_PTR. Fixes: 66dabbb65d67 ("mm: return an ERR_PTR from __filemap_get_folio") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Reported-by: syzbot+48011b86c8ea329af1b9@syzkaller.appspotmail.com Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-05-05	s390: remove the unneeded select GCC12_NO_ARRAY_BOUNDS	Lukas Bulwahn	1	-1/+0
	Commit 0da6e5fd6c37 ("gcc: disable '-Warray-bounds' for gcc-13 too") makes config GCC11_NO_ARRAY_BOUNDS to be for disabling -Warray-bounds in any gcc version 11 and upwards, and with that, removes the GCC12_NO_ARRAY_BOUNDS config as it is now covered by the semantics of GCC11_NO_ARRAY_BOUNDS. As GCC11_NO_ARRAY_BOUNDS is yes by default, there is no need for the s390 architecture to explicitly select GCC11_NO_ARRAY_BOUNDS. Hence, the select GCC12_NO_ARRAY_BOUNDS in arch/s390/Kconfig can simply be dropped. Remove the unneeded "select GCC12_NO_ARRAY_BOUNDS". Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-05-05	perf metrics: Fix SEGV with --for-each-cgroup	Ian Rogers	1	-0/+1
	Ensure the metric threshold is copied correctly or else a use of uninitialized memory happens. Fixes: d0a3052f6faefffc ("perf metric: Compute and print threshold values") Reported-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230505204119.3443491-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-05	perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE	Arnaldo Carvalho de Melo	3	-20/+174
	Linus reported a build break due to using a vmlinux without a BTF elf section to generate the vmlinux.h header with bpftool for use in the BPF tools in tools/perf/util/bpf_skel/*.bpf.c. Instead add a vmlinux.h file with the structs needed with the fields the tools need, marking the structs with __attribute__((preserve_access_index)), so that libbpf's CO-RE code can fixup the struct field offsets. In some cases the vmlinux.h file that was being generated by bpftool from the kernel BTF information was not needed at all, just including linux/bpf.h, sometimes linux/perf_event.h was enough as non-UAPI types were not being used. To keep te patch small, include those UAPI headers from the trimmed down vmlinux.h file, that then provides the tools with just the structs and the subset of its fields needed for them. Testing it: # perf lock contention -b find / > /dev/null ^C contended total wait max wait avg wait type caller 7 53.59 us 10.86 us 7.66 us rwlock:R start_this_handle+0xa0 2 30.35 us 21.99 us 15.17 us rwsem:R iterate_dir+0x52 1 9.04 us 9.04 us 9.04 us rwlock:W start_this_handle+0x291 1 8.73 us 8.73 us 8.73 us spinlock raw_spin_rq_lock_nested+0x1e # # perf lock contention -abl find / > /dev/null ^C contended total wait max wait avg wait address symbol 1 262.96 ms 262.96 ms 262.96 ms ffff8e67502d0170 (mutex) 12 244.24 us 39.91 us 20.35 us ffff8e6af56f8070 mmap_lock (rwsem) 7 30.28 us 6.85 us 4.33 us ffff8e6c865f1d40 rq_lock (spinlock) 3 7.42 us 4.03 us 2.47 us ffff8e6c864b1d40 rq_lock (spinlock) 2 3.72 us 2.19 us 1.86 us ffff8e6c86571d40 rq_lock (spinlock) 1 2.42 us 2.42 us 2.42 us ffff8e6c86471d40 rq_lock (spinlock) 4 2.11 us 559 ns 527 ns ffffffff9a146c80 rcu_state (spinlock) 3 1.45 us 818 ns 482 ns ffff8e674ae8384c (rwlock) 1 870 ns 870 ns 870 ns ffff8e68456ee060 (rwlock) 1 663 ns 663 ns 663 ns ffff8e6c864f1d40 rq_lock (spinlock) 1 573 ns 573 ns 573 ns ffff8e6c86531d40 rq_lock (spinlock) 1 472 ns 472 ns 472 ns ffff8e6c86431740 (spinlock) 1 397 ns 397 ns 397 ns ffff8e67413a4f04 (spinlock) # # perf test offcpu 95: perf record offcpu profiling tests : Ok # # perf kwork latency --use-bpf Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name \| Cpu \| Avg delay \| Count \| Max delay \| Max delay start \| Max delay end \| -------------------------------------------------------------------------------------------------------------------------------- (w)flush_memcg_stats_dwork \| 0000 \| 1056.212 ms \| 2 \| 2112.345 ms \| 550113.229573 s \| 550115.341919 s \| (w)toggle_allocation_gate \| 0000 \| 10.144 ms \| 62 \| 416.389 ms \| 550113.453518 s \| 550113.869907 s \| (w)0xffff8e6748e28080 \| 0002 \| 0.623 ms \| 1 \| 0.623 ms \| 550110.989841 s \| 550110.990464 s \| (w)vmstat_shepherd \| 0000 \| 0.586 ms \| 10 \| 2.828 ms \| 550111.971536 s \| 550111.974364 s \| (w)vmstat_update \| 0007 \| 0.363 ms \| 5 \| 1.634 ms \| 550113.222520 s \| 550113.224154 s \| (w)vmstat_update \| 0000 \| 0.324 ms \| 10 \| 2.827 ms \| 550111.971526 s \| 550111.974354 s \| (w)0xffff8e674c5f4a58 \| 0002 \| 0.102 ms \| 5 \| 0.134 ms \| 550110.989839 s \| 550110.989972 s \| (w)psi_avgs_work \| 0001 \| 0.086 ms \| 3 \| 0.107 ms \| 550114.957852 s \| 550114.957959 s \| (w)psi_avgs_work \| 0000 \| 0.079 ms \| 5 \| 0.100 ms \| 550118.605668 s \| 550118.605768 s \| (w)kfree_rcu_monitor \| 0006 \| 0.079 ms \| 1 \| 0.079 ms \| 550110.925821 s \| 550110.925900 s \| (w)psi_avgs_work \| 0004 \| 0.079 ms \| 1 \| 0.079 ms \| 550109.581835 s \| 550109.581914 s \| (w)psi_avgs_work \| 0001 \| 0.078 ms \| 1 \| 0.078 ms \| 550109.197809 s \| 550109.197887 s \| (w)psi_avgs_work \| 0002 \| 0.077 ms \| 5 \| 0.086 ms \| 550110.669819 s \| 550110.669905 s \| <SNIP> # strace -e bpf -o perf-stat-bpf-counters.output perf stat -e cycles --bpf-counters sleep 1 Performance counter stats for 'sleep 1': 6,197,983 cycles 1.003922848 seconds time elapsed 0.000000000 seconds user 0.002032000 seconds sys # head -7 perf-stat-bpf-counters.output bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/perf_attr_map", bpf_fd=0, file_flags=0}, 16) = 3 bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=3, info_len=88, info=0x7ffcead64990}}, 16) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x24129e0, value=0x7ffcead65a48, flags=BPF_ANY}, 32) = 0 bpf(BPF_LINK_GET_FD_BY_ID, {link_id=1252}, 12) = -1 ENOENT (No such file or directory) bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffcead65780, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, +func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 116) = 4 bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffcead65920, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, +func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0, fd_array=NULL}, 128) = 4 bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\20\0\0\0\20\0\0\0\5\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=45, btf_log_size=0, btf_log_level=0}, 28) = 4 # Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Suggested-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Song Liu <song@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Co-developed-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/lkml/ZFU1PJrn8YtHIqno@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-05	perf stat: Separate bperf from bpf_profiler	Dmitrii Dolgov	2	-2/+7
	It seems that perf stat -b <prog id> doesn't produce any results: $ perf stat -e cycles -b 4 -I 10000 -vvv Control descriptor is not initialized cycles: 0 0 0 time counts unit events 10.007641640 <not supported> cycles Looks like this happens because fentry/fexit progs are getting loaded, but the corresponding perf event is not enabled and not added into the events bpf map. I think there is some mixing up between two type of bpf support, one for bperf and one for bpf_profiler. Both are identified via evsel__is_bpf, based on which perf events are enabled, but for the latter (bpf_profiler) a perf event is required. Using evsel__is_bperf to check only bperf produces expected results: $ perf stat -e cycles -b 4 -I 10000 -vvv Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: size 136 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING disabled 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ [...perf_event_attr for other CPUs...] ------------------------------------------------------------ cycles: 309426 169009 169009 time counts unit events 10.010091271 309426 cycles The final numbers correspond (at least in the level of magnitude) to the same metric obtained via bpftool. Fixes: 112cb56164bc2108 ("perf stat: Introduce config stat.bpf-counter-events") Reviewed-by: Song Liu <song@kernel.org> Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Tested-by: Song Liu <song@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230412182316.11628-1-9erthalion6@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-05	ALSA: pcm: use exit controlled loop in snd_pcm_playback_silence()	Oswald Buddenhagen	1	-2/+2
	We already know that `frames` is greater than zero, because we just checked it. So we don't need to check the loop condition on the first iteration. Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Link: https://lore.kernel.org/r/20230505155244.2312199-7-oswald.buddenhagen@gmx.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-05-05	ALSA: pcm: simplify top-up mode init in snd_pcm_playback_silence()	Oswald Buddenhagen	1	-7/+24
	Inline the remaining call of snd_pcm_playback_hw_avail(). This makes the top-up branch more congruent with the thresholded one, and allows simplifying the handling of the corner cases. Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Link: https://lore.kernel.org/r/20230505155244.2312199-6-oswald.buddenhagen@gmx.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-05-05	ALSA: pcm: playback silence - move silence variable updates to separate function	Jaroslav Kysela	1	-20/+22
	The code tracking the added samples in thresholded mode and the code tracking the just played samples in top-up mode are semantically identical, so factor it out to a common function to enhance readability. Co-developed-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Signed-off-by: Jaroslav Kysela <perex@perex.cz> Link: https://lore.kernel.org/r/20230505155244.2312199-5-oswald.buddenhagen@gmx.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-05-05	ALSA: pcm: playback silence - remove extra code	Jaroslav Kysela	1	-2/+0
	The removed condition handles de facto only one situation where runtime->silence_filled variable is equal to runtime->buffer_size, because this variable cannot go over the buffer size. This case is implicitly caught by the required comparison of the noise distance with the threshold. Suggested-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Signed-off-by: Jaroslav Kysela <perex@perex.cz> Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Link: https://lore.kernel.org/r/20230505155244.2312199-4-oswald.buddenhagen@gmx.de Signed-off-by: Takashi Iwai <tiwai@suse.de>