linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2021-08-20	net: bridge: vlan: convert mcast router global option to per-vlan entry	Nikolay Aleksandrov	1	-1/+1
	The per-vlan router option controls the port/vlan and host vlan entries' mcast router config. The global option controlled only the host vlan config, but that is unnecessary and incosistent as it's not really a global vlan option, but rather bridge option to control host router config, so convert BRIDGE_VLANDB_GOPTS_MCAST_ROUTER to BRIDGE_VLANDB_ENTRY_MCAST_ROUTER which can be used to control both host vlan and port vlan mcast router config. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19	i2c: virtio: add a virtio i2c frontend driver	Jie Deng	2	-0/+42
	Add an I2C bus driver for virtio para-virtualization. The controller can be emulated by the backend driver in any device model software by following the virtio protocol. The device specification can be found on https://lists.oasis-open.org/archives/virtio-comment/202101/msg00008.html. By following the specification, people may implement different backend drivers to emulate different controllers according to their needs. Co-developed-by: Conghui Chen <conghui.chen@intel.com> Signed-off-by: Conghui Chen <conghui.chen@intel.com> Signed-off-by: Jie Deng <jie.deng@intel.com> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-08-18	block: fix default IO priority handling	Damien Le Moal	1	-2/+3
	The default IO priority is the best effort (BE) class with the normal priority level IOPRIO_NORM (4). However, get_task_ioprio() returns IOPRIO_CLASS_NONE/IOPRIO_NORM as the default priority and get_current_ioprio() returns IOPRIO_CLASS_NONE/0. Let's be consistent with the defined default and have both of these functions return the default priority IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_NORM) when the user did not define another default IO priority for the task. In include/uapi/linux/ioprio.h, introduce the IOPRIO_BE_NORM macro as an alias to IOPRIO_NORM to clarify that this default level applies to the BE priotity class. In include/linux/ioprio.h, define the macro IOPRIO_DEFAULT as IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_BE_NORM) and use this new macro when setting a priority to the default. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Link: https://lore.kernel.org/r/20210811033702.368488-7-damien.lemoal@wdc.com [axboe: drop unnecessary lightnvm change] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-18	block: Introduce IOPRIO_NR_LEVELS	Damien Le Moal	1	-2/+3
	The BFQ scheduler and ioprio_check_cap() both assume that the RT priority class (IOPRIO_CLASS_RT) can have up to 8 different priority levels, similarly to the BE class (IOPRIO_CLASS_iBE). This is controlled using the IOPRIO_BE_NR macro , which is badly named as the number of levels also applies to the RT class. Introduce the class independent IOPRIO_NR_LEVELS macro, defined to 8, to make things clear. Keep the old IOPRIO_BE_NR macro definition as an alias for IOPRIO_NR_LEVELS. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Link: https://lore.kernel.org/r/20210811033702.368488-6-damien.lemoal@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-18	block: fix IOPRIO_PRIO_CLASS() and IOPRIO_PRIO_VALUE() macros	Damien Le Moal	1	-4/+8
	The ki_ioprio field of struct kiocb is 16-bits (u16) but often handled as an int in the block layer. E.g. ioprio_check_cap() takes an int as argument. With such implicit int casting function calls, the upper 16-bits of the int argument may be left uninitialized by the compiler, resulting in invalid values for the IOPRIO_PRIO_CLASS() macro (garbage upper bits) and in an error return for functions such as ioprio_check_cap(). Fix this by masking the result of the shift by IOPRIO_CLASS_SHIFT bits in the IOPRIO_PRIO_CLASS() macro. The new macro IOPRIO_CLASS_MASK defines the 3-bits mask for the priority class. Similarly, apply the IOPRIO_PRIO_MASK mask to the data argument of the IOPRIO_PRIO_VALUE() macro to ignore the upper bits of the data value. The IOPRIO_CLASS_MASK mask is also applied to the class argument of this macro before shifting the result by IOPRIO_CLASS_SHIFT bits. While at it, also change the argument name of the IOPRIO_PRIO_CLASS() and IOPRIO_PRIO_DATA() macros from "mask" to "ioprio" to reflect the fact that a priority value should be passed rather than a mask. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Link: https://lore.kernel.org/r/20210811033702.368488-5-damien.lemoal@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-18	block: change ioprio_valid() to an inline function	Damien Le Moal	1	-2/+0
	Change the ioprio_valid() macro in include/usapi/linux/ioprio.h to an inline function declared on the kernel side in include/linux/ioprio.h. Also improve checks on the class value by checking the upper bound value. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Link: https://lore.kernel.org/r/20210811033702.368488-4-damien.lemoal@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-18	block: improve ioprio class description comment	Damien Le Moal	1	-4/+6
	In include/usapi/linux/ioprio.h, change the ioprio class enum comment to remove the outdated reference to CFQ and mention BFQ and mq-deadline instead. Also document the high priority NCQ command use for RT class IOs directed at ATA drives that support NCQ priority. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Link: https://lore.kernel.org/r/20210811033702.368488-3-damien.lemoal@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-18	mptcp: remote addresses fullmesh	Geliang Tang	1	-0/+1
	This patch added and managed a new per endpoint flag, named MPTCP_PM_ADDR_FLAG_FULLMESH. In mptcp_pm_create_subflow_or_signal_addr(), if such flag is set, instead of: remote_address((struct sock_common *)sk, &remote); fill a temporary allocated array of all known remote address. After releaseing the pm lock loop on such array and create a subflow for each remote address from the given local. Note that the we could still use an array even for non 'fullmesh' endpoint: with a single entry corresponding to the primary MPC subflow remote address. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@xiaomi.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17	NFSD: remove vanity comments	NeilBrown	1	-1/+0
	Including one's name in copyright claims is appropriate. Including it in random comments is just vanity. After 2 decades, it is time for these to be gone. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17	nl80211: add support for BSS coloring	John Crispin	1	-0/+43
	This patch adds support for BSS color collisions to the wireless subsystem. Add the required functionality to nl80211 that will notify about color collisions, triggering the color change and notifying when it is completed. Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/500b3582aec8fe2c42ef46f3117b148cb7cbceb5.1625247619.git.lorenzo@kernel.org [remove unnecessary NULL initialisation] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-08-17	bpf: Add bpf_get_attach_cookie() BPF helper to access bpf_cookie value	Andrii Nakryiko	1	-0/+16
	Add new BPF helper, bpf_get_attach_cookie(), which can be used by BPF programs to get access to a user-provided bpf_cookie value, specified during BPF program attachment (BPF link creation) time. Naming is hard, though. With the concept being named "BPF cookie", I've considered calling the helper: - bpf_get_cookie() -- seems too unspecific and easily mistaken with socket cookie; - bpf_get_bpf_cookie() -- too much tautology; - bpf_get_link_cookie() -- would be ok, but while we create a BPF link to attach BPF program to BPF hook, it's still an "attachment" and the bpf_cookie is associated with BPF program attachment to a hook, not a BPF link itself. Technically, we could support bpf_cookie with old-style cgroup programs.So I ultimately rejected it in favor of bpf_get_attach_cookie(). Currently all perf_event-backed BPF program types support bpf_get_attach_cookie() helper. Follow-up patches will add support for fentry/fexit programs as well. While at it, mark bpf_tracing_func_proto() as static to make it obvious that it's only used from within the kernel/trace/bpf_trace.c. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210815070609.987780-7-andrii@kernel.org
2021-08-17	bpf: Allow to specify user-provided bpf_cookie for BPF perf links	Andrii Nakryiko	1	-0/+7
	Add ability for users to specify custom u64 value (bpf_cookie) when creating BPF link for perf_event-backed BPF programs (kprobe/uprobe, perf_event, tracepoints). This is useful for cases when the same BPF program is used for attaching and processing invocation of different tracepoints/kprobes/uprobes in a generic fashion, but such that each invocation is distinguished from each other (e.g., BPF program can look up additional information associated with a specific kernel function without having to rely on function IP lookups). This enables new use cases to be implemented simply and efficiently that previously were possible only through code generation (and thus multiple instances of almost identical BPF program) or compilation at runtime (BCC-style) on target hosts (even more expensive resource-wise). For uprobes it is not even possible in some cases to know function IP before hand (e.g., when attaching to shared library without PID filtering, in which case base load address is not known for a library). This is done by storing u64 bpf_cookie in struct bpf_prog_array_item, corresponding to each attached and run BPF program. Given cgroup BPF programs already use two 8-byte pointers for their needs and cgroup BPF programs don't have (yet?) support for bpf_cookie, reuse that space through union of cgroup_storage and new bpf_cookie field. Make it available to kprobe/tracepoint BPF programs through bpf_trace_run_ctx. This is set by BPF_PROG_RUN_ARRAY, used by kprobe/uprobe/tracepoint BPF program execution code, which luckily is now also split from BPF_PROG_RUN_ARRAY_CG. This run context will be utilized by a new BPF helper giving access to this user-provided cookie value from inside a BPF program. Generic perf_event BPF programs will access this value from perf_event itself through passed in BPF program context. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/bpf/20210815070609.987780-6-andrii@kernel.org
2021-08-17	bpf: Implement minimal BPF perf link	Andrii Nakryiko	1	-0/+2
	Introduce a new type of BPF link - BPF perf link. This brings perf_event-based BPF program attachments (perf_event, tracepoints, kprobes, and uprobes) into the common BPF link infrastructure, allowing to list all active perf_event based attachments, auto-detaching BPF program from perf_event when link's FD is closed, get generic BPF link fdinfo/get_info functionality. BPF_LINK_CREATE command expects perf_event's FD as target_fd. No extra flags are currently supported. Force-detaching and atomic BPF program updates are not yet implemented, but with perf_event-based BPF links we now have common framework for this without the need to extend ioctl()-based perf_event interface. One interesting consideration is a new value for bpf_attach_type, which BPF_LINK_CREATE command expects. Generally, it's either 1-to-1 mapping from bpf_attach_type to bpf_prog_type, or many-to-1 mapping from a subset of bpf_attach_types to one bpf_prog_type (e.g., see BPF_PROG_TYPE_SK_SKB or BPF_PROG_TYPE_CGROUP_SOCK). In this case, though, we have three different program types (KPROBE, TRACEPOINT, PERF_EVENT) using the same perf_event-based mechanism, so it's many bpf_prog_types to one bpf_attach_type. I chose to define a single BPF_PERF_EVENT attach type for all of them and adjust link_create()'s logic for checking correspondence between attach type and program type. The alternative would be to define three new attach types (e.g., BPF_KPROBE, BPF_TRACEPOINT, and BPF_PERF_EVENT), but that seemed like unnecessary overkill and BPF_KPROBE will cause naming conflicts with BPF_KPROBE() macro, defined by libbpf. I chose to not do this to avoid unnecessary proliferation of bpf_attach_type enum values and not have to deal with naming conflicts. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/bpf/20210815070609.987780-5-andrii@kernel.org
2021-08-16	ethtool: add two link extended substates of bad signal integrity	Guangbin Huang	1	-0/+2
	ETHTOOL_LINK_EXT_SUBSTATE_BSI_SERDES_REFERENCE_CLOCK_LOST means the input external clock signal for SerDes is too weak or lost. ETHTOOL_LINK_EXT_SUBSTATE_BSI_SERDES_ALOS means the received signal for SerDes is too weak because analog loss of signal. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-14	remove the lightnvm subsystem	Christoph Hellwig	1	-224/+0
	Lightnvm supports the OCSSD 1.x and 2.0 specs which were early attempts to produce Open Channel SSDs and never made it into the NVMe spec proper. They have since been superceeded by NVMe enhancements such as ZNS support. Remove the support per the deprecation schedule. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210812132308.38486-1-hch@lst.de Reviewed-by: Matias Bjørling <mb@lightnvm.io> Reviewed-by: Javier González <javier@javigon.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-14	net: Remove net/ipx.h and uapi/linux/ipx.h header files	Cai Huoqing	1	-87/+0
	commit <47595e32869f> ("<MAINTAINERS: Mark some staging directories>") indicated the ipx network layer as obsolete in Jan 2018, updated in the MAINTAINERS file now, after being exposed for 3 years to refactoring, so to delete uapi/linux/ipx.h and net/ipx.h header files for good. additionally, there is no module that depends on ipx.h except a broken staging driver(r8188eu) Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-14	net: bridge: vlan: dump mcast ctx querier state	Nikolay Aleksandrov	1	-0/+1
	Use the new mcast querier state dump infrastructure and export vlans' mcast context querier state embedded in attribute BRIDGE_VLANDB_GOPTS_MCAST_QUERIER_STATE. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-14	net: bridge: mcast: dump ipv6 querier state	Nikolay Aleksandrov	1	-0/+3
	Add support for dumping global IPv6 querier state, we dump the state only if our own querier is enabled or there has been another external querier which has won the election. For the bridge global state we use a new attribute IFLA_BR_MCAST_QUERIER_STATE and embed the state inside. The structure is: [IFLA_BR_MCAST_QUERIER_STATE] `[BRIDGE_QUERIER_IPV6_ADDRESS] - ip address of the querier `[BRIDGE_QUERIER_IPV6_PORT] - bridge port ifindex where the querier was seen (set only if external querier) `[BRIDGE_QUERIER_IPV6_OTHER_TIMER] - other querier timeout IPv4 and IPv6 attributes are embedded at the same level of IFLA_BR_MCAST_QUERIER_STATE. If we didn't dump anything we cancel the nest and return. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-14	net: bridge: mcast: dump ipv4 querier state	Nikolay Aleksandrov	2	-0/+11
	Add support for dumping global IPv4 querier state, we dump the state only if our own querier is enabled or there has been another external querier which has won the election. For the bridge global state we use a new attribute IFLA_BR_MCAST_QUERIER_STATE and embed the state inside. The structure is: [IFLA_BR_MCAST_QUERIER_STATE] `[BRIDGE_QUERIER_IP_ADDRESS] - ip address of the querier `[BRIDGE_QUERIER_IP_PORT] - bridge port ifindex where the querier was seen (set only if external querier) `[BRIDGE_QUERIER_IP_OTHER_TIMER] - other querier timeout Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-13	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	2	-2/+14
	Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h 9e26680733d5 ("bnxt_en: Update firmware call to retrieve TX PTP timestamp") 9e518f25802c ("bnxt_en: 1PPS functions to configure TSIO pins") 099fdeda659d ("bnxt_en: Event handler for PPS events") kernel/bpf/helpers.c include/linux/bpf-cgroup.h a2baf4e8bb0f ("bpf: Fix potentially incorrect results with bpf_get_local_storage()") c7603cfa04e7 ("bpf: Add ambient BPF runtime context stored in current") drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c 5957cc557dc5 ("net/mlx5: Set all field of mlx5_irq before inserting it to the xarray") 2d0b41a37679 ("net/mlx5: Refcount mlx5_irq with integer") MAINTAINERS 7b637cd52f02 ("MAINTAINERS: fix Microchip CAN BUS Analyzer Tool entry typo") 7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-13	nl80211: vendor-cmd: add Intel vendor commands for iwlmei usage	Emmanuel Grumbach	1	-0/+77
	iwlmei allows to integrate with the CSME firmware. There are flows that are prioprietary for this purpose: * Get the information of the AP the CSME firmware is connected to. This is useful when we need to speed up the connection process in case the CSME firmware has a TCP connection that must be kept alive across the ownership transition. * Forbid roaming, which will happen when the CSME firmware wants to tell the user space not disrupt the connection. * Request ownership, upon driver boot when the CSME firmware owns the device. This is a notification sent by the kernel. All those commands are expected to be used by any software managing the connection (mainly NetworkManager). Those commands are expected to be used only in case the CSME firmware owns the device and doesn't want to release the device unless the host made sure that it can keep the connectivity. Here are the steps of the expected flow: 1) The machine boots while AMT has an active TCP connection 2) iwlwifi starts and tries to access the device 3) The device is not available because of the active TCP connection. (If there are no active connections, the CSME firmware would have allowed iwlwifi to use the device) Note that all the steps up to here don't involve iwlmei. All this happens in iwlwifi (in iwl_pcie_prepare_card_hw). 4) iwlmei establishes a connection to the CSME firmware (through SAP) Here iwlwifi uses iwlmei to access the device's capabilities (since it can't touch the device), but this is not relevant for the vendor commands. 5) The CSME firmware tells iwlmei that it uses the NIC and that there is an acitve TCP connection, and hence, the host needs to think twice before asking the CSME firmware to release the device 6) iwlmei tells iwlwifi to report HW RFKILL with a special reason Up to here, there was no user space involved. 7) The user space (NetworkManager) boots and sees that the device is in RFKILL because the host doesn't own the device 8) The user space asks the kernel what AP the CSME firmware is connected to (with the first vendor command mentionned above) 9) The user space checks if it has a profile that matches the reply from the CSME firmware 10) The user space installs a network to the wpa_supplicant with a specific BSSID and a specific frequency 11) The user space prevents any type of full scan 12) The user space asks iwlmei to request ownership on the device (with the third vendor command) 13) iwlmei request ownership from the CSME firmware 14) The CSME firmware grants ownership 15) iwlmei tells iwlwifi to lift the RFKILL 16) RFKILL OFF is reported to userspace 17) The host boots the device, loads the firwmare, and connect to a specific BSSID without scanning including IP in less than 600ms (this is what I measured, of course it depends on many factors) 18) The host reports to the CSME firmware that there is a connection 19) The TCP connection is preserved and the host has now connectivity 20) Later, the TCP connection to the CSME firmware is terminated 21) The CSME firmware tells iwlmei that it is now free to do whatever it likes 22) iwlwifi sends the second vendor command to tell the user space that it can remove the special network configuration and pick any SSID / BSSID it likes. Co-Developed-by: Ayala Beker <ayala.beker@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Link: https://lore.kernel.org/r/20210625081717.7680-4-emmanuel.grumbach@intel.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-08-12	Merge tag 'scmi-updates-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/drivers	Arnd Bergmann	2	-0/+25
	SCMI Updates for v5.15 The bulk of the addition this time is mainly refactoring to add support for Virtio transport for SCMI and the addition of the support itself. The refactoring includes allowing transport specific init/exit calls, making each transport as compile time configurable, supporting monotonically increasing tokens instead of using the next available free buffer index as the token for scmi messages which eases handling concurrent and out-of-order messages which is a must have for virtio transport. Virtio support itself is conformant to the virtio SCMI device spec [1]. Virtio device id 32 has been reserved for the SCMI device [2]. Other than the virtio support, there is one bug fix in the probe failure clean up path. [1] https://github.com/oasis-tcs/virtio-spec/blob/master/virtio-scmi.tex [2] https://www.oasis-open.org/committees/ballot.php?id=3496 * tag 'scmi-updates-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_scmi: Use WARN_ON() to check configured transports firmware: arm_scmi: Fix boolconv.cocci warnings firmware: arm_scmi: Free mailbox channels if probe fails firmware: arm_scmi: Add virtio transport firmware: arm_scmi: Add priv parameter to scmi_rx_callback dt-bindings: arm: Add virtio transport for SCMI firmware: arm_scmi: Add optional link_supplier() transport op firmware: arm_scmi: Add message passing abstractions for transports firmware: arm_scmi: Add method to override max message number firmware: arm_scmi: Make shmem support optional for transports firmware: arm_scmi: Make SCMI transports configurable firmware: arm_scmi: Make polling mode optional firmware: arm_scmi: Make .clear_channel optional firmware: arm_scmi: Handle concurrent and out-of-order messages firmware: arm_scmi: Introduce monotonically increasing tokens firmware: arm_scmi: Add optional transport_init/exit support firmware: arm_scmi: Remove scmi_dump_header_dbg() helper firmware: arm_scmi: Add support for type handling in common functions Link: https://lore.kernel.org/r/20210811075743.707961-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-08-11	net: bridge: vlan: use br_rports_fill_info() to export mcast router ports	Nikolay Aleksandrov	1	-0/+1
	Embed the standard multicast router port export by br_rports_fill_info() into a new global vlan attribute BRIDGE_VLANDB_GOPTS_MCAST_ROUTER_PORTS. In order to have the same format for the global bridge mcast context and the per-vlan mcast context we need a double-nesting: - BRIDGE_VLANDB_GOPTS_MCAST_ROUTER_PORTS - MDBA_ROUTER Currently we don't compare router lists, if any router port exists in the bridge mcast contexts we consider their option sets as different and export them separately. In addition we export the router port vlan id when dumping similar to the router port notification format. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11	net: bridge: vlan: add support for mcast router global option	Nikolay Aleksandrov	1	-0/+1
	Add support to change and retrieve global vlan multicast router state which is used for the bridge itself. We just need to pass multicast context to br_multicast_set_router instead of bridge device and the rest of the logic remains the same. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11	net: bridge: vlan: add support for mcast querier global option	Nikolay Aleksandrov	1	-0/+1
	Add support to change and retrieve global vlan multicast querier state. We just need to pass multicast context to br_multicast_set_querier instead of bridge device and the rest of the logic remains the same. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11	net: bridge: vlan: add support for mcast startup query interval global option	Nikolay Aleksandrov	1	-0/+1
	Add support to change and retrieve global vlan multicast startup query interval option. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11	net: bridge: vlan: add support for mcast query response interval global option	Nikolay Aleksandrov	1	-0/+1
	Add support to change and retrieve global vlan multicast query response interval option. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11	net: bridge: vlan: add support for mcast query interval global option	Nikolay Aleksandrov	1	-0/+1
	Add support to change and retrieve global vlan multicast query interval option. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11	net: bridge: vlan: add support for mcast querier interval global option	Nikolay Aleksandrov	1	-0/+1
	Add support to change and retrieve global vlan multicast querier interval option. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11	net: bridge: vlan: add support for mcast membership interval global option	Nikolay Aleksandrov	1	-0/+1
	Add support to change and retrieve global vlan multicast membership interval option. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11	net: bridge: vlan: add support for mcast last member interval global option	Nikolay Aleksandrov	1	-0/+2
	Add support to change and retrieve global vlan multicast last member interval option. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11	net: bridge: vlan: add support for mcast startup query count global option	Nikolay Aleksandrov	1	-0/+1
	Add support to change and retrieve global vlan multicast startup query count option. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11	net: bridge: vlan: add support for mcast last member count global option	Nikolay Aleksandrov	1	-0/+1
	Add support to change and retrieve global vlan multicast last member count option. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11	net: bridge: vlan: add support for mcast igmp/mld version global options	Nikolay Aleksandrov	1	-0/+2
	Add support to change and retrieve global vlan IGMP/MLD versions. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next	David S. Miller	1	-0/+1
	Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: 1) Use nfnetlink_unicast() instead of netlink_unicast() in nft_compat. 2) Remove call to nf_ct_l4proto_find() in flowtable offload timeout fixup. 3) CLUSTERIP registers ARP hook on demand, from Florian. 4) Use clusterip_net to store pernet warning, also from Florian. 5) Remove struct netns_xt, from Florian Westphal. 6) Enable ebtables hooks in initns on demand, from Florian. 7) Allow to filter conntrack netlink dump per status bits, from Florian Westphal. 8) Register x_tables hooks in initns on demand, from Florian. 9) Remove queue_handler from per-netns structure, again from Florian. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10	net: bridge: fix flags interpretation for extern learn fdb entries	Nikolay Aleksandrov	1	-2/+5
	Ignore fdb flags when adding port extern learn entries and always set BR_FDB_LOCAL flag when adding bridge extern learn entries. This is closest to the behaviour we had before and avoids breaking any use cases which were allowed. This patch fixes iproute2 calls which assume NUD_PERMANENT and were allowed before, example: $ bridge fdb add 00:11:22:33:44:55 dev swp1 extern_learn Extern learn entries are allowed to roam, but do not expire, so static or dynamic flags make no sense for them. Also add a comment for future reference. Fixes: eb100e0e24a2 ("net: bridge: allow to add externally learned entries from user-space") Fixes: 0541a6293298 ("net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry") Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20210810110010.43859-1-razor@blackwall.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-10	dm ima: measure data on table load	Tushar Sugandhi	1	-0/+6
	DM configures a block device with various target specific attributes passed to it as a table. DM loads the table, and calls each target’s respective constructors with the attributes as input parameters. Some of these attributes are critical to ensure the device meets certain security bar. Thus, IMA should measure these attributes, to ensure they are not tampered with, during the lifetime of the device. So that the external services can have high confidence in the configuration of the block-devices on a given system. Some devices may have large tables. And a given device may change its state (table-load, suspend, resume, rename, remove, table-clear etc.) many times. Measuring these attributes each time when the device changes its state will significantly increase the size of the IMA logs. Further, once configured, these attributes are not expected to change unless a new table is loaded, or a device is removed and recreated. Therefore the clear-text of the attributes should only be measured during table load, and the hash of the active/inactive table should be measured for the remaining device state changes. Export IMA function ima_measure_critical_data() to allow measurement of DM device parameters, as well as target specific attributes, during table load. Compute the hash of the inactive table and store it for measurements during future state change. If a load is called multiple times, update the inactive table hash with the hash of the latest populated table. So that the correct inactive table hash is measured when the device transitions to different states like resume, remove, rename, etc. Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> # leak fix Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-08-10	fanotify: add pidfd support to the fanotify API	Matthew Bobrowski	1	-0/+13
	Introduce a new flag FAN_REPORT_PIDFD for fanotify_init(2) which allows userspace applications to control whether a pidfd information record containing a pidfd is to be returned alongside the generic event metadata for each event. If FAN_REPORT_PIDFD is enabled for a notification group, an additional struct fanotify_event_info_pidfd object type will be supplied alongside the generic struct fanotify_event_metadata for a single event. This functionality is analogous to that of FAN_REPORT_FID in terms of how the event structure is supplied to a userspace application. Usage of FAN_REPORT_PIDFD with FAN_REPORT_FID/FAN_REPORT_DFID_NAME is permitted, and in this case a struct fanotify_event_info_pidfd object will likely follow any struct fanotify_event_info_fid object. Currently, the usage of the FAN_REPORT_TID flag is not permitted along with FAN_REPORT_PIDFD as the pidfd API currently only supports the creation of pidfds for thread-group leaders. Additionally, usage of the FAN_REPORT_PIDFD flag is limited to privileged processes only i.e. event listeners that are running with the CAP_SYS_ADMIN capability. Attempting to supply the FAN_REPORT_TID initialization flags with FAN_REPORT_PIDFD or creating a notification group without CAP_SYS_ADMIN will result with -EINVAL being returned to the caller. In the event of a pidfd creation error, there are two types of error values that can be reported back to the listener. There is FAN_NOPIDFD, which will be reported in cases where the process responsible for generating the event has terminated prior to the event listener being able to read the event. Then there is FAN_EPIDFD, which will be reported when a more generic pidfd creation error has occurred when fanotify calls pidfd_create(). Link: https://lore.kernel.org/r/5f9e09cff7ed62bfaa51c1369e0f7ea5f16a91aa.1628398044.git.repnop@google.com Signed-off-by: Matthew Bobrowski <repnop@google.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-08-09	Merge 5.14-rc5 into tty-next	Greg Kroah-Hartman	1	-1/+1
	We need the tty/serial fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-09	Merge tag 'v5.14-rc5' into next	Vinod Koul	1	-1/+1
	Linux 5.14-rc5
2021-08-06	drm/amdkfd: Allow querying SVM attributes that are clear	Felix Kuehling	1	-7/+9
	Currently the SVM get_attr call allows querying, which flags are set in the entire address range. Add the opposite query, which flags are clear in the entire address range. Both queries can be combined in a single get_attr call, which allows answering questions such as, "is this address range coherent, non-coherent, or a mix of both"? Proposed userspace for UAPI: https://github.com/RadeonOpenCompute/ROCR-Runtime/tree/memory_model_queries Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yand <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-06	netfilter: nfnetlink_hook: missing chain family	Pablo Neira Ayuso	1	-0/+9
	The family is relevant for pseudo-families like NFPROTO_INET otherwise the user needs to rely on the hook function name to differentiate it from NFPROTO_IPV4 and NFPROTO_IPV6 names. Add nfnl_hook_chain_desc_attributes instead of using the existing NFTA_CHAIN_* attributes, since these do not provide a family number. Fixes: e2cf17d3774c ("netfilter: add new hook nfnl subsystem") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-05	Merge tag 'v5.14-rc4' into media_tree	Mauro Carvalho Chehab	5	-6/+34
	Linux 5.14-rc4 * tag 'v5.14-rc4': (948 commits) Linux 5.14-rc4 pipe: make pipe writes always wake up readers Revert "perf map: Fix dso->nsinfo refcounting" mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook() slub: fix unreclaimable slab stat for bulk free mm/migrate: fix NR_ISOLATED corruption on 64-bit mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code ocfs2: issue zeroout to EOF blocks ocfs2: fix zero out valid data lib/test_string.c: move string selftest in the Runtime Testing menu gve: Update MAINTAINERS list arch: Kconfig: clean up obsolete use of HAVE_IDE can: esd_usb2: fix memory leak can: ems_usb: fix memory leak can: usb_8dev: fix memory leak can: mcba_usb_start(): add missing urb->transfer_dma initialization can: hi311x: fix a signedness bug in hi3110_cmd() MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver scsi: fas216: Fix fall-through warning for Clang scsi: acornscsi: Fix fall-through warning for clang ...
2021-08-05	netfilter: ctnetlink: allow to filter dump by status bits	Florian Westphal	1	-0/+1
	If CTA_STATUS is present, but CTA_STATUS_MASK is not, then the mask is automatically set to 'status', so that kernel returns those entries that have all of the requested bits set. This makes more sense than using a all-one mask since we'd hardly ever find a match. There are no other checks for status bits, so if e.g. userspace sets impossible combinations it will get an empty dump. If kernel would reject unknown status bits, then a program that works on a future kernel that has IPS_FOO bit fails on old kernels. Same for 'impossible' combinations: Kernel never sets ASSURED without first having set SEEN_REPLY, but its possible that a future kernel could do so. Therefore no sanity tests other than a 0-mask. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-05	net/ipv4/ipv6: Replace one-element arraya with flexible-array members	Gustavo A. R. Silva	1	-5/+16
	There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. Use an anonymous union with a couple of anonymous structs in order to keep userspace unchanged and refactor the related code accordingly: $ pahole -C group_filter net/ipv4/ip_sockglue.o struct group_filter { union { struct { __u32 gf_interface_aux; /* 0 4 / / XXX 4 bytes hole, try to pack / struct __kernel_sockaddr_storage gf_group_aux; / 8 128 / / --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- / __u32 gf_fmode_aux; / 136 4 / __u32 gf_numsrc_aux; / 140 4 / struct __kernel_sockaddr_storage gf_slist[1]; / 144 128 / }; / 0 272 / struct { __u32 gf_interface; / 0 4 / / XXX 4 bytes hole, try to pack / struct __kernel_sockaddr_storage gf_group; / 8 128 / / --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- / __u32 gf_fmode; / 136 4 / __u32 gf_numsrc; / 140 4 / struct __kernel_sockaddr_storage gf_slist_flex[0]; / 144 0 / }; / 0 144 / }; / 0 272 / / size: 272, cachelines: 5, members: 1 / / last cacheline: 16 bytes / }; $ pahole -C compat_group_filter net/ipv4/ip_sockglue.o struct compat_group_filter { union { struct { __u32 gf_interface_aux; / 0 4 / struct __kernel_sockaddr_storage gf_group_aux __attribute__((__aligned__(4))); / 4 128 / / --- cacheline 2 boundary (128 bytes) was 4 bytes ago --- / __u32 gf_fmode_aux; / 132 4 / __u32 gf_numsrc_aux; / 136 4 / struct __kernel_sockaddr_storage gf_slist[1] __attribute__((__aligned__(4))); / 140 128 / } __attribute__((__packed__)) __attribute__((__aligned__(4))); / 0 268 / struct { __u32 gf_interface; / 0 4 / struct __kernel_sockaddr_storage gf_group __attribute__((__aligned__(4))); / 4 128 / / --- cacheline 2 boundary (128 bytes) was 4 bytes ago --- / __u32 gf_fmode; / 132 4 / __u32 gf_numsrc; / 136 4 / struct __kernel_sockaddr_storage gf_slist_flex[0] __attribute__((__aligned__(4))); / 140 0 / } __attribute__((__packed__)) __attribute__((__aligned__(4))); / 0 140 / } __attribute__((__aligned__(1))); / 0 268 / / size: 268, cachelines: 5, members: 1 / / forced alignments: 1 / / last cacheline: 12 bytes */ } __attribute__((__packed__)); This helps with the ongoing efforts to globally enable -Warray-bounds and get us closer to being able to tighten the FORTIFY_SOURCE routines on memcpy(). [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.10/process/deprecated.html#zero-length-and-one-element-arrays Link: https://github.com/KSPP/linux/issues/79 Link: https://github.com/KSPP/linux/issues/109 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-05	firmware: arm_scmi: Add virtio transport	Igor Skalkin	2	-0/+25
	This transport enables communications with an SCMI platform through virtio; the SCMI platform will be represented by a virtio device. Implement an SCMI virtio driver according to the virtio SCMI device spec [1]. Virtio device id 32 has been reserved for the SCMI device [2]. The virtio transport has one Tx channel (virtio cmdq, A2P channel) and at most one Rx channel (virtio eventq, P2A channel). The following feature bit defined in [1] is not implemented: VIRTIO_SCMI_F_SHARED_MEMORY. The number of messages which can be pending simultaneously is restricted according to the virtqueue capacity negotiated at probing time. As soon as Rx channel message buffers are allocated or have been read out by the arm-scmi driver, feed them back to the virtio device. Since some virtio devices may not have the short response time exhibited by SCMI platforms using other transports, set a generous response timeout. SCMI polling mode is not supported by this virtio transport since deemed meaningless: polling mode operation is offered by the SCMI core to those transports that could not provide a completion interrupt on the TX path, which is never the case for virtio whose core callbacks can easily call into core scmi_rx_callback upon messages reception. [1] https://github.com/oasis-tcs/virtio-spec/blob/master/virtio-scmi.tex [2] https://www.oasis-open.org/committees/ballot.php?id=3496 Link: https://lore.kernel.org/r/20210803131024.40280-16-cristian.marussi@arm.com Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Co-developed-by: Peter Hilber <peter.hilber@opensynergy.com> Co-developed-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Igor Skalkin <igor.skalkin@opensynergy.com> [ Peter: Adapted patch for submission to upstream. ] Signed-off-by: Peter Hilber <peter.hilber@opensynergy.com> [ Cristian: simplified driver logic, changed link_supplier and channel available/setup logic, removed dummy callbacks ] Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-08-04	media: v4l2-ctrls: Add intra-refresh period control	Stanimir Varbanov	1	-0/+1
	Add a control to set intra-refresh period. Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2021-08-04	sock: allow reading and changing sk_userlocks with setsockopt	Pavel Tikhomirov	1	-0/+5
	SOCK_SNDBUF_LOCK and SOCK_RCVBUF_LOCK flags disable automatic socket buffers adjustment done by kernel (see tcp_fixup_rcvbuf() and tcp_sndbuf_expand()). If we've just created a new socket this adjustment is enabled on it, but if one changes the socket buffer size by setsockopt(SO_{SND,RCV}BUF) it becomes disabled. CRIU needs to call setsockopt(SO_{SND,RCV}BUF) on each socket on restore as it first needs to increase buffer sizes for packet queues restore and second it needs to restore back original buffer sizes. So after CRIU restore all sockets become non-auto-adjustable, which can decrease network performance of restored applications significantly. CRIU need to be able to restore sockets with enabled/disabled adjustment to the same state it was before dump, so let's add special setsockopt for it. Let's also export SOCK_SNDBUF_LOCK and SOCK_RCVBUF_LOCK flags to uAPI so that using these interface one can reenable automatic socket buffer adjustment on their sockets. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04	Merge tag 'linux-can-next-for-5.15-20210804' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next	David S. Miller	1	-0/+9
	Marc Kleine-Budde says: ==================== pull-request: can-next 2021-08-04 this is a pull request of 5 patches for net-next/master. The first patch is by me and fixes a typo in a comment in the CAN J1939 protocol. The next 2 patches are by Oleksij Rempel and update the CAN J1939 protocol to send RX status updates via the error queue mechanism. The next patch is by me and adds a missing variable initialization to the flexcan driver (the problem was introduced in the current net-next cycle). The last patch is by Aswath Govindraju and adds power-domains to the Bosch m_can DT binding documentation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04	can: j1939: extend UAPI to notify about RX status	Oleksij Rempel	1	-0/+9
	To be able to create applications with user friendly feedback, we need be able to provide receive status information. Typical ETP transfer may take seconds or even hours. To give user some clue or show a progress bar, the stack should push status updates. Same as for the TX information, the socket error queue will be used with following new signals: - J1939_EE_INFO_RX_RTS - received and accepted request to send signal. - J1939_EE_INFO_RX_DPO - received data package offset signal - J1939_EE_INFO_RX_ABORT - RX session was aborted Instead of completion signal, user will get data package. To activate this signals, application should set SOF_TIMESTAMPING_RX_SOFTWARE to the SO_TIMESTAMPING socket option. This will avoid unpredictable application behavior for the old software. Link: https://lore.kernel.org/r/20210707094854.30781-3-o.rempel@pengutronix.de Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>