aboutsummaryrefslogtreecommitdiffstats
path: root/include/net (follow)
AgeCommit message (Collapse)AuthorFilesLines
2022-01-09Merge tag 'for-net-next-2022-01-07' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-nextJakub Kicinski1-5/+1
Luiz Augusto von Dentz says: ==================== bluetooth-next pull request for net-next: - Add support for Foxconn QCA 0xe0d0 - Fix HCI init sequence on MacBook Air 8,1 and 8,2 - Fix Intel firmware loading on legacy ROM devices * tag 'for-net-next-2022-01-07' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: Bluetooth: hci_sock: fix endian bug in hci_sock_setsockopt() Bluetooth: L2CAP: uninitialized variables in l2cap_sock_setsockopt() Bluetooth: btqca: sequential validation Bluetooth: btusb: Add support for Foxconn QCA 0xe0d0 Bluetooth: btintel: Fix broken LED quirk for legacy ROM devices Bluetooth: hci_event: Rework hci_inquiry_result_with_rssi_evt Bluetooth: btbcm: disable read tx power for MacBook Air 8,1 and 8,2 Bluetooth: hci_qca: Fix NULL vs IS_ERR_OR_NULL check in qca_serdev_probe Bluetooth: hci_bcm: Check for error irq ==================== Link: https://lore.kernel.org/r/20220107210942.3750887-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-06Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski3-2/+13
Alexei Starovoitov says: ==================== pull-request: bpf-next 2022-01-06 We've added 41 non-merge commits during the last 2 day(s) which contain a total of 36 files changed, 1214 insertions(+), 368 deletions(-). The main changes are: 1) Various fixes in the verifier, from Kris and Daniel. 2) Fixes in sockmap, from John. 3) bpf_getsockopt fix, from Kuniyuki. 4) INET_POST_BIND fix, from Menglong. 5) arm64 JIT fix for bpf pseudo funcs, from Hou. 6) BPF ISA doc improvements, from Christoph. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (41 commits) bpf: selftests: Add bind retry for post_bind{4, 6} bpf: selftests: Use C99 initializers in test_sock.c net: bpf: Handle return value of BPF_CGROUP_RUN_PROG_INET{4,6}_POST_BIND() bpf/selftests: Test bpf_d_path on rdonly_mem. libbpf: Add documentation for bpf_map batch operations selftests/bpf: Don't rely on preserving volatile in PT_REGS macros in loop3 xdp: Add xdp_do_redirect_frame() for pre-computed xdp_frames xdp: Move conversion to xdp_frame out of map functions page_pool: Store the XDP mem id page_pool: Add callback to init pages when they are allocated xdp: Allow registering memory model without rxq reference samples/bpf: xdpsock: Add timestamp for Tx-only operation samples/bpf: xdpsock: Add time-out for cleaning Tx samples/bpf: xdpsock: Add sched policy and priority support samples/bpf: xdpsock: Add cyclic TX operation capability samples/bpf: xdpsock: Add clockid selection support samples/bpf: xdpsock: Add Dest and Src MAC setting for Tx-only operation samples/bpf: xdpsock: Add VLAN support for Tx-only operation libbpf 1.0: Deprecate bpf_object__find_map_by_offset() API libbpf 1.0: Deprecate bpf_map__is_offload_neutral() ... ==================== Link: https://lore.kernel.org/r/20220107013626.53943-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-06net: bpf: Handle return value of BPF_CGROUP_RUN_PROG_INET{4,6}_POST_BIND()Menglong Dong1-0/+1
The return value of BPF_CGROUP_RUN_PROG_INET{4,6}_POST_BIND() in __inet_bind() is not handled properly. While the return value is non-zero, it will set inet_saddr and inet_rcv_saddr to 0 and exit: err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk); if (err) { inet->inet_saddr = inet->inet_rcv_saddr = 0; goto out_release_sock; } Let's take UDP for example and see what will happen. For UDP socket, it will be added to 'udp_prot.h.udp_table->hash' and 'udp_prot.h.udp_table->hash2' after the sk->sk_prot->get_port() called success. If 'inet->inet_rcv_saddr' is specified here, then 'sk' will be in the 'hslot2' of 'hash2' that it don't belong to (because inet_saddr is changed to 0), and UDP packet received will not be passed to this sock. If 'inet->inet_rcv_saddr' is not specified here, the sock will work fine, as it can receive packet properly, which is wired, as the 'bind()' is already failed. To undo the get_port() operation, introduce the 'put_port' field for 'struct proto'. For TCP proto, it is inet_put_port(); For UDP proto, it is udp_lib_unhash(); For icmp proto, it is ping_unhash(). Therefore, after sys_bind() fail caused by BPF_CGROUP_RUN_PROG_INET4_POST_BIND(), it will be unbinded, which means that it can try to be binded to another port. Signed-off-by: Menglong Dong <imagedong@tencent.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220106132022.3470772-2-imagedong@tencent.com
2022-01-06Bluetooth: hci_event: Rework hci_inquiry_result_with_rssi_evtLuiz Augusto von Dentz1-5/+1
This rework the handling of hci_inquiry_result_with_rssi_evt to not use a union to represent the different inquiry responses. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Tested-by: Soenke Huster <soenke.huster@eknoes.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2022-01-06net: dsa: warn about dsa_port and dsa_switch bit fields being non atomicVladimir Oltean1-0/+8
As discussed during review here: https://patchwork.kernel.org/project/netdevbpf/patch/20220105132141.2648876-3-vladimir.oltean@nxp.com/ we should inform developers about pitfalls of concurrent access to the boolean properties of dsa_switch and dsa_port, now that they've been converted to bit fields. No other measure than a comment needs to be taken, since the code paths that update these bit fields are not concurrent with each other. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-06net: dsa: don't enumerate dsa_switch and dsa_port bit fields using commasVladimir Oltean1-58/+56
This is a cosmetic incremental fixup to commits 7787ff776398 ("net: dsa: merge all bools of struct dsa_switch into a single u32") bde82f389af1 ("net: dsa: merge all bools of struct dsa_port into a single u8") The desire to make this change was enunciated after posting these patches here: https://patchwork.kernel.org/project/netdevbpf/cover/20220105132141.2648876-1-vladimir.oltean@nxp.com/ but due to a slight timing overlap (message posted at 2:28 p.m. UTC, merge commit is at 2:46 p.m. UTC), that comment was missed and the changes were applied as-is. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-06Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsecDavid S. Miller1-1/+1
Steffen Klassert says: ==================== pull request (net): ipsec 2022-01-06 1) Fix xfrm policy lookups for ipv6 gre packets by initializing fl6_gre_key properly. From Ghalem Boudour. 2) Fix the dflt policy check on forwarding when there is no policy configured. The check was done for the wrong direction. From Nicolas Dichtel. 3) Use the correct 'struct xfrm_user_offload' when calculating netlink message lenghts in xfrm_sa_len(). From Eric Dumazet. 4) Tread inserting xfrm interface id 0 as an error. From Antony Antony. 5) Fail if xfrm state or policy is inserted with XFRMA_IF_ID 0, xfrm interfaces with id 0 are not allowed. From Antony Antony. 6) Fix inner_ipproto setting in the sec_path for tunnel mode. From Raed Salem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-06Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-nextDavid S. Miller1-0/+5
Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2022-01-06 1) Fix some clang_analyzer warnings about never read variables. From luo penghao. 2) Check for pols[0] only once in xfrm_expand_policies(). From Jean Sacren. 3) The SA curlft.use_time was updated only on SA cration time. Update whenever the SA is used. From Antony Antony 4) Add support for SM3 secure hash. From Xu Jia. 5) Add support for SM4 symmetric cipher algorithm. From Xu Jia. 6) Add a rate limit for SA mapping change messages. From Antony Antony. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-05page_pool: Store the XDP mem idToke Høiland-Jørgensen1-2/+7
Store the XDP mem ID inside the page_pool struct so it can be retrieved later for use in bpf_prog_run(). Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/20220103150812.87914-4-toke@redhat.com
2022-01-05page_pool: Add callback to init pages when they are allocatedToke Høiland-Jørgensen1-0/+2
Add a new callback function to page_pool that, if set, will be called every time a new page is allocated. This will be used from bpf_test_run() to initialise the page data with the data provided by userspace when running XDP programs with redirect turned on. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/20220103150812.87914-3-toke@redhat.com
2022-01-05xdp: Allow registering memory model without rxq referenceToke Høiland-Jørgensen1-0/+3
The functions that register an XDP memory model take a struct xdp_rxq as parameter, but the RXQ is not actually used for anything other than pulling out the struct xdp_mem_info that it embeds. So refactor the register functions and export variants that just take a pointer to the xdp_mem_info. This is in preparation for enabling XDP_REDIRECT in bpf_prog_run(), using a page_pool instance that is not connected to any network device. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220103150812.87914-2-toke@redhat.com
2022-01-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-2/+22
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-05net: dsa: combine two holes in struct dsa_switch_treeVladimir Oltean1-3/+3
There is a 7 byte hole after dst->setup and a 4 byte hole after dst->default_proto. Combining them, we have a single hole of just 3 bytes on 64 bit machines. Before: pahole -C dsa_switch_tree net/dsa/slave.o struct dsa_switch_tree { struct list_head list; /* 0 16 */ struct list_head ports; /* 16 16 */ struct raw_notifier_head nh; /* 32 8 */ unsigned int index; /* 40 4 */ struct kref refcount; /* 44 4 */ struct net_device * * lags; /* 48 8 */ bool setup; /* 56 1 */ /* XXX 7 bytes hole, try to pack */ /* --- cacheline 1 boundary (64 bytes) --- */ const struct dsa_device_ops * tag_ops; /* 64 8 */ enum dsa_tag_protocol default_proto; /* 72 4 */ /* XXX 4 bytes hole, try to pack */ struct dsa_platform_data * pd; /* 80 8 */ struct list_head rtable; /* 88 16 */ unsigned int lags_len; /* 104 4 */ unsigned int last_switch; /* 108 4 */ /* size: 112, cachelines: 2, members: 13 */ /* sum members: 101, holes: 2, sum holes: 11 */ /* last cacheline: 48 bytes */ }; After: pahole -C dsa_switch_tree net/dsa/slave.o struct dsa_switch_tree { struct list_head list; /* 0 16 */ struct list_head ports; /* 16 16 */ struct raw_notifier_head nh; /* 32 8 */ unsigned int index; /* 40 4 */ struct kref refcount; /* 44 4 */ struct net_device * * lags; /* 48 8 */ const struct dsa_device_ops * tag_ops; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ enum dsa_tag_protocol default_proto; /* 64 4 */ bool setup; /* 68 1 */ /* XXX 3 bytes hole, try to pack */ struct dsa_platform_data * pd; /* 72 8 */ struct list_head rtable; /* 80 16 */ unsigned int lags_len; /* 96 4 */ unsigned int last_switch; /* 100 4 */ /* size: 104, cachelines: 2, members: 13 */ /* sum members: 101, holes: 1, sum holes: 3 */ /* last cacheline: 40 bytes */ }; Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-05net: dsa: move dsa_switch_tree :: ports and lags to first cache lineVladimir Oltean1-7/+9
dst->ports is accessed most notably by dsa_master_find_slave(), which is invoked in the RX path. dst->lags is accessed by dsa_lag_dev(), which is invoked in the RX path of tag_dsa.c. dst->tag_ops, dst->default_proto and dst->pd don't need to be in the first cache line, so they are moved out by this change. Before: pahole -C dsa_switch_tree net/dsa/slave.o struct dsa_switch_tree { struct list_head list; /* 0 16 */ struct raw_notifier_head nh; /* 16 8 */ unsigned int index; /* 24 4 */ struct kref refcount; /* 28 4 */ bool setup; /* 32 1 */ /* XXX 7 bytes hole, try to pack */ const struct dsa_device_ops * tag_ops; /* 40 8 */ enum dsa_tag_protocol default_proto; /* 48 4 */ /* XXX 4 bytes hole, try to pack */ struct dsa_platform_data * pd; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ struct list_head ports; /* 64 16 */ struct list_head rtable; /* 80 16 */ struct net_device * * lags; /* 96 8 */ unsigned int lags_len; /* 104 4 */ unsigned int last_switch; /* 108 4 */ /* size: 112, cachelines: 2, members: 13 */ /* sum members: 101, holes: 2, sum holes: 11 */ /* last cacheline: 48 bytes */ }; After: pahole -C dsa_switch_tree net/dsa/slave.o struct dsa_switch_tree { struct list_head list; /* 0 16 */ struct list_head ports; /* 16 16 */ struct raw_notifier_head nh; /* 32 8 */ unsigned int index; /* 40 4 */ struct kref refcount; /* 44 4 */ struct net_device * * lags; /* 48 8 */ bool setup; /* 56 1 */ /* XXX 7 bytes hole, try to pack */ /* --- cacheline 1 boundary (64 bytes) --- */ const struct dsa_device_ops * tag_ops; /* 64 8 */ enum dsa_tag_protocol default_proto; /* 72 4 */ /* XXX 4 bytes hole, try to pack */ struct dsa_platform_data * pd; /* 80 8 */ struct list_head rtable; /* 88 16 */ unsigned int lags_len; /* 104 4 */ unsigned int last_switch; /* 108 4 */ /* size: 112, cachelines: 2, members: 13 */ /* sum members: 101, holes: 2, sum holes: 11 */ /* last cacheline: 48 bytes */ }; Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-05net: dsa: make dsa_switch :: num_ports an unsigned intVladimir Oltean1-1/+1
Currently, num_ports is declared as size_t, which is defined as __kernel_ulong_t, therefore it occupies 8 bytes of memory. Even switches with port numbers in the range of tens are exotic, so there is no need for this amount of storage. Additionally, because the max_num_bridges member right above it is also 4 bytes, it means the compiler needs to add padding between the last 2 fields. By reducing the size, we don't need that padding and can reduce the struct size. Before: pahole -C dsa_switch net/dsa/slave.o struct dsa_switch { struct device * dev; /* 0 8 */ struct dsa_switch_tree * dst; /* 8 8 */ unsigned int index; /* 16 4 */ u32 setup:1; /* 20: 0 4 */ u32 vlan_filtering_is_global:1; /* 20: 1 4 */ u32 needs_standalone_vlan_filtering:1; /* 20: 2 4 */ u32 configure_vlan_while_not_filtering:1; /* 20: 3 4 */ u32 untag_bridge_pvid:1; /* 20: 4 4 */ u32 assisted_learning_on_cpu_port:1; /* 20: 5 4 */ u32 vlan_filtering:1; /* 20: 6 4 */ u32 pcs_poll:1; /* 20: 7 4 */ u32 mtu_enforcement_ingress:1; /* 20: 8 4 */ /* XXX 23 bits hole, try to pack */ struct notifier_block nb; /* 24 24 */ /* XXX last struct has 4 bytes of padding */ void * priv; /* 48 8 */ void * tagger_data; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ struct dsa_chip_data * cd; /* 64 8 */ const struct dsa_switch_ops * ops; /* 72 8 */ u32 phys_mii_mask; /* 80 4 */ /* XXX 4 bytes hole, try to pack */ struct mii_bus * slave_mii_bus; /* 88 8 */ unsigned int ageing_time_min; /* 96 4 */ unsigned int ageing_time_max; /* 100 4 */ struct dsa_8021q_context * tag_8021q_ctx; /* 104 8 */ struct devlink * devlink; /* 112 8 */ unsigned int num_tx_queues; /* 120 4 */ unsigned int num_lag_ids; /* 124 4 */ /* --- cacheline 2 boundary (128 bytes) --- */ unsigned int max_num_bridges; /* 128 4 */ /* XXX 4 bytes hole, try to pack */ size_t num_ports; /* 136 8 */ /* size: 144, cachelines: 3, members: 27 */ /* sum members: 132, holes: 2, sum holes: 8 */ /* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 23 bits */ /* paddings: 1, sum paddings: 4 */ /* last cacheline: 16 bytes */ }; After: pahole -C dsa_switch net/dsa/slave.o struct dsa_switch { struct device * dev; /* 0 8 */ struct dsa_switch_tree * dst; /* 8 8 */ unsigned int index; /* 16 4 */ u32 setup:1; /* 20: 0 4 */ u32 vlan_filtering_is_global:1; /* 20: 1 4 */ u32 needs_standalone_vlan_filtering:1; /* 20: 2 4 */ u32 configure_vlan_while_not_filtering:1; /* 20: 3 4 */ u32 untag_bridge_pvid:1; /* 20: 4 4 */ u32 assisted_learning_on_cpu_port:1; /* 20: 5 4 */ u32 vlan_filtering:1; /* 20: 6 4 */ u32 pcs_poll:1; /* 20: 7 4 */ u32 mtu_enforcement_ingress:1; /* 20: 8 4 */ /* XXX 23 bits hole, try to pack */ struct notifier_block nb; /* 24 24 */ /* XXX last struct has 4 bytes of padding */ void * priv; /* 48 8 */ void * tagger_data; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ struct dsa_chip_data * cd; /* 64 8 */ const struct dsa_switch_ops * ops; /* 72 8 */ u32 phys_mii_mask; /* 80 4 */ /* XXX 4 bytes hole, try to pack */ struct mii_bus * slave_mii_bus; /* 88 8 */ unsigned int ageing_time_min; /* 96 4 */ unsigned int ageing_time_max; /* 100 4 */ struct dsa_8021q_context * tag_8021q_ctx; /* 104 8 */ struct devlink * devlink; /* 112 8 */ unsigned int num_tx_queues; /* 120 4 */ unsigned int num_lag_ids; /* 124 4 */ /* --- cacheline 2 boundary (128 bytes) --- */ unsigned int max_num_bridges; /* 128 4 */ unsigned int num_ports; /* 132 4 */ /* size: 136, cachelines: 3, members: 27 */ /* sum members: 128, holes: 1, sum holes: 4 */ /* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 23 bits */ /* paddings: 1, sum paddings: 4 */ /* last cacheline: 8 bytes */ }; Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-05net: dsa: merge all bools of struct dsa_switch into a single u32Vladimir Oltean1-46/+51
struct dsa_switch has 9 boolean properties, many of which are in fact set by drivers for custom behavior (vlan_filtering_is_global, needs_standalone_vlan_filtering, etc etc). The binary layout of the structure could be improved. For example, the "bool setup" at the beginning introduces a gratuitous 7 byte hole in the first cache line. The change merges all boolean properties into bitfields of an u32, and places that u32 in the first cache line of the structure, since many bools are accessed from the data path (untag_bridge_pvid, vlan_filtering, vlan_filtering_is_global). We place this u32 after the existing ds->index, which is also 4 bytes in size. As a positive side effect, ds->tagger_data now fits into the first cache line too, because 4 bytes are saved. Before: pahole -C dsa_switch net/dsa/slave.o struct dsa_switch { bool setup; /* 0 1 */ /* XXX 7 bytes hole, try to pack */ struct device * dev; /* 8 8 */ struct dsa_switch_tree * dst; /* 16 8 */ unsigned int index; /* 24 4 */ /* XXX 4 bytes hole, try to pack */ struct notifier_block nb; /* 32 24 */ /* XXX last struct has 4 bytes of padding */ void * priv; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ void * tagger_data; /* 64 8 */ struct dsa_chip_data * cd; /* 72 8 */ const struct dsa_switch_ops * ops; /* 80 8 */ u32 phys_mii_mask; /* 88 4 */ /* XXX 4 bytes hole, try to pack */ struct mii_bus * slave_mii_bus; /* 96 8 */ unsigned int ageing_time_min; /* 104 4 */ unsigned int ageing_time_max; /* 108 4 */ struct dsa_8021q_context * tag_8021q_ctx; /* 112 8 */ struct devlink * devlink; /* 120 8 */ /* --- cacheline 2 boundary (128 bytes) --- */ unsigned int num_tx_queues; /* 128 4 */ bool vlan_filtering_is_global; /* 132 1 */ bool needs_standalone_vlan_filtering; /* 133 1 */ bool configure_vlan_while_not_filtering; /* 134 1 */ bool untag_bridge_pvid; /* 135 1 */ bool assisted_learning_on_cpu_port; /* 136 1 */ bool vlan_filtering; /* 137 1 */ bool pcs_poll; /* 138 1 */ bool mtu_enforcement_ingress; /* 139 1 */ unsigned int num_lag_ids; /* 140 4 */ unsigned int max_num_bridges; /* 144 4 */ /* XXX 4 bytes hole, try to pack */ size_t num_ports; /* 152 8 */ /* size: 160, cachelines: 3, members: 27 */ /* sum members: 141, holes: 4, sum holes: 19 */ /* paddings: 1, sum paddings: 4 */ /* last cacheline: 32 bytes */ }; After: pahole -C dsa_switch net/dsa/slave.o struct dsa_switch { struct device * dev; /* 0 8 */ struct dsa_switch_tree * dst; /* 8 8 */ unsigned int index; /* 16 4 */ u32 setup:1; /* 20: 0 4 */ u32 vlan_filtering_is_global:1; /* 20: 1 4 */ u32 needs_standalone_vlan_filtering:1; /* 20: 2 4 */ u32 configure_vlan_while_not_filtering:1; /* 20: 3 4 */ u32 untag_bridge_pvid:1; /* 20: 4 4 */ u32 assisted_learning_on_cpu_port:1; /* 20: 5 4 */ u32 vlan_filtering:1; /* 20: 6 4 */ u32 pcs_poll:1; /* 20: 7 4 */ u32 mtu_enforcement_ingress:1; /* 20: 8 4 */ /* XXX 23 bits hole, try to pack */ struct notifier_block nb; /* 24 24 */ /* XXX last struct has 4 bytes of padding */ void * priv; /* 48 8 */ void * tagger_data; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ struct dsa_chip_data * cd; /* 64 8 */ const struct dsa_switch_ops * ops; /* 72 8 */ u32 phys_mii_mask; /* 80 4 */ /* XXX 4 bytes hole, try to pack */ struct mii_bus * slave_mii_bus; /* 88 8 */ unsigned int ageing_time_min; /* 96 4 */ unsigned int ageing_time_max; /* 100 4 */ struct dsa_8021q_context * tag_8021q_ctx; /* 104 8 */ struct devlink * devlink; /* 112 8 */ unsigned int num_tx_queues; /* 120 4 */ unsigned int num_lag_ids; /* 124 4 */ /* --- cacheline 2 boundary (128 bytes) --- */ unsigned int max_num_bridges; /* 128 4 */ /* XXX 4 bytes hole, try to pack */ size_t num_ports; /* 136 8 */ /* size: 144, cachelines: 3, members: 27 */ /* sum members: 132, holes: 2, sum holes: 8 */ /* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 23 bits */ /* paddings: 1, sum paddings: 4 */ /* last cacheline: 16 bytes */ }; Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-05net: dsa: move dsa_port :: type near dsa_port :: indexVladimir Oltean1-2/+4
Both dsa_port :: type and dsa_port :: index introduce a 4 octet hole after them, so we can group them together and the holes would be eliminated, turning 16 octets of storage into just 8. This makes the cpu_dp pointer fit in the first cache line, which is good, because dsa_slave_to_master(), called by dsa_enqueue_skb(), uses it. Before: pahole -C dsa_port net/dsa/slave.o struct dsa_port { union { struct net_device * master; /* 0 8 */ struct net_device * slave; /* 0 8 */ }; /* 0 8 */ const struct dsa_device_ops * tag_ops; /* 8 8 */ struct dsa_switch_tree * dst; /* 16 8 */ struct sk_buff * (*rcv)(struct sk_buff *, struct net_device *); /* 24 8 */ enum { DSA_PORT_TYPE_UNUSED = 0, DSA_PORT_TYPE_CPU = 1, DSA_PORT_TYPE_DSA = 2, DSA_PORT_TYPE_USER = 3, } type; /* 32 4 */ /* XXX 4 bytes hole, try to pack */ struct dsa_switch * ds; /* 40 8 */ unsigned int index; /* 48 4 */ /* XXX 4 bytes hole, try to pack */ const char * name; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ struct dsa_port * cpu_dp; /* 64 8 */ u8 mac[6]; /* 72 6 */ u8 stp_state; /* 78 1 */ u8 vlan_filtering:1; /* 79: 0 1 */ u8 learning:1; /* 79: 1 1 */ u8 lag_tx_enabled:1; /* 79: 2 1 */ u8 devlink_port_setup:1; /* 79: 3 1 */ u8 setup:1; /* 79: 4 1 */ /* XXX 3 bits hole, try to pack */ struct device_node * dn; /* 80 8 */ unsigned int ageing_time; /* 88 4 */ /* XXX 4 bytes hole, try to pack */ struct dsa_bridge * bridge; /* 96 8 */ struct devlink_port devlink_port; /* 104 288 */ /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */ struct phylink * pl; /* 392 8 */ struct phylink_config pl_config; /* 400 40 */ struct net_device * lag_dev; /* 440 8 */ /* --- cacheline 7 boundary (448 bytes) --- */ struct net_device * hsr_dev; /* 448 8 */ struct list_head list; /* 456 16 */ const struct ethtool_ops * orig_ethtool_ops; /* 472 8 */ const struct dsa_netdevice_ops * netdev_ops; /* 480 8 */ struct mutex addr_lists_lock; /* 488 32 */ /* --- cacheline 8 boundary (512 bytes) was 8 bytes ago --- */ struct list_head fdbs; /* 520 16 */ struct list_head mdbs; /* 536 16 */ /* size: 552, cachelines: 9, members: 30 */ /* sum members: 539, holes: 3, sum holes: 12 */ /* sum bitfield members: 5 bits, bit holes: 1, sum bit holes: 3 bits */ /* last cacheline: 40 bytes */ }; After: pahole -C dsa_port net/dsa/slave.o struct dsa_port { union { struct net_device * master; /* 0 8 */ struct net_device * slave; /* 0 8 */ }; /* 0 8 */ const struct dsa_device_ops * tag_ops; /* 8 8 */ struct dsa_switch_tree * dst; /* 16 8 */ struct sk_buff * (*rcv)(struct sk_buff *, struct net_device *); /* 24 8 */ struct dsa_switch * ds; /* 32 8 */ unsigned int index; /* 40 4 */ enum { DSA_PORT_TYPE_UNUSED = 0, DSA_PORT_TYPE_CPU = 1, DSA_PORT_TYPE_DSA = 2, DSA_PORT_TYPE_USER = 3, } type; /* 44 4 */ const char * name; /* 48 8 */ struct dsa_port * cpu_dp; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ u8 mac[6]; /* 64 6 */ u8 stp_state; /* 70 1 */ u8 vlan_filtering:1; /* 71: 0 1 */ u8 learning:1; /* 71: 1 1 */ u8 lag_tx_enabled:1; /* 71: 2 1 */ u8 devlink_port_setup:1; /* 71: 3 1 */ u8 setup:1; /* 71: 4 1 */ /* XXX 3 bits hole, try to pack */ struct device_node * dn; /* 72 8 */ unsigned int ageing_time; /* 80 4 */ /* XXX 4 bytes hole, try to pack */ struct dsa_bridge * bridge; /* 88 8 */ struct devlink_port devlink_port; /* 96 288 */ /* --- cacheline 6 boundary (384 bytes) --- */ struct phylink * pl; /* 384 8 */ struct phylink_config pl_config; /* 392 40 */ struct net_device * lag_dev; /* 432 8 */ struct net_device * hsr_dev; /* 440 8 */ /* --- cacheline 7 boundary (448 bytes) --- */ struct list_head list; /* 448 16 */ const struct ethtool_ops * orig_ethtool_ops; /* 464 8 */ const struct dsa_netdevice_ops * netdev_ops; /* 472 8 */ struct mutex addr_lists_lock; /* 480 32 */ /* --- cacheline 8 boundary (512 bytes) --- */ struct list_head fdbs; /* 512 16 */ struct list_head mdbs; /* 528 16 */ /* size: 544, cachelines: 9, members: 30 */ /* sum members: 539, holes: 1, sum holes: 4 */ /* sum bitfield members: 5 bits, bit holes: 1, sum bit holes: 3 bits */ /* last cacheline: 32 bytes */ }; Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-05net: dsa: merge all bools of struct dsa_port into a single u8Vladimir Oltean1-7/+10
struct dsa_port has 5 bool members which create quite a number of 7 byte holes in the structure layout. By merging them all into bitfields of an u8, and placing that u8 in the 1-byte hole after dp->mac and dp->stp_state, we can reduce the structure size from 576 bytes to 552 bytes on arm64. Before: pahole -C dsa_port net/dsa/slave.o struct dsa_port { union { struct net_device * master; /* 0 8 */ struct net_device * slave; /* 0 8 */ }; /* 0 8 */ const struct dsa_device_ops * tag_ops; /* 8 8 */ struct dsa_switch_tree * dst; /* 16 8 */ struct sk_buff * (*rcv)(struct sk_buff *, struct net_device *); /* 24 8 */ enum { DSA_PORT_TYPE_UNUSED = 0, DSA_PORT_TYPE_CPU = 1, DSA_PORT_TYPE_DSA = 2, DSA_PORT_TYPE_USER = 3, } type; /* 32 4 */ /* XXX 4 bytes hole, try to pack */ struct dsa_switch * ds; /* 40 8 */ unsigned int index; /* 48 4 */ /* XXX 4 bytes hole, try to pack */ const char * name; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ struct dsa_port * cpu_dp; /* 64 8 */ u8 mac[6]; /* 72 6 */ u8 stp_state; /* 78 1 */ /* XXX 1 byte hole, try to pack */ struct device_node * dn; /* 80 8 */ unsigned int ageing_time; /* 88 4 */ bool vlan_filtering; /* 92 1 */ bool learning; /* 93 1 */ /* XXX 2 bytes hole, try to pack */ struct dsa_bridge * bridge; /* 96 8 */ struct devlink_port devlink_port; /* 104 288 */ /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */ bool devlink_port_setup; /* 392 1 */ /* XXX 7 bytes hole, try to pack */ struct phylink * pl; /* 400 8 */ struct phylink_config pl_config; /* 408 40 */ /* --- cacheline 7 boundary (448 bytes) --- */ struct net_device * lag_dev; /* 448 8 */ bool lag_tx_enabled; /* 456 1 */ /* XXX 7 bytes hole, try to pack */ struct net_device * hsr_dev; /* 464 8 */ struct list_head list; /* 472 16 */ const struct ethtool_ops * orig_ethtool_ops; /* 488 8 */ const struct dsa_netdevice_ops * netdev_ops; /* 496 8 */ struct mutex addr_lists_lock; /* 504 32 */ /* --- cacheline 8 boundary (512 bytes) was 24 bytes ago --- */ struct list_head fdbs; /* 536 16 */ struct list_head mdbs; /* 552 16 */ bool setup; /* 568 1 */ /* size: 576, cachelines: 9, members: 30 */ /* sum members: 544, holes: 6, sum holes: 25 */ /* padding: 7 */ }; After: pahole -C dsa_port net/dsa/slave.o struct dsa_port { union { struct net_device * master; /* 0 8 */ struct net_device * slave; /* 0 8 */ }; /* 0 8 */ const struct dsa_device_ops * tag_ops; /* 8 8 */ struct dsa_switch_tree * dst; /* 16 8 */ struct sk_buff * (*rcv)(struct sk_buff *, struct net_device *); /* 24 8 */ enum { DSA_PORT_TYPE_UNUSED = 0, DSA_PORT_TYPE_CPU = 1, DSA_PORT_TYPE_DSA = 2, DSA_PORT_TYPE_USER = 3, } type; /* 32 4 */ /* XXX 4 bytes hole, try to pack */ struct dsa_switch * ds; /* 40 8 */ unsigned int index; /* 48 4 */ /* XXX 4 bytes hole, try to pack */ const char * name; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ struct dsa_port * cpu_dp; /* 64 8 */ u8 mac[6]; /* 72 6 */ u8 stp_state; /* 78 1 */ u8 vlan_filtering:1; /* 79: 0 1 */ u8 learning:1; /* 79: 1 1 */ u8 lag_tx_enabled:1; /* 79: 2 1 */ u8 devlink_port_setup:1; /* 79: 3 1 */ u8 setup:1; /* 79: 4 1 */ /* XXX 3 bits hole, try to pack */ struct device_node * dn; /* 80 8 */ unsigned int ageing_time; /* 88 4 */ /* XXX 4 bytes hole, try to pack */ struct dsa_bridge * bridge; /* 96 8 */ struct devlink_port devlink_port; /* 104 288 */ /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */ struct phylink * pl; /* 392 8 */ struct phylink_config pl_config; /* 400 40 */ struct net_device * lag_dev; /* 440 8 */ /* --- cacheline 7 boundary (448 bytes) --- */ struct net_device * hsr_dev; /* 448 8 */ struct list_head list; /* 456 16 */ const struct ethtool_ops * orig_ethtool_ops; /* 472 8 */ const struct dsa_netdevice_ops * netdev_ops; /* 480 8 */ struct mutex addr_lists_lock; /* 488 32 */ /* --- cacheline 8 boundary (512 bytes) was 8 bytes ago --- */ struct list_head fdbs; /* 520 16 */ struct list_head mdbs; /* 536 16 */ /* size: 552, cachelines: 9, members: 30 */ /* sum members: 539, holes: 3, sum holes: 12 */ /* sum bitfield members: 5 bits, bit holes: 1, sum bit holes: 3 bits */ /* last cacheline: 40 bytes */ }; Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-05net: dsa: move dsa_port :: stp_state near dsa_port :: macVladimir Oltean1-1/+3
The MAC address of a port is 6 octets in size, and this creates a 2 octet hole after it. There are some other u8 members of struct dsa_port that we can put in that hole. One such member is the stp_state. Before: pahole -C dsa_port net/dsa/slave.o struct dsa_port { union { struct net_device * master; /* 0 8 */ struct net_device * slave; /* 0 8 */ }; /* 0 8 */ const struct dsa_device_ops * tag_ops; /* 8 8 */ struct dsa_switch_tree * dst; /* 16 8 */ struct sk_buff * (*rcv)(struct sk_buff *, struct net_device *); /* 24 8 */ enum { DSA_PORT_TYPE_UNUSED = 0, DSA_PORT_TYPE_CPU = 1, DSA_PORT_TYPE_DSA = 2, DSA_PORT_TYPE_USER = 3, } type; /* 32 4 */ /* XXX 4 bytes hole, try to pack */ struct dsa_switch * ds; /* 40 8 */ unsigned int index; /* 48 4 */ /* XXX 4 bytes hole, try to pack */ const char * name; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ struct dsa_port * cpu_dp; /* 64 8 */ u8 mac[6]; /* 72 6 */ /* XXX 2 bytes hole, try to pack */ struct device_node * dn; /* 80 8 */ unsigned int ageing_time; /* 88 4 */ bool vlan_filtering; /* 92 1 */ bool learning; /* 93 1 */ u8 stp_state; /* 94 1 */ /* XXX 1 byte hole, try to pack */ struct dsa_bridge * bridge; /* 96 8 */ struct devlink_port devlink_port; /* 104 288 */ /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */ bool devlink_port_setup; /* 392 1 */ /* XXX 7 bytes hole, try to pack */ struct phylink * pl; /* 400 8 */ struct phylink_config pl_config; /* 408 40 */ /* --- cacheline 7 boundary (448 bytes) --- */ struct net_device * lag_dev; /* 448 8 */ bool lag_tx_enabled; /* 456 1 */ /* XXX 7 bytes hole, try to pack */ struct net_device * hsr_dev; /* 464 8 */ struct list_head list; /* 472 16 */ const struct ethtool_ops * orig_ethtool_ops; /* 488 8 */ const struct dsa_netdevice_ops * netdev_ops; /* 496 8 */ struct mutex addr_lists_lock; /* 504 32 */ /* --- cacheline 8 boundary (512 bytes) was 24 bytes ago --- */ struct list_head fdbs; /* 536 16 */ struct list_head mdbs; /* 552 16 */ bool setup; /* 568 1 */ /* size: 576, cachelines: 9, members: 30 */ /* sum members: 544, holes: 6, sum holes: 25 */ /* padding: 7 */ }; After: pahole -C dsa_port net/dsa/slave.o struct dsa_port { union { struct net_device * master; /* 0 8 */ struct net_device * slave; /* 0 8 */ }; /* 0 8 */ const struct dsa_device_ops * tag_ops; /* 8 8 */ struct dsa_switch_tree * dst; /* 16 8 */ struct sk_buff * (*rcv)(struct sk_buff *, struct net_device *); /* 24 8 */ enum { DSA_PORT_TYPE_UNUSED = 0, DSA_PORT_TYPE_CPU = 1, DSA_PORT_TYPE_DSA = 2, DSA_PORT_TYPE_USER = 3, } type; /* 32 4 */ /* XXX 4 bytes hole, try to pack */ struct dsa_switch * ds; /* 40 8 */ unsigned int index; /* 48 4 */ /* XXX 4 bytes hole, try to pack */ const char * name; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ struct dsa_port * cpu_dp; /* 64 8 */ u8 mac[6]; /* 72 6 */ u8 stp_state; /* 78 1 */ /* XXX 1 byte hole, try to pack */ struct device_node * dn; /* 80 8 */ unsigned int ageing_time; /* 88 4 */ bool vlan_filtering; /* 92 1 */ bool learning; /* 93 1 */ /* XXX 2 bytes hole, try to pack */ struct dsa_bridge * bridge; /* 96 8 */ struct devlink_port devlink_port; /* 104 288 */ /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */ bool devlink_port_setup; /* 392 1 */ /* XXX 7 bytes hole, try to pack */ struct phylink * pl; /* 400 8 */ struct phylink_config pl_config; /* 408 40 */ /* --- cacheline 7 boundary (448 bytes) --- */ struct net_device * lag_dev; /* 448 8 */ bool lag_tx_enabled; /* 456 1 */ /* XXX 7 bytes hole, try to pack */ struct net_device * hsr_dev; /* 464 8 */ struct list_head list; /* 472 16 */ const struct ethtool_ops * orig_ethtool_ops; /* 488 8 */ const struct dsa_netdevice_ops * netdev_ops; /* 496 8 */ struct mutex addr_lists_lock; /* 504 32 */ /* --- cacheline 8 boundary (512 bytes) was 24 bytes ago --- */ struct list_head fdbs; /* 536 16 */ struct list_head mdbs; /* 552 16 */ bool setup; /* 568 1 */ /* size: 576, cachelines: 9, members: 30 */ /* sum members: 544, holes: 6, sum holes: 25 */ /* padding: 7 */ }; Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-04mac80211: Add stations iterator where the iterator function may sleepMartin Blumenstingl1-0/+21
ieee80211_iterate_active_interfaces() and ieee80211_iterate_active_interfaces_atomic() already exist, where the former allows the iterator function to sleep. Add ieee80211_iterate_stations() which is similar to ieee80211_iterate_stations_atomic() but allows the iterator to sleep. This is needed for adding SDIO support to the rtw88 driver. Some interators there are reading or writing registers. With the SDIO ops (sdio_readb, sdio_writeb and friends) this means that the iterator function may sleep. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Link: https://lore.kernel.org/r/20211228211501.468981-2-martin.blumenstingl@googlemail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-01-04Namespaceify mtu_expires sysctlxu xin1-0/+1
This patch enables the sysctl mtu_expires to be configured per net namespace. Signed-off-by: xu xin <xu.xin16@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-04Namespaceify min_pmtu sysctlxu xin1-0/+2
This patch enables the sysctl min_pmtu to be configured per net namespace. Signed-off-by: xu xin <xu.xin16@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-04udp6: Use Segment Routing Header for dest address if presentAndrew Lunn1-0/+19
When finding the socket to report an error on, if the invoking packet is using Segment Routing, the IPv6 destination address is that of an intermediate router, not the end destination. Extract the ultimate destination address from the segment address. This change allows traceroute to function in the presence of Segment Routing. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-04icmp: ICMPV6: Examine invoking packet for Segment Route Headers.Andrew Lunn1-0/+1
RFC8754 says: ICMP error packets generated within the SR domain are sent to source nodes within the SR domain. The invoking packet in the ICMP error message may contain an SRH. Since the destination address of a packet with an SRH changes as each segment is processed, it may not be the destination used by the socket or application that generated the invoking packet. For the source of an invoking packet to process the ICMP error message, the ultimate destination address of the IPv6 header may be required. The following logic is used to determine the destination address for use by protocol-error handlers. * Walk all extension headers of the invoking IPv6 packet to the routing extension header preceding the upper-layer header. - If routing header is type 4 Segment Routing Header (SRH) o The SID at Segment List[0] may be used as the destination address of the invoking packet. Mangle the skb so the network header points to the invoking packet inside the ICMP packet. The seg6 helpers can then be used on the skb to find any segment routing headers. If found, mark this fact in the IPv6 control block of the skb, and store the offset into the packet of the SRH. Then restore the skb back to its old state. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-04seg6: export get_srh() for ICMP handlingAndrew Lunn1-0/+1
An ICMP error message can contain in its message body part of an IPv6 packet which invoked the error. Such a packet might contain a segment router header. Export get_srh() so the ICMP code can make use of it. Since his changes the scope of the function from local to global, add the seg6_ prefix to keep the namespace clean. And move it into seg6.c so it is always available, not just when IPV6_SEG6_LWTUNNEL is enabled. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-04net/sched: act_ct: Fill offloading tuple iifidxPaul Blakey2-0/+54
Driver offloading ct tuples can use the information of which devices received the packets that created the offloaded connections, to more efficiently offload them only to the relevant device. Add new act_ct nf conntrack extension, which is used to store the skb devices before offloading the connection, and then fill in the tuple iifindex so drivers can get the device via metadata dissector match. Signed-off-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-02sctp: hold endpoint before calling cb in sctp_transport_lookup_processXin Long1-2/+1
The same fix in commit 5ec7d18d1813 ("sctp: use call_rcu to free endpoint") is also needed for dumping one asoc and sock after the lookup. Fixes: 86fdb3448cc1 ("sctp: ensure ep is not destroyed before doing the dump") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller7-3/+14
Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-12-30 The following pull-request contains BPF updates for your *net-next* tree. We've added 72 non-merge commits during the last 20 day(s) which contain a total of 223 files changed, 3510 insertions(+), 1591 deletions(-). The main changes are: 1) Automatic setrlimit in libbpf when bpf is memcg's in the kernel, from Andrii. 2) Beautify and de-verbose verifier logs, from Christy. 3) Composable verifier types, from Hao. 4) bpf_strncmp helper, from Hou. 5) bpf.h header dependency cleanup, from Jakub. 6) get_func_[arg|ret|arg_cnt] helpers, from Jiri. 7) Sleepable local storage, from KP. 8) Extend kfunc with PTR_TO_CTX, PTR_TO_MEM argument support, from Kumar. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-4/+5
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c commit 077cdda764c7 ("net/mlx5e: TC, Fix memory leak with rules with internal port") commit 31108d142f36 ("net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'") commit 4390c6edc0fb ("net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'") https://lore.kernel.org/all/20211229065352.30178-1-saeed@kernel.org/ net/smc/smc_wr.c commit 49dc9013e34b ("net/smc: Use the bitmap API when applicable") commit 349d43127dac ("net/smc: fix kernel panic caused by race of smc_sock") bitmap_zero()/memset() is removed by the fix Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29bpf: Invert the dependency between bpf-netns.h and netns/bpf.hJakub Kicinski1-1/+8
netns/bpf.h gets included by netdevice.h (thru net_namespace.h) which in turn gets included in a lot of places. We should keep netns/bpf.h as light-weight as possible. bpf-netns.h seems to contain more implementation details than deserves to be included in a netns header. It needs to pull in uapi/bpf.h to get various enum types. Move enum netns_bpf_attach_type to netns/bpf.h and invert the dependency. This makes netns/bpf.h fit the mold of a struct definition header more clearly, and drops the number of objects rebuilt when uapi/bpf.h is touched from 7.7k to 1.1k. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211230012742.770642-3-kuba@kernel.org
2021-12-29net: Add includes masked by netdevice.h including uapi/bpf.hJakub Kicinski1-0/+1
Add missing includes unmasked by the subsequent change. Mostly network drivers missing an include for XDP_PACKET_HEADROOM. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211230012742.770642-2-kuba@kernel.org
2021-12-29Merge tag 'for-net-next-2021-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-nextJakub Kicinski5-34/+119
Luiz Augusto von Dentz says: ==================== bluetooth-next pull request for net-next: - Add support for Foxconn MT7922A - Add support for Realtek RTL8852AE - Rework HCI event handling to use skb_pull_data * tag 'for-net-next-2021-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (62 commits) Bluetooth: MGMT: Fix spelling mistake "simultanous" -> "simultaneous" Bluetooth: vhci: Set HCI_QUIRK_VALID_LE_STATES Bluetooth: MGMT: Fix LE simultaneous roles UUID if not supported Bluetooth: hci_sync: Add check simultaneous roles support Bluetooth: hci_sync: Wait for proper events when connecting LE Bluetooth: hci_sync: Add support for waiting specific LE subevents Bluetooth: hci_sync: Add hci_le_create_conn_sync Bluetooth: hci_event: Use skb_pull_data when processing inquiry results Bluetooth: hci_sync: Push sync command cancellation to workqueue Bluetooth: hci_qca: Stop IBS timer during BT OFF Bluetooth: btusb: Add support for Foxconn MT7922A Bluetooth: btintel: Add missing quirks and msft ext for legacy bootloader Bluetooth: btusb: Add two more Bluetooth parts for WCN6855 Bluetooth: L2CAP: Fix using wrong mode Bluetooth: hci_sync: Fix not always pausing advertising when necessary Bluetooth: mgmt: Make use of mgmt_send_event_skb in MGMT_EV_DEVICE_CONNECTED Bluetooth: mgmt: Make use of mgmt_send_event_skb in MGMT_EV_DEVICE_FOUND Bluetooth: mgmt: Introduce mgmt_alloc_skb and mgmt_send_event_skb Bluetooth: btusb: Return error code when getting patch status failed Bluetooth: btusb: Handle download_firmware failure cases ... ==================== Link: https://lore.kernel.org/r/20211229211258.2290966-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29net: Don't include filter.h from net/sock.hJakub Kicinski4-1/+5
sock.h is pretty heavily used (5k objects rebuilt on x86 after it's touched). We can drop the include of filter.h from it and add a forward declaration of struct sk_filter instead. This decreases the number of rebuilt objects when bpf.h is touched from ~5k to ~1k. There's a lot of missing includes this was masking. Primarily in networking tho, this time. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/bpf/20211229004913.513372-1-kuba@kernel.org
2021-12-25sctp: use call_rcu to free endpointXin Long2-4/+5
This patch is to delay the endpoint free by calling call_rcu() to fix another use-after-free issue in sctp_sock_dump(): BUG: KASAN: use-after-free in __lock_acquire+0x36d9/0x4c20 Call Trace: __lock_acquire+0x36d9/0x4c20 kernel/locking/lockdep.c:3218 lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3844 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline] _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168 spin_lock_bh include/linux/spinlock.h:334 [inline] __lock_sock+0x203/0x350 net/core/sock.c:2253 lock_sock_nested+0xfe/0x120 net/core/sock.c:2774 lock_sock include/net/sock.h:1492 [inline] sctp_sock_dump+0x122/0xb20 net/sctp/diag.c:324 sctp_for_each_transport+0x2b5/0x370 net/sctp/socket.c:5091 sctp_diag_dump+0x3ac/0x660 net/sctp/diag.c:527 __inet_diag_dump+0xa8/0x140 net/ipv4/inet_diag.c:1049 inet_diag_dump+0x9b/0x110 net/ipv4/inet_diag.c:1065 netlink_dump+0x606/0x1080 net/netlink/af_netlink.c:2244 __netlink_dump_start+0x59a/0x7c0 net/netlink/af_netlink.c:2352 netlink_dump_start include/linux/netlink.h:216 [inline] inet_diag_handler_cmd+0x2ce/0x3f0 net/ipv4/inet_diag.c:1170 __sock_diag_cmd net/core/sock_diag.c:232 [inline] sock_diag_rcv_msg+0x31d/0x410 net/core/sock_diag.c:263 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2477 sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:274 This issue occurs when asoc is peeled off and the old sk is freed after getting it by asoc->base.sk and before calling lock_sock(sk). To prevent the sk free, as a holder of the sk, ep should be alive when calling lock_sock(). This patch uses call_rcu() and moves sock_put and ep free into sctp_endpoint_destroy_rcu(), so that it's safe to try to hold the ep under rcu_read_lock in sctp_transport_traverse_process(). If sctp_endpoint_hold() returns true, it means this ep is still alive and we have held it and can continue to dump it; If it returns false, it means this ep is dead and can be freed after rcu_read_unlock, and we should skip it. In sctp_sock_dump(), after locking the sk, if this ep is different from tsp->asoc->ep, it means during this dumping, this asoc was peeled off before calling lock_sock(), and the sk should be skipped; If this ep is the same with tsp->asoc->ep, it means no peeloff happens on this asoc, and due to lock_sock, no peeloff will happen either until release_sock. Note that delaying endpoint free won't delay the port release, as the port release happens in sctp_endpoint_destroy() before calling call_rcu(). Also, freeing endpoint by call_rcu() makes it safe to access the sk by asoc->base.sk in sctp_assocs_seq_show() and sctp_rcv(). Thanks Jones to bring this issue up. v1->v2: - improve the changelog. - add kfree(ep) into sctp_endpoint_destroy_rcu(), as Jakub noticed. Reported-by: syzbot+9276d76e83e3bcde6c99@syzkaller.appspotmail.com Reported-by: Lee Jones <lee.jones@linaro.org> Fixes: d25adbeb0cdb ("sctp: fix an use-after-free issue in sctp_sock_dump") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-3/+17
include/net/sock.h commit 8f905c0e7354 ("inet: fully convert sk->sk_rx_dst to RCU rules") commit 43f51df41729 ("net: move early demux fields close to sk_refcnt") https://lore.kernel.org/all/20211222141641.0caa0ab3@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23sctp: move hlist_node and hashent out of sctp_ep_commonXin Long2-6/+6
Struct sctp_ep_common is included in both asoc and ep, but hlist_node and hashent are only needed by ep after asoc_hashtable was dropped by Commit b5eff7128366 ("sctp: drop the old assoc hashtable of sctp"). So it is better to move hlist_node and hashent from sctp_ep_common to sctp_endpoint, and it saves some space for each asoc. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-23xfrm: rate limit SA mapping change message to user spaceAntony Antony1-0/+5
Kernel generates mapping change message, XFRM_MSG_MAPPING, when a source port chage is detected on a input state with UDP encapsulation set. Kernel generates a message for each IPsec packet with new source port. For a high speed flow per packet mapping change message can be excessive, and can overload the user space listener. Introduce rate limiting for XFRM_MSG_MAPPING message to the user space. The rate limiting is configurable via netlink, when adding a new SA or updating it. Use the new attribute XFRMA_MTIMER_THRESH in seconds. v1->v2 change: update xfrm_sa_len() v2->v3 changes: use u32 insted unsigned long to reduce size of struct xfrm_state fix xfrm_ompat size Reported-by: kernel test robot <lkp@intel.com> accept XFRM_MSG_MAPPING only when XFRMA_ENCAP is present Co-developed-by: Thomas Egerer <thomas.egerer@secunet.com> Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com> Signed-off-by: Antony Antony <antony.antony@secunet.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-12-22Merge tag 'mlx5-updates-2021-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxJakub Kicinski1-0/+8
Saeed Mahameed says: ==================== mlx5-updates-2021-12-21 1) From Shay Drory: Devlink user knobs to control device's EQ size This series provides knobs which will enable users to minimize memory consumption of mlx5 Functions (PF/VF/SF). mlx5 exposes two new generic devlink params for EQ size configuration and uses devlink generic param max_macs. LINK: https://lore.kernel.org/netdev/20211208141722.13646-1-shayd@nvidia.com/ 2) From Tariq and Lama, allocate software channel objects and statistics of a mlx5 netdevice private data dynamically upon first demand to save on memory. * tag 'mlx5-updates-2021-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Take packet_merge params directly from the RX res struct net/mlx5e: Allocate per-channel stats dynamically at first usage net/mlx5e: Use dynamic per-channel allocations in stats net/mlx5e: Allow profile-specific limitation on max num of channels net/mlx5e: Save memory by using dynamic allocation in netdev priv net/mlx5e: Add profile indications for PTP and QOS HTB features net/mlx5e: Use bitmap field for profile features net/mlx5: Remove the repeated declaration net/mlx5: Let user configure max_macs generic param devlink: Clarifies max_macs generic devlink param net/mlx5: Let user configure event_eq_size param devlink: Add new "event_eq_size" generic device param net/mlx5: Let user configure io_eq_size param devlink: Add new "io_eq_size" generic device param ==================== Link: https://lore.kernel.org/r/20211222031604.14540-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23netfilter: conntrack: tag conntracks picked up in local out hookFlorian Westphal1-0/+1
This allows to identify flows that originate from local machine in a followup patch. It would be possible to make this a ->status bit instead. For now I did not do that yet because I don't have a use-case for exposing this info to userspace. If one comes up the toggle can be replaced with a status bit. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-23netfilter: nf_tables: make counter support built-inPablo Neira Ayuso1-0/+6
Make counter support built-in to allow for direct call in case of CONFIG_RETPOLINE. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-22codel: remove unnecessary pkt_sched.h includeJakub Kicinski2-1/+2
Commit d068ca2ae2e6 ("codel: split into multiple files") moved all Qdisc-related code to codel_qdisc.h, move the include of pkt_sched.h as well. This is similar to the previous commit, although we don't care as much about incremental builds after pkt_sched.h was touched itself it is included by net/sch_generic.h which is modified ~20 times a year. This decreases the incremental build size after touching pkt_sched.h from 1592 to 617 objects. Fix unmasked missing includes in WiFi drivers. Acked-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20211221193941.3805147-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22codel: remove unnecessary sock.h includeJakub Kicinski2-1/+2
Since sock.h is modified relatively often (60 times in the last 12 months) it seems worthwhile to decrease the incremental build work. CoDel's header includes net/inet_ecn.h which in turn includes net/sock.h. codel.h is itself included by mac80211 which is included by much of the WiFi stack and drivers. Removing the net/inet_ecn.h include from CoDel breaks the dependecy between WiFi and sock.h. Commit d068ca2ae2e6 ("codel: split into multiple files") moved all the code which actually needs ECN helpers out to net/codel_impl.h, the include can be moved there as well. This decreases the incremental build size after touching sock.h from 4999 objects to 4051 objects. Fix unmasked missing includes in WiFi drivers. Acked-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20211221193941.3805147-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22Bluetooth: MGMT: Fix LE simultaneous roles UUID if not supportedLuiz Augusto von Dentz1-0/+1
If controller/driver don't support LE simultaneous roles its UUID shall be omitted when responding to MGMT_OP_READ_EXP_FEATURES_INFO. This also rework the support introducing HCI_LE_SIMULTANEOUS_ROLES flag so it can be detected when userspace wants to use or not. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-12-22Bluetooth: hci_sync: Add check simultaneous roles supportLuiz Augusto von Dentz1-0/+6
This attempts to check if the controller can act as both central and peripheral simultaneously and in case it does skip suspending advertising or in case of directed advertising don't fail if scanning. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-12-22Bluetooth: hci_sync: Add support for waiting specific LE subeventsLuiz Augusto von Dentz1-0/+1
This adds support for waiting for specific LE subevents instead of command status which may only indicate that the commands is in progress and a different event is used to complete the operation. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-12-22Bluetooth: hci_sync: Add hci_le_create_conn_syncLuiz Augusto von Dentz2-2/+5
This adds hci_le_create_conn_sync and make hci_le_connect use it instead of queueing multiple commands which may conflict with the likes of hci_update_passive_scan which uses hci_cmd_sync_queue. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-12-22Bluetooth: hci_sync: Push sync command cancellation to workqueueBenjamin Berg2-0/+2
syzbot reported that hci_cmd_sync_cancel may sleep from the wrong context. To avoid this, create a new work item that pushes the relevant parts into a different context. Note that we keep the old implementation with the name __hci_cmd_sync_cancel as the sleeping behaviour is desired in some cases. Reported-and-tested-by: syzbot+485cc00ea7cf41dfdbf1@syzkaller.appspotmail.com Fixes: c97a747efc93 ("Bluetooth: btusb: Cancel sync commands for certain URB errors") Signed-off-by: Benjamin Berg <bberg@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-12-21devlink: Add new "event_eq_size" generic device paramShay Drory1-0/+4
Add new device generic parameter to determine the size of the asynchronous control events EQ. For example, to reduce event EQ size to 64, execute: $ devlink dev param set pci/0000:06:00.0 \ name event_eq_size value 64 cmode driverinit $ devlink dev reload pci/0000:06:00.0 Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21devlink: Add new "io_eq_size" generic device paramShay Drory1-0/+4
Add new device generic parameter to determine the size of the I/O completion EQs. For example, to reduce I/O EQ size to 64, execute: $ devlink dev param set pci/0000:06:00.0 \ name io_eq_size value 64 cmode driverinit $ devlink dev reload pci/0000:06:00.0 Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21Merge tag 'mac80211-next-for-net-next-2021-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-nextJakub Kicinski2-31/+95
Johannes Berg says: ==================== This time we have: * ndo_fill_forward_path support in mac80211, to let drivers use it * association comeback notification for userspace, to be able to react more sensibly to long delays * support for background radar detection hardware in some chipsets * SA Query Procedures offload on the AP side * more logging if we find problems with HT/VHT/HE * various cleanups and minor fixes Conflicts: net/wireless/reg.c: e08ebd6d7b90 ("cfg80211: Acquire wiphy mutex on regulatory work") 701fdfe348f7 ("cfg80211: Enable regulatory enforcement checks for drivers supporting mesh iface") https://lore.kernel.org/r/20211221111950.57ecc6a7@canb.auug.org.au drivers/net/wireless/ath/ath10k/wmi.c: 7f599aeccbd2 ("cfg80211: Use the HE operation IE to determine a 6GHz BSS channel") 3bf2537ec2e3 ("ath10k: drop beacon and probe response which leak from other channel") https://lore.kernel.org/r/20211221115004.1cd6b262@canb.auug.org.au * tag 'mac80211-next-for-net-next-2021-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next: (32 commits) cfg80211: Enable regulatory enforcement checks for drivers supporting mesh iface rfkill: allow to get the software rfkill state cfg80211: refactor cfg80211_get_ies_channel_number() nl82011: clarify interface combinations wrt. channels nl80211: Add support to offload SA Query procedures for AP SME device nl80211: Add support to set AP settings flags with single attribute mac80211: add more HT/VHT/HE state logging cfg80211: Use the HE operation IE to determine a 6GHz BSS channel cfg80211: rename offchannel_chain structs to background_chain to avoid confusion with ETSI standard mac80211: Notify cfg80211 about association comeback cfg80211: Add support for notifying association comeback mac80211: introduce channel switch disconnect function cfg80211: Fix order of enum nl80211_band_iftype_attr documentation cfg80211: simplify cfg80211_chandef_valid() mac80211: Remove a couple of obsolete TODO mac80211: fix FEC flag in radio tap header mac80211: use coarse boottime for airtime fairness code ieee80211: change HE nominal packet padding value defines cfg80211: use ieee80211_bss_get_elem() instead of _get_ie() mac80211: Use memset_after() to clear tx status ... ==================== Link: https://lore.kernel.org/r/20211221112532.28708-1-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>