aboutsummaryrefslogtreecommitdiffstats
path: root/include (follow)
AgeCommit message (Collapse)AuthorFilesLines
2020-01-16netfilter: flowtable: use atomic bitwise operations for flow flagsPablo Neira Ayuso1-7/+9
Originally, all flow flag bits were set on only from the workqueue. With the introduction of the flow teardown state and hardware offload this is no longer true. Let's be safe and use atomic bitwise operation to operation with flow flags. Fixes: 59c466dd68e7 ("netfilter: nf_flow_table: add a new flow state for tearing down offloading") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16netfilter: flowtable: remove dying bit, use teardown bit insteadPablo Neira Ayuso1-5/+0
The dying bit removes the conntrack entry if the netdev that owns this flow is going down. Instead, use the teardown mechanism to push back the flow to conntrack to let the classic software path decide what to do with it. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16netfilter: nft_bitwise: correct uapi header comment.Jeremy Sowden1-1/+1
The comment documenting how bitwise expressions work includes a table which summarizes the mask and xor arguments combined to express the supported boolean operations. However, the row for OR: mask xor 0 x is incorrect. dreg = (sreg & 0) ^ x is not equivalent to: dreg = sreg | x What the code actually does is: dreg = (sreg & ~x) ^ x Update the documentation to match. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-15Merge tag 'batadv-next-for-davem-20200114' of git://git.open-mesh.org/linux-mergeDavid S. Miller2-2/+2
Simon Wunderlich says: ==================== This feature/cleanup patchset includes the following patches: - bump version strings, by Simon Wunderlich - fix typo and kerneldocs, by Sven Eckelmann - use WiFi txbitrate for B.A.T.M.A.N. V as fallback, by René Treffer - silence some endian sparse warnings by adding annotations, by Sven Eckelmann - Update copyright years to 2020, by Sven Eckelmann - Disable deprecated sysfs configuration by default, by Sven Eckelmann ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-15net: bridge: vlan: add rtnetlink group and notify supportNikolay Aleksandrov1-0/+2
Add a new rtnetlink group for bridge vlan notifications - RTNLGRP_BRVLAN and add support for sending vlan notifications (both single and ranges). No functional changes intended, the notification support will be used by later patches. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-15net: bridge: vlan: add rtm range supportNikolay Aleksandrov1-0/+1
Add a new vlandb nl attribute - BRIDGE_VLANDB_ENTRY_RANGE which causes RTM_NEWVLAN/DELVAN to act on a range. Dumps now automatically compress similar vlans into ranges. This will be also used when per-vlan options are introduced and vlans' options match, they will be put into a single range which is encapsulated in one netlink attribute. We need to run similar checks as br_process_vlan_info() does because these ranges will be used for options setting and they'll be able to skip br_process_vlan_info(). Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-15net: bridge: vlan: add rtm definitions and dump supportNikolay Aleksandrov2-0/+35
This patch adds vlan rtm definitions: - NEWVLAN: to be used for creating vlans, setting options and notifications - DELVLAN: to be used for deleting vlans - GETVLAN: used for dumping vlan information Dumping vlans which can span multiple messages is added now with basic information (vid and flags). We use nlmsg_parse() to validate the header length in order to be able to extend the message with filtering attributes later. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-14ipv6: Add "offload" and "trap" indications to routesIdo Schimmel1-1/+10
In a similar fashion to previous patch, add "offload" and "trap" indication to IPv6 routes. This is done by using two unused bits in 'struct fib6_info' to hold these indications. Capable drivers are expected to set these when processing the various in-kernel route notifications. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-14ipv4: Add "offload" and "trap" indications to routesIdo Schimmel2-0/+6
When performing L3 offload, routes and nexthops are usually programmed into two different tables in the underlying device. Therefore, the fact that a nexthop resides in hardware does not necessarily mean that all the associated routes also reside in hardware and vice-versa. While the kernel can signal to user space the presence of a nexthop in hardware (via 'RTNH_F_OFFLOAD'), it does not have a corresponding flag for routes. In addition, the fact that a route resides in hardware does not necessarily mean that the traffic is offloaded. For example, unreachable routes (i.e., 'RTN_UNREACHABLE') are programmed to trap packets to the CPU so that the kernel will be able to generate the appropriate ICMP error packet. This patch adds an "offload" and "trap" indications to IPv4 routes, so that users will have better visibility into the offload process. 'struct fib_alias' is extended with two new fields that indicate if the route resides in hardware or not and if it is offloading traffic from the kernel or trapping packets to it. Note that the new fields are added in the 6 bytes hole and therefore the struct still fits in a single cache line [1]. Capable drivers are expected to invoke fib_alias_hw_flags_set() with the route's key in order to set the flags. The indications are dumped to user space via a new flags (i.e., 'RTM_F_OFFLOAD' and 'RTM_F_TRAP') in the 'rtm_flags' field in the ancillary header. v2: * Make use of 'struct fib_rt_info' in fib_alias_hw_flags_set() [1] struct fib_alias { struct hlist_node fa_list; /* 0 16 */ struct fib_info * fa_info; /* 16 8 */ u8 fa_tos; /* 24 1 */ u8 fa_type; /* 25 1 */ u8 fa_state; /* 26 1 */ u8 fa_slen; /* 27 1 */ u32 tb_id; /* 28 4 */ s16 fa_default; /* 32 2 */ u8 offload:1; /* 34: 0 1 */ u8 trap:1; /* 34: 1 1 */ u8 unused:6; /* 34: 2 1 */ /* XXX 5 bytes hole, try to pack */ struct callback_head rcu __attribute__((__aligned__(8))); /* 40 16 */ /* size: 56, cachelines: 1, members: 12 */ /* sum members: 50, holes: 1, sum holes: 5 */ /* sum bitfield members: 8 bits (1 bytes) */ /* forced alignments: 1, forced holes: 1, sum forced holes: 5 */ /* last cacheline: 56 bytes */ } __attribute__((__aligned__(8))); Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-14ipv4: Encapsulate function arguments in a structIdo Schimmel1-0/+9
fib_dump_info() is used to prepare RTM_{NEW,DEL}ROUTE netlink messages using the passed arguments. Currently, the function takes 11 arguments, 6 of which are attributes of the route being dumped (e.g., prefix, TOS). The next patch will need the function to also dump to user space an indication if the route is present in hardware or not. Instead of passing yet another argument, change the function to take a struct containing the different route attributes. v2: * Name last argument of fib_dump_info() * Move 'struct fib_rt_info' to include/net/ip_fib.h so that it could later be passed to fib_alias_hw_flags_set() Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-14net: skbuff: disambiguate argument and member for skb_list_walk_safe helperJason A. Donenfeld1-3/+3
This worked before, because we made all callers name their next pointer "next". But in trying to be more "drop-in" ready, the silliness here is revealed. This commit fixes the problem by making the macro argument and the member use different names. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-14net: macsec: PN wrap callbackAntoine Tenart1-0/+2
Allow to call macsec_pn_wrapped from hardware drivers to notify when a PN rolls over. Some drivers might used an interrupt to implement this. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-14net: macsec: add nla support for changing the offloading selectionAntoine Tenart1-0/+11
MACsec offloading to underlying hardware devices is disabled by default (the software implementation is used). This patch adds support for changing this setting through the MACsec netlink interface. Many checks are done when enabling offloading on a given MACsec interface as there are limitations (it must be supported by the hardware, only a single interface can be offloaded on a given physical device at a time, rules can't be moved for now). Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-14net: phy: add MACsec ops in phy_deviceAntoine Tenart1-0/+7
This patch adds a reference to MACsec ops in the phy_device, to allow PHYs to support offloading MACsec operations. The phydev lock will be held while calling those helpers. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-14net: macsec: introduce MACsec opsAntoine Tenart1-0/+24
This patch introduces MACsec ops for drivers to support offloading MACsec operations. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-14net: macsec: introduce the macsec_context structureAntoine Tenart3-0/+30
This patch introduces the macsec_context structure. It will be used in the kernel to exchange information between the common MACsec implementation (macsec.c) and the MACsec hardware offloading implementations. This structure contains pointers to MACsec specific structures which contain the actual MACsec configuration, and to the underlying device (phydev for now). Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-14net: macsec: move some definitions in a dedicated headerAntoine Tenart1-0/+177
This patch moves some structure, type and identifier definitions into a MACsec specific header. This patch does not modify how the MACsec code is running and only move things around. This is a preparation for the future MACsec hardware offloading support, which will re-use those definitions outside macsec.c. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-14net: phy: Added IRQ print to phylink_bringup_phy()Florian Fainelli1-0/+2
The information about the PHY attached to the PHYLINK instance is useful but is missing the IRQ prints that phy_attached_info() adds. phy_attached_info() is a bit long and it would not be possible to use phylink_info() anyway. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-13net: stmmac: Initial support for TBSJose Abreu1-0/+1
Adds the initial hooks for TBS support. This needs a 32 byte descriptor in order for it to work with current HW. Adds all the logic for Enhanced Descriptors in main core but no HW related logic for now. Changes from v2: - Use bitfield for TBS status / support (Jakub) - Remove unneeded cache alignment (Jakub) - Fix checkpatch issues Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-01-13ptr_ring: add include of linux/mm.hJesper Dangaard Brouer1-0/+1
Commit 0bf7800f1799 ("ptr_ring: try vmalloc() when kmalloc() fails") started to use kvmalloc_array and kvfree, which are defined in mm.h, the previous functions kcalloc and kfree, which are defined in slab.h. Add the missing include of linux/mm.h. This went unnoticed as other include files happened to include mm.h. Fixes: 0bf7800f1799 ("ptr_ring: try vmalloc() when kmalloc() fails") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-01-12ixp4xx_eth: move platform_data definitionArnd Bergmann1-0/+19
The platform data is needed to compile the driver as standalone, so move it to a global location along with similar files. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-01-12wan: ixp4xx_hss: prepare compile testingArnd Bergmann1-0/+17
The ixp4xx_hss driver needs the platform data definition and the system clock rate to be compiled. Move both into a new platform_data header file. This is a prerequisite for compile testing, but turning on compile testing requires further patches to isolate the SoC headers. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-01-12mlx4: Bump up MAX_MSIX from 64 to 128Jonathan Lemon1-1/+1
On modern hardware with a large number of cpus and using XDP, the current MSIX limit is insufficient. Bump the limit in order to allow more queues. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-01-10devlink: move devlink documentation to subfolderJacob Keller1-2/+2
Combine the documentation for devlink into a subfolder, and provide an index.rst file that can be used to generally describe devlink. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-10devlink: add macro for "fw.psid"Jacob Keller1-0/+2
The "fw.psid" devlink info version is documented in devlink-info.rst, and used by one driver. However, there is no associated macro for this firmware version like there is for others. Add one now. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-09skb: add helpers to allocate ext independently from sk_buffPaolo Abeni1-0/+3
Currently we can allocate the extension only after the skb, this change allows the user to do the opposite, will simplify allocation failure handling from MPTCP. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-09tcp: clean ext on tx recyclePaolo Abeni1-0/+1
Otherwise we will find stray/unexpected/old extensions value on next iteration. On tcp_write_xmit() we can end-up splitting an already queued skb in two parts, via tso_fragment(). The newly created skb can be allocated via the tx cache and an upper layer will not be aware of it, so that upper layer cannot set the ext properly. Resetting the ext on recycle ensures that stale data is not propagated in to packet headers or elsewhere. An alternative would be add an additional hook in tso_fragment() or in sk_stream_alloc_skb() to init the ext for upper layers that need it. Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-09tcp: Export TCP functions and ops structMat Martineau1-0/+8
MPTCP will make use of tcp_send_mss() and tcp_push() when sending data to specific TCP subflows. tcp_request_sock_ipvX_ops and ipvX_specific will be referenced during TCP subflow creation. Co-developed-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-09tcp: coalesce/collapse must respect MPTCP extensionsMat Martineau2-0/+65
Coalesce and collapse of packets carrying MPTCP extensions is allowed when the newer packet has no extension or the extensions carried by both packets are equal. This allows merging of TSO packet trains and even cross-TSO packets, and does not require any additional action when moving data into existing SKBs. v3 -> v4: - allow collapsing, under mptcp_skb_can_collapse() constraint v5 -> v6: - clarify MPTCP skb extensions must always be cleared at allocation time Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-09mptcp: Add MPTCP to skb extensionsMat Martineau2-0/+31
Add enum value for MPTCP and update config dependencies v5 -> v6: - fixed '__unused' field size Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-09tcp, ulp: Add clone operation to tcp_ulp_opsMat Martineau1-0/+3
If ULP is used on a listening socket, icsk_ulp_ops and icsk_ulp_data are copied when the listener is cloned. Sometimes the clone is immediately deleted, which will invoke the release op on the clone and likely corrupt the listening socket's icsk_ulp_data. The clone operation is invoked immediately after the clone is copied and gives the ULP type an opportunity to set up the clone socket and its icsk_ulp_data. The MPTCP ULP clone will silently fallback to plain TCP on allocation failure, so 'clone()' does not need to return an error code. v6 -> v7: - move and rename ulp clone helper to make it inline-friendly v5 -> v6: - clarified MPTCP clone usage in commit message Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-09tcp: Add MPTCP option numberMat Martineau1-0/+1
TCP option 30 is allocated for MPTCP by the IANA. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-09tcp: Define IPPROTO_MPTCPMat Martineau2-1/+4
To open a MPTCP socket with socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP), IPPROTO_MPTCP needs a value that differs from IPPROTO_TCP. The existing IPPROTO numbers mostly map directly to IANA-specified protocol numbers. MPTCP does not have a protocol number allocated because MPTCP packets use the TCP protocol number. Use private number not used OTA. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-09sock: Make sk_protocol a 16-bit valueMat Martineau2-21/+6
Match the 16-bit width of skbuff->protocol. Fills an 8-bit hole so sizeof(struct sock) does not change. Also take care of BPF field access for sk_type/sk_protocol. Both of them are now outside the bitfield, so we can use load instructions without further shifting/masking. v5 -> v6: - update eBPF accessors, too (Intel's kbuild test robot) v2 -> v3: - keep 'sk_type' 2 bytes aligned (Eric) v1 -> v2: - preserve sk_pacing_shift as bit field (Eric) Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: bpf@vger.kernel.org Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-09net: Make sock protocol value checks more specificMat Martineau1-1/+0
SK_PROTOCOL_MAX is only used in two places, for DECNet and AX.25. The limits have more to do with the those protocol definitions than they do with the data type of sk_protocol, so remove SK_PROTOCOL_MAX and use U8_MAX directly. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller12-23/+78
The ungrafting from PRIO bug fixes in net, when merged into net-next, merge cleanly but create a build failure. The resolution used here is from Petr Machata. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds4-1/+49
Pull networking fixes from David Miller: 1) Missing netns pointer init in arp_tables, from Florian Westphal. 2) Fix normal tcp SACK being treated as D-SACK, from Pengcheng Yang. 3) Fix divide by zero in sch_cake, from Wen Yang. 4) Len passed to skb_put_padto() is wrong in qrtr code, from Carl Huang. 5) cmd->obj.chunk is leaked in sctp code error paths, from Xin Long. 6) cgroup bpf programs can be released out of order, fix from Roman Gushchin. 7) Make sure stmmac debugfs entry name is changed when device name changes, from Jiping Ma. 8) Fix memory leak in vlan_dev_set_egress_priority(), from Eric Dumazet. 9) SKB leak in lan78xx usb driver, also from Eric Dumazet. 10) Ridiculous TCA_FQ_QUANTUM values configured can cause loops in fq packet scheduler, reject them. From Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits) tipc: fix wrong connect() return code tipc: fix link overflow issue at socket shutdown netfilter: ipset: avoid null deref when IPSET_ATTR_LINENO is present netfilter: conntrack: dccp, sctp: handle null timeout argument atm: eni: fix uninitialized variable warning macvlan: do not assume mac_header is set in macvlan_broadcast() net: sch_prio: When ungrafting, replace with FIFO mlxsw: spectrum_qdisc: Ignore grafting of invisible FIFO MAINTAINERS: Remove myself as co-maintainer for qcom-ethqos gtp: fix bad unlock balance in gtp_encap_enable_socket pkt_sched: fq: do not accept silly TCA_FQ_QUANTUM tipc: remove meaningless assignment in Makefile tipc: do not add socket.o to tipc-y twice net: stmmac: dwmac-sun8i: Allow all RGMII modes net: stmmac: dwmac-sunxi: Allow all RGMII modes net: usb: lan78xx: fix possible skb leak net: stmmac: Fixed link does not need MDIO Bus vlan: vlan_changelink() should propagate errors vlan: fix memory leak in vlan_dev_set_egress_priority stmmac: debugfs entry name is not be changed when udev rename device name. ...
2020-01-08net: dsa: Get information about stacked DSA protocolFlorian Fainelli1-1/+2
It is possible to stack multiple DSA switches in a way that they are not part of the tree (disjoint) but the DSA master of a switch is a DSA slave of another. When that happens switch drivers may have to know this is the case so as to determine whether their tagging protocol has a remove chance of working. This is useful for specific switch drivers such as b53 where devices have been known to be stacked in the wild without the Broadcom tag protocol supporting that feature. This allows b53 to continue supporting those devices by forcing the disabling of Broadcom tags on the outermost switches if necessary. The get_tag_protocol() function is therefore updated to gain an additional enum dsa_tag_protocol argument which denotes the current tagging protocol used by the DSA master we are attached to, else DSA_TAG_PROTO_NONE for the top of the dsa_switch_tree. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08devlink: add support for reporter recovery completionVikas Gupta1-0/+2
It is possible that a reporter recovery completion do not finish successfully when recovery is triggered via devlink_health_reporter_recover as recovery could be processed in different context. In such scenario an error is returned by driver when recover hook is invoked and successful recovery completion is intimated later. Expose devlink recover done API to update recovery stats. Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller1-0/+6
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Missing netns context in arp_tables, from Florian Westphal. 2) Underflow in flowtable reference counter, from wenxu. 3) Fix incorrect ethernet destination address in flowtable offload, from wenxu. 4) Check for status of neighbour entry, from wenxu. 5) Fix NAT port mangling, from wenxu. 6) Unbind callbacks from destroy path to cleanup hardware properly on flowtable removal. 7) Fix missing casting statistics timestamp, add nf_flowtable_time_stamp and use it. 8) NULL pointer exception when timeout argument is null in conntrack dccp and sctp protocol helpers, from Florian Westphal. 9) Possible nul-dereference in ipset with IPSET_ATTR_LINENO, also from Florian. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08net: introduce skb_list_walk_safe for skb segment walkingJason A. Donenfeld1-0/+5
As part of the continual effort to remove direct usage of skb->next and skb->prev, this patch adds a helper for iterating through the singly-linked variant of skb lists, which are used for lists of GSO packet. The name "skb_list_..." has been chosen to match the existing function, "kfree_skb_list, which also operates on these singly-linked lists, and the "..._walk_safe" part is the same idiom as elsewhere in the kernel. This patch removes the helper from wireguard and puts it into linux/skbuff.h, while making it a bit more robust for general usage. In particular, parenthesis are added around the macro argument usage, and it now accounts for trying to iterate through an already-null skb pointer, which will simply run the iteration zero times. This latter enhancement means it can be used to replace both do { ... } while and while (...) open-coded idioms. This should take care of these three possible usages, which match all current methods of iterations. skb_list_walk_safe(segs, skb, next) { ... } skb_list_walk_safe(skb, skb, next) { ... } skb_list_walk_safe(segs, skb, segs) { ... } Gcc appears to generate efficient code for each of these. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08macvlan: do not assume mac_header is set in macvlan_broadcast()Eric Dumazet1-0/+8
Use of eth_hdr() in tx path is error prone. Many drivers call skb_reset_mac_header() before using it, but others do not. Commit 6d1ccff62780 ("net: reset mac header in dev_start_xmit()") attempted to fix this generically, but commit d346a3fae3ff ("packet: introduce PACKET_QDISC_BYPASS socket option") brought back the macvlan bug. Lets add a new helper, so that tx paths no longer have to call skb_reset_mac_header() only to get a pointer to skb->data. Hopefully we will be able to revert 6d1ccff62780 ("net: reset mac header in dev_start_xmit()") and save few cycles in transmit fast path. BUG: KASAN: use-after-free in __get_unaligned_cpu32 include/linux/unaligned/packed_struct.h:19 [inline] BUG: KASAN: use-after-free in mc_hash drivers/net/macvlan.c:251 [inline] BUG: KASAN: use-after-free in macvlan_broadcast+0x547/0x620 drivers/net/macvlan.c:277 Read of size 4 at addr ffff8880a4932401 by task syz-executor947/9579 CPU: 0 PID: 9579 Comm: syz-executor947 Not tainted 5.5.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374 __kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506 kasan_report+0x12/0x20 mm/kasan/common.c:639 __asan_report_load_n_noabort+0xf/0x20 mm/kasan/generic_report.c:145 __get_unaligned_cpu32 include/linux/unaligned/packed_struct.h:19 [inline] mc_hash drivers/net/macvlan.c:251 [inline] macvlan_broadcast+0x547/0x620 drivers/net/macvlan.c:277 macvlan_queue_xmit drivers/net/macvlan.c:520 [inline] macvlan_start_xmit+0x402/0x77f drivers/net/macvlan.c:559 __netdev_start_xmit include/linux/netdevice.h:4447 [inline] netdev_start_xmit include/linux/netdevice.h:4461 [inline] dev_direct_xmit+0x419/0x630 net/core/dev.c:4079 packet_direct_xmit+0x1a9/0x250 net/packet/af_packet.c:240 packet_snd net/packet/af_packet.c:2966 [inline] packet_sendmsg+0x260d/0x6220 net/packet/af_packet.c:2991 sock_sendmsg_nosec net/socket.c:639 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:659 __sys_sendto+0x262/0x380 net/socket.c:1985 __do_sys_sendto net/socket.c:1997 [inline] __se_sys_sendto net/socket.c:1993 [inline] __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1993 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x442639 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 5b 10 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffc13549e08 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000442639 RDX: 000000000000000e RSI: 0000000020000080 RDI: 0000000000000003 RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000403bb0 R14: 0000000000000000 R15: 0000000000000000 Allocated by task 9389: save_stack+0x23/0x90 mm/kasan/common.c:72 set_track mm/kasan/common.c:80 [inline] __kasan_kmalloc mm/kasan/common.c:513 [inline] __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:486 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:527 __do_kmalloc mm/slab.c:3656 [inline] __kmalloc+0x163/0x770 mm/slab.c:3665 kmalloc include/linux/slab.h:561 [inline] tomoyo_realpath_from_path+0xc5/0x660 security/tomoyo/realpath.c:252 tomoyo_get_realpath security/tomoyo/file.c:151 [inline] tomoyo_path_perm+0x230/0x430 security/tomoyo/file.c:822 tomoyo_inode_getattr+0x1d/0x30 security/tomoyo/tomoyo.c:129 security_inode_getattr+0xf2/0x150 security/security.c:1222 vfs_getattr+0x25/0x70 fs/stat.c:115 vfs_statx_fd+0x71/0xc0 fs/stat.c:145 vfs_fstat include/linux/fs.h:3265 [inline] __do_sys_newfstat+0x9b/0x120 fs/stat.c:378 __se_sys_newfstat fs/stat.c:375 [inline] __x64_sys_newfstat+0x54/0x80 fs/stat.c:375 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 9389: save_stack+0x23/0x90 mm/kasan/common.c:72 set_track mm/kasan/common.c:80 [inline] kasan_set_free_info mm/kasan/common.c:335 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/common.c:474 kasan_slab_free+0xe/0x10 mm/kasan/common.c:483 __cache_free mm/slab.c:3426 [inline] kfree+0x10a/0x2c0 mm/slab.c:3757 tomoyo_realpath_from_path+0x1a7/0x660 security/tomoyo/realpath.c:289 tomoyo_get_realpath security/tomoyo/file.c:151 [inline] tomoyo_path_perm+0x230/0x430 security/tomoyo/file.c:822 tomoyo_inode_getattr+0x1d/0x30 security/tomoyo/tomoyo.c:129 security_inode_getattr+0xf2/0x150 security/security.c:1222 vfs_getattr+0x25/0x70 fs/stat.c:115 vfs_statx_fd+0x71/0xc0 fs/stat.c:145 vfs_fstat include/linux/fs.h:3265 [inline] __do_sys_newfstat+0x9b/0x120 fs/stat.c:378 __se_sys_newfstat fs/stat.c:375 [inline] __x64_sys_newfstat+0x54/0x80 fs/stat.c:375 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff8880a4932000 which belongs to the cache kmalloc-4k of size 4096 The buggy address is located 1025 bytes inside of 4096-byte region [ffff8880a4932000, ffff8880a4933000) The buggy address belongs to the page: page:ffffea0002924c80 refcount:1 mapcount:0 mapping:ffff8880aa402000 index:0x0 compound_mapcount: 0 raw: 00fffe0000010200 ffffea0002846208 ffffea00028f3888 ffff8880aa402000 raw: 0000000000000000 ffff8880a4932000 0000000100000001 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8880a4932300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880a4932380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff8880a4932400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8880a4932480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880a4932500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Fixes: b863ceb7ddce ("[NET]: Add macvlan driver") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-07net/mlx5: limit the function in local scopeZhu Yanjun1-2/+0
The function mlx5_buf_alloc_node is only used by the function in the local scope. So it is appropriate to limit this function in the local scope. Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-06Merge tag 'trace-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds1-4/+4
Pull tracing fixes from Steven Rostedt: "Various tracing fixes: - kbuild found missing define of MCOUNT_INSN_SIZE for various build configs - Initialize variable to zero as gcc thinks it is used undefined (it really isn't but the code is subtle enough that this doesn't hurt) - Convert from do_div() to div64_ull() to prevent potential divide by zero - Unregister a trace point on error path in sched_wakeup tracer - Use signed offset for archs that can have stext not be first - A simple indentation fix (whitespace error)" * tag 'trace-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix indentation issue kernel/trace: Fix do not unregister tracepoints when register sched_migrate_task fail tracing: Change offset type to s32 in preempt/irq tracepoints ftrace: Avoid potential division by zero in function profiler tracing: Have stack tracer compile when MCOUNT_INSN_SIZE is not defined tracing: Define MCOUNT_INSN_SIZE when not defined without direct calls tracing: Initialize val to zero in parse_entry of inject code
2020-01-06net: ethernet: sxgbe: Rename Samsung to lowercaseKrzysztof Kozlowski1-1/+1
Fix up inconsistent usage of upper and lowercase letters in "Samsung" name. "SAMSUNG" is not an abbreviation but a regular trademarked name. Therefore it should be written with lowercase letters starting with capital letter. Although advertisement materials usually use uppercase "SAMSUNG", the lowercase version is used in all legal aspects (e.g. on Wikipedia and in privacy/legal statements on https://www.samsung.com/semiconductor/privacy-global/). Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06Merge tag 'spi-fix-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spiLinus Torvalds1-2/+2
Pull spi fixes from Mark Brown: "A small collection of fixes here, one to make the newly added PTP timestamping code more accurate, a few driver fixes and a fix for the core DT binding to document the fact that we support eight wire buses" * tag 'spi-fix-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: Document Octal mode as valid SPI bus width spi: spi-dw: Add lock protect dw_spi rx/tx to prevent concurrent calls spi: spi-fsl-dspi: Fix 16-bit word order in 32-bit XSPI mode spi: Don't look at TX buffer for PTP system timestamping spi: uniphier: Fix FIFO threshold
2020-01-06Merge tag 'rtc-5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linuxLinus Torvalds1-0/+8
Pull RTC fixes from Alexandre Belloni: "A few fixes for this cycle. The CMOS AltCentury support broke a few platforms with a recent BIOS so I reverted it. The mt6397 fix is not that critical but good to have. And finally, the sun6i fix repairs WiFi and BT on a few platforms. Summary: - cmos: revert AltCentury support on AMD/Hygon - mt6397: fix alarm register overwrite - sun6i: ensure clock is working on R40" * tag 'rtc-5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: rtc: cmos: Revert "rtc: Fix the AltCentury value on AMD/Hygon platform" rtc: mt6397: fix alarm register overwrite rtc: sun6i: Add support for RTC clocks on R40
2020-01-06netfilter: flowtable: add nf_flowtable_time_stampPablo Neira Ayuso1-0/+6
This patch adds nf_flowtable_time_stamp and updates the existing code to use it. This patch is also implicitly fixing up hardware statistic fetching via nf_flow_offload_stats() where casting to u32 is missing. Use nf_flow_timeout_delta() to fix this. Fixes: c29f74e0df7a ("netfilter: nf_flow_table: hardware offload support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: wenxu <wenxu@ucloud.cn>
2020-01-05net: mscc: ocelot: export ANA, DEV and QSYS registers to include/soc/msccVladimir Oltean3-0/+1170
Since the Felix DSA driver is implementing its own PHYLINK instance due to SoC differences, it needs access to the few registers that are common, mainly for flow control. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-05net: mscc: ocelot: make phy_mode a member of the common struct ocelot_portVladimir Oltean1-0/+2
The Ocelot switchdev driver and the Felix DSA one need it for different reasons. Felix (or at least the VSC9959 instantiation in NXP LS1028A) is integrated with the traditional NXP Layerscape PCS design which does not support runtime configuration of SerDes protocol. So it needs to pre-validate the phy-mode from the device tree and prevent PHYLINK from attempting to change it. For this, it needs to cache it in a private variable. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>