linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2019-07-08	ipv6: elide flowlabel check if no exclusive leases exist	Willem de Bruijn	8	-14/+45
	Processes can request ipv6 flowlabels with cmsg IPV6_FLOWINFO. If not set, by default an autogenerated flowlabel is selected. Explicit flowlabels require a control operation per label plus a datapath check on every connection (every datagram if unconnected). This is particularly expensive on unconnected sockets multiplexing many flows, such as QUIC. In the common case, where no lease is exclusive, the check can be safely elided, as both lease request and check trivially succeed. Indeed, autoflowlabel does the same even with exclusive leases. Elide the check if no process has requested an exclusive lease. fl6_sock_lookup previously returns either a reference to a lease or NULL to denote failure. Modify to return a real error and update all callers. On return NULL, they can use the label and will elide the atomic_dec in fl6_sock_release. This is an optimization. Robust applications still have to revert to requesting leases if the fast path fails due to an exclusive lease. Changes RFC->v1: - use static_key_false_deferred to rate limit jump label operations - call static_key_deferred_flush to stop timers on exit - move decrement out of RCU context - defer optimization also if opt data is associated with a lease - updated all fp6_sock_lookup callers, not just udp Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	bonding: fix value exported by Netlink for peer_notif_delay	Vincent Bernat	1	-1/+1
	IFLA_BOND_PEER_NOTIF_DELAY was set to the value of downdelay instead of peer_notif_delay. After this change, the correct value is exported. Fixes: 07a4ddec3ce9 ("bonding: add an option to specify a delay between peer notifications") Signed-off-by: Vincent Bernat <vincent@bernat.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	coallocate socket_wq with socket itself	Al Viro	7	-28/+15
	socket->wq is assign-once, set when we are initializing both struct socket it's in and struct socket_wq it points to. As the matter of fact, the only reason for separate allocation was the ability to RCU-delay freeing of socket_wq. RCU-delaying the freeing of socket itself gets rid of that need, so we can just fold struct socket_wq into the end of struct socket and simplify the life both for sock_alloc_inode() (one allocation instead of two) and for tun/tap oddballs, where we used to embed struct socket and struct socket_wq into the same structure (now - embedding just the struct socket). Note that reference to struct socket_wq in struct sock does remain a reference - that's unchanged. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	sockfs: switch to ->free_inode()	Al Viro	1	-3/+3
	we do have an RCU-delayed part there already (freeing the wq), so it's not like the pipe situation; moreover, it might be worth considering coallocating wq with the rest of struct sock_alloc. ->sk_wq in struct sock would remain a pointer as it is, but the object it normally points to would be coallocated with struct socket... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-09	xdp: fix race on generic receive path	Ilya Maximets	2	-9/+24
	Unlike driver mode, generic xdp receive could be triggered by different threads on different CPU cores at the same time leading to the fill and rx queue breakage. For example, this could happen while sending packets from two processes to the first interface of veth pair while the second part of it is open with AF_XDP socket. Need to take a lock for each generic receive to avoid race. Fixes: c497176cb2e4 ("xsk: add Rx receive functions and poll support") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: William Tu <u9012063@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08	selftests: forwarding: Test multipath hashing on inner IP pkts for GRE tunnel	Stephen Suryaputra	4	-0/+1220
	Add selftest scripts for multipath hashing on inner IP pkts when there is a single GRE tunnel but there are multiple underlay routes to reach the other end of the tunnel. Four cases are covered in these scripts: - IPv4 inner, IPv4 outer - IPv6 inner, IPv4 outer - IPv4 inner, IPv6 outer - IPv6 inner, IPv6 outer Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	ipv6: Support multipath hashing on inner IP pkts	Stephen Suryaputra	2	-0/+37
	Make the same support as commit 363887a2cdfe ("ipv4: Support multipath hashing on inner IP pkts for GRE tunnel") for outer IPv6. The hashing considers both IPv4 and IPv6 pkts when they are tunneled by IPv6 GRE. Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	ipv4: Multipath hashing on inner L3 needs to consider inner IPv6 pkts	Stephen Suryaputra	1	-4/+17
	Commit 363887a2cdfe ("ipv4: Support multipath hashing on inner IP pkts for GRE tunnel") supports multipath policy value of 2, Layer 3 or inner Layer 3 if present, but it only considers inner IPv4. There is a use case of IPv6 is tunneled by IPv4 GRE, thus add the ability to hash on inner IPv6 addresses. Fixes: 363887a2cdfe ("ipv4: Support multipath hashing on inner IP pkts for GRE tunnel") Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: pasemi: fix an use-after-free in pasemi_mac_phy_init()	Wen Yang	1	-1/+1
	The phy_dn variable is still being used in of_phy_connect() after the of_node_put() call, which may result in use-after-free. Fixes: 1dd2d06c0459 ("net: Rework pasemi_mac driver to use of_mdio infrastructure") Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: axienet: fix a potential double free in axienet_probe()	Wen Yang	1	-1/+0
	There is a possible use-after-free issue in the axienet_probe(): 1701: np = of_parse_phandle(pdev->dev.of_node, "axistream-connected", 0); 1702: if (np) { ... 1787: of_node_put(np); ---> released here 1788: lp->eth_irq = platform_get_irq(pdev, 0); 1789: } else { ... 1801: } 1802: if (IS_ERR(lp->dma_regs)) { ... 1805: of_node_put(np); ---> double released here 1806: goto free_netdev; 1807: } We solve this problem by removing the unnecessary of_node_put(). Fixes: 28ef9ebdb64c ("net: axienet: make use of axistream-connected attribute optional") Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Cc: Anirudha Sarangi <anirudh@xilinx.com> Cc: John Linn <John.Linn@xilinx.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Robert Hancock <hancock@sedsystems.ca> Cc: netdev@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Robert Hancock <hancock@sedsystems.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-09	selftests/bpf: fix test_reuseport_array on s390	Ilya Leoshkevich	1	-6/+15
	Fix endianness issue: passing a pointer to 64-bit fd as a 32-bit key does not work on big-endian architectures. So cast fd to 32-bits when necessary. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08	net: stmmac: enable clause 45 mdio support	Kweh Hock Leong	2	-8/+37
	DWMAC4 is capable to support clause 45 mdio communication. This patch enable the feature on stmmac_mdio_write() and stmmac_mdio_read() by following phy_write_mmd() and phy_read_mmd() mdiobus read write implementation format. Reviewed-by: Li, Yifan <yifan2.li@intel.com> Signed-off-by: Kweh Hock Leong <hock.leong.kweh@intel.com> Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: openvswitch: use netif_ovs_is_port() instead of opencode	Taehee Yoo	2	-4/+4
	Use netif_ovs_is_port() function instead of open code. This patch doesn't change logic. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	MAINTAINERS: Add page_pool maintainer entry	Jesper Dangaard Brouer	1	-0/+8
	In this release cycle the number of NIC drivers using page_pool will likely reach 4 drivers. It is about time to add a maintainer entry. Add myself and Ilias. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: mvpp2: cls: Add support for ETHER_FLOW	Maxime Chevallier	1	-0/+2
	Users can specify classification actions based on the 'ether' flow type. In that case, this will apply to all ethernet traffic, superseeding flows such as 'udp4' or 'tcp6'. Add support for this flow type in the PPv2 classifier, by mapping the ETHER_FLOW value to the corresponding entries in the classifier. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: mvpp2: cls: Report an error for unsupported flow types	Maxime Chevallier	1	-0/+4
	Add a missing check to detect flow types that we don't support, so that user can be informed of this. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	vsock/virtio: fix flush of works during the .remove()	Stefano Garzarella	1	-6/+9
	This patch moves the flush of works after vdev->config->del_vqs(vdev), because we need to be sure that no workers run before to free the 'vsock' object. Since we stopped the workers using the [tx\|rx\|event]_run flags, we are sure no one is accessing the device while we are calling vdev->config->reset(vdev), so we can safely move the workers' flush. Before the vdev->config->del_vqs(vdev), workers can be scheduled by VQ callbacks, so we must flush them after del_vqs(), to avoid use-after-free of 'vsock' object. Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	vsock/virtio: stop workers during the .remove()	Stefano Garzarella	1	-1/+50
	Before to call vdev->config->reset(vdev) we need to be sure that no one is accessing the device, for this reason, we add new variables in the struct virtio_vsock to stop the workers during the .remove(). This patch also add few comments before vdev->config->reset(vdev) and vdev->config->del_vqs(vdev). Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock	Stefano Garzarella	1	-24/+46
	Some callbacks used by the upper layers can run while we are in the .remove(). A potential use-after-free can happen, because we free the_virtio_vsock without knowing if the callbacks are over or not. To solve this issue we move the assignment of the_virtio_vsock at the end of .probe(), when we finished all the initialization, and at the beginning of .remove(), before to release resources. For the same reason, we do the same also for the vdev->priv. We use RCU to be sure that all callbacks that use the_virtio_vsock ended before freeing it. This is not required for callbacks that use vdev->priv, because after the vdev->config->del_vqs() we are sure that they are ended and will no longer be invoked. We also take the mutex during the .remove() to avoid that .probe() can run while we are resetting the device. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	Documentation: net: dsa: b53: Describe b53 configuration	Benedikt Spranger	2	-0/+184
	Document the different needs of documentation for the b53 driver. Signed-off-by: Benedikt Spranger <b.spranger@linutronix.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	Documentation: net: dsa: Describe DSA switch configuration	Benedikt Spranger	2	-0/+293
	Document DSA tagged and VLAN based switch configuration by showcases. Signed-off-by: Benedikt Spranger <b.spranger@linutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	nfp: tls: fix error return code in nfp_net_tls_add()	Wei Yongjun	1	-0/+1
	Fix to return negative error code -EINVAL from the error handling case instead of 0, as done elsewhere in this function. Fixes: 1f35a56cf586 ("nfp: tls: add/delete TLS TX connections") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	bnxt_en: add page_pool support	Andy Gospodarek	4	-7/+47
	This removes contention over page allocation for XDP_REDIRECT actions by adding page_pool support per queue for the driver. The performance for XDP_REDIRECT actions scales linearly with the number of cores performing redirect actions when using the page pools instead of the standard page allocator. v2: Fix up the error path from XDP registration, noted by Ilias Apalodimas. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	bnxt_en: optimized XDP_REDIRECT support	Andy Gospodarek	4	-10/+140
	This adds basic support for XDP_REDIRECT in the bnxt_en driver. Next patch adds the more optimized page pool support. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	bnxt_en: Refactor __bnxt_xmit_xdp().	Michael Chan	4	-13/+28
	__bnxt_xmit_xdp() is used by XDP_TX and ethtool loopback packet transmit. Refactor it so that it can be re-used by the XDP_REDIRECT logic. Restructure the TX interrupt handler logic to cleanly separate XDP_TX logic in preparation for XDP_REDIRECT. Acked-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	bnxt_en: rename some xdp functions	Andy Gospodarek	3	-7/+7
	Renaming bnxt_xmit_xdp to __bnxt_xmit_xdp to get ready for XDP_REDIRECT support and reduce confusion/namespace collision. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: ethernet: ti: cpsw: add XDP support	Ivan Khoronzhuk	4	-59/+488
	Add XDP support based on rx page_pool allocator, one frame per page. Page pool allocator is used with assumption that only one rx_handler is running simultaneously. DMA map/unmap is reused from page pool despite there is no need to map whole page. Due to specific of cpsw, the same TX/RX handler can be used by 2 network devices, so special fields in buffer are added to identify an interface the frame is destined to. Thus XDP works for both interfaces, that allows to test xdp redirect between two interfaces easily. Also, each rx queue have own page pools, but common for both netdevs. XDP prog is common for all channels till appropriate changes are added in XDP infrastructure. Also, once page_pool recycling becomes part of skb netstack some simplifications can be added, like removing page_pool_release_page() before skb receive. In order to keep rx_dev while redirect, that can be somehow used in future, do flush in rx_handler, that allows to keep rx dev the same while redirect. It allows to conform with tracing rx_dev pointed by Jesper. Also, there is probability, that XDP generic code can be extended to support multi ndev drivers like this one, using same rx queue for several ndevs, based on switchdev for instance or else. In this case, driver can be modified like exposed here: https://lkml.org/lkml/2019/7/3/243 Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: ethernet: ti: cpsw_ethtool: allow res split while down	Ivan Khoronzhuk	1	-2/+1
	That's possible to set channel num while interfaces are down. When interface gets up it should resplit budget. This resplit can happen after phy is up but only if speed is changed, so should be set before this, for this allow it to happen while changing number of channels, when interfaces are down. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: ethernet: ti: davinci_cpdma: allow desc split while down	Ivan Khoronzhuk	3	-9/+28
	That's possible to set ring params while interfaces are down. When interface gets up it uses number of descs to fill rx queue and on later on changes to create rx pools. Usually, this resplit can happen after phy is up, but it can be needed before this, so allow it to happen while setting number of rx descs, when interfaces are down. Also, if no dependency on intf state, move it to cpdma layer, where it should be. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: ethernet: ti: davinci_cpdma: add dma mapped submit	Ivan Khoronzhuk	2	-10/+83
	In case if dma mapped packet needs to be sent, like with XDP page pool, the "mapped" submit can be used. This patch adds dma mapped submit based on regular one. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: core: page_pool: add user refcnt and reintroduce page_pool_destroy	Ivan Khoronzhuk	5	-8/+40
	Jesper recently removed page_pool_destroy() (from driver invocation) and moved shutdown and free of page_pool into xdp_rxq_info_unreg(), in-order to handle in-flight packets/pages. This created an asymmetry in drivers create/destroy pairs. This patch reintroduce page_pool_destroy and add page_pool user refcnt. This serves the purpose to simplify drivers error handling as driver now drivers always calls page_pool_destroy() and don't need to track if xdp_rxq_info_reg_mem_model() was unsuccessful. This could be used for a special cases where a single RX-queue (with a single page_pool) provides packets for two net_device'es, and thus needs to register the same page_pool twice with two xdp_rxq_info structures. This patch is primarily to ease API usage for drivers. The recently merged netsec driver, actually have a bug in this area, which is solved by this API change. This patch is a modified version of Ivan Khoronzhuk's original patch. Link: https://lore.kernel.org/netdev/20190625175948.24771-2-ivan.khoronzhuk@linaro.org/ Fixes: 5c67bf0ec4d0 ("net: netsec: Use page_pool API") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	macb: fix build warning for !CONFIG_OF	Arnd Bergmann	1	-2/+2
	When CONFIG_OF is disabled, we get a harmless warning about the newly added variable: drivers/net/ethernet/cadence/macb_main.c:48:39: error: 'mgmt' defined but not used [-Werror=unused-variable] static struct sifive_fu540_macb_mgmt *mgmt; Move the variable closer to its use inside of the #ifdef. Fixes: c218ad559020 ("macb: Add support for SiFive FU540-C000") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	gve: fix unused variable/label warnings	Arnd Bergmann	1	-33/+33
	On unusual page sizes, we get harmless warnings: drivers/net/ethernet/google/gve/gve_rx.c:283:6: error: unused variable 'pagecount' [-Werror,-Wunused-variable] drivers/net/ethernet/google/gve/gve_rx.c:336:1: error: unused label 'have_skb' [-Werror,-Wunused-label] Change the preprocessor #if to regular if() to avoid this. Fixes: f5cedc84a30d ("gve: Add transmit and receive support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	sfc: Remove 'PCIE error reporting unavailable'	Martin Habets	1	-5/+1
	This is only at notice level but it was pointed out that no other driver does this. Also there is no action the user can take as it is really a property of the server. Signed-off-by: Martin Habets <mhabets@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: netsec: Sync dma for device on buffer allocation	Ilias Apalodimas	1	-3/+1
	cd1973a9215a ("net: netsec: Sync dma for device on buffer allocation") was merged on it's v1 instead of the v3. Merge the proper patch version Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	tools: bpftool: add completion for bpftool prog "loadall"	Quentin Monnet	1	-1/+8
	Bash completion for proposing the "loadall" subcommand is missing. Let's add it to the completion script. Add a specific case to propose "load" and "loadall" for completing: $ bpftool prog load ^ cursor is here Otherwise, completion considers that $command is in load\|loadall and starts making related completions (file or directory names, as the number of words on the command line is below 6), when the only suggested keywords should be "load" and "loadall" until one has been picked and a space entered after that to move to the next word. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08	bpf: avoid unused variable warning in tcp_bpf_rtt()	Arnd Bergmann	1	-3/+1
	When CONFIG_BPF is disabled, we get a warning for an unused variable: In file included from drivers/target/target_core_device.c:26: include/net/tcp.h:2226:19: error: unused variable 'tp' [-Werror,-Wunused-variable] struct tcp_sock *tp = tcp_sk(sk); The variable is only used in one place, so it can be replaced with its value there to avoid the warning. Fixes: 23729ff23186 ("bpf: add BPF_CGROUP_SOCK_OPS callback that is executed on every RTT") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08	bpf: cgroup: Fix build error without CONFIG_NET	YueHaibing	1	-0/+4
	If CONFIG_NET is not set and CONFIG_CGROUP_BPF=y, gcc building fails: kernel/bpf/cgroup.o: In function `cg_sockopt_func_proto': cgroup.c:(.text+0x237e): undefined reference to `bpf_sk_storage_get_proto' cgroup.c:(.text+0x2394): undefined reference to `bpf_sk_storage_delete_proto' kernel/bpf/cgroup.o: In function `__cgroup_bpf_run_filter_getsockopt': (.text+0x2a1f): undefined reference to `lock_sock_nested' (.text+0x2ca2): undefined reference to `release_sock' kernel/bpf/cgroup.o: In function `__cgroup_bpf_run_filter_setsockopt': (.text+0x3006): undefined reference to `lock_sock_nested' (.text+0x32bb): undefined reference to `release_sock' Reported-by: Hulk Robot <hulkci@huawei.com> Suggested-by: Stanislav Fomichev <sdf@fomichev.me> Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08	selftests/bpf: fix test_attach_probe map definition	Andrii Nakryiko	1	-8/+5
	ef99b02b23ef ("libbpf: capture value in BTF type info for BTF-defined map defs") changed BTF-defined maps syntax, while independently merged 1e8611bbdfc9 ("selftests/bpf: add kprobe/uprobe selftests") added new test using outdated syntax of maps. This patch fixes this test after corresponding patch sets were merged. Fixes: ef99b02b23ef ("libbpf: capture value in BTF type info for BTF-defined map defs") Fixes: 1e8611bbdfc9 ("selftests/bpf: add kprobe/uprobe selftests") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08	selftests/bpf: add verifier tests for wide stores	Stanislav Fomichev	2	-3/+50
	Make sure that wide stores are allowed at proper (aligned) addresses. Note that user_ip6 is naturally aligned on 8-byte boundary, so correct addresses are user_ip6[0] and user_ip6[2]. msg_src_ip6 is, however, aligned on a 4-byte bondary, so only msg_src_ip6[1] can be wide-stored. Cc: Andrii Nakryiko <andriin@fb.com> Cc: Yonghong Song <yhs@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08	bpf: sync bpf.h to tools/	Stanislav Fomichev	1	-3/+3
	Sync user_ip6 & msg_src_ip6 comments. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08	bpf: allow wide (u64) aligned stores for some fields of bpf_sock_addr	Stanislav Fomichev	3	-11/+23
	Since commit cd17d7770578 ("bpf/tools: sync bpf.h") clang decided that it can do a single u64 store into user_ip6[2] instead of two separate u32 ones: # 17: (18) r2 = 0x100000000000000 # ; ctx->user_ip6[2] = bpf_htonl(DST_REWRITE_IP6_2); # 19: (7b) (u64 )(r1 +16) = r2 # invalid bpf_context access off=16 size=8 >From the compiler point of view it does look like a correct thing to do, so let's support it on the kernel side. Credit to Andrii Nakryiko for a proper implementation of bpf_ctx_wide_store_ok. Cc: Andrii Nakryiko <andriin@fb.com> Cc: Yonghong Song <yhs@fb.com> Fixes: cd17d7770578 ("bpf/tools: sync bpf.h") Reported-by: kernel test robot <rong.a.chen@intel.com> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08	libbpf: add perf_buffer_ prefix to README	Andrii Nakryiko	1	-1/+2
	perf_buffer "object" is part of libbpf API now, add it to the list of libbpf function prefixes. Suggested-by: Daniel Borkman <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08	tools/bpftool: switch map event_pipe to libbpf's perf_buffer	Andrii Nakryiko	1	-137/+64
	Switch event_pipe implementation to rely on new libbpf perf buffer API (it's raw low-level variant). Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08	selftests/bpf: test perf buffer API	Andrii Nakryiko	2	-0/+125
	Add test verifying perf buffer API functionality. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08	libbpf: auto-set PERF_EVENT_ARRAY size to number of CPUs	Andrii Nakryiko	1	-7/+24
	For BPF_MAP_TYPE_PERF_EVENT_ARRAY typically correct size is number of possible CPUs. This is impossible to specify at compilation time. This change adds automatic setting of PERF_EVENT_ARRAY size to number of system CPUs, unless non-zero size is specified explicitly. This allows to adjust size for advanced specific cases, while providing convenient and logical defaults. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08	libbpf: add perf buffer API	Andrii Nakryiko	3	-0/+419
	BPF_MAP_TYPE_PERF_EVENT_ARRAY map is often used to send data from BPF program to user space for additional processing. libbpf already has very low-level API to read single CPU perf buffer, bpf_perf_event_read_simple(), but it's hard to use and requires a lot of code to set everything up. This patch adds perf_buffer abstraction on top of it, abstracting setting up and polling per-CPU logic into simple and convenient API, similar to what BCC provides. perf_buffer__new() sets up per-CPU ring buffers and updates corresponding BPF map entries. It accepts two user-provided callbacks: one for handling raw samples and one for get notifications of lost samples due to buffer overflow. perf_buffer__new_raw() is similar, but provides more control over how perf events are set up (by accepting user-provided perf_event_attr), how they are handled (perf_event_header pointer is passed directly to user-provided callback), and on which CPUs ring buffers are created (it's possible to provide a list of CPUs and corresponding map keys to update). This API allows advanced users fuller control. perf_buffer__poll() is used to fetch ring buffer data across all CPUs, utilizing epoll instance. perf_buffer__free() does corresponding clean up and unsets FDs from BPF map. All APIs are not thread-safe. User should ensure proper locking/coordination if used in multi-threaded set up. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-07	hinic: add fw version query	Xue Chaojing	4	-0/+54
	This patch adds firmware version query in ethtool -i. Signed-off-by: Xue Chaojing <xuechaojing@huawei.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-07	gve: Fix error return code in gve_alloc_qpls()	Wei Yongjun	1	-1/+3
	Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: f5cedc84a30d ("gve: Add transmit and receive support") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-07	net: dsa: vsc73xx: Assert reset if iCPU is enabled	Pawel Dembicki	1	-19/+17
	Driver allow to use devices with disabled iCPU only. Some devices have pre-initialised iCPU by bootloader. That state make switch unmanaged. This patch force reset if device is in unmanaged state. In the result chip lost internal firmware from RAM and it can be managed. Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>