aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation/iostats.txt (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2017-02-02net: remove useless pfmemalloc settingEric Dumazet1-1/+0
When __alloc_skb() allocates an skb from fast clone cache, setting pfmemalloc on the clone is not needed. Clone will be properly initialized later at skb_clone() time, including pfmemalloc field, as it is included in the headers_start/headers_end section which is fully copied. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02net: dsa: mv88e6xxx: Fix typ0 when configuring 2.5GbpsAndrew Lunn1-1/+1
In order to enable 2.5Gbps mode, we need the base speed of 10G, plus the Alt bit setting. Fix a typ0 that used 1Gb base speed. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02net: dsa: mv88e6xxx: Fix ATU age timer for MV88E6390Andrew Lunn1-6/+6
The MV88E6390 family uses a different ATU age timer coefficient. Fix the info structures. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02net: phy: marvell: Add support for 88e1545 PHYAndrew Lunn2-0/+22
The 88e1545 PHYs are discrete Marvell PHYs, found in a quad package on the zii-devel-b board. Add support for it to the Marvell PHY driver. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02net: stmmac: Fix wrong message in stmmac_probe_config_dtHeiner Kallweit1-1/+1
Most likely a copy & paste error in referenced commit. Restore the debug message to what it was before. Fixes: f573c0b9c4e0 ("stmmac: move stmmac_clk, pclk, clk_ptp_ref and stmmac_rst to platform structure") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-By: Joao Pinto <jpinto@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02net: stmmac: add separate warning for PTP not being supported by HWHeiner Kallweit1-2/+4
Chips like Amlogic S905GXBB are supported by this driver but don't have support for PTP. Add a separate warning for missing HW support to differentiate it from other actual failures. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02net: stmmac: don't set tx delay in RGMII_ID and RGMII_TXID modeHeiner Kallweit1-7/+9
As documented in Documentation/devicetree/bindings/net/ethernet.txt, in RGMII_ID and RGMII_TXID mode the MAC should not add a tx delay. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02unix: add ioctl to open a unix socket file with O_PATHAndrey Vagin2-0/+43
This ioctl opens a file to which a socket is bound and returns a file descriptor. The caller has to have CAP_NET_ADMIN in the socket network namespace. Currently it is impossible to get a path and a mount point for a socket file. socket_diag reports address, device ID and inode number for unix sockets. An address can contain a relative path or a file may be moved somewhere. And these properties say nothing about a mount namespace and a mount point of a socket file. With the introduced ioctl, we can get a path by reading /proc/self/fd/X and get mnt_id from /proc/self/fdinfo/X. In CRIU we are going to use this ioctl to dump and restore unix socket. Here is an example how it can be used: $ strace -e socket,bind,ioctl ./test /tmp/test_sock socket(AF_UNIX, SOCK_STREAM, 0) = 3 bind(3, {sa_family=AF_UNIX, sun_path="test_sock"}, 11) = 0 ioctl(3, SIOCUNIXFILE, 0) = 4 ^Z $ ss -a | grep test_sock u_str LISTEN 0 1 test_sock 17798 * 0 $ ls -l /proc/760/fd/{3,4} lrwx------ 1 root root 64 Feb 1 09:41 3 -> 'socket:[17798]' l--------- 1 root root 64 Feb 1 09:41 4 -> /tmp/test_sock $ cat /proc/760/fdinfo/4 pos: 0 flags: 012000000 mnt_id: 40 $ cat /proc/self/mountinfo | grep "^40\s" 40 19 0:37 / /tmp rw shared:23 - tmpfs tmpfs rw Signed-off-by: Andrei Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02net: phy: Marvell: Add mv88e6390 internal PHYAndrew Lunn2-0/+26
The mv88e6390 Ethernet switch has internal PHYs. These PHYs don't have an model ID in the ID2 register. So the MDIO driver in the switch intercepts reads to this register, and returns the switch family ID. Extend the Marvell PHY driver by including this ID, and treat the PHY as a 88E1540. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02net: dsa: mv88e6xxx: Workaround missing PHY ID on mv88e6390Andrew Lunn1-0/+8
The internal PHYs of the mv88e6390 do not have a model ID. Trap any calls to the ID register, and if it is zero, return the ID for the mv88e6390. The Marvell PHY driver can then bind to this ID. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02xgene_enet: remove bogus forward declarationsArnd Bergmann3-56/+48
The device match tables for both the xgene_enet driver and its phy driver have forward declarations that declare an array without a length, leading to a clang warning when they are not followed by an actual defitinition: drivers/net/ethernet/apm/xgene/../../../phy/mdio-xgene.h:135:34: warning: tentative array definition assumed to have one element drivers/net/ethernet/apm/xgene/xgene_enet_main.c:33:36: warning: tentative array definition assumed to have one element The declarations for the mdio driver are even in a header file, so they cause duplicate definitions of the tables for each file that includes them. This removes all four forward declarations and moves the actual definitions up a little, so they are in front of their first user. For the OF match tables, this means having to remove the #ifdef around them, and passing the actual structure into of_match_device(). This has no effect on the generated object code though, as the of_match_device function has an empty stub that does not evaluate its argument, and the symbol gets dropped either way. Fixes: 43b3cf6634a4 ("drivers: net: phy: xgene: Add MDIO driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01net: phy: broadcom: rehook BCM54612E specific initRafał Miłecki1-34/+33
This extra BCM54612E code in PHY driver isn't really aneg specific. Even without it aneg works OK but the problem is no packets pass through PHY. Moreover putting this code inside config_aneg callback didn't allow resuming PHY correctly. When driver called phy_stop and phy_start it was putting PHY machine into RESUMING state. After that machine was switching into AN and NOLINK without ever calling phy_start_aneg. This prevented this extra setup from being called and PHY didn't work. This change has been verified to fix network on BCM47186B0 SoC device with BCM54612E. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01net/sched: act_psample: Remove unnecessary ASSERT_RTNLYotam Gigi1-1/+0
The ASSERT_RTNL is not necessary in the init function, as it does not touch any rtnl protected structures, as opposed to the mirred action which does have to hold a net device. Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01net/sched: act_sample: Fix error path in initYotam Gigi1-1/+4
Fix error path of in sample init, by releasing the tc hash in case of failure in psample_group creation. Fixes: 5c5670fae430 ("net/sched: Introduce sample tc action") Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01tcp: fix 0 divide in __tcp_select_window()Eric Dumazet1-2/+4
syszkaller fuzzer was able to trigger a divide by zero, when TCP window scaling is not enabled. SO_RCVBUF can be used not only to increase sk_rcvbuf, also to decrease it below current receive buffers utilization. If mss is negative or 0, just return a zero TCP window. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01sh_eth: fix wakeup event reporting from MagicPacketNiklas Söderlund1-2/+2
If a link change interrupt happens along side the MagicPacket interrupt and the link change interrupt is ignored the interrupt handler will return and the wakeup event is not registered. Fix this by moving the MagicPacket check before the link change check. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reported-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01sh_eth: align usage of sh_eth_modify() with rest of driverNiklas Söderlund1-1/+1
To be consistent with the rest of the driver when setting bits using sh_eth_modify() the same bit should also be cleared. This have no functional change and should have been done from the start. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Suggested-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01ethernet: aquantia: fix dma_mapping_error testDan Carpenter1-2/+3
dma_mapping_error() returns 1 if there is an error and 0 if not. Fixes: 018423e90bee ("net: ethernet: aquantia: Add ring support code") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01ipv6: pointer math error in ip6_tnl_parse_tlv_enc_lim()Dan Carpenter1-1/+1
Casting is a high precedence operation but "off" and "i" are in terms of bytes so we need to have some parenthesis here. Fixes: fbfa743a9d2a ("ipv6: fix ip6_tnl_parse_tlv_enc_lim()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01liquidio: fix for iq and droq cnts going negativeSatanand Burla4-2/+15
Flush the mmio writes before releasing spin locks. if the maintained counts get too high > 2M force writeback of the counts to clear them Signed-off-by: Satanand Burla <satananda.burla@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com> Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01net: ipv6: add NLM_F_APPEND in notifications when applicableDavid Ahern1-0/+3
IPv6 does not set the NLM_F_APPEND flag in notifications to signal that a NEWROUTE is an append versus a new route or a replaced one. Add the flag if the request has it. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01net: fix ndo_features_check/ndo_fix_features comment orderingDimitris Michailidis1-14/+15
Commit cdba756f5803a2 ("net: move ndo_features_check() close to ndo_start_xmit()") inadvertently moved the doc comment for .ndo_fix_features instead of .ndo_features_check. Fix the comment ordering. Fixes: cdba756f5803a2 ("net: move ndo_features_check() close to ndo_start_xmit()") Signed-off-by: Dimitris Michailidis <dmichail@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01net: ethernet: ti: cpsw: fix NULL pointer dereference in switch modeGrygorii Strashko1-1/+1
In switch mode on struct cpsw_slave->ndev field will be initialized with proper value only for the one cpsw slave port, as result cpsw_get_usage_count() will generate "Unable to handle kernel NULL pointer dereference" exception when first ethernet interface is opening cpsw_ndo_open(). This issue causes boot regression on AM335x EVM and reproducible on am57xx-evm (switch mode). Fix it by adding additional check for !cpsw->slaves[i].ndev in cpsw_get_usage_count(). Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Fixes: 03fd01ad0eea ("net: ethernet: ti: cpsw: don't duplicate ndev_running") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01net: reduce skb_warn_bad_offload() noiseEric Dumazet1-3/+9
Dmitry reported warnings occurring in __skb_gso_segment() [1] All SKB_GSO_DODGY producers can allow user space to feed packets that trigger the current check. We could prevent them from doing so, rejecting packets, but this might add regressions to existing programs. It turns out our SKB_GSO_DODGY handlers properly set up checksum information that is needed anyway when packets needs to be segmented. By checking again skb_needs_check() after skb_mac_gso_segment(), we should remove these pesky warnings, at a very minor cost. With help from Willem de Bruijn [1] WARNING: CPU: 1 PID: 6768 at net/core/dev.c:2439 skb_warn_bad_offload+0x2af/0x390 net/core/dev.c:2434 lo: caps=(0x000000a2803b7c69, 0x0000000000000000) len=138 data_len=0 gso_size=15883 gso_type=4 ip_summed=0 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 6768 Comm: syz-executor1 Not tainted 4.9.0 #5 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 ffff8801c063ecd8 ffffffff82346bdf ffffffff00000001 1ffff100380c7d2e ffffed00380c7d26 0000000041b58ab3 ffffffff84b37e38 ffffffff823468f1 ffffffff84820740 ffffffff84f289c0 dffffc0000000000 ffff8801c063ee20 Call Trace: [<ffffffff82346bdf>] __dump_stack lib/dump_stack.c:15 [inline] [<ffffffff82346bdf>] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51 [<ffffffff81827e34>] panic+0x1fb/0x412 kernel/panic.c:179 [<ffffffff8141f704>] __warn+0x1c4/0x1e0 kernel/panic.c:542 [<ffffffff8141f7e5>] warn_slowpath_fmt+0xc5/0x100 kernel/panic.c:565 [<ffffffff8356cbaf>] skb_warn_bad_offload+0x2af/0x390 net/core/dev.c:2434 [<ffffffff83585cd2>] __skb_gso_segment+0x482/0x780 net/core/dev.c:2706 [<ffffffff83586f19>] skb_gso_segment include/linux/netdevice.h:3985 [inline] [<ffffffff83586f19>] validate_xmit_skb+0x5c9/0xc20 net/core/dev.c:2969 [<ffffffff835892bb>] __dev_queue_xmit+0xe6b/0x1e70 net/core/dev.c:3383 [<ffffffff8358a2d7>] dev_queue_xmit+0x17/0x20 net/core/dev.c:3424 [<ffffffff83ad161d>] packet_snd net/packet/af_packet.c:2930 [inline] [<ffffffff83ad161d>] packet_sendmsg+0x32ed/0x4d30 net/packet/af_packet.c:2955 [<ffffffff834f0aaa>] sock_sendmsg_nosec net/socket.c:621 [inline] [<ffffffff834f0aaa>] sock_sendmsg+0xca/0x110 net/socket.c:631 [<ffffffff834f329a>] ___sys_sendmsg+0x8fa/0x9f0 net/socket.c:1954 [<ffffffff834f5e58>] __sys_sendmsg+0x138/0x300 net/socket.c:1988 [<ffffffff834f604d>] SYSC_sendmsg net/socket.c:1999 [inline] [<ffffffff834f604d>] SyS_sendmsg+0x2d/0x50 net/socket.c:1995 [<ffffffff84371941>] entry_SYSCALL_64_fastpath+0x1f/0xc2 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01net/sched: matchall: Fix configuration raceYotam Gigi1-82/+45
In the current version, the matchall internal state is split into two structs: cls_matchall_head and cls_matchall_filter. This makes little sense, as matchall instance supports only one filter, and there is no situation where one exists and the other does not. In addition, that led to some races when filter was deleted while packet was processed. Unify that two structs into one, thus simplifying the process of matchall creation and deletion. As a result, the new, delete and get callbacks have a dummy implementation where all the work is done in destroy and change callbacks, as was done in cls_cgroup. Fixes: bf3994d2ed31 ("net/sched: introduce Match-all classifier") Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01rtnetlink: Handle IFLA_MASTER parameter when processing rtnl_newlinkTheuns Verwoerd1-1/+6
Allow a master interface to be specified as one of the parameters when creating a new interface via rtnl_newlink. Previously this would require invoking interface creation, waiting for it to complete, and then separately binding that new interface to a master. In particular, this is used when creating a macvlan child interface for VRRP in a VRF configuration, allowing the interface creator to specify directly what master interface should be inherited by the child, without having to deal with asynchronous complications and potential race conditions. Signed-off-by: Theuns Verwoerd <theuns.verwoerd@alliedtelesis.co.nz> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01be2net: fix initial MAC settingIvan Vecera1-5/+28
Recent commit 34393529163a ("be2net: fix MAC addr setting on privileged BE3 VFs") allows privileged BE3 VFs to set its MAC address during initialization. Although the initial MAC for such VFs is already programmed by parent PF the subsequent setting performed by VF is OK, but in certain cases (after fresh boot) this command in VF can fail. The MAC should be initialized only when: 1) no MAC is programmed (always except BE3 VFs during first init) 2) programmed MAC is different from requested (e.g. MAC is set when interface is down). In this case the initial MAC programmed by PF needs to be deleted. The adapter->dev_mac contains MAC address currently programmed in HW so it should be zeroed when the MAC is deleted from HW and should not be filled when MAC is set when interface is down in be_mac_addr_set() as no programming is performed in this case. Example of failure without the fix (immediately after fresh boot): # ip link set eth0 up <- eth0 is BE3 PF be2net 0000:01:00.0 eth0: Link is Up # echo 1 > /sys/class/net/eth0/device/sriov_numvfs <- Create 1 VF ... be2net 0000:01:04.0: Emulex OneConnect(be3): VF port 0 # ip link set eth8 up <- eth8 is created privileged VF be2net 0000:01:04.0: opcode 59-1 failed:status 1-76 RTNETLINK answers: Input/output error # echo 0 > /sys/class/net/eth0/device/sriov_numvfs <- Delete VF iommu: Removing device 0000:01:04.0 from group 33 ... # echo 1 > /sys/class/net/eth0/device/sriov_numvfs <- Create it again iommu: Removing device 0000:01:04.0 from group 33 ... # ip link set eth8 up be2net 0000:01:04.0 eth8: Link is Up Initialization is now OK. v2 - Corrected the comment and condition check suggested by Suresh & Harsha Fixes: 34393529163a ("be2net: fix MAC addr setting on privileged BE3 VFs") Cc: Sathya Perla <sathya.perla@broadcom.com> Cc: Ajit Khaparde <ajit.khaparde@broadcom.com> Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Cc: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Ivan Vecera <cera@cera.cz> Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-31net: bgmac: use PHY subsystem for initializing PHYRafał Miłecki1-0/+10
This adds support for using bgmac with PHYs supported by standalone PHY drivers. Having any PHY initialization in bgmac is hacky and shouldn't be extended but rather removed if anyone has hardware to test it. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-31net: bgmac: drop struct bcma_mdio we don't need anymoreRafał Miłecki3-60/+42
Adding struct bcma_mdio was a workaround for bcma code not having access to the struct bgmac used in the core code. Now we don't duplicate this struct we can just use it internally in bcma code. This simplifies code & allows access to all bgmac driver details from all places in bcma code. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-31net: bgmac: allocate struct bgmac just once & don't copy itRafał Miłecki4-13/+21
So far were were allocating struct bgmac in 3 places: platform code, bcma code and shared bgmac_enet_probe function. The reason for this was bgmac_enet_probe: 1) Requiring early-filled struct bgmac 2) Calling alloc_etherdev on its own in order to use netdev_priv later This solution got few drawbacks: 1) Was duplicating allocating code 2) Required copying early-filled struct 3) Resulted in platform/bcma code having access only to unused struct Solve this situation by simply extracting some probe code into the new bgmac_alloc function. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-31net: dsa: bcm_sf2: Fix build moduleFlorian Fainelli1-2/+2
Commit 7318166cacad ("net: dsa: bcm_sf2: Add support for ethtool::rxnfc") added a new object to build: bcm_sf2_cfp.o, but in doing so, we essentially just built this object and no longer bcm_sf2.o. Fix this by creating a module named bcm-sf2.ko which links in bcm_sf2.o and bcm_sf2_cfp.o. Fixes: 7318166cacad ("net: dsa: bcm_sf2: Add support for ethtool::rxnfc") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-31cxgb4: update latest firmware version supportedGanesh Goudar1-6/+6
Change t4fw_version.h to update latest firmware version number 1.16.26.0. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-31net: ethtool: convert large order kmalloc allocations to vzallocAlexei Starovoitov1-17/+22
under memory pressure 'ethtool -S' command may warn: [ 2374.385195] ethtool: page allocation failure: order:4, mode:0x242c0c0 [ 2374.405573] CPU: 12 PID: 40211 Comm: ethtool Not tainted [ 2374.423071] Call Trace: [ 2374.423076] [<ffffffff8148cb29>] dump_stack+0x4d/0x64 [ 2374.423080] [<ffffffff811667cb>] warn_alloc_failed+0xeb/0x150 [ 2374.423082] [<ffffffff81169cd3>] ? __alloc_pages_direct_compact+0x43/0xf0 [ 2374.423084] [<ffffffff8116a25c>] __alloc_pages_nodemask+0x4dc/0xbf0 [ 2374.423091] [<ffffffffa0023dc2>] ? cmd_exec+0x722/0xcd0 [mlx5_core] [ 2374.423095] [<ffffffff811b3dcc>] alloc_pages_current+0x8c/0x110 [ 2374.423097] [<ffffffff81168859>] alloc_kmem_pages+0x19/0x90 [ 2374.423099] [<ffffffff81186e5e>] kmalloc_order_trace+0x2e/0xe0 [ 2374.423101] [<ffffffff811c0084>] __kmalloc+0x204/0x220 [ 2374.423105] [<ffffffff816c269e>] dev_ethtool+0xe4e/0x1f80 [ 2374.423106] [<ffffffff816b967e>] ? dev_get_by_name_rcu+0x5e/0x80 [ 2374.423108] [<ffffffff816d6926>] dev_ioctl+0x156/0x560 [ 2374.423111] [<ffffffff811d4c68>] ? mem_cgroup_commit_charge+0x78/0x3c0 [ 2374.423117] [<ffffffff8169d542>] sock_do_ioctl+0x42/0x50 [ 2374.423119] [<ffffffff8169d9c3>] sock_ioctl+0x1b3/0x250 [ 2374.423121] [<ffffffff811f0f42>] do_vfs_ioctl+0x92/0x580 [ 2374.423123] [<ffffffff8100222b>] ? do_audit_syscall_entry+0x4b/0x70 [ 2374.423124] [<ffffffff8100287c>] ? syscall_trace_enter_phase1+0xfc/0x120 [ 2374.423126] [<ffffffff811f14a9>] SyS_ioctl+0x79/0x90 [ 2374.423127] [<ffffffff81002bb0>] do_syscall_64+0x50/0xa0 [ 2374.423129] [<ffffffff817e19bc>] entry_SYSCALL64_slow_path+0x25/0x25 ~1160 mlx5 counters ~= order 4 allocation which is unlikely to succeed under memory pressure. Convert them to vzalloc() as ethtool_get_regs() does. Also take care of drivers without counters similar to commit 67ae7cf1eeda ("ethtool: Allow zero-length register dumps again") and reduce warn_on to warn_on_once. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-31fscache: Fix dead object requeueDavid Howells2-2/+25
Under some circumstances, an fscache object can become queued such that it fscache_object_work_func() can be called once the object is in the OBJECT_DEAD state. This results in the kernel oopsing when it tries to invoke the handler for the state (which is hard coded to 0x2). The way this comes about is something like the following: (1) The object dispatcher is processing a work state for an object. This is done in workqueue context. (2) An out-of-band event comes in that isn't masked, causing the object to be queued, say EV_KILL. (3) The object dispatcher finishes processing the current work state on that object and then sees there's another event to process, so, without returning to the workqueue core, it processes that event too. It then follows the chain of events that initiates until we reach OBJECT_DEAD without going through a wait state (such as WAIT_FOR_CLEARANCE). At this point, object->events may be 0, object->event_mask will be 0 and oob_event_mask will be 0. (4) The object dispatcher returns to the workqueue processor, and in due course, this sees that the object's work item is still queued and invokes it again. (5) The current state is a work state (OBJECT_DEAD), so the dispatcher jumps to it - resulting in an OOPS. When I'm seeing this, the work state in (1) appears to have been either LOOK_UP_OBJECT or CREATE_OBJECT (object->oob_table is fscache_osm_lookup_oob). The window for (2) is very small: (A) object->event_mask is cleared whilst the event dispatch process is underway - though there's no memory barrier to force this to the top of the function. The window, therefore is from the time the object was selected by the workqueue processor and made requeueable to the time the mask was cleared. (B) fscache_raise_event() will only queue the object if it manages to set the event bit and the corresponding event_mask bit was set. The enqueuement is then deferred slightly whilst we get a ref on the object and get the per-CPU variable for workqueue congestion. This slight deferral slightly increases the probability by allowing extra time for the workqueue to make the item requeueable. Handle this by giving the dead state a processor function and checking the for the dead state address rather than seeing if the processor function is address 0x2. The dead state processor function can then set a flag to indicate that it's occurred and give a warning if it occurs more than once per object. If this race occurs, an oops similar to the following is seen (note the RIP value): BUG: unable to handle kernel NULL pointer dereference at 0000000000000002 IP: [<0000000000000002>] 0x1 PGD 0 Oops: 0010 [#1] SMP Modules linked in: ... CPU: 17 PID: 16077 Comm: kworker/u48:9 Not tainted 3.10.0-327.18.2.el7.x86_64 #1 Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015 Workqueue: fscache_object fscache_object_work_func [fscache] task: ffff880302b63980 ti: ffff880717544000 task.ti: ffff880717544000 RIP: 0010:[<0000000000000002>] [<0000000000000002>] 0x1 RSP: 0018:ffff880717547df8 EFLAGS: 00010202 RAX: ffffffffa0368640 RBX: ffff880edf7a4480 RCX: dead000000200200 RDX: 0000000000000002 RSI: 00000000ffffffff RDI: ffff880edf7a4480 RBP: ffff880717547e18 R08: 0000000000000000 R09: dfc40a25cb3a4510 R10: dfc40a25cb3a4510 R11: 0000000000000400 R12: 0000000000000000 R13: ffff880edf7a4510 R14: ffff8817f6153400 R15: 0000000000000600 FS: 0000000000000000(0000) GS:ffff88181f420000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000002 CR3: 000000000194a000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffffffffa0363695 ffff880edf7a4510 ffff88093f16f900 ffff8817faa4ec00 ffff880717547e60 ffffffff8109d5db 00000000faa4ec18 0000000000000000 ffff8817faa4ec18 ffff88093f16f930 ffff880302b63980 ffff88093f16f900 Call Trace: [<ffffffffa0363695>] ? fscache_object_work_func+0xa5/0x200 [fscache] [<ffffffff8109d5db>] process_one_work+0x17b/0x470 [<ffffffff8109e4ac>] worker_thread+0x21c/0x400 [<ffffffff8109e290>] ? rescuer_thread+0x400/0x400 [<ffffffff810a5acf>] kthread+0xcf/0xe0 [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140 [<ffffffff816460d8>] ret_from_fork+0x58/0x90 [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140 Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeremy McNicoll <jeremymc@redhat.com> Tested-by: Frank Sorenson <sorenson@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-01-31fscache: Clear outstanding writes when disabling a cookieDavid Howells2-0/+11
fscache_disable_cookie() needs to clear the outstanding writes on the cookie it's disabling because they cannot be completed after. Without this, fscache_nfs_open_file() gets stuck because it disables the cookie when the file is opened for writing but can't uncache the pages till afterwards - otherwise there's a race between the open routine and anyone who already has it open R/O and is still reading from it. Looking in /proc/pid/stack of the offending process shows: [<ffffffffa0142883>] __fscache_wait_on_page_write+0x82/0x9b [fscache] [<ffffffffa014336e>] __fscache_uncache_all_inode_pages+0x91/0xe1 [fscache] [<ffffffffa01740fa>] nfs_fscache_open_file+0x59/0x9e [nfs] [<ffffffffa01ccf41>] nfs4_file_open+0x17f/0x1b8 [nfsv4] [<ffffffff8117350e>] do_dentry_open+0x16d/0x2b7 [<ffffffff811743ac>] vfs_open+0x5c/0x65 [<ffffffff81184185>] path_openat+0x785/0x8fb [<ffffffff81184343>] do_filp_open+0x48/0x9e [<ffffffff81174710>] do_sys_open+0x13b/0x1cb [<ffffffff811747b9>] SyS_open+0x19/0x1b [<ffffffff81001c44>] do_syscall_64+0x80/0x17a [<ffffffff8165c2da>] return_from_SYSCALL_64+0x0/0x7a [<ffffffffffffffff>] 0xffffffffffffffff Reported-by: Jianhong Yin <jiyin@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-01-31FS-Cache: Initialise stores_lock in netfs cookieDavid Howells1-0/+1
Initialise the stores_lock in fscache netfs cookies. Technically, it shouldn't be necessary, since the netfs cookie is an index and stores no data, but initialising it anyway adds insignificant overhead. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-01-31ipv6: fix flow labels when the traffic class is non-0Dimitris Michailidis1-0/+5
ip6_make_flowlabel() determines the flow label for IPv6 packets. It's supposed to be passed a flow label, which it returns as is if non-0 and in some other cases, otherwise it calculates a new value. The problem is callers often pass a flowi6.flowlabel, which may also contain traffic class bits. If the traffic class is non-0 ip6_make_flowlabel() mistakes the non-0 it gets as a flow label and returns the whole thing. Thus it can return a 'flow label' longer than 20b and the low 20b of that is typically 0 resulting in packets with 0 label. Moreover, different packets of a flow may be labeled differently. For a TCP flow with ECN non-payload and payload packets get different labels as exemplified by this pair of consecutive packets: (pure ACK) Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2:: 0110 .... = Version: 6 .... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT) .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0) .... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0) .... .... .... 0001 1100 1110 0100 1001 = Flow Label: 0x1ce49 Payload Length: 32 Next Header: TCP (6) (payload) Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2:: 0110 .... = Version: 6 .... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0)) .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0) .... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2) .... .... .... 0000 0000 0000 0000 0000 = Flow Label: 0x00000 Payload Length: 688 Next Header: TCP (6) This patch allows ip6_make_flowlabel() to be passed more than just a flow label and has it extract the part it really wants. This was simpler than modifying the callers. With this patch packets like the above become Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2:: 0110 .... = Version: 6 .... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT) .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0) .... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0) .... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e Payload Length: 32 Next Header: TCP (6) Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2:: 0110 .... = Version: 6 .... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0)) .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0) .... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2) .... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e Payload Length: 688 Next Header: TCP (6) Signed-off-by: Dimitris Michailidis <dmichail@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-31net: aquantia: atlantic: use new api ethtool_{get|set}_link_ksettingsPhilippe Reynes3-31/+49
The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Tested-by: Pavel Belous <pavel.s.belous@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-31net: thunderx: avoid dereferencing xcv when NULLVincent1-2/+1
This fixes the following smatch and coccinelle warnings: drivers/net/ethernet/cavium/thunder/thunder_xcv.c:119 xcv_setup_link() error: we previously assumed 'xcv' could be null (see line 118) [smatch] drivers/net/ethernet/cavium/thunder/thunder_xcv.c:119:16-20: ERROR: xcv is NULL but dereferenced. [coccinelle] Fixes: 6465859aba1e66a5 ("net: thunderx: Add RGMII interface type support") Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net> Cc: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-31sfc: fix an off-by-one compare on an array sizeColin Ian King1-1/+1
encap_type should be checked to see if it is greater or equal to the size of array map to fix an off-by-one array size check. This fixes an array overrun read as detected by static analysis by CoverityScan, CID#1398883 ("Out-of-bounds-read") Fixes: 9b41080125176841e ("sfc: insert catch-all filters for encapsulated traffic") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-31tracing: Fix hwlat kthread migrationSteven Rostedt (VMware)1-3/+5
The hwlat tracer creates a kernel thread at start of the tracer. It is pinned to a single CPU and will move to the next CPU after each period of running. If the user modifies the migration thread's affinity, it will not change after that happens. The original code created the thread at the first instance it was called, but later was changed to destroy the thread after the tracer was finished, and would not be created until the next instance of the tracer was established. The code that initialized the affinity was only called on the initial instantiation of the tracer. After that, it was not initialized, and the previous affinity did not match the current newly created one, making it appear that the user modified the thread's affinity when it did not, and the thread failed to migrate again. Cc: stable@vger.kernel.org Fixes: 0330f7aa8ee6 ("tracing: Have hwlat trace migrate across tracing_cpumask CPUs") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-01-31HID: cp2112: fix gpio-callback error handlingJohan Hovold1-1/+1
In case of a zero-length report, the gpio direction_input callback would currently return success instead of an errno. Fixes: 1ffb3c40ffb5 ("HID: cp2112: make transfer buffers DMA capable") Cc: stable <stable@vger.kernel.org> # 4.9 Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-01-31HID: cp2112: fix sleep-while-atomicJohan Hovold1-15/+11
A recent commit fixing DMA-buffers on stack added a shared transfer buffer protected by a spinlock. This is broken as the USB HID request callbacks can sleep. Fix this up by replacing the spinlock with a mutex. Fixes: 1ffb3c40ffb5 ("HID: cp2112: make transfer buffers DMA capable") Cc: stable <stable@vger.kernel.org> # 4.9 Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-01-31Input: synaptics-rmi4 - fix reversed conditions in enable/disable_irq_wakeChristophe JAILLET1-2/+2
These tests are reversed. A warning should be displayed if an error is returned, not on success. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-01-31bcma: make OF code more generic (not platform_device specific)Rafał Miłecki1-10/+11
OF allows not only specifying platform devices but also describing devices on standard buses like PCI or USB. This change will allow reading info from DT for bcma buses hosted on PCI cards. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-01-31rt2800: enable rt3290 unconditionally on pci probeStanislaw Gruszka1-3/+0
When we restart system using sysrq RT3290 device do not initalize properly, hance always enable it via WLAN_FUN_CTRL register on probe. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=85461 Reported-and-tested-by: Giedrius Statkevičius <edrius.statkevicius@gmail.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-01-31bcma: use (get|put)_device when probing/removing device driverRafał Miłecki1-0/+4
This allows tracking device state and e.g. makes devm work as expected. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Cc: Stable <stable@vger.kernel.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-01-31brcmfmac: add .update_connect_params() callbackArend Van Spriel1-0/+24
Add support for the .update_connect_params() callback for roaming or subsequent (re)association. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-01-31brcmfmac: allow wowlan support to be per deviceArend Van Spriel1-7/+19
The wowlan support is (partially) determined dynamic by checking the device/firmware capabilities. So they can differ per device. So it is not possible to use a static global. Instead use the global as a template and use kmemdup(). When kmemdup() fails the template is used unmodified. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-01-31brcmfmac: fix handling firmware results for wowl netdetectArend Van Spriel1-3/+1
For wowl netdetect the event data changed for newer chips. This was recently fixed for scheduled scan, but same change is needed for wowl netdetect. Removing now pointles += operation from both result handlers. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>