path: root/net/ipv6/output_core.c
Age | Commit message | Author | Files | Lines
2021-02-22 | net: stmmac: fix CBS idleslope and sendslope calculation | Song, Yoong Siang | 1 | -4/+26
When link speed is not 100 Mbps, port transmit rate and speed divider are set to 8 and 1000000 respectively. These values are incorrect for CBS idleslope and sendslope HW values calculation if the link speed is not 1 Gbps. This patch adds switch statement to set the values of port transmit rate and speed divider for 10 Gbps, 5 Gbps, 2.5 Gbps, 1 Gbps, and 100 Mbps. Note that CBS is not supported at 10 Mbps. Fixes: bc41a6689b30 ("net: stmmac: tc: Remove the speed dependency") Fixes: 1f705bc61aee ("net: stmmac: Add support for CBS QDISC") Signed-off-by: Song, Yoong Siang <yoong.siang.song@intel.com> Link: https://lore.kernel.org/r/1613655653-11755-1-git-send-email-yoong.siang.song@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-22 | mptcp: do not wakeup listener for MPJ subflows | Paolo Abeni | 1 | -0/+6
MPJ subflows are not exposed as fds to user space. As such, incoming MPJ subflows are removed from the accept queue by tcp_check_req()/tcp_get_cookie_sock(). Later tcp_child_process() invokes subflow_data_ready() on the parent socket regardless of the subflow kind, leading to poll wakeups even if the later accept will block. Address the issue by double-checking the queue state before waking user space. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/164 Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests") Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-22 | mptcp: provide subflow aware release function | Florian Westphal | 1 | -2/+53
mptcp re-used inet(6)_release, so the subflow sockets are ignored. Need to invoke ip(v6)_mc_drop_socket function to ensure mcast join resources get free'd. Fixes: 717e79c867ca5 ("mptcp: Add setsockopt()/getsockopt() socket operations") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/110 Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-22 | mptcp: fix DATA_FIN generation on early shutdown | Paolo Abeni | 1 | -9/+14
If the msk is closed before sending or receiving any data, no DATA_FIN is generated; instead an MPC ack packet is crafted. In this scenario, the MPTCP protocol creates and sends a pure ack. Such a packet also matches the criteria for an MPC ack, so the protocol first tries to insert MPC options, leading to the described error. This change addresses the issue by avoiding the insertion of an MPC option for DATA_FIN packets or if the subflow is not established. To avoid repeating the same test, fetch the data_fin flag into a bool variable and pass it to both helpers that need it. Fixes: 6d0060f600ad ("mptcp: Write MPTCP DSS headers to outgoing data packets") Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-22 | mptcp: fix DATA_FIN processing for orphaned sockets | Paolo Abeni | 1 | -5/+4
Currently we move orphaned msk sockets directly from FIN_WAIT2 state to CLOSE, with the rationale that incoming additional data could be just dropped by the TCP stack/TW sockets. Anyhow we miss sending MPTCP-level ack on incoming DATA_FIN, and that may hang the peers. Fixes: e16163b6e2b7 ("mptcp: refactor shutdown and close") Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-22 | net: dsa: Fix dependencies with HSR | Florian Fainelli | 1 | -0/+1
The core DSA framework uses hsr_is_master() which would not resolve to a valid symbol if HSR is built-into the kernel and DSA is a module. Fixes: 18596f504a3e ("net: dsa: add support for offloading HSR") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: George McCollister <george.mccollister@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Tested-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20210220051222.15672-1-f.fainelli@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-22 | net: phy: icplus: call phy_restore_page() when phy_select_page() fails | Dan Carpenter | 1 | -4/+5
The comments to phy_select_page() say that "phy_restore_page() must always be called after this, irrespective of success or failure of this call." If we don't call phy_restore_page() then we are still holding the lock taken by phy_lock_mdio_bus(), which eventually leads to a deadlock. Fixes: 32ab60e53920 ("net: phy: icplus: add MDI/MDIX support for IP101A/G") Fixes: f9bc51e6cce2 ("net: phy: icplus: fix paged register access") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Michael Walle <michael@walle.cc> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/YC+OpFGsDPXPnXM5@mwanda Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-22 | net: mvpp2: skip RSS configurations on loopback port | Stefan Chulski | 1 | -11/+14
PPv2 loopback port doesn't support RSS, so we should skip RSS configurations for this port. Signed-off-by: Stefan Chulski <stefanc@marvell.com> Reviewed-by: Marcin Wojtas <mw@semihalf.com> Link: https://lore.kernel.org/r/1613652123-19021-1-git-send-email-stefanc@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-22 | dpaa_eth: fix the access method for the dpaa_napi_portal | Camelia Groza | 1 | -1/+1
The current use of container_of is flawed and unnecessary. Obtain the dpaa_napi_portal reference from the private percpu data instead. Fixes: a1e031ffb422 ("dpaa_eth: add XDP_REDIRECT support") Reported-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Camelia Groza <camelia.groza@nxp.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/r/20210218182106.22613-1-camelia.groza@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-22 | net: ag71xx: remove unnecessary MTU reservation | DENG Qingfang | 1 | -3/+1
2 bytes of the MTU are reserved for Atheros DSA tag, but DSA core has already handled that since commit dc0fe7d47f9f. Remove the unnecessary reservation. Fixes: d51b6ce441d3 ("net: ethernet: add ag71xx driver") Signed-off-by: DENG Qingfang <dqfext@gmail.com> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20210218034514.3421-1-dqfext@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-21 | octeontx2-af: Fix an off by one in rvu_dbg_qsize_write() | Dan Carpenter | 1 | -1/+1
This code does not allocate enough memory for the NUL terminator so it ends up putting it one character beyond the end of the buffer. Fixes: 8756828a8148 ("octeontx2-af: Add NPA aura and pool contexts to debugfs") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
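The off-by-one described above is a common pattern when a counted buffer is turned into a C string. The following is a minimal user-space sketch, not the driver code: `dup_counted()` is a hypothetical helper, and `malloc()` stands in for whatever allocator the kernel code uses.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy `count` bytes from `src` into a freshly
 * allocated, NUL-terminated string. Returns NULL on allocation failure.
 */
char *dup_counted(const char *src, size_t count)
{
    /* count + 1, not count: the extra byte holds the terminator. */
    char *buf = malloc(count + 1);

    if (!buf)
        return NULL;

    memcpy(buf, src, count);
    buf[count] = '\0'; /* last valid index of a (count + 1)-byte buffer */
    return buf;
}
```

Allocating only `count` bytes would make the `buf[count]` store land one byte past the end of the buffer, which is the failure mode the commit message describes.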
2021-02-20 | tty: protect tty_write from odd low-level tty disciplines | Linus Torvalds | 1 | -1/+4
Al root-caused a new warning from syzbot to the ttyprintk tty driver returning a write count larger than the data the tty layer actually gave it. Which confused the tty write code mightily, and with the new iov_iter based code, caused a WARNING in iov_iter_revert(). syzbot correctly bisected the source of the new warning to commit 9bb48c82aced ("tty: implement write_iter"), but the oddity goes back much further, it just didn't get caught by anything before. Reported-by: syzbot+3d2c27c2b7dc2a94814d@syzkaller.appspotmail.com Fixes: 9bb48c82aced ("tty: implement write_iter") Debugged-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
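One way to picture the problem is a caller that refuses to trust a write count larger than the amount it handed down. The sketch below is a user-space illustration only; `guarded_write()`, `ll_write_fn` and both example hooks are hypothetical names, and the actual tty change may handle the condition differently.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical low-level write hook: returns bytes consumed, or < 0 on error. */
typedef ssize_t (*ll_write_fn)(const char *buf, size_t size);

/* Well-behaved example hook: consumes at most 4 bytes per call. */
ssize_t honest_write(const char *buf, size_t size)
{
    (void)buf;
    return (ssize_t)(size < 4 ? size : 4);
}

/* Misbehaving example hook: claims one byte more than it was given. */
ssize_t overreporting_write(const char *buf, size_t size)
{
    (void)buf;
    return (ssize_t)size + 1;
}

/*
 * Call the hook and sanity-check its result. A count larger than `size`
 * cannot be right, so it is reported as a failure (-1) instead of being
 * used for further bookkeeping.
 */
ssize_t guarded_write(ll_write_fn write_fn, const char *buf, size_t size)
{
    ssize_t ret = write_fn(buf, size);

    if (ret < 0)
        return ret;
    if ((size_t)ret > size)
        return -1;
    return ret;
}
```

Rejecting the impossible count at the call site keeps later arithmetic such as `size - ret` from going negative, whichever driver sits below.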
2021-02-20 | fix handling of nd->depth on LOOKUP_CACHED failures in try_to_unlazy* | Al Viro | 1 | -4/+5
After switching to non-RCU mode, we want nd->depth to match the number of entries in nd->stack[] that need eventual path_put(). legitimize_links() takes care of that on failures; unfortunately, failure exits added for LOOKUP_CACHED do not. We could add the logics for that into those failure exits, both in try_to_unlazy() and in try_to_unlazy_next(), but since both checks are immediately followed by legitimize_links() and there's no calls of legitimize_links() other than those two... It's easier to move the check (and required handling of nd->depth on failure) into legitimize_links() itself. [caught by Jens: ... and since we are zeroing ->depth here, we need to do drop_links() first] Fixes: 6c6ec2b0a3e0 "fs: add support for LOOKUP_CACHED" Tested-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2021-02-19 | i40e: Fix endianness conversions | Norbert Ciosek | 3 | -8/+8
Fixes the following sparse warnings:

i40e_main.c:5953:32: warning: cast from restricted __le16
i40e_main.c:8008:29: warning: incorrect type in assignment (different base types)
i40e_main.c:8008:29: expected unsigned int [assigned] [usertype] ipa
i40e_main.c:8008:29: got restricted __le32 [usertype]
i40e_main.c:8008:29: warning: incorrect type in assignment (different base types)
i40e_main.c:8008:29: expected unsigned int [assigned] [usertype] ipa
i40e_main.c:8008:29: got restricted __le32 [usertype]
i40e_txrx.c:1950:59: warning: incorrect type in initializer (different base types)
i40e_txrx.c:1950:59: expected unsigned short [usertype] vlan_tag
i40e_txrx.c:1950:59: got restricted __le16 [usertype] l2tag1
i40e_txrx.c:1953:40: warning: cast to restricted __le16
i40e_xsk.c:448:38: warning: invalid assignment: |=
i40e_xsk.c:448:38: left side has type restricted __le64
i40e_xsk.c:448:38: right side has type int

Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower") Fixes: 2a508c64ad27 ("i40e: fix VLAN.TCI == 0 RX HW offload") Fixes: 3106c580fb7c ("i40e: Use batched xsk Tx interfaces to increase performance") Fixes: 8f88b3034db3 ("i40e: Add infrastructure for queue channel support") Signed-off-by: Norbert Ciosek <norbertx.ciosek@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-19 | i40e: Fix add TC filter for IPv6 | Mateusz Palczewski | 1 | -2/+3
Fix insufficient distinction between IPv4 and IPv6 addresses when creating a filter. IPv4 and IPv6 are kept in the same memory area. If IPv6 is added, then it's caught by IPv4 check, which leads to err -95. Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower") Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Reviewed-by: Jaroslaw Gawin <jaroslawx.gawin@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-18 | pstore: Fix typo in compression option name | Jiri Bohac | 1 | -2/+2
Both pstore_compress() and decompress_record() use a mistyped config option name ("PSTORE_COMPRESSION" instead of "PSTORE_COMPRESS"). As a result compression and decompression of pstore records was always disabled. Use the correct config option name. Signed-off-by: Jiri Bohac <jbohac@suse.cz> Fixes: fd49e03280e5 ("pstore: Fix linking when crypto API disabled") Acked-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210218111547.johvp5klpv3xrpnn@dwarf.suse.cz
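A misspelled configuration symbol fails silently because the preprocessor treats an unknown name as "not defined" rather than as an error. The sketch below shows the effect with a plain `#ifdef`; the function names are hypothetical, and the kernel code may test the option through a different mechanism than `#ifdef`.

```c
/* Hypothetical build configuration: only the correctly spelled symbol exists. */
#define CONFIG_PSTORE_COMPRESS 1

/* Tests the misspelled name: the compiler never complains, the branch is just dead. */
int compress_enabled_misspelled(void)
{
#ifdef CONFIG_PSTORE_COMPRESSION
    return 1;
#else
    return 0;
#endif
}

/* Tests the name that the configuration actually defines. */
int compress_enabled_correct(void)
{
#ifdef CONFIG_PSTORE_COMPRESS
    return 1;
#else
    return 0;
#endif
}
```

Both functions compile cleanly, which is why a typo of this kind can leave a feature disabled for a long time without any build warning.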
2021-02-18 | i40e: Fix VFs not created | Sylwester Dziedziuch | 1 | -2/+1
When creating VFs, they sometimes did not get resources. This was caused by i40e_reset_all_vfs not being executed because the __I40E_VF_DISABLE flag was set on the PF. As a result, IAVF was never able to finish its setup sequence, since it never got a reset indication from the PF. Change test_and_set_bit on __I40E_VF_DISABLE in i40e_sync_filters_subtask to test_bit and remove the clear_bit. This function should not set this bit; it should only check whether it has already been set. Fixes: a7542b876075 ("i40e: check __I40E_VF_DISABLE bit in i40e_sync_filters_subtask") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
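The difference between checking a flag and claiming it can be sketched in user space. The helpers below are simplified, non-atomic stand-ins for the kernel bit operations they are named after, and `VF_DISABLE_BIT` plus the two `check_*` functions are hypothetical names used only for illustration.

```c
#include <stdbool.h>

#define VF_DISABLE_BIT 3 /* hypothetical bit number for this sketch */

/* Non-atomic stand-in: report whether bit `nr` is set, without side effects. */
static bool sk_test_bit(unsigned int nr, const unsigned long *addr)
{
    return (*addr >> nr) & 1UL;
}

/* Non-atomic stand-in: set bit `nr` and return its previous value. */
static bool sk_test_and_set_bit(unsigned int nr, unsigned long *addr)
{
    bool old = (*addr >> nr) & 1UL;

    *addr |= 1UL << nr;
    return old;
}

/* A pure check: the shared state is the same after the call as before. */
bool check_read_only(unsigned long *state)
{
    return sk_test_bit(VF_DISABLE_BIT, state);
}

/* A check with a side effect: other code now sees the flag as set. */
bool check_and_claim(unsigned long *state)
{
    return sk_test_and_set_bit(VF_DISABLE_BIT, state);
}
```

With `check_and_claim()`, any other path that tests the same bit during the window before it is cleared sees it set and may skip its own work, which matches the kind of interference the commit message describes.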
2021-02-18 | i40e: Fix addition of RX filters after enabling FW LLDP agent | Mateusz Palczewski | 2 | -12/+13
Fix addition of VLAN filter for PF after enabling FW LLDP agent. Changing LLDP Agent causes FW to re-initialize per NVM settings. Remove default PF filter and move "Enable/Disable" to currently used reset flag. Without this patch PF would try to add MAC VLAN filter with default switch filter present. This causes AQ error and sets promiscuous mode on. Fixes: c65e78f87f81 ("i40e: Further implementation of LLDP") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Reviewed-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-18 | i40e: Fix overwriting flow control settings during driver loading | Mateusz Palczewski | 1 | -27/+0
During driver loading flow control settings were written to FW using a variable which was always zero, since it was being set only by ethtool. This behavior has been corrected and driver no longer overwrites the default FW/NVM settings. Fixes: 373149fc99a0 ("i40e: Decrease the scope of rtnl lock") Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-18 | i40e: Add zero-initialization of AQ command structures | Mateusz Palczewski | 1 | -0/+6
Zero-initialize AQ command data structures to comply with API specifications. Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower") Fixes: f4492db16df8 ("i40e: Add NPAR BW get and set functions") Signed-off-by: Andrzej Sawuła <andrzej.sawula@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
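Zero-initializing a command structure before filling it in guarantees that reserved and unused fields do not carry stack garbage to the consumer. A minimal sketch follows; `struct aq_cmd` and `aq_cmd_prepare()` are hypothetical and do not reflect the real i40e admin queue layout.

```c
#include <string.h>

/* Hypothetical command layout with reserved bytes the caller never sets. */
struct aq_cmd {
    unsigned short flags;
    unsigned short opcode;
    unsigned char reserved[12];
};

/* Clear the whole structure first, then set only the fields that matter. */
void aq_cmd_prepare(struct aq_cmd *cmd, unsigned short opcode)
{
    memset(cmd, 0, sizeof(*cmd));
    cmd->opcode = opcode;
}
```

For a local variable, an initializer such as `struct aq_cmd cmd = {0};` zeroes every member as well; `memset()` additionally clears any padding bytes.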
2021-02-18 | i40e: Fix memory leak in i40e_probe | Keita Suzuki | 1 | -0/+2
Struct i40e_veb is allocated in function i40e_setup_pf_switch, and stored to an array field veb inside struct i40e_pf. However when i40e_setup_misc_vector fails, this memory leaks. Fix this by calling exit and teardown functions. Signed-off-by: Keita Suzuki <keitasuzuki.park@sslab.ics.keio.ac.jp> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-18 | i40e: Fix flow for IPv6 next header (extension header) | Slawomir Laba | 1 | -3/+6
When a packet contains an IPv6 header whose next header is an extension header rather than a protocol header, the kernel function skb_transport_header() called on such an sk_buff returns a pointer to the extension header and not to the TCP header. That call caused a problem in packet processing for skbs with tunnel encapsulation using I40E_TX_CTX_EXT_IP_IPV6: the extension header was not skipped at all. The ipv6_skip_exthdr() function checks whether the next header of the IPv6 header is an extension header and does not modify the l4_proto pointer if it points to a protocol header value, so it's safe to omit the comparison of the exthdr and l4.hdr pointers. ipv6_skip_exthdr() can return -1. This means that the skipping process failed and something is wrong with the packet, so it will be dropped. Fixes: a3fd9d8876a5 ("i40e/i40evf: Handle IPv6 extension headers in checksum offload") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
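Skipping extension headers amounts to following the Next Header chain until a non-extension value appears. The sketch below is a simplified user-space walker and not the kernel's ipv6_skip_exthdr(): `skip_exthdr()` is a hypothetical function, it handles only the hop-by-hop, routing, fragment and destination-options headers, and it ignores details such as AH and non-first fragments.

```c
#include <stddef.h>
#include <stdint.h>

#define NEXTHDR_HOP      0  /* hop-by-hop options */
#define NEXTHDR_TCP      6
#define NEXTHDR_ROUTING  43
#define NEXTHDR_FRAGMENT 44
#define NEXTHDR_DEST     60 /* destination options */

static int is_ext_hdr(uint8_t nexthdr)
{
    return nexthdr == NEXTHDR_HOP || nexthdr == NEXTHDR_ROUTING ||
           nexthdr == NEXTHDR_FRAGMENT || nexthdr == NEXTHDR_DEST;
}

/*
 * buf/len: the bytes that follow the fixed 40-byte IPv6 header.
 * *nexthdr: on entry, the Next Header value from the fixed header;
 *           on success, the upper-layer protocol number.
 * Returns the offset of the upper-layer header within buf, or -1 if the
 * chain runs past the end of the buffer.
 */
int skip_exthdr(const uint8_t *buf, size_t len, uint8_t *nexthdr)
{
    size_t off = 0;
    uint8_t nh = *nexthdr;

    while (is_ext_hdr(nh)) {
        size_t hdrlen;

        if (len - off < 8) /* every extension header is at least 8 bytes */
            return -1;

        if (nh == NEXTHDR_FRAGMENT)
            hdrlen = 8; /* fixed size */
        else
            hdrlen = ((size_t)buf[off + 1] + 1) * 8; /* length field is in 8-byte units, minus one */

        if (hdrlen > len - off)
            return -1;

        nh = buf[off]; /* first byte of each extension header is its Next Header */
        off += hdrlen;
    }

    *nexthdr = nh;
    return (int)off;
}
```

When the first Next Header value is already a protocol number the loop body never runs and the offset is 0, which mirrors the observation in the commit message that a protocol value is left unmodified.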
2021-02-17 | octeontx2-pf: Fix otx2_get_fecparam() | Dan Carpenter | 1 | -1/+1
Static checkers complained about an off by one read overflow in otx2_get_fecparam() and we applied two conflicting fixes for it. Correct: b0aae0bde26f ("octeontx2: Fix condition.") Wrong: 93efb0c65683 ("octeontx2-pf: Fix out-of-bounds read in otx2_get_fecparam()") Revert the incorrect fix. Fixes: 93efb0c65683 ("octeontx2-pf: Fix out-of-bounds read in otx2_get_fecparam()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17 | cteontx2-pf: cn10k: Prevent harmless double shift bugs | Dan Carpenter | 1 | -3/+3
These defines are used with set_bit() and test_bit() which take a bit number. In other words, the code is doing: if (BIT(BIT(1)) & pf->hw.cap_flag) { This was done consistently so it did not cause a problem at runtime but it's still worth fixing. Fixes: facede8209ef ("octeontx2-pf: cn10k: Add mbox support for CN10K") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
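The double shift can be reproduced with a few lines of user-space C. The helpers below are simplified, non-atomic stand-ins for set_bit() and test_bit(), and the two `CAP_*` defines are hypothetical names, not the driver's.

```c
#include <stdbool.h>

#define BIT(n) (1UL << (n))

/* Non-atomic stand-ins: both take a bit NUMBER, not a mask. */
static void sk_set_bit(unsigned int nr, unsigned long *addr)
{
    *addr |= BIT(nr);
}

static bool sk_test_bit(unsigned int nr, const unsigned long *addr)
{
    return (*addr & BIT(nr)) != 0;
}

/* Mask-style define: passing it to the helpers shifts twice, BIT(BIT(1)). */
#define CAP_MASK_STYLE  BIT(1) /* value 2, so the helpers touch bit 2 */

/* Bit-number-style define: what the helpers expect. */
#define CAP_BITNR_STYLE 1      /* the helpers touch bit 1 */
```

Because the mask-style define is used consistently for both setting and testing, the two operations agree on bit 2 and nothing breaks at run time; the flag simply lives in a different bit than the name suggests.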
2021-02-17 | net: stmmac: Add PCI bus info to ethtool driver query output | Wong Vee Khee | 3 | -0/+6
This patch populates the PCI bus info in the ethtool driver query data. Users will be able to view PCI bus info using 'ethtool -i <interface>'. Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17 | ptp: ptp_clockmatrix: clean-up - parenthesis around a == b are unnecessary | Vincent Cheng | 1 | -10/+8
Code clean-up. Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17 | ptp: ptp_clockmatrix: Simplify code - remove unnecessary `err` variable. | Vincent Cheng | 1 | -4/+1
Code clean-up. Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17 | ptp: ptp_clockmatrix: Coding style - tighten vertical spacing. | Vincent Cheng | 1 | -79/+11
Code clean-up.

* Remove blank line between variable declarations.
* Remove blank line between:
    err = blah(...)
    if (err) ...
* Remove unnecessary blank line before/after loop constructs.

Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17 | ptp: ptp_clockmatrix: Clean-up dev_*() messages. | Vincent Cheng | 1 | -79/+43
Code clean-up.

* Remove unnecessary \n termination from dev_*() messages.
* Remove 'char *fmt' to define strings to stay within 80 column limit. Not needed since coding guidelines increased to 100 columns limit. Keeping format in place allows static code checkers to validate the arguments.
* Tighten up vertical spacing.

Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17 | ptp: ptp_clockmatrix: Remove unused header declarations. | Vincent Cheng | 1 | -2/+0
Removed unused header declarations. Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17 | ptp: ptp_clockmatrix: Add alignment of 1 PPS to idtcm_perout_enable. | Vincent Cheng | 1 | -3/+13
When enabling output using PTP_CLK_REQ_PEROUT, need to align the output clock to the internal 1 PPS clock. Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17 | ptp: ptp_clockmatrix: Add wait_for_sys_apll_dpll_lock. | Vincent Cheng | 3 | -2/+87
Part of the device initialization aligns the rising edge of the output clock to the internal 1 PPS clock. If the system APLL and DPLL is not locked, then the alignment will fail and there will be a fixed offset between the internal 1 PPS clock and the output clock. After loading the device firmware, poll the system APLL and DPLL for locked state prior to initialization, timing out after 2 seconds. Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
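Polling for a locked state with a timeout can be sketched as a loop over a status callback. Everything below is hypothetical and for illustration: `wait_for_lock()`, the callback types and the two stub functions are not taken from the driver, and the 2-second timeout from the text is passed in by the caller rather than hard-coded.

```c
#include <stdbool.h>

typedef bool (*locked_fn)(void *ctx);        /* hypothetical: reads lock status */
typedef void (*sleep_fn)(unsigned int ms);   /* hypothetical: sleeps between polls */

/* Example status stub: reports "locked" once *ctx polls have been used up. */
bool locks_after_n_polls(void *ctx)
{
    unsigned int *remaining = ctx;

    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}

/* Example sleep stub that returns immediately, so the sketch runs instantly. */
void no_sleep(unsigned int ms)
{
    (void)ms;
}

/*
 * Poll is_locked() every interval_ms until it reports true or timeout_ms
 * has elapsed. Returns 0 on lock, -1 on timeout or invalid interval.
 */
int wait_for_lock(locked_fn is_locked, void *ctx, sleep_fn sleep_ms,
                  unsigned int timeout_ms, unsigned int interval_ms)
{
    unsigned int waited = 0;

    if (interval_ms == 0)
        return -1; /* a zero interval would never make progress */

    for (;;) {
        if (is_locked(ctx))
            return 0;
        if (waited >= timeout_ms)
            return -1;
        sleep_ms(interval_ms);
        waited += interval_ms;
    }
}
```

A caller modelling the behaviour described above would pass `timeout_ms = 2000` and run its initialization only when the function returns 0.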
2021-02-17 | net: stmmac: dwmac-sun8i: Add a shutdown callback | Samuel Holland | 1 | -0/+10
The Ethernet MAC and PHY are usually major consumers of power on boards which may not be able to fully power off (those with no PMIC). Powering down the MAC and internal PHY saves power while these boards are "off". Reviewed-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Samuel Holland <samuel@sholland.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17 | net: stmmac: dwmac-sun8i: Minor probe function cleanup | Samuel Holland | 1 | -1/+3
Adjust the spacing and use an explicit "return 0" in the success path to make the function easier to parse. Reviewed-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Samuel Holland <samuel@sholland.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17 | net: stmmac: dwmac-sun8i: Use reset_control_reset | Samuel Holland | 1 | -4/+4
Use the appropriate function instead of reimplementing it, and update the error message to match the code. Reviewed-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Samuel Holland <samuel@sholland.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17 | net: stmmac: dwmac-sun8i: Remove unnecessary PHY power check | Samuel Holland | 1 | -4/+2
sun8i_dwmac_unpower_internal_phy already checks if the PHY is powered, so there is no need to do it again here. Reviewed-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Samuel Holland <samuel@sholland.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17 | net: stmmac: dwmac-sun8i: Return void from PHY unpower | Samuel Holland | 1 | -3/+2
This is a deinitialization function that always returned zero, and that return value was always ignored. Have it return void instead. Reviewed-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Samuel Holland <samuel@sholland.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17 | r8169: use macro pm_ptr | Heiner Kallweit | 1 | -3/+1
Use macro pm_ptr(), this helps to avoid some ifdeffery. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17 | net: mdio: Remove of_phy_attach() | Florian Fainelli | 3 | -41/+1
We have no in-tree users, also update the sfp-phylink.rst documentation to indicate that phy_attach_direct() is used instead of of_phy_attach(). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17 | net: mscc: ocelot: select PACKING in the Kconfig | Vladimir Oltean | 1 | -0/+1
Ocelot now uses include/linux/dsa/ocelot.h which makes use of CONFIG_PACKING to pack/unpack bits into the Injection/Extraction Frame Headers. So it needs to explicitly select it, otherwise there might be build errors due to the missing dependency. Fixes: 40d3f295b5fe ("net: mscc: ocelot: use common tag parsing code with DSA") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17 | sched,x86: Allow !PREEMPT_DYNAMIC | Peter Zijlstra | 1 | -6/+18
Allow building x86 with PREEMPT_DYNAMIC=n, this is needed for PREEMPT_RT as it makes no sense to not have full preemption on PREEMPT_RT. Fixes: 8c98e8cf723c ("preempt/dynamic: Provide preempt_schedule[_notrace]() static calls") Reported-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Mike Galbraith <efault@gmx.de> Link: https://lkml.kernel.org/r/YCK1+JyFNxQnWeXK@hirez.programming.kicks-ass.net
2021-02-17 | entry/kvm: Explicitly flush pending rcuog wakeup before last rescheduling point | Frederic Weisbecker | 4 | -10/+50
Following the idle loop model, cleanly check for pending rcuog wakeup before the last rescheduling point upon resuming to guest mode. This way we can avoid doing it from rcu_user_enter() with the last-resort self-IPI hack that enforces rescheduling. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210131230548.32970-6-frederic@kernel.org
2021-02-17 | entry: Explicitly flush pending rcuog wakeup before last rescheduling point | Frederic Weisbecker | 2 | -5/+14
Following the idle loop model, cleanly check for pending rcuog wakeup before the last rescheduling point on resuming to user mode. This way we can avoid doing it from rcu_user_enter() with the last-resort self-IPI hack that enforces rescheduling. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210131230548.32970-5-frederic@kernel.org
2021-02-17 | rcu/nocb: Trigger self-IPI on late deferred wake up before user resume | Frederic Weisbecker | 3 | -11/+37
Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP kthread (rcuog) to be serviced. Unfortunately the call to rcu_user_enter() is already past the last rescheduling opportunity before we resume to userspace or to guest mode. We may escape there with the woken task ignored. The last resort that covers every call site is to trigger a self-IPI (nohz_full depends on arch to implement arch_irq_work_raise()) that will trigger a reschedule on IRQ tail or guest exit. Eventually every site that wants saner treatment will need to carefully place a call to rcu_nocb_flush_deferred_wakeup() before the last explicit need_resched() check upon resume. Fixes: 96d3fd0d315a (rcu: Break call_rcu() deadlock involving scheduler and perf) Reported-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210131230548.32970-4-frederic@kernel.org
2021-02-17 | rcu/nocb: Perform deferred wake up before last idle's need_resched() check | Frederic Weisbecker | 4 | -3/+8
Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP kthread (rcuog) to be serviced. Usually a local wake up happening while running the idle task is handled in one of the need_resched() checks carefully placed within the idle loop that can break to the scheduler. Unfortunately the call to rcu_idle_enter() is already beyond the last generic need_resched() check and we may halt the CPU with a resched request unhandled, leaving the task hanging. Fix this by splitting the rcuog wakeup handling from rcu_idle_enter() and placing it before the last generic need_resched() check in the idle loop. It is then assumed that no call to call_rcu() will be performed after that in the idle loop until the CPU is put in low power mode. Fixes: 96d3fd0d315a (rcu: Break call_rcu() deadlock involving scheduler and perf) Reported-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210131230548.32970-3-frederic@kernel.org
2021-02-17 | rcu: Pull deferred rcuog wake up to rcu_eqs_enter() callers | Frederic Weisbecker | 1 | -1/+10
Deferred wakeup of rcuog kthreads upon RCU idle mode entry is going to be handled differently depending on whether it is initiated by idle, user or guest. Prepare by pulling that control up to rcu_eqs_enter() callers. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210131230548.32970-2-frederic@kernel.org
2021-02-17 | sched/features: Distinguish between NORMAL and DEADLINE hrtick | Juri Lelli | 5 | -7/+30
The HRTICK feature has traditionally been servicing configurations that need precise preemption points for NORMAL tasks. More recently, the feature has been extended to also service DEADLINE tasks with stringent runtime enforcement needs (e.g., runtime < 1ms with HZ=1000). Enabling the HRTICK sched feature currently enables the additional timer and task tick for both classes, which might introduce undesired overhead for no additional benefit if one needed it only for one of the cases. Separate the HRTICK sched feature in two (and leave the traditional case name unmodified) so that it can be selectively enabled when needed.

With:

  $ echo HRTICK > /sys/kernel/debug/sched_features

the NORMAL/fair hrtick gets enabled.

With:

  $ echo HRTICK_DL > /sys/kernel/debug/sched_features

the DEADLINE hrtick gets enabled.

Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20210208073554.14629-3-juri.lelli@redhat.com
2021-02-17 | sched/features: Fix hrtick reprogramming | Juri Lelli | 2 | -5/+4
Hung tasks and RCU stall cases were reported on systems which were not 100% busy. Investigation of such unexpected cases (no sign of potential starvation caused by tasks hogging the system) pointed out that the periodic sched tick timer wasn't serviced anymore after a certain point and that caused all machinery that depends on it (timers, RCU, etc.) to stop working as well. This issue was however only reproducible if HRTICK was enabled. Looking at core dumps it was found that the rbtree of the hrtimer base used also for the hrtick was corrupted (i.e. next as seen from the base root and actual leftmost obtained by traversing the tree are different). The same base is also used for the periodic tick hrtimer, which might get "lost" if the rbtree gets corrupted. Much like what is described in commit 1f71addd34f4c ("tick/sched: Do not mess with an enqueued hrtimer"), there is a race window between hrtimer_set_expires() in hrtick_start() and hrtimer_start_expires() in __hrtick_restart() in which the former might be operating on an already queued hrtick hrtimer, which might lead to corruption of the base. Use hrtick_start() (which removes the timer before enqueuing it back) to ensure hrtick hrtimer reprogramming is entirely guarded by the base lock, so that no race conditions can occur. Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20210208073554.14629-2-juri.lelli@redhat.com
2021-02-17 | sched/deadline: Reduce rq lock contention in dl_add_task_root_domain() | Dietmar Eggemann | 1 | -4/+7
dl_add_task_root_domain() is called during sched domain rebuild:

  rebuild_sched_domains_locked()
    partition_and_rebuild_sched_domains()
      rebuild_root_domains()
        for all top_cpuset descendants:
          update_tasks_root_domain()
            for all tasks of cpuset:
              dl_add_task_root_domain()

Change it so that only the task pi lock is taken to check if the task has a SCHED_DEADLINE (DL) policy. In case that p is a DL task take the rq lock as well to be able to safely de-reference root domain's DL bandwidth structure. Most of the tasks will have another policy (namely SCHED_NORMAL) and can now bail without taking the rq lock. One thing to note here: Even in case that there aren't any DL user tasks, a slow frequency switching system with cpufreq gov schedutil has a DL task (sugov) per frequency domain running which participates in DL bandwidth management.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Quentin Perret <qperret@google.com> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Acked-by: Juri Lelli <juri.lelli@redhat.com> Link: https://lkml.kernel.org/r/20210119083542.19856-1-dietmar.eggemann@arm.com
2021-02-17 | uprobes: (Re)add missing get_uprobe() in __find_uprobe() | Sven Schnelle | 1 | -1/+1
commit c6bc9bd06dff ("rbtree, uprobes: Use rbtree helpers") accidentally removed the refcount increase. Add it again. Fixes: c6bc9bd06dff ("rbtree, uprobes: Use rbtree helpers") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20210209150711.36778-1-svens@linux.ibm.com
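A lookup that hands an object to its caller normally takes a reference on the caller's behalf, so dropping that increment leaves the caller holding a pointer it does not own. The sketch below illustrates the convention with a plain array instead of an rbtree; `struct obj`, `get_obj()` and `find_obj()` are hypothetical names, and the counter is a plain int rather than an atomic refcount.

```c
#include <stddef.h>

struct obj {
    int key;
    int refs; /* simplified, non-atomic reference count */
};

/* Take a reference and return the same pointer, so calls can be chained. */
struct obj *get_obj(struct obj *o)
{
    if (o)
        o->refs++;
    return o;
}

/*
 * Look up `key` and return the match WITH a reference held for the caller.
 * Returning &table[i] directly would compile and appear to work, but the
 * caller's later "put" would then drop a reference that was never taken.
 */
struct obj *find_obj(struct obj *table, size_t n, int key)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].key == key)
            return get_obj(&table[i]);
    }
    return NULL;
}
```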