aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/md/dm-round-robin.c (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2019-12-05dm btree: increase rebalance threshold in __rebalance2()Hou Tao1-1/+7
We got the following warnings from thin_check during thin-pool setup: $ thin_check /dev/vdb examining superblock examining devices tree missing devices: [1, 84] too few entries in btree_node: 41, expected at least 42 (block 138, max_entries = 126) examining mapping tree The phenomenon is the number of entries in one node of details_info tree is less than (max_entries / 3). And it can be easily reproduced by the following procedures: $ new a thin pool $ presume the max entries of details_info tree is 126 $ new 127 thin devices (e.g. 1~127) to make the root node being full and then split $ remove the first 43 (e.g. 1~43) thin devices to make the children reblance repeatedly $ stop the thin pool $ thin_check The root cause is that the B-tree removal procedure in __rebalance2() doesn't guarantee the invariance: the minimal number of entries in non-root node should be >= (max_entries / 3). Simply fix the problem by increasing the rebalance threshold to make sure the number of entries in each child will be greater than or equal to (max_entries / 3 + 1), so no matter which child is used for removal, the number will still be valid. Cc: stable@vger.kernel.org Signed-off-by: Hou Tao <houtao1@huawei.com> Acked-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-26dm: add dm-clone to the documentation indexDiego Calleja1-0/+1
Fixes: 7431b7835f554 ("dm: add clone target") Signed-off-by: Diego Calleja <diegocg@gmail.com> Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-26dm mpath: remove harmful bio-based optimizationMike Snitzer1-36/+1
Removes the branching for edge-case where no SCSI device handler exists. The __map_bio_fast() method was far too limited, by only selecting a new pathgroup or path IFF there was a path failure, fix this be eliminating it in favor of __map_bio(). __map_bio()'s extra SCSI device handler specific MPATHF_PG_INIT_REQUIRED test is not in the fast path anyway. This change restores full path selector functionality for bio-based configurations that don't haave a SCSI device handler. But it should be noted that the path selectors do have an impact on performance for certain networks that are extremely fast (and don't require frequent switching). Fixes: 8d47e65948dd ("dm mpath: remove unnecessary NVMe branching in favor of scsi_dh checks") Cc: stable@vger.kernel.org Reported-by: Drew Hastings <dhastings@crucialwebhost.com> Suggested-by: Martin Wilck <mwilck@suse.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-20Revert "dm crypt: use WQ_HIGHPRI for the IO and crypt workqueues"Mike Snitzer1-6/+3
This reverts commit a1b89132dc4f61071bdeaab92ea958e0953380a1. Revert required hand-patching due to subsequent changes that were applied since commit a1b89132dc4f61071bdeaab92ea958e0953380a1. Requires: ed0302e83098d ("dm crypt: make workqueue names device-specific") Cc: stable@vger.kernel.org Bug: https://bugzilla.kernel.org/show_bug.cgi?id=199857 Reported-by: Vito Caputo <vcaputo@pengaru.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-20dm: Fix Kconfig indentationKrzysztof Kozlowski1-27/+27
Adjust indentation from spaces to tab (+optional two spaces) as in coding style with command like: $ sed -e 's/^ /\t/' -i */Kconfig Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-18dm thin: wakeup worker only when deferred bios existJeffle Xu1-1/+4
Single thread fio test (read, bs=4k, ioengine=libaio, iodepth=128, numjobs=1) over dm-thin device has poor performance versus bare nvme device. Further investigation with perf indicates that queue_work_on() consumes over 20% CPU time when doing IO over dm-thin device. The call stack is as follows. - 40.57% thin_map + 22.07% queue_work_on + 9.95% dm_thin_find_block + 2.80% cell_defer_no_holder 1.91% inc_all_io_entry.isra.33.part.34 + 1.78% bio_detain.isra.35 In cell_defer_no_holder(), wakeup_worker() is always called, no matter whether the tc->deferred_bio_list list is empty or not. In single thread IO model, this list is most likely empty. So skip waking up worker thread if tc->deferred_bio_list list is empty. Single thread IO performance improves from 448 MiB/s to 646 MiB/s (+44%) once the needless wake_worker() calls are properly skipped. Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-15dm integrity: fix excessive alignment of metadata runsMikulas Patocka2-5/+28
Metadata runs are supposed to be aligned on 4k boundary (so that they work efficiently with disks with 4k sectors). However, there was a programming bug that makes them aligned on 128k boundary instead. The unused space is wasted. Fix this bug by providing a proper 4k alignment. In order to keep existing volumes working, we introduce a new flag SB_FLAG_FIXED_PADDING - when the flag is clear, we calculate the padding the old way. In order to make sure that the old version cannot mount the volume created by the new version, we increase superblock version to 4. Also in order to not break with old integritysetup, we fix alignment only if the parameter "fix_padding" is present when formatting the device. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-07dm raid: Remove unnecessary negation of a shift in raid10_format_to_md_layoutNathan Chancellor1-1/+0
When building with Clang + -Wtautological-constant-compare: drivers/md/dm-raid.c:619:8: warning: converting the result of '<<' to a boolean always evaluates to true [-Wtautological-constant-compare] r = !RAID10_OFFSET; ^ drivers/md/dm-raid.c:517:28: note: expanded from macro 'RAID10_OFFSET' #define RAID10_OFFSET (1 << 16) /* stripes with data copies area adjacent on devices */ ^ 1 warning generated. Negating a non-zero number will always make it zero, which is the default value of r in this function so this statement is unnecessary; remove it so that clang no longer warns. Link: https://github.com/ClangBuiltLinux/linux/issues/753 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-07dm zoned: reduce overhead of backing device checksDmitry Fomichev4-32/+61
Commit 75d66ffb48efb3 added backing device health checks and as a part of these checks, check_events() block ops template call is invoked in dm-zoned mapping path as well as in reclaim and flush path. Calling check_events() with ATA or SCSI backing devices introduces a blocking scsi_test_unit_ready() call being made in sd_check_events(). Even though the overhead of calling scsi_test_unit_ready() is small for ATA zoned devices, it is much larger for SCSI and it affects performance in a very negative way. Fix this performance regression by executing check_events() only in case of any I/O errors. The function dmz_bdev_is_dying() is modified to call only blk_queue_dying(), while calls to check_events() are made in a new helper function, dmz_check_bdev(). Reported-by: zhangxiaoxu <zhangxiaoxu5@huawei.com> Fixes: 75d66ffb48efb3 ("dm zoned: properly handle backing device failure") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05dm dust: add limited write failure modeBryan Gurney1-7/+46
Add a limited write failure mode which allows a write to a block to fail a specified amount of times, prior to remapping. The "addbadblock" message is extended to allow specifying the limited number of times a write fails. Example: add bad block on block 60, with 5 write failures: dmsetup message 0 dust1 addbadblock 60 5 The write failure counter will be printed for newly added bad blocks. Signed-off-by: Bryan Gurney <bgurney@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05dm dust: change ret to r in dust_map_read and dust_mapBryan Gurney1-7/+7
In the dust_map_read() and dust_map() functions, change the return code variable "ret" to "r", to match the convention of the other device-mapper targets. Signed-off-by: Bryan Gurney <bgurney@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05dm dust: change result vars to rBryan Gurney1-16/+16
Change the "result" variables to "r" in dust_status() and dust_message(). Signed-off-by: Bryan Gurney <bgurney@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05dm cache: replace spin_lock_irqsave with spin_lock_irqMikulas Patocka1-49/+28
If we are in a place where it is known that interrupts are enabled, functions spin_lock_irq/spin_unlock_irq should be used instead of spin_lock_irqsave/spin_unlock_irqrestore. spin_lock_irq and spin_unlock_irq are faster because they don't need to push and pop the flags register. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05dm bio prison: replace spin_lock_irqsave with spin_lock_irqMikulas Patocka2-33/+20
Replace spin_lock_irqsave/irqrestore with spin_lock_irq/spin_unlock_irq. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05dm thin: replace spin_lock_irqsave with spin_lock_irqMikulas Patocka1-67/+46
If we are in a place where it is known that interrupts are enabled, functions spin_lock_irq/spin_unlock_irq should be used instead of spin_lock_irqsave/spin_unlock_irqrestore. spin_lock_irq and spin_unlock_irq are faster because they don't need to push and pop the flags register. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05dm clone: add bucket_lock_irq/bucket_unlock_irq helpersNikos Tsironis1-15/+19
Introduce bucket_lock_irq() and bucket_unlock_irq() helpers and use them in places where it is known that interrupts are enabled. Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05dm clone: replace spin_lock_irqsave with spin_lock_irqMikulas Patocka3-34/+27
If we are in a place where it is known that interrupts are enabled, functions spin_lock_irq/spin_unlock_irq should be used instead of spin_lock_irqsave/spin_unlock_irqrestore. spin_lock_irq and spin_unlock_irq are faster because they don't need to push and pop the flags register. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05dm writecache: handle REQ_FUAMaged Mokhtar1-1/+2
Call writecache_flush() on REQ_FUA in writecache_map(). Cc: stable@vger.kernel.org # 4.18+ Signed-off-by: Maged Mokhtar <mmokhtar@petasan.org> Acked-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05dm writecache: fix uninitialized variable warningMikulas Patocka1-1/+1
This fixes coverity warning CID 1454301. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05dm stripe: use struct_size() in kmalloc()Gustavo A. R. Silva2-17/+1
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct stripe_c { ... struct stripe stripe[0]; }; In this case alloc_context() and dm_array_too_big() are removed and replaced by the direct use of the struct_size() helper in kmalloc(). Notice that open-coded form is prone to type mistakes. This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05dm raid: streamline rs_get_progress() and its raid_status() caller sideHeinz Mauelshagen1-27/+20
Pass already deciphered state into rs_get_progress, simplify recovery offset definition and combine two st_resync, st_reshape conditionals into one as is already the case with st_check and st_repair. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05dm raid: simplify rs_setup_recovery call chainHeinz Mauelshagen1-21/+6
rs_setup_recovery() sets the starting recovery offset. Drop superfluous rs_setup_recovery() and replace with __rs_setup_recovery(). Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05dm raid: to ensure resynchronization, perform raid set grow in preresumeHeinz Mauelshagen2-21/+62
This fixes a flaw causing raid set extensions not to be synchronized in case the MD bitmap resize required additional pages to be allocated. Also share resize code in the raid constructor between new size changes and those occuring during recovery. Bump the target version to define the change and document it in Documentation/admin-guide/device-mapper/dm-raid.rst. Reported-by: Steve D <steved424@gmail.com> Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05dm raid: change rs_set_dev_and_array_sectors API and callersHeinz Mauelshagen1-9/+5
Add a size argument to rs_set_dev_and_array_sectors as prerequisite to fixing grown device resynchronization not occuring when new MD bitmap pages have to be allocated as a result of the extension in a follwup patch. Also avoid code duplication by using rs_set_rdev_sectors in the aforementioned function. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05dm table: do not allow request-based DM to stack on partitionsMike Snitzer1-19/+8
Partitioned request-based devices cannot be used as underlying devices for request-based DM because no partition offsets are added to each incoming request. As such, until now, stacking on partitioned devices would _always_ result in data corruption (e.g. wiping the partition table, writing to other partitions, etc). Fix this by disallowing request-based stacking on partitions. While at it, since all .request_fn support has been removed from block core, remove legacy dm-table code that differentiated between blk-mq and .request_fn request-based. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-03Linux 5.4-rc6Linus Torvalds1-1/+1
2019-11-01net: fix installing orphaned programsJakub Kicinski1-1/+2
When netdevice with offloaded BPF programs is destroyed the programs are orphaned and removed from the program IDA - their IDs get released (the programs may remain accessible via existing open file descriptors and pinned files). After IDs are released they are set to 0. This confuses dev_change_xdp_fd() because it compares the __dev_xdp_query() result where 0 means no program with prog->aux->id where 0 means orphaned. dev_change_xdp_fd() would have incorrectly returned success even though it had not installed the program. Since drivers already catch this case via bpf_offload_dev_match() let them handle this case. The error message drivers produce in this case ("program loaded for a different device") is in fact correct as the orphaned program must had to be loaded for a different device. Fixes: c14a9f633d9e ("net: Don't call XDP_SETUP_PROG when nothing is changed") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01net: cls_bpf: fix NULL deref on offload filter removalJakub Kicinski1-2/+6
Commit 401192113730 ("net: sched: refactor block offloads counter usage") missed the fact that either new prog or old prog may be NULL. Fixes: 401192113730 ("net: sched: refactor block offloads counter usage") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01selftests: bpf: Skip write only files in debugfsJakub Kicinski1-0/+5
DebugFS for netdevsim now contains some "action trigger" files which are write only. Don't try to capture the contents of those. Note that we can't use os.access() because the script requires root. Fixes: 4418f862d675 ("netdevsim: implement support for devlink region and snapshots") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01selftests: net: reuseport_dualstack: fix uninitalized parameterWei Wang1-1/+2
This test reports EINVAL for getsockopt(SOL_SOCKET, SO_DOMAIN) occasionally due to the uninitialized length parameter. Initialize it to fix this, and also use int for "test_family" to comply with the API standard. Fixes: d6a61f80b871 ("soreuseport: test mixed v4/v6 sockets") Reported-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Wei Wang <weiwan@google.com> Cc: Craig Gallek <cgallek@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01r8169: fix wrong PHY ID issue with RTL8168dpHeiner Kallweit1-0/+4
As reported in [0] at least one RTL8168dp version has problems establishing a link. This chip version has an integrated RTL8211b PHY, however the chip seems to report a wrong PHY ID, resulting in a wrong PHY driver (for Generic Realtek PHY) being loaded. Work around this issue by adding a hook to r8168dp_2_mdio_read() for returning the correct PHY ID. [0] https://bbs.archlinux.org/viewtopic.php?id=246508 Fixes: 242cd9b5866a ("r8169: use phy_resume/phy_suspend") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01net: dsa: bcm_sf2: Fix IMP setup for port different than 8Florian Fainelli1-15/+21
Since it became possible for the DSA core to use a CPU port different than 8, our bcm_sf2_imp_setup() function was broken because it assumes that registers are applicable to port 8. In particular, the port's MAC is going to stay disabled, so make sure we clear the RX_DIS and TX_DIS bits if we are not configured for port 8. Fixes: 9f91484f6fcc ("net: dsa: make "label" property optional for dsa2") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01net: phylink: Fix phylink_dbg() macroFlorian Fainelli1-0/+16
The phylink_dbg() macro does not follow dynamic debug or defined(DEBUG) and as a result, it spams the kernel log since a PR_DEBUG level is currently used. Fix it to be defined appropriately whether CONFIG_DYNAMIC_DEBUG or defined(DEBUG) are set. Fixes: 17091180b152 ("net: phylink: Add phylink_{printk, err, warn, info, dbg} macros") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01gve: Fixes DMA synchronization.Yangchun Fu2-2/+24
Synces the DMA buffer properly in order for CPU and device to see the most up-to-data data. Signed-off-by: Yangchun Fu <yangchun@google.com> Reviewed-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01inet: stop leaking jiffies on the wireEric Dumazet5-6/+6
Historically linux tried to stick to RFC 791, 1122, 2003 for IPv4 ID field generation. RFC 6864 made clear that no matter how hard we try, we can not ensure unicity of IP ID within maximum lifetime for all datagrams with a given source address/destination address/protocol tuple. Linux uses a per socket inet generator (inet_id), initialized at connection startup with a XOR of 'jiffies' and other fields that appear clear on the wire. Thiemo Nagel pointed that this strategy is a privacy concern as this provides 16 bits of entropy to fingerprint devices. Let's switch to a random starting point, this is just as good as far as RFC 6864 is concerned and does not leak anything critical. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Thiemo Nagel <tnagel@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01ixgbe: Remove duplicate clear_bit() callIgor Pylypiv1-1/+0
__IXGBE_RX_BUILD_SKB_ENABLED bit is already cleared. Signed-off-by: Igor Pylypiv <igor.pylypiv@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-01Documentation: networking: device drivers: Remove stray asterisksJonathan Neuschäfer12-56/+56
These asterisks were once references to a line that said: "* Other names and brands may be claimed as the property of others." But now, they serve no purpose; they can only irritate the reader. Fixes: de3edab4276c ("e1000: update README for e1000") Fixes: a3fb65680f65 ("e100.txt: Cleanup license info in kernel doc") Fixes: da8c01c4502a ("e1000e.txt: Add e1000e documentation") Fixes: f12a84a9f650 ("Documentation: fm10k: Add kernel documentation") Fixes: b55c52b1938c ("igb.txt: Add igb documentation") Fixes: c4e9b56e2442 ("igbvf.txt: Add igbvf Documentation") Fixes: d7064f4c192c ("Documentation/networking/: Update Intel wired LAN driver documentation") Fixes: c4b8c01112a1 ("ixgbevf.txt: Update ixgbevf documentation") Fixes: 1e06edcc2f22 ("Documentation: i40e: Prepare documentation for RST conversion") Fixes: 105bf2fe6b32 ("i40evf: add driver to kernel build system") Fixes: 1fae869bcf3d ("Documentation: ice: Prepare documentation for RST conversion") Fixes: df69ba43217d ("ionic: Add basic framework for IONIC Network device driver") Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-01e1000: fix memory leaksWenwen Wang1-4/+3
In e1000_set_ringparam(), 'tx_old' and 'rx_old' are not deallocated if e1000_up() fails, leading to memory leaks. Refactor the code to fix this issue. Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-01i40e: Fix receive buffer starvation for AF_XDPJeff Kirsher1-5/+0
Magnus's fix to resolve a potential receive buffer starvation for AF_XDP got applied to both the i40e_xsk_umem_enable/disable() functions, when it should have only been applied to the "enable". So clean up the undesired code in the disable function. CC: Magnus Karlsson <magnus.karlsson@intel.com> Fixes: 1f459bdc2007 ("i40e: fix potential RX buffer starvation for AF_XDP") Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
2019-11-01igb: Fix constant media auto sense switching when no cable is connectedManfred Rudigier1-1/+2
At least on the i350 there is an annoying behavior that is maybe also present on 82580 devices, but was probably not noticed yet as MAS is not widely used. If no cable is connected on both fiber/copper ports the media auto sense code will constantly swap between them as part of the watchdog task and produce many unnecessary kernel log messages. The swap code responsible for this behavior (switching to fiber) should not be executed if the current media type is copper and there is no signal detected on the fiber port. In this case we can safely wait until the AUTOSENSE_EN bit is cleared. Signed-off-by: Manfred Rudigier <manfred.rudigier@omicronenergy.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-01net: ethernet: arc: add the missed clk_disable_unprepareChuhong Yuan1-0/+3
The remove misses to disable and unprepare priv->macclk like what is done when probe fails. Add the missed call in remove. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01NFS: Fix an RCU lock leak in nfs4_refresh_delegation_stateid()Trond Myklebust1-1/+1
A typo in nfs4_refresh_delegation_stateid() means we're leaking an RCU lock, and always returning a value of 'false'. As the function description states, we were always supposed to return 'true' if a matching delegation was found. Fixes: 12f275cdd163 ("NFSv4: Retry CLOSE and DELEGRETURN on NFS4ERR_OLD_STATEID.") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-11-01NFSv4: Don't allow a cached open with a revoked delegationTrond Myklebust3-5/+13
If the delegation is marked as being revoked, we must not use it for cached opens. Fixes: 869f9dfa4d6d ("NFSv4: Fix races between nfs_remove_bad_delegation() and delegation return") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-11-01arm64: apply ARM64_ERRATUM_843419 workaround for Brahma-B53 coreFlorian Fainelli2-3/+22
The Broadcom Brahma-B53 core is susceptible to the issue described by ARM64_ERRATUM_843419 so this commit enables the workaround to be applied when executing on that core. Since there are now multiple entries to match, we must convert the existing ARM64_ERRATUM_843419 into an erratum list and use cpucap_multi_entry_cap_matches to match our entries. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-11-01arm64: Brahma-B53 is SSB and spectre v2 safeFlorian Fainelli1-0/+2
Add the Brahma-B53 CPU (all versions) to the whitelists of CPUs for the SSB and spectre v2 mitigations. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-11-01arm64: apply ARM64_ERRATUM_845719 workaround for Brahma-B53 coreDoug Berger3-2/+16
The Broadcom Brahma-B53 core is susceptible to the issue described by ARM64_ERRATUM_845719 so this commit enables the workaround to be applied when executing on that core. Since there are now multiple entries to match, we must convert the existing ARM64_ERRATUM_845719 into an erratum list. Signed-off-by: Doug Berger <opendmb@gmail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-10-31igb: Enable media autosense for the i350.Manfred Rudigier2-2/+2
This patch enables the hardware feature "Media Auto Sense" also on the i350. It works in the same way as on the 82850 devices. Hardware designs using dual PHYs (fiber/copper) can enable this feature by setting the MAS enable bits in the NVM_COMPAT register (0x03) in the EEPROM. Signed-off-by: Manfred Rudigier <manfred.rudigier@omicronenergy.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-10-31igb/igc: Don't warn on fatal read failures when the device is removedLyude Paul2-2/+4
Fatal read errors are worth warning about, unless of course the device was just unplugged from the machine - something that's a rather normal occurrence when the igb/igc adapter is located on a Thunderbolt dock. So, let's only WARN() if there's a fatal read error while the device is still present. This fixes the following WARN splat that's been appearing whenever I unplug my Caldigit TS3 Thunderbolt dock from my laptop: igb 0000:09:00.0 enp9s0: PCIe link lost ------------[ cut here ]------------ igb: Failed to read reg 0x18! WARNING: CPU: 7 PID: 516 at drivers/net/ethernet/intel/igb/igb_main.c:756 igb_rd32+0x57/0x6a [igb] Modules linked in: igb dca thunderbolt fuse vfat fat elan_i2c mei_wdt mei_hdcp i915 wmi_bmof intel_wmi_thunderbolt iTCO_wdt iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp joydev coretemp crct10dif_pclmul crc32_pclmul i2c_algo_bit ghash_clmulni_intel intel_cstate drm_kms_helper intel_uncore syscopyarea sysfillrect sysimgblt fb_sys_fops intel_rapl_perf intel_xhci_usb_role_switch mei_me drm roles idma64 i2c_i801 ucsi_acpi typec_ucsi mei intel_lpss_pci processor_thermal_device typec intel_pch_thermal intel_soc_dts_iosf intel_lpss int3403_thermal thinkpad_acpi wmi int340x_thermal_zone ledtrig_audio int3400_thermal acpi_thermal_rel acpi_pad video pcc_cpufreq ip_tables serio_raw nvme nvme_core crc32c_intel uas usb_storage e1000e i2c_dev CPU: 7 PID: 516 Comm: kworker/u16:3 Not tainted 5.2.0-rc1Lyude-Test+ #14 Hardware name: LENOVO 20L8S2N800/20L8S2N800, BIOS N22ET35W (1.12 ) 04/09/2018 Workqueue: kacpi_hotplug acpi_hotplug_work_fn RIP: 0010:igb_rd32+0x57/0x6a [igb] Code: 87 b8 fc ff ff 48 c7 47 08 00 00 00 00 48 c7 c6 33 42 9b c0 4c 89 c7 e8 47 45 cd dc 89 ee 48 c7 c7 43 42 9b c0 e8 c1 94 71 dc <0f> 0b eb 08 8b 00 ff c0 75 b0 eb c8 44 89 e0 5d 41 5c c3 0f 1f 44 RSP: 0018:ffffba5801cf7c48 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff9e7956608840 RCX: 0000000000000007 RDX: 0000000000000000 RSI: ffffba5801cf7b24 RDI: ffff9e795e3d6a00 RBP: 0000000000000018 R08: 000000009dec4a01 R09: ffffffff9e61018f R10: 0000000000000000 R11: ffffba5801cf7ae5 R12: 00000000ffffffff R13: ffff9e7956608840 R14: ffff9e795a6f10b0 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff9e795e3c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000564317bc4088 CR3: 000000010e00a006 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: igb_release_hw_control+0x1a/0x30 [igb] igb_remove+0xc5/0x14b [igb] pci_device_remove+0x3b/0x93 device_release_driver_internal+0xd7/0x17e pci_stop_bus_device+0x36/0x75 pci_stop_bus_device+0x66/0x75 pci_stop_bus_device+0x66/0x75 pci_stop_and_remove_bus_device+0xf/0x19 trim_stale_devices+0xc5/0x13a ? __pm_runtime_resume+0x6e/0x7b trim_stale_devices+0x103/0x13a ? __pm_runtime_resume+0x6e/0x7b trim_stale_devices+0x103/0x13a acpiphp_check_bridge+0xd8/0xf5 acpiphp_hotplug_notify+0xf7/0x14b ? acpiphp_check_bridge+0xf5/0xf5 acpi_device_hotplug+0x357/0x3b5 acpi_hotplug_work_fn+0x1a/0x23 process_one_work+0x1a7/0x296 worker_thread+0x1a8/0x24c ? process_scheduled_works+0x2c/0x2c kthread+0xe9/0xee ? kthread_destroy_worker+0x41/0x41 ret_from_fork+0x35/0x40 ---[ end trace 252bf10352c63d22 ]--- Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: 47e16692b26b ("igb/igc: warn when fatal read failure happens") Acked-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Acked-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-10-31tcp: increase tcp_max_syn_backlog max valueEric Dumazet2-3/+6
tcp_max_syn_backlog default value depends on memory size and TCP ehash size. Before this patch, the max value was 2048 [1], which is considered too small nowadays. Increase it to 4096 to match the recent SOMAXCONN change. [1] This is with TCP ehash size being capped to 524288 buckets. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Yue Cao <ycao009@ucr.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-31net: increase SOMAXCONN to 4096Eric Dumazet2-3/+3
SOMAXCONN is /proc/sys/net/core/somaxconn default value. It has been defined as 128 more than 20 years ago. Since it caps the listen() backlog values, the very small value has caused numerous problems over the years, and many people had to raise it on their hosts after beeing hit by problems. Google has been using 1024 for at least 15 years, and we increased this to 4096 after TCP listener rework has been completed, more than 4 years ago. We got no complain of this change breaking any legacy application. Many applications indeed setup a TCP listener with listen(fd, -1); meaning they let the system select the backlog. Raising SOMAXCONN lowers chance of the port being unavailable under even small SYNFLOOD attack, and reduces possibilities of side channel vulnerabilities. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Yue Cao <ycao009@ucr.edu> Signed-off-by: David S. Miller <davem@davemloft.net>