linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2020-10-30	arm64/smp: Move rcu_cpu_starting() earlier	Qian Cai	1	-0/+1
	The call to rcu_cpu_starting() in secondary_start_kernel() is not early enough in the CPU-hotplug onlining process, which results in lockdep splats as follows: WARNING: suspicious RCU usage ----------------------------- kernel/locking/lockdep.c:3497 RCU-list traversed in non-reader section!! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 no locks held by swapper/1/0. Call trace: dump_backtrace+0x0/0x3c8 show_stack+0x14/0x60 dump_stack+0x14c/0x1c4 lockdep_rcu_suspicious+0x134/0x14c __lock_acquire+0x1c30/0x2600 lock_acquire+0x274/0xc48 _raw_spin_lock+0xc8/0x140 vprintk_emit+0x90/0x3d0 vprintk_default+0x34/0x40 vprintk_func+0x378/0x590 printk+0xa8/0xd4 __cpuinfo_store_cpu+0x71c/0x868 cpuinfo_store_cpu+0x2c/0xc8 secondary_start_kernel+0x244/0x318 This is avoided by moving the call to rcu_cpu_starting up near the beginning of the secondary_start_kernel() function. Signed-off-by: Qian Cai <cai@redhat.com> Acked-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/lkml/160223032121.7002.1269740091547117869.tip-bot2@tip-bot2/ Link: https://lore.kernel.org/r/20201028182614.13655-1-cai@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-10-30	drm/nouveau/kms/nv50-: Fix clock checking algorithm in nv50_dp_mode_valid()	Lyude Paul	1	-10/+11
	While I thought I had this correct (since it actually did reject modes like I expected during testing), Ville Syrjala from Intel pointed out that the logic here isn't correct. max_clock refers to the max data rate supported by the DP encoder. So, limiting it to the output of ds_clock (which refers to the maximum dotclock of the downstream DP device) doesn't make any sense. Additionally, since we're using the connector's bpc as the canonical BPC we should use this in mode_valid until we support dynamically setting the bpp based on bandwidth constraints. https://lists.freedesktop.org/archives/dri-devel/2020-September/280276.html For more info. So, let's rewrite this using Ville's advice. v2: * Ville pointed out I mixed up the dotclock and the link rate. So fix that... * ...and also rename all the variables in this function to be more appropriately labeled so I stop mixing them up. * Reuse the bpp from the connector for now until we have dynamic bpp selection. * Use use DIV_ROUND_UP for calculating the mode rate like i915 does, which we should also have been doing from the start Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: 409d38139b42 ("drm/nouveau/kms/nv50-: Use downstream DP clock limits for mode validation") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Lyude Paul <lyude@redhat.com> Cc: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-10-30	drm/nouveau/kms/nv50-: Get rid of bogus nouveau_conn_mode_valid()	Lyude Paul	2	-31/+21
	Ville also pointed out that I got a lot of the logic here wrong as well, whoops. While I don't think anyone's likely using 3D output with nouveau, the next patch will make nouveau_conn_mode_valid() make a lot less sense. So, let's just get rid of it and open-code it like before, while taking care to move the 3D frame packing calculations on the dot clock into the right place. Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: d6a9efece724 ("drm/nouveau/kms/nv50-: Share DP SST mode_valid() handling with MST") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.8+ Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-10-30	drm/nouveau/device: fix changing endianess code to work on older GPUs	Karol Herbst	1	-13/+26
	With this we try to detect if the endianess switch works and assume LE if not. Suggested by Ben. Fixes: 51c05340e407 ("drm/nouveau/device: detect if changing endianness failed") Signed-off-by: Karol Herbst <kherbst@redhat.com> Cc: <stable@vger.kernel.org> # v5.8+ Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-10-30	drm/nouveau/gem: fix "refcount_t: underflow; use-after-free"	Karol Herbst	1	-1/+2
	we can't use nouveau_bo_ref here as no ttm object was allocated and nouveau_bo_ref mainly deals with that. Simply deallocate the object. Signed-off-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-10-30	drm/nouveau/kms/nv50-: Program notifier offset before requesting disp caps	Lyude Paul	6	-5/+85
	Not entirely sure why this never came up when I originally tested this (maybe some BIOSes already have this setup?) but the ->caps_init vfunc appears to cause the display engine to throw an exception on driver init, at least on my ThinkPad P72: nouveau 0000:01:00.0: disp: chid 0 mthd 008c data 00000000 0000508c 0000102b This is magic nvidia speak for "You need to have the DMA notifier offset programmed before you can call NV507D_GET_CAPABILITIES." So, let's fix this by doing that, and also perform an update afterwards to prevent racing with the GPU when reading capabilities. v2: * Don't just program the DMA notifier offset, make sure to actually perform an update v3: * Don't call UPDATE() * Actually read the correct notifier fields, as apparently the CAPABILITIES_DONE field lives in a different location than the main NV_DISP_CORE_NOTIFIER_1 field. As well, 907d+ use a different CAPABILITIES_DONE field then pre-907d cards. v4: * Don't forget to check the return value of core507d_read_caps() v5: * Get rid of NV50_DISP_CAPS_NTFY[14], use NV50_DISP_CORE_NTFY * Disable notifier after calling GetCapabilities() Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: 4a2cb4181b07 ("drm/nouveau/kms/nv50-: Probe SOR and PIOR caps for DP interlacing support") Cc: <stable@vger.kernel.org> # v5.8+ Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-10-30	drm/nouveau/nouveau: fix the start/end range for migration	Ralph Campbell	1	-11/+3
	The user level OpenCL code shouldn't have to align start and end addresses to a page boundary. That is better handled in the nouveau driver. The npages field is also redundant since it can be computed from the start and end addresses. Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-10-29	soc: ti: ti_sci_pm_domains: check for proper args count in xlate	Tero Kristo	1	-1/+1
	K2G devices still only use single parameter for power-domains property, so check for this properly in the driver. Without this, every peripheral fails to probe resulting in boot failure. Link: https://lore.kernel.org/r/20201029093337.21170-1-t-kristo@ti.com Fixes: efa5c01cd7ee ("soc: ti: ti_sci_pm_domains: switch to use multiple genpds instead of one") Reported-by: Nishanth Menon <nm@ti.com> Signed-off-by: Tero Kristo <t-kristo@ti.com> Acked-by: Nishanth Menon <nm@ti.com> Acked-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-10-29	r8169: fix issue with forced threading in combination with shared interrupts	Heiner Kallweit	1	-2/+2
	As reported by Serge flag IRQF_NO_THREAD causes an error if the interrupt is actually shared and the other driver(s) don't have this flag set. This situation can occur if a PCI(e) legacy interrupt is used in combination with forced threading. There's no good way to deal with this properly, therefore we have to remove flag IRQF_NO_THREAD. For fixing the original forced threading issue switch to napi_schedule(). Fixes: 424a646e072a ("r8169: fix operation under forced interrupt threading") Link: https://www.spinics.net/lists/netdev/msg694960.html Reported-by: Serge Belyshev <belyshev@depni.sinp.msu.ru> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Tested-by: Serge Belyshev <belyshev@depni.sinp.msu.ru> Link: https://lore.kernel.org/r/b5b53bfe-35ac-3768-85bf-74d1290cf394@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-29	netem: fix zero division in tabledist	Aleksandr Nogikh	1	-1/+8
	Currently it is possible to craft a special netlink RTM_NEWQDISC command that can result in jitter being equal to 0x80000000. It is enough to set the 32 bit jitter to 0x02000000 (it will later be multiplied by 2^6) or just set the 64 bit jitter via TCA_NETEM_JITTER64. This causes an overflow during the generation of uniformly distributed numbers in tabledist(), which in turn leads to division by zero (sigma != 0, but sigma * 2 is 0). The related fragment of code needs 32-bit division - see commit 9b0ed89 ("netem: remove unnecessary 64 bit modulus"), so switching to 64 bit is not an option. Fix the issue by keeping the value of jitter within the range that can be adequately handled by tabledist() - [0;INT_MAX]. As negative std deviation makes no sense, take the absolute value of the passed value and cap it at INT_MAX. Inside tabledist(), switch to unsigned 32 bit arithmetic in order to prevent overflows. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Aleksandr Nogikh <nogikh@google.com> Reported-by: syzbot+ec762a6342ad0d3c0d8f@syzkaller.appspotmail.com Acked-by: Stephen Hemminger <stephen@networkplumber.org> Link: https://lore.kernel.org/r/20201028170731.1383332-1-aleksandrnogikh@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-29	ibmvnic: fix ibmvnic_set_mac	Lijun Pan	1	-2/+6
	Jakub Kicinski brought up a concern in ibmvnic_set_mac(). ibmvnic_set_mac() does this: ether_addr_copy(adapter->mac_addr, addr->sa_data); if (adapter->state != VNIC_PROBED) rc = __ibmvnic_set_mac(netdev, addr->sa_data); So if state == VNIC_PROBED, the user can assign an invalid address to adapter->mac_addr, and ibmvnic_set_mac() will still return 0. The fix is to validate ethernet address at the beginning of ibmvnic_set_mac(), and move the ether_addr_copy to the case of "adapter->state != VNIC_PROBED". Fixes: c26eba03e407 ("ibmvnic: Update reset infrastructure to support tunable parameters") Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Link: https://lore.kernel.org/r/20201027220456.71450-1-ljp@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-29	mptcp: add missing memory scheduling in the rx path	Paolo Abeni	1	-0/+10
	When moving the skbs from the subflow into the msk receive queue, we must schedule there the required amount of memory. Try to borrow the required memory from the subflow, if needed, so that we leverage the existing TCP heuristic. Fixes: 6771bfd9ee24 ("mptcp: update mptcp ack sequence from work queue") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Link: https://lore.kernel.org/r/f6143a6193a083574f11b00dbf7b5ad151bc4ff4.1603810630.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-29	drm/i915: Reject 90/270 degree rotated initial fbs	Ville Syrjälä	1	-0/+4
	We don't currently handle the initial fb readout correctly for 90/270 degree rotated scanout. Reject it. Cc: stable@vger.kernel.org Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201020194330.28568-1-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (cherry picked from commit a40a8305a732f4ecc2186ac7ca132ba062ed770d) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-10-29	drm/i915: Restore ILK-M RPS support	Ville Syrjälä	1	-0/+1
	Restore RPS for ILK-M. We lost it when an extra HAS_RPS() check appeared in intel_rps_enable(). Unfortunaltey this just makes the performance worse on my ILK because intel_ips insists on limiting the GPU freq to the minimum. If we don't do the RPS init then intel_ips will not limit the frequency for whatever reason. Either it can't get at some required information and thus makes wrong decisions, or we mess up some weights/etc. and cause it to make the wrong decisions when RPS init has been done, or the entire thing is just wrong. Would require a bunch of reverse engineering to figure out what's going on. Cc: stable@vger.kernel.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Fixes: 9c878557b1eb ("drm/i915/gt: Use the RPM config register to determine clk frequencies") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201021131443.25616-1-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (cherry picked from commit 2bf06370bcfb0dea5655e9a5ad460c7f7dca7739) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-10-29	drm/i915/region: fix max size calculation	Matthew Auld	3	-2/+79
	We are incorrectly limiting the max allocation size as per the mm max_order, which is effectively the largest power-of-two that we can fit in the region size. However, it's normal to setup the region or allocator with a non-power-of-two size(for example 3G), which we should already handle correctly, except it seems for the early too-big-check. v2: make sure we also exercise the I915_BO_ALLOC_CONTIGUOUS path, which is quite different, since for that we are actually limited by the largest power-of-two that we can fit within the region size. (Chris) Fixes: b908be543e44 ("drm/i915: support creating LMEM objects") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: CQ Tang <cq.tang@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20201021103606.241395-1-matthew.auld@intel.com (cherry picked from commit 83ebef47f8ebe320d5c5673db82f9903a4f40a69) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-10-29	include: jhash/signal: Fix fall-through warnings for Clang	Gustavo A. R. Silva	2	-0/+4
	In preparation to enable -Wimplicit-fallthrough for Clang, explicitly add break statements instead of letting the code fall through to the next case. This patch adds four break statements that, together, fix almost 40,000 warnings when building Linux 5.10-rc1 with Clang 12.0.0 and this[1] change reverted. Notice that in order to enable -Wimplicit-fallthrough for Clang, such change[1] is meant to be reverted at some point. So, this patch helps to move in that direction. Something important to mention is that there is currently a discrepancy between GCC and Clang when dealing with switch fall-through to empty case statements or to cases that only contain a break/continue/return statement[2][3][4]. Now that the -Wimplicit-fallthrough option has been globally enabled[5], any compiler should really warn on missing either a fallthrough annotation or any of the other case-terminating statements (break/continue/return/ goto) when falling through to the next case statement. Making exceptions to this introduces variation in case handling which may continue to lead to bugs, misunderstandings, and a general lack of robustness. The point of enabling options like -Wimplicit-fallthrough is to prevent human error and aid developers in spotting bugs before their code is even built/ submitted/committed, therefore eliminating classes of bugs. So, in order to really accomplish this, we should, and can, move in the direction of addressing any error-prone scenarios and get rid of the unintentional fallthrough bug-class in the kernel, entirely, even if there is some minor redundancy. Better to have explicit case-ending statements than continue to have exceptions where one must guess as to the right result. The compiler will eliminate any actual redundancy. [1] commit e2079e93f562c ("kbuild: Do not enable -Wimplicit-fallthrough for clang for now") [2] https://github.com/ClangBuiltLinux/linux/issues/636 [3] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91432 [4] https://godbolt.org/z/xgkvIh [5] commit a035d552a93b ("Makefile: Globally enable fall-through warning") Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-10-29	tipc: fix memory leak caused by tipc_buf_append()	Tung Nguyen	1	-3/+2
	Commit ed42989eab57 ("tipc: fix the skb_unshare() in tipc_buf_append()") replaced skb_unshare() with skb_copy() to not reduce the data reference counter of the original skb intentionally. This is not the correct way to handle the cloned skb because it causes memory leak in 2 following cases: 1/ Sending multicast messages via broadcast link The original skb list is cloned to the local skb list for local destination. After that, the data reference counter of each skb in the original list has the value of 2. This causes each skb not to be freed after receiving ACK: tipc_link_advance_transmq() { ... /* release skb */ __skb_unlink(skb, &l->transmq); kfree_skb(skb); <-- memory exists after being freed } 2/ Sending multicast messages via replicast link Similar to the above case, each skb cannot be freed after purging the skb list: tipc_mcast_xmit() { ... __skb_queue_purge(pkts); <-- memory exists after being freed } This commit fixes this issue by using skb_unshare() instead. Besides, to avoid use-after-free error reported by KASAN, the pointer to the fragment is set to NULL before calling skb_unshare() to make sure that the original skb is not freed after freeing the fragment 2 times in case skb_unshare() returns NULL. Fixes: ed42989eab57 ("tipc: fix the skb_unshare() in tipc_buf_append()") Acked-by: Jon Maloy <jmaloy@redhat.com> Reported-by: Thang Hoang Ngo <thang.h.ngo@dektech.com.au> Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Reviewed-by: Xin Long <lucien.xin@gmail.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Link: https://lore.kernel.org/r/20201027032403.1823-1-tung.q.nguyen@dektech.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-29	gtp: fix an use-before-init in gtp_newlink()	Masahiro Fujiwara	1	-8/+8
	_pdp_find() from gtp_encap_recv() would trigger a crash when a peer sends GTP packets while creating new GTP device. RIP: 0010:gtp1_pdp_find.isra.0+0x68/0x90 [gtp] <SNIP> Call Trace: <IRQ> gtp_encap_recv+0xc2/0x2e0 [gtp] ? gtp1_pdp_find.isra.0+0x90/0x90 [gtp] udp_queue_rcv_one_skb+0x1fe/0x530 udp_queue_rcv_skb+0x40/0x1b0 udp_unicast_rcv_skb.isra.0+0x78/0x90 __udp4_lib_rcv+0x5af/0xc70 udp_rcv+0x1a/0x20 ip_protocol_deliver_rcu+0xc5/0x1b0 ip_local_deliver_finish+0x48/0x50 ip_local_deliver+0xe5/0xf0 ? ip_protocol_deliver_rcu+0x1b0/0x1b0 gtp_encap_enable() should be called after gtp_hastable_new() otherwise _pdp_find() will access the uninitialized hash table. Fixes: 1e3a3abd8b28 ("gtp: make GTP sockets in gtp_newlink optional") Signed-off-by: Masahiro Fujiwara <fujiwara.masahiro@gmail.com> Link: https://lore.kernel.org/r/20201027114846.3924-1-fujiwara.masahiro@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-29	afs: Fix dirty-region encoding on ppc32 with 64K pages	David Howells	2	-9/+20
	The dirty region bounds stored in page->private on an afs page are 15 bits on a 32-bit box and can, at most, represent a range of up to 32K within a 32K page with a resolution of 1 byte. This is a problem for powerpc32 with 64K pages enabled. Further, transparent huge pages may get up to 2M, which will be a problem for the afs filesystem on all 32-bit arches in the future. Fix this by decreasing the resolution. For the moment, a 64K page will have a resolution determined from PAGE_SIZE. In the future, the page will need to be passed in to the helper functions so that the page size can be assessed and the resolution determined dynamically. Note that this might not be the ideal way to handle this, since it may allow some leakage of undirtied zero bytes to the server's copy in the case of a 3rd-party conflict. Fixing that would require a separately allocated record and is a more complicated fix. Fixes: 4343d00872e1 ("afs: Get rid of the afs_writeback record") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2020-10-29	afs: Fix afs_invalidatepage to adjust the dirty region	David Howells	4	-14/+79
	Fix afs_invalidatepage() to adjust the dirty region recorded in page->private when truncating a page. If the dirty region is entirely removed, then the private data is cleared and the page dirty state is cleared. Without this, if the page is truncated and then expanded again by truncate, zeros from the expanded, but no-longer dirty region may get written back to the server if the page gets laundered due to a conflicting 3rd-party write. It mustn't, however, shorten the dirty region of the page if that page is still mmapped and has been marked dirty by afs_page_mkwrite(), so a flag is stored in page->private to record this. Fixes: 4343d00872e1 ("afs: Get rid of the afs_writeback record") Signed-off-by: David Howells <dhowells@redhat.com>
2020-10-29	afs: Alter dirty range encoding in page->private	David Howells	2	-4/+4
	Currently, page->private on an afs page is used to store the range of dirtied data within the page, where the range includes the lower bound, but excludes the upper bound (e.g. 0-1 is a range covering a single byte). This, however, requires a superfluous bit for the last-byte bound so that on a 4KiB page, it can say 0-4096 to indicate the whole page, the idea being that having both numbers the same would indicate an empty range. This is unnecessary as the PG_private bit is clear if it's an empty range (as is PG_dirty). Alter the way the dirty range is encoded in page->private such that the upper bound is reduced by 1 (e.g. 0-0 is then specified the same single byte range mentioned above). Applying this to both bounds frees up two bits, one of which can be used in a future commit. This allows the afs filesystem to be compiled on ppc32 with 64K pages; without this, the following warnings are seen: ../fs/afs/internal.h: In function 'afs_page_dirty_to': ../fs/afs/internal.h:881:15: warning: right shift count >= width of type [-Wshift-count-overflow] 881 \| return (priv >> __AFS_PAGE_PRIV_SHIFT) & __AFS_PAGE_PRIV_MASK; \| ^~ ../fs/afs/internal.h: In function 'afs_page_dirty': ../fs/afs/internal.h:886:28: warning: left shift count >= width of type [-Wshift-count-overflow] 886 \| return ((unsigned long)to << __AFS_PAGE_PRIV_SHIFT) \| from; \| ^~ Fixes: 4343d00872e1 ("afs: Get rid of the afs_writeback record") Signed-off-by: David Howells <dhowells@redhat.com>
2020-10-29	afs: Wrap page->private manipulations in inline functions	David Howells	3	-34/+44
	The afs filesystem uses page->private to store the dirty range within a page such that in the event of a conflicting 3rd-party write to the server, we write back just the bits that got changed locally. However, there are a couple of problems with this: (1) I need a bit to note if the page might be mapped so that partial invalidation doesn't shrink the range. (2) There aren't necessarily sufficient bits to store the entire range of data altered (say it's a 32-bit system with 64KiB pages or transparent huge pages are in use). So wrap the accesses in inline functions so that future commits can change how this works. Also move them out of the tracing header into the in-directory header. There's not really any need for them to be in the tracing header. Signed-off-by: David Howells <dhowells@redhat.com>
2020-10-29	afs: Fix where page->private is set during write	David Howells	1	-15/+26
	In afs, page->private is set to indicate the dirty region of a page. This is done in afs_write_begin(), but that can't take account of whether the copy into the page actually worked. Fix this by moving the change of page->private into afs_write_end(). Fixes: 4343d00872e1 ("afs: Get rid of the afs_writeback record") Signed-off-by: David Howells <dhowells@redhat.com>
2020-10-29	afs: Fix page leak on afs_write_begin() failure	David Howells	1	-12/+11
	Fix the leak of the target page in afs_write_begin() when it fails. Fixes: 15b4650e55e0 ("afs: convert to new aops") Signed-off-by: David Howells <dhowells@redhat.com> cc: Nick Piggin <npiggin@gmail.com>
2020-10-29	afs: Fix to take ref on page when PG_private is set	David Howells	4	-26/+18
	Fix afs to take a ref on a page when it sets PG_private on it and to drop the ref when removing the flag. Note that in afs_write_begin(), a lot of the time, PG_private is already set on a page to which we're going to add some data. In such a case, we leave the bit set and mustn't increment the page count. As suggested by Matthew Wilcox, use attach/detach_page_private() where possible. Fixes: 31143d5d515e ("AFS: implement basic file write support") Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2020-10-29	cpufreq: schedutil: Always call driver if CPUFREQ_NEED_UPDATE_LIMITS is set	Rafael J. Wysocki	1	-2/+4
	Because sugov_update_next_freq() may skip a frequency update even if the need_freq_update flag has been set for the policy at hand, policy limits updates may not take effect as expected. For example, if the intel_pstate driver operates in the passive mode with HWP enabled, it needs to update the HWP min and max limits when the policy min and max limits change, respectively, but that may not happen if the target frequency does not change along with the limit at hand. In particular, if the policy min is changed first, causing the target frequency to be adjusted to it, and the policy max limit is changed later to the same value, the HWP max limit will not be updated to follow it as expected, because the target frequency is still equal to the policy min limit and it will not change until that limit is updated. To address this issue, modify get_next_freq() to let the driver callback run if the CPUFREQ_NEED_UPDATE_LIMITS cpufreq driver flag is set regardless of whether or not the new frequency to set is equal to the previous one. Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled") Reported-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Zhang Rui <rui.zhang@intel.com> Cc: 5.9+ <stable@vger.kernel.org> # 5.9+: 1c534352f47f cpufreq: Introduce CPUFREQ_NEED_UPDATE_LIMITS ... Cc: 5.9+ <stable@vger.kernel.org> # 5.9+: a62f68f5ca53 cpufreq: Introduce cpufreq_driver_test_flags() Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-10-29	cpufreq: Introduce cpufreq_driver_test_flags()	Rafael J. Wysocki	2	-0/+13
	Add a helper function to test the flags of the cpufreq driver in use againt a given flags mask. In particular, this will be needed to test the CPUFREQ_NEED_UPDATE_LIMITS cpufreq driver flag in the schedutil governor. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-10-29	arm64: Add workaround for Arm Cortex-A77 erratum 1508412	Rob Herring	13	-15/+66
	On Cortex-A77 r0p0 and r1p0, a sequence of a non-cacheable or device load and a store exclusive or PAR_EL1 read can cause a deadlock. The workaround requires a DMB SY before and after a PAR_EL1 register read. In addition, it's possible an interrupt (doing a device read) or KVM guest exit could be taken between the DMB and PAR read, so we also need a DMB before returning from interrupt and before returning to a guest. A deadlock is still possible with the workaround as KVM guests must also have the workaround. IOW, a malicious guest can deadlock an affected systems. This workaround also depends on a firmware counterpart to enable the h/w to insert DMB SY after load and store exclusive instructions. See the errata document SDEN-1152370 v10 [1] for more information. [1] https://static.docs.arm.com/101992/0010/Arm_Cortex_A77_MP074_Software_Developer_Errata_Notice_v10.pdf Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: kvmarm@lists.cs.columbia.edu Link: https://lore.kernel.org/r/20201028182839.166037-2-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-10-29	arm64: Add part number for Arm Cortex-A77	Rob Herring	1	-0/+2
	Add the MIDR part number info for the Arm Cortex-A77. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201028182839.166037-1-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-10-29	drm/vc4: Rework the structure conversion functions	Maxime Ripard	1	-6/+6
	Most of the helpers to retrieve vc4 structures from the DRM base structures rely on the fact that the first member of the vc4 structure is the DRM one and just cast the pointers between them. However, this is pretty fragile especially since there's no check to make sure that the DRM structure is indeed at the offset 0 in the structure, so let's use container_of to make it more robust. Signed-off-by: Maxime Ripard <maxime@cerno.tech> Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201028123752.1733242-1-maxime@cerno.tech
2020-10-29	drm/vc4: hdmi: Add a name to the codec DAI component	Maxime Ripard	1	-0/+1
	Since the components for a given device in ASoC are identified by their name, it makes sense to add one even though it's not strictly necessary. Signed-off-by: Maxime Ripard <maxime@cerno.tech> Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200708144555.718404-1-maxime@cerno.tech
2020-10-28	arm64: mte: Document that user PSTATE.TCO is ignored by kernel uaccess	Catalin Marinas	1	-1/+3
	On exception entry, the kernel explicitly resets the PSTATE.TCO (tag check override) so that any kernel memory accesses will be checked (the bit is restored on exception return). This has the side-effect that the uaccess routines will not honour the PSTATE.TCO that may have been set by the user prior to a syscall. There is no issue in practice since PSTATE.TCO is expected to be used only for brief periods in specific routines (e.g. garbage collection). To control the tag checking mode of the uaccess routines, the user will have to invoke a corresponding prctl() call. Document the kernel behaviour w.r.t. PSTATE.TCO accordingly. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Fixes: df9d7a22dd21 ("arm64: mte: Add Memory Tagging Extension documentation") Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Szabolcs Nagy <szabolcs.nagy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-10-28	ext4: indicate that fast_commit is available via /sys/fs/ext4/feature/...	Theodore Ts'o	1	-0/+2
	Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-28	ext4: use generic casefolding support	Daniel Rosenberg	5	-93/+17
	This switches ext4 over to the generic support provided in libfs. Since casefolded dentries behave the same in ext4 and f2fs, we decrease the maintenance burden by unifying them, and any optimizations will immediately apply to both. Signed-off-by: Daniel Rosenberg <drosen@google.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20201028050820.1636571-1-drosen@google.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-28	ext4: do not use extent after put_bh	yangerkun	1	-15/+15
	ext4_ext_search_right() will read more extent blocks and call put_bh after we get the information we need. However, ret_ex will break this and may cause use-after-free once pagecache has been freed. Fix it by copying the extent structure if needed. Signed-off-by: yangerkun <yangerkun@huawei.com> Link: https://lore.kernel.org/r/20201028055617.2569255-1-yangerkun@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2020-10-28	ext4: use IS_ERR() for error checking of path	Harshad Shirwadkar	1	-2/+4
	With this fix, fast commit recovery code uses IS_ERR() for path returned by ext4_find_extent. Fixes: 8016e29f4362 ("ext4: fast commit recovery path") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201027204342.2794949-1-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-28	ext4: fix mmap write protection for data=journal mode	Jan Kara	1	-2/+3
	Commit afb585a97f81 "ext4: data=journal: write-protect pages on j_submit_inode_data_buffers()") added calls ext4_jbd2_inode_add_write() to track inode ranges whose mappings need to get write-protected during transaction commits. However the added calls use wrong start of a range (0 instead of page offset) and so write protection is not necessarily effective. Use correct range start to fix the problem. Fixes: afb585a97f81 ("ext4: data=journal: write-protect pages on j_submit_inode_data_buffers()") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20201027132751.29858-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-28	jbd2: fix a kernel-doc markup	Mauro Carvalho Chehab	1	-1/+1
	The kernel-doc markup that documents _fc_replay_callback is missing an asterisk, causing this warning: ../include/linux/jbd2.h:1271: warning: Function parameter or member 'j_fc_replay_callback' not described in 'journal_s' When building the docs. Fixes: 609f928af48f ("jbd2: fast commit recovery path") Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/6055927ada2015b55b413cdd2670533bdc9a8da2.1603791716.git.mchehab+huawei@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-28	ext4: use s_mount_flags instead of s_mount_state for fast commit state	Harshad Shirwadkar	3	-15/+15
	Ext4's fast commit related transient states should use sb->s_mount_flags instead of persistent sb->s_mount_state. Fixes: 8016e29f4362 ("ext4: fast commit recovery path") Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201027044915.2553163-3-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-28	ext4: make num of fast commit blocks configurable	Harshad Shirwadkar	2	-2/+15
	This patch reserves a field in the jbd2 superblock for number of fast commit blocks. When this value is non-zero, Ext4 uses this field to set the number of fast commit blocks. Fixes: 6866d7b3f2bb ("ext4/jbd2: add fast commit initialization") Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201027044915.2553163-2-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-28	ext4: properly check for dirty state in ext4_inode_datasync_dirty()	Andrea Righi	1	-4/+6
	ext4_inode_datasync_dirty() needs to return 'true' if the inode is dirty, 'false' otherwise, but the logic seems to be incorrectly changed by commit aa75f4d3daae ("ext4: main fast-commit commit path"). This introduces a problem with swap files that are always failing to be activated, showing this error in dmesg: [ 34.406479] swapon: file is not committed Simple test case to reproduce the problem: # fallocate -l 8G swapfile # chmod 0600 swapfile # mkswap swapfile # swapon swapfile Fix the logic to return the proper state of the inode. Link: https://lore.kernel.org/lkml/20201024131333.GA32124@xps-13-7390 Fixes: 8016e29f4362 ("ext4: fast commit recovery path") Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201027044915.2553163-1-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-28	ext4: fix double locking in ext4_fc_commit_dentry_updates()	Harshad Shirwadkar	1	-1/+0
	Fixed double locking of sbi->s_fc_lock in the above function as reported by kernel-test-robot. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201023161339.1449437-1-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-28	cpufreq: speedstep: remove unneeded semicolon	Tom Rix	1	-1/+1
	A semicolon is not needed after a switch statement. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-10-28	module: use hidden visibility for weak symbol references	Ard Biesheuvel	1	-1/+1
	Geert reports that commit be2881824ae9eb92 ("arm64/build: Assert for unwanted sections") results in build errors on arm64 for configurations that have CONFIG_MODULES disabled. The commit in question added ASSERT()s to the arm64 linker script to ensure that linker generated sections such as .got.plt etc are empty, but as it turns out, there are corner cases where the linker does emit content into those sections. More specifically, weak references to function symbols (which can remain unsatisfied, and can therefore not be emitted as relative references) will be emitted as GOT and PLT entries when linking the kernel in PIE mode (which is the case when CONFIG_RELOCATABLE is enabled, which is on by default). What happens is that code such as struct device (fn)(struct device dev); struct device iommu_device; fn = symbol_get(mdev_get_iommu_device); if (fn) { iommu_device = fn(dev); essentially gets converted into the following when CONFIG_MODULES is off: struct device *iommu_device; if (&mdev_get_iommu_device) { iommu_device = mdev_get_iommu_device(dev); where mdev_get_iommu_device is emitted as a weak symbol reference into the object file. The first reference is decorated with an ordinary ABS64 data relocation (which yields 0x0 if the reference remains unsatisfied). However, the indirect call is turned into a direct call covered by a R_AARCH64_CALL26 relocation, which is converted into a call via a PLT entry taking the target address from the associated GOT entry. Given that such GOT and PLT entries are unnecessary for fully linked binaries such as the kernel, let's give these weak symbol references hidden visibility, so that the linker knows that the weak reference via R_AARCH64_CALL26 can simply remain unsatisfied. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Fangrui Song <maskray@google.com> Acked-by: Jessica Yu <jeyu@kernel.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20201027151132.14066-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-10-28	ARM: dts: stm32: Describe Vin power supply on stm32mp157c-edx board	Pascal Paillet	1	-0/+15
	Add description for Vin power supply and for peripherals that are supplied by Vin. Signed-off-by: Pascal Paillet <p.paillet@st.com> Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com>
2020-10-28	ARM: dts: stm32: Describe Vin power supply on stm32mp15xx-dkx board	Pascal Paillet	1	-0/+17
	Add description for Vin power supply and for peripherals that are supplied by Vin. Signed-off-by: Pascal Paillet <p.paillet@st.com> Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com>
2020-10-28	arm64: efi: increase EFI PE/COFF header padding to 64 KB	Ard Biesheuvel	1	-1/+1
	Commit 76085aff29f5 ("efi/libstub/arm64: align PE/COFF sections to segment alignment") increased the PE/COFF section alignment to match the minimum segment alignment of the kernel image, which ensures that the kernel does not need to be moved around in memory by the EFI stub if it was built as relocatable. However, the first PE/COFF section starts at _stext, which is only 4 KB aligned, and so the section layout is inconsistent. Existing EFI loaders seem to care little about this, but it is better to clean this up. So let's pad the header to 64 KB to match the PE/COFF section alignment. Fixes: 76085aff29f5 ("efi/libstub/arm64: align PE/COFF sections to segment alignment") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20201027073209.2897-2-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-10-28	arm64: vmlinux.lds: account for spurious empty .igot.plt sections	Ard Biesheuvel	1	-1/+1
	Now that we started making the linker warn about orphan sections (input sections that are not explicitly consumed by an output section), some configurations produce the following warning: aarch64-linux-gnu-ld: warning: orphan section `.igot.plt' from `arch/arm64/kernel/head.o' being placed in section `.igot.plt' It could be any file that triggers this - head.o is simply the first input file in the link - and the resulting .igot.plt section never actually appears in vmlinux as it turns out to be empty. So let's add .igot.plt to our collection of input sections to disregard unless they are empty. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20201028133332.5571-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-10-28	kselftest/arm64: Fix check_user_mem test	Vincenzo Frascino	1	-0/+4
	The check_user_mem test reports the error below because the test plan is not declared correctly: # Planned tests != run tests (0 != 4) Fix the test adding the correct test plan declaration. Fixes: 4dafc08d0ba4 ("kselftest/arm64: Check mte tagged user address in kernel") Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Gabor Kertesz <gabor.kertesz@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Link: https://lore.kernel.org/r/20201026121248.2340-7-vincenzo.frascino@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-10-28	kselftest/arm64: Fix check_ksm_options test	Vincenzo Frascino	1	-0/+4
	The check_ksm_options test reports the error below because the test plan is not declared correctly: # Planned tests != run tests (0 != 4) Fix the test adding the correct test plan declaration. Fixes: f981d8fa2646 ("kselftest/arm64: Verify KSM page merge for MTE pages") Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Gabor Kertesz <gabor.kertesz@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Link: https://lore.kernel.org/r/20201026121248.2340-6-vincenzo.frascino@arm.com Signed-off-by: Will Deacon <will@kernel.org>