aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2021-05-14arm64: Fix race condition on PG_dcache_clean in __sync_icache_dcache()Catalin Marinas1-1/+3
To ensure that instructions are observable in a new mapping, the arm64 set_pte_at() implementation cleans the D-cache and invalidates the I-cache to the PoU. As an optimisation, this is only done on executable mappings and the PG_dcache_clean page flag is set to avoid future cache maintenance on the same page. When two different processes map the same page (e.g. private executable file or shared mapping) there's a potential race on checking and setting PG_dcache_clean via set_pte_at() -> __sync_icache_dcache(). While on the fault paths the page is locked (PG_locked), mprotect() does not take the page lock. The result is that one process may see the PG_dcache_clean flag set but the I/D cache maintenance not yet performed. Avoid test_and_set_bit(PG_dcache_clean) in favour of separate test_bit() and set_bit(). In the rare event of a race, the cache maintenance is done twice. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: <stable@vger.kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Steven Price <steven.price@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210514095001.13236-1-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-05-14block/partitions/efi.c: Fix the efi_partition() kernel-doc headerBart Van Assche1-1/+1
Fix the following kernel-doc warning: block/partitions/efi.c:685: warning: wrong kernel-doc identifier on line: * efi_partition(struct parsed_partitions *state) Cc: Alexander Viro <viro@math.psu.edu> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20210513171708.8391-1-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-14blk-mq: Swap two calls in blk_mq_exit_queue()Bart Van Assche1-2/+4
If a tag set is shared across request queues (e.g. SCSI LUNs) then the block layer core keeps track of the number of active request queues in tags->active_queues. blk_mq_tag_busy() and blk_mq_tag_idle() update that atomic counter if the hctx flag BLK_MQ_F_TAG_QUEUE_SHARED is set. Make sure that blk_mq_exit_queue() calls blk_mq_tag_idle() before that flag is cleared by blk_mq_del_queue_tag_set(). Cc: Christoph Hellwig <hch@infradead.org> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Fixes: 0d2602ca30e4 ("blk-mq: improve support for shared tags maps") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210513171529.7977-1-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-14blk-mq: plug request for shared sbitmapMing Lei1-2/+3
In case of shared sbitmap, request won't be held in plug list any more sine commit 32bc15afed04 ("blk-mq: Facilitate a shared sbitmap per tagset"), this way makes request merge from flush plug list & batching submission not possible, so cause performance regression. Yanhui reports performance regression when running sequential IO test(libaio, 16 jobs, 8 depth for each job) in VM, and the VM disk is emulated with image stored on xfs/megaraid_sas. Fix the issue by recovering original behavior to allow to hold request in plug list. Cc: Yanhui Ma <yama@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: kashyap.desai@broadcom.com Fixes: 32bc15afed04 ("blk-mq: Facilitate a shared sbitmap per tagset") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210514022052.1047665-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-14io_uring: increase max number of reg buffersPavel Begunkov1-1/+3
Since recent changes instead of storing a large array of struct io_mapped_ubuf, we store pointers to them, that is 4 times slimmer and we should not to so worry about restricting max number of registererd buffer slots, increase the limit 4 times. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d3dee1da37f46da416aa96a16bf9e5094e10584d.1620990371.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-14io_uring: further remove sqpoll limits on opcodesPavel Begunkov1-4/+2
There are three types of requests that left disabled for sqpoll, namely epoll ctx, statx, and resources update. Since SQPOLL task is now closely mimics a userspace thread, remove the restrictions. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/909b52d70c45636d8d7897582474ea5aab5eed34.1620990306.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-14io_uring: fix ltout double free on completion racePavel Begunkov1-3/+4
Always remove linked timeout on io_link_timeout_fn() from the master request link list, otherwise we may get use-after-free when first io_link_timeout_fn() puts linked timeout in the fail path, and then will be found and put on master's free. Cc: stable@vger.kernel.org # 5.10+ Fixes: 90cd7e424969d ("io_uring: track link timeout's master explicitly") Reported-and-tested-by: syzbot+5a864149dd970b546223@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/69c46bf6ce37fec4fdcd98f0882e18eb07ce693a.1620990121.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-13tracing: Handle %.*s in trace_check_vprintf()Steven Rostedt (VMware)1-4/+27
If a trace event uses the %*.s notation, the trace_check_vprintf() will fail and will warn about a bad processing of strings, because it does not take into account the length field when processing the star (*) part. Have it handle this case as well. Link: https://lore.kernel.org/linux-nfs/238C0E2D-C2A4-4578-ADD2-C565B3B99842@oracle.com/ Reported-by: Chuck Lever III <chuck.lever@oracle.com> Fixes: 9a6944fee68e2 ("tracing: Add a verifier to check string pointers for trace events") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-05-13vt: Fix character height handling with VT_RESIZEXMaciej W. Rozycki3-25/+26
Restore the original intent of the VT_RESIZEX ioctl's `v_clin' parameter which is the number of pixel rows per character (cell) rather than the height of the font used. For framebuffer devices the two values are always the same, because the former is inferred from the latter one. For VGA used as a true text mode device these two parameters are independent from each other: the number of pixel rows per character is set in the CRT controller, while font height is in fact hardwired to 32 pixel rows and fonts of heights below that value are handled by padding their data with blanks when loaded to hardware for use by the character generator. One can change the setting in the CRT controller and it will update the screen contents accordingly regardless of the font loaded. The `v_clin' parameter is used by the `vgacon' driver to set the height of the character cell and then the cursor position within. Make the parameter explicit then, by defining a new `vc_cell_height' struct member of `vc_data', set it instead of `vc_font.height' from `v_clin' in the VT_RESIZEX ioctl, and then use it throughout the `vgacon' driver except where actual font data is accessed which as noted above is independent from the CRTC setting. This way the framebuffer console driver is free to ignore the `v_clin' parameter as irrelevant, as it always should have, avoiding any issues attempts to give the parameter a meaning there could have caused, such as one that has led to commit 988d0763361b ("vt_ioctl: make VT_RESIZEX behave like VT_RESIZE"): "syzbot is reporting UAF/OOB read at bit_putcs()/soft_cursor() [1][2], for vt_resizex() from ioctl(VT_RESIZEX) allows setting font height larger than actual font height calculated by con_font_set() from ioctl(PIO_FONT). Since fbcon_set_font() from con_font_set() allocates minimal amount of memory based on actual font height calculated by con_font_set(), use of vt_resizex() can cause UAF/OOB read for font data." The problem first appeared around Linux 2.5.66 which predates our repo history, but the origin could be identified with the old MIPS/Linux repo also at: <git://git.kernel.org/pub/scm/linux/kernel/git/ralf/linux.git> as commit 9736a3546de7 ("Merge with Linux 2.5.66."), where VT_RESIZEX code in `vt_ioctl' was updated as follows: if (clin) - video_font_height = clin; + vc->vc_font.height = clin; making the parameter apply to framebuffer devices as well, perhaps due to the use of "font" in the name of the original `video_font_height' variable. Use "cell" in the new struct member then to avoid ambiguity. References: [1] https://syzkaller.appspot.com/bug?id=32577e96d88447ded2d3b76d71254fb855245837 [2] https://syzkaller.appspot.com/bug?id=6b8355d27b2b94fb5cedf4655e3a59162d9e48e3 Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org # v2.6.12+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-13vt_ioctl: Revert VT_RESIZEX parameter handling removalMaciej W. Rozycki1-10/+47
Revert the removal of code handling extra VT_RESIZEX ioctl's parameters beyond those that VT_RESIZE supports, fixing a functional regression causing `svgatextmode' not to resize the VT anymore. As a consequence of the reverted change when the video adapter is reprogrammed from the original say 80x25 text mode using a 9x16 character cell (720x400 pixel resolution) to say 80x37 text mode and the same character cell (720x592 pixel resolution), the VT geometry does not get updated and only upper two thirds of the screen are used for the VT, and the lower part remains blank. The proportions change according to text mode geometries chosen. Revert the change verbatim then, bringing back previous VT resizing. Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Fixes: 988d0763361b ("vt_ioctl: make VT_RESIZEX behave like VT_RESIZE") Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-13vgacon: Record video mode changes with VT_RESIZEXMaciej W. Rozycki1-3/+11
Fix an issue with VGA console font size changes made after the initial video text mode has been changed with a user tool like `svgatextmode' calling the VT_RESIZEX ioctl. As it stands in that case the original screen geometry continues being used to validate further VT resizing. Consequently when the video adapter is firstly reprogrammed from the original say 80x25 text mode using a 9x16 character cell (720x400 pixel resolution) to say 80x37 text mode and the same character cell (720x592 pixel resolution), and secondly the CRTC character cell updated to 9x8 (by loading a suitable font with the KD_FONT_OP_SET request of the KDFONTOP ioctl), the VT geometry does not get further updated from 80x37 and only upper half of the screen is used for the VT, with the lower half showing rubbish corresponding to whatever happens to be there in the video memory that maps to that part of the screen. Of course the proportions change according to text mode geometries and font sizes chosen. Address the problem then, by updating the text mode geometry defaults rather than checking against them whenever the VT is resized via a user ioctl. Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Fixes: e400b6ec4ede ("vt/vgacon: Check if screen resize request comes from userspace") Cc: stable@vger.kernel.org # v2.6.24+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-13arm64: tools: Add __ASM_CPUCAPS_H to the endif in cpucaps.hMark Brown1-1/+1
Anshuman suggested this. Suggested-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20210513151819.12526-1-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-05-13drm/amdgpu: update vcn1.0 Non-DPG suspend sequenceSathishkumar S1-4/+9
update suspend register settings in Non-DPG mode. Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-13drm/amdgpu: set vcn mgcg flag for picassoSathishkumar S1-1/+2
enable vcn mgcg flag for picasso. Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-13drm/radeon/dpm: Disable sclk switching on Oland when two 4K 60Hz monitors are connectedKai-Heng Feng3-0/+12
Screen flickers rapidly when two 4K 60Hz monitors are in use. This issue doesn't happen when one monitor is 4K 60Hz (pixelclock 594MHz) and another one is 4K 30Hz (pixelclock 297MHz). The issue is gone after setting "power_dpm_force_performance_level" to "high". Following the indication, we found that the issue occurs when sclk is too low. So resolve the issue by disabling sclk switching when there are two monitors requires high pixelclock (> 297MHz). v2: - Only apply the fix to Oland. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-13drm/amdgpu: update the method for harvest IP for specific SKULikun Gao1-14/+16
Update the method of disabling VCN IP for specific SKU for navi1x ASIC, it will judge whether should add the related IP at the function of amdgpu_device_ip_block_add(). Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-13drm/amdgpu: add judgement when add ip blocks (v2)Likun GAO6-2/+57
Judgement whether to add an sw ip according to the harvest info. v2: fix indentation (Alex) Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-13drm/amd/display: Initialize attribute for hdcp_srm sysfs fileDavid Ward1-0/+1
It is stored in dynamically allocated memory, so sysfs_bin_attr_init() must be called to initialize it. (Note: "initialization" only sets the .attr.key member in this struct; it does not change the value of any other members.) Otherwise, when CONFIG_DEBUG_LOCK_ALLOC=y this message appears during boot: BUG: key ffff9248900cd148 has not been registered! Fixes: 9037246bb2da ("drm/amd/display: Add sysfs interface for set/get srm") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1586 Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: David Ward <david.ward@gatech.edu> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-05-13drm/amd/pm: Fix out-of-bounds bugGustavo A. R. Silva2-99/+109
Create new structure SISLANDS_SMC_SWSTATE_SINGLE, as initialState.levels and ACPIState.levels are never actually used as flexible arrays. Those arrays can be used as simple objects of type SISLANDS_SMC_HW_PERFORMANCE_LEVEL, instead. Currently, the code fails because flexible array _levels_ in struct SISLANDS_SMC_SWSTATE doesn't allow for code that accesses the first element of initialState.levels and ACPIState.levels arrays: drivers/gpu/drm/amd/pm/powerplay/si_dpm.c: 4820: table->initialState.levels[0].mclk.vDLL_CNTL = 4821: cpu_to_be32(si_pi->clock_registers.dll_cntl); ... 5021: table->ACPIState.levels[0].mclk.vDLL_CNTL = 5022: cpu_to_be32(dll_cntl); because such element cannot be accessed without previously allocating enough dynamic memory for it to exist (which never actually happens). So, there is an out-of-bounds bug in this case. That's why struct SISLANDS_SMC_SWSTATE should only be used as type for object driverState and new struct SISLANDS_SMC_SWSTATE_SINGLE is created as type for objects initialState, ACPIState and ULVState. Also, with the change from one-element array to flexible-array member in commit 0e1aa13ca3ff ("drm/amd/pm: Replace one-element array with flexible-array in struct SISLANDS_SMC_SWSTATE"), the size of dpmLevels in struct SISLANDS_SMC_STATETABLE should be fixed to be SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE instead of SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE - 1. Fixes: 0e1aa13ca3ff ("drm/amd/pm: Replace one-element array with flexible-array in struct SISLANDS_SMC_SWSTATE") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-13drm/radeon/si_dpm: Fix SMU power state loadGustavo A. R. Silva2-99/+109
Create new structure SISLANDS_SMC_SWSTATE_SINGLE, as initialState.levels and ACPIState.levels are never actually used as flexible arrays. Those arrays can be used as simple objects of type SISLANDS_SMC_HW_PERFORMANCE_LEVEL, instead. Currently, the code fails because flexible array _levels_ in struct SISLANDS_SMC_SWSTATE doesn't allow for code that access the first element of initialState.levels and ACPIState.levels arrays: 4353 table->initialState.levels[0].mclk.vDLL_CNTL = 4354 cpu_to_be32(si_pi->clock_registers.dll_cntl); ... 4555 table->ACPIState.levels[0].mclk.vDLL_CNTL = 4556 cpu_to_be32(dll_cntl); because such element cannot exist without previously allocating any dynamic memory for it (which never actually happens). That's why struct SISLANDS_SMC_SWSTATE should only be used as type for object driverState and new struct SISLANDS_SMC_SWSTATE_SINGLE is created as type for objects initialState, ACPIState and ULVState. Also, with the change from one-element array to flexible-array member in commit 96e27e8d919e ("drm/radeon/si_dpm: Replace one-element array with flexible-array in struct SISLANDS_SMC_SWSTATE"), the size of dpmLevels in struct SISLANDS_SMC_STATETABLE should be fixed to be SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE instead of SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE - 1. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1583 Fixes: 96e27e8d919e ("drm/radeon/si_dpm: Replace one-element array with flexible-array in struct SISLANDS_SMC_SWSTATE") Cc: stable@vger.kernel.org Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-13drm/radeon/ni_dpm: Fix booting bugGustavo A. R. Silva2-84/+94
Create new structure NISLANDS_SMC_SWSTATE_SINGLE, as initialState.levels and ACPIState.levels are never actually used as flexible arrays. Those arrays can be used as simple objects of type NISLANDS_SMC_HW_PERFORMANCE_LEVEL, instead. Currently, the code fails because flexible array _levels_ in struct NISLANDS_SMC_SWSTATE doesn't allow for code that access the first element of initialState.levels and ACPIState.levels arrays: drivers/gpu/drm/radeon/ni_dpm.c: 1690 table->initialState.levels[0].mclk.vMPLL_AD_FUNC_CNTL = 1691 cpu_to_be32(ni_pi->clock_registers.mpll_ad_func_cntl); ... 1903: table->ACPIState.levels[0].mclk.vMPLL_AD_FUNC_CNTL = cpu_to_be32(mpll_ad_func_cntl); 1904: table->ACPIState.levels[0].mclk.vMPLL_AD_FUNC_CNTL_2 = cpu_to_be32(mpll_ad_func_cntl_2); because such element cannot exist without previously allocating any dynamic memory for it (which never actually happens). That's why struct NISLANDS_SMC_SWSTATE should only be used as type for object driverState and new struct SISLANDS_SMC_SWSTATE_SINGLE is created as type for objects initialState, ACPIState and ULVState. Also, with the change from one-element array to flexible-array member in commit 434fb1e7444a ("drm/radeon/nislands_smc.h: Replace one-element array with flexible-array member in struct NISLANDS_SMC_SWSTATE"), the size of dpmLevels in struct NISLANDS_SMC_STATETABLE should be fixed to be NISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE instead of NISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE - 1. Bug: https://lore.kernel.org/dri-devel/3eedbe78-1fbd-4763-a7f3-ac5665e76a4a@xenosoft.de/ Fixes: 434fb1e7444a ("drm/radeon/nislands_smc.h: Replace one-element array with flexible-array member in struct NISLANDS_SMC_SWSTATE") Cc: stable@vger.kernel.org Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de> Link: https://lore.kernel.org/dri-devel/9bb5fcbd-daf5-1669-b3e7-b8624b3c36f9@xenosoft.de/ Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-13nvmet: use new ana_log_size instead the old oneHou Pu1-1/+1
The new ana_log_size should be used instead of the old one. Or kernel NULL pointer dereference will happen like below: [ 38.957849][ T69] BUG: kernel NULL pointer dereference, address: 000000000000003c [ 38.975550][ T69] #PF: supervisor write access in kernel mode [ 38.975955][ T69] #PF: error_code(0x0002) - not-present page [ 38.976905][ T69] PGD 0 P4D 0 [ 38.979388][ T69] Oops: 0002 [#1] SMP NOPTI [ 38.980488][ T69] CPU: 0 PID: 69 Comm: kworker/0:2 Not tainted 5.12.0+ #54 [ 38.981254][ T69] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 38.982502][ T69] Workqueue: events nvme_loop_execute_work [ 38.985219][ T69] RIP: 0010:memcpy_orig+0x68/0x10f [ 38.986203][ T69] Code: 83 c2 20 eb 44 48 01 d6 48 01 d7 48 83 ea 20 0f 1f 00 48 83 ea 20 4c 8b 46 f8 4c 8b 4e f0 4c 8b 56 e8 4c 8b 5e e0 48 8d 76 e0 <4c> 89 47 f8 4c 89 4f f0 4c 89 57 e8 4c 89 5f e0 48 8d 7f e0 73 d2 [ 38.987677][ T69] RSP: 0018:ffffc900001b7d48 EFLAGS: 00000287 [ 38.987996][ T69] RAX: 0000000000000020 RBX: 0000000000000024 RCX: 0000000000000010 [ 38.988327][ T69] RDX: ffffffffffffffe4 RSI: ffff8881084bc004 RDI: 0000000000000044 [ 38.988620][ T69] RBP: 0000000000000024 R08: 0000000100000000 R09: 0000000000000000 [ 38.988991][ T69] R10: 0000000100000000 R11: 0000000000000001 R12: 0000000000000024 [ 38.989289][ T69] R13: ffff8881084bc000 R14: 0000000000000000 R15: 0000000000000024 [ 38.989845][ T69] FS: 0000000000000000(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000 [ 38.990234][ T69] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 38.990490][ T69] CR2: 000000000000003c CR3: 00000001085b2000 CR4: 00000000000006f0 [ 38.991105][ T69] Call Trace: [ 38.994157][ T69] sg_copy_buffer+0xb8/0xf0 [ 38.995357][ T69] nvmet_copy_to_sgl+0x48/0x6d [ 38.995565][ T69] nvmet_execute_get_log_page_ana+0xd4/0x1cb [ 38.995792][ T69] nvmet_execute_get_log_page+0xc9/0x146 [ 38.995992][ T69] nvme_loop_execute_work+0x3e/0x44 [ 38.996181][ T69] process_one_work+0x1c3/0x3c0 [ 38.996393][ T69] worker_thread+0x44/0x3d0 [ 38.996600][ T69] ? cancel_delayed_work+0x90/0x90 [ 38.996804][ T69] kthread+0xf7/0x130 [ 38.996961][ T69] ? kthread_create_worker_on_cpu+0x70/0x70 [ 38.997171][ T69] ret_from_fork+0x22/0x30 [ 38.997705][ T69] Modules linked in: [ 38.998741][ T69] CR2: 000000000000003c [ 39.000104][ T69] ---[ end trace e719927b609d0fa0 ]--- Fixes: 5e1f689913a4 ("nvme-multipath: fix double initialization of ANA state") Signed-off-by: Hou Pu <houpu.main@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-05-13erofs: fix 1 lcluster-sized pcluster for big pclusterGao Xiang1-2/+19
If the 1st NONHEAD lcluster of a pcluster isn't CBLKCNT lcluster type rather than a HEAD or PLAIN type instead, which means its pclustersize _must_ be 1 lcluster (since its uncompressed size < 2 lclusters), as illustrated below: HEAD HEAD / PLAIN lcluster type ____________ ____________ |_:__________|_________:__| file data (uncompressed) . . .____________. |____________| pcluster data (compressed) Such on-disk case was explained before [1] but missed to be handled properly in the runtime implementation. It can be observed if manually generating 1 lcluster-sized pcluster with 2 lclusters (thus CBLKCNT doesn't exist.) Let's fix it now. [1] https://lore.kernel.org/r/20210407043927.10623-1-xiang@kernel.org Link: https://lore.kernel.org/r/20210510064715.29123-1-xiang@kernel.org Fixes: cec6e93beadf ("erofs: support parsing big pcluster compress indexes") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <xiang@kernel.org>
2021-05-12hwmon: (adm9240) Fix writes into inX_max attributesGuenter Roeck1-1/+1
When converting the driver to use the devm_hwmon_device_register_with_info API, the wrong register was selected when writing into inX_max attributes. Fix it. Fixes: 124b7e34a5a6 ("hwmon: (adm9240) Convert to devm_hwmon_device_register_with_info API") Reported-by: Chris Packham <Chris.Packham@alliedtelesis.co.nz> Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-05-12ACPI: NFIT: Fix support for variable 'SPA' structure sizeDan Williams2-21/+36
ACPI 6.4 introduced the "SpaLocationCookie" to the NFIT "System Physical Address (SPA) Range Structure". The presence of that new field is indicated by the ACPI_NFIT_LOCATION_COOKIE_VALID flag. Pre-ACPI-6.4 firmware implementations omit the flag and maintain the original size of the structure. Update the implementation to check that flag to determine the size rather than the ACPI 6.4 compliant definition of 'struct acpi_nfit_system_address' from the Linux ACPICA definitions. Update the test infrastructure for the new expectations as well, i.e. continue to emulate the ACPI 6.3 definition of that structure. Without this fix the kernel fails to validate 'SPA' structures and this leads to a crash in nfit_get_smbios_id() since that routine assumes that SPAs are valid if it finds valid SMBIOS tables. BUG: unable to handle page fault for address: ffffffffffffffa8 [..] Call Trace: skx_get_nvdimm_info+0x56/0x130 [skx_edac] skx_get_dimm_config+0x1f5/0x213 [skx_edac] skx_register_mci+0x132/0x1c0 [skx_edac] Cc: Bob Moore <robert.moore@intel.com> Cc: Erik Kaneda <erik.kaneda@intel.com> Fixes: cf16b05c607b ("ACPICA: ACPI 6.4: NFIT: add Location Cookie field") Reported-by: Yi Zhang <yi.zhang@redhat.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/162037273007.1195827.10907249070709169329.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-05-12MAINTAINERS: Move nvdimm mailing listDan Williams6-32/+32
After seeing some users have subscription management trouble, more spam than other Linux development lists, and considering some of the benefits of kernel.org hosted lists, nvdimm and persistent memory development is moving to nvdimm@lists.linux.dev. The old list will remain up until v5.14-rc1 and shutdown thereafter. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Oliver O'Halloran <oohall@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Jan Kara <jack@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/161898872871.3406469.4054282559340528393.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-05-12tools/testing/nvdimm: Make symbol '__nfit_test_ioremap' staticZou Wei1-1/+1
The sparse tool complains as follows: tools/testing/nvdimm/test/iomap.c:65:14: warning: symbol '__nfit_test_ioremap' was not declared. Should it be static? This symbol is not used outside of iomap.c, so this commit marks it static. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Link: https://lore.kernel.org/r/1618904867-25275-1-git-send-email-zou_wei@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-05-12libnvdimm: Remove duplicate struct declarationWan Jiabing1-1/+0
struct device is declared at 133rd line. The second declaration is unnecessary, remove it. Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Link: https://lore.kernel.org/r/20210419112725.42145-1-wanjiabing@vivo.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-05-12tpm: fix error return code in tpm2_get_cc_attrs_tbl()Zhen Lei1-0/+1
If the total number of commands queried through TPM2_CAP_COMMANDS is different from that queried through TPM2_CC_GET_CAPABILITY, it indicates an unknown error. In this case, an appropriate error code -EFAULT should be returned. However, we currently do not explicitly assign this error code to 'rc'. As a result, 0 was incorrectly returned. Cc: stable@vger.kernel.org Fixes: 58472f5cd4f6("tpm: validate TPM 2.0 commands") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-05-12tpm, tpm_tis: Reserve locality in tpm_tis_resume()Jarkko Sakkinen1-2/+10
Reserve locality in tpm_tis_resume(), as it could be unsert after waking up from a sleep state. Cc: stable@vger.kernel.org Cc: Lino Sanfilippo <LinoSanfilippo@gmx.de> Reported-by: Hans de Goede <hdegoede@redhat.com> Fixes: a3fbfae82b4c ("tpm: take TPM chip power gating out of tpm_transmit()") Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-05-12tpm, tpm_tis: Extend locality handling to TPM2 in tpm_tis_gen_interrupt()Jarkko Sakkinen1-6/+4
The earlier fix (linked) only partially fixed the locality handling bug in tpm_tis_gen_interrupt(), i.e. only for TPM 1.x. Extend the locality handling to cover TPM2. Cc: Hans de Goede <hdegoede@redhat.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-integrity/20210220125534.20707-1-jarkko@kernel.org/ Fixes: a3fbfae82b4c ("tpm: take TPM chip power gating out of tpm_transmit()") Reported-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
2021-05-12trusted-keys: match tpm_get_ops on all return pathsBen Boeckel1-3/+3
The `tpm_get_ops` call at the beginning of the function is not paired with a `tpm_put_ops` on this return path. Cc: stable@vger.kernel.org Fixes: f2219745250f ("security: keys: trusted: use ASN.1 TPM2 key format for the blobs") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ben Boeckel <mathstuf@gmail.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-05-12KEYS: trusted: Fix memory leak on object tdColin Ian King1-3/+5
Two error return paths are neglecting to free allocated object td, causing a memory leak. Fix this by returning via the error return path that securely kfree's td. Fixes clang scan-build warning: security/keys/trusted-keys/trusted_tpm1.c:496:10: warning: Potential memory leak [unix.Malloc] Cc: stable@vger.kernel.org Fixes: 5df16caada3f ("KEYS: trusted: Fix incorrect handling of tpm_get_random()") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-05-12drm/i915: Use correct downstream caps for check Src-Ctl mode for PCONAnkit Nautiyal1-1/+1
Fix the typo in DPCD caps used for checking SRC CTL mode of HDMI2.1 PCON v2: Corrected Fixes tag (Jani Nikula). v3: Rebased. Fixes: 04b6603d13be ("drm/i915/display: Configure HDMI2.1 Pcon for FRL only if Src-Ctl mode is available") Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: "Ville Syrj_l_" <ville.syrjala@linux.intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Sean Paul <seanpaul@chromium.org> Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Reviewed-by: Swati Sharma <swati2.sharma@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210511120930.12218-1-ankit.k.nautiyal@intel.com (cherry picked from commit 88a9c5485c48ab60c89612a17fc89f4162bbdb9d) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-05-12drm/i915/overlay: Fix active retire callback alignmentTvrtko Ursulin1-1/+1
__i915_active_call annotation is required on the retire callback to ensure correct function alignment. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: a21ce8ad12d2 ("drm/i915/overlay: Switch to using i915_active tracking") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210429083530.849546-1-tvrtko.ursulin@linux.intel.com (cherry picked from commit d8e44e4dd221ee283ea60a6fb87bca08807aa0ab) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-05-12drm/i915: Fix crash in auto_retireStéphane Marchesin1-1/+2
The retire logic uses the 2 lower bits of the pointer to the retire function to store flags. However, the auto_retire function is not guaranteed to be aligned to a multiple of 4, which causes crashes as we jump to the wrong address, for example like this: 2021-04-24T18:03:53.804300Z WARNING kernel: [ 516.876901] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI 2021-04-24T18:03:53.804310Z WARNING kernel: [ 516.876906] CPU: 7 PID: 146 Comm: kworker/u16:6 Tainted: G U 5.4.105-13595-g3cd84167b2df #1 2021-04-24T18:03:53.804311Z WARNING kernel: [ 516.876907] Hardware name: Google Volteer2/Volteer2, BIOS Google_Volteer2.13672.76.0 02/22/2021 2021-04-24T18:03:53.804312Z WARNING kernel: [ 516.876911] Workqueue: events_unbound active_work 2021-04-24T18:03:53.804313Z WARNING kernel: [ 516.876914] RIP: 0010:auto_retire+0x1/0x20 2021-04-24T18:03:53.804314Z WARNING kernel: [ 516.876916] Code: e8 01 f2 ff ff eb 02 31 db 48 89 d8 5b 5d c3 0f 1f 44 00 00 55 48 89 e5 f0 ff 87 c8 00 00 00 0f 88 ab 47 4a 00 31 c0 5d c3 0f <1f> 44 00 00 55 48 89 e5 f0 ff 8f c8 00 00 00 0f 88 9a 47 4a 00 74 2021-04-24T18:03:53.804319Z WARNING kernel: [ 516.876918] RSP: 0018:ffff9b4d809fbe38 EFLAGS: 00010286 2021-04-24T18:03:53.804320Z WARNING kernel: [ 516.876919] RAX: 0000000000000007 RBX: ffff927915079600 RCX: 0000000000000007 2021-04-24T18:03:53.804320Z WARNING kernel: [ 516.876921] RDX: ffff9b4d809fbe40 RSI: 0000000000000286 RDI: ffff927915079600 2021-04-24T18:03:53.804321Z WARNING kernel: [ 516.876922] RBP: ffff9b4d809fbe68 R08: 8080808080808080 R09: fefefefefefefeff 2021-04-24T18:03:53.804321Z WARNING kernel: [ 516.876924] R10: 0000000000000010 R11: ffffffff92e44bd8 R12: ffff9279150796a0 2021-04-24T18:03:53.804322Z WARNING kernel: [ 516.876925] R13: ffff92791c368180 R14: ffff927915079640 R15: 000000001c867605 2021-04-24T18:03:53.804323Z WARNING kernel: [ 516.876926] FS: 0000000000000000(0000) GS:ffff92791ffc0000(0000) knlGS:0000000000000000 2021-04-24T18:03:53.804323Z WARNING kernel: [ 516.876928] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 2021-04-24T18:03:53.804324Z WARNING kernel: [ 516.876929] CR2: 0000239514955000 CR3: 00000007f82da001 CR4: 0000000000760ee0 2021-04-24T18:03:53.804325Z WARNING kernel: [ 516.876930] PKRU: 55555554 2021-04-24T18:03:53.804325Z WARNING kernel: [ 516.876931] Call Trace: 2021-04-24T18:03:53.804326Z WARNING kernel: [ 516.876935] __active_retire+0x77/0xcf 2021-04-24T18:03:53.804326Z WARNING kernel: [ 516.876939] process_one_work+0x1da/0x394 2021-04-24T18:03:53.804327Z WARNING kernel: [ 516.876941] worker_thread+0x216/0x375 2021-04-24T18:03:53.804327Z WARNING kernel: [ 516.876944] kthread+0x147/0x156 2021-04-24T18:03:53.804335Z WARNING kernel: [ 516.876946] ? pr_cont_work+0x58/0x58 2021-04-24T18:03:53.804335Z WARNING kernel: [ 516.876948] ? kthread_blkcg+0x2e/0x2e 2021-04-24T18:03:53.804336Z WARNING kernel: [ 516.876950] ret_from_fork+0x1f/0x40 2021-04-24T18:03:53.804336Z WARNING kernel: [ 516.876952] Modules linked in: cdc_mbim cdc_ncm cdc_wdm xt_cgroup rfcomm cmac algif_hash algif_skcipher af_alg xt_MASQUERADE uinput snd_soc_rt5682_sdw snd_soc_rt5682 snd_soc_max98373_sdw snd_soc_max98373 snd_soc_rl6231 regmap_sdw snd_soc_sof_sdw snd_soc_hdac_hdmi snd_soc_dmic snd_hda_codec_hdmi snd_sof_pci snd_sof_intel_hda_common intel_ipu6_psys snd_sof_xtensa_dsp soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof snd_soc_hdac_hda snd_soc_acpi_intel_match snd_soc_acpi snd_hda_ext_core soundwire_bus snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core intel_ipu6_isys videobuf2_dma_contig videobuf2_v4l2 videobuf2_common videobuf2_memops mei_hdcp intel_ipu6 ov2740 ov8856 at24 sx9310 dw9768 v4l2_fwnode cros_ec_typec intel_pmc_mux roles acpi_als typec fuse iio_trig_sysfs cros_ec_light_prox cros_ec_lid_angle cros_ec_sensors cros_ec_sensors_core industrialio_triggered_buffer cros_ec_sensors_ring kfifo_buf industrialio cros_ec_sensorhub 2021-04-24T18:03:53.804337Z WARNING kernel: [ 516.876972] cdc_ether usbnet iwlmvm lzo_rle lzo_compress iwl7000_mac80211 iwlwifi zram cfg80211 r8152 mii btusb btrtl btintel btbcm bluetooth ecdh_generic ecc joydev 2021-04-24T18:03:53.804337Z EMERG kernel: [ 516.879169] gsmi: Log Shutdown Reason 0x03 This change fixes this by aligning the function. Signed-off-by: Stéphane Marchesin <marcheu@chromium.org> Fixes: 229007e02d69 ("drm/i915: Wrap i915_active in a simple kreffed struct") Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210429031021.1218091-1-marcheu@chromium.org (cherry picked from commit ca419f407b43cc89942ebc297c7a63d94abbcae4) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-05-12drm/i915/gt: Fix a double free in gen8_preallocate_top_level_pdpLv Yunlong1-1/+0
Our code analyzer reported a double free bug. In gen8_preallocate_top_level_pdp, pde and pde->pt.base are allocated via alloc_pd(vm) with one reference. If pin_pt_dma() failed, pde->pt.base is freed by i915_gem_object_put() with a reference dropped. Then free_pd calls free_px() defined in intel_ppgtt.c, which calls i915_gem_object_put() to put pde->pt.base again. As pde->pt.base is protected by refcount, so the second put will not free pde->pt.base actually. But, maybe it is better to remove the first put? Fixes: 82adf901138cc ("drm/i915/gt: Shrink i915_page_directory's slab bucket") Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210426124340.4238-1-lyl2019@mail.ustc.edu.cn (cherry picked from commit ac69496fe65cca0611d5917b7d232730ff605bc7) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-05-12drm/i915: Read C0DRB3/C1DRB3 as 16 bits againVille Syrjälä1-2/+2
We've defined C0DRB3/C1DRB3 as 16 bit registers, so access them as such. Fixes: 1c8242c3a4b2 ("drm/i915: Use unchecked writes for setting up the fences") Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210421153401.13847-3-ville.syrjala@linux.intel.com (cherry picked from commit f765a5b48c667bdada5e49d5e0f23f8c0687b21b) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-05-12drm/i915: Avoid div-by-zero on gen2Ville Syrjälä1-1/+1
Gen2 tiles are 2KiB in size so i915_gem_object_get_tile_row_size() can in fact return <4KiB, which leads to div-by-zero here. Avoid that. Not sure i915_gem_object_get_tile_row_size() is entirely sane anyway since it doesn't account for the different tile layouts on i8xx/i915... I'm not able to hit this before commit 6846895fde05 ("drm/i915: Replace PIN_NONFAULT with calls to PIN_NOEVICT") and it looks like I also need to run recent version of Mesa. With those in place xonotic trips on this quite easily on my 85x. Cc: stable@vger.kernel.org Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210421153401.13847-2-ville.syrjala@linux.intel.com (cherry picked from commit ed52c62d386f764194e0184fdb905d5f24194cae) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-05-12nvmet: seset ns->file when open failsDaniel Wagner1-3/+5
Reset the ns->file value to NULL also in the error case in nvmet_file_ns_enable(). The ns->file variable points either to file object or contains the error code after the filp_open() call. This can lead to following problem: When the user first setups an invalid file backend and tries to enable the ns, it will fail. Then the user switches over to a bdev backend and enables successfully the ns. The first received I/O will crash the system because the IO backend is chosen based on the ns->file value: static u16 nvmet_parse_io_cmd(struct nvmet_req *req) { [...] if (req->ns->file) return nvmet_file_parse_io_cmd(req); return nvmet_bdev_parse_io_cmd(req); } Reported-by: Enzo Matsumiya <ematsumiya@suse.com> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-05-12ptrace: make ptrace() fail if the tracee changed its pid unexpectedlyOleg Nesterov1-1/+17
Suppose we have 2 threads, the group-leader L and a sub-theread T, both parked in ptrace_stop(). Debugger tries to resume both threads and does ptrace(PTRACE_CONT, T); ptrace(PTRACE_CONT, L); If the sub-thread T execs in between, the 2nd PTRACE_CONT doesn not resume the old leader L, it resumes the post-exec thread T which was actually now stopped in PTHREAD_EVENT_EXEC. In this case the PTHREAD_EVENT_EXEC event is lost, and the tracer can't know that the tracee changed its pid. This patch makes ptrace() fail in this case until debugger does wait() and consumes PTHREAD_EVENT_EXEC which reports old_pid. This affects all ptrace requests except the "asynchronous" PTRACE_INTERRUPT/KILL. The patch doesn't add the new PTRACE_ option to not complicate the API, and I _hope_ this won't cause any noticeable regression: - If debugger uses PTRACE_O_TRACEEXEC and the thread did an exec and the tracer does a ptrace request without having consumed the exec event, it's 100% sure that the thread the ptracer thinks it is targeting does not exist anymore, or isn't the same as the one it thinks it is targeting. - To some degree this patch adds nothing new. In the scenario above ptrace(L) can fail with -ESRCH if it is called after the execing sub-thread wakes the leader up and before it "steals" the leader's pid. Test-case: #include <stdio.h> #include <unistd.h> #include <signal.h> #include <sys/ptrace.h> #include <sys/wait.h> #include <errno.h> #include <pthread.h> #include <assert.h> void *tf(void *arg) { execve("/usr/bin/true", NULL, NULL); assert(0); return NULL; } int main(void) { int leader = fork(); if (!leader) { kill(getpid(), SIGSTOP); pthread_t th; pthread_create(&th, NULL, tf, NULL); for (;;) pause(); return 0; } waitpid(leader, NULL, WSTOPPED); ptrace(PTRACE_SEIZE, leader, 0, PTRACE_O_TRACECLONE | PTRACE_O_TRACEEXEC); waitpid(leader, NULL, 0); ptrace(PTRACE_CONT, leader, 0,0); waitpid(leader, NULL, 0); int status, thread = waitpid(-1, &status, 0); assert(thread > 0 && thread != leader); assert(status == 0x80137f); ptrace(PTRACE_CONT, thread, 0,0); /* * waitid() because waitpid(leader, &status, WNOWAIT) does not * report status. Why ???? * * Why WEXITED? because we have another kernel problem connected * to mt-exec. */ siginfo_t info; assert(waitid(P_PID, leader, &info, WSTOPPED|WEXITED|WNOWAIT) == 0); assert(info.si_pid == leader && info.si_status == 0x0405); /* OK, it sleeps in ptrace(PTRACE_EVENT_EXEC == 0x04) */ assert(ptrace(PTRACE_CONT, leader, 0,0) == -1); assert(errno == ESRCH); assert(leader == waitpid(leader, &status, WNOHANG)); assert(status == 0x04057f); assert(ptrace(PTRACE_CONT, leader, 0,0) == 0); return 0; } Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Simon Marchi <simon.marchi@efficios.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Pedro Alves <palves@redhat.com> Acked-by: Simon Marchi <simon.marchi@efficios.com> Acked-by: Jan Kratochvil <jan.kratochvil@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-12nbd: share nbd_put and return by goto put_nbdSun Ke1-4/+3
Replace the following two statements by the statement “goto put_nbd;” nbd_put(nbd); return 0; Signed-off-by: Sun Ke <sunke32@huawei.com> Suggested-by: Markus Elfring <Markus.Elfring@web.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20210512114331.1233964-3-sunke32@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-12nbd: Fix NULL pointer in flush_workqueueSun Ke1-1/+2
Open /dev/nbdX first, the config_refs will be 1 and the pointers in nbd_device are still null. Disconnect /dev/nbdX, then reference a null recv_workq. The protection by config_refs in nbd_genl_disconnect is useless. [ 656.366194] BUG: kernel NULL pointer dereference, address: 0000000000000020 [ 656.368943] #PF: supervisor write access in kernel mode [ 656.369844] #PF: error_code(0x0002) - not-present page [ 656.370717] PGD 10cc87067 P4D 10cc87067 PUD 1074b4067 PMD 0 [ 656.371693] Oops: 0002 [#1] SMP [ 656.372242] CPU: 5 PID: 7977 Comm: nbd-client Not tainted 5.11.0-rc5-00040-g76c057c84d28 #1 [ 656.373661] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 [ 656.375904] RIP: 0010:mutex_lock+0x29/0x60 [ 656.376627] Code: 00 0f 1f 44 00 00 55 48 89 fd 48 83 05 6f d7 fe 08 01 e8 7a c3 ff ff 48 83 05 6a d7 fe 08 01 31 c0 65 48 8b 14 25 00 6d 01 00 <f0> 48 0f b1 55 d [ 656.378934] RSP: 0018:ffffc900005eb9b0 EFLAGS: 00010246 [ 656.379350] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 656.379915] RDX: ffff888104cf2600 RSI: ffffffffaae8f452 RDI: 0000000000000020 [ 656.380473] RBP: 0000000000000020 R08: 0000000000000000 R09: ffff88813bd6b318 [ 656.381039] R10: 00000000000000c7 R11: fefefefefefefeff R12: ffff888102710b40 [ 656.381599] R13: ffffc900005eb9e0 R14: ffffffffb2930680 R15: ffff88810770ef00 [ 656.382166] FS: 00007fdf117ebb40(0000) GS:ffff88813bd40000(0000) knlGS:0000000000000000 [ 656.382806] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 656.383261] CR2: 0000000000000020 CR3: 0000000100c84000 CR4: 00000000000006e0 [ 656.383819] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 656.384370] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 656.384927] Call Trace: [ 656.385111] flush_workqueue+0x92/0x6c0 [ 656.385395] nbd_disconnect_and_put+0x81/0xd0 [ 656.385716] nbd_genl_disconnect+0x125/0x2a0 [ 656.386034] genl_family_rcv_msg_doit.isra.0+0x102/0x1b0 [ 656.386422] genl_rcv_msg+0xfc/0x2b0 [ 656.386685] ? nbd_ioctl+0x490/0x490 [ 656.386954] ? genl_family_rcv_msg_doit.isra.0+0x1b0/0x1b0 [ 656.387354] netlink_rcv_skb+0x62/0x180 [ 656.387638] genl_rcv+0x34/0x60 [ 656.387874] netlink_unicast+0x26d/0x590 [ 656.388162] netlink_sendmsg+0x398/0x6c0 [ 656.388451] ? netlink_rcv_skb+0x180/0x180 [ 656.388750] ____sys_sendmsg+0x1da/0x320 [ 656.389038] ? ____sys_recvmsg+0x130/0x220 [ 656.389334] ___sys_sendmsg+0x8e/0xf0 [ 656.389605] ? ___sys_recvmsg+0xa2/0xf0 [ 656.389889] ? handle_mm_fault+0x1671/0x21d0 [ 656.390201] __sys_sendmsg+0x6d/0xe0 [ 656.390464] __x64_sys_sendmsg+0x23/0x30 [ 656.390751] do_syscall_64+0x45/0x70 [ 656.391017] entry_SYSCALL_64_after_hwframe+0x44/0xa9 To fix it, just add if (nbd->recv_workq) to nbd_disconnect_and_put(). Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs") Signed-off-by: Sun Ke <sunke32@huawei.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20210512114331.1233964-2-sunke32@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-12f2fs: return EINVAL for hole cases in swap fileJaegeuk Kim1-2/+2
This tries to fix xfstests/generic/495. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-12ACPI: PM: Add ACPI ID of Alder Lake FanSumeet Pawnikar1-0/+1
Add a new unique fan ACPI device ID for Alder Lake to support it in acpi_dev_pm_attach() function. Fixes: 38748bcb940e ("ACPI: DPTF: Support Alder Lake") Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Cc: 5.10+ <stable@vger.kernel.org> # 5.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-05-12blkdev.h: remove unused codes blk_account_rqLin Feng1-5/+0
Last users of blk_account_rq gone with patch commit a1ce35fa49852db ("block: remove dead elevator code") and now it gets no caller, it can be safely removed. Signed-off-by: Lin Feng <linf@wangsu.com> Link: https://lore.kernel.org/r/20210512100124.173769-1-linf@wangsu.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-12block, bfq: avoid circular stable mergesPaolo Valente1-2/+29
BFQ may merge a new bfq_queue, stably, with the last bfq_queue created. In particular, BFQ first waits a little bit for some I/O to flow inside the new queue, say Q2, if this is needed to understand whether it is better or worse to merge Q2 with the last queue created, say Q1. This delayed stable merge is performed by assigning bic->stable_merge_bfqq = Q1, for the bic associated with Q1. Yet, while waiting for some I/O to flow in Q2, a non-stable queue merge of Q2 with Q1 may happen, causing the bic previously associated with Q2 to be associated with exactly Q1 (bic->bfqq = Q1). After that, Q2 and Q1 may happen to be split, and, in the split, Q1 may happen to be recycled as a non-shared bfq_queue. In that case, Q1 may then happen to undergo a stable merge with the bfq_queue pointed by bic->stable_merge_bfqq. Yet bic->stable_merge_bfqq still points to Q1. So Q1 would be merged with itself. This commit fixes this error by intercepting this situation, and canceling the schedule of the stable merge. Fixes: 430a67f9d616 ("block, bfq: merge bursts of newly-created queues") Signed-off-by: Pietro Pedroni <pedroni.pietro.96@gmail.com> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20210512094352.85545-2-paolo.valente@linaro.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-11f2fs: avoid swapon failure by giving a warning firstJaegeuk Kim1-6/+23
The final solution can be migrating blocks to form a section-aligned file internally. Meanwhile, let's ask users to do that when preparing the swap file initially like: 1) create() 2) ioctl(F2FS_IOC_SET_PIN_FILE) 3) fallocate() Reported-by: kernel test robot <oliver.sang@intel.com> Fixes: 36e4d95891ed ("f2fs: check if swapfile is section-alligned") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11blk-iocost: fix weight updates of inner active iocgsTejun Heo1-2/+12
When the weight of an active iocg is updated, weight_updated() is called which in turn calls __propagate_weights() to update the active and inuse weights so that the effective hierarchical weights are update accordingly. The current implementation is incorrect for inner active nodes. For an active leaf iocg, inuse can be any value between 1 and active and the difference represents how much the iocg is donating. When weight is updated, as long as inuse is clamped between 1 and the new weight, we're alright and this is what __propagate_weights() currently implements. However, that's not how an active inner node's inuse is set. An inner node's inuse is solely determined by the ratio between the sums of inuse's and active's of its children - ie. they're results of propagating the leaves' active and inuse weights upwards. __propagate_weights() incorrectly applies the same clamping as for a leaf when an active inner node's weight is updated. Consider a hierarchy which looks like the following with saturating workloads in AA and BB. R / \ A B | | AA BB 1. For both A and B, active=100, inuse=100, hwa=0.5, hwi=0.5. 2. echo 200 > A/io.weight 3. __propagate_weights() update A's active to 200 and leave inuse at 100 as it's already between 1 and the new active, making A:active=200, A:inuse=100. As R's active_sum is updated along with A's active, A:hwa=2/3, B:hwa=1/3. However, because the inuses didn't change, the hwi's remain unchanged at 0.5. 4. The weight of A is now twice that of B but AA and BB still have the same hwi of 0.5 and thus are doing the same amount of IOs. Fix it by making __propgate_weights() always calculate the inuse of an active inner iocg based on the ratio of child_inuse_sum to child_active_sum. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Dan Schatzberg <dschatzberg@fb.com> Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost") Cc: stable@vger.kernel.org # v5.4+ Link: https://lore.kernel.org/r/YJsxnLZV1MnBcqjj@slm.duckdns.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-11f2fs: compress: fix to assign cc.cluster_idx correctlyChao Yu3-12/+13
In f2fs_destroy_compress_ctx(), after f2fs_destroy_compress_ctx(), cc.cluster_idx will be cleared w/ NULL_CLUSTER, f2fs_cluster_blocks() may check wrong cluster metadata, fix it. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>