aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2024-06-18f2fs: fix to use mnt_{want,drop}_write_file replace file_{start,end}_wrtieZhiguo Niu1-8/+16
mnt_{want,drop}_write_file is more suitable than file_{start,end}_wrtie and also is consistent with other ioctl operations. Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-18f2fs: clean up set REQ_RAHEAD given racJaegeuk Kim3-8/+13
Let's set REQ_RAHEAD per rac by single source. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-18f2fs: enable atgc dynamically if conditions are metZhiguo Niu3-3/+26
Now atgc can only be enabled when umounted->mounted device even related conditions have reached. If the device has not be umounted->mounted for a long time, atgc can not work. So enable atgc dynamically when atgc_age_threshold is less than elapsed_time and ATGC mount option is on. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12f2fs: fix to truncate preallocated blocks in f2fs_file_open()Chao Yu3-9/+42
chenyuwen reports a f2fs bug as below: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000011 fscrypt_set_bio_crypt_ctx+0x78/0x1e8 f2fs_grab_read_bio+0x78/0x208 f2fs_submit_page_read+0x44/0x154 f2fs_get_read_data_page+0x288/0x5f4 f2fs_get_lock_data_page+0x60/0x190 truncate_partial_data_page+0x108/0x4fc f2fs_do_truncate_blocks+0x344/0x5f0 f2fs_truncate_blocks+0x6c/0x134 f2fs_truncate+0xd8/0x200 f2fs_iget+0x20c/0x5ac do_garbage_collect+0x5d0/0xf6c f2fs_gc+0x22c/0x6a4 f2fs_disable_checkpoint+0xc8/0x310 f2fs_fill_super+0x14bc/0x1764 mount_bdev+0x1b4/0x21c f2fs_mount+0x20/0x30 legacy_get_tree+0x50/0xbc vfs_get_tree+0x5c/0x1b0 do_new_mount+0x298/0x4cc path_mount+0x33c/0x5fc __arm64_sys_mount+0xcc/0x15c invoke_syscall+0x60/0x150 el0_svc_common+0xb8/0xf8 do_el0_svc+0x28/0xa0 el0_svc+0x24/0x84 el0t_64_sync_handler+0x88/0xec It is because inode.i_crypt_info is not initialized during below path: - mount - f2fs_fill_super - f2fs_disable_checkpoint - f2fs_gc - f2fs_iget - f2fs_truncate So, let's relocate truncation of preallocated blocks to f2fs_file_open(), after fscrypt_file_open(). Fixes: d4dd19ec1ea0 ("f2fs: do not expose unwritten blocks to user by DIO") Reported-by: chenyuwen <yuwen.chen@xjmz.com> Closes: https://lore.kernel.org/linux-kernel/20240517085327.1188515-1-yuwen.chen@xjmz.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12f2fs: fix to cover read extent cache access with lockChao Yu3-35/+25
syzbot reports a f2fs bug as below: BUG: KASAN: slab-use-after-free in sanity_check_extent_cache+0x370/0x410 fs/f2fs/extent_cache.c:46 Read of size 4 at addr ffff8880739ab220 by task syz-executor200/5097 CPU: 0 PID: 5097 Comm: syz-executor200 Not tainted 6.9.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 sanity_check_extent_cache+0x370/0x410 fs/f2fs/extent_cache.c:46 do_read_inode fs/f2fs/inode.c:509 [inline] f2fs_iget+0x33e1/0x46e0 fs/f2fs/inode.c:560 f2fs_nfs_get_inode+0x74/0x100 fs/f2fs/super.c:3237 generic_fh_to_dentry+0x9f/0xf0 fs/libfs.c:1413 exportfs_decode_fh_raw+0x152/0x5f0 fs/exportfs/expfs.c:444 exportfs_decode_fh+0x3c/0x80 fs/exportfs/expfs.c:584 do_handle_to_path fs/fhandle.c:155 [inline] handle_to_path fs/fhandle.c:210 [inline] do_handle_open+0x495/0x650 fs/fhandle.c:226 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f We missed to cover sanity_check_extent_cache() w/ extent cache lock, so, below race case may happen, result in use after free issue. - f2fs_iget - do_read_inode - f2fs_init_read_extent_tree : add largest extent entry in to cache - shrink - f2fs_shrink_read_extent_tree - __shrink_extent_tree - __detach_extent_node : drop largest extent entry - sanity_check_extent_cache : access et->largest w/o lock let's refactor sanity_check_extent_cache() to avoid extent cache access and call it before f2fs_init_read_extent_tree() to fix this issue. Reported-by: syzbot+74ebe2104433e9dc610d@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/00000000000009beea061740a531@google.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12f2fs: fix return value of f2fs_convert_inline_inode()Chao Yu1-2/+4
If device is readonly, make f2fs_convert_inline_inode() return EROFS instead of zero, otherwise it may trigger panic during writeback of inline inode's dirty page as below: f2fs_write_single_data_page+0xbb6/0x1e90 fs/f2fs/data.c:2888 f2fs_write_cache_pages fs/f2fs/data.c:3187 [inline] __f2fs_write_data_pages fs/f2fs/data.c:3342 [inline] f2fs_write_data_pages+0x1efe/0x3a90 fs/f2fs/data.c:3369 do_writepages+0x359/0x870 mm/page-writeback.c:2634 filemap_fdatawrite_wbc+0x125/0x180 mm/filemap.c:397 __filemap_fdatawrite_range mm/filemap.c:430 [inline] file_write_and_wait_range+0x1aa/0x290 mm/filemap.c:788 f2fs_do_sync_file+0x68a/0x1ae0 fs/f2fs/file.c:276 generic_write_sync include/linux/fs.h:2806 [inline] f2fs_file_write_iter+0x7bd/0x24e0 fs/f2fs/file.c:4977 call_write_iter include/linux/fs.h:2114 [inline] new_sync_write fs/read_write.c:497 [inline] vfs_write+0xa72/0xc90 fs/read_write.c:590 ksys_write+0x1a0/0x2c0 fs/read_write.c:643 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Cc: stable@vger.kernel.org Reported-by: syzbot+848062ba19c8782ca5c8@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000d103ce06174d7ec3@google.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12f2fs: use new ioprio Macro to get ckpt thread ioprio levelZhiguo Niu1-6/+6
IOPRIO_PRIO_DATA in the new kernel version includes level and hint, So Macro IOPRIO_PRIO_LEVEL is more accurate to get ckpt thread ioprio data/level, and it is also consisten with the way setting ckpt thread ioprio by IOPRIO_PRIO_VALUE(class, data/level). Besides, change variable name from "data" to "level" for more readable. Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12f2fs: fix to don't dirty inode for readonly filesystemChao Yu1-0/+3
syzbot reports f2fs bug as below: kernel BUG at fs/f2fs/inode.c:933! RIP: 0010:f2fs_evict_inode+0x1576/0x1590 fs/f2fs/inode.c:933 Call Trace: evict+0x2a4/0x620 fs/inode.c:664 dispose_list fs/inode.c:697 [inline] evict_inodes+0x5f8/0x690 fs/inode.c:747 generic_shutdown_super+0x9d/0x2c0 fs/super.c:675 kill_block_super+0x44/0x90 fs/super.c:1667 kill_f2fs_super+0x303/0x3b0 fs/f2fs/super.c:4894 deactivate_locked_super+0xc1/0x130 fs/super.c:484 cleanup_mnt+0x426/0x4c0 fs/namespace.c:1256 task_work_run+0x24a/0x300 kernel/task_work.c:180 ptrace_notify+0x2cd/0x380 kernel/signal.c:2399 ptrace_report_syscall include/linux/ptrace.h:411 [inline] ptrace_report_syscall_exit include/linux/ptrace.h:473 [inline] syscall_exit_work kernel/entry/common.c:251 [inline] syscall_exit_to_user_mode_prepare kernel/entry/common.c:278 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:283 [inline] syscall_exit_to_user_mode+0x15c/0x280 kernel/entry/common.c:296 do_syscall_64+0x50/0x110 arch/x86/entry/common.c:88 entry_SYSCALL_64_after_hwframe+0x63/0x6b The root cause is: - do_sys_open - f2fs_lookup - __f2fs_find_entry - f2fs_i_depth_write - f2fs_mark_inode_dirty_sync - f2fs_dirty_inode - set_inode_flag(inode, FI_DIRTY_INODE) - umount - kill_f2fs_super - kill_block_super - generic_shutdown_super - sync_filesystem : sb is readonly, skip sync_filesystem() - evict_inodes - iput - f2fs_evict_inode - f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE)) : trigger kernel panic When we try to repair i_current_depth in readonly filesystem, let's skip dirty inode to avoid panic in later f2fs_evict_inode(). Cc: stable@vger.kernel.org Reported-by: syzbot+31e4659a3fe953aec2f4@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000e890bc0609a55cff@google.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12f2fs: fix to avoid use SSR allocate when do defragmentZhiguo Niu1-1/+2
SSR allocate mode will be used when doing file defragment if ATGC is working at the same time, that is because set_page_private_gcing may make CURSEG_ALL_DATA_ATGC segment type got in f2fs_allocate_data_block when defragment page is writeback, which may cause file fragmentation is worse. A file with 2 fragmentations is changed as following after defragment: ----------------file info------------------- sensorsdata : -------------------------------------------- dev [254:48] ino [0x 3029 : 12329] mode [0x 81b0 : 33200] nlink [0x 1 : 1] uid [0x 27e6 : 10214] gid [0x 27e6 : 10214] size [0x 242000 : 2367488] blksize [0x 1000 : 4096] blocks [0x 1210 : 4624] -------------------------------------------- file_pos start_blk end_blk blks 0 11361121 11361207 87 356352 11361215 11361216 2 364544 11361218 11361218 1 368640 11361220 11361221 2 376832 11361224 11361225 2 385024 11361227 11361238 12 434176 11361240 11361252 13 487424 11361254 11361254 1 491520 11361271 11361279 9 528384 3681794 3681795 2 536576 3681797 3681797 1 540672 3681799 3681799 1 544768 3681803 3681803 1 548864 3681805 3681805 1 552960 3681807 3681807 1 557056 3681809 3681809 1 Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12f2fs: fix to force buffered IO on inline_data inodeChao Yu1-0/+2
It will return all zero data when DIO reading from inline_data inode, it is because f2fs_iomap_begin() assign iomap->type w/ IOMAP_HOLE incorrectly for this case. We can let iomap framework handle inline data via assigning iomap->type and iomap->inline_data correctly, however, it will be a little bit complicated when handling race case in between direct IO and buffered IO. So, let's force to use buffered IO to fix this issue. Cc: stable@vger.kernel.org Reported-by: Barry Song <v-songbaohua@oppo.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12f2fs: fix to remove redundant SBI_NEED_FSCK flag setZhiguo Niu1-1/+0
Subsequent f2fs_stop_checkpoint will set cp_err, so this SBI_NEED_FSCK flag set action is invalid. Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12f2fs: alloc new section if curseg is not the first seg in its zoneSheng Yong1-1/+2
If curseg is not the first segment in its zone, the zone is not empty. A new section should be allocated and avoid resetting the old zone. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Sheng Yong <shengyong@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12f2fs: add support for FS_IOC_GETFSSYSFSPATHChao Yu1-0/+1
FS_IOC_GETFSSYSFSPATH ioctl expects sysfs sub-path of a filesystem, the format can be "$FSTYP/$SYSFS_IDENTIFIER" under /sys/fs, it can helps to standardizes exporting sysfs datas across filesystems. This patch wires up FS_IOC_GETFSSYSFSPATH for f2fs, it will output "f2fs/<dev>". Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12f2fs: fix to do sanity check on blocks for inline_data inodeChao Yu3-3/+21
inode can be fuzzed, so it can has F2FS_INLINE_DATA flag and valid i_blocks/i_nid value, this patch supports to do extra sanity check to detect such corrupted state. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12f2fs: fix to do sanity check on F2FS_INLINE_DATA flag in inode during GCChao Yu1-0/+10
syzbot reports a f2fs bug as below: ------------[ cut here ]------------ kernel BUG at fs/f2fs/inline.c:258! CPU: 1 PID: 34 Comm: kworker/u8:2 Not tainted 6.9.0-rc6-syzkaller-00012-g9e4bc4bcae01 #0 RIP: 0010:f2fs_write_inline_data+0x781/0x790 fs/f2fs/inline.c:258 Call Trace: f2fs_write_single_data_page+0xb65/0x1d60 fs/f2fs/data.c:2834 f2fs_write_cache_pages fs/f2fs/data.c:3133 [inline] __f2fs_write_data_pages fs/f2fs/data.c:3288 [inline] f2fs_write_data_pages+0x1efe/0x3a90 fs/f2fs/data.c:3315 do_writepages+0x35b/0x870 mm/page-writeback.c:2612 __writeback_single_inode+0x165/0x10b0 fs/fs-writeback.c:1650 writeback_sb_inodes+0x905/0x1260 fs/fs-writeback.c:1941 wb_writeback+0x457/0xce0 fs/fs-writeback.c:2117 wb_do_writeback fs/fs-writeback.c:2264 [inline] wb_workfn+0x410/0x1090 fs/fs-writeback.c:2304 process_one_work kernel/workqueue.c:3254 [inline] process_scheduled_works+0xa12/0x17c0 kernel/workqueue.c:3335 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416 kthread+0x2f2/0x390 kernel/kthread.c:388 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 The root cause is: inline_data inode can be fuzzed, so that there may be valid blkaddr in its direct node, once f2fs triggers background GC to migrate the block, it will hit f2fs_bug_on() during dirty page writeback. Let's add sanity check on F2FS_INLINE_DATA flag in inode during GC, so that, it can forbid migrating inline_data inode's data block for fixing. Reported-by: syzbot+848062ba19c8782ca5c8@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000d103ce06174d7ec3@google.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-09Linux 6.10-rc3Linus Torvalds1-1/+1
2024-06-07HID: Ignore battery for ELAN touchscreens 2F2C and 4116Louis Dalibard2-0/+6
At least ASUS Zenbook 14 (2023) and ASUS Zenbook 14 Pro (2023) are affected. The touchscreen reports a battery status of 0% and jumps to 1% when a stylus is used. The device ID was added and the battery ignore quirk was enabled for it. [jkosina@suse.com: reformatted changelog a bit] Signed-off-by: Louis Dalibard <ontake@ontake.dev> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2024-06-07HID: i2c-hid: elan: fix reset suspend current leakageJohan Hovold1-12/+47
The Elan eKTH5015M touch controller found on the Lenovo ThinkPad X13s shares the VCC33 supply with other peripherals that may remain powered during suspend (e.g. when enabled as wakeup sources). The reset line is also wired so that it can be left deasserted when the supply is off. This is important as it avoids holding the controller in reset for extended periods of time when it remains powered, which can lead to increased power consumption, and also avoids leaking current through the X13s reset circuitry during suspend (and after driver unbind). Use the new 'no-reset-on-power-off' devicetree property to determine when reset needs to be asserted on power down. Notably this also avoids wasting power on machine variants without a touchscreen for which the driver would otherwise exit probe with reset asserted. Fixes: bd3cba00dcc6 ("HID: i2c-hid: elan: Add support for Elan eKTH6915 i2c-hid touchscreens") Cc: <stable@vger.kernel.org> # 6.0 Cc: Douglas Anderson <dianders@chromium.org> Tested-by: Steev Klimaszewski <steev@kali.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20240507144821.12275-5-johan+linaro@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-06-07dt-bindings: HID: i2c-hid: elan: add 'no-reset-on-power-off' propertyJohan Hovold1-0/+6
When the power supply is shared with other peripherals the reset line can be wired in such a way that it can remain deasserted regardless of whether the supply is on or not. This is important as it can be used to avoid holding the controller in reset for extended periods of time when it remains powered, something which can lead to increased power consumption. Leaving reset deasserted also avoids leaking current through the reset circuitry pull-up resistors. Add a new 'no-reset-on-power-off' devicetree property which can be used by the OS to determine when reset needs to be asserted on power down. Note that this property can also be used when the supply cannot be turned off by the OS at all. Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240507144821.12275-4-johan+linaro@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-06-07dt-bindings: HID: i2c-hid: elan: add Elan eKTH5015MJohan Hovold1-4/+8
Add a compatible string for the Elan eKTH5015M touch controller. Judging from the current binding and commit bd3cba00dcc6 ("HID: i2c-hid: elan: Add support for Elan eKTH6915 i2c-hid touchscreens"), eKTH5015M appears to be compatible with eKTH6915. Notably the power-on sequence is the same. While at it, drop a redundant label from the example. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20240507144821.12275-3-johan+linaro@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-06-07dt-bindings: HID: i2c-hid: add dedicated Ilitek ILI2901 schemaJohan Hovold2-1/+66
The Ilitek ILI2901 touch screen controller was apparently incorrectly added to the Elan eKTH6915 schema simply because it also has a reset gpio and is currently managed by the Elan driver in Linux. The two controllers are not related even if an unfortunate wording in the commit message adding the Ilitek compatible made it sound like they were. Add a dedicated schema for the ILI2901 which does not specify the I2C address (which is likely 0x41 rather than 0x10 as for other Ilitek touch controllers) to avoid cluttering the Elan schema with unrelated devices and to make it easier to find the correct schema when adding further Ilitek controllers. Fixes: d74ac6f60a7e ("dt-bindings: HID: i2c-hid: elan: Introduce Ilitek ili2901") Cc: Zhengqiao Xia <xiazhengqiao@huaqin.corp-partner.google.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20240507144821.12275-2-johan+linaro@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-06-07input: Add support for "Do Not Disturb"Aseda Aboagye3-0/+10
HUTRR94 added support for a new usage titled "System Do Not Disturb" which toggles a system-wide Do Not Disturb setting. This commit simply adds a new event code for the usage. Signed-off-by: Aseda Aboagye <aaboagye@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/Zl-gUHE70s7wCAoB@google.com Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-06-07input: Add event code for accessibility keyAseda Aboagye3-0/+3
HUTRR116 added support for a new usage titled "System Accessibility Binding" which toggles a system-wide bound accessibility UI or command. This commit simply adds a new event code for the usage. Signed-off-by: Aseda Aboagye <aaboagye@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/Zl-e97O9nvudco5z@google.com Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-06-07hid: asus: asus_report_fixup: fix potential read out of boundsAndrew Ballance1-2/+2
syzbot reported a potential read out of bounds in asus_report_fixup. this patch adds checks so that a read out of bounds will not occur Signed-off-by: Andrew Ballance <andrewjballance@gmail.com> Reported-by: <syzbot+07762f019fd03d01f04c@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=07762f019fd03d01f04c Fixes: 59d2f5b7392e ("HID: asus: fix more n-key report descriptors if n-key quirked") Link: https://lore.kernel.org/r/20240602085023.1720492-1-andrewjballance@gmail.com Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-06-07gpio: add missing MODULE_DESCRIPTION() macrosJeff Johnson4-0/+4
On x86, make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpio/gpio-gw-pld.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpio/gpio-mc33880.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpio/gpio-pcf857x.o Add the missing invocations of the MODULE_DESCRIPTION() macro, including the one missing in gpio-pl061.c, which is not built for x86. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://lore.kernel.org/r/20240606-md-drivers-gpio-v1-1-cb42d240ca5c@quicinc.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-07cifs: Don't advance the I/O iterator before terminating subrequestDavid Howells1-3/+0
There's now no need to make sure subreq->io_iter is advanced to match subreq->transferred before calling one of the netfs subrequest termination functions as the check has been removed netfslib and the iterator is reset prior to retrying a subreq. Fixes: 3ee1a1fc3981 ("cifs: Cut over to using netfslib") Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-06-07smb: client: fix deadlock in smb2_find_smb_tcon()Enzo Matsumiya1-1/+1
Unlock cifs_tcp_ses_lock before calling cifs_put_smb_ses() to avoid such deadlock. Cc: stable@vger.kernel.org Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-06-07modpost: do not warn about missing MODULE_DESCRIPTION() for vmlinux.oMasahiro Yamada1-2/+3
Building with W=1 incorrectly emits the following warning: WARNING: modpost: missing MODULE_DESCRIPTION() in vmlinux.o This check should apply only to modules. Fixes: 1fffe7a34c89 ("script: modpost: emit a warning when the description is missing") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
2024-06-07kbuild: explicitly run mksysmap as sed script from link-vmlinux.shRichard Acayan1-1/+1
In commit b18b047002b7 ("kbuild: change scripts/mksysmap into sed script"), the mksysmap script was transformed into a sed script, made directly executable with "#!/bin/sed -f". Apparently, the path to sed is different on NixOS. The shebang can't use the env command, otherwise the "sed -f" command would be treated as a single argument. This can be solved with the -S flag, but that is a GNU extension. Explicitly use sed instead of relying on the executable shebang to fix NixOS builds without breaking build environments using Busybox. Fixes: b18b047002b7 ("kbuild: change scripts/mksysmap into sed script") Reported-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Richard Acayan <mailingradian@gmail.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Dmitry Safonov <0x7f454c46@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-06-06btrfs: protect folio::private when attaching extent buffer foliosQu Wenruo1-29/+31
[BUG] Since v6.8 there are rare kernel crashes reported by various people, the common factor is bad page status error messages like this: BUG: Bad page state in process kswapd0 pfn:d6e840 page: refcount:0 mapcount:0 mapping:000000007512f4f2 index:0x2796c2c7c pfn:0xd6e840 aops:btree_aops ino:1 flags: 0x17ffffe0000008(uptodate|node=0|zone=2|lastcpupid=0x3fffff) page_type: 0xffffffff() raw: 0017ffffe0000008 dead000000000100 dead000000000122 ffff88826d0be4c0 raw: 00000002796c2c7c 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: non-NULL mapping [CAUSE] Commit 09e6cef19c9f ("btrfs: refactor alloc_extent_buffer() to allocate-then-attach method") changes the sequence when allocating a new extent buffer. Previously we always called grab_extent_buffer() under mapping->i_private_lock, to ensure the safety on modification on folio::private (which is a pointer to extent buffer for regular sectorsize). This can lead to the following race: Thread A is trying to allocate an extent buffer at bytenr X, with 4 4K pages, meanwhile thread B is trying to release the page at X + 4K (the second page of the extent buffer at X). Thread A | Thread B -----------------------------------+------------------------------------- | btree_release_folio() | | This is for the page at X + 4K, | | Not page X. | | alloc_extent_buffer() | |- release_extent_buffer() |- filemap_add_folio() for the | | |- atomic_dec_and_test(eb->refs) | page at bytenr X (the first | | | | page). | | | | Which returned -EEXIST. | | | | | | | |- filemap_lock_folio() | | | | Returned the first page locked. | | | | | | | |- grab_extent_buffer() | | | | |- atomic_inc_not_zero() | | | | | Returned false | | | | |- folio_detach_private() | | |- folio_detach_private() for X | |- folio_test_private() | | |- folio_test_private() | Returned true | | | Returned true |- folio_put() | |- folio_put() Now there are two puts on the same folio at folio X, leading to refcount underflow of the folio X, and eventually causing the BUG_ON() on the page->mapping. The condition is not that easy to hit: - The release must be triggered for the middle page of an eb If the release is on the same first page of an eb, page lock would kick in and prevent the race. - folio_detach_private() has a very small race window It's only between folio_test_private() and folio_clear_private(). That's exactly when mapping->i_private_lock is used to prevent such race, and commit 09e6cef19c9f ("btrfs: refactor alloc_extent_buffer() to allocate-then-attach method") screwed that up. At that time, I thought the page lock would kick in as filemap_release_folio() also requires the page to be locked, but forgot the filemap_release_folio() only locks one page, not all pages of an extent buffer. [FIX] Move all the code requiring i_private_lock into attach_eb_folio_to_filemap(), so that everything is done with proper lock protection. Furthermore to prevent future problems, add an extra lockdep_assert_locked() to ensure we're holding the proper lock. To reproducer that is able to hit the race (takes a few minutes with instrumented code inserting delays to alloc_extent_buffer()): #!/bin/sh drop_caches () { while(true); do echo 3 > /proc/sys/vm/drop_caches echo 1 > /proc/sys/vm/compact_memory done } run_tar () { while(true); do for x in `seq 1 80` ; do tar cf /dev/zero /mnt > /dev/null & done wait done } mkfs.btrfs -f -d single -m single /dev/vda mount -o noatime /dev/vda /mnt # create 200,000 files, 1K each ./simoop -n 200000 -E -f 1k /mnt drop_caches & (run_tar) Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/linux-btrfs/CAHk-=wgt362nGfScVOOii8cgKn2LVVHeOvOA7OBwg1OwbuJQcw@mail.gmail.com/ Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Link: https://lore.kernel.org/lkml/CABXGCsPktcHQOvKTbPaTwegMExije=Gpgci5NW=hqORo-s7diA@mail.gmail.com/ Reported-by: Toralf Förster <toralf.foerster@gmx.de> Link: https://lore.kernel.org/linux-btrfs/e8b3311c-9a75-4903-907f-fc0f7a3fe423@gmx.de/ Reported-by: syzbot+f80b066392366b4af85e@syzkaller.appspotmail.com Fixes: 09e6cef19c9f ("btrfs: refactor alloc_extent_buffer() to allocate-then-attach method") CC: stable@vger.kernel.org # 6.8+ CC: Chris Mason <clm@fb.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-06-06selftests: net: lib: set 'i' as localMatthieu Baerts (NGI0)1-0/+1
Without this, the 'i' variable declared before could be overridden by accident, e.g. for i in "${@}"; do __ksft_status_merge "${i}" ## 'i' has been modified foo "${i}" ## using 'i' with an unexpected value done After a quick look, it looks like 'i' is currently not used after having been modified in __ksft_status_merge(), but still, better be safe than sorry. I saw this while modifying the same file, not because I suspected an issue somewhere. Fixes: 596c8819cb78 ("selftests: forwarding: Have RET track kselftest framework constants") Acked-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-3-b3afadd368c9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-06selftests: net: lib: avoid error removing empty netns nameMatthieu Baerts (NGI0)1-6/+7
If there is an error to create the first netns with 'setup_ns()', 'cleanup_ns()' will be called with an empty string as first parameter. The consequences is that 'cleanup_ns()' will try to delete an invalid netns, and wait 20 seconds if the netns list is empty. Instead of just checking if the name is not empty, convert the string separated by spaces to an array. Manipulating the array is cleaner, and calling 'cleanup_ns()' with an empty array will be a no-op. Fixes: 25ae948b4478 ("selftests/net: add lib.sh") Cc: stable@vger.kernel.org Acked-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-2-b3afadd368c9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-06selftests: net: lib: support errexit with busywaitMatthieu Baerts (NGI0)1-3/+1
If errexit is enabled ('set -e'), loopy_wait -- or busywait and others using it -- will stop after the first failure. Note that if the returned status of loopy_wait is checked, and even if errexit is enabled, Bash will not stop at the first error. Fixes: 25ae948b4478 ("selftests/net: add lib.sh") Cc: stable@vger.kernel.org Acked-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-1-b3afadd368c9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-06ata: pata_macio: Fix max_segment_size with PAGE_SIZE == 64KMichael Ellerman1-3/+6
The pata_macio driver advertises a max_segment_size of 0xff00, because the hardware doesn't cope with requests >= 64K. However the SCSI core requires max_segment_size to be at least PAGE_SIZE, which is a problem for pata_macio when the kernel is built with 64K pages. In older kernels the SCSI core would just increase the segment size to be equal to PAGE_SIZE, however since the commit tagged below it causes a warning and the device fails to probe: WARNING: CPU: 0 PID: 26 at block/blk-settings.c:202 .blk_validate_limits+0x2f8/0x35c CPU: 0 PID: 26 Comm: kworker/u4:1 Not tainted 6.10.0-rc1 #1 Hardware name: PowerMac7,2 PPC970 0x390202 PowerMac ... NIP .blk_validate_limits+0x2f8/0x35c LR .blk_alloc_queue+0xc0/0x2f8 Call Trace: .blk_alloc_queue+0xc0/0x2f8 .blk_mq_alloc_queue+0x60/0xf8 .scsi_alloc_sdev+0x208/0x3c0 .scsi_probe_and_add_lun+0x314/0x52c .__scsi_add_device+0x170/0x1a4 .ata_scsi_scan_host+0x2bc/0x3e4 .async_port_probe+0x6c/0xa0 .async_run_entry_fn+0x60/0x1bc .process_one_work+0x228/0x510 .worker_thread+0x360/0x530 .kthread+0x134/0x13c .start_kernel_thread+0x10/0x14 ... scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured Although the hardware can't cope with a 64K segment, the driver already deals with that internally by splitting large requests in pata_macio_qc_prep(). That is how the driver has managed to function until now on 64K kernels. So fix the driver to advertise a max_segment_size of 64K, which avoids the warning and keeps the SCSI core happy. Fixes: afd53a3d8528 ("scsi: core: Initialize scsi midlayer limits before allocating the queue") Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/all/ce2bf6af-4382-4fe1-b392-cc6829f5ceb2@roeck-us.net/ Reported-by: Doru Iorgulescu <doru.iorgulescu1@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218858 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Niklas Cassel <cassel@kernel.org>
2024-06-06net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool()Su Hui1-1/+1
Clang static checker (scan-build) warning: net/ethtool/ioctl.c:line 2233, column 2 Called function pointer is null (null dereference). Return '-EOPNOTSUPP' when 'ops->get_ethtool_phy_stats' is NULL to fix this typo error. Fixes: 201ed315f967 ("net/ethtool/ioctl: split ethtool_get_phy_stats into multiple helpers") Signed-off-by: Su Hui <suhui@nfschina.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Link: https://lore.kernel.org/r/20240605034742.921751-1-suhui@nfschina.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-06kconfig: remove wrong expr_trans_bool()Masahiro Yamada3-32/+0
expr_trans_bool() performs an incorrect transformation. [Test Code] config MODULES def_bool y modules config A def_bool y select C if B != n config B def_tristate m config C tristate [Result] CONFIG_MODULES=y CONFIG_A=y CONFIG_B=m CONFIG_C=m This output is incorrect because CONFIG_C=y is expected. Documentation/kbuild/kconfig-language.rst clearly explains the function of the '!=' operator: If the values of both symbols are equal, it returns 'n', otherwise 'y'. Therefore, the statement: select C if B != n should be equivalent to: select C if y Or, more simply: select C Hence, the symbol C should be selected by the value of A, which is 'y'. However, expr_trans_bool() wrongly transforms it to: select C if B Therefore, the symbol C is selected by (A && B), which is 'm'. The comment block of expr_trans_bool() correctly explains its intention: * bool FOO!=n => FOO ^^^^ If FOO is bool, FOO!=n can be simplified into FOO. This is correct. However, the actual code performs this transformation when FOO is tristate: if (e->left.sym->type == S_TRISTATE) { ^^^^^^^^^^ While it can be fixed to S_BOOLEAN, there is no point in doing so because expr_tranform() already transforms FOO!=n to FOO when FOO is bool. (see the "case E_UNEQUAL" part) expr_trans_bool() is wrong and unnecessary. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Randy Dunlap <rdunlap@infradead.org>
2024-06-06ipv6: fix possible race in __fib6_drop_pcpu_from()Eric Dumazet2-1/+6
syzbot found a race in __fib6_drop_pcpu_from() [1] If compiler reads more than once (*ppcpu_rt), second read could read NULL, if another cpu clears the value in rt6_get_pcpu_route(). Add a READ_ONCE() to prevent this race. Also add rcu_read_lock()/rcu_read_unlock() because we rely on RCU protection while dereferencing pcpu_rt. [1] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000012: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097] CPU: 0 PID: 7543 Comm: kworker/u8:17 Not tainted 6.10.0-rc1-syzkaller-00013-g2bfcfd584ff5 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024 Workqueue: netns cleanup_net RIP: 0010:__fib6_drop_pcpu_from.part.0+0x10a/0x370 net/ipv6/ip6_fib.c:984 Code: f8 48 c1 e8 03 80 3c 28 00 0f 85 16 02 00 00 4d 8b 3f 4d 85 ff 74 31 e8 74 a7 fa f7 49 8d bf 90 00 00 00 48 89 f8 48 c1 e8 03 <80> 3c 28 00 0f 85 1e 02 00 00 49 8b 87 90 00 00 00 48 8b 0c 24 48 RSP: 0018:ffffc900040df070 EFLAGS: 00010206 RAX: 0000000000000012 RBX: 0000000000000001 RCX: ffffffff89932e16 RDX: ffff888049dd1e00 RSI: ffffffff89932d7c RDI: 0000000000000091 RBP: dffffc0000000000 R08: 0000000000000005 R09: 0000000000000007 R10: 0000000000000001 R11: 0000000000000006 R12: ffff88807fa080b8 R13: fffffbfff1a9a07d R14: ffffed100ff41022 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff8880b9200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32c26000 CR3: 000000005d56e000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __fib6_drop_pcpu_from net/ipv6/ip6_fib.c:966 [inline] fib6_drop_pcpu_from net/ipv6/ip6_fib.c:1027 [inline] fib6_purge_rt+0x7f2/0x9f0 net/ipv6/ip6_fib.c:1038 fib6_del_route net/ipv6/ip6_fib.c:1998 [inline] fib6_del+0xa70/0x17b0 net/ipv6/ip6_fib.c:2043 fib6_clean_node+0x426/0x5b0 net/ipv6/ip6_fib.c:2205 fib6_walk_continue+0x44f/0x8d0 net/ipv6/ip6_fib.c:2127 fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2175 fib6_clean_tree+0xd7/0x120 net/ipv6/ip6_fib.c:2255 __fib6_clean_all+0x100/0x2d0 net/ipv6/ip6_fib.c:2271 rt6_sync_down_dev net/ipv6/route.c:4906 [inline] rt6_disable_ip+0x7ed/0xa00 net/ipv6/route.c:4911 addrconf_ifdown.isra.0+0x117/0x1b40 net/ipv6/addrconf.c:3855 addrconf_notify+0x223/0x19e0 net/ipv6/addrconf.c:3778 notifier_call_chain+0xb9/0x410 kernel/notifier.c:93 call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1992 call_netdevice_notifiers_extack net/core/dev.c:2030 [inline] call_netdevice_notifiers net/core/dev.c:2044 [inline] dev_close_many+0x333/0x6a0 net/core/dev.c:1585 unregister_netdevice_many_notify+0x46d/0x19f0 net/core/dev.c:11193 unregister_netdevice_many net/core/dev.c:11276 [inline] default_device_exit_batch+0x85b/0xae0 net/core/dev.c:11759 ops_exit_list+0x128/0x180 net/core/net_namespace.c:178 cleanup_net+0x5b7/0xbf0 net/core/net_namespace.c:640 process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231 process_scheduled_works kernel/workqueue.c:3312 [inline] worker_thread+0x6c8/0xf70 kernel/workqueue.c:3393 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Fixes: d52d3997f843 ("ipv6: Create percpu rt6_info") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20240604193549.981839-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-06kconfig: doc: document behavior of 'select' and 'imply' followed by 'if'Masahiro Yamada1-0/+10
Documentation/kbuild/kconfig-language.rst explains the behavior of 'select' as follows: reverse dependencies can be used to force a lower limit of another symbol. The value of the current menu symbol is used as the minimal value <symbol> can be set to. This is not true when the 'select' property is followed by 'if'. [Test Code] config MODULES def_bool y modules config A def_tristate y select C if B config B def_tristate m config C tristate [Result] CONFIG_MODULES=y CONFIG_A=y CONFIG_B=m CONFIG_C=m If "the value of A is used as the minimal value C can be set to", C must be 'y'. The actual behavior is "C is selected by (A && B)". The lower limit of C is downgraded due to B being 'm'. This behavior is kind of weird, and this has arisen several times in the mailing list. I do not know whether it is a bug or intended behavior. Anyway, it is not feasible to change it now because many Kconfig files are written based on this behavior. The same applies to 'imply'. Document this (but reserve the possibility for a future change). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
2024-06-06kconfig: doc: fix a typo in the note about 'imply'Masahiro Yamada1-1/+1
This sentence does not make sense due to a typo. Fix it. Fixes: def2fbffe62c ("kconfig: allow symbols implied by y to become m") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
2024-06-06kconfig: gconf: give a proper initial state to the Save buttonMasahiro Yamada1-1/+2
Currently, the initial state of the "Save" button is always active. If none of the CONFIG options are changed while loading the .config file, the "Save" button should be greyed out. This can be fixed by calling conf_read() after widget initialization. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-06-06kconfig: remove unneeded code for user-supplied values being out of rangeMasahiro Yamada1-13/+0
This is a leftover from commit ce1fc9345a59 ("kconfig: do not clear SYMBOL_DEF_USER when the value is out of range"). This code is now redundant because if a user-supplied value is out of range, the value adjusted by sym_validate_range() differs, and conf_unsaved has already been incremented a few lines above. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-06-06af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill().Kuniyuki Iwashima1-1/+1
While dumping sockets via UNIX_DIAG, we do not hold unix_state_lock(). Let's use READ_ONCE() to read sk->sk_shutdown. Fixes: e4e541a84863 ("sock-diag: Report shutdown for inet and unix sockets (v2)") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-06af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen().Kuniyuki Iwashima1-1/+1
We can dump the socket queue length via UNIX_DIAG by specifying UDIAG_SHOW_RQLEN. If sk->sk_state is TCP_LISTEN, we return the recv queue length, but here we do not hold recvq lock. Let's use skb_queue_len_lockless() in sk_diag_show_rqlen(). Fixes: c9da99e6475f ("unix_diag: Fixup RQLEN extension report") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-06af_unix: Use skb_queue_empty_lockless() in unix_release_sock().Kuniyuki Iwashima1-1/+1
If the socket type is SOCK_STREAM or SOCK_SEQPACKET, unix_release_sock() checks the length of the peer socket's recvq under unix_state_lock(). However, unix_stream_read_generic() calls skb_unlink() after releasing the lock. Also, for SOCK_SEQPACKET, __skb_try_recv_datagram() unlinks skb without unix_state_lock(). Thues, unix_state_lock() does not protect qlen. Let's use skb_queue_empty_lockless() in unix_release_sock(). Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-06af_unix: Use unix_recvq_full_lockless() in unix_stream_connect().Kuniyuki Iwashima1-8/+2
Once sk->sk_state is changed to TCP_LISTEN, it never changes. unix_accept() takes advantage of this characteristics; it does not hold the listener's unix_state_lock() and only acquires recvq lock to pop one skb. It means unix_state_lock() does not prevent the queue length from changing in unix_stream_connect(). Thus, we need to use unix_recvq_full_lockless() to avoid data-race. Now we remove unix_recvq_full() as no one uses it. Note that we can remove READ_ONCE() for sk->sk_max_ack_backlog in unix_recvq_full_lockless() because of the following reasons: (1) For SOCK_DGRAM, it is a written-once field in unix_create1() (2) For SOCK_STREAM and SOCK_SEQPACKET, it is changed under the listener's unix_state_lock() in unix_listen(), and we hold the lock in unix_stream_connect() Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-06af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen.Kuniyuki Iwashima1-1/+1
net->unx.sysctl_max_dgram_qlen is exposed as a sysctl knob and can be changed concurrently. Let's use READ_ONCE() in unix_create1(). Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-06af_unix: Annotate data-races around sk->sk_sndbuf.Kuniyuki Iwashima1-3/+3
sk_setsockopt() changes sk->sk_sndbuf under lock_sock(), but it's not used in af_unix.c. Let's use READ_ONCE() to read sk->sk_sndbuf in unix_writable(), unix_dgram_sendmsg(), and unix_stream_sendmsg(). Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-06af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG.Kuniyuki Iwashima1-4/+4
While dumping AF_UNIX sockets via UNIX_DIAG, sk->sk_state is read locklessly. Let's use READ_ONCE() there. Note that the result could be inconsistent if the socket is dumped during the state change. This is common for other SOCK_DIAG and similar interfaces. Fixes: c9da99e6475f ("unix_diag: Fixup RQLEN extension report") Fixes: 2aac7a2cb0d9 ("unix_diag: Pending connections IDs NLA") Fixes: 45a96b9be6ec ("unix_diag: Dumping all sockets core") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-06af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb().Kuniyuki Iwashima1-1/+1
unix_stream_read_skb() is called from sk->sk_data_ready() context where unix_state_lock() is not held. Let's use READ_ONCE() there. Fixes: 77462de14a43 ("af_unix: Add read_sock for stream socket types") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-06af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg().Kuniyuki Iwashima1-4/+4
The following functions read sk->sk_state locklessly and proceed only if the state is TCP_ESTABLISHED. * unix_stream_sendmsg * unix_stream_read_generic * unix_seqpacket_sendmsg * unix_seqpacket_recvmsg Let's use READ_ONCE() there. Fixes: a05d2ad1c1f3 ("af_unix: Only allow recv on connected seqpacket sockets.") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>