Age | Commit message (Collapse) | Author | Files | Lines |
|
Currently on driver bringup with KASAN enabled, meson triggers an OOB
memory access as shown below:
[ 117.904528] ==================================================================
[ 117.904560] BUG: KASAN: global-out-of-bounds in meson_viu_set_osd_lut+0x7a0/0x890
[ 117.904588] Read of size 4 at addr ffff20000a63ce24 by task systemd-udevd/498
[ 117.904601]
[ 118.083372] CPU: 4 PID: 498 Comm: systemd-udevd Not tainted 4.20.0-rc3Lyude-Test+ #20
[ 118.091143] Hardware name: amlogic khadas-vim2/khadas-vim2, BIOS 2018.07-rc2-armbian 09/11/2018
[ 118.099768] Call trace:
[ 118.102181] dump_backtrace+0x0/0x3e8
[ 118.105796] show_stack+0x14/0x20
[ 118.109083] dump_stack+0x130/0x1c4
[ 118.112539] print_address_description+0x60/0x25c
[ 118.117214] kasan_report+0x1b4/0x368
[ 118.120851] __asan_report_load4_noabort+0x18/0x20
[ 118.125566] meson_viu_set_osd_lut+0x7a0/0x890
[ 118.129953] meson_viu_init+0x10c/0x290
[ 118.133741] meson_drv_bind_master+0x474/0x748
[ 118.138141] meson_drv_bind+0x10/0x18
[ 118.141760] try_to_bring_up_master+0x3d8/0x768
[ 118.146249] component_add+0x214/0x570
[ 118.149978] meson_dw_hdmi_probe+0x18/0x20 [meson_dw_hdmi]
[ 118.155404] platform_drv_probe+0x98/0x138
[ 118.159455] really_probe+0x2a0/0xa70
[ 118.163070] driver_probe_device+0x1b4/0x2d8
[ 118.167299] __driver_attach+0x200/0x280
[ 118.171189] bus_for_each_dev+0x10c/0x1a8
[ 118.175144] driver_attach+0x38/0x50
[ 118.178681] bus_add_driver+0x330/0x608
[ 118.182471] driver_register+0x140/0x388
[ 118.186361] __platform_driver_register+0xc8/0x108
[ 118.191117] meson_dw_hdmi_platform_driver_init+0x1c/0x1000 [meson_dw_hdmi]
[ 118.198022] do_one_initcall+0x12c/0x3bc
[ 118.201883] do_init_module+0x1fc/0x638
[ 118.205673] load_module+0x4b4c/0x6808
[ 118.209387] __se_sys_init_module+0x2e8/0x3c0
[ 118.213699] __arm64_sys_init_module+0x68/0x98
[ 118.218100] el0_svc_common+0x104/0x210
[ 118.221893] el0_svc_handler+0x48/0xb8
[ 118.225594] el0_svc+0x8/0xc
[ 118.228429]
[ 118.229887] The buggy address belongs to the variable:
[ 118.235007] eotf_33_linear_mapping+0x84/0xc0
[ 118.239301]
[ 118.240752] Memory state around the buggy address:
[ 118.245522] ffff20000a63cd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 118.252695] ffff20000a63cd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 118.259850] >ffff20000a63ce00: 00 00 00 00 04 fa fa fa fa fa fa fa 00 00 00 00
[ 118.267000] ^
[ 118.271222] ffff20000a63ce80: 00 fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
[ 118.278393] ffff20000a63cf00: 00 00 00 00 00 00 00 00 00 00 00 00 04 fa fa fa
[ 118.285542] ==================================================================
[ 118.292699] Disabling lock debugging due to kernel taint
It seems that when looping through the OSD EOTF LUT maps, we use the
same max iterator for OETF: 20. This is wrong though, since 20*2 is 40,
which means that we'll stop out of bounds on the EOTF maps.
But, this whole thing is already confusing enough to read through as-is,
so let's just replace all of the hardcoded sizes with
OSD_(OETF/EOTF)_LUT_SIZE / 2.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: bbbe775ec5b5 ("drm: Add support for Amlogic Meson Graphic Controller")
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Maxime Ripard <maxime.ripard@bootlin.com>
Cc: Carlo Caione <carlo@caione.org>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-amlogic@lists.infradead.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: <stable@vger.kernel.org> # v4.10+
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181125012117.31915-1-lyude@redhat.com
Signed-off-by: Sean Paul <seanpaul@chromium.org>
|
|
Seeing as we use this registermap in the context of our IRQ handlers, we
need to be using spinlocks for reading/writing registers so that we can
still read them from IRQ handlers without having to grab any mutexes and
accidentally sleep. We don't currently do this, as pointed out by
lockdep:
[ 18.403770] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908
[ 18.406744] in_atomic(): 1, irqs_disabled(): 128, pid: 68, name: kworker/u17:0
[ 18.413864] INFO: lockdep is turned off.
[ 18.417675] irq event stamp: 12
[ 18.420778] hardirqs last enabled at (11): [<ffff000008a4f57c>] _raw_spin_unlock_irq+0x2c/0x60
[ 18.429510] hardirqs last disabled at (12): [<ffff000008a48914>] __schedule+0xc4/0xa60
[ 18.437345] softirqs last enabled at (0): [<ffff0000080b55e0>] copy_process.isra.4.part.5+0x4d8/0x1c50
[ 18.446684] softirqs last disabled at (0): [<0000000000000000>] (null)
[ 18.453979] CPU: 0 PID: 68 Comm: kworker/u17:0 Tainted: G W O 4.20.0-rc3Lyude-Test+ #9
[ 18.469839] Hardware name: amlogic khadas-vim2/khadas-vim2, BIOS 2018.07-rc2-armbian 09/11/2018
[ 18.480037] Workqueue: hci0 hci_power_on [bluetooth]
[ 18.487138] Call trace:
[ 18.494192] dump_backtrace+0x0/0x1b8
[ 18.501280] show_stack+0x14/0x20
[ 18.508361] dump_stack+0xbc/0xf4
[ 18.515427] ___might_sleep+0x140/0x1d8
[ 18.522515] __might_sleep+0x50/0x88
[ 18.529582] __mutex_lock+0x60/0x870
[ 18.536621] mutex_lock_nested+0x1c/0x28
[ 18.543660] regmap_lock_mutex+0x10/0x18
[ 18.550696] regmap_read+0x38/0x70
[ 18.557727] dw_hdmi_hardirq+0x58/0x138 [dw_hdmi]
[ 18.564804] __handle_irq_event_percpu+0xac/0x410
[ 18.571891] handle_irq_event_percpu+0x34/0x88
[ 18.578982] handle_irq_event+0x48/0x78
[ 18.586051] handle_fasteoi_irq+0xac/0x160
[ 18.593061] generic_handle_irq+0x24/0x38
[ 18.599989] __handle_domain_irq+0x60/0xb8
[ 18.606857] gic_handle_irq+0x50/0xa0
[ 18.613659] el1_irq+0xb4/0x130
[ 18.620394] debug_lockdep_rcu_enabled+0x2c/0x30
[ 18.627111] schedule+0x38/0xa0
[ 18.633781] schedule_timeout+0x3a8/0x510
[ 18.640389] wait_for_common+0x15c/0x180
[ 18.646905] wait_for_completion+0x14/0x20
[ 18.653319] mmc_wait_for_req_done+0x28/0x168
[ 18.659693] mmc_wait_for_req+0xa8/0xe8
[ 18.665978] mmc_wait_for_cmd+0x64/0x98
[ 18.672180] mmc_io_rw_direct_host+0x94/0x130
[ 18.678385] mmc_io_rw_direct+0x10/0x18
[ 18.684516] sdio_enable_func+0xe8/0x1d0
[ 18.690627] btsdio_open+0x24/0xc0 [btsdio]
[ 18.696821] hci_dev_do_open+0x64/0x598 [bluetooth]
[ 18.703025] hci_power_on+0x50/0x270 [bluetooth]
[ 18.709163] process_one_work+0x2a0/0x6e0
[ 18.715252] worker_thread+0x40/0x448
[ 18.721310] kthread+0x12c/0x130
[ 18.727326] ret_from_fork+0x10/0x1c
[ 18.735555] ------------[ cut here ]------------
[ 18.741430] do not call blocking ops when !TASK_RUNNING; state=2 set at [<000000006265ec59>] wait_for_common+0x140/0x180
[ 18.752417] WARNING: CPU: 0 PID: 68 at kernel/sched/core.c:6096 __might_sleep+0x7c/0x88
[ 18.760553] Modules linked in: dm_mirror dm_region_hash dm_log dm_mod
btsdio bluetooth snd_soc_hdmi_codec dw_hdmi_i2s_audio ecdh_generic
brcmfmac brcmutil cfg80211 rfkill ir_nec_decoder meson_dw_hdmi(O)
dw_hdmi rc_geekbox meson_rng meson_ir ao_cec rng_core rc_core cec
leds_pwm efivars nfsd ip_tables x_tables crc32_generic f2fs uas
meson_gxbb_wdt pwm_meson efivarfs ipv6
[ 18.799469] CPU: 0 PID: 68 Comm: kworker/u17:0 Tainted: G W O 4.20.0-rc3Lyude-Test+ #9
[ 18.808858] Hardware name: amlogic khadas-vim2/khadas-vim2, BIOS 2018.07-rc2-armbian 09/11/2018
[ 18.818045] Workqueue: hci0 hci_power_on [bluetooth]
[ 18.824088] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[ 18.829891] pc : __might_sleep+0x7c/0x88
[ 18.835722] lr : __might_sleep+0x7c/0x88
[ 18.841256] sp : ffff000008003cb0
[ 18.846751] x29: ffff000008003cb0 x28: 0000000000000000
[ 18.852269] x27: ffff00000938e000 x26: ffff800010283000
[ 18.857726] x25: ffff800010353280 x24: ffff00000868ef50
[ 18.863166] x23: 0000000000000000 x22: 0000000000000000
[ 18.868551] x21: 0000000000000000 x20: 000000000000038c
[ 18.873850] x19: ffff000008cd08c0 x18: 0000000000000010
[ 18.879081] x17: ffff000008a68cb0 x16: 0000000000000000
[ 18.884197] x15: 0000000000aaaaaa x14: 0e200e200e200e20
[ 18.889239] x13: 0000000000000001 x12: 00000000ffffffff
[ 18.894261] x11: ffff000008adfa48 x10: 0000000000000001
[ 18.899517] x9 : ffff0000092a0158 x8 : 0000000000000000
[ 18.904674] x7 : ffff00000812136c x6 : 0000000000000000
[ 18.909895] x5 : 0000000000000000 x4 : 0000000000000001
[ 18.915080] x3 : 0000000000000007 x2 : 0000000000000007
[ 18.920269] x1 : 99ab8e9ebb6c8500 x0 : 0000000000000000
[ 18.925443] Call trace:
[ 18.929904] __might_sleep+0x7c/0x88
[ 18.934311] __mutex_lock+0x60/0x870
[ 18.938687] mutex_lock_nested+0x1c/0x28
[ 18.943076] regmap_lock_mutex+0x10/0x18
[ 18.947453] regmap_read+0x38/0x70
[ 18.951842] dw_hdmi_hardirq+0x58/0x138 [dw_hdmi]
[ 18.956269] __handle_irq_event_percpu+0xac/0x410
[ 18.960712] handle_irq_event_percpu+0x34/0x88
[ 18.965176] handle_irq_event+0x48/0x78
[ 18.969612] handle_fasteoi_irq+0xac/0x160
[ 18.974058] generic_handle_irq+0x24/0x38
[ 18.978501] __handle_domain_irq+0x60/0xb8
[ 18.982938] gic_handle_irq+0x50/0xa0
[ 18.987351] el1_irq+0xb4/0x130
[ 18.991734] debug_lockdep_rcu_enabled+0x2c/0x30
[ 18.996180] schedule+0x38/0xa0
[ 19.000609] schedule_timeout+0x3a8/0x510
[ 19.005064] wait_for_common+0x15c/0x180
[ 19.009513] wait_for_completion+0x14/0x20
[ 19.013951] mmc_wait_for_req_done+0x28/0x168
[ 19.018402] mmc_wait_for_req+0xa8/0xe8
[ 19.022809] mmc_wait_for_cmd+0x64/0x98
[ 19.027177] mmc_io_rw_direct_host+0x94/0x130
[ 19.031563] mmc_io_rw_direct+0x10/0x18
[ 19.035922] sdio_enable_func+0xe8/0x1d0
[ 19.040294] btsdio_open+0x24/0xc0 [btsdio]
[ 19.044742] hci_dev_do_open+0x64/0x598 [bluetooth]
[ 19.049228] hci_power_on+0x50/0x270 [bluetooth]
[ 19.053687] process_one_work+0x2a0/0x6e0
[ 19.058143] worker_thread+0x40/0x448
[ 19.062608] kthread+0x12c/0x130
[ 19.067064] ret_from_fork+0x10/0x1c
[ 19.071513] irq event stamp: 12
[ 19.075937] hardirqs last enabled at (11): [<ffff000008a4f57c>] _raw_spin_unlock_irq+0x2c/0x60
[ 19.083560] hardirqs last disabled at (12): [<ffff000008a48914>] __schedule+0xc4/0xa60
[ 19.091401] softirqs last enabled at (0): [<ffff0000080b55e0>] copy_process.isra.4.part.5+0x4d8/0x1c50
[ 19.100801] softirqs last disabled at (0): [<0000000000000000>] (null)
[ 19.108135] ---[ end trace 38c4920787b88c75 ]---
So, fix this by enabling the fast_io option in our regmap config so that
regmap uses spinlocks for locking instead of mutexes.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 3f68be7d8e96 ("drm/meson: Add support for HDMI encoder and DW-HDMI bridge + PHY")
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Carlo Caione <carlo@caione.org>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-amlogic@lists.infradead.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: <stable@vger.kernel.org> # v4.12+
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181124191238.28276-1-lyude@redhat.com
Signed-off-by: Sean Paul <seanpaul@chromium.org>
|
|
Since Linux 4.17, calls to drm_crtc_vblank_on/off are mandatory, and we get
a warning when ctrc is disabled :
" driver forgot to call drm_crtc_vblank_off()"
But, the vsync IRQ was not totally disabled due the transient hardware
state and specific interrupt line, thus adding proper IRQ masking from
the HHI system control registers.
The last change fixes a race condition introduced by calling the added
drm_crtc_vblank_on/off when an HPD event occurs from the HDMI connector,
triggering a WARN_ON() in the _atomic_begin() callback when the CRTC
is disabled, thus also triggering a WARN_ON() in drm_vblank_put() :
WARNING: CPU: 0 PID: 1185 at drivers/gpu/drm/meson/meson_crtc.c:157 meson_crtc_atomic_begin+0x78/0x80
[...]
Call trace:
meson_crtc_atomic_begin+0x78/0x80
drm_atomic_helper_commit_planes+0x140/0x218
drm_atomic_helper_commit_tail+0x38/0x80
commit_tail+0x7c/0x80
drm_atomic_helper_commit+0xdc/0x150
drm_atomic_commit+0x54/0x60
restore_fbdev_mode_atomic+0x198/0x238
restore_fbdev_mode+0x6c/0x1c0
drm_fb_helper_restore_fbdev_mode_unlocked+0x7c/0xf0
drm_fb_helper_set_par+0x34/0x60
drm_fb_helper_hotplug_event.part.28+0xb8/0xc8
drm_fbdev_client_hotplug+0xa4/0xe0
drm_client_dev_hotplug+0x90/0xe0
drm_kms_helper_hotplug_event+0x3c/0x48
drm_helper_hpd_irq_event+0x134/0x168
dw_hdmi_top_thread_irq+0x3c/0x50
[...]
WARNING: CPU: 0 PID: 1185 at drivers/gpu/drm/drm_vblank.c:1026 drm_vblank_put+0xb4/0xc8
[...]
Call trace:
drm_vblank_put+0xb4/0xc8
drm_crtc_vblank_put+0x24/0x30
drm_atomic_helper_wait_for_vblanks.part.9+0x130/0x2b8
drm_atomic_helper_commit_tail+0x68/0x80
[...]
The issue is that vblank need to be enabled in any occurrence of :
- atomic_enable()
- atomic_begin() and state->enable == true, which was not the case
Moving the CRTC enable code to a common function and calling in one of
these occurrence solves this race condition and makes sure vblank is
enabled in each call to _atomic_begin() from the HPD event leading to
drm_atomic_helper_commit_planes().
To Summarize :
- Make sure that the CRTC code will call the drm_crtc_vblank_on()/off()
- *Really* mask the Vsync IRQ
- Initialize and enable vblank at the first
atomic_begin()/_atomic_enable()
Cc: stable@vger.kernel.org # 4.17+
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
[fixed typos+added cc for stable]
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181122160103.10993-1-narmstrong@baylibre.com
Signed-off-by: Sean Paul <seanpaul@chromium.org>
|
|
When drm_new_set_master() fails, set is_master to 0, to prevent a
possible NULL pointer deref.
Here is a problematic flow: we check is_master in drm_is_current_master(),
then proceed to call drm_lease_owner() passing master. If we do not restore
is_master status when drm_new_set_master() fails, we may have a situation
in which is_master will be 1 and master itself, NULL, leading to the deref
of a NULL pointer in drm_lease_owner().
This fixes the following OOPS, observed on an ArchLinux running a 4.19.2
kernel:
[ 97.804282] BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
[ 97.807224] PGD 0 P4D 0
[ 97.807224] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 97.807224] CPU: 0 PID: 1348 Comm: xfwm4 Tainted: P OE 4.19.2-arch1-1-ARCH #1
[ 97.807224] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./AB350 Pro4, BIOS P5.10 10/16/2018
[ 97.807224] RIP: 0010:drm_lease_owner+0xd/0x20 [drm]
[ 97.807224] Code: 83 c4 18 5b 5d c3 b8 ea ff ff ff eb e2 b8 ed ff ff ff eb db e8 b4 ca 68 fb 0f 1f 40 00 0f 1f 44 00 00 48 89 f8 eb 03 48 89 d0 <48> 8b 90 80 00 00 00 48 85 d2 75 f1 c3 66 0f 1f 44 00 00 0f 1f 44
[ 97.807224] RSP: 0018:ffffb8cf08e07bb0 EFLAGS: 00010202
[ 97.807224] RAX: 0000000000000000 RBX: ffff9cf0f2586c00 RCX: ffff9cf0f2586c88
[ 97.807224] RDX: ffff9cf0ddbd8000 RSI: 0000000000000000 RDI: 0000000000000000
[ 97.807224] RBP: ffff9cf1040e9800 R08: 0000000000000000 R09: 0000000000000000
[ 97.807224] R10: ffffdeb30fd5d680 R11: ffffdeb30f5d6808 R12: ffff9cf1040e9888
[ 97.807224] R13: 0000000000000000 R14: dead000000000200 R15: ffff9cf0f2586cc8
[ 97.807224] FS: 00007f4145513180(0000) GS:ffff9cf10ea00000(0000) knlGS:0000000000000000
[ 97.807224] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 97.807224] CR2: 0000000000000080 CR3: 00000003d7548000 CR4: 00000000003406f0
[ 97.807224] Call Trace:
[ 97.807224] drm_is_current_master+0x1a/0x30 [drm]
[ 97.807224] drm_master_release+0x3e/0x130 [drm]
[ 97.807224] drm_file_free.part.0+0x2be/0x2d0 [drm]
[ 97.807224] drm_open+0x1ba/0x1e0 [drm]
[ 97.807224] drm_stub_open+0xaf/0xe0 [drm]
[ 97.807224] chrdev_open+0xa3/0x1b0
[ 97.807224] ? cdev_put.part.0+0x20/0x20
[ 97.807224] do_dentry_open+0x132/0x340
[ 97.807224] path_openat+0x2d1/0x14e0
[ 97.807224] ? mem_cgroup_commit_charge+0x7a/0x520
[ 97.807224] do_filp_open+0x93/0x100
[ 97.807224] ? __check_object_size+0x102/0x189
[ 97.807224] ? _raw_spin_unlock+0x16/0x30
[ 97.807224] do_sys_open+0x186/0x210
[ 97.807224] do_syscall_64+0x5b/0x170
[ 97.807224] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 97.807224] RIP: 0033:0x7f4147b07976
[ 97.807224] Code: 89 54 24 08 e8 7b f4 ff ff 8b 74 24 0c 48 8b 3c 24 41 89 c0 44 8b 54 24 08 b8 01 01 00 00 89 f2 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 77 30 44 89 c7 89 44 24 08 e8 a6 f4 ff ff 8b 44
[ 97.807224] RSP: 002b:00007ffcced96ca0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
[ 97.807224] RAX: ffffffffffffffda RBX: 00005619d5037f80 RCX: 00007f4147b07976
[ 97.807224] RDX: 0000000000000002 RSI: 00005619d46b969c RDI: 00000000ffffff9c
[ 98.040039] RBP: 0000000000000024 R08: 0000000000000000 R09: 0000000000000000
[ 98.040039] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000024
[ 98.040039] R13: 0000000000000012 R14: 00005619d5035950 R15: 0000000000000012
[ 98.040039] Modules linked in: nct6775 hwmon_vid algif_skcipher af_alg nls_iso8859_1 nls_cp437 vfat fat uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common arc4 videodev media snd_usb_audio snd_hda_codec_hdmi snd_usbmidi_lib snd_rawmidi snd_seq_device mousedev input_leds iwlmvm mac80211 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec edac_mce_amd kvm_amd snd_hda_core kvm iwlwifi snd_hwdep r8169 wmi_bmof cfg80211 snd_pcm irqbypass snd_timer snd libphy soundcore pinctrl_amd rfkill pcspkr sp5100_tco evdev gpio_amdpt k10temp mac_hid i2c_piix4 wmi pcc_cpufreq acpi_cpufreq vboxnetflt(OE) vboxnetadp(OE) vboxpci(OE) vboxdrv(OE) msr sg crypto_user ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 fscrypto uas usb_storage dm_crypt hid_generic usbhid hid
[ 98.040039] dm_mod raid1 md_mod sd_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc ahci libahci aesni_intel aes_x86_64 libata crypto_simd cryptd glue_helper ccp xhci_pci rng_core scsi_mod xhci_hcd nvidia_drm(POE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm agpgart nvidia_uvm(POE) nvidia_modeset(POE) nvidia(POE) ipmi_devintf ipmi_msghandler
[ 98.040039] CR2: 0000000000000080
[ 98.040039] ---[ end trace 3b65093b6fe62b2f ]---
[ 98.040039] RIP: 0010:drm_lease_owner+0xd/0x20 [drm]
[ 98.040039] Code: 83 c4 18 5b 5d c3 b8 ea ff ff ff eb e2 b8 ed ff ff ff eb db e8 b4 ca 68 fb 0f 1f 40 00 0f 1f 44 00 00 48 89 f8 eb 03 48 89 d0 <48> 8b 90 80 00 00 00 48 85 d2 75 f1 c3 66 0f 1f 44 00 00 0f 1f 44
[ 98.040039] RSP: 0018:ffffb8cf08e07bb0 EFLAGS: 00010202
[ 98.040039] RAX: 0000000000000000 RBX: ffff9cf0f2586c00 RCX: ffff9cf0f2586c88
[ 98.040039] RDX: ffff9cf0ddbd8000 RSI: 0000000000000000 RDI: 0000000000000000
[ 98.040039] RBP: ffff9cf1040e9800 R08: 0000000000000000 R09: 0000000000000000
[ 98.040039] R10: ffffdeb30fd5d680 R11: ffffdeb30f5d6808 R12: ffff9cf1040e9888
[ 98.040039] R13: 0000000000000000 R14: dead000000000200 R15: ffff9cf0f2586cc8
[ 98.040039] FS: 00007f4145513180(0000) GS:ffff9cf10ea00000(0000) knlGS:0000000000000000
[ 98.040039] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 98.040039] CR2: 0000000000000080 CR3: 00000003d7548000 CR4: 00000000003406f0
Signed-off-by: Sergio Correia <sergio@correia.cc>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20181122053329.2692-1-sergio@correia.cc
Signed-off-by: Sean Paul <seanpaul@chromium.org>
|
|
Jerry Zuo pointed out a rather obscure hotplugging issue that it seems I
accidentally introduced into DRM two years ago.
Pretend we have a topology like this:
|- DP-1: mst_primary
|- DP-4: active display
|- DP-5: disconnected
|- DP-6: active hub
|- DP-7: active display
|- DP-8: disconnected
|- DP-9: disconnected
If we unplug DP-6, the topology starting at DP-7 will be destroyed but
it's payloads will live on in DP-1's VCPI allocations and thus require
removal. However, this removal currently fails because
drm_dp_update_payload_part1() will (rightly so) try to validate the port
before accessing it, fail then abort. If we keep going, eventually we
run the MST hub out of bandwidth and all new allocations will start to
fail (or in my case; all new displays just start flickering a ton).
We could just teach drm_dp_update_payload_part1() not to drop the port
ref in this case, but then we also need to teach
drm_dp_destroy_payload_step1() to do the same thing, then hope no one
ever adds anything to the that requires a validated port reference in
drm_dp_destroy_connector_work(). Kind of sketchy.
So let's go with a more clever solution: any port that
drm_dp_destroy_connector_work() interacts with is guaranteed to still
exist in memory until we say so. While said port might not be valid we
don't really care: that's the whole reason we're destroying it in the
first place! So, teach drm_dp_get_validated_port_ref() to use the all
mighty current_work() function to avoid attempting to validate ports
from the context of mgr->destroy_connector_work. I can't see any
situation where this wouldn't be safe, and this avoids having to play
whack-a-mole in the future of trying to work around port validation.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 263efde31f97 ("drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1()")
Reported-by: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: <stable@vger.kernel.org> # v4.6+
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181113224613.28809-1-lyude@redhat.com
Signed-off-by: Sean Paul <seanpaul@chromium.org>
|
|
|
|
I'm taking over the maintainance of Sparse so add myself as
maintainer and move Christopher's info to CREDITS.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The TX stats should be started with the tx_stats_syncp,
there seems to be a copy/paste error in the driver.
Signed-off-by: Andreas Fiedler <andreas.fiedler@gmx.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The vsc85xx_default_config function called in the vsc85xx_config_init
function which is used by VSC8530, VSC8531, VSC8540 and VSC8541 PHYs
mistakenly calls phy_read and phy_write in-between phy_select_page and
phy_restore_page.
phy_select_page and phy_restore_page actually take and release the MDIO
bus lock and phy_write and phy_read take and release the lock to write
or read to a PHY register.
Let's fix this deadlock by using phy_modify_paged which handles
correctly a read followed by a write in a non-standard page.
Fixes: 6a0bfbbe20b0 ("net: phy: mscc: migrate to phy_select/restore_page functions")
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The correct form is "can be probed", so fix the typo.
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Reset snd_queue tso_hdrs pointer to NULL in nicvf_free_snd_queue routine
since it is used to check if tso dma descriptor queue has been previously
allocated. The issue can be triggered with the following reproducer:
$ip link set dev enP2p1s0v0 xdpdrv obj xdp_dummy.o
$ip link set dev enP2p1s0v0 xdpdrv off
[ 341.467649] WARNING: CPU: 74 PID: 2158 at mm/vmalloc.c:1511 __vunmap+0x98/0xe0
[ 341.515010] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018
[ 341.521874] pstate: 60400005 (nZCv daif +PAN -UAO)
[ 341.526654] pc : __vunmap+0x98/0xe0
[ 341.530132] lr : __vunmap+0x98/0xe0
[ 341.533609] sp : ffff00001c5db860
[ 341.536913] x29: ffff00001c5db860 x28: 0000000000020000
[ 341.542214] x27: ffff810feb5090b0 x26: ffff000017e57000
[ 341.547515] x25: 0000000000000000 x24: 00000000fbd00000
[ 341.552816] x23: 0000000000000000 x22: ffff810feb5090b0
[ 341.558117] x21: 0000000000000000 x20: 0000000000000000
[ 341.563418] x19: ffff000017e57000 x18: 0000000000000000
[ 341.568719] x17: 0000000000000000 x16: 0000000000000000
[ 341.574020] x15: 0000000000000010 x14: ffffffffffffffff
[ 341.579321] x13: ffff00008985eb27 x12: ffff00000985eb2f
[ 341.584622] x11: ffff0000096b3000 x10: ffff00001c5db510
[ 341.589923] x9 : 00000000ffffffd0 x8 : ffff0000086868e8
[ 341.595224] x7 : 3430303030303030 x6 : 00000000000006ef
[ 341.600525] x5 : 00000000003fffff x4 : 0000000000000000
[ 341.605825] x3 : 0000000000000000 x2 : ffffffffffffffff
[ 341.611126] x1 : ffff0000096b3728 x0 : 0000000000000038
[ 341.616428] Call trace:
[ 341.618866] __vunmap+0x98/0xe0
[ 341.621997] vunmap+0x3c/0x50
[ 341.624961] arch_dma_free+0x68/0xa0
[ 341.628534] dma_direct_free+0x50/0x80
[ 341.632285] nicvf_free_resources+0x160/0x2d8 [nicvf]
[ 341.637327] nicvf_config_data_transfer+0x174/0x5e8 [nicvf]
[ 341.642890] nicvf_stop+0x298/0x340 [nicvf]
[ 341.647066] __dev_close_many+0x9c/0x108
[ 341.650977] dev_close_many+0xa4/0x158
[ 341.654720] rollback_registered_many+0x140/0x530
[ 341.659414] rollback_registered+0x54/0x80
[ 341.663499] unregister_netdevice_queue+0x9c/0xe8
[ 341.668192] unregister_netdev+0x28/0x38
[ 341.672106] nicvf_remove+0xa4/0xa8 [nicvf]
[ 341.676280] nicvf_shutdown+0x20/0x30 [nicvf]
[ 341.680630] pci_device_shutdown+0x44/0x88
[ 341.684720] device_shutdown+0x144/0x250
[ 341.688640] kernel_restart_prepare+0x44/0x50
[ 341.692986] kernel_restart+0x20/0x68
[ 341.696638] __se_sys_reboot+0x210/0x238
[ 341.700550] __arm64_sys_reboot+0x24/0x30
[ 341.704555] el0_svc_handler+0x94/0x110
[ 341.708382] el0_svc+0x8/0xc
[ 341.711252] ---[ end trace 3f4019c8439959c9 ]---
[ 341.715874] page:ffff7e0003ef4000 count:0 mapcount:0 mapping:0000000000000000 index:0x4
[ 341.723872] flags: 0x1fffe000000000()
[ 341.727527] raw: 001fffe000000000 ffff7e0003f1a008 ffff7e0003ef4048 0000000000000000
[ 341.735263] raw: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000
[ 341.742994] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
where xdp_dummy.c is a simple bpf program that forwards the incoming
frames to the network stack (available here:
https://github.com/altoor/xdp_walkthrough_examples/blob/master/sample_1/xdp_dummy.c)
Fixes: 05c773f52b96 ("net: thunderx: Add basic XDP support")
Fixes: 4863dea3fab0 ("net: Adding support for Cavium ThunderX network controller")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
of_find_node_by_path() acquires a reference to the node
returned by it and that reference needs to be dropped by its caller.
This place doesn't do that, so fix it.
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
team_notify_peers() will send ARP and NA to notify peers. team_mcast_rejoin()
will send multicast join group message to notify peers. We should do this when
enabling/changed to a new port. But it doesn't make sense to do it when a port
is disabled.
On the other hand, when we set mcast_rejoin_count to 2, and do a failover,
team_port_disable() will increase mcast_rejoin.count_pending to 2 and then
team_port_enable() will increase mcast_rejoin.count_pending to 4. We will send
4 mcast rejoin messages at latest, which will make user confused. The same
with notify_peers.count.
Fix it by deleting team_notify_peers() and team_mcast_rejoin() in
team_port_disable().
Reported-by: Liang Li <liali@redhat.com>
Fixes: fc423ff00df3a ("team: add peer notification")
Fixes: 492b200efdd20 ("team: add support for sending multicast rejoins")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We don't support partial csumed packet since its metadata will be lost
or incorrect during XDP processing. So fail the XDP set if guest_csum
feature is negotiated.
Fixes: f600b6905015 ("virtio_net: Add XDP support")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pavel Popa <pashinho1990@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We don't disable VIRTIO_NET_F_GUEST_CSUM if XDP was set. This means we
can receive partial csumed packets with metadata kept in the
vnet_hdr. This may have several side effects:
- It could be overridden by header adjustment, thus is might be not
correct after XDP processing.
- There's no way to pass such metadata information through
XDP_REDIRECT to another driver.
- XDP does not support checksum offload right now.
So simply disable guest csum if possible in this the case of XDP.
Fixes: 3f93522ffab2d ("virtio-net: switch off offloads on demand if possible on XDP set")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pavel Popa <pashinho1990@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit f2cbd4852820 ("net/sched: act_police: fix race condition on state
variables") introduces a new spinlock, but forgets its initialization.
Ensure that tcf_police_init() initializes 'tcfp_lock' every time a 'police'
action is newly created, to avoid the following lockdep splat:
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
<...>
Call Trace:
dump_stack+0x85/0xcb
register_lock_class+0x581/0x590
__lock_acquire+0xd4/0x1330
? tcf_police_init+0x2fa/0x650 [act_police]
? lock_acquire+0x9e/0x1a0
lock_acquire+0x9e/0x1a0
? tcf_police_init+0x2fa/0x650 [act_police]
? tcf_police_init+0x55a/0x650 [act_police]
_raw_spin_lock_bh+0x34/0x40
? tcf_police_init+0x2fa/0x650 [act_police]
tcf_police_init+0x2fa/0x650 [act_police]
tcf_action_init_1+0x384/0x4c0
tcf_action_init+0xf6/0x160
tcf_action_add+0x73/0x170
tc_ctl_action+0x122/0x160
rtnetlink_rcv_msg+0x2a4/0x490
? netlink_deliver_tap+0x99/0x400
? validate_linkmsg+0x370/0x370
netlink_rcv_skb+0x4d/0x130
netlink_unicast+0x196/0x230
netlink_sendmsg+0x2e5/0x3e0
sock_sendmsg+0x36/0x40
___sys_sendmsg+0x280/0x2f0
? _raw_spin_unlock+0x24/0x30
? handle_pte_fault+0xafe/0xf30
? find_held_lock+0x2d/0x90
? syscall_trace_enter+0x1df/0x360
? __sys_sendmsg+0x5e/0xa0
__sys_sendmsg+0x5e/0xa0
do_syscall_64+0x60/0x210
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f1841c7cf10
Code: c3 48 8b 05 82 6f 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d 8d d0 2c 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae cc 00 00 48 89 04 24
RSP: 002b:00007ffcf9df4d68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f1841c7cf10
RDX: 0000000000000000 RSI: 00007ffcf9df4dc0 RDI: 0000000000000003
RBP: 000000005bf56105 R08: 0000000000000002 R09: 00007ffcf9df8edc
R10: 00007ffcf9df47e0 R11: 0000000000000246 R12: 0000000000671be0
R13: 00007ffcf9df4e84 R14: 0000000000000008 R15: 0000000000000000
Fixes: f2cbd4852820 ("net/sched: act_police: fix race condition on state variables")
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Eric noted that with UDP GRO and NAPI timeout, we could keep a single
UDP packet inside the GRO hash forever, if the related NAPI instance
calls napi_gro_complete() at an higher frequency than the NAPI timeout.
Willem noted that even TCP packets could be trapped there, till the
next retransmission.
This patch tries to address the issue, flushing the old packets -
those with a NAPI_GRO_CB age before the current jiffy - before scheduling
the NAPI timeout. The rationale is that such a timeout should be
well below a jiffy and we are not flushing packets eligible for sane GRO.
v1 -> v2:
- clarified the commit message and comment
RFC -> v1:
- added 'Fixes tags', cleaned-up the wording.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Fixes: 3b47d30396ba ("net: gro: add a per device gro flush timer")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When we add a new IPv6 address, we should also join corresponding solicited-node
multicast address, unless the interface has IFF_NOARP flag, as function
addrconf_join_solict() did. But if we remove IFF_NOARP flag later, we do
not do dad and add the mcast address. So we will drop corresponding neighbour
discovery message that came from other nodes.
A typical example is after creating a ipvlan with mode l3, setting up an ipv6
address and changing the mode to l2. Then we will not be able to ping this
address as the interface doesn't join related solicited-node mcast address.
Fix it by re-doing dad when interface changed IFF_NOARP flag. Then we will add
corresponding mcast group and check if there is a duplicate address on the
network.
Reported-by: Jianlin Shi <jishi@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tpacket_snd sends packets with user pages linked into skb frags. It
notifies that pages can be reused when the skb is released by setting
skb->destructor to tpacket_destruct_skb.
This can cause data corruption if the skb is orphaned (e.g., on
transmit through veth) or cloned (e.g., on mirror to another psock).
Create a kernel-private copy of data in these cases, same as tun/tap
zerocopy transmission. Reuse that infrastructure: mark the skb as
SKBTX_ZEROCOPY_FRAG, which will trigger copy in skb_orphan_frags(_rx).
Unlike other zerocopy packets, do not set shinfo destructor_arg to
struct ubuf_info. tpacket_destruct_skb already uses that ptr to notify
when the original skb is released and a timestamp is recorded. Do not
change this timestamp behavior. The ubuf_info->callback is not needed
anyway, as no zerocopy notification is expected.
Mark destructor_arg as not-a-uarg by setting the lower bit to 1. The
resulting value is not a valid ubuf_info pointer, nor a valid
tpacket_snd frame address. Add skb_zcopy_.._nouarg helpers for this.
The fix relies on features introduced in commit 52267790ef52 ("sock:
add MSG_ZEROCOPY"), so can be backported as is only to 4.14.
Tested with from `./in_netns.sh ./txring_overwrite` from
http://github.com/wdebruij/kerneltools/tests
Fixes: 69e3c75f4d54 ("net: TX_RING and packet mmap")
Reported-by: Anand H. Krishnan <anandhkrishnan@gmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When merging support for SSBD and the CRC32 instructions, the conflict
resolution for the new capability entries in arm64_features[]
inadvertedly predicated the availability of the CRC32 instructions on
CONFIG_ARM64_SSBD, despite the functionality being entirely unrelated.
Move the #ifdef CONFIG_ARM64_SSBD down so that it only covers the SSBD
capability.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Specify correct type for the constants to avoid
the following sparse complaints:
./arch/arm64/include/asm/sysreg.h:471:42: warning: constant 0xffffffffffffffff is so big it is unsigned long
./arch/arm64/include/asm/sysreg.h:512:42: warning: constant 0xffffffffffffffff is so big it is unsigned long
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
During device reset, queue memory is not being updated to accommodate
changes in ring buffer sizes supported by backing hardware. Track
any differences in ring buffer sizes following the reset and update
queue memory when possible.
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The wrong index is used when cleaning up RX buffer objects during release
of RX queues. Update to use the correct index counter.
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Set xdp_prog pointer to NULL if bpf_prog_add fails since that routine
reports the error code instead of NULL in case of failure and xdp_prog
pointer value is used in the driver to verify if XDP is currently
enabled.
Moreover report the error code to userspace if nicvf_xdp_setup fails
Fixes: 05c773f52b96 ("net: thunderx: Add basic XDP support")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On every iteration of net_dim, the algorithm may choose to
check for the system state by comparing current data sample
with previous data sample. After each of these comparison,
regardless of the action taken, the sample used as baseline
is needed to be updated.
This patch fixes a bug that causes DIM to take wrong decisions,
due to never updating the baseline sample for comparison between
iterations. This way, DIM always compares current sample with
zeros.
Although this is a functional fix, it also improves and stabilizes
performance as the algorithm works properly now.
Performance:
Tested single UDP TX stream with pktgen:
samples/pktgen/pktgen_sample03_burst_single_flow.sh -i p4p2 -d 1.1.1.1
-m 24:8a:07:88:26:8b -f 3 -b 128
ConnectX-5 100GbE packet rate improved from 15-19Mpps to 19-20Mpps.
Also, toggling between profiles is less frequent with the fix.
Fixes: 8115b750dbcb ("net/dim: use struct net_dim_sample as arg to net_dim")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
rfc8435 says:
For tight coupling, ffds_stateid provides the stateid to be used by
the client to access the file.
However current implementation replaces per-mirror provided stateid with
by open or lock stateid.
Ensure that per-mirror stateid is used by ff_layout_write_prepare_v4 and
nfs4_ff_layout_prepare_ds.
Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Bruce pointed out that we shouldn't allocate memory while holding
a lock in the nfs4_callback_offload() and handle_async_copy()
that deal with a racing CB_OFFLOAD and reply to COPY case.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
memunmap() should be used to free the return of memremap(), not
iounmap().
Fixes: dfddb969edf0 ('iommu/vt-d: Switch from ioremap_cache to memremap')
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
This reverts commit aaf9978c3c0291ef3beaa97610bc9c3084656a85.
Quoting Peter:
There is a HID feature report called "Resolution Multiplier"
Described in the "Enhanced Wheel Support in Windows" doc and
the "USB HID Usage Tables" page 30.
http://download.microsoft.com/download/b/d/1/bd1f7ef4-7d72-419e-bc5c-9f79ad7bb66e/wheel.docx
https://www.usb.org/sites/default/files/documents/hut1_12v2.pdf
This was new for Windows Vista, so we're only a decade behind here. I only
accidentally found this a few days ago while debugging a stuck button on a
Microsoft mouse.
The docs above describe it like this: a wheel control by default sends
value 1 per notch. If the resolution multiplier is active, the wheel is
expected to send a value of $multiplier per notch (e.g. MS Sculpt mouse) or
just send events more often, i.e. for less physical motion (e.g. MS Comfort
mouse).
For the latter, you need the right HW of course. The Sculpt mouse has
tactile wheel clicks, so nothing really changes. The Comfort mouse has
continuous motion with no tactile clicks. Similar to the free-wheeling
Logitech mice but without any inertia.
Note that the doc also says that Vista and onwards *always* enable this
feature where available.
An example HID definition looks like this:
Usage Page Generic Desktop (0x01)
Usage Resolution Multiplier (0x48)
Logical Minimum 0
Logical Maximum 1
Physical Minimum 1
Physical Maximum 16
Report Size 2 # in bits
Report Count 1
Feature (Data, Var, Abs)
So the actual bits have values 0 or 1 and that reflects real values 1 or 16.
We've only seen single-bits so far, so there's low-res and hi-res, but
nothing in between.
The multiplier is available for HID usages "Wheel" and "AC Pan" (horiz wheel).
Microsoft suggests that
> Vendors should ship their devices with smooth scrolling disabled and allow
> Windows to enable it. This ensures that the device works like a regular HID
> device on legacy operating systems that do not support smooth scrolling.
(see the wheel doc linked above)
The mice that we tested so far do reset on unplug.
Device Support looks to be all (?) Microsoft mice but nothing else
Not supported:
- Logitech G500s, G303
- Roccat Kone XTD
- all the cheap Lenovo, HP, Dell, Logitech USB mice that come with a
workstation that I could find don't have it.
- Etekcity something something
- Razer Imperator
Supported:
- Microsoft Comfort Optical Mouse 3000 - yes, physical: 1:4
- Microsoft Sculpt Ergonomic Mouse - yes, physical: 1:12
- Microsoft Surface mouse - yes, physical: 1:4
So again, I think this is really just available on Microsoft mice, but
probably all decent MS mice released over the last decade.
Looking at the hardware itself:
- no noticeable notches in the weel
- low-res: 18 events per 360deg rotation (click angle 20 deg)
- high-res: 72 events per 360deg → matches multiplier of 4
- I can feel the notches during wheel turns
- low-res: 24 events per 360 deg rotation (click angle 15 deg)
- horiz wheel is tilt-based, continuous output value 1
- high-res: 24 events per 360deg with value 12 → matches multiplier of 12
- horiz wheel output rate doubles/triples?, values is 3
- It's a touch strip, not a wheel so no notches
- high-res: events have value 4 instead of 1
a bit strange given that it doesn't actually have notches.
Ok, why is this an issue for the current API? First, because the logitech
multiplier used in Harry's patches looks suspiciously like the Resolution
Multiplier so I think we should assume it's the same thing. Nestor, can you
shed some light on that?
- `REL_WHEEL` is defined as the number of notches, emulated where needed.
- `REL_WHEEL_HI_RES` is the movement of the user's finger in microns.
- `WM_MOUSEWHEEL` (Windows) is is a multiple of 120, defined as "the threshold
for action to be taken and one such action"
https://docs.microsoft.com/en-us/windows/desktop/inputdev/wm-mousewheel
If the multiplier is set to M, this means we need an accumulated value of M
until we can claim there was a wheel click. So after enabling the multiplier
and setting it to the maximum (like Windows):
- M units are 15deg rotation → 1 unit is 2620/M micron (see below). This is
the `REL_WHEEL_HI_RES` value.
- wheel diameter 20mm: 15 deg rotation is 2.62mm, 2620 micron (pi * 20mm /
(360deg/15deg))
- For every M units accumulated, send one `REL_WHEEL` event
The problem here is that we've now hardcoded 20mm/15 deg into the kernel and
we have no way of getting the size of the wheel or the click angle into the
kernel.
In userspace we now have to undo the kernel's calculation. If our click angle
is e.g. 20 degree we have to undo the (lossy) calculation from the kernel and
calculate the correct angle instead. This also means the 15 is a hardcoded
option forever and cannot be changed.
In hid-logitech-hidpp.c, the microns per unit is hardcoded per device.
Harry, did you measure those by hand? We'd need to update the kernel for
every device and there are 10 years worth of devices from MS alone.
The multiplier default is 8 which is in the right ballpark, so I'm pretty
sure this is the same as the Resolution Multiplier, just in HID++ lingo. And
given that the 120 magic factor is what Windows uses in the end, I can't
imagine Logitech rolling their own thing here. Nestor?
And we're already fairly inaccurate with the microns anyway. The MX Anywhere
2S has a click angle of 20 degrees (18 stops) and a 17mm wheel, so a wheel
notch is approximately 2.67mm, one event at multiplier 8 (1/8 of a notch)
would be 334 micron. That's only 80% of the fallback value of 406 in the
kernel. Multiplier 6 gives us 445micron (10% off). I'm assuming multiplier 7
doesn't exist because it's not a factor of 120.
Summary:
Best option may be to simply do what Windows is doing, all the HW manufacturers
have to use that approach after all. Switch `REL_WHEEL_HI_RES` to report in
fractions of 120, with 120 being one notch and divide that by the multiplier
for the actual events. So e.g. the Logitech multiplier 8 would send value 15
for each event in hi-res mode. This can be converted in userspace to
whatever userspace needs (combined with a hwdb there that tells you wheel
size/click angle/...).
Conflicts:
include/uapi/linux/input-event-codes.h -> I kept the new
reserved event in the code, so I had to adapt the revert
slightly
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Harry Cutts <hcutts@chromium.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
|
|
This reverts commit 1ff2e1a44e02d4bdbb9be67c7d9acc240a67141f.
It turns out the current API is not that compatible with
some Microsoft mice, so better start again from scratch.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Harry Cutts <hcutts@chromium.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
|
|
This reverts commit 051dc9b0579602bd63e9df74d0879b5293e71581.
It turns out the current API is not that compatible with
some Microsoft mice, so better start again from scratch.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Harry Cutts <hcutts@chromium.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
|
|
This reverts commit d56ca9855bf924f3bc9807a3e42f38539df3f41f.
It turns out the current API is not that compatible with
some Microsoft mice, so better start again from scratch.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Harry Cutts <hcutts@chromium.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
|
|
This reverts commit 3fe1d6bbcd16f384d2c7dab2caf8e4b2df9ea7e6.
It turns out the current API is not that compatible with
some Microsoft mice, so better start again from scratch.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Harry Cutts <hcutts@chromium.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
|
|
This reverts commit 5fe2ccbef9d7aecf5c4402c753444f1a12096cfd.
It turns out the current API is not that compatible with
some Microsoft mice, so better start again from scratch.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Harry Cutts <hcutts@chromium.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
|
|
This reverts commit 044ee890286153a1aefb40cb8b6659921aecb38b.
It turns out the current API is not that compatible with
some Microsoft mice, so better start again from scratch.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Harry Cutts <hcutts@chromium.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
|
|
Signed-off-by: Y.C. Chen <yc_chen@aspeedtech.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
In the original ftmac100_interrupt(), the interrupts are only disabled when
the condition "netif_running(netdev)" is true. However, this condition
causes kerenl hang in the following case. When the user requests to
disable the network device, kernel will clear the bit __LINK_STATE_START
from the dev->state and then call the driver's ndo_stop function. Network
device interrupts are not blocked during this process. If an interrupt
occurs between clearing __LINK_STATE_START and stopping network device,
kernel cannot disable the interrupts due to the condition
"netif_running(netdev)" in the ISR. Hence, kernel will hang due to the
continuous interruption of the network device.
In order to solve the above problem, the interrupts of the network device
should always be disabled in the ISR without being restricted by the
condition "netif_running(netdev)".
[V2]
Remove unnecessary curly braces.
Signed-off-by: Vincent Chen <vincentc@andestech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The value of pitches is not correct while calling mode_set.
The issue we found so far on following system:
- Debian8 with XFCE Desktop
- Ubuntu with KDE Desktop
- SUSE15 with KDE Desktop
Signed-off-by: Y.C. Chen <yc_chen@aspeedtech.com>
Cc: <stable@vger.kernel.org>
Tested-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
In smc_wr_tx_put_slot() field pend->idx is used after being
cleared. That means always idx 0 is cleared in the wr_tx_mask.
This results in a broken administration of available WR send
payload buffers.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Running uperf tests with SMCD on LPARs results in corrupted cursors.
SMCD cursors should be treated atomically to fix cursor corruption.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a SMC-D link group is freed, a shutdown signal should be sent to
the peer to indicate that the link group is invalid. This patch adds the
shutdown signal to the SMC code.
Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When searching for an existing link group the queue pair number is also
to be taken into consideration. When the SMC server sends a new number
in a CLC packet (keeping all other values equal) then a new link group
is to be created on the SMC client side.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In case of a non-blocking SMC socket, the initial CLC handshake is
performed over a blocking TCP connection in a worker. If the SMC socket
is released, smc_release has to wait for the blocking CLC socket
operations (e.g., kernel_connect) inside the worker.
This patch aborts a CLC connection when the respective non-blocking SMC
socket is released to avoid waiting on socket operations or timeouts.
Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jean-Louis reported a TCP regression and bisected to recent SACK
compression.
After a loss episode (receiver not able to keep up and dropping
packets because its backlog is full), linux TCP stack is sending
a single SACK (DUPACK).
Sender waits a full RTO timer before recovering losses.
While RFC 6675 says in section 5, "Algorithm Details",
(2) If DupAcks < DupThresh but IsLost (HighACK + 1) returns true --
indicating at least three segments have arrived above the current
cumulative acknowledgment point, which is taken to indicate loss
-- go to step (4).
...
(4) Invoke fast retransmit and enter loss recovery as follows:
there are old TCP stacks not implementing this strategy, and
still counting the dupacks before starting fast retransmit.
While these stacks probably perform poorly when receivers implement
LRO/GRO, we should be a little more gentle to them.
This patch makes sure we do not enable SACK compression unless
3 dupacks have been sent since last rcv_nxt update.
Ideally we should even rearm the timer to send one or two
more DUPACK if no more packets are coming, but that will
be work aiming for linux-4.21.
Many thanks to Jean-Louis for bisecting the issue, providing
packet captures and testing this patch.
Fixes: 5d9f4262b7ea ("tcp: add SACK compression")
Reported-by: Jean-Louis Dupond <jean-louis@dupond.be>
Tested-by: Jean-Louis Dupond <jean-louis@dupond.be>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a packet is trapped and the corresponding SKB marked as
already-forwarded, it retains this marking even after it is forwarded
across veth links into another bridge. There, since it ingresses the
bridge over veth, which doesn't have offload_fwd_mark, it triggers a
warning in nbp_switchdev_frame_mark().
Then nbp_switchdev_allowed_egress() decides not to allow egress from
this bridge through another veth, because the SKB is already marked, and
the mark (of 0) of course matches. Thus the packet is incorrectly
blocked.
Solve by resetting offload_fwd_mark() in skb_scrub_packet(). That
function is called from tunnels and also from veth, and thus catches the
cases where traffic is forwarded between bridges and transformed in a
way that invalidates the marking.
Fixes: 6bc506b4fb06 ("bridge: switchdev: Add forward mark support for stacked devices")
Fixes: abf4bb6b63d0 ("skbuff: Add the offload_mr_fwd_mark field")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Suggested-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When we read the EOF page of the file via readpages, we need
to zero the region beyond EOF that we either do not read or
should not contain data so that mmap does not expose stale data to
user applications.
However, iomap_adjust_read_range() fails to detect EOF correctly,
and so fsx on 1k block size filesystems fails very quickly with
mapreads exposing data beyond EOF. There are two problems here.
Firstly, when calculating the end block of the EOF byte, we have
to round the size by one to avoid a block aligned EOF from reporting
a block too large. i.e. a size of 1024 bytes is 1 block, which in
index terms is block 0. Therefore we have to calculate the end block
from (isize - 1), not isize.
The second bug is determining if the current page spans EOF, and so
whether we need split it into two half, one for the IO, and the
other for zeroing. Unfortunately, the code that checks whether
we should split the block doesn't actually check if we span EOF, it
just checks if the read spans the /offset in the page/ that EOF
sits on. So it splits every read into two if EOF is not page
aligned, regardless of whether we are reading the EOF block or not.
Hence we need to restrict the "does the read span EOF" check to
just the page that spans EOF, not every page we read.
This patch results in correct EOF detection through readpages:
xfs_vm_readpages: dev 259:0 ino 0x43 nr_pages 24
xfs_iomap_found: dev 259:0 ino 0x43 size 0x66c00 offset 0x4f000 count 98304 type hole startoff 0x13c startblock 1368 blockcount 0x4
iomap_readpage_actor: orig pos 323584 pos 323584, length 4096, poff 0 plen 4096, isize 420864
xfs_iomap_found: dev 259:0 ino 0x43 size 0x66c00 offset 0x50000 count 94208 type hole startoff 0x140 startblock 1497 blockcount 0x5c
iomap_readpage_actor: orig pos 327680 pos 327680, length 94208, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 331776 pos 331776, length 90112, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 335872 pos 335872, length 86016, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 339968 pos 339968, length 81920, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 344064 pos 344064, length 77824, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 348160 pos 348160, length 73728, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 352256 pos 352256, length 69632, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 356352 pos 356352, length 65536, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 360448 pos 360448, length 61440, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 364544 pos 364544, length 57344, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 368640 pos 368640, length 53248, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 372736 pos 372736, length 49152, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 376832 pos 376832, length 45056, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 380928 pos 380928, length 40960, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 385024 pos 385024, length 36864, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 389120 pos 389120, length 32768, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 393216 pos 393216, length 28672, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 397312 pos 397312, length 24576, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 401408 pos 401408, length 20480, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 405504 pos 405504, length 16384, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 409600 pos 409600, length 12288, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 413696 pos 413696, length 8192, poff 0 plen 4096, isize 420864
iomap_readpage_actor: orig pos 417792 pos 417792, length 4096, poff 0 plen 3072, isize 420864
iomap_readpage_actor: orig pos 420864 pos 420864, length 1024, poff 3072 plen 1024, isize 420864
As you can see, it now does full page reads until the last one which
is split correctly at the block aligned EOF, reading 3072 bytes and
zeroing the last 1024 bytes. The original version of the patch got
this right, but it got another case wrong.
The EOF detection crossing really needs to the the original length
as plen, while it starts at the end of the block, will be shortened
as up-to-date blocks are found on the page. This means "orig_pos +
plen" no longer points to the end of the page, and so will not
correctly detect EOF crossing. Hence we have to use the length
passed in to detect this partial page case:
xfs_filemap_fault: dev 259:1 ino 0x43 write_fault 0
xfs_vm_readpage: dev 259:1 ino 0x43 nr_pages 1
xfs_iomap_found: dev 259:1 ino 0x43 size 0x2cc00 offset 0x2c000 count 4096 type hole startoff 0xb0 startblock 282 blockcount 0x4
iomap_readpage_actor: orig pos 180224 pos 181248, length 4096, poff 1024 plen 2048, isize 183296
xfs_iomap_found: dev 259:1 ino 0x43 size 0x2cc00 offset 0x2cc00 count 1024 type hole startoff 0xb3 startblock 285 blockcount 0x1
iomap_readpage_actor: orig pos 183296 pos 183296, length 1024, poff 3072 plen 1024, isize 183296
Heere we see a trace where the first block on the EOF page is up to
date, hence poff = 1024 bytes. The offset into the page of EOF is
3072, so the range we want to read is 1024 - 3071, and the range we
want to zero is 3072 - 4095. You can see this is split correctly
now.
This fixes the stale data beyond EOF problem that fsx quickly
uncovers on 1k block size filesystems.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
It returns EINVAL when the operation is not supported by the
filesystem. Fix it to return EOPNOTSUPP to be consistent with
the man page and clone_file_range().
Clean up the inconsistent error return handling while I'm there.
(I know, lipstick on a pig, but every little bit helps...)
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
When doing direct IO to a pipe for do_splice_direct(), then pipe is
trivial to fill up and overflow as it can only hold 16 pages. At
this point bio_iov_iter_get_pages() then returns -EFAULT, and we
abort the IO submission process. Unfortunately, iomap_dio_rw()
propagates the error back up the stack.
The error is converted from the EFAULT to EAGAIN in
generic_file_splice_read() to tell the splice layers that the pipe
is full. do_splice_direct() completely fails to handle EAGAIN errors
(it aborts on error) and returns EAGAIN to the caller.
copy_file_write() then completely fails to handle EAGAIN as well,
and so returns EAGAIN to userspace, having failed to copy the data
it was asked to.
Avoid this whole steaming pile of fail by having iomap_dio_rw()
silently swallow EFAULT errors and so do short reads.
To make matters worse, iomap_dio_actor() has a stale data exposure
bug bio_iov_iter_get_pages() fails - it does not zero the tail block
that it may have been left uncovered by partial IO. Fix the error
handling case to drop to the sub-block zeroing rather than
immmediately returning the -EFAULT error.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
If we are doing sub-block dio that extends EOF, we need to zero
the unused tail of the block to initialise the data in it it. If we
do not zero the tail of the block, then an immediate mmap read of
the EOF block will expose stale data beyond EOF to userspace. Found
with fsx running sub-block DIO sizes vs MAPREAD/MAPWRITE operations.
Fix this by detecting if the end of the DIO write is beyond EOF
and zeroing the tail if necessary.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
When we write into an unwritten extent via direct IO, we dirty
metadata on IO completion to convert the unwritten extent to
written. However, when we do the FUA optimisation checks, the inode
may be clean and so we issue a FUA write into the unwritten extent.
This means we then bypass the generic_write_sync() call after
unwritten extent conversion has ben done and we don't force the
modified metadata to stable storage.
This violates O_DSYNC semantics. The window of exposure is a single
IO, as the next DIO write will see the inode has dirty metadata and
hence will not use the FUA optimisation. Calling
generic_write_sync() after completion of the second IO will also
sync the first write and it's metadata.
Fix this by avoiding the FUA optimisation when writing to unwritten
extents.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|