linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2020-03-19	x86/setup: Fix static memory detection	Guenter Roeck	2	-1/+20
	When booting x86 images in qemu, the following warning is seen randomly if DEBUG_LOCKDEP is enabled. WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:1119 lockdep_register_key+0xc0/0x100 static_obj() returns true if an address is between _stext and _end. On x86, this includes the brk memory space. Problem is that this memory block is not static on x86; its unused portions are released after init and can be allocated. This results in the observed warning if a lockdep object is allocated from this memory. Solve the problem by implementing arch_is_kernel_initmem_freed() for x86 and have it return true if an address is within the released memory range. The same problem was solved for s390 with commit 7a5da02de8d6e ("locking/lockdep: check for freed initmem in static_obj()"), which introduced arch_is_kernel_initmem_freed(). Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200131021159.9178-1-linux@roeck-us.net
2020-02-25	x86/vmlinux: Drop unneeded linker script discard of .eh_frame	Arvind Sankar	5	-14/+4
	Now that .eh_frame sections for the files in setup.elf and realmode.elf are not generated anymore, the linker scripts don't need the special output section name /DISCARD/ any more. Remove the one in the main kernel linker script as well, since there are no .eh_frame sections already, and fix up a comment referencing .eh_frame. Update the comment in asm/dwarf2.h referring to .eh_frame so it continues to make sense, as well as being more specific. [ bp: Touch up commit message. ] Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Link: https://lkml.kernel.org/r/20200224232129.597160-3-nivedita@alum.mit.edu
2020-02-25	x86/*/Makefile: Use -fno-asynchronous-unwind-tables to suppress .eh_frame sections	Arvind Sankar	4	-1/+5
	While discussing a patch to discard .eh_frame from the compressed vmlinux using the linker script, Fangrui Song pointed out [1] that these sections shouldn't exist in the first place because arch/x86/Makefile uses -fno-asynchronous-unwind-tables. It turns out this is because the Makefiles used to build the compressed kernel redefine KBUILD_CFLAGS, dropping this flag. Add the flag to the Makefile for the compressed kernel, as well as the EFI stub Makefile to fix this. Also add the flag to boot/Makefile and realmode/rm/Makefile so that the kernel's boot code (boot/setup.elf) and realmode trampoline (realmode/rm/realmode.elf) won't be compiled with .eh_frame sections, since their linker scripts also just discard them. [1] https://lore.kernel.org/lkml/20200222185806.ywnqhfqmy67akfsa@google.com/ Suggested-by: Fangrui Song <maskray@google.com> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Link: https://lkml.kernel.org/r/20200224232129.597160-2-nivedita@alum.mit.edu
2020-02-24	x86/boot/compressed: Remove .eh_frame section from bzImage	Arvind Sankar	1	-0/+5
	Discarding unnecessary sections with "()" (see thread at Link: below) works fine with the bfd linker but fails with lld: $ make -j$(nproc) -s CC=clang LD=ld.lld O=out.x86_64 distclean defconfig bzImage ld.lld: error: discarding .shstrtab section is not allowed lld tries to also discard essential sections like .shstrtab, .symtab and .strtab, which results in the link failing since .shstrtab is required by the ELF specification: the e_shstrndx field in the ELF header is the index of .shstrtab, and each section in the section table is required to have an sh_name that points into the .shstrtab. .symtab and .strtab are also necessary to generate the zoffset.h file for the bzImage header. Since the only sizeable section that can be discarded is .eh_frame, restrict the discard to only .eh_frame to be safe. [ bp: Flesh out commit message and replace offending commit with this one. ] Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Link: https://lkml.kernel.org/r/20200109150218.16544-2-nivedita@alum.mit.edu
2020-02-19	x86/boot/compressed/64: Remove .bss/.pgtable from bzImage	Arvind Sankar	1	-1/+1
	Commit 5b11f1cee579 ("x86, boot: straighten out ranges to copy/zero in compressed/head*.S") introduced a separate .pgtable section, splitting it out from the rest of .bss. This section was added without the writeable flag, marking it as read-only. This results in the linker putting the .rela.dyn section (containing bogus dynamic relocations from head_64.o) after the .bss and .pgtable sections. When objcopy is used to convert compressed/vmlinux into a binary for the bzImage: $ objcopy -O binary -R .note -R .comment -S arch/x86/boot/compressed/vmlinux \ arch/x86/boot/vmlinux.bin the .bss and .pgtable sections get materialized as ~176KiB of zero bytes in the binary in order to place .rela.dyn at the correct location. Fix this by marking .pgtable as writeable. This moves the .rela.dyn section up in the ELF image layout so that .bss and .pgtable are the last allocated sections and so don't appear in bzImage. [ bp: Massage commit message. ] Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200109150218.16544-1-nivedita@alum.mit.edu
2020-02-12	x86/boot/compressed/64: Use 32-bit (zero-extended) MOV for z_output_len	Arvind Sankar	1	-1/+1
	z_output_len is the size of the decompressed payload (i.e. vmlinux + vmlinux.relocs) and is generated as an unsigned 32-bit quantity by mkpiggy.c. The current movq $z_output_len, %r9 instruction generates a sign-extended move to %r9. Using movl $z_output_len, %r9d will instead zero-extend into %r9, which is appropriate for an unsigned 32-bit quantity. This is also what is already done for z_input_len, the size of the compressed payload. [ bp: Also, z_output_len cannot be a 64-bit quantity because it participates in: init_size: .long INIT_SIZE # kernel initialization size through INIT_SIZE which is a 32-bit quantity determined by the .long directive (vs .quad for 64-bit). Furthermore, if it really must be a 64-bit quantity, then the insn must be MOVABS which can accommodate a 64-bit immediate and which the toolchain does not generate automatically. ] Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200211173333.1722739-1-nivedita@alum.mit.edu
2020-02-12	x86/boot/compressed/64: Use LEA to initialize boot stack pointer	Arvind Sankar	1	-3/+1
	It's shorter, and it's what is used in every other place, so make it consistent. Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200107194436.2166846-2-nivedita@alum.mit.edu
2020-02-09	Linux 5.6-rc1	Linus Torvalds	1	-2/+2

2020-02-09	irqchip/gic-v4.1: Avoid 64bit division for the sake of 32bit ARM	Marc Zyngier	1	-2/+2
	In order to allow the GICv4 code to link properly on 32bit ARM, make sure we don't use 64bit divisions when it isn't strictly necessary. Fixes: 4e6437f12d6e ("irqchip/gic-v4.1: Ensure L2 vPE table is allocated at RD level") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-08	fs: Add VirtualBox guest shared folder (vboxsf) support	Hans de Goede	12	-0/+3280
	VirtualBox hosts can share folders with guests, this commit adds a VFS driver implementing the Linux-guest side of this, allowing folders exported by the host to be mounted under Linux. This driver depends on the guest <-> host IPC functions exported by the vboxguest driver. Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-02-08	Fix up remaining devm_ioremap_nocache() in SGI IOC3 8250 UART driver	Linus Torvalds	1	-1/+1
	This is a merge error on my part - the driver was merged into mainline by commit c5951e7c8ee5 ("Merge tag 'mips_5.6' of git://../mips/linux") over a week ago, but nobody apparently noticed that it didn't actually build due to still having a reference to the devm_ioremap_nocache() function, removed a few days earlier through commit 6a1000bd2703 ("Merge tag 'ioremap-5.6' of git://../ioremap"). Apparently this didn't get any build testing anywhere. Not perhaps all that surprising: it's restricted to 64-bit MIPS only, and only with the new SGI_MFD_IOC3 support enabled. I only noticed because the ioremap conflicts in the ARM SoC driver update made me check there weren't any others hiding, and I found this one. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-08	pipe: use exclusive waits when reading or writing	Linus Torvalds	4	-30/+51
	This makes the pipe code use separate wait-queues and exclusive waiting for readers and writers, avoiding a nasty thundering herd problem when there are lots of readers waiting for data on a pipe (or, less commonly, lots of writers waiting for a pipe to have space). While this isn't a common occurrence in the traditional "use a pipe as a data transport" case, where you typically only have a single reader and a single writer process, there is one common special case: using a pipe as a source of "locking tokens" rather than for data communication. In particular, the GNU make jobserver code ends up using a pipe as a way to limit parallelism, where each job consumes a token by reading a byte from the jobserver pipe, and releases the token by writing a byte back to the pipe. This pattern is fairly traditional on Unix, and works very well, but will waste a lot of time waking up a lot of processes when only a single reader needs to be woken up when a writer releases a new token. A simplified test-case of just this pipe interaction is to create 64 processes, and then pass a single token around between them (this test-case also intentionally passes another token that gets ignored to test the "wake up next" logic too, in case anybody wonders about it): #include <unistd.h> int main(int argc, char *argv) { int fd[2], counters[2]; pipe(fd); counters[0] = 0; counters[1] = -1; write(fd[1], counters, sizeof(counters)); / 64 processes / fork(); fork(); fork(); fork(); fork(); fork(); do { int i; read(fd[0], &i, sizeof(i)); if (i < 0) continue; counters[0] = i+1; write(fd[1], counters, (1+(i & 1)) sizeof(int)); } while (counters[0] < 1000000); return 0; } and in a perfect world, passing that token around should only cause one context switch per transfer, when the writer of a token causes a directed wakeup of just a single reader. But with the "writer wakes all readers" model we traditionally had, on my test box the above case causes more than an order of magnitude more scheduling: instead of the expected ~1M context switches, "perf stat" shows 231,852.37 msec task-clock # 15.857 CPUs utilized 11,250,961 context-switches # 0.049 M/sec 616,304 cpu-migrations # 0.003 M/sec 1,648 page-faults # 0.007 K/sec 1,097,903,998,514 cycles # 4.735 GHz 120,781,778,352 instructions # 0.11 insn per cycle 27,997,056,043 branches # 120.754 M/sec 283,581,233 branch-misses # 1.01% of all branches 14.621273891 seconds time elapsed 0.018243000 seconds user 3.611468000 seconds sys before this commit. After this commit, I get 5,229.55 msec task-clock # 3.072 CPUs utilized 1,212,233 context-switches # 0.232 M/sec 103,951 cpu-migrations # 0.020 M/sec 1,328 page-faults # 0.254 K/sec 21,307,456,166 cycles # 4.074 GHz 12,947,819,999 instructions # 0.61 insn per cycle 2,881,985,678 branches # 551.096 M/sec 64,267,015 branch-misses # 2.23% of all branches 1.702148350 seconds time elapsed 0.004868000 seconds user 0.110786000 seconds sys instead. Much better. [ Note! This kernel improvement seems to be very good at triggering a race condition in the make jobserver (in GNU make 4.2.1) for me. It's a long known bug that was fixed back in June 2017 by GNU make commit b552b0525198 ("[SV 51159] Use a non-blocking read with pselect to avoid hangs."). But there wasn't a new release of GNU make until 4.3 on Jan 19 2020, so a number of distributions may still have the buggy version. Some have backported the fix to their 4.2.1 release, though, and even without the fix it's quite timing-dependent whether the bug actually is hit. ] Josh Triplett says: "I've been hammering on your pipe fix patch (switching to exclusive wait queues) for a month or so, on several different systems, and I've run into no issues with it. The patch substantially improves parallel build times on large (~100 CPU) systems, both with parallel make and with other things that use make's pipe-based jobserver. All current distributions (including stable and long-term stable distributions) have versions of GNU make that no longer have the jobserver bug" Tested-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-08	compat_ioctl: fix FIONREAD on devices	Arnd Bergmann	1	-4/+7
	My final cleanup patch for sys_compat_ioctl() introduced a regression on the FIONREAD ioctl command, which is used for both regular and special files, but only works on regular files after my patch, as I had missed the warning that Al Viro put into a comment right above it. Change it back so it can work on any file again by moving the implementation to do_vfs_ioctl() instead. Fixes: 77b9040195de ("compat_ioctl: simplify the implementation") Reported-and-tested-by: Christian Zigotzky <chzigotzky@xenosoft.de> Reported-and-tested-by: youling257 <youling257@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-02-08	net: thunderx: use proper interface type for RGMII	Tim Harvey	1	-1/+1
	The configuration of the OCTEONTX XCV_DLL_CTL register via xcv_init_hw() is such that the RGMII RX delay is bypassed leaving the RGMII TX delay enabled in the MAC: /* Configure DLL - enable or bypass * TX no bypass, RX bypass */ cfg = readq_relaxed(xcv->reg_base + XCV_DLL_CTL); cfg &= ~0xFF03; cfg \|= CLKRX_BYP; writeq_relaxed(cfg, xcv->reg_base + XCV_DLL_CTL); This would coorespond to a interface type of PHY_INTERFACE_MODE_RGMII_RXID and not PHY_INTERFACE_MODE_RGMII. Fixing this allows RGMII PHY drivers to do the right thing (enable RX delay in the PHY) instead of erroneously enabling both delays in the PHY. Signed-off-by: Tim Harvey <tharvey@gateworks.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-08	powerpc: Fix CONFIG_TRACE_IRQFLAGS with CONFIG_VMAP_STACK	Christophe Leroy	1	-1/+1
	When CONFIG_PROVE_LOCKING is selected together with (now default) CONFIG_VMAP_STACK, kernel enter deadlock during boot. At the point of checking whether interrupts are enabled or not, the value of MSR saved on stack is read using the physical address of the stack. But at this point, when using VMAP stack the DATA MMU translation has already been re-enabled, leading to deadlock. Don't use the physical address of the stack when CONFIG_VMAP_STACK is set. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: 028474876f47 ("powerpc/32: prepare for CONFIG_VMAP_STACK") Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/daeacdc0dec0416d1c587cc9f9e7191ad3068dc0.1581095957.git.christophe.leroy@c-s.fr
2020-02-08	powerpc/futex: Fix incorrect user access blocking	Michael Ellerman	1	-4/+6
	The early versions of our kernel user access prevention (KUAP) were written by Russell and Christophe, and didn't have separate read/write access. At some point I picked up the series and added the read/write access, but I failed to update the usages in futex.h to correctly allow read and write. However we didn't notice because of another bug which was causing the low-level code to always enable read and write. That bug was fixed recently in commit 1d8f739b07bd ("powerpc/kuap: Fix set direction in allow/prevent_user_access()"). futex_atomic_cmpxchg_inatomic() is passed the user address as %3 and does: 1: lwarx %1, 0, %3 cmpw 0, %1, %4 bne- 3f 2: stwcx. %5, 0, %3 Which clearly loads and stores from/to %3. The logic in arch_futex_atomic_op_inuser() is similar, so fix both of them to use allow_read_write_user(). Without this fix, and with PPC_KUAP_DEBUG=y, we see eg: Bug: Read fault blocked by AMR! WARNING: CPU: 94 PID: 149215 at arch/powerpc/include/asm/book3s/64/kup-radix.h:126 __do_page_fault+0x600/0xf30 CPU: 94 PID: 149215 Comm: futex_requeue_p Tainted: G W 5.5.0-rc7-gcc9x-g4c25df5640ae #1 ... NIP [c000000000070680] __do_page_fault+0x600/0xf30 LR [c00000000007067c] __do_page_fault+0x5fc/0xf30 Call Trace: [c00020138e5637e0] [c00000000007067c] __do_page_fault+0x5fc/0xf30 (unreliable) [c00020138e5638c0] [c00000000000ada8] handle_page_fault+0x10/0x30 --- interrupt: 301 at cmpxchg_futex_value_locked+0x68/0xd0 LR = futex_lock_pi_atomic+0xe0/0x1f0 [c00020138e563bc0] [c000000000217b50] futex_lock_pi_atomic+0x80/0x1f0 (unreliable) [c00020138e563c30] [c00000000021b668] futex_requeue+0x438/0xb60 [c00020138e563d60] [c00000000021c6cc] do_futex+0x1ec/0x2b0 [c00020138e563d90] [c00000000021c8b8] sys_futex+0x128/0x200 [c00020138e563e20] [c00000000000b7ac] system_call+0x5c/0x68 Fixes: de78a9c42a79 ("powerpc: Add a framework for Kernel Userspace Access Protection") Cc: stable@vger.kernel.org # v5.2+ Reported-by: syzbot+e808452bad7c375cbee6@syzkaller-ppc64.appspotmail.com Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Link: https://lore.kernel.org/r/20200207122145.11928-1-mpe@ellerman.id.au
2020-02-08	irqchip/gic-v3-its: Rename VPENDBASER/VPROPBASER accessors	Zenghui Yu	3	-24/+24
	V{PEND,PROP}BASER registers are actually located in VLPI_base frame of the redistributor. Rename their accessors to reflect this fact. No functional changes. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-7-yuzenghui@huawei.com
2020-02-08	irqchip/gic-v3-its: Remove superfluous WARN_ON	Zenghui Yu	1	-1/+0
	"ITS virtual pending table not cleaning" is already complained inside its_clear_vpend_valid(), there's no need to trigger a WARN_ON again. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-6-yuzenghui@huawei.com
2020-02-08	irqchip/gic-v4.1: Drop 'tmp' in inherit_vpe_l1_table_from_rd()	Zenghui Yu	1	-3/+1
	The variable 'tmp' in inherit_vpe_l1_table_from_rd() is actually not needed, drop it. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-5-yuzenghui@huawei.com
2020-02-08	irqchip/gic-v4.1: Ensure L2 vPE table is allocated at RD level	Zenghui Yu	1	-0/+80
	In GICv4, we will ensure that level2 vPE table memory is allocated for the specified vpe_id on all v4 ITS, in its_alloc_vpe_table(). This still works well for the typical GICv4.1 implementation, where the new vPE table is shared between the ITSs and the RDs. To make it explicit, let us introduce allocate_vpe_l2_table() to make sure that the L2 tables are allocated on all v4.1 RDs. We're likely not need to allocate memory in it because the vPE table is shared and (L2 table is) already allocated at ITS level, except for the case where the ITS doesn't share anything (say SVPET == 0, practically unlikely but architecturally allowed). The implementation of allocate_vpe_l2_table() is mostly copied from its_alloc_table_entry(). Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-4-yuzenghui@huawei.com
2020-02-08	irqchip/gic-v4.1: Set vpe_l1_base for all redistributors	Zenghui Yu	2	-2/+5
	Currently, we will not set vpe_l1_page for the current RD if we can inherit the vPE configuration table from another RD (or ITS), which results in an inconsistency between RDs within the same CommonLPIAff group. Let's rename it to vpe_l1_base to indicate the base address of the vPE configuration table of this RD, and set it properly for all v4.1 redistributors. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-3-yuzenghui@huawei.com
2020-02-08	irqchip/gic-v4.1: Fix programming of GICR_VPROPBASER_4_1_SIZE	Zenghui Yu	1	-1/+1
	The Size field of GICv4.1 VPROPBASER register indicates number of pages minus one and together Page_Size and Size control the vPEID width. Let's respect this requirement of the architecture. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-2-yuzenghui@huawei.com
2020-02-08	mt76: mt7615: fix max_nss in mt7615_eeprom_parse_hw_cap	Lorenzo Bianconi	1	-1/+2
	Fix u8 cast reading max_nss from MT_TOP_STRAP_STA register in mt7615_eeprom_parse_hw_cap routine Fixes: acf5457fd99db ("mt76: mt7615: read {tx,rx} mask from eeprom") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2020-02-07	bpf: Improve bucket_log calculation logic	Martin KaFai Lau	1	-2/+3
	It was reported that the max_t, ilog2, and roundup_pow_of_two macros have exponential effects on the number of states in the sparse checker. This patch breaks them up by calculating the "nbuckets" first so that the "bucket_log" only needs to take ilog2(). In addition, Linus mentioned: Patch looks good, but I'd like to point out that it's not just sparse. You can see it with a simple make net/core/bpf_sk_storage.i grep 'smap->bucket_log = ' net/core/bpf_sk_storage.i \| wc and see the end result: 1 365071 2686974 That's one line (the assignment line) that is 2,686,974 characters in length. Now, sparse does happen to react particularly badly to that (I didn't look to why, but I suspect it's just that evaluating all the types that don't actually ever end up getting used ends up being much more expensive than it should be), but I bet it's not good for gcc either. Fixes: 6ac99e8f23d4 ("bpf: Introduce bpf sk local storage") Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Link: https://lore.kernel.org/bpf/20200207081810.3918919-1-kafai@fb.com