wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2018-08-22	arch/h8300: add a defconfig target	Randy Dunlap	1	-0/+2
	Make the "defconfig" target valid for arch/h8300. Currently "make ARCH=h8300 defconfig" produces: * Can't find default configuration "arch/h8300/defconfig"! ../scripts/kconfig/Makefile:87: recipe for target 'defconfig' failed By adding a value for KBUILD_DEFCONFIG, "make ARCH=h8300 defconfig" successfully produces a kernel .config file: * Default configuration is based on 'edosk2674_defconfig' This is useful for Kconfig editing/testing. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp (moderated for non-subscribers) Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
2018-08-22	arch/h8300: eliminate kgbd.c warning	Randy Dunlap	1	-1/+1
	Drop the "const" qualifier from arch_kgdb_ops to eliminate the gcc warning (gcc version is 8.1.0). arch/h8300/kernel/kgdb.c:132:24: error: conflicting type qualifiers for 'arch_kgdb_ops' const struct kgdb_arch arch_kgdb_ops = { In file included from ../arch/h8300/kernel/kgdb.c:12: ../include/linux/kgdb.h:284:26: note: previous declaration of 'arch_kgdb_ops' was here extern struct kgdb_arch arch_kgdb_ops; Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
2018-08-22	arch/h8300: eliminate ptrace.h warnings	Randy Dunlap	1	-0/+2
	Add a "struct task_struct;" stub to arch/h8300's ptrace.h header to eliminate gcc warnings (gcc version is 8.1.0). ../arch/h8300/include/asm/ptrace.h:32:34: warning: 'struct task_struct' declared inside parameter list will not be visible outside of this definition or declaration extern long h8300_get_reg(struct task_struct task, int regno); ../arch/h8300/include/asm/ptrace.h:33:33: warning: 'struct task_struct' declared inside parameter list will not be visible outside of this definition or declaration extern int h8300_put_reg(struct task_struct task, int regno, Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
2018-08-22	h8300:let the checker know that size_t is ulong	Luc Van Oostenryck	1	-0/+2
	All 64bit archs use unsigned long for size_t and most 32bit archs use 'unsigned int'. By default, this is what is assumed by sparse. However, on h8300 (a 32bit arch) size_t is unsigned long which can led sparse to emit wrong warnings. Fix this by passing to sparse the flag -msize-long, telling it that size_t is unsigned long. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
2018-08-22	h8300: Don't include linux/kernel.h in asm/atomic.h	Will Deacon	1	-2/+2
	linux/kernel.h isn't needed by asm/atomic.h and will result in circular dependencies when the asm-generic atomic bitops are built around the tomic_long_t interface. Remove the broad include and replace it with linux/compiler.h for READ_ONCE etc and asm/irqflags.h for arch_local_irq_save etc. Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
2018-08-22	h8300: remove unnecessary of_platform_populate call	Rob Herring	1	-10/+0
	The DT core will call of_platform_populate, so it is not necessary for arch specific code to call it unless there are custom match entries, auxdata or parent device. Neither of those apply here, so remove the call. Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
2018-08-22	h8300: Correct signature of test_bit()	Geert Uytterhoeven	1	-1/+1
	mm/filemap.c: In function 'clear_bit_unlock_is_negative_byte': mm/filemap.c:1181:30: warning: passing argument 2 of 'test_bit' discards 'volatile' qualifier from pointer target type [-Wdiscarded-qualifiers] return test_bit(PG_waiters, mem); ^~~ In file included from include/linux/bitops.h:38, from include/linux/kernel.h:11, from include/linux/list.h:9, from include/linux/wait.h:7, from include/linux/wait_bit.h:8, from include/linux/fs.h:6, from include/linux/dax.h:5, from mm/filemap.c:14: arch/h8300/include/asm/bitops.h:69:57: note: expected 'const long unsigned int ' but argument is of type 'volatile void ' static inline int test_bit(int nr, const unsigned long *addr) ~~~~~~~~~~~~~~~~~~~~~^~~~ Make the bitmask pointed to by the "addr" parameter volatile to fix this, like is done on other architectures. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
2018-08-22	h8300: irqchip: fix warning	Yoshinori Sato	1	-3/+3
	Var "addr" type incorrect. It have interrupt controler register address. Type of void __iomem is correct. Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
2018-08-22	h8300: switch to NO_BOOTMEM	Rob Herring	2	-28/+9
	Commit 0fa1c579349f ("of/fdt: use memblock_virt_alloc for early alloc") inadvertently switched the DT unflattening allocations from memblock to bootmem which doesn't work because the unflattening happens before bootmem is initialized. Swapping the order of bootmem init and unflattening could also fix this, but removing bootmem is desired. So enable NO_BOOTMEM on h8300 like other architectures have done. Fixes: 0fa1c579349f ("of/fdt: use memblock_virt_alloc for early alloc") Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
2018-08-22	h8300: gcc-8.1 fix	Yoshinori Sato	1	-3/+4
	Since gcc 8.1 does not generate an assignment statement to er 0, we had to explicitly write it. Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
2018-08-22	h8300: Add missing output register.	Yoshinori Sato	1	-6/+6
	Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
2018-08-21	sparc: fix KBUILD_DEFCONFIG for ARCH=sparc32	Masahiro Yamada	1	-3/+3
	As commit 5ba800962a80 ("kbuild: update ARCH alias info for sparc") addressed, SPARC accepts ARCH=sparc32 as an alias. However, arch/sparc/Makefile wrongly sets KBUILD_DEFCONFIG, then sparc64_defconfig is chosen as the base configuration for ARCH=sparc32. $ make ARCH=sparc32 defconfig *** Default configuration is based on 'sparc64_defconfig' # # configuration written to .config # Fix the logic to choose sparc64_defconfig only when ARCH=sparc64. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-21	sparc32: split ramdisk detection and reservation to a helper function	Mike Rapoport	1	-25/+31
	The detection and reservation of ramdisk memory were separated to allow bootmem bitmap initialization after the ramdisk boundaries are detected. Since the bootmem initialization is removed, the reservation of ramdisk memory is done immediately after its boundaries are found. Split the entire block into a separate helper function. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Suggested-by: Sam Ravnborg <sam@ravnborg.org> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-21	sparc32: switch to NO_BOOTMEM	Mike Rapoport	2	-58/+21
	Each populated sparc_phys_bank is added to memblock.memory. The reserve_bootmem() calls are replaced with memblock_reserve(), and the bootmem bitmap initialization is droppped. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-21	sparc: mm/init_32: kill trailing whitespace	Mike Rapoport	1	-3/+3
	Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-21	sparc: use generic dma_noncoherent_ops	Christoph Hellwig	3	-165/+35
	Switch to the generic noncoherent direct mapping implementation. This removes the previous sync_single_for_device implementation, which looks bogus given that no syncing is happening in the similar but more important map_single case. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-21	microblaze/PCI: Remove stale pcibios_align_resource() comment	Lorenzo Pieralisi	1	-13/+0
	commit 01cf9d524ff0 ("microblaze/PCI: Support generic Xilinx AXI PCIe Host Bridge IP driver") and commit ecf677c8dcaa ("PCI: Add a generic weak pcibios_align_resource()") first patched then removed pcibios_align_resource() from the microblaze architecture code but failed to remove the comment that was added to it. Remove it since it has now become stale and it is quite confusing. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Bharat Kumar Gogada <bharatku@xilinx.com> Cc: Michal Simek <monstr@monstr.eu> Signed-off-by: Michal Simek <michal.simek@xilinx.com>
2018-08-20	Raise the minimum required gcc version to 4.6	Joe Perches	2	-69/+19
	Various architectures fail to build properly with older versions of the gcc compiler. An example from Guenter Roeck in thread [1]: > > In file included from ./include/linux/mm.h:17:0, > from ./include/linux/pid_namespace.h:7, > from ./include/linux/ptrace.h:10, > from arch/openrisc/kernel/asm-offsets.c:32: > ./include/linux/mm_types.h:497:16: error: flexible array member in otherwise empty struct > > This is just an example with gcc 4.5.1 for or32. I have seen the problem > with gcc 4.4 (for unicore32) as well. So update the minimum required version of gcc to 4.6. [1] https://lore.kernel.org/lkml/20180814170904.GA12768@roeck-us.net/ Miscellanea: - Update Documentation/process/changes.rst - Remove and consolidate version test blocks in compiler-gcc.h for versions lower than 4.6 Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-20	ia64: Fix kernel BUG at lib/ioremap.c:72!	Tony Luck	1	-0/+1
	Commit 0bbf47eab469 ("ia64: use asm-generic/io.h") results in a BUG while booting ia64. This is because asm-generic/io.h defines PCI_IOBASE, which results in the function acpi_pci_root_remap_iospace() doing a lot of unnecessary (and wrong) things. I'd suggested an #if !CONFIG_IA64 in the functon, but Arnd suggested keeping the fix inside the arch/ia64 tree. Fixes: 0bbf47eab469 ("ia64: use asm-generic/io.h") Suggested-by: Arnd Bergman <arnd@arndb.de> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-20	i2c: rcar: implement STOP and REP_START according to docs	Wolfram Sang	1	-14/+20
	When doing a REP_START after a read message, the driver used to trigger a STOP first which would then be overwritten by REP_START. This was the only stable method found when doing the last refactoring. However, this was not in accordance with the documentation. After research from our BSP team and myself, we now can implement a version which works and is according to the documentation. The new approach ensures the ICMCR register is only changed when really needed. Tested on a R-Car Gen2 (H2) and Gen3 with DMA (M3N). Signed-off-by: Hiromitsu Yamasaki <hiromitsu.yamasaki.ym@renesas.com> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Ulrich Hecht <uli+renesas@fpond.eu> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-20	i2c: rcar: refactor private flags	Wolfram Sang	1	-3/+4
	Use BIT macro to avoid shift-31-problem, indent a little more and use GENMASK to make it easier to add new flags. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Ulrich Hecht <uli+renesas@fpond.eu> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-20	i2c: core: ACPI: Make acpi_gsb_i2c_read_bytes() check i2c_transfer return value	Hans de Goede	1	-3/+5
	Currently acpi_gsb_i2c_read_bytes() directly returns i2c_transfer's return value. i2c_transfer returns a value < 0 on error and 2 (for 2 successfully executed transfers) on success. But the ACPI code expects 0 on success, so currently acpi_gsb_i2c_read_bytes()'s caller does: if (status > 0) status = 0; This commit makes acpi_gsb_i2c_read_bytes() return a value which can be directly consumed by the ACPI code, mirroring acpi_gsb_i2c_write_bytes(), this commit also makes acpi_gsb_i2c_read_bytes() explitcly check that i2c_transfer returns 2, rather then accepting any value > 0. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-20	i2c: core: ACPI: Properly set status byte to 0 for multi-byte writes	Hans de Goede	1	-3/+8
	acpi_gsb_i2c_write_bytes() returns i2c_transfer()'s return value, which is the number of transfers executed on success, so 1. The ACPI code expects us to store 0 in gsb->status for success, not 1. Specifically this breaks the following code in the Thinkpad 8 DSDT: ECWR = I2CW = ECWR /* \_SB_.I2C1.BAT0.ECWR / If ((ECST == Zero)) { ECRD = I2CR / \_SB_.I2C1.I2CR */ } Before this commit we set ECST to 1, causing the read to never happen breaking battery monitoring on the Thinkpad 8. This commit makes acpi_gsb_i2c_write_bytes() return 0 when i2c_transfer() returns 1, so the single write transfer completed successfully, and makes it return -EIO on for other (unexpected) return values >= 0. Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-20	dt-bindings: i2c: rcar: Add r8a774a1 support	Fabrizio Castro	1	-1/+3
	Document RZ/G2M (R8A774A1) I2C compatibility with the relevant driver dt-bindings. Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com> Reviewed-by: Biju Das <biju.das@bp.renesas.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-20	dt-bindings: i2c: sh_mobile: Add r8a774a1 support	Fabrizio Castro	1	-1/+3
	Document RZ/G2M (R8A774A1) SoC bindings. Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com> Reviewed-by: Biju Das <biju.das@bp.renesas.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-20	i2c: imx: Simplify stopped state tracking	Esben Haabendal	1	-8/+7
	Always update the stopped state when busy status have been checked. This is identical to what was done before, with the exception of error handling. Without this change, some errors cause the stopped state to be left in incorrect state in i2c_imx_stop(), i2c_imx_dma_read(), i2c_imx_read() and i2c_imx_xfer(). Signed-off-by: Esben Haabendal <eha@deif.com> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-20	i2c: imx: Fix race condition in dma read	Esben Haabendal	1	-4/+4
	This fixes a race condition, where the DMAEN bit ends up being set after I2C slave has transmitted a byte following the dummy read. When that happens, an interrupt is generated instead, and no DMA request is generated to kickstart the DMA read, and a timeout happens after DMA_TIMEOUT (1 sec). Fixed by setting the DMAEN bit before the dummy read. Signed-off-by: Esben Haabendal <eha@deif.com> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org
2018-08-20	i2c: pasemi: remove hardcoded bus numbers on smbus	Darren Stevens	1	-2/+1
	The pasemi smbus controller uses PCI_FUNC(dev->devfn) to define which number bus to attach to, however this fails when something else is probed first, for example an ATI Radeon graphics card will claim 9 or 10 busses, including the ones the pasemi wants. Patch the driver to call i2c_add_adapter rather than i2c_add_numbered_adapter. Signed-off-by: Darren Stevens <darren@stevens-zone.net> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-20	i2c: designware: Add SPDX license tag	Andy Shevchenko	7	-94/+7
	Replace short statement in comment with proper SPDX license tag. Note, for i2c-desingware-slave.c the identifier is chosen in accordance with MODULE_LICENSE() macro since it is visible to user. Another point to this choice is that the header seems to be copy'n'paste from the other file of this very driver. Acked-by: Luis Oliveira <Luis.Oliveira@synopsys.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-19	ip6_vti: fix creating fallback tunnel device for vti6	Haishuang Yan	1	-0/+2
	When set fb_tunnels_only_for_init_net to 1, don't create fallback tunnel device for vti6 when a new namespace is created. Tested: [root@builder2 ~]# modprobe ip6_tunnel [root@builder2 ~]# modprobe ip6_vti [root@builder2 ~]# echo 1 > /proc/sys/net/core/fb_tunnels_only_for_init_net [root@builder2 ~]# unshare -n [root@builder2 ~]# ip link 1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-19	ip_vti: fix a null pointer deferrence when create vti fallback tunnel	Haishuang Yan	1	-1/+2
	After set fb_tunnels_only_for_init_net to 1, the itn->fb_tunnel_dev will be NULL and will cause following crash: [ 2742.849298] BUG: unable to handle kernel NULL pointer dereference at 0000000000000941 [ 2742.851380] PGD 800000042c21a067 P4D 800000042c21a067 PUD 42aaed067 PMD 0 [ 2742.852818] Oops: 0002 [#1] SMP PTI [ 2742.853570] CPU: 7 PID: 2484 Comm: unshare Kdump: loaded Not tainted 4.18.0-rc8+ #2 [ 2742.855163] Hardware name: Fedora Project OpenStack Nova, BIOS seabios-1.7.5-11.el7 04/01/2014 [ 2742.856970] RIP: 0010:vti_init_net+0x3a/0x50 [ip_vti] [ 2742.858034] Code: 90 83 c0 48 c7 c2 20 a1 83 c0 48 89 fb e8 6e 3b f6 ff 85 c0 75 22 8b 0d f4 19 00 00 48 8b 93 00 14 00 00 48 8b 14 ca 48 8b 12 <c6> 82 41 09 00 00 04 c6 82 38 09 00 00 45 5b c3 66 0f 1f 44 00 00 [ 2742.861940] RSP: 0018:ffff9be28207fde0 EFLAGS: 00010246 [ 2742.863044] RAX: 0000000000000000 RBX: ffff8a71ebed4980 RCX: 0000000000000013 [ 2742.864540] RDX: 0000000000000000 RSI: 0000000000000013 RDI: ffff8a71ebed4980 [ 2742.866020] RBP: ffff8a71ea717000 R08: ffffffffc083903c R09: ffff8a71ea717000 [ 2742.867505] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a71ebed4980 [ 2742.868987] R13: 0000000000000013 R14: ffff8a71ea5b49c0 R15: 0000000000000000 [ 2742.870473] FS: 00007f02266c9740(0000) GS:ffff8a71ffdc0000(0000) knlGS:0000000000000000 [ 2742.872143] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2742.873340] CR2: 0000000000000941 CR3: 000000042bc20006 CR4: 00000000001606e0 [ 2742.874821] Call Trace: [ 2742.875358] ops_init+0x38/0xf0 [ 2742.876078] setup_net+0xd9/0x1f0 [ 2742.876789] copy_net_ns+0xb7/0x130 [ 2742.877538] create_new_namespaces+0x11a/0x1d0 [ 2742.878525] unshare_nsproxy_namespaces+0x55/0xa0 [ 2742.879526] ksys_unshare+0x1a7/0x330 [ 2742.880313] __x64_sys_unshare+0xe/0x20 [ 2742.881131] do_syscall_64+0x5b/0x180 [ 2742.881933] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reproduce: echo 1 > /proc/sys/net/core/fb_tunnels_only_for_init_net modprobe ip_vti unshare -n Fixes: 79134e6ce2c9 ("net: do not create fallback tunnels for non-default namespaces") Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-19	r8169: don't use MSI-X on RTL8106e	Jian-Hong Pan	1	-3/+6
	Found the ethernet network on ASUS X441UAR doesn't come back on resume from suspend when using MSI-X. The chip is RTL8106e - version 39. [ 21.848357] libphy: r8169: probed [ 21.848473] r8169 0000:02:00.0 eth0: RTL8106e, 0c:9d:92:32:67:b4, XID 44900000, IRQ 127 [ 22.518860] r8169 0000:02:00.0 enp2s0: renamed from eth0 [ 29.458041] Generic PHY r8169-200:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE) [ 63.227398] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full - flow control off [ 124.514648] Generic PHY r8169-200:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE) Here is the ethernet controller in detail: 02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller [10ec:8136] (rev 07) Subsystem: ASUSTeK Computer Inc. RTL810xE PCI Express Fast Ethernet controller [1043:200f] Flags: bus master, fast devsel, latency 0, IRQ 16 I/O ports at e000 [size=256] Memory at ef100000 (64-bit, non-prefetchable) [size=4K] Memory at e0000000 (64-bit, prefetchable) [size=16K] Capabilities: <access denied> Kernel driver in use: r8169 Kernel modules: r8169 Falling back to MSI fixes the issue. Fixes: 6c6aa15fdea5 ("r8169: improve interrupt handling") Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-19	net: lan743x_ptp: convert to ktime_get_clocktai_ts64	Arnd Bergmann	1	-2/+1
	timekeeping_clocktai64() has been renamed to ktime_get_clocktai_ts64() for consistency with the other ktime_get_* access functions. Rename the new caller that has come up as well. Question: this is the only ptp driver that sets the hardware time to the current system time in TAI. Why does it do that? Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-19	net: sched: always disable bh when taking tcf_lock	Vlad Buslov	7	-44/+47
	Recently, ops->init() and ops->dump() of all actions were modified to always obtain tcf_lock when accessing private action state. Actions that don't depend on tcf_lock for synchronization with their data path use non-bh locking API. However, tcf_lock is also used to protect rate estimator stats in softirq context by timer callback. Change ops->init() and ops->dump() of all actions to disable bh when using tcf_lock to prevent deadlock reported by following lockdep warning: [ 105.470398] ================================ [ 105.475014] WARNING: inconsistent lock state [ 105.479628] 4.18.0-rc8+ #664 Not tainted [ 105.483897] -------------------------------- [ 105.488511] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 105.494871] swapper/16/0 [HC0[0]:SC1[1]:HE1:SE0] takes: [ 105.500449] 00000000f86c012e (&(&p->tcfa_lock)->rlock){+.?.}, at: est_fetch_counters+0x3c/0xa0 [ 105.509696] {SOFTIRQ-ON-W} state was registered at: [ 105.514925] _raw_spin_lock+0x2c/0x40 [ 105.519022] tcf_bpf_init+0x579/0x820 [act_bpf] [ 105.523990] tcf_action_init_1+0x4e4/0x660 [ 105.528518] tcf_action_init+0x1ce/0x2d0 [ 105.532880] tcf_exts_validate+0x1d8/0x200 [ 105.537416] fl_change+0x55a/0x268b [cls_flower] [ 105.542469] tc_new_tfilter+0x748/0xa20 [ 105.546738] rtnetlink_rcv_msg+0x56a/0x6d0 [ 105.551268] netlink_rcv_skb+0x18d/0x200 [ 105.555628] netlink_unicast+0x2d0/0x370 [ 105.559990] netlink_sendmsg+0x3b9/0x6a0 [ 105.564349] sock_sendmsg+0x6b/0x80 [ 105.568271] ___sys_sendmsg+0x4a1/0x520 [ 105.572547] __sys_sendmsg+0xd7/0x150 [ 105.576655] do_syscall_64+0x72/0x2c0 [ 105.580757] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 105.586243] irq event stamp: 489296 [ 105.590084] hardirqs last enabled at (489296): [<ffffffffb507e639>] _raw_spin_unlock_irq+0x29/0x40 [ 105.599765] hardirqs last disabled at (489295): [<ffffffffb507e745>] _raw_spin_lock_irq+0x15/0x50 [ 105.609277] softirqs last enabled at (489292): [<ffffffffb413a6a3>] irq_enter+0x83/0xa0 [ 105.618001] softirqs last disabled at (489293): [<ffffffffb413a800>] irq_exit+0x140/0x190 [ 105.626813] other info that might help us debug this: [ 105.633976] Possible unsafe locking scenario: [ 105.640526] CPU0 [ 105.643325] ---- [ 105.646125] lock(&(&p->tcfa_lock)->rlock); [ 105.650747] <Interrupt> [ 105.653717] lock(&(&p->tcfa_lock)->rlock); [ 105.658514] * DEADLOCK * [ 105.665349] 1 lock held by swapper/16/0: [ 105.669629] #0: 00000000a640ad99 ((&est->timer)){+.-.}, at: call_timer_fn+0x10b/0x550 [ 105.678200] stack backtrace: [ 105.683194] CPU: 16 PID: 0 Comm: swapper/16 Not tainted 4.18.0-rc8+ #664 [ 105.690249] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017 [ 105.698626] Call Trace: [ 105.701421] <IRQ> [ 105.703791] dump_stack+0x92/0xeb [ 105.707461] print_usage_bug+0x336/0x34c [ 105.711744] mark_lock+0x7c9/0x980 [ 105.715500] ? print_shortest_lock_dependencies+0x2e0/0x2e0 [ 105.721424] ? check_usage_forwards+0x230/0x230 [ 105.726315] __lock_acquire+0x923/0x26f0 [ 105.730597] ? debug_show_all_locks+0x240/0x240 [ 105.735478] ? mark_lock+0x493/0x980 [ 105.739412] ? check_chain_key+0x140/0x1f0 [ 105.743861] ? __lock_acquire+0x836/0x26f0 [ 105.748323] ? lock_acquire+0x12e/0x290 [ 105.752516] lock_acquire+0x12e/0x290 [ 105.756539] ? est_fetch_counters+0x3c/0xa0 [ 105.761084] _raw_spin_lock+0x2c/0x40 [ 105.765099] ? est_fetch_counters+0x3c/0xa0 [ 105.769633] est_fetch_counters+0x3c/0xa0 [ 105.773995] est_timer+0x87/0x390 [ 105.777670] ? est_fetch_counters+0xa0/0xa0 [ 105.782210] ? lock_acquire+0x12e/0x290 [ 105.786410] call_timer_fn+0x161/0x550 [ 105.790512] ? est_fetch_counters+0xa0/0xa0 [ 105.795055] ? del_timer_sync+0xd0/0xd0 [ 105.799249] ? __lock_is_held+0x93/0x110 [ 105.803531] ? mark_held_locks+0x20/0xe0 [ 105.807813] ? _raw_spin_unlock_irq+0x29/0x40 [ 105.812525] ? est_fetch_counters+0xa0/0xa0 [ 105.817069] ? est_fetch_counters+0xa0/0xa0 [ 105.821610] run_timer_softirq+0x3c4/0x9f0 [ 105.826064] ? lock_acquire+0x12e/0x290 [ 105.830257] ? __bpf_trace_timer_class+0x10/0x10 [ 105.835237] ? __lock_is_held+0x25/0x110 [ 105.839517] __do_softirq+0x11d/0x7bf [ 105.843542] irq_exit+0x140/0x190 [ 105.847208] smp_apic_timer_interrupt+0xac/0x3b0 [ 105.852182] apic_timer_interrupt+0xf/0x20 [ 105.856628] </IRQ> [ 105.859081] RIP: 0010:cpuidle_enter_state+0xd8/0x4d0 [ 105.864395] Code: 46 ff 48 89 44 24 08 0f 1f 44 00 00 31 ff e8 cf ec 46 ff 80 7c 24 07 00 0f 85 1d 02 00 00 e8 9f 90 4b ff fb 66 0f 1f 44 00 00 <4c> 8b 6c 24 08 4d 29 fd 0f 80 36 03 00 00 4c 89 e8 48 ba cf f7 53 [ 105.884288] RSP: 0018:ffff8803ad94fd20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 [ 105.892494] RAX: 0000000000000000 RBX: ffffe8fb300829c0 RCX: ffffffffb41e19e1 [ 105.899988] RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff8803ad9358ac [ 105.907503] RBP: ffffffffb6636300 R08: 0000000000000004 R09: 0000000000000000 [ 105.914997] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000004 [ 105.922487] R13: ffffffffb6636140 R14: ffffffffb66362d8 R15: 000000188d36091b [ 105.929988] ? trace_hardirqs_on_caller+0x141/0x2d0 [ 105.935232] do_idle+0x28e/0x320 [ 105.938817] ? arch_cpu_idle_exit+0x40/0x40 [ 105.943361] ? mark_lock+0x8c1/0x980 [ 105.947295] ? _raw_spin_unlock_irqrestore+0x32/0x60 [ 105.952619] cpu_startup_entry+0xc2/0xd0 [ 105.956900] ? cpu_in_idle+0x20/0x20 [ 105.960830] ? _raw_spin_unlock_irqrestore+0x32/0x60 [ 105.966146] ? trace_hardirqs_on_caller+0x141/0x2d0 [ 105.971391] start_secondary+0x2b5/0x360 [ 105.975669] ? set_cpu_sibling_map+0x1330/0x1330 [ 105.980654] secondary_startup_64+0xa5/0xb0 Taking tcf_lock in sample action with bh disabled causes lockdep to issue a warning regarding possible irq lock inversion dependency between tcf_lock, and psample_groups_lock that is taken when holding tcf_lock in sample init: [ 162.108959] Possible interrupt unsafe locking scenario: [ 162.116386] CPU0 CPU1 [ 162.121277] ---- ---- [ 162.126162] lock(psample_groups_lock); [ 162.130447] local_irq_disable(); [ 162.136772] lock(&(&p->tcfa_lock)->rlock); [ 162.143957] lock(psample_groups_lock); [ 162.150813] <Interrupt> [ 162.153808] lock(&(&p->tcfa_lock)->rlock); [ 162.158608] * DEADLOCK * In order to prevent potential lock inversion dependency between tcf_lock and psample_groups_lock, extract call to psample_group_get() from tcf_lock protected section in sample action init function. Fixes: 4e232818bd32 ("net: sched: act_mirred: remove dependency on rtnl lock") Fixes: 764e9a24480f ("net: sched: act_vlan: remove dependency on rtnl lock") Fixes: 729e01260989 ("net: sched: act_tunnel_key: remove dependency on rtnl lock") Fixes: d77284956656 ("net: sched: act_sample: remove dependency on rtnl lock") Fixes: e8917f437006 ("net: sched: act_gact: remove dependency on rtnl lock") Fixes: b6a2b971c0b0 ("net: sched: act_csum: remove dependency on rtnl lock") Fixes: 2142236b4584 ("net: sched: act_bpf: remove dependency on rtnl lock") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-18	ip6_vti: simplify stats handling in vti6_xmit	Haishuang Yan	1	-11/+3
	Same as ip_vti, use iptunnel_xmit_stats to updates stats in tunnel xmit code path. Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-18	pcmcia: remove long deprecated pcmcia_request_exclusive_irq() function	Linus Torvalds	3	-49/+0
	This function was created as a deprecated fallback case back in 2010 by commit eb14120f743d ("pcmcia: re-work pcmcia_request_irq()") for legacy cases. Actual in-kernel users haven't been around for a long while. The last in-kernel user was apparently removed four years ago by commit 5f5316fcd08e ("am2150: Update nmclan_cs.c to use update PCMCIA API"). Just remove it entirely. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-18	deprecate the '__deprecated' attribute warnings entirely and for good	Linus Torvalds	3	-28/+2
	We haven't had lots of deprecation warnings lately, but the rdma use of it made them flare up again. They are not useful. They annoy everybody, and nobody ever does anything about them, because it's always "somebody elses problem". And when people start thinking that warnings are normal, they stop looking at them, and the real warnings that mean something go unnoticed. If you want to get rid of a function, just get rid of it. Convert every user to the new world order. And if you can't do that, then don't annoy everybody else with your marking that says "I couldn't be bothered to fix this, so I'll just spam everybody elses build logs with warnings about my laziness". Make a kernelnewbies wiki page about things that could be cleaned up, write a blog post about it, or talk to people on the mailing lists. But don't add warnings to the kernel build about cleanup that you think should happen but you aren't doing yourself. Don't. Just don't. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17	mm/hmm.c: remove unused variables align_start and align_end	Colin Ian King	1	-4/+1
	Variables align_start and align_end are being assigned but are never used hence they are redundant and can be removed. Cleans up clang warnings: warning: variable 'align_start' set but not used [-Wunused-but-set-variable] warning: variable 'align_size' set but not used [-Wunused-but-set-variable] Link: http://lkml.kernel.org/r/20180714161124.3923-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17	fs/userfaultfd.c: remove redundant pointer uwq	Colin Ian King	1	-3/+0
	Pointer uwq is being assigned but is never used hence it is redundant and can be removed. Cleans up clang warning: warning: variable 'uwq' set but not used [-Wunused-but-set-variable] Link: http://lkml.kernel.org/r/20180717090802.18357-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17	mm, vmacache: hash addresses based on pmd	David Rientjes	2	-15/+29
	When perf profiling a wide variety of different workloads, it was found that vmacache_find() had higher than expected cost: up to 0.08% of cpu utilization in some cases. This was found to rival other core VM functions such as alloc_pages_vma() with thp enabled and default mempolicy, and the conditionals in __get_vma_policy(). VMACACHE_HASH() determines which of the four per-task_struct slots a vma is cached for a particular address. This currently depends on the pfn, so pfn 5212 occupies a different vmacache slot than its neighboring pfn 5213. vmacache_find() iterates through all four of current's vmacache slots when looking up an address. Hashing based on pfn, an address has ~1/VMACACHE_SIZE chance of being cached in the first vmacache slot, or about 25%, if the vma is cached. This patch hashes an address by its pmd instead of pte to optimize for workloads with good spatial locality. This results in a higher probability of vmas being cached in the first slot that is checked: normally ~70% on the same workloads instead of 25%. [rientjes@google.com: various updates] Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1807231532290.109445@chino.kir.corp.google.com Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1807091749150.114630@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17	mm/list_lru: introduce list_lru_shrink_walk_irq()	Sebastian Andrzej Siewior	3	-6/+42
	Provide list_lru_shrink_walk_irq() and let it behave like list_lru_walk_one() except that it locks the spinlock with spin_lock_irq(). This is used by scan_shadow_nodes() because its lock nests within the i_pages lock which is acquired with IRQ. This change allows to use proper locking promitives instead hand crafted lock_irq_disable() plus spin_lock(). There is no EXPORT_SYMBOL provided because the current user is in-kernel only. Add list_lru_shrink_walk_irq() which acquires the spinlock with the proper locking primitives. Link: http://lkml.kernel.org/r/20180716111921.5365-5-bigeasy@linutronix.de Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17	mm/list_lru.c: pass struct list_lru_node* as an argument to __list_lru_walk_one()	Sebastian Andrzej Siewior	1	-6/+6
	__list_lru_walk_one() is invoked with struct list_lru lru, int nid as the first two argument. Those two are only used to retrieve struct list_lru_node. Since this is already done by the caller of the function for the locking, we can pass struct list_lru_node directly and avoid the dance around it. Link: http://lkml.kernel.org/r/20180716111921.5365-4-bigeasy@linutronix.de Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17	mm/list_lru.c: move locking from __list_lru_walk_one() to its caller	Sebastian Andrzej Siewior	1	-5/+13
	Move the locking inside __list_lru_walk_one() to its caller. This is a preparation step in order to introduce list_lru_walk_one_irq() which does spin_lock_irq() instead of spin_lock() for the locking. Link: http://lkml.kernel.org/r/20180716111921.5365-3-bigeasy@linutronix.de Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17	mm/list_lru.c: use list_lru_walk_one() in list_lru_walk_node()	Sebastian Andrzej Siewior	1	-2/+2
	Patch series "mm/list_lru: Add list_lru_shrink_walk_irq() and a user". This series removes the local_irq_disable() around list_lru_shrink_walk() (as used by mm/workingset) by adding list_lru_shrink_walk_irq(). Vladimir Davydov preferred this over `irq' argument which I added to struct list_lru. The initial post (of this series) received a Reviewed-by tag by Vladimir Davydov which I added to each patch of the series. The series applies on top of akpm's tree which has Kirill's shrink_slab series and does not clash with it (akpm asked me to wait a week or so and repost it then). I tested the code paths by triggering the OOM-killer via memory over commit and lockdep did not complain (nor did I see any warnings). This patch (of 4): list_lru_walk_node() invokes __list_lru_walk_one() with -1 as the memcg_idx parameter. The same can be achieved by list_lru_walk_one() and passing NULL as memcg argument which then gets converted into -1. This is a preparation step when the spin_lock() function is lifted to the caller of __list_lru_walk_one(). Invoke list_lru_walk_one() instead __list_lru_walk_one() when possible. Link: http://lkml.kernel.org/r/20180716111921.5365-2-bigeasy@linutronix.de Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17	mm, swap: make CONFIG_THP_SWAP depend on CONFIG_SWAP	Huang Ying	1	-2/+3
	CONFIG_THP_SWAP should depend on CONFIG_SWAP, because it's unreasonable to optimize swapping for THP (Transparent Huge Page) without basic swapping support. In original code, when CONFIG_SWAP=n and CONFIG_THP_SWAP=y, split_swap_cluster() will not be built because it is in swapfile.c, but it will be called in huge_memory.c. This doesn't trigger a build error in practice because the call site is enclosed by PageSwapCache(), which is defined to be constant 0 when CONFIG_SWAP=n. But this is fragile and should be fixed. The comments are fixed too to reflect the latest progress. Link: http://lkml.kernel.org/r/20180713021228.439-1-ying.huang@intel.com Fixes: 38d8b4e6bdc8 ("mm, THP, swap: delay splitting THP during swap out") Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shaohua Li <shli@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Zi Yan <zi.yan@cs.rutgers.edu> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17	mm/sparse: delete old sparse_init and enable new one	Pavel Tatashin	4	-267/+1
	Rename new_sparse_init() to sparse_init() which enables it. Delete old sparse_init() and all the code that became obsolete with. [pasha.tatashin@oracle.com: remove unused sparse_mem_maps_populate_node()] Link: http://lkml.kernel.org/r/20180716174447.14529-6-pasha.tatashin@oracle.com Link: http://lkml.kernel.org/r/20180712203730.8703-6-pasha.tatashin@oracle.com Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Tested-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Tested-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com> Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Cc: Baoquan He <bhe@redhat.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17	mm/sparse: add new sparse_init_nid() and sparse_init()	Pavel Tatashin	1	-0/+85
	sparse_init() requires to temporary allocate two large buffers: usemap_map and map_map. Baoquan He has identified that these buffers are so large that Linux is not bootable on small memory machines, such as a kdump boot. The buffers are especially large when CONFIG_X86_5LEVEL is set, as they are scaled to the maximum physical memory size. Baoquan provided a fix, which reduces these sizes of these buffers, but it is much better to get rid of them entirely. Add a new way to initialize sparse memory: sparse_init_nid(), which only operates within one memory node, and thus allocates memory either in large contiguous block or allocates section by section. This eliminates the need for use of temporary buffers. For simplified bisecting and review temporarly call sparse_init() new_sparse_init(), the new interface is going to be enabled as well as old code removed in the next patch. Link: http://lkml.kernel.org/r/20180712203730.8703-5-pasha.tatashin@oracle.com Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Tested-by: Oscar Salvador <osalvador@suse.de> Tested-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com> Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Cc: Baoquan He <bhe@redhat.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17	mm/sparse: move buffer init/fini to the common place	Pavel Tatashin	3	-12/+7
	Now that both variants of sparse memory use the same buffers to populate memory map, we can move sparse_buffer_init()/sparse_buffer_fini() to the common place. Link: http://lkml.kernel.org/r/20180712203730.8703-4-pasha.tatashin@oracle.com Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Tested-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Tested-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com> Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Cc: Baoquan He <bhe@redhat.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17	mm/sparse: use the new sparse buffer functions in non-vmemmap	Pavel Tatashin	1	-27/+14
	non-vmemmap sparse also allocated large contiguous chunk of memory, and if fails falls back to smaller allocations. Use the same functions to allocate buffer as the vmemmap-sparse Link: http://lkml.kernel.org/r/20180712203730.8703-3-pasha.tatashin@oracle.com Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Tested-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Reviewed-by: Oscar Salvador <osalvador@suse.de> Tested-by: Oscar Salvador <osalvador@suse.de> Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com> Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Cc: Baoquan He <bhe@redhat.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17	mm/sparse: abstract sparse buffer allocations	Pavel Tatashin	3	-35/+54
	Patch series "sparse_init rewrite", v6. In sparse_init() we allocate two large buffers to temporary hold usemap and memmap for the whole machine. However, we can avoid doing that if we changed sparse_init() to operated on per-node bases instead of doing it on the whole machine beforehand. As shown by Baoquan http://lkml.kernel.org/r/20180628062857.29658-1-bhe@redhat.com The buffers are large enough to cause machine stop to boot on small memory systems. Another benefit of these changes is that they also obsolete CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER. This patch (of 5): When struct pages are allocated for sparse-vmemmap VA layout, we first try to allocate one large buffer, and than if that fails allocate struct pages for each section as we go. The code that allocates buffer is uses global variables and is spread across several call sites. Cleanup the code by introducing three functions to handle the global buffer: sparse_buffer_init() initialize the buffer sparse_buffer_fini() free the remaining part of the buffer sparse_buffer_alloc() alloc from the buffer, and if buffer is empty return NULL Define these functions in sparse.c instead of sparse-vmemmap.c because later we will use them for non-vmemmap sparse allocations as well. [akpm@linux-foundation.org: use PTR_ALIGN()] [akpm@linux-foundation.org: s/BUG_ON/WARN_ON/] Link: http://lkml.kernel.org/r/20180712203730.8703-2-pasha.tatashin@oracle.com Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Tested-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Reviewed-by: Oscar Salvador <osalvador@suse.de> Tested-by: Oscar Salvador <osalvador@suse.de> Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>