linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2019-06-18	scsi: megaraid_sas: Rework code around controller reset	Shivasharan S	2	-13/+11
	No functional change. This patch reworks code around controller reset path which gets rid of a couple of goto labels. This is in preparation for the next patch which adds PCI config space access locking while controller reset is in progress. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18	scsi: megaraid_sas: fw_reset_no_pci_access required for MFI adapters only	Shivasharan S	1	-8/+10
	fw_reset_no_pci_access is only applicable for MFI controllers and is not used for Fusion controllers. For all Fusion controllers, driver can check reset adapter bit in status register before performing a chip reset without setting "fw_reset_no_pci_access". Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18	scsi: megaraid_sas: Remove unused variable target_index	Shivasharan S	1	-3/+0
	No functional change. Remove set but unused variable in megasas_set_static_target_properties. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18	scsi: fdomain: Resurrect driver - ISA support	Ondrej Zary	3	-0/+237
	Future Domain 16xx ISA SCSI support card support. Tested on IBM 92F0330 card (18C50 chip) with v1.00 BIOS. Signed-off-by: Ondrej Zary <linux@zary.sk> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18	scsi: fdomain: Resurrect driver - PCI support	Ondrej Zary	3	-0/+86
	Future Domain TMC-3260/AHA-2920A PCI card support. Tested on Adaptec AHA-2920A PCI card. Signed-off-by: Ondrej Zary <linux@zary.sk> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18	scsi: fdomain: Resurrect driver - Core	Ondrej Zary	4	-0/+644
	Future Domain TMC-16xx/TMC-3260 SCSI driver. This is the core driver, common for PCI, ISA and PCMCIA cards. Signed-off-by: Ondrej Zary <linux@zary.sk> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18	scsi: hpsa: update driver version	Don Brace	1	-1/+1
	[mkp: wrong baseline, applied by hand] Reviewed-by: Gerry Morong <gerry.morong@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18	scsi: hpsa: correct device resets	Don Brace	3	-90/+88
	Correct a race condition that occurs between the reset handler and the completion handler. There are times when the wait_event condition is never met due to this race condition and the reset never completes. The reset_pending field is NULL initially. t Reset Handler Thread Completion Thread -- -------------------- ----------------- t1 if (c->reset_pending) t2 c->reset_pending = dev; if (atomic_dev_and_test(counter)) t3 atomic_inc(counter) wait_up_all(event_sync_wait_queue) t4 t5 wait_event(...counter == 0) Kernel.org Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=1994350 Bug 199435 - HPSA + P420i resetting logical Direct-Access never complete Reviewed-by: Justin Lindley <justin.lindley@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18	scsi: hpsa: do-not-complete-cmds-for-deleted-devices	Don Brace	2	-1/+14
	Close up a rare multipath issue. Close up small hole where a command completes after a device has been removed from SML and before the device is re-added. - Mark device as removed in slave_destroy - Do not complete commands for deleted devices Reviewed-by: Justin Lindley <justin.lindley@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18	scsi: hpsa: wait longer for ptraid commands	Don Brace	1	-6/+20
	Wait longer for outstanding commands before removing a multipath device. Increase the timeout value for ptraid commands. Reviewed-by: Justin Lindley <justin.lindley@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18	scsi: hpsa: check for tag collision	Don Brace	2	-7/+15
	Correct rare multipath issue where a device is deleted with an outstanding cmd which results in a tag collision. The cmd eventually completes. If a collision is detected wait until the command slot is cleared. Reviewed-by: Justin Lindley <justin.lindley@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18	scsi: hpsa: use local workqueues instead of system workqueues	Don Brace	2	-3/+20
	Avoid system stalls by switching to local workqueue. Reviewed-by: Justin Lindley <justin.lindley@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18	scsi: hpsa: correct simple mode	Don Brace	1	-4/+16
	Correct issue with hpsa_simple_mode module parameter. Driver was hanging due to incorrect interrupt setup. Reviewed-by: Justin Lindley <justin.lindley@microsemi.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18	scsi: osst: kill obsolete driver	Hannes Reinecke	9	-7126/+3
	The osst driver is becoming obsolete, as the manufacturer went out of business ages ago, and the maintainer has no means of testing any improvements anymore. Plus these days flash drives are cheaper and offer a higher capacity. So drop it completely. Cc: Willem Riede <osst@riede.org> Signed-off-by: Hannes Reinece <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18	scsi: sd: Inline sd_probe_part2()	Bart Van Assche	1	-59/+44
	Make sd_probe() easier to read by inlining sd_probe_part2(). This patch does not change any functionality. [mkp: applied by hand] Cc: Lee Duncan <lduncan@suse.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18	scsi: sd: Rely on the driver core for asynchronous probing	Bart Van Assche	4	-24/+7
	As explained during the 2018 LSF/MM session about increasing SCSI disk probing concurrency, the problems with the current probing approach are as follows: - The driver core is unaware of asynchronous SCSI LUN probing. wait_for_device_probe() waits for all asynchronous probes except asynchronous SCSI disk probes. - There is unnecessary serialization between sd_probe() and sd_remove(). This can lead to a deadlock. Hence this patch that modifies the sd driver such that it uses the driver core framework for asynchronous probing. The async domain and get_device()/put_device() pairs that became superfluous due to this change are removed. This patch does not affect the time needed for loading the scsi_debug kernel module with parameters delay=0 and max_luns=256. This patch depends on commit ef0ff68351be ("driver core: Probe devices asynchronously instead of the driver") that went upstream in kernel version v5.1-rc1. Cc: Lee Duncan <lduncan@suse.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Pavel Machek <pavel@ucw.cz> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: st: add a SPDX tag to st.c	Christoph Hellwig	1	-0/+1
	st.c is the only st file missing licensing information. Add a GPLv2 tag for the default kernel license. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: sr: add a SPDX tag to sr.c	Christoph Hellwig	1	-0/+1
	sr.c is the only sr file missing licensing information. Add a GPLv2 tag for the default kernel license. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: sg: switch to SPDX tags	Christoph Hellwig	1	-6/+1
	Use the the GPLv2+ SPDX tag instead of verbose boilerplate text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: ses: switch to SPDX tags	Christoph Hellwig	1	-18/+2
	Use the the GPLv2 SPDX tag instead of verbose boilerplate text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: sd: switch remaining files to SPDX tags	Christoph Hellwig	2	-30/+2
	Use the the GPLv2 SPDX tag instead of verbose boilerplate text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: sd: add a SPDX tag to sd.c	Christoph Hellwig	1	-0/+1
	sd.c is the only sd file missing licensing information. Add a GPLv2 tag for the default kernel license. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: libsas: switch remaining files to SPDX tags	Christoph Hellwig	12	-193/+12
	Use the the GPLv2 SPDX tag instead of verbose boilerplate text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: libsas: switch sas_ata.[ch] to SPDX tags	Christoph Hellwig	1	-15/+1
	Use the the GPLv2+ SPDX tag instead of verbose boilerplate text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: libsas: add a SPDX tag to sas_task.c	Christoph Hellwig	1	-1/+1
	sas_task.c is the only libsas file missing licensing information. Add a GPLv2 tag for the default kernel license. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: libiscsi: switch to SPDX tags	Christoph Hellwig	7	-91/+7
	Use the the GPLv2+ SPDX tag instead of verbose boilerplate text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: libfcoe: switch to SPDX tags	Christoph Hellwig	6	-78/+6
	Use the the GPLv2 SPDX tag instead of verbose boilerplate text. [mkp: fixed comment syntax on *.c] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: libfc: switch to SPDX tags	Christoph Hellwig	17	-222/+19
	Use the the GPLv2 SPDX tag instead of verbose boilerplate text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: libfc: remove duplicate GPL boilerplate text	Christoph Hellwig	4	-52/+0
	The libfc uapi headers already have proper SPDX tags, remove the duplicate boilerplate text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: scsi_transport_srp: switch to SPDX tags	Christoph Hellwig	1	-15/+1
	Use the the GPLv2+ SPDX tag instead of verbose boilerplate text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Acked-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: scsi_transport_spi: switch to SPDX tags	Christoph Hellwig	2	-28/+2
	Use the the GPLv2+ SPDX tag instead of verbose boilerplate text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: scsi_transport_sas: switch to SPDX tags	Christoph Hellwig	1	-1/+1
	Use the the GPLv2 SPDX tag instead of a free form blurb. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: scsi_transport_iscsi: switch to SPDX tags	Christoph Hellwig	2	-28/+2
	Use the the GPLv2+ SPDX tag instead of verbose boilerplate text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: scsi_transport_fc: switch to SPDX tags	Christoph Hellwig	2	-34/+2
	Use the the GPLv2+ SPDX tag instead of verbose boilerplate text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: scsi_transport_fc: remove duplicate GPL boilerplate text	Christoph Hellwig	2	-30/+0
	The FC transport class uapi headers already have proper SPDX tags, remove the duplicate boilerplate text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: scsi_transport.h: switch to SPDX tags	Christoph Hellwig	1	-14/+1
	Use the the GPLv2+ SPDX tag instead of verbose boilerplate text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: scsi_netlink: remove duplicate GPL boilerplate text	Christoph Hellwig	1	-15/+0
	The SCSI netlink uapi header already has a proper SPDX tag, remove the duplicate boilerplate text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: core: switch the remaining scsi midlayer files to use SPDX tags	Christoph Hellwig	3	-16/+3
	Use the GPLv2 SPDX tag instead of verbose boilerplate text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21	scsi: core: add SPDX tags to scsi midlayer files missing licensing information	Christoph Hellwig	8	-0/+8
	Add the default kernel GPLv2 annotation to SCSI midlayer files missing any licensing information. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-19	Linux 5.2-rc1	Linus Torvalds	1	-2/+2

2019-05-19	kconfig: use 'else ifneq' for Makefile to improve readability	Masahiro Yamada	1	-3/+1
	'ifeq ... else ifneq ... endif' notation is supported by GNU Make 3.81 or later, which is the requirement for building the kernel since commit 37d69ee30808 ("docs: bump minimal GNU Make version to 3.81"). Use it to improve the readability. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-05-18	panic: add an option to replay all the printk message in buffer	Feng Tang	5	-4/+24
	Currently on panic, kernel will lower the loglevel and print out pending printk msg only with console_flush_on_panic(). Add an option for users to configure the "panic_print" to replay all dmesg in buffer, some of which they may have never seen due to the loglevel setting, which will help panic debugging . [feng.tang@intel.com: keep the original console_flush_on_panic() inside panic()] Link: http://lkml.kernel.org/r/1556199137-14163-1-git-send-email-feng.tang@intel.com [feng.tang@intel.com: use logbuf lock to protect the console log index] Link: http://lkml.kernel.org/r/1556269868-22654-1-git-send-email-feng.tang@intel.com Link: http://lkml.kernel.org/r/1556095872-36838-1-git-send-email-feng.tang@intel.com Signed-off-by: Feng Tang <feng.tang@intel.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Aaro Koskinen <aaro.koskinen@nokia.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Borislav Petkov <bp@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-18	initramfs: don't free a non-existent initrd	Steven Price	1	-1/+1
	Since commit 54c7a8916a88 ("initramfs: free initrd memory if opening /initrd.image fails"), the kernel has unconditionally attempted to free the initrd even if it doesn't exist. In the non-existent case this causes a boot-time splat if CONFIG_DEBUG_VIRTUAL is enabled due to a call to virt_to_phys() with a NULL address. Instead we should check that the initrd actually exists and only attempt to free it if it does. Link: http://lkml.kernel.org/r/20190516143125.48948-1-steven.price@arm.com Fixes: 54c7a8916a88 ("initramfs: free initrd memory if opening /initrd.image fails") Signed-off-by: Steven Price <steven.price@arm.com> Reported-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-18	fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount	Jiufei Xue	1	-3/+8
	synchronize_rcu() didn't wait for call_rcu() callbacks, so inode wb switch may not go to the workqueue after synchronize_rcu(). Thus previous scheduled switches was not finished even flushing the workqueue, which will cause a NULL pointer dereferenced followed below. VFS: Busy inodes after unmount of vdd. Self-destruct in 5 seconds. Have a nice day... BUG: unable to handle kernel NULL pointer dereference at 0000000000000278 evict+0xb3/0x180 iput+0x1b0/0x230 inode_switch_wbs_work_fn+0x3c0/0x6a0 worker_thread+0x4e/0x490 ? process_one_work+0x410/0x410 kthread+0xe6/0x100 ret_from_fork+0x39/0x50 Replace the synchronize_rcu() call with a rcu_barrier() to wait for all pending callbacks to finish. And inc isw_nr_in_flight after call_rcu() in inode_switch_wbs() to make more sense. Link: http://lkml.kernel.org/r/20190429024108.54150-1-jiufei.xue@linux.alibaba.com Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com> Acked-by: Tejun Heo <tj@kernel.org> Suggested-by: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-18	mm/compaction.c: correct zone boundary handling when isolating pages from a pageblock	Mel Gorman	1	-2/+2
	syzbot reported the following error from a tree with a head commit of baf76f0c58ae ("slip: make slhc_free() silently accept an error pointer") BUG: unable to handle kernel paging request at ffffea0003348000 #PF error: [normal kernel read fault] PGD 12c3f9067 P4D 12c3f9067 PUD 12c3f8067 PMD 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 28916 Comm: syz-executor.2 Not tainted 5.1.0-rc6+ #89 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:constant_test_bit arch/x86/include/asm/bitops.h:314 [inline] RIP: 0010:PageCompound include/linux/page-flags.h:186 [inline] RIP: 0010:isolate_freepages_block+0x1c0/0xd40 mm/compaction.c:579 Code: 01 d8 ff 4d 85 ed 0f 84 ef 07 00 00 e8 29 00 d8 ff 4c 89 e0 83 85 38 ff ff ff 01 48 c1 e8 03 42 80 3c 38 00 0f 85 31 0a 00 00 <4d> 8b 2c 24 31 ff 49 c1 ed 10 41 83 e5 01 44 89 ee e8 3a 01 d8 ff RSP: 0018:ffff88802b31eab8 EFLAGS: 00010246 RAX: 1ffffd4000669000 RBX: 00000000000cd200 RCX: ffffc9000a235000 RDX: 000000000001ca5e RSI: ffffffff81988cc7 RDI: 0000000000000001 RBP: ffff88802b31ebd8 R08: ffff88805af700c0 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffea0003348000 R13: 0000000000000000 R14: ffff88802b31f030 R15: dffffc0000000000 FS: 00007f61648dc700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffea0003348000 CR3: 0000000037c64000 CR4: 00000000001426e0 Call Trace: fast_isolate_around mm/compaction.c:1243 [inline] fast_isolate_freepages mm/compaction.c:1418 [inline] isolate_freepages mm/compaction.c:1438 [inline] compaction_alloc+0x1aee/0x22e0 mm/compaction.c:1550 There is no reproducer and it is difficult to hit -- 1 crash every few days. The issue is very similar to the fix in commit 6b0868c820ff ("mm/compaction.c: correct zone boundary handling when resetting pageblock skip hints"). When isolating free pages around a target pageblock, the boundary handling is off by one and can stray into the next pageblock. Triggering the syzbot error requires that the end of pageblock is section or zone aligned, and that the next section is unpopulated. A more subtle consequence of the bug is that pageblocks were being improperly used as migration targets which potentially hurts fragmentation avoidance in the long-term one page at a time. A debugging patch revealed that it's definitely possible to stray outside of a pageblock which is not intended. While syzbot cannot be used to verify this patch, it was confirmed that the debugging warning no longer triggers with this patch applied. It has also been confirmed that the THP allocation stress tests are not degraded by this patch. Link: http://lkml.kernel.org/r/20190510182124.GI18914@techsingularity.net Fixes: e332f741a8dd ("mm, compaction: be selective about what pageblocks to clear skip hints") Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: syzbot+d84c80f9fe26a0f7a734@syzkaller.appspotmail.com Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Qian Cai <cai@lca.pw> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> # v5.1+ Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-18	mm/vmap: add DEBUG_AUGMENT_LOWEST_MATCH_CHECK macro	Uladzislau Rezki (Sony)	1	-0/+43
	This macro adds some debug code to check that vmap allocations are happened in ascending order. By default this option is set to 0 and not active. It requires recompilation of the kernel to activate it. Set to 1, compile the kernel. [urezki@gmail.com: v4] Link: http://lkml.kernel.org/r/20190406183508.25273-4-urezki@gmail.com Link: http://lkml.kernel.org/r/20190402162531.10888-4-urezki@gmail.com Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Roman Gushchin <guro@fb.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joel Fernandes <joelaf@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-18	mm/vmap: add DEBUG_AUGMENT_PROPAGATE_CHECK macro	Uladzislau Rezki (Sony)	1	-0/+48
	This macro adds some debug code to check that the augment tree is maintained correctly, meaning that every node contains valid subtree_max_size value. By default this option is set to 0 and not active. It requires recompilation of the kernel to activate it. Set to 1, compile the kernel. [urezki@gmail.com: v4] Link: http://lkml.kernel.org/r/20190406183508.25273-3-urezki@gmail.com Link: http://lkml.kernel.org/r/20190402162531.10888-3-urezki@gmail.com Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Roman Gushchin <guro@fb.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joel Fernandes <joelaf@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-18	mm/vmalloc.c: keep track of free blocks for vmap allocation	Uladzislau Rezki (Sony)	2	-247/+763
	Patch series "improve vmap allocation", v3. Objective --------- Please have a look for the description at: https://lkml.org/lkml/2018/10/19/786 but let me also summarize it a bit here as well. The current implementation has O(N) complexity. Requests with different permissive parameters can lead to long allocation time. When i say "long" i mean milliseconds. Description ----------- This approach organizes the KVA memory layout into free areas of the 1-ULONG_MAX range, i.e. an allocation is done over free areas lookups, instead of finding a hole between two busy blocks. It allows to have lower number of objects which represent the free space, therefore to have less fragmented memory allocator. Because free blocks are always as large as possible. It uses the augment tree where all free areas are sorted in ascending order of va->va_start address in pair with linked list that provides O(1) access to prev/next elements. Since the tree is augment, we also maintain the "subtree_max_size" of VA that reflects a maximum available free block in its left or right sub-tree. Knowing that, we can easily traversal toward the lowest (left most path) free area. Allocation: ~O(log(N)) complexity. It is sequential allocation method therefore tends to maximize locality. The search is done until a first suitable block is large enough to encompass the requested parameters. Bigger areas are split. I copy paste here the description of how the area is split, since i described it in https://lkml.org/lkml/2018/10/19/786 <snip> A free block can be split by three different ways. Their names are FL_FIT_TYPE, LE_FIT_TYPE/RE_FIT_TYPE and NE_FIT_TYPE, i.e. they correspond to how requested size and alignment fit to a free block. FL_FIT_TYPE - in this case a free block is just removed from the free list/tree because it fully fits. Comparing with current design there is an extra work with rb-tree updating. LE_FIT_TYPE/RE_FIT_TYPE - left/right edges fit. In this case what we do is just cutting a free block. It is as fast as a current design. Most of the vmalloc allocations just end up with this case, because the edge is always aligned to 1. NE_FIT_TYPE - Is much less common case. Basically it happens when requested size and alignment does not fit left nor right edges, i.e. it is between them. In this case during splitting we have to build a remaining left free area and place it back to the free list/tree. Comparing with current design there are two extra steps. First one is we have to allocate a new vmap_area structure. Second one we have to insert that remaining free block to the address sorted list/tree. In order to optimize a first case there is a cache with free_vmap objects. Instead of allocating from slab we just take an object from the cache and reuse it. Second one is pretty optimized. Since we know a start point in the tree we do not do a search from the top. Instead a traversal begins from a rb-tree node we split. <snip> De-allocation. ~O(log(N)) complexity. An area is not inserted straight away to the tree/list, instead we identify the spot first, checking if it can be merged around neighbors. The list provides O(1) access to prev/next, so it is pretty fast to check it. Summarizing. If merged then large coalesced areas are created, if not the area is just linked making more fragments. There is one more thing that i should mention here. After modification of VA node, its subtree_max_size is updated if it was/is the biggest area in its left or right sub-tree. Apart of that it can also be populated back to upper levels to fix the tree. For more details please have a look at the __augment_tree_propagate_from() function and the description. Tests and stressing ------------------- I use the "test_vmalloc.sh" test driver available under "tools/testing/selftests/vm/" since 5.1-rc1 kernel. Just trigger "sudo ./test_vmalloc.sh" to find out how to deal with it. Tested on different platforms including x86_64/i686/ARM64/x86_64_NUMA. Regarding last one, i do not have any physical access to NUMA system, therefore i emulated it. The time of stressing is days. If you run the test driver in "stress mode", you also need the patch that is in Andrew's tree but not in Linux 5.1-rc1. So, please apply it: http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/commit/?id=e0cf7749bade6da318e98e934a24d8b62fab512c After massive testing, i have not identified any problems like memory leaks, crashes or kernel panics. I find it stable, but more testing would be good. Performance analysis -------------------- I have used two systems to test. One is i5-3320M CPU @ 2.60GHz and another is HiKey960(arm64) board. i5-3320M runs on 4.20 kernel, whereas Hikey960 uses 4.15 kernel. I have both system which could run on 5.1-rc1 as well, but the results have not been ready by time i an writing this. Currently it consist of 8 tests. There are three of them which correspond to different types of splitting(to compare with default). We have 3 ones(see above). Another 5 do allocations in different conditions. a) sudo ./test_vmalloc.sh performance When the test driver is run in "performance" mode, it runs all available tests pinned to first online CPU with sequential execution test order. We do it in order to get stable and repeatable results. Take a look at time difference in "long_busy_list_alloc_test". It is not surprising because the worst case is O(N). # i5-3320M How many cycles all tests took: CPU0=646919905370(default) cycles vs CPU0=193290498550(patched) cycles # See detailed table with results here: ftp://vps418301.ovh.net/incoming/vmap_test_results_v2/i5-3320M_performance_default.txt ftp://vps418301.ovh.net/incoming/vmap_test_results_v2/i5-3320M_performance_patched.txt # Hikey960 8x CPUs How many cycles all tests took: CPU0=3478683207 cycles vs CPU0=463767978 cycles # See detailed table with results here: ftp://vps418301.ovh.net/incoming/vmap_test_results_v2/HiKey960_performance_default.txt ftp://vps418301.ovh.net/incoming/vmap_test_results_v2/HiKey960_performance_patched.txt b) time sudo ./test_vmalloc.sh test_repeat_count=1 With this configuration, all tests are run on all available online CPUs. Before running each CPU shuffles its tests execution order. It gives random allocation behaviour. So it is rough comparison, but it puts in the picture for sure. # i5-3320M <default> vs <patched> real 101m22.813s real 0m56.805s user 0m0.011s user 0m0.015s sys 0m5.076s sys 0m0.023s # See detailed table with results here: ftp://vps418301.ovh.net/incoming/vmap_test_results_v2/i5-3320M_test_repeat_count_1_default.txt ftp://vps418301.ovh.net/incoming/vmap_test_results_v2/i5-3320M_test_repeat_count_1_patched.txt # Hikey960 8x CPUs <default> vs <patched> real unknown real 4m25.214s user unknown user 0m0.011s sys unknown sys 0m0.670s I did not manage to complete this test on "default Hikey960" kernel version. After 24 hours it was still running, therefore i had to cancel it. That is why real/user/sys are "unknown". This patch (of 3): Currently an allocation of the new vmap area is done over busy list iteration(complexity O(n)) until a suitable hole is found between two busy areas. Therefore each new allocation causes the list being grown. Due to over fragmented list and different permissive parameters an allocation can take a long time. For example on embedded devices it is milliseconds. This patch organizes the KVA memory layout into free areas of the 1-ULONG_MAX range. It uses an augment red-black tree that keeps blocks sorted by their offsets in pair with linked list keeping the free space in order of increasing addresses. Nodes are augmented with the size of the maximum available free block in its left or right sub-tree. Thus, that allows to take a decision and traversal toward the block that will fit and will have the lowest start address, i.e. it is sequential allocation. Allocation: to allocate a new block a search is done over the tree until a suitable lowest(left most) block is large enough to encompass: the requested size, alignment and vstart point. If the block is bigger than requested size - it is split. De-allocation: when a busy vmap area is freed it can either be merged or inserted to the tree. Red-black tree allows efficiently find a spot whereas a linked list provides a constant-time access to previous and next blocks to check if merging can be done. In case of merging of de-allocated memory chunk a large coalesced area is created. Complexity: ~O(log(N)) [urezki@gmail.com: v3] Link: http://lkml.kernel.org/r/20190402162531.10888-2-urezki@gmail.com [urezki@gmail.com: v4] Link: http://lkml.kernel.org/r/20190406183508.25273-2-urezki@gmail.com Link: http://lkml.kernel.org/r/20190321190327.11813-2-urezki@gmail.com Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Roman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Joel Fernandes <joelaf@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-18	kbuild: check uniqueness of module names	Masahiro Yamada	2	-0/+17
	In the recent build test of linux-next, Stephen saw a build error caused by a broken .tmp_versions/*.mod file: https://lkml.org/lkml/2019/5/13/991 drivers/net/phy/asix.ko and drivers/net/usb/asix.ko have the same basename, and there is a race in generating .tmp_versions/asix.mod Kbuild has not checked this before, and it suddenly shows up with obscure error messages when this kind of race occurs. Non-unique module names cause various sort of problems, but it is not trivial to catch them by eyes. Hence, this script. It checks not only real modules, but also built-in modules (i.e. controlled by tristate CONFIG option, but currently compiled with =y). Non-unique names for built-in modules also cause problems because /sys/modules/ would fall over. For the latest kernel, I tested "make allmodconfig all" (or more quickly "make allyesconfig modules"), and it detected the following: warning: same basename if the following are built as modules: drivers/regulator/88pm800.ko drivers/mfd/88pm800.ko warning: same basename if the following are built as modules: drivers/gpu/drm/bridge/adv7511/adv7511.ko drivers/media/i2c/adv7511.ko warning: same basename if the following are built as modules: drivers/net/phy/asix.ko drivers/net/usb/asix.ko warning: same basename if the following are built as modules: fs/coda/coda.ko drivers/media/platform/coda/coda.ko warning: same basename if the following are built as modules: drivers/net/phy/realtek.ko drivers/net/dsa/realtek.ko Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
2019-05-18	kconfig: Terminate menu blocks with a comment in the generated config	Alexander Popov	1	-1/+12
	Currently menu blocks start with a pretty header but end with nothing in the generated config. So next config options stick together with the options from the menu block. Let's terminate menu blocks in the generated config with a comment and a newline if needed. Example: ... CONFIG_BPF_STREAM_PARSER=y CONFIG_NET_FLOW_LIMIT=y # # Network testing # CONFIG_NET_PKTGEN=y CONFIG_NET_DROP_MONITOR=y # end of Network testing # end of Networking options CONFIG_HAMRADIO=y ... Signed-off-by: Alexander Popov <alex.popov@linux.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>