linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2017-09-27	scsi: ILLEGAL REQUEST + ASC==27 => target failure	Martin Wilck	1	-1/+2
	ASC 0x27 is "WRITE PROTECTED". This error code is returned e.g. by Fujitsu ETERNUS systems under certain conditions for WRITE SAME 16 commands with UNMAP bit set. It should not be treated as a path error. In general, it makes sense to assume that being write protected is a target rather than a path property. Signed-off-by: Martin Wilck <mwilck@suse.com> Acked-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-27	scsi: aacraid: Add a small delay after IOP reset	Guilherme G. Piccoli	1	-0/+2
	Commit 0e9973ed3382 ("scsi: aacraid: Add periodic checks to see IOP reset status") changed the way driver checks if a reset succeeded. Now, after an IOP reset, aacraid immediately start polling a register to verify the reset is complete. This behavior cause regressions on the reset path in PowerPC (at least). Since the delay after the IOP reset was removed by the aforementioned patch, the fact driver just starts to read a register instantly after the reset was issued (by writing in another register) "corrupts" the reset procedure, which ends up failing all the time. The issue highly impacted kdump on PowerPC, since on kdump path we proactively issue a reset in adapter (through the reset_devices kernel parameter). This patch (re-)adds a delay right after IOP reset is issued. Empirically we measured that 3 seconds is enough, but for safety reasons we delay for 5s (and since it was 30s before, 5s is still a small amount). For reference, without this patch we observe the following messages on kdump kernel boot process: [ 76.294] aacraid 0003:01:00.0: IOP reset failed [ 76.294] aacraid 0003:01:00.0: ARC Reset attempt failed [ 86.524] aacraid 0003:01:00.0: adapter kernel panic'd ff. [ 86.524] aacraid 0003:01:00.0: Controller reset type is 3 [ 86.524] aacraid 0003:01:00.0: Issuing IOP reset [146.534] aacraid 0003:01:00.0: IOP reset failed [146.534] aacraid 0003:01:00.0: ARC Reset attempt failed Fixes: 0e9973ed3382 ("scsi: aacraid: Add periodic checks to see IOP reset status") Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Acked-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-25	scsi: scsi_transport_fc: Also check for NOTPRESENT in fc_remote_port_add()	Hannes Reinecke	1	-1/+2
	During failover there is a small race window between fc_remote_port_add() and fc_timeout_deleted_rport(); the latter drops the lock after setting the port to NOTPRESENT, so if fc_remote_port_add() is called right at that time it will fail to detect the existing rport and happily adding a new structure, causing rports to get registered twice. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-25	scsi: scsi_transport_fc: set scsi_target_id upon rescan	Hannes Reinecke	1	-10/+1
	When an rport is found in the bindings array there is no guarantee that it had been a target port, so we need to call fc_remote_port_rolechg() here to ensure the scsi_target_id is set correctly. Otherwise the port will never be scanned. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Tested-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-25	scsi: scsi_transport_iscsi: fix the issue that iscsi_if_rx doesn't parse nlmsg properly	Xin Long	1	-1/+1
	ChunYu found a kernel crash by syzkaller: [ 651.617875] kasan: CONFIG_KASAN_INLINE enabled [ 651.618217] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 651.618731] general protection fault: 0000 [#1] SMP KASAN [ 651.621543] CPU: 1 PID: 9539 Comm: scsi Not tainted 4.11.0.cov #32 [ 651.621938] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 651.622309] task: ffff880117780000 task.stack: ffff8800a3188000 [ 651.622762] RIP: 0010:skb_release_data+0x26c/0x590 [...] [ 651.627260] Call Trace: [ 651.629156] skb_release_all+0x4f/0x60 [ 651.629450] consume_skb+0x1a5/0x600 [ 651.630705] netlink_unicast+0x505/0x720 [ 651.632345] netlink_sendmsg+0xab2/0xe70 [ 651.633704] sock_sendmsg+0xcf/0x110 [ 651.633942] ___sys_sendmsg+0x833/0x980 [ 651.637117] __sys_sendmsg+0xf3/0x240 [ 651.638820] SyS_sendmsg+0x32/0x50 [ 651.639048] entry_SYSCALL_64_fastpath+0x1f/0xc2 It's caused by skb_shared_info at the end of sk_buff was overwritten by ISCSI_KEVENT_IF_ERROR when parsing nlmsg info from skb in iscsi_if_rx. During the loop if skb->len == nlh->nlmsg_len and both are sizeof(*nlh), ev = nlmsg_data(nlh) will acutally get skb_shinfo(SKB) instead and set a new value to skb_shinfo(SKB)->nr_frags by ev->type. This patch is to fix it by checking nlh->nlmsg_len properly there to avoid over accessing sk_buff. Reported-by: ChunYu Wang <chunwang@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-15	scsi: aacraid: error: testing array offset 'bus' after use	Nikola Pajkovsky	1	-8/+12
	Fix possible indexing array of bound for &aac->hba_map[bus][cid], where bus and cid boundary check happens later. Fixes: 0d643ff3c353 ("scsi: aacraid: use aac_tmf_callback for reset fib") Signed-off-by: Nikola Pajkovsky <npajkovsky@suse.cz> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-15	scsi: lpfc: Don't return internal MBXERR_ERROR code from probe function	Stefano Brivio	1	-0/+1
	Internal error codes happen to be positive, thus the PCI driver core won't treat them as failure, but we do. This would cause a crash later on as lpfc_pci_remove_one() is called (e.g. as shutdown function). Fixes: 6d368e532168 ("[SCSI] lpfc 8.3.24: Add resource extent support") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-15	scsi: aacraid: Fix 2T+ drives on SmartIOC-2000	Dave Carroll	2	-6/+11
	The logic for supporting large drives was previously tied to 4Kn support for SmartIOC-2000. As SmartIOC-2000 does not support volumes using 4Kn drives, use the intended option flag AAC_OPT_NEW_COMM_64 to determine support for volumes greater than 2T. Cc: <stable@vger.kernel.org> Signed-off-by: Dave Carroll <david.carroll@microsemi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-15	scsi: sg: fixup infoleak when using SG_GET_REQUEST_TABLE	Hannes Reinecke	1	-3/+2
	When calling SG_GET_REQUEST_TABLE ioctl only a half-filled table is returned; the remaining part will then contain stale kernel memory information. This patch zeroes out the entire table to avoid this issue. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-15	scsi: sg: factor out sg_fill_request_table()	Hannes Reinecke	1	-26/+35
	Factor out sg_fill_request_table() for better readability. [mkp: typos, applied by hand] Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-15	scsi: sd: Remove unnecessary condition in sd_read_block_limits()	Lukas Czerner	1	-2/+0
	After series of changes around WRITE_SAME and UNMAP setup we ended up with leftover unnecessary condition. Remove it. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-14	scsi: acornscsi: fix build error	Arnd Bergmann	1	-3/+3
	A cleanup patch introduced a fatal typo from inbalanced curly braces: drivers/scsi/arm/acornscsi.c: In function 'acornscsi_host_reset': drivers/scsi/arm/acornscsi.c:2773:1: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] drivers/scsi/arm/acornscsi.c:2795:12: error: invalid storage class for function 'acornscsi_show_info' static int acornscsi_show_info(struct seq_file m, struct Scsi_Host instance) The same patch incorrectly changed the argument type of the reset handler, as shown by this warning: drivers/scsi/arm/acornscsi.c:2888:27: error: initialization of 'int ()(struct scsi_cmnd )' from incompatible pointer type 'int ()(struct Scsi_Host )' [-Werror=incompatible-pointer-types] .eh_host_reset_handler = acornscsi_host_reset, This removes one the extraneous opening brace and reverts the argument type change. [mkp: fixed checkpatch complaint] Fixes: 74fa80ee3fae ("scsi: acornscsi: move bus reset to host reset") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-14	scsi: scsi_transport_fc: fix NULL pointer dereference in fc_bsg_job_timeout	Christoph Hellwig	1	-1/+1
	bsg-lib now embeddeds the job structure into the request, and req->special can't be used anymore. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-13	Fix up MAINTAINERS file sorting	Linus Torvalds	1	-55/+55
	Another merge window, another MAINTAINERS file disaster. People have serious problems with the alphabet and sorting, and poor Jérôme Glisse and Radim Krčmář get their names mangled by locale issues, turning them into some mangled mess (probably others do too, but those two stood out when sorting things again). And we now have two copies of the same 'AS3645A LED FLASH CONTROLLER DRIVER' in the tree and in the MAINTAINERS file, but that's a separate issue - the duplication is real, and I left them as two entries for the same name. This does not try to sort the actual section pattern entries, although I may end up doing that later. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-12	xfs: XFS_IS_REALTIME_INODE() should be false if no rt device present	Richard Wareing	1	-1/+8
	If using a kernel with CONFIG_XFS_RT=y and we set the RHINHERIT flag on a directory in a filesystem that does not have a realtime device and create a new file in that directory, it gets marked as a real time file. When data is written and a fsync is issued, the filesystem attempts to flush a non-existent rt device during the fsync process. This results in a crash dereferencing a null buftarg pointer in xfs_blkdev_issue_flush(): BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: xfs_blkdev_issue_flush+0xd/0x20 ..... Call Trace: xfs_file_fsync+0x188/0x1c0 vfs_fsync_range+0x3b/0xa0 do_fsync+0x3d/0x70 SyS_fsync+0x10/0x20 do_syscall_64+0x4d/0xb0 entry_SYSCALL64_slow_path+0x25/0x25 Setting RT inode flags does not require special privileges so any unprivileged user can cause this oops to occur. To reproduce, confirm kernel is compiled with CONFIG_XFS_RT=y and run: # mkfs.xfs -f /dev/pmem0 # mount /dev/pmem0 /mnt/test # mkdir /mnt/test/foo # xfs_io -c 'chattr +t' /mnt/test/foo # xfs_io -f -c 'pwrite 0 5m' -c fsync /mnt/test/foo/bar Or just run xfstests with MKFS_OPTIONS="-d rtinherit=1" and wait. Kernels built with CONFIG_XFS_RT=n are not exposed to this bug. Fixes: f538d4da8d52 ("[XFS] write barrier support") Cc: <stable@vger.kernel.org> Signed-off-by: Richard Wareing <rwareing@fb.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-12	f2fs: hurry up to issue discard after io interruption	Chao Yu	1	-2/+15
	Once we encounter I/O interruption during issuing discards, we will delay long time before next round, but if system status is I/O idle during the time, it may loses opportunity to issue discards. So this patch changes to hurry up to issue discard after io interruption. Besides, this patch also fixes to issue discards accurately with assigned rate. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-09-12	f2fs: fix to show correct discard_granularity in sysfs	Chao Yu	1	-0/+2
	Fix below incorrect display when reading discard_granularity sysfs node. $ cat /sys/fs/f2fs/<device>/discard_granularity $ 16 $ echo 32 > /sys/fs/f2fs/<device>/discard_granularity $ cat /sys/fs/f2fs/<device>/discard_granularity $ 16 Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-09-12	f2fs: detect dirty inode in evict_inode	Chao Yu	1	-0/+3
	Add a bugon in f2fs_evict_inode to detect inconsistent status between inode cache and related node page cache. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-09-12	ovl: fix false positive ESTALE on lookup	Amir Goldstein	1	-4/+7
	Commit b9ac5c274b8c ("ovl: hash overlay non-dir inodes by copy up origin") verifies that the origin lower inode stored in the overlayfs inode matched the inode of a copy up origin dentry found by lookup. There is a false positive result in that check when lower fs does not support file handles and copy up origin cannot be followed by file handle at lookup time. The false negative happens when finding an overlay inode in cache on a copied up overlay dentry lookup. The overlay inode still 'remembers' the copy up origin inode, but the copy up origin dentry is not available for verification. Relax the check in case copy up origin dentry is not available. Fixes: b9ac5c274b8c ("ovl: hash overlay non-dir inodes by copy up...") Cc: <stable@vger.kernel.org> # v4.13 Reported-by: Jordi Pujol <jordipujolp@gmail.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-09-12	fuse: getattr cleanup	Miklos Szeredi	3	-23/+18
	The refreshed argument isn't used by any caller, get rid of it. Use a helper for just updating the inode (no need to fill in a kstat). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-09-12	fuse: honor iocb sync flags on write	Miklos Szeredi	3	-22/+28
	If the IOCB_DSYNC flag is set a sync is not being performed by fuse_file_write_iter. Honor IOCB_DSYNC/IOCB_SYNC by setting O_DYSNC/O_SYNC respectively in the flags filed of the write request. We don't need to sync data or metadata, since fuse_perform_write() does write-through and the filesystem is responsible for updating file times. Original patch by Vitaly Zolotusky. Reported-by: Nate Clark <nate@neworld.us> Cc: Vitaly Zolotusky <vitaly@unitc.com>. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-09-12	fuse: allow server to run in different pid_ns	Miklos Szeredi	2	-9/+7
	Commit 0b6e9ea041e6 ("fuse: Add support for pid namespaces") broke Sandstorm.io development tools, which have been sending FUSE file descriptors across PID namespace boundaries since early 2014. The above patch added a check that prevented I/O on the fuse device file descriptor if the pid namespace of the reader/writer was different from the pid namespace of the mounter. With this change passing the device file descriptor to a different pid namespace simply doesn't work. The check was added because pids are transferred to/from the fuse userspace server in the namespace registered at mount time. To fix this regression, remove the checks and do the following: 1) the pid in the request header (the pid of the task that initiated the filesystem operation) is translated to the reader's pid namespace. If a mapping doesn't exist for this pid, then a zero pid is used. Note: even if a mapping would exist between the initiator task's pid namespace and the reader's pid namespace the pid will be zero if either mapping from initator's to mounter's namespace or mapping from mounter's to reader's namespace doesn't exist. 2) The lk.pid value in setlk/setlkw requests and getlk reply is left alone. Userspace should not interpret this value anyway. Also allow the setlk/setlkw operations if the pid of the task cannot be represented in the mounter's namespace (pid being zero in that case). Reported-by: Kenton Varda <kenton@sandstorm.io> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 0b6e9ea041e6 ("fuse: Add support for pid namespaces") Cc: <stable@vger.kernel.org> # v4.12+ Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Seth Forshee <seth.forshee@canonical.com>
2017-09-12	ALSA: seq: Cancel pending autoload work at unbinding device	Takashi Iwai	1	-0/+3
	ALSA sequencer core has a mechanism to load the enumerated devices automatically, and it's performed in an off-load work. This seems causing some race when a sequencer is removed while the pending autoload work is running. As syzkaller spotted, it may lead to some use-after-free: BUG: KASAN: use-after-free in snd_rawmidi_dev_seq_free+0x69/0x70 sound/core/rawmidi.c:1617 Write of size 8 at addr ffff88006c611d90 by task kworker/2:1/567 CPU: 2 PID: 567 Comm: kworker/2:1 Not tainted 4.13.0+ #29 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Workqueue: events autoload_drivers Call Trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x192/0x22c lib/dump_stack.c:52 print_address_description+0x78/0x280 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x230/0x340 mm/kasan/report.c:409 __asan_report_store8_noabort+0x1c/0x20 mm/kasan/report.c:435 snd_rawmidi_dev_seq_free+0x69/0x70 sound/core/rawmidi.c:1617 snd_seq_dev_release+0x4f/0x70 sound/core/seq_device.c:192 device_release+0x13f/0x210 drivers/base/core.c:814 kobject_cleanup lib/kobject.c:648 [inline] kobject_release lib/kobject.c:677 [inline] kref_put include/linux/kref.h:70 [inline] kobject_put+0x145/0x240 lib/kobject.c:694 put_device+0x25/0x30 drivers/base/core.c:1799 klist_devices_put+0x36/0x40 drivers/base/bus.c:827 klist_next+0x264/0x4a0 lib/klist.c:403 next_device drivers/base/bus.c:270 [inline] bus_for_each_dev+0x17e/0x210 drivers/base/bus.c:312 autoload_drivers+0x3b/0x50 sound/core/seq_device.c:117 process_one_work+0x9fb/0x1570 kernel/workqueue.c:2097 worker_thread+0x1e4/0x1350 kernel/workqueue.c:2231 kthread+0x324/0x3f0 kernel/kthread.c:231 ret_from_fork+0x25/0x30 arch/x86/entry/entry_64.S:425 The fix is simply to assure canceling the autoload work at removing the device. Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-09-12	ALSA: firewire: Use common error handling code in snd_motu_stream_start_duplex()	Markus Elfring	1	-8/+8
	Add a jump target so that a bit of exception handling can be better reused at the end of this function. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-09-11	f2fs: clear radix tree dirty tag of pages whose dirty flag is cleared	Daeho Jeong	2	-0/+14
	On a senario like writing out the first dirty page of the inode as the inline data, we only cleared dirty flags of the pages, but didn't clear the dirty tags of those pages in the radix tree. If we don't clear the dirty tags of the pages in the radix tree, the inodes which contain the pages will be marked with I_DIRTY_PAGES again and again, and writepages() for the inodes will be invoked in every writeback period. As a result, nothing will be done in every writepages() for the inodes and it will just consume CPU time meaninglessly. Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-09-11	f2fs: speed up gc_urgent mode with SSR	Jaegeuk Kim	3	-13/+16
	This patch activates SSR in gc_urgent mode. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-09-11	f2fs: better to wait for fstrim completion	Jaegeuk Kim	1	-1/+6
	In android, we'd better wait for fstrim completion instead of issuing the discard commands asynchronous. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-09-11	block: directly insert blk-mq request from blk_insert_cloned_request()	Jens Axboe	3	-1/+23
	A NULL pointer crash was reported for the case of having the BFQ IO scheduler attached to the underlying blk-mq paths of a DM multipath device. The crash occured in blk_mq_sched_insert_request()'s call to e->type->ops.mq.insert_requests(). Paolo Valente correctly summarized why the crash occured with: "the call chain (dm_mq_queue_rq -> map_request -> setup_clone -> blk_rq_prep_clone) creates a cloned request without invoking e->type->ops.mq.prepare_request for the target elevator e. The cloned request is therefore not initialized for the scheduler, but it is however inserted into the scheduler by blk_mq_sched_insert_request." All said, a request-based DM multipath device's IO scheduler should be the only one used -- when the original requests are issued to the underlying paths as cloned requests they are inserted directly in the underlying dispatch queue(s) rather than through an additional elevator. But commit bd166ef18 ("blk-mq-sched: add framework for MQ capable IO schedulers") switched blk_insert_cloned_request() from using blk_mq_insert_request() to blk_mq_sched_insert_request(). Which incorrectly added elevator machinery into a call chain that isn't supposed to have any. To fix this introduce a blk-mq private blk_mq_request_bypass_insert() that blk_insert_cloned_request() calls to insert the request without involving any elevator that may be attached to the cloned request's request_queue. Fixes: bd166ef183c2 ("blk-mq-sched: add framework for MQ capable IO schedulers") Cc: stable@vger.kernel.org Reported-by: Bart Van Assche <Bart.VanAssche@wdc.com> Tested-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-09-11	mm/backing-dev.c: fix an error handling path in 'cgwb_create()'	Christophe JAILLET	1	-2/+4
	If the 'kmalloc' fails, we must go through the existing error handling path. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Fixes: 52ebea749aae ("writeback: make backing_dev_info host cgroup-specific bdi_writebacks") Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-09-11	string.h: un-fortify memcpy_and_pad	Martin Wilck	1	-13/+2
	The way I'd implemented the new helper memcpy_and_pad with __FORTIFY_INLINE caused compiler warnings for certain kernel configurations. This helper is only used in a single place at this time, and thus doesn't benefit much from fortification. So simplify the code by dropping fortification support for now. Fixes: 01f33c336e2d "string.h: add memcpy_and_pad()" Signed-off-by: Martin Wilck <mwilck@suse.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-09-11	nvme-pci: implement the HMB entry number and size limitations	Christoph Hellwig	4	-2/+13
	Adds support for the new Host Memory Buffer Minimum Descriptor Entry Size and Host Memory Maximum Descriptors Entries field that were added in TP 4002 HMB Enhancements. These allow the controller to advertise limits for the usual number of segments in the host memory buffer, as well as a minimum usable per-segment size. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com>
2017-09-11	nvme-pci: propagate (some) errors from host memory buffer setup	Christoph Hellwig	1	-6/+12
	We want to catch command execution errors when resetting the device, so propagate errors from the Set Features when setting up the host memory buffer. We keep ignoring memory allocation failures, as the spec clearly says that the controller must work without a host memory buffer. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Cc: stable@vger.kernel.org
2017-09-11	nvme-pci: use appropriate initial chunk size for HMB allocation	Akinobu Mita	1	-1/+1
	The initial chunk size for host memory buffer allocation is currently PAGE_SIZE << MAX_ORDER. MAX_ORDER order allocation is usually failed without CONFIG_DMA_CMA. So the HMB allocation is retried with chunk size PAGE_SIZE << (MAX_ORDER - 1) in general, but there is no problem if the retry allocation works correctly. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> [hch: rebased] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Cc: stable@vger.kernel.org
2017-09-11	nvme-pci: fix host memory buffer allocation fallback	Christoph Hellwig	1	-18/+30
	nvme_alloc_host_mem currently contains two loops that are interwinded, and the outer retry loop turns out to be broken. Fix this by untangling the two. Based on a report an initial patch from Akinobu Mita. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Akinobu Mita <akinobu.mita@gmail.com> Tested-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Cc: stable@vger.kernel.org
2017-09-11	nvme: fix lightnvm check	Christoph Hellwig	4	-35/+14
	nvme_nvm_ns_supported assumes every device is a pci_dev, which leads to reading an incorrect field, or possible even a dereference of unallocated memory for fabrics controllers. Fix this by introducing a quirk for lighnvm capable devices instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Matias Bjørling <mb@lightnvm.io> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2017-09-11	block: fix integer overflow in __blkdev_sectors_to_bio_pages()	Mikulas Patocka	1	-2/+2
	Fix possible integer overflow in __blkdev_sectors_to_bio_pages if sector_t is 32-bit. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Fixes: 615d22a51c04 ("block: Fix __blkdev_issue_zeroout loop") Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-09-11	block: sed-opal: Set MBRDone on S3 resume path if TPER is MBREnabled	Scott Bauer	2	-0/+33
	Users who are booting off their Opal enabled drives are having issues when they have a shadow MBR set up after s3/resume cycle. When the Drive has a shadow MBR setup the MBRDone flag is set to false upon power loss (S3/S4/S5). When the MBRDone flag is false I/O to LBA 0 -> LBA_END_MBR are remapped to the shadow mbr of the drive. If the drive contains useful data in the 0 -> end_mbr range upon s3 resume the user can never get to that data as the drive will keep remapping it to the MBR. To fix this when we unlock on S3 resume, we need to tell the drive that we're done with the shadow mbr (even though we didnt use it) by setting true to MBRDone. This way the drive will stop the remapping and the user can access their data. Acked-by Jon Derrick: <jonathan.derrick@intel.com> Signed-off-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-09-11	block: tolerate tracing of NULL bio	Greg Thelen	1	-3/+2
	__get_request() can call trace_block_getrq() with bio=NULL which causes block_get_rq::TP_fast_assign() to deref a NULL pointer and panic. Syzkaller fuzzer panics with linux-next (1d53d908b79d7870d89063062584eead4cf83448): kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN Modules linked in: CPU: 0 PID: 2983 Comm: syzkaller401111 Not tainted 4.13.0-rc7-next-20170901+ #13 task: ffff8801cf1da000 task.stack: ffff8801ce440000 RIP: 0010:perf_trace_block_get_rq+0x697/0x970 include/trace/events/block.h:384 RSP: 0018:ffff8801ce4473f0 EFLAGS: 00010246 RAX: ffff8801cf1da000 RBX: 1ffff10039c88e84 RCX: 1ffffd1ffff84d27 RDX: dffffc0000000001 RSI: 1ffff1003b643e7a RDI: ffffe8ffffc26938 RBP: ffff8801ce447530 R08: 1ffff1003b643e6c R09: ffffe8ffffc26964 R10: 0000000000000002 R11: fffff91ffff84d2d R12: ffffe8ffffc1f890 R13: ffffe8ffffc26930 R14: ffffffff85cad9e0 R15: 0000000000000000 FS: 0000000002641880(0000) GS:ffff8801db200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000043e670 CR3: 00000001d1d7a000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: trace_block_getrq include/trace/events/block.h:423 [inline] __get_request block/blk-core.c:1283 [inline] get_request+0x1518/0x23b0 block/blk-core.c:1355 blk_old_get_request block/blk-core.c:1402 [inline] blk_get_request+0x1d8/0x3c0 block/blk-core.c:1427 sg_scsi_ioctl+0x117/0x750 block/scsi_ioctl.c:451 sg_ioctl+0x192d/0x2ed0 drivers/scsi/sg.c:1070 vfs_ioctl fs/ioctl.c:45 [inline] do_vfs_ioctl+0x1b1/0x1530 fs/ioctl.c:685 SYSC_ioctl fs/ioctl.c:700 [inline] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691 entry_SYSCALL_64_fastpath+0x1f/0xbe block_get_rq::TP_fast_assign() has multiple redundant ->dev assignments. Only one of them is NULL tolerant. Favor the NULL tolerant one. Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index") Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Thelen <gthelen@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-09-11	x86/cpu: Remove unused and undefined __generic_processor_info() declaration	Dou Liyang	2	-2/+1
	The following revert: 2b85b3d22920 ("x86/acpi: Restore the order of CPU IDs") ... got rid of __generic_processor_info(), but forgot to remove its declaration in mpspec.h. Remove the declaration and update the comments as well. Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: lenb@kernel.org Link: http://lkml.kernel.org/r/1505101403-29100-1-git-send-email-douly.fnst@cn.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-11	sched/fair: Fix nuisance kernel-doc warning	Randy Dunlap	1	-1/+1
	Work around kernel-doc warning ('*' in Sphinx doc means "emphasis"): ../kernel/sched/fair.c:7584: WARNING: Inline emphasis start-string without end-string. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/f18b30f9-6251-6d86-9d44-16501e386891@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-10	Revert "firmware: add sanity check on shutdown/suspend"	Linus Torvalds	2	-110/+0
	This reverts commit 81f95076281fdd3bc382e004ba1bce8e82fccbce. It causes random failures of firmware loading at resume time (well, random for me, it seems to be more reliable for others) because the firmware disabling is not actually synchronous with any particular resume event, and at least the btusb driver that uses a workqueue to load the firmware at resume seems to occasionally hit the "firmware loading is disabled" logic because the firmware loader hasn't gotten the resume event yet. Some kind of sanity check for not trying to load firmware when it's not possible might be a good thing, but this commit was not it. Greg seems to have silently suffered the same issue, and pointed to the likely culprit, and Gabriel C verified the revert fixed it for him too. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Pointed-at-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Tested-by: Gabriel C <nix.or.die@gmail.com> Cc: Luis R. Rodriguez <mcgrof@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-10	m68k: Add braces to __pmd(x) initializer to kill compiler warning	Geert Uytterhoeven	1	-1/+1
	With gcc 4.1.2: include/linux/swapops.h: In function ‘swp_entry_to_pmd’: include/linux/swapops.h:294: warning: missing braces around initializer include/linux/swapops.h:294: warning: (near initialization for ‘(anonymous).pmd’) Due to a GCC zero initializer bug (#53119), the standard "(pmd_t){ 0 }" initializer is not accepted by all GCC versions. In addition, on m68k pmd_t is an array instead of a single value, so we need "(pmd_t){ { 0 }, }" instead of "(pmd_t){ 0 }". Based on commit 9157259d16a8 ("mm: add pmd_t initializer __pmd() to work around a GCC bug.") for sparc32. Fixes: 616b8371539a ("mm: thp: enable thp migration in generic path") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-10	x86/mm/64: Fix an incorrect warning with CONFIG_DEBUG_VM=y, !PCID	Andy Lutomirski	1	-1/+1
	I've been staring at the word PCID too long. Fixes: f13c8e8c58ba ("x86/mm: Reinitialize TLB state on hotplug and resume") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-09	sparc64: Handle additional cases of no fault loads	Rob Gardner	1	-0/+51
	Load instructions using ASI_PNF or other no-fault ASIs should not cause a SIGSEGV or SIGBUS. A garden variety unmapped address follows the TSB miss path, and when no valid mapping is found in the process page tables, the miss handler checks to see if the access was via a no-fault ASI. It then fixes up the target register with a zero, and skips the no-fault load instruction. But different paths are taken for data access exceptions and alignment traps, and these do not respect the no-fault ASI. We add checks in these paths for the no-fault ASI, and fix up the target register and TPC just like in the TSB miss case. Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-09	sparc64: speed up etrap/rtrap on NG2 and later processors	Anthony Yznaga	5	-6/+45
	For many sun4v processor types, reading or writing a privileged register has a latency of 40 to 70 cycles. Use a combination of the low-latency allclean, otherw, normalw, and nop instructions in etrap and rtrap to replace 2 rdpr and 5 wrpr instructions and improve etrap/rtrap performance. allclean, otherw, and normalw are available on NG2 and later processors. The average ticks to execute the flush windows trap ("ta 0x3") with and without this patch on select platforms: CPU Not patched Patched % Latency Reduction NG2 1762 1558 -11.58 NG4 3619 3204 -11.47 M7 3015 2624 -12.97 SPARC64-X 829 770 -7.12 Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-09	Bluetooth: Properly check L2CAP config option output buffer length	Ben Seri	1	-37/+43
	Validate the output buffer length for L2CAP config requests and responses to avoid overflowing the stack buffer used for building the option blocks. Cc: stable@vger.kernel.org Signed-off-by: Ben Seri <ben@armis.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-09	NFS: Count the bytes of skipped subrequests in nfs_lock_and_join_requests()	Trond Myklebust	1	-1/+5
	If we skip a subrequest due to a zero refcount, we should still count the byte range that it covered so that we accurately reconstruct the original request size. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-09-09	NFS: Don't hold the group lock when calling nfs_release_request()	Trond Myklebust	1	-1/+1
	That can deadlock if this is the last reference since nfs_page_group_destroy() calls nfs_page_group_sync_on_bit(). Note that even if the page was removed from the subpage list, the req->wb_head could still be pointing to the old head. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-09-09	libnvdimm, btt: fix format string warnings	Randy Dunlap	1	-2/+2
	Fix format warnings (seen on i386) in nvdimm/btt.c: ../drivers/nvdimm/btt.c: In function ‘btt_map_init’: ../drivers/nvdimm/btt.c:430:3: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘size_t’ [-Wformat=] dev_WARN_ONCE(to_dev(arena), size < 512, ^ ../drivers/nvdimm/btt.c: In function ‘btt_log_init’: ../drivers/nvdimm/btt.c:474:3: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘size_t’ [-Wformat=] dev_WARN_ONCE(to_dev(arena), size < 512, ^ Fixes: 86652d2eb347 ("libnvdimm, btt: clean up warning and error messages") Reported-by: Arnd Bergmann <arnd@arndb.de> Reported-by: kbuild test robot <fengguang.wu@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-09-09	remove gperf left-overs from build system	Linus Torvalds	1	-9/+0
	I removed all the gperf use, but not the Makefile rules. Sam Ravnborg says I get bonus points for cleaning this up. I'll hold him to it. Requested-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>