linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2016-03-10	target/user: Fix size_t format-spec build warning	Nicholas Bellinger	1	-3/+2
	Fix the following printk size_t warning as per 0-day build: All warnings (new ones prefixed by >>): drivers/target/target_core_user.c: In function 'is_ring_space_avail': >> drivers/target/target_core_user.c:385:12: warning: format '%lu' >> expects argument of type 'long unsigned int', but argument 3 has type >> 'size_t {aka unsigned int}' [-Wformat=] pr_debug("no data space: only %lu available, but ask for %lu\n", ^ Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	target/user: Don't free expired command when time out	Sheng Yang	1	-8/+17
	Which would result in NPE after when userspace connected again. Expired command would be freed either when handling command(by userspace), or when device was tearing down Reviewed-by: Andy Grover <agrover@redhat.com> Signed-off-by: Sheng Yang <sheng@yasker.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	target/user: Introduce data_bitmap, replace data_length/data_head/data_tail	Sheng Yang	1	-100/+121
	The data_bitmap was introduced to support asynchornization accessing of data area. We divide mailbox data area into blocks, and use data_bitmap to track the usage of data area. All the new command's data would start with a new block, and may left unusable space after it end. But it's easy to track using data_bitmap. Now we can allocate data area for asynchronization accessing from userspace, since we can track the allocation using data_bitmap. The userspace part would be the same as Maxim's previous asynchronized implementation. Signed-off-by: Sheng Yang <sheng@yasker.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	target/user: Free data ring in unified function	Sheng Yang	1	-10/+12
	Prepare for data_bitmap in the next patch. Reviewed-by: Andy Grover <agrover@redhat.com> Signed-off-by: Sheng Yang <sheng@yasker.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	target/user: Use iovec[] to describe continuous area	Sheng Yang	1	-15/+26
	We don't need use one iovec per scatter-gather list entry, since data area are continuous. Reviewed-by: Andy Grover <agrover@redhat.com> Signed-off-by: Sheng Yang <sheng@yasker.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	target: Remove enum transport_lunflags_table	Andy Grover	6	-71/+45
	se_dev_entry.lun_flags and se_lun.lun_access are only used for keeping track of read-write vs. read-only state. Since this is an either/or thing we can represent it as bool, and remove the unneeded enum transport_lunflags_table, which is left over from when there were more flags. Change code that uses this enum to just use true/false, and make it clear through variable and param names that true means read-only, false means read-write. Signed-off-by: Andy Grover <agrover@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	target/iblock: pass WRITE_SAME to device if possible	Mike Christie	1	-0/+34
	This patch has iblock pass the WRITE_SAME command to the device for offloading if possible. It is similar to what is done for UNMAP/discards, except that we export a large max write same value to the initiator, and then rely on the block layer to break it up into multiple requests if it cannot fit into one. v2. - Drop file backend changes and move helper function to iblock backend. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	iser-target: Kill the ->isert_cmd back pointer in struct iser_tx_desc	Christoph Hellwig	2	-9/+8
	We only use the pointer when processing regular iSER commands, and it then always points to the struct iser_cmd that contains the TX descriptor. Remove it and rely on container_of to save a little space and avoid a pointer that is updated multiple times per processed command. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	iser-target: Kill struct isert_rdma_wr	Christoph Hellwig	2	-130/+119
	There is exactly one instance per struct isert_cmd, so merge the two to simplify everyones life. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	iser-target: Convert to new CQ API	Christoph Hellwig	2	-243/+184
	Use the workqueue based CQ type similar to what isert was using previously, and properly split up the completion handlers. Note that this also takes special care to handle the magic login WRs separately, and also renames the submission functions so that it's clear that they are only to be used for the login buffers. (Fix up isert_print_wc usage in isert_beacon_done - nab) Signed-off-by: Christoph Hellwig <hch@lst.de> [sagig: added iscsi conn reinstatement in non-flush error completions and added error completion type print] Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	iser-target: Split and properly type the login buffer	Christoph Hellwig	2	-30/+29
	The login receive buffer is used as a iser_rx_desc, so type it as such in struct isert_conn and allocate the exactly right space for it. The TX buffer is moved to a separate variable and properly sized as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	iser-target: Remove ISER_RECV_DATA_SEG_LEN	Christoph Hellwig	2	-10/+8
	This is the same as ISCSI_DEF_MAX_RECV_SEG_LEN (and must be the same given the structure layouts), so just use that constant instead. This also allows removing ISER_RX_LOGIN_SIZE in favor of ISER_RX_PAYLOAD_SIZE. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	iser-target: Remove impossible condition from isert_wait_conn	Jenny Derzhavetz	1	-8/+0
	We can never get to isert_wait_conn in INIT state anymore, so get rid of this condition. Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	iser-target: Remove redundant wait in release_conn	Jenny Derzhavetz	2	-8/+0
	With current termination flow we call release_conn after completion. Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	iser-target: Rework connection termination	Jenny Derzhavetz	2	-56/+52
	When we receive an event that triggers connection termination, we have a a couple of things we may want to do: 1. In case we are already terminating, bailout early 2. In case we are connected but not bound, disconnect and schedule a connection cleanup silently (don't reinstate) 3. In case we are connected and bound, disconnect and reinstate the connection This rework fixes a bug that was detected against a mis-behaved initiator which rejected our rdma_cm accept, in this stage the isert_conn is no bound and reinstate caused a bogus dereference. What's great about this is that we don't need the post_recv_buf_count anymore, so get rid of it. Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # v3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	iser-target: Separate flows for np listeners and connections cma events	Jenny Derzhavetz	1	-6/+5
	No need to restrict this check to specific events. Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # v3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	iser-target: Add new state ISER_CONN_BOUND to isert_conn	Jenny Derzhavetz	2	-2/+6
	We need an indication that isert_conn->iscsi_conn binding has happened so we'll know not to invoke a connection reinstatement on an unbound connection which will lead to a bogus isert_conn->conn dereferece. Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # v3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	iser-target: Fix identification of login rx descriptor type	Jenny Derzhavetz	1	-1/+2
	Once connection request is accepted, one rx descriptor is posted to receive login request. This descriptor has rx type, but is outside the main pool of rx descriptors, and thus was mistreated as tx type. Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # v3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	qla2xxx: Add DebugFS node for target sess list.	Quinn Tran	3	-20/+89
	#cat /sys/kernel/debug/qla2xxx/qla2xxx_31/tgt_sess qla2xxx_31 Port ID Port Name Handle ff:fc:01 21:fd:00:05:33:c7:ec:16 0 01:0e:00 21:00:00:24:ff:7b:8a:e4 1 01:0f:00 21:00:00:24:ff:7b:8a:e5 2 .... (Drop ->check_initiator_node_acl() parameter usage - nab) Signed-off-by: Quinn Tran <quinn.tran@qlogic.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	ib_srpt: Convert to percpu_ida tag allocation	Nicholas Bellinger	2	-40/+17
	This patch converts ib_srpt to use existing percpu_ida tag pre-allocation for struct srpt_send_ioctx. This allows ib_srpt to drop it's internal pre-allocation mechanisms with the extra spin_lock_irqsave, and use percpu_ida common code for doing this. Cc: Vu Pham <vu@mellanox.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	tcm_fc: Convert to TARGET_SCF_ACK_KREF I/O + TMR krefs	Nicholas Bellinger	1	-4/+16
	This patch converts tcm_fc to modern TARGET_SCF_ACK_KREF usage for ft_queue_status(), and fixes ft_check_stop_free() to return transport_generic_free_cmd() for ->cmd_kref. It also converts TM request -> ft_send_tm() to use ACK_KREF, and update ft_queue_tm_resp() to drop the outstanding kref after queueing TM response into fabric code. Cc: Vasu Dev <vasu.dev@linux.intel.com> Cc: Mark Rustad <mark.d.rustad@intel.com> Cc: Robert Love <robert.w.love@intel.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	xen-scsiback: Convert to TARGET_SCF_ACK_KREF I/O krefs	Nicholas Bellinger	1	-28/+28
	This patch converts xen-scsiback to modern TARGET_SCF_ACK_KREF usage for scsiback_cmd_done() callback path. It also also converts TMR -> scsiback_device_action() to use modern target_submit_tmr() code. Acked-by: Juergen Gross <jgross@suse.com> Tested-by: Juergen Gross <jgross@suse.com> Cc: Hannes Reinecke <hare@suse.de> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	xen-scsiback: Convert to percpu_ida tag allocation	Nicholas Bellinger	1	-78/+89
	This patch converts xen-scsiback to use percpu_ida tag pre-allocation for struct vscsibk_pend descriptor, in order to avoid fast-path struct vscsibk_pend memory allocations. Note by default this is currently hardcoded to 128. (Add wrapper for handling pending_req tag failure - Juergen) (Drop left-over se_cmd memset in scsiback_cmd_exec - Juergen) Acked-by: Juergen Gross <jgross@suse.com> Tested-by: Juergen Gross <jgross@suse.com> Cc: Hannes Reinecke <hare@suse.de> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	usb-gadget/tcm: Convert to TARGET_SCF_ACK_KREF I/O krefs	Nicholas Bellinger	1	-40/+11
	This patch drops struct usbg_cmd->kref internal kref-erence usage, for proper TARGET_SCF_ACK_KREF conversion. Tested-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	usb-gadget/tcm: Conversion to percpu_ida tag pre-allocation	Nicholas Bellinger	2	-39/+60
	This patch converts usb-gadget target to use percpu_ida tag pre-allocation for struct usbg_cmd descriptor, in order to avoid fast-path struct usbg_cmd memory allocations. Note by default this is currently hardcoded to 128. Tested-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	sbp-target: Convert to TARGET_SCF_ACK_KREF I/O krefs	Nicholas Bellinger	1	-14/+16
	This patch converts sbp-target to modern TARGET_SCF_ACK_KREF usage for sbp_send_status() callback path, and drops the now obsolete sbp_free_request() failure path calls. Acked-by: Chris Boot <bootc@bootc.net> Tested-by: Chris Boot <bootc@bootc.net> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	sbp-target: Conversion to percpu_ida tag pre-allocation	Nicholas Bellinger	1	-6/+30
	This patch converts sbp-target to use struct sbp_target_request descriptor tag pre-allocation using percpu_ida. (Fix sbp_mgt_get_req() IS_ERR failure checking - Dan Carpenter) (Add missing sbp_target_request tag memset - Chris Boot) Acked-by: Chris Boot <bootc@bootc.net> Tested-by: Chris Boot <bootc@bootc.net> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	ib_srpt: Convert to target_alloc_session usage	Nicholas Bellinger	1	-17/+6
	This patch converts ib_srpt internal assignments of se_node_acl and transport_register_session() to use the new alloc_session method. Cc: Vu Pham <vu@mellanox.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	tcm_fc: Convert to target_alloc_session usage	Nicholas Bellinger	1	-22/+22
	This patch converts tcm_fc target mode addition of tf_sess->hash to port_id hlist_head using the new alloc_session callback(). Cc: Vasu Dev <vasu.dev@linux.intel.com> Cc: Mark Rustad <mark.d.rustad@intel.com> Cc: Robert Love <robert.w.love@intel.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	tcm_qla2xxx: Convert to target_alloc_session usage	Nicholas Bellinger	3	-46/+43
	This patch converts existing qla2xxx target mode assignment of struct qla_tgt_sess related sid + loop_id values to use a callback via the new target_alloc_session API caller. Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	vhost/scsi: Convert to target_alloc_session usage	Nicholas Bellinger	1	-58/+41
	This patch converts vhost/scsi pre-allocation of vhost_scsi_cmd descriptors to use the new alloc_session callback(). Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10	target: Convert demo-mode only drivers to target_alloc_session	Christoph Hellwig	4	-122/+72
	This patch converts existing loopback, usb-gadget, and xen-scsiback demo-mode only fabric drivers to use the new target_alloc_session API caller. This includes adding a new alloc_session callback for fabric driver internal nexus pointer assignments. (Fixes for early for-next nexus breakage - Dan Carpenter) Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Acked-by: Juergen Gross <jgross@suse.com> Tested-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com> Tested-by: Chris Boot <bootc@bootc.net> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-02-27	target: Add target_alloc_session() helper function	Nicholas Bellinger	2	-0/+62
	Based on HCH's original patch, this adds a full version to support percpu-ida tag pre-allocation and callback function pointer into fabric driver code to complete session setup. Reported-by: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-02-20	Linux 4.5-rc5	Linus Torvalds	1	-1/+1

2016-02-20	kernel/resource.c: fix muxed resource handling in __request_region()	Simon Guinot	1	-2/+3
	In __request_region, if a conflict with a BUSY and MUXED resource is detected, then the caller goes to sleep and waits for the resource to be released. A pointer on the conflicting resource is kept. At wake-up this pointer is used as a parent to retry to request the region. A first problem is that this pointer might well be invalid (if for example the conflicting resource have already been freed). Another problem is that the next call to __request_region() fails to detect a remaining conflict. The previously conflicting resource is passed as a parameter and __request_region() will look for a conflict among the children of this resource and not at the resource itself. It is likely to succeed anyway, even if there is still a conflict. Instead, the parent of the conflicting resource should be passed to __request_region(). As a fix, this patch doesn't update the parent resource pointer in the case we have to wait for a muxed region right after. Reported-and-tested-by: Vincent Pelletier <plr.vincent@gmail.com> Signed-off-by: Simon Guinot <simon.guinot@sequanux.org> Tested-by: Vincent Donnefort <vdonnefort@gmail.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-19	arm64: mm: allow the kernel to handle alignment faults on user accesses	EunTaik Lee	1	-1/+8
	Although we don't expect to take alignment faults on access to normal memory, misbehaving (i.e. buggy) user code can pass MMIO pointers into system calls, leading to things like get_user accessing device memory. Rather than OOPS the kernel, allow any exception fixups to run and return something like -EFAULT back to userspace. This makes the behaviour more consistent with userspace, even though applications with access to device mappings can easily cause other issues if they try hard enough. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Eun Taik Lee <eun.taik.lee@samsung.com> [will: dropped __kprobes annotation and rewrote commit mesage] Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-02-19	arm64: kbuild: make "make install" not depend on vmlinux	Masahiro Yamada	3	-3/+17
	For the same reason as commit 19514fc665ff ("arm, kbuild: make "make install" not depend on vmlinux"), the install targets should never trigger the rebuild of the kernel. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-02-19	ext4: fix crashes in dioread_nolock mode	Jan Kara	1	-20/+20
	Competing overwrite DIO in dioread_nolock mode will just overwrite pointer to io_end in the inode. This may result in data corruption or extent conversion happening from IO completion interrupt because we don't properly set buffer_defer_completion() when unlocked DIO races with locked DIO to unwritten extent. Since unlocked DIO doesn't need io_end for anything, just avoid allocating it and corrupting pointer from inode for locked DIO. A cleaner fix would be to avoid these games with io_end pointer from the inode but that requires more intrusive changes so we leave that for later. Cc: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-02-19	ext4: fix bh->b_state corruption	Jan Kara	1	-2/+30
	ext4 can update bh->b_state non-atomically in _ext4_get_block() and ext4_da_get_block_prep(). Usually this is fine since bh is just a temporary storage for mapping information on stack but in some cases it can be fully living bh attached to a page. In such case non-atomic update of bh->b_state can race with an atomic update which then gets lost. Usually when we are mapping bh and thus updating bh->b_state non-atomically, nobody else touches the bh and so things work out fine but there is one case to especially worry about: ext4_finish_bio() uses BH_Uptodate_Lock on the first bh in the page to synchronize handling of PageWriteback state. So when blocksize < pagesize, we can be atomically modifying bh->b_state of a buffer that actually isn't under IO and thus can race e.g. with delalloc trying to map that buffer. The result is that we can mistakenly set / clear BH_Uptodate_Lock bit resulting in the corruption of PageWriteback state or missed unlock of BH_Uptodate_Lock. Fix the problem by always updating bh->b_state bits atomically. CC: stable@vger.kernel.org Reported-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-02-19	drm/nouveau: use post-decrement in error handling	Rasmus Villemoes	1	-1/+1
	We need to use post-decrement to get the dma_map_page undone also for i==0, and to avoid some very unpleasant behaviour if dma_map_page failed already at i==0. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-02-19	drm/atomic: Allow for holes in connector state, v2.	Maarten Lankhorst	4	-55/+48
	Because we record connector_mask using 1 << drm_connector_index now the connector_mask should stay the same even when other connectors are removed. This was not the case with MST, in that case when removing a connector all other connectors may change their index. This is fixed by waiting until the first get_connector_state to allocate connector_state, and force reallocation when state is too small. As a side effect connector arrays no longer have to be preallocated, and can be allocated on first use which means a less allocations in the page flip only path. Changes since v1: - Whitespace. (Ville) - Call ida_remove when destroying the connector. (Ville) - u32 alloc -> int. (Ville) Fixes: 14de6c44d149 ("drm/atomic: Remove drm_atomic_connectors_for_crtc.") Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Lyude <cpaul@redhat.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-02-18	clk: gpio: Really allow an optional clock= DT property	Stephen Boyd	1	-1/+1
	We mis-merged the original patch from Russell here and so the patch went almost all the way, except that we still failed to probe when there wasn't a clocks property in the DT node. Allow that case by making a negative value from of_clk_get_parent_count() into "no parents", like the original patch did. Fixes: 7ed88aa2efa5 ("clk: fix clk-gpio.c with optional clock= DT property") Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Michael Turquette <mturquette@baylibre.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2016-02-18	mm: slab: free kmem_cache_node after destroy sysfs file	Dmitry Safonov	5	-27/+29
	When slub_debug alloc_calls_show is enabled we will try to track location and user of slab object on each online node, kmem_cache_node structure and cpu_cache/cpu_slub shouldn't be freed till there is the last reference to sysfs file. This fixes the following panic: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: list_locations+0x169/0x4e0 PGD 257304067 PUD 438456067 PMD 0 Oops: 0000 [#1] SMP CPU: 3 PID: 973074 Comm: cat ve: 0 Not tainted 3.10.0-229.7.2.ovz.9.30-00007-japdoll-dirty #2 9.30 Hardware name: DEPO Computers To Be Filled By O.E.M./H67DE3, BIOS L1.60c 07/14/2011 task: ffff88042a5dc5b0 ti: ffff88037f8d8000 task.ti: ffff88037f8d8000 RIP: list_locations+0x169/0x4e0 Call Trace: alloc_calls_show+0x1d/0x30 slab_attr_show+0x1b/0x30 sysfs_read_file+0x9a/0x1a0 vfs_read+0x9c/0x170 SyS_read+0x58/0xb0 system_call_fastpath+0x16/0x1b Code: 5e 07 12 00 b9 00 04 00 00 3d 00 04 00 00 0f 4f c1 3d 00 04 00 00 89 45 b0 0f 84 c3 00 00 00 48 63 45 b0 49 8b 9c c4 f8 00 00 00 <48> 8b 43 20 48 85 c0 74 b6 48 89 df e8 46 37 44 00 48 8b 53 10 CR2: 0000000000000020 Separated __kmem_cache_release from __kmem_cache_shutdown which now called on slab_kmem_cache_release (after the last reference to sysfs file object has dropped). Reintroduced locking in free_partial as sysfs file might access cache's partial list after shutdowning - partial revert of the commit 69cb8e6b7c29 ("slub: free slabs without holding locks"). Zap __remove_partial and use remove_partial (w/o underscores) as free_partial now takes list_lock which s partial revert for commit 1e4dd9461fab ("slub: do not assert not having lock in removing freed partial") Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com> Suggested-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18	ipc/shm: handle removed segments gracefully in shm_mmap()	Kirill A. Shutemov	1	-10/+43
	remap_file_pages(2) emulation can reach file which represents removed IPC ID as long as a memory segment is mapped. It breaks expectations of IPC subsystem. Test case (rewritten to be more human readable, originally autogenerated by syzkaller[1]): #define _GNU_SOURCE #include <stdlib.h> #include <sys/ipc.h> #include <sys/mman.h> #include <sys/shm.h> #define PAGE_SIZE 4096 int main() { int id; void p; id = shmget(IPC_PRIVATE, 3 PAGE_SIZE, 0); p = shmat(id, NULL, 0); shmctl(id, IPC_RMID, NULL); remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0); return 0; } The patch changes shm_mmap() and code around shm_lock() to propagate locking error back to caller of shm_mmap(). [1] http://github.com/google/syzkaller Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18	MAINTAINERS: update Kselftest Framework mailing list	Shuah Khan	1	-1/+1
	Kselftest Framework now has a dedicated mailing list linux-kselftest. Update the entry in MAINTAINERS file. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18	devm_memremap_release(): fix memremap'd addr handling	Toshi Kani	1	-1/+1
	The pmem driver calls devm_memremap() to map a persistent memory range. When the pmem driver is unloaded, this memremap'd range is not released so the kernel will leak a vma. Fix devm_memremap_release() to handle a given memremap'd address properly. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18	mm/hugetlb.c: fix incorrect proc nr_hugepages value	Vaishali Thakkar	1	-2/+4
	Currently incorrect default hugepage pool size is reported by proc nr_hugepages when number of pages for the default huge page size is specified twice. When multiple huge page sizes are supported, /proc/sys/vm/nr_hugepages indicates the current number of pre-allocated huge pages of the default size. Basically /proc/sys/vm/nr_hugepages displays default_hstate-> max_huge_pages and after boot time pre-allocation, max_huge_pages should equal the number of pre-allocated pages (nr_hugepages). Test case: Note that this is specific to x86 architecture. Boot the kernel with command line option 'default_hugepagesz=1G hugepages=X hugepagesz=2M hugepages=Y hugepagesz=1G hugepages=Z'. After boot, 'cat /proc/sys/vm/nr_hugepages' and 'sysctl -a \| grep hugepages' returns the value X. However, dmesg output shows that Z huge pages were pre-allocated. So, the root cause of the problem here is that the global variable default_hstate_max_huge_pages is set if a default huge page size is specified (directly or indirectly) on the command line. After the command line processing in hugetlb_init, if default_hstate_max_huge_pages is set, the value is assigned to default_hstae.max_huge_pages. However, default_hstate.max_huge_pages may have already been set based on the number of pre-allocated huge pages of default_hstate size. The solution to this problem is if hstate->max_huge_pages is already set then it should not set as a result of global max_huge_pages value. Basically if the value of the variable hugepages is set multiple times on a command line for a specific supported hugepagesize then proc layer should consider the last specified value. Signed-off-by: Vaishali Thakkar <vaishali.thakkar@oracle.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18	mm, x86: fix pte_page() crash in gup_pte_range()	Hugh Dickins	1	-1/+1
	Commit 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings") has moved up the pte_page(pte) in x86's fast gup_pte_range(), for no discernible reason: put it back where it belongs, after the pte_flags check and the pfn_valid cross-check. That may be the cause of the NULL pointer dereference in gup_pte_range(), seen when vfio called vaddr_get_pfn() when starting a qemu-kvm based VM. Signed-off-by: Hugh Dickins <hughd@google.com> Reported-by: Michael Long <Harn-Solo@gmx.de> Tested-by: Michael Long <Harn-Solo@gmx.de> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18	fsnotify: turn fsnotify reaper thread into a workqueue job	Jeff Layton	1	-31/+18
	We don't require a dedicated thread for fsnotify cleanup. Switch it over to a workqueue job instead that runs on the system_unbound_wq. In the interest of not thrashing the queued job too often when there are a lot of marks being removed, we delay the reaper job slightly when queueing it, to allow several to gather on the list. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Tested-by: Eryu Guan <guaneryu@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18	Revert "fsnotify: destroy marks with call_srcu instead of dedicated thread"	Jeff Layton	2	-18/+53
	This reverts commit c510eff6beba ("fsnotify: destroy marks with call_srcu instead of dedicated thread"). Eryu reported that he was seeing some OOM kills kick in when running a testcase that adds and removes inotify marks on a file in a tight loop. The above commit changed the code to use call_srcu to clean up the marks. While that does (in principle) work, the srcu callback job is limited to cleaning up entries in small batches and only once per jiffy. It's easily possible to overwhelm that machinery with too many call_srcu callbacks, and Eryu's reproduer did just that. There's also another potential problem with using call_srcu here. While you can obviously sleep while holding the srcu_read_lock, the callbacks run under local_bh_disable, so you can't sleep there. It's possible when putting the last reference to the fsnotify_mark that we'll end up putting a chain of references including the fsnotify_group, uid, and associated keys. While I don't see any obvious ways that that could occurs, it's probably still best to avoid using call_srcu here after all. This patch reverts the above patch. A later patch will take a different approach to eliminated the dedicated thread here. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Reported-by: Eryu Guan <guaneryu@gmail.com> Tested-by: Eryu Guan <guaneryu@gmail.com> Cc: Jan Kara <jack@suse.com> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>