linux-dev - Linux kernel development work

Age	Commit message	Author	Files	Lines
2016-01-04	NFSv4.1/pNFS: Cleanup constify struct pnfs_layout_range arguments	Trond Myklebust	2	-6/+6
	Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-04	NFSv4.1/pnfs: Cleanup copying of pnfs_layout_range structures	Trond Myklebust	2	-2/+9
	Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-04	NFSv4.1/pNFS: Cleanup pnfs_mark_matching_lsegs_invalid()	Trond Myklebust	1	-5/+5
	Make it more obvious what we're returning... Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-04	NFSv4.1/pNFS: Fix a race in initiate_file_draining()	Trond Myklebust	1	-4/+1
	Peng Tao points out that the call to pnfs_mark_matching_lsegs_return() could race with pnfs_put_lseg(), in which case the layout segment is cleared, but no layoutreturn will be sent. Fix is to replace the call to pnfs_mark_matching_lsegs_invalid(). Reported-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-04	NFSv4.1/pNFS: pnfs_error_mark_layout_for_return() must always return layout	Trond Myklebust	2	-7/+21
	Fix a bug whereby if all the layout segments could be immediately freed, the call to pnfs_error_mark_layout_for_return() would never result in a layoutreturn. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-04	NFSv4.1/pNFS: pnfs_mark_matching_lsegs_return() should set the iomode	Trond Myklebust	1	-4/+12
	If pnfs_mark_matching_lsegs_return() needs to mark a layout segment for return, then it must also set the return iomode. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-04	NFSv4.1/pNFS: Use nfs4_stateid_copy for copying stateids	Trond Myklebust	1	-3/+3
	Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-04	NFSv4.1/pNFS: Don't pass stateids by value to pnfs_send_layoutreturn()	Trond Myklebust	1	-6/+6
	A stateid is a structure, pass it as a pointer. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-31	NFS: Relax requirements in nfs_flush_incompatible	Trond Myklebust	3	-7/+8
	If two processes share the same credentials and NFSv4 open stateid, then allow them both to dirty the same page, even if their nfs_open_context differs. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-31	NFSv4.1/pNFS: Don't queue up a new commit if the layout segment is invalid	Trond Myklebust	7	-0/+39
	If the layout segment is invalid, then we should not be adding more write requests to the commit list. Instead, those writes should be replayed after requesting a new layout. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-31	NFS: Allow multiple commit requests in flight per file	Trond Myklebust	7	-51/+35
	Allow synchronous RPC calls to wait for pending RPC calls to finish, but also allow asynchronous ones to just fire off another commit. With this patch, the xfstests generic/074 test completes in 226s instead of 242s. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-31	NFS/pNFS: Fix up pNFS write reschedule layering violations and bugs	Trond Myklebust	5	-19/+23
	The flexfiles layout in particular, seems to want to poke around in the O_DIRECT flags when retransmitting. This patch sets up an interface to allow it to call back into O_DIRECT to handle retransmission correctly. It also fixes a potential bug whereby we could change the behaviour of O_DIRECT if an error is already pending. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	NFSv4: List stateid information in the callback tracepoints	Trond Myklebust	2	-6/+79
	The stateid is extremely valuable when debugging. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	NFSv4.1/pNFS: Don't return NFS4ERR_DELAY unnecessarily in CB_LAYOUTRECALL	Trond Myklebust	1	-1/+1
	If the client is promising to return the layout ASAP, then there is no need to return DELAY and have the server retry. Instead default to the normal procedure described in RFC5661. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	NFSv4.1/pNFS: Ensure we enforce RFC5661 Section 12.5.5.2.1	Trond Myklebust	1	-0/+20
	The RFC requires us to check if the server is recalling a stateid that we haven't yet received. If so, tell it to wait. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	pNFS: If we have to delay the layout callback, mark the layout for return	Trond Myklebust	3	-3/+18
	If the client needs to delay the layout callback, then speed up the recall process by marking the remaining layout segments to be actively returned by the client. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	NFSv4.1/pNFS: Add a helper to mark the layout as returned	Trond Myklebust	4	-1/+17
	This ensures that we don't reuse the stateid if a layout return or implied layout return means that we've returned all layout segments. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	pNFS: Ensure nfs4_layoutget_prepare returns the correct error	Trond Myklebust	1	-4/+5
	If we're unable to perform the layoutget due to an invalid open stateid or a bulk recall, ensure that we return the error so that the caller can decide on an appropriate action. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	pNFS/flexfiles: Ensure we record layoutstats even if RPC is terminated early	Trond Myklebust	1	-6/+31
	Currently, we will only record the layoutstats correctly if the RPC call successfully obtains a slot. If we exit before that happens, then we may find ourselves starting the busy timer through the call in ff_layout_(read|write)_prepare_layoutstats, but never stopping it. The same thing happens if we're doing DA-DS. The fix is to ensure that we catch these cases in the rpc_release() callback. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	pNFS: Add flag to track if we've called nfs4_ff_layout_stat_io_start_read/write	Trond Myklebust	2	-25/+72
	Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	pNFS/flexfiles: Fix a statistics gathering imbalance	Trond Myklebust	1	-1/+1
	When we replay a failed read, write or commit to the dataserver, we need to ensure that we call ff_layout_read_prepare_v3(), ff_layout_write_prepare_v3 or ff_layout_commit_prepare_v3() so that we reset the statistics. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	pNFS/flexfiles: Don't mark the entire layout as failed, when returning it	Trond Myklebust	2	-3/+1
	In pNFS/flexfiles, we want to return the layout without necessarily marking it as having completely failed. We therefore move the call to pnfs_layout_io_set_failed() out of pnfs_error_mark_layout_for_return(), and then ensure that pNFS/files layout calls it separately. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	pNFS/flexfiles: Don't prevent flexfiles client from retrying LAYOUTGET	Trond Myklebust	4	-53/+6
	Fix a bug in which flexfiles clients are falling back to I/O through the MDS even when the FF_FLAGS_NO_IO_THRU_MDS flag is set. The flexfiles client will always report errors through the LAYOUTRETURN and/or LAYOUTERROR mechanisms, so it should normally be safe for it to retry the LAYOUTGET until it fails or succeeds. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	pnfs/flexfiles: count io stat in rpc_count_stats callback	Peng Tao	1	-12/+10
	If the client ever restarts IO due to some errors, we'll end up miscounting IO stats if we do the counting in the .rpc_done callback. Move it to the .rpc_count_stats callback, which is only called when releasing the RPC. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	pnfs/flexfiles: do not mark delay-like status as DS failure	Peng Tao	1	-1/+8
	We just need to delay and retry in these cases. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	NFS41: map NFS4ERR_LAYOUTUNAVAILABLE to ENODATA	Peng Tao	1	-0/+9
	Map it to ENODATA instead of EIO, which is a fatal error that fails the application. We'll go inband after getting NFS4ERR_LAYOUTUNAVAILABLE. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
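The mapping described in this commit can be pictured with a minimal sketch. This is not the kernel code: the function names are hypothetical, the default case is a simplification, and only the value 10059 for NFS4ERR_LAYOUTUNAVAILABLE is taken from RFC 5661.

```c
#include <errno.h>

/* Status value as defined in RFC 5661. */
#define NFS4ERR_LAYOUTUNAVAILABLE 10059

/* Hypothetical helper: translate a LAYOUTGET status into a local errno. */
int sketch_map_layoutget_status(int nfs_status)
{
	switch (nfs_status) {
	case 0:
		return 0;
	case NFS4ERR_LAYOUTUNAVAILABLE:
		/* Non-fatal: the caller may fall back to in-band I/O. */
		return -ENODATA;
	default:
		/* Simplification: treat every other status as fatal. */
		return -EIO;
	}
}

/* Hypothetical caller-side check: is an in-band retry acceptable? */
int sketch_can_go_inband(int err)
{
	return err == -ENODATA;
}
```

The point of the distinct errno is that the caller can tell "no layout available, use the MDS" apart from a failure that must reach the application.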
2015-12-28	nfs: only remove page from mapping if launder_page fails	Peng Tao	3	-18/+37
	Instead of dropping pages when write fails, only do it when we get fatal failure in launder_page write back. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	nfs: handle request add failure properly	Peng Tao	5	-31/+67
	When we fail to queue a read page to IO descriptor, we need to clean it up otherwise it is hanging around preventing nfs module from being removed. When we fail to queue a write page to IO descriptor, we need to clean it up and also save the failure status to open context. Then at file close, we can try to write pages back again and drop the page if it fails to writeback in .launder_page, which will be done in the next patch. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	nfs: centralize pgio error cleanup	Peng Tao	2	-32/+33
	In case we fail during setting things up for read/write IO, set pg_error in IO descriptor and do the cleanup in nfs_pageio_add_request, where we clean up all pages that are still hanging around on the IO descriptor. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	nfs: clean up rest of reqs when failing to add one	Peng Tao	1	-3/+14
	If we fail to set up things before sending anything over wire, we need to clean up the reqs that are still attached to the IO descriptor. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	NFS41: pop some layoutget errors to application	Peng Tao	6	-14/+78
	For ERESTARTSYS/EIO/EROFS/ENOSPC/E2BIG in layoutget, we should just bail out instead of hiding the error and retrying inband IO. Change all the call sites to pop the error all the way up. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	pNFS/flexfiles: Support server-supplied layoutstats sampling period	Trond Myklebust	2	-3/+14
	Some servers want to be able to control the frequency with which clients report layoutstats, for instance, in order to monitor QoS for a particular file or set of files. In order to support this, the flexfiles layout allows the server to pass this info as a hint in the layout payload. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	SUNRPC: drop unused xs_reclassify_socketX() helpers	Stefan Hajnoczi	1	-13/+1
	xs_reclassify_socket4() and friends used to be called directly. xs_reclassify_socket() is called instead nowadays. The xs_reclassify_socketX() helper functions are empty when CONFIG_DEBUG_LOCK_ALLOC is not defined. Drop them since they have no callers. Note that AF_LOCAL still calls xs_reclassify_socketu() directly but is easily converted to generic xs_reclassify_socket(). Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	nfs: machine credential support for additional operations	Andrew Elble	2	-0/+21
	Allow LAYOUTRETURN and DELEGRETURN to use machine credentials if the server supports it. Add request for OPEN_DOWNGRADE as the close path also uses that. Signed-off-by: Andrew Elble <aweits@rit.edu> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	nfs: do not initialise statics to 0	Wei Tang	1	-1/+1
	This patch fixes the checkpatch.pl error to nfs4sysctl.c: ERROR: do not initialise statics to 0 Signed-off-by: Wei Tang <tangwei@cmss.chinamobile.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	pNFS: Modify pnfs_update_layout tracepoints to use layout stateid	Trond Myklebust	3	-16/+28
	Instead of displaying a layout segment pointer in these tracepoints, let's use the layout stateid, now that Olga gave us a set of tools for displaying them. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	NFSv4: Fix unused variable warnings in nfs4_init_*_client_string()	Trond Myklebust	1	-6/+3
	Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	nfs: add new tracepoint for pnfs_update_layout	Jeff Layton	3	-6/+102
	pnfs_update_layout is really the "nexus" of layout handling. If it returns NULL then we end up going through the MDS. This patch adds some tracepoints to that function that allow us to determine the cause when we end up going through the MDS unexpectedly. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	Adding tracepoint to cached open	Olga Kornievskaia	2	-0/+41
	Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28	Adding stateid information to tracepoints	Olga Kornievskaia	3	-32/+245
	Operations to which stateid information is added: close, delegreturn, open, read, setattr, layoutget, layoutcommit, test_stateid, write, lock, locku, lockt. Format is "stateid=<seqid>:<crc32 hash stateid.other>", also "openstateid=", "layoutstateid=", and "lockstateid=" for open_file, layoutget, set_lock tracepoints. New function is added to internal.h, nfs_stateid_hash(), to compute the hash. trace_nfs4_setattr() is moved from nfs4_do_setattr() to _nfs4_do_setattr() to get access to stateid. trace_nfs4_setattr and trace_nfs4_delegreturn are changed from INODE_EVENT to new event type, INODE_STATEID_EVENT, which is same as INODE_EVENT but adds stateid information. For locking tracepoints, moved trace_nfs4_set_lock() into _nfs4_do_setlk() to get access to stateid information, and removed trace_nfs4_lock_reclaim(), trace_nfs4_lock_expired() as they call into _nfs4_do_setlk() and both were previously same LOCK_EVENT type. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
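As a rough illustration of the "stateid=<seqid>:<crc32 hash stateid.other>" format, the following userspace sketch hashes the 12-byte "other" field and prints it next to the seqid. The function names, the choice of the standard zlib-style CRC-32, and the hexadecimal rendering are assumptions made for the example; the kernel's nfs_stateid_hash() and tracepoint format strings may differ in detail.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_STATEID_OTHER_SIZE 12	/* bytes in stateid.other */

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320 (zlib variant). */
uint32_t sketch_crc32(const unsigned char *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= buf[i];
		for (bit = 0; bit < 8; bit++) {
			if (crc & 1u)
				crc = (crc >> 1) ^ 0xEDB88320u;
			else
				crc >>= 1;
		}
	}
	return ~crc;
}

/* Hypothetical formatter: writes "stateid=<seqid>:0x<hash>" into out. */
int sketch_format_stateid(char *out, size_t outlen, uint32_t seqid,
			  const unsigned char *other)
{
	return snprintf(out, outlen, "stateid=%u:0x%08x",
			(unsigned int)seqid,
			(unsigned int)sketch_crc32(other, SKETCH_STATEID_OTHER_SIZE));
}
```

Hashing the opaque part keeps trace lines short while still letting two events be matched to the same stateid.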
2015-12-27	Linux 4.4-rc7	Linus Torvalds	1	-1/+1

2015-12-27	MIPS: Fix bitrot in __get_user_unaligned()	Al Viro	1	-3/+3
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-12-24	tty/serial: Skip 'NULL' char after console break when sysrq enabled	Vijay Kumar	1	-2/+4
	When sysrq is triggered from the console, the serial driver for the SUN hypervisor console receives a console break and enables sysrq. It then expects a valid sysrq char to follow the break. If the driver instead receives a 'NULL' ASCII char, it disables sysrq and the sysrq handler is never invoked. This fix skips calling the uart sysrq handler when 'NULL' is received while sysrq is enabled. Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com> Acked-by: Karl Volz <karl.volz@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24	sparc64: fix FP corruption in user copy functions	Rob Gardner	13	-134/+235
	Short story: Exception handlers used by some copy_to_user() and copy_from_user() functions do not diligently clean up floating point register usage, and this can result in a user process seeing invalid values in floating point registers. This sometimes makes the process fail. Long story: Several cpu-specific (NG4, NG2, U1, U3) memcpy functions use floating point registers and VIS alignaddr/faligndata to accelerate data copying when source and dest addresses don't align well. Linux uses a lazy scheme for saving floating point registers; it is not done upon entering the kernel since it's a very expensive operation. Rather, it is done only when needed. If the kernel ends up not using FP regs during the course of some trap or system call, then it can return to user space without saving or restoring them. The various memcpy functions begin their FP code with VISEntry (or a variation thereof), which saves the FP regs. They conclude their FP code with VISExit (or a variation) which essentially marks the FP regs "clean", ie, they contain no unsaved values. fprs.FPRS_FEF is turned off so that a lazy restore will be triggered when/if the user process accesses floating point regs again. The bug is that the user copy variants of memcpy, copy_from_user() and copy_to_user(), employ an exception handling mechanism to detect faults when accessing user space addresses, and when this handler is invoked, an immediate return from the function is forced, and VISExit is not executed, thus leaving the fprs register in an indeterminate state, but often with fprs.FPRS_FEF set and one or more dirty bits. This results in a return to user space with invalid values in the FP regs, and since fprs.FPRS_FEF is on, no lazy restore occurs. This bug affects copy_to_user() and copy_from_user() for NG4, NG2, U3, and U1. All are fixed by using a new exception handler for those loads and stores that are done during the time between VISEntry and VISExit. n.b. In NG4memcpy, the problematic code can be triggered by a copy size greater than 128 bytes and an unaligned source address. This bug is known to be the cause of random user process memory corruptions while perf is running with the callgraph option (ie, perf record -g). This occurs because perf uses copy_from_user() to read user stacks, and may fault when it follows a stack frame pointer off to an invalid page. Validation checks on the stack address just obscure the underlying problem. Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24	sparc64: Perf should save/restore fault info	Rob Gardner	1	-0/+4
	There have been several reports of random processes being killed with a bus error or segfault during userspace stack walking in perf. One of the root causes of this problem is an asynchronous modification to thread_info fault_address and fault_code, which stems from a perf counter interrupt arriving during kernel processing of a "benign" fault, such as a TSB miss. Since perf_callchain_user() invokes copy_from_user() to read user stacks, a fault is not only possible, but probable. Validity checks on the stack address merely cover up the problem and reduce its frequency. The solution here is to save and restore fault_address and fault_code in perf_callchain_user() so that the benign fault handler is not disturbed by a perf interrupt. Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
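The save-and-restore idea in this commit can be sketched in plain C. The struct, field types, and function names below are hypothetical stand-ins rather than the sparc64 definitions; the stack walk is simulated by a function that simply clobbers the fault fields, as a fault taken during the walk would.

```c
/* Hypothetical stand-in for the per-thread fault bookkeeping. */
struct sketch_thread_info {
	unsigned long fault_address;
	unsigned int fault_code;
};

/* Simulates a user stack walk that faults and overwrites the fault fields. */
void sketch_walk_user_stack(struct sketch_thread_info *ti)
{
	ti->fault_address = 0xdeadUL;
	ti->fault_code = 0x7;
}

/*
 * Sketch of the fix: remember the fault fields before the walk and put
 * them back afterwards, so an interrupted "benign" fault handler still
 * sees the values it recorded.
 */
void sketch_callchain_user(struct sketch_thread_info *ti)
{
	unsigned long saved_address = ti->fault_address;
	unsigned int saved_code = ti->fault_code;

	sketch_walk_user_stack(ti);

	ti->fault_address = saved_address;
	ti->fault_code = saved_code;
}
```

Without the restore step, the interrupted handler would resume with the address and code of the perf-induced fault, which matches the spurious bus errors and segfaults described above.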
2015-12-24	sparc64: Ensure perf can access user stacks	Rob Gardner	1	-0/+7
	When an interrupt (such as a perf counter interrupt) is delivered while executing in user space, the trap entry code puts ASI_AIUS in %asi so that copy_from_user() and copy_to_user() will access the correct memory. But if a perf counter interrupt is delivered while the cpu is already executing in kernel space, then the trap entry code will put ASI_P in %asi, and this will prevent copy_from_user() from reading any useful stack data in either of the perf_callchain_user_X functions, and thus no user callgraph data will be collected for this sample period. An additional problem is that a fault is guaranteed to occur, and though it will be silently covered up, it wastes time and could perturb state. In perf_callchain_user(), we ensure that %asi contains ASI_AIUS because we know for a fact that the subsequent calls to copy_from_user() are intended to read the user's stack. [ Use get_fs()/set_fs() -DaveM ] Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24	sparc64: Don't set %pil in rtrap_nmi too early	Rob Gardner	1	-1/+7
	Commit 28a1f53 delays setting %pil to avoid potential hardirq stack overflow in the common rtrap_irq path. Setting %pil also needs to be delayed in the rtrap_nmi path for the same reason. Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24	sparc64: Add ADI capability to cpu capabilities	Khalid Aziz	2	-4/+6
	Add ADI (Application Data Integrity) capability to cpu capabilities list. ADI capability allows virtual addresses to be encoded with a tag in bits 63-60. This tag serves as an access control key for the regions of virtual address with ADI enabled and a key set on them. Hypervisor encodes this capability as "adp" in "hwcap-list" property in machine description. Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
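To make the "tag in bits 63-60" encoding concrete, here is a small sketch of packing and unpacking a 4-bit tag in a 64-bit virtual address. The macro and function names are invented for illustration, and clearing the tag by masking is a simplification; real hardware and kernel code may normalize addresses differently.

```c
#include <stdint.h>

#define SKETCH_ADI_TAG_SHIFT 60
#define SKETCH_ADI_TAG_MASK  (UINT64_C(0xF) << SKETCH_ADI_TAG_SHIFT)

/* Place a 4-bit tag into bits 63-60, replacing any existing tag. */
uint64_t sketch_adi_set_tag(uint64_t vaddr, unsigned int tag)
{
	return (vaddr & ~SKETCH_ADI_TAG_MASK) |
	       ((uint64_t)(tag & 0xFu) << SKETCH_ADI_TAG_SHIFT);
}

/* Read the tag back out of bits 63-60. */
unsigned int sketch_adi_get_tag(uint64_t vaddr)
{
	return (unsigned int)((vaddr & SKETCH_ADI_TAG_MASK) >> SKETCH_ADI_TAG_SHIFT);
}

/* Clear the tag bits (simplified: no sign extension is attempted). */
uint64_t sketch_adi_strip_tag(uint64_t vaddr)
{
	return vaddr & ~SKETCH_ADI_TAG_MASK;
}
```

For example, tagging address 0x1000 with 0xA yields 0xA000000000001000, and an access is permitted only when the tag in the pointer matches the key set on the memory region.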
2015-12-24	tty: serial: constify sunhv_ops structs	Aya Mahfouz	1	-3/+3
	Constifies sunhv_ops structures in tty's serial driver since they are not modified after their initialization. Detected and found using Coccinelle. Suggested-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Aya Mahfouz <mahfouz.saif.elyazal@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24	cpufreq: scpi-cpufreq: signedness bug in scpi_get_dvfs_info()	Dan Carpenter	1	-1/+1
	The "domain" variable needs to be signed for the error handling to work. Fixes: 8def31034d03 (cpufreq: arm_big_little: add SCPI interface driver) Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
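The class of bug fixed here is easy to reproduce in isolation. The sketch below assumes a lookup that returns a negative errno on failure and an 8-bit unsigned variable receiving it; both the function names and the exact types are hypothetical and are not taken from the driver.

```c
#include <errno.h>

/* Hypothetical lookup: returns a domain number, or a negative errno. */
int sketch_get_domain(int cpu)
{
	return cpu < 0 ? -ENODEV : cpu / 4;
}

/* Buggy variant: the unsigned variable can never hold a negative value. */
int sketch_check_buggy(int cpu)
{
	unsigned char domain = (unsigned char)sketch_get_domain(cpu);
	int seen = domain;	/* what a "domain < 0" test actually compares */

	return seen < 0 ? -EINVAL : 0;	/* error path is unreachable */
}

/* Fixed variant: a signed variable preserves the negative errno. */
int sketch_check_fixed(int cpu)
{
	int domain = sketch_get_domain(cpu);

	return domain < 0 ? -EINVAL : 0;
}
```

With an invalid CPU, the buggy variant reports success because the error code was truncated into a positive value, while the fixed variant returns -EINVAL.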