aboutsummaryrefslogtreecommitdiffstats
path: root/include/uapi
AgeCommit message (Collapse)AuthorFilesLines
2017-07-27RDMA/qedr: notify user application if DPM is supportedAmrani, Ram1-0/+1
Direct Packet Mode support may be disabled, e.g, due to limited resources. Notifying the user application prevents wasting cycles on attempting to send these kind of packets. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-27Backmerge tag 'v4.13-rc2' into drm-nextDave Airlie2-4/+4
Linux 4.13-rc2 This is required for drm-misc fixing.
2017-07-26Merge airlied/drm-next into drm-misc-nextDaniel Vetter58-208/+837
I need this to be able to apply the deferred fbdev setup patches, I need the relevant prep work that landed through the drm-intel tree. Also squash in conflict fixup from Laurent Pinchart. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-24signal: Remove kernel interal si_code magicEric W. Biederman1-69/+46
struct siginfo is a union and the kernel since 2.4 has been hiding a union tag in the high 16bits of si_code using the values: __SI_KILL __SI_TIMER __SI_POLL __SI_FAULT __SI_CHLD __SI_RT __SI_MESGQ __SI_SYS While this looks plausible on the surface, in practice this situation has not worked well. - Injected positive signals are not copied to user space properly unless they have these magic high bits set. - Injected positive signals are not reported properly by signalfd unless they have these magic high bits set. - These kernel internal values leaked to userspace via ptrace_peek_siginfo - It was possible to inject these kernel internal values and cause the the kernel to misbehave. - Kernel developers got confused and expected these kernel internal values in userspace in kernel self tests. - Kernel developers got confused and set si_code to __SI_FAULT which is SI_USER in userspace which causes userspace to think an ordinary user sent the signal and that it was not kernel generated. - The values make it impossible to reorganize the code to transform siginfo_copy_to_user into a plain copy_to_user. As si_code must be massaged before being passed to userspace. So remove these kernel internal si codes and make the kernel code simpler and more maintainable. To replace these kernel internal magic si_codes introduce the helper function siginfo_layout, that takes a signal number and an si_code and computes which union member of siginfo is being used. Have siginfo_layout return an enumeration so that gcc will have enough information to warn if a switch statement does not handle all of union members. A couple of architectures have a messed up ABI that defines signal specific duplications of SI_USER which causes more special cases in siginfo_layout than I would like. The good news is only problem architectures pay the cost. Update all of the code that used the previous magic __SI_ values to use the new SIL_ values and to call siginfo_layout to get those values. Escept where not all of the cases are handled remove the defaults in the switch statements so that if a new case is missed in the future the lack will show up at compile time. Modify the code that copies siginfo si_code to userspace to just copy the value and not cast si_code to a short first. The high bits are no longer used to hold a magic union member. Fixup the siginfo header files to stop including the __SI_ values in their constants and for the headers that were missing it to properly update the number of si_codes for each signal type. The fixes to copy_siginfo_from_user32 implementations has the interesting property that several of them perviously should never have worked as the __SI_ values they depended up where kernel internal. With that dependency gone those implementations should work much better. The idea of not passing the __SI_ values out to userspace and then not reinserting them has been tested with criu and criu worked without changes. Ref: 2.4.0-test1 Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-07-24fcntl: Don't use ambiguous SIG_POLL si_codesEric W. Biederman1-2/+2
We have a weird and problematic intersection of features that when they all come together result in ambiguous siginfo values, that we can not support properly. - Supporting fcntl(F_SETSIG,...) with arbitrary valid signals. - Using positive values for POLL_IN, POLL_OUT, POLL_MSG, ..., etc that imply they are signal specific si_codes and using the aforementioned arbitrary signal to deliver them. - Supporting injection of arbitrary siginfo values for debugging and checkpoint/restore. The result is that just looking at siginfo si_codes of 1 to 6 are ambigious. It could either be a signal specific si_code or it could be a generic si_code. For most of the kernel this is a non-issue but for sending signals with siginfo it is impossible to play back the kernel signals and get the same result. Strictly speaking when the si_code was changed from SI_SIGIO to POLL_IN and friends between 2.2 and 2.4 this functionality was not ambiguous, as only real time signals were supported. Before 2.4 was released the kernel began supporting siginfo with non realtime signals so they could give details of why the signal was sent. The result is that if F_SETSIG is set to one of the signals with signal specific si_codes then user space can not know why the signal was sent. I grepped through a bunch of userspace programs using debian code search to get a feel for how often people choose a signal that results in an ambiguous si_code. I only found one program doing so and it was using SIGCHLD to test the F_SETSIG functionality, and did not appear to be a real world usage. Therefore the ambiguity does not appears to be a real world problem in practice. Remove the ambiguity while introducing the smallest chance of breakage by changing the si_code to SI_SIGIO when signals with signal specific si_codes are targeted. Fixes: v2.3.40 -- Added support for queueing non-rt signals Fixes: v2.3.21 -- Changed the si_code from SI_SIGIO Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-07-24IB/mlx4: Add support for RSS QPGuy Levi1-0/+33
Add support to work with a RSS QP by using an indirection table object upon QP creation. Other related QP verbs (e.g. modify/destroy/query) were updated as well for that QP mode. Notes: - The RX hash properties are supplied as driver private data. - The RSS QP port is used on the associated WQs in its indirection table. Applying different ports during WQ life time is not allowed. - The expected RSS QP flow is: create, modify(RST->INIT), modify(RST->RTR), destroy. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/mlx4: Add support for WQ indirection table related verbsGuy Levi1-0/+4
To enable RSS functionality the IB indirection table object (i.e. ib_rwq_ind_table) should be used. This patch implements the related verbs as of create and destroy an indirection table. In downstream patches the indirection table will be used as part of RSS QP creation. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/mlx4: Add support for WQ related verbsGuy Levi1-0/+14
Support create/modify/destroy WQ related verbs. The base IB object to enable RSS functionality is a WQ (i.e. ib_wq). This patch implements the related WQ verbs as of create, modify and destroy. In downstream patches the WQ will be used as part of an indirection table (i.e. ib_rwq_ind_table) to enable RSS QP creation. Notes: ConnectX-3 hardware requires consecutive WQNs list as receive descriptor queues for the RSS QP. Hence, the driver manages consecutive ranges lists per context which the user must respect. Destroying the WQ does not return its WQN back to its range for reusing. However, destroying all WQs from the same range releases the range and in turn releases its WQNs for reusing. Since the WQ object is not a natural object in the hardware, the driver implements the WQ by the hardware QP. As such, the WQ inherits its port from its RSS QP parent upon its RST->INIT transition and by that time its state is applied to the hardware. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/mlx4: Add inline-receive supportMaor Gottlieb1-1/+2
When inline-receive is enabled, the HCA may write received data into the receive WQE. Inline-receive is enabled by setting its matching bit in the QP context and each single-packet message with payload not exceeding the receive WQE size will be delivered to the WQE. The completion report will indicate that the payload was placed to the WQE. It includes: 1) Return maximum supported size of inline-receive by the hardware in query_device vendor's data part. 2) Enable the feature when requested by the vendor data input. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/uverbs: Enable QP creation with a given source QP numberYishai Hadas1-1/+1
Enable QP creation with a given source QP number, the created QP will use the source QPN as its wire QP number. To create such a QP, root privileges (i.e. CAP_NET_RAW) are required from the user application. This comes as a pre-patch for downstream patches in this series to allow user space applications to accelerate traffic which is typically handled by IPoIB ULP. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24netfilter: nf_tables: Attach process info to NFT_MSG_NEWGEN notificationsPhil Sutter1-0/+2
This is helpful for 'nft monitor' to track which process caused a given change to the ruleset. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-23Merge 4.13-rc2 into tty-nextGreg Kroah-Hartman2-4/+4
We want the tty/serial fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-22Merge tag 'tty-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/ttyLinus Torvalds1-1/+1
Pull tty/serial fixes from Greg KH: "Here are some small tty and serial driver fixes for 4.13-rc2. Nothing huge at all, a revert of a patch that turned out to break things, a fix up for a new tty ioctl we added in 4.13-rc1 to get the uapi definition correct, and a few minor serial driver fixes for reported issues. All of these have been in linux-next for a while with no reported issues" * tag 'tty-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: tty: Fix TIOCGPTPEER ioctl definition tty: hide unused pty_get_peer function tty: serial: lpuart: Fix the logic for detecting the 32-bit type UART serial: imx: Prevent TX buffer PIO write when a DMA has been started Revert "serial: imx-serial - move DMA buffer configuration to DT" serial: sh-sci: Uninitialized variables in sysfs files serial: st-asc: Potential error pointer dereference
2017-07-21rxrpc: Move the packet.h include file into net/rxrpc/David Howells1-0/+44
Move the protocol description header file into net/rxrpc/ and rename it to protocol.h. It's no longer necessary to expose it as packets are no longer exposed to kernel services (such as AFS) that use the facility. The abort codes are transferred to the UAPI header instead as we pass these back to userspace and also to kernel services. Signed-off-by: David Howells <dhowells@redhat.com>
2017-07-21rxrpc: Expose UAPI definitions to userspaceDavid Howells1-0/+80
Move UAPI definitions from the internal header and place them in a UAPI header file so that userspace can make use of them. Signed-off-by: David Howells <dhowells@redhat.com>
2017-07-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller4-2/+26
2017-07-18perf/core: Define the common branch type classificationJin Yao1-1/+26
It is often useful to know the branch types while analyzing branch data. For example, a call is very different from a conditional branch. Currently we have to look it up in binary while the binary may later not be available and even the binary is available but user has to take some time. It is very useful for user to check it directly in perf report. Perf already has support for disassembling the branch instruction to get the x86 branch type. To keep consistent on kernel and userspace and make the classification more common, the patch adds the common branch type classification in perf_event.h. The patch only defines a minimum but most common set of branch types. PERF_BR_UNKNOWN : unknown PERF_BR_COND :conditional PERF_BR_UNCOND : unconditional PERF_BR_IND : indirect PERF_BR_CALL : function call PERF_BR_IND_CALL : indirect function call PERF_BR_RET : function return PERF_BR_SYSCALL : syscall PERF_BR_SYSRET : syscall return PERF_BR_COND_CALL : conditional function call PERF_BR_COND_RET : conditional function return The patch also adds a new field type (4 bits) in perf_branch_entry to record the branch type. Since the disassembling of branch instruction needs some overhead, a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it needs to disassemble the branch instruction and record the branch type. Change log: v10: Not changed. v9: Not changed. v8: Change PERF_BR_NONE to PERF_BR_UNKNOWN. No other change. v7: Just keep the most common branch types. Others are removed. v6: Not changed. v5: Not changed. The v5 patch series just change the userspace. v4: Comparing to previous version, the major changes are: 1. Remove the PERF_BR_JCC_FWD/PERF_BR_JCC_BWD, they will be computed later in userspace. 2. Remove the "cross" field in perf_branch_entry. The cross page computing will be done later in userspace. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/1500379995-6449-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18media: cec: rework the cec event handlingHans Verkuil1-1/+2
Event handling was always fairly simplistic since there were only two events. With the addition of pin events this needed to be redesigned. The state_change and lost_msgs events are now core events with the guarantee that the last state is always available. The new pin events are a queue of events (up to 64 for each event) and the oldest event will be dropped if the application cannot keep up. Lost events are marked with a new event flag. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Reviewed-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-07-18media: linux/cec.h: add pin monitoring API supportHans Verkuil1-0/+5
Add support for low-level CEC pin monitoring. This adds a new monitor mode, a new capability and two new events. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Reviewed-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-07-18tty: serial: owl: Implement console driverAndreas Färber1-0/+1
Implement serial console driver to complement earlycon. Based on LeMaker linux-actions tree. Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-18include: usb: audio: specify exact endiannes of descriptorsRuslan Bilovol1-3/+3
USB spec says that multiple byte fields are stored in little-endian order (see chapter 8.1 of USB2.0 spec and chapter 7.1 of USB3.0 spec), thus mark such fields as LE for UAC1 and UAC2 headers Signed-off-by: Ruslan Bilovol <ruslan.bilovol@gmail.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-07-17bpf: add bpf_redirect_map helper routineJohn Fastabend1-1/+7
BPF programs can use the devmap with a bpf_redirect_map() helper routine to forward packets to netdevice in map. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17bpf: add devmap, a map for storing net device referencesJohn Fastabend1-0/+1
Device map (devmap) is a BPF map, primarily useful for networking applications, that uses a key to lookup a reference to a netdevice. The map provides a clean way for BPF programs to build virtual port to physical port maps. Additionally, it provides a scoping function for the redirect action itself allowing multiple optimizations. Future patches will leverage the map to provide batching at the XDP layer. Another optimization/feature, that is not yet implemented, would be to support multiple netdevices per key to support efficient multicast and broadcast support. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17xdp: add bpf_redirect helper functionJohn Fastabend1-0/+1
This adds support for a bpf_redirect helper function to the XDP infrastructure. For now this only supports redirecting to the egress path of a port. In order to support drivers handling a xdp_buff natively this patches uses a new ndo operation ndo_xdp_xmit() that takes pushes a xdp_buff to the specified device. If the program specifies either (a) an unknown device or (b) a device that does not support the operation a BPF warning is thrown and the XDP_ABORTED error code is returned. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17tty: Fix TIOCGPTPEER ioctl definitionGleb Fotengauer-Malinovskiy1-1/+1
This ioctl does nothing to justify an _IOC_READ or _IOC_WRITE flag because it doesn't copy anything from/to userspace to access the argument. Fixes: 54ebbfb16034 ("tty: add TIOCGPTPEER ioctl") Signed-off-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org> Acked-by: Aleksa Sarai <asarai@suse.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-15Merge tag 'kvm-4.13-2' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-1/+3
Pull more KVM updates from Radim Krčmář: "Second batch of KVM updates for v4.13 Common: - add uevents for VM creation/destruction - annotate and properly access RCU-protected objects s390: - rename IOCTL added in the first v4.13 merge x86: - emulate VMLOAD VMSAVE feature in SVM - support paravirtual asynchronous page fault while nested - add Hyper-V userspace interfaces for better migration - improve master clock corner cases - extend internal error reporting after EPT misconfig - correct single-stepping of emulated instructions in SVM - handle MCE during VM entry - fix nVMX VM entry checks and nVMX VMCS shadowing" * tag 'kvm-4.13-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits) kvm: x86: hyperv: make VP_INDEX managed by userspace KVM: async_pf: Let guest support delivery of async_pf from guest mode KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf KVM: async_pf: Add L1 guest async_pf #PF vmexit handler KVM: x86: Simplify kvm_x86_ops->queue_exception parameter list kvm: x86: hyperv: add KVM_CAP_HYPERV_SYNIC2 KVM: x86: make backwards_tsc_observed a per-VM variable KVM: trigger uevents when creating or destroying a VM KVM: SVM: Enable Virtual VMLOAD VMSAVE feature KVM: SVM: Add Virtual VMLOAD VMSAVE feature definition KVM: SVM: Rename lbr_ctl field in the vmcb control area KVM: SVM: Prepare for new bit definition in lbr_ctl KVM: SVM: handle singlestep exception when skipping emulated instructions KVM: x86: take slots_lock in kvm_free_pit KVM: s390: Fix KVM_S390_GET_CMMA_BITS ioctl definition kvm: vmx: Properly handle machine check during VM-entry KVM: x86: update master clock before computing kvmclock_offset kvm: nVMX: Shadow "high" parts of shadowed 64-bit VMCS fields kvm: nVMX: Fix nested_vmx_check_msr_bitmap_controls kvm: nVMX: Validate the I/O bitmaps on nested VM-entry ...
2017-07-14kvm: x86: hyperv: make VP_INDEX managed by userspaceRoman Kagan1-0/+1
Hyper-V identifies vCPUs by Virtual Processor Index, which can be queried via HV_X64_MSR_VP_INDEX msr. It is defined by the spec as a sequential number which can't exceed the maximum number of vCPUs per VM. APIC ids can be sparse and thus aren't a valid replacement for VP indices. Current KVM uses its internal vcpu index as VP_INDEX. However, to make it predictable and persistent across VM migrations, the userspace has to control the value of VP_INDEX. This patch achieves that, by storing vp_index explicitly on vcpu, and allowing HV_X64_MSR_VP_INDEX to be set from the host side. For compatibility it's initialized to KVM vcpu index. Also a few variables are renamed to make clear distinction betweed this Hyper-V vp_index and KVM vcpu_id (== APIC id). Besides, a new capability, KVM_CAP_HYPERV_VP_INDEX, is added to allow the userspace to skip attempting msr writes where unsupported, to avoid spamming error logs. Signed-off-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-07-13Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pendingLinus Torvalds1-0/+12
Pull SCSI target updates from Nicholas Bellinger: "It's been usually busy for summer, with most of the efforts centered around TCMU developments and various target-core + fabric driver bug fixing activities. Not particularly large in terms of LoC, but lots of smaller patches from many different folks. The highlights include: - ibmvscsis logical partition manager support (Michael Cyr + Bryant Ly) - Convert target/iblock WRITE_SAME to blkdev_issue_zeroout (hch + nab) - Add support for TMR percpu LUN reference counting (nab) - Fix a potential deadlock between EXTENDED_COPY and iscsi shutdown (Bart) - Fix COMPARE_AND_WRITE caw_sem leak during se_cmd quiesce (Jiang Yi) - Fix TMCU module removal (Xiubo Li) - Fix iser-target OOPs during login failure (Andrea Righi + Sagi) - Breakup target-core free_device backend driver callback (mnc) - Perform TCMU add/delete/reconfig synchronously (mnc) - Fix TCMU multiple UIO open/close sequences (mnc) - Fix TCMU CHECK_CONDITION sense handling (mnc) - Fix target-core SAM_STAT_BUSY + TASK_SET_FULL handling (mnc + nab) - Introduce TYPE_ZBC support in PSCSI (Damien Le Moal) - Fix possible TCMU memory leak + OOPs when recalculating cmd base size (Xiubo Li + Bryant Ly + Damien Le Moal + mnc) - Add login_keys_workaround attribute for non RFC initiators (Robert LeBlanc + Arun Easi + nab)" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (68 commits) iscsi-target: Add login_keys_workaround attribute for non RFC initiators Revert "qla2xxx: Fix incorrect tcm_qla2xxx_free_cmd use during TMR ABORT" tcmu: clean up the code and with one small fix tcmu: Fix possbile memory leak / OOPs when recalculating cmd base size target: export lio pgr/alua support as device attr target: Fix return sense reason in target_scsi3_emulate_pr_out target: Fix cmd size for PR-OUT in passthrough_parse_cdb tcmu: Fix dev_config_store target: pscsi: Introduce TYPE_ZBC support target: Use macro for WRITE_VERIFY_32 operation codes target: fix SAM_STAT_BUSY/TASK_SET_FULL handling target: remove transport_complete pscsi: finish cmd processing from pscsi_req_done tcmu: fix sense handling during completion target: add helper to copy sense to se_cmd buffer target: do not require a transport_complete for SCF_TRANSPORT_TASK_SENSE target: make device_mutex and device_list static tcmu: Fix flushing cmd entry dcache page tcmu: fix multiple uio open/close sequences tcmu: drop configured check in destroy ...
2017-07-13kvm: x86: hyperv: add KVM_CAP_HYPERV_SYNIC2Roman Kagan1-0/+1
There is a flaw in the Hyper-V SynIC implementation in KVM: when message page or event flags page is enabled by setting the corresponding msr, KVM zeroes it out. This is problematic because on migration the corresponding MSRs are loaded on the destination, so the content of those pages is lost. This went unnoticed so far because the only user of those pages was in-KVM hyperv synic timers, which could continue working despite that zeroing. Newer QEMU uses those pages for Hyper-V VMBus implementation, and zeroing them breaks the migration. Besides, in newer QEMU the content of those pages is fully managed by QEMU, so zeroing them is undesirable even when writing the MSRs from the guest side. To support this new scheme, introduce a new capability, KVM_CAP_HYPERV_SYNIC2, which, when enabled, makes sure that the synic pages aren't zeroed out in KVM. Signed-off-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-07-12include/linux/sem.h: correctly document sem_ctimeManfred Spraul1-1/+1
sem_ctime is initialized to the semget() time and then updated at every semctl() that changes the array. Thus it does not represent the time of the last change. Especially, semop() calls are only stored in sem_otime, not in sem_ctime. This is already described in ipc/sem.c, I just overlooked that there is a comment in include/linux/sem.h and man semctl(2) as well. So: Correct wrong comments. Link: http://lkml.kernel.org/r/20170515171912.6298-4-manfred@colorfullife.com Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Kees Cook <keescook@chromium.org> Cc: <1vier1@web.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12kcmp: add KCMP_EPOLL_TFD mode to compare epoll target filesCyrill Gorcunov1-0/+10
With current epoll architecture target files are addressed with file_struct and file descriptor number, where the last is not unique. Moreover files can be transferred from another process via unix socket, added into queue and closed then so we won't find this descriptor in the task fdinfo list. Thus to checkpoint and restore such processes CRIU needs to find out where exactly the target file is present to add it into epoll queue. For this sake one can use kcmp call where some particular target file from the queue is compared with arbitrary file passed as an argument. Because epoll target files can have same file descriptor number but different file_struct a caller should explicitly specify the offset within. To test if some particular file is matching entry inside epoll one have to - fill kcmp_epoll_slot structure with epoll file descriptor, target file number and target file offset (in case if only one target is present then it should be 0) - call kcmp as kcmp(pid1, pid2, KCMP_EPOLL_TFD, fd, &kcmp_epoll_slot) - the kernel fetch file pointer matching file descriptor @fd of pid1 - lookups for file struct in epoll queue of pid2 and returns traditional 0,1,2 result for sorting purpose Link: http://lkml.kernel.org/r/20170424154423.511592110@gmail.com Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Andrey Vagin <avagin@openvz.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Jason Baron <jbaron@akamai.com> Cc: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12KVM: s390: Fix KVM_S390_GET_CMMA_BITS ioctl definitionGleb Fotengauer-Malinovskiy1-1/+1
In case of KVM_S390_GET_CMMA_BITS, the kernel does not only read struct kvm_s390_cmma_log passed from userspace (which constitutes _IOC_WRITE), it also writes back a return value (which constitutes _IOC_READ) making this an _IOWR ioctl instead of _IOW. Fixes: 4036e387 ("KVM: s390: ioctls to get and set guest storage attributes") Signed-off-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-07-10Fix up over-eager 'wait_queue_t' renamingLinus Torvalds2-4/+4
Commit ac6424b981bc ("sched/wait: Rename wait_queue_t => wait_queue_entry_t") had scripted the renaming incorrectly, and didn't actually check that the 'wait_queue_t' was a full token. As a result, it also triggered on 'wait_queue_token', and renamed that to 'wait_queue_entry_token' entry in the autofs4 packet structure definition too. That was entirely incorrect, and not intended. The end result built fine when building just the kernel - because everything had been renamed consistently there - but caused problems in user space because the "struct autofs_packet_missing" type is exported as part of the uapi. This scripts it all back again: git grep -lw wait_queue_entry_token | xargs sed -i 's/wait_queue_entry_token/wait_queue_token/g' and checks the end result. Reported-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Fixes: ac6424b981bc ("sched/wait: Rename wait_queue_t => wait_queue_entry_t") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-09Merge tag 'drm-for-v4.13' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds8-17/+209
Pull drm updates from Dave Airlie: "This is the main pull request for the drm, I think I've got one later driver pull for mediatek SoC driver, I'm undecided on if it needs to go to you yet. Otherwise summary below: Core drm: - Atomic add driver private objects - Deprecate preclose hook in modern drivers - MST bandwidth tracking - Use kvmalloc in more places - Add mode_valid hook for crtc/encoder/bridge - Reduce sync_file construction time - Documentation updates - New DRM synchronisation object support New drivers: - pl111 - pl111 CLCD display controller Panel: - Innolux P079ZCA panel driver - Add NL12880B20-05, NL192108AC18-02D, P320HVN03 panels - panel-samsung-s6e3ha2: Add s6e3hf2 panel support i915: - SKL+ watermark fixes - G4x/G33 reset improvements - DP AUX backlight improvements - Buffer based GuC/host communication - New getparam for (sub)slice infomation - Cannonlake and Coffeelake initial patches - Execbuf optimisations radeon/amdgpu: - Lots of Vega10 bug fixes - Preliminary raven support - KIQ support for compute rings - MEC queue management rework - DCE6 Audio support - SR-IOV improvements - Better radeon/amdgpu selection support nouveau: - HDMI stereoscopic support - Display code rework for >= GM20x GPUs msm: - GEM rework for fine-grained locking - Per-process pagetable work - HDMI fixes for Snapdragon 820. vc4: - Remove 256MB CMA limit from vc4 - Add out-fence support - Add support for cygnus - Get/set tiling ioctls support - Add T-format tiling support for scanout zte: - add VGA support. etnaviv: - Thermal throttle support for newer GPUs - Restore userspace buffer cache performance - dma-buf sync fix stm: - add stm32f429 display support exynos: - Rework vblank handling - Fixup sw-trigger code sun4i: - V3s display engine support - HDMI support for older SoCs - Preliminary work on dual-pipeline SoCs. rcar-du: - VSP work imx-drm: - Remove counter load enable from PRE - Double read/write reduction flag support tegra: - Documentation for the host1x and drm driver. - Lots of staging ioctl fixes due to grate project work. omapdrm: - dma-buf fence support - TILER rotation fixes" * tag 'drm-for-v4.13' of git://people.freedesktop.org/~airlied/linux: (1270 commits) drm: Remove unused drm_file parameter to drm_syncobj_replace_fence() drm/amd/powerplay: fix bug fail to remove sysfs when rmmod amdgpu. amdgpu: Set cik/si_support to 1 by default if radeon isn't built drm/amdgpu/gfx9: fix driver reload with KIQ drm/amdgpu/gfx8: fix driver reload with KIQ drm/amdgpu: Don't call amd_powerplay_destroy() if we don't have powerplay drm/ttm: Fix use-after-free in ttm_bo_clean_mm drm/amd/amdgpu: move get memory type function from early init to sw init drm/amdgpu/cgs: always set reference clock in mode_info drm/amdgpu: fix vblank_time when displays are off drm/amd/powerplay: power value format change for Vega10 drm/amdgpu/gfx9: support the amdgpu.disable_cu option drm/amd/powerplay: change PPSMC_MSG_GetCurrPkgPwr for Vega10 drm/amdgpu: Make amdgpu_cs_parser_init static (v2) drm/amdgpu/cs: fix a typo in a comment drm/amdgpu: Fix the exported always on CU bitmap drm/amdgpu/gfx9: gfx_v9_0_enable_gfx_static_mg_power_gating() can be static drm/amdgpu/psp: upper_32_bits/lower_32_bits for address setup drm/amd/powerplay/cz: print message if smc message fails drm/amdgpu: fix typo in amdgpu_debugfs_test_ib_init ...
2017-07-09Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-8/+8
Pull scheduler fixes from Thomas Gleixner: "This scheduler update provides: - The (hopefully) final fix for the vtime accounting issues which were around for quite some time - Use types known to user space in UAPI headers to unbreak user space builds - Make load balancing respect the current scheduling domain again instead of evaluating unrelated CPUs" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/headers/uapi: Fix linux/sched/types.h userspace compilation errors sched/fair: Fix load_balance() affinity redo path sched/cputime: Accumulate vtime on top of nsec clocksource sched/cputime: Move the vtime task fields to their own struct sched/cputime: Rename vtime fields sched/cputime: Always set tsk->vtime_snap_whence after accounting vtime vtime, sched/cputime: Remove vtime_account_user() Revert "sched/cputime: Refactor the cputime_adjust() code"
2017-07-09Merge tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscryptLinus Torvalds1-0/+2
Pull fscrypt updates from Ted Ts'o: "Add support for 128-bit AES and some cleanups to fscrypt" * tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt: fscrypt: make ->dummy_context() return bool fscrypt: add support for AES-128-CBC fscrypt: inline fscrypt_free_filename()
2017-07-08Merge tag 'pci-v4.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pciLinus Torvalds2-0/+4
Pull PCI updates from Bjorn Helgaas: - add sysfs max_link_speed/width, current_link_speed/width (Wong Vee Khee) - make host bridge IRQ mapping much more generic (Matthew Minter, Lorenzo Pieralisi) - convert most drivers to pci_scan_root_bus_bridge() (Lorenzo Pieralisi) - mutex sriov_configure() (Jakub Kicinski) - mutex pci_error_handlers callbacks (Christoph Hellwig) - split ->reset_notify() into ->reset_prepare()/reset_done() (Christoph Hellwig) - support multiple PCIe portdrv interrupts for MSI as well as MSI-X (Gabriele Paoloni) - allocate MSI/MSI-X vector for Downstream Port Containment (Gabriele Paoloni) - fix MSI IRQ affinity pre/post/min_vecs issue (Michael Hernandez) - test INTx masking during enumeration, not at run-time (Piotr Gregor) - avoid using device_may_wakeup() for runtime PM (Rafael J. Wysocki) - restore the status of PCI devices across hibernation (Chen Yu) - keep parent resources that start at 0x0 (Ard Biesheuvel) - enable ECRC only if device supports it (Bjorn Helgaas) - restore PRI and PASID state after Function-Level Reset (CQ Tang) - skip DPC event if device is not present (Keith Busch) - check domain when matching SMBIOS info (Sujith Pandel) - mark Intel XXV710 NIC INTx masking as broken (Alex Williamson) - avoid AMD SB7xx EHCI USB wakeup defect (Kai-Heng Feng) - work around long-standing Macbook Pro poweroff issue (Bjorn Helgaas) - add Switchtec "running" status flag (Logan Gunthorpe) - fix dra7xx incorrect RW1C IRQ register usage (Arvind Yadav) - modify xilinx-nwl IRQ chip for legacy interrupts (Bharat Kumar Gogada) - move VMD SRCU cleanup after bus, child device removal (Jon Derrick) - add Faraday clock handling (Linus Walleij) - configure Rockchip MPS and reorganize (Shawn Lin) - limit Qualcomm TLP size to 2K (hardware issue) (Srinivas Kandagatla) - support Tegra MSI 64-bit addressing (Thierry Reding) - use Rockchip normal (not privileged) register bank (Shawn Lin) - add HiSilicon Kirin SoC PCIe controller driver (Xiaowei Song) - add Sigma Designs Tango SMP8759 PCIe controller driver (Marc Gonzalez) - add MediaTek PCIe host controller support (Ryder Lee) - add Qualcomm IPQ4019 support (John Crispin) - add HyperV vPCI protocol v1.2 support (Jork Loeser) - add i.MX6 regulator support (Quentin Schulz) * tag 'pci-v4.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (113 commits) PCI: tango: Add Sigma Designs Tango SMP8759 PCIe host bridge support PCI: Add DT binding for Sigma Designs Tango PCIe controller PCI: rockchip: Use normal register bank for config accessors dt-bindings: PCI: Add documentation for MediaTek PCIe PCI: Remove __pci_dev_reset() and pci_dev_reset() PCI: Split ->reset_notify() method into ->reset_prepare() and ->reset_done() PCI: xilinx: Make of_device_ids const PCI: xilinx-nwl: Modify IRQ chip for legacy interrupts PCI: vmd: Move SRCU cleanup after bus, child device removal PCI: vmd: Correct comment: VMD domains start at 0x10000, not 0x1000 PCI: versatile: Add local struct device pointers PCI: tegra: Do not allocate MSI target memory PCI: tegra: Support MSI 64-bit addressing PCI: rockchip: Use local struct device pointer consistently PCI: rockchip: Check for clk_prepare_enable() errors during resume MAINTAINERS: Remove Wenrui Li as Rockchip PCIe driver maintainer PCI: rockchip: Configure RC's MPS setting PCI: rockchip: Reconfigure configuration space header type PCI: rockchip: Split out rockchip_pcie_cfg_configuration_accesses() PCI: rockchip: Move configuration accesses into rockchip_pcie_cfg_atu() ...
2017-07-08Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/inputLinus Torvalds1-0/+1
Pull input updates from Dmitry Torokhov: - a new driver for STM FingerTip touchscreen - a new driver for D-Link DIR-685 touch keys - updated list of supported devices in xpad driver - other assorted updates and fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (23 commits) MAINTAINERS: update input subsystem patterns Input: introduce KEY_ASSISTANT Input: xpad - sync supported devices with XBCD Input: xpad - sync supported devices with 360Controller Input: xen-kbdfront - use string constants from PV protocol Input: stmfts - mark all PM functions as __maybe_unused Input: add support for the STMicroelectronics FingerTip touchscreen Input: add D-Link DIR-685 touchkeys driver Input: s3c2410_ts - handle return value of clk_prepare_enable Input: axp20x-pek - add wakeup support Input: synaptics-rmi4 - use %phN to form F34 configuration ID Input: synaptics-rmi4 - change a char type to u8 Input: sparse-keymap - remove sparse_keymap_free() Input: tsc2007 - move header file out of I2C realm Input: mms114 - move header file out of I2C realm Input: mcs - move header file out of I2C realm Input: lm8323 - move header file out of I2C realm Input: elantech - force relative mode on a certain module Input: elan_i2c - add support for fetching chip type on newer hardware Input: elan_i2c - check if device is there before really probing ...
2017-07-08sched/headers/uapi: Fix linux/sched/types.h userspace compilation errorsDmitry V. Levin1-8/+8
Consistently use types provided by <linux/types.h> to fix the following linux/sched/types.h userspace compilation errors: /usr/include/linux/sched/types.h:57:2: error: unknown type name 'u32' u32 size; ... u64 sched_period; Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org # v4.12 Fixes: e2d1e2aec572 ("sched/headers: Move various ABI definitions to <uapi/linux/sched/types.h>") Link: http://lkml.kernel.org/r/20170705162328.GA11026@altlinux.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-07Merge tag 'libnvdimm-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimmLinus Torvalds1-1/+41
Pull libnvdimm updates from Dan Williams: "libnvdimm updates for the latest ACPI and UEFI specifications. This pull request also includes new 'struct dax_operations' enabling to undo the abuse of copy_user_nocache() for copy operations to pmem. The dax work originally missed 4.12 to address concerns raised by Al. Summary: - Introduce the _flushcache() family of memory copy helpers and use them for persistent memory write operations on x86. The _flushcache() semantic indicates that the cache is either bypassed for the copy operation (movnt) or any lines dirtied by the copy operation are written back (clwb, clflushopt, or clflush). - Extend dax_operations with ->copy_from_iter() and ->flush() operations. These operations and other infrastructure updates allow all persistent memory specific dax functionality to be pushed into libnvdimm and the pmem driver directly. It also allows dax-specific sysfs attributes to be linked to a host device, for example: /sys/block/pmem0/dax/write_cache - Add support for the new NVDIMM platform/firmware mechanisms introduced in ACPI 6.2 and UEFI 2.7. This support includes the v1.2 namespace label format, extensions to the address-range-scrub command set, new error injection commands, and a new BTT (block-translation-table) layout. These updates support inter-OS and pre-OS compatibility. - Fix a longstanding memory corruption bug in nfit_test. - Make the pmem and nvdimm-region 'badblocks' sysfs files poll(2) capable. - Miscellaneous fixes and small updates across libnvdimm and the nfit driver. Acknowledgements that came after the branch was pushed: commit 6aa734a2f38e ("libnvdimm, region, pmem: fix 'badblocks' sysfs_get_dirent() reference lifetime") was reviewed by Toshi Kani <toshi.kani@hpe.com>" * tag 'libnvdimm-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (42 commits) libnvdimm, namespace: record 'lbasize' for pmem namespaces acpi/nfit: Issue Start ARS to retrieve existing records libnvdimm: New ACPI 6.2 DSM functions acpi, nfit: Show bus_dsm_mask in sysfs libnvdimm, acpi, nfit: Add bus level dsm mask for pass thru. acpi, nfit: Enable DSM pass thru for root functions. libnvdimm: passthru functions clear to send libnvdimm, btt: convert some info messages to warn/err libnvdimm, region, pmem: fix 'badblocks' sysfs_get_dirent() reference lifetime libnvdimm: fix the clear-error check in nsio_rw_bytes libnvdimm, btt: fix btt_rw_page not returning errors acpi, nfit: quiet invalid block-aperture-region warnings libnvdimm, btt: BTT updates for UEFI 2.7 format acpi, nfit: constify *_attribute_group libnvdimm, pmem: disable dax flushing when pmem is fronting a volatile region libnvdimm, pmem, dax: export a cache control attribute dax: convert to bitmask for flags dax: remove default copy_from_iter fallback libnvdimm, nfit: enable support for volatile ranges libnvdimm, pmem: fix persistence warning ...
2017-07-06tcmu: perfom device add, del and reconfig synchronouslyMike Christie1-0/+7
This makes the device add, del reconfig operations sync. It fixes the issue where for add and reconfig, we do not know if userspace successfully completely the operation, so we leave invalid kernel structs or report incorrect status for the config/reconfig operations. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06tcmu: reconfigure netlink attr changesMike Christie1-8/+4
1. TCMU_ATTR_TYPE is too generic when it describes only the reconfiguration type, so rename to TCMU_ATTR_RECONFIG_TYPE. 2. Only return the reconfig type when it is a TCMU_CMD_RECONFIG_DEVICE command. 3. CONFIG_* type is not needed. We can pass the value along with an ATTR to userspace, so it does not need to read sysfs/configfs. 4. Fix leak in tcmu_dev_path_store and rename to dev_config to reflect it is more than just a path that can be changed. 6. Don't update kernel struct value if netlink sending fails. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: "Bryant G. Ly" <bryantly@linux.vnet.ibm.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06tcmu: Add Type of reconfig into netlinkBryant G. Ly1-0/+8
This patch adds more info about the attribute being changed, so that usersapce can easily figure out what is happening. Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Reviewed-By: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06tcmu: Add netlink for device reconfigurationBryant G. Ly1-0/+1
This gives tcmu the ability to handle events that can cause reconfiguration, such as resize, path changes, write_cache, etc... Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Reviewed-By: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06Merge branch 'akpm' (patches from Andrew)Linus Torvalds2-8/+1
Merge misc updates from Andrew Morton: - a few hotfixes - various misc updates - ocfs2 updates - most of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (108 commits) mm, memory_hotplug: move movable_node to the hotplug proper mm, memory_hotplug: drop CONFIG_MOVABLE_NODE mm, memory_hotplug: drop artificial restriction on online/offline mm: memcontrol: account slab stats per lruvec mm: memcontrol: per-lruvec stats infrastructure mm: memcontrol: use generic mod_memcg_page_state for kmem pages mm: memcontrol: use the node-native slab memory counters mm: vmstat: move slab statistics from zone to node counters mm/zswap.c: delete an error message for a failed memory allocation in zswap_dstmem_prepare() mm/zswap.c: improve a size determination in zswap_frontswap_init() mm/zswap.c: delete an error message for a failed memory allocation in zswap_pool_create() mm/swapfile.c: sort swap entries before free mm/oom_kill: count global and memory cgroup oom kills mm: per-cgroup memory reclaim stats mm: kmemleak: treat vm_struct as alternative reference to vmalloc'ed objects mm: kmemleak: factor object reference updating out of scan_block() mm: kmemleak: slightly reduce the size of some structures on 64-bit architectures mm, mempolicy: don't check cpuset seqlock where it doesn't matter mm, cpuset: always use seqlock when changing task's nodemask mm, mempolicy: simplify rebinding mempolicies when updating cpusets ...
2017-07-06Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+35
Pull KVM updates from Paolo Bonzini: "PPC: - Better machine check handling for HV KVM - Ability to support guests with threads=2, 4 or 8 on POWER9 - Fix for a race that could cause delayed recognition of signals - Fix for a bug where POWER9 guests could sleep with interrupts pending. ARM: - VCPU request overhaul - allow timer and PMU to have their interrupt number selected from userspace - workaround for Cavium erratum 30115 - handling of memory poisonning - the usual crop of fixes and cleanups s390: - initial machine check forwarding - migration support for the CMMA page hinting information - cleanups and fixes x86: - nested VMX bugfixes and improvements - more reliable NMI window detection on AMD - APIC timer optimizations Generic: - VCPU request overhaul + documentation of common code patterns - kvm_stat improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (124 commits) Update my email address kvm: vmx: allow host to access guest MSR_IA32_BNDCFGS x86: kvm: mmu: use ept a/d in vmcs02 iff used in vmcs12 kvm: x86: mmu: allow A/D bits to be disabled in an mmu x86: kvm: mmu: make spte mmio mask more explicit x86: kvm: mmu: dead code thanks to access tracking KVM: PPC: Book3S: Fix typo in XICS-on-XIVE state saving code KVM: PPC: Book3S HV: Close race with testing for signals on guest entry KVM: PPC: Book3S HV: Simplify dynamic micro-threading code KVM: x86: remove ignored type attribute KVM: LAPIC: Fix lapic timer injection delay KVM: lapic: reorganize restart_apic_timer KVM: lapic: reorganize start_hv_timer kvm: nVMX: Check memory operand to INVVPID KVM: s390: Inject machine check into the nested guest KVM: s390: Inject machine check into the guest tools/kvm_stat: add new interactive command 'b' tools/kvm_stat: add new command line switch '-i' tools/kvm_stat: fix error on interactive command 'g' KVM: SVM: suppress unnecessary NMI singlestep on GIF=0 and nested exit ...
2017-07-06mm, mempolicy: simplify rebinding mempolicies when updating cpusetsVlastimil Babka1-8/+0
Commit c0ff7453bb5c ("cpuset,mm: fix no node to alloc memory when changing cpuset's mems") has introduced a two-step protocol when rebinding task's mempolicy due to cpuset update, in order to avoid a parallel allocation seeing an empty effective nodemask and failing. Later, commit cc9a6c877661 ("cpuset: mm: reduce large amounts of memory barrier related damage v3") introduced a seqlock protection and removed the synchronization point between the two update steps. At that point (or perhaps later), the two-step rebinding became unnecessary. Currently it only makes sure that the update first adds new nodes in step 1 and then removes nodes in step 2. Without memory barriers the effects are questionable, and even then this cannot prevent a parallel zonelist iteration checking the nodemask at each step to observe all nodes as unusable for allocation. We now fully rely on the seqlock to prevent premature OOMs and allocation failures. We can thus remove the two-step update parts and simplify the code. Link: http://lkml.kernel.org/r/20170517081140.30654-5-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dimitri Sivanich <sivanich@sgi.com> Cc: Hugh Dickins <hughd@google.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-06ocfs2: use magic.hFabian Frederick1-0/+1
Filesystems generally use SUPER_MAGIC values from magic.h instead of a local definition. Link: http://lkml.kernel.org/r/20170521154217.27917-1-fabf@skynet.be Signed-off-by: Fabian Frederick <fabf@skynet.be> Reviewed-by: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-06Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-2/+83
Pull SCSI updates from James Bottomley: "This is mostly updates of the usual suspects: lpfc, qla2xxx, bnx2fc, qedf, hpsa, hisi_sas, smartpqi, cxlflash, aacraid, csiostor along with a host of minor and miscellaneous changes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (276 commits) qla2xxx: Fix NVMe entry_type for iocb packet on BE system scsi: qla2xxx: avoid unused-function warning scsi: snic: fix a couple of spelling mistakes/typos scsi: qla2xxx: fix a bunch of typos and spelling mistakes scsi: lpfc: don't double count abort errors scsi: lpfc: spin_lock_irq() is not nestable scsi: hisi_sas: optimise DMA slot memory scsi: ibmvfc: constify dev_pm_ops structures. scsi: ibmvscsi: constify dev_pm_ops structures. scsi: cxlflash: Update debug prints in reset handlers scsi: cxlflash: Update send_tmf() parameters scsi: cxlflash: Avoid double free of character device scsi: Add STARGET_CREATED_REMOVE state to scsi_target_state scsi: ses: do not add a device to an enclosure if enclosure_add_links() fails. scsi: ufs: flush eh_work when eh_work scheduled. scsi: qla2xxx: Protect access to qpair members with qpair->qp_lock scsi: sun_esp: fix device reference leaks scsi: fnic: changing queue command to return result DID_IMM_RETRY when rport is init scsi: fnic: correct speed display and add support for 25,40 and 100G scsi: fnic: added timestamp reporting in fnic debug stats ...
2017-07-06Merge tag 'for-4.13/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dmLinus Torvalds1-1/+3
Pull device mapper updates from Mike Snitzer: - Add the ability to use select or poll /dev/mapper/control to wait for events from multiple DM devices. - Convert DM's printk macros over to using pr_<level> macros. - Add a big-endian variant of plain64 IV to dm-crypt. - Add support for zoned (aka SMR) devices to DM core. DM kcopyd was also improved to provide a sequential write feature needed by zoned devices. - Introduce DM zoned target that provides support for host-managed zoned devices, the result dm-zoned device acts as a drive-managed interface to the underlying host-managed device. - A DM raid fix to avoid using BUG() for error handling. * tag 'for-4.13/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm zoned: fix overflow when converting zone ID to sectors dm raid: stop using BUG() in __rdev_sectors() dm zoned: drive-managed zoned block device target dm kcopyd: add sequential write feature dm linear: add support for zoned block devices dm flakey: add support for zoned block devices dm: introduce dm_remap_zone_report() dm: fix REQ_OP_ZONE_REPORT bio handling dm: fix REQ_OP_ZONE_RESET bio handling dm table: add zoned block devices validation dm: convert DM printk macros to pr_<level> macros dm crypt: add big-endian variant of plain64 IV dm bio prison: use rb_entry() rather than container_of() dm ioctl: report event number in DM_LIST_DEVICES dm ioctl: add a new DM_DEV_ARM_POLL ioctl dm: add basic support for using the select or poll function