aboutsummaryrefslogtreecommitdiffstats
path: root/include/rdma/ib.h (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2018-05-03RDMA/nldev: add driver-specific resource trackingSteve Wise4-0/+68
Each driver can register a "fill entry" function with the restrack core. This function will be called when filling out a resource, allowing the driver to add driver-specific details. The details consist of a nltable of nested attributes, that are in the form of <key, [print-type], value> tuples. Both key and value attributes are mandatory. The key nlattr must be a string, and the value nlattr can be one of the driver attributes that are generic, but typed, allowing the attributes to be validated. Currently the driver nlattr types include string, s32, u32, s64, and u64. The print-type nlattr allows a driver to specify an alternative display format for user tools displaying the attribute. For example, a u32 attribute will default to "%u", but a print-type attribute can be included for it to be displayed in hex. This allows the user tool to print the number in the format desired by the driver driver. More attrs can be defined as they become needed by drivers. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-05-03RDMA/nldev: Add explicit pad attributeSteve Wise2-11/+18
Add a specific RDMA_NLDEV_ATTR_PAD attribute to be used for 64b attribute padding. To preserve the ABI, make this attribute equal to RDMA_NLDEV_ATTR_UNSPEC, which has a value of 0, because that has been used up until now as the pad attribute. Change all the previous use of 0 as the pad with this new enum. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-05-01RDMA/qedr: fix spelling mistake: "failes" -> "fails"Colin Ian King1-1/+1
Trivial fix to spelling mistake in DP_ERR error message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-05-01IB/cxgb4: use skb_put_zero()/__skb_put_zeroYueHaibing2-9/+4
Use the recently introduced helper to replace the pattern of skb_put_zero/__skb_put() && memset(). Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-05-01IB/core: Use CONFIG_SECURITY_INFINIBAND to compile out security codeParav Pandit2-5/+2
Make security.c depends on CONFIG_SECURITY_INFINIBAND. Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27IB/rxe: remove unused function variableZhu Yanjun6-21/+17
In the functions rxe_mem_init_dma, rxe_mem_init_user, rxe_mem_init_fast and copy_data, the function variable rxe is not used. So this function variable rxe is removed. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27IB/rxe: change rxe_set_mtu function type to voidZhu Yanjun2-7/+3
The function rxe_set_mtu always returns zero. So this function type is changed to void. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27IB/rxe: Change rxe_rcv to return voidYuval Shaia4-9/+12
It always returns 0. Change return type to void. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27infiniband: hw: qib: Change return type to vm_fault_tSouptick Joarder1-1/+1
Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Reference id -> 1c8f422059ae ("mm: change return type to vm_fault_t") Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27infiniband: hw: hfi1: Change return type to vm_fault_tSouptick Joarder1-2/+2
Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Reference id -> 1c8f422059ae ("mm: change return type to vm_fault_t") Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-19IB/rxe: replace refcount_inc with skb_getZhu Yanjun2-2/+2
Follow the advice from Bart, the function refcount_inc is replaced with skb_get in commit 99dae690255e ("IB/rxe: optimize mcast recv process") and commit 86af61764151 ("IB/rxe: remove unnecessary skb_clone"). CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Suggested-by: Bart Van Assche <Bart.VanAssche@wdc.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-19IB/rxe: optimize the function duplicate_requestZhu Yanjun1-14/+3
In the function duplicate_request, the reference of skb can be increased to replace the function skb_clone. This will make rxe performace better and save memory. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-19IB/rxe: make rxe_release_udp_tunnel staticZhu Yanjun2-3/+1
The function rxe_release_udp_tunnel is only used in rxe_net.c. So it is necessary to make this function as static. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-17infiniband: i40iw: Replace GFP_ATOMIC with GFP_KERNEL in i40iw_l2param_changeJia-Ju Bai1-1/+1
i40iw_l2param_change() is never called in atomic context. i40iw_make_listen_node() is only set as ".l2_param_change" in struct i40e_client_ops, and this function pointer is not called in atomic context. Despite never getting called from atomic context, i40iw_l2param_change() calls kzalloc() with GFP_ATOMIC, which does not sleep for allocation. GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which can sleep and improve the possibility of sucessful allocation. This is found by a static analysis tool named DCNS written by myself. And I also manually check it. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-17infiniband: i40iw: Replace GFP_ATOMIC with GFP_KERNEL in i40iw_make_listen_nodeJia-Ju Bai1-1/+1
i40iw_make_listen_node() is never called in atomic context. i40iw_make_listen_node() is only called by i40iw_create_listen, which is set as ".create_listen" in struct iw_cm_verbs. Despite never getting called from atomic context, i40iw_make_listen_node() calls kzalloc() with GFP_ATOMIC, which does not sleep for allocation. GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which can sleep and improve the possibility of sucessful allocation. This is found by a static analysis tool named DCNS written by myself. And I also manually check it. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-17infiniband: i40iw: Replace GFP_ATOMIC with GFP_KERNEL in i40iw_add_mqh_4Jia-Ju Bai1-1/+1
i40iw_add_mqh_4() is never called in atomic context, because it calls rtnl_lock() that can sleep. Despite never getting called from atomic context, i40iw_add_mqh_4() calls kzalloc() with GFP_ATOMIC, which does not sleep for allocation. GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which can sleep and improve the possibility of sucessful allocation. This is found by a static analysis tool named DCNS written by myself. And I also manually check it. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-17IB/rxe: avoid export symbolsZhu Yanjun1-3/+0
The functions rxe_set_mtu, rxe_add and rxe_remove are only used in their own module. So it is not necessary to export them. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-17IB/rxe: make the variable staticZhu Yanjun2-2/+1
The variable rxe_net_notifier is only used in the file rxe_net.c. So remove it from rxe_net.h file and make it static in the file rxe_net.c. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-17RDMA/rdma_cm: Delete rdma_addr_clientJason Gunthorpe3-58/+6
The only thing it does is block module unload while work is posted from rdma_resolve_ip(). However, this is not the right place to do this. The users of rdma_resolve_ip() must ensure their own module does not unload until rdma_resolve_ip() calls the callback, or until rdma_addr_cancel() is called. Similarly callers to rdma_addr_find_l2_eth_by_grh() must ensure their module does not unload while they are calling code. The only two users are already safe, so there is no need for this. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-17RDMA/rdma_cm: Make rdma_addr_cancel into a fenceJason Gunthorpe1-18/+40
Currently rdma_addr_cancel does not prevent the callback from being used, this is surprising and hard to reason about. There does not appear to be a bug here as the only user of this API does refcount properly, fixing it only to increase clarity. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-17RDMA/rdma_cm: Remove process_req and timer sortingJason Gunthorpe1-71/+25
Now that the work queue is used directly to launch and track the work there is no need for the second processing function to do 'all list entries'. Just schedule all entries onto the main work queue directly. We can also drop all of the useless list sorting now, as the workqueue sorts by expiration time automatically. This change requires switching lock to a spinlock as netdev notifiers are called in an atomic context, this is now easy since the lock does not need to be held across the lookup, that is already single threaded due to the work queue. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-15Linux 4.17-rc1Linus Torvalds1-2/+2
2018-04-13kernel/kexec_file.c: move purgatories sha256 to common codePhilipp Rudo5-4/+28
The code to verify the new kernels sha digest is applicable for all architectures. Move it to common code. One problem is the string.c implementation on x86. Currently sha256 includes x86/boot/string.h which defines memcpy and memset to be gcc builtins. By moving the sha256 implementation to common code and changing the include to linux/string.h both functions are no longer defined. Thus definitions have to be provided in x86/purgatory/string.c Link: http://lkml.kernel.org/r/20180321112751.22196-12-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13kernel/kexec_file.c: allow archs to set purgatory load addressPhilipp Rudo4-32/+31
For s390 new kernels are loaded to fixed addresses in memory before they are booted. With the current code this is a problem as it assumes the kernel will be loaded to an 'arbitrary' address. In particular, kexec_locate_mem_hole searches for a large enough memory region and sets the load address (kexec_bufer->mem) to it. Luckily there is a simple workaround for this problem. By returning 1 in arch_kexec_walk_mem, kexec_locate_mem_hole is turned off. This allows the architecture to set kbuf->mem by hand. While the trick works fine for the kernel it does not for the purgatory as here the architectures don't have access to its kexec_buffer. Give architectures access to the purgatories kexec_buffer by changing kexec_load_purgatory to take a pointer to it. With this change architectures have access to the buffer and can edit it as they need. A nice side effect of this change is that we can get rid of the purgatory_info->purgatory_load_address field. As now the information stored there can directly be accessed from kbuf->mem. Link: http://lkml.kernel.org/r/20180321112751.22196-11-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13kernel/kexec_file.c: remove mis-use of sh_offset field during purgatory loadPhilipp Rudo2-34/+13
The current code uses the sh_offset field in purgatory_info->sechdrs to store a pointer to the current load address of the section. Depending whether the section will be loaded or not this is either a pointer into purgatory_info->purgatory_buf or kexec_purgatory. This is not only a violation of the ELF standard but also makes the code very hard to understand as you cannot tell if the memory you are using is read-only or not. Remove this misuse and store the offset of the section in pugaroty_info->purgatory_buf in sh_offset. Link: http://lkml.kernel.org/r/20180321112751.22196-10-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13kernel/kexec_file.c: remove unneeded variables in kexec_purgatory_setup_sechdrsPhilipp Rudo1-22/+12
The main loop currently uses quite a lot of variables to update the section headers. Some of them are unnecessary. So clean them up a little. Link: http://lkml.kernel.org/r/20180321112751.22196-9-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13kernel/kexec_file.c: remove unneeded for-loop in kexec_purgatory_setup_sechdrsPhilipp Rudo1-46/+30
To update the entry point there is an extra loop over all section headers although this can be done in the main loop. So move it there and eliminate the extra loop and variable to store the 'entry section index'. Also, in the main loop, move the usual case, i.e. non-bss section, out of the extra if-block. Link: http://lkml.kernel.org/r/20180321112751.22196-8-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13kernel/kexec_file.c: split up __kexec_load_puragoryPhilipp Rudo1-97/+103
When inspecting __kexec_load_purgatory you find that it has two tasks 1) setting up the kexec_buffer for the new kernel and, 2) setting up pi->sechdrs for the final load address. The two tasks are independent of each other. To improve readability split up __kexec_load_purgatory into two functions, one for each task, and call them directly from kexec_load_purgatory. Link: http://lkml.kernel.org/r/20180321112751.22196-7-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13kernel/kexec_file.c: use read-only sections in arch_kexec_apply_relocations*Philipp Rudo3-61/+71
When the relocations are applied to the purgatory only the section the relocations are applied to is writable. The other sections, i.e. the symtab and .rel/.rela, are in read-only kexec_purgatory. Highlight this by marking the corresponding variables as 'const'. While at it also change the signatures of arch_kexec_apply_relocations* to take section pointers instead of just the index of the relocation section. This removes the second lookup and sanity check of the sections in arch code. Link: http://lkml.kernel.org/r/20180321112751.22196-6-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13kernel/kexec_file.c: search symbols in read-only kexec_purgatoryPhilipp Rudo1-16/+22
The stripped purgatory does not contain a symtab. So when looking for symbols this is done in read-only kexec_purgatory. Highlight this by marking the corresponding variables as 'const'. Link: http://lkml.kernel.org/r/20180321112751.22196-5-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13kernel/kexec_file.c: make purgatory_info->ehdr constPhilipp Rudo2-8/+13
The kexec_purgatory buffer is read-only. Thus all pointers into kexec_purgatory are read-only, too. Point this out by explicitly marking purgatory_info->ehdr as 'const' and update the comments in purgatory_info. Link: http://lkml.kernel.org/r/20180321112751.22196-4-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13kernel/kexec_file.c: remove checks in kexec_purgatory_loadPhilipp Rudo1-14/+0
Before the purgatory is loaded several checks are done whether the ELF file in kexec_purgatory is valid or not. These checks are incomplete. For example they don't check for the total size of the sections defined in the section header table or if the entry point actually points into the purgatory. On the other hand the purgatory, although an ELF file on its own, is part of the kernel. Thus not trusting the purgatory means not trusting the kernel build itself. So remove all validity checks on the purgatory and just trust the kernel build. Link: http://lkml.kernel.org/r/20180321112751.22196-3-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13include/linux/kexec.h: silence compile warningsPhilipp Rudo1-0/+2
Patch series "kexec_file: Clean up purgatory load", v2. Following the discussion with Dave and AKASHI, here are the common code patches extracted from my recent patch set (Add kexec_file_load support to s390) [1]. The patches were extracted to allow upstream integration together with AKASHI's common code patches before the arch code gets adjusted to the new base. The reason for this series is to prepare common code for adding kexec_file_load to s390 as well as cleaning up the mis-use of the sh_offset field during purgatory load. In detail this series contains: Patch #1&2: Minor cleanups/fixes. Patch #3-9: Clean up the purgatory load/relocation code. Especially remove the mis-use of the purgatory_info->sechdrs->sh_offset field, currently holding a pointer into either kexec_purgatory (ro) or purgatory_buf (rw) depending on the section. With these patches the section address will be calculated verbosely and sh_offset will contain the offset of the section in the stripped purgatory binary (purgatory_buf). Patch #10: Allows architectures to set the purgatory load address. This patch is important for s390 as the kernel and purgatory have to be loaded to fixed addresses. In current code this is impossible as the purgatory load is opaque to the architecture. Patch #11: Moves x86 purgatories sha implementation to common lib/ directory to allow reuse in other architectures. This patch (of 11) When building the kernel with CONFIG_KEXEC_FILE enabled gcc prints a compile warning multiple times. In file included from <path>/linux/init/initramfs.c:526:0: <path>/include/linux/kexec.h:120:9: warning: `struct kimage' declared inside parameter list [enabled by default] unsigned long cmdline_len); ^ This is because the typedefs for kexec_file_load uses struct kimage before it is declared. Fix this by simply forward declaring struct kimage. Link: http://lkml.kernel.org/r/20180321112751.22196-2-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13kexec_file, x86: move re-factored code to generic sideAKASHI Takahiro3-188/+201
In the previous patches, commonly-used routines, exclude_mem_range() and prepare_elf64_headers(), were carved out. Now place them in kexec common code. A prefix "crash_" is given to each of their names to avoid possible name collisions. Link: http://lkml.kernel.org/r/20180306102303.9063-8-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13x86: kexec_file: clean up prepare_elf64_headers()AKASHI Takahiro1-11/+7
Removing bufp variable in prepare_elf64_headers() makes the code simpler and more understandable. Link: http://lkml.kernel.org/r/20180306102303.9063-7-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13x86: kexec_file: lift CRASH_MAX_RANGES limit on crash_mem bufferAKASHI Takahiro1-51/+31
While CRASH_MAX_RANGES (== 16) seems to be good enough, fixed-number array is not a good idea in general. In this patch, size of crash_mem buffer is calculated as before and the buffer is now dynamically allocated. This change also allows removing crash_elf_data structure. Link: http://lkml.kernel.org/r/20180306102303.9063-6-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13x86: kexec_file: remove X86_64 dependency from prepare_elf64_headers()AKASHI Takahiro1-12/+12
The code guarded by CONFIG_X86_64 is necessary on some architectures which have a dedicated kernel mapping outside of linear memory mapping. (arm64 is among those.) In this patch, an additional argument, kernel_map, is added to enable/ disable the code removing #ifdef. Link: http://lkml.kernel.org/r/20180306102303.9063-5-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13x86: kexec_file: purge system-ram walking from prepare_elf64_headers()AKASHI Takahiro1-63/+58
While prepare_elf64_headers() in x86 looks pretty generic for other architectures' use, it contains some code which tries to list crash memory regions by walking through system resources, which is not always architecture agnostic. To make this function more generic, the related code should be purged. In this patch, prepare_elf64_headers() simply scans crash_mem buffer passed and add all the listed regions to elf header as a PT_LOAD segment. So walk_system_ram_res(prepare_elf64_headers_callback) have been moved forward before prepare_elf64_headers() where the callback, prepare_elf64_headers_callback(), is now responsible for filling up crash_mem buffer. Meanwhile exclude_elf_header_ranges() used to be called every time in this callback it is rather redundant and now called only once in prepare_elf_headers() as well. Link: http://lkml.kernel.org/r/20180306102303.9063-4-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13kexec_file,x86,powerpc: factor out kexec_file_ops functionsAKASHI Takahiro8-94/+71
As arch_kexec_kernel_image_{probe,load}(), arch_kimage_file_post_load_cleanup() and arch_kexec_kernel_verify_sig() are almost duplicated among architectures, they can be commonalized with an architecture-defined kexec_file_ops array. So let's factor them out. Link: http://lkml.kernel.org/r/20180306102303.9063-3-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13kexec_file: make use of purgatory optionalAKASHI Takahiro3-0/+11
Patch series "kexec_file, x86, powerpc: refactoring for other architecutres", v2. This is a preparatory patchset for adding kexec_file support on arm64. It was originally included in a arm64 patch set[1], but Philipp is also working on their kexec_file support on s390[2] and some changes are now conflicting. So these common parts were extracted and put into a separate patch set for better integration. What's more, my original patch#4 was split into a few small chunks for easier review after Dave's comment. As such, the resulting code is basically identical with my original, and the only *visible* differences are: - renaming of _kexec_kernel_image_probe() and _kimage_file_post_load_cleanup() - change one of types of arguments at prepare_elf64_headers() Those, unfortunately, require a couple of trivial changes on the rest (#1, #6 to #13) of my arm64 kexec_file patch set[1]. Patch #1 allows making a use of purgatory optional, particularly useful for arm64. Patch #2 commonalizes arch_kexec_kernel_{image_probe, image_load, verify_sig}() and arch_kimage_file_post_load_cleanup() across architectures. Patches #3-#7 are also intended to generalize parse_elf64_headers(), along with exclude_mem_range(), to be made best re-use of. [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-February/561182.html [2] http://lkml.iu.edu//hypermail/linux/kernel/1802.1/02596.html This patch (of 7): On arm64, crash dump kernel's usable memory is protected by *unmapping* it from kernel virtual space unlike other architectures where the region is just made read-only. It is highly unlikely that the region is accidentally corrupted and this observation rationalizes that digest check code can also be dropped from purgatory. The resulting code is so simple as it doesn't require a bit ugly re-linking/relocation stuff, i.e. arch_kexec_apply_relocations_add(). Please see: http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html All that the purgatory does is to shuffle arguments and jump into a new kernel, while we still need to have some space for a hash value (purgatory_sha256_digest) which is never checked against. As such, it doesn't make sense to have trampline code between old kernel and new kernel on arm64. This patch introduces a new configuration, ARCH_HAS_KEXEC_PURGATORY, and allows related code to be compiled in only if necessary. [takahiro.akashi@linaro.org: fix trivial screwup] Link: http://lkml.kernel.org/r/20180309093346.GF25863@linaro.org Link: http://lkml.kernel.org/r/20180306102303.9063-2-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13proc: revalidate misc dentriesAlexey Dobriyan1-1/+22
If module removes proc directory while another process pins it by chdir'ing to it, then subsequent recreation of proc entry and all entries down the tree will not be visible to any process until pinning process unchdir from directory and unpins everything. Steps to reproduce: proc_mkdir("aaa", NULL); proc_create("aaa/bbb", ...); chdir("/proc/aaa"); remove_proc_entry("aaa/bbb", NULL); remove_proc_entry("aaa", NULL); proc_mkdir("aaa", NULL); # inaccessible because "aaa" dentry still points # to the original "aaa". proc_create("aaa/bbb", ...); Fix is to implement ->d_revalidate and ->d_delete. Link: http://lkml.kernel.org/r/20180312201938.GA4871@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13mm, slab: reschedule cache_reap() on the same CPUVlastimil Babka1-1/+2
cache_reap() is initially scheduled in start_cpu_timer() via schedule_delayed_work_on(). But then the next iterations are scheduled via schedule_delayed_work(), i.e. using WORK_CPU_UNBOUND. Thus since commit ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs") there is no guarantee the future iterations will run on the originally intended cpu, although it's still preferred. I was able to demonstrate this with /sys/module/workqueue/parameters/debug_force_rr_cpu. IIUC, it may also happen due to migrating timers in nohz context. As a result, some cpu's would be calling cache_reap() more frequently and others never. This patch uses schedule_delayed_work_on() with the current cpu when scheduling the next iteration. Link: http://lkml.kernel.org/r/20180411070007.32225-1-vbabka@suse.cz Fixes: ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Pekka Enberg <penberg@kernel.org> Acked-by: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: David Rientjes <rientjes@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Stephen Boyd <sboyd@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13kexec: export PG_swapbacked to VMCOREINFOPetr Tesarik1-0/+1
Since commit 6326fec1122c ("mm: Use owner_priv bit for PageSwapCache, valid when PageSwapBacked"), PG_swapcache is an alias for PG_owner_priv_1, which may be also used for other purposes. To know whether the bit indeed has the PG_swapcache meaning, it is necessary to check PG_swapbacked, hence this bit must be exported. Link: http://lkml.kernel.org/r/20180410161345.142e142d@ezekiel.suse.cz Signed-off-by: Petr Tesarik <ptesarik@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Young <dyoung@redhat.com> Cc: Xunlei Pang <xlpang@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: "Marc-Andr Lureau" <marcandre.lureau@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13ipc/shm: fix use-after-free of shm file via remap_file_pages()Eric Biggers1-3/+20
syzbot reported a use-after-free of shm_file_data(file)->file->f_op in shm_get_unmapped_area(), called via sys_remap_file_pages(). Unfortunately it couldn't generate a reproducer, but I found a bug which I think caused it. When remap_file_pages() is passed a full System V shared memory segment, the memory is first unmapped, then a new map is created using the ->vm_file. Between these steps, the shm ID can be removed and reused for a new shm segment. But, shm_mmap() only checks whether the ID is currently valid before calling the underlying file's ->mmap(); it doesn't check whether it was reused. Thus it can use the wrong underlying file, one that was already freed. Fix this by making the "outer" shm file (the one that gets put in ->vm_file) hold a reference to the real shm file, and by making __shm_open() require that the file associated with the shm ID matches the one associated with the "outer" file. Taking the reference to the real shm file is needed to fully solve the problem, since otherwise sfd->file could point to a freed file, which then could be reallocated for the reused shm ID, causing the wrong shm segment to be mapped (and without the required permission checks). Commit 1ac0b6dec656 ("ipc/shm: handle removed segments gracefully in shm_mmap()") almost fixed this bug, but it didn't go far enough because it didn't consider the case where the shm ID is reused. The following program usually reproduces this bug: #include <stdlib.h> #include <sys/shm.h> #include <sys/syscall.h> #include <unistd.h> int main() { int is_parent = (fork() != 0); srand(getpid()); for (;;) { int id = shmget(0xF00F, 4096, IPC_CREAT|0700); if (is_parent) { void *addr = shmat(id, NULL, 0); usleep(rand() % 50); while (!syscall(__NR_remap_file_pages, addr, 4096, 0, 0, 0)); } else { usleep(rand() % 50); shmctl(id, IPC_RMID, NULL); } } } It causes the following NULL pointer dereference due to a 'struct file' being used while it's being freed. (I couldn't actually get a KASAN use-after-free splat like in the syzbot report. But I think it's possible with this bug; it would just take a more extraordinary race...) BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 9 PID: 258 Comm: syz_ipc Not tainted 4.16.0-05140-gf8cf2f16a7c95 #189 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014 RIP: 0010:d_inode include/linux/dcache.h:519 [inline] RIP: 0010:touch_atime+0x25/0xd0 fs/inode.c:1724 [...] Call Trace: file_accessed include/linux/fs.h:2063 [inline] shmem_mmap+0x25/0x40 mm/shmem.c:2149 call_mmap include/linux/fs.h:1789 [inline] shm_mmap+0x34/0x80 ipc/shm.c:465 call_mmap include/linux/fs.h:1789 [inline] mmap_region+0x309/0x5b0 mm/mmap.c:1712 do_mmap+0x294/0x4a0 mm/mmap.c:1483 do_mmap_pgoff include/linux/mm.h:2235 [inline] SYSC_remap_file_pages mm/mmap.c:2853 [inline] SyS_remap_file_pages+0x232/0x310 mm/mmap.c:2769 do_syscall_64+0x64/0x1a0 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ebiggers@google.com: add comment] Link: http://lkml.kernel.org/r/20180410192850.235835-1-ebiggers3@gmail.com Link: http://lkml.kernel.org/r/20180409043039.28915-1-ebiggers3@gmail.com Reported-by: syzbot+d11f321e7f1923157eac80aa990b446596f46439@syzkaller.appspotmail.com Fixes: c8d78c1823f4 ("mm: replace remap_file_pages() syscall with emulation") Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13mm/filemap.c: provide dummy filemap_page_mkwrite() for NOMMUArnd Bergmann1-1/+5
Building orangefs on MMU-less machines now results in a link error because of the newly introduced use of the filemap_page_mkwrite() function: ERROR: "filemap_page_mkwrite" [fs/orangefs/orangefs.ko] undefined! This adds a dummy version for it, similar to the existing generic_file_mmap and generic_file_readonly_mmap stubs in the same file, to avoid the link error without adding #ifdefs in each file system that uses these. Link: http://lkml.kernel.org/r/20180409105555.2439976-1-arnd@arndb.de Fixes: a5135eeab2e5 ("orangefs: implement vm_ops->fault") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Martin Brandenburg <martin@omnibond.com> Cc: Mike Marshall <hubcap@omnibond.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13mm/gup.c: document return valueMichael S. Tsirkin6-3/+17
__get_user_pages_fast handles errors differently from get_user_pages_fast: the former always returns the number of pages pinned, the later might return a negative error code. Link: http://lkml.kernel.org/r/1522962072-182137-6-git-send-email-mst@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13get_user_pages_fast(): return -EFAULT on access_ok failureMichael S. Tsirkin1-1/+4
get_user_pages_fast is supposed to be a faster drop-in equivalent of get_user_pages. As such, callers expect it to return a negative return code when passed an invalid address, and never expect it to return 0 when passed a positive number of pages, since its documentation says: * Returns number of pages pinned. This may be fewer than the number * requested. If nr_pages is 0 or negative, returns 0. If no pages * were pinned, returns -errno. When get_user_pages_fast fall back on get_user_pages this is exactly what happens. Unfortunately the implementation is inconsistent: it returns 0 if passed a kernel address, confusing callers: for example, the following is pretty common but does not appear to do the right thing with a kernel address: ret = get_user_pages_fast(addr, 1, writeable, &page); if (ret < 0) return ret; Change get_user_pages_fast to return -EFAULT when supplied a kernel address to make it match expectations. All callers have been audited for consistency with the documented semantics. Link: http://lkml.kernel.org/r/1522962072-182137-4-git-send-email-mst@redhat.com Fixes: 5b65c4677a57 ("mm, x86/mm: Fix performance regression in get_user_pages_fast()") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reported-by: syzbot+6304bf97ef436580fede@syzkaller.appspotmail.com Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13mm/gup_benchmark: handle gup failuresMichael S. Tsirkin1-1/+3
Patch series "mm/get_user_pages_fast fixes, cleanups", v2. Turns out get_user_pages_fast and __get_user_pages_fast return different values on error when given a single page: __get_user_pages_fast returns 0. get_user_pages_fast returns either 0 or an error. Callers of get_user_pages_fast expect an error so fix it up to return an error consistently. Stress the difference between get_user_pages_fast and __get_user_pages_fast to make sure callers aren't confused. This patch (of 3): __gup_benchmark_ioctl does not handle the case where get_user_pages_fast fails: - a negative return code will cause a buffer overrun - returning with partial success will cause use of uninitialized memory. [akpm@linux-foundation.org: simplification] Link: http://lkml.kernel.org/r/1522962072-182137-3-git-send-email-mst@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13resource: fix integer overflow at reallocationTakashi Iwai1-1/+2
We've got a bug report indicating a kernel panic at booting on an x86-32 system, and it turned out to be the invalid PCI resource assigned after reallocation. __find_resource() first aligns the resource start address and resets the end address with start+size-1 accordingly, then checks whether it's contained. Here the end address may overflow the integer, although resource_contains() still returns true because the function validates only start and end address. So this ends up with returning an invalid resource (start > end). There was already an attempt to cover such a problem in the commit 47ea91b4052d ("Resource: fix wrong resource window calculation"), but this case is an overseen one. This patch adds the validity check of the newly calculated resource for avoiding the integer overflow problem. Bugzilla: http://bugzilla.opensuse.org/show_bug.cgi?id=1086739 Link: http://lkml.kernel.org/r/s5hpo37d5l8.wl-tiwai@suse.de Fixes: 23c570a67448 ("resource: ability to resize an allocated resource") Signed-off-by: Takashi Iwai <tiwai@suse.de> Reported-by: Michael Henders <hendersm@shaw.ca> Tested-by: Michael Henders <hendersm@shaw.ca> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Ram Pai <linuxram@us.ibm.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13kconfig: extend output of 'listnewconfig'Don Zickus1-2/+12
We at Red Hat/Fedora have generally tried to have a per file breakdown of every config option we set. This makes it easy for us to add new options when they are exposed and keep a changelog of why they were set. A Fedora example is here: https://src.fedoraproject.org/cgit/rpms/kernel.git/tree/configs/fedora/generic Using various merge scripts, we build up a config file and run it through 'make listnewconfig' and 'make oldnoconfig'. The idea is to print out new config options that haven't been manually set and use the default until a patch is posted to set it properly. To speed things up, it would be nice to make it easier to generate a patch to post the default setting. The output of 'make listnewconfig' has two issues that limit us: - it doesn't provide the default value - it doesn't provide the new 'choice' options that get flagged in 'oldconfig' This patch extends 'listnewconfig' to address the above two issues. This allows us to run a script make listnewconfig | rhconfig-tool -o patches; git send-email patches/ The output of 'make listnewconfig': CONFIG_NET_EMATCH_IPT CONFIG_IPVLAN CONFIG_ICE CONFIG_NET_VENDOR_NI CONFIG_IEEE802154_MCR20A CONFIG_IR_IMON_DECODER CONFIG_IR_IMON_RAW The new output of 'make listnewconfig': CONFIG_KERNEL_XZ=n CONFIG_KERNEL_LZO=n CONFIG_NET_EMATCH_IPT=n CONFIG_IPVLAN=n CONFIG_ICE=n CONFIG_NET_VENDOR_NI=y CONFIG_IEEE802154_MCR20A=n CONFIG_IR_IMON_DECODER=n CONFIG_IR_IMON_RAW=n Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>