linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2016-11-29	iommu/arm-smmu-v3: Split probe functions into DT/generic portions	Lorenzo Pieralisi	1	-16/+27
	Current ARM SMMUv3 probe functions intermingle HW and DT probing in the initialization functions to detect and programme the ARM SMMU v3 driver features. In order to allow probing the ARM SMMUv3 with other firmwares than DT, this patch splits the ARM SMMUv3 init functions into DT and HW specific portions so that other FW interfaces (ie ACPI) can reuse the HW probing functions and skip the DT portion accordingly. This patch implements no functional change, only code reshuffling. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Hanjun Guo <hanjun.guo@linaro.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29	ACPI/IORT: Add support for ARM SMMU platform devices creation	Lorenzo Pieralisi	1	-0/+151
	In ARM ACPI systems, IOMMU components are specified through static IORT table entries. In order to create platform devices for the corresponding ARM SMMU components, IORT kernel code should be made able to parse IORT table entries and create platform devices dynamically. This patch adds the generic IORT infrastructure required to create platform devices for ARM SMMUs. ARM SMMU versions have different resources requirement therefore this patch also introduces an IORT specific structure (ie iort_iommu_config) that contains hooks (to be defined when the corresponding ARM SMMU driver support is added to the kernel) to be used to define the platform devices names, init the IOMMUs, count their resources and finally initialize them. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Acked-by: Hanjun Guo <hanjun.guo@linaro.org> Cc: Hanjun Guo <hanjun.guo@linaro.org> Cc: Tomasz Nowicki <tn@semihalf.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29	ACPI/IORT: Add node match function	Lorenzo Pieralisi	2	-0/+17
	Device drivers (eg ARM SMMU) need to know if a specific component is part of the IORT table, so that kernel data structures are not initialized at initcalls time if the respective component is not part of the IORT table. To this end, this patch adds a trivial function that allows detecting if a given IORT node type is present or not in the ACPI table, providing an ACPI IORT equivalent for of_find_matching_node(). Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Acked-by: Hanjun Guo <hanjun.guo@linaro.org> Cc: Hanjun Guo <hanjun.guo@linaro.org> Cc: Tomasz Nowicki <tn@semihalf.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29	ACPI: Implement acpi_dma_configure	Lorenzo Pieralisi	5	-4/+50
	On DT based systems, the of_dma_configure() API implements DMA configuration for a given device. On ACPI systems an API equivalent to of_dma_configure() is missing which implies that it is currently not possible to set-up DMA operations for devices through the ACPI generic kernel layer. This patch fills the gap by introducing acpi_dma_configure/deconfigure() calls that for now are just wrappers around arch_setup_dma_ops() and arch_teardown_dma_ops() and also updates ACPI and PCI core code to use the newly introduced acpi_dma_configure/acpi_dma_deconfigure functions. Since acpi_dma_configure() is used to configure DMA operations, the function initializes the dma/coherent_dma masks to sane default values if the current masks are uninitialized (also to keep the default values consistent with DT systems) to make sure the device has a complete default DMA set-up. The DMA range size passed to arch_setup_dma_ops() is sized according to the device coherent_dma_mask (starting at address 0x0), mirroring the DT probing path behaviour when a dma-ranges property is not provided for the device being probed; this changes the current arch_setup_dma_ops() call parameters in the ACPI probing case, but since arch_setup_dma_ops() is a NOP on all architectures but ARM/ARM64 this patch does not change the current kernel behaviour on them. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> [pci] Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Tomasz Nowicki <tn@semihalf.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29	iommu/arm-smmu-v3: Convert struct device of_node to fwnode usage	Lorenzo Pieralisi	1	-5/+7
	Current ARM SMMU v3 driver rely on the struct device.of_node pointer for device look-up and iommu_ops retrieval. In preparation for ACPI probing enablement, convert the driver to use the struct device.fwnode member for device and iommu_ops look-up so that the driver infrastructure can be used also on systems that do not associate an of_node pointer to a struct device (eg ACPI), making the device look-up and iommu_ops retrieval firmware agnostic. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Hanjun Guo <hanjun.guo@linaro.org> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29	iommu/arm-smmu: Convert struct device of_node to fwnode usage	Lorenzo Pieralisi	1	-5/+6
	Current ARM SMMU driver rely on the struct device.of_node pointer for device look-up and iommu_ops retrieval. In preparation for ACPI probing enablement, convert the driver to use the struct device.fwnode member for device and iommu_ops look-up so that the driver infrastructure can be used also on systems that do not associate an of_node pointer to a struct device (eg ACPI), making the device look-up and iommu_ops retrieval firmware agnostic. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Hanjun Guo <hanjun.guo@linaro.org> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29	iommu: Make of_iommu_set/get_ops() DT agnostic	Lorenzo Pieralisi	4	-41/+64
	The of_iommu_{set/get}_ops() API is used to associate a device tree node with a specific set of IOMMU operations. The same kernel interface is required on systems booting with ACPI, where devices are not associated with a device tree node, therefore the interface requires generalization. The struct device fwnode member represents the fwnode token associated with the device and the struct it points at is firmware specific; regardless, it is initialized on both ACPI and DT systems and makes an ideal candidate to use it to associate a set of IOMMU operations to a given device, through its struct device.fwnode member pointer, paving the way for representing per-device iommu_ops (ie an iommu instance associated with a device). Convert the DT specific of_iommu_{set/get}_ops() interface to use struct device.fwnode as a look-up token, making the interface usable on ACPI systems and rename the data structures and the registration API so that they are made to represent their usage more clearly. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Hanjun Guo <hanjun.guo@linaro.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29	ACPI/IORT: Add support for IOMMU fwnode registration	Lorenzo Pieralisi	1	-0/+86
	The ACPI IORT table provide entries for IOMMU (aka SMMU in ARM world) components that allow creating the kernel data structures required to probe and initialize the IOMMU devices. This patch provides support in the IORT kernel code to register IOMMU components and their respective fwnode. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Cc: Hanjun Guo <hanjun.guo@linaro.org> Cc: Tomasz Nowicki <tn@semihalf.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29	ACPI/IORT: Introduce linker section for IORT entries probing	Lorenzo Pieralisi	3	-3/+14
	Since commit e647b532275b ("ACPI: Add early device probing infrastructure") the kernel has gained the infrastructure that allows adding linker script section entries to execute ACPI driver callbacks (ie probe routines) for all subsystems that register a table entry in the respective kernel section (eg clocksource, irqchip). Since ARM IOMMU devices data is described through IORT tables when booting with ACPI, the ARM IOMMU drivers must be made able to hook ACPI callback routines that are called to probe IORT entries and initialize the respective IOMMU devices. To avoid adding driver specific hooks into IORT table initialization code (breaking therefore code modularity - ie ACPI IORT code must be made aware of ARM SMMU drivers ACPI init callbacks), this patch adds code that allows ARM SMMU drivers to take advantage of the ACPI early probing infrastructure, so that they can add linker script section entries containing drivers callback to be executed on IORT tables detection. Since IORT nodes are differentiated by a type, the callback routines can easily parse the IORT table entries, check the IORT nodes and carry out some actions whenever the IORT node type associated with the driver specific callback is matched. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Cc: Tomasz Nowicki <tn@semihalf.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29	ACPI: Add FWNODE_ACPI_STATIC fwnode type	Lorenzo Pieralisi	2	-1/+23
	On systems booting with a device tree, every struct device is associated with a struct device_node, that provides its DT firmware representation. The device node can be used in generic kernel contexts (eg IRQ translation, IOMMU streamid mapping), to retrieve the properties associated with the device and carry out kernel operations accordingly. Owing to the 1:1 relationship between the device and its device_node, the device_node can also be used as a look-up token for the device (eg looking up a device through its device_node), to retrieve the device in kernel paths where the device_node is available. On systems booting with ACPI, the same abstraction provided by the device_node is required to provide look-up functionality. The struct acpi_device, that represents firmware objects in the ACPI namespace already includes a struct fwnode_handle of type FWNODE_ACPI as their member; the same abstraction is missing though for devices that are instantiated out of static ACPI tables entries (eg ARM SMMU devices). Add a new fwnode_handle type to associate devices created out of static ACPI table entries to the respective firmware components and create a simple ACPI core layer interface to dynamically allocate and free the corresponding firmware nodes so that kernel subsystems can use it to instantiate the nodes and associate them with the respective devices. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tomasz Nowicki <tn@semihalf.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29	iommu/arm-smmu: Set SMTNMB_TLBEN in ACR to enable caching of bypass entries	Nipun Gupta	1	-9/+16
	The SMTNMB_TLBEN in the Auxiliary Configuration Register (ACR) provides an option to enable the updation of TLB in case of bypass transactions due to no stream match in the stream match table. This reduces the latencies of the subsequent transactions with the same stream-id which bypasses the SMMU. This provides a significant performance benefit for certain networking workloads. With this change substantial performance improvement of ~9% is observed with DPDK l3fwd application (http://dpdk.org/doc/guides/sample_app_ug/l3_forward.html) on NXP's LS2088a platform. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29	iommu/io-pgtable-arm: Use const and __initconst for iommu_gather_ops structures	Bhumika Goyal	1	-1/+1
	Check for iommu_gather_ops structures that are only stored in the tlb field of an io_pgtable_cfg structure. The tlb field is of type const struct iommu_gather_ops *, so iommu_gather_ops structures having this property can be declared as const. Also, replace __initdata with __initconst. Acked-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29	iommu/arm-smmu: Constify iommu_gather_ops structures	Bhumika Goyal	1	-1/+1
	Check for iommu_gather_ops structures that are only stored in the tlb field of an io_pgtable_cfg structure. The tlb field is of type const struct iommu_gather_ops *, so iommu_gather_ops structures having this property can be declared as const. Acked-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29	iommu/arm-smmu: Constify iommu_gather_ops structures	Bhumika Goyal	1	-1/+1
	Check for iommu_gather_ops structures that are only stored in the tlb field of an io_pgtable_cfg structure. The tlb field is of type const struct iommu_gather_ops *, so iommu_gather_ops structures having this property can be declared as const. Acked-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-29	iommu/io-pgtable-arm: Use for_each_set_bit to simplify the code	Kefeng Wang	2	-8/+2
	We can use for_each_set_bit() to simplify the code slightly in the ARM io-pgtable self tests. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-11-13	Linux 4.9-rc5	Linus Torvalds	1	-1/+1

2016-11-13	gp8psk: Fix DVB frontend attach	Mauro Carvalho Chehab	7	-152/+246
	The DVB binding schema at the DVB core assumes that the frontend is a separate driver. Faling to do that causes OOPS when the module is removed, as it tries to do a symbol_put_addr on an internal symbol, causing craches like: WARNING: CPU: 1 PID: 28102 at kernel/module.c:1108 module_put+0x57/0x70 Modules linked in: dvb_usb_gp8psk(-) dvb_usb dvb_core nvidia_drm(PO) nvidia_modeset(PO) snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore nvidia(PO) [last unloaded: rc_core] CPU: 1 PID: 28102 Comm: rmmod Tainted: P WC O 4.8.4-build.1 #1 Hardware name: MSI MS-7309/MS-7309, BIOS V1.12 02/23/2009 Call Trace: dump_stack+0x44/0x64 __warn+0xfa/0x120 module_put+0x57/0x70 module_put+0x57/0x70 warn_slowpath_null+0x23/0x30 module_put+0x57/0x70 gp8psk_fe_set_frontend+0x460/0x460 [dvb_usb_gp8psk] symbol_put_addr+0x27/0x50 dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb] From Derek's tests: "Attach bug is fixed, tuning works, module unloads without crashing. Everything seems ok!" Reported-by: Derek <user.vdr@gmail.com> Tested-by: Derek <user.vdr@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-13	gp8psk: fix gp8psk_usb_in_op() logic	Mauro Carvalho Chehab	1	-2/+3
	Commit bc29131ecb10 ("[media] gp8psk: don't do DMA on stack") fixed the usage of DMA on stack, but the memcpy was wrong for gp8psk_usb_in_op(). Fix it. From Derek's email: "Fix confirmed using 2 different Skywalker models with HD mpeg4, SD mpeg2." Suggested-by: Johannes Stezenbach <js@linuxtv.org> Fixes: bc29131ecb10 ("[media] gp8psk: don't do DMA on stack") Tested-by: Derek <user.vdr@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-13	dvb-usb: move data_mutex to struct dvb_usb_device	Mauro Carvalho Chehab	7	-95/+61
	The data_mutex is initialized too late, as it is needed for each device driver's power control, causing an OOPS: dvb-usb: found a 'TerraTec/qanu USB2.0 Highspeed DVB-T Receiver' in warm state. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff846617af>] __mutex_lock_slowpath+0x6f/0x100 PGD 0 Oops: 0002 [#1] SMP Modules linked in: dvb_usb_cinergyT2(+) dvb_usb CPU: 0 PID: 2029 Comm: modprobe Not tainted 4.9.0-rc4-dvbmod #24 Hardware name: FUJITSU LIFEBOOK A544/FJNBB35 , BIOS Version 1.17 05/09/2014 task: ffff88020e943840 task.stack: ffff8801f36ec000 RIP: 0010:[<ffffffff846617af>] [<ffffffff846617af>] __mutex_lock_slowpath+0x6f/0x100 RSP: 0018:ffff8801f36efb10 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88021509bdc8 RCX: 00000000c0000100 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88021509bdcc RBP: ffff8801f36efb58 R08: ffff88021f216320 R09: 0000000000100000 R10: ffff88021f216320 R11: 00000023fee6c5a1 R12: ffff88020e943840 R13: ffff88021509bdcc R14: 00000000ffffffff R15: ffff88021509bdd0 FS: 00007f21adb86740(0000) GS:ffff88021f200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000215bce000 CR4: 00000000001406f0 Call Trace: mutex_lock+0x16/0x25 cinergyt2_power_ctrl+0x1f/0x60 [dvb_usb_cinergyT2] dvb_usb_device_init+0x21e/0x5d0 [dvb_usb] cinergyt2_usb_probe+0x21/0x50 [dvb_usb_cinergyT2] usb_probe_interface+0xf3/0x2a0 driver_probe_device+0x208/0x2b0 __driver_attach+0x87/0x90 driver_probe_device+0x2b0/0x2b0 bus_for_each_dev+0x52/0x80 bus_add_driver+0x1a3/0x220 driver_register+0x56/0xd0 usb_register_driver+0x77/0x130 do_one_initcall+0x46/0x180 free_vmap_area_noflush+0x38/0x70 kmem_cache_alloc+0x84/0xc0 do_init_module+0x50/0x1be load_module+0x1d8b/0x2100 find_symbol_in_section+0xa0/0xa0 SyS_finit_module+0x89/0x90 entry_SYSCALL_64_fastpath+0x13/0x94 Code: e8 a7 1d 00 00 8b 03 83 f8 01 0f 84 97 00 00 00 48 8b 43 10 4c 8d 7b 08 48 89 63 10 4c 89 3c 24 41 be ff ff ff ff 48 89 44 24 08 <48> 89 20 4c 89 64 24 10 eb 1a 49 c7 44 24 08 02 00 00 00 c6 43 RIP [<ffffffff846617af>] __mutex_lock_slowpath+0x6f/0x100 RSP <ffff8801f36efb10> CR2: 0000000000000000 So, move it to the struct dvb_usb_device and initialize it before calling the driver's callbacks. Reported-by: Jörg Otte <jrg.otte@gmail.com> Tested-by: Jörg Otte <jrg.otte@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-13	iio: maxim_thermocouple: detect invalid storage size in read()	Arnd Bergmann	1	-0/+2
	As found by gcc -Wmaybe-uninitialized, having a storage_bytes value other than 2 or 4 will result in undefined behavior: drivers/iio/temperature/maxim_thermocouple.c: In function 'maxim_thermocouple_read': drivers/iio/temperature/maxim_thermocouple.c:141:5: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized] This probably cannot happen, but returning -EINVAL here is appropriate and makes gcc happy and the code more robust. Fixes: 231147ee77f3 ("iio: maxim_thermocouple: Align 16 bit big endian value of raw reads") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org> (cherry picked from commit 32cb7d27e65df9daa7cee8f1fdf7b259f214bee2) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-12	aoe: fix crash in page count manipulation	Jens Axboe	1	-41/+0
	aoeblk contains some mysterious code, that wants to elevate the bio vec page counts while it's under IO. That is not needed, it's fragile, and it's causing kernel oopses for some. Reported-by: Tested-by: Don Koch <kochd@us.ibm.com> Tested-by: Tested-by: Don Koch <kochd@us.ibm.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-11	lightnvm: invalid offset calculation for lba_shift	Matias Bjørling	1	-1/+1
	The ns->lba_shift assumes its value to be the logarithmic of the LA size. A previous patch duplicated the lba_shift calculation into lightnvm. It prematurely also subtracted a 512byte shift, which commonly is applied per-command. The 512byte shift being subtracted twice led to data loss when restoring the logical to physical mapping table from device and when issuing I/O commands using rrpc. Fix offset by removing the 512byte shift subtraction when calculating lba_shift. Fixes: b0b4e09c1ae7 "lightnvm: control life of nvm_dev in driver" Reported-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-11	Kbuild: enable -Wmaybe-uninitialized warnings by default	Arnd Bergmann	1	-2/+0
	Previously the warnings were added back at the W=1 level and above, this now turns them on again by default, assuming that we have addressed all warnings and again have a clean build for v4.10. I found a number of new warnings in linux-next already and submitted bugfixes for those. Hopefully they are caught by the 0day builder in the future as soon as this patch is merged. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	pcmcia: fix return value of soc_pcmcia_regulator_set	Arnd Bergmann	1	-1/+1
	The newly introduced soc_pcmcia_regulator_set() function sometimes returns without setting its return code, as shown by this warning: drivers/pcmcia/soc_common.c: In function 'soc_pcmcia_regulator_set': drivers/pcmcia/soc_common.c:112:5: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized] This changes it to propagate the regulator_disable() result instead. Fixes: ac61b6001a63 ("pcmcia: soc_common: add support for Vcc and Vpp regulators") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	infiniband: shut up a maybe-uninitialized warning	Arnd Bergmann	1	-26/+28
	Some configurations produce this harmless warning when built with gcc -Wmaybe-uninitialized: infiniband/core/cma.c: In function 'cma_get_net_dev': infiniband/core/cma.c:1242:12: warning: 'src_addr_storage.sin_addr.s_addr' may be used uninitialized in this function [-Wmaybe-uninitialized] I previously reported this for the powerpc64 defconfig, but have now reproduced the same thing for x86 as well, using gcc-5 or higher. The code looks correct to me, and this change just rearranges it by making sure we alway initialize the entire address structure to make the warning disappear. My first approach added an initialization at the time of the declaration, which Doug commented may be too costly, so I hope this version doesn't add overhead. Link: http://arm-soc.lixom.net/buildlogs/mainline/v4.7-rc6/buildall.powerpc.ppc64_defconfig.log.passed Link: https://patchwork.kernel.org/patch/9212825/ Acked-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	crypto: aesni: shut up -Wmaybe-uninitialized warning	Arnd Bergmann	1	-2/+2
	The rfc4106 encrypy/decrypt helper functions cause an annoying false-positive warning in allmodconfig if we turn on -Wmaybe-uninitialized warnings again: arch/x86/crypto/aesni-intel_glue.c: In function ‘helper_rfc4106_decrypt’: include/linux/scatterlist.h:67:31: warning: ‘dst_sg_walk.sg’ may be used uninitialized in this function [-Wmaybe-uninitialized] The problem seems to be that the compiler doesn't track the state of the 'one_entry_in_sg' variable across the kernel_fpu_begin/kernel_fpu_end section. This takes the easy way out by adding a bogus initialization, which should be harmless enough to get the patch into v4.9 so we can turn on this warning again by default without producing useless output. A follow-up patch for v4.10 rearranges the code to make the warning go away. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	rc: print correct variable for z8f0811	Arnd Bergmann	1	-1/+1
	A recent rework accidentally left a debugging printk untouched while changing the meaning of the variables, leading to an uninitialized variable being printed: drivers/media/i2c/ir-kbd-i2c.c: In function 'get_key_haup_common': drivers/media/i2c/ir-kbd-i2c.c:62:2: error: 'toggle' may be used uninitialized in this function [-Werror=maybe-uninitialized] This prints the correct one instead, as we did before the patch. Fixes: 00bb820755ed ("[media] rc: Hauppauge z8f0811 can decode RC6") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	dib0700: fix nec repeat handling	Sean Young	1	-2/+3
	When receiving a nec repeat, ensure the correct scancode is repeated rather than a random value from the stack. This removes the need for the bogus uninitialized_var() and also fixes the warnings: drivers/media/usb/dvb-usb/dib0700_core.c: In function ‘dib0700_rc_urb_completion’: drivers/media/usb/dvb-usb/dib0700_core.c:679: warning: ‘protocol’ may be used uninitialized in this function [sean addon: So after writing the patch and submitting it, I've bought the hardware on ebay. Without this patch you get random scancodes on nec repeats, which the patch indeed fixes.] Signed-off-by: Sean Young <sean@mess.org> Tested-by: Sean Young <sean@mess.org> Cc: stable@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	s390: pci: don't print uninitialized data for debugging	Arnd Bergmann	1	-1/+1
	gcc correctly warns about an incorrect use of the 'pa' variable in case we pass an empty scatterlist to __s390_dma_map_sg: arch/s390/pci/pci_dma.c: In function '__s390_dma_map_sg': arch/s390/pci/pci_dma.c:309:13: warning: 'pa' may be used uninitialized in this function [-Wmaybe-uninitialized] This adds a bogus initialization to the function to sanitize the debug output. I would have preferred a solution without the initialization, but I only got the report from the kbuild bot after turning on the warning again, and didn't manage to reproduce it myself. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	nios2: fix timer initcall return value	Arnd Bergmann	1	-0/+1
	When called more than twice, the nios2_time_init() function return an uninitialized value, as detected by gcc -Wmaybe-uninitialized arch/nios2/kernel/time.c: warning: 'ret' may be used uninitialized in this function This makes it return '0' here, matching the comment above the function. Acked-by: Ley Foon Tan <lftan@altera.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	x86: apm: avoid uninitialized data	Arnd Bergmann	1	-1/+4
	apm_bios_call() can fail, and return a status in its argument structure. If that status however is zero during a call from apm_get_power_status(), we end up using data that may have never been set, as reported by "gcc -Wmaybe-uninitialized": arch/x86/kernel/apm_32.c: In function ‘apm’: arch/x86/kernel/apm_32.c:1729:17: error: ‘bx’ may be used uninitialized in this function [-Werror=maybe-uninitialized] arch/x86/kernel/apm_32.c:1835:5: error: ‘cx’ may be used uninitialized in this function [-Werror=maybe-uninitialized] arch/x86/kernel/apm_32.c:1730:17: note: ‘cx’ was declared here arch/x86/kernel/apm_32.c:1842:27: error: ‘dx’ may be used uninitialized in this function [-Werror=maybe-uninitialized] arch/x86/kernel/apm_32.c:1731:17: note: ‘dx’ was declared here This changes the function to return "APM_NO_ERROR" here, which makes the code more robust to broken BIOS versions, and avoids the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Luis R. Rodriguez <mcgrof@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	NFSv4.1: work around -Wmaybe-uninitialized warning	Arnd Bergmann	1	-4/+6
	A bugfix introduced a harmless gcc warning in nfs4_slot_seqid_in_use if we enable -Wmaybe-uninitialized again: fs/nfs/nfs4session.c:203:54: error: 'cur_seq' may be used uninitialized in this function [-Werror=maybe-uninitialized] gcc is not smart enough to conclude that the IS_ERR/PTR_ERR pair results in a nonzero return value here. Using PTR_ERR_OR_ZERO() instead makes this clear to the compiler. Fixes: e09c978aae5b ("NFSv4.1: Fix Oopsable condition in server callback races") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"	Arnd Bergmann	4	-5/+16
	Traditionally, we have always had warnings about uninitialized variables enabled, as this is part of -Wall, and generally a good idea [1], but it also always produced false positives, mainly because this is a variation of the halting problem and provably impossible to get right in all cases [2]. Various people have identified cases that are particularly bad for false positives, and in commit e74fc973b6e5 ("Turn off -Wmaybe-uninitialized when building with -Os"), I turned off the warning for any build that was done with CC_OPTIMIZE_FOR_SIZE. This drastically reduced the number of false positive warnings in the default build but unfortunately had the side effect of turning the warning off completely in 'allmodconfig' builds, which in turn led to a lot of warnings (both actual bugs, and remaining false positives) to go in unnoticed. With commit 877417e6ffb9 ("Kbuild: change CC_OPTIMIZE_FOR_SIZE definition") enabled the warning again for allmodconfig builds in v4.7 and in v4.8-rc1, I had finally managed to address all warnings I get in an ARM allmodconfig build and most other maybe-uninitialized warnings for ARM randconfig builds. However, commit 6e8d666e9253 ("Disable "maybe-uninitialized" warning globally") was merged at the same time and disabled it completely for all configurations, because of false-positive warnings on x86 that I had not addressed until then. This caused a lot of actual bugs to get merged into mainline, and I sent several dozen patches for these during the v4.9 development cycle. Most of these are actual bugs, some are for correct code that is safe because it is only called under external constraints that make it impossible to run into the case that gcc sees, and in a few cases gcc is just stupid and finds something that can obviously never happen. I have now done a few thousand randconfig builds on x86 and collected all patches that I needed to address every single warning I got (I can provide the combined patch for the other warnings if anyone is interested), so I hope we can get the warning back and let people catch the actual bugs earlier. This reverts the change to disable the warning completely and for now brings it back at the "make W=1" level, so we can get it merged into mainline without introducing false positives. A follow-up patch enables it on all levels unless some configuration option turns it off because of false-positives. Link: https://rusty.ozlabs.org/?p=232 [1] Link: https://gcc.gnu.org/wiki/Better_Uninitialized_Warnings [2] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	lib/stackdepot: export save/fetch stack for drivers	Chris Wilson	1	-0/+2
	Some drivers would like to record stacktraces in order to aide leak tracing. As stackdepot already provides a facility for only storing the unique traces, thereby reducing the memory required, export that functionality for use by drivers. The code was originally created for KASAN and moved under lib in commit cd11016e5f521 ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB") so that it could be shared with mm/. In turn, we want to share it now with drivers. Link: http://lkml.kernel.org/r/20161108133209.22704-1-chris@chris-wilson.co.uk Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	mm: kmemleak: scan .data.ro_after_init	Jakub Kicinski	4	-1/+10
	Limit the number of kmemleak false positives by including .data.ro_after_init in memory scanning. To achieve this we need to add symbols for start and end of the section to the linker scripts. The problem was been uncovered by commit 56989f6d8568 ("genetlink: mark families as __ro_after_init"). Link: http://lkml.kernel.org/r/1478274173-15218-1-git-send-email-jakub.kicinski@netronome.com Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	memcg: prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB	Greg Thelen	1	-2/+2
	While testing OBJFREELIST_SLAB integration with pagealloc, we found a bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB & CFLGS_OBJFREELIST_SLAB. When it happened, critical allocations needed for loading drivers or creating new caches will fail. The original kmem_cache is created early making OFF_SLAB not possible. When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc is enabled it will try to enable it first under certain conditions. Given kmem_cache(sys) reuses the original flag, you can have both flags at the same time resulting in allocation failures and odd behaviors. This fix discards allocator specific flags from memcg before calling create_cache. The bug exists since 4.6-rc1 and affects testing debug pagealloc configurations. Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, OBJFREELIST_SLAB") Link: http://lkml.kernel.org/r/1478553075-120242-1-git-send-email-thgarnie@google.com Signed-off-by: Greg Thelen <gthelen@google.com> Signed-off-by: Thomas Garnier <thgarnie@google.com> Tested-by: Thomas Garnier <thgarnie@google.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	coredump: fix unfreezable coredumping task	Andrey Ryabinin	1	-0/+3
	It could be not possible to freeze coredumping task when it waits for 'core_state->startup' completion, because threads are frozen in get_signal() before they got a chance to complete 'core_state->startup'. Inability to freeze a task during suspend will cause suspend to fail. Also CRIU uses cgroup freezer during dump operation. So with an unfreezable task the CRIU dump will fail because it waits for a transition from 'FREEZING' to 'FROZEN' state which will never happen. Use freezer_do_not_count() to tell freezer to ignore coredumping task while it waits for core_state->startup completion. Link: http://lkml.kernel.org/r/1475225434-3753-1-git-send-email-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Tejun Heo <tj@kernel.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	mm/filemap: don't allow partially uptodate page for pipes	Eryu Guan	1	-0/+3
	Starting from 4.9-rc1 kernel, I started noticing some test failures of sendfile(2) and splice(2) (sendfile0N and splice01 from LTP) when testing on sub-page block size filesystems (tested both XFS and ext4), these syscalls start to return EIO in the tests. e.g. sendfile02 1 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 26, got: -1 sendfile02 2 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 24, got: -1 sendfile02 3 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 22, got: -1 sendfile02 4 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 20, got: -1 This is because that in sub-page block size cases, we don't need the whole page to be uptodate, only the part we care about is uptodate is OK (if fs has ->is_partially_uptodate defined). But page_cache_pipe_buf_confirm() doesn't have the ability to check the partially-uptodate case, it needs the whole page to be uptodate. So it returns EIO in this case. This is a regression introduced by commit 82c156f85384 ("switch generic_file_splice_read() to use of ->read_iter()"). Prior to the change, generic_file_splice_read() doesn't allow partially-uptodate page either, so it worked fine. Fix it by skipping the partially-uptodate check if we're working on a pipe in do_generic_file_read(), so we read the whole page from disk as long as the page is not uptodate. I think the other way to fix it is to add the ability to check & allow partially-uptodate page to page_cache_pipe_buf_confirm(), but that is much harder to do and seems gain little. Link: http://lkml.kernel.org/r/1477986187-12717-1-git-send-email-guaneryu@gmail.com Signed-off-by: Eryu Guan <guaneryu@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	mm/hugetlb: fix huge page reservation leak in private mapping error paths	Mike Kravetz	1	-0/+66
	Error paths in hugetlb_cow() and hugetlb_no_page() may free a newly allocated huge page. If a reservation was associated with the huge page, alloc_huge_page() consumed the reservation while allocating. When the newly allocated page is freed in free_huge_page(), it will increment the global reservation count. However, the reservation entry in the reserve map will remain. This is not an issue for shared mappings as the entry in the reserve map indicates a reservation exists. But, an entry in a private mapping reserve map indicates the reservation was consumed and no longer exists. This results in an inconsistency between the reserve map and the global reservation count. This 'leaks' a reserved huge page. Create a new routine restore_reserve_on_error() to restore the reserve entry in these specific error paths. This routine makes use of a new function vma_add_reservation() which will add a reserve entry for a specific address/page. In general, these error paths were rarely (if ever) taken on most architectures. However, powerpc contained arch specific code that that resulted in an extra fault and execution of these error paths on all private mappings. Fixes: 67961f9db8c4 ("mm/hugetlb: fix huge page reserve accounting for private mappings) Link: http://lkml.kernel.org/r/1476933077-23091-2-git-send-email-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reported-by: Jan Stancek <jstancek@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Kirill A . Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	ocfs2: fix not enough credit panic	Junxiao Bi	1	-1/+1
	The following panic was caught when run ocfs2 disconfig single test (block size 512 and cluster size 8192). ocfs2_journal_dirty() return -ENOSPC, that means credits were used up. The total credit should include 3 times of "num_dx_leaves" from ocfs2_dx_dir_rebalance(), because 2 times will be consumed in ocfs2_dx_dir_transfer_leaf() and 1 time will be consumed in ocfs2_dx_dir_new_cluster() -> __ocfs2_dx_dir_new_cluster() -> ocfs2_dx_dir_format_cluster(). But only two times is included in ocfs2_dx_dir_rebalance_credits(), fix it. This can cause read-only fs(v4.1+) or panic for mainline linux depending on mount option. ------------[ cut here ]------------ kernel BUG at fs/ocfs2/journal.c:775! invalid opcode: 0000 [#1] SMP Modules linked in: ocfs2 nfsd lockd grace nfs_acl auth_rpcgss sunrpc autofs4 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sd_mod sg ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ppdev xen_kbdfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea parport_pc parport acpi_cpufreq i2c_piix4 i2c_core pcspkr ext4 jbd2 mbcache xen_blkfront floppy pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod CPU: 2 PID: 10601 Comm: dd Not tainted 4.1.12-71.el6uek.bug24939243.x86_64 #2 Hardware name: Xen HVM domU, BIOS 4.4.4OVM 02/11/2016 task: ffff8800b6de6200 ti: ffff8800a7d48000 task.ti: ffff8800a7d48000 RIP: ocfs2_journal_dirty+0xa7/0xb0 [ocfs2] RSP: 0018:ffff8800a7d4b6d8 EFLAGS: 00010286 RAX: 00000000ffffffe4 RBX: 00000000814d0a9c RCX: 00000000000004f9 RDX: ffffffffa008e990 RSI: ffffffffa008f1ee RDI: ffff8800622b6460 RBP: ffff8800a7d4b6f8 R08: ffffffffa008f288 R09: ffff8800622b6460 R10: 0000000000000000 R11: 0000000000000282 R12: 0000000002c8421e R13: ffff88006d0cad00 R14: ffff880092beef60 R15: 0000000000000070 FS: 00007f9b83e92700(0000) GS:ffff8800be880000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fb2c0d1a000 CR3: 0000000008f80000 CR4: 00000000000406e0 Call Trace: ocfs2_dx_dir_transfer_leaf+0x159/0x1a0 [ocfs2] ocfs2_dx_dir_rebalance+0xd9b/0xea0 [ocfs2] ocfs2_find_dir_space_dx+0xd3/0x300 [ocfs2] ocfs2_prepare_dx_dir_for_insert+0x219/0x450 [ocfs2] ocfs2_prepare_dir_for_insert+0x1d6/0x580 [ocfs2] ocfs2_mknod+0x5a2/0x1400 [ocfs2] ocfs2_create+0x73/0x180 [ocfs2] vfs_create+0xd8/0x100 lookup_open+0x185/0x1c0 do_last+0x36d/0x780 path_openat+0x92/0x470 do_filp_open+0x4a/0xa0 do_sys_open+0x11a/0x230 SyS_open+0x1e/0x20 system_call_fastpath+0x12/0x71 Code: 1d 3f 29 09 00 48 85 db 74 1f 48 8b 03 0f 1f 80 00 00 00 00 48 8b 7b 08 48 83 c3 10 4c 89 e6 ff d0 48 8b 03 48 85 c0 75 eb eb 90 <0f> 0b eb fe 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 RIP ocfs2_journal_dirty+0xa7/0xb0 [ocfs2] ---[ end trace 91ac5312a6ee1288 ]--- Kernel panic - not syncing: Fatal exception Kernel Offset: disabled Link: http://lkml.kernel.org/r/1478248135-31963-1-git-send-email-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	Revert "console: don't prefer first registered if DT specifies stdout-path"	Hans de Goede	3	-20/+1
	This reverts commit 05fd007e4629 ("console: don't prefer first registered if DT specifies stdout-path"). The reverted commit changes existing behavior on which many ARM boards rely. Many ARM small-board-computers, like e.g. the Raspberry Pi have both a video output and a serial console. Depending on whether the user is using the device as a more regular computer; or as a headless device we need to have the console on either one or the other. Many users rely on the kernel behavior of the console being present on both outputs, before the reverted commit the console setup with no console= kernel arguments on an ARM board which sets stdout-path in dt would look like this: [root@localhost ~]# cat /proc/consoles ttyS0 -W- (EC p a) 4:64 tty0 -WU (E p ) 4:1 Where as after the reverted commit, it looks like this: [root@localhost ~]# cat /proc/consoles ttyS0 -W- (EC p a) 4:64 This commit reverts commit 05fd007e4629 ("console: don't prefer first registered if DT specifies stdout-path") restoring the original behavior. Fixes: 05fd007e4629 ("console: don't prefer first registered if DT specifies stdout-path") Link: http://lkml.kernel.org/r/20161104121135.4780-2-hdegoede@redhat.com Signed-off-by: Hans de Goede <hdegoede@redhat.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Frank Rowand <frowand.list@gmail.com> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	mm: hwpoison: fix thp split handling in memory_failure()	Naoya Horiguchi	1	-7/+5
	When memory_failure() runs on a thp tail page after pmd is split, we trigger the following VM_BUG_ON_PAGE(): page:ffffd7cd819b0040 count:0 mapcount:0 mapping: (null) index:0x1 flags: 0x1fffc000400000(hwpoison) page dumped because: VM_BUG_ON_PAGE(!page_count(p)) ------------[ cut here ]------------ kernel BUG at /src/linux-dev/mm/memory-failure.c:1132! memory_failure() passed refcount and page lock from tail page to head page, which is not needed because we can pass any subpage to split_huge_page(). Fixes: 61f5d698cc97 ("mm: re-enable THP") Link: http://lkml.kernel.org/r/1477961577-7183-1-git-send-email-n-horiguchi@ah.jp.nec.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: <stable@vger.kernel.org> [4.5+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	swapfile: fix memory corruption via malformed swapfile	Jann Horn	1	-0/+2
	When root activates a swap partition whose header has the wrong endianness, nr_badpages elements of badpages are swabbed before nr_badpages has been checked, leading to a buffer overrun of up to 8GB. This normally is not a security issue because it can only be exploited by root (more specifically, a process with CAP_SYS_ADMIN or the ability to modify a swap file/partition), and such a process can already e.g. modify swapped-out memory of any other userspace process on the system. Link: http://lkml.kernel.org/r/1477949533-2509-1-git-send-email-jann@thejh.net Signed-off-by: Jann Horn <jann@thejh.net> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Jerome Marchand <jmarchan@redhat.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	mm/cma.c: check the max limit for cma allocation	Shiraz Hashim	1	-0/+3
	CMA allocation request size is represented by size_t that gets truncated when same is passed as int to bitmap_find_next_zero_area_off. We observe that during fuzz testing when cma allocation request is too high, bitmap_find_next_zero_area_off still returns success due to the truncation. This leads to kernel crash, as subsequent code assumes that requested memory is available. Fail cma allocation in case the request breaches the corresponding cma region size. Link: http://lkml.kernel.org/r/1478189211-3467-1-git-send-email-shashim@codeaurora.org Signed-off-by: Shiraz Hashim <shashim@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	scripts/bloat-o-meter: fix SIGPIPE	Alexey Dobriyan	1	-0/+3
	Fix piping output to a program which quickly exits (read: head -n1) $ ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux \| head -n1 add/remove: 0/0 grow/shrink: 9/60 up/down: 124/-305 (-181) close failed in file object destructor: sys.excepthook is missing lost sys.stderr Link: http://lkml.kernel.org/r/20161028204618.GA29923@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	shmem: fix pageflags after swapping DMA32 object	Hugh Dickins	1	-0/+2
	If shmem_alloc_page() does not set PageLocked and PageSwapBacked, then shmem_replace_page() needs to do so for itself. Without this, it puts newpage on the wrong lru, re-unlocks the unlocked newpage, and system descends into "Bad page" reports and freeze; or if CONFIG_DEBUG_VM=y, it hits an earlier VM_BUG_ON_PAGE(!PageLocked), depending on config. But shmem_replace_page() is not a common path: it's only called when swapin (or swapoff) finds the page was already read into an unsuitable zone: usually all zones are suitable, but gem objects for a few drm devices (gma500, omapdrm, crestline, broadwater) require zone DMA32 if there's more than 4GB of ram. Fixes: 800d8c63b2e9 ("shmem: add huge pages support") Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1611062003510.11253@eggly.anvils Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> [4.8.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	mm, frontswap: make sure allocated frontswap map is assigned	Vlastimil Babka	1	-2/+3
	Christian Borntraeger reports: With commit 8ea1d2a1985a ("mm, frontswap: convert frontswap_enabled to static key") kmemleak complains about a memory leak in swapon unreferenced object 0x3e09ba56000 (size 32112640): comm "swapon", pid 7852, jiffies 4294968787 (age 1490.770s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: __vmalloc_node_range+0x194/0x2d8 vzalloc+0x58/0x68 SyS_swapon+0xd60/0x12f8 system_call+0xd6/0x270 Turns out kmemleak is right. We now allocate the frontswap map depending on the kernel config (and no longer on the enablement) swapfile.c: [...] if (IS_ENABLED(CONFIG_FRONTSWAP)) frontswap_map = vzalloc(BITS_TO_LONGS(maxpages) * sizeof(long)); but later on this is passed along --> enable_swap_info(p, prio, swap_map, cluster_info, frontswap_map); and ignored if frontswap is disabled --> frontswap_init(p->type, frontswap_map); static inline void frontswap_init(unsigned type, unsigned long *map) { if (frontswap_enabled()) __frontswap_init(type, map); } Thing is, that frontswap map is never freed. The leakage is relatively not that bad, because swapon is an infrequent and privileged operation. However, if the first frontswap backend is registered after a swap type has been already enabled, it will WARN_ON in frontswap_register_ops() and frontswap will not be available for the swap type. Fix this by making sure the map is assigned by frontswap_init() as long as CONFIG_FRONTSWAP is enabled. Fixes: 8ea1d2a1985a ("mm, frontswap: convert frontswap_enabled to static key") Link: http://lkml.kernel.org/r/20161026134220.2566-1-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Juergen Gross <jgross@suse.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	mm: remove extra newline from allocation stall warning	Tetsuo Handa	1	-1/+1
	Commit 63f53dea0c98 ("mm: warn about allocations which stall for too long") by error embedded "\n" in the format string, resulting in strange output. [ 722.876655] kworker/0:1: page alloction stalls for 160001ms, order:0 [ 722.876656] , mode:0x2400000(GFP_NOIO) [ 722.876657] CPU: 0 PID: 6966 Comm: kworker/0:1 Not tainted 4.8.0+ #69 Link: http://lkml.kernel.org/r/1476026219-7974-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	PM / sleep: don't suspend parent when async child suspend_{noirq, late} fails	Brian Norris	1	-4/+4
	Consider two devices, A and B, where B is a child of A, and B utilizes asynchronous suspend (it does not matter whether A is sync or async). If B fails to suspend_noirq() or suspend_late(), or is interrupted by a wakeup (pm_wakeup_pending()), then it aborts and sets the async_error variable. However, device A does not (immediately) check the async_error variable; it may continue to run its own suspend_noirq()/suspend_late() callback. This is bad. We can resolve this problem by doing our error and wakeup checking (particularly, for the async_error flag) after waiting for children to suspend, instead of before. This also helps align the logic for the noirq and late suspend cases with the logic in __device_suspend(). It's easy to observe this erroneous behavior by, for example, forcing a device to sleep a bit in its suspend_noirq() (to ensure the parent is waiting for the child to complete), then return an error, and watch the parent suspend_noirq() still get called. (Or similarly, fake a wakeup event at the right (or is it wrong?) time.) Fixes: de377b397272 (PM / sleep: Asynchronous threads for suspend_late) Fixes: 28b6fd6e3779 (PM / sleep: Asynchronous threads for suspend_noirq) Reported-by: Jeffy Chen <jeffy.chen@rock-chips.com> Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-10	splice: remove detritus from generic_file_splice_read()	Al Viro	1	-5/+0
	i_size check is a leftover from the horrors that used to play with the page cache in that function. With the switch to ->read_iter(), it's neither needed nor correct - for gfs2 it ends up being buggy, since i_size is not guaranteed to be correct until later (inside ->read_iter()). Spotted-by: Abhi Das <adas@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>