aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/edac (follow)
AgeCommit message (Collapse)AuthorFilesLines
2014-08-04Merge branch 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-3/+1
Pull RAS updates from Ingo Molnar: "The main changes in this cycle are: - RAS tracing/events infrastructure, by Gong Chen. - Various generalizations of the APEI code to make it available to non-x86 architectures, by Tomasz Nowicki" * 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ras: Fix build warnings in <linux/aer.h> acpi, apei, ghes: Factor out ioremap virtual memory for IRQ and NMI context. acpi, apei, ghes: Make NMI error notification to be GHES architecture extension. apei, mce: Factor out APEI architecture specific MCE calls. RAS, extlog: Adjust init flow trace, eMCA: Add a knob to adjust where to save event log trace, RAS: Add eMCA trace event interface RAS, debugfs: Add debugfs interface for RAS subsystem CPER: Adjust code flow of some functions x86, MCE: Robustify mcheck_init_device trace, AER: Move trace into unified interface trace, RAS: Add basic RAS trace event x86, MCE: Kill CPU_POST_DEAD
2014-07-14EDAC, MCE, AMD: Add MCE decoding for F15h M60hAravind Gopalakrishnan1-4/+40
Add decoding logic for new Fam15h model 60h. Tested using mce_amd_inj module and works fine. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1405098795-4678-1-git-send-email-Aravind.Gopalakrishnan@amd.com [ Boris: simplify a bit. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2014-07-10ie31200_edac: Allocate mci and map mchbar firstJason Baron1-38/+33
Check for memory allocation and mchbar mapping failures before initializing the dimm info tables needlessly. Signed-off-by: Jason Baron <jbaron@akamai.com> Suggested-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/ead8f53e699f1ce21c2e17f3cffb4685d4faf72a.1404939455.git.jbaron@akamai.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-07-04ie31200_edac: Introduce the driverJason Baron3-0/+549
Add a driver for the E3-1200 series of Intel DRAM controllers, based on the following E3-1200 specs: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200-family-vol-2-datasheet.html http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200v3-vol-2-datasheet.html I've tested this on bad memory hardware, and observed correlating bad reads and uncorrected memory errors as reported by the driver. Tested against: CPU E3-1270 v3 @ 3.50GHz : 8086:0c08 (haswell) CPU E3-1270 V2 @ 3.50GHz : 8086:0158 (ivy bridge) CPU E31270 @ 3.40GHz : 8086:0108 (sandy bridge) Signed-off-by: Jason Baron <jbaron@akamai.com> Link: http://lkml.kernel.org/r/95c83e80dd40b5377e8bb206285c5d95ac623872.1403818526.git.jbaron@akamai.com [ Boris: realign defines ] Signed-off-by: Borislav Petkov <bp@suse.de>
2014-07-04x38_edac: make use of lo_hi_readq()Jason Baron1-9/+6
Convert to the generic API. Signed-off-by: Jason Baron <jbaron@akamai.com> Link: http://lkml.kernel.org/r/bb9a4cbb980cc7b51be75cbfcf644553bf6a04cd.1403818526.git.jbaron@akamai.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-06-24EDAC, edac_module.c: Remove unnecessary test on unsigned valueFabian Frederick1-1/+1
unsigned long value is never < 0. Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Fabian Frederick <fabf@skynet.be> Link: http://lkml.kernel.org/r/1402341618-10674-1-git-send-email-fabf@skynet.be Signed-off-by: Borislav Petkov <bp@suse.de>
2014-06-23trace, RAS: Add basic RAS trace eventChen, Gong2-3/+1
To avoid confuision and conflict of usage for RAS related trace event, add an unified RAS trace event stub. Start a RAS subsystem menu which will be fleshed out in time, when more features get added to it. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com> Link: http://lkml.kernel.org/r/1402475691-30045-2-git-send-email-gong.chen@linux.intel.com Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com>
2014-06-02Merge tag 'pci-v3.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci into nextLinus Torvalds1-7/+1
Pull PCI changes from Bjorn Helgaas: "Enumeration - Notify driver before and after device reset (Keith Busch) - Use reset notification in NVMe (Keith Busch) NUMA - Warn if we have to guess host bridge node information (Myron Stowe) - Work around AMD Fam15h BIOSes that fail to provide _PXM (Suravee Suthikulpanit) - Clean up and mark early_root_info_init() as deprecated (Suravee Suthikulpanit) Driver binding - Add "driver_override" for force specific binding (Alex Williamson) - Fail "new_id" addition for devices we already know about (Bandan Das) Resource management - Support BAR sizes up to 8GB (Nikhil Rao, Alan Cox) - Don't move IORESOURCE_PCI_FIXED resources (Bjorn Helgaas) - Mark SBx00 HPET BAR as IORESOURCE_PCI_FIXED (Bjorn Helgaas) - Fail safely if we can't handle BARs larger than 4GB (Bjorn Helgaas) - Reject BAR above 4GB if dma_addr_t is too small (Bjorn Helgaas) - Don't convert BAR address to resource if dma_addr_t is too small (Bjorn Helgaas) - Don't set BAR to zero if dma_addr_t is too small (Bjorn Helgaas) - Don't print anything while decoding is disabled (Bjorn Helgaas) - Don't add disabled subtractive decode bus resources (Bjorn Helgaas) - Add resource allocation comments (Bjorn Helgaas) - Restrict 64-bit prefetchable bridge windows to 64-bit resources (Yinghai Lu) - Assign i82875p_edac PCI resources before adding device (Yinghai Lu) PCI device hotplug - Remove unnecessary "dev->bus" test (Bjorn Helgaas) - Use PCI_EXP_SLTCAP_PSN define (Bjorn Helgaas) - Fix rphahp endianess issues (Laurent Dufour) - Acknowledge spurious "cmd completed" event (Rajat Jain) - Allow hotplug service drivers to operate in polling mode (Rajat Jain) - Fix cpqphp possible NULL dereference (Rickard Strandqvist) MSI - Replace pci_enable_msi_block() by pci_enable_msi_exact() (Alexander Gordeev) - Replace pci_enable_msix() by pci_enable_msix_exact() (Alexander Gordeev) - Simplify populate_msi_sysfs() (Jan Beulich) Virtualization - Add Intel Patsburg (X79) root port ACS quirk (Alex Williamson) - Mark RTL8110SC INTx masking as broken (Alex Williamson) Generic host bridge driver - Add generic PCI host controller driver (Will Deacon) Freescale i.MX6 - Use new clock names (Lucas Stach) - Drop old IRQ mapping (Lucas Stach) - Remove optional (and unused) IRQs (Lucas Stach) - Add support for MSI (Lucas Stach) - Fix imx6_add_pcie_port() section mismatch warning (Sachin Kamat) Renesas R-Car - Add gen2 device tree support (Ben Dooks) - Use new OF interrupt mapping when possible (Lucas Stach) - Add PCIe driver (Phil Edworthy) - Add PCIe MSI support (Phil Edworthy) - Add PCIe device tree bindings (Phil Edworthy) Samsung Exynos - Remove unnecessary OOM messages (Jingoo Han) - Fix add_pcie_port() section mismatch warning (Sachin Kamat) Synopsys DesignWare - Make MSI ISR shared IRQ aware (Lucas Stach) Miscellaneous - Check for broken config space aliasing (Alex Williamson) - Update email address (Ben Hutchings) - Fix Broadcom CNB20LE unintended sign extension (Bjorn Helgaas) - Fix incorrect vgaarb conditional in WARN_ON() (Bjorn Helgaas) - Remove unnecessary __ref annotations (Bjorn Helgaas) - Add arch/x86/kernel/quirks.c to MAINTAINERS PCI file patterns (Bjorn Helgaas) - Fix use of uninitialized MPS value (Bjorn Helgaas) - Tidy x86/gart messages (Bjorn Helgaas) - Fix return value from pci_user_{read,write}_config_*() (Gavin Shan) - Turn pcibios_penalize_isa_irq() into a weak function (Hanjun Guo) - Remove unused serial device IDs (Jean Delvare) - Use designated initialization in PCI_VDEVICE (Mark Rustad) - Fix powerpc NULL dereference in pci_root_buses traversal (Mike Qiu) - Configure MPS on ARM (Murali Karicheri) - Remove unnecessary includes of <linux/init.h> (Paul Gortmaker) - Move Open Firmware devspec attribute to PCI common code (Sebastian Ott) - Use pdev->dev.groups for attribute creation on s390 (Sebastian Ott) - Remove pcibios_add_platform_entries() (Sebastian Ott) - Add new ID for Intel GPU "spurious interrupt" quirk (Thomas Jarosch) - Rename pci_is_bridge() to pci_has_subordinate() (Yijing Wang) - Add and use new pci_is_bridge() interface (Yijing Wang) - Make pci_bus_add_device() void (Yijing Wang) DMA API - Clarify physical/bus address distinction in docs (Bjorn Helgaas) - Fix typos in docs (Emilio López) - Update dma_pool_create ()and dma_pool_alloc() descriptions (Gioh Kim) - Change dma_declare_coherent_memory() CPU address to phys_addr_t (Bjorn Helgaas) - Pass GAPSPCI_DMA_BASE CPU & bus address to dma_declare_coherent_memory() (Bjorn Helgaas)" * tag 'pci-v3.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (92 commits) MAINTAINERS: Add generic PCI host controller driver PCI: generic: Add generic PCI host controller driver PCI: imx6: Add support for MSI PCI: designware: Make MSI ISR shared IRQ aware PCI: imx6: Remove optional (and unused) IRQs PCI: imx6: Drop old IRQ mapping PCI: imx6: Use new clock names i82875p_edac: Assign PCI resources before adding device ARM/PCI: Call pcie_bus_configure_settings() to set MPS PCI: imx6: Fix imx6_add_pcie_port() section mismatch warning PCI: Make pci_bus_add_device() void PCI: exynos: Fix add_pcie_port() section mismatch warning PCI: Introduce new device binding path using pci_dev.driver_override PCI: rcar: Add gen2 device tree support PCI: cpqphp: Fix possible null pointer dereference PCI: rcar: Add R-Car PCIe device tree bindings PCI: rcar: Add MSI support for PCIe PCI: rcar: Add Renesas R-Car PCIe driver PCI: Fix return value from pci_user_{read,write}_config_*() PCI: exynos: Remove unnecessary OOM messages ...
2014-05-30Merge branches 'pci/host-exynos', 'pci/host-imx6', 'pci/resource' and 'pci/misc' into nextBjorn Helgaas1-7/+1
* pci/host-exynos: PCI: exynos: Fix add_pcie_port() section mismatch warning * pci/host-imx6: PCI: imx6: Add support for MSI PCI: designware: Make MSI ISR shared IRQ aware PCI: imx6: Remove optional (and unused) IRQs PCI: imx6: Drop old IRQ mapping PCI: imx6: Use new clock names PCI: imx6: Fix imx6_add_pcie_port() section mismatch warning * pci/resource: i82875p_edac: Assign PCI resources before adding device * pci/misc: ARM/PCI: Call pcie_bus_configure_settings() to set MPS PCI: Make pci_bus_add_device() void Conflicts: drivers/edac/i82875p_edac.c
2014-05-30i82875p_edac: Assign PCI resources before adding deviceYinghai Lu1-1/+2
Assign PCI resources before pci_bus_add_device(). The resources must be assigned before a driver can claim the device. [bhelgaas: changelog] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-05-30PCI: Make pci_bus_add_device() voidYijing Wang1-7/+1
pci_bus_add_device() always returns 0, so there's no point in returning anything at all. Make it a void function and remove the tests of the return value from the callers. [bhelgaas: changelog, remove unused "err" from i82875p_setup_overfl_dev()] Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-05-09EDAC: Fix MC scrub mode comparsion bug for correctable errorsLoc Ho1-1/+1
The MC structure field scrub_mode is of integer type - not bit field. Use it accordingly. Signed-off-by: Loc Ho <lho@apm.com> Link: http://lkml.kernel.org/r/1399590199-12256-2-git-send-email-lho@apm.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-05-08EDAC, MCE, AMD: Remove leftover unused maskBorislav Petkov1-2/+0
295d8cda2689 ("EDAC, MCE, AMD: Drop local coreid reporting") removed the code snippet which used that mask but forgot to drop the mask itself. Do that now. Signed-off-by: Borislav Petkov <bp@suse.de>
2014-04-04Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-mediaLinus Torvalds6-10/+10
Pull media updates from Mauro Carvalho Chehab: "The main set of series of patches for media subsystem, including: - document RC sysfs class - added an API to setup scancode to allow waking up systems using the Remote Controller - add API for SDR devices. Drivers are still on staging - some API improvements for getting EDID data from media inputs/outputs - new DVB frontend driver for drx-j (ATSC) - one driver (it913x/it9137) got removed, in favor of an improvement on another driver (af9035) - added a skeleton V4L2 PCI driver at documentation - added a dual flash driver (lm3646) - added a new IR driver (img-ir) - added an IR scancode decoder for the Sharp protocol - some improvements at the usbtv driver, to allow its core to be reused. - added a new SDR driver (rtl2832u_sdr) - added a new tuner driver (msi001) - several improvements at em28xx driver to fix PM support, device removal and to split the V4L2 specific bits into a separate sub-driver - one driver got converted to videobuf2 (s2255drv) - the e4000 tuner driver now follows an improved binding model - some fixes at V4L2 compat32 code - several fixes and enhancements at videobuf2 code - some cleanups at V4L2 API documentation - usual driver enhancements, new board additions and misc fixups" [ NOTE! This merge effective drops commit 4329b93b283c ("of: Reduce indentation in of_graph_get_next_endpoint"). The of_graph_get_next_endpoint() function was moved and renamed by commit fd9fdb78a9bf ("[media] of: move graph helpers from drivers/media/v4l2-core to drivers/of"). It was originally called v4l2_of_get_next_endpoint() and lived in the file drivers/media/v4l2-core/v4l2-of.c. In that original location, it was then fixed to support empty port nodes by commit b9db140c1e46 ("[media] v4l: of: Support empty port nodes"), and that commit clashes badly with the dropped "Reduce intendation" commit. I had to choose one or the other, and decided that the "Support empty port nodes" commit was more important ] * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (426 commits) [media] em28xx-dvb: fix PCTV 461e tuner I2C binding Revert "[media] em28xx-dvb: fix PCTV 461e tuner I2C binding" [media] em28xx: fix PCTV 290e LNA oops [media] em28xx-dvb: fix PCTV 461e tuner I2C binding [media] m88ds3103: fix bug on .set_tone() [media] saa7134: fix WARN_ON during resume [media] v4l2-dv-timings: add module name, description, license [media] videodev2.h: add parenthesis around macro arguments [media] saa6752hs: depends on CRC32 [media] si4713: fix Kconfig dependencies [media] Sensoray 2255 uses videobuf2 [media] adv7180: free an interrupt on failure paths in init_device() [media] e4000: make VIDEO_V4L2 dependency optional [media] af9033: Don't export functions for the hardware filter [media] af9035: use af9033 PID filters [media] af9033: implement PID filter [media] rtl2832_sdr: do not use dynamic stack allocation [media] e4000: fix 32-bit build error [media] em28xx-audio: make sure audio is unmuted on open() [media] DocBook media: v4l2_format_sdr was renamed to v4l2_sdr_format ...
2014-04-03Merge branch 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edacLinus Torvalds1-9/+16
Pull sb_edac patches from Mauro Carvalho Chehab: "A couple sb_edac driver improvements, cleaning a little bit the amount of data sent to dmesg, and fixing one error message" * 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: sb_edac: mark MCE messages as KERN_DEBUG sb_edac: use "event" instead of "exception" when MC wasnt signaled
2014-04-02Merge branch 'mips-for-linux-next' of git://git.linux-mips.org/pub/scm/ralf/upstream-sfrLinus Torvalds1-6/+173
Pull MIPS updates from Ralf Baechle: - Support for Imgtec's Aptiv family of MIPS cores. - Improved detection of BCM47xx configurations. - Fix hiberation for certain configurations. - Add support for the Chinese Loongson 3 CPU, a MIPS64 R2 core and systems. - Detection and support for the MIPS P5600 core. - A few more random fixes that didn't make 3.14. - Support for the EVA Extended Virtual Addressing - Switch Alchemy to the platform PATA driver - Complete unification of Alchemy support - Allow availability of I/O cache coherency to be runtime detected - Improvments to multiprocessing support for Imgtec platforms - A few microoptimizations - Cleanups of FPU support - Paul Gortmaker's fixes for the init stuff - Support for seccomp * 'mips-for-linux-next' of git://git.linux-mips.org/pub/scm/ralf/upstream-sfr: (165 commits) MIPS: CPC: Use __raw_ memory access functions MIPS: CM: use __raw_ memory access functions MIPS: Fix warning when including smp-ops.h with CONFIG_SMP=n MIPS: Malta: GIC IPIs may be used without MT MIPS: smp-mt: Use common GIC IPI implementation MIPS: smp-cmp: Remove incorrect core number probe MIPS: Fix gigaton of warning building with microMIPS. MIPS: Fix core number detection for MT cores MIPS: MT: core_nvpes function to retrieve VPE count MIPS: Provide empty mips_mt_set_cpuoptions when CONFIG_MIPS_MT=n MIPS: Lasat: Replace del_timer by del_timer_sync MIPS: Malta: Setup PM I/O region on boot MIPS: Loongson: Add a Loongson-3 default config file MIPS: Loongson 3: Add CPU hotplug support MIPS: Loongson 3: Add Loongson-3 SMP support MIPS: Loongson: Add Loongson-3 Kconfig options MIPS: Loongson: Add swiotlb to support All-Memory DMA MIPS: Loongson 3: Add serial port support MIPS: Loongson 3: Add IRQ init and dispatch support MIPS: Loongson 3: Add HT-linked PCI support ...
2014-04-01Merge tag 'edac_for_3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bpLinus Torvalds12-91/+130
Pull EDAC updates from Borislav Petkov: "A bunch of EDAC updates all over the place: - Support for new AMD models, along with more graceful fallback for unsupported hw. - Bunch of fixes from SUSE accumulated from bug reports - Misc other fixes and cleanups" * tag 'edac_for_3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: amd64_edac: Add support for newer F16h models i7core_edac: Drop unused variable i82875p_edac: Drop redundant call to pci_get_device() amd8111_edac: Fix leaks in probe error paths e752x_edac: Drop pvt->bridge_ck MCE, AMD: Fix decoding module loading on unsupported hw i5100_edac: Remove an unneeded condition in i5100_init_csrows() sb_edac: Degrade log level for device registration amd64_edac: Fix logic to determine channel for F15 M30h processors edac/85xx: Remove deprecated IRQF_DISABLED i3200_edac: Add a missing pci_disable_device() on the exit path i5400_edac: Disable device when unloading module e752x_edac: Simplify call to pci_get_device()
2014-03-31EDAC: Octeon: Add error injection supportDaniel Walker1-6/+171
This adds an ad-hoc error injection method. Octeon II doesn't have hardware support for injection, so this simulates it. Signed-off-by: Daniel Walker <dwalker@fifo99.com> Cc: David Daney <david.daney@cavium.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: linux-edac@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/5873/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-03-31EDAC: Octeon: Fix lack of opstate_initDaniel Walker1-0/+2
If the opstate_init() isn't called the driver won't start properly. I just added it in what appears to be an appropriate place. Signed-off-by: Daniel Walker <dwalker@fifo99.com> Cc: David Daney <david.daney@cavium.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: linux-edac@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/5872/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-03-13sb_edac: mark MCE messages as KERN_DEBUGAristeu Rozanski1-11/+12
Since the driver is decoding the MCE, it's useless to have these messages printed unless you're debugging a problem in the driver. Signed-off-by: Aristeu Rozanski <arozansk@redhat.com> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
2014-03-13sb_edac: use "event" instead of "exception" when MC wasnt signaledAristeu Rozanski1-2/+8
Corrected Errors are MC events, not exceptions and reporting as the later might confuse users. Signed-off-by: Aristeu Rozanski <arozansk@redhat.com> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
2014-03-11Merge tag 'v3.14-rc5' into patchworkMauro Carvalho Chehab5-30/+42
Linux 3.14-rc5 * tag 'v3.14-rc5': (1117 commits) Linux 3.14-rc5 drm/vmwgfx: avoid null pointer dereference at failure paths drm/vmwgfx: Make sure backing mobs are cleared when allocated. Update driver date. drm/vmwgfx: Remove some unused surface formats MAINTAINERS: add maintainer entry for Armada DRM driver arm64: Fix !CONFIG_SMP kernel build arm64: mm: Add double logical invert to pte accessors dm cache: fix truncation bug when mapping I/O to >2TB fast device perf tools: Fix strict alias issue for find_first_bit powerpc/powernv: Fix indirect XSCOM unmangling powerpc/powernv: Fix opal_xscom_{read,write} prototype powerpc/powernv: Refactor PHB diag-data dump powerpc/powernv: Dump PHB diag-data immediately powerpc: Increase stack redzone for 64-bit userspace to 512 bytes powerpc/ftrace: bugfix for test_24bit_addr powerpc/crashdump : Fix page frame number check in copy_oldmem_page powerpc/le: Ensure that the 'stop-self' RTAS token is handled correctly kvm, vmx: Really fix lazy FPU on nested guest perf tools: fix BFD detection on opensuse drm/radeon: enable speaker allocation setup on dce3.2 ...
2014-02-27amd64_edac: Add support for newer F16h modelsAravind Gopalakrishnan2-0/+27
Extend ECC decoding support for F16h M30h. Tested on F16h M30h with ECC turned on using mce_amd_inj module and the patch works fine. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1392913726-16961-1-git-send-email-Aravind.Gopalakrishnan@amd.com Tested-by: Arindam Nath <Arindam.Nath@amd.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Borislav Petkov <bp@suse.de>
2014-02-25i7core_edac: Drop unused variableJean Delvare1-7/+3
Fix the following warning: drivers/edac/i7core_edac.c: In function "core_mce_output_error": drivers/edac/i7core_edac.c:1711:8: warning: variable "type" set but not used [-Wunused-but-set-variable] char *type, *optype, *err; ^ According to Mauro, type can just be dropped, as tp_event now maps if the error is corrected, uncorrected non-fatal or uncorrected fatal one. Signed-off-by: Jean Delvare <jdelvare@suse.de> Link: http://lkml.kernel.org/r/20140224171358.692d7e5a@endymion.delvare Acked-by: Mauro Carvalho Chehab <m.chehab@samsung.com> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <bp@suse.de>
2014-02-25i82875p_edac: Drop redundant call to pci_get_device()Jean Delvare1-2/+0
The first call to pci_get_device() in i82875p_probe1() is useless. The result is immediately reset at the beginning of i82875p_setup_overfl_dev(), which then issues the same pci_get_device() call. Dropping the redundant call avoids a device reference leak. Signed-off-by: Jean Delvare <jdelvare@suse.de> Link: http://lkml.kernel.org/r/20140224160316.60e55fb6@endymion.delvare Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <bp@suse.de>
2014-02-25amd8111_edac: Fix leaks in probe error pathsJean Delvare1-14/+30
Both probe error paths are incomplete and leak memory and device references. Add the missing cleanups. Signed-off-by: Jean Delvare <jdelvare@suse.de> Link: http://lkml.kernel.org/r/20140224154534.5a3b797a@endymion.delvare Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de>
2014-02-25e752x_edac: Drop pvt->bridge_ckJean Delvare1-18/+10
pvt->bridge_ck always points to the same device as pvt->dev_d0f1, so get rid of the former and only use the latter. Signed-off-by: Jean Delvare <jdelvare@suse.de> Tested-by: Aristeu Rozanski <aris@redhat.com> Link: http://lkml.kernel.org/r/20140224151544.16ba28a0@endymion.delvare Cc: Mark Gross <mark.gross@intel.com> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <bp@suse.de>
2014-02-25i7300_edac: Fix device reference countJean Delvare1-18/+20
pci_get_device() decrements the reference count of "from" (last argument) so when we break off the loop successfully we have only one device reference - and we don't know which device we have. If we want a reference to each device, we must take them explicitly and let the pci_get_device() walk complete to avoid duplicate references. This is serious, as over-putting device references will cause the device to eventually disappear. Without this fix, the kernel crashes after a few insmod/rmmod cycles. Tested on an Intel S7000FC4UR system with a 7300 chipset. Signed-off-by: Jean Delvare <jdelvare@suse.de> Link: http://lkml.kernel.org/r/20140224111656.09bbb7ed@endymion.delvare Cc: Mauro Carvalho Chehab <m.chehab@samsung.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: stable@vger.kernel.org Signed-off-by: Borislav Petkov <bp@suse.de>
2014-02-25i7core_edac: Fix PCI device reference countJean Delvare1-2/+7
The reference count changes done by pci_get_device can be a little misleading when the usage diverges from the most common scheme. The reference count of the device passed as the last parameter is always decreased, even if the function returns no new device. So if we are going to try alternative device IDs, we must manually increment the device reference count before each retry. If we don't, we end up decreasing the reference count, and after a few modprobe/rmmod cycles the PCI devices will vanish. In other words and as Alan put it: without this fix the EDAC code corrupts the PCI device list. This fixes kernel bug #50491: https://bugzilla.kernel.org/show_bug.cgi?id=50491 Signed-off-by: Jean Delvare <jdelvare@suse.de> Link: http://lkml.kernel.org/r/20140224093927.7659dd9d@endymion.delvare Reviewed-by: Alan Cox <alan@linux.intel.com> Cc: Mauro Carvalho Chehab <m.chehab@samsung.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: stable@vger.kernel.org Signed-off-by: Borislav Petkov <bp@suse.de>
2014-02-24MCE, AMD: Fix decoding module loading on unsupported hwBorislav Petkov1-32/+33
We want to still be able to issue some error information on systems for which there is no decoding support (think older distro kernels here, for example). Therefore, we allow module registration but skip the per-family bank-specific decoders and issue the general information only, i.e.: [ 46.822828] [Hardware Error]: Error Status: Uncorrected, software containable error. [ 46.822846] [Hardware Error]: CPU:0 (15:30:0) MC0_STATUS[-|UE|-|-|-|-|-]: 0xa000000000010f0f [ 46.822858] [Hardware Error]: cache level: L3/GEN, mem/io: GEN, mem-tx: GEN, part-proc: GEN (timed out) with the hope that it still contains helpful useful bits. Suggested-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Tested-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1392659391-2411-1-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-02-20i5100_edac: Remove an unneeded condition in i5100_init_csrows()Dan Carpenter1-10/+7
We checked that "npages" was not zero a couple lines earlier so we can remove this and pull the code in one indent level. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: http://lkml.kernel.org/r/20140213111738.GB15549@elgon.mountain Signed-off-by: Borislav Petkov <bp@suse.de>
2014-02-20sb_edac: Degrade log level for device registrationJiang Liu1-1/+1
On a system with four Intel processors, it generates too many messages "EDAC sbridge: Seeking for: dev 1d.3 PCI ID xxxx". And it doesn't give many useful information for normal users, so change log level from INFO to DEBUG. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Link: http://lkml.kernel.org/r/1392613824-11230-1-git-send-email-jiang.liu@linux.intel.com Acked-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de>
2014-02-14EDAC: Correct workqueue setup pathBorislav Petkov1-4/+7
We're using edac_mc_workq_setup() both on the init path, when we load an edac driver and when we change the polling period (edac_mc_reset_delay_period) through /sys/.../edac_mc_poll_msec. On that second path we don't need to init the workqueue which has been initialized already. Thanks to Tejun for workqueue insights. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1391457913-881-1-git-send-email-prarit@redhat.com Cc: <stable@vger.kernel.org>
2014-02-14EDAC: Poll timeout cannot be zero, p2Borislav Petkov3-7/+9
Sanitize code even more to accept unsigned longs only and to not allow polling intervals below 1 second as this is unnecessary and doesn't make much sense anyway for polling errors. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1391457913-881-1-git-send-email-prarit@redhat.com Cc: Doug Thompson <dougthompson@xmission.com> Cc: <stable@vger.kernel.org>
2014-02-10drivers/edac/edac_mc_sysfs.c: poll timeout cannot be zeroPrarit Bhargava1-1/+1
If you do echo 0 > /sys/module/edac_core/parameters/edac_mc_poll_msec the following stack trace is output because the edac module is not designed to poll with a timeout of zero. WARNING: CPU: 12 PID: 0 at lib/list_debug.c:33 __list_add+0xac/0xc0() list_add corruption. prev->next should be next (ffff8808291dd1b8), but was (null). (prev=ffff8808286fe3f8). Modules linked in: sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache cfg80211 rfkill x86_pkg_temp_thermal coretemp kvm_intel kvm ixgbe e1000e crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt ptp sb_edac iTCO_vendor_support pps_core mdio ipmi_devintf edac_core ioatdma microcode shpchp lpc_ich pcspkr i2c_i801 dca mfd_core ipmi_si wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt isci i2c_algo_bit drm_kms_helper ttm drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod CPU: 12 PID: 0 Comm: swapper/12 Not tainted 3.13.0+ #1 Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013 Call Trace: <IRQ> __list_add+0xac/0xc0 __internal_add_timer+0xab/0x130 internal_add_timer+0x17/0x40 mod_timer_pinned+0xca/0x170 intel_pstate_timer_func+0x28a/0x380 call_timer_fn+0x36/0x100 run_timer_softirq+0x1ff/0x2f0 __do_softirq+0xf5/0x2e0 irq_exit+0x10d/0x120 smp_apic_timer_interrupt+0x45/0x60 apic_timer_interrupt+0x6d/0x80 <EOI> cpuidle_idle_call+0xb9/0x1f0 arch_cpu_idle+0xe/0x30 cpu_startup_entry+0x9e/0x240 start_secondary+0x1e4/0x290 kernel BUG at kernel/timer.c:1084! invalid opcode: 0000 [#1] SMP Modules linked in: sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache cfg80211 rfkill x86_pkg_temp_thermal coretemp kvm_intel kvm ixgbe e1000e crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt ptp sb_edac iTCO_vendor_support pps_core mdio ipmi_devintf edac_core ioatdma microcode shpchp lpc_ich pcspkr i2c_i801 dca mfd_core ipmi_si wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt isci i2c_algo_bit drm_kms_helper ttm drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod CPU: 12 PID: 0 Comm: swapper/12 Tainted: G W 3.13.0+ #1 Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013 Call Trace: <IRQ> run_timer_softirq+0x245/0x2f0 __do_softirq+0xf5/0x2e0 irq_exit+0x10d/0x120 smp_apic_timer_interrupt+0x45/0x60 apic_timer_interrupt+0x6d/0x80 <EOI> cpuidle_idle_call+0xb9/0x1f0 arch_cpu_idle+0xe/0x30 cpu_startup_entry+0x9e/0x240 start_secondary+0x1e4/0x290 RIP cascade+0x93/0xa0 WARNING: CPU: 36 PID: 1154 at kernel/workqueue.c:1461 __queue_delayed_work+0xed/0x1a0() Modules linked in: sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache cfg80211 rfkill x86_pkg_temp_thermal coretemp kvm_intel kvm ixgbe e1000e crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt ptp sb_edac iTCO_vendor_support pps_core mdio ipmi_devintf edac_core ioatdma microcode shpchp lpc_ich pcspkr i2c_i801 dca mfd_core ipmi_si wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt isci i2c_algo_bit drm_kms_helper ttm drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod CPU: 36 PID: 1154 Comm: kworker/u481:3 Tainted: G W 3.13.0+ #1 Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013 Workqueue: edac-poller edac_mc_workq_function [edac_core] Call Trace: dump_stack+0x45/0x56 warn_slowpath_common+0x7d/0xa0 warn_slowpath_null+0x1a/0x20 __queue_delayed_work+0xed/0x1a0 queue_delayed_work_on+0x27/0x50 edac_mc_workq_function+0x72/0xa0 [edac_core] process_one_work+0x17b/0x460 worker_thread+0x11b/0x400 kthread+0xd2/0xf0 ret_from_fork+0x7c/0xb0 This patch adds a range check in the edac_mc_poll_msec code to check for 0. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-07amd64_edac: Fix logic to determine channel for F15 M30h processorsAravind Gopalakrishnan1-3/+11
Update current channel selection logic to include F15h, M30h memory controllers. Refer F15 M30h BKDG D18F2x110[7:6] (DRAM Controller Select Low) (Link:http://support.amd.com/TechDocs/49125_15h_Models_30h-3Fh_BKDG.pdf) Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1390338216-3873-1-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-02-07edac/85xx: Remove deprecated IRQF_DISABLEDJohannes Thumshirn1-3/+3
Remove IRQF_DISABLED as it is a NOOP. Signed-off-by: Johannes Thumshirn <johannes.thumshirn@men.de> Link: http://lkml.kernel.org/r/1390293747-11185-1-git-send-email-johannes.thumshirn@men.de Signed-off-by: Borislav Petkov <bp@suse.de>
2014-02-07i3200_edac: Add a missing pci_disable_device() on the exit pathHitoshi Mitake1-0/+2
Currently, i3200_edac.c forgets to call pci_disable_device() during a process of disabling device. This patch adds that call. The problem was detected by Huqiu Liu. Reported-by: Huqiu Liu <liuhq11@mails.tsinghua.edu.cn> Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp> Link: http://lkml.kernel.org/r/1389587178-10167-1-git-send-email-mitake.hitoshi@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-02-07i5400_edac: Disable device when unloading moduleAristeu Rozanski1-0/+2
This was found by Huqiu Liu using a static analysis. Reported-by: Huqiu Liu <liuhq11@mails.tsinghua.edu.cn> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Link: http://lkml.kernel.org/r/20140116162021.GY15716@redhat.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-02-07[media, edac] Change my email addressMauro Carvalho Chehab6-10/+10
There are several left overs with my old email address. Remove their occurrences and add myself at CREDITS, to allow people to be able to reach me on my new addresses. Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
2014-02-03e752x_edac: Simplify call to pci_get_device()Jean Delvare1-1/+1
The last parameter, pvt->bridge_ck, is always NULL as far as I can see, so just use that. Signed-off-by: Jean Delvare <jdelvare@suse.de> Acked-by: Aristeu Rozanski <aris@redhat.com> Link: http://lkml.kernel.org/r/20140130091101.22f2b550@endymion.delvare Cc: Mark Gross <mark.gross@intel.com> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <bp@suse.de>
2014-01-20Merge branch 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-1/+24
Pull x86 RAS changes from Ingo Molnar: - SCI reporting for other error types not only correctable ones - GHES cleanups - Add the functionality to override error reporting agents as some machines are sporting a new extended error logging capability which, if done properly in the BIOS, makes a corresponding EDAC module redundant - PCIe AER tracepoint severity levels fix - Error path correction for the mce device init - MCE timer fix - Add more flexibility to the error injection (EINJ) debugfs interface * 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, mce: Fix mce_start_timer semantics ACPI, APEI, GHES: Cleanup ghes memory error handling ACPI, APEI: Cleanup alignment-aware accesses ACPI, APEI, GHES: Do not report only correctable errors with SCI ACPI, APEI, EINJ: Changes to the ACPI/APEI/EINJ debugfs interface ACPI, eMCA: Combine eMCA/EDAC event reporting priority EDAC, sb_edac: Modify H/W event reporting policy EDAC: Add an edac_report parameter to EDAC PCI, AER: Fix severity usage in aer trace event x86, mce: Call put_device on device_register failure
2014-01-20Merge tag 'edac_for_3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bpLinus Torvalds22-104/+181
Pull EDAC updates from Borislav Petkov: - mpc85xx PCIe error interrupt support - misc small enhancements/fixes all over the place. * tag 'edac_for_3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: EDAC: Don't try to cancel workqueue when it's never setup e752x_edac: Fix pci_dev usage count sb_edac: Mark get_mci_for_node_id as static EDAC: Mark edac_create_debug_nodes as static amd64_edac: Remove "amd64" prefix from static functions amd64_edac: Simplify code around decode_bus_error amd64_edac: Mark amd64_decode_bus_error as static EDAC: Remove DEFINE_PCI_DEVICE_TABLE macro amd64_edac: Fix condition to verify max channels allowed for F15 M30h edac/85xx: Add PCIe error interrupt edac support
2014-01-10EDAC: Don't try to cancel workqueue when it's never setupStephen Boyd1-0/+3
We only setup a workqueue for edac devices that use the polling method. We still try to cancel the workqueue if an edac_device uses the irq method though. This causes a warning from debug objects when we remove an edac device: WARNING: CPU: 0 PID: 56 at lib/debugobjects.c:260 debug_print_object+0x98/0xc0() ODEBUG: assert_init not available (active state 0) object type: timer_list hint: stub_timer+0x0/0x28 Modules linked in: krait_edac(-) CPU: 0 PID: 56 Comm: rmmod Not tainted 3.12.0-rc2-00035-g15292a0 #69 (unwind_backtrace+0x0/0x144) (show_stack+0x20/0x24) (dump_stack+0x74/0xb4) (warn_slowpath_common+0x78/0x9c) (warn_slowpath_fmt+0x40/0x48) (debug_print_object+0x98/0xc0) (debug_object_assert_init+0xdc/0xec) (del_timer+0x24/0x7c) (try_to_grab_pending+0xc0/0x1b0) (cancel_delayed_work+0x2c/0xa0) (edac_device_workq_teardown+0x1c/0x38) (edac_device_del_device+0xb8/0xe4) (krait_edac_remove+0x50/0x70 [krait_edac]) (platform_drv_remove+0x24/0x28) (__device_release_driver+0x68/0xc0) (driver_detach+0xc4/0xc8) (bus_remove_driver+0xac/0x114) (driver_unregister+0x38/0x58) (platform_driver_unregister+0x1c/0x20) (krait_edac_driver_exit+0x14/0x1c [krait_edac]) (SyS_delete_module+0x178/0x2b4) Fix it by skipping the workqueue teardown for such devices. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Link: http://lkml.kernel.org/r/1388434457-4194-2-git-send-email-sboyd@codeaurora.org Signed-off-by: Borislav Petkov <bp@suse.de>
2013-12-21e752x_edac: Fix pci_dev usage countAristeu Rozanski1-1/+3
In case the device 0, function 1 is not found using pci_get_device(), pci_scan_single_device() will be used but, differently than pci_get_device(), it allocates a pci_dev but doesn't does bump the usage count on the pci_dev and after few module removals and loads the pci_dev will be freed. Signed-off-by: Aristeu Rozanski <aris@redhat.com> Reviewed-by: mark gross <mark.gross@intel.com> Link: http://lkml.kernel.org/r/20131205153755.GL4545@redhat.com Signed-off-by: Borislav Petkov <bp@suse.de>
2013-12-16Merge tag 'ras_for_3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/rasIngo Molnar2-1/+24
Pull RAS updates from Borislav Petkov: * Add the functionality to override error reporting agents as some machines are sporting a new extended error logging capability which, if done properly in the BIOS, makes a corresponding EDAC module redundant, from Gong Chen. * PCIe AER tracepoint severity levels fix, from Rui Wang. * Error path correction for the mce device init, from Levente Kurusa. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-15sb_edac: Mark get_mci_for_node_id as staticRashika Kheria1-1/+1
This patch marks the function get_mci_for_node_id() as static because it is not used outside of sb_edac.c. Thus, it also eliminates the following warning: drivers/edac/sb_edac.c:918:22: warning: no previous prototype for ‘get_mci_for_node_id’ [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Link: http://lkml.kernel.org/r/0441f508186fc4eeabc8e9c3e4dde013d99405d4.1387029387.git.rashika.kheria@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>
2013-12-15EDAC: Mark edac_create_debug_nodes as staticRashika Kheria1-1/+1
This patch marks the function edac_create_debug_nodes() as static because it is not used outside of edac_mc_sysfs.c. Thus, it also eliminates the following warning: drivers/edac/edac_mc_sysfs.c:917:5: warning: no previous prototype for ‘edac_create_debug_nodes’ [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Link: http://lkml.kernel.org/r/a1c863b08c0d6f67d03280cf908c771bf26a3239.1387029387.git.rashika.kheria@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>
2013-12-15amd64_edac: Remove "amd64" prefix from static functionsBorislav Petkov1-62/+56
No need for the namespace tagging there. Cleanup setup_pci_device while at it. Signed-off-by: Borislav Petkov <bp@suse.de>
2013-12-15amd64_edac: Simplify code around decode_bus_errorBorislav Petkov1-9/+4
Drop wrapper function and prefixes. Signed-off-by: Borislav Petkov <bp@suse.de>