aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/edac/mce_amd.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2014-02-24MCE, AMD: Fix decoding module loading on unsupported hwBorislav Petkov1-32/+33
We want to still be able to issue some error information on systems for which there is no decoding support (think older distro kernels here, for example). Therefore, we allow module registration but skip the per-family bank-specific decoders and issue the general information only, i.e.: [ 46.822828] [Hardware Error]: Error Status: Uncorrected, software containable error. [ 46.822846] [Hardware Error]: CPU:0 (15:30:0) MC0_STATUS[-|UE|-|-|-|-|-]: 0xa000000000010f0f [ 46.822858] [Hardware Error]: cache level: L3/GEN, mem/io: GEN, mem-tx: GEN, part-proc: GEN (timed out) with the hope that it still contains helpful useful bits. Suggested-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Tested-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1392659391-2411-1-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2013-06-08EDAC, MCE, AMD: Add an MCE signature for new Fam15h modelsAravind Gopalakrishnan1-2/+3
Add a new error signature for Family 15h, models 30h-3fh. Patch has been tested on Fam15h using mce_amd_inj facility and has been verified to work correctly. Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> [ cleanup commit message and error string ] Signed-off-by: Borislav Petkov <bp@suse.de>
2013-01-22EDAC, MCE, AMD: Remove unneeded exportsBorislav Petkov1-11/+6
Initially, those strings describing different parts of an MCE message were shared with amd64_edac and were therefore exported to modules. However, all except pp_msgs are used only in one place right now so hide them and make them static. No functionality change. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Borislav Petkov <bp@alien8.de>
2013-01-22EDAC, MCE, AMD: Add MCE decoding support for Family 16hJacob Shin1-17/+78
Add MCE decoding logic for AMD Family 16h processors. Boris: - drop unneeded uu_msgs export - exit early in cat_mc1_mce and save us an indentation level Signed-off-by: Jacob Shin <jacob.shin@amd.com> Signed-off-by: Borislav Petkov <bp@alien8.de>
2013-01-22EDAC, MCE, AMD: Make MC2 decoding per-familyJacob Shin1-27/+29
Currently only AMD Family 15h processors have special handling for MC2 errors. Since upcoming Family 16h will also need unique handling, let's make MC2 handling part of amd_decoder_ops. Signed-off-by: Jacob Shin <jacob.shin@amd.com> Signed-off-by: Borislav Petkov <bp@alien8.de>
2012-11-28MCE, AMD: Dump error statusBorislav Petkov1-2/+20
Dump error status after decoding the error which describes the error disposition. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-11-28MCE, AMD: Report decoded error type firstBorislav Petkov1-25/+25
Instead of starting with the error details, report the decoded, readable error type first. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-11-28MCE, AMD: Dump CPU f/m/s triple with the errorBorislav Petkov1-4/+6
It is very useful to have the family/model/stepping with the reported error so dump it. This saves us asking the bug reporter about it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-11-28MCE, AMD: Remove functional unit referencesBorislav Petkov1-94/+92
Having the functional unit names in each bank decode is only misleading as this code supports multiple families and there's no guarantee the mapping between FUs and MCE banks will stay the same. And also, knowing the functional unit name doesn't help much since you end up looking at the respective BKDG anyway. So drop all FU references and use the MC bank numbers instead. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-04-04MCE, AMD: Drop too granulary family model checksBorislav Petkov1-4/+2
MCA details seldom change inbetween the models of a family so don't be too conservative and enable decoding on everything starting from K8 onwards. Minor adjustments can come in later but most importantly, we have some decoding infrastructure in place for upcoming models by default. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-03-19MCE, AMD: Constify error tablesBorislav Petkov1-7/+7
... so that checkpatch can chill out. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
2012-03-19MCE, AMD: Correct bank 5 error signaturesBorislav Petkov1-4/+1
... and remove superfluous ErrorCodeExt check. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
2012-03-19MCE, AMD: Rework NB MCE signaturesBorislav Petkov1-128/+48
Correct their formulation, replace per-family functions with a single, unified lookup table. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
2012-03-19MCE, AMD: Correct VB data error descriptionBorislav Petkov1-1/+1
Sync with latest BKDG error types. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
2012-03-19MCE, AMD: Correct ucode patch buffer descriptionBorislav Petkov1-2/+6
This MC1 error signature is called differently now, fix it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
2012-03-19MCE, AMD: Correct some MC0 error typesBorislav Petkov1-3/+2
Use "System Read Data Error" as a more general name for MC0 bus errors on F15h and update some error definitions. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
2011-12-14x86, mce: Add wrappers for registering on the decode chainBorislav Petkov1-2/+2
No functionality change, this is done so that in a follow-on patch all queued-up MCEs can be decoded after registering on the chain. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-10-06EDAC, MCE, AMD: Simplify NB MCE decoder interfaceBorislav Petkov1-10/+10
Drop third nbcfg argument which is old remains and not required anymore. No functionality change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-10-06EDAC, MCE, AMD: Drop local coreid reportingBorislav Petkov1-19/+1
MCE decoding code is reporting the core which encountered the error unconditionally now so drop this piece. Besides, it reported the coreid in the local processor package which is not that valuable as a datapoint. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-10-06EDAC, MCE, AMD: Print valid addr when reporting an errorBorislav Petkov1-1/+3
The MCi_STATUS bank has a AddrV bit which, when set, denotes that the corresponding MCi_ADDR MSR contains a valid address belonging to the MCE currently being reported. Dump it since it is definitely relevant information. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-10-06EDAC, MCE, AMD: Print CPU number when reporting the errorBorislav Petkov1-2/+2
Currently, correctable ECCs go through mcelog and do not print the scary MCE banner. In that case, however, reporting the core where the CECC happened is important information so dump it along with the decoded string albeit at risk of having a minor redundancy. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17amd64_edac: Enable driver on F15hBorislav Petkov1-3/+3
Add the PCI device ids required for driver registration. Remove pvt->ctl_name and use the family descriptor directly, instead. Then, bump driver version and fixup its format. Finally, enable DRAM ECC decoding on F15h. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17amd64_edac: Cleanup NBSH cruftBorislav Petkov1-1/+1
Remove reporting of errors with UC bit set - this is done by the MCE decoding code anyway and this driver deals with DRAM ECC errors only. UC (NB uncorrectable error) doesn't necessarily mean it is a DRAM error. Remove unused macros while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07EDAC, MCE: Fix NB error formattingBorislav Petkov1-7/+10
Minor formatting fixup since the information which core was associated with the MCE is not always valid. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07EDAC, MCE: Use BIT_64() to eliminate warnings on 32-bitRandy Dunlap1-2/+2
Building for X86_32 produces shift count warnings, so use BIT_64() to eliminate the warnings. drivers/edac/mce_amd.c:778: warning: left shift count >= width of type drivers/edac/mce_amd.c:778: warning: left shift count >= width of type Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: bluesmoke-devel@lists.sourceforge.net Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07EDAC, MCE: Enable MCE decoding on F15hBorislav Petkov1-6/+8
Now that everything is inplace, enable MCE decoding on F15h. Make initcall routine a bit more readable. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07EDAC, MCE: Shorten error report formattingBorislav Petkov1-22/+32
Shorten up MCi_STATUS flags and add BD's new deferred and poison types. Also, simplify formatting. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07EDAC, MCE: Overhaul error fields extraction macrosBorislav Petkov1-47/+36
Make macro names shorter thus making code shorter and more clear. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07EDAC, MCE: Add F15h FP MCE decoderBorislav Petkov1-0/+44
Add decoder for FP MCEs. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07EDAC, MCE: Add F15 EX MCE decoderBorislav Petkov1-7/+34
Integrate the single FIROB signature into an expanded table along with the new BD MCE types. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07EDAC, MCE: Add an F15h NB MCE decoderBorislav Petkov1-0/+10
by (almost) reusing the F10h one since the signatures are the same. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07EDAC, MCE: No F15h LS MCE decoderBorislav Petkov1-1/+1
F15h BD doesn't generate LS MCEs so warn about it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07EDAC, MCE: Add F15h CU MCE decoderBorislav Petkov1-1/+61
MCE bank 2 is redefined from a BU to a CU (Combined Unit) bank on F15h. Add a decoder function for CU MCEs. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07EDAC, MCE: Add F15h IC MCE decoderBorislav Petkov1-3/+50
Add support for decoding F15h IC MCEs. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07EDAC, MCE: Add F15h DC MCE decoderBorislav Petkov1-18/+61
Add a decoder for F15h DC MCEs to support the new types of DC MCEs introduced by the BD microarchitecture. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-01-07EDAC, MCE: Select extended error code maskBorislav Petkov1-4/+9
F15h enlarges the extended error code of an MCE to a 5-bit field (MCi_STATUS[20:16]). Add a mask variable which default 0xf is overridden on F15h. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Fix shift warning on 32-bitBorislav Petkov1-1/+1
Fix drivers/edac/mce_amd.c:262: warning: left shift count >= width of type on 32-bit builds. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Enable MCE decoding on F12hBorislav Petkov1-1/+1
Turn on MCE decoding on F12h. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Add F12h NB MCE decoderBorislav Petkov1-2/+3
F12h is completely covered by the generic path. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Add F12h IC MCE decoderBorislav Petkov1-0/+1
... which is the same as for K8 and F10h. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Add F12h DC MCE decoderBorislav Petkov1-7/+17
F12h DC MCE signatures are a subset of F10h's so reuse them. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Add support for F11h MCEsBorislav Petkov1-3/+12
F11h has almost the same MCE signatures as K8 except DRAM ECC and MC5 bank errors. Reuse functionality from the other families. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Enable MCE decoding on F14hBorislav Petkov1-4/+5
Now that all decoders have been taught about F14h, models < 0x10 MCEs, enable decoding on this family of CPUs. Also, issue a short informational message upon boot that MCE decoding gets enabled. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Fix FR MCEs decodingBorislav Petkov1-4/+10
Those are N/A on K8, so don't decode them there. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Complete NB MCE decodersBorislav Petkov1-53/+157
Add support for decoding F14h BU MCEs and improve decoding of the remaining families. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Warn about LS MCEs on F14hBorislav Petkov1-5/+13
F14h CPUs do not generate LS MCEs so exit early and warn the user in case this path is ever hit that something else might be going haywire. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Adjust IC decoders to F14hBorislav Petkov1-48/+70
Add support for IC MCEs for F14h CPUs. K8 and F10h are almost identical so use one function for both. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Adjust DC decoders to F14hBorislav Petkov1-27/+131
Add a per-family data cache decoders. Since there is a certain overlap between the different DC MCE signatures, reuse functionality between the families as far as possible. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Rename filesBorislav Petkov1-0/+414
Drop "edac_" string from the filenames since they're prefixed with edac/ in their pathname anyway. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>