path: root/drivers/edac/i7core_edac.c
Age | Commit message | Author | Files | Lines
2011-11-01 | i7core_edac: Initialize memory name with cpu, channel, bank | Mauro Carvalho Chehab | 1 | -0/+4
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-11-01 | i7core_edac: Fix compilation on 32 bits arch | Sedat Dilek | 1 | -2/+5
On i386: ERROR: "__udivdi3" [drivers/edac/i7core_edac.ko] undefined! This happens in both get_sdram_scrub_rate() and set_sdram_scrub_rate(). Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
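The `__udivdi3` symbol is the libgcc helper that GCC calls for a plain 64-bit `/` on i386, and the kernel does not link against libgcc. Kernel code usually avoids the operator and uses `do_div()` from `<asm/div64.h>`, which divides a 64-bit value in place by a 32-bit divisor and returns the remainder. Whether this commit used exactly that helper is not shown in the log. The sketch below is a user-space stand-in with the same contract; `sketch_do_div` is a hypothetical name, not the kernel macro, and it uses shift-and-subtract so that no 64-bit division operator appears.

```c
#include <stdint.h>

/*
 * Hypothetical user-space stand-in for the contract of do_div():
 * divide *n by base in place and return the remainder.
 * base must be non-zero. Only shifts, compares and subtraction are used.
 */
static uint32_t sketch_do_div(uint64_t *n, uint32_t base)
{
    uint64_t rem = 0;
    uint64_t quot = 0;

    for (int bit = 63; bit >= 0; bit--) {
        /* Bring down the next dividend bit. */
        rem = (rem << 1) | ((*n >> bit) & 1);
        if (rem >= base) {
            rem -= base;
            quot |= (uint64_t)1 << bit;
        }
    }
    *n = quot;
    return (uint32_t)rem;
}
```

A caller passes the dividend by pointer, reads the quotient back from it, and takes the remainder from the return value.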
2011-11-01 | i7core_edac: scrubbing fixups | Nils Carlson | 1 | -8/+133
Get a more reliable DCLK value from DMI, name the SCRUBINTERVAL mask and guard against potential overflow in the scrub rate computations. Signed-off-by: Nils Carlson <nils.carlson@ericsson.com>
2011-11-01 | i7core_edac: return -ENODEV if no MC is found | Mauro Carvalho Chehab | 1 | -2/+18
Nehalem-EX uses a different memory controller. However, as the memory controller is not visible on some Nehalem/Nehalem-EP, we need to indirectly probe via an X58 PCI device. The same devices are found on (some) Nehalem-EX. So, on those machines, the probe routine needs to return -ENODEV, as the actual Memory Controller registers won't be detected. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-11-01 | i7core_edac: use edac's own way to print errors | Mauro Carvalho Chehab | 1 | -1/+2
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-11-01 | i7core_edac: Drop the edac_mce facility | Borislav Petkov | 1 | -26/+25
Remove edac_mce pieces and use the normal MCE decoder notifier chain by retaining the same functionality with considerably less code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-10-31 | EDAC i7core: Use mce socketid for better compatibility | Thomas Renninger | 1 | -1/+1
mce->socketid and cpu_data(mce->cpu).phys_proc_id are the same; compare with mce_setup (in mce.c):
    m->cpu = m->extcpu = smp_processor_id();
    ...
    m->socketid = cpu_data(m->extcpu).phys_proc_id;
This makes it easier, for example, for XEN patches to hook into the MCE subsystem. Compile tested on x86_64. Signed-off-by: Thomas Renninger <trenn@suse.de> CC: JBeulich@novell.com CC: linux-edac@vger.kernel.org CC: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-10-31 | i7core_edac: Don't enable memory scrubbing for Xeon 35xx | Mauro Carvalho Chehab | 1 | -7/+39
Xeon 35xx doesn't mention memory scrub. It seems that only Xeon 55xx and above support it. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-10-31 | i7core_edac: Add scrubbing support | Samuel Gabrielsson | 1 | -0/+126
Add scrubbing support to i7core_edac, tested on Intel Xeon L5638. Signed-off-by: Samuel Gabrielsson <samuel.gabrielsson@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-10-31 | i7core_edac: Fix oops when trying to inject errors | Mauro Carvalho Chehab | 1 | -2/+35
Error injection needs the pci device 0:0. So, we need to revert this changeset: 79daef2099a02fed35747c23bad22f30441133ea. Tests need to be made to be sure that refcount won't be wrong as noticed before. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-10-31 | i7core_edac: fix misuse of logical operation in place of bitop | David Sterba | 1 | -1/+1
CC: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
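The log does not show the expression that was changed, so the following is only a generic illustration of the bug class named in the subject line. The flag name and value are hypothetical and do not correspond to a real i7core register field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit, chosen only for illustration. */
#define SKETCH_FLAG_ENABLED 0x04u

/* Buggy: '&&' only asks whether both operands are non-zero,
 * so any non-zero register value passes the test. */
static bool flag_set_buggy(uint32_t reg)
{
    return reg && SKETCH_FLAG_ENABLED;
}

/* Fixed: '&' masks out the single bit of interest. */
static bool flag_set_fixed(uint32_t reg)
{
    return (reg & SKETCH_FLAG_ENABLED) != 0;
}
```

With `reg == 0x01` the buggy form reports the flag as set although bit 2 is clear, which is why such a one-character change can alter behaviour.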
2011-08-18 | i7core_edac: fixed typo in error count calculation | Mathias Krause | 1 | -1/+1
Based on a patch from the PaX Team, found during a clang analysis pass. Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: PaX Team <pageexec@freemail.hu> Cc: stable@kernel.org [v2.6.35+] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 | Merge branch 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6 | Linus Torvalds | 1 | -1/+1
* 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
  gfs2: Drop __TIME__ usage
  isdn/diva: Drop __TIME__ usage
  atm: Drop __TIME__ usage
  dlm: Drop __TIME__ usage
  wan/pc300: Drop __TIME__ usage
  parport: Drop __TIME__ usage
  hdlcdrv: Drop __TIME__ usage
  baycom: Drop __TIME__ usage
  pmcraid: Drop __DATE__ usage
  edac: Drop __DATE__ usage
  rio: Drop __DATE__ usage
  scsi/wd33c93: Drop __TIME__ usage
  scsi/in2000: Drop __TIME__ usage
  aacraid: Drop __TIME__ usage
  media/cx231xx: Drop __TIME__ usage
  media/radio-maxiradio: Drop __TIME__ usage
  nozomi: Drop __TIME__ usage
  cyclades: Drop __TIME__ usage
2011-04-19 | edac: Drop __DATE__ usage | Michal Marek | 1 | -1/+1
The kernel already prints its build timestamp during boot, no need to repeat it in random drivers and produce different object files each time. Cc: Doug Thompson <dougthompson@xmission.com> Cc: bluesmoke-devel@lists.sourceforge.net Cc: linux-edac@vger.kernel.org Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-03-31 | Fix common misspellings | Lucas De Marchi | 1 | -1/+1
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2010-12-28 | i7core_edac: fix typos in comments | David Sterba | 1 | -3/+3
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-10-24 | i7core_edac: return -ENODEV when devices were already probed | Mauro Carvalho Chehab | 1 | -1/+1
Due to the nature of i7core, we need to probe and attach all PCI devices used by this driver the first time probe is called. However, the PCI core will call the probe routine once for each CPU socket. If we return -EINVAL to those calls, it would seem that the driver fails when, in fact, there are no more devices left to initialize. Changing the return code to -ENODEV solves this issue. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
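The probe-once behaviour described above can be sketched in user space as follows. This is a simplified model under stated assumptions, not the driver's code: `sketch_probe` and `sketch_attach_all_devices` are hypothetical names, and only the `probed` counter idea and the `-ENODEV` return value come from the log.

```c
#include <errno.h>

/* Set once the first probe call has attached everything. */
static int probed;

/* Hypothetical stand-in for attaching every device the driver needs. */
static int sketch_attach_all_devices(void)
{
    return 0;
}

/* Model of a probe routine that the bus core calls once per socket. */
static int sketch_probe(void)
{
    int rc;

    /* Later calls have nothing left to initialise: report "no device",
     * which is not treated as a driver failure. */
    if (probed)
        return -ENODEV;

    rc = sketch_attach_all_devices();
    if (rc < 0)
        return rc;

    probed = 1;
    return 0;
}
```

The first call does all the work and returns 0; every later call returns `-ENODEV` without touching the hardware again.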
2010-10-24 | i7core_edac: properly terminate pci_dev_table | Mauro Carvalho Chehab | 1 | -4/+5
pci_xeon_fixup() expects a null-terminated table, while i7core_get_all_devices just iterates over 0..ARRAY_SIZE. As other tables are zero-terminated, change this one to be terminated with 0 as well, fixing a bug where the code may run past the end of the table. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
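A minimal sketch of the convention this commit settles on: the table ends with an all-zero entry and every consumer walks until it reaches that entry. The struct layout and the entry names are placeholders chosen for illustration, not the driver's real `pci_dev_table`.

```c
#include <stddef.h>

/* Hypothetical table entry; a NULL name marks the terminator. */
struct sketch_dev_table {
    const char *name;
    int n_devs;
};

static const struct sketch_dev_table sketch_tables[] = {
    { "family_a", 3 },   /* placeholder entries */
    { "family_b", 2 },
    { NULL,       0 },   /* terminator: every walker stops here */
};

/* Count the entries by walking to the terminator, not by array size,
 * so the function works for any table that follows the convention. */
static size_t sketch_count_tables(const struct sketch_dev_table *t)
{
    size_t n = 0;

    while (t->name != NULL) {
        n++;
        t++;
    }
    return n;
}
```

Mixing the two styles is what goes wrong: a walker that expects a terminator runs off the end of a table that lacks one, and a size-based loop over a terminated table would also visit the empty last entry.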
2010-10-24 | i7core_edac: Avoid PCI refcount to reach zero on successive load/reload | Mauro Carvalho Chehab | 1 | -0/+7
That's a nasty bug that took me a lot of time to track down, and whose fix took just one line. The best fragrances and the worst poisons are shipped in the smallest bottles. drivers/pci/search.c implements the pci_get_device function. The normal behavior is that you call it, and the function returns a pdev pointer and increments pdev->kobj.kref.refcount of the PCI device. However, if you want to keep searching, you need to pass the previous pdev pointer to the search. When you pass a non-NULL pointer in the "from" argument, pci_get_device will decrement that device's pdev->kobj.kref.refcount, assuming that the driver won't be using the previous pdev. The solution is simple: we just need to call pci_dev_get() manually for the pdevs that the driver will actually use. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
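The reference-count behaviour described above can be modelled in user space. This is a toy model, not the PCI core: `sketch_get_device` and `sketch_dev_get` are hypothetical stand-ins that only imitate the documented contract of `pci_get_device()` (drop the reference on `from`, return the next device with a new reference) and `pci_dev_get()` (take one more reference).

```c
#include <stddef.h>

struct sketch_pdev {
    int refcount;
};

#define SKETCH_NDEVS 3
static struct sketch_pdev sketch_bus[SKETCH_NDEVS];

/* Iterator model: drops the reference held on 'from' (if any) and
 * returns the next device with a fresh reference, or NULL at the end. */
static struct sketch_pdev *sketch_get_device(struct sketch_pdev *from)
{
    size_t next = from ? (size_t)(from - sketch_bus) + 1 : 0;

    if (from)
        from->refcount--;
    if (next >= SKETCH_NDEVS)
        return NULL;
    sketch_bus[next].refcount++;
    return &sketch_bus[next];
}

/* Model of taking an extra reference on a device the caller keeps. */
static struct sketch_pdev *sketch_dev_get(struct sketch_pdev *p)
{
    if (p)
        p->refcount++;
    return p;
}

/* Walk the bus and keep every device. Without the extra reference,
 * each kept device would be back at zero after the next iteration. */
static void sketch_collect_all(struct sketch_pdev *kept[SKETCH_NDEVS])
{
    struct sketch_pdev *p = NULL;
    size_t i = 0;

    while ((p = sketch_get_device(p)) != NULL)
        kept[i++] = sketch_dev_get(p);
}
```

After the walk each kept device holds exactly one reference, the one taken explicitly, which the driver later releases with the matching put.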
2010-10-24 | i7core_edac: Fix refcount error at PCI devices | Mauro Carvalho Chehab | 1 | -35/+2
Probably due to a bug or some testing logic at PCI level, device refcount for <bus>:00.0 device is decremented at the end of the pci_get_device, made by i7core_get_all_devices(). The fact is that the first versions of the driver relied on those devices to probe for Nehalem, but the current versions don't use it at all. So, let's just remove those devices from the driver, making it simpler and fixing the bug. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: it is safe to i7core_unregister_mci() when mci=NULL | Mauro Carvalho Chehab | 1 | -8/+5
i7core_unregister_mci() checks internally when mci=NULL. There's no need to test it outside. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Fix an oops at i7core probe | Mauro Carvalho Chehab | 1 | -4/+4
Changeset c91d57ba9ce5b5c93a7077e2f72510eb1f9131c4 moved the init of the priv pointer to the end of the probe routine. However, we need it before that; otherwise, we hit an OOPS:
[ 67.743453] EDAC DEBUG: mci_bind_devs: Associated fn 0.0, dev = ffff88011b46e000, socket 0
[ 67.751861] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[ 67.759685] IP: [<ffffffffa017e484>] i7core_probe+0x979/0x130c [i7core_edac]
[ 67.766721] PGD 10bd38067 PUD 10bd37067 PMD 0
[ 67.771178] Oops: 0000 [#1] SMP
[ 67.774414] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
[ 67.782213] CPU 1
[ 67.784042] Modules linked in: i7core_edac(+) edac_core cpufreq_ondemand binfmt_misc dm_multipath video output pci_slot snd_hda_codd
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Remove unused member channels in i7core_pvt | Hidetoshi Seto | 1 | -2/+0
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Remove unused arg csrow from get_dimm_config | Hidetoshi Seto | 1 | -7/+7
A local is enough. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Reduce args of i7core_register_mci | Hidetoshi Seto | 1 | -15/+9
We can check the number of channels in i7core_register_mci. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Introduce i7core_unregister_mci | Hidetoshi Seto | 1 | -32/+44
In i7core_probe, when setup of the mci for the 2nd or a later socket fails, we should clean up the mci already prepared for the earlier sockets before the "put" of all devices. So let's have an i7core_unregister_mci that can be shared between here and i7core_remove. While here, fix a typo "hanler". Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
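The cleanup rule in this commit (undo the sockets that were already set up when a later one fails) follows the usual C unwind idiom. The sketch below is a user-space model: all `sketch_*` names, the socket count and the `fail_at` test parameter are hypothetical and only stand in for the register/unregister pair named in the log.

```c
#include <errno.h>
#include <stdbool.h>

#define SKETCH_NSOCKETS 4
static bool sketch_registered[SKETCH_NSOCKETS];

/* Hypothetical register step; fails for the socket given in fail_at. */
static int sketch_register_mci(int socket, int fail_at)
{
    if (socket == fail_at)
        return -ENOMEM;
    sketch_registered[socket] = true;
    return 0;
}

/* Hypothetical counterpart shared by the error path and by remove. */
static void sketch_unregister_mci(int socket)
{
    sketch_registered[socket] = false;
}

/* Register every socket; on failure, unwind the ones already done. */
static int sketch_probe_all(int fail_at)
{
    int socket;
    int rc = 0;

    for (socket = 0; socket < SKETCH_NSOCKETS; socket++) {
        rc = sketch_register_mci(socket, fail_at);
        if (rc < 0)
            goto fail;
    }
    return 0;

fail:
    /* 'socket' is the one that failed; undo all earlier ones. */
    while (--socket >= 0)
        sketch_unregister_mci(socket);
    return rc;
}
```

Keeping the undo step in one function means the probe error path and the normal remove path cannot drift apart.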
2010-10-24 | i7core_edac: Use saved pointers | Hidetoshi Seto | 1 | -3/+2
We already have saved pointers. Use shorter ones. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Check probe counter in i7core_remove | Hidetoshi Seto | 1 | -0/+6
Prevent i7core_remove from running multiple times. Otherwise the value of probed will become negative and things will go wrong. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Call pci_dev_put() when alloc_i7core_dev() failed | Hidetoshi Seto | 1 | -1/+3
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Fix error path of i7core_register_mci | Hidetoshi Seto | 1 | -5/+11
Release resources properly. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Fix order of lines in i7core_register_mci | Hidetoshi Seto | 1 | -11/+9
The flag is_registered is not initialized until mci_bind_devs() is called. Refer to it properly. mci->dev and mci->edac_check are required in edac_mc_add_mc(), so prepare them just before the call. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Always do get/put for all devices | Hidetoshi Seto | 1 | -13/+14
We already do 'get' for all sockets at once. So do 'put' in the same way. Also change the arguments of the 'get' function to void, since it handles only the single, static, known-size table pci_dev_table[]. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Introduce i7core_pci_ctl_create/release | Hidetoshi Seto | 1 | -20/+24
Add a pair of methods. While here, tidy up the lines in i7core_register_mci() a bit. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Introduce free_i7core_dev | Hidetoshi Seto | 1 | -5/+9
Add a method to pair with the previously introduced alloc_i7core_dev(). Using the two as a pair helps with proper resource handling. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Introduce alloc_i7core_dev | Hidetoshi Seto | 1 | -10/+24
It's nice to have a method for a single purpose. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Reduce args of i7core_get_onedevice | Hidetoshi Seto | 1 | -12/+9
Since we need to pass the index of the entry, pass the table itself instead of passing individual members of the table. While here make it static. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Fix the logic in i7core_remove() | Hidetoshi Seto | 1 | -2/+2
Commit 47251b4d960bdfa648b0d06dbc6d445f41cb3906 changed the logic for unexplained reasons. It looks strange that it can release i7core_dev without calling i7core_put_devices(), which releases i7core_dev->pdev. Fix that part. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Don't do the legacy PCI probe by default | Mauro Carvalho Chehab | 1 | -1/+6
The legacy PCI probe sometimes causes hangs. Better to have it disabled by default, and have a parameter to enable it. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: don't use a freed mci struct | Mauro Carvalho Chehab | 1 | -2/+1
This is a nasty bug. Since the kobject count will be reduced to zero by edac_mc_del_mc(), and this triggers the kobj release method, the mci memory will be freed automatically. So, all we have left is ctl_name, as shown by enabling debug:
[ 80.822186] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1020: edac_remove_sysfs_mci_device() remove_link
[ 80.832590] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1024: edac_remove_sysfs_mci_device() remove_mci_instance
[ 80.843776] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 640: edac_mci_control_release() mci instance idx=0 releasing
[ 80.855163] EDAC MC: Removed device 0 for i7core_edac.c i7 core #0: DEV 0000:3f:03.0
[ 80.862936] EDAC DEBUG: in drivers/edac/i7core_edac.c, line at 2089: (null): free structs
[ 80.871134] EDAC DEBUG: in drivers/edac/edac_mc.c, line at 238: edac_mc_free()
[ 80.878379] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 726: edac_mc_unregister_sysfs_main_kobj()
[ 80.888043] EDAC DEBUG: in drivers/edac/i7core_edac.c, line at 1232: drivers/edac/i7core_edac.c: i7core_put_devices()
Also, kfree(mci) shouldn't happen at the kobj.release, as it happens when edac_remove_sysfs_mci_device() is called, but the logic is:
    edac_remove_sysfs_mci_device(mci);
    edac_printk(KERN_INFO, EDAC_MC, "Removed device %d for %s %s: DEV %s\n", mci->mc_idx, mci->mod_name, mci->ctl_name, edac_dev_name(mci));
So, as the edac_printk() needs the mci struct, this generates an OOPS. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
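The ordering problem described above (reading a struct after the call that drops its last reference) can be shown with a small user-space model. This is not the EDAC core and not the actual fix in this commit; `sketch_mci`, `sketch_put` and `sketch_del_mc` are hypothetical names, and the model only illustrates the safe order: read what you need first, release afterwards.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for the mci struct. */
struct sketch_mci {
    int refcount;
    int mc_idx;
    char ctl_name[32];
};

/* Counts how many objects have been released, for inspection. */
static int sketch_released;

/* Drop one reference; the last drop frees the object. */
static void sketch_put(struct sketch_mci *mci)
{
    if (--mci->refcount == 0) {
        sketch_released++;
        free(mci);
    }
}

/* Safe ordering: format the message while the object is still valid,
 * then drop the reference. 'mci' must not be touched after the put. */
static void sketch_del_mc(struct sketch_mci *mci, char *msg, size_t len)
{
    snprintf(msg, len, "Removed device %d for %s",
             mci->mc_idx, mci->ctl_name);
    sketch_put(mci);
}
```

Swapping the two statements in `sketch_del_mc` would reproduce the bug class: the message would be built from memory that has already been freed.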
2010-10-24 | edac_core: Print debug messages at release calls | Mauro Carvalho Chehab | 1 | -0/+1
This is important to track a nasty bug at the free logic. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: explicitly remove PCI devices from the devices list | Mauro Carvalho Chehab | 1 | -4/+6
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: MCE NMI handling should stop first | Mauro Carvalho Chehab | 1 | -1/+8
Otherwise, an NMI may happen, causing a race condition and a panic. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Initialize all priv vars before start polling | Mauro Carvalho Chehab | 1 | -12/+12
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Improve debug to seek for register/remove errors | Mauro Carvalho Chehab | 1 | -2/+9
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: move #if PAGE_SHIFT to edac_core.h | Mauro Carvalho Chehab | 1 | -5/+1
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Properly mark const static vars as such | Mauro Carvalho Chehab | 1 | -40/+77
There are two groups of sysfs attributes: one for rdimm and another for udimm. Instead of dynamically changing the single static struct to handle udimm's, declare two vars and make them constant. This avoids the risk of having two or more memory controllers, each needing a different set of attributes. While here, use const in all places where it is applicable. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> edac_core: use const for constant sysfs arguments Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
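A minimal sketch of the approach described above: keep one constant table per DIMM type and select a pointer to the right one, instead of patching a single shared table at run time. The table contents are placeholders (only the `udimm0`..`udimm2` names appear elsewhere in this log), and `sketch_pick_attrs` is a hypothetical helper.

```c
#include <stdbool.h>
#include <stddef.h>

/* Two read-only, NULL-terminated attribute name tables.
 * The rdimm entry name is a placeholder for illustration. */
static const char *const sketch_udimm_attrs[] = {
    "udimm0", "udimm1", "udimm2", NULL
};
static const char *const sketch_rdimm_attrs[] = {
    "rdimm_counters", NULL
};

/* Pick the table for one memory controller. Nothing is modified,
 * so two controllers of different types cannot disturb each other. */
static const char *const *sketch_pick_attrs(bool is_registered)
{
    return is_registered ? sketch_rdimm_attrs : sketch_udimm_attrs;
}
```

Because both tables are `const`, a mixed system simply ends up with two controllers pointing at different tables.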
2010-10-24 | i7core_edac: move static vars to the beginning of the file | Mauro Carvalho Chehab | 1 | -6/+5
While here, don't initialize probed with 0. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24 | i7core_edac: Be sure that the edac pci handler will be properly released | Mauro Carvalho Chehab | 1 | -15/+23
With multi-sockets, more than one edac pci handler is enabled. Be sure to un-register all instances. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-01 | i7core_edac: fix panic in udimm sysfs attributes registration | Marcin Slusarz | 1 | -0/+1
Array of udimm sysfs attributes was not ended with NULL marker, leading to dereference of random memory.
EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm0
EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm1
EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm2
BUG: unable to handle kernel NULL pointer dereference at 00000000000001a4
IP: [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1
Pid: 1, comm: swapper Not tainted 2.6.36-rc3-nv+ #483 P6T SE/System Product Name
RIP: 0010:[<ffffffff81330b36>] [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1
(...)
Call Trace:
[<ffffffff81330b86>] edac_create_mci_instance_attributes+0x198/0x1f1
[<ffffffff81330c9a>] edac_create_sysfs_mci_device+0xbb/0x2b2
[<ffffffff8132f533>] edac_mc_add_mc+0x46b/0x557
[<ffffffff81428901>] i7core_probe+0xccf/0xec0
RIP [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1
---[ end trace 20de320855b81d78 ]---
Kernel panic - not syncing: Attempted to kill init!
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Doug Thompson <dougthompson@xmission.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-07-26 | quiesce EDAC initialisation on desktop/mobile i7 | Daniel J Blueman | 1 | -1/+1
Don't print failure to detect Core i7 EDAC facilities to the console at boot time, most often occurring on Core i7 desktops and laptops. Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>