From 2b2fd87a6ef56ba7647a578e81bb8c8efda166b8 Mon Sep 17 00:00:00 2001 From: Weidong Han Date: Fri, 17 Apr 2009 16:42:12 +0800 Subject: docs, x86: add nox2apic back to kernel-parameters.txt "nox2apic" was removed from kernel-parameters.txt by mistake, when entries were sorted in alpha order (commit 0cb55ad2). But this early parameter is still there, add it back to kernel-parameters.txt. [ Impact: add boot parameter description ] Signed-off-by: Suresh Siddha Signed-off-by: Weidong Han Acked-by: David Woodhouse Cc: Randy Dunlap Cc: iommu@lists.linux-foundation.org Cc: allen.m.kay@intel.com Cc: fenghua.yu@intel.com LKML-Reference: <1239957736-6161-2-git-send-email-weidong.han@intel.com> Signed-off-by: Ingo Molnar --- Documentation/kernel-parameters.txt | 2 ++ 1 file changed, 2 insertions(+) (limited to 'Documentation/kernel-parameters.txt') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 6172e4360f60..33989d284ff0 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1588,6 +1588,8 @@ and is between 256 and 4096 characters. It is defined in the file nowb [ARM] + nox2apic [X86-64,APIC] Do not enable x2APIC mode. + nptcg= [IA64] Override max number of concurrent global TLB purges which is reported from either PAL_VM_SUMMARY or SAL PALO. -- cgit v1.2.3-59-g8ed1b From 03ea81550676296d94596e4337c771c6ba29f542 Mon Sep 17 00:00:00 2001 From: Weidong Han Date: Fri, 17 Apr 2009 16:42:15 +0800 Subject: x86, intr-remap: add option to disable interrupt remapping Add option "nointremap" to disable interrupt remapping. [ Impact: add new boot option ] Signed-off-by: Suresh Siddha Signed-off-by: Weidong Han Acked-by: David Woodhouse Cc: iommu@lists.linux-foundation.org Cc: allen.m.kay@intel.com Cc: fenghua.yu@intel.com LKML-Reference: <1239957736-6161-5-git-send-email-weidong.han@intel.com> Signed-off-by: Ingo Molnar --- Documentation/kernel-parameters.txt | 3 +++ drivers/pci/intr_remapping.c | 11 +++++++++++ 2 files changed, 14 insertions(+) (limited to 'Documentation/kernel-parameters.txt') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 33989d284ff0..843cb6646d82 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1533,6 +1533,9 @@ and is between 256 and 4096 characters. It is defined in the file noinitrd [RAM] Tells the kernel not to load any configured initial RAM disk. + nointremap [X86-64, Intel-IOMMU] Do not enable interrupt + remapping. + nointroute [IA-64] nojitter [IA64] Disables jitter checking for ITC timers. diff --git a/drivers/pci/intr_remapping.c b/drivers/pci/intr_remapping.c index 5c2142656e96..842039e4955b 100644 --- a/drivers/pci/intr_remapping.c +++ b/drivers/pci/intr_remapping.c @@ -15,6 +15,14 @@ static struct ioapic_scope ir_ioapic[MAX_IO_APICS]; static int ir_ioapic_num; int intr_remapping_enabled; +static int disable_intremap; +static __init int setup_nointremap(char *str) +{ + disable_intremap = 1; + return 0; +} +early_param("nointremap", setup_nointremap); + struct irq_2_iommu { struct intel_iommu *iommu; u16 irte_index; @@ -506,6 +514,9 @@ int __init intr_remapping_supported(void) { struct dmar_drhd_unit *drhd; + if (disable_intremap) + return 0; + for_each_drhd_unit(drhd) { struct intel_iommu *iommu = drhd->iommu; -- cgit v1.2.3-59-g8ed1b From 1dcdb5a9e7c235e6e80f1f4d5b8247b3e5347e48 Mon Sep 17 00:00:00 2001 From: Andi Kleen Date: Mon, 27 Apr 2009 17:44:11 +0200 Subject: oprofile: re-add force_arch_perfmon option This re-adds the force_arch_perfmon option that was in the original arch perfmon patchkit. Originally this was rejected in favour of a generalized perfmon=name option, but it turned out implementing the later in a reliable way is hard (and it would have been easy to crash the kernel if a user gets it wrong) But now Atom and Core i7 support being readded a user would need to update their oprofile userland to beyond 0.9.4 to use oprofile again on Atom or Core i7. To avoid this problem readd the force_arch_perfmon option. Signed-off-by: Andi Kleen Signed-off-by: Robert Richter --- Documentation/kernel-parameters.txt | 6 ++++++ arch/x86/oprofile/nmi_int.c | 6 ++++++ 2 files changed, 12 insertions(+) (limited to 'Documentation/kernel-parameters.txt') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 90b3924071b6..9b9566bf3301 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1650,6 +1650,12 @@ and is between 256 and 4096 characters. It is defined in the file oprofile.timer= [HW] Use timer interrupt instead of performance counters + oprofile.force_arch_perfmon=1 [X86] + Force use of architectural perfmon instead of + the CPU specific event set. + This might be useful if you have older oprofile + userland or if you want common events over Intel CPUs. + osst= [HW,SCSI] SCSI Tape Driver Format: , See also Documentation/scsi/st.txt. diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c index 202864ad49a7..e5171c99e157 100644 --- a/arch/x86/oprofile/nmi_int.c +++ b/arch/x86/oprofile/nmi_int.c @@ -389,10 +389,16 @@ static int __init p4_init(char **cpu_type) return 0; } +int force_arch_perfmon; +module_param(force_arch_perfmon, int, 0); + static int __init ppro_init(char **cpu_type) { __u8 cpu_model = boot_cpu_data.x86_model; + if (force_arch_perfmon && cpu_has_arch_perfmon) + return 0; + switch (cpu_model) { case 0 ... 2: *cpu_type = "i386/ppro"; -- cgit v1.2.3-59-g8ed1b From 7e4e0bd50e80df2fe5501f48f872448376cdd997 Mon Sep 17 00:00:00 2001 From: Robert Richter Date: Wed, 6 May 2009 12:10:23 +0200 Subject: oprofile: introduce module_param oprofile.cpu_type This patch removes module_param oprofile.force_arch_perfmon and introduces oprofile.cpu_type=archperfmon instead. This new parameter can be reused for other models and architectures. Currently only archperfmon is supported. Cc: Andi Kleen Signed-off-by: Robert Richter --- Documentation/kernel-parameters.txt | 12 +++++++----- arch/x86/oprofile/nmi_int.c | 13 +++++++++++-- 2 files changed, 18 insertions(+), 7 deletions(-) (limited to 'Documentation/kernel-parameters.txt') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 9b9566bf3301..6ce5f48859cc 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1650,11 +1650,13 @@ and is between 256 and 4096 characters. It is defined in the file oprofile.timer= [HW] Use timer interrupt instead of performance counters - oprofile.force_arch_perfmon=1 [X86] - Force use of architectural perfmon instead of - the CPU specific event set. - This might be useful if you have older oprofile - userland or if you want common events over Intel CPUs. + oprofile.cpu_type= Force an oprofile cpu type + This might be useful if you have an older oprofile + userland or if you want common events. + Format: { archperfmon } + archperfmon: [X86] Force use of architectural + perfmon on Intel CPUs instead of the + CPU specific event set. osst= [HW,SCSI] SCSI Tape Driver Format: , diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c index 3308147182ae..3b285e656e27 100644 --- a/arch/x86/oprofile/nmi_int.c +++ b/arch/x86/oprofile/nmi_int.c @@ -386,8 +386,17 @@ static int __init p4_init(char **cpu_type) return 0; } -int force_arch_perfmon; -module_param(force_arch_perfmon, int, 0); +static int force_arch_perfmon; +static int force_cpu_type(const char *str, struct kernel_param *kp) +{ + if (!strcmp(str, "archperfmon")) { + force_arch_perfmon = 1; + printk(KERN_INFO "oprofile: forcing architectural perfmon\n"); + } + + return 0; +} +module_param_call(cpu_type, force_cpu_type, NULL, NULL, 0); static int __init ppro_init(char **cpu_type) { -- cgit v1.2.3-59-g8ed1b From 5789ba3bd0a3cd20df5980ebf03358f2eb44fd67 Mon Sep 17 00:00:00 2001 From: Eric Paris Date: Thu, 21 May 2009 15:47:06 -0400 Subject: IMA: Minimal IMA policy and boot param for TCB IMA policy The IMA TCB policy is dangerous. A normal use can use all of a system's memory (which cannot be freed) simply by building and running lots of executables. The TCB policy is also nearly useless because logging in as root often causes a policy violation when dealing with utmp, thus rendering the measurements meaningless. There is no good fix for this in the kernel. A full TCB policy would need to be loaded in userspace using LSM rule matching to get both a protected and useful system. But, if too little is measured before userspace can load a real policy one again ends up with a meaningless set of measurements. One option would be to put the policy load inside the initrd in order to get it early enough in the boot sequence to be useful, but this runs into trouble with the LSM. For IMA to measure the LSM policy and the LSM policy loading mechanism it needs rules to do so, but we already talked about problems with defaulting to such broad rules.... IMA also depends on the files being measured to be on an FS which implements and supports i_version. Since the only FS with this support (ext4) doesn't even use it by default it seems silly to have any IMA rules by default. This should reduce the performance overhead of IMA to near 0 while still letting users who choose to configure their machine as such to inclue the ima_tcb kernel paramenter and get measurements during boot before they can load a customized, reasonable policy in userspace. Signed-off-by: Eric Paris Acked-by: Mimi Zohar Signed-off-by: James Morris --- Documentation/kernel-parameters.txt | 6 ++++++ security/integrity/ima/ima_policy.c | 30 +++++++++++++++++++++++++++--- 2 files changed, 33 insertions(+), 3 deletions(-) (limited to 'Documentation/kernel-parameters.txt') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index e87bdbfbcc75..d9a24a04cfb1 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -914,6 +914,12 @@ and is between 256 and 4096 characters. It is defined in the file Formt: { "sha1" | "md5" } default: "sha1" + ima_tcb [IMA] + Load a policy which meets the needs of the Trusted + Computing Base. This means IMA will measure all + programs exec'd, files mmap'd for exec, and all files + opened for read by uid=0. + in2000= [HW,SCSI] See header of drivers/scsi/in2000.c. diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c index 31d677f7c65f..4719bbf1641a 100644 --- a/security/integrity/ima/ima_policy.c +++ b/security/integrity/ima/ima_policy.c @@ -45,9 +45,17 @@ struct ima_measure_rule_entry { } lsm[MAX_LSM_RULES]; }; -/* Without LSM specific knowledge, the default policy can only be +/* + * Without LSM specific knowledge, the default policy can only be * written in terms of .action, .func, .mask, .fsmagic, and .uid */ + +/* + * The minimum rule set to allow for full TCB coverage. Measures all files + * opened or mmap for exec and everything read by root. Dangerous because + * normal users can easily run the machine out of memory simply building + * and running executables. + */ static struct ima_measure_rule_entry default_rules[] = { {.action = DONT_MEASURE,.fsmagic = PROC_SUPER_MAGIC,.flags = IMA_FSMAGIC}, {.action = DONT_MEASURE,.fsmagic = SYSFS_MAGIC,.flags = IMA_FSMAGIC}, @@ -59,6 +67,8 @@ static struct ima_measure_rule_entry default_rules[] = { .flags = IMA_FUNC | IMA_MASK}, {.action = MEASURE,.func = BPRM_CHECK,.mask = MAY_EXEC, .flags = IMA_FUNC | IMA_MASK}, + {.action = MEASURE,.func = PATH_CHECK,.mask = MAY_READ,.uid = 0, + .flags = IMA_FUNC | IMA_MASK | IMA_UID}, }; static LIST_HEAD(measure_default_rules); @@ -67,6 +77,14 @@ static struct list_head *ima_measure; static DEFINE_MUTEX(ima_measure_mutex); +static bool ima_use_tcb __initdata; +static int __init default_policy_setup(char *str) +{ + ima_use_tcb = 1; + return 1; +} +__setup("ima_tcb", default_policy_setup); + /** * ima_match_rules - determine whether an inode matches the measure rule. * @rule: a pointer to a rule @@ -162,9 +180,15 @@ int ima_match_policy(struct inode *inode, enum ima_hooks func, int mask) */ void ima_init_policy(void) { - int i; + int i, entries; + + /* if !ima_use_tcb set entries = 0 so we load NO default rules */ + if (ima_use_tcb) + entries = ARRAY_SIZE(default_rules); + else + entries = 0; - for (i = 0; i < ARRAY_SIZE(default_rules); i++) + for (i = 0; i < entries; i++) list_add_tail(&default_rules[i].list, &measure_default_rules); ima_measure = &measure_default_rules; } -- cgit v1.2.3-59-g8ed1b From 29fcefba8a2f0fea11e2b721fe174a1832801284 Mon Sep 17 00:00:00 2001 From: Pekka Enberg Date: Sun, 24 May 2009 11:13:17 +0300 Subject: kmemtrace: fix kernel parameter documentation The kmemtrace.enable kernel parameter no longer works. To enable kmemtrace at boot-time, you must pass "ftrace=kmemtrace" instead. [ Impact: remove obsolete kernel parameter documentation ] Cc: Eduard - Gabriel Munteanu Signed-off-by: Pekka Enberg LKML-Reference: Signed-off-by: Frederic Weisbecker --- Documentation/kernel-parameters.txt | 10 ---------- 1 file changed, 10 deletions(-) (limited to 'Documentation/kernel-parameters.txt') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index e87bdbfbcc75..9243dd84f4d6 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -56,7 +56,6 @@ parameter is applicable: ISAPNP ISA PnP code is enabled. ISDN Appropriate ISDN support is enabled. JOY Appropriate joystick support is enabled. - KMEMTRACE kmemtrace is enabled. LIBATA Libata driver is enabled LP Printer support is enabled. LOOP Loopback device support is enabled. @@ -1054,15 +1053,6 @@ and is between 256 and 4096 characters. It is defined in the file use the HighMem zone if it exists, and the Normal zone if it does not. - kmemtrace.enable= [KNL,KMEMTRACE] Format: { yes | no } - Controls whether kmemtrace is enabled - at boot-time. - - kmemtrace.subbufs=n [KNL,KMEMTRACE] Overrides the number of - subbufs kmemtrace's relay channel has. Set this - higher than default (KMEMTRACE_N_SUBBUFS in code) if - you experience buffer overruns. - kgdboc= [HW] kgdb over consoles. Requires a tty driver that supports console polling. (only serial suported for now) -- cgit v1.2.3-59-g8ed1b From d9cfed925448f097ec7faab80d903eb7e5f99712 Mon Sep 17 00:00:00 2001 From: Joerg Roedel Date: Tue, 19 May 2009 12:16:29 +0200 Subject: amd-iommu: remove amd_iommu_size kernel parameter This parameter is not longer necessary when aperture increases dynamically. Signed-off-by: Joerg Roedel --- Documentation/kernel-parameters.txt | 5 ----- arch/x86/kernel/amd_iommu.c | 18 ++++-------------- arch/x86/kernel/amd_iommu_init.c | 15 --------------- 3 files changed, 4 insertions(+), 34 deletions(-) (limited to 'Documentation/kernel-parameters.txt') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index e87bdbfbcc75..5b776c6e7964 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -329,11 +329,6 @@ and is between 256 and 4096 characters. It is defined in the file flushed before they will be reused, which is a lot of faster - amd_iommu_size= [HW,X86-64] - Define the size of the aperture for the AMD IOMMU - driver. Possible values are: - '32M', '64M' (default), '128M', '256M', '512M', '1G' - amijoy.map= [HW,JOY] Amiga joystick support Map of devices attached to JOY0DAT and JOY1DAT Format: , diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c index d129d8feba07..31d56c36010a 100644 --- a/arch/x86/kernel/amd_iommu.c +++ b/arch/x86/kernel/amd_iommu.c @@ -939,17 +939,10 @@ static void dma_ops_domain_free(struct dma_ops_domain *dom) * It also intializes the page table and the address allocator data * structures required for the dma_ops interface */ -static struct dma_ops_domain *dma_ops_domain_alloc(struct amd_iommu *iommu, - unsigned order) +static struct dma_ops_domain *dma_ops_domain_alloc(struct amd_iommu *iommu) { struct dma_ops_domain *dma_dom; - /* - * Currently the DMA aperture must be between 32 MB and 1GB in size - */ - if ((order < 25) || (order > 30)) - return NULL; - dma_dom = kzalloc(sizeof(struct dma_ops_domain), GFP_KERNEL); if (!dma_dom) return NULL; @@ -1087,7 +1080,6 @@ static int device_change_notifier(struct notifier_block *nb, struct protection_domain *domain; struct dma_ops_domain *dma_domain; struct amd_iommu *iommu; - int order = amd_iommu_aperture_order; unsigned long flags; if (devid > amd_iommu_last_bdf) @@ -1126,7 +1118,7 @@ static int device_change_notifier(struct notifier_block *nb, dma_domain = find_protection_domain(devid); if (dma_domain) goto out; - dma_domain = dma_ops_domain_alloc(iommu, order); + dma_domain = dma_ops_domain_alloc(iommu); if (!dma_domain) goto out; dma_domain->target_dev = devid; @@ -1826,7 +1818,6 @@ static void prealloc_protection_domains(void) struct pci_dev *dev = NULL; struct dma_ops_domain *dma_dom; struct amd_iommu *iommu; - int order = amd_iommu_aperture_order; u16 devid; while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) { @@ -1839,7 +1830,7 @@ static void prealloc_protection_domains(void) iommu = amd_iommu_rlookup_table[devid]; if (!iommu) continue; - dma_dom = dma_ops_domain_alloc(iommu, order); + dma_dom = dma_ops_domain_alloc(iommu); if (!dma_dom) continue; init_unity_mappings_for_device(dma_dom, devid); @@ -1865,7 +1856,6 @@ static struct dma_map_ops amd_iommu_dma_ops = { int __init amd_iommu_init_dma_ops(void) { struct amd_iommu *iommu; - int order = amd_iommu_aperture_order; int ret; /* @@ -1874,7 +1864,7 @@ int __init amd_iommu_init_dma_ops(void) * protection domain will be assigned to the default one. */ list_for_each_entry(iommu, &amd_iommu_list, list) { - iommu->default_dom = dma_ops_domain_alloc(iommu, order); + iommu->default_dom = dma_ops_domain_alloc(iommu); if (iommu->default_dom == NULL) return -ENOMEM; iommu->default_dom->domain.flags |= PD_DEFAULT_MASK; diff --git a/arch/x86/kernel/amd_iommu_init.c b/arch/x86/kernel/amd_iommu_init.c index 8c0be0902dac..762a4eefec93 100644 --- a/arch/x86/kernel/amd_iommu_init.c +++ b/arch/x86/kernel/amd_iommu_init.c @@ -121,7 +121,6 @@ u16 amd_iommu_last_bdf; /* largest PCI device id we have to handle */ LIST_HEAD(amd_iommu_unity_map); /* a list of required unity mappings we find in ACPI */ -unsigned amd_iommu_aperture_order = 26; /* size of aperture in power of 2 */ bool amd_iommu_isolate = true; /* if true, device isolation is enabled */ bool amd_iommu_unmap_flush; /* if true, flush on every unmap */ @@ -1137,9 +1136,6 @@ int __init amd_iommu_init(void) enable_iommus(); - printk(KERN_INFO "AMD IOMMU: aperture size is %d MB\n", - (1 << (amd_iommu_aperture_order-20))); - printk(KERN_INFO "AMD IOMMU: device isolation "); if (amd_iommu_isolate) printk("enabled\n"); @@ -1225,15 +1221,4 @@ static int __init parse_amd_iommu_options(char *str) return 1; } -static int __init parse_amd_iommu_size_options(char *str) -{ - unsigned order = PAGE_SHIFT + get_order(memparse(str, &str)); - - if ((order > 24) && (order < 31)) - amd_iommu_aperture_order = order; - - return 1; -} - __setup("amd_iommu=", parse_amd_iommu_options); -__setup("amd_iommu_size=", parse_amd_iommu_size_options); -- cgit v1.2.3-59-g8ed1b From 2af15d6a44b871ad4c2a651302374cde8f335480 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Thu, 28 May 2009 13:37:24 -0400 Subject: ftrace: add kernel command line function filtering When using ftrace=function on the command line to trace functions on boot up, one can not filter out functions that are commonly called. This patch adds two new ftrace command line commands. ftrace_notrace=function-list ftrace_filter=function-list Where function-list is a comma separated list of functions to filter. The ftrace_notrace will make the functions listed not be included in the function tracing, and ftrace_filter will only trace the functions listed. These two act the same as the debugfs/tracing/set_ftrace_notrace and debugfs/tracing/set_ftrace_filter respectively. The simple glob expressions that are allowed by the filter files can also be used by the command line interface. ftrace_notrace=rcu*,*lock,*spin* Will not trace any function that starts with rcu, ends with lock, or has the word spin in it. Note, if the self tests are enabled, they may interfere with the filtering set by the command lines. Signed-off-by: Steven Rostedt --- Documentation/kernel-parameters.txt | 17 +++++++++++++-- kernel/trace/ftrace.c | 42 +++++++++++++++++++++++++++++++++++++ 2 files changed, 57 insertions(+), 2 deletions(-) (limited to 'Documentation/kernel-parameters.txt') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 9243dd84f4d6..fcd3bfbe74e8 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -751,12 +751,25 @@ and is between 256 and 4096 characters. It is defined in the file ia64_pal_cache_flush instead of SAL_CACHE_FLUSH. ftrace=[tracer] - [ftrace] will set and start the specified tracer + [FTRACE] will set and start the specified tracer as early as possible in order to facilitate early boot debugging. ftrace_dump_on_oops - [ftrace] will dump the trace buffers on oops. + [FTRACE] will dump the trace buffers on oops. + + ftrace_filter=[function-list] + [FTRACE] Limit the functions traced by the function + tracer at boot up. function-list is a comma separated + list of functions. This list can be changed at run + time by the set_ftrace_filter file in the debugfs + tracing directory. + + ftrace_notrace=[function-list] + [FTRACE] Do not trace the functions specified in + function-list. This list can be changed at run time + by the set_ftrace_notrace file in the debugfs + tracing directory. gamecon.map[2|3]= [HW,JOY] Multisystem joystick and NES/SNES/PSX pad diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 140699a9a8a7..2074e5b7766b 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -32,6 +32,7 @@ #include #include +#include #include "trace_output.h" #include "trace_stat.h" @@ -2369,6 +2370,45 @@ void ftrace_set_notrace(unsigned char *buf, int len, int reset) ftrace_set_regex(buf, len, reset, 0); } +/* + * command line interface to allow users to set filters on boot up. + */ +#define FTRACE_FILTER_SIZE COMMAND_LINE_SIZE +static char ftrace_notrace_buf[FTRACE_FILTER_SIZE] __initdata; +static char ftrace_filter_buf[FTRACE_FILTER_SIZE] __initdata; + +static int __init set_ftrace_notrace(char *str) +{ + strncpy(ftrace_notrace_buf, str, FTRACE_FILTER_SIZE); + return 1; +} +__setup("ftrace_notrace=", set_ftrace_notrace); + +static int __init set_ftrace_filter(char *str) +{ + strncpy(ftrace_filter_buf, str, FTRACE_FILTER_SIZE); + return 1; +} +__setup("ftrace_filter=", set_ftrace_filter); + +static void __init set_ftrace_early_filter(char *buf, int enable) +{ + char *func; + + while (buf) { + func = strsep(&buf, ","); + ftrace_set_regex(func, strlen(func), 0, enable); + } +} + +static void __init set_ftrace_early_filters(void) +{ + if (ftrace_filter_buf[0]) + set_ftrace_early_filter(ftrace_filter_buf, 1); + if (ftrace_notrace_buf[0]) + set_ftrace_early_filter(ftrace_notrace_buf, 0); +} + static int ftrace_regex_release(struct inode *inode, struct file *file, int enable) { @@ -2829,6 +2869,8 @@ void __init ftrace_init(void) if (ret) pr_warning("Failed to register trace ftrace module notifier\n"); + set_ftrace_early_filters(); + return; failed: ftrace_disabled = 1; -- cgit v1.2.3-59-g8ed1b From 1745de5e5639457513fe43440f2800e23c3cbc7d Mon Sep 17 00:00:00 2001 From: Joerg Roedel Date: Fri, 22 May 2009 21:49:51 +0200 Subject: dma-debug: add dma_debug_driver kernel command line This patch add the dma_debug_driver= boot parameter to enable the driver filter for early boot. Signed-off-by: Joerg Roedel --- Documentation/kernel-parameters.txt | 7 +++++++ lib/dma-debug.c | 18 ++++++++++++++++++ 2 files changed, 25 insertions(+) (limited to 'Documentation/kernel-parameters.txt') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index e87bdbfbcc75..b3f1314588c9 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -646,6 +646,13 @@ and is between 256 and 4096 characters. It is defined in the file DMA-API debugging code disables itself because the architectural default is too low. + dma_debug_driver= + With this option the DMA-API debugging driver + filter feature can be enabled at boot time. Just + pass the driver to filter for as the parameter. + The filter can be disabled or changed to another + driver later using sysfs. + dscc4.setup= [NET] dtc3181e= [HW,SCSI] diff --git a/lib/dma-debug.c b/lib/dma-debug.c index c6330b1a7be7..d0618aa13b49 100644 --- a/lib/dma-debug.c +++ b/lib/dma-debug.c @@ -1109,3 +1109,21 @@ void debug_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg, } EXPORT_SYMBOL(debug_dma_sync_sg_for_device); +static int __init dma_debug_driver_setup(char *str) +{ + int i; + + for (i = 0; i < NAME_MAX_LEN - 1; ++i, ++str) { + current_driver_name[i] = *str; + if (*str == 0) + break; + } + + if (current_driver_name[0]) + printk(KERN_INFO "DMA-API: enable driver filter for " + "driver [%s]\n", current_driver_name); + + + return 1; +} +__setup("dma_debug_driver=", dma_debug_driver_setup); -- cgit v1.2.3-59-g8ed1b From 04f70336c80c43a15e617b36c2043dfa0ad6ed0f Mon Sep 17 00:00:00 2001 From: Catalin Marinas Date: Thu, 11 Jun 2009 13:22:39 +0100 Subject: kmemleak: Add documentation on the memory leak detector This patch adds the Documentation/kmemleak.txt file with some information about how kmemleak works. Signed-off-by: Catalin Marinas --- Documentation/kernel-parameters.txt | 4 + Documentation/kmemleak.txt | 142 ++++++++++++++++++++++++++++++++++++ 2 files changed, 146 insertions(+) create mode 100644 Documentation/kmemleak.txt (limited to 'Documentation/kernel-parameters.txt') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 4a3c2209a124..04a44cc5048a 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1077,6 +1077,10 @@ and is between 256 and 4096 characters. It is defined in the file Configure the RouterBoard 532 series on-chip Ethernet adapter MAC address. + kmemleak= [KNL] Boot-time kmemleak enable/disable + Valid arguments: on, off + Default: on + kstack=N [X86] Print N words from the kernel stack in oops dumps. diff --git a/Documentation/kmemleak.txt b/Documentation/kmemleak.txt new file mode 100644 index 000000000000..0112da3b9ab8 --- /dev/null +++ b/Documentation/kmemleak.txt @@ -0,0 +1,142 @@ +Kernel Memory Leak Detector +=========================== + +Introduction +------------ + +Kmemleak provides a way of detecting possible kernel memory leaks in a +way similar to a tracing garbage collector +(http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29#Tracing_garbage_collectors), +with the difference that the orphan objects are not freed but only +reported via /sys/kernel/debug/kmemleak. A similar method is used by the +Valgrind tool (memcheck --leak-check) to detect the memory leaks in +user-space applications. + +Usage +----- + +CONFIG_DEBUG_KMEMLEAK in "Kernel hacking" has to be enabled. A kernel +thread scans the memory every 10 minutes (by default) and prints any new +unreferenced objects found. To trigger an intermediate scan and display +all the possible memory leaks: + + # mount -t debugfs nodev /sys/kernel/debug/ + # cat /sys/kernel/debug/kmemleak + +Note that the orphan objects are listed in the order they were allocated +and one object at the beginning of the list may cause other subsequent +objects to be reported as orphan. + +Memory scanning parameters can be modified at run-time by writing to the +/sys/kernel/debug/kmemleak file. The following parameters are supported: + + off - disable kmemleak (irreversible) + stack=on - enable the task stacks scanning + stack=off - disable the tasks stacks scanning + scan=on - start the automatic memory scanning thread + scan=off - stop the automatic memory scanning thread + scan= - set the automatic memory scanning period in seconds (0 + to disable it) + +Kmemleak can also be disabled at boot-time by passing "kmemleak=off" on +the kernel command line. + +Basic Algorithm +--------------- + +The memory allocations via kmalloc, vmalloc, kmem_cache_alloc and +friends are traced and the pointers, together with additional +information like size and stack trace, are stored in a prio search tree. +The corresponding freeing function calls are tracked and the pointers +removed from the kmemleak data structures. + +An allocated block of memory is considered orphan if no pointer to its +start address or to any location inside the block can be found by +scanning the memory (including saved registers). This means that there +might be no way for the kernel to pass the address of the allocated +block to a freeing function and therefore the block is considered a +memory leak. + +The scanning algorithm steps: + + 1. mark all objects as white (remaining white objects will later be + considered orphan) + 2. scan the memory starting with the data section and stacks, checking + the values against the addresses stored in the prio search tree. If + a pointer to a white object is found, the object is added to the + gray list + 3. scan the gray objects for matching addresses (some white objects + can become gray and added at the end of the gray list) until the + gray set is finished + 4. the remaining white objects are considered orphan and reported via + /sys/kernel/debug/kmemleak + +Some allocated memory blocks have pointers stored in the kernel's +internal data structures and they cannot be detected as orphans. To +avoid this, kmemleak can also store the number of values pointing to an +address inside the block address range that need to be found so that the +block is not considered a leak. One example is __vmalloc(). + +Kmemleak API +------------ + +See the include/linux/kmemleak.h header for the functions prototype. + +kmemleak_init - initialize kmemleak +kmemleak_alloc - notify of a memory block allocation +kmemleak_free - notify of a memory block freeing +kmemleak_not_leak - mark an object as not a leak +kmemleak_ignore - do not scan or report an object as leak +kmemleak_scan_area - add scan areas inside a memory block +kmemleak_no_scan - do not scan a memory block +kmemleak_erase - erase an old value in a pointer variable +kmemleak_alloc_recursive - as kmemleak_alloc but checks the recursiveness +kmemleak_free_recursive - as kmemleak_free but checks the recursiveness + +Dealing with false positives/negatives +-------------------------------------- + +The false negatives are real memory leaks (orphan objects) but not +reported by kmemleak because values found during the memory scanning +point to such objects. To reduce the number of false negatives, kmemleak +provides the kmemleak_ignore, kmemleak_scan_area, kmemleak_no_scan and +kmemleak_erase functions (see above). The task stacks also increase the +amount of false negatives and their scanning is not enabled by default. + +The false positives are objects wrongly reported as being memory leaks +(orphan). For objects known not to be leaks, kmemleak provides the +kmemleak_not_leak function. The kmemleak_ignore could also be used if +the memory block is known not to contain other pointers and it will no +longer be scanned. + +Some of the reported leaks are only transient, especially on SMP +systems, because of pointers temporarily stored in CPU registers or +stacks. Kmemleak defines MSECS_MIN_AGE (defaulting to 1000) representing +the minimum age of an object to be reported as a memory leak. + +Limitations and Drawbacks +------------------------- + +The main drawback is the reduced performance of memory allocation and +freeing. To avoid other penalties, the memory scanning is only performed +when the /sys/kernel/debug/kmemleak file is read. Anyway, this tool is +intended for debugging purposes where the performance might not be the +most important requirement. + +To keep the algorithm simple, kmemleak scans for values pointing to any +address inside a block's address range. This may lead to an increased +number of false negatives. However, it is likely that a real memory leak +will eventually become visible. + +Another source of false negatives is the data stored in non-pointer +values. In a future version, kmemleak could only scan the pointer +members in the allocated structures. This feature would solve many of +the false negative cases described above. + +The tool can report false positives. These are cases where an allocated +block doesn't need to be freed (some cases in the init_call functions), +the pointer is calculated by other methods than the usual container_of +macro or the pointer is stored in a location not scanned by kmemleak. + +Page allocations and ioremap are not tracked. Only the ARM and x86 +architectures are currently supported. -- cgit v1.2.3-59-g8ed1b