aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/call-graph-from-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2017-03-02orangefs: Use RCU for destroy_inodePeter Zijlstra1-1/+8
freeing of inodes must be RCU-delayed on all filesystems Cc: stable@vger.kernel.org Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-02virtio-console: avoid DMA from stackOmar Sandoval1-2/+10
put_chars() stuffs the buffer it gets into an sg, but that buffer may be on the stack. This breaks with CONFIG_VMAP_STACK=y (for me, it manifested as printks getting turned into NUL bytes). Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com>
2017-03-02vhost: introduce O(1) vq metadata cacheJason Wang2-26/+118
When device IOTLB is enabled, all address translations were stored in interval tree. O(lgN) searching time could be slow for virtqueue metadata (avail, used and descriptors) since they were accessed much often than other addresses. So this patch introduces an O(1) array which points to the interval tree nodes that store the translations of vq metadata. Those array were update during vq IOTLB prefetching and were reset during each invalidation and tlb update. Each time we want to access vq metadata, this small array were queried before interval tree. This would be sufficient for static mappings but not dynamic mappings, we could do optimizations on top. Test were done with l2fwd in guest (2M hugepage): noiommu | before | after tx 1.32Mpps | 1.06Mpps(82%) | 1.30Mpps(98%) rx 2.33Mpps | 1.46Mpps(63%) | 2.29Mpps(98%) We can almost reach the same performance as noiommu mode. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02selinux: wrap cgroup seclabel support with its own policy capabilityStephen Smalley4-4/+12
commit 1ea0ce40690dff38935538e8dab7b12683ded0d3 ("selinux: allow changing labels for cgroupfs") broke the Android init program, which looks up security contexts whenever creating directories and attempts to assign them via setfscreatecon(). When creating subdirectories in cgroup mounts, this would previously be ignored since cgroup did not support userspace setting of security contexts. However, after the commit, SELinux would attempt to honor the requested context on cgroup directories and fail due to permission denial. Avoid breaking existing userspace/policy by wrapping this change with a conditional on a new cgroup_seclabel policy capability. This preserves existing behavior until/unless a new policy explicitly enables this capability. Reported-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-03-02KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload()David Howells15-22/+43
rcu_dereference_key() and user_key_payload() are currently being used in two different, incompatible ways: (1) As a wrapper to rcu_dereference() - when only the RCU read lock used to protect the key. (2) As a wrapper to rcu_dereference_protected() - when the key semaphor is used to protect the key and the may be being modified. Fix this by splitting both of the key wrappers to produce: (1) RCU accessors for keys when caller has the key semaphore locked: dereference_key_locked() user_key_payload_locked() (2) RCU accessors for keys when caller holds the RCU read lock: dereference_key_rcu() user_key_payload_rcu() This should fix following warning in the NFS idmapper =============================== [ INFO: suspicious RCU usage. ] 4.10.0 #1 Tainted: G W ------------------------------- ./include/keys/user-type.h:53 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 0 1 lock held by mount.nfs/5987: #0: (rcu_read_lock){......}, at: [<d000000002527abc>] nfs_idmap_get_key+0x15c/0x420 [nfsv4] stack backtrace: CPU: 1 PID: 5987 Comm: mount.nfs Tainted: G W 4.10.0 #1 Call Trace: dump_stack+0xe8/0x154 (unreliable) lockdep_rcu_suspicious+0x140/0x190 nfs_idmap_get_key+0x380/0x420 [nfsv4] nfs_map_name_to_uid+0x2a0/0x3b0 [nfsv4] decode_getfattr_attrs+0xfac/0x16b0 [nfsv4] decode_getfattr_generic.constprop.106+0xbc/0x150 [nfsv4] nfs4_xdr_dec_lookup_root+0xac/0xb0 [nfsv4] rpcauth_unwrap_resp+0xe8/0x140 [sunrpc] call_decode+0x29c/0x910 [sunrpc] __rpc_execute+0x140/0x8f0 [sunrpc] rpc_run_task+0x170/0x200 [sunrpc] nfs4_call_sync_sequence+0x68/0xa0 [nfsv4] _nfs4_lookup_root.isra.44+0xd0/0xf0 [nfsv4] nfs4_lookup_root+0xe0/0x350 [nfsv4] nfs4_lookup_root_sec+0x70/0xa0 [nfsv4] nfs4_find_root_sec+0xc4/0x100 [nfsv4] nfs4_proc_get_rootfh+0x5c/0xf0 [nfsv4] nfs4_get_rootfh+0x6c/0x190 [nfsv4] nfs4_server_common_setup+0xc4/0x260 [nfsv4] nfs4_create_server+0x278/0x3c0 [nfsv4] nfs4_remote_mount+0x50/0xb0 [nfsv4] mount_fs+0x74/0x210 vfs_kern_mount+0x78/0x220 nfs_do_root_mount+0xb0/0x140 [nfsv4] nfs4_try_mount+0x60/0x100 [nfsv4] nfs_fs_mount+0x5ec/0xda0 [nfs] mount_fs+0x74/0x210 vfs_kern_mount+0x78/0x220 do_mount+0x254/0xf70 SyS_mount+0x94/0x100 system_call+0x38/0xe0 Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-03-01PM / OPP: Documentation: Fix opp-microvolt in examplesViresh Kumar1-22/+22
The triplet present in "opp-microvolt" property should be in the order <target min max>, while all the examples have it in the order <min target max>. Fix it. Luckily all of the users of "opp-microvolt" property have applied brain instead of copying the examples from documentation and none of the actual dts files have it wrong. Reported-by: Rob Herring <robh@kernel.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-01objtool, modules: Discard objtool annotation sections for modulesJosh Poimboeuf6-8/+10
The '__unreachable' and '__func_stack_frame_non_standard' sections are only used at compile time. They're discarded for vmlinux but they should also be discarded for modules. Since this is a recurring pattern, prefix the section names with ".discard.". It's a nice convention and vmlinux.lds.h already discards such sections. Also remove the 'a' (allocatable) flag from the __unreachable section since it doesn't make sense for a discarded section. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jessica Yu <jeyu@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: d1091c7fa3d5 ("objtool: Improve detection of BUG() and other dead ends") Link: http://lkml.kernel.org/r/20170301180444.lhd53c5tibc4ns77@treble Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-01Input: rmi4 - f30: detect INPUT_PROP_BUTTONPAD from the button countBenjamin Tissoires1-2/+3
INPUT_PROP_BUTTONPAD is currently only set through the platform data. The RMI4 header doc says that this property is there to force the buttonpad property, so we also need to detect it by looking at the exported buttons count. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-01watchdog: retu: restore MFD dependencyArnd Bergmann1-1/+1
The retu watchdog calls into the respective mfd driver, but fails to link if that is diabled: drivers/watchdog/built-in.o: In function `retu_wdt_set_timeout': ziirave_wdt.c:(.text+0x8c88): undefined reference to `retu_write' ziirave_wdt.c:(.text+0x8c88): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `retu_write' drivers/watchdog/built-in.o: In function `retu_wdt_start': ziirave_wdt.c:(.text+0x8cc8): undefined reference to `retu_write' ziirave_wdt.c:(.text+0x8cc8): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `retu_write' This restores the dependency as it was before Fixes: da2a68b3eb47 ("watchdog: Enable COMPILE_TEST where possible") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01watchdog: db8500: add back prmcu dependencyArnd Bergmann1-1/+1
When the db8500 watchdog is enabled without the PRCMU, we get a lot of warnings about duplicate or missing helper functions: In file included from drivers/watchdog/ux500_wdt.c:21:0: include/linux/mfd/dbx500-prcmu.h:422:19: error: redefinition of 'prcmu_abb_read' static inline int prcmu_abb_read(u8 slave, u8 reg, u8 *value, u8 size) This restores the dependency as it was. Fixes: da2a68b3eb47 ("watchdog: Enable COMPILE_TEST where possible") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01watchdog: kempld: fix gcc-4.3 buildArnd Bergmann1-1/+8
gcc-4.3 can't decide whether the constant value in kempld_prescaler[PRESCALER_21] is built-time constant or not, and gets confused by the logic in do_div(): drivers/watchdog/kempld_wdt.o: In function `kempld_wdt_set_stage_timeout': kempld_wdt.c:(.text.kempld_wdt_set_stage_timeout+0x130): undefined reference to `__aeabi_uldivmod' This adds a call to ACCESS_ONCE() to force it to not consider it to be constant, and leaves the more efficient normal case in place for modern compilers, using an #ifdef to annotate why we do this hack. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01watchdog: softdog: fire watchdog even if softirqs do not get to runNiklas Cassel1-17/+27
Checking for timer expiration is done from the softirq TIMER_SOFTIRQ. Since commit 4cd13c21b207 ("softirq: Let ksoftirqd do its job"), pending softirqs are no longer always handled immediately, instead, if there are pending softirqs, and ksoftirqd is in state TASK_RUNNING, the handling of the softirqs are deferred, and are instead supposed to be handled by ksoftirqd, when ksoftirqd gets scheduled. If a user space process with a real-time policy starts to misbehave by never relinquishing the CPU while ksoftirqd is in state TASK_RUNNING, what will happen is that all softirqs will get deferred, while ksoftirqd, which is supposed to handle the deferred softirqs, will never get to run. To make sure that the watchdog is able to fire even when we do not get to run softirqs, replace the timers with hrtimers. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01watchdog: kempld: revert to full dependencyArnd Bergmann1-1/+1
The kempld watchdog driver requires the respective MFD driver: drivers/watchdog/built-in.o: In function `kempld_wdt_probe': kempld_wdt.c:(.text+0x5c78): undefined reference to `kempld_get_mutex' kempld_wdt.c:(.text+0x5c84): undefined reference to `kempld_read8' kempld_wdt.c:(.text+0x5c8e): undefined reference to `kempld_release_mutex' kempld_wdt.c:(.text+0x5d1c): undefined reference to `kempld_read8' kempld_wdt.c:(.text+0x5d2c): undefined reference to `kempld_write8' This adds the Kconfig dependency back. Fixes: da2a68b3eb47 ("watchdog: Enable COMPILE_TEST where possible") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01watchdog: bcm2835: add CONFIG_OF dependencyArnd Bergmann1-1/+1
Without CONFIG_OF, the driver fails to link: drivers/watchdog/built-in.o: In function `bcm2835_power_off': bcm2835_wdt.c:(.text+0x1946): undefined reference to `of_find_device_by_node' This adds a new dependency, to allow the COMPILE_TEST check to succeed. Fixes: da2a68b3eb47 ("watchdog: Enable COMPILE_TEST where possible") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01watchdog: sp805: add back AMBA dependencyArnd Bergmann1-1/+1
The driver fails to link if ARM_AMBA is disabled: drivers/watchdog/sp805_wdt.o: In function `sp805_wdt_driver_init': sp805_wdt.c:(.init.text+0x4): undefined reference to `amba_driver_register' It seems that the COMPILE_TEST was added in the wrong place, as there is no architecture dependency, but a bus dependency. This moves the dependency accordingly. Fixes: da2a68b3eb47 ("watchdog: Enable COMPILE_TEST where possible") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01watchdog: menf21bmc: add I2C dependencyArnd Bergmann1-0/+1
This driver fails to link when CONFIG_I2C is disabled or a loadable module while the watchdog is built-in: drivers/watchdog/built-in.o: In function `menf21bmc_wdt_shutdown': menf21bmc_wdt.c:(.text+0x9b44): undefined reference to `i2c_smbus_write_word_data' menf21bmc_wdt.c:(.text+0x9b44): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `i2c_smbus_write_word_data' This adds a Kconfig dependency for it, to enforce a valid configuration. Fixes: da2a68b3eb47 ("watchdog: Enable COMPILE_TEST where possible") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01watchdog: geode: restore hard CS5535_MFGPT dependencyArnd Bergmann1-1/+1
Wtihout CONFIG_CS5535_MFGPT, the driver does not link right: drivers/watchdog/built-in.o: In function `geodewdt_probe': geodewdt.c:(.init.text+0xca3): undefined reference to `cs5535_mfgpt_alloc_timer' geodewdt.c:(.init.text+0xcd4): undefined reference to `cs5535_mfgpt_write' geodewdt.c:(.init.text+0xcef): undefined reference to `cs5535_mfgpt_toggle_event' This adds back the dependency on this base driver. Fixes: da2a68b3eb47 ("watchdog: Enable COMPILE_TEST where possible") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01watchdog: wm831x watchdog really needs mfdArnd Bergmann1-1/+1
The wm831x watchdog driver can now be built without the wm831x mfd driver, which results in a link error: (.text+0x1a95c): undefined reference to `wm831x_set_bits' (.text+0x1a95c): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `wm831x_set_bits' (.text+0x1a968): undefined reference to `wm831x_reg_lock' (.text+0x1a968): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `wm831x_reg_lock' (.text+0x1a9dc): undefined reference to `wm831x_reg_unlock' (.text+0x1a9dc): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `wm831x_reg_unlock' This adds back the dependency that was removed. We can still build test this driver on all architectures by enabling the MFD driver for it first. Fixes: da2a68b3eb47 ("watchdog: Enable COMPILE_TEST where possible") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01objtool, compiler.h: Fix __unreachable section relocation sizeJosh Poimboeuf1-1/+1
Linus reported the following commit broke module loading on his laptop: d1091c7fa3d5 ("objtool: Improve detection of BUG() and other dead ends") It showed errors like the following: module: overflow in relocation type 10 val ffffffffc02afc81 module: 'nvme' likely not compiled with -mcmodel=kernel The problem is that the __unreachable section addresses are stored using the '.long' asm directive, which isn't big enough for .text section kernel addresses. Use relative addresses instead: ".long %c0b - .\t\n" Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: d1091c7fa3d5 ("objtool: Improve detection of BUG() and other dead ends") Link: http://lkml.kernel.org/r/20170301060504.oltm3iws6fmubnom@treble Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-01tools/power turbostat: version 17.02.24Len Brown1-1/+1
The turbostat before this last set of changes is obsolete. This new version can do a lot more, but it also has some different defaults, that might catch some off-guard. So it seems a good time to give a new version number. Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: bugfix: --add u32 was printed as u64Len Brown1-19/+7
When the "u32" keyword is used with --add, it means that the output should be truncated to 32-bits. This was not happening and all 64-bits were printed. Also, when no column name was used for an added MSR, The default column name was in deximal, eg. MSR16. Users report that they tend to use hex MSR numbers, so print them in hex. To always fit into the columns, use the syntax M0x10. Note that the user can always supply any column header that they want. eg --add msr0x10,MY_TSC Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: show error on execLen Brown1-0/+1
When turbostat is run in one-shot command mode, the parent takes the 'before' counter snapshot, fork/exec/wait for the child to exit, takes the 'after' counter snapshot, and prints the results. however, if the child fails to exec the command, it immediately returns, without indicating that anythign was wrong. Add an error message showing that exec failed: sudo turbostat sleeeep 4 ... turbostat: exec sleeeep: No such file or directory ... Note that the parent will still print out the statistics, because it can't tell the difference between the failed exec and a command that is purposefully returning the same status. Unfortunately, this may obscure the error message. However, if the --out parameter is used, the error message is evident on stderr. Reported-by: Wendy Wang <wendy.wang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: dump p-state software configLen Brown1-0/+50
cpu1: cpufreq driver: acpi-cpufreq cpu1: cpufreq governor: ondemand cpufreq boost: 1 or cpu0: cpufreq driver: intel_pstate cpu0: cpufreq governor: powersave cpufreq intel_pstate no_turbo: 0 Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: show package number, even without --debugLen Brown1-1/+1
On multi-package systems, the "Package" column was being displayed only if --debug was used. Show it always. Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: support "--hide C1" etc.Len Brown2-44/+73
Originally, the only way to hide the sysfs C-state statistics columns was with "--hide sysfs". This was because we process "--hide" before we probe for those columns. hack --hide to remember deferred hide requests, and apply them when sysfs is probed. "--hide sysfs" is still available as short-hand to refer to the entire group of counters. The down-side of this change is that we no longer error check for bogus --hide column names. But the user will quickly figure that out if a column they mean to hide is still there... Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: move --Package and --processor into the --cpu optionLen Brown2-16/+21
--Package is now "--cpu package", which will display just the 1st CPU in each package --processor is not "--cpu core" which will display just the 1st CPU in each core Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: turbostat.8 updateLen Brown1-98/+140
update examples to show recently updated features. In particular --add --show --hide --cpu --list Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: update --list featureLen Brown1-106/+113
Make it possible to take the entire un-edited output from `turbostat --list` and feed it to "turbostat --show" or "turbostat --hide". To do this, the leading comma was removed (no mater what columns are active) and also they dynamic C-state "C1, C2, C3" etc are replaced by the string "sysfs", which refers to them as a group. Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: use wide columns to display large numbersLen Brown1-13/+55
When a counter overlfows 7 columns, it shifts the remaining columns to the right, so they no longer line up under their column header. Update turbostat to dectect when it is handling large numbers, and switch to wider columns where, necessary. Reported-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: Add --list option to show available header namesLen Brown1-52/+66
It is handy to know the list of column header names, so that they can be used with --add and --skip The new --list option shows them: sudo ./turbostat --list --hide sysfs ,Core,CPU,Avg_MHz,Busy%,Bzy_MHz,TSC_MHz,IRQ,SMI,CPU%c1,CPU%c3,CPU%c6,CPU%c7,CoreTmp,PkgTmp,GFX%rc6,GFXMHz,PkgWatt,CorWatt,GFXWatt Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: fix zero IRQ count shown in one-shot command modeLen Brown1-4/+8
The IRQ column has been working for periodic mode, but not in one-shot command mode, it shows only 0. until now. Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: add --cpu parameterLen Brown2-2/+95
With the --cpu parameter, turbostat prints only lines for the specified set of CPUs: sudo ./turbostat --quiet --show Core,CPU --cpu 0,1,3..5,6-7 Core CPU - - 0 0 0 4 1 1 1 5 2 6 3 3 3 7 Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: print sysfs C-state statsLen Brown2-18/+147
When turbostat shows % of time in a CPU idle power state, it has always been showing information from underlying hardware residency counters. While this reflects what the hardware is doing, and is thus useful for understanding the hardware, it doesn't directly tell us what Linux requested -- which is useful for tuning Linux itself. Here we add columns to turbostat to show the Linux cpuidle sub-system statistics: /sys/devices/system/cpu/cpu*/cpuidle/state*/* The first group of columns are the "usage", which is the number of times software requested that C-state in the measurement interval. eg C1 below. The second group of columns are the "time", which is the percentage of the measurement interval time that software has requested the specified C-state. eg C1% below. These software counters can be compared to the underlying hardware residency counters (eg CPU%c1 CPU%c3 CPU%c6 CPU%c7) to compare what sofware requested to what the hardware delivered. These sysfs attributes are discovered when turbostat starts, rather than being "built in". So the --show and --hide parameters do not know about these dynamic column names. However "--show sysfs" and "--hide sysfs" act on the entire group of columns: turbostat --show sysfs ... cpu4: POLL: CPUIDLE CORE POLL IDLE cpu4: C1: MWAIT 0x00 cpu4: C1E: MWAIT 0x01 cpu4: C3: MWAIT 0x10 cpu4: C6: MWAIT 0x20 cpu4: C7s: MWAIT 0x32 ... C1 C1E C3 C6 C7s C1% C1E% C3% C6% C7s% 3 6 5 1 188 0.00 0.02 0.00 0.00 99.93 0 6 5 0 58 0.00 0.16 0.02 0.00 99.70 0 0 0 0 9 0.00 0.00 0.00 0.00 99.96 0 0 0 1 24 0.00 0.00 0.00 0.02 99.93 0 0 0 0 9 0.00 0.00 0.00 0.00 99.97 0 0 0 0 32 0.00 0.00 0.00 0.00 99.96 0 0 0 0 7 0.00 0.00 0.00 0.00 99.98 2 0 0 0 36 0.00 0.00 0.00 0.00 99.97 1 0 0 0 13 0.00 0.00 0.00 0.00 99.98 Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: extend --add option to accept /sys pathLen Brown1-23/+69
Previously, the --add option could specify only an MSR. Here is is extended so an arbitrary /sys attribute, as specified by an absolute file path name. sudo ./turbostat --add /sys/devices/system/cpu/cpu0/cpuidle/state5/usage Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: skip unused counters on BDXLen Brown1-0/+17
Skip these two counters on BDX, as they are always zero: cc7, pc7 Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: fix decoding for GLM, DNV, SKX turbo-ratio limitsLen Brown1-23/+67
Newer processors do not hard-code the the number of cpus in each bin to {1, 2, 3, 4, 5, 6, 7, 8} Rather, they can specify any number of CPUS in each of the 8 bins: eg. ... 37 * 100.0 = 3600.0 MHz max turbo 4 active cores 38 * 100.0 = 3700.0 MHz max turbo 3 active cores 39 * 100.0 = 3800.0 MHz max turbo 2 active cores 39 * 100.0 = 3900.0 MHz max turbo 1 active cores could now look something like this: ... 37 * 100.0 = 3600.0 MHz max turbo 16 active cores 38 * 100.0 = 3700.0 MHz max turbo 8 active cores 39 * 100.0 = 3800.0 MHz max turbo 4 active cores 39 * 100.0 = 3900.0 MHz max turbo 2 active cores Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: skip unused counters on SKXLen Brown1-0/+18
Skip these four counters on SKX, as they are always zero: cc3, pc3 cc7, pc7 Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: Denverton: use HW CC1 counter, skip C3, C7Len Brown1-0/+20
The CC1 column in tubostat can be computed by subtracting the core c-state residency countes from the total Cx residency. CC1 = (Idle_time_as_measured by MPERF) - (all core C-states with residency counters) However, as the underlying counter reads are not atomic, error can be noticed in this calculations, especially when the numbers are small. Denverton has a hardware CC1 residency counter to improve the accuracy of the cc1 statistic -- use it. At the same time, Denverton has no concept of CC3, PC3, CC7, PC7, so skip collecting and printing those columns. Finally, a note of clarification. Turbostat prints the standard PC2 residency counter, but on Denverton hardware, that actually means PC1E. Turbostat prints the standard PC6 residency counter, but on Denverton hardware, that actually means PC2. At this point, we document that differnce in this commit message, rather than adding a quirk to the software. Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: initial Gemini Lake SOC supportLen Brown1-0/+5
Gemini Lake is similar to Apollo Lake (Broxton/Goldmont) Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01x86: intel-family.h: Add GEMINI_LAKE SOCLen Brown1-0/+1
Cc: x86@kernel.org Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: bug fixes to --add, --show/--hide featuresLen Brown1-61/+77
Fix a bug with --add, where the title of the column is un-initialized if not specified by the user. The initial implementation of --show and --hide neglected to handle the pc8/pc9/pc10 counters. Fix a bug where "--show Core" only worked with --debug Reported-by: Wendy Wang <wendy.wang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: use tsc_tweak everwhere it is neededLen Brown1-23/+25
The CPU ticks at a rate in the "bus clock" domain. eg. 100 MHz * bus_ratio. On newer processors, the TSC has been moved out of this BCLK domain and into a separate crystal-clock domain. While the TSC ticks "close to" the base frequency, those that look closely at the numbers will notice small errors in calculations that mix units of TSC clocks and bus clocks. "tsc_tweak" was introduced to address the most visible mixing -- the %Busy and the the Busy_MHz calculations. (A simplification as since removed TSC from the BusyMHz calculation) Here we apply the tsc_tweak to everyplace where BCLK and TSC units are mixed. The results is that on a system which is 100% idle, the sum of the C-states are now much more likely to be closer to 100%. Reported-by: Travis Downs <travis.downs@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: print system config, unless --quietLen Brown2-60/+52
Some users want turbostat to tell them everything, by default. Some users want turbostat to be quiet, by default. I find that I'm in the 1st camp, and so I've never liked needing to type the --debug parameter to decode the system configuration. So here we change the default and print the system configuration, by default. (The --debug option is now un-documented, though it does still exist for debugging turbostat internals) When you do not want to see the system configuration header, use the new "--quiet" option. Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: show all columns, independent of --debugLen Brown1-7/+0
Some time ago, turbostat overflowed 80 columns. So on the assumption that a "casual" user would always want topology and frequency columns, we hid the rest of the columns and the system configuration decoding behind the --debug option. Not everybody liked that change -- including me. I use --debug 99% of the time... Well, now we have "-o file" to put turbostat output into a file, so unless you are watching real-time in a small window, column count is less frequently a factor. And more recently, we got the "--hide columnA,columnB" option to specify columns to skip. So now we "un-hide" the rest of the columns from behind --debug, and show them all, by default. Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: decode MSR_MISC_FEATURE_CONTROLLen Brown1-0/+24
useful for observing if the BIOS disabled prefetch Not architectural, but docuemented as present on NHM, SNB and is present on others. Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01x86 msr_index.h: Define MSR_MISC_FEATURE_CONTROLLen Brown1-0/+1
This non-architectural MSR has disable bits for various prefetchers on modern processors. While these bits are generally touched only by the BIOS, say, via BIOS SETUP, it is useful to dump them when examining options that can alter performance. Cc: x86@kernel.org Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: decode CPUID(6).TURBOLen Brown1-1/+4
show the CPUID feature for turbo to clarify the case when it may not be shown in MISC_ENABLE CPUID(6): APERF, TURBO, DTS, PTM, No-HWP, No-HWPnotify, No-HWPwindow, No-HWPepp, No-HWPpkg, EPB cpu4: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST MWAIT TURBO) Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01tools/power turbostat: dump Atom P-states correctlyLen Brown1-21/+82
Turbostat dumps MSR_TURBO_RATIO_LIMIT on Core Architecture. But Atom Architecture uses MSR_ATOM_CORE_RATIOS and MSR_ATOM_CORE_TURBO_RATIOS. Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01intel_pstate: use MSR_ATOM_RATIOS definitions from msr-index.hLen Brown1-10/+5
Originally, these MSRs were locally defined in this driver. Now the definitions are in msr-index.h -- use them. Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01x86 msr-index.h: Define Atom specific core ratio MSR locationsLen Brown1-0/+6
These MSRs are currently used by the intel_pstate driver, using a local definition. Cc: x86@kernel.org Signed-off-by: Len Brown <len.brown@intel.com>