path: root/drivers/cpufreq
2017-05-08  format-security: move static strings to const  (Kees Cook; 1 file, -1/+2)
While examining output from trial builds with -Wformat-security enabled, many strings were found that should be defined as "const", or as a char array instead of char pointer. This makes some static analysis easier, by producing fewer false positives. As these are all trivial changes, it seemed best to put them all in a single patch rather than chopping them up per maintainer. Link: http://lkml.kernel.org/r/20170405214711.GA5711@beast Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Jes Sorensen <jes@trained-monkey.org> [runner.c] Cc: Tony Lindgren <tony@atomide.com> Cc: Russell King <linux@armlinux.org.uk> Cc: "Maciej W. Rozycki" <macro@linux-mips.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Sean Paul <seanpaul@chromium.org> Cc: David Airlie <airlied@linux.ie> Cc: Yisen Zhuang <yisen.zhuang@huawei.com> Cc: Salil Mehta <salil.mehta@huawei.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Jiri Slaby <jslaby@suse.com> Cc: Patrice Chotard <patrice.chotard@st.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Matt Redfearn <matt.redfearn@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Mugunthan V N <mugunthanvnm@ti.com> Cc: Felipe Balbi <felipe.balbi@linux.intel.com> Cc: Jarod Wilson <jarod@redhat.com> Cc: Florian Westphal <fw@strlen.de> Cc: Antonio Quartulli <a@unstable.cc> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Kejian Yan <yankejian@huawei.com> Cc: Daode Huang <huangdaode@hisilicon.com> Cc: Qianqian Xie <xieqianqian@huawei.com> Cc: Philippe Reynes <tremyfr@gmail.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Christian Gromm <christian.gromm@microchip.com> Cc: Andrey Shvetsov <andrey.shvetsov@k2l.de> Cc: Jason Litzinger <jlitzingerdev@gmail.com> Cc: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
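A minimal illustration of the class of change described above (not the actual hunks from this patch, which touch many drivers; "mydrv" and "banner" are made-up names):

    #include <linux/init.h>
    #include <linux/printk.h>

    /* Before (flagged by -Wformat-security when used as a bare format):
     *   static char *banner = "mydrv: loaded";
     *   pr_info(banner);
     */

    /* After: read-only data, printed through an explicit "%s" format. */
    static const char banner[] = "mydrv: loaded";

    static int __init mydrv_init(void)
    {
            pr_info("%s\n", banner);
            return 0;
    }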
2017-05-08  scripts/spelling.txt: add regsiter -> register spelling mistake  (Stephen Boyd; 1 file, -2/+2)
This typo is quite common. Fix it and add it to the spelling file so that checkpatch catches it earlier. Link: http://lkml.kernel.org/r/20170317011131.6881-2-sboyd@codeaurora.org Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-01  Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds; 4 files, -125/+103)
Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle were:
   - another round of rq-clock handling debugging, robustization and fixes
   - PELT accounting improvements
   - CPU hotplug related ->cpus_allowed affinity handling fixes all around the tree
   - ... plus misc fixes, cleanups and updates"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits)
  sched/x86: Update reschedule warning text
  crypto: N2 - Replace racy task affinity logic
  cpufreq/sparc-us2e: Replace racy task affinity logic
  cpufreq/sparc-us3: Replace racy task affinity logic
  cpufreq/sh: Replace racy task affinity logic
  cpufreq/ia64: Replace racy task affinity logic
  ACPI/processor: Replace racy task affinity logic
  ACPI/processor: Fix error handling in __acpi_processor_start()
  sparc/sysfs: Replace racy task affinity logic
  powerpc/smp: Replace open coded task affinity logic
  ia64/sn/hwperf: Replace racy task affinity logic
  ia64/salinfo: Replace racy task affinity logic
  workqueue: Provide work_on_cpu_safe()
  ia64/topology: Remove cpus_allowed manipulation
  sched/fair: Move the PELT constants into a generated header
  sched/fair: Increase PELT accuracy for small tasks
  sched/fair: Fix comments
  sched/Documentation: Add 'sched-pelt' tool
  sched/fair: Fix corner case in __accumulate_sum()
  sched/core: Remove 'task' parameter and rename tsk_restore_flags() to current_restore_flags()
  ...
2017-04-28  Merge schedutil governor updates for v4.12.  (Rafael J. Wysocki; 1 file, -0/+2)
2017-04-28  Merge intel_pstate driver updates for v4.12.  (Rafael J. Wysocki; 1 file, -500/+408)
2017-04-20  Annotate hardware config module parameters in drivers/cpufreq/  (David Howells; 1 file, -1/+1)
When the kernel is running in secure boot mode, we lock down the kernel to prevent userspace from modifying the running kernel image. Whilst this includes prohibiting access to things like /dev/mem, it must also prevent access by means of configuring driver modules in such a way as to cause a device to access or modify the kernel image. To this end, annotate module_param* statements that refer to hardware configuration and indicate for future reference what type of parameter they specify. The parameter parser in the core sees this information and can skip such parameters with an error message if the kernel is locked down. The module initialisation then runs as normal, but just sees whatever the default values for those parameters are. Note that we do still need to do the module initialisation because some drivers have viable defaults set in case parameters aren't specified and some drivers support automatic configuration (e.g. PNP or PCI) in addition to manually coded parameters. This patch annotates drivers in drivers/cpufreq/. Suggested-by: Alan Cox <gnomes@lxorguk.ukuu.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> cc: linux-pm@vger.kernel.org
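A minimal sketch of the annotation this series introduces; the parameter names and default values below are made up, but module_param_hw() and its hwtype keywords (ioport, iomem, irq, dma, other) are the interface being described:

    #include <linux/moduleparam.h>

    static int io = 0x378;          /* hypothetical I/O port default */
    static int irq = 7;             /* hypothetical IRQ default */
    static bool debug;

    /* Not hardware configuration: always accepted. */
    module_param(debug, bool, 0644);

    /* Hardware configuration: a locked-down kernel may reject these and
     * let the driver fall back to its defaults or to auto-probing. */
    module_param_hw(io, int, ioport, 0444);
    module_param_hw(irq, int, irq, 0444);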
2017-04-19  cpufreq: Add Tegra186 cpufreq driver  (Mikko Perttunen; 3 files, -0/+282)
Add a new cpufreq driver for Tegra186 (and likely later). The CPUs are organized into two clusters, Denver and A57, with two and four cores respectively. CPU frequency can be adjusted by writing the desired rate divisor and a voltage hint to a special per-core register. The frequency of each core can be set individually; however, this is just a hint as all CPUs in a cluster will run at the maximum rate of non-idle CPUs in the cluster. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-04-19  cpufreq: imx6q: Fix error handling code  (Christophe Jaillet; 1 file, -1/+1)
According to the previous error handling code, it is likely that 'goto out_free_opp' is expected here in order to avoid a memory leak in the error handling path. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
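A sketch of the unwind-on-error idiom the fix restores (not the exact imx6q-cpufreq code; it assumes a cpufreq_driver named imx6q_cpufreq_driver is defined elsewhere):

    #include <linux/cpufreq.h>
    #include <linux/platform_device.h>
    #include <linux/pm_opp.h>

    static int example_probe(struct platform_device *pdev)
    {
            int ret;

            ret = dev_pm_opp_of_add_table(&pdev->dev);
            if (ret)
                    return ret;                     /* nothing to unwind yet */

            ret = cpufreq_register_driver(&imx6q_cpufreq_driver);
            if (ret)
                    goto out_free_opp;              /* unwind, do not just return */

            return 0;

    out_free_opp:
            dev_pm_opp_of_remove_table(&pdev->dev);
            return ret;
    }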
2017-04-19  cpufreq: imx6q: Set max suspend_freq to avoid changes during suspend  (Leonard Crestez; 1 file, -1/+7)
If the cpufreq driver tries to modify voltage/freq during suspend/resume, it might need to control an external PMIC via I2C or SPI, but those devices might already be suspended. This issue is likely to happen whenever the LDOs have their vin-supply set. To avoid this scenario, we just increase the CPU frequency to the maximum before suspend. Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
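A sketch of the mechanism, assuming the generic cpufreq suspend helper is used; freq_table and transition_latency stand in for the driver's own data, and the exact imx6q field setup may differ:

    static int example_cpufreq_init(struct cpufreq_policy *policy)
    {
            int ret = cpufreq_generic_init(policy, freq_table, transition_latency);

            /* Pin the CPU at its highest OPP across suspend so that no
             * regulator (I2C/SPI) access is needed while the PMIC may
             * already be suspended. */
            policy->suspend_freq = policy->max;
            return ret;
    }

    static struct cpufreq_driver example_cpufreq_driver = {
            .init    = example_cpufreq_init,
            .suspend = cpufreq_generic_suspend,     /* switches to suspend_freq */
            /* remaining callbacks omitted */
    };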
2017-04-19  cpufreq: imx6q: Fix handling EPROBE_DEFER from regulator  (Irina Tirdea; 1 file, -0/+7)
If there are any errors in getting the cpu0 regulators, the driver returns -ENOENT. In case the regulators are not yet available, the devm_regulator_get calls will return -EPROBE_DEFER, so that the driver can be probed later. If we return -ENOENT, the driver will fail its initialization and will not try to probe again (when the regulators become available). Return the actual error received from regulator_get in probe. Print a differentiated message in case we need to probe the device later and in case we actually failed. Also add a message to inform when the driver has been successfully registered. Signed-off-by: Irina Tirdea <irina.tirdea@nxp.com> Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
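A sketch of the deferral pattern this change adopts, as it would appear inside probe(); the regulator name, cpu_dev and the surrounding variables are illustrative:

    arm_reg = devm_regulator_get(cpu_dev, "arm");
    if (IS_ERR(arm_reg)) {
            ret = PTR_ERR(arm_reg);
            if (ret == -EPROBE_DEFER)
                    dev_dbg(cpu_dev, "regulators not ready, deferring probe\n");
            else
                    dev_err(cpu_dev, "failed to get arm regulator: %d\n", ret);
            return ret;     /* propagate the real error, not -ENOENT */
    }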
2017-04-17  cpufreq: schedutil: Use policy-dependent transition delays  (Rafael J. Wysocki; 1 file, -0/+2)
Make the schedutil governor take the initial (default) value of the rate_limit_us sysfs attribute from the (new) transition_delay_us policy parameter (to be set by the scaling driver). That will allow scaling drivers to make schedutil use smaller default values of rate_limit_us and reduce the default average time interval between consecutive frequency changes. Make intel_pstate set transition_delay_us to 500. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
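From a scaling driver's point of view the new knob is a single policy field set at init time; a sketch (the 500 us value is the one mentioned above for intel_pstate, the function name is made up):

    static int sample_cpufreq_init(struct cpufreq_policy *policy)
    {
            /* ... driver-specific setup ... */

            /* schedutil takes this as its initial rate_limit_us instead
             * of its generic default. */
            policy->transition_delay_us = 500;
            return 0;
    }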
2017-04-17  Merge branch 'intel_pstate' into pm-cpufreq-sched  (Rafael J. Wysocki; 1 file, -590/+431)
2017-04-15  cpufreq/sparc-us2e: Replace racy task affinity logic  (Thomas Gleixner; 1 file, -24/+21)
The access to the HBIRD_ESTAR_MODE register in the CPU frequency control functions must happen on the target CPU. This is achieved by temporarily setting the affinity of the calling user space thread to the requested CPU and resetting it to the original affinity afterwards. That's racy vs. CPU hotplug and concurrent affinity settings for that thread, resulting in code executing on the wrong CPU and overwriting the new affinity setting. Replace it with a straightforward SMP function call. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: linux-pm@vger.kernel.org Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Len Brown <lenb@kernel.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1704131020280.2408@nanos Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
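The "straightforward SMP function call" pattern, sketched with a made-up register-access helper (the real patch wires this into the us2e E-Star code):

    #include <linux/smp.h>

    struct freq_request {
            unsigned int index;
            int ret;
    };

    static void __set_freq_on_cpu(void *arg)
    {
            struct freq_request *req = arg;

            /* Runs on the target CPU, so the per-CPU register access
             * below is guaranteed to hit the right core. */
            req->ret = write_estar_mode(req->index);        /* hypothetical helper */
    }

    static int set_freq(unsigned int cpu, unsigned int index)
    {
            struct freq_request req = { .index = index };

            smp_call_function_single(cpu, __set_freq_on_cpu, &req, 1);
            return req.ret;
    }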
2017-04-15  cpufreq/sparc-us3: Replace racy task affinity logic  (Thomas Gleixner; 1 file, -30/+16)
The access to the safari config register in the CPU frequency functions must be executed on the target CPU. This is achieved by temporarily setting the affinity of the calling user space thread to the requested CPU and resetting it to the original affinity afterwards. That's racy vs. CPU hotplug and concurrent affinity settings for that thread, resulting in code executing on the wrong CPU and overwriting the new affinity setting. Replace it with a straightforward SMP function call. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: linux-pm@vger.kernel.org Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Len Brown <lenb@kernel.org> Link: http://lkml.kernel.org/r/20170412201043.047558840@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-15  cpufreq/sh: Replace racy task affinity logic  (Thomas Gleixner; 1 file, -18/+27)
The target() callback must run on the affected CPU. This is achieved by temporarily setting the affinity of the calling thread to the requested CPU and resetting it to the original affinity afterwards. That's racy vs. concurrent affinity settings for that thread, resulting in code executing on the wrong CPU. Replace it with work_on_cpu(). All call paths which invoke the callbacks are already protected against CPU hotplug. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: linux-pm@vger.kernel.org Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Len Brown <lenb@kernel.org> Link: http://lkml.kernel.org/r/20170412201042.958216363@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-15  cpufreq/ia64: Replace racy task affinity logic  (Thomas Gleixner; 1 file, -53/+39)
The get() and target() callbacks must run on the affected CPU. This is achieved by temporarily setting the affinity of the calling thread to the requested CPU and resetting it to the original affinity afterwards. That's racy vs. concurrent affinity settings for that thread, resulting in code executing on the wrong CPU and overwriting the new affinity setting. Replace it with work_on_cpu(). All call paths which invoke the callbacks are already protected against CPU hotplug. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: linux-pm@vger.kernel.org Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Len Brown <lenb@kernel.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1704122231100.2548@nanos Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
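A sketch of the work_on_cpu() replacement used by the sh and ia64 variants of this series; the callback body and the program_pstate() helper are illustrative, not the driver's real code:

    #include <linux/workqueue.h>

    static long __target_on_cpu(void *arg)
    {
            unsigned int *index = arg;

            /* Executes in a kworker bound to the target CPU. */
            return program_pstate(*index);          /* hypothetical helper */
    }

    static int example_target(struct cpufreq_policy *policy, unsigned int index)
    {
            /* Callers of ->target() already hold CPU hotplug protection,
             * so policy->cpu cannot go away under us; work_on_cpu() is
             * synchronous, so passing &index from the stack is safe. */
            return work_on_cpu(policy->cpu, __target_on_cpu, &index);
    }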
2017-04-15  Merge back cpufreq core changes for v4.12.  (Rafael J. Wysocki; 3 files, -9/+58)
2017-04-13  cpufreq: Bring CPUs up even if cpufreq_online() failed  (Chen Yu; 1 file, -2/+16)
There is a report that after commit 27622b061eb4 ("cpufreq: Convert to hotplug state machine"), the normal CPU offline/online cycle fails on some platforms. According to the ftrace result, this problem was triggered on platforms using acpi-cpufreq as the default cpufreq driver, and due to the lack of some ACPI freq method (eg. _PCT), cpufreq_online() failed and returned a negative value, so the CPU hotplug state machine rolled back the CPU online process. Actually, from the user's perspective, the failure of cpufreq_online() should not prevent that CPU from being brought up, although cpufreq might not work on that CPU. BTW, during system startup cpufreq_online() is not invoked via CPU online but by the cpufreq device creation process, so the APs can be brought up even though cpufreq_online() fails in that stage. This patch ignores the return value of cpufreq_online/offline() and lets the cpufreq framework deal with the failure. cpufreq_online() itself will do a proper rollback in that case and if _PCT is missing, the ACPI cpufreq driver will print a warning if the corresponding debug options have been enabled. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=194581 Fixes: 27622b061eb4 ("cpufreq: Convert to hotplug state machine") Reported-and-tested-by: Tomasz Maciej Nowak <tmn505@gmail.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.9+ <stable@vger.kernel.org> # 4.9+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
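The essence of the change, sketched; the callback name follows the cpufreq core's convention, but the exact diff may differ:

    static int cpuhp_cpufreq_online(unsigned int cpu)
    {
            /* cpufreq_online() rolls back its own partial setup on error;
             * never fail the hotplug state machine because of it. */
            cpufreq_online(cpu);
            return 0;
    }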
2017-04-12  CPUFREQ: Loongson2: drop set_cpus_allowed_ptr()  (Sebastian Andrzej Siewior; 1 file, -7/+0)
It is a pure mystery to me why we need to be on a specific CPU while looking up a value in an array. My best shot at this is that before commit d4019f0a92ab ("cpufreq: move freq change notifications to cpufreq core") it was required to invoke cpufreq_notify_transition() on a special CPU. Since it looks like a waste, remove it. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: tglx@linutronix.de Cc: linux-pm@vger.kernel.org Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/15888/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2017-04-06  Merge back cpufreq changes for v4.12.  (Rafael J. Wysocki; 3 files, -9/+58)
2017-03-31  Merge branches 'pm-cpufreq-fixes' and 'pm-cpuidle-fixes'  (Rafael J. Wysocki; 1 file, -17/+21)
* pm-cpufreq-fixes:
  cpufreq: Fix creation of symbolic links to policy directories

* pm-cpuidle-fixes:
  cpuidle: powernv: Pass correct drv->cpumask for registration
2017-03-29  cpufreq: intel_pstate: Add support for Gemini Lake  (Box, David E; 1 file, -0/+1)
Use same parameters as INTEL_FAM6_ATOM_GOLDMONT to enable Gemini Lake. Signed-off-by: Box, David E <david.e.box@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-28  cpufreq: intel_pstate: Eliminate intel_pstate_get_min_max()  (Rafael J. Wysocki; 1 file, -26/+14)
Some computations in intel_pstate_get_min_max() are not necessary and one of its two callers doesn't even use the full result. First off, the fixed-point value of cpu->max_perf represents a non-negative number between 0 and 1 inclusive and cpu->min_perf cannot be greater than cpu->max_perf. It is not necessary to check those conditions every time the numbers in question are used. Moreover, since intel_pstate_max_within_limits() only needs the upper boundary, it doesn't make sense to compute the lower one in there, and returning min and max from intel_pstate_get_min_max() via pointers doesn't look particularly nice. For the above reasons, drop intel_pstate_get_min_max(), add a helper to get the base P-state for min/max computations and carry them out directly in the previous callers of intel_pstate_get_min_max(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-28  cpufreq: intel_pstate: Do not walk policy->cpus  (Rafael J. Wysocki; 1 file, -64/+60)
intel_pstate_hwp_set() is the only function walking policy->cpus in intel_pstate. The rest of the code simply assumes one CPU per policy, including the initialization code. Therefore it doesn't make sense for intel_pstate_hwp_set() to walk policy->cpus as it is guaranteed to have only one bit set for policy->cpu. For this reason, rearrange intel_pstate_hwp_set() to take the CPU number as the argument and drop the loop over policy->cpus from it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-28  cpufreq: intel_pstate: Introduce pid_in_use()  (Rafael J. Wysocki; 1 file, -5/+11)
Add a new function pid_in_use() to return the information on whether or not the PID-based P-state selection algorithm is in use. That allows a couple of complicated conditions in the code to be reduced to simple checks against the new function's return value. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-28  cpufreq: intel_pstate: Drop struct cpu_defaults  (Rafael J. Wysocki; 1 file, -87/+67)
The cpu_defaults structure is redundant, because it only contains one member of type struct pstate_funcs which can be used directly instead of struct cpu_defaults. For this reason, drop struct cpu_defaults, use struct pstate_funcs directly instead of it where applicable and rename all of the variables of that type accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-28  cpufreq: intel_pstate: Move cpu_defaults definitions  (Rafael J. Wysocki; 1 file, -67/+62)
Move the definitions of the cpu_defaults structures after the definitions of utilization update callback routines to avoid extra declarations of the latter. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-28  cpufreq: intel_pstate: Add update_util callback to pstate_funcs  (Rafael J. Wysocki; 1 file, -38/+43)
Avoid using extra function pointers during P-state selection by dropping the get_target_pstate member from struct pstate_funcs, adding a new update_util callback to it (to be registered with the CPU scheduler as the utilization update callback in the active mode) and reworking the utilization update callback routines to invoke specific P-state selection functions directly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-28  cpufreq: intel_pstate: Use different utilization update callbacks  (Rafael J. Wysocki; 1 file, -25/+54)
Notice that some overhead in the utilization update callbacks registered by intel_pstate in the active mode can be avoided if those callbacks are tailored to specific configurations of the driver. For example, the utilization update callback for the HWP enabled case only needs to update the average CPU performance periodically whereas the utilization update callback for the PID-based algorithm does not need to take IO-wait boosting into account and so on. With that in mind, define three utilization update callbacks for three different use cases: HWP enabled, the CPU load "powersave" P-state selection algorithm and the PID-based "powersave" P-state selection algorithm and modify the driver initialization to choose the callback matching its current configuration. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-28  cpufreq: intel_pstate: Modify check in intel_pstate_update_status()  (Rafael J. Wysocki; 1 file, -1/+1)
One of the checks in intel_pstate_update_status() implicitly relies on the information that there are only two struct cpufreq_driver objects available, but it is better to do it directly against the value it really is about (to make the code easier to follow if nothing else). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-28  cpufreq: intel_pstate: Drop driver_registered variable  (Rafael J. Wysocki; 1 file, -27/+19)
The driver_registered variable in intel_pstate is used for checking whether or not the driver has been registered, but intel_pstate_driver can be used for that too (with the rule that the driver is not registered as long as it is NULL). That is a bit more straightforward and the code may be simplified a bit this way, so modify the driver accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-28  cpufreq: intel_pstate: Skip unnecessary PID resets on init  (Rafael J. Wysocki; 1 file, -2/+2)
PID controller parameters only need to be initialized if the get_target_pstate_use_performance() P-state selection routine is going to be used. It is not necessary to initialize them otherwise, so don't do that. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-28  cpufreq: intel_pstate: Set HWP sampling interval once  (Rafael J. Wysocki; 1 file, -2/+1)
In the HWP enabled case pid_params.sample_rate_ns only needs to be updated once, because it is global, so do that when setting hwp_active instead of doing it during the initialization of every CPU. Moreover, pid_params.sample_rate_ms is never used if HWP is enabled, so do not update it at all then. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-28  cpufreq: intel_pstate: Clean up intel_pstate_busy_pid_reset()  (Rafael J. Wysocki; 1 file, -30/+16)
intel_pstate_busy_pid_reset() is the only caller of pid_reset(), pid_p_gain_set(), pid_i_gain_set(), and pid_d_gain_set(). Moreover, it passes constants as two parameters of pid_reset() and all of the other routines above essentially contain the same code, so fold all of them into the caller and drop unnecessary computations. Introduce percent_fp() for converting integer values in percent to fixed-point fractions and use it in the above code cleanup. Finally, rename intel_pstate_busy_pid_reset() to intel_pstate_pid_reset() as it also is used for the initialization of PID parameters for every CPU and the meaning of the "busy" part of the name is not particularly clear. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
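A minimal sketch of the percent-to-fixed-point conversion mentioned above; intel_pstate's own FRAC_BITS value and helper macros may differ in detail:

    #define FRAC_BITS 8
    #define int_tofp(X) ((int64_t)(X) << FRAC_BITS)

    /* e.g. percent_fp(97) == 248, i.e. 248/256 ~= 0.97 in 8-bit
     * fixed point. */
    static inline int32_t percent_fp(int percent)
    {
            return int_tofp(percent) / 100;
    }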
2017-03-28  cpufreq: intel_pstate: Fold intel_pstate_reset_all_pid() into the caller  (Rafael J. Wysocki; 1 file, -11/+6)
There is only one caller of intel_pstate_reset_all_pid(), which is pid_param_set() used in the debugfs interface only, and having that code split does not make it particularly convenient to follow. For this reason, move the body of intel_pstate_reset_all_pid() into its caller and drop that function. Also change the loop from for_each_online_cpu() (which is obviously racy with respect to CPU offline/online) to for_each_possible_cpu(), so that all PID parameters are reset for all CPUs regardless of their online/offline status (to prevent, for example, a previously offline CPU from going online with a stale set of PID parameters). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-28  cpufreq: intel_pstate: Initialize pid_params statically  (Rafael J. Wysocki; 1 file, -32/+10)
Notice that both the existing struct cpu_defaults instances in which PID parameters are actually initialized use the same values of those parameters, so it is not really necessary to copy them over to pid_params dynamically. Instead, initialize pid_params statically with those values and drop the unused pid_policy member from struct cpu_defaults along with copy_pid_params() used for initializing it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-28  cpufreq: intel_pstate: Drop pointless initialization of PID parameters  (Rafael J. Wysocki; 1 file, -26/+2)
The P-state selection algorithm used by intel_pstate for Atom processors is not based on the PID controller and the initialization of PID parameters for those processors is pointless and confusing, so drop it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-28  cpufreq: intel_pstate: Eliminate struct perf_limits  (Rafael J. Wysocki; 1 file, -36/+23)
After recent changes the purpose of struct perf_limits is not particularly clear any more and the code may be made somewhat easier to follow by eliminating it, so go for that. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-27  cpufreq: Fix creation of symbolic links to policy directories  (Rafael J. Wysocki; 1 file, -17/+21)
The cpufreq core only tries to create symbolic links from CPU directories in sysfs to policy directories in cpufreq_add_dev(), either when a given CPU is registered or when the cpufreq driver is registered, whichever happens first. That is not sufficient, however, because cpufreq_add_dev() may be called for an offline CPU whose policy object has not been created yet and, quite obviously, the symbolic link cannot be added in that case. Fix that by making cpufreq_online() attempt to add symbolic links to policy objects for the CPUs in the related_cpus mask of every new policy object created by it. The cpufreq_driver_lock locking around the for_each_cpu() loop in cpufreq_online() is dropped, because it is not necessary and the code is somewhat simpler without it. Moreover, failures to create a symbolic link will not be regarded as hard errors any more and the CPUs without those links will not be taken offline automatically, but that should not be problematic in practice. Reported-and-tested-by: Prashanth Prakash <pprakash@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: 4.9+ <stable@vger.kernel.org> # 4.9+
2017-03-24  cpufreq: intel_pstate: Avoid transient updates of cpuinfo.max_freq  (Rafael J. Wysocki; 1 file, -19/+28)
Both intel_pstate_verify_policy() and intel_cpufreq_verify_policy() set policy->cpuinfo.max_freq depending on the turbo status, but the updates made by them are discarded by the core, because the policy object passed to them by the core is temporary and cpuinfo.max_freq from that object is not copied to the final policy object in cpufreq_set_policy(). However, cpufreq_set_policy() passes the temporary policy object to the ->setpolicy callback of the driver, so intel_pstate_set_policy() actually sees the policy->cpuinfo.max_freq value updated by intel_pstate_verify_policy() and not the final one. It also updates policy->max sometimes which basically has no effect after it returns, because the core discards that update. To avoid confusion, eliminate policy->cpuinfo.max_freq updates from intel_pstate_verify_policy() and intel_cpufreq_verify_policy() entirely and check the maximum frequency explicitly in intel_pstate_update_perf_limits() instead of relying on the transiently updated policy->cpuinfo.max_freq value. Moreover, move the max->policy adjustment carried out in intel_pstate_set_policy() to a separate function and call that function from the ->verify driver callbacks to ensure that it will actually be effective. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-24  cpufreq: intel_pstate: Active mode P-state limits rework  (Rafael J. Wysocki; 1 file, -100/+85)
The coordination of P-state limits used by intel_pstate in the active mode (ie. by default) is problematic, because it synchronizes all of the limits (ie. the global ones and the per-policy ones) so as to use one common pair of P-state limits (min and max) across all CPUs in the system. The drawbacks of that are as follows: - If P-states are coordinated in hardware, it is not necessary to coordinate them in software on top of that, so in that case all of the above activity is in vain. - If P-states are not coordinated in hardware, then the processor is actually capable of setting different P-states for different CPUs and coordinating them at the software level simply doesn't allow that capability to be utilized. - The coordination works in such a way that setting a per-policy limit (eg. scaling_max_freq) for one CPU causes the common effective limit to change (and it will affect all of the other CPUs too), but subsequent reads from the corresponding sysfs attributes for the other CPUs will return stale values (which is confusing). - Reads from the global P-state limit attributes, min_perf_pct and max_perf_pct, return the effective common values and not the last values set through these attributes. However, the last values set through these attributes become hard limits that cannot be exceeded by writes to scaling_min_freq and scaling_max_freq, respectively, and they are not exposed, so essentially users have to remember what they are. All of that is painful enough to warrant a change of the management of P-state limits in the active mode. To that end, redesign the active mode P-state limits management in intel_pstate in accordance with the following rules: (1) All CPUs are affected by the global limits (that is, none of them can be requested to run faster than the global max and none of them can be requested to run slower than the global min). (2) Each individual CPU is affected by its own per-policy limits (that is, it cannot be requested to run faster than its own per-policy max and it cannot be requested to run slower than its own per-policy min). (3) The global and per-policy limits can be set independently. Also, the global maximum and minimum P-state limits will be always expressed as percentages of the maximum supported turbo P-state. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-24  cpufreq: intel_pstate: Use load-based P-state selection more widely  (Rafael J. Wysocki; 1 file, -1/+7)
Extend the set of systems for which intel_pstate will use the "powersave" P-state selection algorithm based on CPU load in the active mode by systems with ACPI preferred profile set to "tablet", "appliance PC", "desktop", or "workstation" (ie. everything with a specified preferred profile that is not a "server"). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-24  cpufreq: intel_pstate: Support HWP processors in all operation modes  (Rafael J. Wysocki; 1 file, -14/+19)
Currently, some processors supporting HWP are only supported by intel_pstate if HWP is actually going to be used and not supported otherwise, which is confusing. Specifically, they are not supported if "intel_pstate=no_hwp" is passed to the kernel in the command line or if the driver is started in the passive mode ("intel_pstate=passive"). There is no real reason for that, because everything about those processors is known anyway and the driver can work with them in all modes, so make that happen, but use the load-based P-state selection algorithm for the active mode "powersave" policy with them. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-24  Merge back intel_pstate updates for 4.12.  (Rafael J. Wysocki; 1 file, -14/+4)
2017-03-24  Merge branches 'pm-cpufreq-fixes', 'pm-cpufreq-sched-fixes' and 'intel_pstate-fixes'  (Rafael J. Wysocki; 1 file, -148/+79)
* pm-cpufreq-fixes:
  cpufreq: Restore policy min/max limits on CPU online

* pm-cpufreq-sched-fixes:
  cpufreq: schedutil: Fix per-CPU structure initialization in sugov_start()

* intel_pstate-fixes:
  cpufreq: intel_pstate: Fix policy data management in passive mode
  cpufreq: intel_pstate: One set of global limits in active mode
2017-03-22  cpufreq: Restore policy min/max limits on CPU online  (Viresh Kumar; 1 file, -0/+3)
On CPU online the cpufreq core restores the previous governor (or the previous "policy" setting for ->setpolicy drivers), but it does not restore the min/max limits at the same time, which is confusing, inconsistent and a real pain for users who set the limits and then suspend/resume the system (using full suspend), in which case the limits are reset on all CPUs except for the boot one. Fix this by making cpufreq_online() restore the limits when an inactive policy is brought online. The commit log and patch are inspired by Rafael's earlier work. Reported-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.3+ <stable@vger.kernel.org> # 4.3+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
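A sketch of the restore based on the changelog, assuming the user_policy fields that held the user-set limits in kernels of that era; the surrounding cpufreq_online() code is elided:

    /* in cpufreq_online(), when an existing (inactive) policy is reused */
    if (!new_policy) {
            /* Restore the limits the user had set before the CPU
             * (or the whole system) went down. */
            policy->min = policy->user_policy.min;
            policy->max = policy->user_policy.max;
    }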
2017-03-21  cpufreq: intel_pstate: Fix policy data management in passive mode  (Rafael J. Wysocki; 1 file, -23/+6)
The policy->cpuinfo.max_freq and policy->max updates in intel_cpufreq_turbo_update() are excessive as they are done for no good reason and may lead to problems in principle, so they should be dropped. However, after dropping them intel_cpufreq_turbo_update() becomes almost entirely pointless, because the check made by it is made again down the road in intel_pstate_prepare_request(). The only thing in it that still needs to be done is the call to update_turbo_state(), so drop intel_cpufreq_turbo_update() altogether and make its callers invoke update_turbo_state() directly instead of it. In addition to that, fix intel_cpufreq_verify_policy() so that it checks global.no_turbo in addition to global.turbo_disabled when updating policy->cpuinfo.max_freq to make it consistent with intel_pstate_verify_policy(). Fixes: 001c76f05b01 (cpufreq: intel_pstate: Generic governors support) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-18  cpufreq: intel_pstate: One set of global limits in active mode  (Rafael J. Wysocki; 1 file, -96/+46)
In the active mode intel_pstate currently uses two sets of global limits, each associated with one of the possible scaling_governor settings in that mode: "powersave" or "performance". The driver switches over from one of those sets to the other depending on the scaling_governor setting of the CPU whose per-policy cpufreq interface in sysfs was most recently used to change parameters exposed in there. That obviously leads to no end of issues when the scaling_governor settings differ between CPUs. The most recent issue was introduced by commit a240c4aa5d0f (cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy) that eliminated the reinitialization of "performance" limits in intel_pstate_set_policy(), preventing the max limit from being set to anything below 100, among other things. Namely, an undesirable side effect of commit a240c4aa5d0f is that now, after setting scaling_governor to "performance" in the active mode, the per-policy limits for the CPU in question go to the highest level and stay there even when it is switched back to "powersave" later. As it turns out, some distributions set scaling_governor to "performance" temporarily for all CPUs to speed up system initialization, so that change causes them to misbehave later. To fix that, get rid of the performance/powersave global limits split and use just one set of global limits for everything. From the user's perspective, after this modification, when scaling_governor is switched from "performance" to "powersave" or the other way around on one CPU, the limits settings (ie. the global max/min_perf_pct and per-policy scaling_max/min_freq for any CPUs) will not change. Still, switching from "performance" to "powersave" or the other way around changes the way in which P-states are selected and in particular "performance" causes the driver to always request the highest P-state it is allowed to request for the given CPU. Fixes: a240c4aa5d0f (cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-18  Merge branches 'pm-cpufreq-fixes' and 'intel_pstate-fixes'  (Rafael J. Wysocki; 1 file, -33/+31)
* pm-cpufreq-fixes:
  cpufreq: Fix and clean up show_cpuinfo_cur_freq()

* intel_pstate-fixes:
  cpufreq: intel_pstate: Avoid percentages in limits-related computations
  cpufreq: intel_pstate: Correct frequency setting in the HWP mode
  cpufreq: intel_pstate: Update pid_params.sample_rate_ns in pid_param_set()
2017-03-16  cpufreq: dbx500: Manage cooling device from cpufreq driver  (Viresh Kumar; 1 file, -0/+20)
The best place to register the CPU cooling device is from the cpufreq driver, as we would know whether all the resources are already available or not. That's what is done for the cpufreq-dt.c driver as well. The cpu-cooling driver for the dbx500 platform was just (un)registering with the thermal framework, and that can be handled easily, and in the proper sequence, by the cpufreq driver as well. Get rid of the cooling driver and its users and manage everything from the cpufreq driver instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
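A sketch of the pattern described above: register the CPU cooling device from the cpufreq driver's ->ready() callback, once the policy and its resources exist. The cpufreq_cooling_register() signature shown here (taking a cpumask) is the one used around v4.12; later kernels changed it, and the function name below is illustrative:

    #include <linux/cpu_cooling.h>
    #include <linux/err.h>

    static struct thermal_cooling_device *cdev;

    static void example_cpufreq_ready(struct cpufreq_policy *policy)
    {
            cdev = cpufreq_cooling_register(policy->cpus);
            if (IS_ERR(cdev)) {
                    pr_err("cpufreq: failed to register cooling device\n");
                    cdev = NULL;
            }
    }

    /* hooked up via .ready = example_cpufreq_ready in the cpufreq_driver */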