Age | Commit message (Collapse) | Author | Files | Lines |
|
The measurements of the power-on|off latencies in genpd for a PM domain are
superfluous, unless the corresponding genpd has a governor assigned to it,
which would make use of the data.
Therefore, let's improve the behaviour in genpd by making the measurements
conditional, based upon if there's a governor assigned.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
If a genpd doesn't have an associated governor assigned, several variables
in the struct generic_pm_domain becomes superfluous.
Rather than wasting memory in allocated genpds, let's move the variables
from the struct generic_pm_domain into a new separate struct. In this way,
we can instead dynamically decide when we need to allocate the
corresponding data for it.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
To improve the readability of the code, let's move the parts that deals
with allocation/freeing of data, into two separate functions.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In the genpd governor we walk the list of child-domains to take into
account their next_wakeup. If the child-domain itself, doesn't have a
governor assigned to it, we can end up using the next_wakeup value before
it has been properly initialized. To prevent a possible incorrect behaviour
in the governor, let's initialize next_wakeup to KTIME_MAX.
Fixes: c79aa080fb0f ("PM: domains: use device's next wakeup to determine domain idle state")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
When an IRQ safe device is attached to a non-IRQ safe PM domain, genpd
needs to prevent the PM domain from being powered off. However, genpd still
allows the device to be runtime suspended/resumed, hence it's also
reasonable to think that a governor may be used to validate the QoS latency
constraints.
Unfortunately, genpd_runtime_resume() treats the configuration above, as a
reason to skip measuring the QoS resume latency for the device. This is a
legacy behaviour that was earlier correct, but should have been changed
when genpd was transformed into its current behaviour around how it manages
IRQ safe devices. Luckily, there's no report about problems, so let's just
fixup the behaviour.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The QoS latency measurements for devices in genpd_runtime_suspend|resume()
are superfluous, unless the corresponding genpd has a governor assigned to
it, which would make use of the data.
Therefore, let's improve the behaviour in genpd by making the measurements
conditional, based upon if there's a governor assigned.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
If the corresponding genpd for the device doesn't use a governor, the
variable next_wakeup within the struct generic_pm_domain_data becomes
superfluous.
To avoid wasting memory, let's move it into the struct gpd_timing_data,
which is already being allocated based upon if there is governor assigned.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
If a genpd doesn't have an associated governor assigned, there's really no
point to allocate the per device gpd_timing_data, as the data isn't being
used by a governor anyway.
To avoid wasting memory, let's therefore convert the corresponding td
variable in the struct generic_pm_domain_data into a pointer and manage the
allocation of its data dynamically.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In irq_safe_dev_in_sleep_domain() we correctly skip the dev_warn_once() if
the corresponding genpd for the device, has the GENPD_FLAG_ALWAYS_ON flag
being set. For the same reason (the genpd is always-on in runtime), let's
also skip the warning if the GENPD_FLAG_RPM_ALWAYS_ON flag is set for the
genpd.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The name "irq_safe_dev_in_no_sleep_domain", doesn't really match the
conditions that are being checked in the function, hence the code becomes a
bit confusing to read.
Let's clarify this by renaming it into "irq_safe_dev_in_sleep_domain" and
let's also take the opportunity to clarify a corresponding comment in the
code.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Back in the days when genpd supported intermediate power states of its
devices, it made sense to check the PM_QOS_FLAG_NO_POWER_OFF in
genpd_power_off(). This because the attached devices were all being put
into low power state together when the PM domain was also being powered
off.
At this point, the flag PM_QOS_FLAG_NO_POWER_OFF is better checked by
drivers from their ->runtime_suspend() callbacks, like in the
usb_port_runtime_suspend(), for example. Or perhaps an even better option
is to set the QoS resume latency constraint for the device to zero, which
informs the runtime PM core to prevent the device from being runtime
suspended.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Due to recent changes, the always-on governor is always used with a genpd
that has the GENPD_FLAG_RPM_ALWAYS_ON flag being set. This means genpd,
doesn't invoke the governor's ->power_down_ok() callback, which makes the
code in the governor redundant, so let's drop it.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Rather than relying on the genpd provider to set the corresponding flag,
GENPD_FLAG_RPM_ALWAYS_ON, when the always-on governor is being used, let's
add it in pm_genpd_init(). In this way, it starts to benefits all genpd
providers immediately.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
There is no need to store the result of the multiply back to variable value
after the multiplication. The store is redundant, replace *= with just *.
Cleans up clang scan build warning:
warning: Although the value stored to 'value' is used in the enclosing
expression, the value is never actually read from 'value'
[deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The communication mean of the _CPC desired performance can be
PCC, System Memory, System IO, or Functional Fixed Hardware (FFH).
PCC, SystemMemory and SystemIo address spaces are available from any
CPU. Thus, dvfs_possible_from_any_cpu should be enabled in such case.
For FFH, let the FFH implementation do smp_call_function_*() calls.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The communication mean of the _CPC desired performance can be
PCC, System Memory, System IO, or Functional Fixed Hardware.
commit b7898fda5bc7 ("cpufreq: Support for fast frequency switching")
fast_switching is 'for switching CPU frequencies from interrupt
context'.
Writes to SystemMemory and SystemIo are fast and suitable this.
This is not the case for PCC and might not be the case for FFH.
Enable fast_switching for the cppc_cpufreq driver in above cases.
Add cppc_allow_fast_switch() to check the desired performance
register address space and set fast_switching accordingly.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The transition_delay_us (struct cpufreq_policy) is currently defined
as:
Preferred average time interval between consecutive invocations of
the driver to set the frequency for this policy. To be set by the
scaling driver (0, which is the default, means no preference).
The transition_latency represents the amount of time necessary for a
CPU to change its frequency.
A PCCT table advertises mutliple values:
- pcc_nominal: Expected latency to process a command, in microseconds
- pcc_mpar: The maximum number of periodic requests that the subspace
channel can support, reported in commands per minute. 0 indicates no
limitation.
- pcc_mrtt: The minimum amount of time that OSPM must wait after the
completion of a command before issuing the next command,
in microseconds.
cppc_get_transition_latency() allows to get the max of them.
commit d4f3388afd48 ("cpufreq / CPPC: Set platform specific
transition_delay_us") allows to select transition_delay_us based on
the platform, and fallbacks to cppc_get_transition_latency()
otherwise.
If _CPC objects are not using PCC channels (no PPCT table), the
transition_delay_us is set to CPUFREQ_ETERNAL, leading to really long
periods between frequency updates (~4s).
If the desired_reg, where performance requests are written, is in
SystemMemory or SystemIo ACPI address space, there is no delay
in requests. So return 0 instead of CPUFREQ_ETERNAL, leading to
transition_delay_us being set to LATENCY_MULTIPLIER us (1000 us).
This patch also adds two macros to check the address spaces.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The _OSC method allows the OS and firmware to communicate about
supported features/capabitlities. It also allows the OS to take
control of some features.
In ACPI 6.4, s6.2.11.2 Platform-Wide OSPM Capabilities, the CPPC
(resp. v2) bit should be set by the OS if it 'supports controlling
processor performance via the interfaces described in the _CPC
object'.
The OS supports CPPC and parses the _CPC object only if
CONFIG_ACPI_CPPC_LIB is set. Replace the x86 specific
boot_cpu_has(X86_FEATURE_HWP) dynamic check with an arch
generic CONFIG_ACPI_CPPC_LIB build-time check.
Note:
CONFIG_X86_INTEL_PSTATE selects CONFIG_ACPI_CPPC_LIB.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
ACPI 6.2 Section 6.2.11.2 'Platform-Wide OSPM Capabilities':
Starting with ACPI Specification 6.2, all _CPC registers can be in
PCC, System Memory, System IO, or Functional Fixed Hardware address
spaces. OSPM support for this more flexible register space scheme is
indicated by the “Flexible Address Space for CPPC Registers” _OSC bit
Otherwise (cf ACPI 6.1, s8.4.7.1.1.X), _CPC registers must be in:
- PCC or Functional Fixed Hardware address space if defined
- SystemMemory address space (NULL register) if not defined
Add the corresponding _OSC bit and check it when parsing _CPC objects.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The intent is to use a genpd governor when there are some states that needs
to be managed. Although, the current code ends up to never assign a
governor, let's fix this.
Fixes: 6abf32f1d9c50 ("cpuidle: Add RISC-V SBI CPU idle driver")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Anup Patel <anup@brainfault.org>
Tested-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
While factoring out the PM domain related code from PSCI domain driver into
a set of library functions, a regression when initializing the genpds got
introduced. More precisely, we fail to assign a genpd governor, so let's
fix this.
Fixes: 9d976d6721df ("cpuidle: Factor-out power domain related code from PSCI domain driver")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Each devfreq governor specifies the supported governor event
such as GOV_START and GOV_STOP. When not-supported event is required,
just return non-error. But, commit ce9a0d88d97a ("PM / devfreq: Add
cpu based scaling support to passive governor") returned the error
value. So that return non-error value when not-supported event is required.
Fixes: ce9a0d88d97a ("PM / devfreq: Add cpu based scaling support to passive governor")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add ALDERLAKE_N to the list of supported processor models in the Intel
RAPL power capping driver.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
cpufreq_offline() calls offline() and exit() under the policy rwsem
But they are called outside the rwsem in cpufreq_online().
Make cpufreq_online() call offline() and exit() as well as online() and
init() under the policy rwsem to achieve a clear lock relationship.
All of the init() and online() implementations in the tree only
initialize the policy object without attempting to acquire the policy
rwsem and they won't call cpufreq APIs attempting to acquire it.
Signed-off-by: Schspa Shi <schspa@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
If policy initialization fails after the sysfs files are created,
there is a possibility to end up running show()/store() callbacks
for half-initialized policies, which may have unpredictable
outcomes.
Abort show()/store() in such a case by making sure the policy is active.
Also dectivate the policy on such failures.
Signed-off-by: Schspa Shi <schspa@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The passive governor requires the cpu data to get the next target frequency
of devfreq device if depending on cpu. In order to reduce the unnecessary
memory data, keep cpufreq_policy data for possible cpus instead of NR_CPU.
Tested-by: Chen-Yu Tsai <wenst@chromium.org>
Tested-by: Johnson Wang <johnson.wang@mediatek.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
|
|
In order to keep the consistent coding style between passive_devfreq
and passive_cpufreq, use common code for handling required opp property.
Also remove the unneed conditional statement and unify the comment
of both passive_devfreq and passive_cpufreq when getting the target frequency.
Tested-by: Chen-Yu Tsai <wenst@chromium.org>
Tested-by: Johnson Wang <johnson.wang@mediatek.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
|
|
Many CPU architectures have caches that can scale independent of the
CPUs. Frequency scaling of the caches is necessary to make sure that the
cache is not a performance bottleneck that leads to poor performance and
power. The same idea applies for RAM/DDR.
To achieve this, this patch adds support for cpu based scaling to the
passive governor. This is accomplished by taking the current frequency
of each CPU frequency domain and then adjust the frequency of the cache
(or any devfreq device) based on the frequency of the CPUs. It listens
to CPU frequency transition notifiers to keep itself up to date on the
current CPU frequency.
To decide the frequency of the device, the governor does one of the
following:
* Derives the optimal devfreq device opp from required-opps property of
the parent cpu opp_table.
* Scales the device frequency in proportion to the CPU frequency. So, if
the CPUs are running at their max frequency, the device runs at its
max frequency. If the CPUs are running at their min frequency, the
device runs at its min frequency. It is interpolated for frequencies
in between.
Tested-by: Chen-Yu Tsai <wenst@chromium.org>
Tested-by: Johnson Wang <johnson.wang@mediatek.com>
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
[Sibi: Integrated cpu-freqmap governor into passive_governor]
Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
[Chanwoo: Fix conflict with latest code and cleanup code]
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
|
|
In order to get frequency range within devfreq governors,
export devfreq_get_freq_range symbol within devfreq.
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Tested-by: Chen-Yu Tsai <wenst@chromium.org>
Tested-by: Johnson Wang <johnson.wang@mediatek.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
|
|
|
|
We're having unresolved issues with the glock holder auto-demotion mechanism
introduced in commit dc732906c245. This mechanism was assumed to be essential
for avoiding frequent short reads and writes until commit 296abc0d91d8
("gfs2: No short reads or writes upon glock contention"). Since then,
when the inode glock is lost, it is simply re-acquired and the operation
is resumed. This means that apart from the performance penalty, we
might as well drop the inode glock before faulting in pages, and
re-acquire it afterwards.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
In gfs2_file_buffered_write, to increase the likelihood that all the
user memory we're trying to write will be resident in memory, carry out
the write in chunks and fault in each chunk of user memory before trying
to write it. Otherwise, some workloads will trigger frequent short
"internal" writes, causing filesystem blocks to be allocated and then
partially deallocated again when writing into holes, which is wasteful
and breaks reservations.
Neither the chunked writes nor any of the short "internal" writes are
user visible.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Align the chunks that reads and writes are carried out in to the page
cache rather than the user buffers. This will be more efficient in
general, especially for allocating writes. Optimizing the case that the
user buffer is gfs2 backed isn't very useful; we only need to make sure
we won't deadlock.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Pull the return value test of the previous read or write operation out
of should_fault_in_pages(). In a following patch, we'll fault in pages
before the I/O and there will be no return value to check.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
No need to store the return value of the fault_in functions in separate
variables.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Instead of counting the number of bytes read from the filesystem,
functions gfs2_file_direct_read and gfs2_file_read_iter count the number
of bytes written into the user buffer. Conversely, functions
gfs2_file_direct_write and gfs2_file_buffered_write count the number of
bytes read from the user buffer. This is nothing but confusing, so
change the read functions to count how many bytes they have read, and
the write functions to count how many bytes they have written.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
When a write cannot be carried out in full, gfs2_iomap_end() releases
blocks that have been allocated for this write but haven't been used.
To compute the end of the allocation, gfs2_iomap_end() incorrectly
rounded the end of the attempted write down to the next block boundary
to arrive at the end of the allocation. It would have to round up, but
the end of the allocation is also available as iomap->offset +
iomap->length, so just use that instead.
In addition, use round_up() for computing the start of the unused range.
Fixes: 64bc06bb32ee ("gfs2: iomap buffered write support")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
With very limited vram on svga3 it's difficult to handle all the surface
migrations. Without gbobjects, i.e. the ability to store surfaces in
guest mobs, there's no reason to support intermediate svga2 features,
especially because we can fall back to fb traces and svga3 will never
support those in-between features.
On svga3 we wither want to use fb traces or screen targets
(i.e. gbobjects), nothing in between. This fixes presentation on a lot
of fusion/esxi tech previews where the exposed svga3 caps haven't been
finalized yet.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Fixes: 2cd80dbd3551 ("drm/vmwgfx: Add basic support for SVGA3")
Cc: <stable@vger.kernel.org> # v5.14+
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220318174332.440068-5-zack@kde.org
|
|
Transition to drm_mode_fb_cmd2 from drm_mode_fb_cmd left the structure
unitialized. drm_mode_fb_cmd2 adds a few additional members, e.g. flags
and modifiers which were never initialized. Garbage in those members
can cause random failures during the bringup of the fbcon.
Initializing the structure fixes random blank screens after bootup due
to flags/modifiers mismatches during the fbcon bring up.
Fixes: dabdcdc9822a ("drm/vmwgfx: Switch to mode_cmd2")
Signed-off-by: Zack Rusin <zackr@vmware.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: <stable@vger.kernel.org> # v4.10+
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Reviewed-by: Maaz Mombasawala <mombasawalam@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302152426.885214-7-zack@kde.org
|
|
Port of the vmwgfx to SVGAv3 lacked support for fencing. SVGAv3 removed
FIFO's and replaced them with command buffers and extra registers.
The initial version of SVGAv3 lacked support for most advanced features
(e.g. 3D) which made fences unnecessary. That is no longer the case,
especially as 3D support is being turned on.
Switch from FIFO commands and capabilities to command buffers and extra
registers to enable fences on SVGAv3.
Fixes: 2cd80dbd3551 ("drm/vmwgfx: Add basic support for SVGA3")
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Reviewed-by: Maaz Mombasawala <mombasawalam@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302152426.885214-5-zack@kde.org
|
|
The unused part precedes the new range spanned by the start, end parameters
of vmemmap_use_new_sub_pmd(). This means it actually goes from
ALIGN_DOWN(start, PMD_SIZE) up to start.
Use the correct address when applying the mark using memset.
Fixes: 8d400913c231 ("x86/vmemmap: handle unpopulated sub-pmd ranges")
Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220509090637.24152-2-ken@codelabs.ch
|
|
The commit cited below claims to fix a use-after-free condition after
tls_device_down. Apparently, the description wasn't fully accurate. The
context stayed alive, but ctx->netdev became NULL, and the offload was
torn down without a proper fallback, so a bug was present, but a
different kind of bug.
Due to misunderstanding of the issue, the original patch dropped the
refcount_dec_and_test line for the context to avoid the alleged
premature deallocation. That line has to be restored, because it matches
the refcount_inc_not_zero from the same function, otherwise the contexts
that survived tls_device_down are leaked.
This patch fixes the described issue by restoring refcount_dec_and_test.
After this change, there is no leak anymore, and the fallback to
software kTLS still works.
Fixes: c55dcdd435aa ("net/tls: Fix use-after-free after the TLS device goes down and up")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20220512091830.678684-1-maximmi@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the NIC ->probe() callback, ->mtd_probe() callback is called.
If NIC has 2 ports, ->probe() is called twice and ->mtd_probe() too.
In the ->mtd_probe(), which is efx_ef10_mtd_probe() it allocates and
initializes mtd partiion.
But mtd partition for sfc is shared data.
So that allocated mtd partition data from last called
efx_ef10_mtd_probe() will not be used.
Therefore it must be freed.
But it doesn't free a not used mtd partition data in efx_ef10_mtd_probe().
kmemleak reports:
unreferenced object 0xffff88811ddb0000 (size 63168):
comm "systemd-udevd", pid 265, jiffies 4294681048 (age 348.586s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffffa3767749>] kmalloc_order_trace+0x19/0x120
[<ffffffffa3873f0e>] __kmalloc+0x20e/0x250
[<ffffffffc041389f>] efx_ef10_mtd_probe+0x11f/0x270 [sfc]
[<ffffffffc0484c8a>] efx_pci_probe.cold.17+0x3df/0x53d [sfc]
[<ffffffffa414192c>] local_pci_probe+0xdc/0x170
[<ffffffffa4145df5>] pci_device_probe+0x235/0x680
[<ffffffffa443dd52>] really_probe+0x1c2/0x8f0
[<ffffffffa443e72b>] __driver_probe_device+0x2ab/0x460
[<ffffffffa443e92a>] driver_probe_device+0x4a/0x120
[<ffffffffa443f2ae>] __driver_attach+0x16e/0x320
[<ffffffffa4437a90>] bus_for_each_dev+0x110/0x190
[<ffffffffa443b75e>] bus_add_driver+0x39e/0x560
[<ffffffffa4440b1e>] driver_register+0x18e/0x310
[<ffffffffc02e2055>] 0xffffffffc02e2055
[<ffffffffa3001af3>] do_one_initcall+0xc3/0x450
[<ffffffffa33ca574>] do_init_module+0x1b4/0x700
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Fixes: 8127d661e77f ("sfc: Add support for Solarflare SFC9100 family")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20220512054709.12513-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Non blocking sendmsg will return -EAGAIN when any signal pending
and no send space left, while non blocking recvmsg return -EINTR
when signal pending and no data received. This may makes confused.
As TCP returns -EAGAIN in the conditions described above. Align the
behavior of smc with TCP.
Fixes: 846e344eb722 ("net/smc: add receive timeout check")
Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Link: https://lore.kernel.org/r/20220512030820.73848-1-guangguan.wang@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After commit 2d1f90f9ba83 ("net: dsa/bcm_sf2: fix incorrect usage of
state->link") the interface suspend path would call our mac_link_down()
call back which would forcibly set the link down, thus preventing
Wake-on-LAN packets from reaching our management port.
Fix this by looking at whether the port is enabled for Wake-on-LAN and
not clearing the link status in that case to let packets go through.
Fixes: 2d1f90f9ba83 ("net: dsa/bcm_sf2: fix incorrect usage of state->link")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220512021731.2494261-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The bandwidth budget table is introduced to trace ideal bandwidth used
by each INT/ISOC endpoint, but in fact the endpoint may consume more
bandwidth and cause data transfer error, so it's better to leave some
margin. Obviously it's difficult to find the best margin for all cases,
instead take use of the worst-case scenario.
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/20220512064931.31670-2-chunfeng.yun@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Due to the scheduler allocates the optimal bandwidth for FS ISOC endpoints,
this may be not enough actually and causes data transfer error, so come up
with an estimate that is no less than the worst case bandwidth used for
any one mframe, but may be an over-estimate.
Fixes: 451d3912586a ("usb: xhci-mtk: update fs bus bandwidth by bw_budget_table")
Cc: stable@vger.kernel.org
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/20220512064931.31670-1-chunfeng.yun@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The usb_gadget_register_driver can be called multi time by to
threads via USB_RAW_IOCTL_RUN ioctl syscall, which will lead
to multiple registrations.
Call trace:
driver_register+0x220/0x3a0 drivers/base/driver.c:171
usb_gadget_register_driver_owner+0xfb/0x1e0
drivers/usb/gadget/udc/core.c:1546
raw_ioctl_run drivers/usb/gadget/legacy/raw_gadget.c:513 [inline]
raw_ioctl+0x1883/0x2730 drivers/usb/gadget/legacy/raw_gadget.c:1220
ioctl USB_RAW_IOCTL_RUN
This routine allows two processes to register the same driver instance
via ioctl syscall. which lead to a race condition.
Please refer to the following scenarios.
T1 T2
------------------------------------------------------------------
usb_gadget_register_driver_owner
driver_register driver_register
driver_find driver_find
bus_add_driver bus_add_driver
priv alloced <context switch>
drv->p = priv;
<schedule out>
kobject_init_and_add // refcount = 1;
//couldn't find an available UDC or it's busy
<context switch>
priv alloced
drv->priv = priv;
kobject_init_and_add
---> refcount = 1 <------
// register success
<context switch>
===================== another ioctl/process ======================
driver_register
driver_find
k = kset_find_obj()
---> refcount = 2 <------
<context out>
driver_unregister
// drv->p become T2's priv
---> refcount = 1 <------
<context switch>
kobject_put(k)
---> refcount = 0 <------
return priv->driver;
--------UAF here----------
There will be UAF in this scenario.
We can fix it by adding a new STATE_DEV_REGISTERING device state to
avoid double register.
Reported-by: syzbot+dc7c3ca638e773db07f6@syzkaller.appspotmail.com
Link: https://lore.kernel.org/all/000000000000e66c2805de55b15a@google.com/
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Signed-off-by: Schspa Shi <schspa@gmail.com>
Link: https://lore.kernel.org/r/20220508150247.38204-1-schspa@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Update MT6360 BMC PHY Tx/Rx setting for the compatibility.
Macpaul reported this CtoDP cable attention message cannot be received from
MT6360 TCPC. But actually, attention message really sent from UFP_D
device.
After RD's comment, there may be BMC PHY Tx/Rx setting causes this issue.
Below's the detailed TCPM log and DP attention message didn't received from 6360
TCPCI.
[ 1206.367775] Identity: 0000:0000.0000
[ 1206.416570] Alternate mode 0: SVID 0xff01, VDO 1: 0x00000405
[ 1206.447378] AMS DFP_TO_UFP_ENTER_MODE start
[ 1206.447383] PD TX, header: 0x1d6f
[ 1206.449393] PD TX complete, status: 0
[ 1206.454110] PD RX, header: 0x184f [1]
[ 1206.456867] Rx VDM cmd 0xff018144 type 1 cmd 4 len 1
[ 1206.456872] AMS DFP_TO_UFP_ENTER_MODE finished
[ 1206.456873] cc:=4
[ 1206.473100] AMS STRUCTURED_VDMS start
[ 1206.473103] PD TX, header: 0x2f6f
[ 1206.475397] PD TX complete, status: 0
[ 1206.480442] PD RX, header: 0x2a4f [1]
[ 1206.483145] Rx VDM cmd 0xff018150 type 1 cmd 16 len 2
[ 1206.483150] AMS STRUCTURED_VDMS finished
[ 1206.483151] cc:=4
[ 1206.505643] AMS STRUCTURED_VDMS start
[ 1206.505646] PD TX, header: 0x216f
[ 1206.507933] PD TX complete, status: 0
[ 1206.512664] PD RX, header: 0x1c4f [1]
[ 1206.515456] Rx VDM cmd 0xff018151 type 1 cmd 17 len 1
[ 1206.515460] AMS STRUCTURED_VDMS finished
[ 1206.515461] cc:=4
Fixes: e1aefcdd394fd ("usb typec: mt6360: Add support for mt6360 Type-C driver")
Cc: stable <stable@vger.kernel.org>
Reported-by: Macpaul Lin <macpaul.lin@mediatek.com>
Tested-by: Macpaul Lin <macpaul.lin@mediatek.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: ChiYuan Huang <cy_huang@richtek.com>
Signed-off-by: Fabien Parent <fparent@baylibre.com>
Link: https://lore.kernel.org/r/1652159580-30959-1-git-send-email-u0084500@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Currently, cpufreq_remove_dev() invokes the ->exit() driver callback
without holding the policy rwsem which is inconsistent with what
happens if ->exit() is invoked directly from cpufreq_offline().
It also manipulates the real_cpus mask and removes the CPU device
symlink without holding the policy rwsem, but cpufreq_offline() holds
the rwsem around the modifications thereof.
For consistency, modify cpufreq_remove_dev() to hold the policy rwsem
until the ->exit() callback has been called (or it has been determined
that it is not necessary to call it).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
|