linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2021-06-21	of: reserved-memory: Add stub for RESERVEDMEM_OF_DECLARE()	Dmitry Osipenko	2	-6/+13
	The reserved-memory Kconfig could be disabled when drivers are compile-tested. In this case RESERVEDMEM_OF_DECLARE() produces a noisy warning about the orphaned __reservedmem_of_table section. Add the missing stub that fixes the warning. In particular this is needed for compile-testing of NVIDIA Tegra210 memory driver which uses reserved-memory. Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20210610162313.20942-1-digetx@gmail.com Signed-off-by: Rob Herring <robh@kernel.org>
2021-06-21	net: dsa: export the dsa_port_is_{user,cpu,dsa} helpers	Vladimir Oltean	1	-0/+15
	The difference between dsa_is_user_port and dsa_port_is_user is that the former needs to look up the list of ports of the DSA switch tree in order to find the struct dsa_port, while the latter directly receives it as an argument. dsa_is_user_port is already in widespread use and has its place, so there isn't any chance of converting all callers to a single form. But being able to do: dsa_port_is_user(dp) instead of dsa_is_user_port(dp->ds, dp->index) is much more efficient too, especially when the "dp" comes from an iterator over the DSA switch tree - this reduces the complexity from quadratic to linear. Move these helpers from dsa2.c to include/net/dsa.h so that others can use them too. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21	net: sched: add barrier to ensure correct ordering for lockless qdisc	Yunsheng Lin	1	-0/+12
	The spin_trylock() was assumed to contain the implicit barrier needed to ensure the correct ordering between STATE_MISSED setting/clearing and STATE_MISSED checking in commit a90c57f2cedd ("net: sched: fix packet stuck problem for lockless qdisc"). But it turns out that spin_trylock() only has load-acquire semantic, for strongly-ordered system(like x86), the compiler barrier implicitly contained in spin_trylock() seems enough to ensure the correct ordering. But for weakly-orderly system (like arm64), the store-release semantic is needed to ensure the correct ordering as clear_bit() and test_bit() is store operation, see queued_spin_lock(). So add the explicit barrier to ensure the correct ordering for the above case. Fixes: a90c57f2cedd ("net: sched: fix packet stuck problem for lockless qdisc") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21	Merge series "Extend regulator notification support" from Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>:	Mark Brown	50	-102/+481
	Extend regulator notification support This series extends the regulator notification and error flag support. Initial discussion on the topic can be found here: https://lore.kernel.org/lkml/6046836e22b8252983f08d5621c35ececb97820d.camel@fi.rohmeurope.com/ In a nutshell - the series adds: 1. WARNING level events/error flags. (Patch 3) Current regulator 'ERROR' event notifications for over/under voltage, over current and over temperature are used to indicate condition where monitored entity is so badly "off" that it actually indicates a hardware error which can not be recovered. The most typical hanling for that is believed to be a (graceful) system-shutdown. Here we add set of 'WARNING' level flags to allow sending notifications to consumers before things are 'that badly off' so that consumer drivers can implement recovery-actions. 2. Device-tree properties for specifying limit values. (Patches 1, 5) Add limits for above mentioned 'ERROR' and 'WARNING' levels (which send notifications to consumers) and also for a 'PROTECTION' level (which will be used to immediately shut-down the regulator(s) W/O informing consumer drivers. Typically implemented by hardware). Property parsing is implemented in regulator core which then calls callback operations for limit setting from the IC drivers. A warning is emitted if protection is requested by device tree but the underlying IC does not support configuring requested protection. 3. Helpers which can be registered by IC. (Patch 4) Target is to avoid implementing IRQ handling and IRQ storm protection in each IC driver. (Many of the ICs implementin these IRQs do not allow masking or acking the IRQ but keep the IRQ asserted for the whole duration of problem keeping the processor in IRQ handling loop). 4. Emergency poweroff function (refactored out of the thermal_core to kernel/reboot.c) which is called if IC fires error IRQs but IC reading fails and given retry-count is exceeded. (Patches 2, 4) Please note that the mutex in the emergency shutdown was replaced by a simple atomic in order to allow call from any context. The helper was attempted to be done so it could be used to implement roughly same logic as is used in qcom-labibb regulator. This means amongst other things a safety shut-down if IC registers are not readable. Using these shut-down retry counters are optional. The idea is that the helper could be also used by simpler ICs which do not provide status register(s) which can be used to check if error is still active. ICs which do not have such status register can simply omit the 'renable' callback (and retry-counts etc) - and helper assumes the situation is Ok and re-enables IRQ after given time period. If problem persists the handler is ran again and another notification is sent - but at least the delay allows processor to avoid IRQ loop. Patch 7 takes this notification support in use at BD9576MUF. Patch 8 is related to MFD change which is not really related to the RFC here. It was added to this series in order to avoid potential conflicts. Patch 9 adds a maintainers entry. Changelog v10-RESEND: - rebased on v5.13-rc4 Changelog v10: - rebased on v5.13-rc2 - Move rdev_*() print macros to the internal.h and use rdev_dbg() from irq_helpers.c - Export rdev_get_name() and move it from coupler.h to driver.h for others to use. (It was already in coupler.h but not exported - usage was limited and coupler.h does not sound like optimal place as rdev_name is not only used by coupled regulators) - Send all regulator notifications from irq_helpers.c at one OR'd event for the sake of simplicity. For BD9576 this does not matter as it has own IRQ for each event case. Header defining events says they may be OR'd. - Change WARN() at protection shutdown to pr_emerg as suggested by Petr. Changelog v9: - rebases on v5.13-rc1 - Update thermal documentation - Fix regulator notification event number Changelog v8: - split shutdown API adding and thermal core taking it in use to own patches. - replace the spinlock with atomic when ensuring the emergency shutdown is only called once. Changelog v7: general: - rebased on v5.12-rc7 - new patch for refactoring the hw-failure reboot logic out of thermal_core.c for others to use. notification helpers: - fix regulator error_flags query - grammar/typos - do not BUG() but attempt to shut-down the system - use BITS_PER_TYPE() Changelog v6: Add MAINTAINERS entry Changes to IRQ notifiers - move devm functions to drivers/regulator/devres.c - drop irq validity check - use devm_add_action_or_reset() - fix styling issues - fix kerneldocs Changelog v5: - Fix the badly formatted pr_emerg() call. Changelog v4: - rebased on v5.12-rc6 - dropped RFC - fix external FET DT-binding. - improve prints for cases when expecting HW failure. - styling and typos Changelog v3: Regulator core: - Fix dangling pointer access at regulator_irq_helper() stpmic1_regulator: - fix function prototype (compile error) bd9576-regulator: - Update over current limits to what was given in new data-sheet (REV00K) - Allow over-current monitoring without external FET. Set limits to values given in data-sheet (REV00K). Changelog v2: Generic: - rebase on v5.12-rc2 + BD9576 series - Split devm variant of delayed wq to own series Regulator framework: - Provide non devm variant of IRQ notification helpers - shorten dt-property names as suggested by Rob - unconditionally call map_event in IRQ handling and require it to be populated BD9576 regulators: - change the FET resistance property to micro-ohms - fix voltage computation in OC limit setting
2021-06-21	RDMA/mlx5: Enable Relaxed Ordering by default for kernel ULPs	Avihai Horon	1	-0/+8
	Relaxed Ordering is a capability that can only benefit users that support it. All kernel ULPs should support Relaxed Ordering, as they are designed to read data only after observing the CQE and use the DMA API correctly. Hence, implicitly enable Relaxed Ordering by default for MR transfers in kernel ULPs. Link: https://lore.kernel.org/r/b7e820aab7402b8efa63605f4ea465831b3b1e5e.1623236426.git.leonro@nvidia.com Signed-off-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21	skmsg: Improve udp_bpf_recvmsg() accuracy	Cong Wang	1	-2/+0
	I tried to reuse sk_msg_wait_data() for different protocols, but it turns out it can not be simply reused. For example, UDP actually uses two queues to receive skb: udp_sk(sk)->reader_queue and sk->sk_receive_queue. So we have to check both of them to know whether we have received any packet. Also, UDP does not lock the sock during BH Rx path, it makes no sense for its ->recvmsg() to lock the sock. It is always possible for ->recvmsg() to be called before packets actually arrive in the receive queue, we just use best effort to make it accurate here. Fixes: 1f5be6b3b063 ("udp: Implement udp_bpf_recvmsg() for sockmap") Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20210615021342.7416-2-xiyou.wangcong@gmail.com
2021-06-21	btrfs: pass btrfs_inode to btrfs_writepage_endio_finish_ordered()	Qu Wenruo	1	-12/+8
	There is a pretty bad abuse of btrfs_writepage_endio_finish_ordered() in end_compressed_bio_write(). It passes compressed pages to btrfs_writepage_endio_finish_ordered(), which is only supposed to accept inode pages. Thankfully the important info here is the inode, so let's pass btrfs_inode directly into btrfs_writepage_endio_finish_ordered(), and make @page parameter optional. By this, end_compressed_bio_write() can happily pass page=NULL while still getting everything done properly. Also, to cooperate with such modification, replace @page parameter for trace_btrfs_writepage_end_io_hook() with btrfs_inode. Although this removes page_index info, the existing start/len should be enough for most usage. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-06-21	regulator: add property parsing and callbacks to set protection limits	Matti Vaittinen	2	-4/+63
	Add DT property parsing code and setting callback for regulator over/under voltage, over-current and temperature error limits. Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Link: https://lore.kernel.org/r/e7b8007ba9eae7076178bf3363fb942ccb1cc9a5.1622628334.git.matti.vaittinen@fi.rohmeurope.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-21	regulator: IRQ based event/error notification helpers	Matti Vaittinen	1	-0/+135
	Provide helper function for IC's implementing regulator notifications when an IRQ fires. The helper also works for IRQs which can not be acked. Helper can be set to disable the IRQ at handler and then re-enabling it on delayed work later. The helper also adds regulator_get_error_flags() errors in cache for the duration of IRQ disabling. Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/ebdf86d8c22b924667ec2385330e30fcbfac0119.1622628334.git.matti.vaittinen@fi.rohmeurope.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-21	regulator: move rdev_print helpers to internal.h	Matti Vaittinen	2	-5/+10
	The rdev print helpers are a nice way to print messages related to a specific regulator device. Move them from core.c to internal.h As the rdev print helpers use rdev_get_name() export it from core.c. Also move the declaration from coupler.h to driver.h because the rdev name is not just a coupled regulator property. I guess the main audience for rdev_get_name() will be the regulator core and drivers. Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Link: https://lore.kernel.org/r/dc7fd70dc31de4d0e820b7646bb78eeb04f80735.1622628333.git.matti.vaittinen@fi.rohmeurope.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-21	regulator: add warning flags	Matti Vaittinen	1	-0/+14
	Add 'warning' level events and error flags to regulator core. Current regulator core notifications are used to inform consumers about errors where HW is misbehaving in such way it is assumed to be broken/unrecoverable. There are PMICs which are designed for system(s) that may have use for regulator indications sent before HW is damaged so that some board/consumer specific recovery-event can be performed while continuing most of the normal operations. Add new WARNING level events and notifications to be used for that purpose. Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Link: https://lore.kernel.org/r/9b54aa5589ae4b5945d53d114bac3fae55fa4818.1622628333.git.matti.vaittinen@fi.rohmeurope.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-21	reboot: Add hardware protection power-off	Matti Vaittinen	1	-0/+1
	There can be few cases when we need to shut-down the system in order to protect the hardware. Currently this is done at least by the thermal core when temperature raises over certain limit. Some PMICs can also generate interrupts for example for over-current or over-voltage, voltage drops, short-circuit, ... etc. On some systems these are a sign of hardware failure and only thing to do is try to protect the rest of the hardware by shutting down the system. Add shut-down logic which can be used by all subsystems instead of implementing the shutdown in each subsystem. The logic is stolen from thermal_core with difference of using atomic_t instead of a mutex in order to allow calls directly from IRQ context and changing the WARN() to pr_emerg() as discussed here: https://lore.kernel.org/lkml/YJuPwAZroVZ%2Fw633@alley/ and here: https://lore.kernel.org/linux-iommu/20210331093104.383705-4-geert+renesas@glider.be/ Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/e83ec1ca9408f90c857ea9dcdc57b14d9037b03f.1622628333.git.matti.vaittinen@fi.rohmeurope.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-21	ASoC: soc-core: remove snd_soc_of_parse_daifmt()	Kuninori Morimoto	1	-4/+0
	No driver is using snd_soc_of_parse_daifmt(). This patch removes it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://lore.kernel.org/r/87zgvtuuro.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-21	ASoC: soc-core: add snd_soc_daifmt_parse_format/clock_provider()	Kuninori Morimoto	1	-0/+14
	snd_soc_of_parse_daifmt() parses daifmt, but bitclock/frame provider parsing part is one of headacke, because we are assuming below both cases. A) node { bitclock-master; frame-master; ... }; B) link { bitclock-master = <&xxx>; frame-master = <&xxx>; ... }; The original was style A), and style B) was added later by commit b3ca11ff59bc ("ASoC: simple-card: Move dai-link level properties away from dai subnodes"). snd_soc_of_parse_daifmt() parses it as style A), and user need to update it to style B) if needed. To handle it more flexibile, this patch adds new functions which separates snd_soc_of_parse_daifmt() helper function. snd_soc_daifmt_parse_format() :for DAI format snd_soc_daifmt_parse_clock_provider_as_flag() :for style A) snd_soc_daifmt_parse_clock_provider_as_phandl() :for style B) snd_soc_daifmt_parse_clock_provider_as_bitmap() :use with _from_bitmap This means snd_soc_of_parse_daifmt() == snd_soc_daifmt_parse_format() \| snd_soc_daifmt_parse_clock_provider_as_flag() This patch also indicate relatesionship comment for snd_soc_daifmt_clock_provider_from_bitmap(). Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://lore.kernel.org/r/877dixw9ej.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-21	ASoC: soc-core: add snd_soc_daifmt_clock_provider_fliped()	Kuninori Morimoto	1	-0/+2
	Sometimes we want to get CLOCK_PROVIDER fliped dai_fmt. This patch adds new snd_soc_daifmt_clock_provider_fliped() for it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://lore.kernel.org/r/878s3dw9ex.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-21	ASoC: soc-core: add snd_soc_daifmt_clock_provider_from_bitmap()	Kuninori Morimoto	1	-0/+1
	This patch adds snd_soc_daifmt_clock_provider_from_bitmap() function to judge clock/frame master from its bitmap. This is prepare for snd_soc_of_parse_daifmt() cleanup. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://lore.kernel.org/r/87a6ntw9f5.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-21	soundwire: export sdw_update() and sdw_update_no_pm()	Pierre-Louis Bossart	1	-0/+3
	We currently export sdw_read() and sdw_write() but the sdw_update() and sdw_update_no_pm() are currently available only to the bus code. This was missed in an earlier contribution. Export both functions so that codec drivers can perform read-modify-write operations without duplicating the code. Fixes: b04c975e654c ('soundwire: bus: use sdw_update_no_pm when initializing a device') Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <bard.liao@intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Acked-By: Vinod Koul <vkoul@kernel.org> Link: https://lore.kernel.org/r/20210614180815.153711-2-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-21	printk: Remove trailing semicolon in macros	Huilong Deng	1	-1/+1
	Macros should not use a trailing semicolon. Signed-off-by: Huilong Deng <denghuilong@cdjrlc.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
2021-06-21	Merge tag 'v5.13-rc7' into usb-next	Greg Kroah-Hartman	15	-18/+73
	We need the USB fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-21	drm/i915: Document the Virtual Engine uAPI	Tvrtko Ursulin	1	-0/+188
	A little bit of documentation covering the topics of engine discovery, context engine maps and virtual engines. It is not very detailed but supposed to be a starting point of giving a brief high level overview of general principles and intended use cases. v2: * Have the text in uapi header and link from there. v4: * Link from driver-uapi.rst. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210618150036.2507653-1-tvrtko.ursulin@linux.intel.com
2021-06-21	xfrm: replay: remove last replay indirection	Florian Westphal	1	-7/+1
	This replaces the overflow indirection with the new xfrm_replay_overflow helper. After this, the 'repl' pointer in xfrm_state is no longer needed and can be removed as well. xfrm_replay_overflow() is added in two incarnations, one is used when the kernel is compiled with xfrm hardware offload support enabled, the other when its disabled. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-06-21	xfrm: replay: avoid replay indirection	Florian Westphal	1	-3/+1
	Add and use xfrm_replay_check helper instead of indirection. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-06-21	xfrm: replay: remove recheck indirection	Florian Westphal	1	-3/+1
	Adds new xfrm_replay_recheck() helper and calls it from xfrm input path instead of the indirection. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-06-21	xfrm: replay: remove advance indirection	Florian Westphal	1	-1/+1
	Similar to other patches: add a new helper to avoid an indirection. v2: fix 'net/xfrm/xfrm_replay.c:519:13: warning: 'seq' may be used uninitialized in this function' warning. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-06-21	xfrm: replay: avoid xfrm replay notify indirection	Florian Westphal	1	-1/+10
	replay protection is implemented using a callback structure and then called via x->repl->notify(), x->repl->recheck(), and so on. all the differect functions are always built-in, so this could be direct calls instead. This first patch prepares for removal of the x->repl structure. Add an enum with the three available replay modes to the xfrm_state structure and then replace all x->repl->notify() calls by the new xfrm_replay_notify() helper. The helper checks the enum internally to adapt behaviour as needed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-06-21	watchdog: Remove MV64x60 watchdog driver	Christophe Leroy	1	-8/+0
	Commit 92c8c16f3457 ("powerpc/embedded6xx: Remove C2K board support") removed the last selector of CONFIG_MV64X60. Therefore CONFIG_MV64X60_WDT cannot be selected anymore and can be removed. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/9c2952bcfaec3b1789909eaa36bbce2afbfab7ab.1616085654.git.christophe.leroy@csgroup.eu Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2021-06-20	cifs: decoding negTokenInit with generic ASN1 decoder	Hyunchul Lee	1	-0/+8
	Decode negTokenInit with lib/asn1_decoder. For that, add OIDs in linux/oid_registry.h and a negTokenInit ASN1 file, "spnego_negtokeninit.asn1". And define decoder's callback functions, which are the gssapi_this_mech for checking SPENGO oid and the neg_token_init_mech_type for getting authentication mechanisms supported by a server. Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-06-20	memory: tegra: Add compile-test stub for tegra_mc_probe_device()	Thierry Reding	1	-2/+7
	The tegra_mc_probe_device() symbol is only available when the TEGRA_MC Kconfig option is enabled. Provide a stub if that's not the case so that the driver can be compile-tested. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20210618111846.1286166-1-thierry.reding@gmail.com Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
2021-06-20	soundwire: bus: Make sdw_nwrite() data pointer argument const	Richard Fitzgerald	1	-1/+1
	Idiomatically, write functions should take const pointers to the data buffer, as they don't change the data. They are also likely to be called from functions that receive a const data pointer. Internally the pointer is passed to function/structs shared with the read functions, requiring a cast, but this is an implementation detail that should be hidden by the public API. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://lore.kernel.org/r/20210616145901.29402-1-rf@opensource.cirrus.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-06-18	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	43	-80/+279
	Trivial conflicts in net/can/isotp.c and tools/testing/selftests/net/mptcp/mptcp_connect.sh scaled_ppm_to_ppb() was moved from drivers/ptp/ptp_clock.c to include/linux/ptp_clock_kernel.h in -next so re-apply the fix there. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-06-18	Merge tag 'net-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Linus Torvalds	8	-10/+42
	Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.13-rc7, including fixes from wireless, bpf, bluetooth, netfilter and can. Current release - regressions: - mlxsw: spectrum_qdisc: Pass handle, not band number to find_class() to fix modifying offloaded qdiscs - lantiq: net: fix duplicated skb in rx descriptor ring - rtnetlink: fix regression in bridge VLAN configuration, empty info is not an error, bot-generated "fix" was not needed - libbpf: s/rx/tx/ typo on umem->rx_ring_setup_done to fix umem creation Current release - new code bugs: - ethtool: fix NULL pointer dereference during module EEPROM dump via the new netlink API - mlx5e: don't update netdev RQs with PTP-RQ, the special purpose queue should not be visible to the stack - mlx5e: select special PTP queue only for SKBTX_HW_TSTAMP skbs - mlx5e: verify dev is present in get devlink port ndo, avoid a panic Previous releases - regressions: - neighbour: allow NUD_NOARP entries to be force GCed - further fixes for fallout from reorg of WiFi locking (staging: rtl8723bs, mac80211, cfg80211) - skbuff: fix incorrect msg_zerocopy copy notifications - mac80211: fix NULL ptr deref for injected rate info - Revert "net/mlx5: Arm only EQs with EQEs" it may cause missed IRQs Previous releases - always broken: - bpf: more speculative execution fixes - netfilter: nft_fib_ipv6: skip ipv6 packets from any to link-local - udp: fix race between close() and udp_abort() resulting in a panic - fix out of bounds when parsing TCP options before packets are validated (in netfilter: synproxy, tc: sch_cake and mptcp) - mptcp: improve operation under memory pressure, add missing wake-ups - mptcp: fix double-lock/soft lookup in subflow_error_report() - bridge: fix races (null pointer deref and UAF) in vlan tunnel egress - ena: fix DMA mapping function issues in XDP - rds: fix memory leak in rds_recvmsg Misc: - vrf: allow larger MTUs - icmp: don't send out ICMP messages with a source address of 0.0.0.0 - cdc_ncm: switch to eth%d interface naming" * tag 'net-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (139 commits) net: ethernet: fix potential use-after-free in ec_bhf_remove selftests/net: Add icmp.sh for testing ICMP dummy address responses icmp: don't send out ICMP messages with a source address of 0.0.0.0 net: ll_temac: Avoid ndo_start_xmit returning NETDEV_TX_BUSY net: ll_temac: Fix TX BD buffer overwrite net: ll_temac: Add memory-barriers for TX BD access net: ll_temac: Make sure to free skb when it is completely used MAINTAINERS: add Guvenc as SMC maintainer bnxt_en: Call bnxt_ethtool_free() in bnxt_init_one() error path bnxt_en: Fix TQM fastpath ring backing store computation bnxt_en: Rediscover PHY capabilities after firmware reset cxgb4: fix wrong shift. mac80211: handle various extensible elements correctly mac80211: reset profile_periodicity/ema_ap cfg80211: avoid double free of PMSR request cfg80211: make certificate generation more robust mac80211: minstrel_ht: fix sample time check net: qed: Fix memcpy() overflow of qed_dcbx_params() net: cdc_eem: fix tx fixup skb leak net: hamradio: fix memory leak in mkiss_close ...
2021-06-18	net: wwan: Allow WWAN drivers to provide blocking tx and poll function	Stephan Gerhold	1	-2/+11
	At the moment, the WWAN core provides wwan_port_txon/off() to implement blocking writes. The tx() port operation should not block, instead wwan_port_txon/off() should be called when the TX queue is full or has free space again. However, in some cases it is not straightforward to make use of that functionality. For example, the RPMSG API used by rpmsg_wwan_ctrl.c does not provide any way to be notified when the TX queue has space again. Instead, it only provides the following operations: - rpmsg_send(): blocking write (wait until there is space) - rpmsg_trysend(): non-blocking write (return error if no space) - rpmsg_poll(): set poll flags depending on TX queue state Generally that's totally sufficient for implementing a char device, but it does not fit well to the currently provided WWAN port ops. Most of the time, using the non-blocking rpmsg_trysend() in the WWAN tx() port operation works just fine. However, with high-frequent writes to the char device it is possible to trigger a situation where this causes issues. For example, consider the following (somewhat unrealistic) example: # dd if=/dev/zero bs=1000 of=/dev/wwan0qmi0 dd: error writing '/dev/wwan0qmi0': Resource temporarily unavailable 1+0 records out This fails immediately after writing the first record. It's likely only a matter of time until this triggers issues for some real application (e.g. ModemManager sending a lot of large QMI packets). The rpmsg_char device does not have this problem, because it uses rpmsg_trysend() and rpmsg_poll() to support non-blocking operations. Make it possible to use the same in the RPMSG WWAN driver by adding two new optional wwan_port_ops: - tx_blocking(): send data blocking if allowed - tx_poll(): set additional TX poll flags This integrates nicely with the RPMSG API and does not require any change in existing WWAN drivers. With these changes, the dd example above blocks instead of exiting with an error. Cc: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18	rpmsg: core: Add driver_data for rpmsg_device_id	Stephan Gerhold	1	-0/+1
	Most device_id structs provide a driver_data field that can be used by drivers to associate data more easily for a particular device ID. Add the same for the rpmsg_device_id. Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18	Revert "net: add pf_family_names[] for protocol family"	David S. Miller	1	-48/+0
	This reverts commit 1f3c98eaddec857e16a7a1c6cd83317b3dc89438. Does not build... Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18	net: add pf_family_names[] for protocol family	Yejune Deng	1	-0/+48
	Modify the pr_info content from int to char *, this looks more readable. Signed-off-by: Yejune Deng <yejune.deng@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18	icmp: don't send out ICMP messages with a source address of 0.0.0.0	Toke Høiland-Jørgensen	1	-0/+3
	When constructing ICMP response messages, the kernel will try to pick a suitable source address for the outgoing packet. However, if no IPv4 addresses are configured on the system at all, this will fail and we end up producing an ICMP message with a source address of 0.0.0.0. This can happen on a box routing IPv4 traffic via v6 nexthops, for instance. Since 0.0.0.0 is not generally routable on the internet, there's a good chance that such ICMP messages will never make it back to the sender of the original packet that the ICMP message was sent in response to. This, in turn, can create connectivity and PMTUd problems for senders. Fortunately, RFC7600 reserves a dummy address to be used as a source for ICMP messages (192.0.0.8/32), so let's teach the kernel to substitute that address as a last resort if the regular source address selection procedure fails. Below is a quick example reproducing this issue with network namespaces: ip netns add ns0 ip l add type veth peer netns ns0 ip l set dev veth0 up ip a add 10.0.0.1/24 dev veth0 ip a add fc00:dead:cafe:42::1/64 dev veth0 ip r add 10.1.0.0/24 via inet6 fc00:dead:cafe:42::2 ip -n ns0 l set dev veth0 up ip -n ns0 a add fc00:dead:cafe:42::2/64 dev veth0 ip -n ns0 r add 10.0.0.0/24 via inet6 fc00:dead:cafe:42::1 ip netns exec ns0 sysctl -w net.ipv4.icmp_ratelimit=0 ip netns exec ns0 sysctl -w net.ipv4.ip_forward=1 tcpdump -tpni veth0 -c 2 icmp & ping -w 1 10.1.0.1 > /dev/null tcpdump: verbose output suppressed, use -v[v]... for full protocol decode listening on veth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes IP 10.0.0.1 > 10.1.0.1: ICMP echo request, id 29, seq 1, length 64 IP 0.0.0.0 > 10.0.0.1: ICMP net 10.1.0.1 unreachable, length 92 2 packets captured 2 packets received by filter 0 packets dropped by kernel With this patch the above capture changes to: IP 10.0.0.1 > 10.1.0.1: ICMP echo request, id 31127, seq 1, length 64 IP 192.0.0.8 > 10.0.0.1: ICMP net 10.1.0.1 unreachable, length 92 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Juliusz Chroboczek <jch@irif.fr> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18	mptcp: dump csum fields in mptcp_dump_mpext	Geliang Tang	1	-6/+11
	In mptcp_dump_mpext, dump the csum fields, csum and csum_reqd in struct mptcp_dump_mpext too. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18	mptcp: receive checksum for MP_CAPABLE with data	Geliang Tang	1	-1/+2
	This patch added a new member named csum in struct mptcp_options_received. When parsing the MP_CAPABLE with data, if the checksum is enabled, adjust the expected_opsize. If the receiving option length matches the length with the data checksum, get the checksum value and save it in mp_opt->csum. And in mptcp_incoming_options, pass it to mpext->csum. We always parse any csum/nocsum combination and delay the presence check to later code, to allow reset if missing. Additionally, in the TX path, use the newly introduce ext field to avoid MPTCP csum recomputation on TCP retransmission and unneeded csum update on when setting the data fin_flag. Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18	mptcp: add csum_reqd in mptcp_out_options	Geliang Tang	1	-2/+3
	This patch added a new member csum_reqd in struct mptcp_out_options and struct mptcp_subflow_request_sock. Initialized it with the helper function mptcp_is_checksum_enabled(). In mptcp_write_options, if this field is enabled, send out the MP_CAPABLE suboption with the MPTCP_CAP_CHECKSUM_REQD flag. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18	mptcp: generate the data checksum	Geliang Tang	1	-0/+1
	This patch added a new member named csum in struct mptcp_ext, implemented a new function named mptcp_generate_data_checksum(). Generate the data checksum in mptcp_sendmsg_frag, save it in mpext->csum. Note that we must generate the csum for zero window probe, too. Do the csum update incrementally, to avoid multiple csum computation when the data is appended to existing skb. Note that in a later patch we will skip unneeded csum related operation. Changes not included here to keep the delta small. Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18	mptcp: add csum_enabled in mptcp_sock	Geliang Tang	1	-0/+1
	This patch added a new member named csum_enabled in struct mptcp_sock, used a dummy mptcp_is_checksum_enabled() helper to initialize it. Also added a new member named mptcpi_csum_enabled in struct mptcp_info to expose the csum_enabled flag. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18	seg6: add support for SRv6 End.DT46 Behavior	Andrea Mayer	1	-0/+2
	IETF RFC 8986 [1] includes the definition of SRv6 End.DT4, End.DT6, and End.DT46 Behaviors. The current SRv6 code in the Linux kernel only implements End.DT4 and End.DT6 which can be used respectively to support IPv4-in-IPv6 and IPv6-in-IPv6 VPNs. With End.DT4 and End.DT6 it is not possible to create a single SRv6 VPN tunnel to carry both IPv4 and IPv6 traffic. The proposed End.DT46 implementation is meant to support the decapsulation of IPv4 and IPv6 traffic coming from a single SRv6 tunnel. The implementation of the SRv6 End.DT46 Behavior in the Linux kernel greatly simplifies the setup and operations of SRv6 VPNs. The SRv6 End.DT46 Behavior leverages the infrastructure of SRv6 End.DT{4,6} Behaviors implemented so far, because it makes use of a VRF device in order to force the routing lookup into the associated routing table. To make the End.DT46 work properly, it must be guaranteed that the routing table used for routing lookup operations is bound to one and only one VRF during the tunnel creation. Such constraint has to be enforced by enabling the VRF strict_mode sysctl parameter, i.e.: $ sysctl -wq net.vrf.strict_mode=1 Note that the same approach is used for the SRv6 End.DT4 Behavior and for the End.DT6 Behavior in VRF mode. The command used to instantiate an SRv6 End.DT46 Behavior is straightforward, i.e.: $ ip -6 route add 2001:db8::1 encap seg6local action End.DT46 vrftable 100 dev vrf100. [1] https://www.rfc-editor.org/rfc/rfc8986.html#name-enddt46-decapsulation-and-s ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Performance and impact of SRv6 End.DT46 Behavior on the SRv6 Networking ======================================================================= This patch aims to add the SRv6 End.DT46 Behavior with minimal impact on the performance of SRv6 End.DT4 and End.DT6 Behaviors. In order to verify this, we tested the performance of the newly introduced SRv6 End.DT46 Behavior and compared it with the performance of SRv6 End.DT{4,6} Behaviors, considering both the patched kernel and the kernel before applying the End.DT46 patch (referred to as vanilla kernel). In details, the following decapsulation scenarios were considered: 1.a) IPv6 traffic in SRv6 End.DT46 Behavior on patched kernel; 1.b) IPv4 traffic in SRv6 End.DT46 Behavior on patched kernel; 2.a) SRv6 End.DT6 Behavior (VRF mode) on patched kernel; 2.b) SRv6 End.DT4 Behavior on patched kernel; 3.a) SRv6 End.DT6 Behavior (VRF mode) on vanilla kernel (without the End.DT46 patch); 3.b) SRv6 End.DT4 Behavior on vanilla kernel (without the End.DT46 patch). All tests were performed on a testbed deployed on the CloudLab [2] facilities. We considered IPv{4,6} traffic handled by a single core (at 2.4 GHz on a Xeon(R) CPU E5-2630 v3) on kernel 5.13-rc1 using packets of size ~ 100 bytes. Scenario (1.a): average 684.70 kpps; std. dev. 0.7 kpps; Scenario (1.b): average 711.69 kpps; std. dev. 1.2 kpps; Scenario (2.a): average 690.70 kpps; std. dev. 1.2 kpps; Scenario (2.b): average 722.22 kpps; std. dev. 1.7 kpps; Scenario (3.a): average 690.02 kpps; std. dev. 2.6 kpps; Scenario (3.b): average 721.91 kpps; std. dev. 1.2 kpps; Considering the results for the patched kernel (1.a, 1.b, 2.a, 2.b) we observe that the performance degradation incurred in using End.DT46 rather than End.DT6 and End.DT4 respectively for IPv6 and IPv4 traffic is minimal, around 0.9% and 1.5%. Such very minimal performance degradation is the price to be paid if one prefers to use a single tunnel capable of handling both types of traffic (IPv4 and IPv6). Comparing the results for End.DT4 and End.DT6 under the patched and the vanilla kernel (2.a, 2.b, 3.a, 3.b) we observe that the introduction of the End.DT46 patch has no impact on the performance of End.DT4 and End.DT6. [2] https://www.cloudlab.us Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18	Merge tag 'pm-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm	Linus Torvalds	1	-1/+0
	Pull power management fix from Rafael Wysocki: "Remove recently added frequency invariance support from the CPPC cpufreq driver, because it has turned out to be problematic and it cannot be fixed properly on time for 5.13 (Viresh Kumar)" * tag 'pm-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Revert "cpufreq: CPPC: Add support for frequency invariance"
2021-06-18	xsk: Fix missing validation for skb and unaligned mode	Magnus Karlsson	1	-2/+7
	Fix a missing validation of a Tx descriptor when executing in skb mode and the umem is in unaligned mode. A descriptor could point to a buffer straddling the end of the umem, thus effectively tricking the kernel to read outside the allowed umem region. This could lead to a kernel crash if that part of memory is not mapped. In zero-copy mode, the descriptor validation code rejects such descriptors by checking a bit in the DMA address that tells us if the next page is physically contiguous or not. For the last page in the umem, this bit is not set, therefore any descriptor pointing to a packet straddling this last page boundary will be rejected. However, the skb path does not use this bit since it copies out data and can do so to two different pages. (It also does not have the array of DMA address, so it cannot even store this bit.) The code just returned that the packet is always physically contiguous. But this is unfortunately also returned for the last page in the umem, which means that packets that cross the end of the umem are being allowed, which they should not be. Fix this by introducing a check for this in the SKB path only, not penalizing the zero-copy path. Fixes: 2b43470add8c ("xsk: Introduce AF_XDP buffer allocation API") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/bpf/20210617092255.3487-1-magnus.karlsson@gmail.com
2021-06-18	blk-mq: fix an IS_ERR() vs NULL bug	Dan Carpenter	1	-1/+1
	The __blk_mq_alloc_disk() function doesn't return NULLs it returns error pointers. Fixes: b461dfc49eb6 ("blk-mq: add the blk_mq_alloc_disk APIs") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/YMyjci35WBqrtqG+@mwanda Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-18	Merge tag 'devm-helpers-v5.14-1' into review-hans	Hans de Goede	1	-0/+25
	Signed tag for the immutable devm-helpers branch for merging into the extcon and pdx86 trees.
2021-06-18	netfilter: conntrack: pass hook state to log functions	Florian Westphal	1	-8/+12
	The packet logger backend is unable to provide the incoming (or outgoing) interface name because that information isn't available. Pass the hook state, it contains the network namespace, the protocol family, the network interfaces and other things. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-06-18	habanalabs: added open_stats info ioctl	Yuri Nudelman	1	-0/+12
	In a system with multiple ASICs, there is a need to provide monitoring tools with information on how long a device was opened and how many times a device was opened. Therefore, we add a new opcode to the INFO ioctl to provide that information. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-06-18	habanalabs: add debug flag to prevent failure on timeout	Yuri Nudelman	1	-0/+1
	Sometimes it is useful to allow the command to continue running despite the timeout occurred, to differentiate between really stuck or just very time consuming commands. This can be achieved by passing a new debug flag alongside the cs, HL_CS_FLAGS_SKIP_RESET_ON_TIMEOUT. Anyway, if the timeout occurred, a warning print shall be issued, however this shall not fail the submission. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-06-18	sched: Change task_struct::state	Peter Zijlstra	3	-18/+17
	Change the type and name of task_struct::state. Drop the volatile and shrink it to an 'unsigned int'. Rename it in order to find all uses such that we can use READ_ONCE/WRITE_ONCE as appropriate. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Acked-by: Will Deacon <will@kernel.org> Acked-by: Daniel Thompson <daniel.thompson@linaro.org> Link: https://lore.kernel.org/r/20210611082838.550736351@infradead.org