xsk: Publish global consumer pointers when NAPI is finished - linux-dev - Linux kernel development work

author	Magnus Karlsson <magnus.karlsson@intel.com>	2020-02-10 16:27:12 +0100
committer	Daniel Borkmann <daniel@iogearbox.net>	2020-02-11 15:51:11 +0100
commit	30744a68626db6a0029aca9c646831c869c16d83 (patch)
tree	25971a1b9cdf51047e40baae29dee252a40c49af /tools/testing/selftests
parent	bpf: Make btf_check_func_type_match() static (diff)

xsk: Publish global consumer pointers when NAPI is finished

Commit 4b638f13bab4 ("xsk: Eliminate the RX batch size") introduced a much lazier way of updating the global consumer pointers from the kernel side: they are only updated when the kernel runs out of entries in the fill or Tx rings (the rings consumed by the kernel). This can result in a deadlock with the user application if the kernel requires more than one entry to proceed and the application cannot put these entries in the fill ring, because the kernel has not updated the global consumer pointer since the ring is not empty.

Fix this by publishing the local kernel side consumer pointer whenever we have completed Rx or Tx processing in the kernel. This way, user space will have an up-to-date view of the consumer pointers whenever it gets to execute in the one core case (application and driver on the same core), or after a certain number of packets have been processed in the two core case (application and driver on different cores).

A side effect of this patch is that the one core case gets better performance, but the two core case gets worse. The one core case improves because updating the global consumer pointer is relatively cheap there: the application by definition is not running when the kernel is (they are on the same core), and it is beneficial for the application, once it gets to run, to have pointers that are as up to date as possible, since it can then operate on more packets and buffers. In the two core case, the most important performance aspect is to minimize the number of accesses to the global pointers, since they are shared between two cores and bounce between the caches of those cores. This patch results in more updates to global state, which means lower performance in the two core case.

Fixes: 4b638f13bab4 ("xsk: Eliminate the RX batch size")
Reported-by: Ryan Goodfellow <rgoodfel@isi.edu>
Reported-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Link: https://lore.kernel.org/bpf/1581348432-6747-1-git-send-email-magnus.karlsson@intel.com
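
The deadlock described in the commit message can be illustrated with a small single-threaded model of a fill ring. This is only a sketch under stated assumptions, not the kernel code: `struct model_ring`, `RING_SIZE`, and every function name below are hypothetical, and the memory ordering a real shared ring needs is left out. The kernel advances a private `cached_consumer` for each entry it takes, while the application can only see the global `consumer`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 4u /* hypothetical, deliberately tiny */

/* Hypothetical model of a fill ring; not the kernel's queue structure. */
struct model_ring {
    uint32_t producer;        /* global: advanced by the application */
    uint32_t consumer;        /* global: published by the kernel */
    uint32_t cached_consumer; /* kernel-local: advanced per consumed entry */
};

/* Application: free slots as seen through the global pointers only. */
static uint32_t app_free_slots(const struct model_ring *r)
{
    return RING_SIZE - (r->producer - r->consumer);
}

/* Application: post n buffers, or fail if its view shows too little room. */
static bool app_produce(struct model_ring *r, uint32_t n)
{
    if (app_free_slots(r) < n)
        return false;
    r->producer += n;
    return true;
}

/* Kernel: entries not yet consumed, based on its local consumer pointer. */
static uint32_t kernel_available(const struct model_ring *r)
{
    return r->producer - r->cached_consumer;
}

/* Kernel: copy the local consumer pointer to the global one. */
static void kernel_publish(struct model_ring *r)
{
    r->consumer = r->cached_consumer;
}

/*
 * Kernel: consume n entries at once. The lazy policy modelled here
 * publishes the consumer pointer only when the ring has run empty.
 */
static bool kernel_consume(struct model_ring *r, uint32_t n)
{
    if (kernel_available(r) == 0)
        kernel_publish(r);
    if (kernel_available(r) < n)
        return false;
    r->cached_consumer += n;
    return true;
}

/*
 * Scenario: the application fills the ring, the kernel consumes three
 * entries and then needs two more while only one is left. Returns true
 * if both sides can make progress afterwards.
 */
static bool scenario_makes_progress(bool publish_when_done)
{
    struct model_ring r = {0};

    assert(app_produce(&r, RING_SIZE)); /* ring is now full */
    assert(kernel_consume(&r, 3));      /* local consumer = 3, global = 0 */

    if (publish_when_done)
        kernel_publish(&r);             /* the behaviour this patch adds */

    if (kernel_consume(&r, 2))          /* only one entry left: must fail */
        return false;

    /* The application tries to refill so the kernel can continue. */
    return app_produce(&r, 1) && kernel_consume(&r, 2);
}
```

In this model, `scenario_makes_progress(false)` returns false: the ring is not empty, so the lazy policy never publishes, and the application still sees a full ring. `scenario_makes_progress(true)` returns true, because the application sees three free slots as soon as the kernel finishes its processing pass.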

Diffstat (limited to 'tools/testing/selftests')

0 files changed, 0 insertions, 0 deletions

