aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net/netdevsim (follow)
AgeCommit message (Collapse)AuthorFilesLines
2020-08-23treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva2-4/+4
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2-5/+1
Daniel Borkmann says: ==================== pull-request: bpf-next 2020-08-04 The following pull-request contains BPF updates for your *net-next* tree. We've added 73 non-merge commits during the last 9 day(s) which contain a total of 135 files changed, 4603 insertions(+), 1013 deletions(-). The main changes are: 1) Implement bpf_link support for XDP. Also add LINK_DETACH operation for the BPF syscall allowing processes with BPF link FD to force-detach, from Andrii Nakryiko. 2) Add BPF iterator for map elements and to iterate all BPF programs for efficient in-kernel inspection, from Yonghong Song and Alexei Starovoitov. 3) Separate bpf_get_{stack,stackid}() helpers for perf events in BPF to avoid unwinder errors, from Song Liu. 4) Allow cgroup local storage map to be shared between programs on the same cgroup. Also extend BPF selftests with coverage, from YiFei Zhu. 5) Add BPF exception tables to ARM64 JIT in order to be able to JIT BPF_PROBE_MEM load instructions, from Jean-Philippe Brucker. 6) Follow-up fixes on BPF socket lookup in combination with reuseport group handling. Also add related BPF selftests, from Jakub Sitnicki. 7) Allow to use socket storage in BPF_PROG_TYPE_CGROUP_SOCK-typed programs for socket create/release as well as bind functions, from Stanislav Fomichev. 8) Fix an info leak in xsk_getsockopt() when retrieving XDP stats via old struct xdp_statistics, from Peilin Ye. 9) Fix PT_REGS_RC{,_CORE}() macros in libbpf for MIPS arch, from Jerry Crunchtime. 10) Extend BPF kernel test infra with skb->family and skb->{local,remote}_ip{4,6} fields and allow user space to specify skb->dev via ifindex, from Dmitry Yakunin. 11) Fix a bpftool segfault due to missing program type name and make it more robust to prevent them in future gaps, from Quentin Monnet. 12) Consolidate cgroup helper functions across selftests and fix a v6 localhost resolver issue, from John Fastabend. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03devlink: Pass extack when setting trap's action and group's parametersIdo Schimmel1-2/+4
A later patch will refuse to set the action of certain traps in mlxsw and also to change the policer binding of certain groups. Pass extack so that failure could be communicated clearly to user space. Reviewed-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-25bpf, xdp: Remove XDP_QUERY_PROG and XDP_QUERY_PROG_HW XDP commandsAndrii Nakryiko2-5/+1
Now that BPF program/link management is centralized in generic net_device code, kernel code never queries program id from drivers, so XDP_QUERY_PROG/XDP_QUERY_PROG_HW commands are unnecessary. This patch removes all the implementations of those commands in kernel, along the xdp_attachment_query(). This patch was compile-tested on allyesconfig. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200722064603.3350758-10-andriin@fb.com
2020-07-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-1/+1
The UDP reuseport conflict was a little bit tricky. The net-next code, via bpf-next, extracted the reuseport handling into a helper so that the BPF sk lookup code could invoke it. At the same time, the logic for reuseport handling of unconnected sockets changed via commit efc6b6f6c3113e8b203b9debfb72d81e0f3dcace which changed the logic to carry on the reuseport result into the rest of the lookup loop if we do not return immediately. This requires moving the reuseport_has_conns() logic into the callers. While we are here, get rid of inline directives as they do not belong in foo.c files. The other changes were cases of more straightforward overlapping modifications. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21netdevsim: fix unbalaced locking in nsim_create()Taehee Yoo1-2/+2
In the nsim_create(), rtnl_lock() is called before nsim_bpf_init(). If nsim_bpf_init() is failed, rtnl_unlock() should be called, but it isn't called. So, unbalanced locking would occur. Fixes: e05b2d141fef ("netdevsim: move netdev creation/destruction to dev probe") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10netdevsim: add UDP tunnel port offload supportJakub Kicinski5-2/+224
Add UDP tunnel port handlers to our fake driver so we can test the core infra. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09devlink: Replace devlink_port_attrs_set parameters with a structDanielle Ratson1-4/+6
Currently, devlink_port_attrs_set accepts a long list of parameters, that most of them are devlink port's attributes. Use the devlink_port_attrs struct to replace the relevant parameters. Signed-off-by: Danielle Ratson <danieller@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01netdevsim: Register control trapsIdo Schimmel1-0/+7
Register two control traps with devlink. The existing selftest at tools/testing/selftests/drivers/net/netdevsim/devlink_trap.sh iterates over all registered traps and checks that the action of non-drop traps cannot be changed. Up until now only exception traps were tested, now control traps will be tested as well. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01netdevsim: Move layer 3 exceptions to exceptions trap groupIdo Schimmel1-1/+2
The layer 3 exceptions are still subject to the same trap policer, so nothing changes, but user space can choose to assign a different one. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22netdevsim: Ensure policer drop counter always increasesIdo Schimmel1-2/+1
In case the policer drop counter is retrieved when the jiffies value is a multiple of 64, the counter will not be incremented. This randomly breaks a selftest [1] the reads the counter twice and checks that it was incremented: ``` TEST: Trap policer [FAIL] Policer drop counter was not incremented ``` Fix by always incrementing the counter by 1. [1] tools/testing/selftests/drivers/net/netdevsim/devlink_trap.sh Fixes: ad188458d012 ("netdevsim: Add devlink-trap policer support") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30netdevsim: dev: Fix memory leak in nsim_dev_take_snapshot_writeGustavo A. R. Silva1-0/+1
In case memory resources for dummy_data were allocated, release them before return. Addresses-Coverity-ID: 1491997 ("Resource leak") Fixes: 7ef19d3b1d5e ("devlink: report error once U32_MAX snapshot ids have been used") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30netdevsim: Add support for setting of packet trap group parametersIdo Schimmel2-0/+18
Add a dummy callback to set trap group parameters. Return an error when the 'fail_trap_group_set' debugfs file is set in order to exercise error paths and verify that error is propagated to user space when should. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30devlink: Add packet trap group parameters supportIdo Schimmel1-4/+4
Packet trap groups are used to aggregate logically related packet traps. Currently, these groups allow user space to batch operations such as setting the trap action of all member traps. In order to prevent the CPU from being overwhelmed by too many trapped packets, it is desirable to bind a packet trap policer to these groups. For example, to limit all the packets that encountered an exception during routing to 10Kpps. Allow device drivers to bind default packet trap policers to packet trap groups when the latter are registered with devlink. The next patch will enable user space to change this default binding. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30netdevsim: Add devlink-trap policer supportIdo Schimmel2-1/+86
Register three dummy packet trap policers with devlink and implement callbacks to change their parameters and read their counters. This will be used later on in the series to test the devlink-trap policer infrastructure. v2: * Remove check about burst size being a power of 2 and instead add a debugfs knob to fail the operation * Provide max/min rate/burst size when registering policers and remove the validity checks from nsim_dev_devlink_trap_policer_set() Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30devlink: Implicitly set auto recover flag when registering health reporterEran Ben Elisha1-2/+2
When health reporter is registered to devlink, devlink will implicitly set auto recover if and only if the reporter has a recover method. No reason to explicitly get the auto recover flag from the driver. Remove this flag from all drivers that called devlink_health_reporter_create. All existing health reporters set auto recovery to true if they have a recover method. Yet, administrator can unset auto recover via netlink command as prior to this patch. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30netdevsim: Change dummy reporter auto recover defaultEran Ben Elisha1-1/+1
Health reporters should be registered with auto recover set to true. Align dummy reporter behaviour with that, as in later patch the option to set auto recover behaviour will be removed. In addition, align netdevsim selftest to the new default value. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26netdevsim: support taking immediate snapshot via devlinkJacob Keller1-6/+22
Implement the .snapshot region operation for the dummy data region. This enables a region snapshot to be taken upon request via the new DEVLINK_CMD_REGION_SNAPSHOT command. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26devlink: track snapshot id usage count using an xarrayJacob Keller1-1/+5
Each snapshot created for a devlink region must have an id. These ids are supposed to be unique per "event" that caused the snapshot to be created. Drivers call devlink_region_snapshot_id_get to obtain a new id to use for a new event trigger. The id values are tracked per devlink, so that the same id number can be used if a triggering event creates multiple snapshots on different regions. There is no mechanism for snapshot ids to ever be reused. Introduce an xarray to store the count of how many snapshots are using a given id, replacing the snapshot_id field previously used for picking the next id. The devlink_region_snapshot_id_get() function will use xa_alloc to insert an initial value of 1 value at an available slot between 0 and U32_MAX. The new __devlink_snapshot_id_increment() and __devlink_snapshot_id_decrement() functions will be used to track how many snapshots currently use an id. Drivers must now call devlink_snapshot_id_put() in order to release their reference of the snapshot id after adding region snapshots. By tracking the total number of snapshots using a given id, it is possible for the decrement() function to erase the id from the xarray when it is not in use. With this method, a snapshot id can become reused again once all snapshots that referred to it have been deleted via DEVLINK_CMD_REGION_DEL, and the driver has finished adding snapshots. This work also paves the way to introduce a mechanism for userspace to request a snapshot. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26devlink: report error once U32_MAX snapshot ids have been usedJacob Keller1-1/+5
The devlink_snapshot_id_get() function returns a snapshot id. The snapshot id is a u32, so there is no way to indicate an error code. A future change is going to possibly add additional cases where this function could fail. Refactor the function to return the snapshot id in an argument, so that it can return zero or an error value. This ensures that snapshot ids cannot be confused with error values, and aids in the future refactor of snapshot id allocation management. Because there is no current way to release previously used snapshot ids, add a simple check ensuring that an error is reported in case the snapshot_id would over flow. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26devlink: convert snapshot destructor callback to region opJacob Keller1-1/+2
It does not makes sense that two snapshots for a given region would use different destructors. Simplify snapshot creation by adding a .destructor op for regions. This operation will replace the data_destructor for the snapshot creation, and makes snapshot creation easier. Noticed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26devlink: prepare to support region operationsJacob Keller1-1/+5
Modify the devlink region code in preparation for adding new operations on regions. Create a devlink_region_ops structure, and move the name pointer from within the devlink_region structure into the ops structure (similar to the devlink_health_reporter_ops). This prepares the regions to enable support of additional operations in the future such as requesting snapshots, or accessing the region directly without a snapshot. In order to re-use the constant strings in the mlx4 driver their declaration must be changed to 'const char * const' to ensure the compiler realizes that both the data and the pointer cannot change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-15/+15
Overlapping header include additions in macsec.c A bug fix in 'net' overlapping with the removal of 'version' string in ena_netdev.c Overlapping test additions in selftests Makefile Overlapping PCI ID table adjustments in iwlwifi driver. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23devlink: Only pass packet trap group identifier in trap structureIdo Schimmel1-4/+4
Packet trap groups are now explicitly registered by drivers and not implicitly registered when the packet traps are registered. Therefore, there is no need to encode entire group structure the trap is associated with inside the trap structure. Instead, only pass the group identifier. Refer to it as initial group identifier, as future patches will allow user space to move traps between groups. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23netdevsim: Explicitly register packet trap groupsIdo Schimmel1-1/+18
Use the previously added API to explicitly register / unregister supported packet trap groups. This is in preparation for future patches that will enable drivers to pass additional group attributes, such as associated policer identifier. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-15net: netdevsim: Use scnprintf() for avoiding potential buffer overflowTakashi Iwai1-15/+15
Since snprintf() returns the would-be-output size instead of the actual output size, the succeeding calls may go beyond the given buffer limit. Fix it by replacing with scnprintf(). Cc: "David S . Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: netdev@vger.kernel.org Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25netdevsim: add ACL trap reporting cookie as a metadataJiri Pirko2-3/+116
Add new trap ACL which reports flow action cookie in a metadata. Allow used to setup the cookie using debugfs file. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25devlink: extend devlink_trap_report() to accept cookie and passJiri Pirko1-1/+1
Add cookie argument to devlink_trap_report() allowing driver to pass in the user cookie. Pass on the cookie down to drop monitor code. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-05netdevsim: fix ptr_ret.cocci warningskbuild test robot1-3/+1
drivers/net/netdevsim/dev.c:937:1-3: WARNING: PTR_ERR_OR_ZERO can be used Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR Generated by: scripts/coccinelle/api/ptr_ret.cocci Fixes: 6556ff32f12d ("netdevsim: use IS_ERR instead of IS_ERR_OR_NULL for debugfs") CC: Taehee Yoo <ap420073@gmail.com> Signed-off-by: kbuild test robot <lkp@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-03netdevsim: remove unused sdev codeTaehee Yoo1-69/+0
sdev.c code is merged into dev.c and is not used anymore. it would be removed. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-02-03netdevsim: use __GFP_NOWARN to avoid memalloc warningTaehee Yoo2-2/+2
vfnum buffer size and binary_len buffer size is received by user-space. So, this buffer size could be too large. If so, kmalloc will internally print a warning message. This warning message is actually not necessary for the netdevsim module. So, this patch adds __GFP_NOWARN. Test commands: modprobe netdevsim echo 1 > /sys/bus/netdevsim/new_device echo 1000000000 > /sys/devices/netdevsim1/sriov_numvfs Splat looks like: [ 357.847266][ T1000] WARNING: CPU: 0 PID: 1000 at mm/page_alloc.c:4738 __alloc_pages_nodemask+0x2f3/0x740 [ 357.850273][ T1000] Modules linked in: netdevsim veth openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrx [ 357.852989][ T1000] CPU: 0 PID: 1000 Comm: bash Tainted: G B 5.5.0-rc5+ #270 [ 357.854334][ T1000] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 357.855703][ T1000] RIP: 0010:__alloc_pages_nodemask+0x2f3/0x740 [ 357.856669][ T1000] Code: 64 fe ff ff 65 48 8b 04 25 c0 0f 02 00 48 05 f0 12 00 00 41 be 01 00 00 00 49 89 47 0 [ 357.860272][ T1000] RSP: 0018:ffff8880b7f47bd8 EFLAGS: 00010246 [ 357.861009][ T1000] RAX: ffffed1016fe8f80 RBX: 1ffff11016fe8fae RCX: 0000000000000000 [ 357.861843][ T1000] RDX: 0000000000000000 RSI: 0000000000000017 RDI: 0000000000000000 [ 357.862661][ T1000] RBP: 0000000000040dc0 R08: 1ffff11016fe8f67 R09: dffffc0000000000 [ 357.863509][ T1000] R10: ffff8880b7f47d68 R11: fffffbfff2798180 R12: 1ffff11016fe8f80 [ 357.864355][ T1000] R13: 0000000000000017 R14: 0000000000000017 R15: ffff8880c2038d68 [ 357.865178][ T1000] FS: 00007fd9a5b8c740(0000) GS:ffff8880d9c00000(0000) knlGS:0000000000000000 [ 357.866248][ T1000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 357.867531][ T1000] CR2: 000055ce01ba8100 CR3: 00000000b7dbe005 CR4: 00000000000606f0 [ 357.868972][ T1000] Call Trace: [ 357.869423][ T1000] ? lock_contended+0xcd0/0xcd0 [ 357.870001][ T1000] ? __alloc_pages_slowpath+0x21d0/0x21d0 [ 357.870673][ T1000] ? _kstrtoull+0x76/0x160 [ 357.871148][ T1000] ? alloc_pages_current+0xc1/0x1a0 [ 357.871704][ T1000] kmalloc_order+0x22/0x80 [ 357.872184][ T1000] kmalloc_order_trace+0x1d/0x140 [ 357.872733][ T1000] __kmalloc+0x302/0x3a0 [ 357.873204][ T1000] nsim_bus_dev_numvfs_store+0x1ab/0x260 [netdevsim] [ 357.873919][ T1000] ? kernfs_get_active+0x12c/0x180 [ 357.874459][ T1000] ? new_device_store+0x450/0x450 [netdevsim] [ 357.875111][ T1000] ? kernfs_get_parent+0x70/0x70 [ 357.875632][ T1000] ? sysfs_file_ops+0x160/0x160 [ 357.876152][ T1000] kernfs_fop_write+0x276/0x410 [ 357.876680][ T1000] ? __sb_start_write+0x1ba/0x2e0 [ 357.877225][ T1000] vfs_write+0x197/0x4a0 [ 357.877671][ T1000] ksys_write+0x141/0x1d0 [ ... ] Reviewed-by: Jakub Kicinski <kuba@kernel.org> Fixes: 79579220566c ("netdevsim: add SR-IOV functionality") Fixes: 82c93a87bf8b ("netdevsim: implement couple of testing devlink health reporters") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-02-03netdevsim: use IS_ERR instead of IS_ERR_OR_NULL for debugfsTaehee Yoo3-14/+16
Debugfs APIs return valid pointer or error pointer. it doesn't return NULL. So, using IS_ERR is enough, not using IS_ERR_OR_NULL. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-02-03netdevsim: fix stack-out-of-bounds in nsim_dev_debugfs_init()Taehee Yoo1-1/+1
When netdevsim dev is being created, a debugfs directory is created. The variable "dev_ddir_name" is 16bytes device name pointer and device name is "netdevsim<dev id>". The maximum dev id length is 10. So, 16bytes for device name isn't enough. Test commands: modprobe netdevsim echo "1000000000 0" > /sys/bus/netdevsim/new_device Splat looks like: [ 249.622710][ T900] BUG: KASAN: stack-out-of-bounds in number+0x824/0x880 [ 249.623658][ T900] Write of size 1 at addr ffff88804c527988 by task bash/900 [ 249.624521][ T900] [ 249.624830][ T900] CPU: 1 PID: 900 Comm: bash Not tainted 5.5.0+ #322 [ 249.625691][ T900] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 249.626712][ T900] Call Trace: [ 249.627103][ T900] dump_stack+0x96/0xdb [ 249.627639][ T900] ? number+0x824/0x880 [ 249.628173][ T900] print_address_description.constprop.5+0x1be/0x360 [ 249.629022][ T900] ? number+0x824/0x880 [ 249.629569][ T900] ? number+0x824/0x880 [ 249.630105][ T900] __kasan_report+0x12a/0x170 [ 249.630717][ T900] ? number+0x824/0x880 [ 249.631201][ T900] kasan_report+0xe/0x20 [ 249.631723][ T900] number+0x824/0x880 [ 249.632235][ T900] ? put_dec+0xa0/0xa0 [ 249.632716][ T900] ? rcu_read_lock_sched_held+0x90/0xc0 [ 249.633392][ T900] vsnprintf+0x63c/0x10b0 [ 249.633983][ T900] ? pointer+0x5b0/0x5b0 [ 249.634543][ T900] ? mark_lock+0x11d/0xc40 [ 249.635200][ T900] sprintf+0x9b/0xd0 [ 249.635750][ T900] ? scnprintf+0xe0/0xe0 [ 249.636370][ T900] nsim_dev_probe+0x63c/0xbf0 [netdevsim] [ ... ] Reviewed-by: Jakub Kicinski <kuba@kernel.org> Fixes: ab1d0cc004d7 ("netdevsim: change debugfs tree topology") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-02-03netdevsim: fix panic in nsim_dev_take_snapshot_write()Taehee Yoo2-2/+12
nsim_dev_take_snapshot_write() uses nsim_dev and nsim_dev->dummy_region. So, during this function, these data shouldn't be removed. But there is no protecting stuff in this function. There are two similar cases. 1. reload case reload could be called during nsim_dev_take_snapshot_write(). When reload is being executed, nsim_dev_reload_down() is called and it calls nsim_dev_reload_destroy(). nsim_dev_reload_destroy() calls devlink_region_destroy() to destroy nsim_dev->dummy_region. So, during nsim_dev_take_snapshot_write(), nsim_dev->dummy_region() would be removed. At this point, snapshot_write() would access freed pointer. In order to fix this case, take_snapshot file will be removed before devlink_region_destroy(). The take_snapshot file will be re-created by ->reload_up(). 2. del_device_store case del_device_store() also could call nsim_dev_reload_destroy() during nsim_dev_take_snapshot_write(). If so, panic would occur. This problem is actually the same problem with the first case. So, this problem will be fixed by the first case's solution. Test commands: modprobe netdevsim while : do echo 1 > /sys/bus/netdevsim/new_device & echo 1 > /sys/bus/netdevsim/del_device & devlink dev reload netdevsim/netdevsim1 & echo 1 > /sys/kernel/debug/netdevsim/netdevsim1/take_snapshot & done Splat looks like: [ 45.564513][ T975] general protection fault, probably for non-canonical address 0xdffffc000000003a: 0000 [#1] SMP DEI [ 45.566131][ T975] KASAN: null-ptr-deref in range [0x00000000000001d0-0x00000000000001d7] [ 45.566135][ T975] CPU: 1 PID: 975 Comm: bash Not tainted 5.5.0+ #322 [ 45.569020][ T975] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 45.569026][ T975] RIP: 0010:__mutex_lock+0x10a/0x14b0 [ 45.570518][ T975] Code: 08 84 d2 0f 85 7f 12 00 00 44 8b 0d 10 23 65 02 45 85 c9 75 29 49 8d 7f 68 48 b8 00 00 00 0f [ 45.570522][ T975] RSP: 0018:ffff888046ccfbf0 EFLAGS: 00010206 [ 45.572305][ T975] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 45.572308][ T975] RDX: 000000000000003a RSI: ffffffffac926440 RDI: 00000000000001d0 [ 45.576843][ T975] RBP: ffff888046ccfd70 R08: ffffffffab610645 R09: 0000000000000000 [ 45.576847][ T975] R10: ffff888046ccfd90 R11: ffffed100d6360ad R12: 0000000000000000 [ 45.578471][ T975] R13: dffffc0000000000 R14: ffffffffae1976c0 R15: 0000000000000168 [ 45.578475][ T975] FS: 00007f614d6e7740(0000) GS:ffff88806c400000(0000) knlGS:0000000000000000 [ 45.581492][ T975] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 45.582942][ T975] CR2: 00005618677d1cf0 CR3: 000000005fb9c002 CR4: 00000000000606e0 [ 45.584543][ T975] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 45.586633][ T975] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 45.589889][ T975] Call Trace: [ 45.591445][ T975] ? devlink_region_snapshot_create+0x55/0x4a0 [ 45.601250][ T975] ? mutex_lock_io_nested+0x1380/0x1380 [ 45.602817][ T975] ? mutex_lock_io_nested+0x1380/0x1380 [ 45.603875][ T975] ? mark_held_locks+0xa5/0xe0 [ 45.604769][ T975] ? _raw_spin_unlock_irqrestore+0x2d/0x50 [ 45.606147][ T975] ? __mutex_unlock_slowpath+0xd0/0x670 [ 45.607723][ T975] ? crng_backtrack_protect+0x80/0x80 [ 45.613530][ T975] ? wait_for_completion+0x390/0x390 [ 45.615152][ T975] ? devlink_region_snapshot_create+0x55/0x4a0 [ 45.616834][ T975] devlink_region_snapshot_create+0x55/0x4a0 [ ... ] Fixes: 4418f862d675 ("netdevsim: implement support for devlink region and snapshots") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-02-03netdevsim: disable devlink reload when resources are being usedTaehee Yoo2-0/+21
devlink reload destroys resources and allocates resources again. So, when devices and ports resources are being used, devlink reload function should not be executed. In order to avoid this race, a new lock is added and new_port() and del_port() call devlink_reload_disable() and devlink_reload_enable(). Thread0 Thread1 {new/del}_port() {new/del}_port() devlink_reload_disable() devlink_reload_disable() devlink_reload_enable() //here devlink_reload_enable() Before Thread1's devlink_reload_enable(), the devlink is already allowed to execute reload because Thread0 allows it. devlink reload disable/enable variable type is bool. So the above case would exist. So, disable/enable should be executed atomically. In order to do that, a new lock is used. Test commands: modprobe netdevsim echo 1 > /sys/bus/netdevsim/new_device while : do echo 1 > /sys/devices/netdevsim1/new_port & echo 1 > /sys/devices/netdevsim1/del_port & devlink dev reload netdevsim/netdevsim1 & done Splat looks like: [ 23.342145][ T932] DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock)) [ 23.342159][ T932] WARNING: CPU: 0 PID: 932 at kernel/locking/mutex-debug.c:103 mutex_destroy+0xc7/0xf0 [ 23.344182][ T932] Modules linked in: netdevsim openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_dx [ 23.346485][ T932] CPU: 0 PID: 932 Comm: devlink Not tainted 5.5.0+ #322 [ 23.347696][ T932] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 23.348893][ T932] RIP: 0010:mutex_destroy+0xc7/0xf0 [ 23.349505][ T932] Code: e0 07 83 c0 03 38 d0 7c 04 84 d2 75 2e 8b 05 00 ac b0 02 85 c0 75 8b 48 c7 c6 00 5e 07 96 40 [ 23.351887][ T932] RSP: 0018:ffff88806208f810 EFLAGS: 00010286 [ 23.353963][ T932] RAX: dffffc0000000008 RBX: ffff888067f6f2c0 RCX: ffffffff942c4bd4 [ 23.355222][ T932] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff96dac5b4 [ 23.356169][ T932] RBP: ffff888067f6f000 R08: fffffbfff2d235a5 R09: fffffbfff2d235a5 [ 23.357160][ T932] R10: 0000000000000001 R11: fffffbfff2d235a4 R12: ffff888067f6f208 [ 23.358288][ T932] R13: ffff88806208fa70 R14: ffff888067f6f000 R15: ffff888069ce3800 [ 23.359307][ T932] FS: 00007fe2a3876740(0000) GS:ffff88806c000000(0000) knlGS:0000000000000000 [ 23.360473][ T932] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 23.361319][ T932] CR2: 00005561357aa000 CR3: 000000005227a006 CR4: 00000000000606f0 [ 23.362323][ T932] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 23.363417][ T932] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 23.364414][ T932] Call Trace: [ 23.364828][ T932] nsim_dev_reload_destroy+0x77/0xb0 [netdevsim] [ 23.365655][ T932] nsim_dev_reload_down+0x84/0xb0 [netdevsim] [ 23.366433][ T932] devlink_reload+0xb1/0x350 [ 23.367010][ T932] genl_rcv_msg+0x580/0xe90 [ ...] [ 23.531729][ T1305] kernel BUG at lib/list_debug.c:53! [ 23.532523][ T1305] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 23.533467][ T1305] CPU: 2 PID: 1305 Comm: bash Tainted: G W 5.5.0+ #322 [ 23.534962][ T1305] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 23.536503][ T1305] RIP: 0010:__list_del_entry_valid+0xe6/0x150 [ 23.538346][ T1305] Code: 89 ea 48 c7 c7 00 73 1e 96 e8 df f7 4c ff 0f 0b 48 c7 c7 60 73 1e 96 e8 d1 f7 4c ff 0f 0b 44 [ 23.541068][ T1305] RSP: 0018:ffff888047c27b58 EFLAGS: 00010282 [ 23.542001][ T1305] RAX: 0000000000000054 RBX: ffff888067f6f318 RCX: 0000000000000000 [ 23.543051][ T1305] RDX: 0000000000000054 RSI: 0000000000000008 RDI: ffffed1008f84f61 [ 23.544072][ T1305] RBP: ffff88804aa0fca0 R08: ffffed100d940539 R09: ffffed100d940539 [ 23.545085][ T1305] R10: 0000000000000001 R11: ffffed100d940538 R12: ffff888047c27cb0 [ 23.546422][ T1305] R13: ffff88806208b840 R14: ffffffff981976c0 R15: ffff888067f6f2c0 [ 23.547406][ T1305] FS: 00007f76c0431740(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000 [ 23.548527][ T1305] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 23.549389][ T1305] CR2: 00007f5048f1a2f8 CR3: 000000004b310006 CR4: 00000000000606e0 [ 23.550636][ T1305] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 23.551578][ T1305] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 23.552597][ T1305] Call Trace: [ 23.553004][ T1305] mutex_remove_waiter+0x101/0x520 [ 23.553646][ T1305] __mutex_lock+0xac7/0x14b0 [ 23.554218][ T1305] ? nsim_dev_port_del+0x4e/0x140 [netdevsim] [ 23.554908][ T1305] ? mutex_lock_io_nested+0x1380/0x1380 [ 23.555570][ T1305] ? _parse_integer+0xf0/0xf0 [ 23.556043][ T1305] ? kstrtouint+0x86/0x110 [ 23.556504][ T1305] ? nsim_dev_port_del+0x4e/0x140 [netdevsim] [ 23.557133][ T1305] nsim_dev_port_del+0x4e/0x140 [netdevsim] [ 23.558024][ T1305] del_port_store+0xcc/0xf0 [netdevsim] [ ... ] Fixes: 75ba029f3c07 ("netdevsim: implement proper devlink reload") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-02-03netdevsim: fix using uninitialized resourcesTaehee Yoo2-3/+41
When module is being initialized, __init() calls bus_register() and driver_register(). These functions internally create various resources and sysfs files. The sysfs files are used for basic operations(add/del device). /sys/bus/netdevsim/new_device /sys/bus/netdevsim/del_device These sysfs files use netdevsim resources, they are mostly allocated and initialized in ->probe() function, which is nsim_dev_probe(). But, sysfs files could be executed before ->probe() is finished. So, accessing uninitialized data would occur. Another problem is very similar. /sys/bus/netdevsim/new_device internally creates sysfs files. /sys/devices/netdevsim<id>/new_port /sys/devices/netdevsim<id>/del_port These sysfs files also use netdevsim resources, they are mostly allocated and initialized in creating device routine, which is nsim_bus_dev_new(). But they also could be executed before nsim_bus_dev_new() is finished. So, accessing uninitialized data would occur. To fix these problems, this patch adds flags, which means whether the operation is finished or not. The flag variable 'nsim_bus_enable' means whether netdevsim bus was initialized or not. This is protected by nsim_bus_dev_list_lock. The flag variable 'nsim_bus_dev->init' means whether nsim_bus_dev was initialized or not. This could be used in {new/del}_port_store() with no lock. Test commands: #SHELL1 modprobe netdevsim while : do echo "1 1" > /sys/bus/netdevsim/new_device echo "1 1" > /sys/bus/netdevsim/del_device done #SHELL2 while : do echo 1 > /sys/devices/netdevsim1/new_port echo 1 > /sys/devices/netdevsim1/del_port done Splat looks like: [ 47.508954][ T1008] general protection fault, probably for non-canonical address 0xdffffc0000000021: 0000 I [ 47.510793][ T1008] KASAN: null-ptr-deref in range [0x0000000000000108-0x000000000000010f] [ 47.511963][ T1008] CPU: 2 PID: 1008 Comm: bash Not tainted 5.5.0+ #322 [ 47.512823][ T1008] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 47.514041][ T1008] RIP: 0010:__mutex_lock+0x10a/0x14b0 [ 47.514699][ T1008] Code: 08 84 d2 0f 85 7f 12 00 00 44 8b 0d 10 23 65 02 45 85 c9 75 29 49 8d 7f 68 48 b8 00 00 00 0f [ 47.517163][ T1008] RSP: 0018:ffff888059b4fbb0 EFLAGS: 00010206 [ 47.517802][ T1008] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 47.518941][ T1008] RDX: 0000000000000021 RSI: ffffffff85926440 RDI: 0000000000000108 [ 47.519732][ T1008] RBP: ffff888059b4fd30 R08: ffffffffc073fad0 R09: 0000000000000000 [ 47.520729][ T1008] R10: ffff888059b4fd50 R11: ffff88804bb38040 R12: 0000000000000000 [ 47.521702][ T1008] R13: dffffc0000000000 R14: ffffffff871976c0 R15: 00000000000000a0 [ 47.522760][ T1008] FS: 00007fd4be05a740(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000 [ 47.523877][ T1008] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 47.524627][ T1008] CR2: 0000561c82b69cf0 CR3: 0000000065dd6004 CR4: 00000000000606e0 [ 47.527662][ T1008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 47.528604][ T1008] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 47.529531][ T1008] Call Trace: [ 47.529874][ T1008] ? nsim_dev_port_add+0x50/0x150 [netdevsim] [ 47.530470][ T1008] ? mutex_lock_io_nested+0x1380/0x1380 [ 47.531018][ T1008] ? _kstrtoull+0x76/0x160 [ 47.531449][ T1008] ? _parse_integer+0xf0/0xf0 [ 47.531874][ T1008] ? kernfs_fop_write+0x1cf/0x410 [ 47.532330][ T1008] ? sysfs_file_ops+0x160/0x160 [ 47.532773][ T1008] ? kstrtouint+0x86/0x110 [ 47.533168][ T1008] ? nsim_dev_port_add+0x50/0x150 [netdevsim] [ 47.533721][ T1008] nsim_dev_port_add+0x50/0x150 [netdevsim] [ 47.534336][ T1008] ? sysfs_file_ops+0x160/0x160 [ 47.534858][ T1008] new_port_store+0x99/0xb0 [netdevsim] [ 47.535439][ T1008] ? del_port_store+0xb0/0xb0 [netdevsim] [ 47.536035][ T1008] ? sysfs_file_ops+0x112/0x160 [ 47.536544][ T1008] ? sysfs_kf_write+0x3b/0x180 [ 47.537029][ T1008] kernfs_fop_write+0x276/0x410 [ 47.537548][ T1008] ? __sb_start_write+0x215/0x2e0 [ 47.538110][ T1008] vfs_write+0x197/0x4a0 [ ... ] Fixes: f9d9db47d3ba ("netdevsim: add bus attributes to add new and delete devices") Fixes: 794b2c05ca1c ("netdevsim: extend device attrs to support port addition and deletion") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-01-19Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-1/+1
2020-01-17netdevsim: fix nsim_fib6_rt_create() error pathEric Dumazet1-3/+3
It seems nsim_fib6_rt_create() intent was to return either a valid pointer or an embedded error code. BUG: unable to handle page fault for address: fffffffffffffff4 PGD 9870067 P4D 9870067 PUD 9872067 PMD 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 22851 Comm: syz-executor.1 Not tainted 5.5.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:jhash2 include/linux/jhash.h:125 [inline] RIP: 0010:rhashtable_jhash2+0x76/0x2c0 lib/rhashtable.c:963 Code: b9 00 00 00 00 00 fc ff df 48 c1 e8 03 0f b6 14 08 4c 89 f0 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 30 02 00 00 49 8d 7e 04 <41> 8b 06 48 be 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 0f b6 RSP: 0018:ffffc90016127190 EFLAGS: 00010246 RAX: 0000000000000007 RBX: 00000000dfb3ab49 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: ffffffff839ba7c8 RDI: fffffffffffffff8 RBP: ffffc900161271c0 R08: ffff8880951f8640 R09: ffffed1015d0703d R10: ffffed1015d0703c R11: ffff8880ae8381e3 R12: 00000000dfb3ab49 R13: 00000000dfb3ab49 R14: fffffffffffffff4 R15: 0000000000000007 FS: 00007f40bfbc6700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: fffffffffffffff4 CR3: 0000000093660000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rht_key_get_hash include/linux/rhashtable.h:133 [inline] rht_key_hashfn include/linux/rhashtable.h:159 [inline] rht_head_hashfn include/linux/rhashtable.h:174 [inline] __rhashtable_insert_fast.constprop.0+0xe15/0x1180 include/linux/rhashtable.h:723 rhashtable_insert_fast include/linux/rhashtable.h:832 [inline] nsim_fib6_rt_add drivers/net/netdevsim/fib.c:603 [inline] nsim_fib6_rt_insert drivers/net/netdevsim/fib.c:658 [inline] nsim_fib6_event drivers/net/netdevsim/fib.c:719 [inline] nsim_fib_event drivers/net/netdevsim/fib.c:744 [inline] nsim_fib_event_nb+0x1b16/0x2600 drivers/net/netdevsim/fib.c:772 notifier_call_chain+0xc2/0x230 kernel/notifier.c:83 __atomic_notifier_call_chain+0xa6/0x1a0 kernel/notifier.c:173 atomic_notifier_call_chain+0x2e/0x40 kernel/notifier.c:183 call_fib_notifiers+0x173/0x2a0 net/core/fib_notifier.c:35 call_fib6_notifiers+0x4b/0x60 net/ipv6/fib6_notifier.c:22 call_fib6_entry_notifiers+0xfb/0x150 net/ipv6/ip6_fib.c:399 fib6_add_rt2node net/ipv6/ip6_fib.c:1216 [inline] fib6_add+0x20cd/0x3ec0 net/ipv6/ip6_fib.c:1471 __ip6_ins_rt+0x54/0x80 net/ipv6/route.c:1315 ip6_ins_rt+0x96/0xd0 net/ipv6/route.c:1325 __ipv6_dev_ac_inc+0x76f/0xb20 net/ipv6/anycast.c:324 ipv6_sock_ac_join+0x4c1/0x790 net/ipv6/anycast.c:139 do_ipv6_setsockopt.isra.0+0x3908/0x4290 net/ipv6/ipv6_sockglue.c:670 ipv6_setsockopt+0xff/0x180 net/ipv6/ipv6_sockglue.c:944 udpv6_setsockopt+0x68/0xb0 net/ipv6/udp.c:1564 sock_common_setsockopt+0x94/0xd0 net/core/sock.c:3149 __sys_setsockopt+0x261/0x4c0 net/socket.c:2130 __do_sys_setsockopt net/socket.c:2146 [inline] __se_sys_setsockopt net/socket.c:2143 [inline] __x64_sys_setsockopt+0xbe/0x150 net/socket.c:2143 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x45aff9 Fixes: 48bb9eb47b27 ("netdevsim: fib: Add dummy implementation for FIB offload") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-14netdevsim: fib: Add dummy implementation for FIB offloadIdo Schimmel1-9/+662
Implement dummy IPv4 and IPv6 FIB "offload" in the driver by storing currently "programmed" routes in a hash table. Each route in the hash table is marked with "trap" indication. The indication is cleared when the route is replaced or when the netdevsim instance is deleted. This will later allow us to test the route offload API on top of netdevsim. v2: * Convert to new fib_alias_hw_flags_set() interface Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-11devlink: correct misspelling of snapshotJacob Keller1-1/+1
The function to obtain a unique snapshot id was mistakenly typo'd as devlink_region_shapshot_id_get. Fix this typo by renaming the function and all of its users. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-10devlink: rename and expand devlink-trap-netdevsim.rstJacob Keller1-1/+1
Rename the trap-specific netdevimsim.rst file, and expand it to include documentation of all the devlink features currently implemented by the netdevsim driver code. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-10devlink: move devlink documentation to subfolderJacob Keller1-1/+1
Combine the documentation for devlink into a subfolder, and provide an index.rst file that can be used to generally describe devlink. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24ipv6: Remove old route notifications and convert listenersIdo Schimmel1-1/+0
Now that mlxsw is converted to use the new FIB notifications it is possible to delete the old ones and use the new replace / append / delete notifications. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-16ipv4: Remove old route notifications and convert listenersIdo Schimmel1-2/+2
Unlike mlxsw, the other listeners to the FIB notification chain do not require any special modifications as they never considered multiple identical routes. This patch removes the old route notifications and converts all the listeners to use the new replace / delete notifications. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-12netdevsim: Update dummy reporter's devlink binary interfaceAya Levin1-7/+1
Update dummy reporter's output to use updated devlink interface of binary fmsg pair. Signed-off-by: Aya Levin <ayal@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-08devlink: disallow reload operation during device cleanupJiri Pirko1-0/+3
There is a race between driver code that does setup/cleanup of device and devlink reload operation that in some drivers works with the same code. Use after free could we easily obtained by running: while true; do echo 10 > /sys/bus/netdevsim/new_device devlink dev reload netdevsim/netdevsim10 & echo 10 > /sys/bus/netdevsim/del_device done Fix this by enabling reload only after setup of device is complete and disabling it at the beginning of the cleanup process. Reported-by: Ido Schimmel <idosch@mellanox.com> Fixes: 2d8dc5bbf4e7 ("devlink: Add support for reload") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06netdevsim: drop code duplicated by a mergeJakub Kicinski1-39/+8
Looks like the port adding loop makes a re-appearance on net-next after net was merged back into it (even though it doesn't feature in the merge diff). The ports are already added in nsim_dev_create() so when we try to add them again get EEXIST, and see: netdevsim: probe of netdevsim0 failed with error -17 in the logs. When we remove the loop again the nsim_dev_probe() and nsim_dev_remove() become a wrapper of nsim_dev_create() and nsim_dev_destroy(). Remove this layer of indirection. Fixes: d31e95585ca6 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-0/+17
The only slightly tricky merge conflict was the netdevsim because the mutex locking fix overlapped a lot of driver reload reorganization. The rest were (relatively) trivial in nature. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-31netdevsim: Fix use-after-free during device dismantleIdo Schimmel1-0/+5
Commit da58f90f11f5 ("netdevsim: Add devlink-trap support") added delayed work to netdevsim that periodically iterates over the registered netdevsim ports and reports various packet traps via devlink. While the delayed work takes the 'port_list_lock' mutex to protect against concurrent addition / deletion of ports, during device creation / dismantle ports are added / deleted without this lock, which can result in a use-after-free [1]. Fix this by making sure that the ports list is always modified under the lock. [1] [ 59.205543] ================================================================== [ 59.207748] BUG: KASAN: use-after-free in nsim_dev_trap_report_work+0xa67/0xad0 [ 59.210247] Read of size 8 at addr ffff8883cbdd3398 by task kworker/3:1/38 [ 59.212584] [ 59.213148] CPU: 3 PID: 38 Comm: kworker/3:1 Not tainted 5.4.0-rc3-custom-16119-ge6abb5f0261e #2013 [ 59.215896] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014 [ 59.218384] Workqueue: events nsim_dev_trap_report_work [ 59.219428] Call Trace: [ 59.219924] dump_stack+0xa9/0x10e [ 59.220623] print_address_description.constprop.4+0x21/0x340 [ 59.221976] ? vprintk_func+0x66/0x240 [ 59.222752] __kasan_report.cold.8+0x78/0x91 [ 59.223602] ? nsim_dev_trap_report_work+0xa67/0xad0 [ 59.224603] kasan_report+0xe/0x20 [ 59.225296] nsim_dev_trap_report_work+0xa67/0xad0 [ 59.226435] ? rcu_read_lock_sched_held+0xaf/0xe0 [ 59.227512] ? trace_event_raw_event_rcu_quiescent_state_report+0x360/0x360 [ 59.228851] process_one_work+0x98f/0x1760 [ 59.229684] ? pwq_dec_nr_in_flight+0x330/0x330 [ 59.230656] worker_thread+0x91/0xc40 [ 59.231587] ? process_one_work+0x1760/0x1760 [ 59.232451] kthread+0x34a/0x410 [ 59.233104] ? __kthread_queue_delayed_work+0x240/0x240 [ 59.234141] ret_from_fork+0x3a/0x50 [ 59.234982] [ 59.235371] Allocated by task 187: [ 59.236189] save_stack+0x19/0x80 [ 59.236853] __kasan_kmalloc.constprop.5+0xc1/0xd0 [ 59.237822] kmem_cache_alloc_trace+0x14c/0x380 [ 59.238769] __nsim_dev_port_add+0xaf/0x5c0 [ 59.239627] nsim_dev_probe+0x4fc/0x1140 [ 59.240550] really_probe+0x264/0xc00 [ 59.241418] driver_probe_device+0x208/0x2e0 [ 59.242255] __device_attach_driver+0x215/0x2d0 [ 59.243150] bus_for_each_drv+0x154/0x1d0 [ 59.243944] __device_attach+0x1ba/0x2b0 [ 59.244923] bus_probe_device+0x1dd/0x290 [ 59.245805] device_add+0xbac/0x1550 [ 59.246528] new_device_store+0x1f4/0x400 [ 59.247306] bus_attr_store+0x7b/0xa0 [ 59.248047] sysfs_kf_write+0x10f/0x170 [ 59.248941] kernfs_fop_write+0x283/0x430 [ 59.249843] __vfs_write+0x81/0x100 [ 59.250546] vfs_write+0x1ce/0x510 [ 59.251190] ksys_write+0x104/0x200 [ 59.251873] do_syscall_64+0xa4/0x4e0 [ 59.252642] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 59.253837] [ 59.254203] Freed by task 187: [ 59.254811] save_stack+0x19/0x80 [ 59.255463] __kasan_slab_free+0x125/0x170 [ 59.256265] kfree+0x100/0x440 [ 59.256870] nsim_dev_remove+0x98/0x100 [ 59.257651] nsim_bus_remove+0x16/0x20 [ 59.258382] device_release_driver_internal+0x20b/0x4d0 [ 59.259588] bus_remove_device+0x2e9/0x5a0 [ 59.260551] device_del+0x410/0xad0 [ 59.263777] device_unregister+0x26/0xc0 [ 59.264616] nsim_bus_dev_del+0x16/0x60 [ 59.265381] del_device_store+0x2d6/0x3c0 [ 59.266295] bus_attr_store+0x7b/0xa0 [ 59.267192] sysfs_kf_write+0x10f/0x170 [ 59.267960] kernfs_fop_write+0x283/0x430 [ 59.268800] __vfs_write+0x81/0x100 [ 59.269551] vfs_write+0x1ce/0x510 [ 59.270252] ksys_write+0x104/0x200 [ 59.270910] do_syscall_64+0xa4/0x4e0 [ 59.271680] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 59.272812] [ 59.273211] The buggy address belongs to the object at ffff8883cbdd3200 [ 59.273211] which belongs to the cache kmalloc-512 of size 512 [ 59.275838] The buggy address is located 408 bytes inside of [ 59.275838] 512-byte region [ffff8883cbdd3200, ffff8883cbdd3400) [ 59.278151] The buggy address belongs to the page: [ 59.279215] page:ffffea000f2f7400 refcount:1 mapcount:0 mapping:ffff8883ecc0ce00 index:0x0 compound_mapcount: 0 [ 59.281449] flags: 0x200000000010200(slab|head) [ 59.282356] raw: 0200000000010200 ffffea000f2f3a08 ffffea000f2fd608 ffff8883ecc0ce00 [ 59.283949] raw: 0000000000000000 0000000000150015 00000001ffffffff 0000000000000000 [ 59.285608] page dumped because: kasan: bad access detected [ 59.286981] [ 59.287337] Memory state around the buggy address: [ 59.288310] ffff8883cbdd3280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 59.289763] ffff8883cbdd3300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 59.291452] >ffff8883cbdd3380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 59.292945] ^ [ 59.293815] ffff8883cbdd3400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 59.295220] ffff8883cbdd3480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 59.296872] ================================================================== Fixes: da58f90f11f5 ("netdevsim: Add devlink-trap support") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: syzbot+9ed8f68ab30761f3678e@syzkaller.appspotmail.com Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-13netdevsim: Fix error handling in nsim_fib_init and nsim_fib_exitYueHaibing1-1/+2
In nsim_fib_init(), if register_fib_notifier failed, nsim_fib_net_ops should be unregistered before return. In nsim_fib_exit(), unregister_fib_notifier should be called before nsim_fib_net_ops be unregistered, otherwise may cause use-after-free: BUG: KASAN: use-after-free in nsim_fib_event_nb+0x342/0x570 [netdevsim] Read of size 8 at addr ffff8881daaf4388 by task kworker/0:3/3499 CPU: 0 PID: 3499 Comm: kworker/0:3 Not tainted 5.3.0-rc7+ #30 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Workqueue: ipv6_addrconf addrconf_dad_work [ipv6] Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xa9/0x10e lib/dump_stack.c:113 print_address_description+0x65/0x380 mm/kasan/report.c:351 __kasan_report+0x149/0x18d mm/kasan/report.c:482 kasan_report+0xe/0x20 mm/kasan/common.c:618 nsim_fib_event_nb+0x342/0x570 [netdevsim] notifier_call_chain+0x52/0xf0 kernel/notifier.c:95 __atomic_notifier_call_chain+0x78/0x140 kernel/notifier.c:185 call_fib_notifiers+0x30/0x60 net/core/fib_notifier.c:30 call_fib6_entry_notifiers+0xc1/0x100 [ipv6] fib6_add+0x92e/0x1b10 [ipv6] __ip6_ins_rt+0x40/0x60 [ipv6] ip6_ins_rt+0x84/0xb0 [ipv6] __ipv6_ifa_notify+0x4b6/0x550 [ipv6] ipv6_ifa_notify+0xa5/0x180 [ipv6] addrconf_dad_completed+0xca/0x640 [ipv6] addrconf_dad_work+0x296/0x960 [ipv6] process_one_work+0x5c0/0xc00 kernel/workqueue.c:2269 worker_thread+0x5c/0x670 kernel/workqueue.c:2415 kthread+0x1d7/0x200 kernel/kthread.c:255 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352 Allocated by task 3388: save_stack+0x19/0x80 mm/kasan/common.c:69 set_track mm/kasan/common.c:77 [inline] __kasan_kmalloc.constprop.3+0xa0/0xd0 mm/kasan/common.c:493 kmalloc include/linux/slab.h:557 [inline] kzalloc include/linux/slab.h:748 [inline] ops_init+0xa9/0x220 net/core/net_namespace.c:127 __register_pernet_operations net/core/net_namespace.c:1135 [inline] register_pernet_operations+0x1d4/0x420 net/core/net_namespace.c:1212 register_pernet_subsys+0x24/0x40 net/core/net_namespace.c:1253 nsim_fib_init+0x12/0x70 [netdevsim] veth_get_link_ksettings+0x2b/0x50 [veth] do_one_initcall+0xd4/0x454 init/main.c:939 do_init_module+0xe0/0x330 kernel/module.c:3490 load_module+0x3c2f/0x4620 kernel/module.c:3841 __do_sys_finit_module+0x163/0x190 kernel/module.c:3931 do_syscall_64+0x72/0x2e0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 3534: save_stack+0x19/0x80 mm/kasan/common.c:69 set_track mm/kasan/common.c:77 [inline] __kasan_slab_free+0x130/0x180 mm/kasan/common.c:455 slab_free_hook mm/slub.c:1423 [inline] slab_free_freelist_hook mm/slub.c:1474 [inline] slab_free mm/slub.c:3016 [inline] kfree+0xe9/0x2d0 mm/slub.c:3957 ops_free net/core/net_namespace.c:151 [inline] ops_free_list.part.7+0x156/0x220 net/core/net_namespace.c:184 ops_free_list net/core/net_namespace.c:182 [inline] __unregister_pernet_operations net/core/net_namespace.c:1165 [inline] unregister_pernet_operations+0x221/0x2a0 net/core/net_namespace.c:1224 unregister_pernet_subsys+0x1d/0x30 net/core/net_namespace.c:1271 nsim_fib_exit+0x11/0x20 [netdevsim] nsim_module_exit+0x16/0x21 [netdevsim] __do_sys_delete_module kernel/module.c:1015 [inline] __se_sys_delete_module kernel/module.c:958 [inline] __x64_sys_delete_module+0x244/0x330 kernel/module.c:958 do_syscall_64+0x72/0x2e0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: 59c84b9fcf42 ("netdevsim: Restore per-network namespace accounting for fib entries") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>