aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/core/sysfs.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-11-12RDMA: Change MAD processing function to remove extra casting and parameterLeon Romanovsky1-4/+2
All users of process_mad() converts input pointers from ib_mad_hdr to be ib_mad, update the function declaration to use ib_mad directly. Also remove not used input MAD size parameter. Link: https://lore.kernel.org/r/20191029062745.7932-17-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Tested-By: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-06RDMA/mad: Allocate zeroed MAD bufferLeon Romanovsky1-2/+2
Ensure that MAD output buffer is zero-based allocated in all the callers of process_mad and remove the various memset()'s from the drivers. Link: https://lore.kernel.org/r/20191029062745.7932-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-01RDMA/core: Fix return code when modify_device isn't supportedKamal Heib1-1/+1
The proper return code is "-EOPNOTSUPP" when modify_device callback is not supported. Link: https://lore.kernel.org/r/20190923104158.5331-2-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-12RDMA: Introduce ib_port_phys_state enumKamal Heib1-10/+20
In order to improve readability, add ib_port_phys_state enum to replace the use of magic numbers. Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Andrew Boyer <aboyer@tobark.org> Acked-by: Michal Kalderon <michal.kalderon@marvell.com> Acked-by: Bernard Metzler <bmt@zurich.ibm.com> Link: https://lore.kernel.org/r/20190807103138.17219-2-kamalheib1@gmail.com Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-07-05RDMA/nldev: Allow get default counter statistics through RDMA netlinkMark Zhang1-0/+6
This patch adds the ability to return the hwstats of per-port default counters (which can also be queried through sysfs nodes). Signed-off-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-05RDMA/core: Get sum value of all counters when perform a sysfs stat readMark Zhang1-3/+7
Since a QP can only be bound to one counter, then if it is bound to a separate counter, for backward compatibility purpose, the statistic value must be: * stat of default counter + stat of all running allocated counters + stat of all deallocated counters (history stats) Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-06RDMA: Add EFA related definitionsGal Pressman1-0/+1
Add EFA driver ID to the IOCTL interface uapi. This patch also adds unspecified node/transport type that will be used by EFA (usnic is left unchanged as it's already part of our ABI). Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-03RDMA/core: Allow detaching gid attribute netdevice for RoCEParav Pandit1-4/+9
When there is active traffic through a GID, a QP/AH holds reference to this GID entry. RoCE GID entry holds reference to its attached netdevice. Due to this when netdevice is deleted by admin user, its refcount is not dropped. Therefore, while deleting RoCE GID, wait for all GID attribute's netdev users to finish accessing netdev in rcu context. Once all users done accessing it, release the netdev refcount. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-03RDMA/core: Do not invoke init_port on compat devicesParav Pandit1-8/+8
The driver interface cannot manipulate the sysfs of the compat device, only of the full device so we must avoid calling the driver sysfs APIs on compat devices. This prevents an oops: Call Trace: dump_stack+0x5a/0x73 kobject_init+0x74/0x80 kobject_init_and_add+0x35/0xb0 hfi1_create_port_files+0x6e/0x3c0 [hfi1] ib_setup_port_attrs+0x43b/0x560 [ib_core] add_one_compat_dev+0x16a/0x230 [ib_core] rdma_dev_init_net+0x110/0x160 [ib_core] ops_init+0x38/0xf0 setup_net+0xcf/0x1e0 copy_net_ns+0xb7/0x130 create_new_namespaces+0x11a/0x1b0 unshare_nsproxy_namespaces+0x55/0xa0 ksys_unshare+0x1a7/0x340 __x64_sys_unshare+0xe/0x20 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 5417783eabb2 ("RDMA/core: Support core port attributes in non init_net") Reported-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08RDMA/cm: Move debug counters to be under relevant IB deviceLeon Romanovsky1-0/+43
The sysfs layout is created by CM incorrectly presented RDMA devices with InfiniBand link layer. Layout of such devices represents device tree of connections. By moving CM statistics to be under relevant port of IB device, we will fix the following issues: * Symlink name - It used device name instead of specific identifier. * Target location - It was supposed to point to PCI-ID/infiniband_cm/ instead of PCI-ID/infiniband/ * Target name - It created extra device file under already existing device folder, e.g. mlx5_0/mlx5_0 * Crash during boot with RDMA persistent naming patches. sysfs: cannot create duplicate filename '/class/infiniband_cm/mlx5_0' CPU: 29 PID: 433 Comm: modprobe Not tainted 5.0.0-rc5+ #178 Call Trace: dump_stack+0xcc/0x180 sysfs_warn_dup.cold.3+0x17/0x2d sysfs_do_create_link_sd.isra.2+0xd0/0xf0 device_add+0x7cb/0x1450 device_create_groups_vargs+0x1ae/0x220 device_create+0x93/0xc0 cm_add_one+0x38f/0xf60 [ib_cm] add_client_context+0x167/0x210 [ib_core] enable_device_and_get+0x230/0x3f0 [ib_core] ib_register_device+0x823/0xbf0 [ib_core] __mlx5_ib_add+0x45/0x150 [mlx5_ib] mlx5_ib_add+0x1b3/0x5e0 [mlx5_ib] mlx5_add_device+0x130/0x3a0 [mlx5_core] mlx5_register_interface+0x1a9/0x270 [mlx5_core] do_one_initcall+0x14f/0x5de do_init_module+0x247/0x7c0 load_module+0x4c2f/0x60d0 entry_SYSCALL_64_after_hwframe+0x49/0xbe After this change: [leonro@server ~]$ ls -al /sys/class/infiniband/ibp0s12f0/ports/1/ drwxr-xr-x 2 root root 0 Mar 11 11:17 cm_rx_duplicates drwxr-xr-x 2 root root 0 Mar 11 11:17 cm_rx_msgs drwxr-xr-x 2 root root 0 Mar 11 11:17 cm_tx_msgs drwxr-xr-x 2 root root 0 Mar 11 11:17 cm_tx_retries Fixes: 110cf374a809 ("infiniband: make cm_device use a struct device and not a kobject.") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/core: Support core port attributes in non init_netParav Pandit1-7/+8
Now that sysfs compatibility layer for non init_net exists, add core port attributes such as pkey and gid table to non init_net ns. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/core: Introduce ib_core_device to hold deviceParav Pandit1-13/+16
In order to support sysfs entries in multiple net namespaces for a rdma device, introduce a ib_core_device whose scope is limited to hold core device and per port sysfs related entries. This is preparation patch so that multiple ib_core_devices in each net namespace can be created in subsequent patch who all can share ib_device. (a) Move sysfs specific fields to ib_core_device. (b) Make sysfs and device life cycle related routines to work on ib_core_device. (c) Introduce and use rdma_init_coredev() helper to initialize coredev fields. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-19RDMA: Add and use rdma_for_each_portJason Gunthorpe1-9/+3
We have many loops iterating over all of the end port numbers on a struct ib_device, simplify them with a for_each helper. Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-15RDMA/core: Move device addition deletion to device.cParav Pandit1-12/+2
Move core device addition and removal from sysfs.c to device.c as device.c is more appropriate place for device management. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-15RDMA/core: Introduce and use ib_setup_port_attrs()Parav Pandit1-29/+35
Refactor code for device and port sysfs attributes for reuse. While at it, rename counter part free function to ib_free_port_attrs. Also attribute setup sequence is: (a) port specific init. (b) device stats alloc/init. So for cleanup, follow reverse sequence: (a) device stats dealloc (b) port specific cleanup Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-15RDMA/core: Use simpler device_del() instead of device_unregister()Parav Pandit1-5/+2
Instead of holding extra reference using get_device() that device_unregister() releases, simplify it as below. device_add() balances with device_del(). device_initialize() balances with put_device(), always via ib_dealloc_device(). Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-14RDMA: Introduce and use rdma_device_to_ibdev()Parav Pandit1-6/+6
Introduce and use rdma_device_to_ibdev() API for those drivers which are registering one sysfs group and also use in ib_core. In subsequent patch, device->provider_ibdev one-to-one mapping is no longer holds true during accessing sysfs entries. Therefore, introduce an API rdma_device_to_ibdev() that provides such information. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-14RDMA: Rename port_callback to init_portParav Pandit1-10/+6
Most provider routines are callback routines which ib core invokes. _callback suffix doesn't convey information about when such callback is invoked. Therefore, rename port_callback to init_port. Additionally, store the init_port function pointer in ib_device_ops, so that it can be accessed in subsequent patches when binding rdma device to net namespace. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-12RDMA: Start use ib_device_opsKamal Heib1-14/+14
Make all the required change to start use the ib_device_ops structure. Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-17RDMA/core: Fix comment for hw stats init for port == 0Parav Pandit1-3/+3
When add_port() is done for port == 0, it indicates that ports hardware counters initialization should be skipped. Reflect so in the comment. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-16RDMA/core: Rename ports_parent to ports_kobjParav Pandit1-5/+4
Normally kobj objects have kobj suffix to reflect it. Rename ports_parent to ports_kobj. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-16RDMA/core: Do not expose unsupported countersParav Pandit1-7/+12
If the provider driver (such as rdma_rxe) doesn't support pma counters, avoid exposing its directory similar to optional hw_counters directory. If core fails to read the PMA counter, return an error so that user can retry later if needed. Fixes: 35c4cbb17811 ("IB/core: Create get_perf_mad function in sysfs.c") Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com> Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-26RDMA: Fully setup the device name in ib_register_deviceJason Gunthorpe1-4/+0
The current code has two copies of the device name, ibdev->dev and dev_name(&ibdev->dev), and they are setup at different times, which is very confusing. Set them both up at the same time and make dev_name() the lead name, which is the proper use of the driver core APIs. To make it very clear that the name is not valid until registration pass it in to the ib_register_device() call rather than messing with ibdev->name directly. Also the reorganization now checks that dev_name is unique even if it does not contain a %. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Acked-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Acked-by: Devesh Sharma <devesh.sharma@broadcom.com> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
2018-09-05Merge branch 'uverbs_dev_cleanups' into rdma.git for-nextJason Gunthorpe1-34/+27
For dependencies, branch based on rdma.git 'for-rc' of https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/ Pull 'uverbs_dev_cleanups' from Leon Romanovsky: ==================== Reuse the char device code interfaces to simplify ib_uverbs_device creation and destruction. As part of this series, we are sending fix to cleanup path, which was discovered during internal review, The fix definitely can go to -rc, but it means that this series will be dependent on rdma-rc. ==================== * branch 'uverbs_dev_cleanups': RDMA/uverbs: Use device.groups to initialize device attributes RDMA/uverbs: Use cdev_device_add() instead of cdev_add() RDMA/core: Depend on device_add() to add device attributes RDMA/uverbs: Fix error cleanup path of ib_uverbs_add_one() Resolved conflict in ib_device_unregister_sysfs() Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-05RDMA/core: Depend on device_add() to add device attributesParav Pandit1-34/+27
Instead of adding/removing device attribute files, depend on device_add() which considers adding these device files based on NULL terminated attributes group array. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-05RDMA/core: Replace open-coded variant of get_deviceParav Pandit1-2/+2
Reuse existing get_device() API to do it symmetric to already used put_device() in commit 924b8900a49d ("RDMA/core: Replace open-coded variant of put_device") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-18IB/core: Replace ib_query_gid with rdma_get_gid_attrParav Pandit1-35/+31
These call sites have a use of ib_query_gid with a simple lifetime for the struct gid_attr pointer, with an easy conversion. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-03IB/core: Refactor GID modify code for RoCEParav Pandit1-3/+15
Code is refactored to prepare separate functions for RoCE which can do more complex operations related to reference counting, while still maintainining code readability. This includes (a) Simplification to not perform netdevice checks and modifications for IB link layer. (b) Do not add RoCE GID entry which has NULL netdevice; instead return an error. (c) If GID addition fails at provider level add_gid(), do not add the entry in the cache and keep the entry marked as INVALID. (d) Simplify and reuse the ib_cache_gid_add()/del() routines so that they can be used even for modifying default GIDs. This avoid some code duplication in modifying default GIDs. (e) find_gid() routine refers to the data entry flags to qualify a GID as valid or invalid GID rather than depending on attributes and zeroness of the GID content. (f) gid_table_reserve_default() sets the GID default attribute at beginning while setting up the GID table. There is no need to use default_gid flag in low level functions such as write_gid(), add_gid(), del_gid(), as they never need to update the DEFAULT property of the GID entry while during GID table update. As as result of this refactor, reserved GID 0:0:0:0:0:0:0:0 is no longer searchable as described below. A unicast GID entry of 0:0:0:0:0:0:0:0 is Reserved GID as per the IB spec version 1.3 section 4.1.1, point (6) whose snippet is below. "The unicast GID address 0:0:0:0:0:0:0:0 is reserved - referred to as the Reserved GID. It shall never be assigned to any endport. It shall not be used as a destination address or in a global routing header (GRH)." GID table cache now only stores valid GID entries. Before this patch, Reserved GID 0:0:0:0:0:0:0:0 was searchable in the GID table using ib_find_cached_gid_by_port() and other similar find routines. Zero GID is no longer searchable as it shall not to be present in GRH or path recored entry as described in IB spec version 1.3 section 4.1.1, point (6), section 12.7.10 and section 12.7.20. ib_cache_update() is simplified to check link layer once, use unified locking scheme for all link layers, removed temporary gid table allocation/free logic. Additionally, (a) Expand ib_gid_attr to store port and index so that GID query routines can get port and index information from the attribute structure. (b) Expand ib_gid_attr to store device as well so that in future code when GID reference counting is done, device is used to reach back to the GID table entry. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-27IB/core: Protect against concurrent access to hardware statsMark Bloch1-6/+28
Currently access to hardware stats buffer isn't protected, this can result in multiple writes and reads at the same time to the same memory location. This can lead to providing an incorrect value to the user. Add a mutex to protect against it. Fixes: b40f4757daa1 ("IB/core: Make device counter infrastructure dynamic") Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-19IB/core: Set speed string to SDR for invalid active ratesHonggang Li1-0/+1
Before commit f1b65df5a232 ("IB/mlx5: Add support for active_width and active_speed in RoCE"), the mlx5_ib driver set default active_width and active_speed to IB_WIDTH_4X and IB_SPEED_QDR. Now, the active_width and active_speed are zeros if the RoCE port is in DOWN state. The speed string should be set to " SDR" instead of a blank string when active_speed is zero. Signed-off-by: Honggang Li <honli@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-03IB/core: Fix two kernel warnings triggered by rxe registrationBart Van Assche1-1/+0
Eliminate the WARN_ONs that create following two warnings when registering an rxe device: WARNING: CPU: 2 PID: 1005 at drivers/infiniband/core/device.c:449 ib_register_device+0x591/0x640 [ib_core] CPU: 2 PID: 1005 Comm: run_tests Not tainted 4.15.0-rc4-dbg+ #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:ib_register_device+0x591/0x640 [ib_core] Call Trace: rxe_register_device+0x3c6/0x470 [rdma_rxe] rxe_add+0x543/0x5e0 [rdma_rxe] rxe_net_add+0x37/0xb0 [rdma_rxe] rxe_param_set_add+0x5a/0x120 [rdma_rxe] param_attr_store+0x5e/0xc0 module_attr_store+0x19/0x30 sysfs_kf_write+0x3d/0x50 kernfs_fop_write+0x116/0x1a0 __vfs_write+0x23/0x120 vfs_write+0xbe/0x1b0 SyS_write+0x44/0xa0 entry_SYSCALL_64_fastpath+0x23/0x9a WARNING: CPU: 2 PID: 1005 at drivers/infiniband/core/sysfs.c:1279 ib_device_register_sysfs+0x11d/0x160 [ib_core] CPU: 2 PID: 1005 Comm: run_tests Tainted: G W 4.15.0-rc4-dbg+ #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:ib_device_register_sysfs+0x11d/0x160 [ib_core] Call Trace: ib_register_device+0x3f7/0x640 [ib_core] rxe_register_device+0x3c6/0x470 [rdma_rxe] rxe_add+0x543/0x5e0 [rdma_rxe] rxe_net_add+0x37/0xb0 [rdma_rxe] rxe_param_set_add+0x5a/0x120 [rdma_rxe] param_attr_store+0x5e/0xc0 module_attr_store+0x19/0x30 sysfs_kf_write+0x3d/0x50 kernfs_fop_write+0x116/0x1a0 __vfs_write+0x23/0x120 vfs_write+0xbe/0x1b0 SyS_write+0x44/0xa0 entry_SYSCALL_64_fastpath+0x23/0x9a The code should accept either a parent pointer or a fully specified DMA specification without producing warnings. Fixes: 99db9494035f ("IB/core: Remove ib_device.dma_device") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: stable@vger.kernel.org # v4.11 Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-10-18IB/core: Fix unable to change lifespan entry for hw_countersParav Pandit1-1/+15
This patch fixes the case where 'lifespan' entry of the hw_counters is not writable. Currently write callback is not exposed for for the hw_counters sysfs operation. Due to this, modifying lifespan value results into permission denied error in below example. echo 10 > /sys/class/infiniband/mlx5_0/ports/1/hw_counters/lifespan -bash: /sys/class/infiniband/mlx5_0/ports/1/hw_counters/lifespan: Permission denied This patch adds the hook to modify any attribute which implements store() operation. Fixes: b40f4757daa1 ("IB/core: Make device counter infrastructure dynamic") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-10RDMA: Simplify get firmware interfaceLeon Romanovsky1-2/+2
There is a need to forward FW version to user space application through RDMA netlink. In order to make it safe, there is need to declare nla_policy and limit the size of FW string. The new define IB_FW_VERSION_NAME_MAX will limit the size of FW version string. That define was chosen to be equal to ETHTOOL_FWVERS_LEN, because many drivers anyway are limited by that value indirectly. The introduction of this define allows us to remove the string size from get_fw_str function signature. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2017-04-21IB/core: Add HDR speed enumNoa Osherovich1-0/+4
Add high data rate speed to the ib_port_speed enumeration. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21IB/core: Fix sysfs registration error flowJack Morgenstein1-1/+1
The kernel commit cited below restructured ib device management so that the device kobject is initialized in ib_alloc_device. As part of the restructuring, the kobject is now initialized in procedure ib_alloc_device, and is later added to the device hierarchy in the ib_register_device call stack, in procedure ib_device_register_sysfs (which calls device_add). However, in the ib_device_register_sysfs error flow, if an error occurs following the call to device_add, the cleanup procedure device_unregister is called. This call results in the device object being deleted -- which results in various use-after-free crashes. The correct cleanup call is device_del -- which undoes device_add without deleting the device object. The device object will then (correctly) be deleted in the ib_register_device caller's error cleanup flow, when the caller invokes ib_dealloc_device. Fixes: 55aeed06544f6 ("IB/core: Make ib_alloc_device init the kobject") Cc: <stable@vger.kernel.org> # v4.2+ Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24IB/core: Initialize ib_device.dev.parent earlierBart Van Assche1-1/+1
Move the ib_device.dev.parent initialization code from ib_device_register_sysfs() to ib_register_device(). Additionally, allow HBA drivers to set ib_device.dev.parent without setting ib_device.dma_device. This is the first step towards removing ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07IB/{core,hw}: Add constant for node_descYuval Shaia1-1/+1
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-04Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdmaLinus Torvalds1-1/+14
Pull base rdma updates from Doug Ledford: "Round one of 4.8 code: while this is mostly normal, there is a new driver in here (the driver was hosted outside the kernel for several years and is actually a fairly mature and well coded driver). It amounts to 13,000 of the 16,000 lines of added code in here. Summary: - Updates/fixes for iw_cxgb4 driver - Updates/fixes for mlx5 driver - Add flow steering and RSS API - Add hardware stats to mlx4 and mlx5 drivers - Add firmware version API for RDMA driver use - Add the rxe driver (this is a software RoCE driver that makes any Ethernet device a RoCE device) - Fixes for i40iw driver - Support for send only multicast joins in the cma layer - Other minor fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (72 commits) Soft RoCE driver IB/core: Support for CMA multicast join flags IB/sa: Add cached attribute containing SM information to SA port IB/uverbs: Fix race between uverbs_close and remove_one IB/mthca: Clean up error unwind flow in mthca_reset() IB/mthca: NULL arg to pci_dev_put is OK IB/hfi1: NULL arg to sc_return_credits is OK IB/mlx4: Add diagnostic hardware counters net/mlx4: Query performance and diagnostics counters net/mlx4: Add diagnostic counters capability bit Use smaller 512 byte messages for portmapper messages IB/ipoib: Report SG feature regardless of HW UD CSUM capability IB/mlx4: Don't use GFP_ATOMIC for CQ resize struct IB/hfi1: Disable by default IB/rdmavt: Disable by default IB/mlx5: Fix port counter ID association to QP offset IB/mlx5: Fix iteration overrun in GSI qps i40iw: Add NULL check for puda buffer i40iw: Change dup_ack_thresh to u8 i40iw: Remove unnecessary check for moving CQ head ...
2016-07-12IB core: Add port_xmit_wait counterChristoph Lameter1-0/+4
Add the missing port_xmit_wait counter. This counter is displayed through some tools like perfquery but is not available via sysfs. For the PORT_PMA_ATTR macro the _counter field is set to zero allowing us to specify the offset directly like with PORT_PMA_ATTR_EXT See also the earlier work in 2008 by Vladimir Skolovsky https://www.mail-archive.com/general@lists.openfabrics.org/msg20313.html Signed-off-by: Vladimir Sokolvsky <vlad@mellanox.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23IB/core: Export a common fw_ver sysfs entryIra Weiny1-1/+14
Now that all the devices have stopped exporting their own sysfs entry points we can have the core export this on their behalf. Eventually this may be removed but this provides for backwards compatibility. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-07IB/core: Initialize sysfs attributes before sysfs create groupMark Bloch1-0/+3
For dynamically allocated sysfs attributes there is a need to call sysfs_attr_init in order to comply with lockdep, not calling it will result in error complaining key is not in .data section. Fixes: b40f4757daa1 ("IB/core: Make device counter infrastructure dynamic") Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-07IB/core: fix error unwind in sysfs hw counters codeDoug Ledford1-3/+4
Between the initial and final versions of the function setup_hw_stats, the order of variable initialization was changed. However, the unwind flow on error did not properly keep up with the flow changes. Make the unwind flow match a proper unwind of the allocation flow, then remove no longer needed variable initializations. Fixes: b40f4757daa1 (IB/core: Make device counter infrastructure dynamic) Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-07IB/core: Fix array length allocationDoug Ledford1-2/+5
The new sysfs hw_counters code had an off by one in its array allocation length. Fix that and the comment along with it. Reported-by: Mark Bloch <markb@mellanox.com> Fixes: b40f4757daa1 (IB/core: Make device counter infrastructure dynamic) Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-06IB/core: fix null pointer deref and mem leak in error handlingColin Ian King1-3/+4
The current error handling in setup_hw_stats has a couple of issues. It is possible to generate a null pointer deference on the kfree of hsag->attrs[i] because two of the early error exit paths jump to the kfree when hsags NULL and not allocated. Fix this by moving the kfree on stats and jumping to that, avoiding the hsag freeing. Secondly, there is a memory leak of stats if the hsag allocation fails; instead of returning, jump to the kfree on stats. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-26IB/core: Make device counter infrastructure dynamicChristoph Lameter1-124/+242
In practice, each RDMA device has a unique set of counters that the hardware implements. Having a central set of counters that they must all adhere to is limiting and causes many useful counters to not be available. Therefore we create a dynamic counter registration infrastructure. The driver must implement a stats structure allocation routine, in which the driver must place the directory name it wants, a list of names for all of the counters, an array of u64 counters themselves, plus a few generic configuration options. We then implement a core routine to create a sysfs file for each of the named stats elements, and a core routine to retrieve the stats when any of the sysfs attribute files are read. To avoid excessive beating on the stats generation routine in the drivers, the core code also caches the stats for a short period of time so that someone attempting to read all of the stats in a given device's directory will not result in a stats generation call per file read. Future work will attempt to standardize just the shared stats elements, and possibly add a method to get the stats via netlink in addition to sysfs. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> [ Add caching, make structure names more informative, add i40iw support, other significant rewrites from the original patch ]
2016-02-11IB/core: Fix reading capability mask of the port info classEran Ben Elisha1-3/+2
When checking specific attribute from a bit mask, need to use bitwise AND and not logical AND, fixed that. Fixes: 145d9c541032 ('IB/core: Display extended counter set if available') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Christoph Lameter <cl@linux.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-04IB/sysfs: remove unused va_list argsColin Ian King1-2/+0
_show_port_gid_attr performs a va_end on some unused va_list args. Clean this up by removing the args completely. Fixes: 470be516a226e8 ("IB/core: Add gid attributes to sysfs") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19IB/core: sysfs.c: Fix PerfMgt ClassPortInfo handlingHal Rosenstock1-4/+6
Port number is not part of ClassPortInfo attribute but is still needed as a parameter when invoking process_mad. To properly handle this attribute, port_num is added as a parameter to get_counter_table and get_perf_mad was changed not to store port_num in the attribute itself when it's querying the ClassPortInfo attribute. This handles issue pointed out by Matan Barak <matanb@dev.mellanox.co.il> Fixes: 145d9c541032 ('IB/core: Display extended counter set if available') Signed-off-by: Hal Rosenstock <hal@mellanox.com> Acked-by: Matan Barak <matanb@mellanox.com> Acked-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Christoph Lameter <cl@linux.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19IB/sysfs: Fix sparse warning on attr_idIra Weiny1-2/+2
Attributed ID was declared as an int while the value should really be big endian 16. Fixes: 35c4cbb17811 ("IB/core: Create get_perf_mad function in sysfs.c") Reported-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Lameter <cl@linux.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/core: Display extended counter set if availableChristoph Lameter1-3/+107
Check if the extended counters are available and if so create the proper extended and additional counters. Signed-off-by: Christoph Lameter <cl@linux.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>