aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/core/ucma.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-06-18RDMA: Report available cdevs through RDMA_NLDEV_CMD_GET_CHARDEVJason Gunthorpe1-0/+23
Update the struct ib_client for all modules exporting cdevs related to the ibdevice to also implement RDMA_NLDEV_CMD_GET_CHARDEV. All cdevs are now autoloadable and discoverable by userspace over netlink instead of relying on sysfs. uverbs also exposes the DRIVER_ID for drivers that are able to support driver id binding in rdma-core. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-06-07ucma: Convert ctx_idr to XArrayMatthew Wilcox1-34/+24
Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-07ucma: Convert multicast_idr to XArrayMatthew Wilcox1-19/+11
Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-07RDMA/ucma: Use struct_size() helperGustavo A. R. Silva1-2/+1
Make use of the struct_size() helper instead of an open-coded version in order to avoid any potential type mistakes. This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-06*: convert stream-like files from nonseekable_open -> stream_openKirill Smelkov1-1/+1
Using scripts/coccinelle/api/stream_open.cocci added in 10dce8af3422 ("fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock"), search and convert to stream_open all in-kernel nonseekable_open users for which read and write actually do not depend on ppos and where there is no other methods in file_operations which assume @offset access. I've verified each generated change manually - that it is correct to convert - and each other nonseekable_open instance left - that it is either not correct to convert there, or that it is not converted due to current stream_open.cocci limitations. The script also does not convert files that should be valid to convert, but that currently have .llseek = noop_llseek or generic_file_llseek for unknown reason despite file being opened with nonseekable_open (e.g. drivers/input/mousedev.c) Among cases converted 14 were potentially vulnerable to read vs write deadlock (see details in 10dce8af3422): drivers/char/pcmcia/cm4000_cs.c:1685:7-23: ERROR: cm4000_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix. drivers/gnss/core.c:45:1-17: ERROR: gnss_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix. drivers/hid/uhid.c:635:1-17: ERROR: uhid_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix. drivers/infiniband/core/user_mad.c:988:1-17: ERROR: umad_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix. drivers/input/evdev.c:527:1-17: ERROR: evdev_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix. drivers/input/misc/uinput.c:401:1-17: ERROR: uinput_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix. drivers/isdn/capi/capi.c:963:8-24: ERROR: capi_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix. drivers/leds/uleds.c:77:1-17: ERROR: uleds_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix. drivers/media/rc/lirc_dev.c:198:1-17: ERROR: lirc_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix. drivers/s390/char/fs3270.c:488:1-17: ERROR: fs3270_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix. drivers/usb/misc/ldusb.c:310:1-17: ERROR: ld_usb_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix. drivers/xen/evtchn.c:667:8-24: ERROR: evtchn_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix. net/batman-adv/icmp_socket.c:80:1-17: ERROR: batadv_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix. net/rfkill/core.c:1146:8-24: ERROR: rfkill_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix. and the rest were just safe to convert to stream_open because their read and write do not use ppos at all and corresponding file_operations do not have methods that assume @offset file access(*): arch/powerpc/platforms/52xx/mpc52xx_gpt.c:631:8-24: WARNING: mpc52xx_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. arch/powerpc/platforms/cell/spufs/file.c:591:8-24: WARNING: spufs_ibox_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. arch/powerpc/platforms/cell/spufs/file.c:591:8-24: WARNING: spufs_ibox_stat_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. arch/powerpc/platforms/cell/spufs/file.c:591:8-24: WARNING: spufs_mbox_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. arch/powerpc/platforms/cell/spufs/file.c:591:8-24: WARNING: spufs_mbox_stat_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. arch/powerpc/platforms/cell/spufs/file.c:591:8-24: WARNING: spufs_wbox_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. arch/powerpc/platforms/cell/spufs/file.c:591:8-24: WARNING: spufs_wbox_stat_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. arch/um/drivers/harddog_kern.c:88:8-24: WARNING: harddog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. arch/x86/kernel/cpu/microcode/core.c:430:33-49: WARNING: microcode_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/char/ds1620.c:215:8-24: WARNING: ds1620_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/char/dtlk.c:301:1-17: WARNING: dtlk_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open. drivers/char/ipmi/ipmi_watchdog.c:840:9-25: WARNING: ipmi_wdog_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open. drivers/char/pcmcia/scr24x_cs.c:95:8-24: WARNING: scr24x_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open. drivers/char/tb0219.c:246:9-25: WARNING: tb0219_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open. drivers/firewire/nosy.c:306:8-24: WARNING: nosy_ops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/hwmon/fschmd.c:840:8-24: WARNING: watchdog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/hwmon/w83793.c:1344:8-24: WARNING: watchdog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/infiniband/core/ucma.c:1747:8-24: WARNING: ucma_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/infiniband/core/ucm.c:1178:8-24: WARNING: ucm_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/infiniband/core/uverbs_main.c:1086:8-24: WARNING: uverbs_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/input/joydev.c:282:1-17: WARNING: joydev_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/pci/switch/switchtec.c:393:1-17: WARNING: switchtec_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open. drivers/platform/chrome/cros_ec_debugfs.c:135:8-24: WARNING: cros_ec_console_log_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/rtc/rtc-ds1374.c:470:9-25: WARNING: ds1374_wdt_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open. drivers/rtc/rtc-m41t80.c:805:9-25: WARNING: wdt_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open. drivers/s390/char/tape_char.c:293:2-18: WARNING: tape_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open. drivers/s390/char/zcore.c:194:8-24: WARNING: zcore_reipl_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/s390/crypto/zcrypt_api.c:528:8-24: WARNING: zcrypt_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open. drivers/spi/spidev.c:594:1-17: WARNING: spidev_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open. drivers/staging/pi433/pi433_if.c:974:1-17: WARNING: pi433_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/acquirewdt.c:203:8-24: WARNING: acq_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/advantechwdt.c:202:8-24: WARNING: advwdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/alim1535_wdt.c:252:8-24: WARNING: ali_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/alim7101_wdt.c:217:8-24: WARNING: wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/ar7_wdt.c:166:8-24: WARNING: ar7_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/at91rm9200_wdt.c:113:8-24: WARNING: at91wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/ath79_wdt.c:135:8-24: WARNING: ath79_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/bcm63xx_wdt.c:119:8-24: WARNING: bcm63xx_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/cpu5wdt.c:143:8-24: WARNING: cpu5wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/cpwd.c:397:8-24: WARNING: cpwd_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/eurotechwdt.c:319:8-24: WARNING: eurwdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/f71808e_wdt.c:528:8-24: WARNING: watchdog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/gef_wdt.c:232:8-24: WARNING: gef_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/geodewdt.c:95:8-24: WARNING: geodewdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/ib700wdt.c:241:8-24: WARNING: ibwdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/ibmasr.c:326:8-24: WARNING: asr_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/indydog.c:80:8-24: WARNING: indydog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/intel_scu_watchdog.c:307:8-24: WARNING: intel_scu_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/iop_wdt.c:104:8-24: WARNING: iop_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/it8712f_wdt.c:330:8-24: WARNING: it8712f_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/ixp4xx_wdt.c:68:8-24: WARNING: ixp4xx_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/ks8695_wdt.c:145:8-24: WARNING: ks8695wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/m54xx_wdt.c:88:8-24: WARNING: m54xx_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/machzwd.c:336:8-24: WARNING: zf_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/mixcomwd.c:153:8-24: WARNING: mixcomwd_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/mtx-1_wdt.c:121:8-24: WARNING: mtx1_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/mv64x60_wdt.c:136:8-24: WARNING: mv64x60_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/nuc900_wdt.c:134:8-24: WARNING: nuc900wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/nv_tco.c:164:8-24: WARNING: nv_tco_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/pc87413_wdt.c:289:8-24: WARNING: pc87413_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/pcwd.c:698:8-24: WARNING: pcwd_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/pcwd.c:737:8-24: WARNING: pcwd_temp_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/pcwd_pci.c:581:8-24: WARNING: pcipcwd_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/pcwd_pci.c:623:8-24: WARNING: pcipcwd_temp_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/pcwd_usb.c:488:8-24: WARNING: usb_pcwd_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/pcwd_usb.c:527:8-24: WARNING: usb_pcwd_temperature_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/pika_wdt.c:121:8-24: WARNING: pikawdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/pnx833x_wdt.c:119:8-24: WARNING: pnx833x_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/rc32434_wdt.c:153:8-24: WARNING: rc32434_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/rdc321x_wdt.c:145:8-24: WARNING: rdc321x_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/riowd.c:79:1-17: WARNING: riowd_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/sa1100_wdt.c:62:8-24: WARNING: sa1100dog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/sbc60xxwdt.c:211:8-24: WARNING: wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/sbc7240_wdt.c:139:8-24: WARNING: wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/sbc8360.c:274:8-24: WARNING: sbc8360_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/sbc_epx_c3.c:81:8-24: WARNING: epx_c3_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/sbc_fitpc2_wdt.c:78:8-24: WARNING: fitpc2_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/sb_wdog.c:108:1-17: WARNING: sbwdog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/sc1200wdt.c:181:8-24: WARNING: sc1200wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/sc520_wdt.c:261:8-24: WARNING: wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/sch311x_wdt.c:319:8-24: WARNING: sch311x_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/scx200_wdt.c:105:8-24: WARNING: scx200_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/smsc37b787_wdt.c:369:8-24: WARNING: wb_smsc_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/w83877f_wdt.c:227:8-24: WARNING: wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/w83977f_wdt.c:301:8-24: WARNING: wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/wafer5823wdt.c:200:8-24: WARNING: wafwdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/watchdog_dev.c:828:8-24: WARNING: watchdog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/wdrtas.c:379:8-24: WARNING: wdrtas_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/wdrtas.c:445:8-24: WARNING: wdrtas_temp_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/wdt285.c:104:1-17: WARNING: watchdog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/wdt977.c:276:8-24: WARNING: wdt977_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/wdt.c:424:8-24: WARNING: wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/wdt.c:484:8-24: WARNING: wdt_temp_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/wdt_pci.c:464:8-24: WARNING: wdtpci_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/watchdog/wdt_pci.c:527:8-24: WARNING: wdtpci_temp_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. net/batman-adv/log.c:105:1-17: WARNING: batadv_log_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. sound/core/control.c:57:7-23: WARNING: snd_ctl_f_ops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. sound/core/rawmidi.c:385:7-23: WARNING: snd_rawmidi_f_ops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open. sound/core/seq/seq_clientmgr.c:310:7-23: WARNING: snd_seq_f_ops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open. sound/core/timer.c:1428:7-23: WARNING: snd_timer_f_ops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. One can also recheck/review the patch via generating it with explanation comments included via $ make coccicheck MODE=patch COCCI=scripts/coccinelle/api/stream_open.cocci SPFLAGS="-D explain" (*) This second group also contains cases with read/write deadlocks that stream_open.cocci don't yet detect, but which are still valid to convert to stream_open since ppos is not used. For example drivers/pci/switch/switchtec.c calls wait_for_completion_interruptible() in its .read, but stream_open.cocci currently detects only "wait_event*" as blocking. Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Yongzhi Pan <panyongzhi@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Juergen Gross <jgross@suse.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Tejun Heo <tj@kernel.org> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Nikolaus Rath <Nikolaus@rath.org> Cc: Han-Wen Nienhuys <hanwen@google.com> Cc: Anatolij Gustschin <agust@denx.de> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James R. Van Zandt" <jrv@vanzandt.mv.com> Cc: Corey Minyard <minyard@acm.org> Cc: Harald Welte <laforge@gnumonks.org> Acked-by: Lubomir Rintel <lkundrak@v3.sk> [scr24x_cs] Cc: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: Johan Hovold <johan@kernel.org> Cc: David Herrmann <dh.herrmann@googlemail.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com> Cc: Jean Delvare <jdelvare@suse.com> Acked-by: Guenter Roeck <linux@roeck-us.net> [watchdog/* hwmon/*] Cc: Rudolf Marek <r.marek@assembler.cz> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Kurt Schwemmer <kurt.schwemmer@microsemi.com> Acked-by: Logan Gunthorpe <logang@deltatee.com> [drivers/pci/switch/switchtec] Acked-by: Bjorn Helgaas <bhelgaas@google.com> [drivers/pci/switch/switchtec] Cc: Benson Leung <bleung@chromium.org> Acked-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> [platform/chrome] Cc: Alessandro Zummo <a.zummo@towertech.it> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> [rtc/*] Cc: Mark Brown <broonie@kernel.org> Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: bcm-kernel-feedback-list@broadcom.com Cc: Wan ZongShun <mcuos.com@gmail.com> Cc: Zwane Mwaikambo <zwanem@gmail.com> Cc: Marek Lindner <mareklindner@neomailbox.ch> Cc: Simon Wunderlich <sw@simonwunderlich.de> Cc: Antonio Quartulli <a@unstable.cc> Cc: "David S. Miller" <davem@davemloft.net> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.com> Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
2019-02-08IB/cma: Define option to set ack timeout and pack tos_setDanit Goldberg1-0/+7
Define new option in 'rdma_set_option' to override calculated QP timeout when requested to provide QP attributes to modify a QP. At the same time, pack tos_set to be bitfield. Signed-off-by: Danit Goldberg <danitg@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16RDMA/ucma: Fix Spectre v1 vulnerabilityGustavo A. R. Silva1-0/+3
hdr.cmd can be indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: drivers/infiniband/core/ucma.c:1686 ucma_write() warn: potential spectre issue 'ucma_cmd_table' [r] (local cap) Fix this by sanitizing hdr.cmd before using it to index ucm_cmd_table. Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-13ucma: fix a use-after-free in ucma_resolve_ip()Cong Wang1-0/+2
There is a race condition between ucma_close() and ucma_resolve_ip(): CPU0 CPU1 ucma_resolve_ip(): ucma_close(): ctx = ucma_get_ctx(file, cmd.id); list_for_each_entry_safe(ctx, tmp, &file->ctx_list, list) { mutex_lock(&mut); idr_remove(&ctx_idr, ctx->id); mutex_unlock(&mut); ... mutex_lock(&mut); if (!ctx->closing) { mutex_unlock(&mut); rdma_destroy_id(ctx->cm_id); ... ucma_free_ctx(ctx); ret = rdma_resolve_addr(); ucma_put_ctx(ctx); Before idr_remove(), ucma_get_ctx() could still find the ctx and after rdma_destroy_id(), rdma_resolve_addr() may still access id_priv pointer. Also, ucma_put_ctx() may use ctx after ucma_free_ctx() too. ucma_close() should call ucma_put_ctx() too which tests the refcnt and waits for the last one releasing it. The similar pattern is already used by ucma_destroy_id(). Reported-and-tested-by: syzbot+da2591e115d57a9cbb8b@syzkaller.appspotmail.com Reported-by: syzbot+cfe3c1e8ef634ba8964b@syzkaller.appspotmail.com Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Leon Romanovsky <leon@kernel.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-04RDMA/ucma: check fd type in ucma_migrate_id()Jann Horn1-0/+6
The current code grabs the private_data of whatever file descriptor userspace has supplied and implicitly casts it to a `struct ucma_file *`, potentially causing a type confusion. This is probably fine in practice because the pointer is only used for comparisons, it is never actually dereferenced; and even in the comparisons, it is unlikely that a file from another filesystem would have a ->private_data pointer that happens to also be valid in this context. But ->private_data is not always guaranteed to be a valid pointer to an object owned by the file's filesystem; for example, some filesystems just cram numbers in there. Check the type of the supplied file descriptor to be safe, analogous to how other places in the kernel do it. Fixes: 88314e4dda1e ("RDMA/cma: add support for rdma_migrate_id()") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-04infiniband: fix a possible use-after-free bugCong Wang1-1/+5
ucma_process_join() will free the new allocated "mc" struct, if there is any error after that, especially the copy_to_user(). But in parallel, ucma_leave_multicast() could find this "mc" through idr_find() before ucma_process_join() frees it, since it is already published. So "mc" could be used in ucma_leave_multicast() after it is been allocated and freed in ucma_process_join(), since we don't refcnt it. Fix this by separating "publish" from ID allocation, so that we can get an ID first and publish it later after copy_to_user(). Fixes: c8f6a362bf3e ("RDMA/cma: Add multicast communication support") Reported-by: Noam Rathaus <noamr@beyondsecurity.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-23RDMA/ucma: Allow resolving address w/o specifying source addressRoland Dreier1-1/+1
The RDMA CM will select a source device and address by consulting the routing table if no source address is passed into rdma_resolve_address(). Userspace will ask for this by passing an all-zero source address in the RESOLVE_IP command. Unfortunately the new check for non-zero address size rejects this with EINVAL, which breaks valid userspace applications. Fix this by explicitly allowing a zero address family for the source. Fixes: 2975d5de6428 ("RDMA/ucma: Check AF family prior resolving address") Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-19RDMA/ucma: Check for a cm_id->device in all user calls that need itJason Gunthorpe1-12/+24
This is done by auditing all callers of ucma_get_ctx and switching the ones that unconditionally touch ->device to ucma_get_ctx_dev. This covers a little less than half of the call sites. The 11 remaining call sites to ucma_get_ctx() were manually audited. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-16RDMA/ucma: ucma_context reference leak in error pathShamir Rabinovitch1-3/+3
Validating input parameters should be done before getting the cm_id otherwise it can leak a cm_id reference. Fixes: 6a21dfc0d0db ("RDMA/ucma: Limit possible option size") Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-06Merge tag 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds1-16/+24
Pull rdma updates from Jason Gunthorpe: "Doug and I are at a conference next week so if another PR is sent I expect it to only be bug fixes. Parav noted yesterday that there are some fringe case behavior changes in his work that he would like to fix, and I see that Intel has a number of rc looking patches for HFI1 they posted yesterday. Parav is again the biggest contributor by patch count with his ongoing work to enable container support in the RDMA stack, followed by Leon doing syzkaller inspired cleanups, though most of the actual fixing went to RC. There is one uncomfortable series here fixing the user ABI to actually work as intended in 32 bit mode. There are lots of notes in the commit messages, but the basic summary is we don't think there is an actual 32 bit kernel user of drivers/infiniband for several good reasons. However we are seeing people want to use a 32 bit user space with 64 bit kernel, which didn't completely work today. So in fixing it we required a 32 bit rxe user to upgrade their userspace. rxe users are still already quite rare and we think a 32 bit one is non-existing. - Fix RDMA uapi headers to actually compile in userspace and be more complete - Three shared with netdev pull requests from Mellanox: * 7 patches, mostly to net with 1 IB related one at the back). This series addresses an IRQ performance issue (patch 1), cleanups related to the fix for the IRQ performance problem (patches 2-6), and then extends the fragmented completion queue support that already exists in the net side of the driver to the ib side of the driver (patch 7). * Mostly IB, with 5 patches to net that are needed to support the remaining 10 patches to the IB subsystem. This series extends the current 'representor' framework when the mlx5 driver is in switchdev mode from being a netdev only construct to being a netdev/IB dev construct. The IB dev is limited to raw Eth queue pairs only, but by having an IB dev of this type attached to the representor for a switchdev port, it enables DPDK to work on the switchdev device. * All net related, but needed as infrastructure for the rdma driver - Updates for the hns, i40iw, bnxt_re, cxgb3, cxgb4, hns drivers - SRP performance updates - IB uverbs write path cleanup patch series from Leon - Add RDMA_CM support to ib_srpt. This is disabled by default. Users need to set the port for ib_srpt to listen on in configfs in order for it to be enabled (/sys/kernel/config/target/srpt/discovery_auth/rdma_cm_port) - TSO and Scatter FCS support in mlx4 - Refactor of modify_qp routine to resolve problems seen while working on new code that is forthcoming - More refactoring and updates of RDMA CM for containers support from Parav - mlx5 'fine grained packet pacing', 'ipsec offload' and 'device memory' user API features - Infrastructure updates for the new IOCTL interface, based on increased usage - ABI compatibility bug fixes to fully support 32 bit userspace on 64 bit kernel as was originally intended. See the commit messages for extensive details - Syzkaller bugs and code cleanups motivated by them" * tag 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (199 commits) IB/rxe: Fix for oops in rxe_register_device on ppc64le arch IB/mlx5: Device memory mr registration support net/mlx5: Mkey creation command adjustments IB/mlx5: Device memory support in mlx5_ib net/mlx5: Query device memory capabilities IB/uverbs: Add device memory registration ioctl support IB/uverbs: Add alloc/free dm uverbs ioctl support IB/uverbs: Add device memory capabilities reporting IB/uverbs: Expose device memory capabilities to user RDMA/qedr: Fix wmb usage in qedr IB/rxe: Removed GID add/del dummy routines RDMA/qedr: Zero stack memory before copying to user space IB/mlx5: Add ability to hash by IPSEC_SPI when creating a TIR IB/mlx5: Add information for querying IPsec capabilities IB/mlx5: Add IPsec support for egress and ingress {net,IB}/mlx5: Add ipsec helper IB/mlx5: Add modify_flow_action_esp verb IB/mlx5: Add implementation for create and destroy action_xfrm IB/uverbs: Introduce ESP steering match filter IB/uverbs: Add modify ESP flow_action ...
2018-04-03RDMA/ucma: Don't allow setting RDMA_OPTION_IB_PATH without an RDMA deviceRoland Dreier1-0/+3
Check to make sure that ctx->cm_id->device is set before we use it. Otherwise userspace can trigger a NULL dereference by doing RDMA_USER_CM_CMD_SET_OPTION on an ID that is not bound to a device. Cc: <stable@vger.kernel.org> Reported-by: <syzbot+a67bc93e14682d92fc2f@syzkaller.appspotmail.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-29RDMA: Use u64_to_user_ptr everywhereJason Gunthorpe1-10/+10
This is already used in many places, get the rest of them too, only to make the code a bit clearer & simpler. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-28RDMA/ucma: Introduce safer rdma_addr_size() variantsRoland Dreier1-17/+17
There are several places in the ucma ABI where userspace can pass in a sockaddr but set the address family to AF_IB. When that happens, rdma_addr_size() will return a size bigger than sizeof struct sockaddr_in6, and the ucma kernel code might end up copying past the end of a buffer not sized for a struct sockaddr_ib. Fix this by introducing new variants int rdma_addr_size_in6(struct sockaddr_in6 *addr); int rdma_addr_size_kss(struct __kernel_sockaddr_storage *addr); that are type-safe for the types used in the ucma ABI and return 0 if the size computed is bigger than the size of the type passed in. We can use these new variants to check what size userspace has passed in before copying any addresses. Reported-by: <syzbot+6800425d54ed3ed8135d@syzkaller.appspotmail.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-27RDMA/ucma: Fix uABI structure layouts for 32/64 compatJason Gunthorpe1-2/+7
The rdma_ucm_event_resp is a different length on 32 and 64 bit compiles. The kernel requires it to be the expected length or longer so 32 bit builds running on a 64 bit kernel will not work. Retain full compat by having all kernels accept a struct with or without the trailing reserved field. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-27RDMA/ucma: Check that device exists prior to accessing itLeon Romanovsky1-2/+4
Ensure that device exists prior to accessing its properties. Reported-by: <syzbot+71655d44855ac3e76366@syzkaller.appspotmail.com> Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-27RDMA/ucma: Check that device is connected prior to access itLeon Romanovsky1-0/+5
Add missing check that device is connected prior to access it. [ 55.358652] BUG: KASAN: null-ptr-deref in rdma_init_qp_attr+0x4a/0x2c0 [ 55.359389] Read of size 8 at addr 00000000000000b0 by task qp/618 [ 55.360255] [ 55.360432] CPU: 1 PID: 618 Comm: qp Not tainted 4.16.0-rc1-00071-gcaf61b1b8b88 #91 [ 55.361693] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 [ 55.363264] Call Trace: [ 55.363833] dump_stack+0x5c/0x77 [ 55.364215] kasan_report+0x163/0x380 [ 55.364610] ? rdma_init_qp_attr+0x4a/0x2c0 [ 55.365238] rdma_init_qp_attr+0x4a/0x2c0 [ 55.366410] ucma_init_qp_attr+0x111/0x200 [ 55.366846] ? ucma_notify+0xf0/0xf0 [ 55.367405] ? _get_random_bytes+0xea/0x1b0 [ 55.367846] ? urandom_read+0x2f0/0x2f0 [ 55.368436] ? kmem_cache_alloc_trace+0xd2/0x1e0 [ 55.369104] ? refcount_inc_not_zero+0x9/0x60 [ 55.369583] ? refcount_inc+0x5/0x30 [ 55.370155] ? rdma_create_id+0x215/0x240 [ 55.370937] ? _copy_to_user+0x4f/0x60 [ 55.371620] ? mem_cgroup_commit_charge+0x1f5/0x290 [ 55.372127] ? _copy_from_user+0x5e/0x90 [ 55.372720] ucma_write+0x174/0x1f0 [ 55.373090] ? ucma_close_id+0x40/0x40 [ 55.373805] ? __lru_cache_add+0xa8/0xd0 [ 55.374403] __vfs_write+0xc4/0x350 [ 55.374774] ? kernel_read+0xa0/0xa0 [ 55.375173] ? fsnotify+0x899/0x8f0 [ 55.375544] ? fsnotify_unmount_inodes+0x170/0x170 [ 55.376689] ? __fsnotify_update_child_dentry_flags+0x30/0x30 [ 55.377522] ? handle_mm_fault+0x174/0x320 [ 55.378169] vfs_write+0xf7/0x280 [ 55.378864] SyS_write+0xa1/0x120 [ 55.379270] ? SyS_read+0x120/0x120 [ 55.379643] ? mm_fault_error+0x180/0x180 [ 55.380071] ? task_work_run+0x7d/0xd0 [ 55.380910] ? __task_pid_nr_ns+0x120/0x140 [ 55.381366] ? SyS_read+0x120/0x120 [ 55.381739] do_syscall_64+0xeb/0x250 [ 55.382143] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 55.382841] RIP: 0033:0x7fc2ef803e99 [ 55.383227] RSP: 002b:00007fffcc5f3be8 EFLAGS: 00000217 ORIG_RAX: 0000000000000001 [ 55.384173] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc2ef803e99 [ 55.386145] RDX: 0000000000000057 RSI: 0000000020000080 RDI: 0000000000000003 [ 55.388418] RBP: 00007fffcc5f3c00 R08: 0000000000000000 R09: 0000000000000000 [ 55.390542] R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000400480 [ 55.392916] R13: 00007fffcc5f3cf0 R14: 0000000000000000 R15: 0000000000000000 [ 55.521088] Code: e5 4d 1e ff 48 89 df 44 0f b6 b3 b8 01 00 00 e8 65 50 1e ff 4c 8b 2b 49 8d bd b0 00 00 00 e8 56 50 1e ff 41 0f b6 c6 48 c1 e0 04 <49> 03 85 b0 00 00 00 48 8d 78 08 48 89 04 24 e8 3a 4f 1e ff 48 [ 55.525980] RIP: rdma_init_qp_attr+0x52/0x2c0 RSP: ffff8801e2c2f9d8 [ 55.532648] CR2: 00000000000000b0 [ 55.534396] ---[ end trace 70cee64090251c0b ]--- Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace") Fixes: d541e45500bd ("IB/core: Convert ah_attr from OPA to IB when copying to user") Reported-by: <syzbot+7b62c837c2516f8f38c8@syzkaller.appspotmail.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-21RDMA/ucma: Correct option size check using optlenChien Tin Tung1-1/+1
The option size check is using optval instead of optlen causing the set option call to fail. Use the correct field, optlen, for size check. Fixes: 6a21dfc0d0db ("RDMA/ucma: Limit possible option size") Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-20RDMA/ucma: Ensure that CM_ID exists prior to access itLeon Romanovsky1-6/+9
Prior to access UCMA commands, the context should be initialized and connected to CM_ID with ucma_create_id(). In case user skips this step, he can provide non-valid ctx without CM_ID and cause to multiple NULL dereferences. Also there are situations where the create_id can be raced with other user access, ensure that the context is only shared to other threads once it is fully initialized to avoid the races. [ 109.088108] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [ 109.090315] IP: ucma_connect+0x138/0x1d0 [ 109.092595] PGD 80000001dc02d067 P4D 80000001dc02d067 PUD 1da9ef067 PMD 0 [ 109.095384] Oops: 0000 [#1] SMP KASAN PTI [ 109.097834] CPU: 0 PID: 663 Comm: uclose Tainted: G B 4.16.0-rc1-00062-g2975d5de6428 #45 [ 109.100816] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 [ 109.105943] RIP: 0010:ucma_connect+0x138/0x1d0 [ 109.108850] RSP: 0018:ffff8801c8567a80 EFLAGS: 00010246 [ 109.111484] RAX: 0000000000000000 RBX: 1ffff100390acf50 RCX: ffffffff9d7812e2 [ 109.114496] RDX: 1ffffffff3f507a5 RSI: 0000000000000297 RDI: 0000000000000297 [ 109.117490] RBP: ffff8801daa15600 R08: 0000000000000000 R09: ffffed00390aceeb [ 109.120429] R10: 0000000000000001 R11: ffffed00390aceea R12: 0000000000000000 [ 109.123318] R13: 0000000000000120 R14: ffff8801de6459c0 R15: 0000000000000118 [ 109.126221] FS: 00007fabb68d6700(0000) GS:ffff8801e5c00000(0000) knlGS:0000000000000000 [ 109.129468] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 109.132523] CR2: 0000000000000020 CR3: 00000001d45d8003 CR4: 00000000003606b0 [ 109.135573] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 109.138716] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 109.142057] Call Trace: [ 109.144160] ? ucma_listen+0x110/0x110 [ 109.146386] ? wake_up_q+0x59/0x90 [ 109.148853] ? futex_wake+0x10b/0x2a0 [ 109.151297] ? save_stack+0x89/0xb0 [ 109.153489] ? _copy_from_user+0x5e/0x90 [ 109.155500] ucma_write+0x174/0x1f0 [ 109.157933] ? ucma_resolve_route+0xf0/0xf0 [ 109.160389] ? __mod_node_page_state+0x1d/0x80 [ 109.162706] __vfs_write+0xc4/0x350 [ 109.164911] ? kernel_read+0xa0/0xa0 [ 109.167121] ? path_openat+0x1b10/0x1b10 [ 109.169355] ? fsnotify+0x899/0x8f0 [ 109.171567] ? fsnotify_unmount_inodes+0x170/0x170 [ 109.174145] ? __fget+0xa8/0xf0 [ 109.177110] vfs_write+0xf7/0x280 [ 109.179532] SyS_write+0xa1/0x120 [ 109.181885] ? SyS_read+0x120/0x120 [ 109.184482] ? compat_start_thread+0x60/0x60 [ 109.187124] ? SyS_read+0x120/0x120 [ 109.189548] do_syscall_64+0xeb/0x250 [ 109.192178] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 109.194725] RIP: 0033:0x7fabb61ebe99 [ 109.197040] RSP: 002b:00007fabb68d5e98 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 [ 109.200294] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fabb61ebe99 [ 109.203399] RDX: 0000000000000120 RSI: 00000000200001c0 RDI: 0000000000000004 [ 109.206548] RBP: 00007fabb68d5ec0 R08: 0000000000000000 R09: 0000000000000000 [ 109.209902] R10: 0000000000000000 R11: 0000000000000202 R12: 00007fabb68d5fc0 [ 109.213327] R13: 0000000000000000 R14: 00007fff40ab2430 R15: 00007fabb68d69c0 [ 109.216613] Code: 88 44 24 2c 0f b6 84 24 6e 01 00 00 88 44 24 2d 0f b6 84 24 69 01 00 00 88 44 24 2e 8b 44 24 60 89 44 24 30 e8 da f6 06 ff 31 c0 <66> 41 83 7c 24 20 1b 75 04 8b 44 24 64 48 8d 74 24 20 4c 89 e7 [ 109.223602] RIP: ucma_connect+0x138/0x1d0 RSP: ffff8801c8567a80 [ 109.226256] CR2: 0000000000000020 Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace") Reported-by: <syzbot+36712f50b0552615bf59@syzkaller.appspotmail.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-19RDMA/ucma: Fix use-after-free access in ucma_closeLeon Romanovsky1-0/+3
The error in ucma_create_id() left ctx in the list of contexts belong to ucma file descriptor. The attempt to close this file descriptor causes to use-after-free accesses while iterating over such list. Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace") Reported-by: <syzbot+dcfd344365a56fbebd0f@syzkaller.appspotmail.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-15RDMA/ucma: Check AF family prior resolving addressLeon Romanovsky1-3/+7
Garbage supplied by user will cause to UCMA module provide zero memory size for memcpy(), because it wasn't checked, it will produce unpredictable results in rdma_resolve_addr(). [ 42.873814] BUG: KASAN: null-ptr-deref in rdma_resolve_addr+0xc8/0xfb0 [ 42.874816] Write of size 28 at addr 00000000000000a0 by task resaddr/1044 [ 42.876765] [ 42.876960] CPU: 1 PID: 1044 Comm: resaddr Not tainted 4.16.0-rc1-00057-gaa56a5293d7e #34 [ 42.877840] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 [ 42.879691] Call Trace: [ 42.880236] dump_stack+0x5c/0x77 [ 42.880664] kasan_report+0x163/0x380 [ 42.881354] ? rdma_resolve_addr+0xc8/0xfb0 [ 42.881864] memcpy+0x34/0x50 [ 42.882692] rdma_resolve_addr+0xc8/0xfb0 [ 42.883366] ? deref_stack_reg+0x88/0xd0 [ 42.883856] ? vsnprintf+0x31a/0x770 [ 42.884686] ? rdma_bind_addr+0xc40/0xc40 [ 42.885327] ? num_to_str+0x130/0x130 [ 42.885773] ? deref_stack_reg+0x88/0xd0 [ 42.886217] ? __read_once_size_nocheck.constprop.6+0x10/0x10 [ 42.887698] ? unwind_get_return_address_ptr+0x50/0x50 [ 42.888302] ? replace_slot+0x147/0x170 [ 42.889176] ? delete_node+0x12c/0x340 [ 42.890223] ? __radix_tree_lookup+0xa9/0x160 [ 42.891196] ? ucma_resolve_ip+0xb7/0x110 [ 42.891917] ucma_resolve_ip+0xb7/0x110 [ 42.893003] ? ucma_resolve_addr+0x190/0x190 [ 42.893531] ? _copy_from_user+0x5e/0x90 [ 42.894204] ucma_write+0x174/0x1f0 [ 42.895162] ? ucma_resolve_route+0xf0/0xf0 [ 42.896309] ? dequeue_task_fair+0x67e/0xd90 [ 42.897192] ? put_prev_entity+0x7d/0x170 [ 42.897870] ? ring_buffer_record_is_on+0xd/0x20 [ 42.898439] ? tracing_record_taskinfo_skip+0x20/0x50 [ 42.899686] __vfs_write+0xc4/0x350 [ 42.900142] ? kernel_read+0xa0/0xa0 [ 42.900602] ? firmware_map_remove+0xdf/0xdf [ 42.901135] ? do_task_dead+0x5d/0x60 [ 42.901598] ? do_exit+0xcc6/0x1220 [ 42.902789] ? __fget+0xa8/0xf0 [ 42.903190] vfs_write+0xf7/0x280 [ 42.903600] SyS_write+0xa1/0x120 [ 42.904206] ? SyS_read+0x120/0x120 [ 42.905710] ? compat_start_thread+0x60/0x60 [ 42.906423] ? SyS_read+0x120/0x120 [ 42.908716] do_syscall_64+0xeb/0x250 [ 42.910760] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 42.912735] RIP: 0033:0x7f138b0afe99 [ 42.914734] RSP: 002b:00007f138b799e98 EFLAGS: 00000287 ORIG_RAX: 0000000000000001 [ 42.917134] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f138b0afe99 [ 42.919487] RDX: 000000000000002e RSI: 0000000020000c40 RDI: 0000000000000004 [ 42.922393] RBP: 00007f138b799ec0 R08: 00007f138b79a700 R09: 0000000000000000 [ 42.925266] R10: 00007f138b79a700 R11: 0000000000000287 R12: 00007f138b799fc0 [ 42.927570] R13: 0000000000000000 R14: 00007ffdbae757c0 R15: 00007f138b79a9c0 [ 42.930047] [ 42.932681] Disabling lock debugging due to kernel taint [ 42.934795] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0 [ 42.936939] IP: memcpy_erms+0x6/0x10 [ 42.938864] PGD 80000001bea92067 P4D 80000001bea92067 PUD 1bea96067 PMD 0 [ 42.941576] Oops: 0002 [#1] SMP KASAN PTI [ 42.943952] CPU: 1 PID: 1044 Comm: resaddr Tainted: G B 4.16.0-rc1-00057-gaa56a5293d7e #34 [ 42.946964] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 [ 42.952336] RIP: 0010:memcpy_erms+0x6/0x10 [ 42.954707] RSP: 0018:ffff8801c8b479c8 EFLAGS: 00010286 [ 42.957227] RAX: 00000000000000a0 RBX: ffff8801c8b47ba0 RCX: 000000000000001c [ 42.960543] RDX: 000000000000001c RSI: ffff8801c8b47bbc RDI: 00000000000000a0 [ 42.963867] RBP: ffff8801c8b47b60 R08: 0000000000000000 R09: ffffed0039168ed1 [ 42.967303] R10: 0000000000000001 R11: ffffed0039168ed0 R12: ffff8801c8b47bbc [ 42.970685] R13: 00000000000000a0 R14: 1ffff10039168f4a R15: 0000000000000000 [ 42.973631] FS: 00007f138b79a700(0000) GS:ffff8801e5d00000(0000) knlGS:0000000000000000 [ 42.976831] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 42.979239] CR2: 00000000000000a0 CR3: 00000001be908002 CR4: 00000000003606a0 [ 42.982060] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 42.984877] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 42.988033] Call Trace: [ 42.990487] rdma_resolve_addr+0xc8/0xfb0 [ 42.993202] ? deref_stack_reg+0x88/0xd0 [ 42.996055] ? vsnprintf+0x31a/0x770 [ 42.998707] ? rdma_bind_addr+0xc40/0xc40 [ 43.000985] ? num_to_str+0x130/0x130 [ 43.003410] ? deref_stack_reg+0x88/0xd0 [ 43.006302] ? __read_once_size_nocheck.constprop.6+0x10/0x10 [ 43.008780] ? unwind_get_return_address_ptr+0x50/0x50 [ 43.011178] ? replace_slot+0x147/0x170 [ 43.013517] ? delete_node+0x12c/0x340 [ 43.016019] ? __radix_tree_lookup+0xa9/0x160 [ 43.018755] ? ucma_resolve_ip+0xb7/0x110 [ 43.021270] ucma_resolve_ip+0xb7/0x110 [ 43.023968] ? ucma_resolve_addr+0x190/0x190 [ 43.026312] ? _copy_from_user+0x5e/0x90 [ 43.029384] ucma_write+0x174/0x1f0 [ 43.031861] ? ucma_resolve_route+0xf0/0xf0 [ 43.034782] ? dequeue_task_fair+0x67e/0xd90 [ 43.037483] ? put_prev_entity+0x7d/0x170 [ 43.040215] ? ring_buffer_record_is_on+0xd/0x20 [ 43.042990] ? tracing_record_taskinfo_skip+0x20/0x50 [ 43.045595] __vfs_write+0xc4/0x350 [ 43.048624] ? kernel_read+0xa0/0xa0 [ 43.051604] ? firmware_map_remove+0xdf/0xdf [ 43.055379] ? do_task_dead+0x5d/0x60 [ 43.058000] ? do_exit+0xcc6/0x1220 [ 43.060783] ? __fget+0xa8/0xf0 [ 43.063133] vfs_write+0xf7/0x280 [ 43.065677] SyS_write+0xa1/0x120 [ 43.068647] ? SyS_read+0x120/0x120 [ 43.071179] ? compat_start_thread+0x60/0x60 [ 43.074025] ? SyS_read+0x120/0x120 [ 43.076705] do_syscall_64+0xeb/0x250 [ 43.079006] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 43.081606] RIP: 0033:0x7f138b0afe99 [ 43.083679] RSP: 002b:00007f138b799e98 EFLAGS: 00000287 ORIG_RAX: 0000000000000001 [ 43.086802] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f138b0afe99 [ 43.089989] RDX: 000000000000002e RSI: 0000000020000c40 RDI: 0000000000000004 [ 43.092866] RBP: 00007f138b799ec0 R08: 00007f138b79a700 R09: 0000000000000000 [ 43.096233] R10: 00007f138b79a700 R11: 0000000000000287 R12: 00007f138b799fc0 [ 43.098913] R13: 0000000000000000 R14: 00007ffdbae757c0 R15: 00007f138b79a9c0 [ 43.101809] Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 [ 43.107950] RIP: memcpy_erms+0x6/0x10 RSP: ffff8801c8b479c8 Reported-by: <syzbot+1d8c43206853b369d00c@syzkaller.appspotmail.com> Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-14Merge branch 'k.o/wip/dl-for-rc' into k.o/wip/dl-for-nextDoug Ledford1-1/+13
Due to bug fixes found by the syzkaller bot and taken into the for-rc branch after development for the 4.17 merge window had already started being taken into the for-next branch, there were fairly non-trivial merge issues that would need to be resolved between the for-rc branch and the for-next branch. This merge resolves those conflicts and provides a unified base upon which ongoing development for 4.17 can be based. Conflicts: drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f9524 (IB/mlx5: Fix cleanup order on unload) added to for-rc and commit b5ca15ad7e61 (IB/mlx5: Add proper representors support) add as part of the devel cycle both needed to modify the init/de-init functions used by mlx5. To support the new representors, the new functions added by the cleanup patch needed to be made non-static, and the init/de-init list added by the representors patch needed to be modified to match the init/de-init list changes made by the cleanup patch. Updates: drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function prototypes added by representors patch to reflect new function names as changed by cleanup patch drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init stage list to match new order from cleanup patch Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-14RDMA/ucma: Don't allow join attempts for unsupported AF familyLeon Romanovsky1-1/+7
Users can provide garbage while calling to ucma_join_ip_multicast(), it will indirectly cause to rdma_addr_size() return 0, making the call to ucma_process_join(), which had the right checks, but it is better to check the input as early as possible. The following crash from syzkaller revealed it. kernel BUG at lib/string.c:1052! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 4113 Comm: syz-executor0 Not tainted 4.16.0-rc5+ #261 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:fortify_panic+0x13/0x20 lib/string.c:1051 RSP: 0018:ffff8801ca81f8f0 EFLAGS: 00010286 RAX: 0000000000000022 RBX: 1ffff10039503f23 RCX: 0000000000000000 RDX: 0000000000000022 RSI: 1ffff10039503ed3 RDI: ffffed0039503f12 RBP: ffff8801ca81f8f0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000006 R11: 0000000000000000 R12: ffff8801ca81f998 R13: ffff8801ca81f938 R14: ffff8801ca81fa58 R15: 000000000000fa00 FS: 0000000000000000(0000) GS:ffff8801db200000(0063) knlGS:000000000a12a900 CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 CR2: 0000000008138024 CR3: 00000001cbb58004 CR4: 00000000001606f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: memcpy include/linux/string.h:344 [inline] ucma_join_ip_multicast+0x36b/0x3b0 drivers/infiniband/core/ucma.c:1421 ucma_write+0x2d6/0x3d0 drivers/infiniband/core/ucma.c:1633 __vfs_write+0xef/0x970 fs/read_write.c:480 vfs_write+0x189/0x510 fs/read_write.c:544 SYSC_write fs/read_write.c:589 [inline] SyS_write+0xef/0x220 fs/read_write.c:581 do_syscall_32_irqs_on arch/x86/entry/common.c:330 [inline] do_fast_syscall_32+0x3ec/0xf9f arch/x86/entry/common.c:392 entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7f9ec99 RSP: 002b:00000000ff8172cc EFLAGS: 00000282 ORIG_RAX: 0000000000000004 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000020000100 RDX: 0000000000000063 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Code: 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b 48 89 df e8 42 2c e3 fb eb de 55 48 89 fe 48 c7 c7 80 75 98 86 48 89 e5 e8 85 95 94 fb <0f> 0b 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41 56 RIP: fortify_panic+0x13/0x20 lib/string.c:1051 RSP: ffff8801ca81f8f0 Fixes: 5bc2b7b397b0 ("RDMA/ucma: Allow user space to specify AF_IB when joining multicast") Reported-by: <syzbot+2287ac532caa81900a4e@syzkaller.appspotmail.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-08RDMA/nldev: provide detailed CM_ID informationSteve Wise1-4/+4
Implement RDMA nldev netlink interface to get detailed CM_ID information. Because cm_id's are attached to rdma devices in various work queue contexts, the pid and task information at restrak_add() time is sometimes not useful. For example, an nvme/f host connection cm_id ends up being bound to a device in a work queue context and the resulting pid at attach time no longer exists after connection setup. So instead we mark all cm_id's created via the rdma_ucm as "user", and all others as "kernel". This required tweaking the restrack code a little. It also required wrapping some rdma_cm functions to allow passing the module name string. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-07RDMA/ucma: Check that user doesn't overflow QP stateLeon Romanovsky1-0/+3
The QP state is limited and declared in enum ib_qp_state, but ucma user was able to supply any possible (u32) value. Reported-by: syzbot+0df1ab766f8924b1edba@syzkaller.appspotmail.com Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-07RDMA/ucma: Limit possible option sizeLeon Romanovsky1-0/+3
Users of ucma are supposed to provide size of option level, in most paths it is supposed to be equal to u8 or u16, but it is not the case for the IB path record, where it can be multiple of struct ib_path_rec_data. This patch takes simplest possible approach and prevents providing values more than possible to allocate. Reported-by: syzbot+a38b0e9f694c379ca7ce@syzkaller.appspotmail.com Fixes: 7ce86409adcd ("RDMA/ucma: Allow user space to set service type") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-02-11vfs: do bulk POLL* -> EPOLL* replacementLinus Torvalds1-1/+1
This is the mindless scripted replacement of kernel use of POLL* variables as described by Al, done by this script: for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'` for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done done with de-mangling cleanups yet to come. NOTE! On almost all architectures, the EPOLL* constants have the same values as the POLL* constants do. But they keyword here is "almost". For various bad reasons they aren't the same, and epoll() doesn't actually work quite correctly in some cases due to this on Sparc et al. The next patch from Al will sort out the final differences, and we should be all done. Scripted-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-31Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds1-9/+10
Pull RDMA subsystem updates from Jason Gunthorpe: "Overall this cycle did not have any major excitement, and did not require any shared branch with netdev. Lots of driver updates, particularly of the scale-up and performance variety. The largest body of core work was Parav's patches fixing and restructing some of the core code to make way for future RDMA containerization. Summary: - misc small driver fixups to bnxt_re/hfi1/qib/hns/ocrdma/rdmavt/vmw_pvrdma/nes - several major feature adds to bnxt_re driver: SRIOV VF RoCE support, HugePages support, extended hardware stats support, and SRQ support - a notable number of fixes to the i40iw driver from debugging scale up testing - more work to enable the new hip08 chip in the hns driver - misc small ULP fixups to srp/srpt//ipoib - preparation for srp initiator and target to support the RDMA-CM protocol for connections - add RDMA-CM support to srp initiator, srp target is still a WIP - fixes for a couple of places where ipoib could spam the dmesg log - fix encode/decode of FDR/EDR data rates in the core - many patches from Parav with ongoing work to clean up inconsistencies and bugs in RoCE support around the rdma_cm - mlx5 driver support for the userspace features 'thread domain', 'wallclock timestamps' and 'DV Direct Connected transport'. Support for the firmware dual port rocee capability - core support for more than 32 rdma devices in the char dev allocation - kernel doc updates from Randy Dunlap - new netlink uAPI for inspecting RDMA objects similar in spirit to 'ss' - one minor change to the kobject code acked by Greg KH" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (259 commits) RDMA/nldev: Provide detailed QP information RDMA/nldev: Provide global resource utilization RDMA/core: Add resource tracking for create and destroy PDs RDMA/core: Add resource tracking for create and destroy CQs RDMA/core: Add resource tracking for create and destroy QPs RDMA/restrack: Add general infrastructure to track RDMA resources RDMA/core: Save kernel caller name when creating PD and CQ objects RDMA/core: Use the MODNAME instead of the function name for pd callers RDMA: Move enum ib_cq_creation_flags to uapi headers IB/rxe: Change RDMA_RXE kconfig to use select IB/qib: remove qib_keys.c IB/mthca: remove mthca_user.h RDMA/cm: Fix access to uninitialized variable RDMA/cma: Use existing netif_is_bond_master function IB/core: Avoid SGID attributes query while converting GID from OPA to IB RDMA/mlx5: Avoid memory leak in case of XRCD dealloc failure IB/umad: Fix use of unprotected device pointer IB/iser: Combine substrings for three messages IB/iser: Delete an unnecessary variable initialisation in iser_send_data_out() IB/iser: Delete an error message for a failed memory allocation in iser_send_data_out() ...
2018-01-19RDMA/ucma: Use rdma cm API to query GIDParav Pandit1-4/+4
Make use of rdma_read_gids() API to read SGID and DGID which returns correct GIDs for RoCE and other transports. rdma_addr_get_dgid() for RoCE for client side connections returns MAC address, instead of DGID. rdma_addr_get_sgid() for RoCE doesn't return correct SGID for IPv6 and when more than one IP address is assigned to the netdevice. Therefore use transport agnostic rdma_read_gids() API provided by rdma_cm module. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-10RDMA/cma: Fix rdma_cm path querying for RoCEParav Pandit1-3/+4
The 'if' logic in ucma_query_path was broken with OPA was introduced and started to treat RoCE paths as as OPA paths. Invert the logic of the 'if' so only OPA paths are treated as OPA paths. Otherwise the path records returned to rdma_cma users are mangled when in RoCE mode. Fixes: 57520751445b ("IB/SA: Add OPA path record type") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-10RDMA/{cma, ucma}: Simplify and rename rdma_set_ib_pathsParav Pandit1-2/+2
Since 2006 there has been no user of rdmacm based application to make use of setting multiple path records using rdma_set_ib_paths API. Therefore code is simplified to allow setting one path record entry. Now that it sets only single path, it is renamed to reflect the same. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-11-28the rest of drivers/*: annotate ->poll() instancesAl Viro1-2/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-08-08IB/core: Convert ah_attr from OPA to IB when copying to userDasaratharaman Chandramouli1-4/+6
OPA address handle atttibutes that have 32 bit LIDs would have to be converted to IB address handle attribute with the LID field programmed in the GID before copying to user space. Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/SA: Add OPA path record typeDasaratharaman Chandramouli1-3/+19
Add opa_sa_path_rec to sa_path_rec data structure. The 'type' field in sa_path_rec identifies the type of the path record. Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/SA: Rename ib_sa_path_rec to sa_path_recDasaratharaman Chandramouli1-1/+1
Rename ib_sa_path_rec to a more generic sa_path_rec. This is part of extending ib_sa to also support OPA path records in addition to the IB defined path records. Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03infiniband: remove WARN that is not kernel bugLeon Romanovsky1-1/+4
On Mon, Nov 21, 2016 at 09:52:53AM -0700, Jason Gunthorpe wrote: > On Mon, Nov 21, 2016 at 02:14:08PM +0200, Leon Romanovsky wrote: > > > > > > In ib_ucm_write function there is a wrong prefix: > > > > > > + pr_err_once("ucm_write: process %d (%s) tried to do something hinky\n", > > > > I did it intentionally to have the same errors for all flows. > > Lets actually use a good message too please? > > pr_err_once("ucm_write: process %d (%s) changed security contexts after opening FD, this is not allowed.\n", > > Jason >From 70f95b2d35aea42e5b97e7d27ab2f4e8effcbe67 Mon Sep 17 00:00:00 2001 From: Leon Romanovsky <leonro@mellanox.com> Date: Mon, 21 Nov 2016 13:30:59 +0200 Subject: [PATCH rdma-next V2] IB/{core, qib}: Remove WARN that is not kernel bug WARNINGs mean kernel bugs, in this case, they are placed to mark programming errors and/or malicious attempts. BUG/WARNs that are not kernel bugs hinder automated testing efforts. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07IB/ucma: Remove deprecated create_singlethread_workqueueBhaktipriya Shridhar1-1/+2
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set, replaces deprecated create_singlethread_workqueue(). This is the identity conversion. The workqueue "close_wq" queues work items &ctx->close_work (maps to ucma_close_id) and &con_req_eve->close_work (maps to ucma_close_event_id). It has been identity converted. WQ_MEM_RECLAIM has been set to ensure forward progress under memory pressure. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-03IB/core: Support for CMA multicast join flagsAlex Vesker1-4/+14
Added UCMA and CMA support for multicast join flags. Flags are passed using UCMA CM join command previously reserved fields. Currently supporting two join flags indicating two different multicast JoinStates: 1. Full Member: The initiator creates the Multicast group(MCG) if it wasn't previously created, can send Multicast messages to the group and receive messages from the MCG. 2. Send Only Full Member: The initiator creates the Multicast group(MCG) if it wasn't previously created, can send Multicast messages to the group but doesn't receive any messages from the MCG. IB: Send Only Full Member requires a query of ClassPortInfo to determine if SM/SA supports this option. If SM/SA doesn't support Send-Only there will be no join request sent and an error will be returned. ETH: When Send Only Full Member is requested no IGMP join will be sent. Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-28IB/security: Restrict use of the write() interfaceJason Gunthorpe1-0/+3
The drivers/infiniband stack uses write() as a replacement for bi-directional ioctl(). This is not safe. There are ways to trigger write calls that result in the return structure that is normally written to user space being shunted off to user specified kernel memory instead. For the immediate repair, detect and deny suspicious accesses to the write API. For long term, update the user space libraries and the kernel API to something that doesn't present the same security vulnerabilities (likely a structured ioctl() interface). The impacted uAPI interfaces are generally only available if hardware from drivers/infiniband is installed in the system. Reported-by: Jann Horn <jann@thejh.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [ Expanded check to all known write() entry points ] Cc: stable@vger.kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-03IB/core: trivial prink cleanup.Parav Pandit1-3/+3
1. Replaced printk with appropriate pr_warn, pr_err, pr_info. 2. Removed unnecessary prints around memory allocation failure which are not required, as reported by the checkpatch script. Signed-off-by: Parav Pandit <pandit.parav@gmail.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28IB/ucma: Take the network namespace from the processGuy Shapiro1-2/+3
Add support for network namespaces from user space. This is done by passing the network namespace of the process instead of init_net. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28IB/cma: Add support for network namespacesGuy Shapiro1-1/+2
Add support for network namespaces in the ib_cma module. This is accomplished by: 1. Adding network namespace parameter for rdma_create_id. This parameter is used to populate the network namespace field in rdma_id_private. rdma_create_id keeps a reference on the network namespace. 2. Using the network namespace from the rdma_id instead of init_net inside of ib_cma, when listening on an ID and when looking for an ID for an incoming request. 3. Decrementing the reference count for the appropriate network namespace when calling rdma_destroy_id. In order to preserve the current behavior init_net is passed when calling from other modules. Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-21IB/core: Remove smac and vlan id from qp_attr and ah_attrMatan Barak1-1/+0
Smac and vlan id could be resolved from the GID attribute, and thus these attributes aren't needed anymore. Removing them. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-20IB/ucma: check workqueue allocation before usageSasha Levin1-1/+6
Allocating a workqueue might fail, which wasn't checked so far and would lead to NULL ptr derefs when an attempt to use it was made. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/ucma: HW Device hot-removal supportYishai Hadas1-11/+129
Currently, IB/cma remove_one flow blocks until all user descriptor managed by IB/ucma are released. This prevents hot-removal of IB devices. This patch allows IB/cma to remove devices regardless of user space activity. Upon getting the RDMA_CM_EVENT_DEVICE_REMOVAL event we close all the underlying HW resources for the given ucontext. The ucontext itself is still alive till its explicit destroying by its creator. Running applications at that time will have some zombie device, further operations may fail. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28IB/ucma: Fix theoretical user triggered use-after-freeJason Gunthorpe1-3/+3
Something like this: CPU A CPU B Acked-by: Sean Hefty <sean.hefty@intel.com> ======================== ================================ ucma_destroy_id() wait_for_completion() .. anything ucma_put_ctx() complete() .. continues ... ucma_leave_multicast() mutex_lock(mut) atomic_inc(ctx->ref) mutex_unlock(mut) ucma_free_ctx() ucma_cleanup_multicast() mutex_lock(mut) kfree(mc) rdma_leave_multicast(mc->ctx->cm_id,.. Fix it by latching the ref at 0. Once it goes to 0 mc and ctx cannot leave the mutex(mut) protection. The other atomic_inc in ucma_get_ctx is OK because mutex(mut) protects it from racing with ucma_destroy_id. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-07-14IB/core: Destroy multcast_idr on module exitJohannes Thumshirn1-0/+1
Destroy multcast_idr on module exit, reclaiming the allocated memory. This was detected by the following semantic patch (written by Luis Rodriguez <mcgrof@suse.com>) <SmPL> @ defines_module_init @ declarer name module_init, module_exit; declarer name DEFINE_IDR; identifier init; @@ module_init(init); @ defines_module_exit @ identifier exit; @@ module_exit(exit); @ declares_idr depends on defines_module_init && defines_module_exit @ identifier idr; @@ DEFINE_IDR(idr); @ on_exit_calls_destroy depends on declares_idr && defines_module_exit @ identifier declares_idr.idr, defines_module_exit.exit; @@ exit(void) { ... idr_destroy(&idr); ... } @ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @ identifier declares_idr.idr, defines_module_exit.exit; @@ exit(void) { ... +idr_destroy(&idr); } </SmPL> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Doug Ledford <dledford@redhat.com>