aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2015-08-30IB/cm: Remove compare_data checksHaggai Eran4-95/+21
Now that there are no ib_cm clients using the compare_data feature for matching IB CM requests' private data, remove the compare_data parameter of ib_cm_listen and remove the code implementing the feature. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/cma: Share ib_cm_ids between rdma_cm_idsHaggai Eran1-55/+4
Use ib_cm_insert_listen to create listening IB CM IDs or share existing ones if needed. When given a request on a specific CM ID, the code now matches the request to the RDMA CM ID based on the request parameters, so it no longer needs to rely on the ib_cm's private data matching capabilities. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/cma: Use found net_dev for passive connectionsHaggai Eran1-27/+49
When receiving a new connection in cma_req_handler, we actually already know the net_dev that is used for the connection's creation. Instead of calling cma_translate_addr to resolve the new connection id's source address, just use the net_dev that was found. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/cma: Validate routing of incoming requestsHaggai Eran1-3/+92
Pass incoming request parameters through the relevant IPv4/IPv6 routing tables and make sure the network stack is configured to handle such requests. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/cma: Add net_dev and private data checks to RDMA CMHaggai Eran1-3/+185
Instead of relying on a the ib_cm module to check an incoming CM request's private data header, add these checks to the RDMA CM module. This allows a following patch to to clean up the ib_cm interface and remove the code that looks into the private headers. It will also allow supporting namespaces in RDMA CM by making these checks namespace aware later on. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/cm: Expose BTH P_Key in CM and SIDR request eventsHaggai Eran2-0/+26
The rdma_cm module will later use the P_Key from the BTH to de-mux requests. See discussion at: http://www.spinics.net/lists/netdev/msg336067.html Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Liran Liss <liranl@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/cma: Helper functions to access port space IDRsHaggai Eran1-21/+60
Add helper functions to access the IDRs by port-space and port number. Pass around the port-space enum in cma.c instead of using pointers to port-space IDRs. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/cma: Refactor RDMA IP CM private-data parsing codeHaggai Eran1-65/+105
When receiving a connection request, rdma_cm needs to associate the request with a network device, in order to disambiguate requests. To do this, it needs to know the request's destination IP. For this the module needs to allow getting this information from the private data in the request packet, instead of relying on the information already being in the listening RDMA CM ID. When creating a new incoming connection ID, the code in cma_save_ip{4,6}_info can no longer rely on the listener's private data to find the port number, so it reads it from the requested service ID. Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/cm: Share listening CM IDsHaggai Eran2-6/+120
Enabling network namespaces for RDMA CM will allow processes on different namespaces to listen on the same port. In order to leave namespace support out of the CM layer, this requires that multiple RDMA CM IDs will be able to share a single CM ID. This patch adds infrastructure to retrieve an existing listening ib_cm_id, based on its device and service ID, or create a new one if one does not already exist. It also adds a reference count for such instances (cm_id_private.listen_sharecount), and prevents cm_destroy_id from destroying a CM if it is still shared. See the relevant discussion [1]. [1] Re: [PATCH v3 for-next 05/13] IB/cm: Reference count ib_cm_ids http://www.spinics.net/lists/netdev/msg328860.html Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/cm: Expose service ID in request eventsHaggai Eran2-0/+4
Expose the service ID on an incoming CM or SIDR request to the event handler. This will allow the RDMA CM module to de-multiplex connection requests based on the information encoded in the service ID. Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/ipoib: Return IPoIB devices matching connection parametersGuy Shapiro1-1/+228
Implement the get_net_device_by_port_pkey_ip callback that returns network device to ib_core according to connection parameters. Check the ipoib device and iterate over all child devices to look for a match. For each IPoIB device we iterate through all upper devices when searching for a matching IP, in order to support bonding. Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/core: Find the network device matching connection parametersYotam Kenneth2-0/+73
In the case of IPoIB, and maybe in other cases, the network device is managed by an upper-layer protocol (ULP). In order to expose this network device to other users of the IB device, let ULPs implement a callback that returns network device according to connection parameters. The IB device and port, together with the P_Key and the GID should be enough to uniquely identify the ULP net device. However, in current kernels there can be multiple IPoIB interfaces created with the same GID. Furthermore, such configuration may be desireable to support ipvlan-like configurations for RDMA CM with IPoIB. To resolve the device in these cases the code will also take the IP address as an additional input. Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/core: lock client data with lists_rwsemHaggai Eran16-52/+82
An ib_client callback that is called with the lists_rwsem locked only for read is protected from changes to the IB client lists, but not from ib_unregister_device() freeing its client data. This is because ib_unregister_device() will remove the device from the device list with lists_rwsem locked for write, but perform the rest of the cleanup, including the call to remove() without that lock. Mark client data that is undergoing de-registration with a new going_down flag in the client data context. Lock the client data list with lists_rwsem for write in addition to using the spinlock, so that functions calling the callback would be able to lock only lists_rwsem for read and let callbacks sleep. Since ib_unregister_client() now marks the client data context, no need for remove() to search the context again, so pass the client data directly to remove() callbacks. Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/core: Add rwsem to allow reading device list or client listHaggai Eran1-12/+28
Currently the RDMA subsystem's device list and client list are protected by a single mutex. This prevents adding user-facing APIs that iterate these lists, since using them may cause a deadlock. The patch attempts to solve this problem by adding a read-write semaphore to protect the lists. Readers now don't need the mutex, and are safe just by read-locking the semaphore. The ib_register_device, ib_register_client, ib_unregister_device, and ib_unregister_client functions are modified to lock the semaphore for write during their respective list modification. Also, in order to make sure client callbacks are called only between add() and remove() calls, the code is changed to only add items to the lists after the add() calls and remove from the lists before the remove() calls. This patch attempts to solve a similar need [1] that was seen in the RoCE v2 patch series. [1] http://www.spinics.net/lists/linux-rdma/msg24733.html Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Matan Barak <matanb@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28RDMA/Core: remove rdma_cap_read_multi_sge() helperSteve Wise1-28/+0
This functionality already exists via the max_sge_rd device capability. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28svcrdma: Use max_sge_rd for destination read depthsSteve Wise3-11/+6
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28ipath,qib: Expose max_sge_rd correctlySteve Wise2-0/+2
Applications must not assume that max_sge and max_sge_rd are the same, Hence expose max_sge_rd correctly as well. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28mlx4, mlx5, mthca: Expose max_sge_rd correctlySagi Grimberg3-0/+3
Applications must not assume that max_sge and max_sge_rd are the same, Hence expose max_sge_rd correctly as well. Reported-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28staging/hfi1: replace indent spaces with tabsJeff Becker1-5/+5
Running checkpatch.pl on mad.c produces several "ERROR: code indent should use tabs where possible" messages. This patch fixes these. Signed-off-by: Jeff Becker <Jeffrey.C.Becker@nasa.gov> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28IB/hfi1: add driver filesMike Marciniszyn60-0/+57480
Signed-off-by: Andrew Friedley <andrew.friedley@intel.com> Signed-off-by: Arthur Kepner <arthur.kepner@intel.com> Signed-off-by: Brendan Cunningham <brendan.cunningham@intel.com> Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Caz Yokoyama <caz.yokoyama@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jim Snow <jim.m.snow@intel.com> Signed-off-by: John Gregor <john.a.gregor@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Kevin Pine <kevin.pine@intel.com> Signed-off-by: Kyle Liddell <kyle.liddell@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Ravi Krishnaswamy <ravi.krishnaswamy@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Sanath Kumar <sanath.s.kumar@intel.com> Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com> Signed-off-by: Vlad Danushevsky <vladimir.danusevsky@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28IB/core: Add core header changes needed for OPADennis Dalessandro6-139/+1055
This patch adds the value of the CNP opcode to the existing list of enumerated opcodes in ib_pack.h Add common OPA header definitions for driver build: - opa_port_info.h - opa_smi.h - hfi1_user.h Additionally, ib_mad.h, has additional definitions that are common to ib_drivers including: - trap support - cca support The qib driver has the duplication removed in favor those in ib_mad.h Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: John, Jubin <jubin.john@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28RDMA/amso1100: Deprecate the amso1100 driver and move to stagingSteve Wise28-2/+7
The HW hasn't been sold since 2005, and the SW has definite bit rot. Its time to remove it. So move it to staging for a few releases and then remove it after that. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28IB/ipath: Deprecate ipath driver and move to staging.Dennis Dalessandro43-4/+12
It is now time for the ipath driver to begin to be phased out of the kernel. This patch moves the ipath driver from the Infiniband sub tree to the staging area where it will remain until the code is removed from the kernel in a few releases. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28Staging: Add staging/rdma directory and update MAINTAINERSDoug Ledford5-0/+30
Create the rdma directory in the staging area for use as we deprecate some older drivers and as we bring in some new drivers that are in need of work. Update the MAINTAINERS file so that updates to these files go to linux-rdma@vger.kernel.org. Expected lifespan of this directory is three releases for any deprecated drivers moved here and an unknown, but theoretically bounded amount of time for the new drivers as a new core RDMA transfer library needs to be written and the drivers modified to use it in order for them to move out of this directory. Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28iw_cxgb4: Add support for clipHariprasad S1-4/+72
Add support for ipv6 address handling clip api provided by lld Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28RDMA/cma: fix IPv6 address resolutionSpencer Baugh1-2/+5
Resolving a link-local IPv6 address with an unspecified source address was broken by commit 5462eddd7a, which prevented the IPv6 stack from learning the scope id of the link-local IPv6 address, causing random failures as the IP stack chose a random link to resolve the address on. This commit 5462eddd7a made us bail out of cma_check_linklocal early if the address passed in was not an IPv6 link-local address. On the address resolution path, the address passed in is the source address; if the source address is the unspecified address, which is not link-local, we will bail out early. This is mostly correct, but if the destination address is a link-local address, then we will be following a link-local route, and we'll need to tell the IPv6 stack what the scope id of the destination address is. This used to be done by last line of cma_check_linklocal, which is skipped when bailing out early: dev_addr->bound_dev_if = sin6->sin6_scope_id; (In cma_bind_addr, the sin6_scope_id of the source address is set to the sin6_scope_id of the destination address, so this is correct) This line is required in turn for the following line, L279 of addr6_resolve, to actually inform the IPv6 stack of the scope id: fl6.flowi6_oif = addr->bound_dev_if; Since we can only know we are in this failure case when we have access to both the source IPv6 address and destination IPv6 address, we have to deal with this further up the stack. So detect this failure case in cma_bind_addr, and set bound_dev_if to the destination address scope id to correct it. Signed-off-by: Spencer Baugh <sbaugh@catern.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28IB/ucma: Fix theoretical user triggered use-after-freeJason Gunthorpe1-3/+3
Something like this: CPU A CPU B Acked-by: Sean Hefty <sean.hefty@intel.com> ======================== ================================ ucma_destroy_id() wait_for_completion() .. anything ucma_put_ctx() complete() .. continues ... ucma_leave_multicast() mutex_lock(mut) atomic_inc(ctx->ref) mutex_unlock(mut) ucma_free_ctx() ucma_cleanup_multicast() mutex_lock(mut) kfree(mc) rdma_leave_multicast(mc->ctx->cm_id,.. Fix it by latching the ref at 0. Once it goes to 0 mc and ctx cannot leave the mutex(mut) protection. The other atomic_inc in ucma_get_ctx is OK because mutex(mut) protects it from racing with ucma_destroy_id. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28iw_cxgb4: set the default MPA version to 2Hariprasad S1-2/+2
This enables ORD/IRD negotiation and its about time to enable it by default Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28RDMA/iser: Limit sgs to the device fastreg depthSteve Wise1-0/+9
Currently the sg tablesize, which dictates fast register page list depth to use, does not take into account the limits of the rdma device. So adjust it once we discover the device fastreg max depth limit. Also adjust the max_sectors based on the resulting sg tablesize. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28IB/mlx5: Remove dead code from alloc_cached_mr()Roland Dreier1-3/+0
The only place that assigns mr inside the loop already does a break. So "if (mr)" will never be true here since the function initializes mr to NULL at the top. We can just drop the extra if and break here. Signed-off-by: Roland Dreier <roland@purestorage.com> Acked-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28IB/qib: Change lkey table allocation to support more MRsMike Marciniszyn3-4/+16
The lkey table is allocated with with a get_user_pages() with an order based on a number of index bits from a module parameter. The underlying kernel code cannot allocate that many contiguous pages. There is no reason the underlying memory needs to be physically contiguous. This patch: - switches the allocation/deallocation to vmalloc/vfree - caps the number of bits to 23 to insure at least 1 generation bit o this matches the module parameter description Cc: stable@vger.kernel.org Reviewed-by: Vinit Agnihotri <vinit.abhay.agnihotri@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28mlx5: Expose correct page_size_cap in device attributesSagi Grimberg1-1/+2
Should be all the page sizes that are supported by the device. Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28mlx5: Fix missing device local_dma_lkeySagi Grimberg4-1/+42
The mlx5 driver exposes device capability IB_DEVICE_LOCAL_DMA_LKEY but does not set the the device local_dma_lkey. This breaks rpcrdma drivers. Query and set this lkey when creating the device resources. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-23Linux 4.2-rc8Linus Torvalds1-1/+1
2015-08-229p: ensure err is initialized to 0 in p9_client_read/writeVincent Bernat1-0/+2
Some use of those functions were providing unitialized values to those functions. Notably, when reading 0 bytes from an empty file on a 9P filesystem, the return code of read() was not 0. Tested with this simple program: #include <assert.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <unistd.h> int main(int argc, const char **argv) { assert(argc == 2); char buffer[256]; int fd = open(argv[1], O_RDONLY|O_NOCTTY); assert(fd >= 0); assert(read(fd, buffer, 0) == 0); return 0; } Cc: stable@vger.kernel.org # v4.1 Signed-off-by: Vincent Bernat <vincent@bernat.im> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-08-22x86/fpu/math-emu: Fix crash in fork()Ingo Molnar1-1/+1
During later stages of math-emu bootup the following crash triggers: math_emulate: 0060:c100d0a8 Kernel panic - not syncing: Math emulation needed in kernel CPU: 0 PID: 1511 Comm: login Not tainted 4.2.0-rc7+ #1012 [...] Call Trace: [<c181d50d>] dump_stack+0x41/0x52 [<c181c918>] panic+0x77/0x189 [<c1003530>] ? math_error+0x140/0x140 [<c164c2d7>] math_emulate+0xba7/0xbd0 [<c100d0a8>] ? fpu__copy+0x138/0x1c0 [<c1109c3c>] ? __alloc_pages_nodemask+0x12c/0x870 [<c136ac20>] ? proc_clear_tty+0x40/0x70 [<c136ac6e>] ? session_clear_tty+0x1e/0x30 [<c1003530>] ? math_error+0x140/0x140 [<c1003575>] do_device_not_available+0x45/0x70 [<c100d0a8>] ? fpu__copy+0x138/0x1c0 [<c18258e6>] error_code+0x5a/0x60 [<c1003530>] ? math_error+0x140/0x140 [<c100d0a8>] ? fpu__copy+0x138/0x1c0 [<c100c205>] arch_dup_task_struct+0x25/0x30 [<c1048cea>] copy_process.part.51+0xea/0x1480 [<c115a8e5>] ? dput+0x175/0x200 [<c136af70>] ? no_tty+0x30/0x30 [<c1157242>] ? do_vfs_ioctl+0x322/0x540 [<c104a21a>] _do_fork+0xca/0x340 [<c1057b06>] ? SyS_rt_sigaction+0x66/0x90 [<c104a557>] SyS_clone+0x27/0x30 [<c1824a80>] sysenter_do_call+0x12/0x12 The reason is the incorrect assumption in fpu_copy(), that FNSAVE can be executed from math-emu kernels as well. Don't try to copy the registers, the soft state will be copied by fork anyway, so the child task inherits the parent task's soft math state. With this fix applied math-emu kernels boot up fine on modern hardware and the 'no387 nofxsr' boot options. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-22x86/fpu/math-emu: Fix math-emu boot crashIngo Molnar1-1/+6
On a math-emu bootup the following crash occurs: Initializing CPU#0 ------------[ cut here ]------------ kernel BUG at arch/x86/kernel/traps.c:779! invalid opcode: 0000 [#1] SMP [...] EIP is at do_device_not_available+0xe/0x70 [...] Call Trace: [<c18238e6>] error_code+0x5a/0x60 [<c1002bd0>] ? math_error+0x140/0x140 [<c100bbd9>] ? fpu__init_cpu+0x59/0xa0 [<c1012322>] cpu_init+0x202/0x330 [<c104509f>] ? __native_set_fixmap+0x1f/0x30 [<c1b56ab0>] trap_init+0x305/0x346 [<c1b548af>] start_kernel+0x1a5/0x35d [<c1b542b4>] i386_start_kernel+0x82/0x86 The reason is that in the following commit: b1276c48e91b ("x86/fpu: Initialize fpregs in fpu__init_cpu_generic()") I failed to consider math-emu's limitation that it cannot execute the FNINIT instruction in kernel mode. The long term fix might be to allow math-emu to execute (certain) kernel mode FPU instructions, but for now apply the safe (albeit somewhat ugly) fix: initialize the emulation state explicitly without trapping out to the FPU emulator. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>