path: root/sys/dev/pci/if_em.c
Commit message | Author | Age | Files | Lines
* match on Intel Alder Lake and Meteor Lake I219 Ethernet ids | jsg | 2021-01-24 | 1 | -1/+10
* Rename the macro MCLGETI to MCLGETL and remove the dead parameter ifp. | jan | 2020-12-12 | 1 | -2/+2
  OK dlg@, bluhm@
  No Opinion mpi@
  Not against it claudio@
* add kstat support for reading hardware counters. | dlg | 2020-07-12 | 1 | -217/+352
  this replaces the existing counters implementation, which just collected the stats in the softc, but didn't really provide a way for a person to read them.
  em counters get cleared on read. a lot of them are 32bit, so to avoid overflow the counters are polled and the newly accumulated values are added to some 64 bit counters in software.
  tested by hrvoje popovski and SAITOH Masanobu
  ok mpi@
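The accumulation scheme described in the kstat commit can be sketched as follows. This is a minimal illustration, not the em(4) code: `fake_hw_counter`, `fake_hw_read`, `soft_counter`, and `soft_counter_poll` are hypothetical names, and the hardware register is modelled as a plain struct field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a clear-on-read 32-bit hardware counter. */
struct fake_hw_counter {
	uint32_t pending;	/* events counted since the last read */
};

/* Reading returns the accumulated value and resets the register to zero. */
static uint32_t
fake_hw_read(struct fake_hw_counter *hw)
{
	uint32_t v = hw->pending;

	hw->pending = 0;
	return v;
}

/* Software side: a 64-bit total that each poll adds to. */
struct soft_counter {
	uint64_t total;
};

/* Fold whatever the hardware counted since the last poll into the total. */
static void
soft_counter_poll(struct soft_counter *sc, struct fake_hw_counter *hw)
{
	sc->total += fake_hw_read(hw);
}
```

As long as the poll interval is short enough that the 32-bit register cannot wrap between reads, the 64-bit total keeps growing past the point where the hardware counter alone would overflow.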
* Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API. | patrick | 2020-07-10 | 1 | -2/+2
  ok dlg@ tobhe@
* use ifiq_input and use its return value to apply backpressure to rxrs. | dlg | 2020-06-22 | 1 | -2/+3
  this is a step toward deprecating softclock based livelock detection.
* Various changes related to but independent from multiqueue logic: | mpi | 2020-06-09 | 1 | -25/+55
  - return an error if em_rxfill() fails when setting up the ring, this means em_get_buf() couldn't get a mbuf.
  - disable "Drop Enable" to have the same behavior when queues > 1
  - use local variables for statistics in preparation for using the counters_add(9) API to not trash values
  - extend hw_stats to print per-queue counters
  Tested by Hrvoje Popovski, ok jmatthew@
* Set timeout(9) to refill the receive ring descriptors if the amount of | jan | 2020-05-12 | 1 | -2/+2
  descriptors runs below the low watermark.
  The em(4) firmware seems not to work properly with just a few descriptors in the receive ring. Thus, we use the low water mark as an indicator instead of zero descriptors, which causes deadlocks.
  ok kettenis@
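The change in refill trigger can be sketched as two predicates. This is an assumption-laden illustration: `refill_when_empty`, `refill_below_lwm`, and their parameters are hypothetical names, and "below the low watermark" is read here as a strict less-than comparison, which the commit message does not spell out.

```c
#include <assert.h>
#include <stdbool.h>

/* Old trigger (as described): only arm the refill timeout on an empty ring. */
static bool
refill_when_empty(unsigned int filled)
{
	return filled == 0;
}

/*
 * New trigger (assumed comparison): arm the refill timeout as soon as the
 * number of filled descriptors drops below the low watermark.
 */
static bool
refill_below_lwm(unsigned int filled, unsigned int lwm)
{
	return filled < lwm;
}
```

With a low watermark of 4 and only 2 filled descriptors, the old predicate stays false while the new one fires, which is the situation the commit says led to deadlocks.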
* Map em(4) descriptor rings coherent. This doesn't make a difference on x86, | patrick | 2020-04-26 | 1 | -2/+2
  but on selected ARM64 machines with non-cache-coherent PCIe controllers this makes em(4) work reliably. Without it the network controller's view of the head and tail get out of sync.
  The reason remains unclear. It could be an issue in our arm64 bus dma code, it could be an issue in the em(4) code, or maybe the hardware itself just doesn't cope well with non-coherent memory. Linux maps them coherent as well, and it might actually be better to map them that way, since otherwise we might spend a lot of time flushing our caches.
  ok kettenis@ deraadt@
* Use FOREACH_QUEUE() where nothing else is required to support multi-queues. | mpi | 2020-04-22 | 1 | -252/+287
  Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@
* Put MSI-X stuff under !SMALL_KERNEL to reduce the growth for i386 floppy. | mpi | 2020-03-24 | 1 | -33/+40
* Make it possible to use em(4) with MSI-X, currently disabled by default. | mpi | 2020-03-23 | 1 | -28/+305
  The current implementation still uses a single queue but already establishes a different handler for link interrupts. This is done in preparation for multi-queues support.
  Based on a bigger diff from haesbaert@ and on the FreeBSD code.
  Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@
* Use queue descriptor instead of hardcoded value when initializing hw. | mpi | 2020-03-08 | 1 | -2/+2
  Tested by Hrvoje Popovski, ok jmatthew@
* Merge two blocks calling if_link_state_change(). | mpi | 2020-03-03 | 1 | -9/+6
  No functional change.
* Introduce the concept of queue to prepare supporting multiple of them. | mpi | 2020-02-20 | 1 | -204/+254
  Move the tx/rx descriptors to dedicated structures similar to what already exists in ix(4). Only one queue is currently used, no real architectural change introduced in this diff.
  Extracted from a big diff from haesbaert@ via patrick@.
  Tested by Hrvoje Popovski and jmatthew@, ok jmatthew@
* Refactoring to prepare multi-queues support, no intended behavior change: | mpi | 2020-02-04 | 1 | -70/+109
  - Abstract the allocation/freeing of TX/RX ring into em_dma_malloc(). This will ease the introduction of multiple rings.
  - Split the 82576 variant out of 82575. The distinction is necessary when it comes to setting multiple queues.
  - Change multiple TX/RX related macros to take an index argument corresponding to a ring. Currently only the index 0 and 1 are used.
  - Gather and print more stats counters
  - Switch to using a function, like FreeBSD, to translate 82542 registers and get rid of a set of defines.
  Tested by many, thanks!
  ok mlarkin@, jmatthew@
* match on Intel Comet Lake and Tiger Lake Ethernet | jsg | 2020-01-20 | 1 | -1/+12
* use a timeout to refill the rx ring when it's empty. | dlg | 2019-03-01 | 1 | -8/+17
  em had rxr, but didn't use a timeout cos it claimed to generate an RX overflow interrupt when packets fell off slots in the ring. turns out that's a lie on at least one chip, so add the timeout like other drivers.
  this was hit by mlarkin@, who had nfs and bufs steal all the packets and memory for packets from em, which didn't recover after the memory had been released back to the system.
* em: Port an i219 errata workaround from FreeBSD | sf | 2018-04-07 | 1 | -2/+4
  https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/i218-i219-ethernet-connection-spec-update.pdf?asset=9561
  ok mikeb@ jsg@
* em: Print error code and phy/mac type | sf | 2018-04-07 | 1 | -3/+6
  Print the error code if hardware initialization failed. If EM_DEBUG is defined, print the phy/mac type during attach.
  ok mikeb@ jsg@
* Add untested support for Cannon Lake and Ice Lake Ethernet (pch_cnp). | jsg | 2018-03-16 | 1 | -5/+15
  Going by changes in FreeBSD and Linux it is almost identical to pch_spt but doesn't need one of the workarounds for a pch_spt specific erratum.
* match two more copper i210 ids | jsg | 2018-03-10 | 1 | -1/+3
* Add another ICH10 em(4). From John the.cheeze at gmail. | jsg | 2018-03-10 | 1 | -1/+2
* The LINK_STATE_IS_UP() macro considers LINK_STATE_UNKNOWN as up. | bluhm | 2017-07-25 | 1 | -6/+6
  So the em(4) driver never got out of that state. Better compare the new link state value with the old one, like other drivers do.
  bug report Matthias Pitzl; OK deraadt@
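The difference between the two approaches can be sketched with a toy model. Everything here is hypothetical (the `toy_` enum and functions are not the kernel definitions); the only property carried over from the commit message is that the "is up" test treats the unknown state as up.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy link states; not the kernel's LINK_STATE_* values. */
enum toy_link_state { TOY_LINK_UNKNOWN, TOY_LINK_DOWN, TOY_LINK_UP };

/* Mirrors the property from the commit message: unknown counts as up. */
static bool
toy_link_is_up(enum toy_link_state s)
{
	return s == TOY_LINK_UP || s == TOY_LINK_UNKNOWN;
}

/*
 * Problematic style: only report when the "is up" answer flips.
 * A transition from UNKNOWN to UP never flips it, so nothing is reported.
 */
static bool
toy_update_by_predicate(enum toy_link_state *cur, enum toy_link_state next)
{
	if (toy_link_is_up(*cur) == toy_link_is_up(next))
		return false;
	*cur = next;
	return true;
}

/* Fixed style: compare the new value with the old one directly. */
static bool
toy_update_by_value(enum toy_link_state *cur, enum toy_link_state next)
{
	if (*cur == next)
		return false;
	*cur = next;
	return true;
}
```

Starting from the unknown state, the predicate-based update ignores a link that comes up, while the value comparison records it.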
* Match the Kaby Lake and Lewisburg (Skylake-EP PCH) MACs with I219 PHYs. | jsg | 2017-03-19 | 1 | -2/+7
  Expanded version of a diff from claudio@ who tested on x270
  ok kettenis@
* add support for multiple transmit ifqueues per network interface. | dlg | 2017-01-24 | 1 | -7/+8
  an ifq to transmit a packet is picked by the current traffic conditioner (ie, priq or hfsc) by providing an index into an array of ifqs. by default interfaces get a single ifq but can ask for more using if_attach_queues().
  the vast majority of our drivers still think there's a 1:1 mapping between interfaces and transmit queues, so their if_start routines take an ifnet pointer instead of a pointer to the ifqueue struct. instead of changing all the drivers in the tree, drivers can opt into using an if_qstart routine and setting the IFXF_MPSAFE flag. the stack provides a compatibility wrapper from the new if_qstart handler to the previous if_start handlers if IFXF_MPSAFE isnt set.
  enabling hfsc on an interface configures it to transmit everything through the first ifq. any other ifqs are left configured as priq, but unused, when hfsc is enabled.
  getting this in now so everyone can kick the tyres.
  ok mpi@ visa@ (who provided some tweaks for cnmac).
* move counting if_opackets next to counting if_obytes in if_enqueue. | dlg | 2017-01-22 | 1 | -3/+1
  this means packets are consistently counted in one place, unlike the many and various ways that drivers thought they should do it.
  ok mpi@ deraadt@
* tell ix and em to use 2k+ETHER_ALIGN clusters for rx on all archs. | dlg | 2016-10-27 | 1 | -3/+1
  this means that the ethernet header and therefore its payload will be aligned correctly for the stack. without this em and ix are suffering a 30 to 40 percent hit in forwarding performance because the ethernet stack expects to be able to prepend 8 bytes for an ethernet header so it can guarantee its alignment. because em and ix only had 6 bytes where the ethernet header was, it always prepends an mbuf which turns out to be expensive. this way the prepend will be cheap because the 8 byte space will exist.
  2k+ETHER_ALIGN clusters will end up using the newly created mcl2k2 pool.
  the regression was isolated and the fix tested by hrvoje popovski.
  ok mikeb@
* G/C IFQ_SET_READY(). | mpi | 2016-04-13 | 1 | -2/+1
* Add support for the Intel i219 network chip to the em(4) driver. | bluhm | 2016-02-18 | 1 | -7/+175
  from Christian Ehrhardt; input jsg@; OK deraadt@ sthen@ mpi@ jsg@
  tested by sthen@ jca@ benno@ bluhm@
* post the packet on em_82547 chips after bpf | dlg | 2016-01-12 | 1 | -7/+13
  now that start and txeof can run on different cpus, txeof could have freed the mbuf before bpf got to it.
* do further work on the em transmit path to simplify the code. | dlg | 2016-01-11 | 1 | -197/+140
  no one could understand how em_txeof worked, so i rewrote it.
  this also gets rid of the sc_tx_desc_free var that needed atomic ops. space to use in em_start and space to free in em_txeof is now calculated from the producer and consumer.
  testers have reported better responsiveness with this. somehow.
  if em issues persist after this, im rolling back to pre-mpsafe changes.
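Deriving ring occupancy from the producer and consumer indices, rather than from a separately maintained free counter, can be sketched like this. The helper names and the convention of keeping one slot unused (so that equal indices always mean "empty") are assumptions for illustration, not taken from the em(4) source.

```c
#include <assert.h>

/* Slots currently in use, given producer/consumer indices in [0, ndesc). */
static unsigned int
ring_used(unsigned int prod, unsigned int cons, unsigned int ndesc)
{
	if (prod >= cons)
		return prod - cons;
	/* The producer has wrapped around the end of the ring. */
	return ndesc - cons + prod;
}

/*
 * Slots the producer may still fill. One slot is kept unused (an assumed
 * convention) so that prod == cons is never ambiguous between full and empty.
 */
static unsigned int
ring_free(unsigned int prod, unsigned int cons, unsigned int ndesc)
{
	return ndesc - 1 - ring_used(prod, cons, ndesc);
}
```

Because the transmit start path only advances the producer and the completion path only advances the consumer, each side can compute the space it cares about without a shared counter that both would have to update atomically.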
* consistently use the desc ring pointers as guards for their dmamem. | dlg | 2016-01-09 | 1 | -3/+8
* look at pkts inside the loop over the pkts in em_free_receive_structures. | dlg | 2016-01-07 | 1 | -2/+2
* rename em_buffers to em_packets. | dlg | 2016-01-07 | 1 | -138/+133
  shorten a bunch of variable names while here.
* rename the rx and tx ring softc vars. | dlg | 2016-01-07 | 1 | -138/+138
* prefix the rx and tx ring softc members with sc_ | dlg | 2016-01-07 | 1 | -153/+155
* hoist the rx ring dmamap syncs out of em_get_buf into em_rxfill. | dlg | 2016-01-07 | 1 | -9/+9
  this lets us do the syncs once for a fill of the ring instead of once for every packet put onto the ring. it mirrors how we try to do things for tx.
* unify the bus_dmamap_sync calls around the tx and rx rings. | dlg | 2016-01-07 | 1 | -26/+31
* simplify the calculation of the dmamem size for the tx and rx rings. | dlg | 2016-01-07 | 1 | -21/+6
  we dont user config of the ring size, especially before attach time, and the dmamem api takes care of rounding up to PAGE_SIZE if it needs to.
* unify the dma tag into sc_dmat in em_softc. | dlg | 2016-01-07 | 1 | -103/+82
* sprinkle DEVNAME | dlg | 2016-01-07 | 1 | -30/+30
* rename the struct arpcom interface_data in em_softc to sc_ac. | dlg | 2016-01-07 | 1 | -21/+19
  makes it more consistent with the rest of the tree.
* rename em_softc sc_dv to sc_dev. like ALL OUR OTHER DRIVERS. | dlg | 2016-01-07 | 1 | -30/+30
* tweak em to make it mpsafe, both for interrupts and if_start. | dlg | 2016-01-07 | 1 | -111/+61
  this is mostly work by kettenis and claudio, with further work from me to make the transmit side from the stack mpsafe.
  there's a watchdog issue that will be worked on in tree after this change.
  tested by hrvoje popovski and gregor best
  ok mpi@ claudio@ deraadt@ jmatthew@
* 82544 on pcix busses needs a workaround that effectively doubles | dlg | 2015-12-31 | 1 | -3/+3
  the possible number of slots a packet can use on the tx ring.
  to make it easier to reserve and account for space on the ring, halve the number of dma descriptors on those chips so the number of slots can stay the same.
  ok claudio@
* replace IFF_OACTIVE manipulation with mpsafe operations. | dlg | 2015-11-25 | 1 | -6/+7
  there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too.
  IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we dont have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operations mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change.
  instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.
  this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifq_{set,clr,is}_oactive API too.
  ok kettenis@ mpi@ jmatthew@ deraadt@
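The idea of moving the mark out of a shared flags word can be sketched with toy types. All `toy_` and `TOY_` names are hypothetical and the accessor bodies are an assumption about the general shape, not the kernel implementation; the point is only that a dedicated field is changed with a plain store rather than a read-modify-write on a word shared with unrelated flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy flag bits sharing one short, as with the old in-flags approach. */
#define TOY_IFF_UP	0x0001
#define TOY_IFF_RUNNING	0x0040

struct toy_ifnet {
	unsigned short if_flags;	/* other flags stay here */
};

struct toy_ifq {
	unsigned int oactive;		/* the mark now has its own field */
};

/* Setting or clearing the mark never reads or rewrites if_flags. */
static void
toy_ifq_set_oactive(struct toy_ifq *q)
{
	q->oactive = 1;
}

static void
toy_ifq_clr_oactive(struct toy_ifq *q)
{
	q->oactive = 0;
}

static bool
toy_ifq_is_oactive(const struct toy_ifq *q)
{
	return q->oactive != 0;
}
```

A driver's start routine would set the mark when its ring is full and the completion path would clear it, while code updating `if_flags` elsewhere can no longer clobber it.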
* Revert all the changes to run the tx completion path without holding the | mpi | 2015-11-20 | 1 | -35/+44
  KERNEL_LOCK.
  A piece is still not right as many people reported a "watchdog timeout" problem.
  This basically brings us back to r1.305.
  ok dlg@, jmatthew@
* shuffle struct ifqueue so in flight mbufs are protected by a mutex. | dlg | 2015-11-20 | 1 | -3/+4
  the code is refactored so the IFQ macros call newly implemented ifq functions. the ifq code is split so each discipline (priq and hfsc in our case) is an opaque set of operations that the common ifq code can call. the common code does the locking, accounting (ifq_len manipulation), and freeing of the mbuf if the discipline's enqueue function rejects it. theyre kind of like bufqs in the block layer with their fifo and nscan disciplines.
  the new api also supports atomic switching of disciplines at runtime. the hfsc setup in pf_ioctl.c has been tweaked to build a complete hfsc_if structure which it attaches to the send queue in a single operation, rather than attaching to the interface up front and building up a list of queues.
  the send queue is now mutexed, which raises the expectation that packets can be enqueued or purged on one cpu while another cpu is dequeueing them in a driver for transmission. a lot of drivers use IFQ_POLL to peek at an mbuf and attempt to fit it on the ring before committing to it with a later IFQ_DEQUEUE operation. if the mbuf gets freed in between the POLL and DEQUEUE operations, fireworks will ensue.
  to avoid this, the ifq api introduces ifq_deq_begin, ifq_deq_rollback, and ifq_deq_commit. ifq_deq_begin allows a driver to take the ifq mutex and get a reference to the mbuf they wish to try and tx. if there's space, they can ifq_deq_commit it to remove the mbuf and release the mutex. if there's no space, ifq_deq_rollback simply releases the mutex. this api was developed to make updating the drivers using IFQ_POLL easy, instead of having to do significant semantic changes to avoid POLL that we cannot test on all the hardware.
  the common code has been tested pretty hard, and all the driver modifications are straightforward except for de(4). if that breaks it can be dealt with later.
  ok mpi@ jmatthew@
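The begin/commit/rollback protocol described in this commit can be modelled in a few lines. This is a single-threaded toy, not the kernel API: the `toy_` types and functions are hypothetical, a boolean stands in for the mutex, and releasing the lock when the queue is empty is an assumption about how such an interface would behave.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_pkt {
	struct toy_pkt *next;
	int id;
};

struct toy_ifq {
	bool locked;		/* stands in for the ifq mutex */
	struct toy_pkt *head;
	unsigned int len;
};

/* Take the lock and peek at the head; drop the lock if there is nothing. */
static struct toy_pkt *
toy_deq_begin(struct toy_ifq *q)
{
	assert(!q->locked);
	q->locked = true;
	if (q->head == NULL)
		q->locked = false;
	return q->head;
}

/* The packet fits: remove it from the queue and release the lock. */
static void
toy_deq_commit(struct toy_ifq *q, struct toy_pkt *p)
{
	assert(q->locked && q->head == p);
	q->head = p->next;
	q->len--;
	q->locked = false;
}

/* The packet does not fit: leave it queued and release the lock. */
static void
toy_deq_rollback(struct toy_ifq *q, struct toy_pkt *p)
{
	assert(q->locked && q->head == p);
	q->locked = false;
}

/* Driver-style step: dequeue the head only if `need` slots fit in `space`. */
static bool
toy_start_one(struct toy_ifq *q, unsigned int space, unsigned int need)
{
	struct toy_pkt *p = toy_deq_begin(q);

	if (p == NULL)
		return false;
	if (need > space) {
		toy_deq_rollback(q, p);
		return false;
	}
	toy_deq_commit(q, p);
	return true;
}
```

Because the lock is held between the peek and the decision, nothing else can free or purge the packet in that window, which is the hazard the commit describes for the separate POLL and DEQUEUE operations.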
* fix newlines on an error message | jsg | 2015-10-29 | 1 | -2/+2
* arp_ifinit() is no longer needed. | mpi | 2015-10-25 | 1 | -4/+1