path: root/sys/dev/pci/if_vmx.c
Commit message (Author, Age, Files, Lines)
* Rename the macro MCLGETI to MCLGETL and remove the dead parameter ifp. (jan, 2020-12-12, 1 file, -2/+2)
  OK dlg@, bluhm@; No Opinion mpi@; Not against it claudio@
* Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API. (patrick, 2020-07-10, 1 file, -2/+2)
  ok dlg@ tobhe@
* apparently vmx(4) needs a power of 2 number of interrupts. (dlg, 2020-07-07, 1 file, -2/+2)
  so we pass INTRMAP_POWEROF2 to intrmap_create and things are better. reported by and fixed by mark patruck. thanks :)
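The power-of-two constraint can be illustrated with a small helper. This is a minimal sketch, not the intrmap code: `round_down_pow2` is a hypothetical name, and rounding down (rather than up) is an assumption about how a request would be trimmed so that it still fits the vectors that are available.

```c
#include <assert.h>

/*
 * Hypothetical helper: return the largest power of two that is <= n.
 * n must be non-zero.
 */
unsigned int
round_down_pow2(unsigned int n)
{
	unsigned int p = 1;

	assert(n != 0);

	/* Double p while the doubled value still fits in n. */
	while (p <= n / 2)
		p *= 2;

	return p;
}
```

For example, a machine that could otherwise use 6 queues would be trimmed to 4 under this assumption.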
* add kstat support for reading the "hardware" counters for each ring. (dlg, 2020-07-07, 1 file, -20/+150)
  the counters happen to be a series of uint64_t values in memory, so we treat them as arrays that get mapped to a series of kstat_kv structs that are set up as 64 bit counters with either packet or byte counters as appropriate. this helps keep the code size down.

  while we export the counters as separate kstats per rx and tx ring, you request an update from the hypervisor at the controller level. this code ratelimits these requests to 1 per second per interface to try and debounce this a bit so each kstat read doesn't cause a vmexit.

  here's an example of the stats. note that we get to see how many packets that rx ring moderation drops for the first time. see the "no buffers" stat.

  vmx0:0:rxq:5
      packets: 2372483 packets
      bytes: 3591909057 bytes
      qdrops: 0 packets
      errors: 0 packets
      qlen: 0 packets
  ...
  vmx0:0:txq:5
      packets: 1316856 packets
      bytes: 86961577 bytes
      qdrops: 0 packets
      errors: 0 packets
      qlen: 1 packets
      maxqlen: 512 packets
      oactive: false
  ...
  vmx0:0:vmx-rxstats:5
      LRO packets: 0 packets
      LRO bytes: 0 bytes
      ucast packets: 2372483 packets
      ucast bytes: 3591909053 bytes
      mcast packets: 0 packets
      mcast bytes: 0 bytes
      bcast packets: 0 packets
      bcast bytes: 0 bytes
      no buffers: 696 packets
      errors: 0 packets
  ...
  vmx0:0:vmx-txstats:5
      TSO packets: 0 packets
      TSO bytes: 0 bytes
      ucast packets: 1316839 packets
      ucast bytes: 86960455 bytes
      mcast packets: 0 packets
      mcast bytes: 0 bytes
      bcast packets: 0 packets
      bcast bytes: 0 bytes
      errors: 0 packets
      discards: 0 packets
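The once-per-second debounce described in this commit can be sketched in plain C. This is an illustration under stated assumptions, not the driver's code: `struct stats_ratelimit` and `stats_read` are hypothetical names, the clock is passed in as milliseconds from a monotonic source, and the "refresh" is only counted rather than actually issued to a hypervisor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-interface state; not the driver's actual softc. */
struct stats_ratelimit {
	bool		valid;		/* false until the first refresh */
	uint64_t	last_ms;	/* time of the last refresh */
	unsigned int	refreshes;	/* number of expensive refreshes */
};

/*
 * Called on every stats read. Returns true if this read triggered a
 * refresh (standing in for the request to the hypervisor), or false
 * if it was served from the counters already in memory.
 */
bool
stats_read(struct stats_ratelimit *rl, uint64_t now_ms)
{
	/* Assumes a monotonic clock: time never goes backwards. */
	assert(!rl->valid || now_ms >= rl->last_ms);

	if (rl->valid && now_ms - rl->last_ms < 1000)
		return false;	/* refreshed less than a second ago */

	rl->valid = true;
	rl->last_ms = now_ms;
	rl->refreshes++;
	return true;
}
```

Keeping the state per interface rather than per ring matches the commit's point that the update is requested at the controller level even though the kstats are exported per ring.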
* report rx ring state for all rings, not just the first one. (dlg, 2020-06-25, 1 file, -3/+30)
  systat mbuf looks a bit better now.
* actually use pci_intr_establish_cpu with cpus from the intrmap. (dlg, 2020-06-24, 1 file, -4/+3)
  sigh, i don't know how i forgot this. yes jmatthew@
* if the chip did rss, use the hash from the chip as an mbuf flowid. (dlg, 2020-06-17, 1 file, -1/+6)
  another sniped commit from jmatthew@
* enable multiple queues (and interrupts on multiple cpus) on vmx(4). (dlg, 2020-06-17, 1 file, -31/+51)
  i'm doing this with vmx(4) because it only exists on two archs (well, one and a half archs really) so any impact is localised. most other drivers i'm working on are enabled on 3 or 4 archs, and we're still working on the interrupt code on those archs.

  in the meantime vmx(4) can be used as a reference driver on how to implement multiq. it shows the use of rss, toeplitz, intrmap, and interrupts on multiple cpus. it's also a relatively simple device, which makes it easier to understand the above features.

  note that vmx(4) seems to advertise 25 msi-x vectors. it appears that the intention is that 16 of these vectors are supposed to be used for rx, 8 for tx, and 1 for events (eg, link up and down). we're keeping things simple for now and using a maximum of 8 vectors for both tx and rx, and one for events.

  this is mostly based on work that jmatthew@ did, but it's simplified now cos intrmap makes things easier.
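The vector budget described above (one vector for events, at most 8 shared tx/rx vectors) can be sketched as follows. The function names and the constant are hypothetical, and clamping to the number of CPUs is an assumption made for illustration; the placement of the event vector at index 0 follows the MSI-X commit from 2020-05-28 in this log.

```c
#include <assert.h>

#define SKETCH_MAX_QUEUES	8	/* cap on tx/rx queue pairs */

/*
 * Hypothetical helper: how many tx/rx queue pairs to set up, given the
 * number of MSI-X vectors the device advertises and the number of CPUs.
 * One vector is always reserved for events such as link up and down.
 */
unsigned int
sketch_queue_count(unsigned int nvectors, unsigned int ncpus)
{
	unsigned int n;

	assert(nvectors >= 2 && ncpus >= 1);

	n = nvectors - 1;		/* vector 0 is kept for events */
	if (n > SKETCH_MAX_QUEUES)
		n = SKETCH_MAX_QUEUES;
	if (n > ncpus)
		n = ncpus;		/* assumption: no more queues than cpus */

	return n;
}

/* Vector index used by queue pair q: events sit at 0, queues follow. */
unsigned int
sketch_queue_vector(unsigned int q)
{
	return q + 1;
}
```

With the 25 advertised vectors mentioned in the commit, this sketch uses 9 of them at most: vector 0 for events and vectors 1 to 8 for the queue pairs.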
* configure toeplitz using the kernel stoeplitz key if needed. (dlg, 2020-06-16, 1 file, -1/+30)
  "if needed" basically means if more than 1 queue is set up, then set up rss.

  again, i think jmatthew@ wrote most of this, but i'm sniping it cos of the stoeplitz integration.
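For readers unfamiliar with the hash that RSS uses, here is a textbook bit-at-a-time Toeplitz hash. It is a sketch for illustration only, not the kernel's stoeplitz code: `toeplitz_hash` is a hypothetical name, and the key used in the note below is an arbitrary example rather than the kernel's key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Textbook Toeplitz hash: for every set bit in the input, XOR in the
 * 32-bit window of the key that starts at that bit position.
 * The key must be at least 4 bytes longer than the data.
 */
uint32_t
toeplitz_hash(const uint8_t *key, size_t keylen,
    const uint8_t *data, size_t datalen)
{
	uint32_t window, hash = 0;
	size_t i, next = 32;	/* index of the next key bit to shift in */
	int b;

	assert(keylen >= datalen + 4);

	/* The window starts as the first 32 bits of the key. */
	window = (uint32_t)key[0] << 24 | (uint32_t)key[1] << 16 |
	    (uint32_t)key[2] << 8 | (uint32_t)key[3];

	for (i = 0; i < datalen; i++) {
		for (b = 7; b >= 0; b--) {	/* most significant bit first */
			if ((data[i] >> b) & 1)
				hash ^= window;

			/* Slide the window one bit along the key. */
			window <<= 1;
			if ((key[next / 8] >> (7 - next % 8)) & 1)
				window |= 1;
			next++;
		}
	}

	return hash;
}
```

If the key is built by repeating a single 16-bit value (for example the bytes 0x6d, 0x5a over and over), the window depends only on the bit position modulo 16, so swapping the source and destination addresses and ports of a flow leaves the hash unchanged. That symmetry is the property a symmetric Toeplitz key is meant to provide.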
* Use MSI-X interrupts where available, and rearrange structures to allow (jmatthew, 2020-05-28, 1 file, -56/+148)
  for multi-queue operation. Vector 0 is used for events, and the subsequent vectors are mapped to a tx and rx queue each.

  tested on esxi 6.7 and qemu by me, and on vmware fusion by dlg@
  ok dlg@
* tweak the rx path to look more like the tx path. (dlg, 2019-10-27, 1 file, -80/+72)
  it's a bit shorter, and a bit more correct wrt use of bus_dma. still a bit to go though.
* fix the last commit. (dlg, 2019-10-27, 1 file, -4/+4)
  if gen is toggled per packet, then it needs to be toggled before each packet, not before the loop. also, if 0 out the right offload. brad pointed out the if 0 bit.
* put vlan tag offload back in (dlg, 2019-10-26, 1 file, -6/+13)
* make vmx transmit (vmxnet3_start) mpsafe. (dlg, 2019-10-26, 1 file, -146/+101)
  to make mpsafety a bit easier to figure out i disabled checksum and vlan offload. i'll put them back in soon though.
* avoid rxr races between rxfill from the interrupt and timeout paths (dlg, 2019-10-26, 1 file, -6/+32)
  vmx "hardware" seems to be able to use rx descriptors as soon as they're filled in, which means filling the ring from a timeout can run concurrently with an isr that's pulling stuff off the ring. this is mostly a problem with the rxr accounting, so we serialise updates to the alive counter by running rxfill in a mutex.

  most of the investigation was done by claudio@ and mathieu@
  an earlier version of this diff was tested hard by mathieu@ and was ok@ claudio
* remove some debug cruft i should have removed before the last commit. (dlg, 2019-08-06, 1 file, -6/+2)
* have a go at using msi interrupts. (dlg, 2019-08-06, 1 file, -7/+38)
  vmx has an interesting feature where config in the hypervisor can say what type of interrupts the guest should configure for the nic, with the options of auto, msix, msi, and intx. depending on this, the driver should try to map the type specified and fall back from there.

  also interesting is that my guest gets "auto" from the hypervisor, which i fall through to msi with, but an msi interrupt cannot be mapped. i cannot see any msi interrupts in this guest actually. there must be something funky at the platform level that we don't like, and that prevents msi from being mapped.

  if msi does get mapped, we should be able to avoid a register read on every interrupt. that should probably provide a noticeable performance improvement if we can ever take advantage of it.
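The "try the configured type, then fall back" logic can be sketched with a switch that falls through from the most to the least capable interrupt type. Everything here is hypothetical: the enum, the `SKETCH_CAN_*` bits, and `sketch_pick_intr` are illustrative names, the bitmask stands in for whatever the platform actually lets the driver map, and treating "auto" as starting from msix is an assumption (at the time of this commit the author fell through from auto to msi).

```c
#include <assert.h>

/* Hypothetical names for the hypervisor's interrupt type setting. */
enum sketch_intr {
	SKETCH_INTR_AUTO,
	SKETCH_INTR_MSIX,
	SKETCH_INTR_MSI,
	SKETCH_INTR_INTX,
	SKETCH_INTR_NONE	/* result only: nothing could be mapped */
};

/* Which types the platform is able to map (stand-in for real probing). */
#define SKETCH_CAN_MSIX	0x1
#define SKETCH_CAN_MSI	0x2
#define SKETCH_CAN_INTX	0x4

/*
 * Start at the configured type and fall back towards intx until one
 * of the types can be mapped.
 */
enum sketch_intr
sketch_pick_intr(enum sketch_intr configured, unsigned int mappable)
{
	/* SKETCH_INTR_NONE is a result, not a valid configuration. */
	assert(configured != SKETCH_INTR_NONE);

	switch (configured) {
	case SKETCH_INTR_AUTO:
	case SKETCH_INTR_MSIX:
		if (mappable & SKETCH_CAN_MSIX)
			return SKETCH_INTR_MSIX;
		/* FALLTHROUGH */
	case SKETCH_INTR_MSI:
		if (mappable & SKETCH_CAN_MSI)
			return SKETCH_INTR_MSI;
		/* FALLTHROUGH */
	case SKETCH_INTR_INTX:
		if (mappable & SKETCH_CAN_INTX)
			return SKETCH_INTR_INTX;
		break;
	default:
		break;
	}

	return SKETCH_INTR_NONE;
}
```

The situation described in the commit, where the guest is told "auto" but msi cannot be mapped, corresponds to calling this with only `SKETCH_CAN_INTX` set, which ends up at intx.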
* i replaced a misplaced tab with g, not a space. make this work again. (dlg, 2019-08-06, 1 file, -2/+2)
* if the rx ring gets empty and can't be filled, retry in the future (dlg, 2019-08-06, 1 file, -19/+40)
  there have been several reports that vmx gets stuck sometimes and only comes good after it's taken down and up again. hopefully this fixes that issue.
* use ifiq_input so we can call if_rxr_livelocked to apply backpressure (dlg, 2019-08-06, 1 file, -2/+3)
* move counting if_opackets next to counting if_obytes in if_enqueue. (dlg, 2017-01-22, 1 file, -2/+1)
  this means packets are consistently counted in one place, unlike the many and various ways that drivers thought they should do it.

  ok mpi@ deraadt@
* G/C IFQ_SET_READY(). (mpi, 2016-04-13, 1 file, -2/+1)
* Improve the previous fix: call vmxnet3_load_mbuf, bpf_mtap, and flip (reyk, 2016-01-26, 1 file, -8/+11)
  the generation bit to pass the tx descriptor and mbuf to the "hardware". This way bpf is not called if vmxnet3_load_mbuf dropped the mbuf.

  Tested by me
  OK mikeb@
* In vmxnet3_start(), do not send the mbuf to bpf after passing it to (reyk, 2016-01-25, 1 file, -6/+6)
  the hardware. This could have resulted in a page fault when the mbuf has already been freed by the TX interrupt handler on another CPU.

  This has the slight drawback that bpf can be called before the packet is eventually dropped by vmxnet3_load_mbuf() - but I'm getting this simple and verified fix in before doing further optimizations on the start handler.

  As discussed with mikeb@ jsg@ dlg@
* Record the modified mbuf chain after transmit checksum setup code (mikeb, 2016-01-04, 1 file, -10/+11)
  Keep the modified chain pointer and pass it back to the calling code so that it will get properly accounted for. Change it to m_pullup since m_pulldown with a zero offset is just as good.

  Tested by yasuoka@, myself and mxb at alumni ! chalmers ! se, thanks!
  ok yasuoka, mpi
* replace IFF_OACTIVE manipulation with mpsafe operations. (dlg, 2015-11-25, 1 file, -7/+8)
  there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too.

  IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we don't have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operations mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change.

  instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.

  this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifq_{set,clr,is}_oactive API too.

  ok kettenis@ mpi@ jmatthew@ deraadt@
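The idea of giving the oactive mark its own word, separate from the shared flags field, can be sketched in userland C11. This is not the kernel's implementation: the `sketch_` names are hypothetical, and the use of `<stdatomic.h>` is an assumption chosen to make the example self-contained. The point it shows is that a dedicated field can be set and cleared without a read-modify-write on unrelated flag bits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the queue structure holding the mark. */
struct sketch_ifqueue {
	atomic_uint	oactive;	/* non-zero while the tx ring is full */
};

/* Mark the queue as busy: the driver cannot take more packets. */
void
sketch_set_oactive(struct sketch_ifqueue *q)
{
	assert(q != NULL);
	atomic_store(&q->oactive, 1);
}

/* Clear the mark once descriptors have been freed up. */
void
sketch_clr_oactive(struct sketch_ifqueue *q)
{
	assert(q != NULL);
	atomic_store(&q->oactive, 0);
}

/* Query the mark without touching any other interface state. */
bool
sketch_is_oactive(struct sketch_ifqueue *q)
{
	assert(q != NULL);
	return atomic_load(&q->oactive) != 0;
}
```

Because the whole word belongs to the mark, a plain store is enough; no other writer's update can be lost, which is the clobbering problem the commit describes for a bit inside if_flags.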
* No need for "vlan.h" if you don't check for "#if NVLAN > 0". (mpi, 2015-11-24, 1 file, -2/+1)
* No need to include <net/if_arp.h> (mpi, 2015-11-24, 1 file, -2/+1)
  This header is only needed because <netinet/if_ether.h> declares a structure that needs it. But it turns out that <net/if.h> already includes it as workaround.

  A proper solution would be to stop declaring "struct ether_arp" there. But no driver should need this header.
* The only network driver needing <net/if_types.h> is upl(4) for IFT_OTHER. (mpi, 2015-11-24, 1 file, -2/+1)
* Include <sys/atomic.h> when atomic operations are used. (mpi, 2015-11-23, 1 file, -1/+2)
  This has been masked because <sys/srp.h> is pulled unconditionally.

  ok dlg@
* Do not include <net/if_vlan_var.h> when it's not necessary. (mpi, 2015-11-14, 1 file, -3/+1)
  Because of the VLAN hacks in mpw(4) this file still contains the definition of "struct ifvlan" which depends on <sys/refcnt.h> which in turn pulls <sys/atomic.h>...
* arp_ifinit() is no longer needed. (mpi, 2015-10-25, 1 file, -4/+1)
* brad points out i need bpf_mtap_ether to reconstruct vlan headers (dlg, 2015-09-20, 1 file, -2/+2)
* need to keep bpf in the tx path. got a bit ahead of myself there... (dlg, 2015-09-20, 1 file, -1/+6)
  noticed by brad
* make vmx(4) interrupts mpsafe. (dlg, 2015-09-18, 1 file, -63/+97)
  the vmx rx path is only touched in the interrupt handler, so it is already guaranteed to be accessed by only one cpu at a time.

  the tx path has been massaged so that the producer is only touched by the start routine, and the consumer is only touched by the interrupt path, and can therefore be run concurrently. the only interlock is a count of the free descriptors. if txintr clears IFF_OACTIVE, it takes the kernel lock before running the start routine.

  other interrupts, eg, link state handling, take the kernel lock.
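The producer/consumer split with a single shared count can be sketched as a tiny ring in userland C11. This is an analogy under stated assumptions, not the driver's tx path: the `sketch_` names and the ring size are hypothetical, integers stand in for packets, and the consumer uses the free count to decide whether work is pending, whereas the real interrupt path is driven by completions from the device.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NTXDESC	4	/* hypothetical ring size */

struct sketch_txring {
	unsigned int	prod;	/* written only by the start routine */
	unsigned int	cons;	/* written only by the tx interrupt */
	atomic_uint	free;	/* the one field both sides update */
	int		slots[SKETCH_NTXDESC];
};

void
sketch_txring_init(struct sketch_txring *r)
{
	r->prod = r->cons = 0;
	atomic_init(&r->free, SKETCH_NTXDESC);
}

/* Start routine: queue one packet, or report that the ring is full. */
bool
sketch_tx_start(struct sketch_txring *r, int pkt)
{
	assert(r != NULL);

	if (atomic_load(&r->free) == 0)
		return false;	/* a driver would mark itself oactive here */

	r->slots[r->prod] = pkt;
	r->prod = (r->prod + 1) % SKETCH_NTXDESC;
	atomic_fetch_sub(&r->free, 1);	/* publish after the slot is written */
	return true;
}

/* Interrupt path: reclaim the oldest in-flight packet, if there is one. */
bool
sketch_tx_intr(struct sketch_txring *r, int *pkt)
{
	assert(r != NULL && pkt != NULL);

	if (atomic_load(&r->free) == SKETCH_NTXDESC)
		return false;	/* nothing in flight */

	*pkt = r->slots[r->cons];
	r->cons = (r->cons + 1) % SKETCH_NTXDESC;
	atomic_fetch_add(&r->free, 1);	/* hand the slot back to the producer */
	return true;
}
```

Since `prod` and `cons` each have exactly one writer, neither needs a lock; only `free` is updated from both sides, which mirrors the commit's statement that the free descriptor count is the only interlock.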
* Increment if_ipackets in if_input(). (mpi, 2015-06-24, 1 file, -2/+1)
  Note that pseudo-drivers not using if_input() are not affected by this conversion.

  ok mikeb@, kettenis@, claudio@, dlg@
* Check if interface was stopped before calling rx/tx interrupt routines. (mikeb, 2015-06-04, 1 file, -4/+7)
  Report & tests by mxb@alumni.chalmers.se, thanks!
  OK deraadt, chris
* Revert unrelated changes in previous. (uebayasi, 2015-05-29, 1 file, -26/+1)
* Initial addition of ``Patrol Read'' support in bio(4), bioctl(8), and (uebayasi, 2015-05-29, 1 file, -1/+26)
  mfi(4). Based on FreeBSD, but done without mfiutil(8).

  OK deraadt@
* bump the number of tx and rx descriptors from 128 up to 512. (dlg, 2015-05-26, 1 file, -3/+3)
* Remove some includes include-what-you-use claims don't (jsg, 2015-03-14, 1 file, -2/+1)
  have any direct symbols used. Tested for indirect use by compiling amd64/i386/sparc64 kernels.

  ok tedu@ deraadt@
* convert VMXNET drivers to ml_enqueue + if_input (pelikan, 2015-02-10, 1 file, -9/+6)
  ok dlg reyk
* fix print/panic messages + remove superfluous if_ibytes addition (pelikan, 2015-02-09, 1 file, -5/+3)
  ok reyk
* unifdef INET (tedu, 2014-12-22, 1 file, -3/+1)
* Rearrange mostly vmxnet3_init() to look like other Ethernet drivers. (brad, 2014-12-19, 1 file, -12/+22)
  ok reyk@
* don't base the mru on the mtu. unconditionally make it what the (dlg, 2014-08-26, 1 file, -22/+7)
  hardware can do (9k). implement the rxr ioctl while here.

  ok jsg@
* Fewer <netinet/in_systm.h> (mpi, 2014-07-22, 1 file, -2/+1)
* Remove left-over call to removed function m_clsetwms(). (stsp, 2014-07-08, 1 file, -3/+1)
  ok mpi
* cut things that relied on mclgeti for rx ring accounting/restriction over (dlg, 2014-07-08, 1 file, -10/+14)
  to using if_rxr. cut the reporting systat did over to the rxr ioctl.

  tested as much as i can on alpha, amd64, and sparc64. mpi@ has run it on macppc.
  ok mpi@
* - Unconditionally set IFCAP_VLAN_MTU (brad, 2014-01-22, 1 file, -26/+26)
  - Bring the receive filter handling in line with other drivers
  - Simplify the RX checksum code a bit and only set the flags when the RX checksum is Ok

  ok uebayasi@