path: root/sys/netinet/ip_output.c (follow)
Commit message | Author | Age | Files | Lines
* add IP_IPSECFLOWINFO option to sendmsg() and recvmsg(), so npppd(4) | markus | 2012-07-16 | 1 | -2/+12
  can use this to select the IPsec tunnel for sending L2TP packets. this fixes Windows (always binding to 1701) and Android clients (negotiating wildcard flows); feedback mpf@ and yasuoka@; ok henning@ and yasuoka@; ok jmc@ for the manpage
* unnecessary casts to unsigned; ok claudio | deraadt | 2012-04-13 | 1 | -5/+5
* Bring the rtable sockopt code in line with the setrtable() implementation. | claudio | 2012-04-07 | 1 | -8/+9
  While there change IP_RTABLE to SO_RTABLE. IP_RTABLE will die soon.
  With and OK guenther@
* actually store the result of the pmtu-route lookup. otherwise we | markus | 2012-03-30 | 1 | -3/+2
  don't have a MTU to announce in the icmp need fragment packet. this fixes PMTU-discovery for TCP over IPsec; ok mpf@, fries@
* remove IP_JUMBO, SO_JUMBO, and RTF_JUMBO. | dlg | 2012-03-17 | 1 | -9/+1
  no objection from mcbride@ krw@ markus@ deraadt@
* Escape hardware-checksumming if interface is in a bridge, this is | haesbaert | 2011-12-29 | 1 | -4/+7
  already done for UDP/TCP/ICMP. This fixes a problem where checksumming would not be computed if you have a bridge with at least one interface with hardware checksumming and another without.
  Discussed with sthen@ and henning@, this is somewhat a temporary fix, we should not have these special bridge cases in ip_output, as Henning said, the bridge must behave. But for that to work we need to poke the bridge harder, this problem has been seen by at least two users at:
  http://marc.info/?l=openbsd-misc&m=132391433319512&w=2
  http://marc.info/?l=openbsd-misc&m=132234363030132&w=2
  I promised to work on a better diff :-).
  ok henning@ sthen@ mikeb@
* Kill unused IFCAP_IPSEC and IFCAP_IPCOMP. | haesbaert | 2011-12-02 | 1 | -5/+3
  ok claudio@ henning@ mikeb@
* Bye bye pf_test6(). Only one pf_test function for both IPv4 and v6. | claudio | 2011-07-04 | 1 | -3/+3
  The functions were 95% identical anyway. While there use struct pf_addr in struct pf_divert instead of some union which is the same.
  OK bluhm@ mcbride@ and most probably henning@ as well
* Add IP_RECVRTABLE socket option to be used with an IPPROTO_IP | mikeb | 2011-06-15 | 1 | -1/+9
  level that allows one to retrieve the original routing domain of UDP datagrams diverted by pf via "divert-to" with a recvmsg(2).
  ok claudio
* Do not allow traffic to be sent with a destination address in 0/8; | weerd | 2011-05-28 | 1 | -1/+10
  this is not allowed according to Stevens and RFCs 5735 and 1122.
  Suggestion to use ENETUNREACH from claudio.
  OK phessler@, claudio@
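  The 0/8 test described in this commit can be sketched in user space as follows. This is an illustration only, not the kernel change itself: the helper name `check_dst_not_zeronet` is hypothetical, and the address is assumed to be in network byte order, as in `struct in_addr`.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel code): return ENETUNREACH when the
 * destination lies in 0.0.0.0/8, and 0 otherwise.
 * dst is in network byte order, like struct in_addr.s_addr.
 */
static int
check_dst_not_zeronet(uint32_t dst)
{
	/* The top octet of the host-order address selects the /8. */
	if ((ntohl(dst) >> 24) == 0)
		return ENETUNREACH;
	return 0;
}
```

  A caller would run this check before the route lookup and fail the send with the returned error.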
* recognize SO_RTABLE socket option at the SOL_SOCKET level; | mikeb | 2011-05-02 | 1 | -2/+2
  discussed with and ok claudio
* Make in_broadcast() rdomain aware. Mostly mechanical change. | claudio | 2011-04-28 | 1 | -2/+3
  This fixes the problem of binding sockets to broadcast IPs in other rdomains.
  OK henning@
* in_proto_csum_out: if M_ICMP_CSUM_OUT is set, do the icmp checksum | henning | 2011-04-05 | 1 | -1/+15
  ok dlg fondue-kinda-ok claudio
* mechanic rename M_{TCP|UDP}V4_CSUM_OUT -> M_{TCP|UDP}_CSUM_OUT | henning | 2011-04-05 | 1 | -7/+7
  ok claudio krw
* de-guttenberg our stack a bit | henning | 2011-04-04 | 1 | -31/+23
  we don't need 7 f***ing copies of the same code to do the protocol checksums (or not, depending on hw capabilities).
  claudio ok
* there is no need to special case the bridge in the ip checksum handling | henning | 2011-04-04 | 1 | -7/+4
  ok sthen claudio dlg
* If a caller is requesting to be set to the same rtable that they | phessler | 2010-09-30 | 1 | -6/+7
  currently have, let the call succeed. Mirrors the same behaviour as setrtable()
  OK claudio@
* add a new IP level socket option IP_PIPEX. This option is used for L2TP | yasuoka | 2010-09-23 | 1 | -1/+13
  support by pipex.
  OK henning@, "Carry on" blambert@
* Return EACCES when pf_test() blocks a packet in ip_output(). This allows | claudio | 2010-09-08 | 1 | -2/+2
  ip_forward() to know the difference between blocked packets and those that can't be forwarded (EHOSTUNREACH). Only in the latter case an ICMP should be sent. In the other callers of ip_output() change the error back to EHOSTUNREACH since userland may not expect EACCES on a sendto().
  OK henning@, markus@
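  The error handling described in this commit can be sketched as two small decisions. Both function names are hypothetical and cover only the two error codes the message names; the real forwarding path handles more cases.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical forwarding-side decision: an ICMP unreachable is only
 * warranted when the packet could not be forwarded (EHOSTUNREACH),
 * not when pf blocked it (EACCES).
 */
static bool
forward_should_send_icmp(int output_error)
{
	return output_error == EHOSTUNREACH;
}

/*
 * Hypothetical mapping for the other callers: userland may not expect
 * EACCES from sendto(), so it is turned back into EHOSTUNREACH.
 * Every other value, including 0 for success, passes through unchanged.
 */
static int
local_sender_error(int output_error)
{
	return output_error == EACCES ? EHOSTUNREACH : output_error;
}
```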
* when sending a fragmented packet, don't check if the interface's send queue | dlg | 2010-08-13 | 1 | -20/+1
  has enough space for all the fragments on it. this check was snuck in by itojun under an unrelated commit. it broke when i set the virtual interface send queue depths to 1, which beck had to special case at n2k10. without this code we avoid these dubious checks along with another splnet/splx pair, and it should make future work on manipulating send queues easier.
  I've been running this in production since n2k10 (~7 months ago).
  ok claudio@ henning@ deraadt@
* Add support for using IPsec in multiple rdomains. | reyk | 2010-07-09 | 1 | -4/+8
  This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
  Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
  ok claudio@ naddy@
* Fix the naming of interfaces and variables for rdomains and rtables | guenther | 2010-07-03 | 1 | -16/+15
  and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
  Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
  Written by claudio@, criticized^Wcritiqued by me
* m_copyback can fail to allocate memory, but is a void function so gymnastics | blambert | 2010-07-02 | 1 | -3/+3
  are required to detect that. Change the function to take a wait argument (used in nfs server, but M_NOWAIT everywhere else for now) and to return an error
  ok claudio@ henning@ krw@
* Allow to specify an alternative enc(4) interface for an SA. All | reyk | 2010-07-01 | 1 | -11/+11
  traffic for this SA will appear on the specified enc interface instead of enc0 and can be filtered and monitored separately. This will allow to group individual ipsec policies to virtual interfaces and simplifies monitoring and pf filtering with many ipsec policies a lot.
  This diff includes the following changes:
  - Store the enc interface unit (default 0) in the TDB of an SA and pass it to the enc_getif() lookup when running the bpf or pf_test() handlers.
  - Add the pfkey SADB_X_EXT_TAP extension to communicate the encX interface unit for a specified SA between userland and kernel.
  - Update enc(4) again to use an allocated array instead of the TAILQ to lookup the matching enc interface in enc_getif() quickly.
  Discussed with many, tested by a few, will need more testing & review.
  ok deraadt@
* Replace enc(4) with a new implementation as a cloner device. We still | reyk | 2010-06-29 | 1 | -3/+6
  create enc0 by default, but it is possible to add additional enc interfaces. This will be used later to allow alternative encs per policy or to have an enc per rdomain when IPsec becomes rdomain-aware.
  manpage bits ok jmc@
  input from henning@ deraadt@ toby@ naddy@
  ok henning@ claudio@
* Start cleaning up the mess called rtalloc*. Kill rtalloc2, make rtalloc1 | claudio | 2010-05-07 | 1 | -9/+12
  accept flags for report and nocloning. Move the rtableid into struct route (with a minor twist for now) and make a few more code paths rdomain aware. Apart from the pf.c and route.c bits the diff is mostly mechanical. More to come...
  OK michele, henning
* Double and in comment. | claudio | 2010-01-13 | 1 | -2/+2
* Allow the queueing of multiple fragments on virtual interfaces with a | beck | 2010-01-12 | 1 | -2/+6
  queue length of one - i.e. vlans with the forthcoming change from dlg. this allows fragmented frames to be sent on such an interface, hoping that the interface underneath copes correctly - A better fix for this will be forthcoming soon, but this is good enough for now, and will allow the change for vlans to use an ifq length of 1.
  tested by me and dlg@, ok dlg@, claudio@, deraadt@
* The process's rdomain should be, well, per-process and not per-rthread, | guenther | 2009-12-23 | 1 | -2/+3
  so put it in struct process instead of struct proc. While at it, move the p_emul member inside struct proc so that it gets copied automatically instead of requiring manual assignment.
  ok deraadt@
* Two cases of IPSEC getsockopt() returning two bytes of uninitialized | deraadt | 2009-12-11 | 1 | -1/+3
  kernel stack content instead of proper information; found by Clement LECIGNE
* Add setrdomain() and getrdomain() system calls. Committing now to | guenther | 2009-11-27 | 1 | -2/+7
  catch the libc major bump per request from deraadt@
  Diff by reyk.
  ok guenther@
* NULL dereference in IPV6_PORTRANGE and IP_IPSEC_*, found by Clement LECIGNE, | guenther | 2009-11-20 | 1 | -2/+2
  localhost DoS everywhere. To help minimize further issues, make the mbuf != NULL test explicit instead of implicit in a length test.
  Suggestions and initial work by mpf@ and miod@
  ok henning@, mpf@, claudio@
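  The "explicit instead of implicit" NULL test can be illustrated with a user-space stand-in. `struct optbuf` and `get_int_option` are hypothetical names used in place of the kernel's mbuf handling; the point is only the order of the checks.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the option buffer the kernel passes as an mbuf. */
struct optbuf {
	size_t len;
	unsigned char data[16];
};

/* Copy an int-sized option value out of m, or return EINVAL. */
static int
get_int_option(const struct optbuf *m, int *out)
{
	/* Test the pointer explicitly first; only then look at the length. */
	if (m == NULL || m->len != sizeof(int))
		return EINVAL;
	memcpy(out, m->data, sizeof(int));
	return 0;
}
```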
* Packets generated by ip_fragment() need to inherit the rdomain from the | claudio | 2009-11-13 | 1 | -2/+3
  original packet or they will trigger the diagnostic check in the interface output routines.
  OK jsg@
* rtables are stacked on rdomains (it is possible to have multiple routing | claudio | 2009-11-03 | 1 | -2/+3
  tables on top of a rdomain) but until now our code was a crazy mix so that it was impossible to correctly use rtables in that case. Additionally pf(4) only knows about rtables and not about rdomains. This is especially bad when tracking (possibly conflicting) states in various domains.
  This diff fixes all or most of these issues. It adds a lookup function to get the rdomain id based on a rtable id. Makes pf understand rdomains and allows pf to move packets between rdomains (it is similar to NAT). Because pf states now track the rdomain id as well it is necessary to modify the pfsync wire format. So old and new systems will not sync up.
  A lot of help by dlg@, tested by sthen@, jsg@ and probably more
  OK dlg@, mpf@, deraadt@
* *NULL store in IP_AUTH_LEVEL, IP_ESP_TRANS_LEVEL, IP_ESP_NETWORK_LEVEL, | deraadt | 2009-10-28 | 1 | -1/+2
  IP_IPCOMP_LEVEL found by Clement LECIGNE, localhost root exploitable on userland/kernel shared vm machines (ie. i386, amd64, arm, sparc (but not sparc64), sh, ...) on OpenBSD 4.3 or older
  ok claudio
* Redo the route lookup in the output (and IPv6 forwarding) path if the | claudio | 2009-10-06 | 1 | -2/+23
  destination of a packet was changed by pf. This allows for some evil games with rdr-to or nat-to but is mostly needed for better rdomain/rtable support. This is a first step and more work and cleanup is needed.
  Here a list of what works and what does not (needs a patched pfctl):
  pass out rdr-to:
    from local rdr-to local addr works (if state tracking on lo0 is done)
    from remote rdr-to local addr does NOT work
    from local rdr-to remote works
    from remote rdr-to remote works
  pass in nat-to:
    from remote nat-to local addr does NOT work
    from remote nat-to non-local addr works
  non-local is an IP that is routed to the FW but is not assigned on the FW. The non working cases need some magic to correctly rewrite the incoming packet since the rewriting would happen outbound which is too late.
  "time to get it in" deraadt@
* Initial support for routing domains. This allows to bind interfaces to | claudio | 2009-06-05 | 1 | -25/+52
  alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
  input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
* When don't-fragment packets need to get fragmented some code tries to | claudio | 2009-01-30 | 1 | -2/+3
  update the route specific MTU from the interface (because it could have changed in between). This only makes sense if we actually have a valid route but e.g. multicast traffic does no route lookup and so there is no route at all and we don't need to update anything.
  Hit by dlg@'s pfsync rewrite which already found 3 other bugs in the network stack and slowly makes us wonder how it worked in the first place.
  OK mcbride@ dlg@
* Always zero the IP checksum field for packets and packet fragments | naddy | 2009-01-29 | 1 | -10/+7
  being passed down if using HW checksum offload. From Brad, inspired by NetBSD/FreeBSD. ok markus@
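  The rule in this commit can be sketched in user space. This is not the kernel code: `in_cksum_buf` and `ip_prepare_cksum` are hypothetical names, the header is treated as a plain byte array, and the `hw_offload` flag stands in for whatever interface capability test the stack performs.

```c
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum over len bytes, read as big-endian words. */
static uint16_t
in_cksum_buf(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (len & 1)			/* odd trailing byte, padded with zero */
		sum += (uint32_t)p[len - 1] << 8;
	while (sum >> 16)		/* fold the carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Hypothetical output step: always zero the IPv4 header checksum field
 * (bytes 10 and 11), then fill it in software only when the hardware
 * will not. Returns 1 if the checksum is left for the hardware.
 */
static int
ip_prepare_cksum(uint8_t *hdr, size_t hlen, int hw_offload)
{
	uint16_t c;

	hdr[10] = 0;
	hdr[11] = 0;
	if (hw_offload)
		return 1;
	c = in_cksum_buf(hdr, hlen);
	hdr[10] = (uint8_t)(c >> 8);
	hdr[11] = (uint8_t)(c & 0xff);
	return 0;
}
```

  Zeroing the field in both branches means a stale value is never handed to hardware that expects to start from zero.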
* IP_RECVDSTPORT, allows you to get the destination port of UDP datagrams | markus | 2008-05-09 | 1 | -1/+9
  for pf(4) diverted packets; based on patch by Scot Loach; ok beck@
* MALLOC/FREE -> malloc/free | chl | 2007-10-29 | 1 | -7/+6
  ok krw@
* allow 4095 instead of 20 multicast group memberships per socket (you need | markus | 2007-09-18 | 1 | -7/+39
  one entry for each multicast group and interface combination). this allows you to run OSPF with more than 10 interfaces. adapted from freebsd; ok claudio, henning, mpf
* Remove inm_ifp from struct in_multi -- caching struct ifnet is dangerous | claudio | 2007-07-20 | 1 | -3/+3
  because interfaces may disappear without notice causing use after free bugs. Instead use the inm_ia->ia_ifp as a hint, struct in_ifaddr correctly tracks removals of interfaces and invalidates ia_ifp in such cases.
  looks good henning@ markus@
* no need to declare extern ipsec_in_use, we get it via ip_ipsp.h | henning | 2007-05-30 | 1 | -2/+1
  found by itojun
* gain another 5+% in ip forwarding performance. | henning | 2007-05-29 | 1 | -4/+9
  boring details: skip looking for ipsec tags and descending into ip_spd_lookup if there are no ipsec flows, except in one case in ip_output (spotted by markus) where we have to if we have a pcb. ip_spd_lookup has the shortcut already, but there is enough work done before so that skipping that gains us about 5%.
  ok theo, markus
* -static | dlg | 2007-05-27 | 1 | -5/+5
  ok reyk@
* do not install pmtu routes for transport mode SAs, as they do not | markus | 2006-12-05 | 1 | -2/+11
  the dest IP; PMTU debugging support; ok hshoexer
* rangecheck ttl on IP_TTL, collected dust in my tree | henning | 2006-12-01 | 1 | -2/+5
* implement IP_MINTTL socket option for tcp sockets | henning | 2006-10-11 | 1 | -1/+13
  This is for RFC3682 aka the TTL security hack - sender sets TTL to 255, receiver checks no router on the way (or, no more than expected) reduced the TTL. carp uses that technique already.
  modeled after FreeBSD implementation.
  ok claudio djm deraadt
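  The receiver-side test behind the TTL security hack can be sketched as a single comparison. The function name `minttl_accept` is hypothetical, and treating a minimum of 0 as "option not set" is an assumption made for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical receiver-side check: accept a packet only if its TTL has
 * not dropped below the configured minimum. A sender that starts at 255
 * and a min_ttl of 255 therefore accepts only directly connected peers.
 */
static bool
minttl_accept(uint8_t pkt_ttl, uint8_t min_ttl)
{
	/* min_ttl == 0 is taken to mean the option is unset (assumption). */
	return min_ttl == 0 || pkt_ttl >= min_ttl;
}
```

  With `min_ttl` set to 254, one intermediate router is tolerated, which matches the "no more than expected" wording above.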
* implement IP_RECVTTL socket option. | henning | 2006-10-11 | 1 | -1/+10
  when set on raw or udp sockets, userland receives the incoming packet's TTL as ancillary data (cmsg shitz).
  modeled after the FreeBSD implementation.
  ok claudio djm deraadt
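  Reading the TTL back out of the ancillary data might look like the sketch below. It assumes the BSD-style layout (level IPPROTO_IP, type IP_RECVTTL, one unsigned char of data); other systems may deliver the value under a different type or width. The helper name `ttl_from_cmsg` is hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Walk the control messages of a msghdr filled in by recvmsg(2) and
 * return the TTL, or -1 if no matching control message is present.
 * Assumption: the TTL arrives as a single unsigned char.
 */
static int
ttl_from_cmsg(struct msghdr *msg)
{
	struct cmsghdr *cm;

	for (cm = CMSG_FIRSTHDR(msg); cm != NULL; cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_level == IPPROTO_IP &&
		    cm->cmsg_type == IP_RECVTTL &&
		    cm->cmsg_len >= CMSG_LEN(sizeof(unsigned char)))
			return *(unsigned char *)CMSG_DATA(cm);
	}
	return -1;
}
```

  The socket would first enable the option with setsockopt(2) at the IPPROTO_IP level and pass a control buffer of at least CMSG_SPACE(sizeof(unsigned char)) bytes to recvmsg(2).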