path: root/sys/net/pf_ioctl.c
Commit message [Author, Age, Files, Lines]
* [patrick, 2021-02-09, 1 file, -3/+1] pfsync_state_import() must not be called with the pf state lock held,
  since the actual modification of the state table is done by a call to pf_state_insert(), which takes the pf state lock itself. Other calls to pfsync_state_import() also only have the pf lock.
  Reported-by: syzbot+d6ea8620b43dc69ecbc6@syzkaller.appspotmail.com
  ok bluhm@
* [patrick, 2021-02-09, 1 file, -3/+1] Activate use of PF_LOCK() by removing the WITH_PF_LOCK ifdefs.
  Silence from the network group
  ok sashan@
* [kn, 2020-12-16, 1 file, -1/+23] Reject rules with invalid port ranges
  Ranges where the left boundary is bigger than the right one are always bogus as they work like `port any' (`port 34<>12' means "all ports") or in a way that inverts the rule's action (`pass ... port 34:12' means "pass no port at all").
  Add checks for all ranges and invalidate those that yield no or all ports.
  For this to work on redirections, make pfctl(8) pass the range's type, otherwise boundary-including ranges are not detected as such; that is to say, `struct pf_pool's `port_op' member was unused in the kernel so far.
  `rdr-to' rules with invalid ranges could panic the kernel when hit.
  Reported-by: syzbot+9c309db201f06e39a8ba@syzkaller.appspotmail.com
  OK sashan
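  The range checks described above can be sketched roughly as follows. This is a minimal illustration, not the kernel code: the enum, its constant names and port_range_is_sane() are hypothetical, and boundaries are assumed to be in host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical operator names; the kernel uses its own constants. */
enum range_op {
    RANGE_INCLUSIVE,    /* lo:hi   matches lo..hi including both ends */
    RANGE_EXCLUSIVE,    /* lo><hi  matches ports strictly between     */
    RANGE_EXCEPT        /* lo<>hi  matches ports outside lo..hi       */
};

/*
 * Return false for ranges whose left boundary exceeds the right one
 * (and for an exclusive range with equal boundaries), true otherwise.
 */
bool
port_range_is_sane(enum range_op op, uint16_t lo, uint16_t hi)
{
    switch (op) {
    case RANGE_INCLUSIVE:
        return lo <= hi;    /* 34:12 would match no port */
    case RANGE_EXCLUSIVE:
        return lo < hi;     /* 34><12 would match no port */
    case RANGE_EXCEPT:
        return lo <= hi;    /* 34<>12 would match every port */
    }
    return false;           /* unknown operator: reject */
}
```

  With such a check, a rule carrying `port 34:12' or `port 34<>12' would be refused when the ruleset is loaded instead of being evaluated later.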
* [sashan, 2020-10-22, 1 file, -1/+2] - missing NET_UNLOCK() in pf_ioctl.c error path
  Reported-by: syzbot+b9af9c29ed1a6dabda25@syzkaller.appspotmail.com
  OK anton@
* [sashan, 2020-10-21, 1 file, -64/+239] - move NET_LOCK() further down in pf_ioctl.c. Also move memory allocations
  outside of NET_LOCK()/PF_LOCK() scope in easy spots.
  OK kn@
* [claudio, 2020-10-02, 1 file, -2/+3] relax check for valid onrdomain range. onrdomain is -1 if the value is
  unused by the rule. So skip the rest of the check in that case.
  Fixes ruleset loading for semarie@
  OK semarie@
* [kn, 2020-10-01, 1 file, -5/+3] rdomain IDs do not need to exist for "on rdomain N" to work
  Unlike "... rtable N", pf.conf(5)'s "on rdomain N" does not alter packet state and will always work no matter if rdomain N currently exists or not, i.e. the rule "pass on rdomain 42" will simply match (and pass) packets if rdomain 42 exists, and it will simply not match (neither pass nor block) packets if 42 does not exist.
  There's no need to reload the ruleset whenever routing domains are created or deleted, which can already be observed now by creating an rdomain, loading rules referencing it and deleting the same rdomain immediately afterwards: pf will continue to work as expected.
  Relax both pfctl(8)'s parser check as well as pf(4)'s copyin routine to accept any valid routing domain ID without expecting it to exist at the time of ruleset creation - this lifts the requirement to create rdomains before referencing them in pf.conf while keeping pf behaviour unchanged.
  Prompted by yasuoka's recent pfctl parse.y r1.702 commit requiring an rtable to exist upon ruleset creation.
  Discussed with claudio and bluhm at k2k20.
  Feedback sashan
  OK sashan yasuoka claudio
* [kn, 2020-08-24, 1 file, -12/+9] Rehash main ruleset after rule expiration
  When "once" rules expire, they are removed from the active ruleset, hence the main ruleset needs to be rehashed iff itself contains once rules.
  After the previous commit, pf_setup_pfsync_matching() became much simpler but its name even less accurate; simplify it further and rename it to pf_calc_chksum() while here.
  Admins using "once" rules in combination with pfsync(4) are hopefully aware of this caveat (self-changing rulesets) already, but now the checksum in "pfctl -v -s info" actually indicates out-of-sync rulesets and is no longer misleading.
  OK sashan
* [kn, 2020-08-24, 1 file, -19/+2] Remove ptr_array from struct pf_ruleset
  Each ruleset's rules are stored in a TAILQ called "ptr" with "rcount" representing the number of rules in the ruleset; "ptr_array" points to an array of the same length.
  "ptr" is backed by pool_get(9) and may change in size as "expired" rules get removed from the ruleset - see "once" in pf.conf(5).
  "ptr_array" is allocated momentarily through mallocarray(9) and gets filled with the TAILQ entries, so that the sole user pfsync(4) can access the list of rules by index to pick the n-th rule during state insertion.
  Remove "ptr_array" and make pfsync iterate over the TAILQ instead to get the matching rule's index. This simplifies both code and data structures and avoids duplicate memory management.
  OK sashan
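  Reaching the n-th rule by walking the TAILQ, rather than indexing a shadow array, might look like the sketch below. The struct, queue and function names are hypothetical and only the queue(3) macros are real; the actual pfsync code is not reproduced here.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a rule linked into a ruleset's TAILQ. */
struct rule_sketch {
    int nr;
    TAILQ_ENTRY(rule_sketch) entries;
};
TAILQ_HEAD(rule_queue, rule_sketch);

/* Walk the queue and return the n-th rule (0-based), or NULL if absent. */
struct rule_sketch *
rule_by_index(struct rule_queue *q, size_t n)
{
    struct rule_sketch *r;
    size_t i = 0;

    TAILQ_FOREACH(r, q, entries) {
        if (i++ == n)
            return r;
    }
    return NULL;
}
```

  The lookup becomes linear in the number of rules, in exchange for having a single data structure to keep consistent when rules expire.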
* [henning, 2020-07-21, 1 file, -2/+4] when calculating the ruleset's checksum, skip automatic table names.
  the checksum is exclusively used for pfsync to verify rulesets are identical on all nodes. the automatic table names are random and have a near zero chance to match.
  found at a customer in zurich
  ok sashan kn
* [cheloha, 2020-06-24, 1 file, -6/+6] kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)
  time_second(9) and time_uptime(9) are widely used in the kernel to quickly get the system UTC or system uptime as a time_t. However, time_t is 64-bit everywhere, so it is not generally safe to use them on 32-bit platforms: you have a split-read problem if your hardware cannot perform atomic 64-bit reads.
  This patch replaces time_second(9) with gettime(9), a safer successor interface, throughout the kernel. Similarly, time_uptime(9) is replaced with getuptime(9).
  There is a performance cost on 32-bit platforms in exchange for eliminating the split-read problem: instead of two register reads you now have a lockless read loop to pull the values from the timehands. This is really not *too* bad in the grand scheme of things, but compared to what we were doing before it is several times slower.
  There is no performance cost on 64-bit (__LP64__) platforms.
  With input from visa@, dlg@, and tedu@. Several bugs squashed by visa@.
  ok kettenis@
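  A lockless read loop of the kind mentioned above can be sketched with a generation counter. This is a generic illustration using C11 atomics, not the kernel's timehands code: struct clock_sketch, its even/odd generation convention and both functions are assumptions, and a single writer is assumed.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-bit value stored as two 32-bit halves. */
struct clock_sketch {
    atomic_uint gen;            /* odd while the writer is mid-update */
    _Atomic uint32_t lo, hi;    /* low and high half of the value */
};

/* Single writer: mark the update in progress, store halves, publish. */
void
clock_sketch_store(struct clock_sketch *c, int64_t v)
{
    unsigned int g = atomic_load(&c->gen);

    atomic_store(&c->gen, g + 1);
    atomic_store(&c->lo, (uint32_t)(uint64_t)v);
    atomic_store(&c->hi, (uint32_t)((uint64_t)v >> 32));
    atomic_store(&c->gen, g + 2);
}

/* Reader: retry until both halves were read under one stable generation. */
int64_t
clock_sketch_load(struct clock_sketch *c)
{
    unsigned int before, after;
    uint32_t lo, hi;

    do {
        before = atomic_load(&c->gen);
        lo = atomic_load(&c->lo);
        hi = atomic_load(&c->hi);
        after = atomic_load(&c->gen);
    } while (before != after || (before & 1));
    return (int64_t)(((uint64_t)hi << 32) | lo);
}
```

  The loop is where the extra cost on 32-bit platforms comes from: a reader pays for two generation checks on top of the two half reads, and may have to retry.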
* [mpi, 2020-05-27, 1 file, -3/+3] Document the various flavors of NET_LOCK() and rename the reader version.
  Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths take the reader lock. This is mostly for documentation purposes until the softnet thread is converted back to use a read lock.
  dlg@ said that comments should be good enough.
  ok sashan@
* [sashan, 2020-04-19, 1 file, -1/+4] fix insufficient input sanitization in pf_rulecopyin() and pf_pool_copyin()
  Reported-by: syzbot+d0639632a0affe0a690e@syzkaller.appspotmail.com
  Reported-by: syzbot+ae5e359d7f82688edd6a@syzkaller.appspotmail.com
  OK anton@
* [mpi, 2020-04-12, 1 file, -3/+3] Stop processing packets under non-exclusive (read) netlock.
  Prevent concurrency in the socket layer which is not ready for that.
  Two recent data corruptions in pfsync(4) and the socket layer pointed out that, at least, tun(4) was incorrectly using NET_RUNLOCK(). Until we find a way in software to avoid future mistakes and to make sure that only the softnet thread and some ioctls are safe to use a read version of the lock, put everything back to the exclusive version.
  ok stsp@, visa@
* [mpi, 2020-02-18, 1 file, -2/+2] Cleanup <sys/kthread.h> and <sys/proc.h> includes.
  Do not include <sys/kthread.h> where it is not needed and stop including <sys/proc.h> in it.
  ok visa@, anton@
* [bluhm, 2020-01-08, 1 file, -1/+13] Check address family of pf ioctl(2) DIOCNATLOOK parameter at kernel
  entry instead of calling panic() due to unhandled af.
  Reported-by: syzbot+92be143c2dd1746cf2af@syzkaller.appspotmail.com
  from Benjamin Baier
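  Validating a user-supplied address family at the entry point, before any code that assumes a known family runs, could look like this sketch. validate_af() is a hypothetical helper and the choice of EAFNOSUPPORT as the error is an assumption; the commit does not say which errno the kernel returns.

```c
#include <errno.h>
#include <sys/socket.h>

/*
 * Accept only the address families the later code can handle.
 * Returns 0 on success or an errno value for the ioctl to hand back.
 */
int
validate_af(int af)
{
    switch (af) {
    case AF_INET:
    case AF_INET6:
        return 0;
    default:
        return EAFNOSUPPORT;    /* assumed error code */
    }
}
```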
* [kn, 2019-11-26, 1 file, -7/+7] Use proper NUL byte not zero with strings
  No object change
  OK sashan
* [sashan, 2019-11-26, 1 file, -4/+2] fix kernel crash in pf_ioctl with WITH_PF_LOCK and NET_TASKQ > 1
  the problem was introduced with a "mechanical" patch, which replaced all "break;" with "PF_UNLOCK(); break;". This is wrong for the case of DIOCGETRULESETS.
  issue analyzed and patch created by Joerg Goltermann <jg@osn.de>
  OK tb@
* [otto, 2019-11-17, 1 file, -1/+2] "set delay" never worked as committed: the delay field was not copied
  in and the pf_pktdelay struct was not declared and initialized properly.
  ok rob@ kn@
* [claudio, 2019-05-09, 1 file, -1/+17] Add a sysctl accessor to struct pf_status. The pf_status only holds the
  current status and statistics and can be exported without super-user rights via sysctl to make it easier for tools like systat to access those.
  OK deraadt@, sashan@
* [bluhm, 2019-02-18, 1 file, -4/+4] Change ps_len of struct pfioc_states and psn_len of struct
  pfioc_src_nodes to size_t. This avoids integer truncation by casts to unsigned. As the types of DIOCGETSTATES and DIOCGETSRCNODES ioctl(2) arguments change, pfctl(8) and systat(1) should be updated together with the kernel. Calculate number of pf(4) states as size_t in userland.
  OK sashan@ deraadt@
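  The truncation this change avoids is easy to reproduce in isolation. The sketch below uses fixed-width types so the result does not depend on the platform; both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* What a cast from a 64-bit length to a 32-bit unsigned does. */
uint32_t
truncate_len(uint64_t len)
{
    return (uint32_t)len;       /* silently drops the upper 32 bits */
}

/* True if the length is unchanged by the narrowing cast. */
bool
len_survives_cast(uint64_t len)
{
    return (uint64_t)truncate_len(len) == len;
}
```

  A length of 0x100000010 bytes, for example, becomes 0x10 after the cast, so a buffer-size check done on the truncated value would pass for the wrong reason.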
* [kn, 2018-12-27, 1 file, -3/+3] Check for main ruleset explicitly
  All rulesets reference their parent anchor, except for the special-cased main anchor containing the main ruleset, whose reference is always NULL since initialization and never changes.
  Replacing nullity tests with clearer equality checks makes the code less ambiguous and easier to understand.
  OK sashan
* [kn, 2018-12-17, 1 file, -2/+2] Rename pf_anchor_remove() to pf_remove_anchor()
  For semantic consistency with pf_{create,find,remove}_{anchor,ruleset}().
  Simplify logic by squashing the if/else block while here.
  No functional change.
  Feedback jca and mikeb, OK mikeb
* [claudio, 2018-12-17, 1 file, -2/+2] Use timeout_add_sec() instead of timeout_add() with a multiplication with hz
  OK kn@, florian@, visa@, cheloha@
* [kn, 2018-12-10, 1 file, -11/+13] Remove useless macros
  These are just unhelpful case conversions.
  OK sashan henning
* [kn, 2018-10-01, 1 file, -1/+2] Allow DIOCRGETADDRS when securelevel(7) > 1
  This fixes certain operations such as `pfctl -t foo -T show' when the system is in "Highly secure mode".
  `pfctl -t foo -T show -v' would already work due to a different ioctl (DIOCRGETASTATS) being used.
  Reported by Zbyszek Żółkiewski, thanks!
  OK sthen sashan
* [sashan, 2018-09-11, 1 file, -10/+43] - moving state look up outside of PF_LOCK()
  this change adds a pf_state_lock rw-lock, which protects consistency of state table in PF. The code delivered in this change is guarded by 'WITH_PF_LOCK', which is still undefined. People, who are willing to experiment and want to run it must do two things:
  - compile kernel with -DWITH_PF_LOCK
  - bump NET_TASKQ from 1 to ... sky is the limit, (just select some sensible value for number of tasks your system is able to handle)
  OK bluhm@
* [sf, 2018-07-22, 1 file, -4/+4] Fix arguments of pf_purge_expired_{src_nodes,rules}()
  Due to the missing "void", this
      extern void pf_purge_expired_src_nodes();
  is no prototype but a declaration. It is enough to suppress the 'implicit declaration' warning but it does not allow the compiler to check the arguments passed to the calls of the function.
  Fix the prototypes and don't pass the waslocked argument anymore. It has been removed a year ago.
  ok sashan henning
* [henning, 2018-07-10, 1 file, -1/+4] provide a generic packet delay functionality. packets to be delayed are marked
  by pf in the packet header. pf_delay_pkt reads the delay value from the packet header, schedules a timeout and re-queues the packet when the timeout fires.
  ok benno sashan
* [bluhm, 2018-04-24, 1 file, -2/+2] Use pf_rm_rule() instead of pool_put() to decrement references by
  the rule created in pf_rule_copyin(). Plugs a potential kif memory leak in pf(4) ioctl.
  OK sashan@
* [mikeb, 2018-04-13, 1 file, -1/+7] Remove compatibility with pfctl from 6.1 and plug a few leaks
  No objections from henning, OK visa
* [lteo, 2018-04-05, 1 file, -5/+29] Explicitly check PF_TRANS_RULESET in DIOCXBEGIN, DIOCXCOMMIT, and DIOCXROLLBACK.
  ok bluhm@ sashan@ visa@
* [henning, 2018-02-08, 1 file, -2/+13] add DIOCGETSYNFLWATS to get current synflood detection watermarks,
  ok claudio benno procter
* [henning, 2018-02-07, 1 file, -1/+5] look ma, henning forgot to PF_LOCK/_UNLOCK in the new ioctls, ok procter
* [henning, 2018-02-06, 1 file, -1/+16] syncookies for pf.
  when syncookies are on, pf will blindly answer each and every SYN with a syncookie-SYNACK. Upon reception of the ACK completing the 3WHS, pf will reconstruct the original SYN, shove it through pf_test, where state will be created if the ruleset permits it. Then massage the freshly created state (we won't see the SYNACK), set up the sequence number modulator, and call into the existing synproxy code to start the 3WHS with the backend host.
  Add a (somewhat basic for now) adaptive mode where syncookies get enabled if a certain percentage of the state table is filled up with half-open tcp connections.
  This makes pf firewalls resilient against large synflood attacks.
  syncookies are off by default until we have gained more experience, considered experimental for now.
  see http://bulabula.org/papers/2017/bsdcan/ for more details.
  joint work with sashan@, widely discussed and with lots of input by many
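  The adaptive mode described above amounts to a threshold on the share of half-open connections. The sketch below adds a second, lower threshold for switching back off; that hysteresis, the struct and all names are assumptions for illustration and are not taken from the commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical adaptive-mode state with two percentage watermarks. */
struct synflood_sketch {
    unsigned int hiwat_pct;     /* switch syncookies on at or above this */
    unsigned int lowat_pct;     /* switch them off at or below this */
    bool active;
};

/* Update and return whether syncookies should currently be in use. */
bool
synflood_update(struct synflood_sketch *s, unsigned int halfopen,
    unsigned int state_limit)
{
    unsigned int pct = 0;

    if (state_limit != 0)
        pct = (unsigned int)((uint64_t)halfopen * 100 / state_limit);

    if (!s->active && pct >= s->hiwat_pct)
        s->active = true;
    else if (s->active && pct <= s->lowat_pct)
        s->active = false;
    return s->active;
}
```

  Keeping the two thresholds apart avoids flapping when the half-open share hovers around a single cut-off value.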
* [henning, 2018-02-06, 1 file, -21/+21] some finger muscle workout:
  bzero -> memset and (very few) bcopy -> memcpy/memmove
* [bluhm, 2018-01-19, 1 file, -1/+2] In pfioctl() a pf unlock was missing in the error path.
  OK lteo@ sashan@
* [bluhm, 2017-11-28, 1 file, -3/+2] The divert structure was using the port number to indicate that
  divert-to or divert-reply was active. If the address was also set, it meant divert-to. Divert packet used a separate structure. This is confusing and makes it hard to add new features. It is better to have a divert type that explicitly says what is configured.
  Adapt the pf rule struct in kernel and pfctl, no functional change. Note that kernel and pfctl have to be updated together.
  OK sashan@
* [henning, 2017-11-13, 1 file, -1/+4] add a generic packet rate matching filter. allows things like
      pass in proto icmp max-pkt-rate 100/10
  all packets matching the rule in the direction the state was created are taken into consideration (typically: requests, but not replies). Just like with the other max-*, the rule stops matching if the maximum is reached, so in typical scenarios the default block rule would kick in then.
  with input from Holger Mikolon
  ok mikeb
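  The effect of a limit such as 100/10 (at most 100 packets per 10 seconds) can be approximated with a fixed-window counter. This is a deliberately simplified sketch: the struct and function are hypothetical, and pf's own rate accounting may well work differently.

```c
#include <stdbool.h>

/* Hypothetical fixed-window packet counter. */
struct pktrate_sketch {
    unsigned int limit;         /* packets allowed per window */
    unsigned int seconds;       /* window length in seconds */
    unsigned int count;         /* packets seen in the current window */
    long window_start;          /* time the current window began */
};

/* Return true while the rule would still match, false once over the limit. */
bool
pktrate_allow(struct pktrate_sketch *r, long now)
{
    if (now - r->window_start >= (long)r->seconds) {
        r->window_start = now;  /* start a fresh window */
        r->count = 0;
    }
    if (r->count >= r->limit)
        return false;
    r->count++;
    return true;
}
```

  When the function returns false the rule simply stops matching, which is why a later default block rule would then decide the packet's fate.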
* [sashan, 2017-10-31, 1 file, -2/+2] - add one more softnet taskq
  NOTE: code still runs with single softnet task. change definition of SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task
  OK mpi@, OK phessler@
* [sashan, 2017-10-30, 1 file, -3/+176] - fine tuning PF_LOCK in pfioctl()
  (extra thanks to Hrvoje for testing)
  OK mpi@
* [mpi, 2017-08-11, 1 file, -4/+3] Remove NET_LOCK()'s argument.
  Tested by Hrvoje Popovski, ok bluhm@
* [mpi, 2017-08-06, 1 file, -12/+3] Reduce contention on the NET_LOCK() by moving the logic of the pfpurge
  thread to a task running on the `softnettq`.
  Tested by and input from Hrvoje Popovski.
  ok visa@, sashan@
* [claudio, 2017-07-27, 1 file, -1/+29] For pf the anchor is a C string so ensure that the value passed in via ioctl
  is correctly NUL terminated.
  Reported by Ilja Van Sprundel
  With and OK bluhm@
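  Checking that a fixed-size buffer copied in from userland really holds a C string comes down to finding a NUL byte within its bounds. A minimal sketch, with buf_is_c_string() as a hypothetical helper name:

```c
#include <stdbool.h>
#include <string.h>

/* True if a NUL terminator occurs within the first 'size' bytes of 'buf'. */
bool
buf_is_c_string(const char *buf, size_t size)
{
    return size > 0 && memchr(buf, '\0', size) != NULL;
}
```

  Only after such a check is it safe to hand the buffer to functions like strlen() or strcmp(), which otherwise read past the end of an unterminated array.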
* [mikeb, 2017-07-19, 1 file, -6/+7] Rework HFSC vs FQ-CoDel checks
  The selection mechanism introduced in pf_ioctl.c -r1.316 suffers from being too ambiguous and lacks robustness. Instead of relying on composition of multiple flags in the queue specification, it's easier to identify the root class (if it exists) and derive all further checks from it.
* [bluhm, 2017-07-05, 1 file, -9/+13] Convert pf tagname malloc(9) into pool_get(9) to make it MP safe.
  While there use TAILQ_FOREACH macro for traversing tags.
  OK mpi@
* [mikeb, 2017-06-28, 1 file, -1/+9] Introduce a simple mechanism to select the appropriate queue manager
  Discussed with and OK henning@ at d2k17 as a part of a larger diff.
* [mikeb, 2017-06-28, 1 file, -3/+4] Tighten up FQ-CoDel vs HFSC checks
  PFQS_FLOWQUEUE is about to become a flag that HFSC classes may specify as well; thus a better way of telling FQ-CoDel and HFSC apart needs to be found. At the moment it's derived from the queue specification.
  Discussed with and OK henning@ at d2k17 as a part of a larger diff.
* [sashan, 2017-06-05, 1 file, -1/+7] - let's add PF_LOCK()
  to enable PF_LOCK(), you must add 'option WITH_PF_LOCK' to your kernel configuration. The code does not do much currently; it's just a very small step towards MP.
  O.K. henning@, mikeb@, mpi@
* [patrick, 2017-06-01, 1 file, -4/+4] Return time_uptime as value for when pf was enabled instead of
  time_second. Since time_second changes depending on the wall-clock time, time_second is not a reliable source for the status. We can even end up with a negative time delta. Thus, use the monotonically growing time_uptime and export it to userland.
  ok bluhm@ mikeb@
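  The same reasoning applies in userland: an elapsed time computed from a monotonic clock cannot go negative when the wall clock is stepped. The sketch below uses POSIX clock_gettime(2) with CLOCK_MONOTONIC; the two helper names are hypothetical and this is not how the kernel or pfctl computes the value.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Seconds on the monotonic clock, or (time_t)-1 on failure. */
time_t
uptime_seconds(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return (time_t)-1;
    return ts.tv_sec;
}

/* Elapsed seconds since 'started', which must come from uptime_seconds(). */
time_t
seconds_since(time_t started)
{
    return uptime_seconds() - started;
}
```

  Both samples must come from the same clock; mixing a wall-clock start time with a monotonic "now" would bring back exactly the kind of bogus delta the commit describes.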