<feed xmlns='http://www.w3.org/2005/Atom'>
<title>wireguard-linux/net/netfilter, branch stable</title>
<subtitle>WireGuard for the Linux kernel</subtitle>
<id>https://git.zx2c4.com/wireguard-linux/atom/net/netfilter?h=stable</id>
<link rel='self' href='https://git.zx2c4.com/wireguard-linux/atom/net/netfilter?h=stable'/>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/wireguard-linux/'/>
<updated>2025-10-29T13:47:59Z</updated>
<entry>
<title>netfilter: nft_ct: add seqadj extension for natted connections</title>
<updated>2025-10-29T13:47:59Z</updated>
<author>
<name>Andrii Melnychenko</name>
<email>a.melnychenko@vyos.io</email>
</author>
<published>2025-10-24T16:22:16Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/wireguard-linux/commit/?id=90918e3b6404c2a37837b8f11692471b4c512de2'/>
<id>urn:sha1:90918e3b6404c2a37837b8f11692471b4c512de2</id>
<content type='text'>
Sequence adjustment may be required for FTP traffic with PASV/EPSV modes.
due to need to re-write packet payload (IP, port) on the ftp control
connection. This can require changes to the TCP length and expected
seq / ack_seq.

The easiest way to reproduce this issue is with PASV mode.
Example ruleset:
table inet ftp_nat {
        ct helper ftp_helper {
                type "ftp" protocol tcp
                l3proto inet
        }

        chain prerouting {
                type filter hook prerouting priority 0; policy accept;
                tcp dport 21 ct state new ct helper set "ftp_helper"
        }
}
table ip nat {
        chain prerouting {
                type nat hook prerouting priority -100; policy accept;
                tcp dport 21 dnat ip prefix to ip daddr map {
			192.168.100.1 : 192.168.13.2/32 }
        }

        chain postrouting {
                type nat hook postrouting priority 100 ; policy accept;
                tcp sport 21 snat ip prefix to ip saddr map {
			192.168.13.2 : 192.168.100.1/32 }
        }
}

Note that the ftp helper gets assigned *after* the dnat setup.

The inverse (nat after helper assign) is handled by an existing
check in nf_nat_setup_info() and will not show the problem.

Topoloy:

 +-------------------+     +----------------------------------+
 | FTP: 192.168.13.2 | &lt;-&gt; | NAT: 192.168.13.3, 192.168.100.1 |
 +-------------------+     +----------------------------------+
                                      |
                         +-----------------------+
                         | Client: 192.168.100.2 |
                         +-----------------------+

ftp nat changes do not work as expected in this case:
Connected to 192.168.100.1.
[..]
ftp&gt; epsv
EPSV/EPRT on IPv4 off.
ftp&gt; ls
227 Entering passive mode (192,168,100,1,209,129).
421 Service not available, remote server has closed connection.

Kernel logs:
Missing nfct_seqadj_ext_add() setup call
WARNING: CPU: 1 PID: 0 at net/netfilter/nf_conntrack_seqadj.c:41
[..]
 __nf_nat_mangle_tcp_packet+0x100/0x160 [nf_nat]
 nf_nat_ftp+0x142/0x280 [nf_nat_ftp]
 help+0x4d1/0x880 [nf_conntrack_ftp]
 nf_confirm+0x122/0x2e0 [nf_conntrack]
 nf_hook_slow+0x3c/0xb0
 ..

Fix this by adding the required extension when a conntrack helper is assigned
to a connection that has a nat binding.

Fixes: 1a64edf54f55 ("netfilter: nft_ct: add helper set support")
Signed-off-by: Andrii Melnychenko &lt;a.melnychenko@vyos.io&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>netfilter: nft_connlimit: fix possible data race on connection count</title>
<updated>2025-10-29T13:47:59Z</updated>
<author>
<name>Fernando Fernandez Mancera</name>
<email>fmancera@suse.de</email>
</author>
<published>2025-10-24T15:54:39Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/wireguard-linux/commit/?id=8d96dfdcabef00e28f0c851b1502adb679dfc6d9'/>
<id>urn:sha1:8d96dfdcabef00e28f0c851b1502adb679dfc6d9</id>
<content type='text'>
nft_connlimit_eval() reads priv-&gt;list-&gt;count to check if the connection
limit has been exceeded. This value is being read without a lock and can
be modified by a different process. Use READ_ONCE() for correctness.

Fixes: df4a90250976 ("netfilter: nf_conncount: merge lookup and add functions")
Signed-off-by: Fernando Fernandez Mancera &lt;fmancera@suse.de&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>netfilter: nft_ct: enable labels for get case too</title>
<updated>2025-10-29T13:47:59Z</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2025-10-22T15:18:10Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/wireguard-linux/commit/?id=514f1dc8f2ca3101e04cdf452e53baca3a76e544'/>
<id>urn:sha1:514f1dc8f2ca3101e04cdf452e53baca3a76e544</id>
<content type='text'>
conntrack labels can only be set when the conntrack has been created
with the "ctlabel" extension.

For older iptables (connlabel match), adding an "-m connlabel" rule
turns on the ctlabel extension allocation for all future conntrack
entries.

For nftables, its only enabled for 'ct label set foo', but not for
'ct label foo' (i.e. check).
But users could have a ruleset that only checks for presence, and rely
on userspace to set a label bit via ctnetlink infrastructure.

This doesn't work without adding a dummy 'ct label set' rule.
We could also enable extension infra for the first (failing) ctnetlink
request, but unlike ruleset we would not be able to disable the
extension again.

Therefore turn on ctlabel extension allocation if an nftables ruleset
checks for a connlabel too.

Fixes: 1ad8f48df6f6 ("netfilter: nftables: add connlabel set support")
Reported-by: Antonio Ojea &lt;aojea@google.com&gt;
Closes: https://lore.kernel.org/netfilter-devel/aPi_VdZpVjWujZ29@strlen.de/
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>netfilter: nft_objref: validate objref and objrefmap expressions</title>
<updated>2025-10-08T11:17:25Z</updated>
<author>
<name>Fernando Fernandez Mancera</name>
<email>fmancera@suse.de</email>
</author>
<published>2025-10-08T10:08:16Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/wireguard-linux/commit/?id=f359b809d54c6e3dd1d039b97e0b68390b0e53e4'/>
<id>urn:sha1:f359b809d54c6e3dd1d039b97e0b68390b0e53e4</id>
<content type='text'>
Referencing a synproxy stateful object from OUTPUT hook causes kernel
crash due to infinite recursive calls:

BUG: TASK stack guard page was hit at 000000008bda5b8c (stack is 000000003ab1c4a5..00000000494d8b12)
[...]
Call Trace:
 __find_rr_leaf+0x99/0x230
 fib6_table_lookup+0x13b/0x2d0
 ip6_pol_route+0xa4/0x400
 fib6_rule_lookup+0x156/0x240
 ip6_route_output_flags+0xc6/0x150
 __nf_ip6_route+0x23/0x50
 synproxy_send_tcp_ipv6+0x106/0x200
 synproxy_send_client_synack_ipv6+0x1aa/0x1f0
 nft_synproxy_do_eval+0x263/0x310
 nft_do_chain+0x5a8/0x5f0 [nf_tables
 nft_do_chain_inet+0x98/0x110
 nf_hook_slow+0x43/0xc0
 __ip6_local_out+0xf0/0x170
 ip6_local_out+0x17/0x70
 synproxy_send_tcp_ipv6+0x1a2/0x200
 synproxy_send_client_synack_ipv6+0x1aa/0x1f0
[...]

Implement objref and objrefmap expression validate functions.

Currently, only NFT_OBJECT_SYNPROXY object type requires validation.
This will also handle a jump to a chain using a synproxy object from the
OUTPUT hook.

Now when trying to reference a synproxy object in the OUTPUT hook, nft
will produce the following error:

synproxy_crash.nft: Error: Could not process rule: Operation not supported
  synproxy name mysynproxy
  ^^^^^^^^^^^^^^^^^^^^^^^^

Fixes: ee394f96ad75 ("netfilter: nft_synproxy: add synproxy stateful object support")
Reported-by: Georg Pfuetzenreuter &lt;georg.pfuetzenreuter@suse.com&gt;
Closes: https://bugzilla.suse.com/1250237
Signed-off-by: Fernando Fernandez Mancera &lt;fmancera@suse.de&gt;
Reviewed-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_conntrack: do not skip entries in /proc/net/nf_conntrack</title>
<updated>2025-09-24T09:50:28Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2025-09-24T07:27:09Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/wireguard-linux/commit/?id=c5ba345b2d358b07cc4f07253ba1ada73e77d586'/>
<id>urn:sha1:c5ba345b2d358b07cc4f07253ba1ada73e77d586</id>
<content type='text'>
ct_seq_show() has an opportunistic garbage collector :

if (nf_ct_should_gc(ct)) {
    nf_ct_kill(ct);
    goto release;
}

So if one nf_conn is killed there, next time ct_get_next() runs,
we skip the following item in the bucket, even if it should have
been displayed if gc did not take place.

We can decrement st-&gt;skip_elems to tell ct_get_next() one of the items
was removed from the chain.

Fixes: 58e207e4983d ("netfilter: evict stale entries when user reads /proc/net/nf_conntrack")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>netfilter: nft_set_pipapo_avx2: fix skip of expired entries</title>
<updated>2025-09-24T09:50:28Z</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2025-09-09T12:11:48Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/wireguard-linux/commit/?id=5823699a11cf3c05d751b473c66920d2d3cac0a5'/>
<id>urn:sha1:5823699a11cf3c05d751b473c66920d2d3cac0a5</id>
<content type='text'>
KASAN reports following splat:
BUG: KASAN: slab-out-of-bounds in pipapo_get_avx2+0x941/0x25d0
Read of size 1 at addr ffff88814c561be0 by task nft/3944
Call Trace:
 pipapo_get_avx2+0x941/0x25d0
 nft_pipapo_insert+0x440/0x11b0
 nf_tables_newsetelem+0x220a/0x3a00
 ..

This bisects to commit 84c1da7b38d9 ("netfilter: nft_set_pipapo: use AVX2
algorithm for insertions too").

However, that change merely uncovers this bug.

When we find a match but that match has expired or timed out, the AVX2
implementation restarts the full match loop.

At that point, the pointer to the key data has already been changed and
points to the keys last field.
This will then result in out-of-bounds read once its incremented again
for the next field.

The restart logic in AVX2 is different compared to the plain C
implementation, but both should follow the same logic.

The C implementation just calls pipapo_refill() again do check the next
entry.  Do the same in the AVX2 implementation.

Note that with this change, due to implementation differences of
pipapo_refill vs. nft_pipapo_avx2_refill, the refill call will return
the same element again. Then, on the next call, it will move to the next
entry as expected.  This is because avx2_refill doesn't clear the bitmap
in the 'last' conditional.  This is harmless. Expired/timed out elements
are also not expected to be frequent.

selftest is added in a followup commit.

Fixes: 7400b063969b ("nft_set_pipapo: Introduce AVX2-based lookup implementation")
Reviewed-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>netfilter: nft_set_pipapo: use 0 genmask for packetpath lookups</title>
<updated>2025-09-24T09:50:28Z</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2025-09-16T16:34:01Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/wireguard-linux/commit/?id=4dbac7db17f1e973fbbd6aea07a00181cd4cd162'/>
<id>urn:sha1:4dbac7db17f1e973fbbd6aea07a00181cd4cd162</id>
<content type='text'>
In commit c4eaca2e1052 ("netfilter: nft_set_pipapo: don't check genbit from
packetpath lookups") I replaced genmask_cur() with NFT_GENMASK_ANY, but
this change has no effect in the pipapo set type.

New entries are unreachable from the active copy, so NFT_GENMASK_ANY has
same result as genmask_cur():

current-gen elements are disabled and the new-generation
elements cannot be found.

Tests did not catch this incomplete fix because the change also dropped
the genmask test from the AVX2 version of the algorithm, so test only
fails if host cpu lacks AVX2 support.

Use genmask test only from the control plane (inserts, deletions, ..).

Packet path has to skip the check, use of 0 is enough for this because
ext-&gt;genmask has a the relevant bit set when the element is INACTIVE
in that generation: using a 0 genmask thus makes nft_set_elem_active()
always return true.

Fix the comment and replace NFT_GENMASK_ANY with 0.

Fixes: c4eaca2e1052 ("netfilter: nft_set_pipapo: don't check genbit from packetpath lookups")
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>netfilter: nfnetlink: reset nlh pointer during batch replay</title>
<updated>2025-09-24T09:50:28Z</updated>
<author>
<name>Fernando Fernandez Mancera</name>
<email>fmancera@suse.de</email>
</author>
<published>2025-09-19T12:40:43Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/wireguard-linux/commit/?id=09efbac953f6f076a07735f9ba885148d4796235'/>
<id>urn:sha1:09efbac953f6f076a07735f9ba885148d4796235</id>
<content type='text'>
During a batch replay, the nlh pointer is not reset until the parsing of
the commands. Since commit bf2ac490d28c ("netfilter: nfnetlink: Handle
ACK flags for batch messages") that is problematic as the condition to
add an ACK for batch begin will evaluate to true even if NLM_F_ACK
wasn't used for batch begin message.

If there is an error during the command processing, netlink is sending
an ACK despite that. This misleads userspace tools which think that the
return code was 0. Reset the nlh pointer to the original one when a
replay is triggered.

Fixes: bf2ac490d28c ("netfilter: nfnetlink: Handle ACK flags for batch messages")
Signed-off-by: Fernando Fernandez Mancera &lt;fmancera@suse.de&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>ipvs: Defer ip_vs_ftp unregister during netns cleanup</title>
<updated>2025-09-24T09:50:28Z</updated>
<author>
<name>Slavin Liu</name>
<email>slavin452@gmail.com</email>
</author>
<published>2025-09-11T17:57:59Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/wireguard-linux/commit/?id=134121bfd99a06d44ef5ba15a9beb075297c0821'/>
<id>urn:sha1:134121bfd99a06d44ef5ba15a9beb075297c0821</id>
<content type='text'>
On the netns cleanup path, __ip_vs_ftp_exit() may unregister ip_vs_ftp
before connections with valid cp-&gt;app pointers are flushed, leading to a
use-after-free.

Fix this by introducing a global `exiting_module` flag, set to true in
ip_vs_ftp_exit() before unregistering the pernet subsystem. In
__ip_vs_ftp_exit(), skip ip_vs_ftp unregister if called during netns
cleanup (when exiting_module is false) and defer it to
__ip_vs_cleanup_batch(), which unregisters all apps after all connections
are flushed. If called during module exit, unregister ip_vs_ftp
immediately.

Fixes: 61b1ab4583e2 ("IPVS: netns, add basic init per netns.")
Suggested-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Slavin Liu &lt;slavin452@gmail.com&gt;
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>net: replace use of system_wq with system_percpu_wq</title>
<updated>2025-09-23T00:40:30Z</updated>
<author>
<name>Marco Crivellari</name>
<email>marco.crivellari@suse.com</email>
</author>
<published>2025-09-18T14:24:26Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/wireguard-linux/commit/?id=5fd8bb982e10f29e856ef71072609af5ce55d281'/>
<id>urn:sha1:5fd8bb982e10f29e856ef71072609af5ce55d281</id>
<content type='text'>
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.

This lack of consistentcy cannot be addressed without refactoring the API.

system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.

Adding system_dfl_wq to encourage its use when unbound work should be used.

The old system_unbound_wq will be kept for a few release cycles.

Suggested-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Marco Crivellari &lt;marco.crivellari@suse.com&gt;
Link: https://patch.msgid.link/20250918142427.309519-3-marco.crivellari@suse.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
</feed>
