<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-dev/include/net, branch master</title>
<subtitle>Linux kernel development work - see feature branches</subtitle>
<id>https://git.zx2c4.com/linux-dev/atom/include/net?h=master</id>
<link rel='self' href='https://git.zx2c4.com/linux-dev/atom/include/net?h=master'/>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/'/>
<updated>2022-11-02T04:29:06Z</updated>
<entry>
<title>netlink: introduce bigendian integer types</title>
<updated>2022-11-02T04:29:06Z</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2022-10-31T12:34:07Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=ecaf75ffd5f5db320d8b1da0198eef5a5ce64a3f'/>
<id>urn:sha1:ecaf75ffd5f5db320d8b1da0198eef5a5ce64a3f</id>
<content type='text'>
Jakub reported that the addition of the "network_byte_order"
member in struct nla_policy increases size of 32bit platforms.

Instead of scraping the bit from elsewhere Johannes suggested
to add explicit NLA_BE types instead, so do this here.

NLA_POLICY_MAX_BE() macro is removed again, there is no need
for it: NLA_POLICY_MAX(NLA_BE.., ..) will do the right thing.

NLA_BE64 can be added later.

Fixes: 08724ef69907 ("netlink: introduce NLA_POLICY_MAX_BE")
Reported-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Suggested-by: Johannes Berg &lt;johannes@sipsolutions.net&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Link: https://lore.kernel.org/r/20221031123407.9158-1-fw@strlen.de
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: remove SOCK_SUPPORT_ZC from sockmap</title>
<updated>2022-10-29T03:21:25Z</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2022-10-26T23:25:57Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=fee9ac06647e59a69fb7aec58f25267c134264b4'/>
<id>urn:sha1:fee9ac06647e59a69fb7aec58f25267c134264b4</id>
<content type='text'>
sockmap replaces -&gt;sk_prot with its own callbacks, we should remove
SOCK_SUPPORT_ZC as the new proto doesn't support msghdr::ubuf_info.

Cc: &lt;stable@vger.kernel.org&gt; # 6.0
Reported-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Fixes: e993ffe3da4bc ("net: flag sockets supporting msghdr originated zerocopy")
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>netlink: hide validation union fields from kdoc</title>
<updated>2022-10-29T03:19:15Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2022-10-27T21:21:07Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=7354c9024f2835f6122ed9612e21ab379df050f9'/>
<id>urn:sha1:7354c9024f2835f6122ed9612e21ab379df050f9</id>
<content type='text'>
Mark the validation fields as private, users shouldn't set
them directly and they are too complicated to explain in
a more succinct way (there's already a long explanation
in the comment above).

The strict_start_type field is set directly and has a dedicated
comment so move that above the "private" section.

Link: https://lore.kernel.org/r/20221027212107.2639255-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>genetlink: piggy back on resv_op to default to a reject policy</title>
<updated>2022-10-25T02:08:46Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2022-10-21T19:35:32Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=4fa86555d1cd338afc6e6308cc1ff890a014ec8c'/>
<id>urn:sha1:4fa86555d1cd338afc6e6308cc1ff890a014ec8c</id>
<content type='text'>
To keep backward compatibility we used to leave attribute parsing
to the family if no policy is specified. This becomes tedious as
we move to more strict validation. Families must define reject all
policies if they don't want any attributes accepted.

Piggy back on the resv_start_op field as the switchover point.
AFAICT only ethtool has added new commands since the resv_start_op
was defined, and it has per-op policies so this should be a no-op.

Nonetheless the patch should still go into v6.1 for consistency.

Link: https://lore.kernel.org/all/20221019125745.3f2e7659@kernel.org/
Link: https://lore.kernel.org/r/20221021193532.1511293-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net-memcg: avoid stalls when under memory pressure</title>
<updated>2022-10-24T17:35:09Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2022-10-21T16:03:04Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=720ca52bcef225b967a339e0fffb6d0c7e962240'/>
<id>urn:sha1:720ca52bcef225b967a339e0fffb6d0c7e962240</id>
<content type='text'>
As Shakeel explains the commit under Fixes had the unintended
side-effect of no longer pre-loading the cached memory allowance.
Even tho we previously dropped the first packet received when
over memory limit - the consecutive ones would get thru by using
the cache. The charging was happening in batches of 128kB, so
we'd let in 128kB (truesize) worth of packets per one drop.

After the change we no longer force charge, there will be no
cache filling side effects. This causes significant drops and
connection stalls for workloads which use a lot of page cache,
since we can't reclaim page cache under GFP_NOWAIT.

Some of the latency can be recovered by improving SACK reneg
handling but nowhere near enough to get back to the pre-5.15
performance (the application I'm experimenting with still
sees 5-10x worst latency).

Apply the suggested workaround of using GFP_ATOMIC. We will now
be more permissive than previously as we'll drop _no_ packets
in softirq when under pressure. But I can't think of any good
and simple way to address that within networking.

Link: https://lore.kernel.org/all/20221012163300.795e7b86@kernel.org/
Suggested-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Fixes: 4b1327be9fe5 ("net-memcg: pass in gfp_t mask to mem_cgroup_charge_skmem()")
Acked-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Acked-by: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Link: https://lore.kernel.org/r/20221021160304.1362511-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'net-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2022-10-21T00:24:59Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-10-21T00:24:59Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=6d36c728bc2e2d632f4b0dea00df5532e20dfdab'/>
<id>urn:sha1:6d36c728bc2e2d632f4b0dea00df5532e20dfdab</id>
<content type='text'>
Pull networking fixes from Paolo Abeni:
 "Including fixes from netfilter.

  Current release - regressions:

   - revert "net: fix cpu_max_bits_warn() usage in
     netif_attrmask_next{,_and}"

   - revert "net: sched: fq_codel: remove redundant resource cleanup in
     fq_codel_init()"

   - dsa: uninitialized variable in dsa_slave_netdevice_event()

   - eth: sunhme: uninitialized variable in happy_meal_init()

  Current release - new code bugs:

   - eth: octeontx2: fix resource not freed after malloc

  Previous releases - regressions:

   - sched: fix return value of qdisc ingress handling on success

   - sched: fix race condition in qdisc_graft()

   - udp: update reuse-&gt;has_conns under reuseport_lock.

   - tls: strp: make sure the TCP skbs do not have overlapping data

   - hsr: avoid possible NULL deref in skb_clone()

   - tipc: fix an information leak in tipc_topsrv_kern_subscr

   - phylink: add mac_managed_pm in phylink_config structure

   - eth: i40e: fix DMA mappings leak

   - eth: hyperv: fix a RX-path warning

   - eth: mtk: fix memory leaks

  Previous releases - always broken:

   - sched: cake: fix null pointer access issue when cake_init() fails"

* tag 'net-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (43 commits)
  net: phy: dp83822: disable MDI crossover status change interrupt
  net: sched: fix race condition in qdisc_graft()
  net: hns: fix possible memory leak in hnae_ae_register()
  wwan_hwsim: fix possible memory leak in wwan_hwsim_dev_new()
  sfc: include vport_id in filter spec hash and equal()
  genetlink: fix kdoc warnings
  selftests: add selftest for chaining of tc ingress handling to egress
  net: Fix return value of qdisc ingress handling on success
  net: sched: sfb: fix null pointer access issue when sfb_init() fails
  Revert "net: sched: fq_codel: remove redundant resource cleanup in fq_codel_init()"
  net: sched: cake: fix null pointer access issue when cake_init() fails
  ethernet: marvell: octeontx2 Fix resource not freed after malloc
  netfilter: nf_tables: relax NFTA_SET_ELEM_KEY_END set flags requirements
  netfilter: rpfilter/fib: Set -&gt;flowic_uid correctly for user namespaces.
  ionic: catch NULL pointer issue on reconfig
  net: hsr: avoid possible NULL deref in skb_clone()
  bnxt_en: fix memory leak in bnxt_nvm_test()
  ip6mr: fix UAF issue in ip6mr_sk_done() when addrconf_init_net() failed
  udp: Update reuse-&gt;has_conns under reuseport_lock.
  net: ethernet: mediatek: ppe: Remove the unused function mtk_foe_entry_usable()
  ...
</content>
</entry>
<entry>
<title>genetlink: fix kdoc warnings</title>
<updated>2022-10-19T20:08:30Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2022-10-18T23:13:10Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=a1a824f448ba96c610b85288d844adc9f781828e'/>
<id>urn:sha1:a1a824f448ba96c610b85288d844adc9f781828e</id>
<content type='text'>
Address a bunch of kdoc warnings:

include/net/genetlink.h:81: warning: Function parameter or member 'module' not described in 'genl_family'
include/net/genetlink.h:243: warning: expecting prototype for struct genl_info. Prototype was for struct genl_dumpit_info instead
include/net/genetlink.h:419: warning: Function parameter or member 'net' not described in 'genlmsg_unicast'
include/net/genetlink.h:438: warning: expecting prototype for gennlmsg_data(). Prototype was for genlmsg_data() instead
include/net/genetlink.h:244: warning: Function parameter or member 'op' not described in 'genl_dumpit_info'

Link: https://lore.kernel.org/r/20221018231310.1040482-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>udp: Update reuse-&gt;has_conns under reuseport_lock.</title>
<updated>2022-10-18T08:17:18Z</updated>
<author>
<name>Kuniyuki Iwashima</name>
<email>kuniyu@amazon.com</email>
</author>
<published>2022-10-14T18:26:25Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=69421bf98482d089e50799f45e48b25ce4a8d154'/>
<id>urn:sha1:69421bf98482d089e50799f45e48b25ce4a8d154</id>
<content type='text'>
When we call connect() for a UDP socket in a reuseport group, we have
to update sk-&gt;sk_reuseport_cb-&gt;has_conns to 1.  Otherwise, the kernel
could select a unconnected socket wrongly for packets sent to the
connected socket.

However, the current way to set has_conns is illegal and possible to
trigger that problem.  reuseport_has_conns() changes has_conns under
rcu_read_lock(), which upgrades the RCU reader to the updater.  Then,
it must do the update under the updater's lock, reuseport_lock, but
it doesn't for now.

For this reason, there is a race below where we fail to set has_conns
resulting in the wrong socket selection.  To avoid the race, let's split
the reader and updater with proper locking.

 cpu1                               cpu2
+----+                             +----+

__ip[46]_datagram_connect()        reuseport_grow()
.                                  .
|- reuseport_has_conns(sk, true)   |- more_reuse = __reuseport_alloc(more_socks_size)
|  .                               |
|  |- rcu_read_lock()
|  |- reuse = rcu_dereference(sk-&gt;sk_reuseport_cb)
|  |
|  |                               |  /* reuse-&gt;has_conns == 0 here */
|  |                               |- more_reuse-&gt;has_conns = reuse-&gt;has_conns
|  |- reuse-&gt;has_conns = 1         |  /* more_reuse-&gt;has_conns SHOULD BE 1 HERE */
|  |                               |
|  |                               |- rcu_assign_pointer(reuse-&gt;socks[i]-&gt;sk_reuseport_cb,
|  |                               |                     more_reuse)
|  `- rcu_read_unlock()            `- kfree_rcu(reuse, rcu)
|
|- sk-&gt;sk_state = TCP_ESTABLISHED

Note the likely(reuse) in reuseport_has_conns_set() is always true,
but we put the test there for ease of review.  [0]

For the record, usually, sk_reuseport_cb is changed under lock_sock().
The only exception is reuseport_grow() &amp; TCP reqsk migration case.

  1) shutdown() TCP listener, which is moved into the latter part of
     reuse-&gt;socks[] to migrate reqsk.

  2) New listen() overflows reuse-&gt;socks[] and call reuseport_grow().

  3) reuse-&gt;max_socks overflows u16 with the new listener.

  4) reuseport_grow() pops the old shutdown()ed listener from the array
     and update its sk-&gt;sk_reuseport_cb as NULL without lock_sock().

shutdown()ed TCP sk-&gt;sk_reuseport_cb can be changed without lock_sock(),
but, reuseport_has_conns_set() is called only for UDP under lock_sock(),
so likely(reuse) never be false in reuseport_has_conns_set().

[0]: https://lore.kernel.org/netdev/CANn89iLja=eQHbsM_Ta2sQF0tOGU8vAGrh_izRuuHjuO1ouUag@mail.gmail.com/

Fixes: acdcecc61285 ("udp: correct reuseport selection with connected sockets")
Signed-off-by: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Link: https://lore.kernel.org/r/20221014182625.89913-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random</title>
<updated>2022-10-16T22:27:07Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-10-16T22:27:07Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=f1947d7c8a61db1cb0ef909a6512ede0b1f2115b'/>
<id>urn:sha1:f1947d7c8a61db1cb0ef909a6512ede0b1f2115b</id>
<content type='text'>
Pull more random number generator updates from Jason Donenfeld:
 "This time with some large scale treewide cleanups.

  The intent of this pull is to clean up the way callers fetch random
  integers. The current rules for doing this right are:

   - If you want a secure or an insecure random u64, use get_random_u64()

   - If you want a secure or an insecure random u32, use get_random_u32()

     The old function prandom_u32() has been deprecated for a while
     now and is just a wrapper around get_random_u32(). Same for
     get_random_int().

   - If you want a secure or an insecure random u16, use get_random_u16()

   - If you want a secure or an insecure random u8, use get_random_u8()

   - If you want secure or insecure random bytes, use get_random_bytes().

     The old function prandom_bytes() has been deprecated for a while
     now and has long been a wrapper around get_random_bytes()

   - If you want a non-uniform random u32, u16, or u8 bounded by a
     certain open interval maximum, use prandom_u32_max()

     I say "non-uniform", because it doesn't do any rejection sampling
     or divisions. Hence, it stays within the prandom_*() namespace, not
     the get_random_*() namespace.

     I'm currently investigating a "uniform" function for 6.2. We'll see
     what comes of that.

  By applying these rules uniformly, we get several benefits:

   - By using prandom_u32_max() with an upper-bound that the compiler
     can prove at compile-time is ≤65536 or ≤256, internally
     get_random_u16() or get_random_u8() is used, which wastes fewer
     batched random bytes, and hence has higher throughput.

   - By using prandom_u32_max() instead of %, when the upper-bound is
     not a constant, division is still avoided, because
     prandom_u32_max() uses a faster multiplication-based trick instead.

   - By using get_random_u16() or get_random_u8() in cases where the
     return value is intended to indeed be a u16 or a u8, we waste fewer
     batched random bytes, and hence have higher throughput.

  This series was originally done by hand while I was on an airplane
  without Internet. Later, Kees and I worked on retroactively figuring
  out what could be done with Coccinelle and what had to be done
  manually, and then we split things up based on that.

  So while this touches a lot of files, the actual amount of code that's
  hand fiddled is comfortably small"

* tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  prandom: remove unused functions
  treewide: use get_random_bytes() when possible
  treewide: use get_random_u32() when possible
  treewide: use get_random_{u8,u16}() when possible, part 2
  treewide: use get_random_{u8,u16}() when possible, part 1
  treewide: use prandom_u32_max() when possible, part 2
  treewide: use prandom_u32_max() when possible, part 1
</content>
</entry>
<entry>
<title>Merge tag 'net-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2022-10-13T17:51:01Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-10-13T17:51:01Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=66ae04368efbe20eb8951c9a76158f99ce672f25'/>
<id>urn:sha1:66ae04368efbe20eb8951c9a76158f99ce672f25</id>
<content type='text'>
Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, and wifi.

Current release - regressions:

   - Revert "net/sched: taprio: make qdisc_leaf() see the
     per-netdev-queue pfifo child qdiscs", it may cause crashes when the
     qdisc is reconfigured

   - inet: ping: fix splat due to packet allocation refactoring in inet

   - tcp: clean up kernel listener's reqsk in inet_twsk_purge(), fix UAF
     due to races when per-netns hash table is used

  Current release - new code bugs:

   - eth: adin1110: check in netdev_event that netdev belongs to driver

   - fixes for PTR_ERR() vs NULL bugs in driver code, from Dan and co.

  Previous releases - regressions:

   - ipv4: handle attempt to delete multipath route when fib_info
     contains an nh reference, avoid oob access

   - wifi: fix handful of bugs in the new Multi-BSSID code

   - wifi: mt76: fix rate reporting / throughput regression on mt7915
     and newer, fix checksum offload

   - wifi: iwlwifi: mvm: fix double list_add at
     iwl_mvm_mac_wake_tx_queue (other cases)

   - wifi: mac80211: do not drop packets smaller than the LLC-SNAP
     header on fast-rx

  Previous releases - always broken:

   - ieee802154: don't warn zero-sized raw_sendmsg()

   - ipv6: ping: fix wrong checksum for large frames

   - mctp: prevent double key removal and unref

   - tcp/udp: fix memory leaks and races around IPV6_ADDRFORM

   - hv_netvsc: fix race between VF offering and VF association message

  Misc:

   - remove -Warray-bounds silencing in the drivers, compilers fixed"

* tag 'net-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (73 commits)
  sunhme: fix an IS_ERR() vs NULL check in probe
  net: marvell: prestera: fix a couple NULL vs IS_ERR() checks
  kcm: avoid potential race in kcm_tx_work
  tcp: Clean up kernel listener's reqsk in inet_twsk_purge()
  net: phy: micrel: Fixes FIELD_GET assertion
  openvswitch: add nf_ct_is_confirmed check before assigning the helper
  tcp: Fix data races around icsk-&gt;icsk_af_ops.
  ipv6: Fix data races around sk-&gt;sk_prot.
  tcp/udp: Call inet6_destroy_sock() in IPv6 sk-&gt;sk_destruct().
  udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM).
  tcp/udp: Fix memory leak in ipv6_renew_options().
  mctp: prevent double key removal and unref
  selftests: netfilter: Fix nft_fib.sh for all.rp_filter=1
  netfilter: rpfilter/fib: Populate flowic_l3mdev field
  selftests: netfilter: Test reverse path filtering
  net/mlx5: Make ASO poll CQ usable in atomic context
  tcp: cdg: allow tcp_cdg_release() to be called multiple times
  inet: ping: fix recent breakage
  ipv6: ping: fix wrong checksum for large frames
  net: ethernet: ti: am65-cpsw: set correct devlink flavour for unused ports
  ...
</content>
</entry>
</feed>
