linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2017-01-24	net: dsa: b53: Utilize mdio_module_driver	Florian Fainelli	1	-12/+1
	Eliminate a bit of boilerplate code. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	net: phy: Fix typo for MDIO module boilerplate comment	Florian Fainelli	1	-1/+1
	The module boilerplate macro is named mdio_module_driver and not module_mdio_driver, fix that. Fixes: a9049e0c513c ("mdio: Add support for mdio drivers.") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	net: stmmac: dwmac-meson8b: make the RGMII TX delay configurable	Martin Blumenstingl	1	-6/+14
	Prior to this patch we were using a hardcoded RGMII TX clock delay of 2ns (= 1/4 cycle of the 125MHz RGMII TX clock). This value works for many boards, but unfortunately not for all (due to the way the actual circuit is designed, sometimes because the TX delay is enabled in the PHY, etc.). Making the TX delay on the MAC side configurable allows us to support all possible hardware combinations. This allows fixing a compatibility issue on some boards, where the RTL8211F PHY is configured to generate the TX delay. We can now turn off the TX delay in the MAC, because otherwise we would be applying the delay twice (which results in non-working TX traffic). Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Tested-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	net: dt-bindings: add RGMII TX delay configuration to meson8b-dwmac	Martin Blumenstingl	1	-0/+16
	This allows configuring the RGMII TX clock delay. The RGMII clock is generated by underlying hardware of the the Meson 8b / GXBB DWMAC glue. The configuration depends on the actual hardware (no delay may be needed due to the design of the actual circuit, the PHY might add this delay, etc.). Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Tested-by: Neil Armstrong <narmstrong@baylibre.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	net: dsa: Fix inverted test for multiple CPU interface	Andrew Lunn	1	-1/+1
	Remove the wrong !, otherwise we get false positives about having multiple CPU interfaces. Fixes: b22de490869d ("net: dsa: store CPU switch structure in the tree") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	bridge: multicast to unicast	Felix Fietkau	8	-29/+114
	Implements an optional, per bridge port flag and feature to deliver multicast packets to any host on the according port via unicast individually. This is done by copying the packet per host and changing the multicast destination MAC to a unicast one accordingly. multicast-to-unicast works on top of the multicast snooping feature of the bridge. Which means unicast copies are only delivered to hosts which are interested in it and signalized this via IGMP/MLD reports previously. This feature is intended for interface types which have a more reliable and/or efficient way to deliver unicast packets than broadcast ones (e.g. wifi). However, it should only be enabled on interfaces where no IGMPv2/MLDv1 report suppression takes place. This feature is disabled by default. The initial patch and idea is from Felix Fietkau. Signed-off-by: Felix Fietkau <nbd@nbd.name> [linus.luessing@c0d3.blue: various bug + style fixes, commit message] Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	Introduce a sysctl that modifies the value of PROT_SOCK.	Krister Johansen	9	-12/+86
	Add net.ipv4.ip_unprivileged_port_start, which is a per namespace sysctl that denotes the first unprivileged inet port in the namespace. To disable all privileged ports set this to zero. It also checks for overlap with the local port range. The privileged and local range may not overlap. The use case for this change is to allow containerized processes to bind to priviliged ports, but prevent them from ever being allowed to modify their container's network configuration. The latter is accomplished by ensuring that the network namespace is not a child of the user namespace. This modification was needed to allow the container manager to disable a namespace's priviliged port restrictions without exposing control of the network namespace to processes in the user namespace. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-23	bpf, lpm: fix kfree of im_node in trie_update_elem	Daniel Borkmann	1	-1/+1
	We need to initialize im_node to NULL, otherwise in case of error path it gets passed to kfree() as uninitialized pointer. Fixes: b95a5c4db09b ("bpf: add a longest prefix match trie map implementation") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-23	samples/bpf: add lpm-trie benchmark	David Herrmann	2	-0/+79
	Extend the map_perf_test_{user,kern}.c infrastructure to stress test lpm-trie lookups. We hook into the kprobe on sys_gettid() and measure the latency depending on trie size and lookup count. On my Intel Haswell i7-6400U, a single gettid() syscall with an empty bpf program takes roughly 6.5us on my system. Lookups in empty tries take ~1.8us on first try, ~0.9us on retries. Lookups in tries with 8192 entries take ~7.1us (on the first _and_ any subsequent try). Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Reviewed-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-23	bpf: Add tests for the lpm trie map	David Herrmann	3	-2/+361
	The first part of this program runs randomized tests against the lpm-bpf-map. It implements a "Trivial Longest Prefix Match" (tlpm) based on simple, linear, single linked lists. The implementation should be pretty straightforward. Based on tlpm, this inserts randomized data into bpf-lpm-maps and verifies the trie-based bpf-map implementation behaves the same way as tlpm. The second part uses 'real world' IPv4 and IPv6 addresses and tests the trie with those. Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-23	bpf: add a longest prefix match trie map implementation	Daniel Mack	3	-1/+511
	This trie implements a longest prefix match algorithm that can be used to match IP addresses to a stored set of ranges. Internally, data is stored in an unbalanced trie of nodes that has a maximum height of n, where n is the prefixlen the trie was created with. Tries may be created with prefix lengths that are multiples of 8, in the range from 8 to 2048. The key used for lookup and update operations is a struct bpf_lpm_trie_key, and the value is a uint64_t. The code carries more information about the internal implementation. Signed-off-by: Daniel Mack <daniel@zonque.org> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-23	net: xilinx: constify net_device_ops structure	Bhumika Goyal	1	-2/+2
	Declare net_device_ops structure as const as it is only stored in the netdev_ops field of a net_device structure. This field is of type const, so net_device_ops structures having same properties can be made const too. Done using Coccinelle: @r1 disable optional_qualifier@ identifier i; position p; @@ static struct net_device_ops i@p={...}; @ok1@ identifier r1.i; position p; struct net_device ndev; @@ ndev.netdev_ops=&i@p @bad@ position p!={r1.p,ok1.p}; identifier r1.i; @@ i@p @depends on !bad disable optional_qualifier@ identifier r1.i; @@ +const struct net_device_ops i; File size before: text data bss dec hex filename 6201 744 0 6945 1b21 ethernet/xilinx/xilinx_emaclite.o File size after: text data bss dec hex filename 6745 192 0 6937 1b19 ethernet/xilinx/xilinx_emaclite.o Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-23	net: moxa: constify net_device_ops structures	Bhumika Goyal	1	-1/+1
	Declare net_device_ops structure as const as it is only stored in the netdev_ops field of a net_device structure. This field is of type const, so net_device_ops structures having same properties can be made const too. Done using Coccinelle: @r1 disable optional_qualifier@ identifier i; position p; @@ static struct net_device_ops i@p={...}; @ok1@ identifier r1.i; position p; struct net_device ndev; @@ ndev.netdev_ops=&i@p @bad@ position p!={r1.p,ok1.p}; identifier r1.i; @@ i@p @depends on !bad disable optional_qualifier@ identifier r1.i; @@ +const struct net_device_ops i; File size before: text data bss dec hex filename 4821 744 0 5565 15bd ethernet/moxa/moxart_ether.o File size after: text data bss dec hex filename 5373 192 0 5565 15bd ethernet/moxa/moxart_ether.o Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-23	net: qcom/emac: claim the irq only when the device is opened	Timur Tabi	3	-14/+11
	During reset, functions emac_mac_down() and emac_mac_up() are called, so we don't want to free and claim the IRQ unnecessarily. Move those operations to open/close. Signed-off-by: Timur Tabi <timur@codeaurora.org> Reviewed-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-23	net: qcom/emac: rename emac_phy to emac_sgmii and move it	Timur Tabi	9	-24/+23
	The EMAC has an internal PHY that is often called the "SGMII". This SGMII is also connected to an external PHY, which is managed by phylib. These dual PHYs often cause confusion. In this case, the data structure for managing the SGMII was mis-named and located in the wrong header file. Structure emac_phy is renamed to emac_sgmii to clearly indicate it applies to the internal PHY only. It also also moved from emac_phy.h (which supports the external PHY) to emac_sgmii.h (where it belongs). To keep the changes minimal, only the structure name is changed, not the names of any variables of that type. Signed-off-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-23	bnx2x: avoid two atomic ops per page on x86	Eric Dumazet	1	-10/+5
	Commit 4cace675d687 ("bnx2x: Alloc 4k fragment for each rx ring buffer element") added extra put_page() and get_page() calls on arches where PAGE_SIZE=4K like x86 Reorder things to avoid this overhead. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Cc: Yuval Mintz <Yuval.Mintz@cavium.com> Cc: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-22	net: phy: bcm7xxx: Implement EGPHY workaround for 7278	Florian Fainelli	1	-0/+34
	Implement the HW design team recommended workaround in for 7278. Since the GPHY now returns its revision information in MII_PHYS_ID[23] we need to check whether the revision provided in flags is 0 or not. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-22	net: phy: bcm7xxx: Add entry for BCM7278	Florian Fainelli	2	-0/+3
	Add support for the BCM7278 28nm process Gigabit Ethernet PHY. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-22	net: dsa: bcm_sf2: Allow non-IMP ports to have Broadcom tags enabled	Florian Fainelli	3	-0/+18
	Parse the "brcm,use-bcm-hdr" boolean property during ports identification to fill a bitmask of ports that should have Broadcom tags enabled. This is needed in some configurations where per-packet metadata can be exchanged using Broadcom tags between the switch and an on-chip acceleration device. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-22	net: dsa: bcm_sf2: Move code enabling Broadcom tags	Florian Fainelli	1	-27/+34
	In preparation for enabling Broadcom tags on different ports based on configuration information, dedicate a function that is responsible for enabling Broadcom tags for a given port and update the IMP port setup to call it. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-22	net: dsa: bcm_sf2: Add support for BCM7278 integrated switch	Florian Fainelli	5	-10/+68
	Add support for the integrated switch found on BCM7278: - core_reg_align is set to 1, to force a translation into the target address space which is 8 bytes aligned - an alternate SWITCH_REG layout is provided since registers are largely bit/masks compatible but have different offsets - conditional for all CORE_STS_OVERRIDE_{IMP,GMII_P} since those got moved way out of the traditional register space Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-22	net: dsa: bcm_sf2: Prepare for different register layouts	Florian Fainelli	3	-25/+109
	In preparation for supporting a new device with a slightly different register layout, affecting the SWITCH_REG and SWITCH_CORE address spaces, perform a few preparatory steps: - allow matching the compatible string against a data description - convert the SWITCH_REG register accesses into an indirection table - prepare for supporting a SWITCH_CORE register alignment requirement Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-22	net: dsa: bcm_sf2: Make SF2_IO64_MACRO() utilize 32-bit macro	Florian Fainelli	1	-2/+2
	There is no point inlining the 32-bit direct register read/write part, just infer it from the existing macro. This will make it easier to centralize the address rewriting that we are going to introduce later on. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-22	net: systemport: Add support for SYSTEMPORT Lite	Florian Fainelli	3	-79/+327
	Add supporf for the SYSTEMPORT Lite Ethernet controller, this piece of hardware is largely based on the full-blown SYSTEMPORT and differs in the following: - no full-blown UniMAC, instead we have the MagicPacket matching from UniMAC at same offset, and a GMII Interface Block (GIB) for the MAC-level stuff, since we are always interfaced to an Ethernet switch which is fully Ethernet compliant shortcuts could be made - 16 transmit queues, whose interrupts are moved into the first Level-2 interrupt controller bank - slight TDMA offset change (a register was inserted after TDMA_STATUS, sigh) - 256 RX descriptors (512 words) and 256 TX descriptors (not visible) As a consequence of these two things, update the code paths accordingly to differentiate the full-blown from the light version. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-22	net: systemport: Dynamically allocate number of TX rings	Florian Fainelli	2	-1/+12
	In preparation for adding SYSTEMPORT Lite, which has twice as less transmit queues than SYSTEMPORT make sure we do allocate TX rings based on the systemport,txq property to get an appropriate memory footprint. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-22	ipv6: add NUMA awareness to seg6_hmac_init_algo()	Eric Dumazet	1	-1/+2
	Since we allocate per cpu storage, let's also use NUMA hints. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-22	net: stmicro: fix LS field mask in EEE configuration	jpinto	1	-1/+1
	This patch fixes the LS mask when setting EEE timer. LS field is 10 bits long and not 11 as currently. Signed-off-by: Joao Pinto <jpinto@synopsys.com> Reported-By: Rayagond Kokatanur <rayagond@vayavyalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-22	net/mlx4: use rb_entry()	Geliang Tang	1	-4/+4
	To make the code clearer, use rb_entry() instead of container_of() to deal with rbtree. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-22	6lowpan: use rb_entry()	Geliang Tang	1	-4/+4
	To make the code clearer, use rb_entry() instead of container_of() to deal with rbtree. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	net: dsa: Remove hwmon support	Andrew Lunn	8	-350/+0
	Only the Marvell mv88e6xxx DSA driver made use of the HWMON support in DSA. The temperature sensor registers are actually in the embedded PHYs, and the PHY driver now supports it. So remove all HWMON support from DSA and drivers. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	phy: marvell: Add support for temperature sensor	Andrew Lunn	1	-3/+420
	Some Marvell PHYs have an inbuilt temperature sensor. Add hwmon support for this sensor. There are two different variants. The simpler, older chips have a 5 degree accuracy. The newer devices have 1 degree accuracy. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	inet: don't use sk_v6_rcv_saddr directly	Josef Bacik	1	-1/+1
	When comparing two sockets we need to use inet6_rcv_saddr so we get a NULL sk_v6_rcv_saddr if the socket isn't AF_INET6, otherwise our comparison function can be wrong. Fixes: 637bc8b ("inet: reset tb->fastreuseport when adding a reuseport sk") Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	net: ethernet: ti: cpsw: clarify ethtool ops changing num of descs	Ivan Khoronzhuk	1	-73/+59
	After adding cpsw_set_ringparam ethtool op, better to carry out common parts of similar ops splitting descriptors in runtime. It allows to reuse these parts and shows what the ops actually do. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	net: ethernet: ti: cpsw: don't duplicate common res in rx handler	Ivan Khoronzhuk	1	-24/+16
	No need to duplicate the same function in rx handler to get info if any interface is running. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	net: ethernet: ti: cpsw: don't duplicate ndev_running	Ivan Khoronzhuk	1	-14/+17
	No need to create additional vars to identify if interface is running. So simplify code by removing redundant var and checking usage counter instead. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	net: ethernet: ti: cpsw: don't disable interrupts in ndo_open	Ivan Khoronzhuk	1	-2/+0
	No need to disable interrupts if no open devices, they are disabled anyway. Even no need to disable interrupts if some ndev is opened, In this case shared resources are not touched, only parameters of ndev shell, so no reason to disable them also. Removed lines have proved it. So, no need in redundant check and interrupt disable. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	net: ethernet: ti: cpsw: remove dual check from common res usage function	Ivan Khoronzhuk	1	-3/+0
	Common res usage is possible only in case an interface is running. In case of not dual emac here can be only one interface, so while ndo_open and switch mode, only one interface can be opened, thus if open is called no any interface is running ... and no common res are used. So remove check on dual emac, it will simplify code/understanding and will match the name it's called. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	macvlan: use netdev_is_rx_handler_busy instead of checking specific type	Mahesh Bandewar	1	-1/+1
	netdev_is_rx_handler_busy() check is a superset of netif_is_ipvlan_port() check and hence should be preferred. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	ipvlan: use netdev_is_rx_handler_busy instead of checking specific type	Mahesh Bandewar	1	-2/+2
	IPvlan checks if the master device is already used by checking a specific device (here it's macvlan device). This is technically not sufficient and it should just ensure the rx_handler is busy or not. This would be a super check that includes macvlan and any other that has already registered rx-handler. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	net: remove duplicate code.	Mahesh Bandewar	1	-3/+1
	netdev_rx_handler_register() checks to see if the handler is already busy which was recently separated into netdev_is_rx_handler_busy(). So use the same function inside register() to avoid code duplication. Essentially this change should be a no-op Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	fq_codel: Avoid regenerating skb flow hash unless necessary	Andrew Collins	1	-5/+1
	The fq_codel qdisc currently always regenerates the skb flow hash. This wastes some cycles and prevents flow seperation in cases where the traffic has been encrypted and can no longer be understood by the flow dissector. Change it to use the prexisting flow hash if one exists, and only regenerate if necessary. Signed-off-by: Andrew Collins <acollins@cradlepoint.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	vxlan: preserve type of dst_port parm for encap_bypass_if_local()	Lance Richardson	1	-1/+1
	Eliminate sparse warning by maintaining type of dst_port as __be16. Signed-off-by: Lance Richardson <lrichard@redhat.com> Acked-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	csum: eliminate sparse warning in remcsum_unadjust()	Lance Richardson	1	-1/+1
	Cast second parameter of csum_sub() from __sum16 to __wsum. Signed-off-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	tipc: make replicast a user selectable option	Jon Paul Maloy	6	-17/+112
	If the bearer carrying multicast messages supports broadcast, those messages will be sent to all cluster nodes, irrespective of whether these nodes host any actual destinations socket or not. This is clearly wasteful if the cluster is large and there are only a few real destinations for the message being sent. In this commit we extend the eligibility of the newly introduced "replicast" transmit option. We now make it possible for a user to select which method he wants to be used, either as a mandatory setting via setsockopt(), or as a relative setting where we let the broadcast layer decide which method to use based on the ratio between cluster size and the message's actual number of destination nodes. In the latter case, a sending socket must stick to a previously selected method until it enters an idle period of at least 5 seconds. This eliminates the risk of message reordering caused by method change, i.e., when changes to cluster size or number of destinations would otherwise mandate a new method to be used. Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	tipc: introduce replicast as transport option for multicast	Jon Paul Maloy	7	-47/+149
	TIPC multicast messages are currently carried over a reliable 'broadcast link', making use of the underlying media's ability to transport packets as L2 broadcast or IP multicast to all nodes in the cluster. When the used bearer is lacking that ability, we can instead emulate the broadcast service by replicating and sending the packets over as many unicast links as needed to reach all identified destinations. We now introduce a new TIPC link-level 'replicast' service that does this. Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	tipc: add functionality to lookup multicast destination nodes	Jon Paul Maloy	4	-8/+87
	As a further preparation for the upcoming 'replicast' functionality, we add some necessary structs and functions for looking up and returning a list of all nodes that host destinations for a given multicast message. Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	tipc: add function for checking broadcast support in bearer	Jon Paul Maloy	4	-9/+34
	As a preparation for the 'replicast' functionality we are going to introduce in the next commits, we need the broadcast base structure to store whether bearer broadcast is available at all from the currently used bearer or bearers. We do this by adding a new function tipc_bearer_bcast_support() to the bearer layer, and letting the bearer selection function in bcast.c use this to give a new boolean field, 'bcast_support' the appropriate value. Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	bpf: add bpf_probe_read_str helper	Gianluca Borello	2	-1/+46
	Provide a simple helper with the same semantics of strncpy_from_unsafe(): int bpf_probe_read_str(void dst, int size, const void unsafe_addr) This gives more flexibility to a bpf program. A typical use case is intercepting a file name during sys_open(). The current approach is: SEC("kprobe/sys_open") void bpf_sys_open(struct pt_regs ctx) { char buf[PATHLEN]; // PATHLEN is defined to 256 bpf_probe_read(buf, sizeof(buf), ctx->di); / consume buf / } This is suboptimal because the size of the string needs to be estimated at compile time, causing more memory to be copied than often necessary, and can become more problematic if further processing on buf is done, for example by pushing it to userspace via bpf_perf_event_output(), since the real length of the string is unknown and the entire buffer must be copied (and defining an unrolled strnlen() inside the bpf program is a very inefficient and unfeasible approach). With the new helper, the code can easily operate on the actual string length rather than the buffer size: SEC("kprobe/sys_open") void bpf_sys_open(struct pt_regs ctx) { char buf[PATHLEN]; // PATHLEN is defined to 256 int res = bpf_probe_read_str(buf, sizeof(buf), ctx->di); /* consume buf, for example push it to userspace via * bpf_perf_event_output(), but this time we can use * res (the string length) as event size, after checking * its boundaries. */ } Another useful use case is when parsing individual process arguments or individual environment variables navigating current->mm->arg_start and current->mm->env_start: using this helper and the return value, one can quickly iterate at the right offset of the memory area. The code changes simply leverage the already existent strncpy_from_unsafe() kernel function, which is safe to be called from a bpf program as it is used in bpf_trace_printk(). Signed-off-by: Gianluca Borello <g.borello@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	device: Implement a bus agnostic dev_num_vf routine	Phil Sutter	3	-4/+8
	Now that pci_bus_type has num_vf callback set, dev_num_vf can be implemented in a bus type independent way and the check for whether a PCI device is being handled in rtnetlink can be dropped. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	PCI: implement num_vf bus type callback	Phil Sutter	1	-0/+6
	Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>