linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2016-11-27	bpf: drop unnecessary context cast from BPF_PROG_RUN	Daniel Borkmann	4	-6/+6
	Since long already bpf_func is not only about struct sk_buff * as input anymore. Make it generic as void *, so that callers don't need to cast for it each time they call BPF_PROG_RUN(). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-27	sfc: remove unneeded variable	Dan Carpenter	2	-9/+1
	We don't use ->heap_buf after commit 46d1efd852cc ("sfc: remove Software TSO") so let's remove the last traces. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-27	netdevice: fix sparse warning for HARD_TX_LOCK	Michael S. Tsirkin	1	-1/+16
	sparse warns about context imbalance in any code that uses HARD_TX_LOCK/UNLOCK - this is because it's unable to determine that flags don't change so lock and unlock are paired. Seems easy enough to fix by adding __acquire/__release calls. With this patch af_packet.c is now sparse-clean, Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-27	ptp: gianfar: Use high resolution frequency method.	Ulrik De Bie	1	-8/+13
	This patch depends on commit d8d263541913 ("ptp: Introduce a high resolution frequency adjustment method.") The gianfar devices offer a frequency resolution of about 0.46 ppb (depends on actual value of tmr_add, for the calculation assumed 0x80000000). This patch lets users of the device benefit from the increased frequency resolution when tuning the clock. Thanks to the rounding the maximum error between the requested frequency and the applied frequency will then be about 0.23 ppb. Tested on a v3.3.8 kernel on a real gianfar device. Verified compilation on net-next (currently at v4.9-rc5). Signed-off-by: Ulrik De Bie <ulrik.debie-os@e2big.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-27	mlx4: do not use priv->stats_lock in mlx4_en_auto_moderation()	Eric Dumazet	1	-4/+2
	Per RX ring packets/bytes counters are not protected by global priv->stats_lock. Better not confuse the reader, and use READ_ONCE() to show we read these counters without surrounding synchronization. Interrupt moderation is best effort, and we do not really care of ultra precise counters. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-26	fix default_file_splice_read()	Al Viro	1	-1/+2
	Botched calculation of number of pages. As the result, we were dropping pieces when doing splice to pipe from e.g. 9p. Reported-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-11-25	tipc: resolve connection flow control compatibility problem	Jon Paul Maloy	1	-1/+1
	In commit 10724cc7bb78 ("tipc: redesign connection-level flow control") we replaced the previous message based flow control with one based on 1k blocks. In order to ensure backwards compatibility the mechanism falls back to using message as base unit when it senses that the peer doesn't support the new algorithm. The default flow control window, i.e., how many units can be sent before the sender blocks and waits for an acknowledge (aka advertisement) is 512. This was tested against the previous version, which uses an acknowledge frequency of on ack per 256 received message, and found to work fine. However, we missed the fact that versions older than Linux 3.15 use an acknowledge frequency of 512, which is exactly the limit where a 4.6+ sender will stop and wait for acknowledge. This would also work fine if it weren't for the fact that if the first sent message on a 4.6+ server side is an empty SYNACK, this one is also is counted as a sent message, while it is not counted as a received message on a legacy 3.15-receiver. This leads to the sender always being one step ahead of the receiver, a scenario causing the sender to block after 512 sent messages, while the receiver only has registered 511 read messages. Hence, the legacy receiver is not trigged to send an acknowledge, with a permanently blocked sender as result. We solve this deadlock by simply allowing the sender to send one more message before it blocks, i.e., by a making minimal change to the condition used for determining connection congestion. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: spectrum: Add policers for trap groups	Nogah Frankel	1	-2/+72
	Configure policers and connect them to trap groups. While many trap groups share policer's configuration they don't share the actual policer because each trap group represents a different flow / protocol and we don't want one of them to be able to exceed its rate on behalf of another. For example, if STP and LLDP gets to send 128 packets/sec each, if we put them in one 256 packets/sec policer, one can send 200 packets while the other only 50. Note that IP2ME covers lots of flows, so it's limit is set to match the cpu ability to handle data. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: reg: Add QoS Policer Configuration Register	Nogah Frankel	1	-0/+141
	The QPCR register is used to create and control policers. A policer can discard or change the color of packets that are trapped by a specific trap. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: resources: Add max cpu policers resource	Nogah Frankel	1	-0/+2
	Add a new resource to resources query: max cpu policers which tells us how many policers can be used to limit the data rate to the cpu port. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: Create a different trap group list for each device	Nogah Frankel	3	-34/+77
	Trap groups can be used to control traps priority, both in terms of which trap "wins" if a packet matches two traps (priority) and in terms of packets from which trap group will be scheduled to the cpu first (tc). They can also be used to set rate limiters (policers) on them (will be added in the next patches). Currently, we support two trap groups. In Spectrum we want a better resolution, so every protocol / flow will have a different trap group, so we can control its parameters separately. Once the policers will be implemented, it will also allow us limit the rate of each protocol by itself. This patch change the trap group list to include: * the emad trap group, which is shared for all the devices. * Switchx2's trap groups, which are a copy of the current trap groups. * Spectrum's new trap groups, in order to match the above guidelines. (Switchib is using only the emad trap group, so it require no changes). This patch also includes new configuration for Spectrum's trap groups, with primary priority order within them. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: spectrum: Add BGP trap	Nogah Frankel	2	-0/+2
	Add a trap for BGP protocol that was previously trapped by the generic trap for IP2ME. This trap will allow us to have better control (over priority and rate) of the traffic. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: Change trap groups setting	Nogah Frankel	4	-40/+92
	Trap groups have many options which we currently set to default values. In the next patches we will use many of them with non-default values. Some of these options have no default value, so this patch sets them as params for the trap group set function. Others almost always use the same values, so the set function will use this default values. In the rare cases when they will need to be with other values, these values can be set directly (using the macros for fields in registers). Parameters without default value: TC - the traffic class for packets that hit this trap group. (old default is the max tc) priority - if one packet hits multiple trap groups, the group with the higher priority will "catch" it. (old default is 0) policer - limit rate policer (old default is disabled) Default parameters: swid - switch id, relevant for the emad trap only, ignored on Spectrum. (new default is 0) rdq - CPU receive descriptor queue (new default is identical to trap group id) Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: resources: Add max trap groups resource	Nogah Frankel	1	-0/+2
	Add the max number of trap groups to resource query. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: core: Change emad trap group settings	Nogah Frankel	5	-39/+39
	Currently, the emad trap init was done in the core. In the future we will want to add some changes to the traps groups, according to device type. This commit create a driver function to create the trap group for the emad, so later it can be changed by devices. It also changes the emad registration to use the new generic functions. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: Add option to choose trap group	Nogah Frankel	4	-21/+22
	Currently, we set the trap group to pre-determined option, based on whether it is an rx or event trap. This commit adds a possibility to chose the trap group, so it can be set to different values in the following patches. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: Change trap set function	Nogah Frankel	5	-44/+49
	Change trap setting function so instead of determining the trap group by trap id, it gets it as a parameter (so later we can have different trap groups for Spectrum and Switchx2). Add "is_ctrl" parameter to the trap setting function. It control whether the trapped packets wait in a designated control buffer or in their default one. This parameter is ignored by Switchx2 and Switchib. Add these parameters to the traps array in Spectrum, Switchx2 and Switchib. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: switchib: Use generic listener struct for events	Nogah Frankel	1	-35/+27
	Change the event handling in Switchib to be comptible with Spectrum and Switchx2. Use the generic listener struct for the events. Init and fini them by loop (and not by calling each event by its name). Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: switchx2: Use generic listener struct for events	Nogah Frankel	1	-66/+12
	Change the events to use the generic listener struct. Merge the event list into the trap list, so the same functions will handle both. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: spectrum: Use generic listener struct for events	Nogah Frankel	1	-65/+12
	Change the events to use the generic listener struct. Merge the event list into the trap list, so the same functions will handle both. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: core: Introduce generic macro for event	Nogah Frankel	1	-0/+12
	Create a macro for creating the generic listener struct for events, similar to the one for rx traps. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: switchx2: Use generic listener struct for rx traps	Nogah Frankel	1	-100/+27
	Reorganize the traps to use the new generic listener struct and functions. Use macros to shorten the traps list. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: spectrum: Use generic listener struct for rx traps	Nogah Frankel	1	-57/+38
	Replace the old rx listener struct definitions by the generic ones. Use the new generic registering / unregistering functions for them. Add some macros to organize the trap list. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: core: Expose generic macros for rx trap	Nogah Frankel	1	-0/+14
	In Spectrum, there is a macro to arrange the traps list. This macro is useful for everyone who is using rx traps. Create a similar macro in core.h for creating the generic listener struct for rx traps. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: core: Create a generic function to register / unregister traps	Nogah Frankel	2	-0/+85
	We have 2 types of HW traps to handle, rx traps and events. The registration workflow for both is very similar. So it only make sense to create one function to handle both. This patch creates a struct to hold the data for both cases. It also creates a registration and an un-registration functions that get this generic struct as input. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mlxsw: spectrum: Remove unused traps	Nogah Frankel	1	-6/+1
	Since commit 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces") we no longer rely on flooding traffic to the CPU in order to trap packets intended for the host itself. Therefore, the FDB MC trap can be removed. Remove traps for protocols that are not supported yet. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	mvpp2: use correct size for memset	Arnd Bergmann	1	-1/+1
	gcc-7 detects a short memset in mvpp2, introduced in the original merge of the driver: drivers/net/ethernet/marvell/mvpp2.c: In function 'mvpp2_cls_init': drivers/net/ethernet/marvell/mvpp2.c:3296:2: error: 'memset' used with length equal to number of elements without multiplication by element size [-Werror=memset-elt-size] The result seems to be that we write uninitialized data into the flow table registers, although we did not get any warning about that uninitialized data usage. Using sizeof() lets us initialize then entire array instead. Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	net/mlx5: drop duplicate header delay.h	Geliang Tang	1	-1/+0
	Drop duplicate header delay.h from mlx5/core/main.c. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Acked-by: Matan Barak <matanb@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	net: ieee802154: drop duplicate header delay.h	Geliang Tang	1	-1/+0
	Drop duplicate header delay.h from adf7242.c. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Acked-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	ibmvnic: drop duplicate header seq_file.h	Geliang Tang	1	-1/+0
	Drop duplicate header seq_file.h from ibmvnic.c. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	fsl/fman: fix a leak in tgec_free()	Dan Carpenter	1	-3/+0
	We set "tgec->cfg" to NULL before passing it to kfree(). There is no need to set it to NULL at all. Let's just delete it. Fixes: 57ba4c9b56d8 ("fsl/fman: Add FMan MAC support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	net/mlx5: remove a duplicate condition	Dan Carpenter	1	-2/+1
	We verified that MLX5_FLOW_CONTEXT_ACTION_COUNT was set on the first line of the function so we don't need to check again here. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	net: ethtool: don't require CAP_NET_ADMIN for ETHTOOL_GLINKSETTINGS	Miroslav Lichvar	1	-0/+1
	The ETHTOOL_GLINKSETTINGS command is deprecating the ETHTOOL_GSET command and likewise it shouldn't require the CAP_NET_ADMIN capability. Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	net: thunderx: Pause frame support	Sunil Goutham	6	-0/+166
	Enable pause frames on both Rx and Tx side, configure pause interval e.t.c. Also support for enable/disable pause frames on Rx/Tx via ethtool has been added. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	net: thunderx: Configure RED and backpressure levels	Sunil Goutham	4	-9/+41
	This patch enables moving average calculation of Rx pkt's resources and configures RED and backpressure levels for both CQ and RBDR. Also initialize SQ's CQ_LIMIT properly. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	net: thunderx: Add ethtool support for supported ports and link modes.	Thanneeru Srinivasulu	5	-3/+38
	Signed-off-by: Thanneeru Srinivasulu <tsrinivasulu@cavium.com> Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	net: thunderx: 80xx BGX0 configuration changes	Sunil Goutham	1	-3/+17
	On 80xx only one lane of DLM0 and DLM1 (of BGX0) can be used , so even though lmac count may be 2 but LMAC1 should use serdes lane of DLM1. Since it's not possible to distinguish 80xx from 81xx as PCI devid are same, this patch adds this config support by replying on what firmware configures the lmacs with. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	tipc: improve sanity check for received domain records	Jon Paul Maloy	1	-5/+5
	In commit 35c55c9877f8 ("tipc: add neighbor monitoring framework") we added a data area to the link monitor STATE messages under the assumption that previous versions did not use any such data area. For versions older than Linux 4.3 this assumption is not correct. In those version, all STATE messages sent out from a node inadvertently contain a 16 byte data area containing a string; -a leftover from previous RESET messages which were using this during the setup phase. This string serves no purpose in STATE messages, and should no be there. Unfortunately, this data area is delivered to the link monitor framework, where a sanity check catches that it is not a correct domain record, and drops it. It also issues a rate limited warning about the event. Since such events occur much more frequently than anticipated, we now choose to remove the warning in order to not fill the kernel log with useless contents. We also make the sanity check stricter, to further reduce the risk that such data is inavertently admitted. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	tipc: fix compatibility bug in link monitoring	Jon Paul Maloy	1	-2/+3
	commit 817298102b0b ("tipc: fix link priority propagation") introduced a compatibility problem between TIPC versions newer than Linux 4.6 and those older than Linux 4.4. In versions later than 4.4, link STATE messages only contain a non-zero link priority value when the sender wants the receiver to change its priority. This has the effect that the receiver resets itself in order to apply the new priority. This works well, and is consistent with the said commit. However, in versions older than 4.4 a valid link priority is present in all sent link STATE messages, leading to cyclic link establishment and reset on the 4.6+ node. We fix this by adding a test that the received value should not only be valid, but also differ from the current value in order to cause the receiving link endpoint to reset. Reported-by: Amar Nv <amar.nv005@gmail.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	phy: fix error case of phy_led_triggers_(un)register	Woojung Huh	2	-5/+3
	When phy_init_hw() fails at phy_attach_direct(); - phy_detach() calls phy_led_triggers_unregister() without previous call of phy_led_triggers_register(). - still call phy_led_triggers_register() and cause memory leak. Fixes: 2e0bc452f472 ("net: phy: leds: add support for led triggers on phy link state change") Signed-off-by: Woojung Huh <woojung.huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	net: ethernet: mvneta: Remove IFF_UNICAST_FLT which is not implemented	Andrew Lunn	1	-1/+1
	The mvneta driver advertises it supports IFF_UNICAST_FLT. However, it actually does not. The hardware probably does support it, but there is no code to configure the filter. As a quick and simple fix, remove the flag. This will cause the core to fall back to promiscuous mode. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Fixes: b50b72de2f2f ("net: mvneta: enable features before registering the driver") Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	net: properly flush delay-freed skbs	Eric Dumazet	1	-2/+3
	Typical NAPI drivers use napi_consume_skb(skb) at TX completion time. This put skb in a percpu special queue, napi_alloc_cache, to get bulk frees. It turns out the queue is not flushed and hits the NAPI_SKB_CACHE_SIZE limit quite often, with skbs that were queued hundreds of usec earlier. I measured this can take ~6000 nsec to perform one flush. __kfree_skb_flush() can be called from two points right now : 1) From net_tx_action(), but only for skbs that were queued to sd->completion_queue. -> Irrelevant for NAPI drivers in normal operation. 2) From net_rx_action(), but only under high stress or if RPS/RFS has a pending action. This patch changes net_rx_action() to perform the flush in all cases and after more urgent operations happened (like kicking remote CPUS for RPS/RFS). Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	Fix subtle CONFIG_MODVERSIONS problems	Linus Torvalds	1	-0/+1
	CONFIG_MODVERSIONS has been broken for pretty much the whole 4.9 series, and quite frankly, nobody has cared very deeply. We absolutely know how to fix it, and it's not _complicated_, but it's not exactly pretty either. This oneliner fixes it without the ugliness, and allows for further future cleanups. "We've secretly replaced their regular MODVERSIONS with nothing at all, let's see if they notice" Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-25	MAINTAINERS: Add bug tracking system location entry type	Rafael J. Wysocki	1	-0/+10
	Following the kernel Bugzilla discussion during the Kernel Summit (https://lwn.net/Articles/705245/), add bug tracking system location entry type (B) to MAINTAINERS and populate it for several subsystems known to be using the kernel BZ actively (and add the upstream BZ for ACPICA too). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-25	Revert "i2c: designware: do not disable adapter after transfer"	Jarkko Nikula	1	-37/+18
	This reverts commit 0317e6c0f1dc1ba86b8d9dccc010c5e77b8355fa. Srinivas reported recently touchscreen and touchpad stopped working in Haswell based machine in Linux 4.9-rc series with timeout errors from i2c_designware: [ 16.508013] i2c_designware INT33C3:00: controller timed out [ 16.508302] i2c_hid i2c-MSFT0001:02: failed to change power setting. [ 17.532016] i2c_designware INT33C3:00: controller timed out [ 18.556022] i2c_designware INT33C3:00: controller timed out [ 18.556315] i2c_hid i2c-ATML1000:00: failed to retrieve report from device. I managed to reproduce similar errors on another Haswell based machine where touchscreen initialization fails maybe in every 1/5 - 1/2 boots. Since root cause for these errors is not clear yet and debugging is ongoing it's better to revert this commit as we are near to release. Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2016-11-25	samples: bpf: add userspace example for attaching eBPF programs to cgroups	Daniel Mack	4	-0/+173
	Add a simple userpace program to demonstrate the new API to attach eBPF programs to cgroups. This is what it does: * Create arraymap in kernel with 4 byte keys and 8 byte values * Load eBPF program The eBPF program accesses the map passed in to store two pieces of information. The number of invocations of the program, which maps to the number of packets received, is stored to key 0. Key 1 is incremented on each iteration by the number of bytes stored in the skb. * Detach any eBPF program previously attached to the cgroup * Attach the new program to the cgroup using BPF_PROG_ATTACH * Once a second, read map[0] and map[1] to see how many bytes and packets were seen on any socket of tasks in the given cgroup. The program takes a cgroup path as 1st argument, and either "ingress" or "egress" as 2nd. Optionally, "drop" can be passed as 3rd argument, which will make the generated eBPF program return 0 instead of 1, so the kernel will drop the packet. libbpf gained two new wrappers for the new syscall commands. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	net: ipv4, ipv6: run cgroup eBPF egress programs	Daniel Mack	2	-2/+33
	If the cgroup associated with the receiving socket has an eBPF programs installed, run them from ip_output(), ip6_output() and ip_mc_output(). From mentioned functions we have two socket contexts as per 7026b1ddb6b8 ("netfilter: Pass socket pointer down through okfn()."). We explicitly need to use sk instead of skb->sk here, since otherwise the same program would run multiple times on egress when encap devices are involved, which is not desired in our case. eBPF programs used in this context are expected to either return 1 to let the packet pass, or != 1 to drop them. The programs have access to the skb through bpf_skb_load_bytes(), and the payload starts at the network headers (L3). Note that cgroup_bpf_run_filter() is stubbed out as static inline nop for !CONFIG_CGROUP_BPF, and is otherwise guarded by a static key if the feature is unused. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	net: filter: run cgroup eBPF ingress programs	Daniel Mack	1	-0/+4
	If the cgroup associated with the receiving socket has an eBPF programs installed, run them from sk_filter_trim_cap(). eBPF programs used in this context are expected to either return 1 to let the packet pass, or != 1 to drop them. The programs have access to the skb through bpf_skb_load_bytes(), and the payload starts at the network headers (L3). Note that cgroup_bpf_run_filter() is stubbed out as static inline nop for !CONFIG_CGROUP_BPF, and is otherwise guarded by a static key if the feature is unused. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commands	Daniel Mack	2	-0/+89
	Extend the bpf(2) syscall by two new commands, BPF_PROG_ATTACH and BPF_PROG_DETACH which allow attaching and detaching eBPF programs to a target. On the API level, the target could be anything that has an fd in userspace, hence the name of the field in union bpf_attr is called 'target_fd'. When called with BPF_ATTACH_TYPE_CGROUP_INET_{E,IN}GRESS, the target is expected to be a valid file descriptor of a cgroup v2 directory which has the bpf controller enabled. These are the only use-cases implemented by this patch at this point, but more can be added. If a program of the given type already exists in the given cgroup, the program is swapped automically, so userspace does not have to drop an existing program first before installing a new one, which would otherwise leave a gap in which no program is attached. For more information on the propagation logic to subcgroups, please refer to the bpf cgroup controller implementation. The API is guarded by CAP_NET_ADMIN. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25	cgroup: add support for eBPF programs	Daniel Mack	6	-0/+281
	This patch adds two sets of eBPF program pointers to struct cgroup. One for such that are directly pinned to a cgroup, and one for such that are effective for it. To illustrate the logic behind that, assume the following example cgroup hierarchy. A - B - C \ D - E If only B has a program attached, it will be effective for B, C, D and E. If D then attaches a program itself, that will be effective for both D and E, and the program in B will only affect B and C. Only one program of a given type is effective for a cgroup. Attaching and detaching programs will be done through the bpf(2) syscall. For now, ingress and egress inet socket filtering are the only supported use-cases. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>