aboutsummaryrefslogtreecommitdiffstats
path: root/include/linux/fpga/fpga-bridge.h (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2017-08-09net-next: tag_mtk: add flow_dissect callback to the ops structJohn Crispin1-2/+12
The MT7530 inserts the 4 magic header in between the 802.3 address and protocol field. The patch implements the callback that can be called by the flow dissector to figure out the real protocol and offset of the network header. With this patch applied we can properly parse the packet and thus make hashing function properly. Signed-off-by: Muciri Gatimu <muciri@openmesh.com> Signed-off-by: Shashidhar Lakkavalli <shashidhar.lakkavalli@openmesh.com> Signed-off-by: John Crispin <john@phrozen.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09net-next: dsa: add flow_dissect callback to struct dsa_device_opsJohn Crispin1-0/+2
When the flow dissector first sees packets coming in on a DSA devices the 802.3 header wont be located where the code expects it to be as the tag is still present. Adding this new callback allows a DSA device to provide a new function that the flow_dissector can use to get the correct protocol and offset of the network header. Signed-off-by: Muciri Gatimu <muciri@openmesh.com> Signed-off-by: Shashidhar Lakkavalli <shashidhar.lakkavalli@openmesh.com> Signed-off-by: John Crispin <john@phrozen.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09net-next: dsa: move struct dsa_device_ops to the global header fileJohn Crispin2-7/+7
We need to access this struct from within the flow_dissector to fix dissection for packets coming in on DSA devices. Signed-off-by: Muciri Gatimu <muciri@openmesh.com> Signed-off-by: Shashidhar Lakkavalli <shashidhar.lakkavalli@openmesh.com> Signed-off-by: John Crispin <john@phrozen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09net-next: mediatek: bring up QDMA RX ring 0John Crispin2-10/+29
This patch is in preparation for adding HW flow and QoS offloading. For those features to work, the driver needs to bring up the first QDMA RX ring. This ring is used by the PPE offloading HW. Signed-off-by: John Crisp in <john@phrozen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09net-next: mediatek: fix typos inside the header fileJohn Crispin1-2/+2
Trivial patch fixing 2 typos. Signed-off-by: John Crispin <john@phrozen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09net: atm: make atmdev_ops constBhumika Goyal4-4/+4
Make these const as they are only stored in the ops field of a atm_dev structure, which is const. Done using Coccinelle. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09atm: make atmdev_ops constBhumika Goyal6-6/+6
Make these structures const as they are either passed to the function atm_dev_register having the corresponding argument as const or stored in the ops field of a atm_dev structure, which is also const. Done using Coccinelle. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09net: dsa: make dsa_switch_ops constBhumika Goyal3-3/+3
Make these structures const as they are only stored in the ops field of a dsa_switch structure, which is const. Done using Coccinelle. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09liquidio: napi cleanupIntiyaz Basha2-0/+29
Disable napi when interface is going down. Delete napi when destroying the interface. Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09net: ipv6: lower ndisc notifier priority below addrconfDavid Ahern1-0/+1
ndisc_notify is used to send unsolicited neighbor advertisements (e.g., on a link up). Currently, the ndisc notifier is run before the addrconf notifer which means NA's are not sent for link-local addresses which are added by the addrconf notifier. Fix by lowering the priority of the ndisc notifier. Setting the priority to ADDRCONF_NOTIFY_PRIORITY - 5 means it runs after addrconf and before the route notifier which is ADDRCONF_NOTIFY_PRIORITY - 10. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09ibmvnic: Correct 'unused variable' warning in build.Nathan Fontenot1-1/+0
Commit a248878d7a1d ("ibmvnic: Check for transport event on driver resume") removed the loop to kick irqs on driver resume but didn't remove the now unused loop variable 'i'. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09ibmvnic: Add netdev_dbg output for debuggingNathan Fontenot1-7/+55
To ease debugging of the ibmvnic driver add a series of netdev_dbg() statements to track driver status, especially during initialization, removal, and resetting of the driver. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09ibmvnic: Clean up resources on probe failureNathan Fontenot1-11/+15
Ensure that any resources allocated during probe are released if the probe of the driver fails. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09net: call newid/getid without rtnl mutex heldFlorian Westphal1-2/+3
Both functions take nsid_lock and don't rely on rtnl lock. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09rtnetlink: add RTNL_FLAG_DOIT_UNLOCKEDFlorian Westphal2-0/+19
Allow callers to tell rtnetlink core that its doit callback should be invoked without holding rtnl mutex. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09rtnetlink: protect handler table with rcuFlorian Westphal1-56/+65
Note that netlink dumps still acquire rtnl mutex via the netlink dump infrastructure. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09rtnetlink: small rtnl lock pushdownFlorian Westphal1-6/+13
instead of rtnl lock/unload at the top level, push it down to the called function. This is just an intermediate step, next commit switches protection of the rtnl_link ops table to rcu, in which case (for dumps) the rtnl lock is acquired only by the netlink dumper infrastructure (current lock/unlock/dump/lock/unlock rtnl sequence becomes rcu lock/rcu unlock/dump). Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09rtnetlink: add reference counting to prevent module unload while dump is in progressFlorian Westphal1-1/+13
I don't see what prevents rmmod (unregister_all is called) while a dump is active. Even if we'd add rtnl lock/unlock pair to unregister_all (as done here), thats not enough either as rtnl_lock is released right before the dump process starts. So this adds a refcount: * acquire rtnl mutex * bump refcount * release mutex * start the dump ... and make unregister_all remove the callbacks (no new dumps possible) and then wait until refcount is 0. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09rtnetlink: make rtnl_register accept a flags parameterFlorian Westphal26-96/+95
This change allows us to later indicate to rtnetlink core that certain doit functions should be called without acquiring rtnl_mutex. This change should have no effect, we simply replace the last (now unused) calcit argument with the new flag. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09rtnetlink: call rtnl_calcit directlyFlorian Westphal1-25/+4
There is only a single place in the kernel that regisers the "calcit" callback (to determine min allocation for dumps). This is in rtnetlink.c for PF_UNSPEC RTM_GETLINK. The function that checks for calcit presence at run time will first check the requested family (which will always fail for !PF_UNSPEC as no callsite registers this), then falls back to checking PF_UNSPEC. Therefore we can just check if type is RTM_GETLINK and then do a direct call. Because of fallback to PF_UNSPEC all RTM_GETLINK types used this regardless of family. This has the advantage that we don't need to allocate space for the function pointer for all the other families. A followup patch will drop the calcit function pointer from the rtnl_link callback structure. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09bpf: add test cases for new BPF_J{LT, LE, SLT, SLE} instructionsDaniel Borkmann1-0/+313
Add test cases to the verifier selftest suite in order to verify that i) direct packet access, and ii) dynamic map value access is working with the changes related to the new instructions. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09bpf: enable BPF_J{LT, LE, SLT, SLE} opcodes in verifierDaniel Borkmann1-4/+58
Enable the newly added jump opcodes, main parts are in two different areas, namely direct packet access and dynamic map value access. For the direct packet access, we now allow for the following two new patterns to match in order to trigger markings with find_good_pkt_pointers(): Variant 1 (access ok when taking the branch): 0: (61) r2 = *(u32 *)(r1 +76) 1: (61) r3 = *(u32 *)(r1 +80) 2: (bf) r0 = r2 3: (07) r0 += 8 4: (ad) if r0 < r3 goto pc+2 R0=pkt(id=0,off=8,r=0) R1=ctx R2=pkt(id=0,off=0,r=0) R3=pkt_end R10=fp 5: (b7) r0 = 0 6: (95) exit from 4 to 7: R0=pkt(id=0,off=8,r=8) R1=ctx R2=pkt(id=0,off=0,r=8) R3=pkt_end R10=fp 7: (71) r0 = *(u8 *)(r2 +0) 8: (05) goto pc-4 5: (b7) r0 = 0 6: (95) exit processed 11 insns, stack depth 0 Variant 2 (access ok on fall-through): 0: (61) r2 = *(u32 *)(r1 +76) 1: (61) r3 = *(u32 *)(r1 +80) 2: (bf) r0 = r2 3: (07) r0 += 8 4: (bd) if r3 <= r0 goto pc+1 R0=pkt(id=0,off=8,r=8) R1=ctx R2=pkt(id=0,off=0,r=8) R3=pkt_end R10=fp 5: (71) r0 = *(u8 *)(r2 +0) 6: (b7) r0 = 1 7: (95) exit from 4 to 6: R0=pkt(id=0,off=8,r=0) R1=ctx R2=pkt(id=0,off=0,r=0) R3=pkt_end R10=fp 6: (b7) r0 = 1 7: (95) exit processed 10 insns, stack depth 0 The above two basically just swap the branches where we need to handle an exception and allow packet access compared to the two already existing variants for find_good_pkt_pointers(). For the dynamic map value access, we add the new instructions to reg_set_min_max() and reg_set_min_max_inv() in order to learn bounds. Verifier test cases for both are added in a follow-up patch. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09bpf, nfp: implement jiting of BPF_J{LT,LE}Daniel Borkmann1-0/+24
This work implements jiting of BPF_J{LT,LE} instructions with BPF_X/BPF_K variants for the nfp eBPF JIT. The two BPF_J{SLT,SLE} instructions have not been added yet given BPF_J{SGT,SGE} are not supported yet either. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09bpf, ppc64: implement jiting of BPF_J{LT, LE, SLT, SLE}Daniel Borkmann2-0/+21
This work implements jiting of BPF_J{LT,LE,SLT,SLE} instructions with BPF_X/BPF_K variants for the ppc64 eBPF JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09bpf, s390x: implement jiting of BPF_J{LT, LE, SLT, SLE}Daniel Borkmann1-0/+24
This work implements jiting of BPF_J{LT,LE,SLT,SLE} instructions with BPF_X/BPF_K variants for the s390x eBPF JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09bpf, sparc64: implement jiting of BPF_J{LT, LE, SLT, SLE}Daniel Borkmann1-0/+34
This work implements jiting of BPF_J{LT,LE,SLT,SLE} instructions with BPF_X/BPF_K variants for the sparc64 eBPF JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09bpf, arm64: implement jiting of BPF_J{LT, LE, SLT, SLE}Daniel Borkmann2-0/+24
This work implements jiting of BPF_J{LT,LE,SLT,SLE} instructions with BPF_X/BPF_K variants for the arm64 eBPF JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09bpf, x86: implement jiting of BPF_J{LT,LE,SLT,SLE}Daniel Borkmann1-0/+26
This work implements jiting of BPF_J{LT,LE,SLT,SLE} instructions with BPF_X/BPF_K variants for the x86_64 eBPF JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09bpf: add BPF_J{LT,LE,SLT,SLE} instructionsDaniel Borkmann6-4/+455
Currently, eBPF only understands BPF_JGT (>), BPF_JGE (>=), BPF_JSGT (s>), BPF_JSGE (s>=) instructions, this means that particularly *JLT/*JLE counterparts involving immediates need to be rewritten from e.g. X < [IMM] by swapping arguments into [IMM] > X, meaning the immediate first is required to be loaded into a register Y := [IMM], such that then we can compare with Y > X. Note that the destination operand is always required to be a register. This has the downside of having unnecessarily increased register pressure, meaning complex program would need to spill other registers temporarily to stack in order to obtain an unused register for the [IMM]. Loading to registers will thus also affect state pruning since we need to account for that register use and potentially those registers that had to be spilled/filled again. As a consequence slightly more stack space might have been used due to spilling, and BPF programs are a bit longer due to extra code involving the register load and potentially required spill/fills. Thus, add BPF_JLT (<), BPF_JLE (<=), BPF_JSLT (s<), BPF_JSLE (s<=) counterparts to the eBPF instruction set. Modifying LLVM to remove the NegateCC() workaround in a PoC patch at [1] and allowing it to also emit the new instructions resulted in cilium's BPF programs that are injected into the fast-path to have a reduced program length in the range of 2-3% (e.g. accumulated main and tail call sections from one of the object file reduced from 4864 to 4729 insns), reduced complexity in the range of 10-30% (e.g. accumulated sections reduced in one of the cases from 116432 to 88428 insns), and reduced stack usage in the range of 1-5% (e.g. accumulated sections from one of the object files reduced from 824 to 784b). The modification for LLVM will be incorporated in a backwards compatible way. Plan is for LLVM to have i) a target specific option to offer a possibility to explicitly enable the extension by the user (as we have with -m target specific extensions today for various CPU insns), and ii) have the kernel checked for presence of the extensions and enable them transparently when the user is selecting more aggressive options such as -march=native in a bpf target context. (Other frontends generating BPF byte code, e.g. ply can probe the kernel directly for its code generation.) [1] https://github.com/borkmann/llvm/tree/bpf-insns Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09sock: fix zerocopy_success regression with msg_zerocopyWillem de Bruijn1-2/+7
Do not use uarg->zerocopy outside msg_zerocopy. In other paths the field is not explicitly initialized and aliases another field. Those paths have only one reference so do not need this intermediate variable. Call uarg->callback directly. Fixes: 1f8b977ab32d ("sock: enable MSG_ZEROCOPY") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09sock: fix zerocopy panic in mem accountingWillem de Bruijn1-2/+2
Only call mm_unaccount_pinned_pages when releasing a struct ubuf_info that has initialized its field uarg->mmp. Before this patch, a vhost-net with experimental_zcopytx can crash in mm_unaccount_pinned_pages sock_zerocopy_put skb_zcopy_clear skb_release_data Only sock_zerocopy_alloc initializes this field. Move the unaccount call from generic sock_zerocopy_put to its specific callback sock_zerocopy_callback. Fixes: a91dbff551a6 ("sock: ulimit on MSG_ZEROCOPY pages") Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09futex: Remove unnecessary warning from get_futex_keyMel Gorman1-2/+3
Commit 65d8fc777f6d ("futex: Remove requirement for lock_page() in get_futex_key()") removed an unnecessary lock_page() with the side-effect that page->mapping needed to be treated very carefully. Two defensive warnings were added in case any assumption was missed and the first warning assumed a correct application would not alter a mapping backing a futex key. Since merging, it has not triggered for any unexpected case but Mark Rutland reported the following bug triggering due to the first warning. kernel BUG at kernel/futex.c:679! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 3695 Comm: syz-executor1 Not tainted 4.13.0-rc3-00020-g307fec773ba3 #3 Hardware name: linux,dummy-virt (DT) task: ffff80001e271780 task.stack: ffff000010908000 PC is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679 LR is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679 pc : [<ffff00000821ac14>] lr : [<ffff00000821ac14>] pstate: 80000145 The fact that it's a bug instead of a warning was due to an unrelated arm64 problem, but the warning itself triggered because the underlying mapping changed. This is an application issue but from a kernel perspective it's a recoverable situation and the warning is unnecessary so this patch removes the warning. The warning may potentially be triggered with the following test program from Mark although it may be necessary to adjust NR_FUTEX_THREADS to be a value smaller than the number of CPUs in the system. #include <linux/futex.h> #include <pthread.h> #include <stdio.h> #include <stdlib.h> #include <sys/mman.h> #include <sys/syscall.h> #include <sys/time.h> #include <unistd.h> #define NR_FUTEX_THREADS 16 pthread_t threads[NR_FUTEX_THREADS]; void *mem; #define MEM_PROT (PROT_READ | PROT_WRITE) #define MEM_SIZE 65536 static int futex_wrapper(int *uaddr, int op, int val, const struct timespec *timeout, int *uaddr2, int val3) { syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3); } void *poll_futex(void *unused) { for (;;) { futex_wrapper(mem, FUTEX_CMP_REQUEUE_PI, 1, NULL, mem + 4, 1); } } int main(int argc, char *argv[]) { int i; mem = mmap(NULL, MEM_SIZE, MEM_PROT, MAP_SHARED | MAP_ANONYMOUS, -1, 0); printf("Mapping @ %p\n", mem); printf("Creating futex threads...\n"); for (i = 0; i < NR_FUTEX_THREADS; i++) pthread_create(&threads[i], NULL, poll_futex, NULL); printf("Flipping mapping...\n"); for (;;) { mmap(mem, MEM_SIZE, MEM_PROT, MAP_FIXED | MAP_SHARED | MAP_ANONYMOUS, -1, 0); } return 0; } Reported-and-tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org # 4.7+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-08net: ipv6: avoid overhead when no custom FIB rules are installedVincent Bernat3-13/+29
If the user hasn't installed any custom rules, don't go through the whole FIB rules layer. This is pretty similar to f4530fa574df (ipv4: Avoid overhead when no custom FIB rules are installed). Using a micro-benchmark module [1], timing ip6_route_output() with get_cycles(), with 40,000 routes in the main routing table, before this patch: min=606 max=12911 count=627 average=1959 95th=4903 90th=3747 50th=1602 mad=821 table=254 avgdepth=21.8 maxdepth=39 value │ ┊ count 600 │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ 199 880 │▒▒▒░░░░░░░░░░░░░░░░ 43 1160 │▒▒▒░░░░░░░░░░░░░░░░░░░░ 48 1440 │▒▒▒░░░░░░░░░░░░░░░░░░░░░░░ 43 1720 │▒▒▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░ 59 2000 │▒▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 50 2280 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 26 2560 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 31 2840 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 28 3120 │▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 17 3400 │▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 17 3680 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 8 3960 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 11 4240 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 6 4520 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 6 4800 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 9 After: min=544 max=11687 count=627 average=1776 95th=4546 90th=3585 50th=1227 mad=565 table=254 avgdepth=21.8 maxdepth=39 value │ ┊ count 540 │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ 201 800 │▒▒▒▒▒░░░░░░░░░░░░░░░░ 63 1060 │▒▒▒▒▒░░░░░░░░░░░░░░░░░░░░░ 68 1320 │▒▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░ 39 1580 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 32 1840 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 32 2100 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 34 2360 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 33 2620 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 26 2880 │▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 22 3140 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 9 3400 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 8 3660 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 9 3920 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 8 4180 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 8 4440 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 8 At the frequency of the host during the bench (~ 3.7 GHz), this is about a 100 ns difference on the median value. A next step would be to collapse local and main tables, as in 0ddcf43d5d4a (ipv4: FIB Local/MAIN table collapse). [1]: https://github.com/vincentbernat/network-lab/blob/master/lab-routes-ipv6/kbench_mod.c Signed-off-by: Vincent Bernat <vincent@bernat.im> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08net: avoid skb_warn_bad_offload false positives on UFOWillem de Bruijn3-3/+3
skb_warn_bad_offload triggers a warning when an skb enters the GSO stack at __skb_gso_segment that does not have CHECKSUM_PARTIAL checksum offload set. Commit b2504a5dbef3 ("net: reduce skb_warn_bad_offload() noise") observed that SKB_GSO_DODGY producers can trigger the check and that passing those packets through the GSO handlers will fix it up. But, the software UFO handler will set ip_summed to CHECKSUM_NONE. When __skb_gso_segment is called from the receive path, this triggers the warning again. Make UFO set CHECKSUM_UNNECESSARY instead of CHECKSUM_NONE. On Tx these two are equivalent. On Rx, this better matches the skb state (checksum computed), as CHECKSUM_NONE here means no checksum computed. See also this thread for context: http://patchwork.ozlabs.org/patch/799015/ Fixes: b2504a5dbef3 ("net: reduce skb_warn_bad_offload() noise") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08isdn: hfcsusb: constify usb_device_idArvind Yadav1-1/+1
usb_device_id are not supposed to change at runtime. All functions working with usb_device_id provided by <linux/usb.h> work with const usb_device_id. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08isdn: hisax: hfc_usb: constify usb_device_idArvind Yadav1-1/+1
usb_device_id are not supposed to change at runtime. All functions working with usb_device_id provided by <linux/usb.h> work with const usb_device_id. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08qmi_wwan: fix NULL deref on disconnectBjørn Mork1-1/+5
qmi_wwan_disconnect is called twice when disconnecting devices with separate control and data interfaces. The first invocation will set the interface data to NULL for both interfaces to flag that the disconnect has been handled. But the matching NULL check was left out when qmi_wwan_disconnect was added, resulting in this oops: usb 2-1.4: USB disconnect, device number 4 qmi_wwan 2-1.4:1.6 wwp0s29u1u4i6: unregister 'qmi_wwan' usb-0000:00:1d.0-1.4, WWAN/QMI device BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0 IP: qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan] PGD 0 P4D 0 Oops: 0000 [#1] SMP Modules linked in: <stripped irrelevant module list> CPU: 2 PID: 33 Comm: kworker/2:1 Tainted: G E 4.12.3-nr44-normandy-r1500619820+ #1 Hardware name: LENOVO 4291LR7/4291LR7, BIOS CBET4000 4.6-810-g50522254fb 07/21/2017 Workqueue: usb_hub_wq hub_event [usbcore] task: ffff8c882b716040 task.stack: ffffb8e800d84000 RIP: 0010:qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan] RSP: 0018:ffffb8e800d87b38 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffff8c8824f3f1d0 RDI: ffff8c8824ef6400 RBP: ffff8c8824ef6400 R08: 0000000000000000 R09: 0000000000000000 R10: ffffb8e800d87780 R11: 0000000000000011 R12: ffffffffc07ea0e8 R13: ffff8c8824e2e000 R14: ffff8c8824e2e098 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8c8835300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000e0 CR3: 0000000229ca5000 CR4: 00000000000406e0 Call Trace: ? usb_unbind_interface+0x71/0x270 [usbcore] ? device_release_driver_internal+0x154/0x210 ? qmi_wwan_unbind+0x6d/0xc0 [qmi_wwan] ? usbnet_disconnect+0x6c/0xf0 [usbnet] ? qmi_wwan_disconnect+0x87/0xc0 [qmi_wwan] ? usb_unbind_interface+0x71/0x270 [usbcore] ? device_release_driver_internal+0x154/0x210 Reported-and-tested-by: Nathaniel Roach <nroach44@gmail.com> Fixes: c6adf77953bc ("net: usb: qmi_wwan: add qmap mux protocol support") Cc: Daniele Palmas <dnlplm@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08qmi_wwan: fix NULL deref on disconnectBjørn Mork1-1/+5
qmi_wwan_disconnect is called twice when disconnecting devices with separate control and data interfaces. The first invocation will set the interface data to NULL for both interfaces to flag that the disconnect has been handled. But the matching NULL check was left out when qmi_wwan_disconnect was added, resulting in this oops: usb 2-1.4: USB disconnect, device number 4 qmi_wwan 2-1.4:1.6 wwp0s29u1u4i6: unregister 'qmi_wwan' usb-0000:00:1d.0-1.4, WWAN/QMI device BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0 IP: qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan] PGD 0 P4D 0 Oops: 0000 [#1] SMP Modules linked in: <stripped irrelevant module list> CPU: 2 PID: 33 Comm: kworker/2:1 Tainted: G E 4.12.3-nr44-normandy-r1500619820+ #1 Hardware name: LENOVO 4291LR7/4291LR7, BIOS CBET4000 4.6-810-g50522254fb 07/21/2017 Workqueue: usb_hub_wq hub_event [usbcore] task: ffff8c882b716040 task.stack: ffffb8e800d84000 RIP: 0010:qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan] RSP: 0018:ffffb8e800d87b38 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffff8c8824f3f1d0 RDI: ffff8c8824ef6400 RBP: ffff8c8824ef6400 R08: 0000000000000000 R09: 0000000000000000 R10: ffffb8e800d87780 R11: 0000000000000011 R12: ffffffffc07ea0e8 R13: ffff8c8824e2e000 R14: ffff8c8824e2e098 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8c8835300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000e0 CR3: 0000000229ca5000 CR4: 00000000000406e0 Call Trace: ? usb_unbind_interface+0x71/0x270 [usbcore] ? device_release_driver_internal+0x154/0x210 ? qmi_wwan_unbind+0x6d/0xc0 [qmi_wwan] ? usbnet_disconnect+0x6c/0xf0 [usbnet] ? qmi_wwan_disconnect+0x87/0xc0 [qmi_wwan] ? usb_unbind_interface+0x71/0x270 [usbcore] ? device_release_driver_internal+0x154/0x210 Reported-and-tested-by: Nathaniel Roach <nroach44@gmail.com> Fixes: c6adf77953bc ("net: usb: qmi_wwan: add qmap mux protocol support") Cc: Daniele Palmas <dnlplm@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08net: phy: mdio-bcm-unimac: fix unsigned wrap-around when decrementing timeoutColin Ian King1-1/+1
Change post-decrement compare to pre-decrement to avoid an unsigned integer wrap-around on timeout. This leads to the following !timeout check to never to be true so -ETIMEDOUT is never returned. Detected by CoverityScan, CID#1452623 ("Logically dead code") Fixes: 69a60b0579a4 ("net: phy: mdio-bcm-unimac: factor busy polling loop") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08ppp: fix xmit recursion detection on ppp channelsGuillaume Nault1-8/+10
Commit e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp devices") dropped the xmit_recursion counter incrementation in ppp_channel_push() and relied on ppp_xmit_process() for this task. But __ppp_channel_push() can also send packets directly (using the .start_xmit() channel callback), in which case the xmit_recursion counter isn't incremented anymore. If such packets get routed back to the parent ppp unit, ppp_xmit_process() won't notice the recursion and will call ppp_channel_push() on the same channel, effectively creating the deadlock situation that the xmit_recursion mechanism was supposed to prevent. This patch re-introduces the xmit_recursion counter incrementation in ppp_channel_push(). Since the xmit_recursion variable is now part of the parent ppp unit, incrementation is skipped if the channel doesn't have any. This is fine because only packets routed through the parent unit may enter the channel recursively. Finally, we have to ensure that pch->ppp is not going to be modified while executing ppp_channel_push(). Instead of taking this lock only while calling ppp_xmit_process(), we now have to hold it for the full ppp_channel_push() execution. This respects the ppp locks ordering which requires locking ->upl before ->downl. Fixes: e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp devices") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08rds: Reintroduce statistics countingHåkon Bugge1-1/+4
In commit 7e3f2952eeb1 ("rds: don't let RDS shutdown a connection while senders are present"), refilling the receive queue was removed from rds_ib_recv(), along with the increment of s_ib_rx_refill_from_thread. Commit 73ce4317bf98 ("RDS: make sure we post recv buffers") re-introduces filling the receive queue from rds_ib_recv(), but does not add the statistics counter. rds_ib_recv() was later renamed to rds_ib_recv_path(). This commit reintroduces the statistics counting of s_ib_rx_refill_from_thread and s_ib_rx_refill_from_cq. Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Reviewed-by: Knut Omang <knut.omang@oracle.com> Reviewed-by: Wei Lin Guay <wei.lin.guay@oracle.com> Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08tcp: fastopen: tcp_connect() must refresh the routeEric Dumazet1-0/+4
With new TCP_FASTOPEN_CONNECT socket option, there is a possibility to call tcp_connect() while socket sk_dst_cache is either NULL or invalid. +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4 +0 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0 +0 setsockopt(4, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0 +0 connect(4, ..., ...) = 0 << sk->sk_dst_cache becomes obsolete, or even set to NULL >> +1 sendto(4, ..., 1000, MSG_FASTOPEN, ..., ...) = 1000 We need to refresh the route otherwise bad things can happen, especially when syzkaller is running on the host :/ Fixes: 19f6d3f3c8422 ("net/tcp-fastopen: Add new API support") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Wei Wang <weiwan@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Wei Wang <weiwan@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08net: sched: set xt_tgchk_param par.net properly in ipt_init_targetXin Long1-10/+10
Now xt_tgchk_param par in ipt_init_target is a local varibale, par.net is not initialized there. Later when xt_check_target calls target's checkentry in which it may access par.net, it would cause kernel panic. Jaroslav found this panic when running: # ip link add TestIface type dummy # tc qd add dev TestIface ingress handle ffff: # tc filter add dev TestIface parent ffff: u32 match u32 0 0 \ action xt -j CONNMARK --set-mark 4 This patch is to pass net param into ipt_init_target and set par.net with it properly in there. v1->v2: As Wang Cong pointed, I missed ipt_net_id != xt_net_id, so fix it by also passing net_id to __tcf_ipt_init. v2->v3: Missed the fixes tag, so add it. Fixes: ecb2421b5ddf ("netfilter: add and use nf_ct_netns_get/put") Reported-by: Jaroslav Aster <jaster@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08cxgb4: Clear On FLASH config file after a FW upgradeArjun Vynipadath2-0/+71
Because Firmware and the Firmware Configuration File need to be in sync; clear out any On-FLASH Firmware Configuration File when new Firmware is loaded. This will avoid difficult to diagnose and fix problems with a mis-matched Firmware Configuration File which prevents the adapter from being initialized. Original work by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08net_sched: get rid of some forward declarationsWANG Cong1-111/+103
If we move up tcf_fill_node() we can get rid of these forward declarations. Also, move down tfilter_notify_chain() to group them together. Reported-by: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08net: dsa: lan9303: Only allocate 3 portsEgil Hjelmeland2-3/+2
Save 2628 bytes on arm eabi by allocate only the required 3 ports. Now that ds->num_ports is correct: In net/dsa/tag_lan9303.c eliminate duplicate LAN9303_MAX_PORTS, use ds->num_ports. (Matching the pattern of other net/dsa/tag_xxx.c files.) Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08selftests: bpf: add a test for XDP redirectWilliam Tu4-3/+86
Add test for xdp_redirect by creating two namespaces with two veth peers, then forward packets in-between. Signed-off-by: William Tu <u9012063@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08liquidio: fix misspelled firmware image filenamesDerek Chickles1-4/+8
Fix misspelled firmware image filenames advertised via MODULE_FIRMWARE(). Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08bpf: Extend check_uarg_tail_zero() checksMickaël Salaün1-11/+15
The function check_uarg_tail_zero() was created from bpf(2) for BPF_OBJ_GET_INFO_BY_FD without taking the access_ok() nor the PAGE_SIZE checks. Make this checks more generally available while unlikely to be triggered, extend the memory range check and add an explanation including why the ToCToU should not be a security concern. Signed-off-by: Mickaël Salaün <mic@digikod.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Kees Cook <keescook@chromium.org> Cc: Martin KaFai Lau <kafai@fb.com> Link: https://lkml.kernel.org/r/CAGXu5j+vRGFvJZmjtAcT8Hi8B+Wz0e1b6VKYZHfQP_=DXzC4CQ@mail.gmail.com Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08bpf: Move check_uarg_tail_zero() upwardMickaël Salaün1-26/+26
The function check_uarg_tail_zero() may be useful for other part of the code in the syscall.c file. Move this function at the beginning of the file. Signed-off-by: Mickaël Salaün <mic@digikod.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Kees Cook <keescook@chromium.org> Cc: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>