aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2017-07-21x86/devicetree: Convert to using %pOF instead of ->full_nameRob Herring1-2/+1
Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each device node. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: devicetree@vger.kernel.org Link: http://lkml.kernel.org/r/20170718214339.7774-7-robh@kernel.org [ Clarify the error message while at it, as 'node' is ambiguous. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-21perf/x86/intel: Add proper condition to run sched_task callbacksJiri Olsa3-10/+14
We have 2 functions using the same sched_task callback: - PEBS drain for free running counters - LBR save/store Both of them are called from intel_pmu_sched_task() and either of them can be unwillingly triggered when the other one is configured to run. Let's say there's PEBS drain configured in sched_task callback for the event, but in the callback itself (intel_pmu_sched_task()) we will also run the code for LBR save/restore, which we did not ask for, but the code in intel_pmu_sched_task() does not check for that. This can lead to extra cycles in some perf monitoring, like when we monitor PEBS event without LBR data. # perf record --no-timestamp -c 10000 -e cycles:p ./perf bench sched pipe -l 1000000 (We need PEBS, non freq/non timestamp event to enable the sched_task callback) The perf stat of cycles and msr:write_msr for above command before the change: ... Performance counter stats for './perf record --no-timestamp -c 10000 -e cycles:p \ ./perf bench sched pipe -l 1000000' (5 runs): 18,519,557,441 cycles:k 91,195,527 msr:write_msr 29.334476406 seconds time elapsed And after the change: ... Performance counter stats for './perf record --no-timestamp -c 10000 -e cycles:p \ ./perf bench sched pipe -l 1000000' (5 runs): 18,704,973,540 cycles:k 27,184,720 msr:write_msr 16.977875900 seconds time elapsed There's no affect on cycles:k because the sched_task happens with events switched off, however the msr:write_msr tracepoint counter together with almost 50% of time speedup show the improvement. Monitoring LBR event and having extra PEBS drain processing in sched_task callback showed just a little speedup, because the drain function does not do much extra work in case there is no PEBS data. Adding conditions to recognize the configured work that needs to be done in the x86_pmu's sched_task callback. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170719075247.GA27506@krava Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-21x86/platform/uv/BAU: Disable BAU on single hub configurationsAndrew Banman1-5/+18
The BAU confers no benefit to a UV system running with only one hub/socket. Permanently disable the BAU driver if there are less than two hubs online to avoid BAU overhead. We have observed failed boots on single-socket UV4 systems caused by BAU that are avoided with this patch. Also, while at it, consolidate initialization error blocks and fix a memory leak. Signed-off-by: Andrew Banman <abanman@hpe.com> Acked-by: Russ Anderson <rja@hpe.com> Acked-by: Mike Travis <mike.travis@hpe.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: tony.ernst@hpe.com Link: http://lkml.kernel.org/r/1500588351-78016-1-git-send-email-abanman@hpe.com [ Minor cleanups. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-21perf/core: Fix locking for children siblings group readJiri Olsa1-0/+5
We're missing ctx lock when iterating children siblings within the perf_read path for group reading. Following race and crash can happen: User space doing read syscall on event group leader: T1: perf_read lock event->ctx->mutex perf_read_group lock leader->child_mutex __perf_read_group_add(child) list_for_each_entry(sub, &leader->sibling_list, group_entry) ----> sub might be invalid at this point, because it could get removed via perf_event_exit_task_context in T2 Child exiting and cleaning up its events: T2: perf_event_exit_task_context lock ctx->mutex list_for_each_entry_safe(child_event, next, &child_ctx->event_list,... perf_event_exit_event(child) lock ctx->lock perf_group_detach(child) unlock ctx->lock ----> child is removed from sibling_list without any sync with T1 path above ... free_event(child) Before the child is removed from the leader's child_list, (and thus is omitted from perf_read_group processing), we need to ensure that perf_read_group touches child's siblings under its ctx->lock. Peter further notes: | One additional note; this bug got exposed by commit: | | ba5213ae6b88 ("perf/core: Correct event creation with PERF_FORMAT_GROUP") | | which made it possible to actually trigger this code-path. Tested-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: ba5213ae6b88 ("perf/core: Correct event creation with PERF_FORMAT_GROUP") Link: http://lkml.kernel.org/r/20170720141455.2106-1-jolsa@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-21ide: avoid warning for timings calculationArnd Bergmann1-9/+9
gcc-7 warns about the result of a constant multiplication used as a boolean: drivers/ide/ide-timings.c: In function 'ide_timing_quantize': drivers/ide/ide-timings.c:112:24: error: '*' in boolean context, suggest '&&' instead [-Werror=int-in-bool-context] q->setup = EZ(t->setup * 1000, T); This slightly rearranges the macro to simplify the code and avoid the warning at the same time. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-20net: bonding: Fix transmit load balancing in balance-alb modeKosuke Tatsukawa1-1/+1
balance-alb mode used to have transmit dynamic load balancing feature enabled by default. However, transmit dynamic load balancing no longer works in balance-alb after commit 8b426dc54cf4 ("bonding: remove hardcoded value"). Both balance-tlb and balance-alb use the function bond_do_alb_xmit() to send packets. This function uses the parameter tlb_dynamic_lb. tlb_dynamic_lb used to have the default value of 1 for balance-alb, but now the value is set to 0 except in balance-tlb. Re-enable transmit dyanmic load balancing by initializing tlb_dynamic_lb for balance-alb similar to balance-tlb. Fixes: 8b426dc54cf4 ("bonding: remove hardcoded value") Signed-off-by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-20rds: Make sure updates to cp_send_gen can be observedHåkon Bugge1-3/+3
cp->cp_send_gen is treated as a normal variable, although it may be used by different threads. This is fixed by using {READ,WRITE}_ONCE when it is incremented and READ_ONCE when it is read outside the {acquire,release}_in_xmit protection. Normative reference from the Linux-Kernel Memory Model: Loads from and stores to shared (but non-atomic) variables should be protected with the READ_ONCE(), WRITE_ONCE(), and ACCESS_ONCE(). Clause 5.1.2.4/25 in the C standard is also relevant. Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Reviewed-by: Knut Omang <knut.omang@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-20net: ethernet: ti: cpsw: Push the request_irq function to the end of probeKeerthy1-24/+25
Push the request_irq function to the end of probe so as to ensure all the required fields are populated in the event of an ISR getting executed right after requesting the irq. Currently while loading the crash kernel a crash was seen as soon as devm_request_threaded_irq was called. This was due to n->poll being NULL which is called as part of net_rx_action function. Suggested-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Keerthy <j-keerthy@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-20ipv4: initialize fib_trie prior to register_netdev_notifier call.Mahesh Bandewar1-4/+5
Net stack initialization currently initializes fib-trie after the first call to netdevice_notifier() call. In fact fib_trie initialization needs to happen before first rtnl_register(). It does not cause any problem since there are no devices UP at this moment, but trying to bring 'lo' UP at initialization would make this assumption wrong and exposes the issue. Fixes following crash Call Trace: ? alternate_node_alloc+0x76/0xa0 fib_table_insert+0x1b7/0x4b0 fib_magic.isra.17+0xea/0x120 fib_add_ifaddr+0x7b/0x190 fib_netdev_event+0xc0/0x130 register_netdevice_notifier+0x1c1/0x1d0 ip_fib_init+0x72/0x85 ip_rt_init+0x187/0x1e9 ip_init+0xe/0x1a inet_init+0x171/0x26c ? ipv4_offload_init+0x66/0x66 do_one_initcall+0x43/0x160 kernel_init_freeable+0x191/0x219 ? rest_init+0x80/0x80 kernel_init+0xe/0x150 ret_from_fork+0x22/0x30 Code: f6 46 23 04 74 86 4c 89 f7 e8 ae 45 01 00 49 89 c7 4d 85 ff 0f 85 7b ff ff ff 31 db eb 08 4c 89 ff e8 16 47 01 00 48 8b 44 24 38 <45> 8b 6e 14 4d 63 76 74 48 89 04 24 0f 1f 44 00 00 48 83 c4 08 RIP: kmem_cache_alloc+0xcf/0x1c0 RSP: ffff9b1500017c28 CR2: 0000000000000014 Fixes: 7b1a74fdbb9e ("[NETNS]: Refactor fib initialization so it can handle multiple namespaces.") Fixes: 7f9b80529b8a ("[IPV4]: fib hash|trie initialization") Signed-off-by: Mahesh Bandewar <maheshb@google.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-20rtnetlink: allocate more memory for dev_set_mac_address()WANG Cong1-1/+2
virtnet_set_mac_address() interprets mac address as struct sockaddr, but upper layer only allocates dev->addr_len which is ETH_ALEN + sizeof(sa_family_t) in this case. We lack a unified definition for mac address, so just fix the upper layer, this also allows drivers to interpret it to struct sockaddr freely. Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-20net: dsa: b53: Add missing ARL entries for BCM53125Florian Fainelli1-0/+1
The BCM53125 entry was missing an arl_entries member which would basically prevent the ARL search from terminating properly. This switch has 4 ARL entries, so add that. Fixes: 1da6df85c6fb ("net: dsa: b53: Implement ARL add/del/dump operations") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-20bpf: more tests for mixed signed and unsigned bounds checksDaniel Borkmann1-0/+418
Add a couple of more test cases to BPF selftests that are related to mixed signed and unsigned checks. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-20bpf: add test for mixed signed and unsigned bounds checksEdward Cree1-0/+52
These failed due to a bug in verifier bounds handling. Signed-off-by: Edward Cree <ecree@solarflare.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-20bpf: fix up test cases with mixed signed/unsigned boundsDaniel Borkmann1-4/+4
Fix the few existing test cases that used mixed signed/unsigned bounds and switch them only to one flavor. Reason why we need this is that proper boundaries cannot be derived from mixed tests. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-20bpf: allow to specify log level and reduce it for test_verifierDaniel Borkmann4-5/+5
For the test_verifier case, it's quite hard to parse log level 2 to figure out what's causing an issue when used to log level 1. We do want to use bpf_verify_program() in order to simulate some of the tests with strict alignment. So just add an argument to pass the level and put it to 1 for test_verifier. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-20bpf: fix mixed signed/unsigned derived min/max value boundsDaniel Borkmann2-14/+95
Edward reported that there's an issue in min/max value bounds tracking when signed and unsigned compares both provide hints on limits when having unknown variables. E.g. a program such as the following should have been rejected: 0: (7a) *(u64 *)(r10 -8) = 0 1: (bf) r2 = r10 2: (07) r2 += -8 3: (18) r1 = 0xffff8a94cda93400 5: (85) call bpf_map_lookup_elem#1 6: (15) if r0 == 0x0 goto pc+7 R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R10=fp 7: (7a) *(u64 *)(r10 -16) = -8 8: (79) r1 = *(u64 *)(r10 -16) 9: (b7) r2 = -1 10: (2d) if r1 > r2 goto pc+3 R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=0 R2=imm-1,max_value=18446744073709551615,min_align=1 R10=fp 11: (65) if r1 s> 0x1 goto pc+2 R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=0,max_value=1 R2=imm-1,max_value=18446744073709551615,min_align=1 R10=fp 12: (0f) r0 += r1 13: (72) *(u8 *)(r0 +0) = 0 R0=map_value_adj(ks=8,vs=8,id=0),min_value=0,max_value=1 R1=inv,min_value=0,max_value=1 R2=imm-1,max_value=18446744073709551615,min_align=1 R10=fp 14: (b7) r0 = 0 15: (95) exit What happens is that in the first part ... 8: (79) r1 = *(u64 *)(r10 -16) 9: (b7) r2 = -1 10: (2d) if r1 > r2 goto pc+3 ... r1 carries an unsigned value, and is compared as unsigned against a register carrying an immediate. Verifier deduces in reg_set_min_max() that since the compare is unsigned and operation is greater than (>), that in the fall-through/false case, r1's minimum bound must be 0 and maximum bound must be r2. Latter is larger than the bound and thus max value is reset back to being 'invalid' aka BPF_REGISTER_MAX_RANGE. Thus, r1 state is now 'R1=inv,min_value=0'. The subsequent test ... 11: (65) if r1 s> 0x1 goto pc+2 ... is a signed compare of r1 with immediate value 1. Here, verifier deduces in reg_set_min_max() that since the compare is signed this time and operation is greater than (>), that in the fall-through/false case, we can deduce that r1's maximum bound must be 1, meaning with prior test, we result in r1 having the following state: R1=inv,min_value=0,max_value=1. Given that the actual value this holds is -8, the bounds are wrongly deduced. When this is being added to r0 which holds the map_value(_adj) type, then subsequent store access in above case will go through check_mem_access() which invokes check_map_access_adj(), that will then probe whether the map memory is in bounds based on the min_value and max_value as well as access size since the actual unknown value is min_value <= x <= max_value; commit fce366a9dd0d ("bpf, verifier: fix alu ops against map_value{, _adj} register types") provides some more explanation on the semantics. It's worth to note in this context that in the current code, min_value and max_value tracking are used for two things, i) dynamic map value access via check_map_access_adj() and since commit 06c1c049721a ("bpf: allow helpers access to variable memory") ii) also enforced at check_helper_mem_access() when passing a memory address (pointer to packet, map value, stack) and length pair to a helper and the length in this case is an unknown value defining an access range through min_value/max_value in that case. The min_value/max_value tracking is /not/ used in the direct packet access case to track ranges. However, the issue also affects case ii), for example, the following crafted program based on the same principle must be rejected as well: 0: (b7) r2 = 0 1: (bf) r3 = r10 2: (07) r3 += -512 3: (7a) *(u64 *)(r10 -16) = -8 4: (79) r4 = *(u64 *)(r10 -16) 5: (b7) r6 = -1 6: (2d) if r4 > r6 goto pc+5 R1=ctx R2=imm0,min_value=0,max_value=0,min_align=2147483648 R3=fp-512 R4=inv,min_value=0 R6=imm-1,max_value=18446744073709551615,min_align=1 R10=fp 7: (65) if r4 s> 0x1 goto pc+4 R1=ctx R2=imm0,min_value=0,max_value=0,min_align=2147483648 R3=fp-512 R4=inv,min_value=0,max_value=1 R6=imm-1,max_value=18446744073709551615,min_align=1 R10=fp 8: (07) r4 += 1 9: (b7) r5 = 0 10: (6a) *(u16 *)(r10 -512) = 0 11: (85) call bpf_skb_load_bytes#26 12: (b7) r0 = 0 13: (95) exit Meaning, while we initialize the max_value stack slot that the verifier thinks we access in the [1,2] range, in reality we pass -7 as length which is interpreted as u32 in the helper. Thus, this issue is relevant also for the case of helper ranges. Resetting both bounds in check_reg_overflow() in case only one of them exceeds limits is also not enough as similar test can be created that uses values which are within range, thus also here learned min value in r1 is incorrect when mixed with later signed test to create a range: 0: (7a) *(u64 *)(r10 -8) = 0 1: (bf) r2 = r10 2: (07) r2 += -8 3: (18) r1 = 0xffff880ad081fa00 5: (85) call bpf_map_lookup_elem#1 6: (15) if r0 == 0x0 goto pc+7 R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R10=fp 7: (7a) *(u64 *)(r10 -16) = -8 8: (79) r1 = *(u64 *)(r10 -16) 9: (b7) r2 = 2 10: (3d) if r2 >= r1 goto pc+3 R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=3 R2=imm2,min_value=2,max_value=2,min_align=2 R10=fp 11: (65) if r1 s> 0x4 goto pc+2 R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=3,max_value=4 R2=imm2,min_value=2,max_value=2,min_align=2 R10=fp 12: (0f) r0 += r1 13: (72) *(u8 *)(r0 +0) = 0 R0=map_value_adj(ks=8,vs=8,id=0),min_value=3,max_value=4 R1=inv,min_value=3,max_value=4 R2=imm2,min_value=2,max_value=2,min_align=2 R10=fp 14: (b7) r0 = 0 15: (95) exit This leaves us with two options for fixing this: i) to invalidate all prior learned information once we switch signed context, ii) to track min/max signed and unsigned boundaries separately as done in [0]. (Given latter introduces major changes throughout the whole verifier, it's rather net-next material, thus this patch follows option i), meaning we can derive bounds either from only signed tests or only unsigned tests.) There is still the case of adjust_reg_min_max_vals(), where we adjust bounds on ALU operations, meaning programs like the following where boundaries on the reg get mixed in context later on when bounds are merged on the dst reg must get rejected, too: 0: (7a) *(u64 *)(r10 -8) = 0 1: (bf) r2 = r10 2: (07) r2 += -8 3: (18) r1 = 0xffff89b2bf87ce00 5: (85) call bpf_map_lookup_elem#1 6: (15) if r0 == 0x0 goto pc+6 R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R10=fp 7: (7a) *(u64 *)(r10 -16) = -8 8: (79) r1 = *(u64 *)(r10 -16) 9: (b7) r2 = 2 10: (3d) if r2 >= r1 goto pc+2 R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=3 R2=imm2,min_value=2,max_value=2,min_align=2 R10=fp 11: (b7) r7 = 1 12: (65) if r7 s> 0x0 goto pc+2 R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=3 R2=imm2,min_value=2,max_value=2,min_align=2 R7=imm1,max_value=0 R10=fp 13: (b7) r0 = 0 14: (95) exit from 12 to 15: R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=3 R2=imm2,min_value=2,max_value=2,min_align=2 R7=imm1,min_value=1 R10=fp 15: (0f) r7 += r1 16: (65) if r7 s> 0x4 goto pc+2 R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=3 R2=imm2,min_value=2,max_value=2,min_align=2 R7=inv,min_value=4,max_value=4 R10=fp 17: (0f) r0 += r7 18: (72) *(u8 *)(r0 +0) = 0 R0=map_value_adj(ks=8,vs=8,id=0),min_value=4,max_value=4 R1=inv,min_value=3 R2=imm2,min_value=2,max_value=2,min_align=2 R7=inv,min_value=4,max_value=4 R10=fp 19: (b7) r0 = 0 20: (95) exit Meaning, in adjust_reg_min_max_vals() we must also reset range values on the dst when src/dst registers have mixed signed/ unsigned derived min/max value bounds with one unbounded value as otherwise they can be added together deducing false boundaries. Once both boundaries are established from either ALU ops or compare operations w/o mixing signed/unsigned insns, then they can safely be added to other regs also having both boundaries established. Adding regs with one unbounded side to a map value where the bounded side has been learned w/o mixing ops is possible, but the resulting map value won't recover from that, meaning such op is considered invalid on the time of actual access. Invalid bounds are set on the dst reg in case i) src reg, or ii) in case dst reg already had them. The only way to recover would be to perform i) ALU ops but only 'add' is allowed on map value types or ii) comparisons, but these are disallowed on pointers in case they span a range. This is fine as only BPF_JEQ and BPF_JNE may be performed on PTR_TO_MAP_VALUE_OR_NULL registers which potentially turn them into PTR_TO_MAP_VALUE type depending on the branch, so only here min/max value cannot be invalidated for them. In terms of state pruning, value_from_signed is considered as well in states_equal() when dealing with adjusted map values. With regards to breaking existing programs, there is a small risk, but use-cases are rather quite narrow where this could occur and mixing compares probably unlikely. Joint work with Josef and Edward. [0] https://lists.iovisor.org/pipermail/iovisor-dev/2017-June/000822.html Fixes: 484611357c19 ("bpf: allow access into map value arrays") Reported-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-20x86: mark kprobe templates as character arrays, not single charactersLinus Torvalds1-4/+4
They really are, and the "take the address of a single character" makes the string fortification code unhappy (it believes that you can now only acccess one byte, rather than a byte range, and then raises errors for the memory copies going on in there). We could now remove a few 'addressof' operators (since arrays naturally degrade to pointers), but this is the minimal patch that just changes the C prototypes of those template arrays (the templates themselves are defined in inline asm). Reported-by: kernel test robot <xiaolong.ye@intel.com> Acked-and-tested-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-20HID: hid-logitech-hidpp: add NULL check on devm_kmemdup() return valueGustavo A. R. Silva1-0/+3
Check return value from call to devm_kmemdup() in order to prevent a NULL pointer dereference. Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-07-20debug: Fix WARN_ON_ONCE() for modulesJosh Poimboeuf9-18/+18
Mike Galbraith reported a situation where a WARN_ON_ONCE() call in DRM code turned into an oops. As it turns out, WARN_ON_ONCE() seems to be completely broken when called from a module. The bug was introduced with the following commit: 19d436268dde ("debug: Add _ONCE() logic to report_bug()") That commit changed WARN_ON_ONCE() to move its 'once' logic into the bug trap handler. It requires a writable bug table so that the BUGFLAG_DONE bit can be written to the flags to indicate the first warning has occurred. The bug table was made writable for vmlinux, which relies on vmlinux.lds.S and vmlinux.lds.h for laying out the sections. However, it wasn't made writable for modules, which rely on the ELF section header flags. Reported-by: Mike Galbraith <efault@gmx.de> Tested-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 19d436268dde ("debug: Add _ONCE() logic to report_bug()") Link: http://lkml.kernel.org/r/a53b04235a65478dd9afc51f5b329fdc65c84364.1500095401.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-20x86/platform/intel-mid: Fix a format string overflow warningArnd Bergmann1-2/+4
We have space for exactly three characters for the index in "max7315_%d_base", but as GCC points out having more would cause an string overflow: arch/x86/platform/intel-mid/device_libs/platform_max7315.c: In function 'max7315_platform_data': arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:26: error: '%d' directive writing between 1 and 11 bytes into a region of size 9 [-Werror=format-overflow=] sprintf(base_pin_name, "max7315_%d_base", nr); ^~~~~~~~~~~~~~~~~ arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:26: note: directive argument in the range [-2147483647, 2147483647] arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:3: note: 'sprintf' output between 15 and 25 bytes into a destination of size 17 sprintf(base_pin_name, "max7315_%d_base", nr); This makes it use an snprintf() to truncate the string if that happened rather than overflowing the stack. In practice, this is safe, because there won't be a large number of max7315 devices in the systems, and both the format and the length are defined by the firmware interface. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-9-arnd@arndb.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-20x86/platform: Add PCI dependency for PUNIT_ATOM_DEBUGArnd Bergmann1-0/+1
The IOSF_MBI option requires PCI support, without it we get a harmless Kconfig warning when it gets selected by PUNIT_ATOM_DEBUG: warning: (X86_INTEL_LPSS && SND_SST_IPC_ACPI && MMC_SDHCI_ACPI && PUNIT_ATOM_DEBUG) selects IOSF_MBI which has unmet direct dependencies (PCI) This adds another dependency to avoid the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-8-arnd@arndb.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-20x86/build: Silence the build with "make -s"Arnd Bergmann1-2/+3
Every kernel build on x86 will result in some output: Setup is 13084 bytes (padded to 13312 bytes). System is 4833 kB CRC 6d35fa35 Kernel: arch/x86/boot/bzImage is ready (#2) This shuts it up, so that 'make -s' is truely silent as long as everything works. Building without '-s' should produce unchanged output. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-6-arnd@arndb.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-20x86/io: Add "memory" clobber to insb/insw/insl/outsb/outsw/outslArnd Bergmann1-2/+2
The x86 version of insb/insw/insl uses an inline assembly that does not have the target buffer listed as an output. This can confuse the compiler, leading it to think that a subsequent access of the buffer is uninitialized: drivers/net/wireless/wl3501_cs.c: In function ‘wl3501_mgmt_scan_confirm’: drivers/net/wireless/wl3501_cs.c:665:9: error: ‘sig.status’ is used uninitialized in this function [-Werror=uninitialized] drivers/net/wireless/wl3501_cs.c:668:12: error: ‘sig.cap_info’ may be used uninitialized in this function [-Werror=maybe-uninitialized] drivers/net/sb1000.c: In function 'sb1000_rx': drivers/net/sb1000.c:775:9: error: 'st[0]' is used uninitialized in this function [-Werror=uninitialized] drivers/net/sb1000.c:776:10: error: 'st[1]' may be used uninitialized in this function [-Werror=maybe-uninitialized] drivers/net/sb1000.c:784:11: error: 'st[1]' may be used uninitialized in this function [-Werror=maybe-uninitialized] I tried to mark the exact input buffer as an output here, but couldn't figure it out. As suggested by Linus, marking all memory as clobbered however is good enough too. For the outs operations, I also add the memory clobber, to force the input to be written to local variables. This is probably already guaranteed by the "asm volatile", but it can't hurt to do this for symmetry. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Link: http://lkml.kernel.org/r/20170719125310.2487451-5-arnd@arndb.de Link: https://lkml.org/lkml/2017/7/12/605 Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-20x86/fpu/math-emu: Avoid bogus -Wint-in-bool-context warningArnd Bergmann1-1/+1
gcc-7.1.1 produces this warning: arch/x86/math-emu/reg_add_sub.c: In function 'FPU_add': arch/x86/math-emu/reg_add_sub.c:80:48: error: ?: using integer constants in boolean context [-Werror=int-in-bool-context] This appears to be a bug in gcc-7.1.1, and I have reported it as PR81484. The compiler suggests that code written as if (a & b ? c : d) is usually incorrect and should have been if (a & (b ? c : d)) However, in this case, we correctly write if ((a & b) ? c : d) and should not get a warning for it. This adds a dirty workaround for the problem, adding a comparison with zero inside of the macro. The warning is currently disabled in the kernel, so we may decide not to apply the patch, and instead wait for future gcc releases to fix the problem. On the other hand, it seems to be the only instance of this particular problem. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Bill Metzenthen <billm@melbpc.org.au> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-4-arnd@arndb.de Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81484 Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-20x86/fpu/math-emu: Fix possible uninitialized variable useArnd Bergmann2-10/+10
When building the kernel with "make EXTRA_CFLAGS=...", this overrides the "PARANOID" preprocessor macro defined in arch/x86/math-emu/Makefile, and we run into a build warning: arch/x86/math-emu/reg_compare.c: In function ‘compare_i_st_st’: arch/x86/math-emu/reg_compare.c:254:6: error: ‘f’ may be used uninitialized in this function [-Werror=maybe-uninitialized] This fixes the implementation to work correctly even without the PARANOID flag, and also fixes the Makefile to not use the EXTRA_CFLAGS variable but instead use the ccflags-y variable in the Makefile that is meant for this purpose. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Bill Metzenthen <billm@melbpc.org.au> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-3-arnd@arndb.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-20perf/x86: Shut up false-positive -Wmaybe-uninitialized warningArnd Bergmann1-2/+2
The intialization function checks for various failure scenarios, but unfortunately the compiler gets a little confused about the possible combinations, leading to a false-positive build warning when -Wmaybe-uninitialized is set: arch/x86/events/core.c: In function ‘init_hw_perf_events’: arch/x86/events/core.c:264:3: warning: ‘reg_fail’ may be used uninitialized in this function [-Wmaybe-uninitialized] arch/x86/events/core.c:264:3: warning: ‘val_fail’ may be used uninitialized in this function [-Wmaybe-uninitialized] pr_err(FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", We can't actually run into this case, so this shuts up the warning by initializing the variables to a known-invalid state. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-2-arnd@arndb.de Link: https://patchwork.kernel.org/patch/9392595/ Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-20x86/defconfig: Remove stale, old Kconfig optionsKrzysztof Kozlowski2-6/+0
Remove old, dead Kconfig options (in order appearing in this commit): - EXPERIMENTAL is gone since v3.9; - IP_NF_TARGET_ULOG: commit d4da843e6fad ("netfilter: kill remnants of ulog targets"); - USB_LIBUSUAL: commit f61870ee6f8c ("usb: remove libusual"); Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1500526885-4341-1-git-send-email-krzk@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-20x86/ioapic: Pass the correct data to unmask_ioapic_irq()Seunghun Han1-1/+1
One of the rarely executed code pathes in check_timer() calls unmask_ioapic_irq() passing irq_get_chip_data(0) as argument. That's wrong as unmask_ioapic_irq() expects a pointer to the irq data of interrupt 0. irq_get_chip_data(0) returns NULL, so the following dereference in unmask_ioapic_irq() causes a kernel panic. The issue went unnoticed in the first place because irq_get_chip_data() returns a void pointer so the compiler cannot do a type check on the argument. The code path was added for machines with broken configuration, but it seems that those machines are either not running current kernels or simply do not longer exist. Hand in irq_get_irq_data(0) as argument which provides the correct data. [ tglx: Rewrote changelog ] Fixes: 4467715a44cc ("x86/irq: Move irq_cfg.irq_2_pin into io_apic.c") Signed-off-by: Seunghun Han <kkamagui@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1500369644-45767-1-git-send-email-kkamagui@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-20x86/acpi: Prevent out of bound access caused by broken ACPI tablesSeunghun Han1-0/+8
The bus_irq argument of mp_override_legacy_irq() is used as the index into the isa_irq_to_gsi[] array. The bus_irq argument originates from ACPI_MADT_TYPE_IO_APIC and ACPI_MADT_TYPE_INTERRUPT items in the ACPI tables, but is nowhere sanity checked. That allows broken or malicious ACPI tables to overwrite memory, which might cause malfunction, panic or arbitrary code execution. Add a sanity check and emit a warning when that triggers. [ tglx: Added warning and rewrote changelog ] Signed-off-by: Seunghun Han <kkamagui@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: security@kernel.org Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-20perf/core: Fix scheduling regression of pinned groupsAlexander Shishkin1-0/+7
Vince Weaver reported: > I was tracking down some regressions in my perf_event_test testsuite. > Some of the tests broke in the 4.11-rc1 timeframe. > > I've bisected one of them, this report is about > tests/overflow/simul_oneshot_group_overflow > This test creates an event group containing two sampling events, set > to overflow to a signal handler (which disables and then refreshes the > event). > > On a good kernel you get the following: > Event perf::instructions with period 1000000 > Event perf::instructions with period 2000000 > fd 3 overflows: 946 (perf::instructions/1000000) > fd 4 overflows: 473 (perf::instructions/2000000) > Ending counts: > Count 0: 946379875 > Count 1: 946365218 > > With the broken kernels you get: > Event perf::instructions with period 1000000 > Event perf::instructions with period 2000000 > fd 3 overflows: 938 (perf::instructions/1000000) > fd 4 overflows: 318 (perf::instructions/2000000) > Ending counts: > Count 0: 946373080 > Count 1: 653373058 The root cause of the bug is that the following commit: 487f05e18a ("perf/core: Optimize event rescheduling on active contexts") erronously assumed that event's 'pinned' setting determines whether the event belongs to a pinned group or not, but in fact, it's the group leader's pinned state that matters. This was discovered by Vince in the test case described above, where two instruction counters are grouped, the group leader is pinned, but the other event is not; in the regressed case the counters were off by 33% (the difference between events' periods), but should be the same within the error margin. Fix the problem by looking at the group leader's pinning. Reported-by: Vince Weaver <vincent.weaver@maine.edu> Tested-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Fixes: 487f05e18a ("perf/core: Optimize event rescheduling on active contexts") Link: http://lkml.kernel.org/r/87lgnmvw7h.fsf@ashishki-desk.ger.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-19ipv6: avoid overflow of offset in ip6_find_1stfragoptSabrina Dubroca1-2/+6
In some cases, offset can overflow and can cause an infinite loop in ip6_find_1stfragopt(). Make it unsigned int to prevent the overflow, and cap it at IPV6_MAXPLEN, since packets larger than that should be invalid. This problem has been here since before the beginning of git history. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19net: tehuti: don't process data if it has not been copied from userspaceColin Ian King1-0/+2
The array data is only populated with valid information from userspace if cmd != SIOCDEVPRIVATE, other cases the array contains garbage on the stack. The subsequent switch statement acts on a subcommand in data[0] which could be any garbage value if cmd is SIOCDEVPRIVATE which seems incorrect to me. Instead, just return EOPNOTSUPP for the case where cmd == SIOCDEVPRIVATE to avoid this issue. As a side note, I suspect that the original intention of the code was for this ioctl to work just for cmd == SIOCDEVPRIVATE (and the current logic is reversed). However, I don't wont to change the current semantics in case any userspace code relies on this existing behaviour. Detected by CoverityScan, CID#139647 ("Uninitialized scalar variable") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19Revert "rtnetlink: Do not generate notifications for CHANGEADDR event"David Ahern1-0/+1
This reverts commit cd8966e75ed3c6b41a37047a904617bc44fa481f. The duplicate CHANGEADDR event message is sent regardless of link status whereas the setlink changes only generate a notification when the link is up. Not sending a notification when the link is down breaks dhcpcd which only processes hwaddr changes when the link is down. Fixes reported regression: https://bugzilla.kernel.org/show_bug.cgi?id=196355 Reported-by: Yaroslav Isakov <yaroslav.isakov@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19net: dsa: mv88e6xxx: Enable CMODE config support for 6390XMartin Hundebøll1-0/+1
Commit f39908d3b1c45 ('net: dsa: mv88e6xxx: Set the CMODE for mv88e6390 ports 9 & 10') added support for setting the CMODE for the 6390X family, but only enabled it for 9290 and 6390 - and left out 6390X. Fix support for setting the CMODE on 6390X also by assigning mv88e6390x_port_set_cmode() to the .port_set_cmode function pointer in mv88e6390x_ops too. Fixes: f39908d3b1c4 ("net: dsa: mv88e6xxx: Set the CMODE for mv88e6390 ports 9 & 10") Signed-off-by: Martin Hundebøll <mnhu@prevas.dk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19dt-binding: ptp: Add SoC compatibility strings for dte ptp clockArun Parameswaran1-4/+11
Add SoC specific compatibility strings to the Broadcom DTE based PTP clock binding document. Fixed the document heading and node name. Fixes: 80d6076140b2 ("dt-binding: ptp: add bindings document for dte based ptp clock") Signed-off-by: Arun Parameswaran <arun.parameswaran@broadcom.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19llist: clang: introduce member_address_is_nonnull()Alexander Potapenko1-2/+19
Currently llist_for_each_entry() and llist_for_each_entry_safe() iterate until &pos->member != NULL. But when building the kernel with Clang, the compiler assumes &pos->member cannot be NULL if the member's offset is greater than 0 (which would be equivalent to the object being non-contiguous in memory). Therefore the loop condition is always true, and the loops become infinite. To work around this, introduce the member_address_is_nonnull() macro, which casts object pointer to uintptr_t, thus letting the member pointer to be NULL. Signed-off-by: Alexander Potapenko <glider@google.com> Tested-by: Sodagudi Prasad <psodagud@codeaurora.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-19NET: dwmac: Make dwmac reset unconditionalEugeniy Paltsev1-1/+8
Unconditional reset dwmac before HW init if reset controller is present. In existing implementation we reset dwmac only after second module probing: (module load -> unload -> load again [reset happens]) Now we reset dwmac at every module load: (module load [reset happens] -> unload -> load again [reset happens]) Also some reset controllers have only reset callback instead of assert + deassert callbacks pair, so handle this case. Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19net: Zero terminate ifr_name in dev_ifname().David S. Miller1-0/+1
The ifr.ifr_name is passed around and assumed to be NULL terminated. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19wireless: wext: terminate ifr name coming from userspaceLevin, Alexander1-0/+2
ifr name is assumed to be a valid string by the kernel, but nothing was forcing username to pass a valid string. In turn, this would cause panics as we tried to access the string past it's valid memory. Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19audit: fix memleak in auditd_send_unicast_skb.Shu Wang1-0/+1
Found this issue by kmemleak report, auditd_send_unicast_skb did not free skb if rcu_dereference(auditd_conn) returns null. unreferenced object 0xffff88082568ce00 (size 256): comm "auditd", pid 1119, jiffies 4294708499 backtrace: [<ffffffff8176166a>] kmemleak_alloc+0x4a/0xa0 [<ffffffff8121820c>] kmem_cache_alloc_node+0xcc/0x210 [<ffffffff8161b99d>] __alloc_skb+0x5d/0x290 [<ffffffff8113c614>] audit_make_reply+0x54/0xd0 [<ffffffff8113dfa7>] audit_receive_msg+0x967/0xd70 ---------------- (gdb) list *audit_receive_msg+0x967 0xffffffff8113dff7 is in audit_receive_msg (kernel/audit.c:1133). 1132 skb = audit_make_reply(0, AUDIT_REPLACE, 0, 0, &pvnr, sizeof(pvnr)); --------------- [<ffffffff8113e402>] audit_receive+0x52/0xa0 [<ffffffff8166c561>] netlink_unicast+0x181/0x240 [<ffffffff8166c8e2>] netlink_sendmsg+0x2c2/0x3b0 [<ffffffff816112e8>] sock_sendmsg+0x38/0x50 [<ffffffff816117a2>] SYSC_sendto+0x102/0x190 [<ffffffff81612f4e>] SyS_sendto+0xe/0x10 [<ffffffff8176d337>] entry_SYSCALL_64_fastpath+0x1a/0xa5 [<ffffffffffffffff>] 0xffffffffffffffff Signed-off-by: Shu Wang <shuwang@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-07-19PM / Domains: defer dev_pm_domain_set() until genpd->attach_dev succeeds if presentSudeep Holla1-4/+4
If the genpd->attach_dev or genpd->power_on fails, genpd_dev_pm_attach may return -EPROBE_DEFER initially. However genpd_alloc_dev_data sets the PM domain for the device unconditionally. When subsequent attempts are made to call genpd_dev_pm_attach, it may return -EEXISTS checking dev->pm_domain without re-attempting to call attach_dev or power_on. platform_drv_probe then attempts to call drv->probe as the return value -EEXIST != -EPROBE_DEFER, which may end up in a situation where the device is accessed without it's power domain switched on. Fixes: f104e1e5ef57 (PM / Domains: Re-order initialization of generic_pm_domain_data) Cc: 4.4+ <stable@vger.kernel.org> # v4.4+ Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-18device-dax: fix sysfs duplicate warningsDan Williams3-14/+24
Fix warnings of the form... WARNING: CPU: 10 PID: 4983 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80 sysfs: cannot create duplicate filename '/class/dax/dax12.0' Call Trace: dump_stack+0x63/0x86 __warn+0xcb/0xf0 warn_slowpath_fmt+0x5a/0x80 ? kernfs_path_from_node+0x4f/0x60 sysfs_warn_dup+0x62/0x80 sysfs_do_create_link_sd.isra.2+0x97/0xb0 sysfs_create_link+0x25/0x40 device_add+0x266/0x630 devm_create_dax_dev+0x2cf/0x340 [dax] dax_pmem_probe+0x1f5/0x26e [dax_pmem] nvdimm_bus_probe+0x71/0x120 ...by reusing the namespace id for the device-dax instance name. Now that we have decided that there will never by more than one device-dax instance per libnvdimm-namespace parent device [1], we can directly reuse the namepace ids. There are some possible follow-on cleanups, but those are saved for a later patch to simplify the -stable backport. [1]: https://lists.01.org/pipermail/linux-nvdimm/2016-December/008266.html Fixes: 98a29c39dc68 ("libnvdimm, namespace: allow creation of multiple pmem...") Cc: Jeff Moyer <jmoyer@redhat.com> Cc: <stable@vger.kernel.org> Reported-by: Dariusz Dokupil <dariusz.dokupil@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-07-18netfilter: fix netfilter_net_init() returnDan Carpenter1-2/+2
We accidentally return an uninitialized variable. Fixes: cf56c2f892a8 ("netfilter: remove old pre-netns era hook api") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18irqchip/digicolor: Drop unnecessary staticJulia Lawall1-1/+1
Drop static on a local variable, when the variable is initialized before any possible use. Thus, the static has no benefit. The semantic patch that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @bad exists@ position p; identifier x; type T; @@ static T x@p; ... x = <+...x...+> @@ identifier x; expression e; type T; position p != bad.p; @@ -static T x@p; ... when != x when strict ?x = e; // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Baruch Siach <baruch@tkos.co.il> Cc: keescook@chromium.org Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: kernel-janitors@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: Jason Cooper <jason@lakedaemon.net> Link: http://lkml.kernel.org/r/1500149266-32357-11-git-send-email-Julia.Lawall@lip6.fr
2017-07-18irqchip/mips-cpu: Drop unnecessary staticJulia Lawall1-1/+1
Drop static on a local variable, when the variable is initialized before any possible use. Thus, the static has no benefit. The semantic patch that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @bad exists@ position p; identifier x; type T; @@ static T x@p; ... x = <+...x...+> @@ identifier x; expression e; type T; position p != bad.p; @@ -static T x@p; ... when != x when strict ?x = e; // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: kernel-janitors@vger.kernel.org Cc: keescook@chromium.org Cc: Jason Cooper <jason@lakedaemon.net> Link: http://lkml.kernel.org/r/1500149266-32357-7-git-send-email-Julia.Lawall@lip6.fr
2017-07-18irqchip/gic/realview: Drop unnecessary staticJulia Lawall1-1/+1
Drop static on a local variable, when the variable is initialized before any possible use. Thus, the static has no benefit. The semantic patch that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @bad exists@ position p; identifier x; type T; @@ static T x@p; ... x = <+...x...+> @@ identifier x; expression e; type T; position p != bad.p; @@ -static T x@p; ... when != x when strict ?x = e; // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: kernel-janitors@vger.kernel.org Cc: keescook@chromium.org Cc: Jason Cooper <jason@lakedaemon.net> Link: http://lkml.kernel.org/r/1500149266-32357-6-git-send-email-Julia.Lawall@lip6.fr
2017-07-18udp: preserve skb->dst if required for IP options processingPaolo Abeni1-2/+11
Eric noticed that in udp_recvmsg() we still need to access skb->dst while processing the IP options. Since commit 0a463c78d25b ("udp: avoid a cache miss on dequeue") skb->dst is no more available at recvmsg() time and bad things will happen if we enter the relevant code path. This commit address the issue, avoid clearing skb->dst if any IP options are present into the relevant skb. Since the IP CB is contained in the first skb cacheline, we can test it to decide to leverage the consume_stateless_skb() optimization, without measurable additional cost in the faster path. v1 -> v2: updated commit message tags Fixes: 0a463c78d25b ("udp: avoid a cache miss on dequeue") Reported-by: Andrey Konovalov <andreyknvl@google.com> Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18atm: zatm: Fix an error handling path in 'zatm_init_one()'Christophe Jaillet1-1/+1
If 'dma_set_mask_and_coherent()' fails, we must undo the previous 'pci_request_regions()' call. Adjust corresponding 'goto' to jump at the right place of the error handling path. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18ipv4: ipv6: initialize treq->txhash in cookie_v[46]_check()Alexander Potapenko2-0/+2
KMSAN reported use of uninitialized memory in skb_set_hash_from_sk(), which originated from the TCP request socket created in cookie_v6_check(): ================================================================== BUG: KMSAN: use of uninitialized memory in tcp_transmit_skb+0xf77/0x3ec0 CPU: 1 PID: 2949 Comm: syz-execprog Not tainted 4.11.0-rc5+ #2931 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 TCP: request_sock_TCPv6: Possible SYN flooding on port 20028. Sending cookies. Check SNMP counters. Call Trace: <IRQ> __dump_stack lib/dump_stack.c:16 dump_stack+0x172/0x1c0 lib/dump_stack.c:52 kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:927 __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:469 skb_set_hash_from_sk ./include/net/sock.h:2011 tcp_transmit_skb+0xf77/0x3ec0 net/ipv4/tcp_output.c:983 tcp_send_ack+0x75b/0x830 net/ipv4/tcp_output.c:3493 tcp_delack_timer_handler+0x9a6/0xb90 net/ipv4/tcp_timer.c:284 tcp_delack_timer+0x1b0/0x310 net/ipv4/tcp_timer.c:309 call_timer_fn+0x240/0x520 kernel/time/timer.c:1268 expire_timers kernel/time/timer.c:1307 __run_timers+0xc13/0xf10 kernel/time/timer.c:1601 run_timer_softirq+0x36/0xa0 kernel/time/timer.c:1614 __do_softirq+0x485/0x942 kernel/softirq.c:284 invoke_softirq kernel/softirq.c:364 irq_exit+0x1fa/0x230 kernel/softirq.c:405 exiting_irq+0xe/0x10 ./arch/x86/include/asm/apic.h:657 smp_apic_timer_interrupt+0x5a/0x80 arch/x86/kernel/apic/apic.c:966 apic_timer_interrupt+0x86/0x90 arch/x86/entry/entry_64.S:489 RIP: 0010:native_restore_fl ./arch/x86/include/asm/irqflags.h:36 RIP: 0010:arch_local_irq_restore ./arch/x86/include/asm/irqflags.h:77 RIP: 0010:__msan_poison_alloca+0xed/0x120 mm/kmsan/kmsan_instr.c:440 RSP: 0018:ffff880024917cd8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10 RAX: 0000000000000246 RBX: ffff8800224c0000 RCX: 0000000000000005 RDX: 0000000000000004 RSI: ffff880000000000 RDI: ffffea0000b6d770 RBP: ffff880024917d58 R08: 0000000000000dd8 R09: 0000000000000004 R10: 0000160000000000 R11: 0000000000000000 R12: ffffffff85abf810 R13: ffff880024917dd8 R14: 0000000000000010 R15: ffffffff81cabde4 </IRQ> poll_select_copy_remaining+0xac/0x6b0 fs/select.c:293 SYSC_select+0x4b4/0x4e0 fs/select.c:653 SyS_select+0x76/0xa0 fs/select.c:634 entry_SYSCALL_64_fastpath+0x13/0x94 arch/x86/entry/entry_64.S:204 RIP: 0033:0x4597e7 RSP: 002b:000000c420037ee0 EFLAGS: 00000246 ORIG_RAX: 0000000000000017 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004597e7 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 000000c420037ef0 R08: 000000c420037ee0 R09: 0000000000000059 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000042dc20 R13: 00000000000000f3 R14: 0000000000000030 R15: 0000000000000003 chained origin: save_stack_trace+0x37/0x40 arch/x86/kernel/stacktrace.c:59 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302 kmsan_save_stack mm/kmsan/kmsan.c:317 kmsan_internal_chain_origin+0x12a/0x1f0 mm/kmsan/kmsan.c:547 __msan_store_shadow_origin_4+0xac/0x110 mm/kmsan/kmsan_instr.c:259 tcp_create_openreq_child+0x709/0x1ae0 net/ipv4/tcp_minisocks.c:472 tcp_v6_syn_recv_sock+0x7eb/0x2a30 net/ipv6/tcp_ipv6.c:1103 tcp_get_cookie_sock+0x136/0x5f0 net/ipv4/syncookies.c:212 cookie_v6_check+0x17a9/0x1b50 net/ipv6/syncookies.c:245 tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:989 tcp_v6_do_rcv+0xdd8/0x1c60 net/ipv6/tcp_ipv6.c:1298 tcp_v6_rcv+0x41a3/0x4f00 net/ipv6/tcp_ipv6.c:1487 ip6_input_finish+0x82f/0x1ee0 net/ipv6/ip6_input.c:279 NF_HOOK ./include/linux/netfilter.h:257 ip6_input+0x239/0x290 net/ipv6/ip6_input.c:322 dst_input ./include/net/dst.h:492 ip6_rcv_finish net/ipv6/ip6_input.c:69 NF_HOOK ./include/linux/netfilter.h:257 ipv6_rcv+0x1dbd/0x22e0 net/ipv6/ip6_input.c:203 __netif_receive_skb_core+0x2f6f/0x3a20 net/core/dev.c:4208 __netif_receive_skb net/core/dev.c:4246 process_backlog+0x667/0xba0 net/core/dev.c:4866 napi_poll net/core/dev.c:5268 net_rx_action+0xc95/0x1590 net/core/dev.c:5333 __do_softirq+0x485/0x942 kernel/softirq.c:284 origin: save_stack_trace+0x37/0x40 arch/x86/kernel/stacktrace.c:59 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302 kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:198 kmsan_kmalloc+0x7f/0xe0 mm/kmsan/kmsan.c:337 kmem_cache_alloc+0x1c2/0x1e0 mm/slub.c:2766 reqsk_alloc ./include/net/request_sock.h:87 inet_reqsk_alloc+0xa4/0x5b0 net/ipv4/tcp_input.c:6200 cookie_v6_check+0x4f4/0x1b50 net/ipv6/syncookies.c:169 tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:989 tcp_v6_do_rcv+0xdd8/0x1c60 net/ipv6/tcp_ipv6.c:1298 tcp_v6_rcv+0x41a3/0x4f00 net/ipv6/tcp_ipv6.c:1487 ip6_input_finish+0x82f/0x1ee0 net/ipv6/ip6_input.c:279 NF_HOOK ./include/linux/netfilter.h:257 ip6_input+0x239/0x290 net/ipv6/ip6_input.c:322 dst_input ./include/net/dst.h:492 ip6_rcv_finish net/ipv6/ip6_input.c:69 NF_HOOK ./include/linux/netfilter.h:257 ipv6_rcv+0x1dbd/0x22e0 net/ipv6/ip6_input.c:203 __netif_receive_skb_core+0x2f6f/0x3a20 net/core/dev.c:4208 __netif_receive_skb net/core/dev.c:4246 process_backlog+0x667/0xba0 net/core/dev.c:4866 napi_poll net/core/dev.c:5268 net_rx_action+0xc95/0x1590 net/core/dev.c:5333 __do_softirq+0x485/0x942 kernel/softirq.c:284 ================================================================== Similar error is reported for cookie_v4_check(). Fixes: 58d607d3e52f ("tcp: provide skb->hash to synack packets") Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18ppp: Fix false xmit recursion detect with two ppp devicesGao Feng1-9/+21
The global percpu variable ppp_xmit_recursion is used to detect the ppp xmit recursion to avoid the deadlock, which is caused by one CPU tries to lock the xmit lock twice. But it would report false recursion when one CPU wants to send the skb from two different PPP devices, like one L2TP on the PPPoE. It is a normal case actually. Now use one percpu member of struct ppp instead of the gloable variable to detect the xmit recursion of one ppp device. Fixes: 55454a565836 ("ppp: avoid dealock on recursive xmit") Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Signed-off-by: Liu Jianying <jianying.liu@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>