aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2017-10-31userns: Don't read extents twice in m_startEric W. Biederman1-2/+4
This is important so reading /proc/<pid>/{uid_map,gid_map,projid_map} while the map is being written does not do strange things. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-10-31userns: Simplify the user and group mapping functionsEric W. Biederman1-74/+58
Consolidate reading the number of extents and computing the return value in the map_id_down, map_id_range_down and map_id_range. This removal of one read of extents makes one smp_rmb unnecessary and makes the code safe it is executed during the map write. Reading the number of extents twice and depending on the result being the same is not safe, as it could be 0 the first time and > 5 the second time, which would lead to misinterpreting the union fields. The consolidation of the return value just removes a duplicate caluculation which should make it easier to understand and maintain the code. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-10-31userns: Don't special case a count of 0Eric W. Biederman1-7/+3
We can always use a count of 1 so there is no reason to have a special case of a count of 0. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2017-10-31userns: bump idmap limits to 340Christian Brauner2-33/+324
There are quite some use cases where users run into the current limit for {g,u}id mappings. Consider a user requesting us to map everything but 999, and 1001 for a given range of 1000000000 with a sub{g,u}id layout of: some-user:100000:1000000000 some-user:999:1 some-user:1000:1 some-user:1001:1 some-user:1002:1 This translates to: MAPPING-TYPE | CONTAINER | HOST | RANGE | -------------|-----------|---------|-----------| uid | 999 | 999 | 1 | uid | 1001 | 1001 | 1 | uid | 0 | 1000000 | 999 | uid | 1000 | 1001000 | 1 | uid | 1002 | 1001002 | 999998998 | ------------------------------------------------ gid | 999 | 999 | 1 | gid | 1001 | 1001 | 1 | gid | 0 | 1000000 | 999 | gid | 1000 | 1001000 | 1 | gid | 1002 | 1001002 | 999998998 | which is already the current limit. As discussed at LPC simply bumping the number of limits is not going to work since this would mean that struct uid_gid_map won't fit into a single cache-line anymore thereby regressing performance for the base-cases. The same problem seems to arise when using a single pointer. So the idea is to use struct uid_gid_extent { u32 first; u32 lower_first; u32 count; }; struct uid_gid_map { /* 64 bytes -- 1 cache line */ u32 nr_extents; union { struct uid_gid_extent extent[UID_GID_MAP_MAX_BASE_EXTENTS]; struct { struct uid_gid_extent *forward; struct uid_gid_extent *reverse; }; }; }; For the base cases we will only use the struct uid_gid_extent extent member. If we go over UID_GID_MAP_MAX_BASE_EXTENTS mappings we perform a single 4k kmalloc() which means we can have a maximum of 340 mappings (340 * size(struct uid_gid_extent) = 4080). For the latter case we use two pointers "forward" and "reverse". The forward pointer points to an array sorted by "first" and the reverse pointer points to an array sorted by "lower_first". We can then perform binary search on those arrays. Performance Testing: When Eric introduced the extent-based struct uid_gid_map approach he measured the performanc impact of his idmap changes: > My benchmark consisted of going to single user mode where nothing else was > running. On an ext4 filesystem opening 1,000,000 files and looping through all > of the files 1000 times and calling fstat on the individuals files. This was > to ensure I was benchmarking stat times where the inodes were in the kernels > cache, but the inode values were not in the processors cache. My results: > v3.4-rc1: ~= 156ns (unmodified v3.4-rc1 with user namespace support disabled) > v3.4-rc1-userns-: ~= 155ns (v3.4-rc1 with my user namespace patches and user namespace support disabled) > v3.4-rc1-userns+: ~= 164ns (v3.4-rc1 with my user namespace patches and user namespace support enabled) I used an identical approach on my laptop. Here's a thorough description of what I did. I built a 4.14.0-rc4 mainline kernel with my new idmap patches applied. I booted into single user mode and used an ext4 filesystem to open/create 1,000,000 files. Then I looped through all of the files calling fstat() on each of them 1000 times and calculated the mean fstat() time for a single file. (The test program can be found below.) Here are the results. For fun, I compared the first version of my patch which scaled linearly with the new version of the patch: | # MAPPINGS | PATCH-V1 | PATCH-NEW | |--------------|------------|-----------| | 0 mappings | 158 ns | 158 ns | | 1 mappings | 164 ns | 157 ns | | 2 mappings | 170 ns | 158 ns | | 3 mappings | 175 ns | 161 ns | | 5 mappings | 187 ns | 165 ns | | 10 mappings | 218 ns | 199 ns | | 50 mappings | 528 ns | 218 ns | | 100 mappings | 980 ns | 229 ns | | 200 mappings | 1880 ns | 239 ns | | 300 mappings | 2760 ns | 240 ns | | 340 mappings | not tested | 248 ns | Here's the test program I used. I asked Eric what he did and this is a more "advanced" implementation of the idea. It's pretty straight-forward: #define __GNU_SOURCE #define __STDC_FORMAT_MACROS #include <errno.h> #include <dirent.h> #include <fcntl.h> #include <inttypes.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <sys/stat.h> #include <sys/time.h> #include <sys/types.h> int main(int argc, char *argv[]) { int ret; size_t i, k; int fd[1000000]; int times[1000]; char pathname[4096]; struct stat st; struct timeval t1, t2; uint64_t time_in_mcs; uint64_t sum = 0; if (argc != 2) { fprintf(stderr, "Please specify a directory where to create " "the test files\n"); exit(EXIT_FAILURE); } for (i = 0; i < sizeof(fd) / sizeof(fd[0]); i++) { sprintf(pathname, "%s/idmap_test_%zu", argv[1], i); fd[i]= open(pathname, O_RDWR | O_CREAT, S_IXUSR | S_IXGRP | S_IXOTH); if (fd[i] < 0) { ssize_t j; for (j = i; j >= 0; j--) close(fd[j]); exit(EXIT_FAILURE); } } for (k = 0; k < 1000; k++) { ret = gettimeofday(&t1, NULL); if (ret < 0) goto close_all; for (i = 0; i < sizeof(fd) / sizeof(fd[0]); i++) { ret = fstat(fd[i], &st); if (ret < 0) goto close_all; } ret = gettimeofday(&t2, NULL); if (ret < 0) goto close_all; time_in_mcs = (1000000 * t2.tv_sec + t2.tv_usec) - (1000000 * t1.tv_sec + t1.tv_usec); printf("Total time in micro seconds: %" PRIu64 "\n", time_in_mcs); printf("Total time in nanoseconds: %" PRIu64 "\n", time_in_mcs * 1000); printf("Time per file in nanoseconds: %" PRIu64 "\n", (time_in_mcs * 1000) / 1000000); times[k] = (time_in_mcs * 1000) / 1000000; } close_all: for (i = 0; i < sizeof(fd) / sizeof(fd[0]); i++) close(fd[i]); if (ret < 0) exit(EXIT_FAILURE); for (k = 0; k < 1000; k++) { sum += times[k]; } printf("Mean time per file in nanoseconds: %" PRIu64 "\n", sum / 1000); exit(EXIT_SUCCESS);; } Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> CC: Serge Hallyn <serge@hallyn.com> CC: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2017-10-31userns: use union in {g,u}idmap structChristian Brauner2-17/+31
- Add a struct containing two pointer to extents and wrap both the static extent array and the struct into a union. This is done in preparation for bumping the {g,u}idmap limits for user namespaces. - Add brackets around anonymous union when using designated initializers to initialize members in order to please gcc <= 4.4. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2017-09-24Linux 4.14-rc2Linus Torvalds1-1/+1
2017-09-23tpm: ibmvtpm: simplify crq initialization and document crq formatMichal Suchanek1-36/+60
The crq is passed in registers and is the same on BE and LE hosts. However, current implementation allocates a structure on-stack to represent the crq, initializes the members swapping them to BE, and loads the structure swapping it from BE. This is pointless and causes GCC warnings about ununitialized members. Get rid of the structure and the warnings. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-09-23tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic driversHamza Attak5-14/+21
The patch simply replaces all msleep function calls with usleep_range calls in the generic drivers. Tested with an Infineon TPM 1.2, using the generic tpm-tis module, for a thousand PCR extends, we see results going from 1m57s unpatched to 40s with the new patch. We obtain similar results when using the original and patched tpm_infineon driver, which is also part of the patch. Similarly with a STM TPM 2.0, using the CRB driver, it takes about 20ms per extend unpatched and around 7ms with the new patch. Note that the PCR consistency is untouched with this patch, each TPM has been tested with 10 million extends and the aggregated PCR value is continuously verified to be correct. As an extension of this work, this could potentially and easily be applied to other vendor's drivers. Still, these changes are not included in the proposed patch as they are untested. Signed-off-by: Hamza Attak <hamza@hpe.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-09-23Documentation: tpm: add powered-while-suspended binding documentationEnric Balletbo i Serra1-0/+6
Add a new powered-while-suspended property to control the behavior of the TPM suspend/resume. Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Signed-off-by: Sonny Rao <sonnyrao@chromium.org> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-09-23tpm: tpm_crb: constify acpi_device_id.Arvind Yadav1-1/+1
acpi_device_id are not supposed to change at runtime. All functions working with acpi_device_id provided by <acpi/acpi_bus.h> work with const acpi_device_id. So mark the non-const structs as const. File size before: text data bss dec hex filename 4198 608 0 4806 12c6 drivers/char/tpm/tpm_crb.o File size After adding 'const': text data bss dec hex filename 4262 520 0 4782 12ae drivers/char/tpm/tpm_crb.o Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-09-23tpm: vtpm: constify vio_device_idArvind Yadav1-1/+1
vio_device_id are not supposed to change at runtime. All functions working with vio_device_id provided by <asm/vio.h> work with const vio_device_id. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-09-23security: fix description of values returned by cap_inode_need_killprivStefan Berger1-3/+3
cap_inode_need_killpriv returns 1 if security.capability exists and has a value and inode_killpriv() is required, 0 otherwise. Fix the description of the return value to reflect this. Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-09-23x86/asm: Fix inline asm call constraints for ClangJosh Poimboeuf13-45/+42
For inline asm statements which have a CALL instruction, we list the stack pointer as a constraint to convince GCC to ensure the frame pointer is set up first: static inline void foo() { register void *__sp asm(_ASM_SP); asm("call bar" : "+r" (__sp)) } Unfortunately, that pattern causes Clang to corrupt the stack pointer. The fix is easy: convert the stack pointer register variable to a global variable. It should be noted that the end result is different based on the GCC version. With GCC 6.4, this patch has exactly the same result as before: defconfig defconfig-nofp distro distro-nofp before 9820389 9491555 8816046 8516940 after 9820389 9491555 8816046 8516940 With GCC 7.2, however, GCC's behavior has changed. It now changes its behavior based on the conversion of the register variable to a global. That somehow convinces it to *always* set up the frame pointer before inserting *any* inline asm. (Therefore, listing the variable as an output constraint is a no-op and is no longer necessary.) It's a bit overkill, but the performance impact should be negligible. And in fact, there's a nice improvement with frame pointers disabled: defconfig defconfig-nofp distro distro-nofp before 9796316 9468236 9076191 8790305 after 9796957 9464267 9076381 8785949 So in summary, while listing the stack pointer as an output constraint is no longer necessary for newer versions of GCC, it's still needed for older versions. Suggested-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reported-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/3db862e970c432ae823cf515c52b54fec8270e0e.1505942196.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-23objtool: Handle another GCC stack pointer adjustment bugJosh Poimboeuf2-17/+32
The kbuild bot reported the following warning with GCC 4.4 and a randconfig: net/socket.o: warning: objtool: compat_sock_ioctl()+0x1083: stack state mismatch: cfa1=7+160 cfa2=-1+0 This is caused by another GCC non-optimization, where it backs up and restores the stack pointer for no apparent reason: 2f91: 48 89 e0 mov %rsp,%rax 2f94: 4c 89 e7 mov %r12,%rdi 2f97: 4c 89 f6 mov %r14,%rsi 2f9a: ba 20 00 00 00 mov $0x20,%edx 2f9f: 48 89 c4 mov %rax,%rsp This issue would have been happily ignored before the following commit: dd88a0a0c861 ("objtool: Handle GCC stack pointer adjustment bug") But now that objtool is paying attention to such stack pointer writes to/from a register, it needs to understand them properly. In this case that means recognizing that the "mov %rsp, %rax" instruction is potentially a backup of the stack pointer. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: dd88a0a0c861 ("objtool: Handle GCC stack pointer adjustment bug") Link: http://lkml.kernel.org/r/8c7aa8e9a36fbbb6655d9d8e7cea58958c912da8.1505942196.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-22inet: fix improper empty comparisonJosef Bacik1-1/+1
When doing my reuseport rework I screwed up and changed a if (hlist_empty(&tb->owners)) to if (!hlist_empty(&tb->owners)) This is obviously bad as all of the reuseport/reuse logic was reversed, which caused weird problems like allowing an ipv4 bind conflict if we opened an ipv4 only socket on a port followed by an ipv6 only socket on the same port. Fixes: b9470c27607b ("inet: kill smallest_size and smallest_port") Reported-by: Cole Robinson <crobinso@redhat.com> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-22net: use inet6_rcv_saddr to compare socketsJosef Bacik1-1/+1
In ipv6_rcv_saddr_equal() we need to use inet6_rcv_saddr(sk) for the ipv6 compare with the fast socket information to make sure we're doing the proper comparisons. Fixes: 637bc8bbe6c0 ("inet: reset tb->fastreuseport when adding a reuseport sk") Reported-and-tested-by: Cole Robinson <crobinso@redhat.com> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-22net: set tb->fast_sk_familyJosef Bacik1-0/+2
We need to set the tb->fast_sk_family properly so we can use the proper comparison function for all subsequent reuseport bind requests. Fixes: 637bc8bbe6c0 ("inet: reset tb->fastreuseport when adding a reuseport sk") Reported-and-tested-by: Cole Robinson <crobinso@redhat.com> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-22net: orphan frags on stand-alone ptype in dev_queue_xmit_nitWillem de Bruijn1-2/+6
Zerocopy skbs frags are copied when the skb is looped to a local sock. Commit 1080e512d44d ("net: orphan frags on receive") introduced calls to skb_orphan_frags to deliver_skb and __netif_receive_skb for this. With msg_zerocopy, these skbs can also exist in the tx path and thus loop from dev_queue_xmit_nit. This already calls deliver_skb in its loop. But it does not orphan before a separate pt_prev->func(). Add the missing skb_orphan_frags_rx. Changes v1->v2: handle skb_orphan_frags_rx failure Fixes: 1f8b977ab32d ("sock: enable MSG_ZEROCOPY") Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-22MAINTAINERS: update git tree locations for ieee802154 subsystemStefan Schmidt1-2/+2
Patches for ieee802154 will go through my new trees towards netdev from now on. The 6LoWPAN subsystem will stay as is (shared between ieee802154 and bluetooth) and go through the bluetooth tree as usual. Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-22SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flagsSteve French1-0/+7
Signed-off-by: Steve French <smfrench@gmail.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2017-09-22SMB3: handle new statx fieldsSteve French1-0/+15
We weren't returning the creation time or the two easily supported attributes (ENCRYPTED or COMPRESSED) for the getattr call to allow statx to return these fields. Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>\ Acked-by: Jeff Layton <jlayton@poochiereds.net> CC: Stable <stable@vger.kernel.org> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2017-09-22arch: remove unused *_segments() macros/functionsTobias Klauser10-51/+0
Some architectures define the no-op macros/functions copy_segments, release_segments and forget_segments. These are used nowhere in the tree, so removed them. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Vineet Gupta <vgupta@synopsys.com> [for arch/arc] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-22parisc: Unbreak bootloader due to gcc-7 optimizationsHelge Deller2-2/+3
gcc-7 optimizes the byte-wise accesses of get_unaligned_le32() into word-wise accesses if the 32-bit integer output_len is declared as external. This panics then the bootloader since we don't have the unaligned access fault trap handler installed during boot time. Avoid this optimization by declaring output_len as byte-aligned and thus unbreak the bootloader code. Additionally, compile the boot code optimized for size. Signed-off-by: Helge Deller <deller@gmx.de>
2017-09-22parisc: Reintroduce option to gzip-compress the kernelHelge Deller2-0/+17
By adding the feature to build the kernel as self-extracting executeable, the possibility to simply compress the kernel with gzip was lost. This patch now reintroduces this possibilty again and leaves it up to the user to decide how the kernel should be built. The palo bootloader is able to natively load both formats. Signed-off-by: Helge Deller <deller@gmx.de>
2017-09-22apparmor: fix apparmorfs DAC access permissionsJohn Johansen1-4/+4
The DAC access permissions for several apparmorfs files are wrong. .access - needs to be writable by all tasks to perform queries the others in the set only provide a read fn so should be read only. With policy namespace virtualization all apparmor needs to control the permission and visibility checks directly which means DAC access has to be allowed for all user, group, and other. BugLink: http://bugs.launchpad.net/bugs/1713103 Fixes: c97204baf840b ("apparmor: rename apparmor file fns and data to indicate use") Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22apparmor: fix build failure on sparc caused by undeclared signalsJohn Johansen1-1/+4
In file included from security/apparmor/ipc.c:23:0: security/apparmor/include/sig_names.h:26:3: error: 'SIGSTKFLT' undeclared here (not in a function) [SIGSTKFLT] = 16, /* -, 16, - */ ^ security/apparmor/include/sig_names.h:26:3: error: array index in initializer not of integer type security/apparmor/include/sig_names.h:26:3: note: (near initialization for 'sig_map') security/apparmor/include/sig_names.h:51:3: error: 'SIGUNUSED' undeclared here (not in a function) [SIGUNUSED] = 34, /* -, 31, - */ ^ security/apparmor/include/sig_names.h:51:3: error: array index in initializer not of integer type security/apparmor/include/sig_names.h:51:3: note: (near initialization for 'sig_map') Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: c6bf1adaecaa ("apparmor: add the ability to mediate signals") Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22apparmor: fix incorrect type assignment when freeing proxiesJohn Johansen1-1/+1
sparse reports poisoning the proxy->label before freeing the struct is resulting in a sparse build warning. ../security/apparmor/label.c:52:30: warning: incorrect type in assignment (different address spaces) ../security/apparmor/label.c:52:30: expected struct aa_label [noderef] <asn:4>*label ../security/apparmor/label.c:52:30: got struct aa_label *<noident> fix with RCU_INIT_POINTER as this is one of those cases where rcu_assign_pointer() is not needed. Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22apparmor: ensure unconfined profiles have dfas initializedJohn Johansen1-0/+2
Generally unconfined has early bailout tests and does not need the dfas initialized, however if an early bailout test is ever missed it will result in an oops. Be defensive and initialize the unconfined profile to have null dfas (no permission) so if an early bailout test is missed we fail closed (no perms granted) instead of oopsing. Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22apparmor: fix race condition in null profile creationJohn Johansen1-3/+11
There is a race when null- profile is being created between the initial lookup/creation of the profile and lock/addition of the profile. This could result in multiple version of a profile being added to the list which need to be removed/replaced. Since these are learning profile their is no affect on mediation. Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22apparmor: move new_null_profile to after profile lookup fns()John Johansen1-79/+79
new_null_profile will need to use some of the profile lookup fns() so move instead of doing forward fn declarations. Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22apparmor: add base infastructure for socket mediationJohn Johansen12-16/+840
Provide a basic mediation of sockets. This is not a full net mediation but just whether a spcific family of socket can be used by an application, along with setting up some basic infrastructure for network mediation to follow. the user space rule hav the basic form of NETWORK RULE = [ QUALIFIERS ] 'network' [ DOMAIN ] [ TYPE | PROTOCOL ] DOMAIN = ( 'inet' | 'ax25' | 'ipx' | 'appletalk' | 'netrom' | 'bridge' | 'atmpvc' | 'x25' | 'inet6' | 'rose' | 'netbeui' | 'security' | 'key' | 'packet' | 'ash' | 'econet' | 'atmsvc' | 'sna' | 'irda' | 'pppox' | 'wanpipe' | 'bluetooth' | 'netlink' | 'unix' | 'rds' | 'llc' | 'can' | 'tipc' | 'iucv' | 'rxrpc' | 'isdn' | 'phonet' | 'ieee802154' | 'caif' | 'alg' | 'nfc' | 'vsock' | 'mpls' | 'ib' | 'kcm' ) ',' TYPE = ( 'stream' | 'dgram' | 'seqpacket' | 'rdm' | 'raw' | 'packet' ) PROTOCOL = ( 'tcp' | 'udp' | 'icmp' ) eg. network, network inet, Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Seth Arnold <seth.arnold@canonical.com>
2017-09-22apparmor: add more debug asserts to apparmorfsJohn Johansen1-0/+17
Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Seth Arnold <seth.arnold@canonical.com>
2017-09-22apparmor: make policy_unpack able to audit different info messagesJohn Johansen2-16/+40
Switch unpack auditing to using the generic name field in the audit struct and make it so we can start adding new info messages about why an unpack failed. Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Seth Arnold <seth.arnold@canonical.com>
2017-09-22apparmor: add support for absolute root view based labelsJohn Johansen2-1/+10
With apparmor policy virtualization based on policy namespace View's we don't generally want/need absolute root based views, however there are cases like debugging and some secid based conversions where using a root based view is important. Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Seth Arnold <seth.arnold@canonical.com>
2017-09-22apparmor: cleanup conditional check for label in label_printJohn Johansen1-14/+8
Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Seth Arnold <seth.arnold@canonical.com>
2017-09-22apparmor: add mount mediationJohn Johansen9-4/+841
Add basic mount mediation. That allows controlling based on basic mount parameters. It does not include special mount parameters for apparmor, super block labeling, or any triggers for apparmor namespace parameter modifications on pivot root. default userspace policy rules have the form of MOUNT RULE = ( MOUNT | REMOUNT | UMOUNT ) MOUNT = [ QUALIFIERS ] 'mount' [ MOUNT CONDITIONS ] [ SOURCE FILEGLOB ] [ '->' MOUNTPOINT FILEGLOB ] REMOUNT = [ QUALIFIERS ] 'remount' [ MOUNT CONDITIONS ] MOUNTPOINT FILEGLOB UMOUNT = [ QUALIFIERS ] 'umount' [ MOUNT CONDITIONS ] MOUNTPOINT FILEGLOB MOUNT CONDITIONS = [ ( 'fstype' | 'vfstype' ) ( '=' | 'in' ) MOUNT FSTYPE EXPRESSION ] [ 'options' ( '=' | 'in' ) MOUNT FLAGS EXPRESSION ] MOUNT FSTYPE EXPRESSION = ( MOUNT FSTYPE LIST | MOUNT EXPRESSION ) MOUNT FSTYPE LIST = Comma separated list of valid filesystem and virtual filesystem types (eg ext4, debugfs, etc) MOUNT FLAGS EXPRESSION = ( MOUNT FLAGS LIST | MOUNT EXPRESSION ) MOUNT FLAGS LIST = Comma separated list of MOUNT FLAGS. MOUNT FLAGS = ( 'ro' | 'rw' | 'nosuid' | 'suid' | 'nodev' | 'dev' | 'noexec' | 'exec' | 'sync' | 'async' | 'remount' | 'mand' | 'nomand' | 'dirsync' | 'noatime' | 'atime' | 'nodiratime' | 'diratime' | 'bind' | 'rbind' | 'move' | 'verbose' | 'silent' | 'loud' | 'acl' | 'noacl' | 'unbindable' | 'runbindable' | 'private' | 'rprivate' | 'slave' | 'rslave' | 'shared' | 'rshared' | 'relatime' | 'norelatime' | 'iversion' | 'noiversion' | 'strictatime' | 'nouser' | 'user' ) MOUNT EXPRESSION = ( ALPHANUMERIC | AARE ) ... PIVOT ROOT RULE = [ QUALIFIERS ] pivot_root [ oldroot=OLD PUT FILEGLOB ] [ NEW ROOT FILEGLOB ] SOURCE FILEGLOB = FILEGLOB MOUNTPOINT FILEGLOB = FILEGLOB eg. mount, mount /dev/foo, mount options=ro /dev/foo -> /mnt/, mount options in (ro,atime) /dev/foo -> /mnt/, mount options=ro options=atime, Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Seth Arnold <seth.arnold@canonical.com>
2017-09-22apparmor: add the ability to mediate signalsJohn Johansen7-0/+231
Add signal mediation where the signal can be mediated based on the signal, direction, or the label or the peer/target. The signal perms are verified on a cross check to ensure policy consistency in the case of incremental policy load/replacement. The optimization of skipping the cross check when policy is guaranteed to be consistent (single compile unit) remains to be done. policy rules have the form of SIGNAL_RULE = [ QUALIFIERS ] 'signal' [ SIGNAL ACCESS PERMISSIONS ] [ SIGNAL SET ] [ SIGNAL PEER ] SIGNAL ACCESS PERMISSIONS = SIGNAL ACCESS | SIGNAL ACCESS LIST SIGNAL ACCESS LIST = '(' Comma or space separated list of SIGNAL ACCESS ')' SIGNAL ACCESS = ( 'r' | 'w' | 'rw' | 'read' | 'write' | 'send' | 'receive' ) SIGNAL SET = 'set' '=' '(' SIGNAL LIST ')' SIGNAL LIST = Comma or space separated list of SIGNALS SIGNALS = ( 'hup' | 'int' | 'quit' | 'ill' | 'trap' | 'abrt' | 'bus' | 'fpe' | 'kill' | 'usr1' | 'segv' | 'usr2' | 'pipe' | 'alrm' | 'term' | 'stkflt' | 'chld' | 'cont' | 'stop' | 'stp' | 'ttin' | 'ttou' | 'urg' | 'xcpu' | 'xfsz' | 'vtalrm' | 'prof' | 'winch' | 'io' | 'pwr' | 'sys' | 'emt' | 'exists' | 'rtmin+0' ... 'rtmin+32' ) SIGNAL PEER = 'peer' '=' AARE eg. signal, # allow all signals signal send set=(hup, kill) peer=foo, Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Seth Arnold <seth.arnold@canonical.com>
2017-09-22apparmor: Redundant condition: prev_ns. in [label.c:1498]John Johansen1-1/+1
Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22apparmor: Fix an error code in aafs_create()Dan Carpenter1-1/+3
We accidentally forgot to set the error code on this path. It means we return NULL instead of an error pointer. I looked through a bunch of callers and I don't think it really causes a big issue, but the documentation says we're supposed to return error pointers here. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22apparmor: Fix logical error in verify_header()Christos Gkekas1-1/+1
verify_header() is currently checking whether interface version is less than 5 *and* greater than 7, which always evaluates to false. Instead it should check whether it is less than 5 *or* greater than 7. Signed-off-by: Christos Gkekas <chris.gekas@gmail.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22apparmor: Fix shadowed local variable in unpack_trans_table()Geert Uytterhoeven1-2/+2
with W=2: security/apparmor/policy_unpack.c: In function ‘unpack_trans_table’: security/apparmor/policy_unpack.c:469: warning: declaration of ‘pos’ shadows a previous local security/apparmor/policy_unpack.c:451: warning: shadowed declaration is here Rename the old "pos" to "saved_pos" to fix this. Fixes: 5379a3312024a8be ("apparmor: support v7 transition format compatible with label_parse") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22bnxt_re: Don't issue cmd to delete GID for QP1 GID entry before the QP is destroyedSomnath Kotur1-3/+20
FW needs the 0th GID Entry in the Table to be preserved before it's corresponding QP1 is deleted, else it will fail the cmd. Check for the same and return to prevent error msg being logged for cmd failure. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-22bnxt_re: Fix memory leak in FRMR pathSelvin Xavier1-1/+1
This patch fixes a memory leak issue when alloc_mr is used. mr->pages and mr->npages are used only in alloc_mr path. mr->pages is allocated when alloc_mr is called or in the case of FRMR, while creating the MR. mr->npages is updated only when the MR created is used i.e. after invoking map_mr_sg verb, before data transfer. In the dereg_mr path, if mr->npages is 0, driver ends up not freeing the memory created. Removing the npages check from the dereg_mr path for kernel consumers. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-22bnxt_re: Remove RTNL lock dependency in bnxt_re_query_portSomnath Kotur3-8/+9
When there is a NETDEV_UNREGISTER event, bnxt_re driver calls ib_unregister_device() (RTNL lock held). ib_unregister_device attempts to flush a worker queue scheduled by ib_core and that queue might have a pending ib_query_port(). ib_query_port in turn calls bnxt_re_query_port(), which while querying the link speed using ib_get_eth_speed(), tries to acquire the rtnl_lock() which was already held by NETDEV_UNREGISTER. Fixing the issue by removing the link speed query from bnxt_re_query_port() Now the speed is queried post a successful ib_register_device or whenever there is a NETDEV_CHANGE event. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-22bnxt_re: Fix race between the netdev register and unregister eventsSomnath Kotur2-5/+15
Upon receipt of the NETDEV_REGISTER event from the netdev notifier chain, the IB stack registration is spawned off to a workqueue since that also requires an rtnl lock. There could be 2 kinds of races between the NETDEV_REGISTER and the NETDEV_UNREGISTER event handling. a)The NETDEV_UNREGISTER event is received in rapid succession after the NETDEV_REGISTER event even before the work queue got a chance to run. b)The NETDEV_UNREGISTER event is received while the workqueue that handles registration with the IB stack is still in progress. Handle both the races with a bit flag that is set just before the work item is queued and cleared in the workqueue after the event is handled just before the workqueue item is freed. While adding the new flag, it was noted that the flags are all used in *_bit() operations which expect a bit number and not a literal constant with a bit set. So change the numbers to be bit numbers. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-22bnxt_re: Free up devices in module_exit pathSomnath Kotur1-0/+16
Clean up all devices added to the bnxt_re_dev_list in the module_exit entry point. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-22bnxt_re: Fix compare and swap atomic operandsDevesh Sharma1-0/+1
Driver must assign the user supplied compare/swap values in the wqe to successfully complete the atomic compare and swap operation. Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-22bnxt_re: Stop issuing further cmds to FW once a cmd times outSomnath Kotur2-1/+6
Once a cmd to FW times out(after 20s) it is reasonable to assume the FW or atleast the control path is dead. No point issuing further cmds to the FW as each subsequent cmd with another 20s timeout will cascade resulting in unnecessary traces and/or NMI Lockups. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-22bnxt_re: Fix update of qplib_qp.mtu when modifiedDevesh Sharma1-0/+3
The MTU value in the qplib_qp.mtu should be consistent with whatever mtu was set during INIT to RTR.The Next PSN and number of packets are calculated based on this member in the qplib_qp structure. Signed-off-by: Narender Reddy <narender.reddy@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-22parisc: Add HWPOISON page fault handler codeHelge Deller1-4/+29
Commit 24587380f61d ("parisc: Add MADV_HWPOISON and MADV_SOFT_OFFLINE") added the necessary constants to handle hardware-poisoning. Those were needed to support the page deallocation feature from firmware. But I completely missed to add the relevant fault handler code. This now showed up when I ran the madvise07 testcase from the Linux Test Project, which failed with a kernel BUG at arch/parisc/mm/fault.c:320. With this patch the parisc kernel now behaves like other platforms and gives the same kernel syslog warnings when poisoning pages. Signed-off-by: Helge Deller <deller@gmx.de>