author: Andy Lutomirski <luto@amacapital.net> 2014-11-11 12:49:41 -0800
committer: Andy Lutomirski <luto@amacapital.net> 2015-01-02 10:22:45 -0800
commit: 48e08d0fb265b007ebbb29a72297ff7e40938969
tree: 424a8207cc53c2b0dfbd9fb12bee15952ce822ae /Documentation/x86/entry_64.txt
parent: rcu: Make rcu_nmi_enter() handle nesting
x86, entry: Switch stacks on a paranoid entry from userspace

This causes all non-NMI, non-double-fault kernel entries from userspace
to run on the normal kernel stack. Double-fault is exempt to minimize
confusion if we double-fault directly from userspace due to a bad kernel
stack.

This is, surprisingly, simpler and shorter than the current code. It
removes the IMO rather frightening paranoid_userspace path, and it makes
sync_regs much simpler.

There is no risk of stack overflow due to this change -- the kernel
stack that we switch to is empty.

This will also enable us to create non-atomic sections within machine
checks from userspace, which will simplify memory failure handling. It
will also allow the upcoming fsgsbase code to be simplified, because it
doesn't need to worry about usergs when scheduling in paranoid_exit, as
that code no longer exists.

Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Tony Luck <tony.luck@intel.com>
Acked-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Diffstat (limited to 'Documentation/x86/entry_64.txt')
-rw-r--r-- Documentation/x86/entry_64.txt | 18
1 files changed, 12 insertions, 6 deletions
diff --git a/Documentation/x86/entry_64.txt b/Documentation/x86/entry_64.txt
index 4a1c5c2dc5a9..9132b86176a3 100644
--- a/Documentation/x86/entry_64.txt
+++ b/Documentation/x86/entry_64.txt
@@ -78,9 +78,6 @@ The expensive (paranoid) way is to read back the MSR_GS_BASE value
xorl %ebx,%ebx
1: ret
-and the whole paranoid non-paranoid macro complexity is about whether
-to suffer that RDMSR cost.
-
If we are at an interrupt or user-trap/gate-alike boundary then we can
use the faster check: the stack will be a reliable indicator of
whether SWAPGS was already done: if we see that we are a secondary
@@ -93,6 +90,15 @@ which might have triggered right after a normal entry wrote CS to the
stack but before we executed SWAPGS, then the only safe way to check
for GS is the slower method: the RDMSR.
-So we try only to mark those entry methods 'paranoid' that absolutely
-need the more expensive check for the GS base - and we generate all
-'normal' entry points with the regular (faster) entry macros.
+Therefore, super-atomic entries (except NMI, which is handled separately)
+must use idtentry with paranoid=1 to handle gsbase correctly. This
+triggers three main behavior changes:
+
+ - Interrupt entry will use the slower gsbase check.
+ - Interrupt entry from user mode will switch off the IST stack.
+ - Interrupt exit to kernel mode will not attempt to reschedule.
+
+We try to only use IST entries and the paranoid entry code for vectors
+that absolutely need the more expensive check for the GS base - and we
+generate all 'normal' entry points with the regular (faster) paranoid=0
+variant.
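
The three behavior changes listed in the patched text can be modeled in a few lines of C. This is only an illustrative sketch, not kernel code: the struct names, the helper functions, and plan_entry() are all hypothetical, and the real logic lives in assembly entry macros. It assumes that a kernel GS base has its top bit set (the sign test that follows the RDMSR in the documented snippet) and that the low two bits of the saved CS are non-zero for entries from user mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of what the entry code can observe. */
struct entry_state {
    bool     paranoid;  /* models an idtentry declared with paranoid=1 */
    uint16_t saved_cs;  /* CS value the CPU pushed on the stack */
    uint64_t gs_base;   /* value RDMSR(MSR_GS_BASE) would return */
};

/* Hypothetical summary of the decisions the entry/exit path makes. */
struct entry_plan {
    bool needs_swapgs;            /* GS still holds the user base */
    bool leaves_ist_stack;        /* switch to the normal kernel stack */
    bool may_reschedule_on_exit;  /* exit path may try to reschedule */
};

/* Fast check: the privilege level in the saved CS selector. */
static bool came_from_user(uint16_t cs)
{
    return (cs & 3) != 0;
}

/* Slow check: a kernel GS base is a negative (high-half) address. */
static bool gs_base_is_kernel(uint64_t gs_base)
{
    return (gs_base >> 63) != 0;
}

static struct entry_plan plan_entry(const struct entry_state *s)
{
    struct entry_plan p;
    bool user;

    assert(s != NULL);
    user = came_from_user(s->saved_cs);

    /* paranoid=1 trusts only the MSR; paranoid=0 trusts the saved CS. */
    p.needs_swapgs = s->paranoid ? !gs_base_is_kernel(s->gs_base) : user;
    /* Only paranoid entries from user mode move off the IST stack. */
    p.leaves_ist_stack = s->paranoid && user;
    /* A paranoid exit to kernel mode never attempts to reschedule. */
    p.may_reschedule_on_exit = user || !s->paranoid;
    return p;
}
```

The interesting input is a paranoid entry whose saved CS says "kernel" while the GS base is still a user value: that is the window between the CPU pushing CS and the kernel executing SWAPGS, where the fast check would give the wrong answer and only the RDMSR-based check gets it right.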