11 files changed, 388 insertions, 86 deletions
diff --git a/Documentation/admin-guide/LSM/tomoyo.rst b/Documentation/admin-guide/LSM/tomoyo.rst
index a5947218fa64..e2d6b6e15082 100644
--- a/Documentation/admin-guide/LSM/tomoyo.rst
+++ b/Documentation/admin-guide/LSM/tomoyo.rst
@@ -9,8 +9,8 @@ TOMOYO is a name-based MAC extension (LSM module) for the Linux kernel.
 
 LiveCD-based tutorials are available at
 
-http://tomoyo.sourceforge.jp/1.7/1st-step/ubuntu10.04-live/
-http://tomoyo.sourceforge.jp/1.7/1st-step/centos5-live/
+http://tomoyo.sourceforge.jp/1.8/ubuntu12.04-live.html
+http://tomoyo.sourceforge.jp/1.8/centos6-live.html
 
 Though these tutorials use non-LSM version of TOMOYO, they are useful for you
 to know what TOMOYO is.
@@ -21,35 +21,35 @@ How to enable TOMOYO?
 Build the kernel with ``CONFIG_SECURITY_TOMOYO=y`` and pass ``security=tomoyo`` on
 kernel's command line.
 
-Please see http://tomoyo.sourceforge.jp/2.3/ for details.
+Please see http://tomoyo.osdn.jp/2.5/ for details.
 
 Where is documentation?
 =======================
 
 User <-> Kernel interface documentation is available at
-http://tomoyo.sourceforge.jp/2.3/policy-reference.html .
+http://tomoyo.osdn.jp/2.5/policy-specification/index.html .
 
 Materials we prepared for seminars and symposiums are available at
-http://sourceforge.jp/projects/tomoyo/docs/?category_id=532&language_id=1 .
+http://osdn.jp/projects/tomoyo/docs/?category_id=532&language_id=1 .
 Below lists are chosen from three aspects.
 
 What is TOMOYO?
   TOMOYO Linux Overview
-    http://sourceforge.jp/projects/tomoyo/docs/lca2009-takeda.pdf
+    http://osdn.jp/projects/tomoyo/docs/lca2009-takeda.pdf
   TOMOYO Linux: pragmatic and manageable security for Linux
-    http://sourceforge.jp/projects/tomoyo/docs/freedomhectaipei-tomoyo.pdf
+    http://osdn.jp/projects/tomoyo/docs/freedomhectaipei-tomoyo.pdf
   TOMOYO Linux: A Practical Method to Understand and Protect Your Own Linux Box
-    http://sourceforge.jp/projects/tomoyo/docs/PacSec2007-en-no-demo.pdf
+    http://osdn.jp/projects/tomoyo/docs/PacSec2007-en-no-demo.pdf
 
 What can TOMOYO do?
   Deep inside TOMOYO Linux
-    http://sourceforge.jp/projects/tomoyo/docs/lca2009-kumaneko.pdf
+    http://osdn.jp/projects/tomoyo/docs/lca2009-kumaneko.pdf
   The role of "pathname based access control" in security.
-    http://sourceforge.jp/projects/tomoyo/docs/lfj2008-bof.pdf
+    http://osdn.jp/projects/tomoyo/docs/lfj2008-bof.pdf
 
 History of TOMOYO?
   Realities of Mainlining
-    http://sourceforge.jp/projects/tomoyo/docs/lfj2008.pdf
+    http://osdn.jp/projects/tomoyo/docs/lfj2008.pdf
 
 What is future plan?
 ====================
@@ -60,6 +60,6 @@ multiple LSM modules at the same time. We feel sorry that you have to give up
 SELinux/SMACK/AppArmor etc. when you want to use TOMOYO.
 
 We hope that LSM becomes stackable in future. Meanwhile, you can use non-LSM
-version of TOMOYO, available at http://tomoyo.sourceforge.jp/1.7/ .
+version of TOMOYO, available at http://tomoyo.osdn.jp/1.8/ .
 LSM version of TOMOYO is a subset of non-LSM version of TOMOYO. We are planning
 to port non-LSM version's functionalities to LSM versions.
diff --git a/Documentation/admin-guide/devices.txt b/Documentation/admin-guide/devices.txt
index 6b71852dadc2..4ec843123cc3 100644
--- a/Documentation/admin-guide/devices.txt
+++ b/Documentation/admin-guide/devices.txt
@@ -3081,3 +3081,8 @@
 		  1 = /dev/osd1		Second OSD Device
 		  ...
 		  255 = /dev/osd255	256th OSD Device
+
+ 384-511 char	RESERVED FOR DYNAMIC ASSIGNMENT
+		Character devices that request a dynamic allocation of major
+		number will take numbers starting from 511 and downward,
+		once the 234-254 range is full.
diff --git a/Documentation/admin-guide/kernel-parameters.rst b/Documentation/admin-guide/kernel-parameters.rst
index d76ab3907e2b..b2598cc9834c 100644
--- a/Documentation/admin-guide/kernel-parameters.rst
+++ b/Documentation/admin-guide/kernel-parameters.rst
@@ -138,6 +138,7 @@ parameter is applicable::
 	PPT	Parallel port support is enabled.
 	PS2	Appropriate PS/2 support is enabled.
 	RAM	RAM disk support is enabled.
+	RDT	Intel Resource Director Technology.
 	S390	S390 architecture is enabled.
 	SCSI	Appropriate SCSI support is enabled.
 			A lot of drivers have their options described inside
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index d9c171ce4190..05496622b4ef 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2233,6 +2233,17 @@
 			memory contents and reserves bad memory
 			regions that are detected.
 
+	mem_encrypt=	[X86-64] AMD Secure Memory Encryption (SME) control
+			Valid arguments: on, off
+			Default (depends on kernel configuration option):
+			  on  (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y)
+			  off (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=n)
+			mem_encrypt=on:		Activate SME
+			mem_encrypt=off:	Do not activate SME
+
+			Refer to Documentation/x86/amd-memory-encryption.txt
+			for details on when memory encryption can be activated.
+
 	mem_sleep_default=	[SUSPEND] Default system suspend mode:
 			s2idle  - Suspend-To-Idle
 			shallow - Power-On Suspend or equivalent (if supported)
@@ -2633,9 +2644,10 @@
 			In kernels built with CONFIG_NO_HZ_FULL=y, set
 			the specified list of CPUs whose tick will be stopped
 			whenever possible. The boot CPU will be forced outside
-			the range to maintain the timekeeping.
-			The CPUs in this range must also be included in the
-			rcu_nocbs= set.
+			the range to maintain the timekeeping.  Any CPUs
+			in this list will have their RCU callbacks offloaded,
+			just as if they had also been called out in the
+			rcu_nocbs= boot parameter.
 
 	noiotrap	[SH] Disables trapped I/O port accesses.
 
@@ -2696,6 +2708,8 @@
 	nopat		[X86] Disable PAT (page attribute table extension of
 			pagetables) support.
 
+	nopcid		[X86-64] Disable the PCID cpu feature.
+
 	norandmaps	Don't use address space randomization.  Equivalent to
 			echo 0 > /proc/sys/kernel/randomize_va_space
 
@@ -2750,6 +2764,15 @@
 			If the dependencies are under your control, you can
 			turn on cpu0_hotplug.
 
+	nps_mtm_hs_ctr= [KNL,ARC]
+			This parameter sets the maximum duration, in
+			cycles, each HW thread of the CTOP can run
+			without interruptions, before HW switches it.
+			The actual maximum duration is 16 times this
+			parameter's value.
+			Format: integer between 1 and 255
+			Default: 255
+
 	nptcg=		[IA-64] Override max number of concurrent global TLB
 			purges which is reported from either PAL_VM_SUMMARY or
 			SAL PALO.
@@ -2769,7 +2792,7 @@
 			Allowed values are enable and disable
 
 	numa_zonelist_order= [KNL, BOOT] Select zonelist order for NUMA.
-			one of ['zone', 'node', 'default'] can be specified
+			'node', 'default' can be specified
 			This can be set from sysctl after boot.
 			See Documentation/sysctl/vm.txt for details.
 
@@ -3598,6 +3621,12 @@
 			Run specified binary instead of /init from the ramdisk,
 			used for early userspace startup. See initrd.
 
+	rdt=		[HW,X86,RDT]
+			Turn on/off individual RDT features. List is:
+			cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, mba.
+			E.g. to turn on cmt and turn off mba use:
+				rdt=cmt,!mba
+
 	reboot=		[KNL]
 			Format (x86 or x86_64):
 				[w[arm] | c[old] | h[ard] | s[oft] | g[pio]] \
@@ -4375,6 +4404,10 @@
 			decrease the size and leave more room for directly
 			mapped kernel RAM.
 
+	vmcp_cma=nn[MG]	[KNL,S390]
+			Sets the memory size reserved for contiguous memory
+			allocations for the vmcp device driver.
+
 	vmhalt=		[KNL,S390] Perform z/VM CP command after system halt.
 			Format: <command>
 
diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst
index 463cf7e73db8..47153e64dfb5 100644
--- a/Documentation/admin-guide/pm/cpufreq.rst
+++ b/Documentation/admin-guide/pm/cpufreq.rst
@@ -237,6 +237,14 @@ are the following:
 	This attribute is not present if the scaling driver in use does not
 	support it.
 
+``cpuinfo_cur_freq``
+	Current frequency of the CPUs belonging to this policy as obtained from
+	the hardware (in KHz).
+
+	This is expected to be the frequency the hardware actually runs at.
+	If that frequency cannot be determined, this attribute should not
+	be present.
+
 ``cpuinfo_max_freq``
 	Maximum possible operating frequency the CPUs belonging to this policy
 	can run at (in kHz).
@@ -471,14 +479,6 @@ This governor exposes the following tunables:
 
 	# echo `$(($(cat cpuinfo_transition_latency) * 750 / 1000)) > ondemand/sampling_rate
 
-
-``min_sampling_rate``
-	The minimum value of ``sampling_rate``.
-
-	Equal to 10000 (10 ms) if :c:macro:`CONFIG_NO_HZ_COMMON` and
-	:c:data:`tick_nohz_active` are both set or to 20 times the value of
-	:c:data:`jiffies` in microseconds otherwise.
-
 ``up_threshold``
 	If the estimated CPU load is above this value (in percent), the governor
 	will set the frequency to the maximum value allowed for the policy.
diff --git a/Documentation/admin-guide/pm/index.rst b/Documentation/admin-guide/pm/index.rst
index 7f148f76f432..49237ac73442 100644
--- a/Documentation/admin-guide/pm/index.rst
+++ b/Documentation/admin-guide/pm/index.rst
@@ -5,12 +5,6 @@ Power Management
 .. toctree::
    :maxdepth: 2
 
-   cpufreq
-   intel_pstate
-
-.. only::  subproject and html
-
-   Indices
-   =======
-
-   * :ref:`genindex`
+   strategies
+   system-wide
+   working-state
diff --git a/Documentation/admin-guide/pm/intel_pstate.rst b/Documentation/admin-guide/pm/intel_pstate.rst
index 1d6249825efc..d2b6fda3d67b 100644
--- a/Documentation/admin-guide/pm/intel_pstate.rst
+++ b/Documentation/admin-guide/pm/intel_pstate.rst
@@ -167,35 +167,17 @@ is set.
 ``powersave``
 .............
 
-Without HWP, this P-state selection algorithm generally depends on the
-processor model and/or the system profile setting in the ACPI tables and there
-are two variants of it.
-
-One of them is used with processors from the Atom line and (regardless of the
-processor model) on platforms with the system profile in the ACPI tables set to
-"mobile" (laptops mostly), "tablet", "appliance PC", "desktop", or
-"workstation".  It is also used with processors supporting the HWP feature if
-that feature has not been enabled (that is, with the ``intel_pstate=no_hwp``
-argument in the kernel command line).  It is similar to the algorithm
+Without HWP, this P-state selection algorithm is similar to the algorithm
 implemented by the generic ``schedutil`` scaling governor except that the
 utilization metric used by it is based on numbers coming from feedback
 registers of the CPU.  It generally selects P-states proportional to the
-current CPU utilization, so it is referred to as the "proportional" algorithm.
-
-The second variant of the ``powersave`` P-state selection algorithm, used in all
-of the other cases (generally, on processors from the Core line, so it is
-referred to as the "Core" algorithm), is based on the values read from the APERF
-and MPERF feedback registers and the previously requested target P-state.
-It does not really take CPU utilization into account explicitly, but as a rule
-it causes the CPU P-state to ramp up very quickly in response to increased
-utilization which is generally desirable in server environments.
-
-Regardless of the variant, this algorithm is run by the driver's utilization
-update callback for the given CPU when it is invoked by the CPU scheduler, but
-not more often than every 10 ms (that can be tweaked via ``debugfs`` in `this
-particular case <Tuning Interface in debugfs_>`_).  Like in the ``performance``
-case, the hardware configuration is not touched if the new P-state turns out to
-be the same as the current one.
+current CPU utilization.
+
+This algorithm is run by the driver's utilization update callback for the
+given CPU when it is invoked by the CPU scheduler, but not more often than
+every 10 ms.  Like in the ``performance`` case, the hardware configuration
+is not touched if the new P-state turns out to be the same as the current
+one.
 
 This is the default P-state selection algorithm if the
 :c:macro:`CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE` kernel configuration option
@@ -720,34 +702,7 @@ P-state is called, the ``ftrace`` filter can be set to to
       gnome-shell-3409  [001] ..s.  2537.650850: intel_pstate_set_pstate <-intel_pstate_timer_func
            <idle>-0     [000] ..s.  2537.654843: intel_pstate_set_pstate <-intel_pstate_timer_func
 
-Tuning Interface in ``debugfs``
--------------------------------
-
-The ``powersave`` algorithm provided by ``intel_pstate`` for `the Core line of
-processors in the active mode <powersave_>`_ is based on a `PID controller`_
-whose parameters were chosen to address a number of different use cases at the
-same time.  However, it still is possible to fine-tune it to a specific workload
-and the ``debugfs`` interface under ``/sys/kernel/debug/pstate_snb/`` is
-provided for this purpose.  [Note that the ``pstate_snb`` directory will be
-present only if the specific P-state selection algorithm matching the interface
-in it actually is in use.]
-
-The following files present in that directory can be used to modify the PID
-controller parameters at run time:
-
-| ``deadband``
-| ``d_gain_pct``
-| ``i_gain_pct``
-| ``p_gain_pct``
-| ``sample_rate_ms``
-| ``setpoint``
-
-Note, however, that achieving desirable results this way generally requires
-expert-level understanding of the power vs performance tradeoff, so extra care
-is recommended when attempting to do that.
-
 
 .. _LCEU2015: http://events.linuxfoundation.org/sites/events/files/slides/LinuxConEurope_2015.pdf
 .. _SDM: http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-programming-manual-325384.html
 .. _ACPI specification: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf
-.. _PID controller: https://en.wikipedia.org/wiki/PID_controller
diff --git a/Documentation/admin-guide/pm/sleep-states.rst b/Documentation/admin-guide/pm/sleep-states.rst
new file mode 100644
index 000000000000..1e5c0f00cb2f
--- /dev/null
+++ b/Documentation/admin-guide/pm/sleep-states.rst
@@ -0,0 +1,245 @@
+===================
+System Sleep States
+===================
+
+::
+
+ Copyright (c) 2017 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+
+Sleep states are global low-power states of the entire system in which user
+space code cannot be executed and the overall system activity is significantly
+reduced.
+
+
+Sleep States That Can Be Supported
+==================================
+
+Depending on its configuration and the capabilities of the platform it runs on,
+the Linux kernel can support up to four system sleep states, includig
+hibernation and up to three variants of system suspend.  The sleep states that
+can be supported by the kernel are listed below.
+
+.. _s2idle:
+
+Suspend-to-Idle
+---------------
+
+This is a generic, pure software, light-weight variant of system suspend (also
+referred to as S2I or S2Idle).  It allows more energy to be saved relative to
+runtime idle by freezing user space, suspending the timekeeping and putting all
+I/O devices into low-power states (possibly lower-power than available in the
+working state), such that the processors can spend time in their deepest idle
+states while the system is suspended.
+
+The system is woken up from this state by in-band interrupts, so theoretically
+any devices that can cause interrupts to be generated in the working state can
+also be set up as wakeup devices for S2Idle.
+
+This state can be used on platforms without support for :ref:`standby <standby>`
+or :ref:`suspend-to-RAM <s2ram>`, or it can be used in addition to any of the
+deeper system suspend variants to provide reduced resume latency.  It is always
+supported if the :c:macro:`CONFIG_SUSPEND` kernel configuration option is set.
+
+.. _standby:
+
+Standby
+-------
+
+This state, if supported, offers moderate, but real, energy savings, while
+providing a relatively straightforward transition back to the working state.  No
+operating state is lost (the system core logic retains power), so the system can
+go back to where it left off easily enough.
+
+In addition to freezing user space, suspending the timekeeping and putting all
+I/O devices into low-power states, which is done for :ref:`suspend-to-idle
+<s2idle>` too, nonboot CPUs are taken offline and all low-level system functions
+are suspended during transitions into this state.  For this reason, it should
+allow more energy to be saved relative to :ref:`suspend-to-idle <s2idle>`, but
+the resume latency will generally be greater than for that state.
+
+The set of devices that can wake up the system from this state usually is
+reduced relative to :ref:`suspend-to-idle <s2idle>` and it may be necessary to
+rely on the platform for setting up the wakeup functionality as appropriate.
+
+This state is supported if the :c:macro:`CONFIG_SUSPEND` kernel configuration
+option is set and the support for it is registered by the platform with the
+core system suspend subsystem.  On ACPI-based systems this state is mapped to
+the S1 system state defined by ACPI.
+
+.. _s2ram:
+
+Suspend-to-RAM
+--------------
+
+This state (also referred to as STR or S2RAM), if supported, offers significant
+energy savings as everything in the system is put into a low-power state, except
+for memory, which should be placed into the self-refresh mode to retain its
+contents.  All of the steps carried out when entering :ref:`standby <standby>`
+are also carried out during transitions to S2RAM.  Additional operations may
+take place depending on the platform capabilities.  In particular, on ACPI-based
+systems the kernel passes control to the platform firmware (BIOS) as the last
+step during S2RAM transitions and that usually results in powering down some
+more low-level components that are not directly controlled by the kernel.
+
+The state of devices and CPUs is saved and held in memory.  All devices are
+suspended and put into low-power states.  In many cases, all peripheral buses
+lose power when entering S2RAM, so devices must be able to handle the transition
+back to the "on" state.
+
+On ACPI-based systems S2RAM requires some minimal boot-strapping code in the
+platform firmware to resume the system from it.  This may be the case on other
+platforms too.
+
+The set of devices that can wake up the system from S2RAM usually is reduced
+relative to :ref:`suspend-to-idle <s2idle>` and :ref:`standby <standby>` and it
+may be necessary to rely on the platform for setting up the wakeup functionality
+as appropriate.
+
+S2RAM is supported if the :c:macro:`CONFIG_SUSPEND` kernel configuration option
+is set and the support for it is registered by the platform with the core system
+suspend subsystem.  On ACPI-based systems it is mapped to the S3 system state
+defined by ACPI.
+
+.. _hibernation:
+
+Hibernation
+-----------
+
+This state (also referred to as Suspend-to-Disk or STD) offers the greatest
+energy savings and can be used even in the absence of low-level platform support
+for system suspend.  However, it requires some low-level code for resuming the
+system to be present for the underlying CPU architecture.
+
+Hibernation is significantly different from any of the system suspend variants.
+It takes three system state changes to put it into hibernation and two system
+state changes to resume it.
+
+First, when hibernation is triggered, the kernel stops all system activity and
+creates a snapshot image of memory to be written into persistent storage.  Next,
+the system goes into a state in which the snapshot image can be saved, the image
+is written out and finally the system goes into the target low-power state in
+which power is cut from almost all of its hardware components, including memory,
+except for a limited set of wakeup devices.
+
+Once the snapshot image has been written out, the system may either enter a
+special low-power state (like ACPI S4), or it may simply power down itself.
+Powering down means minimum power draw and it allows this mechanism to work on
+any system.  However, entering a special low-power state may allow additional
+means of system wakeup to be used  (e.g. pressing a key on the keyboard or
+opening a laptop lid).
+
+After wakeup, control goes to the platform firmware that runs a boot loader
+which boots a fresh instance of the kernel (control may also go directly to
+the boot loader, depending on the system configuration, but anyway it causes
+a fresh instance of the kernel to be booted).  That new instance of the kernel
+(referred to as the ``restore kernel``) looks for a hibernation image in
+persistent storage and if one is found, it is loaded into memory.  Next, all
+activity in the system is stopped and the restore kernel overwrites itself with
+the image contents and jumps into a special trampoline area in the original
+kernel stored in the image (referred to as the ``image kernel``), which is where
+the special architecture-specific low-level code is needed.  Finally, the
+image kernel restores the system to the pre-hibernation state and allows user
+space to run again.
+
+Hibernation is supported if the :c:macro:`CONFIG_HIBERNATION` kernel
+configuration option is set.  However, this option can only be set if support
+for the given CPU architecture includes the low-level code for system resume.
+
+
+Basic ``sysfs`` Interfaces for System Suspend and Hibernation
+=============================================================
+
+The following files located in the :file:`/sys/power/` directory can be used by
+user space for sleep states control.
+
+``state``
+	This file contains a list of strings representing sleep states supported
+	by the kernel.  Writing one of these strings into it causes the kernel
+	to start a transition of the system into the sleep state represented by
+	that string.
+
+	In particular, the strings "disk", "freeze" and "standby" represent the
+	:ref:`hibernation <hibernation>`, :ref:`suspend-to-idle <s2idle>` and
+	:ref:`standby <standby>` sleep states, respectively.  The string "mem"
+	is interpreted in accordance with the contents of the ``mem_sleep`` file
+	described below.
+
+	If the kernel does not support any system sleep states, this file is
+	not present.
+
+``mem_sleep``
+	This file contains a list of strings representing supported system
+	suspend	variants and allows user space to select the variant to be
+	associated with the "mem" string in the ``state`` file described above.
+
+	The strings that may be present in this file are "s2idle", "shallow"
+	and "deep".  The string "s2idle" always represents :ref:`suspend-to-idle
+	<s2idle>` and, by convention, "shallow" and "deep" represent
+	:ref:`standby <standby>` and :ref:`suspend-to-RAM <s2ram>`,
+	respectively.
+
+	Writing one of the listed strings into this file causes the system
+	suspend variant represented by it to be associated with the "mem" string
+	in the ``state`` file.  The string representing the suspend variant
+	currently associated with the "mem" string in the ``state`` file
+	is listed in square brackets.
+
+	If the kernel does not support system suspend, this file is not present.
+
+``disk``
+	This file contains a list of strings representing different operations
+	that can be carried out after the hibernation image has been saved.  The
+	possible options are as follows:
+
+	``platform``
+		Put the system into a special low-power state (e.g. ACPI S4) to
+		make additional wakeup options available and possibly allow the
+		platform firmware to take a simplified initialization path after
+		wakeup.
+
+	``shutdown``
+		Power off the system.
+
+	``reboot``
+		Reboot the system (useful for diagnostics mostly).
+
+	``suspend``
+		Hybrid system suspend.  Put the system into the suspend sleep
+		state selected through the ``mem_sleep`` file described above.
+		If the system is successfully woken up from that state, discard
+		the hibernation image and continue.  Otherwise, use the image
+		to restore the previous state of the system.
+
+	``test_resume``
+		Diagnostic operation.  Load the image as though the system had
+		just woken up from hibernation and the currently running kernel
+		instance was a restore kernel and follow up with full system
+		resume.
+
+	Writing one of the listed strings into this file causes the option
+	represented by it to be selected.
+
+	The currently selected option is shown in square brackets which means
+	that the operation represented by it will be carried out after creating
+	and saving the image next time hibernation is triggered by writing
+	``disk`` to :file:`/sys/power/state`.
+
+	If the kernel does not support hibernation, this file is not present.
+
+According to the above, there are two ways to make the system go into the
+:ref:`suspend-to-idle <s2idle>` state.  The first one is to write "freeze"
+directly to :file:`/sys/power/state`.  The second one is to write "s2idle" to
+:file:`/sys/power/mem_sleep` and then to write "mem" to
+:file:`/sys/power/state`.  Likewise, there are two ways to make the system go
+into the :ref:`standby <standby>` state (the strings to write to the control
+files in that case are "standby" or "shallow" and "mem", respectively) if that
+state is supported by the platform.  However, there is only one way to make the
+system go into the :ref:`suspend-to-RAM <s2ram>` state (write "deep" into
+:file:`/sys/power/mem_sleep` and "mem" into :file:`/sys/power/state`).
+
+The default suspend variant (ie. the one to be used without writing anything
+into :file:`/sys/power/mem_sleep`) is either "deep" (on the majority of systems
+supporting :ref:`suspend-to-RAM <s2ram>`) or "s2idle", but it can be overridden
+by the value of the "mem_sleep_default" parameter in the kernel command line.
+On some ACPI-based systems, depending on the information in the ACPI tables, the
+default may be "s2idle" even if :ref:`suspend-to-RAM <s2ram>` is supported.
diff --git a/Documentation/admin-guide/pm/strategies.rst b/Documentation/admin-guide/pm/strategies.rst
new file mode 100644
index 000000000000..afe4d3f831fe
--- /dev/null
+++ b/Documentation/admin-guide/pm/strategies.rst
@@ -0,0 +1,52 @@
+===========================
+Power Management Strategies
+===========================
+
+::
+
+ Copyright (c) 2017 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+
+The Linux kernel supports two major high-level power management strategies.
+
+One of them is based on using global low-power states of the whole system in
+which user space code cannot be executed and the overall system activity is
+significantly reduced, referred to as :doc:`sleep states <sleep-states>`.  The
+kernel puts the system into one of these states when requested by user space
+and the system stays in it until a special signal is received from one of
+designated devices, triggering a transition to the ``working state`` in which
+user space code can run.  Because sleep states are global and the whole system
+is affected by the state changes, this strategy is referred to as the
+:doc:`system-wide power management <system-wide>`.
+
+The other strategy, referred to as the :doc:`working-state power management
+<working-state>`, is based on adjusting the power states of individual hardware
+components of the system, as needed, in the working state.  In consequence, if
+this strategy is in use, the working state of the system usually does not
+correspond to any particular physical configuration of it, but can be treated as
+a metastate covering a range of different power states of the system in which
+the individual components of it can be either ``active`` (in use) or
+``inactive`` (idle).  If they are active, they have to be in power states
+allowing them to process data and to be accessed by software.  In turn, if they
+are inactive, ideally, they should be in low-power states in which they may not
+be accessible.
+
+If all of the system components are active, the system as a whole is regarded as
+"runtime active" and that situation typically corresponds to the maximum power
+draw (or maximum energy usage) of it.  If all of them are inactive, the system
+as a whole is regarded as "runtime idle" which may be very close to a sleep
+state from the physical system configuration and power draw perspective, but
+then it takes much less time and effort to start executing user space code than
+for the same system in a sleep state.  However, transitions from sleep states
+back to the working state can only be started by a limited set of devices, so
+typically the system can spend much more time in a sleep state than it can be
+runtime idle in one go.  For this reason, systems usually use less energy in
+sleep states than when they are runtime idle most of the time.
+
+Moreover, the two power management strategies address different usage scenarios.
+Namely, if the user indicates that the system will not be in use going forward,
+for example by closing its lid (if the system is a laptop), it probably should
+go into a sleep state at that point.  On the other hand, if the user simply goes
+away from the laptop keyboard, it probably should stay in the working state and
+use the working-state power management in case it becomes idle, because the user
+may come back to it at any time and then may want the system to be immediately
+accessible.
diff --git a/Documentation/admin-guide/pm/system-wide.rst b/Documentation/admin-guide/pm/system-wide.rst
new file mode 100644
index 000000000000..0c81e4c5de39
--- /dev/null
+++ b/Documentation/admin-guide/pm/system-wide.rst
@@ -0,0 +1,8 @@
+============================
+System-Wide Power Management
+============================
+
+.. toctree::
+   :maxdepth: 2
+
+   sleep-states
diff --git a/Documentation/admin-guide/pm/working-state.rst b/Documentation/admin-guide/pm/working-state.rst
new file mode 100644
index 000000000000..fa01bf083dfe
--- /dev/null
+++ b/Documentation/admin-guide/pm/working-state.rst
@@ -0,0 +1,9 @@
+==============================
+Working-State Power Management
+==============================
+
+.. toctree::
+   :maxdepth: 2
+
+   cpufreq
+   intel_pstate