Skip to content

Commit

Permalink
Merge tag 'pm-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/…
Browse files Browse the repository at this point in the history
…git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These make hibernation on 32-bit x86 systems work in all of the cases
  in which it works on 64-bit x86 ones, update the menu cpuidle governor
  and the "polling" state to make them more efficient, add more hardware
  support to cpufreq drivers and fix issues with some of them, fix a bug
  in the conservative cpufreq governor, fix the operating performance
  points (OPP) framework and make it more stable, update the devfreq
  subsystem to take changes in the APIs used by into account and clean
  up some things all over.

  Specifics:

   - Backport hibernation bug fixes from x86-64 to x86-32 and
     consolidate hibernation handling on x86 to allow 32-bit systems to
     work in all of the cases in which 64-bit ones work (Zhimin Gu, Chen
     Yu).

   - Fix hibernation documentation (Vladimir D. Seleznev).

   - Update the menu cpuidle governor to fix a couple of issues with it,
     make it more efficient in some cases and clean it up (Rafael
     Wysocki).

   - Rework the cpuidle polling state implementation to make it more
     efficient (Rafael Wysocki).

   - Clean up the cpuidle core somewhat (Fieah Lim).

   - Fix the cpufreq conservative governor to take policy limits into
     account properly in some cases (Rafael Wysocki).

   - Add support for retrieving guaranteed performance information to
     the ACPI CPPC library and make the intel_pstate driver use it to
     expose the CPU base frequency via sysfs on systems with the
     hardware-managed P-states (HWP) feature enabled (Srinivas
     Pandruvada).

   - Fix clang warning in the CPPC cpufreq driver (Nathan Chancellor).

   - Get rid of device_node.name printing from cpufreq (Rob Herring).

   - Remove unnecessary unlikely() from the cpufreq core (Igor Stoppa).

   - Add support for the r8a7744 SoC to the cpufreq-dt driver (Biju
     Das).

   - Update the dt-platdev cpufreq driver to allow RK3399 to have
     separate tunables per cluster (Dmitry Torokhov).

   - Fix the dma_alloc_coherent() usage in the tegra186 cpufreq driver
     (Christoph Hellwig).

   - Make the imx6q cpufreq driver read OCOTP through nvmem for
     imx6ul/imx6ull (Anson Huang).

   - Fix several bugs in the operating performance points (OPP)
     framework and make it more stable (Viresh Kumar, Dave Gerlach).

   - Update the devfreq subsystem to take changes in the APIs used by
     into account, fix some issues with it and make it stop print
     device_node.name directly (Bjorn Andersson, Enric Balletbo i Serra,
     Matthias Kaehlcke, Rob Herring, Vincent Donnefort, zhong jiang).

   - Prepare the generic power domains (genpd) framework for dealing
     with domains containing CPUs (Ulf Hansson).

   - Prevent sysfs attributes representing low-power S0 residency
     counters from being exposed if low-power S0 support is not
     indicated in ACPI FADT (Rajneesh Bhardwaj).

   - Get rid of custom CPU features macros for Intel CPUs from the
     intel_idle and RAPL drivers (Andy Shevchenko).

   - Update the tasks freezer to list tasks that refused to freeze and
     caused a system transition to a sleep state to be aborted (Todd
     Brandt).

   - Update the pm-graph set of tools to v5.2 (Todd Brandt).

   - Fix some issues in the cpupower utility (Anders Roxell, Prarit
     Bhargava)"

* tag 'pm-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (73 commits)
  PM / Domains: Document flags for genpd
  PM / Domains: Deal with multiple states but no governor in genpd
  PM / Domains: Don't treat zero found compatible idle states as an error
  cpuidle: menu: Avoid computations when result will be discarded
  cpuidle: menu: Drop redundant comparison
  cpufreq: tegra186: don't pass GFP_DMA32 to dma_alloc_coherent()
  cpufreq: conservative: Take limits changes into account properly
  Documentation: intel_pstate: Add base_frequency information
  cpufreq: intel_pstate: Add base_frequency attribute
  ACPI / CPPC: Add support for guaranteed performance
  cpuidle: menu: Simplify checks related to the polling state
  PM / tools: sleepgraph and bootgraph: upgrade to v5.2
  PM / tools: sleepgraph: first batch of v5.2 changes
  cpupower: Fix coredump on VMWare
  cpupower: Fix AMD Family 0x17 msr_pstate size
  cpufreq: imx6q: read OCOTP through nvmem for imx6ul/imx6ull
  cpufreq: dt-platdev: allow RK3399 to have separate tunables per cluster
  cpuidle: poll_state: Revise loop termination condition
  cpuidle: menu: Move the latency_req == 0 special case check
  cpuidle: menu: Avoid computations for very close timers
  ...
  • Loading branch information
torvalds committed Oct 23, 2018
2 parents 72f86d0 + cc19b05 commit 12dd08f
Show file tree
Hide file tree
Showing 58 changed files with 2,123 additions and 1,586 deletions.
2 changes: 1 addition & 1 deletion Documentation/ABI/testing/sysfs-power
Original file line number Diff line number Diff line change
Expand Up @@ -99,7 +99,7 @@ Description:
this file, the suspend image will be as small as possible.

Reading from this file will display the current image size
limit, which is set to 500 MB by default.
limit, which is set to around 2/5 of available RAM by default.

What: /sys/power/pm_trace
Date: August 2006
Expand Down
7 changes: 7 additions & 0 deletions Documentation/admin-guide/pm/intel_pstate.rst
Original file line number Diff line number Diff line change
Expand Up @@ -465,6 +465,13 @@ Next, the following policy attributes have special meaning if
policy for the time interval between the last two invocations of the
driver's utilization update callback by the CPU scheduler for that CPU.

One more policy attribute is present if the `HWP feature is enabled in the
processor <Active Mode With HWP_>`_:

``base_frequency``
Shows the base frequency of the CPU. Any frequency above this will be
in the turbo frequency range.

The meaning of these attributes in the `passive mode <Passive Mode_>`_ is the
same as for other scaling drivers.

Expand Down
2 changes: 1 addition & 1 deletion Documentation/power/swsusp.txt
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,7 @@ If you want to limit the suspend image size to N bytes, do

echo N > /sys/power/image_size

before suspend (it is limited to 500 MB by default).
before suspend (it is limited to around 2/5 of available RAM by default).

. The resume process checks for the presence of the resume device,
if found, it then checks the contents for the hibernation image signature.
Expand Down
2 changes: 1 addition & 1 deletion arch/x86/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -2422,7 +2422,7 @@ menu "Power management and ACPI options"

config ARCH_HIBERNATION_HEADER
def_bool y
depends on X86_64 && HIBERNATION
depends on HIBERNATION

source "kernel/power/Kconfig"

Expand Down
8 changes: 8 additions & 0 deletions arch/x86/include/asm/suspend.h
Original file line number Diff line number Diff line change
Expand Up @@ -4,3 +4,11 @@
#else
# include <asm/suspend_64.h>
#endif
extern unsigned long restore_jump_address __visible;
extern unsigned long jump_address_phys;
extern unsigned long restore_cr3 __visible;
extern unsigned long temp_pgt __visible;
extern unsigned long relocated_restore_code __visible;
extern int relocate_restore_code(void);
/* Defined in hibernate_asm_32/64.S */
extern asmlinkage __visible int restore_image(void);
4 changes: 4 additions & 0 deletions arch/x86/include/asm/suspend_32.h
Original file line number Diff line number Diff line change
Expand Up @@ -32,4 +32,8 @@ struct saved_context {
unsigned long return_address;
} __attribute__((packed));

/* routines for saving/restoring kernel state */
extern char core_restore_code[];
extern char restore_registers[];

#endif /* _ASM_X86_SUSPEND_32_H */
2 changes: 1 addition & 1 deletion arch/x86/kernel/setup.c
Original file line number Diff line number Diff line change
Expand Up @@ -1251,7 +1251,7 @@ void __init setup_arch(char **cmdline_p)
x86_init.hyper.guest_late_init();

e820__reserve_resources();
e820__register_nosave_regions(max_low_pfn);
e820__register_nosave_regions(max_pfn);

x86_init.resources.reserve_resources();

Expand Down
2 changes: 1 addition & 1 deletion arch/x86/power/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -7,4 +7,4 @@ nostackp := $(call cc-option, -fno-stack-protector)
CFLAGS_cpu.o := $(nostackp)

obj-$(CONFIG_PM_SLEEP) += cpu.o
obj-$(CONFIG_HIBERNATION) += hibernate_$(BITS).o hibernate_asm_$(BITS).o
obj-$(CONFIG_HIBERNATION) += hibernate_$(BITS).o hibernate_asm_$(BITS).o hibernate.o
248 changes: 248 additions & 0 deletions arch/x86/power/hibernate.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,248 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Hibernation support for x86
*
* Copyright (c) 2007 Rafael J. Wysocki <[email protected]>
* Copyright (c) 2002 Pavel Machek <[email protected]>
* Copyright (c) 2001 Patrick Mochel <[email protected]>
*/
#include <linux/gfp.h>
#include <linux/smp.h>
#include <linux/suspend.h>
#include <linux/scatterlist.h>
#include <linux/kdebug.h>

#include <crypto/hash.h>

#include <asm/e820/api.h>
#include <asm/init.h>
#include <asm/proto.h>
#include <asm/page.h>
#include <asm/pgtable.h>
#include <asm/mtrr.h>
#include <asm/sections.h>
#include <asm/suspend.h>
#include <asm/tlbflush.h>

/*
* Address to jump to in the last phase of restore in order to get to the image
* kernel's text (this value is passed in the image header).
*/
unsigned long restore_jump_address __visible;
unsigned long jump_address_phys;

/*
* Value of the cr3 register from before the hibernation (this value is passed
* in the image header).
*/
unsigned long restore_cr3 __visible;
unsigned long temp_pgt __visible;
unsigned long relocated_restore_code __visible;

/**
* pfn_is_nosave - check if given pfn is in the 'nosave' section
*/
int pfn_is_nosave(unsigned long pfn)
{
unsigned long nosave_begin_pfn;
unsigned long nosave_end_pfn;

nosave_begin_pfn = __pa_symbol(&__nosave_begin) >> PAGE_SHIFT;
nosave_end_pfn = PAGE_ALIGN(__pa_symbol(&__nosave_end)) >> PAGE_SHIFT;

return pfn >= nosave_begin_pfn && pfn < nosave_end_pfn;
}


#define MD5_DIGEST_SIZE 16

struct restore_data_record {
unsigned long jump_address;
unsigned long jump_address_phys;
unsigned long cr3;
unsigned long magic;
u8 e820_digest[MD5_DIGEST_SIZE];
};

#if IS_BUILTIN(CONFIG_CRYPTO_MD5)
/**
* get_e820_md5 - calculate md5 according to given e820 table
*
* @table: the e820 table to be calculated
* @buf: the md5 result to be stored to
*/
static int get_e820_md5(struct e820_table *table, void *buf)
{
struct crypto_shash *tfm;
struct shash_desc *desc;
int size;
int ret = 0;

tfm = crypto_alloc_shash("md5", 0, 0);
if (IS_ERR(tfm))
return -ENOMEM;

desc = kmalloc(sizeof(struct shash_desc) + crypto_shash_descsize(tfm),
GFP_KERNEL);
if (!desc) {
ret = -ENOMEM;
goto free_tfm;
}

desc->tfm = tfm;
desc->flags = 0;

size = offsetof(struct e820_table, entries) +
sizeof(struct e820_entry) * table->nr_entries;

if (crypto_shash_digest(desc, (u8 *)table, size, buf))
ret = -EINVAL;

kzfree(desc);

free_tfm:
crypto_free_shash(tfm);
return ret;
}

static int hibernation_e820_save(void *buf)
{
return get_e820_md5(e820_table_firmware, buf);
}

static bool hibernation_e820_mismatch(void *buf)
{
int ret;
u8 result[MD5_DIGEST_SIZE];

memset(result, 0, MD5_DIGEST_SIZE);
/* If there is no digest in suspend kernel, let it go. */
if (!memcmp(result, buf, MD5_DIGEST_SIZE))
return false;

ret = get_e820_md5(e820_table_firmware, result);
if (ret)
return true;

return memcmp(result, buf, MD5_DIGEST_SIZE) ? true : false;
}
#else
static int hibernation_e820_save(void *buf)
{
return 0;
}

static bool hibernation_e820_mismatch(void *buf)
{
/* If md5 is not builtin for restore kernel, let it go. */
return false;
}
#endif

#ifdef CONFIG_X86_64
#define RESTORE_MAGIC 0x23456789ABCDEF01UL
#else
#define RESTORE_MAGIC 0x12345678UL
#endif

/**
* arch_hibernation_header_save - populate the architecture specific part
* of a hibernation image header
* @addr: address to save the data at
*/
int arch_hibernation_header_save(void *addr, unsigned int max_size)
{
struct restore_data_record *rdr = addr;

if (max_size < sizeof(struct restore_data_record))
return -EOVERFLOW;
rdr->magic = RESTORE_MAGIC;
rdr->jump_address = (unsigned long)restore_registers;
rdr->jump_address_phys = __pa_symbol(restore_registers);

/*
* The restore code fixes up CR3 and CR4 in the following sequence:
*
* [in hibernation asm]
* 1. CR3 <= temporary page tables
* 2. CR4 <= mmu_cr4_features (from the kernel that restores us)
* 3. CR3 <= rdr->cr3
* 4. CR4 <= mmu_cr4_features (from us, i.e. the image kernel)
* [in restore_processor_state()]
* 5. CR4 <= saved CR4
* 6. CR3 <= saved CR3
*
* Our mmu_cr4_features has CR4.PCIDE=0, and toggling
* CR4.PCIDE while CR3's PCID bits are nonzero is illegal, so
* rdr->cr3 needs to point to valid page tables but must not
* have any of the PCID bits set.
*/
rdr->cr3 = restore_cr3 & ~CR3_PCID_MASK;

return hibernation_e820_save(rdr->e820_digest);
}

/**
* arch_hibernation_header_restore - read the architecture specific data
* from the hibernation image header
* @addr: address to read the data from
*/
int arch_hibernation_header_restore(void *addr)
{
struct restore_data_record *rdr = addr;

if (rdr->magic != RESTORE_MAGIC) {
pr_crit("Unrecognized hibernate image header format!\n");
return -EINVAL;
}

restore_jump_address = rdr->jump_address;
jump_address_phys = rdr->jump_address_phys;
restore_cr3 = rdr->cr3;

if (hibernation_e820_mismatch(rdr->e820_digest)) {
pr_crit("Hibernate inconsistent memory map detected!\n");
return -ENODEV;
}

return 0;
}

int relocate_restore_code(void)
{
pgd_t *pgd;
p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;

relocated_restore_code = get_safe_page(GFP_ATOMIC);
if (!relocated_restore_code)
return -ENOMEM;

memcpy((void *)relocated_restore_code, core_restore_code, PAGE_SIZE);

/* Make the page containing the relocated code executable */
pgd = (pgd_t *)__va(read_cr3_pa()) +
pgd_index(relocated_restore_code);
p4d = p4d_offset(pgd, relocated_restore_code);
if (p4d_large(*p4d)) {
set_p4d(p4d, __p4d(p4d_val(*p4d) & ~_PAGE_NX));
goto out;
}
pud = pud_offset(p4d, relocated_restore_code);
if (pud_large(*pud)) {
set_pud(pud, __pud(pud_val(*pud) & ~_PAGE_NX));
goto out;
}
pmd = pmd_offset(pud, relocated_restore_code);
if (pmd_large(*pmd)) {
set_pmd(pmd, __pmd(pmd_val(*pmd) & ~_PAGE_NX));
goto out;
}
pte = pte_offset_kernel(pmd, relocated_restore_code);
set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_NX));
out:
__flush_tlb_all();
return 0;
}
Loading

0 comments on commit 12dd08f

Please sign in to comment.