Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 core updates from Ingo Molnar:
 "Note that in this cycle most of the x86 topics interacted at a level
  that caused them to be merged into tip:x86/asm - but this should be a
  temporary phenomenon, hopefully we'll be back to the usual patterns in
  the next merge window.

  The main changes in this cycle were:

  Hardware enablement:

   - Add support for the Intel UMIP (User Mode Instruction Prevention)
     CPU feature. This is a security feature that disables certain
     instructions (SGDT, SLDT, SIDT, SMSW and STR) when executed in user
     mode. (Ricardo Neri)

      [ Note that this is disabled by default for now; there are some
        smaller enhancements in the pipeline that I'll follow up with in
        the next 1-2 days, which will allow this to be enabled by default.]

   - Add support for the AMD SEV (Secure Encrypted Virtualization) CPU
     feature, on top of SME (Secure Memory Encryption) support that was
     added in v4.14. (Tom Lendacky, Brijesh Singh)

   - Enable new SSE/AVX/AVX512 CPU features: AVX512_VBMI2, GFNI, VAES,
     VPCLMULQDQ, AVX512_VNNI, AVX512_BITALG. (Gayatri Kammela)

  Other changes:

   - A big series of entry code simplifications and enhancements (Andy
     Lutomirski)

   - Make the ORC unwinder default on x86 and various objtool
     enhancements. (Josh Poimboeuf)

   - 5-level paging enhancements (Kirill A. Shutemov)

   - Micro-optimize the entry code a bit (Borislav Petkov)

   - Improve the handling of interdependent CPU features in the early
     FPU init code (Andi Kleen)

   - Build system enhancements (Changbin Du, Masahiro Yamada)

   - ... plus misc enhancements, fixes and cleanups"

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (118 commits)
  x86/build: Make the boot image generation less verbose
  selftests/x86: Add tests for the STR and SLDT instructions
  selftests/x86: Add tests for User-Mode Instruction Prevention
  x86/traps: Fix up general protection faults caused by UMIP
  x86/umip: Enable User-Mode Instruction Prevention at runtime
  x86/umip: Force a page fault when unable to copy emulated result to user
  x86/umip: Add emulation code for UMIP instructions
  x86/cpufeature: Add User-Mode Instruction Prevention definitions
  x86/insn-eval: Add support to resolve 16-bit address encodings
  x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode
  x86/insn-eval: Add wrapper function for 32 and 64-bit addresses
  x86/insn-eval: Add support to resolve 32-bit address encodings
  x86/insn-eval: Compute linear address in several utility functions
  resource: Fix resource_size.cocci warnings
  X86/KVM: Clear encryption attribute when SEV is active
  X86/KVM: Decrypt shared per-cpu variables when SEV is active
  percpu: Introduce DEFINE_PER_CPU_DECRYPTED
  x86: Add support for changing memory encryption attribute in early boot
  x86/io: Unroll string I/O when SEV is active
  x86/boot: Add early boot support when running with SEV active
  ...
torvalds committed Nov 13, 2017
2 parents 3e20146 + 91a6a6c commit d6ec9d9
Showing 110 changed files with 4,067 additions and 1,145 deletions.
30 changes: 26 additions & 4 deletions Documentation/x86/amd-memory-encryption.txt
@@ -1,29 +1,44 @@
Secure Memory Encryption (SME) is a feature found on AMD processors.
Secure Memory Encryption (SME) and Secure Encrypted Virtualization (SEV) are
features found on AMD processors.

SME provides the ability to mark individual pages of memory as encrypted using
the standard x86 page tables. A page that is marked encrypted will be
automatically decrypted when read from DRAM and encrypted when written to
DRAM. SME can therefore be used to protect the contents of DRAM from physical
attacks on the system.

SEV enables running encrypted virtual machines (VMs) in which the code and data
of the guest VM are secured so that a decrypted version is available only
within the VM itself. SEV guest VMs have the concept of private and shared
memory. Private memory is encrypted with the guest-specific key, while shared
memory may be encrypted with the hypervisor key. When SME is enabled, the
hypervisor key is the same key used by SME.

A page is encrypted when a page table entry has the encryption bit set (see
below on how to determine its position). The encryption bit can also be
specified in the cr3 register, allowing the PGD table to be encrypted. Each
successive level of page tables can also be encrypted by setting the encryption
bit in the page table entry that points to the next table. This allows the full
page table hierarchy to be encrypted. Note, this means that just because the
encryption bit is set in cr3, doesn't imply the full hierarchy is encyrpted.
encryption bit is set in cr3, doesn't imply the full hierarchy is encrypted.
Each page table entry in the hierarchy needs to have the encryption bit set to
achieve that. So, theoretically, you could have the encryption bit set in cr3
so that the PGD is encrypted, but not set the encryption bit in the PGD entry
for a PUD, which results in the PUD pointed to by that entry not being
encrypted.

Support for SME can be determined through the CPUID instruction. The CPUID
function 0x8000001f reports information related to SME:
When SEV is enabled, instruction pages and guest page tables are always treated
as private. All the DMA operations inside the guest must be performed on shared
memory. The memory encryption bit is controlled by the guest OS when it is
operating in 64-bit or 32-bit PAE mode; in all other modes the SEV hardware
forces the memory encryption bit to 1.

Support for SME and SEV can be determined through the CPUID instruction. The
CPUID function 0x8000001f reports information related to SME:

0x8000001f[eax]:
Bit[0] indicates support for SME
Bit[1] indicates support for SEV
0x8000001f[ebx]:
Bits[5:0] pagetable bit number used to activate memory
encryption
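
As an illustration (not part of this series), a user-space tool could query
this leaf with GCC's <cpuid.h> to check SME/SEV support and read the
encryption-bit position; the bit layout simply mirrors the description above:

    #include <cpuid.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned int eax, ebx, ecx, edx;

        /* CPUID leaf 0x8000001f: AMD memory encryption capabilities */
        if (!__get_cpuid(0x8000001f, &eax, &ebx, &ecx, &edx)) {
            puts("CPUID leaf 0x8000001f not available");
            return 1;
        }

        printf("SME supported:  %s\n", (eax & (1u << 0)) ? "yes" : "no");
        printf("SEV supported:  %s\n", (eax & (1u << 1)) ? "yes" : "no");
        /* EBX bits [5:0]: page-table bit used as the encryption (C) bit */
        printf("C-bit position: %u\n", ebx & 0x3f);
        return 0;
    }
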
@@ -39,6 +54,13 @@ determine if SME is enabled and/or to enable memory encryption:
Bit[23] 0 = memory encryption features are disabled
1 = memory encryption features are enabled

If SEV is supported, MSR 0xc0010131 (MSR_AMD64_SEV) can be used to determine if
SEV is active:

0xc0010131:
Bit[0] 0 = memory encryption is not active
1 = memory encryption is active
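
For illustration only (again, not part of the patch), a guest could read this
MSR from user space through the msr driver, assuming /dev/cpu/0/msr is
available (msr module loaded, sufficient privileges):

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    #define MSR_AMD64_SEV 0xc0010131

    int main(void)
    {
        uint64_t val;
        int fd = open("/dev/cpu/0/msr", O_RDONLY);

        if (fd < 0) {
            perror("open /dev/cpu/0/msr");
            return 1;
        }
        /* The msr device reads the MSR whose number equals the file offset. */
        if (pread(fd, &val, sizeof(val), MSR_AMD64_SEV) != sizeof(val)) {
            perror("rdmsr");
            close(fd);
            return 1;
        }
        close(fd);

        printf("SEV is %s\n", (val & 1) ? "active" : "not active");
        return 0;
    }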

Linux relies on BIOS to set this bit if BIOS has determined that the reduction
in the physical address space as a result of enabling memory encryption (see
CPUID information above) will not conflict with the address space resource
2 changes: 1 addition & 1 deletion Documentation/x86/orc-unwinder.txt
@@ -4,7 +4,7 @@ ORC unwinder
Overview
--------

The kernel CONFIG_ORC_UNWINDER option enables the ORC unwinder, which is
The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
similar in concept to a DWARF unwinder. The difference is that the
format of the ORC data is much simpler than DWARF, which in turn allows
the ORC unwinder to be much simpler and faster.
2 changes: 1 addition & 1 deletion Documentation/x86/x86_64/mm.txt
@@ -34,7 +34,7 @@ ff92000000000000 - ffd1ffffffffffff (=54 bits) vmalloc/ioremap space
ffd2000000000000 - ffd3ffffffffffff (=49 bits) hole
ffd4000000000000 - ffd5ffffffffffff (=49 bits) virtual memory map (512TB)
... unused hole ...
ffd8000000000000 - fff7ffffffffffff (=53 bits) kasan shadow memory (8PB)
ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB)
... unused hole ...
ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
... unused hole ...
4 changes: 2 additions & 2 deletions Makefile
@@ -934,8 +934,8 @@ ifdef CONFIG_STACK_VALIDATION
ifeq ($(has_libelf),1)
objtool_target := tools/objtool FORCE
else
ifdef CONFIG_ORC_UNWINDER
$(error "Cannot generate ORC metadata for CONFIG_ORC_UNWINDER=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel")
ifdef CONFIG_UNWINDER_ORC
$(error "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel")
else
$(warning "Cannot use CONFIG_STACK_VALIDATION=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel")
endif
12 changes: 9 additions & 3 deletions arch/powerpc/kernel/machine_kexec_file_64.c
@@ -91,11 +91,13 @@ int arch_kimage_file_post_load_cleanup(struct kimage *image)
* and that value will be returned. If all free regions are visited without
* func returning non-zero, then zero will be returned.
*/
int arch_kexec_walk_mem(struct kexec_buf *kbuf, int (*func)(u64, u64, void *))
int arch_kexec_walk_mem(struct kexec_buf *kbuf,
int (*func)(struct resource *, void *))
{
int ret = 0;
u64 i;
phys_addr_t mstart, mend;
struct resource res = { };

if (kbuf->top_down) {
for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
@@ -105,7 +107,9 @@ int arch_kexec_walk_mem(struct kexec_buf *kbuf, int (*func)(u64, u64, void *))
* range while in kexec, end points to the last byte
* in the range.
*/
ret = func(mstart, mend - 1, kbuf);
res.start = mstart;
res.end = mend - 1;
ret = func(&res, kbuf);
if (ret)
break;
}
@@ -117,7 +121,9 @@ int arch_kexec_walk_mem(struct kexec_buf *kbuf, int (*func)(u64, u64, void *))
* range while in kexec, end points to the last byte
* in the range.
*/
ret = func(mstart, mend - 1, kbuf);
res.start = mstart;
res.end = mend - 1;
ret = func(&res, kbuf);
if (ret)
break;
}
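
For reference, callbacks passed to arch_kexec_walk_mem() now take a struct
resource instead of a start/end pair. A minimal, hypothetical callback under
the new signature might look like the following sketch (resource_size() and
struct kexec_buf are existing kernel helpers/types; the callback name is made
up for illustration):

    /* Hypothetical walker callback: stop as soon as a free region large
     * enough for the kexec buffer is found.  A non-zero return value ends
     * the walk and is propagated to the caller.
     */
    static int find_large_enough_region(struct resource *res, void *arg)
    {
        struct kexec_buf *kbuf = arg;

        if (resource_size(res) >= kbuf->memsz)
            return 1;

        return 0;
    }
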
13 changes: 11 additions & 2 deletions arch/x86/Kconfig
@@ -171,7 +171,7 @@ config X86
select HAVE_PERF_USER_STACK_DUMP
select HAVE_RCU_TABLE_FREE
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RELIABLE_STACKTRACE if X86_64 && FRAME_POINTER_UNWINDER && STACK_VALIDATION
select HAVE_RELIABLE_STACKTRACE if X86_64 && UNWINDER_FRAME_POINTER && STACK_VALIDATION
select HAVE_STACK_VALIDATION if X86_64
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_UNSTABLE_SCHED_CLOCK
@@ -303,7 +303,6 @@ config ARCH_SUPPORTS_DEBUG_PAGEALLOC
config KASAN_SHADOW_OFFSET
hex
depends on KASAN
default 0xdff8000000000000 if X86_5LEVEL
default 0xdffffc0000000000

config HAVE_INTEL_TXT
@@ -1803,6 +1802,16 @@ config X86_SMAP

If unsure, say Y.

config X86_INTEL_UMIP
def_bool n
depends on CPU_SUP_INTEL
prompt "Intel User Mode Instruction Prevention" if EXPERT
---help---
The User Mode Instruction Prevention (UMIP) is a security
feature in newer Intel processors. If enabled, a general
protection fault is issued if the instructions SGDT, SLDT,
SIDT, SMSW and STR are executed in user mode.

config X86_INTEL_MPX
prompt "Intel MPX (Memory Protection Extensions)"
def_bool n
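
As an aside (not part of this diff), SMSW is one of the instructions the new
X86_INTEL_UMIP option restricts. A user-space probe like the sketch below will
either receive an emulated result or be killed with SIGSEGV on a
UMIP-enforcing kernel, depending on whether the kernel emulates the
instruction:

    #include <stdio.h>

    int main(void)
    {
        unsigned long msw;

        /* SMSW returns the lower bits of CR0.  With UMIP enabled this
         * user-mode use raises #GP, which the kernel may either emulate
         * or deliver to the process as SIGSEGV. */
        asm volatile("smsw %0" : "=r" (msw));
        printf("smsw: 0x%lx\n", msw);
        return 0;
    }
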
39 changes: 20 additions & 19 deletions arch/x86/Kconfig.debug
@@ -359,28 +359,14 @@ config PUNIT_ATOM_DEBUG

choice
prompt "Choose kernel unwinder"
default FRAME_POINTER_UNWINDER
default UNWINDER_ORC if X86_64
default UNWINDER_FRAME_POINTER if X86_32
---help---
This determines which method will be used for unwinding kernel stack
traces for panics, oopses, bugs, warnings, perf, /proc/<pid>/stack,
livepatch, lockdep, and more.

config FRAME_POINTER_UNWINDER
bool "Frame pointer unwinder"
select FRAME_POINTER
---help---
This option enables the frame pointer unwinder for unwinding kernel
stack traces.

The unwinder itself is fast and it uses less RAM than the ORC
unwinder, but the kernel text size will grow by ~3% and the kernel's
overall performance will degrade by roughly 5-10%.

This option is recommended if you want to use the livepatch
consistency model, as this is currently the only way to get a
reliable stack trace (CONFIG_HAVE_RELIABLE_STACKTRACE).

config ORC_UNWINDER
config UNWINDER_ORC
bool "ORC unwinder"
depends on X86_64
select STACK_VALIDATION
@@ -396,7 +382,22 @@ config ORC_UNWINDER
Enabling this option will increase the kernel's runtime memory usage
by roughly 2-4MB, depending on your kernel config.

config GUESS_UNWINDER
config UNWINDER_FRAME_POINTER
bool "Frame pointer unwinder"
select FRAME_POINTER
---help---
This option enables the frame pointer unwinder for unwinding kernel
stack traces.

The unwinder itself is fast and it uses less RAM than the ORC
unwinder, but the kernel text size will grow by ~3% and the kernel's
overall performance will degrade by roughly 5-10%.

This option is recommended if you want to use the livepatch
consistency model, as this is currently the only way to get a
reliable stack trace (CONFIG_HAVE_RELIABLE_STACKTRACE).

config UNWINDER_GUESS
bool "Guess unwinder"
depends on EXPERT
---help---
@@ -411,7 +412,7 @@ config GUESS_UNWINDER
endchoice

config FRAME_POINTER
depends on !ORC_UNWINDER && !GUESS_UNWINDER
depends on !UNWINDER_ORC && !UNWINDER_GUESS
bool

endmenu
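
With the renamed symbols above, picking an unwinder in a kernel configuration
looks like the following illustrative .config fragment (ORC is now the default
on x86-64):

    CONFIG_UNWINDER_ORC=y
    # CONFIG_UNWINDER_FRAME_POINTER is not set
    # CONFIG_UNWINDER_GUESS is not set
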
3 changes: 3 additions & 0 deletions arch/x86/boot/.gitignore
@@ -7,3 +7,6 @@ zoffset.h
setup
setup.bin
setup.elf
fdimage
mtools.conf
image.iso
59 changes: 11 additions & 48 deletions arch/x86/boot/Makefile
@@ -123,63 +123,26 @@ image_cmdline = default linux $(FDARGS) $(if $(FDINITRD),initrd=initrd.img,)
$(obj)/mtools.conf: $(src)/mtools.conf.in
sed -e 's|@OBJ@|$(obj)|g' < $< > $@

quiet_cmd_genimage = GENIMAGE $3
cmd_genimage = sh $(srctree)/$(src)/genimage.sh $2 $3 $(obj)/bzImage \
$(obj)/mtools.conf '$(image_cmdline)' $(FDINITRD)

# This requires write access to /dev/fd0
bzdisk: $(obj)/bzImage $(obj)/mtools.conf
MTOOLSRC=$(obj)/mtools.conf mformat a: ; sync
syslinux /dev/fd0 ; sync
echo '$(image_cmdline)' | \
MTOOLSRC=$(src)/mtools.conf mcopy - a:syslinux.cfg
if [ -f '$(FDINITRD)' ] ; then \
MTOOLSRC=$(obj)/mtools.conf mcopy '$(FDINITRD)' a:initrd.img ; \
fi
MTOOLSRC=$(obj)/mtools.conf mcopy $(obj)/bzImage a:linux ; sync
$(call cmd,genimage,bzdisk,/dev/fd0)

# These require being root or having syslinux 2.02 or higher installed
fdimage fdimage144: $(obj)/bzImage $(obj)/mtools.conf
dd if=/dev/zero of=$(obj)/fdimage bs=1024 count=1440
MTOOLSRC=$(obj)/mtools.conf mformat v: ; sync
syslinux $(obj)/fdimage ; sync
echo '$(image_cmdline)' | \
MTOOLSRC=$(obj)/mtools.conf mcopy - v:syslinux.cfg
if [ -f '$(FDINITRD)' ] ; then \
MTOOLSRC=$(obj)/mtools.conf mcopy '$(FDINITRD)' v:initrd.img ; \
fi
MTOOLSRC=$(obj)/mtools.conf mcopy $(obj)/bzImage v:linux ; sync
$(call cmd,genimage,fdimage144,$(obj)/fdimage)
@$(kecho) 'Kernel: $(obj)/fdimage is ready'

fdimage288: $(obj)/bzImage $(obj)/mtools.conf
dd if=/dev/zero of=$(obj)/fdimage bs=1024 count=2880
MTOOLSRC=$(obj)/mtools.conf mformat w: ; sync
syslinux $(obj)/fdimage ; sync
echo '$(image_cmdline)' | \
MTOOLSRC=$(obj)/mtools.conf mcopy - w:syslinux.cfg
if [ -f '$(FDINITRD)' ] ; then \
MTOOLSRC=$(obj)/mtools.conf mcopy '$(FDINITRD)' w:initrd.img ; \
fi
MTOOLSRC=$(obj)/mtools.conf mcopy $(obj)/bzImage w:linux ; sync
$(call cmd,genimage,fdimage288,$(obj)/fdimage)
@$(kecho) 'Kernel: $(obj)/fdimage is ready'

isoimage: $(obj)/bzImage
-rm -rf $(obj)/isoimage
mkdir $(obj)/isoimage
for i in lib lib64 share end ; do \
if [ -f /usr/$$i/syslinux/isolinux.bin ] ; then \
cp /usr/$$i/syslinux/isolinux.bin $(obj)/isoimage ; \
if [ -f /usr/$$i/syslinux/ldlinux.c32 ]; then \
cp /usr/$$i/syslinux/ldlinux.c32 $(obj)/isoimage ; \
fi ; \
break ; \
fi ; \
if [ $$i = end ] ; then exit 1 ; fi ; \
done
cp $(obj)/bzImage $(obj)/isoimage/linux
echo '$(image_cmdline)' > $(obj)/isoimage/isolinux.cfg
if [ -f '$(FDINITRD)' ] ; then \
cp '$(FDINITRD)' $(obj)/isoimage/initrd.img ; \
fi
mkisofs -J -r -o $(obj)/image.iso -b isolinux.bin -c boot.cat \
-no-emul-boot -boot-load-size 4 -boot-info-table \
$(obj)/isoimage
isohybrid $(obj)/image.iso 2>/dev/null || true
rm -rf $(obj)/isoimage
$(call cmd,genimage,isoimage,$(obj)/image.iso)
@$(kecho) 'Kernel: $(obj)/image.iso is ready'

bzlilo: $(obj)/bzImage
if [ -f $(INSTALL_PATH)/vmlinuz ]; then mv $(INSTALL_PATH)/vmlinuz $(INSTALL_PATH)/vmlinuz.old; fi
1 change: 1 addition & 0 deletions arch/x86/boot/compressed/Makefile
@@ -78,6 +78,7 @@ vmlinux-objs-$(CONFIG_EARLY_PRINTK) += $(obj)/early_serial_console.o
vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr.o
ifdef CONFIG_X86_64
vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/pagetable.o
vmlinux-objs-y += $(obj)/mem_encrypt.o
endif

$(obj)/eboot.o: KBUILD_CFLAGS += -fshort-wchar -mno-red-zone
16 changes: 16 additions & 0 deletions arch/x86/boot/compressed/head_64.S
@@ -131,6 +131,19 @@ ENTRY(startup_32)
/*
* Build early 4G boot pagetable
*/
/*
* If SEV is active then set the encryption mask in the page tables.
* This will ensure that when the kernel is copied and decompressed
* it will be done so encrypted.
*/
call get_sev_encryption_bit
xorl %edx, %edx
testl %eax, %eax
jz 1f
subl $32, %eax /* Encryption bit is always above bit 31 */
bts %eax, %edx /* Set encryption mask for page tables */
1:

/* Initialize Page tables to 0 */
leal pgtable(%ebx), %edi
xorl %eax, %eax
@@ -141,12 +154,14 @@ ENTRY(startup_32)
leal pgtable + 0(%ebx), %edi
leal 0x1007 (%edi), %eax
movl %eax, 0(%edi)
addl %edx, 4(%edi)

/* Build Level 3 */
leal pgtable + 0x1000(%ebx), %edi
leal 0x1007(%edi), %eax
movl $4, %ecx
1: movl %eax, 0x00(%edi)
addl %edx, 0x04(%edi)
addl $0x00001000, %eax
addl $8, %edi
decl %ecx
@@ -157,6 +172,7 @@ ENTRY(startup_32)
movl $0x00000183, %eax
movl $2048, %ecx
1: movl %eax, 0(%edi)
addl %edx, 4(%edi)
addl $0x00200000, %eax
addl $8, %edi
decl %ecx
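
To make the %edx arithmetic above concrete: the SME/SEV encryption bit
position is reported by CPUID 0x8000001f EBX[5:0] and is always above bit 31
(typically bit 47 on current parts), so this 32-bit startup code only needs to
OR the upper half of the mask into the high dword of each 64-bit page-table
entry. A rough C equivalent of that mask computation (illustrative only, not
from the patch):

    #include <stdint.h>

    /* Value the assembly keeps in %edx: the upper 32 bits of the
     * page-table encryption mask, or 0 when SEV is not active. */
    static uint32_t sev_pgtable_mask_high(unsigned int c_bit)
    {
        uint64_t mask = 1ULL << c_bit;    /* e.g. 1 << 47 */

        return (uint32_t)(mask >> 32);
    }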
