Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6

* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (94 commits)
  [PATCH] x86-64: Remove mk_pte_phys()
  [PATCH] i386: Fix broken CONFIG_COMPAT_VDSO on i386
  [PATCH] i386: fix 32-bit ioctls on x64_32
  [PATCH] x86: Unify pcspeaker platform device code between i386/x86-64
  [PATCH] i386: Remove extern declaration from mm/discontig.c, put in header.
  [PATCH] i386: Rename cpu_gdt_descr and remove extern declaration from smpboot.c
  [PATCH] i386: Move mce_disabled to asm/mce.h
  [PATCH] i386: paravirt unhandled fallthrough
  [PATCH] x86_64: Wire up compat epoll_pwait
  [PATCH] x86: Don't require the vDSO for handling a.out signals
  [PATCH] i386: Fix Cyrix MediaGX detection
  [PATCH] i386: Fix warning in cpu initialization
  [PATCH] i386: Fix warning in microcode.c
  [PATCH] x86: Enable NMI watchdog for AMD Family 0x10 CPUs
  [PATCH] x86: Add new CPUID bits for AMD Family 10 CPUs in /proc/cpuinfo
  [PATCH] i386: Remove fastcall in paravirt.[ch]
  [PATCH] x86-64: Fix wrong gcc check in bitops.h
  [PATCH] x86-64: survive having no irq mapping for a vector
  [PATCH] i386: geode configuration fixes
  [PATCH] i386: add option to show more code in oops reports
  ...
Linus Torvalds committed Feb 14, 2007
2 parents 86a71db + 126b192 commit 414f827
Showing 137 changed files with 4,113 additions and 1,103 deletions.
8 changes: 8 additions & 0 deletions Documentation/kernel-parameters.txt
@@ -104,6 +104,9 @@ loader, and have no meaning to the kernel directly.
Do not modify the syntax of boot loader parameters without extreme
need or coordination with <Documentation/i386/boot.txt>.

There are also arch-specific kernel-parameters not documented here.
See for example <Documentation/x86_64/boot-options.txt>.

Note that ALL kernel parameters listed below are CASE SENSITIVE, and that
a trailing = on the name of any parameter states that that parameter will
be entered as an environment variable, whereas its absence indicates that
@@ -361,6 +364,11 @@ and is between 256 and 4096 characters. It is defined in the file
clocksource is not available, it defaults to PIT.
Format: { pit | tsc | cyclone | pmtmr }

code_bytes [IA32] How many bytes of object code to print in an
oops report.
Range: 0 - 8192
Default: 64

disable_8254_timer
enable_8254_timer
[IA32/X86_64] Disable/Enable interrupt 0 timer routing
132 changes: 83 additions & 49 deletions Documentation/x86_64/boot-options.txt
@@ -180,40 +180,81 @@ PCI
pci=lastbus=NUMBER Scan upto NUMBER busses, no matter what the mptable says.
pci=noacpi Don't use ACPI to set up PCI interrupt routing.

IOMMU

iommu=[size][,noagp][,off][,force][,noforce][,leak][,memaper[=order]][,merge]
[,forcesac][,fullflush][,nomerge][,noaperture][,calgary]
size set size of iommu (in bytes)
noagp don't initialize the AGP driver and use full aperture.
off don't use the IOMMU
leak turn on simple iommu leak tracing (only when CONFIG_IOMMU_LEAK is on)
memaper[=order] allocate an own aperture over RAM with size 32MB^order.
noforce don't force IOMMU usage. Default.
force Force IOMMU.
merge Do SG merging. Implies force (experimental)
nomerge Don't do SG merging.
forcesac For SAC mode for masks <40bits (experimental)
fullflush Flush IOMMU on each allocation (default)
nofullflush Don't use IOMMU fullflush
allowed overwrite iommu off workarounds for specific chipsets.
soft Use software bounce buffering (default for Intel machines)
noaperture Don't touch the aperture for AGP.
allowdac Allow DMA >4GB
When off all DMA over >4GB is forced through an IOMMU or bounce
buffering.
nodac Forbid DMA >4GB
panic Always panic when IOMMU overflows
calgary Use the Calgary IOMMU if it is available

swiotlb=pages[,force]

pages Prereserve that many 128K pages for the software IO bounce buffering.
force Force all IO through the software TLB.

calgary=[64k,128k,256k,512k,1M,2M,4M,8M]
calgary=[translate_empty_slots]
calgary=[disable=<PCI bus number>]
IOMMU (input/output memory management unit)

Currently four x86-64 PCI-DMA mapping implementations exist:

1. <arch/x86_64/kernel/pci-nommu.c>: use no hardware/software IOMMU at all
(e.g. because you have < 3 GB memory).
Kernel boot message: "PCI-DMA: Disabling IOMMU"

2. <arch/x86_64/kernel/pci-gart.c>: AMD GART based hardware IOMMU.
Kernel boot message: "PCI-DMA: using GART IOMMU"

3. <arch/x86_64/kernel/pci-swiotlb.c> : Software IOMMU implementation. Used
e.g. if there is no hardware IOMMU in the system and it is needed because
you have >3GB memory or told the kernel to use it (iommu=soft).
Kernel boot message: "PCI-DMA: Using software bounce buffering
for IO (SWIOTLB)"

4. <arch/x86_64/kernel/pci-calgary.c> : IBM Calgary hardware IOMMU. Used in IBM
pSeries and xSeries servers. This hardware IOMMU supports DMA address
mapping with memory protection, etc.
Kernel boot message: "PCI-DMA: Using Calgary IOMMU"

iommu=[<size>][,noagp][,off][,force][,noforce][,leak[=<nr_of_leak_pages>]]
[,memaper[=<order>]][,merge][,forcesac][,fullflush][,nomerge]
[,noaperture][,calgary]

General iommu options:
off Don't initialize and use any kind of IOMMU.
noforce Don't force hardware IOMMU usage when it is not needed.
(default).
force Force the use of the hardware IOMMU even when it is
not actually needed (e.g. because < 3 GB memory).
soft Use software bounce buffering (SWIOTLB) (default for
Intel machines). This can be used to prevent the usage
of an available hardware IOMMU.

iommu options only relevant to the AMD GART hardware IOMMU:
<size> Set the size of the remapping area in bytes.
allowed Overwrite iommu off workarounds for specific chipsets.
fullflush Flush IOMMU on each allocation (default).
nofullflush Don't use IOMMU fullflush.
leak Turn on simple iommu leak tracing (only when
CONFIG_IOMMU_LEAK is on). Default number of leak pages
is 20.
memaper[=<order>] Allocate an own aperture over RAM with size 32MB<<order.
(default: order=1, i.e. 64MB)
merge Do scatter-gather (SG) merging. Implies "force"
(experimental).
nomerge Don't do scatter-gather (SG) merging.
noaperture Ask the IOMMU not to touch the aperture for AGP.
forcesac Force single-address cycle (SAC) mode for masks <40bits
(experimental).
noagp Don't initialize the AGP driver and use full aperture.
allowdac Allow double-address cycle (DAC) mode, i.e. DMA >4GB.
DAC is used with 32-bit PCI to push a 64-bit address in
two cycles. When off all DMA over >4GB is forced through
an IOMMU or software bounce buffering.
nodac Forbid DAC mode, i.e. DMA >4GB.
panic Always panic when IOMMU overflows.
calgary Use the Calgary IOMMU if it is available
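
The memaper arithmetic in the GART options can be checked with a short Python sketch. It assumes only what the text states, namely that the aperture size is 32MB shifted left by the order; the function name is hypothetical and is not a kernel interface.

```python
MB = 1024 * 1024


def memaper_size_bytes(order: int = 1) -> int:
    """Aperture size for iommu=memaper=<order>, using the documented 32MB << order.

    The default order of 1 mirrors the documented default (64MB).
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    return (32 * MB) << order


if __name__ == "__main__":
    # Print the aperture size for the first few orders.
    for order in range(4):
        print(f"memaper={order}: {memaper_size_bytes(order) // MB} MB")
```

Each increment of the order doubles the aperture, so the option is a compact way to express power-of-two sizes starting at 32MB.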

iommu options only relevant to the software bounce buffering (SWIOTLB) IOMMU
implementation:
swiotlb=<pages>[,force]
<pages> Prereserve that many 128K pages for the software IO
bounce buffering.
force Force all IO through the software TLB.

Settings for the IBM Calgary hardware IOMMU currently found in IBM
pSeries and xSeries machines:

calgary=[64k,128k,256k,512k,1M,2M,4M,8M]
calgary=[translate_empty_slots]
calgary=[disable=<PCI bus number>]
panic Always panic when IOMMU overflows

64k,...,8M - Set the size of each PCI slot's translation table
when using the Calgary IOMMU. This is the size of the translation
@@ -234,14 +275,14 @@ IOMMU

Debugging

oops=panic Always panic on oopses. Default is to just kill the process,
but there is a small probability of deadlocking the machine.
This will also cause panics on machine check exceptions.
Useful together with panic=30 to trigger a reboot.
oops=panic Always panic on oopses. Default is to just kill the process,
but there is a small probability of deadlocking the machine.
This will also cause panics on machine check exceptions.
Useful together with panic=30 to trigger a reboot.

kstack=N Print that many words from the kernel stack in oops dumps.
kstack=N Print N words from the kernel stack in oops dumps.

pagefaulttrace Dump all page faults. Only useful for extreme debugging
pagefaulttrace Dump all page faults. Only useful for extreme debugging
and will create a lot of output.

call_trace=[old|both|newfallback|new]
@@ -251,15 +292,8 @@ Debugging
newfallback: use new unwinder but fall back to old if it gets
stuck (default)

call_trace=[old|both|newfallback|new]
old: use old inexact backtracer
new: use new exact dwarf2 unwinder
both: print entries from both
newfallback: use new unwinder but fall back to old if it gets
stuck (default)

Misc
Miscellaneous

noreplacement Don't replace instructions with more appropriate ones
for the CPU. This may be useful on asymmetric MP systems
where some CPU have less capabilities than the others.
where some CPUs have fewer capabilities than others.
2 changes: 1 addition & 1 deletion Documentation/x86_64/cpu-hotplug-spec
@@ -2,7 +2,7 @@ Firmware support for CPU hotplug under Linux/x86-64
---------------------------------------------------

Linux/x86-64 supports CPU hotplug now. For various reasons Linux wants to
know in advance boot time the maximum number of CPUs that could be plugged
know in advance of boot time the maximum number of CPUs that could be plugged
into the system. ACPI 3.0 currently has no official way to supply
this information from the firmware to the operating system.

26 changes: 13 additions & 13 deletions Documentation/x86_64/kernel-stacks
@@ -9,9 +9,9 @@ zombie. While the thread is in user space the kernel stack is empty
except for the thread_info structure at the bottom.

In addition to the per thread stacks, there are specialized stacks
associated with each cpu. These stacks are only used while the kernel
is in control on that cpu, when a cpu returns to user space the
specialized stacks contain no useful data. The main cpu stacks is
associated with each CPU. These stacks are only used while the kernel
is in control on that CPU; when a CPU returns to user space the
specialized stacks contain no useful data. The main CPU stacks are:

* Interrupt stack. IRQSTACKSIZE

@@ -32,17 +32,17 @@ x86_64 also has a feature which is not available on i386, the ability
to automatically switch to a new stack for designated events such as
double fault or NMI, which makes it easier to handle these unusual
events on x86_64. This feature is called the Interrupt Stack Table
(IST). There can be up to 7 IST entries per cpu. The IST code is an
index into the Task State Segment (TSS), the IST entries in the TSS
point to dedicated stacks, each stack can be a different size.
(IST). There can be up to 7 IST entries per CPU. The IST code is an
index into the Task State Segment (TSS). The IST entries in the TSS
point to dedicated stacks; each stack can be a different size.

An IST is selected by an non-zero value in the IST field of an
An IST is selected by a non-zero value in the IST field of an
interrupt-gate descriptor. When an interrupt occurs and the hardware
loads such a descriptor, the hardware automatically sets the new stack
pointer based on the IST value, then invokes the interrupt handler. If
software wants to allow nested IST interrupts then the handler must
adjust the IST values on entry to and exit from the interrupt handler.
(this is occasionally done, e.g. for debug exceptions)
(This is occasionally done, e.g. for debug exceptions.)
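
The stack selection described in this paragraph can be modelled in a few lines of Python. This is a simplified sketch of the rule in the text, not kernel or hardware code: the function and parameter names are hypothetical, and the stack used when the IST field is zero is simply passed in as default_sp.

```python
from typing import Sequence

IST_ENTRIES = 7  # the text allows up to 7 IST entries per CPU


def select_stack(ist_field: int, ist_table: Sequence[int], default_sp: int) -> int:
    """Return the stack pointer chosen for an interrupt-gate descriptor.

    ist_field:  IST value from the gate descriptor; 0 means no IST is requested.
    ist_table:  the IST stack pointers held in the TSS (index 0 holds IST 1).
    default_sp: stack pointer used when the gate does not request an IST.
    """
    if not 0 <= ist_field <= IST_ENTRIES:
        raise ValueError("IST field must be in the range 0..7")
    if ist_field == 0:
        return default_sp
    return ist_table[ist_field - 1]
```

Because every event with the same IST value lands on the same stack, a second event with that value would overwrite the first one's frame, which is why a handler that wants nesting has to adjust the IST values itself.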

Events with different IST codes (i.e. with different stacks) can be
nested. For example, a debug interrupt can safely be interrupted by an
@@ -58,17 +58,17 @@ The currently assigned IST stacks are :-

Used for interrupt 12 - Stack Fault Exception (#SS).

This allows to recover from invalid stack segments. Rarely
This allows the CPU to recover from invalid stack segments. Rarely
happens.

* DOUBLEFAULT_STACK. EXCEPTION_STKSZ (PAGE_SIZE).

Used for interrupt 8 - Double Fault Exception (#DF).

Invoked when handling a exception causes another exception. Happens
when the kernel is very confused (e.g. kernel stack pointer corrupt)
Using a separate stack allows to recover from it well enough in many
cases to still output an oops.
Invoked when handling one exception causes another exception. Happens
when the kernel is very confused (e.g. kernel stack pointer corrupt).
Using a separate stack allows the kernel to recover from it well enough
in many cases to still output an oops.

* NMI_STACK. EXCEPTION_STKSZ (PAGE_SIZE).

70 changes: 70 additions & 0 deletions Documentation/x86_64/machinecheck
@@ -0,0 +1,70 @@

Configurable sysfs parameters for the x86-64 machine check code.

Machine checks report internal hardware error conditions detected
by the CPU. Uncorrected errors typically cause a machine check
(often with panic), corrected ones cause a machine check log entry.

Machine checks are organized in banks (normally associated with
a hardware subsystem) and subevents in a bank. The exact meaning
of the banks and subevent is CPU specific.

mcelog knows how to decode them.

When you see the "Machine check errors logged" message in the system
log then mcelog should run to collect and decode machine check entries
from /dev/mcelog. Normally mcelog should be run regularly from a cronjob.

Each CPU has a directory in /sys/devices/system/machinecheck/machinecheckN
(N = CPU number)

The directory contains some configurable entries:

Entries:

bankNctl
(N bank number)
64bit Hex bitmask enabling/disabling specific subevents for bank N
When a bit in the bitmask is zero then the respective
subevent will not be reported.
By default all events are enabled.
Note that the BIOS maintains another mask to disable specific events
per bank. This is not visible here.

The following entries appear for each CPU, but they are truly shared
between all CPUs.

check_interval
How often to poll for corrected machine check errors, in seconds
(Note output is hexadecimal). Default 5 minutes.

tolerant
Tolerance level. When a machine check exception occurs for an
uncorrected machine check the kernel can take different actions.
Since machine check exceptions can happen any time it is sometimes
risky for the kernel to kill a process because it defies
normal kernel locking rules. The tolerance level configures
how hard the kernel tries to recover even at some risk of deadlock.

0: always panic,
1: panic if deadlock possible,
2: try to avoid panic,
3: never panic or exit (for testing only)

Default: 1

Note this only makes a difference if the CPU allows recovery
from a machine check exception. Current x86 CPUs generally do not.

trigger
Program to run when a machine check event is detected.
This is an alternative to running mcelog regularly from cron
and allows events to be detected faster.
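
The entries above are plain text values, so a script can interpret them with a couple of helpers. The following Python sketch assumes the file contents are bare hexadecimal strings (with or without a 0x prefix), which the text implies but does not spell out; the helper names are hypothetical and the sketch deliberately does not touch sysfs itself.

```python
# Meaning of each tolerant level, copied from the list above.
TOLERANT_LEVELS = {
    0: "always panic",
    1: "panic if deadlock possible",
    2: "try to avoid panic",
    3: "never panic or exit (for testing only)",
}


def parse_check_interval(text: str) -> int:
    """Convert the hexadecimal check_interval value to seconds."""
    return int(text.strip(), 16)


def subevent_enabled(bank_ctl: str, bit: int) -> bool:
    """Return True if subevent `bit` is set in a 64-bit bankNctl hex bitmask."""
    if not 0 <= bit < 64:
        raise ValueError("bit must be in the range 0..63")
    return bool((int(bank_ctl.strip(), 16) >> bit) & 1)


if __name__ == "__main__":
    # 0x12c is 300 seconds, i.e. the documented 5 minute default.
    print(parse_check_interval("12c"), "seconds")
    print(subevent_enabled("ffffffffffffffff", 5))
    print(TOLERANT_LEVELS[1])
```

In practice the strings would be read from the per-CPU directory named earlier, for example /sys/devices/system/machinecheck/machinecheck0/check_interval.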

TBD document entries for AMD threshold interrupt configuration

For more details about the x86 machine check architecture
see the Intel and AMD architecture manuals from their developer websites.

For more details about the architecture
see http://one.firstfloor.org/~andi/mce.pdf
22 changes: 11 additions & 11 deletions Documentation/x86_64/mm.txt
@@ -3,26 +3,26 @@

Virtual memory map with 4 level page tables:

0000000000000000 - 00007fffffffffff (=47bits) user space, different per mm
0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm
hole caused by [48:63] sign extension
ffff800000000000 - ffff80ffffffffff (=40bits) guard hole
ffff810000000000 - ffffc0ffffffffff (=46bits) direct mapping of all phys. memory
ffffc10000000000 - ffffc1ffffffffff (=40bits) hole
ffffc20000000000 - ffffe1ffffffffff (=45bits) vmalloc/ioremap space
ffff800000000000 - ffff80ffffffffff (=40 bits) guard hole
ffff810000000000 - ffffc0ffffffffff (=46 bits) direct mapping of all phys. memory
ffffc10000000000 - ffffc1ffffffffff (=40 bits) hole
ffffc20000000000 - ffffe1ffffffffff (=45 bits) vmalloc/ioremap space
... unused hole ...
ffffffff80000000 - ffffffff82800000 (=40MB) kernel text mapping, from phys 0
ffffffff80000000 - ffffffff82800000 (=40 MB) kernel text mapping, from phys 0
... unused hole ...
ffffffff88000000 - fffffffffff00000 (=1919MB) module mapping space
ffffffff88000000 - fffffffffff00000 (=1919 MB) module mapping space

The direct mapping covers all memory in the system upto the highest
The direct mapping covers all memory in the system up to the highest
memory address (this means in some cases it can also include PCI memory
holes)
holes).

vmalloc space is lazily synchronized into the different PML4 pages of
the processes using the page fault handler, with init_level4_pgt as
reference.

Current X86-64 implementations only support 40 bit of address space,
but we support upto 46bits. This expands into MBZ space in the page tables.
Current X86-64 implementations only support 40 bits of address space,
but we support up to 46 bits. This expands into MBZ space in the page tables.

-Andi Kleen, Jul 2004
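
The sizes annotated in the memory map can be re-derived from the address ranges. The Python sketch below does that arithmetic; the addresses and annotations are taken from the table above, while the helper names are hypothetical. It treats the bit-sized regions as inclusive ranges and the two MB-sized regions as end-exclusive, which is the reading under which the annotated sizes come out exactly.

```python
# (start, end inclusive, annotated bits, description) from the map above.
BIT_REGIONS = [
    (0x0000000000000000, 0x00007FFFFFFFFFFF, 47, "user space"),
    (0xFFFF800000000000, 0xFFFF80FFFFFFFFFF, 40, "guard hole"),
    (0xFFFF810000000000, 0xFFFFC0FFFFFFFFFF, 46, "direct mapping"),
    (0xFFFFC10000000000, 0xFFFFC1FFFFFFFFFF, 40, "hole"),
    (0xFFFFC20000000000, 0xFFFFE1FFFFFFFFFF, 45, "vmalloc/ioremap space"),
]


def region_bits(start: int, end: int) -> int:
    """Number of address bits spanned by the inclusive range [start, end]."""
    size = end - start + 1
    if size & (size - 1):
        raise ValueError("region size is not a power of two")
    return size.bit_length() - 1


def span_mb(start: int, end: int) -> int:
    """Size in MB of the end-exclusive range [start, end)."""
    return (end - start) // (1024 * 1024)


if __name__ == "__main__":
    for start, end, bits, name in BIT_REGIONS:
        print(f"{name}: {region_bits(start, end)} bits (annotated {bits})")
    print("kernel text:", span_mb(0xFFFFFFFF80000000, 0xFFFFFFFF82800000), "MB")
    print("modules:", span_mb(0xFFFFFFFF88000000, 0xFFFFFFFFFFF00000), "MB")
```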
1 change: 1 addition & 0 deletions MAINTAINERS
@@ -3779,6 +3779,7 @@ P: Andi Kleen
M: [email protected]
L: [email protected]
W: http://www.x86-64.org
T: quilt ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt-current
S: Maintained

YAM DRIVER FOR AX.25
18 changes: 18 additions & 0 deletions arch/i386/Kconfig
@@ -203,6 +203,15 @@ config PARAVIRT
However, when run without a hypervisor the kernel is
theoretically slower. If in doubt, say N.

config VMI
bool "VMI Paravirt-ops support"
depends on PARAVIRT
default y
help
VMI provides a paravirtualized interface to multiple hypervisors,
including VMware ESX server and Xen, by connecting to a ROM module
provided by the hypervisor.

config ACPI_SRAT
bool
default y
@@ -1263,3 +1272,12 @@ config X86_TRAMPOLINE
config KTIME_SCALAR
bool
default y

config NO_IDLE_HZ
bool
depends on PARAVIRT
default y
help
Switches the regular HZ timer off when the system is going idle.
This helps a hypervisor detect that the Linux system is idle,
reducing the overhead of idle systems.