Skip to content

Commit

Permalink
Merge branch 'linus' into tracing/core
Browse files Browse the repository at this point in the history
Merge reason: tracing/core was on a .30-rc1 base and was missing out on
              on a handful of tracing fixes present in .30-rc5-almost.

Signed-off-by: Ingo Molnar <[email protected]>
  • Loading branch information
Ingo Molnar committed May 7, 2009
2 parents d94fc52 + 413f81e commit 44347d9
Show file tree
Hide file tree
Showing 1,603 changed files with 32,864 additions and 29,590 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,7 @@ include/linux/compile.h
include/linux/version.h
include/linux/utsrelease.h
include/linux/bounds.h
include/generated

# stgit generated dirs
patches-*
Expand Down
6 changes: 3 additions & 3 deletions Documentation/ABI/testing/debugfs-pktcdvd
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
What: /debug/pktcdvd/pktcdvd[0-7]
What: /sys/kernel/debug/pktcdvd/pktcdvd[0-7]
Date: Oct. 2006
KernelVersion: 2.6.20
Contact: Thomas Maier <[email protected]>
Expand All @@ -10,10 +10,10 @@ debugfs interface
The pktcdvd module (packet writing driver) creates
these files in debugfs:

/debug/pktcdvd/pktcdvd[0-7]/
/sys/kernel/debug/pktcdvd/pktcdvd[0-7]/
info (0444) Lots of driver statistics and infos.

Example:
-------

cat /debug/pktcdvd/pktcdvd0/info
cat /sys/kernel/debug/pktcdvd/pktcdvd0/info
8 changes: 6 additions & 2 deletions Documentation/ABI/testing/sysfs-firmware-acpi
Original file line number Diff line number Diff line change
Expand Up @@ -69,9 +69,13 @@ Description:
gpe1F: 0 invalid
gpe_all: 1192
sci: 1194
sci_not: 0

sci - The total number of times the ACPI SCI
has claimed an interrupt.
sci - The number of times the ACPI SCI
has been called and claimed an interrupt.

sci_not - The number of times the ACPI SCI
has been called and NOT claimed an interrupt.

gpe_all - count of SCI caused by GPEs.

Expand Down
5 changes: 3 additions & 2 deletions Documentation/DocBook/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -144,7 +144,8 @@ quiet_cmd_db2pdf = PDF $@
$(call cmd,db2pdf)


main_idx = Documentation/DocBook/index.html
index = index.html
main_idx = Documentation/DocBook/$(index)
build_main_index = rm -rf $(main_idx) && \
echo '<h1>Linux Kernel HTML Documentation</h1>' >> $(main_idx) && \
echo '<h2>Kernel Version: $(KERNELVERSION)</h2>' >> $(main_idx) && \
Expand Down Expand Up @@ -233,7 +234,7 @@ clean-files := $(DOCBOOKS) \
$(patsubst %.xml, %.pdf, $(DOCBOOKS)) \
$(patsubst %.xml, %.html, $(DOCBOOKS)) \
$(patsubst %.xml, %.9, $(DOCBOOKS)) \
$(C-procfs-example)
$(C-procfs-example) $(index)

clean-dirs := $(patsubst %.xml,%,$(DOCBOOKS)) man

Expand Down
6 changes: 5 additions & 1 deletion Documentation/DocBook/kernel-api.tmpl
Original file line number Diff line number Diff line change
Expand Up @@ -190,16 +190,20 @@ X!Ekernel/module.c
!Edrivers/pci/pci.c
!Edrivers/pci/pci-driver.c
!Edrivers/pci/remove.c
!Edrivers/pci/pci-acpi.c
!Edrivers/pci/search.c
!Edrivers/pci/msi.c
!Edrivers/pci/bus.c
!Edrivers/pci/access.c
!Edrivers/pci/irq.c
!Edrivers/pci/htirq.c
<!-- FIXME: Removed for now since no structured comments in source
X!Edrivers/pci/hotplug.c
-->
!Edrivers/pci/probe.c
!Edrivers/pci/slot.c
!Edrivers/pci/rom.c
!Edrivers/pci/iov.c
!Idrivers/pci/pci-sysfs.c
</sect1>
<sect1><title>PCI Hotplug Support Library</title>
!Edrivers/pci/hotplug/pci_hotplug_core.c
Expand Down
19 changes: 6 additions & 13 deletions Documentation/block/biodoc.txt
Original file line number Diff line number Diff line change
Expand Up @@ -1040,23 +1040,21 @@ Front merges are handled by the binary trees in AS and deadline schedulers.
iii. Plugging the queue to batch requests in anticipation of opportunities for
merge/sort optimizations

This is just the same as in 2.4 so far, though per-device unplugging
support is anticipated for 2.5. Also with a priority-based i/o scheduler,
such decisions could be based on request priorities.

Plugging is an approach that the current i/o scheduling algorithm resorts to so
that it collects up enough requests in the queue to be able to take
advantage of the sorting/merging logic in the elevator. If the
queue is empty when a request comes in, then it plugs the request queue
(sort of like plugging the bottom of a vessel to get fluid to build up)
(sort of like plugging the bath tub of a vessel to get fluid to build up)
till it fills up with a few more requests, before starting to service
the requests. This provides an opportunity to merge/sort the requests before
passing them down to the device. There are various conditions when the queue is
unplugged (to open up the flow again), either through a scheduled task or
could be on demand. For example wait_on_buffer sets the unplugging going
(by running tq_disk) so the read gets satisfied soon. So in the read case,
the queue gets explicitly unplugged as part of waiting for completion,
in fact all queues get unplugged as a side-effect.
through sync_buffer() running blk_run_address_space(mapping). Or the caller
can do it explicity through blk_unplug(bdev). So in the read case,
the queue gets explicitly unplugged as part of waiting for completion on that
buffer. For page driven IO, the address space ->sync_page() takes care of
doing the blk_run_address_space().

Aside:
This is kind of controversial territory, as it's not clear if plugging is
Expand All @@ -1067,11 +1065,6 @@ Aside:
multi-page bios being queued in one shot, we may not need to wait to merge
a big request from the broken up pieces coming by.

Per-queue granularity unplugging (still a Todo) may help reduce some of the
concerns with just a single tq_disk flush approach. Something like
blk_kick_queue() to unplug a specific queue (right away ?)
or optionally, all queues, is in the plan.

4.4 I/O contexts
I/O contexts provide a dynamically allocated per process data area. They may
be used in I/O schedulers, and in the block layer (could be used for IO statis,
Expand Down
55 changes: 32 additions & 23 deletions Documentation/cgroups/memory.txt
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,14 @@ used here with the memory controller that is used in hardware.

Salient features

a. Enable control of both RSS (mapped) and Page Cache (unmapped) pages
a. Enable control of Anonymous, Page Cache (mapped and unmapped) and
Swap Cache memory pages.
b. The infrastructure allows easy addition of other types of memory to control
c. Provides *zero overhead* for non memory controller users
d. Provides a double LRU: global memory pressure causes reclaim from the
global LRU; a cgroup on hitting a limit, reclaims from the per
cgroup LRU

NOTE: Swap Cache (unmapped) is not accounted now.

Benefits and Purpose of the memory controller

The memory controller isolates the memory behaviour of a group of tasks
Expand Down Expand Up @@ -290,34 +289,44 @@ will be charged as a new owner of it.
moved to the parent. If you want to avoid that, force_empty will be useful.

5.2 stat file
memory.stat file includes following statistics (now)
cache - # of pages from page-cache and shmem.
rss - # of pages from anonymous memory.
pgpgin - # of event of charging
pgpgout - # of event of uncharging
active_anon - # of pages on active lru of anon, shmem.
inactive_anon - # of pages on active lru of anon, shmem
active_file - # of pages on active lru of file-cache
inactive_file - # of pages on inactive lru of file cache
unevictable - # of pages cannot be reclaimed.(mlocked etc)

Below is depend on CONFIG_DEBUG_VM.
inactive_ratio - VM internal parameter. (see mm/page_alloc.c)
recent_rotated_anon - VM internal parameter. (see mm/vmscan.c)
recent_rotated_file - VM internal parameter. (see mm/vmscan.c)
recent_scanned_anon - VM internal parameter. (see mm/vmscan.c)
recent_scanned_file - VM internal parameter. (see mm/vmscan.c)

Memo:

memory.stat file includes following statistics

cache - # of bytes of page cache memory.
rss - # of bytes of anonymous and swap cache memory.
pgpgin - # of pages paged in (equivalent to # of charging events).
pgpgout - # of pages paged out (equivalent to # of uncharging events).
active_anon - # of bytes of anonymous and swap cache memory on active
lru list.
inactive_anon - # of bytes of anonymous memory and swap cache memory on
inactive lru list.
active_file - # of bytes of file-backed memory on active lru list.
inactive_file - # of bytes of file-backed memory on inactive lru list.
unevictable - # of bytes of memory that cannot be reclaimed (mlocked etc).

The following additional stats are dependent on CONFIG_DEBUG_VM.

inactive_ratio - VM internal parameter. (see mm/page_alloc.c)
recent_rotated_anon - VM internal parameter. (see mm/vmscan.c)
recent_rotated_file - VM internal parameter. (see mm/vmscan.c)
recent_scanned_anon - VM internal parameter. (see mm/vmscan.c)
recent_scanned_file - VM internal parameter. (see mm/vmscan.c)

Memo:
recent_rotated means recent frequency of lru rotation.
recent_scanned means recent # of scans to lru.
showing for better debug please see the code for meanings.

Note:
Only anonymous and swap cache memory is listed as part of 'rss' stat.
This should not be confused with the true 'resident set size' or the
amount of physical memory used by the cgroup. Per-cgroup rss
accounting is not done yet.

5.3 swappiness
Similar to /proc/sys/vm/swappiness, but affecting a hierarchy of groups only.

Following cgroup's swapiness can't be changed.
Following cgroups' swapiness can't be changed.
- root cgroup (uses /proc/sys/vm/swappiness).
- a cgroup which uses hierarchy and it has child cgroup.
- a cgroup which uses hierarchy and not the root of hierarchy.
Expand Down
27 changes: 21 additions & 6 deletions Documentation/cgroups/resource_counter.txt
Original file line number Diff line number Diff line change
Expand Up @@ -47,13 +47,18 @@ to work with it.

2. Basic accounting routines

a. void res_counter_init(struct res_counter *rc)
a. void res_counter_init(struct res_counter *rc,
struct res_counter *rc_parent)

Initializes the resource counter. As usual, should be the first
routine called for a new counter.

b. int res_counter_charge[_locked]
(struct res_counter *rc, unsigned long val)
The struct res_counter *parent can be used to define a hierarchical
child -> parent relationship directly in the res_counter structure,
NULL can be used to define no relationship.

c. int res_counter_charge(struct res_counter *rc, unsigned long val,
struct res_counter **limit_fail_at)

When a resource is about to be allocated it has to be accounted
with the appropriate resource counter (controller should determine
Expand All @@ -67,15 +72,25 @@ to work with it.
* if the charging is performed first, then it should be uncharged
on error path (if the one is called).

c. void res_counter_uncharge[_locked]
If the charging fails and a hierarchical dependency exists, the
limit_fail_at parameter is set to the particular res_counter element
where the charging failed.

d. int res_counter_charge_locked
(struct res_counter *rc, unsigned long val)

The same as res_counter_charge(), but it must not acquire/release the
res_counter->lock internally (it must be called with res_counter->lock
held).

e. void res_counter_uncharge[_locked]
(struct res_counter *rc, unsigned long val)

When a resource is released (freed) it should be de-accounted
from the resource counter it was accounted to. This is called
"uncharging".

The _locked routines imply that the res_counter->lock is taken.

The _locked routines imply that the res_counter->lock is taken.

2.1 Other accounting routines

Expand Down
59 changes: 59 additions & 0 deletions Documentation/driver-model/platform.txt
Original file line number Diff line number Diff line change
Expand Up @@ -169,3 +169,62 @@ three different ways to find such a match:
be probed later if another device registers. (Which is OK, since
this interface is only for use with non-hotpluggable devices.)


Early Platform Devices and Drivers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The early platform interfaces provide platform data to platform device
drivers early on during the system boot. The code is built on top of the
early_param() command line parsing and can be executed very early on.

Example: "earlyprintk" class early serial console in 6 steps

1. Registering early platform device data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The architecture code registers platform device data using the function
early_platform_add_devices(). In the case of early serial console this
should be hardware configuration for the serial port. Devices registered
at this point will later on be matched against early platform drivers.

2. Parsing kernel command line
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The architecture code calls parse_early_param() to parse the kernel
command line. This will execute all matching early_param() callbacks.
User specified early platform devices will be registered at this point.
For the early serial console case the user can specify port on the
kernel command line as "earlyprintk=serial.0" where "earlyprintk" is
the class string, "serial" is the name of the platfrom driver and
0 is the platform device id. If the id is -1 then the dot and the
id can be omitted.

3. Installing early platform drivers belonging to a certain class
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The architecture code may optionally force registration of all early
platform drivers belonging to a certain class using the function
early_platform_driver_register_all(). User specified devices from
step 2 have priority over these. This step is omitted by the serial
driver example since the early serial driver code should be disabled
unless the user has specified port on the kernel command line.

4. Early platform driver registration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Compiled-in platform drivers making use of early_platform_init() are
automatically registered during step 2 or 3. The serial driver example
should use early_platform_init("earlyprintk", &platform_driver).

5. Probing of early platform drivers belonging to a certain class
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The architecture code calls early_platform_driver_probe() to match
registered early platform devices associated with a certain class with
registered early platform drivers. Matched devices will get probed().
This step can be executed at any point during the early boot. As soon
as possible may be good for the serial port case.

6. Inside the early platform driver probe()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The driver code needs to take special care during early boot, especially
when it comes to memory allocation and interrupt registration. The code
in the probe() function can use is_early_platform_device() to check if
it is called at early platform device or at the regular platform device
time. The early serial driver performs register_console() at this point.

For further information, see <linux/platform_device.h>.
24 changes: 16 additions & 8 deletions Documentation/filesystems/Locking
Original file line number Diff line number Diff line change
Expand Up @@ -512,16 +512,24 @@ locking rules:
BKL mmap_sem PageLocked(page)
open: no yes
close: no yes
fault: no yes
page_mkwrite: no yes no
fault: no yes can return with page locked
page_mkwrite: no yes can return with page locked
access: no yes

->page_mkwrite() is called when a previously read-only page is
about to become writeable. The file system is responsible for
protecting against truncate races. Once appropriate action has been
taking to lock out truncate, the page range should be verified to be
within i_size. The page mapping should also be checked that it is not
NULL.
->fault() is called when a previously not present pte is about
to be faulted in. The filesystem must find and return the page associated
with the passed in "pgoff" in the vm_fault structure. If it is possible that
the page may be truncated and/or invalidated, then the filesystem must lock
the page, then ensure it is not already truncated (the page lock will block
subsequent truncate), and then return with VM_FAULT_LOCKED, and the page
locked. The VM will unlock the page.

->page_mkwrite() is called when a previously read-only pte is
about to become writeable. The filesystem again must ensure that there are
no truncate/invalidate races, and then return with the page locked. If
the page has been truncated, the filesystem should not look up a new page
like the ->fault() handler, but simply return with VM_FAULT_NOPAGE, which
will cause the VM to retry the fault.

->access() is called when get_user_pages() fails in
acces_process_vm(), typically used to debug a process through
Expand Down
8 changes: 4 additions & 4 deletions Documentation/filesystems/caching/cachefiles.txt
Original file line number Diff line number Diff line change
Expand Up @@ -407,7 +407,7 @@ A NOTE ON SECURITY
==================

CacheFiles makes use of the split security in the task_struct. It allocates
its own task_security structure, and redirects current->act_as to point to it
its own task_security structure, and redirects current->cred to point to it
when it acts on behalf of another process, in that process's context.

The reason it does this is that it calls vfs_mkdir() and suchlike rather than
Expand All @@ -429,9 +429,9 @@ This means it may lose signals or ptrace events for example, and affects what
the process looks like in /proc.

So CacheFiles makes use of a logical split in the security between the
objective security (task->sec) and the subjective security (task->act_as). The
objective security holds the intrinsic security properties of a process and is
never overridden. This is what appears in /proc, and is what is used when a
objective security (task->real_cred) and the subjective security (task->cred).
The objective security holds the intrinsic security properties of a process and
is never overridden. This is what appears in /proc, and is what is used when a
process is the target of an operation by some other process (SIGKILL for
example).

Expand Down
5 changes: 3 additions & 2 deletions Documentation/filesystems/pohmelfs/design_notes.txt
Original file line number Diff line number Diff line change
Expand Up @@ -56,9 +56,10 @@ workloads and can fully utilize the bandwidth to the servers when doing bulk
data transfers.

POHMELFS clients operate with a working set of servers and are capable of balancing read-only
operations (like lookups or directory listings) between them.
operations (like lookups or directory listings) between them according to IO priorities.
Administrators can add or remove servers from the set at run-time via special commands (described
in Documentation/pohmelfs/info.txt file). Writes are replicated to all servers.
in Documentation/pohmelfs/info.txt file). Writes are replicated to all servers, which are connected
with write permission turned on. IO priority and permissions can be changed in run-time.

POHMELFS is capable of full data channel encryption and/or strong crypto hashing.
One can select any kernel supported cipher, encryption mode, hash type and operation mode
Expand Down
Loading

0 comments on commit 44347d9

Please sign in to comment.