docs: accounting: convert to ReST
Rename the accounting documentation files to ReST, add an
index for them, and adjust them to produce nice HTML
output via the Sphinx build system.

In the new index.rst, add :orphan: while it is not linked from
the main index.rst file, to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab <[email protected]>
mchehab committed Jul 15, 2019
1 parent a36d053 commit c312355
Showing 8 changed files with 140 additions and 93 deletions.
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
==================
Control Groupstats
==================

Control Groupstats is inspired by the discussion at
http://lkml.org/lkml/2007/4/11/187 and implements per cgroup statistics as
suggested by Andrew Morton in http://lkml.org/lkml/2007/4/11/263.
@@ -19,9 +23,9 @@ about tasks blocked on I/O. If CONFIG_TASK_DELAY_ACCT is disabled, this
information will not be available.

To extract cgroup statistics a utility very similar to getdelays.c
has been developed, the sample output of the utility is shown below
has been developed, the sample output of the utility is shown below::

~/balbir/cgroupstats # ./getdelays -C "/sys/fs/cgroup/a"
sleeping 1, blocked 0, running 1, stopped 0, uninterruptible 0
~/balbir/cgroupstats # ./getdelays -C "/sys/fs/cgroup"
sleeping 155, blocked 0, running 1, stopped 0, uninterruptible 2
~/balbir/cgroupstats # ./getdelays -C "/sys/fs/cgroup/a"
sleeping 1, blocked 0, running 1, stopped 0, uninterruptible 0
~/balbir/cgroupstats # ./getdelays -C "/sys/fs/cgroup"
sleeping 155, blocked 0, running 1, stopped 0, uninterruptible 2
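The per-cgroup summary line shown above is easy to post-process. The following is a minimal Python sketch, assuming the output keeps the comma-separated "name count" layout shown in the sample; the helper name parse_cgroupstats is hypothetical and not part of getdelays.

```python
def parse_cgroupstats(line):
    """Parse one 'name count, name count, ...' line into a dict of ints.

    Assumes the layout of the sample output above; this helper is
    illustrative only and is not part of the getdelays utility.
    """
    counts = {}
    for field in line.split(","):
        name, value = field.split()
        counts[name] = int(value)
    return counts


# Sample line copied from the output above.
sample = "sleeping 155, blocked 0, running 1, stopped 0, uninterruptible 2"
stats = parse_cgroupstats(sample)
total_tasks = sum(stats.values())
print(stats["sleeping"], total_tasks)
```

Keeping the result as a plain dict makes it straightforward to compare two cgroups or to sum the task states.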
@@ -1,5 +1,6 @@
================
Delay accounting
----------------
================

Tasks encounter delays in execution when they wait
for some kernel resource to become available e.g. a
@@ -39,7 +40,9 @@ in detail in a separate document in this directory. Taskstats returns a
generic data structure to userspace corresponding to per-pid and per-tgid
statistics. The delay accounting functionality populates specific fields of
this structure. See

include/linux/taskstats.h

for a description of the fields pertaining to delay accounting.
It will generally be in the form of counters returning the cumulative
delay seen for cpu, sync block I/O, swapin, memory reclaim etc.
@@ -61,13 +64,16 @@ also serves as an example of using the taskstats interface.
Usage
-----

Compile the kernel with
Compile the kernel with::

CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASKSTATS=y

Delay accounting is enabled by default at boot up.
To disable, add
To disable, add::

nodelayacct

to the kernel boot options. The rest of the instructions
below assume this has not been done.

@@ -78,40 +84,43 @@ The utility also allows a given command to be
executed and the corresponding delays to be
seen.

General format of the getdelays command
General format of the getdelays command::

getdelays [-t tgid] [-p pid] [-c cmd...]
getdelays [-t tgid] [-p pid] [-c cmd...]


Get delays, since system boot, for pid 10
# ./getdelays -p 10
(output similar to next case)
Get delays, since system boot, for pid 10::

Get sum of delays, since system boot, for all pids with tgid 5
# ./getdelays -t 5
# ./getdelays -p 10
(output similar to next case)

Get sum of delays, since system boot, for all pids with tgid 5::

CPU count real total virtual total delay total
7876 92005750 100000000 24001500
IO count delay total
0 0
SWAP count delay total
0 0
RECLAIM count delay total
0 0
# ./getdelays -t 5


CPU count real total virtual total delay total
7876 92005750 100000000 24001500
IO count delay total
0 0
SWAP count delay total
0 0
RECLAIM count delay total
0 0

Get delays seen in executing a given simple command::

Get delays seen in executing a given simple command
# ./getdelays -c ls /
# ./getdelays -c ls /

bin data1 data3 data5 dev home media opt root srv sys usr
boot data2 data4 data6 etc lib mnt proc sbin subdomain tmp var
bin data1 data3 data5 dev home media opt root srv sys usr
boot data2 data4 data6 etc lib mnt proc sbin subdomain tmp var


CPU count real total virtual total delay total
CPU count real total virtual total delay total
6 4000250 4000000 0
IO count delay total
IO count delay total
0 0
SWAP count delay total
SWAP count delay total
0 0
RECLAIM count delay total
RECLAIM count delay total
0 0
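The getdelays tables above can be reduced to a count and a cumulative delay per resource. The following is a rough Python sketch, assuming each table is a header row that starts with an upper-case resource name, followed by one row of integers whose first value is the count and whose last value is the delay total. The function names are hypothetical, and the delay is left in whatever unit the tool reports.

```python
def parse_getdelays(text):
    """Return {resource: (count, delay_total)} from getdelays-style output.

    Assumes the table layout shown in the samples above; rows that are
    neither a header nor all-integer values (e.g. command output) are skipped.
    """
    result = {}
    resource = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].isalpha() and tokens[0].isupper():
            resource = tokens[0]  # header row, e.g. "CPU count ..."
        elif resource is not None and all(t.isdigit() for t in tokens):
            result[resource] = (int(tokens[0]), int(tokens[-1]))
            resource = None
    return result


def average_delay(count, delay_total):
    """Mean delay per event, or 0.0 when no events were recorded."""
    return delay_total / count if count else 0.0


# Sample copied from the "./getdelays -t 5" output above.
SAMPLE = """\
CPU count real total virtual total delay total
7876 92005750 100000000 24001500
IO count delay total
0 0
SWAP count delay total
0 0
RECLAIM count delay total
0 0
"""

delays = parse_getdelays(SAMPLE)
for name, (count, total) in delays.items():
    print(name, count, total, average_delay(count, total))
```

Taking only the first and last column avoids having to split multi-word column names such as "real total".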
14 changes: 14 additions & 0 deletions Documentation/accounting/index.rst
@@ -0,0 +1,14 @@
:orphan:

==========
Accounting
==========

.. toctree::
:maxdepth: 1

cgroupstats
delay-accounting
psi
taskstats
taskstats-struct
@@ -35,14 +35,14 @@ Pressure interface
Pressure information for each resource is exported through the
respective file in /proc/pressure/ -- cpu, memory, and io.

The format for CPU is as such:
The format for CPU is as such::

some avg10=0.00 avg60=0.00 avg300=0.00 total=0
some avg10=0.00 avg60=0.00 avg300=0.00 total=0

and for memory and IO:
and for memory and IO::

some avg10=0.00 avg60=0.00 avg300=0.00 total=0
full avg10=0.00 avg60=0.00 avg300=0.00 total=0
some avg10=0.00 avg60=0.00 avg300=0.00 total=0
full avg10=0.00 avg60=0.00 avg300=0.00 total=0

The "some" line indicates the share of time in which at least some
tasks are stalled on a given resource.
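The line format above maps naturally onto a nested dictionary. The following is a minimal Python sketch, assuming the key=value layout shown in the samples; the helper name parse_pressure is hypothetical, and the text is passed in as a string so the sketch does not depend on /proc/pressure/ being present.

```python
def parse_pressure(text):
    """Parse the content of a /proc/pressure/<resource> file.

    Returns e.g. {"some": {"avg10": 0.0, ..., "total": 0}, "full": {...}}.
    Assumes the 'kind key=value ...' layout shown above.
    """
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        kind, *pairs = line.split()
        fields = dict(pair.split("=", 1) for pair in pairs)
        result[kind] = {
            # "total" is an integer counter; the averages are decimals.
            key: int(val) if key == "total" else float(val)
            for key, val in fields.items()
        }
    return result


# Sample copied from the memory/IO format above.
SAMPLE = (
    "some avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
)

pressure = parse_pressure(SAMPLE)
print(sorted(pressure), pressure["some"]["avg10"])
```

On a system with PSI enabled, the same function could be fed the text read from one of the files under /proc/pressure/.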
@@ -77,9 +77,9 @@ To register a trigger user has to open psi interface file under
/proc/pressure/ representing the resource to be monitored and write the
desired threshold and time window. The open file descriptor should be
used to wait for trigger events using select(), poll() or epoll().
The following format is used:
The following format is used::

<some|full> <stall amount in us> <time window in us>
<some|full> <stall amount in us> <time window in us>

For example writing "some 150000 1000000" into /proc/pressure/memory
would add 150ms threshold for partial memory stall measured within
@@ -115,18 +115,20 @@ trigger is closed.
Userspace monitor usage example
===============================

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
* Monitor memory partial stall with 1s tracking window size
* and 150ms threshold.
*/
int main() {
::

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
* Monitor memory partial stall with 1s tracking window size
* and 150ms threshold.
*/
int main() {
const char trig[] = "some 150000 1000000";
struct pollfd fds;
int n;
@@ -165,7 +167,7 @@ int main() {
}

return 0;
}
}
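The trigger string used by the C example follows the "<some|full> <stall amount in us> <time window in us>" format. The following is a small Python sketch of building such a string from millisecond values; the helper name make_trigger and its validation rules are assumptions made for illustration, not part of the psi interface. It only builds the string and does not write it to /proc/pressure/.

```python
def make_trigger(kind, stall_ms, window_ms):
    """Build a psi trigger string from millisecond values.

    The output follows "<some|full> <stall in us> <window in us>".
    The checks below are sanity checks chosen for this sketch only.
    """
    if kind not in ("some", "full"):
        raise ValueError("kind must be 'some' or 'full'")
    if stall_ms < 0 or window_ms <= 0:
        raise ValueError("stall must be >= 0 and window must be > 0")
    # Convert milliseconds to microseconds.
    return f"{kind} {stall_ms * 1000} {window_ms * 1000}"


# 150ms threshold within a 1s window, as in the C example above.
trigger = make_trigger("some", 150, 1000)
print(trigger)
```

Working in milliseconds and converting once helps avoid off-by-a-factor mistakes when typing microsecond values by hand.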

Cgroup2 interface
=================
@@ -1,5 +1,6 @@
====================
The struct taskstats
--------------------
====================

This document contains an explanation of the struct taskstats fields.

@@ -10,16 +11,24 @@ There are three different groups of fields in the struct taskstats:
the common fields and basic accounting fields are collected for
delivery at do_exit() of a task.
2) Delay accounting fields
These fields are placed between
/* Delay accounting fields start */
and
/* Delay accounting fields end */
These fields are placed between::

/* Delay accounting fields start */
and::

/* Delay accounting fields end */
Their values are collected if CONFIG_TASK_DELAY_ACCT is set.
3) Extended accounting fields
These fields are placed between
/* Extended accounting fields start */
and
/* Extended accounting fields end */
These fields are placed between::

/* Extended accounting fields start */
and::

/* Extended accounting fields end */
Their values are collected if CONFIG_TASK_XACCT is set.

4) Per-task and per-thread context switch count statistics
@@ -31,31 +40,33 @@ There are three different groups of fields in the struct taskstats:
Future extension should add fields to the end of the taskstats struct, and
should not change the relative position of each field within the struct.

::

struct taskstats {
struct taskstats {

1) Common and basic accounting fields::

1) Common and basic accounting fields:
/* The version number of this struct. This field is always set to
* TASKSTATS_VERSION, which is defined in <linux/taskstats.h>.
* Each time the struct is changed, the value should be incremented.
*/
__u16 version;

/* The exit code of a task. */
/* The exit code of a task. */
__u32 ac_exitcode; /* Exit status */

/* The accounting flags of a task as defined in <linux/acct.h>
/* The accounting flags of a task as defined in <linux/acct.h>
* Defined values are AFORK, ASU, ACOMPAT, ACORE, and AXSIG.
*/
__u8 ac_flag; /* Record flags */

/* The value of task_nice() of a task. */
/* The value of task_nice() of a task. */
__u8 ac_nice; /* task_nice */

/* The name of the command that started this task. */
/* The name of the command that started this task. */
char ac_comm[TS_COMM_LEN]; /* Command name */

/* The scheduling discipline as set in task->policy field. */
/* The scheduling discipline as set in task->policy field. */
__u8 ac_sched; /* Scheduling discipline */

__u8 ac_pad[3];
@@ -64,26 +75,27 @@ struct taskstats {
__u32 ac_pid; /* Process ID */
__u32 ac_ppid; /* Parent process ID */

/* The time when a task begins, in [secs] since 1970. */
/* The time when a task begins, in [secs] since 1970. */
__u32 ac_btime; /* Begin time [sec since 1970] */

/* The elapsed time of a task, in [usec]. */
/* The elapsed time of a task, in [usec]. */
__u64 ac_etime; /* Elapsed time [usec] */

/* The user CPU time of a task, in [usec]. */
/* The user CPU time of a task, in [usec]. */
__u64 ac_utime; /* User CPU time [usec] */

/* The system CPU time of a task, in [usec]. */
/* The system CPU time of a task, in [usec]. */
__u64 ac_stime; /* System CPU time [usec] */

/* The minor page fault count of a task, as set in task->min_flt. */
/* The minor page fault count of a task, as set in task->min_flt. */
__u64 ac_minflt; /* Minor Page Fault Count */

/* The major page fault count of a task, as set in task->maj_flt. */
__u64 ac_majflt; /* Major Page Fault Count */


2) Delay accounting fields:
2) Delay accounting fields::

/* Delay accounting fields start
*
* All values, until the comment "Delay accounting fields end" are
@@ -134,7 +146,8 @@ struct taskstats {
/* version 1 ends here */


3) Extended accounting fields
3) Extended accounting fields::

/* Extended accounting fields start */

/* Accumulated RSS usage in duration of a task, in MBytes-usecs.
@@ -145,15 +158,15 @@ struct taskstats {
*/
__u64 coremem; /* accumulated RSS usage in MB-usec */

/* Accumulated virtual memory usage in duration of a task.
/* Accumulated virtual memory usage in duration of a task.
* Same as acct_rss_mem1 above except that we keep track of VM usage.
*/
__u64 virtmem; /* accumulated VM usage in MB-usec */

/* High watermark of RSS usage in duration of a task, in KBytes. */
/* High watermark of RSS usage in duration of a task, in KBytes. */
__u64 hiwater_rss; /* High-watermark of RSS usage */

/* High watermark of VM usage in duration of a task, in KBytes. */
/* High watermark of VM usage in duration of a task, in KBytes. */
__u64 hiwater_vm; /* High-water virtual memory usage */

/* The following four fields are I/O statistics of a task. */
@@ -164,17 +177,23 @@ struct taskstats {

/* Extended accounting fields end */

4) Per-task and per-thread statistics
4) Per-task and per-thread statistics::

__u64 nvcsw; /* Context voluntary switch counter */
__u64 nivcsw; /* Context involuntary switch counter */

5) Time accounting for SMT machines
5) Time accounting for SMT machines::

__u64 ac_utimescaled; /* utime scaled on frequency etc */
__u64 ac_stimescaled; /* stime scaled on frequency etc */
__u64 cpu_scaled_run_real_total; /* scaled cpu_run_real_total */

6) Extended delay accounting fields for memory reclaim
6) Extended delay accounting fields for memory reclaim::

/* Delay waiting for memory reclaim */
__u64 freepages_count;
__u64 freepages_delay_total;
}

::

}