forked from torvalds/linux
Merge branch 'for-2.6.30' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.30' of git://linux-nfs.org/~bfields/linux: (81 commits)
  nfsd41: define nfsd4_set_statp as noop for !CONFIG_NFSD_V4
  nfsd41: define NFSD_DRC_SIZE_SHIFT in set_max_drc
  nfsd41: Documentation/filesystems/nfs41-server.txt
  nfsd41: CREATE_EXCLUSIVE4_1
  nfsd41: SUPPATTR_EXCLCREAT attribute
  nfsd41: support for 3-word long attribute bitmask
  nfsd: dynamically skip encoded fattr bitmap in _nfsd4_verify
  nfsd41: pass writable attrs mask to nfsd4_decode_fattr
  nfsd41: provide support for minor version 1 at rpc level
  nfsd41: control nfsv4.1 svc via /proc/fs/nfsd/versions
  nfsd41: add OPEN4_SHARE_ACCESS_WANT nfs4_stateid bmap
  nfsd41: access_valid
  nfsd41: clientid handling
  nfsd41: check encode size for sessions maxresponse cached
  nfsd41: stateid handling
  nfsd: pass nfsd4_compound_state* to nfs4_preprocess_{state,seq}id_op
  nfsd41: destroy_session operation
  nfsd41: non-page DRC for solo sequence responses
  nfsd41: Add a create session replay cache
  nfsd41: create_session operation
  ...
Showing 29 changed files with 2,997 additions and 555 deletions.
@@ -0,0 +1,159 @@

Kernel NFS Server Statistics
============================

This document describes the format and semantics of the statistics
which the kernel NFS server makes available to userspace. These
statistics are available in several text form pseudo files, each of
which is described separately below.

In most cases you don't need to know these formats, as the nfsstat(8)
program from the nfs-utils distribution provides a helpful command-line
interface for extracting and printing them.

All the files described here are formatted as a sequence of text lines,
separated by newline '\n' characters. Lines beginning with a hash
'#' character are comments intended for humans and should be ignored
by parsing routines. All other lines contain a sequence of fields
separated by whitespace.
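
The line format described above can be sketched as a small parser.
This is only an illustration: the function name and the sample text
are hypothetical, not part of nfs-utils or the kernel.

```python
def parse_stat_lines(text):
    """Return one list of string fields per non-comment line.

    Lines are separated by '\n'; lines starting with '#' are comments
    and are skipped; all other lines are split on whitespace.
    """
    rows = []
    for line in text.split("\n"):
        if not line.strip():
            continue  # skip empty lines, e.g. after the final newline
        if line.startswith("#"):
            continue  # comment intended for humans
        rows.append(line.split())
    return rows


# Hypothetical sample content, used only to exercise the parser.
sample = "# a comment line\n0 10 20\n1 30 40\n"
rows = parse_stat_lines(sample)
```

Fields are returned as strings; a caller that knows a file holds
unsigned decimal numbers can convert them with int().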

/proc/fs/nfsd/pool_stats
------------------------

This file is available in kernels from 2.6.30 onwards, if the
/proc/fs/nfsd filesystem is mounted (it almost always should be).

The first line is a comment which describes the fields present in
all the other lines. The other lines present the following data as
a sequence of unsigned decimal numeric fields. One line is shown
for each NFS thread pool.

All counters are 64 bits wide and wrap naturally. There is no way
to zero these counters; instead, applications should do their own
rate conversion.
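
A minimal sketch of such a rate conversion follows, assuming the
application samples a counter twice and knows the elapsed time. The
function and parameter names are hypothetical. The modulo handles a
counter that wrapped once between the two samples.

```python
COUNTER_MODULUS = 1 << 64  # counters are 64 bits wide and wrap naturally


def counter_rate(previous, current, interval_seconds):
    """Return events per second between two samples of one counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # Modular subtraction gives the right delta even across a wrap.
    delta = (current - previous) % COUNTER_MODULUS
    return delta / interval_seconds
```

For example, two samples of 100 and 160 taken 60 seconds apart give
a rate of one event per second.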

pool
    The id number of the NFS thread pool to which this line applies.
    This number does not change.

    Thread pool ids are a contiguous set of small integers starting
    at zero. The maximum value depends on the thread pool mode, but
    currently cannot be larger than the number of CPUs in the system.
    Note that in the default case there will be a single thread pool
    which contains all the nfsd threads and all the CPUs in the system,
    and thus this file will have a single line with a pool id of "0".

packets-arrived
    Counts how many NFS packets have arrived. More precisely, this
    is the number of times that the network stack has notified the
    sunrpc server layer that new data may be available on a transport
    (e.g. a TCP or UDP socket or an NFS/RDMA endpoint).

    Depending on the NFS workload patterns and various network stack
    effects (such as Large Receive Offload) which can combine packets
    on the wire, this may be either more or less than the number
    of NFS calls received (a statistic which is available elsewhere).
    However, this is a more accurate and less workload-dependent measure
    of how much CPU load is being placed on the sunrpc server layer
    due to NFS network traffic.

sockets-enqueued
    Counts how many times an NFS transport is enqueued to wait for
    an nfsd thread to service it, i.e. no nfsd thread was considered
    available.

    The circumstance this statistic tracks indicates that there was NFS
    network-facing work to be done but it couldn't be done immediately,
    thus introducing a small delay in servicing NFS calls. The ideal
    rate of change for this counter is zero; significantly non-zero
    values may indicate a performance limitation.

    This can happen either because there are too few nfsd threads in the
    thread pool for the NFS workload (the workload is thread-limited),
    or because the NFS workload needs more CPU time than is available in
    the thread pool (the workload is CPU-limited). In the former case,
    configuring more nfsd threads will probably improve the performance
    of the NFS workload. In the latter case, the sunrpc server layer is
    already choosing not to wake idle nfsd threads because there are too
    many nfsd threads which want to run but cannot, so configuring more
    nfsd threads will make no difference whatsoever. The overloads-avoided
    statistic (see below) can be used to distinguish these cases.
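
The distinction between the two cases can be sketched as a simple
classifier over per-second rates of the two counters. The function
name, the labels and the threshold of one event per second are
assumptions chosen for illustration; the text above only says that
"significantly non-zero" rates matter.

```python
def classify_pool(enqueued_rate, overloads_rate, threshold=1.0):
    """Guess why a pool is enqueueing transports.

    enqueued_rate  -- rate of change of sockets-enqueued (events/s)
    overloads_rate -- rate of change of overloads-avoided (events/s)
    threshold      -- assumed cut-off for "significantly non-zero"
    """
    if enqueued_rate < threshold:
        return "not-limited"     # work is being picked up immediately
    if overloads_rate >= threshold:
        return "cpu-limited"     # more nfsd threads will not help
    return "thread-limited"      # more nfsd threads will probably help
```

A real monitoring tool would likely want a threshold relative to the
packets-arrived rate rather than a fixed number.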

threads-woken
    Counts how many times an idle nfsd thread is woken to try to
    receive some data from an NFS transport.

    This statistic tracks the circumstance where incoming
    network-facing NFS work is being handled quickly, which is a good
    thing. The ideal rate of change for this counter will be close
    to but less than the rate of change of the packets-arrived counter.

overloads-avoided
    Counts how many times the sunrpc server layer chose not to wake an
    nfsd thread, despite the presence of idle nfsd threads, because
    too many nfsd threads had been recently woken but could not get
    enough CPU time to actually run.

    This statistic counts a circumstance where the sunrpc layer
    heuristically avoids overloading the CPU scheduler with too many
    runnable nfsd threads. The ideal rate of change for this counter
    is zero. Significant non-zero values indicate that the workload
    is CPU limited. Usually this is associated with heavy CPU usage
    on all the CPUs in the nfsd thread pool.

    If a sustained large overloads-avoided rate is detected on a pool,
    the top(1) utility should be used to check for the following
    pattern of CPU usage on all the CPUs associated with the given
    nfsd thread pool.

        - %us ~= 0 (as you're *NOT* running applications on your NFS server)

        - %wa ~= 0

        - %id ~= 0

        - %sy + %hi + %si ~= 100

    If this pattern is seen, configuring more nfsd threads will *not*
    improve the performance of the workload. If this pattern is not
    seen, then something more subtle is wrong.
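
The CPU usage pattern above can be written as a predicate over the
percentages reported by top(1) for one CPU. This is a sketch only:
the function name and the 5-point tolerance used to interpret "~="
are assumptions, and obtaining the percentages is left to the caller.

```python
def matches_cpu_limited_pattern(us, sy, hi, si, wa, idle, tolerance=5.0):
    """Return True if one CPU shows the CPU-limited nfsd pattern.

    All arguments are percentages as shown by top(1). `tolerance` is
    an assumed margin for the "approximately equal" comparisons.
    """
    def near_zero(value):
        return value <= tolerance

    kernel_time = sy + hi + si
    return (near_zero(us) and near_zero(wa) and near_zero(idle)
            and kernel_time >= 100.0 - tolerance)
```

The check would need to hold for every CPU associated with the nfsd
thread pool, not just one of them.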

threads-timedout
    Counts how many times an nfsd thread triggered an idle timeout,
    i.e. was not woken to handle any incoming network packets for
    some time.

    This statistic counts a circumstance where there are more nfsd
    threads configured than can be used by the NFS workload. This is
    a clue that the number of nfsd threads can be reduced without
    affecting performance. Unfortunately, it's only a clue and not
    a strong indication, for a couple of reasons:

     - Currently the rate at which the counter is incremented is quite
       slow; the idle timeout is 60 minutes. Unless the NFS workload
       remains constant for hours at a time, this counter is unlikely
       to be providing information that is still useful.

     - It is usually a wise policy to provide some slack,
       i.e. configure a few more nfsd threads than are currently needed,
       to allow for future spikes in load.


Note that incoming packets on NFS transports will be dealt with in
one of three ways. An nfsd thread can be woken (threads-woken counts
this case), or the transport can be enqueued for later attention
(sockets-enqueued counts this case), or the packet can be temporarily
deferred because the transport is currently being used by an nfsd
thread. This last case is not very interesting and is not explicitly
counted, but can be inferred from the other counters thus:

    packets-deferred = packets-arrived - ( sockets-enqueued + threads-woken )
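
The formula above translates directly into code. The function name
is hypothetical; the arguments are the three counter values (or their
deltas over the same sampling interval) for one pool.

```python
def packets_deferred(packets_arrived, sockets_enqueued, threads_woken):
    """Infer the uncounted 'deferred' case from the three counted ones."""
    return packets_arrived - (sockets_enqueued + threads_woken)
```

Because the counters wrap, applying this to per-interval deltas is
safer than applying it to raw values read long after boot.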


More
----
Descriptions of the other statistics files should go here.


Greg Banks <[email protected]>
26 Mar 2009
@@ -0,0 +1,161 @@
NFSv4.1 Server Implementation

Server support for minorversion 1 can be controlled using the
/proc/fs/nfsd/versions control file. The string output returned
by reading this file will contain either "+4.1" or "-4.1",
depending on whether minorversion 1 is enabled or disabled.

Currently, server support for minorversion 1 is disabled by default.
It can be enabled at run time by writing the string "+4.1" to
the /proc/fs/nfsd/versions control file. Note that to write this
control file, the nfsd service must be taken down. Use your user-mode
nfs-utils to set this up; see rpc.nfsd(8).
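
As an illustration, the string read from the control file can be
checked for the "+4.1" or "-4.1" token. The function name is
hypothetical, and the sample strings in the usage below are assumed
examples of the file's content rather than output captured from a
real server.

```python
def minorversion1_enabled(versions_text):
    """Interpret the content of /proc/fs/nfsd/versions.

    Returns True for "+4.1", False for "-4.1", and None if neither
    token is present (e.g. a kernel without NFSv4.1 support).
    """
    tokens = versions_text.split()
    if "+4.1" in tokens:
        return True
    if "-4.1" in tokens:
        return False
    return None


# Assumed sample content; the real file may list other versions.
state = minorversion1_enabled("+2 +3 +4 -4.1\n")
```

Splitting on whitespace avoids mistaking a hypothetical "+4.10" token
for "+4.1", which a plain substring test would do.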

The NFSv4 minorversion 1 (NFSv4.1) implementation in nfsd is based
on the latest NFSv4.1 Internet Draft:
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-29

Of the many new features in NFSv4.1, the current implementation
focuses on the mandatory-to-implement NFSv4.1 Sessions, providing
"exactly once" semantics and better control and throttling of the
resources allocated for each client.

Other NFSv4.1 features, Parallel NFS operations in particular,
are still under development out of tree.
See http://wiki.linux-nfs.org/wiki/index.php/PNFS_prototype_design
for more information.

The table below, taken from the NFSv4.1 document, lists
the operations that are mandatory to implement (REQ), optional
(OPT), and NFSv4.0 operations that must not be implemented (MNI)
in minor version 1. The first column indicates the operations that
are not yet supported by the Linux server implementation.

The OPTIONAL features identified and their abbreviations are as follows:
    pNFS    Parallel NFS
    FDELG   File Delegations
    DDELG   Directory Delegations

The following abbreviations indicate the Linux server implementation status.
    I       Implemented NFSv4.1 operations.
    NS      Not Supported.
    NS*     Unimplemented optional feature.
    P       pNFS features implemented out of tree.
    PNS     pNFS features that are not supported yet (out of tree).

Operations

   +----------------------+------------+--------------+----------------+
   | Operation            | REQ, REC,  | Feature      | Definition     |
   |                      | OPT, or    | (REQ, REC,   |                |
   |                      | MNI        | or OPT)      |                |
   +----------------------+------------+--------------+----------------+
   | ACCESS               | REQ        |              | Section 18.1   |
NS | BACKCHANNEL_CTL      | REQ        |              | Section 18.33  |
NS | BIND_CONN_TO_SESSION | REQ        |              | Section 18.34  |
   | CLOSE                | REQ        |              | Section 18.2   |
   | COMMIT               | REQ        |              | Section 18.3   |
   | CREATE               | REQ        |              | Section 18.4   |
I  | CREATE_SESSION       | REQ        |              | Section 18.36  |
NS*| DELEGPURGE           | OPT        | FDELG (REQ)  | Section 18.5   |
   | DELEGRETURN          | OPT        | FDELG,       | Section 18.6   |
   |                      |            | DDELG, pNFS  |                |
   |                      |            | (REQ)        |                |
NS | DESTROY_CLIENTID     | REQ        |              | Section 18.50  |
I  | DESTROY_SESSION      | REQ        |              | Section 18.37  |
I  | EXCHANGE_ID          | REQ        |              | Section 18.35  |
NS | FREE_STATEID         | REQ        |              | Section 18.38  |
   | GETATTR              | REQ        |              | Section 18.7   |
P  | GETDEVICEINFO        | OPT        | pNFS (REQ)   | Section 18.40  |
P  | GETDEVICELIST        | OPT        | pNFS (OPT)   | Section 18.41  |
   | GETFH                | REQ        |              | Section 18.8   |
NS*| GET_DIR_DELEGATION   | OPT        | DDELG (REQ)  | Section 18.39  |
P  | LAYOUTCOMMIT         | OPT        | pNFS (REQ)   | Section 18.42  |
P  | LAYOUTGET            | OPT        | pNFS (REQ)   | Section 18.43  |
P  | LAYOUTRETURN         | OPT        | pNFS (REQ)   | Section 18.44  |
   | LINK                 | OPT        |              | Section 18.9   |
   | LOCK                 | REQ        |              | Section 18.10  |
   | LOCKT                | REQ        |              | Section 18.11  |
   | LOCKU                | REQ        |              | Section 18.12  |
   | LOOKUP               | REQ        |              | Section 18.13  |
   | LOOKUPP              | REQ        |              | Section 18.14  |
   | NVERIFY              | REQ        |              | Section 18.15  |
   | OPEN                 | REQ        |              | Section 18.16  |
NS*| OPENATTR             | OPT        |              | Section 18.17  |
   | OPEN_CONFIRM         | MNI        |              | N/A            |
   | OPEN_DOWNGRADE       | REQ        |              | Section 18.18  |
   | PUTFH                | REQ        |              | Section 18.19  |
   | PUTPUBFH             | REQ        |              | Section 18.20  |
   | PUTROOTFH            | REQ        |              | Section 18.21  |
   | READ                 | REQ        |              | Section 18.22  |
   | READDIR              | REQ        |              | Section 18.23  |
   | READLINK             | OPT        |              | Section 18.24  |
NS | RECLAIM_COMPLETE     | REQ        |              | Section 18.51  |
   | RELEASE_LOCKOWNER    | MNI        |              | N/A            |
   | REMOVE               | REQ        |              | Section 18.25  |
   | RENAME               | REQ        |              | Section 18.26  |
   | RENEW                | MNI        |              | N/A            |
   | RESTOREFH            | REQ        |              | Section 18.27  |
   | SAVEFH               | REQ        |              | Section 18.28  |
   | SECINFO              | REQ        |              | Section 18.29  |
NS | SECINFO_NO_NAME      | REC        | pNFS files   | Section 18.45, |
   |                      |            | layout (REQ) | Section 13.12  |
I  | SEQUENCE             | REQ        |              | Section 18.46  |
   | SETATTR              | REQ        |              | Section 18.30  |
   | SETCLIENTID          | MNI        |              | N/A            |
   | SETCLIENTID_CONFIRM  | MNI        |              | N/A            |
NS | SET_SSV              | REQ        |              | Section 18.47  |
NS | TEST_STATEID         | REQ        |              | Section 18.48  |
   | VERIFY               | REQ        |              | Section 18.31  |
NS*| WANT_DELEGATION      | OPT        | FDELG (OPT)  | Section 18.49  |
   | WRITE                | REQ        |              | Section 18.32  |

Callback Operations

   +-------------------------+-----------+-------------+---------------+
   | Operation               | REQ, REC, | Feature     | Definition    |
   |                         | OPT, or   | (REQ, REC,  |               |
   |                         | MNI       | or OPT)     |               |
   +-------------------------+-----------+-------------+---------------+
   | CB_GETATTR              | OPT       | FDELG (REQ) | Section 20.1  |
P  | CB_LAYOUTRECALL         | OPT       | pNFS (REQ)  | Section 20.3  |
NS*| CB_NOTIFY               | OPT       | DDELG (REQ) | Section 20.4  |
P  | CB_NOTIFY_DEVICEID      | OPT       | pNFS (OPT)  | Section 20.12 |
NS*| CB_NOTIFY_LOCK          | OPT       |             | Section 20.11 |
NS*| CB_PUSH_DELEG           | OPT       | FDELG (OPT) | Section 20.5  |
   | CB_RECALL               | OPT       | FDELG,      | Section 20.2  |
   |                         |           | DDELG, pNFS |               |
   |                         |           | (REQ)       |               |
NS*| CB_RECALL_ANY           | OPT       | FDELG,      | Section 20.6  |
   |                         |           | DDELG, pNFS |               |
   |                         |           | (REQ)       |               |
NS | CB_RECALL_SLOT          | REQ       |             | Section 20.8  |
NS*| CB_RECALLABLE_OBJ_AVAIL | OPT       | DDELG, pNFS | Section 20.7  |
   |                         |           | (REQ)       |               |
I  | CB_SEQUENCE             | OPT       | FDELG,      | Section 20.9  |
   |                         |           | DDELG, pNFS |               |
   |                         |           | (REQ)       |               |
NS*| CB_WANTS_CANCELLED      | OPT       | FDELG,      | Section 20.10 |
   |                         |           | DDELG, pNFS |               |
   |                         |           | (REQ)       |               |
   +-------------------------+-----------+-------------+---------------+

Implementation notes:

EXCHANGE_ID:
* only SP4_NONE state protection supported
* implementation ids are ignored

CREATE_SESSION:
* backchannel attributes are ignored
* backchannel security parameters are ignored

SEQUENCE:
* no support for dynamic slot table renegotiation (optional)

NFSv4.1 COMPOUND rules:
The following cases aren't supported yet:
* Enforcing of NFS4ERR_NOT_ONLY_OP for: BIND_CONN_TO_SESSION, CREATE_SESSION,
  DESTROY_CLIENTID, DESTROY_SESSION, EXCHANGE_ID.
* DESTROY_SESSION MUST be the final operation in the COMPOUND request.
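
To make the two unsupported COMPOUND rules concrete, here is a
simplified sketch of what enforcing them might look like. It is not
the nfsd implementation. The function name and the returned strings
are hypothetical, and the condition that NFS4ERR_NOT_ONLY_OP applies
only when the COMPOUND does not begin with SEQUENCE is an assumption
based on a reading of the NFSv4.1 specification, not something stated
in this file.

```python
# Operations listed above as subject to NFS4ERR_NOT_ONLY_OP.
ONLY_OPS = {
    "BIND_CONN_TO_SESSION",
    "CREATE_SESSION",
    "DESTROY_CLIENTID",
    "DESTROY_SESSION",
    "EXCHANGE_ID",
}


def check_compound(ops):
    """Return a list of rule violations for a COMPOUND's operation names."""
    problems = []
    # Assumption: the "only op" rule applies to COMPOUNDs that do not
    # start with SEQUENCE.
    starts_with_sequence = bool(ops) and ops[0] == "SEQUENCE"
    if (not starts_with_sequence and len(ops) > 1
            and any(op in ONLY_OPS for op in ops)):
        problems.append("NFS4ERR_NOT_ONLY_OP")
    if "DESTROY_SESSION" in ops and ops[-1] != "DESTROY_SESSION":
        problems.append("DESTROY_SESSION is not the final operation")
    return problems
```

A real server would check these rules while decoding or dispatching
the COMPOUND and would return an NFS status code rather than a list
of strings.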