Skip to content

Commit

Permalink
Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linux
Browse files Browse the repository at this point in the history
* 'for-2.6.27' of git://linux-nfs.org/~bfields/linux: (51 commits)
  nfsd: nfs4xdr.c do-while is not a compound statement
  nfsd: Use C99 initializers in fs/nfsd/nfs4xdr.c
  lockd: Pass "struct sockaddr *" to new failover-by-IP function
  lockd: get host reference in nlmsvc_create_block() instead of callers
  lockd: minor svclock.c style fixes
  lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lock
  lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlock
  lockd: nlm_release_host() checks for NULL, caller needn't
  file lock: reorder struct file_lock to save space on 64 bit builds
  nfsd: take file and mnt write in nfs4_upgrade_open
  nfsd: document open share bit tracking
  nfsd: tabulate nfs4 xdr encoding functions
  nfsd: dprint operation names
  svcrdma: Change WR context get/put to use the kmem cache
  svcrdma: Create a kmem cache for the WR contexts
  svcrdma: Add flush_scheduled_work to module exit function
  svcrdma: Limit ORD based on client's advertised IRD
  svcrdma: Remove unused wait q from svcrdma_xprt structure
  svcrdma: Remove unneeded spin locks from __svc_rdma_free
  svcrdma: Add dma map count and WARN_ON
  ...
  • Loading branch information
torvalds committed Jul 21, 2008
2 parents 734b397 + 5108b27 commit 14b395e
Show file tree
Hide file tree
Showing 36 changed files with 1,042 additions and 966 deletions.
103 changes: 59 additions & 44 deletions Documentation/filesystems/nfs-rdma.txt
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
################################################################################

Author: NetApp and Open Grid Computing
Date: April 15, 2008
Date: May 29, 2008

Table of Contents
~~~~~~~~~~~~~~~~~
Expand Down Expand Up @@ -60,39 +60,52 @@ Installation
The procedures described in this document have been tested with
distributions from Red Hat's Fedora Project (http://fedora.redhat.com/).

- Install nfs-utils-1.1.1 or greater on the client
- Install nfs-utils-1.1.2 or greater on the client

An NFS/RDMA mount point can only be obtained by using the mount.nfs
command in nfs-utils-1.1.1 or greater. To see which version of mount.nfs
you are using, type:
An NFS/RDMA mount point can be obtained by using the mount.nfs command in
nfs-utils-1.1.2 or greater (nfs-utils-1.1.1 was the first nfs-utils
version with support for NFS/RDMA mounts, but for various reasons we
recommend using nfs-utils-1.1.2 or greater). To see which version of
mount.nfs you are using, type:

> /sbin/mount.nfs -V
$ /sbin/mount.nfs -V

If the version is less than 1.1.1 or the command does not exist,
then you will need to install the latest version of nfs-utils.
If the version is less than 1.1.2 or the command does not exist,
you should install the latest version of nfs-utils.

Download the latest package from:

http://www.kernel.org/pub/linux/utils/nfs

Uncompress the package and follow the installation instructions.

If you will not be using GSS and NFSv4, the installation process
can be simplified by disabling these features when running configure:
If you will not need the idmapper and gssd executables (you do not need
these to create an NFS/RDMA enabled mount command), the installation
process can be simplified by disabling these features when running
configure:

> ./configure --disable-gss --disable-nfsv4
$ ./configure --disable-gss --disable-nfsv4

For more information on this see the package's README and INSTALL files.
To build nfs-utils you will need the tcp_wrappers package installed. For
more information on this see the package's README and INSTALL files.

After building the nfs-utils package, there will be a mount.nfs binary in
the utils/mount directory. This binary can be used to initiate NFS v2, v3,
or v4 mounts. To initiate a v4 mount, the binary must be called mount.nfs4.
The standard technique is to create a symlink called mount.nfs4 to mount.nfs.
or v4 mounts. To initiate a v4 mount, the binary must be called
mount.nfs4. The standard technique is to create a symlink called
mount.nfs4 to mount.nfs.

NOTE: mount.nfs and therefore nfs-utils-1.1.1 or greater is only needed
This mount.nfs binary should be installed at /sbin/mount.nfs as follows:

$ sudo cp utils/mount/mount.nfs /sbin/mount.nfs

In this location, mount.nfs will be invoked automatically for NFS mounts
by the system mount commmand.

NOTE: mount.nfs and therefore nfs-utils-1.1.2 or greater is only needed
on the NFS client machine. You do not need this specific version of
nfs-utils on the server. Furthermore, only the mount.nfs command from
nfs-utils-1.1.1 is needed on the client.
nfs-utils-1.1.2 is needed on the client.

- Install a Linux kernel with NFS/RDMA

Expand Down Expand Up @@ -156,8 +169,8 @@ Check RDMA and NFS Setup
this time. For example, if you are using a Mellanox Tavor/Sinai/Arbel
card:

> modprobe ib_mthca
> modprobe ib_ipoib
$ modprobe ib_mthca
$ modprobe ib_ipoib

If you are using InfiniBand, make sure there is a Subnet Manager (SM)
running on the network. If your IB switch has an embedded SM, you can
Expand All @@ -166,18 +179,18 @@ Check RDMA and NFS Setup

If an SM is running on your network, you should see the following:

> cat /sys/class/infiniband/driverX/ports/1/state
$ cat /sys/class/infiniband/driverX/ports/1/state
4: ACTIVE

where driverX is mthca0, ipath5, ehca3, etc.

To further test the InfiniBand software stack, use IPoIB (this
assumes you have two IB hosts named host1 and host2):

host1> ifconfig ib0 a.b.c.x
host2> ifconfig ib0 a.b.c.y
host1> ping a.b.c.y
host2> ping a.b.c.x
host1$ ifconfig ib0 a.b.c.x
host2$ ifconfig ib0 a.b.c.y
host1$ ping a.b.c.y
host2$ ping a.b.c.x

For other device types, follow the appropriate procedures.

Expand All @@ -202,55 +215,57 @@ NFS/RDMA Setup
/vol0 192.168.0.47(fsid=0,rw,async,insecure,no_root_squash)
/vol0 192.168.0.0/255.255.255.0(fsid=0,rw,async,insecure,no_root_squash)

The IP address(es) is(are) the client's IPoIB address for an InfiniBand HCA or the
cleint's iWARP address(es) for an RNIC.
The IP address(es) is(are) the client's IPoIB address for an InfiniBand
HCA or the cleint's iWARP address(es) for an RNIC.

NOTE: The "insecure" option must be used because the NFS/RDMA client does not
use a reserved port.
NOTE: The "insecure" option must be used because the NFS/RDMA client does
not use a reserved port.

Each time a machine boots:

- Load and configure the RDMA drivers

For InfiniBand using a Mellanox adapter:

> modprobe ib_mthca
> modprobe ib_ipoib
> ifconfig ib0 a.b.c.d
$ modprobe ib_mthca
$ modprobe ib_ipoib
$ ifconfig ib0 a.b.c.d

NOTE: use unique addresses for the client and server

- Start the NFS server

If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in kernel config),
load the RDMA transport module:
If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in
kernel config), load the RDMA transport module:

> modprobe svcrdma
$ modprobe svcrdma

Regardless of how the server was built (module or built-in), start the server:
Regardless of how the server was built (module or built-in), start the
server:

> /etc/init.d/nfs start
$ /etc/init.d/nfs start

or

> service nfs start
$ service nfs start

Instruct the server to listen on the RDMA transport:

> echo rdma 2050 > /proc/fs/nfsd/portlist
$ echo rdma 2050 > /proc/fs/nfsd/portlist

- On the client system

If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in kernel config),
load the RDMA client module:
If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in
kernel config), load the RDMA client module:

> modprobe xprtrdma.ko
$ modprobe xprtrdma.ko

Regardless of how the client was built (module or built-in), issue the mount.nfs command:
Regardless of how the client was built (module or built-in), use this
command to mount the NFS/RDMA server:

> /path/to/your/mount.nfs <IPoIB-server-name-or-address>:/<export> /mnt -i -o rdma,port=2050
$ mount -o rdma,port=2050 <IPoIB-server-name-or-address>:/<export> /mnt

To verify that the mount is using RDMA, run "cat /proc/mounts" and check the
"proto" field for the given mount.
To verify that the mount is using RDMA, run "cat /proc/mounts" and check
the "proto" field for the given mount.

Congratulations! You're using NFS/RDMA!
33 changes: 13 additions & 20 deletions fs/lockd/svc.c
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ EXPORT_SYMBOL(nlmsvc_ops);
static DEFINE_MUTEX(nlmsvc_mutex);
static unsigned int nlmsvc_users;
static struct task_struct *nlmsvc_task;
static struct svc_serv *nlmsvc_serv;
static struct svc_rqst *nlmsvc_rqst;
int nlmsvc_grace_period;
unsigned long nlmsvc_timeout;

Expand Down Expand Up @@ -194,20 +194,11 @@ lockd(void *vrqstp)

svc_process(rqstp);
}

flush_signals(current);
if (nlmsvc_ops)
nlmsvc_invalidate_all();
nlm_shutdown_hosts();

unlock_kernel();

nlmsvc_task = NULL;
nlmsvc_serv = NULL;

/* Exit the RPC thread */
svc_exit_thread(rqstp);

return 0;
}

Expand Down Expand Up @@ -254,16 +245,15 @@ int
lockd_up(int proto) /* Maybe add a 'family' option when IPv6 is supported ?? */
{
struct svc_serv *serv;
struct svc_rqst *rqstp;
int error = 0;

mutex_lock(&nlmsvc_mutex);
/*
* Check whether we're already up and running.
*/
if (nlmsvc_serv) {
if (nlmsvc_rqst) {
if (proto)
error = make_socks(nlmsvc_serv, proto);
error = make_socks(nlmsvc_rqst->rq_server, proto);
goto out;
}

Expand All @@ -288,26 +278,26 @@ lockd_up(int proto) /* Maybe add a 'family' option when IPv6 is supported ?? */
/*
* Create the kernel thread and wait for it to start.
*/
rqstp = svc_prepare_thread(serv, &serv->sv_pools[0]);
if (IS_ERR(rqstp)) {
error = PTR_ERR(rqstp);
nlmsvc_rqst = svc_prepare_thread(serv, &serv->sv_pools[0]);
if (IS_ERR(nlmsvc_rqst)) {
error = PTR_ERR(nlmsvc_rqst);
nlmsvc_rqst = NULL;
printk(KERN_WARNING
"lockd_up: svc_rqst allocation failed, error=%d\n",
error);
goto destroy_and_out;
}

svc_sock_update_bufs(serv);
nlmsvc_serv = rqstp->rq_server;

nlmsvc_task = kthread_run(lockd, rqstp, serv->sv_name);
nlmsvc_task = kthread_run(lockd, nlmsvc_rqst, serv->sv_name);
if (IS_ERR(nlmsvc_task)) {
error = PTR_ERR(nlmsvc_task);
svc_exit_thread(nlmsvc_rqst);
nlmsvc_task = NULL;
nlmsvc_serv = NULL;
nlmsvc_rqst = NULL;
printk(KERN_WARNING
"lockd_up: kthread_run failed, error=%d\n", error);
svc_exit_thread(rqstp);
goto destroy_and_out;
}

Expand Down Expand Up @@ -346,6 +336,9 @@ lockd_down(void)
BUG();
}
kthread_stop(nlmsvc_task);
svc_exit_thread(nlmsvc_rqst);
nlmsvc_task = NULL;
nlmsvc_rqst = NULL;
out:
mutex_unlock(&nlmsvc_mutex);
}
Expand Down
7 changes: 3 additions & 4 deletions fs/lockd/svc4proc.c
Original file line number Diff line number Diff line change
Expand Up @@ -58,8 +58,7 @@ nlm4svc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
return 0;

no_locks:
if (host)
nlm_release_host(host);
nlm_release_host(host);
if (error)
return error;
return nlm_lck_denied_nolocks;
Expand Down Expand Up @@ -100,7 +99,7 @@ nlm4svc_proc_test(struct svc_rqst *rqstp, struct nlm_args *argp,
return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;

/* Now check for conflicting locks */
resp->status = nlmsvc_testlock(rqstp, file, &argp->lock, &resp->lock, &resp->cookie);
resp->status = nlmsvc_testlock(rqstp, file, host, &argp->lock, &resp->lock, &resp->cookie);
if (resp->status == nlm_drop_reply)
rc = rpc_drop_reply;
else
Expand Down Expand Up @@ -146,7 +145,7 @@ nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_args *argp,
#endif

/* Now try to lock the file */
resp->status = nlmsvc_lock(rqstp, file, &argp->lock,
resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
argp->block, &argp->cookie);
if (resp->status == nlm_drop_reply)
rc = rpc_drop_reply;
Expand Down
Loading

0 comments on commit 14b395e

Please sign in to comment.