Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "A smaller cycle this time. Notably we see another new driver, 'Soft
  iWarp', and the deletion of an ancient unused driver for nes.

   - Revise and simplify the signature offload RDMA MR APIs

   - More progress on hoisting object allocation boilerplate code out
     of the drivers

   - Driver bug fixes and revisions for hns, hfi1, efa, cxgb4, qib,
     i40iw

   - Tree wide cleanups: struct_size, put_user_page, xarray, rst doc
     conversion

   - Removal of obsolete ib_ucm chardev and nes driver

   - netlink based discovery of chardevs and autoloading of the modules
     providing them

   - Move more of the rdmavt/hfi1 uapi to include/uapi/rdma

   - New driver 'siw' for software based iWarp running on top of netdev,
     much like rxe's software RoCE.

   - mlx5 feature to report events in their raw devx format to userspace

   - Expose per-object counters through rdma tool

   - Adaptive interrupt moderation for RDMA (DIM), sharing the DIM core
     from netdev"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (194 commits)
  RMDA/siw: Require a 64 bit arch
  RDMA/siw: Mark expected switch fall-throughs
  RDMA/core: Fix -Wunused-const-variable warnings
  rdma/siw: Remove set but not used variable 's'
  rdma/siw: Add missing dependencies on LIBCRC32C and DMA_VIRT_OPS
  RDMA/siw: Add missing rtnl_lock around access to ifa
  rdma/siw: Use proper enumerated type in map_cqe_status
  RDMA/siw: Remove unnecessary kthread create/destroy printouts
  IB/rdmavt: Fix variable shadowing issue in rvt_create_cq
  RDMA/core: Fix race when resolving IP address
  RDMA/core: Make rdma_counter.h compile stand alone
  IB/core: Work on the caller socket net namespace in nldev_newlink()
  RDMA/rxe: Fill in wc byte_len with IB_WC_RECV_RDMA_WITH_IMM
  RDMA/mlx5: Set RDMA DIM to be enabled by default
  RDMA/nldev: Added configuration of RDMA dynamic interrupt moderation to netlink
  RDMA/core: Provide RDMA DIM support for ULPs
  linux/dim: Implement RDMA adaptive moderation (DIM)
  IB/mlx5: Report correctly tag matching rendezvous capability
  docs: infiniband: add it to the driver-api bookset
  IB/mlx5: Implement VHCA tunnel mechanism in DEVX
  ...
torvalds committed Jul 16, 2019
2 parents 8de2625 + 0b04364 commit 2a3c389
Showing 221 changed files with 18,855 additions and 24,841 deletions.
17 changes: 0 additions & 17 deletions Documentation/ABI/stable/sysfs-class-infiniband
@@ -423,23 +423,6 @@ Description:
(e.g. driver restart on the VM which owns the VF).


sysfs interface for NetEffect RNIC Low-Level iWARP driver (nes)
---------------------------------------------------------------

What: /sys/class/infiniband/nesX/hw_rev
What: /sys/class/infiniband/nesX/hca_type
What: /sys/class/infiniband/nesX/board_id
Date: Feb, 2008
KernelVersion: v2.6.25
Contact: [email protected]
Description:
hw_rev: (RO) Hardware revision number

hca_type: (RO) Host Channel Adapter type (NEX020)

board_id: (RO) Manufacturing board id


sysfs interface for Chelsio T4/T5 RDMA driver (cxgb4)
-----------------------------------------------------

1 change: 1 addition & 0 deletions Documentation/index.rst
@@ -90,6 +90,7 @@ needed).

driver-api/index
core-api/index
infiniband/index
media/index
networking/index
input/index
@@ -1,50 +1,54 @@
INFINIBAND MIDLAYER LOCKING
===========================
InfiniBand Midlayer Locking
===========================

This guide is an attempt to make explicit the locking assumptions
made by the InfiniBand midlayer. It describes the requirements on
both low-level drivers that sit below the midlayer and upper level
protocols that use the midlayer.

Sleeping and interrupt context
==============================

With the following exceptions, a low-level driver implementation of
all of the methods in struct ib_device may sleep. The exceptions
are any methods from the list:

create_ah
modify_ah
query_ah
destroy_ah
post_send
post_recv
poll_cq
req_notify_cq
map_phys_fmr
- create_ah
- modify_ah
- query_ah
- destroy_ah
- post_send
- post_recv
- poll_cq
- req_notify_cq
- map_phys_fmr

which may not sleep and must be callable from any context.

The corresponding functions exported to upper level protocol
consumers:

ib_create_ah
ib_modify_ah
ib_query_ah
ib_destroy_ah
ib_post_send
ib_post_recv
ib_req_notify_cq
ib_map_phys_fmr
- ib_create_ah
- ib_modify_ah
- ib_query_ah
- ib_destroy_ah
- ib_post_send
- ib_post_recv
- ib_req_notify_cq
- ib_map_phys_fmr

are therefore safe to call from any context.

In addition, the function

ib_dispatch_event
- ib_dispatch_event

used by low-level drivers to dispatch asynchronous events through
the midlayer is also safe to call from any context.

Reentrancy
----------

All of the methods in struct ib_device exported by a low-level
driver must be fully reentrant. The low-level driver is required to
@@ -62,6 +66,7 @@ Reentrancy
information between different calls of ib_poll_cq() is not defined.

Callbacks
---------

A low-level driver must not perform a callback directly from the
same callchain as an ib_device method call. For example, it is not
@@ -74,25 +79,26 @@ Callbacks
completion event handlers for the same CQ are not called
simultaneously. The driver must guarantee that only one CQ event
handler for a given CQ is running at a time. In other words, the
following situation is not allowed:
following situation is not allowed::

        CPU1                                    CPU2
        CPU1                                    CPU2

  low-level driver ->
    consumer CQ event callback:
      /* ... */
      ib_req_notify_cq(cq, ...);
                                          low-level driver ->
      /* ... */                             consumer CQ event callback:
                                              /* ... */
      return from CQ event handler
  low-level driver ->
    consumer CQ event callback:
      /* ... */
      ib_req_notify_cq(cq, ...);
                                          low-level driver ->
      /* ... */                             consumer CQ event callback:
                                              /* ... */
      return from CQ event handler

The context in which completion event and asynchronous event
callbacks run is not defined. Depending on the low-level driver, it
may be process context, softirq context, or interrupt context.
Upper level protocol consumers may not sleep in a callback.

Hot-plug
--------

A low-level driver announces that a device is ready for use by
consumers when it calls ib_register_device(), all initialization
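The Callbacks hunk above requires that only one CQ event handler for a given CQ runs at a time. The following is a minimal userspace sketch of one way such serialization could be modelled, not code from this commit or from any driver. `struct cq_model`, `cq_model_init()`, `cq_model_dispatch()` and `reraise_once_handler()` are hypothetical names invented for illustration; a real driver would rely on its own interrupt or event-queue design.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of a CQ; this is NOT the kernel's struct ib_cq. */
struct cq_model {
    atomic_bool handler_running;  /* true while a handler is executing */
    atomic_bool event_pending;    /* an event arrived and still needs handling */
    void (*comp_handler)(struct cq_model *cq);
    int depth;      /* handlers currently executing (instrumentation only) */
    int max_depth;  /* highest depth seen; stays 1 if handlers are serialized */
    int calls;      /* total handler invocations */
};

static void cq_model_init(struct cq_model *cq,
                          void (*handler)(struct cq_model *cq))
{
    atomic_init(&cq->handler_running, false);
    atomic_init(&cq->event_pending, false);
    cq->comp_handler = handler;
    cq->depth = 0;
    cq->max_depth = 0;
    cq->calls = 0;
}

/*
 * Record an event and run the handler unless one is already running for
 * this CQ. A caller that loses the race returns immediately; the current
 * runner notices the pending flag and invokes the handler again itself.
 */
static void cq_model_dispatch(struct cq_model *cq)
{
    atomic_store(&cq->event_pending, true);

    while (atomic_load(&cq->event_pending)) {
        if (atomic_exchange(&cq->handler_running, true))
            return;  /* another context owns the handler for this CQ */

        while (atomic_exchange(&cq->event_pending, false)) {
            cq->calls++;
            if (++cq->depth > cq->max_depth)
                cq->max_depth = cq->depth;
            cq->comp_handler(cq);
            cq->depth--;
        }

        atomic_store(&cq->handler_running, false);
        /* Loop again: an event may have arrived just before the release. */
    }
}

/* Demo handler: raises one more event from inside its first invocation. */
static void reraise_once_handler(struct cq_model *cq)
{
    if (cq->calls == 1)
        cq_model_dispatch(cq);
}
```

In this model, an event raised while the handler is running is deferred rather than handled in a nested call, so `max_depth` never exceeds 1. The counters are plain integers because they are only touched by whichever context currently holds `handler_running`.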
23 changes: 23 additions & 0 deletions Documentation/infiniband/index.rst
@@ -0,0 +1,23 @@
.. SPDX-License-Identifier: GPL-2.0
==========
InfiniBand
==========

.. toctree::
:maxdepth: 1

core_locking
ipoib
opa_vnic
sysfs
tag_matching
user_mad
user_verbs

.. only:: subproject and html

Indices
=======

* :ref:`genindex`
@@ -1,4 +1,6 @@
IP OVER INFINIBAND
==================
IP over InfiniBand
==================

The ib_ipoib driver is an implementation of the IP over InfiniBand
protocol as specified by RFC 4391 and 4392, issued by the IETF ipoib
@@ -8,16 +10,17 @@ IP OVER INFINIBAND
masqueraded to the kernel as ethernet interfaces).

Partitions and P_Keys
=====================

When the IPoIB driver is loaded, it creates one interface for each
port using the P_Key at index 0. To create an interface with a
different P_Key, write the desired P_Key into the main interface's
/sys/class/net/<intf name>/create_child file. For example:
/sys/class/net/<intf name>/create_child file. For example::

echo 0x8001 > /sys/class/net/ib0/create_child

This will create an interface named ib0.8001 with P_Key 0x8001. To
remove a subinterface, use the "delete_child" file:
remove a subinterface, use the "delete_child" file::

echo 0x8001 > /sys/class/net/ib0/delete_child

@@ -28,6 +31,7 @@ Partitions and P_Keys
rtnl_link_ops, where children created using either way behave the same.

Datagram vs Connected modes
===========================

The IPoIB driver supports two modes of operation: datagram and
connected. The mode is set and read through an interface's
@@ -51,6 +55,7 @@ Datagram vs Connected modes
networking stack to use the smaller UD MTU for these neighbours.

Stateless offloads
==================

If the IB HW supports IPoIB stateless offloads, IPoIB advertises
TCP/IP checksum and/or Large Send (LSO) offloading capability to the
@@ -60,9 +65,10 @@ Stateless offloads
on/off using ethtool calls. Currently LRO is supported only for
checksum offload capable devices.

Stateless offloads are supported only in datagram mode.
Stateless offloads are supported only in datagram mode.

Interrupt moderation
====================

If the underlying IB device supports CQ event moderation, one can
use ethtool to set interrupt mitigation parameters and thus reduce
@@ -71,6 +77,7 @@ Interrupt moderation
moderation is supported.

Debugging Information
=====================

By compiling the IPoIB driver with CONFIG_INFINIBAND_IPOIB_DEBUG set
to 'y', tracing messages are compiled into the driver. They are
@@ -79,7 +86,7 @@ Debugging Information
runtime through files in /sys/module/ib_ipoib/.

CONFIG_INFINIBAND_IPOIB_DEBUG also enables files in the debugfs
virtual filesystem. By mounting this filesystem, for example with
virtual filesystem. By mounting this filesystem, for example with::

mount -t debugfs none /sys/kernel/debug

@@ -96,10 +103,13 @@ Debugging Information
performance, because it adds tests to the fast path.

References
==========

Transmission of IP over InfiniBand (IPoIB) (RFC 4391)
http://ietf.org/rfc/rfc4391.txt
http://ietf.org/rfc/rfc4391.txt

IP over InfiniBand (IPoIB) Architecture (RFC 4392)
http://ietf.org/rfc/rfc4392.txt
http://ietf.org/rfc/rfc4392.txt

IP over InfiniBand: Connected Mode (RFC 4755)
http://ietf.org/rfc/rfc4755.txt
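
The "Partitions and P_Keys" hunk above says that writing 0x8001 to ib0's create_child file yields an interface named ib0.8001. The sketch below shows how such a name could be derived. It assumes the name is the parent interface name, a dot, and the P_Key as four lowercase hex digits, which is what the ib0.8001 example suggests but which the text does not state. `child_intf_name()` is a hypothetical helper, not a kernel function.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: build the child interface name for a parent
 * interface and a P_Key. Assumed format: "<parent>.<pkey as 4 hex digits>".
 * Returns what snprintf() returns: the length the full name needs,
 * excluding the terminating NUL, even if buf was too small to hold it.
 */
static int child_intf_name(char *buf, size_t len,
                           const char *parent, uint16_t pkey)
{
    return snprintf(buf, len, "%s.%04x", parent, (unsigned int)pkey);
}
```

For example, `child_intf_name(buf, sizeof(buf), "ib0", 0x8001)` fills `buf` with `ib0.8001`. A caller can compare the return value against the buffer size to detect truncation.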