Skip to content

Commit

Permalink
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Browse files Browse the repository at this point in the history
Pull networking updates from David Miller:

 1) New offloading infrastructure and example 'rocker' driver for
    offloading of switching and routing to hardware.

    This work was done by a large group of dedicated individuals, not
    limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend,
    Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu

 2) Start making the networking operate on IOV iterators instead of
    modifying iov objects in-situ during transfers.  Thanks to Al Viro
    and Herbert Xu.

 3) A set of new netlink interfaces for the TIPC stack, from Richard
    Alpe.

 4) Remove unnecessary looping during ipv6 routing lookups, from Martin
    KaFai Lau.

 5) Add PAUSE frame generation support to gianfar driver, from Matei
    Pavaluca.

 6) Allow for larger reordering levels in TCP, which are easily
    achievable in the real world right now, from Eric Dumazet.

 7) Add a variable of napi_schedule that doesn't need to disable cpu
    interrupts, from Eric Dumazet.

 8) Use a doubly linked list to optimize neigh_parms_release(), from
    Nicolas Dichtel.

 9) Various enhancements to the kernel BPF verifier, and allow eBPF
    programs to actually be attached to sockets.  From Alexei
    Starovoitov.

10) Support TSO/LSO in sunvnet driver, from David L Stevens.

11) Allow controlling ECN usage via routing metrics, from Florian
    Westphal.

12) Remote checksum offload, from Tom Herbert.

13) Add split-header receive, BQL, and xmit_more support to amd-xgbe
    driver, from Thomas Lendacky.

14) Add MPLS support to openvswitch, from Simon Horman.

15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen
    Klassert.

16) Do gro flushes on a per-device basis using a timer, from Eric
    Dumazet.  This tries to resolve the conflicting goals between the
    desired handling of bulk vs.  RPC-like traffic.

17) Allow userspace to ask for the CPU upon what a packet was
    received/steered, via SO_INCOMING_CPU.  From Eric Dumazet.

18) Limit GSO packets to half the current congestion window, from Eric
    Dumazet.

19) Add a generic helper so that all drivers set their RSS keys in a
    consistent way, from Eric Dumazet.

20) Add xmit_more support to enic driver, from Govindarajulu
    Varadarajan.

21) Add VLAN packet scheduler action, from Jiri Pirko.

22) Support configurable RSS hash functions via ethtool, from Eyal
    Perry.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits)
  Fix race condition between vxlan_sock_add and vxlan_sock_release
  net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header
  net/mlx4: Add support for A0 steering
  net/mlx4: Refactor QUERY_PORT
  net/mlx4_core: Add explicit error message when rule doesn't meet configuration
  net/mlx4: Add A0 hybrid steering
  net/mlx4: Add mlx4_bitmap zone allocator
  net/mlx4: Add a check if there are too many reserved QPs
  net/mlx4: Change QP allocation scheme
  net/mlx4_core: Use tasklet for user-space CQ completion events
  net/mlx4_core: Mask out host side virtualization features for guests
  net/mlx4_en: Set csum level for encapsulated packets
  be2net: Export tunnel offloads only when a VxLAN tunnel is created
  gianfar: Fix dma check map error when DMA_API_DEBUG is enabled
  cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call
  net: fec: only enable mdio interrupt before phy device link up
  net: fec: clear all interrupt events to support i.MX6SX
  net: fec: reset fep link status in suspend function
  net: sock: fix access via invalid file descriptor
  net: introduce helper macro for_each_cmsghdr
  ...
  • Loading branch information
torvalds committed Dec 11, 2014
2 parents bae41e4 + 00c83b0 commit 70e71ca
Show file tree
Hide file tree
Showing 1,336 changed files with 70,846 additions and 29,177 deletions.
8 changes: 8 additions & 0 deletions Documentation/ABI/testing/sysfs-class-net
Original file line number Diff line number Diff line change
Expand Up @@ -216,3 +216,11 @@ Contact: [email protected]
Description:
Indicates the interface protocol type as a decimal value. See
include/uapi/linux/if_arp.h for all possible values.

What: /sys/class/net/<iface>/phys_switch_id
Date: November 2014
KernelVersion: 3.19
Contact: [email protected]
Description:
Indicates the unique physical switch identifier of a switch this
port belongs to, as a string.
2 changes: 1 addition & 1 deletion Documentation/Changes
Original file line number Diff line number Diff line change
Expand Up @@ -383,7 +383,7 @@ o <http://www.iptables.org/downloads.html>

Ip-route2
---------
o <ftp://ftp.tux.org/pub/net/ip-routing/iproute2-2.2.4-now-ss991023.tar.gz>
o <https://www.kernel.org/pub/linux/utils/net/iproute2/>

OProfile
--------
Expand Down
29 changes: 29 additions & 0 deletions Documentation/devicetree/bindings/btmrvl.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
btmrvl
------

Required properties:

- compatible : must be "btmrvl,cfgdata"

Optional properties:

- btmrvl,cal-data : Calibration data downloaded to the device during
initialization. This is an array of 28 values(u8).

- btmrvl,gpio-gap : gpio and gap (in msecs) combination to be
configured.

Example:

GPIO pin 13 is configured as a wakeup source and GAP is set to 100 msecs
in below example.

btmrvl {
compatible = "btmrvl,cfgdata";

btmrvl,cal-data = /bits/ 8 <
0x37 0x01 0x1c 0x00 0xff 0xff 0xff 0xff 0x01 0x7f 0x04 0x02
0x00 0x00 0xba 0xce 0xc0 0xc6 0x2d 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0xf0 0x00>;
btmrvl,gpio-gap = <0x0d64>;
};
21 changes: 21 additions & 0 deletions Documentation/devicetree/bindings/bus/bcma.txt
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,11 @@ Required properties:

The cores on the AXI bus are automatically detected by bcma with the
memory ranges they are using and they get registered afterwards.
Automatic detection of the IRQ number is not working on
BCM47xx/BCM53xx ARM SoCs. To assign IRQ numbers to the cores, provide
them manually through device tree. Use an interrupt-map to specify the
IRQ used by the devices on the bus. The first address is just an index,
because we do not have any special register.

The top-level axi bus may contain children representing attached cores
(devices). This is needed since some hardware details can't be auto
Expand All @@ -22,6 +27,22 @@ Example:
ranges = <0x00000000 0x18000000 0x00100000>;
#address-cells = <1>;
#size-cells = <1>;
#interrupt-cells = <1>;
interrupt-map-mask = <0x000fffff 0xffff>;
interrupt-map =
/* Ethernet Controller 0 */
<0x00024000 0 &gic GIC_SPI 147 IRQ_TYPE_LEVEL_HIGH>,

/* Ethernet Controller 1 */
<0x00025000 0 &gic GIC_SPI 148 IRQ_TYPE_LEVEL_HIGH>;

/* PCIe Controller 0 */
<0x00012000 0 &gic GIC_SPI 126 IRQ_TYPE_LEVEL_HIGH>,
<0x00012000 1 &gic GIC_SPI 127 IRQ_TYPE_LEVEL_HIGH>,
<0x00012000 2 &gic GIC_SPI 128 IRQ_TYPE_LEVEL_HIGH>,
<0x00012000 3 &gic GIC_SPI 129 IRQ_TYPE_LEVEL_HIGH>,
<0x00012000 4 &gic GIC_SPI 130 IRQ_TYPE_LEVEL_HIGH>,
<0x00012000 5 &gic GIC_SPI 131 IRQ_TYPE_LEVEL_HIGH>;

chipcommon {
reg = <0x00000000 0x1000>;
Expand Down
12 changes: 10 additions & 2 deletions Documentation/devicetree/bindings/net/amd-xgbe.txt
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,10 @@ Required properties:
- PCS registers
- interrupt-parent: Should be the phandle for the interrupt controller
that services interrupts for this device
- interrupts: Should contain the amd-xgbe interrupt
- interrupts: Should contain the amd-xgbe interrupt(s). The first interrupt
listed is required and is the general device interrupt. If the optional
amd,per-channel-interrupt property is specified, then one additional
interrupt for each DMA channel supported by the device should be specified
- clocks:
- DMA clock for the amd-xgbe device (used for calculating the
correct Rx interrupt watchdog timer value on a DMA channel
Expand All @@ -23,14 +26,19 @@ Optional properties:
- mac-address: mac address to be assigned to the device. Can be overridden
by UEFI.
- dma-coherent: Present if dma operations are coherent
- amd,per-channel-interrupt: Indicates that Rx and Tx complete will generate
a unique interrupt for each DMA channel - this requires an additional
interrupt be configured for each DMA channel

Example:
xgbe@e0700000 {
compatible = "amd,xgbe-seattle-v1a";
reg = <0 0xe0700000 0 0x80000>,
<0 0xe0780000 0 0x80000>;
interrupt-parent = <&gic>;
interrupts = <0 325 4>;
interrupts = <0 325 4>,
<0 326 1>, <0 327 1>, <0 328 1>, <0 329 1>;
amd,per-channel-interrupt;
clocks = <&xgbe_dma_clk>, <&xgbe_ptp_clk>;
clock-names = "dma_clk", "ptp_clk";
phy-handle = <&phy>;
Expand Down
5 changes: 5 additions & 0 deletions Documentation/devicetree/bindings/net/can/c_can.txt
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@ Bosch C_CAN/D_CAN controller Device Tree Bindings
Required properties:
- compatible : Should be "bosch,c_can" for C_CAN controllers and
"bosch,d_can" for D_CAN controllers.
Can be "ti,dra7-d_can", "ti,am3352-d_can" or
"ti,am4372-d_can".
- reg : physical base address and size of the C_CAN/D_CAN
registers map
- interrupts : property with a value describing the interrupt
Expand All @@ -12,6 +14,9 @@ Required properties:
Optional properties:
- ti,hwmods : Must be "d_can<n>" or "c_can<n>", n being the
instance number
- syscon-raminit : Handle to system control region that contains the
RAMINIT register, register offset to the RAMINIT
register and the CAN instance number (0 offset).

Note: "ti,hwmods" field is used to fetch the base address and irq
resources from TI, omap hwmod data base during device registration.
Expand Down
9 changes: 8 additions & 1 deletion Documentation/devicetree/bindings/net/dsa/dsa.txt
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ Required properties:
- dsa,ethernet : Should be a phandle to a valid Ethernet device node
- dsa,mii-bus : Should be a phandle to a valid MDIO bus device node

Optionnal properties:
Optional properties:
- interrupts : property with a value describing the switch
interrupt number (not supported by the driver)

Expand All @@ -23,6 +23,13 @@ Each of these switch child nodes should have the following required properties:
- #address-cells : Must be 1
- #size-cells : Must be 0

A switch child node has the following optional property:

- eeprom-length : Set to the length of an EEPROM connected to the
switch. Must be set if the switch can not detect
the presence and/or size of a connected EEPROM,
otherwise optional.

A switch may have multiple "port" children nodes

Each port children node must have the following mandatory properties:
Expand Down
35 changes: 24 additions & 11 deletions Documentation/devicetree/bindings/net/micrel.txt
Original file line number Diff line number Diff line change
Expand Up @@ -6,19 +6,32 @@ Optional properties:

- micrel,led-mode : LED mode value to set for PHYs with configurable LEDs.

Configure the LED mode with single value. The list of PHYs and
the bits that are currently supported:
Configure the LED mode with single value. The list of PHYs and the
bits that are currently supported:

KSZ8001: register 0x1e, bits 15..14
KSZ8041: register 0x1e, bits 15..14
KSZ8021: register 0x1f, bits 5..4
KSZ8031: register 0x1f, bits 5..4
KSZ8051: register 0x1f, bits 5..4
KSZ8001: register 0x1e, bits 15..14
KSZ8041: register 0x1e, bits 15..14
KSZ8021: register 0x1f, bits 5..4
KSZ8031: register 0x1f, bits 5..4
KSZ8051: register 0x1f, bits 5..4
KSZ8081: register 0x1f, bits 5..4
KSZ8091: register 0x1f, bits 5..4

See the respective PHY datasheet for the mode values.
See the respective PHY datasheet for the mode values.

- micrel,rmii-reference-clock-select-25-mhz: RMII Reference Clock Select
bit selects 25 MHz mode

Setting the RMII Reference Clock Select bit enables 25 MHz rather
than 50 MHz clock mode.

Note that this option in only needed for certain PHY revisions with a
non-standard, inverted function of this configuration bit.
Specifically, a clock reference ("rmii-ref" below) is always needed to
actually select a mode.

- clocks, clock-names: contains clocks according to the common clock bindings.

supported clocks:
- KSZ8021, KSZ8031: "rmii-ref": The RMII refence input clock. Used
to determine the XI input clock.
supported clocks:
- KSZ8021, KSZ8031, KSZ8081, KSZ8091: "rmii-ref": The RMII reference
input clock. Used to determine the XI input clock.
3 changes: 2 additions & 1 deletion Documentation/devicetree/bindings/net/phy.txt
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,6 @@ Optional Properties:
specifications. If neither of these are specified, the default is to
assume clause 22. The compatible list may also contain other
elements.
- max-speed: Maximum PHY supported speed (10, 100, 1000...)

If the phy's identifier is known then the list may contain an entry
of the form: "ethernet-phy-idAAAA.BBBB" where
Expand All @@ -29,6 +28,8 @@ Optional Properties:
4 hex digits. This is the chip vendor OUI bits 19:24,
followed by 10 bits of a vendor specific ID.

- max-speed: Maximum PHY supported speed (10, 100, 1000...)

Example:

ethernet-phy@0 {
Expand Down
1 change: 1 addition & 0 deletions Documentation/devicetree/bindings/net/sh_eth.txt
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ Required properties:
"renesas,ether-r8a7779" if the device is a part of R8A7779 SoC.
"renesas,ether-r8a7790" if the device is a part of R8A7790 SoC.
"renesas,ether-r8a7791" if the device is a part of R8A7791 SoC.
"renesas,ether-r8a7793" if the device is a part of R8A7793 SoC.
"renesas,ether-r8a7794" if the device is a part of R8A7794 SoC.
"renesas,ether-r7s72100" if the device is a part of R7S72100 SoC.
- reg: offset and length of (1) the E-DMAC/feLic register block (required),
Expand Down
7 changes: 2 additions & 5 deletions Documentation/networking/bonding.txt
Original file line number Diff line number Diff line change
Expand Up @@ -2230,11 +2230,8 @@ balance-rr: This mode is the only mode that will permit a single

It is possible to adjust TCP/IP's congestion limits by
altering the net.ipv4.tcp_reordering sysctl parameter. The
usual default value is 3, and the maximum useful value is 127.
For a four interface balance-rr bond, expect that a single
TCP/IP stream will utilize no more than approximately 2.3
interface's worth of throughput, even after adjusting
tcp_reordering.
usual default value is 3. But keep in mind TCP stack is able
to automatically increase this when it detects reorders.

Note that the fraction of packets that will be delivered out of
order is highly variable, and is unlikely to be zero. The level
Expand Down
23 changes: 22 additions & 1 deletion Documentation/networking/ip-sysctl.txt
Original file line number Diff line number Diff line change
Expand Up @@ -383,9 +383,17 @@ tcp_orphan_retries - INTEGER
may consume significant resources. Cf. tcp_max_orphans.

tcp_reordering - INTEGER
Maximal reordering of packets in a TCP stream.
Initial reordering level of packets in a TCP stream.
TCP stack can then dynamically adjust flow reordering level
between this initial value and tcp_max_reordering
Default: 3

tcp_max_reordering - INTEGER
Maximal reordering level of packets in a TCP stream.
300 is a fairly conservative value, but you might increase it
if paths are using per packet load balancing (like bonding rr mode)
Default: 300

tcp_retrans_collapse - BOOLEAN
Bug-to-bug compatibility with some broken printers.
On retransmit try to send bigger packets to work around bugs in
Expand Down Expand Up @@ -1466,6 +1474,19 @@ suppress_frag_ndisc - INTEGER
1 - (default) discard fragmented neighbor discovery packets
0 - allow fragmented neighbor discovery packets

optimistic_dad - BOOLEAN
Whether to perform Optimistic Duplicate Address Detection (RFC 4429).
0: disabled (default)
1: enabled

use_optimistic - BOOLEAN
If enabled, do not classify optimistic addresses as deprecated during
source address selection. Preferred addresses will still be chosen
before optimistic addresses, subject to other ranking in the source
address selection algorithm.
0: disabled (default)
1: enabled

icmp/*:
ratelimit - INTEGER
Limit the maximal rates for sending ICMPv6 packets.
Expand Down
107 changes: 107 additions & 0 deletions Documentation/networking/ipvlan.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,107 @@

IPVLAN Driver HOWTO

Initial Release:
Mahesh Bandewar <maheshb AT google.com>

1. Introduction:
This is conceptually very similar to the macvlan driver with one major
exception of using L3 for mux-ing /demux-ing among slaves. This property makes
the master device share the L2 with it's slave devices. I have developed this
driver in conjuntion with network namespaces and not sure if there is use case
outside of it.


2. Building and Installation:
In order to build the driver, please select the config item CONFIG_IPVLAN.
The driver can be built into the kernel (CONFIG_IPVLAN=y) or as a module
(CONFIG_IPVLAN=m).


3. Configuration:
There are no module parameters for this driver and it can be configured
using IProute2/ip utility.

ip link add link <master-dev> <slave-dev> type ipvlan mode { l2 | L3 }

e.g. ip link add link ipvl0 eth0 type ipvlan mode l2


4. Operating modes:
IPvlan has two modes of operation - L2 and L3. For a given master device,
you can select one of these two modes and all slaves on that master will
operate in the same (selected) mode. The RX mode is almost identical except
that in L3 mode the slaves wont receive any multicast / broadcast traffic.
L3 mode is more restrictive since routing is controlled from the other (mostly)
default namespace.

4.1 L2 mode:
In this mode TX processing happens on the stack instance attached to the
slave device and packets are switched and queued to the master device to send
out. In this mode the slaves will RX/TX multicast and broadcast (if applicable)
as well.

4.2 L3 mode:
In this mode TX processing upto L3 happens on the stack instance attached
to the slave device and packets are switched to the stack instance of the
master device for the L2 processing and routing from that instance will be
used before packets are queued on the outbound device. In this mode the slaves
will not receive nor can send multicast / broadcast traffic.


5. What to choose (macvlan vs. ipvlan)?
These two devices are very similar in many regards and the specific use
case could very well define which device to choose. if one of the following
situations defines your use case then you can choose to use ipvlan -
(a) The Linux host that is connected to the external switch / router has
policy configured that allows only one mac per port.
(b) No of virtual devices created on a master exceed the mac capacity and
puts the NIC in promiscous mode and degraded performance is a concern.
(c) If the slave device is to be put into the hostile / untrusted network
namespace where L2 on the slave could be changed / misused.


6. Example configuration:

+=============================================================+
| Host: host1 |
| |
| +----------------------+ +----------------------+ |
| | NS:ns0 | | NS:ns1 | |
| | | | | |
| | | | | |
| | ipvl0 | | ipvl1 | |
| +----------#-----------+ +-----------#----------+ |
| # # |
| ################################ |
| # eth0 |
+==============================#==============================+


(a) Create two network namespaces - ns0, ns1
ip netns add ns0
ip netns add ns1

(b) Create two ipvlan slaves on eth0 (master device)
ip link add link eth0 ipvl0 type ipvlan mode l2
ip link add link eth0 ipvl1 type ipvlan mode l2

(c) Assign slaves to the respective network namespaces
ip link set dev ipvl0 netns ns0
ip link set dev ipvl1 netns ns1

(d) Now switch to the namespace (ns0 or ns1) to configure the slave devices
- For ns0
(1) ip netns exec ns0 bash
(2) ip link set dev ipvl0 up
(3) ip link set dev lo up
(4) ip -4 addr add 127.0.0.1 dev lo
(5) ip -4 addr add $IPADDR dev ipvl0
(6) ip -4 route add default via $ROUTER dev ipvl0
- For ns1
(1) ip netns exec ns1 bash
(2) ip link set dev ipvl1 up
(3) ip link set dev lo up
(4) ip -4 addr add 127.0.0.1 dev lo
(5) ip -4 addr add $IPADDR dev ipvl1
(6) ip -4 route add default via $ROUTER dev ipvl1
2 changes: 1 addition & 1 deletion Documentation/networking/ixgbe.txt
Original file line number Diff line number Diff line change
Expand Up @@ -138,7 +138,7 @@ Other ethtool Commands:
To enable Flow Director
ethtool -K ethX ntuple on
To add a filter
Use -U switch. e.g., ethtool -U ethX flow-type tcp4 src-ip 0x178000a
Use -U switch. e.g., ethtool -U ethX flow-type tcp4 src-ip 10.0.128.23
action 1
To see the list of filters currently present:
ethtool -u ethX
Expand Down
Loading

0 comments on commit 70e71ca

Please sign in to comment.