Linux* Driver for Intel(R) Ethernet Virtual Function 700 Series
===============================================================

May 29, 2019

======================================================

Contents
========
- Overview
- Building and Installation
- Command Line Parameters
- Additional Configurations
- Known Issues
- Support
- License

================================================================================



Overview
--------
This driver supports Intel(R) Ethernet Controller 700 Series-based virtual
function devices on kernels with CONFIG_PCI_IOV enabled.

SR-IOV requires the correct platform and OS support.

The guest OS loading this driver must support MSI-X interrupts.

For questions related to hardware requirements, refer to the documentation
supplied with your Intel adapter. All hardware requirements listed apply to use
with Linux.

Driver information can be obtained using ethtool, lspci, and ifconfig.
Instructions on updating ethtool can be found in the section Additional
Features and Configurations later in this document.

The virtual function driver supports virtual functions generated by the
physical function driver, with one or more VFs enabled through sysfs.
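
As a minimal sketch of enabling VFs through sysfs on the host, the helper below
writes a VF count to the PF's standard sriov_numvfs attribute. The function
name enable_vfs and the SYSFS_ROOT override (present only so the function can
be tried without hardware) are assumptions for illustration and are not part of
the driver.

```shell
# Sketch: enable <count> VFs on PF <pf> through sysfs (run as root on the host).
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
enable_vfs() {
    pf=$1
    count=$2
    attr="${SYSFS_ROOT:-/sys}/class/net/$pf/device/sriov_numvfs"

    if [ ! -w "$attr" ]; then
        echo "enable_vfs: $attr not found or not writable" >&2
        return 1
    fi
    # The kernel rejects changing a non-zero VF count directly, so reset first.
    echo 0 > "$attr" || return 1
    echo "$count" > "$attr"
}
```

For example, 'enable_vfs p1p2 4' would request four VFs on the PF named p1p2.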



Identifying Your Adapter
========================
The driver in this release is compatible with devices based on the following:
  * Intel(R) Ethernet Controller X710
  * Intel(R) Ethernet Controller XL710
  * Intel(R) Ethernet Network Connection X722
  * Intel(R) Ethernet Controller XXV710

For the best performance, make sure the latest NVM/FW is installed on your
device and that you are using the newest drivers.

For information on how to identify your adapter, and for the latest NVM/FW
images and Intel network drivers, refer to the Intel Support website:
http://www.intel.com/support


Building and Installation
-------------------------
To build a binary RPM* package of this driver, run 'rpmbuild -tb
iavf-<x.x.x>.tar.gz', where <x.x.x> is the version number for the driver tar
file.

Note: For the build to work properly, the currently running kernel MUST match
the version and configuration of the installed kernel sources. If you have just
recompiled the kernel, reboot the system before building.
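
One quick way to check the note above is to confirm that build files exist for
the kernel that is currently running. The sketch below assumes the conventional
/lib/modules/<release>/build location; the function name and the MODULES_ROOT
override are hypothetical and exist only for illustration.

```shell
# Sketch: confirm build files exist for the kernel that is currently running.
# MODULES_ROOT is a hypothetical override for dry runs; it defaults to
# /lib/modules.
check_kernel_build_dir() {
    release=${1:-$(uname -r)}
    build_dir="${MODULES_ROOT:-/lib/modules}/$release/build"

    if [ -d "$build_dir" ]; then
        echo "build files for $release found at $build_dir"
    else
        echo "no build files for $release; install matching kernel sources" >&2
        return 1
    fi
}
```

Calling check_kernel_build_dir with no argument checks the running kernel.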

- To compile the driver on some kernel/arch combinations, a package with the
development version of libelf (e.g. libelf-dev, libelf-devel,
elfutils-libelf-devel) may need to be installed.

Note: RPM functionality has only been tested in Red Hat distributions.


1. Move the virtual function driver tar file to the directory of your choice.
   For example, use '/home/username/iavf' or '/usr/local/src/iavf'.

2. Untar/unzip the archive, where <x.x.x> is the version number for the
   driver tar file:
   tar zxf iavf-<x.x.x>.tar.gz

3. Change to the driver src directory, where <x.x.x> is the version number
   for the driver tar:
   cd iavf-<x.x.x>/src/

4. Compile the driver module:
   # make install
   The binary will be installed as:
   /lib/modules/<KERNEL VERSION>/updates/drivers/net/ethernet/intel/iavf/iavf.ko

   The install location listed above is the default location. This may differ
   for various Linux distributions.

   NOTE: To gather and display additional statistics, use the
   IAVF_ADD_PROBES pre-processor macro:
   # make CFLAGS_EXTRA=-DIAVF_ADD_PROBES
   Please note that this additional statistics gathering can impact
   performance.

5. Load the module using the modprobe command:
   modprobe iavf

   Make sure that any older iavf drivers are removed from the kernel before
   loading the new module:
   rmmod iavf; modprobe iavf

6. Assign an IP address to the interface by entering the following,
   where ethX is the interface name that was shown in dmesg after modprobe:
  
   ip address add <IP_address>/<netmask bits> dev ethX

7. Verify that the interface works. Enter the following, where IP_address
   is the IP address for another machine on the same subnet as the interface
   that is being tested:
   ping <IP_address>

Note: For certain distributions like (but not limited to) Red Hat Enterprise
Linux 7 and Ubuntu, once the driver is installed the initrd/initramfs file may
need to be updated to prevent the OS loading old versions of the iavf driver.
The dracut utility may be used on Red Hat distributions:
   # dracut --force
For Ubuntu:
   # update-initramfs -u
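
The choice between the two commands above can be sketched as a lookup on the ID
field of /etc/os-release. The function name and the list of IDs are assumptions
for illustration; the list is not complete, and other distributions may use
different tools.

```shell
# Sketch: print the initramfs rebuild command for a distribution ID
# (the ID field of /etc/os-release). The ID list is illustrative only.
initramfs_update_cmd() {
    case "$1" in
        rhel|centos|fedora) echo "dracut --force" ;;
        ubuntu|debian)      echo "update-initramfs -u" ;;
        *) echo "unknown distribution '$1'; consult its documentation" >&2
           return 1 ;;
    esac
}
```

A possible use is '. /etc/os-release; initramfs_update_cmd "$ID"', which prints
the command without running it.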


Command Line Parameters
-----------------------

The iavf driver does not support any command line parameters.


Additional Features and Configurations
-------------------------------------------

Viewing Link Messages
---------------------
Link messages will not be displayed to the console if the distribution is
restricting system messages. In order to see network driver link messages on
your console, set the dmesg console log level to eight by entering the
following:
dmesg -n 8

NOTE: This setting is not saved across reboots.
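
To confirm the setting, the console log level can be read back from
/proc/sys/kernel/printk, whose first field is the current console log level.
The function name below is hypothetical, and the optional file argument exists
only so the sketch can be tried against a sample file.

```shell
# Sketch: print the current console log level (first field of the printk file).
console_loglevel() {
    awk '{ print $1; exit }' "${1:-/proc/sys/kernel/printk}"
}
```

After 'dmesg -n 8', calling console_loglevel is expected to print 8.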


ethtool
-------
The driver utilizes the ethtool interface for driver configuration and
diagnostics, as well as displaying statistical information. The latest ethtool
version is required for this functionality. Download it at:
http://ftp.kernel.org/pub/software/network/ethtool/


Setting VLAN Tag Stripping
--------------------------
If you have applications that require Virtual Functions (VFs) to receive
packets with VLAN tags, you can disable VLAN tag stripping for the VF. The
Physical Function (PF) processes requests issued from the VF to enable or
disable VLAN tag stripping. Note that if the PF has assigned a VLAN to a VF,
then requests from that VF to set VLAN tag stripping will be ignored.

To enable/disable VLAN tag stripping for a VF, issue the following command
from inside the VM in which you are running the VF:
  ethtool -K <if_name> rxvlan on/off
  or alternatively:
  ethtool --offload <if_name> rxvlan on/off
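
To check whether a request took effect, the rx-vlan-offload line of
'ethtool -k' output can be inspected. The sketch below reads that output on
standard input; the function name rxvlan_state is an assumption for
illustration.

```shell
# Sketch: report the rx-vlan-offload state from `ethtool -k` output on stdin.
# Prints "on" or "off"; returns non-zero if the feature line is absent.
rxvlan_state() {
    awk -F': ' '
        { sub(/^[ \t]+/, "", $1) }
        $1 == "rx-vlan-offload" { split($2, f, " "); print f[1]; found = 1 }
        END { exit !found }'
}
```

A possible use is 'ethtool -k <if_name> | rxvlan_state'. If the state does not
change after a request, the PF may have assigned a VLAN to the VF, as noted
above.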


Adaptive Virtual Function
-------------------------
Adaptive Virtual Function (AVF) allows the virtual function driver, or VF, to
adapt to changing feature sets of the physical function driver (PF) with which
it is associated. This allows system administrators to update a PF without
having to update all the VFs associated with it. All AVFs have a single common
device ID and branding string.

AVFs have a minimum set of features known as "base mode," but may provide
additional features depending on what features are available in the PF with
which the AVF is associated. The following are base mode features:

- 4 Queue Pairs (QP) and associated Configuration Status Registers (CSRs)
  for Tx/Rx.
- i40e descriptors and ring format.
- Descriptor write-back completion.
- 1 control queue, with i40e descriptors, CSRs and ring format.
- 5 MSI-X interrupt vectors and corresponding i40e CSRs.
- 1 Interrupt Throttle Rate (ITR) index.
- 1 Virtual Station Interface (VSI) per VF.
- 1 Traffic Class (TC), TC0
- Receive Side Scaling (RSS) with 64 entry indirection table and key,
  configured through the PF.
- 1 unicast MAC address reserved per VF.
- 16 MAC address filters for each VF.
- Stateless offloads - non-tunneled checksums.
- AVF device ID.
- HW mailbox is used for VF to PF communications (including on Windows).


IEEE 802.1ad (QinQ) Support
---------------------------
The IEEE 802.1ad standard, informally known as QinQ, allows for multiple VLAN
IDs within a single Ethernet frame. VLAN IDs are sometimes referred to as
"tags," and multiple VLAN IDs are thus referred to as a "tag stack." Tag stacks
allow L2 tunneling and the ability to segregate traffic within a particular
VLAN ID, among other uses.

The following are examples of how to configure 802.1ad (QinQ):
  ip link add link eth0 eth0.24 type vlan proto 802.1ad id 24
  ip link add link eth0.24 eth0.24.371 type vlan proto 802.1Q id 371
Where "24" and "371" are example VLAN IDs.

NOTES:
- 802.1ad (QinQ) is supported in 3.19 and later kernels.
- Receive checksum offloads, cloud filters, and VLAN acceleration are not
supported for 802.1ad (QinQ) packets.
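
The two-step configuration above generalizes to any parent interface and pair
of VLAN IDs. The sketch below only prints the commands; the function name
qinq_cmds and the 1-4094 range check are assumptions for illustration.

```shell
# Sketch: print the two `ip link` commands that stack an 802.1Q VLAN (inner)
# on top of an 802.1ad VLAN (outer). Nothing is executed.
qinq_cmds() {
    parent=$1 outer=$2 inner=$3
    for id in "$outer" "$inner"; do
        case "$id" in
            ''|*[!0-9]*) echo "invalid VLAN ID: $id" >&2; return 1 ;;
        esac
        # Assumed usable range; 0 and 4095 are treated as reserved here.
        if [ "$id" -lt 1 ] || [ "$id" -gt 4094 ]; then
            echo "VLAN ID out of range (1-4094): $id" >&2
            return 1
        fi
    done
    echo "ip link add link $parent $parent.$outer type vlan proto 802.1ad id $outer"
    echo "ip link add link $parent.$outer $parent.$outer.$inner type vlan proto 802.1Q id $inner"
}
```

Running 'qinq_cmds eth0 24 371' prints the same two commands shown in the
example above.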


Application Device Queues (ADQ)
-------------------------------
Application Device Queues (ADQ) allows you to dedicate one or more queues to a
specific application. This can reduce latency for the specified application,
and allow Tx traffic to be rate limited per application. Follow the steps below
to set ADQ.

Requirements:
- Kernel version 4.15 or later
- The sch_mqprio, act_mirred and cls_flower modules must be loaded
- The latest version of iproute2
- NVM version 6.01 or later
- ADQ cannot be enabled when any of the following features are enabled: Data
Center Bridging (DCB), Multiple Functions per Port (MFP), or Sideband Filters.
- If another driver (for example, DPDK) has set cloud filters, you cannot
enable ADQ.

1. Create traffic classes (TCs). You can create a maximum of 8 TCs per
interface. The shaper bw_rlimit parameter is optional.

This example sets up two TCs, tc0 and tc1, with 16 queues each and max tx rate
set to 1Gbit for tc0 and 3Gbit for tc1:
  # tc qdisc add dev <interface> root mqprio num_tc 2 map 0 0 0 0 1 1 1 1 \
  queues 16@0 16@16 hw 1 mode channel shaper bw_rlimit min_rate 1Gbit 2Gbit \
  max_rate 1Gbit 3Gbit

Where
map: priority mapping for up to 16 priorities to tcs (e.g. map 0 0 0 0 1 1 1 1
sets priorities 0-3 to use tc0 and 4-7 to use tc1)

queues: for each tc, <num queues>@<offset> (e.g. queues 16@0 16@16 assigns 16
queues to tc0 at offset 0 and 16 queues to tc1 at offset 16. Max total number
of queues for all tcs is 64 or number of cores, whichever is lower.)

hw 1 mode channel: 'channel' with 'hw' set to 1 is a new hardware offload
mode in mqprio that makes full use of the mqprio options, the TCs, the queue
configurations, and the QoS parameters.

shaper bw_rlimit: for each tc, sets minimum and maximum bandwidth rates. The
totals must be equal to or less than the port speed. For example:
  min_rate 1Gbit 3Gbit
Verify the bandwidth limit using network monitoring tools such as ifstat or
sar -n DEV [interval] [number of samples].

NOTE: Setting up channels via ethtool (ethtool -L) is not supported when the
TCs are configured using mqprio.
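
The <num queues>@<offset> pairs follow a simple rule: each offset is the sum of
the queue counts before it. The sketch below derives the 'queues' argument from
per-TC queue counts and enforces the limits of 8 TCs and 64 queues stated
above. The function name is hypothetical, and the per-core limit is not
checked.

```shell
# Sketch: turn per-TC queue counts into the mqprio "queues" argument.
# Offsets are cumulative. Limits of 8 TCs and 64 queues come from the text
# above; the additional number-of-cores limit is not checked here.
mqprio_queues() {
    offset=0
    spec=""
    for count in "$@"; do
        spec="$spec${spec:+ }$count@$offset"
        offset=$((offset + count))
    done
    if [ "$#" -gt 8 ] || [ "$offset" -gt 64 ]; then
        echo "too many TCs ($#) or queues ($offset)" >&2
        return 1
    fi
    echo "$spec"
}
```

For the example in step 1, 'mqprio_queues 16 16' prints '16@0 16@16'.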

2. Enable HW TC offload on interface:
  # ethtool -K <interface> hw-tc-offload on

3. Apply TCs to ingress (RX) flow of interface:
  # tc qdisc add dev <interface> ingress

NOTES:
- Tunnel filters are not supported in ADQ. If encapsulated packets do arrive in
non-tunnel mode, filtering will be done on the inner headers. For example, for
VXLAN traffic in non-tunnel mode, if PCTYPE is identified as a VXLAN
encapsulated packet, then the outer headers are ignored. Therefore, inner
headers are matched.
- If a TC filter on a PF matches traffic over a VF (on the PF), that traffic
will be routed to the appropriate queue of the PF, and will not be passed on
the VF. Such traffic will end up getting dropped higher up in the TCP/IP stack
as it does not match PF address data.
- If traffic matches multiple TC filters that point to different TCs, that
traffic will be duplicated and sent to all matching TC queues. The hardware
switch mirrors the packet to a VSI list when multiple filters are matched.


SR-IOV Hypervisor Management Interface
--------------------------------------
The sysfs file structure below supports the SR-IOV hypervisor management
interface.

/sys/class/net/<interface-name>/device/sriov (see 1 below)
+-- [VF-id, 0 .. 127] (see 2 below)
| +-- trunk
| +-- vlan_mirror
| +-- egress_mirror
| +-- ingress_mirror
| +-- mac_anti_spoof
| +-- vlan_anti_spoof
| +-- loopback
| +-- mac
| +-- mac_list
| +-- promisc
| +-- vlan_strip
| +-- stats
| +-- link_state
| +-- max_tx_rate
| +-- min_tx_rate
| +-- spoofcheck
| +-- trust
| +-- vlan
*1: The kobject starting from “sriov” is not available in the existing kernel
sysfs; the device driver must implement this interface.
*2: Assumes the maximum number of VFs supported by a PF is 128. To support a
device that supports more than 128 SR-IOV instances, a “vfx” kobject is added
to 0..127. With the “vfx” kobject, users need to add the VF index as the first
parameter, followed by “:”.

SR-IOV hypervisor functions:

- trunk
  Supports two operations: add and rem.
  - add: adds one or more VLAN IDs to the VF VLAN filtering list.
  - rem: removes VLAN IDs from the VF VLAN filtering list.
  Example 1: add multiple VLAN tags, VLANs 2,4,5,10-20, by PF, p1p2, on
  a selected VF, 1, for filtering, with the sysfs support:
  #echo add 2,4,5,10-20 > /sys/class/net/p1p2/device/sriov/1/trunk
  Example 2: remove VLANs 5, 11-13 from PF p1p2 VF 1 with sysfs:
  #echo rem 5,11-13 > /sys/class/net/p1p2/device/sriov/1/trunk
  Note: for rem, if VLAN id is not on the VLAN filtering list, the
  VLAN id will be ignored.

- vlan_mirror
  Supports both ingress and egress traffic mirroring.
  Example 1: mirror traffic based upon VLANs 2,4,6,18-22 to VF 3 of PF p1p1.
  # echo add 2,4,6,18-22 > /sys/class/net/p1p1/device/sriov/3/vlan_mirror
  Example 2: remove VLANs 4, 15-17 from traffic mirroring at destination VF 3.
  # echo rem 4,15-17 > /sys/class/net/p1p1/device/sriov/3/vlan_mirror
  Example 3: remove all VLANs from mirroring at VF 3.
  # echo rem 0-4095 > /sys/class/net/p1p1/device/sriov/3/vlan_mirror

- egress_mirror
  Supports egress traffic mirroring.
  Example 1: add egress traffic mirroring on PF p1p2 VF 1 to VF 7.
  #echo add 7 > /sys/class/net/p1p2/device/sriov/1/egress_mirror
  Example 2: remove egress traffic mirroring on PF p1p2 VF 1 to VF 7.
  #echo rem 7 > /sys/class/net/p1p2/device/sriov/1/egress_mirror

- ingress_mirror
  Supports ingress traffic mirroring.
  Example 1: mirror ingress traffic on PF p1p2 VF 1 to VF 7.
  #echo add 7 > /sys/class/net/p1p2/device/sriov/1/ingress_mirror
  Example 2: show current ingress mirroring configuration.
  #cat /sys/class/net/p1p2/device/sriov/1/ingress_mirror

- mac_anti_spoof
  Enables or disables MAC anti-spoof. When set to OFF, VFs can transmit
  packets with any source MAC address, which is needed for some L2
  applications as well as vNIC bonding within VMs.
  Example 1: enable MAC anti-spoof for PF p1p2 VF 1.
  #echo ON > /sys/class/net/p1p2/device/sriov/1/mac_anti_spoof
  Example 2: disable MAC anti-spoof for PF p1p2 VF 1.
  #echo OFF > /sys/class/net/p1p2/device/sriov/1/mac_anti_spoof

- vlan_anti_spoof
  Enables or disables VLAN anti-spoof. When set to ON, VFs can transmit only
  packets with a VLAN tag specified in the “trunk” settings, and cannot
  transmit “untagged” packets. Violations increment the tx_spoofed stats
  counter.
  Example 1: enable VLAN anti-spoof for PF p1p2 VF 1.
  #echo ON > /sys/class/net/p1p2/device/sriov/1/vlan_anti_spoof
  Example 2: disable VLAN anti-spoof for PF p1p2 VF 1.
  #echo OFF > /sys/class/net/p1p2/device/sriov/1/vlan_anti_spoof

- loopback
  Supports Enable/Disable VEB/VEPA (Local loopback).
  Example 1: allow traffic switching between VFs on the same PF.
  #echo ON > /sys/class/net/p1p2/device/sriov/loopback
  Example 2: send Hairpin traffic to the switch to which the PF is connected.
  #echo OFF > /sys/class/net/p1p2/device/sriov/loopback
  Example 3: show loopback configuration.
  #cat /sys/class/net/p1p2/device/sriov/loopback

- mac
  Supports setting default MAC address. If MAC address is set by this
  command, the PF will not allow VF to change it using an MBOX request.
  Example 1: set default MAC address to VF 1.
  #echo "00:11:22:33:44:55" > /sys/class/net/p1p2/device/sriov/1/mac
  Example 2: show default MAC address.
  #cat /sys/class/net/p1p2/device/sriov/1/mac

- mac_list
  Supports adding additional MACs to the VF. The default MAC is taken from
  "ip link set p1p2 vf 1 mac 00:11:22:33:44:55" if configured. If not, a random
  address is assigned to the VF by the NIC. If the MAC is configured using
  the ip link command, the VF cannot change it via MBOX/AdminQ requests.
  Example 1: add mac 00:11:22:33:44:55 and 00:66:55:44:33:22 to PF p1p2 VF 1.
  #echo add "00:11:22:33:44:55,00:66:55:44:33:22" > \
  /sys/class/net/p1p2/device/sriov/1/mac_list
  Example 2: delete mac 00:11:22:33:44:55 from above VF device.
  #echo rem 00:11:22:33:44:55 > /sys/class/net/p1p2/device/sriov/1/mac_list
  Example 3: display a VF MAC address list.
  #cat /sys/class/net/p1p2/device/sriov/1/mac_list

- promisc
  Supports setting/unsetting VF device unicast promiscuous mode and multicast
  promiscuous mode.
  Example 1: set MCAST promiscuous on PF p1p2 VF 1.
  #echo add mcast > /sys/class/net/p1p2/device/sriov/1/promisc
  Example 2: set UCAST promiscuous on PF p1p2 VF 1.
  #echo add ucast > /sys/class/net/p1p2/device/sriov/1/promisc
  Example 3: unset MCAST promiscuous on PF p1p2 VF 1.
  #echo rem mcast > /sys/class/net/p1p2/device/sriov/1/promisc
  Example 4: show current promiscuous mode configuration.
  #cat /sys/class/net/p1p2/device/sriov/1/promisc

- vlan_strip
  Supports enabling/disabling VF device outer VLAN stripping.
  Example 1: enable VLAN strip on VF 3.
  # echo ON > /sys/class/net/p1p1/device/sriov/3/vlan_strip
  Example 2: disable VLAN stripping on VF 3.
  # echo OFF > /sys/class/net/p1p1/device/sriov/3/vlan_strip

- stats
  Supports getting VF statistics.
  Example 1: display anti-spoofing violations counter for VF 1.
  #cat /sys/class/net/p1p2/device/sriov/1/stats/tx_spoofed

- link_state
  Sets/displays link status.
  Example 1: display link status on link speed.
  #cat /sys/class/net/p1p2/device/sriov/1/link_state
  Example 2: set VF 1 to track status of PF link.
  #echo auto > /sys/class/net/p1p2/device/sriov/1/link_state
  Example 3: disable VF 1.
  #echo disable > /sys/class/net/p1p2/device/sriov/1/link_state
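
The per-VF examples above all follow one pattern: write a command string to
/sys/class/net/<PF>/device/sriov/<VF>/<attribute>. A minimal wrapper for that
pattern might look like the sketch below. The function name sriov_set and the
SYSFS_ROOT override (used only for dry runs) are assumptions for illustration,
not part of the driver interface.

```shell
# Sketch: write a value to a per-VF attribute of the sysfs interface above.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
sriov_set() {
    pf=$1 vf=$2 attr=$3
    shift 3
    case "$vf" in
        ''|*[!0-9]*) echo "sriov_set: invalid VF id '$vf'" >&2; return 1 ;;
    esac
    if [ "$vf" -gt 127 ]; then
        echo "sriov_set: VF id $vf outside 0..127" >&2
        return 1
    fi
    path="${SYSFS_ROOT:-/sys}/class/net/$pf/device/sriov/$vf/$attr"
    if [ ! -w "$path" ]; then
        echo "sriov_set: $path not found or not writable" >&2
        return 1
    fi
    # Remaining arguments form the command string, e.g. "add 2,4,5,10-20".
    echo "$*" > "$path"
}
```

For example, 'sriov_set p1p2 1 trunk add 2,4,5,10-20' is equivalent to Example
1 under trunk, with the added check that the VF id is in the 0..127 range.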


FW-LLDP (Firmware Link Layer Discovery Protocol)
------------------------------------------------
FW-LLDP (Firmware Link Layer Discovery Protocol) can be enabled/disabled by
setting the private flag disable_fw_lldp. Setting disable_fw_lldp to 'off'
enables FW-LLDP. Setting disable_fw_lldp to 'on' disables FW-LLDP.
NOTE: You must enable the UEFI HII "LLDP Agent" attribute for this setting to
take effect. If "LLDP Agent" is set to disabled, you cannot enable it from the
OS.


================================================================================


Known Issues/Troubleshooting
============================

Hardware Issues
---------------
For known hardware and troubleshooting issues, either refer to the "Release
Notes" in your User Guide, or for more detailed information, go to
http://www.intel.com.

In the search box enter your device's controller ID followed by "spec update"
(e.g., XL710 spec update). The specification update file has complete
information on known hardware issues.


Software Issues
---------------
NOTE: After installing the driver, if your Intel Ethernet Network Connection
is not working, verify that you have installed the correct driver.


Linux bonding fails with Virtual Functions bound to an Intel(R) Ethernet
Controller 700 series based device
------------------------------------------------------------------------
If you bind Virtual Functions (VFs) to an Intel(R) Ethernet Controller 700
series based device, the VF slaves may fail when they become the active slave.
If the MAC address of the VF is set by the PF (Physical Function) of the
device, when you add a slave, or change the active-backup slave, Linux bonding
tries to sync the backup slave's MAC address to the same MAC address as the
active slave. Linux bonding will fail at this point. This issue will not occur
if the VF's MAC address is not set by the PF.


Traffic Is Not Being Passed Between VM and Client
-------------------------------------------------
You may not be able to pass traffic between a client system and a
Virtual Machine (VM) running on a separate host if the Virtual Function
(VF, or Virtual NIC) is not in trusted mode and spoof checking is enabled
on the VF. Note that this situation can occur in any combination of client,
host, and guest operating system. For information on how to set the VF to
trusted mode, refer to the section "VLAN Tag Packet Steering" in this
readme document. For information on setting spoof checking, refer to the
section "MAC and VLAN anti-spoofing feature" in this readme document.


Using four traffic classes fails
--------------------------------
Do not try to reserve more than three traffic classes in the iavf driver. Doing
so will fail to set any traffic classes and will cause the driver to write
errors to stdout. Use a maximum of three traffic classes to avoid this issue.


Multiple log error messages on iavf driver removal
----------------------------------------------------
If you have several VFs and you remove the iavf driver, several instances of
the following log errors are written to the log:
  Unable to send opcode 2 to PF, err I40E_ERR_QUEUE_EMPTY, aq_err ok
  Unable to send the message to VF 2 aq_err 12
  ARQ Overflow Error detected


Virtual machine does not get link
---------------------------------
If the virtual machine has more than one virtual port assigned to it, and those
virtual ports are bound to different physical ports, you may not get link on
all of the virtual ports. The following command may work around the issue:
ethtool -r <PF>
Where <PF> is the PF interface in the host, for example: p5p1. You may need to
run the command more than once to get link on all virtual ports.


MAC address of Virtual Function changes unexpectedly
----------------------------------------------------
If a Virtual Function's MAC address is not assigned in the host, then the VF
(virtual function) driver will use a random MAC address. This random MAC
address may change each time the VF driver is reloaded. You can assign a static
MAC address in the host machine. This static MAC address will survive
a VF driver reload.
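
The host-side assignment uses the 'ip link set <PF> vf <N> mac <address>' form
quoted in the mac_list description earlier in this document. The sketch below
validates the address format and prints the command without running it; the
function name vf_mac_cmd is an assumption for illustration.

```shell
# Sketch: validate a MAC address and print the host-side command that
# assigns it to a VF. Nothing is executed.
vf_mac_cmd() {
    pf=$1 vf=$2 mac=$3
    if ! printf '%s\n' "$mac" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'; then
        echo "invalid MAC address: $mac" >&2
        return 1
    fi
    echo "ip link set $pf vf $vf mac $mac"
}
```

For example, 'vf_mac_cmd p1p2 1 00:11:22:33:44:55' prints the command to run on
the host as root.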


Driver Buffer Overflow Fix
--------------------------
The fix to resolve CVE-2016-8105, referenced in Intel SA-00069
<https://security-center.intel.com/advisory.aspx?intelid=INTEL-SA-00069&languageid=en-fr>,
is included in this and future versions of the driver.


Compiling the Driver
--------------------
When trying to compile the driver by running make install, the following error
may occur: "Linux kernel source not configured - missing version.h"

To solve this issue, create the version.h file by going to the Linux source
tree and entering:
# make include/linux/version.h


Multiple Interfaces on Same Ethernet Broadcast Network
------------------------------------------------------
Due to the default ARP behavior on Linux, it is not possible to have one system
on two IP networks in the same Ethernet broadcast domain (non-partitioned
switch) behave as expected. All Ethernet interfaces will respond to IP traffic
for any IP address assigned to the system. This results in unbalanced receive
traffic.

If you have multiple interfaces in a server, turn on ARP filtering by
entering:
echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter

This only works if your kernel's version is higher than 2.4.5.

NOTE: This setting is not saved across reboots. The configuration change can be
made permanent by adding the following line to the file /etc/sysctl.conf:
net.ipv4.conf.all.arp_filter = 1

Another alternative is to install the interfaces in separate broadcast domains
(either in different switches or in a switch partitioned to VLANs).
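
Making the ARP filter setting permanent can be sketched as an idempotent append
to the sysctl configuration file, so that repeated runs do not duplicate the
line. The function name is hypothetical; the default path is the one named
above.

```shell
# Sketch: add the arp_filter setting to a sysctl configuration file once.
# The default path follows the text above; pass another file to override it.
persist_arp_filter() {
    conf=${1:-/etc/sysctl.conf}
    line="net.ipv4.conf.all.arp_filter = 1"
    # -x matches whole lines, -F treats the pattern as a fixed string.
    if ! grep -qxF "$line" "$conf" 2>/dev/null; then
        echo "$line" >> "$conf"
    fi
}
```

After running persist_arp_filter as root, 'sysctl -p' loads the file without a
reboot.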


Rx Page Allocation Errors
-------------------------
'Page allocation failure. order:0' errors may occur under stress with kernels
2.6.25 and newer.
This is caused by the way the Linux kernel reports this stressed condition.



Host May Reboot after Removing PF when VF is Active in Guest
------------------------------------------------------------
Using kernel versions earlier than 3.2, do not unload the PF driver with
active VFs. Doing this will cause your VFs to stop working until you reload
the PF driver and may cause a spontaneous reboot of your system.

Prior to unloading the PF driver, you must first ensure that all VFs are
no longer active. Do this by shutting down all VMs and unloading the VF driver.


================================================================================


Support
=======
For general information, go to the Intel support website at:
http://www.intel.com/support/

or the Intel Wired Networking project hosted by Sourceforge at:
http://sourceforge.net/projects/e1000

If an issue is identified with the released source code on a supported kernel
with a supported adapter, email the specific information related to the issue
to e1000-devel@lists.sf.net.


================================================================================


License
-------
This program is free software; you can redistribute it and/or modify it under
the terms and conditions of the GNU General Public License, version 2, as
published by the Free Software Foundation.

This program is distributed in the hope it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 51 Franklin
St - Fifth Floor, Boston, MA 02110-1301 USA.

The full GNU General Public License is included in this distribution in the
file called "COPYING".

Copyright(c) 2018 - 2019 Intel Corporation.

================================================================================


Trademarks
----------
Intel is a trademark or registered trademark of Intel Corporation or its
subsidiaries in the United States and/or other countries.

* Other names and brands may be claimed as the property of others.


