From 9a69fb9c21c4bf4107becb877729544759bdd059 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Tue, 28 Apr 2020 00:01:30 +0200
Subject: docs: networking: convert decnet.txt to ReST

- add SPDX header;
- adjust titles and chapters, adding proper markups;
- mark lists as such;
- mark code blocks and literals as such;
- adjust indentation, whitespaces and blank lines;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'MAINTAINERS')

diff --git a/MAINTAINERS b/MAINTAINERS
index 453fe0713e68..7323bfc1720f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4728,7 +4728,7 @@ DECnet NETWORK LAYER
 L:	linux-decnet-user@lists.sourceforge.net
 S:	Orphan
 W:	http://linux-decnet.sourceforge.net
-F:	Documentation/networking/decnet.txt
+F:	Documentation/networking/decnet.rst
 F:	net/decnet/
 
 DECSTATION PLATFORM SUPPORT
-- 
cgit v1.2.3


From cb3f0d56e153398a035eb22769d2cb2837f29747 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Tue, 28 Apr 2020 00:01:36 +0200
Subject: docs: networking: convert filter.txt to ReST

- add SPDX header;
- adjust title markup;
- mark code blocks and literals as such;
- use footnote markup;
- mark tables as such;
- adjust indentation, whitespaces and blank lines;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/bpf/index.rst              |    4 +-
 Documentation/networking/filter.rst      | 1651 ++++++++++++++++++++++++++++++
 Documentation/networking/filter.txt      | 1545 ----------------------------
 Documentation/networking/index.rst       |    1 +
 Documentation/networking/packet_mmap.txt |    2 +-
 MAINTAINERS                              |    2 +-
 tools/bpf/bpf_asm.c                      |    2 +-
 tools/bpf/bpf_dbg.c                      |    2 +-
 8 files changed, 1658 insertions(+), 1551 deletions(-)
 create mode 100644 Documentation/networking/filter.rst
 delete mode 100644 Documentation/networking/filter.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst
index f99677f3572f..38b4db8be7a2 100644
--- a/Documentation/bpf/index.rst
+++ b/Documentation/bpf/index.rst
@@ -7,7 +7,7 @@ Filter) facility, with a focus on the extended BPF version (eBPF).
 
 This kernel side documentation is still work in progress.  The main
 textual documentation is (for historical reasons) described in
-`Documentation/networking/filter.txt`_, which describe both classical
+`Documentation/networking/filter.rst`_, which describes both classical
 and extended BPF instruction-set.
 The Cilium project also maintains a `BPF and XDP Reference Guide`_
 that goes into great technical depth about the BPF Architecture.
@@ -59,7 +59,7 @@ Testing and debugging BPF
 
 
 .. Links:
-.. _Documentation/networking/filter.txt: ../networking/filter.txt
+.. _Documentation/networking/filter.rst: ../networking/filter.rst
 .. _man-pages: https://www.kernel.org/doc/man-pages/
 .. _bpf(2): http://man7.org/linux/man-pages/man2/bpf.2.html
 .. _BPF and XDP Reference Guide: http://cilium.readthedocs.io/en/latest/bpf/
diff --git a/Documentation/networking/filter.rst b/Documentation/networking/filter.rst
new file mode 100644
index 000000000000..a1d3e192b9fa
--- /dev/null
+++ b/Documentation/networking/filter.rst
@@ -0,0 +1,1651 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================================================
+Linux Socket Filtering aka Berkeley Packet Filter (BPF)
+=======================================================
+
+Introduction
+------------
+
+Linux Socket Filtering (LSF) is derived from the Berkeley Packet Filter.
+Though there are some distinct differences between the BSD and Linux
+Kernel filtering, when we speak of BPF or LSF in Linux context, we
+mean the very same mechanism of filtering in the Linux kernel.
+
+BPF allows a user-space program to attach a filter onto any socket and
+allow or disallow certain types of data to come through the socket. LSF
+follows exactly the same filter code structure as BSD's BPF, so referring
+to the BSD bpf.4 manpage is very helpful in creating filters.
+
+On Linux, BPF is much simpler than on BSD. One does not have to worry
+about devices or anything like that. You simply create your filter code,
+send it to the kernel via the SO_ATTACH_FILTER option and if your filter
+code passes the kernel check on it, you then immediately begin filtering
+data on that socket.
+
+You can also detach filters from your socket via the SO_DETACH_FILTER
+option. This will probably not be used much, since closing a socket
+that has a filter on it removes the filter automatically. The other,
+less common case is attaching a different filter to a socket that
+already has a running filter: the kernel takes care of removing the
+old one and placing your new one in its place, assuming your filter
+has passed the checks. If it fails the checks, the old filter
+remains on that socket.
+
+The SO_LOCK_FILTER option allows locking the filter attached to a socket.
+Once set, a filter cannot be removed or changed. This allows one process to
+set up a socket, attach a filter, lock it, then drop privileges and be
+assured that the filter will be kept until the socket is closed.
+
+The biggest user of this construct might be libpcap. Issuing a high-level
+filter command like ``tcpdump -i em1 port 22`` passes through the libpcap
+internal compiler that generates a structure that can eventually be loaded
+via SO_ATTACH_FILTER to the kernel. ``tcpdump -i em1 port 22 -ddd``
+displays what is being placed into this structure.
+
+Although we were only speaking about sockets here, BPF in Linux is used
+in many more places. There's xt_bpf for netfilter, cls_bpf in the kernel
+qdisc layer, SECCOMP-BPF (SECure COMPuting [1]_), and lots of other places
+such as team driver, PTP code, etc where BPF is being used.
+
+.. [1] Documentation/userspace-api/seccomp_filter.rst
+
+Original BPF paper:
+
+Steven McCanne and Van Jacobson. 1993. The BSD packet filter: a new
+architecture for user-level packet capture. In Proceedings of the
+USENIX Winter 1993 Conference Proceedings on USENIX Winter 1993
+Conference Proceedings (USENIX'93). USENIX Association, Berkeley,
+CA, USA, 2-2. [http://www.tcpdump.org/papers/bpf-usenix93.pdf]
+
+Structure
+---------
+
+User space applications include <linux/filter.h> which contains the
+following relevant structures::
+
+	struct sock_filter {	/* Filter block */
+		__u16	code;   /* Actual filter code */
+		__u8	jt;	/* Jump true */
+		__u8	jf;	/* Jump false */
+		__u32	k;      /* Generic multiuse field */
+	};
+
+Such a structure is assembled as an array of 4-tuples, each containing
+a code, jt, jf and k value. jt and jf are jump offsets and k is a generic
+value to be used for a provided code::
+
+	struct sock_fprog {			/* Required for SO_ATTACH_FILTER. */
+		unsigned short		   len;	/* Number of filter blocks */
+		struct sock_filter __user *filter;
+	};
+
+For socket filtering, a pointer to this structure (as shown in the
+following example) is passed to the kernel through setsockopt(2).
+
+Example
+-------
+
+::
+
+    #include <sys/socket.h>
+    #include <sys/types.h>
+    #include <arpa/inet.h>
+    #include <linux/if_ether.h>
+    /* ... */
+
+    /* From the example above: tcpdump -i em1 port 22 -dd */
+    struct sock_filter code[] = {
+	    { 0x28,  0,  0, 0x0000000c },
+	    { 0x15,  0,  8, 0x000086dd },
+	    { 0x30,  0,  0, 0x00000014 },
+	    { 0x15,  2,  0, 0x00000084 },
+	    { 0x15,  1,  0, 0x00000006 },
+	    { 0x15,  0, 17, 0x00000011 },
+	    { 0x28,  0,  0, 0x00000036 },
+	    { 0x15, 14,  0, 0x00000016 },
+	    { 0x28,  0,  0, 0x00000038 },
+	    { 0x15, 12, 13, 0x00000016 },
+	    { 0x15,  0, 12, 0x00000800 },
+	    { 0x30,  0,  0, 0x00000017 },
+	    { 0x15,  2,  0, 0x00000084 },
+	    { 0x15,  1,  0, 0x00000006 },
+	    { 0x15,  0,  8, 0x00000011 },
+	    { 0x28,  0,  0, 0x00000014 },
+	    { 0x45,  6,  0, 0x00001fff },
+	    { 0xb1,  0,  0, 0x0000000e },
+	    { 0x48,  0,  0, 0x0000000e },
+	    { 0x15,  2,  0, 0x00000016 },
+	    { 0x48,  0,  0, 0x00000010 },
+	    { 0x15,  0,  1, 0x00000016 },
+	    { 0x06,  0,  0, 0x0000ffff },
+	    { 0x06,  0,  0, 0x00000000 },
+    };
+
+    struct sock_fprog bpf = {
+	    .len = sizeof(code) / sizeof(code[0]),
+	    .filter = code,
+    };
+
+    sock = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
+    if (sock < 0)
+	    /* ... bail out ... */
+
+    ret = setsockopt(sock, SOL_SOCKET, SO_ATTACH_FILTER, &bpf, sizeof(bpf));
+    if (ret < 0)
+	    /* ... bail out ... */
+
+    /* ... */
+    close(sock);
+
+The above example code attaches a socket filter for a PF_PACKET socket
+in order to let all IPv4/IPv6 packets with port 22 pass. The rest will
+be dropped for this socket.
+
+The setsockopt(2) call for SO_DETACH_FILTER does not need a meaningful
+argument, while SO_LOCK_FILTER, which prevents the filter from being
+detached, takes an integer value of 0 or 1.
+
+Note that socket filters are not restricted to PF_PACKET sockets only,
+but can also be used on other socket families.
+
+Summary of system calls:
+
+ * setsockopt(sockfd, SOL_SOCKET, SO_ATTACH_FILTER, &val, sizeof(val));
+ * setsockopt(sockfd, SOL_SOCKET, SO_DETACH_FILTER, &val, sizeof(val));
+ * setsockopt(sockfd, SOL_SOCKET, SO_LOCK_FILTER,   &val, sizeof(val));
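+
+A minimal user-space sketch of the three calls listed above. The helper
+names attach_accept_all(), lock_filter() and detach_filter() are
+hypothetical and exist only for illustration; the one-instruction filter
+simply accepts every packet. Since the option value is not meaningful for
+SO_DETACH_FILTER, the sketch passes a dummy integer there::
+
+    #include <sys/socket.h>
+    #include <linux/filter.h>
+
+    /* Hypothetical helper: attach a filter that accepts every packet. */
+    static int attach_accept_all(int sock)
+    {
+        struct sock_filter code[] = {
+            { 0x06, 0, 0, 0xffffffff },    /* ret #-1 */
+        };
+        struct sock_fprog prog = {
+            .len = sizeof(code) / sizeof(code[0]),
+            .filter = code,
+        };
+
+        return setsockopt(sock, SOL_SOCKET, SO_ATTACH_FILTER,
+                          &prog, sizeof(prog));
+    }
+
+    /* Hypothetical helper: lock the currently attached filter. */
+    static int lock_filter(int sock)
+    {
+        int on = 1;
+
+        return setsockopt(sock, SOL_SOCKET, SO_LOCK_FILTER,
+                          &on, sizeof(on));
+    }
+
+    /* Hypothetical helper: detach the filter; the value is a dummy. */
+    static int detach_filter(int sock)
+    {
+        int unused = 0;
+
+        return setsockopt(sock, SOL_SOCKET, SO_DETACH_FILTER,
+                          &unused, sizeof(unused));
+    }
+
+With these helpers, a detach_filter() call after a successful
+lock_filter() call is expected to fail, since a locked filter cannot be
+removed or changed.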
+
+Normally, most use cases for socket filtering on packet sockets will be
+covered by libpcap in high-level syntax, so as an application developer
+you should stick to that. libpcap wraps its own layer around all that.
+
+Writing such a filter "by hand" can be an alternative if i) using/linking
+to libpcap is not an option, ii) the required BPF filters use Linux
+extensions that are not supported by libpcap's compiler, iii) a filter
+is more complex and not cleanly implementable with libpcap's compiler,
+or iv) particular filter codes should be optimized differently than
+libpcap's internal compiler does. For example,
+xt_bpf and cls_bpf users might have requirements that could result in
+more complex filter code, or one that cannot be expressed with libpcap
+(e.g. different return codes for various code paths). Moreover, BPF JIT
+implementors may wish to manually write test cases and thus need low-level
+access to BPF code as well.
+
+BPF engine and instruction set
+------------------------------
+
+Under tools/bpf/ there's a small helper tool called bpf_asm which can
+be used to write low-level filters for example scenarios mentioned in the
+previous section. Asm-like syntax mentioned here has been implemented in
+bpf_asm and will be used for further explanations (instead of dealing with
+less readable opcodes directly, principles are the same). The syntax is
+closely modelled after Steven McCanne's and Van Jacobson's BPF paper.
+
+The BPF architecture consists of the following basic elements:
+
+  =======          ====================================================
+  Element          Description
+  =======          ====================================================
+  A                32 bit wide accumulator
+  X                32 bit wide X register
+  M[]              16 x 32 bit wide misc registers aka "scratch memory
+		   store", addressable from 0 to 15
+  =======          ====================================================
+
+A program that is translated by bpf_asm into "opcodes" is an array that
+consists of the following elements (as already mentioned)::
+
+  op:16, jt:8, jf:8, k:32
+
+The element op is a 16 bit wide opcode that has a particular instruction
+encoded. jt and jf are two 8 bit wide jump targets, one for condition
+"jump if true", the other one "jump if false". Finally, element k
+contains a miscellaneous argument that can be interpreted in different
+ways depending on the given instruction in op.
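+
+As a rough illustration of how these fields interact, the following
+user-space sketch interprets a tiny subset of classic BPF (ldh, ldb, jeq
+and ret with immediate operands only). It is not the kernel interpreter;
+struct insn, run_filter() and arp_filter are hypothetical names, and the
+opcode values are the ones that appear in the C-style dumps in this
+document. Jump offsets are counted from the instruction following the
+jump::
+
+    #include <stddef.h>
+    #include <stdint.h>
+
+    /* Hypothetical mirror of the op:16, jt:8, jf:8, k:32 layout. */
+    struct insn {
+        uint16_t code;
+        uint8_t  jt;
+        uint8_t  jf;
+        uint32_t k;
+    };
+
+    /* ARP filter: ldh [12]; jeq #0x806; ret #-1; ret #0 */
+    static const struct insn arp_filter[] = {
+        { 0x28, 0, 0, 0x0000000c },
+        { 0x15, 0, 1, 0x00000806 },
+        { 0x06, 0, 0, 0xffffffff },
+        { 0x06, 0, 0, 0x00000000 },
+    };
+
+    /*
+     * Returns the filter's return value (0 means drop). Unknown opcodes
+     * and out-of-bounds packet loads also return 0 in this sketch.
+     */
+    static uint32_t run_filter(const struct insn *prog, size_t len,
+                               const uint8_t *pkt, size_t pkt_len)
+    {
+        uint32_t A = 0;    /* accumulator */
+        size_t pc = 0;     /* index of the next instruction */
+
+        while (pc < len) {
+            const struct insn *i = &prog[pc++];
+
+            switch (i->code) {
+            case 0x28:     /* ldh [k]: big-endian half-word into A */
+                if ((uint64_t)i->k + 2 > pkt_len)
+                    return 0;
+                A = ((uint32_t)pkt[i->k] << 8) | pkt[i->k + 1];
+                break;
+            case 0x30:     /* ldb [k]: byte into A */
+                if (i->k >= pkt_len)
+                    return 0;
+                A = pkt[i->k];
+                break;
+            case 0x15:     /* jeq #k: skip jt or jf instructions */
+                pc += (A == i->k) ? i->jt : i->jf;
+                break;
+            case 0x06:     /* ret #k */
+                return i->k;
+            default:
+                return 0;
+            }
+        }
+        return 0;
+    }
+
+Running arp_filter over a 14 byte Ethernet header whose EtherType field
+(offset 12) holds 0x0806 yields a non-zero return value, so the packet
+passes; any other EtherType makes the filter return 0.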
+
+The instruction set consists of load, store, branch, alu, miscellaneous
+and return instructions that are also represented in bpf_asm syntax. The
+following table lists all available bpf_asm instructions and what their
+underlying opcodes, as defined in linux/filter.h, stand for:
+
+  ===========      ===================  =====================
+  Instruction      Addressing mode      Description
+  ===========      ===================  =====================
+  ld               1, 2, 3, 4, 12       Load word into A
+  ldi              4                    Load word into A
+  ldh              1, 2                 Load half-word into A
+  ldb              1, 2                 Load byte into A
+  ldx              3, 4, 5, 12          Load word into X
+  ldxi             4                    Load word into X
+  ldxb             5                    Load byte into X
+
+  st               3                    Store A into M[]
+  stx              3                    Store X into M[]
+
+  jmp              6                    Jump to label
+  ja               6                    Jump to label
+  jeq              7, 8, 9, 10          Jump on A == <x>
+  jneq             9, 10                Jump on A != <x>
+  jne              9, 10                Jump on A != <x>
+  jlt              9, 10                Jump on A <  <x>
+  jle              9, 10                Jump on A <= <x>
+  jgt              7, 8, 9, 10          Jump on A >  <x>
+  jge              7, 8, 9, 10          Jump on A >= <x>
+  jset             7, 8, 9, 10          Jump on A &  <x>
+
+  add              0, 4                 A + <x>
+  sub              0, 4                 A - <x>
+  mul              0, 4                 A * <x>
+  div              0, 4                 A / <x>
+  mod              0, 4                 A % <x>
+  neg                                   -A
+  and              0, 4                 A & <x>
+  or               0, 4                 A | <x>
+  xor              0, 4                 A ^ <x>
+  lsh              0, 4                 A << <x>
+  rsh              0, 4                 A >> <x>
+
+  tax                                   Copy A into X
+  txa                                   Copy X into A
+
+  ret              4, 11                Return
+  ===========      ===================  =====================
+
+The next table shows addressing formats from the 2nd column:
+
+  ===============  ===================  ===============================================
+  Addressing mode  Syntax               Description
+  ===============  ===================  ===============================================
+   0               x/%x                 Register X
+   1               [k]                  BHW at byte offset k in the packet
+   2               [x + k]              BHW at the offset X + k in the packet
+   3               M[k]                 Word at offset k in M[]
+   4               #k                   Literal value stored in k
+   5               4*([k]&0xf)          Lower nibble * 4 at byte offset k in the packet
+   6               L                    Jump label L
+   7               #k,Lt,Lf             Jump to Lt if true, otherwise jump to Lf
+   8               x/%x,Lt,Lf           Jump to Lt if true, otherwise jump to Lf
+   9               #k,Lt                Jump to Lt if predicate is true
+  10               x/%x,Lt              Jump to Lt if predicate is true
+  11               a/%a                 Accumulator A
+  12               extension            BPF extension
+  ===============  ===================  ===============================================
+
+The Linux kernel also has a couple of BPF extensions that are used along
+with the class of load instructions by "overloading" the k argument with
+a negative offset + a particular extension offset. The result of such a
+BPF extension is loaded into A.
+
+Possible BPF extensions are shown in the following table:
+
+  ===================================   =================================================
+  Extension                             Description
+  ===================================   =================================================
+  len                                   skb->len
+  proto                                 skb->protocol
+  type                                  skb->pkt_type
+  poff                                  Payload start offset
+  ifidx                                 skb->dev->ifindex
+  nla                                   Netlink attribute of type X with offset A
+  nlan                                  Nested Netlink attribute of type X with offset A
+  mark                                  skb->mark
+  queue                                 skb->queue_mapping
+  hatype                                skb->dev->type
+  rxhash                                skb->hash
+  cpu                                   raw_smp_processor_id()
+  vlan_tci                              skb_vlan_tag_get(skb)
+  vlan_avail                            skb_vlan_tag_present(skb)
+  vlan_tpid                             skb->vlan_proto
+  rand                                  prandom_u32()
+  ===================================   =================================================
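+
+As a sketch of what this "overloading" looks like at the opcode level,
+the vlan_tci load can be encoded by hand. This assumes the SKF_AD_OFF and
+SKF_AD_VLAN_TAG constants and the BPF_STMT()/BPF_JUMP() helper macros
+from <linux/filter.h>; the array name vlan10_filter is hypothetical::
+
+    #include <linux/filter.h>
+
+    /* Pass packets whose (accelerated) VLAN tag equals 10, drop the rest. */
+    static const struct sock_filter vlan10_filter[] = {
+        /* ld vlan_tci: absolute word load with k in the extension range */
+        BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
+                 (__u32)(SKF_AD_OFF + SKF_AD_VLAN_TAG)),
+        /* jeq #10: fall through on match, otherwise skip one instruction */
+        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, 10, 0, 1),
+        BPF_STMT(BPF_RET | BPF_K, 0xffffffff),    /* ret #-1 */
+        BPF_STMT(BPF_RET | BPF_K, 0),             /* ret #0  */
+    };
+
+The k value of the first instruction is negative when read as a signed
+32-bit number, which is how the load is recognised as an extension
+rather than a packet offset.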
+
+These extensions can also be prefixed with '#'.
+
+Examples for low-level BPF:
+
+**ARP packets**::
+
+  ldh [12]
+  jne #0x806, drop
+  ret #-1
+  drop: ret #0
+
+**IPv4 TCP packets**::
+
+  ldh [12]
+  jne #0x800, drop
+  ldb [23]
+  jneq #6, drop
+  ret #-1
+  drop: ret #0
+
+**(Accelerated) VLAN w/ id 10**::
+
+  ld vlan_tci
+  jneq #10, drop
+  ret #-1
+  drop: ret #0
+
+**ICMP random packet sampling, 1 in 4**::
+
+  ldh [12]
+  jne #0x800, drop
+  ldb [23]
+  jneq #1, drop
+  # get a random uint32 number
+  ld rand
+  mod #4
+  jneq #1, drop
+  ret #-1
+  drop: ret #0
+
+**SECCOMP filter example**::
+
+  ld [4]                  /* offsetof(struct seccomp_data, arch) */
+  jne #0xc000003e, bad    /* AUDIT_ARCH_X86_64 */
+  ld [0]                  /* offsetof(struct seccomp_data, nr) */
+  jeq #15, good           /* __NR_rt_sigreturn */
+  jeq #231, good          /* __NR_exit_group */
+  jeq #60, good           /* __NR_exit */
+  jeq #0, good            /* __NR_read */
+  jeq #1, good            /* __NR_write */
+  jeq #5, good            /* __NR_fstat */
+  jeq #9, good            /* __NR_mmap */
+  jeq #14, good           /* __NR_rt_sigprocmask */
+  jeq #13, good           /* __NR_rt_sigaction */
+  jeq #35, good           /* __NR_nanosleep */
+  bad: ret #0             /* SECCOMP_RET_KILL_THREAD */
+  good: ret #0x7fff0000   /* SECCOMP_RET_ALLOW */
+
+The above example code can be placed into a file (here called "foo") and
+then be passed to the bpf_asm tool to generate opcodes, in an output format
+that xt_bpf and cls_bpf understand and can load directly. Example with the
+above ARP code::
+
+    $ ./bpf_asm foo
+    4,40 0 0 12,21 0 1 2054,6 0 0 4294967295,6 0 0 0,
+
+In copy and paste C-like output::
+
+    $ ./bpf_asm -c foo
+    { 0x28,  0,  0, 0x0000000c },
+    { 0x15,  0,  1, 0x00000806 },
+    { 0x06,  0,  0, 0xffffffff },
+    { 0x06,  0,  0, 0000000000 },
+
+In particular, as usage with xt_bpf or cls_bpf can result in more complex BPF
+filters that might not be obvious at first, it's good to test filters before
+attaching to a live system. For that purpose, there's a small tool called
+bpf_dbg under tools/bpf/ in the kernel source directory. This debugger allows
+for testing BPF filters against given pcap files, single stepping through the
+BPF code on the pcap's packets, and dumping the BPF machine registers.
+
+Starting bpf_dbg is trivial and just requires issuing::
+
+    # ./bpf_dbg
+
+In case input and output do not equal stdin/stdout, bpf_dbg takes an
+alternative stdin source as a first argument, and an alternative stdout
+sink as a second one, e.g. ``./bpf_dbg test_in.txt test_out.txt``.
+
+Other than that, a particular libreadline configuration can be set via
+file "~/.bpf_dbg_init" and the command history is stored in the file
+"~/.bpf_dbg_history".
+
+Interaction in bpf_dbg happens through a shell that also has auto-completion
+support (follow-up example commands starting with '>' denote bpf_dbg shell).
+The usual workflow would be to ...
+
+* load bpf 6,40 0 0 12,21 0 3 2048,48 0 0 23,21 0 1 1,6 0 0 65535,6 0 0 0
+
+  Loads a BPF filter from standard output of bpf_asm, or transformed via
+  e.g. ``tcpdump -iem1 -ddd port 22 | tr '\n' ','``. Note that for JIT
+  debugging (next section), this command creates a temporary socket and
+  loads the BPF code into the kernel. Thus, this will also be useful for
+  JIT developers.
+
+* load pcap foo.pcap
+
+  Loads standard tcpdump pcap file.
+
+* run [<n>]::
+
+	bpf passes:1 fails:9
+
+  Runs through all packets from a pcap to count how many passes and fails
+  the filter will generate. A limit on the number of packets to traverse
+  can be given.
+
+* disassemble::
+
+	l0:	ldh [12]
+	l1:	jeq #0x800, l2, l5
+	l2:	ldb [23]
+	l3:	jeq #0x1, l4, l5
+	l4:	ret #0xffff
+	l5:	ret #0
+
+  Prints out BPF code disassembly.
+
+* dump::
+
+	/* { op, jt, jf, k }, */
+	{ 0x28,  0,  0, 0x0000000c },
+	{ 0x15,  0,  3, 0x00000800 },
+	{ 0x30,  0,  0, 0x00000017 },
+	{ 0x15,  0,  1, 0x00000001 },
+	{ 0x06,  0,  0, 0x0000ffff },
+	{ 0x06,  0,  0, 0000000000 },
+
+  Prints out C-style BPF code dump.
+
+* breakpoint 0::
+
+	breakpoint at: l0:	ldh [12]
+
+* breakpoint 1::
+
+	breakpoint at: l1:	jeq #0x800, l2, l5
+
+  ...
+
+  Sets breakpoints at particular BPF instructions. Issuing a ``run`` command
+  will walk through the pcap file continuing from the current packet and
+  break when a breakpoint is hit (another ``run`` will continue from
+  the currently active breakpoint executing next instructions):
+
+  * run::
+
+	-- register dump --
+	pc:       [0]                       <-- program counter
+	code:     [40] jt[0] jf[0] k[12]    <-- plain BPF code of current instruction
+	curr:     l0:	ldh [12]              <-- disassembly of current instruction
+	A:        [00000000][0]             <-- content of A (hex, decimal)
+	X:        [00000000][0]             <-- content of X (hex, decimal)
+	M[0,15]:  [00000000][0]             <-- folded content of M (hex, decimal)
+	-- packet dump --                   <-- Current packet from pcap (hex)
+	len: 42
+	 0: 00 19 cb 55 55 a4 00 14 a4 43 78 69 08 06 00 01
+	16: 08 00 06 04 00 01 00 14 a4 43 78 69 0a 3b 01 26
+	32: 00 00 00 00 00 00 0a 3b 01 01
+	(breakpoint)
+	>
+
+  * breakpoint::
+
+	breakpoints: 0 1
+
+    Prints currently set breakpoints.
+
+* step [-<n>, +<n>]
+
+  Performs single stepping through the BPF program from the current pc
+  offset. On each step invocation, the above register dump is issued.
+  This can go forwards and backwards in time; a plain ``step`` will break
+  on the next BPF instruction, thus +1. (No ``run`` needs to be issued here.)
+
+* select <n>
+
+  Selects a given packet from the pcap file to continue from. Thus, on
+  the next ``run`` or ``step``, the BPF program is evaluated against
+  the user pre-selected packet. Numbering starts just as in Wireshark
+  with index 1.
+
+* quit
+
+  Exits bpf_dbg.
+
+JIT compiler
+------------
+
+The Linux kernel has a built-in BPF JIT compiler for x86_64, SPARC,
+PowerPC, ARM, ARM64, MIPS, RISC-V and s390 and can be enabled through
+CONFIG_BPF_JIT. The JIT compiler is transparently invoked for each
+attached filter from user space or for internal kernel users if it has
+been previously enabled by root::
+
+  echo 1 > /proc/sys/net/core/bpf_jit_enable
+
+For JIT developers, auditors, etc., each compile run can output the generated
+opcode image into the kernel log via::
+
+  echo 2 > /proc/sys/net/core/bpf_jit_enable
+
+Example output from dmesg::
+
+    [ 3389.935842] flen=6 proglen=70 pass=3 image=ffffffffa0069c8f
+    [ 3389.935847] JIT code: 00000000: 55 48 89 e5 48 83 ec 60 48 89 5d f8 44 8b 4f 68
+    [ 3389.935849] JIT code: 00000010: 44 2b 4f 6c 4c 8b 87 d8 00 00 00 be 0c 00 00 00
+    [ 3389.935850] JIT code: 00000020: e8 1d 94 ff e0 3d 00 08 00 00 75 16 be 17 00 00
+    [ 3389.935851] JIT code: 00000030: 00 e8 28 94 ff e0 83 f8 01 75 07 b8 ff ff 00 00
+    [ 3389.935852] JIT code: 00000040: eb 02 31 c0 c9 c3
+
+When CONFIG_BPF_JIT_ALWAYS_ON is enabled, bpf_jit_enable is permanently set to 1 and
+setting any other value than that will result in failure. This is even the case for
+setting bpf_jit_enable to 2, since dumping the final JIT image into the kernel log
+is discouraged and introspection through bpftool (under tools/bpf/bpftool/) is the
+generally recommended approach instead.
+
+In the kernel source tree under tools/bpf/, there's bpf_jit_disasm for
+generating disassembly out of the kernel log's hexdump::
+
+	# ./bpf_jit_disasm
+	70 bytes emitted from JIT compiler (pass:3, flen:6)
+	ffffffffa0069c8f + <x>:
+	0:	push   %rbp
+	1:	mov    %rsp,%rbp
+	4:	sub    $0x60,%rsp
+	8:	mov    %rbx,-0x8(%rbp)
+	c:	mov    0x68(%rdi),%r9d
+	10:	sub    0x6c(%rdi),%r9d
+	14:	mov    0xd8(%rdi),%r8
+	1b:	mov    $0xc,%esi
+	20:	callq  0xffffffffe0ff9442
+	25:	cmp    $0x800,%eax
+	2a:	jne    0x0000000000000042
+	2c:	mov    $0x17,%esi
+	31:	callq  0xffffffffe0ff945e
+	36:	cmp    $0x1,%eax
+	39:	jne    0x0000000000000042
+	3b:	mov    $0xffff,%eax
+	40:	jmp    0x0000000000000044
+	42:	xor    %eax,%eax
+	44:	leaveq
+	45:	retq
+
+Issuing option ``-o`` will "annotate" opcodes to resulting assembler
+instructions, which can be very useful for JIT developers::
+
+	# ./bpf_jit_disasm -o
+	70 bytes emitted from JIT compiler (pass:3, flen:6)
+	ffffffffa0069c8f + <x>:
+	0:	push   %rbp
+		55
+	1:	mov    %rsp,%rbp
+		48 89 e5
+	4:	sub    $0x60,%rsp
+		48 83 ec 60
+	8:	mov    %rbx,-0x8(%rbp)
+		48 89 5d f8
+	c:	mov    0x68(%rdi),%r9d
+		44 8b 4f 68
+	10:	sub    0x6c(%rdi),%r9d
+		44 2b 4f 6c
+	14:	mov    0xd8(%rdi),%r8
+		4c 8b 87 d8 00 00 00
+	1b:	mov    $0xc,%esi
+		be 0c 00 00 00
+	20:	callq  0xffffffffe0ff9442
+		e8 1d 94 ff e0
+	25:	cmp    $0x800,%eax
+		3d 00 08 00 00
+	2a:	jne    0x0000000000000042
+		75 16
+	2c:	mov    $0x17,%esi
+		be 17 00 00 00
+	31:	callq  0xffffffffe0ff945e
+		e8 28 94 ff e0
+	36:	cmp    $0x1,%eax
+		83 f8 01
+	39:	jne    0x0000000000000042
+		75 07
+	3b:	mov    $0xffff,%eax
+		b8 ff ff 00 00
+	40:	jmp    0x0000000000000044
+		eb 02
+	42:	xor    %eax,%eax
+		31 c0
+	44:	leaveq
+		c9
+	45:	retq
+		c3
+
+For BPF JIT developers, bpf_jit_disasm, bpf_asm and bpf_dbg provide a useful
+toolchain for developing and testing the kernel's JIT compiler.
+
+BPF kernel internals
+--------------------
+
+Internally, the kernel interpreter uses a different instruction set
+format, built on principles similar to those of the BPF described in the
+previous paragraphs. However, the instruction set format is modelled
+closer to the underlying architecture to mimic native instruction sets, so
+that a better performance can be achieved (more details later). This new
+ISA is called 'eBPF' or 'internal BPF' interchangeably. (Note: eBPF which
+originates from [e]xtended BPF is not the same as BPF extensions! While
+eBPF is an ISA, BPF extensions date back to classic BPF's 'overloading'
+of BPF_LD | BPF_{B,H,W} | BPF_ABS instruction.)
+
+It is designed to be JITed with one to one mapping, which can also open up
+the possibility for GCC/LLVM compilers to generate optimized eBPF code through
+an eBPF backend that performs almost as fast as natively compiled code.
+
+The new instruction set was originally designed with the goal of writing
+programs in "restricted C" and compiling them into eBPF with an optional
+GCC/LLVM backend, so that it can just-in-time map to modern 64-bit CPUs with
+minimal performance overhead over two steps, that is, C -> eBPF -> native code.
+
+Currently, the new format is being used for running user BPF programs, which
+includes seccomp BPF, classic socket filters, cls_bpf traffic classifier,
+team driver's classifier for its load-balancing mode, netfilter's xt_bpf
+extension, PTP dissector/classifier, and much more. They are all internally
+converted by the kernel into the new instruction set representation and run
+in the eBPF interpreter. For in-kernel handlers, this all works transparently
+by using bpf_prog_create() for setting up the filter and
+bpf_prog_destroy() for destroying it. The macro
+BPF_PROG_RUN(filter, ctx) transparently invokes eBPF interpreter or JITed
+code to run the filter. 'filter' is a pointer to struct bpf_prog that we
+got from bpf_prog_create(), and 'ctx' the given context (e.g.
+skb pointer). All constraints and restrictions from bpf_check_classic() apply
+before a conversion to the new layout is being done behind the scenes!
+
+Currently, the classic BPF format is being used for JITing on most
+32-bit architectures, whereas x86-64, aarch64, s390x, powerpc64,
+sparc64, arm32, riscv64, riscv32 perform JIT compilation from eBPF
+instruction set.
+
+Some core changes of the new internal format:
+
+- Number of registers increases from 2 to 10:
+
+  The old format had two registers A and X, and a hidden frame pointer. The
+  new layout extends this to be 10 internal registers and a read-only frame
+  pointer. Since 64-bit CPUs are passing arguments to functions via registers
+  the number of args from eBPF program to in-kernel function is restricted
+  to 5 and one register is used to accept return value from an in-kernel
+  function. Natively, x86_64 passes first 6 arguments in registers, aarch64/
+  sparcv9/mips64 have 7 - 8 registers for arguments; x86_64 has 6 callee saved
+  registers, and aarch64/sparcv9/mips64 have 11 or more callee saved registers.
+
+  Therefore, eBPF calling convention is defined as:
+
+    * R0	- return value from in-kernel function, and exit value for eBPF program
+    * R1 - R5	- arguments from eBPF program to in-kernel function
+    * R6 - R9	- callee saved registers that in-kernel function will preserve
+    * R10	- read-only frame pointer to access stack
+
+  Thus, all eBPF registers map one to one to HW registers on x86_64, aarch64,
+  etc, and eBPF calling convention maps directly to ABIs used by the kernel on
+  64-bit architectures.
+
+  On 32-bit architectures, the JIT may map programs that use only 32-bit
+  arithmetic and may let more complex programs be interpreted.
+
+  R0 - R5 are scratch registers and an eBPF program needs to spill/fill them
+  if necessary across calls. Note that there is only one eBPF program (== one
+  eBPF main routine) and it cannot call other eBPF functions; it can only
+  call predefined in-kernel functions.
+
+- Register width increases from 32-bit to 64-bit:
+
+  Still, the semantics of the original 32-bit ALU operations are preserved
+  via 32-bit subregisters. All eBPF registers are 64-bit with 32-bit lower
+  subregisters that zero-extend into 64-bit if they are being written to.
+  That behavior maps directly to x86_64 and arm64 subregister definition, but
+  makes other JITs more difficult.
+
+  32-bit architectures run 64-bit internal BPF programs via interpreter.
+  Their JITs may convert BPF programs that only use 32-bit subregisters into
+  the native instruction set and let the rest be interpreted.
+
+  Operation is 64-bit because on 64-bit architectures pointers are also
+  64-bit wide, and we want to pass 64-bit values in/out of kernel functions.
+  32-bit eBPF registers would otherwise require defining a register-pair
+  ABI. A direct eBPF register to HW register mapping would then not be
+  possible, and the JIT would need to do combine/split/move operations for
+  every register in and out of the function, which is complex, bug prone and
+  slow. Another reason is the use of atomic 64-bit counters.
+
+- Conditional jt/jf targets replaced with jt/fall-through:
+
+  While the original design has constructs such as ``if (cond) jump_true;
+  else jump_false;``, they are replaced with alternative constructs like
+  ``if (cond) jump_true; /* else fall-through */``.
+
+- Introduces bpf_call insn and register passing convention for zero overhead
+  calls from/to other kernel functions:
+
+  Before an in-kernel function call, the internal BPF program needs to
+  place function arguments into registers R1 to R5 to satisfy the calling
+  convention; the interpreter then takes them from the registers and passes
+  them to the in-kernel function. If registers R1 - R5 are mapped to CPU
+  registers that are used for argument passing on the given architecture, the
+  JIT compiler doesn't need to emit extra moves. Function arguments will be in
+  the correct registers and the BPF_CALL instruction will be JITed as a single
+  'call' HW instruction. This calling convention was picked to cover common
+  call situations without performance penalty.
+
+  After an in-kernel function call, R1 - R5 are reset to unreadable and R0 has
+  a return value of the function. Since R6 - R9 are callee saved, their state
+  is preserved across the call.
+
+  For example, consider three C functions::
+
+    u64 f1() { return (*_f2)(1); }
+    u64 f2(u64 a) { return f3(a + 1, a); }
+    u64 f3(u64 a, u64 b) { return a - b; }
+
+  GCC can compile f1, f3 into x86_64::
+
+    f1:
+	movl $1, %edi
+	movq _f2(%rip), %rax
+	jmp  *%rax
+    f3:
+	movq %rdi, %rax
+	subq %rsi, %rax
+	ret
+
+  Function f2 in eBPF may look like::
+
+    f2:
+	bpf_mov R2, R1
+	bpf_add R1, 1
+	bpf_call f3
+	bpf_exit
+
+  If f2 is JITed and the pointer is stored to ``_f2``, the calls f1 -> f2 -> f3
+  and the returns will be seamless. Without JIT, the __bpf_prog_run()
+  interpreter needs to be used to call into f2.
+
+  For practical reasons all eBPF programs have only one argument 'ctx' which is
+  already placed into R1 (e.g. on __bpf_prog_run() startup) and the programs
+  can call kernel functions with up to 5 arguments. Calls with 6 or more arguments
+  are currently not supported, but these restrictions can be lifted if necessary
+  in the future.
+
+  On 64-bit architectures all registers map to HW registers one to one. For
+  example, x86_64 JIT compiler can map them as ...
+
+  ::
+
+    R0 - rax
+    R1 - rdi
+    R2 - rsi
+    R3 - rdx
+    R4 - rcx
+    R5 - r8
+    R6 - rbx
+    R7 - r13
+    R8 - r14
+    R9 - r15
+    R10 - rbp
+
+  ... since x86_64 ABI mandates rdi, rsi, rdx, rcx, r8, r9 for argument passing
+  and rbx, r12 - r15 are callee saved.
+
+  Then the following internal BPF pseudo-program::
+
+    bpf_mov R6, R1 /* save ctx */
+    bpf_mov R2, 2
+    bpf_mov R3, 3
+    bpf_mov R4, 4
+    bpf_mov R5, 5
+    bpf_call foo
+    bpf_mov R7, R0 /* save foo() return value */
+    bpf_mov R1, R6 /* restore ctx for next call */
+    bpf_mov R2, 6
+    bpf_mov R3, 7
+    bpf_mov R4, 8
+    bpf_mov R5, 9
+    bpf_call bar
+    bpf_add R0, R7
+    bpf_exit
+
+  After JIT to x86_64 may look like::
+
+    push %rbp
+    mov %rsp,%rbp
+    sub $0x228,%rsp
+    mov %rbx,-0x228(%rbp)
+    mov %r13,-0x220(%rbp)
+    mov %rdi,%rbx
+    mov $0x2,%esi
+    mov $0x3,%edx
+    mov $0x4,%ecx
+    mov $0x5,%r8d
+    callq foo
+    mov %rax,%r13
+    mov %rbx,%rdi
+    mov $0x6,%esi
+    mov $0x7,%edx
+    mov $0x8,%ecx
+    mov $0x9,%r8d
+    callq bar
+    add %r13,%rax
+    mov -0x228(%rbp),%rbx
+    mov -0x220(%rbp),%r13
+    leaveq
+    retq
+
+  Which is in this example equivalent in C to::
+
+    u64 bpf_filter(u64 ctx)
+    {
+	return foo(ctx, 2, 3, 4, 5) + bar(ctx, 6, 7, 8, 9);
+    }
+
+  In-kernel functions foo() and bar() with prototype: u64 (*)(u64 arg1, u64
+  arg2, u64 arg3, u64 arg4, u64 arg5); will receive arguments in proper
+  registers and place their return value into ``%rax`` which is R0 in eBPF.
+  Prologue and epilogue are emitted by JIT and are implicit in the
+  interpreter. R0-R5 are scratch registers, so an eBPF program needs to
+  preserve them across the calls as defined by the calling convention.
+
+  For example the following program is invalid::
+
+    bpf_mov R1, 1
+    bpf_call foo
+    bpf_mov R0, R1
+    bpf_exit
+
+  After the call the registers R1-R5 contain junk values and cannot be read.
+  An in-kernel eBPF verifier is used to validate internal BPF programs.
+
+Also in the new design, eBPF is limited to 4096 insns, which means that any
+program will terminate quickly and will only call a fixed number of kernel
+functions. Original BPF and the new format are two operand instructions,
+which helps to do one-to-one mapping between eBPF insn and x86 insn during JIT.
+
+The input context pointer for invoking the interpreter function is generic;
+its content is defined by a specific use case. For seccomp, register R1 points
+to seccomp_data; for converted BPF filters, R1 points to a skb.
+
+A program that is translated internally consists of the following elements::
+
+  op:16, jt:8, jf:8, k:32    ==>    op:8, dst_reg:4, src_reg:4, off:16, imm:32
+
+So far 87 internal BPF instructions were implemented. 8-bit 'op' opcode field
+has room for new instructions. Some of them may use 16/24/32 byte encoding. New
+instructions must be a multiple of 8 bytes to preserve backward compatibility.
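+
+As an illustration only, the 64-bit layout on the right-hand side of the
+mapping above can be sketched in C. ``struct insn_sketch`` and its helpers are
+hypothetical names introduced here; the two 4-bit register fields are packed
+by hand, and placing dst_reg in the low nibble is an assumption of this sketch.
+

```c
#include <stdint.h>

/* Hypothetical mirror of: op:8, dst_reg:4, src_reg:4, off:16, imm:32. */
struct insn_sketch {
    uint8_t op;    /* 8-bit opcode */
    uint8_t regs;  /* assumed: low nibble dst_reg, high nibble src_reg */
    int16_t off;   /* signed 16-bit offset */
    int32_t imm;   /* signed 32-bit immediate */
};

/* Build one instruction from its five fields. */
struct insn_sketch insn_make(uint8_t op, uint8_t dst, uint8_t src,
                             int16_t off, int32_t imm)
{
    struct insn_sketch i;

    i.op = op;
    i.regs = (uint8_t)(((src & 0x0f) << 4) | (dst & 0x0f));
    i.off = off;
    i.imm = imm;
    return i;
}

/* Extract the 4-bit destination register number. */
uint8_t insn_dst(const struct insn_sketch *i)
{
    return i->regs & 0x0f;
}

/* Extract the 4-bit source register number. */
uint8_t insn_src(const struct insn_sketch *i)
{
    return i->regs >> 4;
}
```

+
+With this layout the fields add up to 8 bytes, in line with the
+multiple-of-8-bytes rule stated above.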
+
+Internal BPF is a general purpose RISC instruction set. Not every register and
+every instruction are used during translation from original BPF to new format.
+For example, socket filters are not using ``exclusive add`` instruction, but
+tracing filters may do so to maintain counters of events. Register R9
+is not used by socket filters either, but more complex filters may be running
+out of registers and would have to resort to spill/fill to stack.
+
+Internal BPF can be used as a generic assembler for last step performance
+optimizations; socket filters and seccomp are using it as an assembler. Tracing
+filters may use it as an assembler to generate code from the kernel. In-kernel
+usage may not be bounded by security considerations, since generated internal
+BPF code may be optimizing an internal code path and not be exposed to user
+space. Safety of internal BPF can come from a verifier (TBD). In such use cases
+as described, it may be used as a safe instruction set.
+
+Just like the original BPF, the new format runs within a controlled environment,
+is deterministic and the kernel can easily prove that. The safety of the program
+can be determined in two steps: first step does depth-first-search to disallow
+loops and other CFG validation; second step starts from the first insn and
+descends all possible paths. It simulates execution of every insn and observes
+the state change of registers and stack.
+
+eBPF opcode encoding
+--------------------
+
+eBPF is reusing most of the opcode encoding from classic to simplify conversion
+of classic BPF to eBPF. For arithmetic and jump instructions the 8-bit 'code'
+field is divided into three parts::
+
+  +----------------+--------+--------------------+
+  |   4 bits       |  1 bit |   3 bits           |
+  | operation code | source | instruction class  |
+  +----------------+--------+--------------------+
+  (MSB)                                      (LSB)
+
+Three LSB bits store instruction class which is one of:
+
+  ===================     ===============
+  Classic BPF classes     eBPF classes
+  ===================     ===============
+  BPF_LD    0x00          BPF_LD    0x00
+  BPF_LDX   0x01          BPF_LDX   0x01
+  BPF_ST    0x02          BPF_ST    0x02
+  BPF_STX   0x03          BPF_STX   0x03
+  BPF_ALU   0x04          BPF_ALU   0x04
+  BPF_JMP   0x05          BPF_JMP   0x05
+  BPF_RET   0x06          BPF_JMP32 0x06
+  BPF_MISC  0x07          BPF_ALU64 0x07
+  ===================     ===============
+
+When BPF_CLASS(code) == BPF_ALU or BPF_JMP, 4th bit encodes source operand ...
+
+    ::
+
+	BPF_K     0x00
+	BPF_X     0x08
+
+ * in classic BPF, this means::
+
+	BPF_SRC(code) == BPF_X - use register X as source operand
+	BPF_SRC(code) == BPF_K - use 32-bit immediate as source operand
+
+ * in eBPF, this means::
+
+	BPF_SRC(code) == BPF_X - use 'src_reg' register as source operand
+	BPF_SRC(code) == BPF_K - use 32-bit immediate as source operand
+
+... and four MSB bits store operation code.
+
+If BPF_CLASS(code) == BPF_ALU or BPF_ALU64 [ in eBPF ], BPF_OP(code) is one of::
+
+  BPF_ADD   0x00
+  BPF_SUB   0x10
+  BPF_MUL   0x20
+  BPF_DIV   0x30
+  BPF_OR    0x40
+  BPF_AND   0x50
+  BPF_LSH   0x60
+  BPF_RSH   0x70
+  BPF_NEG   0x80
+  BPF_MOD   0x90
+  BPF_XOR   0xa0
+  BPF_MOV   0xb0  /* eBPF only: mov reg to reg */
+  BPF_ARSH  0xc0  /* eBPF only: sign extending shift right */
+  BPF_END   0xd0  /* eBPF only: endianness conversion */
+
+If BPF_CLASS(code) == BPF_JMP or BPF_JMP32 [ in eBPF ], BPF_OP(code) is one of::
+
+  BPF_JA    0x00  /* BPF_JMP only */
+  BPF_JEQ   0x10
+  BPF_JGT   0x20
+  BPF_JGE   0x30
+  BPF_JSET  0x40
+  BPF_JNE   0x50  /* eBPF only: jump != */
+  BPF_JSGT  0x60  /* eBPF only: signed '>' */
+  BPF_JSGE  0x70  /* eBPF only: signed '>=' */
+  BPF_CALL  0x80  /* eBPF BPF_JMP only: function call */
+  BPF_EXIT  0x90  /* eBPF BPF_JMP only: function return */
+  BPF_JLT   0xa0  /* eBPF only: unsigned '<' */
+  BPF_JLE   0xb0  /* eBPF only: unsigned '<=' */
+  BPF_JSLT  0xc0  /* eBPF only: signed '<' */
+  BPF_JSLE  0xd0  /* eBPF only: signed '<=' */
+
+So BPF_ADD | BPF_X | BPF_ALU means 32-bit addition in both classic BPF
+and eBPF. There are only two registers in classic BPF, so it means A += X.
+In eBPF it means dst_reg = (u32) dst_reg + (u32) src_reg; similarly,
+BPF_XOR | BPF_K | BPF_ALU means A ^= imm32 in classic BPF and analogous
+dst_reg = (u32) dst_reg ^ (u32) imm32 in eBPF.
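+
+The decoding of the 'code' byte for arithmetic instructions can be sketched as
+follows. The constants are copied from the tables above; ``alu_sketch()`` is a
+hypothetical helper that handles only BPF_ADD and BPF_XOR, and the
+sign-extension of the immediate for 64-bit operations is an assumption of this
+sketch.
+

```c
#include <stdint.h>

/* Values copied from the tables above. */
#define BPF_ALU    0x04
#define BPF_ALU64  0x07
#define BPF_K      0x00
#define BPF_X      0x08
#define BPF_ADD    0x00
#define BPF_XOR    0xa0

/* Field extractors following the 4 / 1 / 3 bit split of 'code'. */
#define BPF_CLASS(code) ((code) & 0x07)
#define BPF_SRC(code)   ((code) & 0x08)
#define BPF_OP(code)    ((code) & 0xf0)

/* Return the new dst_reg value for a BPF_ADD or BPF_XOR instruction. */
uint64_t alu_sketch(uint8_t code, uint64_t dst, uint64_t src, int32_t imm)
{
    /* BPF_X selects the src_reg register, BPF_K the 32-bit immediate. */
    uint64_t operand = BPF_SRC(code) == BPF_X ? src : (uint64_t)(int64_t)imm;
    uint64_t res;

    switch (BPF_OP(code)) {
    case BPF_ADD:
        res = dst + operand;
        break;
    case BPF_XOR:
        res = dst ^ operand;
        break;
    default:
        return dst; /* other operations are omitted from this sketch */
    }

    /* BPF_ALU works on 32-bit subregisters; the result is zero-extended. */
    if (BPF_CLASS(code) == BPF_ALU)
        res = (uint32_t)res;
    return res;
}
```

+
+For instance, ``alu_sketch(BPF_ADD | BPF_X | BPF_ALU, 0xffffffff, 1, 0)`` wraps
+around within 32 bits, while the same call with BPF_ALU64 carries into bit 32.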
+
+Classic BPF is using BPF_MISC class to represent A = X and X = A moves.
+eBPF is using BPF_MOV | BPF_X | BPF_ALU code instead. Since there are no
+BPF_MISC operations in eBPF, the class 7 is used as BPF_ALU64 to mean
+exactly the same operations as BPF_ALU, but with 64-bit wide operands
+instead. So BPF_ADD | BPF_X | BPF_ALU64 means 64-bit addition, i.e.:
+dst_reg = dst_reg + src_reg
+
+Classic BPF wastes the whole BPF_RET class to represent a single ``ret``
+operation. Classic BPF_RET | BPF_K means copy imm32 into return register
+and perform function exit. eBPF is modeled to match CPU, so BPF_JMP | BPF_EXIT
+in eBPF means function exit only. The eBPF program needs to store return
+value into register R0 before doing a BPF_EXIT. Class 6 in eBPF is used as
+BPF_JMP32 to mean exactly the same operations as BPF_JMP, but with 32-bit wide
+operands for the comparisons instead.
+
+For load and store instructions the 8-bit 'code' field is divided as::
+
+  +--------+--------+-------------------+
+  | 3 bits | 2 bits |   3 bits          |
+  |  mode  |  size  | instruction class |
+  +--------+--------+-------------------+
+  (MSB)                             (LSB)
+
+Size modifier is one of ...
+
+::
+
+  BPF_W   0x00    /* word */
+  BPF_H   0x08    /* half word */
+  BPF_B   0x10    /* byte */
+  BPF_DW  0x18    /* eBPF only, double word */
+
+... which encodes size of load/store operation::
+
+ B  - 1 byte
+ H  - 2 byte
+ W  - 4 byte
+ DW - 8 byte (eBPF only)
+
+Mode modifier is one of::
+
+  BPF_IMM  0x00  /* used for 32-bit mov in classic BPF and 64-bit in eBPF */
+  BPF_ABS  0x20
+  BPF_IND  0x40
+  BPF_MEM  0x60
+  BPF_LEN  0x80  /* classic BPF only, reserved in eBPF */
+  BPF_MSH  0xa0  /* classic BPF only, reserved in eBPF */
+  BPF_XADD 0xc0  /* eBPF only, exclusive add */
+
+eBPF has two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and
+(BPF_IND | <size> | BPF_LD) which are used to access packet data.
+
+They had to be carried over from classic to have strong performance of
+socket filters running in eBPF interpreter. These instructions can only
+be used when interpreter context is a pointer to ``struct sk_buff`` and
+have seven implicit operands. Register R6 is an implicit input that must
+contain pointer to sk_buff. Register R0 is an implicit output which contains
+the data fetched from the packet. Registers R1-R5 are scratch registers
+and must not be used to store the data across BPF_ABS | BPF_LD or
+BPF_IND | BPF_LD instructions.
+
+These instructions have implicit program exit condition as well. When
+eBPF program is trying to access the data beyond the packet boundary,
+the interpreter will abort the execution of the program. JIT compilers
+therefore must preserve this property. src_reg and imm32 fields are
+explicit inputs to these instructions.
+
+For example::
+
+  BPF_IND | BPF_W | BPF_LD means:
+
+    R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32))
+    and R1 - R5 were scratched.
+
+Unlike classic BPF instruction set, eBPF has generic load/store operations::
+
+    BPF_MEM | <size> | BPF_STX:  *(size *) (dst_reg + off) = src_reg
+    BPF_MEM | <size> | BPF_ST:   *(size *) (dst_reg + off) = imm32
+    BPF_MEM | <size> | BPF_LDX:  dst_reg = *(size *) (src_reg + off)
+    BPF_XADD | BPF_W  | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
+    BPF_XADD | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg
+
+Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW. Note that 1 and
+2 byte atomic increments are not supported.
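+
+As a small sketch of the mode/size/class split, the following helper extracts
+the access size from a load/store 'code' byte. The constants come from the
+tables above; ``ldst_size_bytes()`` is a hypothetical name. The opcode byte
+0x7a used in the usage note is BPF_MEM | BPF_DW | BPF_ST.
+

```c
/* Values copied from the tables above. */
#define BPF_LDX  0x01
#define BPF_ST   0x02
#define BPF_W    0x00
#define BPF_H    0x08
#define BPF_B    0x10
#define BPF_DW   0x18
#define BPF_MEM  0x60

/* Field extractors following the 3 / 2 / 3 bit split of 'code'. */
#define BPF_SIZE(code) ((code) & 0x18)
#define BPF_MODE(code) ((code) & 0xe0)

/* Return the number of bytes accessed by a load/store instruction. */
int ldst_size_bytes(unsigned char code)
{
    switch (BPF_SIZE(code)) {
    case BPF_B:
        return 1;
    case BPF_H:
        return 2;
    case BPF_W:
        return 4;
    default:
        return 8; /* BPF_DW is the only remaining 2-bit value */
    }
}
```

+
+``ldst_size_bytes(0x7a)`` yields 8 and ``BPF_MODE(0x7a)`` yields BPF_MEM, i.e.
+an 8-byte store of an immediate to memory.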
+
+eBPF has one 16-byte instruction: BPF_LD | BPF_DW | BPF_IMM which consists
+of two consecutive ``struct bpf_insn`` 8-byte blocks and is interpreted as a
+single instruction that loads a 64-bit immediate value into a dst_reg.
+Classic BPF has a similar instruction: BPF_LD | BPF_W | BPF_IMM which loads
+a 32-bit immediate value into a register.
+
+eBPF verifier
+-------------
+The safety of the eBPF program is determined in two steps.
+
+The first step does a DAG check to disallow loops and other CFG validation.
+In particular it will detect programs that have unreachable instructions
+(though the classic BPF checker allows them).
+
+Second step starts from the first insn and descends all possible paths.
+It simulates execution of every insn and observes the state change of
+registers and stack.
+
+At the start of the program the register R1 contains a pointer to context
+and has type PTR_TO_CTX.
+If the verifier sees an insn that does R2=R1, then R2 now has type
+PTR_TO_CTX as well and can be used on the right hand side of an expression.
+If R1=PTR_TO_CTX and insn is R2=R1+R1, then R2=SCALAR_VALUE,
+since addition of two valid pointers makes an invalid pointer.
+(In 'secure' mode the verifier will reject any type of pointer arithmetic to
+make sure that kernel addresses don't leak to unprivileged users.)
+
+If register was never written to, it's not readable::
+
+  bpf_mov R0 = R2
+  bpf_exit
+
+will be rejected, since R2 is unreadable at the start of the program.
+
+After kernel function call, R1-R5 are reset to unreadable and
+R0 has a return type of the function.
+
+Since R6-R9 are callee saved, their state is preserved across the call.
+
+::
+
+  bpf_mov R6 = 1
+  bpf_call foo
+  bpf_mov R0 = R6
+  bpf_exit
+
+is a correct program. If there was R1 instead of R6, it would have
+been rejected.
+
+load/store instructions are allowed only with registers of valid types, which
+are PTR_TO_CTX, PTR_TO_MAP, PTR_TO_STACK. They are bounds and alignment checked.
+For example::
+
+ bpf_mov R1 = 1
+ bpf_mov R2 = 2
+ bpf_xadd *(u32 *)(R1 + 3) += R2
+ bpf_exit
+
+will be rejected, since R1 doesn't have a valid pointer type at the time of
+execution of instruction bpf_xadd.
+
+At the start R1 type is PTR_TO_CTX (a pointer to generic ``struct bpf_context``).
+A callback is used to customize verifier to restrict eBPF program access to only
+certain fields within ctx structure with specified size and alignment.
+
+For example, the following insn::
+
+  bpf_ld R0 = *(u32 *)(R6 + 8)
+
+intends to load a word from address R6 + 8 and store it into R0.
+If R6=PTR_TO_CTX, via is_valid_access() callback the verifier will know
+that offset 8 of size 4 bytes can be accessed for reading, otherwise
+the verifier will reject the program.
+If R6=PTR_TO_STACK, then access should be aligned and be within
+stack bounds, which are [-MAX_BPF_STACK, 0). In this example offset is 8,
+so it will fail verification, since it's out of bounds.
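+
+The stack rule above can be sketched as a small predicate. This is only an
+approximation of what the verifier does: ``stack_access_ok()`` is a
+hypothetical helper, and the value 512 used for the stack size is an
+assumption of this sketch.
+

```c
#include <stdbool.h>

#define MAX_BPF_STACK_SKETCH 512 /* assumed stack size in bytes */

/*
 * Return true if an access of 'size' bytes at frame-pointer offset 'off'
 * is naturally aligned and lies within [-MAX_BPF_STACK_SKETCH, 0).
 */
bool stack_access_ok(int off, int size)
{
    if (size <= 0)
        return false;
    if (off < -MAX_BPF_STACK_SKETCH || off + size > 0)
        return false; /* outside the stack bounds */
    return off % size == 0; /* natural alignment */
}
```

+
+With this sketch, ``stack_access_ok(8, 4)`` is false, matching the example
+above, while ``stack_access_ok(-4, 4)`` is true.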
+
+The verifier will allow eBPF program to read data from stack only after
+it wrote into it.
+
+Classic BPF verifier does similar check with M[0-15] memory slots.
+For example::
+
+  bpf_ld R0 = *(u32 *)(R10 - 4)
+  bpf_exit
+
+is an invalid program.
+Though R10 is a correct read-only register and has type PTR_TO_STACK
+and R10 - 4 is within stack bounds, there were no stores into that location.
+
+Pointer register spill/fill is tracked as well, since four (R6-R9)
+callee saved registers may not be enough for some programs.
+
+Allowed function calls are customized with bpf_verifier_ops->get_func_proto().
+The eBPF verifier will check that registers match argument constraints.
+After the call, register R0 will be set to the return type of the function.
+
+Function calls are the main mechanism to extend functionality of eBPF programs.
+Socket filters may let programs call one set of functions, whereas tracing
+filters may allow a completely different set.
+
+If a function is made accessible to an eBPF program, it needs to be thought
+through from a safety point of view. The verifier will guarantee that the
+function is called with valid arguments.
+
+seccomp vs socket filters have different security restrictions for classic BPF.
+Seccomp solves this with a two-stage verifier: the classic BPF verifier is
+followed by the seccomp verifier. In the case of eBPF, one configurable
+verifier is shared for all use cases.
+
+See details of eBPF verifier in kernel/bpf/verifier.c
+
+Register value tracking
+-----------------------
+In order to determine the safety of an eBPF program, the verifier must track
+the range of possible values in each register and also in each stack slot.
+This is done with ``struct bpf_reg_state``, defined in
+include/linux/bpf_verifier.h, which unifies tracking of scalar and pointer
+values.  Each register state has a type, which is either NOT_INIT (the
+register has not been written to), SCALAR_VALUE (some value which is not
+usable as a pointer), or a pointer type.  The types of pointers describe their
+base, as follows:
+
+
+    PTR_TO_CTX
+			Pointer to bpf_context.
+    CONST_PTR_TO_MAP
+			Pointer to struct bpf_map.  "Const" because arithmetic
+			on these pointers is forbidden.
+    PTR_TO_MAP_VALUE
+			Pointer to the value stored in a map element.
+    PTR_TO_MAP_VALUE_OR_NULL
+			Either a pointer to a map value, or NULL; map accesses
+			(see section 'eBPF maps', below) return this type,
+			which becomes a PTR_TO_MAP_VALUE when checked != NULL.
+			Arithmetic on these pointers is forbidden.
+    PTR_TO_STACK
+			Frame pointer.
+    PTR_TO_PACKET
+			skb->data.
+    PTR_TO_PACKET_END
+			skb->data + headlen; arithmetic forbidden.
+    PTR_TO_SOCKET
+			Pointer to struct bpf_sock_ops, implicitly refcounted.
+    PTR_TO_SOCKET_OR_NULL
+			Either a pointer to a socket, or NULL; socket lookup
+			returns this type, which becomes a PTR_TO_SOCKET when
+			checked != NULL. PTR_TO_SOCKET is reference-counted,
+			so programs must release the reference through the
+			socket release function before the end of the program.
+			Arithmetic on these pointers is forbidden.
+
+However, a pointer may be offset from this base (as a result of pointer
+arithmetic), and this is tracked in two parts: the 'fixed offset' and 'variable
+offset'.  The former is used when an exactly-known value (e.g. an immediate
+operand) is added to a pointer, while the latter is used for values which are
+not exactly known.  The variable offset is also used in SCALAR_VALUEs, to track
+the range of possible values in the register.
+
+The verifier's knowledge about the variable offset consists of:
+
+* minimum and maximum values as unsigned
+* minimum and maximum values as signed
+
+* knowledge of the values of individual bits, in the form of a 'tnum': a u64
+  'mask' and a u64 'value'.  1s in the mask represent bits whose value is unknown;
+  1s in the value represent bits known to be 1.  Bits known to be 0 have 0 in both
+  mask and value; no bit should ever be 1 in both.  For example, if a byte is read
+  into a register from memory, the register's top 56 bits are known zero, while
+  the low 8 are unknown - which is represented as the tnum (0x0; 0xff).  If we
+  then OR this with 0x40, we get (0x40; 0xbf), then if we add 1 we get (0x0;
+  0x1ff), because of potential carries.
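+
+The tnum example above can be reproduced with a short sketch. The structure
+and function names below are hypothetical; the arithmetic is loosely modelled
+on the usual known-bits construction and is not claimed to be identical to the
+kernel's implementation.
+

```c
#include <stdint.h>

struct tnum_sketch {
    uint64_t value; /* bits known to be 1 */
    uint64_t mask;  /* bits whose value is unknown */
};

/* A fully known constant: no unknown bits. */
struct tnum_sketch tnum_const_sketch(uint64_t v)
{
    struct tnum_sketch t = { v, 0 };
    return t;
}

/* OR: a bit is known 1 if it is known 1 in either input. */
struct tnum_sketch tnum_or_sketch(struct tnum_sketch a, struct tnum_sketch b)
{
    uint64_t v = a.value | b.value;
    uint64_t mu = a.mask | b.mask;
    struct tnum_sketch t = { v, mu & ~v };
    return t;
}

/* ADD: every bit that a carry could reach becomes unknown. */
struct tnum_sketch tnum_add_sketch(struct tnum_sketch a, struct tnum_sketch b)
{
    uint64_t sm = a.mask + b.mask;   /* largest possible unknown part */
    uint64_t sv = a.value + b.value; /* sum of the known parts */
    uint64_t sigma = sm + sv;
    uint64_t chi = sigma ^ sv;       /* bits affected by possible carries */
    uint64_t mu = chi | a.mask | b.mask;
    struct tnum_sketch t = { sv & ~mu, mu };
    return t;
}
```

+
+Starting from (0x0; 0xff), OR with the constant 0x40 gives (0x40; 0xbf), and
+adding the constant 1 then gives (0x0; 0x1ff), as described above.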
+
+Besides arithmetic, the register state can also be updated by conditional
+branches.  For instance, if a SCALAR_VALUE is compared > 8, in the 'true' branch
+it will have a umin_value (unsigned minimum value) of 9, whereas in the 'false'
+branch it will have a umax_value of 8.  A signed compare (with BPF_JSGT or
+BPF_JSGE) would instead update the signed minimum/maximum values.  Information
+from the signed and unsigned bounds can be combined; for instance if a value is
+first tested < 8 and then tested s> 4, the verifier will conclude that the value
+is also > 4 and s< 8, since the bounds prevent crossing the sign boundary.
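+
+A minimal sketch of how an unsigned ``>`` comparison could refine the unsigned
+bounds in the two branches is shown below. ``struct ubounds_sketch`` and
+``refine_ugt()`` are hypothetical names; signed bounds and tnum updates are
+left out, and the compared constant is assumed to be smaller than UINT64_MAX.
+

```c
#include <stdint.h>

struct ubounds_sketch {
    uint64_t umin; /* unsigned minimum value */
    uint64_t umax; /* unsigned maximum value */
};

/*
 * Refine the bounds of a register for the test "reg > val".
 * 'taken' receives the bounds in the 'true' branch and 'fallthrough' the
 * bounds in the 'false' branch. Assumes val < UINT64_MAX.
 */
void refine_ugt(struct ubounds_sketch in, uint64_t val,
                struct ubounds_sketch *taken,
                struct ubounds_sketch *fallthrough)
{
    *taken = in;
    *fallthrough = in;

    /* reg > val implies reg >= val + 1 */
    if (taken->umin < val + 1)
        taken->umin = val + 1;

    /* !(reg > val) implies reg <= val */
    if (fallthrough->umax > val)
        fallthrough->umax = val;
}
```

+
+For an unconstrained value compared with 8, this gives a umin_value of 9 in
+the 'true' branch and a umax_value of 8 in the 'false' branch.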
+
+PTR_TO_PACKETs with a variable offset part have an 'id', which is common to all
+pointers sharing that same variable offset.  This is important for packet range
+checks: after adding a variable to a packet pointer register A, if you then copy
+it to another register B and then add a constant 4 to A, both registers will
+share the same 'id' but the A will have a fixed offset of +4.  Then if A is
+bounds-checked and found to be less than a PTR_TO_PACKET_END, the register B is
+now known to have a safe range of at least 4 bytes.  See 'Direct packet access',
+below, for more on PTR_TO_PACKET ranges.
+
+The 'id' field is also used on PTR_TO_MAP_VALUE_OR_NULL, common to all copies of
+the pointer returned from a map lookup.  This means that when one copy is
+checked and found to be non-NULL, all copies can become PTR_TO_MAP_VALUEs.
+
+As well as range-checking, the tracked information is also used for enforcing
+alignment of pointer accesses.  For instance, on most systems the packet pointer
+is 2 bytes after a 4-byte alignment.  If a program adds 14 bytes to that to jump
+over the Ethernet header, then reads IHL and adds (IHL * 4), the resulting
+pointer will have a variable offset known to be 4n+2 for some n, so adding the 2
+bytes (NET_IP_ALIGN) gives a 4-byte alignment and so word-sized accesses through
+that pointer are safe.
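+
+The alignment argument above is plain modular arithmetic and can be checked
+with a short sketch. The constant names are hypothetical; the values 2 and 14
+are the ones quoted in the text.
+

```c
#include <stdbool.h>

#define NET_IP_ALIGN_SKETCH 2  /* packet pointer is 2 bytes past 4-byte alignment */
#define ETH_HLEN_SKETCH     14 /* Ethernet header length used above */

/* Offset from the packet pointer after skipping Ethernet and IP headers. */
unsigned int l4_offset(unsigned int ihl)
{
    return ETH_HLEN_SKETCH + ihl * 4; /* always of the form 4n + 2 */
}

/* True if a word-sized access at that offset is 4-byte aligned. */
bool l4_word_aligned(unsigned int ihl)
{
    return (NET_IP_ALIGN_SKETCH + l4_offset(ihl)) % 4 == 0;
}
```

+
+For any IHL the offset is 4n+2, so ``l4_word_aligned()`` holds for every
+header length.
+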
+
+The 'id' field is also used on PTR_TO_SOCKET and PTR_TO_SOCKET_OR_NULL, common
+to all copies of the pointer returned from a socket lookup. This has similar
+behaviour to the handling for PTR_TO_MAP_VALUE_OR_NULL->PTR_TO_MAP_VALUE, but
+it also handles reference tracking for the pointer. PTR_TO_SOCKET implicitly
+represents a reference to the corresponding ``struct sock``. To ensure that the
+reference is not leaked, it is imperative to NULL-check the reference and, in
+the non-NULL case, pass the valid reference to the socket release function.
+
+Direct packet access
+--------------------
+In cls_bpf and act_bpf programs the verifier allows direct access to the packet
+data via skb->data and skb->data_end pointers.
+Ex::
+
+    1:  r4 = *(u32 *)(r1 +80)  /* load skb->data_end */
+    2:  r3 = *(u32 *)(r1 +76)  /* load skb->data */
+    3:  r5 = r3
+    4:  r5 += 14
+    5:  if r5 > r4 goto pc+16
+    R1=ctx R3=pkt(id=0,off=0,r=14) R4=pkt_end R5=pkt(id=0,off=14,r=14) R10=fp
+    6:  r0 = *(u16 *)(r3 +12) /* access 12 and 13 bytes of the packet */
+
+This 2-byte load from the packet is safe to do, since the program author
+did check ``if (skb->data + 14 > skb->data_end) goto err`` at insn #5 which
+means that in the fall-through case the register R3 (which points to skb->data)
+has at least 14 directly accessible bytes. The verifier marks it
+as R3=pkt(id=0,off=0,r=14).
+id=0 means that no additional variables were added to the register.
+off=0 means that no additional constants were added.
+r=14 is the range of safe access which means that bytes [R3, R3 + 14) are ok.
+Note that R5 is marked as R5=pkt(id=0,off=14,r=14). It also points
+to the packet data, but constant 14 was added to the register, so
+it now points to ``skb->data + 14`` and accessible range is [R5, R5 + 14 - 14)
+which is zero bytes.
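+
+The meaning of ``off`` and ``r`` in the register state can be sketched as a
+range check. ``struct pkt_reg_sketch`` and ``pkt_access_ok()`` are hypothetical
+names, and the check is a simplification that ignores variable offsets.
+

```c
#include <stdbool.h>

struct pkt_reg_sketch {
    int off;   /* constant added to skb->data so far */
    int range; /* 'r': bytes from skb->data proven to be accessible */
};

/*
 * Return true if an access of 'size' bytes at reg + insn_off stays within
 * the verified range [skb->data, skb->data + range).
 */
bool pkt_access_ok(struct pkt_reg_sketch reg, int insn_off, int size)
{
    int start = reg.off + insn_off;

    return size > 0 && start >= 0 && start + size <= reg.range;
}
```

+
+With R3 as ``{ 0, 14 }`` the 2-byte load at offset 12 passes, whereas with R5
+as ``{ 14, 14 }`` even a 1-byte load at offset 0 is rejected.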
+
+More complex packet access may look like::
+
+
+    R0=inv1 R1=ctx R3=pkt(id=0,off=0,r=14) R4=pkt_end R5=pkt(id=0,off=14,r=14) R10=fp
+    6:  r0 = *(u8 *)(r3 +7) /* load 7th byte from the packet */
+    7:  r4 = *(u8 *)(r3 +12)
+    8:  r4 *= 14
+    9:  r3 = *(u32 *)(r1 +76) /* load skb->data */
+    10:  r3 += r4
+    11:  r2 = r1
+    12:  r2 <<= 48
+    13:  r2 >>= 48
+    14:  r3 += r2
+    15:  r2 = r3
+    16:  r2 += 8
+    17:  r1 = *(u32 *)(r1 +80) /* load skb->data_end */
+    18:  if r2 > r1 goto pc+2
+    R0=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R1=pkt_end R2=pkt(id=2,off=8,r=8) R3=pkt(id=2,off=0,r=8) R4=inv(id=0,umax_value=3570,var_off=(0x0; 0xfffe)) R5=pkt(id=0,off=14,r=14) R10=fp
+    19:  r1 = *(u8 *)(r3 +4)
+
+The state of the register R3 is R3=pkt(id=2,off=0,r=8).
+id=2 means that two ``r3 += rX`` instructions were seen, so r3 points to some
+offset within a packet and since the program author did
+``if (r3 + 8 > r1) goto err`` at insn #18, the safe range is [R3, R3 + 8).
+The verifier only allows 'add'/'sub' operations on packet registers. Any other
+operation will set the register state to 'SCALAR_VALUE' and it won't be
+available for direct packet access.
+
+Operation ``r3 += rX`` may overflow and become less than original skb->data,
+therefore the verifier has to prevent that.  So when it sees ``r3 += rX``
+instruction and rX is more than 16-bit value, any subsequent bounds-check of r3
+against skb->data_end will not give us 'range' information, so attempts to read
+through the pointer will give "invalid access to packet" error.
+
+Ex. after insn ``r4 = *(u8 *)(r3 +12)`` (insn #7 above) the state of r4 is
+R4=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) which means that upper 56 bits
+of the register are guaranteed to be zero, and nothing is known about the lower
+8 bits. After insn ``r4 *= 14`` the state becomes
+R4=inv(id=0,umax_value=3570,var_off=(0x0; 0xfffe)), since multiplying an 8-bit
+value by constant 14 will keep upper 52 bits as zero, also the least significant
+bit will be zero as 14 is even.  Similarly ``r2 >>= 48`` will make
+R2=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff)), since the shift is not sign
+extending.  This logic is implemented in adjust_reg_min_max_vals() function,
+which calls adjust_ptr_min_max_vals() for adding pointer to scalar (or vice
+versa) and adjust_scalar_min_max_vals() for operations on two scalars.
+
+The end result is that bpf program author can access packet directly
+using normal C code as::
+
+  void *data = (void *)(long)skb->data;
+  void *data_end = (void *)(long)skb->data_end;
+  struct eth_hdr *eth = data;
+  struct iphdr *iph = data + sizeof(*eth);
+  struct udphdr *udp = data + sizeof(*eth) + sizeof(*iph);
+
+  if (data + sizeof(*eth) + sizeof(*iph) + sizeof(*udp) > data_end)
+	  return 0;
+  if (eth->h_proto != htons(ETH_P_IP))
+	  return 0;
+  if (iph->protocol != IPPROTO_UDP || iph->ihl != 5)
+	  return 0;
+  if (udp->dest == 53 || udp->source == 9)
+	  ...;
+
+which makes such programs easier to write compared to the LD_ABS insn
+and significantly faster.
+
+eBPF maps
+---------
+'maps' is a generic storage of different types for sharing data between kernel
+and userspace.
+
+The maps are accessed from user space via BPF syscall, which has commands:
+
+- create a map with given type and attributes
+  ``map_fd = bpf(BPF_MAP_CREATE, union bpf_attr *attr, u32 size)``
+  using attr->map_type, attr->key_size, attr->value_size, attr->max_entries
+  returns process-local file descriptor or negative error
+
+- lookup key in a given map
+  ``err = bpf(BPF_MAP_LOOKUP_ELEM, union bpf_attr *attr, u32 size)``
+  using attr->map_fd, attr->key, attr->value
+  returns zero and stores found elem into value or negative error
+
+- create or update key/value pair in a given map
+  ``err = bpf(BPF_MAP_UPDATE_ELEM, union bpf_attr *attr, u32 size)``
+  using attr->map_fd, attr->key, attr->value
+  returns zero or negative error
+
+- find and delete element by key in a given map
+  ``err = bpf(BPF_MAP_DELETE_ELEM, union bpf_attr *attr, u32 size)``
+  using attr->map_fd, attr->key
+
+- to delete map: close(fd)
+  Exiting process will delete maps automatically
+
+userspace programs use this syscall to create/access maps that eBPF programs
+are concurrently updating.
+
+maps can have different types: hash, array, bloom filter, radix-tree, etc.
+
+The map is defined by:
+
+  - type
+  - max number of elements
+  - key size in bytes
+  - value size in bytes
+
+Pruning
+-------
+The verifier does not actually walk all possible paths through the program.  For
+each new branch to analyse, the verifier looks at all the states it's previously
+been in when at this instruction.  If any of them contain the current state as a
+subset, the branch is 'pruned' - that is, the fact that the previous state was
+accepted implies the current state would be as well.  For instance, if in the
+previous state, r1 held a packet-pointer, and in the current state, r1 holds a
+packet-pointer with a range as long or longer and at least as strict an
+alignment, then r1 is safe.  Similarly, if r2 was NOT_INIT before then it can't
+have been used by any path from that point, so any value in r2 (including
+another NOT_INIT) is safe.  The implementation is in the function regsafe().
+Pruning considers not only the registers but also the stack (and any spilled
+registers it may hold).  They must all be safe for the branch to be pruned.
+This is implemented in states_equal().
+
+Understanding eBPF verifier messages
+------------------------------------
+
+The following are a few examples of invalid eBPF programs and verifier error
+messages as seen in the log:
+
+Program with unreachable instructions::
+
+  static struct bpf_insn prog[] = {
+  BPF_EXIT_INSN(),
+  BPF_EXIT_INSN(),
+  };
+
+Error::
+
+  unreachable insn 1
+
+Program that reads uninitialized register::
+
+  BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
+  BPF_EXIT_INSN(),
+
+Error::
+
+  0: (bf) r0 = r2
+  R2 !read_ok
+
+Program that doesn't initialize R0 before exiting::
+
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_1),
+  BPF_EXIT_INSN(),
+
+Error::
+
+  0: (bf) r2 = r1
+  1: (95) exit
+  R0 !read_ok
+
+Program that accesses stack out of bounds::
+
+    BPF_ST_MEM(BPF_DW, BPF_REG_10, 8, 0),
+    BPF_EXIT_INSN(),
+
+Error::
+
+    0: (7a) *(u64 *)(r10 +8) = 0
+    invalid stack off=8 size=8
+
+Program that doesn't initialize stack before passing its address into function::
+
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+  BPF_LD_MAP_FD(BPF_REG_1, 0),
+  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+  BPF_EXIT_INSN(),
+
+Error::
+
+  0: (bf) r2 = r10
+  1: (07) r2 += -8
+  2: (b7) r1 = 0x0
+  3: (85) call 1
+  invalid indirect read from stack off -8+0 size 8
+
+Program that uses invalid map_fd=0 while calling to map_lookup_elem() function::
+
+  BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+  BPF_LD_MAP_FD(BPF_REG_1, 0),
+  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+  BPF_EXIT_INSN(),
+
+Error::
+
+  0: (7a) *(u64 *)(r10 -8) = 0
+  1: (bf) r2 = r10
+  2: (07) r2 += -8
+  3: (b7) r1 = 0x0
+  4: (85) call 1
+  fd 0 is not pointing to valid bpf_map
+
+Program that doesn't check return value of map_lookup_elem() before accessing
+map element::
+
+  BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+  BPF_LD_MAP_FD(BPF_REG_1, 0),
+  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+  BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0),
+  BPF_EXIT_INSN(),
+
+Error::
+
+  0: (7a) *(u64 *)(r10 -8) = 0
+  1: (bf) r2 = r10
+  2: (07) r2 += -8
+  3: (b7) r1 = 0x0
+  4: (85) call 1
+  5: (7a) *(u64 *)(r0 +0) = 0
+  R0 invalid mem access 'map_value_or_null'
+
+Program that correctly checks map_lookup_elem() returned value for NULL, but
+accesses the memory with incorrect alignment::
+
+  BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+  BPF_LD_MAP_FD(BPF_REG_1, 0),
+  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+  BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1),
+  BPF_ST_MEM(BPF_DW, BPF_REG_0, 4, 0),
+  BPF_EXIT_INSN(),
+
+Error::
+
+  0: (7a) *(u64 *)(r10 -8) = 0
+  1: (bf) r2 = r10
+  2: (07) r2 += -8
+  3: (b7) r1 = 1
+  4: (85) call 1
+  5: (15) if r0 == 0x0 goto pc+1
+   R0=map_ptr R10=fp
+  6: (7a) *(u64 *)(r0 +4) = 0
+  misaligned access off 4 size 8
+
+Program that correctly checks map_lookup_elem() returned value for NULL and
+accesses memory with correct alignment on one side of the 'if' branch, but
+fails to do so on the other side of the 'if' branch::
+
+  BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+  BPF_LD_MAP_FD(BPF_REG_1, 0),
+  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+  BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
+  BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0),
+  BPF_EXIT_INSN(),
+  BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 1),
+  BPF_EXIT_INSN(),
+
+Error::
+
+  0: (7a) *(u64 *)(r10 -8) = 0
+  1: (bf) r2 = r10
+  2: (07) r2 += -8
+  3: (b7) r1 = 1
+  4: (85) call 1
+  5: (15) if r0 == 0x0 goto pc+2
+   R0=map_ptr R10=fp
+  6: (7a) *(u64 *)(r0 +0) = 0
+  7: (95) exit
+
+  from 5 to 8: R0=imm0 R10=fp
+  8: (7a) *(u64 *)(r0 +0) = 1
+  R0 invalid mem access 'imm'
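+
+A sketch of a variant that should pass this check, assuming a valid map fd
+(8-byte key, value of at least 8 bytes) is substituted for the placeholder 0
+in BPF_LD_MAP_FD. The NULL branch skips the store instead of dereferencing
+R0, and R0 is set before exit. It is an illustration, not a test case from
+the kernel tree::
+
+  BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+  BPF_LD_MAP_FD(BPF_REG_1, 0),
+  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+  BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1),  /* NULL: skip the store */
+  BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0),    /* non-NULL: aligned store */
+  BPF_MOV64_IMM(BPF_REG_0, 0),            /* both paths: set return value */
+  BPF_EXIT_INSN(),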
+
+Program that performs a socket lookup, then sets the pointer to NULL without
+checking it::
+
+  BPF_MOV64_IMM(BPF_REG_2, 0),
+  BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_2, -8),
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+  BPF_MOV64_IMM(BPF_REG_3, 4),
+  BPF_MOV64_IMM(BPF_REG_4, 0),
+  BPF_MOV64_IMM(BPF_REG_5, 0),
+  BPF_EMIT_CALL(BPF_FUNC_sk_lookup_tcp),
+  BPF_MOV64_IMM(BPF_REG_0, 0),
+  BPF_EXIT_INSN(),
+
+Error::
+
+  0: (b7) r2 = 0
+  1: (63) *(u32 *)(r10 -8) = r2
+  2: (bf) r2 = r10
+  3: (07) r2 += -8
+  4: (b7) r3 = 4
+  5: (b7) r4 = 0
+  6: (b7) r5 = 0
+  7: (85) call bpf_sk_lookup_tcp#65
+  8: (b7) r0 = 0
+  9: (95) exit
+  Unreleased reference id=1, alloc_insn=7
+
+Program that performs a socket lookup but does not NULL-check the returned
+value::
+
+  BPF_MOV64_IMM(BPF_REG_2, 0),
+  BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_2, -8),
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+  BPF_MOV64_IMM(BPF_REG_3, 4),
+  BPF_MOV64_IMM(BPF_REG_4, 0),
+  BPF_MOV64_IMM(BPF_REG_5, 0),
+  BPF_EMIT_CALL(BPF_FUNC_sk_lookup_tcp),
+  BPF_EXIT_INSN(),
+
+Error::
+
+  0: (b7) r2 = 0
+  1: (63) *(u32 *)(r10 -8) = r2
+  2: (bf) r2 = r10
+  3: (07) r2 += -8
+  4: (b7) r3 = 4
+  5: (b7) r4 = 0
+  6: (b7) r5 = 0
+  7: (85) call bpf_sk_lookup_tcp#65
+  8: (95) exit
+  Unreleased reference id=1, alloc_insn=7
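+
+A sketch of how the reference from the two programs above could be released,
+assuming the bpf_sk_release() helper (BPF_FUNC_sk_release) is available to the
+program type in use. The returned pointer is NULL-checked and, when non-NULL,
+handed back in R1 before exit. It is an illustration, not a test case from
+the kernel tree::
+
+  BPF_MOV64_IMM(BPF_REG_2, 0),
+  BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_2, -8),
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+  BPF_MOV64_IMM(BPF_REG_3, 4),
+  BPF_MOV64_IMM(BPF_REG_4, 0),
+  BPF_MOV64_IMM(BPF_REG_5, 0),
+  BPF_EMIT_CALL(BPF_FUNC_sk_lookup_tcp),
+  BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),  /* NULL: nothing to release */
+  BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),    /* socket pointer as argument */
+  BPF_EMIT_CALL(BPF_FUNC_sk_release),     /* drop the reference */
+  BPF_MOV64_IMM(BPF_REG_0, 0),
+  BPF_EXIT_INSN(),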
+
+Testing
+-------
+
+In addition to the BPF toolchain, the kernel also ships a test module with
+various test cases for classic and internal BPF that can be executed against
+the BPF interpreter and JIT compiler. It can be found in lib/test_bpf.c and
+enabled via Kconfig::
+
+  CONFIG_TEST_BPF=m
+
+After the module has been built and installed, the test suite can be executed
+by loading the 'test_bpf' module with insmod or modprobe. The results of the
+test cases, including timings in nsec, can be found in the kernel log (dmesg).
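+
+A typical session might look as follows. This is a sketch: it assumes the
+module was installed where modprobe can find it, and the grep pattern is only
+one way to pick the module's messages out of the log::
+
+  # modprobe test_bpf
+  # dmesg | grep test_bpf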
+
+Misc
+----
+
+trinity, the Linux syscall fuzzer, also has built-in support for BPF and
+SECCOMP-BPF kernel fuzzing.
+
+Written by
+----------
+
+This document was written in the hope that it will be found useful and in
+order to give potential BPF hackers or security auditors a better overview of
+the underlying architecture.
+
+- Jay Schulist <jschlst@samba.org>
+- Daniel Borkmann <daniel@iogearbox.net>
+- Alexei Starovoitov <ast@kernel.org>
diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt
deleted file mode 100644
index 2f0f8b17dade..000000000000
--- a/Documentation/networking/filter.txt
+++ /dev/null
@@ -1,1545 +0,0 @@
-Linux Socket Filtering aka Berkeley Packet Filter (BPF)
-=======================================================
-
-Introduction
-------------
-
-Linux Socket Filtering (LSF) is derived from the Berkeley Packet Filter.
-Though there are some distinct differences between the BSD and Linux
-Kernel filtering, but when we speak of BPF or LSF in Linux context, we
-mean the very same mechanism of filtering in the Linux kernel.
-
-BPF allows a user-space program to attach a filter onto any socket and
-allow or disallow certain types of data to come through the socket. LSF
-follows exactly the same filter code structure as BSD's BPF, so referring
-to the BSD bpf.4 manpage is very helpful in creating filters.
-
-On Linux, BPF is much simpler than on BSD. One does not have to worry
-about devices or anything like that. You simply create your filter code,
-send it to the kernel via the SO_ATTACH_FILTER option and if your filter
-code passes the kernel check on it, you then immediately begin filtering
-data on that socket.
-
-You can also detach filters from your socket via the SO_DETACH_FILTER
-option. This will probably not be used much since when you close a socket
-that has a filter on it the filter is automagically removed. The other
-less common case may be adding a different filter on the same socket where
-you had another filter that is still running: the kernel takes care of
-removing the old one and placing your new one in its place, assuming your
-filter has passed the checks, otherwise if it fails the old filter will
-remain on that socket.
-
-SO_LOCK_FILTER option allows to lock the filter attached to a socket. Once
-set, a filter cannot be removed or changed. This allows one process to
-setup a socket, attach a filter, lock it then drop privileges and be
-assured that the filter will be kept until the socket is closed.
-
-The biggest user of this construct might be libpcap. Issuing a high-level
-filter command like `tcpdump -i em1 port 22` passes through the libpcap
-internal compiler that generates a structure that can eventually be loaded
-via SO_ATTACH_FILTER to the kernel. `tcpdump -i em1 port 22 -ddd`
-displays what is being placed into this structure.
-
-Although we were only speaking about sockets here, BPF in Linux is used
-in many more places. There's xt_bpf for netfilter, cls_bpf in the kernel
-qdisc layer, SECCOMP-BPF (SECure COMPuting [1]), and lots of other places
-such as team driver, PTP code, etc where BPF is being used.
-
- [1] Documentation/userspace-api/seccomp_filter.rst
-
-Original BPF paper:
-
-Steven McCanne and Van Jacobson. 1993. The BSD packet filter: a new
-architecture for user-level packet capture. In Proceedings of the
-USENIX Winter 1993 Conference Proceedings on USENIX Winter 1993
-Conference Proceedings (USENIX'93). USENIX Association, Berkeley,
-CA, USA, 2-2. [http://www.tcpdump.org/papers/bpf-usenix93.pdf]
-
-Structure
----------
-
-User space applications include <linux/filter.h> which contains the
-following relevant structures:
-
-struct sock_filter {	/* Filter block */
-	__u16	code;   /* Actual filter code */
-	__u8	jt;	/* Jump true */
-	__u8	jf;	/* Jump false */
-	__u32	k;      /* Generic multiuse field */
-};
-
-Such a structure is assembled as an array of 4-tuples, that contains
-a code, jt, jf and k value. jt and jf are jump offsets and k a generic
-value to be used for a provided code.
-
-struct sock_fprog {			/* Required for SO_ATTACH_FILTER. */
-	unsigned short		   len;	/* Number of filter blocks */
-	struct sock_filter __user *filter;
-};
-
-For socket filtering, a pointer to this structure (as shown in
-follow-up example) is being passed to the kernel through setsockopt(2).
-
-Example
--------
-
-#include <sys/socket.h>
-#include <sys/types.h>
-#include <arpa/inet.h>
-#include <linux/if_ether.h>
-/* ... */
-
-/* From the example above: tcpdump -i em1 port 22 -dd */
-struct sock_filter code[] = {
-	{ 0x28,  0,  0, 0x0000000c },
-	{ 0x15,  0,  8, 0x000086dd },
-	{ 0x30,  0,  0, 0x00000014 },
-	{ 0x15,  2,  0, 0x00000084 },
-	{ 0x15,  1,  0, 0x00000006 },
-	{ 0x15,  0, 17, 0x00000011 },
-	{ 0x28,  0,  0, 0x00000036 },
-	{ 0x15, 14,  0, 0x00000016 },
-	{ 0x28,  0,  0, 0x00000038 },
-	{ 0x15, 12, 13, 0x00000016 },
-	{ 0x15,  0, 12, 0x00000800 },
-	{ 0x30,  0,  0, 0x00000017 },
-	{ 0x15,  2,  0, 0x00000084 },
-	{ 0x15,  1,  0, 0x00000006 },
-	{ 0x15,  0,  8, 0x00000011 },
-	{ 0x28,  0,  0, 0x00000014 },
-	{ 0x45,  6,  0, 0x00001fff },
-	{ 0xb1,  0,  0, 0x0000000e },
-	{ 0x48,  0,  0, 0x0000000e },
-	{ 0x15,  2,  0, 0x00000016 },
-	{ 0x48,  0,  0, 0x00000010 },
-	{ 0x15,  0,  1, 0x00000016 },
-	{ 0x06,  0,  0, 0x0000ffff },
-	{ 0x06,  0,  0, 0x00000000 },
-};
-
-struct sock_fprog bpf = {
-	.len = ARRAY_SIZE(code),
-	.filter = code,
-};
-
-sock = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
-if (sock < 0)
-	/* ... bail out ... */
-
-ret = setsockopt(sock, SOL_SOCKET, SO_ATTACH_FILTER, &bpf, sizeof(bpf));
-if (ret < 0)
-	/* ... bail out ... */
-
-/* ... */
-close(sock);
-
-The above example code attaches a socket filter for a PF_PACKET socket
-in order to let all IPv4/IPv6 packets with port 22 pass. The rest will
-be dropped for this socket.
-
-The setsockopt(2) call to SO_DETACH_FILTER doesn't need any arguments
-and SO_LOCK_FILTER for preventing the filter to be detached, takes an
-integer value with 0 or 1.
-
-Note that socket filters are not restricted to PF_PACKET sockets only,
-but can also be used on other socket families.
-
-Summary of system calls:
-
- * setsockopt(sockfd, SOL_SOCKET, SO_ATTACH_FILTER, &val, sizeof(val));
- * setsockopt(sockfd, SOL_SOCKET, SO_DETACH_FILTER, &val, sizeof(val));
- * setsockopt(sockfd, SOL_SOCKET, SO_LOCK_FILTER,   &val, sizeof(val));
-
-Normally, most use cases for socket filtering on packet sockets will be
-covered by libpcap in high-level syntax, so as an application developer
-you should stick to that. libpcap wraps its own layer around all that.
-
-Unless i) using/linking to libpcap is not an option, ii) the required BPF
-filters use Linux extensions that are not supported by libpcap's compiler,
-iii) a filter might be more complex and not cleanly implementable with
-libpcap's compiler, or iv) particular filter codes should be optimized
-differently than libpcap's internal compiler does; then in such cases
-writing such a filter "by hand" can be of an alternative. For example,
-xt_bpf and cls_bpf users might have requirements that could result in
-more complex filter code, or one that cannot be expressed with libpcap
-(e.g. different return codes for various code paths). Moreover, BPF JIT
-implementors may wish to manually write test cases and thus need low-level
-access to BPF code as well.
-
-BPF engine and instruction set
-------------------------------
-
-Under tools/bpf/ there's a small helper tool called bpf_asm which can
-be used to write low-level filters for example scenarios mentioned in the
-previous section. Asm-like syntax mentioned here has been implemented in
-bpf_asm and will be used for further explanations (instead of dealing with
-less readable opcodes directly, principles are the same). The syntax is
-closely modelled after Steven McCanne's and Van Jacobson's BPF paper.
-
-The BPF architecture consists of the following basic elements:
-
-  Element          Description
-
-  A                32 bit wide accumulator
-  X                32 bit wide X register
-  M[]              16 x 32 bit wide misc registers aka "scratch memory
-                   store", addressable from 0 to 15
-
-A program, that is translated by bpf_asm into "opcodes" is an array that
-consists of the following elements (as already mentioned):
-
-  op:16, jt:8, jf:8, k:32
-
-The element op is a 16 bit wide opcode that has a particular instruction
-encoded. jt and jf are two 8 bit wide jump targets, one for condition
-"jump if true", the other one "jump if false". Eventually, element k
-contains a miscellaneous argument that can be interpreted in different
-ways depending on the given instruction in op.
-
-The instruction set consists of load, store, branch, alu, miscellaneous
-and return instructions that are also represented in bpf_asm syntax. This
-table lists all bpf_asm instructions available resp. what their underlying
-opcodes as defined in linux/filter.h stand for:
-
-  Instruction      Addressing mode      Description
-
-  ld               1, 2, 3, 4, 12       Load word into A
-  ldi              4                    Load word into A
-  ldh              1, 2                 Load half-word into A
-  ldb              1, 2                 Load byte into A
-  ldx              3, 4, 5, 12          Load word into X
-  ldxi             4                    Load word into X
-  ldxb             5                    Load byte into X
-
-  st               3                    Store A into M[]
-  stx              3                    Store X into M[]
-
-  jmp              6                    Jump to label
-  ja               6                    Jump to label
-  jeq              7, 8, 9, 10          Jump on A == <x>
-  jneq             9, 10                Jump on A != <x>
-  jne              9, 10                Jump on A != <x>
-  jlt              9, 10                Jump on A <  <x>
-  jle              9, 10                Jump on A <= <x>
-  jgt              7, 8, 9, 10          Jump on A >  <x>
-  jge              7, 8, 9, 10          Jump on A >= <x>
-  jset             7, 8, 9, 10          Jump on A &  <x>
-
-  add              0, 4                 A + <x>
-  sub              0, 4                 A - <x>
-  mul              0, 4                 A * <x>
-  div              0, 4                 A / <x>
-  mod              0, 4                 A % <x>
-  neg                                   !A
-  and              0, 4                 A & <x>
-  or               0, 4                 A | <x>
-  xor              0, 4                 A ^ <x>
-  lsh              0, 4                 A << <x>
-  rsh              0, 4                 A >> <x>
-
-  tax                                   Copy A into X
-  txa                                   Copy X into A
-
-  ret              4, 11                Return
-
-The next table shows addressing formats from the 2nd column:
-
-  Addressing mode  Syntax               Description
-
-   0               x/%x                 Register X
-   1               [k]                  BHW at byte offset k in the packet
-   2               [x + k]              BHW at the offset X + k in the packet
-   3               M[k]                 Word at offset k in M[]
-   4               #k                   Literal value stored in k
-   5               4*([k]&0xf)          Lower nibble * 4 at byte offset k in the packet
-   6               L                    Jump label L
-   7               #k,Lt,Lf             Jump to Lt if true, otherwise jump to Lf
-   8               x/%x,Lt,Lf           Jump to Lt if true, otherwise jump to Lf
-   9               #k,Lt                Jump to Lt if predicate is true
-  10               x/%x,Lt              Jump to Lt if predicate is true
-  11               a/%a                 Accumulator A
-  12               extension            BPF extension
-
-The Linux kernel also has a couple of BPF extensions that are used along
-with the class of load instructions by "overloading" the k argument with
-a negative offset + a particular extension offset. The result of such BPF
-extensions are loaded into A.
-
-Possible BPF extensions are shown in the following table:
-
-  Extension                             Description
-
-  len                                   skb->len
-  proto                                 skb->protocol
-  type                                  skb->pkt_type
-  poff                                  Payload start offset
-  ifidx                                 skb->dev->ifindex
-  nla                                   Netlink attribute of type X with offset A
-  nlan                                  Nested Netlink attribute of type X with offset A
-  mark                                  skb->mark
-  queue                                 skb->queue_mapping
-  hatype                                skb->dev->type
-  rxhash                                skb->hash
-  cpu                                   raw_smp_processor_id()
-  vlan_tci                              skb_vlan_tag_get(skb)
-  vlan_avail                            skb_vlan_tag_present(skb)
-  vlan_tpid                             skb->vlan_proto
-  rand                                  prandom_u32()
-
-These extensions can also be prefixed with '#'.
-Examples for low-level BPF:
-
-** ARP packets:
-
-  ldh [12]
-  jne #0x806, drop
-  ret #-1
-  drop: ret #0
-
-** IPv4 TCP packets:
-
-  ldh [12]
-  jne #0x800, drop
-  ldb [23]
-  jneq #6, drop
-  ret #-1
-  drop: ret #0
-
-** (Accelerated) VLAN w/ id 10:
-
-  ld vlan_tci
-  jneq #10, drop
-  ret #-1
-  drop: ret #0
-
-** icmp random packet sampling, 1 in 4
-  ldh [12]
-  jne #0x800, drop
-  ldb [23]
-  jneq #1, drop
-  # get a random uint32 number
-  ld rand
-  mod #4
-  jneq #1, drop
-  ret #-1
-  drop: ret #0
-
-** SECCOMP filter example:
-
-  ld [4]                  /* offsetof(struct seccomp_data, arch) */
-  jne #0xc000003e, bad    /* AUDIT_ARCH_X86_64 */
-  ld [0]                  /* offsetof(struct seccomp_data, nr) */
-  jeq #15, good           /* __NR_rt_sigreturn */
-  jeq #231, good          /* __NR_exit_group */
-  jeq #60, good           /* __NR_exit */
-  jeq #0, good            /* __NR_read */
-  jeq #1, good            /* __NR_write */
-  jeq #5, good            /* __NR_fstat */
-  jeq #9, good            /* __NR_mmap */
-  jeq #14, good           /* __NR_rt_sigprocmask */
-  jeq #13, good           /* __NR_rt_sigaction */
-  jeq #35, good           /* __NR_nanosleep */
-  bad: ret #0             /* SECCOMP_RET_KILL_THREAD */
-  good: ret #0x7fff0000   /* SECCOMP_RET_ALLOW */
-
-The above example code can be placed into a file (here called "foo"), and
-then be passed to the bpf_asm tool for generating opcodes, output that xt_bpf
-and cls_bpf understands and can directly be loaded with. Example with above
-ARP code:
-
-$ ./bpf_asm foo
-4,40 0 0 12,21 0 1 2054,6 0 0 4294967295,6 0 0 0,
-
-In copy and paste C-like output:
-
-$ ./bpf_asm -c foo
-{ 0x28,  0,  0, 0x0000000c },
-{ 0x15,  0,  1, 0x00000806 },
-{ 0x06,  0,  0, 0xffffffff },
-{ 0x06,  0,  0, 0000000000 },
-
-In particular, as usage with xt_bpf or cls_bpf can result in more complex BPF
-filters that might not be obvious at first, it's good to test filters before
-attaching to a live system. For that purpose, there's a small tool called
-bpf_dbg under tools/bpf/ in the kernel source directory. This debugger allows
-for testing BPF filters against given pcap files, single stepping through the
-BPF code on the pcap's packets and to do BPF machine register dumps.
-
-Starting bpf_dbg is trivial and just requires issuing:
-
-# ./bpf_dbg
-
-In case input and output do not equal stdin/stdout, bpf_dbg takes an
-alternative stdin source as a first argument, and an alternative stdout
-sink as a second one, e.g. `./bpf_dbg test_in.txt test_out.txt`.
-
-Other than that, a particular libreadline configuration can be set via
-file "~/.bpf_dbg_init" and the command history is stored in the file
-"~/.bpf_dbg_history".
-
-Interaction in bpf_dbg happens through a shell that also has auto-completion
-support (follow-up example commands starting with '>' denote bpf_dbg shell).
-The usual workflow would be to ...
-
-> load bpf 6,40 0 0 12,21 0 3 2048,48 0 0 23,21 0 1 1,6 0 0 65535,6 0 0 0
-  Loads a BPF filter from standard output of bpf_asm, or transformed via
-  e.g. `tcpdump -iem1 -ddd port 22 | tr '\n' ','`. Note that for JIT
-  debugging (next section), this command creates a temporary socket and
-  loads the BPF code into the kernel. Thus, this will also be useful for
-  JIT developers.
-
-> load pcap foo.pcap
-  Loads standard tcpdump pcap file.
-
-> run [<n>]
-bpf passes:1 fails:9
-  Runs through all packets from a pcap to account how many passes and fails
-  the filter will generate. A limit of packets to traverse can be given.
-
-> disassemble
-l0:	ldh [12]
-l1:	jeq #0x800, l2, l5
-l2:	ldb [23]
-l3:	jeq #0x1, l4, l5
-l4:	ret #0xffff
-l5:	ret #0
-  Prints out BPF code disassembly.
-
-> dump
-/* { op, jt, jf, k }, */
-{ 0x28,  0,  0, 0x0000000c },
-{ 0x15,  0,  3, 0x00000800 },
-{ 0x30,  0,  0, 0x00000017 },
-{ 0x15,  0,  1, 0x00000001 },
-{ 0x06,  0,  0, 0x0000ffff },
-{ 0x06,  0,  0, 0000000000 },
-  Prints out C-style BPF code dump.
-
-> breakpoint 0
-breakpoint at: l0:	ldh [12]
-> breakpoint 1
-breakpoint at: l1:	jeq #0x800, l2, l5
-  ...
-  Sets breakpoints at particular BPF instructions. Issuing a `run` command
-  will walk through the pcap file continuing from the current packet and
-  break when a breakpoint is being hit (another `run` will continue from
-  the currently active breakpoint executing next instructions):
-
-  > run
-  -- register dump --
-  pc:       [0]                       <-- program counter
-  code:     [40] jt[0] jf[0] k[12]    <-- plain BPF code of current instruction
-  curr:     l0:	ldh [12]              <-- disassembly of current instruction
-  A:        [00000000][0]             <-- content of A (hex, decimal)
-  X:        [00000000][0]             <-- content of X (hex, decimal)
-  M[0,15]:  [00000000][0]             <-- folded content of M (hex, decimal)
-  -- packet dump --                   <-- Current packet from pcap (hex)
-  len: 42
-    0: 00 19 cb 55 55 a4 00 14 a4 43 78 69 08 06 00 01
-   16: 08 00 06 04 00 01 00 14 a4 43 78 69 0a 3b 01 26
-   32: 00 00 00 00 00 00 0a 3b 01 01
-  (breakpoint)
-  >
-
-> breakpoint
-breakpoints: 0 1
-  Prints currently set breakpoints.
-
-> step [-<n>, +<n>]
-  Performs single stepping through the BPF program from the current pc
-  offset. Thus, on each step invocation, above register dump is issued.
-  This can go forwards and backwards in time, a plain `step` will break
-  on the next BPF instruction, thus +1. (No `run` needs to be issued here.)
-
-> select <n>
-  Selects a given packet from the pcap file to continue from. Thus, on
-  the next `run` or `step`, the BPF program is being evaluated against
-  the user pre-selected packet. Numbering starts just as in Wireshark
-  with index 1.
-
-> quit
-#
-  Exits bpf_dbg.
-
-JIT compiler
-------------
-
-The Linux kernel has a built-in BPF JIT compiler for x86_64, SPARC,
-PowerPC, ARM, ARM64, MIPS, RISC-V and s390 and can be enabled through
-CONFIG_BPF_JIT. The JIT compiler is transparently invoked for each
-attached filter from user space or for internal kernel users if it has
-been previously enabled by root:
-
-  echo 1 > /proc/sys/net/core/bpf_jit_enable
-
-For JIT developers, doing audits etc, each compile run can output the generated
-opcode image into the kernel log via:
-
-  echo 2 > /proc/sys/net/core/bpf_jit_enable
-
-Example output from dmesg:
-
-[ 3389.935842] flen=6 proglen=70 pass=3 image=ffffffffa0069c8f
-[ 3389.935847] JIT code: 00000000: 55 48 89 e5 48 83 ec 60 48 89 5d f8 44 8b 4f 68
-[ 3389.935849] JIT code: 00000010: 44 2b 4f 6c 4c 8b 87 d8 00 00 00 be 0c 00 00 00
-[ 3389.935850] JIT code: 00000020: e8 1d 94 ff e0 3d 00 08 00 00 75 16 be 17 00 00
-[ 3389.935851] JIT code: 00000030: 00 e8 28 94 ff e0 83 f8 01 75 07 b8 ff ff 00 00
-[ 3389.935852] JIT code: 00000040: eb 02 31 c0 c9 c3
-
-When CONFIG_BPF_JIT_ALWAYS_ON is enabled, bpf_jit_enable is permanently set to 1 and
-setting any other value than that will return in failure. This is even the case for
-setting bpf_jit_enable to 2, since dumping the final JIT image into the kernel log
-is discouraged and introspection through bpftool (under tools/bpf/bpftool/) is the
-generally recommended approach instead.
-
-In the kernel source tree under tools/bpf/, there's bpf_jit_disasm for
-generating disassembly out of the kernel log's hexdump:
-
-# ./bpf_jit_disasm
-70 bytes emitted from JIT compiler (pass:3, flen:6)
-ffffffffa0069c8f + <x>:
-   0:	push   %rbp
-   1:	mov    %rsp,%rbp
-   4:	sub    $0x60,%rsp
-   8:	mov    %rbx,-0x8(%rbp)
-   c:	mov    0x68(%rdi),%r9d
-  10:	sub    0x6c(%rdi),%r9d
-  14:	mov    0xd8(%rdi),%r8
-  1b:	mov    $0xc,%esi
-  20:	callq  0xffffffffe0ff9442
-  25:	cmp    $0x800,%eax
-  2a:	jne    0x0000000000000042
-  2c:	mov    $0x17,%esi
-  31:	callq  0xffffffffe0ff945e
-  36:	cmp    $0x1,%eax
-  39:	jne    0x0000000000000042
-  3b:	mov    $0xffff,%eax
-  40:	jmp    0x0000000000000044
-  42:	xor    %eax,%eax
-  44:	leaveq
-  45:	retq
-
-Issuing option `-o` will "annotate" opcodes to resulting assembler
-instructions, which can be very useful for JIT developers:
-
-# ./bpf_jit_disasm -o
-70 bytes emitted from JIT compiler (pass:3, flen:6)
-ffffffffa0069c8f + <x>:
-   0:	push   %rbp
-	55
-   1:	mov    %rsp,%rbp
-	48 89 e5
-   4:	sub    $0x60,%rsp
-	48 83 ec 60
-   8:	mov    %rbx,-0x8(%rbp)
-	48 89 5d f8
-   c:	mov    0x68(%rdi),%r9d
-	44 8b 4f 68
-  10:	sub    0x6c(%rdi),%r9d
-	44 2b 4f 6c
-  14:	mov    0xd8(%rdi),%r8
-	4c 8b 87 d8 00 00 00
-  1b:	mov    $0xc,%esi
-	be 0c 00 00 00
-  20:	callq  0xffffffffe0ff9442
-	e8 1d 94 ff e0
-  25:	cmp    $0x800,%eax
-	3d 00 08 00 00
-  2a:	jne    0x0000000000000042
-	75 16
-  2c:	mov    $0x17,%esi
-	be 17 00 00 00
-  31:	callq  0xffffffffe0ff945e
-	e8 28 94 ff e0
-  36:	cmp    $0x1,%eax
-	83 f8 01
-  39:	jne    0x0000000000000042
-	75 07
-  3b:	mov    $0xffff,%eax
-	b8 ff ff 00 00
-  40:	jmp    0x0000000000000044
-	eb 02
-  42:	xor    %eax,%eax
-	31 c0
-  44:	leaveq
-	c9
-  45:	retq
-	c3
-
-For BPF JIT developers, bpf_jit_disasm, bpf_asm and bpf_dbg provides a useful
-toolchain for developing and testing the kernel's JIT compiler.
-
-BPF kernel internals
---------------------
-Internally, for the kernel interpreter, a different instruction set
-format with similar underlying principles from BPF described in previous
-paragraphs is being used. However, the instruction set format is modelled
-closer to the underlying architecture to mimic native instruction sets, so
-that a better performance can be achieved (more details later). This new
-ISA is called 'eBPF' or 'internal BPF' interchangeably. (Note: eBPF which
-originates from [e]xtended BPF is not the same as BPF extensions! While
-eBPF is an ISA, BPF extensions date back to classic BPF's 'overloading'
-of BPF_LD | BPF_{B,H,W} | BPF_ABS instruction.)
-
-It is designed to be JITed with one to one mapping, which can also open up
-the possibility for GCC/LLVM compilers to generate optimized eBPF code through
-an eBPF backend that performs almost as fast as natively compiled code.
-
-The new instruction set was originally designed with the possible goal in
-mind to write programs in "restricted C" and compile into eBPF with a optional
-GCC/LLVM backend, so that it can just-in-time map to modern 64-bit CPUs with
-minimal performance overhead over two steps, that is, C -> eBPF -> native code.
-
-Currently, the new format is being used for running user BPF programs, which
-includes seccomp BPF, classic socket filters, cls_bpf traffic classifier,
-team driver's classifier for its load-balancing mode, netfilter's xt_bpf
-extension, PTP dissector/classifier, and much more. They are all internally
-converted by the kernel into the new instruction set representation and run
-in the eBPF interpreter. For in-kernel handlers, this all works transparently
-by using bpf_prog_create() for setting up the filter, resp.
-bpf_prog_destroy() for destroying it. The macro
-BPF_PROG_RUN(filter, ctx) transparently invokes eBPF interpreter or JITed
-code to run the filter. 'filter' is a pointer to struct bpf_prog that we
-got from bpf_prog_create(), and 'ctx' the given context (e.g.
-skb pointer). All constraints and restrictions from bpf_check_classic() apply
-before a conversion to the new layout is being done behind the scenes!
-
-Currently, the classic BPF format is being used for JITing on most
-32-bit architectures, whereas x86-64, aarch64, s390x, powerpc64,
-sparc64, arm32, riscv64, riscv32 perform JIT compilation from eBPF
-instruction set.
-
-Some core changes of the new internal format:
-
-- Number of registers increase from 2 to 10:
-
-  The old format had two registers A and X, and a hidden frame pointer. The
-  new layout extends this to be 10 internal registers and a read-only frame
-  pointer. Since 64-bit CPUs are passing arguments to functions via registers
-  the number of args from eBPF program to in-kernel function is restricted
-  to 5 and one register is used to accept return value from an in-kernel
-  function. Natively, x86_64 passes first 6 arguments in registers, aarch64/
-  sparcv9/mips64 have 7 - 8 registers for arguments; x86_64 has 6 callee saved
-  registers, and aarch64/sparcv9/mips64 have 11 or more callee saved registers.
-
-  Therefore, eBPF calling convention is defined as:
-
-    * R0	- return value from in-kernel function, and exit value for eBPF program
-    * R1 - R5	- arguments from eBPF program to in-kernel function
-    * R6 - R9	- callee saved registers that in-kernel function will preserve
-    * R10	- read-only frame pointer to access stack
-
-  Thus, all eBPF registers map one to one to HW registers on x86_64, aarch64,
-  etc, and eBPF calling convention maps directly to ABIs used by the kernel on
-  64-bit architectures.
-
-  On 32-bit architectures JIT may map programs that use only 32-bit arithmetic
-  and may let more complex programs to be interpreted.
-
-  R0 - R5 are scratch registers and eBPF program needs spill/fill them if
-  necessary across calls. Note that there is only one eBPF program (== one
-  eBPF main routine) and it cannot call other eBPF functions, it can only
-  call predefined in-kernel functions, though.
-
-- Register width increases from 32-bit to 64-bit:
-
-  Still, the semantics of the original 32-bit ALU operations are preserved
-  via 32-bit subregisters. All eBPF registers are 64-bit with 32-bit lower
-  subregisters that zero-extend into 64-bit if they are being written to.
-  That behavior maps directly to x86_64 and arm64 subregister definition, but
-  makes other JITs more difficult.
-
-  32-bit architectures run 64-bit internal BPF programs via interpreter.
-  Their JITs may convert BPF programs that only use 32-bit subregisters into
-  native instruction set and let the rest being interpreted.
-
-  Operation is 64-bit, because on 64-bit architectures, pointers are also
-  64-bit wide, and we want to pass 64-bit values in/out of kernel functions,
-  so 32-bit eBPF registers would otherwise require to define register-pair
-  ABI, thus, there won't be able to use a direct eBPF register to HW register
-  mapping and JIT would need to do combine/split/move operations for every
-  register in and out of the function, which is complex, bug prone and slow.
-  Another reason is the use of atomic 64-bit counters.
-
-- Conditional jt/jf targets replaced with jt/fall-through:
-
-  While the original design has constructs such as "if (cond) jump_true;
-  else jump_false;", they are being replaced into alternative constructs like
-  "if (cond) jump_true; /* else fall-through */".
-
-- Introduces bpf_call insn and register passing convention for zero overhead
-  calls from/to other kernel functions:
-
-  Before an in-kernel function call, the internal BPF program needs to
-  place function arguments into R1 to R5 registers to satisfy calling
-  convention, then the interpreter will take them from registers and pass
-  to in-kernel function. If R1 - R5 registers are mapped to CPU registers
-  that are used for argument passing on given architecture, the JIT compiler
-  doesn't need to emit extra moves. Function arguments will be in the correct
-  registers and BPF_CALL instruction will be JITed as single 'call' HW
-  instruction. This calling convention was picked to cover common call
-  situations without performance penalty.
-
-  After an in-kernel function call, R1 - R5 are reset to unreadable and R0 has
-  a return value of the function. Since R6 - R9 are callee saved, their state
-  is preserved across the call.
-
-  For example, consider three C functions:
-
-  u64 f1() { return (*_f2)(1); }
-  u64 f2(u64 a) { return f3(a + 1, a); }
-  u64 f3(u64 a, u64 b) { return a - b; }
-
-  GCC can compile f1, f3 into x86_64:
-
-  f1:
-    movl $1, %edi
-    movq _f2(%rip), %rax
-    jmp  *%rax
-  f3:
-    movq %rdi, %rax
-    subq %rsi, %rax
-    ret
-
-  Function f2 in eBPF may look like:
-
-  f2:
-    bpf_mov R2, R1
-    bpf_add R1, 1
-    bpf_call f3
-    bpf_exit
-
-  If f2 is JITed and the pointer is stored to '_f2', the calls f1 -> f2 -> f3
-  and returns will be seamless. Without JIT, the __bpf_prog_run() interpreter
-  needs to be used to call into f2.
-
-  For practical reasons all eBPF programs have only one argument 'ctx' which is
-  already placed into R1 (e.g. on __bpf_prog_run() startup) and the programs
-  can call kernel functions with up to 5 arguments. Calls with 6 or more arguments
-  are currently not supported, but these restrictions can be lifted if necessary
-  in the future.
-
-  On 64-bit architectures all registers map to HW registers one to one. For
-  example, x86_64 JIT compiler can map them as ...
-
-    R0 - rax
-    R1 - rdi
-    R2 - rsi
-    R3 - rdx
-    R4 - rcx
-    R5 - r8
-    R6 - rbx
-    R7 - r13
-    R8 - r14
-    R9 - r15
-    R10 - rbp
-
-  ... since x86_64 ABI mandates rdi, rsi, rdx, rcx, r8, r9 for argument passing
-  and rbx, r12 - r15 are callee saved.
-
-  Then the following internal BPF pseudo-program:
-
-    bpf_mov R6, R1 /* save ctx */
-    bpf_mov R2, 2
-    bpf_mov R3, 3
-    bpf_mov R4, 4
-    bpf_mov R5, 5
-    bpf_call foo
-    bpf_mov R7, R0 /* save foo() return value */
-    bpf_mov R1, R6 /* restore ctx for next call */
-    bpf_mov R2, 6
-    bpf_mov R3, 7
-    bpf_mov R4, 8
-    bpf_mov R5, 9
-    bpf_call bar
-    bpf_add R0, R7
-    bpf_exit
-
-  After JIT to x86_64 it may look like:
-
-    push %rbp
-    mov %rsp,%rbp
-    sub $0x228,%rsp
-    mov %rbx,-0x228(%rbp)
-    mov %r13,-0x220(%rbp)
-    mov %rdi,%rbx
-    mov $0x2,%esi
-    mov $0x3,%edx
-    mov $0x4,%ecx
-    mov $0x5,%r8d
-    callq foo
-    mov %rax,%r13
-    mov %rbx,%rdi
-    mov $0x6,%esi
-    mov $0x7,%edx
-    mov $0x8,%ecx
-    mov $0x9,%r8d
-    callq bar
-    add %r13,%rax
-    mov -0x228(%rbp),%rbx
-    mov -0x220(%rbp),%r13
-    leaveq
-    retq
-
-  Which is in this example equivalent in C to:
-
-    u64 bpf_filter(u64 ctx)
-    {
-        return foo(ctx, 2, 3, 4, 5) + bar(ctx, 6, 7, 8, 9);
-    }
-
-  In-kernel functions foo() and bar() with prototype: u64 (*)(u64 arg1, u64
-  arg2, u64 arg3, u64 arg4, u64 arg5); will receive arguments in proper
-  registers and place their return value into '%rax' which is R0 in eBPF.
-  Prologue and epilogue are emitted by JIT and are implicit in the
-  interpreter. R0-R5 are scratch registers, so eBPF program needs to preserve
-  them across the calls as defined by calling convention.
-
-  For example the following program is invalid:
-
-    bpf_mov R1, 1
-    bpf_call foo
-    bpf_mov R0, R1
-    bpf_exit
-
-  After the call the registers R1-R5 contain junk values and cannot be read.
-  An in-kernel eBPF verifier is used to validate internal BPF programs.
-
-Also in the new design, eBPF is limited to 4096 insns, which means that any
-program will terminate quickly and will only call a fixed number of kernel
-functions. Original BPF and the new format are two operand instructions,
-which helps to do one-to-one mapping between eBPF insn and x86 insn during JIT.
-
-The input context pointer for invoking the interpreter function is generic;
-its content is defined by a specific use case. For seccomp, register R1 points
-to seccomp_data; for converted BPF filters, R1 points to a skb.
-
-A program that is translated internally consists of the following elements:
-
-  op:16, jt:8, jf:8, k:32    ==>    op:8, dst_reg:4, src_reg:4, off:16, imm:32
-
-So far 87 internal BPF instructions were implemented. 8-bit 'op' opcode field
-has room for new instructions. Some of them may use 16/24/32 byte encoding. New
-instructions must be a multiple of 8 bytes to preserve backward compatibility.
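-
-As a minimal sketch of the 8-byte layout on the right-hand side above, the
-fields can be held in a C struct. The names insn_sketch, insn_make, insn_dst
-and insn_src are made up for this illustration, and placing dst_reg in the low
-nibble of the shared register byte is an assumption of the sketch, not
-something this text specifies:
-

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of: op:8, dst_reg:4, src_reg:4, off:16, imm:32 */
struct insn_sketch {
	uint8_t op;	/* 8-bit opcode */
	uint8_t regs;	/* assumed packing: dst_reg low nibble, src_reg high nibble */
	int16_t off;	/* 16-bit signed offset */
	int32_t imm;	/* 32-bit signed immediate */
};

static inline uint8_t insn_dst(const struct insn_sketch *i)
{
	return i->regs & 0x0f;
}

static inline uint8_t insn_src(const struct insn_sketch *i)
{
	return i->regs >> 4;
}

static inline struct insn_sketch insn_make(uint8_t op, uint8_t dst, uint8_t src,
					    int16_t off, int32_t imm)
{
	struct insn_sketch i;

	assert(dst < 16 && src < 16);	/* each register field is 4 bits wide */
	i.op = op;
	i.regs = (uint8_t)(dst | (src << 4));
	i.off = off;
	i.imm = imm;
	return i;
}
```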
-
-Internal BPF is a general purpose RISC instruction set. Not every register and
-every instruction are used during translation from original BPF to new format.
-For example, socket filters are not using 'exclusive add' instruction, but
-tracing filters may use it to maintain counters of events. Register R9
-is not used by socket filters either, but more complex filters may be running
-out of registers and would have to resort to spill/fill to stack.
-
-Internal BPF can be used as a generic assembler for last step performance
-optimizations; socket filters and seccomp are using it as an assembler. Tracing
-filters may use it as an assembler to generate code from the kernel. In-kernel
-usage may not be bounded by security considerations, since generated internal
-BPF code may be optimizing an internal code path and not be exposed to user
-space. Safety of internal BPF can come from a verifier (TBD). In such use cases
-as described, it may be used as a safe instruction set.
-
-Just like the original BPF, the new format runs within a controlled environment,
-is deterministic and the kernel can easily prove that. The safety of the program
-can be determined in two steps: first step does depth-first-search to disallow
-loops and other CFG validation; second step starts from the first insn and
-descends all possible paths. It simulates execution of every insn and observes
-the state change of registers and stack.
-
-eBPF opcode encoding
---------------------
-
-eBPF is reusing most of the opcode encoding from classic to simplify conversion
-of classic BPF to eBPF. For arithmetic and jump instructions the 8-bit 'code'
-field is divided into three parts:
-
-  +----------------+--------+--------------------+
-  |   4 bits       |  1 bit |   3 bits           |
-  | operation code | source | instruction class  |
-  +----------------+--------+--------------------+
-  (MSB)                                      (LSB)
-
-Three LSB bits store instruction class which is one of:
-
-  Classic BPF classes:    eBPF classes:
-
-  BPF_LD    0x00          BPF_LD    0x00
-  BPF_LDX   0x01          BPF_LDX   0x01
-  BPF_ST    0x02          BPF_ST    0x02
-  BPF_STX   0x03          BPF_STX   0x03
-  BPF_ALU   0x04          BPF_ALU   0x04
-  BPF_JMP   0x05          BPF_JMP   0x05
-  BPF_RET   0x06          BPF_JMP32 0x06
-  BPF_MISC  0x07          BPF_ALU64 0x07
-
-When BPF_CLASS(code) == BPF_ALU or BPF_JMP, 4th bit encodes source operand ...
-
-  BPF_K     0x00
-  BPF_X     0x08
-
- * in classic BPF, this means:
-
-  BPF_SRC(code) == BPF_X - use register X as source operand
-  BPF_SRC(code) == BPF_K - use 32-bit immediate as source operand
-
- * in eBPF, this means:
-
-  BPF_SRC(code) == BPF_X - use 'src_reg' register as source operand
-  BPF_SRC(code) == BPF_K - use 32-bit immediate as source operand
-
-... and four MSB bits store operation code.
-
-If BPF_CLASS(code) == BPF_ALU or BPF_ALU64 [ in eBPF ], BPF_OP(code) is one of:
-
-  BPF_ADD   0x00
-  BPF_SUB   0x10
-  BPF_MUL   0x20
-  BPF_DIV   0x30
-  BPF_OR    0x40
-  BPF_AND   0x50
-  BPF_LSH   0x60
-  BPF_RSH   0x70
-  BPF_NEG   0x80
-  BPF_MOD   0x90
-  BPF_XOR   0xa0
-  BPF_MOV   0xb0  /* eBPF only: mov reg to reg */
-  BPF_ARSH  0xc0  /* eBPF only: sign extending shift right */
-  BPF_END   0xd0  /* eBPF only: endianness conversion */
-
-If BPF_CLASS(code) == BPF_JMP or BPF_JMP32 [ in eBPF ], BPF_OP(code) is one of:
-
-  BPF_JA    0x00  /* BPF_JMP only */
-  BPF_JEQ   0x10
-  BPF_JGT   0x20
-  BPF_JGE   0x30
-  BPF_JSET  0x40
-  BPF_JNE   0x50  /* eBPF only: jump != */
-  BPF_JSGT  0x60  /* eBPF only: signed '>' */
-  BPF_JSGE  0x70  /* eBPF only: signed '>=' */
-  BPF_CALL  0x80  /* eBPF BPF_JMP only: function call */
-  BPF_EXIT  0x90  /* eBPF BPF_JMP only: function return */
-  BPF_JLT   0xa0  /* eBPF only: unsigned '<' */
-  BPF_JLE   0xb0  /* eBPF only: unsigned '<=' */
-  BPF_JSLT  0xc0  /* eBPF only: signed '<' */
-  BPF_JSLE  0xd0  /* eBPF only: signed '<=' */
-
-So BPF_ADD | BPF_X | BPF_ALU means 32-bit addition in both classic BPF
-and eBPF. There are only two registers in classic BPF, so it means A += X.
-In eBPF it means dst_reg = (u32) dst_reg + (u32) src_reg; similarly,
-BPF_XOR | BPF_K | BPF_ALU means A ^= imm32 in classic BPF and analogous
-dst_reg = (u32) dst_reg ^ (u32) imm32 in eBPF.
-
-Classic BPF is using BPF_MISC class to represent A = X and X = A moves.
-eBPF is using BPF_MOV | BPF_X | BPF_ALU code instead. Since there are no
-BPF_MISC operations in eBPF, the class 7 is used as BPF_ALU64 to mean
-exactly the same operations as BPF_ALU, but with 64-bit wide operands
-instead. So BPF_ADD | BPF_X | BPF_ALU64 means 64-bit addition, i.e.:
-dst_reg = dst_reg + src_reg
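-
-The difference between the two ALU classes can be sketched in plain C. This is
-only an illustration of the semantics described in this document (32-bit
-results zero-extend into the 64-bit register); the function names are
-hypothetical and no kernel code is being quoted:
-

```c
#include <assert.h>
#include <stdint.h>

/* BPF_ADD | BPF_X | BPF_ALU: 32-bit add, result zero-extended to 64 bits */
static inline uint64_t alu32_add(uint64_t dst, uint64_t src)
{
	return (uint32_t)((uint32_t)dst + (uint32_t)src);
}

/* BPF_ADD | BPF_X | BPF_ALU64: full 64-bit add */
static inline uint64_t alu64_add(uint64_t dst, uint64_t src)
{
	return dst + src;
}

/* BPF_XOR | BPF_K | BPF_ALU: 32-bit xor with the immediate, zero-extended */
static inline uint64_t alu32_xor_imm(uint64_t dst, int32_t imm)
{
	return (uint32_t)((uint32_t)dst ^ (uint32_t)imm);
}
```

-With dst = 0xffffffff and src = 1, the 32-bit form wraps to 0 while the 64-bit
-form carries into bit 32.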
-
-Classic BPF wastes the whole BPF_RET class to represent a single 'ret'
-operation. Classic BPF_RET | BPF_K means copy imm32 into return register
-and perform function exit. eBPF is modeled to match CPU, so BPF_JMP | BPF_EXIT
-in eBPF means function exit only. The eBPF program needs to store return
-value into register R0 before doing a BPF_EXIT. Class 6 in eBPF is used as
-BPF_JMP32 to mean exactly the same operations as BPF_JMP, but with 32-bit wide
-operands for the comparisons instead.
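-
-The 4/1/3-bit split of the 'code' field for arithmetic and jump instructions
-can be sketched with a few masks. The SK_ names below are local to this
-illustration (the values are the ones listed above); the sketch is not a copy
-of the kernel's own macros:
-

```c
#include <assert.h>
#include <stdint.h>

/* Field extraction for arithmetic and jump opcodes */
#define SK_CLASS(code)	((code) & 0x07)	/* three LSB bits */
#define SK_SRC(code)	((code) & 0x08)	/* 4th bit: BPF_K or BPF_X */
#define SK_OP(code)	((code) & 0xf0)	/* four MSB bits */

enum { SK_ALU = 0x04, SK_JMP = 0x05, SK_ALU64 = 0x07 };	/* classes */
enum { SK_K = 0x00, SK_X = 0x08 };				/* source */
enum { SK_ADD = 0x00, SK_EXIT = 0x90, SK_MOV = 0xb0 };		/* operations */

/* Combine the three fields into one opcode byte */
static inline uint8_t sk_encode(uint8_t op, uint8_t src, uint8_t cls)
{
	assert((op & 0x0f) == 0 && (src & ~0x08) == 0 && cls < 8);
	return (uint8_t)(op | src | cls);
}
```

-For instance, BPF_MOV | BPF_X | BPF_ALU64 encodes to 0xbf and
-BPF_JMP | BPF_EXIT to 0x95, which are the opcode bytes printed in parentheses
-in the verifier logs later in this document.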
-
-For load and store instructions the 8-bit 'code' field is divided as:
-
-  +--------+--------+-------------------+
-  | 3 bits | 2 bits |   3 bits          |
-  |  mode  |  size  | instruction class |
-  +--------+--------+-------------------+
-  (MSB)                             (LSB)
-
-Size modifier is one of ...
-
-  BPF_W   0x00    /* word */
-  BPF_H   0x08    /* half word */
-  BPF_B   0x10    /* byte */
-  BPF_DW  0x18    /* eBPF only, double word */
-
-... which encodes size of load/store operation:
-
- B  - 1 byte
- H  - 2 byte
- W  - 4 byte
- DW - 8 byte (eBPF only)
-
-Mode modifier is one of:
-
-  BPF_IMM  0x00  /* used for 32-bit mov in classic BPF and 64-bit in eBPF */
-  BPF_ABS  0x20
-  BPF_IND  0x40
-  BPF_MEM  0x60
-  BPF_LEN  0x80  /* classic BPF only, reserved in eBPF */
-  BPF_MSH  0xa0  /* classic BPF only, reserved in eBPF */
-  BPF_XADD 0xc0  /* eBPF only, exclusive add */
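-
-A load/store opcode can be taken apart the same way. The sketch below uses
-hypothetical SK_LDST_ macro names together with the size and mode values
-listed above:
-

```c
#include <assert.h>
#include <stdint.h>

/* Field extraction for load/store opcodes: 3-bit mode, 2-bit size, 3-bit class */
#define SK_LDST_CLASS(code)	((code) & 0x07)
#define SK_LDST_SIZE(code)	((code) & 0x18)
#define SK_LDST_MODE(code)	((code) & 0xe0)

/* Width in bytes selected by the size modifier */
static inline int sk_size_bytes(uint8_t code)
{
	switch (SK_LDST_SIZE(code)) {
	case 0x00: return 4;	/* BPF_W */
	case 0x08: return 2;	/* BPF_H */
	case 0x10: return 1;	/* BPF_B */
	default:   return 8;	/* BPF_DW (0x18), eBPF only */
	}
}
```

-Opcode 0x7a, for example, decodes as class BPF_ST (0x02), mode BPF_MEM (0x60)
-and an 8-byte access; 0x63 decodes as BPF_STX (0x03), BPF_MEM and 4 bytes.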
-
-eBPF has two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and
-(BPF_IND | <size> | BPF_LD) which are used to access packet data.
-
-They had to be carried over from classic to have strong performance of
-socket filters running in eBPF interpreter. These instructions can only
-be used when interpreter context is a pointer to 'struct sk_buff' and
-have seven implicit operands. Register R6 is an implicit input that must
-contain pointer to sk_buff. Register R0 is an implicit output which contains
-the data fetched from the packet. Registers R1-R5 are scratch registers
-and must not be used to store the data across BPF_ABS | BPF_LD or
-BPF_IND | BPF_LD instructions.
-
-These instructions have implicit program exit condition as well. When
-eBPF program is trying to access the data beyond the packet boundary,
-the interpreter will abort the execution of the program. JIT compilers
-therefore must preserve this property. src_reg and imm32 fields are
-explicit inputs to these instructions.
-
-For example:
-
-  BPF_IND | BPF_W | BPF_LD means:
-
-    R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32))
-    and R1 - R5 were scratched.
-
-Unlike classic BPF instruction set, eBPF has generic load/store operations:
-
-BPF_MEM | <size> | BPF_STX:  *(size *) (dst_reg + off) = src_reg
-BPF_MEM | <size> | BPF_ST:   *(size *) (dst_reg + off) = imm32
-BPF_MEM | <size> | BPF_LDX:  dst_reg = *(size *) (src_reg + off)
-BPF_XADD | BPF_W  | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
-BPF_XADD | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg
-
-Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW. Note that 1 and
-2 byte atomic increments are not supported.
-
-eBPF has one 16-byte instruction: BPF_LD | BPF_DW | BPF_IMM which consists
-of two consecutive 'struct bpf_insn' 8-byte blocks and interpreted as single
-instruction that loads 64-bit immediate value into a dst_reg.
-Classic BPF has similar instruction: BPF_LD | BPF_W | BPF_IMM which loads
-32-bit immediate value into a register.
-
-eBPF verifier
--------------
-The safety of the eBPF program is determined in two steps.
-
-First step does DAG check to disallow loops and other CFG validation.
-In particular it will detect programs that have unreachable instructions
-(though the classic BPF checker allows them).
-
-Second step starts from the first insn and descends all possible paths.
-It simulates execution of every insn and observes the state change of
-registers and stack.
-
-At the start of the program the register R1 contains a pointer to context
-and has type PTR_TO_CTX.
-If verifier sees an insn that does R2=R1, then R2 now has type
-PTR_TO_CTX as well and can be used on the right hand side of an expression.
-If R1=PTR_TO_CTX and insn is R2=R1+R1, then R2=SCALAR_VALUE,
-since addition of two valid pointers makes an invalid pointer.
-(In 'secure' mode the verifier will reject any type of pointer arithmetic to
-make sure that kernel addresses don't leak to unprivileged users.)
-
-If register was never written to, it's not readable:
-  bpf_mov R0 = R2
-  bpf_exit
-will be rejected, since R2 is unreadable at the start of the program.
-
-After kernel function call, R1-R5 are reset to unreadable and
-R0 has a return type of the function.
-
-Since R6-R9 are callee saved, their state is preserved across the call.
-  bpf_mov R6 = 1
-  bpf_call foo
-  bpf_mov R0 = R6
-  bpf_exit
-is a correct program. If there was R1 instead of R6, it would have
-been rejected.
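-
-The readability rules above can be modelled with one flag per register. This
-is a deliberately tiny sketch, not the verifier's data structure: the sk_
-names are hypothetical, and argument and type checking are left out:
-

```c
#include <assert.h>
#include <stdbool.h>

enum { SK_NREGS = 11 };	/* R0 .. R10 */

struct sk_regs {
	bool readable[SK_NREGS];
};

/* Program entry: only R1 (ctx) and R10 (frame pointer) hold values */
static inline struct sk_regs sk_entry(void)
{
	struct sk_regs s = { { false } };

	s.readable[1] = true;
	s.readable[10] = true;
	return s;
}

/* dst = imm: writing a register makes it readable */
static inline void sk_mov_imm(struct sk_regs *s, int dst)
{
	assert(dst >= 0 && dst < 10);	/* R10 is read-only */
	s->readable[dst] = true;
}

/* dst = src: rejected (false) when src was never written */
static inline bool sk_mov_reg(struct sk_regs *s, int dst, int src)
{
	assert(dst >= 0 && dst < 10 && src >= 0 && src < SK_NREGS);
	if (!s->readable[src])
		return false;
	s->readable[dst] = true;
	return true;
}

/* Kernel function call: R1-R5 become unreadable, R0 holds the return value */
static inline void sk_call(struct sk_regs *s)
{
	for (int r = 1; r <= 5; r++)
		s->readable[r] = false;
	s->readable[0] = true;
}

/* exit is only valid once R0 has been written */
static inline bool sk_exit_ok(const struct sk_regs *s)
{
	return s->readable[0];
}
```

-Replaying the two programs above against this model, the R6 variant passes
-and the R1 variant is refused at 'bpf_mov R0 = R1'.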
-
-load/store instructions are allowed only with registers of valid types, which
-are PTR_TO_CTX, PTR_TO_MAP, PTR_TO_STACK. They are bounds and alignment checked.
-For example:
- bpf_mov R1 = 1
- bpf_mov R2 = 2
- bpf_xadd *(u32 *)(R1 + 3) += R2
- bpf_exit
-will be rejected, since R1 doesn't have a valid pointer type at the time of
-execution of instruction bpf_xadd.
-
-At the start R1 type is PTR_TO_CTX (a pointer to generic 'struct bpf_context').
-A callback is used to customize verifier to restrict eBPF program access to only
-certain fields within ctx structure with specified size and alignment.
-
-For example, the following insn:
-  bpf_ld R0 = *(u32 *)(R6 + 8)
-intends to load a word from address R6 + 8 and store it into R0.
-If R6=PTR_TO_CTX, via is_valid_access() callback the verifier will know
-that offset 8 of size 4 bytes can be accessed for reading, otherwise
-the verifier will reject the program.
-If R6=PTR_TO_STACK, then access should be aligned and be within
-stack bounds, which are [-MAX_BPF_STACK, 0). In this example offset is 8,
-so it will fail verification, since it's out of bounds.
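-
-The PTR_TO_STACK case can be sketched as a bounds and alignment test on the
-offset from the frame pointer. The value 512 for MAX_BPF_STACK is an
-assumption of this sketch (it is not stated in this text), the frame pointer
-itself is assumed to be suitably aligned, and the function name is
-hypothetical:
-

```c
#include <assert.h>
#include <stdbool.h>

#define SK_MAX_STACK 512	/* assumed value of MAX_BPF_STACK */

/* off is relative to the frame pointer; size is the access width in bytes */
static inline bool sk_stack_access_ok(int off, int size)
{
	assert(size == 1 || size == 2 || size == 4 || size == 8);
	if (off < -SK_MAX_STACK || off + size > 0)
		return false;		/* outside [-MAX_BPF_STACK, 0) */
	return off % size == 0;		/* naturally aligned */
}
```

-With offset 8 and size 4, as in the example above, the check fails; offset -4
-with size 4 passes.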
-
-The verifier will allow eBPF program to read data from stack only after
-it wrote into it.
-Classic BPF verifier does similar check with M[0-15] memory slots.
-For example:
-  bpf_ld R0 = *(u32 *)(R10 - 4)
-  bpf_exit
-is an invalid program.
-Though R10 is a correct read-only register and has type PTR_TO_STACK
-and R10 - 4 is within stack bounds, there were no stores into that location.
-
-Pointer register spill/fill is tracked as well, since four (R6-R9)
-callee saved registers may not be enough for some programs.
-
-Allowed function calls are customized with bpf_verifier_ops->get_func_proto().
-The eBPF verifier will check that registers match argument constraints.
-After the call register R0 will be set to return type of the function.
-
-Function calls are the main mechanism to extend functionality of eBPF programs.
-Socket filters may let programs call one set of functions, whereas tracing
-filters may allow a completely different set.
-
-If a function is made accessible to an eBPF program, it needs to be thought
-through from a safety point of view. The verifier will guarantee that the
-function is called with valid arguments.
-
-seccomp vs socket filters have different security restrictions for classic BPF.
-Seccomp solves this by two stage verifier: classic BPF verifier is followed
-by seccomp verifier. In case of eBPF one configurable verifier is shared for
-all use cases.
-
-See details of eBPF verifier in kernel/bpf/verifier.c
-
-Register value tracking
------------------------
-In order to determine the safety of an eBPF program, the verifier must track
-the range of possible values in each register and also in each stack slot.
-This is done with 'struct bpf_reg_state', defined in include/linux/
-bpf_verifier.h, which unifies tracking of scalar and pointer values.  Each
-register state has a type, which is either NOT_INIT (the register has not been
-written to), SCALAR_VALUE (some value which is not usable as a pointer), or a
-pointer type.  The types of pointers describe their base, as follows:
-    PTR_TO_CTX          Pointer to bpf_context.
-    CONST_PTR_TO_MAP    Pointer to struct bpf_map.  "Const" because arithmetic
-                        on these pointers is forbidden.
-    PTR_TO_MAP_VALUE    Pointer to the value stored in a map element.
-    PTR_TO_MAP_VALUE_OR_NULL
-                        Either a pointer to a map value, or NULL; map accesses
-                        (see section 'eBPF maps', below) return this type,
-                        which becomes a PTR_TO_MAP_VALUE when checked != NULL.
-                        Arithmetic on these pointers is forbidden.
-    PTR_TO_STACK        Frame pointer.
-    PTR_TO_PACKET       skb->data.
-    PTR_TO_PACKET_END   skb->data + headlen; arithmetic forbidden.
-    PTR_TO_SOCKET       Pointer to struct bpf_sock_ops, implicitly refcounted.
-    PTR_TO_SOCKET_OR_NULL
-                        Either a pointer to a socket, or NULL; socket lookup
-                        returns this type, which becomes a PTR_TO_SOCKET when
-                        checked != NULL. PTR_TO_SOCKET is reference-counted,
-                        so programs must release the reference through the
-                        socket release function before the end of the program.
-                        Arithmetic on these pointers is forbidden.
-However, a pointer may be offset from this base (as a result of pointer
-arithmetic), and this is tracked in two parts: the 'fixed offset' and 'variable
-offset'.  The former is used when an exactly-known value (e.g. an immediate
-operand) is added to a pointer, while the latter is used for values which are
-not exactly known.  The variable offset is also used in SCALAR_VALUEs, to track
-the range of possible values in the register.
-The verifier's knowledge about the variable offset consists of:
-* minimum and maximum values as unsigned
-* minimum and maximum values as signed
-* knowledge of the values of individual bits, in the form of a 'tnum': a u64
-'mask' and a u64 'value'.  1s in the mask represent bits whose value is unknown;
-1s in the value represent bits known to be 1.  Bits known to be 0 have 0 in both
-mask and value; no bit should ever be 1 in both.  For example, if a byte is read
-into a register from memory, the register's top 56 bits are known zero, while
-the low 8 are unknown - which is represented as the tnum (0x0; 0xff).  If we
-then OR this with 0x40, we get (0x40; 0xbf), then if we add 1 we get (0x0;
-0x1ff), because of potential carries.
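-
-The tnum arithmetic in the example above can be sketched as follows. The
-struct and function names are hypothetical; the OR and ADD rules are written
-to match the behaviour this section describes and are not quoted from the
-kernel sources:
-

```c
#include <assert.h>
#include <stdint.h>

struct sk_tnum {
	uint64_t value;	/* bits known to be 1 */
	uint64_t mask;	/* bits whose value is unknown */
};

/* No bit may be set in both value and mask */
static inline int sk_tnum_valid(struct sk_tnum t)
{
	return (t.value & t.mask) == 0;
}

/* A fully known constant has an empty mask */
static inline struct sk_tnum sk_tnum_const(uint64_t v)
{
	return (struct sk_tnum){ v, 0 };
}

static inline struct sk_tnum sk_tnum_or(struct sk_tnum a, struct sk_tnum b)
{
	uint64_t v = a.value | b.value;	/* known 1 in either input */
	uint64_t mu = a.mask | b.mask;	/* unknown in either input */

	return (struct sk_tnum){ v, mu & ~v };	/* a known 1 wins over unknown */
}

static inline struct sk_tnum sk_tnum_add(struct sk_tnum a, struct sk_tnum b)
{
	uint64_t sm = a.mask + b.mask;
	uint64_t sv = a.value + b.value;
	uint64_t sigma = sm + sv;	/* sum with every unknown bit set */
	uint64_t chi = sigma ^ sv;	/* bits a possible carry can change */
	uint64_t mu = chi | a.mask | b.mask;

	return (struct sk_tnum){ sv & ~mu, mu };
}
```

-Starting from (0x0; 0xff), OR with the constant 0x40 yields (0x40; 0xbf), and
-adding the constant 1 to that yields (0x0; 0x1ff), as in the text.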
-
-Besides arithmetic, the register state can also be updated by conditional
-branches.  For instance, if a SCALAR_VALUE is compared > 8, in the 'true' branch
-it will have a umin_value (unsigned minimum value) of 9, whereas in the 'false'
-branch it will have a umax_value of 8.  A signed compare (with BPF_JSGT or
-BPF_JSGE) would instead update the signed minimum/maximum values.  Information
-from the signed and unsigned bounds can be combined; for instance if a value is
-first tested < 8 and then tested s> 4, the verifier will conclude that the value
-is also > 4 and s< 8, since the bounds prevent crossing the sign boundary.
-
-PTR_TO_PACKETs with a variable offset part have an 'id', which is common to all
-pointers sharing that same variable offset.  This is important for packet range
-checks: after adding a variable to a packet pointer register A, if you then copy
-it to another register B and then add a constant 4 to A, both registers will
-share the same 'id' but the A will have a fixed offset of +4.  Then if A is
-bounds-checked and found to be less than a PTR_TO_PACKET_END, the register B is
-now known to have a safe range of at least 4 bytes.  See 'Direct packet access',
-below, for more on PTR_TO_PACKET ranges.
-
-The 'id' field is also used on PTR_TO_MAP_VALUE_OR_NULL, common to all copies of
-the pointer returned from a map lookup.  This means that when one copy is
-checked and found to be non-NULL, all copies can become PTR_TO_MAP_VALUEs.
-As well as range-checking, the tracked information is also used for enforcing
-alignment of pointer accesses.  For instance, on most systems the packet pointer
-is 2 bytes after a 4-byte alignment.  If a program adds 14 bytes to that to jump
-over the Ethernet header, then reads IHL and adds (IHL * 4), the resulting
-pointer will have a variable offset known to be 4n+2 for some n, so adding the 2
-bytes (NET_IP_ALIGN) gives a 4-byte alignment and so word-sized accesses through
-that pointer are safe.
-The 'id' field is also used on PTR_TO_SOCKET and PTR_TO_SOCKET_OR_NULL, common
-to all copies of the pointer returned from a socket lookup. This has similar
-behaviour to the handling for PTR_TO_MAP_VALUE_OR_NULL->PTR_TO_MAP_VALUE, but
-it also handles reference tracking for the pointer. PTR_TO_SOCKET implicitly
-represents a reference to the corresponding 'struct sock'. To ensure that the
-reference is not leaked, it is imperative to NULL-check the reference and, in
-the non-NULL case, pass the valid reference to the socket release function.
-
-Direct packet access
---------------------
-In cls_bpf and act_bpf programs the verifier allows direct access to the packet
-data via skb->data and skb->data_end pointers.
-Ex:
-1:  r4 = *(u32 *)(r1 +80)  /* load skb->data_end */
-2:  r3 = *(u32 *)(r1 +76)  /* load skb->data */
-3:  r5 = r3
-4:  r5 += 14
-5:  if r5 > r4 goto pc+16
-R1=ctx R3=pkt(id=0,off=0,r=14) R4=pkt_end R5=pkt(id=0,off=14,r=14) R10=fp
-6:  r0 = *(u16 *)(r3 +12) /* access 12 and 13 bytes of the packet */
-
-This 2-byte load from the packet is safe to do, since the program author
-did check 'if (skb->data + 14 > skb->data_end) goto err' at insn #5, which
-means that in the fall-through case the register R3 (which points to skb->data)
-has at least 14 directly accessible bytes. The verifier marks it
-as R3=pkt(id=0,off=0,r=14).
-id=0 means that no additional variables were added to the register.
-off=0 means that no additional constants were added.
-r=14 is the range of safe access which means that bytes [R3, R3 + 14) are ok.
-Note that R5 is marked as R5=pkt(id=0,off=14,r=14). It also points
-to the packet data, but constant 14 was added to the register, so
-it now points to 'skb->data + 14' and accessible range is [R5, R5 + 14 - 14)
-which is zero bytes.
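-
-The range bookkeeping described above can be sketched like this. It is a
-simplified model with hypothetical names: only the id, the fixed offset and
-the range are tracked, and variable offsets are ignored:
-

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the pkt(id=..,off=..,r=..) notation of the verifier log */
struct sk_pkt {
	int id;		/* shared by pointers with the same variable offset */
	int off;	/* constant added to skb->data */
	int range;	/* bytes from skb->data proven to be in bounds */
};

/*
 * Fall-through of 'if (checked > pkt_end) goto err': every packet pointer
 * sharing checked's id is now known to cover checked->off bytes.
 */
static inline void sk_learn_range(struct sk_pkt *regs, int n,
				  const struct sk_pkt *checked)
{
	int id = checked->id;
	int r = checked->off;

	for (int i = 0; i < n; i++)
		if (regs[i].id == id && regs[i].range < r)
			regs[i].range = r;
}

/* An access of 'size' bytes at reg + insn_off must stay inside the range */
static inline bool sk_pkt_access_ok(const struct sk_pkt *reg, int insn_off,
				    int size)
{
	int start = reg->off + insn_off;

	return start >= 0 && start + size <= reg->range;
}
```

-With R3=pkt(id=0,off=0) and R5=pkt(id=0,off=14), checking R5 against pkt_end
-gives both registers r=14: a 2-byte load at R3 + 12 is allowed, while any load
-through R5 is not.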
-
-More complex packet access may look like:
- R0=inv1 R1=ctx R3=pkt(id=0,off=0,r=14) R4=pkt_end R5=pkt(id=0,off=14,r=14) R10=fp
- 6:  r0 = *(u8 *)(r3 +7) /* load 7th byte from the packet */
- 7:  r4 = *(u8 *)(r3 +12)
- 8:  r4 *= 14
- 9:  r3 = *(u32 *)(r1 +76) /* load skb->data */
-10:  r3 += r4
-11:  r2 = r1
-12:  r2 <<= 48
-13:  r2 >>= 48
-14:  r3 += r2
-15:  r2 = r3
-16:  r2 += 8
-17:  r1 = *(u32 *)(r1 +80) /* load skb->data_end */
-18:  if r2 > r1 goto pc+2
- R0=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R1=pkt_end R2=pkt(id=2,off=8,r=8) R3=pkt(id=2,off=0,r=8) R4=inv(id=0,umax_value=3570,var_off=(0x0; 0xfffe)) R5=pkt(id=0,off=14,r=14) R10=fp
-19:  r1 = *(u8 *)(r3 +4)
-The state of the register R3 is R3=pkt(id=2,off=0,r=8)
-id=2 means that two 'r3 += rX' instructions were seen, so r3 points to some
-offset within a packet and since the program author did
-'if (r3 + 8 > r1) goto err' at insn #18, the safe range is [R3, R3 + 8).
-The verifier only allows 'add'/'sub' operations on packet registers. Any other
-operation will set the register state to 'SCALAR_VALUE' and it won't be
-available for direct packet access.
-Operation 'r3 += rX' may overflow and become less than original skb->data,
-therefore the verifier has to prevent that.  So when it sees 'r3 += rX'
-instruction and rX is more than 16-bit value, any subsequent bounds-check of r3
-against skb->data_end will not give us 'range' information, so attempts to read
-through the pointer will give "invalid access to packet" error.
-Ex. after insn 'r4 = *(u8 *)(r3 +12)' (insn #7 above) the state of r4 is
-R4=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) which means that upper 56 bits
-of the register are guaranteed to be zero, and nothing is known about the lower
-8 bits. After insn 'r4 *= 14' the state becomes
-R4=inv(id=0,umax_value=3570,var_off=(0x0; 0xfffe)), since multiplying an 8-bit
-value by constant 14 will keep upper 52 bits as zero, also the least significant
-bit will be zero as 14 is even.  Similarly 'r2 >>= 48' will make
-R2=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff)), since the shift is not sign
-extending.  This logic is implemented in adjust_reg_min_max_vals() function,
-which calls adjust_ptr_min_max_vals() for adding pointer to scalar (or vice
-versa) and adjust_scalar_min_max_vals() for operations on two scalars.
-
-The end result is that bpf program author can access packet directly
-using normal C code as:
-  void *data = (void *)(long)skb->data;
-  void *data_end = (void *)(long)skb->data_end;
-  struct eth_hdr *eth = data;
-  struct iphdr *iph = data + sizeof(*eth);
-  struct udphdr *udp = data + sizeof(*eth) + sizeof(*iph);
-
-  if (data + sizeof(*eth) + sizeof(*iph) + sizeof(*udp) > data_end)
-          return 0;
-  if (eth->h_proto != htons(ETH_P_IP))
-          return 0;
-  if (iph->protocol != IPPROTO_UDP || iph->ihl != 5)
-          return 0;
-  if (udp->dest == 53 || udp->source == 9)
-          ...;
-which makes such programs easier to write compared to LD_ABS insn
-and significantly faster.
-
-eBPF maps
----------
-'maps' is a generic storage of different types for sharing data between kernel
-and userspace.
-
-The maps are accessed from user space via BPF syscall, which has commands:
-- create a map with given type and attributes
-  map_fd = bpf(BPF_MAP_CREATE, union bpf_attr *attr, u32 size)
-  using attr->map_type, attr->key_size, attr->value_size, attr->max_entries
-  returns process-local file descriptor or negative error
-
-- lookup key in a given map
-  err = bpf(BPF_MAP_LOOKUP_ELEM, union bpf_attr *attr, u32 size)
-  using attr->map_fd, attr->key, attr->value
-  returns zero and stores found elem into value or negative error
-
-- create or update key/value pair in a given map
-  err = bpf(BPF_MAP_UPDATE_ELEM, union bpf_attr *attr, u32 size)
-  using attr->map_fd, attr->key, attr->value
-  returns zero or negative error
-
-- find and delete element by key in a given map
-  err = bpf(BPF_MAP_DELETE_ELEM, union bpf_attr *attr, u32 size)
-  using attr->map_fd, attr->key
-
-- to delete map: close(fd)
-  Exiting process will delete maps automatically
-
-userspace programs use this syscall to create/access maps that eBPF programs
-are concurrently updating.
-
-maps can have different types: hash, array, bloom filter, radix-tree, etc.
-
-The map is defined by:
-  . type
-  . max number of elements
-  . key size in bytes
-  . value size in bytes
-
-Pruning
--------
-The verifier does not actually walk all possible paths through the program.  For
-each new branch to analyse, the verifier looks at all the states it's previously
-been in when at this instruction.  If any of them contain the current state as a
-subset, the branch is 'pruned' - that is, the fact that the previous state was
-accepted implies the current state would be as well.  For instance, if in the
-previous state, r1 held a packet-pointer, and in the current state, r1 holds a
-packet-pointer with a range as long or longer and at least as strict an
-alignment, then r1 is safe.  Similarly, if r2 was NOT_INIT before then it can't
-have been used by any path from that point, so any value in r2 (including
-another NOT_INIT) is safe.  The implementation is in the function regsafe().
-Pruning considers not only the registers but also the stack (and any spilled
-registers it may hold).  They must all be safe for the branch to be pruned.
-This is implemented in states_equal().
-
-Understanding eBPF verifier messages
-------------------------------------
-
-The following are a few examples of invalid eBPF programs and verifier error
-messages as seen in the log:
-
-Program with unreachable instructions:
-static struct bpf_insn prog[] = {
-  BPF_EXIT_INSN(),
-  BPF_EXIT_INSN(),
-};
-Error:
-  unreachable insn 1
-
-Program that reads uninitialized register:
-  BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
-  BPF_EXIT_INSN(),
-Error:
-  0: (bf) r0 = r2
-  R2 !read_ok
-
-Program that doesn't initialize R0 before exiting:
-  BPF_MOV64_REG(BPF_REG_2, BPF_REG_1),
-  BPF_EXIT_INSN(),
-Error:
-  0: (bf) r2 = r1
-  1: (95) exit
-  R0 !read_ok
-
-Program that accesses stack out of bounds:
-  BPF_ST_MEM(BPF_DW, BPF_REG_10, 8, 0),
-  BPF_EXIT_INSN(),
-Error:
-  0: (7a) *(u64 *)(r10 +8) = 0
-  invalid stack off=8 size=8
-
-Program that doesn't initialize stack before passing its address into function:
-  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
-  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
-  BPF_LD_MAP_FD(BPF_REG_1, 0),
-  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
-  BPF_EXIT_INSN(),
-Error:
-  0: (bf) r2 = r10
-  1: (07) r2 += -8
-  2: (b7) r1 = 0x0
-  3: (85) call 1
-  invalid indirect read from stack off -8+0 size 8
-
-Program that uses invalid map_fd=0 while calling map_lookup_elem() function:
-  BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
-  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
-  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
-  BPF_LD_MAP_FD(BPF_REG_1, 0),
-  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
-  BPF_EXIT_INSN(),
-Error:
-  0: (7a) *(u64 *)(r10 -8) = 0
-  1: (bf) r2 = r10
-  2: (07) r2 += -8
-  3: (b7) r1 = 0x0
-  4: (85) call 1
-  fd 0 is not pointing to valid bpf_map
-
-Program that doesn't check return value of map_lookup_elem() before accessing
-map element:
-  BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
-  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
-  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
-  BPF_LD_MAP_FD(BPF_REG_1, 0),
-  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
-  BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0),
-  BPF_EXIT_INSN(),
-Error:
-  0: (7a) *(u64 *)(r10 -8) = 0
-  1: (bf) r2 = r10
-  2: (07) r2 += -8
-  3: (b7) r1 = 0x0
-  4: (85) call 1
-  5: (7a) *(u64 *)(r0 +0) = 0
-  R0 invalid mem access 'map_value_or_null'
-
-Program that correctly checks map_lookup_elem() returned value for NULL, but
-accesses the memory with incorrect alignment:
-  BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
-  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
-  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
-  BPF_LD_MAP_FD(BPF_REG_1, 0),
-  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
-  BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1),
-  BPF_ST_MEM(BPF_DW, BPF_REG_0, 4, 0),
-  BPF_EXIT_INSN(),
-Error:
-  0: (7a) *(u64 *)(r10 -8) = 0
-  1: (bf) r2 = r10
-  2: (07) r2 += -8
-  3: (b7) r1 = 1
-  4: (85) call 1
-  5: (15) if r0 == 0x0 goto pc+1
-   R0=map_ptr R10=fp
-  6: (7a) *(u64 *)(r0 +4) = 0
-  misaligned access off 4 size 8
-
-Program that correctly checks map_lookup_elem() returned value for NULL and
-accesses memory with correct alignment in one side of 'if' branch, but fails
-to do so in the other side of 'if' branch:
-  BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
-  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
-  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
-  BPF_LD_MAP_FD(BPF_REG_1, 0),
-  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
-  BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
-  BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0),
-  BPF_EXIT_INSN(),
-  BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 1),
-  BPF_EXIT_INSN(),
-Error:
-  0: (7a) *(u64 *)(r10 -8) = 0
-  1: (bf) r2 = r10
-  2: (07) r2 += -8
-  3: (b7) r1 = 1
-  4: (85) call 1
-  5: (15) if r0 == 0x0 goto pc+2
-   R0=map_ptr R10=fp
-  6: (7a) *(u64 *)(r0 +0) = 0
-  7: (95) exit
-
-  from 5 to 8: R0=imm0 R10=fp
-  8: (7a) *(u64 *)(r0 +0) = 1
-  R0 invalid mem access 'imm'
-
-Program that performs a socket lookup then sets the pointer to NULL without
-checking it:
-value:
-  BPF_MOV64_IMM(BPF_REG_2, 0),
-  BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_2, -8),
-  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
-  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
-  BPF_MOV64_IMM(BPF_REG_3, 4),
-  BPF_MOV64_IMM(BPF_REG_4, 0),
-  BPF_MOV64_IMM(BPF_REG_5, 0),
-  BPF_EMIT_CALL(BPF_FUNC_sk_lookup_tcp),
-  BPF_MOV64_IMM(BPF_REG_0, 0),
-  BPF_EXIT_INSN(),
-Error:
-  0: (b7) r2 = 0
-  1: (63) *(u32 *)(r10 -8) = r2
-  2: (bf) r2 = r10
-  3: (07) r2 += -8
-  4: (b7) r3 = 4
-  5: (b7) r4 = 0
-  6: (b7) r5 = 0
-  7: (85) call bpf_sk_lookup_tcp#65
-  8: (b7) r0 = 0
-  9: (95) exit
-  Unreleased reference id=1, alloc_insn=7
-
-Program that performs a socket lookup but does not NULL-check the returned
-value:
-  BPF_MOV64_IMM(BPF_REG_2, 0),
-  BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_2, -8),
-  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
-  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
-  BPF_MOV64_IMM(BPF_REG_3, 4),
-  BPF_MOV64_IMM(BPF_REG_4, 0),
-  BPF_MOV64_IMM(BPF_REG_5, 0),
-  BPF_EMIT_CALL(BPF_FUNC_sk_lookup_tcp),
-  BPF_EXIT_INSN(),
-Error:
-  0: (b7) r2 = 0
-  1: (63) *(u32 *)(r10 -8) = r2
-  2: (bf) r2 = r10
-  3: (07) r2 += -8
-  4: (b7) r3 = 4
-  5: (b7) r4 = 0
-  6: (b7) r5 = 0
-  7: (85) call bpf_sk_lookup_tcp#65
-  8: (95) exit
-  Unreleased reference id=1, alloc_insn=7
-
-Testing
--------
-
-Next to the BPF toolchain, the kernel also ships a test module that contains
-various test cases for classic and internal BPF that can be executed against
-the BPF interpreter and JIT compiler. It can be found in lib/test_bpf.c and
-enabled via Kconfig:
-
-  CONFIG_TEST_BPF=m
-
-After the module has been built and installed, the test suite can be executed
-via insmod or modprobe against 'test_bpf' module. Results of the test cases
-including timings in nsec can be found in the kernel log (dmesg).
-
-Misc
-----
-
-Also trinity, the Linux syscall fuzzer, has built-in support for BPF and
-SECCOMP-BPF kernel fuzzing.
-
-Written by
-----------
-
-The document was written in the hope that it is found useful and in order
-to give potential BPF hackers or security auditors a better overview of
-the underlying architecture.
-
-Jay Schulist <jschlst@samba.org>
-Daniel Borkmann <daniel@iogearbox.net>
-Alexei Starovoitov <ast@kernel.org>
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 807abe25ae4b..144ed838c1a9 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -56,6 +56,7 @@ Contents:
    driver
    eql
    fib_trie
+   filter
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/packet_mmap.txt b/Documentation/networking/packet_mmap.txt
index 999eb41da81d..494614573c67 100644
--- a/Documentation/networking/packet_mmap.txt
+++ b/Documentation/networking/packet_mmap.txt
@@ -1051,7 +1051,7 @@ for more information on hardware timestamps.
 -------------------------------------------------------------------------------
 
 - Packet sockets work well together with Linux socket filters, thus you also
-  might want to have a look at Documentation/networking/filter.txt
+  might want to have a look at Documentation/networking/filter.rst
 
 --------------------------------------------------------------------------------
 + THANKS
diff --git a/MAINTAINERS b/MAINTAINERS
index 7323bfc1720f..4ec6d2741d36 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3192,7 +3192,7 @@ Q:	https://patchwork.ozlabs.org/project/netdev/list/?delegate=77147
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
 F:	Documentation/bpf/
-F:	Documentation/networking/filter.txt
+F:	Documentation/networking/filter.rst
 F:	arch/*/net/*
 F:	include/linux/bpf*
 F:	include/linux/filter.h
diff --git a/tools/bpf/bpf_asm.c b/tools/bpf/bpf_asm.c
index e5f95e3eede3..0063c3c029e7 100644
--- a/tools/bpf/bpf_asm.c
+++ b/tools/bpf/bpf_asm.c
@@ -11,7 +11,7 @@
  *
  * How to get into it:
  *
- * 1) read Documentation/networking/filter.txt
+ * 1) read Documentation/networking/filter.rst
  * 2) Run `bpf_asm [-c] <filter-prog file>` to translate into binary
  *    blob that is loadable with xt_bpf, cls_bpf et al. Note: -c will
  *    pretty print a C-like construct.
diff --git a/tools/bpf/bpf_dbg.c b/tools/bpf/bpf_dbg.c
index 9d3766e653a9..a0ebcdf59c31 100644
--- a/tools/bpf/bpf_dbg.c
+++ b/tools/bpf/bpf_dbg.c
@@ -13,7 +13,7 @@
  * for making a verdict when multiple simple BPF programs are combined
  * into one in order to prevent parsing same headers multiple times.
  *
- * More on how to debug BPF opcodes see Documentation/networking/filter.txt
+ * More on how to debug BPF opcodes see Documentation/networking/filter.rst
  * which is the main document on BPF. Mini howto for getting started:
  *
  *  1) `./bpf_dbg` to enter the shell (shell cmds denoted with '>'):
-- 
cgit v1.2.3


From 3c3a2fde4d88bb3d6c0592b4b7754f26dab9f697 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Tue, 28 Apr 2020 00:01:43 +0200
Subject: docs: networking: convert hinic.txt to ReST

Not much to be done here:

- add SPDX header;
- adjust titles and chapters, adding proper markups;
- adjust indentation, whitespaces and blank lines;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/hinic.rst | 128 +++++++++++++++++++++++++++++++++++++
 Documentation/networking/hinic.txt | 125 ------------------------------------
 Documentation/networking/index.rst |   1 +
 MAINTAINERS                        |   2 +-
 4 files changed, 130 insertions(+), 126 deletions(-)
 create mode 100644 Documentation/networking/hinic.rst
 delete mode 100644 Documentation/networking/hinic.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/hinic.rst b/Documentation/networking/hinic.rst
new file mode 100644
index 000000000000..867ac8f4e04a
--- /dev/null
+++ b/Documentation/networking/hinic.rst
@@ -0,0 +1,128 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================================================
+Linux Kernel Driver for Huawei Intelligent NIC(HiNIC) family
+============================================================
+
+Overview:
+=========
+HiNIC is a network interface card for the Data Center Area.
+
+The driver supports a range of link-speed devices (10GbE, 25GbE, 40GbE, etc.).
+The driver also supports a negotiated and extendable feature set.
+
+Some HiNIC devices support SR-IOV. This driver is used for Physical Function
+(PF).
+
+HiNIC devices support MSI-X interrupt vector for each Tx/Rx queue and
+adaptive interrupt moderation.
+
+HiNIC devices also support various offload features such as checksum offload,
+TCP Transmit Segmentation Offload(TSO), Receive-Side Scaling(RSS) and
+LRO(Large Receive Offload).
+
+
+Supported PCI vendor ID/device IDs:
+===================================
+
+19e5:1822 - HiNIC PF
+
+
+Driver Architecture and Source Code:
+====================================
+
+hinic_dev - Implement a Logical Network device that is independent from
+specific HW details about HW data structure formats.
+
+hinic_hwdev - Implement the HW details of the device and include the components
+for accessing the PCI NIC.
+
+hinic_hwdev contains the following components:
+===============================================
+
+HW Interface:
+=============
+
+The interface for accessing the PCI device (DMA memory and PCI BARs).
+(hinic_hw_if.c, hinic_hw_if.h)
+
+Configuration Status Registers Area that describes the HW Registers on the
+configuration and status BAR0. (hinic_hw_csr.h)
+
+MGMT components:
+================
+
+Asynchronous Event Queues(AEQs) - The event queues for receiving messages from
+the MGMT modules on the cards. (hinic_hw_eqs.c, hinic_hw_eqs.h)
+
+Application Programmable Interface commands(API CMD) - Interface for sending
+MGMT commands to the card. (hinic_hw_api_cmd.c, hinic_hw_api_cmd.h)
+
+Management (MGMT) - the PF to MGMT channel that uses API CMD for sending MGMT
+commands to the card and receives notifications from the MGMT modules on the
+card by AEQs. Also set the addresses of the IO CMDQs in HW.
+(hinic_hw_mgmt.c, hinic_hw_mgmt.h)
+
+IO components:
+==============
+
+Completion Event Queues(CEQs) - The completion Event Queues that describe IO
+tasks that are finished. (hinic_hw_eqs.c, hinic_hw_eqs.h)
+
+Work Queues(WQ) - Contain the memory and operations for use by CMD queues and
+the Queue Pairs. The WQ is a Memory Block in a Page. The Block contains
+pointers to Memory Areas that are the Memory for the Work Queue Elements(WQEs).
+(hinic_hw_wq.c, hinic_hw_wq.h)
+
+Command Queues(CMDQ) - The queues for sending commands for IO management; they
+are also used to set the QP addresses in HW. The command completion events are
+accumulated on the CEQ that is configured to receive the CMDQ completion events.
+(hinic_hw_cmdq.c, hinic_hw_cmdq.h)
+
+Queue Pairs(QPs) - The HW Receive and Send queues for Receiving and Transmitting
+Data. (hinic_hw_qp.c, hinic_hw_qp.h, hinic_hw_qp_ctxt.h)
+
+IO - de/constructs all the IO components. (hinic_hw_io.c, hinic_hw_io.h)
+
+HW device:
+==========
+
+HW device - de/constructs the HW Interface, the MGMT components on the
+initialization of the driver and the IO components in the case of Interface
+UP/DOWN Events. (hinic_hw_dev.c, hinic_hw_dev.h)
+
+
+hinic_dev contains the following components:
+===============================================
+
+PCI ID table - Contains the supported PCI Vendor/Device IDs.
+(hinic_pci_tbl.h)
+
+Port Commands - Send commands to the HW device for port management
+(MAC, Vlan, MTU, ...). (hinic_port.c, hinic_port.h)
+
+Tx Queues - Logical Tx Queues that use the HW Send Queues for transmit.
+The Logical Tx queue is not dependent on the format of the HW Send Queue.
+(hinic_tx.c, hinic_tx.h)
+
+Rx Queues - Logical Rx Queues that use the HW Receive Queues for receive.
+The Logical Rx queue is not dependent on the format of the HW Receive Queue.
+(hinic_rx.c, hinic_rx.h)
+
+hinic_dev - de/constructs the Logical Tx and Rx Queues.
+(hinic_main.c, hinic_dev.h)
+
+
+Miscellaneous
+=============
+
+Common functions that are used by HW and Logical Device.
+(hinic_common.c, hinic_common.h)
+
+
+Support
+=======
+
+If an issue is identified with the released source code on the supported kernel
+with a supported adapter, email the specific information related to the issue to
+aviad.krawczyk@huawei.com.
diff --git a/Documentation/networking/hinic.txt b/Documentation/networking/hinic.txt
deleted file mode 100644
index 989366a4039c..000000000000
--- a/Documentation/networking/hinic.txt
+++ /dev/null
@@ -1,125 +0,0 @@
-Linux Kernel Driver for Huawei Intelligent NIC(HiNIC) family
-============================================================
-
-Overview:
-=========
-HiNIC is a network interface card for the Data Center Area.
-
-The driver supports a range of link-speed devices (10GbE, 25GbE, 40GbE, etc.).
-The driver supports also a negotiated and extendable feature set.
-
-Some HiNIC devices support SR-IOV. This driver is used for Physical Function
-(PF).
-
-HiNIC devices support MSI-X interrupt vector for each Tx/Rx queue and
-adaptive interrupt moderation.
-
-HiNIC devices support also various offload features such as checksum offload,
-TCP Transmit Segmentation Offload(TSO), Receive-Side Scaling(RSS) and
-LRO(Large Receive Offload).
-
-
-Supported PCI vendor ID/device IDs:
-===================================
-
-19e5:1822 - HiNIC PF
-
-
-Driver Architecture and Source Code:
-====================================
-
-hinic_dev - Implement a Logical Network device that is independent from
-specific HW details about HW data structure formats.
-
-hinic_hwdev - Implement the HW details of the device and include the components
-for accessing the PCI NIC.
-
-hinic_hwdev contains the following components:
-===============================================
-
-HW Interface:
-=============
-
-The interface for accessing the pci device (DMA memory and PCI BARs).
-(hinic_hw_if.c, hinic_hw_if.h)
-
-Configuration Status Registers Area that describes the HW Registers on the
-configuration and status BAR0. (hinic_hw_csr.h)
-
-MGMT components:
-================
-
-Asynchronous Event Queues(AEQs) - The event queues for receiving messages from
-the MGMT modules on the cards. (hinic_hw_eqs.c, hinic_hw_eqs.h)
-
-Application Programmable Interface commands(API CMD) - Interface for sending
-MGMT commands to the card. (hinic_hw_api_cmd.c, hinic_hw_api_cmd.h)
-
-Management (MGMT) - the PF to MGMT channel that uses API CMD for sending MGMT
-commands to the card and receives notifications from the MGMT modules on the
-card by AEQs. Also set the addresses of the IO CMDQs in HW.
-(hinic_hw_mgmt.c, hinic_hw_mgmt.h)
-
-IO components:
-==============
-
-Completion Event Queues(CEQs) - The completion Event Queues that describe IO
-tasks that are finished. (hinic_hw_eqs.c, hinic_hw_eqs.h)
-
-Work Queues(WQ) - Contain the memory and operations for use by CMD queues and
-the Queue Pairs. The WQ is a Memory Block in a Page. The Block contains
-pointers to Memory Areas that are the Memory for the Work Queue Elements(WQEs).
-(hinic_hw_wq.c, hinic_hw_wq.h)
-
-Command Queues(CMDQ) - The queues for sending commands for IO management and is
-used to set the QPs addresses in HW. The commands completion events are
-accumulated on the CEQ that is configured to receive the CMDQ completion events.
-(hinic_hw_cmdq.c, hinic_hw_cmdq.h)
-
-Queue Pairs(QPs) - The HW Receive and Send queues for Receiving and Transmitting
-Data. (hinic_hw_qp.c, hinic_hw_qp.h, hinic_hw_qp_ctxt.h)
-
-IO - de/constructs all the IO components. (hinic_hw_io.c, hinic_hw_io.h)
-
-HW device:
-==========
-
-HW device - de/constructs the HW Interface, the MGMT components on the
-initialization of the driver and the IO components on the case of Interface
-UP/DOWN Events. (hinic_hw_dev.c, hinic_hw_dev.h)
-
-
-hinic_dev contains the following components:
-===============================================
-
-PCI ID table - Contains the supported PCI Vendor/Device IDs.
-(hinic_pci_tbl.h)
-
-Port Commands - Send commands to the HW device for port management
-(MAC, Vlan, MTU, ...). (hinic_port.c, hinic_port.h)
-
-Tx Queues - Logical Tx Queues that use the HW Send Queues for transmit.
-The Logical Tx queue is not dependent on the format of the HW Send Queue.
-(hinic_tx.c, hinic_tx.h)
-
-Rx Queues - Logical Rx Queues that use the HW Receive Queues for receive.
-The Logical Rx queue is not dependent on the format of the HW Receive Queue.
-(hinic_rx.c, hinic_rx.h)
-
-hinic_dev - de/constructs the Logical Tx and Rx Queues.
-(hinic_main.c, hinic_dev.h)
-
-
-Miscellaneous:
-=============
-
-Common functions that are used by HW and Logical Device.
-(hinic_common.c, hinic_common.h)
-
-
-Support
-=======
-
-If an issue is identified with the released source code on the supported kernel
-with a supported adapter, email the specific information related to the issue to
-aviad.krawczyk@huawei.com.
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index b29a08d1f941..5a7889df1375 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -63,6 +63,7 @@ Contents:
    generic_netlink
    gen_stats
    gtp
+   hinic
 
 .. only::  subproject and html
 
diff --git a/MAINTAINERS b/MAINTAINERS
index 4ec6d2741d36..df5e4ccc1ccb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7815,7 +7815,7 @@ HUAWEI ETHERNET DRIVER
 M:	Aviad Krawczyk <aviad.krawczyk@huawei.com>
 L:	netdev@vger.kernel.org
 S:	Supported
-F:	Documentation/networking/hinic.txt
+F:	Documentation/networking/hinic.rst
 F:	drivers/net/ethernet/huawei/hinic/
 
 HUGETLB FILESYSTEM
-- 
cgit v1.2.3


From 82a07bf33d7d0c3a194f62178e0fea2d68227b89 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Tue, 28 Apr 2020 00:01:52 +0200
Subject: docs: networking: convert ipvs-sysctl.txt to ReST

- add SPDX header;
- add a document title;
- mark lists as such;
- mark code blocks and literals as such;
- adjust indentation, whitespaces and blank lines;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/admin-guide/sysctl/net.rst |   4 +-
 Documentation/networking/index.rst       |   1 +
 Documentation/networking/ipvs-sysctl.rst | 302 +++++++++++++++++++++++++++++++
 Documentation/networking/ipvs-sysctl.txt | 294 ------------------------------
 MAINTAINERS                              |   2 +-
 5 files changed, 306 insertions(+), 297 deletions(-)
 create mode 100644 Documentation/networking/ipvs-sysctl.rst
 delete mode 100644 Documentation/networking/ipvs-sysctl.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index 84e3348a9543..2ad1b77a7182 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -353,8 +353,8 @@ socket's buffer. It will not take effect unless PF_UNIX flag is specified.
 
 3. /proc/sys/net/ipv4 - IPV4 settings
 -------------------------------------
-Please see: Documentation/networking/ip-sysctl.rst and ipvs-sysctl.txt for
-descriptions of these entries.
+Please see: Documentation/networking/ip-sysctl.rst and
+Documentation/networking/ipvs-sysctl.rst for descriptions of these entries.
 
 
 4. Appletalk
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 54dee1575b54..bbd4e0041457 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -72,6 +72,7 @@ Contents:
    ip-sysctl
    ipv6
    ipvlan
+   ipvs-sysctl
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/ipvs-sysctl.rst b/Documentation/networking/ipvs-sysctl.rst
new file mode 100644
index 000000000000..be36c4600e8f
--- /dev/null
+++ b/Documentation/networking/ipvs-sysctl.rst
@@ -0,0 +1,302 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========
+IPvs-sysctl
+===========
+
+/proc/sys/net/ipv4/vs/* Variables:
+==================================
+
+am_droprate - INTEGER
+	default 10
+
+	It sets the always mode drop rate, which is used in mode 3
+	of the drop_packet defense.
+
+amemthresh - INTEGER
+	default 1024
+
+	It sets the available memory threshold (in pages), which is
+	used in the automatic modes of defense. When there is not
+	enough available memory, the respective strategy will be
+	enabled and the variable is automatically set to 2, otherwise
+	the strategy is disabled and the variable is set to 1.
+
+backup_only - BOOLEAN
+	- 0 - disabled (default)
+	- not 0 - enabled
+
+	If set, disable the director function while the server is
+	in backup mode to avoid packet loops for DR/TUN methods.
+
+conn_reuse_mode - INTEGER
+	1 - default
+
+	Controls how ipvs will deal with connections for which port
+	reuse is detected. It is a bitmap, with the values being:
+
+	0: disable any special handling on port reuse. The new
+	connection will be delivered to the same real server that was
+	servicing the previous connection. This will effectively
+	disable expire_nodest_conn.
+
+	bit 1: enable rescheduling of new connections when it is safe.
+	That is, whenever expire_nodest_conn and for TCP sockets, when
+	the connection is in TIME_WAIT state (which is only possible if
+	you use NAT mode).
+
+	bit 2: it is bit 1 plus, for TCP connections, when connections
+	are in FIN_WAIT state, as this is the last state seen by load
+	balancer in Direct Routing mode. This bit helps on adding new
+	real servers to a very busy cluster.
+
+conntrack - BOOLEAN
+	- 0 - disabled (default)
+	- not 0 - enabled
+
+	If set, maintain connection tracking entries for
+	connections handled by IPVS.
+
+	This should be enabled if connections handled by IPVS are to be
+	also handled by stateful firewall rules. That is, iptables rules
+	that make use of connection tracking.  It is a performance
+	optimisation to disable this setting otherwise.
+
+	Connections handled by the IPVS FTP application module
+	will have connection tracking entries regardless of this setting.
+
+	Only available when IPVS is compiled with CONFIG_IP_VS_NFCT enabled.
+
+cache_bypass - BOOLEAN
+	- 0 - disabled (default)
+	- not 0 - enabled
+
+	If it is enabled, forward packets to the original destination
+	directly when no cache server is available and destination
+	address is not local (iph->daddr is RTN_UNICAST). It is mostly
+	used in transparent web cache clusters.
+
+debug_level - INTEGER
+	- 0          - transmission error messages (default)
+	- 1          - non-fatal error messages
+	- 2          - configuration
+	- 3          - destination trash
+	- 4          - drop entry
+	- 5          - service lookup
+	- 6          - scheduling
+	- 7          - connection new/expire, lookup and synchronization
+	- 8          - state transition
+	- 9          - binding destination, template checks and applications
+	- 10         - IPVS packet transmission
+	- 11         - IPVS packet handling (ip_vs_in/ip_vs_out)
+	- 12 or more - packet traversal
+
+	Only available when IPVS is compiled with CONFIG_IP_VS_DEBUG enabled.
+
+	Higher debugging levels include the messages for lower debugging
+	levels, so setting debug level 2 includes level 0, 1 and 2
+	messages. Thus, logging becomes more and more verbose the higher
+	the level.
+
+drop_entry - INTEGER
+	- 0  - disabled (default)
+
+	The drop_entry defense is to randomly drop entries in the
+	connection hash table, just in order to collect back some
+	memory for new connections. In the current code, the
+	drop_entry procedure can be activated every second, then it
+	randomly scans 1/32 of the whole and drops entries that are in
+	the SYN-RECV/SYNACK state, which should be effective against
+	syn-flooding attack.
+
+	The valid values of drop_entry are from 0 to 3, where 0 means
+	that this strategy is always disabled, 1 and 2 mean automatic
+	modes (when there is not enough available memory, the strategy
+	is enabled and the variable is automatically set to 2,
+	otherwise the strategy is disabled and the variable is set to
+	1), and 3 means that the strategy is always enabled.
+
+drop_packet - INTEGER
+	- 0  - disabled (default)
+
+	The drop_packet defense is designed to drop 1/rate packets
+	before forwarding them to real servers. If the rate is 1, then
+	drop all the incoming packets.
+
+	The value definition is the same as that of drop_entry. In
+	the automatic mode, the rate is determined by the following
+	formula: rate = amemthresh / (amemthresh - available_memory)
+	when available memory is less than the available memory
+	threshold. When mode 3 is set, the always mode drop rate
+	is controlled by /proc/sys/net/ipv4/vs/am_droprate.
+
+expire_nodest_conn - BOOLEAN
+	- 0 - disabled (default)
+	- not 0 - enabled
+
+	With the default value of 0, the load balancer will silently
+	drop packets when their destination server is not available.
+	This may be useful when a user-space monitoring program deletes
+	the destination server (because of server overload or wrong
+	detection) and adds the server back later, so that the
+	connections to the server can continue.
+
+	If this feature is enabled, the load balancer will expire the
+	connection immediately when a packet arrives and its
+	destination server is not available, then the client program
+	will be notified that the connection is closed. This is
+	equivalent to the feature some people require to flush
+	connections when their destination is not available.
+
+expire_quiescent_template - BOOLEAN
+	- 0 - disabled (default)
+	- not 0 - enabled
+
+	When set to a non-zero value, the load balancer will expire
+	persistent templates when the destination server is quiescent.
+	This may be useful, when a user makes a destination server
+	quiescent by setting its weight to 0 and it is desired that
+	subsequent otherwise persistent connections are sent to a
+	different destination server.  By default new persistent
+	connections are allowed to quiescent destination servers.
+
+	If this feature is enabled, the load balancer will expire the
+	persistence template if it is to be used to schedule a new
+	connection and the destination server is quiescent.
+
+ignore_tunneled - BOOLEAN
+	- 0 - disabled (default)
+	- not 0 - enabled
+
+	If set, ipvs will set the ipvs_property on all packets which are of
+	unrecognized protocols.  This prevents us from routing tunneled
+	protocols like ipip, which is useful to prevent rescheduling
+	packets that have been tunneled to the ipvs host (i.e. to prevent
+	ipvs routing loops when ipvs is also acting as a real server).
+
+nat_icmp_send - BOOLEAN
+	- 0 - disabled (default)
+	- not 0 - enabled
+
+	It controls sending icmp error messages (ICMP_DEST_UNREACH)
+	for VS/NAT when the load balancer receives packets from real
+	servers but the connection entries don't exist.
+
+pmtu_disc - BOOLEAN
+	- 0 - disabled
+	- not 0 - enabled (default)
+
+	By default, reject with FRAG_NEEDED all DF packets that exceed
+	the PMTU, irrespective of the forwarding method. For TUN method
+	the flag can be disabled to fragment such packets.
+
+secure_tcp - INTEGER
+	- 0  - disabled (default)
+
+	The secure_tcp defense is to use a more complicated TCP state
+	transition table. For VS/NAT, it also delays entering the
+	TCP ESTABLISHED state until the three way handshake is completed.
+
+	The value definition is the same as that of drop_entry and
+	drop_packet.
+
+sync_threshold - vector of 2 INTEGERs: sync_threshold, sync_period
+	default 3 50
+
+	It sets the synchronization threshold, which is the minimum number
+	of incoming packets that a connection needs to receive before
+	the connection will be synchronized. A connection will be
+	synchronized every time the number of its incoming packets
+	modulo sync_period equals the threshold. The range of the
+	threshold is from 0 to sync_period.
+
+	When sync_period and sync_refresh_period are 0, send sync only
+	for state changes or only once when pkts matches sync_threshold.
+
+sync_refresh_period - UNSIGNED INTEGER
+	default 0
+
+	In seconds, difference in reported connection timer that triggers
+	new sync message. It can be used to avoid sync messages for the
+	specified period (or half of the connection timeout if it is lower)
+	if connection state is not changed since last sync.
+
+	This is useful for normal connections with high traffic to reduce
+	sync rate. Additionally, retry sync_retries times with period of
+	sync_refresh_period/8.
+
+sync_retries - INTEGER
+	default 0
+
+	Defines sync retries with period of sync_refresh_period/8. Useful
+	to protect against loss of sync messages. The range of the
+	sync_retries is from 0 to 3.
+
+sync_qlen_max - UNSIGNED LONG
+
+	Hard limit for queued sync messages that are not sent yet. It
+	defaults to 1/32 of the memory pages but actually represents
+	number of messages. It will protect us from allocating large
+	parts of memory when the sending rate is lower than the queuing
+	rate.
+
+sync_sock_size - INTEGER
+	default 0
+
+	Configuration of SNDBUF (master) or RCVBUF (slave) socket limit.
+	Default value is 0 (preserve system defaults).
+
+sync_ports - INTEGER
+	default 1
+
+	The number of threads that master and backup servers can use for
+	sync traffic. Every thread will use a single UDP port: thread 0 will
+	use the default port 8848 while the last thread will use port
+	8848+sync_ports-1.
+
+snat_reroute - BOOLEAN
+	- 0 - disabled
+	- not 0 - enabled (default)
+
+	If enabled, recalculate the route of SNATed packets from
+	realservers so that they are routed as if they originate from the
+	director. Otherwise they are routed as if they are forwarded by the
+	director.
+
+	If policy routing is in effect then it is possible that the route
+	of a packet originating from a director is routed differently to a
+	packet being forwarded by the director.
+
+	If policy routing is not in effect then the recalculated route will
+	always be the same as the original route so it is an optimisation
+	to disable snat_reroute and avoid the recalculation.
+
+sync_persist_mode - INTEGER
+	default 0
+
+	Controls the synchronisation of connections when using persistence.
+
+	0: All types of connections are synchronised
+
+	1: Attempt to reduce the synchronisation traffic depending on
+	the connection type. For persistent services avoid synchronisation
+	for normal connections, do it only for persistence templates.
+	In such case, for TCP and SCTP it may need enabling sloppy_tcp and
+	sloppy_sctp flags on backup servers. For non-persistent services
+	such optimization is not applied, mode 0 is assumed.
+
+sync_version - INTEGER
+	default 1
+
+	The version of the synchronisation protocol used when sending
+	synchronisation messages.
+
+	0 selects the original synchronisation protocol (version 0). This
+	should be used when sending synchronisation messages to a legacy
+	system that only understands the original synchronisation protocol.
+
+	1 selects the current synchronisation protocol (version 1). This
+	should be used where possible.
+
+	Kernels with this sync_version entry are able to receive messages
+	of both version 0 and version 1 of the synchronisation protocol.
diff --git a/Documentation/networking/ipvs-sysctl.txt b/Documentation/networking/ipvs-sysctl.txt
deleted file mode 100644
index 056898685d40..000000000000
--- a/Documentation/networking/ipvs-sysctl.txt
+++ /dev/null
@@ -1,294 +0,0 @@
-/proc/sys/net/ipv4/vs/* Variables:
-
-am_droprate - INTEGER
-        default 10
-
-        It sets the always mode drop rate, which is used in the mode 3
-        of the drop_rate defense.
-
-amemthresh - INTEGER
-        default 1024
-
-        It sets the available memory threshold (in pages), which is
-        used in the automatic modes of defense. When there is no
-        enough available memory, the respective strategy will be
-        enabled and the variable is automatically set to 2, otherwise
-        the strategy is disabled and the variable is  set  to 1.
-
-backup_only - BOOLEAN
-	0 - disabled (default)
-	not 0 - enabled
-
-	If set, disable the director function while the server is
-	in backup mode to avoid packet loops for DR/TUN methods.
-
-conn_reuse_mode - INTEGER
-	1 - default
-
-	Controls how ipvs will deal with connections that are detected
-	port reuse. It is a bitmap, with the values being:
-
-	0: disable any special handling on port reuse. The new
-	connection will be delivered to the same real server that was
-	servicing the previous connection. This will effectively
-	disable expire_nodest_conn.
-
-	bit 1: enable rescheduling of new connections when it is safe.
-	That is, whenever expire_nodest_conn and for TCP sockets, when
-	the connection is in TIME_WAIT state (which is only possible if
-	you use NAT mode).
-
-	bit 2: it is bit 1 plus, for TCP connections, when connections
-	are in FIN_WAIT state, as this is the last state seen by load
-	balancer in Direct Routing mode. This bit helps on adding new
-	real servers to a very busy cluster.
-
-conntrack - BOOLEAN
-	0 - disabled (default)
-	not 0 - enabled
-
-	If set, maintain connection tracking entries for
-	connections handled by IPVS.
-
-	This should be enabled if connections handled by IPVS are to be
-	also handled by stateful firewall rules. That is, iptables rules
-	that make use of connection tracking.  It is a performance
-	optimisation to disable this setting otherwise.
-
-	Connections handled by the IPVS FTP application module
-	will have connection tracking entries regardless of this setting.
-
-	Only available when IPVS is compiled with CONFIG_IP_VS_NFCT enabled.
-
-cache_bypass - BOOLEAN
-        0 - disabled (default)
-        not 0 - enabled
-
-        If it is enabled, forward packets to the original destination
-        directly when no cache server is available and destination
-        address is not local (iph->daddr is RTN_UNICAST). It is mostly
-        used in transparent web cache cluster.
-
-debug_level - INTEGER
-	0          - transmission error messages (default)
-	1          - non-fatal error messages
-	2          - configuration
-	3          - destination trash
-	4          - drop entry
-	5          - service lookup
-	6          - scheduling
-	7          - connection new/expire, lookup and synchronization
-	8          - state transition
-	9          - binding destination, template checks and applications
-	10         - IPVS packet transmission
-	11         - IPVS packet handling (ip_vs_in/ip_vs_out)
-	12 or more - packet traversal
-
-	Only available when IPVS is compiled with CONFIG_IP_VS_DEBUG enabled.
-
-	Higher debugging levels include the messages for lower debugging
-	levels, so setting debug level 2, includes level 0, 1 and 2
-	messages. Thus, logging becomes more and more verbose the higher
-	the level.
-
-drop_entry - INTEGER
-        0  - disabled (default)
-
-        The drop_entry defense is to randomly drop entries in the
-        connection hash table, just in order to collect back some
-        memory for new connections. In the current code, the
-        drop_entry procedure can be activated every second, then it
-        randomly scans 1/32 of the whole and drops entries that are in
-        the SYN-RECV/SYNACK state, which should be effective against
-        syn-flooding attack.
-
-        The valid values of drop_entry are from 0 to 3, where 0 means
-        that this strategy is always disabled, 1 and 2 mean automatic
-        modes (when there is no enough available memory, the strategy
-        is enabled and the variable is automatically set to 2,
-        otherwise the strategy is disabled and the variable is set to
-        1), and 3 means that that the strategy is always enabled.
-
-drop_packet - INTEGER
-        0  - disabled (default)
-
-        The drop_packet defense is designed to drop 1/rate packets
-        before forwarding them to real servers. If the rate is 1, then
-        drop all the incoming packets.
-
-        The value definition is the same as that of the drop_entry. In
-        the automatic mode, the rate is determined by the follow
-        formula: rate = amemthresh / (amemthresh - available_memory)
-        when available memory is less than the available memory
-        threshold. When the mode 3 is set, the always mode drop rate
-        is controlled by the /proc/sys/net/ipv4/vs/am_droprate.
-
-expire_nodest_conn - BOOLEAN
-        0 - disabled (default)
-        not 0 - enabled
-
-        The default value is 0, the load balancer will silently drop
-        packets when its destination server is not available. It may
-        be useful, when user-space monitoring program deletes the
-        destination server (because of server overload or wrong
-        detection) and add back the server later, and the connections
-        to the server can continue.
-
-        If this feature is enabled, the load balancer will expire the
-        connection immediately when a packet arrives and its
-        destination server is not available, then the client program
-        will be notified that the connection is closed. This is
-        equivalent to the feature some people requires to flush
-        connections when its destination is not available.
-
-expire_quiescent_template - BOOLEAN
-	0 - disabled (default)
-	not 0 - enabled
-
-	When set to a non-zero value, the load balancer will expire
-	persistent templates when the destination server is quiescent.
-	This may be useful, when a user makes a destination server
-	quiescent by setting its weight to 0 and it is desired that
-	subsequent otherwise persistent connections are sent to a
-	different destination server.  By default new persistent
-	connections are allowed to quiescent destination servers.
-
-	If this feature is enabled, the load balancer will expire the
-	persistence template if it is to be used to schedule a new
-	connection and the destination server is quiescent.
-
-ignore_tunneled - BOOLEAN
-	0 - disabled (default)
-	not 0 - enabled
-
-	If set, ipvs will set the ipvs_property on all packets which are of
-	unrecognized protocols.  This prevents us from routing tunneled
-	protocols like ipip, which is useful to prevent rescheduling
-	packets that have been tunneled to the ipvs host (i.e. to prevent
-	ipvs routing loops when ipvs is also acting as a real server).
-
-nat_icmp_send - BOOLEAN
-        0 - disabled (default)
-        not 0 - enabled
-
-        It controls sending icmp error messages (ICMP_DEST_UNREACH)
-        for VS/NAT when the load balancer receives packets from real
-        servers but the connection entries don't exist.
-
-pmtu_disc - BOOLEAN
-	0 - disabled
-	not 0 - enabled (default)
-
-	By default, reject with FRAG_NEEDED all DF packets that exceed
-	the PMTU, irrespective of the forwarding method. For TUN method
-	the flag can be disabled to fragment such packets.
-
-secure_tcp - INTEGER
-        0  - disabled (default)
-
-	The secure_tcp defense is to use a more complicated TCP state
-	transition table. For VS/NAT, it also delays entering the
-	TCP ESTABLISHED state until the three way handshake is completed.
-
-        The value definition is the same as that of drop_entry and
-        drop_packet.
-
-sync_threshold - vector of 2 INTEGERs: sync_threshold, sync_period
-	default 3 50
-
-	It sets synchronization threshold, which is the minimum number
-	of incoming packets that a connection needs to receive before
-	the connection will be synchronized. A connection will be
-	synchronized, every time the number of its incoming packets
-	modulus sync_period equals the threshold. The range of the
-	threshold is from 0 to sync_period.
-
-	When sync_period and sync_refresh_period are 0, send sync only
-	for state changes or only once when pkts matches sync_threshold
-
-sync_refresh_period - UNSIGNED INTEGER
-	default 0
-
-	In seconds, difference in reported connection timer that triggers
-	new sync message. It can be used to avoid sync messages for the
-	specified period (or half of the connection timeout if it is lower)
-	if connection state is not changed since last sync.
-
-	This is useful for normal connections with high traffic to reduce
-	sync rate. Additionally, retry sync_retries times with period of
-	sync_refresh_period/8.
-
-sync_retries - INTEGER
-	default 0
-
-	Defines sync retries with period of sync_refresh_period/8. Useful
-	to protect against loss of sync messages. The range of the
-	sync_retries is from 0 to 3.
-
-sync_qlen_max - UNSIGNED LONG
-
-	Hard limit for queued sync messages that are not sent yet. It
-	defaults to 1/32 of the memory pages but actually represents
-	number of messages. It will protect us from allocating large
-	parts of memory when the sending rate is lower than the queuing
-	rate.
-
-sync_sock_size - INTEGER
-	default 0
-
-	Configuration of SNDBUF (master) or RCVBUF (slave) socket limit.
-	Default value is 0 (preserve system defaults).
-
-sync_ports - INTEGER
-	default 1
-
-	The number of threads that master and backup servers can use for
-	sync traffic. Every thread will use single UDP port, thread 0 will
-	use the default port 8848 while last thread will use port
-	8848+sync_ports-1.
-
-snat_reroute - BOOLEAN
-	0 - disabled
-	not 0 - enabled (default)
-
-	If enabled, recalculate the route of SNATed packets from
-	realservers so that they are routed as if they originate from the
-	director. Otherwise they are routed as if they are forwarded by the
-	director.
-
-	If policy routing is in effect then it is possible that the route
-	of a packet originating from a director is routed differently to a
-	packet being forwarded by the director.
-
-	If policy routing is not in effect then the recalculated route will
-	always be the same as the original route so it is an optimisation
-	to disable snat_reroute and avoid the recalculation.
-
-sync_persist_mode - INTEGER
-	default 0
-
-	Controls the synchronisation of connections when using persistence
-
-	0: All types of connections are synchronised
-	1: Attempt to reduce the synchronisation traffic depending on
-	the connection type. For persistent services avoid synchronisation
-	for normal connections, do it only for persistence templates.
-	In such case, for TCP and SCTP it may need enabling sloppy_tcp and
-	sloppy_sctp flags on backup servers. For non-persistent services
-	such optimization is not applied, mode 0 is assumed.
-
-sync_version - INTEGER
-	default 1
-
-	The version of the synchronisation protocol used when sending
-	synchronisation messages.
-
-	0 selects the original synchronisation protocol (version 0). This
-	should be used when sending synchronisation messages to a legacy
-	system that only understands the original synchronisation protocol.
-
-	1 selects the current synchronisation protocol (version 1). This
-	should be used where possible.
-
-	Kernels with this sync_version entry are able to receive messages
-	of both version 1 and version 2 of the synchronisation protocol.
diff --git a/MAINTAINERS b/MAINTAINERS
index df5e4ccc1ccb..3a5f52a3c055 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8934,7 +8934,7 @@ L:	lvs-devel@vger.kernel.org
 S:	Maintained
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next.git
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs.git
-F:	Documentation/networking/ipvs-sysctl.txt
+F:	Documentation/networking/ipvs-sysctl.rst
 F:	include/net/ip_vs.h
 F:	include/uapi/linux/ip_vs.h
 F:	net/netfilter/ipvs/
-- 
cgit v1.2.3


From 40e79150c1686263e6a031d7702aec63aff31332 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Thu, 30 Apr 2020 18:03:57 +0200
Subject: docs: networking: convert lapb-module.txt to ReST

- add SPDX header;
- adjust title markup;
- mark code blocks and literals as such;
- mark tables as such;
- adjust indentation, whitespaces and blank lines;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/index.rst       |   1 +
 Documentation/networking/lapb-module.rst | 305 +++++++++++++++++++++++++++++++
 Documentation/networking/lapb-module.txt | 263 --------------------------
 MAINTAINERS                              |   2 +-
 net/lapb/Kconfig                         |   2 +-
 5 files changed, 308 insertions(+), 265 deletions(-)
 create mode 100644 Documentation/networking/lapb-module.rst
 delete mode 100644 Documentation/networking/lapb-module.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 0c5d7a037983..acd2567cf0d4 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -75,6 +75,7 @@ Contents:
    ipvs-sysctl
    kcm
    l2tp
+   lapb-module
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/lapb-module.rst b/Documentation/networking/lapb-module.rst
new file mode 100644
index 000000000000..ff586bc9f005
--- /dev/null
+++ b/Documentation/networking/lapb-module.rst
@@ -0,0 +1,305 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================
+The Linux LAPB Module Interface
+===============================
+
+Version 1.3
+
+Jonathan Naylor 29.12.96
+
+Changed (Henner Eisen, 2000-10-29): int return value for data_indication()
+
+The LAPB module will be a separately compiled module for use by any parts of
+the Linux operating system that require a LAPB service. This document
+defines the interfaces to, and the services provided by this module. The
+term module in this context does not imply that the LAPB module is a
+separately loadable module, although it may be. The term module is used in
+its more standard meaning.
+
+The interface to the LAPB module consists of functions to the module,
+callbacks from the module to indicate important state changes, and
+structures for getting and setting information about the module.
+
+Structures
+----------
+
+Probably the most important structure is the skbuff structure for holding
+received and transmitted data, however it is beyond the scope of this
+document.
+
+The two LAPB specific structures are the LAPB initialisation structure and
+the LAPB parameter structure. These will be defined in a standard header
+file, <linux/lapb.h>. The header file <net/lapb.h> is internal to the LAPB
+module and is not for use.
+
+LAPB Initialisation Structure
+-----------------------------
+
+This structure is used only once, in the call to lapb_register (see below).
+It contains information about the device driver that requires the services
+of the LAPB module::
+
+	struct lapb_register_struct {
+		void (*connect_confirmation)(void *token, int reason);
+		void (*connect_indication)(void *token, int reason);
+		void (*disconnect_confirmation)(void *token, int reason);
+		void (*disconnect_indication)(void *token, int reason);
+		int  (*data_indication)(void *token, struct sk_buff *skb);
+		void (*data_transmit)(void *token, struct sk_buff *skb);
+	};
+
+Each member of this structure corresponds to a function in the device driver
+that is called when a particular event in the LAPB module occurs. These will
+be described in detail below. If a callback is not required (!!) then a NULL
+may be substituted.
+
+
+LAPB Parameter Structure
+------------------------
+
+This structure is used with the lapb_getparms and lapb_setparms functions
+(see below). They are used to allow the device driver to get and set the
+operational parameters of the LAPB implementation for a given connection::
+
+	struct lapb_parms_struct {
+		unsigned int t1;
+		unsigned int t1timer;
+		unsigned int t2;
+		unsigned int t2timer;
+		unsigned int n2;
+		unsigned int n2count;
+		unsigned int window;
+		unsigned int state;
+		unsigned int mode;
+	};
+
+T1 and T2 are protocol timing parameters and are given in units of 100ms. N2
+is the maximum number of tries on the link before it is declared a failure.
+The window size is the maximum number of outstanding data packets allowed to
+be unacknowledged by the remote end, the value of the window is between 1
+and 7 for a standard LAPB link, and between 1 and 127 for an extended LAPB
+link.
+
+The mode variable is a bit field used for setting (at present) three values.
+The bit fields have the following meanings:
+
+======  =================================================
+Bit	Meaning
+======  =================================================
+0	LAPB operation (0=LAPB_STANDARD 1=LAPB_EXTENDED).
+1	[SM]LP operation (0=LAPB_SLP 1=LAPB_MLP).
+2	DTE/DCE operation (0=LAPB_DTE 1=LAPB_DCE).
+3-31	Reserved, must be 0.
+======  =================================================
+
+Extended LAPB operation indicates the use of extended sequence numbers and
+consequently larger window sizes, the default is standard LAPB operation.
+MLP operation is the same as SLP operation except that the addresses used by
+LAPB are different to indicate the mode of operation, the default is Single
+Link Procedure. The difference between DCE and DTE operation is (i) the
+addresses used for commands and responses, and (ii) when the DCE is not
+connected, it sends DM without polls set, every T1. The upper case constant
+names will be defined in the public LAPB header file.
+
+
+Functions
+---------
+
+The LAPB module provides a number of function entry points.
+
+::
+
+    int lapb_register(void *token, struct lapb_register_struct);
+
+This must be called before the LAPB module may be used. If the call is
+successful then LAPB_OK is returned. The token must be a unique identifier
+generated by the device driver to allow for the unique identification of the
+instance of the LAPB link. It is returned by the LAPB module in all of the
+callbacks, and is used by the device driver in all calls to the LAPB module.
+For multiple LAPB links in a single device driver, multiple calls to
+lapb_register must be made. The format of the lapb_register_struct is given
+above. The return values are:
+
+=============		=============================
+LAPB_OK			LAPB registered successfully.
+LAPB_BADTOKEN		Token is already registered.
+LAPB_NOMEM		Out of memory
+=============		=============================
+
+::
+
+    int lapb_unregister(void *token);
+
+This releases all the resources associated with a LAPB link. Any current
+LAPB link will be abandoned without further messages being passed. After
+this call, the value of token is no longer valid for any calls to the LAPB
+function. The valid return values are:
+
+=============		===============================
+LAPB_OK			LAPB unregistered successfully.
+LAPB_BADTOKEN		Invalid/unknown LAPB token.
+=============		===============================
+
+::
+
+    int lapb_getparms(void *token, struct lapb_parms_struct *parms);
+
+This allows the device driver to get the values of the current LAPB
+variables, the lapb_parms_struct is described above. The valid return values
+are:
+
+=============		=============================
+LAPB_OK			LAPB getparms was successful.
+LAPB_BADTOKEN		Invalid/unknown LAPB token.
+=============		=============================
+
+::
+
+    int lapb_setparms(void *token, struct lapb_parms_struct *parms);
+
+This allows the device driver to set the values of the current LAPB
+variables, the lapb_parms_struct is described above. The values of t1timer,
+t2timer and n2count are ignored, likewise changing the mode bits when
+connected will be ignored. An error implies that none of the values have
+been changed. The valid return values are:
+
+=============		=================================================
+LAPB_OK			LAPB setparms was successful.
+LAPB_BADTOKEN		Invalid/unknown LAPB token.
+LAPB_INVALUE		One of the values was out of its allowable range.
+=============		=================================================
+
+::
+
+    int lapb_connect_request(void *token);
+
+Initiate a connect using the current parameter settings. The valid return
+values are:
+
+==============		=================================
+LAPB_OK			LAPB is starting to connect.
+LAPB_BADTOKEN		Invalid/unknown LAPB token.
+LAPB_CONNECTED		LAPB module is already connected.
+==============		=================================
+
+::
+
+    int lapb_disconnect_request(void *token);
+
+Initiate a disconnect. The valid return values are:
+
+=================	===============================
+LAPB_OK			LAPB is starting to disconnect.
+LAPB_BADTOKEN		Invalid/unknown LAPB token.
+LAPB_NOTCONNECTED	LAPB module is not connected.
+=================	===============================
+
+::
+
+    int lapb_data_request(void *token, struct sk_buff *skb);
+
+Queue data with the LAPB module for transmitting over the link. If the call
+is successful then the skbuff is owned by the LAPB module and may not be
+used by the device driver again. The valid return values are:
+
+=================	=============================
+LAPB_OK			LAPB has accepted the data.
+LAPB_BADTOKEN		Invalid/unknown LAPB token.
+LAPB_NOTCONNECTED	LAPB module is not connected.
+=================	=============================
+
+::
+
+    int lapb_data_received(void *token, struct sk_buff *skb);
+
+Queue data with the LAPB module which has been received from the device. It
+is expected that the data passed to the LAPB module has skb->data pointing
+to the beginning of the LAPB data. If the call is successful then the skbuff
+is owned by the LAPB module and may not be used by the device driver again.
+The valid return values are:
+
+=============		===========================
+LAPB_OK			LAPB has accepted the data.
+LAPB_BADTOKEN		Invalid/unknown LAPB token.
+=============		===========================
+
+Callbacks
+---------
+
+These callbacks are functions provided by the device driver for the LAPB
+module to call when an event occurs. They are registered with the LAPB
+module with lapb_register (see above) in the structure lapb_register_struct
+(see above).
+
+::
+
+    void (*connect_confirmation)(void *token, int reason);
+
+This is called by the LAPB module when a connection is established after
+being requested by a call to lapb_connect_request (see above). The reason is
+always LAPB_OK.
+
+::
+
+    void (*connect_indication)(void *token, int reason);
+
+This is called by the LAPB module when the link is established by the remote
+system. The value of reason is always LAPB_OK.
+
+::
+
+    void (*disconnect_confirmation)(void *token, int reason);
+
+This is called by the LAPB module when an event occurs after the device
+driver has called lapb_disconnect_request (see above). The reason indicates
+what has happened. In all cases the LAPB link can be regarded as being
+terminated. The values for reason are:
+
+=================	====================================================
+LAPB_OK			The LAPB link was terminated normally.
+LAPB_NOTCONNECTED	The remote system was not connected.
+LAPB_TIMEDOUT		No response was received in N2 tries from the remote
+			system.
+=================	====================================================
+
+::
+
+    void (*disconnect_indication)(void *token, int reason);
+
+This is called by the LAPB module when the link is terminated by the remote
+system or another event has occurred to terminate the link. This may be
+returned in response to a lapb_connect_request (see above) if the remote
+system refused the request. The values for reason are:
+
+=================	====================================================
+LAPB_OK			The LAPB link was terminated normally by the remote
+			system.
+LAPB_REFUSED		The remote system refused the connect request.
+LAPB_NOTCONNECTED	The remote system was not connected.
+LAPB_TIMEDOUT		No response was received in N2 tries from the remote
+			system.
+=================	====================================================
+
+::
+
+    int (*data_indication)(void *token, struct sk_buff *skb);
+
+This is called by the LAPB module when data has been received from the
+remote system that should be passed onto the next layer in the protocol
+stack. The skbuff becomes the property of the device driver and the LAPB
+module will not perform any more actions on it. The skb->data pointer will
+be pointing to the first byte of data after the LAPB header.
+
+This method should return NET_RX_DROP (as defined in the header
+file include/linux/netdevice.h) if and only if the frame was dropped
+before it could be delivered to the upper layer.
+
+::
+
+    void (*data_transmit)(void *token, struct sk_buff *skb);
+
+This is called by the LAPB module when data is to be transmitted to the
+remote system by the device driver. The skbuff becomes the property of the
+device driver and the LAPB module will not perform any more actions on it.
+The skb->data pointer will be pointing to the first byte of the LAPB header.
diff --git a/Documentation/networking/lapb-module.txt b/Documentation/networking/lapb-module.txt
deleted file mode 100644
index d4fc8f221559..000000000000
--- a/Documentation/networking/lapb-module.txt
+++ /dev/null
@@ -1,263 +0,0 @@
-		The Linux LAPB Module Interface 1.3
-
-		      Jonathan Naylor 29.12.96
-
-Changed (Henner Eisen, 2000-10-29): int return value for data_indication() 
-
-The LAPB module will be a separately compiled module for use by any parts of
-the Linux operating system that require a LAPB service. This document
-defines the interfaces to, and the services provided by this module. The
-term module in this context does not imply that the LAPB module is a
-separately loadable module, although it may be. The term module is used in
-its more standard meaning.
-
-The interface to the LAPB module consists of functions to the module,
-callbacks from the module to indicate important state changes, and
-structures for getting and setting information about the module.
-
-Structures
-----------
-
-Probably the most important structure is the skbuff structure for holding
-received and transmitted data, however it is beyond the scope of this
-document.
-
-The two LAPB specific structures are the LAPB initialisation structure and
-the LAPB parameter structure. These will be defined in a standard header
-file, <linux/lapb.h>. The header file <net/lapb.h> is internal to the LAPB
-module and is not for use.
-
-LAPB Initialisation Structure
------------------------------
-
-This structure is used only once, in the call to lapb_register (see below).
-It contains information about the device driver that requires the services
-of the LAPB module.
-
-struct lapb_register_struct {
-	void (*connect_confirmation)(int token, int reason);
-	void (*connect_indication)(int token, int reason);
-	void (*disconnect_confirmation)(int token, int reason);
-	void (*disconnect_indication)(int token, int reason);
-	int  (*data_indication)(int token, struct sk_buff *skb);
-	void (*data_transmit)(int token, struct sk_buff *skb);
-};
-
-Each member of this structure corresponds to a function in the device driver
-that is called when a particular event in the LAPB module occurs. These will
-be described in detail below. If a callback is not required (!!) then a NULL
-may be substituted.
-
-
-LAPB Parameter Structure
-------------------------
-
-This structure is used with the lapb_getparms and lapb_setparms functions
-(see below). They are used to allow the device driver to get and set the
-operational parameters of the LAPB implementation for a given connection.
-
-struct lapb_parms_struct {
-	unsigned int t1;
-	unsigned int t1timer;
-	unsigned int t2;
-	unsigned int t2timer;
-	unsigned int n2;
-	unsigned int n2count;
-	unsigned int window;
-	unsigned int state;
-	unsigned int mode;
-};
-
-T1 and T2 are protocol timing parameters and are given in units of 100ms. N2
-is the maximum number of tries on the link before it is declared a failure.
-The window size is the maximum number of outstanding data packets allowed to
-be unacknowledged by the remote end, the value of the window is between 1
-and 7 for a standard LAPB link, and between 1 and 127 for an extended LAPB
-link.
-
-The mode variable is a bit field used for setting (at present) three values.
-The bit fields have the following meanings:
-
-Bit	Meaning
-0	LAPB operation (0=LAPB_STANDARD 1=LAPB_EXTENDED).
-1	[SM]LP operation (0=LAPB_SLP 1=LAPB=MLP).
-2	DTE/DCE operation (0=LAPB_DTE 1=LAPB_DCE)
-3-31	Reserved, must be 0.
-
-Extended LAPB operation indicates the use of extended sequence numbers and
-consequently larger window sizes, the default is standard LAPB operation.
-MLP operation is the same as SLP operation except that the addresses used by
-LAPB are different to indicate the mode of operation, the default is Single
-Link Procedure. The difference between DCE and DTE operation is (i) the
-addresses used for commands and responses, and (ii) when the DCE is not
-connected, it sends DM without polls set, every T1. The upper case constant
-names will be defined in the public LAPB header file.
-
-
-Functions
----------
-
-The LAPB module provides a number of function entry points.
-
-
-int lapb_register(void *token, struct lapb_register_struct);
-
-This must be called before the LAPB module may be used. If the call is
-successful then LAPB_OK is returned. The token must be a unique identifier
-generated by the device driver to allow for the unique identification of the
-instance of the LAPB link. It is returned by the LAPB module in all of the
-callbacks, and is used by the device driver in all calls to the LAPB module.
-For multiple LAPB links in a single device driver, multiple calls to
-lapb_register must be made. The format of the lapb_register_struct is given
-above. The return values are:
-
-LAPB_OK			LAPB registered successfully.
-LAPB_BADTOKEN		Token is already registered.
-LAPB_NOMEM		Out of memory
-
-
-int lapb_unregister(void *token);
-
-This releases all the resources associated with a LAPB link. Any current
-LAPB link will be abandoned without further messages being passed. After
-this call, the value of token is no longer valid for any calls to the LAPB
-function. The valid return values are:
-
-LAPB_OK			LAPB unregistered successfully.
-LAPB_BADTOKEN		Invalid/unknown LAPB token.
-
-
-int lapb_getparms(void *token, struct lapb_parms_struct *parms);
-
-This allows the device driver to get the values of the current LAPB
-variables, the lapb_parms_struct is described above. The valid return values
-are:
-
-LAPB_OK			LAPB getparms was successful.
-LAPB_BADTOKEN		Invalid/unknown LAPB token.
-
-
-int lapb_setparms(void *token, struct lapb_parms_struct *parms);
-
-This allows the device driver to set the values of the current LAPB
-variables, the lapb_parms_struct is described above. The values of t1timer,
-t2timer and n2count are ignored, likewise changing the mode bits when
-connected will be ignored. An error implies that none of the values have
-been changed. The valid return values are:
-
-LAPB_OK			LAPB getparms was successful.
-LAPB_BADTOKEN		Invalid/unknown LAPB token.
-LAPB_INVALUE		One of the values was out of its allowable range.
-
-
-int lapb_connect_request(void *token);
-
-Initiate a connect using the current parameter settings. The valid return
-values are:
-
-LAPB_OK			LAPB is starting to connect.
-LAPB_BADTOKEN		Invalid/unknown LAPB token.
-LAPB_CONNECTED		LAPB module is already connected.
-
-
-int lapb_disconnect_request(void *token);
-
-Initiate a disconnect. The valid return values are:
-
-LAPB_OK			LAPB is starting to disconnect.
-LAPB_BADTOKEN		Invalid/unknown LAPB token.
-LAPB_NOTCONNECTED	LAPB module is not connected.
-
-
-int lapb_data_request(void *token, struct sk_buff *skb);
-
-Queue data with the LAPB module for transmitting over the link. If the call
-is successful then the skbuff is owned by the LAPB module and may not be
-used by the device driver again. The valid return values are:
-
-LAPB_OK			LAPB has accepted the data.
-LAPB_BADTOKEN		Invalid/unknown LAPB token.
-LAPB_NOTCONNECTED	LAPB module is not connected.
-
-
-int lapb_data_received(void *token, struct sk_buff *skb);
-
-Queue data with the LAPB module which has been received from the device. It
-is expected that the data passed to the LAPB module has skb->data pointing
-to the beginning of the LAPB data. If the call is successful then the skbuff
-is owned by the LAPB module and may not be used by the device driver again.
-The valid return values are:
-
-LAPB_OK			LAPB has accepted the data.
-LAPB_BADTOKEN		Invalid/unknown LAPB token.
-
-
-Callbacks
----------
-
-These callbacks are functions provided by the device driver for the LAPB
-module to call when an event occurs. They are registered with the LAPB
-module with lapb_register (see above) in the structure lapb_register_struct
-(see above).
-
-
-void (*connect_confirmation)(void *token, int reason);
-
-This is called by the LAPB module when a connection is established after
-being requested by a call to lapb_connect_request (see above). The reason is
-always LAPB_OK.
-
-
-void (*connect_indication)(void *token, int reason);
-
-This is called by the LAPB module when the link is established by the remote
-system. The value of reason is always LAPB_OK.
-
-
-void (*disconnect_confirmation)(void *token, int reason);
-
-This is called by the LAPB module when an event occurs after the device
-driver has called lapb_disconnect_request (see above). The reason indicates
-what has happened. In all cases the LAPB link can be regarded as being
-terminated. The values for reason are:
-
-LAPB_OK			The LAPB link was terminated normally.
-LAPB_NOTCONNECTED	The remote system was not connected.
-LAPB_TIMEDOUT		No response was received in N2 tries from the remote
-			system.
-
-
-void (*disconnect_indication)(void *token, int reason);
-
-This is called by the LAPB module when the link is terminated by the remote
-system or another event has occurred to terminate the link. This may be
-returned in response to a lapb_connect_request (see above) if the remote
-system refused the request. The values for reason are:
-
-LAPB_OK			The LAPB link was terminated normally by the remote
-			system.
-LAPB_REFUSED		The remote system refused the connect request.
-LAPB_NOTCONNECTED	The remote system was not connected.
-LAPB_TIMEDOUT		No response was received in N2 tries from the remote
-			system.
-
-
-int (*data_indication)(void *token, struct sk_buff *skb);
-
-This is called by the LAPB module when data has been received from the
-remote system that should be passed onto the next layer in the protocol
-stack. The skbuff becomes the property of the device driver and the LAPB
-module will not perform any more actions on it. The skb->data pointer will
-be pointing to the first byte of data after the LAPB header.
-
-This method should return NET_RX_DROP (as defined in the header
-file include/linux/netdevice.h) if and only if the frame was dropped
-before it could be delivered to the upper layer.
-
-
-void (*data_transmit)(void *token, struct sk_buff *skb);
-
-This is called by the LAPB module when data is to be transmitted to the
-remote system by the device driver. The skbuff becomes the property of the
-device driver and the LAPB module will not perform any more actions on it.
-The skb->data pointer will be pointing to the first byte of the LAPB header.
diff --git a/MAINTAINERS b/MAINTAINERS
index 3a5f52a3c055..956999d2d979 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9515,7 +9515,7 @@ F:	drivers/soc/lantiq
 LAPB module
 L:	linux-x25@vger.kernel.org
 S:	Orphan
-F:	Documentation/networking/lapb-module.txt
+F:	Documentation/networking/lapb-module.rst
 F:	include/*/lapb.h
 F:	net/lapb/
 
diff --git a/net/lapb/Kconfig b/net/lapb/Kconfig
index 6acfc999c952..5b50e8d64f26 100644
--- a/net/lapb/Kconfig
+++ b/net/lapb/Kconfig
@@ -15,7 +15,7 @@ config LAPB
 	  currently supports LAPB only over Ethernet connections. If you want
 	  to use LAPB connections over Ethernet, say Y here and to "LAPB over
 	  Ethernet driver" below. Read
-	  <file:Documentation/networking/lapb-module.txt> for technical
+	  <file:Documentation/networking/lapb-module.rst> for technical
 	  details.
 
 	  To compile this driver as a module, choose M here: the
-- 
cgit v1.2.3


From 429ff87bcac75b929d9ffec8d4d24be2616f8052 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Thu, 30 Apr 2020 18:03:59 +0200
Subject: docs: networking: convert mac80211-injection.txt to ReST

- add SPDX header;
- adjust title markup;
- mark code blocks and literals as such;
- mark tables as such;
- adjust indentation, whitespaces and blank lines;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/index.rst              |   1 +
 Documentation/networking/mac80211-injection.rst | 106 ++++++++++++++++++++++++
 Documentation/networking/mac80211-injection.txt |  97 ----------------------
 MAINTAINERS                                     |   2 +-
 net/mac80211/tx.c                               |   2 +-
 5 files changed, 109 insertions(+), 99 deletions(-)
 create mode 100644 Documentation/networking/mac80211-injection.rst
 delete mode 100644 Documentation/networking/mac80211-injection.txt

(limited to 'MAINTAINERS')
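
The frame layout described in mac80211-injection.rst (radiotap header, then
ieee80211 header, then payload) can be sketched in C roughly as below. The
byte arrays are copied from the examples in that document; the helper name
build_injection_frame() is hypothetical and is not part of mac80211 or
libpcap. The sketch only assembles the buffer and assumes the caller sends
it separately (for instance with send() or pcap_inject()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Radiotap header bytes, copied from the document's example. */
static const uint8_t radiotap_hdr[] = {
	0x00, 0x00,             /* version byte followed by one pad byte */
	0x0b, 0x00,             /* header length, little-endian: 11 bytes */
	0x04, 0x0c, 0x00, 0x00, /* present bitmap: bits 2, 10 and 11 set */
	0x6c,                   /* rate */
	0x0c,                   /* tx power */
	0x01,                   /* antenna */
};

/* ieee80211 header bytes, copied from the document's example. */
static const uint8_t ieee80211_hdr[] = {
	0x08, 0x01, 0x00, 0x00,
	0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
	0x13, 0x22, 0x33, 0x44, 0x55, 0x66,
	0x13, 0x22, 0x33, 0x44, 0x55, 0x66,
	0x10, 0x86,
};

/*
 * Hypothetical helper (not a kernel or libpcap API): concatenate the
 * radiotap header, the ieee80211 header and the payload into buf.
 * Returns the total frame length, or 0 if buf is too small.
 */
static size_t build_injection_frame(uint8_t *buf, size_t buflen,
				    const uint8_t *payload, size_t payload_len)
{
	size_t rt_len = (size_t)radiotap_hdr[2] | ((size_t)radiotap_hdr[3] << 8);
	size_t total = sizeof(radiotap_hdr) + sizeof(ieee80211_hdr) + payload_len;

	/* The length field must describe the radiotap header actually sent. */
	assert(rt_len == sizeof(radiotap_hdr));

	if (total > buflen)
		return 0;

	memcpy(buf, radiotap_hdr, sizeof(radiotap_hdr));
	memcpy(buf + sizeof(radiotap_hdr), ieee80211_hdr, sizeof(ieee80211_hdr));
	memcpy(buf + sizeof(radiotap_hdr) + sizeof(ieee80211_hdr),
	       payload, payload_len);
	return total;
}
```

The returned length is the value that would be passed as nLength to
pcap_inject() in the document's libpcap example.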

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index b3608b177a8b..81c1834bfb57 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -77,6 +77,7 @@ Contents:
    l2tp
    lapb-module
    ltpc
+   mac80211-injection
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/mac80211-injection.rst b/Documentation/networking/mac80211-injection.rst
new file mode 100644
index 000000000000..75d4edcae852
--- /dev/null
+++ b/Documentation/networking/mac80211-injection.rst
@@ -0,0 +1,106 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================================
+How to use packet injection with mac80211
+=========================================
+
+mac80211 now allows arbitrary packets to be injected down any Monitor Mode
+interface from userland.  The packet you inject needs to be composed in the
+following format::
+
+ [ radiotap header  ]
+ [ ieee80211 header ]
+ [ payload ]
+
+The radiotap format is discussed in
+./Documentation/networking/radiotap-headers.txt.
+
+Despite many radiotap parameters being currently defined, most only make sense
+to appear on received packets.  The following information is parsed from the
+radiotap headers and used to control injection:
+
+ * IEEE80211_RADIOTAP_FLAGS
+
+   =========================  ===========================================
+   IEEE80211_RADIOTAP_F_FCS   FCS will be removed and recalculated
+   IEEE80211_RADIOTAP_F_WEP   frame will be encrypted if key available
+   IEEE80211_RADIOTAP_F_FRAG  frame will be fragmented if longer than the
+			      current fragmentation threshold.
+   =========================  ===========================================
+
+ * IEEE80211_RADIOTAP_TX_FLAGS
+
+   =============================  ========================================
+   IEEE80211_RADIOTAP_F_TX_NOACK  frame should be sent without waiting for
+				  an ACK even if it is a unicast frame
+   =============================  ========================================
+
+ * IEEE80211_RADIOTAP_RATE
+
+   legacy rate for the transmission (only for devices without own rate control)
+
+ * IEEE80211_RADIOTAP_MCS
+
+   HT rate for the transmission (only for devices without own rate control).
+   Also some flags are parsed
+
+   ============================  ========================
+   IEEE80211_RADIOTAP_MCS_SGI    use short guard interval
+   IEEE80211_RADIOTAP_MCS_BW_40  send in HT40 mode
+   ============================  ========================
+
+ * IEEE80211_RADIOTAP_DATA_RETRIES
+
+   number of retries when either IEEE80211_RADIOTAP_RATE or
+   IEEE80211_RADIOTAP_MCS was used
+
+ * IEEE80211_RADIOTAP_VHT
+
+   VHT mcs and number of streams used in the transmission (only for devices
+   without own rate control). Also other fields are parsed
+
+   flags field
+	IEEE80211_RADIOTAP_VHT_FLAG_SGI: use short guard interval
+
+   bandwidth field
+	* 1: send using 40MHz channel width
+	* 4: send using 80MHz channel width
+	* 11: send using 160MHz channel width
+
+The injection code can also skip all other currently defined radiotap fields
+facilitating replay of captured radiotap headers directly.
+
+Here is an example valid radiotap header defining some parameters::
+
+	0x00, 0x00, // <-- radiotap version
+	0x0b, 0x00, // <-- radiotap header length
+	0x04, 0x0c, 0x00, 0x00, // <-- bitmap
+	0x6c, // <-- rate
+	0x0c, // <-- tx power
+	0x01 // <-- antenna
+
+The ieee80211 header follows immediately afterwards, looking for example like
+this::
+
+	0x08, 0x01, 0x00, 0x00,
+	0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
+	0x13, 0x22, 0x33, 0x44, 0x55, 0x66,
+	0x13, 0x22, 0x33, 0x44, 0x55, 0x66,
+	0x10, 0x86
+
+Then lastly there is the payload.
+
+After composing the packet contents, it is sent by send()-ing it to a logical
+mac80211 interface that is in Monitor mode.  Libpcap can also be used
+(which is easier than doing the work to bind the socket to the right
+interface), along the following lines::
+
+	ppcap = pcap_open_live(szInterfaceName, 800, 1, 20, szErrbuf);
+	...
+	r = pcap_inject(ppcap, u8aSendBuffer, nLength);
+
+You can also find a link to a complete inject application here:
+
+http://wireless.kernel.org/en/users/Documentation/packetspammer
+
+Andy Green <andy@warmcat.com>
diff --git a/Documentation/networking/mac80211-injection.txt b/Documentation/networking/mac80211-injection.txt
deleted file mode 100644
index d58d78df9ca2..000000000000
--- a/Documentation/networking/mac80211-injection.txt
+++ /dev/null
@@ -1,97 +0,0 @@
-How to use packet injection with mac80211
-=========================================
-
-mac80211 now allows arbitrary packets to be injected down any Monitor Mode
-interface from userland.  The packet you inject needs to be composed in the
-following format:
-
- [ radiotap header  ]
- [ ieee80211 header ]
- [ payload ]
-
-The radiotap format is discussed in
-./Documentation/networking/radiotap-headers.txt.
-
-Despite many radiotap parameters being currently defined, most only make sense
-to appear on received packets.  The following information is parsed from the
-radiotap headers and used to control injection:
-
- * IEEE80211_RADIOTAP_FLAGS
-
-   IEEE80211_RADIOTAP_F_FCS: FCS will be removed and recalculated
-   IEEE80211_RADIOTAP_F_WEP: frame will be encrypted if key available
-   IEEE80211_RADIOTAP_F_FRAG: frame will be fragmented if longer than the
-			      current fragmentation threshold.
-
- * IEEE80211_RADIOTAP_TX_FLAGS
-
-   IEEE80211_RADIOTAP_F_TX_NOACK: frame should be sent without waiting for
-				  an ACK even if it is a unicast frame
-
- * IEEE80211_RADIOTAP_RATE
-
-   legacy rate for the transmission (only for devices without own rate control)
-
- * IEEE80211_RADIOTAP_MCS
-
-   HT rate for the transmission (only for devices without own rate control).
-   Also some flags are parsed
-
-   IEEE80211_RADIOTAP_MCS_SGI: use short guard interval
-   IEEE80211_RADIOTAP_MCS_BW_40: send in HT40 mode
-
- * IEEE80211_RADIOTAP_DATA_RETRIES
-
-   number of retries when either IEEE80211_RADIOTAP_RATE or
-   IEEE80211_RADIOTAP_MCS was used
-
- * IEEE80211_RADIOTAP_VHT
-
-   VHT mcs and number of streams used in the transmission (only for devices
-   without own rate control). Also other fields are parsed
-
-   flags field
-   IEEE80211_RADIOTAP_VHT_FLAG_SGI: use short guard interval
-
-   bandwidth field
-   1: send using 40MHz channel width
-   4: send using 80MHz channel width
-   11: send using 160MHz channel width
-
-The injection code can also skip all other currently defined radiotap fields
-facilitating replay of captured radiotap headers directly.
-
-Here is an example valid radiotap header defining some parameters
-
-	0x00, 0x00, // <-- radiotap version
-	0x0b, 0x00, // <- radiotap header length
-	0x04, 0x0c, 0x00, 0x00, // <-- bitmap
-	0x6c, // <-- rate
-	0x0c, //<-- tx power
-	0x01 //<-- antenna
-
-The ieee80211 header follows immediately afterwards, looking for example like
-this:
-
-	0x08, 0x01, 0x00, 0x00,
-	0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
-	0x13, 0x22, 0x33, 0x44, 0x55, 0x66,
-	0x13, 0x22, 0x33, 0x44, 0x55, 0x66,
-	0x10, 0x86
-
-Then lastly there is the payload.
-
-After composing the packet contents, it is sent by send()-ing it to a logical
-mac80211 interface that is in Monitor mode.  Libpcap can also be used,
-(which is easier than doing the work to bind the socket to the right
-interface), along the following lines:
-
-	ppcap = pcap_open_live(szInterfaceName, 800, 1, 20, szErrbuf);
-...
-	r = pcap_inject(ppcap, u8aSendBuffer, nLength);
-
-You can also find a link to a complete inject application here:
-
-http://wireless.kernel.org/en/users/Documentation/packetspammer
-
-Andy Green <andy@warmcat.com>
diff --git a/MAINTAINERS b/MAINTAINERS
index 956999d2d979..33bfc9e4aead 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10079,7 +10079,7 @@ S:	Maintained
 W:	https://wireless.wiki.kernel.org/
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211.git
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next.git
-F:	Documentation/networking/mac80211-injection.txt
+F:	Documentation/networking/mac80211-injection.rst
 F:	Documentation/networking/mac80211_hwsim/mac80211_hwsim.rst
 F:	drivers/net/wireless/mac80211_hwsim.[ch]
 F:	include/net/mac80211.h
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 82846aca86d9..9849c14694db 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -2144,7 +2144,7 @@ static bool ieee80211_parse_tx_radiotap(struct ieee80211_local *local,
 
 		/*
 		 * Please update the file
-		 * Documentation/networking/mac80211-injection.txt
+		 * Documentation/networking/mac80211-injection.rst
 		 * when parsing new fields here.
 		 */
 
-- 
cgit v1.2.3


From 6e94eaaa400d66f13e25e071926047ef2e3d21e3 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Thu, 30 Apr 2020 18:04:12 +0200
Subject: docs: networking: convert phonet.txt to ReST

- add SPDX header;
- adjust title markup;
- use copyright symbol;
- add notes markups;
- mark code blocks and literals as such;
- adjust indentation, whitespaces and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Rémi Denis-Courmont <courmisch@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/index.rst       |   1 +
 Documentation/networking/packet_mmap.rst |   2 +-
 Documentation/networking/phonet.rst      | 230 +++++++++++++++++++++++++++++++
 Documentation/networking/phonet.txt      | 214 ----------------------------
 MAINTAINERS                              |   2 +-
 5 files changed, 233 insertions(+), 216 deletions(-)
 create mode 100644 Documentation/networking/phonet.rst
 delete mode 100644 Documentation/networking/phonet.txt

(limited to 'MAINTAINERS')
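
The device ID split described in phonet.rst (6 higher-order bits of address,
2 lower-order bits that join the 8-bit object ID to form a 10-bit
multiplexing value) can be illustrated with a small C sketch. The helper
names below are hypothetical and chosen for illustration only; they are not
the kernel's own helpers, and the sketch assumes nothing beyond the bit
layout stated in the document.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: extract the 6-bit device address from a device ID. */
static inline uint8_t example_pn_addr(uint8_t dev)
{
	return dev >> 2;
}

/*
 * Hypothetical helper: combine the 2 lower-order bits of the device ID
 * with the 8-bit object ID into one 10-bit value, comparable to a port
 * number in the IP world.
 */
static inline uint16_t example_pn_port(uint8_t dev, uint8_t obj)
{
	return (uint16_t)(((dev & 0x03u) << 8) | obj);
}

/* Hypothetical helper: rebuild a device ID from its two parts. */
static inline uint8_t example_pn_dev(uint8_t addr, uint8_t mux)
{
	assert(addr < 64 && mux < 4); /* 6-bit address, 2-bit multiplexing */
	return (uint8_t)((addr << 2) | mux);
}
```

With this layout the modem, which always has address zero, corresponds to
device IDs 0x00 through 0x03.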

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 8262b535a83e..e460026331c6 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -90,6 +90,7 @@ Contents:
    openvswitch
    operstates
    packet_mmap
+   phonet
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/packet_mmap.rst b/Documentation/networking/packet_mmap.rst
index 5f213d17652f..884c7222b9e9 100644
--- a/Documentation/networking/packet_mmap.rst
+++ b/Documentation/networking/packet_mmap.rst
@@ -1076,7 +1076,7 @@ Miscellaneous bits
 ==================
 
 - Packet sockets work well together with Linux socket filters, thus you also
-  might want to have a look at Documentation/networking/filter.txt
+  might want to have a look at Documentation/networking/filter.rst
 
 THANKS
 ======
diff --git a/Documentation/networking/phonet.rst b/Documentation/networking/phonet.rst
new file mode 100644
index 000000000000..8668dcbc5e6a
--- /dev/null
+++ b/Documentation/networking/phonet.rst
@@ -0,0 +1,230 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+============================
+Linux Phonet protocol family
+============================
+
+Introduction
+------------
+
+Phonet is a packet protocol used by Nokia cellular modems for both IPC
+and RPC. With the Linux Phonet socket family, Linux host processes can
+receive and send messages from/to the modem, or any other external
+device attached to the modem. The modem takes care of routing.
+
+Phonet packets can be exchanged through various hardware connections
+depending on the device, such as:
+
+  - USB with the CDC Phonet interface,
+  - infrared,
+  - Bluetooth,
+  - an RS232 serial port (with a dedicated "FBUS" line discipline),
+  - the SSI bus with some TI OMAP processors.
+
+
+Packet format
+-------------
+
+Phonet packets have a common header as follows::
+
+  struct phonethdr {
+    uint8_t  pn_media;  /* Media type (link-layer identifier) */
+    uint8_t  pn_rdev;   /* Receiver device ID */
+    uint8_t  pn_sdev;   /* Sender device ID */
+    uint8_t  pn_res;    /* Resource ID or function */
+    uint16_t pn_length; /* Big-endian message byte length (minus 6) */
+    uint8_t  pn_robj;   /* Receiver object ID */
+    uint8_t  pn_sobj;   /* Sender object ID */
+  };
+
+On Linux, the link-layer header includes the pn_media byte (see below).
+The next 7 bytes are part of the network-layer header.
+
+The device ID is split: the 6 higher-order bits constitute the device
+address, while the 2 lower-order bits are used for multiplexing, as are
+the 8-bit object identifiers. As such, Phonet can be considered as a
+network layer with 6 bits of address space and 10 bits for transport
+protocol (much like port numbers in the IP world).
+
+The modem always has address number zero. All other devices have their
+own 6-bit address.
+
+
+Link layer
+----------
+
+Phonet links are always point-to-point links. The link layer header
+consists of a single Phonet media type byte. It uniquely identifies the
+link through which the packet is transmitted, from the modem's
+perspective. Each Phonet network device shall prepend and set the media
+type byte as appropriate. For convenience, a common phonet_header_ops
+link-layer header operations structure is provided. It sets the
+media type according to the network device hardware address.
+
+Linux Phonet network interfaces support a dedicated link-layer packet
+type (ETH_P_PHONET) which is out of the Ethernet type range. They can
+only send and receive Phonet packets.
+
+The virtual TUN tunnel device driver can also be used for Phonet. This
+requires IFF_TUN mode, _without_ the IFF_NO_PI flag. In this case,
+there is no link-layer header, so there is no Phonet media type byte.
+
+Note that Phonet interfaces are not allowed to re-order packets, so
+only the (default) Linux FIFO qdisc should be used with them.
+
+
+Network layer
+-------------
+
+The Phonet socket address family maps the Phonet packet header::
+
+  struct sockaddr_pn {
+    sa_family_t spn_family;    /* AF_PHONET */
+    uint8_t     spn_obj;       /* Object ID */
+    uint8_t     spn_dev;       /* Device ID */
+    uint8_t     spn_resource;  /* Resource or function */
+    uint8_t     spn_zero[...]; /* Padding */
+  };
+
+The resource field is only used when sending and receiving;
+it is ignored by bind() and getsockname().
+
+
+Low-level datagram protocol
+---------------------------
+
+Applications can send Phonet messages using the Phonet datagram socket
+protocol from the PF_PHONET family. Each socket is bound to one of the
+2^10 object IDs available, and can send and receive packets with any
+other peer.
+
+::
+
+  struct sockaddr_pn addr = { .spn_family = AF_PHONET, };
+  ssize_t len;
+  socklen_t addrlen = sizeof(addr);
+  int fd;
+
+  fd = socket(PF_PHONET, SOCK_DGRAM, 0);
+  bind(fd, (struct sockaddr *)&addr, sizeof(addr));
+  /* ... */
+
+  sendto(fd, msg, msglen, 0, (struct sockaddr *)&addr, sizeof(addr));
+  len = recvfrom(fd, buf, sizeof(buf), 0,
+		 (struct sockaddr *)&addr, &addrlen);
+
+This protocol follows the SOCK_DGRAM connection-less semantics.
+However, connect() and getpeername() are not supported, as they did
+not seem useful with Phonet usages (could be added easily).
+
+
+Resource subscription
+---------------------
+
+A Phonet datagram socket can be subscribed to any number of 8-bit
+Phonet resources, as follows::
+
+  uint32_t res = 0xXX;
+  ioctl(fd, SIOCPNADDRESOURCE, &res);
+
+Subscription is similarly cancelled using the SIOCPNDELRESOURCE I/O
+control request, or when the socket is closed.
+
+Note that no more than one socket can be subscribed to any given
+resource at a time. Otherwise, ioctl() will fail with EBUSY.
+
+
+Phonet Pipe protocol
+--------------------
+
+The Phonet Pipe protocol is a simple sequenced packet protocol
+with end-to-end congestion control. It uses the passive listening
+socket paradigm. The listening socket is bound to a unique free object
+ID. Each listening socket can handle up to 255 simultaneous
+connections, one per accept()'d socket.
+
+::
+
+  int lfd, cfd;
+
+  lfd = socket(PF_PHONET, SOCK_SEQPACKET, PN_PROTO_PIPE);
+  listen(lfd, INT_MAX);
+
+  /* ... */
+  cfd = accept(lfd, NULL, NULL);
+  for (;;)
+  {
+    char buf[...];
+    ssize_t len = read(cfd, buf, sizeof(buf));
+
+    /* ... */
+
+    write(cfd, msg, msglen);
+  }
+
+Connections are traditionally established between two endpoints by a
+"third party" application. This means that both endpoints are passive.
+
+
+As of Linux kernel version 2.6.39, it is also possible to connect
+two endpoints directly, using connect() on the active side. This is
+intended to support the newer Nokia Wireless Modem API, as found in
+e.g. the Nokia Slim Modem in the ST-Ericsson U8500 platform::
+
+  struct sockaddr_pn spn;
+  int fd;
+
+  fd = socket(PF_PHONET, SOCK_SEQPACKET, PN_PROTO_PIPE);
+  memset(&spn, 0, sizeof(spn));
+  spn.spn_family = AF_PHONET;
+  spn.spn_obj = ...;
+  spn.spn_dev = ...;
+  spn.spn_resource = 0xD9;
+  connect(fd, (struct sockaddr *)&spn, sizeof(spn));
+  /* normal I/O here ... */
+  close(fd);
+
+
+.. Warning::
+
+   When polling a connected pipe socket for writability, there is an
+   intrinsic race condition whereby writability might be lost between the
+   polling and the writing system calls. In this case, the socket will
+   block until write becomes possible again, unless non-blocking mode
+   is enabled.
+
+
+The pipe protocol provides three socket options at the SOL_PNPIPE level:
+
+  PNPIPE_ENCAP accepts one integer value (int) of:
+
+    PNPIPE_ENCAP_NONE:
+      The socket operates normally (default).
+
+    PNPIPE_ENCAP_IP:
+      The socket is used as a backend for a virtual IP
+      interface. This requires CAP_NET_ADMIN capability. GPRS data
+      support on Nokia modems can use this. Note that the socket cannot
+      be reliably poll()'d or read() from while in this mode.
+
+  PNPIPE_IFINDEX
+      is a read-only integer value. It contains the
+      interface index of the network interface created by PNPIPE_ENCAP,
+      or zero if encapsulation is off.
+
+  PNPIPE_HANDLE
+      is a read-only integer value. It contains the underlying
+      identifier ("pipe handle") of the pipe. This is only defined for
+      socket descriptors that are already connected or being connected.
+
+
+Authors
+-------
+
+Linux Phonet was initially written by Sakari Ailus.
+
+Other contributors include Mikä Liljeberg, Andras Domokos,
+Carlos Chinea and Rémi Denis-Courmont.
+
+Copyright |copy| 2008 Nokia Corporation.
diff --git a/Documentation/networking/phonet.txt b/Documentation/networking/phonet.txt
deleted file mode 100644
index 81003581f47a..000000000000
--- a/Documentation/networking/phonet.txt
+++ /dev/null
@@ -1,214 +0,0 @@
-Linux Phonet protocol family
-============================
-
-Introduction
-------------
-
-Phonet is a packet protocol used by Nokia cellular modems for both IPC
-and RPC. With the Linux Phonet socket family, Linux host processes can
-receive and send messages from/to the modem, or any other external
-device attached to the modem. The modem takes care of routing.
-
-Phonet packets can be exchanged through various hardware connections
-depending on the device, such as:
-  - USB with the CDC Phonet interface,
-  - infrared,
-  - Bluetooth,
-  - an RS232 serial port (with a dedicated "FBUS" line discipline),
-  - the SSI bus with some TI OMAP processors.
-
-
-Packets format
---------------
-
-Phonet packets have a common header as follows:
-
-  struct phonethdr {
-    uint8_t  pn_media;  /* Media type (link-layer identifier) */
-    uint8_t  pn_rdev;   /* Receiver device ID */
-    uint8_t  pn_sdev;   /* Sender device ID */
-    uint8_t  pn_res;    /* Resource ID or function */
-    uint16_t pn_length; /* Big-endian message byte length (minus 6) */
-    uint8_t  pn_robj;   /* Receiver object ID */
-    uint8_t  pn_sobj;   /* Sender object ID */
-  };
-
-On Linux, the link-layer header includes the pn_media byte (see below).
-The next 7 bytes are part of the network-layer header.
-
-The device ID is split: the 6 higher-order bits constitute the device
-address, while the 2 lower-order bits are used for multiplexing, as are
-the 8-bit object identifiers. As such, Phonet can be considered as a
-network layer with 6 bits of address space and 10 bits for transport
-protocol (much like port numbers in IP world).
-
-The modem always has address number zero. All other device have a their
-own 6-bit address.
-
-
-Link layer
-----------
-
-Phonet links are always point-to-point links. The link layer header
-consists of a single Phonet media type byte. It uniquely identifies the
-link through which the packet is transmitted, from the modem's
-perspective. Each Phonet network device shall prepend and set the media
-type byte as appropriate. For convenience, a common phonet_header_ops
-link-layer header operations structure is provided. It sets the
-media type according to the network device hardware address.
-
-Linux Phonet network interfaces support a dedicated link layer packets
-type (ETH_P_PHONET) which is out of the Ethernet type range. They can
-only send and receive Phonet packets.
-
-The virtual TUN tunnel device driver can also be used for Phonet. This
-requires IFF_TUN mode, _without_ the IFF_NO_PI flag. In this case,
-there is no link-layer header, so there is no Phonet media type byte.
-
-Note that Phonet interfaces are not allowed to re-order packets, so
-only the (default) Linux FIFO qdisc should be used with them.
-
-
-Network layer
--------------
-
-The Phonet socket address family maps the Phonet packet header:
-
-  struct sockaddr_pn {
-    sa_family_t spn_family;    /* AF_PHONET */
-    uint8_t     spn_obj;       /* Object ID */
-    uint8_t     spn_dev;       /* Device ID */
-    uint8_t     spn_resource;  /* Resource or function */
-    uint8_t     spn_zero[...]; /* Padding */
-  };
-
-The resource field is only used when sending and receiving;
-It is ignored by bind() and getsockname().
-
-
-Low-level datagram protocol
----------------------------
-
-Applications can send Phonet messages using the Phonet datagram socket
-protocol from the PF_PHONET family. Each socket is bound to one of the
-2^10 object IDs available, and can send and receive packets with any
-other peer.
-
-  struct sockaddr_pn addr = { .spn_family = AF_PHONET, };
-  ssize_t len;
-  socklen_t addrlen = sizeof(addr);
-  int fd;
-
-  fd = socket(PF_PHONET, SOCK_DGRAM, 0);
-  bind(fd, (struct sockaddr *)&addr, sizeof(addr));
-  /* ... */
-
-  sendto(fd, msg, msglen, 0, (struct sockaddr *)&addr, sizeof(addr));
-  len = recvfrom(fd, buf, sizeof(buf), 0,
-                 (struct sockaddr *)&addr, &addrlen);
-
-This protocol follows the SOCK_DGRAM connection-less semantics.
-However, connect() and getpeername() are not supported, as they did
-not seem useful with Phonet usages (could be added easily).
-
-
-Resource subscription
----------------------
-
-A Phonet datagram socket can be subscribed to any number of 8-bits
-Phonet resources, as follow:
-
-  uint32_t res = 0xXX;
-  ioctl(fd, SIOCPNADDRESOURCE, &res);
-
-Subscription is similarly cancelled using the SIOCPNDELRESOURCE I/O
-control request, or when the socket is closed.
-
-Note that no more than one socket can be subcribed to any given
-resource at a time. If not, ioctl() will return EBUSY.
-
-
-Phonet Pipe protocol
---------------------
-
-The Phonet Pipe protocol is a simple sequenced packets protocol
-with end-to-end congestion control. It uses the passive listening
-socket paradigm. The listening socket is bound to an unique free object
-ID. Each listening socket can handle up to 255 simultaneous
-connections, one per accept()'d socket.
-
-  int lfd, cfd;
-
-  lfd = socket(PF_PHONET, SOCK_SEQPACKET, PN_PROTO_PIPE);
-  listen (lfd, INT_MAX);
-
-  /* ... */
-  cfd = accept(lfd, NULL, NULL);
-  for (;;)
-  {
-    char buf[...];
-    ssize_t len = read(cfd, buf, sizeof(buf));
-
-    /* ... */
-
-    write(cfd, msg, msglen);
-  }
-
-Connections are traditionally established between two endpoints by a
-"third party" application. This means that both endpoints are passive.
-
-
-As of Linux kernel version 2.6.39, it is also possible to connect
-two endpoints directly, using connect() on the active side. This is
-intended to support the newer Nokia Wireless Modem API, as found in
-e.g. the Nokia Slim Modem in the ST-Ericsson U8500 platform:
-
-  struct sockaddr_spn spn;
-  int fd;
-
-  fd = socket(PF_PHONET, SOCK_SEQPACKET, PN_PROTO_PIPE);
-  memset(&spn, 0, sizeof(spn));
-  spn.spn_family = AF_PHONET;
-  spn.spn_obj = ...;
-  spn.spn_dev = ...;
-  spn.spn_resource = 0xD9;
-  connect(fd, (struct sockaddr *)&spn, sizeof(spn));
-  /* normal I/O here ... */
-  close(fd);
-
-
-WARNING:
-When polling a connected pipe socket for writability, there is an
-intrinsic race condition whereby writability might be lost between the
-polling and the writing system calls. In this case, the socket will
-block until write becomes possible again, unless non-blocking mode
-is enabled.
-
-
-The pipe protocol provides two socket options at the SOL_PNPIPE level:
-
-  PNPIPE_ENCAP accepts one integer value (int) of:
-
-    PNPIPE_ENCAP_NONE: The socket operates normally (default).
-
-    PNPIPE_ENCAP_IP: The socket is used as a backend for a virtual IP
-      interface. This requires CAP_NET_ADMIN capability. GPRS data
-      support on Nokia modems can use this. Note that the socket cannot
-      be reliably poll()'d or read() from while in this mode.
-
-  PNPIPE_IFINDEX is a read-only integer value. It contains the
-    interface index of the network interface created by PNPIPE_ENCAP,
-    or zero if encapsulation is off.
-
-  PNPIPE_HANDLE is a read-only integer value. It contains the underlying
-    identifier ("pipe handle") of the pipe. This is only defined for
-    socket descriptors that are already connected or being connected.
-
-
-Authors
--------
-
-Linux Phonet was initially written by Sakari Ailus.
-Other contributors include Mikä Liljeberg, Andras Domokos,
-Carlos Chinea and Rémi Denis-Courmont.
-Copyright (C) 2008 Nokia Corporation.
diff --git a/MAINTAINERS b/MAINTAINERS
index 33bfc9e4aead..785f56e5f210 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13262,7 +13262,7 @@ F:	drivers/input/joystick/pxrc.c
 PHONET PROTOCOL
 M:	Remi Denis-Courmont <courmisch@gmail.com>
 S:	Supported
-F:	Documentation/networking/phonet.txt
+F:	Documentation/networking/phonet.rst
 F:	include/linux/phonet.h
 F:	include/net/phonet/
 F:	include/uapi/linux/phonet.h
-- 
cgit v1.2.3


From bad5b6e223e8409c860c0574d5239ee4348f06b3 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Thu, 30 Apr 2020 18:04:19 +0200
Subject: docs: networking: convert rds.txt to ReST

- add SPDX header;
- add a document title;
- mark code blocks and literals as such;
- mark tables as such;
- mark lists as such;
- adjust indentation, whitespaces and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/index.rst |   1 +
 Documentation/networking/rds.rst   | 448 +++++++++++++++++++++++++++++++++++++
 Documentation/networking/rds.txt   | 423 ----------------------------------
 MAINTAINERS                        |   2 +-
 4 files changed, 450 insertions(+), 424 deletions(-)
 create mode 100644 Documentation/networking/rds.rst
 delete mode 100644 Documentation/networking/rds.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index b7e35b0d905c..e63a2cb2e4cb 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -97,6 +97,7 @@ Contents:
    proc_net_tcp
    radiotap-headers
    ray_cs
+   rds
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/rds.rst b/Documentation/networking/rds.rst
new file mode 100644
index 000000000000..44936c27ab3a
--- /dev/null
+++ b/Documentation/networking/rds.rst
@@ -0,0 +1,448 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===
+RDS
+===
+
+Overview
+========
+
+This readme tries to provide some background on the hows and whys of RDS,
+and will hopefully help you find your way around the code.
+
+In addition, please see this email about RDS origins:
+http://oss.oracle.com/pipermail/rds-devel/2007-November/000228.html
+
+RDS Architecture
+================
+
+RDS provides reliable, ordered datagram delivery by using a single
+reliable connection between any two nodes in the cluster. This allows
+applications to use a single socket to talk to any other process in the
+cluster - so in a cluster with N processes you need N sockets, in contrast
+to N*N if you use a connection-oriented socket transport like TCP.
+
+RDS is not Infiniband-specific; it was designed to support different
+transports.  The current implementation used to support RDS over TCP as well
+as IB.
+
+The high-level semantics of RDS from the application's point of view are
+
+ *	Addressing
+
+	RDS uses IPv4 addresses and 16bit port numbers to identify
+	the end point of a connection. All socket operations that involve
+	passing addresses between kernel and user space generally
+	use a struct sockaddr_in.
+
+	The fact that IPv4 addresses are used does not mean the underlying
+	transport has to be IP-based. In fact, RDS over IB uses a
+	reliable IB connection; the IP address is used exclusively to
+	locate the remote node's GID (by ARPing for the given IP).
+
+	The port space is entirely independent of UDP, TCP or any other
+	protocol.
+
+ *	Socket interface
+
+	RDS sockets work *mostly* as you would expect from a BSD
+	socket. The next section will cover the details. At any rate,
+	all I/O is performed through the standard BSD socket API.
+	Some additions like zerocopy support are implemented through
+	control messages, while other extensions use the getsockopt/
+	setsockopt calls.
+
+	Sockets must be bound before you can send or receive data.
+	This is needed because binding also selects a transport and
+	attaches it to the socket. Once bound, the transport assignment
+	does not change. RDS will tolerate IPs moving around (e.g. in
+	an active-active HA scenario), but only as long as the address
+	doesn't move to a different transport.
+
+ *	sysctls
+
+	RDS supports a number of sysctls in /proc/sys/net/rds
+
+
+Socket Interface
+================
+
+  AF_RDS, PF_RDS, SOL_RDS
+	AF_RDS and PF_RDS are the domain type to be used with socket(2)
+	to create RDS sockets. SOL_RDS is the socket-level to be used
+	with setsockopt(2) and getsockopt(2) for RDS specific socket
+	options.
+
+  fd = socket(PF_RDS, SOCK_SEQPACKET, 0);
+	This creates a new, unbound RDS socket.
+
+  setsockopt(SOL_SOCKET): send and receive buffer size
+	RDS honors the send and receive buffer size socket options.
+	You are not allowed to queue more than SO_SNDBUF bytes to
+	a socket. A message is queued when sendmsg is called, and
+	it leaves the queue when the remote system acknowledges
+	its arrival.
+
+	The SO_RCVBUF option controls the maximum receive queue length.
+	This is a soft limit rather than a hard limit - RDS will
+	continue to accept and queue incoming messages, even if that
+	takes the queue length over the limit. However, it will also
+	mark the port as "congested" and send a congestion update to
+	the source node. The source node is supposed to throttle any
+	processes sending to this congested port.
+
+  bind(fd, &sockaddr_in, ...)
+	This binds the socket to a local IP address and port, and a
+	transport, if one has not already been selected via the
+	SO_RDS_TRANSPORT socket option.
+
+  sendmsg(fd, ...)
+	Sends a message to the indicated recipient. The kernel will
+	transparently establish the underlying reliable connection
+	if it isn't up yet.
+
+	An attempt to send a message that exceeds SO_SNDBUF will
+	return with -EMSGSIZE.
+
+	An attempt to send a message that would take the total number
+	of queued bytes over the SO_SNDBUF threshold will return
+	EAGAIN.
+
+	An attempt to send a message to a destination that is marked
+	as "congested" will return ENOBUFS.
+
+  recvmsg(fd, ...)
+	Receives a message that was queued to this socket. The socket's
+	recv queue accounting is adjusted, and if the queue length
+	drops below SO_RCVBUF, the port is marked uncongested, and
+	a congestion update is sent to all peers.
+
+	Applications can ask the RDS kernel module to receive
+	notifications via control messages (for instance, there is a
+	notification when a congestion update arrives, or when an RDMA
+	operation completes). These notifications are received through
+	the msg.msg_control buffer of struct msghdr. The format of the
+	messages is described in manpages.
+
+  poll(fd)
+	RDS supports the poll interface to allow the application
+	to implement async I/O.
+
+	POLLIN handling is pretty straightforward. When there's an
+	incoming message queued to the socket, or a pending notification,
+	we signal POLLIN.
+
+	POLLOUT is a little harder. Since you can essentially send
+	to any destination, RDS will always signal POLLOUT as long as
+	there's room on the send queue (i.e. the number of bytes queued
+	is less than the sendbuf size).
+
+	However, the kernel will refuse to accept messages to
+	a destination marked congested - in this case you will loop
+	forever if you rely on poll to tell you what to do.
+	This isn't a trivial problem, but applications can deal with
+	this - by using congestion notifications, and by checking for
+	ENOBUFS errors returned by sendmsg.
+
+  setsockopt(SOL_RDS, RDS_CANCEL_SENT_TO, &sockaddr_in)
+	This allows the application to discard all messages queued to a
+	specific destination on this particular socket.
+
+	This allows the application to cancel outstanding messages if
+	it detects a timeout. For instance, if it tried to send a message,
+	and the remote host is unreachable, RDS will keep trying forever.
+	The application may decide it's not worth it, and cancel the
+	operation. In this case, it would use RDS_CANCEL_SENT_TO to
+	nuke any pending messages.
+
+  ``setsockopt(fd, SOL_RDS, SO_RDS_TRANSPORT, (int *)&transport ..), getsockopt(fd, SOL_RDS, SO_RDS_TRANSPORT, (int *)&transport ..)``
+	Set or read an integer defining the underlying
+	encapsulating transport to be used for RDS packets on the
+	socket. When setting the option, the integer argument may be
+	one of RDS_TRANS_TCP or RDS_TRANS_IB. When retrieving the
+	value, RDS_TRANS_NONE will be returned on an unbound socket.
+	This socket option may only be set exactly once on the socket,
+	prior to binding it via the bind(2) system call. Attempts to
+	set SO_RDS_TRANSPORT on a socket for which the transport has
+	been previously attached explicitly (by SO_RDS_TRANSPORT) or
+	implicitly (via bind(2)) will return an error of EOPNOTSUPP.
+	An attempt to set SO_RDS_TRANSPORT to RDS_TRANS_NONE will
+	always return EINVAL.
+
+RDMA for RDS
+============
+
+  see rds-rdma(7) manpage (available in rds-tools)
+
+
+Congestion Notifications
+========================
+
+  see rds(7) manpage
+
+
+RDS Protocol
+============
+
+  Message header
+
+    The message header is a 'struct rds_header' (see rds.h):
+
+    Fields:
+
+      h_sequence:
+	  per-packet sequence number
+      h_ack:
+	  piggybacked acknowledgment of last packet received
+      h_len:
+	  length of data, not including header
+      h_sport:
+	  source port
+      h_dport:
+	  destination port
+      h_flags:
+	  Can be:
+
+	  =============  ==================================
+	  CONG_BITMAP    this is a congestion update bitmap
+	  ACK_REQUIRED   receiver must ack this packet
+	  RETRANSMITTED  packet has previously been sent
+	  =============  ==================================
+
+      h_credit:
+	  indicate to other end of connection that
+	  it has more credits available (i.e. there is
+	  more send room)
+      h_padding[4]:
+	  unused, for future use
+      h_csum:
+	  header checksum
+      h_exthdr:
+	  optional data can be passed here. This is currently used for
+	  passing RDMA-related information.
+
+  ACK and retransmit handling
+
+      One might think that with reliable IB connections you wouldn't need
+      to ack messages that have been received.  The problem is that IB
+      hardware generates an ack message before it has DMAed the message
+      into memory.  This creates a potential message loss if the HCA is
+      disabled for any reason between when it sends the ack and before
+      the message is DMAed and processed.  This is only a potential issue
+      if another HCA is available for fail-over.
+
+      Sending an ack immediately would allow the sender to free the sent
+      message from their send queue quickly, but could cause excessive
+      traffic to be used for acks. RDS piggybacks acks on sent data
+      packets.  Ack-only packets are reduced by only allowing one to be
+      in flight at a time, and by the sender only asking for acks when
+      its send buffers start to fill up. All retransmissions are also
+      acked.
+
+  Flow Control
+
+      RDS's IB transport uses a credit-based mechanism to verify that
+      there is space in the peer's receive buffers for more data. This
+      eliminates the need for hardware retries on the connection.
+
+  Congestion
+
+      Messages waiting in the receive queue on the receiving socket
+      are accounted against the socket's SO_RCVBUF option value.  Only
+      the payload bytes in the message are accounted for.  If the
+      number of bytes queued equals or exceeds rcvbuf then the socket
+      is congested.  All sends attempted to this socket's address
+      should block or return -EWOULDBLOCK.
+
+      Applications are expected to be reasonably tuned such that this
+      situation very rarely occurs.  An application encountering this
+      "back-pressure" is considered a bug.
+
+      This is implemented by having each node maintain bitmaps which
+      indicate which ports on bound addresses are congested.  As the
+      bitmap changes it is sent through all the connections which
+      terminate in the local address of the bitmap which changed.
+
+      The bitmaps are allocated as connections are brought up.  This
+      avoids allocation in the interrupt handling path which queues
+      messages on sockets.  The dense bitmaps let transports send the
+      entire bitmap on any bitmap change reasonably efficiently.  This
+      is much easier to implement than some finer-grained
+      communication of per-port congestion.  The sender does a very
+      inexpensive bit test to test if the port it's about to send to
+      is congested or not.
+
+
+RDS Transport Layer
+===================
+
+  As mentioned above, RDS is not IB-specific. Its code is divided
+  into a general RDS layer and a transport layer.
+
+  The general layer handles the socket API, congestion handling,
+  loopback, stats, usermem pinning, and the connection state machine.
+
+  The transport layer handles the details of the transport. The IB
+  transport, for example, handles all the queue pairs, work requests,
+  CM event handlers, and other Infiniband details.
+
+
+RDS Kernel Structures
+=====================
+
+  struct rds_message
+    aka possibly "rds_outgoing", the generic RDS layer copies data to
+    be sent and sets header fields as needed, based on the socket API.
+    This is then queued for the individual connection and sent by the
+    connection's transport.
+
+  struct rds_incoming
+    a generic struct referring to incoming data that can be handed from
+    the transport to the general code and queued by the general code
+    while the socket is awoken. It is then passed back to the transport
+    code to handle the actual copy-to-user.
+
+  struct rds_socket
+    per-socket information
+
+  struct rds_connection
+    per-connection information
+
+  struct rds_transport
+    pointers to transport-specific functions
+
+  struct rds_statistics
+    non-transport-specific statistics
+
+  struct rds_cong_map
+    wraps the raw congestion bitmap, contains rbnode, waitq, etc.
+
+Connection management
+=====================
+
+  Connections may be in UP, DOWN, CONNECTING, DISCONNECTING, and
+  ERROR states.
+
+  The first time an attempt is made by an RDS socket to send data to
+  a node, a connection is allocated and connected. That connection is
+  then maintained forever -- if there are transport errors, the
+  connection will be dropped and re-established.
+
+  Dropping a connection while packets are queued will cause queued or
+  partially-sent datagrams to be retransmitted when the connection is
+  re-established.
+
+
+The send path
+=============
+
+  rds_sendmsg()
+    - struct rds_message built from incoming data
+    - CMSGs parsed (e.g. RDMA ops)
+    - transport connection alloced and connected if not already
+    - rds_message placed on send queue
+    - send worker awoken
+
+  rds_send_worker()
+    - calls rds_send_xmit() until queue is empty
+
+  rds_send_xmit()
+    - transmits congestion map if one is pending
+    - may set ACK_REQUIRED
+    - calls transport to send either non-RDMA or RDMA message
+      (RDMA ops never retransmitted)
+
+  rds_ib_xmit()
+    - allocs work requests from send ring
+    - adds any new send credits available to peer (h_credits)
+    - maps the rds_message's sg list
+    - piggybacks ack
+    - populates work requests
+    - post send to connection's queue pair
+
+The recv path
+=============
+
+  rds_ib_recv_cq_comp_handler()
+    - looks at write completions
+    - unmaps recv buffer from device
+    - no errors, call rds_ib_process_recv()
+    - refill recv ring
+
+  rds_ib_process_recv()
+    - validate header checksum
+    - copy header to rds_ib_incoming struct if start of a new datagram
+    - add to ibinc's fraglist
+    - if completed datagram:
+	 - update cong map if datagram was cong update
+	 - call rds_recv_incoming() otherwise
+	 - note if ack is required
+
+  rds_recv_incoming()
+    - drop duplicate packets
+    - respond to pings
+    - find the sock associated with this datagram
+    - add to sock queue
+    - wake up sock
+    - do some congestion calculations
+  rds_recvmsg()
+    - copy data into user iovec
+    - handle CMSGs
+    - return to application
+
+Multipath RDS (mprds)
+=====================
+  Mprds is multipathed-RDS, primarily intended for RDS-over-TCP
+  (though the concept can be extended to other transports). The classical
+  implementation of RDS-over-TCP is implemented by demultiplexing multiple
+  PF_RDS sockets between any 2 endpoints (where endpoint == [IP address,
+  port]) over a single TCP socket between the 2 IP addresses involved. This
+  has the limitation that it ends up funneling multiple RDS flows over a
+  single TCP flow, thus it
+  (a) is upper-bounded to the single-flow bandwidth,
+  (b) suffers from head-of-line blocking for all the RDS sockets.
+
+  Better throughput (for a fixed small packet size, MTU) can be achieved
+  by having multiple TCP/IP flows per rds/tcp connection, i.e., multipathed
+  RDS (mprds).  Each such TCP/IP flow constitutes a path for the rds/tcp
+  connection. RDS sockets will be attached to a path based on some hash
+  (e.g., of local address and RDS port number) and packets for that RDS
+  socket will be sent over the attached path using TCP to segment/reassemble
+  RDS datagrams on that path.
+
+  Multipathed RDS is implemented by splitting the struct rds_connection into
+  a common (to all paths) part, and a per-path struct rds_conn_path. All
+  I/O workqs and reconnect threads are driven from the rds_conn_path.
+  Transports such as TCP that are multipath capable may then set up a
+  TCP socket per rds_conn_path, and this is managed by the transport via
+  the transport private cp_transport_data pointer.
+
+  Transports announce themselves as multipath capable by setting the
+  t_mp_capable bit during registration with the rds core module. When the
+  transport is multipath-capable, rds_sendmsg() hashes outgoing traffic
+  across multiple paths. The outgoing hash is computed based on the
+  local address and port that the PF_RDS socket is bound to.
+
+  Additionally, even if the transport is MP capable, we may be
+  peering with some node that does not support mprds, or supports
+  a different number of paths. As a result, the peering nodes need
+  to agree on the number of paths to be used for the connection.
+  This is done by sending out a control packet exchange before the
+  first data packet. The control packet exchange must have completed
+  prior to outgoing hash completion in rds_sendmsg() when the transport
+  is multipath capable.
+
+  The control packet is an RDS ping packet (i.e., packet to rds dest
+  port 0) with the ping packet having an RDS extension header option of
+  type RDS_EXTHDR_NPATHS, length 2 bytes, and the value is the
+  number of paths supported by the sender. The "probe" ping packet will
+  get sent from some reserved port, RDS_FLAG_PROBE_PORT (in <linux/rds.h>).
+  The receiver of a ping from RDS_FLAG_PROBE_PORT will thus immediately
+  be able to compute the min(sender_paths, rcvr_paths). The pong
+  sent in response to a probe-ping should contain the rcvr's npaths
+  when the rcvr is mprds-capable.
+
+  If the rcvr is not mprds-capable, the exthdr in the ping will be
+  ignored.  In this case the pong will not have any exthdrs, so the sender
+  of the probe-ping can default to single-path mprds.
+
diff --git a/Documentation/networking/rds.txt b/Documentation/networking/rds.txt
deleted file mode 100644
index eec61694e894..000000000000
--- a/Documentation/networking/rds.txt
+++ /dev/null
@@ -1,423 +0,0 @@
-
-Overview
-========
-
-This readme tries to provide some background on the hows and whys of RDS,
-and will hopefully help you find your way around the code.
-
-In addition, please see this email about RDS origins:
-http://oss.oracle.com/pipermail/rds-devel/2007-November/000228.html
-
-RDS Architecture
-================
-
-RDS provides reliable, ordered datagram delivery by using a single
-reliable connection between any two nodes in the cluster. This allows
-applications to use a single socket to talk to any other process in the
-cluster - so in a cluster with N processes you need N sockets, in contrast
-to N*N if you use a connection-oriented socket transport like TCP.
-
-RDS is not Infiniband-specific; it was designed to support different
-transports.  The current implementation used to support RDS over TCP as well
-as IB.
-
-The high-level semantics of RDS from the application's point of view are
-
- *	Addressing
-        RDS uses IPv4 addresses and 16bit port numbers to identify
-        the end point of a connection. All socket operations that involve
-        passing addresses between kernel and user space generally
-        use a struct sockaddr_in.
-
-        The fact that IPv4 addresses are used does not mean the underlying
-        transport has to be IP-based. In fact, RDS over IB uses a
-        reliable IB connection; the IP address is used exclusively to
-        locate the remote node's GID (by ARPing for the given IP).
-
-        The port space is entirely independent of UDP, TCP or any other
-        protocol.
-
- *	Socket interface
-        RDS sockets work *mostly* as you would expect from a BSD
-        socket. The next section will cover the details. At any rate,
-        all I/O is performed through the standard BSD socket API.
-        Some additions like zerocopy support are implemented through
-        control messages, while other extensions use the getsockopt/
-        setsockopt calls.
-
-        Sockets must be bound before you can send or receive data.
-        This is needed because binding also selects a transport and
-        attaches it to the socket. Once bound, the transport assignment
-        does not change. RDS will tolerate IPs moving around (eg in
-        a active-active HA scenario), but only as long as the address
-        doesn't move to a different transport.
-
- *	sysctls
-        RDS supports a number of sysctls in /proc/sys/net/rds
-
-
-Socket Interface
-================
-
-  AF_RDS, PF_RDS, SOL_RDS
-	AF_RDS and PF_RDS are the domain type to be used with socket(2)
-	to create RDS sockets. SOL_RDS is the socket-level to be used
-	with setsockopt(2) and getsockopt(2) for RDS specific socket
-	options.
-
-  fd = socket(PF_RDS, SOCK_SEQPACKET, 0);
-        This creates a new, unbound RDS socket.
-
-  setsockopt(SOL_SOCKET): send and receive buffer size
-        RDS honors the send and receive buffer size socket options.
-        You are not allowed to queue more than SO_SNDSIZE bytes to
-        a socket. A message is queued when sendmsg is called, and
-        it leaves the queue when the remote system acknowledges
-        its arrival.
-
-        The SO_RCVSIZE option controls the maximum receive queue length.
-        This is a soft limit rather than a hard limit - RDS will
-        continue to accept and queue incoming messages, even if that
-        takes the queue length over the limit. However, it will also
-        mark the port as "congested" and send a congestion update to
-        the source node. The source node is supposed to throttle any
-        processes sending to this congested port.
-
-  bind(fd, &sockaddr_in, ...)
-        This binds the socket to a local IP address and port, and a
-        transport, if one has not already been selected via the
-	SO_RDS_TRANSPORT socket option
-
-  sendmsg(fd, ...)
-        Sends a message to the indicated recipient. The kernel will
-        transparently establish the underlying reliable connection
-        if it isn't up yet.
-
-        An attempt to send a message that exceeds SO_SNDSIZE will
-        return with -EMSGSIZE
-
-        An attempt to send a message that would take the total number
-        of queued bytes over the SO_SNDSIZE threshold will return
-        EAGAIN.
-
-        An attempt to send a message to a destination that is marked
-        as "congested" will return ENOBUFS.
-
-  recvmsg(fd, ...)
-        Receives a message that was queued to this socket. The sockets
-        recv queue accounting is adjusted, and if the queue length
-        drops below SO_SNDSIZE, the port is marked uncongested, and
-        a congestion update is sent to all peers.
-
-        Applications can ask the RDS kernel module to receive
-        notifications via control messages (for instance, there is a
-        notification when a congestion update arrived, or when a RDMA
-        operation completes). These notifications are received through
-        the msg.msg_control buffer of struct msghdr. The format of the
-        messages is described in manpages.
-
-  poll(fd)
-        RDS supports the poll interface to allow the application
-        to implement async I/O.
-
-        POLLIN handling is pretty straightforward. When there's an
-        incoming message queued to the socket, or a pending notification,
-        we signal POLLIN.
-
-        POLLOUT is a little harder. Since you can essentially send
-        to any destination, RDS will always signal POLLOUT as long as
-        there's room on the send queue (ie the number of bytes queued
-        is less than the sendbuf size).
-
-        However, the kernel will refuse to accept messages to
-        a destination marked congested - in this case you will loop
-        forever if you rely on poll to tell you what to do.
-        This isn't a trivial problem, but applications can deal with
-        this - by using congestion notifications, and by checking for
-        ENOBUFS errors returned by sendmsg.
-
-  setsockopt(SOL_RDS, RDS_CANCEL_SENT_TO, &sockaddr_in)
-        This allows the application to discard all messages queued to a
-        specific destination on this particular socket.
-
-        This allows the application to cancel outstanding messages if
-        it detects a timeout. For instance, if it tried to send a message,
-        and the remote host is unreachable, RDS will keep trying forever.
-        The application may decide it's not worth it, and cancel the
-        operation. In this case, it would use RDS_CANCEL_SENT_TO to
-        nuke any pending messages.
-
-  setsockopt(fd, SOL_RDS, SO_RDS_TRANSPORT, (int *)&transport ..)
-  getsockopt(fd, SOL_RDS, SO_RDS_TRANSPORT, (int *)&transport ..)
-	Set or read an integer defining  the underlying
-	encapsulating transport to be used for RDS packets on the
-	socket. When setting the option, integer argument may be
-	one of RDS_TRANS_TCP or RDS_TRANS_IB. When retrieving the
-	value, RDS_TRANS_NONE will be returned on an unbound socket.
-	This socket option may only be set exactly once on the socket,
-	prior to binding it via the bind(2) system call. Attempts to
-	set SO_RDS_TRANSPORT on a socket for which the transport has
-	been previously attached explicitly (by SO_RDS_TRANSPORT) or
-	implicitly (via bind(2)) will return an error of EOPNOTSUPP.
-	An attempt to set SO_RDS_TRANSPORT to RDS_TRANS_NONE will
-	always return EINVAL.
-
-RDMA for RDS
-============
-
-  see rds-rdma(7) manpage (available in rds-tools)
-
-
-Congestion Notifications
-========================
-
-  see rds(7) manpage
-
-
-RDS Protocol
-============
-
-  Message header
-
-    The message header is a 'struct rds_header' (see rds.h):
-    Fields:
-      h_sequence:
-          per-packet sequence number
-      h_ack:
-          piggybacked acknowledgment of last packet received
-      h_len:
-          length of data, not including header
-      h_sport:
-          source port
-      h_dport:
-          destination port
-      h_flags:
-          CONG_BITMAP - this is a congestion update bitmap
-          ACK_REQUIRED - receiver must ack this packet
-          RETRANSMITTED - packet has previously been sent
-      h_credit:
-          indicate to other end of connection that
-          it has more credits available (i.e. there is
-          more send room)
-      h_padding[4]:
-          unused, for future use
-      h_csum:
-          header checksum
-      h_exthdr:
-          optional data can be passed here. This is currently used for
-          passing RDMA-related information.
-
-  ACK and retransmit handling
-
-      One might think that with reliable IB connections you wouldn't need
-      to ack messages that have been received.  The problem is that IB
-      hardware generates an ack message before it has DMAed the message
-      into memory.  This creates a potential message loss if the HCA is
-      disabled for any reason between when it sends the ack and before
-      the message is DMAed and processed.  This is only a potential issue
-      if another HCA is available for fail-over.
-
-      Sending an ack immediately would allow the sender to free the sent
-      message from their send queue quickly, but could cause excessive
-      traffic to be used for acks. RDS piggybacks acks on sent data
-      packets.  Ack-only packets are reduced by only allowing one to be
-      in flight at a time, and by the sender only asking for acks when
-      its send buffers start to fill up. All retransmissions are also
-      acked.
-
-  Flow Control
-
-      RDS's IB transport uses a credit-based mechanism to verify that
-      there is space in the peer's receive buffers for more data. This
-      eliminates the need for hardware retries on the connection.
-
-  Congestion
-
-      Messages waiting in the receive queue on the receiving socket
-      are accounted against the sockets SO_RCVBUF option value.  Only
-      the payload bytes in the message are accounted for.  If the
-      number of bytes queued equals or exceeds rcvbuf then the socket
-      is congested.  All sends attempted to this socket's address
-      should return block or return -EWOULDBLOCK.
-
-      Applications are expected to be reasonably tuned such that this
-      situation very rarely occurs.  An application encountering this
-      "back-pressure" is considered a bug.
-
-      This is implemented by having each node maintain bitmaps which
-      indicate which ports on bound addresses are congested.  As the
-      bitmap changes it is sent through all the connections which
-      terminate in the local address of the bitmap which changed.
-
-      The bitmaps are allocated as connections are brought up.  This
-      avoids allocation in the interrupt handling path which queues
-      sages on sockets.  The dense bitmaps let transports send the
-      entire bitmap on any bitmap change reasonably efficiently.  This
-      is much easier to implement than some finer-grained
-      communication of per-port congestion.  The sender does a very
-      inexpensive bit test to test if the port it's about to send to
-      is congested or not.
-
-
-RDS Transport Layer
-==================
-
-  As mentioned above, RDS is not IB-specific. Its code is divided
-  into a general RDS layer and a transport layer.
-
-  The general layer handles the socket API, congestion handling,
-  loopback, stats, usermem pinning, and the connection state machine.
-
-  The transport layer handles the details of the transport. The IB
-  transport, for example, handles all the queue pairs, work requests,
-  CM event handlers, and other Infiniband details.
-
-
-RDS Kernel Structures
-=====================
-
-  struct rds_message
-    aka possibly "rds_outgoing", the generic RDS layer copies data to
-    be sent and sets header fields as needed, based on the socket API.
-    This is then queued for the individual connection and sent by the
-    connection's transport.
-  struct rds_incoming
-    a generic struct referring to incoming data that can be handed from
-    the transport to the general code and queued by the general code
-    while the socket is awoken. It is then passed back to the transport
-    code to handle the actual copy-to-user.
-  struct rds_socket
-    per-socket information
-  struct rds_connection
-    per-connection information
-  struct rds_transport
-    pointers to transport-specific functions
-  struct rds_statistics
-    non-transport-specific statistics
-  struct rds_cong_map
-    wraps the raw congestion bitmap, contains rbnode, waitq, etc.
-
-Connection management
-=====================
-
-  Connections may be in UP, DOWN, CONNECTING, DISCONNECTING, and
-  ERROR states.
-
-  The first time an attempt is made by an RDS socket to send data to
-  a node, a connection is allocated and connected. That connection is
-  then maintained forever -- if there are transport errors, the
-  connection will be dropped and re-established.
-
-  Dropping a connection while packets are queued will cause queued or
-  partially-sent datagrams to be retransmitted when the connection is
-  re-established.
-
-
-The send path
-=============
-
-  rds_sendmsg()
-    struct rds_message built from incoming data
-    CMSGs parsed (e.g. RDMA ops)
-    transport connection alloced and connected if not already
-    rds_message placed on send queue
-    send worker awoken
-  rds_send_worker()
-    calls rds_send_xmit() until queue is empty
-  rds_send_xmit()
-    transmits congestion map if one is pending
-    may set ACK_REQUIRED
-    calls transport to send either non-RDMA or RDMA message
-    (RDMA ops never retransmitted)
-  rds_ib_xmit()
-    allocs work requests from send ring
-    adds any new send credits available to peer (h_credits)
-    maps the rds_message's sg list
-    piggybacks ack
-    populates work requests
-    post send to connection's queue pair
-
-The recv path
-=============
-
-  rds_ib_recv_cq_comp_handler()
-    looks at write completions
-    unmaps recv buffer from device
-    no errors, call rds_ib_process_recv()
-    refill recv ring
-  rds_ib_process_recv()
-    validate header checksum
-    copy header to rds_ib_incoming struct if start of a new datagram
-    add to ibinc's fraglist
-    if competed datagram:
-      update cong map if datagram was cong update
-      call rds_recv_incoming() otherwise
-      note if ack is required
-  rds_recv_incoming()
-    drop duplicate packets
-    respond to pings
-    find the sock associated with this datagram
-    add to sock queue
-    wake up sock
-    do some congestion calculations
-  rds_recvmsg
-    copy data into user iovec
-    handle CMSGs
-    return to application
-
-Multipath RDS (mprds)
-=====================
-  Mprds is multipathed-RDS, primarily intended for RDS-over-TCP
-  (though the concept can be extended to other transports). The classical
-  implementation of RDS-over-TCP is implemented by demultiplexing multiple
-  PF_RDS sockets between any 2 endpoints (where endpoint == [IP address,
-  port]) over a single TCP socket between the 2 IP addresses involved. This
-  has the limitation that it ends up funneling multiple RDS flows over a
-  single TCP flow, thus it is
-  (a) upper-bounded to the single-flow bandwidth,
-  (b) suffers from head-of-line blocking for all the RDS sockets.
-
-  Better throughput (for a fixed small packet size, MTU) can be achieved
-  by having multiple TCP/IP flows per rds/tcp connection, i.e., multipathed
-  RDS (mprds).  Each such TCP/IP flow constitutes a path for the rds/tcp
-  connection. RDS sockets will be attached to a path based on some hash
-  (e.g., of local address and RDS port number) and packets for that RDS
-  socket will be sent over the attached path using TCP to segment/reassemble
-  RDS datagrams on that path.
-
-  Multipathed RDS is implemented by splitting the struct rds_connection into
-  a common (to all paths) part, and a per-path struct rds_conn_path. All
-  I/O workqs and reconnect threads are driven from the rds_conn_path.
-  Transports such as TCP that are multipath capable may then set up a
-  TCP socket per rds_conn_path, and this is managed by the transport via
-  the transport privatee cp_transport_data pointer.
-
-  Transports announce themselves as multipath capable by setting the
-  t_mp_capable bit during registration with the rds core module. When the
-  transport is multipath-capable, rds_sendmsg() hashes outgoing traffic
-  across multiple paths. The outgoing hash is computed based on the
-  local address and port that the PF_RDS socket is bound to.
-
-  Additionally, even if the transport is MP capable, we may be
-  peering with some node that does not support mprds, or supports
-  a different number of paths. As a result, the peering nodes need
-  to agree on the number of paths to be used for the connection.
-  This is done by sending out a control packet exchange before the
-  first data packet. The control packet exchange must have completed
-  prior to outgoing hash completion in rds_sendmsg() when the transport
-  is mutlipath capable.
-
-  The control packet is an RDS ping packet (i.e., packet to rds dest
-  port 0) with the ping packet having a rds extension header option  of
-  type RDS_EXTHDR_NPATHS, length 2 bytes, and the value is the
-  number of paths supported by the sender. The "probe" ping packet will
-  get sent from some reserved port, RDS_FLAG_PROBE_PORT (in <linux/rds.h>)
-  The receiver of a ping from RDS_FLAG_PROBE_PORT will thus immediately
-  be able to compute the min(sender_paths, rcvr_paths). The pong
-  sent in response to a probe-ping should contain the rcvr's npaths
-  when the rcvr is mprds-capable.
-
-  If the rcvr is not mprds-capable, the exthdr in the ping will be
-  ignored.  In this case the pong will not have any exthdrs, so the sender
-  of the probe-ping can default to single-path mprds.
-
diff --git a/MAINTAINERS b/MAINTAINERS
index 785f56e5f210..ea5dd3d1df9d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14219,7 +14219,7 @@ L:	linux-rdma@vger.kernel.org
 L:	rds-devel@oss.oracle.com (moderated for non-subscribers)
 S:	Supported
 W:	https://oss.oracle.com/projects/rds/
-F:	Documentation/networking/rds.txt
+F:	Documentation/networking/rds.rst
 F:	net/rds/
 
 RDT - RESOURCE ALLOCATION
-- 
cgit v1.2.3
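
For readers of the converted rds.rst: the mprds path negotiation it describes
(each side advertises its path count in the probe ping/pong, both settle on the
minimum, and a peer that sends no RDS_EXTHDR_NPATHS option is treated as
single-path) can be sketched in userspace C. This is an illustration only, not
the kernel implementation. The helper names agree_npaths() and pick_path() and
the hash itself are assumptions; the document only says "some hash" of the
local address and port.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of paths both peers agree on.
 * peer_npaths == 0 models a pong that carried no RDS_EXTHDR_NPATHS
 * extension header, i.e. a peer that is not mprds-capable.
 */
static unsigned int agree_npaths(unsigned int local_npaths,
				 unsigned int peer_npaths)
{
	if (local_npaths == 0 || peer_npaths == 0)
		return 1;	/* fall back to a single path */
	return local_npaths < peer_npaths ? local_npaths : peer_npaths;
}

/*
 * Hypothetical path selection from the bound local address and port.
 * The multiplicative mix is an arbitrary choice for illustration.
 */
static unsigned int pick_path(uint32_t laddr, uint16_t lport,
			      unsigned int npaths)
{
	uint32_t h = laddr ^ ((uint32_t)lport * 2654435761u);

	return npaths ? h % npaths : 0;
}
```

A sender would call agree_npaths() once the probe exchange has completed and
only then use pick_path() for outgoing traffic, which mirrors the ordering
requirement stated in the document.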


From 98661e0c579dbda0e0910185f752fddd95e2d29c Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Thu, 30 Apr 2020 18:04:20 +0200
Subject: docs: networking: convert regulatory.txt to ReST

- add SPDX header;
- adjust title markup;
- mark code blocks and literals as such;
- adjust identation, whitespaces and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/index.rst      |   1 +
 Documentation/networking/regulatory.rst | 209 ++++++++++++++++++++++++++++++++
 Documentation/networking/regulatory.txt | 204 -------------------------------
 MAINTAINERS                             |   2 +-
 4 files changed, 211 insertions(+), 205 deletions(-)
 create mode 100644 Documentation/networking/regulatory.rst
 delete mode 100644 Documentation/networking/regulatory.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index e63a2cb2e4cb..bc3b04a2edde 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -98,6 +98,7 @@ Contents:
    radiotap-headers
    ray_cs
    rds
+   regulatory
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/regulatory.rst b/Documentation/networking/regulatory.rst
new file mode 100644
index 000000000000..8701b91e81ee
--- /dev/null
+++ b/Documentation/networking/regulatory.rst
@@ -0,0 +1,209 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================================
+Linux wireless regulatory documentation
+=======================================
+
+This document gives a brief overview of how the Linux wireless
+regulatory infrastructure works.
+
+More up-to-date information can be obtained at the project's web page:
+
+http://wireless.kernel.org/en/developers/Regulatory
+
+Keeping regulatory domains in userspace
+---------------------------------------
+
+Due to the dynamic nature of regulatory domains we keep them
+in userspace and provide a framework for userspace to upload
+to the kernel one regulatory domain to be used as the central
+core regulatory domain all wireless devices should adhere to.
+
+How to get regulatory domains to the kernel
+-------------------------------------------
+
+When the regulatory domain is first set up, the kernel will request a
+database file (regulatory.db) containing all the regulatory rules. It
+will then use that database when it needs to look up the rules for a
+given country.
+
+How to get regulatory domains to the kernel (old CRDA solution)
+---------------------------------------------------------------
+
+Userspace gets a regulatory domain in the kernel by having
+a userspace agent build it and send it via nl80211. Only
+expected regulatory domains will be respected by the kernel.
+
+A currently available userspace agent which can accomplish this
+is CRDA - central regulatory domain agent. It is documented here:
+
+http://wireless.kernel.org/en/developers/Regulatory/CRDA
+
+Essentially the kernel will send a udev event when it knows
+it needs a new regulatory domain. A udev rule can be put in place
+to trigger crda to send the respective regulatory domain for a
+specific ISO/IEC 3166 alpha2.
+
+Below is an example udev rule which can be used::
+
+  # Example file, should be put in /etc/udev/rules.d/regulatory.rules
+  KERNEL=="regulatory*", ACTION=="change", SUBSYSTEM=="platform", RUN+="/sbin/crda"
+
+The alpha2 is passed as an environment variable under the variable COUNTRY.
+
+Who asks for regulatory domains?
+--------------------------------
+
+* Users
+
+Users can use iw:
+
+http://wireless.kernel.org/en/users/Documentation/iw
+
+An example::
+
+  # set regulatory domain to "Costa Rica"
+  iw reg set CR
+
+This will request the kernel to set the regulatory domain to
+the specified alpha2. The kernel in turn will then ask userspace
+to provide a regulatory domain for the alpha2 specified by the user
+by sending a uevent.
+
+* Wireless subsystems for Country Information elements
+
+The kernel will send a uevent to inform userspace a new
+regulatory domain is required. More on this to be added
+as its integration is added.
+
+* Drivers
+
+If drivers determine they need a specific regulatory domain
+set they can inform the wireless core using regulatory_hint().
+They have two options -- they either provide an alpha2 so that
+crda can provide back a regulatory domain for that country or
+they can build their own regulatory domain based on internal
+custom knowledge so the wireless core can respect it.
+
+*Most* drivers will rely on the first mechanism of providing a
+regulatory hint with an alpha2. For these drivers there is an additional
+check that can be used to ensure compliance based on custom EEPROM
+regulatory data. This additional check can be used by drivers by
+registering on its struct wiphy a reg_notifier() callback. This notifier
+is called when the core's regulatory domain has been changed. The driver
+can use this to review the changes made and also review who made them
+(driver, user, country IE) and determine what to allow based on its
+internal EEPROM data. Device drivers wishing to be capable of world
+roaming should use this callback. More on world roaming will be
+added to this document when its support is enabled.
+
+Device drivers that provide their own built-in regulatory domain
+do not need a callback as the channels registered by them are
+the only ones that will be allowed and therefore *additional*
+channels cannot be enabled.
+
+Example code - drivers hinting an alpha2:
+------------------------------------------
+
+This example comes from the zd1211rw device driver. You can start
+by having a mapping of your device's EEPROM country/regulatory
+domain value to a specific alpha2 as follows::
+
+  static struct zd_reg_alpha2_map reg_alpha2_map[] = {
+	{ ZD_REGDOMAIN_FCC, "US" },
+	{ ZD_REGDOMAIN_IC, "CA" },
+	{ ZD_REGDOMAIN_ETSI, "DE" }, /* Generic ETSI, use most restrictive */
+	{ ZD_REGDOMAIN_JAPAN, "JP" },
+	{ ZD_REGDOMAIN_JAPAN_ADD, "JP" },
+	{ ZD_REGDOMAIN_SPAIN, "ES" },
+	{ ZD_REGDOMAIN_FRANCE, "FR" },
+
+Then you can define a routine to map your read EEPROM value to an alpha2,
+as follows::
+
+  static int zd_reg2alpha2(u8 regdomain, char *alpha2)
+  {
+	unsigned int i;
+	struct zd_reg_alpha2_map *reg_map;
+	for (i = 0; i < ARRAY_SIZE(reg_alpha2_map); i++) {
+		reg_map = &reg_alpha2_map[i];
+		if (regdomain == reg_map->reg) {
+			alpha2[0] = reg_map->alpha2[0];
+			alpha2[1] = reg_map->alpha2[1];
+			return 0;
+		}
+	}
+	return 1;
+  }
+
+Lastly, you can then hint to the core of your discovered alpha2, if a match
+was found. You need to do this after you have registered your wiphy. You
+are expected to do this during initialization.
+
+::
+
+	r = zd_reg2alpha2(mac->regdomain, alpha2);
+	if (!r)
+		regulatory_hint(hw->wiphy, alpha2);
+
+Example code - drivers providing a built in regulatory domain:
+--------------------------------------------------------------
+
+[NOTE: This API is not currently available, it can be added when required]
+
+If you have regulatory information you can obtain from your
+driver and you *need* to use this we let you build a regulatory domain
+structure and pass it to the wireless core. To do this you should
+kmalloc() a structure big enough to hold your regulatory domain
+structure and you should then fill it with your data. Finally you simply
+call regulatory_hint() with the regulatory domain structure in it.
+
+Below is a simple example, with a regulatory domain cached using the stack.
+Your implementation may vary (read EEPROM cache instead, for example).
+
+Example cache of some regulatory domain::
+
+  struct ieee80211_regdomain mydriver_jp_regdom = {
+	.n_reg_rules = 3,
+	.alpha2 =  "JP",
+	//.alpha2 =  "99", /* If I have no alpha2 to map it to */
+	.reg_rules = {
+		/* IEEE 802.11b/g, channels 1..14 */
+		REG_RULE(2412-10, 2484+10, 40, 6, 20, 0),
+		/* IEEE 802.11a, channels 34..48 */
+		REG_RULE(5170-10, 5240+10, 40, 6, 20,
+			NL80211_RRF_NO_IR),
+		/* IEEE 802.11a, channels 52..64 */
+		REG_RULE(5260-10, 5320+10, 40, 6, 20,
+			NL80211_RRF_NO_IR|
+			NL80211_RRF_DFS),
+	}
+  };
+
+Then in some part of your code after your wiphy has been registered::
+
+	struct ieee80211_regdomain *rd;
+	int size_of_regd;
+	int num_rules = mydriver_jp_regdom.n_reg_rules;
+	unsigned int i;
+
+	size_of_regd = sizeof(struct ieee80211_regdomain) +
+		(num_rules * sizeof(struct ieee80211_reg_rule));
+
+	rd = kzalloc(size_of_regd, GFP_KERNEL);
+	if (!rd)
+		return -ENOMEM;
+
+	memcpy(rd, &mydriver_jp_regdom, sizeof(struct ieee80211_regdomain));
+
+	for (i=0; i < num_rules; i++)
+		memcpy(&rd->reg_rules[i],
+		       &mydriver_jp_regdom.reg_rules[i],
+		       sizeof(struct ieee80211_reg_rule));
+	regulatory_struct_hint(rd);
+
+Statically compiled regulatory database
+---------------------------------------
+
+When a database should be fixed into the kernel, it can be provided as a
+firmware file at build time that is then linked into the kernel.
diff --git a/Documentation/networking/regulatory.txt b/Documentation/networking/regulatory.txt
deleted file mode 100644
index 381e5b23d61d..000000000000
--- a/Documentation/networking/regulatory.txt
+++ /dev/null
@@ -1,204 +0,0 @@
-Linux wireless regulatory documentation
----------------------------------------
-
-This document gives a brief review over how the Linux wireless
-regulatory infrastructure works.
-
-More up to date information can be obtained at the project's web page:
-
-http://wireless.kernel.org/en/developers/Regulatory
-
-Keeping regulatory domains in userspace
----------------------------------------
-
-Due to the dynamic nature of regulatory domains we keep them
-in userspace and provide a framework for userspace to upload
-to the kernel one regulatory domain to be used as the central
-core regulatory domain all wireless devices should adhere to.
-
-How to get regulatory domains to the kernel
--------------------------------------------
-
-When the regulatory domain is first set up, the kernel will request a
-database file (regulatory.db) containing all the regulatory rules. It
-will then use that database when it needs to look up the rules for a
-given country.
-
-How to get regulatory domains to the kernel (old CRDA solution)
----------------------------------------------------------------
-
-Userspace gets a regulatory domain in the kernel by having
-a userspace agent build it and send it via nl80211. Only
-expected regulatory domains will be respected by the kernel.
-
-A currently available userspace agent which can accomplish this
-is CRDA - central regulatory domain agent. Its documented here:
-
-http://wireless.kernel.org/en/developers/Regulatory/CRDA
-
-Essentially the kernel will send a udev event when it knows
-it needs a new regulatory domain. A udev rule can be put in place
-to trigger crda to send the respective regulatory domain for a
-specific ISO/IEC 3166 alpha2.
-
-Below is an example udev rule which can be used:
-
-# Example file, should be put in /etc/udev/rules.d/regulatory.rules
-KERNEL=="regulatory*", ACTION=="change", SUBSYSTEM=="platform", RUN+="/sbin/crda"
-
-The alpha2 is passed as an environment variable under the variable COUNTRY.
-
-Who asks for regulatory domains?
---------------------------------
-
-* Users
-
-Users can use iw:
-
-http://wireless.kernel.org/en/users/Documentation/iw
-
-An example:
-
-  # set regulatory domain to "Costa Rica"
-  iw reg set CR
-
-This will request the kernel to set the regulatory domain to
-the specificied alpha2. The kernel in turn will then ask userspace
-to provide a regulatory domain for the alpha2 specified by the user
-by sending a uevent.
-
-* Wireless subsystems for Country Information elements
-
-The kernel will send a uevent to inform userspace a new
-regulatory domain is required. More on this to be added
-as its integration is added.
-
-* Drivers
-
-If drivers determine they need a specific regulatory domain
-set they can inform the wireless core using regulatory_hint().
-They have two options -- they either provide an alpha2 so that
-crda can provide back a regulatory domain for that country or
-they can build their own regulatory domain based on internal
-custom knowledge so the wireless core can respect it.
-
-*Most* drivers will rely on the first mechanism of providing a
-regulatory hint with an alpha2. For these drivers there is an additional
-check that can be used to ensure compliance based on custom EEPROM
-regulatory data. This additional check can be used by drivers by
-registering on its struct wiphy a reg_notifier() callback. This notifier
-is called when the core's regulatory domain has been changed. The driver
-can use this to review the changes made and also review who made them
-(driver, user, country IE) and determine what to allow based on its
-internal EEPROM data. Devices drivers wishing to be capable of world
-roaming should use this callback. More on world roaming will be
-added to this document when its support is enabled.
-
-Device drivers who provide their own built regulatory domain
-do not need a callback as the channels registered by them are
-the only ones that will be allowed and therefore *additional*
-channels cannot be enabled.
-
-Example code - drivers hinting an alpha2:
-------------------------------------------
-
-This example comes from the zd1211rw device driver. You can start
-by having a mapping of your device's EEPROM country/regulatory
-domain value to a specific alpha2 as follows:
-
-static struct zd_reg_alpha2_map reg_alpha2_map[] = {
-	{ ZD_REGDOMAIN_FCC, "US" },
-	{ ZD_REGDOMAIN_IC, "CA" },
-	{ ZD_REGDOMAIN_ETSI, "DE" }, /* Generic ETSI, use most restrictive */
-	{ ZD_REGDOMAIN_JAPAN, "JP" },
-	{ ZD_REGDOMAIN_JAPAN_ADD, "JP" },
-	{ ZD_REGDOMAIN_SPAIN, "ES" },
-	{ ZD_REGDOMAIN_FRANCE, "FR" },
-
-Then you can define a routine to map your read EEPROM value to an alpha2,
-as follows:
-
-static int zd_reg2alpha2(u8 regdomain, char *alpha2)
-{
-	unsigned int i;
-	struct zd_reg_alpha2_map *reg_map;
-		for (i = 0; i < ARRAY_SIZE(reg_alpha2_map); i++) {
-			reg_map = &reg_alpha2_map[i];
-			if (regdomain == reg_map->reg) {
-			alpha2[0] = reg_map->alpha2[0];
-			alpha2[1] = reg_map->alpha2[1];
-			return 0;
-		}
-	}
-	return 1;
-}
-
-Lastly, you can then hint to the core of your discovered alpha2, if a match
-was found. You need to do this after you have registered your wiphy. You
-are expected to do this during initialization.
-
-	r = zd_reg2alpha2(mac->regdomain, alpha2);
-	if (!r)
-		regulatory_hint(hw->wiphy, alpha2);
-
-Example code - drivers providing a built in regulatory domain:
---------------------------------------------------------------
-
-[NOTE: This API is not currently available, it can be added when required]
-
-If you have regulatory information you can obtain from your
-driver and you *need* to use this we let you build a regulatory domain
-structure and pass it to the wireless core. To do this you should
-kmalloc() a structure big enough to hold your regulatory domain
-structure and you should then fill it with your data. Finally you simply
-call regulatory_hint() with the regulatory domain structure in it.
-
-Bellow is a simple example, with a regulatory domain cached using the stack.
-Your implementation may vary (read EEPROM cache instead, for example).
-
-Example cache of some regulatory domain
-
-struct ieee80211_regdomain mydriver_jp_regdom = {
-	.n_reg_rules = 3,
-	.alpha2 =  "JP",
-	//.alpha2 =  "99", /* If I have no alpha2 to map it to */
-	.reg_rules = {
-		/* IEEE 802.11b/g, channels 1..14 */
-		REG_RULE(2412-10, 2484+10, 40, 6, 20, 0),
-		/* IEEE 802.11a, channels 34..48 */
-		REG_RULE(5170-10, 5240+10, 40, 6, 20,
-			NL80211_RRF_NO_IR),
-		/* IEEE 802.11a, channels 52..64 */
-		REG_RULE(5260-10, 5320+10, 40, 6, 20,
-			NL80211_RRF_NO_IR|
-			NL80211_RRF_DFS),
-	}
-};
-
-Then in some part of your code after your wiphy has been registered:
-
-	struct ieee80211_regdomain *rd;
-	int size_of_regd;
-	int num_rules = mydriver_jp_regdom.n_reg_rules;
-	unsigned int i;
-
-	size_of_regd = sizeof(struct ieee80211_regdomain) +
-		(num_rules * sizeof(struct ieee80211_reg_rule));
-
-	rd = kzalloc(size_of_regd, GFP_KERNEL);
-	if (!rd)
-		return -ENOMEM;
-
-	memcpy(rd, &mydriver_jp_regdom, sizeof(struct ieee80211_regdomain));
-
-	for (i=0; i < num_rules; i++)
-		memcpy(&rd->reg_rules[i],
-		       &mydriver_jp_regdom.reg_rules[i],
-		       sizeof(struct ieee80211_reg_rule));
-	regulatory_struct_hint(rd);
-
-Statically compiled regulatory database
----------------------------------------
-
-When a database should be fixed into the kernel, it can be provided as a
-firmware file at build time that is then linked into the kernel.
diff --git a/MAINTAINERS b/MAINTAINERS
index ea5dd3d1df9d..b28823ab48c5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -193,7 +193,7 @@ W:	https://wireless.wiki.kernel.org/
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211.git
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next.git
 F:	Documentation/driver-api/80211/cfg80211.rst
-F:	Documentation/networking/regulatory.txt
+F:	Documentation/networking/regulatory.rst
 F:	include/linux/ieee80211.h
 F:	include/net/cfg80211.h
 F:	include/net/ieee80211_radiotap.h
-- 
cgit v1.2.3
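
For readers of the converted regulatory.rst: the alpha2 hinting example in that
document shows its mapping table without a closing brace, so a complete,
compilable userspace sketch of the same lookup may be useful. The demo_* names
and the numeric EEPROM codes below are placeholders chosen for illustration;
the real ZD_REGDOMAIN_* values and struct zd_reg_alpha2_map live in the
zd1211rw driver and are not reproduced here. The alpha2 field is given a third
byte so the strings stay NUL-terminated, which is also an assumption of this
sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Placeholder EEPROM regulatory codes (not the driver's real values). */
enum {
	DEMO_REGDOMAIN_FCC = 1,
	DEMO_REGDOMAIN_ETSI = 2,
	DEMO_REGDOMAIN_JAPAN = 3,
};

struct demo_reg_alpha2_map {
	uint32_t reg;
	char alpha2[3];
};

static const struct demo_reg_alpha2_map demo_map[] = {
	{ DEMO_REGDOMAIN_FCC, "US" },
	{ DEMO_REGDOMAIN_ETSI, "DE" },	/* generic ETSI, most restrictive */
	{ DEMO_REGDOMAIN_JAPAN, "JP" },
};

/* Returns 0 and fills alpha2[0..1] on a match, 1 if the code is unknown. */
static int demo_reg2alpha2(uint8_t regdomain, char *alpha2)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(demo_map); i++) {
		if (demo_map[i].reg == regdomain) {
			alpha2[0] = demo_map[i].alpha2[0];
			alpha2[1] = demo_map[i].alpha2[1];
			return 0;
		}
	}
	return 1;
}
```

As in the document, a driver would pass the result to regulatory_hint() only
when the lookup returns 0, and only after its wiphy has been registered.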


From 9f72374cb5959556870be8078b128158edde5d3e Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Thu, 30 Apr 2020 18:04:21 +0200
Subject: docs: networking: convert rxrpc.txt to ReST

- add SPDX header;
- adjust title markup;
- use autonumbered list markups;
- mark code blocks and literals as such;
- mark tables as such;
- adjust identation, whitespaces and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/filesystems/afs.rst  |    2 +-
 Documentation/networking/index.rst |    1 +
 Documentation/networking/rxrpc.rst | 1169 ++++++++++++++++++++++++++++++++++++
 Documentation/networking/rxrpc.txt | 1155 -----------------------------------
 MAINTAINERS                        |    2 +-
 net/rxrpc/Kconfig                  |    6 +-
 net/rxrpc/sysctl.c                 |    2 +-
 7 files changed, 1176 insertions(+), 1161 deletions(-)
 create mode 100644 Documentation/networking/rxrpc.rst
 delete mode 100644 Documentation/networking/rxrpc.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/filesystems/afs.rst b/Documentation/filesystems/afs.rst
index c4ec39a5966e..cada9464d6bd 100644
--- a/Documentation/filesystems/afs.rst
+++ b/Documentation/filesystems/afs.rst
@@ -70,7 +70,7 @@ list of volume location server IP addresses::
 The first module is the AF_RXRPC network protocol driver.  This provides the
 RxRPC remote operation protocol and may also be accessed from userspace.  See:
 
-	Documentation/networking/rxrpc.txt
+	Documentation/networking/rxrpc.rst
 
 The second module is the kerberos RxRPC security driver, and the third module
 is the actual filesystem driver for the AFS filesystem.
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index bc3b04a2edde..cd307b9601fa 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -99,6 +99,7 @@ Contents:
    ray_cs
    rds
    regulatory
+   rxrpc
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/rxrpc.rst b/Documentation/networking/rxrpc.rst
new file mode 100644
index 000000000000..5ad35113d0f4
--- /dev/null
+++ b/Documentation/networking/rxrpc.rst
@@ -0,0 +1,1169 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================
+RxRPC Network Protocol
+======================
+
+The RxRPC protocol driver provides a reliable two-phase transport on top of UDP
+that can be used to perform RxRPC remote operations.  This is done over sockets
+of AF_RXRPC family, using sendmsg() and recvmsg() with control data to send and
+receive data, aborts and errors.
+
+Contents of this document:
+
+ (#) Overview.
+
+ (#) RxRPC protocol summary.
+
+ (#) AF_RXRPC driver model.
+
+ (#) Control messages.
+
+ (#) Socket options.
+
+ (#) Security.
+
+ (#) Example client usage.
+
+ (#) Example server usage.
+
+ (#) AF_RXRPC kernel interface.
+
+ (#) Configurable parameters.
+
+
+Overview
+========
+
+RxRPC is a two-layer protocol.  There is a session layer which provides
+reliable virtual connections using UDP over IPv4 (or IPv6) as the transport
+layer, but implements a real network protocol; and there's the presentation
+layer which renders structured data to binary blobs and back again using XDR
+(as does SunRPC)::
+
+		+-------------+
+		| Application |
+		+-------------+
+		|     XDR     |		Presentation
+		+-------------+
+		|    RxRPC    |		Session
+		+-------------+
+		|     UDP     |		Transport
+		+-------------+
+
+
+AF_RXRPC provides:
+
+ (1) Part of an RxRPC facility for both kernel and userspace applications by
+     making the session part of it a Linux network protocol (AF_RXRPC).
+
+ (2) A two-phase protocol.  The client transmits a blob (the request) and then
+     receives a blob (the reply), and the server receives the request and then
+     transmits the reply.
+
+ (3) Retention of the reusable bits of the transport system set up for one call
+     to speed up subsequent calls.
+
+ (4) A secure protocol, using the Linux kernel's key retention facility to
+     manage security on the client end.  The server end must of necessity be
+     more active in security negotiations.
+
+AF_RXRPC does not provide XDR marshalling/presentation facilities.  That is
+left to the application.  AF_RXRPC only deals in blobs.  Even the operation ID
+is just the first four bytes of the request blob, and as such is beyond the
+kernel's interest.
+
+
+Sockets of AF_RXRPC family are:
+
+ (1) created as type SOCK_DGRAM;
+
+ (2) provided with a protocol of the type of underlying transport they're going
+     to use - currently only PF_INET is supported.
+
+
+The Andrew File System (AFS) is an example of an application that uses this and
+that has both kernel (filesystem) and userspace (utility) components.
+
+
+RxRPC Protocol Summary
+======================
+
+An overview of the RxRPC protocol:
+
+ (#) RxRPC sits on top of another networking protocol (UDP is the only option
+     currently), and uses this to provide network transport.  UDP ports, for
+     example, provide transport endpoints.
+
+ (#) RxRPC supports multiple virtual "connections" from any given transport
+     endpoint, thus allowing the endpoints to be shared, even to the same
+     remote endpoint.
+
+ (#) Each connection goes to a particular "service".  A connection may not go
+     to multiple services.  A service may be considered the RxRPC equivalent of
+     a port number.  AF_RXRPC permits multiple services to share an endpoint.
+
+ (#) Client-originating packets are marked, thus a transport endpoint can be
+     shared between client and server connections (connections have a
+     direction).
+
+ (#) Up to a billion connections may be supported concurrently between one
+     local transport endpoint and one service on one remote endpoint.  An RxRPC
+     connection is described by seven numbers::
+
+	Local address	}
+	Local port	} Transport (UDP) address
+	Remote address	}
+	Remote port	}
+	Direction
+	Connection ID
+	Service ID
+
+ (#) Each RxRPC operation is a "call".  A connection may make up to four
+     billion calls, but only up to four calls may be in progress on a
+     connection at any one time.
+
+ (#) Calls are two-phase and asymmetric: the client sends its request data,
+     which the service receives; then the service transmits the reply data
+     which the client receives.
+
+ (#) The data blobs are of indefinite size; the end of a phase is marked with a
+     flag in the packet.  The number of packets of data making up one blob may
+     not exceed 4 billion, however, as this would cause the sequence number to
+     wrap.
+
+ (#) The first four bytes of the request data are the service operation ID.
+
+ (#) Security is negotiated on a per-connection basis.  The connection is
+     initiated by the first data packet on it arriving.  If security is
+     requested, the server then issues a "challenge" and then the client
+     replies with a "response".  If the response is successful, the security is
+     set for the lifetime of that connection, and all subsequent calls made
+     upon it use that same security.  In the event that the server lets a
+     connection lapse before the client, the security will be renegotiated if
+     the client uses the connection again.
+
+ (#) Calls use ACK packets to handle reliability.  Data packets are also
+     explicitly sequenced per call.
+
+ (#) There are two types of positive acknowledgment: hard-ACKs and soft-ACKs.
+     A hard-ACK indicates to the far side that all the data received to a point
+     has been received and processed; a soft-ACK indicates that the data has
+     been received but may yet be discarded and re-requested.  The sender may
+     not discard any transmittable packets until they've been hard-ACK'd.
+
+ (#) Reception of a reply data packet implicitly hard-ACK's all the data
+     packets that make up the request.
+
+ (#) A call is complete when the request has been sent, the reply has been
+     received and the final hard-ACK on the last packet of the reply has
+     reached the server.
+
+ (#) A call may be aborted by either end at any time up to its completion.
+
+
+AF_RXRPC Driver Model
+=====================
+
+About the AF_RXRPC driver:
+
+ (#) The AF_RXRPC protocol transparently uses internal sockets of the transport
+     protocol to represent transport endpoints.
+
+ (#) AF_RXRPC sockets map onto RxRPC connection bundles.  Actual RxRPC
+     connections are handled transparently.  One client socket may be used to
+     make multiple simultaneous calls to the same service.  One server socket
+     may handle calls from many clients.
+
+ (#) Additional parallel client connections will be initiated to support extra
+     concurrent calls, up to a tunable limit.
+
+ (#) Each connection is retained for a certain amount of time [tunable] after
+     the last call currently using it has completed in case a new call is made
+     that could reuse it.
+
+ (#) Each internal UDP socket is retained [tunable] for a certain amount of
+     time [tunable] after the last connection using it is discarded, in case a new
+     connection is made that could use it.
+
+ (#) A client-side connection is only shared between calls if they have
+     the same key struct describing their security (and assuming the calls
+     would otherwise share the connection).  Non-secured calls would also be
+     able to share connections with each other.
+
+ (#) A server-side connection is shared if the client says it is.
+
+ (#) ACK'ing is handled by the protocol driver automatically, including ping
+     replying.
+
+ (#) SO_KEEPALIVE automatically pings the other side to keep the connection
+     alive [TODO].
+
+ (#) If an ICMP error is received, all calls affected by that error will be
+     aborted with an appropriate network error passed through recvmsg().
+
+
+Interaction with the user of the RxRPC socket:
+
+ (#) A socket is made into a server socket by binding an address with a
+     non-zero service ID.
+
+ (#) In the client, sending a request is achieved with one or more sendmsgs,
+     followed by the reply being received with one or more recvmsgs.
+
+ (#) The first sendmsg for a request to be sent from a client contains a tag to
+     be used in all other sendmsgs or recvmsgs associated with that call.  The
+     tag is carried in the control data.
+
+ (#) connect() is used to supply a default destination address for a client
+     socket.  This may be overridden by supplying an alternate address to the
+     first sendmsg() of a call (struct msghdr::msg_name).
+
+ (#) If connect() is called on an unbound client, a random local port will
+     be bound before the operation takes place.
+
+ (#) A server socket may also be used to make client calls.  To do this, the
+     first sendmsg() of the call must specify the target address.  The server's
+     transport endpoint is used to send the packets.
+
+ (#) Once the application has received the last message associated with a call,
+     the tag is guaranteed not to be seen again, and so it can be used to pin
+     client resources.  A new call can then be initiated with the same tag
+     without fear of interference.
+
+ (#) In the server, a request is received with one or more recvmsgs, then the
+     reply is transmitted with one or more sendmsgs, and then the final ACK
+     is received with a last recvmsg.
+
+ (#) When sending data for a call, sendmsg is given MSG_MORE if there's more
+     data to come on that call.
+
+ (#) When receiving data for a call, recvmsg flags MSG_MORE if there's more
+     data to come for that call.
+
+ (#) When receiving data or messages for a call, MSG_EOR is flagged by recvmsg
+     to indicate the terminal message for that call.
+
+ (#) A call may be aborted by adding an abort control message to the control
+     data.  Issuing an abort terminates the kernel's use of that call's tag.
+     Any messages waiting in the receive queue for that call will be discarded.
+
+ (#) Aborts, busy notifications and challenge packets are delivered by recvmsg,
+     and control data messages will be set to indicate the context.  Receiving
+     an abort or a busy message terminates the kernel's use of that call's tag.
+
+ (#) The control data part of the msghdr struct is used for a number of things:
+
+     (#) The tag of the intended or affected call.
+
+     (#) Sending or receiving errors, aborts and busy notifications.
+
+     (#) Notifications of incoming calls.
+
+     (#) Sending debug requests and receiving debug replies [TODO].
+
+ (#) When the kernel has received and set up an incoming call, it sends a
+     message to the server application to let it know there's a new call
+     awaiting its acceptance [recvmsg reports a special control message].  The
+     server application then uses sendmsg to assign a tag to the new call.
+     Once that is done, the first part of the request data will be delivered
+     by recvmsg.
+
+ (#) The server application has to provide the server socket with a keyring of
+     secret keys corresponding to the security types it permits.  When a secure
+     connection is being set up, the kernel looks up the appropriate secret key
+     in the keyring and then sends a challenge packet to the client and
+     receives a response packet.  The kernel then checks the authorisation of
+     the packet and either aborts the connection or sets up the security.
+
+ (#) The name of the key a client will use to secure its communications is
+     nominated by a socket option.
+
+
+Notes on sendmsg:
+
+ (#) MSG_WAITALL can be set to tell sendmsg to ignore signals if the peer is
+     making progress at accepting packets within a reasonable time such that we
+     manage to queue up all the data for transmission.  This requires the
+     client to accept at least one packet per 2*RTT time period.
+
+     If this isn't set, sendmsg() will return immediately, either returning
+     EINTR/ERESTARTSYS if nothing was consumed or returning the amount of data
+     consumed.
+
+
+Notes on recvmsg:
+
+ (#) If there's a sequence of data messages belonging to a particular call on
+     the receive queue, then recvmsg will keep working through them until:
+
+     (a) it meets the end of that call's received data,
+
+     (b) it meets a non-data message,
+
+     (c) it meets a message belonging to a different call, or
+
+     (d) it fills the user buffer.
+
+     If recvmsg is called in blocking mode, it will keep sleeping, awaiting the
+     reception of further data, until one of the above four conditions is met.
+
+ (#) MSG_PEEK operates similarly, but will return immediately if it has put any
+     data in the buffer rather than sleeping until it can fill the buffer.
+
+ (#) If a data message is only partially consumed in filling a user buffer,
+     then the remainder of that message will be left on the front of the queue
+     for the next taker.  MSG_TRUNC will never be flagged.
+
+ (#) If there is more data to be had on a call (it hasn't copied the last byte
+     of the last data message in that phase yet), then MSG_MORE will be
+     flagged.
+
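+The flag rules described in the notes above can be condensed into a small
+helper.  This is only a sketch: the enum and function names are hypothetical
+and not part of AF_RXRPC, and it assumes the caller passes in the
+msghdr::msg_flags value filled in by recvmsg().
+
```c
#include <assert.h>
#include <sys/socket.h>

/* Hypothetical names for the three situations a reader can be in. */
enum rx_progress {
	RX_MORE_TO_COME,	/* MSG_MORE: more data to come in this phase */
	RX_PHASE_COMPLETE,	/* neither flag: this phase's data is complete */
	RX_CALL_TERMINATED,	/* MSG_EOR: terminal message for the call */
};

/* Classify the msg_flags value reported by recvmsg() for one message. */
static enum rx_progress rx_classify(int msg_flags)
{
	/* The notes say MSG_TRUNC is never flagged, so treat it as a bug. */
	assert(!(msg_flags & MSG_TRUNC));

	if (msg_flags & MSG_EOR)
		return RX_CALL_TERMINATED;
	if (msg_flags & MSG_MORE)
		return RX_MORE_TO_COME;
	return RX_PHASE_COMPLETE;
}
```
+
+A receive loop would typically keep calling recvmsg() while this returns
+RX_MORE_TO_COME and release the call's tag once it sees RX_CALL_TERMINATED.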
+
+Control Messages
+================
+
+AF_RXRPC makes use of control messages in sendmsg() and recvmsg() to multiplex
+calls, to invoke certain actions and to report certain conditions.  These are:
+
+	=======================	=== ===========	===============================
+	MESSAGE ID		SRT DATA	MEANING
+	=======================	=== ===========	===============================
+	RXRPC_USER_CALL_ID	sr- User ID	App's call specifier
+	RXRPC_ABORT		srt Abort code	Abort code to issue/received
+	RXRPC_ACK		-rt n/a		Final ACK received
+	RXRPC_NET_ERROR		-rt error num	Network error on call
+	RXRPC_BUSY		-rt n/a		Call rejected (server busy)
+	RXRPC_LOCAL_ERROR	-rt error num	Local error encountered
+	RXRPC_NEW_CALL		-r- n/a		New call received
+	RXRPC_ACCEPT		s-- n/a		Accept new call
+	RXRPC_EXCLUSIVE_CALL	s-- n/a		Make an exclusive client call
+	RXRPC_UPGRADE_SERVICE	s-- n/a		Client call can be upgraded
+	RXRPC_TX_LENGTH		s-- data len	Total length of Tx data
+	=======================	=== ===========	===============================
+
+	(SRT = usable in Sendmsg / delivered by Recvmsg / Terminal message)
+
+ (#) RXRPC_USER_CALL_ID
+
+     This is used to indicate the application's call ID.  It's an unsigned long
+     that the app specifies in the client by attaching it to the first data
+     message or in the server by passing it in association with an RXRPC_ACCEPT
+     message.  recvmsg() passes it in conjunction with all messages except
+     RXRPC_NEW_CALL messages.
+
+ (#) RXRPC_ABORT
+
+     This can be used by an application to abort a call by passing it to
+     sendmsg, or it can be delivered by recvmsg to indicate a remote abort was
+     received.  Either way, it must be associated with an RXRPC_USER_CALL_ID to
+     specify the call affected.  If an abort is being sent, then error EBADSLT
+     will be returned if there is no call with that user ID.
+
+ (#) RXRPC_ACK
+
+     This is delivered to a server application to indicate that the final ACK
+     of a call was received from the client.  It will be associated with an
+     RXRPC_USER_CALL_ID to indicate the call that's now complete.
+
+ (#) RXRPC_NET_ERROR
+
+     This is delivered to an application to indicate that an ICMP error message
+     was encountered in the process of trying to talk to the peer.  An
+     errno-class integer value will be included in the control message data
+     indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call
+     affected.
+
+ (#) RXRPC_BUSY
+
+     This is delivered to a client application to indicate that a call was
+     rejected by the server due to the server being busy.  It will be
+     associated with an RXRPC_USER_CALL_ID to indicate the rejected call.
+
+ (#) RXRPC_LOCAL_ERROR
+
+     This is delivered to an application to indicate that a local error was
+     encountered and that a call has been aborted because of it.  An
+     errno-class integer value will be included in the control message data
+     indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call
+     affected.
+
+ (#) RXRPC_NEW_CALL
+
+     This is delivered to indicate to a server application that a new call has
+     arrived and is awaiting acceptance.  No user ID is associated with this,
+     as a user ID must subsequently be assigned by doing an RXRPC_ACCEPT.
+
+ (#) RXRPC_ACCEPT
+
+     This is used by a server application to attempt to accept a call and
+     assign it a user ID.  It should be associated with an RXRPC_USER_CALL_ID
+     to indicate the user ID to be assigned.  If there is no call to be
+     accepted (it may have timed out, been aborted, etc.), then sendmsg will
+     return error ENODATA.  If the user ID is already in use by another call,
+     then error EBADSLT will be returned.
+
+ (#) RXRPC_EXCLUSIVE_CALL
+
+     This is used to indicate that a client call should be made on a one-off
+     connection.  The connection is discarded once the call has terminated.
+
+ (#) RXRPC_UPGRADE_SERVICE
+
+     This is used to make a client call to probe if the specified service ID
+     may be upgraded by the server.  The caller must check msg_name returned to
+     recvmsg() for the service ID actually in use.  The operation probed must
+     be one that takes the same arguments in both services.
+
+     Once this has been used to establish the upgrade capability (or lack
+     thereof) of the server, the service ID returned should be used for all
+     future communication to that server and RXRPC_UPGRADE_SERVICE should no
+     longer be set.
+
+ (#) RXRPC_TX_LENGTH
+
+     This is used to inform the kernel of the total amount of data that is
+     going to be transmitted by a call (whether in a client request or a
+     service response).  If given, it allows the kernel to encrypt from the
+     userspace buffer directly to the packet buffers, rather than copying into
+     the buffer and then encrypting in place.  This may only be given with the
+     first sendmsg() providing data for a call.  EMSGSIZE will be generated if
+     the amount of data actually given is different.
+
+     This takes a parameter of __s64 type that indicates how much will be
+     transmitted.  This may not be less than zero.
+
+The symbol RXRPC__SUPPORTED is defined as one more than the highest control
+message type supported.  At run time this can be queried by means of the
+RXRPC_SUPPORTED_CMSG socket option (see below).
+
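+Since every call is addressed through control messages, it may help to see how
+a control buffer could be assembled.  The helper below is a minimal sketch
+using only the standard CMSG_*() macros; append_cmsg() is a hypothetical name,
+and the level and type are passed in by the caller (for AF_RXRPC these would
+be SOL_RXRPC and one of the message IDs in the table above).
+
```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Append one control message to a caller-supplied control buffer.
 * buf must be suitably aligned for struct cmsghdr.  used is the number
 * of bytes already occupied; the new total is returned, or 0 if the
 * message would not fit.
 */
static size_t append_cmsg(void *buf, size_t buflen, size_t used,
			  int level, int type,
			  const void *data, size_t datalen)
{
	struct cmsghdr *cmsg;
	size_t space = CMSG_SPACE(datalen);

	assert(buf != NULL);
	if (used > buflen || space > buflen - used)
		return 0;

	cmsg = (struct cmsghdr *)((char *)buf + used);
	memset(cmsg, 0, space);		/* also clears alignment padding */
	cmsg->cmsg_level = level;
	cmsg->cmsg_type = type;
	cmsg->cmsg_len = CMSG_LEN(datalen);
	if (datalen)
		memcpy(CMSG_DATA(cmsg), data, datalen);
	return used + space;
}
```
+
+The returned total would be stored in msghdr::msg_controllen.  A dataless
+message such as RXRPC_ACCEPT is appended with data set to NULL and datalen
+set to 0.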
+
+Socket Options
+==============
+
+AF_RXRPC sockets support a few socket options at the SOL_RXRPC level:
+
+ (#) RXRPC_SECURITY_KEY
+
+     This is used to specify the description of the key to be used.  The key is
+     extracted from the calling process's keyrings with request_key() and
+     should be of "rxrpc" type.
+
+     The optval pointer points to the description string, and optlen indicates
+     how long the string is, without the NUL terminator.
+
+ (#) RXRPC_SECURITY_KEYRING
+
+     Similar to above but specifies a keyring of server secret keys to use (key
+     type "keyring").  See the "Security" section.
+
+ (#) RXRPC_EXCLUSIVE_CONNECTION
+
+     This is used to request that new connections should be used for each call
+     made subsequently on this socket.  optval should be NULL and optlen 0.
+
+ (#) RXRPC_MIN_SECURITY_LEVEL
+
+     This is used to specify the minimum security level required for calls on
+     this socket.  optval must point to an int containing one of the following
+     values:
+
+     (a) RXRPC_SECURITY_PLAIN
+
+	 Encrypted checksum only.
+
+     (b) RXRPC_SECURITY_AUTH
+
+	 Encrypted checksum plus packet padded and first eight bytes of packet
+	 encrypted - which includes the actual packet length.
+
+     (c) RXRPC_SECURITY_ENCRYPTED
+
+	 Encrypted checksum plus entire packet padded and encrypted, including
+	 actual packet length.
+
+ (#) RXRPC_UPGRADEABLE_SERVICE
+
+     This is used to indicate that a service socket with two bindings may
+     upgrade one bound service to the other if requested by the client.  optval
+     must point to an array of two unsigned short ints.  The first is the
+     service ID to upgrade from and the second the service ID to upgrade to.
+
+ (#) RXRPC_SUPPORTED_CMSG
+
+     This is a read-only option that writes an int into the buffer indicating
+     the highest control message type supported.
+
+
+Security
+========
+
+Currently, only the Kerberos 4 equivalent protocol has been implemented
+(security index 2 - rxkad).  This requires the rxkad module to be loaded and,
+on the client, tickets of the appropriate type to be obtained from the AFS
+kaserver or the Kerberos server and installed as "rxrpc" type keys.  This is
+normally done using the klog program.  An example simple klog program can be
+found at:
+
+	http://people.redhat.com/~dhowells/rxrpc/klog.c
+
+The payload provided to add_key() on the client should be of the following
+form::
+
+	struct rxrpc_key_sec2_v1 {
+		uint16_t	security_index;	/* 2 */
+		uint16_t	ticket_length;	/* length of ticket[] */
+		uint32_t	expiry;		/* time at which expires */
+		uint8_t		kvno;		/* key version number */
+		uint8_t		__pad[3];
+		uint8_t		session_key[8];	/* DES session key */
+		uint8_t		ticket[0];	/* the encrypted ticket */
+	};
+
+Where the ticket blob is just appended to the above structure.
+
+
+For the server, keys of type "rxrpc_s" must be made available to the server.
+They have a description of "<serviceID>:<securityIndex>" (e.g. "52:2" for an
+rxkad key for the AFS VL service).  When such a key is created, it should be
+given the server's secret key as the instantiation data, as in this example::
+
+	add_key("rxrpc_s", "52:2", secret_key, 8, keyring);
+
+A keyring is passed to the server socket by naming it in a sockopt.  The server
+socket then looks the server secret keys up in this keyring when secure
+incoming connections are made.  This can be seen in an example program that can
+be found at:
+
+	http://people.redhat.com/~dhowells/rxrpc/listen.c
+
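+As an illustration of the client key payload described above, the sketch below
+allocates an rxrpc_key_sec2_v1 header with the ticket blob appended.  The
+function name build_rxkad_payload() is hypothetical, the struct is copied from
+the text using a C99 flexible array member, and storing the integer fields in
+host byte order is an assumption rather than something stated here.
+
```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Layout copied from the text, with ticket[] as a flexible array member. */
struct rxrpc_key_sec2_v1 {
	uint16_t	security_index;	/* 2 */
	uint16_t	ticket_length;	/* length of ticket[] */
	uint32_t	expiry;		/* time at which expires */
	uint8_t		kvno;		/* key version number */
	uint8_t		__pad[3];
	uint8_t		session_key[8];	/* DES session key */
	uint8_t		ticket[];	/* the encrypted ticket */
};

/*
 * Build a payload with the ticket appended.  Returns a malloc'd buffer
 * (to be released with free()) and stores its total size in *payload_len,
 * or returns NULL if allocation fails.
 */
static struct rxrpc_key_sec2_v1 *
build_rxkad_payload(uint32_t expiry, uint8_t kvno,
		    const uint8_t session_key[8],
		    const void *ticket, uint16_t ticket_len,
		    size_t *payload_len)
{
	struct rxrpc_key_sec2_v1 *p;
	size_t len = sizeof(*p) + ticket_len;

	assert(session_key != NULL && payload_len != NULL);
	p = calloc(1, len);
	if (!p)
		return NULL;

	p->security_index = 2;		/* rxkad */
	p->ticket_length = ticket_len;
	p->expiry = expiry;
	p->kvno = kvno;
	memcpy(p->session_key, session_key, 8);
	if (ticket_len)
		memcpy(p->ticket, ticket, ticket_len);
	*payload_len = len;
	return p;
}
```
+
+The buffer and length returned would then be handed to add_key() with the
+"rxrpc" key type, in the same way the server example passes its secret key.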
+
+Example Client Usage
+====================
+
+A client would issue an operation by:
+
+ (1) An RxRPC socket is set up by::
+
+	client = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
+
+     Where the third parameter indicates the protocol family of the transport
+     socket used - usually IPv4 but it can also be IPv6 [TODO].
+
+ (2) A local address can optionally be bound::
+
+	struct sockaddr_rxrpc srx = {
+		.srx_family	= AF_RXRPC,
+		.srx_service	= 0,  /* we're a client */
+		.transport_type	= SOCK_DGRAM,	/* type of transport socket */
+		.transport_len	= sizeof(struct sockaddr_in),
+		.transport.sin.sin_family	= AF_INET,
+		.transport.sin.sin_port		= htons(7000), /* AFS callback */
+		.transport.sin.sin_addr.s_addr	= htonl(INADDR_ANY), /* all local interfaces */
+	};
+	bind(client, (struct sockaddr *)&srx, sizeof(srx));
+
+     This specifies the local UDP port to be used.  If not given, a random
+     non-privileged port will be used.  A UDP port may be shared between
+     several unrelated RxRPC sockets.  Security is handled separately for each
+     RxRPC virtual connection.
+
+ (3) The security is set::
+
+	const char *key = "AFS:cambridge.redhat.com";
+	setsockopt(client, SOL_RXRPC, RXRPC_SECURITY_KEY, key, strlen(key));
+
+     This issues a request_key() to get the key representing the security
+     context.  The minimum security level can be set::
+
+	unsigned int sec = RXRPC_SECURITY_ENCRYPTED;
+	setsockopt(client, SOL_RXRPC, RXRPC_MIN_SECURITY_LEVEL,
+		   &sec, sizeof(sec));
+
+ (4) The server to be contacted can then be specified (alternatively this can
+     be done through sendmsg)::
+
+	struct sockaddr_rxrpc srx = {
+		.srx_family	= AF_RXRPC,
+		.srx_service	= VL_SERVICE_ID,
+		.transport_type	= SOCK_DGRAM,	/* type of transport socket */
+		.transport_len	= sizeof(struct sockaddr_in),
+		.transport.sin.sin_family	= AF_INET,
+		.transport.sin.sin_port		= htons(7005), /* AFS volume manager */
+		.transport.sin.sin_addr.s_addr	= ...,
+	};
+	connect(client, (struct sockaddr *)&srx, sizeof(srx));
+
+ (5) The request data should then be posted to the client socket using a series
+     of sendmsg() calls, each with the following control message attached:
+
+	==================	===================================
+	RXRPC_USER_CALL_ID	specifies the user ID for this call
+	==================	===================================
+
+     MSG_MORE should be set in msghdr::msg_flags on all but the last part of
+     the request.  Multiple requests may be made simultaneously.
+
+     An RXRPC_TX_LENGTH control message can also be specified on the first
+     sendmsg() call.
+
+     If a call is intended to go to a destination other than the default
+     specified through connect(), then msghdr::msg_name should be set on the
+     first request message of that call.
+
+ (6) The reply data will then be posted to the client socket for recvmsg() to
+     pick up.  MSG_MORE will be flagged by recvmsg() if there's more reply data
+     for a particular call to be read.  MSG_EOR will be set on the terminal
+     read for a call.
+
+     All data will be delivered with the following control message attached:
+
+	==================	===================================
+	RXRPC_USER_CALL_ID	specifies the user ID for this call
+	==================	===================================
+
+     If an abort or error occurred, this will be returned in the control data
+     buffer instead, and MSG_EOR will be flagged to indicate the end of that
+     call.
+
+A client may ask for a service ID it knows and ask that this be upgraded to a
+better service if one is available by supplying RXRPC_UPGRADE_SERVICE on the
+first sendmsg() of a call.  The client should then check srx_service in the
+msg_name filled in by recvmsg() when collecting the result.  srx_service will
+hold the same value as given to sendmsg() if the upgrade request was ignored by
+the service - otherwise it will be altered to indicate the service ID the
+server upgraded to.  Note that the upgraded service ID is chosen by the server.
+The caller has to wait until it sees the service ID in the reply before sending
+any more calls (further calls to the same destination will be blocked until the
+probe is concluded).
+
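+The rule in step (5), MSG_MORE on every part of the request except the last,
+can be sketched independently of the socket itself.  In the example below
+send_in_chunks(), chunk_send_fn and count_chunk() are hypothetical names.  A
+real callback would presumably wrap sendmsg() and attach the
+RXRPC_USER_CALL_ID control message to each part; the counting callback here
+merely stands in for it.
+
```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Callback that transmits one part of a request; returns < 0 on error. */
typedef int (*chunk_send_fn)(void *ctx, const void *data, size_t len,
			     int flags);

/*
 * Split a request into parts of at most chunk bytes, flagging MSG_MORE
 * on all but the last.  A zero-length request is sent as a single empty
 * part with no MSG_MORE.
 */
static int send_in_chunks(chunk_send_fn send_chunk, void *ctx,
			  const void *data, size_t total, size_t chunk)
{
	const char *p = data;
	size_t off = 0;

	assert(send_chunk != NULL && chunk > 0);
	do {
		size_t n = total - off < chunk ? total - off : chunk;
		int flags = (off + n < total) ? MSG_MORE : 0;
		int ret = send_chunk(ctx, p + off, n, flags);

		if (ret < 0)
			return ret;
		off += n;
	} while (off < total);
	return 0;
}

/* Demonstration callback: counts the parts and how many had MSG_MORE. */
struct chunk_stats {
	int calls;
	int with_more;
};

static int count_chunk(void *ctx, const void *data, size_t len, int flags)
{
	struct chunk_stats *st = ctx;

	(void)data;
	(void)len;
	st->calls++;
	if (flags & MSG_MORE)
		st->with_more++;
	return 0;
}
```
+
+Sending a 10-byte request in 4-byte parts through count_chunk() results in
+three calls, of which the first two carry MSG_MORE.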
+
+Example Server Usage
+====================
+
+A server would be set up to accept operations in the following manner:
+
+ (1) An RxRPC socket is created by::
+
+	server = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
+
+     Where the third parameter indicates the address type of the transport
+     socket used - usually IPv4.
+
+ (2) Security is set up if desired by giving the socket a keyring with server
+     secret keys in it::
+
+	keyring = add_key("keyring", "AFSkeys", NULL, 0,
+			  KEY_SPEC_PROCESS_KEYRING);
+
+	const char secret_key[8] = {
+		0xa7, 0x83, 0x8a, 0xcb, 0xc7, 0x83, 0xec, 0x94 };
+	add_key("rxrpc_s", "52:2", secret_key, 8, keyring);
+
+	setsockopt(server, SOL_RXRPC, RXRPC_SECURITY_KEYRING, "AFSkeys", 7);
+
+     The keyring can be manipulated after it has been given to the socket. This
+     permits the server to add more keys, replace keys, etc. while it is live.
+
+ (3) A local address must then be bound::
+
+	struct sockaddr_rxrpc srx = {
+		.srx_family	= AF_RXRPC,
+		.srx_service	= VL_SERVICE_ID, /* RxRPC service ID */
+		.transport_type	= SOCK_DGRAM,	/* type of transport socket */
+		.transport_len	= sizeof(struct sockaddr_in),
+		.transport.sin.sin_family	= AF_INET,
+		.transport.sin.sin_port		= htons(7000), /* AFS callback */
+		.transport.sin.sin_addr.s_addr	= htonl(INADDR_ANY), /* all local interfaces */
+	};
+	bind(server, (struct sockaddr *)&srx, sizeof(srx));
+
+     More than one service ID may be bound to a socket, provided the transport
+     parameters are the same.  The limit is currently two.  To do this, bind()
+     should be called twice.
+
+ (4) If service upgrading is required, first two service IDs must have been
+     bound and then the following option must be set::
+
+	unsigned short service_ids[2] = { from_ID, to_ID };
+	setsockopt(server, SOL_RXRPC, RXRPC_UPGRADEABLE_SERVICE,
+		   service_ids, sizeof(service_ids));
+
+     This will automatically upgrade connections on service from_ID to service
+     to_ID if they request it.  This will be reflected in msg_name obtained
+     through recvmsg() when the request data is delivered to userspace.
+
+ (5) The server is then set to listen out for incoming calls::
+
+	listen(server, 100);
+
+ (6) The kernel notifies the server of pending incoming connections by sending
+     it a message for each.  This is received with recvmsg() on the server
+     socket.  It has no data, and has a single dataless control message
+     attached::
+
+	RXRPC_NEW_CALL
+
+     The address that can be passed back by recvmsg() at this point should be
+     ignored since the call for which the message was posted may have gone by
+     the time it is accepted - in which case the first call still on the queue
+     will be accepted.
+
+ (7) The server then accepts the new call by issuing a sendmsg() with two
+     pieces of control data and no actual data:
+
+	==================	==============================
+	RXRPC_ACCEPT		indicate connection acceptance
+	RXRPC_USER_CALL_ID	specify user ID for this call
+	==================	==============================
+
+ (8) The first request data packet will then be posted to the server socket for
+     recvmsg() to pick up.  At that point, the RxRPC address for the call can
+     be read from the address fields in the msghdr struct.
+
+     Subsequent request data will be posted to the server socket for recvmsg()
+     to collect as it arrives.  All but the last piece of the request data will
+     be delivered with MSG_MORE flagged.
+
+     All data will be delivered with the following control message attached:
+
+
+	==================	===================================
+	RXRPC_USER_CALL_ID	specifies the user ID for this call
+	==================	===================================
+
+ (9) The reply data should then be posted to the server socket using a series
+     of sendmsg() calls, each with the following control messages attached:
+
+	==================	===================================
+	RXRPC_USER_CALL_ID	specifies the user ID for this call
+	==================	===================================
+
+     MSG_MORE should be set in msghdr::msg_flags on all but the last message
+     for a particular call.
+
+(10) The final ACK from the client will be posted for retrieval by recvmsg()
+     when it is received.  It will take the form of a dataless message with two
+     control messages attached:
+
+	==================	===================================
+	RXRPC_USER_CALL_ID	specifies the user ID for this call
+	RXRPC_ACK		indicates final ACK (no data)
+	==================	===================================
+
+     MSG_EOR will be flagged to indicate that this is the final message for
+     this call.
+
+(11) Up to the point the final packet of reply data is sent, the call can be
+     aborted by calling sendmsg() with a dataless message with the following
+     control messages attached:
+
+	==================	===================================
+	RXRPC_USER_CALL_ID	specifies the user ID for this call
+	RXRPC_ABORT		indicates abort code (4 byte data)
+	==================	===================================
+
+     Any packets waiting in the socket's receive queue will be discarded if
+     this is issued.
+
+Note that all the communications for a particular service take place through
+the one server socket, using control messages on sendmsg() and recvmsg() to
+determine the call affected.
+
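+Because one server socket carries every call, a server has to inspect the
+control messages returned by recvmsg() to work out what arrived.  The sketch
+below looks up one control message by level and type using the standard
+CMSG_*() macros.  find_cmsg() is a hypothetical helper; for AF_RXRPC the
+caller would pass SOL_RXRPC and a message ID such as RXRPC_USER_CALL_ID or
+RXRPC_NEW_CALL.
+
```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Search the control data of a received message.
 * Returns 1 if a control message of the given level and type is present
 * (copying outlen bytes of its payload to out when out is non-NULL),
 * 0 if it is absent, and -1 if its payload is shorter than outlen.
 */
static int find_cmsg(struct msghdr *msg, int level, int type,
		     void *out, size_t outlen)
{
	struct cmsghdr *cmsg;

	assert(msg != NULL);
	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level != level || cmsg->cmsg_type != type)
			continue;
		if (out && outlen) {
			size_t have = cmsg->cmsg_len - CMSG_LEN(0);

			if (have < outlen)
				return -1;
			memcpy(out, CMSG_DATA(cmsg), outlen);
		}
		return 1;
	}
	return 0;
}
```
+
+A server loop might first test for a dataless RXRPC_NEW_CALL (out set to NULL)
+and otherwise extract the unsigned long user call ID to find the call that the
+data, final ACK or abort belongs to.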
+
+AF_RXRPC Kernel Interface
+=========================
+
+The AF_RXRPC module also provides an interface for use by in-kernel utilities
+such as the AFS filesystem.  This permits such a utility to:
+
+ (1) Use different keys directly on individual client calls on one socket
+     rather than having to open a whole slew of sockets, one for each key it
+     might want to use.
+
+ (2) Avoid having RxRPC call request_key() at the point of issue of a call or
+     opening of a socket.  Instead the utility is responsible for requesting a
+     key at the appropriate point.  AFS, for instance, would do this during VFS
+     operations such as open() or unlink().  The key is then handed through
+     when the call is initiated.
+
+ (3) Request the use of something other than GFP_KERNEL to allocate memory.
+
+ (4) Avoid the overhead of using the recvmsg() call.  RxRPC messages can be
+     intercepted before they get put into the socket Rx queue and the socket
+     buffers manipulated directly.
+
+To use the RxRPC facility, a kernel utility must still open an AF_RXRPC socket,
+bind an address as appropriate and listen if it's to be a server socket, but
+then it passes this to the kernel interface functions.
+
+The kernel interface functions are as follows:
+
+ (#) Begin a new client call::
+
+	struct rxrpc_call *
+	rxrpc_kernel_begin_call(struct socket *sock,
+				struct sockaddr_rxrpc *srx,
+				struct key *key,
+				unsigned long user_call_ID,
+				s64 tx_total_len,
+				gfp_t gfp,
+				rxrpc_notify_rx_t notify_rx,
+				bool upgrade,
+				bool intr,
+				unsigned int debug_id);
+
+     This allocates the infrastructure to make a new RxRPC call and assigns
+     call and connection numbers.  The call will be made on the UDP port that
+     the socket is bound to.  The call will go to the destination address of a
+     connected client socket unless an alternative is supplied (srx is
+     non-NULL).
+
+     If a key is supplied then this will be used to secure the call instead of
+     the key bound to the socket with the RXRPC_SECURITY_KEY sockopt.  Calls
+     secured in this way will still share connections if at all possible.
+
+     The user_call_ID is equivalent to that supplied to sendmsg() in the
+     control data buffer.  It is entirely feasible to use this to point to a
+     kernel data structure.
+
+     tx_total_len is the amount of data the caller is intending to transmit
+     with this call (or -1 if unknown at this point).  Setting the data size
+     allows the kernel to encrypt directly to the packet buffers, thereby
+     saving a copy.  The value may not be less than -1.
+
+     notify_rx is a pointer to a function to be called when events such as
+     incoming data packets or remote aborts happen.
+
+     upgrade should be set to true if a client operation should request that
+     the server upgrade the service to a better one.  The resultant service ID
+     is returned by rxrpc_kernel_recv_data().
+
+     intr should be set to true if the call should be interruptible.  If this
+     is not set, this function may not return until a channel has been
+     allocated; if it is set, the function may return -ERESTARTSYS.
+
+     debug_id is the call debugging ID to be used for tracing.  This can be
+     obtained by atomically incrementing rxrpc_debug_id.
+
+     If this function is successful, an opaque reference to the RxRPC call is
+     returned.  The caller now holds a reference on this and it must be
+     properly ended.
+
+ (#) End a client call::
+
+	void rxrpc_kernel_end_call(struct socket *sock,
+				   struct rxrpc_call *call);
+
+     This is used to end a previously begun call.  The user_call_ID is expunged
+     from AF_RXRPC's knowledge and will not be seen again in association with
+     the specified call.
+
+ (#) Send data through a call::
+
+	typedef void (*rxrpc_notify_end_tx_t)(struct sock *sk,
+					      unsigned long user_call_ID,
+					      struct sk_buff *skb);
+
+	int rxrpc_kernel_send_data(struct socket *sock,
+				   struct rxrpc_call *call,
+				   struct msghdr *msg,
+				   size_t len,
+				   rxrpc_notify_end_tx_t notify_end_tx);
+
+     This is used to supply either the request part of a client call or the
+     reply part of a server call.  msg.msg_iovlen and msg.msg_iov specify the
+     data buffers to be used.  msg_iov may not be NULL and must point
+     exclusively to in-kernel virtual addresses.  msg.msg_flags may be given
+     MSG_MORE if there will be subsequent data sends for this call.
+
+     The msg must not specify a destination address, control data or any flags
+     other than MSG_MORE.  len is the total amount of data to transmit.
+
+     notify_end_tx can be NULL or it can be used to specify a function to be
+     called when the call changes state to end the Tx phase.  This function is
+     called with the call-state spinlock held to prevent any reply or final ACK
+     from being delivered first.
+
+ (#) Receive data from a call::
+
+	int rxrpc_kernel_recv_data(struct socket *sock,
+				   struct rxrpc_call *call,
+				   void *buf,
+				   size_t size,
+				   size_t *_offset,
+				   bool want_more,
+				   u32 *_abort,
+				   u16 *_service);
+
+     This is used to receive data from either the reply part of a client call
+     or the request part of a service call.  buf and size specify how much
+     data is desired and where to store it.  ``*_offset`` is added on to buf
+     and subtracted from size internally; the amount copied into the buffer is
+     added to ``*_offset`` before returning.
+
+     want_more should be true if further data will be required after this is
+     satisfied and false if this is the last item of the receive phase.
+
+     There are three normal returns: 0 if the buffer was filled and want_more
+     was true; 1 if the buffer was filled, the last DATA packet has been
+     emptied and want_more was false; and -EAGAIN if the function needs to be
+     called again.
+
+     If the last DATA packet is processed but the buffer contains less than
+     the amount requested, -EBADMSG is returned.  If want_more wasn't set, but
+     more data was available, -EMSGSIZE is returned.
+
+     If a remote ABORT is detected, the abort code received will be stored in
+     ``*_abort`` and -ECONNABORTED will be returned.
+
+     The service ID that the call ended up with is returned into
+     ``*_service``.  This can be used to see if a call got a service upgrade.
+
+ (#) Abort a call::
+
+	void rxrpc_kernel_abort_call(struct socket *sock,
+				     struct rxrpc_call *call,
+				     u32 abort_code);
+
+     This is used to abort a call if it's still in an abortable state.  The
+     abort code specified will be placed in the ABORT message sent.
+
+ (#) Intercept received RxRPC messages::
+
+	typedef void (*rxrpc_interceptor_t)(struct sock *sk,
+					    unsigned long user_call_ID,
+					    struct sk_buff *skb);
+
+	void
+	rxrpc_kernel_intercept_rx_messages(struct socket *sock,
+					   rxrpc_interceptor_t interceptor);
+
+     This installs an interceptor function on the specified AF_RXRPC socket.
+     All messages that would otherwise wind up in the socket's Rx queue are
+     then diverted to this function.  Note that care must be taken to process
+     the messages in the right order to maintain DATA message sequentiality.
+
+     The interceptor function itself is provided with the address of the
+     socket handling the incoming message, the ID assigned by the kernel
+     utility to the call and the socket buffer containing the message.
+
+     The skb->mark field indicates the type of message:
+
+	===============================	=======================================
+	Mark				Meaning
+	===============================	=======================================
+	RXRPC_SKB_MARK_DATA		Data message
+	RXRPC_SKB_MARK_FINAL_ACK	Final ACK received for an incoming call
+	RXRPC_SKB_MARK_BUSY		Client call rejected as server busy
+	RXRPC_SKB_MARK_REMOTE_ABORT	Call aborted by peer
+	RXRPC_SKB_MARK_NET_ERROR	Network error detected
+	RXRPC_SKB_MARK_LOCAL_ERROR	Local error encountered
+	RXRPC_SKB_MARK_NEW_CALL		New incoming call awaiting acceptance
+	===============================	=======================================
+
+     The remote abort message can be probed with rxrpc_kernel_get_abort_code().
+     The two error messages can be probed with rxrpc_kernel_get_error_number().
+     A new call can be accepted with rxrpc_kernel_accept_call().
+
+     Data messages can have their contents extracted with the usual bunch of
+     socket buffer manipulation functions.  A data message can be determined to
+     be the last one in a sequence with rxrpc_kernel_is_data_last().  When a
+     data message has been used up, rxrpc_kernel_data_consumed() should be
+     called on it.
+
+     Messages should be handed to rxrpc_kernel_free_skb() for disposal.  It
+     is possible to get extra refs on all types of message for later freeing,
+     but this may pin the state of a call until the message is finally freed.
+
+ (#) Accept an incoming call::
+
+	struct rxrpc_call *
+	rxrpc_kernel_accept_call(struct socket *sock,
+				 unsigned long user_call_ID);
+
+     This is used to accept an incoming call and to assign it a call ID.  This
+     function is similar to rxrpc_kernel_begin_call() and calls accepted must
+     be ended in the same way.
+
+     If this function is successful, an opaque reference to the RxRPC call is
+     returned.  The caller now holds a reference on this and it must be
+     properly ended.
+
+ (#) Reject an incoming call::
+
+	int rxrpc_kernel_reject_call(struct socket *sock);
+
+     This is used to reject the first incoming call on the socket's queue with
+     a BUSY message.  -ENODATA is returned if there were no incoming calls.
+     Other errors may be returned if the call had been aborted (-ECONNABORTED)
+     or had timed out (-ETIME).
+
+ (#) Allocate a null key for doing anonymous security::
+
+	struct key *rxrpc_get_null_key(const char *keyname);
+
+     This is used to allocate a null RxRPC key that can be used to indicate
+     anonymous security for a particular domain.
+
+ (#) Get the peer address of a call::
+
+	void rxrpc_kernel_get_peer(struct socket *sock, struct rxrpc_call *call,
+				   struct sockaddr_rxrpc *_srx);
+
+     This is used to find the remote peer address of a call.
+
+ (#) Set the total transmit data size on a call::
+
+	void rxrpc_kernel_set_tx_length(struct socket *sock,
+					struct rxrpc_call *call,
+					s64 tx_total_len);
+
+     This sets the amount of data that the caller is intending to transmit on a
+     call.  It's intended to be used for setting the reply size as the request
+     size should be set when the call is begun.  tx_total_len may not be less
+     than zero.
+
+ (#) Get call RTT::
+
+	u64 rxrpc_kernel_get_rtt(struct socket *sock, struct rxrpc_call *call);
+
+     Get the RTT to the peer in use by a call.  The value returned is in
+     nanoseconds.
+
+ (#) Check call still alive::
+
+	bool rxrpc_kernel_check_life(struct socket *sock,
+				     struct rxrpc_call *call,
+				     u32 *_life);
+	void rxrpc_kernel_probe_life(struct socket *sock,
+				     struct rxrpc_call *call);
+
+     The first function passes back in ``*_life`` a number that is updated when
+     ACKs are received from the peer (notably including PING RESPONSE ACKs
+     which we can elicit by sending PING ACKs to see if the call still exists
+     on the server).  The caller should compare the values from two invocations
+     made a suitable interval apart to see if the call is still alive.  It also
+     returns true as long as the call hasn't yet reached the completed state.
+
+     This allows the caller to work out if the server is still contactable and
+     if the call is still alive on the server while waiting for the server to
+     process a client operation.
+
+     The second function causes a ping ACK to be transmitted to try to provoke
+     the peer into responding, which would then cause the value returned by the
+     first function to change.  Note that this must be called in TASK_RUNNING
+     state.
+
+ (#) Get reply timestamp::
+
+	bool rxrpc_kernel_get_reply_time(struct socket *sock,
+					 struct rxrpc_call *call,
+					 ktime_t *_ts);
+
+     This allows the timestamp on the first DATA packet of the reply of a
+     client call to be queried, provided that it is still in the Rx ring.  If
+     successful, the timestamp will be stored into ``*_ts`` and true will be
+     returned; false will be returned otherwise.
+
+ (#) Get remote client epoch::
+
+	u32 rxrpc_kernel_get_epoch(struct socket *sock,
+				   struct rxrpc_call *call);
+
+     This allows the epoch that's contained in packets of an incoming client
+     call to be queried.  This value is returned.  The function is always
+     successful if the call is still in progress.  It shouldn't be called once
+     the call has expired.  Note that calling this on a local client call only
+     returns the local epoch.
+
+     This value can be used to determine if the remote client has been
+     restarted as it shouldn't change otherwise.
+
+ (#) Set the maximum lifespan on a call::
+
+	void rxrpc_kernel_set_max_life(struct socket *sock,
+				       struct rxrpc_call *call,
+				       unsigned long hard_timeout);
+
+     This sets the maximum lifespan on a call to hard_timeout (which is in
+     jiffies).  In the event of the timeout occurring, the call will be
+     aborted and -ETIME or -ETIMEDOUT will be returned.
+
+
+Configurable Parameters
+=======================
+
+The RxRPC protocol driver has a number of configurable parameters that can be
+adjusted through sysctls in /proc/net/rxrpc/:
+
+ (#) req_ack_delay
+
+     The amount of time in milliseconds after receiving a packet with the
+     request-ack flag set before we honour the flag and actually send the
+     requested ack.
+
+     Usually the other side won't stop sending packets until the advertised
+     reception window is full (to a maximum of 255 packets), so delaying the
+     ACK permits several packets to be ACK'd in one go.
+
+ (#) soft_ack_delay
+
+     The amount of time in milliseconds after receiving a new packet before we
+     generate a soft-ACK to tell the sender that it doesn't need to resend.
+
+ (#) idle_ack_delay
+
+     The amount of time in milliseconds after all the packets currently in the
+     received queue have been consumed before we generate a hard-ACK to tell
+     the sender it can free its buffers, assuming no other reason occurs that
+     we would send an ACK.
+
+ (#) resend_timeout
+
+     The amount of time in milliseconds after transmitting a packet before we
+     transmit it again, assuming no ACK is received from the receiver telling
+     us they got it.
+
+ (#) max_call_lifetime
+
+     The maximum amount of time in seconds that a call may be in progress
+     before we preemptively kill it.
+
+ (#) dead_call_expiry
+
+     The amount of time in seconds before we remove a dead call from the call
+     list.  Dead calls are kept around for a little while for the purpose of
+     repeating ACK and ABORT packets.
+
+ (#) connection_expiry
+
+     The amount of time in seconds after a connection was last used before we
+     remove it from the connection list.  While a connection is in existence,
+     it serves as a placeholder for negotiated security; when it is deleted,
+     the security must be renegotiated.
+
+ (#) transport_expiry
+
+     The amount of time in seconds after a transport was last used before we
+     remove it from the transport list.  While a transport is in existence, it
+     serves to anchor the peer data and keeps the connection ID counter.
+
+ (#) rxrpc_rx_window_size
+
+     The size of the receive window in packets.  This is the maximum number of
+     unconsumed received packets we're willing to hold in memory for any
+     particular call.
+
+ (#) rxrpc_rx_mtu
+
+     The maximum packet MTU size that we're willing to receive in bytes.  This
+     indicates to the peer whether we're willing to accept jumbo packets.
+
+ (#) rxrpc_rx_jumbo_max
+
+     The maximum number of packets that we're willing to accept in a jumbo
+     packet.  Non-terminal packets in a jumbo packet must contain a four byte
+     header plus exactly 1412 bytes of data.  The terminal packet must contain
+     a four byte header plus any amount of data.  In any event, a jumbo packet
+     may not exceed rxrpc_rx_mtu in size.
diff --git a/Documentation/networking/rxrpc.txt b/Documentation/networking/rxrpc.txt
deleted file mode 100644
index 180e07d956a7..000000000000
--- a/Documentation/networking/rxrpc.txt
+++ /dev/null
@@ -1,1155 +0,0 @@
-			    ======================
-			    RxRPC NETWORK PROTOCOL
-			    ======================
-
-The RxRPC protocol driver provides a reliable two-phase transport on top of UDP
-that can be used to perform RxRPC remote operations.  This is done over sockets
-of AF_RXRPC family, using sendmsg() and recvmsg() with control data to send and
-receive data, aborts and errors.
-
-Contents of this document:
-
- (*) Overview.
-
- (*) RxRPC protocol summary.
-
- (*) AF_RXRPC driver model.
-
- (*) Control messages.
-
- (*) Socket options.
-
- (*) Security.
-
- (*) Example client usage.
-
- (*) Example server usage.
-
- (*) AF_RXRPC kernel interface.
-
- (*) Configurable parameters.
-
-
-========
-OVERVIEW
-========
-
-RxRPC is a two-layer protocol.  There is a session layer which provides
-reliable virtual connections using UDP over IPv4 (or IPv6) as the transport
-layer, but implements a real network protocol; and there's the presentation
-layer which renders structured data to binary blobs and back again using XDR
-(as does SunRPC):
-
-		+-------------+
-		| Application |
-		+-------------+
-		|     XDR     |		Presentation
-		+-------------+
-		|    RxRPC    |		Session
-		+-------------+
-		|     UDP     |		Transport
-		+-------------+
-
-
-AF_RXRPC provides:
-
- (1) Part of an RxRPC facility for both kernel and userspace applications by
-     making the session part of it a Linux network protocol (AF_RXRPC).
-
- (2) A two-phase protocol.  The client transmits a blob (the request) and then
-     receives a blob (the reply), and the server receives the request and then
-     transmits the reply.
-
- (3) Retention of the reusable bits of the transport system set up for one call
-     to speed up subsequent calls.
-
- (4) A secure protocol, using the Linux kernel's key retention facility to
-     manage security on the client end.  The server end must of necessity be
-     more active in security negotiations.
-
-AF_RXRPC does not provide XDR marshalling/presentation facilities.  That is
-left to the application.  AF_RXRPC only deals in blobs.  Even the operation ID
-is just the first four bytes of the request blob, and as such is beyond the
-kernel's interest.
-
-
-Sockets of AF_RXRPC family are:
-
- (1) created as type SOCK_DGRAM;
-
- (2) provided with a protocol of the type of underlying transport they're going
-     to use - currently only PF_INET is supported.
-
-
-The Andrew File System (AFS) is an example of an application that uses this and
-that has both kernel (filesystem) and userspace (utility) components.
-
-
-======================
-RXRPC PROTOCOL SUMMARY
-======================
-
-An overview of the RxRPC protocol:
-
- (*) RxRPC sits on top of another networking protocol (UDP is the only option
-     currently), and uses this to provide network transport.  UDP ports, for
-     example, provide transport endpoints.
-
- (*) RxRPC supports multiple virtual "connections" from any given transport
-     endpoint, thus allowing the endpoints to be shared, even to the same
-     remote endpoint.
-
- (*) Each connection goes to a particular "service".  A connection may not go
-     to multiple services.  A service may be considered the RxRPC equivalent of
-     a port number.  AF_RXRPC permits multiple services to share an endpoint.
-
- (*) Client-originating packets are marked, thus a transport endpoint can be
-     shared between client and server connections (connections have a
-     direction).
-
- (*) Up to a billion connections may be supported concurrently between one
-     local transport endpoint and one service on one remote endpoint.  An RxRPC
-     connection is described by seven numbers:
-
-	Local address	}
-	Local port	} Transport (UDP) address
-	Remote address	}
-	Remote port	}
-	Direction
-	Connection ID
-	Service ID
-
- (*) Each RxRPC operation is a "call".  A connection may make up to four
-     billion calls, but only up to four calls may be in progress on a
-     connection at any one time.
-
- (*) Calls are two-phase and asymmetric: the client sends its request data,
-     which the service receives; then the service transmits the reply data
-     which the client receives.
-
- (*) The data blobs are of indefinite size, the end of a phase is marked with a
-     flag in the packet.  The number of packets of data making up one blob may
-     not exceed 4 billion, however, as this would cause the sequence number to
-     wrap.
-
- (*) The first four bytes of the request data are the service operation ID.
-
- (*) Security is negotiated on a per-connection basis.  The connection is
-     initiated by the first data packet on it arriving.  If security is
-     requested, the server then issues a "challenge" and then the client
-     replies with a "response".  If the response is successful, the security is
-     set for the lifetime of that connection, and all subsequent calls made
-     upon it use that same security.  In the event that the server lets a
-     connection lapse before the client, the security will be renegotiated if
-     the client uses the connection again.
-
- (*) Calls use ACK packets to handle reliability.  Data packets are also
-     explicitly sequenced per call.
-
- (*) There are two types of positive acknowledgment: hard-ACKs and soft-ACKs.
-     A hard-ACK indicates to the far side that all the data received to a point
-     has been received and processed; a soft-ACK indicates that the data has
-     been received but may yet be discarded and re-requested.  The sender may
-     not discard any transmittable packets until they've been hard-ACK'd.
-
- (*) Reception of a reply data packet implicitly hard-ACK's all the data
-     packets that make up the request.
-
- (*) An call is complete when the request has been sent, the reply has been
-     received and the final hard-ACK on the last packet of the reply has
-     reached the server.
-
- (*) An call may be aborted by either end at any time up to its completion.
-
-
-=====================
-AF_RXRPC DRIVER MODEL
-=====================
-
-About the AF_RXRPC driver:
-
- (*) The AF_RXRPC protocol transparently uses internal sockets of the transport
-     protocol to represent transport endpoints.
-
- (*) AF_RXRPC sockets map onto RxRPC connection bundles.  Actual RxRPC
-     connections are handled transparently.  One client socket may be used to
-     make multiple simultaneous calls to the same service.  One server socket
-     may handle calls from many clients.
-
- (*) Additional parallel client connections will be initiated to support extra
-     concurrent calls, up to a tunable limit.
-
- (*) Each connection is retained for a certain amount of time [tunable] after
-     the last call currently using it has completed in case a new call is made
-     that could reuse it.
-
- (*) Each internal UDP socket is retained [tunable] for a certain amount of
-     time [tunable] after the last connection using it discarded, in case a new
-     connection is made that could use it.
-
- (*) A client-side connection is only shared between calls if they have have
-     the same key struct describing their security (and assuming the calls
-     would otherwise share the connection).  Non-secured calls would also be
-     able to share connections with each other.
-
- (*) A server-side connection is shared if the client says it is.
-
- (*) ACK'ing is handled by the protocol driver automatically, including ping
-     replying.
-
- (*) SO_KEEPALIVE automatically pings the other side to keep the connection
-     alive [TODO].
-
- (*) If an ICMP error is received, all calls affected by that error will be
-     aborted with an appropriate network error passed through recvmsg().
-
-
-Interaction with the user of the RxRPC socket:
-
- (*) A socket is made into a server socket by binding an address with a
-     non-zero service ID.
-
- (*) In the client, sending a request is achieved with one or more sendmsgs,
-     followed by the reply being received with one or more recvmsgs.
-
- (*) The first sendmsg for a request to be sent from a client contains a tag to
-     be used in all other sendmsgs or recvmsgs associated with that call.  The
-     tag is carried in the control data.
-
- (*) connect() is used to supply a default destination address for a client
-     socket.  This may be overridden by supplying an alternate address to the
-     first sendmsg() of a call (struct msghdr::msg_name).
-
- (*) If connect() is called on an unbound client, a random local port will
-     bound before the operation takes place.
-
- (*) A server socket may also be used to make client calls.  To do this, the
-     first sendmsg() of the call must specify the target address.  The server's
-     transport endpoint is used to send the packets.
-
- (*) Once the application has received the last message associated with a call,
-     the tag is guaranteed not to be seen again, and so it can be used to pin
-     client resources.  A new call can then be initiated with the same tag
-     without fear of interference.
-
- (*) In the server, a request is received with one or more recvmsgs, then the
-     the reply is transmitted with one or more sendmsgs, and then the final ACK
-     is received with a last recvmsg.
-
- (*) When sending data for a call, sendmsg is given MSG_MORE if there's more
-     data to come on that call.
-
- (*) When receiving data for a call, recvmsg flags MSG_MORE if there's more
-     data to come for that call.
-
- (*) When receiving data or messages for a call, MSG_EOR is flagged by recvmsg
-     to indicate the terminal message for that call.
-
- (*) A call may be aborted by adding an abort control message to the control
-     data.  Issuing an abort terminates the kernel's use of that call's tag.
-     Any messages waiting in the receive queue for that call will be discarded.
-
- (*) Aborts, busy notifications and challenge packets are delivered by recvmsg,
-     and control data messages will be set to indicate the context.  Receiving
-     an abort or a busy message terminates the kernel's use of that call's tag.
-
- (*) The control data part of the msghdr struct is used for a number of things:
-
-     (*) The tag of the intended or affected call.
-
-     (*) Sending or receiving errors, aborts and busy notifications.
-
-     (*) Notifications of incoming calls.
-
-     (*) Sending debug requests and receiving debug replies [TODO].
-
- (*) When the kernel has received and set up an incoming call, it sends a
-     message to server application to let it know there's a new call awaiting
-     its acceptance [recvmsg reports a special control message].  The server
-     application then uses sendmsg to assign a tag to the new call.  Once that
-     is done, the first part of the request data will be delivered by recvmsg.
-
- (*) The server application has to provide the server socket with a keyring of
-     secret keys corresponding to the security types it permits.  When a secure
-     connection is being set up, the kernel looks up the appropriate secret key
-     in the keyring and then sends a challenge packet to the client and
-     receives a response packet.  The kernel then checks the authorisation of
-     the packet and either aborts the connection or sets up the security.
-
- (*) The name of the key a client will use to secure its communications is
-     nominated by a socket option.
-
-
-Notes on sendmsg:
-
- (*) MSG_WAITALL can be set to tell sendmsg to ignore signals if the peer is
-     making progress at accepting packets within a reasonable time such that we
-     manage to queue up all the data for transmission.  This requires the
-     client to accept at least one packet per 2*RTT time period.
-
-     If this isn't set, sendmsg() will return immediately, either returning
-     EINTR/ERESTARTSYS if nothing was consumed or returning the amount of data
-     consumed.
-
-
-Notes on recvmsg:
-
- (*) If there's a sequence of data messages belonging to a particular call on
-     the receive queue, then recvmsg will keep working through them until:
-
-     (a) it meets the end of that call's received data,
-
-     (b) it meets a non-data message,
-
-     (c) it meets a message belonging to a different call, or
-
-     (d) it fills the user buffer.
-
-     If recvmsg is called in blocking mode, it will keep sleeping, awaiting the
-     reception of further data, until one of the above four conditions is met.
-
- (2) MSG_PEEK operates similarly, but will return immediately if it has put any
-     data in the buffer rather than sleeping until it can fill the buffer.
-
- (3) If a data message is only partially consumed in filling a user buffer,
-     then the remainder of that message will be left on the front of the queue
-     for the next taker.  MSG_TRUNC will never be flagged.
-
- (4) If there is more data to be had on a call (it hasn't copied the last byte
-     of the last data message in that phase yet), then MSG_MORE will be
-     flagged.
-
-
-================
-CONTROL MESSAGES
-================
-
-AF_RXRPC makes use of control messages in sendmsg() and recvmsg() to multiplex
-calls, to invoke certain actions and to report certain conditions.  These are:
-
-	MESSAGE ID		SRT DATA	MEANING
-	=======================	=== ===========	===============================
-	RXRPC_USER_CALL_ID	sr- User ID	App's call specifier
-	RXRPC_ABORT		srt Abort code	Abort code to issue/received
-	RXRPC_ACK		-rt n/a		Final ACK received
-	RXRPC_NET_ERROR		-rt error num	Network error on call
-	RXRPC_BUSY		-rt n/a		Call rejected (server busy)
-	RXRPC_LOCAL_ERROR	-rt error num	Local error encountered
-	RXRPC_NEW_CALL		-r- n/a		New call received
-	RXRPC_ACCEPT		s-- n/a		Accept new call
-	RXRPC_EXCLUSIVE_CALL	s-- n/a		Make an exclusive client call
-	RXRPC_UPGRADE_SERVICE	s-- n/a		Client call can be upgraded
-	RXRPC_TX_LENGTH		s-- data len	Total length of Tx data
-
-	(SRT = usable in Sendmsg / delivered by Recvmsg / Terminal message)
-
- (*) RXRPC_USER_CALL_ID
-
-     This is used to indicate the application's call ID.  It's an unsigned long
-     that the app specifies in the client by attaching it to the first data
-     message or in the server by passing it in association with an RXRPC_ACCEPT
-     message.  recvmsg() passes it in conjunction with all messages except
-     those of the RXRPC_NEW_CALL message.
-
- (*) RXRPC_ABORT
-
-     This is can be used by an application to abort a call by passing it to
-     sendmsg, or it can be delivered by recvmsg to indicate a remote abort was
-     received.  Either way, it must be associated with an RXRPC_USER_CALL_ID to
-     specify the call affected.  If an abort is being sent, then error EBADSLT
-     will be returned if there is no call with that user ID.
-
- (*) RXRPC_ACK
-
-     This is delivered to a server application to indicate that the final ACK
-     of a call was received from the client.  It will be associated with an
-     RXRPC_USER_CALL_ID to indicate the call that's now complete.
-
- (*) RXRPC_NET_ERROR
-
-     This is delivered to an application to indicate that an ICMP error message
-     was encountered in the process of trying to talk to the peer.  An
-     errno-class integer value will be included in the control message data
-     indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call
-     affected.
-
- (*) RXRPC_BUSY
-
-     This is delivered to a client application to indicate that a call was
-     rejected by the server due to the server being busy.  It will be
-     associated with an RXRPC_USER_CALL_ID to indicate the rejected call.
-
- (*) RXRPC_LOCAL_ERROR
-
-     This is delivered to an application to indicate that a local error was
-     encountered and that a call has been aborted because of it.  An
-     errno-class integer value will be included in the control message data
-     indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call
-     affected.
-
- (*) RXRPC_NEW_CALL
-
-     This is delivered to indicate to a server application that a new call has
-     arrived and is awaiting acceptance.  No user ID is associated with this,
-     as a user ID must subsequently be assigned by doing an RXRPC_ACCEPT.
-
- (*) RXRPC_ACCEPT
-
-     This is used by a server application to attempt to accept a call and
-     assign it a user ID.  It should be associated with an RXRPC_USER_CALL_ID
-     to indicate the user ID to be assigned.  If there is no call to be
-     accepted (it may have timed out, been aborted, etc.), then sendmsg will
-     return error ENODATA.  If the user ID is already in use by another call,
-     then error EBADSLT will be returned.
-
- (*) RXRPC_EXCLUSIVE_CALL
-
-     This is used to indicate that a client call should be made on a one-off
-     connection.  The connection is discarded once the call has terminated.
-
- (*) RXRPC_UPGRADE_SERVICE
-
-     This is used to make a client call to probe if the specified service ID
-     may be upgraded by the server.  The caller must check msg_name returned to
-     recvmsg() for the service ID actually in use.  The operation probed must
-     be one that takes the same arguments in both services.
-
-     Once this has been used to establish the upgrade capability (or lack
-     thereof) of the server, the service ID returned should be used for all
-     future communication to that server and RXRPC_UPGRADE_SERVICE should no
-     longer be set.
-
- (*) RXRPC_TX_LENGTH
-
-     This is used to inform the kernel of the total amount of data that is
-     going to be transmitted by a call (whether in a client request or a
-     service response).  If given, it allows the kernel to encrypt from the
-     userspace buffer directly to the packet buffers, rather than copying into
-     the buffer and then encrypting in place.  This may only be given with the
-     first sendmsg() providing data for a call.  EMSGSIZE will be generated if
-     the amount of data actually given is different.
-
-     This takes a parameter of __s64 type that indicates how much will be
-     transmitted.  This may not be less than zero.
-
-The symbol RXRPC__SUPPORTED is defined as one more than the highest control
-message type supported.  At run time this can be queried by means of the
-RXRPC_SUPPORTED_CMSG socket option (see below).
-
-
-==============
-SOCKET OPTIONS
-==============
-
-AF_RXRPC sockets support a few socket options at the SOL_RXRPC level:
-
- (*) RXRPC_SECURITY_KEY
-
-     This is used to specify the description of the key to be used.  The key is
-     extracted from the calling process's keyrings with request_key() and
-     should be of "rxrpc" type.
-
-     The optval pointer points to the description string, and optlen indicates
-     how long the string is, without the NUL terminator.
-
- (*) RXRPC_SECURITY_KEYRING
-
-     Similar to above but specifies a keyring of server secret keys to use (key
-     type "keyring").  See the "Security" section.
-
- (*) RXRPC_EXCLUSIVE_CONNECTION
-
-     This is used to request that new connections should be used for each call
-     made subsequently on this socket.  optval should be NULL and optlen 0.
-
- (*) RXRPC_MIN_SECURITY_LEVEL
-
-     This is used to specify the minimum security level required for calls on
-     this socket.  optval must point to an int containing one of the following
-     values:
-
-     (a) RXRPC_SECURITY_PLAIN
-
-	 Encrypted checksum only.
-
-     (b) RXRPC_SECURITY_AUTH
-
-	 Encrypted checksum plus packet padded and first eight bytes of packet
-	 encrypted - which includes the actual packet length.
-
-     (c) RXRPC_SECURITY_ENCRYPTED
-
-	 Encrypted checksum plus entire packet padded and encrypted, including
-	 actual packet length.
-
- (*) RXRPC_UPGRADEABLE_SERVICE
-
-     This is used to indicate that a service socket with two bindings may
-     upgrade one bound service to the other if requested by the client.  optval
-     must point to an array of two unsigned short ints.  The first is the
-     service ID to upgrade from and the second the service ID to upgrade to.
-
- (*) RXRPC_SUPPORTED_CMSG
-
-     This is a read-only option that writes an int into the buffer indicating
-     the highest control message type supported.
-
-
-========
-SECURITY
-========
-
-Currently, only the kerberos 4 equivalent protocol has been implemented
-(security index 2 - rxkad).  This requires the rxkad module to be loaded and,
-on the client, tickets of the appropriate type to be obtained from the AFS
-kaserver or the kerberos server and installed as "rxrpc" type keys.  This is
-normally done using the klog program.  An example simple klog program can be
-found at:
-
-	http://people.redhat.com/~dhowells/rxrpc/klog.c
-
-The payload provided to add_key() on the client should be of the following
-form:
-
-	struct rxrpc_key_sec2_v1 {
-		uint16_t	security_index;	/* 2 */
-		uint16_t	ticket_length;	/* length of ticket[] */
-		uint32_t	expiry;		/* time at which expires */
-		uint8_t		kvno;		/* key version number */
-		uint8_t		__pad[3];
-		uint8_t		session_key[8];	/* DES session key */
-		uint8_t		ticket[0];	/* the encrypted ticket */
-	};
-
-Where the ticket blob is just appended to the above structure.
-
-
-For the server, keys of type "rxrpc_s" must be made available to the server.
-They have a description of "<serviceID>:<securityIndex>" (eg: "52:2" for an
-rxkad key for the AFS VL service).  When such a key is created, it should be
-given the server's secret key as the instantiation data (see the example
-below).
-
-	add_key("rxrpc_s", "52:2", secret_key, 8, keyring);
-
-A keyring is passed to the server socket by naming it in a sockopt.  The server
-socket then looks the server secret keys up in this keyring when secure
-incoming connections are made.  This can be seen in an example program that can
-be found at:
-
-	http://people.redhat.com/~dhowells/rxrpc/listen.c
-
-
-====================
-EXAMPLE CLIENT USAGE
-====================
-
-A client would issue an operation by:
-
- (1) An RxRPC socket is set up by:
-
-	client = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
-
-     Where the third parameter indicates the protocol family of the transport
-     socket used - usually IPv4 but it can also be IPv6 [TODO].
-
- (2) A local address can optionally be bound:
-
-	struct sockaddr_rxrpc srx = {
-		.srx_family	= AF_RXRPC,
-		.srx_service	= 0,  /* we're a client */
-		.transport_type	= SOCK_DGRAM,	/* type of transport socket */
-		.transport.sin_family	= AF_INET,
-		.transport.sin_port	= htons(7000), /* AFS callback */
-		.transport.sin_address	= 0,  /* all local interfaces */
-	};
-	bind(client, &srx, sizeof(srx));
-
-     This specifies the local UDP port to be used.  If not given, a random
-     non-privileged port will be used.  A UDP port may be shared between
-     several unrelated RxRPC sockets.  Security is handled on a basis of
-     per-RxRPC virtual connection.
-
- (3) The security is set:
-
-	const char *key = "AFS:cambridge.redhat.com";
-	setsockopt(client, SOL_RXRPC, RXRPC_SECURITY_KEY, key, strlen(key));
-
-     This issues a request_key() to get the key representing the security
-     context.  The minimum security level can be set:
-
-	unsigned int sec = RXRPC_SECURITY_ENCRYPTED;
-	setsockopt(client, SOL_RXRPC, RXRPC_MIN_SECURITY_LEVEL,
-		   &sec, sizeof(sec));
-
- (4) The server to be contacted can then be specified (alternatively this can
-     be done through sendmsg):
-
-	struct sockaddr_rxrpc srx = {
-		.srx_family	= AF_RXRPC,
-		.srx_service	= VL_SERVICE_ID,
-		.transport_type	= SOCK_DGRAM,	/* type of transport socket */
-		.transport.sin_family	= AF_INET,
-		.transport.sin_port	= htons(7005), /* AFS volume manager */
-		.transport.sin_address	= ...,
-	};
-	connect(client, &srx, sizeof(srx));
-
- (5) The request data should then be posted to the server socket using a series
-     of sendmsg() calls, each with the following control message attached:
-
-	RXRPC_USER_CALL_ID	- specifies the user ID for this call
-
-     MSG_MORE should be set in msghdr::msg_flags on all but the last part of
-     the request.  Multiple requests may be made simultaneously.
-
-     An RXRPC_TX_LENGTH control message can also be specified on the first
-     sendmsg() call.
-
-     If a call is intended to go to a destination other than the default
-     specified through connect(), then msghdr::msg_name should be set on the
-     first request message of that call.
-
- (6) The reply data will then be posted to the server socket for recvmsg() to
-     pick up.  MSG_MORE will be flagged by recvmsg() if there's more reply data
-     for a particular call to be read.  MSG_EOR will be set on the terminal
-     read for a call.
-
-     All data will be delivered with the following control message attached:
-
-	RXRPC_USER_CALL_ID	- specifies the user ID for this call
-
-     If an abort or error occurred, this will be returned in the control data
-     buffer instead, and MSG_EOR will be flagged to indicate the end of that
-     call.
-
-A client may ask for a service ID it knows and ask that this be upgraded to a
-better service if one is available by supplying RXRPC_UPGRADE_SERVICE on the
-first sendmsg() of a call.  The client should then check srx_service in the
-msg_name filled in by recvmsg() when collecting the result.  srx_service will
-hold the same value as given to sendmsg() if the upgrade request was ignored by
-the service - otherwise it will be altered to indicate the service ID the
-server upgraded to.  Note that the upgraded service ID is chosen by the server.
-The caller has to wait until it sees the service ID in the reply before sending
-any more calls (further calls to the same destination will be blocked until the
-probe is concluded).
-
-
-====================
-EXAMPLE SERVER USAGE
-====================
-
-A server would be set up to accept operations in the following manner:
-
- (1) An RxRPC socket is created by:
-
-	server = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
-
-     Where the third parameter indicates the address type of the transport
-     socket used - usually IPv4.
-
- (2) Security is set up if desired by giving the socket a keyring with server
-     secret keys in it:
-
-	keyring = add_key("keyring", "AFSkeys", NULL, 0,
-			  KEY_SPEC_PROCESS_KEYRING);
-
-	const char secret_key[8] = {
-		0xa7, 0x83, 0x8a, 0xcb, 0xc7, 0x83, 0xec, 0x94 };
-	add_key("rxrpc_s", "52:2", secret_key, 8, keyring);
-
-	setsockopt(server, SOL_RXRPC, RXRPC_SECURITY_KEYRING, "AFSkeys", 7);
-
-     The keyring can be manipulated after it has been given to the socket. This
-     permits the server to add more keys, replace keys, etc. while it is live.
-
- (3) A local address must then be bound:
-
-	struct sockaddr_rxrpc srx = {
-		.srx_family	= AF_RXRPC,
-		.srx_service	= VL_SERVICE_ID, /* RxRPC service ID */
-		.transport_type	= SOCK_DGRAM,	/* type of transport socket */
-		.transport.sin_family	= AF_INET,
-		.transport.sin_port	= htons(7000), /* AFS callback */
-		.transport.sin_address	= 0,  /* all local interfaces */
-	};
-	bind(server, &srx, sizeof(srx));
-
-     More than one service ID may be bound to a socket, provided the transport
-     parameters are the same.  The limit is currently two.  To do this, bind()
-     should be called twice.
-
- (4) If service upgrading is required, first two service IDs must have been
-     bound and then the following option must be set:
-
-	unsigned short service_ids[2] = { from_ID, to_ID };
-	setsockopt(server, SOL_RXRPC, RXRPC_UPGRADEABLE_SERVICE,
-		   service_ids, sizeof(service_ids));
-
-     This will automatically upgrade connections on service from_ID to service
-     to_ID if they request it.  This will be reflected in msg_name obtained
-     through recvmsg() when the request data is delivered to userspace.
-
- (5) The server is then set to listen out for incoming calls:
-
-	listen(server, 100);
-
- (6) The kernel notifies the server of pending incoming connections by sending
-     it a message for each.  This is received with recvmsg() on the server
-     socket.  It has no data, and has a single dataless control message
-     attached:
-
-	RXRPC_NEW_CALL
-
-     The address that can be passed back by recvmsg() at this point should be
-     ignored since the call for which the message was posted may have gone by
-     the time it is accepted - in which case the first call still on the queue
-     will be accepted.
-
- (7) The server then accepts the new call by issuing a sendmsg() with two
-     pieces of control data and no actual data:
-
-	RXRPC_ACCEPT		- indicate connection acceptance
-	RXRPC_USER_CALL_ID	- specify user ID for this call
-
- (8) The first request data packet will then be posted to the server socket for
-     recvmsg() to pick up.  At that point, the RxRPC address for the call can
-     be read from the address fields in the msghdr struct.
-
-     Subsequent request data will be posted to the server socket for recvmsg()
-     to collect as it arrives.  All but the last piece of the request data will
-     be delivered with MSG_MORE flagged.
-
-     All data will be delivered with the following control message attached:
-
-	RXRPC_USER_CALL_ID	- specifies the user ID for this call
-
- (9) The reply data should then be posted to the server socket using a series
-     of sendmsg() calls, each with the following control messages attached:
-
-	RXRPC_USER_CALL_ID	- specifies the user ID for this call
-
-     MSG_MORE should be set in msghdr::msg_flags on all but the last message
-     for a particular call.
-
-(10) The final ACK from the client will be posted for retrieval by recvmsg()
-     when it is received.  It will take the form of a dataless message with two
-     control messages attached:
-
-	RXRPC_USER_CALL_ID	- specifies the user ID for this call
-	RXRPC_ACK		- indicates final ACK (no data)
-
-     MSG_EOR will be flagged to indicate that this is the final message for
-     this call.
-
-(11) Up to the point the final packet of reply data is sent, the call can be
-     aborted by calling sendmsg() with a dataless message with the following
-     control messages attached:
-
-	RXRPC_USER_CALL_ID	- specifies the user ID for this call
-	RXRPC_ABORT		- indicates abort code (4 byte data)
-
-     Any packets waiting in the socket's receive queue will be discarded if
-     this is issued.
-
-Note that all the communications for a particular service take place through
-the one server socket, using control messages on sendmsg() and recvmsg() to
-determine the call affected.
-
-
-=========================
-AF_RXRPC KERNEL INTERFACE
-=========================
-
-The AF_RXRPC module also provides an interface for use by in-kernel utilities
-such as the AFS filesystem.  This permits such a utility to:
-
- (1) Use different keys directly on individual client calls on one socket
-     rather than having to open a whole slew of sockets, one for each key it
-     might want to use.
-
- (2) Avoid having RxRPC call request_key() at the point of issue of a call or
-     opening of a socket.  Instead the utility is responsible for requesting a
-     key at the appropriate point.  AFS, for instance, would do this during VFS
-     operations such as open() or unlink().  The key is then handed through
-     when the call is initiated.
-
- (3) Request the use of something other than GFP_KERNEL to allocate memory.
-
- (4) Avoid the overhead of using the recvmsg() call.  RxRPC messages can be
-     intercepted before they get put into the socket Rx queue and the socket
-     buffers manipulated directly.
-
-To use the RxRPC facility, a kernel utility must still open an AF_RXRPC socket,
-bind an address as appropriate and listen if it's to be a server socket, but
-then it passes this to the kernel interface functions.
-
-The kernel interface functions are as follows:
-
- (*) Begin a new client call.
-
-	struct rxrpc_call *
-	rxrpc_kernel_begin_call(struct socket *sock,
-				struct sockaddr_rxrpc *srx,
-				struct key *key,
-				unsigned long user_call_ID,
-				s64 tx_total_len,
-				gfp_t gfp,
-				rxrpc_notify_rx_t notify_rx,
-				bool upgrade,
-				bool intr,
-				unsigned int debug_id);
-
-     This allocates the infrastructure to make a new RxRPC call and assigns
-     call and connection numbers.  The call will be made on the UDP port that
-     the socket is bound to.  The call will go to the destination address of a
-     connected client socket unless an alternative is supplied (srx is
-     non-NULL).
-
-     If a key is supplied then this will be used to secure the call instead of
-     the key bound to the socket with the RXRPC_SECURITY_KEY sockopt.  Calls
-     secured in this way will still share connections if at all possible.
-
-     The user_call_ID is equivalent to that supplied to sendmsg() in the
-     control data buffer.  It is entirely feasible to use this to point to a
-     kernel data structure.
-
-     tx_total_len is the amount of data the caller is intending to transmit
-     with this call (or -1 if unknown at this point).  Setting the data size
-     allows the kernel to encrypt directly to the packet buffers, thereby
-     saving a copy.  The value may not be less than -1.
-
-     notify_rx is a pointer to a function to be called when events such as
-     incoming data packets or remote aborts happen.
-
-     upgrade should be set to true if a client operation should request that
-     the server upgrade the service to a better one.  The resultant service ID
-     is returned by rxrpc_kernel_recv_data().
-
-     intr should be set to true if the call should be interruptible.  If this
-     is not set, this function may not return until a channel has been
-     allocated; if it is set, the function may return -ERESTARTSYS.
-
-     debug_id is the call debugging ID to be used for tracing.  This can be
-     obtained by atomically incrementing rxrpc_debug_id.
-
-     If this function is successful, an opaque reference to the RxRPC call is
-     returned.  The caller now holds a reference on this and it must be
-     properly ended.
-
- (*) End a client call.
-
-	void rxrpc_kernel_end_call(struct socket *sock,
-				   struct rxrpc_call *call);
-
-     This is used to end a previously begun call.  The user_call_ID is expunged
-     from AF_RXRPC's knowledge and will not be seen again in association with
-     the specified call.
-
- (*) Send data through a call.
-
-	typedef void (*rxrpc_notify_end_tx_t)(struct sock *sk,
-					      unsigned long user_call_ID,
-					      struct sk_buff *skb);
-
-	int rxrpc_kernel_send_data(struct socket *sock,
-				   struct rxrpc_call *call,
-				   struct msghdr *msg,
-				   size_t len,
-				   rxrpc_notify_end_tx_t notify_end_rx);
-
-     This is used to supply either the request part of a client call or the
-     reply part of a server call.  msg.msg_iovlen and msg.msg_iov specify the
-     data buffers to be used.  msg_iov may not be NULL and must point
-     exclusively to in-kernel virtual addresses.  msg.msg_flags may be given
-     MSG_MORE if there will be subsequent data sends for this call.
-
-     The msg must not specify a destination address, control data or any flags
-     other than MSG_MORE.  len is the total amount of data to transmit.
-
-     notify_end_rx can be NULL or it can be used to specify a function to be
-     called when the call changes state to end the Tx phase.  This function is
-     called with the call-state spinlock held to prevent any reply or final ACK
-     from being delivered first.
-
- (*) Receive data from a call.
-
-	int rxrpc_kernel_recv_data(struct socket *sock,
-				   struct rxrpc_call *call,
-				   void *buf,
-				   size_t size,
-				   size_t *_offset,
-				   bool want_more,
-				   u32 *_abort,
-				   u16 *_service)
-
-      This is used to receive data from either the reply part of a client call
-      or the request part of a service call.  buf and size specify how much
-      data is desired and where to store it.  *_offset is added on to buf and
-      subtracted from size internally; the amount copied into the buffer is
-      added to *_offset before returning.
-
-      want_more should be true if further data will be required after this is
-      satisfied and false if this is the last item of the receive phase.
-
-      There are three normal returns: 0 if the buffer was filled and want_more
-      was true; 1 if the buffer was filled, the last DATA packet has been
-      emptied and want_more was false; and -EAGAIN if the function needs to be
-      called again.
-
-      If the last DATA packet is processed but the buffer contains less than
-      the amount requested, EBADMSG is returned.  If want_more wasn't set, but
-      more data was available, EMSGSIZE is returned.
-
-      If a remote ABORT is detected, the abort code received will be stored in
-      *_abort and ECONNABORTED will be returned.
-
-      The service ID that the call ended up with is returned into *_service.
-      This can be used to see if a call got a service upgrade.
-
- (*) Abort a call.
-
-	void rxrpc_kernel_abort_call(struct socket *sock,
-				     struct rxrpc_call *call,
-				     u32 abort_code);
-
-     This is used to abort a call if it's still in an abortable state.  The
-     abort code specified will be placed in the ABORT message sent.
-
- (*) Intercept received RxRPC messages.
-
-	typedef void (*rxrpc_interceptor_t)(struct sock *sk,
-					    unsigned long user_call_ID,
-					    struct sk_buff *skb);
-
-	void
-	rxrpc_kernel_intercept_rx_messages(struct socket *sock,
-					   rxrpc_interceptor_t interceptor);
-
-     This installs an interceptor function on the specified AF_RXRPC socket.
-     All messages that would otherwise wind up in the socket's Rx queue are
-     then diverted to this function.  Note that care must be taken to process
-     the messages in the right order to maintain DATA message sequentiality.
-
-     The interceptor function itself is provided with the address of the socket
-     and handling the incoming message, the ID assigned by the kernel utility
-     to the call and the socket buffer containing the message.
-
-     The skb->mark field indicates the type of message:
-
-	MARK				MEANING
-	===============================	=======================================
-	RXRPC_SKB_MARK_DATA		Data message
-	RXRPC_SKB_MARK_FINAL_ACK	Final ACK received for an incoming call
-	RXRPC_SKB_MARK_BUSY		Client call rejected as server busy
-	RXRPC_SKB_MARK_REMOTE_ABORT	Call aborted by peer
-	RXRPC_SKB_MARK_NET_ERROR	Network error detected
-	RXRPC_SKB_MARK_LOCAL_ERROR	Local error encountered
-	RXRPC_SKB_MARK_NEW_CALL		New incoming call awaiting acceptance
-
-     The remote abort message can be probed with rxrpc_kernel_get_abort_code().
-     The two error messages can be probed with rxrpc_kernel_get_error_number().
-     A new call can be accepted with rxrpc_kernel_accept_call().
-
-     Data messages can have their contents extracted with the usual bunch of
-     socket buffer manipulation functions.  A data message can be determined to
-     be the last one in a sequence with rxrpc_kernel_is_data_last().  When a
-     data message has been used up, rxrpc_kernel_data_consumed() should be
-     called on it.
-
-     Messages should be handled to rxrpc_kernel_free_skb() to dispose of.  It
-     is possible to get extra refs on all types of message for later freeing,
-     but this may pin the state of a call until the message is finally freed.
-
- (*) Accept an incoming call.
-
-	struct rxrpc_call *
-	rxrpc_kernel_accept_call(struct socket *sock,
-				 unsigned long user_call_ID);
-
-     This is used to accept an incoming call and to assign it a call ID.  This
-     function is similar to rxrpc_kernel_begin_call() and calls accepted must
-     be ended in the same way.
-
-     If this function is successful, an opaque reference to the RxRPC call is
-     returned.  The caller now holds a reference on this and it must be
-     properly ended.
-
- (*) Reject an incoming call.
-
-	int rxrpc_kernel_reject_call(struct socket *sock);
-
-     This is used to reject the first incoming call on the socket's queue with
-     a BUSY message.  -ENODATA is returned if there were no incoming calls.
-     Other errors may be returned if the call had been aborted (-ECONNABORTED)
-     or had timed out (-ETIME).
-
- (*) Allocate a null key for doing anonymous security.
-
-	struct key *rxrpc_get_null_key(const char *keyname);
-
-     This is used to allocate a null RxRPC key that can be used to indicate
-     anonymous security for a particular domain.
-
- (*) Get the peer address of a call.
-
-	void rxrpc_kernel_get_peer(struct socket *sock, struct rxrpc_call *call,
-				   struct sockaddr_rxrpc *_srx);
-
-     This is used to find the remote peer address of a call.
-
- (*) Set the total transmit data size on a call.
-
-	void rxrpc_kernel_set_tx_length(struct socket *sock,
-					struct rxrpc_call *call,
-					s64 tx_total_len);
-
-     This sets the amount of data that the caller is intending to transmit on a
-     call.  It's intended to be used for setting the reply size as the request
-     size should be set when the call is begun.  tx_total_len may not be less
-     than zero.
-
- (*) Get call RTT.
-
-	u64 rxrpc_kernel_get_rtt(struct socket *sock, struct rxrpc_call *call);
-
-     Get the RTT time to the peer in use by a call.  The value returned is in
-     nanoseconds.
-
- (*) Check call still alive.
-
-	bool rxrpc_kernel_check_life(struct socket *sock,
-				     struct rxrpc_call *call,
-				     u32 *_life);
-	void rxrpc_kernel_probe_life(struct socket *sock,
-				     struct rxrpc_call *call);
-
-     The first function passes back in *_life a number that is updated when
-     ACKs are received from the peer (notably including PING RESPONSE ACKs
-     which we can elicit by sending PING ACKs to see if the call still exists
-     on the server).  The caller should compare the numbers of two calls to see
-     if the call is still alive after waiting for a suitable interval.  It also
-     returns true as long as the call hasn't yet reached the completed state.
-
-     This allows the caller to work out if the server is still contactable and
-     if the call is still alive on the server while waiting for the server to
-     process a client operation.
-
-     The second function causes a ping ACK to be transmitted to try to provoke
-     the peer into responding, which would then cause the value returned by the
-     first function to change.  Note that this must be called in TASK_RUNNING
-     state.
-
- (*) Get reply timestamp.
-
-	bool rxrpc_kernel_get_reply_time(struct socket *sock,
-					 struct rxrpc_call *call,
-					 ktime_t *_ts)
-
-     This allows the timestamp on the first DATA packet of the reply of a
-     client call to be queried, provided that it is still in the Rx ring.  If
-     successful, the timestamp will be stored into *_ts and true will be
-     returned; false will be returned otherwise.
-
- (*) Get remote client epoch.
-
-	u32 rxrpc_kernel_get_epoch(struct socket *sock,
-				   struct rxrpc_call *call)
-
-     This allows the epoch that's contained in packets of an incoming client
-     call to be queried.  This value is returned.  The function always
-     successful if the call is still in progress.  It shouldn't be called once
-     the call has expired.  Note that calling this on a local client call only
-     returns the local epoch.
-
-     This value can be used to determine if the remote client has been
-     restarted as it shouldn't change otherwise.
-
- (*) Set the maxmimum lifespan on a call.
-
-	void rxrpc_kernel_set_max_life(struct socket *sock,
-				       struct rxrpc_call *call,
-				       unsigned long hard_timeout)
-
-     This sets the maximum lifespan on a call to hard_timeout (which is in
-     jiffies).  In the event of the timeout occurring, the call will be
-     aborted and -ETIME or -ETIMEDOUT will be returned.
-
-
-=======================
-CONFIGURABLE PARAMETERS
-=======================
-
-The RxRPC protocol driver has a number of configurable parameters that can be
-adjusted through sysctls in /proc/net/rxrpc/:
-
- (*) req_ack_delay
-
-     The amount of time in milliseconds after receiving a packet with the
-     request-ack flag set before we honour the flag and actually send the
-     requested ack.
-
-     Usually the other side won't stop sending packets until the advertised
-     reception window is full (to a maximum of 255 packets), so delaying the
-     ACK permits several packets to be ACK'd in one go.
-
- (*) soft_ack_delay
-
-     The amount of time in milliseconds after receiving a new packet before we
-     generate a soft-ACK to tell the sender that it doesn't need to resend.
-
- (*) idle_ack_delay
-
-     The amount of time in milliseconds after all the packets currently in the
-     received queue have been consumed before we generate a hard-ACK to tell
-     the sender it can free its buffers, assuming no other reason occurs that
-     we would send an ACK.
-
- (*) resend_timeout
-
-     The amount of time in milliseconds after transmitting a packet before we
-     transmit it again, assuming no ACK is received from the receiver telling
-     us they got it.
-
- (*) max_call_lifetime
-
-     The maximum amount of time in seconds that a call may be in progress
-     before we preemptively kill it.
-
- (*) dead_call_expiry
-
-     The amount of time in seconds before we remove a dead call from the call
-     list.  Dead calls are kept around for a little while for the purpose of
-     repeating ACK and ABORT packets.
-
- (*) connection_expiry
-
-     The amount of time in seconds after a connection was last used before we
-     remove it from the connection list.  While a connection is in existence,
-     it serves as a placeholder for negotiated security; when it is deleted,
-     the security must be renegotiated.
-
- (*) transport_expiry
-
-     The amount of time in seconds after a transport was last used before we
-     remove it from the transport list.  While a transport is in existence, it
-     serves to anchor the peer data and keeps the connection ID counter.
-
- (*) rxrpc_rx_window_size
-
-     The size of the receive window in packets.  This is the maximum number of
-     unconsumed received packets we're willing to hold in memory for any
-     particular call.
-
- (*) rxrpc_rx_mtu
-
-     The maximum packet MTU size that we're willing to receive in bytes.  This
-     indicates to the peer whether we're willing to accept jumbo packets.
-
- (*) rxrpc_rx_jumbo_max
-
-     The maximum number of packets that we're willing to accept in a jumbo
-     packet.  Non-terminal packets in a jumbo packet must contain a four byte
-     header plus exactly 1412 bytes of data.  The terminal packet must contain
-     a four byte header plus any amount of data.  In any event, a jumbo packet
-     may not exceed rxrpc_rx_mtu in size.
diff --git a/MAINTAINERS b/MAINTAINERS
index b28823ab48c5..866a0dcd66ef 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14593,7 +14593,7 @@ M:	David Howells <dhowells@redhat.com>
 L:	linux-afs@lists.infradead.org
 S:	Supported
 W:	https://www.infradead.org/~dhowells/kafs/
-F:	Documentation/networking/rxrpc.txt
+F:	Documentation/networking/rxrpc.rst
 F:	include/keys/rxrpc-type.h
 F:	include/net/af_rxrpc.h
 F:	include/trace/events/rxrpc.h
diff --git a/net/rxrpc/Kconfig b/net/rxrpc/Kconfig
index 57ebb29c26ad..d706bb408365 100644
--- a/net/rxrpc/Kconfig
+++ b/net/rxrpc/Kconfig
@@ -18,7 +18,7 @@ config AF_RXRPC
 	  This module at the moment only supports client operations and is
 	  currently incomplete.
 
-	  See Documentation/networking/rxrpc.txt.
+	  See Documentation/networking/rxrpc.rst.
 
 config AF_RXRPC_IPV6
 	bool "IPv6 support for RxRPC"
@@ -41,7 +41,7 @@ config AF_RXRPC_DEBUG
 	help
 	  Say Y here to make runtime controllable debugging messages appear.
 
-	  See Documentation/networking/rxrpc.txt.
+	  See Documentation/networking/rxrpc.rst.
 
 
 config RXKAD
@@ -56,4 +56,4 @@ config RXKAD
 	  Provide kerberos 4 and AFS kaserver security handling for AF_RXRPC
 	  through the use of the key retention service.
 
-	  See Documentation/networking/rxrpc.txt.
+	  See Documentation/networking/rxrpc.rst.
diff --git a/net/rxrpc/sysctl.c b/net/rxrpc/sysctl.c
index 2bbb38161851..174e903e18de 100644
--- a/net/rxrpc/sysctl.c
+++ b/net/rxrpc/sysctl.c
@@ -21,7 +21,7 @@ static const unsigned long max_jiffies = MAX_JIFFY_OFFSET;
 /*
  * RxRPC operating parameters.
  *
- * See Documentation/networking/rxrpc.txt and the variable definitions for more
+ * See Documentation/networking/rxrpc.rst and the variable definitions for more
  * information on the individual parameters.
  */
 static struct ctl_table rxrpc_sysctl_table[] = {
-- 
cgit v1.2.3


From 671d114d8cde3ba4390714b850c86d8b39d31009 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Thu, 30 Apr 2020 18:04:22 +0200
Subject: docs: networking: convert sctp.txt to ReST

- add SPDX header;
- add a document title;
- adjust indentation, whitespace and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/index.rst |  1 +
 Documentation/networking/sctp.rst  | 42 ++++++++++++++++++++++++++++++++++++++
 Documentation/networking/sctp.txt  | 35 -------------------------------
 MAINTAINERS                        |  2 +-
 4 files changed, 44 insertions(+), 36 deletions(-)
 create mode 100644 Documentation/networking/sctp.rst
 delete mode 100644 Documentation/networking/sctp.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index cd307b9601fa..1761eb715061 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -100,6 +100,7 @@ Contents:
    rds
    regulatory
    rxrpc
+   sctp
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/sctp.rst b/Documentation/networking/sctp.rst
new file mode 100644
index 000000000000..9f4d9c8a925b
--- /dev/null
+++ b/Documentation/networking/sctp.rst
@@ -0,0 +1,42 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================
+Linux Kernel SCTP
+=================
+
+This is the current BETA release of the Linux Kernel SCTP reference
+implementation.
+
+SCTP (Stream Control Transmission Protocol) is an IP-based, message-oriented,
+reliable transport protocol, with congestion control, support for
+transparent multi-homing, and multiple ordered streams of messages.
+RFC2960 defines the core protocol.  The IETF SIGTRAN working group originally
+developed the SCTP protocol and later handed the protocol over to the
+Transport Area (TSVWG) working group for the continued evolvement of SCTP as a
+general purpose transport.
+
+See the IETF website (http://www.ietf.org) for further documents on SCTP.
+See http://www.ietf.org/rfc/rfc2960.txt
+
+The initial project goal is to create a Linux kernel reference implementation
+of SCTP that is RFC 2960 compliant and provides a programming interface
+referred to as the UDP-style API of the Sockets Extensions for SCTP, as
+proposed in IETF Internet-Drafts.
+
+Caveats
+=======
+
+- lksctp can be built statically or as a module.  However, be aware that
+  module removal of lksctp is not yet a safe activity.
+
+- There is tentative support for IPv6, but most work has gone towards
+  implementing and testing lksctp on IPv4.
+
+
+For more information, please visit the lksctp project website:
+
+   http://www.sf.net/projects/lksctp
+
+Or contact the lksctp developers through the mailing list:
+
+   <linux-sctp@vger.kernel.org>
diff --git a/Documentation/networking/sctp.txt b/Documentation/networking/sctp.txt
deleted file mode 100644
index 97b810ca9082..000000000000
--- a/Documentation/networking/sctp.txt
+++ /dev/null
@@ -1,35 +0,0 @@
-Linux Kernel SCTP 
-
-This is the current BETA release of the Linux Kernel SCTP reference
-implementation.  
-
-SCTP (Stream Control Transmission Protocol) is a IP based, message oriented,
-reliable transport protocol, with congestion control, support for
-transparent multi-homing, and multiple ordered streams of messages.
-RFC2960 defines the core protocol.  The IETF SIGTRAN working group originally
-developed the SCTP protocol and later handed the protocol over to the 
-Transport Area (TSVWG) working group for the continued evolvement of SCTP as a 
-general purpose transport.  
-
-See the IETF website (http://www.ietf.org) for further documents on SCTP. 
-See http://www.ietf.org/rfc/rfc2960.txt 
-
-The initial project goal is to create an Linux kernel reference implementation
-of SCTP that is RFC 2960 compliant and provides an programming interface 
-referred to as the  UDP-style API of the Sockets Extensions for SCTP, as 
-proposed in IETF Internet-Drafts.    
-
-Caveats:  
-
--lksctp can be built as statically or as a module.  However, be aware that 
-module removal of lksctp is not yet a safe activity.   
-
--There is tentative support for IPv6, but most work has gone towards 
-implementation and testing lksctp on IPv4.   
-
-
-For more information, please visit the lksctp project website:
-   http://www.sf.net/projects/lksctp
-
-Or contact the lksctp developers through the mailing list:
-   <linux-sctp@vger.kernel.org>
diff --git a/MAINTAINERS b/MAINTAINERS
index 866a0dcd66ef..0ac9cec0bce6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14999,7 +14999,7 @@ M:	Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
 L:	linux-sctp@vger.kernel.org
 S:	Maintained
 W:	http://lksctp.sourceforge.net
-F:	Documentation/networking/sctp.txt
+F:	Documentation/networking/sctp.rst
 F:	include/linux/sctp.h
 F:	include/net/sctp/
 F:	include/uapi/linux/sctp.h
-- 
cgit v1.2.3


From 973d55e590beeca13fece60596ee3b511d36d9da Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Fri, 1 May 2020 16:44:23 +0200
Subject: docs: networking: convert tuntap.txt to ReST

- add SPDX header;
- use copyright symbol;
- adjust titles and chapters, adding proper markups;
- mark code blocks and literals as such;
- adjust indentation, whitespace and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/index.rst  |   1 +
 Documentation/networking/tuntap.rst | 259 ++++++++++++++++++++++++++++++++++++
 Documentation/networking/tuntap.txt | 227 -------------------------------
 MAINTAINERS                         |   2 +-
 drivers/net/Kconfig                 |   2 +-
 5 files changed, 262 insertions(+), 229 deletions(-)
 create mode 100644 Documentation/networking/tuntap.rst
 delete mode 100644 Documentation/networking/tuntap.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index b423b2db5f96..e7a683f0528d 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -111,6 +111,7 @@ Contents:
    team
    timestamping
    tproxy
+   tuntap
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/tuntap.rst b/Documentation/networking/tuntap.rst
new file mode 100644
index 000000000000..a59d1dd6fdcc
--- /dev/null
+++ b/Documentation/networking/tuntap.rst
@@ -0,0 +1,259 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+===============================
+Universal TUN/TAP device driver
+===============================
+
+Copyright |copy| 1999-2000 Maxim Krasnyansky <max_mk@yahoo.com>
+
+  Linux, Solaris drivers
+  Copyright |copy| 1999-2000 Maxim Krasnyansky <max_mk@yahoo.com>
+
+  FreeBSD TAP driver
+  Copyright |copy| 1999-2000 Maksim Yevmenkin <m_evmenkin@yahoo.com>
+
+  Revision of this document 2002 by Florian Thiel <florian.thiel@gmx.net>
+
+1. Description
+==============
+
+  TUN/TAP provides packet reception and transmission for user space programs.
+  It can be seen as a simple Point-to-Point or Ethernet device, which,
+  instead of receiving packets from physical media, receives them from a
+  user space program and, instead of sending packets via physical media,
+  writes them to the user space program.
+
+  In order to use the driver a program has to open /dev/net/tun and issue a
+  corresponding ioctl() to register a network device with the kernel. A network
+  device will appear as tunXX or tapXX, depending on the options chosen. When
+  the program closes the file descriptor, the network device and all
+  corresponding routes will disappear.
+
+  Depending on the type of device chosen the userspace program has to read/write
+  IP packets (with tun) or ethernet frames (with tap). Which one is being used
+  depends on the flags given with the ioctl().
+
+  The package from http://vtun.sourceforge.net/tun contains two simple examples
+  of how to use tun and tap devices. Both programs work like a bridge between
+  two network interfaces: br_select.c is a bridge based on the select() system
+  call, and br_sigio.c is a bridge based on asynchronous I/O and the SIGIO
+  signal.
+  However, the best example is VTun http://vtun.sourceforge.net :))
+
+2. Configuration
+================
+
+  Create device node (mkdir is only needed if /dev/net doesn't exist already)::
+
+     mkdir /dev/net
+     mknod /dev/net/tun c 10 200
+
+  Set permissions, e.g.::
+
+     chmod 0666 /dev/net/tun
+
+  There's no harm in allowing the device to be accessible by non-root users,
+  since CAP_NET_ADMIN is required for creating network devices or for
+  connecting to network devices which aren't owned by the user in question.
+  If you want to create persistent devices and give ownership of them to
+  unprivileged users, then you need the /dev/net/tun device to be usable by
+  those users.
+
+  Driver module autoloading
+
+     Make sure that "Kernel module loader" - module auto-loading
+     support is enabled in your kernel.  The kernel should load it on
+     first access.
+
+  Manual loading
+
+     insert the module by hand::
+
+	modprobe tun
+
+  If you load the module by hand, you have to do so every time you need it;
+  with autoloading, it is loaded automatically when /dev/net/tun is
+  opened.
+
+3. Program interface
+====================
+
+3.1 Network device allocation
+-----------------------------
+
+``char *dev`` should be the name of the device with a format string (e.g.
+"tun%d"), but (as far as I can see) this can be any valid network device name.
+Note that the buffer ``dev`` points to is overwritten with the real device name
+(e.g. "tun0"), so it must be writable and at least IFNAMSIZ bytes long::
+
+  #include <linux/if.h>
+  #include <linux/if_tun.h>
+
+  int tun_alloc(char *dev)
+  {
+      struct ifreq ifr;
+      int fd, err;
+
+      if( (fd = open("/dev/net/tun", O_RDWR)) < 0 )
+	 return tun_alloc_old(dev);
+
+      memset(&ifr, 0, sizeof(ifr));
+
+      /* Flags: IFF_TUN   - TUN device (no Ethernet headers)
+       *        IFF_TAP   - TAP device
+       *
+       *        IFF_NO_PI - Do not provide packet information
+       */
+      ifr.ifr_flags = IFF_TUN;
+      if( *dev )
+	 strncpy(ifr.ifr_name, dev, IFNAMSIZ - 1); /* ifr is zeroed: stays NUL-terminated */
+
+      if( (err = ioctl(fd, TUNSETIFF, (void *) &ifr)) < 0 ){
+	 close(fd);
+	 return err;
+      }
+      strcpy(dev, ifr.ifr_name);
+      return fd;
+  }
+
+3.2 Frame format
+----------------
+
+If the IFF_NO_PI flag is not set, each frame has the following format::
+
+     Flags [2 bytes]
+     Proto [2 bytes]
+     Raw protocol (IP, IPv6, etc.) frame.
+
+3.3 Multiqueue tuntap interface
+-------------------------------
+
+From version 3.8, Linux supports multiqueue tuntap, which can use multiple
+file descriptors (queues) to parallelize packet sending and receiving. The
+device allocation is the same as before; if the user wants to create multiple
+queues, TUNSETIFF must be called once per queue with the same device name and
+the IFF_MULTI_QUEUE flag set.
+
+``char *dev`` should be the name of the device, queues is the number of queues
+to be created, and fds is used to store the file descriptors (queues) created
+and return them to the caller. Each file descriptor serves as the interface to
+one queue, which userspace can access.
+
+::
+
+  #include <linux/if.h>
+  #include <linux/if_tun.h>
+
+  int tun_alloc_mq(char *dev, int queues, int *fds)
+  {
+      struct ifreq ifr;
+      int fd, err, i;
+
+      if (!dev)
+	  return -1;
+
+      memset(&ifr, 0, sizeof(ifr));
+      /* Flags: IFF_TUN   - TUN device (no Ethernet headers)
+       *        IFF_TAP   - TAP device
+       *
+       *        IFF_NO_PI - Do not provide packet information
+       *        IFF_MULTI_QUEUE - Create a queue of multiqueue device
+       */
+      ifr.ifr_flags = IFF_TAP | IFF_NO_PI | IFF_MULTI_QUEUE;
+      strncpy(ifr.ifr_name, dev, IFNAMSIZ - 1); /* bounded copy; ifr is zeroed */
+
+      for (i = 0; i < queues; i++) {
+	  if ((err = fd = open("/dev/net/tun", O_RDWR)) < 0) /* record failure in err */
+	     goto err;
+	  err = ioctl(fd, TUNSETIFF, (void *)&ifr);
+	  if (err) {
+	     close(fd);
+	     goto err;
+	  }
+	  fds[i] = fd;
+      }
+
+      return 0;
+  err:
+      for (--i; i >= 0; i--)
+	  close(fds[i]);
+      return err;
+  }
+
+A new ioctl, TUNSETQUEUE, was introduced to enable or disable a queue. Calling
+it with the IFF_DETACH_QUEUE flag disables the queue, and calling it with the
+IFF_ATTACH_QUEUE flag enables it. A queue is enabled by default after it is
+created through TUNSETIFF.
+
+fd is the file descriptor (queue) that we want to enable or disable; when
+enable is true we enable it, otherwise we disable it::
+
+  #include <linux/if.h>
+  #include <linux/if_tun.h>
+
+  int tun_set_queue(int fd, int enable)
+  {
+      struct ifreq ifr;
+
+      memset(&ifr, 0, sizeof(ifr));
+
+      if (enable)
+	 ifr.ifr_flags = IFF_ATTACH_QUEUE;
+      else
+	 ifr.ifr_flags = IFF_DETACH_QUEUE;
+
+      return ioctl(fd, TUNSETQUEUE, (void *)&ifr);
+  }
+
+Universal TUN/TAP device driver Frequently Asked Questions
+==========================================================
+
+1. What platforms are supported by the TUN/TAP driver?
+
+Currently the driver has been written for 3 Unices:
+
+  - Linux kernels 2.2.x, 2.4.x
+  - FreeBSD 3.x, 4.x, 5.x
+  - Solaris 2.6, 7.0, 8.0
+
+2. What is the TUN/TAP driver used for?
+
+As mentioned above, the main purpose of the TUN/TAP driver is tunneling.
+It is used by VTun (http://vtun.sourceforge.net).
+
+Another interesting application using TUN/TAP is pipsecd
+(http://perso.enst.fr/~beyssac/pipsec/), a userspace IPSec
+implementation that can use complete kernel routing (unlike FreeS/WAN).
+
+3. How does the virtual network device actually work?
+
+The virtual network device can be viewed as a simple Point-to-Point or
+Ethernet device which, instead of receiving packets from physical
+media, receives them from a user space program and, instead of sending
+packets via physical media, sends them to the user space program.
+
+Let's say that you configured IPv6 on tap0. Whenever the kernel sends
+an IPv6 packet to tap0, it is passed to the application
+(VTun for example). The application encrypts, compresses and sends it to
+the other side over TCP or UDP. The application on the other side decompresses
+and decrypts the data received and writes the packet to the TAP device;
+the kernel handles the packet as if it came from a real physical device.
+
+4. What is the difference between the TUN driver and the TAP driver?
+
+TUN works with IP packets. TAP works with Ethernet frames.
+
+This means that you have to read/write IP packets when you are using tun and
+Ethernet frames when using tap.
+
+5. What is the difference between BPF and the TUN/TAP driver?
+
+BPF is an advanced packet filter. It can be attached to an existing
+network interface. It does not provide a virtual network interface.
+A TUN/TAP driver does provide a virtual network interface and it is possible
+to attach BPF to this interface.
+
+6. Does the TAP driver support kernel Ethernet bridging?
+
+Yes. The Linux and FreeBSD drivers support Ethernet bridging.
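Note on section 3.2 of the new tuntap.rst: when IFF_NO_PI is not set, every
buffer read from the queue file descriptor starts with the 4-byte header
described there, which corresponds to struct tun_pi in <linux/if_tun.h> (flags
in host byte order, proto as a big-endian EtherType). The following is only a
minimal sketch of splitting such a buffer into header and payload; pi_header,
PI_HEADER_LEN and parse_pi_header are hypothetical names used for illustration
and are not part of the kernel API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of struct tun_pi; not a kernel type. */
struct pi_header {
    uint16_t flags;  /* copied as-is (host byte order) */
    uint16_t proto;  /* EtherType, converted to host byte order */
};

#define PI_HEADER_LEN 4  /* Flags [2 bytes] + Proto [2 bytes] */

/*
 * Split a buffer read from a TUN/TAP fd that was set up without IFF_NO_PI.
 * Fills *pi and *payload_len and returns a pointer to the raw protocol
 * frame, or NULL if the buffer is too short to hold the header.
 */
const uint8_t *parse_pi_header(const uint8_t *buf, size_t len,
                               struct pi_header *pi, size_t *payload_len)
{
    assert(buf && pi && payload_len);

    if (len < PI_HEADER_LEN)
        return NULL;

    /* memcpy avoids unaligned access; flags stay in host byte order. */
    memcpy(&pi->flags, buf, sizeof(pi->flags));
    /* proto is big-endian on the wire (e.g. 0x0800 for IPv4). */
    pi->proto = (uint16_t)((buf[2] << 8) | buf[3]);

    *payload_len = len - PI_HEADER_LEN;
    return buf + PI_HEADER_LEN;
}
```

For a buffer that begins with the bytes 00 00 08 00, the sketch reports proto
0x0800 (IPv4) and a payload that starts at byte 4.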
diff --git a/Documentation/networking/tuntap.txt b/Documentation/networking/tuntap.txt
deleted file mode 100644
index 0104830d5075..000000000000
--- a/Documentation/networking/tuntap.txt
+++ /dev/null
@@ -1,227 +0,0 @@
-Universal TUN/TAP device driver.
-Copyright (C) 1999-2000 Maxim Krasnyansky <max_mk@yahoo.com>
-
-  Linux, Solaris drivers 
-  Copyright (C) 1999-2000 Maxim Krasnyansky <max_mk@yahoo.com>
-
-  FreeBSD TAP driver 
-  Copyright (c) 1999-2000 Maksim Yevmenkin <m_evmenkin@yahoo.com>
-
-  Revision of this document 2002 by Florian Thiel <florian.thiel@gmx.net>
-
-1. Description
-  TUN/TAP provides packet reception and transmission for user space programs. 
-  It can be seen as a simple Point-to-Point or Ethernet device, which,
-  instead of receiving packets from physical media, receives them from 
-  user space program and instead of sending packets via physical media 
-  writes them to the user space program. 
-
-  In order to use the driver a program has to open /dev/net/tun and issue a
-  corresponding ioctl() to register a network device with the kernel. A network
-  device will appear as tunXX or tapXX, depending on the options chosen. When
-  the program closes the file descriptor, the network device and all
-  corresponding routes will disappear.
-
-  Depending on the type of device chosen the userspace program has to read/write
-  IP packets (with tun) or ethernet frames (with tap). Which one is being used
-  depends on the flags given with the ioctl().
-
-  The package from http://vtun.sourceforge.net/tun contains two simple examples
-  for how to use tun and tap devices. Both programs work like a bridge between
-  two network interfaces.
-  br_select.c - bridge based on select system call.
-  br_sigio.c  - bridge based on async io and SIGIO signal.
-  However, the best example is VTun http://vtun.sourceforge.net :))
-
-2. Configuration 
-  Create device node:
-     mkdir /dev/net (if it doesn't exist already)
-     mknod /dev/net/tun c 10 200
-  
-  Set permissions:
-     e.g. chmod 0666 /dev/net/tun
-     There's no harm in allowing the device to be accessible by non-root users,
-     since CAP_NET_ADMIN is required for creating network devices or for 
-     connecting to network devices which aren't owned by the user in question.
-     If you want to create persistent devices and give ownership of them to 
-     unprivileged users, then you need the /dev/net/tun device to be usable by
-     those users.
-
-  Driver module autoloading
-
-     Make sure that "Kernel module loader" - module auto-loading
-     support is enabled in your kernel.  The kernel should load it on
-     first access.
-  
-  Manual loading 
-     insert the module by hand:
-        modprobe tun
-
-  If you do it the latter way, you have to load the module every time you
-  need it, if you do it the other way it will be automatically loaded when
-  /dev/net/tun is being opened.
-
-3. Program interface 
-  3.1 Network device allocation:
-
-  char *dev should be the name of the device with a format string (e.g.
-  "tun%d"), but (as far as I can see) this can be any valid network device name.
-  Note that the character pointer becomes overwritten with the real device name
-  (e.g. "tun0")
-
-  #include <linux/if.h>
-  #include <linux/if_tun.h>
-
-  int tun_alloc(char *dev)
-  {
-      struct ifreq ifr;
-      int fd, err;
-
-      if( (fd = open("/dev/net/tun", O_RDWR)) < 0 )
-         return tun_alloc_old(dev);
-
-      memset(&ifr, 0, sizeof(ifr));
-
-      /* Flags: IFF_TUN   - TUN device (no Ethernet headers) 
-       *        IFF_TAP   - TAP device  
-       *
-       *        IFF_NO_PI - Do not provide packet information  
-       */ 
-      ifr.ifr_flags = IFF_TUN; 
-      if( *dev )
-         strncpy(ifr.ifr_name, dev, IFNAMSIZ);
-
-      if( (err = ioctl(fd, TUNSETIFF, (void *) &ifr)) < 0 ){
-         close(fd);
-         return err;
-      }
-      strcpy(dev, ifr.ifr_name);
-      return fd;
-  }              
- 
-  3.2 Frame format:
-  If flag IFF_NO_PI is not set each frame format is: 
-     Flags [2 bytes]
-     Proto [2 bytes]
-     Raw protocol(IP, IPv6, etc) frame.
-
-  3.3 Multiqueue tuntap interface:
-
-  From version 3.8, Linux supports multiqueue tuntap which can uses multiple
-  file descriptors (queues) to parallelize packets sending or receiving. The
-  device allocation is the same as before, and if user wants to create multiple
-  queues, TUNSETIFF with the same device name must be called many times with
-  IFF_MULTI_QUEUE flag.
-
-  char *dev should be the name of the device, queues is the number of queues to
-  be created, fds is used to store and return the file descriptors (queues)
-  created to the caller. Each file descriptor were served as the interface of a
-  queue which could be accessed by userspace.
-
-  #include <linux/if.h>
-  #include <linux/if_tun.h>
-
-  int tun_alloc_mq(char *dev, int queues, int *fds)
-  {
-      struct ifreq ifr;
-      int fd, err, i;
-
-      if (!dev)
-          return -1;
-
-      memset(&ifr, 0, sizeof(ifr));
-      /* Flags: IFF_TUN   - TUN device (no Ethernet headers)
-       *        IFF_TAP   - TAP device
-       *
-       *        IFF_NO_PI - Do not provide packet information
-       *        IFF_MULTI_QUEUE - Create a queue of multiqueue device
-       */
-      ifr.ifr_flags = IFF_TAP | IFF_NO_PI | IFF_MULTI_QUEUE;
-      strcpy(ifr.ifr_name, dev);
-
-      for (i = 0; i < queues; i++) {
-          if ((fd = open("/dev/net/tun", O_RDWR)) < 0)
-             goto err;
-          err = ioctl(fd, TUNSETIFF, (void *)&ifr);
-          if (err) {
-             close(fd);
-             goto err;
-          }
-          fds[i] = fd;
-      }
-
-      return 0;
-  err:
-      for (--i; i >= 0; i--)
-          close(fds[i]);
-      return err;
-  }
-
-  A new ioctl(TUNSETQUEUE) were introduced to enable or disable a queue. When
-  calling it with IFF_DETACH_QUEUE flag, the queue were disabled. And when
-  calling it with IFF_ATTACH_QUEUE flag, the queue were enabled. The queue were
-  enabled by default after it was created through TUNSETIFF.
-
-  fd is the file descriptor (queue) that we want to enable or disable, when
-  enable is true we enable it, otherwise we disable it
-
-  #include <linux/if.h>
-  #include <linux/if_tun.h>
-
-  int tun_set_queue(int fd, int enable)
-  {
-      struct ifreq ifr;
-
-      memset(&ifr, 0, sizeof(ifr));
-
-      if (enable)
-         ifr.ifr_flags = IFF_ATTACH_QUEUE;
-      else
-         ifr.ifr_flags = IFF_DETACH_QUEUE;
-
-      return ioctl(fd, TUNSETQUEUE, (void *)&ifr);
-  }
-
-Universal TUN/TAP device driver Frequently Asked Question.
-   
-1. What platforms are supported by TUN/TAP driver ?
-Currently driver has been written for 3 Unices:
-   Linux kernels 2.2.x, 2.4.x 
-   FreeBSD 3.x, 4.x, 5.x
-   Solaris 2.6, 7.0, 8.0
-
-2. What is TUN/TAP driver used for?
-As mentioned above, main purpose of TUN/TAP driver is tunneling. 
-It is used by VTun (http://vtun.sourceforge.net).
-
-Another interesting application using TUN/TAP is pipsecd
-(http://perso.enst.fr/~beyssac/pipsec/), a userspace IPSec
-implementation that can use complete kernel routing (unlike FreeS/WAN).
-
-3. How does Virtual network device actually work ? 
-Virtual network device can be viewed as a simple Point-to-Point or
-Ethernet device, which instead of receiving packets from a physical 
-media, receives them from user space program and instead of sending 
-packets via physical media sends them to the user space program. 
-
-Let's say that you configured IPv6 on the tap0, then whenever
-the kernel sends an IPv6 packet to tap0, it is passed to the application
-(VTun for example). The application encrypts, compresses and sends it to 
-the other side over TCP or UDP. The application on the other side decompresses
-and decrypts the data received and writes the packet to the TAP device, 
-the kernel handles the packet like it came from real physical device.
-
-4. What is the difference between TUN driver and TAP driver?
-TUN works with IP frames. TAP works with Ethernet frames.
-
-This means that you have to read/write IP packets when you are using tun and
-ethernet frames when using tap.
-
-5. What is the difference between BPF and TUN/TAP driver?
-BPF is an advanced packet filter. It can be attached to existing
-network interface. It does not provide a virtual network interface.
-A TUN/TAP driver does provide a virtual network interface and it is possible
-to attach BPF to this interface.
-
-6. Does TAP driver support kernel Ethernet bridging?
-Yes. Linux and FreeBSD drivers support Ethernet bridging. 
diff --git a/MAINTAINERS b/MAINTAINERS
index 0ac9cec0bce6..6456c5bb02f1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17161,7 +17161,7 @@ TUN/TAP driver
 M:	Maxim Krasnyansky <maxk@qti.qualcomm.com>
 S:	Maintained
 W:	http://vtun.sourceforge.net/tun
-F:	Documentation/networking/tuntap.txt
+F:	Documentation/networking/tuntap.rst
 F:	arch/um/os-Linux/drivers/
 
 TURBOCHANNEL SUBSYSTEM
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index ad64be98330f..3f2c98a7906c 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -355,7 +355,7 @@ config TUN
 	  devices, driver will automatically delete tunXX or tapXX device and
 	  all routes corresponding to it.
 
-	  Please read <file:Documentation/networking/tuntap.txt> for more
+	  Please read <file:Documentation/networking/tuntap.rst> for more
 	  information.
 
 	  To compile this driver as a module, choose M here: the module
-- 
cgit v1.2.3


From 58ccb2b2e87d52ec0b4cbd40b94e0b63e90af873 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Fri, 1 May 2020 16:44:25 +0200
Subject: docs: networking: convert vrf.txt to ReST

- add SPDX header;
- adjust title markup;
- Add a subtitle for the first section;
- mark code blocks and literals as such;
- adjust indentation, whitespaces and blank lines;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/index.rst |   1 +
 Documentation/networking/vrf.rst   | 451 +++++++++++++++++++++++++++++++++++++
 Documentation/networking/vrf.txt   | 418 ----------------------------------
 MAINTAINERS                        |   2 +-
 4 files changed, 453 insertions(+), 419 deletions(-)
 create mode 100644 Documentation/networking/vrf.rst
 delete mode 100644 Documentation/networking/vrf.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index ca0b0dbfd9ad..2227b9f4509d 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -113,6 +113,7 @@ Contents:
    tproxy
    tuntap
    udplite
+   vrf
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/vrf.rst b/Documentation/networking/vrf.rst
new file mode 100644
index 000000000000..0dde145043bc
--- /dev/null
+++ b/Documentation/networking/vrf.rst
@@ -0,0 +1,451 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================================
+Virtual Routing and Forwarding (VRF)
+====================================
+
+The VRF Device
+==============
+
+The VRF device combined with ip rules provides the ability to create virtual
+routing and forwarding domains (aka VRFs, VRF-lite to be specific) in the
+Linux network stack. One use case is the multi-tenancy problem where each
+tenant has their own unique routing tables and, at the very least, needs
+different default gateways.
+
+Processes can be "VRF aware" by binding a socket to the VRF device. Packets
+through the socket then use the routing table associated with the VRF
+device. An important feature of the VRF device implementation is that it
+impacts only Layer 3 and above so L2 tools (e.g., LLDP) are not affected
+(i.e., they do not need to be run in each VRF). The design also allows
+the use of higher priority ip rules (Policy Based Routing, PBR) to take
+precedence over the VRF device rules, directing specific traffic as desired.
+
+In addition, VRF devices allow VRFs to be nested within namespaces. For
+example, network namespaces provide separation of network interfaces at the
+device layer, VLANs on the interfaces within a namespace provide L2 separation,
+and VRF devices then provide L3 separation.
+
+Design
+------
+A VRF device is created with an associated route table. Network interfaces
+are then enslaved to a VRF device::
+
+	 +-----------------------------+
+	 |           vrf-blue          |  ===> route table 10
+	 +-----------------------------+
+	    |        |            |
+	 +------+ +------+     +-------------+
+	 | eth1 | | eth2 | ... |    bond1    |
+	 +------+ +------+     +-------------+
+				  |       |
+			      +------+ +------+
+			      | eth8 | | eth9 |
+			      +------+ +------+
+
+Packets received on an enslaved device are switched to the VRF device
+in the IPv4 and IPv6 processing stacks, giving the impression that packets
+flow through the VRF device. Similarly, on egress, routing rules are used to
+send packets to the VRF device driver before they are sent out the actual
+interface. This allows tcpdump on a VRF device to capture all packets into
+and out of the VRF as a whole\ [1]_. Similarly, netfilter\ [2]_ and tc rules
+can be applied using the VRF device to specify rules that apply to the VRF
+domain as a whole.
+
+.. [1] Packets in the forwarded state do not flow through the device, so those
+       packets are not seen by tcpdump. This limitation will be revisited in a
+       future release.
+
+.. [2] Iptables on ingress supports PREROUTING with skb->dev set to the real
+       ingress device and both INPUT and PREROUTING rules with skb->dev set to
+       the VRF device. For egress, POSTROUTING and OUTPUT rules can be written
+       using either the VRF device or real egress device.
+
+Setup
+-----
+1. A VRF device is created with an association to a FIB table,
+   e.g.::
+
+	ip link add vrf-blue type vrf table 10
+	ip link set dev vrf-blue up
+
+2. An l3mdev FIB rule directs lookups to the table associated with the device.
+   A single l3mdev rule is sufficient for all VRFs. When the first VRF device
+   is created, the l3mdev rule is added for IPv4 and IPv6 with a default
+   preference of 1000. Users may delete the rule if desired and add it back
+   with a different priority or install per-VRF rules.
+
+   Prior to the v4.8 kernel, iif and oif rules are needed for each VRF device::
+
+       ip ru add oif vrf-blue table 10
+       ip ru add iif vrf-blue table 10
+
+3. Set the default route for the table (and hence default route for the VRF)::
+
+       ip route add table 10 unreachable default metric 4278198272
+
+   This high metric value ensures that the default unreachable route can
+   be overridden by a routing protocol suite.  FRRouting interprets
+   kernel metrics as a combined admin distance (upper byte) and priority
+   (lower 3 bytes).  Thus the above metric translates to [255/8192].
+
+4. Enslave L3 interfaces to a VRF device::
+
+       ip link set dev eth1 master vrf-blue
+
+   Local and connected routes for enslaved devices are automatically moved to
+   the table associated with the VRF device. Any additional routes depending on
+   the enslaved device are dropped and will need to be reinserted into the VRF
+   FIB table following the enslavement.
+
+   The IPv6 sysctl option keep_addr_on_down can be enabled to keep IPv6 global
+   addresses as VRF enslavement changes::
+
+       sysctl -w net.ipv6.conf.all.keep_addr_on_down=1
+
+5. Additional VRF routes are added to the associated table::
+
+       ip route add table 10 ...
+
+
+Applications
+------------
+Applications that are to work within a VRF need to bind their socket to the
+VRF device::
+
+    setsockopt(sd, SOL_SOCKET, SO_BINDTODEVICE, dev, strlen(dev)+1);
+
+or to specify the output device using cmsg and IP_PKTINFO.
+
+By default the scope of the port bindings for unbound sockets is
+limited to the default VRF. That is, an unbound socket will not be matched by
+packets arriving on interfaces enslaved to an l3mdev, and processes may bind
+to the same port if they bind to an l3mdev.
+
+TCP & UDP services running in the default VRF context (i.e., not bound
+to any VRF device) can work across all VRF domains by enabling the
+tcp_l3mdev_accept and udp_l3mdev_accept sysctl options::
+
+    sysctl -w net.ipv4.tcp_l3mdev_accept=1
+    sysctl -w net.ipv4.udp_l3mdev_accept=1
+
+These options are disabled by default so that a socket in a VRF is only
+selected for packets in that VRF. There is a similar option for RAW
+sockets, which is enabled by default for reasons of backwards compatibility.
+It makes it possible to specify the output device with cmsg and IP_PKTINFO
+while using a socket not bound to the corresponding VRF. This allows e.g. older
+ping implementations to be run by specifying the device but without executing
+them in the VRF. This option can be disabled so that packets received in a VRF
+context are only handled by a raw socket bound to the VRF, and packets in the
+default VRF are only handled by a socket not bound to any VRF::
+
+    sysctl -w net.ipv4.raw_l3mdev_accept=0
+
+netfilter rules on the VRF device can be used to limit access to services
+running in the default VRF context as well.
+
+--------------------------------------------------------------------------------
+
+Using iproute2 for VRFs
+=======================
+iproute2 supports the vrf keyword as of v4.7. For backwards compatibility this
+section lists both commands where appropriate -- with the vrf keyword and the
+older form without it.
+
+1. Create a VRF
+
+   To instantiate a VRF device and associate it with a table::
+
+       $ ip link add dev NAME type vrf table ID
+
+   As of v4.8 the kernel supports the l3mdev FIB rule where a single rule
+   covers all VRFs. The l3mdev rule is created for IPv4 and IPv6 when the
+   first device is created.
+
+2. List VRFs
+
+   To list VRFs that have been created::
+
+       $ ip [-d] link show type vrf
+	 NOTE: The -d option is needed to show the table id
+
+   For example::
+
+       $ ip -d link show type vrf
+       11: mgmt: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
+	   link/ether 72:b3:ba:91:e2:24 brd ff:ff:ff:ff:ff:ff promiscuity 0
+	   vrf table 1 addrgenmode eui64
+       12: red: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
+	   link/ether b6:6f:6e:f6:da:73 brd ff:ff:ff:ff:ff:ff promiscuity 0
+	   vrf table 10 addrgenmode eui64
+       13: blue: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
+	   link/ether 36:62:e8:7d:bb:8c brd ff:ff:ff:ff:ff:ff promiscuity 0
+	   vrf table 66 addrgenmode eui64
+       14: green: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
+	   link/ether e6:28:b8:63:70:bb brd ff:ff:ff:ff:ff:ff promiscuity 0
+	   vrf table 81 addrgenmode eui64
+
+
+   Or in brief output::
+
+       $ ip -br link show type vrf
+       mgmt         UP             72:b3:ba:91:e2:24 <NOARP,MASTER,UP,LOWER_UP>
+       red          UP             b6:6f:6e:f6:da:73 <NOARP,MASTER,UP,LOWER_UP>
+       blue         UP             36:62:e8:7d:bb:8c <NOARP,MASTER,UP,LOWER_UP>
+       green        UP             e6:28:b8:63:70:bb <NOARP,MASTER,UP,LOWER_UP>
+
+
+3. Assign a Network Interface to a VRF
+
+   Network interfaces are assigned to a VRF by enslaving the netdevice to a
+   VRF device::
+
+       $ ip link set dev NAME master NAME
+
+   On enslavement, connected and local routes are automatically moved to the
+   table associated with the VRF device.
+
+   For example::
+
+       $ ip link set dev eth0 master mgmt
+
+
+4. Show Devices Assigned to a VRF
+
+   To show devices that have been assigned to a specific VRF, add the vrf or
+   master option to the ip command::
+
+       $ ip link show vrf NAME
+       $ ip link show master NAME
+
+   For example::
+
+       $ ip link show vrf red
+       3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP mode DEFAULT group default qlen 1000
+	   link/ether 02:00:00:00:02:02 brd ff:ff:ff:ff:ff:ff
+       4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP mode DEFAULT group default qlen 1000
+	   link/ether 02:00:00:00:02:03 brd ff:ff:ff:ff:ff:ff
+       7: eth5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop master red state DOWN mode DEFAULT group default qlen 1000
+	   link/ether 02:00:00:00:02:06 brd ff:ff:ff:ff:ff:ff
+
+
+   Or using the brief output::
+
+       $ ip -br link show vrf red
+       eth1             UP             02:00:00:00:02:02 <BROADCAST,MULTICAST,UP,LOWER_UP>
+       eth2             UP             02:00:00:00:02:03 <BROADCAST,MULTICAST,UP,LOWER_UP>
+       eth5             DOWN           02:00:00:00:02:06 <BROADCAST,MULTICAST>
+
+
+5. Show Neighbor Entries for a VRF
+
+   To list neighbor entries associated with devices enslaved to a VRF device,
+   add the vrf or master option to the ip command::
+
+       $ ip [-6] neigh show vrf NAME
+       $ ip [-6] neigh show master NAME
+
+   For example::
+
+       $ ip neigh show vrf red
+       10.2.1.254 dev eth1 lladdr a6:d9:c7:4f:06:23 REACHABLE
+       10.2.2.254 dev eth2 lladdr 5e:54:01:6a:ee:80 REACHABLE
+
+       $ ip -6 neigh show vrf red
+       2002:1::64 dev eth1 lladdr a6:d9:c7:4f:06:23 REACHABLE
+
+
+6. Show Addresses for a VRF
+
+   To show addresses for interfaces associated with a VRF, add the vrf or
+   master option to the ip command::
+
+       $ ip addr show vrf NAME
+       $ ip addr show master NAME
+
+   For example::
+
+	$ ip addr show vrf red
+	3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000
+	    link/ether 02:00:00:00:02:02 brd ff:ff:ff:ff:ff:ff
+	    inet 10.2.1.2/24 brd 10.2.1.255 scope global eth1
+	       valid_lft forever preferred_lft forever
+	    inet6 2002:1::2/120 scope global
+	       valid_lft forever preferred_lft forever
+	    inet6 fe80::ff:fe00:202/64 scope link
+	       valid_lft forever preferred_lft forever
+	4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000
+	    link/ether 02:00:00:00:02:03 brd ff:ff:ff:ff:ff:ff
+	    inet 10.2.2.2/24 brd 10.2.2.255 scope global eth2
+	       valid_lft forever preferred_lft forever
+	    inet6 2002:2::2/120 scope global
+	       valid_lft forever preferred_lft forever
+	    inet6 fe80::ff:fe00:203/64 scope link
+	       valid_lft forever preferred_lft forever
+	7: eth5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop master red state DOWN group default qlen 1000
+	    link/ether 02:00:00:00:02:06 brd ff:ff:ff:ff:ff:ff
+
+   Or in brief format::
+
+	$ ip -br addr show vrf red
+	eth1             UP             10.2.1.2/24 2002:1::2/120 fe80::ff:fe00:202/64
+	eth2             UP             10.2.2.2/24 2002:2::2/120 fe80::ff:fe00:203/64
+	eth5             DOWN
+
+
+7. Show Routes for a VRF
+
+   To show routes for a VRF, use the ip command to display the table associated
+   with the VRF device::
+
+       $ ip [-6] route show vrf NAME
+       $ ip [-6] route show table ID
+
+   For example::
+
+	$ ip route show vrf red
+	unreachable default  metric 4278198272
+	broadcast 10.2.1.0 dev eth1  proto kernel  scope link  src 10.2.1.2
+	10.2.1.0/24 dev eth1  proto kernel  scope link  src 10.2.1.2
+	local 10.2.1.2 dev eth1  proto kernel  scope host  src 10.2.1.2
+	broadcast 10.2.1.255 dev eth1  proto kernel  scope link  src 10.2.1.2
+	broadcast 10.2.2.0 dev eth2  proto kernel  scope link  src 10.2.2.2
+	10.2.2.0/24 dev eth2  proto kernel  scope link  src 10.2.2.2
+	local 10.2.2.2 dev eth2  proto kernel  scope host  src 10.2.2.2
+	broadcast 10.2.2.255 dev eth2  proto kernel  scope link  src 10.2.2.2
+
+	$ ip -6 route show vrf red
+	local 2002:1:: dev lo  proto none  metric 0  pref medium
+	local 2002:1::2 dev lo  proto none  metric 0  pref medium
+	2002:1::/120 dev eth1  proto kernel  metric 256  pref medium
+	local 2002:2:: dev lo  proto none  metric 0  pref medium
+	local 2002:2::2 dev lo  proto none  metric 0  pref medium
+	2002:2::/120 dev eth2  proto kernel  metric 256  pref medium
+	local fe80:: dev lo  proto none  metric 0  pref medium
+	local fe80:: dev lo  proto none  metric 0  pref medium
+	local fe80::ff:fe00:202 dev lo  proto none  metric 0  pref medium
+	local fe80::ff:fe00:203 dev lo  proto none  metric 0  pref medium
+	fe80::/64 dev eth1  proto kernel  metric 256  pref medium
+	fe80::/64 dev eth2  proto kernel  metric 256  pref medium
+	ff00::/8 dev red  metric 256  pref medium
+	ff00::/8 dev eth1  metric 256  pref medium
+	ff00::/8 dev eth2  metric 256  pref medium
+	unreachable default dev lo  metric 4278198272  error -101 pref medium
+
+8. Route Lookup for a VRF
+
+   A test route lookup can be done for a VRF::
+
+       $ ip [-6] route get vrf NAME ADDRESS
+       $ ip [-6] route get oif NAME ADDRESS
+
+   For example::
+
+	$ ip route get 10.2.1.40 vrf red
+	10.2.1.40 dev eth1  table red  src 10.2.1.2
+	    cache
+
+	$ ip -6 route get 2002:1::32 vrf red
+	2002:1::32 from :: dev eth1  table red  proto kernel  src 2002:1::2  metric 256  pref medium
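+
+   The oif form listed above can be sketched the same way (output omitted,
+   since it depends on the local configuration)::
+
+	$ ip route get oif red 10.2.1.40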
+
+
+9. Removing Network Interface from a VRF
+
+   Network interfaces are removed from a VRF by breaking the enslavement to
+   the VRF device::
+
+       $ ip link set dev NAME nomaster
+
+   Connected routes are moved back to the default table and local entries are
+   moved to the local table.
+
+   For example::
+
+    $ ip link set dev eth0 nomaster
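+
+   As a sketch, the result can be checked with the ip command (eth0 and
+   mgmt are the names from the example setup; output is omitted because it
+   depends on the local configuration)::
+
+    $ ip link show vrf mgmt
+    $ ip route show dev eth0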
+
+--------------------------------------------------------------------------------
+
+Commands used in this example::
+
+     cat >> /etc/iproute2/rt_tables.d/vrf.conf <<EOF
+     1  mgmt
+     10 red
+     66 blue
+     81 green
+     EOF
+
+     function vrf_create
+     {
+	 VRF=$1
+	 TBID=$2
+
+	 # create VRF device
+	 ip link add ${VRF} type vrf table ${TBID}
+
+	 if [ "${VRF}" != "mgmt" ]; then
+	     ip route add table ${TBID} unreachable default metric 4278198272
+	 fi
+	 ip link set dev ${VRF} up
+     }
+
+     vrf_create mgmt 1
+     ip link set dev eth0 master mgmt
+
+     vrf_create red 10
+     ip link set dev eth1 master red
+     ip link set dev eth2 master red
+     ip link set dev eth5 master red
+
+     vrf_create blue 66
+     ip link set dev eth3 master blue
+
+     vrf_create green 81
+     ip link set dev eth4 master green
+
+
+     Interface addresses from /etc/network/interfaces:
+     auto eth0
+     iface eth0 inet static
+	   address 10.0.0.2
+	   netmask 255.255.255.0
+	   gateway 10.0.0.254
+
+     iface eth0 inet6 static
+	   address 2000:1::2
+	   netmask 120
+
+     auto eth1
+     iface eth1 inet static
+	   address 10.2.1.2
+	   netmask 255.255.255.0
+
+     iface eth1 inet6 static
+	   address 2002:1::2
+	   netmask 120
+
+     auto eth2
+     iface eth2 inet static
+	   address 10.2.2.2
+	   netmask 255.255.255.0
+
+     iface eth2 inet6 static
+	   address 2002:2::2
+	   netmask 120
+
+     auto eth3
+     iface eth3 inet static
+	   address 10.2.3.2
+	   netmask 255.255.255.0
+
+     iface eth3 inet6 static
+	   address 2002:3::2
+	   netmask 120
+
+     auto eth4
+     iface eth4 inet static
+	   address 10.2.4.2
+	   netmask 255.255.255.0
+
+     iface eth4 inet6 static
+	   address 2002:4::2
+	   netmask 120
diff --git a/Documentation/networking/vrf.txt b/Documentation/networking/vrf.txt
deleted file mode 100644
index a5f103b083a0..000000000000
--- a/Documentation/networking/vrf.txt
+++ /dev/null
@@ -1,418 +0,0 @@
-Virtual Routing and Forwarding (VRF)
-====================================
-The VRF device combined with ip rules provides the ability to create virtual
-routing and forwarding domains (aka VRFs, VRF-lite to be specific) in the
-Linux network stack. One use case is the multi-tenancy problem where each
-tenant has their own unique routing tables and in the very least need
-different default gateways.
-
-Processes can be "VRF aware" by binding a socket to the VRF device. Packets
-through the socket then use the routing table associated with the VRF
-device. An important feature of the VRF device implementation is that it
-impacts only Layer 3 and above so L2 tools (e.g., LLDP) are not affected
-(ie., they do not need to be run in each VRF). The design also allows
-the use of higher priority ip rules (Policy Based Routing, PBR) to take
-precedence over the VRF device rules directing specific traffic as desired.
-
-In addition, VRF devices allow VRFs to be nested within namespaces. For
-example network namespaces provide separation of network interfaces at the
-device layer, VLANs on the interfaces within a namespace provide L2 separation
-and then VRF devices provide L3 separation.
-
-Design
-------
-A VRF device is created with an associated route table. Network interfaces
-are then enslaved to a VRF device:
-
-         +-----------------------------+
-         |           vrf-blue          |  ===> route table 10
-         +-----------------------------+
-            |        |            |
-         +------+ +------+     +-------------+
-         | eth1 | | eth2 | ... |    bond1    |
-         +------+ +------+     +-------------+
-                                  |       |
-                              +------+ +------+
-                              | eth8 | | eth9 |
-                              +------+ +------+
-
-Packets received on an enslaved device and are switched to the VRF device
-in the IPv4 and IPv6 processing stacks giving the impression that packets
-flow through the VRF device. Similarly on egress routing rules are used to
-send packets to the VRF device driver before getting sent out the actual
-interface. This allows tcpdump on a VRF device to capture all packets into
-and out of the VRF as a whole.[1] Similarly, netfilter[2] and tc rules can be
-applied using the VRF device to specify rules that apply to the VRF domain
-as a whole.
-
-[1] Packets in the forwarded state do not flow through the device, so those
-    packets are not seen by tcpdump. Will revisit this limitation in a
-    future release.
-
-[2] Iptables on ingress supports PREROUTING with skb->dev set to the real
-    ingress device and both INPUT and PREROUTING rules with skb->dev set to
-    the VRF device. For egress POSTROUTING and OUTPUT rules can be written
-    using either the VRF device or real egress device.
-
-Setup
------
-1. VRF device is created with an association to a FIB table.
-   e.g, ip link add vrf-blue type vrf table 10
-        ip link set dev vrf-blue up
-
-2. An l3mdev FIB rule directs lookups to the table associated with the device.
-   A single l3mdev rule is sufficient for all VRFs. The VRF device adds the
-   l3mdev rule for IPv4 and IPv6 when the first device is created with a
-   default preference of 1000. Users may delete the rule if desired and add
-   with a different priority or install per-VRF rules.
-
-   Prior to the v4.8 kernel iif and oif rules are needed for each VRF device:
-       ip ru add oif vrf-blue table 10
-       ip ru add iif vrf-blue table 10
-
-3. Set the default route for the table (and hence default route for the VRF).
-       ip route add table 10 unreachable default metric 4278198272
-
-   This high metric value ensures that the default unreachable route can
-   be overridden by a routing protocol suite.  FRRouting interprets
-   kernel metrics as a combined admin distance (upper byte) and priority
-   (lower 3 bytes).  Thus the above metric translates to [255/8192].
-
-4. Enslave L3 interfaces to a VRF device.
-       ip link set dev eth1 master vrf-blue
-
-   Local and connected routes for enslaved devices are automatically moved to
-   the table associated with VRF device. Any additional routes depending on
-   the enslaved device are dropped and will need to be reinserted to the VRF
-   FIB table following the enslavement.
-
-   The IPv6 sysctl option keep_addr_on_down can be enabled to keep IPv6 global
-   addresses as VRF enslavement changes.
-       sysctl -w net.ipv6.conf.all.keep_addr_on_down=1
-
-5. Additional VRF routes are added to associated table.
-       ip route add table 10 ...
-
-
-Applications
-------------
-Applications that are to work within a VRF need to bind their socket to the
-VRF device:
-
-    setsockopt(sd, SOL_SOCKET, SO_BINDTODEVICE, dev, strlen(dev)+1);
-
-or to specify the output device using cmsg and IP_PKTINFO.
-
-By default the scope of the port bindings for unbound sockets is
-limited to the default VRF. That is, it will not be matched by packets
-arriving on interfaces enslaved to an l3mdev and processes may bind to
-the same port if they bind to an l3mdev.
-
-TCP & UDP services running in the default VRF context (ie., not bound
-to any VRF device) can work across all VRF domains by enabling the
-tcp_l3mdev_accept and udp_l3mdev_accept sysctl options:
-
-    sysctl -w net.ipv4.tcp_l3mdev_accept=1
-    sysctl -w net.ipv4.udp_l3mdev_accept=1
-
-These options are disabled by default so that a socket in a VRF is only
-selected for packets in that VRF. There is a similar option for RAW
-sockets, which is enabled by default for reasons of backwards compatibility.
-This is so as to specify the output device with cmsg and IP_PKTINFO, but
-using a socket not bound to the corresponding VRF. This allows e.g. older ping
-implementations to be run with specifying the device but without executing it
-in the VRF. This option can be disabled so that packets received in a VRF
-context are only handled by a raw socket bound to the VRF, and packets in the
-default VRF are only handled by a socket not bound to any VRF:
-
-    sysctl -w net.ipv4.raw_l3mdev_accept=0
-
-netfilter rules on the VRF device can be used to limit access to services
-running in the default VRF context as well.
-
-################################################################################
-
-Using iproute2 for VRFs
-=======================
-iproute2 supports the vrf keyword as of v4.7. For backwards compatibility this
-section lists both commands where appropriate -- with the vrf keyword and the
-older form without it.
-
-1. Create a VRF
-
-   To instantiate a VRF device and associate it with a table:
-       $ ip link add dev NAME type vrf table ID
-
-   As of v4.8 the kernel supports the l3mdev FIB rule where a single rule
-   covers all VRFs. The l3mdev rule is created for IPv4 and IPv6 on first
-   device create.
-
-2. List VRFs
-
-   To list VRFs that have been created:
-       $ ip [-d] link show type vrf
-         NOTE: The -d option is needed to show the table id
-
-   For example:
-   $ ip -d link show type vrf
-   11: mgmt: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
-       link/ether 72:b3:ba:91:e2:24 brd ff:ff:ff:ff:ff:ff promiscuity 0
-       vrf table 1 addrgenmode eui64
-   12: red: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
-       link/ether b6:6f:6e:f6:da:73 brd ff:ff:ff:ff:ff:ff promiscuity 0
-       vrf table 10 addrgenmode eui64
-   13: blue: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
-       link/ether 36:62:e8:7d:bb:8c brd ff:ff:ff:ff:ff:ff promiscuity 0
-       vrf table 66 addrgenmode eui64
-   14: green: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
-       link/ether e6:28:b8:63:70:bb brd ff:ff:ff:ff:ff:ff promiscuity 0
-       vrf table 81 addrgenmode eui64
-
-
-   Or in brief output:
-
-   $ ip -br link show type vrf
-   mgmt         UP             72:b3:ba:91:e2:24 <NOARP,MASTER,UP,LOWER_UP>
-   red          UP             b6:6f:6e:f6:da:73 <NOARP,MASTER,UP,LOWER_UP>
-   blue         UP             36:62:e8:7d:bb:8c <NOARP,MASTER,UP,LOWER_UP>
-   green        UP             e6:28:b8:63:70:bb <NOARP,MASTER,UP,LOWER_UP>
-
-
-3. Assign a Network Interface to a VRF
-
-   Network interfaces are assigned to a VRF by enslaving the netdevice to a
-   VRF device:
-       $ ip link set dev NAME master NAME
-
-   On enslavement connected and local routes are automatically moved to the
-   table associated with the VRF device.
-
-   For example:
-   $ ip link set dev eth0 master mgmt
-
-
-4. Show Devices Assigned to a VRF
-
-   To show devices that have been assigned to a specific VRF add the master
-   option to the ip command:
-       $ ip link show vrf NAME
-       $ ip link show master NAME
-
-   For example:
-   $ ip link show vrf red
-   3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP mode DEFAULT group default qlen 1000
-       link/ether 02:00:00:00:02:02 brd ff:ff:ff:ff:ff:ff
-   4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP mode DEFAULT group default qlen 1000
-       link/ether 02:00:00:00:02:03 brd ff:ff:ff:ff:ff:ff
-   7: eth5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop master red state DOWN mode DEFAULT group default qlen 1000
-       link/ether 02:00:00:00:02:06 brd ff:ff:ff:ff:ff:ff
-
-
-   Or using the brief output:
-   $ ip -br link show vrf red
-   eth1             UP             02:00:00:00:02:02 <BROADCAST,MULTICAST,UP,LOWER_UP>
-   eth2             UP             02:00:00:00:02:03 <BROADCAST,MULTICAST,UP,LOWER_UP>
-   eth5             DOWN           02:00:00:00:02:06 <BROADCAST,MULTICAST>
-
-
-5. Show Neighbor Entries for a VRF
-
-   To list neighbor entries associated with devices enslaved to a VRF device
-   add the master option to the ip command:
-       $ ip [-6] neigh show vrf NAME
-       $ ip [-6] neigh show master NAME
-
-   For example:
-   $  ip neigh show vrf red
-   10.2.1.254 dev eth1 lladdr a6:d9:c7:4f:06:23 REACHABLE
-   10.2.2.254 dev eth2 lladdr 5e:54:01:6a:ee:80 REACHABLE
-
-   $ ip -6 neigh show vrf red
-   2002:1::64 dev eth1 lladdr a6:d9:c7:4f:06:23 REACHABLE
-
-
-6. Show Addresses for a VRF
-
-   To show addresses for interfaces associated with a VRF add the master
-   option to the ip command:
-       $ ip addr show vrf NAME
-       $ ip addr show master NAME
-
-   For example:
-   $ ip addr show vrf red
-   3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000
-       link/ether 02:00:00:00:02:02 brd ff:ff:ff:ff:ff:ff
-       inet 10.2.1.2/24 brd 10.2.1.255 scope global eth1
-          valid_lft forever preferred_lft forever
-       inet6 2002:1::2/120 scope global
-          valid_lft forever preferred_lft forever
-       inet6 fe80::ff:fe00:202/64 scope link
-          valid_lft forever preferred_lft forever
-   4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000
-       link/ether 02:00:00:00:02:03 brd ff:ff:ff:ff:ff:ff
-       inet 10.2.2.2/24 brd 10.2.2.255 scope global eth2
-          valid_lft forever preferred_lft forever
-       inet6 2002:2::2/120 scope global
-          valid_lft forever preferred_lft forever
-       inet6 fe80::ff:fe00:203/64 scope link
-          valid_lft forever preferred_lft forever
-   7: eth5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop master red state DOWN group default qlen 1000
-       link/ether 02:00:00:00:02:06 brd ff:ff:ff:ff:ff:ff
-
-   Or in brief format:
-   $ ip -br addr show vrf red
-   eth1             UP             10.2.1.2/24 2002:1::2/120 fe80::ff:fe00:202/64
-   eth2             UP             10.2.2.2/24 2002:2::2/120 fe80::ff:fe00:203/64
-   eth5             DOWN
-
-
-7. Show Routes for a VRF
-
-   To show routes for a VRF use the ip command to display the table associated
-   with the VRF device:
-       $ ip [-6] route show vrf NAME
-       $ ip [-6] route show table ID
-
-   For example:
-   $ ip route show vrf red
-   unreachable default  metric 4278198272
-   broadcast 10.2.1.0 dev eth1  proto kernel  scope link  src 10.2.1.2
-   10.2.1.0/24 dev eth1  proto kernel  scope link  src 10.2.1.2
-   local 10.2.1.2 dev eth1  proto kernel  scope host  src 10.2.1.2
-   broadcast 10.2.1.255 dev eth1  proto kernel  scope link  src 10.2.1.2
-   broadcast 10.2.2.0 dev eth2  proto kernel  scope link  src 10.2.2.2
-   10.2.2.0/24 dev eth2  proto kernel  scope link  src 10.2.2.2
-   local 10.2.2.2 dev eth2  proto kernel  scope host  src 10.2.2.2
-   broadcast 10.2.2.255 dev eth2  proto kernel  scope link  src 10.2.2.2
-
-   $ ip -6 route show vrf red
-   local 2002:1:: dev lo  proto none  metric 0  pref medium
-   local 2002:1::2 dev lo  proto none  metric 0  pref medium
-   2002:1::/120 dev eth1  proto kernel  metric 256  pref medium
-   local 2002:2:: dev lo  proto none  metric 0  pref medium
-   local 2002:2::2 dev lo  proto none  metric 0  pref medium
-   2002:2::/120 dev eth2  proto kernel  metric 256  pref medium
-   local fe80:: dev lo  proto none  metric 0  pref medium
-   local fe80:: dev lo  proto none  metric 0  pref medium
-   local fe80::ff:fe00:202 dev lo  proto none  metric 0  pref medium
-   local fe80::ff:fe00:203 dev lo  proto none  metric 0  pref medium
-   fe80::/64 dev eth1  proto kernel  metric 256  pref medium
-   fe80::/64 dev eth2  proto kernel  metric 256  pref medium
-   ff00::/8 dev red  metric 256  pref medium
-   ff00::/8 dev eth1  metric 256  pref medium
-   ff00::/8 dev eth2  metric 256  pref medium
-   unreachable default dev lo  metric 4278198272  error -101 pref medium
-
-8. Route Lookup for a VRF
-
-   A test route lookup can be done for a VRF:
-       $ ip [-6] route get vrf NAME ADDRESS
-       $ ip [-6] route get oif NAME ADDRESS
-
-   For example:
-   $ ip route get 10.2.1.40 vrf red
-   10.2.1.40 dev eth1  table red  src 10.2.1.2
-       cache
-
-   $ ip -6 route get 2002:1::32 vrf red
-   2002:1::32 from :: dev eth1  table red  proto kernel  src 2002:1::2  metric 256  pref medium
-
-
-9. Removing Network Interface from a VRF
-
-   Network interfaces are removed from a VRF by breaking the enslavement to
-   the VRF device:
-       $ ip link set dev NAME nomaster
-
-   Connected routes are moved back to the default table and local entries are
-   moved to the local table.
-
-   For example:
-   $ ip link set dev eth0 nomaster
-
---------------------------------------------------------------------------------
-
-Commands used in this example:
-
-cat >> /etc/iproute2/rt_tables.d/vrf.conf <<EOF
-1  mgmt
-10 red
-66 blue
-81 green
-EOF
-
-function vrf_create
-{
-    VRF=$1
-    TBID=$2
-
-    # create VRF device
-    ip link add ${VRF} type vrf table ${TBID}
-
-    if [ "${VRF}" != "mgmt" ]; then
-        ip route add table ${TBID} unreachable default metric 4278198272
-    fi
-    ip link set dev ${VRF} up
-}
-
-vrf_create mgmt 1
-ip link set dev eth0 master mgmt
-
-vrf_create red 10
-ip link set dev eth1 master red
-ip link set dev eth2 master red
-ip link set dev eth5 master red
-
-vrf_create blue 66
-ip link set dev eth3 master blue
-
-vrf_create green 81
-ip link set dev eth4 master green
-
-
-Interface addresses from /etc/network/interfaces:
-auto eth0
-iface eth0 inet static
-      address 10.0.0.2
-      netmask 255.255.255.0
-      gateway 10.0.0.254
-
-iface eth0 inet6 static
-      address 2000:1::2
-      netmask 120
-
-auto eth1
-iface eth1 inet static
-      address 10.2.1.2
-      netmask 255.255.255.0
-
-iface eth1 inet6 static
-      address 2002:1::2
-      netmask 120
-
-auto eth2
-iface eth2 inet static
-      address 10.2.2.2
-      netmask 255.255.255.0
-
-iface eth2 inet6 static
-      address 2002:2::2
-      netmask 120
-
-auto eth3
-iface eth3 inet static
-      address 10.2.3.2
-      netmask 255.255.255.0
-
-iface eth3 inet6 static
-      address 2002:3::2
-      netmask 120
-
-auto eth4
-iface eth4 inet static
-      address 10.2.4.2
-      netmask 255.255.255.0
-
-iface eth4 inet6 static
-      address 2002:4::2
-      netmask 120
diff --git a/MAINTAINERS b/MAINTAINERS
index 6456c5bb02f1..d59455c27c42 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18106,7 +18106,7 @@ M:	David Ahern <dsahern@kernel.org>
 M:	Shrijeet Mukherjee <shrijeet@gmail.com>
 L:	netdev@vger.kernel.org
 S:	Maintained
-F:	Documentation/networking/vrf.txt
+F:	Documentation/networking/vrf.rst
 F:	drivers/net/vrf.c
 
 VSPRINTF
-- 
cgit v1.2.3


From 0046db09d539523ef1470bcad2f2614cc3ef7ddf Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Fri, 1 May 2020 16:44:33 +0200
Subject: docs: networking: convert z8530drv.txt to ReST

- add SPDX header;
- use copyright symbol;
- adjust titles and chapters, adding proper markups;
- mark tables as such;
- mark code blocks and literals as such;
- adjust indentation, whitespaces and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/index.rst    |   1 +
 Documentation/networking/z8530drv.rst | 686 ++++++++++++++++++++++++++++++++++
 Documentation/networking/z8530drv.txt | 657 --------------------------------
 MAINTAINERS                           |   2 +-
 drivers/net/hamradio/Kconfig          |   4 +-
 drivers/net/hamradio/scc.c            |   2 +-
 6 files changed, 691 insertions(+), 661 deletions(-)
 create mode 100644 Documentation/networking/z8530drv.rst
 delete mode 100644 Documentation/networking/z8530drv.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 1630801cec19..f5733ca4fbcb 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -121,6 +121,7 @@ Contents:
    xfrm_proc
    xfrm_sync
    xfrm_sysctl
+   z8530drv
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/z8530drv.rst b/Documentation/networking/z8530drv.rst
new file mode 100644
index 000000000000..d2942760f167
--- /dev/null
+++ b/Documentation/networking/z8530drv.rst
@@ -0,0 +1,686 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+=========================================================
+SCC.C - Linux driver for Z8530 based HDLC cards for AX.25
+=========================================================
+
+
+This is a subset of the documentation. To use this driver you MUST have the
+full package from:
+
+Internet:
+
+    1. ftp://ftp.ccac.rwth-aachen.de/pub/jr/z8530drv-utils_3.0-3.tar.gz
+
+    2. ftp://ftp.pspt.fi/pub/ham/linux/ax25/z8530drv-utils_3.0-3.tar.gz
+
+Please note that the information in this document may be hopelessly outdated.
+A new version of the documentation, along with links to other important
+Linux Kernel AX.25 documentation and programs, is available on
+http://yaina.de/jreuter
+
+Copyright |copy| 1993,2000 by Joerg Reuter DL1BKE <jreuter@yaina.de>
+
+portions Copyright |copy| 1993 Guido ten Dolle PE1NNZ
+
+for the complete copyright notice see >> Copying.Z8530DRV <<
+
+1. Initialization of the driver
+===============================
+
+To use the driver, 3 steps must be performed:
+
+     1. If compiled as a module: load the module
+     2. Set up the hardware, MODEM and KISS parameters with sccinit
+     3. Attach each channel to the Linux kernel AX.25 with "ifconfig"
+
+Unlike the versions below 2.4 this driver is a real network device
+driver. If you want to run xNOS instead of our fine kernel AX.25
+use a 2.x version (available from above sites) or read the
+AX.25-HOWTO on how to emulate a KISS TNC on network device drivers.
+
+
+1.1 Loading the module
+======================
+
+(If you're going to compile the driver as a part of the kernel image,
+ skip this chapter and continue with 1.2)
+
+Before you can use a module, you'll have to load it with::
+
+	insmod scc.o
+
+Please read 'man insmod', which comes with module-init-tools.
+
+You should include the insmod in one of the /etc/rc.d/rc.* files,
+and don't forget to insert a call of sccinit after that. It
+will read your /etc/z8530drv.conf.
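+
+A minimal sketch of such an rc fragment, following the three steps listed
+above (the rc file name is an assumption and differs between distributions;
+the address and callsign are the ones used in the example in chapter 2.1)::
+
+	# e.g. in /etc/rc.d/rc.local
+	insmod scc.o
+	/sbin/sccinit
+	ifconfig scc0 44.128.1.1 hw ax25 dl0tha-7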
+
+1.2. /etc/z8530drv.conf
+=======================
+
+To set up all parameters you must run /sbin/sccinit from one
+of your rc.*-files. This has to be done BEFORE you can
+"ifconfig" an interface. Sccinit reads the file /etc/z8530drv.conf
+and sets the hardware, MODEM and KISS parameters. A sample file is
+delivered with this package. Change it to your needs.
+
+The file itself consists of two main sections.
+
+1.2.1 configuration of hardware parameters
+==========================================
+
+The hardware setup section defines the following parameters for each
+Z8530::
+
+    chip    1
+    data_a  0x300                   # data port A
+    ctrl_a  0x304                   # control port A
+    data_b  0x301                   # data port B
+    ctrl_b  0x305                   # control port B
+    irq     5                       # IRQ No. 5
+    pclock  4915200                 # clock
+    board   BAYCOM                  # hardware type
+    escc    no                      # enhanced SCC chip? (8580/85180/85280)
+    vector  0                       # latch for interrupt vector
+    special no                      # address of special function register
+    option  0                       # option to set via sfr
+
+
+chip
+	- this is just a delimiter to make sccinit a bit simpler to
+	  program. The value given as its parameter has no effect.
+
+data_a
+	- the address of the data port A of this Z8530 (needed)
+ctrl_a
+	- the address of the control port A (needed)
+data_b
+	- the address of the data port B (needed)
+ctrl_b
+	- the address of the control port B (needed)
+
+irq
+	- the IRQ used by this chip. Different chips can use different
+	  IRQs or the same. If they share an interrupt, it needs to be
+	  specified within one chip-definition only.
+
+pclock
+	- the clock at the PCLK pin of the Z8530 (option, 4915200 is
+	  default), measured in Hertz
+
+board
+	- the "type" of the board:
+
+	   =======================  ========
+	   SCC type                 value
+	   =======================  ========
+	   PA0HZP SCC card          PA0HZP
+	   EAGLE card               EAGLE
+	   PC100 card               PC100
+	   PRIMUS-PC (DG9BL) card   PRIMUS
+	   BayCom (U)SCC card       BAYCOM
+	   =======================  ========
+
+escc
+	- if you want support for ESCC chips (8580, 85180, 85280), set
+	  this to "yes" (option, defaults to "no")
+
+vector
+	- address of the vector latch (aka "intack port") for PA0HZP
+	  cards. There can be only one vector latch for all chips!
+	  (option, defaults to 0)
+
+special
+	- address of the special function register on several cards.
+	  (option, defaults to 0)
+
+option
+	- the value you write into that register (option, defaults to 0)
+
+You can specify up to four chips (8 channels). If this is not enough,
+just change::
+
+	#define MAXSCC 4
+
+to a higher value.
+
+Example for the BAYCOM USCC:
+----------------------------
+
+::
+
+	chip    1
+	data_a  0x300                   # data port A
+	ctrl_a  0x304                   # control port A
+	data_b  0x301                   # data port B
+	ctrl_b  0x305                   # control port B
+	irq     5                       # IRQ No. 5 (#)
+	board   BAYCOM                  # hardware type (*)
+	#
+	# SCC chip 2
+	#
+	chip    2
+	data_a  0x302
+	ctrl_a  0x306
+	data_b  0x303
+	ctrl_b  0x307
+	board   BAYCOM
+
+An example for a PA0HZP card:
+-----------------------------
+
+::
+
+	chip 1
+	data_a 0x153
+	data_b 0x151
+	ctrl_a 0x152
+	ctrl_b 0x150
+	irq 9
+	pclock 4915200
+	board PA0HZP
+	vector 0x168
+	escc no
+	#
+	#
+	#
+	chip 2
+	data_a 0x157
+	data_b 0x155
+	ctrl_a 0x156
+	ctrl_b 0x154
+	irq 9
+	pclock 4915200
+	board PA0HZP
+	vector 0x168
+	escc no
+
+A DRSI card should probably work with this:
+-------------------------------------------
+(actually: two DRSI cards...)
+
+::
+
+	chip 1
+	data_a 0x303
+	data_b 0x301
+	ctrl_a 0x302
+	ctrl_b 0x300
+	irq 7
+	pclock 4915200
+	board DRSI
+	escc no
+	#
+	#
+	#
+	chip 2
+	data_a 0x313
+	data_b 0x311
+	ctrl_a 0x312
+	ctrl_b 0x310
+	irq 7
+	pclock 4915200
+	board DRSI
+	escc no
+
+Note that you cannot use the on-board baudrate generator of DRSI
+cards. Use "mode dpll" for clock source (see below).
+
+This is based on information provided by Mike Bilow (and verified
+by Paul Helay)
+
+The utility "gencfg"
+--------------------
+
+If you only know the parameters for the PE1CHL driver for DOS,
+run gencfg. It will generate the correct port addresses (I hope).
+Its parameters are exactly the same as the ones you use with
+the "attach scc" command in net, except that the string "init" must
+not appear. Example::
+
+	gencfg 2 0x150 4 2 0 1 0x168 9 4915200
+
+will print a skeleton z8530drv.conf for the OptoSCC to stdout.
+
+::
+
+	gencfg 2 0x300 2 4 5 -4 0 7 4915200 0x10
+
+does the same for the BAYCOM USCC card. In my opinion it is much easier
+to edit scc_config.h...
+
+
+1.2.2 channel configuration
+===========================
+
+The channel definition is divided into three sub sections for each
+channel:
+
+An example for scc0::
+
+	# DEVICE
+
+	device scc0	# the device for the following params
+
+	# MODEM / BUFFERS
+
+	speed 1200		# the default baudrate
+	clock dpll		# clock source:
+				# 	dpll     = normal half duplex operation
+				# 	external = MODEM provides own Rx/Tx clock
+				#	divider  = use full duplex divider if
+				#		   installed (1)
+	mode nrzi		# HDLC encoding mode
+				#	nrzi = 1k2 MODEM, G3RUH 9k6 MODEM
+				#	nrz  = DF9IC 9k6 MODEM
+				#
+	bufsize	384		# size of buffers. Note that this must include
+				# the AX.25 header, not only the data field!
+				# (optional, defaults to 384)
+
+	# KISS (Layer 1)
+
+	txdelay 36              # (see chapter 1.4)
+	persist 64
+	slot    8
+	tail    8
+	fulldup 0
+	wait    12
+	min     3
+	maxkey  7
+	idle    3
+	maxdef  120
+	group   0
+	txoff   off
+	softdcd on
+	slip    off
+
+The order WITHIN these sections is unimportant. The order OF these
+sections IS important. The MODEM parameters are set with the first
+recognized KISS parameter...
+
+Please note that you can initialize the board only once after boot
+(or insmod). You can change all parameters but "mode" and "clock"
+later with the Sccparam program or through KISS. Just to avoid
+security holes...
+
+(1) this divider is usually mounted on the SCC-PBC (PA0HZP) or not
+    present at all (BayCom). It feeds back the output of the DPLL
+    (digital pll) as transmit clock. Using this mode without a divider
+    installed will normally result in keying the transceiver until
+    maxkey expires --- of course without sending anything (useful).
+
+2. Attachment of a channel by your AX.25 software
+=================================================
+
+2.1 Kernel AX.25
+================
+
+To set up an AX.25 device you can simply type::
+
+	ifconfig scc0 44.128.1.1 hw ax25 dl0tha-7
+
+This will create a network interface with the IP number 44.128.1.1
+and the callsign "dl0tha-7". If you do not have any IP number (yet) you
+can use any address of the 44.128.0.0 network. Note that you do not need
+axattach. The purpose of axattach (like slattach) is to create a KISS
+network device linked to a TTY. Please read the documentation of the
+ax25-utils and the AX.25-HOWTO to learn how to set the parameters of
+the kernel AX.25.
+
+2.2 NOS, NET and TFKISS
+=======================
+
+Since the TTY driver (aka KISS TNC emulation) is gone you need
+to emulate the old behaviour. The cost of using these programs is
+that you probably need to compile the kernel AX.25, regardless of whether
+you actually use it or not. First set up your /etc/ax25/axports,
+for example::
+
+	9k6	dl0tha-9  9600  255 4 9600 baud port (scc3)
+	axlink	dl0tha-15 38400 255 4 Link to NOS
+
+Now "ifconfig" the scc device::
+
+	ifconfig scc3 44.128.1.1 hw ax25 dl0tha-9
+
+You can now axattach a pseudo-TTY::
+
+	axattach /dev/ptys0 axlink
+
+and start your NOS and attach /dev/ptys0 there. The problem is that
+NOS is reachable only via digipeating through the kernel AX.25
+(disastrous on a DAMA controlled channel). To solve this problem,
+configure "rxecho" to echo the incoming frames from "9k6" to "axlink"
+and outgoing frames from "axlink" to "9k6" and start::
+
+	rxecho
+
+Or simply use "kissbridge" coming with z8530drv-utils::
+
+	ifconfig scc3 hw ax25 dl0tha-9
+	kissbridge scc3 /dev/ptys0
+
+
+3. Adjustment and Display of parameters
+=======================================
+
+3.1 Displaying SCC Parameters:
+==============================
+
+Once an SCC channel has been attached, the parameter settings and
+some statistical information can be shown using the sccstat program::
+
+	dl1bke-u:~$ sccstat scc0
+
+	Parameters:
+
+	speed       : 1200 baud
+	txdelay     : 36
+	persist     : 255
+	slottime    : 0
+	txtail      : 8
+	fulldup     : 1
+	waittime    : 12
+	mintime     : 3 sec
+	maxkeyup    : 7 sec
+	idletime    : 3 sec
+	maxdefer    : 120 sec
+	group       : 0x00
+	txoff       : off
+	softdcd     : on
+	SLIP        : off
+
+	Status:
+
+	HDLC                  Z8530           Interrupts         Buffers
+	-----------------------------------------------------------------------
+	Sent       :     273  RxOver :     0  RxInts :   125074  Size    :  384
+	Received   :    1095  TxUnder:     0  TxInts :     4684  NoSpace :    0
+	RxErrors   :    1591                  ExInts :    11776
+	TxErrors   :       0                  SpInts :     1503
+	Tx State   :    idle
+
+
+The status info shown is:
+
+==============	==============================================================
+Sent		number of frames transmitted
+Received	number of frames received
+RxErrors	number of receive errors (CRC, ABORT)
+TxErrors	number of discarded Tx frames (due to various reasons)
+Tx State	status of the Tx interrupt handler: idle/busy/active/tail (2)
+RxOver		number of receiver overruns
+TxUnder		number of transmitter underruns
+RxInts		number of receiver interrupts
+TxInts		number of transmitter interrupts
+ExInts		number of external/status interrupts
+SpInts		number of receiver special condition interrupts
+Size		maximum size of an AX.25 frame (*with* AX.25 headers!)
+NoSpace		number of times a buffer could not get allocated
+==============	==============================================================
+
+An overrun is abnormal. If lots of these occur, the product of
+baudrate and number of interfaces is too high for the processing
+power of your computer. NoSpace errors are unlikely to be caused by the
+driver or the kernel AX.25.
+
+
+3.2 Setting Parameters
+======================
+
+
+The parameters of the SCC driver are set in the same way as those of
+an emulated KISS TNC. You can change them by using
+the kissparms program from the ax25-utils package or the program
+"sccparam"::
+
+     sccparam <device> <paramname> <decimal-|hexadecimal value>
+
+You can change the following parameters:
+
+===========   =====
+param	      value
+===========   =====
+speed         1200
+txdelay       36
+persist       255
+slottime      0
+txtail        8
+fulldup       1
+waittime      12
+mintime       3
+maxkeyup      7
+idletime      3
+maxdefer      120
+group         0x00
+txoff         off
+softdcd       on
+SLIP          off
+===========   =====
+
+
+The parameters have the following meaning:
+
+speed:
+     The baudrate on this channel in bits/sec
+
+     Example: sccparam /dev/scc3 speed 9600
+
+txdelay:
+     The delay (in units of 10 ms) after keying of the
+     transmitter, until the first byte is sent. This is usually
+     called "TXDELAY" in a TNC.  When 0 is specified, the driver
+     will just wait until the CTS signal is asserted. This
+     assumes the presence of a timer or other circuitry in the
+     MODEM and/or transmitter, that asserts CTS when the
+     transmitter is ready for data.
+     A normal value of this parameter is 30-36.
+
+     Example: sccparam /dev/scc0 txd 20
+
+persist:
+     This is the probability that the transmitter will be keyed
+     when the channel is found to be free.  It is a value from 0
+     to 255, and the probability is (value+1)/256.  The value
+     should be somewhere near 50-60, and should be lowered when
+     the channel is used more heavily.
+
+     Example: sccparam /dev/scc2 persist 20
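+
+     As a rough worked illustration of the formula above (plain
+     arithmetic, not the output of any tool)::
+
+         persist  20  ->  (20 + 1) / 256   =  21/256   ~   8 %
+         persist  63  ->  (63 + 1) / 256   =  64/256   =  25 %
+         persist 255  ->  (255 + 1) / 256  =  256/256  = 100 %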
+
+slottime:
+     This is the time between samples of the channel. It is
+     expressed in units of 10 ms.  About 200-300 ms (value 20-30)
+     seems to be a good value.
+
+     Example: sccparam /dev/scc0 slot 20
+
+tail:
+     The time the transmitter will remain keyed after the last
+     byte of a packet has been transferred to the SCC. This is
+     necessary because the CRC and a flag still have to leave the
+     SCC before the transmitter is keyed down. The value depends
+     on the baudrate selected.  A few character times should be
+     sufficient, e.g. 40ms at 1200 baud. (value 4)
+     The value of this parameter is in 10 ms units.
+
+     Example: sccparam /dev/scc2 tail 4
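+
+     A back-of-the-envelope check of the figure above, assuming 8 bits
+     per character and ignoring HDLC bit stuffing (both are
+     simplifications made only for this illustration)::
+
+         1200 baud: 8 / 1200 s = 6.7 ms per character; 40 ms is 6 characters
+         9600 baud: 8 / 9600 s = 0.83 ms per character; 6 characters take 5 ms
+
+     With 10 ms units, the second case would round up to a value of 1.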
+
+full:
+     The full-duplex mode switch. This can be one of the following
+     values:
+
+     0:   The interface will operate in CSMA mode (the normal
+	  half-duplex packet radio operation)
+     1:   Fullduplex mode, i.e. the transmitter will be keyed at
+	  any time, without checking the received carrier.  It
+	  will be unkeyed when there are no packets to be sent.
+     2:   Like 1, but the transmitter will remain keyed, also
+	  when there are no packets to be sent.  Flags will be
+	  sent in that case, until a timeout (parameter idle)
+	  occurs.
+
+     Example: sccparam /dev/scc0 fulldup off
+
+wait:
+     The initial waittime before any transmit attempt, after the
+     frame has been queued for transmit.  This is the length of
+     the first slot in CSMA mode.  In full duplex modes it is
+     set to 0 for maximum performance.
+     The value of this parameter is in 10 ms units.
+
+     Example: sccparam /dev/scc1 wait 4
+
+maxkey:
+     The maximal time the transmitter will be keyed to send
+     packets, in seconds.  This can be useful on busy CSMA
+     channels, to avoid "getting a bad reputation" when you are
+     generating a lot of traffic.  After the specified time has
+     elapsed, no new frame will be started. Instead, the
+     transmitter will be switched off for a specified time (parameter
+     min), and then the selected algorithm for keyup will be
+     started again.
+     The value 0 as well as "off" will disable this feature,
+     and allow infinite transmission time.
+
+     Example: sccparam /dev/scc0 maxk 20
+
+min:
+     This is the time the transmitter will be switched off when
+     the maximum transmission time is exceeded.
+
+     Example: sccparam /dev/scc3 min 10
+
+idle:
+     This parameter specifies the maximum idle time in full duplex
+     2 mode, in seconds.  When no frames have been sent for this
+     time, the transmitter will be keyed down.  A value of 0
+     has the same result as the fullduplex mode 1. This parameter
+     can be disabled.
+
+     Example: sccparam /dev/scc2 idle off	# transmit forever
+
+maxdefer:
+     This is the maximum time (in seconds) to wait for a free channel
+     to send. When this timer expires the transmitter will be keyed
+     IMMEDIATELY. If you love getting into trouble with other users you
+     should set this to a very low value ;-)
+
+     Example: sccparam /dev/scc0 maxdefer 240	# 4 minutes
+
+
+txoff:
+     When this parameter has the value 0, the transmission of packets
+     is enabled. Otherwise it is disabled.
+
+     Example: sccparam /dev/scc2 txoff on
+
+group:
+     It is possible to build special radio equipment to use more than
+     one frequency on the same band, e.g. using several receivers and
+     only one transmitter that can be switched between frequencies.
+     Also, you can connect several radios that are active on the same
+     band.  In these cases, it is not possible, or not a good idea, to
+     transmit on more than one frequency.  The SCC driver provides a
+     method to lock transmitters on different interfaces, using the
+     "sccparam <interface> group <x>" command.  This will only work when
+     you are using CSMA mode (parameter full = 0).
+
+     The number <x> must be 0 if you want no group restrictions, and
+     can be computed as follows to create restricted groups:
+     <x> is the sum of some OCTAL numbers:
+
+
+     ===  =======================================================
+     200  This transmitter will only be keyed when all other
+	  transmitters in the group are off.
+     100  This transmitter will only be keyed when the carrier
+	  detect of all other interfaces in the group is off.
+     0xx  A byte that can be used to define different groups.
+	  Interfaces are in the same group, when the logical AND
+	  between their xx values is nonzero.
+     ===  =======================================================
+
+     Examples:
+
+     When 2 interfaces use group 201, their transmitters will never be
+     keyed at the same time.
+
+     When 2 interfaces use group 101, the transmitters will only key
+     when both channels are clear at the same time.  When group 301,
+     the transmitters will not be keyed at the same time.
+
+     Don't forget to convert the octal numbers into decimal before
+     you set the parameter.
+
+     Example: (to be written)
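+
+     A possible illustration of the conversion, assuming sccparam takes
+     the group value in decimal as described in the syntax line above
+     (the device name is only a placeholder)::
+
+         octal 201  =  2*64 + 0*8 + 1  =  129
+         octal 101  =  1*64 + 0*8 + 1  =   65
+         octal 301  =  3*64 + 0*8 + 1  =  193
+
+         sccparam /dev/scc0 group 129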
+
+softdcd:
+     Use a software DCD instead of the real one... Useful for a very
+     slow squelch.
+
+     Example: sccparam /dev/scc0 soft on
+
+
+4. Problems
+===========
+
+If you have tx-problems with your BayCom USCC card please check
+the manufacturer of the 8530. SGS chips have a slightly
+different timing. Try Zilog...  A solution is to write to register 8
+instead of the data port, but this won't work with the ESCC chips.
+*SIGH!*
+
+A very common problem is that the PTT locks until the maxkeyup timer
+expires, although interrupts and clock source are correct. In most
+cases compiling the driver with CONFIG_SCC_DELAY (set with
+make config) solves the problems. For more hints read the (pseudo) FAQ
+and the documentation coming with z8530drv-utils.
+
+I got reports that the driver has problems on some 386-based systems
+(e.g. Amstrad). Those systems have a bogus AT bus timing which will
+lead to delayed answers on interrupts. You can recognize these
+problems by looking at the output of sccstat for the suspected
+port. If it shows under- and overruns you own such a system.
+
+Delayed processing of received data: This depends on
+
+- the kernel version
+
+- kernel profiling compiled or not
+
+- a high interrupt load
+
+- a high load of the machine --- running X, Xmorph, XV and Povray,
+  while compiling the kernel... hmm ... even with 32 MB RAM ...  ;-)
+  Or running a named for the whole .ampr.org domain on an 8 MB
+  box...
+
+- using information from rxecho or kissbridge.
+
+Kernel panics: please read /linux/README and find out if it
+really occurred within the scc driver.
+
+If you cannot solve a problem, send me
+
+- a description of the problem,
+- information on your hardware (computer system, scc board, modem)
+- your kernel version
+- the output of cat /proc/net/z8530
+
+5. Thor RLC100
+==============
+
+Mysteriously this board seems not to work with the driver. Anyone
+got it up-and-running?
+
+
+Many thanks to Linus Torvalds and Alan Cox for including the driver
+in the Linux standard distribution and their support.
+
+::
+
+	Joerg Reuter	ampr-net: dl1bke@db0pra.ampr.org
+			AX-25   : DL1BKE @ DB0ABH.#BAY.DEU.EU
+			Internet: jreuter@yaina.de
+			WWW     : http://yaina.de/jreuter
diff --git a/Documentation/networking/z8530drv.txt b/Documentation/networking/z8530drv.txt
deleted file mode 100644
index 2206abbc3e1b..000000000000
--- a/Documentation/networking/z8530drv.txt
+++ /dev/null
@@ -1,657 +0,0 @@
-This is a subset of the documentation. To use this driver you MUST have the
-full package from:
-
-Internet:
-=========
-
-1. ftp://ftp.ccac.rwth-aachen.de/pub/jr/z8530drv-utils_3.0-3.tar.gz
-
-2. ftp://ftp.pspt.fi/pub/ham/linux/ax25/z8530drv-utils_3.0-3.tar.gz
-
-Please note that the information in this document may be hopelessly outdated.
-A new version of the documentation, along with links to other important
-Linux Kernel AX.25 documentation and programs, is available on
-http://yaina.de/jreuter
-
------------------------------------------------------------------------------
-
-
-	 SCC.C - Linux driver for Z8530 based HDLC cards for AX.25      
-
-   ********************************************************************
-
-        (c) 1993,2000 by Joerg Reuter DL1BKE <jreuter@yaina.de>
-
-        portions (c) 1993 Guido ten Dolle PE1NNZ
-
-        for the complete copyright notice see >> Copying.Z8530DRV <<
-
-   ******************************************************************** 
-
-
-1. Initialization of the driver
-===============================
-
-To use the driver, 3 steps must be performed:
-
-     1. if compiled as module: loading the module
-     2. Setup of hardware, MODEM and KISS parameters with sccinit
-     3. Attach each channel to the Linux kernel AX.25 with "ifconfig"
-
-Unlike the versions below 2.4 this driver is a real network device
-driver. If you want to run xNOS instead of our fine kernel AX.25
-use a 2.x version (available from above sites) or read the
-AX.25-HOWTO on how to emulate a KISS TNC on network device drivers.
-
-
-1.1 Loading the module
-======================
-
-(If you're going to compile the driver as a part of the kernel image,
- skip this chapter and continue with 1.2)
-
-Before you can use a module, you'll have to load it with
-
-	insmod scc.o
-
-please read 'man insmod' that comes with module-init-tools.
-
-You should include the insmod in one of the /etc/rc.d/rc.* files,
-and don't forget to insert a call of sccinit after that. It
-will read your /etc/z8530drv.conf.
-
-1.2. /etc/z8530drv.conf
-=======================
-
-To setup all parameters you must run /sbin/sccinit from one
-of your rc.*-files. This has to be done BEFORE you can
-"ifconfig" an interface. Sccinit reads the file /etc/z8530drv.conf
-and sets the hardware, MODEM and KISS parameters. A sample file is
-delivered with this package. Change it to your needs.
-
-The file itself consists of two main sections.
-
-1.2.1 configuration of hardware parameters
-==========================================
-
-The hardware setup section defines the following parameters for each
-Z8530:
-
-chip    1
-data_a  0x300                   # data port A
-ctrl_a  0x304                   # control port A
-data_b  0x301                   # data port B
-ctrl_b  0x305                   # control port B
-irq     5                       # IRQ No. 5
-pclock  4915200                 # clock
-board   BAYCOM                  # hardware type
-escc    no                      # enhanced SCC chip? (8580/85180/85280)
-vector  0                       # latch for interrupt vector
-special no                      # address of special function register
-option  0                       # option to set via sfr
-
-
-chip	- this is just a delimiter to make sccinit a bit simpler to
-	  program. A parameter has no effect.
-
-data_a  - the address of the data port A of this Z8530 (needed)
-ctrl_a  - the address of the control port A (needed)
-data_b  - the address of the data port B (needed)
-ctrl_b  - the address of the control port B (needed)
-
-irq     - the used IRQ for this chip. Different chips can use different
-          IRQs or the same. If they share an interrupt, it needs to be
-	  specified within one chip-definition only.
-
-pclock  - the clock at the PCLK pin of the Z8530 (option, 4915200 is
-          default), measured in Hertz
-
-board   - the "type" of the board:
-
-	   SCC type                 value
-	   ---------------------------------
-	   PA0HZP SCC card          PA0HZP
-	   EAGLE card               EAGLE
-	   PC100 card               PC100
-	   PRIMUS-PC (DG9BL) card   PRIMUS
-	   BayCom (U)SCC card       BAYCOM
-
-escc    - if you want support for ESCC chips (8580, 85180, 85280), set
-          this to "yes" (option, defaults to "no")
-
-vector  - address of the vector latch (aka "intack port") for PA0HZP
-          cards. There can be only one vector latch for all chips!
-	  (option, defaults to 0)
-
-special - address of the special function register on several cards.
-          (option, defaults to 0)
-
-option  - The value you write into that register (option, default is 0)
-
-You can specify up to four chips (8 channels). If this is not enough,
-just change
-
-	#define MAXSCC 4
-
-to a higher value.
-
-Example for the BAYCOM USCC:
-----------------------------
-
-chip    1
-data_a  0x300                   # data port A
-ctrl_a  0x304                   # control port A
-data_b  0x301                   # data port B
-ctrl_b  0x305                   # control port B
-irq     5                       # IRQ No. 5 (#)
-board   BAYCOM                  # hardware type (*)
-#
-# SCC chip 2
-#
-chip    2
-data_a  0x302
-ctrl_a  0x306
-data_b  0x303
-ctrl_b  0x307
-board   BAYCOM
-
-An example for a PA0HZP card:
------------------------------
-
-chip 1
-data_a 0x153
-data_b 0x151
-ctrl_a 0x152
-ctrl_b 0x150
-irq 9
-pclock 4915200
-board PA0HZP
-vector 0x168
-escc no
-#
-#
-#
-chip 2
-data_a 0x157
-data_b 0x155
-ctrl_a 0x156
-ctrl_b 0x154
-irq 9
-pclock 4915200
-board PA0HZP
-vector 0x168
-escc no
-
-A DRSI would should probably work with this:
---------------------------------------------
-(actually: two DRSI cards...)
-
-chip 1
-data_a 0x303
-data_b 0x301
-ctrl_a 0x302
-ctrl_b 0x300
-irq 7
-pclock 4915200
-board DRSI
-escc no
-#
-#
-#
-chip 2
-data_a 0x313
-data_b 0x311
-ctrl_a 0x312
-ctrl_b 0x310
-irq 7
-pclock 4915200
-board DRSI
-escc no
-
-Note that you cannot use the on-board baudrate generator off DRSI
-cards. Use "mode dpll" for clock source (see below).
-
-This is based on information provided by Mike Bilow (and verified
-by Paul Helay)
-
-The utility "gencfg"
---------------------
-
-If you only know the parameters for the PE1CHL driver for DOS,
-run gencfg. It will generate the correct port addresses (I hope).
-Its parameters are exactly the same as the ones you use with
-the "attach scc" command in net, except that the string "init" must 
-not appear. Example:
-
-gencfg 2 0x150 4 2 0 1 0x168 9 4915200 
-
-will print a skeleton z8530drv.conf for the OptoSCC to stdout.
-
-gencfg 2 0x300 2 4 5 -4 0 7 4915200 0x10
-
-does the same for the BAYCOM USCC card. In my opinion it is much easier
-to edit scc_config.h... 
-
-
-1.2.2 channel configuration
-===========================
-
-The channel definition is divided into three sub sections for each
-channel:
-
-An example for scc0:
-
-# DEVICE
-
-device scc0	# the device for the following params
-
-# MODEM / BUFFERS
-
-speed 1200		# the default baudrate
-clock dpll		# clock source: 
-			# 	dpll     = normal half duplex operation
-			# 	external = MODEM provides own Rx/Tx clock
-			#	divider  = use full duplex divider if
-			#		   installed (1)
-mode nrzi		# HDLC encoding mode
-			#	nrzi = 1k2 MODEM, G3RUH 9k6 MODEM
-			#	nrz  = DF9IC 9k6 MODEM
-			#
-bufsize	384		# size of buffers. Note that this must include
-			# the AX.25 header, not only the data field!
-			# (optional, defaults to 384)
-
-# KISS (Layer 1)
-
-txdelay 36              # (see chapter 1.4)
-persist 64
-slot    8
-tail    8
-fulldup 0
-wait    12
-min     3
-maxkey  7
-idle    3
-maxdef  120
-group   0
-txoff   off
-softdcd on                   
-slip    off
-
-The order WITHIN these sections is unimportant. The order OF these
-sections IS important. The MODEM parameters are set with the first
-recognized KISS parameter...
-
-Please note that you can initialize the board only once after boot
-(or insmod). You can change all parameters but "mode" and "clock" 
-later with the Sccparam program or through KISS. Just to avoid 
-security holes... 
-
-(1) this divider is usually mounted on the SCC-PBC (PA0HZP) or not
-    present at all (BayCom). It feeds back the output of the DPLL 
-    (digital pll) as transmit clock. Using this mode without a divider 
-    installed will normally result in keying the transceiver until 
-    maxkey expires --- of course without sending anything (useful).
-
-2. Attachment of a channel by your AX.25 software
-=================================================
-
-2.1 Kernel AX.25
-================
-
-To set up an AX.25 device you can simply type:
-
-	ifconfig scc0 44.128.1.1 hw ax25 dl0tha-7
-
-This will create a network interface with the IP number 44.128.20.107 
-and the callsign "dl0tha". If you do not have any IP number (yet) you 
-can use any of the 44.128.0.0 network. Note that you do not need 
-axattach. The purpose of axattach (like slattach) is to create a KISS 
-network device linked to a TTY. Please read the documentation of the 
-ax25-utils and the AX.25-HOWTO to learn how to set the parameters of
-the kernel AX.25.
-
-2.2 NOS, NET and TFKISS
-=======================
-
-Since the TTY driver (aka KISS TNC emulation) is gone you need
-to emulate the old behaviour. The cost of using these programs is
-that you probably need to compile the kernel AX.25, regardless of whether
-you actually use it or not. First setup your /etc/ax25/axports,
-for example:
-
-	9k6	dl0tha-9  9600  255 4 9600 baud port (scc3)
-	axlink	dl0tha-15 38400 255 4 Link to NOS
-
-Now "ifconfig" the scc device:
-
-	ifconfig scc3 44.128.1.1 hw ax25 dl0tha-9
-
-You can now axattach a pseudo-TTY:
-
-	axattach /dev/ptys0 axlink
-
-and start your NOS and attach /dev/ptys0 there. The problem is that
-NOS is reachable only via digipeating through the kernel AX.25
-(disastrous on a DAMA controlled channel). To solve this problem,
-configure "rxecho" to echo the incoming frames from "9k6" to "axlink"
-and outgoing frames from "axlink" to "9k6" and start:
-
-	rxecho
-
-Or simply use "kissbridge" coming with z8530drv-utils:
-
-	ifconfig scc3 hw ax25 dl0tha-9
-	kissbridge scc3 /dev/ptys0
-
-
-3. Adjustment and Display of parameters
-=======================================
-
-3.1 Displaying SCC Parameters:
-==============================
-
-Once a SCC channel has been attached, the parameter settings and 
-some statistic information can be shown using the param program:
-
-dl1bke-u:~$ sccstat scc0
-
-Parameters:
-
-speed       : 1200 baud
-txdelay     : 36
-persist     : 255
-slottime    : 0
-txtail      : 8
-fulldup     : 1
-waittime    : 12
-mintime     : 3 sec
-maxkeyup    : 7 sec
-idletime    : 3 sec
-maxdefer    : 120 sec
-group       : 0x00
-txoff       : off
-softdcd     : on
-SLIP        : off
-
-Status:
-
-HDLC                  Z8530           Interrupts         Buffers
------------------------------------------------------------------------
-Sent       :     273  RxOver :     0  RxInts :   125074  Size    :  384
-Received   :    1095  TxUnder:     0  TxInts :     4684  NoSpace :    0
-RxErrors   :    1591                  ExInts :    11776
-TxErrors   :       0                  SpInts :     1503
-Tx State   :    idle
-
-
-The status info shown is:
-
-Sent		- number of frames transmitted
-Received	- number of frames received
-RxErrors	- number of receive errors (CRC, ABORT)
-TxErrors	- number of discarded Tx frames (due to various reasons) 
-Tx State	- status of the Tx interrupt handler: idle/busy/active/tail (2)
-RxOver		- number of receiver overruns
-TxUnder		- number of transmitter underruns
-RxInts		- number of receiver interrupts
-TxInts		- number of transmitter interrupts
-EpInts		- number of receiver special condition interrupts
-SpInts		- number of external/status interrupts
-Size		- maximum size of an AX.25 frame (*with* AX.25 headers!)
-NoSpace		- number of times a buffer could not get allocated
-
-An overrun is abnormal. If lots of these occur, the product of
-baudrate and number of interfaces is too high for the processing
-power of your computer. NoSpace errors are unlikely to be caused by the
-driver or the kernel AX.25.
-
-
-3.2 Setting Parameters
-======================
-
-
-The setting of parameters of the emulated KISS TNC is done in the 
-same way in the SCC driver. You can change parameters by using
-the kissparms program from the ax25-utils package or use the program 
-"sccparam":
-
-     sccparam <device> <paramname> <decimal-|hexadecimal value>
-
-You can change the following parameters:
-
-param	    : value
-------------------------
-speed       : 1200
-txdelay     : 36
-persist     : 255
-slottime    : 0
-txtail      : 8
-fulldup     : 1
-waittime    : 12
-mintime     : 3
-maxkeyup    : 7
-idletime    : 3
-maxdefer    : 120
-group       : 0x00
-txoff       : off
-softdcd     : on
-SLIP        : off
-
-
-The parameters have the following meaning:
-
-speed:
-     The baudrate on this channel in bits/sec
-
-     Example: sccparam /dev/scc3 speed 9600
-
-txdelay:
-     The delay (in units of 10 ms) after keying of the 
-     transmitter, until the first byte is sent. This is usually 
-     called "TXDELAY" in a TNC.  When 0 is specified, the driver 
-     will just wait until the CTS signal is asserted. This 
-     assumes the presence of a timer or other circuitry in the 
-     MODEM and/or transmitter, that asserts CTS when the 
-     transmitter is ready for data.
-     A normal value of this parameter is 30-36.
-
-     Example: sccparam /dev/scc0 txd 20
-
-persist:
-     This is the probability that the transmitter will be keyed 
-     when the channel is found to be free.  It is a value from 0 
-     to 255, and the probability is (value+1)/256.  The value 
-     should be somewhere near 50-60, and should be lowered when 
-     the channel is used more heavily.
-
-     Example: sccparam /dev/scc2 persist 20
-
-slottime:
-     This is the time between samples of the channel. It is 
-     expressed in units of 10 ms.  About 200-300 ms (value 20-30) 
-     seems to be a good value.
-
-     Example: sccparam /dev/scc0 slot 20
-
-tail:
-     The time the transmitter will remain keyed after the last 
-     byte of a packet has been transferred to the SCC. This is 
-     necessary because the CRC and a flag still have to leave the 
-     SCC before the transmitter is keyed down. The value depends 
-     on the baudrate selected.  A few character times should be 
-     sufficient, e.g. 40ms at 1200 baud. (value 4)
-     The value of this parameter is in 10 ms units.
-
-     Example: sccparam /dev/scc2 4
-
-full:
-     The full-duplex mode switch. This can be one of the following 
-     values:
-
-     0:   The interface will operate in CSMA mode (the normal 
-          half-duplex packet radio operation)
-     1:   Fullduplex mode, i.e. the transmitter will be keyed at 
-          any time, without checking the received carrier.  It 
-          will be unkeyed when there are no packets to be sent.
-     2:   Like 1, but the transmitter will remain keyed, also 
-          when there are no packets to be sent.  Flags will be 
-          sent in that case, until a timeout (parameter 10) 
-          occurs.
-
-     Example: sccparam /dev/scc0 fulldup off
-
-wait:
-     The initial waittime before any transmit attempt, after the 
-     frame has been queue for transmit.  This is the length of 
-     the first slot in CSMA mode.  In full duplex modes it is
-     set to 0 for maximum performance.
-     The value of this parameter is in 10 ms units. 
-
-     Example: sccparam /dev/scc1 wait 4
-
-maxkey:
-     The maximal time the transmitter will be keyed to send 
-     packets, in seconds.  This can be useful on busy CSMA 
-     channels, to avoid "getting a bad reputation" when you are 
-     generating a lot of traffic.  After the specified time has 
-     elapsed, no new frame will be started. Instead, the trans-
-     mitter will be switched off for a specified time (parameter 
-     min), and then the selected algorithm for keyup will be 
-     started again.
-     The value 0 as well as "off" will disable this feature, 
-     and allow infinite transmission time. 
-
-     Example: sccparam /dev/scc0 maxk 20
-
-min:
-     This is the time the transmitter will be switched off when 
-     the maximum transmission time is exceeded.
-
-     Example: sccparam /dev/scc3 min 10
-
-idle
-     This parameter specifies the maximum idle time in full duplex 
-     2 mode, in seconds.  When no frames have been sent for this 
-     time, the transmitter will be keyed down.  A value of 0 is
-     has same result as the fullduplex mode 1. This parameter
-     can be disabled.
-
-     Example: sccparam /dev/scc2 idle off	# transmit forever
-
-maxdefer
-     This is the maximum time (in seconds) to wait for a free channel
-     to send. When this timer expires the transmitter will be keyed 
-     IMMEDIATELY. If you love to get trouble with other users you
-     should set this to a very low value ;-)
-
-     Example: sccparam /dev/scc0 maxdefer 240	# 2 minutes
-
-
-txoff:
-     When this parameter has the value 0, the transmission of packets
-     is enable. Otherwise it is disabled.
-
-     Example: sccparam /dev/scc2 txoff on
-
-group:
-     It is possible to build special radio equipment to use more than 
-     one frequency on the same band, e.g. using several receivers and 
-     only one transmitter that can be switched between frequencies.
-     Also, you can connect several radios that are active on the same 
-     band.  In these cases, it is not possible, or not a good idea, to 
-     transmit on more than one frequency.  The SCC driver provides a 
-     method to lock transmitters on different interfaces, using the 
-     "param <interface> group <x>" command.  This will only work when 
-     you are using CSMA mode (parameter full = 0).
-     The number <x> must be 0 if you want no group restrictions, and 
-     can be computed as follows to create restricted groups:
-     <x> is the sum of some OCTAL numbers:
-
-     200  This transmitter will only be keyed when all other 
-          transmitters in the group are off.
-     100  This transmitter will only be keyed when the carrier 
-          detect of all other interfaces in the group is off.
-     0xx  A byte that can be used to define different groups.  
-          Interfaces are in the same group, when the logical AND 
-          between their xx values is nonzero.
-
-     Examples:
-     When 2 interfaces use group 201, their transmitters will never be 
-     keyed at the same time.
-     When 2 interfaces use group 101, the transmitters will only key 
-     when both channels are clear at the same time.  When group 301, 
-     the transmitters will not be keyed at the same time.
-
-     Don't forget to convert the octal numbers into decimal before
-     you set the parameter.
-
-     Example: (to be written)
-
-softdcd:
-     use a software dcd instead of the real one... Useful for a very
-     slow squelch.
-
-     Example: sccparam /dev/scc0 soft on
-
-
-4. Problems 
-===========
-
-If you have tx-problems with your BayCom USCC card please check
-the manufacturer of the 8530. SGS chips have a slightly
-different timing. Try Zilog...  A solution is to write to register 8 
-instead to the data port, but this won't work with the ESCC chips. 
-*SIGH!*
-
-A very common problem is that the PTT locks until the maxkeyup timer
-expires, although interrupts and clock source are correct. In most
-cases compiling the driver with CONFIG_SCC_DELAY (set with
-make config) solves the problems. For more hints read the (pseudo) FAQ 
-and the documentation coming with z8530drv-utils.
-
-I got reports that the driver has problems on some 386-based systems.
-(i.e. Amstrad) Those systems have a bogus AT bus timing which will
-lead to delayed answers on interrupts. You can recognize these
-problems by looking at the output of Sccstat for the suspected
-port. If it shows under- and overruns you own such a system.
-
-Delayed processing of received data: This depends on
-
-- the kernel version
-
-- kernel profiling compiled or not
-
-- a high interrupt load
-
-- a high load of the machine --- running X, Xmorph, XV and Povray,
-  while compiling the kernel... hmm ... even with 32 MB RAM ...  ;-)
-  Or running a named for the whole .ampr.org domain on an 8 MB
-  box...
-
-- using information from rxecho or kissbridge.
-
-Kernel panics: please read /linux/README and find out if it
-really occurred within the scc driver.
-
-If you cannot solve a problem, send me
-
-- a description of the problem,
-- information on your hardware (computer system, scc board, modem)
-- your kernel version
-- the output of cat /proc/net/z8530
-
-4. Thor RLC100
-==============
-
-Mysteriously this board seems not to work with the driver. Anyone
-got it up-and-running?
-
-
-Many thanks to Linus Torvalds and Alan Cox for including the driver
-in the Linux standard distribution and their support.
-
-Joerg Reuter	ampr-net: dl1bke@db0pra.ampr.org
-		AX-25   : DL1BKE @ DB0ABH.#BAY.DEU.EU
-		Internet: jreuter@yaina.de
-		WWW     : http://yaina.de/jreuter
diff --git a/MAINTAINERS b/MAINTAINERS
index d59455c27c42..bee65ebdc67e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18644,7 +18644,7 @@ L:	linux-hams@vger.kernel.org
 S:	Maintained
 W:	http://yaina.de/jreuter/
 W:	http://www.qsl.net/dl1bke/
-F:	Documentation/networking/z8530drv.txt
+F:	Documentation/networking/z8530drv.rst
 F:	drivers/net/hamradio/*scc.c
 F:	drivers/net/hamradio/z8530.h
 
diff --git a/drivers/net/hamradio/Kconfig b/drivers/net/hamradio/Kconfig
index fe409819b56d..f4500f04147d 100644
--- a/drivers/net/hamradio/Kconfig
+++ b/drivers/net/hamradio/Kconfig
@@ -84,7 +84,7 @@ config SCC
 	---help---
 	  These cards are used to connect your Linux box to an amateur radio
 	  in order to communicate with other computers. If you want to use
-	  this, read <file:Documentation/networking/z8530drv.txt> and the
+	  this, read <file:Documentation/networking/z8530drv.rst> and the
 	  AX25-HOWTO, available from
 	  <http://www.tldp.org/docs.html#howto>. Also make sure to say Y
 	  to "Amateur Radio AX.25 Level 2" support.
@@ -98,7 +98,7 @@ config SCC_DELAY
 	help
 	  Say Y here if you experience problems with the SCC driver not
 	  working properly; please read
-	  <file:Documentation/networking/z8530drv.txt> for details.
+	  <file:Documentation/networking/z8530drv.rst> for details.
 
 	  If unsure, say N.
 
diff --git a/drivers/net/hamradio/scc.c b/drivers/net/hamradio/scc.c
index 6c03932d8a6b..33fdd55c6122 100644
--- a/drivers/net/hamradio/scc.c
+++ b/drivers/net/hamradio/scc.c
@@ -7,7 +7,7 @@
  *            ------------------
  *
  * You can find a subset of the documentation in 
- * Documentation/networking/z8530drv.txt.
+ * Documentation/networking/z8530drv.rst.
  */
 
 /*
-- 
cgit v1.2.3


From 9ea2af8d16f5612168ed52cb0ec6752bac0877a9 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Fri, 1 May 2020 16:44:35 +0200
Subject: docs: networking: device drivers: convert 3com/vortex.txt to ReST

- add SPDX header;
- add a document title;
- mark code blocks and literals as such;
- mark tables as such;
- adjust indentation, whitespaces and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 .../networking/device_drivers/3com/vortex.rst      | 461 +++++++++++++++++++++
 .../networking/device_drivers/3com/vortex.txt      | 448 --------------------
 Documentation/networking/device_drivers/index.rst  |   1 +
 MAINTAINERS                                        |   2 +-
 drivers/net/ethernet/3com/3c59x.c                  |   4 +-
 drivers/net/ethernet/3com/Kconfig                  |   2 +-
 6 files changed, 466 insertions(+), 452 deletions(-)
 create mode 100644 Documentation/networking/device_drivers/3com/vortex.rst
 delete mode 100644 Documentation/networking/device_drivers/3com/vortex.txt

(limited to 'MAINTAINERS')
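
Reviewer note, not part of the patch: the ``options`` section of the new
vortex.rst describes a value built by OR-ing one media type with optional
flag bits.  The sketch below is a minimal Python illustration of that
arithmetic, assuming the table values exactly as documented.  The names
MEDIA_TYPES, FLAGS and compose_options are hypothetical helpers invented
for this note; they do not exist in the driver or in any shipped tool.

```python
# Hypothetical helper (not part of 3c59x.c): compose an ``options`` value
# from the media-type table and the flag bits listed in vortex.rst.

# Media selection values, copied from the documented table.
MEDIA_TYPES = {
    "10baseT": 0,
    "AUI": 1,
    "10base2": 3,
    "100base-TX": 4,
    "100base-FX": 5,
    "MII": 6,
    "autonegotiate": 8,
    "external-MII": 9,
}

# Flag bits that may be OR'ed with the media selection value.
FLAGS = {
    "debug7": 0x8000,       # set driver debugging level to 7
    "debug2": 0x4000,       # set driver debugging level to 2
    "wol": 0x0400,          # enable Wake-on-LAN
    "full_duplex": 0x0200,  # force full duplex mode
    "bus_master": 0x0010,   # bus-master enable (old Vortex cards only)
}


def compose_options(media, *flags):
    """Return the integer to pass as options=N for one card."""
    value = MEDIA_TYPES[media]
    for flag in flags:
        value |= FLAGS[flag]
    return value


if __name__ == "__main__":
    # Forced full-duplex 100base-TX, as in the document's insmod example.
    print(hex(compose_options("100base-TX", "full_duplex")))
```

Run as a script, this prints 0x204, the same value used in the
``insmod 3c59x options=0x204`` example in the document.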

diff --git a/Documentation/networking/device_drivers/3com/vortex.rst b/Documentation/networking/device_drivers/3com/vortex.rst
new file mode 100644
index 000000000000..800add5be338
--- /dev/null
+++ b/Documentation/networking/device_drivers/3com/vortex.rst
@@ -0,0 +1,461 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================
+3Com Vortex device driver
+=========================
+
+Documentation/networking/device_drivers/3com/vortex.rst
+
+Andrew Morton
+
+30 April 2000
+
+
+This document describes the usage and errata of the 3Com "Vortex" device
+driver for Linux, 3c59x.c.
+
+The driver was written by Donald Becker <becker@scyld.com>
+
+Don is no longer the prime maintainer of this version of the driver.
+Please report problems to one or more of:
+
+- Andrew Morton
+- Netdev mailing list <netdev@vger.kernel.org>
+- Linux kernel mailing list <linux-kernel@vger.kernel.org>
+
+Please note the 'Reporting and Diagnosing Problems' section at the end
+of this file.
+
+
+Since kernel 2.3.99-pre6, this driver incorporates the support for the
+3c575-series Cardbus cards which used to be handled by 3c575_cb.c.
+
+This driver supports the following hardware:
+
+	- 3c590 Vortex 10Mbps
+	- 3c592 EISA 10Mbps Demon/Vortex
+	- 3c597 EISA Fast Demon/Vortex
+	- 3c595 Vortex 100baseTx
+	- 3c595 Vortex 100baseT4
+	- 3c595 Vortex 100base-MII
+	- 3c900 Boomerang 10baseT
+	- 3c900 Boomerang 10Mbps Combo
+	- 3c900 Cyclone 10Mbps TPO
+	- 3c900 Cyclone 10Mbps Combo
+	- 3c900 Cyclone 10Mbps TPC
+	- 3c900B-FL Cyclone 10base-FL
+	- 3c905 Boomerang 100baseTx
+	- 3c905 Boomerang 100baseT4
+	- 3c905B Cyclone 100baseTx
+	- 3c905B Cyclone 10/100/BNC
+	- 3c905B-FX Cyclone 100baseFx
+	- 3c905C Tornado
+	- 3c920B-EMB-WNM (ATI Radeon 9100 IGP)
+	- 3c980 Cyclone
+	- 3c980C Python-T
+	- 3cSOHO100-TX Hurricane
+	- 3c555 Laptop Hurricane
+	- 3c556 Laptop Tornado
+	- 3c556B Laptop Hurricane
+	- 3c575 [Megahertz] 10/100 LAN  CardBus
+	- 3c575 Boomerang CardBus
+	- 3CCFE575BT Cyclone CardBus
+	- 3CCFE575CT Tornado CardBus
+	- 3CCFE656 Cyclone CardBus
+	- 3CCFEM656B Cyclone+Winmodem CardBus
+	- 3CXFEM656C Tornado+Winmodem CardBus
+	- 3c450 HomePNA Tornado
+	- 3c920 Tornado
+	- 3c982 Hydra Dual Port A
+	- 3c982 Hydra Dual Port B
+	- 3c905B-T4
+	- 3c920B-EMB-WNM Tornado
+
+Module parameters
+=================
+
+There are several parameters which may be provided to the driver when
+its module is loaded.  These are usually placed in ``/etc/modprobe.d/*.conf``
+configuration files.  Example::
+
+    options 3c59x debug=3 rx_copybreak=300
+
+If you are using the PCMCIA tools (cardmgr) then the options may be
+placed in /etc/pcmcia/config.opts::
+
+    module "3c59x" opts "debug=3 rx_copybreak=300"
+
+
+The supported parameters are:
+
+debug=N
+
+  Where N is a number from 0 to 7.  Anything above 3 produces a lot
+  of output in your system logs.  debug=1 is the default.
+
+options=N1,N2,N3,...
+
+  Each number in the list provides an option to the corresponding
+  network card.  So if you have two 3c905's and you wish to provide
+  them with option 0x204 you would use::
+
+    options=0x204,0x204
+
+  The individual options are composed of a number of bitfields which
+  have the following meanings:
+
+  Possible media type settings
+
+	==	=================================
+	0	10baseT
+	1	10Mbps AUI
+	2	undefined
+	3	10base2 (BNC)
+	4	100base-TX
+	5	100base-FX
+	6	MII (Media Independent Interface)
+	7	Use default setting from EEPROM
+	8       Autonegotiate
+	9       External MII
+	10      Use default setting from EEPROM
+	==	=================================
+
+  When generating a value for the 'options' setting, the above media
+  selection values may be OR'ed (or added to) the following:
+
+  ======  =============================================
+  0x8000  Set driver debugging level to 7
+  0x4000  Set driver debugging level to 2
+  0x0400  Enable Wake-on-LAN
+  0x0200  Force full duplex mode.
+  0x0010  Bus-master enable bit (Old Vortex cards only)
+  ======  =============================================
+
+  For example::
+
+    insmod 3c59x options=0x204
+
+  will force full-duplex 100base-TX, rather than allowing the usual
+  autonegotiation.
+
+global_options=N
+
+  Sets the ``options`` parameter for all 3c59x NICs in the machine.
+  Entries in the ``options`` array above will override any setting of
+  this.
+
+full_duplex=N1,N2,N3...
+
+  Similar to bit 9 of 'options'.  Forces the corresponding card into
+  full-duplex mode.  Please use this in preference to the ``options``
+  parameter.
+
+  In fact, please don't use this at all! You're better off getting
+  autonegotiation working properly.
+
+global_full_duplex=N1
+
+  Sets full duplex mode for all 3c59x NICs in the machine.  Entries
+  in the ``full_duplex`` array above will override any setting of this.
+
+flow_ctrl=N1,N2,N3...
+
+  Use 802.3x MAC-layer flow control.  The 3com cards only support the
+  PAUSE command, which means that they will stop sending packets for a
+  short period if they receive a PAUSE frame from the link partner.
+
+  The driver only allows flow control on a link which is operating in
+  full duplex mode.
+
+  This feature does not appear to work on the 3c905 - only 3c905B and
+  3c905C have been tested.
+
+  The 3com cards appear to only respond to PAUSE frames which are
+  sent to the reserved destination address of 01:80:c2:00:00:01.  They
+  do not honour PAUSE frames which are sent to the station MAC address.
+
+rx_copybreak=M
+
+  The driver preallocates 32 full-sized (1536 byte) network buffers
+  for receiving.  When a packet arrives, the driver has to decide
+  whether to leave the packet in its full-sized buffer, or to allocate
+  a smaller buffer and copy the packet across into it.
+
+  This is a speed/space tradeoff.
+
+  The value of rx_copybreak is used to decide when to make the copy.
+  If the packet size is less than rx_copybreak, the packet is copied.
+  The default value for rx_copybreak is 200 bytes.
+
+max_interrupt_work=N
+
+  The driver's interrupt service routine can handle many receive and
+  transmit packets in a single invocation.  It does this in a loop.
+  The value of max_interrupt_work governs how many times the interrupt
+  service routine will loop.  The default value is 32 loops.  If this
+  is exceeded the interrupt service routine gives up and generates a
+  warning message "eth0: Too much work in interrupt".
+
+hw_checksums=N1,N2,N3,...
+
+  Recent 3com NICs are able to generate IPv4, TCP and UDP checksums
+  in hardware.  Linux has used the Rx checksumming for a long time.
+  The "zero copy" patch which is planned for the 2.4 kernel series
+  allows you to make use of the NIC's DMA scatter/gather and transmit
+  checksumming as well.
+
+  The driver is set up so that, when the zerocopy patch is applied,
+  all Tornado and Cyclone devices will use S/G and Tx checksums.
+
+  This module parameter has been provided so you can override this
+  decision.  If you think that Tx checksums are causing a problem, you
+  may disable the feature with ``hw_checksums=0``.
+
+  If you think your NIC should be performing Tx checksumming and the
+  driver isn't enabling it, you can force the use of hardware Tx
+  checksumming with ``hw_checksums=1``.
+
+  The driver drops a message in the logfiles to indicate whether or
+  not it is using hardware scatter/gather and hardware Tx checksums.
+
+  Scatter/gather and hardware checksums provide considerable
+  performance improvement for the sendfile() system call, but a small
+  decrease in throughput for send().  There is no effect upon receive
+  efficiency.
+
+compaq_ioaddr=N,
+compaq_irq=N,
+compaq_device_id=N
+
+  "Variables to work-around the Compaq PCI BIOS32 problem"....
+
+watchdog=N
+
+  Sets the time duration (in milliseconds) after which the kernel
+  decides that the transmitter has become stuck and needs to be reset.
+  This is mainly for debugging purposes, although it may be advantageous
+  to increase this value on LANs which have very high collision rates.
+  The default value is 5000 (5.0 seconds).
+
+enable_wol=N1,N2,N3,...
+
+  Enable Wake-on-LAN support for the relevant interface.  Donald
+  Becker's ``ether-wake`` application may be used to wake suspended
+  machines.
+
+  Also enables the NIC's power management support.
+
+global_enable_wol=N
+
+  Sets enable_wol mode for all 3c59x NICs in the machine.  Entries in
+  the ``enable_wol`` array above will override any setting of this.
+
+Media selection
+---------------
+
+A number of the older NICs such as the 3c590 and 3c900 series have
+10base2 and AUI interfaces.
+
+Prior to January, 2001 this driver would autoselect the 10base2 or AUI
+port if it didn't detect activity on the 10baseT port.  It would then
+get stuck on the 10base2 port and a driver reload was necessary to
+switch back to 10baseT.  This behaviour could not be prevented with a
+module option override.
+
+Later (current) versions of the driver *do* support locking of the
+media type.  So if you load the driver module with::
+
+	modprobe 3c59x options=0
+
+it will permanently select the 10baseT port.  Automatic selection of
+other media types does not occur.
+
+
+Transmit error, Tx status register 82
+-------------------------------------
+
+This is a common error which is almost always caused by another host on
+the same network being in full-duplex mode, while this host is in
+half-duplex mode.  You need to find that other host and make it run in
+half-duplex mode or fix this host to run in full-duplex mode.
+
+As a last resort, you can force the 3c59x driver into full-duplex mode
+with::
+
+	options 3c59x full_duplex=1
+
+but this has to be viewed as a workaround for broken network gear and
+should only really be used for equipment which cannot autonegotiate.
+
+
+Additional resources
+--------------------
+
+Details of the device driver implementation are at the top of the source file.
+
+Additional documentation is available at Don Becker's Linux Drivers site:
+
+     http://www.scyld.com/vortex.html
+
+Donald Becker's driver development site:
+
+     http://www.scyld.com/network.html
+
+Donald's vortex-diag program is useful for inspecting the NIC's state:
+
+     http://www.scyld.com/ethercard_diag.html
+
+Donald's mii-diag program may be used for inspecting and manipulating
+the NIC's Media Independent Interface subsystem:
+
+     http://www.scyld.com/ethercard_diag.html#mii-diag
+
+Donald's wake-on-LAN page:
+
+     http://www.scyld.com/wakeonlan.html
+
+3Com's DOS-based application for setting up the NIC's EEPROMs:
+
+	ftp://ftp.3com.com/pub/nic/3c90x/3c90xx2.exe
+
+
+Autonegotiation notes
+---------------------
+
+  The driver uses a one-minute heartbeat for adapting to changes in
+  the external LAN environment if the link is up, and 5 seconds if it is down.
+  This means that when, for example, a machine is unplugged from a hubbed
+  10baseT LAN and plugged into a switched 100baseT LAN, the throughput
+  will be quite dreadful for up to sixty seconds.  Be patient.
+
+  Cisco interoperability note from Walter Wong <wcw+@CMU.EDU>:
+
+  On a side note, adding HAS_NWAY seems to share a problem with the
+  Cisco 6509 switch.  Specifically, you need to change the spanning
+  tree parameter for the port the machine is plugged into to 'portfast'
+  mode.  Otherwise, the negotiation fails.  This has been an issue
+  we've noticed for a while but haven't had the time to track down.
+
+  Cisco switches    (Jeff Busch <jbusch@deja.com>)
+
+    My "standard config" for ports to which PC's/servers connect directly::
+
+	interface FastEthernet0/N
+	description machinename
+	load-interval 30
+	spanning-tree portfast
+
+    If autonegotiation is a problem, you may need to specify "speed
+    100" and "duplex full" as well (or "speed 10" and "duplex half").
+
+    WARNING: DO NOT hook up hubs/switches/bridges to these
+    specially-configured ports! The switch will become very confused.
+
+
+Reporting and diagnosing problems
+---------------------------------
+
+Maintainers find that accurate and complete problem reports are
+invaluable in resolving driver problems.  We are frequently not able to
+reproduce problems and must rely on your patience and efforts to get to
+the bottom of the problem.
+
+If you believe you have a driver problem here are some of the
+steps you should take:
+
+- Is it really a driver problem?
+
+   Eliminate some variables: try different cards, different
+   computers, different cables, different ports on the switch/hub,
+   different versions of the kernel or of the driver, etc.
+
+- OK, it's a driver problem.
+
+   You need to generate a report.  Typically this is an email to the
+   maintainer and/or netdev@vger.kernel.org.  The maintainer's
+   email address will be in the driver source or in the MAINTAINERS file.
+
+- The contents of your report will vary a lot depending upon the
+  problem.  If it's a kernel crash then you should refer to the
+  admin-guide/reporting-bugs.rst file.
+
+  But for most problems it is useful to provide the following:
+
+   - Kernel version, driver version
+
+   - A copy of the banner message which the driver generates when
+     it is initialised.  For example::
+
+       eth0: 3Com PCI 3c905C Tornado at 0xa400,  00:50:da:6a:88:f0, IRQ 19
+       8K byte-wide RAM 5:3 Rx:Tx split, autoselect/Autonegotiate interface.
+       MII transceiver found at address 24, status 782d.
+       Enabling bus-master transmits and whole-frame receives.
+
+     NOTE: You must provide the ``debug=2`` modprobe option to generate
+     a full detection message.  Please do this::
+
+	modprobe 3c59x debug=2
+
+   - If it is a PCI device, the relevant output from 'lspci -vx', e.g.::
+
+       00:09.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74)
+	       Subsystem: 3Com Corporation: Unknown device 9200
+	       Flags: bus master, medium devsel, latency 32, IRQ 19
+	       I/O ports at a400 [size=128]
+	       Memory at db000000 (32-bit, non-prefetchable) [size=128]
+	       Expansion ROM at <unassigned> [disabled] [size=128K]
+	       Capabilities: [dc] Power Management version 2
+       00: b7 10 00 92 07 00 10 02 74 00 00 02 08 20 00 00
+       10: 01 a4 00 00 00 00 00 db 00 00 00 00 00 00 00 00
+       20: 00 00 00 00 00 00 00 00 00 00 00 00 b7 10 00 10
+       30: 00 00 00 00 dc 00 00 00 00 00 00 00 05 01 0a 0a
+
+   - A description of the environment: 10baseT? 100baseT?
+     full/half duplex? switched or hubbed?
+
+   - Any additional module parameters which you may be providing to the driver.
+
+   - Any kernel logs which are produced.  The more the merrier.
+     If this is a large file and you are sending your report to a
+     mailing list, mention that you have the logfile, but don't send
+     it.  If you're reporting direct to the maintainer then just send
+     it.
+
+     To ensure that all kernel logs are available, add the
+     following line to /etc/syslog.conf::
+
+	 kern.* /var/log/messages
+
+     Then restart syslogd with::
+
+	 /etc/rc.d/init.d/syslog restart
+
+     (The above may vary, depending upon which Linux distribution you use).
+
+    - If your problem is reproducible then that's great.  Try the
+      following:
+
+      1) Increase the debug level.  Usually this is done via:
+
+	 a) modprobe driver debug=7
+	 b) In /etc/modprobe.d/driver.conf:
+	    options driver debug=7
+
+      2) Recreate the problem with the higher debug level,
+	 send all logs to the maintainer.
+
+      3) Download your card's diagnostic tool from Donald
+	 Becker's website <http://www.scyld.com/ethercard_diag.html>.
+	 Download mii-diag.c as well.  Build these.
+
+	 a) Run 'vortex-diag -aaee' and 'mii-diag -v' when the card is
+	    working correctly.  Save the output.
+
+	 b) Run the above commands when the card is malfunctioning.  Send
+	    both sets of output.
+
+Finally, please be patient and be prepared to do some work.  You may
+end up working on this problem for a week or more as the maintainer
+asks more questions, asks for more tests, asks for patches to be
+applied, etc.  At the end of it all, the problem may even remain
+unresolved.
diff --git a/Documentation/networking/device_drivers/3com/vortex.txt b/Documentation/networking/device_drivers/3com/vortex.txt
deleted file mode 100644
index 587f3fcfbcae..000000000000
--- a/Documentation/networking/device_drivers/3com/vortex.txt
+++ /dev/null
@@ -1,448 +0,0 @@
-Documentation/networking/device_drivers/3com/vortex.txt
-Andrew Morton
-30 April 2000
-
-
-This document describes the usage and errata of the 3Com "Vortex" device
-driver for Linux, 3c59x.c.
-
-The driver was written by Donald Becker <becker@scyld.com>
-
-Don is no longer the prime maintainer of this version of the driver. 
-Please report problems to one or more of:
-
-  Andrew Morton
-  Netdev mailing list <netdev@vger.kernel.org>
-  Linux kernel mailing list <linux-kernel@vger.kernel.org>
-
-Please note the 'Reporting and Diagnosing Problems' section at the end
-of this file.
-
-
-Since kernel 2.3.99-pre6, this driver incorporates the support for the
-3c575-series Cardbus cards which used to be handled by 3c575_cb.c.
-
-This driver supports the following hardware:
-
-	3c590 Vortex 10Mbps
-	3c592 EISA 10Mbps Demon/Vortex
-	3c597 EISA Fast Demon/Vortex
-	3c595 Vortex 100baseTx
-	3c595 Vortex 100baseT4
-	3c595 Vortex 100base-MII
-	3c900 Boomerang 10baseT
-	3c900 Boomerang 10Mbps Combo
-	3c900 Cyclone 10Mbps TPO
-	3c900 Cyclone 10Mbps Combo
-	3c900 Cyclone 10Mbps TPC
-	3c900B-FL Cyclone 10base-FL
-	3c905 Boomerang 100baseTx
-	3c905 Boomerang 100baseT4
-	3c905B Cyclone 100baseTx
-	3c905B Cyclone 10/100/BNC
-	3c905B-FX Cyclone 100baseFx
-	3c905C Tornado
-	3c920B-EMB-WNM (ATI Radeon 9100 IGP)
-	3c980 Cyclone
-	3c980C Python-T
-	3cSOHO100-TX Hurricane
-	3c555 Laptop Hurricane
-	3c556 Laptop Tornado
-	3c556B Laptop Hurricane
-	3c575 [Megahertz] 10/100 LAN  CardBus
-	3c575 Boomerang CardBus
-	3CCFE575BT Cyclone CardBus
-	3CCFE575CT Tornado CardBus
-	3CCFE656 Cyclone CardBus
-	3CCFEM656B Cyclone+Winmodem CardBus
-	3CXFEM656C Tornado+Winmodem CardBus
-	3c450 HomePNA Tornado
-	3c920 Tornado
-	3c982 Hydra Dual Port A
-	3c982 Hydra Dual Port B
-	3c905B-T4
-	3c920B-EMB-WNM Tornado
-
-Module parameters
-=================
-
-There are several parameters which may be provided to the driver when
-its module is loaded.  These are usually placed in /etc/modprobe.d/*.conf
-configuration files.  Example:
-
-options 3c59x debug=3 rx_copybreak=300
-
-If you are using the PCMCIA tools (cardmgr) then the options may be
-placed in /etc/pcmcia/config.opts:
-
-module "3c59x" opts "debug=3 rx_copybreak=300"
-
-
-The supported parameters are:
-
-debug=N
-
-  Where N is a number from 0 to 7.  Anything above 3 produces a lot
-  of output in your system logs.  debug=1 is default.
-
-options=N1,N2,N3,...
-
-  Each number in the list provides an option to the corresponding
-  network card.  So if you have two 3c905's and you wish to provide
-  them with option 0x204 you would use:
-
-    options=0x204,0x204
-
-  The individual options are composed of a number of bitfields which
-  have the following meanings:
-
-  Possible media type settings
-	0	10baseT
-	1	10Mbs AUI
-	2	undefined
-	3	10base2 (BNC)
-	4	100base-TX
-	5	100base-FX
-	6	MII (Media Independent Interface)
-	7	Use default setting from EEPROM
-	8       Autonegotiate
-	9       External MII
-	10      Use default setting from EEPROM
-
-  When generating a value for the 'options' setting, the above media
-  selection values may be OR'ed (or added to) the following:
-
-  0x8000  Set driver debugging level to 7
-  0x4000  Set driver debugging level to 2
-  0x0400  Enable Wake-on-LAN
-  0x0200  Force full duplex mode.
-  0x0010  Bus-master enable bit (Old Vortex cards only)
-
-  For example:
-
-    insmod 3c59x options=0x204
-
-  will force full-duplex 100base-TX, rather than allowing the usual
-  autonegotiation.
-
-global_options=N
-
-  Sets the `options' parameter for all 3c59x NICs in the machine. 
-  Entries in the `options' array above will override any setting of
-  this.
-
-full_duplex=N1,N2,N3...
-
-  Similar to bit 9 of 'options'.  Forces the corresponding card into
-  full-duplex mode.  Please use this in preference to the `options'
-  parameter.
-
-  In fact, please don't use this at all! You're better off getting
-  autonegotiation working properly.
-
-global_full_duplex=N1
-
-  Sets full duplex mode for all 3c59x NICs in the machine.  Entries
-  in the `full_duplex' array above will override any setting of this.
-
-flow_ctrl=N1,N2,N3...
-
-  Use 802.3x MAC-layer flow control.  The 3com cards only support the
-  PAUSE command, which means that they will stop sending packets for a
-  short period if they receive a PAUSE frame from the link partner. 
-
-  The driver only allows flow control on a link which is operating in
-  full duplex mode.
-
-  This feature does not appear to work on the 3c905 - only 3c905B and
-  3c905C have been tested.
-
-  The 3com cards appear to only respond to PAUSE frames which are
-  sent to the reserved destination address of 01:80:c2:00:00:01.  They
-  do not honour PAUSE frames which are sent to the station MAC address.
-
-rx_copybreak=M
-
-  The driver preallocates 32 full-sized (1536 byte) network buffers
-  for receiving.  When a packet arrives, the driver has to decide
-  whether to leave the packet in its full-sized buffer, or to allocate
-  a smaller buffer and copy the packet across into it.
-
-  This is a speed/space tradeoff.
-
-  The value of rx_copybreak is used to decide when to make the copy. 
-  If the packet size is less than rx_copybreak, the packet is copied. 
-  The default value for rx_copybreak is 200 bytes.
-
-max_interrupt_work=N
-
-  The driver's interrupt service routine can handle many receive and
-  transmit packets in a single invocation.  It does this in a loop. 
-  The value of max_interrupt_work governs how many times the interrupt
-  service routine will loop.  The default value is 32 loops.  If this
-  is exceeded the interrupt service routine gives up and generates a
-  warning message "eth0: Too much work in interrupt".
-
-hw_checksums=N1,N2,N3,...
-
-  Recent 3com NICs are able to generate IPv4, TCP and UDP checksums
-  in hardware.  Linux has used the Rx checksumming for a long time. 
-  The "zero copy" patch which is planned for the 2.4 kernel series
-  allows you to make use of the NIC's DMA scatter/gather and transmit
-  checksumming as well.
-
-  The driver is set up so that, when the zerocopy patch is applied,
-  all Tornado and Cyclone devices will use S/G and Tx checksums.
-
-  This module parameter has been provided so you can override this
-  decision.  If you think that Tx checksums are causing a problem, you
-  may disable the feature with `hw_checksums=0'.
-
-  If you think your NIC should be performing Tx checksumming and the
-  driver isn't enabling it, you can force the use of hardware Tx
-  checksumming with `hw_checksums=1'.
-
-  The driver drops a message in the logfiles to indicate whether or
-  not it is using hardware scatter/gather and hardware Tx checksums.
-
-  Scatter/gather and hardware checksums provide considerable
-  performance improvement for the sendfile() system call, but a small
-  decrease in throughput for send().  There is no effect upon receive
-  efficiency.
-
-compaq_ioaddr=N
-compaq_irq=N
-compaq_device_id=N
-
-  "Variables to work-around the Compaq PCI BIOS32 problem"....
-
-watchdog=N
-
-  Sets the time duration (in milliseconds) after which the kernel
-  decides that the transmitter has become stuck and needs to be reset. 
-  This is mainly for debugging purposes, although it may be advantageous
-  to increase this value on LANs which have very high collision rates.
-  The default value is 5000 (5.0 seconds).
-
-enable_wol=N1,N2,N3,...
-
-  Enable Wake-on-LAN support for the relevant interface.  Donald
-  Becker's `ether-wake' application may be used to wake suspended
-  machines.
-
-  Also enables the NIC's power management support.
-
-global_enable_wol=N
-
-  Sets enable_wol mode for all 3c59x NICs in the machine.  Entries in
-  the `enable_wol' array above will override any setting of this.
-
-Media selection
----------------
-
-A number of the older NICs such as the 3c590 and 3c900 series have
-10base2 and AUI interfaces.
-
-Prior to January, 2001 this driver would autoeselect the 10base2 or AUI
-port if it didn't detect activity on the 10baseT port.  It would then
-get stuck on the 10base2 port and a driver reload was necessary to
-switch back to 10baseT.  This behaviour could not be prevented with a
-module option override.
-
-Later (current) versions of the driver _do_ support locking of the
-media type.  So if you load the driver module with
-
-	modprobe 3c59x options=0
-
-it will permanently select the 10baseT port.  Automatic selection of
-other media types does not occur.
-
-
-Transmit error, Tx status register 82
--------------------------------------
-
-This is a common error which is almost always caused by another host on
-the same network being in full-duplex mode, while this host is in
-half-duplex mode.  You need to find that other host and make it run in
-half-duplex mode or fix this host to run in full-duplex mode.
-
-As a last resort, you can force the 3c59x driver into full-duplex mode
-with
-
-	options 3c59x full_duplex=1
-
-but this has to be viewed as a workaround for broken network gear and
-should only really be used for equipment which cannot autonegotiate.
-
-
-Additional resources
---------------------
-
-Details of the device driver implementation are at the top of the source file.
-
-Additional documentation is available at Don Becker's Linux Drivers site:
-
-     http://www.scyld.com/vortex.html
-
-Donald Becker's driver development site:
-
-     http://www.scyld.com/network.html
-
-Donald's vortex-diag program is useful for inspecting the NIC's state:
-
-     http://www.scyld.com/ethercard_diag.html
-
-Donald's mii-diag program may be used for inspecting and manipulating
-the NIC's Media Independent Interface subsystem:
-
-     http://www.scyld.com/ethercard_diag.html#mii-diag
-
-Donald's wake-on-LAN page:
-
-     http://www.scyld.com/wakeonlan.html
-
-3Com's DOS-based application for setting up the NICs EEPROMs:
-
-	ftp://ftp.3com.com/pub/nic/3c90x/3c90xx2.exe
-
-
-Autonegotiation notes
----------------------
-
-  The driver uses a one-minute heartbeat for adapting to changes in
-  the external LAN environment if link is up and 5 seconds if link is down.
-  This means that when, for example, a machine is unplugged from a hubbed
-  10baseT LAN plugged into a  switched 100baseT LAN, the throughput
-  will be quite dreadful for up to sixty seconds.  Be patient.
-
-  Cisco interoperability note from Walter Wong <wcw+@CMU.EDU>:
-
-  On a side note, adding HAS_NWAY seems to share a problem with the
-  Cisco 6509 switch.  Specifically, you need to change the spanning
-  tree parameter for the port the machine is plugged into to 'portfast'
-  mode.  Otherwise, the negotiation fails.  This has been an issue
-  we've noticed for a while but haven't had the time to track down.
-
-  Cisco switches    (Jeff Busch <jbusch@deja.com>)
-
-    My "standard config" for ports to which PC's/servers connect directly:
-
-        interface FastEthernet0/N
-        description machinename
-        load-interval 30
-        spanning-tree portfast
-
-    If autonegotiation is a problem, you may need to specify "speed
-    100" and "duplex full" as well (or "speed 10" and "duplex half").
-
-    WARNING: DO NOT hook up hubs/switches/bridges to these
-    specially-configured ports! The switch will become very confused.
-
-
-Reporting and diagnosing problems
----------------------------------
-
-Maintainers find that accurate and complete problem reports are
-invaluable in resolving driver problems.  We are frequently not able to
-reproduce problems and must rely on your patience and efforts to get to
-the bottom of the problem.
-
-If you believe you have a driver problem here are some of the
-steps you should take:
-
-- Is it really a driver problem?
-
-   Eliminate some variables: try different cards, different
-   computers, different cables, different ports on the switch/hub,
-   different versions of the kernel or of the driver, etc.
-
-- OK, it's a driver problem.
-
-   You need to generate a report.  Typically this is an email to the
-   maintainer and/or netdev@vger.kernel.org.  The maintainer's
-   email address will be in the driver source or in the MAINTAINERS file.
-
-- The contents of your report will vary a lot depending upon the
-  problem.  If it's a kernel crash then you should refer to the
-  admin-guide/reporting-bugs.rst file.
-
-  But for most problems it is useful to provide the following:
-
-   o Kernel version, driver version
-
-   o A copy of the banner message which the driver generates when
-     it is initialised.  For example:
-
-     eth0: 3Com PCI 3c905C Tornado at 0xa400,  00:50:da:6a:88:f0, IRQ 19
-     8K byte-wide RAM 5:3 Rx:Tx split, autoselect/Autonegotiate interface.
-     MII transceiver found at address 24, status 782d.
-     Enabling bus-master transmits and whole-frame receives.
-
-     NOTE: You must provide the `debug=2' modprobe option to generate
-     a full detection message.  Please do this:
-
-	modprobe 3c59x debug=2
-
-   o If it is a PCI device, the relevant output from 'lspci -vx', eg:
-
-     00:09.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74)
-             Subsystem: 3Com Corporation: Unknown device 9200
-             Flags: bus master, medium devsel, latency 32, IRQ 19
-             I/O ports at a400 [size=128]
-             Memory at db000000 (32-bit, non-prefetchable) [size=128]
-             Expansion ROM at <unassigned> [disabled] [size=128K]
-             Capabilities: [dc] Power Management version 2
-     00: b7 10 00 92 07 00 10 02 74 00 00 02 08 20 00 00
-     10: 01 a4 00 00 00 00 00 db 00 00 00 00 00 00 00 00
-     20: 00 00 00 00 00 00 00 00 00 00 00 00 b7 10 00 10
-     30: 00 00 00 00 dc 00 00 00 00 00 00 00 05 01 0a 0a
-
-   o A description of the environment: 10baseT? 100baseT?
-     full/half duplex? switched or hubbed?
-
-   o Any additional module parameters which you may be providing to the driver.
-
-   o Any kernel logs which are produced.  The more the merrier. 
-     If this is a large file and you are sending your report to a
-     mailing list, mention that you have the logfile, but don't send
-     it.  If you're reporting direct to the maintainer then just send
-     it.
-
-     To ensure that all kernel logs are available, add the
-     following line to /etc/syslog.conf:
-
-         kern.* /var/log/messages
-
-     Then restart syslogd with:
-
-         /etc/rc.d/init.d/syslog restart
-
-     (The above may vary, depending upon which Linux distribution you use).
-
-    o If your problem is reproducible then that's great.  Try the
-      following:
-
-      1) Increase the debug level.  Usually this is done via:
-
-         a) modprobe driver debug=7
-         b) In /etc/modprobe.d/driver.conf:
-            options driver debug=7
-
-      2) Recreate the problem with the higher debug level,
-         send all logs to the maintainer.
-
-      3) Download you card's diagnostic tool from Donald
-         Becker's website <http://www.scyld.com/ethercard_diag.html>.
-         Download mii-diag.c as well.  Build these.
-
-         a) Run 'vortex-diag -aaee' and 'mii-diag -v' when the card is
-            working correctly.  Save the output.
-
-         b) Run the above commands when the card is malfunctioning.  Send
-            both sets of output.
-
-Finally, please be patient and be prepared to do some work.  You may
-end up working on this problem for a week or more as the maintainer
-asks more questions, asks for more tests, asks for patches to be
-applied, etc.  At the end of it all, the problem may even remain
-unresolved.
diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst
index 402a9188f446..aaac502b81ea 100644
--- a/Documentation/networking/device_drivers/index.rst
+++ b/Documentation/networking/device_drivers/index.rst
@@ -28,6 +28,7 @@ Contents:
    pensando/ionic
    stmicro/stmmac
    3com/3c509
+   3com/vortex
 
 .. only::  subproject and html
 
diff --git a/MAINTAINERS b/MAINTAINERS
index bee65ebdc67e..eaea5f1994c9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -147,7 +147,7 @@ Maintainers List
 M:	Steffen Klassert <klassert@kernel.org>
 L:	netdev@vger.kernel.org
 S:	Odd Fixes
-F:	Documentation/networking/device_drivers/3com/vortex.txt
+F:	Documentation/networking/device_drivers/3com/vortex.rst
 F:	drivers/net/ethernet/3com/3c59x.c
 
 3CR990 NETWORK DRIVER
diff --git a/drivers/net/ethernet/3com/3c59x.c b/drivers/net/ethernet/3com/3c59x.c
index a2b7f7ab8170..5984b7033999 100644
--- a/drivers/net/ethernet/3com/3c59x.c
+++ b/drivers/net/ethernet/3com/3c59x.c
@@ -1149,7 +1149,7 @@ static int vortex_probe1(struct device *gendev, void __iomem *ioaddr, int irq,
 
 	print_info = (vortex_debug > 1);
 	if (print_info)
-		pr_info("See Documentation/networking/device_drivers/3com/vortex.txt\n");
+		pr_info("See Documentation/networking/device_drivers/3com/vortex.rst\n");
 
 	pr_info("%s: 3Com %s %s at %p.\n",
 	       print_name,
@@ -1954,7 +1954,7 @@ vortex_error(struct net_device *dev, int status)
 				   dev->name, tx_status);
 			if (tx_status == 0x82) {
 				pr_err("Probably a duplex mismatch.  See "
-						"Documentation/networking/device_drivers/3com/vortex.txt\n");
+						"Documentation/networking/device_drivers/3com/vortex.rst\n");
 			}
 			dump_tx_ring(dev);
 		}
diff --git a/drivers/net/ethernet/3com/Kconfig b/drivers/net/ethernet/3com/Kconfig
index 3a6fc99c6f32..7cc259893cb9 100644
--- a/drivers/net/ethernet/3com/Kconfig
+++ b/drivers/net/ethernet/3com/Kconfig
@@ -76,7 +76,7 @@ config VORTEX
 	  "Hurricane" (3c555/3cSOHO)                           PCI
 
 	  If you have such a card, say Y here.  More specific information is in
-	  <file:Documentation/networking/device_drivers/3com/vortex.txt> and
+	  <file:Documentation/networking/device_drivers/3com/vortex.rst> and
 	  in the comments at the beginning of
 	  <file:drivers/net/ethernet/3com/3c59x.c>.
 
-- 
cgit v1.2.3


From 8d299c7e912bd8ebb88b9ac2b8e336c9878783aa Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Fri, 1 May 2020 16:44:36 +0200
Subject: docs: networking: device drivers: convert amazon/ena.txt to ReST

- add SPDX header;
- adjust titles and chapters, adding proper markups;
- mark code blocks and literals as such;
- mark tables as such;
- adjust indentation, whitespaces and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 .../networking/device_drivers/amazon/ena.rst       | 344 +++++++++++++++++++++
 .../networking/device_drivers/amazon/ena.txt       | 308 ------------------
 Documentation/networking/device_drivers/index.rst  |   1 +
 MAINTAINERS                                        |   2 +-
 4 files changed, 346 insertions(+), 309 deletions(-)
 create mode 100644 Documentation/networking/device_drivers/amazon/ena.rst
 delete mode 100644 Documentation/networking/device_drivers/amazon/ena.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/device_drivers/amazon/ena.rst b/Documentation/networking/device_drivers/amazon/ena.rst
new file mode 100644
index 000000000000..11af6388ea87
--- /dev/null
+++ b/Documentation/networking/device_drivers/amazon/ena.rst
@@ -0,0 +1,344 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================================================
+Linux kernel driver for Elastic Network Adapter (ENA) family
+============================================================
+
+Overview
+========
+
+ENA is a networking interface designed to make good use of modern CPU
+features and system architectures.
+
+The ENA device exposes a lightweight management interface with a
+minimal set of memory-mapped registers and an extendable command set
+through an Admin Queue.
+
+The driver supports a range of ENA devices, is link-speed independent
+(i.e., the same driver is used for 10GbE, 25GbE, 40GbE, etc.), and has
+a negotiated and extendable feature set.
+
+Some ENA devices support SR-IOV. This driver is used for both the
+SR-IOV Physical Function (PF) and Virtual Function (VF) devices.
+
+ENA devices enable high speed and low overhead network traffic
+processing by providing multiple Tx/Rx queue pairs (the maximum number
+is advertised by the device via the Admin Queue), a dedicated MSI-X
+interrupt vector per Tx/Rx queue pair, adaptive interrupt moderation,
+and CPU cacheline optimized data placement.
+
+The ENA driver supports industry standard TCP/IP offload features such
+as checksum offload and TCP transmit segmentation offload (TSO).
+Receive-side scaling (RSS) is supported for multi-core scaling.
+
+The ENA driver and its corresponding devices implement health
+monitoring mechanisms such as watchdog, enabling the device and driver
+to recover in a manner transparent to the application, as well as
+debug logs.
+
+Some of the ENA devices support a working mode called Low-latency
+Queue (LLQ), which saves several more microseconds of latency.
+
+Supported PCI vendor ID/device IDs
+==================================
+
+=========   =======================
+1d0f:0ec2   ENA PF
+1d0f:1ec2   ENA PF with LLQ support
+1d0f:ec20   ENA VF
+1d0f:ec21   ENA VF with LLQ support
+=========   =======================
+
+ENA Source Code Directory Structure
+===================================
+
+=================   ======================================================
+ena_com.[ch]        Management communication layer. This layer is
+		    responsible for handling all the management
+		    (admin) communication between the device and the
+		    driver.
+ena_eth_com.[ch]    Tx/Rx data path.
+ena_admin_defs.h    Definition of ENA management interface.
+ena_eth_io_defs.h   Definition of ENA data path interface.
+ena_common_defs.h   Common definitions for ena_com layer.
+ena_regs_defs.h     Definition of ENA PCI memory-mapped (MMIO) registers.
+ena_netdev.[ch]     Main Linux kernel driver.
+ena_sysfs.[ch]      Sysfs files.
+ena_ethtool.c       ethtool callbacks.
+ena_pci_id_tbl.h    Supported device IDs.
+=================   ======================================================
+
+Management Interface
+====================
+
+ENA management interface is exposed by means of:
+
+- PCIe Configuration Space
+- Device Registers
+- Admin Queue (AQ) and Admin Completion Queue (ACQ)
+- Asynchronous Event Notification Queue (AENQ)
+
+ENA device MMIO Registers are accessed only during driver
+initialization and are not involved in further normal device
+operation.
+
+AQ is used for submitting management commands, and the
+results/responses are reported asynchronously through ACQ.
+
+ENA introduces a small set of management commands with room for
+vendor-specific extensions. Most of the management operations are
+framed in a generic Get/Set feature command.
+
+The following admin queue commands are supported:
+
+- Create I/O submission queue
+- Create I/O completion queue
+- Destroy I/O submission queue
+- Destroy I/O completion queue
+- Get feature
+- Set feature
+- Configure AENQ
+- Get statistics
+
+Refer to ena_admin_defs.h for the list of supported Get/Set Feature
+properties.
+
+The Asynchronous Event Notification Queue (AENQ) is a uni-directional
+queue used by the ENA device to send to the driver events that cannot
+be reported using ACQ. AENQ events are subdivided into groups. Each
+group may have multiple syndromes, as shown below.
+
+The events are:
+
+	====================	===============
+	Group			Syndrome
+	====================	===============
+	Link state change	**X**
+	Fatal error		**X**
+	Notification		Suspend traffic
+	Notification		Resume traffic
+	Keep-Alive		**X**
+	====================	===============
+
+ACQ and AENQ share the same MSI-X vector.
+
+Keep-Alive is a special mechanism that allows monitoring of the
+device's health. The driver maintains a watchdog (WD) handler which,
+if fired, logs the current state and statistics, then resets and
+restarts the ENA device and driver. A Keep-Alive event is delivered by
+the device every second. The driver re-arms the WD upon reception of a
+Keep-Alive event. A missed Keep-Alive event causes the WD handler to
+fire.
+
+Data Path Interface
+===================
+I/O operations are based on Tx and Rx Submission Queues (Tx SQ and Rx
+SQ, respectively). Each SQ has a completion queue (CQ) associated
+with it.
+
+The SQs and CQs are implemented as descriptor rings in contiguous
+physical memory.
+
+The ENA driver supports two Queue Operation modes for Tx SQs:
+
+- Regular mode
+
+  * In this mode the Tx SQs reside in the host's memory. The ENA
+    device fetches the ENA Tx descriptors and packet data from host
+    memory.
+
+- Low Latency Queue (LLQ) mode or "push-mode".
+
+  * In this mode the driver pushes the transmit descriptors and the
+    first 128 bytes of the packet directly to the ENA device memory
+    space. The rest of the packet payload is fetched by the
+    device. For this operation mode, the driver uses a dedicated PCI
+    device memory BAR, which is mapped with write-combine capability.
+
+The Rx SQs support only the regular mode.
+
+Note: Not all ENA devices support LLQ, and this feature is negotiated
+with the device upon initialization. If the ENA device does not
+support LLQ mode, the driver falls back to the regular mode.
+
+The driver supports multi-queue for both Tx and Rx. This has various
+benefits:
+
+- Reduced CPU/thread/process contention on a given Ethernet interface.
+- Cache miss rate on completion is reduced, particularly for data
+  cache lines that hold the sk_buff structures.
+- Increased process-level parallelism when handling received packets.
+- Increased data cache hit rate, by steering kernel processing of
+  packets to the CPU where the application thread consuming the
+  packet is running.
+- In-hardware interrupt redirection.
+
+Interrupt Modes
+===============
+The driver assigns a single MSI-X vector per queue pair (for both Tx
+and Rx directions). The driver assigns an additional dedicated MSI-X vector
+for management (for ACQ and AENQ).
+
+Management interrupt registration is performed when the Linux kernel
+probes the adapter, and it is de-registered when the adapter is
+removed. I/O queue interrupt registration is performed when the Linux
+interface of the adapter is opened, and it is de-registered when the
+interface is closed.
+
+The management interrupt is named::
+
+   ena-mgmnt@pci:<PCI domain:bus:slot.function>
+
+and for each queue pair, an interrupt is named::
+
+   <interface name>-Tx-Rx-<queue index>
+
+The ENA device operates in auto-mask and auto-clear interrupt
+modes. That is, once MSI-X is delivered to the host, its Cause bit is
+automatically cleared and the interrupt is masked. The interrupt is
+unmasked by the driver after NAPI processing is complete.
+
+Interrupt Moderation
+====================
+The ENA driver and device can operate in conventional or adaptive interrupt
+moderation mode.
+
+In conventional mode, the driver instructs the device to postpone
+interrupt posting according to a static interrupt delay value. The
+interrupt delay value can be configured through ethtool(8). The driver
+supports the following ethtool parameters: tx-usecs, rx-usecs.
+
+In adaptive interrupt moderation mode, the interrupt delay value is
+updated by the driver dynamically and adjusted every NAPI cycle
+according to the nature of the traffic.
+
+By default, the ENA driver applies adaptive coalescing on Rx traffic and
+conventional coalescing on Tx traffic.
+
+Adaptive coalescing can be switched on/off through the ethtool(8)
+adaptive-rx on|off parameter.
+
+The driver chooses the interrupt delay value according to the number of
+bytes and packets received between interrupt unmasking and interrupt
+posting. The driver uses an interrupt delay table that subdivides the
+range of received bytes/packets into 5 levels and assigns an interrupt
+delay value to each level.
+
+The user can enable/disable adaptive moderation, modify the interrupt
+delay table and restore its default values through sysfs.
+
+RX copybreak
+============
+The rx_copybreak is initialized by default to ENA_DEFAULT_RX_COPYBREAK
+and can be configured by the ETHTOOL_STUNABLE command of the
+SIOCETHTOOL ioctl.
+
+SKB
+===
+The driver allocates an SKB for each frame received on the Rx path, in
+NAPI context. The allocation method depends on the size of the packet.
+If the frame length is larger than rx_copybreak, napi_get_frags()
+is used; otherwise netdev_alloc_skb_ip_align() is used, the buffer
+content is copied (by CPU) to the SKB, and the buffer is recycled.
+
+Statistics
+==========
+The user can obtain ENA device and driver statistics using ethtool.
+The driver can collect regular or extended statistics (including
+per-queue stats) from the device.
+
+In addition, the driver logs the stats to syslog upon device reset.
+
+MTU
+===
+The driver supports an arbitrarily large MTU with a maximum that is
+negotiated with the device. The driver configures MTU using the
+SetFeature command (ENA_ADMIN_MTU property). The user can change MTU
+via ip(8) and similar tools.
+
+Stateless Offloads
+==================
+The ENA driver supports:
+
+- TSO over IPv4/IPv6
+- TSO with ECN
+- IPv4 header checksum offload
+- TCP/UDP over IPv4/IPv6 checksum offloads
+
+RSS
+===
+- The ENA device supports RSS that allows flexible Rx traffic
+  steering.
+- Toeplitz and CRC32 hash functions are supported.
+- Different combinations of L2/L3/L4 fields can be configured as
+  inputs for hash functions.
+- The driver configures RSS settings using the AQ SetFeature command
+  (ENA_ADMIN_RSS_HASH_FUNCTION, ENA_ADMIN_RSS_HASH_INPUT and
+  ENA_ADMIN_RSS_REDIRECTION_TABLE_CONFIG properties).
+- If the NETIF_F_RXHASH flag is set, the 32-bit result of the hash
+  function delivered in the Rx CQ descriptor is set in the received
+  SKB.
+- The user can provide a hash key and hash function, and configure the
+  indirection table through ethtool(8).
+
+DATA PATH
+=========
+Tx
+--
+
+ena_start_xmit() is called by the stack. This function does the following:
+
+- Maps data buffers (skb->data and frags).
+- Populates ena_buf for the push buffer (if the driver and device are
+  in push mode).
+- Prepares ENA bufs for the remaining frags.
+- Allocates a new request ID from the empty req_id ring. The request
+  ID is the index of the packet in the Tx info. This is used for
+  out-of-order TX completions.
+- Adds the packet to the proper place in the Tx ring.
+- Calls ena_com_prepare_tx(), an ENA communication layer function that
+  converts the ena_bufs to ENA descriptors (and adds meta ENA descriptors
+  as needed).
+
+  * This function also copies the ENA descriptors and the push buffer
+    to the device memory space (if in push mode).
+
+- Writes doorbell to the ENA device.
+- When the ENA device finishes sending the packet, a completion
+  interrupt is raised.
+- The interrupt handler schedules NAPI.
+- The ena_clean_tx_irq() function is called. This function handles the
+  completion descriptors generated by the ENA, with a single
+  completion descriptor per completed packet.
+
+  * req_id is retrieved from the completion descriptor. The tx_info of
+    the packet is retrieved via the req_id. The data buffers are
+    unmapped and req_id is returned to the empty req_id ring.
+  * The function stops when the completion descriptors are completed or
+    the budget is reached.
+
+Rx
+--
+
+- When a packet is received from the ENA device, an interrupt is raised.
+- The interrupt handler schedules NAPI.
+- The ena_clean_rx_irq() function is called. This function calls
+  ena_com_rx_pkt(), an ENA communication layer function, which returns
+  the number of descriptors used for a new unhandled packet, and zero if
+  no new packet is found.
+- Then it calls the ena_rx_skb() function.
+- ena_rx_skb() checks packet length:
+
+  * If the packet is small (len < rx_copybreak), the driver allocates
+    an SKB for the new packet, and copies the packet payload into the
+    SKB data buffer.
+
+    - In this way the original data buffer is not passed to the stack
+      and is reused for future Rx packets.
+
+  * Otherwise the function unmaps the Rx buffer, then allocates the
+    new SKB structure and hooks the Rx buffer to the SKB frags.
+
+- The new SKB is updated with the necessary information (protocol,
+  hardware checksum verification result, etc.), and then passed to the
+  network stack, using the NAPI interface function napi_gro_receive().
diff --git a/Documentation/networking/device_drivers/amazon/ena.txt b/Documentation/networking/device_drivers/amazon/ena.txt
deleted file mode 100644
index 1bb55c7b604c..000000000000
--- a/Documentation/networking/device_drivers/amazon/ena.txt
+++ /dev/null
@@ -1,308 +0,0 @@
-Linux kernel driver for Elastic Network Adapter (ENA) family:
-=============================================================
-
-Overview:
-=========
-ENA is a networking interface designed to make good use of modern CPU
-features and system architectures.
-
-The ENA device exposes a lightweight management interface with a
-minimal set of memory mapped registers and extendable command set
-through an Admin Queue.
-
-The driver supports a range of ENA devices, is link-speed independent
-(i.e., the same driver is used for 10GbE, 25GbE, 40GbE, etc.), and has
-a negotiated and extendable feature set.
-
-Some ENA devices support SR-IOV. This driver is used for both the
-SR-IOV Physical Function (PF) and Virtual Function (VF) devices.
-
-ENA devices enable high speed and low overhead network traffic
-processing by providing multiple Tx/Rx queue pairs (the maximum number
-is advertised by the device via the Admin Queue), a dedicated MSI-X
-interrupt vector per Tx/Rx queue pair, adaptive interrupt moderation,
-and CPU cacheline optimized data placement.
-
-The ENA driver supports industry standard TCP/IP offload features such
-as checksum offload and TCP transmit segmentation offload (TSO).
-Receive-side scaling (RSS) is supported for multi-core scaling.
-
-The ENA driver and its corresponding devices implement health
-monitoring mechanisms such as watchdog, enabling the device and driver
-to recover in a manner transparent to the application, as well as
-debug logs.
-
-Some of the ENA devices support a working mode called Low-latency
-Queue (LLQ), which saves several more microseconds.
-
-Supported PCI vendor ID/device IDs:
-===================================
-1d0f:0ec2 - ENA PF
-1d0f:1ec2 - ENA PF with LLQ support
-1d0f:ec20 - ENA VF
-1d0f:ec21 - ENA VF with LLQ support
-
-ENA Source Code Directory Structure:
-====================================
-ena_com.[ch]      - Management communication layer. This layer is
-                    responsible for the handling all the management
-                    (admin) communication between the device and the
-                    driver.
-ena_eth_com.[ch]  - Tx/Rx data path.
-ena_admin_defs.h  - Definition of ENA management interface.
-ena_eth_io_defs.h - Definition of ENA data path interface.
-ena_common_defs.h - Common definitions for ena_com layer.
-ena_regs_defs.h   - Definition of ENA PCI memory-mapped (MMIO) registers.
-ena_netdev.[ch]   - Main Linux kernel driver.
-ena_syfsfs.[ch]   - Sysfs files.
-ena_ethtool.c     - ethtool callbacks.
-ena_pci_id_tbl.h  - Supported device IDs.
-
-Management Interface:
-=====================
-ENA management interface is exposed by means of:
-- PCIe Configuration Space
-- Device Registers
-- Admin Queue (AQ) and Admin Completion Queue (ACQ)
-- Asynchronous Event Notification Queue (AENQ)
-
-ENA device MMIO Registers are accessed only during driver
-initialization and are not involved in further normal device
-operation.
-
-AQ is used for submitting management commands, and the
-results/responses are reported asynchronously through ACQ.
-
-ENA introduces a small set of management commands with room for
-vendor-specific extensions. Most of the management operations are
-framed in a generic Get/Set feature command.
-
-The following admin queue commands are supported:
-- Create I/O submission queue
-- Create I/O completion queue
-- Destroy I/O submission queue
-- Destroy I/O completion queue
-- Get feature
-- Set feature
-- Configure AENQ
-- Get statistics
-
-Refer to ena_admin_defs.h for the list of supported Get/Set Feature
-properties.
-
-The Asynchronous Event Notification Queue (AENQ) is a uni-directional
-queue used by the ENA device to send to the driver events that cannot
-be reported using ACQ. AENQ events are subdivided into groups. Each
-group may have multiple syndromes, as shown below
-
-The events are:
-	Group			Syndrome
-	Link state change	- X -
-	Fatal error		- X -
-	Notification		Suspend traffic
-	Notification		Resume traffic
-	Keep-Alive		- X -
-
-ACQ and AENQ share the same MSI-X vector.
-
-Keep-Alive is a special mechanism that allows monitoring of the
-device's health. The driver maintains a watchdog (WD) handler which,
-if fired, logs the current state and statistics then resets and
-restarts the ENA device and driver. A Keep-Alive event is delivered by
-the device every second. The driver re-arms the WD upon reception of a
-Keep-Alive event. A missed Keep-Alive event causes the WD handler to
-fire.
-
-Data Path Interface:
-====================
-I/O operations are based on Tx and Rx Submission Queues (Tx SQ and Rx
-SQ correspondingly). Each SQ has a completion queue (CQ) associated
-with it.
-
-The SQs and CQs are implemented as descriptor rings in contiguous
-physical memory.
-
-The ENA driver supports two Queue Operation modes for Tx SQs:
-- Regular mode
-  * In this mode the Tx SQs reside in the host's memory. The ENA
-    device fetches the ENA Tx descriptors and packet data from host
-    memory.
-- Low Latency Queue (LLQ) mode or "push-mode".
-  * In this mode the driver pushes the transmit descriptors and the
-    first 128 bytes of the packet directly to the ENA device memory
-    space. The rest of the packet payload is fetched by the
-    device. For this operation mode, the driver uses a dedicated PCI
-    device memory BAR, which is mapped with write-combine capability.
-
-The Rx SQs support only the regular mode.
-
-Note: Not all ENA devices support LLQ, and this feature is negotiated
-      with the device upon initialization. If the ENA device does not
-      support LLQ mode, the driver falls back to the regular mode.
-
-The driver supports multi-queue for both Tx and Rx. This has various
-benefits:
-- Reduced CPU/thread/process contention on a given Ethernet interface.
-- Cache miss rate on completion is reduced, particularly for data
-  cache lines that hold the sk_buff structures.
-- Increased process-level parallelism when handling received packets.
-- Increased data cache hit rate, by steering kernel processing of
-  packets to the CPU, where the application thread consuming the
-  packet is running.
-- In hardware interrupt re-direction.
-
-Interrupt Modes:
-================
-The driver assigns a single MSI-X vector per queue pair (for both Tx
-and Rx directions). The driver assigns an additional dedicated MSI-X vector
-for management (for ACQ and AENQ).
-
-Management interrupt registration is performed when the Linux kernel
-probes the adapter, and it is de-registered when the adapter is
-removed. I/O queue interrupt registration is performed when the Linux
-interface of the adapter is opened, and it is de-registered when the
-interface is closed.
-
-The management interrupt is named:
-   ena-mgmnt@pci:<PCI domain:bus:slot.function>
-and for each queue pair, an interrupt is named:
-   <interface name>-Tx-Rx-<queue index>
-
-The ENA device operates in auto-mask and auto-clear interrupt
-modes. That is, once MSI-X is delivered to the host, its Cause bit is
-automatically cleared and the interrupt is masked. The interrupt is
-unmasked by the driver after NAPI processing is complete.
-
-Interrupt Moderation:
-=====================
-ENA driver and device can operate in conventional or adaptive interrupt
-moderation mode.
-
-In conventional mode the driver instructs device to postpone interrupt
-posting according to static interrupt delay value. The interrupt delay
-value can be configured through ethtool(8). The following ethtool
-parameters are supported by the driver: tx-usecs, rx-usecs
-
-In adaptive interrupt moderation mode the interrupt delay value is
-updated by the driver dynamically and adjusted every NAPI cycle
-according to the traffic nature.
-
-By default ENA driver applies adaptive coalescing on Rx traffic and
-conventional coalescing on Tx traffic.
-
-Adaptive coalescing can be switched on/off through ethtool(8)
-adaptive_rx on|off parameter.
-
-The driver chooses interrupt delay value according to the number of
-bytes and packets received between interrupt unmasking and interrupt
-posting. The driver uses interrupt delay table that subdivides the
-range of received bytes/packets into 5 levels and assigns interrupt
-delay value to each level.
-
-The user can enable/disable adaptive moderation, modify the interrupt
-delay table and restore its default values through sysfs.
-
-RX copybreak:
-=============
-The rx_copybreak is initialized by default to ENA_DEFAULT_RX_COPYBREAK
-and can be configured by the ETHTOOL_STUNABLE command of the
-SIOCETHTOOL ioctl.
-
-SKB:
-====
-The driver-allocated SKB for frames received from Rx handling using
-NAPI context. The allocation method depends on the size of the packet.
-If the frame length is larger than rx_copybreak, napi_get_frags()
-is used, otherwise netdev_alloc_skb_ip_align() is used, the buffer
-content is copied (by CPU) to the SKB, and the buffer is recycled.
-
-Statistics:
-===========
-The user can obtain ENA device and driver statistics using ethtool.
-The driver can collect regular or extended statistics (including
-per-queue stats) from the device.
-
-In addition the driver logs the stats to syslog upon device reset.
-
-MTU:
-====
-The driver supports an arbitrarily large MTU with a maximum that is
-negotiated with the device. The driver configures MTU using the
-SetFeature command (ENA_ADMIN_MTU property). The user can change MTU
-via ip(8) and similar legacy tools.
-
-Stateless Offloads:
-===================
-The ENA driver supports:
-- TSO over IPv4/IPv6
-- TSO with ECN
-- IPv4 header checksum offload
-- TCP/UDP over IPv4/IPv6 checksum offloads
-
-RSS:
-====
-- The ENA device supports RSS that allows flexible Rx traffic
-  steering.
-- Toeplitz and CRC32 hash functions are supported.
-- Different combinations of L2/L3/L4 fields can be configured as
-  inputs for hash functions.
-- The driver configures RSS settings using the AQ SetFeature command
-  (ENA_ADMIN_RSS_HASH_FUNCTION, ENA_ADMIN_RSS_HASH_INPUT and
-  ENA_ADMIN_RSS_REDIRECTION_TABLE_CONFIG properties).
-- If the NETIF_F_RXHASH flag is set, the 32-bit result of the hash
-  function delivered in the Rx CQ descriptor is set in the received
-  SKB.
-- The user can provide a hash key, hash function, and configure the
-  indirection table through ethtool(8).
-
-DATA PATH:
-==========
-Tx:
----
-end_start_xmit() is called by the stack. This function does the following:
-- Maps data buffers (skb->data and frags).
-- Populates ena_buf for the push buffer (if the driver and device are
-  in push mode.)
-- Prepares ENA bufs for the remaining frags.
-- Allocates a new request ID from the empty req_id ring. The request
-  ID is the index of the packet in the Tx info. This is used for
-  out-of-order TX completions.
-- Adds the packet to the proper place in the Tx ring.
-- Calls ena_com_prepare_tx(), an ENA communication layer that converts
-  the ena_bufs to ENA descriptors (and adds meta ENA descriptors as
-  needed.)
-  * This function also copies the ENA descriptors and the push buffer
-    to the Device memory space (if in push mode.)
-- Writes doorbell to the ENA device.
-- When the ENA device finishes sending the packet, a completion
-  interrupt is raised.
-- The interrupt handler schedules NAPI.
-- The ena_clean_tx_irq() function is called. This function handles the
-  completion descriptors generated by the ENA, with a single
-  completion descriptor per completed packet.
-  * req_id is retrieved from the completion descriptor. The tx_info of
-    the packet is retrieved via the req_id. The data buffers are
-    unmapped and req_id is returned to the empty req_id ring.
-  * The function stops when the completion descriptors are completed or
-    the budget is reached.
-
-Rx:
----
-- When a packet is received from the ENA device.
-- The interrupt handler schedules NAPI.
-- The ena_clean_rx_irq() function is called. This function calls
-  ena_rx_pkt(), an ENA communication layer function, which returns the
-  number of descriptors used for a new unhandled packet, and zero if
-  no new packet is found.
-- Then it calls the ena_clean_rx_irq() function.
-- ena_eth_rx_skb() checks packet length:
-  * If the packet is small (len < rx_copybreak), the driver allocates
-    a SKB for the new packet, and copies the packet payload into the
-    SKB data buffer.
-    - In this way the original data buffer is not passed to the stack
-      and is reused for future Rx packets.
-  * Otherwise the function unmaps the Rx buffer, then allocates the
-    new SKB structure and hooks the Rx buffer to the SKB frags.
-- The new SKB is updated with the necessary information (protocol,
-  checksum hw verify result, etc.), and then passed to the network
-  stack, using the NAPI interface function napi_gro_receive().
diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst
index aaac502b81ea..019a0d2efe67 100644
--- a/Documentation/networking/device_drivers/index.rst
+++ b/Documentation/networking/device_drivers/index.rst
@@ -29,6 +29,7 @@ Contents:
    stmicro/stmmac
    3com/3c509
    3com/vortex
+   amazon/ena
 
 .. only::  subproject and html
 
diff --git a/MAINTAINERS b/MAINTAINERS
index eaea5f1994c9..7b6c13cc832f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -815,7 +815,7 @@ R:	Saeed Bishara <saeedb@amazon.com>
 R:	Zorik Machulsky <zorik@amazon.com>
 L:	netdev@vger.kernel.org
 S:	Supported
-F:	Documentation/networking/device_drivers/amazon/ena.txt
+F:	Documentation/networking/device_drivers/amazon/ena.rst
 F:	drivers/net/ethernet/amazon/
 
 AMAZON RDMA EFA DRIVER
-- 
cgit v1.2.3


From c958119a487ec4578f50b352f45e965a30daa020 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Fri, 1 May 2020 16:44:37 +0200
Subject: docs: networking: device drivers: convert aquantia/atlantic.txt to
 ReST

- add SPDX header;
- use copyright symbol;
- adjust title and its markup;
- comment out text-only TOC from html/pdf output;
- mark code blocks and literals as such;
- adjust indentation, whitespaces and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 .../device_drivers/aquantia/atlantic.rst           | 556 +++++++++++++++++++++
 .../device_drivers/aquantia/atlantic.txt           | 479 ------------------
 Documentation/networking/device_drivers/index.rst  |   1 +
 MAINTAINERS                                        |   2 +-
 4 files changed, 558 insertions(+), 480 deletions(-)
 create mode 100644 Documentation/networking/device_drivers/aquantia/atlantic.rst
 delete mode 100644 Documentation/networking/device_drivers/aquantia/atlantic.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/device_drivers/aquantia/atlantic.rst b/Documentation/networking/device_drivers/aquantia/atlantic.rst
new file mode 100644
index 000000000000..595ddef1c8b3
--- /dev/null
+++ b/Documentation/networking/device_drivers/aquantia/atlantic.rst
@@ -0,0 +1,556 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+===============================
+Marvell(Aquantia) AQtion Driver
+===============================
+
+For the aQuantia Multi-Gigabit PCI Express Family of Ethernet Adapters
+
+.. Contents
+
+    - Identifying Your Adapter
+    - Configuration
+    - Supported ethtool options
+    - Command Line Parameters
+    - Config file parameters
+    - Support
+    - License
+
+Identifying Your Adapter
+========================
+
+The driver in this release is compatible with AQC-100, AQC-107, AQC-108
+based ethernet adapters.
+
+
+SFP+ Devices (for AQC-100 based adapters)
+-----------------------------------------
+
+This release was tested with passive Direct Attach Cables (DAC) and SFP+/LC
+optical transceivers.
+
+Configuration
+=============
+
+Viewing Link Messages
+---------------------
+  Link messages will not be displayed to the console if the distribution is
+  restricting system messages. In order to see network driver link messages on
+  your console, set the console log level to eight by entering the following::
+
+       dmesg -n 8
+
+  .. note::
+
+     This setting is not saved across reboots.
+
+Jumbo Frames
+------------
+  The driver supports Jumbo Frames for all adapters. Jumbo Frames support is
+  enabled by changing the MTU to a value larger than the default of 1500.
+  The maximum value for the MTU is 16000.  Use the `ip` command to
+  increase the MTU size.  For example::
+
+	ip link set mtu 16000 dev enp1s0
+
+ethtool
+-------
+  The driver utilizes the ethtool interface for driver configuration and
+  diagnostics, as well as displaying statistical information. The latest
+  ethtool version is required for this functionality.
+
+NAPI
+----
+  NAPI (Rx polling mode) is supported in the atlantic driver.
+
+Supported ethtool options
+=========================
+
+Viewing adapter settings
+------------------------
+
+ ::
+
+    ethtool <ethX>
+
+ Output example::
+
+  Settings for enp1s0:
+    Supported ports: [ TP ]
+    Supported link modes:   100baseT/Full
+			    1000baseT/Full
+			    10000baseT/Full
+			    2500baseT/Full
+			    5000baseT/Full
+    Supported pause frame use: Symmetric
+    Supports auto-negotiation: Yes
+    Supported FEC modes: Not reported
+    Advertised link modes:  100baseT/Full
+			    1000baseT/Full
+			    10000baseT/Full
+			    2500baseT/Full
+			    5000baseT/Full
+    Advertised pause frame use: Symmetric
+    Advertised auto-negotiation: Yes
+    Advertised FEC modes: Not reported
+    Speed: 10000Mb/s
+    Duplex: Full
+    Port: Twisted Pair
+    PHYAD: 0
+    Transceiver: internal
+    Auto-negotiation: on
+    MDI-X: Unknown
+    Supports Wake-on: g
+    Wake-on: d
+    Link detected: yes
+
+
+ .. note::
+
+    AQrate speeds (2.5/5 Gb/s) are displayed only with Linux kernels > 4.10.
+    You can still use these speeds on older kernels::
+
+	ethtool -s eth0 autoneg off speed 2500
+
+Viewing adapter information
+---------------------------
+
+ ::
+
+  ethtool -i <ethX>
+
+ Output example::
+
+  driver: atlantic
+  version: 5.2.0-050200rc5-generic-kern
+  firmware-version: 3.1.78
+  expansion-rom-version:
+  bus-info: 0000:01:00.0
+  supports-statistics: yes
+  supports-test: no
+  supports-eeprom-access: no
+  supports-register-dump: yes
+  supports-priv-flags: no
+
+
+Viewing Ethernet adapter statistics
+-----------------------------------
+
+ ::
+
+    ethtool -S <ethX>
+
+ Output example::
+
+  NIC statistics:
+     InPackets: 13238607
+     InUCast: 13293852
+     InMCast: 52
+     InBCast: 3
+     InErrors: 0
+     OutPackets: 23703019
+     OutUCast: 23704941
+     OutMCast: 67
+     OutBCast: 11
+     InUCastOctects: 213182760
+     OutUCastOctects: 22698443
+     InMCastOctects: 6600
+     OutMCastOctects: 8776
+     InBCastOctects: 192
+     OutBCastOctects: 704
+     InOctects: 2131839552
+     OutOctects: 226938073
+     InPacketsDma: 95532300
+     OutPacketsDma: 59503397
+     InOctetsDma: 1137102462
+     OutOctetsDma: 2394339518
+     InDroppedDma: 0
+     Queue[0] InPackets: 23567131
+     Queue[0] OutPackets: 20070028
+     Queue[0] InJumboPackets: 0
+     Queue[0] InLroPackets: 0
+     Queue[0] InErrors: 0
+     Queue[1] InPackets: 45428967
+     Queue[1] OutPackets: 11306178
+     Queue[1] InJumboPackets: 0
+     Queue[1] InLroPackets: 0
+     Queue[1] InErrors: 0
+     Queue[2] InPackets: 3187011
+     Queue[2] OutPackets: 13080381
+     Queue[2] InJumboPackets: 0
+     Queue[2] InLroPackets: 0
+     Queue[2] InErrors: 0
+     Queue[3] InPackets: 23349136
+     Queue[3] OutPackets: 15046810
+     Queue[3] InJumboPackets: 0
+     Queue[3] InLroPackets: 0
+     Queue[3] InErrors: 0
+
+Interrupt coalescing support
+----------------------------
+
+ The ITR mode and TX/RX coalescing timings can be viewed with::
+
+    ethtool -c <ethX>
+
+ and changed with::
+
+    ethtool -C <ethX> tx-usecs <usecs> rx-usecs <usecs>
+
+ To disable coalescing::
+
+    ethtool -C <ethX> tx-usecs 0 rx-usecs 0 tx-max-frames 1 rx-max-frames 1
+
+Wake on LAN support
+-------------------
+
+ WOL support by magic packet::
+
+    ethtool -s <ethX> wol g
+
+ To disable WOL::
+
+    ethtool -s <ethX> wol d
+
+Set and check the driver message level
+--------------------------------------
+
+ Set message level
+
+ ::
+
+    ethtool -s <ethX> msglvl <level>
+
+ Level values:
+
+ ======   =============================
+ 0x0001   general driver status.
+ 0x0002   hardware probing.
+ 0x0004   link state.
+ 0x0008   periodic status check.
+ 0x0010   interface being brought down.
+ 0x0020   interface being brought up.
+ 0x0040   receive error.
+ 0x0080   transmit error.
+ 0x0200   interrupt handling.
+ 0x0400   transmit completion.
+ 0x0800   receive completion.
+ 0x1000   packet contents.
+ 0x2000   hardware status.
+ 0x4000   Wake-on-LAN status.
+ ======   =============================
+
+ By default, the message level is set to 0x0001 (general driver status).
+
+ Check message level
+
+ ::
+
+    ethtool <ethX> | grep "Current message level"
+
+ If you want to disable the output of messages::
+
+    ethtool -s <ethX> msglvl 0
+
+RX flow rules (ntuple filters)
+------------------------------
+
+ Three separate rule types are supported; they are applied in this order:
+
+ 1. 16 VLAN ID rules
+ 2. 16 L2 EtherType rules
+ 3. 8 L3/L4 5-Tuple rules
+
+
+ The driver utilizes the ethtool interface for configuring ntuple filters,
+ via ``ethtool -N <device> <filter>``.
+
+ To enable or disable the RX flow rules::
+
+    ethtool -K ethX ntuple <on|off>
+
+ When disabling ntuple filters, all the user programmed filters are
+ flushed from the driver cache and hardware. All needed filters must
+ be re-added when ntuple is re-enabled.
+
+ Because of the fixed order of the rules, the location of filters is also fixed:
+
+ - Locations 0 - 15 for VLAN ID filters
+ - Locations 16 - 31 for L2 EtherType filters
+ - Locations 32 - 39 for L3/L4 5-tuple filters (locations 32, 36 for IPv6)
+
+ The L3/L4 5-tuple (protocol, source and destination IP address, source and
+ destination TCP/UDP/SCTP port) is compared against 8 filters. For IPv4, up to
+ 8 source and destination addresses can be matched. For IPv6, up to 2 pairs of
+ addresses can be supported. Source and destination ports are only compared for
+ TCP/UDP/SCTP packets.
+
+ To add a filter that directs packets to queue 5, use the
+ ``<-N|-U|--config-nfc|--config-ntuple>`` switch::
+
+    ethtool -N <ethX> flow-type udp4 src-ip 10.0.0.1 dst-ip 10.0.0.2 src-port 2000 dst-port 2001 action 5 <loc 32>
+
+ - action is the queue number.
+ - loc is the rule number.
+
+ For ``flow-type ip4|udp4|tcp4|sctp4|ip6|udp6|tcp6|sctp6`` you must set the loc
+ number within 32 - 39.
+ For these flow types you can set either 8 rules for IPv4 traffic or 2 rules
+ for IPv6 traffic.
+ The loc numbers for IPv6 traffic are 32 and 36.
+ At the moment you cannot use IPv4 and IPv6 filters at the same time.
+
+ Example filters for IPv6 traffic::
+
+    sudo ethtool -N <ethX> flow-type tcp6 src-ip 2001:db8:0:f101::1 dst-ip 2001:db8:0:f101::2 action 1 loc 32
+    sudo ethtool -N <ethX> flow-type ip6 src-ip 2001:db8:0:f101::2 dst-ip 2001:db8:0:f101::5 action -1 loc 36
+
+ Example filters for IPv4 traffic::
+
+    sudo ethtool -N <ethX> flow-type udp4 src-ip 10.0.0.4 dst-ip 10.0.0.7 src-port 2000 dst-port 2001 loc 32
+    sudo ethtool -N <ethX> flow-type tcp4 src-ip 10.0.0.3 dst-ip 10.0.0.9 src-port 2000 dst-port 2001 loc 33
+    sudo ethtool -N <ethX> flow-type ip4 src-ip 10.0.0.6 dst-ip 10.0.0.4 loc 34
+
+ If you set action -1, then all traffic corresponding to the filter will be discarded.
+
+ The maximum action value is 31.
+
+
+ The VLAN filter (VLAN id) is compared against 16 filters.
+ VLAN id must be accompanied by mask 0xF000. That is to distinguish VLAN filter
+ from L2 Ethertype filter with UserPriority since both User Priority and VLAN ID
+ are passed in the same 'vlan' parameter.
+
+ To add a filter that directs packets from VLAN 2001 to queue 1::
+
+    ethtool -N <ethX> flow-type ip4 vlan 2001 m 0xF000 action 1 loc 0
+
+
+ L2 EtherType filters allow filtering packets by the EtherType field or by both
+ the EtherType and User Priority (PCP) fields of 802.1Q.
+ UserPriority (vlan) parameter must be accompanied by mask 0x1FFF. That is to
+ distinguish VLAN filter from L2 Ethertype filter with UserPriority since both
+ User Priority and VLAN ID are passed in the same 'vlan' parameter.
+
+ To add a filter that directs IPv4 packets of priority 3 to queue 3::
+
+    ethtool -N <ethX> flow-type ether proto 0x800 vlan 0x600 m 0x1FFF action 3 loc 16
+
+ To see the list of filters currently present::
+
+    ethtool <-u|-n|--show-nfc|--show-ntuple> <ethX>
+
+ Rules may be deleted from the table itself. This is done using::
+
+    sudo ethtool <-N|-U|--config-nfc|--config-ntuple> <ethX> delete <loc>
+
+ - loc is the rule number to be deleted.
+
+ RX filters are an interface to load the filter table that funnels all flows
+ into queue 0 unless an alternative queue is specified using "action". In that
+ case, any flow that matches the filter criteria will be directed to the
+ appropriate queue. RX filters are supported on all kernels 2.6.30 and later.
+
+RSS for UDP
+-----------
+
+ Currently, the NIC does not support RSS for fragmented IP packets, so RSS
+ does not work correctly for fragmented UDP traffic. To disable RSS for UDP, an
+ RX flow L3/L4 rule may be used.
+
+ Example::
+
+    ethtool -N eth0 flow-type udp4 action 0 loc 32
+
+UDP GSO hardware offload
+------------------------
+
+ UDP GSO boosts UDP TX rates by offloading UDP header allocation
+ into hardware. A special userspace socket option is required for this; it
+ can be validated with /kernel/tools/testing/selftests/net/::
+
+    udpgso_bench_tx -u -4 -D 10.0.1.1 -s 6300 -S 100
+
+ This sends 100-byte UDP packets formed from a single
+ 6300-byte user buffer.
+
+ UDP GSO is configured by::
+
+    ethtool -K eth0 tx-udp-segmentation on
+
+Private flags (testing)
+-----------------------
+
+ The atlantic driver supports private flags for custom hardware features::
+
+	$ ethtool --show-priv-flags ethX
+
+	Private flags for ethX:
+	DMASystemLoopback  : off
+	PKTSystemLoopback  : off
+	DMANetworkLoopback : off
+	PHYInternalLoopback: off
+	PHYExternalLoopback: off
+
+ Example::
+
+	$ ethtool --set-priv-flags ethX DMASystemLoopback on
+
+ | DMASystemLoopback:   DMA Host loopback.
+ | PKTSystemLoopback:   Packet buffer host loopback.
+ | DMANetworkLoopback:  Network side loopback on DMA block.
+ | PHYInternalLoopback: Internal loopback on Phy.
+ | PHYExternalLoopback: External loopback on Phy (with loopback ethernet cable).
+
+
+Command Line Parameters
+=======================
+The following command line parameters are available in the atlantic driver:
+
+aq_itr - Interrupt throttling mode
+----------------------------------
+Accepted values: 0, 1, 0xFFFF
+
+Default value: 0xFFFF
+
+======   ==============================================================
+0        Disable interrupt throttling.
+1        Enable interrupt throttling and use specified tx and rx rates.
+0xFFFF   Auto throttling mode. Driver will choose the best RX and TX
+	 interrupt throttling settings based on link speed.
+======   ==============================================================
+
+aq_itr_tx - TX interrupt throttle rate
+--------------------------------------
+
+Accepted values: 0 - 0x1FF
+
+Default value: 0
+
+TX side throttling in microseconds. The adapter will set the maximum interrupt
+delay to this value. The minimum interrupt delay will be half of this value.
+
+aq_itr_rx - RX interrupt throttle rate
+--------------------------------------
+
+Accepted values: 0 - 0x1FF
+
+Default value: 0
+
+RX side throttling in microseconds. The adapter will set the maximum interrupt
+delay to this value. The minimum interrupt delay will be half of this value.
+
+.. note::
+
+   ITR settings can be changed at runtime with ``ethtool -C`` (see above).
+
+Config file parameters
+======================
+
+For some fine tuning and performance optimizations,
+some parameters can be changed in the {source_dir}/aq_cfg.h file.
+
+AQ_CFG_RX_PAGEORDER
+-------------------
+
+Default value: 0
+
+RX page order override. The number of RX pages allocated for each descriptor
+is 2 to the power of this value. The received descriptor size is still limited
+by AQ_CFG_RX_FRAME_MAX.
+
+Increasing the page order improves page reuse (relevant on IOMMU-enabled systems).
+
+AQ_CFG_RX_REFILL_THRES
+----------------------
+
+Default value: 32
+
+RX refill threshold. RX path will not refill freed descriptors until the
+specified number of free descriptors is observed. Larger values may improve
+page reuse but may also lead to packet drops.
+
+AQ_CFG_VECS_DEF
+---------------
+
+Number of queues
+
+Valid Range: 0 - 8 (up to AQ_CFG_VECS_MAX)
+
+Default value: 8
+
+Note that this value is capped by the number of cores available on the system.
+
+AQ_CFG_IS_RSS_DEF
+-----------------
+
+Enable/disable Receive Side Scaling
+
+This feature allows the adapter to distribute receive processing
+across multiple CPU cores and prevents overloading a single CPU core.
+
+Valid values
+
+==  ========
+0   disabled
+1   enabled
+==  ========
+
+Default value: 1
+
+AQ_CFG_NUM_RSS_QUEUES_DEF
+-------------------------
+
+Number of queues for Receive Side Scaling
+
+Valid Range: 0 - 8 (up to AQ_CFG_VECS_DEF)
+
+Default value: AQ_CFG_VECS_DEF
+
+AQ_CFG_IS_LRO_DEF
+-----------------
+
+Enable/disable Large Receive Offload
+
+This offload enables the adapter to coalesce multiple TCP segments and indicate
+them as a single coalesced unit to the OS networking subsystem.
+
+The system consumes less energy, but LRO also introduces more latency in packet
+processing.
+
+Valid values
+
+==  ========
+0   disabled
+1   enabled
+==  ========
+
+Default value: 1
+
+AQ_CFG_TX_CLEAN_BUDGET
+----------------------
+
+Maximum descriptors to cleanup on TX at once.
+
+Default value: 256
+
+After the aq_cfg.h file is changed, the driver must be rebuilt for the changes to take effect.
+
+Support
+=======
+
+If an issue is identified with the released source code on the supported
+kernel with a supported adapter, email the specific information related
+to the issue to aqn_support@marvell.com
+
+License
+=======
+
+aQuantia Corporation Network Driver
+
+Copyright |copy| 2014 - 2019 aQuantia Corporation.
+
+This program is free software; you can redistribute it and/or modify it
+under the terms and conditions of the GNU General Public License,
+version 2, as published by the Free Software Foundation.
diff --git a/Documentation/networking/device_drivers/aquantia/atlantic.txt b/Documentation/networking/device_drivers/aquantia/atlantic.txt
deleted file mode 100644
index 2013fcedc2da..000000000000
--- a/Documentation/networking/device_drivers/aquantia/atlantic.txt
+++ /dev/null
@@ -1,479 +0,0 @@
-Marvell(Aquantia) AQtion Driver for the aQuantia Multi-Gigabit PCI Express
-Family of Ethernet Adapters
-=============================================================================
-
-Contents
-========
-
-- Identifying Your Adapter
-- Configuration
-- Supported ethtool options
-- Command Line Parameters
-- Config file parameters
-- Support
-- License
-
-Identifying Your Adapter
-========================
-
-The driver in this release is compatible with AQC-100, AQC-107, AQC-108 based ethernet adapters.
-
-
-SFP+ Devices (for AQC-100 based adapters)
-----------------------------------
-
-This release tested with passive Direct Attach Cables (DAC) and SFP+/LC Optical Transceiver.
-
-Configuration
-=========================
-  Viewing Link Messages
-  ---------------------
-  Link messages will not be displayed to the console if the distribution is
-  restricting system messages. In order to see network driver link messages on
-  your console, set dmesg to eight by entering the following:
-
-       dmesg -n 8
-
-  NOTE: This setting is not saved across reboots.
-
-  Jumbo Frames
-  ------------
-  The driver supports Jumbo Frames for all adapters. Jumbo Frames support is
-  enabled by changing the MTU to a value larger than the default of 1500.
-  The maximum value for the MTU is 16000.  Use the `ip` command to
-  increase the MTU size.  For example:
-
-        ip link set mtu 16000 dev enp1s0
-
-  ethtool
-  -------
-  The driver utilizes the ethtool interface for driver configuration and
-  diagnostics, as well as displaying statistical information. The latest
-  ethtool version is required for this functionality.
-
-  NAPI
-  ----
-  NAPI (Rx polling mode) is supported in the atlantic driver.
-
-Supported ethtool options
-============================
- Viewing adapter settings
- ---------------------
- ethtool <ethX>
-
- Output example:
-
-  Settings for enp1s0:
-    Supported ports: [ TP ]
-    Supported link modes:   100baseT/Full
-                            1000baseT/Full
-                            10000baseT/Full
-                            2500baseT/Full
-                            5000baseT/Full
-    Supported pause frame use: Symmetric
-    Supports auto-negotiation: Yes
-    Supported FEC modes: Not reported
-    Advertised link modes:  100baseT/Full
-                            1000baseT/Full
-                            10000baseT/Full
-                            2500baseT/Full
-                            5000baseT/Full
-    Advertised pause frame use: Symmetric
-    Advertised auto-negotiation: Yes
-    Advertised FEC modes: Not reported
-    Speed: 10000Mb/s
-    Duplex: Full
-    Port: Twisted Pair
-    PHYAD: 0
-    Transceiver: internal
-    Auto-negotiation: on
-    MDI-X: Unknown
-    Supports Wake-on: g
-    Wake-on: d
-    Link detected: yes
-
- ---
- Note: AQrate speeds (2.5/5 Gb/s) will be displayed only with linux kernels > 4.10.
-    But you can still use these speeds:
-	ethtool -s eth0 autoneg off speed 2500
-
- Viewing adapter information
- ---------------------
- ethtool -i <ethX>
-
- Output example:
-
-  driver: atlantic
-  version: 5.2.0-050200rc5-generic-kern
-  firmware-version: 3.1.78
-  expansion-rom-version:
-  bus-info: 0000:01:00.0
-  supports-statistics: yes
-  supports-test: no
-  supports-eeprom-access: no
-  supports-register-dump: yes
-  supports-priv-flags: no
-
-
- Viewing Ethernet adapter statistics:
- ---------------------
- ethtool -S <ethX>
-
- Output example:
- NIC statistics:
-     InPackets: 13238607
-     InUCast: 13293852
-     InMCast: 52
-     InBCast: 3
-     InErrors: 0
-     OutPackets: 23703019
-     OutUCast: 23704941
-     OutMCast: 67
-     OutBCast: 11
-     InUCastOctects: 213182760
-     OutUCastOctects: 22698443
-     InMCastOctects: 6600
-     OutMCastOctects: 8776
-     InBCastOctects: 192
-     OutBCastOctects: 704
-     InOctects: 2131839552
-     OutOctects: 226938073
-     InPacketsDma: 95532300
-     OutPacketsDma: 59503397
-     InOctetsDma: 1137102462
-     OutOctetsDma: 2394339518
-     InDroppedDma: 0
-     Queue[0] InPackets: 23567131
-     Queue[0] OutPackets: 20070028
-     Queue[0] InJumboPackets: 0
-     Queue[0] InLroPackets: 0
-     Queue[0] InErrors: 0
-     Queue[1] InPackets: 45428967
-     Queue[1] OutPackets: 11306178
-     Queue[1] InJumboPackets: 0
-     Queue[1] InLroPackets: 0
-     Queue[1] InErrors: 0
-     Queue[2] InPackets: 3187011
-     Queue[2] OutPackets: 13080381
-     Queue[2] InJumboPackets: 0
-     Queue[2] InLroPackets: 0
-     Queue[2] InErrors: 0
-     Queue[3] InPackets: 23349136
-     Queue[3] OutPackets: 15046810
-     Queue[3] InJumboPackets: 0
-     Queue[3] InLroPackets: 0
-     Queue[3] InErrors: 0
-
- Interrupt coalescing support
- ---------------------------------
- ITR mode, TX/RX coalescing timings could be viewed with:
-
- ethtool -c <ethX>
-
- and changed with:
-
- ethtool -C <ethX> tx-usecs <usecs> rx-usecs <usecs>
-
- To disable coalescing:
-
- ethtool -C <ethX> tx-usecs 0 rx-usecs 0 tx-max-frames 1 tx-max-frames 1
-
- Wake on LAN support
- ---------------------------------
-
- WOL support by magic packet:
-
- ethtool -s <ethX> wol g
-
- To disable WOL:
-
- ethtool -s <ethX> wol d
-
- Set and check the driver message level
- ---------------------------------
-
- Set message level
-
- ethtool -s <ethX> msglvl <level>
-
- Level values:
-
- 0x0001 - general driver status.
- 0x0002 - hardware probing.
- 0x0004 - link state.
- 0x0008 - periodic status check.
- 0x0010 - interface being brought down.
- 0x0020 - interface being brought up.
- 0x0040 - receive error.
- 0x0080 - transmit error.
- 0x0200 - interrupt handling.
- 0x0400 - transmit completion.
- 0x0800 - receive completion.
- 0x1000 - packet contents.
- 0x2000 - hardware status.
- 0x4000 - Wake-on-LAN status.
-
- By default, the level of debugging messages is set 0x0001(general driver status).
-
- Check message level
-
- ethtool <ethX> | grep "Current message level"
-
- If you want to disable the output of messages
-
- ethtool -s <ethX> msglvl 0
-
- RX flow rules (ntuple filters)
- ---------------------------------
- There are separate rules supported, that applies in that order:
- 1. 16 VLAN ID rules
- 2. 16 L2 EtherType rules
- 3. 8 L3/L4 5-Tuple rules
-
-
- The driver utilizes the ethtool interface for configuring ntuple filters,
- via "ethtool -N <device> <filter>".
-
- To enable or disable the RX flow rules:
-
- ethtool -K ethX ntuple <on|off>
-
- When disabling ntuple filters, all the user programed filters are
- flushed from the driver cache and hardware. All needed filters must
- be re-added when ntuple is re-enabled.
-
- Because of the fixed order of the rules, the location of filters is also fixed:
- - Locations 0 - 15 for VLAN ID filters
- - Locations 16 - 31 for L2 EtherType filters
- - Locations 32 - 39 for L3/L4 5-tuple filters (locations 32, 36 for IPv6)
-
- The L3/L4 5-tuple (protocol, source and destination IP address, source and
- destination TCP/UDP/SCTP port) is compared against 8 filters. For IPv4, up to
- 8 source and destination addresses can be matched. For IPv6, up to 2 pairs of
- addresses can be supported. Source and destination ports are only compared for
- TCP/UDP/SCTP packets.
-
- To add a filter that directs packet to queue 5, use <-N|-U|--config-nfc|--config-ntuple> switch:
-
- ethtool -N <ethX> flow-type udp4 src-ip 10.0.0.1 dst-ip 10.0.0.2 src-port 2000 dst-port 2001 action 5 <loc 32>
-
- - action is the queue number.
- - loc is the rule number.
-
- For "flow-type ip4|udp4|tcp4|sctp4|ip6|udp6|tcp6|sctp6" you must set the loc
- number within 32 - 39.
- For "flow-type ip4|udp4|tcp4|sctp4|ip6|udp6|tcp6|sctp6" you can set 8 rules
- for traffic IPv4 or you can set 2 rules for traffic IPv6. Loc number traffic
- IPv6 is 32 and 36.
- At the moment you can not use IPv4 and IPv6 filters at the same time.
-
- Example filter for IPv6 filter traffic:
-
- sudo ethtool -N <ethX> flow-type tcp6 src-ip 2001:db8:0:f101::1 dst-ip 2001:db8:0:f101::2 action 1 loc 32
- sudo ethtool -N <ethX> flow-type ip6 src-ip 2001:db8:0:f101::2 dst-ip 2001:db8:0:f101::5 action -1 loc 36
-
- Example filter for IPv4 filter traffic:
-
- sudo ethtool -N <ethX> flow-type udp4 src-ip 10.0.0.4 dst-ip 10.0.0.7 src-port 2000 dst-port 2001 loc 32
- sudo ethtool -N <ethX> flow-type tcp4 src-ip 10.0.0.3 dst-ip 10.0.0.9 src-port 2000 dst-port 2001 loc 33
- sudo ethtool -N <ethX> flow-type ip4 src-ip 10.0.0.6 dst-ip 10.0.0.4 loc 34
-
- If you set action -1, then all traffic corresponding to the filter will be discarded.
- The maximum value action is 31.
-
-
- The VLAN filter (VLAN id) is compared against 16 filters.
- VLAN id must be accompanied by mask 0xF000. That is to distinguish VLAN filter
- from L2 Ethertype filter with UserPriority since both User Priority and VLAN ID
- are passed in the same 'vlan' parameter.
-
- To add a filter that directs packets from VLAN 2001 to queue 5:
- ethtool -N <ethX> flow-type ip4 vlan 2001 m 0xF000 action 1 loc 0
-
-
- L2 EtherType filters allows filter packet by EtherType field or both EtherType
- and User Priority (PCP) field of 802.1Q.
- UserPriority (vlan) parameter must be accompanied by mask 0x1FFF. That is to
- distinguish VLAN filter from L2 Ethertype filter with UserPriority since both
- User Priority and VLAN ID are passed in the same 'vlan' parameter.
-
- To add a filter that directs IP4 packess of priority 3 to queue 3:
- ethtool -N <ethX> flow-type ether proto 0x800 vlan 0x600 m 0x1FFF action 3 loc 16
-
-
- To see the list of filters currently present:
-
- ethtool <-u|-n|--show-nfc|--show-ntuple> <ethX>
-
- Rules may be deleted from the table itself. This is done using:
-
- sudo ethtool <-N|-U|--config-nfc|--config-ntuple> <ethX> delete <loc>
-
- - loc is the rule number to be deleted.
-
- Rx filters is an interface to load the filter table that funnels all flow
- into queue 0 unless an alternative queue is specified using "action". In that
- case, any flow that matches the filter criteria will be directed to the
- appropriate queue. RX filters is supported on all kernels 2.6.30 and later.
-
- RSS for UDP
- ---------------------------------
- Currently, NIC does not support RSS for fragmented IP packets, which leads to
- incorrect working of RSS for fragmented UDP traffic. To disable RSS for UDP the
- RX Flow L3/L4 rule may be used.
-
- Example:
- ethtool -N eth0 flow-type udp4 action 0 loc 32
-
- UDP GSO hardware offload
- ---------------------------------
- UDP GSO allows to boost UDP tx rates by offloading UDP headers allocation
- into hardware. A special userspace socket option is required for this,
- could be validated with /kernel/tools/testing/selftests/net/
-
-    udpgso_bench_tx -u -4 -D 10.0.1.1 -s 6300 -S 100
-
- Will cause sending out of 100 byte sized UDP packets formed from single
- 6300 bytes user buffer.
-
- UDP GSO is configured by:
-
-    ethtool -K eth0 tx-udp-segmentation on
-
- Private flags (testing)
- ---------------------------------
-
- Atlantic driver supports private flags for hardware custom features:
-
-	$ ethtool --show-priv-flags ethX
-
-	Private flags for ethX:
-	DMASystemLoopback  : off
-	PKTSystemLoopback  : off
-	DMANetworkLoopback : off
-	PHYInternalLoopback: off
-	PHYExternalLoopback: off
-
- Example:
-
-	$ ethtool --set-priv-flags ethX DMASystemLoopback on
-
- DMASystemLoopback:   DMA Host loopback.
- PKTSystemLoopback:   Packet buffer host loopback.
- DMANetworkLoopback:  Network side loopback on DMA block.
- PHYInternalLoopback: Internal loopback on Phy.
- PHYExternalLoopback: External loopback on Phy (with loopback ethernet cable).
-
-
-Command Line Parameters
-=======================
-The following command line parameters are available on atlantic driver:
-
-aq_itr -Interrupt throttling mode
-----------------------------------------
-Accepted values: 0, 1, 0xFFFF
-Default value: 0xFFFF
-0      - Disable interrupt throttling.
-1      - Enable interrupt throttling and use specified tx and rx rates.
-0xFFFF - Auto throttling mode. Driver will choose the best RX and TX
-         interrupt throtting settings based on link speed.
-
-aq_itr_tx - TX interrupt throttle rate
-----------------------------------------
-Accepted values: 0 - 0x1FF
-Default value: 0
-TX side throttling in microseconds. Adapter will setup maximum interrupt delay
-to this value. Minimum interrupt delay will be a half of this value
-
-aq_itr_rx - RX interrupt throttle rate
-----------------------------------------
-Accepted values: 0 - 0x1FF
-Default value: 0
-RX side throttling in microseconds. Adapter will setup maximum interrupt delay
-to this value. Minimum interrupt delay will be a half of this value
-
-Note: ITR settings could be changed in runtime by ethtool -c means (see below)
-
-Config file parameters
-=======================
-For some fine tuning and performance optimizations,
-some parameters can be changed in the {source_dir}/aq_cfg.h file.
-
-AQ_CFG_RX_PAGEORDER
-----------------------------------------
-Default value: 0
-RX page order override. Thats a power of 2 number of RX pages allocated for
-each descriptor. Received descriptor size is still limited by AQ_CFG_RX_FRAME_MAX.
-Increasing pageorder makes page reuse better (actual on iommu enabled systems).
-
-AQ_CFG_RX_REFILL_THRES
-----------------------------------------
-Default value: 32
-RX refill threshold. RX path will not refill freed descriptors until the
-specified number of free descriptors is observed. Larger values may help
-better page reuse but may lead to packet drops as well.
-
-AQ_CFG_VECS_DEF
-------------------------------------------------------------
-Number of queues
-Valid Range: 0 - 8 (up to AQ_CFG_VECS_MAX)
-Default value: 8
-Notice this value will be capped by the number of cores available on the system.
-
-AQ_CFG_IS_RSS_DEF
-------------------------------------------------------------
-Enable/disable Receive Side Scaling
-
-This feature allows the adapter to distribute receive processing
-across multiple CPU-cores and to prevent from overloading a single CPU core.
-
-Valid values
-0 - disabled
-1 - enabled
-
-Default value: 1
-
-AQ_CFG_NUM_RSS_QUEUES_DEF
-------------------------------------------------------------
-Number of queues for Receive Side Scaling
-Valid Range: 0 - 8 (up to AQ_CFG_VECS_DEF)
-
-Default value: AQ_CFG_VECS_DEF
-
-AQ_CFG_IS_LRO_DEF
-------------------------------------------------------------
-Enable/disable Large Receive Offload
-
-This offload enables the adapter to coalesce multiple TCP segments and indicate
-them as a single coalesced unit to the OS networking subsystem.
-The system consumes less energy but it also introduces more latency in packets processing.
-
-Valid values
-0 - disabled
-1 - enabled
-
-Default value: 1
-
-AQ_CFG_TX_CLEAN_BUDGET
-----------------------------------------
-Maximum descriptors to cleanup on TX at once.
-Default value: 256
-
-After the aq_cfg.h file changed the driver must be rebuilt to take effect.
-
-Support
-=======
-
-If an issue is identified with the released source code on the supported
-kernel with a supported adapter, email the specific information related
-to the issue to aqn_support@marvell.com
-
-License
-=======
-
-aQuantia Corporation Network Driver
-Copyright(c) 2014 - 2019 aQuantia Corporation.
-
-This program is free software; you can redistribute it and/or modify it
-under the terms and conditions of the GNU General Public License,
-version 2, as published by the Free Software Foundation.
diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst
index 019a0d2efe67..7dde314fc957 100644
--- a/Documentation/networking/device_drivers/index.rst
+++ b/Documentation/networking/device_drivers/index.rst
@@ -30,6 +30,7 @@ Contents:
    3com/3c509
    3com/vortex
    amazon/ena
+   aquantia/atlantic
 
 .. only::  subproject and html
 
diff --git a/MAINTAINERS b/MAINTAINERS
index 7b6c13cc832f..b5cfee17635e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1275,7 +1275,7 @@ L:	netdev@vger.kernel.org
 S:	Supported
 W:	https://www.marvell.com/
 Q:	http://patchwork.ozlabs.org/project/netdev/list/
-F:	Documentation/networking/device_drivers/aquantia/atlantic.txt
+F:	Documentation/networking/device_drivers/aquantia/atlantic.rst
 F:	drivers/net/ethernet/aquantia/atlantic/
 
 AQUANTIA ETHERNET DRIVER PTP SUBSYSTEM
-- 
cgit v1.2.3


From c981977d3a5ce55c96b1b77f42d0a9df0a79244e Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Fri, 1 May 2020 16:44:42 +0200
Subject: docs: networking: device drivers: convert dec/dmfe.txt to ReST

- add SPDX header;
- adjust titles and chapters, adding proper markups;
- comment out text-only TOC from html/pdf output;
- mark code blocks and literals as such;
- mark tables as such;
- adjust indentation, whitespaces and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 .../networking/device_drivers/dec/dmfe.rst         | 71 ++++++++++++++++++++++
 .../networking/device_drivers/dec/dmfe.txt         | 66 --------------------
 Documentation/networking/device_drivers/index.rst  |  1 +
 MAINTAINERS                                        |  2 +-
 drivers/net/ethernet/dec/tulip/Kconfig             |  2 +-
 5 files changed, 74 insertions(+), 68 deletions(-)
 create mode 100644 Documentation/networking/device_drivers/dec/dmfe.rst
 delete mode 100644 Documentation/networking/device_drivers/dec/dmfe.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/device_drivers/dec/dmfe.rst b/Documentation/networking/device_drivers/dec/dmfe.rst
new file mode 100644
index 000000000000..c4cf809cad84
--- /dev/null
+++ b/Documentation/networking/device_drivers/dec/dmfe.rst
@@ -0,0 +1,71 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================================================
+Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver for Linux
+==============================================================
+
+Note: This driver doesn't have a maintainer.
+
+
+This program is free software; you can redistribute it and/or
+modify it under the terms of the GNU General   Public License
+as published by the Free Software Foundation; either version 2
+of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+
+This driver provides kernel support for Davicom DM9102(A)/DM9132/DM9801 ethernet cards (CNET
+10/100 ethernet cards use the Davicom chipset too, so this driver supports them as well). If you
+didn't compile this driver as a module, it will automatically load itself on boot and print a
+line similar to::
+
+	dmfe: Davicom DM9xxx net driver, version 1.36.4 (2002-01-17)
+
+If you compiled this driver as a module, you have to load it on boot. You can load it with the command::
+
+	insmod dmfe
+
+This way it will autodetect the device mode. This is the suggested way to load the module. Or you can pass
+a mode= setting to the module while loading, like::
+
+	insmod dmfe mode=0 # Force 10M Half Duplex
+	insmod dmfe mode=1 # Force 100M Half Duplex
+	insmod dmfe mode=4 # Force 10M Full Duplex
+	insmod dmfe mode=5 # Force 100M Full Duplex
+
+Next you should configure your network interface with a command similar to::
+
+	ifconfig eth0 172.22.3.18
+		      ^^^^^^^^^^^
+		     Your IP Address
+
+Then you may have to modify the default routing table with the command::
+
+	route add default eth0
+
+
+Now your ethernet card should be up and running.
+
+
+TODO:
+
+- Implement pci_driver::suspend() and pci_driver::resume() power management methods.
+- Check on 64 bit boxes.
+- Check and fix on big endian boxes.
+- Test and make sure PCI latency is now correct for all cases.
+
+
+Authors:
+
+Sten Wang <sten_wang@davicom.com.tw>   : Original Author
+
+Contributors:
+
+- Marcelo Tosatti <marcelo@conectiva.com.br>
+- Alan Cox <alan@lxorguk.ukuu.org.uk>
+- Jeff Garzik <jgarzik@pobox.com>
+- Vojtech Pavlik <vojtech@suse.cz>
diff --git a/Documentation/networking/device_drivers/dec/dmfe.txt b/Documentation/networking/device_drivers/dec/dmfe.txt
deleted file mode 100644
index 25320bf19c86..000000000000
--- a/Documentation/networking/device_drivers/dec/dmfe.txt
+++ /dev/null
@@ -1,66 +0,0 @@
-Note: This driver doesn't have a maintainer.
-
-Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver for Linux.
-
-This program is free software; you can redistribute it and/or
-modify it under the terms of the GNU General   Public License
-as published by the Free Software Foundation; either version 2
-of the License, or (at your option) any later version.
-
-This program is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-
-This driver provides kernel support for Davicom DM9102(A)/DM9132/DM9801 ethernet cards ( CNET
-10/100 ethernet cards uses Davicom chipset too, so this driver supports CNET cards too ).If you
-didn't compile this driver as a module, it will automatically load itself on boot and print a
-line similar to :
-
-	dmfe: Davicom DM9xxx net driver, version 1.36.4 (2002-01-17)
-
-If you compiled this driver as a module, you have to load it on boot.You can load it with command :
-
-	insmod dmfe
-
-This way it will autodetect the device mode.This is the suggested way to load the module.Or you can pass
-a mode= setting to module while loading, like :
-
-	insmod dmfe mode=0 # Force 10M Half Duplex
-	insmod dmfe mode=1 # Force 100M Half Duplex
-	insmod dmfe mode=4 # Force 10M Full Duplex
-	insmod dmfe mode=5 # Force 100M Full Duplex
-
-Next you should configure your network interface with a command similar to :
-
-	ifconfig eth0 172.22.3.18
-                      ^^^^^^^^^^^
-		     Your IP Address
-
-Then you may have to modify the default routing table with command :
-
-	route add default eth0
-
-
-Now your ethernet card should be up and running.
-
-
-TODO:
-
-Implement pci_driver::suspend() and pci_driver::resume() power management methods.
-Check on 64 bit boxes.
-Check and fix on big endian boxes.
-Test and make sure PCI latency is now correct for all cases.
-
-
-Authors:
-
-Sten Wang <sten_wang@davicom.com.tw >   : Original Author
-
-Contributors:
-
-Marcelo Tosatti <marcelo@conectiva.com.br>
-Alan Cox <alan@lxorguk.ukuu.org.uk>
-Jeff Garzik <jgarzik@pobox.com>
-Vojtech Pavlik <vojtech@suse.cz>
diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst
index 4ad13ffb5800..09728e964ce1 100644
--- a/Documentation/networking/device_drivers/index.rst
+++ b/Documentation/networking/device_drivers/index.rst
@@ -35,6 +35,7 @@ Contents:
    cirrus/cs89x0
    davicom/dm9000
    dec/de4x5
+   dec/dmfe
 
 .. only::  subproject and html
 
diff --git a/MAINTAINERS b/MAINTAINERS
index b5cfee17635e..f0b18c156176 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4694,7 +4694,7 @@ F:	net/ax25/sysctl_net_ax25.c
 DAVICOM FAST ETHERNET (DMFE) NETWORK DRIVER
 L:	netdev@vger.kernel.org
 S:	Orphan
-F:	Documentation/networking/device_drivers/dec/dmfe.txt
+F:	Documentation/networking/device_drivers/dec/dmfe.rst
 F:	drivers/net/ethernet/dec/tulip/dmfe.c
 
 DC390/AM53C974 SCSI driver
diff --git a/drivers/net/ethernet/dec/tulip/Kconfig b/drivers/net/ethernet/dec/tulip/Kconfig
index 8c4245d94bb2..177f36f4b89d 100644
--- a/drivers/net/ethernet/dec/tulip/Kconfig
+++ b/drivers/net/ethernet/dec/tulip/Kconfig
@@ -138,7 +138,7 @@ config DM9102
 	  This driver is for DM9102(A)/DM9132/DM9801 compatible PCI cards from
 	  Davicom (<http://www.davicom.com.tw/>).  If you have such a network
 	  (Ethernet) card, say Y.  Some information is contained in the file
-	  <file:Documentation/networking/device_drivers/dec/dmfe.txt>.
+	  <file:Documentation/networking/device_drivers/dec/dmfe.rst>.
 
 	  To compile this driver as a module, choose M here. The module will
 	  be called dmfe.
-- 
cgit v1.2.3


From cf7eba49b2b160f98106b33ca12039b05d812140 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Fri, 1 May 2020 16:44:46 +0200
Subject: docs: networking: device drivers: convert intel/ipw2100.txt to ReST

- add SPDX header;
- adjust titles and chapters, adding proper markups;
- comment out text-only TOC from html/pdf output;
- use copyright symbol;
- use :field: markup;
- mark code blocks and literals as such;
- mark tables as such;
- adjust indentation, whitespaces and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/device_drivers/index.rst  |   1 +
 .../networking/device_drivers/intel/ipw2100.rst    | 323 +++++++++++++++++++++
 .../networking/device_drivers/intel/ipw2100.txt    | 293 -------------------
 MAINTAINERS                                        |   2 +-
 drivers/net/wireless/intel/ipw2x00/Kconfig         |   2 +-
 drivers/net/wireless/intel/ipw2x00/ipw2100.c       |   2 +-
 6 files changed, 327 insertions(+), 296 deletions(-)
 create mode 100644 Documentation/networking/device_drivers/intel/ipw2100.rst
 delete mode 100644 Documentation/networking/device_drivers/intel/ipw2100.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst
index cec3415ee459..54ed10f3d1a7 100644
--- a/Documentation/networking/device_drivers/index.rst
+++ b/Documentation/networking/device_drivers/index.rst
@@ -39,6 +39,7 @@ Contents:
    dlink/dl2k
    freescale/dpaa
    freescale/gianfar
+   intel/ipw2100
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/device_drivers/intel/ipw2100.rst b/Documentation/networking/device_drivers/intel/ipw2100.rst
new file mode 100644
index 000000000000..d54ad522f937
--- /dev/null
+++ b/Documentation/networking/device_drivers/intel/ipw2100.rst
@@ -0,0 +1,323 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+===========================================
+Intel(R) PRO/Wireless 2100 Driver for Linux
+===========================================
+
+Support for:
+
+- Intel(R) PRO/Wireless 2100 Network Connection
+
+Copyright |copy| 2003-2006, Intel Corporation
+
+README.ipw2100
+
+:Version: git-1.1.5
+:Date:    January 25, 2006
+
+.. Index
+
+    0. IMPORTANT INFORMATION BEFORE USING THIS DRIVER
+    1. Introduction
+    2. Release git-1.1.5 Current Features
+    3. Command Line Parameters
+    4. Sysfs Helper Files
+    5. Radio Kill Switch
+    6. Dynamic Firmware
+    7. Power Management
+    8. Support
+    9. License
+
+
+0. IMPORTANT INFORMATION BEFORE USING THIS DRIVER
+=================================================
+
+Important Notice FOR ALL USERS OR DISTRIBUTORS!!!!
+
+Intel wireless LAN adapters are engineered, manufactured, tested, and
+quality checked to ensure that they meet all necessary local and
+governmental regulatory agency requirements for the regions that they
+are designated and/or marked to ship into. Since wireless LANs are
+generally unlicensed devices that share spectrum with radars,
+satellites, and other licensed and unlicensed devices, it is sometimes
+necessary to dynamically detect, avoid, and limit usage to avoid
+interference with these devices. In many instances Intel is required to
+provide test data to prove regional and local compliance to regional and
+governmental regulations before certification or approval to use the
+product is granted. Intel's wireless LAN's EEPROM, firmware, and
+software driver are designed to carefully control parameters that affect
+radio operation and to ensure electromagnetic compliance (EMC). These
+parameters include, without limitation, RF power, spectrum usage,
+channel scanning, and human exposure.
+
+For these reasons Intel cannot permit any manipulation by third parties
+of the software provided in binary format with the wireless WLAN
+adapters (e.g., the EEPROM and firmware). Furthermore, if you use any
+patches, utilities, or code with the Intel wireless LAN adapters that
+have been manipulated by an unauthorized party (i.e., patches,
+utilities, or code (including open source code modifications) which have
+not been validated by Intel), (i) you will be solely responsible for
+ensuring the regulatory compliance of the products, (ii) Intel will bear
+no liability, under any theory of liability for any issues associated
+with the modified products, including without limitation, claims under
+the warranty and/or issues arising from regulatory non-compliance, and
+(iii) Intel will not provide or be required to assist in providing
+support to any third parties for such modified products.
+
+Note: Many regulatory agencies consider Wireless LAN adapters to be
+modules, and accordingly, condition system-level regulatory approval
+upon receipt and review of test data documenting that the antennas and
+system configuration do not cause the EMC and radio operation to be
+non-compliant.
+
+The drivers available for download from SourceForge are provided as a
+part of a development project.  Conformance to local regulatory
+requirements is the responsibility of the individual developer.  As
+such, if you are interested in deploying or shipping a driver as part of
+a solution intended to be used for purposes other than development, please
+obtain a tested driver from Intel Customer Support at:
+
+http://www.intel.com/support/wireless/sb/CS-006408.htm
+
+1. Introduction
+===============
+
+This document provides a brief overview of the features supported by the
+IPW2100 driver project.  The main project website, where the latest
+development version of the driver can be found, is:
+
+	http://ipw2100.sourceforge.net
+
+There you can find not only the latest releases, but also information about
+potential fixes and patches, as well as links to the development mailing list
+for the driver project.
+
+
+2. Release git-1.1.5 Current Supported Features
+===============================================
+
+- Managed (BSS) and Ad-Hoc (IBSS)
+- WEP (shared key and open)
+- Wireless Tools support
+- 802.1x (tested with XSupplicant 1.0.1)
+
+Enabled (but not supported) features:
+- Monitor/RFMon mode
+- WPA/WPA2
+
+The distinction between officially supported and enabled is a reflection
+on the amount of validation and interoperability testing that has been
+performed on a given feature.
+
+
+3. Command Line Parameters
+==========================
+
+If the driver is built as a module, the following optional parameters are used
+by entering them on the command line with the modprobe command using this
+syntax::
+
+	modprobe ipw2100 [<option>=<VAL1><,VAL2>...]
+
+For example, to disable the radio on driver loading, enter::
+
+	modprobe ipw2100 disable=1
+
+The ipw2100 driver supports the following module parameters:
+
+=========	==============	============  ==============================
+Name		Value		Example       Meaning
+=========	==============	============  ==============================
+debug		0x0-0xffffffff	debug=1024    Debug level set to 1024
+mode		0,1,2		mode=1        AdHoc
+channel		int		channel=3     Only valid in AdHoc or Monitor
+associate	boolean		associate=0   Do NOT auto associate
+disable		boolean		disable=1     Do not power the HW
+=========	==============	============  ==============================
+
+
+4. Sysfs Helper Files
+=====================
+
+There are several ways to control the behavior of the driver.  Many of the
+general capabilities are exposed through the Wireless Tools (iwconfig).  There
+are a few capabilities that are exposed through entries in the Linux Sysfs.
+
+
+**Driver Level**
+
+For the driver level files, look in /sys/bus/pci/drivers/ipw2100/
+
+  debug_level
+	This controls the same global as the 'debug' module parameter.  For
+	information on the various debugging levels available, run the 'dvals'
+	script found in the driver source directory.
+
+	.. note::
+
+	      'debug_level' is only enabled if CONFIG_IPW2100_DEBUG is turned on.
+
+**Device Level**
+
+For the device level files look in::
+
+	/sys/bus/pci/drivers/ipw2100/{PCI-ID}/
+
+For example::
+
+	/sys/bus/pci/drivers/ipw2100/0000:02:01.0
+
+For the device level files, see /sys/bus/pci/drivers/ipw2100:
+
+  rf_kill
+	read
+
+	==  =========================================
+	0   RF kill not enabled (radio on)
+	1   SW based RF kill active (radio off)
+	2   HW based RF kill active (radio off)
+	3   Both HW and SW RF kill active (radio off)
+	==  =========================================
+
+	write
+
+	==  ==================================================
+	0   If SW based RF kill active, turn the radio back on
+	1   If radio is on, activate SW based RF kill
+	==  ==================================================
+
+	.. note::
+
+	   If you enable the SW based RF kill and then toggle the HW
+	   based RF kill from ON -> OFF -> ON, the radio will NOT come back on
+
+
+5. Radio Kill Switch
+====================
+
+Most laptops provide the ability for the user to physically disable the radio.
+Some vendors have implemented this as a physical switch that requires no
+software to turn the radio off and on.  On other laptops, however, the switch
+is controlled through a button being pressed and a software driver then making
+calls to turn the radio off and on.  This is referred to as a "software based
+RF kill switch".
+
+See the Sysfs helper file 'rf_kill' for determining the state of the RF switch
+on your system.
+
+
+6. Dynamic Firmware
+===================
+
+As the firmware is licensed under a restricted use license, it cannot be
+included within the kernel sources.  To enable the IPW2100 you will need a
+firmware image to load into the wireless NIC's processors.
+
+You can obtain these images from <http://ipw2100.sf.net/firmware.php>.
+
+See INSTALL for instructions on installing the firmware.
+
+
+7. Power Management
+===================
+
+The IPW2100 supports the configuration of the Power Save Protocol
+through a private wireless extension interface.  The IPW2100 supports
+the following different modes:
+
+	===	===========================================================
+	off	No power management.  Radio is always on.
+	on	Automatic power management
+	1-5	Different levels of power management.  The higher the
+		number the greater the power savings, but with an impact to
+		packet latencies.
+	===	===========================================================
+
+Power management works by powering down the radio after a certain
+interval of time has passed where no packets are passed through the
+radio.  Once powered down, the radio remains in that state for a given
+period of time.  For higher power savings, the interval between last
+packet processed to sleep is shorter and the sleep period is longer.
+
+When the radio is asleep, the access point sending data to the station
+must buffer packets at the AP until the station wakes up and requests
+any buffered packets.  If you have an AP that does not correctly support
+the PSP protocol you may experience packet loss or very poor performance
+while power management is enabled.  If this is the case, you will need
+to try and find a firmware update for your AP, or disable power
+management (via ``iwconfig eth1 power off``).
+
+To configure the power level on the IPW2100 you use a combination of
+iwconfig and iwpriv.  iwconfig is used to turn power management on, off,
+and set it to auto.
+
+	=========================  ====================================
+	iwconfig eth1 power off    Disables radio power down
+	iwconfig eth1 power on     Enables radio power management to
+				   last set level (defaults to AUTO)
+	iwpriv eth1 set_power 0    Sets power level to AUTO and enables
+				   power management if not previously
+				   enabled.
+	iwpriv eth1 set_power 1-5  Set the power level as specified,
+				   enabling power management if not
+				   previously enabled.
+	=========================  ====================================
+
+You can view the current power level setting via::
+
+	iwpriv eth1 get_power
+
+It will return the current period or timeout that is configured as a string
+in the form of xxxx/yyyy (z) where xxxx is the timeout interval (amount of
+time after packet processing), yyyy is the period to sleep (amount of time to
+wait before powering the radio and querying the access point for buffered
+packets), and z is the 'power level'.  If power management is turned off the
+xxxx/yyyy will be replaced with 'off' -- the level reported will be the active
+level if ``iwconfig eth1 power on`` is invoked.
+
+
+8. Support
+==========
+
+For general development information and support,
+go to:
+
+    http://ipw2100.sf.net/
+
+The ipw2100 1.1.0 driver and firmware can be downloaded from:
+
+    http://support.intel.com
+
+For installation support on the ipw2100 1.1.0 driver on Linux kernels
+2.6.8 or greater, email support is available from:
+
+    http://supportmail.intel.com
+
+9. License
+==========
+
+  Copyright |copy| 2003 - 2006 Intel Corporation. All rights reserved.
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms of the GNU General Public License (version 2) as
+  published by the Free Software Foundation.
+
+  This program is distributed in the hope that it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc., 59
+  Temple Place - Suite 330, Boston, MA  02111-1307, USA.
+
+  The full GNU General Public License is included in this distribution in the
+  file called LICENSE.
+
+  License Contact Information:
+
+  James P. Ketrenos <ipw2100-admin@linux.intel.com>
+
+  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
diff --git a/Documentation/networking/device_drivers/intel/ipw2100.txt b/Documentation/networking/device_drivers/intel/ipw2100.txt
deleted file mode 100644
index 6f85e1d06031..000000000000
--- a/Documentation/networking/device_drivers/intel/ipw2100.txt
+++ /dev/null
@@ -1,293 +0,0 @@
-
-Intel(R) PRO/Wireless 2100 Driver for Linux in support of:
-
-Intel(R) PRO/Wireless 2100 Network Connection
-
-Copyright (C) 2003-2006, Intel Corporation
-
-README.ipw2100
-
-Version: git-1.1.5
-Date   : January 25, 2006
-
-Index
------------------------------------------------
-0. IMPORTANT INFORMATION BEFORE USING THIS DRIVER
-1. Introduction
-2. Release git-1.1.5 Current Features
-3. Command Line Parameters
-4. Sysfs Helper Files
-5. Radio Kill Switch
-6. Dynamic Firmware
-7. Power Management
-8. Support
-9. License
-
-
-0.   IMPORTANT INFORMATION BEFORE USING THIS DRIVER
------------------------------------------------
-
-Important Notice FOR ALL USERS OR DISTRIBUTORS!!!!
-
-Intel wireless LAN adapters are engineered, manufactured, tested, and
-quality checked to ensure that they meet all necessary local and
-governmental regulatory agency requirements for the regions that they
-are designated and/or marked to ship into. Since wireless LANs are
-generally unlicensed devices that share spectrum with radars,
-satellites, and other licensed and unlicensed devices, it is sometimes
-necessary to dynamically detect, avoid, and limit usage to avoid
-interference with these devices. In many instances Intel is required to
-provide test data to prove regional and local compliance to regional and
-governmental regulations before certification or approval to use the
-product is granted. Intel's wireless LAN's EEPROM, firmware, and
-software driver are designed to carefully control parameters that affect
-radio operation and to ensure electromagnetic compliance (EMC). These
-parameters include, without limitation, RF power, spectrum usage,
-channel scanning, and human exposure.
-
-For these reasons Intel cannot permit any manipulation by third parties
-of the software provided in binary format with the wireless WLAN
-adapters (e.g., the EEPROM and firmware). Furthermore, if you use any
-patches, utilities, or code with the Intel wireless LAN adapters that
-have been manipulated by an unauthorized party (i.e., patches,
-utilities, or code (including open source code modifications) which have
-not been validated by Intel), (i) you will be solely responsible for
-ensuring the regulatory compliance of the products, (ii) Intel will bear
-no liability, under any theory of liability for any issues associated
-with the modified products, including without limitation, claims under
-the warranty and/or issues arising from regulatory non-compliance, and
-(iii) Intel will not provide or be required to assist in providing
-support to any third parties for such modified products.
-
-Note: Many regulatory agencies consider Wireless LAN adapters to be
-modules, and accordingly, condition system-level regulatory approval
-upon receipt and review of test data documenting that the antennas and
-system configuration do not cause the EMC and radio operation to be
-non-compliant.
-
-The drivers available for download from SourceForge are provided as a
-part of a development project.  Conformance to local regulatory
-requirements is the responsibility of the individual developer.  As
-such, if you are interested in deploying or shipping a driver as part of
-solution intended to be used for purposes other than development, please
-obtain a tested driver from Intel Customer Support at:
-
-http://www.intel.com/support/wireless/sb/CS-006408.htm
-
-1. Introduction
------------------------------------------------
-
-This document provides a brief overview of the features supported by the 
-IPW2100 driver project.  The main project website, where the latest 
-development version of the driver can be found, is:
-
-	http://ipw2100.sourceforge.net
-
-There you can find the not only the latest releases, but also information about
-potential fixes and patches, as well as links to the development mailing list
-for the driver project.
-
-
-2. Release git-1.1.5 Current Supported Features
------------------------------------------------
-- Managed (BSS) and Ad-Hoc (IBSS)
-- WEP (shared key and open)
-- Wireless Tools support 
-- 802.1x (tested with XSupplicant 1.0.1)
-
-Enabled (but not supported) features:
-- Monitor/RFMon mode
-- WPA/WPA2
-
-The distinction between officially supported and enabled is a reflection
-on the amount of validation and interoperability testing that has been
-performed on a given feature.
-
-
-3. Command Line Parameters
------------------------------------------------
-
-If the driver is built as a module, the following optional parameters are used
-by entering them on the command line with the modprobe command using this
-syntax:
-
-	modprobe ipw2100 [<option>=<VAL1><,VAL2>...]
-
-For example, to disable the radio on driver loading, enter:
-
-	modprobe ipw2100 disable=1
-
-The ipw2100 driver supports the following module parameters:
-
-Name		Value		Example:
-debug		0x0-0xffffffff	debug=1024
-mode		0,1,2		mode=1   /* AdHoc */
-channel		int		channel=3 /* Only valid in AdHoc or Monitor */
-associate	boolean		associate=0 /* Do NOT auto associate */
-disable		boolean		disable=1 /* Do not power the HW */
-
-
-4. Sysfs Helper Files
----------------------------     
------------------------------------------------
-
-There are several ways to control the behavior of the driver.  Many of the 
-general capabilities are exposed through the Wireless Tools (iwconfig).  There
-are a few capabilities that are exposed through entries in the Linux Sysfs.
-
-
------ Driver Level ------
-For the driver level files, look in /sys/bus/pci/drivers/ipw2100/
-
-  debug_level  
-	
-	This controls the same global as the 'debug' module parameter.  For 
-        information on the various debugging levels available, run the 'dvals'
-	script found in the driver source directory.
-
-	NOTE:  'debug_level' is only enabled if CONFIG_IPW2100_DEBUG is turn
-	       on.
-
------ Device Level ------
-For the device level files look in
-	
-	/sys/bus/pci/drivers/ipw2100/{PCI-ID}/
-
-For example:
-	/sys/bus/pci/drivers/ipw2100/0000:02:01.0
-
-For the device level files, see /sys/bus/pci/drivers/ipw2100:
-
-  rf_kill
-	read - 
-	0 = RF kill not enabled (radio on)
-	1 = SW based RF kill active (radio off)
-	2 = HW based RF kill active (radio off)
-	3 = Both HW and SW RF kill active (radio off)
-	write -
-	0 = If SW based RF kill active, turn the radio back on
-	1 = If radio is on, activate SW based RF kill
-
-	NOTE: If you enable the SW based RF kill and then toggle the HW
-  	based RF kill from ON -> OFF -> ON, the radio will NOT come back on
-
-
-5. Radio Kill Switch
------------------------------------------------
-Most laptops provide the ability for the user to physically disable the radio.
-Some vendors have implemented this as a physical switch that requires no
-software to turn the radio off and on.  On other laptops, however, the switch
-is controlled through a button being pressed and a software driver then making
-calls to turn the radio off and on.  This is referred to as a "software based
-RF kill switch"
-
-See the Sysfs helper file 'rf_kill' for determining the state of the RF switch
-on your system.
-
-
-6. Dynamic Firmware
------------------------------------------------
-As the firmware is licensed under a restricted use license, it can not be 
-included within the kernel sources.  To enable the IPW2100 you will need a 
-firmware image to load into the wireless NIC's processors.
-
-You can obtain these images from <http://ipw2100.sf.net/firmware.php>.
-
-See INSTALL for instructions on installing the firmware.
-
-
-7. Power Management
------------------------------------------------
-The IPW2100 supports the configuration of the Power Save Protocol 
-through a private wireless extension interface.  The IPW2100 supports 
-the following different modes:
-
-	off	No power management.  Radio is always on.
-	on	Automatic power management
-	1-5	Different levels of power management.  The higher the 
-		number the greater the power savings, but with an impact to 
-		packet latencies. 
-
-Power management works by powering down the radio after a certain 
-interval of time has passed where no packets are passed through the 
-radio.  Once powered down, the radio remains in that state for a given 
-period of time.  For higher power savings, the interval between last 
-packet processed to sleep is shorter and the sleep period is longer.
-
-When the radio is asleep, the access point sending data to the station 
-must buffer packets at the AP until the station wakes up and requests 
-any buffered packets.  If you have an AP that does not correctly support 
-the PSP protocol you may experience packet loss or very poor performance 
-while power management is enabled.  If this is the case, you will need 
-to try and find a firmware update for your AP, or disable power 
-management (via `iwconfig eth1 power off`)
-
-To configure the power level on the IPW2100 you use a combination of 
-iwconfig and iwpriv.  iwconfig is used to turn power management on, off, 
-and set it to auto.
-
-	iwconfig eth1 power off    Disables radio power down
-	iwconfig eth1 power on     Enables radio power management to 
-				   last set level (defaults to AUTO)
-	iwpriv eth1 set_power 0    Sets power level to AUTO and enables 
-				   power management if not previously 
-				   enabled.
-	iwpriv eth1 set_power 1-5  Set the power level as specified, 
-				   enabling power management if not 
-				   previously enabled.
-
-You can view the current power level setting via:
-	
-	iwpriv eth1 get_power
-
-It will return the current period or timeout that is configured as a string
-in the form of xxxx/yyyy (z) where xxxx is the timeout interval (amount of
-time after packet processing), yyyy is the period to sleep (amount of time to 
-wait before powering the radio and querying the access point for buffered
-packets), and z is the 'power level'.  If power management is turned off the
-xxxx/yyyy will be replaced with 'off' -- the level reported will be the active
-level if `iwconfig eth1 power on` is invoked.
-
-
-8. Support
------------------------------------------------
-
-For general development information and support,
-go to:
-	
-    http://ipw2100.sf.net/
-
-The ipw2100 1.1.0 driver and firmware can be downloaded from:  
-
-    http://support.intel.com
-
-For installation support on the ipw2100 1.1.0 driver on Linux kernels 
-2.6.8 or greater, email support is available from:  
-
-    http://supportmail.intel.com
-
-9. License
------------------------------------------------
-
-  Copyright(c) 2003 - 2006 Intel Corporation. All rights reserved.
-
-  This program is free software; you can redistribute it and/or modify it 
-  under the terms of the GNU General Public License (version 2) as 
-  published by the Free Software Foundation.
-  
-  This program is distributed in the hope that it will be useful, but WITHOUT 
-  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 
-  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for 
-  more details.
-  
-  You should have received a copy of the GNU General Public License along with
-  this program; if not, write to the Free Software Foundation, Inc., 59 
-  Temple Place - Suite 330, Boston, MA  02111-1307, USA.
-  
-  The full GNU General Public License is included in this distribution in the
-  file called LICENSE.
-  
-  License Contact Information:
-  James P. Ketrenos <ipw2100-admin@linux.intel.com>
-  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
-
diff --git a/MAINTAINERS b/MAINTAINERS
index f0b18c156176..887c4e7e6102 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8742,7 +8742,7 @@ INTEL PRO/WIRELESS 2100, 2200BG, 2915ABG NETWORK CONNECTION SUPPORT
 M:	Stanislav Yakovlev <stas.yakovlev@gmail.com>
 L:	linux-wireless@vger.kernel.org
 S:	Maintained
-F:	Documentation/networking/device_drivers/intel/ipw2100.txt
+F:	Documentation/networking/device_drivers/intel/ipw2100.rst
 F:	Documentation/networking/device_drivers/intel/ipw2200.txt
 F:	drivers/net/wireless/intel/ipw2x00/
 
diff --git a/drivers/net/wireless/intel/ipw2x00/Kconfig b/drivers/net/wireless/intel/ipw2x00/Kconfig
index ab17903ba9f8..b0b3cd6296f3 100644
--- a/drivers/net/wireless/intel/ipw2x00/Kconfig
+++ b/drivers/net/wireless/intel/ipw2x00/Kconfig
@@ -16,7 +16,7 @@ config IPW2100
 	  A driver for the Intel PRO/Wireless 2100 Network
 	  Connection 802.11b wireless network adapter.
 
-	  See <file:Documentation/networking/device_drivers/intel/ipw2100.txt>
+	  See <file:Documentation/networking/device_drivers/intel/ipw2100.rst>
 	  for information on the capabilities currently enabled in this driver
 	  and for tips for debugging issues and problems.
 
diff --git a/drivers/net/wireless/intel/ipw2x00/ipw2100.c b/drivers/net/wireless/intel/ipw2x00/ipw2100.c
index 97ea6e2035e6..624fe721e2b5 100644
--- a/drivers/net/wireless/intel/ipw2x00/ipw2100.c
+++ b/drivers/net/wireless/intel/ipw2x00/ipw2100.c
@@ -8352,7 +8352,7 @@ static int ipw2100_mod_firmware_load(struct ipw2100_fw *fw)
 	if (IPW2100_FW_MAJOR(h->version) != IPW2100_FW_MAJOR_VERSION) {
 		printk(KERN_WARNING DRV_NAME ": Firmware image not compatible "
 		       "(detected version id of %u). "
-		       "See Documentation/networking/device_drivers/intel/ipw2100.txt\n",
+		       "See Documentation/networking/device_drivers/intel/ipw2100.rst\n",
 		       h->version);
 		return 1;
 	}
-- 
cgit v1.2.3


From c81f195703270a330f04ae41b9890b13c101a63f Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Fri, 1 May 2020 16:44:47 +0200
Subject: docs: networking: device drivers: convert intel/ipw2200.txt to ReST

- add SPDX header;
- adjust titles and chapters, adding proper markups;
- comment out text-only TOC from html/pdf output;
- use copyright symbol;
- use :field: markup;
- mark code blocks and literals as such;
- mark tables as such;
- adjust indentation, whitespaces and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/device_drivers/index.rst  |   1 +
 .../networking/device_drivers/intel/ipw2200.rst    | 526 +++++++++++++++++++++
 .../networking/device_drivers/intel/ipw2200.txt    | 472 ------------------
 MAINTAINERS                                        |   2 +-
 drivers/net/wireless/intel/ipw2x00/Kconfig         |   2 +-
 5 files changed, 529 insertions(+), 474 deletions(-)
 create mode 100644 Documentation/networking/device_drivers/intel/ipw2200.rst
 delete mode 100644 Documentation/networking/device_drivers/intel/ipw2200.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst
index 54ed10f3d1a7..f9ce0089ec7d 100644
--- a/Documentation/networking/device_drivers/index.rst
+++ b/Documentation/networking/device_drivers/index.rst
@@ -40,6 +40,7 @@ Contents:
    freescale/dpaa
    freescale/gianfar
    intel/ipw2100
+   intel/ipw2200
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/device_drivers/intel/ipw2200.rst b/Documentation/networking/device_drivers/intel/ipw2200.rst
new file mode 100644
index 000000000000..0cb42d2fd7e5
--- /dev/null
+++ b/Documentation/networking/device_drivers/intel/ipw2200.rst
@@ -0,0 +1,526 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+==============================================
+Intel(R) PRO/Wireless 2915ABG Driver for Linux
+==============================================
+
+
+Support for:
+
+- Intel(R) PRO/Wireless 2200BG Network Connection
+- Intel(R) PRO/Wireless 2915ABG Network Connection
+
+Note: The Intel(R) PRO/Wireless 2915ABG Driver for Linux and Intel(R)
+PRO/Wireless 2200BG Driver for Linux is a unified driver that works on
+both hardware adapters listed above. In this document the Intel(R)
+PRO/Wireless 2915ABG Driver for Linux will be used to reference the
+unified driver.
+
+Copyright |copy| 2004-2006, Intel Corporation
+
+README.ipw2200
+
+:Version: 1.1.2
+:Date: March 30, 2006
+
+
+.. Index
+
+    0.   IMPORTANT INFORMATION BEFORE USING THIS DRIVER
+    1.   Introduction
+    1.1. Overview of features
+    1.2. Module parameters
+    1.3. Wireless Extension Private Methods
+    1.4. Sysfs Helper Files
+    1.5. Supported channels
+    2.   Ad-Hoc Networking
+    3.   Interacting with Wireless Tools
+    3.1. iwconfig mode
+    3.2. iwconfig sens
+    4.   About the Version Numbers
+    5.   Firmware installation
+    6.   Support
+    7.   License
+
+
+0. IMPORTANT INFORMATION BEFORE USING THIS DRIVER
+=================================================
+
+Important Notice FOR ALL USERS OR DISTRIBUTORS!!!!
+
+Intel wireless LAN adapters are engineered, manufactured, tested, and
+quality checked to ensure that they meet all necessary local and
+governmental regulatory agency requirements for the regions that they
+are designated and/or marked to ship into. Since wireless LANs are
+generally unlicensed devices that share spectrum with radars,
+satellites, and other licensed and unlicensed devices, it is sometimes
+necessary to dynamically detect, avoid, and limit usage to avoid
+interference with these devices. In many instances Intel is required to
+provide test data to prove regional and local compliance to regional and
+governmental regulations before certification or approval to use the
+product is granted. Intel's wireless LAN's EEPROM, firmware, and
+software driver are designed to carefully control parameters that affect
+radio operation and to ensure electromagnetic compliance (EMC). These
+parameters include, without limitation, RF power, spectrum usage,
+channel scanning, and human exposure.
+
+For these reasons Intel cannot permit any manipulation by third parties
+of the software provided in binary format with the wireless WLAN
+adapters (e.g., the EEPROM and firmware). Furthermore, if you use any
+patches, utilities, or code with the Intel wireless LAN adapters that
+have been manipulated by an unauthorized party (i.e., patches,
+utilities, or code (including open source code modifications) which have
+not been validated by Intel), (i) you will be solely responsible for
+ensuring the regulatory compliance of the products, (ii) Intel will bear
+no liability, under any theory of liability for any issues associated
+with the modified products, including without limitation, claims under
+the warranty and/or issues arising from regulatory non-compliance, and
+(iii) Intel will not provide or be required to assist in providing
+support to any third parties for such modified products.
+
+Note: Many regulatory agencies consider Wireless LAN adapters to be
+modules, and accordingly, condition system-level regulatory approval
+upon receipt and review of test data documenting that the antennas and
+system configuration do not cause the EMC and radio operation to be
+non-compliant.
+
+The drivers available for download from SourceForge are provided as a
+part of a development project.  Conformance to local regulatory
+requirements is the responsibility of the individual developer.  As
+such, if you are interested in deploying or shipping a driver as part of
+a solution intended to be used for purposes other than development, please
+obtain a tested driver from Intel Customer Support at:
+
+http://support.intel.com
+
+
+1. Introduction
+===============
+
+The following sections attempt to provide a brief introduction to using
+the Intel(R) PRO/Wireless 2915ABG Driver for Linux.
+
+This document is not meant to be a comprehensive manual on
+understanding or using wireless technologies, but should be sufficient
+to get you moving without wires on Linux.
+
+For information on building and installing the driver, see the INSTALL
+file.
+
+
+1.1. Overview of Features
+-------------------------
+The current release (1.1.2) supports the following features:
+
++ BSS mode (Infrastructure, Managed)
++ IBSS mode (Ad-Hoc)
++ WEP (OPEN and SHARED KEY mode)
++ 802.1x EAP via wpa_supplicant and xsupplicant
++ Wireless Extension support
++ Full B and G rate support (2200 and 2915)
++ Full A rate support (2915 only)
++ Transmit power control
++ S state support (ACPI suspend/resume)
+
+The following features are currently enabled, but not officially
+supported:
+
++ WPA
++ long/short preamble support
++ Monitor mode (aka RFMon)
+
+The distinction between officially supported and enabled is a reflection
+on the amount of validation and interoperability testing that has been
+performed on a given feature.
+
+
+
+1.2. Command Line Parameters
+----------------------------
+
+Like many modules used in the Linux kernel, the Intel(R) PRO/Wireless
+2915ABG Driver for Linux allows configuration options to be provided
+as module parameters.  The most common way to specify a module parameter
+is via the command line.
+
+The general form is::
+
+    % modprobe ipw2200 parameter=value
+
+Where the supported parameters are:
+
+  associate
+	Set to 0 to disable the auto scan-and-associate functionality of the
+	driver.  If disabled, the driver will not attempt to scan
+	for and associate to a network until it has been configured with
+	one or more properties for the target network, for example configuring
+	the network SSID.  Default is 0 (do not auto-associate)
+
+	Example: % modprobe ipw2200 associate=0
+
+  auto_create
+	Set to 0 to disable the auto creation of an Ad-Hoc network
+	matching the channel and network name parameters provided.
+	Default is 1.
+
+  channel
+	Channel number for association.  The normal method for setting
+	the channel is to use the standard wireless tools
+	(e.g. ``iwconfig eth1 channel 10``), but it is sometimes useful
+	to set this while debugging.  Channel 0 means 'ANY'.
+
+  debug
+	If using a debug build, this is used to control the amount of debug
+	info that is logged.  See the 'dvals' and 'load' scripts for more info on
+	how to use this (the dvals and load scripts are provided as part
+	of the ipw2200 development snapshot releases available from the
+	SourceForge project at http://ipw2200.sf.net)
+
+  led
+	Can be used to turn on experimental LED code.
+	0 = Off, 1 = On.  Default is 1.
+
+  mode
+	Can be used to set the default mode of the adapter.
+	0 = Managed, 1 = Ad-Hoc, 2 = Monitor
+
+
+1.3. Wireless Extension Private Methods
+---------------------------------------
+
+Because the Wireless Tools interface is designed to handle generic
+hardware, certain capabilities are not exposed through it.  As such,
+a provision is made for a driver to declare custom, or private,
+methods.  The Intel(R) PRO/Wireless 2915ABG Driver for Linux
+defines several of these to configure various settings.
+
+The general form of using the private wireless methods is::
+
+	% iwpriv $IFNAME method parameters
+
+Where $IFNAME is the interface name the device is registered with
+(typically eth1, unless customized via one of the various network
+interface name managers, such as ifrename).
+
+The supported private methods are:
+
+  get_mode
+	Can be used to report which IEEE mode the driver is
+	configured to support.  Example::
+
+	    % iwpriv eth1 get_mode
+	    eth1	get_mode:802.11bg (6)
+
+  set_mode
+	Can be used to configure which IEEE mode the driver will
+	support.
+
+	Usage::
+
+	    % iwpriv eth1 set_mode {mode}
+
+	Where {mode} is a number in the range 1-7:
+
+	==	=====================
+	1	802.11a (2915 only)
+	2	802.11b
+	3	802.11ab (2915 only)
+	4	802.11g
+	5	802.11ag (2915 only)
+	6	802.11bg
+	7	802.11abg (2915 only)
+	==	=====================
+
+  get_preamble
+	Can be used to report configuration of preamble length.
+
+  set_preamble
+	Can be used to set the configuration of preamble length:
+
+	Usage::
+
+	    % iwpriv eth1 set_preamble {mode}
+
+	Where {mode} is one of:
+
+	==	========================================
+	1	Long preamble only
+	0	Auto (long or short based on connection)
+	==	========================================
+
+
+1.4. Sysfs Helper Files
+-----------------------
+
+The Linux kernel provides a pseudo file system that can be used to
+access various components of the operating system.  The Intel(R)
+PRO/Wireless 2915ABG Driver for Linux exposes several configuration
+parameters through this mechanism.
+
+An entry in the sysfs can support reading and/or writing.  You can
+typically query the contents of a sysfs entry through the use of cat,
+and can set the contents via echo.  For example::
+
+    % cat /sys/bus/pci/drivers/ipw2200/debug_level
+
+Will report the current debug level of the driver's logging subsystem
+(only available if CONFIG_IPW2200_DEBUG was configured when the driver
+was built).
+
+You can set the debug level via::
+
+    % echo $VALUE > /sys/bus/pci/drivers/ipw2200/debug_level
+
+Where $VALUE would be a number in the case of this sysfs entry.  The
+input to sysfs files does not have to be a number.  For example, the
+firmware loader used by hotplug utilizes sysfs entries for transferring
+the firmware image from user space into the driver.
+
+The Intel(R) PRO/Wireless 2915ABG Driver for Linux exposes sysfs entries
+at two levels -- driver level, which applies to all instances of the
+driver (in the event that more than one device is installed), and device
+level, which applies only to a single specific instance.
+
+
+1.4.1 Driver Level Sysfs Helper Files
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+For the driver level files, look in /sys/bus/pci/drivers/ipw2200/
+
+  debug_level
+	This controls the same global as the 'debug' module parameter
+
+
+
+1.4.2 Device Level Sysfs Helper Files
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+For the device level files, look in::
+
+	/sys/bus/pci/drivers/ipw2200/{PCI-ID}/
+
+For example::
+
+	/sys/bus/pci/drivers/ipw2200/0000:02:01.0
+
+The device level files are:
+
+  rf_kill
+	read -
+
+	==  =========================================
+	0   RF kill not enabled (radio on)
+	1   SW based RF kill active (radio off)
+	2   HW based RF kill active (radio off)
+	3   Both HW and SW RF kill active (radio off)
+	==  =========================================
+
+	write -
+
+	==  ==================================================
+	0   If SW based RF kill active, turn the radio back on
+	1   If radio is on, activate SW based RF kill
+	==  ==================================================
+
+	.. note::
+
+	   If you enable the SW based RF kill and then toggle the HW
+	   based RF kill from ON -> OFF -> ON, the radio will NOT come back on
+
+  ucode
+	read-only access to the ucode version number
+
+  led
+	read -
+
+	==  =================
+	0   LED code disabled
+	1   LED code enabled
+	==  =================
+
+	write -
+
+	==  ================
+	0   Disable LED code
+	1   Enable LED code
+	==  ================
+
+
+	.. note::
+
+	   The LED code has been reported to hang some systems when
+	   running ifconfig and is therefore disabled by default.
+
+
+1.5. Supported channels
+-----------------------
+
+Upon loading the Intel(R) PRO/Wireless 2915ABG Driver for Linux, a
+message stating the detected geography code and the number of 802.11
+channels supported by the card will be displayed in the log.
+
+The geography code corresponds to a regulatory domain as shown in the
+table below.
+
+	+------+----------------------------+--------------------+
+	|      |			    | Supported channels |
+	| Code |        Geography	    +----------+---------+
+	|      |			    | 802.11bg | 802.11a |
+	+======+============================+==========+=========+
+	| ---  | Restricted 		    |  11      |   0     |
+	+------+----------------------------+----------+---------+
+	| ZZF  | Custom US/Canada 	    |  11      |   8     |
+	+------+----------------------------+----------+---------+
+	| ZZD  | Rest of World 		    |  13      |   0     |
+	+------+----------------------------+----------+---------+
+	| ZZA  | Custom USA & Europe & High |  11      |  13     |
+	+------+----------------------------+----------+---------+
+	| ZZB  | Custom NA & Europe	    |  11      |  13     |
+	+------+----------------------------+----------+---------+
+	| ZZC  | Custom Japan 		    |  11      |   4     |
+	+------+----------------------------+----------+---------+
+	| ZZM  | Custom  		    |  11      |   0     |
+	+------+----------------------------+----------+---------+
+	| ZZE  | Europe 		    |  13      |  19     |
+	+------+----------------------------+----------+---------+
+	| ZZJ  | Custom Japan 		    |  14      |   4     |
+	+------+----------------------------+----------+---------+
+	| ZZR  | Rest of World		    |  14      |   0     |
+	+------+----------------------------+----------+---------+
+	| ZZH  | High Band		    |  13      |   4     |
+	+------+----------------------------+----------+---------+
+	| ZZG  | Custom Europe		    |  13      |   4     |
+	+------+----------------------------+----------+---------+
+	| ZZK  | Europe 		    |  13      |  24     |
+	+------+----------------------------+----------+---------+
+	| ZZL  | Europe 		    |  11      |  13     |
+	+------+----------------------------+----------+---------+
+
+2.  Ad-Hoc Networking
+=====================
+
+When using a device in an Ad-Hoc network, it is useful to understand the
+sequence and requirements for the driver to be able to create, join, or
+merge networks.
+
+The following attempts to provide enough information so that you can
+have a consistent experience while using the driver as a member of an
+Ad-Hoc network.
+
+2.1. Joining an Ad-Hoc Network
+------------------------------
+
+The easiest way to get onto an Ad-Hoc network is to join one that
+already exists.
+
+2.2. Creating an Ad-Hoc Network
+-------------------------------
+
+An Ad-Hoc network is created using the syntax of the Wireless Tools.
+
+For example:
+``iwconfig eth1 mode ad-hoc essid testing channel 2``
+
+2.3. Merging Ad-Hoc Networks
+----------------------------
+
+
+3. Interaction with Wireless Tools
+==================================
+
+3.1 iwconfig mode
+-----------------
+
+When configuring the mode of the adapter, all run-time configured parameters
+are reset to the value used when the module was loaded.  This includes
+channels, rates, ESSID, etc.
+
+3.2 iwconfig sens
+-----------------
+
+The 'iwconfig ethX sens XX' command will not set the signal sensitivity
+threshold, as described in iwconfig documentation, but rather the number
+of consecutive missed beacons that will trigger handover, i.e. roaming
+to another access point. At the same time, it will set the disassociation
+threshold to 3 times the given value.
+
+
+4.  About the Version Numbers
+=============================
+
+Due to the nature of open source development projects, there are
+frequently changes being incorporated that have not gone through
+a complete validation process.  These changes are incorporated into
+development snapshot releases.
+
+Releases are numbered with a three-level scheme::
+
+	major.minor.development
+
+Any version where the 'development' portion is 0 (for example
+1.0.0, 1.1.0, etc.) indicates a stable version that will be made
+available for kernel inclusion.
+
+Any version where the 'development' portion is not a 0 (for
+example 1.0.1, 1.1.5, etc.) indicates a development version that is
+being made available for testing and cutting edge users.  The stability
+and functionality of the development releases are not known.  We make
+efforts to try and keep all snapshots reasonably stable, but due to the
+frequency of their release, and the desire to get those releases
+available as quickly as possible, unknown anomalies should be expected.
+
+The major version number will be incremented when significant changes
+are made to the driver.  Currently, there are no major changes planned.
+
+5. Firmware installation
+========================
+
+The driver requires a firmware image.  Download it and extract the
+files under /lib/firmware (or wherever your hotplug's firmware.agent
+will look for firmware files).
+
+The firmware can be downloaded from the following URL:
+
+    http://ipw2200.sf.net/
+
+
+6. Support
+==========
+
+For direct support of the 1.0.0 version, you can contact
+http://supportmail.intel.com, or you can use the open source project
+support.
+
+For general information and support, go to:
+
+    http://ipw2200.sf.net/
+
+
+7. License
+==========
+
+  Copyright |copy| 2003 - 2006 Intel Corporation. All rights reserved.
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms of the GNU General Public License version 2 as
+  published by the Free Software Foundation.
+
+  This program is distributed in the hope that it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc., 59
+  Temple Place - Suite 330, Boston, MA  02111-1307, USA.
+
+  The full GNU General Public License is included in this distribution in the
+  file called LICENSE.
+
+  Contact Information:
+
+  James P. Ketrenos <ipw2100-admin@linux.intel.com>
+
+  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
diff --git a/Documentation/networking/device_drivers/intel/ipw2200.txt b/Documentation/networking/device_drivers/intel/ipw2200.txt
deleted file mode 100644
index b7658bed4906..000000000000
--- a/Documentation/networking/device_drivers/intel/ipw2200.txt
+++ /dev/null
@@ -1,472 +0,0 @@
-
-Intel(R) PRO/Wireless 2915ABG Driver for Linux in support of:
-
-Intel(R) PRO/Wireless 2200BG Network Connection
-Intel(R) PRO/Wireless 2915ABG Network Connection
-
-Note: The Intel(R) PRO/Wireless 2915ABG Driver for Linux and Intel(R)
-PRO/Wireless 2200BG Driver for Linux is a unified driver that works on
-both hardware adapters listed above. In this document the Intel(R)
-PRO/Wireless 2915ABG Driver for Linux will be used to reference the
-unified driver.
-
-Copyright (C) 2004-2006, Intel Corporation
-
-README.ipw2200
-
-Version: 1.1.2
-Date   : March 30, 2006
-
-
-Index
------------------------------------------------
-0.   IMPORTANT INFORMATION BEFORE USING THIS DRIVER
-1.   Introduction
-1.1. Overview of features
-1.2. Module parameters
-1.3. Wireless Extension Private Methods
-1.4. Sysfs Helper Files
-1.5. Supported channels
-2.   Ad-Hoc Networking
-3.   Interacting with Wireless Tools
-3.1. iwconfig mode
-3.2. iwconfig sens
-4.   About the Version Numbers
-5.   Firmware installation
-6.   Support
-7.   License
-
-
-0.   IMPORTANT INFORMATION BEFORE USING THIS DRIVER
------------------------------------------------
-
-Important Notice FOR ALL USERS OR DISTRIBUTORS!!!! 
-
-Intel wireless LAN adapters are engineered, manufactured, tested, and
-quality checked to ensure that they meet all necessary local and
-governmental regulatory agency requirements for the regions that they
-are designated and/or marked to ship into. Since wireless LANs are
-generally unlicensed devices that share spectrum with radars,
-satellites, and other licensed and unlicensed devices, it is sometimes
-necessary to dynamically detect, avoid, and limit usage to avoid
-interference with these devices. In many instances Intel is required to
-provide test data to prove regional and local compliance to regional and
-governmental regulations before certification or approval to use the
-product is granted. Intel's wireless LAN's EEPROM, firmware, and
-software driver are designed to carefully control parameters that affect
-radio operation and to ensure electromagnetic compliance (EMC). These
-parameters include, without limitation, RF power, spectrum usage,
-channel scanning, and human exposure. 
-
-For these reasons Intel cannot permit any manipulation by third parties
-of the software provided in binary format with the wireless WLAN
-adapters (e.g., the EEPROM and firmware). Furthermore, if you use any
-patches, utilities, or code with the Intel wireless LAN adapters that
-have been manipulated by an unauthorized party (i.e., patches,
-utilities, or code (including open source code modifications) which have
-not been validated by Intel), (i) you will be solely responsible for
-ensuring the regulatory compliance of the products, (ii) Intel will bear
-no liability, under any theory of liability for any issues associated
-with the modified products, including without limitation, claims under
-the warranty and/or issues arising from regulatory non-compliance, and
-(iii) Intel will not provide or be required to assist in providing
-support to any third parties for such modified products.  
-
-Note: Many regulatory agencies consider Wireless LAN adapters to be
-modules, and accordingly, condition system-level regulatory approval
-upon receipt and review of test data documenting that the antennas and
-system configuration do not cause the EMC and radio operation to be
-non-compliant.
-
-The drivers available for download from SourceForge are provided as a 
-part of a development project.  Conformance to local regulatory 
-requirements is the responsibility of the individual developer.  As 
-such, if you are interested in deploying or shipping a driver as part of 
-solution intended to be used for purposes other than development, please 
-obtain a tested driver from Intel Customer Support at:
-
-http://support.intel.com
-
-
-1.   Introduction
------------------------------------------------
-The following sections attempt to provide a brief introduction to using 
-the Intel(R) PRO/Wireless 2915ABG Driver for Linux.
-
-This document is not meant to be a comprehensive manual on 
-understanding or using wireless technologies, but should be sufficient 
-to get you moving without wires on Linux.
-
-For information on building and installing the driver, see the INSTALL
-file.
-
-
-1.1. Overview of Features
------------------------------------------------
-The current release (1.1.2) supports the following features:
-
-+ BSS mode (Infrastructure, Managed)
-+ IBSS mode (Ad-Hoc)
-+ WEP (OPEN and SHARED KEY mode)
-+ 802.1x EAP via wpa_supplicant and xsupplicant
-+ Wireless Extension support 
-+ Full B and G rate support (2200 and 2915)
-+ Full A rate support (2915 only)
-+ Transmit power control
-+ S state support (ACPI suspend/resume)
-
-The following features are currently enabled, but not officially
-supported:
-
-+ WPA
-+ long/short preamble support
-+ Monitor mode (aka RFMon)
-
-The distinction between officially supported and enabled is a reflection 
-on the amount of validation and interoperability testing that has been
-performed on a given feature. 
-
-
-
-1.2. Command Line Parameters
------------------------------------------------
-
-Like many modules used in the Linux kernel, the Intel(R) PRO/Wireless
-2915ABG Driver for Linux allows configuration options to be provided 
-as module parameters.  The most common way to specify a module parameter 
-is via the command line.  
-
-The general form is:
-
-% modprobe ipw2200 parameter=value
-
-Where the supported parameter are:
-
-  associate
-	Set to 0 to disable the auto scan-and-associate functionality of the
-	driver.  If disabled, the driver will not attempt to scan 
-	for and associate to a network until it has been configured with 
-	one or more properties for the target network, for example configuring 
-	the network SSID.  Default is 0 (do not auto-associate)
-	
-	Example: % modprobe ipw2200 associate=0
-
-  auto_create
-	Set to 0 to disable the auto creation of an Ad-Hoc network 
-	matching the channel and network name parameters provided.  
-	Default is 1.
-
-  channel
-	channel number for association.  The normal method for setting
-        the channel would be to use the standard wireless tools
-        (i.e. `iwconfig eth1 channel 10`), but it is useful sometimes
-	to set this while debugging.  Channel 0 means 'ANY'
-
-  debug
-	If using a debug build, this is used to control the amount of debug
-	info is logged.  See the 'dvals' and 'load' script for more info on
-	how to use this (the dvals and load scripts are provided as part 
-	of the ipw2200 development snapshot releases available from the 
-	SourceForge project at http://ipw2200.sf.net)
-  
-  led
-	Can be used to turn on experimental LED code.
-	0 = Off, 1 = On.  Default is 1.
-
-  mode
-	Can be used to set the default mode of the adapter.  
-	0 = Managed, 1 = Ad-Hoc, 2 = Monitor
-
-
-1.3. Wireless Extension Private Methods
------------------------------------------------
-
-As an interface designed to handle generic hardware, there are certain 
-capabilities not exposed through the normal Wireless Tool interface.  As 
-such, a provision is provided for a driver to declare custom, or 
-private, methods.  The Intel(R) PRO/Wireless 2915ABG Driver for Linux 
-defines several of these to configure various settings.
-
-The general form of using the private wireless methods is:
-
-	% iwpriv $IFNAME method parameters
-
-Where $IFNAME is the interface name the device is registered with 
-(typically eth1, customized via one of the various network interface
-name managers, such as ifrename)
-
-The supported private methods are:
-
-  get_mode
-	Can be used to report out which IEEE mode the driver is 
-	configured to support.  Example:
-	
-	% iwpriv eth1 get_mode
-	eth1	get_mode:802.11bg (6)
-
-  set_mode
-	Can be used to configure which IEEE mode the driver will 
-	support.  
-
-	Usage:
-	% iwpriv eth1 set_mode {mode}
-	Where {mode} is a number in the range 1-7:
-	1	802.11a (2915 only)
-	2	802.11b
-	3	802.11ab (2915 only)
-	4	802.11g 
-	5	802.11ag (2915 only)
-	6	802.11bg
-	7	802.11abg (2915 only)
-
-  get_preamble
-	Can be used to report configuration of preamble length.
-
-  set_preamble
-	Can be used to set the configuration of preamble length:
-
-	Usage:
-	% iwpriv eth1 set_preamble {mode}
-	Where {mode} is one of:
-	1	Long preamble only
-	0	Auto (long or short based on connection)
-	
-
-1.4. Sysfs Helper Files:
------------------------------------------------
-
-The Linux kernel provides a pseudo file system that can be used to 
-access various components of the operating system.  The Intel(R)
-PRO/Wireless 2915ABG Driver for Linux exposes several configuration
-parameters through this mechanism.
-
-An entry in the sysfs can support reading and/or writing.  You can 
-typically query the contents of a sysfs entry through the use of cat, 
-and can set the contents via echo.  For example:
-
-% cat /sys/bus/pci/drivers/ipw2200/debug_level
-
-Will report the current debug level of the driver's logging subsystem 
-(only available if CONFIG_IPW2200_DEBUG was configured when the driver
-was built).
-
-You can set the debug level via:
-
-% echo $VALUE > /sys/bus/pci/drivers/ipw2200/debug_level
-
-Where $VALUE would be a number in the case of this sysfs entry.  The 
-input to sysfs files does not have to be a number.  For example, the 
-firmware loader used by hotplug utilizes sysfs entries for transferring 
-the firmware image from user space into the driver.
-
-The Intel(R) PRO/Wireless 2915ABG Driver for Linux exposes sysfs entries 
-at two levels -- driver level, which apply to all instances of the driver 
-(in the event that there are more than one device installed) and device 
-level, which applies only to the single specific instance.
-
-
-1.4.1 Driver Level Sysfs Helper Files
------------------------------------------------
-
-For the driver level files, look in /sys/bus/pci/drivers/ipw2200/
-
-  debug_level  
-	
-	This controls the same global as the 'debug' module parameter
-
-
-
-1.4.2 Device Level Sysfs Helper Files
------------------------------------------------
-
-For the device level files, look in
-	
-	/sys/bus/pci/drivers/ipw2200/{PCI-ID}/
-
-For example:
-	/sys/bus/pci/drivers/ipw2200/0000:02:01.0
-
-For the device level files, see /sys/bus/pci/drivers/ipw2200:
-
-  rf_kill
-	read - 
-	0 = RF kill not enabled (radio on)
-	1 = SW based RF kill active (radio off)
-	2 = HW based RF kill active (radio off)
-	3 = Both HW and SW RF kill active (radio off)
-	write -
-	0 = If SW based RF kill active, turn the radio back on
-	1 = If radio is on, activate SW based RF kill
-
-	NOTE: If you enable the SW based RF kill and then toggle the HW
-  	based RF kill from ON -> OFF -> ON, the radio will NOT come back on
-	
-  ucode 
-	read-only access to the ucode version number
-
-  led
-	read -
-	0 = LED code disabled
-	1 = LED code enabled
-	write -
-	0 = Disable LED code
-	1 = Enable LED code
-
-	NOTE: The LED code has been reported to hang some systems when 
-	running ifconfig and is therefore disabled by default.
-
-
-1.5. Supported channels
------------------------------------------------
-
-Upon loading the Intel(R) PRO/Wireless 2915ABG Driver for Linux, a
-message stating the detected geography code and the number of 802.11
-channels supported by the card will be displayed in the log.
-
-The geography code corresponds to a regulatory domain as shown in the
-table below.
-
-					  Supported channels
-Code	Geography			802.11bg	802.11a
-
----	Restricted			11 	 	 0
-ZZF	Custom US/Canada		11	 	 8
-ZZD	Rest of World			13	 	 0
-ZZA	Custom USA & Europe & High	11		13
-ZZB	Custom NA & Europe    		11		13
-ZZC	Custom Japan			11	 	 4
-ZZM	Custom 				11	 	 0
-ZZE	Europe				13		19
-ZZJ	Custom Japan			14	 	 4
-ZZR	Rest of World			14	 	 0
-ZZH	High Band			13	 	 4
-ZZG	Custom Europe			13	 	 4
-ZZK	Europe 				13		24
-ZZL	Europe				11		13
-
-
-2.   Ad-Hoc Networking
------------------------------------------------
-
-When using a device in an Ad-Hoc network, it is useful to understand the 
-sequence and requirements for the driver to be able to create, join, or 
-merge networks.
-
-The following attempts to provide enough information so that you can 
-have a consistent experience while using the driver as a member of an 
-Ad-Hoc network.
-
-2.1. Joining an Ad-Hoc Network
------------------------------------------------
-
-The easiest way to get onto an Ad-Hoc network is to join one that 
-already exists.
-
-2.2. Creating an Ad-Hoc Network
------------------------------------------------
-
-An Ad-Hoc networks is created using the syntax of the Wireless tool.
-
-For Example:
-iwconfig eth1 mode ad-hoc essid testing channel 2
-
-2.3. Merging Ad-Hoc Networks
------------------------------------------------
-
-
-3.  Interaction with Wireless Tools
------------------------------------------------
-
-3.1 iwconfig mode
------------------------------------------------
-
-When configuring the mode of the adapter, all run-time configured parameters
-are reset to the value used when the module was loaded.  This includes
-channels, rates, ESSID, etc.
-
-3.2 iwconfig sens
------------------------------------------------
-
-The 'iwconfig ethX sens XX' command will not set the signal sensitivity
-threshold, as described in iwconfig documentation, but rather the number
-of consecutive missed beacons that will trigger handover, i.e. roaming
-to another access point. At the same time, it will set the disassociation
-threshold to 3 times the given value.
-
-
-4.   About the Version Numbers
------------------------------------------------
-
-Due to the nature of open source development projects, there are 
-frequently changes being incorporated that have not gone through 
-a complete validation process.  These changes are incorporated into 
-development snapshot releases.
-
-Releases are numbered with a three level scheme: 
-
-	major.minor.development
-
-Any version where the 'development' portion is 0 (for example
-1.0.0, 1.1.0, etc.) indicates a stable version that will be made 
-available for kernel inclusion.
-
-Any version where the 'development' portion is not a 0 (for
-example 1.0.1, 1.1.5, etc.) indicates a development version that is
-being made available for testing and cutting edge users.  The stability 
-and functionality of the development releases are not know.  We make
-efforts to try and keep all snapshots reasonably stable, but due to the
-frequency of their release, and the desire to get those releases 
-available as quickly as possible, unknown anomalies should be expected.
-
-The major version number will be incremented when significant changes
-are made to the driver.  Currently, there are no major changes planned.
-
-5.  Firmware installation
-----------------------------------------------
-
-The driver requires a firmware image, download it and extract the
-files under /lib/firmware (or wherever your hotplug's firmware.agent
-will look for firmware files)
-
-The firmware can be downloaded from the following URL:
-
-    http://ipw2200.sf.net/
-
-
-6.  Support
------------------------------------------------
-
-For direct support of the 1.0.0 version, you can contact 
-http://supportmail.intel.com, or you can use the open source project
-support.
-
-For general information and support, go to:
-	
-    http://ipw2200.sf.net/
-
-
-7.  License
------------------------------------------------
-
-  Copyright(c) 2003 - 2006 Intel Corporation. All rights reserved.
-
-  This program is free software; you can redistribute it and/or modify it 
-  under the terms of the GNU General Public License version 2 as 
-  published by the Free Software Foundation.
-  
-  This program is distributed in the hope that it will be useful, but WITHOUT 
-  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 
-  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for 
-  more details.
-  
-  You should have received a copy of the GNU General Public License along with
-  this program; if not, write to the Free Software Foundation, Inc., 59 
-  Temple Place - Suite 330, Boston, MA  02111-1307, USA.
-  
-  The full GNU General Public License is included in this distribution in the
-  file called LICENSE.
-  
-  Contact Information:
-  James P. Ketrenos <ipw2100-admin@linux.intel.com>
-  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
-
diff --git a/MAINTAINERS b/MAINTAINERS
index 887c4e7e6102..107decaf0ac0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8743,7 +8743,7 @@ M:	Stanislav Yakovlev <stas.yakovlev@gmail.com>
 L:	linux-wireless@vger.kernel.org
 S:	Maintained
 F:	Documentation/networking/device_drivers/intel/ipw2100.rst
-F:	Documentation/networking/device_drivers/intel/ipw2200.txt
+F:	Documentation/networking/device_drivers/intel/ipw2200.rst
 F:	drivers/net/wireless/intel/ipw2x00/
 
 INTEL PSTATE DRIVER
diff --git a/drivers/net/wireless/intel/ipw2x00/Kconfig b/drivers/net/wireless/intel/ipw2x00/Kconfig
index b0b3cd6296f3..f42b3cdce611 100644
--- a/drivers/net/wireless/intel/ipw2x00/Kconfig
+++ b/drivers/net/wireless/intel/ipw2x00/Kconfig
@@ -78,7 +78,7 @@ config IPW2200
 	  A driver for the Intel PRO/Wireless 2200BG and 2915ABG Network
 	  Connection adapters.
 
-	  See <file:Documentation/networking/device_drivers/intel/ipw2200.txt>
+	  See <file:Documentation/networking/device_drivers/intel/ipw2200.rst>
 	  for information on the capabilities currently enabled in this
 	  driver and for tips for debugging issues and problems.
 
-- 
cgit v1.2.3


From 011531f7e525983f0bf2060fb4f048f580606d74 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Fri, 1 May 2020 16:44:48 +0200
Subject: docs: networking: device drivers: convert microsoft/netvsc.txt to
 ReST

- add SPDX header;
- adjust titles and chapters, adding proper markups;
- mark code blocks and literals as such;
- adjust indentation, whitespace and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/device_drivers/index.rst  |   1 +
 .../networking/device_drivers/microsoft/netvsc.rst | 116 +++++++++++++++++++++
 .../networking/device_drivers/microsoft/netvsc.txt | 105 -------------------
 MAINTAINERS                                        |   2 +-
 4 files changed, 118 insertions(+), 106 deletions(-)
 create mode 100644 Documentation/networking/device_drivers/microsoft/netvsc.rst
 delete mode 100644 Documentation/networking/device_drivers/microsoft/netvsc.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst
index f9ce0089ec7d..575f0043b03e 100644
--- a/Documentation/networking/device_drivers/index.rst
+++ b/Documentation/networking/device_drivers/index.rst
@@ -41,6 +41,7 @@ Contents:
    freescale/gianfar
    intel/ipw2100
    intel/ipw2200
+   microsoft/netvsc
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/device_drivers/microsoft/netvsc.rst b/Documentation/networking/device_drivers/microsoft/netvsc.rst
new file mode 100644
index 000000000000..c3f51c672a68
--- /dev/null
+++ b/Documentation/networking/device_drivers/microsoft/netvsc.rst
@@ -0,0 +1,116 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================
+Hyper-V network driver
+======================
+
+Compatibility
+=============
+
+This driver is compatible with Windows Server 2012 R2, 2016 and
+Windows 10.
+
+Features
+========
+
+Checksum offload
+----------------
+  The netvsc driver supports checksum offload as long as the
+  Hyper-V host version does. Windows Server 2016 and Azure
+  support checksum offload for TCP and UDP for both IPv4 and
+  IPv6. Windows Server 2012 only supports checksum offload for TCP.
+
+Receive Side Scaling
+--------------------
+  Hyper-V supports receive side scaling. For TCP & UDP, packets can
+  be distributed among available queues based on IP address and port
+  number.
+
+  For TCP & UDP, we can switch hash level between L3 and L4 by ethtool
+  command. TCP/UDP over IPv4 and v6 can be set differently. The default
+  hash level is L4. We currently only allow switching TX hash level
+  from within the guests.
+
+  On Azure, fragmented UDP packets have high loss rate with L4
+  hashing. Using L3 hashing is recommended in this case.
+
+  For example, for UDP over IPv4 on eth0:
+
+  To include UDP port numbers in hashing::
+
+	ethtool -N eth0 rx-flow-hash udp4 sdfn
+
+  To exclude UDP port numbers in hashing::
+
+	ethtool -N eth0 rx-flow-hash udp4 sd
+
+  To show UDP hash level::
+
+	ethtool -n eth0 rx-flow-hash udp4
+
+Generic Receive Offload, aka GRO
+--------------------------------
+  The driver supports GRO and it is enabled by default. GRO coalesces
+  like packets and significantly reduces CPU usage under heavy Rx
+  load.
+
+Large Receive Offload (LRO), or Receive Side Coalescing (RSC)
+-------------------------------------------------------------
+  The driver supports LRO/RSC in the vSwitch feature. It reduces the per packet
+  processing overhead by coalescing multiple TCP segments when possible. The
+  feature is enabled by default on VMs running on Windows Server 2019 and
+  later. It may be changed by ethtool command::
+
+	ethtool -K eth0 lro on
+	ethtool -K eth0 lro off
+
+SR-IOV support
+--------------
+  Hyper-V supports SR-IOV as a hardware acceleration option. If SR-IOV
+  is enabled in both the vSwitch and the guest configuration, then the
+  Virtual Function (VF) device is passed to the guest as a PCI
+  device. In this case, both a synthetic (netvsc) and VF device are
+  visible in the guest OS and both NIC's have the same MAC address.
+
+  The VF is enslaved by netvsc device.  The netvsc driver will transparently
+  switch the data path to the VF when it is available and up.
+  Network state (addresses, firewall, etc) should be applied only to the
+  netvsc device; the slave device should not be accessed directly in
+  most cases.  The exception is when a special queue discipline or
+  flow direction is desired; these should be applied directly to the
+  VF slave device.
+
+Receive Buffer
+--------------
+  Packets are received into a receive area which is created when device
+  is probed. The receive area is broken into MTU sized chunks and each may
+  contain one or more packets. The number of receive sections may be changed
+  via ethtool Rx ring parameters.
+
+  There is a similar send buffer which is used to aggregate packets for sending.
+  The send area is broken into chunks of 6144 bytes, and each section may
+  contain one or more packets. The send buffer is an optimization; the driver
+  will use a slower method to handle very large packets or if the send buffer
+  area is exhausted.
+
+XDP support
+-----------
+  XDP (eXpress Data Path) is a feature that runs eBPF bytecode at the early
+  stage when packets arrive at a NIC card. The goal is to increase performance
+  for packet processing, reducing the overhead of SKB allocation and other
+  upper network layers.
+
+  hv_netvsc supports XDP in native mode, and transparently sets the XDP
+  program on the associated VF NIC as well.
+
+  Setting / unsetting an XDP program on the synthetic NIC (netvsc) propagates
+  to the VF NIC automatically. Setting / unsetting an XDP program on the VF
+  NIC directly is not recommended: it is not propagated to the synthetic NIC
+  and may be overwritten by a later setting on the synthetic NIC.
+
+  XDP program cannot run with LRO (RSC) enabled, so you need to disable LRO
+  before running XDP::
+
+	ethtool -K eth0 lro off
+
+  XDP_REDIRECT action is not yet supported.
diff --git a/Documentation/networking/device_drivers/microsoft/netvsc.txt b/Documentation/networking/device_drivers/microsoft/netvsc.txt
deleted file mode 100644
index cd63556b27a0..000000000000
--- a/Documentation/networking/device_drivers/microsoft/netvsc.txt
+++ /dev/null
@@ -1,105 +0,0 @@
-Hyper-V network driver
-======================
-
-Compatibility
-=============
-
-This driver is compatible with Windows Server 2012 R2, 2016 and
-Windows 10.
-
-Features
-========
-
-  Checksum offload
-  ----------------
-  The netvsc driver supports checksum offload as long as the
-  Hyper-V host version does. Windows Server 2016 and Azure
-  support checksum offload for TCP and UDP for both IPv4 and
-  IPv6. Windows Server 2012 only supports checksum offload for TCP.
-
-  Receive Side Scaling
-  --------------------
-  Hyper-V supports receive side scaling. For TCP & UDP, packets can
-  be distributed among available queues based on IP address and port
-  number.
-
-  For TCP & UDP, we can switch hash level between L3 and L4 by ethtool
-  command. TCP/UDP over IPv4 and v6 can be set differently. The default
-  hash level is L4. We currently only allow switching TX hash level
-  from within the guests.
-
-  On Azure, fragmented UDP packets have high loss rate with L4
-  hashing. Using L3 hashing is recommended in this case.
-
-  For example, for UDP over IPv4 on eth0:
-  To include UDP port numbers in hashing:
-        ethtool -N eth0 rx-flow-hash udp4 sdfn
-  To exclude UDP port numbers in hashing:
-        ethtool -N eth0 rx-flow-hash udp4 sd
-  To show UDP hash level:
-        ethtool -n eth0 rx-flow-hash udp4
-
-  Generic Receive Offload, aka GRO
-  --------------------------------
-  The driver supports GRO and it is enabled by default. GRO coalesces
-  like packets and significantly reduces CPU usage under heavy Rx
-  load.
-
-  Large Receive Offload (LRO), or Receive Side Coalescing (RSC)
-  -------------------------------------------------------------
-  The driver supports LRO/RSC in the vSwitch feature. It reduces the per packet
-  processing overhead by coalescing multiple TCP segments when possible. The
-  feature is enabled by default on VMs running on Windows Server 2019 and
-  later. It may be changed by ethtool command:
-	ethtool -K eth0 lro on
-	ethtool -K eth0 lro off
-
-  SR-IOV support
-  --------------
-  Hyper-V supports SR-IOV as a hardware acceleration option. If SR-IOV
-  is enabled in both the vSwitch and the guest configuration, then the
-  Virtual Function (VF) device is passed to the guest as a PCI
-  device. In this case, both a synthetic (netvsc) and VF device are
-  visible in the guest OS and both NIC's have the same MAC address.
-
-  The VF is enslaved by netvsc device.  The netvsc driver will transparently
-  switch the data path to the VF when it is available and up.
-  Network state (addresses, firewall, etc) should be applied only to the
-  netvsc device; the slave device should not be accessed directly in
-  most cases.  The exceptions are if some special queue discipline or
-  flow direction is desired, these should be applied directly to the
-  VF slave device.
-
-  Receive Buffer
-  --------------
-  Packets are received into a receive area which is created when device
-  is probed. The receive area is broken into MTU sized chunks and each may
-  contain one or more packets. The number of receive sections may be changed
-  via ethtool Rx ring parameters.
-
-  There is a similar send buffer which is used to aggregate packets for sending.
-  The send area is broken into chunks of 6144 bytes, each of section may
-  contain one or more packets. The send buffer is an optimization, the driver
-  will use slower method to handle very large packets or if the send buffer
-  area is exhausted.
-
-  XDP support
-  -----------
-  XDP (eXpress Data Path) is a feature that runs eBPF bytecode at the early
-  stage when packets arrive at a NIC card. The goal is to increase performance
-  for packet processing, reducing the overhead of SKB allocation and other
-  upper network layers.
-
-  hv_netvsc supports XDP in native mode, and transparently sets the XDP
-  program on the associated VF NIC as well.
-
-  Setting / unsetting XDP program on synthetic NIC (netvsc) propagates to
-  VF NIC automatically. Setting / unsetting XDP program on VF NIC directly
-  is not recommended, also not propagated to synthetic NIC, and may be
-  overwritten by setting of synthetic NIC.
-
-  XDP program cannot run with LRO (RSC) enabled, so you need to disable LRO
-  before running XDP:
-	ethtool -K eth0 lro off
-
-  XDP_REDIRECT action is not yet supported.
diff --git a/MAINTAINERS b/MAINTAINERS
index 107decaf0ac0..ba8bb932e3da 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7867,7 +7867,7 @@ S:	Supported
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git
 F:	Documentation/ABI/stable/sysfs-bus-vmbus
 F:	Documentation/ABI/testing/debugfs-hyperv
-F:	Documentation/networking/device_drivers/microsoft/netvsc.txt
+F:	Documentation/networking/device_drivers/microsoft/netvsc.rst
 F:	arch/x86/hyperv
 F:	arch/x86/include/asm/hyperv-tlfs.h
 F:	arch/x86/include/asm/mshyperv.h
-- 
cgit v1.2.3


From 7762f5c514dce027ad2a2031390c0c19c24547af Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Fri, 1 May 2020 16:44:49 +0200
Subject: docs: networking: device drivers: convert neterion/s2io.txt to ReST

- add SPDX header;
- add a document title;
- comment out text-only TOC from html/pdf output;
- mark code blocks and literals as such;
- adjust indentation, whitespace and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/device_drivers/index.rst  |   1 +
 .../networking/device_drivers/neterion/s2io.rst    | 196 +++++++++++++++++++++
 .../networking/device_drivers/neterion/s2io.txt    | 141 ---------------
 MAINTAINERS                                        |   2 +-
 drivers/net/ethernet/neterion/Kconfig              |   2 +-
 5 files changed, 199 insertions(+), 143 deletions(-)
 create mode 100644 Documentation/networking/device_drivers/neterion/s2io.rst
 delete mode 100644 Documentation/networking/device_drivers/neterion/s2io.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst
index 575f0043b03e..da1f8438d4ea 100644
--- a/Documentation/networking/device_drivers/index.rst
+++ b/Documentation/networking/device_drivers/index.rst
@@ -42,6 +42,7 @@ Contents:
    intel/ipw2100
    intel/ipw2200
    microsoft/netvsc
+   neterion/s2io
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/device_drivers/neterion/s2io.rst b/Documentation/networking/device_drivers/neterion/s2io.rst
new file mode 100644
index 000000000000..c5673ec4559b
--- /dev/null
+++ b/Documentation/networking/device_drivers/neterion/s2io.rst
@@ -0,0 +1,196 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================================================
+Neterion's (Formerly S2io) Xframe I/II PCI-X 10GbE driver
+=========================================================
+
+Release notes for Neterion's (Formerly S2io) Xframe I/II PCI-X 10GbE driver.
+
+.. Contents
+  - 1.  Introduction
+  - 2.  Identifying the adapter/interface
+  - 3.  Features supported
+  - 4.  Command line parameters
+  - 5.  Performance suggestions
+  - 6.  Support
+
+
+1. Introduction
+===============
+This Linux driver supports Neterion's Xframe I PCI-X 1.0 and
+Xframe II PCI-X 2.0 adapters. It supports several features
+such as jumbo frames, MSI/MSI-X, checksum offloads, TSO, UFO and so on.
+See below for complete list of features.
+
+All features are supported for both IPv4 and IPv6.
+
+2. Identifying the adapter/interface
+====================================
+
+a. Insert the adapter(s) in your system.
+b. Build and load driver::
+
+	# insmod s2io.ko
+
+c. View log messages::
+
+	# dmesg | tail -40
+
+You will see messages similar to::
+
+	eth3: Neterion Xframe I 10GbE adapter (rev 3), Version 2.0.9.1, Intr type INTA
+	eth4: Neterion Xframe II 10GbE adapter (rev 2), Version 2.0.9.1, Intr type INTA
+	eth4: Device is on 64 bit 133MHz PCIX(M1) bus
+
+The above messages identify the adapter type(Xframe I/II), adapter revision,
+driver version, interface name(eth3, eth4), Interrupt type(INTA, MSI, MSI-X).
+In case of Xframe II, the PCI/PCI-X bus width and frequency are displayed
+as well.
+
+To associate an interface with a physical adapter use "ethtool -p <ethX>".
+The corresponding adapter's LED will blink multiple times.
+
+3. Features supported
+=====================
+a. Jumbo frames. Xframe I/II supports MTU up to 9600 bytes,
+   modifiable using ip command.
+
+b. Offloads. Supports checksum offload(TCP/UDP/IP) on transmit
+   and receive, TSO.
+
+c. Multi-buffer receive mode. Scattering of packet across multiple
+   buffers. Currently driver supports 2-buffer mode which yields
+   significant performance improvement on certain platforms(SGI Altix,
+   IBM xSeries).
+
+d. MSI/MSI-X. Can be enabled on platforms which support this feature
+   (IA64, Xeon) resulting in noticeable performance improvement(up to 7%
+   on certain platforms).
+
+e. Statistics. Comprehensive MAC-level and software statistics displayed
+   using "ethtool -S" option.
+
+f. Multi-FIFO/Ring. Supports up to 8 transmit queues and receive rings,
+   with multiple steering options.
+
+4. Command line parameters
+==========================
+
+a. tx_fifo_num
+	Number of transmit queues
+
+Valid range: 1-8
+
+Default: 1
+
+b. rx_ring_num
+	Number of receive rings
+
+Valid range: 1-8
+
+Default: 1
+
+c. tx_fifo_len
+	Size of each transmit queue
+
+Valid range: Total length of all queues should not exceed 8192
+
+Default: 4096
+
+d. rx_ring_sz
+	Size of each receive ring(in 4K blocks)
+
+Valid range: Limited by memory on system
+
+Default: 30
+
+e. intr_type
+	Specifies interrupt type. Possible values 0(INTA), 2(MSI-X)
+
+Valid values: 0, 2
+
+Default: 2
+
+5. Performance suggestions
+==========================
+
+General:
+
+a. Set MTU to maximum(9000 for switch setup, 9600 in back-to-back configuration)
+b. Set TCP window size to optimal value.
+
+For instance, for MTU=1500 a value of 210K has been observed to result in
+good performance::
+
+	# sysctl -w net.ipv4.tcp_rmem="210000 210000 210000"
+	# sysctl -w net.ipv4.tcp_wmem="210000 210000 210000"
+
+For MTU=9000, TCP window size of 10 MB is recommended::
+
+	# sysctl -w net.ipv4.tcp_rmem="10000000 10000000 10000000"
+	# sysctl -w net.ipv4.tcp_wmem="10000000 10000000 10000000"
+
+Transmit performance:
+
+a. By default, the driver respects BIOS settings for PCI bus parameters.
+   However, you may want to experiment with PCI bus parameters
+   max-split-transactions(MOST) and MMRBC (use setpci command).
+
+   A MOST value of 2 has been found optimal for Opterons and 3 for Itanium.
+
+   It could be different for your hardware.
+
+   Set MMRBC to 4K**.
+
+   For example you can set
+
+   For opteron::
+
+	#setpci -d 17d5:* 62=1d
+
+   For Itanium::
+
+	#setpci -d 17d5:* 62=3d
+
+   For detailed description of the PCI registers, please see Xframe User Guide.
+
+b. Ensure Transmit Checksum offload is enabled. Use ethtool to set/verify this
+   parameter.
+
+c. Turn on TSO(using "ethtool -K")::
+
+	# ethtool -K <ethX> tso on
+
+Receive performance:
+
+a. By default, the driver respects BIOS settings for PCI bus parameters.
+   However, you may want to set PCI latency timer to 248::
+
+	#setpci -d 17d5:* LATENCY_TIMER=f8
+
+   For detailed description of the PCI registers, please see Xframe User Guide.
+
+b. Use 2-buffer mode. This results in a large performance boost on
+   certain platforms (e.g. SGI Altix, IBM xSeries).
+
+c. Ensure Receive Checksum offload is enabled. Use "ethtool -K ethX" command to
+   set/verify this option.
+
+d. Enable NAPI feature(in kernel configuration Device Drivers ---> Network
+   device support --->  Ethernet (10000 Mbit) ---> S2IO 10Gbe Xframe NIC) to
+   bring down CPU utilization.
+
+.. note::
+
+   For AMD opteron platforms with 8131 chipset, MMRBC=1 and MOST=1 are
+   recommended as safe parameters.
+
+For more information, please review the AMD8131 errata at
+http://vip.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/
+26310_AMD-8131_HyperTransport_PCI-X_Tunnel_Revision_Guide_rev_3_18.pdf
+
+6. Support
+==========
+
+For further support please contact your 10GbE Xframe NIC vendor (IBM,
+HP, SGI etc.).
diff --git a/Documentation/networking/device_drivers/neterion/s2io.txt b/Documentation/networking/device_drivers/neterion/s2io.txt
deleted file mode 100644
index 0362a42f7cf4..000000000000
--- a/Documentation/networking/device_drivers/neterion/s2io.txt
+++ /dev/null
@@ -1,141 +0,0 @@
-Release notes for Neterion's (Formerly S2io) Xframe I/II PCI-X 10GbE driver.
-
-Contents
-=======
-- 1.  Introduction
-- 2.  Identifying the adapter/interface
-- 3.  Features supported
-- 4.  Command line parameters
-- 5.  Performance suggestions
-- 6.  Available Downloads 
-
-
-1.	Introduction:
-This Linux driver supports Neterion's Xframe I PCI-X 1.0 and
-Xframe II PCI-X 2.0 adapters. It supports several features 
-such as jumbo frames, MSI/MSI-X, checksum offloads, TSO, UFO and so on.
-See below for complete list of features.
-All features are supported for both IPv4 and IPv6.
-
-2.	Identifying the adapter/interface:
-a. Insert the adapter(s) in your system.
-b. Build and load driver 
-# insmod s2io.ko
-c. View log messages
-# dmesg | tail -40
-You will see messages similar to:
-eth3: Neterion Xframe I 10GbE adapter (rev 3), Version 2.0.9.1, Intr type INTA
-eth4: Neterion Xframe II 10GbE adapter (rev 2), Version 2.0.9.1, Intr type INTA
-eth4: Device is on 64 bit 133MHz PCIX(M1) bus
-
-The above messages identify the adapter type(Xframe I/II), adapter revision,
-driver version, interface name(eth3, eth4), Interrupt type(INTA, MSI, MSI-X).
-In case of Xframe II, the PCI/PCI-X bus width and frequency are displayed
-as well.
-
-To associate an interface with a physical adapter use "ethtool -p <ethX>".
-The corresponding adapter's LED will blink multiple times.
-
-3.	Features supported:
-a. Jumbo frames. Xframe I/II supports MTU up to 9600 bytes,
-modifiable using ip command.
-
-b. Offloads. Supports checksum offload(TCP/UDP/IP) on transmit
-and receive, TSO.
-
-c. Multi-buffer receive mode. Scattering of packet across multiple
-buffers. Currently driver supports 2-buffer mode which yields
-significant performance improvement on certain platforms(SGI Altix,
-IBM xSeries).
-
-d. MSI/MSI-X. Can be enabled on platforms which support this feature
-(IA64, Xeon) resulting in noticeable performance improvement(up to 7%
-on certain platforms).
-
-e. Statistics. Comprehensive MAC-level and software statistics displayed
-using "ethtool -S" option.
-
-f. Multi-FIFO/Ring. Supports up to 8 transmit queues and receive rings,
-with multiple steering options.
-
-4.  Command line parameters
-a. tx_fifo_num
-Number of transmit queues
-Valid range: 1-8
-Default: 1
-
-b. rx_ring_num
-Number of receive rings
-Valid range: 1-8
-Default: 1
-
-c. tx_fifo_len
-Size of each transmit queue
-Valid range: Total length of all queues should not exceed 8192
-Default: 4096
-
-d. rx_ring_sz 
-Size of each receive ring(in 4K blocks)
-Valid range: Limited by memory on system
-Default: 30 
-
-e. intr_type
-Specifies interrupt type. Possible values 0(INTA), 2(MSI-X)
-Valid values: 0, 2
-Default: 2
-
-5.  Performance suggestions
-General:
-a. Set MTU to maximum(9000 for switch setup, 9600 in back-to-back configuration)
-b. Set TCP windows size to optimal value. 
-For instance, for MTU=1500 a value of 210K has been observed to result in 
-good performance.
-# sysctl -w net.ipv4.tcp_rmem="210000 210000 210000"
-# sysctl -w net.ipv4.tcp_wmem="210000 210000 210000"
-For MTU=9000, TCP window size of 10 MB is recommended.
-# sysctl -w net.ipv4.tcp_rmem="10000000 10000000 10000000"
-# sysctl -w net.ipv4.tcp_wmem="10000000 10000000 10000000"
-
-Transmit performance:
-a. By default, the driver respects BIOS settings for PCI bus parameters. 
-However, you may want to experiment with PCI bus parameters 
-max-split-transactions(MOST) and MMRBC (use setpci command). 
-A MOST value of 2 has been found optimal for Opterons and 3 for Itanium.  
-It could be different for your hardware.  
-Set MMRBC to 4K**.
-
-For example you can set 
-For opteron
-#setpci -d 17d5:* 62=1d 
-For Itanium
-#setpci -d 17d5:* 62=3d 
-
-For detailed description of the PCI registers, please see Xframe User Guide.
-
-b. Ensure Transmit Checksum offload is enabled. Use ethtool to set/verify this 
-parameter.
-c. Turn on TSO(using "ethtool -K")
-# ethtool -K <ethX> tso on
-
-Receive performance:
-a. By default, the driver respects BIOS settings for PCI bus parameters. 
-However, you may want to set PCI latency timer to 248.
-#setpci -d 17d5:* LATENCY_TIMER=f8
-For detailed description of the PCI registers, please see Xframe User Guide.
-b. Use 2-buffer mode. This results in large performance boost on
-certain platforms(eg. SGI Altix, IBM xSeries).
-c. Ensure Receive Checksum offload is enabled. Use "ethtool -K ethX" command to 
-set/verify this option.
-d. Enable NAPI feature(in kernel configuration Device Drivers ---> Network 
-device support --->  Ethernet (10000 Mbit) ---> S2IO 10Gbe Xframe NIC) to 
-bring down CPU utilization.
-
-** For AMD opteron platforms with 8131 chipset, MMRBC=1 and MOST=1 are 
-recommended as safe parameters.
-For more information, please review the AMD8131 errata at
-http://vip.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/
-26310_AMD-8131_HyperTransport_PCI-X_Tunnel_Revision_Guide_rev_3_18.pdf
-
-6. Support
-For further support please contact either your 10GbE Xframe NIC vendor (IBM, 
-HP, SGI etc.)
diff --git a/MAINTAINERS b/MAINTAINERS
index ba8bb932e3da..4e3f96ee0d98 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11656,7 +11656,7 @@ NETERION 10GbE DRIVERS (s2io/vxge)
 M:	Jon Mason <jdmason@kudzu.us>
 L:	netdev@vger.kernel.org
 S:	Supported
-F:	Documentation/networking/device_drivers/neterion/s2io.txt
+F:	Documentation/networking/device_drivers/neterion/s2io.rst
 F:	Documentation/networking/device_drivers/neterion/vxge.txt
 F:	drivers/net/ethernet/neterion/
 
diff --git a/drivers/net/ethernet/neterion/Kconfig b/drivers/net/ethernet/neterion/Kconfig
index 5e630f3a0189..c375ee08f6ea 100644
--- a/drivers/net/ethernet/neterion/Kconfig
+++ b/drivers/net/ethernet/neterion/Kconfig
@@ -27,7 +27,7 @@ config S2IO
 	  on its age.
 
 	  More specific information on configuring the driver is in
-	  <file:Documentation/networking/device_drivers/neterion/s2io.txt>.
+	  <file:Documentation/networking/device_drivers/neterion/s2io.rst>.
 
 	  To compile this driver as a module, choose M here. The module
 	  will be called s2io.
-- 
cgit v1.2.3


From f10727d3b68c8e03111436de94c922ffe304e21e Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Fri, 1 May 2020 16:44:50 +0200
Subject: docs: networking: device drivers: convert neterion/vxge.txt to ReST

- add SPDX header;
- adjust titles and chapters, adding proper markups;
- comment out text-only TOC from html/pdf output;
- mark code blocks and literals as such;
- adjust indentation, whitespace and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/device_drivers/index.rst  |   1 +
 .../networking/device_drivers/neterion/vxge.rst    | 115 +++++++++++++++++++++
 .../networking/device_drivers/neterion/vxge.txt    |  93 -----------------
 MAINTAINERS                                        |   2 +-
 drivers/net/ethernet/neterion/Kconfig              |   2 +-
 5 files changed, 118 insertions(+), 95 deletions(-)
 create mode 100644 Documentation/networking/device_drivers/neterion/vxge.rst
 delete mode 100644 Documentation/networking/device_drivers/neterion/vxge.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst
index da1f8438d4ea..55837244eaad 100644
--- a/Documentation/networking/device_drivers/index.rst
+++ b/Documentation/networking/device_drivers/index.rst
@@ -43,6 +43,7 @@ Contents:
    intel/ipw2200
    microsoft/netvsc
    neterion/s2io
+   neterion/vxge
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/device_drivers/neterion/vxge.rst b/Documentation/networking/device_drivers/neterion/vxge.rst
new file mode 100644
index 000000000000..589c6b15c63d
--- /dev/null
+++ b/Documentation/networking/device_drivers/neterion/vxge.rst
@@ -0,0 +1,115 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================================================================
+Neterion's (Formerly S2io) X3100 Series 10GbE PCIe Server Adapter Linux driver
+==============================================================================
+
+.. Contents
+
+  1) Introduction
+  2) Features supported
+  3) Configurable driver parameters
+  4) Troubleshooting
+
+1. Introduction
+===============
+
+This Linux driver supports all of Neterion's X3100 series 10 GbE PCIe I/O
+Virtualized Server adapters.
+
+The X3100 series supports four modes of operation, configurable via
+firmware:
+
+	- Single function mode
+	- Multi function mode
+	- SRIOV mode
+	- MRIOV mode
+
+The functions share a 10GbE link and the PCIe bus, but hardly anything else
+inside the ASIC. Features like independent hardware reset, statistics,
+bandwidth/priority allocation and guarantees, GRO, TSO, interrupt moderation,
+etc. are supported independently on each function.
+
+(See below for a complete list of features supported for both IPv4 and IPv6)
+
+2. Features supported
+=====================
+
+i)   Single function mode (up to 17 queues)
+
+ii)  Multi function mode (up to 17 functions)
+
+iii) PCI-SIG's I/O Virtualization
+
+       - Single Root mode: v1.0 (up to 17 functions)
+       - Multi-Root mode: v1.0 (up to 17 functions)
+
+iv)  Jumbo frames
+
+       X3100 Series supports MTU up to 9600 bytes, modifiable using
+       ip command.
+
+v)   Offloads supported: (Enabled by default)
+
+       - Checksum offload (TCP/UDP/IP) on transmit and receive paths
+       - TCP Segmentation Offload (TSO) on transmit path
+       - Generic Receive Offload (GRO) on receive path
+
+vi)  MSI-X: (Enabled by default)
+
+       Resulting in noticeable performance improvement (up to 7% on certain
+       platforms).
+
+vii) NAPI: (Enabled by default)
+
+       For better Rx interrupt moderation.
+
+viii) RTH (Receive Traffic Hash): (Enabled by default)
+
+       Receive side steering for better scaling.
+
+ix)  Statistics
+
+       Comprehensive MAC-level and software statistics displayed using
+       "ethtool -S" option.
+
+x)   Multiple hardware queues: (Enabled by default)
+
+       Up to 17 hardware based transmit and receive data channels, with
+       multiple steering options (transmit multiqueue enabled by default).
+
+3. Configurable driver parameters
+=================================
+
+i)  max_config_dev
+       Specifies maximum device functions to be enabled.
+
+       Valid range: 1-8
+
+ii) max_config_port
+       Specifies number of ports to be enabled.
+
+       Valid range: 1,2
+
+       Default: 1
+
+iii) max_config_vpath
+       Specifies maximum VPATH(s) configured for each device function.
+
+       Valid range: 1-17
+
+iv) vlan_tag_strip
+       Enables/disables VLAN tag stripping from all received tagged frames that
+       are not replicated at the internal L2 switch.
+
+       Valid range: 0,1 (disabled, enabled respectively)
+
+       Default: 1
+
+v)  addr_learn_en
+       Enables learning the MAC address of the guest OS interface in
+       a virtualization environment.
+
+       Valid range: 0,1 (disabled, enabled respectively)
+
+       Default: 0
diff --git a/Documentation/networking/device_drivers/neterion/vxge.txt b/Documentation/networking/device_drivers/neterion/vxge.txt
deleted file mode 100644
index abfec245f97c..000000000000
--- a/Documentation/networking/device_drivers/neterion/vxge.txt
+++ /dev/null
@@ -1,93 +0,0 @@
-Neterion's (Formerly S2io) X3100 Series 10GbE PCIe Server Adapter Linux driver
-==============================================================================
-
-Contents
---------
-
-1) Introduction
-2) Features supported
-3) Configurable driver parameters
-4) Troubleshooting
-
-1) Introduction:
-----------------
-This Linux driver supports all Neterion's X3100 series 10 GbE PCIe I/O
-Virtualized Server adapters.
-The X3100 series supports four modes of operation, configurable via
-firmware -
-	Single function mode
-	Multi function mode
-	SRIOV mode
-	MRIOV mode
-The functions share a 10GbE link and the pci-e bus, but hardly anything else
-inside the ASIC. Features like independent hw reset, statistics, bandwidth/
-priority allocation and guarantees, GRO, TSO, interrupt moderation etc are
-supported independently on each function.
-
-(See below for a complete list of features supported for both IPv4 and IPv6)
-
-2) Features supported:
-----------------------
-
-i)   Single function mode (up to 17 queues)
-
-ii)  Multi function mode (up to 17 functions)
-
-iii) PCI-SIG's I/O Virtualization
-       - Single Root mode: v1.0 (up to 17 functions)
-       - Multi-Root mode: v1.0 (up to 17 functions)
-
-iv)  Jumbo frames
-       X3100 Series supports MTU up to 9600 bytes, modifiable using
-       ip command.
-
-v)   Offloads supported: (Enabled by default)
-       Checksum offload (TCP/UDP/IP) on transmit and receive paths
-       TCP Segmentation Offload (TSO) on transmit path
-       Generic Receive Offload (GRO) on receive path
-
-vi)  MSI-X: (Enabled by default)
-       Resulting in noticeable performance improvement (up to 7% on certain
-       platforms).
-
-vii) NAPI: (Enabled by default)
-       For better Rx interrupt moderation.
-
-viii)RTH (Receive Traffic Hash): (Enabled by default)
-       Receive side steering for better scaling.
-
-ix)  Statistics
-       Comprehensive MAC-level and software statistics displayed using
-       "ethtool -S" option.
-
-x)   Multiple hardware queues: (Enabled by default)
-       Up to 17 hardware based transmit and receive data channels, with
-       multiple steering options (transmit multiqueue enabled by default).
-
-3) Configurable driver parameters:
-----------------------------------
-
-i)  max_config_dev
-       Specifies maximum device functions to be enabled.
-       Valid range: 1-8
-
-ii) max_config_port
-       Specifies number of ports to be enabled.
-       Valid range: 1,2
-       Default: 1
-
-iii)max_config_vpath
-       Specifies maximum VPATH(s) configured for each device function.
-       Valid range: 1-17
-
-iv) vlan_tag_strip
-       Enables/disables vlan tag stripping from all received tagged frames that
-       are not replicated at the internal L2 switch.
-       Valid range: 0,1 (disabled, enabled respectively)
-       Default: 1
-
-v)  addr_learn_en
-       Enable learning the mac address of the guest OS interface in
-       virtualization environment.
-       Valid range: 0,1 (disabled, enabled respectively)
-       Default: 0
diff --git a/MAINTAINERS b/MAINTAINERS
index 4e3f96ee0d98..88e9e8430581 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11657,7 +11657,7 @@ M:	Jon Mason <jdmason@kudzu.us>
 L:	netdev@vger.kernel.org
 S:	Supported
 F:	Documentation/networking/device_drivers/neterion/s2io.rst
-F:	Documentation/networking/device_drivers/neterion/vxge.txt
+F:	Documentation/networking/device_drivers/neterion/vxge.rst
 F:	drivers/net/ethernet/neterion/
 
 NETFILTER
diff --git a/drivers/net/ethernet/neterion/Kconfig b/drivers/net/ethernet/neterion/Kconfig
index c375ee08f6ea..a82a37094579 100644
--- a/drivers/net/ethernet/neterion/Kconfig
+++ b/drivers/net/ethernet/neterion/Kconfig
@@ -42,7 +42,7 @@ config VXGE
 	  labeled as either one, depending on its age.
 
 	  More specific information on configuring the driver is in
-	  <file:Documentation/networking/device_drivers/neterion/vxge.txt>.
+	  <file:Documentation/networking/device_drivers/neterion/vxge.rst>.
 
 	  To compile this driver as a module, choose M here. The module
 	  will be called vxge.
-- 
cgit v1.2.3


From acfcf23597d62700f1c8e1975bca34070e0251ef Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Fri, 1 May 2020 16:44:51 +0200
Subject: docs: networking: device drivers: convert qualcomm/rmnet.txt to ReST

- add SPDX header;
- add a document title;
- mark code blocks and literals as such;
- mark tables as such;
- adjust indentation, whitespace and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/device_drivers/index.rst  |  1 +
 .../networking/device_drivers/qualcomm/rmnet.rst   | 95 ++++++++++++++++++++++
 .../networking/device_drivers/qualcomm/rmnet.txt   | 82 -------------------
 MAINTAINERS                                        |  2 +-
 4 files changed, 97 insertions(+), 83 deletions(-)
 create mode 100644 Documentation/networking/device_drivers/qualcomm/rmnet.rst
 delete mode 100644 Documentation/networking/device_drivers/qualcomm/rmnet.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst
index 55837244eaad..66ed884548cc 100644
--- a/Documentation/networking/device_drivers/index.rst
+++ b/Documentation/networking/device_drivers/index.rst
@@ -44,6 +44,7 @@ Contents:
    microsoft/netvsc
    neterion/s2io
    neterion/vxge
+   qualcomm/rmnet
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/device_drivers/qualcomm/rmnet.rst b/Documentation/networking/device_drivers/qualcomm/rmnet.rst
new file mode 100644
index 000000000000..70643b58de05
--- /dev/null
+++ b/Documentation/networking/device_drivers/qualcomm/rmnet.rst
@@ -0,0 +1,95 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============
+Rmnet Driver
+============
+
+1. Introduction
+===============
+
+The rmnet driver supports the Multiplexing and Aggregation Protocol (MAP).
+This protocol is used by all recent chipsets using Qualcomm
+Technologies, Inc. modems.
+
+This driver can be used to register onto any physical network device in
+IP mode. Physical transports include USB, HSIC, PCIe and IP accelerator.
+
+Multiplexing allows for creation of logical netdevices (rmnet devices) to
+handle multiple private data networks (PDN) such as default internet, tethering,
+multimedia messaging service (MMS) or IP media subsystem (IMS). Hardware sends
+packets with MAP headers to rmnet. Based on the multiplexer ID, rmnet
+routes each packet to the appropriate PDN after removing the MAP header.
+
+Aggregation is required to achieve high data rates. This involves hardware
+sending an aggregated bunch of MAP frames. The rmnet driver de-aggregates
+these MAP frames and sends them to the appropriate PDNs.
+
+2. Packet format
+================
+
+a. MAP packet (data / control)
+
+The MAP header has the same endianness as the IP packet.
+
+Packet format::
+
+  Bit             0             1           2-7      8 - 15           16 - 31
+  Function   Command / Data   Reserved     Pad   Multiplexer ID    Payload length
+  Bit            32 - x
+  Function     Raw  Bytes
+
+The Command (1) / Data (0) bit indicates whether the packet is a MAP command
+or a data packet. Control packets are used for transport-level flow control.
+Data packets are standard IP packets.
+
+Reserved bits are usually zeroed out and are to be ignored by the receiver.
+
+Padding is the number of bytes added for 4-byte alignment, if required by
+hardware.
+
+The Multiplexer ID indicates the PDN on which data has to be sent.
+
+Payload length includes the padding length but does not include the MAP header
+length.
+
+b. MAP packet (command specific)::
+
+    Bit             0             1           2-7      8 - 15           16 - 31
+    Function   Command         Reserved     Pad   Multiplexer ID    Payload length
+    Bit          32 - 39        40 - 45    46 - 47       48 - 63
+    Function   Command name    Reserved   Command Type   Reserved
+    Bit          64 - 95
+    Function   Transaction ID
+    Bit          96 - 127
+    Function   Command data
+
+Command 1 indicates disabling flow, while 2 indicates enabling flow.
+
+Command types:
+
+= ==========================================
+0 for MAP command request
+1 is to acknowledge the receipt of a command
+2 is for unsupported commands
+3 is for error during processing of commands
+= ==========================================
+
+c. Aggregation
+
+Aggregation is multiple MAP packets (can be data or command) delivered to
+rmnet in a single linear skb. rmnet will process the individual
+packets and either ACK the MAP command or deliver the IP packet to the
+network stack as needed::
+
+  MAP header|IP Packet|Optional padding|MAP header|IP Packet|Optional padding....
+
+  MAP header|IP Packet|Optional padding|MAP header|Command Packet|Optional pad...
+
+3. Userspace configuration
+==========================
+
+rmnet userspace configuration is done through the netlink library librmnetctl
+and the command line utility rmnetcli. The utility is hosted in the Code Aurora
+Forum git repository. The driver uses rtnl_link_ops for communication.
+
+https://source.codeaurora.org/quic/la/platform/vendor/qcom-opensource/dataservices/tree/rmnetctl
diff --git a/Documentation/networking/device_drivers/qualcomm/rmnet.txt b/Documentation/networking/device_drivers/qualcomm/rmnet.txt
deleted file mode 100644
index 6b341eaf2062..000000000000
--- a/Documentation/networking/device_drivers/qualcomm/rmnet.txt
+++ /dev/null
@@ -1,82 +0,0 @@
-1. Introduction
-
-rmnet driver is used for supporting the Multiplexing and aggregation
-Protocol (MAP). This protocol is used by all recent chipsets using Qualcomm
-Technologies, Inc. modems.
-
-This driver can be used to register onto any physical network device in
-IP mode. Physical transports include USB, HSIC, PCIe and IP accelerator.
-
-Multiplexing allows for creation of logical netdevices (rmnet devices) to
-handle multiple private data networks (PDN) like a default internet, tethering,
-multimedia messaging service (MMS) or IP media subsystem (IMS). Hardware sends
-packets with MAP headers to rmnet. Based on the multiplexer id, rmnet
-routes to the appropriate PDN after removing the MAP header.
-
-Aggregation is required to achieve high data rates. This involves hardware
-sending aggregated bunch of MAP frames. rmnet driver will de-aggregate
-these MAP frames and send them to appropriate PDN's.
-
-2. Packet format
-
-a. MAP packet (data / control)
-
-MAP header has the same endianness of the IP packet.
-
-Packet format -
-
-Bit             0             1           2-7      8 - 15           16 - 31
-Function   Command / Data   Reserved     Pad   Multiplexer ID    Payload length
-Bit            32 - x
-Function     Raw  Bytes
-
-Command (1)/ Data (0) bit value is to indicate if the packet is a MAP command
-or data packet. Control packet is used for transport level flow control. Data
-packets are standard IP packets.
-
-Reserved bits are usually zeroed out and to be ignored by receiver.
-
-Padding is number of bytes to be added for 4 byte alignment if required by
-hardware.
-
-Multiplexer ID is to indicate the PDN on which data has to be sent.
-
-Payload length includes the padding length but does not include MAP header
-length.
-
-b. MAP packet (command specific)
-
-Bit             0             1           2-7      8 - 15           16 - 31
-Function   Command         Reserved     Pad   Multiplexer ID    Payload length
-Bit          32 - 39        40 - 45    46 - 47       48 - 63
-Function   Command name    Reserved   Command Type   Reserved
-Bit          64 - 95
-Function   Transaction ID
-Bit          96 - 127
-Function   Command data
-
-Command 1 indicates disabling flow while 2 is enabling flow
-
-Command types -
-0 for MAP command request
-1 is to acknowledge the receipt of a command
-2 is for unsupported commands
-3 is for error during processing of commands
-
-c. Aggregation
-
-Aggregation is multiple MAP packets (can be data or command) delivered to
-rmnet in a single linear skb. rmnet will process the individual
-packets and either ACK the MAP command or deliver the IP packet to the
-network stack as needed
-
-MAP header|IP Packet|Optional padding|MAP header|IP Packet|Optional padding....
-MAP header|IP Packet|Optional padding|MAP header|Command Packet|Optional pad...
-
-3. Userspace configuration
-
-rmnet userspace configuration is done through netlink library librmnetctl
-and command line utility rmnetcli. Utility is hosted in codeaurora forum git.
-The driver uses rtnl_link_ops for communication.
-
-https://source.codeaurora.org/quic/la/platform/vendor/qcom-opensource/dataservices/tree/rmnetctl
diff --git a/MAINTAINERS b/MAINTAINERS
index 88e9e8430581..94afbf577a06 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14031,7 +14031,7 @@ M:	Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
 M:	Sean Tranchetti <stranche@codeaurora.org>
 L:	netdev@vger.kernel.org
 S:	Maintained
-F:	Documentation/networking/device_drivers/qualcomm/rmnet.txt
+F:	Documentation/networking/device_drivers/qualcomm/rmnet.rst
 F:	drivers/net/ethernet/qualcomm/rmnet/
 F:	include/linux/if_rmnet.h
 
-- 
cgit v1.2.3


From e9a5475e735c9603b870c6ee5189de7cd32bb080 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Fri, 1 May 2020 16:44:56 +0200
Subject: docs: networking: device drivers: convert ti/tlan.txt to ReST

- add SPDX header;
- adjust titles and chapters, adding proper markups;
- mark tables as such;
- mark code blocks and literals as such;
- adjust indentation, whitespace and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/device_drivers/index.rst  |   1 +
 .../networking/device_drivers/ti/tlan.rst          | 140 +++++++++++++++++++++
 .../networking/device_drivers/ti/tlan.txt          | 117 -----------------
 MAINTAINERS                                        |   2 +-
 drivers/net/ethernet/ti/Kconfig                    |   2 +-
 drivers/net/ethernet/ti/tlan.c                     |   2 +-
 6 files changed, 144 insertions(+), 120 deletions(-)
 create mode 100644 Documentation/networking/device_drivers/ti/tlan.rst
 delete mode 100644 Documentation/networking/device_drivers/ti/tlan.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst
index 1d3b664e6921..adc0bf65fb02 100644
--- a/Documentation/networking/device_drivers/index.rst
+++ b/Documentation/networking/device_drivers/index.rst
@@ -49,6 +49,7 @@ Contents:
    smsc/smc9
    ti/cpsw_switchdev
    ti/cpsw
+   ti/tlan
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/device_drivers/ti/tlan.rst b/Documentation/networking/device_drivers/ti/tlan.rst
new file mode 100644
index 000000000000..4fdc0907f4fc
--- /dev/null
+++ b/Documentation/networking/device_drivers/ti/tlan.rst
@@ -0,0 +1,140 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================
+TLAN driver for Linux
+=====================
+
+:Version: 1.14a
+
+(C) 1997-1998 Caldera, Inc.
+
+(C) 1998 James Banks
+
+(C) 1999-2001 Torben Mathiasen <tmm@image.dk, torben.mathiasen@compaq.com>
+
+For driver information/updates visit http://www.compaq.com
+
+
+
+
+
+I. Supported Devices
+====================
+
+    Only PCI devices will work with this driver.
+
+    Supported:
+
+    =========	=========	===========================================
+    Vendor ID	Device ID	Name
+    =========	=========	===========================================
+    0e11	ae32		Compaq Netelligent 10/100 TX PCI UTP
+    0e11	ae34		Compaq Netelligent 10 T PCI UTP
+    0e11	ae35		Compaq Integrated NetFlex 3/P
+    0e11	ae40		Compaq Netelligent Dual 10/100 TX PCI UTP
+    0e11	ae43		Compaq Netelligent Integrated 10/100 TX UTP
+    0e11	b011		Compaq Netelligent 10/100 TX Embedded UTP
+    0e11	b012		Compaq Netelligent 10 T/2 PCI UTP/Coax
+    0e11	b030		Compaq Netelligent 10/100 TX UTP
+    0e11	f130		Compaq NetFlex 3/P
+    0e11	f150		Compaq NetFlex 3/P
+    108d	0012		Olicom OC-2325
+    108d	0013		Olicom OC-2183
+    108d	0014		Olicom OC-2326
+    =========	=========	===========================================
+
+
+    Caveats:
+
+    I am not sure if 100BaseTX daughterboards (for those cards which
+    support such things) will work.  I haven't had any solid evidence
+    either way.
+
+    However, if a card supports 100BaseTx without requiring an add
+    on daughterboard, it should work with 100BaseTx.
+
+    The "Netelligent 10 T/2 PCI UTP/Coax" (b012) device is untested,
+    but I do not expect any problems.
+
+
+II. Driver Options
+==================
+
+	1. You can append debug=x to the end of the insmod line to get
+	   debug messages, where x is a bit field where the bits mean
+	   the following:
+
+	   ====		=====================================
+	   0x01		Turn on general debugging messages.
+	   0x02		Turn on receive debugging messages.
+	   0x04		Turn on transmit debugging messages.
+	   0x08		Turn on list debugging messages.
+	   ====		=====================================
+
+	2. You can append aui=1 to the end of the insmod line to cause
+	   the adapter to use the AUI interface instead of the 10 Base T
+	   interface.  This is also what to do if you want to use the BNC
+	   connector on a TLAN based device.  (Setting this option on a
+	   device that does not have an AUI/BNC connector will probably
+	   cause it to not function correctly.)
+
+	3. You can set duplex=1 to force half duplex, and duplex=2 to
+	   force full duplex.
+
+	4. You can set speed=10 to force 10Mbps operation, and speed=100
+	   to force 100Mbps operation. (I'm not sure what will happen
+	   if a card which only supports 10Mbps is forced into 100Mbps
+	   mode.)
+
+	5. You have to use speed=X duplex=Y together now. If you just
+	   do "insmod tlan.o speed=100" the driver will do Auto-Neg.
+	   To force a 10Mbps Half-Duplex link do "insmod tlan.o speed=10
+	   duplex=1".
+
+	6. If the driver is built into the kernel, you can use the 3rd
+	   and 4th parameters to set aui and debug respectively.  For
+	   example::
+
+		ether=0,0,0x1,0x7,eth0
+
+	   This sets aui to 0x1 and debug to 0x7, assuming eth0 is a
+	   supported TLAN device.
+
+	   The bits in the third byte are assigned as follows:
+
+		====   ===============
+		0x01   aui
+		0x02   use half duplex
+		0x04   use full duplex
+		0x08   use 10BaseT
+		0x10   use 100BaseTx
+		====   ===============
+
+	   You also need to set both speed and duplex settings when forcing
+	   speeds with kernel parameters. For example,
+	   ``ether=0,0,0x12,0,eth0`` will force the link to 100Mbps Half-Duplex.
+
+	7. If you have more than one tlan adapter in your system, you can
+	   use the above options on a per adapter basis. To force a 100Mbit/HD
+	   link with your eth1 adapter use::
+
+		insmod tlan speed=0,100 duplex=0,1
+
+	   Now eth0 will use auto-neg and eth1 will be forced to 100Mbit/HD.
+	   Note that the tlan driver supports a maximum of 8 adapters.
+
+
+III. Things to try if you have problems
+=======================================
+
+	1. Make sure your card's PCI ID is among those listed in
+	   section I, above.
+	2. Make sure routing is correct.
+	3. Try forcing different speed/duplex settings.
+
+
+There is also a tlan mailing list which you can join by sending "subscribe tlan"
+in the body of an email to majordomo@vuser.vu.union.edu.
+
+There is also a tlan website at http://www.compaq.com
+
diff --git a/Documentation/networking/device_drivers/ti/tlan.txt b/Documentation/networking/device_drivers/ti/tlan.txt
deleted file mode 100644
index 34550dfcef74..000000000000
--- a/Documentation/networking/device_drivers/ti/tlan.txt
+++ /dev/null
@@ -1,117 +0,0 @@
-(C) 1997-1998 Caldera, Inc.
-(C) 1998 James Banks
-(C) 1999-2001 Torben Mathiasen <tmm@image.dk, torben.mathiasen@compaq.com>
-
-For driver information/updates visit http://www.compaq.com
-
-
-TLAN driver for Linux, version 1.14a
-README
-
-
-I.  Supported Devices.
-
-    Only PCI devices will work with this driver.
-
-    Supported:
-    Vendor ID	Device ID	Name
-    0e11	ae32		Compaq Netelligent 10/100 TX PCI UTP
-    0e11	ae34		Compaq Netelligent 10 T PCI UTP
-    0e11	ae35		Compaq Integrated NetFlex 3/P
-    0e11	ae40		Compaq Netelligent Dual 10/100 TX PCI UTP
-    0e11	ae43		Compaq Netelligent Integrated 10/100 TX UTP
-    0e11	b011		Compaq Netelligent 10/100 TX Embedded UTP
-    0e11	b012		Compaq Netelligent 10 T/2 PCI UTP/Coax
-    0e11	b030		Compaq Netelligent 10/100 TX UTP
-    0e11	f130		Compaq NetFlex 3/P
-    0e11	f150		Compaq NetFlex 3/P
-    108d	0012		Olicom OC-2325	
-    108d	0013		Olicom OC-2183
-    108d	0014		Olicom OC-2326	
-
-
-    Caveats:
-    
-    I am not sure if 100BaseTX daughterboards (for those cards which
-    support such things) will work.  I haven't had any solid evidence
-    either way.
-
-    However, if a card supports 100BaseTx without requiring an add
-    on daughterboard, it should work with 100BaseTx.
-
-    The "Netelligent 10 T/2 PCI UTP/Coax" (b012) device is untested,
-    but I do not expect any problems.
-    
-
-II.   Driver Options
-	1. You can append debug=x to the end of the insmod line to get
-           debug messages, where x is a bit field where the bits mean
-	   the following:
-	   
-	   0x01		Turn on general debugging messages.
-	   0x02		Turn on receive debugging messages.
-	   0x04		Turn on transmit debugging messages.
-	   0x08		Turn on list debugging messages.
-
-	2. You can append aui=1 to the end of the insmod line to cause
-           the adapter to use the AUI interface instead of the 10 Base T
-           interface.  This is also what to do if you want to use the BNC
-	   connector on a TLAN based device.  (Setting this option on a
-	   device that does not have an AUI/BNC connector will probably
-	   cause it to not function correctly.)
-
-	3. You can set duplex=1 to force half duplex, and duplex=2 to
-	   force full duplex.
-
-	4. You can set speed=10 to force 10Mbs operation, and speed=100
-	   to force 100Mbs operation. (I'm not sure what will happen
-	   if a card which only supports 10Mbs is forced into 100Mbs
-	   mode.)
-
-	5. You have to use speed=X duplex=Y together now. If you just
-	   do "insmod tlan.o speed=100" the driver will do Auto-Neg.
-	   To force a 10Mbps Half-Duplex link do "insmod tlan.o speed=10 
-	   duplex=1".
-
-	6. If the driver is built into the kernel, you can use the 3rd
-	   and 4th parameters to set aui and debug respectively.  For
-	   example:
-
-	   ether=0,0,0x1,0x7,eth0
-
-	   This sets aui to 0x1 and debug to 0x7, assuming eth0 is a
-	   supported TLAN device.
-
-	   The bits in the third byte are assigned as follows:
-
-		0x01 = aui
-		0x02 = use half duplex
-		0x04 = use full duplex
-		0x08 = use 10BaseT
-		0x10 = use 100BaseTx
-
-	   You also need to set both speed and duplex settings when forcing
-	   speeds with kernel-parameters. 
-	   ether=0,0,0x12,0,eth0 will force link to 100Mbps Half-Duplex.
-
-	7. If you have more than one tlan adapter in your system, you can
-	   use the above options on a per adapter basis. To force a 100Mbit/HD
-	   link with your eth1 adapter use:
-	   
-	   insmod tlan speed=0,100 duplex=0,1
-
-	   Now eth0 will use auto-neg and eth1 will be forced to 100Mbit/HD.
-	   Note that the tlan driver supports a maximum of 8 adapters.
-
-
-III.  Things to try if you have problems.
-	1. Make sure your card's PCI id is among those listed in
-	   section I, above.
-	2. Make sure routing is correct.
-	3. Try forcing different speed/duplex settings
-
-
-There is also a tlan mailing list which you can join by sending "subscribe tlan"
-in the body of an email to majordomo@vuser.vu.union.edu.
-There is also a tlan website at http://www.compaq.com
-
diff --git a/MAINTAINERS b/MAINTAINERS
index 94afbf577a06..38dbfbfccb5e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16971,7 +16971,7 @@ M:	Samuel Chessman <chessman@tux.org>
 L:	tlan-devel@lists.sourceforge.net (subscribers-only)
 S:	Maintained
 W:	http://sourceforge.net/projects/tlan/
-F:	Documentation/networking/device_drivers/ti/tlan.txt
+F:	Documentation/networking/device_drivers/ti/tlan.rst
 F:	drivers/net/ethernet/ti/tlan.*
 
 TM6000 VIDEO4LINUX DRIVER
diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig
index 89cec778cf2d..7b0ad777828d 100644
--- a/drivers/net/ethernet/ti/Kconfig
+++ b/drivers/net/ethernet/ti/Kconfig
@@ -138,7 +138,7 @@ config TLAN
 
 	  Devices currently supported by this driver are Compaq Netelligent,
 	  Compaq NetFlex and Olicom cards.  Please read the file
-	  <file:Documentation/networking/device_drivers/ti/tlan.txt>
+	  <file:Documentation/networking/device_drivers/ti/tlan.rst>
 	  for more details.
 
 	  To compile this driver as a module, choose M here. The module
diff --git a/drivers/net/ethernet/ti/tlan.c b/drivers/net/ethernet/ti/tlan.c
index ad465202980a..857709828058 100644
--- a/drivers/net/ethernet/ti/tlan.c
+++ b/drivers/net/ethernet/ti/tlan.c
@@ -70,7 +70,7 @@ MODULE_DESCRIPTION("Driver for TI ThunderLAN based ethernet PCI adapters");
 MODULE_LICENSE("GPL");
 
 /* Turn on debugging.
- * See Documentation/networking/device_drivers/ti/tlan.txt for details
+ * See Documentation/networking/device_drivers/ti/tlan.rst for details
  */
 static  int		debug;
 module_param(debug, int, 0);
-- 
cgit v1.2.3
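
The bit assignments that tlan.rst lists for the third ether= field can be checked with a small user-space sketch. The macro and function names below are hypothetical, chosen only for illustration; they are not taken from tlan.c. The sketch decodes a value such as the 0x12 used in the document's ether=0,0,0x12,0,eth0 example.

```c
/* Bit assignments for the third ether= field, as listed in tlan.rst.
 * The TLAN_OPT_* names are illustrative, not the driver's own. */
#define TLAN_OPT_AUI		0x01
#define TLAN_OPT_HALF_DUPLEX	0x02
#define TLAN_OPT_FULL_DUPLEX	0x04
#define TLAN_OPT_10BASET	0x08
#define TLAN_OPT_100BASETX	0x10

/* Hypothetical helper: forced speed in Mbit/s, or 0 if no speed bit is set. */
static int tlan_opt_speed(unsigned int opt)
{
	if (opt & TLAN_OPT_100BASETX)
		return 100;
	if (opt & TLAN_OPT_10BASET)
		return 10;
	return 0;
}

/* Hypothetical helper: 1 for half duplex, 2 for full duplex (the same
 * encoding as the duplex= module option), or 0 if no duplex bit is set. */
static int tlan_opt_duplex(unsigned int opt)
{
	if (opt & TLAN_OPT_FULL_DUPLEX)
		return 2;
	if (opt & TLAN_OPT_HALF_DUPLEX)
		return 1;
	return 0;
}

/* Hypothetical helper: non-zero if the AUI bit is set. */
static int tlan_opt_aui(unsigned int opt)
{
	return (opt & TLAN_OPT_AUI) != 0;
}
```

With these helpers, 0x12 decodes as 0x10 | 0x02, that is 100BaseTx with half duplex, which agrees with the document's statement that this value forces a 100Mbps Half-Duplex link.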


From 7ac0cbb49142edc22f0b3b4033907da6b3f698d9 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Fri, 1 May 2020 16:44:57 +0200
Subject: docs: networking: device drivers: convert toshiba/spider_net.txt to
 ReST

- add SPDX header;
- adjust title markup;
- mark code blocks and literals as such;
- adjust indentation, whitespaces and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/device_drivers/index.rst  |   1 +
 .../device_drivers/toshiba/spider_net.rst          | 202 ++++++++++++++++++++
 .../device_drivers/toshiba/spider_net.txt          | 204 ---------------------
 MAINTAINERS                                        |   2 +-
 4 files changed, 204 insertions(+), 205 deletions(-)
 create mode 100644 Documentation/networking/device_drivers/toshiba/spider_net.rst
 delete mode 100644 Documentation/networking/device_drivers/toshiba/spider_net.txt

(limited to 'MAINTAINERS')

diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst
index adc0bf65fb02..e18dad11bc72 100644
--- a/Documentation/networking/device_drivers/index.rst
+++ b/Documentation/networking/device_drivers/index.rst
@@ -50,6 +50,7 @@ Contents:
    ti/cpsw_switchdev
    ti/cpsw
    ti/tlan
+   toshiba/spider_net
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/device_drivers/toshiba/spider_net.rst b/Documentation/networking/device_drivers/toshiba/spider_net.rst
new file mode 100644
index 000000000000..fe5b32be15cd
--- /dev/null
+++ b/Documentation/networking/device_drivers/toshiba/spider_net.rst
@@ -0,0 +1,202 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========================
+The Spidernet Device Driver
+===========================
+
+Written by Linas Vepstas <linas@austin.ibm.com>
+
+Version of 7 June 2007
+
+Abstract
+========
+This document sketches the structure of portions of the spidernet
+device driver in the Linux kernel tree. The spidernet is a gigabit
+ethernet device built into the Toshiba southbridge commonly used
+in the SONY Playstation 3 and the IBM QS20 Cell blade.
+
+The Structure of the RX Ring
+============================
+The receive (RX) ring is a circular linked list of RX descriptors,
+together with three pointers into the ring that are used to manage its
+contents.
+
+The elements of the ring are called "descriptors" or "descrs"; they
+describe the received data. This includes a pointer to a buffer
+containing the received data, the buffer size, and various status bits.
+
+There are three primary states that a descriptor can be in: "empty",
+"full" and "not-in-use".  An "empty" or "ready" descriptor is ready
+to receive data from the hardware. A "full" descriptor has data in it,
+and is waiting to be emptied and processed by the OS. A "not-in-use"
+descriptor is neither empty nor full; it is simply not ready. It may
+not even have a data buffer in it, or is otherwise unusable.
+
+During normal operation, on device startup, the OS (specifically, the
+spidernet device driver) allocates a set of RX descriptors and RX
+buffers. These are all marked "empty", ready to receive data. This
+ring is handed off to the hardware, which sequentially fills in the
+buffers, and marks them "full". The OS follows up, taking the full
+buffers, processing them, and re-marking them empty.
+
+This filling and emptying is managed by three pointers, the "head"
+and "tail" pointers, managed by the OS, and a hardware current
+descriptor pointer (GDACTDPA). The GDACTDPA points at the descr
+currently being filled. When this descr is filled, the hardware
+marks it full, and advances the GDACTDPA by one.  Thus, when there is
+flowing RX traffic, every descr behind it should be marked "full",
+and everything in front of it should be "empty".  If the hardware
+discovers that the current descr is not empty, it will signal an
+interrupt, and halt processing.
+
+The tail pointer tails or trails the hardware pointer. When the
+hardware is ahead, the tail pointer will be pointing at a "full"
+descr. The OS will process this descr, and then mark it "not-in-use",
+and advance the tail pointer.  Thus, when there is flowing RX traffic,
+all of the descrs in front of the tail pointer should be "full", and
+all of those behind it should be "not-in-use". When RX traffic is not
+flowing, then the tail pointer can catch up to the hardware pointer.
+The OS will then note that the current tail is "empty", and halt
+processing.
+
+The head pointer (somewhat mis-named) follows after the tail pointer.
+When traffic is flowing, then the head pointer will be pointing at
+a "not-in-use" descr. The OS will perform various housekeeping duties
+on this descr. This includes allocating a new data buffer and
+dma-mapping it so as to make it visible to the hardware. The OS will
+then mark the descr as "empty", ready to receive data. Thus, when there
+is flowing RX traffic, everything in front of the head pointer should
+be "not-in-use", and everything behind it should be "empty". If no
+RX traffic is flowing, then the head pointer can catch up to the tail
+pointer, at which point the OS will notice that the head descr is
+"empty", and it will halt processing.
+
+Thus, in an idle system, the GDACTDPA, tail and head pointers will
+all be pointing at the same descr, which should be "empty". All of the
+other descrs in the ring should be "empty" as well.
+
+The show_rx_chain() routine will print out the locations of the
+GDACTDPA, tail and head pointers. It will also summarize the contents
+of the ring, starting at the tail pointer, and listing the status
+of the descrs that follow.
+
+A typical example of the output, for a nearly idle system, might be::
+
+    net eth1: Total number of descrs=256
+    net eth1: Chain tail located at descr=20
+    net eth1: Chain head is at 20
+    net eth1: HW curr desc (GDACTDPA) is at 21
+    net eth1: Have 1 descrs with stat=x40800101
+    net eth1: HW next desc (GDACNEXTDA) is at 22
+    net eth1: Last 255 descrs with stat=xa0800000
+
+In the above, the hardware has filled in one descr, number 20. Both
+head and tail are pointing at 20, because it has not yet been emptied.
+Meanwhile, hw is pointing at 21, which is free.
+
+The "Have nnn descrs" refers to the descr starting at the tail: in this
+case, nnn=1 descr, starting at descr 20. The "Last nnn descrs" refers
+to all of the rest of the descrs, from the last status change. The "nnn"
+is a count of how many descrs have exactly the same status.
+
+The status x4... corresponds to "full" and status xa... corresponds
+to "empty". The actual value printed is RXCOMST_A.
+
+In the device driver source code, a different set of names is
+used for these same concepts, so that::
+
+    "empty" == SPIDER_NET_DESCR_CARDOWNED == 0xa
+    "full"  == SPIDER_NET_DESCR_FRAME_END == 0x4
+    "not in use" == SPIDER_NET_DESCR_NOT_IN_USE == 0xf
+
+
+The RX RAM full bug/feature
+===========================
+
+As long as the OS can empty out the RX buffers at a rate faster than
+the hardware can fill them, there is no problem. If, for some reason,
+the OS fails to empty the RX ring fast enough, the hardware GDACTDPA
+pointer will catch up to the head, notice the not-empty condition,
+and stop. However, RX packets may still continue arriving on the wire.
+The spidernet chip can save some limited number of these in local RAM.
+When this local RAM fills up, the spider chip will issue an interrupt
+indicating this (GHIINT0STS will show ERRINT, and the GRMFLLINT bit
+will be set in GHIINT1STS).  When the RX RAM full condition occurs,
+a certain bug/feature is triggered that has to be specially handled.
+This section describes the special handling for this condition.
+
+When the OS finally has a chance to run, it will empty out the RX ring.
+In particular, it will clear the descriptor on which the hardware had
+stopped. However, once the hardware has decided that a certain
+descriptor is invalid, it will not restart at that descriptor; instead
+it will restart at the next descr. This potentially will lead to a
+deadlock condition, as the tail pointer will be pointing at this descr,
+which, from the OS point of view, is empty; the OS will be waiting for
+this descr to be filled. However, the hardware has skipped this descr,
+and is filling the next descrs. Since the OS doesn't see this, there
+is a potential deadlock, with the OS waiting for one descr to fill,
+while the hardware is waiting for a different set of descrs to become
+empty.
+
+A call to show_rx_chain() at this point indicates the nature of the
+problem. A typical print when the network is hung shows the following::
+
+    net eth1: Spider RX RAM full, incoming packets might be discarded!
+    net eth1: Total number of descrs=256
+    net eth1: Chain tail located at descr=255
+    net eth1: Chain head is at 255
+    net eth1: HW curr desc (GDACTDPA) is at 0
+    net eth1: Have 1 descrs with stat=xa0800000
+    net eth1: HW next desc (GDACNEXTDA) is at 1
+    net eth1: Have 127 descrs with stat=x40800101
+    net eth1: Have 1 descrs with stat=x40800001
+    net eth1: Have 126 descrs with stat=x40800101
+    net eth1: Last 1 descrs with stat=xa0800000
+
+Both the tail and head pointers are pointing at descr 255, which is
+marked xa... which is "empty". Thus, from the OS point of view, there
+is nothing to be done. In particular, there is the implicit assumption
+that everything in front of the "empty" descr must surely also be empty,
+as explained in the last section. The OS is waiting for descr 255 to
+become non-empty, which, in this case, will never happen.
+
+The HW pointer is at descr 0. This descr is marked 0x4.. or "full".
+Since it's already full, the hardware can do nothing more, and thus has
+halted processing. Notice that descrs 0 through 253 are all marked
+"full", while descr 254 and 255 are empty. (The "Last 1 descrs" is
+descr 254, since tail was at 255.) Thus, the system is deadlocked,
+and there can be no forward progress; the OS thinks there's nothing
+to do, and the hardware has nowhere to put incoming data.
+
+This bug/feature is worked around with the spider_net_resync_head_ptr()
+routine. When the driver receives RX interrupts, but an examination
+of the RX chain seems to show it is empty, then it is probable that
+the hardware has skipped a descr or two (sometimes dozens under heavy
+network conditions). The spider_net_resync_head_ptr() subroutine will
+search the ring for the next full descr, and the driver will resume
+operations there.  Since this will leave "holes" in the ring, there
+is also a spider_net_resync_tail_ptr() that will skip over such holes.
+
+As of this writing, the spider_net_resync() strategy seems to work very
+well, even under heavy network loads.
+
+
+The TX ring
+===========
+The TX ring uses a low-watermark interrupt scheme to make sure that
+the TX queue is appropriately serviced for large packet sizes.
+
+For packet sizes greater than about 1KBytes, the kernel can fill
+the TX ring quicker than the device can drain it. Once the ring
+is full, the netdev is stopped. When there is room in the ring,
+the netdev needs to be reawakened, so that more TX packets are placed
+in the ring. The hardware can empty the ring about four times per jiffy,
+so it's not appropriate to wait for the poll routine to refill, since
+the poll routine runs only once per jiffy.  The low-watermark mechanism
+marks a descr about 1/4th of the way from the bottom of the queue, so
+that an interrupt is generated when the descr is processed. This
+interrupt wakes up the netdev, which can then refill the queue.
+For large packets, this mechanism generates a relatively small number
+of interrupts, about 1K/sec. For smaller packets, this will drop to zero
+interrupts, as the hardware can empty the queue faster than the kernel
+can fill it.
diff --git a/Documentation/networking/device_drivers/toshiba/spider_net.txt b/Documentation/networking/device_drivers/toshiba/spider_net.txt
deleted file mode 100644
index b0b75f8463b3..000000000000
--- a/Documentation/networking/device_drivers/toshiba/spider_net.txt
+++ /dev/null
@@ -1,204 +0,0 @@
-
-            The Spidernet Device Driver
-            ===========================
-
-Written by Linas Vepstas <linas@austin.ibm.com>
-
-Version of 7 June 2007
-
-Abstract
-========
-This document sketches the structure of portions of the spidernet
-device driver in the Linux kernel tree. The spidernet is a gigabit
-ethernet device built into the Toshiba southbridge commonly used
-in the SONY Playstation 3 and the IBM QS20 Cell blade.
-
-The Structure of the RX Ring.
-=============================
-The receive (RX) ring is a circular linked list of RX descriptors,
-together with three pointers into the ring that are used to manage its
-contents.
-
-The elements of the ring are called "descriptors" or "descrs"; they
-describe the received data. This includes a pointer to a buffer
-containing the received data, the buffer size, and various status bits.
-
-There are three primary states that a descriptor can be in: "empty",
-"full" and "not-in-use".  An "empty" or "ready" descriptor is ready
-to receive data from the hardware. A "full" descriptor has data in it,
-and is waiting to be emptied and processed by the OS. A "not-in-use"
-descriptor is neither empty or full; it is simply not ready. It may
-not even have a data buffer in it, or is otherwise unusable.
-
-During normal operation, on device startup, the OS (specifically, the
-spidernet device driver) allocates a set of RX descriptors and RX
-buffers. These are all marked "empty", ready to receive data. This
-ring is handed off to the hardware, which sequentially fills in the
-buffers, and marks them "full". The OS follows up, taking the full
-buffers, processing them, and re-marking them empty.
-
-This filling and emptying is managed by three pointers, the "head"
-and "tail" pointers, managed by the OS, and a hardware current
-descriptor pointer (GDACTDPA). The GDACTDPA points at the descr
-currently being filled. When this descr is filled, the hardware
-marks it full, and advances the GDACTDPA by one.  Thus, when there is
-flowing RX traffic, every descr behind it should be marked "full",
-and everything in front of it should be "empty".  If the hardware
-discovers that the current descr is not empty, it will signal an
-interrupt, and halt processing.
-
-The tail pointer tails or trails the hardware pointer. When the
-hardware is ahead, the tail pointer will be pointing at a "full"
-descr. The OS will process this descr, and then mark it "not-in-use",
-and advance the tail pointer.  Thus, when there is flowing RX traffic,
-all of the descrs in front of the tail pointer should be "full", and
-all of those behind it should be "not-in-use". When RX traffic is not
-flowing, then the tail pointer can catch up to the hardware pointer.
-The OS will then note that the current tail is "empty", and halt
-processing.
-
-The head pointer (somewhat mis-named) follows after the tail pointer.
-When traffic is flowing, then the head pointer will be pointing at
-a "not-in-use" descr. The OS will perform various housekeeping duties
-on this descr. This includes allocating a new data buffer and
-dma-mapping it so as to make it visible to the hardware. The OS will
-then mark the descr as "empty", ready to receive data. Thus, when there
-is flowing RX traffic, everything in front of the head pointer should
-be "not-in-use", and everything behind it should be "empty". If no
-RX traffic is flowing, then the head pointer can catch up to the tail
-pointer, at which point the OS will notice that the head descr is
-"empty", and it will halt processing.
-
-Thus, in an idle system, the GDACTDPA, tail and head pointers will
-all be pointing at the same descr, which should be "empty". All of the
-other descrs in the ring should be "empty" as well.
-
-The show_rx_chain() routine will print out the locations of the
-GDACTDPA, tail and head pointers. It will also summarize the contents
-of the ring, starting at the tail pointer, and listing the status
-of the descrs that follow.
-
-A typical example of the output, for a nearly idle system, might be
-
-net eth1: Total number of descrs=256
-net eth1: Chain tail located at descr=20
-net eth1: Chain head is at 20
-net eth1: HW curr desc (GDACTDPA) is at 21
-net eth1: Have 1 descrs with stat=x40800101
-net eth1: HW next desc (GDACNEXTDA) is at 22
-net eth1: Last 255 descrs with stat=xa0800000
-
-In the above, the hardware has filled in one descr, number 20. Both
-head and tail are pointing at 20, because it has not yet been emptied.
-Meanwhile, hw is pointing at 21, which is free.
-
-The "Have nnn decrs" refers to the descr starting at the tail: in this
-case, nnn=1 descr, starting at descr 20. The "Last nnn descrs" refers
-to all of the rest of the descrs, from the last status change. The "nnn"
-is a count of how many descrs have exactly the same status.
-
-The status x4... corresponds to "full" and status xa... corresponds
-to "empty". The actual value printed is RXCOMST_A.
-
-In the device driver source code, a different set of names are
-used for these same concepts, so that
-
-"empty" == SPIDER_NET_DESCR_CARDOWNED == 0xa
-"full"  == SPIDER_NET_DESCR_FRAME_END == 0x4
-"not in use" == SPIDER_NET_DESCR_NOT_IN_USE == 0xf
-
-
-The RX RAM full bug/feature
-===========================
-
-As long as the OS can empty out the RX buffers at a rate faster than
-the hardware can fill them, there is no problem. If, for some reason,
-the OS fails to empty the RX ring fast enough, the hardware GDACTDPA
-pointer will catch up to the head, notice the not-empty condition,
-ad stop. However, RX packets may still continue arriving on the wire.
-The spidernet chip can save some limited number of these in local RAM.
-When this local ram fills up, the spider chip will issue an interrupt
-indicating this (GHIINT0STS will show ERRINT, and the GRMFLLINT bit
-will be set in GHIINT1STS).  When the RX ram full condition occurs,
-a certain bug/feature is triggered that has to be specially handled.
-This section describes the special handling for this condition.
-
-When the OS finally has a chance to run, it will empty out the RX ring.
-In particular, it will clear the descriptor on which the hardware had
-stopped. However, once the hardware has decided that a certain
-descriptor is invalid, it will not restart at that descriptor; instead
-it will restart at the next descr. This potentially will lead to a
-deadlock condition, as the tail pointer will be pointing at this descr,
-which, from the OS point of view, is empty; the OS will be waiting for
-this descr to be filled. However, the hardware has skipped this descr,
-and is filling the next descrs. Since the OS doesn't see this, there
-is a potential deadlock, with the OS waiting for one descr to fill,
-while the hardware is waiting for a different set of descrs to become
-empty.
-
-A call to show_rx_chain() at this point indicates the nature of the
-problem. A typical print when the network is hung shows the following:
-
-net eth1: Spider RX RAM full, incoming packets might be discarded!
-net eth1: Total number of descrs=256
-net eth1: Chain tail located at descr=255
-net eth1: Chain head is at 255
-net eth1: HW curr desc (GDACTDPA) is at 0
-net eth1: Have 1 descrs with stat=xa0800000
-net eth1: HW next desc (GDACNEXTDA) is at 1
-net eth1: Have 127 descrs with stat=x40800101
-net eth1: Have 1 descrs with stat=x40800001
-net eth1: Have 126 descrs with stat=x40800101
-net eth1: Last 1 descrs with stat=xa0800000
-
-Both the tail and head pointers are pointing at descr 255, which is
-marked xa... which is "empty". Thus, from the OS point of view, there
-is nothing to be done. In particular, there is the implicit assumption
-that everything in front of the "empty" descr must surely also be empty,
-as explained in the last section. The OS is waiting for descr 255 to
-become non-empty, which, in this case, will never happen.
-
-The HW pointer is at descr 0. This descr is marked 0x4.. or "full".
-Since its already full, the hardware can do nothing more, and thus has
-halted processing. Notice that descrs 0 through 254 are all marked
-"full", while descr 254 and 255 are empty. (The "Last 1 descrs" is
-descr 254, since tail was at 255.) Thus, the system is deadlocked,
-and there can be no forward progress; the OS thinks there's nothing
-to do, and the hardware has nowhere to put incoming data.
-
-This bug/feature is worked around with the spider_net_resync_head_ptr()
-routine. When the driver receives RX interrupts, but an examination
-of the RX chain seems to show it is empty, then it is probable that
-the hardware has skipped a descr or two (sometimes dozens under heavy
-network conditions). The spider_net_resync_head_ptr() subroutine will
-search the ring for the next full descr, and the driver will resume
-operations there.  Since this will leave "holes" in the ring, there
-is also a spider_net_resync_tail_ptr() that will skip over such holes.
-
-As of this writing, the spider_net_resync() strategy seems to work very
-well, even under heavy network loads.
-
-
-The TX ring
-===========
-The TX ring uses a low-watermark interrupt scheme to make sure that
-the TX queue is appropriately serviced for large packet sizes.
-
-For packet sizes greater than about 1KBytes, the kernel can fill
-the TX ring quicker than the device can drain it. Once the ring
-is full, the netdev is stopped. When there is room in the ring,
-the netdev needs to be reawakened, so that more TX packets are placed
-in the ring. The hardware can empty the ring about four times per jiffy,
-so its not appropriate to wait for the poll routine to refill, since
-the poll routine runs only once per jiffy.  The low-watermark mechanism
-marks a descr about 1/4th of the way from the bottom of the queue, so
-that an interrupt is generated when the descr is processed. This
-interrupt wakes up the netdev, which can then refill the queue.
-For large packets, this mechanism generates a relatively small number
-of interrupts, about 1K/sec. For smaller packets, this will drop to zero
-interrupts, as the hardware can empty the queue faster than the kernel
-can fill it.
-
-
- ======= END OF DOCUMENT ========
-
diff --git a/MAINTAINERS b/MAINTAINERS
index 38dbfbfccb5e..db7a6d462dff 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15874,7 +15874,7 @@ SPIDERNET NETWORK DRIVER for CELL
 M:	Ishizaki Kou <kou.ishizaki@toshiba.co.jp>
 L:	netdev@vger.kernel.org
 S:	Supported
-F:	Documentation/networking/device_drivers/toshiba/spider_net.txt
+F:	Documentation/networking/device_drivers/toshiba/spider_net.rst
 F:	drivers/net/ethernet/toshiba/spider_net*
 
 SPMI SUBSYSTEM
-- 
cgit v1.2.3
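
The RX ring bookkeeping described in spider_net.rst can be modelled in a few lines of user-space C. This is a toy model under stated assumptions, not the driver's code: the ring has 8 descriptors instead of the 256 shown in the example output, all type and function names are invented for illustration, and the hardware is represented by an ordinary function call. It covers only the normal-operation path from the document: the hardware turns "empty" descrs into "full" ones, the tail pointer turns "full" into "not-in-use", and the head pointer turns "not-in-use" back into "empty".

```c
#include <stddef.h>

#define RING_SIZE 8	/* toy size, chosen for the sketch */

enum descr_state { DESCR_EMPTY, DESCR_FULL, DESCR_NOT_IN_USE };

struct rx_ring {
	enum descr_state descr[RING_SIZE];
	size_t hw;	/* stands in for the hardware pointer (GDACTDPA) */
	size_t tail;	/* OS: next descr to process */
	size_t head;	/* OS: next descr to refill */
};

/* Device startup: every descr is marked empty, all pointers coincide. */
static void ring_init(struct rx_ring *r)
{
	for (size_t i = 0; i < RING_SIZE; i++)
		r->descr[i] = DESCR_EMPTY;
	r->hw = r->tail = r->head = 0;
}

/* Hardware side: fill the current descr and advance by one.
 * Returns 0 (halt) if the current descr is not empty. */
static int hw_receive(struct rx_ring *r)
{
	if (r->descr[r->hw] != DESCR_EMPTY)
		return 0;
	r->descr[r->hw] = DESCR_FULL;
	r->hw = (r->hw + 1) % RING_SIZE;
	return 1;
}

/* OS side: process full descrs at the tail, marking each not-in-use.
 * Stops at the first descr that is not full. Returns the count processed. */
static int os_process_tail(struct rx_ring *r)
{
	int n = 0;

	while (r->descr[r->tail] == DESCR_FULL) {
		r->descr[r->tail] = DESCR_NOT_IN_USE;
		r->tail = (r->tail + 1) % RING_SIZE;
		n++;
	}
	return n;
}

/* OS side: "refill" not-in-use descrs at the head, marking each empty.
 * Stops at the first descr that is not not-in-use. Returns the count. */
static int os_refill_head(struct rx_ring *r)
{
	int n = 0;

	while (r->descr[r->head] == DESCR_NOT_IN_USE) {
		r->descr[r->head] = DESCR_EMPTY;
		r->head = (r->head + 1) % RING_SIZE;
		n++;
	}
	return n;
}
```

In this model, once the OS has caught up, all three pointers sit on the same descr and every descr is empty, which is the idle state the document describes. If hw_receive() is called RING_SIZE times without the OS running, the next call returns 0, mirroring the halt on a non-empty descr; the hardware's skip-ahead behaviour after an RX RAM full condition is deliberately not modelled.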


From 966a5c08af1b1399fe1014f24877578e8493ffe1 Mon Sep 17 00:00:00 2001
From: Kunihiko Hayashi
Date: Tue, 12 May 2020 16:26:50 +0900
Subject: dt-bindings: net: Convert UniPhier AVE4 controller to json-schema

Convert the UniPhier AVE4 controller binding to DT schema format.

Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 .../bindings/net/socionext,uniphier-ave4.txt       |  64 ------------
 .../bindings/net/socionext,uniphier-ave4.yaml      | 111 +++++++++++++++++++++
 MAINTAINERS                                        |   2 +-
 3 files changed, 112 insertions(+), 65 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/net/socionext,uniphier-ave4.txt
 create mode 100644 Documentation/devicetree/bindings/net/socionext,uniphier-ave4.yaml

(limited to 'MAINTAINERS')

diff --git a/Documentation/devicetree/bindings/net/socionext,uniphier-ave4.txt b/Documentation/devicetree/bindings/net/socionext,uniphier-ave4.txt
deleted file mode 100644
index 4e85fc495e87..000000000000
--- a/Documentation/devicetree/bindings/net/socionext,uniphier-ave4.txt
+++ /dev/null
@@ -1,64 +0,0 @@
-* Socionext AVE ethernet controller
-
-This describes the devicetree bindings for AVE ethernet controller
-implemented on Socionext UniPhier SoCs.
-
-Required properties:
- - compatible: Should be
-	- "socionext,uniphier-pro4-ave4" : for Pro4 SoC
-	- "socionext,uniphier-pxs2-ave4" : for PXs2 SoC
-	- "socionext,uniphier-ld11-ave4" : for LD11 SoC
-	- "socionext,uniphier-ld20-ave4" : for LD20 SoC
-	- "socionext,uniphier-pxs3-ave4" : for PXs3 SoC
- - reg: Address where registers are mapped and size of region.
- - interrupts: Should contain the MAC interrupt.
- - phy-mode: See ethernet.txt in the same directory. Allow to choose
-	"rgmii", "rmii", "mii", or "internal" according to the PHY.
-	The acceptable mode is SoC-dependent.
- - phy-handle: Should point to the external phy device.
-	See ethernet.txt file in the same directory.
- - clocks: A phandle to the clock for the MAC.
-	For Pro4 SoC, that is "socionext,uniphier-pro4-ave4",
-	another MAC clock, GIO bus clock and PHY clock are also required.
- - clock-names: Should contain
-	- "ether", "ether-gb", "gio", "ether-phy" for Pro4 SoC
-	- "ether" for others
- - resets: A phandle to the reset control for the MAC. For Pro4 SoC,
-	GIO bus reset is also required.
- - reset-names: Should contain
-	- "ether", "gio" for Pro4 SoC
-	- "ether" for others
- - socionext,syscon-phy-mode: A phandle to syscon with one argument
-	that configures phy mode. The argument is the ID of MAC instance.
-
-The MAC address will be determined using the optional properties
-defined in ethernet.txt.
-
-Required subnode:
- - mdio: A container for child nodes representing phy nodes.
-         See phy.txt in the same directory.
-
-Example:
-
-	ether: ethernet@65000000 {
-		compatible = "socionext,uniphier-ld20-ave4";
-		reg = <0x65000000 0x8500>;
-		interrupts = <0 66 4>;
-		phy-mode = "rgmii";
-		phy-handle = <&ethphy>;
-		clock-names = "ether";
-		clocks = <&sys_clk 6>;
-		reset-names = "ether";
-		resets = <&sys_rst 6>;
-		socionext,syscon-phy-mode = <&soc_glue 0>;
-		local-mac-address = [00 00 00 00 00 00];
-
-		mdio {
-			#address-cells = <1>;
-			#size-cells = <0>;
-
-			ethphy: ethphy@1 {
-				reg = <1>;
-			};
-		};
-	};
diff --git a/Documentation/devicetree/bindings/net/socionext,uniphier-ave4.yaml b/Documentation/devicetree/bindings/net/socionext,uniphier-ave4.yaml
new file mode 100644
index 000000000000..7d84a863b9b9
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/socionext,uniphier-ave4.yaml
@@ -0,0 +1,111 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/socionext,uniphier-ave4.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Socionext AVE ethernet controller
+
+maintainers:
+  - Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
+
+description: |
+  This describes the devicetree bindings for the AVE ethernet controller
+  implemented on Socionext UniPhier SoCs.
+
+allOf:
+  - $ref: ethernet-controller.yaml#
+
+properties:
+  compatible:
+    enum:
+      - socionext,uniphier-pro4-ave4
+      - socionext,uniphier-pxs2-ave4
+      - socionext,uniphier-ld11-ave4
+      - socionext,uniphier-ld20-ave4
+      - socionext,uniphier-pxs3-ave4
+
+  reg:
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1
+
+  phy-mode: true
+
+  phy-handle: true
+
+  mac-address: true
+
+  local-mac-address: true
+
+  clocks:
+    minItems: 1
+    maxItems: 4
+
+  clock-names:
+    oneOf:
+      - items:          # for Pro4
+        - const: gio
+        - const: ether
+        - const: ether-gb
+        - const: ether-phy
+      - const: ether    # for others
+
+  resets:
+    minItems: 1
+    maxItems: 2
+
+  reset-names:
+    oneOf:
+      - items:          # for Pro4
+        - const: gio
+        - const: ether
+      - const: ether    # for others
+
+  socionext,syscon-phy-mode:
+    $ref: /schemas/types.yaml#/definitions/phandle-array
+    description:
+      A phandle to syscon with one argument that configures phy mode.
+      The argument is the ID of the MAC instance.
+
+  mdio:
+    $ref: mdio.yaml#
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - phy-mode
+  - phy-handle
+  - clocks
+  - clock-names
+  - resets
+  - reset-names
+  - mdio
+
+additionalProperties: false
+
+examples:
+  - |
+    ether: ethernet@65000000 {
+        compatible = "socionext,uniphier-ld20-ave4";
+        reg = <0x65000000 0x8500>;
+        interrupts = <0 66 4>;
+        phy-mode = "rgmii";
+        phy-handle = <&ethphy>;
+        clock-names = "ether";
+        clocks = <&sys_clk 6>;
+        reset-names = "ether";
+        resets = <&sys_rst 6>;
+        socionext,syscon-phy-mode = <&soc_glue 0>;
+
+        mdio {
+            #address-cells = <1>;
+            #size-cells = <0>;
+
+            ethphy: ethernet-phy@1 {
+                reg = <1>;
+            };
+        };
+    };
diff --git a/MAINTAINERS b/MAINTAINERS
index e581ae499057..734cccf1d1e5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15542,7 +15542,7 @@ SOCIONEXT (SNI) AVE NETWORK DRIVER
 M:	Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
 L:	netdev@vger.kernel.org
 S:	Maintained
-F:	Documentation/devicetree/bindings/net/socionext,uniphier-ave4.txt
+F:	Documentation/devicetree/bindings/net/socionext,uniphier-ave4.yaml
 F:	drivers/net/ethernet/socionext/sni_ave.c
 
 SOCIONEXT (SNI) NETSEC NETWORK DRIVER
-- 
cgit v1.2.3
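
The clock-names and reset-names rules in the socionext,uniphier-ave4.yaml
binding above can be sketched as a small check. This is an illustration only,
not part of the patch: the helper names (expected_names, name_errors) and the
dict-based node representation are hypothetical, and the coupling between the
Pro4 compatible string and the longer name lists is taken from the
"# for Pro4" / "# for others" comments, since the schema's oneOf does not
itself condition on compatible.

```python
# Hypothetical helper: mirrors the "for Pro4" / "for others" comments in the
# binding. The schema itself only says oneOf; tying it to the compatible
# string is an assumption made for this sketch.
PRO4 = "socionext,uniphier-pro4-ave4"


def expected_names(compatible):
    """Return the name lists the binding comments associate with a SoC."""
    if compatible == PRO4:
        return {
            "clock-names": ["gio", "ether", "ether-gb", "ether-phy"],
            "reset-names": ["gio", "ether"],
        }
    return {"clock-names": ["ether"], "reset-names": ["ether"]}


def name_errors(node):
    """List mismatches between a node (a plain dict) and the expected names."""
    errors = []
    for prop, names in expected_names(node["compatible"]).items():
        if node.get(prop) != names:
            errors.append(f"{prop}: expected {names}, got {node.get(prop)}")
    return errors


# The LD20 node from the binding example, reduced to the relevant properties.
ld20 = {
    "compatible": "socionext,uniphier-ld20-ave4",
    "clock-names": ["ether"],
    "reset-names": ["ether"],
}

# A Pro4 node that wrongly uses the single-clock form.
pro4_bad = {
    "compatible": PRO4,
    "clock-names": ["ether"],
    "reset-names": ["gio", "ether"],
}
```

Real validation of a devicetree against the binding is done by the schema
tooling; the sketch only makes the ordering and per-SoC expectations explicit.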


From 3f044d26f80b0c2ee53f8409cdbb2aca28fa90b1 Mon Sep 17 00:00:00 2001
From: Luo bin
Date: Wed, 13 May 2020 22:50:49 +0000
Subject: hinic: update huawei ethernet driver maintainer

Update the Huawei Ethernet driver maintainer from Aviad Krawczyk to Bin Luo.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'MAINTAINERS')

diff --git a/MAINTAINERS b/MAINTAINERS
index 734cccf1d1e5..d853f70181a0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7811,7 +7811,7 @@ F:	Documentation/devicetree/bindings/iio/humidity/hts221.txt
 F:	drivers/iio/humidity/hts221*
 
 HUAWEI ETHERNET DRIVER
-M:	Aviad Krawczyk <aviad.krawczyk@huawei.com>
+M:	Bin Luo <luobin9@huawei.com>
 L:	netdev@vger.kernel.org
 S:	Supported
 F:	Documentation/networking/hinic.rst
-- 
cgit v1.2.3


From 28bee21dc04b39e587af3b68938e68caed02d552 Mon Sep 17 00:00:00 2001
From: Björn Töpel
Date: Wed, 20 May 2020 21:21:03 +0200
Subject: MAINTAINERS, xsk: Update AF_XDP section after moves/adds

Update MAINTAINERS to correctly mirror the current AF_XDP socket file
layout. Also, add the AF_XDP files of libbpf.

rfc->v1: Sorted file entries. (Joe)

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Joe Perches <joe@perches.com>
Link: https://lore.kernel.org/bpf/20200520192103.355233-16-bjorn.topel@gmail.com
---
 MAINTAINERS | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

(limited to 'MAINTAINERS')

diff --git a/MAINTAINERS b/MAINTAINERS
index b7844f6cfa4a..087e68b21f9f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18443,8 +18443,12 @@ R:	Jonathan Lemon <jonathan.lemon@gmail.com>
 L:	netdev@vger.kernel.org
 L:	bpf@vger.kernel.org
 S:	Maintained
-F:	kernel/bpf/xskmap.c
+F:	include/net/xdp_sock*
+F:	include/net/xsk_buffer_pool.h
+F:	include/uapi/linux/if_xdp.h
 F:	net/xdp/
+F:	samples/bpf/xdpsock*
+F:	tools/lib/bpf/xsk*
 
 XEN BLOCK SUBSYSTEM
 M:	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-- 
cgit v1.2.3


From 3a8855d8cfcb944032ce0eecba05b2d0e93f4fb1 Mon Sep 17 00:00:00 2001
From: Sergey Matyukevich
Date: Wed, 20 May 2020 16:08:00 +0300
Subject: MAINTAINERS: update qtnfmac maintainers

I am leaving Quantenna, so I will no longer have access to firmware and
hardware. Meanwhile I plan to participate in reviewing qtnfmac patches
for a while until my firmware knowledge becomes completely obsolete.
Adding myself as a reviewer using my personal email address.

Signed-off-by: Sergey Matyukevich <sergey.matyukevich.os@quantenna.com>
Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200520130800.1902-1-sergey.matyukevich.os@quantenna.com
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'MAINTAINERS')

diff --git a/MAINTAINERS b/MAINTAINERS
index 9f338ed0d9ab..5d81c002232a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14066,7 +14066,7 @@ F:	drivers/net/wireless/ath/wcn36xx/
 
 QUANTENNA QTNFMAC WIRELESS DRIVER
 M:	Igor Mitsyanko <imitsyanko@quantenna.com>
-M:	Sergey Matyukevich <smatyukevich@quantenna.com>
+R:	Sergey Matyukevich <geomatsi@gmail.com>
 L:	linux-wireless@vger.kernel.org
 S:	Maintained
 F:	drivers/net/wireless/quantenna
-- 
cgit v1.2.3


From 2b983b407a3a1f47f7d8595245066854ff352c65 Mon Sep 17 00:00:00 2001
From: Lukas Bulwahn
Date: Mon, 25 May 2020 16:15:53 +0200
Subject: MAINTAINERS: Adjust entry in XDP SOCKETS to actual file name

Commit 2b43470add8c ("xsk: Introduce AF_XDP buffer allocation API") added a
new header file include/net/xsk_buff_pool.h, but commit 28bee21dc04b
("MAINTAINERS, xsk: Update AF_XDP section after moves/adds") added a file
entry referring to include/net/xsk_buffer_pool.h.

Hence, ./scripts/get_maintainer.pl --self-test=patterns complains:

  warning: no file matches  F:  include/net/xsk_buffer_pool.h

Adjust the entry in XDP SOCKETS to the actual file name.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200525141553.7035-1-lukas.bulwahn@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'MAINTAINERS')
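
The kind of check that produced the warning quoted in the commit message can
be sketched as follows. This is a simplified illustration, not the logic of
scripts/get_maintainer.pl: the function names are hypothetical, the file list
is passed in rather than read from a tree, and the pattern rules are reduced
to two assumptions (a trailing "/" covers everything below a directory, and
"*" and "?" do not cross a "/").

```python
import re


def f_patterns(maintainers_text):
    """Return the pattern from every 'F:' line, in order of appearance."""
    patterns = []
    for line in maintainers_text.splitlines():
        if line.startswith("F:"):
            patterns.append(line[2:].strip())
    return patterns


def pattern_to_regex(pattern):
    """Translate a simplified F: pattern into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[^/]*")   # wildcard stays within one path component
        elif ch == "?":
            parts.append("[^/]")
        else:
            parts.append(re.escape(ch))
    body = "".join(parts)
    if pattern.endswith("/"):
        body += ".*"                # directory entry: match everything below
    return re.compile(body)


def unmatched_patterns(maintainers_text, paths):
    """Return the F: patterns that match none of the given paths."""
    missing = []
    for pattern in f_patterns(maintainers_text):
        regex = pattern_to_regex(pattern)
        if not any(regex.fullmatch(path) for path in paths):
            missing.append(pattern)
    return missing


# Entries as they stood before this fix, checked against a toy file list.
entries = (
    "F:\tinclude/net/xdp_sock*\n"
    "F:\tinclude/net/xsk_buffer_pool.h\n"
    "F:\tnet/xdp/\n"
)
tree = [
    "include/net/xdp_sock.h",
    "include/net/xdp_sock_drv.h",
    "include/net/xsk_buff_pool.h",
    "net/xdp/xsk.c",
]
```

With these inputs, unmatched_patterns(entries, tree) reports only the
misspelled include/net/xsk_buffer_pool.h entry, which is the one this patch
corrects.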

diff --git a/MAINTAINERS b/MAINTAINERS
index 5d81c002232a..66d1a3f10102 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18456,7 +18456,7 @@ L:	netdev@vger.kernel.org
 L:	bpf@vger.kernel.org
 S:	Maintained
 F:	include/net/xdp_sock*
-F:	include/net/xsk_buffer_pool.h
+F:	include/net/xsk_buff_pool.h
 F:	include/uapi/linux/if_xdp.h
 F:	net/xdp/
 F:	samples/bpf/xdpsock*
-- 
cgit v1.2.3