From c1f9e14e3b676eb88fe1c9488c0b5f4fc9108a1c Mon Sep 17 00:00:00 2001 From: Dave Thaler Date: Wed, 8 Mar 2023 20:53:03 +0000 Subject: bpf, docs: Explain helper functions Add brief text about existence of helper functions, with details to go in separate psABI text. Note that text about runtime functions (kfuncs) is part of a separate patch, not this one. Signed-off-by: Dave Thaler Link: https://lore.kernel.org/r/20230308205303.1308-1-dthaler1968@googlemail.com Signed-off-by: Alexei Starovoitov --- Documentation/bpf/clang-notes.rst | 6 ++++++ Documentation/bpf/instruction-set.rst | 9 ++++++++- Documentation/bpf/linux-notes.rst | 8 ++++++++ 3 files changed, 22 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/bpf/clang-notes.rst b/Documentation/bpf/clang-notes.rst index 528feddf2db9..2c872a1ee08e 100644 --- a/Documentation/bpf/clang-notes.rst +++ b/Documentation/bpf/clang-notes.rst @@ -20,6 +20,12 @@ Arithmetic instructions For CPU versions prior to 3, Clang v7.0 and later can enable ``BPF_ALU`` support with ``-Xclang -target-feature -Xclang +alu32``. In CPU version 3, support is automatically included. +Jump instructions +================= + +If ``-O0`` is used, Clang will generate the ``BPF_CALL | BPF_X | BPF_JMP`` (0x8d) +instruction, which is not supported by the Linux kernel verifier. 
+ Atomic operations ================= diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst index db8789e6969e..5e43e14abe80 100644 --- a/Documentation/bpf/instruction-set.rst +++ b/Documentation/bpf/instruction-set.rst @@ -253,7 +253,7 @@ BPF_JSET 0x40 PC += off if dst & src BPF_JNE 0x50 PC += off if dst != src BPF_JSGT 0x60 PC += off if dst > src signed BPF_JSGE 0x70 PC += off if dst >= src signed -BPF_CALL 0x80 function call +BPF_CALL 0x80 function call see `Helper functions`_ BPF_EXIT 0x90 function / program return BPF_JMP only BPF_JLT 0xa0 PC += off if dst < src unsigned BPF_JLE 0xb0 PC += off if dst <= src unsigned @@ -264,6 +264,13 @@ BPF_JSLE 0xd0 PC += off if dst <= src signed The eBPF program needs to store the return value into register R0 before doing a BPF_EXIT. +Helper functions +~~~~~~~~~~~~~~~~ + +Helper functions are a concept whereby BPF programs can call into a +set of function calls exposed by the runtime. Each helper +function is identified by an integer used in a ``BPF_CALL`` instruction. +The available helper functions may differ for each program type. Load and store instructions =========================== diff --git a/Documentation/bpf/linux-notes.rst b/Documentation/bpf/linux-notes.rst index 956b0c86699d..f43b9c797bcb 100644 --- a/Documentation/bpf/linux-notes.rst +++ b/Documentation/bpf/linux-notes.rst @@ -12,6 +12,14 @@ Byte swap instructions ``BPF_FROM_LE`` and ``BPF_FROM_BE`` exist as aliases for ``BPF_TO_LE`` and ``BPF_TO_BE`` respectively. +Jump instructions +================= + +``BPF_CALL | BPF_X | BPF_JMP`` (0x8d), where the helper function +integer would be read from a specified register, is not currently supported +by the verifier. Any programs with this instruction will fail to load +until such support is added. 
+ Legacy BPF Packet access instructions ===================================== -- cgit v1.2.3 From b9fe8e8d03d0df28b2431e3aaf8e115cf7bf2f65 Mon Sep 17 00:00:00 2001 From: Dave Thaler Date: Fri, 10 Mar 2023 23:38:14 +0000 Subject: bpf, docs: Add signed comparison example Improve clarity by adding an example of a signed comparison instruction Signed-off-by: Dave Thaler Acked-by: David Vernet Acked-by: John Fastabend Link: https://lore.kernel.org/r/20230310233814.4641-1-dthaler1968@googlemail.com Signed-off-by: Alexei Starovoitov --- Documentation/bpf/instruction-set.rst | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst index 5e43e14abe80..b44640589055 100644 --- a/Documentation/bpf/instruction-set.rst +++ b/Documentation/bpf/instruction-set.rst @@ -11,7 +11,8 @@ Documentation conventions ========================= For brevity, this document uses the type notion "u64", "u32", etc. -to mean an unsigned integer whose width is the specified number of bits. +to mean an unsigned integer whose width is the specified number of bits, +and "s32", etc. to mean a signed integer of the specified number of bits. Registers and calling convention ================================ @@ -264,6 +265,14 @@ BPF_JSLE 0xd0 PC += off if dst <= src signed The eBPF program needs to store the return value into register R0 before doing a BPF_EXIT. +Example: + +``BPF_JSGE | BPF_X | BPF_JMP32`` (0x7e) means:: + + if (s32)dst s>= (s32)src goto +offset + +where 's>=' indicates a signed '>=' comparison. 
+ Helper functions ~~~~~~~~~~~~~~~~ -- cgit v1.2.3 From fec2c6d14fd5001e7d24a2ae44f0e9aea82a6149 Mon Sep 17 00:00:00 2001 From: David Vernet Date: Thu, 16 Mar 2023 00:40:28 -0500 Subject: bpf,docs: Remove bpf_cpumask_kptr_get() from documentation Now that the kfunc no longer exists, we can remove it and instead describe how RCU can be used to get a struct bpf_cpumask from a map value. This patch updates the BPF documentation accordingly. Signed-off-by: David Vernet Link: https://lore.kernel.org/r/20230316054028.88924-6-void@manifault.com Signed-off-by: Alexei Starovoitov --- Documentation/bpf/cpumasks.rst | 30 ++++++++++-------------------- 1 file changed, 10 insertions(+), 20 deletions(-) (limited to 'Documentation') diff --git a/Documentation/bpf/cpumasks.rst b/Documentation/bpf/cpumasks.rst index 75344cd230e5..41efd8874eeb 100644 --- a/Documentation/bpf/cpumasks.rst +++ b/Documentation/bpf/cpumasks.rst @@ -117,12 +117,7 @@ For example: As mentioned and illustrated above, these ``struct bpf_cpumask *`` objects can also be stored in a map and used as kptrs. If a ``struct bpf_cpumask *`` is in a map, the reference can be removed from the map with bpf_kptr_xchg(), or -opportunistically acquired with bpf_cpumask_kptr_get(): - -.. kernel-doc:: kernel/bpf/cpumask.c - :identifiers: bpf_cpumask_kptr_get - -Here is an example of a ``struct bpf_cpumask *`` being retrieved from a map: +opportunistically acquired using RCU: .. code-block:: c @@ -144,7 +139,7 @@ Here is an example of a ``struct bpf_cpumask *`` being retrieved from a map: /** * A simple example tracepoint program showing how a * struct bpf_cpumask * kptr that is stored in a map can - * be acquired using the bpf_cpumask_kptr_get() kfunc. + * be passed to kfuncs using RCU protection. 
*/ SEC("tp_btf/cgroup_mkdir") int BPF_PROG(cgrp_ancestor_example, struct cgroup *cgrp, const char *path) @@ -158,26 +153,21 @@ Here is an example of a ``struct bpf_cpumask *`` being retrieved from a map: if (!v) return -ENOENT; + bpf_rcu_read_lock(); /* Acquire a reference to the bpf_cpumask * kptr that's already stored in the map. */ - kptr = bpf_cpumask_kptr_get(&v->cpumask); - if (!kptr) + kptr = v->cpumask; + if (!kptr) { /* If no bpf_cpumask was present in the map, it's because * we're racing with another CPU that removed it with * bpf_kptr_xchg() between the bpf_map_lookup_elem() - * above, and our call to bpf_cpumask_kptr_get(). - * bpf_cpumask_kptr_get() internally safely handles this - * race, and will return NULL if the cpumask is no longer - * present in the map by the time we invoke the kfunc. + * above, and our load of the pointer from the map. */ + bpf_rcu_read_unlock(); return -EBUSY; + } - /* Free the reference we just took above. Note that the - * original struct bpf_cpumask * kptr is still in the map. It will - * be freed either at a later time if another context deletes - * it from the map, or automatically by the BPF subsystem if - * it's still present when the map is destroyed. - */ - bpf_cpumask_release(kptr); + bpf_cpumask_setall(kptr); + bpf_rcu_read_unlock(); return 0; } -- cgit v1.2.3 From 0f10f647f45545004ea50b73a7a7c5c3309ff286 Mon Sep 17 00:00:00 2001 From: Bagas Sanjaya Date: Tue, 14 Mar 2023 14:44:49 +0700 Subject: bpf, docs: Use internal linking for link to netdev subsystem doc Commit d56b0c461d19da ("bpf, docs: Fix link to netdev-FAQ target") attempts to fix linking problem to undefined "netdev-FAQ" label introduced in 287f4fa99a5281 ("docs: Update references to netdev-FAQ") by changing internal cross reference to netdev subsystem documentation (Documentation/process/maintainer-netdev.rst) to external one at docs.kernel.org. 
However, the linking problem is still not resolved, as the generated link points to a non-existent netdev-FAQ section of the external doc, which, when clicked, instead goes to the top of the doc. Revert to internal linking by simply mentioning the doc path while massaging the leading text to the link, since the netdev subsystem doc contains no FAQs but rather general information about the subsystem. Fixes: d56b0c461d19 ("bpf, docs: Fix link to netdev-FAQ target") Fixes: 287f4fa99a52 ("docs: Update references to netdev-FAQ") Signed-off-by: Bagas Sanjaya Signed-off-by: Daniel Borkmann Link: https://lore.kernel.org/bpf/20230314074449.23620-1-bagasdotme@gmail.com --- Documentation/bpf/bpf_devel_QA.rst | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) (limited to 'Documentation') diff --git a/Documentation/bpf/bpf_devel_QA.rst b/Documentation/bpf/bpf_devel_QA.rst index 5f5f9ccc3862..e151e61dff38 100644 --- a/Documentation/bpf/bpf_devel_QA.rst +++ b/Documentation/bpf/bpf_devel_QA.rst @@ -128,7 +128,8 @@ into the bpf-next tree will make their way into net-next tree. net and net-next are both run by David S. Miller. From there, they will go into the kernel mainline tree run by Linus Torvalds. To read up on the process of net and net-next being merged into the mainline tree, see -the `netdev-FAQ`_. +the documentation on netdev subsystem at +Documentation/process/maintainer-netdev.rst. @@ -147,7 +148,8 @@ request):: Q: How do I indicate which tree (bpf vs. bpf-next) my patch should be applied to? --------------------------------------------------------------------------------- -A: The process is the very same as described in the `netdev-FAQ`_, +A: The process is the very same as described in the netdev subsystem +documentation at Documentation/process/maintainer-netdev.rst, so please read up on it.
The subject line must indicate whether the patch is a fix or rather "next-like" content in order to let the maintainers know whether it is targeted at bpf or bpf-next. @@ -206,8 +208,9 @@ ii) run extensive BPF test suite and Once the BPF pull request was accepted by David S. Miller, then the patches end up in net or net-next tree, respectively, and make their way from there further into mainline. Again, see the -`netdev-FAQ`_ for additional information e.g. on how often they are -merged to mainline. +documentation for netdev subsystem at +Documentation/process/maintainer-netdev.rst for additional information +e.g. on how often they are merged to mainline. Q: How long do I need to wait for feedback on my BPF patches? ------------------------------------------------------------- @@ -230,7 +233,8 @@ Q: Are patches applied to bpf-next when the merge window is open? ----------------------------------------------------------------- A: For the time when the merge window is open, bpf-next will not be processed. This is roughly analogous to net-next patch processing, -so feel free to read up on the `netdev-FAQ`_ about further details. +so feel free to read up on the netdev docs at +Documentation/process/maintainer-netdev.rst about further details. During those two weeks of merge window, we might ask you to resend your patch series once bpf-next is open again. Once Linus released @@ -394,7 +398,8 @@ netdev kernel mailing list in Cc and ask for the fix to be queued up: netdev@vger.kernel.org The process in general is the same as on netdev itself, see also the -`netdev-FAQ`_. +the documentation on networking subsystem at +Documentation/process/maintainer-netdev.rst. Q: Do you also backport to kernels not currently maintained as stable? ---------------------------------------------------------------------- @@ -410,7 +415,7 @@ Q: The BPF patch I am about to submit needs to go to stable as well What should I do? 
A: The same rules apply as with netdev patch submissions in general, see -the `netdev-FAQ`_. +the netdev docs at Documentation/process/maintainer-netdev.rst. Never add "``Cc: stable@vger.kernel.org``" to the patch description, but ask the BPF maintainers to queue the patches instead. This can be done @@ -685,7 +690,6 @@ when: .. Links .. _Documentation/process/: https://www.kernel.org/doc/html/latest/process/ -.. _netdev-FAQ: https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html .. _selftests: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/ .. _Documentation/dev-tools/kselftest.rst: -- cgit v1.2.3 From 08ff1c9f3e927ba3701c113dda70953a6f4afffa Mon Sep 17 00:00:00 2001 From: Sreevani Sreejith Date: Wed, 15 Mar 2023 12:54:05 -0700 Subject: bpf, docs: Libbpf overview documentation This patch documents overview of libbpf, including its features for developing BPF programs. Signed-off-by: Sreevani Sreejith Signed-off-by: Andrii Nakryiko Acked-by: David Vernet Link: https://lore.kernel.org/bpf/20230315195405.2051559-1-ssreevani@meta.com --- Documentation/bpf/libbpf/index.rst | 25 ++- Documentation/bpf/libbpf/libbpf_overview.rst | 228 +++++++++++++++++++++++++++ 2 files changed, 245 insertions(+), 8 deletions(-) create mode 100644 Documentation/bpf/libbpf/libbpf_overview.rst (limited to 'Documentation') diff --git a/Documentation/bpf/libbpf/index.rst b/Documentation/bpf/libbpf/index.rst index f9b3b252e28f..7545a2049692 100644 --- a/Documentation/bpf/libbpf/index.rst +++ b/Documentation/bpf/libbpf/index.rst @@ -2,23 +2,32 @@ .. _libbpf: +====== libbpf ====== +If you are looking to develop BPF applications using the libbpf library, this +directory contains important documentation that you should read. + +To get started, it is recommended to begin with the :doc:`libbpf Overview +` document, which provides a high-level understanding of the +libbpf APIs and their usage. 
This will give you a solid foundation to start +exploring and utilizing the various features of libbpf to develop your BPF +applications. + .. toctree:: :maxdepth: 1 + libbpf_overview API Documentation program_types libbpf_naming_convention libbpf_build -This is documentation for libbpf, a userspace library for loading and -interacting with bpf programs. -All general BPF questions, including kernel functionality, libbpf APIs and -their application, should be sent to bpf@vger.kernel.org mailing list. -You can `subscribe `_ to the -mailing list search its `archive `_. -Please search the archive before asking new questions. It very well might -be that this was already addressed or answered before. +All general BPF questions, including kernel functionality, libbpf APIs and their +application, should be sent to bpf@vger.kernel.org mailing list. You can +`subscribe `_ to the mailing list +search its `archive `_. Please search the archive +before asking new questions. It may be that this was already addressed or +answered before. diff --git a/Documentation/bpf/libbpf/libbpf_overview.rst b/Documentation/bpf/libbpf/libbpf_overview.rst new file mode 100644 index 000000000000..f36a2d4ffea2 --- /dev/null +++ b/Documentation/bpf/libbpf/libbpf_overview.rst @@ -0,0 +1,228 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=============== +libbpf Overview +=============== + +libbpf is a C-based library containing a BPF loader that takes compiled BPF +object files and prepares and loads them into the Linux kernel. libbpf takes the +heavy lifting of loading, verifying, and attaching BPF programs to various +kernel hooks, allowing BPF application developers to focus only on BPF program +correctness and performance. + +The following are the high-level features supported by libbpf: + +* Provides high-level and low-level APIs for user space programs to interact + with BPF programs. 
The low-level APIs wrap all the bpf system call + functionality, which is useful when users need more fine-grained control + over the interactions between user space and BPF programs. +* Provides overall support for the BPF object skeleton generated by bpftool. + The skeleton file simplifies the process for the user space programs to access + global variables and work with BPF programs. +* Provides BPF-side APIS, including BPF helper definitions, BPF maps support, + and tracing helpers, allowing developers to simplify BPF code writing. +* Supports BPF CO-RE mechanism, enabling BPF developers to write portable + BPF programs that can be compiled once and run across different kernel + versions. + +This document will delve into the above concepts in detail, providing a deeper +understanding of the capabilities and advantages of libbpf and how it can help +you develop BPF applications efficiently. + +BPF App Lifecycle and libbpf APIs +================================== + +A BPF application consists of one or more BPF programs (either cooperating or +completely independent), BPF maps, and global variables. The global +variables are shared between all BPF programs, which allows them to cooperate on +a common set of data. libbpf provides APIs that user space programs can use to +manipulate the BPF programs by triggering different phases of a BPF application +lifecycle. + +The following section provides a brief overview of each phase in the BPF life +cycle: + +* **Open phase**: In this phase, libbpf parses the BPF + object file and discovers BPF maps, BPF programs, and global variables. After + a BPF app is opened, user space apps can make additional adjustments + (setting BPF program types, if necessary; pre-setting initial values for + global variables, etc.) before all the entities are created and loaded. + +* **Load phase**: In the load phase, libbpf creates BPF + maps, resolves various relocations, and verifies and loads BPF programs into + the kernel. 
At this point, libbpf validates all the parts of a BPF application + and loads the BPF program into the kernel, but no BPF program has yet been + executed. After the load phase, it’s possible to set up the initial BPF map + state without racing with the BPF program code execution. + +* **Attachment phase**: In this phase, libbpf + attaches BPF programs to various BPF hook points (e.g., tracepoints, kprobes, + cgroup hooks, network packet processing pipeline, etc.). During this + phase, BPF programs perform useful work such as processing + packets, or updating BPF maps and global variables that can be read from user + space. + +* **Tear down phase**: In the tear down phase, + libbpf detaches BPF programs and unloads them from the kernel. BPF maps are + destroyed, and all the resources used by the BPF app are freed. + +BPF Object Skeleton File +======================== + +BPF skeleton is an alternative interface to libbpf APIs for working with BPF +objects. Skeleton code abstract away generic libbpf APIs to significantly +simplify code for manipulating BPF programs from user space. Skeleton code +includes a bytecode representation of the BPF object file, simplifying the +process of distributing your BPF code. With BPF bytecode embedded, there are no +extra files to deploy along with your application binary. + +You can generate the skeleton header file ``(.skel.h)`` for a specific object +file by passing the BPF object to the bpftool. 
The generated BPF skeleton +provides the following custom functions that correspond to the BPF lifecycle, +each of them prefixed with the specific object name: + +* ``__open()`` – creates and opens BPF application (```` stands for + the specific bpf object name) +* ``__load()`` – instantiates, loads,and verifies BPF application parts +* ``__attach()`` – attaches all auto-attachable BPF programs (it’s + optional, you can have more control by using libbpf APIs directly) +* ``__destroy()`` – detaches all BPF programs and + frees up all used resources + +Using the skeleton code is the recommended way to work with bpf programs. Keep +in mind, BPF skeleton provides access to the underlying BPF object, so whatever +was possible to do with generic libbpf APIs is still possible even when the BPF +skeleton is used. It's an additive convenience feature, with no syscalls, and no +cumbersome code. + +Other Advantages of Using Skeleton File +--------------------------------------- + +* BPF skeleton provides an interface for user space programs to work with BPF + global variables. The skeleton code memory maps global variables as a struct + into user space. The struct interface allows user space programs to initialize + BPF programs before the BPF load phase and fetch and update data from user + space afterward. + +* The ``skel.h`` file reflects the object file structure by listing out the + available maps, programs, etc. BPF skeleton provides direct access to all the + BPF maps and BPF programs as struct fields. This eliminates the need for + string-based lookups with ``bpf_object_find_map_by_name()`` and + ``bpf_object_find_program_by_name()`` APIs, reducing errors due to BPF source + code and user-space code getting out of sync. + +* The embedded bytecode representation of the object file ensures that the + skeleton and the BPF object file are always in sync. + +BPF Helpers +=========== + +libbpf provides BPF-side APIs that BPF programs can use to interact with the +system. 
The BPF helpers definition allows developers to use them in BPF code as +any other plain C function. For example, there are helper functions to print +debugging messages, get the time since the system was booted, interact with BPF +maps, manipulate network packets, etc. + +For a complete description of what the helpers do, the arguments they take, and +the return value, see the `bpf-helpers +`_ man page. + +BPF CO-RE (Compile Once – Run Everywhere) +========================================= + +BPF programs work in the kernel space and have access to kernel memory and data +structures. One limitation that BPF applications come across is the lack of +portability across different kernel versions and configurations. `BCC +`_ is one of the solutions for BPF +portability. However, it comes with runtime overhead and a large binary size +from embedding the compiler with the application. + +libbpf steps up the BPF program portability by supporting the BPF CO-RE concept. +BPF CO-RE brings together BTF type information, libbpf, and the compiler to +produce a single executable binary that you can run on multiple kernel versions +and configurations. + +To make BPF programs portable libbpf relies on the BTF type information of the +running kernel. Kernel also exposes this self-describing authoritative BTF +information through ``sysfs`` at ``/sys/kernel/btf/vmlinux``. + +You can generate the BTF information for the running kernel with the following +command: + +:: + + $ bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h + +The command generates a ``vmlinux.h`` header file with all kernel types +(:doc:`BTF types <../btf>`) that the running kernel uses. Including +``vmlinux.h`` in your BPF program eliminates dependency on system-wide kernel +headers. + +libbpf enables portability of BPF programs by looking at the BPF program’s +recorded BTF type and relocation information and matching them to BTF +information (vmlinux) provided by the running kernel. 
libbpf then resolves and +matches all the types and fields, and updates necessary offsets and other +relocatable data to ensure that BPF program’s logic functions correctly for a +specific kernel on the host. BPF CO-RE concept thus eliminates overhead +associated with BPF development and allows developers to write portable BPF +applications without modifications and runtime source code compilation on the +target machine. + +The following code snippet shows how to read the parent field of a kernel +``task_struct`` using BPF CO-RE and libbf. The basic helper to read a field in a +CO-RE relocatable manner is ``bpf_core_read(dst, sz, src)``, which will read +``sz`` bytes from the field referenced by ``src`` into the memory pointed to by +``dst``. + +.. code-block:: C + :emphasize-lines: 6 + + //... + struct task_struct *task = (void *)bpf_get_current_task(); + struct task_struct *parent_task; + int err; + + err = bpf_core_read(&parent_task, sizeof(void *), &task->parent); + if (err) { + /* handle error */ + } + + /* parent_task contains the value of task->parent pointer */ + +In the code snippet, we first get a pointer to the current ``task_struct`` using +``bpf_get_current_task()``. We then use ``bpf_core_read()`` to read the parent +field of task struct into the ``parent_task`` variable. ``bpf_core_read()`` is +just like ``bpf_probe_read_kernel()`` BPF helper, except it records information +about the field that should be relocated on the target kernel. i.e, if the +``parent`` field gets shifted to a different offset within +``struct task_struct`` due to some new field added in front of it, libbpf will +automatically adjust the actual offset to the proper value. + +Getting Started with libbpf +=========================== + +Check out the `libbpf-bootstrap `_ +repository with simple examples of using libbpf to build various BPF +applications. + +See also `libbpf API documentation +`_. 
+ +libbpf and Rust +=============== + +If you are building BPF applications in Rust, it is recommended to use the +`Libbpf-rs `_ library instead of bindgen +bindings directly to libbpf. Libbpf-rs wraps libbpf functionality in +Rust-idiomatic interfaces and provides libbpf-cargo plugin to handle BPF code +compilation and skeleton generation. Using Libbpf-rs will make building user +space part of the BPF application easier. Note that the BPF program themselves +must still be written in plain C. + +Additional Documentation +======================== + +* `Program types and ELF Sections `_ +* `API naming convention `_ +* `Building libbpf `_ +* `API documentation Convention `_ -- cgit v1.2.3 From 6c831c4684124a544f73f7c9b83bc7b2eb0b23d3 Mon Sep 17 00:00:00 2001 From: David Vernet Date: Sat, 25 Mar 2023 16:31:46 -0500 Subject: bpf: Treat KF_RELEASE kfuncs as KF_TRUSTED_ARGS KF_RELEASE kfuncs are not currently treated as having KF_TRUSTED_ARGS, even though they have a superset of the requirements of KF_TRUSTED_ARGS. Like KF_TRUSTED_ARGS, KF_RELEASE kfuncs require a 0-offset argument, and don't allow NULL-able arguments. Unlike KF_TRUSTED_ARGS, which requires _either_ an argument with ref_obj_id > 0, _or_ (ref->type & BPF_REG_TRUSTED_MODIFIERS) (and no unsafe modifiers allowed), KF_RELEASE only allows for ref_obj_id > 0. Because KF_RELEASE today doesn't automatically imply KF_TRUSTED_ARGS, some of these requirements are enforced in different ways that can make the behavior of the verifier feel unpredictable. For example, a KF_RELEASE kfunc with a NULL-able argument will currently fail in the verifier with a message like, "arg#0 is ptr_or_null_ expected ptr_ or socket" rather than "Possibly NULL pointer passed to trusted arg0". Our intention is the same, but the semantics are different due to implementation details that kfunc authors and BPF program writers should not need to care about.
Let's make the behavior of the verifier more consistent and intuitive by having KF_RELEASE kfuncs imply the presence of KF_TRUSTED_ARGS. Our eventual goal is to have all kfuncs assume KF_TRUSTED_ARGS by default anyways, so this takes us a step in that direction. Note that it does not make sense to assume KF_TRUSTED_ARGS for all KF_ACQUIRE kfuncs. KF_ACQUIRE kfuncs can have looser semantics than KF_RELEASE, with e.g. KF_RCU | KF_RET_NULL. We may want to have KF_ACQUIRE imply KF_TRUSTED_ARGS _unless_ KF_RCU is specified, but that can be left to another patch set, and there are no such subtleties to address for KF_RELEASE. Signed-off-by: David Vernet Link: https://lore.kernel.org/r/20230325213144.486885-4-void@manifault.com Signed-off-by: Alexei Starovoitov --- Documentation/bpf/kfuncs.rst | 7 ++++--- kernel/bpf/cpumask.c | 2 +- kernel/bpf/verifier.c | 2 +- net/bpf/test_run.c | 6 ++++++ tools/testing/selftests/bpf/progs/cgrp_kfunc_failure.c | 4 ++-- tools/testing/selftests/bpf/progs/task_kfunc_failure.c | 6 +++--- tools/testing/selftests/bpf/verifier/calls.c | 10 +++++++--- tools/testing/selftests/bpf/verifier/ref_tracking.c | 6 +++--- 8 files changed, 27 insertions(+), 16 deletions(-) (limited to 'Documentation') diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst index 69eccf6f98ef..bf1b85941452 100644 --- a/Documentation/bpf/kfuncs.rst +++ b/Documentation/bpf/kfuncs.rst @@ -179,9 +179,10 @@ both are orthogonal to each other. --------------------- The KF_RELEASE flag is used to indicate that the kfunc releases the pointer -passed in to it. There can be only one referenced pointer that can be passed in. -All copies of the pointer being released are invalidated as a result of invoking -kfunc with this flag. +passed in to it. There can be only one referenced pointer that can be passed +in. All copies of the pointer being released are invalidated as a result of +invoking kfunc with this flag. 
KF_RELEASE kfuncs automatically receive the +protection afforded by the KF_TRUSTED_ARGS flag described below. 2.4.4 KF_KPTR_GET flag ---------------------- diff --git a/kernel/bpf/cpumask.c b/kernel/bpf/cpumask.c index e991af7dc13c..7efdf5d770ca 100644 --- a/kernel/bpf/cpumask.c +++ b/kernel/bpf/cpumask.c @@ -402,7 +402,7 @@ __diag_pop(); BTF_SET8_START(cpumask_kfunc_btf_ids) BTF_ID_FLAGS(func, bpf_cpumask_create, KF_ACQUIRE | KF_RET_NULL) -BTF_ID_FLAGS(func, bpf_cpumask_release, KF_RELEASE | KF_TRUSTED_ARGS) +BTF_ID_FLAGS(func, bpf_cpumask_release, KF_RELEASE) BTF_ID_FLAGS(func, bpf_cpumask_acquire, KF_ACQUIRE | KF_TRUSTED_ARGS) BTF_ID_FLAGS(func, bpf_cpumask_first, KF_RCU) BTF_ID_FLAGS(func, bpf_cpumask_first_zero, KF_RCU) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 64f06f6e16bf..20eb2015842f 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -9307,7 +9307,7 @@ static bool is_kfunc_release(struct bpf_kfunc_call_arg_meta *meta) static bool is_kfunc_trusted_args(struct bpf_kfunc_call_arg_meta *meta) { - return meta->kfunc_flags & KF_TRUSTED_ARGS; + return (meta->kfunc_flags & KF_TRUSTED_ARGS) || is_kfunc_release(meta); } static bool is_kfunc_sleepable(struct bpf_kfunc_call_arg_meta *meta) diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index 27587f1c5f36..f1652f5fbd2e 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -606,6 +606,11 @@ bpf_kfunc_call_test_acquire(unsigned long *scalar_ptr) return &prog_test_struct; } +__bpf_kfunc void bpf_kfunc_call_test_offset(struct prog_test_ref_kfunc *p) +{ + WARN_ON_ONCE(1); +} + __bpf_kfunc struct prog_test_member * bpf_kfunc_call_memb_acquire(void) { @@ -800,6 +805,7 @@ BTF_ID_FLAGS(func, bpf_kfunc_call_test_mem_len_fail2) BTF_ID_FLAGS(func, bpf_kfunc_call_test_ref, KF_TRUSTED_ARGS | KF_RCU) BTF_ID_FLAGS(func, bpf_kfunc_call_test_destructive, KF_DESTRUCTIVE) BTF_ID_FLAGS(func, bpf_kfunc_call_test_static_unused_arg) +BTF_ID_FLAGS(func, bpf_kfunc_call_test_offset) 
BTF_SET8_END(test_sk_check_kfunc_ids) static void *bpf_test_init(const union bpf_attr *kattr, u32 user_size, diff --git a/tools/testing/selftests/bpf/progs/cgrp_kfunc_failure.c b/tools/testing/selftests/bpf/progs/cgrp_kfunc_failure.c index 807fb0ac41e9..48b2034cadb3 100644 --- a/tools/testing/selftests/bpf/progs/cgrp_kfunc_failure.c +++ b/tools/testing/selftests/bpf/progs/cgrp_kfunc_failure.c @@ -206,7 +206,7 @@ int BPF_PROG(cgrp_kfunc_get_unreleased, struct cgroup *cgrp, const char *path) } SEC("tp_btf/cgroup_mkdir") -__failure __msg("expects refcounted") +__failure __msg("Possibly NULL pointer passed to trusted arg0") int BPF_PROG(cgrp_kfunc_release_untrusted, struct cgroup *cgrp, const char *path) { struct __cgrps_kfunc_map_value *v; @@ -234,7 +234,7 @@ int BPF_PROG(cgrp_kfunc_release_fp, struct cgroup *cgrp, const char *path) } SEC("tp_btf/cgroup_mkdir") -__failure __msg("arg#0 is ptr_or_null_ expected ptr_ or socket") +__failure __msg("Possibly NULL pointer passed to trusted arg0") int BPF_PROG(cgrp_kfunc_release_null, struct cgroup *cgrp, const char *path) { struct __cgrps_kfunc_map_value local, *v; diff --git a/tools/testing/selftests/bpf/progs/task_kfunc_failure.c b/tools/testing/selftests/bpf/progs/task_kfunc_failure.c index 27994d6b2914..2c374a7ffece 100644 --- a/tools/testing/selftests/bpf/progs/task_kfunc_failure.c +++ b/tools/testing/selftests/bpf/progs/task_kfunc_failure.c @@ -206,7 +206,7 @@ int BPF_PROG(task_kfunc_get_unreleased, struct task_struct *task, u64 clone_flag } SEC("tp_btf/task_newtask") -__failure __msg("arg#0 is untrusted_ptr_or_null_ expected ptr_ or socket") +__failure __msg("Possibly NULL pointer passed to trusted arg0") int BPF_PROG(task_kfunc_release_untrusted, struct task_struct *task, u64 clone_flags) { struct __tasks_kfunc_map_value *v; @@ -234,7 +234,7 @@ int BPF_PROG(task_kfunc_release_fp, struct task_struct *task, u64 clone_flags) } SEC("tp_btf/task_newtask") -__failure __msg("arg#0 is ptr_or_null_ expected ptr_ or socket") 
+__failure __msg("Possibly NULL pointer passed to trusted arg0")
 int BPF_PROG(task_kfunc_release_null, struct task_struct *task, u64 clone_flags)
 {
 	struct __tasks_kfunc_map_value local, *v;
@@ -277,7 +277,7 @@ int BPF_PROG(task_kfunc_release_unacquired, struct task_struct *task, u64 clone_
 }
 
 SEC("tp_btf/task_newtask")
-__failure __msg("arg#0 is ptr_or_null_ expected ptr_ or socket")
+__failure __msg("Possibly NULL pointer passed to trusted arg0")
 int BPF_PROG(task_kfunc_from_pid_no_null_check, struct task_struct *task, u64 clone_flags)
 {
 	struct task_struct *acquired;
diff --git a/tools/testing/selftests/bpf/verifier/calls.c b/tools/testing/selftests/bpf/verifier/calls.c
index 5702fc9761ef..1bdf2b43e49e 100644
--- a/tools/testing/selftests/bpf/verifier/calls.c
+++ b/tools/testing/selftests/bpf/verifier/calls.c
@@ -109,7 +109,7 @@
 	},
 	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
 	.result = REJECT,
-	.errstr = "arg#0 is ptr_or_null_ expected ptr_ or socket",
+	.errstr = "Possibly NULL pointer passed to trusted arg0",
 	.fixup_kfunc_btf_id = {
 		{ "bpf_kfunc_call_test_acquire", 3 },
 		{ "bpf_kfunc_call_test_release", 5 },
@@ -165,19 +165,23 @@
 	BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -8),
 	BPF_ST_MEM(BPF_DW, BPF_REG_1, 0, 0),
 	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_KFUNC_CALL, 0, 0),
+	BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
 	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
 	BPF_EXIT_INSN(),
 	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
-	BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 16),
 	BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -4),
 	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_KFUNC_CALL, 0, 0),
 	BPF_MOV64_IMM(BPF_REG_0, 0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_2),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_KFUNC_CALL, 0, 0),
+	BPF_MOV64_IMM(BPF_REG_0, 0),
 	BPF_EXIT_INSN(),
 	},
 	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
 	.fixup_kfunc_btf_id = {
 		{ "bpf_kfunc_call_test_acquire", 3 },
-		{ "bpf_kfunc_call_test_release", 9 },
+		{ "bpf_kfunc_call_test_offset", 9 },
+		{ "bpf_kfunc_call_test_release", 12 },
 	},
 	.result_unpriv = REJECT,
 	.result = REJECT,
diff --git a/tools/testing/selftests/bpf/verifier/ref_tracking.c b/tools/testing/selftests/bpf/verifier/ref_tracking.c
index 9540164712b7..5a2e154dd1e0 100644
--- a/tools/testing/selftests/bpf/verifier/ref_tracking.c
+++ b/tools/testing/selftests/bpf/verifier/ref_tracking.c
@@ -142,7 +142,7 @@
 	.kfunc = "bpf",
 	.expected_attach_type = BPF_LSM_MAC,
 	.flags = BPF_F_SLEEPABLE,
-	.errstr = "arg#0 is ptr_or_null_ expected ptr_ or socket",
+	.errstr = "Possibly NULL pointer passed to trusted arg0",
 	.fixup_kfunc_btf_id = {
 		{ "bpf_lookup_user_key", 2 },
 		{ "bpf_key_put", 4 },
@@ -163,7 +163,7 @@
 	.kfunc = "bpf",
 	.expected_attach_type = BPF_LSM_MAC,
 	.flags = BPF_F_SLEEPABLE,
-	.errstr = "arg#0 is ptr_or_null_ expected ptr_ or socket",
+	.errstr = "Possibly NULL pointer passed to trusted arg0",
 	.fixup_kfunc_btf_id = {
 		{ "bpf_lookup_system_key", 1 },
 		{ "bpf_key_put", 3 },
@@ -182,7 +182,7 @@
 	.kfunc = "bpf",
 	.expected_attach_type = BPF_LSM_MAC,
 	.flags = BPF_F_SLEEPABLE,
-	.errstr = "arg#0 pointer type STRUCT bpf_key must point to scalar, or struct with scalar",
+	.errstr = "Possibly NULL pointer passed to trusted arg0",
 	.fixup_kfunc_btf_id = {
 		{ "bpf_key_put", 1 },
 	},
-- 
cgit v1.2.3


From 8cfee110711ed60bfdd39af0107ddef01d6b72c3 Mon Sep 17 00:00:00 2001
From: Dave Thaler
Date: Sun, 26 Mar 2023 03:31:17 +0000
Subject: bpf, docs: Add extended call instructions

Add extended call instructions. Uses the term "program-local" for
call by offset. And there are instructions for calling helper functions
by "address" (the old way of using integer values), and for calling
helper functions by BTF ID (for kfuncs).

V1 -> V2: addressed comments from David Vernet
V2 -> V3: make descriptions in table consistent with updated names
V3 -> V4: addressed comments from Alexei
V4 -> V5: fixed alignment

Signed-off-by: Dave Thaler
Acked-by: David Vernet
Link: https://lore.kernel.org/r/20230326033117.1075-1-dthaler1968@googlemail.com
Signed-off-by: Alexei Starovoitov
---
 Documentation/bpf/instruction-set.rst | 59 ++++++++++++++++++++++-------------
 1 file changed, 37 insertions(+), 22 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index b44640589055..b77280eb926f 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -243,27 +243,29 @@ Jump instructions
 otherwise identical operations.
 The 'code' field encodes the operation as below:
 
-========  =====  =========================  ============
-code      value  description                notes
-========  =====  =========================  ============
-BPF_JA    0x00   PC += off                  BPF_JMP only
-BPF_JEQ   0x10   PC += off if dst == src
-BPF_JGT   0x20   PC += off if dst > src     unsigned
-BPF_JGE   0x30   PC += off if dst >= src    unsigned
-BPF_JSET  0x40   PC += off if dst & src
-BPF_JNE   0x50   PC += off if dst != src
-BPF_JSGT  0x60   PC += off if dst > src     signed
-BPF_JSGE  0x70   PC += off if dst >= src    signed
-BPF_CALL  0x80   function call              see `Helper functions`_
-BPF_EXIT  0x90   function / program return  BPF_JMP only
-BPF_JLT   0xa0   PC += off if dst < src     unsigned
-BPF_JLE   0xb0   PC += off if dst <= src    unsigned
-BPF_JSLT  0xc0   PC += off if dst < src     signed
-BPF_JSLE  0xd0   PC += off if dst <= src    signed
-========  =====  =========================  ============
+========  =====  ===  ===========================================  =========================================
+code      value  src  description                                  notes
+========  =====  ===  ===========================================  =========================================
+BPF_JA    0x0    0x0  PC += offset                                 BPF_JMP only
+BPF_JEQ   0x1    any  PC += offset if dst == src
+BPF_JGT   0x2    any  PC += offset if dst > src                    unsigned
+BPF_JGE   0x3    any  PC += offset if dst >= src                   unsigned
+BPF_JSET  0x4    any  PC += offset if dst & src
+BPF_JNE   0x5    any  PC += offset if dst != src
+BPF_JSGT  0x6    any  PC += offset if dst > src                    signed
+BPF_JSGE  0x7    any  PC += offset if dst >= src                   signed
+BPF_CALL  0x8    0x0  call helper function by address              see `Helper functions`_
+BPF_CALL  0x8    0x1  call PC += offset                            see `Program-local functions`_
+BPF_CALL  0x8    0x2  call helper function by BTF ID               see `Helper functions`_
+BPF_EXIT  0x9    0x0  return                                       BPF_JMP only
+BPF_JLT   0xa    any  PC += offset if dst < src                    unsigned
+BPF_JLE   0xb    any  PC += offset if dst <= src                   unsigned
+BPF_JSLT  0xc    any  PC += offset if dst < src                    signed
+BPF_JSLE  0xd    any  PC += offset if dst <= src                   signed
+========  =====  ===  ===========================================  =========================================
 
 The eBPF program needs to store the return value into register R0 before doing a
-BPF_EXIT.
+``BPF_EXIT``.
 
 Example:
 
@@ -277,9 +279,22 @@ Helper functions
 ~~~~~~~~~~~~~~~~
 
 Helper functions are a concept whereby BPF programs can call into a
-set of function calls exposed by the runtime. Each helper
-function is identified by an integer used in a ``BPF_CALL`` instruction.
-The available helper functions may differ for each program type.
+set of function calls exposed by the underlying platform.
+
+Historically, each helper function was identified by an address
+encoded in the imm field. The available helper functions may differ
+for each program type, but address values are unique across all program types.
+
+Platforms that support the BPF Type Format (BTF) support identifying
+a helper function by a BTF ID encoded in the imm field, where the BTF ID
+identifies the helper name and type.
+
+Program-local functions
+~~~~~~~~~~~~~~~~~~~~~~~
+Program-local functions are functions exposed by the same BPF program as the
+caller, and are referenced by offset from the call instruction, similar to
+``BPF_JA``. A ``BPF_EXIT`` within the program-local function will return to
+the caller.
 
 Load and store instructions
 ===========================
-- 
cgit v1.2.3


From db9d479ab59b21d719486e6bf673f83f129dae32 Mon Sep 17 00:00:00 2001
From: David Vernet
Date: Fri, 31 Mar 2023 14:57:33 -0500
Subject: bpf,docs: Update documentation to reflect new task kfuncs

Now that struct task_struct objects are RCU safe, and bpf_task_acquire()
can return NULL, we should update the BPF task kfunc documentation to
reflect the current state of the API.

Signed-off-by: David Vernet
Link: https://lore.kernel.org/r/20230331195733.699708-4-void@manifault.com
Signed-off-by: Alexei Starovoitov
---
 Documentation/bpf/kfuncs.rst | 49 ++++++++++++++++++++++++++++++++++++++------
 1 file changed, 43 insertions(+), 6 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
index bf1b85941452..d8a16c4bef7f 100644
--- a/Documentation/bpf/kfuncs.rst
+++ b/Documentation/bpf/kfuncs.rst
@@ -471,13 +471,50 @@ struct_ops callback arg. For example:
 		struct task_struct *acquired;
 
 		acquired = bpf_task_acquire(task);
+		if (acquired)
+			/*
+			 * In a typical program you'd do something like store
+			 * the task in a map, and the map will automatically
+			 * release it later. Here, we release it manually.
+			 */
+			bpf_task_release(acquired);
+		return 0;
+	}
+
+
+References acquired on ``struct task_struct *`` objects are RCU protected.
+Therefore, when in an RCU read region, you can obtain a pointer to a task
+embedded in a map value without having to acquire a reference:
+
+.. code-block:: c
+
+	#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))
+	private(TASK) static struct task_struct *global;
+
+	/**
+	 * A trivial example showing how to access a task stored
+	 * in a map using RCU.
+	 */
+	SEC("tp_btf/task_newtask")
+	int BPF_PROG(task_rcu_read_example, struct task_struct *task, u64 clone_flags)
+	{
+		struct task_struct *local_copy;
+
+		bpf_rcu_read_lock();
+		local_copy = global;
+		if (local_copy)
+			/*
+			 * We could also pass local_copy to kfuncs or helper functions here,
+			 * as we're guaranteed that local_copy will be valid until we exit
+			 * the RCU read region below.
+			 */
+			bpf_printk("Global task %s is valid", local_copy->comm);
+		else
+			bpf_printk("No global task found");
+		bpf_rcu_read_unlock();
+
+		/* At this point we can no longer reference local_copy. */
 
-		/*
-		 * In a typical program you'd do something like store
-		 * the task in a map, and the map will automatically
-		 * release it later. Here, we release it manually.
-		 */
-		bpf_task_release(acquired);
 		return 0;
 	}
 
-- 
cgit v1.2.3


From 16b7c970cc8192e929dbd5192ccc1867e19d7bda Mon Sep 17 00:00:00 2001
From: Dave Thaler
Date: Sun, 26 Mar 2023 05:49:46 +0000
Subject: bpf, docs: Add docs on extended 64-bit immediate instructions

Add docs on extended 64-bit immediate instructions, including six instructions
previously undocumented. Include a brief description of maps and variables,
as used by those instructions.

V1 -> V2: rebased on top of latest master
V2 -> V3: addressed comments from Alexei
V3 -> V4: addressed comments from David Vernet

Signed-off-by: Dave Thaler
Link: https://lore.kernel.org/r/20230326054946.2331-1-dthaler1968@googlemail.com
Signed-off-by: Alexei Starovoitov
---
 Documentation/bpf/instruction-set.rst | 58 ++++++++++++++++++++++++++++++-----
 Documentation/bpf/linux-notes.rst     | 22 +++++++++++++
 2 files changed, 72 insertions(+), 8 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index b77280eb926f..492980ece1ab 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -416,14 +416,56 @@ and loaded back to ``R0``.
 -----------------------------
 
 Instructions with the ``BPF_IMM`` 'mode' modifier use the wide instruction
-encoding for an extra imm64 value.
-
-There is currently only one such instruction.
-
-``BPF_LD | BPF_DW | BPF_IMM`` means::
-
-  dst = imm64
-
+encoding defined in `Instruction encoding`_, and use the 'src' field of the
+basic instruction to hold an opcode subtype.
+
+The following table defines a set of ``BPF_IMM | BPF_DW | BPF_LD`` instructions
+with opcode subtypes in the 'src' field, using new terms such as "map"
+defined further below:
+
+=========================  ======  ===  =========================================  ===========  ==============
+opcode construction        opcode  src  pseudocode                                 imm type     dst type
+=========================  ======  ===  =========================================  ===========  ==============
+BPF_IMM | BPF_DW | BPF_LD  0x18    0x0  dst = imm64                                integer      integer
+BPF_IMM | BPF_DW | BPF_LD  0x18    0x1  dst = map_by_fd(imm)                       map fd       map
+BPF_IMM | BPF_DW | BPF_LD  0x18    0x2  dst = map_val(map_by_fd(imm)) + next_imm   map fd       data pointer
+BPF_IMM | BPF_DW | BPF_LD  0x18    0x3  dst = var_addr(imm)                        variable id  data pointer
+BPF_IMM | BPF_DW | BPF_LD  0x18    0x4  dst = code_addr(imm)                       integer      code pointer
+BPF_IMM | BPF_DW | BPF_LD  0x18    0x5  dst = map_by_idx(imm)                      map index    map
+BPF_IMM | BPF_DW | BPF_LD  0x18    0x6  dst = map_val(map_by_idx(imm)) + next_imm  map index    data pointer
+=========================  ======  ===  =========================================  ===========  ==============
+
+where
+
+* map_by_fd(imm) means to convert a 32-bit file descriptor into an address of a map (see `Maps`_)
+* map_by_idx(imm) means to convert a 32-bit index into an address of a map
+* map_val(map) gets the address of the first value in a given map
+* var_addr(imm) gets the address of a platform variable (see `Platform Variables`_) with a given id
+* code_addr(imm) gets the address of the instruction at a specified relative offset in number of (64-bit) instructions
+* the 'imm type' can be used by disassemblers for display
+* the 'dst type' can be used for verification and JIT compilation purposes
+
+Maps
+~~~~
+
+Maps are shared memory regions accessible by eBPF programs on some platforms.
+A map can have various semantics as defined in a separate document, and may or
+may not have a single contiguous memory region, but the 'map_val(map)' is
+currently only defined for maps that do have a single contiguous memory region.
+
+Each map can have a file descriptor (fd) if supported by the platform, where
+'map_by_fd(imm)' means to get the map with the specified file descriptor. Each
+BPF program can also be defined to use a set of maps associated with the
+program at load time, and 'map_by_idx(imm)' means to get the map with the given
+index in the set associated with the BPF program containing the instruction.
+
+Platform Variables
+~~~~~~~~~~~~~~~~~~
+
+Platform variables are memory regions, identified by integer ids, exposed by
+the runtime and accessible by BPF programs on some platforms. The
+'var_addr(imm)' operation means to get the address of the memory region
+identified by the given id.
 
 Legacy BPF Packet access instructions
 -------------------------------------
diff --git a/Documentation/bpf/linux-notes.rst b/Documentation/bpf/linux-notes.rst
index f43b9c797bcb..508d009d3bed 100644
--- a/Documentation/bpf/linux-notes.rst
+++ b/Documentation/bpf/linux-notes.rst
@@ -20,6 +20,28 @@ integer would be read from a specified register, is not currently supported
 by the verifier. Any programs with this instruction will fail to load
 until such support is added.
 
+Maps
+====
+
+Linux only supports the 'map_val(map)' operation on array maps with a single element.
+
+Linux uses an fd_array to store maps associated with a BPF program. Thus,
+map_by_idx(imm) uses the fd at that index in the array.
+
+Variables
+=========
+
+The following 64-bit immediate instruction specifies that a variable address,
+which corresponds to some integer stored in the 'imm' field, should be loaded:
+
+=========================  ======  ===  =========================================  ===========  ==============
+opcode construction        opcode  src  pseudocode                                 imm type     dst type
+=========================  ======  ===  =========================================  ===========  ==============
+BPF_IMM | BPF_DW | BPF_LD  0x18    0x3  dst = var_addr(imm)                        variable id  data pointer
+=========================  ======  ===  =========================================  ===========  ==============
+
+On Linux, this integer is a BTF ID.
+
 Legacy BPF Packet access instructions
 =====================================
 
-- 
cgit v1.2.3


From ec48599abee3a16fdc93c1c3c3e153a4f4d29420 Mon Sep 17 00:00:00 2001
From: David Vernet
Date: Mon, 10 Apr 2023 23:16:33 -0500
Subject: bpf,docs: Remove references to bpf_cgroup_kptr_get()

The bpf_cgroup_kptr_get() kfunc has been removed, and
bpf_cgroup_acquire() / bpf_cgroup_release() now have the same semantics
as bpf_task_acquire() / bpf_task_release(). This patch updates the BPF
documentation to reflect this.

Signed-off-by: David Vernet
Link: https://lore.kernel.org/r/20230411041633.179404-3-void@manifault.com
Signed-off-by: Alexei Starovoitov
---
 Documentation/bpf/kfuncs.rst | 68 --------------------------------------------
 1 file changed, 68 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
index d8a16c4bef7f..3b42cfe12437 100644
--- a/Documentation/bpf/kfuncs.rst
+++ b/Documentation/bpf/kfuncs.rst
@@ -572,74 +572,6 @@ bpf_task_release() respectively, so we won't provide examples for them.
 
 ----
 
-You may also acquire a reference to a ``struct cgroup`` kptr that's already
-stored in a map using bpf_cgroup_kptr_get():
-
-.. kernel-doc:: kernel/bpf/helpers.c
-   :identifiers: bpf_cgroup_kptr_get
-
-Here's an example of how it can be used:
-
-.. code-block:: c
-
-	/* struct containing the struct task_struct kptr which is actually stored in the map. */
-	struct __cgroups_kfunc_map_value {
-		struct cgroup __kptr * cgroup;
-	};
-
-	/* The map containing struct __cgroups_kfunc_map_value entries. */
-	struct {
-		__uint(type, BPF_MAP_TYPE_HASH);
-		__type(key, int);
-		__type(value, struct __cgroups_kfunc_map_value);
-		__uint(max_entries, 1);
-	} __cgroups_kfunc_map SEC(".maps");
-
-	/* ... */
-
-	/**
-	 * A simple example tracepoint program showing how a
-	 * struct cgroup kptr that is stored in a map can
-	 * be acquired using the bpf_cgroup_kptr_get() kfunc.
-	 */
-	SEC("tp_btf/cgroup_mkdir")
-	int BPF_PROG(cgroup_kptr_get_example, struct cgroup *cgrp, const char *path)
-	{
-		struct cgroup *kptr;
-		struct __cgroups_kfunc_map_value *v;
-		s32 id = cgrp->self.id;
-
-		/* Assume a cgroup kptr was previously stored in the map. */
-		v = bpf_map_lookup_elem(&__cgroups_kfunc_map, &id);
-		if (!v)
-			return -ENOENT;
-
-		/* Acquire a reference to the cgroup kptr that's already stored in the map. */
-		kptr = bpf_cgroup_kptr_get(&v->cgroup);
-		if (!kptr)
-			/* If no cgroup was present in the map, it's because
-			 * we're racing with another CPU that removed it with
-			 * bpf_kptr_xchg() between the bpf_map_lookup_elem()
-			 * above, and our call to bpf_cgroup_kptr_get().
-			 * bpf_cgroup_kptr_get() internally safely handles this
-			 * race, and will return NULL if the task is no longer
-			 * present in the map by the time we invoke the kfunc.
-			 */
-			return -EBUSY;
-
-		/* Free the reference we just took above. Note that the
-		 * original struct cgroup kptr is still in the map. It will
-		 * be freed either at a later time if another context deletes
-		 * it from the map, or automatically by the BPF subsystem if
-		 * it's still present when the map is destroyed.
-		 */
-		bpf_cgroup_release(kptr);
-
-		return 0;
-	}
-
-----
-
 Other kfuncs available for interacting with ``struct cgroup *`` objects are
 bpf_cgroup_ancestor() and bpf_cgroup_from_id(), allowing callers to access
 the ancestor of a cgroup and find a cgroup by its ID, respectively. Both
-- 
cgit v1.2.3
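
The jump-instruction table added by the "bpf, docs: Add extended call instructions" patch above describes a call with two fields: the 4-bit 'code' value (0x8 for ``BPF_CALL``) and the 'src' field, which selects the call flavor. As a rough illustration only (this is not code from any of these patches), the sketch below decodes both. The names CLS_JMP, SRC_X, jmp_code(), classify_call() and enum call_kind are hypothetical. The sketch assumes the opcode byte is laid out as code (high 4 bits), source bit (0x08) and class (low 3 bits), and that ``BPF_CALL`` is only meaningful in the ``BPF_JMP`` class (0x05), which the table itself does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed opcode byte layout: code (high 4 bits) | source bit | class (low 3 bits). */
#define CLS_MASK   0x07
#define CLS_JMP    0x05
#define CLS_JMP32  0x06
#define SRC_X      0x08	/* BPF_X: operand taken from a register */
#define CODE_CALL  0x8	/* "value" column for BPF_CALL in the table */

/* Hypothetical names for the three 'src' values listed for BPF_CALL. */
enum call_kind {
	CALL_UNKNOWN = -1,
	CALL_HELPER_BY_ADDRESS = 0,	/* src 0x0 */
	CALL_PROGRAM_LOCAL = 1,		/* src 0x1 */
	CALL_HELPER_BY_BTF_ID = 2,	/* src 0x2 */
};

/* Return the 4-bit 'code' field of a jump-class opcode byte. */
static inline unsigned int jmp_code(uint8_t opcode)
{
	uint8_t cls = opcode & CLS_MASK;

	/* Precondition: the caller passes a BPF_JMP or BPF_JMP32 opcode. */
	assert(cls == CLS_JMP || cls == CLS_JMP32);
	return opcode >> 4;
}

/* Map a (opcode, src) pair to the call flavor listed in the table. */
static inline enum call_kind classify_call(uint8_t opcode, uint8_t src)
{
	/* Assumption: calls live in the BPF_JMP class, immediate form only. */
	if ((opcode & CLS_MASK) != CLS_JMP || (opcode & SRC_X))
		return CALL_UNKNOWN;
	if (jmp_code(opcode) != CODE_CALL)
		return CALL_UNKNOWN;

	switch (src) {
	case 0x0: return CALL_HELPER_BY_ADDRESS;
	case 0x1: return CALL_PROGRAM_LOCAL;
	case 0x2: return CALL_HELPER_BY_BTF_ID;
	default:  return CALL_UNKNOWN;
	}
}
```

Under these assumptions, opcode 0x85 with src 0x1 classifies as a program-local call, while 0x8d (``BPF_CALL | BPF_X | BPF_JMP``) falls through to CALL_UNKNOWN, in line with the linux-notes.rst text stating that the verifier does not support it.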