diff options
author | David S. Miller | 2019-04-11 17:00:05 -0700 |
---|---|---|
committer | David S. Miller | 2019-04-11 17:00:05 -0700 |
commit | bb23581b9b38703257acabd520aa5ebf1db008af (patch) | |
tree | 9d9f7b7ccad9697dfd2eab3b1fda37056b0e47f6 | |
parent | 78f07adac86186b5ef0318b7faec377b6d31ea9f (diff) | |
parent | 947e8b595b82d3551750641445d0a97b8f29b536 (diff) |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2019-04-12
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Improve BPF verifier scalability for large programs through two
optimizations: i) remove verifier states that are not useful in pruning,
ii) stop walking parentage chain once first LIVE_READ is seen. Combined
gives approx 20x speedup. Increase limits for accepting large programs
under root, and add various stress tests, from Alexei.
2) Implement global data support in BPF. This enables static global variables
for .data, .rodata and .bss sections to be properly handled which allows
for more natural program development. This also opens up the possibility
to optimize program workflow by compiling ELFs only once and later only
rewriting section data before reload, from Daniel and with test cases and
libbpf refactoring from Joe.
3) Add config option to generate BTF type info for vmlinux as part of the
kernel build process. DWARF debug info is converted via pahole to BTF.
Latter relies on libbpf and makes use of BTF deduplication algorithm which
results in 100x savings compared to DWARF data. Resulting .BTF section is
typically about 2MB in size, from Andrii.
4) Add BPF verifier support for stack access with variable offset from
helpers and add various test cases along with it, from Andrey.
5) Extend bpf_skb_adjust_room() growth BPF helper to mark inner MAC header
so that L2 encapsulation can be used for tc tunnels, from Alan.
6) Add support for input __sk_buff context in BPF_PROG_TEST_RUN so that
users can define a subset of allowed __sk_buff fields that get fed into
the test program, from Stanislav.
7) Add bpf fs multi-dimensional array tests for BTF test suite and fix up
various UBSAN warnings in bpftool, from Yonghong.
8) Generate a pkg-config file for libbpf, from Luca.
9) Dump program's BTF id in bpftool, from Prashant.
10) libbpf fix to use smaller BPF log buffer size for AF_XDP's XDP
program, from Magnus.
11) kallsyms related fixes for the case when symbols are not present in
BPF selftests and samples, from Daniel
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
75 files changed, 4603 insertions, 425 deletions
@@ -16,6 +16,9 @@ Alan Cox <alan@lxorguk.ukuu.org.uk> Alan Cox <root@hraefn.swansea.linux.org.uk> Aleksey Gorelov <aleksey_gorelov@phoenix.com> Aleksandar Markovic <aleksandar.markovic@mips.com> <aleksandar.markovic@imgtec.com> +Alexei Starovoitov <ast@kernel.org> <ast@plumgrid.com> +Alexei Starovoitov <ast@kernel.org> <alexei.starovoitov@gmail.com> +Alexei Starovoitov <ast@kernel.org> <ast@fb.com> Al Viro <viro@ftp.linux.org.uk> Al Viro <viro@zenIV.linux.org.uk> Andi Shyti <andi@etezian.org> <andi.shyti@samsung.com> @@ -46,6 +49,12 @@ Christoph Hellwig <hch@lst.de> Christophe Ricard <christophe.ricard@gmail.com> Corey Minyard <minyard@acm.org> Damian Hobson-Garcia <dhobsong@igel.co.jp> +Daniel Borkmann <daniel@iogearbox.net> <dborkman@redhat.com> +Daniel Borkmann <daniel@iogearbox.net> <dborkmann@redhat.com> +Daniel Borkmann <daniel@iogearbox.net> <danborkmann@iogearbox.net> +Daniel Borkmann <daniel@iogearbox.net> <daniel.borkmann@tik.ee.ethz.ch> +Daniel Borkmann <daniel@iogearbox.net> <danborkmann@googlemail.com> +Daniel Borkmann <daniel@iogearbox.net> <dxchgb@gmail.com> David Brownell <david-b@pacbell.net> David Woodhouse <dwmw2@shinybook.infradead.org> Dengcheng Zhu <dzhu@wavecomp.com> <dengcheng.zhu@mips.com> diff --git a/Documentation/bpf/btf.rst b/Documentation/bpf/btf.rst index 7313d354f20e..29396e6943b0 100644 --- a/Documentation/bpf/btf.rst +++ b/Documentation/bpf/btf.rst @@ -82,6 +82,8 @@ sequentially and type id is assigned to each recognized type starting from id #define BTF_KIND_RESTRICT 11 /* Restrict */ #define BTF_KIND_FUNC 12 /* Function */ #define BTF_KIND_FUNC_PROTO 13 /* Function Proto */ + #define BTF_KIND_VAR 14 /* Variable */ + #define BTF_KIND_DATASEC 15 /* Section */ Note that the type section encodes debug info, not just pure types. ``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram. @@ -393,6 +395,61 @@ refers to parameter type. If the function has variable arguments, the last parameter is encoded with ``name_off = 0`` and ``type = 0``. +2.2.14 BTF_KIND_VAR +~~~~~~~~~~~~~~~~~~~ + +``struct btf_type`` encoding requirement: + * ``name_off``: offset to a valid C identifier + * ``info.kind_flag``: 0 + * ``info.kind``: BTF_KIND_VAR + * ``info.vlen``: 0 + * ``type``: the type of the variable + +``btf_type`` is followed by a single ``struct btf_variable`` with the +following data:: + + struct btf_var { + __u32 linkage; + }; + +``struct btf_var`` encoding: + * ``linkage``: currently only static variable 0, or globally allocated + variable in ELF sections 1 + +Not all type of global variables are supported by LLVM at this point. +The following is currently available: + + * static variables with or without section attributes + * global variables with section attributes + +The latter is for future extraction of map key/value type id's from a +map definition. + +2.2.15 BTF_KIND_DATASEC +~~~~~~~~~~~~~~~~~~~~~~~ + +``struct btf_type`` encoding requirement: + * ``name_off``: offset to a valid name associated with a variable or + one of .data/.bss/.rodata + * ``info.kind_flag``: 0 + * ``info.kind``: BTF_KIND_DATASEC + * ``info.vlen``: # of variables + * ``size``: total section size in bytes (0 at compilation time, patched + to actual size by BPF loaders such as libbpf) + +``btf_type`` is followed by ``info.vlen`` number of ``struct btf_var_secinfo``.:: + + struct btf_var_secinfo { + __u32 type; + __u32 offset; + __u32 size; + }; + +``struct btf_var_secinfo`` encoding: + * ``type``: the type of the BTF_KIND_VAR variable + * ``offset``: the in-section offset of the variable + * ``size``: the size of the variable in bytes + 3. BTF Kernel API ***************** @@ -401,6 +401,7 @@ NM = $(CROSS_COMPILE)nm STRIP = $(CROSS_COMPILE)strip OBJCOPY = $(CROSS_COMPILE)objcopy OBJDUMP = $(CROSS_COMPILE)objdump +PAHOLE = pahole LEX = flex YACC = bison AWK = awk @@ -455,7 +456,7 @@ KBUILD_LDFLAGS := GCC_PLUGINS_CFLAGS := export ARCH SRCARCH CONFIG_SHELL HOSTCC KBUILD_HOSTCFLAGS CROSS_COMPILE AS LD CC -export CPP AR NM STRIP OBJCOPY OBJDUMP KBUILD_HOSTLDFLAGS KBUILD_HOSTLDLIBS +export CPP AR NM STRIP OBJCOPY OBJDUMP PAHOLE KBUILD_HOSTLDFLAGS KBUILD_HOSTLDLIBS export MAKE LEX YACC AWK INSTALLKERNEL PERL PYTHON PYTHON2 PYTHON3 UTS_MACHINE export HOSTCXX KBUILD_HOSTCXXFLAGS LDFLAGS_MODULE CHECK CHECKFLAGS diff --git a/include/linux/bpf.h b/include/linux/bpf.h index f62897198844..e4d4c1771ab0 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -57,6 +57,12 @@ struct bpf_map_ops { const struct btf *btf, const struct btf_type *key_type, const struct btf_type *value_type); + + /* Direct value access helpers. */ + int (*map_direct_value_addr)(const struct bpf_map *map, + u64 *imm, u32 off); + int (*map_direct_value_meta)(const struct bpf_map *map, + u64 imm, u32 *off); }; struct bpf_map { @@ -81,7 +87,8 @@ struct bpf_map { struct btf *btf; u32 pages; bool unpriv_array; - /* 51 bytes hole */ + bool frozen; /* write-once */ + /* 48 bytes hole */ /* The 3rd and 4th cacheline with misc members to avoid false sharing * particularly with refcounting. @@ -421,8 +428,38 @@ struct bpf_array { }; }; +#define BPF_COMPLEXITY_LIMIT_INSNS 1000000 /* yes. 1M insns */ #define MAX_TAIL_CALL_CNT 32 +#define BPF_F_ACCESS_MASK (BPF_F_RDONLY | \ + BPF_F_RDONLY_PROG | \ + BPF_F_WRONLY | \ + BPF_F_WRONLY_PROG) + +#define BPF_MAP_CAN_READ BIT(0) +#define BPF_MAP_CAN_WRITE BIT(1) + +static inline u32 bpf_map_flags_to_cap(struct bpf_map *map) +{ + u32 access_flags = map->map_flags & (BPF_F_RDONLY_PROG | BPF_F_WRONLY_PROG); + + /* Combination of BPF_F_RDONLY_PROG | BPF_F_WRONLY_PROG is + * not possible. + */ + if (access_flags & BPF_F_RDONLY_PROG) + return BPF_MAP_CAN_READ; + else if (access_flags & BPF_F_WRONLY_PROG) + return BPF_MAP_CAN_WRITE; + else + return BPF_MAP_CAN_READ | BPF_MAP_CAN_WRITE; +} + +static inline bool bpf_map_flags_access_ok(u32 access_flags) +{ + return (access_flags & (BPF_F_RDONLY_PROG | BPF_F_WRONLY_PROG)) != + (BPF_F_RDONLY_PROG | BPF_F_WRONLY_PROG); +} + struct bpf_event_entry { struct perf_event *event; struct file *perf_file; @@ -446,14 +483,6 @@ typedef u32 (*bpf_convert_ctx_access_t)(enum bpf_access_type type, u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size, void *ctx, u64 ctx_size, bpf_ctx_copy_t ctx_copy); -int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr, - union bpf_attr __user *uattr); -int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, - union bpf_attr __user *uattr); -int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog, - const union bpf_attr *kattr, - union bpf_attr __user *uattr); - /* an array of programs to be executed under rcu_lock. * * Typical usage: @@ -644,6 +673,13 @@ static inline int bpf_map_attr_numa_node(const union bpf_attr *attr) struct bpf_prog *bpf_prog_get_type_path(const char *name, enum bpf_prog_type type); int array_map_alloc_check(union bpf_attr *attr); +int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr, + union bpf_attr __user *uattr); +int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, + union bpf_attr __user *uattr); +int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog, + const union bpf_attr *kattr, + union bpf_attr __user *uattr); #else /* !CONFIG_BPF_SYSCALL */ static inline struct bpf_prog *bpf_prog_get(u32 ufd) { @@ -755,6 +791,27 @@ static inline struct bpf_prog *bpf_prog_get_type_path(const char *name, { return ERR_PTR(-EOPNOTSUPP); } + +static inline int bpf_prog_test_run_xdp(struct bpf_prog *prog, + const union bpf_attr *kattr, + union bpf_attr __user *uattr) +{ + return -ENOTSUPP; +} + +static inline int bpf_prog_test_run_skb(struct bpf_prog *prog, + const union bpf_attr *kattr, + union bpf_attr __user *uattr) +{ + return -ENOTSUPP; +} + +static inline int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog, + const union bpf_attr *kattr, + union bpf_attr __user *uattr) +{ + return -ENOTSUPP; +} #endif /* CONFIG_BPF_SYSCALL */ static inline struct bpf_prog *bpf_prog_get_type(u32 ufd, diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 7d8228d1c898..b3ab61fe1932 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -207,6 +207,7 @@ struct bpf_verifier_state { struct bpf_verifier_state_list { struct bpf_verifier_state state; struct bpf_verifier_state_list *next; + int miss_cnt, hit_cnt; }; /* Possible states for alu_state member. */ @@ -223,6 +224,10 @@ struct bpf_insn_aux_data { unsigned long map_state; /* pointer/poison value for maps */ s32 call_imm; /* saved imm field of call insn */ u32 alu_limit; /* limit for add/sub register with pointer */ + struct { + u32 map_index; /* index into used_maps[] */ + u32 map_off; /* offset from value base address */ + }; }; int ctx_field_size; /* the ctx field size for load insn, maybe 0 */ int sanitize_stack_off; /* stack slot to be cleared */ @@ -248,6 +253,12 @@ static inline bool bpf_verifier_log_full(const struct bpf_verifier_log *log) return log->len_used >= log->len_total - 1; } +#define BPF_LOG_LEVEL1 1 +#define BPF_LOG_LEVEL2 2 +#define BPF_LOG_STATS 4 +#define BPF_LOG_LEVEL (BPF_LOG_LEVEL1 | BPF_LOG_LEVEL2) +#define BPF_LOG_MASK (BPF_LOG_LEVEL | BPF_LOG_STATS) + static inline bool bpf_verifier_log_needed(const struct bpf_verifier_log *log) { return log->level && log->ubuf && !bpf_verifier_log_full(log); @@ -274,6 +285,7 @@ struct bpf_verifier_env { bool strict_alignment; /* perform strict pointer alignment checks */ struct bpf_verifier_state *cur_state; /* current verifier state */ struct bpf_verifier_state_list **explored_states; /* search pruning optimization */ + struct bpf_verifier_state_list *free_list; struct bpf_map *used_maps[MAX_USED_MAPS]; /* array of map's used by eBPF program */ u32 used_map_cnt; /* number of used maps */ u32 id_gen; /* used to generate unique reg IDs */ @@ -284,6 +296,21 @@ struct bpf_verifier_env { struct bpf_verifier_log log; struct bpf_subprog_info subprog_info[BPF_MAX_SUBPROGS + 1]; u32 subprog_cnt; + /* number of instructions analyzed by the verifier */ + u32 insn_processed; + /* total verification time */ + u64 verification_time; + /* maximum number of verifier states kept in 'branching' instructions */ + u32 max_states_per_insn; + /* total number of allocated verifier states */ + u32 total_states; + /* some states are freed during program analysis. + * this is peak number of states. this number dominates kernel + * memory consumption during verification + */ + u32 peak_states; + /* longest register parentage chain walked for liveness marking */ + u32 longest_mark_read_walk; }; __printf(2, 0) void bpf_verifier_vlog(struct bpf_verifier_log *log, diff --git a/include/linux/btf.h b/include/linux/btf.h index 455d31b55828..64cdf2a23d42 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -51,6 +51,7 @@ bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, const struct btf_member *m, u32 expected_offset, u32 expected_size); int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t); +bool btf_type_is_void(const struct btf_type *t); #ifdef CONFIG_BPF_SYSCALL const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id); diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 837024512baf..2e96d0b4bf65 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -105,6 +105,7 @@ enum bpf_cmd { BPF_BTF_GET_FD_BY_ID, BPF_TASK_FD_QUERY, BPF_MAP_LOOKUP_AND_DELETE_ELEM, + BPF_MAP_FREEZE, }; enum bpf_map_type { @@ -255,8 +256,19 @@ enum bpf_attach_type { */ #define BPF_F_ANY_ALIGNMENT (1U << 1) -/* when bpf_ldimm64->src_reg == BPF_PSEUDO_MAP_FD, bpf_ldimm64->imm == fd */ +/* When BPF ldimm64's insn[0].src_reg != 0 then this can have + * two extensions: + * + * insn[0].src_reg: BPF_PSEUDO_MAP_FD BPF_PSEUDO_MAP_VALUE + * insn[0].imm: map fd map fd + * insn[1].imm: 0 offset into value + * insn[0].off: 0 0 + * insn[1].off: 0 0 + * ldimm64 rewrite: address of map address of map[0]+offset + * verifier type: CONST_PTR_TO_MAP PTR_TO_MAP_VALUE + */ #define BPF_PSEUDO_MAP_FD 1 +#define BPF_PSEUDO_MAP_VALUE 2 /* when bpf_call->src_reg == BPF_PSEUDO_CALL, bpf_call->imm == pc-relative * offset to another bpf function @@ -283,7 +295,7 @@ enum bpf_attach_type { #define BPF_OBJ_NAME_LEN 16U -/* Flags for accessing BPF object */ +/* Flags for accessing BPF object from syscall side. */ #define BPF_F_RDONLY (1U << 3) #define BPF_F_WRONLY (1U << 4) @@ -293,6 +305,10 @@ enum bpf_attach_type { /* Zero-initialize hash function seed. This should only be used for testing. */ #define BPF_F_ZERO_SEED (1U << 6) +/* Flags for accessing BPF object from program side. */ +#define BPF_F_RDONLY_PROG (1U << 7) +#define BPF_F_WRONLY_PROG (1U << 8) + /* flags for BPF_PROG_QUERY */ #define BPF_F_QUERY_EFFECTIVE (1U << 0) @@ -396,6 +412,13 @@ union bpf_attr { __aligned_u64 data_out; __u32 repeat; __u32 duration; + __u32 ctx_size_in; /* input: len of ctx_in */ + __u32 ctx_size_out; /* input/output: len of ctx_out + * returns ENOSPC if ctx_out + * is too small. + */ + __aligned_u64 ctx_in; + __aligned_u64 ctx_out; } test; struct { /* anonymous struct used by BPF_*_GET_*_ID */ @@ -1500,6 +1523,10 @@ union bpf_attr { * * **BPF_F_ADJ_ROOM_ENCAP_L4_UDP **: * Use with ENCAP_L3 flags to further specify the tunnel type. * + * * **BPF_F_ADJ_ROOM_ENCAP_L2(len) **: + * Use with ENCAP_L3/L4 flags to further specify the tunnel + * type; **len** is the length of the inner MAC header. + * * A call to this helper is susceptible to change the underlaying * packet buffer. Therefore, at load time, all checks on pointers * previously done by the verifier are invalidated and must be @@ -2641,10 +2668,16 @@ enum bpf_func_id { /* BPF_FUNC_skb_adjust_room flags. */ #define BPF_F_ADJ_ROOM_FIXED_GSO (1ULL << 0) +#define BPF_ADJ_ROOM_ENCAP_L2_MASK 0xff +#define BPF_ADJ_ROOM_ENCAP_L2_SHIFT 56 + #define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 (1ULL << 1) #define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 (1ULL << 2) #define BPF_F_ADJ_ROOM_ENCAP_L4_GRE (1ULL << 3) #define BPF_F_ADJ_ROOM_ENCAP_L4_UDP (1ULL << 4) +#define BPF_F_ADJ_ROOM_ENCAP_L2(len) (((__u64)len & \ + BPF_ADJ_ROOM_ENCAP_L2_MASK) \ + << BPF_ADJ_ROOM_ENCAP_L2_SHIFT) /* Mode for BPF_FUNC_skb_adjust_room helper. */ enum bpf_adj_room_mode { diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h index 7b7475ef2f17..9310652ca4f9 100644 --- a/include/uapi/linux/btf.h +++ b/include/uapi/linux/btf.h @@ -39,11 +39,11 @@ struct btf_type { * struct, union and fwd */ __u32 info; - /* "size" is used by INT, ENUM, STRUCT and UNION. + /* "size" is used by INT, ENUM, STRUCT, UNION and DATASEC. * "size" tells the size of the type it is describing. * * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT, - * FUNC and FUNC_PROTO. + * FUNC, FUNC_PROTO and VAR. * "type" is a type_id referring to another type. */ union { @@ -70,8 +70,10 @@ struct btf_type { #define BTF_KIND_RESTRICT 11 /* Restrict */ #define BTF_KIND_FUNC 12 /* Function */ #define BTF_KIND_FUNC_PROTO 13 /* Function Proto */ -#define BTF_KIND_MAX 13 -#define NR_BTF_KINDS 14 +#define BTF_KIND_VAR 14 /* Variable */ +#define BTF_KIND_DATASEC 15 /* Section */ +#define BTF_KIND_MAX BTF_KIND_DATASEC +#define NR_BTF_KINDS (BTF_KIND_MAX + 1) /* For some specific BTF_KIND, "struct btf_type" is immediately * followed by extra data. @@ -138,4 +140,26 @@ struct btf_param { __u32 type; }; +enum { + BTF_VAR_STATIC = 0, + BTF_VAR_GLOBAL_ALLOCATED, +}; + +/* BTF_KIND_VAR is followed by a single "struct btf_var" to describe + * additional information related to the variable such as its linkage. + */ +struct btf_var { + __u32 linkage; +}; + +/* BTF_KIND_DATASEC is followed by multiple "struct btf_var_secinfo" + * to describe all BTF_KIND_VAR types it contains along with it's + * in-section offset as well as size. + */ +struct btf_var_secinfo { + __u32 type; + __u32 offset; + __u32 size; +}; + #endif /* _UAPI__LINUX_BTF_H__ */ diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c index c72e0d8e1e65..584636c9e2eb 100644 --- a/kernel/bpf/arraymap.c +++ b/kernel/bpf/arraymap.c @@ -22,7 +22,7 @@ #include "map_in_map.h" #define ARRAY_CREATE_FLAG_MASK \ - (BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY) + (BPF_F_NUMA_NODE | BPF_F_ACCESS_MASK) static void bpf_array_free_percpu(struct bpf_array *array) { @@ -63,6 +63,7 @@ int array_map_alloc_check(union bpf_attr *attr) if (attr->max_entries == 0 || attr->key_size != 4 || attr->value_size == 0 || attr->map_flags & ~ARRAY_CREATE_FLAG_MASK || + !bpf_map_flags_access_ok(attr->map_flags) || (percpu && numa_node != NUMA_NO_NODE)) return -EINVAL; @@ -160,6 +161,36 @@ static void *array_map_lookup_elem(struct bpf_map *map, void *key) return array->value + array->elem_size * (index & array->index_mask); } +static int array_map_direct_value_addr(const struct bpf_map *map, u64 *imm, + u32 off) +{ + struct bpf_array *array = container_of(map, struct bpf_array, map); + + if (map->max_entries != 1) + return -ENOTSUPP; + if (off >= map->value_size) + return -EINVAL; + + *imm = (unsigned long)array->value; + return 0; +} + +static int array_map_direct_value_meta(const struct bpf_map *map, u64 imm, + u32 *off) +{ + struct bpf_array *array = container_of(map, struct bpf_array, map); + u64 base = (unsigned long)array->value; + u64 range = array->elem_size; + + if (map->max_entries != 1) + return -ENOTSUPP; + if (imm < base || imm >= base + range) + return -ENOENT; + + *off = imm - base; + return 0; +} + /* emit BPF instructions equivalent to C code of array_map_lookup_elem() */ static u32 array_map_gen_lookup(struct bpf_map *map, struct bpf_insn *insn_buf) { @@ -360,7 +391,8 @@ static void array_map_seq_show_elem(struct bpf_map *map, void *key, return; } - seq_printf(m, "%u: ", *(u32 *)key); + if (map->btf_key_type_id) + seq_printf(m, "%u: ", *(u32 *)key); btf_type_seq_show(map->btf, map->btf_value_type_id, value, m); seq_puts(m, "\n"); @@ -397,6 +429,18 @@ static int array_map_check_btf(const struct bpf_map *map, { u32 int_data; + /* One exception for keyless BTF: .bss/.data/.rodata map */ + if (btf_type_is_void(key_type)) { + if (map->map_type != BPF_MAP_TYPE_ARRAY || + map->max_entries != 1) + return -EINVAL; + + if (BTF_INFO_KIND(value_type->info) != BTF_KIND_DATASEC) + return -EINVAL; + + return 0; + } + if (BTF_INFO_KIND(key_type->info) != BTF_KIND_INT) return -EINVAL; @@ -419,6 +463,8 @@ const struct bpf_map_ops array_map_ops = { .map_update_elem = array_map_update_elem, .map_delete_elem = array_map_delete_elem, .map_gen_lookup = array_map_gen_lookup, + .map_direct_value_addr = array_map_direct_value_addr, + .map_direct_value_meta = array_map_direct_value_meta, .map_seq_show_elem = array_map_seq_show_elem, .map_check_btf = array_map_check_btf, }; @@ -440,6 +486,9 @@ static int fd_array_map_alloc_check(union bpf_attr *attr) /* only file descriptors can be stored in this type of map */ if (attr->value_size != sizeof(u32)) return -EINVAL; + /* Program read-only/write-only not supported for special maps yet. */ + if (attr->map_flags & (BPF_F_RDONLY_PROG | BPF_F_WRONLY_PROG)) + return -EINVAL; return array_map_alloc_check(attr); } diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index bd3921b1514b..cad09858a5f2 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -185,6 +185,16 @@ i < btf_type_vlen(struct_type); \ i++, member++) +#define for_each_vsi(i, struct_type, member) \ + for (i = 0, member = btf_type_var_secinfo(struct_type); \ + i < btf_type_vlen(struct_type); \ + i++, member++) + +#define for_each_vsi_from(i, from, struct_type, member) \ + for (i = from, member = btf_type_var_secinfo(struct_type) + from; \ + i < btf_type_vlen(struct_type); \ + i++, member++) + static DEFINE_IDR(btf_idr); static DEFINE_SPINLOCK(btf_idr_lock); @@ -262,6 +272,8 @@ static const char * const btf_kind_str[NR_BTF_KINDS] = { [BTF_KIND_RESTRICT] = "RESTRICT", [BTF_KIND_FUNC] = "FUNC", [BTF_KIND_FUNC_PROTO] = "FUNC_PROTO", + [BTF_KIND_VAR] = "VAR", + [BTF_KIND_DATASEC] = "DATASEC", }; struct btf_kind_operations { @@ -314,7 +326,7 @@ static bool btf_type_is_modifier(const struct btf_type *t) return false; } -static bool btf_type_is_void(const struct btf_type *t) +bool btf_type_is_void(const struct btf_type *t) { return t == &btf_void; } @@ -375,13 +387,36 @@ static bool btf_type_is_int(const struct btf_type *t) return BTF_INFO_KIND(t->info) == BTF_KIND_INT; } +static bool btf_type_is_var(const struct btf_type *t) +{ + return BTF_INFO_KIND(t->info) == BTF_KIND_VAR; +} + +static bool btf_type_is_datasec(const struct btf_type *t) +{ + return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC; +} + +/* Types that act only as a source, not sink or intermediate + * type when resolving. + */ +static bool btf_type_is_resolve_source_only(const struct btf_type *t) +{ + return btf_type_is_var(t) || + btf_type_is_datasec(t); +} + /* What types need to be resolved? * * btf_type_is_modifier() is an obvious one. * * btf_type_is_struct() because its member refers to * another type (through member->type). - + * + * btf_type_is_var() because the variable refers to + * another type. btf_type_is_datasec() holds multiple + * btf_type_is_var() types that need resolving. + * * btf_type_is_array() because its element (array->type) * refers to another type. Array can be thought of a * special case of struct while array just has the same @@ -390,9 +425,11 @@ static bool btf_type_is_int(const struct btf_type *t) static bool btf_type_needs_resolve(const struct btf_type *t) { return btf_type_is_modifier(t) || - btf_type_is_ptr(t) || - btf_type_is_struct(t) || - btf_type_is_array(t); + btf_type_is_ptr(t) || + btf_type_is_struct(t) || + btf_type_is_array(t) || + btf_type_is_var(t) || + btf_type_is_datasec(t); } /* t->size can be used */ @@ -403,6 +440,7 @@ static bool btf_type_has_size(const struct btf_type *t) case BTF_KIND_STRUCT: case BTF_KIND_UNION: case BTF_KIND_ENUM: + case BTF_KIND_DATASEC: return true; } @@ -467,6 +505,16 @@ static const struct btf_enum *btf_type_enum(const struct btf_type *t) return (const struct btf_enum *)(t + 1); } +static const struct btf_var *btf_type_var(const struct btf_type *t) +{ + return (const struct btf_var *)(t + 1); +} + +static const struct btf_var_secinfo *btf_type_var_secinfo(const struct btf_type *t) +{ + return (const struct btf_var_secinfo *)(t + 1); +} + static const struct btf_kind_operations *btf_type_ops(const struct btf_type *t) { return kind_ops[BTF_INFO_KIND(t->info)]; @@ -478,23 +526,31 @@ static bool btf_name_offset_valid(const struct btf *btf, u32 offset) offset < btf->hdr.str_len; } -/* Only C-style identifier is permitted. This can be relaxed if - * necessary. - */ -static bool btf_name_valid_identifier(const struct btf *btf, u32 offset) +static bool __btf_name_char_ok(char c, bool first, bool dot_ok) +{ + if ((first ? !isalpha(c) : + !isalnum(c)) && + c != '_' && + ((c == '.' && !dot_ok) || + c != '.')) + return false; + return true; +} + +static bool __btf_name_valid(const struct btf *btf, u32 offset, bool dot_ok) { /* offset must be valid */ const char *src = &btf->strings[offset]; const char *src_limit; - if (!isalpha(*src) && *src != '_') + if (!__btf_name_char_ok(*src, true, dot_ok)) return false; /* set a limit on identifier length */ src_limit = src + KSYM_NAME_LEN; src++; while (*src && src < src_limit) { - if (!isalnum(*src) && *src != '_') + if (!__btf_name_char_ok(*src, false, dot_ok)) return false; src++; } @@ -502,6 +558,19 @@ static bool btf_name_valid_identifier(const struct btf *btf, u32 offset) return !*src; } +/* Only C-style identifier is permitted. This can be relaxed if + * necessary. + */ +static bool btf_name_valid_identifier(const struct btf *btf, u32 offset) +{ + return __btf_name_valid(btf, offset, false); +} + +static bool btf_name_valid_section(const struct btf *btf, u32 offset) +{ + return __btf_name_valid(btf, offset, true); +} + static const char *__btf_name_by_offset(const struct btf *btf, u32 offset) { if (!offset) @@ -697,6 +766,32 @@ static void btf_verifier_log_member(struct btf_verifier_env *env, __btf_verifier_log(log, "\n"); } +__printf(4, 5) +static void btf_verifier_log_vsi(struct btf_verifier_env *env, + const struct btf_type *datasec_type, + const struct btf_var_secinfo *vsi, + const char *fmt, ...) +{ + struct bpf_verifier_log *log = &env->log; + va_list args; + + if (!bpf_verifier_log_needed(log)) + return; + if (env->phase != CHECK_META) + btf_verifier_log_type(env, datasec_type, NULL); + + __btf_verifier_log(log, "\t type_id=%u offset=%u size=%u", + vsi->type, vsi->offset, vsi->size); + if (fmt && *fmt) { + __btf_verifier_log(log, " "); + va_start(args, fmt); + bpf_verifier_vlog(log, fmt, args); + va_end(args); + } + + __btf_verifier_log(log, "\n"); +} + static void btf_verifier_log_hdr(struct btf_verifier_env *env, u32 btf_data_size) { @@ -974,7 +1069,8 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, } else if (btf_type_is_ptr(size_type)) { size = sizeof(void *); } else { - if (WARN_ON_ONCE(!btf_type_is_modifier(size_type))) + if (WARN_ON_ONCE(!btf_type_is_modifier(size_type) && + !btf_type_is_var(size_type))) return NULL; size = btf->resolved_sizes[size_type_id]; @@ -1509,7 +1605,7 @@ static int btf_modifier_resolve(struct btf_verifier_env *env, u32 next_type_size = 0; next_type = btf_type_by_id(btf, next_type_id); - if (!next_type) { + if (!next_type || btf_type_is_resolve_source_only(next_type)) { btf_verifier_log_type(env, v->t, "Invalid type_id"); return -EINVAL; } @@ -1542,6 +1638,53 @@ static int btf_modifier_resolve(struct btf_verifier_env *env, return 0; } +static int btf_var_resolve(struct btf_verifier_env *env, + const struct resolve_vertex *v) +{ + const struct btf_type *next_type; + const struct btf_type *t = v->t; + u32 next_type_id = t->type; + struct btf *btf = env->btf; + u32 next_type_size; + + next_type = btf_type_by_id(btf, next_type_id); + if (!next_type || btf_type_is_resolve_source_only(next_type)) { + btf_verifier_log_type(env, v->t, "Invalid type_id"); + return -EINVAL; + } + + if (!env_type_is_resolve_sink(env, next_type) && + !env_type_is_resolved(env, next_type_id)) + return env_stack_push(env, next_type, next_type_id); + + if (btf_type_is_modifier(next_type)) { + const struct btf_type *resolved_type; + u32 resolved_type_id; + + resolved_type_id = next_type_id; + resolved_type = btf_type_id_resolve(btf, &resolved_type_id); + + if (btf_type_is_ptr(resolved_type) && + !env_type_is_resolve_sink(env, resolved_type) && + !env_type_is_resolved(env, resolved_type_id)) + return env_stack_push(env, resolved_type, + resolved_type_id); + } + + /* We must resolve to something concrete at this point, no + * forward types or similar that would resolve to size of + * zero is allowed. + */ + if (!btf_type_id_size(btf, &next_type_id, &next_type_size)) { + btf_verifier_log_type(env, v->t, "Invalid type_id"); + return -EINVAL; + } + + env_stack_pop_resolved(env, next_type_id, next_type_size); + + return 0; +} + static int btf_ptr_resolve(struct btf_verifier_env *env, const struct resolve_vertex *v) { @@ -1551,7 +1694,7 @@ static int btf_ptr_resolve(struct btf_verifier_env *env, struct btf *btf = env->btf; next_type = btf_type_by_id(btf, next_type_id); - if (!next_type) { + if (!next_type || btf_type_is_resolve_source_only(next_type)) { btf_verifier_log_type(env, v->t, "Invalid type_id"); return -EINVAL; } @@ -1609,6 +1752,15 @@ static void btf_modifier_seq_show(const struct btf *btf, btf_type_ops(t)->seq_show(btf, t, type_id, data, bits_offset, m); } +static void btf_var_seq_show(const struct btf *btf, const struct btf_type *t, + u32 type_id, void *data, u8 bits_offset, + struct seq_file *m) +{ + t = btf_type_id_resolve(btf, &type_id); + + btf_type_ops(t)->seq_show(btf, t, type_id, data, bits_offset, m); +} + static void btf_ptr_seq_show(const struct btf *btf, const struct btf_type *t, u32 type_id, void *data, u8 bits_offset, struct seq_file *m) @@ -1776,7 +1928,8 @@ static int btf_array_resolve(struct btf_verifier_env *env, /* Check array->index_type */ index_type_id = array->index_type; index_type = btf_type_by_id(btf, index_type_id); - if (btf_type_nosize_or_null(index_type)) { + if (btf_type_is_resolve_source_only(index_type) || + btf_type_nosize_or_null(index_type)) { btf_verifier_log_type(env, v->t, "Invalid index"); return -EINVAL; } @@ -1795,7 +1948,8 @@ static int btf_array_resolve(struct btf_verifier_env *env, /* Check array->type */ elem_type_id = array->type; elem_type = btf_type_by_id(btf, elem_type_id); - if (btf_type_nosize_or_null(elem_type)) { + if (btf_type_is_resolve_source_only(elem_type) || + btf_type_nosize_or_null(elem_type)) { btf_verifier_log_type(env, v->t, "Invalid elem"); return -EINVAL; @@ -2016,7 +2170,8 @@ static int btf_struct_resolve(struct btf_verifier_env *env, const struct btf_type *member_type = btf_type_by_id(env->btf, member_type_id); - if (btf_type_nosize_or_null(member_type)) { + if (btf_type_is_resolve_source_only(member_type) || + btf_type_nosize_or_null(member_type)) { btf_verifier_log_member(env, v->t, member, "Invalid member"); return -EINVAL; @@ -2411,6 +2566,222 @@ static struct btf_kind_operations func_ops = { .seq_show = btf_df_seq_show, }; +static s32 btf_var_check_meta(struct btf_verifier_env *env, + const struct btf_type *t, + u32 meta_left) +{ + const struct btf_var *var; + u32 meta_needed = sizeof(*var); + + if (meta_left < meta_needed) { + btf_verifier_log_basic(env, t, + "meta_left:%u meta_needed:%u", + meta_left, meta_needed); + return -EINVAL; + } + + if (btf_type_vlen(t)) { + btf_verifier_log_type(env, t, "vlen != 0"); + return -EINVAL; + } + + if (btf_type_kflag(t)) { + btf_verifier_log_type(env, t, "Invalid btf_info kind_flag"); + return -EINVAL; + } + + if (!t->name_off || + !__btf_name_valid(env->btf, t->name_off, true)) { + btf_verifier_log_type(env, t, "Invalid name"); + return -EINVAL; + } + + /* A var cannot be in type void */ + if (!t->type || !BTF_TYPE_ID_VALID(t->type)) { + btf_verifier_log_type(env, t, "Invalid type_id"); + return -EINVAL; + } + + var = btf_type_var(t); + if (var->linkage != BTF_VAR_STATIC && + var->linkage != BTF_VAR_GLOBAL_ALLOCATED) { + btf_verifier_log_type(env, t, "Linkage not supported"); + return -EINVAL; + } + + btf_verifier_log_type(env, t, NULL); + + return meta_needed; +} + +static void btf_var_log(struct btf_verifier_env *env, const struct btf_type *t) +{ + const struct btf_var *var = btf_type_var(t); + + btf_verifier_log(env, "type_id=%u linkage=%u", t->type, var->linkage); +} + +static const struct btf_kind_operations var_ops = { + .check_meta = btf_var_check_meta, + .resolve = btf_var_resolve, + .check_member = btf_df_check_member, + .check_kflag_member = btf_df_check_kflag_member, + .log_details = btf_var_log, + .seq_show = btf_var_seq_show, +}; + +static s32 btf_datasec_check_meta(struct btf_verifier_env *env, + const struct btf_type *t, + u32 meta_left) +{ + const struct btf_var_secinfo *vsi; + u64 last_vsi_end_off = 0, sum = 0; + u32 i, meta_needed; + + meta_needed = btf_type_vlen(t) * sizeof(*vsi); + if (meta_left < meta_needed) { + btf_verifier_log_basic(env, t, + "meta_left:%u meta_needed:%u", + meta_left, meta_needed); + return -EINVAL; + } + + if (!btf_type_vlen(t)) { + btf_verifier_log_type(env, t, "vlen == 0"); + return -EINVAL; + } + + if (!t->size) { + btf_verifier_log_type(env, t, "size == 0"); + return -EINVAL; + } + + if (btf_type_kflag(t)) { + btf_verifier_log_type(env, t, "Invalid btf_info kind_flag"); + return -EINVAL; + } + + if (!t->name_off || + !btf_name_valid_section(env->btf, t->name_off)) { + btf_verifier_log_type(env, t, "Invalid name"); + return -EINVAL; + } + + btf_verifier_log_type(env, t, NULL); + + for_each_vsi(i, t, vsi) { + /* A var cannot be in type void */ + if (!vsi->type || !BTF_TYPE_ID_VALID(vsi->type)) { + btf_verifier_log_vsi(env, t, vsi, + "Invalid type_id"); + return -EINVAL; + } + + if (vsi->offset < last_vsi_end_off || vsi->offset >= t->size) { + btf_verifier_log_vsi(env, t, vsi, + "Invalid offset"); + return -EINVAL; + } + + if (!vsi->size || vsi->size > t->size) { + btf_verifier_log_vsi(env, t, vsi, + "Invalid size"); + return -EINVAL; + } + + last_vsi_end_off = vsi->offset + vsi->size; + if (last_vsi_end_off > t->size) { + btf_verifier_log_vsi(env, t, vsi, + "Invalid offset+size"); + return -EINVAL; + } + + btf_verifier_log_vsi(env, t, vsi, NULL); + sum += vsi->size; + } + + if (t->size < sum) { + btf_verifier_log_type(env, t, "Invalid btf_info size"); + return -EINVAL; + } + + return meta_needed; +} + +static int btf_datasec_resolve(struct btf_verifier_env *env, + const struct resolve_vertex *v) +{ + const struct btf_var_secinfo *vsi; + struct btf *btf = env->btf; + u16 i; + + for_each_vsi_from(i, v->next_member, v->t, vsi) { + u32 var_type_id = vsi->type, type_id, type_size = 0; + const struct btf_type *var_type = btf_type_by_id(env->btf, + var_type_id); + if (!var_type || !btf_type_is_var(var_type)) { + btf_verifier_log_vsi(env, v->t, vsi, + "Not a VAR kind member"); + return -EINVAL; + } + + if (!env_type_is_resolve_sink(env, var_type) && + !env_type_is_resolved(env, var_type_id)) { + env_stack_set_next_member(env, i + 1); + return env_stack_push(env, var_type, var_type_id); + } + + type_id = var_type->type; + if (!btf_type_id_size(btf, &type_id, &type_size)) { + btf_verifier_log_vsi(env, v->t, vsi, "Invalid type"); + return -EINVAL; + } + + if (vsi->size < type_size) { + btf_verifier_log_vsi(env, v->t, vsi, "Invalid size"); + return -EINVAL; + } + } + + env_stack_pop_resolved(env, 0, 0); + return 0; +} + +static void btf_datasec_log(struct btf_verifier_env *env, + const struct btf_type *t) +{ + btf_verifier_log(env, "size=%u vlen=%u", t->size, btf_type_vlen(t)); +} + +static void btf_datasec_seq_show(const struct btf *btf, + const struct btf_type *t, u32 type_id, + void *data, u8 bits_offset, + struct seq_file *m) +{ + const struct btf_var_secinfo *vsi; + const struct btf_type *var; + u32 i; + + seq_printf(m, "section (\"%s\") = {", __btf_name_by_offset(btf, t->name_off)); + for_each_vsi(i, t, vsi) { + var = btf_type_by_id(btf, vsi->type); + if (i) + seq_puts(m, ","); + btf_type_ops(var)->seq_show(btf, var, vsi->type, + data + vsi->offset, bits_offset, m); + } + seq_puts(m, "}"); +} + +static const struct btf_kind_operations datasec_ops = { + .check_meta = btf_datasec_check_meta, + .resolve = btf_datasec_resolve, + .check_member = btf_df_check_member, + .check_kflag_member = btf_df_check_kflag_member, + .log_details = btf_datasec_log, + .seq_show = btf_datasec_seq_show, +}; + static int btf_func_proto_check(struct btf_verifier_env *env, const struct btf_type *t) { @@ -2542,6 +2913,8 @@ static const struct btf_kind_operations * const kind_ops[NR_BTF_KINDS] = { [BTF_KIND_RESTRICT] = &modifier_ops, [BTF_KIND_FUNC] = &func_ops, [BTF_KIND_FUNC_PROTO] = &func_proto_ops, + [BTF_KIND_VAR] = &var_ops, + [BTF_KIND_DATASEC] = &datasec_ops, }; static s32 btf_check_meta(struct btf_verifier_env *env, @@ -2622,13 +2995,17 @@ static bool btf_resolve_valid(struct btf_verifier_env *env, if (!env_type_is_resolved(env, type_id)) return false; - if (btf_type_is_struct(t)) + if (btf_type_is_struct(t) || btf_type_is_datasec(t)) return !btf->resolved_ids[type_id] && - !btf->resolved_sizes[type_id]; + !btf->resolved_sizes[type_id]; - if (btf_type_is_modifier(t) || btf_type_is_ptr(t)) { + if (btf_type_is_modifier(t) || btf_type_is_ptr(t) || + btf_type_is_var(t)) { t = btf_type_id_resolve(btf, &type_id); - return t && !btf_type_is_modifier(t); + return t && + !btf_type_is_modifier(t) && + !btf_type_is_var(t) && + !btf_type_is_datasec(t); } if (btf_type_is_array(t)) { diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index ff09d32a8a1b..ace8c22c8b0e 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -292,7 +292,8 @@ int bpf_prog_calc_tag(struct bpf_prog *fp) dst[i] = fp->insnsi[i]; if (!was_ld_map && dst[i].code == (BPF_LD | BPF_IMM | BPF_DW) && - dst[i].src_reg == BPF_PSEUDO_MAP_FD) { + (dst[i].src_reg == BPF_PSEUDO_MAP_FD || + dst[i].src_reg == BPF_PSEUDO_MAP_VALUE)) { was_ld_map = true; dst[i].imm = 0; } else if (was_ld_map && @@ -438,6 +439,7 @@ struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off, u32 insn_adj_cnt, insn_rest, insn_delta = len - 1; const u32 cnt_max = S16_MAX; struct bpf_prog *prog_adj; + int err; /* Since our patchlet doesn't expand the image, we're done. */ if (insn_delta == 0) { @@ -453,8 +455,8 @@ struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off, * we afterwards may not fail anymore. */ if (insn_adj_cnt > cnt_max && - bpf_adj_branches(prog, off, off + 1, off + len, true)) - return NULL; + (err = bpf_adj_branches(prog, off, off + 1, off + len, true))) + return ERR_PTR(err); /* Several new instructions need to be inserted. Make room * for them. Likely, there's no need for a new allocation as @@ -463,7 +465,7 @@ struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off, prog_adj = bpf_prog_realloc(prog, bpf_prog_size(insn_adj_cnt), GFP_USER); if (!prog_adj) - return NULL; + return ERR_PTR(-ENOMEM); prog_adj->len = insn_adj_cnt; @@ -1096,13 +1098,13 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog) continue; tmp = bpf_patch_insn_single(clone, i, insn_buff, rewritten); - if (!tmp) { + if (IS_ERR(tmp)) { /* Patching may have repointed aux->prog during * realloc from the original one, so we need to * fix it up here on error. */ bpf_jit_prog_release_other(prog, clone); - return ERR_PTR(-ENOMEM); + return tmp; } clone = tmp; diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c index de73f55e42fd..d9ce383c0f9c 100644 --- a/kernel/bpf/disasm.c +++ b/kernel/bpf/disasm.c @@ -205,10 +205,11 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs, * part of the ldimm64 insn is accessible. */ u64 imm = ((u64)(insn + 1)->imm << 32) | (u32)insn->imm; - bool map_ptr = insn->src_reg == BPF_PSEUDO_MAP_FD; + bool is_ptr = insn->src_reg == BPF_PSEUDO_MAP_FD || + insn->src_reg == BPF_PSEUDO_MAP_VALUE; char tmp[64]; - if (map_ptr && !allow_ptr_leaks) + if (is_ptr && !allow_ptr_leaks) imm = 0; verbose(cbs->private_data, "(%02x) r%d = %s\n", diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index fed15cf94dca..192d32e77db3 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -23,7 +23,7 @@ #define HTAB_CREATE_FLAG_MASK \ (BPF_F_NO_PREALLOC | BPF_F_NO_COMMON_LRU | BPF_F_NUMA_NODE | \ - BPF_F_RDONLY | BPF_F_WRONLY | BPF_F_ZERO_SEED) + BPF_F_ACCESS_MASK | BPF_F_ZERO_SEED) struct bucket { struct hlist_nulls_head head; @@ -262,8 +262,8 @@ static int htab_map_alloc_check(union bpf_attr *attr) /* Guard against local DoS, and discourage production use. */ return -EPERM; - if (attr->map_flags & ~HTAB_CREATE_FLAG_MASK) - /* reserved bits should not be used */ + if (attr->map_flags & ~HTAB_CREATE_FLAG_MASK || + !bpf_map_flags_access_ok(attr->map_flags)) return -EINVAL; if (!lru && percpu_lru) diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c index 6b572e2de7fb..980e8f1f6cb5 100644 --- a/kernel/bpf/local_storage.c +++ b/kernel/bpf/local_storage.c @@ -14,7 +14,7 @@ DEFINE_PER_CPU(struct bpf_cgroup_storage*, bpf_cgroup_storage[MAX_BPF_CGROUP_STO #ifdef CONFIG_CGROUP_BPF #define LOCAL_STORAGE_CREATE_FLAG_MASK \ - (BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY) + (BPF_F_NUMA_NODE | BPF_F_ACCESS_MASK) struct bpf_cgroup_storage_map { struct bpf_map map; @@ -282,8 +282,8 @@ static struct bpf_map *cgroup_storage_map_alloc(union bpf_attr *attr) if (attr->value_size > PAGE_SIZE) return ERR_PTR(-E2BIG); - if (attr->map_flags & ~LOCAL_STORAGE_CREATE_FLAG_MASK) - /* reserved bits should not be used */ + if (attr->map_flags & ~LOCAL_STORAGE_CREATE_FLAG_MASK || + !bpf_map_flags_access_ok(attr->map_flags)) return ERR_PTR(-EINVAL); if (attr->max_entries) diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c index 93a5cbbde421..e61630c2e50b 100644 --- a/kernel/bpf/lpm_trie.c +++ b/kernel/bpf/lpm_trie.c @@ -538,7 +538,7 @@ out: #define LPM_KEY_SIZE_MIN LPM_KEY_SIZE(LPM_DATA_SIZE_MIN) #define LPM_CREATE_FLAG_MASK (BPF_F_NO_PREALLOC | BPF_F_NUMA_NODE | \ - BPF_F_RDONLY | BPF_F_WRONLY) + BPF_F_ACCESS_MASK) static struct bpf_map *trie_alloc(union bpf_attr *attr) { @@ -553,6 +553,7 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr) if (attr->max_entries == 0 || !(attr->map_flags & BPF_F_NO_PREALLOC) || attr->map_flags & ~LPM_CREATE_FLAG_MASK || + !bpf_map_flags_access_ok(attr->map_flags) || attr->key_size < LPM_KEY_SIZE_MIN || attr->key_size > LPM_KEY_SIZE_MAX || attr->value_size < LPM_VAL_SIZE_MIN || diff --git a/kernel/bpf/queue_stack_maps.c b/kernel/bpf/queue_stack_maps.c index b384ea9f3254..0b140d236889 100644 --- a/kernel/bpf/queue_stack_maps.c +++ b/kernel/bpf/queue_stack_maps.c @@ -11,8 +11,7 @@ #include "percpu_freelist.h" #define QUEUE_STACK_CREATE_FLAG_MASK \ - (BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY) - + (BPF_F_NUMA_NODE | BPF_F_ACCESS_MASK) struct bpf_queue_stack { struct bpf_map map; @@ -52,7 +51,8 @@ static int queue_stack_map_alloc_check(union bpf_attr *attr) /* check sanity of attributes */ if (attr->max_entries == 0 || attr->key_size != 0 || attr->value_size == 0 || - attr->map_flags & ~QUEUE_STACK_CREATE_FLAG_MASK) + attr->map_flags & ~QUEUE_STACK_CREATE_FLAG_MASK || + !bpf_map_flags_access_ok(attr->map_flags)) return -EINVAL; if (attr->value_size > KMALLOC_MAX_SIZE) diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index afca36f53c49..d995eedfdd16 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -166,13 +166,25 @@ void bpf_map_area_free(void *area) kvfree(area); } +static u32 bpf_map_flags_retain_permanent(u32 flags) +{ + /* Some map creation flags are not tied to the map object but + * rather to the map fd instead, so they have no meaning upon + * map object inspection since multiple file descriptors with + * different (access) properties can exist here. Thus, given + * this has zero meaning for the map itself, lets clear these + * from here. + */ + return flags & ~(BPF_F_RDONLY | BPF_F_WRONLY); +} + void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr) { map->map_type = attr->map_type; map->key_size = attr->key_size; map->value_size = attr->value_size; map->max_entries = attr->max_entries; - map->map_flags = attr->map_flags; + map->map_flags = bpf_map_flags_retain_permanent(attr->map_flags); map->numa_node = bpf_map_attr_numa_node(attr); } @@ -343,6 +355,18 @@ static int bpf_map_release(struct inode *inode, struct file *filp) return 0; } +static fmode_t map_get_sys_perms(struct bpf_map *map, struct fd f) +{ + fmode_t mode = f.file->f_mode; + + /* Our file permissions may have been overridden by global + * map permissions facing syscall side. + */ + if (READ_ONCE(map->frozen)) + mode &= ~FMODE_CAN_WRITE; + return mode; +} + #ifdef CONFIG_PROC_FS static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp) { @@ -364,14 +388,16 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp) "max_entries:\t%u\n" "map_flags:\t%#x\n" "memlock:\t%llu\n" - "map_id:\t%u\n", + "map_id:\t%u\n" + "frozen:\t%u\n", map->map_type, map->key_size, map->value_size, map->max_entries, map->map_flags, map->pages * 1ULL << PAGE_SHIFT, - map->id); + map->id, + READ_ONCE(map->frozen)); if (owner_prog_type) { seq_printf(m, "owner_prog_type:\t%u\n", @@ -448,10 +474,10 @@ static int bpf_obj_name_cpy(char *dst, const char *src) const char *end = src + BPF_OBJ_NAME_LEN; memset(dst, 0, BPF_OBJ_NAME_LEN); - - /* Copy all isalnum() and '_' char */ + /* Copy all isalnum(), '_' and '.' chars. */ while (src < end && *src) { - if (!isalnum(*src) && *src != '_') + if (!isalnum(*src) && + *src != '_' && *src != '.') return -EINVAL; *dst++ = *src++; } @@ -478,9 +504,16 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf, u32 key_size, value_size; int ret = 0; - key_type = btf_type_id_size(btf, &btf_key_id, &key_size); - if (!key_type || key_size != map->key_size) - return -EINVAL; + /* Some maps allow key to be unspecified. */ + if (btf_key_id) { + key_type = btf_type_id_size(btf, &btf_key_id, &key_size); + if (!key_type || key_size != map->key_size) + return -EINVAL; + } else { + key_type = btf_type_by_id(btf, 0); + if (!map->ops->map_check_btf) + return -EINVAL; + } value_type = btf_type_id_size(btf, &btf_value_id, &value_size); if (!value_type || value_size != map->value_size) @@ -489,6 +522,8 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf, map->spin_lock_off = btf_find_spin_lock(btf, value_type); if (map_value_has_spin_lock(map)) { + if (map->map_flags & BPF_F_RDONLY_PROG) + return -EACCES; if (map->map_type != BPF_MAP_TYPE_HASH && map->map_type != BPF_MAP_TYPE_ARRAY && map->map_type != BPF_MAP_TYPE_CGROUP_STORAGE) @@ -545,7 +580,7 @@ static int map_create(union bpf_attr *attr) if (attr->btf_key_type_id || attr->btf_value_type_id) { struct btf *btf; - if (!attr->btf_key_type_id || !attr->btf_value_type_id) { + if (!attr->btf_value_type_id) { err = -EINVAL; goto free_map_nouncharge; } @@ -713,8 +748,7 @@ static int map_lookup_elem(union bpf_attr *attr) map = __bpf_map_get(f); if (IS_ERR(map)) return PTR_ERR(map); - - if (!(f.file->f_mode & FMODE_CAN_READ)) { + if (!(map_get_sys_perms(map, f) & FMODE_CAN_READ)) { err = -EPERM; goto err_put; } @@ -843,8 +877,7 @@ static int map_update_elem(union bpf_attr *attr) map = __bpf_map_get(f); if (IS_ERR(map)) return PTR_ERR(map); - - if (!(f.file->f_mode & FMODE_CAN_WRITE)) { + if (!(map_get_sys_perms(map, f) & FMODE_CAN_WRITE)) { err = -EPERM; goto err_put; } @@ -955,8 +988,7 @@ static int map_delete_elem(union bpf_attr *attr) map = __bpf_map_get(f); if (IS_ERR(map)) return PTR_ERR(map); - - if (!(f.file->f_mode & FMODE_CAN_WRITE)) { + if (!(map_get_sys_perms(map, f) & FMODE_CAN_WRITE)) { err = -EPERM; goto err_put; } @@ -1007,8 +1039,7 @@ static int map_get_next_key(union bpf_attr *attr) map = __bpf_map_get(f); if (IS_ERR(map)) return PTR_ERR(map); - - if (!(f.file->f_mode & FMODE_CAN_READ)) { + if (!(map_get_sys_perms(map, f) & FMODE_CAN_READ)) { err = -EPERM; goto err_put; } @@ -1075,8 +1106,7 @@ static int map_lookup_and_delete_elem(union bpf_attr *attr) map = __bpf_map_get(f); if (IS_ERR(map)) return PTR_ERR(map); - - if (!(f.file->f_mode & FMODE_CAN_WRITE)) { + if (!(map_get_sys_perms(map, f) & FMODE_CAN_WRITE)) { err = -EPERM; goto err_put; } @@ -1118,6 +1148,36 @@ err_put: return err; } +#define BPF_MAP_FREEZE_LAST_FIELD map_fd + +static int map_freeze(const union bpf_attr *attr) +{ + int err = 0, ufd = attr->map_fd; + struct bpf_map *map; + struct fd f; + + if (CHECK_ATTR(BPF_MAP_FREEZE)) + return -EINVAL; + + f = fdget(ufd); + map = __bpf_map_get(f); + if (IS_ERR(map)) + return PTR_ERR(map); + if (READ_ONCE(map->frozen)) { + err = -EBUSY; + goto err_put; + } + if (!capable(CAP_SYS_ADMIN)) { + err = -EPERM; + goto err_put; + } + + WRITE_ONCE(map->frozen, true); +err_put: + fdput(f); + return err; +} + static const struct bpf_prog_ops * const bpf_prog_types[] = { #define BPF_PROG_TYPE(_id, _name) \ [_id] = & _name ## _prog_ops, @@ -1557,7 +1617,8 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr) /* eBPF programs must be GPL compatible to use GPL-ed functions */ is_gpl = license_is_gpl_compatible(license); - if (attr->insn_cnt == 0 || attr->insn_cnt > BPF_MAXINSNS) + if (attr->insn_cnt == 0 || + attr->insn_cnt > (capable(CAP_SYS_ADMIN) ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS)) return -E2BIG; if (type != BPF_PROG_TYPE_SOCKET_FILTER && type != BPF_PROG_TYPE_CGROUP_SKB && @@ -1948,7 +2009,7 @@ static int bpf_prog_query(const union bpf_attr *attr, return cgroup_bpf_prog_query(attr, uattr); } -#define BPF_PROG_TEST_RUN_LAST_FIELD test.duration +#define BPF_PROG_TEST_RUN_LAST_FIELD test.ctx_out static int bpf_prog_test_run(const union bpf_attr *attr, union bpf_attr __user *uattr) @@ -1961,6 +2022,14 @@ static int bpf_prog_test_run(const union bpf_attr *attr, if (CHECK_ATTR(BPF_PROG_TEST_RUN)) return -EINVAL; + if ((attr->test.ctx_size_in && !attr->test.ctx_in) || + (!attr->test.ctx_size_in && attr->test.ctx_in)) + return -EINVAL; + + if ((attr->test.ctx_size_out && !attr->test.ctx_out) || + (!attr->test.ctx_size_out && attr->test.ctx_out)) + return -EINVAL; + prog = bpf_prog_get(attr->test.prog_fd); if (IS_ERR(prog)) return PTR_ERR(prog); @@ -2071,13 +2140,26 @@ static int bpf_map_get_fd_by_id(const union bpf_attr *attr) } static const struct bpf_map *bpf_map_from_imm(const struct bpf_prog *prog, - unsigned long addr) + unsigned long addr, u32 *off, + u32 *type) { + const struct bpf_map *map; int i; - for (i = 0; i < prog->aux->used_map_cnt; i++) - if (prog->aux->used_maps[i] == (void *)addr) - return prog->aux->used_maps[i]; + for (i = 0, *off = 0; i < prog->aux->used_map_cnt; i++) { + map = prog->aux->used_maps[i]; + if (map == (void *)addr) { + *type = BPF_PSEUDO_MAP_FD; + return map; + } + if (!map->ops->map_direct_value_meta) + continue; + if (!map->ops->map_direct_value_meta(map, addr, off)) { + *type = BPF_PSEUDO_MAP_VALUE; + return map; + } + } + return NULL; } @@ -2085,6 +2167,7 @@ static struct bpf_insn *bpf_insn_prepare_dump(const struct bpf_prog *prog) { const struct bpf_map *map; struct bpf_insn *insns; + u32 off, type; u64 imm; int i; @@ -2112,11 +2195,11 @@ static struct bpf_insn *bpf_insn_prepare_dump(const struct bpf_prog *prog) continue; imm = ((u64)insns[i + 1].imm << 32) | (u32)insns[i].imm; - map = bpf_map_from_imm(prog, imm); + map = bpf_map_from_imm(prog, imm, &off, &type); if (map) { - insns[i].src_reg = BPF_PSEUDO_MAP_FD; + insns[i].src_reg = type; insns[i].imm = map->id; - insns[i + 1].imm = 0; + insns[i + 1].imm = off; continue; } } @@ -2706,6 +2789,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz case BPF_MAP_GET_NEXT_KEY: err = map_get_next_key(&attr); break; + case BPF_MAP_FREEZE: + err = map_freeze(&attr); + break; case BPF_PROG_LOAD: err = bpf_prog_load(&attr, uattr); break; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index b7ad8003c4e6..f25b7c9c20ba 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -176,7 +176,6 @@ struct bpf_verifier_stack_elem { struct bpf_verifier_stack_elem *next; }; -#define BPF_COMPLEXITY_LIMIT_INSNS 131072 #define BPF_COMPLEXITY_LIMIT_STACK 1024 #define BPF_COMPLEXITY_LIMIT_STATES 64 @@ -1092,7 +1091,7 @@ static int check_subprogs(struct bpf_verifier_env *env) */ subprog[env->subprog_cnt].start = insn_cnt; - if (env->log.level > 1) + if (env->log.level & BPF_LOG_LEVEL2) for (i = 0; i < env->subprog_cnt; i++) verbose(env, "func#%d @%d\n", i, subprog[i].start); @@ -1139,6 +1138,7 @@ static int mark_reg_read(struct bpf_verifier_env *env, struct bpf_reg_state *parent) { bool writes = parent == state->parent; /* Observe write marks */ + int cnt = 0; while (parent) { /* if read wasn't screened by an earlier write ... */ @@ -1150,12 +1150,25 @@ static int mark_reg_read(struct bpf_verifier_env *env, parent->var_off.value, parent->off); return -EFAULT; } + if (parent->live & REG_LIVE_READ) + /* The parentage chain never changes and + * this parent was already marked as LIVE_READ. + * There is no need to keep walking the chain again and + * keep re-marking all parents as LIVE_READ. + * This case happens when the same register is read + * multiple times without writes into it in-between. + */ + break; /* ... then we depend on parent's value */ parent->live |= REG_LIVE_READ; state = parent; parent = state->parent; writes = true; + cnt++; } + + if (env->longest_mark_read_walk < cnt) + env->longest_mark_read_walk = cnt; return 0; } @@ -1413,7 +1426,7 @@ static int check_stack_access(struct bpf_verifier_env *env, char tn_buf[48]; tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off); - verbose(env, "variable stack access var_off=%s off=%d size=%d", + verbose(env, "variable stack access var_off=%s off=%d size=%d\n", tn_buf, off, size); return -EACCES; } @@ -1426,6 +1439,28 @@ static int check_stack_access(struct bpf_verifier_env *env, return 0; } +static int check_map_access_type(struct bpf_verifier_env *env, u32 regno, + int off, int size, enum bpf_access_type type) +{ + struct bpf_reg_state *regs = cur_regs(env); + struct bpf_map *map = regs[regno].map_ptr; + u32 cap = bpf_map_flags_to_cap(map); + + if (type == BPF_WRITE && !(cap & BPF_MAP_CAN_WRITE)) { + verbose(env, "write into map forbidden, value_size=%d off=%d size=%d\n", + map->value_size, off, size); + return -EACCES; + } + + if (type == BPF_READ && !(cap & BPF_MAP_CAN_READ)) { + verbose(env, "read from map forbidden, value_size=%d off=%d size=%d\n", + map->value_size, off, size); + return -EACCES; + } + + return 0; +} + /* check read/write into map element returned by bpf_map_lookup_elem() */ static int __check_map_access(struct bpf_verifier_env *env, u32 regno, int off, int size, bool zero_size_allowed) @@ -1455,7 +1490,7 @@ static int check_map_access(struct bpf_verifier_env *env, u32 regno, * need to try adding each of min_value and max_value to off * to make sure our theoretical access will be safe. */ - if (env->log.level) + if (env->log.level & BPF_LOG_LEVEL) print_verifier_state(env, state); /* The minimum value is only important with signed @@ -2012,7 +2047,9 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn verbose(env, "R%d leaks addr into map\n", value_regno); return -EACCES; } - + err = check_map_access_type(env, regno, off, size, t); + if (err) + return err; err = check_map_access(env, regno, off, size, false); if (!err && t == BPF_READ && value_regno >= 0) mark_reg_unknown(env, regs, value_regno); @@ -2158,6 +2195,29 @@ static int check_xadd(struct bpf_verifier_env *env, int insn_idx, struct bpf_ins BPF_SIZE(insn->code), BPF_WRITE, -1, true); } +static int __check_stack_boundary(struct bpf_verifier_env *env, u32 regno, + int off, int access_size, + bool zero_size_allowed) +{ + struct bpf_reg_state *reg = reg_state(env, regno); + + if (off >= 0 || off < -MAX_BPF_STACK || off + access_size > 0 || + access_size < 0 || (access_size == 0 && !zero_size_allowed)) { + if (tnum_is_const(reg->var_off)) { + verbose(env, "invalid stack type R%d off=%d access_size=%d\n", + regno, off, access_size); + } else { + char tn_buf[48]; + + tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off); + verbose(env, "invalid stack type R%d var_off=%s access_size=%d\n", + regno, tn_buf, access_size); + } + return -EACCES; + } + return 0; +} + /* when register 'regno' is passed into function that will read 'access_size' * bytes from that pointer, make sure that it's within stack boundary * and all elements of stack are initialized. @@ -2170,7 +2230,7 @@ static int check_stack_boundary(struct bpf_verifier_env *env, int regno, { struct bpf_reg_state *reg = reg_state(env, regno); struct bpf_func_state *state = func(env, reg); - int off, i, slot, spi; + int err, min_off, max_off, i, slot, spi; if (reg->type != PTR_TO_STACK) { /* Allow zero-byte read from NULL, regardless of pointer type */ @@ -2184,21 +2244,57 @@ static int check_stack_boundary(struct bpf_verifier_env *env, int regno, return -EACCES; } - /* Only allow fixed-offset stack reads */ - if (!tnum_is_const(reg->var_off)) { - char tn_buf[48]; + if (tnum_is_const(reg->var_off)) { + min_off = max_off = reg->var_off.value + reg->off; + err = __check_stack_boundary(env, regno, min_off, access_size, + zero_size_allowed); + if (err) + return err; + } else { + /* Variable offset is prohibited for unprivileged mode for + * simplicity since it requires corresponding support in + * Spectre masking for stack ALU. + * See also retrieve_ptr_limit(). + */ + if (!env->allow_ptr_leaks) { + char tn_buf[48]; - tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off); - verbose(env, "invalid variable stack read R%d var_off=%s\n", - regno, tn_buf); - return -EACCES; - } - off = reg->off + reg->var_off.value; - if (off >= 0 || off < -MAX_BPF_STACK || off + access_size > 0 || - access_size < 0 || (access_size == 0 && !zero_size_allowed)) { - verbose(env, "invalid stack type R%d off=%d access_size=%d\n", - regno, off, access_size); - return -EACCES; + tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off); + verbose(env, "R%d indirect variable offset stack access prohibited for !root, var_off=%s\n", + regno, tn_buf); + return -EACCES; + } + /* Only initialized buffer on stack is allowed to be accessed + * with variable offset. With uninitialized buffer it's hard to + * guarantee that whole memory is marked as initialized on + * helper return since specific bounds are unknown what may + * cause uninitialized stack leaking. + */ + if (meta && meta->raw_mode) + meta = NULL; + + if (reg->smax_value >= BPF_MAX_VAR_OFF || + reg->smax_value <= -BPF_MAX_VAR_OFF) { + verbose(env, "R%d unbounded indirect variable offset stack access\n", + regno); + return -EACCES; + } + min_off = reg->smin_value + reg->off; + max_off = reg->smax_value + reg->off; + err = __check_stack_boundary(env, regno, min_off, access_size, + zero_size_allowed); + if (err) { + verbose(env, "R%d min value is outside of stack bound\n", + regno); + return err; + } + err = __check_stack_boundary(env, regno, max_off, access_size, + zero_size_allowed); + if (err) { + verbose(env, "R%d max value is outside of stack bound\n", + regno); + return err; + } } if (meta && meta->raw_mode) { @@ -2207,10 +2303,10 @@ static int check_stack_boundary(struct bpf_verifier_env *env, int regno, return 0; } - for (i = 0; i < access_size; i++) { + for (i = min_off; i < max_off + access_size; i++) { u8 *stype; - slot = -(off + i) - 1; + slot = -i - 1; spi = slot / BPF_REG_SIZE; if (state->allocated_stack <= slot) goto err; @@ -2223,8 +2319,16 @@ static int check_stack_boundary(struct bpf_verifier_env *env, int regno, goto mark; } err: - verbose(env, "invalid indirect read from stack off %d+%d size %d\n", - off, i, access_size); + if (tnum_is_const(reg->var_off)) { + verbose(env, "invalid indirect read from stack off %d+%d size %d\n", + min_off, i - min_off, access_size); + } else { + char tn_buf[48]; + + tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off); + verbose(env, "invalid indirect read from stack var_off %s+%d size %d\n", + tn_buf, i - min_off, access_size); + } return -EACCES; mark: /* reading any byte out of 8-byte 'spill_slot' will cause @@ -2233,7 +2337,7 @@ mark: mark_reg_read(env, &state->stack[spi].spilled_ptr, state->stack[spi].spilled_ptr.parent); } - return update_stack_depth(env, state, off); + return update_stack_depth(env, state, min_off); } static int check_helper_mem_access(struct bpf_verifier_env *env, int regno, @@ -2248,6 +2352,10 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno, return check_packet_access(env, regno, reg->off, access_size, zero_size_allowed); case PTR_TO_MAP_VALUE: + if (check_map_access_type(env, regno, reg->off, access_size, + meta && meta->raw_mode ? BPF_WRITE : + BPF_READ)) + return -EACCES; return check_map_access(env, regno, reg->off, access_size, zero_size_allowed); default: /* scalar_value|ptr_to_stack or invalid ptr */ @@ -2906,7 +3014,7 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, /* and go analyze first insn of the callee */ *insn_idx = target_insn; - if (env->log.level) { + if (env->log.level & BPF_LOG_LEVEL) { verbose(env, "caller:\n"); print_verifier_state(env, caller); verbose(env, "callee:\n"); @@ -2946,7 +3054,7 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx) return err; *insn_idx = callee->callsite + 1; - if (env->log.level) { + if (env->log.level & BPF_LOG_LEVEL) { verbose(env, "returning from callee:\n"); print_verifier_state(env, callee); verbose(env, "to caller at %d:\n", *insn_idx); @@ -2980,6 +3088,7 @@ record_func_map(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta, int func_id, int insn_idx) { struct bpf_insn_aux_data *aux = &env->insn_aux_data[insn_idx]; + struct bpf_map *map = meta->map_ptr; if (func_id != BPF_FUNC_tail_call && func_id != BPF_FUNC_map_lookup_elem && @@ -2990,11 +3099,24 @@ record_func_map(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta, func_id != BPF_FUNC_map_peek_elem) return 0; - if (meta->map_ptr == NULL) { + if (map == NULL) { verbose(env, "kernel subsystem misconfigured verifier\n"); return -EINVAL; } + /* In case of read-only, some additional restrictions + * need to be applied in order to prevent altering the + * state of the map from program side. + */ + if ((map->map_flags & BPF_F_RDONLY_PROG) && + (func_id == BPF_FUNC_map_delete_elem || + func_id == BPF_FUNC_map_update_elem || + func_id == BPF_FUNC_map_push_elem || + func_id == BPF_FUNC_map_pop_elem)) { + verbose(env, "write into map forbidden\n"); + return -EACCES; + } + if (!BPF_MAP_PTR(aux->map_state)) bpf_map_ptr_store(aux, meta->map_ptr, meta->map_ptr->unpriv_array); @@ -3285,6 +3407,9 @@ static int retrieve_ptr_limit(const struct bpf_reg_state *ptr_reg, switch (ptr_reg->type) { case PTR_TO_STACK: + /* Indirect variable offset stack access is prohibited in + * unprivileged mode so it's not handled here. + */ off = ptr_reg->off + ptr_reg->var_off.value; if (mask_to_left) *ptr_limit = MAX_BPF_STACK + off; @@ -4969,23 +5094,17 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, insn->dst_reg); return -EACCES; } - if (env->log.level) + if (env->log.level & BPF_LOG_LEVEL) print_verifier_state(env, this_branch->frame[this_branch->curframe]); return 0; } -/* return the map pointer stored inside BPF_LD_IMM64 instruction */ -static struct bpf_map *ld_imm64_to_map_ptr(struct bpf_insn *insn) -{ - u64 imm64 = ((u64) (u32) insn[0].imm) | ((u64) (u32) insn[1].imm) << 32; - - return (struct bpf_map *) (unsigned long) imm64; -} - /* verify BPF_LD_IMM64 instruction */ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn) { + struct bpf_insn_aux_data *aux = cur_aux(env); struct bpf_reg_state *regs = cur_regs(env); + struct bpf_map *map; int err; if (BPF_SIZE(insn->code) != BPF_DW) { @@ -5009,11 +5128,22 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn) return 0; } - /* replace_map_fd_with_map_ptr() should have caught bad ld_imm64 */ - BUG_ON(insn->src_reg != BPF_PSEUDO_MAP_FD); + map = env->used_maps[aux->map_index]; + mark_reg_known_zero(env, regs, insn->dst_reg); + regs[insn->dst_reg].map_ptr = map; + + if (insn->src_reg == BPF_PSEUDO_MAP_VALUE) { + regs[insn->dst_reg].type = PTR_TO_MAP_VALUE; + regs[insn->dst_reg].off = aux->map_off; + if (map_value_has_spin_lock(map)) + regs[insn->dst_reg].id = ++env->id_gen; + } else if (insn->src_reg == BPF_PSEUDO_MAP_FD) { + regs[insn->dst_reg].type = CONST_PTR_TO_MAP; + } else { + verbose(env, "bpf verifier is misconfigured\n"); + return -EINVAL; + } - regs[insn->dst_reg].type = CONST_PTR_TO_MAP; - regs[insn->dst_reg].map_ptr = ld_imm64_to_map_ptr(insn); return 0; } @@ -5267,13 +5397,13 @@ static int check_cfg(struct bpf_verifier_env *env) int ret = 0; int i, t; - insn_state = kcalloc(insn_cnt, sizeof(int), GFP_KERNEL); + insn_state = kvcalloc(insn_cnt, sizeof(int), GFP_KERNEL); if (!insn_state) return -ENOMEM; - insn_stack = kcalloc(insn_cnt, sizeof(int), GFP_KERNEL); + insn_stack = kvcalloc(insn_cnt, sizeof(int), GFP_KERNEL); if (!insn_stack) { - kfree(insn_state); + kvfree(insn_state); return -ENOMEM; } @@ -5371,8 +5501,8 @@ check_state: ret = 0; /* cfg looks good */ err_free: - kfree(insn_state); - kfree(insn_stack); + kvfree(insn_state); + kvfree(insn_stack); return ret; } @@ -6115,11 +6245,13 @@ static int propagate_liveness(struct bpf_verifier_env *env, static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) { struct bpf_verifier_state_list *new_sl; - struct bpf_verifier_state_list *sl; + struct bpf_verifier_state_list *sl, **pprev; struct bpf_verifier_state *cur = env->cur_state, *new; int i, j, err, states_cnt = 0; - sl = env->explored_states[insn_idx]; + pprev = &env->explored_states[insn_idx]; + sl = *pprev; + if (!sl) /* this 'insn_idx' instruction wasn't marked, so we will not * be doing state search here @@ -6130,6 +6262,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) while (sl != STATE_LIST_MARK) { if (states_equal(env, &sl->state, cur)) { + sl->hit_cnt++; /* reached equivalent register/stack state, * prune the search. * Registers read by the continuation are read by us. @@ -6145,10 +6278,40 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) return err; return 1; } - sl = sl->next; states_cnt++; + sl->miss_cnt++; + /* heuristic to determine whether this state is beneficial + * to keep checking from state equivalence point of view. + * Higher numbers increase max_states_per_insn and verification time, + * but do not meaningfully decrease insn_processed. + */ + if (sl->miss_cnt > sl->hit_cnt * 3 + 3) { + /* the state is unlikely to be useful. Remove it to + * speed up verification + */ + *pprev = sl->next; + if (sl->state.frame[0]->regs[0].live & REG_LIVE_DONE) { + free_verifier_state(&sl->state, false); + kfree(sl); + env->peak_states--; + } else { + /* cannot free this state, since parentage chain may + * walk it later. Add it for free_list instead to + * be freed at the end of verification + */ + sl->next = env->free_list; + env->free_list = sl; + } + sl = *pprev; + continue; + } + pprev = &sl->next; + sl = *pprev; } + if (env->max_states_per_insn < states_cnt) + env->max_states_per_insn = states_cnt; + if (!env->allow_ptr_leaks && states_cnt > BPF_COMPLEXITY_LIMIT_STATES) return 0; @@ -6162,6 +6325,8 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) new_sl = kzalloc(sizeof(struct bpf_verifier_state_list), GFP_KERNEL); if (!new_sl) return -ENOMEM; + env->total_states++; + env->peak_states++; /* add new state to the head of linked list */ new = &new_sl->state; @@ -6246,8 +6411,7 @@ static int do_check(struct bpf_verifier_env *env) struct bpf_verifier_state *state; struct bpf_insn *insns = env->prog->insnsi; struct bpf_reg_state *regs; - int insn_cnt = env->prog->len, i; - int insn_processed = 0; + int insn_cnt = env->prog->len; bool do_print_state = false; env->prev_linfo = NULL; @@ -6282,10 +6446,10 @@ static int do_check(struct bpf_verifier_env *env) insn = &insns[env->insn_idx]; class = BPF_CLASS(insn->code); - if (++insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) { + if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) { verbose(env, "BPF program is too large. Processed %d insn\n", - insn_processed); + env->insn_processed); return -E2BIG; } @@ -6294,7 +6458,7 @@ static int do_check(struct bpf_verifier_env *env) return err; if (err == 1) { /* found equivalent state, can prune the search */ - if (env->log.level) { + if (env->log.level & BPF_LOG_LEVEL) { if (do_print_state) verbose(env, "\nfrom %d to %d%s: safe\n", env->prev_insn_idx, env->insn_idx, @@ -6312,8 +6476,9 @@ static int do_check(struct bpf_verifier_env *env) if (need_resched()) cond_resched(); - if (env->log.level > 1 || (env->log.level && do_print_state)) { - if (env->log.level > 1) + if (env->log.level & BPF_LOG_LEVEL2 || + (env->log.level & BPF_LOG_LEVEL && do_print_state)) { + if (env->log.level & BPF_LOG_LEVEL2) verbose(env, "%d:", env->insn_idx); else verbose(env, "\nfrom %d to %d%s:", @@ -6324,7 +6489,7 @@ static int do_check(struct bpf_verifier_env *env) do_print_state = false; } - if (env->log.level) { + if (env->log.level & BPF_LOG_LEVEL) { const struct bpf_insn_cbs cbs = { .cb_print = verbose, .private_data = env, @@ -6589,16 +6754,6 @@ process_bpf_exit: env->insn_idx++; } - verbose(env, "processed %d insns (limit %d), stack depth ", - insn_processed, BPF_COMPLEXITY_LIMIT_INSNS); - for (i = 0; i < env->subprog_cnt; i++) { - u32 depth = env->subprog_info[i].stack_depth; - - verbose(env, "%d", depth); - if (i + 1 < env->subprog_cnt) - verbose(env, "+"); - } - verbose(env, "\n"); env->prog->aux->stack_depth = env->subprog_info[0].stack_depth; return 0; } @@ -6696,8 +6851,10 @@ static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env) } if (insn[0].code == (BPF_LD | BPF_IMM | BPF_DW)) { + struct bpf_insn_aux_data *aux; struct bpf_map *map; struct fd f; + u64 addr; if (i == insn_cnt - 1 || insn[1].code != 0 || insn[1].dst_reg != 0 || insn[1].src_reg != 0 || @@ -6706,13 +6863,19 @@ static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env) return -EINVAL; } - if (insn->src_reg == 0) + if (insn[0].src_reg == 0) /* valid generic load 64-bit imm */ goto next_insn; - if (insn[0].src_reg != BPF_PSEUDO_MAP_FD || - insn[1].imm != 0) { - verbose(env, "unrecognized bpf_ld_imm64 insn\n"); + /* In final convert_pseudo_ld_imm64() step, this is + * converted into regular 64-bit imm load insn. + */ + if ((insn[0].src_reg != BPF_PSEUDO_MAP_FD && + insn[0].src_reg != BPF_PSEUDO_MAP_VALUE) || + (insn[0].src_reg == BPF_PSEUDO_MAP_FD && + insn[1].imm != 0)) { + verbose(env, + "unrecognized bpf_ld_imm64 insn\n"); return -EINVAL; } @@ -6730,16 +6893,47 @@ static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env) return err; } - /* store map pointer inside BPF_LD_IMM64 instruction */ - insn[0].imm = (u32) (unsigned long) map; - insn[1].imm = ((u64) (unsigned long) map) >> 32; + aux = &env->insn_aux_data[i]; + if (insn->src_reg == BPF_PSEUDO_MAP_FD) { + addr = (unsigned long)map; + } else { + u32 off = insn[1].imm; + + if (off >= BPF_MAX_VAR_OFF) { + verbose(env, "direct value offset of %u is not allowed\n", off); + fdput(f); + return -EINVAL; + } + + if (!map->ops->map_direct_value_addr) { + verbose(env, "no direct value access support for this map type\n"); + fdput(f); + return -EINVAL; + } + + err = map->ops->map_direct_value_addr(map, &addr, off); + if (err) { + verbose(env, "invalid access to map value pointer, value_size=%u off=%u\n", + map->value_size, off); + fdput(f); + return err; + } + + aux->map_off = off; + addr += off; + } + + insn[0].imm = (u32)addr; + insn[1].imm = addr >> 32; /* check whether we recorded this map already */ - for (j = 0; j < env->used_map_cnt; j++) + for (j = 0; j < env->used_map_cnt; j++) { if (env->used_maps[j] == map) { + aux->map_index = j; fdput(f); goto next_insn; } + } if (env->used_map_cnt >= MAX_USED_MAPS) { fdput(f); @@ -6756,6 +6950,8 @@ static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env) fdput(f); return PTR_ERR(map); } + + aux->map_index = env->used_map_cnt; env->used_maps[env->used_map_cnt++] = map; if (bpf_map_is_cgroup_storage(map) && @@ -6861,8 +7057,13 @@ static struct bpf_prog *bpf_patch_insn_data(struct bpf_verifier_env *env, u32 of struct bpf_prog *new_prog; new_prog = bpf_patch_insn_single(env->prog, off, patch, len); - if (!new_prog) + if (IS_ERR(new_prog)) { + if (PTR_ERR(new_prog) == -ERANGE) + verbose(env, + "insn %d cannot be patched due to 16-bit range\n", + env->insn_aux_data[off].orig_idx); return NULL; + } if (adjust_insn_aux_data(env, new_prog->len, off, len)) return NULL; adjust_subprog_starts(env, off, len); @@ -7804,6 +8005,14 @@ static void free_states(struct bpf_verifier_env *env) struct bpf_verifier_state_list *sl, *sln; int i; + sl = env->free_list; + while (sl) { + sln = sl->next; + free_verifier_state(&sl->state, false); + kfree(sl); + sl = sln; + } + if (!env->explored_states) return; @@ -7819,12 +8028,37 @@ static void free_states(struct bpf_verifier_env *env) } } - kfree(env->explored_states); + kvfree(env->explored_states); +} + +static void print_verification_stats(struct bpf_verifier_env *env) +{ + int i; + + if (env->log.level & BPF_LOG_STATS) { + verbose(env, "verification time %lld usec\n", + div_u64(env->verification_time, 1000)); + verbose(env, "stack depth "); + for (i = 0; i < env->subprog_cnt; i++) { + u32 depth = env->subprog_info[i].stack_depth; + + verbose(env, "%d", depth); + if (i + 1 < env->subprog_cnt) + verbose(env, "+"); + } + verbose(env, "\n"); + } + verbose(env, "processed %d insns (limit %d) max_states_per_insn %d " + "total_states %d peak_states %d mark_read %d\n", + env->insn_processed, BPF_COMPLEXITY_LIMIT_INSNS, + env->max_states_per_insn, env->total_states, + env->peak_states, env->longest_mark_read_walk); } int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, union bpf_attr __user *uattr) { + u64 start_time = ktime_get_ns(); struct bpf_verifier_env *env; struct bpf_verifier_log *log; int i, len, ret = -EINVAL; @@ -7866,8 +8100,8 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, ret = -EINVAL; /* log attributes have to be sane */ - if (log->len_total < 128 || log->len_total > UINT_MAX >> 8 || - !log->level || !log->ubuf) + if (log->len_total < 128 || log->len_total > UINT_MAX >> 2 || + !log->level || !log->ubuf || log->level & ~BPF_LOG_MASK) goto err_unlock; } @@ -7890,7 +8124,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, goto skip_full_check; } - env->explored_states = kcalloc(env->prog->len, + env->explored_states = kvcalloc(env->prog->len, sizeof(struct bpf_verifier_state_list *), GFP_USER); ret = -ENOMEM; @@ -7948,6 +8182,9 @@ skip_full_check: if (ret == 0) ret = fixup_call_args(env); + env->verification_time = ktime_get_ns() - start_time; + print_verification_stats(env); + if (log->level && bpf_verifier_log_full(log)) ret = -ENOSPC; if (log->level && !log->ubuf) { diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 0d9e81779e37..188fc17c2202 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -219,6 +219,14 @@ config DEBUG_INFO_DWARF4 But it significantly improves the success of resolving variables in gdb on optimized code. +config DEBUG_INFO_BTF + bool "Generate BTF typeinfo" + depends on DEBUG_INFO + help + Generate deduplicated BTF type information from DWARF debug info. + Turning this on expects presence of pahole tool, which will convert + DWARF type info into equivalent deduplicated BTF type info. + config GDB_SCRIPTS bool "Provide GDB scripts for kernel debugging" depends on DEBUG_INFO diff --git a/net/bpf/Makefile b/net/bpf/Makefile index 27b2992a0692..b0ca361742e4 100644 --- a/net/bpf/Makefile +++ b/net/bpf/Makefile @@ -1 +1 @@ -obj-y := test_run.o +obj-$(CONFIG_BPF_SYSCALL) := test_run.o diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index fab142b796ef..2221573dacdb 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -123,12 +123,126 @@ static void *bpf_test_init(const union bpf_attr *kattr, u32 size, return data; } +static void *bpf_ctx_init(const union bpf_attr *kattr, u32 max_size) +{ + void __user *data_in = u64_to_user_ptr(kattr->test.ctx_in); + void __user *data_out = u64_to_user_ptr(kattr->test.ctx_out); + u32 size = kattr->test.ctx_size_in; + void *data; + int err; + + if (!data_in && !data_out) + return NULL; + + data = kzalloc(max_size, GFP_USER); + if (!data) + return ERR_PTR(-ENOMEM); + + if (data_in) { + err = bpf_check_uarg_tail_zero(data_in, max_size, size); + if (err) { + kfree(data); + return ERR_PTR(err); + } + + size = min_t(u32, max_size, size); + if (copy_from_user(data, data_in, size)) { + kfree(data); + return ERR_PTR(-EFAULT); + } + } + return data; +} + +static int bpf_ctx_finish(const union bpf_attr *kattr, + union bpf_attr __user *uattr, const void *data, + u32 size) +{ + void __user *data_out = u64_to_user_ptr(kattr->test.ctx_out); + int err = -EFAULT; + u32 copy_size = size; + + if (!data || !data_out) + return 0; + + if (copy_size > kattr->test.ctx_size_out) { + copy_size = kattr->test.ctx_size_out; + err = -ENOSPC; + } + + if (copy_to_user(data_out, data, copy_size)) + goto out; + if (copy_to_user(&uattr->test.ctx_size_out, &size, sizeof(size))) + goto out; + if (err != -ENOSPC) + err = 0; +out: + return err; +} + +/** + * range_is_zero - test whether buffer is initialized + * @buf: buffer to check + * @from: check from this position + * @to: check up until (excluding) this position + * + * This function returns true if the there is a non-zero byte + * in the buf in the range [from,to). + */ +static inline bool range_is_zero(void *buf, size_t from, size_t to) +{ + return !memchr_inv((u8 *)buf + from, 0, to - from); +} + +static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb) +{ + struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb; + + if (!__skb) + return 0; + + /* make sure the fields we don't use are zeroed */ + if (!range_is_zero(__skb, 0, offsetof(struct __sk_buff, priority))) + return -EINVAL; + + /* priority is allowed */ + + if (!range_is_zero(__skb, offsetof(struct __sk_buff, priority) + + FIELD_SIZEOF(struct __sk_buff, priority), + offsetof(struct __sk_buff, cb))) + return -EINVAL; + + /* cb is allowed */ + + if (!range_is_zero(__skb, offsetof(struct __sk_buff, cb) + + FIELD_SIZEOF(struct __sk_buff, cb), + sizeof(struct __sk_buff))) + return -EINVAL; + + skb->priority = __skb->priority; + memcpy(&cb->data, __skb->cb, QDISC_CB_PRIV_LEN); + + return 0; +} + +static void convert_skb_to___skb(struct sk_buff *skb, struct __sk_buff *__skb) +{ + struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb; + + if (!__skb) + return; + + __skb->priority = skb->priority; + memcpy(__skb->cb, &cb->data, QDISC_CB_PRIV_LEN); +} + int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, union bpf_attr __user *uattr) { bool is_l2 = false, is_direct_pkt_access = false; u32 size = kattr->test.data_size_in; u32 repeat = kattr->test.repeat; + struct __sk_buff *ctx = NULL; u32 retval, duration; int hh_len = ETH_HLEN; struct sk_buff *skb; @@ -141,6 +255,12 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, if (IS_ERR(data)) return PTR_ERR(data); + ctx = bpf_ctx_init(kattr, sizeof(struct __sk_buff)); + if (IS_ERR(ctx)) { + kfree(data); + return PTR_ERR(ctx); + } + switch (prog->type) { case BPF_PROG_TYPE_SCHED_CLS: case BPF_PROG_TYPE_SCHED_ACT: @@ -158,6 +278,7 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, sk = kzalloc(sizeof(struct sock), GFP_USER); if (!sk) { kfree(data); + kfree(ctx); return -ENOMEM; } sock_net_set(sk, current->nsproxy->net_ns); @@ -166,6 +287,7 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, skb = build_skb(data, 0); if (!skb) { kfree(data); + kfree(ctx); kfree(sk); return -ENOMEM; } @@ -180,32 +302,37 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, __skb_push(skb, hh_len); if (is_direct_pkt_access) bpf_compute_data_pointers(skb); + ret = convert___skb_to_skb(skb, ctx); + if (ret) + goto out; ret = bpf_test_run(prog, skb, repeat, &retval, &duration); - if (ret) { - kfree_skb(skb); - kfree(sk); - return ret; - } + if (ret) + goto out; if (!is_l2) { if (skb_headroom(skb) < hh_len) { int nhead = HH_DATA_ALIGN(hh_len - skb_headroom(skb)); if (pskb_expand_head(skb, nhead, 0, GFP_USER)) { - kfree_skb(skb); - kfree(sk); - return -ENOMEM; + ret = -ENOMEM; + goto out; } } memset(__skb_push(skb, hh_len), 0, hh_len); } + convert_skb_to___skb(skb, ctx); size = skb->len; /* bpf program can never convert linear skb to non-linear */ if (WARN_ON_ONCE(skb_is_nonlinear(skb))) size = skb_headlen(skb); ret = bpf_test_finish(kattr, uattr, skb->data, size, retval, duration); + if (!ret) + ret = bpf_ctx_finish(kattr, uattr, ctx, + sizeof(struct __sk_buff)); +out: kfree_skb(skb); kfree(sk); + kfree(ctx); return ret; } @@ -220,6 +347,9 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr, void *data; int ret; + if (kattr->test.ctx_in || kattr->test.ctx_out) + return -EINVAL; + data = bpf_test_init(kattr, size, XDP_PACKET_HEADROOM + NET_IP_ALIGN, 0); if (IS_ERR(data)) return PTR_ERR(data); @@ -263,6 +393,9 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog, if (prog->type != BPF_PROG_TYPE_FLOW_DISSECTOR) return -EINVAL; + if (kattr->test.ctx_in || kattr->test.ctx_out) + return -EINVAL; + data = bpf_test_init(kattr, size, NET_SKB_PAD + NET_IP_ALIGN, SKB_DATA_ALIGN(sizeof(struct skb_shared_info))); if (IS_ERR(data)) diff --git a/net/core/filter.c b/net/core/filter.c index 41f633cf4fc1..95a27fdf9a40 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -2970,11 +2970,14 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb) #define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \ BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \ BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \ - BPF_F_ADJ_ROOM_ENCAP_L4_UDP) + BPF_F_ADJ_ROOM_ENCAP_L4_UDP | \ + BPF_F_ADJ_ROOM_ENCAP_L2( \ + BPF_ADJ_ROOM_ENCAP_L2_MASK)) static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff, u64 flags) { + u8 inner_mac_len = flags >> BPF_ADJ_ROOM_ENCAP_L2_SHIFT; bool encap = flags & BPF_F_ADJ_ROOM_ENCAP_L3_MASK; u16 mac_len = 0, inner_net = 0, inner_trans = 0; unsigned int gso_type = SKB_GSO_DODGY; @@ -3009,6 +3012,8 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff, mac_len = skb->network_header - skb->mac_header; inner_net = skb->network_header; + if (inner_mac_len > len_diff) + return -EINVAL; inner_trans = skb->transport_header; } @@ -3017,8 +3022,7 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff, return ret; if (encap) { - /* inner mac == inner_net on l3 encap */ - skb->inner_mac_header = inner_net; + skb->inner_mac_header = inner_net - inner_mac_len; skb->inner_network_header = inner_net; skb->inner_transport_header = inner_trans; skb_set_inner_protocol(skb, skb->protocol); @@ -3032,7 +3036,7 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff, gso_type |= SKB_GSO_GRE; else if (flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6) gso_type |= SKB_GSO_IPXIP6; - else + else if (flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV4) gso_type |= SKB_GSO_IPXIP4; if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_GRE || diff --git a/samples/bpf/asm_goto_workaround.h b/samples/bpf/asm_goto_workaround.h index 5cd7c1d1a5d5..7409722727ca 100644 --- a/samples/bpf/asm_goto_workaround.h +++ b/samples/bpf/asm_goto_workaround.h @@ -13,4 +13,5 @@ #define asm_volatile_goto(x...) asm volatile("invalid use of asm_volatile_goto") #endif +#define volatile(x...) volatile("") #endif diff --git a/samples/bpf/offwaketime_user.c b/samples/bpf/offwaketime_user.c index f06063af9fcb..bb315ce1b866 100644 --- a/samples/bpf/offwaketime_user.c +++ b/samples/bpf/offwaketime_user.c @@ -28,6 +28,11 @@ static void print_ksym(__u64 addr) if (!addr) return; sym = ksym_search(addr); + if (!sym) { + printf("ksym not found. Is kallsyms loaded?\n"); + return; + } + if (PRINT_RAW_ADDR) printf("%s/%llx;", sym->name, addr); else diff --git a/samples/bpf/sampleip_user.c b/samples/bpf/sampleip_user.c index 216c7ecbbbe9..23b90a45c802 100644 --- a/samples/bpf/sampleip_user.c +++ b/samples/bpf/sampleip_user.c @@ -109,6 +109,11 @@ static void print_ip_map(int fd) for (i = 0; i < max; i++) { if (counts[i].ip > PAGE_OFFSET) { sym = ksym_search(counts[i].ip); + if (!sym) { + printf("ksym not found. Is kallsyms loaded?\n"); + continue; + } + printf("0x%-17llx %-32s %u\n", counts[i].ip, sym->name, counts[i].count); } else { diff --git a/samples/bpf/spintest_user.c b/samples/bpf/spintest_user.c index 8d3e9cfa1909..2556af2d9b3e 100644 --- a/samples/bpf/spintest_user.c +++ b/samples/bpf/spintest_user.c @@ -37,8 +37,13 @@ int main(int ac, char **argv) bpf_map_lookup_elem(map_fd[0], &next_key, &value); assert(next_key == value); sym = ksym_search(value); - printf(" %s", sym->name); key = next_key; + if (!sym) { + printf("ksym not found. Is kallsyms loaded?\n"); + continue; + } + + printf(" %s", sym->name); } if (key) printf("\n"); diff --git a/samples/bpf/trace_event_user.c b/samples/bpf/trace_event_user.c index d08046ab81f0..d4178f60e075 100644 --- a/samples/bpf/trace_event_user.c +++ b/samples/bpf/trace_event_user.c @@ -34,6 +34,11 @@ static void print_ksym(__u64 addr) if (!addr) return; sym = ksym_search(addr); + if (!sym) { + printf("ksym not found. Is kallsyms loaded?\n"); + return; + } + printf("%s;", sym->name); if (!strcmp(sym->name, "sys_read")) sys_read_seen = true; diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh index dc0e8c5a1402..dd2b31ccca6a 100755 --- a/scripts/link-vmlinux.sh +++ b/scripts/link-vmlinux.sh @@ -35,7 +35,7 @@ set -e info() { if [ "${quiet}" != "silent_" ]; then - printf " %-7s %s\n" ${1} ${2} + printf " %-7s %s\n" "${1}" "${2}" fi } @@ -91,6 +91,20 @@ vmlinux_link() fi } +# generate .BTF typeinfo from DWARF debuginfo +gen_btf() +{ + local pahole_ver; + + pahole_ver=$(${PAHOLE} --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/') + if [ "${pahole_ver}" -lt "113" ]; then + info "BTF" "${1}: pahole version $(${PAHOLE} --version) is too old, need at least v1.13" + exit 0 + fi + + info "BTF" ${1} + LLVM_OBJCOPY=${OBJCOPY} ${PAHOLE} -J ${1} +} # Create ${2} .o file with all symbols from the ${1} object file kallsyms() @@ -248,6 +262,10 @@ fi info LD vmlinux vmlinux_link "${kallsymso}" vmlinux +if [ -n "${CONFIG_DEBUG_INFO_BTF}" ]; then + gen_btf vmlinux +fi + if [ -n "${CONFIG_BUILDTIME_EXTABLE_SORT}" ]; then info SORTEX vmlinux sortextable vmlinux diff --git a/tools/arch/arm64/include/asm/barrier.h b/tools/arch/arm64/include/asm/barrier.h index 378c051fa177..3b9b41331c4f 100644 --- a/tools/arch/arm64/include/asm/barrier.h +++ b/tools/arch/arm64/include/asm/barrier.h @@ -14,6 +14,16 @@ #define wmb() asm volatile("dmb ishst" ::: "memory") #define rmb() asm volatile("dmb ishld" ::: "memory") +/* + * Kernel uses dmb variants on arm64 for smp_*() barriers. Pretty much the same + * implementation as above mb()/wmb()/rmb(), though for the latter kernel uses + * dsb. In any case, should above mb()/wmb()/rmb() change, make sure the below + * smp_*() don't. + */ +#define smp_mb() asm volatile("dmb ish" ::: "memory") +#define smp_wmb() asm volatile("dmb ishst" ::: "memory") +#define smp_rmb() asm volatile("dmb ishld" ::: "memory") + #define smp_store_release(p, v) \ do { \ union { typeof(*p) __val; char __c[1]; } __u = \ diff --git a/tools/arch/x86/include/asm/barrier.h b/tools/arch/x86/include/asm/barrier.h index 58919868473c..0adf295dd5b6 100644 --- a/tools/arch/x86/include/asm/barrier.h +++ b/tools/arch/x86/include/asm/barrier.h @@ -21,9 +21,12 @@ #define rmb() asm volatile("lock; addl $0,0(%%esp)" ::: "memory") #define wmb() asm volatile("lock; addl $0,0(%%esp)" ::: "memory") #elif defined(__x86_64__) -#define mb() asm volatile("mfence":::"memory") -#define rmb() asm volatile("lfence":::"memory") +#define mb() asm volatile("mfence" ::: "memory") +#define rmb() asm volatile("lfence" ::: "memory") #define wmb() asm volatile("sfence" ::: "memory") +#define smp_rmb() barrier() +#define smp_wmb() barrier() +#define smp_mb() asm volatile("lock; addl $0,-132(%%rsp)" ::: "memory", "cc") #endif #if defined(__x86_64__) diff --git a/tools/bpf/bpftool/btf_dumper.c b/tools/bpf/bpftool/btf_dumper.c index e63bce0755eb..8cafb9b31467 100644 --- a/tools/bpf/bpftool/btf_dumper.c +++ b/tools/bpf/bpftool/btf_dumper.c @@ -309,6 +309,48 @@ static int btf_dumper_struct(const struct btf_dumper *d, __u32 type_id, return ret; } +static int btf_dumper_var(const struct btf_dumper *d, __u32 type_id, + __u8 bit_offset, const void *data) +{ + const struct btf_type *t = btf__type_by_id(d->btf, type_id); + int ret; + + jsonw_start_object(d->jw); + jsonw_name(d->jw, btf__name_by_offset(d->btf, t->name_off)); + ret = btf_dumper_do_type(d, t->type, bit_offset, data); + jsonw_end_object(d->jw); + + return ret; +} + +static int btf_dumper_datasec(const struct btf_dumper *d, __u32 type_id, + const void *data) +{ + struct btf_var_secinfo *vsi; + const struct btf_type *t; + int ret = 0, i, vlen; + + t = btf__type_by_id(d->btf, type_id); + if (!t) + return -EINVAL; + + vlen = BTF_INFO_VLEN(t->info); + vsi = (struct btf_var_secinfo *)(t + 1); + + jsonw_start_object(d->jw); + jsonw_name(d->jw, btf__name_by_offset(d->btf, t->name_off)); + jsonw_start_array(d->jw); + for (i = 0; i < vlen; i++) { + ret = btf_dumper_do_type(d, vsi[i].type, 0, data + vsi[i].offset); + if (ret) + break; + } + jsonw_end_array(d->jw); + jsonw_end_object(d->jw); + + return ret; +} + static int btf_dumper_do_type(const struct btf_dumper *d, __u32 type_id, __u8 bit_offset, const void *data) { @@ -341,6 +383,10 @@ static int btf_dumper_do_type(const struct btf_dumper *d, __u32 type_id, case BTF_KIND_CONST: case BTF_KIND_RESTRICT: return btf_dumper_modifier(d, type_id, bit_offset, data); + case BTF_KIND_VAR: + return btf_dumper_var(d, type_id, bit_offset, data); + case BTF_KIND_DATASEC: + return btf_dumper_datasec(d, type_id, data); default: jsonw_printf(d->jw, "(unsupported-kind"); return -EINVAL; @@ -377,6 +423,7 @@ static int __btf_dumper_type_only(const struct btf *btf, __u32 type_id, { const struct btf_type *proto_type; const struct btf_array *array; + const struct btf_var *var; const struct btf_type *t; if (!type_id) { @@ -440,6 +487,18 @@ static int __btf_dumper_type_only(const struct btf *btf, __u32 type_id, if (pos == -1) return -1; break; + case BTF_KIND_VAR: + var = (struct btf_var *)(t + 1); + if (var->linkage == BTF_VAR_STATIC) + BTF_PRINT_ARG("static "); + BTF_PRINT_TYPE(t->type); + BTF_PRINT_ARG(" %s", + btf__name_by_offset(btf, t->name_off)); + break; + case BTF_KIND_DATASEC: + BTF_PRINT_ARG("section (\"%s\") ", + btf__name_by_offset(btf, t->name_off)); + break; case BTF_KIND_UNKN: default: return -1; diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c index e0c650d91784..e96903078991 100644 --- a/tools/bpf/bpftool/map.c +++ b/tools/bpf/bpftool/map.c @@ -153,11 +153,13 @@ static int do_dump_btf(const struct btf_dumper *d, /* start of key-value pair */ jsonw_start_object(d->jw); - jsonw_name(d->jw, "key"); + if (map_info->btf_key_type_id) { + jsonw_name(d->jw, "key"); - ret = btf_dumper_type(d, map_info->btf_key_type_id, key); - if (ret) - goto err_end_obj; + ret = btf_dumper_type(d, map_info->btf_key_type_id, key); + if (ret) + goto err_end_obj; + } if (!map_is_per_cpu(map_info->type)) { jsonw_name(d->jw, "value"); diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index d2be5a06c339..81067803189e 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -249,6 +249,9 @@ static void print_prog_json(struct bpf_prog_info *info, int fd) if (info->nr_map_ids) show_prog_maps(fd, info->nr_map_ids); + if (info->btf_id) + jsonw_int_field(json_wtr, "btf_id", info->btf_id); + if (!hash_empty(prog_table.table)) { struct pinned_obj *obj; @@ -319,6 +322,9 @@ static void print_prog_plain(struct bpf_prog_info *info, int fd) } } + if (info->btf_id) + printf("\n\tbtf_id %d\n", info->btf_id); + printf("\n"); } diff --git a/tools/bpf/bpftool/xlated_dumper.c b/tools/bpf/bpftool/xlated_dumper.c index 7073dbe1ff27..0bb17bf88b18 100644 --- a/tools/bpf/bpftool/xlated_dumper.c +++ b/tools/bpf/bpftool/xlated_dumper.c @@ -195,6 +195,9 @@ static const char *print_imm(void *private_data, if (insn->src_reg == BPF_PSEUDO_MAP_FD) snprintf(dd->scratch_buff, sizeof(dd->scratch_buff), "map[id:%u]", insn->imm); + else if (insn->src_reg == BPF_PSEUDO_MAP_VALUE) + snprintf(dd->scratch_buff, sizeof(dd->scratch_buff), + "map[id:%u][0]+%u", insn->imm, (insn + 1)->imm); else snprintf(dd->scratch_buff, sizeof(dd->scratch_buff), "0x%llx", (unsigned long long)full_imm); diff --git a/tools/include/linux/filter.h b/tools/include/linux/filter.h index cce0b02c0e28..ca28b6ab8db7 100644 --- a/tools/include/linux/filter.h +++ b/tools/include/linux/filter.h @@ -278,10 +278,29 @@ .off = 0, \ .imm = ((__u64) (IMM)) >> 32 }) +#define BPF_LD_IMM64_RAW_FULL(DST, SRC, OFF1, OFF2, IMM1, IMM2) \ + ((struct bpf_insn) { \ + .code = BPF_LD | BPF_DW | BPF_IMM, \ + .dst_reg = DST, \ + .src_reg = SRC, \ + .off = OFF1, \ + .imm = IMM1 }), \ + ((struct bpf_insn) { \ + .code = 0, /* zero is reserved opcode */ \ + .dst_reg = 0, \ + .src_reg = 0, \ + .off = OFF2, \ + .imm = IMM2 }) + /* pseudo BPF_LD_IMM64 insn used to refer to process-local map_fd */ #define BPF_LD_MAP_FD(DST, MAP_FD) \ - BPF_LD_IMM64_RAW(DST, BPF_PSEUDO_MAP_FD, MAP_FD) + BPF_LD_IMM64_RAW_FULL(DST, BPF_PSEUDO_MAP_FD, 0, 0, \ + MAP_FD, 0) + +#define BPF_LD_MAP_VALUE(DST, MAP_FD, VALUE_OFF) \ + BPF_LD_IMM64_RAW_FULL(DST, BPF_PSEUDO_MAP_VALUE, 0, 0, \ + MAP_FD, VALUE_OFF) /* Relative call */ diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 837024512baf..2e96d0b4bf65 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -105,6 +105,7 @@ enum bpf_cmd { BPF_BTF_GET_FD_BY_ID, BPF_TASK_FD_QUERY, BPF_MAP_LOOKUP_AND_DELETE_ELEM, + BPF_MAP_FREEZE, }; enum bpf_map_type { @@ -255,8 +256,19 @@ enum bpf_attach_type { */ #define BPF_F_ANY_ALIGNMENT (1U << 1) -/* when bpf_ldimm64->src_reg == BPF_PSEUDO_MAP_FD, bpf_ldimm64->imm == fd */ +/* When BPF ldimm64's insn[0].src_reg != 0 then this can have + * two extensions: + * + * insn[0].src_reg: BPF_PSEUDO_MAP_FD BPF_PSEUDO_MAP_VALUE + * insn[0].imm: map fd map fd + * insn[1].imm: 0 offset into value + * insn[0].off: 0 0 + * insn[1].off: 0 0 + * ldimm64 rewrite: address of map address of map[0]+offset + * verifier type: CONST_PTR_TO_MAP PTR_TO_MAP_VALUE + */ #define BPF_PSEUDO_MAP_FD 1 +#define BPF_PSEUDO_MAP_VALUE 2 /* when bpf_call->src_reg == BPF_PSEUDO_CALL, bpf_call->imm == pc-relative * offset to another bpf function @@ -283,7 +295,7 @@ enum bpf_attach_type { #define BPF_OBJ_NAME_LEN 16U -/* Flags for accessing BPF object */ +/* Flags for accessing BPF object from syscall side. */ #define BPF_F_RDONLY (1U << 3) #define BPF_F_WRONLY (1U << 4) @@ -293,6 +305,10 @@ enum bpf_attach_type { /* Zero-initialize hash function seed. This should only be used for testing. */ #define BPF_F_ZERO_SEED (1U << 6) +/* Flags for accessing BPF object from program side. */ +#define BPF_F_RDONLY_PROG (1U << 7) +#define BPF_F_WRONLY_PROG (1U << 8) + /* flags for BPF_PROG_QUERY */ #define BPF_F_QUERY_EFFECTIVE (1U << 0) @@ -396,6 +412,13 @@ union bpf_attr { __aligned_u64 data_out; __u32 repeat; __u32 duration; + __u32 ctx_size_in; /* input: len of ctx_in */ + __u32 ctx_size_out; /* input/output: len of ctx_out + * returns ENOSPC if ctx_out + * is too small. + */ + __aligned_u64 ctx_in; + __aligned_u64 ctx_out; } test; struct { /* anonymous struct used by BPF_*_GET_*_ID */ @@ -1500,6 +1523,10 @@ union bpf_attr { * * **BPF_F_ADJ_ROOM_ENCAP_L4_UDP **: * Use with ENCAP_L3 flags to further specify the tunnel type. * + * * **BPF_F_ADJ_ROOM_ENCAP_L2(len) **: + * Use with ENCAP_L3/L4 flags to further specify the tunnel + * type; **len** is the length of the inner MAC header. + * * A call to this helper is susceptible to change the underlaying * packet buffer. Therefore, at load time, all checks on pointers * previously done by the verifier are invalidated and must be @@ -2641,10 +2668,16 @@ enum bpf_func_id { /* BPF_FUNC_skb_adjust_room flags. */ #define BPF_F_ADJ_ROOM_FIXED_GSO (1ULL << 0) +#define BPF_ADJ_ROOM_ENCAP_L2_MASK 0xff +#define BPF_ADJ_ROOM_ENCAP_L2_SHIFT 56 + #define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 (1ULL << 1) #define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 (1ULL << 2) #define BPF_F_ADJ_ROOM_ENCAP_L4_GRE (1ULL << 3) #define BPF_F_ADJ_ROOM_ENCAP_L4_UDP (1ULL << 4) +#define BPF_F_ADJ_ROOM_ENCAP_L2(len) (((__u64)len & \ + BPF_ADJ_ROOM_ENCAP_L2_MASK) \ + << BPF_ADJ_ROOM_ENCAP_L2_SHIFT) /* Mode for BPF_FUNC_skb_adjust_room helper. */ enum bpf_adj_room_mode { diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h index 7b7475ef2f17..9310652ca4f9 100644 --- a/tools/include/uapi/linux/btf.h +++ b/tools/include/uapi/linux/btf.h @@ -39,11 +39,11 @@ struct btf_type { * struct, union and fwd */ __u32 info; - /* "size" is used by INT, ENUM, STRUCT and UNION. + /* "size" is used by INT, ENUM, STRUCT, UNION and DATASEC. * "size" tells the size of the type it is describing. * * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT, - * FUNC and FUNC_PROTO. + * FUNC, FUNC_PROTO and VAR. * "type" is a type_id referring to another type. */ union { @@ -70,8 +70,10 @@ struct btf_type { #define BTF_KIND_RESTRICT 11 /* Restrict */ #define BTF_KIND_FUNC 12 /* Function */ #define BTF_KIND_FUNC_PROTO 13 /* Function Proto */ -#define BTF_KIND_MAX 13 -#define NR_BTF_KINDS 14 +#define BTF_KIND_VAR 14 /* Variable */ +#define BTF_KIND_DATASEC 15 /* Section */ +#define BTF_KIND_MAX BTF_KIND_DATASEC +#define NR_BTF_KINDS (BTF_KIND_MAX + 1) /* For some specific BTF_KIND, "struct btf_type" is immediately * followed by extra data. @@ -138,4 +140,26 @@ struct btf_param { __u32 type; }; +enum { + BTF_VAR_STATIC = 0, + BTF_VAR_GLOBAL_ALLOCATED, +}; + +/* BTF_KIND_VAR is followed by a single "struct btf_var" to describe + * additional information related to the variable such as its linkage. + */ +struct btf_var { + __u32 linkage; +}; + +/* BTF_KIND_DATASEC is followed by multiple "struct btf_var_secinfo" + * to describe all BTF_KIND_VAR types it contains along with it's + * in-section offset as well as size. + */ +struct btf_var_secinfo { + __u32 type; + __u32 offset; + __u32 size; +}; + #endif /* _UAPI__LINUX_BTF_H__ */ diff --git a/tools/lib/bpf/.gitignore b/tools/lib/bpf/.gitignore index 4db74758c674..7d9e182a1f51 100644 --- a/tools/lib/bpf/.gitignore +++ b/tools/lib/bpf/.gitignore @@ -1,3 +1,4 @@ libbpf_version.h +libbpf.pc FEATURE-DUMP.libbpf test_libbpf diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile index 8e7c56e9590f..c6c06bc6683c 100644 --- a/tools/lib/bpf/Makefile +++ b/tools/lib/bpf/Makefile @@ -3,7 +3,7 @@ BPF_VERSION = 0 BPF_PATCHLEVEL = 0 -BPF_EXTRAVERSION = 2 +BPF_EXTRAVERSION = 3 MAKEFLAGS += --no-print-directory @@ -90,6 +90,7 @@ LIBBPF_VERSION = $(BPF_VERSION).$(BPF_PATCHLEVEL).$(BPF_EXTRAVERSION) LIB_TARGET = libbpf.a libbpf.so.$(LIBBPF_VERSION) LIB_FILE = libbpf.a libbpf.so* +PC_FILE = libbpf.pc # Set compile option CFLAGS ifdef EXTRA_CFLAGS @@ -134,13 +135,14 @@ VERSION_SCRIPT := libbpf.map LIB_TARGET := $(addprefix $(OUTPUT),$(LIB_TARGET)) LIB_FILE := $(addprefix $(OUTPUT),$(LIB_FILE)) +PC_FILE := $(addprefix $(OUTPUT),$(PC_FILE)) GLOBAL_SYM_COUNT = $(shell readelf -s --wide $(BPF_IN) | \ awk '/GLOBAL/ && /DEFAULT/ && !/UND/ {s++} END{print s}') VERSIONED_SYM_COUNT = $(shell readelf -s --wide $(OUTPUT)libbpf.so | \ grep -Eo '[^ ]+@LIBBPF_' | cut -d@ -f1 | sort -u | wc -l) -CMD_TARGETS = $(LIB_TARGET) +CMD_TARGETS = $(LIB_TARGET) $(PC_FILE) CXX_TEST_TARGET = $(OUTPUT)test_libbpf @@ -187,6 +189,12 @@ $(OUTPUT)libbpf.a: $(BPF_IN) $(OUTPUT)test_libbpf: test_libbpf.cpp $(OUTPUT)libbpf.a $(QUIET_LINK)$(CXX) $(INCLUDES) $^ -lelf -o $@ +$(OUTPUT)libbpf.pc: + $(QUIET_GEN)sed -e "s|@PREFIX@|$(prefix)|" \ + -e "s|@LIBDIR@|$(libdir_SQ)|" \ + -e "s|@VERSION@|$(LIBBPF_VERSION)|" \ + < libbpf.pc.template > $@ + check: check_abi check_abi: $(OUTPUT)libbpf.so @@ -224,7 +232,11 @@ install_headers: $(call do_install,btf.h,$(prefix)/include/bpf,644); \ $(call do_install,xsk.h,$(prefix)/include/bpf,644); -install: install_lib +install_pkgconfig: $(PC_FILE) + $(call QUIET_INSTALL, $(PC_FILE)) \ + $(call do_install,$(PC_FILE),$(libdir_SQ)/pkgconfig,644) + +install: install_lib install_pkgconfig ### Cleaning rules @@ -234,7 +246,7 @@ config-clean: clean: $(call QUIET_CLEAN, libbpf) $(RM) $(TARGETS) $(CXX_TEST_TARGET) \ - *.o *~ *.a *.so *.so.$(VERSION) .*.d .*.cmd LIBBPF-CFLAGS + *.o *~ *.a *.so *.so.$(VERSION) .*.d .*.cmd *.pc LIBBPF-CFLAGS $(call QUIET_CLEAN, core-gen) $(RM) $(OUTPUT)FEATURE-DUMP.libbpf diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index 9cd015574e83..955191c64b64 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -79,7 +79,6 @@ static inline int sys_bpf_prog_load(union bpf_attr *attr, unsigned int size) int bpf_create_map_xattr(const struct bpf_create_map_attr *create_attr) { - __u32 name_len = create_attr->name ? strlen(create_attr->name) : 0; union bpf_attr attr; memset(&attr, '\0', sizeof(attr)); @@ -89,8 +88,9 @@ int bpf_create_map_xattr(const struct bpf_create_map_attr *create_attr) attr.value_size = create_attr->value_size; attr.max_entries = create_attr->max_entries; attr.map_flags = create_attr->map_flags; - memcpy(attr.map_name, create_attr->name, - min(name_len, BPF_OBJ_NAME_LEN - 1)); + if (create_attr->name) + memcpy(attr.map_name, create_attr->name, + min(strlen(create_attr->name), BPF_OBJ_NAME_LEN - 1)); attr.numa_node = create_attr->numa_node; attr.btf_fd = create_attr->btf_fd; attr.btf_key_type_id = create_attr->btf_key_type_id; @@ -155,7 +155,6 @@ int bpf_create_map_in_map_node(enum bpf_map_type map_type, const char *name, int key_size, int inner_map_fd, int max_entries, __u32 map_flags, int node) { - __u32 name_len = name ? strlen(name) : 0; union bpf_attr attr; memset(&attr, '\0', sizeof(attr)); @@ -166,7 +165,9 @@ int bpf_create_map_in_map_node(enum bpf_map_type map_type, const char *name, attr.inner_map_fd = inner_map_fd; attr.max_entries = max_entries; attr.map_flags = map_flags; - memcpy(attr.map_name, name, min(name_len, BPF_OBJ_NAME_LEN - 1)); + if (name) + memcpy(attr.map_name, name, + min(strlen(name), BPF_OBJ_NAME_LEN - 1)); if (node >= 0) { attr.map_flags |= BPF_F_NUMA_NODE; @@ -216,18 +217,15 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr, void *finfo = NULL, *linfo = NULL; union bpf_attr attr; __u32 log_level; - __u32 name_len; int fd; if (!load_attr || !log_buf != !log_buf_sz) return -EINVAL; log_level = load_attr->log_level; - if (log_level > 2 || (log_level && !log_buf)) + if (log_level > (4 | 2 | 1) || (log_level && !log_buf)) return -EINVAL; - name_len = load_attr->name ? strlen(load_attr->name) : 0; - memset(&attr, 0, sizeof(attr)); attr.prog_type = load_attr->prog_type; attr.expected_attach_type = load_attr->expected_attach_type; @@ -253,8 +251,9 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr, attr.line_info_rec_size = load_attr->line_info_rec_size; attr.line_info_cnt = load_attr->line_info_cnt; attr.line_info = ptr_to_u64(load_attr->line_info); - memcpy(attr.prog_name, load_attr->name, - min(name_len, BPF_OBJ_NAME_LEN - 1)); + if (load_attr->name) + memcpy(attr.prog_name, load_attr->name, + min(strlen(load_attr->name), BPF_OBJ_NAME_LEN - 1)); fd = sys_bpf_prog_load(&attr, sizeof(attr)); if (fd >= 0) @@ -429,6 +428,16 @@ int bpf_map_get_next_key(int fd, const void *key, void *next_key) return sys_bpf(BPF_MAP_GET_NEXT_KEY, &attr, sizeof(attr)); } +int bpf_map_freeze(int fd) +{ + union bpf_attr attr; + + memset(&attr, 0, sizeof(attr)); + attr.map_fd = fd; + + return sys_bpf(BPF_MAP_FREEZE, &attr, sizeof(attr)); +} + int bpf_obj_pin(int fd, const char *pathname) { union bpf_attr attr; @@ -545,10 +554,15 @@ int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr *test_attr) attr.test.data_out = ptr_to_u64(test_attr->data_out); attr.test.data_size_in = test_attr->data_size_in; attr.test.data_size_out = test_attr->data_size_out; + attr.test.ctx_in = ptr_to_u64(test_attr->ctx_in); + attr.test.ctx_out = ptr_to_u64(test_attr->ctx_out); + attr.test.ctx_size_in = test_attr->ctx_size_in; + attr.test.ctx_size_out = test_attr->ctx_size_out; attr.test.repeat = test_attr->repeat; ret = sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr)); test_attr->data_size_out = attr.test.data_size_out; + test_attr->ctx_size_out = attr.test.ctx_size_out; test_attr->retval = attr.test.retval; test_attr->duration = attr.test.duration; return ret; diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h index 6ffdd79bea89..bc30783d1403 100644 --- a/tools/lib/bpf/bpf.h +++ b/tools/lib/bpf/bpf.h @@ -92,7 +92,7 @@ struct bpf_load_program_attr { #define MAPS_RELAX_COMPAT 0x01 /* Recommend log buffer size */ -#define BPF_LOG_BUF_SIZE (256 * 1024) +#define BPF_LOG_BUF_SIZE (16 * 1024 * 1024) /* verifier maximum in kernels <= 5.1 */ LIBBPF_API int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr, char *log_buf, size_t log_buf_sz); @@ -117,6 +117,7 @@ LIBBPF_API int bpf_map_lookup_and_delete_elem(int fd, const void *key, void *value); LIBBPF_API int bpf_map_delete_elem(int fd, const void *key); LIBBPF_API int bpf_map_get_next_key(int fd, const void *key, void *next_key); +LIBBPF_API int bpf_map_freeze(int fd); LIBBPF_API int bpf_obj_pin(int fd, const char *pathname); LIBBPF_API int bpf_obj_get(const char *pathname); LIBBPF_API int bpf_prog_attach(int prog_fd, int attachable_fd, @@ -135,6 +136,11 @@ struct bpf_prog_test_run_attr { * out: length of data_out */ __u32 retval; /* out: return code of the BPF program */ __u32 duration; /* out: average per repetition in ns */ + const void *ctx_in; /* optional */ + __u32 ctx_size_in; + void *ctx_out; /* optional */ + __u32 ctx_size_out; /* in: max length of ctx_out + * out: length of cxt_out */ }; LIBBPF_API int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr *test_attr); diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index cf119c9b6f27..701e7e28ada3 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -24,6 +24,8 @@ ((k) == BTF_KIND_CONST) || \ ((k) == BTF_KIND_RESTRICT)) +#define IS_VAR(k) ((k) == BTF_KIND_VAR) + static struct btf_type btf_void; struct btf { @@ -212,6 +214,10 @@ static int btf_type_size(struct btf_type *t) return base_size + vlen * sizeof(struct btf_member); case BTF_KIND_FUNC_PROTO: return base_size + vlen * sizeof(struct btf_param); + case BTF_KIND_VAR: + return base_size + sizeof(struct btf_var); + case BTF_KIND_DATASEC: + return base_size + vlen * sizeof(struct btf_var_secinfo); default: pr_debug("Unsupported BTF_KIND:%u\n", BTF_INFO_KIND(t->info)); return -EINVAL; @@ -283,6 +289,7 @@ __s64 btf__resolve_size(const struct btf *btf, __u32 type_id) case BTF_KIND_STRUCT: case BTF_KIND_UNION: case BTF_KIND_ENUM: + case BTF_KIND_DATASEC: size = t->size; goto done; case BTF_KIND_PTR: @@ -292,6 +299,7 @@ __s64 btf__resolve_size(const struct btf *btf, __u32 type_id) case BTF_KIND_VOLATILE: case BTF_KIND_CONST: case BTF_KIND_RESTRICT: + case BTF_KIND_VAR: type_id = t->type; break; case BTF_KIND_ARRAY: @@ -326,7 +334,8 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id) t = btf__type_by_id(btf, type_id); while (depth < MAX_RESOLVE_DEPTH && !btf_type_is_void_or_null(t) && - IS_MODIFIER(BTF_INFO_KIND(t->info))) { + (IS_MODIFIER(BTF_INFO_KIND(t->info)) || + IS_VAR(BTF_INFO_KIND(t->info)))) { type_id = t->type; t = btf__type_by_id(btf, type_id); depth++; @@ -408,6 +417,92 @@ done: return btf; } +static int compare_vsi_off(const void *_a, const void *_b) +{ + const struct btf_var_secinfo *a = _a; + const struct btf_var_secinfo *b = _b; + + return a->offset - b->offset; +} + +static int btf_fixup_datasec(struct bpf_object *obj, struct btf *btf, + struct btf_type *t) +{ + __u32 size = 0, off = 0, i, vars = BTF_INFO_VLEN(t->info); + const char *name = btf__name_by_offset(btf, t->name_off); + const struct btf_type *t_var; + struct btf_var_secinfo *vsi; + struct btf_var *var; + int ret; + + if (!name) { + pr_debug("No name found in string section for DATASEC kind.\n"); + return -ENOENT; + } + + ret = bpf_object__section_size(obj, name, &size); + if (ret || !size || (t->size && t->size != size)) { + pr_debug("Invalid size for section %s: %u bytes\n", name, size); + return -ENOENT; + } + + t->size = size; + + for (i = 0, vsi = (struct btf_var_secinfo *)(t + 1); + i < vars; i++, vsi++) { + t_var = btf__type_by_id(btf, vsi->type); + var = (struct btf_var *)(t_var + 1); + + if (BTF_INFO_KIND(t_var->info) != BTF_KIND_VAR) { + pr_debug("Non-VAR type seen in section %s\n", name); + return -EINVAL; + } + + if (var->linkage == BTF_VAR_STATIC) + continue; + + name = btf__name_by_offset(btf, t_var->name_off); + if (!name) { + pr_debug("No name found in string section for VAR kind\n"); + return -ENOENT; + } + + ret = bpf_object__variable_offset(obj, name, &off); + if (ret) { + pr_debug("No offset found in symbol table for VAR %s\n", name); + return -ENOENT; + } + + vsi->offset = off; + } + + qsort(t + 1, vars, sizeof(*vsi), compare_vsi_off); + return 0; +} + +int btf__finalize_data(struct bpf_object *obj, struct btf *btf) +{ + int err = 0; + __u32 i; + + for (i = 1; i <= btf->nr_types; i++) { + struct btf_type *t = btf->types[i]; + + /* Loader needs to fix up some of the things compiler + * couldn't get its hands on while emitting BTF. This + * is section size and global variable offset. We use + * the info from the ELF itself for this purpose. + */ + if (BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC) { + err = btf_fixup_datasec(obj, btf, t); + if (err) + break; + } + } + + return err; +} + int btf__load(struct btf *btf) { __u32 log_buf_size = BPF_LOG_BUF_SIZE; diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index 28a1e1e59861..c7b399e81fce 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -21,6 +21,8 @@ struct btf; struct btf_ext; struct btf_type; +struct bpf_object; + /* * The .BTF.ext ELF section layout defined as * struct btf_ext_header @@ -57,6 +59,7 @@ struct btf_ext_header { LIBBPF_API void btf__free(struct btf *btf); LIBBPF_API struct btf *btf__new(__u8 *data, __u32 size); +LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf); LIBBPF_API int btf__load(struct btf *btf); LIBBPF_API __s32 btf__find_by_name(const struct btf *btf, const char *type_name); diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 11c25d9ea431..67484cf32b2e 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -7,6 +7,7 @@ * Copyright (C) 2015 Wang Nan <wangnan0@huawei.com> * Copyright (C) 2015 Huawei Inc. * Copyright (C) 2017 Nicira, Inc. + * Copyright (C) 2019 Isovalent, Inc. */ #ifndef _GNU_SOURCE @@ -52,6 +53,11 @@ #define BPF_FS_MAGIC 0xcafe4a11 #endif +/* vsprintf() in __base_pr() uses nonliteral format string. It may break + * compilation if user enables corresponding warning. Disable it explicitly. + */ +#pragma GCC diagnostic ignored "-Wformat-nonliteral" + #define __printf(a, b) __attribute__((format(printf, a, b))) static int __base_pr(enum libbpf_print_level level, const char *format, @@ -144,6 +150,7 @@ struct bpf_program { enum { RELO_LD64, RELO_CALL, + RELO_DATA, } type; int insn_idx; union { @@ -152,6 +159,7 @@ struct bpf_program { }; } *reloc_desc; int nr_reloc; + int log_level; struct { int nr; @@ -176,6 +184,19 @@ struct bpf_program { __u32 line_info_cnt; }; +enum libbpf_map_type { + LIBBPF_MAP_UNSPEC, + LIBBPF_MAP_DATA, + LIBBPF_MAP_BSS, + LIBBPF_MAP_RODATA, +}; + +static const char * const libbpf_type_to_btf_name[] = { + [LIBBPF_MAP_DATA] = ".data", + [LIBBPF_MAP_BSS] = ".bss", + [LIBBPF_MAP_RODATA] = ".rodata", +}; + struct bpf_map { int fd; char *name; @@ -187,11 +208,18 @@ struct bpf_map { __u32 btf_value_type_id; void *priv; bpf_map_clear_priv_t clear_priv; + enum libbpf_map_type libbpf_type; +}; + +struct bpf_secdata { + void *rodata; + void *data; }; static LIST_HEAD(bpf_objects_list); struct bpf_object { + char name[BPF_OBJ_NAME_LEN]; char license[64]; __u32 kern_version; @@ -199,6 +227,7 @@ struct bpf_object { size_t nr_programs; struct bpf_map *maps; size_t nr_maps; + struct bpf_secdata sections; bool loaded; bool has_pseudo_calls; @@ -214,6 +243,9 @@ struct bpf_object { Elf *elf; GElf_Ehdr ehdr; Elf_Data *symbols; + Elf_Data *data; + Elf_Data *rodata; + Elf_Data *bss; size_t strtabidx; struct { GElf_Shdr shdr; @@ -222,6 +254,9 @@ struct bpf_object { int nr_reloc; int maps_shndx; int text_shndx; + int data_shndx; + int rodata_shndx; + int bss_shndx; } efile; /* * All loaded bpf_object is linked in a list, which is @@ -443,6 +478,7 @@ static struct bpf_object *bpf_object__new(const char *path, size_t obj_buf_sz) { struct bpf_object *obj; + char *end; obj = calloc(1, sizeof(struct bpf_object) + strlen(path) + 1); if (!obj) { @@ -451,8 +487,14 @@ static struct bpf_object *bpf_object__new(const char *path, } strcpy(obj->path, path); - obj->efile.fd = -1; + /* Using basename() GNU version which doesn't modify arg. */ + strncpy(obj->name, basename((void *)path), + sizeof(obj->name) - 1); + end = strchr(obj->name, '.'); + if (end) + *end = 0; + obj->efile.fd = -1; /* * Caller of this function should also calls * bpf_object__elf_finish() after data collection to return @@ -462,6 +504,9 @@ static struct bpf_object *bpf_object__new(const char *path, obj->efile.obj_buf = obj_buf; obj->efile.obj_buf_sz = obj_buf_sz; obj->efile.maps_shndx = -1; + obj->efile.data_shndx = -1; + obj->efile.rodata_shndx = -1; + obj->efile.bss_shndx = -1; obj->loaded = false; @@ -480,6 +525,9 @@ static void bpf_object__elf_finish(struct bpf_object *obj) obj->efile.elf = NULL; } obj->efile.symbols = NULL; + obj->efile.data = NULL; + obj->efile.rodata = NULL; + obj->efile.bss = NULL; zfree(&obj->efile.reloc); obj->efile.nr_reloc = 0; @@ -621,27 +669,182 @@ static bool bpf_map_type__is_map_in_map(enum bpf_map_type type) return false; } +static int bpf_object_search_section_size(const struct bpf_object *obj, + const char *name, size_t *d_size) +{ + const GElf_Ehdr *ep = &obj->efile.ehdr; + Elf *elf = obj->efile.elf; + Elf_Scn *scn = NULL; + int idx = 0; + + while ((scn = elf_nextscn(elf, scn)) != NULL) { + const char *sec_name; + Elf_Data *data; + GElf_Shdr sh; + + idx++; + if (gelf_getshdr(scn, &sh) != &sh) { + pr_warning("failed to get section(%d) header from %s\n", + idx, obj->path); + return -EIO; + } + + sec_name = elf_strptr(elf, ep->e_shstrndx, sh.sh_name); + if (!sec_name) { + pr_warning("failed to get section(%d) name from %s\n", + idx, obj->path); + return -EIO; + } + + if (strcmp(name, sec_name)) + continue; + + data = elf_getdata(scn, 0); + if (!data) { + pr_warning("failed to get section(%d) data from %s(%s)\n", + idx, name, obj->path); + return -EIO; + } + + *d_size = data->d_size; + return 0; + } + + return -ENOENT; +} + +int bpf_object__section_size(const struct bpf_object *obj, const char *name, + __u32 *size) +{ + int ret = -ENOENT; + size_t d_size; + + *size = 0; + if (!name) { + return -EINVAL; + } else if (!strcmp(name, ".data")) { + if (obj->efile.data) + *size = obj->efile.data->d_size; + } else if (!strcmp(name, ".bss")) { + if (obj->efile.bss) + *size = obj->efile.bss->d_size; + } else if (!strcmp(name, ".rodata")) { + if (obj->efile.rodata) + *size = obj->efile.rodata->d_size; + } else { + ret = bpf_object_search_section_size(obj, name, &d_size); + if (!ret) + *size = d_size; + } + + return *size ? 0 : ret; +} + +int bpf_object__variable_offset(const struct bpf_object *obj, const char *name, + __u32 *off) +{ + Elf_Data *symbols = obj->efile.symbols; + const char *sname; + size_t si; + + if (!name || !off) + return -EINVAL; + + for (si = 0; si < symbols->d_size / sizeof(GElf_Sym); si++) { + GElf_Sym sym; + + if (!gelf_getsym(symbols, si, &sym)) + continue; + if (GELF_ST_BIND(sym.st_info) != STB_GLOBAL || + GELF_ST_TYPE(sym.st_info) != STT_OBJECT) + continue; + + sname = elf_strptr(obj->efile.elf, obj->efile.strtabidx, + sym.st_name); + if (!sname) { + pr_warning("failed to get sym name string for var %s\n", + name); + return -EIO; + } + if (strcmp(name, sname) == 0) { + *off = sym.st_value; + return 0; + } + } + + return -ENOENT; +} + +static bool bpf_object__has_maps(const struct bpf_object *obj) +{ + return obj->efile.maps_shndx >= 0 || + obj->efile.data_shndx >= 0 || + obj->efile.rodata_shndx >= 0 || + obj->efile.bss_shndx >= 0; +} + +static int +bpf_object__init_internal_map(struct bpf_object *obj, struct bpf_map *map, + enum libbpf_map_type type, Elf_Data *data, + void **data_buff) +{ + struct bpf_map_def *def = &map->def; + char map_name[BPF_OBJ_NAME_LEN]; + + map->libbpf_type = type; + map->offset = ~(typeof(map->offset))0; + snprintf(map_name, sizeof(map_name), "%.8s%.7s", obj->name, + libbpf_type_to_btf_name[type]); + map->name = strdup(map_name); + if (!map->name) { + pr_warning("failed to alloc map name\n"); + return -ENOMEM; + } + + def->type = BPF_MAP_TYPE_ARRAY; + def->key_size = sizeof(int); + def->value_size = data->d_size; + def->max_entries = 1; + def->map_flags = type == LIBBPF_MAP_RODATA ? + BPF_F_RDONLY_PROG : 0; + if (data_buff) { + *data_buff = malloc(data->d_size); + if (!*data_buff) { + zfree(&map->name); + pr_warning("failed to alloc map content buffer\n"); + return -ENOMEM; + } + memcpy(*data_buff, data->d_buf, data->d_size); + } + + pr_debug("map %ld is \"%s\"\n", map - obj->maps, map->name); + return 0; +} + static int bpf_object__init_maps(struct bpf_object *obj, int flags) { + int i, map_idx, map_def_sz = 0, nr_syms, nr_maps = 0, nr_maps_glob = 0; bool strict = !(flags & MAPS_RELAX_COMPAT); - int i, map_idx, map_def_sz, nr_maps = 0; - Elf_Scn *scn; - Elf_Data *data = NULL; Elf_Data *symbols = obj->efile.symbols; + Elf_Data *data = NULL; + int ret = 0; - if (obj->efile.maps_shndx < 0) - return -EINVAL; if (!symbols) return -EINVAL; + nr_syms = symbols->d_size / sizeof(GElf_Sym); - scn = elf_getscn(obj->efile.elf, obj->efile.maps_shndx); - if (scn) - data = elf_getdata(scn, NULL); - if (!scn || !data) { - pr_warning("failed to get Elf_Data from map section %d\n", - obj->efile.maps_shndx); - return -EINVAL; + if (obj->efile.maps_shndx >= 0) { + Elf_Scn *scn = elf_getscn(obj->efile.elf, + obj->efile.maps_shndx); + + if (scn) + data = elf_getdata(scn, NULL); + if (!scn || !data) { + pr_warning("failed to get Elf_Data from map section %d\n", + obj->efile.maps_shndx); + return -EINVAL; + } } /* @@ -651,7 +854,13 @@ bpf_object__init_maps(struct bpf_object *obj, int flags) * * TODO: Detect array of map and report error. */ - for (i = 0; i < symbols->d_size / sizeof(GElf_Sym); i++) { + if (obj->efile.data_shndx >= 0) + nr_maps_glob++; + if (obj->efile.rodata_shndx >= 0) + nr_maps_glob++; + if (obj->efile.bss_shndx >= 0) + nr_maps_glob++; + for (i = 0; data && i < nr_syms; i++) { GElf_Sym sym; if (!gelf_getsym(symbols, i, &sym)) @@ -664,19 +873,21 @@ bpf_object__init_maps(struct bpf_object *obj, int flags) /* Alloc obj->maps and fill nr_maps. */ pr_debug("maps in %s: %d maps in %zd bytes\n", obj->path, nr_maps, data->d_size); - - if (!nr_maps) + if (!nr_maps && !nr_maps_glob) return 0; /* Assume equally sized map definitions */ - map_def_sz = data->d_size / nr_maps; - if (!data->d_size || (data->d_size % nr_maps) != 0) { - pr_warning("unable to determine map definition size " - "section %s, %d maps in %zd bytes\n", - obj->path, nr_maps, data->d_size); - return -EINVAL; + if (data) { + map_def_sz = data->d_size / nr_maps; + if (!data->d_size || (data->d_size % nr_maps) != 0) { + pr_warning("unable to determine map definition size " + "section %s, %d maps in %zd bytes\n", + obj->path, nr_maps, data->d_size); + return -EINVAL; + } } + nr_maps += nr_maps_glob; obj->maps = calloc(nr_maps, sizeof(obj->maps[0])); if (!obj->maps) { pr_warning("alloc maps for object failed\n"); @@ -697,7 +908,7 @@ bpf_object__init_maps(struct bpf_object *obj, int flags) /* * Fill obj->maps using data in "maps" section. */ - for (i = 0, map_idx = 0; i < symbols->d_size / sizeof(GElf_Sym); i++) { + for (i = 0, map_idx = 0; data && i < nr_syms; i++) { GElf_Sym sym; const char *map_name; struct bpf_map_def *def; @@ -710,6 +921,8 @@ bpf_object__init_maps(struct bpf_object *obj, int flags) map_name = elf_strptr(obj->efile.elf, obj->efile.strtabidx, sym.st_name); + + obj->maps[map_idx].libbpf_type = LIBBPF_MAP_UNSPEC; obj->maps[map_idx].offset = sym.st_value; if (sym.st_value + map_def_sz > data->d_size) { pr_warning("corrupted maps section in %s: last map \"%s\" too small\n", @@ -758,8 +971,27 @@ bpf_object__init_maps(struct bpf_object *obj, int flags) map_idx++; } - qsort(obj->maps, obj->nr_maps, sizeof(obj->maps[0]), compare_bpf_map); - return 0; + /* + * Populate rest of obj->maps with libbpf internal maps. + */ + if (obj->efile.data_shndx >= 0) + ret = bpf_object__init_internal_map(obj, &obj->maps[map_idx++], + LIBBPF_MAP_DATA, + obj->efile.data, + &obj->sections.data); + if (!ret && obj->efile.rodata_shndx >= 0) + ret = bpf_object__init_internal_map(obj, &obj->maps[map_idx++], + LIBBPF_MAP_RODATA, + obj->efile.rodata, + &obj->sections.rodata); + if (!ret && obj->efile.bss_shndx >= 0) + ret = bpf_object__init_internal_map(obj, &obj->maps[map_idx++], + LIBBPF_MAP_BSS, + obj->efile.bss, NULL); + if (!ret) + qsort(obj->maps, obj->nr_maps, sizeof(obj->maps[0]), + compare_bpf_map); + return ret; } static bool section_have_execinstr(struct bpf_object *obj, int idx) @@ -785,6 +1017,7 @@ static int bpf_object__elf_collect(struct bpf_object *obj, int flags) Elf *elf = obj->efile.elf; GElf_Ehdr *ep = &obj->efile.ehdr; Elf_Data *btf_ext_data = NULL; + Elf_Data *btf_data = NULL; Elf_Scn *scn = NULL; int idx = 0, err = 0; @@ -828,32 +1061,18 @@ static int bpf_object__elf_collect(struct bpf_object *obj, int flags) (int)sh.sh_link, (unsigned long)sh.sh_flags, (int)sh.sh_type); - if (strcmp(name, "license") == 0) + if (strcmp(name, "license") == 0) { err = bpf_object__init_license(obj, data->d_buf, data->d_size); - else if (strcmp(name, "version") == 0) + } else if (strcmp(name, "version") == 0) { err = bpf_object__init_kversion(obj, data->d_buf, data->d_size); - else if (strcmp(name, "maps") == 0) + } else if (strcmp(name, "maps") == 0) { obj->efile.maps_shndx = idx; - else if (strcmp(name, BTF_ELF_SEC) == 0) { - obj->btf = btf__new(data->d_buf, data->d_size); - if (IS_ERR(obj->btf)) { - pr_warning("Error loading ELF section %s: %ld. Ignored and continue.\n", - BTF_ELF_SEC, PTR_ERR(obj->btf)); - obj->btf = NULL; - continue; - } - err = btf__load(obj->btf); - if (err) { - pr_warning("Error loading %s into kernel: %d. Ignored and continue.\n", - BTF_ELF_SEC, err); - btf__free(obj->btf); - obj->btf = NULL; - err = 0; - } + } else if (strcmp(name, BTF_ELF_SEC) == 0) { + btf_data = data; } else if (strcmp(name, BTF_EXT_ELF_SEC) == 0) { btf_ext_data = data; } else if (sh.sh_type == SHT_SYMTAB) { @@ -865,20 +1084,28 @@ static int bpf_object__elf_collect(struct bpf_object *obj, int flags) obj->efile.symbols = data; obj->efile.strtabidx = sh.sh_link; } - } else if ((sh.sh_type == SHT_PROGBITS) && - (sh.sh_flags & SHF_EXECINSTR) && - (data->d_size > 0)) { - if (strcmp(name, ".text") == 0) - obj->efile.text_shndx = idx; - err = bpf_object__add_program(obj, data->d_buf, - data->d_size, name, idx); - if (err) { - char errmsg[STRERR_BUFSIZE]; - char *cp = libbpf_strerror_r(-err, errmsg, - sizeof(errmsg)); - - pr_warning("failed to alloc program %s (%s): %s", - name, obj->path, cp); + } else if (sh.sh_type == SHT_PROGBITS && data->d_size > 0) { + if (sh.sh_flags & SHF_EXECINSTR) { + if (strcmp(name, ".text") == 0) + obj->efile.text_shndx = idx; + err = bpf_object__add_program(obj, data->d_buf, + data->d_size, name, idx); + if (err) { + char errmsg[STRERR_BUFSIZE]; + char *cp = libbpf_strerror_r(-err, errmsg, + sizeof(errmsg)); + + pr_warning("failed to alloc program %s (%s): %s", + name, obj->path, cp); + } + } else if (strcmp(name, ".data") == 0) { + obj->efile.data = data; + obj->efile.data_shndx = idx; + } else if (strcmp(name, ".rodata") == 0) { + obj->efile.rodata = data; + obj->efile.rodata_shndx = idx; + } else { + pr_debug("skip section(%d) %s\n", idx, name); } } else if (sh.sh_type == SHT_REL) { void *reloc = obj->efile.reloc; @@ -906,6 +1133,9 @@ static int bpf_object__elf_collect(struct bpf_object *obj, int flags) obj->efile.reloc[n].shdr = sh; obj->efile.reloc[n].data = data; } + } else if (sh.sh_type == SHT_NOBITS && strcmp(name, ".bss") == 0) { + obj->efile.bss = data; + obj->efile.bss_shndx = idx; } else { pr_debug("skip section(%d) %s\n", idx, name); } @@ -917,6 +1147,25 @@ static int bpf_object__elf_collect(struct bpf_object *obj, int flags) pr_warning("Corrupted ELF file: index of strtab invalid\n"); return LIBBPF_ERRNO__FORMAT; } + if (btf_data) { + obj->btf = btf__new(btf_data->d_buf, btf_data->d_size); + if (IS_ERR(obj->btf)) { + pr_warning("Error loading ELF section %s: %ld. Ignored and continue.\n", + BTF_ELF_SEC, PTR_ERR(obj->btf)); + obj->btf = NULL; + } else { + err = btf__finalize_data(obj, obj->btf); + if (!err) + err = btf__load(obj->btf); + if (err) { + pr_warning("Error finalizing and loading %s into kernel: %d. Ignored and continue.\n", + BTF_ELF_SEC, err); + btf__free(obj->btf); + obj->btf = NULL; + err = 0; + } + } + } if (btf_ext_data) { if (!obj->btf) { pr_debug("Ignore ELF section %s because its depending ELF section %s is not found.\n", @@ -932,7 +1181,7 @@ static int bpf_object__elf_collect(struct bpf_object *obj, int flags) } } } - if (obj->efile.maps_shndx >= 0) { + if (bpf_object__has_maps(obj)) { err = bpf_object__init_maps(obj, flags); if (err) goto out; @@ -968,13 +1217,46 @@ bpf_object__find_program_by_title(struct bpf_object *obj, const char *title) return NULL; } +static bool bpf_object__shndx_is_data(const struct bpf_object *obj, + int shndx) +{ + return shndx == obj->efile.data_shndx || + shndx == obj->efile.bss_shndx || + shndx == obj->efile.rodata_shndx; +} + +static bool bpf_object__shndx_is_maps(const struct bpf_object *obj, + int shndx) +{ + return shndx == obj->efile.maps_shndx; +} + +static bool bpf_object__relo_in_known_section(const struct bpf_object *obj, + int shndx) +{ + return shndx == obj->efile.text_shndx || + bpf_object__shndx_is_maps(obj, shndx) || + bpf_object__shndx_is_data(obj, shndx); +} + +static enum libbpf_map_type +bpf_object__section_to_libbpf_map_type(const struct bpf_object *obj, int shndx) +{ + if (shndx == obj->efile.data_shndx) + return LIBBPF_MAP_DATA; + else if (shndx == obj->efile.bss_shndx) + return LIBBPF_MAP_BSS; + else if (shndx == obj->efile.rodata_shndx) + return LIBBPF_MAP_RODATA; + else + return LIBBPF_MAP_UNSPEC; +} + static int bpf_program__collect_reloc(struct bpf_program *prog, GElf_Shdr *shdr, Elf_Data *data, struct bpf_object *obj) { Elf_Data *symbols = obj->efile.symbols; - int text_shndx = obj->efile.text_shndx; - int maps_shndx = obj->efile.maps_shndx; struct bpf_map *maps = obj->maps; size_t nr_maps = obj->nr_maps; int i, nrels; @@ -994,7 +1276,10 @@ bpf_program__collect_reloc(struct bpf_program *prog, GElf_Shdr *shdr, GElf_Sym sym; GElf_Rel rel; unsigned int insn_idx; + unsigned int shdr_idx; struct bpf_insn *insns = prog->insns; + enum libbpf_map_type type; + const char *name; size_t map_idx; if (!gelf_getrel(data, i, &rel)) { @@ -1009,13 +1294,18 @@ bpf_program__collect_reloc(struct bpf_program *prog, GElf_Shdr *shdr, GELF_R_SYM(rel.r_info)); return -LIBBPF_ERRNO__FORMAT; } - pr_debug("relo for %lld value %lld name %d\n", + + name = elf_strptr(obj->efile.elf, obj->efile.strtabidx, + sym.st_name) ? : "<?>"; + + pr_debug("relo for %lld value %lld name %d (\'%s\')\n", (long long) (rel.r_info >> 32), - (long long) sym.st_value, sym.st_name); + (long long) sym.st_value, sym.st_name, name); - if (sym.st_shndx != maps_shndx && sym.st_shndx != text_shndx) { - pr_warning("Program '%s' contains non-map related relo data pointing to section %u\n", - prog->section_name, sym.st_shndx); + shdr_idx = sym.st_shndx; + if (!bpf_object__relo_in_known_section(obj, shdr_idx)) { + pr_warning("Program '%s' contains unrecognized relo data pointing to section %u\n", + prog->section_name, shdr_idx); return -LIBBPF_ERRNO__RELOC; } @@ -1040,24 +1330,39 @@ bpf_program__collect_reloc(struct bpf_program *prog, GElf_Shdr *shdr, return -LIBBPF_ERRNO__RELOC; } - /* TODO: 'maps' is sorted. We can use bsearch to make it faster. */ - for (map_idx = 0; map_idx < nr_maps; map_idx++) { - if (maps[map_idx].offset == sym.st_value) { - pr_debug("relocation: find map %zd (%s) for insn %u\n", - map_idx, maps[map_idx].name, insn_idx); - break; + if (bpf_object__shndx_is_maps(obj, shdr_idx) || + bpf_object__shndx_is_data(obj, shdr_idx)) { + type = bpf_object__section_to_libbpf_map_type(obj, shdr_idx); + if (type != LIBBPF_MAP_UNSPEC && + GELF_ST_BIND(sym.st_info) == STB_GLOBAL) { + pr_warning("bpf: relocation: not yet supported relo for non-static global \'%s\' variable found in insns[%d].code 0x%x\n", + name, insn_idx, insns[insn_idx].code); + return -LIBBPF_ERRNO__RELOC; } - } - if (map_idx >= nr_maps) { - pr_warning("bpf relocation: map_idx %d large than %d\n", - (int)map_idx, (int)nr_maps - 1); - return -LIBBPF_ERRNO__RELOC; - } + for (map_idx = 0; map_idx < nr_maps; map_idx++) { + if (maps[map_idx].libbpf_type != type) + continue; + if (type != LIBBPF_MAP_UNSPEC || + (type == LIBBPF_MAP_UNSPEC && + maps[map_idx].offset == sym.st_value)) { + pr_debug("relocation: find map %zd (%s) for insn %u\n", + map_idx, maps[map_idx].name, insn_idx); + break; + } + } + + if (map_idx >= nr_maps) { + pr_warning("bpf relocation: map_idx %d large than %d\n", + (int)map_idx, (int)nr_maps - 1); + return -LIBBPF_ERRNO__RELOC; + } - prog->reloc_desc[i].type = RELO_LD64; - prog->reloc_desc[i].insn_idx = insn_idx; - prog->reloc_desc[i].map_idx = map_idx; + prog->reloc_desc[i].type = type != LIBBPF_MAP_UNSPEC ? + RELO_DATA : RELO_LD64; + prog->reloc_desc[i].insn_idx = insn_idx; + prog->reloc_desc[i].map_idx = map_idx; + } } return 0; } @@ -1065,18 +1370,27 @@ bpf_program__collect_reloc(struct bpf_program *prog, GElf_Shdr *shdr, static int bpf_map_find_btf_info(struct bpf_map *map, const struct btf *btf) { struct bpf_map_def *def = &map->def; - __u32 key_type_id, value_type_id; + __u32 key_type_id = 0, value_type_id = 0; int ret; - ret = btf__get_map_kv_tids(btf, map->name, def->key_size, - def->value_size, &key_type_id, - &value_type_id); - if (ret) + if (!bpf_map__is_internal(map)) { + ret = btf__get_map_kv_tids(btf, map->name, def->key_size, + def->value_size, &key_type_id, + &value_type_id); + } else { + /* + * LLVM annotates global data differently in BTF, that is, + * only as '.data', '.bss' or '.rodata'. + */ + ret = btf__find_by_name(btf, + libbpf_type_to_btf_name[map->libbpf_type]); + } + if (ret < 0) return ret; map->btf_key_type_id = key_type_id; - map->btf_value_type_id = value_type_id; - + map->btf_value_type_id = bpf_map__is_internal(map) ? + ret : value_type_id; return 0; } @@ -1188,6 +1502,34 @@ bpf_object__probe_caps(struct bpf_object *obj) } static int +bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map) +{ + char *cp, errmsg[STRERR_BUFSIZE]; + int err, zero = 0; + __u8 *data; + + /* Nothing to do here since kernel already zero-initializes .bss map. */ + if (map->libbpf_type == LIBBPF_MAP_BSS) + return 0; + + data = map->libbpf_type == LIBBPF_MAP_DATA ? + obj->sections.data : obj->sections.rodata; + + err = bpf_map_update_elem(map->fd, &zero, data, 0); + /* Freeze .rodata map as read-only from syscall side. */ + if (!err && map->libbpf_type == LIBBPF_MAP_RODATA) { + err = bpf_map_freeze(map->fd); + if (err) { + cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg)); + pr_warning("Error freezing map(%s) as read-only: %s\n", + map->name, cp); + err = 0; + } + } + return err; +} + +static int bpf_object__create_maps(struct bpf_object *obj) { struct bpf_create_map_attr create_attr = {}; @@ -1244,6 +1586,7 @@ bpf_object__create_maps(struct bpf_object *obj) size_t j; err = *pfd; +err_out: cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg)); pr_warning("failed to create map (name: '%s'): %s\n", map->name, cp); @@ -1251,6 +1594,15 @@ bpf_object__create_maps(struct bpf_object *obj) zclose(obj->maps[j].fd); return err; } + + if (bpf_map__is_internal(map)) { + err = bpf_object__populate_internal_map(obj, map); + if (err < 0) { + zclose(*pfd); + goto err_out; + } + } + pr_debug("create map %s: fd=%d\n", map->name, *pfd); } @@ -1405,21 +1757,29 @@ bpf_program__relocate(struct bpf_program *prog, struct bpf_object *obj) return 0; for (i = 0; i < prog->nr_reloc; i++) { - if (prog->reloc_desc[i].type == RELO_LD64) { + if (prog->reloc_desc[i].type == RELO_LD64 || + prog->reloc_desc[i].type == RELO_DATA) { + bool relo_data = prog->reloc_desc[i].type == RELO_DATA; struct bpf_insn *insns = prog->insns; int insn_idx, map_idx; insn_idx = prog->reloc_desc[i].insn_idx; map_idx = prog->reloc_desc[i].map_idx; - if (insn_idx >= (int)prog->insns_cnt) { + if (insn_idx + 1 >= (int)prog->insns_cnt) { pr_warning("relocation out of range: '%s'\n", prog->section_name); return -LIBBPF_ERRNO__RELOC; } - insns[insn_idx].src_reg = BPF_PSEUDO_MAP_FD; + + if (!relo_data) { + insns[insn_idx].src_reg = BPF_PSEUDO_MAP_FD; + } else { + insns[insn_idx].src_reg = BPF_PSEUDO_MAP_VALUE; + insns[insn_idx + 1].imm = insns[insn_idx].imm; + } insns[insn_idx].imm = obj->maps[map_idx].fd; - } else { + } else if (prog->reloc_desc[i].type == RELO_CALL) { err = bpf_program__reloc_text(prog, obj, &prog->reloc_desc[i]); if (err) @@ -1494,6 +1854,7 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, { struct bpf_load_program_attr load_attr; char *cp, errmsg[STRERR_BUFSIZE]; + int log_buf_size = BPF_LOG_BUF_SIZE; char *log_buf; int ret; @@ -1514,21 +1875,30 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, load_attr.line_info = prog->line_info; load_attr.line_info_rec_size = prog->line_info_rec_size; load_attr.line_info_cnt = prog->line_info_cnt; + load_attr.log_level = prog->log_level; if (!load_attr.insns || !load_attr.insns_cnt) return -EINVAL; - log_buf = malloc(BPF_LOG_BUF_SIZE); +retry_load: + log_buf = malloc(log_buf_size); if (!log_buf) pr_warning("Alloc log buffer for bpf loader error, continue without log\n"); - ret = bpf_load_program_xattr(&load_attr, log_buf, BPF_LOG_BUF_SIZE); + ret = bpf_load_program_xattr(&load_attr, log_buf, log_buf_size); if (ret >= 0) { + if (load_attr.log_level) + pr_debug("verifier log:\n%s", log_buf); *pfd = ret; ret = 0; goto out; } + if (errno == ENOSPC) { + log_buf_size <<= 1; + free(log_buf); + goto retry_load; + } ret = -LIBBPF_ERRNO__LOAD; cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg)); pr_warning("load bpf program failed: %s\n", cp); @@ -2303,6 +2673,9 @@ void bpf_object__close(struct bpf_object *obj) obj->maps[i].priv = NULL; obj->maps[i].clear_priv = NULL; } + + zfree(&obj->sections.rodata); + zfree(&obj->sections.data); zfree(&obj->maps); obj->nr_maps = 0; @@ -2780,6 +3153,11 @@ bool bpf_map__is_offload_neutral(struct bpf_map *map) return map->def.type == BPF_MAP_TYPE_PERF_EVENT_ARRAY; } +bool bpf_map__is_internal(struct bpf_map *map) +{ + return map->libbpf_type != LIBBPF_MAP_UNSPEC; +} + void bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex) { map->map_ifindex = ifindex; @@ -2938,6 +3316,7 @@ int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr, bpf_program__set_expected_attach_type(prog, expected_attach_type); + prog->log_level = attr->log_level; if (!first_prog) first_prog = prog; } diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index c70785cc8ef5..c5ff00515ce7 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -75,6 +75,10 @@ struct bpf_object *__bpf_object__open_xattr(struct bpf_object_open_attr *attr, LIBBPF_API struct bpf_object *bpf_object__open_buffer(void *obj_buf, size_t obj_buf_sz, const char *name); +int bpf_object__section_size(const struct bpf_object *obj, const char *name, + __u32 *size); +int bpf_object__variable_offset(const struct bpf_object *obj, const char *name, + __u32 *off); LIBBPF_API int bpf_object__pin_maps(struct bpf_object *obj, const char *path); LIBBPF_API int bpf_object__unpin_maps(struct bpf_object *obj, const char *path); @@ -301,6 +305,7 @@ LIBBPF_API void *bpf_map__priv(struct bpf_map *map); LIBBPF_API int bpf_map__reuse_fd(struct bpf_map *map, int fd); LIBBPF_API int bpf_map__resize(struct bpf_map *map, __u32 max_entries); LIBBPF_API bool bpf_map__is_offload_neutral(struct bpf_map *map); +LIBBPF_API bool bpf_map__is_internal(struct bpf_map *map); LIBBPF_API void bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex); LIBBPF_API int bpf_map__pin(struct bpf_map *map, const char *path); LIBBPF_API int bpf_map__unpin(struct bpf_map *map, const char *path); @@ -314,6 +319,7 @@ struct bpf_prog_load_attr { enum bpf_prog_type prog_type; enum bpf_attach_type expected_attach_type; int ifindex; + int log_level; }; LIBBPF_API int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr, diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index f3ce50500cf2..673001787cba 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -157,3 +157,10 @@ LIBBPF_0.0.2 { bpf_program__bpil_addr_to_offs; bpf_program__bpil_offs_to_addr; } LIBBPF_0.0.1; + +LIBBPF_0.0.3 { + global: + bpf_map__is_internal; + bpf_map_freeze; + btf__finalize_data; +} LIBBPF_0.0.2; diff --git a/tools/lib/bpf/libbpf.pc.template b/tools/lib/bpf/libbpf.pc.template new file mode 100644 index 000000000000..ac17fcef2108 --- /dev/null +++ b/tools/lib/bpf/libbpf.pc.template @@ -0,0 +1,12 @@ +# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) + +prefix=@PREFIX@ +libdir=@LIBDIR@ +includedir=${prefix}/include + +Name: libbpf +Description: BPF library +Version: @VERSION@ +Libs: -L${libdir} -lbpf +Requires.private: libelf +Cflags: -I${includedir} diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c index 8d0078b65486..557ef8d1250d 100644 --- a/tools/lib/bpf/xsk.c +++ b/tools/lib/bpf/xsk.c @@ -259,7 +259,8 @@ out_umem_alloc: static int xsk_load_xdp_prog(struct xsk_socket *xsk) { - char bpf_log_buf[BPF_LOG_BUF_SIZE]; + static const int log_buf_size = 16 * 1024; + char log_buf[log_buf_size]; int err, prog_fd; /* This is the C-program: @@ -308,10 +309,10 @@ static int xsk_load_xdp_prog(struct xsk_socket *xsk) size_t insns_cnt = sizeof(prog) / sizeof(struct bpf_insn); prog_fd = bpf_load_program(BPF_PROG_TYPE_XDP, prog, insns_cnt, - "LGPL-2.1 or BSD-2-Clause", 0, bpf_log_buf, - BPF_LOG_BUF_SIZE); + "LGPL-2.1 or BSD-2-Clause", 0, log_buf, + log_buf_size); if (prog_fd < 0) { - pr_warning("BPF log buffer:\n%s", bpf_log_buf); + pr_warning("BPF log buffer:\n%s", log_buf); return prog_fd; } diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 77b73b892136..078283d073b0 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -209,7 +209,7 @@ ifeq ($(DWARF2BTF),y) endif PROG_TESTS_H := $(OUTPUT)/prog_tests/tests.h -$(OUTPUT)/test_progs: $(PROG_TESTS_H) +test_progs.c: $(PROG_TESTS_H) $(OUTPUT)/test_progs: CFLAGS += $(TEST_PROGS_CFLAGS) $(OUTPUT)/test_progs: prog_tests/*.c @@ -232,7 +232,7 @@ $(PROG_TESTS_H): $(PROG_TESTS_DIR) $(PROG_TESTS_FILES) ) > $(PROG_TESTS_H)) VERIFIER_TESTS_H := $(OUTPUT)/verifier/tests.h -$(OUTPUT)/test_verifier: $(VERIFIER_TESTS_H) +test_verifier.c: $(VERIFIER_TESTS_H) $(OUTPUT)/test_verifier: CFLAGS += $(TEST_VERIFIER_CFLAGS) VERIFIER_TESTS_DIR = $(OUTPUT)/verifier diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h index 97d140961438..e85d62cb53d0 100644 --- a/tools/testing/selftests/bpf/bpf_helpers.h +++ b/tools/testing/selftests/bpf/bpf_helpers.h @@ -9,14 +9,14 @@ #define SEC(NAME) __attribute__((section(NAME), used)) /* helper functions called from eBPF programs written in C */ -static void *(*bpf_map_lookup_elem)(void *map, void *key) = +static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) BPF_FUNC_map_lookup_elem; -static int (*bpf_map_update_elem)(void *map, void *key, void *value, +static int (*bpf_map_update_elem)(void *map, const void *key, const void *value, unsigned long long flags) = (void *) BPF_FUNC_map_update_elem; -static int (*bpf_map_delete_elem)(void *map, void *key) = +static int (*bpf_map_delete_elem)(void *map, const void *key) = (void *) BPF_FUNC_map_delete_elem; -static int (*bpf_map_push_elem)(void *map, void *value, +static int (*bpf_map_push_elem)(void *map, const void *value, unsigned long long flags) = (void *) BPF_FUNC_map_push_elem; static int (*bpf_map_pop_elem)(void *map, void *value) = diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config index a42f4fc4dc11..8c976476f6fd 100644 --- a/tools/testing/selftests/bpf/config +++ b/tools/testing/selftests/bpf/config @@ -25,3 +25,11 @@ CONFIG_XDP_SOCKETS=y CONFIG_FTRACE_SYSCALLS=y CONFIG_IPV6_TUNNEL=y CONFIG_IPV6_GRE=y +CONFIG_NET_FOU=m +CONFIG_NET_FOU_IP_TUNNELS=y +CONFIG_IPV6_FOU=m +CONFIG_IPV6_FOU_TUNNEL=m +CONFIG_MPLS=y +CONFIG_NET_MPLS_GSO=m +CONFIG_MPLS_ROUTING=m +CONFIG_MPLS_IPTUNNEL=m diff --git a/tools/testing/selftests/bpf/flow_dissector_load.c b/tools/testing/selftests/bpf/flow_dissector_load.c index 77cafa66d048..7136ab9ffa73 100644 --- a/tools/testing/selftests/bpf/flow_dissector_load.c +++ b/tools/testing/selftests/bpf/flow_dissector_load.c @@ -52,7 +52,7 @@ static void detach_program(void) sprintf(command, "rm -r %s", cfg_pin_path); ret = system(command); if (ret) - error(1, errno, command); + error(1, errno, "%s", command); } static void parse_opts(int argc, char **argv) diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_obj_id.c b/tools/testing/selftests/bpf/prog_tests/bpf_obj_id.c index a64f7a02139c..cb827383db4d 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_obj_id.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_obj_id.c @@ -73,7 +73,7 @@ void test_bpf_obj_id(void) info_len != sizeof(struct bpf_map_info) || strcmp((char *)map_infos[i].name, expected_map_name), "get-map-info(fd)", - "err %d errno %d type %d(%d) info_len %u(%Zu) key_size %u value_size %u max_entries %u map_flags %X name %s(%s)\n", + "err %d errno %d type %d(%d) info_len %u(%zu) key_size %u value_size %u max_entries %u map_flags %X name %s(%s)\n", err, errno, map_infos[i].type, BPF_MAP_TYPE_ARRAY, info_len, sizeof(struct bpf_map_info), @@ -117,7 +117,7 @@ void test_bpf_obj_id(void) *(int *)(long)prog_infos[i].map_ids != map_infos[i].id || strcmp((char *)prog_infos[i].name, expected_prog_name), "get-prog-info(fd)", - "err %d errno %d i %d type %d(%d) info_len %u(%Zu) jit_enabled %d jited_prog_len %u xlated_prog_len %u jited_prog %d xlated_prog %d load_time %lu(%lu) uid %u(%u) nr_map_ids %u(%u) map_id %u(%u) name %s(%s)\n", + "err %d errno %d i %d type %d(%d) info_len %u(%zu) jit_enabled %d jited_prog_len %u xlated_prog_len %u jited_prog %d xlated_prog %d load_time %lu(%lu) uid %u(%u) nr_map_ids %u(%u) map_id %u(%u) name %s(%s)\n", err, errno, i, prog_infos[i].type, BPF_PROG_TYPE_SOCKET_FILTER, info_len, sizeof(struct bpf_prog_info), @@ -185,7 +185,7 @@ void test_bpf_obj_id(void) memcmp(&prog_info, &prog_infos[i], info_len) || *(int *)(long)prog_info.map_ids != saved_map_id, "get-prog-info(next_id->fd)", - "err %d errno %d info_len %u(%Zu) memcmp %d map_id %u(%u)\n", + "err %d errno %d info_len %u(%zu) memcmp %d map_id %u(%u)\n", err, errno, info_len, sizeof(struct bpf_prog_info), memcmp(&prog_info, &prog_infos[i], info_len), *(int *)(long)prog_info.map_ids, saved_map_id); @@ -231,7 +231,7 @@ void test_bpf_obj_id(void) memcmp(&map_info, &map_infos[i], info_len) || array_value != array_magic_value, "check get-map-info(next_id->fd)", - "err %d errno %d info_len %u(%Zu) memcmp %d array_value %llu(%llu)\n", + "err %d errno %d info_len %u(%zu) memcmp %d array_value %llu(%llu)\n", err, errno, info_len, sizeof(struct bpf_map_info), memcmp(&map_info, &map_infos[i], info_len), array_value, array_magic_value); diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c new file mode 100644 index 000000000000..23b159d95c3f --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c @@ -0,0 +1,49 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (c) 2019 Facebook +#include <test_progs.h> +static int libbpf_debug_print(enum libbpf_print_level level, + const char *format, va_list args) +{ + if (level != LIBBPF_DEBUG) + return 0; + + if (!strstr(format, "verifier log")) + return 0; + return vfprintf(stderr, "%s", args); +} + +static int check_load(const char *file) +{ + struct bpf_prog_load_attr attr; + struct bpf_object *obj; + int err, prog_fd; + + memset(&attr, 0, sizeof(struct bpf_prog_load_attr)); + attr.file = file; + attr.prog_type = BPF_PROG_TYPE_SCHED_CLS; + attr.log_level = 4; + err = bpf_prog_load_xattr(&attr, &obj, &prog_fd); + bpf_object__close(obj); + if (err) + error_cnt++; + return err; +} + +void test_bpf_verif_scale(void) +{ + const char *file1 = "./test_verif_scale1.o"; + const char *file2 = "./test_verif_scale2.o"; + const char *file3 = "./test_verif_scale3.o"; + int err; + + if (verifier_stats) + libbpf_set_print(libbpf_debug_print); + + err = check_load(file1); + err |= check_load(file2); + err |= check_load(file3); + if (!err) + printf("test_verif_scale:OK\n"); + else + printf("test_verif_scale:FAIL\n"); +} diff --git a/tools/testing/selftests/bpf/prog_tests/get_stack_raw_tp.c b/tools/testing/selftests/bpf/prog_tests/get_stack_raw_tp.c index d7bb5beb1c57..c2a0a9d5591b 100644 --- a/tools/testing/selftests/bpf/prog_tests/get_stack_raw_tp.c +++ b/tools/testing/selftests/bpf/prog_tests/get_stack_raw_tp.c @@ -39,7 +39,7 @@ static int get_stack_print_output(void *data, int size) } else { for (i = 0; i < num_stack; i++) { ks = ksym_search(raw_data[i]); - if (strcmp(ks->name, nonjit_func) == 0) { + if (ks && (strcmp(ks->name, nonjit_func) == 0)) { found = true; break; } @@ -56,7 +56,7 @@ static int get_stack_print_output(void *data, int size) } else { for (i = 0; i < num_stack; i++) { ks = ksym_search(e->kern_stack[i]); - if (strcmp(ks->name, nonjit_func) == 0) { + if (ks && (strcmp(ks->name, nonjit_func) == 0)) { good_kern_stack = true; break; } diff --git a/tools/testing/selftests/bpf/prog_tests/global_data.c b/tools/testing/selftests/bpf/prog_tests/global_data.c new file mode 100644 index 000000000000..d011079fb0bf --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/global_data.c @@ -0,0 +1,157 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <test_progs.h> + +static void test_global_data_number(struct bpf_object *obj, __u32 duration) +{ + int i, err, map_fd; + uint64_t num; + + map_fd = bpf_find_map(__func__, obj, "result_number"); + if (map_fd < 0) { + error_cnt++; + return; + } + + struct { + char *name; + uint32_t key; + uint64_t num; + } tests[] = { + { "relocate .bss reference", 0, 0 }, + { "relocate .data reference", 1, 42 }, + { "relocate .rodata reference", 2, 24 }, + { "relocate .bss reference", 3, 0 }, + { "relocate .data reference", 4, 0xffeeff }, + { "relocate .rodata reference", 5, 0xabab }, + { "relocate .bss reference", 6, 1234 }, + { "relocate .bss reference", 7, 0 }, + { "relocate .rodata reference", 8, 0xab }, + { "relocate .rodata reference", 9, 0x1111111111111111 }, + { "relocate .rodata reference", 10, ~0 }, + }; + + for (i = 0; i < sizeof(tests) / sizeof(tests[0]); i++) { + err = bpf_map_lookup_elem(map_fd, &tests[i].key, &num); + CHECK(err || num != tests[i].num, tests[i].name, + "err %d result %lx expected %lx\n", + err, num, tests[i].num); + } +} + +static void test_global_data_string(struct bpf_object *obj, __u32 duration) +{ + int i, err, map_fd; + char str[32]; + + map_fd = bpf_find_map(__func__, obj, "result_string"); + if (map_fd < 0) { + error_cnt++; + return; + } + + struct { + char *name; + uint32_t key; + char str[32]; + } tests[] = { + { "relocate .rodata reference", 0, "abcdefghijklmnopqrstuvwxyz" }, + { "relocate .data reference", 1, "abcdefghijklmnopqrstuvwxyz" }, + { "relocate .bss reference", 2, "" }, + { "relocate .data reference", 3, "abcdexghijklmnopqrstuvwxyz" }, + { "relocate .bss reference", 4, "\0\0hello" }, + }; + + for (i = 0; i < sizeof(tests) / sizeof(tests[0]); i++) { + err = bpf_map_lookup_elem(map_fd, &tests[i].key, str); + CHECK(err || memcmp(str, tests[i].str, sizeof(str)), + tests[i].name, "err %d result \'%s\' expected \'%s\'\n", + err, str, tests[i].str); + } +} + +struct foo { + __u8 a; + __u32 b; + __u64 c; +}; + +static void test_global_data_struct(struct bpf_object *obj, __u32 duration) +{ + int i, err, map_fd; + struct foo val; + + map_fd = bpf_find_map(__func__, obj, "result_struct"); + if (map_fd < 0) { + error_cnt++; + return; + } + + struct { + char *name; + uint32_t key; + struct foo val; + } tests[] = { + { "relocate .rodata reference", 0, { 42, 0xfefeefef, 0x1111111111111111ULL, } }, + { "relocate .bss reference", 1, { } }, + { "relocate .rodata reference", 2, { } }, + { "relocate .data reference", 3, { 41, 0xeeeeefef, 0x2111111111111111ULL, } }, + }; + + for (i = 0; i < sizeof(tests) / sizeof(tests[0]); i++) { + err = bpf_map_lookup_elem(map_fd, &tests[i].key, &val); + CHECK(err || memcmp(&val, &tests[i].val, sizeof(val)), + tests[i].name, "err %d result { %u, %u, %llu } expected { %u, %u, %llu }\n", + err, val.a, val.b, val.c, tests[i].val.a, tests[i].val.b, tests[i].val.c); + } +} + +static void test_global_data_rdonly(struct bpf_object *obj, __u32 duration) +{ + int err = -ENOMEM, map_fd, zero = 0; + struct bpf_map *map; + __u8 *buff; + + map = bpf_object__find_map_by_name(obj, "test_glo.rodata"); + if (!map || !bpf_map__is_internal(map)) { + error_cnt++; + return; + } + + map_fd = bpf_map__fd(map); + if (map_fd < 0) { + error_cnt++; + return; + } + + buff = malloc(bpf_map__def(map)->value_size); + if (buff) + err = bpf_map_update_elem(map_fd, &zero, buff, 0); + free(buff); + CHECK(!err || errno != EPERM, "test .rodata read-only map", + "err %d errno %d\n", err, errno); +} + +void test_global_data(void) +{ + const char *file = "./test_global_data.o"; + __u32 duration = 0, retval; + struct bpf_object *obj; + int err, prog_fd; + + err = bpf_prog_load(file, BPF_PROG_TYPE_SCHED_CLS, &obj, &prog_fd); + if (CHECK(err, "load program", "error %d loading %s\n", err, file)) + return; + + err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4), + NULL, NULL, &retval, &duration); + CHECK(err || retval, "pass global data run", + "err %d errno %d retval %d duration %d\n", + err, errno, retval, duration); + + test_global_data_number(obj, duration); + test_global_data_string(obj, duration); + test_global_data_struct(obj, duration); + test_global_data_rdonly(obj, duration); + + bpf_object__close(obj); +} diff --git a/tools/testing/selftests/bpf/prog_tests/skb_ctx.c b/tools/testing/selftests/bpf/prog_tests/skb_ctx.c new file mode 100644 index 000000000000..e95baa32e277 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/skb_ctx.c @@ -0,0 +1,89 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <test_progs.h> + +void test_skb_ctx(void) +{ + struct __sk_buff skb = { + .cb[0] = 1, + .cb[1] = 2, + .cb[2] = 3, + .cb[3] = 4, + .cb[4] = 5, + .priority = 6, + }; + struct bpf_prog_test_run_attr tattr = { + .data_in = &pkt_v4, + .data_size_in = sizeof(pkt_v4), + .ctx_in = &skb, + .ctx_size_in = sizeof(skb), + .ctx_out = &skb, + .ctx_size_out = sizeof(skb), + }; + struct bpf_object *obj; + int err; + int i; + + err = bpf_prog_load("./test_skb_ctx.o", BPF_PROG_TYPE_SCHED_CLS, &obj, + &tattr.prog_fd); + if (CHECK_ATTR(err, "load", "err %d errno %d\n", err, errno)) + return; + + /* ctx_in != NULL, ctx_size_in == 0 */ + + tattr.ctx_size_in = 0; + err = bpf_prog_test_run_xattr(&tattr); + CHECK_ATTR(err == 0, "ctx_size_in", "err %d errno %d\n", err, errno); + tattr.ctx_size_in = sizeof(skb); + + /* ctx_out != NULL, ctx_size_out == 0 */ + + tattr.ctx_size_out = 0; + err = bpf_prog_test_run_xattr(&tattr); + CHECK_ATTR(err == 0, "ctx_size_out", "err %d errno %d\n", err, errno); + tattr.ctx_size_out = sizeof(skb); + + /* non-zero [len, tc_index] fields should be rejected*/ + + skb.len = 1; + err = bpf_prog_test_run_xattr(&tattr); + CHECK_ATTR(err == 0, "len", "err %d errno %d\n", err, errno); + skb.len = 0; + + skb.tc_index = 1; + err = bpf_prog_test_run_xattr(&tattr); + CHECK_ATTR(err == 0, "tc_index", "err %d errno %d\n", err, errno); + skb.tc_index = 0; + + /* non-zero [hash, sk] fields should be rejected */ + + skb.hash = 1; + err = bpf_prog_test_run_xattr(&tattr); + CHECK_ATTR(err == 0, "hash", "err %d errno %d\n", err, errno); + skb.hash = 0; + + skb.sk = (struct bpf_sock *)1; + err = bpf_prog_test_run_xattr(&tattr); + CHECK_ATTR(err == 0, "sk", "err %d errno %d\n", err, errno); + skb.sk = 0; + + err = bpf_prog_test_run_xattr(&tattr); + CHECK_ATTR(err != 0 || tattr.retval, + "run", + "err %d errno %d retval %d\n", + err, errno, tattr.retval); + + CHECK_ATTR(tattr.ctx_size_out != sizeof(skb), + "ctx_size_out", + "incorrect output size, want %lu have %u\n", + sizeof(skb), tattr.ctx_size_out); + + for (i = 0; i < 5; i++) + CHECK_ATTR(skb.cb[i] != i + 2, + "ctx_out_cb", + "skb->cb[i] == %d, expected %d\n", + skb.cb[i], i + 2); + CHECK_ATTR(skb.priority != 7, + "ctx_out_priority", + "skb->priority == %d, expected %d\n", + skb.priority, 7); +} diff --git a/tools/testing/selftests/bpf/progs/test_global_data.c b/tools/testing/selftests/bpf/progs/test_global_data.c new file mode 100644 index 000000000000..5ab14e941980 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_global_data.c @@ -0,0 +1,106 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (c) 2019 Isovalent, Inc. + +#include <linux/bpf.h> +#include <linux/pkt_cls.h> +#include <string.h> + +#include "bpf_helpers.h" + +struct bpf_map_def SEC("maps") result_number = { + .type = BPF_MAP_TYPE_ARRAY, + .key_size = sizeof(__u32), + .value_size = sizeof(__u64), + .max_entries = 11, +}; + +struct bpf_map_def SEC("maps") result_string = { + .type = BPF_MAP_TYPE_ARRAY, + .key_size = sizeof(__u32), + .value_size = 32, + .max_entries = 5, +}; + +struct foo { + __u8 a; + __u32 b; + __u64 c; +}; + +struct bpf_map_def SEC("maps") result_struct = { + .type = BPF_MAP_TYPE_ARRAY, + .key_size = sizeof(__u32), + .value_size = sizeof(struct foo), + .max_entries = 5, +}; + +/* Relocation tests for __u64s. */ +static __u64 num0; +static __u64 num1 = 42; +static const __u64 num2 = 24; +static __u64 num3 = 0; +static __u64 num4 = 0xffeeff; +static const __u64 num5 = 0xabab; +static const __u64 num6 = 0xab; + +/* Relocation tests for strings. */ +static const char str0[32] = "abcdefghijklmnopqrstuvwxyz"; +static char str1[32] = "abcdefghijklmnopqrstuvwxyz"; +static char str2[32]; + +/* Relocation tests for structs. */ +static const struct foo struct0 = { + .a = 42, + .b = 0xfefeefef, + .c = 0x1111111111111111ULL, +}; +static struct foo struct1; +static const struct foo struct2; +static struct foo struct3 = { + .a = 41, + .b = 0xeeeeefef, + .c = 0x2111111111111111ULL, +}; + +#define test_reloc(map, num, var) \ + do { \ + __u32 key = num; \ + bpf_map_update_elem(&result_##map, &key, var, 0); \ + } while (0) + +SEC("static_data_load") +int load_static_data(struct __sk_buff *skb) +{ + static const __u64 bar = ~0; + + test_reloc(number, 0, &num0); + test_reloc(number, 1, &num1); + test_reloc(number, 2, &num2); + test_reloc(number, 3, &num3); + test_reloc(number, 4, &num4); + test_reloc(number, 5, &num5); + num4 = 1234; + test_reloc(number, 6, &num4); + test_reloc(number, 7, &num0); + test_reloc(number, 8, &num6); + + test_reloc(string, 0, str0); + test_reloc(string, 1, str1); + test_reloc(string, 2, str2); + str1[5] = 'x'; + test_reloc(string, 3, str1); + __builtin_memcpy(&str2[2], "hello", sizeof("hello")); + test_reloc(string, 4, str2); + + test_reloc(struct, 0, &struct0); + test_reloc(struct, 1, &struct1); + test_reloc(struct, 2, &struct2); + test_reloc(struct, 3, &struct3); + + test_reloc(number, 9, &struct0.c); + test_reloc(number, 10, &bar); + + return TC_ACT_OK; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/test_jhash.h b/tools/testing/selftests/bpf/progs/test_jhash.h new file mode 100644 index 000000000000..3d12c11a8d47 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_jhash.h @@ -0,0 +1,70 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (c) 2019 Facebook + +typedef unsigned int u32; + +static __attribute__((always_inline)) u32 rol32(u32 word, unsigned int shift) +{ + return (word << shift) | (word >> ((-shift) & 31)); +} + +#define __jhash_mix(a, b, c) \ +{ \ + a -= c; a ^= rol32(c, 4); c += b; \ + b -= a; b ^= rol32(a, 6); a += c; \ + c -= b; c ^= rol32(b, 8); b += a; \ + a -= c; a ^= rol32(c, 16); c += b; \ + b -= a; b ^= rol32(a, 19); a += c; \ + c -= b; c ^= rol32(b, 4); b += a; \ +} + +#define __jhash_final(a, b, c) \ +{ \ + c ^= b; c -= rol32(b, 14); \ + a ^= c; a -= rol32(c, 11); \ + b ^= a; b -= rol32(a, 25); \ + c ^= b; c -= rol32(b, 16); \ + a ^= c; a -= rol32(c, 4); \ + b ^= a; b -= rol32(a, 14); \ + c ^= b; c -= rol32(b, 24); \ +} + +#define JHASH_INITVAL 0xdeadbeef + +static ATTR +u32 jhash(const void *key, u32 length, u32 initval) +{ + u32 a, b, c; + const unsigned char *k = key; + + a = b = c = JHASH_INITVAL + length + initval; + + while (length > 12) { + a += *(volatile u32 *)(k); + b += *(volatile u32 *)(k + 4); + c += *(volatile u32 *)(k + 8); + __jhash_mix(a, b, c); + length -= 12; + k += 12; + } + switch (length) { + case 12: c += (u32)k[11]<<24; + case 11: c += (u32)k[10]<<16; + case 10: c += (u32)k[9]<<8; + case 9: c += k[8]; + case 8: b += (u32)k[7]<<24; + case 7: b += (u32)k[6]<<16; + case 6: b += (u32)k[5]<<8; + case 5: b += k[4]; + case 4: a += (u32)k[3]<<24; + case 3: a += (u32)k[2]<<16; + case 2: a += (u32)k[1]<<8; + case 1: a += k[0]; + c ^= a; + __jhash_final(a, b, c); + case 0: /* Nothing left to add */ + break; + } + + return c; +} diff --git a/tools/testing/selftests/bpf/progs/test_skb_ctx.c b/tools/testing/selftests/bpf/progs/test_skb_ctx.c new file mode 100644 index 000000000000..7a80960d7df1 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_skb_ctx.c @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <linux/bpf.h> +#include "bpf_helpers.h" + +int _version SEC("version") = 1; +char _license[] SEC("license") = "GPL"; + +SEC("skb_ctx") +int process(struct __sk_buff *skb) +{ + #pragma clang loop unroll(full) + for (int i = 0; i < 5; i++) { + if (skb->cb[i] != i + 1) + return 1; + skb->cb[i]++; + } + skb->priority++; + + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c index f541c2de947d..bcb00d737e95 100644 --- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c +++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c @@ -11,7 +11,9 @@ #include <linux/in.h> #include <linux/ip.h> #include <linux/ipv6.h> +#include <linux/mpls.h> #include <linux/tcp.h> +#include <linux/udp.h> #include <linux/pkt_cls.h> #include <linux/types.h> @@ -20,16 +22,36 @@ static const int cfg_port = 8000; -struct grev4hdr { - struct iphdr ip; +static const int cfg_udp_src = 20000; + +#define UDP_PORT 5555 +#define MPLS_OVER_UDP_PORT 6635 +#define ETH_OVER_UDP_PORT 7777 + +/* MPLS label 1000 with S bit (last label) set and ttl of 255. */ +static const __u32 mpls_label = __bpf_constant_htonl(1000 << 12 | + MPLS_LS_S_MASK | 0xff); + +struct gre_hdr { __be16 flags; __be16 protocol; } __attribute__((packed)); -struct grev6hdr { +union l4hdr { + struct udphdr udp; + struct gre_hdr gre; +}; + +struct v4hdr { + struct iphdr ip; + union l4hdr l4hdr; + __u8 pad[16]; /* enough space for L2 header */ +} __attribute__((packed)); + +struct v6hdr { struct ipv6hdr ip; - __be16 flags; - __be16 protocol; + union l4hdr l4hdr; + __u8 pad[16]; /* enough space for L2 header */ } __attribute__((packed)); static __always_inline void set_ipv4_csum(struct iphdr *iph) @@ -47,13 +69,15 @@ static __always_inline void set_ipv4_csum(struct iphdr *iph) iph->check = ~((csum & 0xffff) + (csum >> 16)); } -static __always_inline int encap_ipv4(struct __sk_buff *skb, bool with_gre) +static __always_inline int encap_ipv4(struct __sk_buff *skb, __u8 encap_proto, + __u16 l2_proto) { - struct grev4hdr h_outer; + __u16 udp_dst = UDP_PORT; struct iphdr iph_inner; + struct v4hdr h_outer; struct tcphdr tcph; + int olen, l2_len; __u64 flags; - int olen; if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner, sizeof(iph_inner)) < 0) @@ -70,13 +94,58 @@ static __always_inline int encap_ipv4(struct __sk_buff *skb, bool with_gre) if (tcph.dest != __bpf_constant_htons(cfg_port)) return TC_ACT_OK; + olen = sizeof(h_outer.ip); + l2_len = 0; + flags = BPF_F_ADJ_ROOM_FIXED_GSO | BPF_F_ADJ_ROOM_ENCAP_L3_IPV4; - if (with_gre) { + + switch (l2_proto) { + case ETH_P_MPLS_UC: + l2_len = sizeof(mpls_label); + udp_dst = MPLS_OVER_UDP_PORT; + break; + case ETH_P_TEB: + l2_len = ETH_HLEN; + udp_dst = ETH_OVER_UDP_PORT; + break; + } + flags |= BPF_F_ADJ_ROOM_ENCAP_L2(l2_len); + + switch (encap_proto) { + case IPPROTO_GRE: flags |= BPF_F_ADJ_ROOM_ENCAP_L4_GRE; - olen = sizeof(h_outer); - } else { - olen = sizeof(h_outer.ip); + olen += sizeof(h_outer.l4hdr.gre); + h_outer.l4hdr.gre.protocol = bpf_htons(l2_proto); + h_outer.l4hdr.gre.flags = 0; + break; + case IPPROTO_UDP: + flags |= BPF_F_ADJ_ROOM_ENCAP_L4_UDP; + olen += sizeof(h_outer.l4hdr.udp); + h_outer.l4hdr.udp.source = __bpf_constant_htons(cfg_udp_src); + h_outer.l4hdr.udp.dest = bpf_htons(udp_dst); + h_outer.l4hdr.udp.check = 0; + h_outer.l4hdr.udp.len = bpf_htons(bpf_ntohs(iph_inner.tot_len) + + sizeof(h_outer.l4hdr.udp) + + l2_len); + break; + case IPPROTO_IPIP: + break; + default: + return TC_ACT_OK; + } + + /* add L2 encap (if specified) */ + switch (l2_proto) { + case ETH_P_MPLS_UC: + *((__u32 *)((__u8 *)&h_outer + olen)) = mpls_label; + break; + case ETH_P_TEB: + if (bpf_skb_load_bytes(skb, 0, (__u8 *)&h_outer + olen, + ETH_HLEN)) + return TC_ACT_SHOT; + break; } + olen += l2_len; /* add room between mac and network header */ if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, flags)) @@ -85,16 +154,10 @@ static __always_inline int encap_ipv4(struct __sk_buff *skb, bool with_gre) /* prepare new outer network header */ h_outer.ip = iph_inner; h_outer.ip.tot_len = bpf_htons(olen + - bpf_htons(h_outer.ip.tot_len)); - if (with_gre) { - h_outer.ip.protocol = IPPROTO_GRE; - h_outer.protocol = bpf_htons(ETH_P_IP); - h_outer.flags = 0; - } else { - h_outer.ip.protocol = IPPROTO_IPIP; - } + bpf_ntohs(h_outer.ip.tot_len)); + h_outer.ip.protocol = encap_proto; - set_ipv4_csum((void *)&h_outer.ip); + set_ipv4_csum(&h_outer.ip); /* store new outer network header */ if (bpf_skb_store_bytes(skb, ETH_HLEN, &h_outer, olen, @@ -104,13 +167,16 @@ static __always_inline int encap_ipv4(struct __sk_buff *skb, bool with_gre) return TC_ACT_OK; } -static __always_inline int encap_ipv6(struct __sk_buff *skb, bool with_gre) +static __always_inline int encap_ipv6(struct __sk_buff *skb, __u8 encap_proto, + __u16 l2_proto) { + __u16 udp_dst = UDP_PORT; struct ipv6hdr iph_inner; - struct grev6hdr h_outer; + struct v6hdr h_outer; struct tcphdr tcph; + int olen, l2_len; + __u16 tot_len; __u64 flags; - int olen; if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner, sizeof(iph_inner)) < 0) @@ -124,14 +190,58 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, bool with_gre) if (tcph.dest != __bpf_constant_htons(cfg_port)) return TC_ACT_OK; + olen = sizeof(h_outer.ip); + l2_len = 0; + flags = BPF_F_ADJ_ROOM_FIXED_GSO | BPF_F_ADJ_ROOM_ENCAP_L3_IPV6; - if (with_gre) { + + switch (l2_proto) { + case ETH_P_MPLS_UC: + l2_len = sizeof(mpls_label); + udp_dst = MPLS_OVER_UDP_PORT; + break; + case ETH_P_TEB: + l2_len = ETH_HLEN; + udp_dst = ETH_OVER_UDP_PORT; + break; + } + flags |= BPF_F_ADJ_ROOM_ENCAP_L2(l2_len); + + switch (encap_proto) { + case IPPROTO_GRE: flags |= BPF_F_ADJ_ROOM_ENCAP_L4_GRE; - olen = sizeof(h_outer); - } else { - olen = sizeof(h_outer.ip); + olen += sizeof(h_outer.l4hdr.gre); + h_outer.l4hdr.gre.protocol = bpf_htons(l2_proto); + h_outer.l4hdr.gre.flags = 0; + break; + case IPPROTO_UDP: + flags |= BPF_F_ADJ_ROOM_ENCAP_L4_UDP; + olen += sizeof(h_outer.l4hdr.udp); + h_outer.l4hdr.udp.source = __bpf_constant_htons(cfg_udp_src); + h_outer.l4hdr.udp.dest = bpf_htons(udp_dst); + tot_len = bpf_ntohs(iph_inner.payload_len) + sizeof(iph_inner) + + sizeof(h_outer.l4hdr.udp); + h_outer.l4hdr.udp.check = 0; + h_outer.l4hdr.udp.len = bpf_htons(tot_len); + break; + case IPPROTO_IPV6: + break; + default: + return TC_ACT_OK; } + /* add L2 encap (if specified) */ + switch (l2_proto) { + case ETH_P_MPLS_UC: + *((__u32 *)((__u8 *)&h_outer + olen)) = mpls_label; + break; + case ETH_P_TEB: + if (bpf_skb_load_bytes(skb, 0, (__u8 *)&h_outer + olen, + ETH_HLEN)) + return TC_ACT_SHOT; + break; + } + olen += l2_len; /* add room between mac and network header */ if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, flags)) @@ -141,13 +251,8 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, bool with_gre) h_outer.ip = iph_inner; h_outer.ip.payload_len = bpf_htons(olen + bpf_ntohs(h_outer.ip.payload_len)); - if (with_gre) { - h_outer.ip.nexthdr = IPPROTO_GRE; - h_outer.protocol = bpf_htons(ETH_P_IPV6); - h_outer.flags = 0; - } else { - h_outer.ip.nexthdr = IPPROTO_IPV6; - } + + h_outer.ip.nexthdr = encap_proto; /* store new outer network header */ if (bpf_skb_store_bytes(skb, ETH_HLEN, &h_outer, olen, @@ -157,54 +262,168 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, bool with_gre) return TC_ACT_OK; } -SEC("encap_ipip") -int __encap_ipip(struct __sk_buff *skb) +SEC("encap_ipip_none") +int __encap_ipip_none(struct __sk_buff *skb) { if (skb->protocol == __bpf_constant_htons(ETH_P_IP)) - return encap_ipv4(skb, false); + return encap_ipv4(skb, IPPROTO_IPIP, ETH_P_IP); else return TC_ACT_OK; } -SEC("encap_gre") -int __encap_gre(struct __sk_buff *skb) +SEC("encap_gre_none") +int __encap_gre_none(struct __sk_buff *skb) { if (skb->protocol == __bpf_constant_htons(ETH_P_IP)) - return encap_ipv4(skb, true); + return encap_ipv4(skb, IPPROTO_GRE, ETH_P_IP); else return TC_ACT_OK; } -SEC("encap_ip6tnl") -int __encap_ip6tnl(struct __sk_buff *skb) +SEC("encap_gre_mpls") +int __encap_gre_mpls(struct __sk_buff *skb) +{ + if (skb->protocol == __bpf_constant_htons(ETH_P_IP)) + return encap_ipv4(skb, IPPROTO_GRE, ETH_P_MPLS_UC); + else + return TC_ACT_OK; +} + +SEC("encap_gre_eth") +int __encap_gre_eth(struct __sk_buff *skb) +{ + if (skb->protocol == __bpf_constant_htons(ETH_P_IP)) + return encap_ipv4(skb, IPPROTO_GRE, ETH_P_TEB); + else + return TC_ACT_OK; +} + +SEC("encap_udp_none") +int __encap_udp_none(struct __sk_buff *skb) +{ + if (skb->protocol == __bpf_constant_htons(ETH_P_IP)) + return encap_ipv4(skb, IPPROTO_UDP, ETH_P_IP); + else + return TC_ACT_OK; +} + +SEC("encap_udp_mpls") +int __encap_udp_mpls(struct __sk_buff *skb) +{ + if (skb->protocol == __bpf_constant_htons(ETH_P_IP)) + return encap_ipv4(skb, IPPROTO_UDP, ETH_P_MPLS_UC); + else + return TC_ACT_OK; +} + +SEC("encap_udp_eth") +int __encap_udp_eth(struct __sk_buff *skb) +{ + if (skb->protocol == __bpf_constant_htons(ETH_P_IP)) + return encap_ipv4(skb, IPPROTO_UDP, ETH_P_TEB); + else + return TC_ACT_OK; +} + +SEC("encap_ip6tnl_none") +int __encap_ip6tnl_none(struct __sk_buff *skb) { if (skb->protocol == __bpf_constant_htons(ETH_P_IPV6)) - return encap_ipv6(skb, false); + return encap_ipv6(skb, IPPROTO_IPV6, ETH_P_IPV6); else return TC_ACT_OK; } -SEC("encap_ip6gre") -int __encap_ip6gre(struct __sk_buff *skb) +SEC("encap_ip6gre_none") +int __encap_ip6gre_none(struct __sk_buff *skb) { if (skb->protocol == __bpf_constant_htons(ETH_P_IPV6)) - return encap_ipv6(skb, true); + return encap_ipv6(skb, IPPROTO_GRE, ETH_P_IPV6); + else + return TC_ACT_OK; +} + +SEC("encap_ip6gre_mpls") +int __encap_ip6gre_mpls(struct __sk_buff *skb) +{ + if (skb->protocol == __bpf_constant_htons(ETH_P_IPV6)) + return encap_ipv6(skb, IPPROTO_GRE, ETH_P_MPLS_UC); + else + return TC_ACT_OK; +} + +SEC("encap_ip6gre_eth") +int __encap_ip6gre_eth(struct __sk_buff *skb) +{ + if (skb->protocol == __bpf_constant_htons(ETH_P_IPV6)) + return encap_ipv6(skb, IPPROTO_GRE, ETH_P_TEB); + else + return TC_ACT_OK; +} + +SEC("encap_ip6udp_none") +int __encap_ip6udp_none(struct __sk_buff *skb) +{ + if (skb->protocol == __bpf_constant_htons(ETH_P_IPV6)) + return encap_ipv6(skb, IPPROTO_UDP, ETH_P_IPV6); + else + return TC_ACT_OK; +} + +SEC("encap_ip6udp_mpls") +int __encap_ip6udp_mpls(struct __sk_buff *skb) +{ + if (skb->protocol == __bpf_constant_htons(ETH_P_IPV6)) + return encap_ipv6(skb, IPPROTO_UDP, ETH_P_MPLS_UC); + else + return TC_ACT_OK; +} + +SEC("encap_ip6udp_eth") +int __encap_ip6udp_eth(struct __sk_buff *skb) +{ + if (skb->protocol == __bpf_constant_htons(ETH_P_IPV6)) + return encap_ipv6(skb, IPPROTO_UDP, ETH_P_TEB); else return TC_ACT_OK; } static int decap_internal(struct __sk_buff *skb, int off, int len, char proto) { - char buf[sizeof(struct grev6hdr)]; - int olen; + char buf[sizeof(struct v6hdr)]; + struct gre_hdr greh; + struct udphdr udph; + int olen = len; switch (proto) { case IPPROTO_IPIP: case IPPROTO_IPV6: - olen = len; break; case IPPROTO_GRE: - olen = len + 4 /* gre hdr */; + olen += sizeof(struct gre_hdr); + if (bpf_skb_load_bytes(skb, off + len, &greh, sizeof(greh)) < 0) + return TC_ACT_OK; + switch (bpf_ntohs(greh.protocol)) { + case ETH_P_MPLS_UC: + olen += sizeof(mpls_label); + break; + case ETH_P_TEB: + olen += ETH_HLEN; + break; + } + break; + case IPPROTO_UDP: + olen += sizeof(struct udphdr); + if (bpf_skb_load_bytes(skb, off + len, &udph, sizeof(udph)) < 0) + return TC_ACT_OK; + switch (bpf_ntohs(udph.dest)) { + case MPLS_OVER_UDP_PORT: + olen += sizeof(mpls_label); + break; + case ETH_OVER_UDP_PORT: + olen += ETH_HLEN; + break; + } break; default: return TC_ACT_OK; diff --git a/tools/testing/selftests/bpf/progs/test_verif_scale1.c b/tools/testing/selftests/bpf/progs/test_verif_scale1.c new file mode 100644 index 000000000000..f3236ce35f31 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_verif_scale1.c @@ -0,0 +1,30 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (c) 2019 Facebook +#include <linux/bpf.h> +#include "bpf_helpers.h" +#define ATTR __attribute__((noinline)) +#include "test_jhash.h" + +SEC("scale90_noinline") +int balancer_ingress(struct __sk_buff *ctx) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + void *ptr; + int ret = 0, nh_off, i = 0; + + nh_off = 14; + + /* pragma unroll doesn't work on large loops */ + +#define C do { \ + ptr = data + i; \ + if (ptr + nh_off > data_end) \ + break; \ + ctx->tc_index = jhash(ptr, nh_off, ctx->cb[0] + i++); \ + } while (0); +#define C30 C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C; + C30;C30;C30; /* 90 calls */ + return 0; +} +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/test_verif_scale2.c b/tools/testing/selftests/bpf/progs/test_verif_scale2.c new file mode 100644 index 000000000000..77830693eccb --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_verif_scale2.c @@ -0,0 +1,30 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (c) 2019 Facebook +#include <linux/bpf.h> +#include "bpf_helpers.h" +#define ATTR __attribute__((always_inline)) +#include "test_jhash.h" + +SEC("scale90_inline") +int balancer_ingress(struct __sk_buff *ctx) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + void *ptr; + int ret = 0, nh_off, i = 0; + + nh_off = 14; + + /* pragma unroll doesn't work on large loops */ + +#define C do { \ + ptr = data + i; \ + if (ptr + nh_off > data_end) \ + break; \ + ctx->tc_index = jhash(ptr, nh_off, ctx->cb[0] + i++); \ + } while (0); +#define C30 C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C; + C30;C30;C30; /* 90 calls */ + return 0; +} +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/test_verif_scale3.c b/tools/testing/selftests/bpf/progs/test_verif_scale3.c new file mode 100644 index 000000000000..1848da04ea41 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_verif_scale3.c @@ -0,0 +1,30 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (c) 2019 Facebook +#include <linux/bpf.h> +#include "bpf_helpers.h" +#define ATTR __attribute__((noinline)) +#include "test_jhash.h" + +SEC("scale90_noinline32") +int balancer_ingress(struct __sk_buff *ctx) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + void *ptr; + int ret = 0, nh_off, i = 0; + + nh_off = 32; + + /* pragma unroll doesn't work on large loops */ + +#define C do { \ + ptr = data + i; \ + if (ptr + nh_off > data_end) \ + break; \ + ctx->tc_index = jhash(ptr, nh_off, ctx->cb[0] + i++); \ + } while (0); +#define C30 C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C; + C30;C30;C30; /* 90 calls */ + return 0; +} +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c index ec5794e4205b..44cd3378d216 100644 --- a/tools/testing/selftests/bpf/test_btf.c +++ b/tools/testing/selftests/bpf/test_btf.c @@ -85,6 +85,11 @@ static int __base_pr(enum libbpf_print_level level __attribute__((unused)), #define BTF_UNION_ENC(name, nr_elems, sz) \ BTF_TYPE_ENC(name, BTF_INFO_ENC(BTF_KIND_UNION, 0, nr_elems), sz) +#define BTF_VAR_ENC(name, type, linkage) \ + BTF_TYPE_ENC(name, BTF_INFO_ENC(BTF_KIND_VAR, 0, 0), type), (linkage) +#define BTF_VAR_SECINFO_ENC(type, offset, size) \ + (type), (offset), (size) + #define BTF_MEMBER_ENC(name, type, bits_offset) \ (name), (type), (bits_offset) #define BTF_ENUM_ENC(name, val) (name), (val) @@ -291,7 +296,6 @@ static struct btf_raw_test raw_tests[] = { .value_type_id = 3, .max_entries = 4, }, - { .descr = "struct test #3 Invalid member offset", .raw_types = { @@ -319,7 +323,664 @@ static struct btf_raw_test raw_tests[] = { .btf_load_err = true, .err_str = "Invalid member bits_offset", }, - +/* + * struct A { + * unsigned long long m; + * int n; + * char o; + * [3 bytes hole] + * int p[8]; + * }; + */ +{ + .descr = "global data test #1", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + /* unsigned long long */ + BTF_TYPE_INT_ENC(0, 0, 0, 64, 8), /* [2] */ + /* char */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 8, 1), /* [3] */ + /* int[8] */ + BTF_TYPE_ARRAY_ENC(1, 1, 8), /* [4] */ + /* struct A { */ /* [5] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 4), 48), + BTF_MEMBER_ENC(NAME_TBD, 2, 0), /* unsigned long long m;*/ + BTF_MEMBER_ENC(NAME_TBD, 1, 64),/* int n; */ + BTF_MEMBER_ENC(NAME_TBD, 3, 96),/* char o; */ + BTF_MEMBER_ENC(NAME_TBD, 4, 128),/* int p[8] */ + /* } */ + BTF_END_RAW, + }, + .str_sec = "\0A\0m\0n\0o\0p", + .str_sec_size = sizeof("\0A\0m\0n\0o\0p"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = "struct_test1_map", + .key_size = sizeof(int), + .value_size = 48, + .key_type_id = 1, + .value_type_id = 5, + .max_entries = 4, +}, +/* + * struct A { + * unsigned long long m; + * int n; + * char o; + * [3 bytes hole] + * int p[8]; + * }; + * static struct A t; <- in .bss + */ +{ + .descr = "global data test #2", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + /* unsigned long long */ + BTF_TYPE_INT_ENC(0, 0, 0, 64, 8), /* [2] */ + /* char */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 8, 1), /* [3] */ + /* int[8] */ + BTF_TYPE_ARRAY_ENC(1, 1, 8), /* [4] */ + /* struct A { */ /* [5] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 4), 48), + BTF_MEMBER_ENC(NAME_TBD, 2, 0), /* unsigned long long m;*/ + BTF_MEMBER_ENC(NAME_TBD, 1, 64),/* int n; */ + BTF_MEMBER_ENC(NAME_TBD, 3, 96),/* char o; */ + BTF_MEMBER_ENC(NAME_TBD, 4, 128),/* int p[8] */ + /* } */ + /* static struct A t */ + BTF_VAR_ENC(NAME_TBD, 5, 0), /* [6] */ + /* .bss section */ /* [7] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 48), + BTF_VAR_SECINFO_ENC(6, 0, 48), + BTF_END_RAW, + }, + .str_sec = "\0A\0m\0n\0o\0p\0t\0.bss", + .str_sec_size = sizeof("\0A\0m\0n\0o\0p\0t\0.bss"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 48, + .key_type_id = 0, + .value_type_id = 7, + .max_entries = 1, +}, +{ + .descr = "global data test #3", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + /* static int t */ + BTF_VAR_ENC(NAME_TBD, 1, 0), /* [2] */ + /* .bss section */ /* [3] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 4), + BTF_VAR_SECINFO_ENC(2, 0, 4), + BTF_END_RAW, + }, + .str_sec = "\0t\0.bss", + .str_sec_size = sizeof("\0t\0.bss"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 4, + .key_type_id = 0, + .value_type_id = 3, + .max_entries = 1, +}, +{ + .descr = "global data test #4, unsupported linkage", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + /* static int t */ + BTF_VAR_ENC(NAME_TBD, 1, 2), /* [2] */ + /* .bss section */ /* [3] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 4), + BTF_VAR_SECINFO_ENC(2, 0, 4), + BTF_END_RAW, + }, + .str_sec = "\0t\0.bss", + .str_sec_size = sizeof("\0t\0.bss"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 4, + .key_type_id = 0, + .value_type_id = 3, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Linkage not supported", +}, +{ + .descr = "global data test #5, invalid var type", + .raw_types = { + /* static void t */ + BTF_VAR_ENC(NAME_TBD, 0, 0), /* [1] */ + /* .bss section */ /* [2] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 4), + BTF_VAR_SECINFO_ENC(1, 0, 4), + BTF_END_RAW, + }, + .str_sec = "\0t\0.bss", + .str_sec_size = sizeof("\0t\0.bss"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 4, + .key_type_id = 0, + .value_type_id = 2, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Invalid type_id", +}, +{ + .descr = "global data test #6, invalid var type (fwd type)", + .raw_types = { + /* union A */ + BTF_TYPE_ENC(NAME_TBD, + BTF_INFO_ENC(BTF_KIND_FWD, 1, 0), 0), /* [1] */ + /* static union A t */ + BTF_VAR_ENC(NAME_TBD, 1, 0), /* [2] */ + /* .bss section */ /* [3] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 4), + BTF_VAR_SECINFO_ENC(2, 0, 4), + BTF_END_RAW, + }, + .str_sec = "\0A\0t\0.bss", + .str_sec_size = sizeof("\0A\0t\0.bss"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 4, + .key_type_id = 0, + .value_type_id = 2, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Invalid type", +}, +{ + .descr = "global data test #7, invalid var type (fwd type)", + .raw_types = { + /* union A */ + BTF_TYPE_ENC(NAME_TBD, + BTF_INFO_ENC(BTF_KIND_FWD, 1, 0), 0), /* [1] */ + /* static union A t */ + BTF_VAR_ENC(NAME_TBD, 1, 0), /* [2] */ + /* .bss section */ /* [3] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 4), + BTF_VAR_SECINFO_ENC(1, 0, 4), + BTF_END_RAW, + }, + .str_sec = "\0A\0t\0.bss", + .str_sec_size = sizeof("\0A\0t\0.bss"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 4, + .key_type_id = 0, + .value_type_id = 2, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Invalid type", +}, +{ + .descr = "global data test #8, invalid var size", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + /* unsigned long long */ + BTF_TYPE_INT_ENC(0, 0, 0, 64, 8), /* [2] */ + /* char */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 8, 1), /* [3] */ + /* int[8] */ + BTF_TYPE_ARRAY_ENC(1, 1, 8), /* [4] */ + /* struct A { */ /* [5] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 4), 48), + BTF_MEMBER_ENC(NAME_TBD, 2, 0), /* unsigned long long m;*/ + BTF_MEMBER_ENC(NAME_TBD, 1, 64),/* int n; */ + BTF_MEMBER_ENC(NAME_TBD, 3, 96),/* char o; */ + BTF_MEMBER_ENC(NAME_TBD, 4, 128),/* int p[8] */ + /* } */ + /* static struct A t */ + BTF_VAR_ENC(NAME_TBD, 5, 0), /* [6] */ + /* .bss section */ /* [7] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 48), + BTF_VAR_SECINFO_ENC(6, 0, 47), + BTF_END_RAW, + }, + .str_sec = "\0A\0m\0n\0o\0p\0t\0.bss", + .str_sec_size = sizeof("\0A\0m\0n\0o\0p\0t\0.bss"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 48, + .key_type_id = 0, + .value_type_id = 7, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Invalid size", +}, +{ + .descr = "global data test #9, invalid var size", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + /* unsigned long long */ + BTF_TYPE_INT_ENC(0, 0, 0, 64, 8), /* [2] */ + /* char */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 8, 1), /* [3] */ + /* int[8] */ + BTF_TYPE_ARRAY_ENC(1, 1, 8), /* [4] */ + /* struct A { */ /* [5] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 4), 48), + BTF_MEMBER_ENC(NAME_TBD, 2, 0), /* unsigned long long m;*/ + BTF_MEMBER_ENC(NAME_TBD, 1, 64),/* int n; */ + BTF_MEMBER_ENC(NAME_TBD, 3, 96),/* char o; */ + BTF_MEMBER_ENC(NAME_TBD, 4, 128),/* int p[8] */ + /* } */ + /* static struct A t */ + BTF_VAR_ENC(NAME_TBD, 5, 0), /* [6] */ + /* .bss section */ /* [7] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 46), + BTF_VAR_SECINFO_ENC(6, 0, 48), + BTF_END_RAW, + }, + .str_sec = "\0A\0m\0n\0o\0p\0t\0.bss", + .str_sec_size = sizeof("\0A\0m\0n\0o\0p\0t\0.bss"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 48, + .key_type_id = 0, + .value_type_id = 7, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Invalid size", +}, +{ + .descr = "global data test #10, invalid var size", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + /* unsigned long long */ + BTF_TYPE_INT_ENC(0, 0, 0, 64, 8), /* [2] */ + /* char */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 8, 1), /* [3] */ + /* int[8] */ + BTF_TYPE_ARRAY_ENC(1, 1, 8), /* [4] */ + /* struct A { */ /* [5] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 4), 48), + BTF_MEMBER_ENC(NAME_TBD, 2, 0), /* unsigned long long m;*/ + BTF_MEMBER_ENC(NAME_TBD, 1, 64),/* int n; */ + BTF_MEMBER_ENC(NAME_TBD, 3, 96),/* char o; */ + BTF_MEMBER_ENC(NAME_TBD, 4, 128),/* int p[8] */ + /* } */ + /* static struct A t */ + BTF_VAR_ENC(NAME_TBD, 5, 0), /* [6] */ + /* .bss section */ /* [7] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 46), + BTF_VAR_SECINFO_ENC(6, 0, 46), + BTF_END_RAW, + }, + .str_sec = "\0A\0m\0n\0o\0p\0t\0.bss", + .str_sec_size = sizeof("\0A\0m\0n\0o\0p\0t\0.bss"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 48, + .key_type_id = 0, + .value_type_id = 7, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Invalid size", +}, +{ + .descr = "global data test #11, multiple section members", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + /* unsigned long long */ + BTF_TYPE_INT_ENC(0, 0, 0, 64, 8), /* [2] */ + /* char */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 8, 1), /* [3] */ + /* int[8] */ + BTF_TYPE_ARRAY_ENC(1, 1, 8), /* [4] */ + /* struct A { */ /* [5] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 4), 48), + BTF_MEMBER_ENC(NAME_TBD, 2, 0), /* unsigned long long m;*/ + BTF_MEMBER_ENC(NAME_TBD, 1, 64),/* int n; */ + BTF_MEMBER_ENC(NAME_TBD, 3, 96),/* char o; */ + BTF_MEMBER_ENC(NAME_TBD, 4, 128),/* int p[8] */ + /* } */ + /* static struct A t */ + BTF_VAR_ENC(NAME_TBD, 5, 0), /* [6] */ + /* static int u */ + BTF_VAR_ENC(NAME_TBD, 1, 0), /* [7] */ + /* .bss section */ /* [8] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 2), 62), + BTF_VAR_SECINFO_ENC(6, 10, 48), + BTF_VAR_SECINFO_ENC(7, 58, 4), + BTF_END_RAW, + }, + .str_sec = "\0A\0m\0n\0o\0p\0t\0u\0.bss", + .str_sec_size = sizeof("\0A\0m\0n\0o\0p\0t\0u\0.bss"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 62, + .key_type_id = 0, + .value_type_id = 8, + .max_entries = 1, +}, +{ + .descr = "global data test #12, invalid offset", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + /* unsigned long long */ + BTF_TYPE_INT_ENC(0, 0, 0, 64, 8), /* [2] */ + /* char */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 8, 1), /* [3] */ + /* int[8] */ + BTF_TYPE_ARRAY_ENC(1, 1, 8), /* [4] */ + /* struct A { */ /* [5] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 4), 48), + BTF_MEMBER_ENC(NAME_TBD, 2, 0), /* unsigned long long m;*/ + BTF_MEMBER_ENC(NAME_TBD, 1, 64),/* int n; */ + BTF_MEMBER_ENC(NAME_TBD, 3, 96),/* char o; */ + BTF_MEMBER_ENC(NAME_TBD, 4, 128),/* int p[8] */ + /* } */ + /* static struct A t */ + BTF_VAR_ENC(NAME_TBD, 5, 0), /* [6] */ + /* static int u */ + BTF_VAR_ENC(NAME_TBD, 1, 0), /* [7] */ + /* .bss section */ /* [8] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 2), 62), + BTF_VAR_SECINFO_ENC(6, 10, 48), + BTF_VAR_SECINFO_ENC(7, 60, 4), + BTF_END_RAW, + }, + .str_sec = "\0A\0m\0n\0o\0p\0t\0u\0.bss", + .str_sec_size = sizeof("\0A\0m\0n\0o\0p\0t\0u\0.bss"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 62, + .key_type_id = 0, + .value_type_id = 8, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Invalid offset+size", +}, +{ + .descr = "global data test #13, invalid offset", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + /* unsigned long long */ + BTF_TYPE_INT_ENC(0, 0, 0, 64, 8), /* [2] */ + /* char */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 8, 1), /* [3] */ + /* int[8] */ + BTF_TYPE_ARRAY_ENC(1, 1, 8), /* [4] */ + /* struct A { */ /* [5] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 4), 48), + BTF_MEMBER_ENC(NAME_TBD, 2, 0), /* unsigned long long m;*/ + BTF_MEMBER_ENC(NAME_TBD, 1, 64),/* int n; */ + BTF_MEMBER_ENC(NAME_TBD, 3, 96),/* char o; */ + BTF_MEMBER_ENC(NAME_TBD, 4, 128),/* int p[8] */ + /* } */ + /* static struct A t */ + BTF_VAR_ENC(NAME_TBD, 5, 0), /* [6] */ + /* static int u */ + BTF_VAR_ENC(NAME_TBD, 1, 0), /* [7] */ + /* .bss section */ /* [8] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 2), 62), + BTF_VAR_SECINFO_ENC(6, 10, 48), + BTF_VAR_SECINFO_ENC(7, 12, 4), + BTF_END_RAW, + }, + .str_sec = "\0A\0m\0n\0o\0p\0t\0u\0.bss", + .str_sec_size = sizeof("\0A\0m\0n\0o\0p\0t\0u\0.bss"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 62, + .key_type_id = 0, + .value_type_id = 8, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Invalid offset", +}, +{ + .descr = "global data test #14, invalid offset", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + /* unsigned long long */ + BTF_TYPE_INT_ENC(0, 0, 0, 64, 8), /* [2] */ + /* char */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 8, 1), /* [3] */ + /* int[8] */ + BTF_TYPE_ARRAY_ENC(1, 1, 8), /* [4] */ + /* struct A { */ /* [5] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 4), 48), + BTF_MEMBER_ENC(NAME_TBD, 2, 0), /* unsigned long long m;*/ + BTF_MEMBER_ENC(NAME_TBD, 1, 64),/* int n; */ + BTF_MEMBER_ENC(NAME_TBD, 3, 96),/* char o; */ + BTF_MEMBER_ENC(NAME_TBD, 4, 128),/* int p[8] */ + /* } */ + /* static struct A t */ + BTF_VAR_ENC(NAME_TBD, 5, 0), /* [6] */ + /* static int u */ + BTF_VAR_ENC(NAME_TBD, 1, 0), /* [7] */ + /* .bss section */ /* [8] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 2), 62), + BTF_VAR_SECINFO_ENC(7, 58, 4), + BTF_VAR_SECINFO_ENC(6, 10, 48), + BTF_END_RAW, + }, + .str_sec = "\0A\0m\0n\0o\0p\0t\0u\0.bss", + .str_sec_size = sizeof("\0A\0m\0n\0o\0p\0t\0u\0.bss"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 62, + .key_type_id = 0, + .value_type_id = 8, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Invalid offset", +}, +{ + .descr = "global data test #15, not var kind", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + BTF_VAR_ENC(NAME_TBD, 1, 0), /* [2] */ + /* .bss section */ /* [3] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 4), + BTF_VAR_SECINFO_ENC(1, 0, 4), + BTF_END_RAW, + }, + .str_sec = "\0A\0t\0.bss", + .str_sec_size = sizeof("\0A\0t\0.bss"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 4, + .key_type_id = 0, + .value_type_id = 3, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Not a VAR kind member", +}, +{ + .descr = "global data test #16, invalid var referencing sec", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + BTF_VAR_ENC(NAME_TBD, 5, 0), /* [2] */ + BTF_VAR_ENC(NAME_TBD, 2, 0), /* [3] */ + /* a section */ /* [4] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 4), + BTF_VAR_SECINFO_ENC(3, 0, 4), + /* a section */ /* [5] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 4), + BTF_VAR_SECINFO_ENC(6, 0, 4), + BTF_VAR_ENC(NAME_TBD, 1, 0), /* [6] */ + BTF_END_RAW, + }, + .str_sec = "\0A\0t\0s\0a\0a", + .str_sec_size = sizeof("\0A\0t\0s\0a\0a"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 4, + .key_type_id = 0, + .value_type_id = 4, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Invalid type_id", +}, +{ + .descr = "global data test #17, invalid var referencing var", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + BTF_VAR_ENC(NAME_TBD, 1, 0), /* [2] */ + BTF_VAR_ENC(NAME_TBD, 2, 0), /* [3] */ + /* a section */ /* [4] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 4), + BTF_VAR_SECINFO_ENC(3, 0, 4), + BTF_END_RAW, + }, + .str_sec = "\0A\0t\0s\0a\0a", + .str_sec_size = sizeof("\0A\0t\0s\0a\0a"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 4, + .key_type_id = 0, + .value_type_id = 4, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Invalid type_id", +}, +{ + .descr = "global data test #18, invalid var loop", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + BTF_VAR_ENC(NAME_TBD, 2, 0), /* [2] */ + /* .bss section */ /* [3] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 4), + BTF_VAR_SECINFO_ENC(2, 0, 4), + BTF_END_RAW, + }, + .str_sec = "\0A\0t\0aaa", + .str_sec_size = sizeof("\0A\0t\0aaa"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 4, + .key_type_id = 0, + .value_type_id = 4, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Invalid type_id", +}, +{ + .descr = "global data test #19, invalid var referencing var", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + BTF_VAR_ENC(NAME_TBD, 3, 0), /* [2] */ + BTF_VAR_ENC(NAME_TBD, 1, 0), /* [3] */ + BTF_END_RAW, + }, + .str_sec = "\0A\0t\0s\0a\0a", + .str_sec_size = sizeof("\0A\0t\0s\0a\0a"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 4, + .key_type_id = 0, + .value_type_id = 4, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Invalid type_id", +}, +{ + .descr = "global data test #20, invalid ptr referencing var", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + /* PTR type_id=3 */ /* [2] */ + BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 3), + BTF_VAR_ENC(NAME_TBD, 1, 0), /* [3] */ + BTF_END_RAW, + }, + .str_sec = "\0A\0t\0s\0a\0a", + .str_sec_size = sizeof("\0A\0t\0s\0a\0a"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 4, + .key_type_id = 0, + .value_type_id = 4, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Invalid type_id", +}, +{ + .descr = "global data test #21, var included in struct", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + /* struct A { */ /* [2] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 2), sizeof(int) * 2), + BTF_MEMBER_ENC(NAME_TBD, 1, 0), /* int m; */ + BTF_MEMBER_ENC(NAME_TBD, 3, 32),/* VAR type_id=3; */ + /* } */ + BTF_VAR_ENC(NAME_TBD, 1, 0), /* [3] */ + BTF_END_RAW, + }, + .str_sec = "\0A\0t\0s\0a\0a", + .str_sec_size = sizeof("\0A\0t\0s\0a\0a"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 4, + .key_type_id = 0, + .value_type_id = 4, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Invalid member", +}, +{ + .descr = "global data test #22, array of var", + .raw_types = { + /* int */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + BTF_TYPE_ARRAY_ENC(3, 1, 4), /* [2] */ + BTF_VAR_ENC(NAME_TBD, 1, 0), /* [3] */ + BTF_END_RAW, + }, + .str_sec = "\0A\0t\0s\0a\0a", + .str_sec_size = sizeof("\0A\0t\0s\0a\0a"), + .map_type = BPF_MAP_TYPE_ARRAY, + .map_name = ".bss", + .key_size = sizeof(int), + .value_size = 4, + .key_type_id = 0, + .value_type_id = 4, + .max_entries = 1, + .btf_load_err = true, + .err_str = "Invalid elem", +}, /* Test member exceeds the size of struct. * * struct A { @@ -3677,6 +4338,7 @@ struct pprint_mapv { } aenum; uint32_t ui32b; uint32_t bits2c:2; + uint8_t si8_4[2][2]; }; #ifdef __SIZEOF_INT128__ @@ -3729,7 +4391,7 @@ static struct btf_raw_test pprint_test_template[] = { BTF_ENUM_ENC(NAME_TBD, 2), BTF_ENUM_ENC(NAME_TBD, 3), /* struct pprint_mapv */ /* [16] */ - BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 10), 40), + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 0, 11), 40), BTF_MEMBER_ENC(NAME_TBD, 11, 0), /* uint32_t ui32 */ BTF_MEMBER_ENC(NAME_TBD, 10, 32), /* uint16_t ui16 */ BTF_MEMBER_ENC(NAME_TBD, 12, 64), /* int32_t si32 */ @@ -3740,9 +4402,12 @@ static struct btf_raw_test pprint_test_template[] = { BTF_MEMBER_ENC(NAME_TBD, 15, 192), /* aenum */ BTF_MEMBER_ENC(NAME_TBD, 11, 224), /* uint32_t ui32b */ BTF_MEMBER_ENC(NAME_TBD, 6, 256), /* bits2c */ + BTF_MEMBER_ENC(NAME_TBD, 17, 264), /* si8_4 */ + BTF_TYPE_ARRAY_ENC(18, 1, 2), /* [17] */ + BTF_TYPE_ARRAY_ENC(1, 1, 2), /* [18] */ BTF_END_RAW, }, - BTF_STR_SEC("\0unsigned char\0unsigned short\0unsigned int\0int\0unsigned long long\0uint8_t\0uint16_t\0uint32_t\0int32_t\0uint64_t\0ui64\0ui8a\0ENUM_ZERO\0ENUM_ONE\0ENUM_TWO\0ENUM_THREE\0pprint_mapv\0ui32\0ui16\0si32\0unused_bits2a\0bits28\0unused_bits2b\0aenum\0ui32b\0bits2c"), + BTF_STR_SEC("\0unsigned char\0unsigned short\0unsigned int\0int\0unsigned long long\0uint8_t\0uint16_t\0uint32_t\0int32_t\0uint64_t\0ui64\0ui8a\0ENUM_ZERO\0ENUM_ONE\0ENUM_TWO\0ENUM_THREE\0pprint_mapv\0ui32\0ui16\0si32\0unused_bits2a\0bits28\0unused_bits2b\0aenum\0ui32b\0bits2c\0si8_4"), .key_size = sizeof(unsigned int), .value_size = sizeof(struct pprint_mapv), .key_type_id = 3, /* unsigned int */ @@ -3791,7 +4456,7 @@ static struct btf_raw_test pprint_test_template[] = { BTF_ENUM_ENC(NAME_TBD, 2), BTF_ENUM_ENC(NAME_TBD, 3), /* struct pprint_mapv */ /* [16] */ - BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 1, 10), 40), + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 1, 11), 40), BTF_MEMBER_ENC(NAME_TBD, 11, BTF_MEMBER_OFFSET(0, 0)), /* uint32_t ui32 */ BTF_MEMBER_ENC(NAME_TBD, 10, BTF_MEMBER_OFFSET(0, 32)), /* uint16_t ui16 */ BTF_MEMBER_ENC(NAME_TBD, 12, BTF_MEMBER_OFFSET(0, 64)), /* int32_t si32 */ @@ -3802,9 +4467,12 @@ static struct btf_raw_test pprint_test_template[] = { BTF_MEMBER_ENC(NAME_TBD, 15, BTF_MEMBER_OFFSET(0, 192)), /* aenum */ BTF_MEMBER_ENC(NAME_TBD, 11, BTF_MEMBER_OFFSET(0, 224)), /* uint32_t ui32b */ BTF_MEMBER_ENC(NAME_TBD, 6, BTF_MEMBER_OFFSET(2, 256)), /* bits2c */ + BTF_MEMBER_ENC(NAME_TBD, 17, 264), /* si8_4 */ + BTF_TYPE_ARRAY_ENC(18, 1, 2), /* [17] */ + BTF_TYPE_ARRAY_ENC(1, 1, 2), /* [18] */ BTF_END_RAW, }, - BTF_STR_SEC("\0unsigned char\0unsigned short\0unsigned int\0int\0unsigned long long\0uint8_t\0uint16_t\0uint32_t\0int32_t\0uint64_t\0ui64\0ui8a\0ENUM_ZERO\0ENUM_ONE\0ENUM_TWO\0ENUM_THREE\0pprint_mapv\0ui32\0ui16\0si32\0unused_bits2a\0bits28\0unused_bits2b\0aenum\0ui32b\0bits2c"), + BTF_STR_SEC("\0unsigned char\0unsigned short\0unsigned int\0int\0unsigned long long\0uint8_t\0uint16_t\0uint32_t\0int32_t\0uint64_t\0ui64\0ui8a\0ENUM_ZERO\0ENUM_ONE\0ENUM_TWO\0ENUM_THREE\0pprint_mapv\0ui32\0ui16\0si32\0unused_bits2a\0bits28\0unused_bits2b\0aenum\0ui32b\0bits2c\0si8_4"), .key_size = sizeof(unsigned int), .value_size = sizeof(struct pprint_mapv), .key_type_id = 3, /* unsigned int */ @@ -3855,7 +4523,7 @@ static struct btf_raw_test pprint_test_template[] = { BTF_ENUM_ENC(NAME_TBD, 2), BTF_ENUM_ENC(NAME_TBD, 3), /* struct pprint_mapv */ /* [16] */ - BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 1, 10), 40), + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_STRUCT, 1, 11), 40), BTF_MEMBER_ENC(NAME_TBD, 11, BTF_MEMBER_OFFSET(0, 0)), /* uint32_t ui32 */ BTF_MEMBER_ENC(NAME_TBD, 10, BTF_MEMBER_OFFSET(0, 32)), /* uint16_t ui16 */ BTF_MEMBER_ENC(NAME_TBD, 12, BTF_MEMBER_OFFSET(0, 64)), /* int32_t si32 */ @@ -3866,13 +4534,16 @@ static struct btf_raw_test pprint_test_template[] = { BTF_MEMBER_ENC(NAME_TBD, 15, BTF_MEMBER_OFFSET(0, 192)), /* aenum */ BTF_MEMBER_ENC(NAME_TBD, 11, BTF_MEMBER_OFFSET(0, 224)), /* uint32_t ui32b */ BTF_MEMBER_ENC(NAME_TBD, 17, BTF_MEMBER_OFFSET(2, 256)), /* bits2c */ + BTF_MEMBER_ENC(NAME_TBD, 20, BTF_MEMBER_OFFSET(0, 264)), /* si8_4 */ /* typedef unsigned int ___int */ /* [17] */ BTF_TYPEDEF_ENC(NAME_TBD, 18), BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_VOLATILE, 0, 0), 6), /* [18] */ BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_CONST, 0, 0), 15), /* [19] */ + BTF_TYPE_ARRAY_ENC(21, 1, 2), /* [20] */ + BTF_TYPE_ARRAY_ENC(1, 1, 2), /* [21] */ BTF_END_RAW, }, - BTF_STR_SEC("\0unsigned char\0unsigned short\0unsigned int\0int\0unsigned long long\0uint8_t\0uint16_t\0uint32_t\0int32_t\0uint64_t\0ui64\0ui8a\0ENUM_ZERO\0ENUM_ONE\0ENUM_TWO\0ENUM_THREE\0pprint_mapv\0ui32\0ui16\0si32\0unused_bits2a\0bits28\0unused_bits2b\0aenum\0ui32b\0bits2c\0___int"), + BTF_STR_SEC("\0unsigned char\0unsigned short\0unsigned int\0int\0unsigned long long\0uint8_t\0uint16_t\0uint32_t\0int32_t\0uint64_t\0ui64\0ui8a\0ENUM_ZERO\0ENUM_ONE\0ENUM_TWO\0ENUM_THREE\0pprint_mapv\0ui32\0ui16\0si32\0unused_bits2a\0bits28\0unused_bits2b\0aenum\0ui32b\0bits2c\0___int\0si8_4"), .key_size = sizeof(unsigned int), .value_size = sizeof(struct pprint_mapv), .key_type_id = 3, /* unsigned int */ @@ -4007,6 +4678,10 @@ static void set_pprint_mapv(enum pprint_mapv_kind_t mapv_kind, v->aenum = i & 0x03; v->ui32b = 4; v->bits2c = 1; + v->si8_4[0][0] = (cpu + i) & 0xff; + v->si8_4[0][1] = (cpu + i + 1) & 0xff; + v->si8_4[1][0] = (cpu + i + 2) & 0xff; + v->si8_4[1][1] = (cpu + i + 3) & 0xff; v = (void *)v + rounded_value_size; } } @@ -4040,7 +4715,7 @@ ssize_t get_pprint_expected_line(enum pprint_mapv_kind_t mapv_kind, nexpected_line = snprintf(expected_line, line_size, "%s%u: {%u,0,%d,0x%x,0x%x,0x%x," "{%lu|[%u,%u,%u,%u,%u,%u,%u,%u]},%s," - "%u,0x%x}\n", + "%u,0x%x,[[%d,%d],[%d,%d]]}\n", percpu_map ? "\tcpu" : "", percpu_map ? cpu : next_key, v->ui32, v->si32, @@ -4054,7 +4729,9 @@ ssize_t get_pprint_expected_line(enum pprint_mapv_kind_t mapv_kind, v->ui8a[6], v->ui8a[7], pprint_enum_str[v->aenum], v->ui32b, - v->bits2c); + v->bits2c, + v->si8_4[0][0], v->si8_4[0][1], + v->si8_4[1][0], v->si8_4[1][1]); } #ifdef __SIZEOF_INT128__ diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c index 5d10aee9e277..bf5c90998916 100644 --- a/tools/testing/selftests/bpf/test_progs.c +++ b/tools/testing/selftests/bpf/test_progs.c @@ -9,6 +9,7 @@ int error_cnt, pass_cnt; bool jit_enabled; +bool verifier_stats = false; struct ipv4_packet pkt_v4 = { .eth.h_proto = __bpf_constant_htons(ETH_P_IP), @@ -162,12 +163,15 @@ void *spin_lock_thread(void *arg) #include <prog_tests/tests.h> #undef DECLARE -int main(void) +int main(int ac, char **av) { srand(time(NULL)); jit_enabled = is_jit_enabled(); + if (ac == 2 && strcmp(av[1], "-s") == 0) + verifier_stats = true; + #define CALL #include <prog_tests/tests.h> #undef CALL diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h index 51a07367cd43..f095e1d4c657 100644 --- a/tools/testing/selftests/bpf/test_progs.h +++ b/tools/testing/selftests/bpf/test_progs.h @@ -40,6 +40,7 @@ typedef __u16 __sum16; extern int error_cnt, pass_cnt; extern bool jit_enabled; +extern bool verifier_stats; #define MAGIC_BYTES 123 diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh index c805adb88f3a..d4d8d5d3b06e 100755 --- a/tools/testing/selftests/bpf/test_tc_tunnel.sh +++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh @@ -15,6 +15,12 @@ readonly ns2_v4=192.168.1.2 readonly ns1_v6=fd::1 readonly ns2_v6=fd::2 +# Must match port used by bpf program +readonly udpport=5555 +# MPLSoverUDP +readonly mplsudpport=6635 +readonly mplsproto=137 + readonly infile="$(mktemp)" readonly outfile="$(mktemp)" @@ -38,8 +44,8 @@ setup() { # clamp route to reserve room for tunnel headers ip -netns "${ns1}" -4 route flush table main ip -netns "${ns1}" -6 route flush table main - ip -netns "${ns1}" -4 route add "${ns2_v4}" mtu 1476 dev veth1 - ip -netns "${ns1}" -6 route add "${ns2_v6}" mtu 1456 dev veth1 + ip -netns "${ns1}" -4 route add "${ns2_v4}" mtu 1458 dev veth1 + ip -netns "${ns1}" -6 route add "${ns2_v6}" mtu 1438 dev veth1 sleep 1 @@ -86,30 +92,44 @@ set -e # no arguments: automated test, run all if [[ "$#" -eq "0" ]]; then echo "ipip" - $0 ipv4 ipip 100 + $0 ipv4 ipip none 100 echo "ip6ip6" - $0 ipv6 ip6tnl 100 + $0 ipv6 ip6tnl none 100 + + for mac in none mpls eth ; do + echo "ip gre $mac" + $0 ipv4 gre $mac 100 + + echo "ip6 gre $mac" + $0 ipv6 ip6gre $mac 100 + + echo "ip gre $mac gso" + $0 ipv4 gre $mac 2000 - echo "ip gre" - $0 ipv4 gre 100 + echo "ip6 gre $mac gso" + $0 ipv6 ip6gre $mac 2000 - echo "ip6 gre" - $0 ipv6 ip6gre 100 + echo "ip udp $mac" + $0 ipv4 udp $mac 100 - echo "ip gre gso" - $0 ipv4 gre 2000 + echo "ip6 udp $mac" + $0 ipv6 ip6udp $mac 100 - echo "ip6 gre gso" - $0 ipv6 ip6gre 2000 + echo "ip udp $mac gso" + $0 ipv4 udp $mac 2000 + + echo "ip6 udp $mac gso" + $0 ipv6 ip6udp $mac 2000 + done echo "OK. All tests passed" exit 0 fi -if [[ "$#" -ne "3" ]]; then +if [[ "$#" -ne "4" ]]; then echo "Usage: $0" - echo " or: $0 <ipv4|ipv6> <tuntype> <data_len>" + echo " or: $0 <ipv4|ipv6> <tuntype> <none|mpls|eth> <data_len>" exit 1 fi @@ -117,12 +137,24 @@ case "$1" in "ipv4") readonly addr1="${ns1_v4}" readonly addr2="${ns2_v4}" - readonly netcat_opt=-4 + readonly ipproto=4 + readonly netcat_opt=-${ipproto} + readonly foumod=fou + readonly foutype=ipip + readonly fouproto=4 + readonly fouproto_mpls=${mplsproto} + readonly gretaptype=gretap ;; "ipv6") readonly addr1="${ns1_v6}" readonly addr2="${ns2_v6}" - readonly netcat_opt=-6 + readonly ipproto=6 + readonly netcat_opt=-${ipproto} + readonly foumod=fou6 + readonly foutype=ip6tnl + readonly fouproto="41 -6" + readonly fouproto_mpls="${mplsproto} -6" + readonly gretaptype=ip6gretap ;; *) echo "unknown arg: $1" @@ -131,9 +163,10 @@ case "$1" in esac readonly tuntype=$2 -readonly datalen=$3 +readonly mac=$3 +readonly datalen=$4 -echo "encap ${addr1} to ${addr2}, type ${tuntype}, len ${datalen}" +echo "encap ${addr1} to ${addr2}, type ${tuntype}, mac ${mac} len ${datalen}" trap cleanup EXIT @@ -150,16 +183,63 @@ verify_data ip netns exec "${ns1}" tc qdisc add dev veth1 clsact ip netns exec "${ns1}" tc filter add dev veth1 egress \ bpf direct-action object-file ./test_tc_tunnel.o \ - section "encap_${tuntype}" + section "encap_${tuntype}_${mac}" echo "test bpf encap without decap (expect failure)" server_listen ! client_connect +if [[ "$tuntype" =~ "udp" ]]; then + # Set up fou tunnel. + ttype="${foutype}" + targs="encap fou encap-sport auto encap-dport $udpport" + # fou may be a module; allow this to fail. + modprobe "${foumod}" ||true + if [[ "$mac" == "mpls" ]]; then + dport=${mplsudpport} + dproto=${fouproto_mpls} + tmode="mode any ttl 255" + else + dport=${udpport} + dproto=${fouproto} + fi + ip netns exec "${ns2}" ip fou add port $dport ipproto ${dproto} + targs="encap fou encap-sport auto encap-dport $dport" +elif [[ "$tuntype" =~ "gre" && "$mac" == "eth" ]]; then + ttype=$gretaptype +else + ttype=$tuntype + targs="" +fi + # serverside, insert decap module # server is still running # client can connect again -ip netns exec "${ns2}" ip link add dev testtun0 type "${tuntype}" \ - remote "${addr1}" local "${addr2}" +ip netns exec "${ns2}" ip link add name testtun0 type "${ttype}" \ + ${tmode} remote "${addr1}" local "${addr2}" $targs + +expect_tun_fail=0 + +if [[ "$tuntype" == "ip6udp" && "$mac" == "mpls" ]]; then + # No support for MPLS IPv6 fou tunnel; expect failure. + expect_tun_fail=1 +elif [[ "$tuntype" =~ "udp" && "$mac" == "eth" ]]; then + # No support for TEB fou tunnel; expect failure. + expect_tun_fail=1 +elif [[ "$tuntype" =~ "gre" && "$mac" == "eth" ]]; then + # Share ethernet address between tunnel/veth2 so L2 decap works. + ethaddr=$(ip netns exec "${ns2}" ip link show veth2 | \ + awk '/ether/ { print $2 }') + ip netns exec "${ns2}" ip link set testtun0 address $ethaddr +elif [[ "$mac" == "mpls" ]]; then + modprobe mpls_iptunnel ||true + modprobe mpls_gso ||true + ip netns exec "${ns2}" sysctl -qw net.mpls.platform_labels=65536 + ip netns exec "${ns2}" ip -f mpls route add 1000 dev lo + ip netns exec "${ns2}" ip link set lo up + ip netns exec "${ns2}" sysctl -qw net.mpls.conf.testtun0.input=1 + ip netns exec "${ns2}" sysctl -qw net.ipv4.conf.lo.rp_filter=0 +fi + # Because packets are decapped by the tunnel they arrive on testtun0 from # the IP stack perspective. Ensure reverse path filtering is disabled # otherwise we drop the TCP SYN as arriving on testtun0 instead of the @@ -169,16 +249,22 @@ ip netns exec "${ns2}" sysctl -qw net.ipv4.conf.all.rp_filter=0 # selected as the max of the "all" and device-specific values. ip netns exec "${ns2}" sysctl -qw net.ipv4.conf.testtun0.rp_filter=0 ip netns exec "${ns2}" ip link set dev testtun0 up -echo "test bpf encap with tunnel device decap" -client_connect -verify_data +if [[ "$expect_tun_fail" == 1 ]]; then + # This tunnel mode is not supported, so we expect failure. + echo "test bpf encap with tunnel device decap (expect failure)" + ! client_connect +else + echo "test bpf encap with tunnel device decap" + client_connect + verify_data + server_listen +fi # serverside, use BPF for decap ip netns exec "${ns2}" ip link del dev testtun0 ip netns exec "${ns2}" tc qdisc add dev veth2 clsact ip netns exec "${ns2}" tc filter add dev veth2 ingress \ bpf direct-action object-file ./test_tc_tunnel.o section decap -server_listen echo "test bpf encap with bpf decap" client_connect verify_data diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c index 19b5d03acc2a..e2ebcaddbe78 100644 --- a/tools/testing/selftests/bpf/test_verifier.c +++ b/tools/testing/selftests/bpf/test_verifier.c @@ -50,8 +50,9 @@ #include "../../../include/linux/filter.h" #define MAX_INSNS BPF_MAXINSNS +#define MAX_TEST_INSNS 1000000 #define MAX_FIXUPS 8 -#define MAX_NR_MAPS 14 +#define MAX_NR_MAPS 16 #define MAX_TEST_RUNS 8 #define POINTER_VALUE 0xcafe4all #define TEST_DATA_LEN 64 @@ -66,6 +67,7 @@ static int skips; struct bpf_test { const char *descr; struct bpf_insn insns[MAX_INSNS]; + struct bpf_insn *fill_insns; int fixup_map_hash_8b[MAX_FIXUPS]; int fixup_map_hash_48b[MAX_FIXUPS]; int fixup_map_hash_16b[MAX_FIXUPS]; @@ -80,9 +82,13 @@ struct bpf_test { int fixup_cgroup_storage[MAX_FIXUPS]; int fixup_percpu_cgroup_storage[MAX_FIXUPS]; int fixup_map_spin_lock[MAX_FIXUPS]; + int fixup_map_array_ro[MAX_FIXUPS]; + int fixup_map_array_wo[MAX_FIXUPS]; + int fixup_map_array_small[MAX_FIXUPS]; const char *errstr; const char *errstr_unpriv; uint32_t retval, retval_unpriv, insn_processed; + int prog_len; enum { UNDEF, ACCEPT, @@ -119,10 +125,11 @@ struct other_val { static void bpf_fill_ld_abs_vlan_push_pop(struct bpf_test *self) { - /* test: {skb->data[0], vlan_push} x 68 + {skb->data[0], vlan_pop} x 68 */ + /* test: {skb->data[0], vlan_push} x 51 + {skb->data[0], vlan_pop} x 51 */ #define PUSH_CNT 51 - unsigned int len = BPF_MAXINSNS; - struct bpf_insn *insn = self->insns; + /* jump range is limited to 16 bit. PUSH_CNT of ld_abs needs room */ + unsigned int len = (1 << 15) - PUSH_CNT * 2 * 5 * 6; + struct bpf_insn *insn = self->fill_insns; int i = 0, j, k = 0; insn[i++] = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1); @@ -156,12 +163,14 @@ loop: for (; i < len - 1; i++) insn[i] = BPF_ALU32_IMM(BPF_MOV, BPF_REG_0, 0xbef); insn[len - 1] = BPF_EXIT_INSN(); + self->prog_len = len; } static void bpf_fill_jump_around_ld_abs(struct bpf_test *self) { - struct bpf_insn *insn = self->insns; - unsigned int len = BPF_MAXINSNS; + struct bpf_insn *insn = self->fill_insns; + /* jump range is limited to 16 bit. every ld_abs is replaced by 6 insns */ + unsigned int len = (1 << 15) / 6; int i = 0; insn[i++] = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1); @@ -171,11 +180,12 @@ static void bpf_fill_jump_around_ld_abs(struct bpf_test *self) while (i < len - 1) insn[i++] = BPF_LD_ABS(BPF_B, 1); insn[i] = BPF_EXIT_INSN(); + self->prog_len = i + 1; } static void bpf_fill_rand_ld_dw(struct bpf_test *self) { - struct bpf_insn *insn = self->insns; + struct bpf_insn *insn = self->fill_insns; uint64_t res = 0; int i = 0; @@ -193,6 +203,7 @@ static void bpf_fill_rand_ld_dw(struct bpf_test *self) insn[i++] = BPF_ALU64_IMM(BPF_RSH, BPF_REG_1, 32); insn[i++] = BPF_ALU64_REG(BPF_XOR, BPF_REG_0, BPF_REG_1); insn[i] = BPF_EXIT_INSN(); + self->prog_len = i + 1; res ^= (res >> 32); self->retval = (uint32_t)res; } @@ -277,13 +288,15 @@ static bool skip_unsupported_map(enum bpf_map_type map_type) return false; } -static int create_map(uint32_t type, uint32_t size_key, - uint32_t size_value, uint32_t max_elem) +static int __create_map(uint32_t type, uint32_t size_key, + uint32_t size_value, uint32_t max_elem, + uint32_t extra_flags) { int fd; fd = bpf_create_map(type, size_key, size_value, max_elem, - type == BPF_MAP_TYPE_HASH ? BPF_F_NO_PREALLOC : 0); + (type == BPF_MAP_TYPE_HASH ? + BPF_F_NO_PREALLOC : 0) | extra_flags); if (fd < 0) { if (skip_unsupported_map(type)) return -1; @@ -293,6 +306,12 @@ static int create_map(uint32_t type, uint32_t size_key, return fd; } +static int create_map(uint32_t type, uint32_t size_key, + uint32_t size_value, uint32_t max_elem) +{ + return __create_map(type, size_key, size_value, max_elem, 0); +} + static void update_map(int fd, int index) { struct test_val value = { @@ -519,9 +538,14 @@ static void do_test_fixup(struct bpf_test *test, enum bpf_prog_type prog_type, int *fixup_cgroup_storage = test->fixup_cgroup_storage; int *fixup_percpu_cgroup_storage = test->fixup_percpu_cgroup_storage; int *fixup_map_spin_lock = test->fixup_map_spin_lock; + int *fixup_map_array_ro = test->fixup_map_array_ro; + int *fixup_map_array_wo = test->fixup_map_array_wo; + int *fixup_map_array_small = test->fixup_map_array_small; - if (test->fill_helper) + if (test->fill_helper) { + test->fill_insns = calloc(MAX_TEST_INSNS, sizeof(struct bpf_insn)); test->fill_helper(test); + } /* Allocating HTs with 1 elem is fine here, since we only test * for verifier and not do a runtime lookup, so the only thing @@ -642,6 +666,35 @@ static void do_test_fixup(struct bpf_test *test, enum bpf_prog_type prog_type, fixup_map_spin_lock++; } while (*fixup_map_spin_lock); } + if (*fixup_map_array_ro) { + map_fds[14] = __create_map(BPF_MAP_TYPE_ARRAY, sizeof(int), + sizeof(struct test_val), 1, + BPF_F_RDONLY_PROG); + update_map(map_fds[14], 0); + do { + prog[*fixup_map_array_ro].imm = map_fds[14]; + fixup_map_array_ro++; + } while (*fixup_map_array_ro); + } + if (*fixup_map_array_wo) { + map_fds[15] = __create_map(BPF_MAP_TYPE_ARRAY, sizeof(int), + sizeof(struct test_val), 1, + BPF_F_WRONLY_PROG); + update_map(map_fds[15], 0); + do { + prog[*fixup_map_array_wo].imm = map_fds[15]; + fixup_map_array_wo++; + } while (*fixup_map_array_wo); + } + if (*fixup_map_array_small) { + map_fds[16] = __create_map(BPF_MAP_TYPE_ARRAY, sizeof(int), + 1, 1, 0); + update_map(map_fds[16], 0); + do { + prog[*fixup_map_array_small].imm = map_fds[16]; + fixup_map_array_small++; + } while (*fixup_map_array_small); + } } static int set_admin(bool admin) @@ -718,12 +771,17 @@ static void do_test_single(struct bpf_test *test, bool unpriv, prog_type = BPF_PROG_TYPE_SOCKET_FILTER; fixup_skips = skips; do_test_fixup(test, prog_type, prog, map_fds); + if (test->fill_insns) { + prog = test->fill_insns; + prog_len = test->prog_len; + } else { + prog_len = probe_filter_length(prog); + } /* If there were some map skips during fixup due to missing bpf * features, skip this test. */ if (fixup_skips != skips) return; - prog_len = probe_filter_length(prog); pflags = 0; if (test->flags & F_LOAD_WITH_STRICT_ALIGNMENT) @@ -731,7 +789,7 @@ static void do_test_single(struct bpf_test *test, bool unpriv, if (test->flags & F_NEEDS_EFFICIENT_UNALIGNED_ACCESS) pflags |= BPF_F_ANY_ALIGNMENT; fd_prog = bpf_verify_program(prog_type, prog, prog_len, pflags, - "GPL", 0, bpf_vlog, sizeof(bpf_vlog), 1); + "GPL", 0, bpf_vlog, sizeof(bpf_vlog), 4); if (fd_prog < 0 && !bpf_probe_prog_type(prog_type, 0)) { printf("SKIP (unsupported program type %d)\n", prog_type); skips++; @@ -830,6 +888,8 @@ static void do_test_single(struct bpf_test *test, bool unpriv, goto fail_log; } close_fds: + if (test->fill_insns) + free(test->fill_insns); close(fd_prog); for (i = 0; i < MAX_NR_MAPS; i++) close(map_fds[i]); diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c index 4cdb63bf0521..9a9fc6c9b70b 100644 --- a/tools/testing/selftests/bpf/trace_helpers.c +++ b/tools/testing/selftests/bpf/trace_helpers.c @@ -52,6 +52,10 @@ struct ksym *ksym_search(long key) int start = 0, end = sym_cnt; int result; + /* kallsyms not loaded. return NULL */ + if (sym_cnt <= 0) + return NULL; + while (start < end) { size_t mid = start + (end - start) / 2; diff --git a/tools/testing/selftests/bpf/verifier/array_access.c b/tools/testing/selftests/bpf/verifier/array_access.c index 0dcecaf3ec6f..bcb83196e459 100644 --- a/tools/testing/selftests/bpf/verifier/array_access.c +++ b/tools/testing/selftests/bpf/verifier/array_access.c @@ -217,3 +217,162 @@ .result = REJECT, .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS, }, +{ + "valid read map access into a read-only array 1", + .insns = { + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), + BPF_LD_MAP_FD(BPF_REG_1, 0), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1), + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .fixup_map_array_ro = { 3 }, + .result = ACCEPT, + .retval = 28, +}, +{ + "valid read map access into a read-only array 2", + .insns = { + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), + BPF_LD_MAP_FD(BPF_REG_1, 0), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 6), + + BPF_MOV64_REG(BPF_REG_1, BPF_REG_0), + BPF_MOV64_IMM(BPF_REG_2, 4), + BPF_MOV64_IMM(BPF_REG_3, 0), + BPF_MOV64_IMM(BPF_REG_4, 0), + BPF_MOV64_IMM(BPF_REG_5, 0), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, + BPF_FUNC_csum_diff), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_SCHED_CLS, + .fixup_map_array_ro = { 3 }, + .result = ACCEPT, + .retval = -29, +}, +{ + "invalid write map access into a read-only array 1", + .insns = { + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), + BPF_LD_MAP_FD(BPF_REG_1, 0), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1), + BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 42), + BPF_EXIT_INSN(), + }, + .fixup_map_array_ro = { 3 }, + .result = REJECT, + .errstr = "write into map forbidden", +}, +{ + "invalid write map access into a read-only array 2", + .insns = { + BPF_MOV64_REG(BPF_REG_6, BPF_REG_1), + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), + BPF_LD_MAP_FD(BPF_REG_1, 0), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 5), + BPF_MOV64_REG(BPF_REG_1, BPF_REG_6), + BPF_MOV64_IMM(BPF_REG_2, 0), + BPF_MOV64_REG(BPF_REG_3, BPF_REG_0), + BPF_MOV64_IMM(BPF_REG_4, 8), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, + BPF_FUNC_skb_load_bytes), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_SCHED_CLS, + .fixup_map_array_ro = { 4 }, + .result = REJECT, + .errstr = "write into map forbidden", +}, +{ + "valid write map access into a write-only array 1", + .insns = { + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), + BPF_LD_MAP_FD(BPF_REG_1, 0), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1), + BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 42), + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_EXIT_INSN(), + }, + .fixup_map_array_wo = { 3 }, + .result = ACCEPT, + .retval = 1, +}, +{ + "valid write map access into a write-only array 2", + .insns = { + BPF_MOV64_REG(BPF_REG_6, BPF_REG_1), + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), + BPF_LD_MAP_FD(BPF_REG_1, 0), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 5), + BPF_MOV64_REG(BPF_REG_1, BPF_REG_6), + BPF_MOV64_IMM(BPF_REG_2, 0), + BPF_MOV64_REG(BPF_REG_3, BPF_REG_0), + BPF_MOV64_IMM(BPF_REG_4, 8), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, + BPF_FUNC_skb_load_bytes), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_SCHED_CLS, + .fixup_map_array_wo = { 4 }, + .result = ACCEPT, + .retval = 0, +}, +{ + "invalid read map access into a write-only array 1", + .insns = { + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), + BPF_LD_MAP_FD(BPF_REG_1, 0), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1), + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .fixup_map_array_wo = { 3 }, + .result = REJECT, + .errstr = "read from map forbidden", +}, +{ + "invalid read map access into a write-only array 2", + .insns = { + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), + BPF_LD_MAP_FD(BPF_REG_1, 0), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 6), + + BPF_MOV64_REG(BPF_REG_1, BPF_REG_0), + BPF_MOV64_IMM(BPF_REG_2, 4), + BPF_MOV64_IMM(BPF_REG_3, 0), + BPF_MOV64_IMM(BPF_REG_4, 0), + BPF_MOV64_IMM(BPF_REG_5, 0), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, + BPF_FUNC_csum_diff), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_SCHED_CLS, + .fixup_map_array_wo = { 3 }, + .result = REJECT, + .errstr = "read from map forbidden", +}, diff --git a/tools/testing/selftests/bpf/verifier/ctx_skb.c b/tools/testing/selftests/bpf/verifier/ctx_skb.c index c660deb582f1..b0fda2877119 100644 --- a/tools/testing/selftests/bpf/verifier/ctx_skb.c +++ b/tools/testing/selftests/bpf/verifier/ctx_skb.c @@ -705,7 +705,6 @@ .errstr = "invalid bpf_context access", .result = REJECT, .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS, - .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS, }, { "check cb access: half, wrong type", diff --git a/tools/testing/selftests/bpf/verifier/direct_value_access.c b/tools/testing/selftests/bpf/verifier/direct_value_access.c new file mode 100644 index 000000000000..b9fb28e8e224 --- /dev/null +++ b/tools/testing/selftests/bpf/verifier/direct_value_access.c @@ -0,0 +1,347 @@ +{ + "direct map access, write test 1", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 0), + BPF_ST_MEM(BPF_DW, BPF_REG_1, 0, 4242), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = ACCEPT, + .retval = 1, +}, +{ + "direct map access, write test 2", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 8), + BPF_ST_MEM(BPF_DW, BPF_REG_1, 0, 4242), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = ACCEPT, + .retval = 1, +}, +{ + "direct map access, write test 3", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 8), + BPF_ST_MEM(BPF_DW, BPF_REG_1, 8, 4242), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = ACCEPT, + .retval = 1, +}, +{ + "direct map access, write test 4", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 40), + BPF_ST_MEM(BPF_DW, BPF_REG_1, 0, 4242), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = ACCEPT, + .retval = 1, +}, +{ + "direct map access, write test 5", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 32), + BPF_ST_MEM(BPF_DW, BPF_REG_1, 8, 4242), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = ACCEPT, + .retval = 1, +}, +{ + "direct map access, write test 6", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 40), + BPF_ST_MEM(BPF_DW, BPF_REG_1, 4, 4242), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = REJECT, + .errstr = "R1 min value is outside of the array range", +}, +{ + "direct map access, write test 7", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, -1), + BPF_ST_MEM(BPF_DW, BPF_REG_1, 4, 4242), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = REJECT, + .errstr = "direct value offset of 4294967295 is not allowed", +}, +{ + "direct map access, write test 8", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 1), + BPF_ST_MEM(BPF_DW, BPF_REG_1, -1, 4242), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = ACCEPT, + .retval = 1, +}, +{ + "direct map access, write test 9", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 48), + BPF_ST_MEM(BPF_DW, BPF_REG_1, 0, 4242), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = REJECT, + .errstr = "invalid access to map value pointer", +}, +{ + "direct map access, write test 10", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 47), + BPF_ST_MEM(BPF_B, BPF_REG_1, 0, 4), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = ACCEPT, + .retval = 1, +}, +{ + "direct map access, write test 11", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 48), + BPF_ST_MEM(BPF_B, BPF_REG_1, 0, 4), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = REJECT, + .errstr = "invalid access to map value pointer", +}, +{ + "direct map access, write test 12", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, (1<<29)), + BPF_ST_MEM(BPF_B, BPF_REG_1, 0, 4), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = REJECT, + .errstr = "direct value offset of 536870912 is not allowed", +}, +{ + "direct map access, write test 13", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, (1<<29)-1), + BPF_ST_MEM(BPF_B, BPF_REG_1, 0, 4), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = REJECT, + .errstr = "invalid access to map value pointer, value_size=48 off=536870911", +}, +{ + "direct map access, write test 14", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 47), + BPF_LD_MAP_VALUE(BPF_REG_2, 0, 46), + BPF_ST_MEM(BPF_H, BPF_REG_2, 0, 0xffff), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, 0), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1, 3 }, + .result = ACCEPT, + .retval = 0xff, +}, +{ + "direct map access, write test 15", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 46), + BPF_LD_MAP_VALUE(BPF_REG_2, 0, 46), + BPF_ST_MEM(BPF_H, BPF_REG_2, 0, 0xffff), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, 0), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1, 3 }, + .result = ACCEPT, + .retval = 0xffff, +}, +{ + "direct map access, write test 16", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 46), + BPF_LD_MAP_VALUE(BPF_REG_2, 0, 47), + BPF_ST_MEM(BPF_H, BPF_REG_2, 0, 0xffff), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, 0), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1, 3 }, + .result = REJECT, + .errstr = "invalid access to map value, value_size=48 off=47 size=2", +}, +{ + "direct map access, write test 17", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 46), + BPF_LD_MAP_VALUE(BPF_REG_2, 0, 46), + BPF_ST_MEM(BPF_H, BPF_REG_2, 1, 0xffff), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, 0), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1, 3 }, + .result = REJECT, + .errstr = "invalid access to map value, value_size=48 off=47 size=2", +}, +{ + "direct map access, write test 18", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 0), + BPF_ST_MEM(BPF_H, BPF_REG_1, 0, 42), + BPF_EXIT_INSN(), + }, + .fixup_map_array_small = { 1 }, + .result = REJECT, + .errstr = "R1 min value is outside of the array range", +}, +{ + "direct map access, write test 19", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 0), + BPF_ST_MEM(BPF_B, BPF_REG_1, 0, 42), + BPF_EXIT_INSN(), + }, + .fixup_map_array_small = { 1 }, + .result = ACCEPT, + .retval = 1, +}, +{ + "direct map access, write test 20", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_MAP_VALUE(BPF_REG_1, 0, 1), + BPF_ST_MEM(BPF_B, BPF_REG_1, 0, 42), + BPF_EXIT_INSN(), + }, + .fixup_map_array_small = { 1 }, + .result = REJECT, + .errstr = "invalid access to map value pointer", +}, +{ + "direct map access, invalid insn test 1", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_VALUE, 0, 1, 0, 47), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = REJECT, + .errstr = "invalid bpf_ld_imm64 insn", +}, +{ + "direct map access, invalid insn test 2", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_VALUE, 1, 0, 0, 47), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = REJECT, + .errstr = "BPF_LD_IMM64 uses reserved fields", +}, +{ + "direct map access, invalid insn test 3", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_VALUE, ~0, 0, 0, 47), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = REJECT, + .errstr = "BPF_LD_IMM64 uses reserved fields", +}, +{ + "direct map access, invalid insn test 4", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_VALUE, 0, ~0, 0, 47), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = REJECT, + .errstr = "invalid bpf_ld_imm64 insn", +}, +{ + "direct map access, invalid insn test 5", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_VALUE, ~0, ~0, 0, 47), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = REJECT, + .errstr = "invalid bpf_ld_imm64 insn", +}, +{ + "direct map access, invalid insn test 6", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_FD, ~0, 0, 0, 0), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = REJECT, + .errstr = "BPF_LD_IMM64 uses reserved fields", +}, +{ + "direct map access, invalid insn test 7", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_FD, 0, ~0, 0, 0), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = REJECT, + .errstr = "invalid bpf_ld_imm64 insn", +}, +{ + "direct map access, invalid insn test 8", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_FD, ~0, ~0, 0, 0), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = REJECT, + .errstr = "invalid bpf_ld_imm64 insn", +}, +{ + "direct map access, invalid insn test 9", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_FD, 0, 0, 0, 47), + BPF_EXIT_INSN(), + }, + .fixup_map_array_48b = { 1 }, + .result = REJECT, + .errstr = "unrecognized bpf_ld_imm64 insn", +}, diff --git a/tools/testing/selftests/bpf/verifier/ld_dw.c b/tools/testing/selftests/bpf/verifier/ld_dw.c index d2c75b889598..0f18e62f0099 100644 --- a/tools/testing/selftests/bpf/verifier/ld_dw.c +++ b/tools/testing/selftests/bpf/verifier/ld_dw.c @@ -34,3 +34,12 @@ .result = ACCEPT, .retval = 5, }, +{ + "ld_dw: xor semi-random 64 bit imms, test 5", + .insns = { }, + .data = { }, + .fill_helper = bpf_fill_rand_ld_dw, + .prog_type = BPF_PROG_TYPE_SCHED_CLS, + .result = ACCEPT, + .retval = 1000000 - 6, +}, diff --git a/tools/testing/selftests/bpf/verifier/var_off.c b/tools/testing/selftests/bpf/verifier/var_off.c index 1e536ff121a5..8504ac937809 100644 --- a/tools/testing/selftests/bpf/verifier/var_off.c +++ b/tools/testing/selftests/bpf/verifier/var_off.c @@ -40,7 +40,35 @@ .prog_type = BPF_PROG_TYPE_LWT_IN, }, { - "indirect variable-offset stack access", + "indirect variable-offset stack access, unbounded", + .insns = { + BPF_MOV64_IMM(BPF_REG_2, 6), + BPF_MOV64_IMM(BPF_REG_3, 28), + /* Fill the top 16 bytes of the stack. */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0), + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), + /* Get an unknown value. */ + BPF_LDX_MEM(BPF_DW, BPF_REG_4, BPF_REG_1, offsetof(struct bpf_sock_ops, + bytes_received)), + /* Check the lower bound but don't check the upper one. */ + BPF_JMP_IMM(BPF_JSLT, BPF_REG_4, 0, 4), + /* Point the lower bound to initialized stack. Offset is now in range + * from fp-16 to fp+0x7fffffffffffffef, i.e. max value is unbounded. + */ + BPF_ALU64_IMM(BPF_SUB, BPF_REG_4, 16), + BPF_ALU64_REG(BPF_ADD, BPF_REG_4, BPF_REG_10), + BPF_MOV64_IMM(BPF_REG_5, 8), + /* Dereference it indirectly. */ + BPF_EMIT_CALL(BPF_FUNC_getsockopt), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .errstr = "R4 unbounded indirect variable offset stack access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SOCK_OPS, +}, +{ + "indirect variable-offset stack access, max out of bound", .insns = { /* Fill the top 8 bytes of the stack */ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), @@ -60,7 +88,161 @@ BPF_EXIT_INSN(), }, .fixup_map_hash_8b = { 5 }, - .errstr = "variable stack read R2", + .errstr = "R2 max value is outside of stack bound", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_LWT_IN, +}, +{ + "indirect variable-offset stack access, min out of bound", + .insns = { + /* Fill the top 8 bytes of the stack */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), + /* Get an unknown value */ + BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 0), + /* Make it small and 4-byte aligned */ + BPF_ALU64_IMM(BPF_AND, BPF_REG_2, 4), + BPF_ALU64_IMM(BPF_SUB, BPF_REG_2, 516), + /* add it to fp. We now have either fp-516 or fp-512, but + * we don't know which + */ + BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_10), + /* dereference it indirectly */ + BPF_LD_MAP_FD(BPF_REG_1, 0), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .fixup_map_hash_8b = { 5 }, + .errstr = "R2 min value is outside of stack bound", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_LWT_IN, +}, +{ + "indirect variable-offset stack access, max_off+size > max_initialized", + .insns = { + /* Fill only the second from top 8 bytes of the stack. */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0), + /* Get an unknown value. */ + BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 0), + /* Make it small and 4-byte aligned. */ + BPF_ALU64_IMM(BPF_AND, BPF_REG_2, 4), + BPF_ALU64_IMM(BPF_SUB, BPF_REG_2, 16), + /* Add it to fp. We now have either fp-12 or fp-16, but we don't know + * which. fp-12 size 8 is partially uninitialized stack. + */ + BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_10), + /* Dereference it indirectly. */ + BPF_LD_MAP_FD(BPF_REG_1, 0), + BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .fixup_map_hash_8b = { 5 }, + .errstr = "invalid indirect read from stack var_off", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_LWT_IN, +}, +{ + "indirect variable-offset stack access, min_off < min_initialized", + .insns = { + /* Fill only the top 8 bytes of the stack. */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), + /* Get an unknown value */ + BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 0), + /* Make it small and 4-byte aligned. */ + BPF_ALU64_IMM(BPF_AND, BPF_REG_2, 4), + BPF_ALU64_IMM(BPF_SUB, BPF_REG_2, 16), + /* Add it to fp. We now have either fp-12 or fp-16, but we don't know + * which. fp-16 size 8 is partially uninitialized stack. + */ + BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_10), + /* Dereference it indirectly. */ + BPF_LD_MAP_FD(BPF_REG_1, 0), + BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .fixup_map_hash_8b = { 5 }, + .errstr = "invalid indirect read from stack var_off", .result = REJECT, .prog_type = BPF_PROG_TYPE_LWT_IN, }, +{ + "indirect variable-offset stack access, priv vs unpriv", + .insns = { + /* Fill the top 16 bytes of the stack. */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0), + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), + /* Get an unknown value. */ + BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 0), + /* Make it small and 4-byte aligned. */ + BPF_ALU64_IMM(BPF_AND, BPF_REG_2, 4), + BPF_ALU64_IMM(BPF_SUB, BPF_REG_2, 16), + /* Add it to fp. We now have either fp-12 or fp-16, we don't know + * which, but either way it points to initialized stack. + */ + BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_10), + /* Dereference it indirectly. */ + BPF_LD_MAP_FD(BPF_REG_1, 0), + BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .fixup_map_hash_8b = { 6 }, + .errstr_unpriv = "R2 stack pointer arithmetic goes out of range, prohibited for !root", + .result_unpriv = REJECT, + .result = ACCEPT, + .prog_type = BPF_PROG_TYPE_CGROUP_SKB, +}, +{ + "indirect variable-offset stack access, uninitialized", + .insns = { + BPF_MOV64_IMM(BPF_REG_2, 6), + BPF_MOV64_IMM(BPF_REG_3, 28), + /* Fill the top 16 bytes of the stack. */ + BPF_ST_MEM(BPF_W, BPF_REG_10, -16, 0), + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), + /* Get an unknown value. */ + BPF_LDX_MEM(BPF_W, BPF_REG_4, BPF_REG_1, 0), + /* Make it small and 4-byte aligned. */ + BPF_ALU64_IMM(BPF_AND, BPF_REG_4, 4), + BPF_ALU64_IMM(BPF_SUB, BPF_REG_4, 16), + /* Add it to fp. We now have either fp-12 or fp-16, we don't know + * which, but either way it points to initialized stack. + */ + BPF_ALU64_REG(BPF_ADD, BPF_REG_4, BPF_REG_10), + BPF_MOV64_IMM(BPF_REG_5, 8), + /* Dereference it indirectly. */ + BPF_EMIT_CALL(BPF_FUNC_getsockopt), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .errstr = "invalid indirect read from stack var_off", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SOCK_OPS, +}, +{ + "indirect variable-offset stack access, ok", + .insns = { + /* Fill the top 16 bytes of the stack. */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0), + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), + /* Get an unknown value. */ + BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 0), + /* Make it small and 4-byte aligned. */ + BPF_ALU64_IMM(BPF_AND, BPF_REG_2, 4), + BPF_ALU64_IMM(BPF_SUB, BPF_REG_2, 16), + /* Add it to fp. We now have either fp-12 or fp-16, we don't know + * which, but either way it points to initialized stack. + */ + BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_10), + /* Dereference it indirectly. */ + BPF_LD_MAP_FD(BPF_REG_1, 0), + BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .fixup_map_hash_8b = { 6 }, + .result = ACCEPT, + .prog_type = BPF_PROG_TYPE_LWT_IN, +}, |