Program type BPF_PROG_TYPE_FLOW_DISSECTOR
A flow dissector is a program type that parses metadata out of packets.
Usage
BPF flow dissectors can be attached per network namespace. These programs are given a packet and should fill out the rest of the struct bpf_flow_keys fields located at __sk_buff->flow_keys.
Various places in the networking subsystem use these flow keys to aggregate packets that belong to the same "flow", which is identified by a combination of these fields. By implementing this logic in BPF, it becomes possible to add flow parsing for new or custom protocols.
The return code of the BPF program is either BPF_OK to indicate successful dissection, or BPF_DROP to indicate a parsing error.
Context
BPF flow dissector programs operate on an __sk_buff. However, only a limited set of fields is allowed: data, data_end and flow_keys.
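The restricted context still supports the usual bounds-checked access pattern. The following is a minimal user-space sketch of that pattern, not a loadable BPF program: struct pkt_ctx and hdr_at are hypothetical names that stand in for the data and data_end fields of __sk_buff. In a real program the verifier requires an equivalent check before any packet byte is dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the two packet fields of __sk_buff. */
struct pkt_ctx {
    const uint8_t *data;     /* first byte of the packet */
    const uint8_t *data_end; /* one past the last byte of the packet */
};

/* Return a pointer to `len` bytes at offset `off`, or NULL if that range
 * would run past data_end. */
static const void *hdr_at(const struct pkt_ctx *ctx, size_t off, size_t len)
{
    assert(ctx && ctx->data <= ctx->data_end);
    size_t avail = (size_t)(ctx->data_end - ctx->data);

    if (off > avail || len > avail - off)
        return NULL;
    return ctx->data + off;
}
```

A dissector built on this pattern would treat a NULL result as a parsing error and return BPF_DROP.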
Context fields
flow_keys is a struct bpf_flow_keys and contains the flow dissector's input and output arguments. The input arguments nhoff, thoff and n_proto should also be updated by the program to reflect what it has parsed.
C structure of struct bpf_flow_keys
struct bpf_flow_keys {
    __u16 nhoff;
    __u16 thoff;
    __u16 addr_proto; /* ETH_P_* of valid addrs */
    __u8 is_frag;
    __u8 is_first_frag;
    __u8 is_encap;
    __u8 ip_proto;
    __be16 n_proto;
    __be16 sport;
    __be16 dport;
    union {
        struct {
            __be32 ipv4_src;
            __be32 ipv4_dst;
        };
        struct {
            __u32 ipv6_src[4]; /* in6_addr; network order */
            __u32 ipv6_dst[4]; /* in6_addr; network order */
        };
    };
    __u32 flags;
    __be32 flow_label;
};
The initial state of the input values can differ based on the type of packet handled and the state of the dissector. For example:
In the VLAN-less case, this is what the initial state of the BPF flow dissector looks like:
+------+------+------------+-----------+
| DMAC | SMAC | ETHER_TYPE | L3_HEADER |
+------+------+------------+-----------+
                            ^
                            |
                            +-- flow dissector starts here
skb->data + flow_keys->nhoff point to the first byte of L3_HEADER
flow_keys->thoff = nhoff
flow_keys->n_proto = ETHER_TYPE
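As an illustration of what a dissector does from this starting state, here is a minimal user-space sketch of IPv4 dissection. It is a model under stated assumptions, not the reference implementation: struct keys_model is a hypothetical subset of struct bpf_flow_keys, MODEL_OK and MODEL_DROP stand in for BPF_OK and BPF_DROP, and fragments and IPv6 are deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MODEL_OK = 0, MODEL_DROP = 2 }; /* stand-ins for BPF_OK / BPF_DROP */

/* Hypothetical subset of struct bpf_flow_keys. Ports and addresses are
 * stored in network byte order, as in the real structure. */
struct keys_model {
    uint16_t nhoff, thoff, addr_proto;
    uint8_t  ip_proto;
    uint16_t n_proto, sport, dport;
    uint32_t ipv4_src, ipv4_dst;
};

/* Dissect an IPv4 packet whose L3 header starts at k->nhoff. */
static int dissect_ipv4(const uint8_t *data, size_t len, struct keys_model *k)
{
    assert(data && k);
    size_t off = k->nhoff;

    if (off > len || len - off < 20)
        return MODEL_DROP;                 /* truncated IPv4 header */

    const uint8_t *ip = data + off;
    size_t ihl = (size_t)(ip[0] & 0x0f) * 4; /* header length in bytes */

    if ((ip[0] >> 4) != 4 || ihl < 20 || len - off < ihl)
        return MODEL_DROP;

    k->addr_proto = 0x0800;                /* ETH_P_IP */
    k->ip_proto = ip[9];
    memcpy(&k->ipv4_src, ip + 12, 4);
    memcpy(&k->ipv4_dst, ip + 16, 4);
    k->thoff = (uint16_t)(off + ihl);      /* transport header follows options */

    if (k->ip_proto == 6 || k->ip_proto == 17) { /* TCP or UDP */
        if (len - k->thoff < 4)
            return MODEL_DROP;
        memcpy(&k->sport, data + k->thoff, 2);
        memcpy(&k->dport, data + k->thoff + 2, 2);
    }
    return MODEL_OK;
}
```

Note how thoff starts out equal to nhoff and is only moved past the IP header once that header has been validated.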
In the VLAN case, the flow dissector can be called in two different states.
Pre-VLAN parsing:
+------+------+------+-----+-----------+-----------+
| DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
+------+------+------+-----+-----------+-----------+
                      ^
                      |
                      +-- flow dissector starts here
skb->data + flow_keys->nhoff point to the first byte of TCI
flow_keys->thoff = nhoff
flow_keys->n_proto = TPID
Please note that TPID can be 802.1AD and, hence, the BPF program has to parse VLAN information twice for double-tagged packets.
Post-VLAN parsing:
+------+------+------+-----+-----------+-----------+
| DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
+------+------+------+-----+-----------+-----------+
                                        ^
                                        |
                                        +-- flow dissector starts here
skb->data + flow_keys->nhoff point to the first byte of L3_HEADER
flow_keys->thoff = nhoff
flow_keys->n_proto = ETHER_TYPE
In this case, the VLAN information has been processed before the flow dissector runs, and the BPF flow dissector is not required to handle it.
The takeaway here is as follows: a BPF flow dissector program can be called with or without a VLAN header in front of it and should gracefully handle every case: single VLAN, double VLAN, and no VLAN. The same program is called in all of these cases and has to be written carefully to handle each of them.
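One way to cover all three cases is to skip VLAN tags in a bounded loop before dissecting L3. The sketch below is a user-space model with hypothetical names (skip_vlan_tags, VLAN_TPID_8021Q, VLAN_TPID_8021AD). For readability it keeps n_proto in host byte order, whereas the real flow_keys->n_proto is a __be16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VLAN_TPID_8021Q  0x8100 /* value of ETH_P_8021Q */
#define VLAN_TPID_8021AD 0x88A8 /* value of ETH_P_8021AD */

/* If *n_proto is a VLAN TPID, *nhoff points at the TCI. Each tag is a
 * 2-byte TCI followed by the 2-byte encapsulated EtherType, so every
 * iteration advances *nhoff by 4 and reloads *n_proto.
 * Returns 0 on success, -1 on a truncated tag or more than two tags. */
static int skip_vlan_tags(const uint8_t *data, size_t len,
                          uint16_t *nhoff, uint16_t *n_proto)
{
    assert(data && nhoff && n_proto);

    for (int tags = 0;
         *n_proto == VLAN_TPID_8021Q || *n_proto == VLAN_TPID_8021AD;
         tags++) {
        if (tags == 2)
            return -1;              /* only single and double tagging handled */

        size_t off = *nhoff;
        if (off > len || len - off < 4)
            return -1;              /* truncated VLAN tag */

        *n_proto = (uint16_t)((data[off + 2] << 8) | data[off + 3]);
        *nhoff = (uint16_t)(off + 4);
    }
    return 0;
}
```

In the VLAN-less and post-VLAN states the loop body never runs, so the same function can be called unconditionally at the start of the program.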
Attachment
Flow dissector programs are attached to network namespaces via the BPF_PROG_ATTACH syscall command or via a BPF link.
This program type must always be loaded with the expected_attach_type of BPF_FLOW_DISSECTOR.
Warning
BPF_PROG_ATTACH and links cannot be used at the same time.
Note
When a flow dissector is added to the root network namespace, it overrides all other flow dissectors.
BPF_PROG_ATTACH
When attaching flow dissector programs via BPF_PROG_ATTACH, the program will be attached to the network namespace to which the current process is assigned. The specified target FD should be 0.
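A minimal sketch of this attachment using the raw bpf() syscall follows. It assumes Linux UAPI headers that define BPF_FLOW_DISSECTOR and a program FD obtained elsewhere (for example from BPF_PROG_LOAD); fill_attach_attr and attach_flow_dissector are hypothetical helper names, and the call needs sufficient privileges to succeed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fill the attribute block for BPF_PROG_ATTACH of a flow dissector. */
static void fill_attach_attr(union bpf_attr *attr, int prog_fd)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));
    attr->target_fd = 0;                  /* should be 0 for flow dissectors */
    attr->attach_bpf_fd = (__u32)prog_fd; /* the loaded program */
    attr->attach_type = BPF_FLOW_DISSECTOR;
}

/* Attach to the calling process's network namespace.
 * Returns 0 on success, -1 with errno set on failure. */
static int attach_flow_dissector(int prog_fd)
{
    union bpf_attr attr;

    fill_attach_attr(&attr, prog_fd);
    return (int)syscall(__NR_bpf, BPF_PROG_ATTACH, &attr, sizeof(attr));
}
```

Splitting attribute setup from the syscall keeps the field assignments easy to inspect without needing privileges.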
BPF link
To attach a flow dissector program to a network namespace using a link, create the link with prog_fd set to the file descriptor of the program, target_fd set to the file descriptor of a network namespace, and attach_type set to BPF_FLOW_DISSECTOR.
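The link-based attachment can be sketched with the raw bpf() syscall as well. This assumes Linux UAPI headers recent enough to define BPF_LINK_CREATE and the link_create attribute fields. fill_link_attr and link_flow_dissector are hypothetical helper names, and the namespace path (for example /proc/self/ns/net) is supplied by the caller.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fill the attribute block for BPF_LINK_CREATE of a flow dissector. */
static void fill_link_attr(union bpf_attr *attr, int prog_fd, int netns_fd)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));
    attr->link_create.prog_fd = (__u32)prog_fd;
    attr->link_create.target_fd = (__u32)netns_fd;
    attr->link_create.attach_type = BPF_FLOW_DISSECTOR;
}

/* Open the namespace at netns_path and link prog_fd to it.
 * Returns a link FD on success, -1 with errno set on failure. */
static int link_flow_dissector(int prog_fd, const char *netns_path)
{
    int netns_fd = open(netns_path, O_RDONLY | O_CLOEXEC);
    if (netns_fd < 0)
        return -1;

    union bpf_attr attr;
    fill_link_attr(&attr, prog_fd, netns_fd);
    int link_fd = (int)syscall(__NR_bpf, BPF_LINK_CREATE, &attr, sizeof(attr));

    int saved_errno = errno;   /* close() must not clobber the bpf() error */
    close(netns_fd);           /* the namespace FD is only needed for the call */
    errno = saved_errno;
    return link_fd;
}
```

The returned link FD represents the attachment, so the caller decides how long to keep it open or whether to pin it.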
Example
The kernel maintains a reference example in-tree at bpf_flow.c.
Helper functions
Not all helper functions are available in all program types. These are the helper calls available for flow dissector programs:
Supported helper functions
- bpf_skb_load_bytes
- bpf_skc_to_tcp6_sock
- bpf_skc_to_tcp_sock
- bpf_skc_to_tcp_timewait_sock
- bpf_skc_to_tcp_request_sock
- bpf_skc_to_udp6_sock
- bpf_skc_to_unix_sock
- bpf_ktime_get_coarse_ns
- bpf_map_lookup_elem
- bpf_map_update_elem
- bpf_map_delete_elem
- bpf_map_push_elem
- bpf_map_pop_elem
- bpf_map_peek_elem
- bpf_map_lookup_percpu_elem
- bpf_get_prandom_u32
- bpf_get_smp_processor_id
- bpf_get_numa_node_id
- bpf_tail_call
- bpf_ktime_get_ns
- bpf_ktime_get_boot_ns
- bpf_ringbuf_output
- bpf_ringbuf_reserve
- bpf_ringbuf_submit
- bpf_ringbuf_discard
- bpf_ringbuf_query
- bpf_for_each_map_elem
- bpf_loop
- bpf_strncmp
- bpf_spin_lock
- bpf_spin_unlock
- bpf_jiffies64
- bpf_per_cpu_ptr
- bpf_this_cpu_ptr
- bpf_timer_init
- bpf_timer_set_callback
- bpf_timer_start
- bpf_timer_cancel
- bpf_trace_printk
- bpf_get_current_task
- bpf_get_current_task_btf
- bpf_probe_read_user
- bpf_probe_read_kernel
- bpf_probe_read_user_str
- bpf_probe_read_kernel_str
- bpf_snprintf_btf
- bpf_snprintf
- bpf_task_pt_regs
- bpf_trace_vprintk
KFuncs
There are currently no kfuncs supported for this program type.