
Program type BPF_PROG_TYPE_SK_LOOKUP

v5.9

The socket lookup program allows an eBPF program to pick which socket to send traffic to irrespective of how that target socket has been bound.

The primary use case for this program type is to allow a single program to handle traffic for network patterns that cannot be expressed with the normal bind syscall. For example, a single socket can receive traffic for a whole /24 network CIDR. With bind alone, a socket can only be bound to a single IP address or to the wildcard 0.0.0.0, and the wildcard is undesirable if another application should answer a different range of IPs. Similarly, a single socket can listen on any port for a given IP.
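The /24 case above comes down to a prefix comparison on the destination address. The following is a minimal sketch of that check in plain C, assuming a hypothetical served range of 192.0.2.0/24; the names in_served_range, NET_ADDR, and NET_MASK are made up for illustration. A socket lookup program could apply the same comparison to the local_ip4 field of its context.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical range for illustration: 192.0.2.0/24. */
#define NET_ADDR 0xC0000200u /* 192.0.2.0, host byte order */
#define NET_MASK 0xFFFFFF00u /* /24 prefix mask, host byte order */

/* Returns true if addr_be (network byte order, like local_ip4 in the
 * program context) falls inside NET_ADDR/24. */
static bool in_served_range(uint32_t addr_be)
{
    return (ntohl(addr_be) & NET_MASK) == NET_ADDR;
}
```

For example, in_served_range(inet_addr("192.0.2.77")) is true, while an address in 192.0.3.0/24 is rejected.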

Usage

Socket lookup programs are typically put into an ELF section prefixed with sk_lookup. They are invoked by the transport layer when it looks up a listening socket for a new connection request (connection-oriented protocols), or when it looks up an unconnected socket for a packet (connectionless protocols).

The socket lookup program acts as a filter: if it returns SK_DROP (0), the connection or packet is dropped. If it returns SK_PASS (1) without selecting a socket, the normal lookup behavior is used. The program can also choose to assign a specific socket with the bpf_sk_assign helper function.
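The two verdicts can be illustrated with a program that never assigns a socket. This is only a sketch under stated assumptions: the policy (refuse UDP to port 9999), the name drop_blocked_port, and the constant BLOCKED_UDP_PORT are hypothetical, and the fallback SEC definition exists only so the snippet compiles on its own. In a real build SEC would come from libbpf's bpf_helpers.h.

```c
#include <linux/bpf.h>
#include <linux/in.h>

/* Normally provided by <bpf/bpf_helpers.h>; defined here only so the
 * sketch is self-contained. */
#ifndef SEC
#define SEC(name) __attribute__((section(name), used))
#endif

/* Hypothetical policy: refuse UDP traffic to this local port. */
#define BLOCKED_UDP_PORT 9999 /* host byte order, like ctx->local_port */

SEC("sk_lookup")
int drop_blocked_port(struct bpf_sk_lookup *ctx)
{
    if (ctx->protocol == IPPROTO_UDP && ctx->local_port == BLOCKED_UDP_PORT)
        return SK_DROP; /* the packet is dropped */

    /* No socket was assigned, so the regular lookup decides. */
    return SK_PASS;
}
```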

Context

Socket lookup programs are called with the struct bpf_sk_lookup context.

C structure

struct bpf_sk_lookup {
    union {
        __bpf_md_ptr(struct bpf_sock *, sk); /* Selected socket */
        __u64 cookie; /* Non-zero if socket was selected in PROG_TEST_RUN */
    };

    __u32 family;           /* Protocol family (AF_INET, AF_INET6) */
    __u32 protocol;         /* IP protocol (IPPROTO_TCP, IPPROTO_UDP) */
    __u32 remote_ip4;       /* Network byte order */
    __u32 remote_ip6[4];    /* Network byte order */
    __be16 remote_port;     /* Network byte order */
    __u16 :16;              /* Zero padding */
    __u32 local_ip4;        /* Network byte order */
    __u32 local_ip6[4];     /* Network byte order */
    __u32 local_port;       /* Host byte order */
    __u32 ingress_ifindex;  /* The arriving interface. Determined by inet_iif. */
};

sk

This field is a pointer to the selected socket. The field is read-only, but it can be updated via the bpf_sk_assign helper function.

The field shares its storage with cookie. If the program assigns a socket during a PROG_TEST_RUN, cookie is set to the cookie of the assigned socket.

family

The address family of the connection/packet for which the program is invoked. Can be AF_INET or AF_INET6.

protocol

The transport layer protocol of the connection/packet for which the program is invoked. Can be IPPROTO_TCP or IPPROTO_UDP.

remote_ip4

The remote IPv4 address of the connection/packet for which the program is invoked.

remote_ip6

The remote IPv6 address of the connection/packet for which the program is invoked.

remote_port

The remote port of the connection/packet for which the program is invoked, in network byte order.

local_ip4

The local IPv4 address of the connection/packet for which the program is invoked.

local_ip6

The local IPv6 address of the connection/packet for which the program is invoked.

local_port

The local port of the connection/packet for which the program is invoked, in host byte order.
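Because remote_port is in network byte order and local_port is in host byte order, the two fields need different handling when compared against a plain port number. A minimal sketch in plain C, with hypothetical helper names remote_port_is and local_port_is:

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* remote_port is network byte order: convert it before comparing
 * against a host-order port number. */
static bool remote_port_is(uint16_t remote_port_be, uint16_t want_host)
{
    return ntohs(remote_port_be) == want_host;
}

/* local_port is already host byte order: compare it directly. */
static bool local_port_is(uint32_t local_port_host, uint16_t want_host)
{
    return local_port_host == want_host;
}
```

Inside an eBPF program, the conversion would typically use bpf_ntohs or bpf_htons from libbpf's bpf_endian.h rather than the libc functions shown here.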

ingress_ifindex

The network interface index of the network interface on which the packet arrived.

Attachment

This program type must always be loaded with the expected_attach_type of BPF_SK_LOOKUP.

Socket lookup programs are attached to a network namespace using a link. When creating the link, prog_fd should be set to the file descriptor of the program, target_fd to the file descriptor of a network namespace, and attach_type to BPF_SK_LOOKUP.
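The link creation described above can be sketched with the raw bpf syscall. This is a minimal illustration rather than a complete loader: the wrapper name attach_sk_lookup is hypothetical, and it assumes the program has already been loaded and that kernel headers of v5.9 or newer are installed.

```c
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Creates a link that attaches an already loaded sk_lookup program to a
 * network namespace. Returns the link file descriptor on success, or -1
 * with errno set on failure. */
static int attach_sk_lookup(int prog_fd, int netns_fd)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.link_create.prog_fd = prog_fd;
    attr.link_create.target_fd = netns_fd;
    attr.link_create.attach_type = BPF_SK_LOOKUP;

    return (int)syscall(__NR_bpf, BPF_LINK_CREATE, &attr, sizeof(attr));
}
```

The network namespace descriptor can be obtained, for example, by opening /proc/self/ns/net for the current namespace. The program stays attached for as long as the returned link descriptor (or a pin of it) exists. When using libbpf, bpf_program__attach_netns wraps the same operation.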

Example

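// Copyright (c) 2020 Cloudflare
#include <linux/bpf.h>
#include <sys/socket.h>      /* AF_INET */
#include <bpf/bpf_endian.h>  /* bpf_htonl */
#include <bpf/bpf_helpers.h> /* SEC, __uint, __type, helper declarations */

/* Build an IPv4 address in network byte order from its four octets. */
#define IP4(a, b, c, d)                      \
    bpf_htonl((((__u32)(a) & 0xffU) << 24) | \
              (((__u32)(b) & 0xffU) << 16) | \
              (((__u32)(c) & 0xffU) <<  8) | \
              (((__u32)(d) & 0xffU) <<  0))

struct {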
    __uint(type, BPF_MAP_TYPE_SOCKMAP);
    __uint(max_entries, 32);
    __type(key, __u32);
    __type(value, __u64);
} redir_map SEC(".maps");

static const __u16 DST_PORT = 7007; /* Host byte order */
static const __u32 DST_IP4 = IP4(127, 0, 0, 1);
static const __u32 KEY_SERVER_A = 0;

/* Redirect packets destined for DST_IP4 address to socket at redir_map[0]. */
SEC("sk_lookup")
int redir_ip4(struct bpf_sk_lookup *ctx)
{
    struct bpf_sock *sk;
    int err;

    if (ctx->family != AF_INET)
        return SK_PASS;
    if (ctx->local_port != DST_PORT)
        return SK_PASS;
    if (ctx->local_ip4 != DST_IP4)
        return SK_PASS;

    sk = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_A);
    if (!sk)
        return SK_PASS;

    err = bpf_sk_assign(ctx, sk, 0);
    bpf_sk_release(sk);
    return err ? SK_DROP : SK_PASS;
}

Helper functions

Not all helper functions are available in all program types. These are the helper calls available for socket lookup programs:

Supported helper functions

KFuncs

There are currently no kfuncs supported for this program type.