oss-sec mailing list archives

Linux kernel: CVE-2022-1015,CVE-2022-1016 in nf_tables cause privilege escalation, information leak

From: David Bouman <davidbouman35 () gmail com>
Date: Mon, 28 Mar 2022 20:28:21 +0200

Hello list,

I'm reporting two linux kernel vulnerabilities in the nf_tablescomponent of the netfilter subsystem that I found.

CVE-2022-1015 pertains to an out of bounds access in nf_tablesexpression evaluation due to validation of user register indices. Itleads to local privilege escalation, for example by overwriting a stackreturn address OOB with a crafted nft_expr_payload.

CVE-2022-1015 is exploitable starting from commit 345023b0db3("netfilter: nftables: add nft_parse_register_store() and use it"),v5.12 and has been fixed in commit 6e1acfa387b9 ("netfilter: nf_tables:validate registers coming from userspace.").

The bug has been present since commit 49499c3e6e18 ("netfilter:nf_tables: switch registers to 32 bit addressing"), but to my knowledgehas not been exploitable until v5.12.

CVE-2022-1016 pertains to uninitialized stack data in the nft_do_chainroutine. CVE-2022-1016 is exploitable starting from commit 96518518cc41(original merge of nf_tables), v3.13-rc1, and has been fixed in commit4c905f6740a3 ("netfilter: nf_tables: initialize registers innft_do_chain()").

I will be releasing a detailed blog post and exploit code for bothvulnerabilities in a few days.


Root cause CVE-2022-1016: (it is the shortest, so I will begin with it)

The nft_do_chain routine in net/netfilter/nf_tables_core.c does notinitialize the register data that nf_tables expressions can read from-and write to. These expressions inherently exhibit side effects that canbe used to determine the register data, which can contain kernel imagepointers, module pointers, and allocation pointers depending on the codepath taken to end up at nft_do_chain.


```
unsigned int
nft_do_chain(struct nft_pktinfo *pkt, void *priv)
{
        const struct nft_chain *chain = priv, *basechain = chain;
        const struct net *net = nft_net(pkt);
        struct nft_rule *const *rules;
        const struct nft_rule *rule;
        const struct nft_expr *expr, *last;
        struct nft_regs regs; // <-------- VULNERABLE! NOT INITIALIZED.
        unsigned int stackptr = 0;
        struct nft_jumpstack jumpstack[NFT_JUMP_STACK_SIZE];
        bool genbit = READ_ONCE(net->nft.gencursor);
        struct nft_traceinfo info;

        info.trace = false;
        if (static_branch_unlikely(&nft_trace_enabled))
                nft_trace_init(&info, pkt, &regs.verdict, basechain);
do_chain:
        if (genbit)
                rules = rcu_dereference(chain->rules_gen_1);
        else
                rules = rcu_dereference(chain->rules_gen_0);

next_rule:
        rule = *rules;
        regs.verdict.code = NFT_CONTINUE;
        for (; *rules ; rules++) {
                rule = *rules;
                nft_rule_for_each_expr(expr, last, rule) {
                        if (expr->ops == &nft_cmp_fast_ops)
                                nft_cmp_fast_eval(expr, &regs);
                        else if (expr->ops == &nft_bitwise_fast_ops)
                                nft_bitwise_fast_eval(expr, &regs);
                        else if (expr->ops != &nft_payload_fast_ops ||
                                 !nft_payload_fast_eval(expr, &regs, pkt))
                                expr_call_ops_eval(expr, &regs, pkt);
                                ...
```

Root cause CVE-2022-1015:

(below is pasted from my original security () kernel org report)

Hello, I'm mailing to report a vulnerability I found in nf_tablescomponent of the netfilter subsystem. The vulnerability gives anattacker a powerful primitive that can be used to both read from andwrite to relative stack data. This can lead to arbitrary code executionby an attacker.

In order for an unprivileged attacker to exploit this issue,unprivileged user- and network namespaces access is required(CLONE_NEWUSER | CLONE_NEWNET). The bug relies on a compileroptimization that introduces behavior that the maintainer did notaccount for, and most likely only occurs on kernels with`CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y`. I successfully exploited the bugon x86_64 kernel version 5.16-rc3, but I believe this vulnerabilityexists across different kernel versions and architectures (more on thislater).


Without further ado:

The bug resides in `linux/net/netfilter/nf_tables_api.c`, in the`nft_validate_register_store` and `nft_validate_register_load` routines.These routines are used to check if nft expression parameters suppliedby the user are sound and won't cause OOB stack accesses when evaluatingthe expression.

From my 5.16-rc3 kernel source(d58071a8a76d779eedab38033ae4c821c30295a5: Linux 5.16-rc3):


nft_validate_register_store:

```
static int nft_validate_register_store(const struct nft_ctx *ctx,
      enum nft_registers reg,
      const struct nft_data *data,
      enum nft_data_types type,
      unsigned int len)
{
int err;

switch (reg) {
        ...
default:
if (reg < NFT_REG_1 * NFT_REG_SIZE / NFT_REG32_SIZE)
return -EINVAL;
if (len == 0)
return -EINVAL;
if (reg * NFT_REG32_SIZE + len >
   sizeof_field(struct nft_regs, data))
return -ERANGE;

if (data != NULL && type != NFT_DATA_VALUE)
return -EINVAL;
return 0;
}
}
```

nft_validate_register_load:

```

static int nft_validate_register_load(enum nft_registers reg, unsignedint len)

{
if (reg < NFT_REG_1 * NFT_REG_SIZE / NFT_REG32_SIZE)
return -EINVAL;
if (len == 0)
return -EINVAL;
if (reg * NFT_REG32_SIZE + len > sizeof_field(struct nft_regs, data))
return -ERANGE;

return 0;
}
```

The problem lies in the fact that `enum nft_registers reg` is notguaranteed only be a single byte. As per the C89 specification, 3.1.3.3Enumeration constants: `An identifier declared as an enumerationconstant has type int.`.

Effectively this implies that the compiler is free to emit code thatoperates on `reg` as if it were a 32-bit value. If this is the case (andit is on the kernel I tested), a user can forge an expression registervalue that will overflow upon multiplication with `NFT_REG32_SIZE` (4)and upon addition with `len`, will be a value smaller than`sizeof_field(struct nft_regs, data)` (0x50). Once this check passes,the least significant byte of `reg` can still contain a value that willindex outside of the bounds of the `struct nft_regs regs` that it willlater be used with.

Take for example a `reg` value of `0xfffffff8` and a `len` value of`0x40`. The expression `reg * 4 + len` will then result in `0xffffffe0 +0x40 = 0x20`, which is lower than `0x50`. This makes that a value of`0xf8` is recognized as a valid index, and is subsequently assigned to aregister value in the expression info structs.

Here is a snippet of the x86_64 assembly code that these functions mightgenerate:


```
Disassembly of section .text:

0000000000002ed0 <nft_validate_register_store>:

2ed0: e8 00 00 00 00 callq 2ed5<nft_validate_register_store+0x5>

    2ed5: 55                   push   %rbp
    2ed6: 48 89 e5             mov    %rsp,%rbp
    2ed9: 41 54                 push   %r12
    2edb: 85 f6                 test   %esi,%esi

2edd: 75 2b jne 2f0a<nft_validate_register_store+0x3a>

    2edf: 81 f9 00 ff ff ff     cmp    $0xffffff00,%ecx

2ee5: 75 49 jne 2f30<nft_validate_register_store+0x60>

    2ee7: 45 31 e4             xor    %r12d,%r12d
    2eea: 48 85 d2             test   %rdx,%rdx

2eed: 74 3a je 2f29<nft_validate_register_store+0x59>

    2eef: 8b 02                 mov    (%rdx),%eax
    2ef1: 83 c0 04             add    $0x4,%eax
    2ef4: 83 f8 01             cmp    $0x1,%eax

2ef7: 77 30 ja 2f29<nft_validate_register_store+0x59>

    2ef9: 48 8b 72 08           mov    0x8(%rdx),%rsi
    2efd: e8 7e da ff ff       callq  980 <nf_tables_check_loops>
    2f02: 85 c0                 test   %eax,%eax
    2f04: 44 0f 4e e0           cmovle %eax,%r12d

2f08: eb 1f jmp 2f29<nft_validate_register_store+0x59>

    2f0a: 83 fe 03             cmp    $0x3,%esi

2f0d: 76 21 jbe 2f30<nft_validate_register_store+0x60>

    2f0f: 45 85 c0             test   %r8d,%r8d

2f12: 74 1c je 2f30<nft_validate_register_store+0x60>

    2f14: 41 8d 04 b0           lea    (%r8,%rsi,4),%eax
    2f18: 83 f8 50             cmp    $0x50,%eax

2f1b: 77 1b ja 2f38<nft_validate_register_store+0x68>

    2f1d: 48 85 d2             test   %rdx,%rdx

2f20: 74 04 je 2f26<nft_validate_register_store+0x56>

    2f22: 85 c9                 test   %ecx,%ecx

2f24: 75 0a jne 2f30<nft_validate_register_store+0x60>

    2f26: 45 31 e4             xor    %r12d,%r12d
    2f29: 44 89 e0             mov    %r12d,%eax
    2f2c: 41 5c                 pop    %r12
    2f2e: 5d                   pop    %rbp
    2f2f: c3                   retq
    2f30: 41 bc ea ff ff ff     mov    $0xffffffea,%r12d

2f36: eb f1 jmp 2f29<nft_validate_register_store+0x59>

    2f38: 41 bc de ff ff ff     mov    $0xffffffde,%r12d

2f3e: eb e9 jmp 2f29<nft_validate_register_store+0x59>

```

the `lea` instruction at `2f14` will multiply `%rsi` (reg) by 4 and add`%r8` len to it.

I created a working local privilege escalation exploit by using such anout of bounds index to copy stack data to the actual register area(declared in nf_tables_core.c:nft_do_chain). Then, I wrote a a few nftrules that drop or accept packets depending on whether the targeted byteis greater than the constant comparand in the rule or not. This way Icould create a binary search procedure that could determine the value ofthe leaked byte by registering whether the packet was dropped or not.This results in a kernel address leak.

Finally, I used a nft payload expression to write my arbitrary datasupplied in a packet to the stack in order to overwrite a return addressand execute a ROP chain.

An alternative exploitation strategy would be to overwrite to verdictregister (including its chain pointer) to arbitrary values, as you cannow get an register index of 0 in the same manner.


------------------------------

David Bouman

Current thread:

Linux kernel: CVE-2022-1015,CVE-2022-1016 in nf_tables cause privilege escalation, information leak David Bouman (Mar 28)