oss-sec mailing list archives

[CVE-2023-42756] Linux kernel race condition in netfilter


From: Kyle Zeng <zengyhkyle () gmail com>
Date: Wed, 27 Sep 2023 13:44:48 -0700

Hi there,

I recently found a race condition bug in the Linux kernel between
IPSET_CMD_ADD and IPSET_CMD_SWAP in netfilter/ip_set, which can
lead to the invocation of `__ip_set_put` on a wrong `set`, triggering
the `BUG_ON(set->ref == 0);` check in it, which leads to local DoS.
I confirm it at least affect upstream, v6.5.rc7, v6.1, and v5.10.

[Root Cause]
The bug is in the netfilter subsystem.
In `ip_set_swap` function, it will hold the `ip_set_ref_lock`
and then do the following to swap the sets:
~~~
        strncpy(from_name, from->name, IPSET_MAXNAMELEN);
        strncpy(from->name, to->name, IPSET_MAXNAMELEN);
        strncpy(to->name, from_name, IPSET_MAXNAMELEN);

        swap(from->ref, to->ref);
~~~
But in the retry loop in `call_ad`:
~~~
                if (retried) {
                        __ip_set_get(set);
                        nfnl_unlock(NFNL_SUBSYS_IPSET);
                        cond_resched();
                        nfnl_lock(NFNL_SUBSYS_IPSET);
                        __ip_set_put(set);
                }
~~~
No lock is hold when it does the `cond_resched()`.
As a result, `ip_set_ref_lock` (in thread 2) can swap the set with
another when thread 1 is doing the `cond_resched()`. When thread 1
wakes up, the `set` variable alreays means another `set`, calling
`__ip_set_put` on it will decrease the refcount on the wrong `set`,
triggering the `BUG_ON` call.

According to Jozsef Kadlecsik, who fixed the bug, the root cause is that
the `call_ad` function is using a wrong ref counter. Instead of using
`__ip_set_get`, which operates on `set->ref`, the correct way is to
operate on `set->ref_netlink`.

[Severity]
It will invoke a `BUG_ON` call, leading to kernel panic.
In other words, it will lead to local DoS.

[Patch]
Jozsef Kadlecsik prepared a patch and it got merged into mainline and
stables already.
The patch can be found here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7433b6d2afd512d04398c73aa984d1e285be125b

[Proof-of-Concept]
A proof-of-concept code to trigger the bug is attached to this email.

Best,
Kyle

========================================================================
[    5.110096] ------------[ cut here ]------------
[    5.110337] kernel BUG at net/netfilter/ipset/ip_set_core.c:677!
[    5.110618] invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
[    5.110892] CPU: 2 PID: 507 Comm: poc Not tainted 6.1.47+ #67
[    5.111143] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[    5.111490] RIP: 0010:call_ad+0x83e/0x850
[    5.111677] Code: 89 df e8 35 c6 d2 fd e9 d4 fd ff ff 44 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c d7 fd ff ff 4c 89 ff e8 
a7 c5 d2 fd e9 ca fd ff ff <0f> 0b e8 0b 09 85 00 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00
[    5.112481] RSP: 0018:ffff88800c4d7350 EFLAGS: 00010246
[    5.112718] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000000ff
[    5.113047] RDX: ffff88800b658324 RSI: 0000000000000004 RDI: ffff88800c4d7314
[    5.113373] RBP: ffff88800c4d7448 R08: dffffc0000000000 R09: ffffed100189ae63
[    5.113696] R10: dfffe9100189ae64 R11: 1ffff1100189ae62 R12: dffffc0000000000
[    5.114024] R13: 1ffff110016cb067 R14: ffff88800b658338 R15: ffffffff8557d401
[    5.114346] FS:  00000000027203c0(0000) GS:ffff888034f00000(0000) knlGS:0000000000000000
[    5.114745] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    5.115049] CR2: 000000000046c280 CR3: 000000000d71c005 CR4: 0000000000770ee0
[    5.115478] PKRU: 55555554
[    5.115653] Call Trace:
[    5.115799]  <TASK>
[    5.115923]  ? __die_body+0x67/0xb0
[    5.116125]  ? die+0xa0/0xc0
[    5.116295]  ? do_trap+0x124/0x350
[    5.116485]  ? call_ad+0x83e/0x850
[    5.116670]  ? call_ad+0x83e/0x850
[    5.116855]  ? handle_invalid_op+0x96/0xd0
[    5.117084]  ? call_ad+0x83e/0x850
[    5.117270]  ? exc_invalid_op+0x2f/0x40
[    5.117453]  ? asm_exc_invalid_op+0x16/0x20
[    5.117633]  ? call_ad+0x83e/0x850
[    5.117782]  ip_set_ad+0x68e/0x7d0
[    5.117932]  ? mutex_lock+0x76/0xc0
[    5.118083]  nfnetlink_rcv_msg+0x6a7/0x830
[    5.118262]  netlink_rcv_skb+0x15a/0x330
[    5.118430]  ? nfnetlink_unbind+0x180/0x180
[    5.118632]  nfnetlink_rcv+0x22d/0x1e70
[    5.118797]  ? __stack_depot_save+0x35/0x480
[    5.118982]  ? kasan_set_track+0x61/0x70
[    5.119150]  ? kasan_set_track+0x4c/0x70
[    5.119318]  ? __kasan_kmalloc+0x85/0x90
[    5.119486]  ? netlink_sendmsg+0x509/0xa00
[    5.119660]  ? __sys_sendto+0x494/0x4b0
[    5.119826]  ? __x64_sys_sendto+0xda/0xf0
[    5.119998]  ? do_syscall_64+0x67/0x90
[    5.120159]  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[    5.120383]  ? __netlink_lookup+0x2fa/0x310
[    5.120562]  netlink_unicast+0x675/0x8a0
[    5.120731]  netlink_sendmsg+0x685/0xa00
[    5.120902]  ? netlink_getsockopt+0x3f0/0x3f0
[    5.121093]  __sys_sendto+0x494/0x4b0
[    5.121264]  __x64_sys_sendto+0xda/0xf0
[    5.121438]  do_syscall_64+0x67/0x90
[    5.121628]  ? exit_to_user_mode_prepare+0x12/0xa0
[    5.121874]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[    5.122142] RIP: 0033:0x475b30
[    5.122305] Code: c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 1d 45 31 c9 45 
31 c0 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 68 c3 0f 1f 80 00 00 00 00 41 54 48 83 ec 20
[    5.123328] RSP: 002b:00007ffc64795c98 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[    5.123741] RAX: ffffffffffffffda RBX: 00007ffc64795f48 RCX: 0000000000475b30
[    5.124512] RDX: 000000000000007c RSI: 00000000027244a0 RDI: 0000000000000005
[    5.124905] RBP: 00007ffc64795d40 R08: 0000000000000000 R09: 0000000000000000
[    5.125416] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
[    5.125860] R13: 00007ffc64795f38 R14: 0000000000500740 R15: 0000000000000002
[    5.126282]  </TASK>
[    5.126408] Modules linked in:
[    5.126613] ---[ end trace 0000000000000000 ]---
[    5.127317] RIP: 0010:call_ad+0x83e/0x850
[    5.127565] Code: 89 df e8 35 c6 d2 fd e9 d4 fd ff ff 44 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c d7 fd ff ff 4c 89 ff e8 
a7 c5 d2 fd e9 ca fd ff ff <0f> 0b e8 0b 09 85 00 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00
[    5.128567] RSP: 0018:ffff88800c4d7350 EFLAGS: 00010246
[    5.128928] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000000ff
[    5.129356] RDX: ffff88800b658324 RSI: 0000000000000004 RDI: ffff88800c4d7314
[    5.129766] RBP: ffff88800c4d7448 R08: dffffc0000000000 R09: ffffed100189ae63
[    5.130203] R10: dfffe9100189ae64 R11: 1ffff1100189ae62 R12: dffffc0000000000
[    5.130602] R13: 1ffff110016cb067 R14: ffff88800b658338 R15: ffffffff8557d401
[    5.130973] FS:  00000000027203c0(0000) GS:ffff888034f00000(0000) knlGS:0000000000000000
[    5.131454] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    5.131809] CR2: 000000000046c280 CR3: 000000000d71c005 CR4: 0000000000770ee0
[    5.132290] PKRU: 55555554
[    5.132452] Kernel panic - not syncing: Fatal exception in interrupt
[    5.133092] Kernel Offset: disabled
[    5.133320] Rebooting in 1000 seconds..

Attachment: poc.c
Description:


Current thread: