oss-sec mailing list archives
[CVE-2023-42756] Linux kernel race condition in netfilter
From: Kyle Zeng <zengyhkyle () gmail com>
Date: Wed, 27 Sep 2023 13:44:48 -0700
Hi there, I recently found a race condition bug in the Linux kernel between IPSET_CMD_ADD and IPSET_CMD_SWAP in netfilter/ip_set, which can lead to the invocation of `__ip_set_put` on a wrong `set`, triggering the `BUG_ON(set->ref == 0);` check in it, which leads to local DoS. I confirm it at least affect upstream, v6.5.rc7, v6.1, and v5.10. [Root Cause] The bug is in the netfilter subsystem. In `ip_set_swap` function, it will hold the `ip_set_ref_lock` and then do the following to swap the sets: ~~~ strncpy(from_name, from->name, IPSET_MAXNAMELEN); strncpy(from->name, to->name, IPSET_MAXNAMELEN); strncpy(to->name, from_name, IPSET_MAXNAMELEN); swap(from->ref, to->ref); ~~~ But in the retry loop in `call_ad`: ~~~ if (retried) { __ip_set_get(set); nfnl_unlock(NFNL_SUBSYS_IPSET); cond_resched(); nfnl_lock(NFNL_SUBSYS_IPSET); __ip_set_put(set); } ~~~ No lock is hold when it does the `cond_resched()`. As a result, `ip_set_ref_lock` (in thread 2) can swap the set with another when thread 1 is doing the `cond_resched()`. When thread 1 wakes up, the `set` variable alreays means another `set`, calling `__ip_set_put` on it will decrease the refcount on the wrong `set`, triggering the `BUG_ON` call. According to Jozsef Kadlecsik, who fixed the bug, the root cause is that the `call_ad` function is using a wrong ref counter. Instead of using `__ip_set_get`, which operates on `set->ref`, the correct way is to operate on `set->ref_netlink`. [Severity] It will invoke a `BUG_ON` call, leading to kernel panic. In other words, it will lead to local DoS. [Patch] Jozsef Kadlecsik prepared a patch and it got merged into mainline and stables already. The patch can be found here: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7433b6d2afd512d04398c73aa984d1e285be125b [Proof-of-Concept] A proof-of-concept code to trigger the bug is attached to this email. Best, Kyle ======================================================================== [ 5.110096] ------------[ cut here ]------------ [ 5.110337] kernel BUG at net/netfilter/ipset/ip_set_core.c:677! [ 5.110618] invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI [ 5.110892] CPU: 2 PID: 507 Comm: poc Not tainted 6.1.47+ #67 [ 5.111143] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 5.111490] RIP: 0010:call_ad+0x83e/0x850 [ 5.111677] Code: 89 df e8 35 c6 d2 fd e9 d4 fd ff ff 44 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c d7 fd ff ff 4c 89 ff e8 a7 c5 d2 fd e9 ca fd ff ff <0f> 0b e8 0b 09 85 00 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 [ 5.112481] RSP: 0018:ffff88800c4d7350 EFLAGS: 00010246 [ 5.112718] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000000ff [ 5.113047] RDX: ffff88800b658324 RSI: 0000000000000004 RDI: ffff88800c4d7314 [ 5.113373] RBP: ffff88800c4d7448 R08: dffffc0000000000 R09: ffffed100189ae63 [ 5.113696] R10: dfffe9100189ae64 R11: 1ffff1100189ae62 R12: dffffc0000000000 [ 5.114024] R13: 1ffff110016cb067 R14: ffff88800b658338 R15: ffffffff8557d401 [ 5.114346] FS: 00000000027203c0(0000) GS:ffff888034f00000(0000) knlGS:0000000000000000 [ 5.114745] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5.115049] CR2: 000000000046c280 CR3: 000000000d71c005 CR4: 0000000000770ee0 [ 5.115478] PKRU: 55555554 [ 5.115653] Call Trace: [ 5.115799] <TASK> [ 5.115923] ? __die_body+0x67/0xb0 [ 5.116125] ? die+0xa0/0xc0 [ 5.116295] ? do_trap+0x124/0x350 [ 5.116485] ? call_ad+0x83e/0x850 [ 5.116670] ? call_ad+0x83e/0x850 [ 5.116855] ? handle_invalid_op+0x96/0xd0 [ 5.117084] ? call_ad+0x83e/0x850 [ 5.117270] ? exc_invalid_op+0x2f/0x40 [ 5.117453] ? asm_exc_invalid_op+0x16/0x20 [ 5.117633] ? call_ad+0x83e/0x850 [ 5.117782] ip_set_ad+0x68e/0x7d0 [ 5.117932] ? mutex_lock+0x76/0xc0 [ 5.118083] nfnetlink_rcv_msg+0x6a7/0x830 [ 5.118262] netlink_rcv_skb+0x15a/0x330 [ 5.118430] ? nfnetlink_unbind+0x180/0x180 [ 5.118632] nfnetlink_rcv+0x22d/0x1e70 [ 5.118797] ? __stack_depot_save+0x35/0x480 [ 5.118982] ? kasan_set_track+0x61/0x70 [ 5.119150] ? kasan_set_track+0x4c/0x70 [ 5.119318] ? __kasan_kmalloc+0x85/0x90 [ 5.119486] ? netlink_sendmsg+0x509/0xa00 [ 5.119660] ? __sys_sendto+0x494/0x4b0 [ 5.119826] ? __x64_sys_sendto+0xda/0xf0 [ 5.119998] ? do_syscall_64+0x67/0x90 [ 5.120159] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 5.120383] ? __netlink_lookup+0x2fa/0x310 [ 5.120562] netlink_unicast+0x675/0x8a0 [ 5.120731] netlink_sendmsg+0x685/0xa00 [ 5.120902] ? netlink_getsockopt+0x3f0/0x3f0 [ 5.121093] __sys_sendto+0x494/0x4b0 [ 5.121264] __x64_sys_sendto+0xda/0xf0 [ 5.121438] do_syscall_64+0x67/0x90 [ 5.121628] ? exit_to_user_mode_prepare+0x12/0xa0 [ 5.121874] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 5.122142] RIP: 0033:0x475b30 [ 5.122305] Code: c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 1d 45 31 c9 45 31 c0 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 68 c3 0f 1f 80 00 00 00 00 41 54 48 83 ec 20 [ 5.123328] RSP: 002b:00007ffc64795c98 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 5.123741] RAX: ffffffffffffffda RBX: 00007ffc64795f48 RCX: 0000000000475b30 [ 5.124512] RDX: 000000000000007c RSI: 00000000027244a0 RDI: 0000000000000005 [ 5.124905] RBP: 00007ffc64795d40 R08: 0000000000000000 R09: 0000000000000000 [ 5.125416] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001 [ 5.125860] R13: 00007ffc64795f38 R14: 0000000000500740 R15: 0000000000000002 [ 5.126282] </TASK> [ 5.126408] Modules linked in: [ 5.126613] ---[ end trace 0000000000000000 ]--- [ 5.127317] RIP: 0010:call_ad+0x83e/0x850 [ 5.127565] Code: 89 df e8 35 c6 d2 fd e9 d4 fd ff ff 44 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c d7 fd ff ff 4c 89 ff e8 a7 c5 d2 fd e9 ca fd ff ff <0f> 0b e8 0b 09 85 00 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 [ 5.128567] RSP: 0018:ffff88800c4d7350 EFLAGS: 00010246 [ 5.128928] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000000ff [ 5.129356] RDX: ffff88800b658324 RSI: 0000000000000004 RDI: ffff88800c4d7314 [ 5.129766] RBP: ffff88800c4d7448 R08: dffffc0000000000 R09: ffffed100189ae63 [ 5.130203] R10: dfffe9100189ae64 R11: 1ffff1100189ae62 R12: dffffc0000000000 [ 5.130602] R13: 1ffff110016cb067 R14: ffff88800b658338 R15: ffffffff8557d401 [ 5.130973] FS: 00000000027203c0(0000) GS:ffff888034f00000(0000) knlGS:0000000000000000 [ 5.131454] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5.131809] CR2: 000000000046c280 CR3: 000000000d71c005 CR4: 0000000000770ee0 [ 5.132290] PKRU: 55555554 [ 5.132452] Kernel panic - not syncing: Fatal exception in interrupt [ 5.133092] Kernel Offset: disabled [ 5.133320] Rebooting in 1000 seconds..
Attachment:
poc.c
Description:
Current thread:
- [CVE-2023-42756] Linux kernel race condition in netfilter Kyle Zeng (Sep 27)