oss-sec mailing list archives
Re: Possible AMD Zen2 CVE
From: Mathias Krause <minipli () googlemail com>
Date: Tue, 19 Sep 2023 09:47:55 +0200
Hi Steve, On 18.09.23 22:37, Steve Thompson wrote:
[snip] In a loop, the following code is found: nr_spin = t1lock_acquire(&obj.lock); #if defined BROKEN temp = ++obj.value; #else ++obj.value; #endif t1lock_release(&obj.lock); If "BROKEN" is defined, you can see that an additional cache-line write is made with the assignment to 'temp'. When this code path is enabled, the underlying cmpxchg operation in t1lock_acquire() occasionally succeeds when it shouldn't...
I think you're misinterpreting the generated binary. Looking only at the difference in the core loop give us the following diff between good and bad: $ objdump -wdr --no-show-raw-insn good [...] 0000000000001580 <wr_thread>: :: 1653: incq 0x2a4e(%rip) # 40a8 <obj+0x8> 165a: incw (%rbx) In 'good' the increment of obj.value at 1653 is followed by the increment of obj.lock.ticket. Everything looking good so far. $ objdump -wdr --no-show-raw-insn bad [...] 0000000000001580 <wr_thread>: :: 1653: mov 0x2a4e(%rip),%rax # 40a8 <obj+0x8> 165a: incw (%rbx) 165d: inc %rax 1660: mov %rax,0x2a41(%rip) # 40a8 <obj+0x8> 1667: mov %rax,0x2a5a(%rip) # 40c8 <temp> In 'bad', however, obj.value is incremented only *after* obj.lock.ticket was incremented and the lock thereby released, allowing further threads to take it. This allows the data race between the read of obj.value in 1653, its increment in 165d (after the lock was released again) and writing back the possibly out-of-date value to obj.value at 1660. What's clear from the above code dump that the code is missing a memory barrier. Adding it, like the patch at the end of the email does, gives me the following: $ objdump -wdr --no-show-raw-insn not_bad [...] 0000000000001580 <wr_thread>: :: 1653: mov 0x2a4e(%rip),%rax # 40a8 <obj+0x8> 165a: inc %rax 165d: mov %rax,0x2a44(%rip) # 40a8 <obj+0x8> 1664: mov %rax,0x2a5d(%rip) # 40c8 <temp> 166b: incw (%rbx) It's basically the same instructions as for 'bad' but the lock release, i.e. obj.lock.ticket++, was moved after the write operations to 166b, preventing the data race of 'bad'. Here's the fix: --- a/bug_src.c +++ b/bug_src.c @@ -91,6 +91,7 @@ typedef struct { __inline__ void t1lock_release(t1lock * oo) { + __asm__ ("" ::: "memory"); oo->ticket++; return; The code probably needs more memory barriers to prevent making the compiler moving reads and writes outside of the critical section. But, I guess, there are good online resources to read up the requirements, e.g. what's written for the Linux kernel should be a good start: https://www.kernel.org/doc/Documentation/memory-barriers.txt Cheers, Mathias
Current thread:
- Possible AMD Zen2 CVE Steve Thompson (Sep 18)
- Re: Possible AMD Zen2 CVE Mathias Krause (Sep 19)
- RE: [External] : [oss-security] Possible AMD Zen2 CVE Casper Dik (Sep 19)