lethargic.talkative.fish

blackle mori @suricrasia
Sep 14, 2022

suppose in one thread I have:

while true {
    lock(mutex)
    if (shared_var == true) do_thing()
    unlock(mutex)
}

and in another thread I have:

shared_var = true
sleep(30)
shared_var = false

(note the lack of a mutex)

what happens and why?

Sep 14, 2022, 20:03
blackle mori @suricrasia
Sep 14, 2022

something something store queues, right? idk why I keep learning then forgetting how this actually works on the CPU

Sep 14, 2022, 20:05
Drahflow @drahflow@infosec.exchange
Sep 14, 2022

@suricrasia On _which_ CPU and with _which_ compiler specifically? There is a hilarious number of papers discussing not only the problems of e.g. the Java memory model, but first, what _is_ the Java memory model.

Sep 14, 2022, 20:07
blackle mori @suricrasia
Sep 14, 2022

@drahflow to be simple (for some definition of simple) let's suppose an x86 cpu with two cores, and the threads are on two different cores

Sep 14, 2022, 20:08
Drahflow @drahflow@infosec.exchange
Sep 14, 2022

@suricrasia A modern compiler might see that `shared_var` is independent of sleep() and move the first store below it. Then a superscalar CPU will find that the second store shadows the first, eliminate the first, and voilà, nothing happens. (Assuming shared_var is declared volatile; otherwise the compiler might have done this already anyway.) Except when the scheduler interrupts the core in between the scheduling of the two stores.

Sep 14, 2022, 20:12
Drahflow @drahflow@infosec.exchange
Sep 14, 2022

@suricrasia IIRC, the one guarantee you get for similar code on x86 (but NOT everywhere) is that stores reach cross-core memory levels in the order the instructions were issued.

Sep 14, 2022, 20:14
Drahflow @drahflow@infosec.exchange
Sep 14, 2022

@suricrasia Source: AMD x86-64 Programmer's Manual, Vol. 1, Rev. 3.08, page 114 (search for chapter "3.9 Memory Optimization")

Sep 14, 2022, 20:29
Drahflow @drahflow@infosec.exchange
Sep 14, 2022

@suricrasia The other one being (but irrelevant for this code) that loads from addresses are guaranteed to see data which is not older than it was when the address was loaded. (Even when the address was speculated in reality.) I think this one is more widespread.

Sep 14, 2022, 20:17
blackle mori @suricrasia
Sep 14, 2022

@drahflow hmm I thought there was something about functions-with-side-effects that means memory operations don't get re-ordered around function calls. though this might be an annotation you have to add to a function call...

Sep 14, 2022, 20:20
kepstin @kepstin@glitch.social
Sep 14, 2022

@suricrasia @drahflow function calls are… tricky. There's some *compile-time* re-ordering that can't be done when calling a function that's in an external library (with LTO, anything goes…), but in this case I think the compiler will know that the sleep() call has no access to shared_var due to aliasing rules, so it could probably combine the stores to it.
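
A rough sketch of what "combining the stores" would amount to, in C++ (the thread's snippet is pseudocode, so the language is an assumption). It assumes `shared_var` is a plain, non-atomic, file-local `bool`, and uses a 1 ms `sleep_for` as a stand-in for `sleep(30)`. All function names are hypothetical. Whether a given compiler really performs this rewrite depends on what it can prove about the sleep call; the second function only shows the result the rewrite would have.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Assumption: a plain (non-atomic, non-volatile) flag local to this file.
static bool shared_var = false;

// The writer thread's body, as written in the original post.
void writer_as_written() {
    shared_var = true;
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
    shared_var = false;
}

// What the writer reduces to if the first store is moved below the
// sleep and then dropped because the second store overwrites it.
void writer_after_store_combining() {
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
    shared_var = false;
}

// Within a single thread the two versions are indistinguishable:
// both leave the flag false. Only another thread could notice the
// missing `true` window, and without synchronization the language
// gives that thread no guarantee of seeing it.
bool single_thread_result_matches() {
    writer_as_written();
    bool first_ok = !shared_var;
    writer_after_store_combining();
    bool second_ok = !shared_var;
    assert(first_ok && second_ok);
    return first_ok && second_ok;
}
```

Calling `single_thread_result_matches()` from `main` returns `true`; the point is that nothing observable from the writer's own thread forbids the rewrite.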

Sep 14, 2022, 21:00
kepstin @kepstin@glitch.social
Sep 14, 2022

@suricrasia @drahflow when running on other cores you might also have memory ordering problems here? I forget about x86 specifically, but you have mutex lock calls (which have "acquire" memory ordering), but you have nothing that does "release" memory ordering after the write, so I'm not sure if or when the core doing the mutex loop will necessarily see the update?
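
A minimal C++ sketch of the acquire/release pairing mentioned above, assuming `std::atomic` and `std::thread` are available. The names `publish_and_read`, `payload`, and `ready` are hypothetical. The writer publishes with a release store and the reader waits with an acquire load, which is the pairing the original writer thread lacks.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Returns the payload value the reading side observes.
int publish_and_read() {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};   // the flag that publishes it

    std::thread producer([&] {
        payload = 42;                                  // ordinary write
        ready.store(true, std::memory_order_release);  // "release" after the write
    });

    // The acquire load pairs with the release store: once it reads true,
    // every write the producer made before the store is visible here.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    int seen = payload;

    producer.join();
    assert(seen == 42);
    return seen;
}
```

If the store and load were plain accesses to a `bool`, this would be a data race in C++ and neither the loop's termination nor the value of `seen` would be guaranteed by the language.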

Sep 14, 2022, 21:01
kepstin @kepstin@glitch.social
Sep 14, 2022

@suricrasia @drahflow FWIW, this would "just work" as expected if, instead of a normal bool, you used an atomic type of the kind built into many modern programming languages, e.g. doc.rust-lang.org/std/sync/ato or en.cppreference.com/w/cpp/atom where the operations used to store and retrieve the values from the variable do the expected compiler *and* runtime memory ordering.
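
A sketch of the original two threads with the flag turned into a `std::atomic<bool>`, in C++. This is one possible reading of the pseudocode, not the poster's actual program: `stop_requested`, `do_thing_calls`, `loop_mutex`, and `run_demo` are hypothetical additions so the demo can terminate and report something, and the 30-second sleep is replaced by a caller-chosen window.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

std::atomic<bool> shared_var{false};      // the flag from the original post
std::atomic<bool> stop_requested{false};  // added so the loop can end
std::atomic<long> do_thing_calls{0};      // counts calls to do_thing()
std::mutex loop_mutex;

void do_thing() { do_thing_calls.fetch_add(1); }

// First thread: the locked polling loop from the original post.
void reader_loop() {
    while (!stop_requested.load()) {
        std::lock_guard<std::mutex> guard(loop_mutex);
        if (shared_var.load()) do_thing();
    }
}

// Second thread's work, run here on the calling thread: set the flag,
// wait, clear it. Returns how often do_thing() ran in total.
long run_demo(std::chrono::milliseconds window) {
    do_thing_calls.store(0);
    stop_requested.store(false);
    shared_var.store(false);

    std::thread reader(reader_loop);
    shared_var.store(true);    // atomic store, default seq_cst ordering
    std::this_thread::sleep_for(window);
    shared_var.store(false);

    stop_requested.store(true);
    reader.join();
    assert(!shared_var.load());
    return do_thing_calls.load();
}
```

Because both stores are atomic, the compiler may not treat them as invisible to other threads in the way it could with a plain `bool`, and the reader's loads are well-defined. With a window of a few hundred milliseconds, `run_demo` should report a positive call count on any machine where the reader thread gets scheduled at all.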

Sep 14, 2022, 21:05
Drahflow @drahflow@infosec.exchange
Sep 14, 2022

@suricrasia Rather the other way around: an unknown function (assuming a C-like ABI) could do anything (incl. memory fences), therefore no reordering. But sleep(), being presumably a stdlib call, might be annotated as having no such side effects, or even be inlined by the compiler. Think of an embedded platform where it's really just a loop.

Sep 14, 2022, 20:23