-
-
Notifications
You must be signed in to change notification settings - Fork 14.8k
Non-temporal stores (and _mm_stream operations in stdarch) break our memory model #114582
Copy link
Copy link
Closed
Closed
Copy link
Labels
I-unsoundIssue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/SoundnessIssue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/SoundnessO-x86_32Target: x86 processors, 32 bit (like i686-*) (also known as IA-32, i386, i586, i686)Target: x86 processors, 32 bit (like i686-*) (also known as IA-32, i386, i586, i686)O-x86_64Target: x86-64 processors (like x86_64-*) (also known as amd64 and x64)Target: x86-64 processors (like x86_64-*) (also known as amd64 and x64)P-mediumMedium priorityMedium priorityT-langRelevant to the language teamRelevant to the language teamT-opsemRelevant to the opsem teamRelevant to the opsem team
Metadata
Metadata
Assignees
Labels
I-unsoundIssue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/SoundnessIssue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/SoundnessO-x86_32Target: x86 processors, 32 bit (like i686-*) (also known as IA-32, i386, i586, i686)Target: x86 processors, 32 bit (like i686-*) (also known as IA-32, i386, i586, i686)O-x86_64Target: x86-64 processors (like x86_64-*) (also known as amd64 and x64)Target: x86-64 processors (like x86_64-*) (also known as amd64 and x64)P-mediumMedium priorityMedium priorityT-langRelevant to the language teamRelevant to the language teamT-opsemRelevant to the opsem teamRelevant to the opsem team
Type
Fields
Give feedbackNo fields configured for issues without a type.
I recently discovered this funny little intrinsic with the great comment
Unfortunately, the comment is wrong: this has become stable, through vendor intrinsics like
_mm_stream_ps.Why is that a problem? Well, turns out non-temporal stores completely break our memory model. The following assertion can fail under the current compilation scheme used by LLVM:
The assertion can fail because the CPU may order MOVNT after later MOV (for different locations), so the nontemporal_store might occur after the release store. Sources for this claim:
This is a big problem -- we have a memory model that says you can use release/acquire operations to synchronize any (including non-atomic) memory accesses, and we have memory accesses which are not properly synchronized by release/acquire operations.
So what could be done?
_mm_streamintrinsics without it and mark them as deprecated to signal that they don't match the expected semantics of the underlying hardware operation. People should use inline assembly instead and then it is their responsibility to have an sfence at the end of their asm block to restore expected synchronization behavior.Thanks a lot to @workingjubilee and @the8472 for their help in figuring out the details of nontemporal stores.
Cc @rust-lang/lang @Amanieu
Also see the nomination comment here.