SIMD and vendor intrinsics module.
This module is intended to be the gateway to architecture-specific intrinsic functions, typically related to SIMD (but not always!). Each architecture that Rust compiles to may contain a submodule here, which means that this is not a portable module! If you’re writing a portable library take care when using these APIs!
Under this module you’ll find an architecture-named module, such as x86_64. Each #[cfg(target_arch)] that Rust can compile to may have a module entry here, only present on that particular target. For example the i686-pc-windows-msvc target will have an x86 module here, whereas x86_64-pc-windows-msvc has x86_64.
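As a rough illustration of how the per-architecture submodules are selected, the sketch below uses `cfg!` and `#[cfg]` on `target_arch`. The helper names `arch_module_name` and `sse_vector_bytes` are hypothetical and exist only for this example; the `__m128i` type is the 128-bit integer vector type exposed by both the `x86` and `x86_64` submodules.

```rust
// Hypothetical helper: names the `std::arch` submodule visible to this build.
fn arch_module_name() -> &'static str {
    if cfg!(target_arch = "x86") {
        "x86"
    } else if cfg!(target_arch = "x86_64") {
        "x86_64"
    } else if cfg!(target_arch = "aarch64") {
        "aarch64"
    } else {
        "other"
    }
}

// Each variant names exactly one architecture module, so each is only
// compiled on the target where that module exists.
#[cfg(target_arch = "x86")]
fn sse_vector_bytes() -> Option<usize> {
    Some(std::mem::size_of::<std::arch::x86::__m128i>())
}

#[cfg(target_arch = "x86_64")]
fn sse_vector_bytes() -> Option<usize> {
    Some(std::mem::size_of::<std::arch::x86_64::__m128i>())
}

// Portable fallback for targets without an x86-family module.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn sse_vector_bytes() -> Option<usize> {
    None
}
```

Referring to `std::arch::x86_64` without the matching `#[cfg]` would fail to compile on any other target, which is why each variant is gated separately.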
§Overview
This module exposes vendor-specific intrinsics that typically correspond to a single machine instruction. These intrinsics are not portable: their availability is architecture-dependent, and not all machines of that architecture might provide the intrinsic.
The arch module is intended to be a low-level implementation detail for higher-level APIs. Using it correctly can be quite tricky as you need to ensure at least a few guarantees are upheld:
- The correct architecture’s module is used. For example the arm module isn’t available on the x86_64-unknown-linux-gnu target. This is typically done by ensuring that #[cfg] is used appropriately when using this module.
- The CPU the program is currently running on supports the function being called. For example it is unsafe to call an AVX2 function on a CPU that doesn’t actually support AVX2.
As a result of the latter of these guarantees all intrinsics in this module are unsafe and extra care needs to be taken when calling them!
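One way to uphold both guarantees at once is to hide the intrinsic behind a safe wrapper. The sketch below is a minimal example, not a canonical pattern: `add_lanes` and `add_lanes_avx2` are hypothetical names, the architecture is checked with `#[cfg]`, and CPU support is checked at run time with the standard `is_x86_feature_detected!` macro before the unsafe call.

```rust
// Hypothetical safe wrapper: adds four i64 lanes element-wise.
fn add_lanes(a: [i64; 4], b: [i64; 4]) -> [i64; 4] {
    // Guarantee 1: this block only exists on x86/x86_64 builds.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        // Guarantee 2: confirm at run time that this CPU supports AVX2.
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified just above.
            return unsafe { add_lanes_avx2(a, b) };
        }
    }
    // Portable fallback with the same wrapping semantics as the intrinsic.
    std::array::from_fn(|i| a[i].wrapping_add(b[i]))
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn add_lanes_avx2(a: [i64; 4], b: [i64; 4]) -> [i64; 4] {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::*;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    let mut out = [0i64; 4];
    // SAFETY: the caller guarantees AVX2; the unaligned load/store
    // intrinsics read and write exactly 32 bytes of valid array memory.
    unsafe {
        let va = _mm256_loadu_si256(a.as_ptr() as *const __m256i);
        let vb = _mm256_loadu_si256(b.as_ptr() as *const __m256i);
        let sum = _mm256_add_epi64(va, vb);
        _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, sum);
    }
    out
}
```

Callers only ever see the safe `add_lanes`, so the unsafe reasoning stays in one place and the function still works on CPUs or targets without AVX2.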
§CPU Feature Detection
In order to call these APIs in a safe fashion there are a number of mechanisms available to ensure that the correct CPU feature is available to call an intrinsic. Let’s consider, for example, the _mm256_add_epi64 intrinsic on the x86 and x86_64 architectures. This function requires the AVX2 feature, as documented by Intel, so to correctly call this function we need to (a) guarantee we only call it on x86/x86_64 and (b) ensure that the CPU feature is available.
§Static CPU Feature Detection
The first option available to us is to conditionally compile code via the #[cfg] attribute. CPU features correspond to the target_feature cfg available, and can be used like so:
#[cfg(
    all(
        any(target_arch = "x86", target_arch = "x86_64"),
        target_feature = "avx2"
    )
)]
fn foo() {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::_mm256_add_epi64;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::_mm256_add_epi64;

    unsafe {
        _mm256_add_epi64(...);
    }
}
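For readers who want something that compiles, a fuller variant of the snippet above might look like the following sketch. The name `add4_static` is hypothetical, and the scalar fallback is an addition for this example so that the function exists in every build. The AVX2 version is only compiled when the feature is enabled for the whole build, for instance with `-C target-feature=+avx2`.

```rust
// Compiled only on x86/x86_64 builds that enable AVX2 statically.
#[cfg(all(
    any(target_arch = "x86", target_arch = "x86_64"),
    target_feature = "avx2"
))]
fn add4_static(a: [i64; 4], b: [i64; 4]) -> [i64; 4] {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::*;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    let mut out = [0i64; 4];
    // SAFETY: AVX2 is enabled for the whole build, and the unaligned
    // load/store intrinsics touch exactly 32 bytes of valid array memory.
    unsafe {
        let va = _mm256_loadu_si256(a.as_ptr() as *const __m256i);
        let vb = _mm256_loadu_si256(b.as_ptr() as *const __m256i);
        let sum = _mm256_add_epi64(va, vb);
        _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, sum);
    }
    out
}

// Scalar fallback (an assumption of this sketch) for every other build.
#[cfg(not(all(
    any(target_arch = "x86", target_arch = "x86_64"),
    target_feature = "avx2"
)))]
fn add4_static(a: [i64; 4], b: [i64; 4]) -> [i64; 4] {
    std::array::from_fn(|i| a[i].wrapping_add(b[i]))
}
```

Exactly one of the two definitions is compiled into any given build, so no run-time check is involved in choosing between them.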
Here we’re using #[cfg(target_feature = "avx2")] to conditionally compile this function into our module. This means that if the