Inlining and target features

Inlining and target features have a few interactions that can significantly affect the speed of your program.

#[target_feature] hinders inlining

Compilers are free to optimize functions by reordering their operations, as long as it doesn’t change the observable results. To prevent these optimizations from reordering target feature detection, the #[target_feature] attribute prevents inlining.

Since non-inlined function calls can be slow, try to use one large function tagged with #[target_feature], rather than many separate functions. Additionally, multiple uses of #[target_feature] may not be necessary, as explained in the next paragraph.

Info

Functions with #[target_feature] can still inline into other functions that support the same features via #[target_feature] or the -Ctarget-feature flag.

Inlined functions can inherit target features

When a function is inlined into another function tagged with #[target_feature], the inlined function is also compiled with those target features. To help ensure inlining, the #[inline] attribute can be used. This behavior can be used to write functions without worrying about what the final target features will be—in fact, this is what portable SIMD does! Functions in std::simd are inlined, allowing the user to specify the target features.

In the following example, all of the code is generated using AVX, because it’s all inlined into the function with #[target_feature]:

#![feature(portable_simd)] fn main() {} use std::simd::f32x4; #[inline] fn double(x: f32x4) -> f32x4 { x * f32x4::splat(2.0) // The `splat` and `*` functions are inlined here } #[target_feature(enable = "avx")] unsafe fn double_avx(x: f32x4) -> f32x4 { double(x) // The `double` function is inlined here }

Target features affect calling convention

The target features of a function affect its calling convention, the way memory is arranged when calling the function. Specifically, the target features affect which vector registers are used to pass vectors.

To ensure that functions with different target features can still be used in the same program, Rust passes vectors by reference instead of by value. This can significantly slow down calling functions with vector arguments because it forces the vector to be written to memory.

To avoid this behavior, avoid using vectors as arguments in non-inlined functions. Inlined functions are not affected because the function call is optimized out.