Native vector width

Different architectures have different vector register sizes. To make it even more complicated, some architectures have different vector register sizes available depending on target features and element type.

Using a fixed vector size

Some algorithms might be better suited to particular vector sizes, even if it doesn’t match the native vector size. In the case of mismatched sizes, there are two possibilities:

  • If the vectors are smaller than the native vector registers, native vectors will still be used but will be partially empty. In this scenario, the full parallel capability of the target is underused.
  • If the vectors are larger than the native vector registers, multiple native vectors will be used to emulate one large vector. If too many native vectors are used, the program can spill, meaning it has run out of registers and must use (much slower) memory instead.

Using a fixed vector size is a tradeoff between these possible issues and the design of the particular algorithm.

Using the native vector size

The target-features crate provides a function for determining the native vector size:

use target_features::CURRENT_TARGET;

// Different element types can have different vector sizes. Here we use `f32`.
const N: usize = if let Some(size) = CURRENT_TARGET.suggested_vector_width::<f32>() {
    size
} else {
    // If SIMD isn't supported natively, we use a vector of 1 element.
    // This is effectively a scalar value.
    1
};

/// Now we can use `Vector` instead of a particular vector type.
type Vector = Simd<f32, N>;

When detecting features, the multiversion crate provides the same capability:

use multiversion::{multiversion, target::selected_target};

#[multiversion(targets = "simd")]
fn example() {
    // The `selected_target` macro takes into account the detected optional features,
    // and not just the base features used in the previous example.
    const N: usize = if let Some(size) = selected_target!().suggested_vector_width::<f32>() {
        size
    } else {
        1
    };

    /// Once again, we can use `Vector` instead of a particular vector type.
    type Vector = Simd<f32, N>;
}