Permutation

__m128i __lsx_vpermi_w (__m128i a, __m128i b, imm0_255 imm)

Synopsis

__m128i __lsx_vpermi_w (__m128i a, __m128i b, imm0_255 imm)
#include <lsxintrin.h>
Instruction: vpermi.w vr, vr, imm
CPU Flags: LSX

Description

Permute words from a and b with indices recorded in imm and store into dst.

Operation

dst.word[0] = b.word[imm & 0x3];
dst.word[1] = b.word[(imm >> 2) & 0x3];
dst.word[2] = a.word[(imm >> 4) & 0x3];
dst.word[3] = a.word[(imm >> 6) & 0x3];

Tested on real machine.

Latency and Throughput

CPU Latency Throughput (CPI)
3A6000 1 4
3C5000 1 2