masked loads?
Matthias Kretz
Tue Aug 9 08:04:28 CEST 2016
On Monday, 8 August 2016 17:48:32 CEST Giordano Khouri wrote:
> _mm_maskload_ps is an AVX intrinsic. _mm_maskmoveu_si128 is SSE2, but will
> cause address exceptions even if those bytes are masked.
That is a good point, thanks. However, I should explain how the SSE vs. AVX
namespaces/policy tags work (since Vc 1.0). Vector<T, VectorAbi::Sse>
(SSE::Vector<T> is just an alias for it) only requests that function arguments
of that type are passed in xmm registers. With Vector<T, VectorAbi::Avx> you
get ymm registers.
The set of instructions used to implement the functions and operators is
partially orthogonal to that choice. E.g. compile with -mavx2 and explicitly
use SSE Vectors in your code: you'll get vector objects of 16 bytes, but
implemented with AVX instructions and, most importantly, VEX encoding (and
thus three-operand instructions).
So it is possible to implement an SSE::Vector function using AVX intrinsics.
But that does not cover builds compiled with -mno-avx. The implementation
therefore needs an #ifdef on __AVX__ and a fallback to a manual gather for the
pure SSE case.
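
To make that concrete, here is a rough sketch of what such a function could
look like, written against raw intrinsics rather than Vc types. The name
masked_load_ps and the choice to zero the masked-off lanes in the fallback
(to match what _mm_maskload_ps does) are assumptions for illustration, not
what Vc actually implements:

```cpp
#include <immintrin.h>
#include <cstdint>

// Hypothetical helper (not part of Vc): load only those floats from mem whose
// 32-bit mask lane has its sign bit set. All other lanes of the result are 0.
inline __m128 masked_load_ps(const float *mem, __m128i mask)
{
#ifdef __AVX__
    // AVX build: VMASKMOVPS on an xmm register. Masked-off elements are not
    // accessed, so they cannot raise a fault.
    return _mm_maskload_ps(mem, mask);
#else
    // Pure SSE build: manual gather that only touches the selected elements.
    alignas(16) std::int32_t m[4];
    alignas(16) float tmp[4] = {0.f, 0.f, 0.f, 0.f};
    _mm_store_si128(reinterpret_cast<__m128i *>(m), mask);
    for (int i = 0; i < 4; ++i) {
        if (m[i] < 0) {  // sign bit set means the lane is selected
            tmp[i] = mem[i];
        }
    }
    return _mm_load_ps(tmp);
#endif
}
```

Either way the function takes and returns 16-byte xmm values, so the SSE ABI
is unchanged; only the instructions behind it differ with the compiler flags.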
Cheers,
Matthias
--
──────────────────────────────────────────────────────────────────────────
Dr. Matthias Kretz https://kretzfamily.de
GSI Helmholtzzentrum für Schwerionenforschung https://gsi.de
SIMD easy and portable https://github.com/VcDevel/Vc
──────────────────────────────────────────────────────────────────────────