Vc mask-bag?
Matthias Kretz
Wed Sep 8 09:20:57 CEST 2010
Hi,
On Wednesday 08 September 2010 08:54:05 Kulakov, Igor wrote:
> sfloat_v Y4;
> short_m dnMask;
>
> Y4.gather( data.HitDataY( rowUpUp ), upupIndexes, static_cast<sfloat_m>(dnMask) );
> works
>
> Y4.gather( data.HitDataY( rowUpUp ), upupIndexes, dnMask );
> worked with -O0, but problems with -O2
Actually, the latter is explicitly supported by the implementation.
sfloat_v::gather uses a special helper type to accept both short_m (which
internally is SSE::Mask<8>) and sfloat_m (which internally is
SSE::Float8Mask). It is the Float8GatherMask type, which you can find in
Vc/sse/mask.h. So both should work, and the latter should even be a little
faster because it requires fewer conversions. I'll try to add your example to
the unit tests and see whether I can reproduce the problem (and then fix it).
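
To illustrate the idea of one helper type accepting two mask types, here is a
minimal scalar sketch. It is not the Vc code: ShortMask, FloatMask8,
GatherMask8 and FloatVec8 are hypothetical stand-ins for short_m, sfloat_m,
Float8GatherMask and sfloat_v, and the real types wrap SSE registers rather
than arrays. It only shows how implicit converting constructors let a single
gather() signature take either mask.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-ins; the real Vc masks wrap SSE registers.
struct ShortMask  { std::array<bool, 8> bits; };  // plays the role of short_m
struct FloatMask8 { std::array<bool, 8> bits; };  // plays the role of sfloat_m

// Helper type taken by gather(). Because it is implicitly constructible
// from either mask type, callers can pass whichever one they have.
struct GatherMask8 {
    std::array<bool, 8> bits;
    GatherMask8(const ShortMask &m) : bits(m.bits) {}
    GatherMask8(const FloatMask8 &m) : bits(m.bits) {}
};

// Stand-in for an 8-wide float vector.
struct FloatVec8 {
    std::array<float, 8> v{};  // all lanes start at 0

    // Masked gather: only lanes whose mask bit is set are loaded;
    // the other lanes keep their previous value.
    void gather(const float *base,
                const std::array<unsigned short, 8> &indexes,
                GatherMask8 mask) {
        for (std::size_t i = 0; i < 8; ++i) {
            if (mask.bits[i]) {
                v[i] = base[indexes[i]];
            }
        }
    }
};
```

With these stand-ins, y.gather(data, indexes, shortMask) and
y.gather(data, indexes, floatMask) both compile and load the same lanes; the
conversion to the helper type happens at the call site.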
Regards,
Matthias
--
Dipl.-Phys. Matthias Kretz