FIAS . Impressum . Privacy

Vc mask-bag?

Matthias Kretz
Wed Sep 8 09:20:57 CEST 2010


Hi,

On Wednesday 08 September 2010 08:54:05 Kulakov, Igor wrote:
> sfloat_v Y4;
> short_m dnMask;
>
> Y4.gather( data.HitDataY( rowUpUp ), upupIndexes,
>            static_cast<sfloat_m>(dnMask) );
> works
>
> Y4.gather( data.HitDataY( rowUpUp ), upupIndexes, dnMask );
> worked with -O0, but problems with -O2

Actually, the latter is explicitly supported by the implementation. 
sfloat_v::gather uses a special helper type to accept both short_m (which 
internally is SSE::Mask<8>) and sfloat_m (which internally is 
SSE::Float8Mask). It's the Float8GatherMask type, which you can find in 
Vc/sse/mask.h.

So really both should work, and the latter should even be a little faster 
because it requires fewer conversions. I'll try to add your example to the unit 
tests and see whether I can reproduce the problem (and then fix it).
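
To illustrate the helper-type idea, here is a minimal scalar sketch. It is not 
the actual Vc code: ShortMask8, FloatMask8, GatherMask8 and masked_gather are 
hypothetical stand-ins for short_m, sfloat_m, Float8GatherMask and 
sfloat_v::gather, and the real types wrap SSE registers rather than arrays. It 
only shows how a parameter type with two converting constructors lets one 
gather function accept either mask.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for short_m: one flag per 16-bit lane.
struct ShortMask8 {
    std::array<bool, 8> lanes;
};

// Hypothetical stand-in for sfloat_m: two halves of four float lanes each.
struct FloatMask8 {
    std::array<bool, 4> lo;
    std::array<bool, 4> hi;
};

// Hypothetical helper parameter type: implicitly constructible from
// either mask type, so callers never need an explicit cast.
class GatherMask8 {
public:
    GatherMask8(const ShortMask8 &m) : bits_(0) {
        for (std::size_t i = 0; i < 8; ++i) {
            if (m.lanes[i]) bits_ |= 1u << i;
        }
    }
    GatherMask8(const FloatMask8 &m) : bits_(0) {
        for (std::size_t i = 0; i < 4; ++i) {
            if (m.lo[i]) bits_ |= 1u << i;
            if (m.hi[i]) bits_ |= 1u << (i + 4);
        }
    }
    // True if lane i is selected.
    bool operator[](std::size_t i) const { return ((bits_ >> i) & 1u) != 0; }

private:
    unsigned bits_;  // one bit per lane
};

// Scalar model of a masked gather: only lanes selected by the mask are
// loaded from src[idx[i]]; all other lanes of dst are left untouched.
void masked_gather(float (&dst)[8], const float *src,
                   const unsigned short (&idx)[8], GatherMask8 mask) {
    for (std::size_t i = 0; i < 8; ++i) {
        if (mask[i]) dst[i] = src[idx[i]];
    }
}
```

With this, masked_gather(dst, src, idx, someShortMask) and 
masked_gather(dst, src, idx, someFloatMask) both compile without a 
static_cast and should fill the same lanes.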

Regards,
	Matthias

-- 
Dipl.-Phys. Matthias Kretz


