FIAS . Impressum . Privacy

std::vector vs Vc::Memory

Sandro Wenzel
Fri Mar 7 14:11:18 CET 2014


Dear Tijskens,

I am writing back to confirm that I now see the same behaviour as you. I
am attaching a slightly modified version of the code that puts the
individual tests into separate functions (to make it easier to look at the
assembly and to enable binary instrumentation analysis).
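
As a rough illustration of what "one function per test" can look like, here
is a minimal sketch in plain C++ without Vc. The function names and the
TEST_NOINLINE macro are hypothetical and are not taken from the attached
file; the noinline attribute is an assumption that holds for GCC-compatible
compilers and keeps each loop visible as its own symbol in the disassembly.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical macro: asks GCC-compatible compilers (gcc, clang, icc) not to
// inline the test function. On other compilers it expands to nothing.
#if defined(__GNUC__) || defined(__INTEL_COMPILER)
#define TEST_NOINLINE __attribute__((noinline))
#else
#define TEST_NOINLINE
#endif

// Scalar loop over a std::vector, isolated in its own function.
TEST_NOINLINE void subtract_one_std_vector(std::vector<float>& x) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        x[i] -= 1.0f;
    }
}

// The same loop over a plain C-like array.
TEST_NOINLINE void subtract_one_plain_array(float* x, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        x[i] -= 1.0f;
    }
}
```

With the tests split up like this, each loop can be located by name in the
assembly output or in a profiler.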

I also added the same tests using plain C-style arrays. With those, the Vc
versions perform well right away:

@@@ test0 - begin @@@
std::vector<float> vector : 2087 cycles/repetition, 6.14e-07 seconds/repetition, 1x speedup, 3.4 GHz, 1000000 repetitions.
float * vector : 287 cycles/repetition, 8.45e-08 seconds/repetition, 7.27x speedup, 3.4 GHz, 1000000 repetitions.
Vc::vector<float> -one from std::vector : 635 cycles/repetition, 1.869e-07 seconds/repetition, 3.29x speedup, 3.4 GHz, 1000000 repetitions.
Vc::vector<float> -one from plain array : 319 cycles/repetition, 9.397e-08 seconds/repetition, 6.53x speedup, 3.4 GHz, 1000000 repetitions.
Vc::vector with -1 : 410 cycles/repetition, 1.207e-07 seconds/repetition, 5.09x speedup, 3.4 GHz, 1000000 repetitions.
Vc::vector with -1 and plain array : 317 cycles/repetition, 9.345e-08 seconds/repetition, 6.57x speedup, 3.4 GHz, 1000000 repetitions.
Vc::memory : 310 cycles/repetition, 9.144e-08 seconds/repetition, 6.72x speedup, 3.4 GHz, 1000000 repetitions.
@@@ test0 - done @@@



My conclusion is that std::vector is best avoided here (and in any case I
still had issues with alignment). Note also that the compiler's
autovectorization beats every other solution in this test (probably because
it also unrolls the loop).
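
On the alignment point: std::vector<float> with the default allocator is not
guaranteed to hand out storage aligned to the 32 bytes that aligned AVX
loads and stores expect. One possible workaround, sketched here in plain
C++11 without Vc, is a custom allocator. AlignedAllocator and is_aligned are
names made up for this illustration; this is not what the attached test code
does, and it assumes the alignment is a power of two no smaller than a
pointer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical helper: true if p is a multiple of `alignment` bytes.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical minimal allocator that over-allocates with malloc and returns
// an address rounded up to `Alignment` bytes.
template <typename T, std::size_t Alignment>
struct AlignedAllocator {
    static_assert((Alignment & (Alignment - 1)) == 0, "power of two required");
    static_assert(Alignment >= sizeof(void*), "alignment too small");

    using value_type = T;
    // Needed because Alignment is a non-type template parameter.
    template <typename U>
    struct rebind { using other = AlignedAllocator<U, Alignment>; };

    AlignedAllocator() noexcept = default;
    template <typename U>
    AlignedAllocator(const AlignedAllocator<U, Alignment>&) noexcept {}

    T* allocate(std::size_t n) {
        // Room for the payload, the worst-case padding, and one pointer slot.
        const std::size_t bytes = n * sizeof(T) + Alignment + sizeof(void*);
        void* raw = std::malloc(bytes);
        if (!raw) throw std::bad_alloc();
        const std::uintptr_t start =
            reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
        const std::uintptr_t aligned =
            (start + Alignment - 1) & ~static_cast<std::uintptr_t>(Alignment - 1);
        // Store the original malloc pointer just in front of the aligned block.
        reinterpret_cast<void**>(aligned)[-1] = raw;
        return reinterpret_cast<T*>(aligned);
    }

    void deallocate(T* p, std::size_t) noexcept {
        // Recover the original malloc pointer and release it.
        std::free(reinterpret_cast<void**>(p)[-1]);
    }
};

template <typename T, typename U, std::size_t A>
bool operator==(const AlignedAllocator<T, A>&, const AlignedAllocator<U, A>&) noexcept {
    return true;
}
template <typename T, typename U, std::size_t A>
bool operator!=(const AlignedAllocator<T, A>&, const AlignedAllocator<U, A>&) noexcept {
    return false;
}
```

A vector declared as std::vector<float, AlignedAllocator<float, 32>> then
keeps the dynamic size of std::vector while its data() pointer satisfies
is_aligned(ptr, 32). Whether this closes the performance gap seen above
would still have to be measured.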


I compiled like this:

icc -mavx -I ./ -O2 -I ${VCROOT}/include testmodif.cpp -o foo.x -std=c++11 \
    -L ${VCROOT}/lib -lVc -fabi-version=6


Best

Sandro




2014-03-06 9:42 GMT+01:00 Tijskens Engelbert:

>  Dear Sandro,
> the attachments contain the main file test.cpp and the included timer.h.
> I included some unrolling tests, as Matthias suggested, for the scalar case.
> That helps indeed. I haven't checked the SIMD case so far.
> Kindest regards,
> Bert
>
> Sandro Wenzel wrote:
>
>  Dear Tijskens,
>
>  I was intrigued by your observations and tried to reproduce them, but I
> failed. Actually, I feel, like Matthias, that measuring such a short,
> minimalistic code section is really tough.
>
>  Would you be able to share your benchmark code and the way you compile
> it, so that I can have a more thorough look?
>
>  Best
>
>  Sandro
>
>
>
> 2014-03-04 18:57 GMT+01:00 Tijskens Engelbert:
>
> Dear all,
>
>  I am trying to figure out how to use std::vector<float> efficiently in
> combination with Vc (to have both dynamic arrays and performance).
>
>     // ne is the number of floats (presumably 1024); nv is the number of
>     // SIMD vectors (presumably ne / Vc::float_v::Size)
>     std::vector<float> x(1024);
>     for( int i=0; i<ne; ++i ) { // initialize
>         x[i] = 1.0;
>     }
>     // scalar loop using std::vector
>     for( int i=0; i<ne; ++i ) {
>         x[i] -= 1.0;
>     }
>     // vector loop using std::vector
>     for( int i=0; i<ne; i+=Vc::float_v::Size )
>     {
>         Vc::float_v vx( &x[i] );
>         vx -= 1.0;
>         vx.store( &x[i] );
>     }
>     // vector loop using Vc::Memory instead of std::vector
>     Vc::Memory<Vc::float_v,ne> Vx;
>     for( int i=0; i<ne; ++i ) { // initialize
>         Vx[i] = 1.0;
>     }
>     Vc::float_v one(1.);
>     ET_TIME_THIS
>     ( "Vc::Memory<Vc::float_v,ne>  vector",
>         for( int i=0; i<nv; ++i ) {
>             Vx.vector(i) -= one;
>         }
>     );
> When I time these loops I get the following results:
>  scalar loop using std::vector : 2162 cycles/repetition, 9.4e-07 seconds/repetition, 1   x speedup, 2.3  GHz, 100 repetitions.
> vector loop using std::vector :  357 cycles/repetition, 1.6e-07 seconds/repetition, 6.04x speedup, 2.24 GHz, 100 repetitions.
> vector loop using Vc::Memory  :  288 cycles/repetition, 1.2e-07 seconds/repetition, 7.49x speedup, 2.4  GHz, 100 repetitions.
>
>
>  Is there a way to improve the vector loop using std::vector? By the way,
> if I write the second loop as
>  // vector loop using std::vector
>     for( int i=0; i<ne; i+=Vc::float_v::Size )
>     {
>         Vc::float_v vx( &x[i] );
>         vx -= one;
>         vx.store( &x[i] );
>     }
>  things get even worse: the speedup is only about 4.2x.
>
>
> _______________________________________________
> Vc mailing list
> https://compeng.uni-frankfurt.de/mailman/listinfo/vc
>
>
>
>
>  --
> Dr. Sandro Wenzel
> PH / SFT
> CERN
>
>
>


-- 
Dr. Sandro Wenzel
PH / SFT
CERN
-------------- next part --------------
A non-text attachment was scrubbed...
Name: testmodif2.cpp
Type: text/x-c++src
Size: 3019 bytes
Desc: not available
URL: <http://compeng.uni-frankfurt.de/pipermail/vc/attachments/20140307/b81fe582/attachment.cpp>

