std::vector vs Vc::Memory
Tijskens Engelbert
Thu Mar 6 09:42:39 CET 2014
Dear Sandro,
The attachments contain the main file test.cpp and the included timer.h.
I included some unrolling tests for the scalar case, as Matthias suggested. That does indeed help. I haven't checked the SIMD case so far.
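For readers without the attachment, here is a minimal sketch of what manually unrolling the scalar loop could look like. It is not the code from test.cpp: the function name subtract_unrolled4 and the unroll factor of 4 are assumptions chosen purely for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not taken from test.cpp): subtracts `value` from every
// element of `x`, handling four elements per loop iteration.
void subtract_unrolled4(std::vector<float>& x, float value) {
    const std::size_t n = x.size();
    const std::size_t n4 = n - n % 4;  // largest multiple of 4 not exceeding n
    std::size_t i = 0;
    for (; i < n4; i += 4) {           // unrolled main loop
        x[i]     -= value;
        x[i + 1] -= value;
        x[i + 2] -= value;
        x[i + 3] -= value;
    }
    for (; i < n; ++i) {               // remainder loop for sizes not divisible by 4
        x[i] -= value;
    }
}
```

Whether this helps depends on the compiler and its flags, since an optimizing compiler may already unroll or vectorize the plain loop on its own.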
Kindest regards,
Bert
[address removed] wrote:
Dear Tijskens,
I was intrigued by your observations and tried to reproduce them, but I failed. Like Matthias, I feel that measuring such a short, minimal code section is really tough.
Would you be able to share your benchmark code and the way you compile it, so that I can take a more thorough look?
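To illustrate the measurement difficulty, a small timing helper could look like the sketch below. It is not the ET_TIME_THIS macro from Timer.h, whose implementation is not shown in this thread: the name best_time_seconds and the choice to report the fastest of several repetitions are assumptions made for this example.

```cpp
#include <algorithm>
#include <chrono>
#include <limits>

// Hypothetical helper (not the ET_TIME_THIS macro): runs `body` `repetitions`
// times and returns the duration of the fastest run in seconds. With zero
// repetitions it returns infinity.
template <typename F>
double best_time_seconds(F&& body, int repetitions) {
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    double best = std::numeric_limits<double>::infinity();
    for (int r = 0; r < repetitions; ++r) {
        const auto start = clock::now();
        body();
        const auto stop = clock::now();
        const double seconds = std::chrono::duration<double>(stop - start).count();
        best = std::min(best, seconds);
    }
    return best;
}
```

For a loop over only 1024 floats, a single run may be close to the resolution of the clock, and the compiler may remove work whose result is never used. Timing many passes per repetition and reading the result afterwards are two common ways to reduce both effects.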
Best
Sandro
[address removed] wrote:
Dear all,
I am trying to figure out how to use std::vector<float> efficiently in combination with Vc, so as to get both dynamic arrays and performance:
std::vector<float> x(1024);
for( int i=0; i<ne; ++i ) { // initialize
    x[i] = 1.0;
}
// scalar loop using std::vector
for( int i=0; i<ne; ++i ) {
    x[i] -= 1.0;
}
// vector loop using std::vector
for( int i=0; i<ne; i+=Vc::float_v::Size )
{
    Vc::float_v vx( &x[i] );
    vx -= 1.0;
    vx.store( &x[i] );
}
// vector loop using Vc::Memory instead of std::vector
Vc::Memory<Vc::float_v,ne> Vx;
for( int i=0; i<ne; ++i ) { // initialize
    Vx[i] = 1.0;
}
Vc::float_v one(1.);
ET_TIME_THIS
( "Vc::Memory<Vc::float_v,ne> vector",
    for( int i=0; i<nv; ++i ) { // i indexes whole vectors here, not single floats
        Vx.vector(i) -= one;
    }
);
When I time these loops, I get the following results:
scalar loop using std::vector : 2162 cycles/repetition, 9.4e-07 seconds/repetition, 1 x speedup, 2.3 GHz, 100 repetitions.
vector loop using std::vector : 357 cycles/repetition, 1.6e-07 seconds/repetition, 6.04x speedup, 2.24 GHz, 100 repetitions.
vector loop using Vc::Memory : 288 cycles/repetition, 1.2e-07 seconds/repetition, 7.49x speedup, 2.4 GHz, 100 repetitions.
Is there a way to improve the vector loop using std::vector? By the way, if I write the second loop as
// vector loop using std::vector
for( int i=0; i<ne; i+=Vc::float_v::Size )
{
    Vc::float_v vx( &x[i] );
    Vx.vector(i) -= one;
    vx.store( &x[i] );
}
things get even worse: the speedup is only about 4.2x.
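One thing that might be worth checking when comparing std::vector with a dedicated SIMD container is the alignment of the underlying buffer, since SIMD loads and stores can be sensitive to it. Nothing in this thread establishes that alignment explains the gap measured above, so the following is only a sketch of how a std::vector with a chosen alignment could be set up in C++17. The names AlignedAllocator and AlignedFloatVector and the 32-byte alignment are assumptions for illustration and are not part of Vc or of test.cpp.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical minimal allocator: returns storage aligned to `Alignment`
// bytes via the C++17 aligned forms of operator new and operator delete.
// `Alignment` must be a power of two.
template <typename T, std::size_t Alignment>
struct AlignedAllocator {
    using value_type = T;

    AlignedAllocator() = default;
    template <typename U>
    AlignedAllocator(const AlignedAllocator<U, Alignment>&) noexcept {}

    // An explicit rebind is required because of the non-type template parameter.
    template <typename U>
    struct rebind { using other = AlignedAllocator<U, Alignment>; };

    T* allocate(std::size_t n) {
        return static_cast<T*>(
            ::operator new(n * sizeof(T), std::align_val_t(Alignment)));
    }
    void deallocate(T* p, std::size_t) noexcept {
        ::operator delete(p, std::align_val_t(Alignment));
    }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U, std::size_t A>
bool operator==(const AlignedAllocator<T, A>&, const AlignedAllocator<U, A>&) {
    return true;
}
template <typename T, typename U, std::size_t A>
bool operator!=(const AlignedAllocator<T, A>&, const AlignedAllocator<U, A>&) {
    return false;
}

// 32 bytes is an assumed value that matches the width of an AVX register.
using AlignedFloatVector = std::vector<float, AlignedAllocator<float, 32>>;
```

An AlignedFloatVector can then be used like any other std::vector<float>, for example `AlignedFloatVector x(1024, 1.0f);`, with `x.data()` starting on a 32-byte boundary.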
_______________________________________________
Vc mailing list
https://compeng.uni-frankfurt.de/mailman/listinfo/vc
--
Dr. Sandro Wenzel
PH / SFT
CERN
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Timer.h
Type: application/octet-stream
Size: 9815 bytes
Desc: Timer.h
URL: <http://compeng.uni-frankfurt.de/pipermail/vc/attachments/20140306/dd890c68/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.cpp
Type: application/octet-stream
Size: 2852 bytes
Desc: test.cpp
URL: <http://compeng.uni-frankfurt.de/pipermail/vc/attachments/20140306/dd890c68/attachment-0003.obj>