<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<div style="word-wrap:break-word">Dear Sandro,
<div>The attachments contain the main file test.cpp and the included header timer.h.</div>
<div>I included some unrolling tests for the scalar case, as Matthias suggested. That does indeed help. I haven’t checked the SIMD case so far.</div>
<div>Kindest regards,</div>
<div>Bert</div>
</div>
<div style="word-wrap:break-word">
<div><br>
<div>
<div>On 05 Mar 2014, at 09:23, Sandro Wenzel &lt;<a href="mailto:sandro.wenzel@cern.ch">sandro.wenzel@cern.ch</a>&gt; wrote:</div>
<br class="x_Apple-interchange-newline">
<blockquote type="cite">
<div dir="ltr">Dear Tijskens,
<div><br>
</div>
<div style="">I was intrigued by your observations and tried to reproduce them, but I failed. Like Matthias, I feel that measuring such a short, minimal code section is really tough.</div>
<div style=""><br>
</div>
<div style="">Would you be able to share your benchmark code and the way you compile it, so that I can take a more thorough look?</div>
<div style=""><br>
</div>
<div style="">Best</div>
<div style=""><br>
</div>
<div style="">Sandro</div>
<div style=""><br>
</div>
<div class="x_gmail_extra"><br>
<br>
<div class="x_gmail_quote">2014-03-04 18:57 GMT+01:00 Tijskens Engelbert <span dir="ltr">
&lt;<a href="mailto:Engelbert.Tijskens@uantwerpen.be" target="_blank">Engelbert.Tijskens@uantwerpen.be</a>&gt;</span>:<br>
<blockquote class="x_gmail_quote" style="margin:0 0 0 .8ex; border-left:1px #ccc solid; padding-left:1ex">
<div style="word-wrap:break-word">Dear all,
<div><br>
</div>
<div>I am trying to figure out how to use std::vector&lt;float&gt; efficiently in combination with Vc, so as to get both dynamic arrays and good performance.</div>
<div><br>
</div>
<div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> std::<span style="color:rgb(0,97,65)">vector</span>&lt;<span style="color:rgb(147,26,104)">float</span>&gt; x(1024);</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> <span style="color:#931a68">
for</span>( <span style="color:#931a68">int</span> i=0; i&lt;ne; ++i ) { // initialize</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> x[i]=1.0;</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> }</div>
<div style="margin:0px; font-size:11px; font-family:Monaco">// scalar loop using std::vector</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> <span style="color:rgb(147,26,104)">for</span>(
<span style="color:rgb(147,26,104)">int</span> i=0; i&lt;ne; ++i ) {</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> x[i] -= 1.0;</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> }</div>
<div style="margin:0px; font-size:11px; font-family:Monaco">// vector loop using std::vector</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> <span style="color:rgb(147,26,104)">
for</span>( <span style="color:rgb(147,26,104)">int</span> i=0; i&lt;ne; i+=Vc::<span style="color:rgb(0,97,65)">float_v</span>::<span style="color:rgb(3,38,204)">Size</span> )</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> {</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> Vc::<span style="color:#006141">float_v</span> vx( &amp;x[i] );</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> vx -= 1.0;</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> vx.store( &amp;x[i] );</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> }</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"><span style="color:rgb(78,144,114)">// vector loop using Vc::Memory instead of std::vector</span></div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> Vc::<span style="text-decoration:underline; color:#006141">Memory</span>&lt;Vc::<span style="color:#006141">float_v</span>,ne&gt; Vx;</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> <span style="color:#931a68">
for</span>( <span style="color:#931a68">int</span> i=0; i&lt;ne; ++i ) { // initialize</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> Vx[i] = 1.0;</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> }</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> Vc::<span style="color:#006141">float_v</span> one(1.);</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> ET_TIME_THIS</div>
<div style="margin:0px; font-size:11px; font-family:Monaco; color:rgb(57,51,255)">
<span style=""> ( </span>"<span style="text-decoration:underline">Vc</span>::Memory&lt;Vc::float_v,<span style="text-decoration:underline">ne</span>&gt; vector"<span style="">,</span></div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> <span style="color:#931a68">
for</span>( <span style="color:#931a68">int</span> i=0; i&lt;nv; ++i ) {</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> Vx.<span style="text-decoration:underline">vector</span>(i) -= one;</div>
<div style="margin:0px; font-size:11px; font-family:Monaco"> } )</div>
<div style="margin:0px; font-size:11px; font-family:Monaco">When I time these loops, I get the following results:</div>
</div>
<div style="margin:0px; font-size:11px; font-family:Monaco">
<div style="margin:0px; font-family:Menlo">scalar loop using std::vector : 2162 cycles/repetition, 9.4e-07 seconds/repetition, 1 x speedup, 2.3 GHz, 100 repetitions.</div>
<div style="margin:0px; font-family:Menlo">vector loop using std::vector : 357 cycles/repetition, 1.6e-07 seconds/repetition, 6.04x speedup, 2.24 GHz, 100 repetitions.</div>
<div style="margin:0px; font-family:Menlo">vector loop using Vc::Memory : 288 cycles/repetition, 1.2e-07 seconds/repetition, 7.49x speedup, 2.4 GHz, 100 repetitions.</div>
<div><br>
</div>
<div><br>
</div>
<div>Is there a way to improve the vector loop using std::vector? By the way, if I write the second loop as</div>
<div>
<div style="margin:0px">// vector loop using std::vector</div>
<div style="margin:0px"> <span style="color:rgb(147,26,104)">for</span>( <span style="color:rgb(147,26,104)">int</span> i=0; i&lt;ne; i+=Vc::<span style="color:rgb(0,97,65)">float_v</span>::<span style="color:rgb(3,38,204)">Size</span> )</div>
<div style="margin:0px"> {</div>
<div style="margin:0px"> Vc::<span style="color:rgb(0,97,65)">float_v</span> vx( &amp;x[i] );</div>
<div style="margin:0px"> vx -= one;</div>
<div style="margin:0px"> vx.store( &amp;x[i] );</div>
<div style="margin:0px"> }</div>
</div>
<div style="margin:0px">things get even worse: the speedup is only about 4.2x.</div>
<div><br>
</div>
</div>
</div>
<br>
_______________________________________________<br>
Vc mailing list<br>
<a href="mailto:Vc@compeng.uni-frankfurt.de">Vc@compeng.uni-frankfurt.de</a><br>
<a href="https://compeng.uni-frankfurt.de/mailman/listinfo/vc" target="_blank">https://compeng.uni-frankfurt.de/mailman/listinfo/vc</a><br>
<br>
</blockquote>
</div>
<br>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr">Dr. Sandro Wenzel<br>
<div>PH / SFT</div>
<div>CERN <br>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</body>
</html>