
Matrix multiplication example: block memory overhead

Altera_Forum
Honored Contributor II

Hello Everyone 

 

I am quite new to Altera OpenCL. Recently I tried compiling the Matrix Multiplication example from the OpenCL design examples page (https://www.altera.com/support/support-resources/design-examples/design-software/opencl.html).

 

To my surprise, the block memory bit usage is very high. When I explored the design in Quartus, most of the block RAM bits were used by FIFOs and LSUs (load-store units).

 

Could someone help me understand why the compilation generates these FIFOs and LSUs? I could not find any reference that explains the reasons behind FIFO and LSU generation.

 

Any guidance is really appreciated.
2 Replies
Altera_Forum
Honored Contributor II

The kernel reads data from global memory, stores it in local memory, and performs the matrix multiply on the blocks of data it has pulled into local memory. If you use the default matmult application as provided, the block size is 64, so the required work-group size is 64x64 = 4096 work items per work-group. My guess is that this is where the memory goes: the amount of data movement, and of on-chip storage for it, depends on how large you set the block size.
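To put rough numbers on that guess, here is a small back-of-the-envelope sketch. It assumes the kernel keeps two square tiles of 32-bit floats in local memory (one for each input matrix); the function names and the `num_tiles` and `bytes_per_element` defaults are my own assumptions, not something taken from the example's source. The result is only a lower bound for the tiles themselves, since the compiler may replicate or bank local memory and adds FIFOs and LSU buffers on top.

```python
def work_items_per_group(block_size):
    """Work-items in one block_size x block_size work-group."""
    return block_size * block_size


def local_tile_bits(block_size, num_tiles=2, bytes_per_element=4):
    """Bits needed to hold the local tiles of one work-group.

    num_tiles=2 assumes one tile per input matrix; bytes_per_element=4
    assumes 32-bit floats. Both are assumptions for illustration.
    """
    elements = block_size * block_size
    return num_tiles * elements * bytes_per_element * 8


if __name__ == "__main__":
    for bs in (16, 32, 64):
        print(bs, work_items_per_group(bs), local_tile_bits(bs))
```

Under these assumptions, halving the block size cuts the tile storage to a quarter, so trying a smaller block size would be one way to check whether the tiles are what dominates the block RAM usage.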

Altera_Forum
Honored Contributor II

I agree with what okebz said. The FIFOs are used to store variables, and the LSUs are used to read data from global memory or write data to global memory.
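To make the access pattern concrete, here is a plain-Python model of a blocked matrix multiply. It is only a sketch of the general tiling technique, not the kernel from the design example: `blocked_matmul`, `a_local`, and `b_local` are hypothetical names, and it assumes square matrices whose size `n` is a multiple of the block size `bs`. The comments mark the places where a compiled kernel would touch global memory, which is where one would expect LSUs to appear.

```python
def blocked_matmul(A, B, n, bs):
    """Multiply two n x n matrices (lists of lists) tile by tile.

    Assumes n is a multiple of bs.
    """
    C = [[0.0] * n for _ in range(n)]
    for bi in range(0, n, bs):
        for bj in range(0, n, bs):
            # Running sums for one output tile.
            acc = [[0.0] * bs for _ in range(bs)]
            for bk in range(0, n, bs):
                # Global -> local copies (global loads in a real kernel).
                a_local = [row[bk:bk + bs] for row in A[bi:bi + bs]]
                b_local = [row[bj:bj + bs] for row in B[bk:bk + bs]]
                # Compute using only the local tiles.
                for i in range(bs):
                    for j in range(bs):
                        s = 0.0
                        for k in range(bs):
                            s += a_local[i][k] * b_local[k][j]
                        acc[i][j] += s
            # Write the finished tile back (global stores in a real kernel).
            for i in range(bs):
                C[bi + i][bj:bj + bs] = acc[i]
    return C


if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0]]
    B = [[5.0, 6.0], [7.0, 8.0]]
    print(blocked_matmul(A, B, 2, 1))
```

In this model there are three kinds of global access (load a tile of A, load a tile of B, store a tile of C), and everything else runs out of the local tiles.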
