I am attempting to compile the matrix multiply example located here: https://www.altera.com/support/support-resources/design-examples/design-software/opencl/matrix-multiplication.html
However, when compiling I see the following:
aoc: Running OpenCL parser....
/home/mike/ont_core_cpp/ont_core/basecall_nn/ocl/altera_experiments/matrix_mult/device/matrix_mult.cl:105:34: warning: declaring kernel argument with no 'restrict' may lead to low kernel performance
__global float *A,
^
/home/mike/ont_core_cpp/ont_core/basecall_nn/ocl/altera_experiments/matrix_mult/device/matrix_mult.cl:106:34: warning: declaring kernel argument with no 'restrict' may lead to low kernel performance
__global float *B,
^
2 warnings generated.
aoc: OpenCL parser completed successfully.
aoc: Compiling....
aoc: Linking with IP library ...
Checking if memory usage is larger than 100%
Compiler Warning: Vectorized kernel contains loads/stores that cannot be vectorized. This might reduce performance.
+--------------------------------------------------------------------+
; Estimated Resource Usage Summary ;
+----------------------------------------+---------------------------+
; Resource + Usage ;
+----------------------------------------+---------------------------+
; Logic utilization ; 33% ;
; ALUTs ; 18% ;
; Dedicated logic registers ; 16% ;
; Memory blocks ; 32% ;
; DSP blocks ; 23% ;
+----------------------------------------+---------------------------;
I am compiling with the following options: -v --report --fpc --fp-relaxed -cl-fast-relaxed-math -cl-finite-math-only
This happens with aoc Version 17.0.0 Build 290 and also with aoc Version 16.1.2 Build 203.
The part of the kernel which causes the warning appears to be:
#pragma unroll
for (int k = 0; k < BLOCK_SIZE; ++k)
{
running_sum += A_local * B_local;
}
But I cannot understand why this would be a problem.
1 Reply
That warning is related to global memory accesses and is normal when SIMD is used. The compiler generates it whenever it fails to fully coalesce such accesses in a vectorized kernel. What it is telling you is not to expect a linear performance improvement from SIMD if your global memory accesses are not contiguous. In fact, if your kernel is memory-bound and you use SIMD even though the accesses are not contiguous, performance will actually go down.
Everyone who compiles this example will get the same message, and it is completely safe to ignore in this case.
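To make "contiguous" a bit more concrete, here is a minimal host-side sketch in plain C. It is not taken from the design example and not how the compiler actually decides. It only models the idea that adjacent SIMD lanes can share one wide load when lane i+1 reads the element right after lane i. The function names, the row-major layout, and the lane-to-index mapping are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (illustration only): returns true when `simd`
 * adjacent lanes read consecutive elements, so their accesses could in
 * principle be merged into one wide load. `lane_stride` is the distance,
 * in elements, between the addresses read by lane i and lane i+1. */
bool lanes_are_contiguous(size_t base, size_t lane_stride, size_t simd)
{
    for (size_t lane = 1; lane < simd; ++lane) {
        size_t prev = base + (lane - 1) * lane_stride;
        size_t curr = base + lane * lane_stride;
        if (curr != prev + 1)
            return false;   /* gap between neighbouring lanes */
    }
    return true;
}

/* Assumed row-major matrix: element (row, col) is at row * width + col. */

/* Lanes differ in the column index: neighbours are 1 element apart. */
size_t lane_stride_when_lanes_vary_col(size_t width)
{
    (void)width;
    return 1;
}

/* Lanes differ in the row index: neighbours are a full row apart. */
size_t lane_stride_when_lanes_vary_row(size_t width)
{
    return width;
}
```

With a 64-wide matrix and 4 lanes, `lanes_are_contiguous(0, lane_stride_when_lanes_vary_col(64), 4)` is true, while the same call with `lane_stride_when_lanes_vary_row(64)` is false. The second pattern is the kind of access the warning is about.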