
Benchmark - how to raise clock frequency?

Altera_Forum
Honored Contributor II

Hello, 

I'm currently running a benchmark evaluation (Dhrystone) on my Cyclone III evaluation board, the "BeMicro". 

 

I'm using the Nios II/e as the simplest option. The benchmark reports 0.124 MIPS/MHz, which is not far from the value given in the Altera document "Nios II Performance Benchmarks.pdf" (0.138). 
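For readers who want to reproduce a MIPS/MHz figure like this one, here is a minimal sketch of the usual Dhrystone arithmetic. It assumes the conventional normalization of 1757 Dhrystones per second per DMIPS (the VAX 11/780 reference); the raw Dhrystones-per-second value is hypothetical and was picked only to illustrate the calculation, it is not a measurement from this thread.

```python
# Conventional Dhrystone normalization: 1757 Dhrystones/s equals 1 DMIPS
# (VAX 11/780 reference machine).
VAX_DHRYSTONES_PER_SECOND = 1757


def dmips_per_mhz(dhrystones_per_second, clock_mhz):
    """Convert a raw Dhrystone score into DMIPS per MHz of CPU clock."""
    dmips = dhrystones_per_second / VAX_DHRYSTONES_PER_SECOND
    return dmips / clock_mhz


# Hypothetical raw score at a 130 MHz system clock (illustration only).
example = dmips_per_mhz(dhrystones_per_second=28322, clock_mhz=130.0)
print(round(example, 3))
```

Because the result is normalized by clock frequency, it should stay roughly constant when the clock is raised, as long as the memory keeps up without extra wait states.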

 

My question is this: in the documentation, they drive the Cyclone III at up to 215 MHz, but with my current configuration I only managed 130 MHz, which is very little by comparison. I cannot run the program at 140 MHz or faster ("Leaving target processor paused"). 

 

I'm quite new to FPGA design, so could someone please give me a hint on how the FPGA can be driven faster, and where the bottleneck in my current configuration might be? In SOPC Builder I'm using the Nios II, a JTAG UART, on-chip memory, a PLL and a timer for the benchmark. 

 

Thanks in advance, 

Sincerely, Harald. 

 

Edit: please find my SOPC design overview in the attachment.
5 Replies
Altera_Forum

This 215 MHz value looks quite optimistic. According to Altera's document it was measured on an EP3C40F324C6. I guess you have a C7 or C8 on your board, which will be a bit slower. 

Do you have timing violations when you recompile your design with a higher frequency? 

You can try adding some pipelining, which helps increase fmax. For example, an Avalon-MM pipeline bridge between the CPU instruction and data masters and the JTAG debug module usually helps.
Altera_Forum

Thank you for the answer. 

 

 

--- Quote Start ---  

This 215 MHz value looks quite optimistic. According to Altera's document it was measured on an EP3C40F324C6. I guess you have a C7 or C8 on your board, which will be a bit slower. 

--- Quote End ---  

 

 

You're right, I've got the C8 speed grade. The "Cyclone III Device Handbook, Vol2, page 25" shows a speed difference of only around 20% between C6 and C8. So the speed grade is one reason but, from my point of view, cannot be the only one. 
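The reasoning above can be checked with a quick back-of-the-envelope calculation. This is only a sketch: it assumes the "around 20%" figure means a C8 part reaches roughly 80% of the C6 clock rate, which is one possible reading of the handbook number, and it takes the 215 MHz and 130 MHz values straight from this thread.

```python
def scaled_fmax(reference_mhz, slowdown_fraction):
    """Scale a reference fmax down by a fractional slowdown (0.20 = 20% slower)."""
    return reference_mhz * (1.0 - slowdown_fraction)


documented_c6_mhz = 215.0  # figure quoted from the Altera benchmark document
achieved_c8_mhz = 130.0    # figure reported in this thread

# Assumption: C8 is about 20% slower than C6.
expected_c8_mhz = scaled_fmax(documented_c6_mhz, 0.20)
shortfall = 1.0 - achieved_c8_mhz / expected_c8_mhz

print(f"expected on C8: {expected_c8_mhz:.0f} MHz")
print(f"remaining shortfall: {shortfall:.0%}")
```

Under that assumption the speed grade alone would still leave the design well short of what it could reach, so something else (constraints, critical paths, memory) has to account for the rest.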

 

 

--- Quote Start ---  

Do you have timing violations when you recompile your design with a higher frequency? 

--- Quote End ---  

 

 

Sorry, I didn't get that. Which timing violations do you mean? I don't get errors when compiling, but when I try to run the benchmark with the Nios II IDE (Eclipse), either the download verification fails or the processor does not respond if the PLL frequency is configured too high. 

 

 

--- Quote Start ---  

You can try adding some pipelining, which helps increase fmax. For example, an Avalon-MM pipeline bridge between the CPU instruction and data masters and the JTAG debug module usually helps. 

--- Quote End ---  

 

 

Hm, I added this pipeline bridge, but I can't see much difference. The clock frequency is stuck at around 130 MHz for the economy, ~80 MHz for the standard and ~100 MHz for the fast Nios II.
Altera_Forum

 

--- Quote Start ---  

Sorry, I didn't get that. Which timing violations do you mean? I don't get errors when compiling, but when I try to run the benchmark with the Nios II IDE (Eclipse), either the download verification fails or the processor does not respond if the PLL frequency is configured too high. 

--- Quote End ---  

Towards the end of the compilation, Quartus runs a timing check using TimeQuest. It checks whether the design you made can run at the frequency you selected. A timing violation is not reported as an error, only as a critical warning, so you should check for those. 

Also check that the input clock is correctly defined, at the correct frequency (open TimeQuest and select "Report Clocks"). This is mandatory to ensure that Quartus optimises the design for the frequency you want to achieve.
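To make the idea of a timing violation concrete, here is a simplified sketch of what such a check computes. It lumps everything (logic delay, routing, setup time, clock skew) into a single worst-case path delay, which is a simplification of what TimeQuest actually does, and the 7.5 ns delay is a hypothetical value, not one taken from this design.

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def setup_slack_ns(target_mhz, critical_path_ns):
    """Positive slack: the path fits in one period. Negative slack: violation."""
    return period_ns(target_mhz) - critical_path_ns


def fmax_mhz(critical_path_ns):
    """Highest clock frequency at which the critical path still fits."""
    return 1000.0 / critical_path_ns


critical_path_ns = 7.5  # hypothetical worst-case register-to-register delay

for target in (130.0, 140.0):
    slack = setup_slack_ns(target, critical_path_ns)
    status = "met" if slack >= 0 else "VIOLATED"
    print(f"{target:.0f} MHz: slack {slack:+.3f} ns, timing {status}")

print(f"fmax for this path: {fmax_mhz(critical_path_ns):.1f} MHz")
```

A design with negative slack may still compile and even appear to work at room temperature, which is why the critical warning is easy to miss and why the failure tends to show up later as a download verify error or an unresponsive processor.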
Altera_Forum

Harry_DC, 

 

What do you really need: a higher FPGA frequency or more computation power? If it's the latter, I suggest that you don't use the /e version of the Nios II CPU. The "e" means economy in terms of resource usage. If you need CPU power, then you should use /f. Furthermore, it's necessary to check the external SRAM interface. If you use the external SRAM and increase the system clock, then it might be necessary to increase the number of wait states for the RAM. 
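As a rough illustration of the wait-state point, the sketch below estimates how many wait states an asynchronous SRAM read needs at a given system clock. The model is deliberately simple (a read takes enough whole clock cycles to cover the access time, and every cycle beyond the first counts as a wait state), it ignores setup, hold and board delays, and the 10 ns access time is a hypothetical value rather than the rating of the SRAM on any particular board.

```python
import math


def wait_states(access_time_ns, clock_mhz):
    """Estimate read wait states: extra cycles needed beyond the first one."""
    period_ns = 1000.0 / clock_mhz
    cycles_needed = math.ceil(access_time_ns / period_ns)
    return max(cycles_needed - 1, 0)


sram_access_ns = 10.0  # hypothetical SRAM access time

for clock_mhz in (80.0, 130.0, 215.0):
    print(f"{clock_mhz:.0f} MHz: {wait_states(sram_access_ns, clock_mhz)} wait state(s)")
```

The takeaway is that raising the system clock does not speed up the external memory itself, so part of the gain is consumed by additional wait states when code or data live in external SRAM.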

 

Hope that helps, 

Harald
Altera_Forum

 

--- Quote Start ---  

Towards the end of the compilation, Quartus runs a timing check using TimeQuest. It checks whether the design you made can run at the frequency you selected. A timing violation is not reported as an error, only as a critical warning, so you should check for those. 

Also check that the input clock is correctly defined, at the correct frequency (open TimeQuest and select "Report Clocks"). This is mandatory to ensure that Quartus optimises the design for the frequency you want to achieve. 

--- Quote End ---  

 

 

I checked those and you were right, there are timing violations: "Critical Warning: Timing requirements not met". Furthermore, there are critical warnings about clock uncertainty assignments for a clock called "altera_reserved_tck". I assume this clock belongs to the JTAG component. 

I also tried the Classic Timing Analyzer instead of TimeQuest and set the "Default required fmax" in the Settings to slightly above the desired clock frequency. That way I managed to raise the frequency to 150-180 MHz, which sounds fine now. 
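For anyone who stays with TimeQuest, the clock definition and the clock uncertainty warnings are normally handled in an SDC constraints file. The sketch below is a small Python helper that prints a minimal set of such constraints. The SDC commands `create_clock`, `derive_pll_clocks` and `derive_clock_uncertainty` are standard TimeQuest commands, but the port name `clk_in` and the 50 MHz input frequency are placeholders that must be replaced with the real values for the board.

```python
def sdc_lines(port_name, input_mhz):
    """Build minimal SDC constraints for one input clock feeding a PLL."""
    period_ns = 1000.0 / input_mhz
    return [
        # Define the board oscillator that arrives on the given top-level port.
        "create_clock -name {%s} -period %.3f [get_ports {%s}]"
        % (port_name, period_ns, port_name),
        # Let the tool derive the PLL output clocks from the PLL settings.
        "derive_pll_clocks",
        # Apply default clock uncertainty values.
        "derive_clock_uncertainty",
    ]


# Placeholder port name and frequency (assumptions, not taken from this design).
lines = sdc_lines("clk_in", 50.0)
print("\n".join(lines))
```

The printed lines would go into a `.sdc` file added to the Quartus project. With the input clock declared and the PLL clocks derived, the fitter optimises for the frequency actually configured in the PLL rather than for a default requirement.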

 

 

--- Quote Start ---  

Harry_DC, 

 

What do you really need: a higher FPGA frequency or more computation power? If it's the latter, I suggest that you don't use the /e version of the Nios II CPU. The "e" means economy in terms of resource usage. If you need CPU power, then you should use /f. Furthermore, it's necessary to check the external SRAM interface. If you use the external SRAM and increase the system clock, then it might be necessary to increase the number of wait states for the RAM. 

 

Hope that helps, 

Harald 

--- Quote End ---  

 

 

First of all, thank you for the hint. In the end, as you say, I'm interested in computation power. As a start with the Nios II, I just wanted to benchmark the three different versions to see how resource usage compares with computing power. 

Of course, the external SRAM is a bottleneck, which is why I arranged to run the benchmark from internal RAM (with the "small C library" and other switches to reduce the memory footprint of the benchmark). I will now run tests with the external RAM and try increasing the number of wait states. 

 

Thanks, Harry.