
Performance Issues

Altera_Forum
Honored Contributor II
1,014 Views

I am debugging a custom Nios II system. I have implemented a Nios II/f CPU with an EPCS controller. I have created a custom component to access an SRAM from which my firmware runs. I have implemented on-chip memory for read-only data. I have created a custom component to access an external dual-port RAM that stores user data. I have also created a custom component to access my logic block. I am using a CPU clock of 85 MHz.

 

After some timing analysis, I noticed that there is approximately 150 ns between reads coming from my custom logic component. Also, each read appears as a "double" access for some reason!
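For scale, the 150 ns gap can be converted into CPU clock cycles. A minimal sketch of the arithmetic in C, assuming the 85 MHz CPU clock mentioned in the post; the helper name `ns_to_cycles` is made up for illustration:

```c
/* Convert a time interval in nanoseconds into clock cycles.
 * cycles = t_ns * f_MHz / 1000, because 1 MHz is 1 cycle per 1000 ns. */
static double ns_to_cycles(double t_ns, double f_mhz)
{
    return t_ns * f_mhz / 1000.0;
}
```

`ns_to_cycles(150.0, 85.0)` works out to 12.75, so each read costs roughly 12 to 13 CPU clocks.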

 

Is there any way to increase this read performance? We tried using different instructions. We tried accesses with and without cache. I tried to play around with the configuration of the system a bit. Nothing seemed to help.

 

I found the following thread on the forum:

http://forum.niosforum.com/forum/index.php?showtopic=629&hl=read+access

 

They wrote a lot about 12-cycle read accesses from an SDRAM controller. Also, that was over 1.5 years ago!

Does anyone have any suggestions?
7 Replies
Altera_Forum

The 150ns of time between reads to your peripheral could be for a lot of reasons depending on the implementation of your custom user logic and the amount of processing you are doing between reads. Maybe you could provide a little more info on what you are trying to do. 

If you really need to do high speed transfers, you should consider using the DMA controller.
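Before reaching for DMA, it can help to measure how much of the gap is software work between reads. A minimal sketch in C of a back-to-back read loop, assuming the peripheral exposes a window of consecutive 32-bit registers; `read_burst` and its parameters are hypothetical names. On a real Nios II/f the accesses would also have to bypass the data cache (for example through the HAL's IORD macros), which is left out here so the sketch stays host-runnable:

```c
#include <stddef.h>
#include <stdint.h>

/* Copy `count` 32-bit words from a register window into RAM.
 * `volatile` stops the compiler from merging or dropping the reads;
 * it does NOT bypass a hardware data cache on its own. */
static void read_burst(const volatile uint32_t *src, uint32_t *dst, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        dst[i] = src[i];   /* one bus read per iteration, nothing else in between */
    }
}
```

If the spacing seen on the bus does not shrink with a loop like this, the delay is more likely in the bus or slave configuration than in the software.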
Altera_Forum

Two things come to mind that might be the reason for the double reads. The first is whether or not you are running the debugger. The debugger will access the memory. Second, if you are not running the debugger are you looking at the address appropriately? Does the processor have access to 16 bits of data on your SRAM? It may be doing byte reads, first byte in first read and second in second read. 

Verify your SRAM setup in SOPC Builder and be sure it matches your part in terms of size and latency. 

 

Good luck.
Altera_Forum

Note that a read is always 32 bits wide, regardless of your target.

If, for example, you have an 8-bit slave, you will get 4 cycles (4 x 8 = 32).

If the slave is 16 bits wide, you will get 2 cycles (2 x 16 = 32).

In fact, if your software does a char (8-bit) read, this leads to a 32-bit read, and the 24 bits that are not wanted are ignored.

The only difference is a write: an 8-bit slave will see 1 cycle if the write is 8 bits wide.
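The cycle counts above can be written down as a small model. This is a sketch in C of the rule as stated in this post, not something taken from the Avalon specification; `read_cycles` and `write_cycles` are illustrative names, and slave widths are assumed to be 8, 16 or 32 bits:

```c
/* Width of a Nios II data-master access, in bits. */
enum { MASTER_BITS = 32 };

/* Reads: the full 32-bit word is always fetched, whatever the software asked for. */
static unsigned read_cycles(unsigned slave_bits)
{
    return MASTER_BITS / slave_bits;
}

/* Writes: only the bytes actually written are transferred. */
static unsigned write_cycles(unsigned slave_bits, unsigned access_bits)
{
    return (access_bits + slave_bits - 1) / slave_bits;   /* round up */
}
```

So a char read from an 8-bit slave costs `read_cycles(8)` = 4 slave accesses, while a char write costs `write_cycles(8, 8)` = 1.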

 

One of the Avalon documents says that a master must set all byteenables for a read cycle!

This is the reason why it is currently nearly impossible to connect an existing Profibus chip (8-bit) to the Avalon switch fabric. Some registers inside the chip react to a read even if the access was not intended by the software but was generated by the hardware.

 

Have you monitored your custom component with SignalTap?

In the beginning I had readdata as an n-bit-wide register. This led to the problem that the first cycle with chipselect loads the readdata register and a second cycle is needed before the output is available, so I had to insert 1 wait state. Also check the SOPC Builder settings of your custom component.

Nowadays readdata is a wire driven by combinational logic, and each access is only 1 clock cycle long, regardless of read or write.

 

Michael Schmitt
Altera_Forum

Thank you everyone for all your information. 

 

MSchmitt, when you stated :- 

 

"nowadays the readdata is a wire with a set of combinatorical logic and now each access is only 1 clock cycle long, regardless of read or write." 

 

what did you mean ?
Altera_Forum

 

--- Quote Start ---  

originally posted by shmueld@Jul 11 2006, 10:53 AM 

thank you everyone for all your information. 

 

mschmitt, when you stated :- 

 

"nowadays the readdata is a wire with a set of combinatorical logic and now each access is only 1 clock cycle long, regardless of read or write." 

 

what did you mean ? 

 

--- Quote End ---  

 

 

With this statement I mean that the old way I built an Avalon slave was:

internal_signal -> DFF -> ReadData bit

where internal_signal was the combinational result of selecting an internal signal by the address.

Now I have removed the DFF. ReadData is now a wire instead of a reg (in Verilog terms):

 

// readdata is driven combinationally; no output register delays the read
wire [31:0] avs_MyModul_readdata;

assign avs_MyModul_readdata =
    ( avs_MyModul_address === 6'h00 ) ? 32'd12345678 :
    ( avs_MyModul_address === 6'h01 ) ? 32'd87654321 :
                                        32'd0;   // all other addresses read as 0

 

Reading from slave address 0 returns 12345678, address 1 returns 87654321, and all other addresses return 0.

This read is only 1 clock cycle long (or short :-) ), so there is no need to assert avs_MyModul_waitrequest.
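The difference between the two variants can be modelled cycle by cycle. The following is a rough C sketch under simplifying assumptions (address held stable, one clock edge per loop iteration, register values as stated in the post); `decode` and `cycles_until_valid` are made-up names and this is not the actual HDL:

```c
#include <stdint.h>

/* Address decode shared by both variants. */
static uint32_t decode(uint8_t address)
{
    switch (address) {
    case 0x00: return 12345678u;
    case 0x01: return 87654321u;
    default:   return 0u;
    }
}

/* Count clock cycles until readdata shows the requested value.
 * registered != 0: readdata comes from a flip-flop (old design).
 * registered == 0: readdata is plain combinational logic (new design). */
static int cycles_until_valid(int registered, uint8_t address)
{
    uint32_t q = 0xFFFFFFFFu;   /* flip-flop contents before the access */
    int cycles = 0;

    for (;;) {
        uint32_t next = decode(address);
        uint32_t readdata = registered ? q : next;  /* what the master sees this cycle */
        q = next;                                   /* clock edge captures the decode */
        cycles++;
        if (readdata == decode(address)) {
            return cycles;
        }
    }
}
```

In this model the registered variant needs 2 cycles (one wait state plus the data cycle) and the combinational variant needs 1.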

 

Is it clear to you now?

 

Michael Schmitt
Altera_Forum

 

--- Quote Start ---  

originally posted by mschmitt@Jul 11 2006, 04:34 PM 

this is the reason why it ist currently nearly impossible to connect an existing profibus chip (8bit) to the avalon switch fabric. some registers inside the chip interprete a read even if this access is not intended by the software but done in hardware 

--- Quote End ---  

 

 

Just for my understanding: 

 

How about setting the address alignment to native mode, implementing a 32-bit slave, and using only bits 7-0 (bits 31-8 going nowhere)? This should give you the behaviour you need.

 

Or am I wrong?
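To make the native-alignment suggestion concrete: each register of the narrow device gets its own 32-bit word in the master's address map. A minimal sketch in C, with `native_byte_offset` as a hypothetical helper name:

```c
#include <stdint.h>

/* Native alignment: register n of a narrow slave sits at word n,
 * i.e. at byte offset 4 * n as seen by the 32-bit master. */
static uint32_t native_byte_offset(uint32_t reg_index)
{
    return reg_index * 4u;
}
```

Registers 0, 1 and 2 then land at byte offsets 0, 4 and 8, so three out of every four byte addresses are unused.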
Altera_Forum

Unfortunately you are.

All of these Profibus chips have both memory and registers behind a single chipselect. The data to and from the Profibus is stored inside the chip, so this chip-memory area must appear as memory to the Nios data master: the software has to be able to use 8/16/32-bit accesses and even memcopy (move) operations without caring that the device is 8 bits wide. (All available software stacks need this, and we do not want to rewrite such a stack and certify it.) On the other hand, the same chipselect gives access to the registers, and some of them monitor read accesses. So if you want to read one 8-bit register but 4 read cycles are executed, then in real life you also read 3 registers that the software never accessed.

The crazy thing is that the byteenables indicate which byte is requested. So the only workaround is to insert a state machine between Avalon and the Profibus chip that fetches only the bytes requested by the software. This works fine here.

So for the software it is necessary to have a memory module instead of a register module, to get rid of the gaps in the memory, but without additional (wasted) read cycles.
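The byteenable-driven workaround can be illustrated with a software model. This is only a sketch in C of the idea, not the actual state machine; `chip8`, `chip_read` and `bridge_read` are hypothetical names, the chip is reduced to a 16-byte array with a read counter, and byte lane 0 is assumed to map to data bits 7-0:

```c
#include <stdint.h>

/* Stand-in for the 8-bit chip: a small memory plus a count of read strobes. */
typedef struct {
    uint8_t  mem[16];
    unsigned read_count;
} chip8;

static uint8_t chip_read(chip8 *c, uint32_t byte_addr)
{
    c->read_count++;                 /* every access is visible to the chip */
    return c->mem[byte_addr % 16u];
}

/* Serve one 32-bit Avalon read, touching only the byte lanes that are enabled. */
static uint32_t bridge_read(chip8 *c, uint32_t word_addr, uint8_t byteenable)
{
    uint32_t data = 0;

    for (unsigned lane = 0; lane < 4; ++lane) {
        if (byteenable & (1u << lane)) {
            uint32_t b = chip_read(c, word_addr * 4u + lane);
            data |= b << (8u * lane);
        }
    }
    return data;
}
```

A byte read with byteenable 0x1 then causes exactly one chip access, while a full word read with byteenable 0xF still causes four.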