
Running NIOS code from SSRAM

Altera_Forum
Honored Contributor II
2,030 Views

Hi, 

I am trying to run NIOS code from a CY7C1354CV25 SSRAM (32-bit data bus), and it is very unstable. On my hardware the SSRAM shares its address and data bus with a PC48F4400P0TB0E flash device (16-bit data), and I am using the Altera tri-state bridge only to share the address and data bus. I am using the SRAM in synchronous mode and the flash in asynchronous mode. To save some FPGA pins, the byte enables going to the SRAM are hardwired to their enabled state outside the FPGA. The FPGA device is an Arria II GX and I am using Quartus 9.1. Here are the timing constraints that I have for the project: 

****************** 

create_clock -period 10.000 -name Clk_100 [get_ports Clk]

create_clock -period 20.00 -name virt_clk

# ######## SRAM and Flash constraints ########
# #############################################

# constrain the SRAM clock output pin
create_generated_clock -name SRAM_CLK -source [get_pins {NIOS_PLL_I|altpll_component|auto_generated|pll1|clk[0]}] -divide_by 1 [get_ports {SRAM_CLK}]

set_false_path -to [get_ports {SRAM_CLK}]

# Tco constraint for SRAM address, byte write, byte write enable, Chip enable, and output enable
set_output_delay -clock {virt_clk} -max 1.5 [get_ports {SRAM_BWEN SRAM_CEn SRAM_OEn FSM_A[*] FSM_D[*]}]
set_output_delay -clock {virt_clk} -min -0.5 [get_ports {SRAM_BWEN SRAM_CEn SRAM_OEn FSM_A[*] FSM_D[*]}]

# Tsu and Th constraint for SRAM and Flash data
set_input_delay -clock {virt_clk} -max 3.5 [get_ports {FSM_D[*]}]
set_input_delay -clock {virt_clk} -min 0.5 [get_ports {FSM_D[*]}]

# Constraint for unused flash and SRAM pins
set_false_path -from [get_ports {FLASH_ADVN FLASH_RSTn FLASH_CLK SRAM_ZZ SRAM_ADVn SRAM_CKEn SRAM_Mode}]

 

 

*** 

Hardwired and open signals inside FPGA: 

As the SRAM chip does not have an ADSC line, I have left that signal open where it comes out of the NIOS entity. The other signals hardwired inside the FPGA are FLASH_ADVN <= '0', FLASH_RSTn <= '1', FLASH_CLK <= '0', SRAM_ZZ <= '0', SRAM_CKEn <= '0', SRAM_ADVn <= '0', and SRAM_Mode <= '0'. 

 

Here are my questions: 

 

  • I have used the SRAM’s timing (Tco, Tsu, Th) to constrain the data and address lines. Is this correct? 

  • The flash device is asynchronous. How do I constrain it? 

  • I tried the memtest program from Altera to verify both the flash and SRAM devices. The SRAM data bus test and address bus test passed, and the flash device also passed the test. This test was done with the NIOS running from on-chip memory. 

  • Next, I changed the exception vector in the NIOS CPU to the external SSRAM and the reset vector to the external flash and tried a hello world program from Altera. The program downloaded and verified properly, but it did not print “Hello World” on the console. I have the JTAG UART set for STDOUT, STDIN, and STDERR in the hello_world project. What could be the issue?
 

Thanks in advance 

 

Kumaran
Altera_Forum

I apologize, I don't have time to thoroughly review all of your post. However, what you ought to consider is creating a separate clock to drive the SRAM, with a phase offset from the clock used to clock your logic. 

 

Below are my constraints from a system which is similar to (though not exactly the same as) yours: 

 

create_generated_clock -name {clk_sopc} -source }] }] -add
create_generated_clock -name {clk_reconfig} -source }] -divide_by 2 }] -add

# SRAM clock
create_generated_clock -name {sram_clk_int} -source }] -phase -172.8 }]
create_generated_clock -name {sram_clk} -source }]

# Input Delays
set_input_delay -clock {clk_sopc} -max 9.7
set_input_delay -clock {clk_sopc} -min 5.1

# Output Delays
set_output_delay -clock -reference_pin -max 1.670
set_output_delay -clock -reference_pin -max 1.670
set_output_delay -clock -reference_pin -max 1.670
set_output_delay -clock -reference_pin -max 1.670
set_output_delay -clock -reference_pin -max 1.670
set_output_delay -clock -reference_pin -max 1.670
set_output_delay -clock -reference_pin -max 1.670
set_output_delay -clock -reference_pin -min 0.670
set_output_delay -clock -reference_pin -min 0.670
set_output_delay -clock -reference_pin -min 0.670
set_output_delay -clock -reference_pin -min 0.670
set_output_delay -clock -reference_pin -min 0.670
set_output_delay -clock -reference_pin -min 0.670
set_output_delay -clock -reference_pin -min 0.670

 

Jake
Altera_Forum

Hi Jake, 

Thanks for the code and quick response.  

1. I have the SRAM clock connected to a separate output of the PLL as you mentioned, but I do not know how to determine the phase shift. How did you calculate the phase for the sram_clk_int clock? 

2. Also, for the set_output_delay calculation, I used the formula shown below (it ignores the board delay): 

set_output_delay -clock <clock source> -min -Th [get_ports *] 

The difference I see between your SDC and mine is that the minimum set_output_delay has a negative value in mine. Am I missing something? 

3. Do we need to use a virtual clock to constrain the SRAM pins, or can I constrain them the way you have done? 

4. Do you know the part numbers of the SRAM and flash in your design, so that I can try to work out exactly what you are doing? 

 

Thanks  

 

Kumaran
Altera_Forum

What PLL phase shift value are you using now? 

I usually use -1.66 ns for the 50 MHz SSRAM clock relative to the 50 MHz SOPC system clock. This setting has already worked on 4 or 5 different boards. 

Give it a try!
Altera_Forum

1 - Basically the idea is to get the clock to the SRAM center aligned with all of the data and control signals. So in my particular case, I was running the SRAM at 100MHz. After calculating delays and everything I decided that I wanted the clock going to the SRAM to be -4.8ns shifted from my regular clock. This translated to the -172.8 degree shift you see in the constraint. 

2 - Let me take some time to go back and see how I got to my setup and hold values and I'll get back to you. 

3 - You can most certainly constrain it as I have done. 

4 - cy7c1380D 
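
The conversion in point 1 between a time shift and a phase in degrees is just the shift as a fraction of the clock period, times 360. A minimal Tcl sketch of that arithmetic follows; the proc name `shift_ns_to_degrees` is hypothetical and is not part of Quartus or of the SDC file above.

```tcl
# Hypothetical helper (not a Quartus command): convert a clock shift in ns
# into a phase in degrees for a clock with the given period in ns.
proc shift_ns_to_degrees {shift_ns period_ns} {
    return [expr {360.0 * $shift_ns / $period_ns}]
}

# 100 MHz SRAM clock (10 ns period) shifted by -4.8 ns
puts [format "%.1f" [shift_ns_to_degrees -4.8 10.0]]   ;# prints -172.8
```

The same helper applies to the -1.66 ns shift at 50 MHz (20 ns period) mentioned earlier in the thread, which works out to roughly -29.9 degrees.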

 

Have you seen this free training on constraining source-synchronous interfaces? 

http://www.altera.com/education/training/courses/odsw1160 

 

Jake
Altera_Forum

Hi Jake and others, 

Thanks for the information.  

 

1. I calculated the data delay from the NIOS entity port to the SRAM data bus pins, and the delay from the PLL port to the SRAM clock pin in the FPGA, using the two commands below. 

# # Reports path delay on all the data lines.
report_path -from |q}] -to }] -npaths 40 -panel_name "SRAM_OUT_DATA_DELAY"

# ## report clock delay
report_path -from }] -to -npaths 40 -panel_name "Report Path"

The average data delay is 6.188188 ns and the clock delay is 4.24 ns. In this case the clock delay is less than the data delay. So, should the phase shift of the SRAM clock be 6.188188 - 4.24 = 1.948188, or should it be -1.948188? 

 

2. I calculated the delay of the data from the data pin to the first register in the NIOS using the command shown below: 

report_path -from }] -to |d}] -npaths 40 -panel_name "SRAM_IN_Data_Delay"

Using the result of the above report_path command, I got an average input data delay of 0.657063. Should I take this delay into account in the phase of the clock to the SRAM? 

 

 

Thanks, 

 

Kumaran
Altera_Forum

I wouldn't initially take the approach of trying to adapt your clock phase based on the timing results. Those timing results will likely change based on your timing constraints and design changes. 

 

Try this approach: 

Let's assume an ideal world with no delays in the FPGA and no delays on the board. 

The datasheet says that you need 1.5ns of setup time and 0.5ns of hold time. So let's take that away from our clock period and see what margin we have left: 

10ns - 1.5ns - 0.5ns = 8ns. 

Let's split the difference between the setup and hold time. 

8ns / 2 = 4ns;  

We want 4ns margin on both our setup and hold time so our sram clock phase would be -0.5ns - 4ns = -4.5ns. 

 

So if we were only taking into account the output to the SRAM, we might choose a phase shift of -4.5ns.  
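
The output-only arithmetic above can be written out as a small Tcl sketch, under the same ideal-world assumption of no FPGA or board delays. The proc name `output_centered_shift` is hypothetical; the 1.5 ns setup and 0.5 ns hold figures are the datasheet values quoted above.

```tcl
# Hypothetical helper: SRAM clock shift (ns) that splits the leftover margin
# equally between the SRAM's setup and hold requirements.
# Ideal-world assumption: no delays inside the FPGA or on the board.
proc output_centered_shift {period_ns tsu_ns th_ns} {
    set margin [expr {$period_ns - $tsu_ns - $th_ns}]   ;# 10 - 1.5 - 0.5 = 8
    set half   [expr {$margin / 2.0}]                   ;# 8 / 2 = 4
    return [expr {-$th_ns - $half}]                     ;# -0.5 - 4 = -4.5
}

puts [output_centered_shift 10.0 1.5 0.5]   ;# prints -4.5
```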

 

However, what about the input? The datasheet for the SRAM says there is a worst-case tCO of 3.5ns. 

With our -4.5ns clock phase, that data will show up 4.5ns - 3.5ns = 1ns before the rising edge of our regular non-shifted system clock. Hmmm, we are giving our SRAM a setup time of 5.5ns but we're only giving the FPGA a setup time of 1ns. That's not very nice. Maybe we should give the FPGA some more setup time. 

 

So we have to make a decision here: 

1 - We can push our SRAM clock further out (which eats at our output hold margin) and thereby push the delayed SRAM input data into the following clock cycle. You would have to compensate for this by adding a clock cycle of latency in your design. This is a perfectly valid choice in which case, I would use a phase shift similar to what nekojiru suggested (like -1.66ns) 

-or- 

2 - We can pull our SRAM clock back in a bit (which eats at our output setup margin) and thereby increase our setup margin on the input data coming back from the SRAM.  

Well let's redo our calculation and this time we split the margin between the output setup and input setup times. 

10ns - 3.5ns - 1.5ns = 5ns; <- This is our margin. 

5ns / 2 = 2.5ns; <- This is how much margin we will give to the setup time in each direction. 

-3.5ns -2.5ns = -6ns. 

Now in reality, I think, as you've calculated, the data delay inside the FPGA is going to be longer than the clock delay. So I would rather see that -6ns be more like -5ns. 
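
As a rough check of option 2, the split between output setup and input setup can be sketched in Tcl, again assuming no FPGA or board delays. The proc name `io_balanced_shift` is hypothetical; the 3.5 ns tCO and 1.5 ns setup figures are the datasheet values used above.

```tcl
# Hypothetical helper: SRAM clock shift (ns) that splits the margin equally
# between the SRAM's setup time (FPGA writing) and the FPGA's setup time
# (SRAM read data returning after tCO).
# Ideal-world assumption: no delays inside the FPGA or on the board.
proc io_balanced_shift {period_ns tco_ns tsu_ns} {
    set margin [expr {$period_ns - $tco_ns - $tsu_ns}]  ;# 10 - 3.5 - 1.5 = 5
    set half   [expr {$margin / 2.0}]                   ;# 5 / 2 = 2.5
    return [expr {-$tco_ns - $half}]                    ;# -3.5 - 2.5 = -6
}

set shift [io_balanced_shift 10.0 3.5 1.5]
puts $shift                            ;# prints -6.0
# Setup time the FPGA sees on read data: shift magnitude minus tCO
puts [expr {-$shift - 3.5}]            ;# prints 2.5
```

With this shift the FPGA sees 2.5 ns of setup on read data, and the SRAM is left with the same 2.5 ns of margin beyond its 1.5 ns setup requirement.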

 

Does this help at all?
Altera_Forum

Hi Jake, 

Thanks for the information. I am working on the issue that the SRAM is a No Bus Latency (NoBL) device while the Altera IP core is for a pipelined SSRAM. I will look at the calculation that you provided and let you know the results soon. Thanks a lot for the help. 

 

Kumaran
Altera_Forum

Hi Jake and Nekojiru, 

I was able to fix the issue with the SRAM. It was not a timing issue: the Altera SSRAM IP core targets a pipelined SRAM, and the one that I have used is NoBL. Thanks for your help. 

 

Kumaran
Altera_Forum

Can you post the fix? 

What changes have you made to the Altera IP SSRAM core to make it support NoBL?
Altera_Forum

I made the following changes inside tri_state_bridge_avalon_slave_arbitrator in the NIOS main module. All I did was pipeline the data and write control twice. I had to do this inside the NIOS main module because I was sharing the Avalon tri-state bridge with a flash device. 

 

If your SSRAM is not sharing address, data and control pins with any other device (like Flash) you can do the pipeline outside the NIOS main module. I have not tried pipelining the data outside NIOS module, but I guess it should work. 

 

 

Hope this helps! 

d1_outgoing_tri_state_bridge_data_r <= d1_outgoing_tri_state_bridge_data_r2 when (sram_s1_in_a_write_cycle_r2 = '1') else outgoing_tri_state_bridge_data;

process (clk, reset_n)
begin
  if reset_n = '0' then
    d1_outgoing_tri_state_bridge_data <= std_logic_vector'("00000000000000000000000000000000");
  elsif clk'event and clk = '1' then
    d1_outgoing_tri_state_bridge_data_r1 <= outgoing_tri_state_bridge_data;
    d1_outgoing_tri_state_bridge_data_r2 <= d1_outgoing_tri_state_bridge_data_r1;
    d1_outgoing_tri_state_bridge_data <= d1_outgoing_tri_state_bridge_data_r;
  end if;
end process;

--write cycle delayed by 1, which is an e_register
d1_in_a_write_cycle_r <= d1_in_a_write_cycle_r2 when (sram_s1_in_a_write_cycle_r2 = '1') else time_to_write;

process (clk, reset_n)
begin
  if reset_n = '0' then
    d1_in_a_write_cycle <= std_logic'('0');
  elsif clk'event and clk = '1' then
    sram_s1_in_a_write_cycle_r1 <= sram_s1_in_a_write_cycle;
    sram_s1_in_a_write_cycle_r2 <= sram_s1_in_a_write_cycle_r1;
    d1_in_a_write_cycle_r1 <= time_to_write;
    d1_in_a_write_cycle_r2 <= d1_in_a_write_cycle_r1;
    d1_in_a_write_cycle <= d1_in_a_write_cycle_r;
  end if;
end process;

 

 

Kumaran