Altera FPGA ring oscillator low level implementation

Altera_Forum

Hi all, 

 

I'm a student working on a summer research project. I've been stuck on the same problem for some time now and would greatly appreciate any help. 

 

Software/Hardware: 

Quartus II software to connect to an Altera DE2 board. 

 

High level idea of project: 

Test for process variation of different chips. 

 

Chip Overview: 

The chip consists of many LABs. Each LAB has 32 LEs. LEs are either LUTs or Flip Flops. 

 

Implementation: 

Using VHDL, program ring oscillators. A ring oscillator consists of 4 buffers followed by an inverter. The low level of the chip consists of various LUTs. LUTs are capable of implementing any logical function. Thus program strings of ring oscillators by instantiating the LUTs as buffers or inverters as appropriate to create multiple ring oscillators on the chip. 

 

After the program runs various time delays will be obtained which can be used to draw conclusions regarding process variation of the chip. 

 

Problem: 

How to actually instantiate these elements (buffers and inverters) on the LUTs of the chip. I have tried various methods via the chip planner and assignment editor but I continually get the following warning: 

 

Warning: Ignored locations or region assignments to the following nodes 

Warning: Node "ring_osc:\GEN_RO:0:XILINX_RO|inv_lut_out" is assigned to location or region, but does not exist in design 

 

I need to know how to actually take the code and make instances of it on the LUTs of the chip. 

 

Any help is greatly appreciated. Thank you. 

 

pbryzek@ucla.edu 

 

Paul
21 Replies
Altera_Forum

The synthesis tool is optimizing the redundant buffers away. 

 

Try either a keep attribute, or try also feeding each buffer output off the chip so that the tool is forced to keep each node in the design. 

 

Do not let the tool 'help' you so much. 

 

Avatar 

ps. Forum rules ask that you not multiply post the same question to different sub forums. 

 

Thanks.
Altera_Forum

Instantiate multiple LCELLs together. I believe in VHDL the ports are a_in and a_out (both std_logic). Note that an LCELL is not an empty logic cell. Instead, it designates that that particular point will be the output of a logic cell. So, for example, if you have an AND gate that feeds an output pin and insert an LCELL, the LCELL won't do anything, since the AND gate would already be the output of a logic cell. But if you chain two LCELLs, then you'll have an AND gate followed by an empty cell. (I've seen confusion over this, but the way LCELL works is nice for control.) Anyway, just chain a bunch of them together.

Altera_Forum

 

--- Quote Start ---  

Each LAB has 32 LEs. LEs are either LUTs or Flip Flops. 

--- Quote End ---  

 

 

 

In Cyclone II, a LAB has 16 LEs. An LE is a LUT and register pair. The LUT and register can be used together or independently. Even if used separately, they are not counted separately for the LE quantity. 

 

 

 

 

--- Quote Start ---  

High level idea of project: 

Test for process variation of different chips. 

 

... 

 

Implementation: 

... create multiple ring oscillators on the chip. 

 

After the program runs various time delays will be obtained which can be used to draw conclusions regarding process variation of the chip. 

--- Quote End ---  

 

 

 

Your method should work well to compare delay of a ring at the same location on different devices for chip-to-chip process variation. If you are trying to measure process variation across the same device by comparing the ring delay in different locations, however, then you must avoid contributors to differences in the ring delay that will exceed the process difference. 

 

Even if you have exactly identical placement of the LUTs with respect to each other in a given ring, you might not be able to have exactly the same routing resource usage in the connections. A small difference in routing resource usage could affect the ring delay by more than the process variation does. 

 

Your rings could also have different delays because of different LUT inputs being used. The delay through the LUT depends on which input is used. When comparing more than one instance of a simple ring without other logic placed nearby, I would expect the same LUT position in the ring would use the same input for all ring instances. For example, buffer LUT# 1 might use input A for all rings, and the inverter LUT might use input C for all rings. However, you can't control that manually to guarantee it (unless you create a routing constraints file, which is probably more trouble than you would want). The Fitter will choose which LUT input is used during the routing stage. 

 

Because you will have only 5 LUTs (4 buffers and an inverter) in each loop, you can keep each ring in a single LAB. I think keeping each ring within a single LAB will help make the routing the same if you can get the same exact relative placement within the LABs (the Assignment Editor has a "Comb. cell" location assignment type for Cyclone II). If the routing is the same from one ring to the next, then the same LUT input will be used for the same LUT position in the ring from one ring to the next. Look at the reported timing to see whether there are any ring-to-ring differences in the interconnect delay (to check for the same routing) or cell delay (to check for delay differences from which LUT input is used). You can also use the Chip Planner to view the routing that is used and the Resource Property Editor to see the exact LUT input that is used. 

 

Be careful not to introduce measurement uncertainty by the method of measuring the delay through the ring. If, for example, you are going to use the ring signal to clock a counter, be careful that this additional logic does not cause the Fitter to make the ring instances different. I'd have the ring clock a single register in the same LAB and drive that register output to a counter that is far enough away it does not compete with the ring for routing resources. The less simple each instance is or the more nearby resources are used by other things, the more likely you'll run into differences in the specific LUT inputs and routing channels the Fitter chooses.
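The measurement arrangement described above could be sketched roughly as follows. This is only an illustration, not code from the thread: the entity name ring_counter_sketch, the ports ring_in, clear and count, and the WIDTH generic are all made-up names. The ring clocks one register (meant to sit in the ring's LAB), and only that register's output travels to the counter, which can be placed elsewhere.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
USE IEEE.NUMERIC_STD.ALL;

-- Hypothetical sketch: all names here are assumptions for illustration.
ENTITY ring_counter_sketch IS
  GENERIC (WIDTH : POSITIVE := 16);  -- counter width, chosen arbitrarily
  PORT (
    ring_in : IN  STD_LOGIC;  -- output of one ring oscillator
    clear   : IN  STD_LOGIC;  -- asynchronous clear for the counter
    count   : OUT STD_LOGIC_VECTOR(WIDTH-1 DOWNTO 0));
END ring_counter_sketch;

ARCHITECTURE arch OF ring_counter_sketch IS
  -- Single register clocked by the ring; it toggles, so it runs at half
  -- the ring frequency.
  SIGNAL div2 : STD_LOGIC := '0';
  SIGNAL cnt  : UNSIGNED(WIDTH-1 DOWNTO 0) := (OTHERS => '0');
BEGIN

  -- The only load on the ring: one register, intended for the ring's LAB.
  PROCESS (ring_in)
  BEGIN
    IF rising_edge(ring_in) THEN
      div2 <= NOT div2;
    END IF;
  END PROCESS;

  -- Counter clocked by the register output; it can be placed far away.
  PROCESS (div2, clear)
  BEGIN
    IF clear = '1' THEN
      cnt <= (OTHERS => '0');
    ELSIF rising_edge(div2) THEN
      cnt <= cnt + 1;
    END IF;
  END PROCESS;

  count <= STD_LOGIC_VECTOR(cnt);

END arch;
```

The count changes in the ring's own clock domain, so whatever reads it (for example after a fixed window of a reference clock) would need to stop the ring or synchronize the value first. Location assignments for the register and counter are still needed to get the placement described above.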
Altera_Forum

Thanks for the responses, 

 

The way that I have been trying to implement up to this point is the following. 

 

For each component of the ring oscillator (i.e. 4 buffers 1 inverter) I implement by making use of the LUT_input/LUT_output primitives. This is an example of a buffer, note pre_toggle_1 is simply the reset passed and pre_toggle is the ring_invert signal, the signal that wraps around from the end of the ring oscillator. 

 

-------------------------------------------------------- 

delay1_lut_in_1: LUT_INPUT
  PORT MAP (
    a_in  => reset,
    a_out => pre_toggle_1
  );

delay1_lut_in_2: LUT_INPUT
  PORT MAP (
    a_in  => ring_invert,
    a_out => pre_toggle
  );

pre_toggle_2 <= (pre_toggle_1) AND (pre_toggle);

delay1_lut_out: LUT_OUTPUT
  PORT MAP (
    a_in  => pre_toggle_2,
    a_out => ring_delay1
  );

 

----------------------------------------------------------------------- 

 

I then tried mapping all 3 elements to a particular location on the chip by use of the assignment editor. 

 

-to (these names were found by using the node finder) 

|lcd_test|ring_osc:\GEN_RO:0:XILINX_RO|delay1_lut_in_1 

|lcd_test|ring_osc:\GEN_RO:0:XILINX_RO|delay1_lut_in_2 

|lcd_test|ring_osc:\GEN_RO:0:XILINX_RO|delay1_lut_out 

 

-assignment name 

LOCATION  

as well as 

Implement as output of logic cell 

 

-value 

LCCOMB_X14_Y35_N0 

 

-------------------------------------------------------------------------------------- 

 

The following components of the ring oscillator are then mapped to sequential cells, so 1 particular ring oscillator could have the locations all within the same LAB: 

 

LCCOMB_X14_Y35_N0 - buffer 

LCCOMB_X14_Y35_N2 - buffer 

LCCOMB_X14_Y35_N4 - buffer 

LCCOMB_X14_Y35_N6 - buffer 

LCCOMB_X14_Y35_N8 - inverter 

 

----------------------------------------------------------------------- 

 

Additionally I tried using the attribute keep statements 

 

attribute KEEP : string;  

attribute KEEP of pre_toggle : signal is "true";  

attribute KEEP of pre_toggle_1 : signal is "true";  

attribute KEEP of ring_delay1 : signal is "true";  

 

--------------------------------------------------------------------------- 

 

Do I need to use an LCELL to implement the functions and not LUT_input/LUT_output? 

 

Also do you guys know of a way that I can verify that the placements are actually still there after synthesis. At this point it compiles without error, but with that warning. Is there a way that I can see that the placements have been respected? 

 

Thank you for your help. 

 

Paul
Altera_Forum

Also, 

 

I just tried to turn off the optimizations: 

 

assignments -> settings -> Analysis and Synthesis -> 

Restructure MUXes -> off 

Powerplay power optimization -> off 

 

Same error, are there other options to turn off? 

 

Paul
Altera_Forum

 

--- Quote Start ---  

Do I need to use an LCELL to implement the functions and not LUT_input/LUT_output? 

 

Also do you guys know of a way that I can verify that the placements are actually still there after synthesis. At this point it compiles without error, but with that warning. Is there a way that I can see that the placements have been respected? 

--- Quote End ---  

 

 

 

I haven't heard of LUT_input/LUT_output primitives. I use LCELL to do what Rysc said. Be sure "Ignore LCELL Buffers" is off (the default) in the "More Analysis & Synthesis Settings" dialog box. You might also need "Remove Redundant Logic Cells" to be off. 

 

 

I don't know that all kinds of unused location assignments produce your Fitter warning below, but at least you know yours do. Just check for the warning. If you still want to know for certain whether something was placed where you want it, check "Tools --> Chip Planner" or "Assignments --> Timing Closure Floorplan". 

 

Warning: Ignored locations or region assignments to the following nodes
Altera_Forum

How would I use LCELLs to instantiate a buffer? I'm unfamiliar with the LCELL, can anyone provide an example?

Altera_Forum

It's not the cleanest code (a for-generate might work better), but it does do what you want. The instance names should be assignable too. Open the Assignment Editor, type LC_a in the To column, choose Location in the Assignment Name column, and enter X5_Y5 (or whatever LAB location you want) in the Value column. You can assign to an individual node this way if you want. 

 

You should also be able to put them all into a LogicLock region, so the fitter chooses their final placement within a lab and where the lab goes, but will keep them all in the same lab. 

 

 

LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

ENTITY delay IS
  PORT (
    din  : IN  STD_LOGIC;
    dout : OUT STD_LOGIC);
END delay;

ARCHITECTURE arch OF delay IS

  -- Intermediate nodes between the chained logic cells
  SIGNAL a, b, c, d, e, f : STD_LOGIC;

  -- LCELL primitive: marks a_out as the output of a logic cell
  COMPONENT LCELL
    PORT (
      a_in  : IN  STD_LOGIC;
      a_out : OUT STD_LOGIC);
  END COMPONENT;

BEGIN

  -- Six LCELLs chained in series: din -> a -> b -> c -> d -> e -> f
  LC_a : LCELL PORT MAP (a_in => din, a_out => a);
  LC_b : LCELL PORT MAP (a_in => a,   a_out => b);
  LC_c : LCELL PORT MAP (a_in => b,   a_out => c);
  LC_d : LCELL PORT MAP (a_in => c,   a_out => d);
  LC_e : LCELL PORT MAP (a_in => d,   a_out => e);
  LC_f : LCELL PORT MAP (a_in => e,   a_out => f);

  dout <= f;

END arch;
Altera_Forum

 

--- Quote Start ---  

You should also be able to put them all into a LogicLock region, so the fitter chooses their final placement within a lab and where the lab goes, but will keep them all in the same lab. 

--- Quote End ---  

 

 

 

But remember that different placement within the LAB from one ring instance to the next could result in timing differences that will mask any differences from on-chip process variation.
Altera_Forum

Thank you for the responses, I think I'm almost there. 

 

Is there a way to implement a 2-input LCELL? The reset input is essential to the design. I'm thinking about just taking the output of an LCELL buffer and ANDing that with the reset input, but was wondering if there's another way. 

 

Thanks. 

 

Paul
Altera_Forum

The LCELL states that it is the output of a logic cell. That doesn't mean the LCELL can't be packed with functionality. (This is what I was stating earlier, but it throws everyone off.) So if your a_in port gets the AND of two signals, that AND gate will get put into the LCELL. The LCELL primitive doesn't represent an actual LUT, just the fact that its output is the output of a LUT. So you should be able to do whatever logic the reset does to the oscillating signal and then drive that into the a_in port of the LCELL.
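As a rough sketch of that idea applied to the ring described at the start of the thread (four buffers followed by an inverter), something like the following might work. The entity name ring_osc_sketch and the signals enable, osc_out, stage, gated and inverted are invented for illustration; only the LCELL component with its a_in and a_out ports comes from the thread.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical sketch: entity, port and signal names are assumptions.
ENTITY ring_osc_sketch IS
  PORT (
    enable  : IN  STD_LOGIC;  -- '1' lets the ring run, '0' holds it still
    osc_out : OUT STD_LOGIC);
END ring_osc_sketch;

ARCHITECTURE arch OF ring_osc_sketch IS

  -- stage(0) to stage(3) are the buffer outputs, stage(4) the inverter output
  SIGNAL stage    : STD_LOGIC_VECTOR(4 DOWNTO 0);
  SIGNAL gated    : STD_LOGIC;
  SIGNAL inverted : STD_LOGIC;

  COMPONENT LCELL
    PORT (
      a_in  : IN  STD_LOGIC;
      a_out : OUT STD_LOGIC);
  END COMPONENT;

BEGIN

  -- The AND is expected to be packed into the first logic cell, because
  -- the LCELL only marks where the cell's output is.
  gated <= enable AND stage(4);
  LC_buf0 : LCELL PORT MAP (a_in => gated, a_out => stage(0));

  -- Three more plain buffers
  Gen_buf : FOR i IN 1 TO 3 GENERATE
    LC_buf : LCELL PORT MAP (a_in => stage(i-1), a_out => stage(i));
  END GENERATE;

  -- The single inversion that makes the loop oscillate
  inverted <= NOT stage(3);
  LC_inv : LCELL PORT MAP (a_in => inverted, a_out => stage(4));

  osc_out <= stage(4);

END arch;
```

With enable low the loop settles (stage(4) goes high); raising enable closes the loop with one inversion, so it should oscillate. This is a combinational loop, so a zero-delay functional simulation will not show useful behavior, and the synthesis settings discussed elsewhere in this thread (such as Ignore LCELL Buffers being off) still apply.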

Altera_Forum

Hopefully the original poster can benefit from this document 

http://www.altera.com/literature/ug/ug_low_level.pdf
Altera_Forum

I had a colleague who was trying to implement delays in a MAX CPLD with LCELLs, and the synthesis setting "Ignore LCELL Buffers" was set to AUTO, which basically meant ON (ignore). 

So he set it to OFF (don't ignore my LCELLs!) and it worked.
Altera_Forum

I used LCELLs as inverter-buffers to create an RO. To analyze the delays I have left the chain open, and I am just studying the propagation delay of the chain. I plan to sample at 100MHz, so I want my osc to run at 10MHz. For this I need to calculate the number of invs I would need.  

 

Here's the VHDL. Hacked Rysc's code from earlier in this thread.  

 

*** 

 

LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

ENTITY osc IS
  PORT (
    din                    : IN  STD_LOGIC;
    dout, pin0, pin1, pin2 : OUT STD_LOGIC);  -- pin2 is declared but never driven
END osc;

ARCHITECTURE arch OF osc IS

  SIGNAL in0, in1, in2    : STD_LOGIC;  -- inverted inputs to each LCELL
  SIGNAL out0, out1, out2 : STD_LOGIC;  -- LCELL outputs

  COMPONENT LCELL
    PORT (
      a_in  : IN  STD_LOGIC;
      a_out : OUT STD_LOGIC);
  END COMPONENT;

BEGIN

  -- Stage 0: invert din, then force a logic cell output
  in0 <= NOT din;
  LC_0 : LCELL PORT MAP (a_in => in0, a_out => out0);
  pin0 <= out0;

  -- Stage 1
  in1 <= NOT out0;
  LC_1 : LCELL PORT MAP (a_in => in1, a_out => out1);
  pin1 <= out1;

  -- Stage 2
  in2 <= NOT out1;
  LC_2 : LCELL PORT MAP (a_in => in2, a_out => out2);

  dout <= out2;

END arch;

 

*** 

 

3 inverter-buffer LCELLs. Compiles well. Ignore LCELL buffers is off. 

 

Here's the RTL view. Looks like I wanted it to be ~ link (http://img132.imageshack.us/img132/8985/delaychainzd1.gif)

[One odd thing is that the net name and the buffer names are the same? LC_0 appears in nets and buffers both] 

 

Here's the waveform vector file ~ link (http://img403.imageshack.us/img403/2221/vvfiw6.gif)

I run the simulation - this is what I see ~ link (http://img169.imageshack.us/img169/93/simulationtu2.gif)

 

Notice how changes on pin0 occur after nearly 10ns. But pin1 and pin2 change so very fast. 

 

In an earlier code, dout transitioned even before pin0 and pin1 would. I am zapped. Am I doing something dreadfully wrong? How can I calculate the propagation delay :S ?
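For the stage-count question, a back-of-the-envelope relation may help. It is the standard ring oscillator estimate, not something given in this thread, and it assumes a closed ring of N stages with an odd number of inversions and one average per-stage delay t_d (cell plus routing) that has to be read from the timing report for the actual fit:

```latex
f_{\mathrm{osc}} \approx \frac{1}{2\,N\,t_d}
\qquad\Longrightarrow\qquad
N\,t_d \approx \frac{1}{2\,f_{\mathrm{osc}}}
       = \frac{1}{2 \times 10\ \mathrm{MHz}}
       = 50\ \mathrm{ns}
```

So a 10 MHz target needs about 50 ns of total delay around the loop. With a purely hypothetical t_d of 1 ns that would be on the order of 50 stages. For the open chain described above, the same N t_d product is simply the end-to-end propagation delay.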
Altera_Forum

Here's another example of the strange behavior I am talking about. Outputs later in the chain transition even before their inputs change.... 

 

I was expecting to see a ripple like behavior, but here is a snapshot. :eek:  

 

~ link (http://img54.imageshack.us/img54/9833/strangekr1.gif)

 

[I hope I am not annoying people with putting up too many pictures?]
Altera_Forum

--Here is a better code with generate 

 

LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

ENTITY forced_delay IS
  GENERIC (N : INTEGER := 20);  --number of forced delay buffers
  PORT (
    din  : IN  STD_LOGIC;
    dout : OUT STD_LOGIC);
END forced_delay;

ARCHITECTURE arch OF forced_delay IS

  -- a(i) is the output of the i-th LCELL in the chain
  SIGNAL a : STD_LOGIC_VECTOR (N downto 1);

  COMPONENT LCELL
    PORT (
      a_in  : IN  STD_LOGIC;
      a_out : OUT STD_LOGIC);
  END COMPONENT;

BEGIN

  -- First cell takes the external input
  LC_1 : LCELL PORT MAP (a_in => din, a_out => a(1));

  -- Remaining N-1 cells, each fed by the previous one
  Gen_delay : FOR i in 1 to N-1 GENERATE
    LC : LCELL PORT MAP (a_in => a(i), a_out => a(i+1));
  END GENERATE;

  dout <= a(N);

END arch;
Altera_Forum

This VHDL code worked great for my ring counter in my lab on the DE0 dev. board. Thanks, Clint

Altera_Forum

Could anyone provide the LCELL's delay time in different devices? Or tell me how or where to find this parameter in datasheet? Thanks!!!

Altera_Forum

One thing I like as an alternative to the LCELL is using a keep attribute. Open the VHDL/Verilog in Quartus and go to Edit -> Insert Template -> VHDL/Verilog -> Synthesis Attributes to find examples. You basically just create signals/wires in a chain and apply the keep, which forces each one to be the output of an LCELL. 

As for the question, I don't think it's in a datasheet. It's probably not a fixed number to begin with (i.e. it might vary within a device), and different devices and different speed grades could also cause differences. On top of that, each input through the LCELL has a different delay, which makes sense. Because of all this, it is not a fixed delay. (Note that you can lock down the placement and routing of a section in the design, and even with everything physically locked, the timing can change by a few ps depending on what is placed and routed around it. This is accurate timing, but just another indicator of why timing is so hard to lock down to simple numbers...)
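A minimal sketch of the keep-attribute chain might look like this. The entity name keep_chain_sketch and the signals n1, n2, n3 are invented for illustration, and the attribute is declared as a boolean on the assumption that this matches the Quartus template mentioned above; check the template in your own Quartus version before relying on it.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical sketch: entity and signal names are assumptions.
ENTITY keep_chain_sketch IS
  PORT (
    din  : IN  STD_LOGIC;
    dout : OUT STD_LOGIC);
END keep_chain_sketch;

ARCHITECTURE arch OF keep_chain_sketch IS

  -- Chain of otherwise redundant nodes
  SIGNAL n1, n2, n3 : STD_LOGIC;

  -- Assumed form of the Quartus keep synthesis attribute (boolean-typed)
  ATTRIBUTE keep : BOOLEAN;
  ATTRIBUTE keep OF n1 : SIGNAL IS TRUE;
  ATTRIBUTE keep OF n2 : SIGNAL IS TRUE;
  ATTRIBUTE keep OF n3 : SIGNAL IS TRUE;

BEGIN

  -- Without keep, synthesis would be free to collapse this into dout <= din
  n1   <= din;
  n2   <= n1;
  n3   <= n2;
  dout <= n3;

END arch;
```

Whether each kept node really ended up in its own logic cell can be checked after fitting in the Chip Planner, as suggested earlier in the thread.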
Altera_Forum

 

--- Quote Start ---  

One thing I like as an alternative to the LCELL is using a keep attribute. Open the VHDL/Verilog in Quartus and go to Edit -> Insert Template -> VHDL/Verilog -> Synthesis Attributes to find examples. You basically just create signals/wires in a chain and apply the keep, which forces each one to be the output of an LCELL. 

As for the question, I don't think it's in a datasheet. It's probably not a fixed number to begin with (i.e. it might vary within a device), and different devices and different speed grades could also cause differences. On top of that, each input through the LCELL has a different delay, which makes sense. Because of all this, it is not a fixed delay. (Note that you can lock down the placement and routing of a section in the design, and even with everything physically locked, the timing can change by a few ps depending on what is placed and routed around it. This is accurate timing, but just another indicator of why timing is so hard to lock down to simple numbers...) 

--- Quote End ---  

 

 

Thank you very much!!!