Re: timing closure problem.

Altera_Forum · 08-08-2007

Hi,

Could anyone shed some light on a timing closure problem I am working on? Any advice is appreciated.

Most of the critical paths in this design are in the implementation of a VHDL line similar to the following:

x[m][n]=x[m][n]+h;

where m, n, and h are integers, m ranges from 0 to 8, n ranges from 0 to 3, and each x element is 9 bits wide.

Between the registers, the combinational logic includes a few MUXes to look up the right x from m and n, a few LEs for the 9-bit adder, and then a few MUXes to put x back in the right place. The delay through the LEs is small, but the interconnect delays are huge. I tried using max delay constraints to make the interconnections shorter, but instead of improving the timing, the negative slack became even bigger. I am wondering whether this is the right approach, or whether there is a better way to solve this without adding registers in the data path.
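
For reference, the structure described above might look like the following minimal sketch. It is an illustration only, not the actual design: the entity name acc_array, the read-back port q, the unsigned type for x, and the power-up value are all assumptions, and h is given the 0 to 183 range mentioned later in the thread.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: names and ports are assumptions for illustration.
entity acc_array is
  port (
    clk : in  std_logic;
    m   : in  integer range 0 to 8;
    n   : in  integer range 0 to 3;
    h   : in  integer range 0 to 183;
    q   : out unsigned(8 downto 0)   -- assumed read-back port
  );
end entity acc_array;

architecture rtl of acc_array is
  type row_t   is array (0 to 3) of unsigned(8 downto 0);
  type array_t is array (0 to 8) of row_t;
  -- 9 x 4 = 36 registers, each 9 bits wide (assumed unsigned)
  signal x : array_t := (others => (others => (others => '0')));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- One register-to-register path: select x(m)(n), add h,
      -- then write the sum back to the selected element only.
      x(m)(n) <= x(m)(n) + to_unsigned(h, 9);
    end if;
  end process;

  q <= x(m)(n);
end architecture rtl;
```

Written this way, a synthesis tool will typically build a read multiplexer over the 36 elements, one 9-bit adder, and write-enable decoding back to the 36 registers, which matches the mux, adder, mux path described above.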

Thanks in advance!

Hua

Altera_Forum · 08-08-2007

There are cases where max delay constraints are appropriate to make a requirement on specific paths artificially tighter so that the Fitter will work harder on those paths, but I never use those constraints for performance optimization. It is very likely that there is a better solution.

Start with "Tools --> Advisors --> Timing Optimization Advisor --> Maximum Frequency (fmax)" in Quartus.

If none of the recommendations in the Advisor solves your problem, then refer to Volume 2, Section III, Chapter 8, "Area and Timing Optimization", of the Quartus handbook.

If none of the recommendations in the handbook solves your problem, then you might have a case where it is impossible for the design in its current state to have positive slack even if Quartus did the best possible job (and Quartus might already be doing so). You might need to add pipeline stages, restructure the logic to make it simpler, etc.

Altera_Forum · 08-09-2007

Could you add a little more info to the snippet of code? Are m, n and h signals/variables, i.e. they could be any value, or are they part of a for loop? It sounds like they are actual nets in your design. And is this in a clocked process, or a combinatorial line that feeds into a register?

I don't get why there are muxes to look up the x and muxes to feed it back. Even if m, n and h can be "any value", they are the same on both sides of the equation, i.e. x[3][1] can only get x[3][1] + h, not some other x value. I guess it depends on the synthesis. You could have a 9-bit 36:1 mux that feeds a single adder whose value is decoded back to one of 36 9-bit registers, or you could have 36 adders and do no muxing. Or something in between, say 4 adders, one for each n index. Etc.
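
As a rough sketch of the 36-adder end of that range (an illustration under assumptions, not a tested fix): each element gets its own adder, and m and n are only decoded into a per-element enable. The entity name acc_array_par, the read-back port q, the unsigned type, and the 0 to 183 range for h are assumptions made for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: names and ports are assumptions for illustration.
entity acc_array_par is
  port (
    clk : in  std_logic;
    m   : in  integer range 0 to 8;
    n   : in  integer range 0 to 3;
    h   : in  integer range 0 to 183;
    q   : out unsigned(8 downto 0)   -- assumed read-back port
  );
end entity acc_array_par;

architecture rtl of acc_array_par is
  type row_t   is array (0 to 3) of unsigned(8 downto 0);
  type array_t is array (0 to 8) of row_t;
  signal x : array_t := (others => (others => (others => '0')));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      for i in 0 to 8 loop
        for j in 0 to 3 loop
          -- As written, each element has its own adder next to its
          -- register; (m, n) only decodes to this element's enable.
          if m = i and n = j then
            x(i)(j) <= x(i)(j) + to_unsigned(h, 9);
          end if;
        end loop;
      end loop;
    end if;
  end process;

  -- Reading a selected element out still needs a 36:1 mux,
  -- but that mux is no longer in front of the adder.
  q <= x(m)(n);
end architecture rtl;
```

This trades area for a shorter path: 36 x 9 = 324 adder bits instead of 9, with h fanning out to every adder. Synthesis is free to restructure either form, so whether it actually helps depends on the tool settings, the device, and the placement.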

What frequency are you targeting? How much slack do you have? What device and speed grade? Care to add a .txt of the critical path? Interconnect timing is generally the long part of the delay, but I would really look at how far the hops are, i.e. the placement. You're not going to have fantastic placement because of the nature of the algorithm, i.e. a large number of nodes that cover a decent-sized area, muxing down to a single area, then muxing back out.

Altera_Forum · 08-09-2007

Hi Brad,

Thank you for your advice. It was very helpful. I tried the advisor on one of my modified versions, and it passed the timing requirements.

Hua

Altera_Forum · 08-09-2007

Hi Rysc,

Thank you for your reply. Yeah, they are actual nets in the design and it's in a clocked process. m, n, and h are synchronous integer signals produced outside the process with specific ranges (m: 0-8, n: 0-3, h: 0-183).

Yeah, m and n are the same on both sides of the equation.... hmmm... using 36 9-bit adders is an interesting idea if it doesn't use too many resources... I would still have to control which adder adds and which register is output, though (which means we need muxes again... probably fewer).

Thanks again, I will give it a thought.

Hua