How to identify if your design has a race condition.

Altera_Forum · ‎01-06-2013

I am currently working on a CPU, and I managed to make it run a sequence of instructions properly in ModelSim (all it does is add 1 to a register, store the result to a memory-mapped device that controls the LEDs on my FPGA board, and repeat). However, when I put it into a Qsys system I have issues. I connect the CPU to an on-chip RAM component (through Qsys) which is preloaded with the same instructions.

When I loaded the design onto the FPGA, I noticed that the LEDs increment for a VERY short time (several microseconds or so) and then freeze. If I reset the system it does the same thing, but it freezes at a different time. I tried using a ROM megafunction instead, placed inside the CPU, and it works perfectly fine. This led me to believe that there is an issue with the communication between the CPU and the memory unit, more specifically a race condition. Is there any way to check if there is a race condition going on?

EDIT: In some compilations the LEDs always stay in the reset state, which makes me believe that it is a race condition and that the duration can be affected by the fitting stage.

Altera_Forum · ‎01-08-2013

Is there anybody that can help? I am guessing that I need to do some sort of timing analysis.

Altera_Forum · ‎01-09-2013

What sort of design approach are you taking?

With good synchronous design there should be no 'race' conditions as everything is driven off the same clock (at the simplest level).

Nial

Altera_Forum · ‎01-09-2013

--- Quote Start ---

What sort of design approach are you taking?

With good synchronous design there should be no 'race' conditions as everything is driven off the same clock (at the simplest level).

Nial

--- Quote End ---

What do you mean by design approach?

What I can tell you is that the main clock is fed into a PLL which produces a second clock. That clock is synchronized with the main clock, so all clocks are synced properly.

To me it looks like an issue with the instruction fetching hardware or something like that. The CPU seems to completely freeze. Maybe it just skipped fetching an instruction or something similar.

Altera_Forum · ‎01-14-2013

I am thinking that the issue is that the On-chip memory is not fast enough to send the instructions to the CPU. Is there any way to do a timing constraint that will speed it up?

Altera_Forum · ‎01-14-2013

The on-chip memory will run at the frequency of its clock signal; it can't go any faster than that. You can speed up the system by connecting it to a CPU tightly coupled memory port, but then the block can only be used for either data or instructions, not both at once.

Altera_Forum · ‎01-14-2013

Hmmm, I guess then I will have to change the instruction fetching unit to accommodate it. I am going to test it first to see if the memory works when it is clocked 4 times faster than the CPU, just to see if that is the issue. Also, is there any way to tightly couple the memory with the CPU in Qsys, or do I need to do it in the CPU design with a megafunction?

Altera_Forum · ‎01-15-2013

You can create tightly coupled memory cores from the Nios2 configuration window in QSys. (IIRC it's in the tab with the other memory and cache parameters.)

With on-chip memory, using a clock that is 4 times the one of the CPU will probably not make things go faster. It will probably even be slower, because of the clock domain crossing logic that QSys will add. The most efficient way should be to use the same clock on both the CPU and the on-chip memory.

Altera_Forum · ‎01-15-2013

--- Quote Start ---

You can create tightly coupled memory cores from the Nios2 configuration window in QSys. (IIRC it's in the tab with the other memory and cache parameters.)

With on-chip memory, using a clock that is 4 times the one of the CPU will probably not make things go faster. It will probably even be slower, because of the clock domain crossing logic that QSys will add. The most efficient way should be to use the same clock on both the CPU and the on-chip memory.

--- Quote End ---

In my case I am using a custom-made CPU, so I guess that I will need to modify the instruction fetch unit or put a megafunction ROM in the CPU.

Altera_Forum · ‎01-17-2013

OK, I am asking another question. I am testing the memory unit in my CPU (for reads/writes) and I am having the exact same issue for reading. Are there any provisions I can make to ensure that this doesn't happen and that the pipeline stalls when the read takes too long? I have currently implemented a wait request signal, so I am not sure what else I should add. Maybe a data valid signal?
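To make the stall question concrete, here is a rough cycle-level sketch in Python (not HDL) of a master that honours wait request. It relies on the Avalon-MM rule that the master holds `address` and `read` unchanged for as long as `waitrequest` is asserted. The `SlowSlave` class, the fixed `wait_cycles` count and the ROM contents are all made up for illustration, and the model returns the data in the same cycle that `waitrequest` drops, which is a simplification. A data valid signal (`readdatavalid` in Avalon-MM) is only needed when reads are pipelined with variable latency.

```python
class SlowSlave:
    """Hypothetical slave that stalls every read for a fixed number of cycles."""

    def __init__(self, contents, wait_cycles):
        self.contents = contents
        self.wait_cycles = wait_cycles
        self.remaining = wait_cycles

    def cycle(self, read, address):
        """Model one clock cycle; returns (waitrequest, readdata)."""
        if not read:
            self.remaining = self.wait_cycles
            return False, None
        if self.remaining > 0:
            self.remaining -= 1
            return True, None              # not ready: master must hold its request
        self.remaining = self.wait_cycles  # re-arm for the next transfer
        return False, self.contents[address]


def fetch_all(slave, addresses):
    """Master that keeps read/address steady while waitrequest is high."""
    fetched, cycles = [], 0
    for address in addresses:
        while True:
            cycles += 1
            waitrequest, readdata = slave.cycle(read=True, address=address)
            if not waitrequest:
                fetched.append(readdata)
                break                      # transfer accepted, advance to next address
            # waitrequest high: pipeline stalls, address and read stay unchanged
    return fetched, cycles


rom = [0x10, 0x20, 0x30, 0x40]             # made-up instruction words
words, cycles = fetch_all(SlowSlave(rom, wait_cycles=2), [0, 1, 2, 3])
print(words, cycles)
```

The point of the sketch is that the program counter only advances in a cycle where `waitrequest` is low; any stage that moves on while it is high will consume invalid data.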

Altera_Forum · ‎01-21-2013

What kind of memory are you using in this test, and how long is this stall? If your CPU is the only one accessing the memory you should never have long stalls.

Altera_Forum · ‎01-21-2013

I am using an Altera on-chip ROM for my instructions. I have proof that the CPU executes the instructions, since it does write to my memory-mapped device (when loaded onto the FPGA); however, it stops before it is supposed to finish all of the instructions. In ModelSim my CPU has the waitrequest signal asserted for my entire gate-level simulation (100 microseconds). I am not sure if it is the fault of ModelSim or of the CPU.

Altera_Forum · ‎01-22-2013

Could you show a capture of your memory signals in Modelsim? 100μs is very long. Is it the wait request from the ROM or the other memory? Are you sure the CPU is trying to access a valid address?

Altera_Forum · ‎01-24-2013

--- Quote Start ---

Could you show a capture of your memory signals in Modelsim? 100μs is very long. Is it the wait request from the ROM or the other memory? Are you sure the CPU is trying to access a valid address?

--- Quote End ---

It is the wait request from my On-chip rom. It is accessing the proper address. Here is the output: https://www.alteraforum.com/forum/attachment.php?attachmentid=6709

Test_write is the wait request signal.

Altera_Forum · ‎01-24-2013

I would connect your cpu directly to a dual ported internal memory block (use the other port for everything else) then you don't need to worry about wait states (generated by the memory block).

The memory will do a read every clock - the 'data out' is valid in the cycle following the address.

For writes the address, data and write enable are valid in the same clock (a read happens as well).

If you are pipelining reads (as the nios does) then you end up having to add a wait state when a write follows a read.

There is also a clock enable/address hold signal. The Nios uses this to keep the correct address valid when the CPU stalls because the previous cycle stalls on the Avalon bus, since the address would otherwise get updated by the following instruction.

If you use signaltap to watch a nios cpu accessing tightly coupled data memory, you see that it does the read cycle regardless of the opcode value.
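The read and write timing described above can be sketched with a small behavioural model in Python (not HDL). `SyncRam` and its contents are invented for the example, and returning the old data when a write hits the address being read is an assumption; the real block's read-during-write behaviour depends on how the memory is configured.

```python
class SyncRam:
    """Behavioural model of a memory block with a registered data output."""

    def __init__(self, contents):
        self.mem = list(contents)
        self.q = 0                    # 'data out' register, arbitrary before first clock

    def clock(self, address, wren=False, data=0):
        """One rising clock edge: a read always happens, a write only if wren."""
        self.q = self.mem[address]    # assumption: read returns the old data
        if wren:
            self.mem[address] = data
        return self.q


ram = SyncRam([0xA0, 0xA1, 0xA2, 0xA3])
trace = []
for address in [0, 1, 2, 3]:
    trace.append((address, ram.q))    # what the CPU sees during this cycle
    ram.clock(address)                # edge at the end of the cycle

print(trace)
```

In each entry of `trace`, the data visible during a cycle belongs to the address driven in the previous cycle, which is the one-cycle latency a directly connected CPU has to account for.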

Altera_Forum · ‎01-24-2013

--- Quote Start ---

I would connect your cpu directly to a dual ported internal memory block (use the other port for everything else) then you don't need to worry about wait states (generated by the memory block).

The memory will do a read every clock - the 'data out' is valid in the cycle following the address.

For writes the address, data and write enable are valid in the same clock (a read happens as well).

If you are pipelining reads (as the nios does) then you end up having to add a wait state when a write follows a read.

There is also a clock enable/address hold signal. The Nios uses this to keep the correct address valid when the CPU stalls because the previous cycle stalls on the Avalon bus, since the address would otherwise get updated by the following instruction.

If you use signaltap to watch a nios cpu accessing tightly coupled data memory, you see that it does the read cycle regardless of the opcode value.

--- Quote End ---

I will look into it. I should change my CPU, since it expects the data from memory to be there in the same clock cycle in which it is needed. I don't have caches, so I guess I need to stall my CPU every time a memory access is needed.
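A minimal Python sketch of that difference, assuming a ROM whose output is registered (one cycle of read latency). `RegisteredRom`, the two fetch functions and the instruction names are hypothetical and only illustrate the idea; they are not the actual CPU design.

```python
class RegisteredRom:
    """ROM model whose output q updates on the clock edge after the address is applied."""

    def __init__(self, words):
        self.words = words
        self.q = None                 # nothing valid until the first clock edge

    def clock(self, address):
        self.q = self.words[address]


def fetch_naive(rom, count):
    """Uses q in the same cycle the address is issued (the failing assumption)."""
    fetched = []
    for pc in range(count):
        fetched.append(rom.q)         # still holds the word for the previous address
        rom.clock(pc)
    return fetched


def fetch_with_stall(rom, count):
    """Issues the address, stalls for one cycle, then uses q."""
    fetched, cycles = [], 0
    for pc in range(count):
        rom.clock(pc)                 # cycle 1: address presented, pipeline stalled
        cycles += 1
        fetched.append(rom.q)         # cycle 2: the word for this pc is valid
        cycles += 1
    return fetched, cycles


program = ["ADDI", "STORE", "JUMP"]   # made-up instruction names
naive = fetch_naive(RegisteredRom(program), 3)
stalled, cycles = fetch_with_stall(RegisteredRom(program), 3)
print(naive)
print(stalled, cycles)
```

The naive version is always one instruction behind, while the stalled version fetches correctly at the cost of two cycles per instruction. Pipelining the address one cycle ahead would recover that cost, but it needs the address hold logic mentioned in the quoted post.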

Altera_Forum · ‎01-25-2013

This is really strange. Are you sure the ROM isn't shared with any other master? Do you keep the read signal asserted for as long as wait request is asserted? I don't see any reason why an on-chip ROM would keep the wait request signal asserted for so long.

Altera_Forum · ‎01-25-2013

--- Quote Start ---

This is really strange. Are you sure the ROM isn't shared with any other master? Do you keep the read signal asserted for as long as wait request is asserted? I don't see any reason why an on-chip ROM would keep the wait request signal asserted for so long.

--- Quote End ---

I keep the read signal asserted all of the time. It may be caused by some issue related to compiling the QSYS system.