Intel® Quartus® Prime Software
Intel® Quartus® Prime Design Software, Design Entry, Synthesis, Simulation, Verification, Timing Analysis, System Design (Platform Designer, formerly Qsys)

Floating point multiplication megafunction

Altera_Forum
Honored Contributor II

Hello everyone, 

I am currently trying to use the floating-point multiplication megafunction, but I always get a result of 0 no matter what the inputs are. 

This is what I do: 

1. Get into megawizard and choose to generate verilog file 

2. define the floating point format as "Single extended precision (43 to 64 bits)" 

3. define dataa, datab and result widths as 45 bits 

4. define exponent width as 13 bits 

5. define output latency to be 11 

6. choose to generate verilog file and symbol file 

7. click finish 

 

For the input data I use the concatenation method to fit the widths of dataa and datab (as defined in step 3).  

My dataa is a constant, 0.172212 in decimal, which I convert to a mantissa using the fractional binary method. Then I use concatenation to attach the sign bit (0, because it is positive) and the exponent bits (13'b0) to the mantissa. The result is 1 sign bit, 13 exponent bits and 31 mantissa bits. 

My datab is not a constant, and I also use the concatenation method to fit the width of the megafunction. 

After I do that, I always get 0 at the result, no matter what. 

 

After that I thought maybe I should use the clk_en port of the megafunction, so I added the clk_en port and tied it high. But the result is still 0. 

 

What should I do to get a correct result? 

Thank you in advance for your guidance :D
20 Replies

Altera_Forum
Honored Contributor II

Come on guys... I really do need your help. Thanks :)

Altera_Forum
Honored Contributor II

Your constant is a denormalized number (exponent = 0). Unless denormalized support is enabled in the MegaWizard, the number is treated as zero. You need to learn how to encode regular normalized floating-point numbers.
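FvM's point can be sketched in Python as a software model of the poster's 45-bit format (1 sign bit, 13 exponent bits, 31 mantissa bits). The helper names and the bias value 2^12 − 1 = 4095 are assumptions based on the usual IEEE-754-style convention, not taken from the megafunction's documentation:

```python
import math

# Hypothetical model of the 45-bit format described above: 1 sign bit,
# 13 exponent bits (bias 2**12 - 1 = 4095), 31 mantissa bits.
EXP_BITS, MAN_BITS = 13, 31
BIAS = 2**(EXP_BITS - 1) - 1  # 4095

def encode(value):
    """Encode a positive normal value as (sign, exponent field, mantissa field)."""
    m, e = math.frexp(value)          # value = m * 2**e with 0.5 <= m < 1
    m, e = m * 2, e - 1               # renormalize to 1.xxx * 2**e
    return 0, e + BIAS, round((m - 1.0) * 2**MAN_BITS)  # implicit leading 1 dropped

def decode(sign, exp_field, man_field):
    if exp_field == 0:                # denormal: no implicit leading 1
        return (-1)**sign * (man_field / 2**MAN_BITS) * 2.0**(1 - BIAS)
    return (-1)**sign * (1 + man_field / 2**MAN_BITS) * 2.0**(exp_field - BIAS)

s, e, m = encode(0.172212)            # e == 4092, i.e. exponent -3 plus the bias
# decode(s, e, m) recovers ~0.172212, but decode(0, 0, m) -- the poster's
# encoding with a 13'b0 exponent field -- is ~2**-4095: effectively zero,
# which is what the multiplier returns when denormal support is off.
```

In other words, 0.172212 = 1.377696 × 2⁻³, so the exponent field must hold 4095 − 3 = 4092, not zero; writing 13'b0 there makes the hardware read the operand as (nearly) zero.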

Altera_Forum
Honored Contributor II

Thank you so much FvM for your reply. 

I have tried changing the constant from 0.172212 to 1.72212 in decimal. 

But the result is still 0. 

Why does that happen?
Altera_Forum
Honored Contributor II

Thank you FvM. I know now what my problem is. I will come back later if I still have a problem.

Altera_Forum
Honored Contributor II

He was saying that by setting the exponent bits to 0 you are working with denormal values (no matter what you set the mantissa to). Here is a handy page I use for looking at the bit fields of single precision values: http://www.h-schmidt.net/floatapplet/ieee754.html

Altera_Forum
Honored Contributor II

Thank you BadOmen for your reply and the good website link. 

It is very helpful. 

In my case I then need to write a Verilog module to convert from unsigned binary to floating point, because my dataa ranges between 60 and 700 and my datab is a constant, 0.172212 in decimal.  

By the way, has anyone ever made one? It would be helpful if anyone could help me out here. 

 

PS. I also found a simple but good website that explains floating point: http://www.tfinley.net/notes/cps104/floating.html#representation 

Here is a decimal to binary calculator (including fractions): http://www.exploringbinary.com/binary-converter/
Altera_Forum
Honored Contributor II

To convert the variable input, you can use the altfp_convert MegaFunction.

Altera_Forum
Honored Contributor II

Thank you FvM and BadOmen. 

I found out it's not as difficult as I first thought. 

Wakaka. :) 

I wonder: what is the effect of the output latency setting, apart from the latency itself? 

If I enter the shortest output latency, will it affect the performance or something else? (I am curious) 

Thanks in advance.
Altera_Forum
Honored Contributor II

Latency does not affect functionality. You target the minimum latency that still achieves your fmax; otherwise increase it. You also need to match the latency with your external computations.

Altera_Forum
Honored Contributor II

With your required range, have you considered using fixed point? Then it's just integer arithmetic, with much less resource usage than floating point.

Altera_Forum
Honored Contributor II

Thank you kaz for your explanation. Now I understand what considerations to make when deciding the output latency. But the problem is that we cannot choose the latency of the megafunction at will. For example, the output latencies available for the ALTFP_DIV megafunction are only 6, 14 and 33. What if we want an output latency of 50? All this time I have generated a slower clock to match the output latency, or the timing, between components. 

What do you think, kaz? Thanks in advance. 

 

Thank you Tricky. 

Actually I thought of using fixed point, but I cannot find any megafunction for fixed point (in the arithmetic section). All I found were functions with the ALTFP prefix, which are for floating point. Please correct me if I'm wrong. Is there any megafunction for fixed point? 

Thanks in advance Tricky.
Altera_Forum
Honored Contributor II

There are no megafunctions for fixed point because all of the arithmetic is integer arithmetic. So you either use the standard megafunctions (lpm_mult, lpm_add_sub, etc.) or you can just use behavioural code. 

 

This means the latency is MUCH lower, and the logic usage is MUCH smaller.
Altera_Forum
Honored Contributor II

Thank you Tricky. I think in my case I need a fractional part, which can only be fulfilled by floating point, e.g. 0.172212. In my final calculation, my data ranges between -1 and 1.  

Cheers :)
Altera_Forum
Honored Contributor II

Hi Gunardilin, 

 

There is nothing to stop you using ordinary integer arithmetic for fractions. I have never used floating point in any of my projects. You can think of truncation as the decimal point, e.g.  

y * 0.5 = y * 1024, then chop off 11 bits.
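Kaz's one-liner can be sketched in plain Python as a software model of the hardware truncation (the function name is made up; in HDL the shift would just be a wire selection):

```python
# Multiply by 0.5 using only integers: the constant 0.5 scaled by 2**11
# is 1024; multiply, then chop off the 11 LSBs (i.e. divide by 2**11).
def times_half(y):
    return (y * 1024) >> 11

times_half(100)  # 50, the same as 100 * 0.5 (truncated toward zero)
```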
Altera_Forum
Honored Contributor II

Hello kaz, 

Thank you for your reply. I still don't get your idea of using integer arithmetic for fractions. Could you explain further? Thank you in advance. 

Cheers :)
Altera_Forum
Honored Contributor II

Fixed point is simply just binary. Imagine the number 9.25 

In 6-bit fixed point: b1001.01 

 

This is the same as 37: b100101 

 

The only difference is a decimal point. So with fixed point arithmetic the decimal point (it should probably be called the integer/fraction divider!) is just imaginary, so you treat the fixed point number as if it were 37. 

 

The following rules apply: 

 

An M.N number + an X.Y number results in a (max(M, X)+1).(max(N, Y)) number. 

M.N * X.Y = (M+X).(N+Y) 

 

(Where M,N,X,Y are the number of bits). 

 

Have a look at the VHDL fixed point package: 

www.vhdl.org/fphdl  

 

It's the 2008 fixed point package in '93 format, so you can use it with any synthesiser.
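The width rules above can be checked with a toy software model (plain Python, not the VHDL package; the helper names are made up for illustration):

```python
# A value in M.N fixed point is stored as the integer round(value * 2**N).
def to_fixed(value, frac_bits):
    return round(value * 2**frac_bits)

def from_fixed(raw, frac_bits):
    return raw / 2**frac_bits

a = to_fixed(9.25, 2)           # 37, the b1001.01 example above
b = to_fixed(1.5, 2)            # 6, i.e. b01.10
# M.N * X.Y = (M+X).(N+Y): the raw product carries 2 + 2 = 4 fraction bits.
product = from_fixed(a * b, 4)  # 13.875 == 9.25 * 1.5, with no rounding error
```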
Altera_Forum
Honored Contributor II

If you have, say, a multiplier with two inputs (a signal of 12 bits signed and a multiplicand of 12 bits signed), then you may truncate the 12 LSBs off the result. 

That is equivalent to /2^12. 

 

Thus if your multiplicand is 0.172212, you first pre-convert it by hand as 0.172212 * 2^12 and round it to 705. 

 

Thus when you truncate the 12 LSBs, in effect you multiply by 0.172212. 

 

Thus all you do is scale your multiplicand from the normalised range of (-1 to 1) by 2^12. 

 

You can additionally do rounding at truncation. 

Edit: additionally, you can think of it in your head as a fractional value with the decimal point somewhere, but you don't need to, as these are just two ways of looking at the same thing.
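A Python sketch of this recipe for the thread's constant, including the optional rounding-at-truncation step (function names are illustrative, not from any library):

```python
SCALE = 12
COEF = round(0.172212 * 2**SCALE)   # pre-converted by hand: 705

def mul_const_trunc(x):
    return (x * COEF) >> SCALE      # truncation only: ~x * 0.172212, floored

def mul_const_round(x):
    half = 1 << (SCALE - 1)         # add half an LSB before truncating
    return (x * COEF + half) >> SCALE

# For x = 400 the exact product is 400 * 0.172212 = 68.8848:
# truncation gives 68, rounding gives 69.
```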
Altera_Forum
Honored Contributor II

With fixed point, always remember the range of your values and how operators affect it. For example, if I have two eight-bit fixed point values and I add them together, my result is potentially 9 bits. Likewise, if I multiply those two values together, I need 16 bits to store the result. 

 

So often with high-precision fixed point math hardware, if you want to maintain 100% accuracy without rounding through the calculation, the logic typically starts off narrow and becomes wider as the data moves through the various operators. At the end you sometimes truncate the width by downscaling the number. If you round off bits after each operator, keep in mind that errors will accumulate. 

 

I recommend searching for "fixed point math"; there are plenty of sites that show graphically how to apply it to hardware/software. It'll probably make more sense if you see pictures of it.
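This width bookkeeping is easy to confirm for worst-case unsigned 8-bit operands (a Python sketch, using the interpreter as a calculator):

```python
a = b = 255                 # largest 8-bit unsigned values
total = a + b               # 510: needs 9 bits
product = a * b             # 65025: needs 16 bits
widths = (total.bit_length(), product.bit_length())  # (9, 16)
```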
Altera_Forum
Honored Contributor II

Bit growth is inevitable, and most practical systems maintain a roughly fixed data path width, commonly 14 or 16 bits, between specific modules, while allowing full or near-full growth inside each module. Naturally that means rounding and occasional saturation. It can all be modelled easily before moving to implementation.
