FP64 + FP64 = FP128? Is that possible?


Aghanim

Yes, I know that this question is probably as stupid as asking whether installing Windows 7 32-bit twice makes it 64-bit, but here is the question: if we have two FP64 processors, is it possible to combine them in software somehow to create a quad-precision number? And is it possible to combine two FP32 processors to create a double-precision number?

Nope. Quad precision is a thing, but no consumer CPUs support it. If double is not good enough, you can emulate higher precision in software, though that'll come with a performance hit. Don't try to code it yourself; find a math library that offers arbitrary precision. Another alternative is fixed-point math. Integer arithmetic isn't faster than floating point anymore, but it's not much slower, and you get constant accuracy across the entire range, which may be helpful. 64-bit fixed point can model the entire solar system at nanometre precision, if I remember rightly.
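For what it's worth, the closest thing to "FP64 + FP64" in software is double-double arithmetic: a value is stored as the unevaluated sum of two doubles, giving roughly 106 bits of significand (still not true quad). A minimal sketch in C of the error-free addition step (Knuth's two-sum); the dd type and dd_add are illustrative names, and a real implementation would come from a library such as QD:

```c
#include <stdio.h>

/* A double-double: value = hi + lo, where lo holds the rounding error of hi. */
typedef struct { double hi, lo; } dd;

/* Knuth's two-sum: computes a + b exactly as a rounded sum s plus error e. */
static dd two_sum(double a, double b) {
    double s = a + b;
    double v = s - a;
    double e = (a - (s - v)) + (b - v);
    return (dd){ s, e };
}

/* Illustrative double-double add: ~106 bits of significand from two FP64s. */
static dd dd_add(dd x, dd y) {
    dd s = two_sum(x.hi, y.hi);
    return two_sum(s.hi, s.lo + x.lo + y.lo);
}

int main(void) {
    /* 1 + 1e-30 vanishes in a plain double but survives in a double-double. */
    dd a = { 1.0, 0.0 };
    dd b = { 1e-30, 0.0 };
    dd c = dd_add(a, b);
    printf("hi = %.17g  lo = %.17g\n", c.hi, c.lo);
    return 0;
}
```

Note this relies on strict IEEE rounding, so it breaks under flags like -ffast-math.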

Nope. Quad precision is a thing, but no consumer CPUs support it. If double is not good enough, you can emulate higher precision in software, though that'll come with a performance hit. Don't try to code it yourself; find a math library that offers arbitrary precision. Another alternative is fixed-point math. Integer arithmetic isn't faster than floating point anymore, but it's not much slower, and you get constant accuracy across the entire range, which may be helpful. 64-bit fixed point can model the entire solar system at nanometre precision, if I remember rightly.

I'm afraid you remember incorrectly. 2^64 nanometres is just over 18 million kilometres, which is well inside the orbit of Mercury (69 million km) in our solar system, and about a quarter of the way to Jool in the KSP system.

newer chips have avx, which i think is capable of 128-bit fp math (my i7 has this); avx2, which came out last july, doubled this to 256-bit. this will be upgraded again to avx-512 some time in the next couple of generations of processors. even the aging x87 instruction set is capable of 80-bit floating point math.

the problem is that every time a new hardware capability comes out, it takes forever for software developers to start making use of it, because they still have to support the legacy bunch and their aging cpus/gpus, and it's easier to just use the legacy instructions than to conditionally use one or the other based on the user's hardware.

you can do soft-float math, but this is costly. there are infinite* precision libraries often used for scientific purposes.

*you can have as much or as little precision as you want. of course this is limited by the amount of ram on your rig and the amount of time you want to wait for an add or a multiply.
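for the curious, here's roughly what using one of those libraries looks like. a minimal sketch with GNU MPFR (a common arbitrary-precision library for C); the 256-bit precision is an arbitrary choice for the example:

```c
/* build with: gcc demo.c -lmpfr -lgmp */
#include <stdio.h>
#include <mpfr.h>

int main(void) {
    mpfr_t a, b, sum;
    /* 256-bit significand -- MPFR takes any precision, limited only by
       memory, and every extra limb costs extra time per operation. */
    mpfr_init2(a, 256);
    mpfr_init2(b, 256);
    mpfr_init2(sum, 256);

    mpfr_set_d(a, 1.0, MPFR_RNDN);
    mpfr_set_str(b, "1e-60", 10, MPFR_RNDN);  /* far below a double's ulp */
    mpfr_add(sum, a, b, MPFR_RNDN);

    mpfr_printf("%.70Rf\n", sum);  /* the 1e-60 is still in there */

    mpfr_clears(a, b, sum, (mpfr_ptr) 0);
    return 0;
}
```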

avx (and avx2) allows you to work with 128 (and 256) bits of data at a time, but you still don't get quad precision, just the ability to work with 2 (or 4) doubles at the same time.
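To make the width-versus-precision distinction concrete, here is a minimal sketch using AVX intrinsics (assumes an AVX-capable CPU and building with something like gcc -mavx). Four doubles are processed per instruction, but each lane still rounds to the ordinary 53-bit double significand:

```c
#include <stdio.h>
#include <immintrin.h>  /* SSE/AVX intrinsics; build with e.g. gcc -mavx */

int main(void) {
    /* Four independent doubles per 256-bit register: more throughput,
       not more precision -- each lane still rounds to 53 bits. */
    __m256d a = _mm256_set_pd(4.0, 3.0, 2.0, 1.0);
    __m256d b = _mm256_set_pd(0.4, 0.3, 0.2, 0.1);
    __m256d s = _mm256_add_pd(a, b);

    double out[4];
    _mm256_storeu_pd(out, s);
    printf("%f %f %f %f\n", out[0], out[1], out[2], out[3]);  /* 1.1 2.2 3.3 4.4 */
    return 0;
}
```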

I'm afraid you remember incorrectly. 2^64 nanometres is just over 18 million kilometres, which is well inside the orbit of Mercury (69 million km) in our solar system, and about a quarter of the way to Jool in the KSP system.

Yeah ... doing the math again, your resolution is a couple of microns to a couple of millimetres depending on how you define 'solar system'. The point is that 64-bit fixed math gives you either huge range, huge resolution, or some combination of both, and is sometimes a viable alternative to 64-bit floating point, which only gives 53 bits of precision; the other 11 bits are used to store how big the number is.
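For anyone who wants to redo the arithmetic: for a signed 64-bit fixed-point format, the resolution is just range / 2^63. A quick sketch, with rough range figures:

```c
#include <stdio.h>

int main(void) {
    /* Signed 64-bit fixed point: 2^63 steps from zero out to the edge. */
    const double steps   = 9223372036854775808.0;  /* 2^63 */
    const double neptune = 4.5e12;   /* ~30 AU in metres */
    const double oort    = 1.5e16;   /* ~100,000 AU in metres */

    printf("out to Neptune:        %g m per step\n", neptune / steps); /* ~0.5 um */
    printf("out to the Oort cloud: %g m per step\n", oort / steps);    /* ~1.6 mm */
    return 0;
}
```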

Yes, I know that this question is probably as stupid as asking whether installing Windows 7 32-bit twice makes it 64-bit, but here is the question: if we have two FP64 processors, is it possible to combine them in software somehow to create a quad-precision number?

The only limitation to precision is memory if you're working in software.

SSE2 can already do 2 double-precision operations at a time. But no, performing two double operations at the same time doesn't give you 128-bit precision. However, for many practical applications, where the exponent does not change much, 256-bit integer math gives you the same results. AVX2 does, indeed, allow you to perform 256-bit integer operations, which you can use to fake 128-bit floating point math in most cases.
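As a scalar illustration of the wide-integer idea (the u128 type and u128_add are made-up names for this sketch), here is a 128-bit add built from two 64-bit limbs; the carry propagation is the part you have to handle yourself once a value outgrows the hardware word size:

```c
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

/* A 128-bit unsigned integer as two 64-bit limbs. Wide integers like this
   serve as the significand when faking higher-precision floats. */
typedef struct { uint64_t lo, hi; } u128;

static u128 u128_add(u128 a, u128 b) {
    u128 r;
    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi + (r.lo < a.lo);  /* carry out of the low limb */
    return r;
}

int main(void) {
    u128 a = { UINT64_MAX, 0 };   /* 2^64 - 1 */
    u128 b = { 1, 0 };
    u128 s = u128_add(a, b);      /* carries into the high limb */
    printf("hi=%" PRIu64 " lo=%" PRIu64 "\n", s.hi, s.lo);  /* hi=1 lo=0 */
    return 0;
}
```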

The only limitation to precision is memory if you're working in software.

True. But I think the implication is a way to combine data without significant loss of performance, which is not going to be the case.

with 256 bits you could do 128.128 fixed-point math. that would give you pretty good precision, about 38 orders of magnitude either side of the decimal point. and unlike with floating point, you can do operations between big and small numbers, and you never have to screw with epsilon values when doing compares (gt, lt, gte, lte). of course fixed point isn't that simple though. adds and subtracts of the same precision can be done just fine, but multiplies and divides need higher-precision intermediates to prevent overflow and loss of precision respectively (see the sketch below). you can also do operations between different fixed-point formats, but you need room to shift sometimes.

both fixed and float have their quirks though. fixed can't do infinity or negative infinity, but it never results in a nan, and its quantization error is predictable (float has a non-linear precision curve with magnitude). the other side is that it can overflow on you. fixed point is faster than soft float, but it can't beat hard float, because modern cpus have faster fpus than integer units.
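a small-scale sketch of the higher-precision-intermediate point, using 16.16 fixed point so the widened product fits in 64 bits; 128.128 is the same idea with much wider limbs:

```c
#include <stdint.h>
#include <stdio.h>

/* 16.16 fixed point: value = raw / 2^16. */
typedef int32_t fx16;
#define FX_SHIFT 16
#define FX_ONE   (1 << FX_SHIFT)

/* multiply needs a double-width intermediate or the top bits overflow */
static fx16 fx_mul(fx16 a, fx16 b) {
    int64_t wide = (int64_t)a * (int64_t)b;  /* up to 32.32 here */
    return (fx16)(wide >> FX_SHIFT);         /* shift back to 16.16 */
}

int main(void) {
    fx16 x = (fx16)(1.5  * FX_ONE);
    fx16 y = (fx16)(2.25 * FX_ONE);
    fx16 p = fx_mul(x, y);                   /* 1.5 * 2.25 = 3.375 */
    printf("%f\n", p / (double)FX_ONE);
    return 0;
}
```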

That's a bad idea, because prior to multiplying these together you'd do a 128-bit shift to the right. If you happen to be working with small numbers, you'd lose most of your precision. You are much better off keeping track of the exponent somewhere in software rather than having a truly fixed point.

fixed point is really complicated. sometimes it works and sometimes you are slamming your head into the desk. i had to use it trying to do some filtering on imu data on an 8-bit mcu, doing trig, vector, and matrix math with fixed-point numbers. i think i gave up. but i'm comfortable doing small stuff with fixed-point math.

Of course. When you know in advance what sort of magnitudes you are going to be dealing with, you can just hard-code the fixed point. Most MCU applications are going to be like this. But if you are writing something a little more general, it's not a lot of work to have computations done in fixed-point style but with the ability to adjust the position of the point. It's not true floating point, because you don't adjust the point for every operation and each operand. But that's also why you don't lose much in terms of performance. You have to do shifts between multiplication operations anyway; might as well hold the shift amount in one of the registers and shift by that, rather than by a fixed amount. (Actually, is there even an immediate shift in the SSE/AVX instruction sets?)
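A minimal sketch of that adjustable-point idea, along the lines of block floating point: a batch of 64-bit mantissas shares one software-managed exponent, so the point moves per block rather than per operand. All names here are illustrative, and __int128 is a GCC/Clang extension:

```c
#include <stdint.h>
#include <stdio.h>
#include <math.h>

/* Block floating point: many values share one exponent, so the point is
   adjustable without per-operand bookkeeping. */
typedef struct {
    int64_t m[4];   /* mantissas; value[i] = m[i] * 2^exp */
    int     exp;
} block4;

/* Elementwise multiply: widen, shift mantissas back into range by a fixed
   32 bits, and absorb that shift into the shared exponent -- once per block. */
static block4 block_mul(block4 a, block4 b) {
    block4 r;
    for (int i = 0; i < 4; i++)
        r.m[i] = (int64_t)(((__int128)a.m[i] * b.m[i]) >> 32);
    r.exp = a.exp + b.exp + 32;
    return r;
}

int main(void) {
    /* values 3,1,2,5 and 2,4,3,2 stored with 32 fractional bits (exp = -32) */
    block4 a = { { 3ll << 32, 1ll << 32, 2ll << 32, 5ll << 32 }, -32 };
    block4 b = { { 2ll << 32, 4ll << 32, 3ll << 32, 2ll << 32 }, -32 };
    block4 p = block_mul(a, b);
    for (int i = 0; i < 4; i++)
        printf("%g ", ldexp((double)p.m[i], p.exp));  /* 6 4 6 10 */
    printf("\n");
    return 0;
}
```

A real implementation would also renormalise the mantissas to keep their leading bits, but the shared-exponent bookkeeping is the point here.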
