
Old math for new problems.


steuben

Recommended Posts

I've been thinking about the math that KSP does when working with the craft. And I need some help shaping a thought.

One of the things taught to me by a _very_ theoretical prof back at school was this: "Any impossible problem can be changed into one easy problem and one difficult one. This axiom scales nicely."

And something taught to me by an old programmer: "Multiplication costs more than addition, floating point triply so."

My zeroth approximation is that the bulk of the math involved is multiplication/division. So rather than use the conventional number space that we are used to, what if we convert to logarithm base 10? With a trivial amount of work the floating point numbers can be represented as a pair of 9-digit integers. I had thought about using logarithm base e to remove the trig functions that will end up in the calculations. But if the idea doesn't pan out using log10, then it most likely won't using ln.

My zeroth approximation is that by doing it this way, processing time could be cut in half.
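
Here's roughly what I mean, as a quick C sketch (purely illustrative, and ignoring for now that the conversions themselves cost a log() and a pow()):

```c
/* Purely illustrative: one multiply done directly and via log10. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double a = 3.7e5, b = 2.1e-3;

    double direct   = a * b;                           /* conventional multiply */
    double via_logs = pow(10.0, log10(a) + log10(b));  /* add in log space, convert back */

    printf("direct : %.10g\n", direct);
    printf("via log: %.10g\n", via_logs);
    return 0;
}
```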

Leaving aside my thoughts on processing time improvements:

1. Does this pass the basic sanity checks?

2. Can gains be made with this approach, or would the overhead of the conversion to logs eat any gains from the faster operations in log-space?



You've pretty much got it: making digital electronic computers use slip-stick-style math. Though as that old programmer taught me, "never trust a computer to give you an answer you can't find yourself."

Though it might be closer to adding-machine style than slip-stick.


2 hours ago, steuben said:

"Any impossible problem can be changed into one easy problem and one difficult one. This axiom scales nicely."

No offense to your professor, but that sounds like something a motivational speaker would say to get you to buy their DVDs. Or a consultant to a customer at my job, implying I'm going to be working all weekend on a problem that actually is impossible and not just an easy problem and a hard one.

Edited by 5thHorseman

1 hour ago, 5thHorseman said:

that sounds like something a motivational speaker would say to get you to buy their DVDs. Or a consultant to a customer 

That's because they are fools, and thrive on condensed statements and sound bites.

Though I did think the same thing when he trotted it out. "You doubt me, Stu?" he said. "Me, a grand high wizard, sitting atop my ivory tower." He did have a touch of drama in his speech. I think it was all that math; it does weird things to your brain. "Isolated from the practical necessities of reality. Not as much as you may think. You see, that one statement sums up ninety percent of what the art and craft of theoretical math, and by extension any kind of problem solving, is about. How can I take this problem, that is impossible, hard, difficult, or what have you, and change it into a problem that I can easily find the answer to? That transformation is the hard part, and where the real proof of skill lies. Not in tables, proofs, axioms, and theorems, though they are important. They are merely the tools you need."

 


I'd hazard a guess that the old programmer in question never used a modern CPU or GPU. Floating point is actually faster than int math these days, at least on x86-64 and possibly ARM. I've used fixed-point math on microcontrollers and gone out of my way to avoid divide instructions, using simple bit shifts to divide by powers of 2. On AVRs a hardware multiply is available, but it's one of the 2-cycle instructions. Even then, if you can shift, shift. Much of the slowness in the FPU isn't the FPU itself but the register ops needed to feed it and recover the result, which the move to 64-bit has effectively halved (the x87 instruction set can handle 80-bit floats, but even on 64-bit you need 2 extra reg ops per operation). Pipelining is also a thing.
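
To make that concrete, here's a small illustrative sketch of those fixed-point tricks (not from any real project): dividing by a power of two with a shift, and a 16.16 fixed-point multiply that needs a wider intermediate:

```c
/* Illustrative fixed-point tricks: shift instead of divide, and a 16.16 multiply. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* Divide by 8 with a shift instead of a divide instruction. */
    int32_t x = 1000;
    int32_t eighth = x >> 3;                       /* 1000 / 8 = 125 */

    /* 16.16 fixed point: value = raw / 65536. The multiply needs a wider
       (64-bit) intermediate before shifting back down. */
    int32_t a = (int32_t)(1.5  * 65536);           /* 1.5  in 16.16 */
    int32_t b = (int32_t)(2.25 * 65536);           /* 2.25 in 16.16 */
    int32_t product = (int32_t)(((int64_t)a * b) >> 16);

    printf("1000 >> 3  = %d\n", eighth);
    printf("1.5 * 2.25 = %f\n", product / 65536.0); /* 3.375 */
    return 0;
}
```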

Edited by Nuke

5 hours ago, 5thHorseman said:

...I'm going to be working all weekend on a problem that actually is impossible and not just an easy problem and a hard one.

Easy Problem:
Is this impossible? (Yes.)

Hard Problem:
How do I get out of this field, and into a consulting position that pays ten times more to tell people what they want to hear, in a way that doesn't make them realize I have no idea what I'm talking about?  (Hell, if you know the answer, please let me know.)

Edited by razark

4 hours ago, razark said:

Hard Problem:
How do I get out of this field, and into a consulting position that pays ten times more to tell people what they want to hear, in a way that doesn't make them realize I have no idea what I'm talking about?  (Hell, if you know the answer, please let me know.)

Look at my profile.


14 hours ago, steuben said:

You see, that one statement sums up ninety percent of what the art and craft of theoretical math, and by extension any kind of problem solving, is about.

And this is where he was very, very, very wrong. Theoretical math is a peculiar field, which does not work like most others, in that it's completely detached from physical reality. It's a lot like philosophy, in that regard. An impossible problem is still an impossible problem; it's just that in mathematics, there are no impossible problems (note, I'm not including things that, provably, can't be done. "It can't be done" or even "it can't be proven" are perfectly valid answers to a mathematical question, which is another counter-intuitive thing about mathematics).

If you try that with any other science, you'll run up against the limits of physics and technology very quickly. Even programming, which seems almighty to many, has to consider architecture limitations. Ever wonder why so many physical discoveries were predicted by theorists long before they were made? Because it's far easier to calculate that something like the Higgs boson should exist than to actually detect it. The former takes only your brain, a good computer, and a good theory to build upon. The latter takes a gigantic particle accelerator and a legion of Earth's brightest minds, along with some of the most powerful computers, to sift through the data it produces. And then there's biology, in which you could argue that there's only one problem to solve.

16 hours ago, steuben said:

You've pretty much got it: making digital electronic computers use slip-stick-style math. Though as that old programmer taught me, "never trust a computer to give you an answer you can't find yourself."

That's very outdated thinking in most practical applications. Many things, especially iterative numeric calculations, cannot be checked by hand. These days, quite often you either trust a computer or are left with nothing. Understanding how the algorithm works, and its limitations, is sufficient, and in many cases has to be.

Edited by Guest

On 10/31/2019 at 9:33 PM, steuben said:

Any impossible problem can be changed into one easy problem and one difficult one.

That's true.
To google it, and to find the correct one among 20 pages of googled links.
Most problems have already been googled by somebody at least once. For the others there is stackoverflow.

***

On 10/31/2019 at 9:33 PM, steuben said:

So rather than use the conventional number space that we are used to, what if we convert to logarithm base 10?

Then in the end we'll get a virtual analog computer with limited accuracy like we had several decades earlier, just made of electric current rather than of metal.
Because the world is made of exponents. Sometimes of harmonics.

Edited by kerbiloid

15 hours ago, Dragon01 said:

That's very outdated thinking in most practical applications. Many things, especially iterative numeric calculations, cannot be checked by hand. These days, quite often you either trust a computer or are left with nothing. Understanding how the algorithm works, and its limitations, is sufficient, and in many cases has to be.

Yes, analog or mechanical computers are real-time, so faster than quantum computers. The downside is that they can't really be programmed; you have to build them for the purpose. And you don't want to feed them data they have previously calculated, because of the inaccuracy. The latter is also a problem with digital computers; it's how the butterfly effect was discovered, and it was a major problem in making KSP.


On 10/31/2019 at 1:33 PM, steuben said:

My zeroth approximation is that the bulk of the math involved is multiplication/division. So rather than use the conventional number space that we are used to, what if we convert to logarithm base 10? With a trivial amount of work the floating point numbers can be represented as a pair of 9-digit integers. I had thought about using logarithm base e to remove the trig functions that will end up in the calculations. But if the idea doesn't pan out using log10, then it most likely won't using ln.

My zeroth approximation is that by doing it this way, processing time could be cut in half.

Leaving aside my thoughts on processing time improvements:

1. Does this pass the basic sanity checks?

2. Can gains be made with this approach, or would the overhead of the conversion to logs eat any gains from the faster operations in log-space?

Floating point is already logarithmic, and already represented by a pair of integers (plus a sign bit). The most common representation is IEEE 754, which defines several formats, but each consists of a sign bit, a significand, and an exponent with a pre-defined base. When you use a float data type it is (usually) a 32-bit representation with 1 sign bit, 23 significand bits, and 8 exponent bits with a base of 2 (making the exponent the whole-number portion of log2(N)). Converting this back into a number looks like N = (-1)^(sign bit) × significand × 2^(exponent), which is very similar to scientific notation except in base 2, since binary computers naturally operate with a base of 2. IEEE 754 defines base-10 representations as well, though they are recent and if anyone uses them I'm not aware of it.
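
To make the layout concrete, here's a small illustrative sketch (not from any particular library) that pulls those three fields out of a 32-bit float:

```c
/* Illustrative only: decompose an IEEE 754 binary32 float into its fields. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
    float x = -6.25f;
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);            /* reinterpret the float's bit pattern */

    unsigned sign     = bits >> 31;            /* 1 bit                       */
    unsigned exponent = (bits >> 23) & 0xFF;   /* 8 bits, biased by 127       */
    unsigned mantissa = bits & 0x7FFFFF;       /* 23 bits, implicit leading 1 */

    /* N = (-1)^sign * 1.mantissa * 2^(exponent - 127) */
    printf("%g: sign=%u  exponent=%d  significand=0x%06X\n",
           x, sign, (int)exponent - 127, mantissa);
    return 0;
}
```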

Basically you won't save any time because this is already what we do. On the other hand you have managed to come up with a perfectly viable approach with a long history of working.

Base e is not likely to work well. Since it is an irrational number, a computer needs to approximate it to some number of digits. The log base 2 of a number can be computed with a fairly simple circuit; log base 10 is more complicated yet reasonable, and log base e is just an awful mess.

P.S. I'm glossing over some details. There is more to this than is relevant to the discussion.


On 10/31/2019 at 5:42 PM, Nuke said:

Floating point is actually faster than int math these days, at least on x86-64 and possibly ARM.

This is technically true in some cases, but highly misleading. Floating-point vector operations are generally faster than the equivalent integer vector operations would be due to the presence of specialized instructions and registers (XMM on x64 and SIMD on ARM64) to handle such operations, and this generally translates to floating point having higher performance than integer math in real world scenarios because most floating point math is vector math that can take advantage of these specialized instructions. However, a plain old double-precision floating-point add is going to be slower than an 8-byte integer add on a modern x64 or ARM64 machine.
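
For illustration only (x86-64 with SSE assumed, and this is a sketch rather than a benchmark), here's the kind of packed float add I mean next to a plain scalar 64-bit integer add:

```c
/* Illustrative only (x86-64 with SSE): one packed add works on four floats,
   while the integer add below handles one 64-bit value per instruction. */
#include <stdio.h>
#include <immintrin.h>

int main(void)
{
    __m128 a = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);
    __m128 b = _mm_set_ps(40.0f, 30.0f, 20.0f, 10.0f);
    __m128 c = _mm_add_ps(a, b);               /* four float adds in one op */

    float out[4];
    _mm_storeu_ps(out, c);
    printf("%g %g %g %g\n", out[0], out[1], out[2], out[3]);

    long long i = 5, j = 7;
    long long k = i + j;                       /* one scalar 64-bit integer add */
    printf("%lld\n", k);
    return 0;
}
```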


4 hours ago, magnemoe said:

Yes, analog or mechanical computers are real-time, so faster than quantum computers. The downside is that they can't really be programmed; you have to build them for the purpose. And you don't want to feed them data they have previously calculated, because of the inaccuracy. The latter is also a problem with digital computers; it's how the butterfly effect was discovered, and it was a major problem in making KSP.

Inaccuracies aren't a problem with the machine (anything with finite precision will exhibit similar effects), but with the model being used. That's why it's important, when making predictions about the real world, to avoid models that are chaotic. Computers are very much capable of arbitrary-precision calculations; you can't really get infinite precision, but it's enough to ensure that inaccuracies come only from the measurements themselves. Measurement errors appear to be inherent in how the world works on the quantum level. They can be reduced to a point, but they will always be a factor.


6 hours ago, magnemoe said:

analog or mechanical computers are real-time, so faster than quantum computers. The downside is that they can't really be programmed; you have to build them for the purpose.

+1. We live in one.
And no one can get the purpose.


6 hours ago, IncongruousGoat said:

This is technically true in some cases, but highly misleading. Floating-point vector operations are generally faster than the equivalent integer vector operations would be due to the presence of specialized instructions and registers (XMM on x64 and SIMD on ARM64) to handle such operations, and this generally translates to floating point having higher performance than integer math in real world scenarios because most floating point math is vector math that can take advantage of these specialized instructions. However, a plain old double-precision floating-point add is going to be slower than an 8-byte integer add on a modern x64 or ARM64 machine.

That 8-byte integer operation is functionally no different than a 64-bit int operation. It's simply going to mask and copy when moving the data in and out of the register. There are probably single instructions for that lurking in the bowels of the instruction set. The slowdowns come when you need a bigger data type than will fit in the register and you have to break it down into 2 ops. With doubles it's the same deal. Actually, last time I checked, in gcc sizeof(double) == sizeof(float). With regard to the vector instructions, the correct answer is that it's bloody complicated. x86 is especially messy.

Now if you compare float to fixed-point maths, you definitely lose performance with ints. Sure, you can say do your math in millimeters instead of meters, for example; that works fine. I do that a lot on lesser platforms. But an int is not a float and you will have limitations if you try to use it as such. You can do large fixed-point operations to do what float can do, but the extra shifts and extra space (a multiply needs 2x the space of the data type, and a divide needs 4x; a 64-bit divide (say 32.32) requires a 256-bit intermediary) result in additional reg ops. In this case float is faster. Using float when an integer will do, on the other hand, is bad practice, because in that case the int math is faster.
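
As a rough sketch (illustrative only), here's what the wider intermediary looks like for a 16.16 fixed-point divide:

```c
/* Illustrative only: a 16.16 fixed-point divide, widening before the shift. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int32_t a = (int32_t)(3.0  * 65536);       /* 3.0  in 16.16 */
    int32_t b = (int32_t)(1.25 * 65536);       /* 1.25 in 16.16 */

    /* Widen to 64 bits and pre-shift the numerator, then divide. */
    int32_t q = (int32_t)(((int64_t)a << 16) / b);

    printf("3.0 / 1.25 = %f\n", q / 65536.0);  /* ~2.4, limited by 16 fractional bits */
    return 0;
}
```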

Edited by Nuke

On 10/31/2019 at 8:42 PM, Nuke said:

I'd hazard a guess that the old programmer in question never used a modern CPU or GPU. Floating point is actually faster than int math these days, at least on x86-64 and possibly ARM. I've used fixed-point math on microcontrollers and gone out of my way to avoid divide instructions, using simple bit shifts to divide by powers of 2. On AVRs a hardware multiply is available, but it's one of the 2-cycle instructions. Even then, if you can shift, shift. Much of the slowness in the FPU isn't the FPU itself but the register ops needed to feed it and recover the result, which the move to 64-bit has effectively halved (the x87 instruction set can handle 80-bit floats, but even on 64-bit you need 2 extra reg ops per operation). Pipelining is also a thing.

Note that only "floats" are unbelievably fast on consumer GPUs, doubles typically are greatly reduced (thanks to market segmentation, often much worse than simple transistor savings would imply.  And equally often by disabling working double logic where present for the "big boys").  CPUs tend to focus more on doubles, although equally capable of cranking out as many floats as they have register ports (for those bits).  I'm not even sure they need to pipeline anymore (although I expect if you dig deep enough, plenty of pipelining has to happen even if they can execute back-to-back instructions).

Fused floating-point multiply-add is also pretty common. Believe it or not, the biggest stumbling block is if you need to round on the intermediate multiply. IEEE 754 appears to make floating point as hard as possible to actually lay out in transistors (example: you have to compute the full 112-bit multiply and then round to the exact 56-bit result...), but nowadays they seem to have no problem including the full suite on GPUs.
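
As a purely illustrative example of why that single rounding matters, C99's fma() computes a*b + c with one rounding, so it can recover the rounding error an ordinary product throws away:

```c
/* Illustrative only: fma() rounds once, so it can expose the rounding error
   of a plain product. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double a = 1.0 + ldexp(1.0, -27);   /* 1 + 2^-27, exactly representable */
    double p = a * a;                   /* product rounded to nearest double */
    double err = fma(a, a, -p);         /* exact a*a - p, thanks to the single rounding */

    printf("naive a*a - p = %g\n", a * a - p);   /* 0: the error was already rounded away */
    printf("fma(a, a, -p) = %g\n", err);         /* 2^-54, the true rounding error */
    return 0;
}
```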

Divide is still a pain. Any pipelined or parallel algorithm is going to have nasty limitations, such as inverting the thing and then multiplying [favored in early vector extensions; I suspect it might be standard practice today] or converting back and forth with Fourier transforms (which greatly optimises multiplication of >>100-bit factors and makes division as easy as multiplication).

1 hour ago, Nuke said:

That 8-byte integer operation is functionally no different than a 64-bit int operation. It's simply going to mask and copy when moving the data in and out of the register. There are probably single instructions for that lurking in the bowels of the instruction set. The slowdowns come when you need a bigger data type than will fit in the register and you have to break it down into 2 ops. With doubles it's the same deal. Actually, last time I checked, in gcc sizeof(double) == sizeof(float). With regard to the vector instructions, the correct answer is that it's bloody complicated. x86 is especially messy.

x86 will have to update each flag for all flagged conditions the same for 8-byte and 64-bit int operations (assuming it's using an ALU and not the SSE/AVX interface). This means the internal datapaths will be a lot more complicated than naively assumed (although the programmer will never see the difference). I'd assume other architectures have similar things that were assumed trivial whenever they were designed.

10 hours ago, satnet said:

Floating point is already logarithmic, and already represented by a pair of integers (plus a sign bit). The most common representation is IEEE 754, which defines several formats, but each consists of a sign bit, a significand, and an exponent with a pre-defined base. When you use a float data type it is (usually) a 32-bit representation with 1 sign bit, 23 significand bits, and 8 exponent bits with a base of 2 (making the exponent the whole-number portion of log2(N)). Converting this back into a number looks like N = (-1)^(sign bit) × significand × 2^(exponent), which is very similar to scientific notation except in base 2, since binary computers naturally operate with a base of 2. IEEE 754 defines base-10 representations as well, though they are recent and if anyone uses them I'm not aware of it.

Basically you won't save any time because this is already what we do. On the other hand you have managed to come up with a perfectly viable approach with a long history of working.

Base e is not likely to work well. Since it is an irrational number, a computer needs to approximate it to some number of digits. The log base 2 of a number can be computed with a fairly simple circuit; log base 10 is more complicated yet reasonable, and log base e is just an awful mess.

P.S. I'm glossing over some details. There is more to this than is relevant to the discussion.

I'd assume that any log operation would be based on base 2, not base e. This should allow for much more producible algorithms. This sort of thing would be ideal for things like graphics and audio, mostly because our senses send more-or-less log-based signals to the brain (or at least we interpret such signals logarithmically). Don't expect any great gains unless you want the output in log form.
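
As a toy sketch (real hardware uses dedicated leading-bit logic, and compilers expose builtins for it), the integer part of a base-2 log is just the position of the top set bit:

```c
/* Toy sketch: floor(log2(n)) for an integer is just the index of its top set bit. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t n = 1000;
    int log2n = -1;
    while (n) {            /* shift until the number is gone, counting the steps */
        n >>= 1;
        log2n++;
    }
    printf("floor(log2(1000)) = %d\n", log2n);   /* 9, since 2^9 = 512 <= 1000 < 1024 */
    return 0;
}
```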

11 hours ago, magnemoe said:

Yes, analog or mechanical computers are real-time, so faster than quantum computers. The downside is that they can't really be programmed; you have to build them for the purpose. And you don't want to feed them data they have previously calculated, because of the inaccuracy. The latter is also a problem with digital computers; it's how the butterfly effect was discovered, and it was a major problem in making KSP.

You can more or less effectively create an nth-order differential equation simulator out of op-amps and design in ways to change the coefficients on the fly. But they'll never be "generally programmable". And the "butterfly effect" becomes pretty clear long before that: consider why we even have double precision in computers. A "float" is good for seven decimal digits of accuracy. A "double" is good for ~16. Only the most rigorous physical experiments will get 7 decimals of accuracy, and nobody is getting 16. But it isn't just the "butterfly effect": rounding errors (even with the "perfect rounding" griped about above) will eventually corrupt even the most linear and stable algorithms. I remember doing a 32k Fourier transform on audio (16 bits, about 5 decimal digits) using just floats, and the audio output quality was worse than 8-bit.
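
A quick, purely illustrative demonstration of that accumulation: sum 0.1 ten million times in single and double precision (the exact answer is 1,000,000):

```c
/* Illustrative only: accumulated rounding error in single vs. double precision. */
#include <stdio.h>

int main(void)
{
    float  fsum = 0.0f;
    double dsum = 0.0;

    for (int i = 0; i < 10000000; i++) {
        fsum += 0.1f;
        dsum += 0.1;
    }

    /* Exact answer is 1,000,000; the float drifts badly, the double only slightly. */
    printf("float : %f\n", fsum);
    printf("double: %f\n", dsum);
    return 0;
}
```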

Building a mechanical or analog computer to 70 dB of accuracy (a float's, per operation, so expect things to degrade with each operation) sounds possible. Getting to 150 dB (a double's) is likely impossible (and no, the only way to break it down into "hard" and "simple" is by going digital).


3 hours ago, Nuke said:

That 8-byte integer operation is functionally no different than a 64-bit int operation.

I think we might have gotten hung up on a nomenclature issue. I didn't mean 8 8-bit ops - I meant a single 8-byte/64-bit integer add (so something like `add rsi, 10`).

Oh, and, in gcc, sizeof(float) == 4 and sizeof(double) == 8, regardless of whether or not -m32 is set. It's sizeof(int) that equals sizeof(long) if you're building a 32-bit binary (you need sizeof(long long) for a 64-bit int type).

But yes, floating point math's speed advantage comes because of specialized complicated instructions to do aggregate operations, mostly over vector types.

 


4 hours ago, IncongruousGoat said:

I think we might have gotten hung up on a nomenclature issue. I didn't mean 8 8-bit ops - I meant a single 8-byte/64-bit integer add (so something like `add rsi, 10`).

Oh, and, in gcc, sizeof(float) == 4 and sizeof(double) == 8, regardless of whether or not -m32 is set. It's sizeof(int) that equals sizeof(long) if you're building a 32-bit binary (you need sizeof(long long) for a 64-bit int type).

But yes, floating point math's speed advantage comes because of specialized complicated instructions to do aggregate operations, mostly over vector types.

 

Hey, I only know what my compiler (maybe it was Visual Studio, I don't know) told me. This is why it's good practice to print sizeof() of your types, so you know for sure what you are dealing with. Sometimes it's not what you expect. Sometimes you want a smaller type as a storage format, like to make file sizes smaller, so it doesn't make much sense to make all the types the same size. But again, check your compiler for shenanigans.
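
Along the lines of that advice, a trivial sketch that just asks the compiler directly:

```c
/* Print the sizes your compiler actually uses for the basic types. */
#include <stdio.h>

int main(void)
{
    printf("char      : %zu\n", sizeof(char));
    printf("int       : %zu\n", sizeof(int));
    printf("long      : %zu\n", sizeof(long));
    printf("long long : %zu\n", sizeof(long long));
    printf("float     : %zu\n", sizeof(float));
    printf("double    : %zu\n", sizeof(double));
    return 0;
}
```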

Edited by Nuke

On 11/2/2019 at 2:17 PM, IncongruousGoat said:

But yes, floating point math's speed advantage comes because of specialized complicated instructions to do aggregate operations, mostly over vector types.

That, and transistors are as close to free as possible. In CPUs, the limiting factor is likely clock speed, and how complicated an instruction is (unless it needs to interact with memory) really isn't an issue. Once vectors get to 512 bits long or so, power becomes a problem, but otherwise ungodly complicated instructions "just happen" every cycle. It might make sense to optimize GPU instructions to use something simpler than IEEE 754, but current GPU companies won't be interested as long as AI developers (and other non-graphics GPU buyers) want something based on floating point (Intel and cell phone GPU manufacturers might be more interested in optimizing graphics operations).


4 hours ago, wumpus said:

In CPUs, the limiting factor is likely clock speed, and how complicated an instruction is (unless it needs to interact with memory) really isn't an issue.

This isn't really the case. Modern CPUs do a whole lot of complicated stuff under the hood to speed up instruction execution, and the execution time for a given instruction can range anywhere from a fraction of a cycle to dozens of cycles, depending on instruction complexity, instruction operands (independent of complexity), branch predictor accuracy, dependencies, and pipeline contents at time of execution. Some instructions on x86 and x64 will even vary in execution time based on the literal values of the operands at runtime (!).

If you want to do some more reading into the nitty-gritty of per-instruction processor performance, I found this pdf, which is fairly comprehensive.

