Laie

KSP2 and multithreading (what CPU do I want for my next computer?)


Have there been any hints, or even proper announcements, as to how well KSP2 will be able to utilize several CPU cores? Basically I'm trying to figure out whether my next computer should be built around an Intel or an AMD processor.


KSP doesn't support multithreading for its physics because of its rigid-body simulation. KSP 2 might be the same way. They are still investigating.

I'd say get a CPU with good single-core performance, and one that runs at 4.1 GHz or higher with boost. The higher the clock speed, the better KSP will run; at least that's what I believe from KSP 1.

Edited by GoldForest

42 minutes ago, Laie said:

Have there been any hints, or even proper announcements, as to how well KSP2 will be able to utilize several CPU cores? Basically I'm trying to figure out whether my next computer should be built around an Intel or an AMD processor.

My semi-educated guess is that it'll still be single-core-performance constrained most of the time. But that's just a guess at this point, based on the assumption that they're still using Unity's physics API rather than something they've baked themselves.

They are trying to optimise for better scaling with part count, but by itself that wouldn't make the program any better at taking advantage of multiple cores.


For the love of God, get with the times please. If a game isn’t utilizing this feature by now it should be. Please make it work.

1 minute ago, B15hop said:

For the love of God, get with the times please. If a game isn’t utilizing this feature by now it should be. Please make it work.

Not all logic can be parallelized effectively.  KSP tends towards some that can't, because of what it's doing.  That's not necessarily an issue with the programming: it's the limits of design, logic, and computing theory.

31 minutes ago, B15hop said:

For the love of God, get with the times please. If a game isn’t utilizing this feature by now it should be. Please make it work.

DStaal covered the explanation perfectly fine, but I'm going to add another thing: the limitations of parallel computing have been known since the sixties and seventies.  "Get with the times" indeed.

By the way, a slavish devotion to parallel computing can end up with a program that is not only NOT faster than the equivalent single-threaded (serial) version, but also has the added bonus of requiring vastly more resources.  Hooray!

I'd also point out that Windows is egregiously bad at thread scheduling.   I remember back in the SupCom days, when we'd manually assign threads to CPUs using CPU affinity to gain massive improvements in performance.  So you could very well end up with some fat, bloated program that barely runs, which then gets scheduled to CPU #3 exclusively for no reason (well, except perhaps cache coherency, but you've just negated any advantage to being multithreaded in the first place)...

33 minutes ago, Renegrade said:

DStaal covered the explanation perfectly fine, but I'm going to add another thing: the limitations of parallel computing have been known since the sixties and seventies.  "Get with the times" indeed.

By the way, a slavish devotion to parallel computing can end up with a program that is not only NOT faster than the equivalent single-threaded (serial) version, but also has the added bonus of requiring vastly more resources.  Hooray!

I'd also point out that Windows is egregiously bad at thread scheduling.   I remember back in the SupCom days, when we'd manually assign threads to CPUs using CPU affinity to gain massive improvements in performance.  So you could very well end up with some fat, bloated program that barely runs, which then gets scheduled to CPU #3 exclusively for no reason (well, except perhaps cache coherency, but you've just negated any advantage to being multithreaded in the first place)...

I love how people have this tendency to throw out some over blown statement, peppered with their own personal sentiment and no hard facts. You don’t even know what kind of programming KSP 2 will have. But here you are making this giant, blanketing statement because hey, it’s nice to sound like you know something isn’t it? When I look at KSP, I see a bunch of things that could be separated onto different cores. But if you want to stay in the 1970’s, feel free. 

Edited by B15hop

1 hour ago, B15hop said:

When I look at KSP, I see a bunch of things that could be separated onto different cores. But if you want to stay in the 1970’s, feel free. 

Many of them already are.

1 minute ago, B15hop said:

I love how people have this tendency to throw out some over blown statement, peppered with their own personal sentiment and no hard facts. You don’t even know what kind of programming KSP 2 will have. But here you are making this giant, blanketing statement because hey, it’s nice to sound like you know something isn’t it? When I look at KSP, I see a bunch of things that could be separated onto different cores. But if you want to stay in the 1990s, feel free. 

Do you understand Amdahl's Law? The need for critical sections and mutexes?  Order-dependent operations?  I'm a professional programmer.  Are you?

Ironic that you should mention the 90s - most of the crap you're using today is just tired old rehashes of 90s stuff anyhow.  You're running GameSDK on Windows NT (or OpenGL on Linux).   All of the 'advanced' stuff (SMP, SIMD, 64-bit, PIC, cache, etc) of your CPU was invented by Westinghouse and Cray and Burroughs and IBM and DEC and such before you were born, and saw their first iterations in the 80s and 90s on PCs and home computers.   None of the underlying laws of physics has changed since that time.

"When I look at KSP, I see a bunch of things that could be separated onto different cores" -- Like this isn't some overblown statement, peppered with your own personal sentiment.

By the way, everything I said can be backed up with simple web searches with the sole exception of SupCom stuff.   It appears that Gas Powered Games went under at some point in the past, and the forums with thousands of posts detailing the crash and burn which was the "muh multicore/SMP" programming paradigm bit it with them.  Oh wait, I suppose that actually counts as evidence too!  Maybe they'd still be here if SupCom wasn't such a pig... Still, I'd prefer the actual direct evidence remained.  Effing internet is such fail.  Maybe I'll dig it out of the wayback machine at some point...

You know KSP already runs separate vessel physics instances in separate threads, right?  (introduced in ... uhhh.. 1.2?  1.3? I forget) That's likely 80% of the potential speedup right there.   Even if you could find other places to be concurrent in, you'd also find that the opportunity cost in terms of data locality / cache misses would override any possible gain.

Anyhow, I have Kerbals that need to explore the newly retextured worlds.  Oh and they're using 57 threads to do so already, of which about ten are sharing the CPUs aggressively. 

TL;DR: It's a complex problem that you can't just throw cores at and expect it to work (this applies to almost any program). See Amdahl's Law.  Plus what Brikoleur just said.
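
For anyone who hasn't run into Amdahl's Law before, here is a minimal sketch of it in Python. The function name and the 50% parallel fraction in the demo loop are made up for illustration; nobody in this thread has measured what fraction of KSP's frame time is actually parallelizable.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel.

    parallel_fraction is the share of single-core run time that could be
    spread across cores; the rest stays serial no matter how many cores exist.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workload: half of the run time is parallelizable.
    for cores in (2, 4, 8, 16):
        print(cores, "cores:", round(amdahl_speedup(0.5, cores), 2))
```

The point of the formula is that the serial part sets a ceiling: with half the work serial, the speedup can never reach 2x, however many cores you add.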

1 hour ago, B15hop said:

I love how people have this tendency to throw out some over blown statement, peppered with their own personal sentiment and no hard facts. You don’t even know what kind of programming KSP 2 will have. But here you are making this giant, blanketing statement because hey, it’s nice to sound like you know something isn’t it? When I look at KSP, I see a bunch of things that could be separated onto different cores. But if you want to stay in the 1970’s, feel free. 

I don't think you understand programming. It's math, which has rules. Certain things depend on the outcome of another. Sometimes you can't start one calculation until the other has finished. If you multithread something that constantly depends on the results of previous calculations, it will often still run at the same speed; you're just using additional threads. You end up with threads sitting idle while they wait for results they could be calculating themselves. If you think those threads can do something else while they're waiting, you are correct, but when a thread switches tasks it has to save the state of its current task, consuming additional resources. It's like employees: having more can help when additional tasks are required, but if it's a single task that only one person can work on at a time, passing it back and forth between multiple people slows it down.
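
One way to put numbers on the dependency argument is the work/span model: total work divided by the longest chain of dependent steps bounds the speedup you can get from any number of threads. The sketch below is only an illustration with made-up task names and durations, and it assumes the dependency graph has no cycles.

```python
from functools import lru_cache


def work_and_span(durations, depends_on):
    """Return (total work, critical-path length) for a task graph.

    durations:  dict mapping task name -> time the task takes
    depends_on: dict mapping task name -> list of tasks that must finish first
    Assumes the graph is acyclic.
    """
    @lru_cache(maxsize=None)
    def finish_time(task):
        # A task can only start once its slowest prerequisite has finished.
        start = max((finish_time(d) for d in depends_on.get(task, [])), default=0.0)
        return start + durations[task]

    work = sum(durations.values())
    span = max(finish_time(task) for task in durations)
    return work, span


if __name__ == "__main__":
    steps = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}

    # Each step needs the previous result: extra threads cannot help.
    chain = {"b": ["a"], "c": ["b"], "d": ["c"]}
    work, span = work_and_span(steps, chain)
    print("chain bound:", work / span)

    # No dependencies at all: the bound equals the number of tasks.
    work, span = work_and_span(steps, {})
    print("independent bound:", work / span)
```

For the chain, work and span are equal, so the best possible speedup is 1, which is the "threads sitting idle" case described above.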

5 hours ago, B15hop said:

I love how people have this tendency to throw out some over blown statement, peppered with their own personal sentiment and no hard facts. You don’t even know what kind of programming KSP 2 will have. But here you are making this giant, blanketing statement because hey, it’s nice to sound like you know something isn’t it? When I look at KSP, I see a bunch of things that could be separated onto different cores. But if you want to stay in the 1970’s, feel free. 

I know you mean well, but he's right.  The way we model physics hasn't changed much since the 70s.

 

Why?  Because physics hasn't changed much since the 70s, of course (it would be more correct to say it is exactly the same).

They really need to be on a single core, unfortunately.  That's also where the bulk of KSP processing is.

The only reliable way to multithread physics is to make the objects non-collidable and unable to act on each other.  That's good for particles, but it's useless for a sim.

Edited by R-T-B

30 minutes ago, R-T-B said:

The only reliable way to multithread physics is to make the objects non-collidable and unable to act on each other.  That's good for particles, but it's useless for a sim.

^ Exactly.  Good example too.

11 hours ago, Laie said:

Have there been any hints, or even proper announcements, as to how well KSP2 will be able to utilize several CPU cores? Basically I'm trying to figure out whether my next computer should be built around an Intel or an AMD processor.

Well, if you are in a situation like mine, gaming on a computer designed for school use, then I would worry. He did say that the game is being built in a completely new way! Kerbal Space Program runs on software that dates back to 2011. A 2020 game should be able to run better because it's built on a better and different foundation.


I have no idea whether KSP2 will support multiple threads for physics or not, but I *do* know a bit about multithreading. Physics simulations are one of the most well studied areas for parallel programming, and you can *absolutely* do rigid body simulations in parallel.

Here's a paper that discusses some of the issues and solutions for multi-threaded game physics as researched by a CS grad student back in 2014: http://scholarworks.sjsu.edu/cgi/viewcontent.cgi?article=1341&context=etd_projects

The basic idea for KSP-style rigid body physics is pretty simple: You divide the parts into groups by location, and have a separate thread calculate physics for each group. To handle cross group interactions, you either do the operation in two passes (no conflict motions and conflict detection, followed by conflict resolution) or you allow threads to duplicate a small amount of work recalculating physics for parts belonging to other threads. Either of these strategies scale better than most parallel algorithms for game engines. Maybe they're 20% more expensive than an optimized single threaded solution, but that still would make them more than 3x as fast running physics on 4 cores.
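
To make the two-pass variant a bit more concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the grid cell size, the contact radius, the function names, and the brute-force conflict check. It is not how Unity, PhysX, or KSP do it, and CPython threads won't actually speed up this kind of arithmetic; the point is only the structure (group by location, move groups independently, then detect cross-group conflicts).

```python
import math
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

CELL_SIZE = 10.0       # hypothetical grid spacing
CONTACT_RADIUS = 1.0   # hypothetical distance at which two parts interact


def cell_of(pos):
    """Map a 3D position to the integer grid cell that contains it."""
    return tuple(math.floor(c / CELL_SIZE) for c in pos)


def group_by_cell(positions):
    """Divide parts into groups by location: one group per occupied cell."""
    groups = defaultdict(list)
    for name, pos in positions.items():
        groups[cell_of(pos)].append(name)
    return dict(groups)


def move_group(names, positions, velocities, dt):
    """Pass 1 worker: advance one group's parts, ignoring other groups."""
    return {
        n: tuple(p + v * dt for p, v in zip(positions[n], velocities[n]))
        for n in names
    }


def step(positions, velocities, dt, workers=4):
    """Advance all parts one time step and report cross-group conflicts."""
    groups = group_by_cell(positions)
    owner = {n: cell for cell, names in groups.items() for n in names}

    # Pass 1: every group moves independently, one task per group.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(
            lambda names: move_group(names, positions, velocities, dt),
            groups.values(),
        )
        moved = {n: p for chunk in partial for n, p in chunk.items()}

    # Conflict detection: pairs from different groups that ended up within
    # contact range. Brute force over all pairs, for brevity only.
    conflicts = [
        (a, b)
        for a, b in combinations(sorted(moved), 2)
        if owner[a] != owner[b] and math.dist(moved[a], moved[b]) < CONTACT_RADIUS
    ]
    # Pass 2 (conflict resolution) would process `conflicts`; omitted here.
    return moved, conflicts
```

A real engine would replace the all-pairs check with a neighbour-cell lookup and would actually resolve the conflicts in the second pass; both are left out to keep the sketch short.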

In general, there is *no* CPU bound task in a video game that can't be parallelized. The reason we don't see a lot of parallel optimization yet is commercial: it hasn't been a good deal to spend extra developer time to add functionality which excludes users still on dual core boxes.

6 hours ago, Power9 said:

In general, there is *no* CPU bound task in a video game that can't be parallelized. The reason we don't see a lot of parallel optimization yet is commercial: it hasn't been a good deal to spend extra developer time to add functionality which excludes users still on dual core boxes.

This is very interesting. I know that the PhysX engine makes use of the GPU. Do you think it's already parallelised internally? There aren't really that many cores on the CPU so the potential gain from parallelising there is pretty limited, but if you could offload that to 2000 or more cores on the GPU it could effectively remove the bottleneck altogether.

Edited by Brikoleur

19 hours ago, Brikoleur said:

My semi-educated guess is that it'll still be single-core-performance constrained most of the time. But that's just a guess at this point, based on the assumption that they're still using Unity's physics API rather than something they've baked themselves.

They are trying to optimise for better scaling with part count, but by itself that wouldn't make the program any better at taking advantage of multiple cores.

still praying for GPU accelerated physics here

2 hours ago, Brikoleur said:

This is very interesting. I know that the PhysX engine makes use of the GPU. Do you think it's already parallelised internally? There aren't really that many cores on the CPU so the potential gain from parallelising there is pretty limited, but if you could offload that to 2000 or more cores on the GPU it could effectively remove the bottleneck altogether.

You can't effectively use a GPU and not be parallel - so yes, the GPU portion of PhysX runs in parallel. The complication is that you need to be parallel in a specific way. Although every CPU-bound task in modern games can be parallelized, they can't all be parallelized effectively in the way required by GPUs. Some tasks still require CPU-style code execution.

For tasks that run nicely on GPUs, the performance benefit of using a GPU is significant, but not quite as drastic as it seems. An RTX 2080 Ti gives you a max theoretical performance of ~14 teraflops @ 250W. An AMD Epyc 7742 gets you ~3 teraflops @ 225W. That's a factor of about five.

GPUs are really good at vector operations, when a bunch of tasks are doing exactly the same thing on adjacent data. They're bad at indirection, for example in object-oriented programming, or anything that involves non-adjacent data. They're also bad at small tasks - if you don't have 100,000 items to process, it's not worth bothering the GPU with it. The algorithms I describe above would probably work fine on a GPU depending on a bunch of details, but your ships in KSP probably don't have enough parts to be worth bothering the GPU with - with a parallel algorithm a decent CPU will be faster on even thousands of discrete parts.

With a CPU-parallel physics engine, KSP2 could definitely get us 4x the performance on 8 core CPUs over KSP1, or 20x the performance for people gaming on top of the line server CPUs. With a GPU physics engine, they might be able to get us 50x the simulation performance of KSP 1 for users who have a high end dedicated physics GPU.
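
As a quick sanity check on numbers like these, here is the arithmetic as a tiny Python sketch. It assumes an idealised model where the parallel version simply does some fraction of extra work and splits it perfectly evenly across cores; the function names are just for illustration.

```python
def speedup_with_overhead(cores, overhead):
    """Speedup if the parallel version does (1 + overhead) times the
    single-threaded work, divided evenly across `cores` (idealised)."""
    return cores / (1.0 + overhead)


def parallel_efficiency(speedup, cores):
    """Fraction of the ideal `cores`-times speedup that is actually achieved."""
    return speedup / cores


if __name__ == "__main__":
    # 20% extra work on 4 cores, the figure used earlier in the thread.
    print(round(speedup_with_overhead(4, 0.20), 2))
    # A 4x speedup on 8 cores corresponds to this efficiency.
    print(parallel_efficiency(4.0, 8))
```

Under that model, 20% overhead on 4 cores gives 4 / 1.2, or roughly 3.3x, and a 4x gain on 8 cores only needs the cores to be used at 50% efficiency.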

52 minutes ago, Power9 said:

You can't effectively use a GPU and not be parallel - so yes, the GPU portion of PhysX runs in parallel. The complication is that you need to be parallel in a specific way. Although every CPU-bound task in modern games can be parallelized, they can't all be parallelized effectively in the way required by GPUs. Some tasks still require CPU-style code execution.

For tasks that run nicely on GPUs, the performance benefit of using a GPU is significant, but not quite as drastic as it seems. An RTX 2080 Ti gives you a max theoretical performance of ~14 teraflops @ 250W. An AMD Epyc 7742 gets you ~3 teraflops @ 225W. That's a factor of about five.

GPUs are really good at vector operations, when a bunch of tasks are doing exactly the same thing on adjacent data. They're bad at indirection, for example in object-oriented programming, or anything that involves non-adjacent data. They're also bad at small tasks - if you don't have 100,000 items to process, it's not worth bothering the GPU with it. The algorithms I describe above would probably work fine on a GPU depending on a bunch of details, but your ships in KSP probably don't have enough parts to be worth bothering the GPU with - with a parallel algorithm a decent CPU will be faster on even thousands of discrete parts.

With a CPU-parallel physics engine, KSP2 could definitely get us 4x the performance on 8 core CPUs over KSP1, or 20x the performance for people gaming on top of the line server CPUs. With a GPU physics engine, they might be able to get us 50x the simulation performance of KSP 1 for users who have a high end dedicated physics GPU.

Any vendor-agnostic solutions? GPU PhysX requires support at the driver level, and even though the source has been opened up, AMD hasn't made any effort to support it because so few games use GPU PhysX.

12 hours ago, Power9 said:

I have no idea whether KSP2 will support multiple threads for physics or not, but I *do* know a bit about multithreading. Physics simulations are one of the most well studied areas for parallel programming, and you can *absolutely* do rigid body simulations in parallel.

Here's a paper that discusses some of the issues and solutions for multi-threaded game physics as researched by a CS grad student back in 2014: http://scholarworks.sjsu.edu/cgi/viewcontent.cgi?article=1341&context=etd_projects

The basic idea for KSP-style rigid body physics is pretty simple: You divide the parts into groups by location, and have a separate thread calculate physics for each group. To handle cross group interactions, you either do the operation in two passes (no conflict motions and conflict detection, followed by conflict resolution) or you allow threads to duplicate a small amount of work recalculating physics for parts belonging to other threads. Either of these strategies scale better than most parallel algorithms for game engines. Maybe they're 20% more expensive than an optimized single threaded solution, but that still would make them more than 3x as fast running physics on 4 cores.

In general, there is *no* CPU bound task in a video game that can't be parallelized. The reason we don't see a lot of parallel optimization yet is commercial: it hasn't been a good deal to spend extra developer time to add functionality which excludes users still on dual core boxes.

 

You've provided some good reading and examples.  Have a like. :)

23 hours ago, Power9 said:

I have no idea whether KSP2 will support multiple threads for physics or not, but I *do* know a bit about multithreading. Physics simulations are one of the most well studied areas for parallel programming, and you can *absolutely* do rigid body simulations in parallel.

Here's a paper that discusses some of the issues and solutions for multi-threaded game physics as researched by a CS grad student back in 2014: http://scholarworks.sjsu.edu/cgi/viewcontent.cgi?article=1341&context=etd_projects

The basic idea for KSP-style rigid body physics is pretty simple: You divide the parts into groups by location, and have a separate thread calculate physics for each group. To handle cross group interactions, you either do the operation in two passes (no conflict motions and conflict detection, followed by conflict resolution) or you allow threads to duplicate a small amount of work recalculating physics for parts belonging to other threads. Either of these strategies scale better than most parallel algorithms for game engines. Maybe they're 20% more expensive than an optimized single threaded solution, but that still would make them more than 3x as fast running physics on 4 cores.

In general, there is *no* CPU bound task in a video game that can't be parallelized. The reason we don't see a lot of parallel optimization yet is commercial: it hasn't been a good deal to spend extra developer time to add functionality which excludes users still on dual core boxes.

The other angle I've seen on the same problem is to go the other way. Map the bodies to particles and deal with everything as particles.

https://developer.nvidia.com/gpugems/GPUGems3/gpugems3_ch29.html

Sorry, it's an nVidia dev article, but it has the best pictures (or try this one if you prefer drier reading: https://www.hindawi.com/journals/ijcgt/2014/485019/).

This would seem to fit what we know of KSP2 so far, in that particles lend themselves to unique scary snowflake explosions, better smoke and visuals, rings, and better scatter, while allowing GPU processing and a higher degree of multithreading. It would also seem to lend itself to future advancements in the game's physics.


If they use Unity's C# Job System, for example for orbit calculations, then you would benefit from many CPU cores.
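
Orbit propagation is a good candidate because each vessel's orbit can be solved without looking at any other vessel. The sketch below shows that shape in plain Python; it is not Unity code and does not use the Job System API. The function names are made up, it only handles elliptical orbits, and it assumes the mean anomaly is given in radians in the range 0 to 2*pi. CPython threads also won't give a real speedup here; the point is that the work splits into independent jobs.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def eccentric_anomaly(mean_anomaly, eccentricity, tolerance=1e-12, max_iter=50):
    """Solve Kepler's equation E - e*sin(E) = M for E using Newton's method.

    Valid for elliptical orbits (0 <= e < 1).
    """
    # Common starting guess: M for modest eccentricity, pi for high eccentricity.
    E = mean_anomaly if eccentricity < 0.8 else math.pi
    for _ in range(max_iter):
        residual = E - eccentricity * math.sin(E) - mean_anomaly
        delta = residual / (1.0 - eccentricity * math.cos(E))
        E -= delta
        if abs(delta) < tolerance:
            break
    return E


def solve_all(orbits, workers=4):
    """Solve every (mean_anomaly, eccentricity) pair as an independent job."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda orbit: eccentric_anomaly(*orbit), orbits))


if __name__ == "__main__":
    # Hypothetical vessels: (mean anomaly in radians, eccentricity).
    vessels = [(0.3, 0.0), (1.0, 0.5), (2.5, 0.9)]
    for (M, e), E in zip(vessels, solve_all(vessels)):
        print(f"M={M} e={e} -> E={E:.6f}")
```

Because no job reads another job's result, there is nothing to lock and nothing to wait on, which is exactly the case where more cores help.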

Edited by runner78

On 10/21/2019 at 11:18 PM, Power9 said:

I have no idea whether KSP2 will support multiple threads for physics or not, but I *do* know a bit about multithreading. Physics simulations are one of the most well studied areas for parallel programming, and you can *absolutely* do rigid body simulations in parallel.

Here's a paper that discusses some of the issues and solutions for multi-threaded game physics as researched by a CS grad student back in 2014: http://scholarworks.sjsu.edu/cgi/viewcontent.cgi?article=1341&context=etd_projects

The basic idea for KSP-style rigid body physics is pretty simple: You divide the parts into groups by location, and have a separate thread calculate physics for each group. To handle cross group interactions, you either do the operation in two passes (no conflict motions and conflict detection, followed by conflict resolution) or you allow threads to duplicate a small amount of work recalculating physics for parts belonging to other threads. Either of these strategies scale better than most parallel algorithms for game engines. Maybe they're 20% more expensive than an optimized single threaded solution, but that still would make them more than 3x as fast running physics on 4 cores.

In general, there is *no* CPU bound task in a video game that can't be parallelized. The reason we don't see a lot of parallel optimization yet is commercial: it hasn't been a good deal to spend extra developer time to add functionality which excludes users still on dual core boxes.

Why don't they just run dual implementations?!

On 10/22/2019 at 8:56 AM, Power9 said:

You can't effectively use a GPU and not be parallel - so yes, the GPU portion of PhysX runs in parallel. The complication is that you need to be parallel in a specific way. Although every CPU-bound task in modern games can be parallelized, they can't all be parallelized effectively in the way required by GPUs. Some tasks still require CPU-style code execution.

For tasks that run nicely on GPUs, the performance benefit of using a GPU is significant, but not quite as drastic as it seems. An RTX 2080 Ti gives you a max theoretical performance of ~14 teraflops @ 250W. An AMD Epyc 7742 gets you ~3 teraflops @ 225W. That's a factor of about five.

GPUs are really good at vector operations, when a bunch of tasks are doing exactly the same thing on adjacent data. They're bad at indirection, for example in object-oriented programming, or anything that involves non-adjacent data. They're also bad at small tasks - if you don't have 100,000 items to process, it's not worth bothering the GPU with it. The algorithms I describe above would probably work fine on a GPU depending on a bunch of details, but your ships in KSP probably don't have enough parts to be worth bothering the GPU with - with a parallel algorithm a decent CPU will be faster on even thousands of discrete parts.

With a CPU-parallel physics engine, KSP2 could definitely get us 4x the performance on 8 core CPUs over KSP1, or 20x the performance for people gaming on top of the line server CPUs. With a GPU physics engine, they might be able to get us 50x the simulation performance of KSP 1 for users who have a high end dedicated physics GPU.

 

You realize that if they implemented that, it would change. We can get that many parts on a ship easily. I could make a ship with tens of thousands of parts with copy and paste in a matter of seconds to minutes, with the lag and stutter gone.

I was going to make a quick 144,000-ton cargo-to-orbit rocket, but the part count got too high too fast. It wasn't even that many parts, but it can be scaled up fast, especially if they remove the limits of the SPH and VAB. It's sad this game retrogressed. That permanently damages its momentum, especially for something that should have been a mainstay decades ago.

Edited by Arugela


A couple of detail questions...

How long a vector can a GPU handle? 3 elements, or an arbitrary number definable during runtime?

How many pieces of info does KSP work with on a part during calculation? There's position relative to the root (3), linear speed (3), linear acceleration (3), rotation (3), rotation speed (3), rotation acceleration (3), mass, and force applied. Any others?
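
To put a number on that list, here is a rough Python sketch that counts the scalars and lays them out the way the "adjacent data" comment earlier in the thread suggests. It is pure guesswork about the data: the field names are made up, force is assumed to be a 3-vector, and rotation is stored as 3 values as in the list above (an engine might well use a 4-value quaternion instead). It says nothing about what KSP really stores.

```python
from array import array
from dataclasses import astuple, dataclass


@dataclass
class PartState:
    """Hypothetical per-part state, following the list in the post above."""
    position: tuple              # relative to the root part
    velocity: tuple
    acceleration: tuple
    rotation: tuple
    angular_velocity: tuple
    angular_acceleration: tuple
    force: tuple                 # assumed to be a 3-vector
    mass: float


def flatten(part):
    """One part's state as a flat list of scalars."""
    values = []
    for field in astuple(part):
        values.extend(field if isinstance(field, tuple) else (field,))
    return values


def to_columns(parts):
    """Structure-of-arrays layout: column i holds scalar i of every part,
    so the same quantity for neighbouring parts sits in adjacent memory."""
    rows = [flatten(p) for p in parts]
    return [array("f", column) for column in zip(*rows)]


if __name__ == "__main__":
    zero = (0.0, 0.0, 0.0)
    part = PartState((1.0, 2.0, 3.0), zero, zero, zero, zero, zero, zero, 5.0)
    print("scalars per part:", len(flatten(part)))
    print("columns for 100 parts:", len(to_columns([part] * 100)))
```

With these assumptions it comes to 22 scalars per part (seven 3-vectors plus mass), and each column can be as long as the number of parts, decided at runtime.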

48 minutes ago, steuben said:

A couple of detail questions...

How long a vector can a GPU handle? 3 elements, or an arbitrary number definable during runtime?

How many pieces of info does KSP work with on a part during calculation? There's position relative to the root (3), linear speed (3), linear acceleration (3), rotation (3), rotation speed (3), rotation acceleration (3), mass, and force applied. Any others?

The vector length would mainly be constrained by the GPU architecture, though I'm sure you could find the most common implementation and use that.
