
I'm worried about the possible system requirements of KSP2



1 hour ago, Oneiros said:

games rely a lot more on GPU than CPU. that CPU will most likely be fine but you'll probably need an upgrade to a dedicated graphics card. sometimes you can add one to a laptop through an mPCIe port adapter + external power supply; it's called an eGPU & can be done cheaply if your laptop supports it.

 

Not KSP; it's CPU-bound, big time. And eGPU enclosures are around 300 bucks, the GPU around 200, and in the best case they give you only 50% of the GPU's performance; you're spending almost as much as you would on a decent desktop for half the performance or worse. KSP2 will be better optimized to support more parts, but it will still be CPU-bound in the end.

Oh, and M-PCIe solutions are hardly portable, since they require opening the laptop or cutting open its back. And most run at PCIe x2, not x4, so your performance loss is even worse than the 50% I quoted above. Seriously, just take your money and save it over time.


33 minutes ago, Oneiros said:

and yet the graphical calculations remain far more intensive than the central processor's calculations

In KSP's case, the problem is the rigid-body physics. It's really hard to parallelise, while GPUs do their magic by being massively parallel. That means the physics work can't easily be shunted off to the GPU, and performance is constrained by the CPU.

KSP2's devs have said they have a "physics LOD" system to improve scalability with large craft, but even with tricks like this I think it's extremely likely the game will be CPU-bottlenecked on all but extremely low-end machines.

So in this case the CPU calculations do remain "far more intensive" than the GPU ones, precisely because the GPU's computations are done in parallel.
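To make that serial dependency concrete, here's a minimal sketch in plain C# (my own toy construction, not KSP's actual code) of the Gauss-Seidel-style relaxation shape a rigid-body joint solver iterates: each correction reads positions that earlier corrections in the same sweep just wrote, which is exactly what resists being split across threads or shipped to a GPU.

```csharp
using System;
using System.Numerics;

// Toy distance-constraint relaxation: nothing here is KSP code.
class JointRelaxation
{
    struct Joint { public int A, B; public float RestLength; }

    static void Relax(Vector3[] pos, Joint[] joints, int sweeps)
    {
        for (int s = 0; s < sweeps; s++)
            foreach (var j in joints)              // order matters
            {
                Vector3 d = pos[j.B] - pos[j.A];
                float len = d.Length();
                if (len < 1e-6f) continue;
                Vector3 corr = d * (0.5f * (len - j.RestLength) / len);
                pos[j.A] += corr;                  // these writes feed the
                pos[j.B] -= corr;                  // next joint's reads
            }
    }

    static void Main()
    {
        var pos = new[] { Vector3.Zero, new Vector3(2, 0, 0) };
        var joints = new[] { new Joint { A = 0, B = 1, RestLength = 1f } };
        Relax(pos, joints, sweeps: 10);
        Console.WriteLine(pos[1] - pos[0]);        // settles at rest length 1
    }
}
```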


3 minutes ago, Oneiros said:

lol

on an HD 620 + a CPU that benches at i5-4590 level? 0% chance.

entirely unnecessary.

not sure on currency but an entry-level dedicated option like the GT 1030 benches at well over double the HD 620 and should be under $200 anywhere.

how so?

it might also have 21st century graphics too, unlike KSP.

not sure what this has to do with portability but if you're too afraid of opening a laptop then yeah, don't try an upgrade. whether cutting the case is required depends on the laptop.

the whole number-of-PCIe-lanes thing isn't an issue when the current generation of cards isn't fully utilising them anyway.

Alright, so my desktop gets cheaper with your GT 1030? Also, a GT 1030 won't perform well after its performance is cut in half; it'll begin to approach the iGPU at that point. Which is why you want at least a mid-range GPU, so you have enough performance left over to even make it worth it. Again, you'd know this if you looked at benchmarks.

Nope, you need a PSU and a PCIe slot (often via a riser). Either you carry all that around loose or you buy an enclosure.

Look up benchmarks? Seriously, be informed before you post technical recommendations, especially when people are considering spending serious money.

Even with better graphics, the CPU limit is going to be hit before the GPU's. Just like KSP with graphical mods.

The dude said he needed to travel? Can't do that with the hotwire special you're describing.

It is when the number of lanes is < 8; again, you'd know this if you did any investigation beforehand. PCIe x2 cuts your performance dramatically, and PCIe x4 is still bad, just not as bad.
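(For rough numbers, assuming a PCIe 3.0 link at about 985 MB/s per lane per direction: x2 is roughly 2 GB/s and x4 roughly 3.9 GB/s, against about 15.8 GB/s for the x16 slot a desktop card expects. Many mPCIe adapters only negotiate PCIe 2.0, about 500 MB/s per lane, which widens the gap further.)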

There's a reason I know this: I looked into ALL of these options when I was on a laptop with no desktop, and guess what? This is why I ended up building my first desktop instead of an eGPU or anything similar. It's hardly upgradable, not portable, and you're still paying a premium for half the performance. Even if we go with your solution and use a single cable from M.2 to a GPU, the GPU is a GT 1030 and nothing else.

We're still talking about around 130-170 USD to get performance close to that iGPU, and now you have a laptop that can't move...

Whereas if you saved another few hundred dollars, you'd have a desktop that could potentially last you years, and still have a portable laptop for whatever you might need it for.


1 hour ago, Oneiros said:

it's a bit too time-consuming refuting all this. i can tell that you've bought into a lot of the trash being pushed by low grade media articles. performance halved - what? number of PCI lanes? and how does a little box make a laptop immobile? lol. first thing is to check whether the laptop even supports an eGPU bc in many cases they don't and that's the end of the story.

argue all you want that the HD 620 isn't the bottleneck in this setup but all the OP needs to do is open task manager and check the utilisation, GPU is guaranteed to be 100% constant while cpu will be something much lower. i know because i used to run a CPU that benched about half his with my external GT1030 and it was GPU bottlenecked on KSP just like every other game and it's still GPU bottlenecked on my 10th gen i3.

:cool:

edit: here's a pic for ya

[image: AYPM0ba.jpg]

on the left my old laptop which didn't support an eGPU, on the right a little old optiplex USFF which did.

Again, look at benchmarks. And I didn't look at "low grade media articles"; I looked at multiple sources, ranging from specifically mobile-oriented ones to more general ones. Performance was cut across the board by 50% in the best cases, and that's directly due to the bandwidth limitations of the M.2 interface, i.e. the number of lanes.

I'm not even saying the HD 620 "isn't the bottleneck" either; what I am saying is that pairing a better GPU with a mobile chip is going to hit massive diminishing returns, because KSP is more CPU-bound. The GPU must be fed by the CPU, so if you pair a powerful GPU with a low-end CPU, it's going to be a bad time.

Also... that machine has no reason to run an eGPU; I have one running as a pfSense box, and it supports half-height, low-profile cards. You could easily test EVERYTHING I've said by taking your GT 1030, whacking the LP bracket on it, and slapping it in there. What are you even doing lel, that's a USB 3.0 riser, which is even worse for this. And it still requires external power.

So now I have to carry around my laptop and two power bricks, and I'm cutting performance even more. Oh, this is bad.


4 hours ago, Oneiros said:

games rely a lot more on GPU than CPU. that CPU will most likely be fine but you'll probably need an upgrade to a dedicated graphics card. sometimes you can add one to a laptop through an mPCIe port adapter + external power supply; it's called an eGPU & can be done cheaply if your laptop supports it.

2 hours ago, Oneiros said:

and yet the graphical calculations remain far more intensive than the central processor's calculations

This is not the case... at all. Have you never upgraded the VAB, and only ever flown <30-part ships?

2 hours ago, Oneiros said:

on an HD 620 + a CPU that benches at i5-4590 level? 0% chance.

1 hour ago, Oneiros said:

i.e. HD 620

Where are you even getting this example from and why are you using it? The processors people have stated they have in this thread are:


8 hours ago, dave1904 said:
  • Processor: Core 2 Duo 2.0 GHz

Running KSP 1 on a 2.0 GHz processor? You must have insane patience.

5 hours ago, Helvica_Ring_Scientist said:

Processor: Core i3-1005G1

Quote

I've played KSP on a laptop with about a 3.4 GHz boost, and it played fairly well up to about 150 parts.

6 hours ago, The Doodling Astronaut said:

Processor: Core i5-5300U

Quote

Base frequency: 2.5 GHz, turbo frequency: 3.1 GHz

3.1 GHz max sounds rough

On 12/26/2020 at 4:33 AM, Helvica_Ring_Scientist said:

But only one laptop has Nvidia, and its processor is i3-8145U @2.10GHz

Quote

I assume with a 3.9 GHz turbo this would be fine

 

Also, if you try to fly a 300-part ship (not an unreasonable mid-game ship), then yes, even in this scenario the game would be CPU-bottlenecked.

KSP is not at all a GPU-intensive game without visual mods added; the textures in the stock game are all pretty easy on a GPU. This will change in KSP 2, thankfully, but I have no doubt that in the long run the CPU will be the bottleneck once again.

1 minute ago, Oneiros said:

OP has a good CPU, it benches like an i5-4590, maybe you should look that up?

i5-4590: 4 cores, non-hyperthreaded, desktop processor, running at 84 W.

i3-1005G1: 2 cores, hyperthreaded, mobile processor, running at 13 W.

Yeah... no.


28 minutes ago, Oneiros said:

this is mPCIe not M.2

OP has a good CPU, it benches like an i5-4590, maybe you should look that up?

optiplex 920 USFF does not have a PCIe slot

there's this thing called a graphics card, maybe you've heard of them?

Yeah, that was my point. If you're not connecting the HDMI cable to the graphics card (GPU = graphics card, iGPU = integrated graphics), it would still be using the integrated graphics, and you wouldn't actually be seeing the performance of the setup you just displayed.

Also nope, that's a U-model mobile CPU. It's limited to 15 W at maximum; there's absolutely no way it approaches a desktop i5-4590 in performance, even in heavily threaded workloads lel.

That's not PCIe at all, it's connected to a USB 2.0 port lel. And pop the side panel; I'm almost positive there's an x16 slot in there. I recognize that side-panel catch from mine (the little semi-circular part sticking out of the middle of the panel). There was a version that lacked the slot, but the dimensions look about right for the PCIe x16-equipped version.

14 minutes ago, Oneiros said:

Hyperthreading won't help you for KSP, I'm afraid; multithreading is only used for unloaded vessels and some background tasks.

Edited by Incarnation of Chaos

My PC's Specs 

Operating System: Windows 7 Ultimate 64-bit (6.1, Build 7601) Service Pack 1

Processor: Intel(R) Core(TM) i3-2120 CPU @ 3.30GHz (4 CPUs), ~3.3GHz

Memory: 4096MB RAM

Available OS Memory: 4008MB RAM

Page File: 2617MB used, 11409MB available

DirectX Version: DirectX 11


3 minutes ago, Oneiros said:

The thing that really matters here is single-thread scores, as KSP is not multi-threaded for interactions within a single vessel (rigid-body dynamics), though extra threads are helpful when multiple craft are within KSP's physics bubble, as each vessel gets a thread (pretty sure, correct me if I'm wrong). That said, the 10th-gen i3 has a better single-thread score than the 4th-gen i5. It seems six generations of IPC improvements make up for the lost 300 MHz of boost.

That all said, whether the game is CPU- or GPU-bottlenecked is going to depend heavily on how many parts are on the simulated vessel in your game. If the vessel is a low part count, it's probably GPU-bottlenecked, but as part count increases the cost of simulating its rigid-body dynamics grows much faster than linearly, so no matter what, at some point the game becomes CPU-bottlenecked. My 2700X gets a yellow clock starting somewhere around 150-200 parts in atmosphere in stock, I believe (it has been a long time since I have played stock), and the only thing that will change the color of that clock in this game is how well your CPU is handling simulating your craft.


2 hours ago, Oneiros said:

GPU is guaranteed to be 100% constant while cpu will be something much lower.

This doesn't necessarily mean it's GPU-bottlenecked. With no frame cap, the GPU will happily sit at 100% rendering as many frames as it can, while the single CPU thread doing the physics shows up as low overall utilisation on a multi-core chip.


38 minutes ago, Oneiros said:

finally someone talking some sense.

Thank you, I guess, but do you maybe see why your original statement of

6 hours ago, Oneiros said:

games rely a lot more on GPU than CPU. that CPU will most likely be fine but you'll probably need an upgrade to a dedicated graphics card. sometimes you can add one to a laptop through an mPCIe port adapter + external power supply; it's called an eGPU & can be done cheaply if your laptop supports it.

has gotten negative attention? It's very general, and KSP truly is far different from most games, not just in its gameplay but in what is computationally required for it to function. That statement holds more true for the likes of Far Cry, Assassin's Creed, or Cyberpunk than it does for games like KSP, Factorio, or Cities: Skylines, the latter three of which are all heavily CPU-bound once the mid-game gets in gear. I still find it crazy to see a relatively low-res 2D game like Factorio make my PC lose FPS when a megafactory is running, or the traffic simulation in Cities: Skylines alone make my PC huff. The common factor between KSP and those two games is the inability to parallelize the most computationally heavy parts of the game, and KSP 2 will not be an exception to that.

The majority of gameplay time will be spent with higher part-count craft, meanwhile graphics can be toned down to fit what your GPU is capable of. As far as your GPU goes, what you experience on your first flight will be very similar in graphical intensity to what comes mid-to-late game, while the complexity of what your CPU has to simulate keeps climbing. So all those times any of us get a good KSP run going and the game slowly starts chugging more and more, THAT is the CPU bottlenecking, and it is what I believe the others who are expressing disagreement with you are referring to.

Personally, I would be very surprised if the minimum requirements demand even a GT 1030 for KSP 2 (lots of pop-in and not much ground detail). I just hope that the upper bound of the game's beauty can at least fully utilize a 2080 Ti, since we will be playing this for the next 10 years, and I have no doubt that within that time span ray tracing will be commonplace on 8K screens at 60 Hz and people will be playing on 6080 Tis. I hope this game is going to be capable of trying to keep up with that.

 

TL;DR: there is a reason that CS:GO, Civilization VI, and Total War are the typical go-tos for CPU benchmarks, while games like Tomb Raider, Operation Metro, or Crysis are used for GPU benchmarks. The first three CPU-bottleneck, the latter three GPU-bottleneck. Meanwhile, games like GTA V are benchmarked to show off your system's overall beefiness.

Edited by mcwaffles2003

9 hours ago, Helvica_Ring_Scientist said:

I met most of them

OS: Windows 10 Pro 64-bit

Processor: Core i3-1005G1

Memory: 4GB RAM

Graphics: Intel(R) UHD Graphics 620

Storage: 512 GB (SSD)

I said barely. I don't know what KSP2 requires, but expect it to be more than your specs.

7 hours ago, Oneiros said:

and yet the graphical calculations remain far more intensive than the central processor's calculations

No, they don't. My 10700K bottlenecks my 3080 once I reach a certain craft size. I will admit that I run 100 mods and have not bothered doing proper tests.

Edited by dave1904

21 minutes ago, Oneiros said:

edit: forgot to add </s>

There is no limit on parts in-game, so craft size is not an argument for or against how CPU- or GPU-intensive the game is. The simple fact is that even in the stock game I will always be able to max out a 10700K or a Ryzen 5000 before the 3080. So yes, the game is more CPU-bound than GPU-bound. I also don't get the sarcasm here, because on the one hand you say it's not CPU-bound, but on the other you say it's a badly optimized physics engine that cannot run 500 parts.

 

4 hours ago, mcwaffles2003 said:

The thing that really matters here is single-thread scores, as KSP is not multi-threaded for interactions within a single vessel (rigid-body dynamics), though extra threads are helpful when multiple craft are within KSP's physics bubble, as each vessel gets a thread (pretty sure, correct me if I'm wrong).

I've heard this but have not been able to test it in practice. Have you ever noticed any differences? By that logic, if I undock a 500-part craft into two 250-part craft, I should see a performance gain?


9 hours ago, dave1904 said:

 

I've heard this but have not been able to test it in practice. Have you ever noticed any differences? By that logic, if I undock a 500-part craft into two 250-part craft, I should see a performance gain?

It depends on whether your computer has the cores available; if so, then in general yes. Simulating two systems of 250 parts independently in parallel is faster than one system of 500 parts.

It also means docking a 250-part ship to a 250-part station will result in a fair performance loss.
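A toy illustration of that point in plain C# (made-up names, nothing like KSP's real loop): independent vessels can each get a solver task, while a docked pair is one constraint system and has to be stepped as a unit.

```csharp
using System;
using System.Threading.Tasks;

class Vessel
{
    public int Parts;
    public void Step(float dt) { /* solve this vessel's joints */ }
}

class PhysicsBubble
{
    // One task per loaded vessel; only pays off if cores are idle.
    static void StepAll(Vessel[] loaded, float dt) =>
        Parallel.ForEach(loaded, v => v.Step(dt));

    static void Main()
    {
        var undocked = new[] { new Vessel { Parts = 250 }, new Vessel { Parts = 250 } };
        StepAll(undocked, 0.02f);   // two jobs, can overlap on two cores
        var docked = new[] { new Vessel { Parts = 500 } };
        StepAll(docked, 0.02f);     // one big job, one core
        Console.WriteLine("stepped");
    }
}
```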

Edited by mcwaffles2003

On 12/26/2020 at 2:17 AM, vv3k70r said:

They now know what belongs in the KSP engine and what can be left to Unity (which should only be a display/GUI layer, because it is meant for a different kind of game; a game like KSP has no ready-made engine on the market, because there is no such need on the market - it is a very advanced, nerdy kind of game).

That's wrong on just so many levels. There are a few things that KSP does that very few other games do, like rudimentary aerodynamics and patched conics. But these aren't things a game engine ever takes care of for you in any sort of game. This has always been part of gameplay and always has to be built out custom for a game, just like you don't expect the game engine to handle a tech tree, for example, because every single game is going to handle it differently.

The game engine is in charge of UI, rendering, animation, and physics. And in absolutely none of these is KSP any different from any other game on the market. In fact, KSP is less render- and animation-heavy than most games of its caliber. Yes, Unity struggles with physics for larger craft, but Unity has just about the worst physics engine of any commercially available game engine. It would take an absolute monstrosity of a ship to make even Unreal or CryEngine struggle to keep up the framerate. Some of the proprietary stuff is even better. On the last game I worked on, we were able to run thousands of pieces of debris colliding with each other, each a convex hull generating multiple contact points. That is orders of magnitude more complex than anything you are likely to build in KSP, and it ran at a comfortable 30 FPS on a PS4 Pro. Now, we did ultimately have to cut it down for consoles, because the game also needed to run complex animation graphs for dozens of complex characters, behavior trees and pathing for all the enemy AI, and hundreds of custom scripts for combat, abilities, damage, and loot tables - none of which KSP nor KSP2 will ever have to deal with. But even on a modest PC you can still enable this and not take a big FPS penalty.

KSP's performance is a combination of poorly optimized code, which I can understand given the size of Squad, and Unity being about the worst possible choice of game engine. Which, again, given how KSP started, made sense, and, unfortunately, I also understand how financial pressures prevented Intercept from pivoting away from it. But to say that KSP as a game requires marvels from a game engine that few other games do is absurd. Most game engines are built for exactly the kind of game that KSP and KSP2 are: large, expansive worlds with customized materials to make them look unique, filled with thousands of rendered objects and hundreds of interacting rigid bodies. It's just that most games throw the physics sim at FX and a few physics puzzles here and there, while KSP consolidates all of it into simulating the ships. The requirements are identical, and the amount of physics simulation a lot of commercial games do even on consoles, in between everything else they have to handle, puts KSP to shame.


1 hour ago, mcwaffles2003 said:

It depends on whether your computer has the cores available; if so, then in general yes. Simulating two systems of 250 parts independently in parallel is faster than one system of 500 parts.

It also means docking a 250-part ship to a 250-part station will result in a fair performance loss.

Do you know anything about multithreading? I always wanted to know, if you could easily program KSP to have one craft use multiple threads, whether it would actually be faster than a single core, even a maxed-out one. I read somewhere a few years back that it can be slower because you're sharing memory or whatever; I cannot remember the details.


1 hour ago, dave1904 said:

Do you know anything about multithreading? I always wanted to know, if you could easily program KSP to have one craft use multiple threads, whether it would actually be faster than a single core, even a maxed-out one. I read somewhere a few years back that it can be slower because you're sharing memory or whatever; I cannot remember the details.

A little, but not much. The topic has been discussed at length in multiple threads on this forum; just search "multithreading" or "rigid body" and you'll see plenty of related discussion. As for the memory sharing, I believe that's about the independent per-core caches which hold info about the vessel: exchanging information between threads in different caches would considerably slow down performance. I may be wrong, grain of salt.


13 hours ago, Oneiros said:

lol

it was getting a bit tedious arguing with people who don't even know what a graphics card looks like.  but if you want to try to dictate your narrative to me, i'll just point out that it's just as absurd as people who think that a game with last century's graphics is a good case for proving that games can still be CPU bound.

CPUs have been overpowered for many years to the point that they're almost irrelevant in gaming - any old CPU will do.

have fun with your five hundred part monstrosities on a badly optimised physics engine. lol. i know exactly why people attack just about everything i say and it has nothing to do with computers, but if i say that the mods will just censor it because they don't tolerate "politics"

You know, I had realized last night that you had no interest in contributing anything other than pettiness and spite, and I decided to drop it.

That was until you decided to say that, so alright. You wanna throw down? Let's!

That's my main rig; as you can see, it has not one but *GASP* TWO graphics cards inside it! I built it myself, no help. Mind you, computers are pretty much big-boy Lego at this point, so that's not impressive. But still, I know quite well what the rear of a graphics card looks like.

That machine also has two PSUs, and a relay to ensure they both turn on at the exact same time: one 750 W to run the motherboard, one 1000 W just for the GPUs. Vega 56s are hungry cards <3

Beauty shot, just for the heck of it.

This is an old workstation I was trying to get IOMMU passthrough working on, which sadly had its motherboard die right as I got everything configured just right. As you can see, it has the crappy little GPU that came with it. But the thing with IOMMU passthrough is that whatever cards you give to the client VM become unavailable to the host PC (the one running the VM), so at one point it had THREE GPUs inside.

Also, no. I don't know a bloody thing about your "politics", but I do know quite a fair bit about computers. I saw someone posting misinformation at first, and then later what looked like blatant bait (seriously, a USB 3.0 riser card connected to a front USB 2.0 port on a machine that has a slot available? Do the drivers even install lel), and I sought to address it, nothing more. You still owe me a clean shot of that machine with the side panel removed, and benchmarks of your supposed riser setup with the GT 1030. That is, if you wish to be honest; otherwise I couldn't care less either way.


6 minutes ago, Oneiros said:

it runs but not well, for instance with a custom engine or better optimisation it wouldn't be so CPU-heavy. that's the point - that a well designed game isn't going to tax the CPU anywhere near as much as the GPU, and given the amount of time being devoted to KSP2 one would assume that it'll be a lot better optimised. also KSP graphics are extremely outdated which is why it doesn't tax the GPU much.

when discussing a gaming upgrade to a laptop with integrated graphics the GPU would always be the bottleneck and therefore the first thing to upgrade. that people think otherwise just indicates to me that they aren't too experienced in computers or that they've bought into marketing hype - or perhaps in this case that they think KSP is normal for a game and that this trend will continue into the sequel. the switch from a tiny indie team to AAA level production is likely to result in higher-end graphics and decent optimisation.

if an eGPU is possible that's the route i'd go down (and have done) but the alternative would be to run a setup like in the pic i posted with a cheap ultra small form factor dell with an eGPU or a card slotted in (if the stock PSU can handle it). the CPU will only be about the same in performance as the OP's current one on the laptop but at least this allows an upgrade on the graphics bottleneck.

well i'm not sure why you were asking me about a missing HDMI port when it should be obvious that there's no graphics card in there & that the PCB pictured is just the PCIe adapter.

it's just a mini-PCI to PCIe adapter that uses a USB cable to connect them. and i can't post specs bc that machine is retired but as far as the GPU goes i haven't noticed a difference since dropping the GT1030 into a new build.

I wasn't asking if the HDMI port was missing; I was asking whether you plugged the cable into the rear of the desktop or into the graphics card. Laptops that use an eGPU have something roughly based on Optimus to let an external GPU display on the internal screen; setups like this don't. So from my perspective, I was trying to rule out the possibility that you'd plonked the graphics card into the riser but connected the HDMI cable to the rear of the PC, in which case it would have been using the integrated GPU the entire time, and whatever performance improvement you noticed might have been essentially placebo.

I figured you'd had the system at some point, especially if you built it for someone else, and might have remembered details, or better yet been able to pull up a benchmark score or two for comparison. But I guess it's long gone by now, so I can't call foul on that.

Anyway, my entire point wasn't even really about the CPU bottleneck itself. It was mostly that even if we assume you had a spare power brick lying around with the correct specs for powering the riser, and the only cost was the GPU and riser, you're still at about a third of the cost of an entry-level desktop. And while a quick-and-dirty upgrade like that might work right then, it's not going to be viable long-term: either the GPUs you can use become too limited (how many laptop power bricks go beyond 100 W? Not many, and those that do are more expensive), or the machine itself won't support enough RAM, storage, etc. So if you're going to buy components anyway, it just makes far more sense to set yourself up for future growth. You could get a riser with a connector for actual PCIe power, but then you're looking at even worse numbers financially.


1 hour ago, dave1904 said:

Do you know anything about multithreading? I always wanted to know, if you could easily program KSP to have one craft use multiple threads, whether it would actually be faster than a single core, even a maxed-out one. I read somewhere a few years back that it can be slower because you're sharing memory or whatever; I cannot remember the details.

The rigid-body approximation they're using must have the previous calculation's value before doing the next step in the series. The short answer, therefore, is "No, unless they change how vessels work".

The long answer is that you technically could multi-thread it, but you would need a mutex/semaphore to keep the threads from doing the calculations out of order. These essentially act as locks: when released, they let the next thread grab the key and re-lock the current task so no other thread may modify it (in theory; in practice? There be hella bugs). This adds overhead and doesn't allow any calculations beyond the single-threaded solution, so the only thing you've achieved is to take the battering the CPU must endure and spread it over more cores (which were actually doing background calculations and other things). Which means you've lost performance, not gained any. (Rough sketch below.)
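Here's what that naive locked version looks like as a sketch (hypothetical code, not anything from the engine): the lock preserves the serial step order, so several threads burn overhead taking turns while nothing actually runs in parallel.

```csharp
using System;
using System.Threading.Tasks;

class LockedSolver
{
    static readonly object Gate = new object();
    static int nextStep;
    static double state = 1.0;                    // step i needs step i-1

    static void DoStep(int i) => state = Math.Sqrt(state + i);

    static void Main()
    {
        const int steps = 1000;
        Parallel.For(0, Environment.ProcessorCount, _ =>
        {
            while (true)
            {
                lock (Gate)                       // one thread at a time
                {
                    if (nextStep >= steps) return;
                    DoStep(nextStep++);           // strict order preserved
                }
            }
        });
        Console.WriteLine(state);                 // same result as a plain loop
    }
}
```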

There are ways to handle multi-threaded/multi-core physics, mostly involving particles, but KSP2 is already confirmed to be using the same kind of rigid-body system; they are reducing the load, however. There are multiple areas in KSP1 where CPU cycles are wasted needlessly, which clean code will remove. They've also been working on a "physics LOD" to dynamically take large numbers of parts that aren't doing much physically (fuel tanks) and weld them into a single part for the purposes of calculation. Reducing the part count dramatically reduces the load on the CPU, so KSP2 will likely support much larger, much higher part-count ships than KSP1. (A rough sketch of the welding arithmetic is below.)
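The welding arithmetic itself is simple; here's a hypothetical sketch (the devs' actual system isn't public) of collapsing inert parts into one body with a combined mass and a mass-weighted centre of mass:

```csharp
using System;
using System.Numerics;

// Hypothetical "weld": the solver then sees one part instead of dozens.
class WeldedPart
{
    public float Mass;
    public Vector3 CenterOfMass;

    public static WeldedPart Weld((float mass, Vector3 com)[] parts)
    {
        float total = 0;
        Vector3 weighted = Vector3.Zero;
        foreach (var (mass, com) in parts)
        {
            total += mass;
            weighted += com * mass;              // mass-weighted positions
        }
        return new WeldedPart { Mass = total, CenterOfMass = weighted / total };
    }

    static void Main()
    {
        // Two 4 t tank sections one metre apart weld into one 8 t body.
        var welded = Weld(new[] { (4000f, Vector3.Zero), (4000f, new Vector3(0, 1, 0)) });
        Console.WriteLine($"{welded.Mass} kg at {welded.CenterOfMass}");  // 8000 kg at (0, 0.5, 0)
    }
}
```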

But those are all workarounds; they don't change the fact that the limit exists. They just move it further out, so the real question is: "Will KSP2's workarounds meet the part counts the majority of players want?" You're going to have someone, on day one, go into sandbox and find that new limit. But they're not the average player, and they're likely doing it for the sake of "science". If KSP2 can hold a butter-smooth framerate with a fair number of parts, then I'd say they've done their job.


3 hours ago, dave1904 said:

Do you know anything about multithreading? I always wanted to know, if you could easily program KSP to have one craft use multiple threads, whether it would actually be faster than a single core, even a maxed-out one. I read somewhere a few years back that it can be slower because you're sharing memory or whatever; I cannot remember the details.

Splitting threads per craft wouldn't make any sense. There are sim tasks that can be farmed out to multiple cores and those that can't. The easiest thing to spread between cores is collision checks. These share access to the BVH, but it's read-only and you aren't modifying the BVH during the checks, so it's very easy code to write. At that point, it doesn't matter which collision check goes with which craft; you just distribute jobs evenly between cores. The various forces on parts are generated during updates for individual parts, and that can also be spread between cores, at least in theory. That includes things like evaluating all of the control curves to produce engine forces, and doing the aerodynamics sim while flying through atmosphere. Once you have all the forces, you generally want to run the solver on a single thread. While that can be spread out in theory, in practice, checking that you can actually simulate two craft separately, which requires knowing that they aren't interacting in any way, is more expensive than just running everything on a single core. And the solver is going to be a pretty big chunk of the physics sim time. (Rough sketch of the collision case below.)
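A rough illustration of that easy case (all names here are made up): pair checks farmed out with Parallel.For, querying a shared read-only BVH stand-in with no locks, each iteration writing only its own result slot.

```csharp
using System;
using System.Threading.Tasks;

class Bvh
{
    public bool Overlaps(int a, int b) => (a + b) % 2 == 0;   // stub query
}

class CollisionChecks
{
    static bool[] CheckAll((int a, int b)[] pairs, Bvh bvh)
    {
        var hits = new bool[pairs.Length];
        Parallel.For(0, pairs.Length, i =>
        {
            // Read-only traversal of the shared BVH: safe without locks.
            hits[i] = bvh.Overlaps(pairs[i].a, pairs[i].b);
        });
        return hits;
    }

    static void Main()
    {
        var pairs = new[] { (0, 1), (1, 3), (2, 4) };
        Console.WriteLine(string.Join(",", CheckAll(pairs, new Bvh())));
    }
}
```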

That's all largely academic, though, as KSP doesn't run on a bespoke engine and relies on Unity to handle physics and schedule part updates. By default, all of that happens on the main thread, which is why KSP absolutely sucks at using multiple threads. You can improve on that: Unity does support jobs, and things like aerodynamic updates can be farmed out to cores relatively easily, and it's also possible to use C#'s async features to handle all of that. Physics is harder. PhysX is not written to make good use of multithreading, so you're kind of stuck with everything happening on a single thread. With the latest Unity it is possible to use Havok instead, which is far better optimized, but that also requires a completely different architecture for your game. It's very unlikely that we'll be seeing any of that in KSP2.

Finally, a general statement about sharing memory in a multithreaded environment: memory access itself is never the problem; it's always about cache. Generally, multiple threads reading the same memory is not a bad thing. Actually, it's a good thing, as you are less likely to churn the shared cache. Very often you lose performance in highly threaded tasks because individual cores access different memory locations that collide in cache, invalidating each other's lines and causing far more cache misses than need to happen. The way cache is shared between cores on modern systems lets you plan for this a bit to reduce the impact, and higher associativity counts on modern systems help a lot too, so it's not as big a deal as it used to be. The other problem with older hardware is bad cache behavior under atomic operations. These matter when one of the cores might be writing to memory that others are reading. Infamously, the PS4 and XB1 have an architecture that's really bad at cache coherency under atomic exchanges, which basically means that any time you touch memory with atomic operations, you're practically guaranteed a cache miss. If you have some heavily used data on the same cache lines as some control atomics, you are going to have a very, very bad time in multithreaded tasks. Unfortunately, this isn't as common knowledge among console devs as you'd like, so I have seen engines with easily fixable performance problems on consoles due to the atomics. From personal experience, this isn't as bad on the PS4 Pro and XB1X. (A toy demo of the cache-line collision effect is below.)
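A toy demo of that cache-line collision effect in C#: four threads bump counters that either share one 64-byte line or sit 128 bytes apart. Exact timings vary by CPU, but the padded version is typically several times faster.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

class FalseSharingDemo
{
    const int Iters = 50_000_000;

    static void Run(int stride)        // stride 1 = shared line, 16 = padded
    {
        long[] counters = new long[4 * stride];
        var sw = Stopwatch.StartNew();
        Parallel.For(0, 4, t =>
        {
            int slot = t * stride;     // each thread owns one slot: no race
            for (int i = 0; i < Iters; i++) counters[slot]++;
        });
        Console.WriteLine($"stride {stride}: {sw.ElapsedMilliseconds} ms");
    }

    static void Main()
    {
        Run(1);    // four counters packed within one 64-byte cache line
        Run(16);   // 16 longs = 128 bytes apart: no line sharing
    }
}
```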

In short, a more recent architecture helps. Even between two CPUs with comparable performance in pure math computations, the more recent architecture is likely to have better cache coherence and prediction, allowing better performance in games designed to make good use of multiple cores. Most Unity games, however, are going to be bound by either main-thread or rendering-thread performance, so your best bet is to look for the CPU with the best single-thread performance. I don't expect that to be much different with KSP2.

1 hour ago, Incarnation of Chaos said:

The rigid-body approximation they're using must have the previous calculation's value before doing the next step in the series. The short answer, therefore, is "No, unless they change how vessels work".

That's not really the main problem. Nothing stops you from having a front/back-buffer paradigm for the constraints cache. Yes, it doubles the amount of memory you need for that cache, but that's really not a big deal, and you don't even need to take a system cache hit on it if you use streaming instructions. The bigger problem is that the solver iterates over the entire (sparse) constraints matrix, so you can't trivially split the operation between cores. It's not about values from the last iteration, but rather the running values for the current iteration constantly changing.

1 hour ago, Incarnation of Chaos said:

The long answer is that you technically could multi-thread it, but you would need a mutex/semaphore to keep the threads from doing the calculations out of order. These essentially act as locks: when released, they let the next thread grab the key and re-lock the current task so no other thread may modify it (in theory; in practice? There be hella bugs).

Locks aren't going to help you here, unless you simply exclude execution on each thread from all the others, in which case you are doing worse than single-thread performance. You have to solve this with atomic operations, but it's still going to be very ugly.

Edited by K^2

2 hours ago, K^2 said:

Locks aren't going to help you here, unless you simply exclude execution on each thread from all the others, in which case you are doing worse than single-thread performance. You have to solve this with atomic operations, but it's still going to be very ugly.

Which is essentially what I was getting at, though that's apparently the naive approach to the issue.

Would each thread being able to have some form of state (i.e. an integer or decimal value) help? That introduces a whole new set of issues, but it's something I've actually worked with. Or is that what you mean by "atomic"?


1 hour ago, Incarnation of Chaos said:

Would each thread being able to have some form of state (i.e. an integer or decimal value) help? That introduces a whole new set of issues, but it's something I've actually worked with. Or is that what you mean by "atomic"?

There's a concept of thread-local storage. It used to be a pain to implement properly in a platform-independent way, but now there are standard ways of handling it in modern C/C++ and C#, as well as some other modern languages. It gives each thread a simple way of knowing what it's working on.
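In C# that looks like ThreadLocal&lt;T&gt; (or the [ThreadStatic] attribute); a minimal example where each worker thread lazily gets its own scratch buffer:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class ScratchExample
{
    // Each thread gets its own buffer, created on first use.
    static readonly ThreadLocal<double[]> Scratch =
        new ThreadLocal<double[]>(() => new double[1024]);

    static void Main()
    {
        Parallel.For(0, 8, i =>
        {
            double[] buf = Scratch.Value;    // this thread's private copy
            buf[0] = i;                      // no locking needed
            Console.WriteLine($"thread {Thread.CurrentThread.ManagedThreadId} -> buffer {buf.GetHashCode()}");
        });
    }
}
```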

Atomic operations are just those that are performed all at once. If you are doing anything complicated, you'll need locks or some other mutex, but the interesting cases include the increment/decrement, exchange, and compare-and-exchange operations, which have interlocked equivalents natively supported by modern CPUs. Again, that used to have horrible platform-specific support, but it's now handled with std::atomic in C++ and the Interlocked class in C#.
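A minimal example of both building blocks via the Interlocked class: an atomic counter, plus a lock-free running maximum done with a compare-and-exchange loop.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class AtomicsExample
{
    static long counter;
    static long maximum;

    static void Main()
    {
        Parallel.For(0, 1_000_000, i =>
        {
            Interlocked.Increment(ref counter);    // atomic ++

            // Lock-free "update max" via compare-and-exchange:
            long seen;
            do
            {
                seen = Interlocked.Read(ref maximum);
                if (i <= seen) break;              // nothing to update
            } while (Interlocked.CompareExchange(ref maximum, i, seen) != seen);
        });
        Console.WriteLine($"{counter} increments, max index {maximum}");
    }
}
```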

What the physics solver is doing, whether it's implemented directly or via something like Sequential Impulse, is solving a constrained least-squares problem. An example constraint: contact force cannot be negative, because objects don't (usually) stick to surfaces, so the contact force only pushes in one direction. You can also use constraints for traction limiting when simulating a wheeled vehicle. The general strategy for solving these is very similar to solving a system of linear equations by an iterative method. So to get a rough idea of what a multithreaded solution would look like, think of how you'd implement the Gaussian elimination algorithm across multiple cores. Imagine you've already brought the first N rows to triangular form and are now working on the (N+1)st row. You need to subtract from the current row every previous row divided by its pivot element and multiplied by the element in the corresponding column of the (N+1)st row. Say I have several cores available. Rather than do one row at a time, I'll kick off the process on the first core with the first row. Once the second element of the (N+1)st row settles, I'll launch the second core working on the second row. It has all the data it needs, and so long as the decrement operations on the (N+1)st row are atomic, it doesn't matter that two threads are working on it at once. Once the third element settles, that is, once both cores 1 and 2 have processed it, the third core can be launched on the third row, and so on up to the core count. The synchronization can be handled with a single monotonic counter, which itself can be handled with an atomic increment. (A simpler, coarse-grained variant is sketched below.)
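For contrast, here's a much simpler coarse-grained variant rather than the pipelined scheme described above: for each pivot, the row updates below it are mutually independent, so they can be farmed out with Parallel.For and no atomics at all (no partial pivoting, for brevity).

```csharp
using System;
using System.Threading.Tasks;

class ParallelElimination
{
    static void Eliminate(double[][] a)          // a[i] = row i, augmented
    {
        int n = a.Length;
        for (int pivot = 0; pivot < n; pivot++)
        {
            double[] p = a[pivot];               // read-only below
            Parallel.For(pivot + 1, n, row =>
            {
                double f = a[row][pivot] / p[pivot];
                for (int col = pivot; col < a[row].Length; col++)
                    a[row][col] -= f * p[col];   // each thread owns its row
            });
        }
    }

    static void Main()
    {
        // 2x + y = 5, x + 3y = 10  ->  triangular form, then y = 3
        var a = new[] { new[] { 2.0, 1.0, 5.0 }, new[] { 1.0, 3.0, 10.0 } };
        Eliminate(a);
        Console.WriteLine($"y = {a[1][2] / a[1][1]}");
    }
}
```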

This is still nowhere near 100% CPU utilization, there is a lot of overhead, especially when you consider that x86-64 CPUs still don't handle atomic increment and decrement of floating-point values, and we're just talking about a toy version of the problem. There are applications where this kind of optimization is crucial; if you ever want to learn a trick or two about squeezing every last bit of performance out of giant matrices, talk to a lattice QCD theorist. But I don't think a physics solver for games is that use case. I'll take a simpler, more stable algorithm and find something else to occupy all the other cores with, like animation and AI. Collision is the only part of physics that, in my opinion, is worth farming out. Maybe BVH updates, depending on your implementation. But not the core solver.


10 hours ago, K^2 said:

This has always been part of gameplay and always has to be built out custom for a game, just like you don't expect the game engine to handle a tech tree, for example, because every single game is going to handle it differently.

Yes, most engines ship some DLL-implemented physics for common use, which is a flat map in 3D or 2D. But those engines cannot handle anything like what's in KSP. Inside them is a made-up physics that handles a narrow range of values adapted to the scale, the display, and the data structures.

The physics in KSP is also made up, because of maximum value sizes, and it causes some inconsistencies if you look at the resulting constants. It is made to be playable. But since they started, it has become possible to handle data of a proper size, and maybe KSP will work with real constants, with proper sizes for planets and light speed, which would make it a lot easier to build a whole physics engine (just because we already have all these equations and do not have to make them up).

10 hours ago, K^2 said:

The game engine is in charge of UI, rendering, animation, and physics. And in absolutely none of these is KSP any different from any other game on the market. In fact, KSP is less render- and animation-heavy than most games of its caliber.

What you see in the game is Unity. But that is just a display, fed from a dynamically built, flat 3D model of the local structures that Unity can handle. Yet the speeds available in the game allow players to crash this whole structure just by going fast near the ground. In a common engine you would just set a max_speed value and the problem is solved; here that would be an offence, even if the terrain cannot be swapped in that fast.

10 hours ago, K^2 said:

Yes, Unity struggles with physics for larger craft, but Unity has just about the worst physics engine of any commercially available game engine.

Try GameMaker in the older versions, where 3D was available at all. People still built perfectly good engines on top of such tools for handling things like KSP does; only the scale would not be interesting. I guess you know all such game-prototyping tools better than me.

10 hours ago, K^2 said:

It would take an absolute monstrosity of a ship to make even Unreal or CryEngine struggle to keep up the framerate.

In such a solution you load only the local objects into the graphics engine for display, and handle the physics in your own process. That is why KSP quite often doesn't match the display, especially when you fly fast and low. It is why it needs workarounds with joint strength and so on. And quickload is just a disaster. I would not expect anything else.

10 hours ago, K^2 said:

Now, we did ultimately have to cut it down for consoles, because the game also needed to run complex animation graphs for dozens of complex characters, behavior trees and pathing for all the enemy AI, and hundreds of custom scripts for combat, abilities, damage, and loot tables - none of which KSP nor KSP2 will ever have to deal with.

Yes - features. Every one of them works perfectly separately, but when you handle all these data structures at once, it is just a series of postcards. In simple solutions I used clock modulo to defer one part of the logic to subsequent frames, but that would not work at the speeds in KSP.

10 hours ago, K^2 said:

KSP's performance is a combination of poorly optimized code, which I can understand given the size of Squad, and Unity being about the worst possible choice of game engine.

I guess they did not plan to have all these features and modifications. It just grew like every hobby game: we add this, and now we add that, and then fix this, and so on. It is a beautiful process, but the structure is wild and the documentation lives mostly in the devs' heads.

It would be nice for KSP2 to simply be the same game, but with order: the same game again, but structured, so that most of the process is not just handling quick fixes.

10 hours ago, K^2 said:

Which, again, given how KSP started, made sense, and, unfortunately, I also understand how financial pressures prevented Intercept from pivoting away from it.

Would you even think of coding everything again after years, just to make it correct?

10 hours ago, K^2 said:

Most game engines are built for exactly the kind of game that KSP and KSP2 are: large, expansive worlds with customized materials to make them look unique, filled with thousands of rendered objects and hundreds of interacting rigid bodies.

I see these structures differently. Mostly I set a layer of numbers over the displayed objects to see what is going on when another sequence (like a part of the map) is loaded. I ended up with this layer of numbers showing only what orientation, position, texture, lighting, animation, and other values each object is using. The display is something that can be separated into a prepared structure that goes only to the display loop.

10 hours ago, K^2 said:

The requirements are identical, and the amount of physics simulation a lot of commercial games do even on consoles, in between everything else they have to handle, puts KSP to shame.

I expect more of a CAD/CAM-style handling of structures. I may just have the wrong expectations; I am simply used to that kind of solution.

In KSP my window is filled with the display of pinned parts (like wheels), but if I could see the load on struts and other connections in real time, that would be enough for me, and the vessel could be a wireframe.

6 hours ago, Incarnation of Chaos said:

They just move it further out, so the real question is: "Will KSP2's workarounds meet the part counts the majority of players want?"

Maybe parts could be chunked into a bigger single structure at a given load. It could solve some issues.

6 hours ago, Incarnation of Chaos said:

They've also been working on a "physics LOD" to dynamically take large numbers of parts that aren't doing much physically (fuel tanks) and weld them into a single part for the purposes of calculation.

Exactly like that.

5 hours ago, K^2 said:

Generally, multiple threads reading the same memory is not a bad thing.

 

If they operate on the same data structure, it could work.

5 hours ago, K^2 said:

Nothing stops you from having a front/back-buffer paradigm for the constraints cache.

I always constrain the whole data structure. It is easier for me to handle, even if it means processing empty values. But it is a constraint: there is no room left, hence fewer features.

5 hours ago, K^2 said:

It's not about values from the last iteration, but rather the running values for the current iteration constantly changing.

Yes, especially if there is a quick fix that, in case of an exception, has to fall back to the previous values.

56 minutes ago, K^2 said:

I'll take a simpler, more stable algorithm and find something else to occupy all the other cores with, like animation and AI.

Lovely - I see it that way too.

56 minutes ago, K^2 said:

Collision is the only part of physics that, in my opinion, is worth farming out.

Exactly!

It is why I think about separating what the player sees from what is going on. A commercial engine for beautiful, flat 3D is good for the display, stripped of any physics, which can be built separately just for this type of game. You can change the graphics engine later, and the physics would not be affected.

I made such a structure for simple educational purposes, to show how to switch the way results are presented, but I have no idea how to handle it in a complex application, because it could get heavy: one loop that prepares the physics data structure for display, one loop that displays, and then there is the issue of responsiveness, of handling what the player had in mind when they pressed the mouse button, which common engines solve easily but which such a solution could make an issue, and the fix could be heavy.

Quite often I get stuck (with my solutions) in a situation where I have to pause the physics just to select something on screen in a reasonable manner.

Edited by vv3k70r
