1.1 Performance: AMAZING, BEAUTIFUL


justspace103

Just now, Temeter said:

Which isn't relevant to KSP's performance, if it's the same case as with my i5. :P

Even Unity 4 KSP does multithreading and manages to drive 4 cores to 60%+ at times. Won't ever get that dumb boost. It's more or less a marketing gag.

You can set the CPU to run at that clock-speed as default so yeah it will have an effect! :sticktongue:


4 minutes ago, Majorjim said:

You can set the CPU to run at that clock-speed as default so yeah it will have an effect! :sticktongue:

Doesn't work for me. :(

Doubt his laptop is overclocked tho. They tend to melt under normal circumstances already after ~2 years.

Edited by Temeter

Just now, Temeter said:

Doesn't work for me. :(

Doubt his laptop is overclocked tho. They tend to melt under normal circumstances already after ~2 years.

Swizz.. I set my i7 4790K to just run at 4.40 GHz, screw that down-clocking crap!


5 minutes ago, Majorjim said:

Swizz.. I set my i7 4790K to just run at 4.40 GHz, screw that down-clocking crap!

Ugh. Problem is, I got a 4570, not a K version. Means no unlocked multiplier for overclocking. >_>

Think my next CPU is going to be AMD. Intel's nonsense really starts bothering me at times...

Edited by Temeter

19 hours ago, justspace103 said:

After my initial reaction of amazed, disbelieving profanity, I started to wonder how KSP 1.1 will work on my Mac. Will Mac users get the same performance experience? Also, how will the graphics change, and will 64-bit be available for the Mac?

The good news: on a Mac, “it just works™” :)


1 hour ago, ainurakne said:

Hyperthreading means that the register sets of processor cores are duplicated. This means that two processes/threads can simultaneously sit on a single core, sharing its common execution resources. When the execution of one process/thread stalls (which happens quite often), the other one can jump in without the need for an expensive context switch. So, hyperthreading actually means half as much rescheduling as without hyperthreading.

Well put. It's worth noting here that you're using a more technical (and more precise) definition of the term "rescheduling" than I did.

I can certainly see how the vagaries of vernacular usage could ruffle the feathers of those familiar with the jargon, and I probably should have chosen a better term, but I didn't think of it at the time. So thank you for that clarification.

So, how to best communicate the consequences of that to people whose entire understanding of hyperthreading may be that the salesguy told them "it makes 4 cores behave like 8 cores" -- a statement that may-or-may-not be true depending on the nature of the load? I suppose I'm still at a loss on that front.

Game threads generally need to stay synchronized with each other and with the user.  If you're waiting for one to stall out before running another, then your entire game is stalled and you've not gained much by having the additional thread.

Edited by Vim Razz

1 hour ago, Temeter said:

^Yep, I think it's important to say games usually don't stall, because they are realtime, and therefore don't benefit very much from HT. Even more so because the most demanding game workloads, like KSP's ship physics, won't run on more than a single thread in the first place, since you can't just delay them.

But they do stall. Cache misses, branch mispredictions, data dependencies - these all cause stalls, unless you are talking about a branchless program that fits entirely in a level 1 cache (including all the data it processes) and doesn't have any consecutive instructions that depend on each other's results.

Also, thread synchronization. Unless you can get the job done with only spinlocks and atomics, your thread is bound to block at some point and get pulled out from the processor to give room for the other threads that are eagerly waiting for their turn. So, the more of those other threads are sitting on some cores, the better for your thread.

 

But... managing the sharing of execution resources on a core has its cost. So, if your game has lots of busy rarely-stalling and rarely-blocking threads that, because of the lack of free resources, are put at the same time on the same core by hyperthreading, then this can be bad. In this case, hyperthreading can reduce performance.

 

31 minutes ago, Vim Razz said:

So, how to best communicate the consequences of that to people whose entire understanding of hyperthreading may be that the salesguy told them "it makes 4 cores behave like 8 cores" -- a statement that may-or-may-not be true depending on the nature of the load? I suppose I'm still at a loss on that front.

Game threads generally need to stay synchronized with each other and with the user.  If you're waiting for one to stall out before running another, then your entire game is stalled and you've not gained much by having the additional thread.

Maybe the people who don't understand hyperthreading and what it's actually good for mostly don't need it either. :P

I'm not really sure myself either about how it actually works, but I hope they don't just wait for the other one to get stalled and instead their execution is more or less interleaved without the cost of context switches (and rescheduling). Getting to do your job while the other one is stalled is just a bonus. Also, as I have understood, stalls are actually quite frequent.
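
To make the stall argument a bit more concrete, here is a toy model, not a description of how any real Intel core schedules work. Everything in it is an assumption made for illustration: a thread is just a string of ops, `C` means one cycle on a single shared execution unit, `S` means one cycle stalled (say, waiting on a cache miss), and the names `run_sequential` and `run_smt` are made up. It also ignores the cost of sharing a core, so it can only show "no gain" for busy threads, never an actual slowdown.

```python
def run_sequential(threads):
    """Cycles needed when the core runs one thread to completion, then the next."""
    return sum(len(t) for t in threads)


def run_smt(threads):
    """Cycles needed when all threads are resident on the core at once.

    Each cycle, every thread whose next op is a stall ('S') lets one cycle of
    waiting elapse. At most one thread whose next op is compute ('C') gets the
    single execution unit; the others have to wait for a later cycle.
    """
    pos = [0] * len(threads)   # index of the next op for each thread
    n = len(threads)
    first = 0                  # rotating priority for the execution unit
    cycles = 0
    while any(p < len(t) for p, t in zip(pos, threads)):
        unit_busy = False
        for k in range(n):
            i = (first + k) % n
            if pos[i] >= len(threads[i]):
                continue       # this thread has already finished
            if threads[i][pos[i]] == "S":
                pos[i] += 1    # stall cycles overlap freely
            elif not unit_busy:
                unit_busy = True
                pos[i] += 1    # this thread wins the execution unit
        cycles += 1
        first = (first + 1) % n
    return cycles


stally = ["CSSS" * 4, "CSSS" * 4]   # threads that stall a lot
busy = ["C" * 16, "C" * 16]         # threads that never stall

print("stall-heavy:", run_sequential(stally), "vs", run_smt(stally))
print("always busy:", run_sequential(busy), "vs", run_smt(busy))
```

In this sketch the stall-heavy pair finishes in roughly half the cycles when both threads share the core, while the always-busy pair gains nothing, which is the same split described above between workloads that stall often and workloads that rarely do.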


1 hour ago, Temeter said:

Ugh. Problem is, I got a 4570, not a K version. Means no unlocked multiplier for overclocking. >_>

Think my next CPU is going to be AMD. Intel's nonsense really starts bothering me at times...

My non-K 3820 can easily be OCed to a 43x multiplier. Did Intel disable all OCing on non-K Haswells?


6 minutes ago, ainurakne said:

I'm not really sure myself either about how it actually works, but I hope they don't just wait for the other one to get stalled and instead their execution is more or less interleaved without the cost of context switches (and rescheduling). Getting to do your job while the other one is stalled is just a bonus. Also, as I have understood, stalls are actually quite frequent.

Hyperthreaded execution is indeed generally interleaved.  It doesn't have to wait for a stall to happen (although you're right in that they do happen fairly frequently), but it also means that two threads running on the same core via hyperthreading are always in contention for the shared assets of that core, and thus will be slower than the classic SMP example of running on two separate cores.

(note that 'hyperthreading' is an Intel trademark and that the generic term is 'SMT', or simultaneous multithreading)

Most of the stuff involving 'multi' is really just the chip makers scrambling desperately to justify new product sales, as single-thread performance-per-cycle is largely the same since the Pentium Pro era (like, 1996), and they haven't been able to scale up clock rate since the late P4* era (err... 2003? 2004?).   There are a lot of gotchas and caveats with SMP/multi-core/SMT programming, and no magical silver bullet to solve 'em.  I've posted extensively on these drawbacks before, and someone else made an excellent post about the problems with optimizing scheduling.

It's gotten so bad recently that there's no real upgrade path from my i7-3820, and I'm looking very seriously at simply changing to watercooling and increasing my overclock as the next upgrade instead of any new core components. The only real improvement that a 6xxx series CPU would offer me is DDR4, and my quad-channel DDR3 can easily give dual-channel DDR4 a run for its money. The 6600K I assembled for the lil lady is actually slower than my system at stock speeds (ignoring my current mild 4200 MHz overclock), with only a very small advantage in the DDR4 memory system.

I do kinda feel a bit sorry for 'em CPU makers though.  They DO have to earn money to stay in business, and it's not like they could assess people a monthly "because we're cool" fee or some crap (unlike the new CaaS idiots -- Adobe, Microsoft, etc).

* - of course a 3000 MHz P4 is more like a 1500 MHz PPro-equivalent (like an Athlon XP or Pentium M or Core/Core2 etc), but the Core series chips quickly regained the 3000 MHz+ clock rate of the final-stage P4s.


1 hour ago, benjee10 said:

Well the 64bit hack on Mac is currently more stable than the 32bit version, so hopefully it'll be even better with Unity 5 behind it as well.

It's more stable on Windows as well - on my machine, at least. I've been using the 64bit workaround for the last 6 months or so and it hasn't crashed once. 32bit used to crash quite frequently when running a lot of mods, even if I wasn't hitting the 3.4GB memory limit.

Also, yay, FAR and KJR will finally be updated to 64bit which means I'll be able to run Realism Overhaul with all the optional goodies installed! So hyped :D

 

Ermahgerd 400 posts!

Edited by CaptainKorhonen

42 minutes ago, ainurakne said:

But they do stall. Cache misses, branch mispredictions, data dependencies - these all cause stalls, unless you are talking about a branchless program that fits entirely in a level 1 cache (including all the data it processes) and doesn't have any consecutive instructions that depend on each other's results.

Also, thread synchronization. Unless you can get the job done with only spinlocks and atomics, your thread is bound to block at some point and get pulled out from the processor to give room for the other threads that are eagerly waiting for their turn. So, the more of those other threads are sitting on some cores, the better for your thread.

 

But... managing the sharing of execution resources on a core has its cost. So, if your game has lots of busy rarely-stalling and rarely-blocking threads that, because of the lack of free resources, are put at the same time on the same core by hyperthreading, then this can be bad. In this case, hyperthreading can reduce performance.

That's not exactly what I meant. The really complicated threads of games usually cannot be split up, because they need to stay synchronized: e.g. the pathfinding in Starcraft II, or a single ship in KSP 1.1, which is like 90% of the needed processing power of each game.

And even if they could split them up, you wouldn't want to use hyperthreading, but just shift them to other cores.


1 hour ago, ainurakne said:

Maybe the people who don't understand hyperthreading and what it's actually good for mostly don't need it either. :P

I don't know that that's the case.  The i7 machine I use at work crunches drawings a lot faster than my gaming/toy i5 rig at home.  The time saved over the course of a year more than pays for the extra cost of the chip, and you don't need to understand anything about how or why it works to benefit from it in that regard.

At home, though, it would be a lot of money to spend on a very marginal improvement for the stuff I use this computer for.

Edited by Vim Razz

A CPU core back in the old days, among other things, comprised an FPU (for floating point math) and an ALU (for integer math). At some point, Intel started putting two ALUs in each core, and noticed that one spent a lot of time doing nothing, because everything was waiting on the FPU. So, they invented hyperthreading, which split the single physical core into two virtual cores, so that programs that mostly needed the ALU could double their performance (provided they use enough threads to keep all the virtual cores busy).

Games generally don't benefit, because games generally do mostly floating point math.


On 3/26/2016 at 4:14 AM, JohnnyPanzer said:

Wow. For me, what really stole the show was the UI. So much better. Crisp, clean, readable. The new right click menus looked awesome, and I very much approve of the lines drawn to the part the menu is attached to. The new navball was stunning, and I liked the way the heading buttons curved along the navball.

 

 

I can't understand why I would want the background of the right-click menu transparent like that.

When I open the menu, I want to see the information, not be distracted by slightly blurred background stuff happening behind the menu.  It seems like an odd thing to have done.   I love how you can move and pin them, for sure.


While being seriously amazed by the performance improvements (yay! :)), I must say the basic idea stays the same: there is still a limit. I would bet something that it's still a CPU limit, too. And more importantly... is anybody else getting the vibe that it's not a linear thing? When EJ posted the second ship on the squadcast, the lag increased dramatically even though the part count on the pad was only about 150%. It really looked like a freeze, and the staging events were similar.

So while I am overjoyed that collections of 300+ parts on screen (like the limits I run into now when I dock big stacks of stuff for transfers) are going to run smooth as a kitten on my i5*, I still say that 600+ parts on screen are going to turn what's supposed to be an enjoyable experience into a bit of a chore. Boy am I glad I learned to build ships in <100 parts...

 

*yeah, i5... but 3.10GHz without overclocking, old beasts take their time to die!

 

Rune. Don't get me wrong, my previous 'enjoyable' limit was around 150 parts.


18 minutes ago, Rune said:

And more importantly... is anybody else getting the vibe that it's not a linear thing? When EJ posted the second ship on the squadcast, the lag increased dramatically even though the part count on the pad was only about 150%. It really looked like a freeze, and the staging events were similar.

So while I am overjoyed that collections of 300+ parts on screen (like the limits I run into now when I dock big stacks of stuff for transfers) are going to run smooth as a kitten on my i5*, I still say that 600+ parts on screen are going to turn what's supposed to be an enjoyable experience into a bit of a chore. Boy am I glad I learned to build ships in <100 parts...

*shrug* There is always going to be an upper limit. Someone is always going to hit it. The problem at the moment is that you can hit that limit during normal gameplay. So any improvement is welcomed (as you said).

It makes sense that it wouldn't be linear.  I'm not 100% certain on this, but I think that every part interacts with every other part (which is basically the issue with the physics calculations).

So assuming that's the case, and assuming each part does a single calculation per other part, a 50 part ship has to do 2450 calculations per physics step (50*49). A 100 part ship has to do 9900 (100*99) calculations per step. That's a 100% part increase, but just over four times the calculations.

Again these are numbers I've plucked out of thin air, so don't read too much into it, but I'm just trying to illustrate how it doesn't scale well.
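
A quick sketch of that arithmetic, under the same guess that every part does one calculation against every other part per physics step. That is an assumption for illustration only, not a statement about how KSP or Unity's physics actually works, and `pairwise_calcs` is a made-up name.

```python
def pairwise_calcs(parts: int) -> int:
    """Calculations per physics step if each part checks every other part once."""
    return parts * (parts - 1)


baseline = pairwise_calcs(50)
for n in (50, 100, 200, 400):
    calcs = pairwise_calcs(n)
    # Ratio relative to the 50 part ship shows how fast the work grows.
    print(f"{n:>4} parts: {calcs:>7} calcs per step, {calcs / baseline:.1f}x the 50 part ship")
```

Under that assumption, doubling the part count roughly quadruples the work each step, which is why a ship with 150% of the parts can feel far more than 50% slower.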

Edited by severedsolo

26 minutes ago, Rune said:

While being seriously amazed by the performance improvements (yay! :)), I must say the basic idea stays the same: there is still a limit. I would bet something that it's still a CPU limit, too. And more importantly... is anybody else getting the vibe that it's not a linear thing? When EJ posted the second ship on the squadcast, the lag increased dramatically even though the part count on the pad was only about 150%. It really looked like a freeze, and the staging events were similar.

There was some talk about that on Rick's stream; it might have just been an issue with importing craft from 1.0.5. He had some decouplers multiplying themselves, and therefore causing lots of lag on staging.

Edited by Temeter

8 minutes ago, Rune said:

there is still a limit.

Of course there will always be a limit Rune.

 It is about moving the goal posts. And yup, KSP is and always will be CPU limited. These great performance improvements we are seeing are sorely needed and will make more complex, detailed craft even more of a joy to fly!

 :kiss:

8 minutes ago, Rune said:

the lag increased dramatically even though the part count on the pad was only about 150%. It really looked like a freeze, and the staging events were similar.

Those freezes are more than likely due to the fact we saw a buggy pre-release version. I do not see those pauses now so I highly doubt they will be in the 1.1 release and if they are it will be fixed very quickly. Excuses for low part count craft are getting harder and harder. :sticktongue:  Just teasing Love you Rune!

Edited by Majorjim

Imo the performance issue of 1.0.5 isn't just that there is a limit, but how the performance just kinda starts to completely break down above 250 parts, while it works fine below that limit. Really weird, feels like a real issue with physics or lacking optimization.

So if the limit is a bit softer now, and 200 or 300 parts higher, then we'll already be a lot better off. It's not just about theoretical limits: the current limit inhibits reasonable craft, not to mention Real Solar System stuff.


1 hour ago, severedsolo said:

*shrug* There is always going to be an upper limit. Someone is always going to hit it. The problem at the moment is that you can hit that limit during normal gameplay. So any improvement is welcomed (as you said).

It makes sense that it wouldn't be linear.  I'm not 100% certain on this, but I think that every part interacts with every other part (which is basically the issue with the physics calculations).

So assuming that's the case, and assuming each part does a single calculation per other part, a 50 part ship has to do 2450 calculations per physics step (50*49). A 100 part ship has to do 9900 (100*99) calculations per step. That's a 100% part increase, but just over four times the calculations.

Again these are numbers I've plucked out of thin air, so don't read too much into it, but I'm just trying to illustrate how it doesn't scale well.

Sure, sure. And the power-law nature of the increase is also very logical, now that you mention it. Thing is, we don't know the exponent. It would be nice if a dev/person-that-knows-more-about-KSP than I do shed some light there.

Edit: for example, what's up with generators? Apparently they are performance hogs now because they make lots of calls each cycle. Has something been done to fix that so we can spam solar panels like there is no tomorrow?
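
For anyone who wants to guess at that exponent from their own measurements, here is one rough way to do it: if cost grows like parts^k, then two (part count, cost) samples give k = log(cost2/cost1) / log(parts2/parts1). The function name `estimate_exponent` is made up, and the numbers fed in below are just the hypothetical "one calculation per pair of parts" counts from the quoted post, not real KSP timings.

```python
import math


def estimate_exponent(parts1, cost1, parts2, cost2):
    """Estimate k in cost ~ parts**k from two (parts, cost) samples."""
    return math.log(cost2 / cost1) / math.log(parts2 / parts1)


# Hypothetical costs from the n * (n - 1) guess, not measured frame times.
k = estimate_exponent(50, 50 * 49, 100, 100 * 99)
print(f"estimated exponent: {k:.2f}")
```

With real frame times in place of the made-up counts, a result close to 1 would suggest roughly linear scaling and a result close to 2 would suggest the every-part-against-every-part picture. Two samples make for a very noisy estimate, so more data points would be better.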

1 hour ago, Majorjim said:

Of course there will always be a limit Rune.

 It is about moving the goal posts. And yup, KSP is and always will be CPU limited. These great performance improvements we are seeing are sorely needed and will make more complex, detailed craft even more of a joy to fly!

 :kiss:

Those freezes are more than likely due to the fact we saw a buggy pre-release version. I do not see those pauses now so I highly doubt they will be in the 1.1 release and if they are it will be fixed very quickly. Excuses for low part count craft are getting harder and harder. :sticktongue:  Just teasing Love you Rune!

Yeah, I get what you are saying and don't actually disagree. Thing is, you know I'd rather have six 50 parts ships docked together than three 100 part ones, or two with 150. Which actually means that the hyperthreading is going to help me more than the people that build big things like, say, @Kuzzter's Interpid! (though that one has a lot of auxiliary spacecraft, so perhaps not the best example) :D

In any case, it is very good that the performance hits of thermal have been done away with, the RAM ceiling is back where you can run lots of mods under it, and we are actually a bit better off than when we started, even after adding all the 0.23-1.0.5 content (thermal physics in particular hit the performance hard, and the addition of the post-Asteroid Day parts put the 4 GB limit scarily close). I think 1.1 is finally going to be a rounded-out game: awesome content, solid engine, room to grow. All that I hoped for from 1.0, but let's not get into that, shall we? I'm just happy we got (almost) here!

 

Rune. All hail Unity V, and salute those who sacrificed endless hours coding the port.

Edited by Rune

This topic is now closed to further replies.