Jump to content

[1.1.2] GrumpyCollector 1.0.1 - Stutter ? What Stutter


sarbian

Recommended Posts

7 hours ago, Padishar said:

That isn't how this mod works. This mod only increases the calling of the garbage collector, it forces a collection at the specified interval but, if lots of garbage is being created then the collections will still be run by Mono automatically. 

So, my theory is wrong, but it still helped somehow.

7 hours ago, Padishar said:

While it would be very nice if this did directly affect the random crashes, the nature of random crashes (especially ones that appear to be multi-threading related) is such that any change in internal program behaviour can have a significant effect.  Two people with subjective reports are certainly not statistically relevant so please don't do anything that will flood this thread with people looking for a fix for the crashes.

Too late buddy. I already posted it on reddit as "The solution for all random VAB crashes". Prepare to be boarded!

 

Link to comment
Share on other sites

On 14/05/2016 at 6:06 PM, lodestar said:

So, my theory is wrong, but it still helped somehow.

Too late buddy. I already posted it on reddit as "The solution for all random VAB crashes". Prepare to be boarded!

 

Congratulation. You killed my will to help with that issue.

Link to comment
Share on other sites

1 minute ago, sarbian said:

Congratulation. You killed my will to help with that issue.

:(

sarbian please dont let one person put you off with the amazeing and considerate work you are doing.

I suffer a lot with stutter and realy looked forward to what you was doing.

If you are totaly not going to work on this due to post above than thats also totaly cool as I do not expect people to do stuff :).

any ways thanks so much for looking at this and if you decide to carry on at some point I shall be here in a flash :)

 

Thanks

Link to comment
Share on other sites

29 minutes ago, lodestar said:

That was sarcasm. I'm not that stupid.

I had assumed so, especially after the reddit horde failed to materialise, otherwise I would probably have got myself banned for tearing you a new one... :wink:

Sarcasm doesn't come across very well in text which is why it is usual practice to indicate it somehow (</sarcasm> or a :wink:) to prevent misunderstandings such as this.  After all, if you had done as you said, your message wouldn't have been very different, perhaps a "Sorry!" rather than the "Prepare to be boarded!", perhaps not...

Link to comment
Share on other sites

Hey @sarbian thx for making the tool.

Here's my report, hope it helps.

Kerbal Space Program - 1.1.2.1260 (WindowsPlayer)

OS: Windows 7 Service Pack 1 (6.1.7601) 64bit
CPU: Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz (4)
RAM: 16332
GPU: AMD Radeon HD 7900 Series (3051MB)
SM: 30 (Direct3D 9.0c [aticfx64.dll 8.17.10.1452])
RT Formats: ARGB32, Depth, ARGBHalf, Shadowmap, RGB565, ARGB4444, ARGB1555, Default, ARGB2101010, DefaultHDR, ARGBFloat, RGFloat, RGHalf, RFloat, RHalf, R8

Modded installation, total memory usage according to GCMonitor is about 4.4GB.

Little extra info, I've had small stutters in the game in both 1.05 (32 bit) and post 1.1.  This has happened to me both with my current system and my prior cpu (i5-2500K).  It also happened with 1.05 on Linux (64 bit).  My stutters have been very regular in their spacing - can't say for sure that they were the exact same number of seconds apart in the prior version because I didn't test, but it felt similar.

 

I used fraps to record my frametimes; I wanted to see how far apart the stutters were and the length of the stutters. 

 

Scenes were from a high orbit above Kerbin, a craft consisting of approximately 15-20 parts.  4.4GB memory usage.  Average fps 107.  Typical frametime (not including stutter frames) of ~9ms.  Typical stutter frametime 170ms-195ms.  The length of the stuttered frame was positively correlated with the time from the last stuttered frame.  If the prior stuttered frame was 500-700ms ago the stuttered frametime was 170-175ms.  2000-3000ms prior and the frametime was 180-185.  4000ms+ and the typical stutter frametime was 190-195ms.  So, it was increasing (which is to be expected?) but not proportionally.

 

For all the data below, the right column is the time in ms that a stutter occurred and the right column is the time in ms between the stutter frames. 

I've provided about the first 20 seconds for each run, further data is consistent with the first 20 seconds.

 

Normal modded installation, no GrumpyCollector. 
 

Spoiler

Stutter time in ms:

3557.114    
7795.5          4238.386
12034.591    4239.091
16264.669    4230.078
20528.415    4263.746

 

 


Grumpy Collector Interval = 10

Spoiler

Stutter time in ms:
1752.559    
4926.174    3173.615
5418.597    492.423
8777.099    3358.502
11938.437    3161.338
15124.071    3185.634
15581.325    457.254
18939.735    3358.41

 


Grumpy Collector Interval = 5

Spoiler

Stutter time in ms:
1329.781    
4572.391    3242.61
6497.341    1924.95
9754.242    3256.901
11673.594    1919.352
14935.377    3261.783
16843.14    1907.763
20095.964    3252.824

 


Grumpy Collector Interval = 0.5

Spoiler

 

Stutter time in ms:
664.49    
1327.145    662.655
1992.934    665.789
2654.568    661.634
3326.39        671.822
3987.153    660.763
4653.425    666.272
5315.645    662.22
5980.996    665.351
6642.151    661.155
7308.052    665.901
7974.855    666.803
8640.056    665.201
9303.327    663.271
9974.789    671.462
10641.92    667.131
11306.866    664.946
11974.366    667.5
12636.728    662.362
13301.143    664.415
13963.052    661.909
14629.925    666.873
15292.724    662.799
15959.598    666.874
16628.166    668.568
17289.969    661.803
17955.24    665.271
18614.694    659.454
19281.206    666.512
19944.262    663.056
20610.526    666.264

 

 

 

 

As you can see in each case the pattern is at least very consistent...  I don't know if that helps or not.  :) 

If I can do something more to help let me know.  Thank you again for trying to solve this matter.  Even if all that could be done was reducing this occurrence to 1-2 per minute (instead of 10-15) that would be great.

 

-----------------------------******************************----------------------------------

 

EDITED FOLLOWUP:

I tested a nearly stock game (just gcmonitor installed) and in about an 80 second run I only saw one spike of about 110 ms.   3.15GB of ram being used. 

Then I tested again with Grumpycollector with different settings and there were the expected frametime stutters, I can post all the data if needed but it was similar to before but with smaller spikes 75ms typical, although some 100ms+.

 

The following is just my opinion, but for what its worth... :) My subjective opinion, the *feel* of the game if you will, is slightly worse with Grumpycollector just because the garbage collections are more frequent, and with some settings creating higher variation in their temporal spacing.  The collection is certainly more frequent with Grumpycollector, but unfortunately the frametime decrease, although correlated, is not proportionate to the increased frequency of the collection. (Increasing the frequency by a factor of 6 only results in a ~10% decrease in stutter framtimes)  I personally can't percieve the difference between a 170ms and a 190ms frametime, especially when the frametimes in between are 8-11ms.  So, given that subjective opinion, the best option for me is the least frequent collection.  From my understanding of the GC involved I don't think its possible to for @sarbian to decrease the frequency from what the game wants to do... if I'm wrong please tell me, I'd be beyond happy to find out I'm wrong on that one.  :D

 

I'm going to go back through my mods and see if I can find 1 or 2 that are exacerbating the problem the most.  Is it fair to say that purely part mods (ones that use no .dll other than MM, either included with or required by the mod) are not likely to cause increased GC?  Or do I have that totally wrong?  My understanding of GC is okay, but its certainly not advanced.

Edited by Tig
Added Stock testing, added subjective thoughts for what they're worth
Link to comment
Share on other sites

PS.  In thanks for your help, I humbly submit a Grumpycollector mascot/icon made possible by my leet skills with google search & MS Paint:

VCBePRe.jpg

Edited: I know that later versions of mono use better GC, but not the version used by Unity.  IMO, Mono's original GC should be taught in CS classes under the heading "How-not-to-do-GC" :) 

Edited by Tig
Link to comment
Share on other sites

7 minutes ago, Tig said:

Here's my report, hope it helps.

Very nice numbers, thanks, pretty much exactly as expected.  I assume that none of the runs with Grumpy collector going were subjectively "significantly better" in terms of stutter...

11 minutes ago, Tig said:

If I can do something more to help let me know.  Thank you again for trying to solve this matter.  Even if all that could be done was reducing this occurrence to 1-2 per minute (instead of 10-15) that would be great.

Reducing the frequency can only be done by KSP and mod code creating less garbage (or possibly by some seriously involved hacking at the Unity player to change its mono heap allocation strategy).  Squad (and some modders) have made some decent inroads into reducing garbage but there is still quite a lot of "dodgy" code that creates way too much.  Squad have their eye on a few areas of KSP code that are particularly bad (resource handling, the editor part list UI and background vessel processing) so they will, hopefully, be fixed fairly soon but this problem is definitely not going to be eliminated overnight.

3 minutes ago, Tig said:

PS.  In thanks for your help, I humbly submit a Grumpycollector mascot/icon made possible by my leet skills with google search & MS Paint:

Hehe, he is grumpy... :wink:

Actually, it isn't really Mono's fault.  If you really want to blame someone then, in my opinion, you should be mainly looking at the rather iffy code in KSP that creates so much garbage, with a sideways glance at Unity for using Mono as their "scripting engine".  Techniques for minimising garbage creation in Unity games are widely published, both by Unity themselves and by third parties.  Yes, they are a bit of a pain to use, they generally make the code harder to read and maintain and it is quite easy to make mistakes but, there really is no other option to minimise GC stutters.

Link to comment
Share on other sites

So, I was reading up some more on the boehm collector.  See http://www.cs.princeton.edu/~appel/modern/c/software/boehm/gc/gc.h

I found this:
extern GC_word GC_free_space_divisor; 
			/* We try to make sure that we allocate at */ 
			/* least N/GC_free_space_divisor bytes between */ 	
			/* collections, where N is the heap size plus */ 
			/* a rough estimate of the root set size. */ 
			/* Initially, GC_free_space_divisor = 4. */ 
			/* Increasing its value will use less space */ 
			/* but more collection time. Decreasing it */ 
			/* will appreciably decrease collection time */ 
			/* at the expense of space. */ 
			/* GC_free_space_divisor = 1 will effectively */ 
			/* disable collections. */

If I'm reading this correctly, then it means that if you set GC_free_space_divisor = 4, as listed above, then the game will allocate 25% more memory on the heap every time it runs out of memory - I am assuming after in runs the GC to see if actually needs to allocate more memory after it dumps all the garbage.

So, step by step (please tell me if I have any step wrong - as I fully admit I have a decent but not expert level understanding of GC in general):

  1. Game reaches the limit of its allocated memory and needs more
  2. Game runs the GC
  3. GC runs and deletes any orphoned objects
  4. If after said deletions there is suffieicnt memory for the need in step 1 then the new objects are created and the process ends.
  5. If after the deletions there is insufficient memory then the game allocates more memory and in addition to that N/ GC_free_space_divisor.  
    1. For example if GC_free_space_divisor = 4, then 25% more memory than "necessary" gets allocated, if GC_free_space_divisor = 3 then 33% more memory gets allocated.

 

So, if I'm reading all that correct we have a tradeoff between allocating additional (and potentially wasteful) memory vs less memory BUT with the addition of more frequent GC.

I wouldn't be surprised to find that in Kerbal GC_free_space_divisor got turned up to something much higher than 4 given that the biggest cause of modded crashes was exceeding the memory limit.  But now that we have x64 perhaps a more "waseful" setting could really help us with the frequency of garbage collection.  I personally wouldn't care if I was wasting a GB or two if it got rid of or greatly reduced the frequecy of the garbage collections - or better yet expose the variable to us so we can choose.

 

Questions to those with more knowledge than me:

  1. Is my analysis correct?
  2. Is GC_free_space_divisor exposed to modders?  (I assume not - but I can hope right? :) )
  3. Would Squad be open to exposing this variable to us in some way?  I know they might be reticent only because many users could "break" their game by making large changes to the default setting without understanding the tradeoffs.

Thanks for the help.

 

------

 

Edit:

Found this right after I made my post, explains the situation pretty well with a real life example:

http://richg42.blogspot.com/2015/06/a-simple-way-to-reduce-memory-pressure.html

He describes a situation about the memory savings by changing GC_free_space_divisor to 16 and that their game was still playable with the increased GC, but their game was on a mobile platform and using 60-70MB, not the  4 gigs+ most of us are using with Kerbal. :)

 

Edited by Tig
Link to comment
Share on other sites

2 hours ago, Tig said:

Questions to those with more knowledge than me:

  1. Yes, you're basically correct, when Mono needs to increase the size of the heap it (currently) allocates so approx 1/4 of the total heap size is free (I believe Unity changes this value to 3).
  2. No, it isn't exposed to modding (or even to the KSP code written in C#).  This is what the blog article you linked to does for iOS or other OSes that use libgc.
  3. Mono doesn't expose it to the CIL code running inside Mono.  It would be fairly easy for Squad (or anyone else) to change this in the same way as the blog article but the Windows player doesn't use a shared libgc (I presume it's statically linked into the player executable or mono.dll).  If the linux and OSX versions use libgc then it should be easy for those but the Windows version may need a considerably more complex change.

However, the best this can do is increase the heap headroom by a factor of 2 or 3 which will only result in the frequency being reduced by this same factor.  E.g. if the heap has 3 times the headroom then there will be 3 times as long between runs of the GC.  This isn't going to be enough of an improvement to really be worthwhile by itself though it would make the improvement due to a general cleanup of the garbage producing code more significant.

Link to comment
Share on other sites

9 hours ago, Padishar said:
  1. Yes, you're basically correct, when Mono needs to increase the size of the heap it (currently) allocates so approx 1/4 of the total heap size is free (I believe Unity changes this value to 3).

 

Okay, going to link to this from your addon post for MemGraph http://forum.kerbalspaceprogram.com/index.php?/topic/139128-112-memgraph-1005/ so I don't double post you...

What if we went at this another way - okay its a total hack, I admit - but maybe it'll work?  :)

So I was watching my allocated and reserved memory in the debug toolbar along with MemGraph with both a modded and unmodded install.  I noticed that the GC tends to kick in when the allocated memory grows beyond a certain point, but it appears to be based upon some proportion of the reserved memory - and to my understanding that is how its supposed to work, please correct me if that's wrong.

Forgive any ignorance, I only have practical experience with unmanaged code - I conceptionally understand managed code but I've never used it in practice.  In other words, I know just enough to be dangerous, lol.

That said...

 

I noticed the GC ran less frequently with unmodded but was reacting to much smaller changes in allocated memory.  Modded installations required larger changes in allocated memory before the GC kicked in (again I assume because of the relative proportion of the higher reserved memory)

Now my game ran the GC more often on the modded install because the allocated memory was growing considerably faster per second than the unmodded install - no shock there.

So, and here's my key thought, what if we artificially increased the allocated/reserved memory?  If we double the size of the heap with some static information that doesn't ever update, then in theory would the GC run half as often due to the proportion doubling while the allocated/sec remained constant?

 

I played with MemGraph a bit.  I changed the cfg to a rather extreme value from 984 added 20 times to 1 GB.  The GC kicked in immediately.  From my KSP log:
 

[LOG 01:46:18.275] MemGraph Test(1, 1000000000)
Initial memory = 1301839872  (counts = 72)
GC has run, aborting
Final memory = 1301626880  (counts = 73)

[LOG 01:46:18.450] MemGraph Test(1, 1000000000)
Initial memory = 1301782528  (counts = 73)
GC has run, aborting
Final memory = 1301573632  (counts = 74)

[LOG 01:46:18.635] MemGraph Test(1, 1000000000)
Initial memory = 1301733376  (counts = 74)
GC has run, aborting
Final memory = 1301573632  (counts = 75)

No surprise there, right?  But what did change was that prior to running the MemGraph test my GC was kicking in like clockwork every 9.0 sec +-20ms.  After running the test, it was 13.1sec +- 50ms.  It did increase the GC stutter frame by an additional 10-15ms.  Still, a definite improvement overall imo.

 

If you wouldn't mind, could you make an option in MemGraph to add a configurable amount to the heap that won't get GC'd and survives scene changes?  I don't know if that's as simple as defining something similar to your block array in RunTestCode() as a public up top or if I'm asking for something more difficult than I understand. 

 

PS.  The irony of how Kerbal this hack would be if it works is not lost on me...  I propose it be renamed "MemGraph - Moar Boosters" :)  But, of course the final decision up to you.  If it works I shall be completely satisfied with MemGraph 1.0.0.6  :) Thx.

 

EDITED:  Changed the bolded numbers. I realized I had a mod that was spamming my ksp.log (it was 200MB, lol).  Updated the mod, and updated the numbers.

Edited by Tig
Unrelated mod update
Link to comment
Share on other sites

18 minutes ago, Tig said:

What if we went at this another way - okay its a total hack, I admit - but maybe it'll work?  :)

 

Hehe, yeah, I've investigated that too in the past, though I was trying to avoid keeping the padding allocated.  I'm intending to revisit it now after learning more detail about how the allocator works.  It should be possible to force the total mono heap size to virtually any value while leaving most of the padding unallocated by carefully choosing the sizes of memory blocks and keeping one block allocated from each 4k page...

22 minutes ago, Tig said:

I played with MemGraph a bit.  I changed the cfg to a rather extreme value from 984 added 20 times to 100million bytes.  The GC kicked in immediately.  From my KSP log:

That looks like 1000 million, not 100.  I would've expected a better improvement for that much so there is probably something else going on like Unity increasing the GC_free_space_divisor when the heap gets very large so you don't end up with an extra 250MB of heap headroom. It would be interesting to know the minimum and maximum values of HWM you see before and after adding the huge block.

I'll definitely take a look at PadHeap later today and also try to post the info I've learned about how the allocator works...

Link to comment
Share on other sites

1 hour ago, Padishar said:

That looks like 1000 million, not 100.  I would've expected a better improvement for that much so there is probably something else going on like Unity increasing the GC_free_space_divisor when the heap gets very large so you don't end up with an extra 250MB of heap headroom. It would be interesting to know the minimum and maximum values of HWM you see before and after adding the huge block.

Yeah you caught me before I edited my post.   I realized that I had a mod that was spamming my ksp.log, it got to like 209MB or something.  :)  I thought it might be affecting the results so I redid the test with 1GB, my post above is now corrected.

Here is the graph from the 1GB test and 9 sec to 13.1 sec GC times.

ufwIxgW.jpg

As per your request I checked the HWM changes (using 2GB this time, but I don't think that changed it much).

Pre-test HWM fluctuates between 970-1150MB, post-test 10 press of alt-/ with test.cfg of 2GB x 1 and HWM fluctuates from 1020-1320MB.

Letting it run for a while post-test, you can see from the graph how clockwork the timing is:

vqI6aV9.jpg

1 hour ago, Padishar said:

I'll definitely take a look at PadHeap later today and also try to post the info I've learned about how the allocator works...

Thank you very much.

 

----------------------------------------------------

 

EDITED:

Made sure all mods fully updated.  Got a better looking graph.   test.cfg settings 300MB x 10.  HWM changes nearly same as above.   Pre-test HWM between 970-1150MB, post-test HWM fluctuates from 1020-1360MB.  

KSP.log:
 

[LOG 05:32:20.058] [Vessel CPU Test 600 v23_5 Debris]: Destroyed. No crews were aboard.
[LOG 05:32:29.947] MemGraph Test(10, 300000000)
Initial memory = 1091579904  (counts = 165)
GC has run, aborting
Final memory = 1303949312  (counts = 166)

[LOG 05:32:30.447] MemGraph Test(10, 300000000)
Initial memory = 1304113152  (counts = 166)
GC has run, aborting
Final memory = 1596772352  (counts = 167)

[LOG 05:32:30.765] MemGraph Test(10, 300000000)
Initial memory = 1597702144  (counts = 167)
GC has run, aborting
Final memory = 1878835200  (counts = 168)

[LOG 05:32:31.000] MemGraph Test(10, 300000000)
Initial memory = 1879711744  (counts = 168)
GC has run, aborting
Final memory = 1878761472  (counts = 169)

[LOG 05:32:31.239] MemGraph Test(10, 300000000)
Initial memory = 1879638016  (counts = 169)
GC has run, aborting
Final memory = 1878724608  (counts = 170)

 

Even more KSP.log from another run showing how the memory drops back down after 20-30 seconds (the dashes are mine for easier reading):

With this run I briefly saw HWM of 2.2GB.  Then after 3 series of multiple alt-/ presses it settled around 1100-1450.  Stuttered frametimes increased to ~19seconds apart.


[LOG 06:00:11.703] MemGraph Test(10, 300000000)
Initial memory = 1880608768  (counts = 170)
GC has run, aborting
Final memory = 1879851008  (counts = 171)

[LOG 06:00:11.952] MemGraph Test(10, 300000000)
Initial memory = 1880506368  (counts = 171)
GC has run, aborting
Final memory = 1879732224  (counts = 172)

[LOG 06:00:12.228] MemGraph Test(10, 300000000)
Initial memory = 1880711168  (counts = 172)
GC has run, aborting
Final memory = 1879744512  (counts = 173)

[LOG 06:00:12.525] MemGraph Test(10, 300000000)
Initial memory = 1880526848  (counts = 173)
GC has run, aborting
Final memory = 1879715840  (counts = 174)

[LOG 06:00:12.789] MemGraph Test(10, 300000000)
Initial memory = 1880420352  (counts = 174)
GC has run, aborting
Final memory = 1879703552  (counts = 175)

[LOG 06:00:13.090] MemGraph Test(10, 300000000)
Initial memory = 1880281088  (counts = 175)
GC has run, aborting
Final memory = 1879683072  (counts = 176)

--------

[LOG 06:00:49.375] MemGraph Test(10, 300000000)
Initial memory = 1239945216  (counts = 178)
GC has run, aborting
Final memory = 1336397824  (counts = 179)

[LOG 06:00:49.971] MemGraph Test(10, 300000000)
Initial memory = 1337507840  (counts = 179)
GC has run, aborting
Final memory = 1584709632  (counts = 180)

[LOG 06:00:50.210] MemGraph Test(10, 300000000)
Initial memory = 1584873472  (counts = 180)
GC has run, aborting
Final memory = 1881849856  (counts = 181)

[LOG 06:00:50.801] MemGraph Test(10, 300000000)
Initial memory = 1885396992  (counts = 181)
Block 0 increase = 300003328
GC has run, aborting
Final memory = 2181873664  (counts = 182)

[LOG 06:00:51.152] MemGraph Test(10, 300000000)
Initial memory = 2183086080  (counts = 182)
GC has run, aborting
Final memory = 1881845760  (counts = 183)

[LOG 06:00:51.504] MemGraph Test(10, 300000000)
Initial memory = 1882562560  (counts = 183)
Block 0 increase = 300003328
GC has run, aborting
Final memory = 2181832704  (counts = 184)

---------

[LOG 06:01:54.018] MemGraph Test(10, 300000000)
Initial memory = 1154875392  (counts = 187)
Block 0 increase = 300003328
GC has run, aborting
Final memory = 1608048640  (counts = 188)

[LOG 06:01:54.449] MemGraph Test(10, 300000000)
Initial memory = 1608212480  (counts = 188)
Block 0 increase = 300003328
GC has run, aborting
Final memory = 1893093376  (counts = 189)

[LOG 06:01:54.687] MemGraph Test(10, 300000000)
Initial memory = 1893969920  (counts = 189)
GC has run, aborting
Final memory = 1581838336  (counts = 190)

[LOG 06:01:54.948] MemGraph Test(10, 300000000)
Initial memory = 1582485504  (counts = 190)
Block 0 increase = 300003328
Block 1 increase = 300003328
GC has run, aborting
Final memory = 1881796608  (counts = 191)

[LOG 06:01:55.193] MemGraph Test(10, 300000000)
Initial memory = 1881960448  (counts = 191)
Block 0 increase = 300003328
GC has run, aborting
Final memory = 2181758976  (counts = 192)

[LOG 06:01:55.430] MemGraph Test(10, 300000000)
Initial memory = 2182176768  (counts = 192)
GC has run, aborting
Final memory = 1881739264  (counts = 193)

[LOG 06:01:55.690] MemGraph Test(10, 300000000)
Initial memory = 1881903104  (counts = 193)
Block 0 increase = 300003328
GC has run, aborting
Final memory = 2181730304  (counts = 194)

[LOG 06:01:55.961] MemGraph Test(10, 300000000)
Initial memory = 2182148096  (counts = 194)
GC has run, aborting
Final memory = 1881714688  (counts = 195)

[LOG 06:01:56.208] MemGraph Test(10, 300000000)
Initial memory = 1882361856  (counts = 195)
Block 0 increase = 300003328
GC has run, aborting
Final memory = 1881706496  (counts = 196)

Frametimes from the 1st run:

(All in ms)

Stutter		 ms since 	
occurs:  	 last stutter:
14307    	8055
22401    	8094
30786    	8385
39144	   	8358
47532   	8388
55918    	8386
64306    	8388
72661    	8355
80894    	8233
84785    	3891*
85286    	501*
85602    	316*
85837    	235*
86077    	239*
86311    	235*
87226    	915*
87508    	282*
87746    	239*
87994    	248*
98067    	10072
113899    	15833
129766    	15867
145691    	15925
161634    	15943

*Me pressing alt-/ to run test.

Graph (I think its self explanatory):

J374eFu.jpg

 

PS.  One last note, GCMonitor shows an increase of approximately 1GB of memory usage after running MemGraph test.  Just in case that's important.

Edited by Tig
Further testing
Link to comment
Share on other sites

2 hours ago, Tig said:

Even more KSP.log from another run showing how the memory drops back down after 20-30 seconds (the dashes are mine for easier reading):

Yes, the docs for the Boehm collector and various Unity specific docs claim that it never releases memory back once it is part of the Mono heap but this appears to not be the case.  I suspect that Unity do some extra weird stuff to get some control over it.

This is all very useful data (though it's nothing I've not seen before, it's nice to have it confirmed).  The new padding mechanism I'm designing should increase the heap headroom by a factor of 10+ rather than the near-doubling you are seeing.  This should be a seriously useful reduction in the GC rate though the actual stutters, when they do happen, may be a lot more noticeable...

Link to comment
Share on other sites

6 hours ago, Nansuchao said:

Did you flew just a ship in orbit or a plane/launch rocket too? 

The stutter and the use of resource is more intensive flying in atmosphere or however near the ground.

The use of the cpu is certainly more intense, yes.  But I don't see stutters like that when launching, now that could be because the framerate is considerably lower so if there are stutters they are relatively less severe.  (going from 120fps to 5 fps for a frame is more jaring than 30fps to 5fps for 1 frame - if that's even happening.

As to my testing I use "CPU Test 600 v23_5" a 600 part monster that drops my 4.6GHz i5-6600K down to 42 fps on the launchpad with stock 1.1.2 (still a vast improvement over the 18-19fps i got with 1.0.5).  With my mod list which includes KER, MJ, FAR, EVE and many others the framerate dips as low as 16fps during launch typically around max aero near the sound barrier.  If there's something more intense - send it my way. :)  I don't know if I have permission to distribute the .craft file I use, but just do a search its easy to find.

-----------------

 

6 hours ago, Padishar said:

This is all very useful data (though it's nothing I've not seen before, it's nice to have it confirmed).  The new padding mechanism I'm designing should increase the heap headroom by a factor of 10+ rather than the near-doubling you are seeing.

No problem, thanks to you and @sarbian for trying to code a fix/workaround.  Just trying to provide the two of you with more than just anecdotal subjective opinions - not that those are invaluable always, but I'm a raw data kinda guy - sneaking suspicion you two might be of a similar mindset. 

Can't wait to try the new padding.

6 hours ago, Padishar said:

This should be a seriously useful reduction in the GC rate though the actual stutters, when they do happen, may be a lot more noticeable...

Certainly true, but hopefully the pattern I've seen thus far will stay the same.  With sarbian's Grumpycollector I managed to increase the GC rate by a factor of 6 and only got a 10-15% reduction in the ms of the stuttered frame.  With your MemGraph I managed to nearly double the time between GCs and the stuttered frame only increased by 5-10%.  I imagine that pattern will continue to hold... if in a couple of days i can go from 15 stutters per minute to 15 per hour, hell, I don't care if that one stutter is 1/3 of a second.  :)

Btw, what is the most typical limiting factor on a GC collection?  Is it CPU or Ram?

 

Edited by Tig
Link to comment
Share on other sites

3 hours ago, Tig said:

Btw, what is the most typical limiting factor on a GC collection?  Is it CPU or Ram?

Well, the process uses the CPU so the faster the CPU, the shorter each pause will be.  The amount of work to be done depends on the total number and total size of all the allocated memory blocks but the relationship is slightly complex due to the multiple passes the GC performs.

I've now released version 1.0.0.6 of MemGraph that includes Mod-End to allocate a large chunk of heap headroom.  In my testing this has increased the time between runs of the garbage collector by between 5 and 12 times (though the collections themselves pause for somewhat longer) though it probably needs further tweaking and could probably do with dynamically adjusting to the current pattern of heap usage.

 

Link to comment
Share on other sites

1 hour ago, Padishar said:

Well, the process uses the CPU so the faster the CPU, the shorter each pause will be.  The amount of work to be done depends on the total number and total size of all the allocated memory blocks but the relationship is slightly complex due to the multiple passes the GC performs.

Okay yeah I figured the actual analysis by the CPU would make RAM speed somewhat irrelevant.  Thanks for the confirmation.

1 hour ago, Padishar said:

I've now released version 1.0.0.6 of MemGraph that includes Mod-End to allocate a large chunk of heap headroom.  In my testing this has increased the time between runs of the garbage collector by between 5 and 12 times (though the collections themselves pause for somewhat longer) though it probably needs further tweaking and could probably do with dynamically adjusting to the current pattern of heap usage.

Thanks much... here's all my testing.

Picture first, with labels A, B, C, D

  • A) Savegame load
  • B) Mod(alt)-End hit 1 time
  • C) Mod-End hit 2 times
  • D) Mod-End hit 3 times
  • END) End of the run, just put that there in case the dark blue line was too faint.

faLTeg7.jpg

Summary:

 

HWM

(MB)

Memory - GC

Monitor (MB)

Avg GC

interval (sec)

Typical stutter

Frametime (ms)

A)  990-1200 4871 10.25 252
B) 2440-2780 6514 19.01 407
C) 2450-3020 6786 32.24 357
D)   2450-3030 6808 32.74 366

After I took the final screenshot I hit Mod-End several more times.  No significant difference from D (which was barely more than C).

My thoughts:

  1. First, vast improvement in playabllity, thank you
  2. 1/3 as many stutters, but stutters are 40% longer.  I'll take it.  :)
  3. Strange how the frametime dropped between B) and C)?
  4. I still need to go through my mods, I have a feeling there is one updating way too much kb/sec.  If I find it (or them) it'll make this fix even sweeter.

------------------

Backup data for those interested.

Frametimes:

Spoiler

 

Left column = Stuttered framtime occcurence in ms. 

Right column = time interval in ms since last stuttered timeframe

Note 1.   The frametime recording was stopped and started between C & D so I could screenshot.

Note 2.   For the average GC interval in the table above I ignore the first interval after Mod-End (it was always inconsistently shorter than all the rest)


A)
15521    10,021
25584    10,063
35955    10,371
46331    10,376
56549    10,218
66953    10,404
77184    10,230
87502    10,319
    
B)
109927    13,370
127769    17,842
145954    18,185
164104    18,150
183610    19,506
203813    20,203
223979    20,166
    
C)
272228    27,763
303599    31,371
336199    32,600
368681    32,482
401196    32,515
    
D)
38822    21,042
71530    32,708
104325    32,795
137184    32,859
170059    32,875
202420    32,361
235269    32,848
268032    32,764

 

 

 

 

 

Pertinent sections of my KSP Log:

Spoiler

 

All bolded items are added by me

B)

[LOG 17:34:42.110] [Vessel CPU Test 600 v23_5 Debris]: Destroyed. No crews were aboard.
[LOG 17:35:02.995] Pad(32) started, memory = 1120364 KB
After disard and collect, memory = 994272 KB
Pad8
Pad16
Pad24
PadArray(8)
PadArray(16)
PadArray(24)
PadArray(40)
PadArray(56)
PadArray(72)
PadArray(88)
PadArray(120)
PadArray(152)
PadArray(184)
PadArray(216)
PadArray(272)
PadArray(328)
PadArray(408)
After padding, memory = 2527172 KB

[LOG 17:36:15.848] [CHATR] Capsule starts the exchange...

C)
[LOG 17:37:19.341] Pad(32) started, memory = 2604300 KB
After disard and collect, memory = 1219492 KB
Pad8
Pad16
Pad24
PadArray(8)
PadArray(16)
PadArray(24)
PadArray(40)
PadArray(56)
PadArray(72)
PadArray(88)
PadArray(120)
PadArray(152)
PadArray(184)
PadArray(216)
PadArray(272)
PadArray(328)
PadArray(408)
After padding, memory = 2762376 KB

[LOG 17:37:31.057] Pad(32) started, memory = 2513104 KB
After disard and collect, memory = 1011152 KB
Pad8
Pad16
Pad24
PadArray(8)
PadArray(16)
PadArray(24)
PadArray(40)
PadArray(56)
PadArray(72)
PadArray(88)
PadArray(120)
PadArray(152)
PadArray(184)
PadArray(216)
PadArray(272)
PadArray(328)
PadArray(408)
After padding, memory = 2532936 KB

D)

[LOG 17:40:33.641] Pad(32) started, memory = 2896880 KB
After disard and collect, memory = 1101468 KB
Pad8
Pad16
Pad24
PadArray(8)
PadArray(16)
PadArray(24)
PadArray(40)
PadArray(56)
PadArray(72)
PadArray(88)
PadArray(120)
PadArray(152)
PadArray(184)
PadArray(216)
PadArray(272)
PadArray(328)
PadArray(408)
After padding, memory = 2584644 KB

[LOG 17:40:36.248] Pad(32) started, memory = 2584644 KB
After disard and collect, memory = 1198384 KB
Pad8
Pad16
Pad24
PadArray(8)
PadArray(16)
PadArray(24)
PadArray(40)
PadArray(56)
PadArray(72)
PadArray(88)
PadArray(120)
PadArray(152)
PadArray(184)
PadArray(216)
PadArray(272)
PadArray(328)
PadArray(408)
After padding, memory = 2741400 KB

[LOG 17:40:38.489] Pad(32) started, memory = 2748492 KB
After disard and collect, memory = 1118732 KB
Pad8
Pad16
Pad24
PadArray(8)
PadArray(16)
PadArray(24)
PadArray(40)
PadArray(56)
PadArray(72)
PadArray(88)
PadArray(120)
PadArray(152)
PadArray(184)
PadArray(216)
PadArray(272)
PadArray(328)
PadArray(408)
After padding, memory = 2687168 KB

*** Delay between Mod-End presses is due to it taking about 3.5sec to run each time.  Wanted to make sure it was registering with the system.

 

 

Relevant system specs:

Spoiler

 

Kerbal Space Program - 1.1.2.1260 (WindowsPlayer)

OS: Windows 7 Service Pack 1 (6.1.7601) 64bit
CPU: Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz (4)
RAM: 16332
GPU: AMD Radeon HD 7900 Series (3051MB)
SM: 30 (Direct3D 9.0c [aticfx64.dll 8.17.10.1452])

CPU: i5-6600K OC to 4.6GHz on z170 chipset, air-cooled

GPU: HD7950 OC 1100 core 1350 memory

RAM: 16GB DDR4-3200 CL16

Primary Drive (OS + swapfile): Samsung 850 Evo

Kerbal Drive: Adata SP900

 

Modlist:

Action Groups Extended (AGExt 2.0)
Alcubierre Warp Drive (Stand-alone) (AlcubierreStandalone 0.4.3.0)
B9 Aerospace Procedural Wings - Fork (B9-PWings-Fork 1:0.40.7)
B9 Part Switch (B9PartSwitch v1.2.0)
BetterBurnTime (BetterBurnTime 1.4.3)
Camera Focus Changer (CameraFocusChanger v0.9.0.0)
Celestial Body Science Editor (CelestialBodyScienceEditor 6.0)
Chatterer (Chatterer 0.9.8)
Community Resource Pack (CommunityResourcePack 0.5.2.0)
Community Tech Tree (CommunityTechTree 1:2.4)
Community Terrain Texture Pack (CommunityTerrainTexturePack 2.0)
Connected Living Space (ConnectedLivingSpace 1.2.1.2)
Contract Configurator (ContractConfigurator 1.11.5)
Contract Pack: Anomaly Surveyor (ContractConfigurator-AnomalySurveyor 1.4.3)
Contract Pack: Bases and Stations (ContractConfigurator-KerbinSpaceStation 1:3.2.1.2)
Contract Pack: Historic Missions (ContractConfigurator-HistoricMissions 2.1.0)
Contract Parser (ContractParser 3.0)
Contract Reward Modifier (ContractRewardModifier 2.3)
Contracts Window + (ContractsWindowPlus 6.3)
Control Lock (ControlLock 1.5)
Crowd Sourced Science (CrowdSourcedScience v3.0.2)
Cryogenic Engines (CryoEngines 1:0.3.2)
Cryogenic Tanks (CryoTanks 0.2.2)
Custom Barn Kit (CustomBarnKit 1.1.8.0)
Deployable Engines Plugin (DeployableEngines 0.2.2)
Distant Object Enhancement (DistantObject v1.7.1)
Distant Object Enhancement default config (DistantObject-default v1.7.1)
DMagic Orbital Science (DMagicOrbitalScience 1.3)
Editor Extensions Redux (EditorExtensionsRedux 3.2.13)
Environmental Visual Enhancements (EnvironmentalVisualEnhancements 2:EVE-1.1-2-1)
Environmental Visual Enhancements - Config files (EnvironmentalVisualEnhancements-HR 2:EVE-1.1-2-1)
Extraplanetary Launchpads (ExtraPlanetaryLaunchpads 5.3.2)
Ferram Aerospace Research (FerramAerospaceResearch 3:0.15.6.5)
Final Frontier (FinalFrontier 1.0.10-2467)
Firespitter Core (FirespitterCore v7.2.4)
Graphic Memory Monitor (GCMonitor 1.4.3.0)
Haystack Continued (HaystackContinued 0.5.0.0)
Heat Control (HeatControl 0.3.1)
Kalculator (Kalculator 0.2.1)
Kerbal Alarm Clock (KerbalAlarmClock v3.6.2.1)
Kerbal Atomics (KerbalAtomics 1:0.2.2)
Kerbal Attachment System (KAS 0.5.7)
Kerbal Engineer Redux (KerbalEngineerRedux 1.1.1.0)
Kerbal Inventory System (KIS 1.2.9)
Kerbal Joint Reinforcement (KerbalJointReinforcement v3.1.7)
Kerbal Stats (KerbalStats 2.1.0)
KOAS - Kerbal Optical Alignment System (KerbalOpticalAlignmentSystem 1.0.2)
Kopernicus Planetary System Modifier (Kopernicus 2:release-1-0-4)
KSP Achievements (Achievements 1.9.2)
KSP AVC (KSP-AVC 1.1.6.1)
KW Rocketry Redux (KWRocketryRedux 3.0.7)
KW Rocketry Redux - Graduated Power Response Configs (KWRocketryRedux-GraduatedPwr 3.0.7)
Magic Smoke Industries Infernal Robotics (InfernalRobotics v2.0.4)
ModularFlightIntegrator (ModularFlightIntegrator 1.1.4.0)
Module Manager (ModuleManager 2.6.25)
Near Future Construction (NearFutureConstruction 0.6.1)
Near Future Electrical (NearFutureElectrical 0.7.2)
Near Future Electrical Core (NearFutureElectrical-Core 0.7.2)
Near Future IVA Props (NearFutureProps 0.5.0)
Near Future Propulsion (NearFuturePropulsion 0.7.2)
Near Future Solar (NearFutureSolar 0.6.1)
Near Future Solar Core (NearFutureSolar-Core 0.6.1)
Near Future Spacecraft (NearFutureSpacecraft 0.5.0)
'Otter' Submersible (USI-SubPack 0.1.3.0)
Outer Planets Mod (OuterPlanetsMod 1:2.0)
Portrait Stats (PortraitStats 9.0)
Precise Node (PreciseNode 1.2.3)
Procedural Fairings (ProceduralFairings v3.17)
Progress Parser (ProgressParser 4.0)
'Project Orion' Nuclear Pulse Engine (USI-NuclearRockets 0.2.3.0)
RealChute Parachute Systems (RealChute v1.4.1.1)
SafeChute (SafeChute v1.9.1)
ShipManifest (ShipManifest 5.1.0.0)
StageRecovery (StageRecovery 1.6.4)
Station Science (StationScience 2.0)
TAC Fuel Balancer (TacFuelBalancer 2.6)
TAC Life Support (TACLS) (TACLS v0.12.1)
TAC Life Support (TACLS) - stock config (TACLS-Config-Stock v0.11.1)
Take Command (TakeCommand 1.4.1)
TextureReplacer (TextureReplacer v2.4.13)
Toolbar (Toolbar 1.7.12)
Trajectories (Trajectories v1.6.2)
Transfer Window Planner (TransferWindowPlanner v1.5.0.0)
TriggerAu Flags (TriggerAu-Flags v2.7.4.0)
UbioZur Welding Ltd. Continued (UbioWeldContinued 2.3.2)
Universal Storage (UniversalStorage 1.1.0.12)
US Officer Ranks (Standard) - Final Frontier Add-On (USOfficerRanksStandardFinalFrontierAddOn 0.9.1)
USI Core (USI-Core 0.2.3.0)
USI Kolonization Core (USI-UKS-Shared 0.2.3.1)
USI Kolonization Systems (MKS/OKS) (UKS 1:0.40.3.0)
USI Tools (USITools 0.7.2.1)
Waypoint Manager (WaypointManager 2.5.1)
Texturereplacer - Navballs
Texturereplacer - Suits
MemGraph
SETI-Contracts
SETI-Rebalance
SETI-Ctt
InlineBallutes
S.A.V.E.
RLA Stockalike
Tig - CustomBarn
Tig - CC simultaneous
Tig - dmagic sesimic adjustment
Tig - other changes
Tig - Umbra TAC patch
Tig - KW SRB thrust patch
Tig - Univ Storage Adjustments
TIg - SETI-Tech tree-KW
TIg - SETI-Tech tree-Squad

 

 

 

 

 

Edited by Tig
Link to comment
Share on other sites

2 hours ago, Padishar said:

Well, the process uses the CPU so the faster the CPU, the shorter each pause will be.  The amount of work to be done depends on the total number and total size of all the allocated memory blocks but the relationship is slightly complex due to the multiple passes the GC performs.

I've now released version 1.0.0.6 of MemGraph that includes Mod-End to allocate a large chunk of heap headroom.  In my testing this has increased the time between runs of the garbage collector by between 5 and 12 times (though the collections themselves pause for somewhat longer) though it probably needs further tweaking and could probably do with dynamically adjusting to the current pattern of heap usage.

 

Anyone know if the Unity GC is multi-threaded? If not, would it be possible to shuffle the GC off onto its own CPU core by way of prioritization? Or does it just run as a part of an existing thread?

Sorry if this is stupid, I'm a professional network geek, but this is getting into territory where I have no idea what the heck I'm doing :P

Link to comment
Share on other sites

1 hour ago, Tig said:

My thoughts:

  1. First, vast improvement in playabllity, thank you
  2. 1/3 as many stutters, but stutters are 40% longer.  I'll take it.  :)
  3. Strange how the frametime dropped between B) and C)?
  4. I still need to go through my mods, I have a feeling there is one updating way too much kb/sec.  If I find it (or them) it'll make this fix even sweeter.

Great data again, thanks.  I'm a bit disappointed by only just over three times, I guess it's probably tuned a bit too specifically to the situations I was testing.  Hopefully, I should be able to improve that a fair bit with some analysis of the heap.  The figures should also be better with fewer mods installed, though it is difficult to predict because there isn't any easy way to see how many blocks of what sizes are being regularly allocated.  Not sure about 3, I'll have to think about it tomorrow (1:30 am here now and I need to be up at 7:30).

Another thing that could reduce your stutter would be to set a lower fps limit.  Currently, the 16-20 MB/s is being generated by 120 rendered frames and 50 physics updates.  It is hard to say how much is being done per rendered frame and how much is per physics update but changing to a 60 fps limit could significantly reduce the rate of garbage creation...

Link to comment
Share on other sites

4 hours ago, Chris97b said:

Anyone know if the Unity GC is multi-threaded?

Not as far as I am aware. There is a compile-time option to enable partially ("mark" phase) multithreaded GC in mono, but it's turned off by unity with the comment "causes crashes"... Which indeed it does.
Not sure what Unity has done to the GC in their mono fork, this works just fine in vanilla mono.

Edited by steve_v
Link to comment
Share on other sites

@Padishar

Have you also considered looking for low hanging stock fruit to fix up? The main editor GC problem seems fixable to me and there are some small bits we can fix here and there (METDisplay wastes 1.5 kB every update frame for example) which could add up to some significant GC reduction

Link to comment
Share on other sites

4 hours ago, steve_v said:

Not as far as I am aware. There is a compile-time option to enable partially ("mark" phase) multithreaded GC in mono, but it's turned off by unity with the comment "causes crashes"... Which indeed it does.
Not sure what Unity has done to the GC in their mono fork, this works just fine in vanilla mono.

I suspect, though I haven't got any real evidence to back it up, that they have tried to use some of the more advanced features of the Boehm collector to make it less conservative, e.g. by using the Mono type information to determine that a block can't contain any references to other blocks and therefore doesn't need to be scanned, or even, to make the scanning itself use the type information to only check those memory locations within a block that actually are reference types.  It wouldn't surprise me if these changes were not thread-safe as, with the option turned off, they don't need to be.

1 hour ago, stk2008 said:

skimming through this thread again am I right in thinking there has been improvment in the stutter? :o

if so please tell me what I need to install this is awesome news :)

I have devised a method of padding the Mono heap with quite a lot of extra empty headroom that doesn't get eliminated after one or two runs of the GC.  This should increase the time between stutters considerably (anywhere from 2 to 12 times depending on the current pattern of heap usage) though it does make each pause somewhat longer.  The code will probably be improved further in the coming days/weeks but you can try it out by installing the latest version of the MemGraph mod linked in my sig.  Get into a situation where the stutter is bad and then hit Mod-End.  After a short while (at least 2 or 3 runs of the collector), you can try hitting it again, which will reset the padding and may improve it further.  Please reply in either this thread or the MemGraph thread with any observations, either qualitative or quantitative (but as with any kind of report, details of your system, mod setup and what you were doing would be helpful).

20 minutes ago, xEvilReeperx said:

@Padishar

Have you also considered looking for low hanging stock fruit to fix up? The main editor GC problem seems fixable to me and there are some small bits we can fix here and there (METDisplay wastes 1.5 kB every update frame for example) which could add up to some significant GC reduction

Yes, that was the entire purpose of my MemGraph mod.  I've already passed on information about a number of stock areas of code that create excessive garbage and am continuing to look for them as reducing the amount created is still, by far, the best way to reduce the annoyance of the stutters...

While targeting code that creates 1.5 KB/frame will have an effect (and it will be considerable if lots of such code is fixed), I'm, initially, more concerned about situations where stock code allocates much more.  I have a number of saves where the garbage creation adds up to 20+ MB/s resulting in the GC running every 3 to 4 seconds.  This padding of the heap extends that to approx. 20 seconds between collections which, even with the pauses being noticeable longer, is considerably nicer to play (in my opinion).

Link to comment
Share on other sites

This thread is quite old. Please consider starting a new thread rather than reviving this one.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...