
Need help with a KSP bug I am dealing with



34 minutes ago, exospaceman said:

OMG I FIXED IT! I only removed Aviation Lights from Game Data and it seems the menu bug is gone.

However, I want to see if this lasts for long so still keeping an eye on it for now. But anyway thank you for the help

DAMN! That was really unexpected!!!! :o

Try adding it back and putting DOE in its place - I really don't see a reason for Aviation Lights to be doing something wrong at that point, DOE is the one that's active on Main Menu Space Center!!

— — POST EDIT — — 

I'm double checking AviationLights' code, and the damned thing is a PartModule; it shouldn't even be active by the time your problems are happening. It doesn't cache anything, doesn't hook itself on any KSPEvent…

Looking into your original Player.log and KSP.log, there's no sign of AviationLights being involved in any kind of trouble.

 

Edited by Lisias
POST EDIT

On 8/1/2023 at 11:27 PM, exospaceman said:

Hmm that's weird I ran KSP 2 times and the menu seems to work well and what is DOE?

DOE is Distant Object Enhancement (DistantObject).

I think I found a possible flaw in AviationLights; I may have misunderstood something about how KSP handles the PartModule life cycle. [edit] Nope, it's a KSP idiosyncrasy that is biting now due to the Hybrid CPU problem [/edit] I'm doing some exploratory tests on my test beds to see if I can force this misbehaviour in the way I think it may be related to AviationLights.

I will be back to you as soon as I finish these tests - I think a couple hours.
 

— — POST EDIT — — 

Instead of polluting this thread with tons of Technical Mumbo Jumbo, I opened an issue on AL to document my findings: https://github.com/net-lisias-ksp/AviationLights/issues/4

I will come back here when I have something concrete to say.

I will come back to you ASAP, before sleep time for sure.

— — POST POST EDIT — — 

@exospaceman, I found one possibility. The only way such a possibility could happen is through a really convoluted sequence of events that depends on something going out of sync inside your game due to Intel's Hybrid Stunt, because on a symmetrical CPU like mine such a sequence of events would behave correctly.

Another problem with this working theory is that you are telling us that the problem vanished but... Had you launched something using an Aviation Light after reinstalling it? Loading savegames without a needed PartModule will just rip it off the craft, and by reinstalling you would have the PartModule reinjected, but with default values. And if I'm right, the default values will not trigger the problem.

It may also be that it only happens if you have a ton of craft full of lights - but, again, I'm guessing, because I'm assuming that this hypothetical situation I pulled out of my SAS is really happening, which at this time is uncertain.

It can also be an unhappy conjunction of events where two or more 3rd-party add-ons are needed for the present situation to be triggered.

So, I need to ask you for some more tests.

Get rid of the full backup, which by this time is pretty messed up, then make another FULL BACKUP and on it, after confirming the problem is happening, remove Kopernicus, Scatterer and EVE and see what you get.

And yeah, we are guessing now. Since I could not reproduce the problem here using Aviation Lights on my rig, it is still something on yours - and I need to know if Aviation Lights is the trigger of the problem, or if it's being induced to bork somehow by something else. Ideally, I need to reproduce the problem on my rig too, otherwise I will not be able to be sure about the problem and fix it.

Humm… Your EVE is that one with Volumetric Clouds, I'm right?

Edited by Lisias
POST EDIT

@exospaceman, I had an idea: please delete the current BACKUP KSP, make a completely new FULL BACKUP and, on it, please confirm the problem is still happening.

Once you have such confirmation, please update Aviation Lights to this specific PRE-RELEASE: https://github.com/net-lisias-ksp/AviationLights/releases/tag/RELEASE%2F4.2.1.1

If I'm right about my guesses, this should prevent the problem that's affecting you - assuming, of course, that Aviation Lights is the one causing the problem. If it's only a trigger or an enabler, we may need to keep digging. But, hey, let's keep the faith - the best scenario possible is finding a bug in something I do, because then I can fix the thing and solve the problem!! :)

Let me know whatever happens with this release.


10 hours ago, Lisias said:

Had you launched something using an Aviation Light after reinstalling it?

Nope 

 

10 hours ago, Lisias said:

Your EVE is that one with Volumetric Clouds, I'm right?

Yes

10 hours ago, Lisias said:

remove Kopernicus, Scatterer and EVE and see what you get.

But removing those will also affect the planets of KSRSS


18 minutes ago, exospaceman said:

But removing those will also affect the planets of KSRSS

I know. But we need to narrow it down to a culprit - these tests are destructive, which is the reason we need to use backups. In this case, remove KSRSS too.

But, FIRST, do a clean FULL BACKUP, check if AviationLights is still misbehaving and, if positive, update it to this pre-release: https://github.com/net-lisias-ksp/AviationLights/releases/tag/RELEASE%2F4.2.1.1

I worked around the one theoretical hole I found in it that could be causing you trouble - if I'm right, by updating AL to this pre-release, your problem will just vanish. But, first, we need to confirm that the current release is still misbehaving or we will have a false positive.


Not to disappoint you, but rather than just doing a copy I made a new sandbox instead, I feel more comfortable with it. Anyway, off to test, this is going to be one heck of a mess!

13 hours ago, Lisias said:

 FULL BACKUP and on it

Ok made another one, finally found the backup btw


4 minutes ago, exospaceman said:

It seems the back up works now for the last and hurtful test: Removing Kopernicus, Scatterer and EVE 

I think that Kopernicus is innocent on this one, because I installed it here and - apparently - found no evidence of problems (but I didn't try hard enough, to tell you the truth - I need to install something that makes it sweat a bit).

I would try again by restoring the backup to a problematic state and then removing EVE, and later Scatterer.

Additionally, give that pre-release of AviationLights a try. One of the possible consequences of adding these heavy visual add-ons is that some loops inside KSP start to get busier than others, and perhaps this could be the trigger that (perhaps) would induce AL to bork. If I'm right, the change I made to AL will not solve the problem, but at least it will remove AL from the critical path of the problem, avoiding triggering it and, with a bit of luck, allowing you to run your game with the add-ons you want.

Essentially, I'm aiming to prevent AL from being that straw that breaks the camel's back.


4 hours ago, exospaceman said:

I think I'm gonna have to save some money so that I can get an even powerful PC to handle mods like this

Nope, it appears to be exactly the opposite - you have an i7-12700F (pretty decent), and this thing has two kinds of "cores" inside: 8 cores for Performance, and 4 cores for Efficiency. The E-Cores are slower than the P-Cores, but KSP was coded at a time when every single Core inside your CPU was the same, so KSP is just using whatever is idle at the moment without caring about the differences.

So, suddenly, KSP is getting some internal threads running "out of pace" relative to others (depending on which Core each thread is running on at the moment).

So, as you add more and more add-ons, the KSP internal loops start to get fatter and fatter, and if one of these fat internal loops ends up running on an E-Core, some things start to happen out of pace compared to the loops that run on the P-Cores (it's the reason I hate how Unity handles threads, and how KSP handles concurrency - had these guys done the job properly, we would not be suffering this problem now).

By getting a bigger processor, you will have twice the E-Cores - which essentially doubles the chances of things going south (because you will have twice the number of threads running on slower cores). Believe it or not, the better the processor you have, the worse the problem is for you.

"Vai dormir com um barulho desses…" (literally "Try sleeping with a noise like that…"), as we say around here (something like "Go figure…").

But some more memory will surely be useful to you. Your GPU is also 100% adequate for KSP1.

 

5 hours ago, exospaceman said:

Downloaded the new Aviation lights game bug is no where to be seen even in a new sandbox

Going back to Aviation Lights, what I think happened is that as those fat loops started to run on an E-Core, a critical piece of AL's code got too far out of sync with the rest of the loops, and then something bad started to happen. What I did was to prevent that critical code from running outside the intended Scene by brute force (something that ideally I should not need to care about, because KSP would kill the PartModule before the new Scene starts to kill old things), which apparently prevents AL from being directly hit by the problem I described.

Unfortunately, AL is only one of the possible targets of the problem - others will surely suffer the same.

I think we are on an update fest now in Authors' Land… I'm surely scrambling to double check mine!

 

5 hours ago, exospaceman said:

I think the bug appears or doesn't appear sometimes

Perhaps only a few specific worker threads of KSP play havoc when running on an E-Core, so every time you start up KSP we get a round of Russian Roulette on the subject.

What puzzles me is why setting the CPU's Affinity didn't do the trick for you, but until I get a machine that suffers this problem too, my hands are tied.

Edited by Lisias
Brute force post merging

1 hour ago, Lisias said:

Unfortunately, AL is only one of the possible targets of the problem - others will surely suffer the same.

Idk about that for now the game works fine 

1 hour ago, Lisias said:

What puzzles me is why setting the CPU's Affinity didn't do the trick for you, but until I get a machine that suffers this problem too, my hands are tied.

I hate to say this but I lied. The truth is I didn't actually do the trick: I didn't know where to find the Affinity, and I asked someone on a KSP Discord server for some help, but they told me it only works in Windows 10, so I didn't do it. Pls don't get mad at me :(

Edited by exospaceman

25 minutes ago, exospaceman said:

Idk about that for now the game works fine 

That's what really matters in the end! :)

 

25 minutes ago, exospaceman said:

I hate to say this but I lied. The truth is I didn't actually do the trick: I didn't know where to find the Affinity, and I asked someone on a KSP Discord server for some help, but they told me it only works in Windows 10, so I didn't do it. Pls don't get mad at me :(

Well, in the end it's good news. :D Because now things make sense and perhaps I can try to do something about it. :)

I only have a crappy Windows 10 machine around, so I can't help too much on Windows 11 anyway.

But, one way or another, handling your case gave me some ideas, so thanks for the report! You managed to make things a bit better for everybody (I would not be able to do it without your help!!!)

Cheers!


On 8/1/2023 at 10:07 PM, Lisias said:

However, I want to see if this lasts for long so still keeping an eye on it for now. But anyway thank you for the help

Like I said earlier, I knew this sort of thing wasn't going to last long. However, the problem is different, a new kinda bug: rather than the menu glitching, whenever I launch after reverting to the VAB the game crashes. However, this only happens with larger vehicles; smaller ones aren't affected.


On 8/5/2023 at 11:52 AM, exospaceman said:

Like I said earlier, I knew this sort of thing wasn't going to last long. However, the problem is different, a new kinda bug: rather than the menu glitching, whenever I launch after reverting to the VAB the game crashes. However, this only happens with larger vehicles; smaller ones aren't affected.

It doesn't look like it, but it's an improvement. :P

New bugs mean that the old bugs that were preventing them from happening were solved or mitigated.

What I think is happening is that a problem similar to the one I mitigated on AL is now happening with something else, in other circumstances. It's tricky to diagnose, but we have already worked out the root cause, so now it's a matter of detecting where else it is happening and patching it.

Next time this happens, send me the KSP.log and explain what you were doing when the thing blew up, and then I may have a hint about what to do.

 

On 8/5/2023 at 12:03 PM, exospaceman said:

Sometimes it happens guess I'll have to get a new computer by the end of the year and try to get used to this bug 

I'm not sure if this will help too much. Let me explain to you why:

Inside your CPU there are thingies called Cores and Threads. Let's simplify all of this by calling them "Workers" - there's a "boss" on the chip reading the program and distributing the instructions between the Workers, who then scramble to get the job done and deliver the results.

Sometimes these workers do completely different and unrelated tasks, but sometimes a task is so big that the "boss" splits it between many workers, who then need to work synchronised in order to deliver the result at the right time.

There are many ways to synchronise these workers in order to prevent them from screwing things up (for example, by delivering the sound backwards, or by letting the Drawing Worker draw the screen before the 3D Math Workers have finished calculating the new positions of the scene objects). Unfortunately Unity (and KSP) chose to use a way in which the speed of each Worker is used to "match" internal milestones that are used to synchronise the workload:

  1. Worker 1: Dude, calculate the 100 vertices from position 0, then call the dispatcher with the results.
  2. Worker 2: Hey man, calculate the other 100 vertices at position 100, then call the dispatcher with the results.
  3. Worker 3: You do the same, but from position 200, then call the dispatcher with the results.

Since all the Workers calculate things at the same speed, and since you gave the tasks in order, all the workers will deliver the results in order too, and the Dispatcher will receive the results as follows:

  1. Vertices 0..99 from Worker 1
  2. Vertices 100..199 from Worker 2
  3. Vertices 200..299 from Worker 3
  4. And since things always happen this way, as soon as Worker 3 delivers its results the dispatcher sends everything to the 3D Drawer (the 300 vertices calculated).

But on your CPU, the Workers do not all have the same speed. You have 8 really fast Workers, and 4 "lazy ones". And Unity doesn't care at all. So, let's see what happens if that Worker 1 ends up being one of the "lazy" ones:

  1. Vertices 100..199 from Worker 2 are sent to the dispatcher.
  2. Vertices 200..299 from Worker 3 are sent to the dispatcher.
  3. As soon as Worker 3 delivers its results the dispatcher sends everything to the 3D Drawer, but the first 100 vertices are missing and the Drawing goes South.
  4. Worker 1 delivers vertices 0..99
    1. Dispatcher: What the hell??
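
The two scenarios above can be replayed with a small simulation. This is only a sketch of the story as told here, not Unity or KSP code: the `Worker` class, the `naive_dispatch` function and the speed figures are all made up for illustration. The dispatcher assumes that when the last-assigned worker delivers, everybody else has delivered too.

```python
from dataclasses import dataclass

CHUNK = 100  # vertices per worker, as in the example above


@dataclass(frozen=True)
class Worker:
    name: str
    first_vertex: int  # index of the first vertex this worker calculates
    speed: float       # vertices per time unit (made-up figure)


def naive_dispatch(workers):
    """Flush to the 3D Drawer as soon as the last-assigned worker delivers.

    Returns (ranges_drawn, ranges_that_arrived_too_late).
    """
    # Tasks are handed out in order, one time unit apart.
    finish_time = {
        w.name: start + CHUNK / w.speed for start, w in enumerate(workers)
    }
    flush_at = finish_time[workers[-1].name]

    drawn, late = [], []
    for w in workers:
        vertices = (w.first_vertex, w.first_vertex + CHUNK - 1)
        (drawn if finish_time[w.name] <= flush_at else late).append(vertices)
    return drawn, late


# All Workers equally fast: everything arrives before the flush.
same_speed = [Worker("Worker 1", 0, 100), Worker("Worker 2", 100, 100), Worker("Worker 3", 200, 100)]
# Worker 1 is a "lazy" one: its vertices arrive after the flush.
lazy_first = [Worker("Worker 1", 0, 25), Worker("Worker 2", 100, 100), Worker("Worker 3", 200, 100)]

print(naive_dispatch(same_speed))  # ([(0, 99), (100, 199), (200, 299)], [])
print(naive_dispatch(lazy_first))  # ([(100, 199), (200, 299)], [(0, 99)])
```

The code is identical in both runs; only the speed of one Worker changed, and that alone is enough to lose the first 100 vertices.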

With Aviation Lights, what was happening is that when changing Scenes to the Tracking Station, Aviation Lights got one of the Lazy Workers on a critical section of the code that needs to use some Game Objects that were already destroyed by a task executed by a Fast Worker. Since all Workers used to have the same speed in the past, by the time the Destructor Worker started to clean up the unused objects, the other workers had already finished their business. But now, on your rig with two different kinds of Workers, if that critical task lands on a Lazy Worker, the Fast Worker will destroy everything before the Lazy one has finished using it.

What I did on AL was to detect when the Scene is about to change and tell Aviation Lights to finish its business in advance, so if that critical task ends up on a Lazy Worker, it has more time to do its job - before this change, AL was reacting after the scene had changed.
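
As a rough illustration of that kind of guard (this is not the actual AviationLights code and it uses no KSP API; `SceneObject`, `LightModule` and `on_scene_change_requested` are hypothetical names), the idea is to drop every reference before the scene switch, and to refuse to touch anything afterwards:

```python
class SceneObject:
    """Stand-in for a game object that the engine may destroy at any time."""

    def __init__(self):
        self.alive = True
        self.intensity = 0.0

    def set_intensity(self, value):
        if not self.alive:
            raise RuntimeError("object used after it was destroyed")
        self.intensity = value


class LightModule:
    """Hypothetical module; not the real AviationLights code."""

    def __init__(self, light):
        self.light = light
        self.active = True

    def on_scene_change_requested(self):
        # Called BEFORE the scene switch: finish up and drop references early.
        self.active = False
        self.light = None

    def update(self, value):
        # Guard: never touch scene objects once the scene is going away.
        if not self.active:
            return False
        self.light.set_intensity(value)
        return True


light = SceneObject()
module = LightModule(light)
module.update(1.0)                  # normal operation
module.on_scene_change_requested()  # the scene change is announced
light.alive = False                 # the "Fast Worker" destroys the object
late_call = module.update(0.5)      # late call from a "Lazy Worker": ignored
print(late_call)                    # False
```

This sketch runs on a single thread and only shows the ordering of events. It avoids the use-after-destroy, but it does not fix whatever made the late call happen in the first place.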

We call all this crap "Race Conditions" in Software Engineering.

So, if you buy another rig with more Workers, you will have more Fast Workers but will also have more Lazy Workers, and by having more Lazy Workers you will end up having more problems like this happening.

So, really, the best available way to solve the problem is doing what both of us did on this thread: you post KSP.logs with the problem happening, I analyse them and try to figure out who is being screwed and when, and then try to cook up a way to prevent the situation.

Of course, the best way to solve these problems is by looking directly at KSP's source code and reworking it to synchronise the Workers the right way. :)
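
For contrast, here is a minimal sketch of what explicit synchronisation looks like in general, using Python's standard `concurrent.futures`. It says nothing about how Unity or KSP are built internally; `calculate`, `dispatch` and the delays are made up for the example. The dispatcher waits for every Worker by name instead of trusting that they are all equally fast:

```python
import time
from concurrent.futures import ThreadPoolExecutor

CHUNK = 100  # vertices per worker


def calculate(first_vertex, delay):
    time.sleep(delay)  # stand-in for running on a slower or faster core
    return list(range(first_vertex, first_vertex + CHUNK))


def dispatch(delays):
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        futures = [pool.submit(calculate, i * CHUNK, d) for i, d in enumerate(delays)]
        # result() blocks until that worker is done, so nothing is drawn
        # early, and the chunks are joined in the order they were handed out.
        vertices = []
        for future in futures:
            vertices.extend(future.result())
    return vertices


frame = dispatch([0.05, 0.0, 0.0])  # Worker 1 is the "lazy" one
print(len(frame), frame[0], frame[-1])  # 300 0 299
```

Here the lazy Worker only makes the frame arrive a bit later; it cannot make it arrive incomplete.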

 

 

Edited by Lisias
My God, it's full of tyops!