Reinforcement Learning in KSP (using kRPC, maybe?)


DunnoAnyThing

Recommended Posts

Sorry to ask an "Is mod X supported in game version Y?" sort of question, but I've run out of time, so... unfortunately, here goes.

My friend wants to practice his reinforcement learning techniques on this game (yes, I mean KSP).
The first mod that came to mind was kRPC, but I'm not sure whether it works well with the current KSP version (no time to check myself).
So, 1) the "does kRPC work on current KSP?" question.

2) I think I've come across a post on Red*** about a similar project, but I can't recall it. Any recommendations for this kind of project are welcome (especially from those who do RL).
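For reference, a minimal sketch of what the kRPC side of this could look like, using the published kRPC Python client API. The connection name, episode length, and the toy reward function are my own placeholders, and actually running the `__main__` part assumes a KSP instance with the kRPC server mod listening on its default port:

```python
def simple_reward(flight_time, crashed):
    """Toy shaping: longer flights score higher, crashes are penalized.
    (Placeholder values, not from the thread.)"""
    return flight_time - (100.0 if crashed else 0.0)

def fly_once(conn, duration=10.0):
    """One short episode: stage, hold full throttle, log altitude over time."""
    vessel = conn.space_center.active_vessel
    flight = vessel.flight()
    vessel.control.throttle = 1.0
    vessel.control.activate_next_stage()   # launch
    log = []
    start = conn.space_center.ut           # in-game universal time, seconds
    while conn.space_center.ut - start < duration:
        log.append((conn.space_center.ut - start, flight.mean_altitude))
    return log

if __name__ == "__main__":
    import krpc                            # requires the kRPC server mod in-game
    conn = krpc.connect(name="rl-probe")
    print(fly_once(conn)[-1])
```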


18 minutes ago, DunnoAnyThing said:

So, 1) the "does kRPC work on current KSP?" question.

Hmmm... <copy> <paste> <select "titles only"> <hit search>

Seems to be "officially" supported up to KSP 1.5, so I'd guess it will work with KSP 1.7, but you (or your friend) should test it. My guess is that the new features from Breaking Ground may give you problems.

Or your friend could stay with KSP 1.5.


  • 2 weeks later...

Another option is kOS, but of course you need to save weights and biases globally, since exploring the configuration space of possible weights and biases for a neural network controlling a rocket or plane is most definitely not safe and WILL destroy your vehicle many, many times.

 

I'm also a bit curious how you plan to train the network, seeing as dying can often be a consequence of something you did much earlier, so you can't reliably know where in a run your mistake was.


On 10/20/2019 at 11:38 AM, Pds314 said:

Another option is kOS, but of course you need to save weights and biases globally, since exploring the configuration space of possible weights and biases for a neural network controlling a rocket or plane is most definitely not safe and WILL destroy your vehicle many, many times.

 

I'm also a bit curious how you plan to train the network, seeing as dying can often be a consequence of something you did much earlier, so you can't reliably know where in a run your mistake was.

Dying is actually needed. I'll just give a high reward for flight time, though details may (and will) vary.

Plus, the 'delayed reward' problem is a classic one in RL, so that shouldn't be much of a problem. (I hope?)
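For what it's worth, the textbook way a delayed reward gets spread back over a trajectory is to discount it step by step toward the actions that led up to it; a minimal sketch (the discount factor here is arbitrary):

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute the return at each step: G_t = r_t + gamma * G_{t+1}.
    A single terminal reward (e.g. total flight time at the crash)
    is thereby credited, attenuated, to every earlier step."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns
```

So with a per-step reward of zero and a single payoff at the end, early steps still receive a (discounted) share of the credit.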


9 hours ago, DunnoAnyThing said:

Dying is actually needed. I'll just give a high reward for flight time, though details may (and will) vary.

Plus, the 'delayed reward' problem is a classic one in RL, so that shouldn't be much of a problem. (I hope?)

It basically means in-flight rewards are hard to do. You get roughly one useful data point per launch, and simple backpropagation won't work well because you don't know the activation state of the network at the point mistakes were made, because you don't know when mistakes were made. You can either punish/reward it for everything it did in the whole flight by logging all of the inputs every frame, or you can use a non-backpropagation algorithm such as, say, genetic evolution. These do work, but are very experiment-hungry ways to train a network. And in the case of genetic evolution, storing 10 or 100 or 1000 slightly-altered copies of the network and testing all of them every generation makes for even slower progress.

 

Punishing it for whatever it did one frame before the crash will probably not be useful, though, as most crashes cannot be avoided from one frame away. The same goes for punishing actions more heavily the closer they were to the crash, since, again, the actual mistake could have come very early in the flight.
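A toy sketch of the evolutionary approach mentioned above: mutate a population of controllers, score each on a whole episode, keep the best. The fitness function here is a made-up stand-in for "flight time this controller achieved", and the two-parameter "policy", population size, and mutation scale are all arbitrary:

```python
import random

def evaluate(weights, target=(0.5, -0.2)):
    """Stand-in fitness: negative squared distance to a hidden 'good'
    controller. In practice this would be one full simulated flight."""
    return -sum((w - t) ** 2 for w, t in zip(weights, target))

def evolve(pop_size=20, generations=50, sigma=0.1, seed=0):
    """Simple (1+lambda) evolution: perturb the current best controller,
    evaluate every variant, keep the fittest (elitism)."""
    rng = random.Random(seed)
    best = [rng.uniform(-1, 1) for _ in range(2)]
    for _ in range(generations):
        pop = [[w + rng.gauss(0, sigma) for w in best] for _ in range(pop_size)]
        pop.append(best)                 # elitism: never lose the current best
        best = max(pop, key=evaluate)
    return best
```

Note the experiment-hunger the post describes: this spends pop_size full episodes per generation just to make one update.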


On 10/25/2019 at 12:56 AM, DunnoAnyThing said:

Quite right, I guess.
It seems like first creating a simplified 2-D model of KSP and then trying KSP RL will be a more feasible approach, since there are many problems like the ones you mentioned.

Yes. Especially since even a stupidly simple 3-D rocket sim where everything is one part would run at >10000x physics warp quite easily. It wouldn't need to be perfect. You could randomly vary the rocket/plane parameters, then port the results to KSP and do slower training once it has figured out the basics.
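A sketch of how simple such a sim could be: one part, 1-D, point mass, fixed timestep. All the constants (thrust, burn rate, gravity) are made up for illustration; the returned flight time plays the role of the reward discussed earlier in the thread:

```python
def simulate(throttle_fn, dt=0.02, max_t=120.0):
    """Minimal 1-D rocket: thrust up, gravity down, finite fuel.
    throttle_fn(t, alt, vel, fuel) -> commanded throttle in [0, 1].
    Returns flight time until impact (or max_t if still airborne)."""
    g, thrust_acc, burn_rate = 9.81, 25.0, 0.02   # made-up constants
    alt, vel, fuel, t = 0.0, 0.0, 1.0, 0.0
    while t < max_t:
        throttle = max(0.0, min(1.0, throttle_fn(t, alt, vel, fuel)))
        if fuel <= 0.0:
            throttle = 0.0                         # out of fuel, engine off
        vel += (throttle * thrust_acc - g) * dt
        alt += vel * dt
        fuel -= throttle * burn_rate * dt
        t += dt
        if alt <= 0.0 and vel < 0.0:
            break                                  # hit the ground
    return t
```

No rendering, no part physics, so millions of these episodes per minute are plausible, which is exactly what an experiment-hungry training loop needs before moving to real KSP.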

