Machine learning


I took a look, and I am not impressed. The bot was flinging a simple rocket in random directions, usually at flat angles of attack. I haven't seen any course corrections, engine throttling, etc. All in all, our new cybernetic overlords are still not ready to grace us meatbags with their presence :P

33 minutes ago, rkarmark said:

Well, it doesn't seem to learn anything; it makes the same mistakes over and over again

More machine than learning

It's doing a lot better now than a few hours ago. Instead of all flights going horizontal, a few now go vertical.

The neural network (?) still has a hard time figuring out a compromise between the two.

This type of problem is solved by an unguided learning system, which requires a ton of training input. Even if you are using an NN approach for the guidance system, your best bet is a genetic algorithm. With that in mind, I wouldn't start with the actual game. There is no way to run hundreds of thousands of trials in a short amount of time. I'd start with a good simulation, which can run much faster if you don't waste time on rendering and approximate the rocket as rigid. A simulation like that can run an entire launch to orbit in a fraction of a second. That is where you can start learning. You can then hook it up to the game to display some of the better results with an actual flight.

P.S. For anyone who wants to do NNs or Deep Learning in Python, I strongly recommend TensorFlow. It's a modern tool, far superior to most Python NN libraries.
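To make the genetic-algorithm idea concrete, here is a minimal sketch of the loop described above. This is not the project's code; the `fitness` function is a stand-in for the fast physics simulation, and a "policy" is just a vector of numbers (e.g. NN weights):

```python
import random

def fitness(policy):
    # Placeholder for the fast launch simulation: here it just rewards
    # policies whose weights approach an arbitrary target vector.
    target = [0.5, -1.0, 2.0]
    return -sum((p - t) ** 2 for p, t in zip(policy, target))

def mutate(policy, rate=0.1):
    # Small Gaussian perturbation of every weight.
    return [p + random.gauss(0, rate) for p in policy]

def evolve(pop_size=50, generations=200, n_weights=3, seed=0):
    random.seed(seed)
    population = [[random.uniform(-2, 2) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Score every candidate, keep the top 20%, refill by mutation.
        population.sort(key=fitness, reverse=True)
        survivors = population[:pop_size // 5]
        population = survivors + [mutate(random.choice(survivors))
                                  for _ in range(pop_size - len(survivors))]
    return max(population, key=fitness)

best = evolve()
print(fitness(best))  # should be close to 0, the optimum
```

Because each fitness evaluation is an independent simulated launch, the whole population can also be evaluated in parallel, which is exactly why skipping rendering matters.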

Edited by K^2
I think there's something wrong with the evolution. The "AI" always tries either to constantly apply yaw / pitch (almost never both; it only keeps its heading thanks to perturbations, when it should really just go east) or to go straight vertical when close to no perturbation occurs.

I call for a redesign of the thing, especially regarding the goals - flying a basic gravity turn should be the standard, rather than reaching a certain altitude / velocity.

Edited by YNM
1 hour ago, YNM said:

I think there's something wrong with the evolution. The "AI" always tries either to constantly apply yaw / pitch (almost never both; it only keeps its heading thanks to perturbations, when it should really just go east) or to go straight vertical when close to no perturbation occurs.

I call for a redesign of the thing, especially regarding the goals - flying a basic gravity turn should be the standard, rather than reaching a certain altitude / velocity.

The whole idea is that the AI starts from nothing and doesn't even know what it is doing. It can play around with certain commands and gets a few bits of information (such as the navball, the altitude, etc.), but it doesn't know what those commands and readings mean, or what an orbit or a gravity turn is; it is supposed to learn all that by itself.

The only way the AI gets information about how well it is doing is after each attempt, when it gets a score. So the AI starts with completely random behaviour and gets feedback in the form of a score. If the score is good, it will be more likely to attempt something similar in the future; if the score is bad, it will be less likely.

For KSP, this is not the best idea, since there are well-established concepts for going to orbit, but it is a school project about applying neural nets to playing KSP. So it is no problem if it is completely pointless.

Others before have let a similar AI (based on neural nets) learn how to play Super Mario, and Google developed AlphaGo to beat even the best human Go players. Those two examples need a bit more "creativity" from the AIs, since there are no well-established solutions for their respective problems, unlike KSP, where people have already written kOS scripts to fly rockets to orbit.

And, besides the above, why should the AI fly east? You can achieve orbit in any direction; it is just a bit easier if you go east. But the AI probably doesn't know that yet, and depending on the scoring, it might never learn it (if the score is just about getting into orbit).
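The try-score-adjust loop described above can be illustrated in a few lines. This is an assumed toy example, not the project's code: the "environment" secretly rewards one particular pitch bias, and the learner never sees why a score is good, only the number:

```python
import random

def run_attempt(pitch_bias):
    # Stand-in for one flight attempt: an unknown environment that
    # happens to reward a pitch bias near 45 degrees. The learner
    # only ever sees the returned score, never this formula.
    return 10.0 - abs(pitch_bias - 45.0) / 10.0

def learn(attempts=500, seed=1):
    random.seed(seed)
    best_score, best_bias = float("-inf"), random.uniform(0, 90)
    for _ in range(attempts):
        # Explore behaviours near the best-known one so far.
        candidate = best_bias + random.gauss(0, 5)
        s = run_attempt(candidate)
        if s > best_score:  # good score => this behaviour becomes the new default
            best_score, best_bias = s, candidate
    return best_bias

print(learn())  # converges near 45
```

Even this crude hill-climbing converges here because the score landscape is smooth; a rocket launch has a far noisier landscape, which is part of why the stream's bot needs so many tries.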

Edited by Tullius
@Tullius true, as you said. But KSP simulates a very much more open-ended game than Chess or Go or even Super Mario. In fact, without any idea of what an orbit is, reaching orbit is close to impossible: humans only thought about orbits after deriving the principles of nature, and people didn't realistically dream of orbits until much later (sometimes not even today). Reaching orbit is pretty much limited to one certain way, and if anything, you are doing optimizations based on those principles.

If I'm not mistaken, the goal is to achieve both an altitude and sufficient speed - nothing is said about the vectors?

 

But hey, if this AI turns out to be the Newton of AIs, well, that's news...

2 hours ago, YNM said:

If I'm not mistaken, the goal is to achieve both an altitude and sufficient speed - nothing is said about the vectors?

The score that the AI gets as a result after each try is also based on periapsis and apoapsis height. Or put differently, the scoring system knows what an orbit is and gives the AI a score based on how good its achieved "orbit" was.

But sure, the scoring is much more difficult in the case of KSP than in those of Go or Super Mario. For Go, it could be binary (win = good, loss = bad); for Super Mario, one would probably base the score on the distance travelled before Mario died (since, after all, winning means travelling the whole length of a level).

Or, to compare again with Super Mario: while the AI doesn't know that it will die if it falls into the holes in the ground, it will notice after a couple of attempts that jumping over them allows it to travel much further than falling into them. And since travelling further is good, it will start jumping over the holes.
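A scoring function of the kind described, based on apoapsis and periapsis, might look like the hypothetical sketch below. The names, constants, and weighting are assumptions, not the project's actual code; the point is that the scorer knows what an orbit is while the AI only ever sees the final number:

```python
# Kerbin-like constants (metres above the surface).
ATMOSPHERE_TOP = 70_000

def score(apoapsis, periapsis):
    """Score one flight by how close it came to a stable orbit.

    Heights are in metres above the surface; a negative periapsis
    means the trajectory dips below the ground (suborbital).
    """
    # Reward raising the apoapsis towards space, capped at the edge
    # of the atmosphere so burning straight up forever doesn't pay.
    s = min(apoapsis, ATMOSPHERE_TOP) / ATMOSPHERE_TOP
    # Reward the periapsis the same way; suborbital flights get
    # nothing here, so only a real orbit reaches the full score.
    s += max(0.0, min(periapsis, ATMOSPHERE_TOP)) / ATMOSPHERE_TOP
    return s

print(score(80_000, 75_000))   # 2.0 - a stable orbit above the atmosphere
print(score(80_000, -100_000)) # 1.0 - high but suborbital
```

Shaping matters here: if only a finished orbit scored points, almost every early random attempt would get zero and the AI would have no gradient to follow.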

Edited by Tullius
3 hours ago, Tullius said:

The score that the AI gets as a result after each try is also based on periapsis and apoapsis height. Or put differently, the scoring system knows what an orbit is and gives the AI a score based on how good its achieved "orbit" was.

...

Or, to compare again with Super Mario: while the AI doesn't know that it will die if it falls into the holes in the ground, it will notice after a couple of attempts that jumping over them allows it to travel much further than falling into them. And since travelling further is good, it will start jumping over the holes.

The AI still seems not to have grasped how the change in Pe depends on where it points the thrust (and how it applies thrust), then. It will "take some time", I suppose.

22 minutes ago, YNM said:

The AI still seems not to have grasped how the change in Pe depends on where it points the thrust (and how it applies thrust), then. It will "take some time", I suppose.

The AI doesn't grasp anything; it just notices that certain combinations of actions lead to certain scores. If you look at the console output in the stream, you see after each try the line "fitness: ...", which is the score the AI received in the previous try (the higher, the better). At the moment, the highest achieved score is 6.9...

I am still unsure what the "adj fit" in the table means; maybe it is something like the average score achieved, thereby showing the progress. But I am really not sure; I just see that it is higher if I check back after a few hours.

So the AI might notice that certain directions of thrust lead to good scores, but it won't notice the behaviour of the Pe independently of everything else. Still, most of its tries ending in quick turns might suggest that it has noticed that thrusting horizontally leads to good scores, since that increases the periapsis.

And "it will take some time" is something like the motto of neural networks. It took AlphaGo a couple of months to go from barely the level of a good European Go player to beating one of the best human players, and that despite running on a supercomputer. In the experiment above, we are not even at 6000 tries.

But in the end, we all needed quite a few tries before achieving our first orbit in KSP. And we knew what an orbit is, and maybe we had also read up on how to achieve orbit in KSP. Now imagine someone who starts literally from zero, knowing absolutely nothing about the task ahead.

Edited by Tullius
5 hours ago, Tullius said:

The AI doesn't grasp anything ...

...

... "it will take some time" is something like the motto of neural networks. It took AlphaGo a couple of months to go from barely the level of a good European Go player to beating one of the best human players, and that despite running on a supercomputer. In the experiment above, we are not even at 6000 tries.

But in the end, we all needed quite a few tries before achieving our first orbit in KSP. And we knew what an orbit is, and maybe we had also read up on how to achieve orbit in KSP. Now imagine someone who starts literally from zero, knowing absolutely nothing about the task ahead.

Which is what most people do at first: point it up and see how far you can go XD (or, well, keep your rocket from "stalling"...)

All good points anyway. Take your time with the simulators!

Edited by YNM
1 month later...

Alright, I know, massive thread-digging here. But:

I... presume it's on a similar level of difficulty? If you were the developers, what would you set as a fairly universal goal?
