Machine learning


I took a look, and I am not impressed. The bot was flinging a simple rocket in random directions, usually at flat angles of attack. I haven't seen any course corrections, engine throttling, etc. All in all, our new cybernetic overlords are still not ready to grace us meatbags with their presence :P

33 minutes ago, rkarmark said:

Well, it doesn't seem to learn anything; it makes the same mistakes over and over again

More machine than learning

It's doing a lot better now than a few hours ago. Instead of all flights going horizontal, a few now go vertical.

The neural network (?) still has a hard time figuring out a compromise between the two.

This type of problem is solved by an unguided learning system, which requires a ton of training input. Even if you are using an NN approach for the guidance system, your best bet is a genetic algorithm. With that in mind, I wouldn't start with the actual game. There is no way to run hundreds of thousands of trials in a short amount of time. I'd start with a good simulation, which can run much faster if you don't waste time on rendering and approximate the rocket as rigid. A simulation like that can run an entire launch to orbit in a fraction of a second. That is where you can start learning. You can then hook it up to the game to display some of the better results with an actual flight.

P.S. For anyone who wants to do NNs or Deep Learning in Python, I strongly recommend TensorFlow. It's a modern tool, far superior to most Python NN libraries.
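To make the genetic-algorithm idea concrete, here is a minimal sketch of the loop described above. This is not the project's code; the `fitness` function is a stand-in for the fast physics simulation, and a "policy" is just a vector of numbers (e.g. NN weights):

```python
import random

def fitness(policy):
    # Placeholder for the fast launch simulation: here it just rewards
    # policies whose weights approach an arbitrary target vector.
    target = [0.5, -1.0, 2.0]
    return -sum((p - t) ** 2 for p, t in zip(policy, target))

def mutate(policy, rate=0.1):
    # Small Gaussian perturbation of every weight.
    return [p + random.gauss(0, rate) for p in policy]

def evolve(pop_size=50, generations=200, n_weights=3, seed=0):
    random.seed(seed)
    population = [[random.uniform(-2, 2) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Score every candidate, keep the top 20%, refill by mutation.
        population.sort(key=fitness, reverse=True)
        survivors = population[:pop_size // 5]
        population = survivors + [mutate(random.choice(survivors))
                                  for _ in range(pop_size - len(survivors))]
    return max(population, key=fitness)

best = evolve()
print(fitness(best))  # should be close to 0, the optimum
```

Because each fitness evaluation is an independent simulated launch, the whole population can also be evaluated in parallel, which is exactly why skipping rendering matters.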

Edited by K^2
I think there's something wrong with the evolution. The "AI" always tries either to constantly apply yaw / pitch (almost never both; it only keeps its heading thanks to perturbations, when it should really just go east) or to go straight vertical when close to no perturbation occurs.

I call for a redesign of the thing, especially regarding the goals - flying a basic gravity turn should be the standard, rather than reaching a certain altitude / velocity.

Edited by YNM
1 hour ago, YNM said:

I think there's something wrong with the evolution. The "AI" always tries either to constantly apply yaw / pitch (almost never both; it only keeps its heading thanks to perturbations, when it should really just go east) or to go straight vertical when close to no perturbation occurs.

I call for a redesign of the thing, especially regarding the goals - flying a basic gravity turn should be the standard, rather than reaching a certain altitude / velocity.

The whole idea is that the AI starts from nothing and doesn't even know what it is doing. It can play around with certain commands and gets a few bits of information (such as the navball, the altitude, etc.), but it doesn't know what those commands and readings mean, or what an orbit or a gravity turn is; it is supposed to learn all that by itself.

The only way the AI gets information about how well it is doing is after each attempt, when it gets a score. So the AI starts with completely random behaviour and gets feedback in the form of a score. If the score is good, it will be more likely to attempt something similar in the future; if the score is bad, it will be less likely.

For KSP, this is not the best idea, since there are well-established concepts for going to orbit, but it is a school project about applying neural nets to playing KSP. So it is no problem if it is completely pointless.

Others before have let a similar AI (based on neural nets) learn how to play Super Mario, and Google developed AlphaGo to beat even the best human Go players. Those two examples need a bit more "creativity" from the AIs, since there are no well-established solutions for their respective problems, unlike KSP, where people have already written kOS scripts to fly rockets to orbit.

And, besides the above, why should the AI fly east? You can achieve orbit in any direction; it is just a bit easier if you go east. But the AI probably doesn't know that yet, and depending on the scoring, it might never learn it (if the score is just about getting into orbit).
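The try-score-adjust loop described above can be illustrated in a few lines. This is an assumed toy example, not the project's code: the "environment" secretly rewards one particular pitch bias, and the learner never sees why a score is good, only the number:

```python
import random

def run_attempt(pitch_bias):
    # Stand-in for one flight attempt: an unknown environment that
    # happens to reward a pitch bias near 45 degrees. The learner
    # only ever sees the returned score, never this formula.
    return 10.0 - abs(pitch_bias - 45.0) / 10.0

def learn(attempts=500, seed=1):
    random.seed(seed)
    best_score, best_bias = float("-inf"), random.uniform(0, 90)
    for _ in range(attempts):
        # Explore behaviours near the best-known one so far.
        candidate = best_bias + random.gauss(0, 5)
        s = run_attempt(candidate)
        if s > best_score:  # good score => this behaviour becomes the new default
            best_score, best_bias = s, candidate
    return best_bias

print(learn())  # converges near 45
```

Even this crude hill-climbing converges here because the score landscape is smooth; a rocket launch has a far noisier landscape, which is part of why the stream's bot needs so many tries.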

Edited by Tullius
@Tullius true, as you said. But KSP simulates a very much more open-ended game than Chess or Go or even Super Mario. In fact, without any idea of what an orbit is, reaching orbit is close to impossible: humans only thought about orbits after deriving the principles of nature, and people didn't realistically dream of orbits until much later (sometimes not even today). Reaching orbit is pretty much limited to one certain way, and if anything, you are doing optimizations based on those principles.

If I'm not mistaken, the goal is to achieve both an altitude and sufficient speed - nothing is said about the vectors?

 

But hey, if this AI turns out to be the Newton of AIs, well, that's news...

2 hours ago, YNM said:

If I'm not mistaken, the goal is to achieve both an altitude and sufficient speed - nothing is said about the vectors?

The score that the AI gets as a result after each try is also based on periapsis and apoapsis height. Or put differently, the scoring system knows what an orbit is and gives the AI a score based on how good its achieved "orbit" was.

But sure, the scoring is much more difficult in the case of KSP than in those of Go or Super Mario. For Go, it could be binary (win = good, loss = bad); for Super Mario, one would probably base the score on the distance travelled before Mario died (since, after all, winning means travelling the whole length of a level).

Or, to compare again with Super Mario: while the AI doesn't know that it will die if it falls into the holes in the ground, it will notice after a couple of attempts that jumping over them allows it to travel much further than falling into them. And since travelling further is good, it will start jumping over the holes.
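A scoring function of the kind described, based on apoapsis and periapsis, might look like the hypothetical sketch below. The names, constants, and weighting are assumptions, not the project's actual code; the point is that the scorer knows what an orbit is while the AI only ever sees the final number:

```python
# Kerbin-like constants (metres above the surface).
ATMOSPHERE_TOP = 70_000

def score(apoapsis, periapsis):
    """Score one flight by how close it came to a stable orbit.

    Heights are in metres above the surface; a negative periapsis
    means the trajectory dips below the ground (suborbital).
    """
    # Reward raising the apoapsis towards space, capped at the edge
    # of the atmosphere so burning straight up forever doesn't pay.
    s = min(apoapsis, ATMOSPHERE_TOP) / ATMOSPHERE_TOP
    # Reward the periapsis the same way; suborbital flights get
    # nothing here, so only a real orbit reaches the full score.
    s += max(0.0, min(periapsis, ATMOSPHERE_TOP)) / ATMOSPHERE_TOP
    return s

print(score(80_000, 75_000))   # 2.0 - a stable orbit above the atmosphere
print(score(80_000, -100_000)) # 1.0 - high but suborbital
```

Shaping matters here: if only a finished orbit scored points, almost every early random attempt would get zero and the AI would have no gradient to follow.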

Edited by Tullius
3 hours ago, Tullius said:

The score that the AI gets as a result after each try is also based on periapsis and apoapsis height. Or put differently, the scoring system knows what an orbit is and gives the AI a score based on how good its achieved "orbit" was.

...

Or, to compare again with Super Mario: while the AI doesn't know that it will die if it falls into the holes in the ground, it will notice after a couple of attempts that jumping over them allows it to travel much further than falling into them. And since travelling further is good, it will start jumping over the holes.

The AI still seems not to have grasped how the change in Pe depends on where it points the thrust (and how it applies thrust), then. It will "take some time", I suppose.

22 minutes ago, YNM said:

The AI still seems not to have grasped how the change in Pe depends on where it points the thrust (and how it applies thrust), then. It will "take some time", I suppose.

The AI doesn't grasp anything; it just notices that certain combinations of actions lead to certain scores. If you look at the console output in the stream, you see after each try the line "fitness: ...", which is the score the AI received in the previous try (the higher, the better). At the moment, the highest achieved score is 6.9...

I am still unsure what the "adj fit" in the table means; maybe it is something like the average score achieved, thereby showing the progress. But I am really not sure; I just see that it is higher if I check back after a few hours.

So the AI might notice that certain directions of thrust lead to good scores, but it won't notice the behaviour of the Pe independently of everything else. Still, most of its tries ending in quick turns might suggest that it has noticed that thrusting horizontally leads to good scores, since that increases the periapsis.

And "it will take some time" is something like the motto of neural networks. It took AlphaGo a couple of months to go from barely the level of a good European Go player to beating one of the best human players, and that despite running on a supercomputer. In the experiment above, we are not even at 6000 tries.

But in the end, we all needed quite a few tries before achieving our first orbit in KSP. And we knew what an orbit is, and maybe we had also read up on how to achieve orbit in KSP. Now imagine someone who starts literally from zero, knowing absolutely nothing about the task ahead.

Edited by Tullius
5 hours ago, Tullius said:

The AI doesn't grasp anything ...

...

... "it will take some time" is something like the motto of neural networks. It took AlphaGo a couple of months to go from barely the level of a good European Go player to beating one of the best human players, and that despite running on a supercomputer. In the experiment above, we are not even at 6000 tries.

But in the end, we all needed quite a few tries before achieving our first orbit in KSP. And we knew what an orbit is, and maybe we had also read up on how to achieve orbit in KSP. Now imagine someone who starts literally from zero, knowing absolutely nothing about the task ahead.

Which is what most people do at first: point it up and see how far you can go XD (or, well, keep your rocket from "stalling"...)

All good points anyway. Take your time with the simulators!

Edited by YNM
1 month later...

Alright, I know, massive thread-digging here. But:

I... presume it's on a similar level of difficulty? If you were the developers, what would you set as a fairly universal goal?
