
WOOHOO I'm not dead (and I've actually been working a bit)


Just wanted to let you guys know, I think I have a little something that seems to be working. The end result is actually very bad, but I think that's mostly due to a lack of data (and maybe my skills too). That shouldn't be a problem for too long (I hope).

Anyways, if you guys are interested, I'm going to be streaming what I found and explaining it tomorrow on Twitch: 

So: May 12th, at 1:00 pm UTC (I live in Taiwan at the moment, and a lot of the people I know who play KSP are in France, so that's the time that fits everyone best).
That makes it: 9am EDT and 6am PDT for the folks in the US.

Hope you guys will be interested in this! :)

 

Though, I should warn you: don't get your hopes up. This is mostly a first test of linking KSP, Python and some machine learning together, and it's faaaaaaaaaaaaar from perfect.
(I promise I didn't harm any Kerba... Nah, nobody believes that, you know how Kerbal science is done)

 


4 hours ago, Scotius said:

Are you familiar with this channel? Some pretty impressive results - aside from the fact that Pacman isn't nearly as complicated as KSP.



 

I still say a rocket launch is way simpler than Pacman, as you only have to handle staging, the gravity turn and circularization, and obviously keep the rocket stable.
Starting with an SSTO-capable rocket or auto-staging would help, although staging would be pretty easy to learn: when thrust reaches zero, stage; staging before that is a failure, and staging too late loses you speed.
The gravity turn and correct inclination are analog, but follow pretty simple rules.

One fun thing with AIs is that you can run them against each other. This could be used as a way to balance a PvP game, if you add human-level delays and lag to the system.
And you could use the AI for an NPC :)


Y'know... I have a niggling suspicion... that the first truly sentient AI will be part of a game, maybe twenty years down the line. And it will be a complete surprise for EVERYONE when a random NPC all of a sudden starts behaving like a self-aware individual, asking questions and refusing to fight the player like a good little piece of code should :P


38 minutes ago, Scotius said:

Y'know... I have a niggling suspicion... that the first truly sentient AI will be part of a game, maybe twenty years down the line. And it will be a complete surprise for EVERYONE when a random NPC all of a sudden starts behaving like a self-aware individual, asking questions and refusing to fight the player like a good little piece of code should :P

And you get a murder charge for killing him.
Radiant AI in The Elder Scrolls IV: Oblivion was pretty fun, but could have catastrophic effects.
It was a rule-based AI, but it depended on your relations with various factions. Say you were high rank in the Mages Guild, so they would help you; since they often used area-effect spells like fireballs, this tended to hit other groups, including guards, who turned hostile after three hits.
Yes, you could get civil wars, so this was toned down a lot in Skyrim. On the other hand, in Oblivion you could make goblin warlords (high-level bosses) permanently hostile to the goblin faction, which was high-level popcorn time, even more so as they would return to the goblin base after some time :)

 

 


Yeah, I've been following Code Bullet for a while!

But, as you pointed out, KSP is quite different from Pacman, or the other games he plays, because he usually makes the environment himself, whereas I'm using one that already exists: KSP.
So I'm trying to work around the limitations of KSP, which clearly wasn't made with people building AIs in mind. But that's what makes it an interesting project!

I did the live-stream, thanks to those who tuned in!
I have a recording of it, but it's not on Twitch; I don't know if I can upload it now that the stream is over. (It was my first time streaming, so of course a lot of things didn't go as expected!)

Anyways, the take-away is that my overall program runs for the moment, but it's just not actually training the way I'd like it to. So there's going to be a lot of work figuring out what the issue is and fixing it. I'll post a bit more later, I think.


@Jirokoh How about doing the basic training of your machine learning program as a simulation inside a simulation? The physics of vacuum rocketry are fairly basic to model, and you could account for the varying thrust and ISP with altitude. The big kicker is:

flight().simulate_aerodynamic_force_at(body, position, velocity)

If you did that, you could run hundreds of iterations without leaving the launchpad. At least well enough to set you up to do fewer, more accurate iterations on real flights.
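
Something like this, maybe (a minimal sketch with the kRPC Python client, assuming a running kRPC server and a vessel sitting on the pad; position_at_altitude and simulate_aerodynamic_force_at are kRPC SpaceCenter API calls, everything else is illustrative):

import math
import krpc

conn = krpc.connect(name='aero-sampler')
vessel = conn.space_center.active_vessel
body = vessel.orbit.body
ref = body.reference_frame
flight = vessel.flight(ref)

# Sample the aero model straight up above the pad, without ever flying.
lat = vessel.flight().latitude
lon = vessel.flight().longitude
speed = 200.0  # m/s of pretend vertical velocity

for alt in range(1000, 30001, 2000):
    pos = body.position_at_altitude(lat, lon, alt, ref)
    r = math.sqrt(sum(p * p for p in pos))
    vel = tuple(p / r * speed for p in pos)  # radially outward in the body frame
    print(alt, flight.simulate_aerodynamic_force_at(body, pos, vel))

You could then train against the tabulated forces instead of flying every iteration.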


4 hours ago, FleshJeb said:

@Jirokoh How about doing the basic training of your machine learning program as a simulation inside a simulation? The physics of vacuum rocketry are fairly basic to model, and you could account for the varying thrust and ISP with altitude. The big kicker is:

flight().simulate_aerodynamic_force_at(body, position, velocity)

If you did that, you could run hundreds of iterations without leaving the launchpad. At least well enough to set you up to do fewer, more accurate iterations on real flights.

That did cross my mind at some point.

There are multiple drawbacks to that:

- Even if it's not *that* hard on paper, it's still going to take an awful lot of time for me to do. Because, in programming, just like in KSP, things don't quite go well on the first try :P

- It has to be very similar to the environment of KSP. The AI is going to learn how things work in the environment it has trained in; if the environment it's then moved into changes a bit, I'm afraid it would get totally lost. That's more related to the way machine learning models work: they are very sensitive to this kind of change, and usually we try to avoid it as much as possible. But since I'm only using altitude, speed and throttle at this point, it might well be feasible. The big issue would be when trying to scale this up. Basically, I'd have to remake KSP from scratch if I want to do this ^^ It could also be an interesting benchmark to test the machine learning model itself: not necessarily to train it for KSP, but just to check that it does learn properly in this environment. Because right now, I don't know whether my problem actually is a lack of data. It might be, but it might also not be.

But the big benefit is indeed that once it's working, it would speed up training by multiple orders of magnitude. (And it would probably be a great exercise for me to try to implement that.)

The idea is quite tempting, at least for this easy first case. I'll have a more in-depth look at what it would take to do this, because on paper it is a good idea, and I'll let you guys know.

(I just have a thesis to finish at the moment, so I'm not sure I'm going to have a lot of time to devote to this, unfortunately.)


I'm a bit late to the game here, but here are my thoughts:

  • Evolutionary algorithms take a lot of generations. Given that making it to orbit takes a few minutes, testing parameters over hundreds of launches can take weeks.
  • It's already fun in kRPC to build a PID controller and see it control your rocket!
  • Once you have your PID controllers working, work on a launch platform where your pitch angle is a function of... something. Time, altitude, velocity? You pick. Then use machine learning to figure out what the right parameter for your function (say, 1.2° per 1000 m) is - see the sketch below this list.
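
A rough sketch of that last idea in pure Python - the 1.2°-per-1000 m slope and the PID gains are placeholders for whatever the machine learning ends up finding:

def pitch_program(altitude_m, deg_per_1000m=1.2):
    """Target pitch-over angle from vertical, capped at 90 degrees."""
    return min(90.0, altitude_m / 1000.0 * deg_per_1000m)

class PID:
    """Textbook PID controller, e.g. for driving throttle toward a target speed."""
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

The learning problem then shrinks to searching over deg_per_1000m (and maybe the gains) instead of raw control inputs.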

I had a similar thought. Teaching an AI anything takes time. Teaching it something really complicated takes an awful lot of time. But what if we cut the lesson into smaller pieces? For example: first teach the AI how to launch vertically - let it learn how to control thrust and steering to reach the maximum altitude possible. When the AI masters this task, teach it how to reach the maximum distance from the point of launch. At some point it should be able to hit the jackpot and circularise at 70 km. :)
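
In pseudo-Python, that staged setup could look like this (run_episode, agent.learn and the telemetry keys are all illustrative, not any real API):

CURRICULUM = [
    ("max_altitude",  lambda telem: telem["apoapsis"]),                 # straight up
    ("max_downrange", lambda telem: telem["downrange"]),                # go far
    ("circularise",   lambda telem: -abs(telem["periapsis"] - 70_000)), # orbit at 70 km
]

def train(agent, run_episode, episodes_per_stage=50):
    for name, reward_fn in CURRICULUM:
        for _ in range(episodes_per_stage):
            telemetry = run_episode(agent)     # one launch, telemetry dict back
            agent.learn(reward_fn(telemetry))  # reward shaped for this stage only
        print("mastered stage:", name)

In practice you'd promote to the next stage on a success criterion rather than a fixed episode count.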


On 5/14/2019 at 11:09 AM, Kerbart said:

I'm a bit late to the game here, but here are my thoughts:

  • Evolutionary algorithms take a lot of generations. Given that making it to orbit takes a few minutes, testing parameters over hundreds of launches can take weeks.
  • It's already fun in kRPC to build a PID controller and see it control your rocket!
  • Once you have your PID controllers working, work on a launch platform where your pitch angle is a function of... something. Time, altitude, velocity? You pick. Then use machine learning to figure out what the right parameter for your function (say, 1.2° per 1000 m) is.

Well, that's basically what I'm doing already: kRPC is controlling everything, and the only parameter is throttle for the moment. Pitch will come later, once I have something working for just thrust. This is more of a test bench to see if the approach can actually work.

And it's not *really* an evolutionary algorithm - those work a bit differently - but the point about data is still pretty much the same. That is the biggest problem I have at the moment.
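
For the curious, the kRPC side of "only throttle, straight up, for a few seconds" looks roughly like this sketch (policy() stands in for whatever the model outputs; it assumes a running kRPC server):

import time
import krpc

def policy(altitude, speed):
    return 1.0  # placeholder for the learned model

conn = krpc.connect(name='throttle-test')
vessel = conn.space_center.active_vessel
flight = vessel.flight(vessel.orbit.body.reference_frame)

vessel.control.activate_next_stage()  # launch
start = time.time()
while time.time() - start < 8.0:      # the short test window
    vessel.control.throttle = policy(flight.mean_altitude, flight.speed)
    time.sleep(0.2)                   # ~5 samples per second
vessel.control.throttle = 0.0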

On 5/14/2019 at 2:24 PM, Scotius said:

I had a similar thought. Teaching an AI anything takes time. Teaching it something really complicated takes an awful lot of time. But what if we cut the lesson into smaller pieces? For example: first teach the AI how to launch vertically - let it learn how to control thrust and steering to reach the maximum altitude possible. When the AI masters this task, teach it how to reach the maximum distance from the point of launch. At some point it should be able to hit the jackpot and circularise at 70 km. :)

Again, that's why I'm only trying to teach it to throttle, going straight up, for only 7-8 seconds ;)

And to be honest, I don't think I'll be able to transfer the learning quite that easily.


6 hours ago, Scotius said:

Just like teaching a human child something, eh? :) Or maybe rather training a dog.

Well, this analogy works, but only up to a point. Humans are stupidly efficient in the way they learn compared to even the best artificial intelligence we can make today. So, at the level where I'm trying to make this, it's really super basic things. I think we should keep in mind these are analogies: they're good at explaining the big picture when first trying to understand these concepts, but they rapidly become irrelevant when trying to build or debug implementations of these algorithms.

At the moment, the issue I have is more about getting enough data, and being sure that my model is actually learning what I want it to, so this is really starting to go outside the boundaries of such an analogy.

But again, these analogies are great for explaining in simple terms how machine learning works in general! We should just be careful not to push them too far - that becomes anthropomorphism, which quickly stops being relevant. There's still so much to do before we can even consider AIs to be really learning like humans. Even more in my case: this is really a dumb program; it's basically trying to find correlations and a local minimum over three (for the moment) input parameters.


On 5/18/2019 at 6:27 AM, Jirokoh said:

Humans are stupidly efficient in the way they learn compared to even the best artificial intelligence we can make today.

Humans have the advantage of billions of years of evolution. Lots and lots of stuff is hard-coded into us regarding the laws of physics and our environment. When faced with ridiculously hard problems, we instinctively guess that the solution is going to be something realistic for the environment we evolved in. This works almost all the time in the real world, allowing us to solve problems AIs completely fail at. However, it's also the basis of most optical illusions, like forced perspective.

[Two example images of forced-perspective optical illusions]


On 5/22/2019 at 2:12 PM, mikegarrison said:

Lots and lots of stuff is hard-coded into us regarding the laws of physics and our environment.

I'm not sure about that. From what I understand, it's rather that we have a hugely adaptable brain that can learn very efficiently and pick up correlations very easily. It feels like the brain is an awesome correlation-finding machine, which in turn allows us to learn how our environment works, rather than that knowledge being hard-coded. That's why blind or semi-paralyzed people, among others, can learn to interact with the world in a totally different way than you and I do: it's adaptation rather than rules. Just like good code: it scales to new problems.
But we kind of digress here. (Still, a super interesting topic, which I know way too little about compared to people working in the field. So, as with the rest, don't trust what I write here :D)


3 hours ago, Jirokoh said:

I'm not sure about that. From what I understand, it's rather that we have a hugely adaptable brain that can learn very efficiently and pick up correlations very easily. It feels like the brain is an awesome correlation-finding machine, which in turn allows us to learn how our environment works, rather than that knowledge being hard-coded. That's why blind or semi-paralyzed people, among others, can learn to interact with the world in a totally different way than you and I do: it's adaptation rather than rules. Just like good code: it scales to new problems.
But we kind of digress here. (Still, a super interesting topic, which I know way too little about compared to people working in the field. So, as with the rest, don't trust what I write here :D)

Yes, we are very adaptable, but that adaptability is applied on top of evolutionary hard-coding. An example: people are instinctively afraid of lions and tigers, but they need to be taught to be afraid of walking into the street without looking for cars. Yet it's actually the latter that will kill you in modern life, not the former.

You are right that the brain is adaptable enough to adapt to the available input and re-purpose things like language (which most people learn by listening to and making sounds, but which can be learned by watching and making hand movements). But that's kind of like saying a car can be re-purposed as an off-road dune buggy. Yes, people are very adaptable, but they adapt mechanisms they already have available to start with.

Not everybody agrees on how much of our brain machinery is newly learned versus hard-coded and adapted. I encourage you to read about the subject if you are interested. One place to start could be Steven Pinker's How The Mind Works. Pinker is on one side of the discussion -- maybe somebody else can suggest a good book on the other side.


Again, I think we're going off track here.

And, to be honest, I have some doubts about being instinctively afraid of a lion. I'm no expert, so I don't want to go deeper into this sort of discussion; it's only going to go further into what probably neither of us knows much about.

Thanks for the book recommendation though, I'll have a look!

 

In other news, still pretty busy at the moment, so I haven't gone back to improving anything or making any progress. I'll keep you guys posted!


On 5/30/2019 at 3:46 AM, Jirokoh said:

It's only going to go further into what probably neither of us knows much about.

Hmm. Well, I'm not a cognitive scientist, but I've been reading about this stuff seriously for about 30 years, so actually I think I do know quite a bit about it.


On 6/1/2019 at 3:25 PM, James Kerman said:

Congratulations @Jirokoh and Bertrand, you both have been awarded thread of the month.

(Side note should your AI achieve sentience: I never laughed at your learning process, Bertrand)

Thanks a lot to those who have been following this! :)
Hopefully in the next decade uh, weeks or months, I'll actually have something working!

On 6/2/2019 at 11:11 AM, Cheif Operations Director said:

You could most likely mod KSP to have (for the most part) real-world physics, and then transfer the movements of your rocket into Python. That way your rocket moves accurately. I'm not sure how the second part would work, as I don't code in any meaningful sense.

I'm not sure I get your idea here?
The goal is to have something learn to fly in KSP, not in the real world (though, if people have waaaay too much money and don't know what to do with it, my DMs are open, just sayin').
So I would rather have to make an environment (understand: code up said laws of physics - gravitation, atmospheric drag, inertia, you name it) that mimics KSP but runs much, much faster, in order to speed up the training process, while still being close enough to KSP to then transfer the learning into it.

No problem, all ideas are welcome :-)

21 hours ago, mikegarrison said:

Hmm. Well, I'm not a cognitive scientist, but I've been reading about this stuff seriously for about 30 years, so actually I think I do know quite a bit about it.

Well, my bad then!


On 5/18/2019 at 8:27 PM, Jirokoh said:

Humans are stupidly efficient in the way they learn compared to even the best artificial intelligence we can make today.

We also have tens of millennia of experience, and there are billions of us.

Most AIs are a dust mote compared to the whole Homo experience - we've even (somewhat) transcended biological traits by having a way to communicate complexly - complex language - which we've had for the last ten millennia.

AI learning isn't that different from human learning. Try teaching, I don't know, quantum mechanics in its full mathematical expression to a toddler, and you'd have problems.

Either the bar has to be set low, or the bar has to start low and then be raised successively.


The biggest issue I foresee with this is finding a series of appropriate loss functions and training methods, so that many minutes of flight doesn't just give the network a single piece of feedback, and so that you avoid creating inscrutable learning cliffs (such as making payload-to-orbit your loss function from the start) or situations where suboptimal local minima are quickly found. Reward hacking could also be an issue: since KSP has reproducible bugs that result in very high velocities and altitudes, another question is whether the network would learn to ignore rocketry and automate Danny2462 videos instead.
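
For the sparse-feedback half of that, the usual fix is a dense, shaped reward. A sketch (every telemetry key and weight here is made up for illustration):

def shaped_reward(state, prev):
    """Per-physics-tick reward instead of one number per flight."""
    r = 1e-4 * (state["altitude"] - prev["altitude"])      # reward climbing
    r -= 0.01 * max(0.0, state["angle_of_attack"] - 10.0)  # discourage flopping over
    if state["apoapsis"] > 70_000 and state["periapsis"] > 70_000:
        r += 100.0                                         # jackpot: orbit
    return r

Reward hacking then becomes a matter of clamping any term a Kraken drive could exploit, e.g. capping the altitude delta per tick.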

 

My experience says that long missions giving one simple piece of training data are not going to give you a lot of useful info. In one of the early BAD-T challenges I tried to *evolve* a fighter:


 

Not just tuning the AI, but mutating files that would adjust its physical structure (it was entirely made of P-wings, so giving it full control over every single parameter, plus the ability to add, copy, and remove parts and subtrees, was not that unreasonably difficult).

 

So I did exactly that. I wrote a program to store all of them, and started auto-adjusting files and running battles. Losers die. Winners reproduce with mutations.

 

But the problem is that this means fitness = combat ability. And while the starting point was, well, not unflyable, it certainly wasn't clearing the skies in record time.

So I would suggest that whatever you do, you don't want to give a very weak optimizer one bit of feedback on its decision-making every 10 or 20 minutes. Generations took 4 hours or more to run, and still involved some manual clicking. By the time I was on generation 4, I had decided that the generation was only a marginal improvement over the base design, and I did not want to spend months of actual run time optimizing a fighter against itself.

 

 

One suggestion would be to start completely outside KSP. Write the minimum possible rocket sim, where you have absolute control and high simulation speeds. The "ground" is just a check to see whether the distance to the planetary core is less than or equal to 600 km. Kraken/Far Lands bug? Just use doubles. Then code the world's simplest atmosphere model. Don't even bother coding lift; just give your rockets a "stability" variable, two axes of angular inertia, a dry mass, a wet mass, a thrust, a maximum torque per axis, and a specific impulse. Vary your rocket specs a little on each flight so your AI doesn't get used to one particular vehicle, and the learning will transfer more smoothly to KSP.

Once it is trained enough to do something in your rocket sim, just copy the weights and biases over to the KSP version and watch the training slow down by a factor of like a hundred thousand. Want to see it do stupid things IN KSP instead of in a terminal? Just plug the stupid networks into the KSP version. But training in KSP at 25 physics fps, where you could likely be getting 2 million, seems inefficient. I would only use KSP for what would be time-prohibitive to write a much faster version of.
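
A bare-bones version of that sim, 1D vertical flight with an exponential atmosphere; the constants are roughly Kerbin-like but purely illustrative:

import math

G0 = 9.81           # m/s^2, for Isp conversion
MU = 3.5316e12      # m^3/s^2, Kerbin-like gravitational parameter
R_BODY = 600_000.0  # m, "ground" = distance to core <= 600 km
RHO0 = 1.225        # kg/m^3, sea-level density
SCALE_H = 5_600.0   # m, atmosphere scale height

def fly(dry_mass, wet_mass, thrust, isp, drag_coef_area, throttle_fn, dt=0.02):
    """Simulate a vertical launch; throttle_fn(t, alt, v) returns a value in [0, 1]."""
    m, r, v, t, max_alt = wet_mass, R_BODY, 0.0, 0.0, 0.0
    while r >= R_BODY and t < 600.0:
        alt = r - R_BODY
        max_alt = max(max_alt, alt)
        rho = RHO0 * math.exp(-alt / SCALE_H)
        throttle = throttle_fn(t, alt, v) if m > dry_mass else 0.0
        force = thrust * throttle                         # engine
        force -= 0.5 * rho * v * abs(v) * drag_coef_area  # drag, opposes motion
        force -= MU * m / (r * r)                         # gravity
        v += force / m * dt
        r += v * dt
        m = max(dry_mass, m - thrust * throttle / (isp * G0) * dt)
        t += dt
    return max_alt  # crude fitness for a "go straight up" training stage

fly(1000, 5000, 120e3, 300, 2.0, lambda t, alt, v: 1.0) runs one whole launch in a fraction of a second of wall time, instead of minutes in KSP.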


Well, I've been thinking about coding a "simple" KSP sim, but I'd like to see if I can't solve the problem in KSP first.

Because, while this solution might work for this easy problem, it's still only the first problem I want to solve. If I can make this work, I'd like to go further and tackle more complicated problems. That would mean also making my simulation more complicated, to train on those harder tasks. But, maybe, why not? I think I'll have to wait until I have more time on my hands, so maybe only in a few months!

 

23 hours ago, Pds314 said:

My experience says that long missions giving one simple piece of training data are not going to give you a lot of useful info.

While I am getting one data point for every run, each run is only 12 to 13 seconds long, and during one run I actually get 6-7 measurement points. So it's not that bad. The real issue is the memory leak, which means I can't scale my data scraping over long periods of time. If I could, I'd just leave my PC running overnight and collect data that way. It would be slow, but it would work.
Fixing this issue is what I'd like to do, because it would then let me scale to more complicated problems using the same method, even if each run would be longer for those problems.
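
One workaround I could try until the leak itself is found: append every run's samples to disk, so data collection survives game restarts (all names here are illustrative):

import csv
import time

def log_run(samples, path="runs.csv"):
    """samples: iterable of (t, altitude, speed, throttle) tuples from one run."""
    run_id = time.time()  # crude unique id per run
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for t, altitude, speed, throttle in samples:
            writer.writerow([run_id, t, altitude, speed, throttle])

A wrapper script could then, in principle, restart KSP every N runs and keep appending to the same file.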
