
Alpha Zero Chess AI


Shpaget

Recommended Posts

Ok, I'm impressed.

A few days ago Google's DeepMind and its Alpha Zero AI (successor to the AlphaGo that recently beat Lee Sedol at Go) played against one of the best chess engines around - Stockfish. The results are quite stunning.

https://www.arxiv-vanity.com/papers/1712.01815v1/

There were 100 games played in total. Out of the 50 games Alpha Zero played with white, it won 25 and drew the other 25. It played the other 50 games with the black pieces, winning 3 and drawing 47, meaning that Alpha Zero didn't lose a single game. Yeah, you might say, someone just made a better chess engine. Anybody can add more processing power and brute-force better chess-playing software that can explore more possible moves. But you'd be wrong. That is not what happened. This is so much more interesting.

Here's the best part: the only knowledge the AI was given was the basic rules of chess. It was not fed theory, opening moves or endgame tactics, just the rules. It was then given just four hours to master the game (playing against itself, using trial and error) and then thrown against one of the strongest opponents out there.

I'm not much of a chess player myself, but if you have ever tried playing against a chess engine, you'll know that an average human player stands no chance. It has been a fact of life, ever since Kasparov was beaten by a machine some 20 years ago, that chess engines are far superior, even compared to grandmasters. But watching these games being played is something completely different. Chess engines are usually materialistic, valuing the number of pieces on the board. Alpha Zero, on the other hand, gives the impression that it's perfectly happy to sacrifice pieces for positional superiority. And that it does gloriously.

Go and watch some of the games between the two; they are amazing, even if you have only a superficial knowledge of chess.


I played chess (in a low league) when Kasparov was defeated by Deep Blue. No, an average player has no chance against even a mediocre chess algorithm, just as an average human has no chance calculating Pi to [insert n element N] digits against even a Zilog Z80 running a simple little program.

So, the engines are playing against each other now ? Like car racing ... Mercedes against Ferrari without drivers ?

Wake me when the same program gets up, cooks a meal, chats with a friend, draws a picture, plays the guitar, etc. pp., all the things a human does during the course of a day.

 

I am not 100% serious here, I love technology!

Edited by Green Baron

42 minutes ago, Green Baron said:

Wake me when the same program gets up, cooks a meal, chats with a friend, draws a picture, plays the guitar, etc. pp., all the things a human does during the course of a day.

One might say the same of some human chess grandmasters.  Especially the part involving friends, pictures, and guitars...


1 hour ago, Shpaget said:

... It was then given just four hours to master the game (playing against itself, using trial and error) and then thrown against one of the strongest opponents out there. ...

How many games was that ? 4000 ?

 

Also, if only I could have a clone!


But yeah, AI is quite a thing. Maybe Google should get AIs to be their employees, and AIs to be their owners...


Years ago, many would have said the equivalent of what @Green Baron said about cooking a meal or chatting, but would have said, “Wake me when a computer can play chess.” 

The bar keeps moving, until at some point crossing it is an “AI complete” problem, I suppose. These learning systems really move the project closer to the finish line, imho, and make predicting a timeline harder.

What happens when even a narrow learning system learns to code learning systems? This could be just a tool that leverages a human coder, too.

 


1 hour ago, PakledHostage said:

In what way? Maybe I am a bit thick but I don't understand how this relates to your previous sentence?

If they develop code that makes coders X times more efficient, then those coders can achieve X years of progress per year. Say AGI were 100 years of human coding away, but code was developed that could help with the task of coding, making coders 10X more time-efficient. Then the 100 years we have until we need to worry about AI just became 10 years. More likely shorter, since before the coders plus intelligent systems came up with AGI, they'd come up with better intelligent coding systems.
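The arithmetic of that argument fits in a few lines of Python (a toy sketch; the 100-year horizon and the 10X/doubling factors are illustrative numbers, not estimates):

```python
# Toy model: N work-years of "human-equivalent" coding remain until AGI,
# but tooling multiplies how many work-years each calendar year delivers.
def years_until_done(work_years_remaining: float, speedup: float) -> float:
    """Calendar years needed if every calendar year delivers `speedup` work-years."""
    return work_years_remaining / speedup

# 100 work-years at a flat 10x speedup -> 10 calendar years.
print(years_until_done(100, 10))  # 10.0

# "More likely shorter": if the tooling itself improves (say the speedup
# doubles every year), the calendar time shrinks further.
def years_with_compounding(work_years: float, speedup: float, growth: float) -> int:
    calendar_years = 0
    done = 0.0
    while done < work_years:
        done += speedup        # this year's effective output
        speedup *= growth      # next year's tooling is better
        calendar_years += 1
    return calendar_years

print(years_with_compounding(100, 10, 2.0))  # 4 -- compounding beats the flat estimate
```

The second function is the whole point: the flat 10X estimate is an upper bound, because better intelligent coding systems keep arriving before the work is finished.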

 

Edited by tater

4 hours ago, tater said:

What happens when even a narrow learning system learns to code learning systems?

It will hang out on programmers' forums and spend 100% of its time in holy wars like "Java vs C#", "Windows vs Linux", etc.
A decade later these forums will be completely auto-generated by the AI.
And its bots will be banning any human user with a "not-a-bot" mark.

Edited by kerbiloid

The team at Vicarious AI that successfully used machine learning to defeat CAPTCHAs in 2013 recently revealed their methods (first published in Science).

Quote

RCN (recursive cortical network) was effective in breaking a wide variety of CAPTCHAs with very little training data and without using CAPTCHA-specific heuristics. By comparison, a convolutional neural network required a 50,000-fold larger training set and was less robust to perturbations to the input.


 


7 hours ago, YNM said:

How many games was that ? 4000 ?

From the article, Table S3: Selected statistics of AlphaZero training in Chess, Shogi and Go. 

44 million games in total over 9 hours, but it's unclear how many were played in the first four hours (time it took to beat Stockfish).


7 minutes ago, Scotius said:

So, roughly 5 million games per hour. I don't think you will find a human being able to achieve anything similar during his or her entire lifetime. IMO it's still brute-forcing, just a bit more sophisticated.

Well, humans generally have other skilled humans to learn from…who learned from other skilled humans, et cetera. This algorithm learned from scratch playing only against (a similarly unskilled copy of) itself. If you count the games played through the entire mentorship lineage of a given grandmaster, you'll likely find something greater than 44 million.

With progressively more sophisticated learning algorithms, we can probably get that number down even further. Perhaps to a few thousand.


Another interesting aspect is that Stockfish calculates around 70,000,000 positions per second, while Alpha Zero evaluates "only" 80,000. Apparently it uses a much more focused search and doesn't waste time on moves that have no future.

It plays moves that Stockfish doesn't even consider to be good, yet just a couple of moves later they turn out to be amazing, leading to exceptional control and presence on the board.
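The paper describes that focused search as Monte Carlo tree search with PUCT selection. Here's a minimal sketch of just the mechanism on a toy game (one-pile Nim: take 1 or 2 stones, whoever takes the last stone wins). Uniform priors and random rollouts stand in for the policy and value networks, and all names are made up for the example, so this shows the shape of the search rather than AlphaZero itself:

```python
import math
import random

class Node:
    def __init__(self):
        self.children = {}      # move -> Node
        self.visits = 0
        self.value_sum = 0.0    # from the perspective of the player to move here

def legal_moves(stones):
    return [m for m in (1, 2) if m <= stones]

def rollout(stones):
    # Random playout; a stand-in for the value network. Returns +1 if the
    # player to move from this position wins the playout, -1 otherwise.
    sign = 1
    while True:
        stones -= random.choice(legal_moves(stones))
        if stones == 0:
            return sign         # the player who just moved took the last stone
        sign = -sign

def best_move(stones, simulations=3000, c_puct=1.5):
    root = Node()
    for _ in range(simulations):
        node, state, path = root, stones, [root]
        # Selection: PUCT trades off the value estimate (q) against an
        # exploration bonus (u) weighted by the prior.
        while node.children:
            prior = 1.0 / len(node.children)   # uniform stand-in for the policy net
            parent_visits = node.visits
            def puct(child):
                q = child.value_sum / child.visits if child.visits else 0.0
                u = c_puct * prior * math.sqrt(parent_visits + 1) / (1 + child.visits)
                return -q + u   # child's value is from the opponent's view, hence -q
            move, node = max(node.children.items(), key=lambda mc: puct(mc[1]))
            state -= move
            path.append(node)
        # Expansion / evaluation.
        if state == 0:
            value = -1.0        # no stones left: the player to move already lost
        else:
            for m in legal_moves(state):
                node.children[m] = Node()
            value = rollout(state)
        # Backpropagation, flipping perspective at each level.
        for n in reversed(path):
            n.visits += 1
            n.value_sum += value
            value = -value
    return max(root.children.items(), key=lambda mc: mc[1].visits)[0]

random.seed(0)
print(best_move(5))   # 2 -- leaving 3 stones is a lost position for the opponent
```

The `-q + u` score is why the search is "focused": moves whose subtrees keep returning good values accumulate visits, so simulation effort piles onto a few promising lines instead of being spread over every legal continuation.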


2 hours ago, Shpaget said:

From the article, Table S3: Selected statistics of AlphaZero training in Chess, Shogi and Go. 

44 million games in total over 9 hours, but it's unclear how many were played in the first four hours (time it took to beat Stockfish).

Way beyond anything humankind could do.

1 hour ago, 0111narwhalz said:

Well, humans generally have other skilled humans to learn from…who learned from other skilled humans, et cetera. This algorithm learned from scratch playing only against (a similarly unskilled copy of) itself. If you count the games played through the entire mentorship lineage of a given grandmaster, you'll likely find something greater than 44 million.

With progressively more sophisticated learning algorithms, we can probably get that number down even further. Perhaps to a few thousand.

If it's still semi-random then you'll still need a lot. Also, maybe there have been more human matches than the 44 million, but how many of those matches are entirely unique? (I know how many are possible in chess - but it's humans with similar starting minds we're talking about here, so I expect something like the constellations.)

49 minutes ago, Shpaget said:

Another interesting aspect is that Stockfish calculates around 70,000,000 positions per second, while Alpha Zero evaluates "only" 80,000. Apparently it uses a much more focused search and doesn't waste time on moves that have no future.

It plays moves that Stockfish doesn't even consider to be good, yet just a couple of moves later they turn out to be amazing, leading to exceptional control and presence on the board.

It's a neural net, right? It doesn't even know if it's good or not. All it knows is that it used to work. But you know what that means.


 

9 minutes ago, YNM said:

It's a neural net, right? It doesn't even know if it's good or not. All it knows is that it used to work. But you know what that means.

Yes, it is a neural network, but I don't agree with you that it knows something worked before.

It is a virtual certainty that in those 44 million games of training it did not repeatedly find itself in the same position and played out every possible move, and every response to every possible move by the opponent. The database would just be too large.

When it comes to chess, 44 million games doesn't even scratch the Shannon number, which Wikipedia says "is a conservative lower bound (not an estimate) of the game-tree complexity of chess of 10^120".

https://en.wikipedia.org/wiki/Shannon_number

What I mean to say is that it doesn't perform well because it already played that specific game in every imaginable way and knows what worked before and what didn't. It has to evaluate the position on the board at every move because, most likely, it is the first time it has encountered that situation.
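The mismatch is one line of arithmetic:

```python
# Back-of-envelope: what fraction of chess's game tree could 44 million games
# even touch? (The Shannon number is a conservative lower bound on game-tree
# complexity, per the Wikipedia article linked above.)
shannon_lower_bound = 10 ** 120
training_games = 44_000_000

fraction = training_games / shannon_lower_bound
print(f"{fraction:.1e}")   # 4.4e-113 -- memorizing positions is hopeless; it has to generalize
```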


24 minutes ago, YNM said:

If it's still semi-random then you'll still need a lot.

That's the thing, though: training techniques no longer need to depend on blind randomness. Gradient descent, for example, uses a little calculus magic and finds local minima rapidly and deterministically. Add in just a little simulated annealing (which does reintroduce some randomness) and you have a good chance of finding the global minimum.
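Both techniques fit in a few lines of Python. A minimal sketch on a made-up one-variable function (nothing AlphaZero actually does; the function, step sizes, and temperatures are all invented for the example):

```python
import math
import random

def f(x):
    return x**4 - 3 * x**2 + x       # local minimum near x ~ 1.13, global near x ~ -1.30

def df(x):
    return 4 * x**3 - 6 * x + 1      # derivative of f

def gradient_descent(x, lr=0.02, steps=2000):
    for _ in range(steps):
        x -= lr * df(x)              # always step downhill: no randomness at all
    return x

def simulated_annealing(x, temp=2.0, cooling=0.999, steps=5000):
    best = x
    for _ in range(steps):
        candidate = x + random.gauss(0, 0.5)
        delta = f(candidate) - f(x)
        # Metropolis rule: always accept improvements, and accept uphill moves
        # with probability exp(-delta/temp). That randomness is what lets the
        # search climb out of a local minimum while the "temperature" is high.
        if delta < 0 or random.random() < math.exp(-delta / temp):
            x = candidate
        if f(x) < f(best):
            best = x
        temp *= cooling
    return best

random.seed(42)
print(round(gradient_descent(1.0), 2))    # 1.13 -- stuck in the nearest local minimum
print(round(simulated_annealing(1.0), 2)) # typically close to the global minimum near -1.30
```

In practice the two ideas get combined differently (stochastic gradient descent takes its noise from minibatch sampling rather than a temperature schedule), but the local-vs-global trade-off it illustrates is the same.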


1 hour ago, Shpaget said:

 

Yes, it is a neural network, but I don't agree with you that it knows something worked before.

It is a virtual certainty that in those 44 million games of training it did not repeatedly find itself in the same position and played out every possible move, and every response to every possible move by the opponent. The database would just be too large.

When it comes to chess, 44 million games doesn't even scratch the Shannon number, which Wikipedia says "is a conservative lower bound (not an estimate) of the game-tree complexity of chess of 10^120".

https://en.wikipedia.org/wiki/Shannon_number

What I mean to say is that it doesn't perform well because it already played that specific game in every imaginable way and knows what worked before and what didn't. It has to evaluate the position on the board at every move because, most likely, it is the first time it has encountered that situation.

 

1 hour ago, 0111narwhalz said:

That's the thing, though: training techniques no longer need to depend on blind randomness. Gradient descent, for example, uses a little calculus magic and finds local minima rapidly and deterministically. Add in just a little simulated annealing (which does reintroduce some randomness) and you have a good chance of finding the global minimum.

I see two different assumptions here:

1. The thing knows what it's doing.

2. The thing doesn't know what it is (and was) doing.

What's the difference?

The first basically means that it knows what the "win" of the game is, and how each piece "works".

The second means it's doing a random walk.

Comparing one to the other is like comparing quantum bits to normal bits when doing quantum-related calculations. One embeds the "quanta" inside it, the other doesn't.

Now, which side does this AI fall on?


3 hours ago, 0111narwhalz said:

Well, humans generally have other skilled humans to learn from…who learned from other skilled humans, et cetera. This algorithm learned from scratch playing only against (a similarly unskilled copy of) itself. If you count the games played through the entire mentorship lineage of a given grandmaster, you'll likely find something greater than 44 million.

With progressively more sophisticated learning algorithms, we can probably get that number down even further. Perhaps to a few thousand.

A few thousand would require a serious breakthrough; the AI would have to really understand chess early on. I think that would require human-level intelligence.
The benefit of AI is that brute-force learning works well, at least when the input can be simulated, you have a huge database, or you have the AI train itself.


 


55 minutes ago, YNM said:

I see two different assumptions here:

1. The thing knows what it's doing.

2. The thing doesn't know what it is (and was) doing.

What's the difference?

Knowledge of "what it's doing" is encoded within the structure of the network. Learning refines this structure so that it tends to map appropriate (winning) outputs to any arbitrary input. In this way it's hard to say that the network "knows" anything more than your cerebellum and assorted ganglia "know" how to walk.

If you give the network too many neurons it'll overfit and essentially become an inefficient lookup table, which is not useful. But if you restrict the network to a minimal number of neurons, add some dropout, and so on, it tends to learn rules instead of mere memorization. This makes sense; it takes fewer neurons to know how to multiply than to memorize multiplication tables, but if you had resources to spare you'd probably rather use a logarithm table than know how to take logarithms.
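For reference, the dropout trick mentioned above is tiny. A pure-Python sketch of "inverted" dropout (the function and variable names are made up for the example):

```python
import random

# During training, each activation is zeroed with probability p and the
# survivors are scaled by 1/(1-p), so the expected value of each unit is
# unchanged. Because no unit can be relied on every time, the network is
# pushed away from the "inefficient lookup table" failure mode.
def dropout(activations, p, training=True, rng=random):
    if not training or p == 0.0:
        return list(activations)     # inference: no-op
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

acts = [0.5, 1.2, -0.3, 2.0]
print(dropout(acts, p=0.0))          # [0.5, 1.2, -0.3, 2.0] -- unchanged
random.seed(1)
print(dropout(acts, p=0.5))          # roughly half zeroed, survivors doubled
```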

30 minutes ago, magnemoe said:

A few thousand would require a serious breakthrough,

Yes, I fear I have hyperbolized.


27 minutes ago, 0111narwhalz said:

Knowledge of "what it's doing" is encoded within the structure of the network. Learning refines this structure so that it tends to map appropriate (winning) outputs to any arbitrary input. In this way it's hard to say that the network "knows" anything more than your cerebellum and assorted ganglia "know" how to walk.

If you give the network too many neurons it'll overfit and essentially become an inefficient lookup table, which is not useful. But if you restrict the network to a minimal number of neurons, add some dropout, and so on, it tends to learn rules instead of mere memorization. This makes sense; it takes fewer neurons to know how to multiply than to memorize multiplication tables, but if you had resources to spare you'd probably rather use a logarithm table than know how to take logarithms.

Yep, exactly.

Walking is difficult, especially walking without having to concentrate on it. It still implies that it knows, much like athletes do.

But is that exactly what happened? Does this thing really encode itself, like YouTube's algorithm?


2 hours ago, Shpaget said:

 

Yes, it is a neural network, but I don't agree with you that it knows something worked before.

It is a virtual certainty that in those 44 million games of training it did not repeatedly find itself in the same position and played out every possible move, and every response to every possible move by the opponent. The database would just be too large.

When it comes to chess, 44 million games doesn't even scratch the Shannon number, which Wikipedia says "is a conservative lower bound (not an estimate) of the game-tree complexity of chess of 10^120".

https://en.wikipedia.org/wiki/Shannon_number

What I mean to say is that it doesn't perform well because it already played that specific game in every imaginable way and knows what worked before and what didn't. It has to evaluate the position on the board at every move because, most likely, it is the first time it has encountered that situation.

Yes, and also far higher than the number of realistic variations. Still, it surprises me that playing only against itself did not generate bad patterns. Starting it off against another chess program that increased the difficulty once it learned the game should work better; once it gets good, you can run it against itself.
 


The "does it know" question might not matter at some point, if it exceeds human capabilities at everything, lol. It's certainly an interesting question, however. A Japanese AI researcher got a learning system to pass the Japanese college entrance exam, including the essay portion (graded by a human). She made a point of letting the audience (I heard this in a TED Talk) know that the program didn't even understand Japanese. All it did was fit answers that were statistically in line with what would be expected given the pattern of characters preceding the "?" for a given question. So it wrote an essay that a human thought was college material, and it didn't "know" anything at all (and the grader didn't know it was not written by a human).

Wonder if such a system would pass a Turing Test?

