In 2013, DeepMind Technologies published a breakthrough paper showing how a neural network could learn to play video games from the 1980s just by watching the screen. A few months later, Google acquired the company for US$400 million. DeepMind went on to apply deep learning more widely, including in AlphaGo, which plays Go better than humans. At the same time, this work highlights a limitation of deep learning, namely how slowly it learns, and that has prompted scientists to explore how humans learn.
MIT Technology Review reported that a research team at the University of California, Berkeley has studied how humans interact with video games to understand what kind of prior knowledge they rely on to make sense of a game. The study found that when humans start a new game, they draw on a great deal of background knowledge to play it well. If the game is redesigned so that this prior knowledge no longer applies, however, humans struggle. The machine performs exactly the same way in both versions.
The researchers recruited 40 people on Amazon’s crowdsourcing site, Mechanical Turk, to play a game based on the classic design of Montezuma’s Revenge. They provided no manual or instructions, so participants started with no idea how to play. Participants took about 1 minute and about 3,000 keyboard actions to complete the game. The algorithm, by contrast, needed 4 million keyboard actions, which amounts to about 37 hours of play.
The researchers said this is not surprising, because humans can easily guess the goal of the game: step on the brick-shaped objects and use the ladders to reach higher platforms, avoid the angry pink objects and the flames, and make their way to the princess. For machines, by contrast, the game is difficult, and many standard deep learning algorithms do not solve it at all. Because feedback is available only when the game is completed, the algorithm has no way to evaluate what it is doing along the way.
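The difficulty of learning from feedback that arrives only at the end can be illustrated with a toy example. The sketch below is not the researchers' game or algorithm; the corridor environment, the function name `run_episode`, and all parameters are assumptions made for illustration. An agent moves at random along a corridor and receives a reward only on the step where it reaches the far end, so every earlier step gives it nothing to learn from.

```python
import random


def run_episode(length, max_steps, rng):
    """Random walk on positions 0..length with a reward only at the goal.

    Hypothetical toy environment: the agent starts at 0, moves one step
    left or right at random (it cannot go below 0), and the episode ends
    when it reaches `length` or runs out of steps.
    Returns the list of per-step rewards.
    """
    position = 0
    rewards = []
    for _ in range(max_steps):
        position = max(0, position + rng.choice((-1, 1)))
        reached_goal = position == length
        # Sparse feedback: zero everywhere except on the goal step.
        rewards.append(1.0 if reached_goal else 0.0)
        if reached_goal:
            break
    return rewards


rng = random.Random(0)  # fixed seed so the run is repeatable
rewards = run_episode(length=10, max_steps=100_000, rng=rng)
print("steps taken:", len(rewards))
print("steps with non-zero feedback:", sum(1 for r in rewards if r != 0.0))
```

However many steps the episode takes, at most one of them carries any feedback, which is why an agent without prior knowledge has to stumble on the goal before it can start improving.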
The researchers attribute this advantage to humans' prior knowledge: certain objects are good, while others, such as the frowning faces or flames in the game, are bad; platforms support objects; ladders can be climbed; things that look the same behave the same way; gravity pulls objects down; and, most basically, what counts as an object in the first place. The machine knows none of this.
The researchers then redesigned the game. They used textures to mask the objects that carry various forms of prior knowledge, such as ladders, enemies, keys, and platforms, and they changed the physical properties of the game, such as the effect of gravity and the way the character interacts with the environment. Having made a given kind of prior knowledge irrelevant, they measured how long it took humans to complete the game.
They found that removing some kinds of prior knowledge sharply slowed human players down: the time to complete the game rose from 1 minute to more than 20 minutes. Removing the same information had no effect on the learning speed of the machine algorithm.
By changing one element of the design at a time, the researchers could also observe how the playing time changed. The more the time increased, the more important the corresponding prior knowledge. For example, when object symbols such as the frown or the flame were removed, participants took longer to complete the game. Covering the surfaces of objects with textures made the game harder still, to the point that the researchers had to raise the payment before participants were willing to play.
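The ranking logic described above, where a larger increase in completion time means a more important prior, can be sketched in a few lines. The function name `rank_priors`, the condition labels, and the timing values below are placeholders invented for illustration; they are not measurements from the study.

```python
def rank_priors(baseline_minutes, masked_minutes):
    """Rank masked conditions by how much they slow players down.

    baseline_minutes: completion time for the unmodified game.
    masked_minutes: mapping from condition name to completion time
        with that kind of prior knowledge made irrelevant.
    Returns (condition, increase) pairs, largest increase first.
    """
    increases = {
        name: minutes - baseline_minutes
        for name, minutes in masked_minutes.items()
    }
    return sorted(increases.items(), key=lambda item: item[1], reverse=True)


# Placeholder values only: NOT results reported by the researchers.
example_minutes = {
    "condition_a": 2.0,
    "condition_b": 5.0,
    "condition_c": 3.5,
}

for name, increase in rank_priors(1.0, example_minutes):
    print(f"{name}: +{increase:.1f} min")
```

With real measurements in place of the placeholders, the first entry of the returned list would be the prior whose removal hurts human players the most.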
This ranking has an interesting connection with human learning. Psychologists have found that 2-month-old babies have a primitive concept of objects but cannot yet recognize categories of objects. Between 3 and 5 months, babies learn to recognize the type of an object, and between 18 and 24 months they learn to recognize individual objects and their properties. The order of importance of human prior knowledge in the game matches the order in which infants acquire it.
The value of this experiment lies in quantifying how important various kinds of prior knowledge are to humans solving video games, and in showing how that knowledge makes humans good at handling complex tasks. It also offers computer scientists an interesting path for developing machine intelligence: design algorithms that start with the same basic knowledge humans acquire from childhood. Machines built this way should be able to catch up with human learning speed and might even exceed it.