Reinforcement Learning, what is it?
Intro
We have all probably heard of AI, whether through research or Terminator-like movies, and you’re probably wondering what it actually is. This article is about AI and how I used a subtype of AI called Reinforcement Learning to teach an Agent to play Space Invaders effectively and efficiently.
What is AI?
AI, or Artificial Intelligence, is, loosely defined, an inorganic, technology-based intelligence created by humans to do things humans can do. There are many ways to classify AI; the one I know has 4 types: Reactive Machines, Limited Memory, Theory of Mind and Self Aware. These sit on a “simple to complex” scale, with Reactive Machines the simplest and Self Aware the most complex. In this classification, Reinforcement Learning (RL) falls under Limited Memory. It works a lot like how humans learn from their mistakes: if the RL Agent does something slightly wrong, it gets a slight punishment in the form of a small negative number. If it gets something really wrong, it gets a big negative number, and if it gets something right, it gets a positive number. Over many rounds, the Agent comes to favor the actions that earn it the highest total.
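To make the punishment-and-reward idea concrete, here is a minimal sketch in Python. The reward values, the action names, and the toy world where every action always has the same outcome are all assumptions made up for illustration; they are not taken from the actual Space Invaders project.

```python
# Hypothetical reward values, chosen only for illustration.
REWARDS = {
    "hit_enemy": 1.0,     # got it right: positive number
    "missed_shot": -0.1,  # slightly wrong: small negative number
    "lost_life": -1.0,    # really wrong: big negative number
}


def update_estimate(old_estimate, reward, learning_rate=0.5):
    """Move the estimate part of the way toward the reward just received."""
    return old_estimate + learning_rate * (reward - old_estimate)


def learn_preferences(rounds=20):
    """Try every action each round and track how good each one seems."""
    # Assumed toy world: each action always leads to the same outcome.
    outcome_of = {
        "shoot_at_enemy": "hit_enemy",
        "shoot_at_nothing": "missed_shot",
        "walk_into_bullet": "lost_life",
    }
    estimates = {action: 0.0 for action in outcome_of}
    for _ in range(rounds):
        for action, outcome in outcome_of.items():
            estimates[action] = update_estimate(estimates[action], REWARDS[outcome])
    return estimates


estimates = learn_preferences()
best_action = max(estimates, key=estimates.get)
print(best_action)
```

After enough rounds the estimate for each action settles near its reward, so the action with the positive reward ends up being the preferred one. A real Agent works with far more situations and with rewards that arrive much later than the action that caused them, but the basic bookkeeping is similar.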
Why Atari Games?
The answer is simple: Atari games are 2D and rather simplistic, so an AI that looks at its mistakes and learns from them is much easier to use in a 2-dimensional game than in something like Minecraft. Minecraft is 3D and has almost limitless possibilities, with far more complex gameplay than a simple “move left and right and shoot till the enemy is gone” game.
How?
This was done by first generating the environment for the game, then building the game, and finally building an Agent that passes filters over the entire game screen, breaking down the information until it decides on an action. The Agent has to train for an absurd amount of time, since it needs many, many rounds to get even a little better.
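Here is a rough sketch of what “passing a filter over the screen” can look like. I am assuming the filters are the sliding-window kind used in convolutional neural networks; the tiny 4x4 screen, the `apply_filter` helper, and the filter values are all made up for illustration, and a real Agent would learn its filter values during training instead of having them written by hand.

```python
def apply_filter(frame, kernel):
    """Slide `kernel` over `frame` (both lists of lists) and return the response map."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(frame) - kernel_h + 1
    out_w = len(frame[0]) - kernel_w + 1
    output = []
    for row in range(out_h):
        out_row = []
        for col in range(out_w):
            # Multiply the patch under the filter by the filter values and sum.
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += frame[row + i][col + j] * kernel[i][j]
            out_row.append(total)
        output.append(out_row)
    return output


# Toy 4x4 "screen": 0 is empty space, 1 is part of a sprite.
frame = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical 2x2 filter that responds where dark pixels turn bright, left to right.
edge_filter = [
    [-1, 1],
    [-1, 1],
]

feature_map = apply_filter(frame, edge_filter)
for row in feature_map:
    print(row)
```

The output is smaller than the screen and only lights up where the sprite’s left edge is. Stacking several rounds of filters like this is one way the screen can get broken down into less and less information until an action is picked.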
Training
There are 2 different ways to train: the Agent can learn from many examples of people playing, taken from the web, or it can do random things for a while until it discovers some strategies. The two go at different speeds. To learn by itself, the Agent needs thousands, maybe millions, of rounds before it finds a viable strategy, while an Agent that learns from other people gets there faster. Learning from human examples is a form of supervised learning (often called imitation learning): the AI is shown examples and develops the strategies and categories demonstrated to it. Learning by trial and error, guided only by rewards, is Reinforcement Learning itself, and it is very time consuming. It is sometimes confused with unsupervised learning, where an AI finds patterns and categories in unlabeled data by itself, but the two are different: an RL Agent still gets feedback in the form of rewards.
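The two ways of training can be compared with a small sketch. Everything in it is a simplifying assumption: the three situations, the `toy_reward` rule, and the demonstration list are invented, and the trial-and-error learner simply tries every action once instead of exploring randomly over many rounds like a real Agent would.

```python
from collections import Counter, defaultdict

ACTIONS = ["left", "right", "shoot"]


def learn_from_examples(demonstrations):
    """Copy the action humans chose most often in each situation."""
    counts = defaultdict(Counter)
    for situation, action in demonstrations:
        counts[situation][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


def learn_by_trial_and_error(situations, reward_fn):
    """Try every action in every situation and keep the best-rewarded one."""
    policy = {}
    trials = 0
    for situation in situations:
        best_action, best_reward = None, float("-inf")
        for action in ACTIONS:
            reward = reward_fn(situation, action)
            trials += 1
            if reward > best_reward:
                best_action, best_reward = action, reward
        policy[situation] = best_action
    return policy, trials


def toy_reward(situation, action):
    # Assumed rule: shoot when the enemy is overhead, otherwise move toward it.
    correct = {"enemy_overhead": "shoot", "enemy_left": "left", "enemy_right": "right"}
    return 1.0 if correct[situation] == action else -1.0


# Hypothetical human demonstrations, including one mistake.
demos = [
    ("enemy_overhead", "shoot"),
    ("enemy_left", "left"),
    ("enemy_right", "right"),
    ("enemy_overhead", "shoot"),
    ("enemy_overhead", "left"),
]

copied_policy = learn_from_examples(demos)
explored_policy, trials = learn_by_trial_and_error(
    ["enemy_overhead", "enemy_left", "enemy_right"], toy_reward
)
print(copied_policy)
print(explored_policy, trials)
```

Both learners end up with the same strategy here, but the trial-and-error one had to test every action in every situation to get there. With a full game screen instead of three situations, that number of trials blows up, which is why learning alone takes so many rounds.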
Final Note
AI is used all around the world to help humanity, whether through YouTube recommendations or GPS, and in the future there may be even more ways AI will help. I don’t know about you, but I want to stick around to see it happen. Catch you later!