Open world AI
Open world videogames are an interesting environment in which to experiment with generalized artificial intelligence agents.
Machine learning-based AI essentially revolves around optimization: the computer learns a function that solves a problem correctly and efficiently. Classification problems are where machine learning excels. You have data (the input) that you want the computer to be able to classify, but you can't or don't want to manually write identification rules for every possible combination of inputs. So instead, you have the computer classify the inputs by coming up with its own identification rules.

For example, you may want to train a computer to recognize images of cats. Rather than aimlessly trying to write a program that can recognize a cat in any arbitrary image, you train the computer to build a model that, when given a particular image, can correctly determine whether a cat is present. This is called a supervised learning problem. You start by collecting many images of cats and non-cats and feeding them into a learning function. The inputs are the pixel data for each image, along with a label saying whether the picture has a cat in it or not. Feed the learning function many, many images with labels, and it will eventually learn to recognize a cat! The results may not be perfect, but they can be improved with more training data. The learning function itself would be implemented using an artificial neural network (ANN), something you can read more about [here].
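To make the supervised setup concrete, here is a minimal sketch in Python/NumPy. The "images" are random arrays standing in for real pixel data, the cat/not-cat labels are made up, and the single-layer learner and every variable name are illustrative assumptions for this post, not any particular library's API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "images": 200 samples of 32x32 grayscale pixels, flattened into vectors.
n_samples, n_pixels = 200, 32 * 32
images = rng.random((n_samples, n_pixels))
# Labels: 1 = "cat", 0 = "not a cat" (assigned at random here, purely for illustration).
labels = rng.integers(0, 2, size=n_samples).astype(float)

# The "learning function": a single layer of weights adjusted by gradient descent
# so that pixel vectors map onto a cat / not-a-cat probability.
weights = np.zeros(n_pixels)
bias = 0.0
learning_rate = 0.1

for epoch in range(100):
    logits = images @ weights + bias
    predictions = 1.0 / (1.0 + np.exp(-logits))   # predicted probability of "cat"
    error = predictions - labels                  # how wrong each guess was
    # Nudge the weights and bias in the direction that reduces the error.
    weights -= learning_rate * (images.T @ error) / n_samples
    bias -= learning_rate * error.mean()

# Classify a new "image": anything above 0.5 is called a cat.
new_image = rng.random(n_pixels)
probability = 1.0 / (1.0 + np.exp(-(new_image @ weights + bias)))
print("cat" if probability > 0.5 else "not a cat")
```

With random labels the model obviously can't learn anything meaningful; the point is only the shape of the process: labeled examples go in, and the same loop would work unchanged on real cat photos.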
In short, an ANN is built from a series of layers that information passes through. The first layer is the input layer, where the inputs are fed in (in our example, the image pixel data). After the input layer come the hidden layers. Each hidden layer contains an arbitrary number of artificial neurons, each of which can connect to neurons in the next layer. The last layer is the output layer, which contains the result (in our example, whether the input is a cat or not).

The first time the learning function is given input data, it produces (mostly) nonsense in the final layer. We can check whether the final layer's output is correct by comparing it against our training labels. If it matches, we effectively tell the network "keep doing what you're doing"; if not, "that was wrong." When the prediction was correct, the neuronal connections in the hidden layers are adjusted so that the network gets even better at identifying similar images in the future; when it was incorrect, they are adjusted to try to prevent such failures. This process of measuring the error of the output and using it to adjust the network is called back-propagation. It is repeated many, many times over each input in the training phase to arrive at a well-tuned model.
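As a rough illustration of those mechanics, the sketch below builds a tiny network in plain NumPy with one hidden layer and trains it with back-propagation. The layer sizes, the sigmoid activation, and the synthetic 4-"pixel" inputs are all assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Synthetic training set: tiny 4-"pixel" inputs labelled 1 ("cat") or 0 ("not a cat").
inputs = rng.random((100, 4))
targets = (inputs.sum(axis=1) > 2.0).astype(float).reshape(-1, 1)

# Input layer (4 values) -> hidden layer (8 neurons) -> output layer (1 value).
w_hidden = rng.normal(scale=0.5, size=(4, 8))
w_output = rng.normal(scale=0.5, size=(8, 1))
learning_rate = 0.5

for step in range(2000):
    # Forward pass: information flows from the input layer through the hidden
    # layer to the output layer.
    hidden = sigmoid(inputs @ w_hidden)
    output = sigmoid(hidden @ w_output)

    # Compare the output layer against the training labels.
    output_error = output - targets

    # Back-propagation: push the error back through the layers and use it to
    # adjust each layer's connection weights.
    output_grad = output_error * output * (1.0 - output)
    hidden_grad = (output_grad @ w_output.T) * hidden * (1.0 - hidden)
    w_output -= learning_rate * (hidden.T @ output_grad) / len(inputs)
    w_hidden -= learning_rate * (inputs.T @ hidden_grad) / len(inputs)

accuracy = ((output > 0.5) == (targets > 0.5)).mean()
print(f"training accuracy after back-propagation: {accuracy:.2f}")
```

In practice, libraries like PyTorch or TensorFlow compute these gradients automatically, but the loop above is the same idea written out by hand: forward pass, compare against the label, push the error backwards, adjust the weights, repeat.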