Modern Learning Classifier Systems

From the classical point of view, machine learning algorithms are classified according to the desired outcome of the algorithm. There are three main types of learning: supervised learning, where an expert or teacher provides feedback during the learning process; unsupervised learning, where no expert or teacher is available while learning takes place; and reinforcement learning, where the program learns by interacting with the environment. The latter is a fundamental mechanism in learning classifier systems (LCSs). These are cognitive systems that receive perceptions from the environment and, in response, perform actions to solve the problem they are facing.

Originally proposed by John Holland and later simplified by David Goldberg and others, LCSs are computer programs based on observations of how natural selection and Darwinian evolution solve complex tasks. The original purpose was to create true artificial intelligence by mimicking the adaptive behavior of nature. There are two main LCS models, Michigan-style and Pittsburgh-style LCSs. The main differences are (1) how the population is coded: in Michigan-style an individual is a single rule, whereas in Pittsburgh-style an individual is a set of rules; (2) how the solution is obtained: in Michigan-style the whole population constitutes the solution to a given problem, whereas in Pittsburgh-style the solution is the best individual; and (3) Michigan-style is an on-line learning architecture, whereas Pittsburgh-style is an off-line one.

In this entry I will introduce two of the most successful Michigan-style LCSs.

The Extended Classifier System
XCS is a Michigan-style LCS that evolves a population of classifiers on-line by interacting with an environment. The main difference between XCS and other LCSs is that XCS is based on accuracy rather than on strength. Its central idea is a fitness function that is proportional not to the expected reward but to the accuracy of the prediction of that reward. The classifiers that survive in the population are no longer necessarily those predicting a large reward, but those predicting the reward they actually receive. As a result, XCS covers the state space more efficiently than other LCSs. As summarized by Orriols-Puig, XCS computes a classifier's fitness from the accuracy of its reward prediction instead of from the reward prediction itself. XCS creates a set of maximally general and accurate classifiers that map the problem space completely.
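
To make the accuracy-versus-strength distinction concrete, here is a minimal sketch of the idea in Python. It is an illustration, not the canonical XCS procedure: the parameter names epsilon_0 (error tolerance), alpha and nu (accuracy fall-off) and beta (learning rate) follow the usual XCS notation, but the concrete values and the dictionary-based classifier representation are my own assumptions.

```python
def accuracy(error: float, epsilon_0: float = 10.0, alpha: float = 0.1, nu: float = 5.0) -> float:
    """Classifiers with prediction error below epsilon_0 count as fully accurate;
    above it, accuracy falls off as a power law."""
    return 1.0 if error < epsilon_0 else alpha * (error / epsilon_0) ** (-nu)

def update_fitness(action_set: list, beta: float = 0.2) -> None:
    """Move each classifier's fitness toward its accuracy relative to the other
    classifiers in the same action set (so fitness reflects prediction accuracy,
    not the magnitude of the predicted reward)."""
    kappas = [accuracy(cl["error"]) * cl["numerosity"] for cl in action_set]
    total = sum(kappas)
    for cl, kappa in zip(action_set, kappas):
        cl["fitness"] += beta * (kappa / total - cl["fitness"])
```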

Like many other LCSs, the XCS classifier system uses a population of classifiers, where each classifier contains a rule and a set of parameters that estimate the quality of that rule. A rule takes the form "if condition then action", which is why XCS is called a rule-based system.
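
The following is a hypothetical sketch of such a classifier: a condition/action rule plus the usual quality estimates. The field names follow common XCS notation, but this is only an illustration of the structure, not the exact data layout of any particular implementation.

```python
from dataclasses import dataclass

@dataclass
class Classifier:
    condition: str            # e.g. the ternary string "01#1#", where '#' means "don't care"
    action: int               # action proposed when the condition matches
    prediction: float = 10.0  # estimated payoff if the action is taken
    error: float = 0.0        # estimated error of that prediction
    fitness: float = 0.01     # accuracy-based fitness used by the genetic algorithm
    experience: int = 0       # number of times the classifier has been updated
    numerosity: int = 1       # number of identical micro-classifiers it represents

    def matches(self, state: str) -> bool:
        """The condition matches a state if every non-'#' position agrees."""
        return all(c == '#' or c == s for c, s in zip(self.condition, state))
```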

XCS learns new rules by interacting with an environment that provides a new training example at each iteration. The system has two operating modes: the training mode, referred to in the literature as exploration, and the test mode, referred to as exploitation. In exploration mode, XCS discovers new rules that minimize the prediction error, thereby acquiring new knowledge. In exploitation mode, XCS uses its current knowledge to decide the best action for a new input instance, and no new knowledge is added.
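
A rough sketch of this alternation between the two modes is shown below. The environment interface (get_instance, reward) and the xcs methods used here (explore, exploit, update) are assumptions made purely for illustration; real implementations organize this loop differently.

```python
import random

def run(xcs, env, iterations: int = 10_000, p_explore: float = 0.5):
    for _ in range(iterations):
        state = env.get_instance()
        if random.random() < p_explore:
            # Exploration: try an action to gain knowledge, then learn from the reward.
            action = xcs.explore(state)                            # e.g. random action choice
            xcs.update(state, action, env.reward(state, action))   # rule discovery + parameter updates
        else:
            # Exploitation: pick the best-predicted action; no knowledge is added.
            action = xcs.exploit(state)
```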

There are excellent references on XCS, so I will not go into further detail here.

The Supervised Classifier System
UCS is an accuracy-based Michigan-style learning classifier system (LCS) that inherits the main components of XCS, but specializes the on-line learning architecture for classification tasks. That means the following: instead of evolving the complete action set (as XCS does), it evolves the correct action set. UCS does this by focusing the exploration process toward classifiers that accurately predict the correct class or label. With this shift in focus, UCS improves on traditional LCSs by (1) evolving a solution more quickly and (2) requiring a smaller population to store the solution.

Like XCS, UCS evolves a population of classifiers, where each classifier contains a rule and a set of parameters, such as the accuracy of the rule or the experience it has accumulated.

UCS has the typical two operating modes, explore and exploit, and it proceeds in the same way as XCS. UCS starts the exploration phase with an empty population and, in each learning iteration, the system receives a new input example with its corresponding class. The system then creates a match set, which contains all those classifiers in the population whose conditions match the input example. Next, the classifiers in the match set that predict the class given by the input example are added to the correct set. If the correct set is empty, that is, if no classifier in the match set predicts the same class as the one provided with the example, a new classifier is created and added to the correct set. This fresh classifier covers the example, adding some generalization in the condition part, and has the same class as the input. After this, all the classifiers in the match set are evaluated and, eventually, the correct set receives a genetic event aimed at classifier discovery.
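
The sketch below walks through one such exploration iteration (match set, correct set, covering). It is a self-contained illustration under my own assumptions: classifiers are plain dicts, conditions are binary strings with '#' as don't care, and the don't-care probability and parameter names are illustrative, not the exact UCS specification.

```python
import random

def explore_iteration(population: list, example: str, label: int, p_dont_care: float = 0.33):
    def matches(condition: str, state: str) -> bool:
        return all(c == '#' or c == s for c, s in zip(condition, state))

    # 1. Match set: classifiers whose condition matches the input example.
    match_set = [cl for cl in population if matches(cl["condition"], example)]

    # 2. Correct set: classifiers in the match set that also predict the example's class.
    correct_set = [cl for cl in match_set if cl["action"] == label]

    # 3. Covering: if no classifier predicts the right class, create a new one
    #    from the example itself, generalizing some bits and using the given class.
    if not correct_set:
        condition = "".join('#' if random.random() < p_dont_care else bit for bit in example)
        new_cl = {"condition": condition, "action": label,
                  "accuracy": 1.0, "experience": 0, "numerosity": 1}
        population.append(new_cl)
        correct_set.append(new_cl)

    # 4. Updating the match-set classifiers' parameters and, eventually, running
    #    a genetic algorithm on the correct set would follow here.
    return match_set, correct_set
```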

References
D. Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley Longman, Inc., Alabama, 1989. ISBN 0-201-15767-5.
T. M. Mitchell. Machine Learning. McGraw-Hill, 1997.
A. Orriols-Puig. New Challenges in Learning Classifier Systems: Mining Rarities and Evolving Fuzzy Models. PhD thesis, Arquitectura i Enginyeria La Salle, Universitat Ramon Llull, Passeig de la Bonanova 8, 08022 - Barcelona, Nov 2008.
S. W. Wilson. Classifier fitness based on accuracy. Evolutionary Computation, 3(2):149–175, 1995.
S. W. Wilson. Generalization in the XCS classifier system. In 3rd Annual Conf. on Genetic Programming, pages 665–674. Morgan Kaufmann, 1998.
