In the mid 1990s, Thrun and Mitchell worked on a lifelong learning approach they called explanation-based neural networks, or EBNN (Thrun 1996). EBNN is able to transfer knowledge across multiple learning tasks. When faced with a new learning task, EBNN exploits domain knowledge of previous learning tasks (back-propagation gradients of prior learned tasks) to guide the generalization of the new one. As a result, EBNN generalizes more accurately from less data than comparable methods. Thrun and Mitchell applied EBNN transfer to autonomous robot learning, where a multitude of control learning tasks are encountered over an extended period of time (Thrun and Mitchell 1995).
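The essence of this gradient-guided transfer can be sketched with a toy linear model (an illustrative stand-in only; EBNN itself applies the idea to neural networks, and the function name, data, and constants below are hypothetical):

```python
import numpy as np

def fit_with_slope_hints(X, y, prior_grad, lam=0.1, lr=0.05, steps=2000):
    """Fit a linear model f(x) = w.x to scarce new-task data while a
    slope penalty pulls the model's input gradient (for a linear
    model, simply w) toward `prior_grad`, the gradient obtained from
    previously learned tasks -- the core of EBNN's transfer idea."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        err = X @ w - y                    # value error on new-task examples
        grad = X.T @ err / len(y)          # gradient of squared value error
        grad += lam * (w - prior_grad)     # slope-matching term from prior tasks
        w -= lr * grad
    return w

# Only one training point is available for the new task, so the fit is
# underdetermined; the prior tasks' gradient hint picks out a solution.
X = np.array([[1.0, 1.0]])
y = np.array([5.0])
prior_grad = np.array([2.0, 3.0])          # slopes suggested by prior learning
w = fit_with_slope_hints(X, y, prior_grad)
# w converges to approximately [2.0, 3.0], consistent with both the
# data (w1 + w2 = 5) and the transferred slopes
```

Infinitely many weight vectors fit the single example; the slope penalty is what lets prior knowledge resolve the ambiguity, which is why EBNN generalizes from less data.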
Since 1995, Silver et al. have proposed variants of se-
quential learning and consolidation systems using standard
back-propagation neural networks (Silver and Poirier 2004;
Silver, Poirier, and Currie 2008). A system of two multiple task learning networks is used: one for short-term learning, using task rehearsal to selectively transfer prior knowledge, and a second for long-term consolidation, using task rehearsal to overcome the stability-plasticity problem. Task
rehearsal is an essential part of this system. After a task
has been successfully learned, its hypothesis representation
is saved. The saved hypothesis can be used to generate vir-
tual training examples so as to rehearse the prior task when
learning a new task. Knowledge is transferred to the new
task through the rehearsal of previously learned tasks within
the shared representation of the neural network. Similarly,
the knowledge of a new task can be consolidated into a large
domain knowledge network without loss of existing task
knowledge by using task rehearsal to maintain the function
accuracy of the prior tasks while the representation is modi-
fied to accommodate the new task.
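Rehearsal-based consolidation can be illustrated with a rough sketch (a toy linear multi-output model stands in for the back-propagation networks of the original work; all names, sizes, and constants are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def train(X, Y, M, W=None, steps=6000, lr=0.1):
    """Gradient-descent least squares; the mask M marks which output
    (task) each example carries a target for."""
    if W is None:
        W = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(steps):
        W -= lr * X.T @ (M * (X @ W - Y)) / len(Y)
    return W

d, n = 3, 60
w_prior_true = np.array([1.0, -2.0, 0.5])
w_new_true = np.array([0.0, 1.0, 1.0])

# 1. The long-term network learns the prior task (output column 0)
#    and its hypothesis representation is saved.
X_old = rng.normal(size=(n, d))
Y_old = np.column_stack([X_old @ w_prior_true, np.zeros(n)])
M_old = np.column_stack([np.ones(n), np.zeros(n)])
W_lt = train(X_old, Y_old, M_old)

# 2. Task rehearsal: the saved hypothesis labels freshly drawn inputs,
#    generating virtual examples that stand in for the discarded
#    prior-task training data.
X_virt = rng.normal(size=(n, d))
Y_virt = X_virt @ W_lt
M_virt = np.column_stack([np.ones(n), np.zeros(n)])

# 3. Consolidation: new-task examples (output column 1) are trained
#    alongside the rehearsed virtual examples, so prior-task accuracy
#    is maintained while the shared representation absorbs the new task.
X_new = rng.normal(size=(n, d))
Y_new = np.column_stack([np.zeros(n), X_new @ w_new_true])
M_new = np.column_stack([np.zeros(n), np.ones(n)])
W_lt = train(np.vstack([X_virt, X_new]), np.vstack([Y_virt, Y_new]),
             np.vstack([M_virt, M_new]), W=W_lt)
```

The key point the sketch preserves is that step 3 never touches the original prior-task data: the virtual examples alone keep the prior task's function accuracy while the weights move to fit the new task.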
Shultz and Rivest proposed knowledge-based cascade-correlation neural networks in the late 1990s (Shultz and Rivest 2001). The method extends the original cascade-correlation approach by selecting previously learned sub-networks as well as simple hidden units. In this way the
system is able to use past learning to bias new learning.
Unsupervised Learning
To overcome the stability-plasticity problem of forgetting previously learned data clusters (concepts), Carpenter and Grossberg proposed Adaptive Resonance Theory (ART) neural networks (Grossberg 1987). Unsupervised ART networks learn a mapping between “bottom-up” input sensory
nodes and “top-down” expectation nodes (or cluster nodes).
The vector of new sensory data is compared with the vec-
tor of weights associated with one of the existing expecta-
tion nodes. If the difference does not exceed a set threshold,
called the “vigilance parameter”, the new example will be
considered a member of the most similar expectation node.
If the vigilance parameter is exceeded, then a new expectation node is used and thus a new cluster is formed.
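A minimal sketch of this vigilance mechanism follows (Euclidean distance is used as the match criterion, a simplification of ART's actual resonance test and complement coding; the function name and update rate are illustrative):

```python
import numpy as np

def art_cluster(inputs, vigilance):
    """ART-style clustering sketch: each incoming vector is matched
    against existing prototype ("expectation") nodes; if the best
    match is within the vigilance threshold the winning prototype is
    refined, otherwise a new node is recruited -- so old clusters are
    never overwritten (stability) while new ones can form (plasticity)."""
    prototypes, labels = [], []
    for x in inputs:
        if prototypes:
            dists = [np.linalg.norm(x - p) for p in prototypes]
            best = int(np.argmin(dists))
        if not prototypes or dists[best] > vigilance:
            prototypes.append(x.copy())       # recruit a new expectation node
            labels.append(len(prototypes) - 1)
        else:
            prototypes[best] += 0.5 * (x - prototypes[best])  # refine winner
            labels.append(best)
    return labels, prototypes

points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels, protos = art_cluster(points, vigilance=1.0)
# labels -> [0, 0, 1, 1]: nearby points share a node, distant ones recruit a new one
```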
In (Strehl and Ghosh 2003), Strehl and Ghosh present a cluster ensemble framework to reuse previous partitionings of a set of objects without accessing the original features. By
using the cluster label but not the original features, the pre-
existing knowledge can be reused to either create a single
consolidated cluster or generate a new partitioning of the
objects.
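The label-only reuse idea can be sketched with a simple co-association consensus (a stand-in only, not the authors' actual consensus functions; the function name and threshold are illustrative):

```python
from itertools import combinations

def consensus_clusters(labelings, threshold=0.5):
    """Combine several prior partitionings of the same objects using
    only their cluster labels (no original features): objects placed
    together in more than `threshold` of the labelings are linked, and
    connected components of the resulting graph form the consolidated
    clustering."""
    n = len(labelings[0])
    # co-association: fraction of labelings that put i and j together
    link = {(i, j): sum(lab[i] == lab[j] for lab in labelings) / len(labelings)
            for i, j in combinations(range(n), 2)}
    # union-find over strongly co-associated pairs
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for (i, j), w in link.items():
        if w > threshold:
            parent[find(i)] = find(j)
    roots = [find(i) for i in range(n)]
    relabel = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]

# Three prior labelings of five objects mostly agree on {0,1,2} vs {3,4};
# note the label values themselves need not match across labelings.
labelings = [[0, 0, 0, 1, 1],
             [1, 1, 1, 0, 0],
             [0, 0, 1, 1, 1]]
consensus = consensus_clusters(labelings)
# consensus -> [0, 0, 0, 1, 1]
```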
Raina et al. proposed the Self-taught Learning method
to build high-level features using unlabeled data for a set
of tasks (Raina et al. 2007). The authors used the features to form a succinct input representation for future tasks and achieved promising experimental results in several real applications such as image classification, song genre classification and webpage classification.
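The workflow can be sketched as follows (Raina et al. learn the basis by sparse coding; PCA via SVD is substituted here purely to keep the sketch short, and all names and sizes are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)

# Self-taught learning sketch: plentiful *unlabeled* data is used to
# learn a feature basis, which then re-represents the inputs of a
# separate labeled task in succinct form.
unlabeled = rng.normal(size=(500, 10)) @ rng.normal(size=(10, 10))
mean = unlabeled.mean(axis=0)
_, _, Vt = np.linalg.svd(unlabeled - mean, full_matrices=False)
basis = Vt[:3]                      # top-3 directions as learned features

def featurize(X):
    """Represent task inputs in the basis learned from unlabeled data."""
    return (X - mean) @ basis.T

labeled_X = rng.normal(size=(20, 10))
features = featurize(labeled_X)     # 20 x 3 representation for the new task
```

The point carried over from the method is that the unlabeled corpus need not share labels, or even tasks, with the supervised problem; only the learned representation is transferred.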
Carlson et al. (Carlson et al. 2010) describe the design and
partial implementation of a never-ending language learner,
or NELL, that each day must (1) extract, or read, informa-
tion from the web to populate a growing structured knowl-
edge base, and (2) learn to perform this task better than on
the previous day. The system uses a semi-supervised multi-
ple task learning approach in which a large number (531) of
different semantic functions are trained together in order to
improve learning accuracy.
Recent research into the learning of deep architectures of
neural networks can be connected to LML (Bengio 2009).
Layered neural networks of unsupervised Restricted Boltzmann Machines and auto-encoders have been shown to efficiently develop hierarchies of features that capture regularities in their respective inputs. When used to learn a variety of class categories, these networks develop layers of common features similar to those seen in the visual cortex of humans.
Recently, Le et al. used the deep learning method to build
high-level features for large-scale applications by scaling up
the dataset, the model and the computational resources (Le
et al. 2012). By using millions of high resolution images
and very large neural networks, their system effectively discovers high-level concepts such as a cat’s face and a human body.
Experimental results on image classification show that their
network can use its learned features to achieve a signifi-
cant improvement in classification performance over state-
of-the-art methods.
Reinforcement Learning
Several reinforcement learning researchers have considered
LML systems. In 1997, Ring proposed a lifelong learning
approach called continual learning that builds more complicated skills on top of those already developed, both incrementally and hierarchically (Ring 1997). The system can
efficiently solve reinforcement-learning tasks and can then
transfer its skills to related but more complicated tasks.
Tanaka and Yamamura proposed a lifelong reinforcement learning method for autonomous robots by treating multiple environments as multiple tasks (Tanaka and Yamamura 1999). Parr and Russell used prior knowledge to reduce the hypothesis space for reinforcement learning when the policies considered by the learning process are constrained by hierarchies (Parr and Russell 1997).
In (Sutton, Koop, and Silver 2007), Sutton et al. suggest that learning should continue during an agent’s operation, since the environment may change, making prior learning insufficient. In their work, an agent is proposed that adapts to different local environments as it encounters different parts of its world over an extended period of time. The experimental results suggest that continual tracking of a solution can be better than learning a fixed solution.
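The tracking intuition can be sketched on a one-dimensional non-stationary estimation problem (an illustration only, not the authors' experiment; the drift rate, step size, and noise scale are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

# Sketch of the tracking idea: on a drifting (non-stationary) target,
# a constant-step-size learner keeps adapting, while a sample-average
# learner (which effectively "converges and stops") is dragged down
# by stale observations.
target = 0.0
tracker, avg, n = 0.0, 0.0, 0
track_err = avg_err = 0.0
for t in range(5000):
    target += 0.01                       # the environment slowly changes
    obs = target + rng.normal(scale=0.1)
    n += 1
    avg += (obs - avg) / n               # converging sample average
    tracker += 0.1 * (obs - tracker)     # constant step size keeps tracking
    track_err += (tracker - target) ** 2
    avg_err += (avg - target) ** 2
# the tracker's accumulated squared error stays far below the
# fixed-solution learner's, illustrating the benefit of continual learning
```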