
2 Chinese Journal of Computers 2017
the greatly increased size of the data used for training and the drastic increase in chip processing capabilities,
this method has achieved significant progress and has been applied in a broad range of areas such as object
detection, computer vision, natural language processing, speech recognition, and semantic parsing, thereby
also promoting the advancement of artificial intelligence. Deep learning, which consists of multiple levels of
non-linear transformations, is a hierarchical machine learning method. The deep neural network is the main form
of present deep learning methods, and the connectivity pattern between its neurons is inspired by the
organization of the animal visual cortex. The convolutional neural network, which has been widely used, is a
classic kind of deep neural network. It has several characteristics, such as local connections, shared weights,
and pooling. These features reduce the complexity of the network model and the number of training parameters,
and they also give the model some degree of invariance to shift, distortion, and scale, as well as
strong robustness and fault tolerance. As a result, its network structure is easy to train and optimize. Owing to
these characteristics, it has been shown to outperform standard fully connected neural networks in a
variety of signal and information processing tasks. In this paper, first of all, the historical development of
the convolutional neural network is summarized. After that, the structures of the neuron model and the
multilayer perceptron are presented. Next, a detailed analysis is given of the convolutional neural network
architecture, which comprises a number of convolutional layers and pooling layers followed by fully connected
layers. Different kinds of layers play different roles in this architecture. Then, several improved
variants of the convolutional neural network, such as network in network and spatial transformer networks, are
described. The supervised and unsupervised learning methods of the convolutional neural
network and some widely used open source tools are also introduced. In addition, applications of the
convolutional neural network to image classification, face recognition, audio retrieval, electrocardiogram
classification, object detection, and so on are analyzed. Integrating the convolutional neural network with the
recurrent neural network to train on input data could be an alternative machine learning approach. Finally,
convolutional neural network structures with different parameters and different depths are designed. Through a
series of experiments, the relations between the parameters of these models and the influence of different
parameter settings are examined. Some advantages and remaining issues of the convolutional neural network and
its applications are summarized.
Key words convolutional neural network; deep learning; network structure; training method; domain data
1 Introduction
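An artificial neural network (ANN) is a simulation and approximation of biological neural networks. It is an
adaptive, nonlinear, dynamic network system formed by a large number of interconnected neurons. In 1943, the
psychologist McCulloch and the mathematical logician Pitts proposed the first mathematical model of the
neuron, the MP model [1]. The MP model was groundbreaking and provided a basis for later research. In the
late 1950s and early 1960s, Rosenblatt added a learning capability to the MP model and proposed the
single-layer perceptron model, putting neural network research into practice for the first time [2-3].
However, the single-layer perceptron network model cannot handle linearly inseparable problems. It was not
until 1986 that Rumelhart, Hinton, et al. proposed a multilayer feedforward network trained with the error
back-propagation algorithm, the Back Propagation Network (BP network for short), which solved some problems
that the single-layer perceptron could not [4]. In the 1990s, various shallow machine learning models were
proposed one after another, a classic example being the support vector machine [5]. Moreover, as the number
of layers in a neural network increases, the traditional BP network runs into problems such as local optima,
overfitting, and gradient diffusion. Together, these factors caused research on deep models to be shelved.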
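The limitation of the single-layer perceptron noted above can be illustrated with a small sketch. The code below is not taken from the paper: it applies the classic perceptron learning rule with a step activation, and the function name `train_perceptron`, the learning rate, and the epoch limit are all assumptions chosen for illustration. It trains on the linearly separable AND function and on the linearly inseparable XOR function.

```python
# Minimal sketch of the classic perceptron learning rule (illustrative only;
# names and hyperparameters are assumptions, not taken from the paper).

def train_perceptron(samples, epochs=100, lr=1.0):
    """Train a two-input, single-layer perceptron with a step activation.

    Returns (weights, bias, converged). converged is True if some epoch
    finished without a single misclassification.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            # Step activation on the weighted sum.
            output = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            delta = target - output
            if delta != 0:
                # Update weights and bias only on a misclassified sample.
                w[0] += lr * delta * x1
                w[1] += lr * delta * x2
                b += lr * delta
                errors += 1
        if errors == 0:
            return w, b, True
    return w, b, False


# AND is linearly separable; XOR is not.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_w, and_b, and_converged = train_perceptron(AND_DATA)
xor_w, xor_b, xor_converged = train_perceptron(XOR_DATA)

print("AND learned:", and_converged)
print("XOR learned:", xor_converged)
```

Under these assumptions, training on AND reaches an error-free epoch, whereas training on XOR never does, because no single line separates the two XOR classes. A network with a hidden layer, such as the BP network, removes this restriction.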
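In 2006, Hinton et al. [6] published a paper in Science whose main points were: 1) artificial neural
networks with multiple hidden layers have excellent feature learning ability; 2) the difficulty of training
deep neural networks can be effectively overcome through layer-wise pre-training. This gave rise to
research on deep learning and set off another wave of interest in artificial neural networks [7]. In the
layer-wise pre-training algorithm of deep learning, unsupervised learning is first applied to pre-train each
layer of the network; only one layer is trained, without supervision, at a time, and the training result of
that layer is then used as its next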