
CHINESE JOURNAL OF COMPUTERS, Vol. 40, 2017, Online Publishing No. 7

———————————————

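ZHOU Fei-Yan, female, born in 1986, Ph.D. candidate. Her main research interest is computer-aided diagnosis of cardiovascular disease. E-mail: fyzhou2013@sinano.ac.cn. JIN Lin-Peng, male, born in 1984, Ph.D. His main research interest is machine learning. DONG Jun (corresponding author), male, born in 1964, Ph.D., professor, Ph.D. supervisor. His main research interest is artificial intelligence.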

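Review of Convolutional Neural Network

ZHOU Fei-Yan 1),2), JIN Lin-Peng 1),2), DONG Jun

1) (Suzhou Institute of Nano-tech and Nano-bionics, Chinese Academy of Sciences, Suzhou 215123)

2) (University of Chinese Academy of Sciences, Beijing 100049)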

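Abstract As a new field that has developed rapidly for more than a decade, deep learning has attracted growing attention from researchers, and it has clear advantages over shallow models in both feature extraction and model fitting. Deep learning is good at mining increasingly abstract distributed feature representations from raw input data, and these representations generalize well. It has solved some problems that were once considered hard to solve in artificial intelligence. With the marked growth in the number of training data sets and the sharp increase in chip processing capability, it has achieved remarkable results in object detection and computer vision, natural language processing, speech recognition, and semantic analysis, and has thereby promoted the development of artificial intelligence. Deep learning is a hierarchical machine learning method comprising multiple levels of non-linear transformations. The deep neural network is its main current form; the connection pattern between its neurons is inspired by the organization of the animal visual cortex, and the convolutional neural network is a classic and widely used network structure of this kind. The local connections, weight sharing, and pooling operations of the convolutional neural network effectively reduce the complexity of the network and the number of training parameters, give the model a degree of invariance to translation, distortion, and scaling, provide strong robustness and fault tolerance, and make the network structure easy to train and optimize. Owing to these properties, it outperforms standard fully connected neural networks in a variety of signal and information processing tasks. This paper first outlines the history of the convolutional neural network and then describes the neuron model and the structure of the multilayer perceptron. It next analyzes the structure of the convolutional neural network in detail, including the convolutional layer, the subsampling (pooling) layer, and the fully connected layer, each of which plays a different role. It then discusses improved convolutional neural networks such as the network-in-network structure and spatial transformer networks. It also introduces the supervised and unsupervised training methods for convolutional neural networks and some commonly used open source tools. In addition, taking image classification, face recognition, audio retrieval, electrocardiogram classification, and object detection as examples, it summarizes the applications of convolutional neural networks. Integrating convolutional neural networks with recurrent neural networks is one possible approach. To give readers as much reference material as possible, this paper also designs and tests convolutional neural networks with different parameters and different depths, in order to understand the relations among the parameters and the influence of different parameter settings on the results. Finally, it presents several open problems concerning the convolutional neural network and its applications.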

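Keywords: convolutional neural network; deep learning; network structure; training method; domain data

Chinese Library Classification number: TP81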

Citation format:

ZHOU Fei-Yan, JIN Lin-Peng, DONG Jun, Review of Convolutional Neural Network, 2017, Vol. 40, Online Publishing No. 7

Review of Convolutional Neural Network

ZHOU Fei-Yan 1),2), JIN Lin-Peng 1),2), DONG Jun

1) (Suzhou Institute of Nano-tech and Nano-bionics, Chinese Academy of Sciences, Suzhou 215123)

2) (University of Chinese Academy of Sciences, Beijing 100049)

Abstract Deep learning, a field that has grown rapidly for more than a decade, has attracted increasing attention from researchers. Compared with shallow architectures, it has clear advantages in both feature extraction and model fitting. It is good at discovering increasingly abstract distributed feature representations from raw input data, and these representations generalize well. It has also solved some problems that were once considered difficult in artificial intelligence. Furthermore, with the marked growth in the size of training data sets and the sharp increase in chip processing capability, deep learning has achieved significant progress in applications such as object detection, computer vision, natural language processing, speech recognition, and semantic parsing, thereby promoting the advancement of artificial intelligence. Deep learning is a hierarchical machine learning method consisting of multiple levels of non-linear transformations. The deep neural network is its main current form, and the connectivity pattern between its neurons is inspired by the organization of the animal visual cortex. The convolutional neural network is a classic and widely used kind of deep neural network. Its characteristics, such as local connections, shared weights, and pooling, reduce the complexity of the network model and the number of parameters to train, give the model some degree of invariance to shift, distortion, and scale, and provide strong robustness and fault tolerance. They also make the network structure easy to train and optimize. Because of these characteristics, it has been shown to outperform standard fully connected neural networks in a variety of signal and information processing tasks. This paper first summarizes the historical development of the convolutional neural network. It then describes the structures of the neuron model and the multilayer perceptron. Next, it gives a detailed analysis of the convolutional neural network architecture, which consists of a number of convolutional layers and pooling layers followed by fully connected layers, with each kind of layer playing a different role. It then describes several improved convolutional neural networks, such as network in network and spatial transformer networks. It also introduces the supervised and unsupervised learning methods for convolutional neural networks and some widely used open source tools. In addition, it reviews applications of the convolutional neural network to image classification, face recognition, audio retrieval, electrocardiogram classification, object detection, and other tasks. Integrating the convolutional neural network with the recurrent neural network could be an alternative machine learning approach. Finally, convolutional neural network structures with different parameters and different depths are designed, and a series of experiments examines the relations between these parameters and the influence of different parameter settings. Some advantages and remaining issues of the convolutional neural network and its applications are summarized.

Key words convolutional neural network; deep learning; network structure; training method; domain data

1 Introduction

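An artificial neural network (ANN) is a simulation and approximation of a biological neural network: an adaptive, non-linear dynamic network system formed by a large number of interconnected neurons. In 1943, the psychologist McCulloch and the mathematical logician Pitts proposed the first mathematical model of the neuron, the MP model [1]. The MP model was groundbreaking and provided a basis for later research. In the late 1950s and early 1960s, Rosenblatt added a learning capability to the MP model and proposed the single-layer perceptron, putting neural network research into practice for the first time [2-3]. However, the single-layer perceptron cannot handle linearly inseparable problems. It was not until 1986 that Rumelhart, Hinton, et al. proposed a multilayer feed-forward network trained by the error back-propagation algorithm, the Back Propagation Network (BP network), which solved some problems that the single-layer perceptron could not [4]. In the 1990s, however, various shallow machine learning models were proposed one after another, a classic example being the support vector machine [5]. Moreover, as the number of layers increases, the traditional BP network runs into problems such as local optima, overfitting, and gradient diffusion (vanishing gradients). Together, these factors caused research on deep models to be shelved.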
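The limitation of the single-layer perceptron on linearly inseparable problems can be illustrated with a minimal sketch. The code below is not taken from the paper: the function name `train_perceptron`, the learning rate, and the epoch cap are illustrative assumptions. It applies a Rosenblatt-style learning rule to two Boolean functions, AND (linearly separable) and XOR (not linearly separable), and reports whether training reaches an epoch with no misclassified samples.

```python
from itertools import product


def train_perceptron(samples, epochs=100, lr=1.0):
    """Perceptron learning rule on 2-input binary samples (illustrative).

    Returns (weights, bias, converged), where converged is True if some
    epoch finished with zero misclassified samples.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            # Step activation: output 1 when the weighted sum is positive.
            output = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            delta = target - output
            if delta != 0:
                # Move the decision boundary toward the misclassified sample.
                w[0] += lr * delta * x1
                w[1] += lr * delta * x2
                b += lr * delta
                errors += 1
        if errors == 0:
            return w, b, True
    return w, b, False


inputs = list(product([0, 1], repeat=2))
and_samples = [(x, x[0] & x[1]) for x in inputs]
xor_samples = [(x, x[0] ^ x[1]) for x in inputs]

and_w, and_b, and_converged = train_perceptron(and_samples)
xor_w, xor_b, xor_converged = train_perceptron(xor_samples)

print("AND converged:", and_converged)
print("XOR converged:", xor_converged)
```

Under these assumptions, training on AND stops once a separating line is found, while training on XOR exhausts the epoch cap, because no single line separates the two XOR classes. A network with a hidden layer, such as the BP network mentioned above, can represent XOR.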

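In 2006, Hinton et al. [6] published a paper in Science whose main points were: 1) artificial neural networks with multiple hidden layers have excellent feature learning ability; 2) the difficulty of training deep neural networks can be effectively overcome through "layer-wise pre-training". This gave rise to research on deep learning and set off another wave of interest in artificial neural networks [7]. In the layer-wise pre-training algorithm of deep learning, unsupervised learning is first applied to pre-train each layer of the network; only one layer is trained, without supervision, at a time, and the training result of that layer is taken as its next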
