XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks

Mohammad Rastegari†, Vicente Ordonez†, Joseph Redmon∗, Ali Farhadi†∗
†Allen Institute for AI, ∗University of Washington
{mohammadr,vicenteor}@allenai.org
{pjreddie,ali}@cs.washington.edu
Abstract. We propose two efficient approximations to standard convolutional
neural networks: Binary-Weight-Networks and XNOR-Networks. In Binary-Weight-
Networks, the filters are approximated with binary values resulting in 32× mem-
ory saving. In XNOR-Networks, both the filters and the input to convolutional
layers are binary. XNOR-Networks approximate convolutions using primarily bi-
nary operations. This results in 58× faster convolutional operations (in terms of
the number of high-precision operations) and 32× memory savings. XNOR-Nets
offer the possibility of running state-of-the-art networks on CPUs (rather than
GPUs) in real-time. Our binary networks are simple, accurate, efficient, and work
on challenging visual tasks. We evaluate our approach on the ImageNet classifi-
cation task. The classification accuracy with a Binary-Weight-Network version of
AlexNet is the same as the full-precision AlexNet. We compare our method with
recent network binarization methods, BinaryConnect and BinaryNets, and out-
perform these methods by large margins on ImageNet, more than 16% in top-1
accuracy. Our code is available at: http://allenai.org/plato/xnornet.
1 Introduction
Deep neural networks (DNN) have shown significant improvements in several applica-
tion domains including computer vision and speech recognition. In computer vision, a
particular type of DNN known as the Convolutional Neural Network (CNN) has demonstrated
state-of-the-art results in object recognition [1,2,3,4] and detection [5,6,7].
Convolutional neural networks show reliable results on object recognition and de-
tection that are useful in real-world applications. Concurrent with this progress in
recognition, interesting advancements have been happening in virtual reality (VR by
Oculus) [8], augmented reality (AR by HoloLens) [9], and smart wearable devices.
Putting these two pieces together, we argue that it is the right time to equip smart
portable devices with the power of state-of-the-art recognition systems. However, CNN-
based recognition systems need large amounts of memory and computational power.
While they perform well on expensive, GPU-based machines, they are often unsuitable
for smaller devices like cell phones and embedded electronics.
For example, AlexNet [1] has 61M parameters (249MB of memory) and performs
1.5B high-precision operations to classify one image. These numbers are even higher for
deeper CNNs, e.g., VGG [2] (see Section 4.1). These models quickly overtax the limited
storage, battery power, and compute capabilities of smaller devices like cell phones.
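As a rough back-of-the-envelope check of these numbers (a minimal sketch; the quoted 249MB also reflects how the parameters are counted and stored):

    # Rough memory estimate for an AlexNet-scale model (61M parameters),
    # comparing 32-bit floating-point storage with 1-bit binary weights.
    params = 61_000_000

    fp32_bytes = params * 4        # 4 bytes per 32-bit float  -> ~244 MB
    binary_bytes = params / 8      # 1 bit per binary weight   -> ~7.6 MB

    print(f"full precision: {fp32_bytes / 1e6:.0f} MB")    # ~244 MB
    print(f"binary weights: {binary_bytes / 1e6:.1f} MB")  # ~7.6 MB
    print(f"saving: {fp32_bytes / binary_bytes:.0f}x")     # 32x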
[Fig. 1 diagram: a real-valued input tensor of size c × w_in × h_in convolved with w × h weight filters, shown in three variants: real-value weights with real-value inputs (standard convolution), binary weights with real-value inputs (32× smaller weights), and binary weights with binary inputs (32× smaller weights). The embedded comparison table:

Network Variations                        | Operations used in Convolution | Memory Saving (Inference) | Computation Saving (Inference) | Accuracy on ImageNet (AlexNet)
Standard Convolution                      | +, -, ×                        | 1×                        | 1×                             | 56.7%
Binary Weight                             | +, -                           | ~32×                      | ~2×                            | 56.8%
Binary Input and Binary Weight (XNOR-Net) | XNOR, bitcount                 | ~32×                      | ~58×                           | 44.2% ]
Fig. 1: We propose two efficient variations of convolutional neural networks: Binary-
Weight-Networks, where the weight filters contain binary values, and XNOR-Networks,
where both weights and inputs have binary values. These networks are very efficient in
terms of memory and computation while being very accurate on natural image classifi-
cation. This offers the possibility of using accurate vision techniques on portable devices
with limited resources.
In this paper, we introduce simple, efficient, and accurate approximations to CNNs
by binarizing the weights and even the intermediate representations in convolutional
neural networks. Our binarization method aims at finding the best approximations of the
convolutions using binary operations. We demonstrate that our way of binarizing neural
networks results in ImageNet classification accuracy numbers that are comparable to
standard full-precision networks while requiring significantly less memory and fewer
floating point operations.
We study two approximations: neural networks with binary weights, and XNOR-
Networks. In Binary-Weight-Networks all the weight values are approximated with bi-
nary values. A convolutional neural network with binary weights is significantly smaller
(∼32×) than an equivalent network with single-precision weight values. In addition,
when weight values are binary, convolutions can be estimated by only addition and
subtraction (without multiplication), resulting in a ∼2× speed-up. Binary-weight ap-
proximations of large CNNs can fit into the memory of even small, portable devices
while maintaining the same level of accuracy (see Sections 4.1 and 4.2).
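Concretely, the approximation the paper develops in its method section represents each real-valued filter W as αB, where B = sign(W) and the scaling factor α is the mean absolute value of the weights (the ℓ1 norm of W divided by the number of weights). A minimal NumPy sketch of this idea (the function name is ours, not from the released code):

    import numpy as np

    def binarize_filter(W):
        """Approximate a real-valued filter W by alpha * B, with B in {-1, +1}.

        B = sign(W), and alpha = mean(|W|) is the optimal scaling factor.
        """
        B = np.sign(W)
        B[B == 0] = 1                    # map sign(0) to +1 so B stays binary
        alpha = np.abs(W).mean()
        return alpha, B

    # The convolution W * X is then estimated as alpha * (B * X), where B * X
    # needs only additions and subtractions, since B contains only +1 and -1.
    W = np.random.randn(3, 3)
    alpha, B = binarize_filter(W)
    print(np.linalg.norm(W - alpha * B))  # approximation error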
To take this idea further, we introduce XNOR-Networks, where both the weights
and the inputs to the convolutional and fully connected layers are approximated with
binary values¹. Binary weights and binary inputs allow an efficient way of implement-
ing convolutional operations. If all of the operands of the convolutions are binary, then
the convolutions can be estimated by XNOR and bitcounting operations [11]. XNOR-
Nets result in accurate approximations of CNNs while offering a 58× speed-up on CPUs
(in terms of the number of high-precision operations). This means that XNOR-Nets can
enable real-time inference on devices with small memory and no GPUs (inference in
XNOR-Nets can be done very efficiently on CPUs).
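To illustrate the XNOR/bitcount trick on a single dot product (a sketch with helper names of our own, not the paper's implementation): if two ±1 vectors of length n are packed into bit words with +1 encoded as 1, their dot product equals 2·popcount(XNOR(a, b)) − n, because XNOR marks exactly the positions where the two vectors agree:

    import numpy as np

    def pack_bits(v):
        """Pack a +/-1 vector into a Python int, one bit per element (+1 -> 1)."""
        bits = (v > 0).astype(np.uint8)
        return int.from_bytes(np.packbits(bits).tobytes(), "big")

    def xnor_dot(a, b, n):
        """Dot product of two packed +/-1 vectors via XNOR and bitcount."""
        width = ((n + 7) // 8) * 8            # packed width in bits, incl. padding
        mask = ((1 << n) - 1) << (width - n)  # keep only the n real positions
        agree = bin(~(a ^ b) & mask).count("1")  # popcount of XNOR = #agreements
        return 2 * agree - n                  # agreements minus disagreements

    v1 = np.random.choice([-1, 1], size=64)
    v2 = np.random.choice([-1, 1], size=64)
    assert xnor_dot(pack_bits(v1), pack_bits(v2), 64) == int(v1 @ v2)

A real implementation would apply this word by word with hardware popcount instructions inside the convolution loop; the sketch only shows the algebraic equivalence.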
To the best of our knowledge, this paper is the first attempt to present an evalua-
tion of binary neural networks on large-scale datasets like ImageNet. Our experimental
results show that our proposed method for binarizing convolutional neural networks
outperforms the state-of-the-art network binarization method of [11] by a large margin
(16.3%) on top-1 image classification in the ImageNet challenge ILSVRC2012.

¹ Fully connected layers can be implemented by convolution; therefore, in the rest of the paper,
we also refer to them as convolutional layers [10].