
tion, we only take the next operations, i.e., the unexplored next-step compression strategies of the evaluated compression scheme, as the search space. We then select the Pareto-optimal operations for scheme evaluation, and finally take the next operations of the new schemes as the newly expanded search area for the next round of optimization. In this way, AutoMC selectively and gradually explores the more valuable parts of the search space, reducing the search difficulty and improving search efficiency. In addition, AutoMC can analyze and compare, in a fine-grained manner, the impact of subsequent operations on the performance of each compression scheme, and commit to the more valuable next-step exploration routes, thereby effectively reducing the evaluation of useless schemes.
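The Pareto-optimal selection step above can be sketched as follows. This is a minimal illustration, not AutoMC's actual implementation: the function name `pareto_front` and the scoring of each candidate operation as an (accuracy, compression ratio) pair are our own illustrative assumptions.

```python
def pareto_front(candidates):
    """Return the non-dominated candidates. Each candidate is an
    (accuracy, compression_ratio) pair; higher is better on both axes.
    A candidate is dropped if some other candidate is at least as good
    on both objectives (and differs on at least one)."""
    front = []
    for a in candidates:
        dominated = any(
            b[0] >= a[0] and b[1] >= a[1] and b != a
            for b in candidates
        )
        if not dominated:
            front.append(a)
    return front
```

Only the surviving front is evaluated, which is what prunes away the less promising next-step operations before any costly scheme evaluation happens.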
Experimental results show that AutoMC can quickly find powerful model compression schemes. Compared with existing AutoML algorithms, which are non-progressive and ignore domain knowledge, AutoMC is better suited to the automatic model compression problem, where the search space is huge and the components are complete, executable algorithms.
Our contributions are summarized as follows:
1. Automation. AutoMC can automatically design an effective model compression scheme according to user demands. To the best of our knowledge, this is the first automatic model compression tool.
2. Innovation. To improve the search efficiency of the AutoMC algorithm, we design an effective analysis method based on domain knowledge and a progressive search strategy. To the best of our knowledge, AutoMC is the first AutoML algorithm that introduces external knowledge.
3. Effectiveness. Extensive experimental results show that, with the help of domain knowledge and the progressive search strategy, AutoMC can efficiently search for the optimal model compression scheme for users, outperforming compression methods designed by humans.
2. Related Work
2.1. Model Compression Methods
Model compression is key to deploying neural networks on mobile or embedded devices, and has been widely studied. Researchers have proposed many effective compression methods, which can be roughly divided into four categories: (1) pruning methods, which remove redundant parts, e.g., filters, channels, kernels, or layers, from the neural network [7, 17, 18, 22]; (2) knowledge distillation methods, which train a compact and computationally efficient model under the supervision of a well-trained larger model; (3) low-rank approximation methods, which split convolutional matrices into smaller ones using decomposition techniques [16]; and (4) quantization methods, which reduce the precision of the network's parameter values [10, 29].
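As a concrete illustration of category (4), the core idea of uniform quantization can be sketched in a few lines. The helper name `uniform_quantize` and the quantize-then-dequantize round trip are our own simplifying choices for illustration, not a specific method from [10, 29].

```python
def uniform_quantize(weights, n_bits=8):
    """Uniformly quantize a list of weights to n_bits levels, then map
    them back to floats, showing the precision loss that quantization
    trades for a smaller parameter representation."""
    lo, hi = min(weights), max(weights)
    levels = 2 ** n_bits - 1
    scale = (hi - lo) / levels
    if scale == 0:                     # constant tensor: nothing to quantize
        return list(weights)
    # snap each weight to the nearest of the 2^n_bits grid points
    return [round((w - lo) / scale) * scale + lo for w in weights]
```

With 8 bits, each reconstructed weight differs from the original by at most half a grid step, while the stored index needs only one byte instead of a full float.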
These compression methods have their own advantages and have achieved great success on many compression tasks, but they are difficult to apply directly, as discussed in the introduction. In this paper, we aim to flexibly use the experience they provide to support the automatic design of model compression schemes.
2.2. Automated Machine Learning Algorithms
The goal of Automated Machine Learning (AutoML) is to realize the progressive automation of ML, including the automatic design of neural network architectures and ML workflows [9, 28], and the automatic setting of ML model hyperparameters [11, 23]. Existing AutoML algorithms follow a common pattern: define an effective search space containing a variety of solutions, design an efficient search strategy to quickly find the best ML solution in that space, and output the best solution found.
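This three-step pattern (space, strategy, output) can be sketched with the simplest possible strategy, random sampling. The function `automl_search` and the discrete list-based search space are illustrative assumptions; real AutoML systems substitute far more sophisticated strategies for the sampling step.

```python
import random

def automl_search(search_space, evaluate, n_trials=10, seed=0):
    """Generic AutoML loop: repeatedly sample a candidate solution from
    the search space, evaluate it, and keep the best one seen so far."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_trials):
        candidate = rng.choice(search_space)   # the "search strategy"
        score = evaluate(candidate)            # e.g., validation accuracy
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

The search strategy is the component the next subsection classifies, since it determines how quickly good candidates are found in a large space.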
The search strategy has a great impact on the performance of an AutoML algorithm. Existing AutoML search strategies can be divided into three categories: Reinforcement Learning (RL) based methods [1], Evolutionary Algorithm (EA) based methods [4, 25], and gradient-based methods [20, 24]. RL-based methods use a recurrent network as a controller to determine a sequence of operators, thereby constructing the ML solution sequentially. EA-based methods first initialize a population of ML solutions and then evolve them, using their validation accuracies as fitness values. Gradient-based methods are designed for neural architecture search problems: they relax the search space to be continuous, so that the architecture can be optimized with respect to its validation performance by gradient descent [3]. They cannot handle a search space composed of executable compression strategies, so we only compare AutoMC's search strategy with the former two.
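A minimal sketch of the EA-based approach described above, with the population evolved under validation accuracy as the fitness. The function names `evolve` and `mutate` and the drop-the-worst replacement rule are simplifying assumptions for illustration, not the exact algorithms of [4, 25].

```python
import random

def evolve(init_population, fitness, mutate, generations=20, seed=0):
    """Minimal EA loop: each generation drops the worst member of the
    population and replaces it with a mutated copy of a random survivor.
    The best member is always kept, so max fitness never decreases."""
    rng = random.Random(seed)
    pop = list(init_population)
    for _ in range(generations):
        scored = sorted(pop, key=fitness)   # ascending: worst first
        parent = rng.choice(scored[1:])     # pick a parent among survivors
        child = mutate(parent, rng)
        pop = scored[1:] + [child]          # drop the worst, add the child
    return max(pop, key=fitness)
```

In an AutoML setting each population member would be a full ML solution and `fitness` its validation accuracy; the sketch uses integers only to keep the example self-contained.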
3. Our Approach
We first give the related concepts of model compression and the problem definition of automatic model compression (Section 3.1). We then make full use of existing experience to construct an efficient search space for the compression domain (Section 3.2). Finally, we design a search strategy that improves search efficiency by introducing knowledge and reducing the search space, helping users quickly find the optimal compression scheme (Section 3.3).
3.1. Related Concepts and Problem Definition
Related Concepts. Given a neural model M, we use P(M), F(M), and A(M) to denote its parameter amount,