
OPTICAL COMPUTING

All-optical machine learning using diffractive deep neural networks

Xing Lin^{1,2,3}*, Yair Rivenson^{1,2,3}*, Nezih T. Yardimci^{1,3}, Muhammed Veli^{1,2,3}, Yi Luo^{1,2,3}, Mona Jarrahi^{1,3}, Aydogan Ozcan^{1,2,3,4}†
Deep learning has been transforming our ability to execute advanced inference tasks using computers. Here we introduce a physical mechanism to perform machine learning by demonstrating an all-optical diffractive deep neural network (D²NN) architecture that can implement various functions following the deep learning–based design of passive diffractive layers that work collectively. We created 3D-printed D²NNs that implement classification of images of handwritten digits and fashion products, as well as the function of an imaging lens at a terahertz spectrum. Our all-optical deep learning framework can perform, at the speed of light, various complex functions that computer-based neural networks can execute; will find applications in all-optical image analysis, feature detection, and object classification; and will also enable new camera designs and optical components that perform distinctive tasks using D²NNs.
Deep learning is one of the fastest-growing machine learning methods (1). This approach uses multilayered artificial neural networks implemented in a computer to digitally learn data representation and abstraction and to perform advanced tasks in a manner comparable or even superior to the performance of human experts. Recent examples in which deep learning has made major advances in machine learning include medical image analysis (2), speech recognition (3), language translation (4), and image classification (5), among others (1, 6). Beyond some of these mainstream applications, deep learning methods are also being used to solve inverse imaging problems (7–13).
Here we introduce an all-optical deep learning framework in which the neural network is physically formed by multiple layers of diffractive surfaces that work in collaboration to optically perform an arbitrary function that the network can statistically learn. Whereas the inference and prediction mechanism of the physical network is all optical, the learning part that leads to its design is done through a computer. We term this framework a diffractive deep neural network (D²NN) and demonstrate its inference capabilities through both simulations and experiments.
Our D²NN can be physically created by using several transmissive and/or reflective layers (14), where each point on a given layer either transmits or reflects the incoming wave, representing an artificial neuron that is connected to other neurons of the following layers through optical diffraction (Fig. 1A). In accordance with the Huygens-Fresnel principle, our terminology is based on each point on a given layer acting as a secondary source of a wave, the amplitude and phase of which are determined by the product of the input wave and the complex-valued transmission or reflection coefficient at that point [see (14) for an analysis of the waves within a D²NN]. Therefore, an artificial neuron in a D²NN is connected to other neurons of the following layer through a secondary wave modulated in amplitude and phase by both the input interference pattern created by the earlier layers and the local transmission or reflection coefficient at that point. As an analogy to standard deep neural networks (Fig. 1D), one can consider the transmission or reflection coefficient of each point or neuron as a multiplicative "bias" term, which is a learnable network parameter that is iteratively adjusted during the training process of the diffractive network, using an error back-propagation method.
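This secondary-wave picture maps directly onto a numerical model: each layer multiplies the incident field pointwise by a complex coefficient, and free-space diffraction connects it to the next plane. Below is a minimal sketch of a single phase-only transmissive layer using the angular spectrum method for the propagation step; the grid size, pixel pitch, and layer spacing are illustrative assumptions, not the paper's design parameters.

```python
import numpy as np

def angular_spectrum(field, wavelength, dx, z):
    """Propagate a sampled complex field a distance z between layers."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)                  # spatial frequencies
    fxx, fyy = np.meshgrid(fx, fx)
    arg = 1.0 - (wavelength * fxx) ** 2 - (wavelength * fyy) ** 2
    kz = (2 * np.pi / wavelength) * np.sqrt(np.maximum(arg, 0.0))
    transfer = np.exp(1j * kz * z) * (arg > 0)    # drop evanescent waves
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

def diffractive_layer(field, phase, wavelength, dx, z):
    """One neuron plane: pointwise phase modulation, then diffraction."""
    return angular_spectrum(field * np.exp(1j * phase), wavelength, dx, z)

# 0.4-THz illumination corresponds to a 0.75-mm wavelength
wavelength, dx, z = 0.75e-3, 0.4e-3, 30e-3
field = np.ones((64, 64), dtype=complex)          # uniform plane wave in
out = diffractive_layer(field, np.zeros((64, 64)), wavelength, dx, z)
```

A plane wave through a flat phase mask remains a plane wave, which makes a convenient sanity check; a trained mask would impose a spatially varying phase instead.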
After this numerical training phase, the D²NN design is fixed and the transmission or reflection coefficients of the neurons of all layers are determined. This D²NN design, once physically fabricated using techniques such as 3D printing or lithography, can then perform, at the speed of light, the specific task for which it is trained, using only optical diffraction and passive optical components or layers that do not need power, thereby creating an efficient and fast way of implementing machine learning tasks.
In general, the phase and amplitude of each neuron can be learnable parameters, providing a complex-valued modulation at each layer, which improves the inference performance of the diffractive network (fig. S1) (14). For coherent transmissive networks with phase-only modulation, each layer can be approximated as a thin optical element (Fig. 1). Through deep learning, the phase values of the neurons of each layer of the diffractive network are iteratively adjusted (trained) to perform a specific function by feeding training data at the input layer and then computing the network's output through optical diffraction. On the basis of the calculated error with respect to the target output, determined by the desired function, the network structure and its neuron phase values are optimized via an error back-propagation algorithm, which is based on the stochastic gradient descent approach used in conventional deep learning (14).
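As a toy illustration of this training loop, the sketch below adjusts a single phase-only layer by gradient descent so that the diffracted intensity concentrates on a target output region. It replaces the paper's analytic back-propagation (14) with a crude finite-difference gradient and uses a bare FFT as a stand-in for the diffraction between the layer and the output plane; the grid size, learning rate, and target region are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16                                      # tiny grid for illustration
target = np.zeros((n, n))
target[4:8, 4:8] = 1.0                      # desired bright output region

def loss(phase):
    field = np.ones((n, n), dtype=complex)  # uniform coherent input
    out = np.fft.fftshift(np.fft.fft2(field * np.exp(1j * phase)))
    inten = np.abs(out) ** 2
    return -np.sum(inten * target) / np.sum(inten)  # maximize target energy

phase = rng.uniform(0, 2 * np.pi, (n, n))
eps, lr = 1e-4, 5.0
history = [loss(phase)]
for _ in range(20):                         # finite-difference descent
    grad = np.zeros_like(phase)
    for i in range(n):
        for j in range(n):
            bumped = phase.copy()
            bumped[i, j] += eps
            grad[i, j] = (loss(bumped) - history[-1]) / eps
    phase -= lr * grad
    history.append(loss(phase))
```

In practice the phase values would be trained with analytic gradients through the full multilayer diffraction model; an autodiff framework would replace the finite-difference loop here.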
To demonstrate the performance of the D²NN framework, we first trained it as a digit classifier to perform automated classification of handwritten digits, from 0 to 9 (Figs. 1B and 2A). For this task, phase-only transmission masks were designed by training a five-layer D²NN with 55,000 images (5000 validation images) from the MNIST (Modified National Institute of Standards and Technology) handwritten digit database (15). Input digits were encoded into the amplitude of the input field to the D²NN, and the diffractive network was trained to map input digits into 10 detector regions, one for each digit. The classification criterion was to find the detector with the maximum optical signal, and this was also used as a loss function during the network training (14).
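The max-signal criterion amounts to a few lines of code: integrate the output intensity over each detector region and pick the largest. The 30 × 60 output plane and region layout below are hypothetical placeholders, not the printed network's actual geometry.

```python
import numpy as np

# hypothetical layout: ten 10x10 detector regions in two rows of five
detector_regions = [
    (slice(5 + 15 * (k // 5), 15 + 15 * (k // 5)),
     slice(2 + 12 * (k % 5), 12 + 12 * (k % 5)))
    for k in range(10)
]

def classify(intensity):
    """Predicted digit = detector region receiving the most optical signal."""
    signals = np.array([intensity[r].sum() for r in detector_regions])
    return int(np.argmax(signals))

# simulated output plane with most of the light landing on detector 7
plane = np.full((30, 60), 0.01)
plane[detector_regions[7]] += 1.0
digit = classify(plane)   # -> 7
```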
After training, the design of the D²NN digit classifier was numerically tested using 10,000 images from the MNIST test dataset (which were not used as part of the training or validation image sets) and achieved a classification accuracy of 91.75% (Fig. 3C and fig. S1). In addition to the classification performance of the diffractive network, we also analyzed the energy distribution observed at the network output plane for the same 10,000 test digits (Fig. 3C), the results of which clearly demonstrate that the diffractive network learned to focus the input energy of each handwritten digit into the correct (i.e., the target) detector region, in accord with its training. With the use of complex-valued modulation and increasing numbers of layers, neurons, and connections in the diffractive network, our classification accuracy can be further improved (figs. S1 and S2). For example, fig. S2 demonstrates a Lego-like physical transfer learning behavior for the D²NN framework, where the inference performance of an already existing D²NN can be further improved by adding new diffractive layers, or, in some cases, by peeling off (i.e., discarding) some of the existing layers, where the new layers to be added are trained for improved inference (coming from the entire diffractive network: old and new layers). By using a patch of two layers added to an existing and fixed D²NN design (N = 5 layers), we improved our MNIST classification accuracy to 93.39% (fig. S2) (14); the state-of-the-art convolutional neural network performance has been reported as 99.60 to 99.77% (16–18). More discussion on reconfiguring D²NN designs is provided in the supplementary materials (14).
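The "Lego-like" transfer learning described above can be sketched in a toy setting: keep an existing phase layer frozen and train only a newly appended layer, again with a finite-difference stand-in for back-propagation and an FFT stand-in for the diffraction between planes. All sizes, rates, and the target region are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 16
target = np.zeros((n, n))
target[6:10, 6:10] = 1.0

def loss(phases):
    field = np.ones((n, n), dtype=complex)
    for phi in phases:                      # cascade of phase-only layers
        field = np.fft.fft2(field * np.exp(1j * phi)) / n
    inten = np.abs(np.fft.fftshift(field)) ** 2
    return -np.sum(inten * target) / np.sum(inten)

fixed = rng.uniform(0, 2 * np.pi, (n, n))   # "existing" design: left untouched
patch = rng.uniform(0, 2 * np.pi, (n, n))   # appended layer: only trainable part
eps, lr = 1e-4, 5.0
history = [loss([fixed, patch])]
for _ in range(10):
    grad = np.zeros_like(patch)
    for i in range(n):
        for j in range(n):
            bumped = patch.copy()
            bumped[i, j] += eps
            grad[i, j] = (loss([fixed, bumped]) - history[-1]) / eps
    patch -= lr * grad
    history.append(loss([fixed, patch]))
```

Only `patch` is updated, mirroring how a new printed layer could be added to an already fabricated design while the old layers stay fixed.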
Following these numerical results, we 3D-printed our five-layer D²NN design (Fig. 2A), with each layer having an area of 8 cm by 8 cm, followed by 10 detector regions defined at the output plane of the diffractive network (Figs. 1B and 3A). We then used continuous-wave illumination at 0.4 THz to test the network's inference performance (Fig. 2, C and D). Phase values of
Lin et al., Science 361, 1004–1008 (2018), 7 September 2018
^{1}Department of Electrical and Computer Engineering, University of California, Los Angeles, CA 90095, USA. ^{2}Department of Bioengineering, University of California, Los Angeles, CA 90095, USA. ^{3}California NanoSystems Institute (CNSI), University of California, Los Angeles, CA 90095, USA. ^{4}Department of Surgery, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA.
*These authors contributed equally to this work.
†Corresponding author. Email: ozcan@ucla.edu