
FEEDFORWARD NEURAL NETWORK

What we'll cover
‣ how neural networks take an input x and make a prediction f(x)
  - forward propagation
  - types of units
  - capacity of neural networks
‣ how to train neural nets (classifiers) on data
  - loss function
  - parameter gradient computation with backpropagation
  - gradient descent algorithms
‣ deep learning
  - dropout
  - batch normalization
  - unsupervised pre-training
Feedforward neural network
Hugo Larochelle
Département d'informatique
Université de Sherbrooke
hugo.larochelle@usherbrooke.ca
September 6, 2012

Abstract
Math for my slides “Feedforward neural network”.
• $a(x) = b + \sum_i w_i x_i = b + w^\top x$
• $h(x) = g(a(x)) = g\big(b + \sum_i w_i x_i\big)$
• $x_1 \ldots x_d$
• $w$
• $g(\cdot)$, $b$
• $h(x) = g(a(x))$
• $a(x) = b^{(1)} + W^{(1)} x$  $\big(a(x)_i = b^{(1)}_i + \sum_j W^{(1)}_{i,j} x_j\big)$
• $o(x) = g^{(\mathrm{out})}\big(b^{(2)} + {w^{(2)}}^\top x\big)$
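As a concrete reading of the formulas above, here is a minimal numpy sketch of the forward pass for a single-hidden-layer network. The function name, the shapes, the choice of sigmoid for both $g$ and $g^{(\mathrm{out})}$, and the output weights acting on the hidden activations $h(x)$ are assumptions for illustration; the notes leave these choices open at this point.

import numpy as np

def forward(x, b1, W1, b2, w2):
    # Pre-activation of the hidden layer: a(x) = b^(1) + W^(1) x
    a = b1 + W1 @ x                      # W1: (m, d), x: (d,), b1: (m,)
    # Hidden layer: h(x) = g(a(x)), with g = sigmoid in this sketch
    h = 1.0 / (1.0 + np.exp(-a))
    # Output: g^(out)(b^(2) + w^(2)^T h(x)), again a sigmoid here
    return 1.0 / (1.0 + np.exp(-(b2 + w2 @ h)))

For example, forward(np.ones(3), np.zeros(4), np.zeros((4, 3)), 0.0, np.zeros(4)) returns 0.5, since every pre-activation is zero and sigm(0) = 0.5.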
• $b$, $w_1 \ldots w_d$
• $g(a) = a$
• $g(a) = \mathrm{sigm}(a) = \frac{1}{1 + \exp(-a)}$
• $g(a) = \tanh(a) = \frac{\exp(a) - \exp(-a)}{\exp(a) + \exp(-a)} = \frac{\exp(2a) - 1}{\exp(2a) + 1}$
• $g(a) = \max(0, a)$
• $g(a) = \mathrm{reclin}(a) = \max(0, a)$
• $W^{(1)}_{i,j}$, $b^{(1)}_i$, $x_j$, $h(x)_i$
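The same activation functions, written as a small numpy sketch (the function names just follow the notation above; this is an illustrative sketch, not code from the slides):

import numpy as np

def linear(a):
    return a                          # g(a) = a

def sigm(a):
    return 1.0 / (1.0 + np.exp(-a))   # g(a) = 1 / (1 + exp(-a)), squashes to (0, 1)

def tanh(a):
    return np.tanh(a)                 # g(a) = (exp(2a) - 1) / (exp(2a) + 1), squashes to (-1, 1)

def reclin(a):
    return np.maximum(0.0, a)         # g(a) = max(0, a), the rectified linear unit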
Feedforward neural network
Hugo Larochelle
Département d'informatique
Université de Sherbrooke
hugo.larochelle@usherbrooke.ca
September 13, 2012

Abstract
Math for my slides “Feedforward neural network”.
• $f(x)$
• $l(f(x^{(t)}; \theta), y^{(t)})$
• $\nabla_\theta\, l(f(x^{(t)}; \theta), y^{(t)})$
• $\Omega(\theta)$
• $\nabla_\theta\, \Omega(\theta)$
• $f(x)_c = p(y = c \mid x)$
• $x^{(t)}$, $y^{(t)}$
• $l(f(x), y) = -\sum_c 1_{(y=c)} \log f(x)_c = -\log f(x)_y$
• $\frac{\partial}{\partial f(x)_c}\big({-\log f(x)_y}\big) = \frac{-1_{(y=c)}}{f(x)_y}$
  $\nabla_{f(x)}\big({-\log f(x)_y}\big) = \frac{-1}{f(x)_y}\big[1_{(y=0)}, \ldots, 1_{(y=C-1)}\big]^\top = \frac{-e(y)}{f(x)_y}$
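A short numpy sketch of this loss and its gradient with respect to $f(x)$, assuming f_x is a length-C vector of class probabilities $p(y = c \mid x)$ and y is the target class index (the names are mine; this illustrates the formulas above rather than reproducing code from the slides):

import numpy as np

def nll_loss(f_x, y):
    # l(f(x), y) = -log f(x)_y
    return -np.log(f_x[y])

def nll_grad(f_x, y):
    # Gradient w.r.t. f(x): -e(y) / f(x)_y, i.e. -1 / f(x)_y at index y and 0 elsewhere
    grad = np.zeros_like(f_x)
    grad[y] = -1.0 / f_x[y]
    return grad

For example, with f_x = np.array([0.2, 0.5, 0.3]) and y = 1, nll_loss gives -log(0.5) ≈ 0.693 and nll_grad gives [0, -2, 0].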