TensorFlow 2.0 (Part 2): Basic Image Classification

私人物语 2020-05-07


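This guide trains a neural network model to classify images of clothing, such as sneakers and shirts. It's fine if you don't understand every detail; this is a fast-paced overview of a complete TensorFlow program, and the details are explained as you go.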

This guide uses tf.keras, a high-level API for building and training models in TensorFlow.

import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt


print(tf.__version__)

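Importing the Fashion MNIST dataset

The Fashion MNIST dataset contains 70,000 grayscale images, each 28x28 pixels, spread across 10 clothing categories. 60,000 images form the training set and 10,000 form the test set. The dataset can be loaded directly from TensorFlow, and it is returned as NumPy arrays.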

fashion_mnist = keras.datasets.fashion_mnist
# This step downloads four archives from googleapis: training labels, training images, test labels and test images
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

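Each image is a 28x28 NumPy array with pixel values in the range 0 to 255. The labels are an array of integers from 0 to 9, each corresponding to a clothing class.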

Label	Class
0	T-shirt/top
1	Trouser
2	Pullover
3	Dress
4	Coat
5	Sandal
6	Shirt
7	Sneaker
8	Bag
9	Ankle boot

class_names = [
    'T-shirt/top', 
    'Trouser', 
    'Pullover', 
    'Dress', 
    'Coat',
    'Sandal', 
    'Shirt', 
    'Sneaker', 
    'Bag', 
    'Ankle boot'
]
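
As a quick illustration of how the integer labels map to names, the sketch below looks up three labels in class_names. The sample_labels list is typed in by hand (9, 0, 0, the first three training labels) so the snippet runs without downloading the dataset.

```python
# Class names indexed by label, as defined above.
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Hand-typed sample: the first three values of train_labels.
sample_labels = [9, 0, 0]

# A label is simply an index into class_names.
sample_names = [class_names[label] for label in sample_labels]
print(sample_names)  # ['Ankle boot', 'T-shirt/top', 'T-shirt/top']
```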

Exploring the data

train_images.shape # (60000, 28, 28): 60,000 training images, each 28x28 pixels
len(train_labels) # 60000: number of labels in the training set
train_labels # array([9, 0, 0, ..., 3, 0, 5], dtype=uint8): label format
test_images.shape # (10000, 28, 28): 10,000 test images, each 28x28 pixels
len(test_labels) # 10000: number of labels in the test set
test_labels  # array([9, 2, 1, ..., 8, 1, 5], dtype=uint8): label format

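Preprocessing the data

The data must be preprocessed before training. Pixel values range from 0 to 255, so we scale them to the range 0 to 1 by dividing by 255. The training set and the test set must be preprocessed in the same way.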

train_images = train_images / 255.0
test_images = test_images / 255.0
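
To see what the division does, here is a minimal sketch on a tiny hand-made array (tiny_image is a made-up stand-in for one image, not part of the dataset). Dividing a uint8 array by the float 255.0 produces a floating-point array with values between 0 and 1.

```python
import numpy as np

# Made-up 2x2 "image" with the same dtype as the real data (uint8).
tiny_image = np.array([[0, 51], [204, 255]], dtype=np.uint8)

scaled = tiny_image / 255.0  # true division converts to float64

print(scaled.dtype)  # float64
print(scaled)        # values 0.0, 0.2, 0.8 and 1.0
```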

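Before building and training the network, verify that the data is in the correct format: we display the first 25 images from the training set with the class name below each image.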

plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[train_labels[i]])
plt.show()

Building the model

The basic building block of a neural network is the layer. Most deep learning networks are made by chaining together simple layers.

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])

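The first layer of the network, Flatten, transforms the input from a 28x28 two-dimensional array into a one-dimensional array of 784 values. It only unstacks the rows of pixels and lines them up end to end; it has no parameters to learn.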

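Next come two Dense layers, i.e. fully connected (FC) layers. The first Dense layer has 128 neurons. The second has 10 neurons and, through softmax, returns an array of 10 probabilities that sum to 1; each value is the probability that the current image belongs to one of the classes 0-9.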
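
The shapes involved can be traced with plain NumPy. The sketch below mimics the forward pass of the three layers using random, untrained weights (W1, b1, W2 and b2 are illustrative names, and the random initialization is an assumption, not what Keras uses), so only the shapes and the sum-to-1 property are meaningful.

```python
import numpy as np

rng = np.random.default_rng(0)

image = rng.random((28, 28))            # stand-in for one scaled image

# Flatten: 28x28 -> 784, rows laid end to end, no learned parameters.
x = image.reshape(-1)

# Dense(128, relu): weights 784x128 plus one bias per neuron.
W1, b1 = rng.normal(0, 0.05, (784, 128)), np.zeros(128)
hidden = np.maximum(0, x @ W1 + b1)

# Dense(10, softmax): weights 128x10, then normalize to probabilities.
W2, b2 = rng.normal(0, 0.05, (128, 10)), np.zeros(10)
logits = hidden @ W2 + b2
exp = np.exp(logits - logits.max())     # subtract max for numerical stability
probs = exp / exp.sum()

print(x.shape, hidden.shape, probs.shape)  # (784,) (128,) (10,)
print(probs.sum())                          # 1.0 up to rounding
```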

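Compiling the model

Before the model is ready for training, a few more settings are needed in the compile step.

•Loss function: measures how far the model's predictions are from the true labels during training. We want to minimize this function to steer the model in the right direction.
•Optimizer: the algorithm that updates the model's parameters based on the data and the loss.
•Metrics: used to monitor the training and testing steps. The example below uses accuracy, the fraction of images that are correctly classified.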

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
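
For intuition about what the loss and the accuracy metric measure, here is a minimal NumPy sketch. With integer labels, sparse categorical cross-entropy is the negative log of the probability the model assigns to the true class, averaged over the samples. The probabilities and labels below are made up for illustration.

```python
import numpy as np

# Made-up predicted probabilities for 3 samples over 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3]])
labels = np.array([0, 1, 2])  # made-up true classes

# Loss: mean of -log(probability assigned to the true class).
true_class_probs = probs[np.arange(len(labels)), labels]
loss = -np.log(true_class_probs).mean()

# Accuracy: fraction of samples whose most likely class is the true one.
accuracy = (probs.argmax(axis=1) == labels).mean()

print(loss)
print(accuracy)
```

The third sample is misclassified (class 1 is the most likely, the true class is 2), so it lowers the accuracy and contributes the largest term to the loss.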

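Training the model

Training a neural network usually involves the following steps.

•Feed the training data, train_images and train_labels, to the model;
•The model learns to associate images with labels;
•Ask the model to make predictions on the test set test_images, and verify them against test_labels.

Call model.fit to start training for 10 epochs.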

model.fit(train_images, train_labels, epochs=10)

The output is as follows:

Train on 60000 samples
Epoch 1/10
60000/60000 [==============================] - 4s 60us/sample - loss: 0.4923 - accuracy: 0.8271
Epoch 2/10
60000/60000 [==============================] - 4s 60us/sample - loss: 0.3717 - accuracy: 0.8662
Epoch 3/10
60000/60000 [==============================] - 4s 62us/sample - loss: 0.3328 - accuracy: 0.8788
Epoch 4/10
60000/60000 [==============================] - 4s 58us/sample - loss: 0.3096 - accuracy: 0.8854
Epoch 5/10
60000/60000 [==============================] - 3s 58us/sample - loss: 0.2936 - accuracy: 0.8915
Epoch 6/10
60000/60000 [==============================] - 4s 59us/sample - loss: 0.2810 - accuracy: 0.8975
Epoch 7/10
60000/60000 [==============================] - 3s 55us/sample - loss: 0.2675 - accuracy: 0.9015
Epoch 8/10
60000/60000 [==============================] - 4s 63us/sample - loss: 0.2587 - accuracy: 0.9048
Epoch 9/10
60000/60000 [==============================] - 4s 59us/sample - loss: 0.2468 - accuracy: 0.9083
Epoch 10/10
60000/60000 [==============================] - 3s 55us/sample - loss: 0.2399 - accuracy: 0.9095

After 10 epochs of training, the accuracy on the training set reaches nearly 91%.
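
The log above counts samples, but model.fit processes them in mini-batches. Assuming the Keras default batch_size of 32 (the call above does not set it), the number of weight updates works out as follows:

```python
import math

num_samples = 60000   # size of the training set
batch_size = 32       # assumption: Keras default, not set explicitly above
epochs = 10

steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch)  # 1875
print(total_steps)      # 18750
```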

Evaluating accuracy

Next, see how the model performs on the test set.

test_loss, test_acc = model.evaluate(test_images, test_labels)
print('\nTest accuracy:', test_acc)

The output is as follows:

10000/10000 [==============================] - 0s 37us/sample - loss: 0.3358 - accuracy: 0.8801
Test accuracy: 0.8801

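The accuracy on the test set (about 88%) is lower than on the training set (about 91%). This gap between training and test accuracy indicates overfitting: the model performs worse on new data it did not see during training.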
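
Using the figures reported in the logs above, the size of the gap can be computed directly:

```python
train_acc = 0.9095  # final training accuracy from epoch 10
test_acc = 0.8801   # accuracy reported by model.evaluate

gap = train_acc - test_acc
print(f"accuracy gap: {gap:.4f}")  # accuracy gap: 0.0294
```

That is a gap of roughly 3 percentage points.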

Making predictions

With the model trained, it can be used to make predictions.

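The model above already ends with a softmax activation, so its output is already a probability distribution and it can be used for prediction directly. Wrapping a model in an extra Softmax layer, as done below, is meant for models whose last layer outputs raw logits, where it converts the logits into probabilities. Applied here, it runs softmax a second time: the predicted class stays the same, but the values are flattened, which is why the numbers printed below are all close to 0.09 with a maximum around 0.22.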

probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])
predictions = probability_model.predict(test_images)
predictions[0]
np.argmax(predictions[0])  # 9
test_labels[0] # 9

The prediction for the first image in the test set is as follows:

array([0.08613347, 0.08613332, 0.08613332, 0.08613332, 0.08613338,
       0.08675522, 0.08613337, 0.09150714, 0.08613358, 0.21880393],
      dtype=float32)
9
9

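The prediction is a NumPy array of length 10 whose elements give the model's confidence for classes 0 to 9 in order. The largest element is at index 9, i.e. the class with label 9 (Ankle boot). The label in the test set is also 9, so the prediction is correct.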
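
The printed values are much flatter than one might expect from a confident classifier. One plausible explanation, given that the model's last layer already uses softmax, is that the extra Softmax layer is applied to probabilities rather than to logits. The NumPy sketch below illustrates the effect; the softmax helper and the fully confident input vector are made up for illustration.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax of a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Made-up, fully confident probability vector for class 9.
confident = np.zeros(10)
confident[9] = 1.0

twice = softmax(confident)  # softmax applied to probabilities, not logits

print(np.argmax(twice))     # 9: the predicted class is unchanged
print(twice.round(4))       # about 0.0853 everywhere except 0.232 at index 9
```

Even a prediction that was 100% confident ends up near 0.23 after the second softmax, which is close to the largest values seen in the outputs of this guide.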

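More predictions

Two functions are provided below: one plots the clothing image and the other plots a bar chart. Under the image, a wrong prediction is labeled in red and a correct one in blue. In the bar chart, the height of each bar is the predicted probability for that class; the bar of the predicted class is red and the bar of the true class is blue.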

def plot_image(i, predictions_array, true_label, img):
  predictions_array, true_label, img = predictions_array, true_label[i], img[i]
  plt.grid(False)
  plt.xticks([])
  plt.yticks([])
  plt.imshow(img, cmap=plt.cm.binary)
  predicted_label = np.argmax(predictions_array)
  if predicted_label == true_label:
    color = 'blue'
  else:
    color = 'red'
  plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label],
                                100*np.max(predictions_array),
                                class_names[true_label]),
                                color=color)


def plot_value_array(i, predictions_array, true_label):
  predictions_array, true_label = predictions_array, true_label[i]
  plt.grid(False)
  plt.xticks(range(10))
  plt.yticks([])
  thisplot = plt.bar(range(10), predictions_array, color="#777777")
  plt.ylim([0, 1])
  predicted_label = np.argmax(predictions_array)
  thisplot[predicted_label].set_color('red')
  thisplot[true_label].set_color('blue')

Look at the first image (i = 0):

i = 0
plt.figure(figsize=(6,3))
plt.subplot(1,2,1)
plot_image(i, predictions[i], test_labels, test_images)
plt.subplot(1,2,2)
plot_value_array(i, predictions[i],  test_labels)
plt.show()

The prediction is correct.

Look at the image at index 13 (i = 13):

i = 13
plt.figure(figsize=(6,3))
plt.subplot(1,2,1)
plot_image(i, predictions[i], test_labels, test_images)
plt.subplot(1,2,2)
plot_value_array(i, predictions[i],  test_labels)
plt.show()

The prediction is wrong.

Now plot the predictions for 15 images at once:

num_rows = 5
num_cols = 3
num_images = num_rows*num_cols
plt.figure(figsize=(2*2*num_cols, 2*num_rows))
for i in range(num_images):
  plt.subplot(num_rows, 2*num_cols, 2*i+1)
  plot_image(i, predictions[i], test_labels, test_images)
  plt.subplot(num_rows, 2*num_cols, 2*i+2)
  plot_value_array(i, predictions[i], test_labels)


plt.tight_layout()
plt.show()

As the plot shows, only the image at index 13 is misclassified.
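
The same check can be done numerically instead of by eye. The sketch below uses small made-up arrays (toy_predictions and toy_labels are hypothetical stand-ins for predictions and test_labels) and lists the indices where the most likely class differs from the true label.

```python
import numpy as np

# Hypothetical stand-ins: 4 samples, 3 classes.
toy_predictions = np.array([[0.8, 0.1, 0.1],
                            [0.2, 0.5, 0.3],
                            [0.6, 0.3, 0.1],
                            [0.1, 0.2, 0.7]])
toy_labels = np.array([0, 1, 1, 2])

predicted = toy_predictions.argmax(axis=1)        # most likely class per row
wrong = np.flatnonzero(predicted != toy_labels)   # indices of mistakes

print(wrong)  # [2]
```

With the real arrays, the equivalent expression would be np.flatnonzero(predictions[:15].argmax(axis=1) != test_labels[:15]).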

Using the trained model

Finally, we use the trained model to make a prediction for a single image.

img = test_images[1]
print(img.shape)  # (28, 28)
img = (np.expand_dims(img, 0))
print(img.shape)  # (1, 28, 28)
predictions_single = probability_model.predict(img)
print(predictions_single)
np.argmax(predictions_single[0])  # 2

Output:

[[0.08533989 0.08533908 0.23193227 0.08533908 0.08535255 0.08533908
  0.08534076 0.08533908 0.08533908 0.08533908]]
2
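
Keras models predict on batches, which is why the single image gets an extra leading dimension before being passed to predict, and why the result is indexed with [0]. The shape change can be checked with NumPy alone (the zero-filled array is a stand-in for a real image):

```python
import numpy as np

img = np.zeros((28, 28))          # stand-in for test_images[1]
batch = np.expand_dims(img, 0)    # add a batch axis in front

print(img.shape)    # (28, 28)
print(batch.shape)  # (1, 28, 28)
```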

Plot:

plot_value_array(1, predictions_single[0], test_labels)
_ = plt.xticks(range(10), class_names, rotation=45)
plt.show()

The prediction is correct.

