• TensorFlow Convolutional Neural Networks: An Introductory Image Classification Demo


    Cat vs. Dog Classification

    • Data preprocessing: process the image data and prepare the training and validation sets
    • Convolutional network model: build the network architecture
    • Overfitting: compare training and validation performance and propose remedies for overfitting
    • Data augmentation: image augmentation methods and their effects
    • Transfer learning: an essential training strategy in deep learning

    Import the packages

    import os
    import warnings
    warnings.filterwarnings("ignore")
    import tensorflow as tf
    from tensorflow.keras.optimizers import Adam
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    Specify the data paths (training and validation)

    # Root data folder
    base_dir = './data/cats_and_dogs'
    train_dir = os.path.join(base_dir, 'train')
    validation_dir = os.path.join(base_dir, 'validation')
    # Training set
    train_cats_dir = os.path.join(train_dir, 'cats')
    train_dogs_dir = os.path.join(train_dir, 'dogs')
    # Validation set
    validation_cats_dir = os.path.join(validation_dir, 'cats')
    validation_dogs_dir = os.path.join(validation_dir, 'dogs')
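
    Before building the model, it helps to sanity-check the split. A minimal sketch; the expected counts (1,000 images per class for training, 500 per class for validation) are an assumption taken from the steps_per_epoch arithmetic later in this post:

    for name, d in [('train cats', train_cats_dir), ('train dogs', train_dogs_dir),
                    ('val cats', validation_cats_dir), ('val dogs', validation_dogs_dir)]:
        # If the folders follow the usual 2000/1000 split, this prints 1000/1000/500/500
        print(name, len(os.listdir(d)))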

    Build the convolutional neural network

    • Any number of layers works; feel free to experiment
    • If you train on a CPU, use a smaller input size; the input size is usually the main factor determining training speed
      model = tf.keras.models.Sequential([
          # If training is slow, reduce the input size
          tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)),
          tf.keras.layers.MaxPooling2D(2, 2),
          tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
          tf.keras.layers.MaxPooling2D(2, 2),
          tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
          tf.keras.layers.MaxPooling2D(2, 2),
          # Flatten for the fully connected layers
          tf.keras.layers.Flatten(),
          tf.keras.layers.Dense(512, activation='relu'),
          # A single sigmoid unit is enough for binary classification
          tf.keras.layers.Dense(1, activation='sigmoid')
      ])
      model.summary()

    • Configure the training setup

      model.compile(loss='binary_crossentropy',
                    optimizer=Adam(learning_rate=1e-4),  # `lr` is deprecated; use `learning_rate`
                    metrics=['acc'])

      Data preprocessing

    • The images read in are automatically converted to float32 tensors; prepare separate generators for training and validation
    • Normalize pixel values into the (0, 1) range
      train_datagen = ImageDataGenerator(rescale=1./255)
      test_datagen = ImageDataGenerator(rescale=1./255)

      train_generator = train_datagen.flow_from_directory(
          train_dir,             # folder path
          target_size=(64, 64),  # size to resize each image to
          batch_size=20,
          # use 'categorical' for one-hot labels; 'binary' is enough for two classes
          class_mode='binary')
      validation_generator = test_datagen.flow_from_directory(
          validation_dir,
          target_size=(64, 64),
          batch_size=20,
          class_mode='binary')
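
      The outline promises data augmentation; as a preview, the rescale-only training generator above can be swapped for an augmented one. A minimal sketch using standard ImageDataGenerator arguments; the specific values are illustrative, not tuned for this dataset:

      # Augmented training generator (illustrative values, not tuned).
      # Keep the validation generator rescale-only so metrics stay comparable.
      augmented_train_datagen = ImageDataGenerator(
          rescale=1./255,
          rotation_range=40,       # random rotations up to 40 degrees
          width_shift_range=0.2,   # random horizontal shifts
          height_shift_range=0.2,  # random vertical shifts
          shear_range=0.2,
          zoom_range=0.2,
          horizontal_flip=True)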

      Train the network

    • Calling fit directly also works, but usually the whole dataset does not fit in memory; fit_generator consumes a generator that produces batches on the fly. (In TensorFlow 2.1+ model.fit accepts generators directly and fit_generator was later removed, so the code below uses fit.)
    • steps_per_epoch acts as the stopping condition: the generator keeps yielding batches indefinitely, so without it the model would not know how many steps make up one epoch.
      history = model.fit(  # fit accepts generators in TF 2.1+; fit_generator is deprecated
          train_generator,
          steps_per_epoch=100,  # 2000 images = batch_size * steps
          epochs=20,
          validation_data=validation_generator,
          validation_steps=50,  # 1000 images = batch_size * steps
          verbose=2)
      Epoch 1/20
      100/100 - 7s - loss: 0.6892 - acc: 0.5325 - val_loss: 0.6705 - val_acc: 0.5970
      Epoch 2/20
      100/100 - 6s - loss: 0.6595 - acc: 0.6055 - val_loss: 0.6346 - val_acc: 0.6470
      Epoch 3/20
      100/100 - 6s - loss: 0.6350 - acc: 0.6515 - val_loss: 0.6358 - val_acc: 0.6320
      Epoch 4/20
      100/100 - 7s - loss: 0.5936 - acc: 0.6865 - val_loss: 0.5906 - val_acc: 0.6780
      Epoch 5/20
      100/100 - 7s - loss: 0.5530 - acc: 0.7170 - val_loss: 0.5978 - val_acc: 0.6670
      Epoch 6/20
      100/100 - 8s - loss: 0.5179 - acc: 0.7490 - val_loss: 0.5484 - val_acc: 0.7140
      Epoch 7/20
      100/100 - 8s - loss: 0.4854 - acc: 0.7725 - val_loss: 0.5686 - val_acc: 0.7080
      Epoch 8/20
      100/100 - 8s - loss: 0.4595 - acc: 0.7905 - val_loss: 0.5452 - val_acc: 0.7150
      Epoch 9/20
      100/100 - 8s - loss: 0.4406 - acc: 0.7885 - val_loss: 0.5453 - val_acc: 0.7210
      Epoch 10/20
      100/100 - 7s - loss: 0.4109 - acc: 0.8170 - val_loss: 0.5317 - val_acc: 0.7270
      Epoch 11/20
      100/100 - 8s - loss: 0.3892 - acc: 0.8285 - val_loss: 0.5384 - val_acc: 0.7220
      Epoch 12/20
      100/100 - 8s - loss: 0.3542 - acc: 0.8570 - val_loss: 0.5480 - val_acc: 0.7180
      Epoch 13/20
      100/100 - 8s - loss: 0.3421 - acc: 0.8580 - val_loss: 0.5355 - val_acc: 0.7420
      Epoch 14/20
      100/100 - 8s - loss: 0.3217 - acc: 0.8665 - val_loss: 0.5572 - val_acc: 0.7340
      Epoch 15/20
      100/100 - 8s - loss: 0.2931 - acc: 0.8805 - val_loss: 0.5545 - val_acc: 0.7400
      Epoch 16/20
      100/100 - 8s - loss: 0.2739 - acc: 0.8870 - val_loss: 0.5540 - val_acc: 0.7360
      Epoch 17/20
      100/100 - 8s - loss: 0.2535 - acc: 0.9040 - val_loss: 0.5564 - val_acc: 0.7380
      Epoch 18/20
      100/100 - 8s - loss: 0.2257 - acc: 0.9245 - val_loss: 0.5710 - val_acc: 0.7420
      Epoch 19/20
      100/100 - 8s - loss: 0.2084 - acc: 0.9350 - val_loss: 0.5734 - val_acc: 0.7460
      Epoch 20/20
      100/100 - 8s - loss: 0.2258 - acc: 0.9130 - val_loss: 0.5897 - val_acc: 0.7300
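
      The log shows exactly the overfitting the outline warned about: training accuracy climbs past 0.91 while validation accuracy stalls around 0.73-0.74. One common remedy is a Dropout layer before the dense head; a minimal sketch (the 0.5 rate is a conventional default, not tuned for this dataset):

      # Same architecture plus Dropout to fight overfitting (sketch, not tuned)
      model_with_dropout = tf.keras.models.Sequential([
          tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)),
          tf.keras.layers.MaxPooling2D(2, 2),
          tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
          tf.keras.layers.MaxPooling2D(2, 2),
          tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
          tf.keras.layers.MaxPooling2D(2, 2),
          tf.keras.layers.Flatten(),
          tf.keras.layers.Dropout(0.5),  # randomly zero half the activations during training
          tf.keras.layers.Dense(512, activation='relu'),
          tf.keras.layers.Dense(1, activation='sigmoid')
      ])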
      

      Visualizing the results

      import matplotlib.pyplot as plt

      acc = history.history['acc']
      val_acc = history.history['val_acc']
      loss = history.history['loss']
      val_loss = history.history['val_loss']
      epochs = range(len(acc))

      plt.plot(epochs, acc, 'bo', label='Training accuracy')
      plt.plot(epochs, val_acc, 'b', label='Validation accuracy')
      plt.title('Training and validation accuracy')
      plt.legend()  # show the legend on the accuracy figure as well

      plt.figure()
      plt.plot(epochs, loss, 'bo', label='Training Loss')
      plt.plot(epochs, val_loss, 'b', label='Validation Loss')
      plt.title('Training and validation loss')
      plt.legend()
      plt.show()
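
      Once trained, the model can score a new image. A minimal inference sketch; the image path is hypothetical, and the preprocessing must match the generators (resize to 64x64, scale to 0-1):

      import numpy as np
      from tensorflow.keras.preprocessing import image

      # Hypothetical path; replace with a real test image
      img = image.load_img('./data/test/cat.1.jpg', target_size=(64, 64))
      x = image.img_to_array(img) / 255.0  # same rescaling as the generators
      x = np.expand_dims(x, axis=0)        # add the batch dimension
      prob = model.predict(x)[0][0]        # sigmoid output
      # flow_from_directory assigns labels alphabetically: cats -> 0, dogs -> 1
      print('dog' if prob > 0.5 else 'cat', prob)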

  • Original article: https://blog.csdn.net/qq_65838372/article/details/133184943