CNN Hands-On Tutorial: Cat vs. Dog Classification


    Preface

            A CNN (Convolutional Neural Network) is a deep learning model that is particularly well suited to image data. It extracts image features through stacked convolution and pooling layers, and then performs classification or regression through fully connected layers. CNNs have been very successful in image recognition, object detection, image segmentation, and related fields.
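
    To make the layer pattern concrete, here is a minimal sketch of the convolution → pooling → fully connected stack in PyTorch. It is only an illustration: the layer sizes and the 32x32 input are arbitrary assumptions, and it is not the model built later in this article.

    import torch
    import torch.nn as nn

    # Minimal illustration of the conv -> pool -> fully connected pattern.
    # All layer sizes here are arbitrary choices for demonstration.
    tiny_cnn = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),  # convolution extracts local features
        nn.ReLU(),
        nn.MaxPool2d(2),                             # pooling halves the spatial resolution
        nn.Flatten(),
        nn.Linear(16 * 16 * 16, 2),                  # fully connected layer maps features to 2 classes
    )

    x = torch.randn(1, 3, 32, 32)  # a dummy batch with one 32x32 RGB image
    print(tiny_cnn(x).shape)       # torch.Size([1, 2])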

    CNNs and Object Classification

            Object classification means recognizing the object in an image and assigning it to one of a set of categories. Cat vs. dog classification is a typical example of such a task: a CNN lets us build a model that automatically recognizes the cats and dogs in images.

    Getting Started with CNNs

    To get started with CNNs, first learn the basic concepts and principles of deep learning, then learn how to build and train CNN models. Classic textbooks, online courses, and tutorials are all good resources for the fundamentals of deep learning and CNNs.

    A Hands-On Example

    The following walks through a simple CNN model built with PyTorch:

    1. Import the required packages

    from PIL import Image
    import torch
    import torchvision.transforms as transforms
    from torch.utils.data import DataLoader, random_split, Dataset
    from torchvision import datasets, models
    import torch.nn as nn
    import torch.nn.functional as F
    import torch.optim as optim
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import os
    from tqdm import tqdm

    2. Set the dataset directories

    # Set the dataset directories / paths
    data_pth = '/path/to/your/dataset'  # replace with your dataset directory
    cats_dir = data_pth + '/Cat'
    dogs_dir = data_pth + '/Dog'

    3. Print the number of images

    1. print("Total Cats Images:", len(os.listdir(cats_dir)))
    2. print("Total Dogs Images:", len(os.listdir(dogs_dir)))
    3. print("Total Images:", len(os.listdir(cats_dir)) + len(os.listdir(dogs_dir)))

    4. Inspect a cat image

    cat_img = Image.open(cats_dir + '/' + os.listdir(cats_dir)[0])
    print('Size of cat image:', cat_img.size)
    cat_img

    5. Inspect a dog image

    dog_img = Image.open(dogs_dir + '/' + os.listdir(dogs_dir)[0])
    print('Size of dog image:', dog_img.size)
    dog_img

    6. Define a custom dataset class

    class CustomDataset(Dataset):
        def __init__(self, data_path, transform=None):
            # Initialize the dataset with a list of image file names
            self.data = data_path
            self.transform = transform

        def __len__(self):
            # Return the number of samples in the dataset
            return len(self.data)

        def __getitem__(self, idx):
            # Return the sample at the given index
            sample = self.data[idx]
            # Try the Cat folder first; if the file is not there, it must be a dog image
            try:
                img = Image.open(data_pth + '/Cat/' + sample)
                label = 0
            except FileNotFoundError:
                img = Image.open(data_pth + '/Dog/' + sample)
                label = 1
            # Apply any transformations (e.g., preprocessing)
            if self.transform:
                img = self.transform(img)
            return img, label

    7. Define the data transforms (resize, normalize, convert to tensor, etc.)

    When training an image classification model, we usually preprocess the input with data transforms so that it fits the model better and training performs well.

    The main reasons for transforming the data are:

    1. Resizing: input images come in different sizes and resolutions. To make sure the model can handle all of them, we resize everything to one fixed size, so that training and inference always see inputs of the same shape.

    2. Normalization: normalization rescales the input values into a similar range, which makes training easier. It typically speeds up convergence and improves the stability and accuracy of the model.

    3. Conversion to tensors: in deep learning, input data has to be converted to tensors before it can be fed through a neural network, so the images must be turned into tensors for training and inference.

    In short, transforms prepare the raw images so that the model can learn from them effectively; resizing, normalization, and tensor conversion are the standard steps (a sketch of adding normalization follows the code block below).

    # Define the data transforms (resize, grayscale, convert to tensor, etc.)
    transform = transforms.Compose([
        transforms.Resize((224, 224)),               # Resize images to a fixed size (adjust as needed)
        transforms.Grayscale(num_output_channels=1), # Convert to single-channel grayscale
        transforms.ToTensor(),                       # Convert images to PyTorch tensors
    ])
    data = [i for i in os.listdir(data_pth + '/Cat') if i.endswith('.jpg')] + [i for i in os.listdir(data_pth + '/Dog') if i.endswith('.jpg')]
    combined_dataset = CustomDataset(data_path=data, transform=transform)
    # dataloader = torch.utils.data.DataLoader(combined_dataset, batch_size=64, shuffle=False)
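
    The pipeline above resizes, converts to grayscale, and converts to tensors, but it does not actually normalize the pixel values even though normalization was discussed. Below is a hedged sketch of how normalization could be added; the mean and std of 0.5 are placeholder assumptions, not statistics computed from this dataset.

    # Alternative transform pipeline with normalization added after ToTensor().
    # The mean/std below are placeholders for the single grayscale channel;
    # in practice you would compute them from the training set.
    transform_with_norm = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.Grayscale(num_output_channels=1),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.5], std=[0.5]),  # map pixel values from [0, 1] to roughly [-1, 1]
    ])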

    8. Split the dataset (e.g., 80% for training, 20% for testing)

    # Define the ratio for splitting (e.g., 80% for training, 20% for testing)
    train_ratio = 0.8
    test_ratio = 1.0 - train_ratio
    # Calculate the number of samples for training and testing
    num_samples = len(combined_dataset)
    num_train_samples = int(train_ratio * num_samples)
    num_test_samples = num_samples - num_train_samples
    # Use random_split to split the dataset
    train_dataset, test_dataset = random_split(combined_dataset, [num_train_samples, num_test_samples])
    # Create data loaders for the training and testing datasets
    batch_size = 32
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=False)
    test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
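
    Note that the loaders above keep shuffle=False. A common variation, which is a suggestion rather than part of the original article, is to shuffle the training loader so that batches are seen in a different order each epoch:

    # Shuffling the training data each epoch usually helps generalization;
    # the test loader can stay unshuffled since order does not affect accuracy.
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)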

    9. Define the CNN model

    # Define the CNN model
    class CNN(nn.Module):
        def __init__(self):
            super(CNN, self).__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 8, kernel_size=3, stride=2),
                nn.MaxPool2d(2, 2),
                nn.ReLU(),
                nn.Conv2d(8, 16, kernel_size=3, stride=2),
                nn.MaxPool2d(2, 2),
                nn.ReLU(),
                nn.Conv2d(16, 32, kernel_size=3, stride=2),
                nn.MaxPool2d(2, 2),
                nn.ReLU(),
            )
            self.fc = nn.Sequential(
                nn.Flatten(),
                nn.Linear(288, 128),  # 288 = 32 channels * 3 * 3 for 224x224 grayscale input
                nn.ReLU(),
                nn.Linear(128, 1),
                nn.Sigmoid()          # output a probability for the binary cat/dog label
            )

        def forward(self, x):
            x = self.conv(x)
            x = self.fc(x)
            return x
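
    As a quick sanity check on the 288 input size of the first linear layer (this check is an addition for illustration, not part of the original article), you can push a dummy 224x224 grayscale tensor through the convolutional stack and inspect the flattened size:

    # Verify the flattened feature size expected by the first fully connected layer.
    # For a 1x224x224 input, the conv stack produces 32 feature maps of size 3x3,
    # i.e. 32 * 3 * 3 = 288 features after flattening.
    with torch.no_grad():
        dummy = torch.randn(1, 1, 224, 224)
        features = CNN().conv(dummy)
        print(features.shape)                # torch.Size([1, 32, 3, 3])
        print(features.flatten(1).shape[1])  # 288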

    10. Check whether a GPU is available

    # Check if a GPU is available
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(device)

    11. Initialize the model, loss function, and optimizer

         Initializing the model, the loss function, and the optimizer are key steps in the training process; their importance is as follows:

    1. Model initialization: initialization assigns starting values to the model parameters. Good initialization speeds up convergence and makes training more efficient and stable; initial values that are too large or too small can cause exploding or vanishing gradients and hurt training (a sketch of explicit weight initialization follows the code block below).

    2. Loss function: the loss function measures the gap between the model's predictions and the true labels. A suitable loss helps the model learn the structure of the data, and training keeps adjusting the parameters so that the loss value gradually decreases.

    3. Optimizer: the optimizer is the algorithm that updates the model parameters; common choices include stochastic gradient descent (SGD), Adam, and RMSprop. A suitable optimizer speeds up convergence and improves training efficiency and stability. Different optimizers use different update rules, so the choice depends on the task and the data.

    In short, how the model is initialized and which loss function and optimizer are chosen directly affect how well the CNN trains and performs.

    # Initialize the model, loss function, and optimizer
    net = CNN().to(device)
    criterion = nn.BCELoss()
    optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
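
    The code above relies on PyTorch's default layer initialization. As a hedged illustration of the explicit initialization mentioned in point 1 (an optional addition, not something the original article does), one could apply Kaiming initialization to the convolutional and linear layers:

    # Optional: explicit Kaiming/He initialization for layers followed by ReLU.
    # PyTorch's defaults are usually adequate for a small model like this one.
    def init_weights(m):
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            nn.init.kaiming_normal_(m.weight, nonlinearity='relu')
            if m.bias is not None:
                nn.init.zeros_(m.bias)

    net.apply(init_weights)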

    12. Train the model

    epochs = 5
    net.train()
    for epoch in range(epochs):
        running_loss = 0.0
        for idx, (inputs, labels) in tqdm(enumerate(train_loader), total=len(train_loader)):
            inputs = inputs.to(device)
            labels = labels.to(device).to(torch.float32)
            optimizer.zero_grad()
            outputs = net(inputs).reshape(-1)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        print(f'Epoch: {epoch + 1}, Loss: {running_loss}')
    print('Training Finished!')
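
    Once training finishes, it is usually worth persisting the learned weights so the model can be reused without retraining. A minimal sketch is shown below; the file name cnn_catdog.pth is an illustrative choice, not something defined by the original article.

    # Save the trained weights and show how they could be reloaded later.
    torch.save(net.state_dict(), 'cnn_catdog.pth')

    # To reuse the model elsewhere:
    restored = CNN().to(device)
    restored.load_state_dict(torch.load('cnn_catdog.pth', map_location=device))
    restored.eval()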

    13. Evaluate on the test set

    net.eval()  # Set the model to evaluation mode
    correct = 0
    total = 0
    with torch.no_grad():
        for idx, (inputs, labels) in tqdm(enumerate(test_loader), total=len(test_loader)):
            inputs = inputs.to(device)
            labels = labels.to(device).to(torch.float32)
            outputs = net(inputs).reshape(-1)
            predicted = (outputs > 0.5).float()  # Binary classification threshold of 0.5
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
    accuracy = correct / total if total > 0 else 0.0
    print(f'Test Accuracy: {accuracy:.2%}')

    14. Visualize test predictions

    label_names = ['cat', 'dog']
    fig, ax = plt.subplots(1, 5, figsize=(15, 5))
    outputs = outputs.cpu()
    inputs = inputs.cpu()
    labels = labels.cpu()
    for i in range(5):
        # Squeeze out the single grayscale channel so matplotlib can display the image
        ax[i].imshow(inputs[i].squeeze(), cmap='gray')
        ax[i].set_title(f'True: {label_names[int(labels[i])]}, Pred: {label_names[int(outputs[i] > 0.5)]}')
        ax[i].axis(False)
    plt.show()

    This completes the full process of building, training, and testing a CNN model. We used PyTorch to build a simple CNN for cat vs. dog classification: we first prepared the training dataset, then built the model, defined the loss function and optimizer, and finally trained and evaluated it.

    Complete training code

    from PIL import Image
    import torch
    import torchvision.transforms as transforms
    from torch.utils.data import DataLoader, random_split, Dataset
    from torchvision import datasets, models
    import torch.nn as nn
    import torch.nn.functional as F
    import torch.optim as optim
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import os
    from tqdm import tqdm

    # Dataset directories
    data_pth = '/path/to/your/dataset'  # replace with your dataset directory
    cats_dir = data_pth + '/Cat'
    dogs_dir = data_pth + '/Dog'
    cat_img = Image.open(cats_dir + '/' + os.listdir(cats_dir)[0])
    dog_img = Image.open(dogs_dir + '/' + os.listdir(dogs_dir)[0])

    # Define the custom dataset
    class CustomDataset(Dataset):
        def __init__(self, data_path, transform=None):
            # Initialize the dataset with a list of image file names
            self.data = data_path
            self.transform = transform

        def __len__(self):
            # Return the number of samples in the dataset
            return len(self.data)

        def __getitem__(self, idx):
            # Return the sample at the given index
            sample = self.data[idx]
            try:
                img = Image.open(data_pth + '/Cat/' + sample)
                label = 0
            except FileNotFoundError:
                img = Image.open(data_pth + '/Dog/' + sample)
                label = 1
            if self.transform:
                img = self.transform(img)
            return img, label

    transform = transforms.Compose([
        transforms.Resize((224, 224)),               # Resize images to a fixed size (adjust as needed)
        transforms.Grayscale(num_output_channels=1), # Convert to single-channel grayscale
        transforms.ToTensor(),                       # Convert images to PyTorch tensors
    ])
    data = [i for i in os.listdir(data_pth + '/Cat') if i.endswith('.jpg')] + [i for i in os.listdir(data_pth + '/Dog') if i.endswith('.jpg')]
    combined_dataset = CustomDataset(data_path=data, transform=transform)

    # Split the dataset
    train_ratio = 0.8
    test_ratio = 1.0 - train_ratio
    # Calculate the number of samples for training and testing
    num_samples = len(combined_dataset)
    num_train_samples = int(train_ratio * num_samples)
    num_test_samples = num_samples - num_train_samples
    # Use random_split to split the dataset
    train_dataset, test_dataset = random_split(combined_dataset, [num_train_samples, num_test_samples])
    # Create data loaders for the training and testing datasets
    batch_size = 32
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=False)
    test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

    # Define the CNN model
    class CNN(nn.Module):
        def __init__(self):
            super(CNN, self).__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 8, kernel_size=3, stride=2),
                nn.MaxPool2d(2, 2),
                nn.ReLU(),
                nn.Conv2d(8, 16, kernel_size=3, stride=2),
                nn.MaxPool2d(2, 2),
                nn.ReLU(),
                nn.Conv2d(16, 32, kernel_size=3, stride=2),
                nn.MaxPool2d(2, 2),
                nn.ReLU(),
            )
            self.fc = nn.Sequential(
                nn.Flatten(),
                nn.Linear(288, 128),  # 288 = 32 channels * 3 * 3 for 224x224 grayscale input
                nn.ReLU(),
                nn.Linear(128, 1),
                nn.Sigmoid()
            )

        def forward(self, x):
            x = self.conv(x)
            x = self.fc(x)
            return x

    # Check whether a GPU is available
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Initialize the model, loss function, and optimizer
    net = CNN().to(device)
    criterion = nn.BCELoss()
    optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

    # Start training
    epochs = 5
    net.train()
    for epoch in range(epochs):
        running_loss = 0.0
        for idx, (inputs, labels) in tqdm(enumerate(train_loader), total=len(train_loader)):
            inputs = inputs.to(device)
            labels = labels.to(device).to(torch.float32)
            optimizer.zero_grad()
            outputs = net(inputs).reshape(-1)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        print(f'Epoch: {epoch + 1}, Loss: {running_loss}')
    print('Training Finished!')

    Dataset download

    Baidu Netdisk: https://pan.baidu.com/s/1CjTNLGvBBDxmKEADN3SNWw?pwd=o37e
    Access code: o37e

    Original article: https://blog.csdn.net/qq_39312146/article/details/134290713