Video source: PyTorch深度学习快速入门教程(绝对通俗易懂!)【小土堆】
P1-P5 cover environment setup and are skipped here.
Data files: hymenoptera_data
# read_data.py
from torch.utils.data import Dataset
from PIL import Image
import os

class MyData(Dataset):
    def __init__(self, root_dir, label_dir):
        self.root_dir = root_dir
        self.label_dir = label_dir
        self.path = os.path.join(self.root_dir, self.label_dir)
        self.img_path = os.listdir(self.path)

    def __getitem__(self, idx):
        img_name = self.img_path[idx]
        img_item_path = os.path.join(self.root_dir, self.label_dir, img_name)
        img = Image.open(img_item_path)
        label = self.label_dir
        return img, label

    def __len__(self):
        return len(self.img_path)

root_dir = "dataset/train"
ants_label_dir = "ants"
bees_label_dir = "bees"
ants_dataset = MyData(root_dir, ants_label_dir)
bees_dataset = MyData(root_dir, bees_label_dir)
train_dataset = ants_dataset + bees_dataset  # Dataset.__add__ concatenates the two datasets (ConcatDataset)
1. In a Jupyter notebook, you can use help(xxx) or xxx?? to view the documentation.
2. The __init__ method mainly declares variables that the other methods of the class will use later.
3. The Python console can display the values of variables, so it is recommended for debugging.
x. The benefit of joining paths with os.path.join() is that it works on both Windows and Linux, as the quick check below shows.
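A quick check (a sketch; the exact Windows output depends on the separator inserted):

import os
# os.path.join picks the separator appropriate for the current OS
print(os.path.join("dataset/train", "ants"))  # 'dataset/train/ants' on Linux, 'dataset/train\ants' on Windows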
# tb.py
from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter("logs")
for i in range(100):
    writer.add_scalar("y=x", i, i)
writer.close()
Do not name a .py file test + other characters (test.py itself is fine); otherwise running it may report "Empty suite" (no test cases found).
For details, see Note 19: "Empty Suite" error when running a simple CARLA example.
SummaryWriter(log_dir, comment, ...)
When instantiating, log_dir is an optional argument giving the directory where event files are stored. comment is also optional and is appended as a suffix to the default event-file directory name.
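For example (a minimal sketch; the exact default directory name is generated by TensorBoard):

from torch.utils.tensorboard import SummaryWriter

writer1 = SummaryWriter(log_dir="logs")     # events go to ./logs
writer2 = SummaryWriter(comment="_lr0.01")  # events go to runs/<timestamp>_<host>_lr0.01; comment is ignored if log_dir is given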
add_scalar(tag, scalar_value, global_step)
When calling it, tag is the chart title (identifier), scalar_value is the y-axis value, and global_step is the x-axis value.
# shell
tensorboard --logdir=logs --port=6007
By default TensorBoard opens port 6006, but if several people on the same server start TensorBoard this becomes inconvenient, so --port=6007 can be used to specify a different port.
If two scalar writes use the same tag, both curves are drawn on the same chart.
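For instance (a small sketch): writing a second series under an existing tag lands on the same chart, which usually produces a tangled plot unless the old event files are deleted first.

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("logs")
for i in range(100):
    writer.add_scalar("y=x", i, i)      # first series
for i in range(100):
    writer.add_scalar("y=x", 2 * i, i)  # same tag: plotted on the same chart
writer.close()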
# P8_Tensorboard.py
from torch.utils.tensorboard import SummaryWriter
from PIL import Image
import numpy as np
writer = SummaryWriter("logs")
image_path = 'dataset/train/ants/0013035.jpg'
img_PIL = Image.open(image_path)
img_array = np.array(img_PIL)
writer.add_image('test', img_array, 1, dataformats='HWC')
writer.close()
add_image(tag, img_tensor, global_step)
When calling it, img_tensor needs to be a torch.Tensor, numpy.ndarray, string, etc.
add_image expects images of shape (3, H, W) by default; if the shape is (H, W, 3), add the argument dataformats='HWC'.
# P9_Transforms
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
img_path = 'dataset/train/ants/0013035.jpg'
img = Image.open(img_path)  # gives a PIL image
# The image could also be read with cv2.imread(), which yields an np.ndarray
writer = SummaryWriter('logs')
tensor_trans = transforms.ToTensor()
tensor_img = tensor_trans(img)  # ToTensor accepts PIL and np.ndarray images as input
writer.add_image('Tensor_img', tensor_img)
writer.close()
For a module file such as transforms.py, PyCharm's Structure panel gives a quick overview of the classes defined in it.
pip install opencv-python
must be run before import cv2 will work.
Image.open() returns a PIL image; cv2.imread() returns an np.ndarray.
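A quick comparison of the two (a sketch, reusing the image path from the examples above; note that cv2 loads channels in BGR order):

import cv2
from PIL import Image

img_pil = Image.open('dataset/train/ants/0013035.jpg')  # PIL.Image.Image
img_cv = cv2.imread('dataset/train/ants/0013035.jpg')   # numpy.ndarray, shape (H, W, 3), BGR order
print(type(img_pil), type(img_cv))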
The __call__ method of a class makes its instances callable like functions.
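A minimal sketch of __call__:

class Person:
    def __call__(self, name):
        print('__call__ ' + name)

person = Person()
person('zhangsan')  # equivalent to person.__call__('zhangsan')

This is why a transform such as transforms.ToTensor() can be applied directly as tensor_trans(img).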
transforms.ToTensor. Purpose: converts a PIL image or np.ndarray into a Tensor.
The input of this object can be a PIL image or an np.ndarray.
transforms.Normalize. Purpose: normalizes a tensor image; it needs the per-channel means and per-channel standard deviations.
The input of this object must be a tensor image.
transforms.Resize. Purpose: changes the image size. If size is a sequence like (h, w), the output size is (h, w); if size is a single number, the smaller edge is scaled to that value and the other edge is scaled proportionally.
The input of this object can be a PIL image or a torch.Tensor (earlier versions accepted only PIL images).
Making code completion case-insensitive: search Settings -> Editor -> General -> Code Completion and uncheck Match Case.
transforms.Compose. Purpose: chains several transforms.xx together.
transforms.RandomCrop. Purpose: crops the image at a random location.
# P9_Transforms.py
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

img_path = 'dataset/train/ants/0013035.jpg'
img = Image.open(img_path)
writer = SummaryWriter('logs')

# ToTensor
trans_totensor = transforms.ToTensor()
tensor_img = trans_totensor(img)  # ToTensor accepts a PIL image as input
writer.add_image('Tensor_img', tensor_img)

# Normalize
trans_norm = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
img_norm = trans_norm(tensor_img)  # normalize the tensor image
writer.add_image('Normalize', img_norm)

# Resize
trans_resize = transforms.Resize((512, 512))
# img PIL -> resize -> img_resize PIL
img_resize = trans_resize(img)
# img_resize PIL -> totensor -> img_resize tensor
img_resize = trans_totensor(img_resize)
writer.add_image('Resize', img_resize, 0)

# Compose - resize - 2
trans_resize_2 = transforms.Resize(512)
# PIL -> PIL -> tensor
trans_compose = transforms.Compose([trans_resize_2, trans_totensor])
img_resize_2 = trans_compose(img)
writer.add_image('Resize', img_resize_2, 1)

# RandomCrop
trans_random = transforms.RandomCrop(50)
trans_compose_2 = transforms.Compose([trans_random, trans_totensor])
for i in range(10):
    img_crop = trans_compose_2(img)
    writer.add_image('RandomCrop', img_crop, i)

writer.close()
Summary:
Focus mainly on each transform's input and output.
Read the official documentation more.
Pay attention to the parameters a method requires.
This section shows how to combine torchvision's datasets with transforms.
# P10_dataset_transforms
import torchvision
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
dataset_transform = transforms.Compose([transforms.ToTensor()])
train_set = torchvision.datasets.CIFAR10(
    root="./dataset", train=True, transform=dataset_transform, download=True
)
test_set = torchvision.datasets.CIFAR10(
    root="./dataset", train=False, transform=dataset_transform, download=True
)
writer = SummaryWriter("p10")
for i in range(10):
    img, target = test_set[i]
    writer.add_image("test_set", img, i)
writer.close()
Reference: torch.utils.data.DataLoader
# dataloader
import torchvision
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
test_data = torchvision.datasets.CIFAR10('./dataset', train=False, transform=torchvision.transforms.ToTensor())
test_loader = DataLoader(test_data, batch_size=64, shuffle=True, num_workers=0, drop_last=False)
# The first image and its target in the test set
img, target = test_data[0]
# print(img.shape)  # torch.Size([3, 32, 32])
# print(target)  # 3
writer = SummaryWriter("dataloader")
step = 0
for data in test_loader:
    imgs, targets = data
    # print(imgs.shape)  # torch.Size([64, 3, 32, 32]) for full batches
    # print(targets)  # a tensor of 64 labels
    writer.add_images('test_data', imgs, step)  # use add_images (plural) for a batch of images
    step += 1
writer.close()
Following the template above: name the model class, inherit from nn.Module, and override forward. Here is an example. (This part is straightforward.)
import torch
from torch import nn

class Tudui(nn.Module):
    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)

    def forward(self, input):
        output = input + 1
        return output

tudui = Tudui()
x = torch.tensor(1.0)
output = tudui(x)
print(output)
Video 17 mainly uses torch.nn.functional.conv2d to illustrate stride and padding; it is skipped here, but a minimal sketch follows.
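A minimal sketch (assuming the usual 5x5 input and 3x3 kernel) of how stride and padding change the output size:

import torch
import torch.nn.functional as F

input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]], dtype=torch.float32)
kernel = torch.tensor([[1, 2, 1],
                       [0, 1, 0],
                       [2, 1, 0]], dtype=torch.float32)

# F.conv2d expects input of shape (N, C, H, W) and weight of shape (out_C, in_C, kH, kW)
input = torch.reshape(input, (1, 1, 5, 5))
kernel = torch.reshape(kernel, (1, 1, 3, 3))

print(F.conv2d(input, kernel, stride=1).shape)             # torch.Size([1, 1, 3, 3])
print(F.conv2d(input, kernel, stride=2).shape)             # torch.Size([1, 1, 2, 2])
print(F.conv2d(input, kernel, stride=1, padding=1).shape)  # torch.Size([1, 1, 5, 5])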
import torchvision
from torch import nn
from torch.nn import Conv2d
from torch.utils.data import DataLoader

dataset = torchvision.datasets.CIFAR10(
    "./dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True
)
dataloader = DataLoader(dataset, batch_size=64)

class Tudui(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = Conv2d(
            in_channels=3, out_channels=6, kernel_size=3, stride=1, padding=0
        )

    def forward(self, x):
        x = self.conv1(x)
        return x

tudui = Tudui()
for data in dataloader:
    imgs, targets = data
    output = tudui(imgs)
    print(imgs.shape)    # e.g. torch.Size([64, 3, 32, 32])
    print(output.shape)  # e.g. torch.Size([64, 6, 30, 30])
import torch
from torch import nn
from torch.nn import MaxPool2d

input = torch.tensor(
    [[1, 2, 0, 3, 1, 0, 1, 2, 3, 1, 1, 2, 1, 0, 0, 5, 2, 3, 1, 1, 2, 1, 0, 1, 1]],
    dtype=torch.float32,
)
input = torch.reshape(input, (1, 5, 5))  # (-1, 1, 5, 5) also works

class Tudui(nn.Module):
    def __init__(self):
        super().__init__()
        self.maxpool1 = MaxPool2d(kernel_size=3, ceil_mode=True)

    def forward(self, input):
        output = self.maxpool1(input)
        return output

tudui = Tudui()
output = tudui(input)
print(output)
print(output.shape)
The input must be given an explicit dtype, e.g. dtype=torch.float32; if only integers are fed in, the default dtype is long, and MaxPool2d will then raise an error.
With ceil_mode=True, border regions smaller than kernel_size * kernel_size are still max-pooled; see the sketch below.
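A quick check of the difference (a sketch; with kernel_size=3 the default stride is also 3, so a 5x5 input leaves a 2-wide border):

import torch
from torch.nn import MaxPool2d

x = torch.arange(25, dtype=torch.float32).reshape(1, 1, 5, 5)
print(MaxPool2d(kernel_size=3, ceil_mode=True)(x).shape)   # torch.Size([1, 1, 2, 2]) -- border regions kept
print(MaxPool2d(kernel_size=3, ceil_mode=False)(x).shape)  # torch.Size([1, 1, 1, 1]) -- border regions dropped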
import torch
from torch import nn
from torch.nn import ReLU

input = torch.tensor([[1, -0.5, -1, 3]])
input = torch.reshape(input, (-1, 1, 2, 2))

class Tudui(nn.Module):
    def __init__(self):
        super().__init__()
        self.relu1 = ReLU()

    def forward(self, input):
        output = self.relu1(input)
        return output

tudui = Tudui()
output = tudui(input)
print(output)
The inplace parameter controls whether the input is overwritten in place. inplace=False is generally recommended so that the original data is preserved; a small sketch follows.
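A small sketch of the difference:

import torch
from torch.nn import ReLU

x = torch.tensor([1.0, -0.5, -1.0, 3.0])
ReLU(inplace=False)(x)  # returns a new tensor; x is unchanged
print(x)                # tensor([ 1.0000, -0.5000, -1.0000,  3.0000])
ReLU(inplace=True)(x)   # overwrites x itself
print(x)                # tensor([1., 0., 0., 3.])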
import torch
import torchvision
from torch import nn
from torch.utils.data import DataLoader

dataset = torchvision.datasets.CIFAR10('./dataset', train=False, transform=torchvision.transforms.ToTensor(), download=True)
dataloader = DataLoader(dataset, batch_size=64)

class Tudui(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(196608, 10)

    def forward(self, input):
        output = self.linear1(input)
        return output

tudui = Tudui()
for data in dataloader:
    imgs, targets = data
    print(imgs.shape)
    # input = torch.reshape(imgs, (1, 1, 1, -1))
    input = torch.flatten(imgs)
    output = tudui(input)
    print(output.shape)
    break
torch.flatten() flattens a tensor into a single 1-D vector; note the contrast with nn.Flatten in the sketch below.
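A small sketch of the difference (nn.Flatten defaults to start_dim=1, so it keeps the batch dimension, while torch.flatten collapses everything):

import torch
from torch import nn

x = torch.ones(64, 3, 32, 32)
print(torch.flatten(x).shape)  # torch.Size([196608]) -- everything flattened
print(nn.Flatten()(x).shape)   # torch.Size([64, 3072]) -- batch dimension kept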
import torch
from torch import nn
from torch.nn import Sequential, Conv2d, MaxPool2d, Flatten, Linear
from torch.utils.tensorboard import SummaryWriter

class Tudui(nn.Module):
    def __init__(self):
        super().__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),  # 1024 = 64 channels * 4 * 4 after three poolings of a 32x32 input
            Linear(64, 10),
        )

    def forward(self, x):
        x = self.model1(x)
        return x

tudui = Tudui()
input = torch.ones((64, 3, 32, 32))
output = tudui(input)
print(output.shape)  # torch.Size([64, 10])

writer = SummaryWriter("./logs_seq")
writer.add_graph(tudui, input)
writer.close()
TensorBoard can also visualize the model graph (via add_graph).
# nn_loss
import torch
from torch import nn
from torch.nn import L1Loss
inputs = torch.tensor([1, 2, 3], dtype=torch.float32)
targets = torch.tensor([1, 2, 5], dtype=torch.float32)
inputs = torch.reshape(inputs, (1, 1, 1, 3))
targets = torch.reshape(targets, (1, 1, 1, 3))
loss = L1Loss()
result = loss(inputs, targets)
print(result)
loss_mse = nn.MSELoss()
result_mse = loss_mse(inputs, targets)
print(result_mse)
x = torch.tensor([0.1, 0.2, 0.3])
x = torch.reshape(x, (1, 3))
y = torch.tensor([1])
loss_cross = nn.CrossEntropyLoss()
result_cross = loss_cross(x, y)
print(result_cross)  # tensor(1.1019)
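CrossEntropyLoss combines log-softmax and negative log-likelihood, so for the example above the loss is -x[1] + log(sum_j exp(x_j)). A quick check:

import math
x = [0.1, 0.2, 0.3]
print(-x[1] + math.log(sum(math.exp(v) for v in x)))  # ≈ 1.1019, matches result_cross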
# nn_loss_network
import torchvision
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear, CrossEntropyLoss, Sequential
from torch.utils.data import DataLoader

dataset = torchvision.datasets.CIFAR10(
    "./dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True
)
dataloader = DataLoader(dataset, batch_size=64)

class Tudui(nn.Module):
    def __init__(self):
        super().__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10),
        )

    def forward(self, x):
        x = self.model1(x)
        return x

loss = CrossEntropyLoss()
tudui = Tudui()
for data in dataloader:
    imgs, targets = data
    outputs = tudui(imgs)
    result_loss = loss(outputs, targets)  # calling result_loss.backward() would compute the gradients that an optimizer then uses to update the parameters
    print(result_loss)
import torch
import torchvision
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear, CrossEntropyLoss, Sequential
from torch.utils.data import DataLoader

dataset = torchvision.datasets.CIFAR10(
    "./dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True
)
dataloader = DataLoader(dataset, batch_size=64)

class Tudui(nn.Module):
    def __init__(self):
        super().__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10),
        )

    def forward(self, x):
        x = self.model1(x)
        return x

loss = CrossEntropyLoss()
tudui = Tudui()
optim = torch.optim.SGD(tudui.parameters(), lr=0.01)
for epoch in range(20):
    running_loss = 0.0
    for data in dataloader:
        imgs, targets = data
        outputs = tudui(imgs)
        result_loss = loss(outputs, targets)
        optim.zero_grad()       # clear the gradients from the previous step
        result_loss.backward()  # backpropagate to compute the gradients
        optim.step()            # let the optimizer update the parameters with those gradients
        running_loss = running_loss + result_loss
    print(running_loss)
import torchvision
from torch import nn

# train_data = torchvision.datasets.ImageNet(
#     "./dataset",
#     split="train",
#     download=True,
#     transform=torchvision.transforms.ToTensor()
# )

vgg16_false = torchvision.models.vgg16()  # randomly initialized weights
vgg16_true = torchvision.models.vgg16(weights=torchvision.models.VGG16_Weights.IMAGENET1K_V1)  # pretrained on ImageNet

# Append a new layer to the pretrained model
print(vgg16_true)
vgg16_true.classifier.add_module('add_linear', nn.Linear(1000, 10))
print(vgg16_true)

# Replace one layer of the pretrained model
print(vgg16_false)
vgg16_false.classifier[6] = nn.Linear(4096, 10)
print(vgg16_false)
There are two ways to save a model:
import torch
import torchvision

vgg16 = torchvision.models.vgg16()

# Method 1: save the whole model (structure + parameters)
torch.save(vgg16, 'vgg16_method1.pth')
model1 = torch.load('vgg16_method1.pth')
print(model1)

# Method 2: save only the parameters (state_dict), officially recommended
torch.save(vgg16.state_dict(), 'vgg16_method2.pth')
vgg16.load_state_dict(torch.load('vgg16_method2.pth'))
print(vgg16)
# model.py
import torch
from torch import nn
from torch.nn import Sequential, Conv2d, MaxPool2d, Flatten, Linear

class Tudui(nn.Module):
    def __init__(self):
        super().__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10),
        )

    def forward(self, x):
        x = self.model1(x)
        return x

if __name__ == '__main__':
    tudui = Tudui()
    input = torch.zeros((64, 3, 32, 32))
    output = tudui(input)
    print(output.shape)
import torch
import torchvision
from torch import nn
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
from model import Tudui

# Prepare the datasets
train_data = torchvision.datasets.CIFAR10(
    "./dataset", train=True, transform=torchvision.transforms.ToTensor(), download=True
)
valid_data = torchvision.datasets.CIFAR10(
    "./dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True
)
train_data_size = len(train_data)
valid_data_size = len(valid_data)
print('Size of the training set: {}'.format(train_data_size))
print('Size of the validation set: {}'.format(valid_data_size))

# Load the datasets with DataLoader
train_data_loader = DataLoader(train_data, batch_size=64)
valid_data_loader = DataLoader(valid_data, batch_size=64)

# Create the network model
tudui = Tudui()

# Define the loss function
loss_fn = nn.CrossEntropyLoss()

# Define the optimizer
learning_rate = 1e-2
optimizer = torch.optim.SGD(tudui.parameters(), learning_rate)

# Bookkeeping for training
total_train_step = 0
# Number of epochs
epoch = 10

# Add TensorBoard
writer = SummaryWriter('./logs_train')

for i in range(epoch):
    print('-------- Epoch {} training starts --------'.format(i + 1))

    # Training
    tudui.train()
    for data in train_data_loader:
        imgs, targets = data
        outputs = tudui(imgs)
        loss = loss_fn(outputs, targets)

        # Optimize the model
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total_train_step += 1
        if total_train_step % 100 == 0:
            print(f'Training step: {total_train_step}, loss: {loss.item()}')
            writer.add_scalar('train_loss', loss.item(), total_train_step)

    # Evaluation
    tudui.eval()
    total_valid_loss = 0
    total_accuracy = 0
    with torch.no_grad():
        for data in valid_data_loader:
            imgs, targets = data
            outputs = tudui(imgs)
            loss = loss_fn(outputs, targets)
            total_valid_loss += loss.item()
            accuracy = (outputs.argmax(1) == targets).sum()
            total_accuracy += accuracy
    print(f'Total loss on the validation set: {total_valid_loss}')
    print(f'Accuracy on the validation set: {total_accuracy / valid_data_size}')
    writer.add_scalar('valid_loss', total_valid_loss, i)
    writer.add_scalar('valid_accuracy', total_accuracy / valid_data_size, i)

    # Save the model parameters after each epoch
    torch.save(tudui.state_dict(), f'tudui_{i}.pth')

writer.close()
There are two approaches.
# Method 1: call .cuda()
# move the network to cuda
tudui = tudui.cuda()
# move the data (inputs and targets) to cuda
imgs = imgs.cuda()
targets = targets.cuda()
# move the loss function to cuda
loss_fn = loss_fn.cuda()
# Method 2: call .to(device)
# define the training device
device = torch.device('cpu')
device = torch.device('cuda:0')  # or 'cuda:1', etc.
# move the network to the device
tudui = tudui.to(device)
# move the data (inputs and targets) to the device
imgs = imgs.to(device)
targets = targets.to(device)
# move the loss function to the device
loss_fn = loss_fn.to(device)
In fact, only the data need to be re-assigned; the network model and the loss function do not need re-assignment, as the sketch below shows.
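A small sketch of why (assuming a CUDA device is available): nn.Module.to() moves the module's parameters in place and returns the same module, whereas Tensor.to() returns a new tensor, so only tensors have to be re-assigned.

import torch
from torch import nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = nn.Linear(4, 2)
model.to(device)                        # moves the parameters in place; re-assignment is optional
print(next(model.parameters()).device)  # cuda:0 (if available)

x = torch.ones(1, 4)
x.to(device)      # returns a new tensor; x itself is unchanged
print(x.device)   # still cpu
x = x.to(device)  # tensors must be re-assigned
print(x.device)   # cuda:0 (if available)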