AlexNet was first published at NIPS 2012 and won the ILSVRC 2012 competition, marking the start of deep learning models in image classification. It was proposed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton, hence the name AlexNet. Paper: ImageNet Classification with Deep Convolutional Neural Networks

Note that the input size given in the original paper is 224*224, but with the stated convolution parameters, producing a 55*55 feature map in the next layer would require rounding up. The reproduction code therefore uses a 227*227 input, which works out exactly and is more convenient to implement: (227 - 11) / 4 + 1 = 55.

Read the network structure diagram together with the code. In the first two convolutional layers, the input and output channels are controlled via in_channels/out_channels, and the kernel_size and stride change the spatial size of the feature map; the last three convolutional layers use kernel_size=3 with padding=1, so they leave the spatial size unchanged. For both Conv2d and MaxPool2d, the output size is: output = floor((input - kernel_size + 2 * padding) / stride) + 1
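To make this concrete, here is a small sketch (my addition, not from the original post) that applies this formula to trace the feature-map size through AlexNet's convolution and pooling layers:

import math

def out_size(size, kernel, stride=1, padding=0):
    """Output spatial size for Conv2d / MaxPool2d (floor division)."""
    return math.floor((size - kernel + 2 * padding) / stride) + 1

s = 227                    # input 227 x 227
s = out_size(s, 11, 4)     # conv1 -> 55
s = out_size(s, 3, 2)      # pool1 -> 27
s = out_size(s, 5, 1, 2)   # conv2 -> 27 (padding keeps the size)
s = out_size(s, 3, 2)      # pool2 -> 13
s = out_size(s, 3, 1, 1)   # conv3/4/5 keep 13
s = out_size(s, 3, 2)      # pool3 -> 6
print(s)                   # 6, matching the 256*6*6 classifier input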

- Platform: Kaggle, with GPU acceleration
- Dataset: medical-mnist, an open dataset on Kaggle with 6 classes, 47163 training images and 11791 test images
- The AlexNet architecture and basic settings are essentially preserved; only the dataset is different, and distributed training is not used

How to use the source code:
Step 1: Search for medical mnist on Kaggle, find the dataset, and click New Notebook
Step 2: Turn on GPU acceleration
Step 3: Copy the code below and run it!
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models
from torchvision.utils import make_grid
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
import os
import random
import cv2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")
train_transforms = transforms.Compose([
    transforms.RandomRotation(10),       # random rotation in (-10, 10) degrees
    transforms.RandomHorizontalFlip(),   # horizontal flip with probability 0.5
    transforms.Resize(227),
    transforms.CenterCrop(227),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406],   # per-channel (data - mean) / std
                         [0.229, 0.224, 0.225])
])
# ImageFolder's default loader converts images to RGB, so the grayscale
# Medical MNIST images come out as 3-channel tensors
dataset = datasets.ImageFolder(root="../input/medical-mnist", transform=train_transforms)
# stratify on the labels so both splits keep the same class proportions
train_indices, test_indices = train_test_split(list(range(len(dataset.targets))), test_size=0.2, stratify=dataset.targets)
train_data = torch.utils.data.Subset(dataset, train_indices)
test_data = torch.utils.data.Subset(dataset, test_indices)
print(len(train_data), len(test_data))
train_loader = DataLoader(train_data, batch_size=12, shuffle=True)
test_loader = DataLoader(test_data, batch_size=12)
print(len(test_loader), len(train_loader))
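# Optional sanity check (my addition, not in the original post): confirm
# that the stratified split keeps similar class proportions in both subsets.
print(np.bincount(np.array(dataset.targets)[train_indices]))
print(np.bincount(np.array(dataset.targets)[test_indices]))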
# AlexNet
class AlexNet(nn.Module):
    """Neural network model, following the original AlexNet architecture."""
    def __init__(self, num_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=96, kernel_size=11, stride=4),    # (b x 96 x 55 x 55)
            nn.ReLU(inplace=False),
            nn.LocalResponseNorm(size=5, alpha=0.0001, beta=0.75, k=2),
            nn.MaxPool2d(kernel_size=3, stride=2),                                  # (b x 96 x 27 x 27)
            nn.Conv2d(in_channels=96, out_channels=256, kernel_size=5, padding=2),  # (b x 256 x 27 x 27)
            nn.ReLU(inplace=False),
            nn.LocalResponseNorm(size=5, alpha=0.0001, beta=0.75, k=2),
            nn.MaxPool2d(kernel_size=3, stride=2),                                  # (b x 256 x 13 x 13)
            nn.Conv2d(in_channels=256, out_channels=384, kernel_size=3, padding=1), # (b x 384 x 13 x 13)
            nn.ReLU(inplace=False),
            nn.Conv2d(in_channels=384, out_channels=384, kernel_size=3, padding=1), # (b x 384 x 13 x 13)
            nn.ReLU(inplace=False),
            nn.Conv2d(in_channels=384, out_channels=256, kernel_size=3, padding=1), # (b x 256 x 13 x 13)
            nn.ReLU(inplace=False),
            nn.MaxPool2d(kernel_size=3, stride=2)                                   # (b x 256 x 6 x 6)
        )
        # classifier: the fully connected output layers
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5, inplace=False),
            nn.Linear(in_features=(256 * 6 * 6), out_features=4096),
            nn.ReLU(inplace=False),
            nn.Dropout(p=0.5, inplace=False),
            nn.Linear(in_features=4096, out_features=4096),
            nn.ReLU(inplace=False),
            nn.Linear(in_features=4096, out_features=num_classes)
        )
        self.init_bias()  # initialize weights and biases
    def init_bias(self):
        for layer in self.net:
            if isinstance(layer, nn.Conv2d):
                nn.init.normal_(layer.weight, mean=0, std=0.01)
                nn.init.constant_(layer.bias, 0)
        # per the original paper, bias = 1 for the 2nd, 4th, and 5th conv layers
        nn.init.constant_(self.net[4].bias, 1)
        nn.init.constant_(self.net[10].bias, 1)
        nn.init.constant_(self.net[12].bias, 1)
    def forward(self, x):
        x = self.net(x)
        x = x.view(-1, 256 * 6 * 6)  # flatten for the linear layer input
        return self.classifier(x)
# count model parameters
def count_parameters(model):
    params = [p.numel() for p in model.parameters() if p.requires_grad]
    for item in params:
        print(f'{item:>8}')
    print(f'________\n{sum(params):>8}')
count_parameters(AlexNet(6))
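# Optional sanity check (my addition, not in the original post): a dummy
# forward pass on CPU confirms that a 227 x 227 input yields 6 logits.
with torch.no_grad():
    print(AlexNet(num_classes=6)(torch.randn(1, 3, 227, 227)).shape)  # torch.Size([1, 6])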
device = torch.device("cuda" if torch.cuda.is_available() else 'cpu')
alexnet = AlexNet(num_classes=6).to(device)
optimizer = torch.optim.Adam(alexnet.parameters(), lr=0.0001)
criterion = nn.CrossEntropyLoss().to(device)
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
import time
start_time = time.time()
train_losses = []
test_losses = []
train_acc = []
test_acc = []
epochs = 90
for i in range(epochs):
    # training phase (enables dropout)
    alexnet.train()
    total_train_loss = 0
    total_train_acc = 0
    for b, (X_train, y_train) in enumerate(train_loader):
        X_train, y_train = X_train.to(device), y_train.to(device)
        y_pred = alexnet(X_train)
        loss = criterion(y_pred, y_train)
        total_train_loss += loss.item()
        total_train_acc += (y_pred.argmax(1) == y_train).sum().item()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    lr_scheduler.step()  # decay the learning rate every 30 epochs
    train_losses.append(total_train_loss)
    train_acc.append(total_train_acc / len(train_data))
    # evaluation phase (disables dropout)
    alexnet.eval()
    total_test_loss = 0
    total_test_acc = 0
    with torch.no_grad():
        for b, (X_test, y_test) in enumerate(test_loader):
            X_test, y_test = X_test.to(device), y_test.to(device)
            y_val = alexnet(X_test)
            loss = criterion(y_val, y_test)
            total_test_loss += loss.item()
            total_test_acc += (y_val.argmax(1) == y_test).sum().item()
    test_losses.append(total_test_loss)
    test_acc.append(total_test_acc / len(test_data))
    print(f"epoch:{i+1},\t train_loss:{total_train_loss} \t train_acc:{total_train_acc/len(train_data)} \t test_loss:{total_test_loss} \t test_acc:{total_test_acc/len(test_data)}")
print(f'\nDuration: {time.time() - start_time:.0f} seconds')
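The loss and accuracy lists collected above can be visualized with matplotlib (already imported). This plotting snippet is my addition, not part of the original post:

plt.figure(figsize=(10, 4))
plt.subplot(1, 2, 1)
plt.plot(train_losses, label='train loss')
plt.plot(test_losses, label='test loss')
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(train_acc, label='train acc')
plt.plot(test_acc, label='test acc')
plt.legend()
plt.show()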
- Copying and running the code block by block helps with understanding
- The total number of parameters is about 58305926, close to the figure given in the original paper
- In my opinion, the use of ImageFolder, train_test_split, and torch.utils.data.Subset for data loading (with the sanity check shown above) is also quite important and worth noting
Finally, here is a link to the source code for reference: AlexNet代码_清晰易懂
I have also anticipated the error you are most likely to hit: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [128, 4096]]......
Fix: change the True passed to the ReLU and Dropout layers in the source code to False.
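For intuition, here is a minimal self-contained example (my own illustration, separate from the AlexNet code above) of how an in-place operation can break gradient computation: sigmoid's backward pass needs its own output, so modifying that output in place triggers the same class of RuntimeError.

import torch

a = torch.randn(3, requires_grad=True)
b = torch.sigmoid(a)  # sigmoid's backward pass needs b itself
b.add_(1)             # in-place modification of b
b.sum().backward()    # RuntimeError: ... modified by an inplace operation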