• Use gradient accumulation (accumulate step) to save GPU memory


    Prerequisite:
    A single GPU on which the model can at least run with batch = 1.
    Gradient accumulation means forwarding a smaller batch each step (e.g. batch = 4) and only updating the parameters once every 4 steps; the training result is then equivalent to batch = 16.
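    A minimal sketch of such an accumulation loop (the model, loss, and data names below are illustrative stand-ins, not taken from NLinear_exp_full.py):

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset

    # Toy stand-ins for the real NLinear model, loss, and dataset.
    model = nn.Linear(10, 1)
    loss_fn = nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loader = DataLoader(TensorDataset(torch.randn(16, 10), torch.randn(16, 1)), batch_size=4)

    accu_step = 4  # update the weights once every 4 small batches (effective batch = 4 * 4 = 16)

    optimizer.zero_grad()
    for step, (x, y) in enumerate(loader):
        loss = loss_fn(model(x), y)
        # Scale the loss so the accumulated gradient matches one update on the large batch.
        (loss / accu_step).backward()
        if (step + 1) % accu_step == 0:
            optimizer.step()
            optimizer.zero_grad()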
    First, run the original model:

    python NLinear_exp_full.py --accu_step 1 --batch 16 
    epoch: 0
    time comsuming: 1.8598144054412842
    training epoch:0:0.0%
    time comsuming: 2.137087106704712
    training epoch:0:80.64516129032258%
    time comsuming: 2.2242424488067627
    time comsuming: 2.294013500213623
    test epoch:0:0.0%
    episode 0 mae 23.900234 rmse 66.41403 smape 0.934281
    epoch: 1
    time comsuming: 3.2021634578704834
    training epoch:1:0.0%
    time comsuming: 3.477159261703491
    training epoch:1:80.64516129032258%
    time comsuming: 3.560976505279541
    time comsuming: 3.624363422393799
    test epoch:1:0.0%
    episode 1 mae 22.137833 rmse 64.748055 smape 0.79881644
    epoch: 2
    time comsuming: 3.982663869857788
    training epoch:2:0.0%
    time comsuming: 4.26115345954895
    training epoch:2:80.64516129032258%
    time comsuming: 4.350359678268433
    time comsuming: 4.427008628845215
    test epoch:2:0.0%
    episode 2 mae 21.542023 rmse 64.10915 smape 0.68798375
    epoch: 3
    time comsuming: 4.786099910736084
    training epoch:3:0.0%
    time comsuming: 5.036171913146973
    training epoch:3:80.64516129032258%
    time comsuming: 5.121201038360596
    time comsuming: 5.197283744812012
    test epoch:3:0.0%
    episode 3 mae 21.322206 rmse 64.079384 smape 0.6753313
    epoch: 4
    time comsuming: 5.5672008991241455
    training epoch:4:0.0%
    time comsuming: 5.830775260925293
    training epoch:4:80.64516129032258%
    time comsuming: 5.919378757476807
    time comsuming: 5.9778666496276855
    

    Then run it again with batch set to 4 and accumulate step set to 4:

    python NLinear_exp_full.py --accu_step 4 --batch 4 
    time comsuming: 1.9860742092132568
    training epoch:0:0.0%
    time comsuming: 2.221600294113159
    training epoch:0:20.161290322580644%
    time comsuming: 2.453077554702759
    training epoch:0:40.32258064516129%
    time comsuming: 2.675966262817383
    training epoch:0:60.483870967741936%
    time comsuming: 2.832383394241333
    training epoch:0:80.64516129032258%
    time comsuming: 3.0732641220092773
    time comsuming: 3.1844491958618164
    test epoch:0:0.0%
    time comsuming: 3.4134249687194824
    test epoch:0:72.99270072992701%
    episode 0 mae 23.900234 rmse 66.41403 smape 0.934281
    epoch: 1
    time comsuming: 4.225269079208374
    training epoch:1:0.0%
    time comsuming: 4.442946434020996
    training epoch:1:20.161290322580644%
    time comsuming: 4.611685752868652
    training epoch:1:40.32258064516129%
    time comsuming: 4.845811367034912
    training epoch:1:60.483870967741936%
    time comsuming: 5.074229001998901
    training epoch:1:80.64516129032258%
    time comsuming: 5.326176166534424
    time comsuming: 5.397624492645264
    test epoch:1:0.0%
    time comsuming: 5.633365869522095
    test epoch:1:72.99270072992701%
    episode 1 mae 22.137833 rmse 64.748055 smape 0.79881644
    epoch: 2
    time comsuming: 5.991377592086792
    training epoch:2:0.0%
    time comsuming: 6.217101097106934
    training epoch:2:20.161290322580644%
    time comsuming: 6.363693714141846
    training epoch:2:40.32258064516129%
    time comsuming: 6.590087175369263
    training epoch:2:60.483870967741936%
    time comsuming: 6.823684215545654
    training epoch:2:80.64516129032258%
    time comsuming: 7.081570625305176
    time comsuming: 7.148298978805542
    test epoch:2:0.0%
    time comsuming: 7.377046823501587
    test epoch:2:72.99270072992701%
    episode 2 mae 21.542023 rmse 64.10915 smape 0.68798375
    epoch: 3
    time comsuming: 7.766062021255493
    training epoch:3:0.0%
    time comsuming: 7.996231317520142
    training epoch:3:20.161290322580644%
    time comsuming: 8.161593675613403
    training epoch:3:40.32258064516129%
    time comsuming: 8.388957738876343
    training epoch:3:60.483870967741936%
    time comsuming: 8.618509769439697
    training epoch:3:80.64516129032258%
    time comsuming: 8.876739978790283
    time comsuming: 8.95041275024414
    test epoch:3:0.0%
    time comsuming: 9.18027663230896
    
    

    GPU memory usage: 514 MB vs. 494 MB (batch = 16 without accumulation vs. batch = 4 with accu_step = 4)
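    The peak figure can also be checked from inside PyTorch; a rough sketch, assuming a CUDA device is available (the post does not say how the numbers above were measured):

    import torch

    torch.cuda.reset_peak_memory_stats()

    # Stand-in workload: one forward/backward pass with a small per-step batch.
    x = torch.randn(4, 1024, device="cuda")
    w = torch.randn(1024, 1024, device="cuda", requires_grad=True)
    (x @ w).sum().backward()

    print(f"peak allocated: {torch.cuda.max_memory_allocated() / 1024 ** 2:.1f} MB")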

    Note: when the model contains BatchNorm layers or DropPath, the final results are no longer identical, because those layers act on the actual per-step batch (e.g. BatchNorm normalizes over the 4 samples it sees, not over the effective batch of 16), as the small check below illustrates.
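    A small illustrative check (not from the original script) showing that BatchNorm's running statistics differ between one batch of 16 and four accumulated batches of 4:

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    x = torch.randn(16, 8)

    bn_full = nn.BatchNorm1d(8)
    bn_accu = nn.BatchNorm1d(8)

    _ = bn_full(x)                # one forward with the full batch of 16
    for chunk in x.chunk(4):      # four forwards with sub-batches of 4
        _ = bn_accu(chunk)

    # The running statistics diverge, so training is no longer exactly equivalent.
    print(torch.allclose(bn_full.running_mean, bn_accu.running_mean))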

  • Original article: https://blog.csdn.net/qq_45654059/article/details/133862992