[Artificial Intelligence Elective] Lab Report


    MNIST Handwritten Digit Recognition with a Neural Network

    A programming assignment for the "Artificial Intelligence" elective course

    1. Experiment Objectives

    • Learn to apply neural network models to supervised learning problems
    • Learn the model training and testing methods commonly used in machine learning
    • Understand how the choice of training method affects test results

    2. Experiment Content

    The MNIST dataset

    The dataset used in this experiment, MNIST, is a collection of handwritten digit images and their labels (60,000 training images and 10,000 test images). All images are 28x28 pixels, and each has been preprocessed so that the digit sits at the center of the image. The dataset is stored in binary form: each image is a one-dimensional vector of length 784 (28x28x1, i.e. a single-channel 28x28 grayscale image), and each label is a one-hot vector of length 10.
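    For reference, a minimal sketch of loading the dataset with the TensorFlow 1.x tutorial reader (the loading code used in this experiment is not shown; the read_data_sets helper and the MNIST_data/ path here are assumptions):

    from tensorflow.examples.tutorials.mnist import input_data

    # Download (if necessary) and load MNIST with one-hot labels
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
    print(mnist.train.images.shape)  # (55000, 784): flattened 28x28 grayscale images
    print(mnist.train.labels.shape)  # (55000, 10): one-hot label vectors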

    Stratified sampling

    Stratified sampling (also called type sampling) first divides the population into several classes and then samples within each class separately. Because the sampling respects class boundaries, the class distribution of the sample matches that of the population, making the sample more representative. In this experiment the MNIST dataset contains the ten digits 0-9, so stratified sampling first splits the data into ten classes by digit and then samples from each class in the same proportion.
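    A minimal NumPy sketch of the idea (the function name and the assumption that labels are one-hot NumPy arrays are mine):

    import numpy as np

    def stratified_sample(images, labels, fraction):
        """Draw `fraction` of the samples from each digit class separately."""
        digits = np.argmax(labels, axis=1)           # one-hot -> digit 0..9
        picked = []
        for digit in range(10):
            idx = np.where(digits == digit)[0]       # indices of this class
            n = int(len(idx) * fraction)             # same fraction per class
            picked.append(np.random.choice(idx, n, replace=False))
        picked = np.concatenate(picked)
        np.random.shuffle(picked)                    # mix the classes back together
        return images[picked], labels[picked]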

    Evaluating a neural network model

    The error of a neural network model is usually estimated experimentally: a test set is used to measure the model's ability to classify new samples, and the test error on that set serves as an approximation of the true error. Two common ways to split the data into training and test sets are:

    Hold-out: the dataset is split directly into two disjoint sets in a fixed ratio. To keep the data distribution as consistent as possible across the split, stratified sampling can be used so that the class proportions in the training and test sets are similar. Note that if the test set is not drawn evenly from the whole dataset, additional bias is introduced, so the estimate from a single hold-out split is often not stable or reliable. In practice the random split is repeated several times, the model is evaluated on each split, and the average is reported as the hold-out result.

    k-fold cross validation (k-fold cross validation): the dataset is first partitioned into k disjoint subsets of similar size, each preserving the data distribution as far as possible (i.e. again via stratified sampling). Each subset in turn serves as the test set while the union of the remaining k-1 subsets serves as the training set, yielding k train/test pairs and hence k rounds of training and testing. The final result is the mean of the k test results. Clearly, the stability and fidelity of the estimate depend heavily on the choice of k; the most common value is 10, with 5 and 20 also in use.
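    In symbols (notation mine), the cross-validation estimate is simply the mean of the per-fold accuracies:

    \mathrm{acc}_{\mathrm{CV}} = \frac{1}{k} \sum_{i=1}^{k} \mathrm{acc}_i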

    3. Experiment Design

    This section describes the overall program design and the approach to each key step, covering:

    1) Program design for model construction (pseudocode or source code screenshots) with explanation (10 points)

    A fully connected network is built, with layer sizes 784 -> 128 -> 128 -> 10.

    The Adam optimizer is used, and the loss is the softmax cross-entropy.

    Details are explained in the code comments.

    # Build and train the model
    def train_and_test(images_train, labels_train, images_test, labels_test, images_validation, labels_validation):
        x = tf.placeholder(tf.float32, [None, 784], name="X")
        y = tf.placeholder(tf.float32, [None, 10], name="Y")
        h1 = fcn_layer(inputs=x,
                       input_dim=784,
                       output_dim=128,
                       activation=tf.nn.relu)

        h2 = fcn_layer(inputs=h1,
                       input_dim=128,
                       output_dim=128,
                       activation=tf.nn.relu)
        forward = fcn_layer(inputs=h2,
                            input_dim=128,
                            output_dim=10,
                            activation=None)
        pred = tf.nn.softmax(forward)

        loss_function = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(logits=forward, labels=y))

        optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss_function)  # optimizer
        correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))  # compare predictions with ground truth
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    

    The fcn_layer helper:

    def fcn_layer(inputs,           # input tensor
                  input_dim,        # number of input neurons
                  output_dim,       # number of output neurons
                  activation=None): # activation function
        W = tf.Variable(tf.truncated_normal(
            [input_dim, output_dim], stddev=0.1))  # weights: truncated-normal init
        b = tf.Variable(tf.zeros([output_dim]))  # biases initialized to zero
        XWb = tf.matmul(inputs, W) + b
        return XWb if activation is None else activation(XWb)
    

    2) Program design for iterative model training (pseudocode or source code screenshots) with explanation (10 points)

    Details are explained in the code comments.

    train_epochs = 32  # number of training epochs
    batch_size = 64  # samples per training step (batch size)
    display_step = 4096  # reporting interval (in steps)
    learning_rate = 0.001  # learning rate

    optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss_function)  # optimizer
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))  # compare predictions with ground truth
    # accuracy: cast the booleans to floats and take the mean
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


    with tf.Session() as sess:
        init = tf.global_variables_initializer()  # initialize variables
        sess.run(init)

        step = 0
        for (batchImages, batchLabels) in batch_iter(images_train, labels_train, batch_size, train_epochs, shuffle=True):
            sess.run(optimizer, feed_dict={x: batchImages, y: batchLabels})
    
    
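    The batch_iter generator used above is a helper that shuffles the training data and yields mini-batches for the requested number of epochs. Its definition is not part of the listing; a minimal sketch of what it is assumed to do:

    import numpy as np

    def batch_iter(images, labels, batch_size, num_epochs, shuffle=True):
        """Yield (images, labels) mini-batches over num_epochs passes through the data."""
        n = len(images)
        for _ in range(num_epochs):
            order = np.random.permutation(n) if shuffle else np.arange(n)
            for start in range(0, n, batch_size):
                batch = order[start:start + batch_size]
                yield images[batch], labels[batch]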

    3) Program design for periodic testing during training (pseudocode or source code screenshots) with explanation (periodic testing means evaluating the model every n training steps to obtain its accuracy and loss) (10 points)

    Details are explained in the code comments. Note that the step counter must be incremented on every batch, not only when a report is printed; otherwise the `step % display_step` test would fire on every iteration after the first.

    display_step = 4096  # reporting interval (in steps)
    with tf.Session() as sess:
        init = tf.global_variables_initializer()  # initialize variables
        sess.run(init)

        step = 0
        for (batchImages, batchLabels) in batch_iter(images_train, labels_train, batch_size, train_epochs, shuffle=True):
            sess.run(optimizer, feed_dict={x: batchImages, y: batchLabels})

            if step % display_step == 0:
                loss, acc = sess.run([loss_function, accuracy],
                                     feed_dict={x: images_validation, y: labels_validation})  # evaluate on the validation set
                print(f"step: {step+1} Loss={loss} accuracy={acc}")
            step += 1
    
    

    Output:

    step: 1 Loss=2.238192558288574 accuracy=0.17159998416900635
    step: 4097 Loss=0.09725397080183029 accuracy=0.9717997312545776
    step: 8193 Loss=0.10235630720853806 accuracy=0.9781997203826904
    step: 12289 Loss=0.13071678578853607 accuracy=0.9735997915267944
    step: 16385 Loss=0.12960655987262726 accuracy=0.9757996797561646
    step: 20481 Loss=0.14140461385250092 accuracy=0.9765996932983398
    step: 24577 Loss=0.16358020901679993 accuracy=0.9759997129440308
    === test accuracy: 0.97  ===
    
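    The final "=== test accuracy ===" line comes from one more evaluation on the held-out test set after the training loop ends; that code is not part of the listing above, but inside the same session it amounts to a sketch like:

        # after the training loop, still inside the tf.Session() block
        test_acc = sess.run(accuracy, feed_dict={x: images_test, y: labels_test})
        print(f"=== test accuracy: {test_acc:.4}  ===")  # :.4 chosen to match the rounding above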

    4) Program design for stratified sampling (pseudocode or source code screenshots) with explanation (10 points)

    Implemented with train_test_split from sklearn, which performs the stratified split. The train/test split is drawn at random ten times and the ten accuracies are recorded; their mean serves as the hold-out estimate.

    Details are explained in the code comments.

    # hold-out
    from sklearn.model_selection import train_test_split
    def hold_out(images, labels, train_percentage):
        accu = []
        # draw the train/test split at random ten times and record each accuracy
        for _ in range(10):
            train_images, test_images, train_labels, test_labels = \
                train_test_split(images,
                                 labels,
                                 train_size=train_percentage,  # fraction used for training
                                 stratify=labels  # preserve the class distribution
                                 )
            accu.append(train_and_test(train_images, train_labels, test_images, test_labels, test_images, test_labels))
        print("hold-out accuracy:", accu)
    
    
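    The mean of these ten accuracies can then be taken as the hold-out estimate, e.g. by adding the following line at the end of hold_out (assuming numpy is imported as np):

    print("hold-out mean accuracy:", np.mean(accu))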

    5) Program design for k-fold cross validation (pseudocode or source code screenshots) with explanation (10 points)

    Implemented with KFold from sklearn; the mean accuracy over the k different splits is reported.

    Details are explained in the code comments.

    # k-fold cross validation
    from sklearn.model_selection import KFold
    def cross_validation(images, labels, k):
        accu = []
        kf = KFold(n_splits=k, shuffle=True)
        for train_index, test_index in kf.split(images):
            images_train, images_test = images[train_index], images[test_index]
            labels_train, labels_test = labels[train_index], labels[test_index]
            accu.append(train_and_test(images_train, labels_train, images_test, labels_test, images_test, labels_test))
        print("cross-validation accuracy:", np.mean(accu))
    
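    Strictly speaking, the description in Section 2 calls for stratified folds, while KFold splits without regard to class. If stratification is wanted, sklearn's StratifiedKFold can be substituted; a sketch (note that it needs integer class labels, so the one-hot labels are collapsed with argmax first):

    from sklearn.model_selection import StratifiedKFold
    import numpy as np

    def stratified_cross_validation(images, labels, k):
        accu = []
        digits = np.argmax(labels, axis=1)  # one-hot -> digit 0..9, used only for stratification
        skf = StratifiedKFold(n_splits=k, shuffle=True)
        for train_index, test_index in skf.split(images, digits):
            images_train, images_test = images[train_index], images[test_index]
            labels_train, labels_test = labels[train_index], labels[test_index]
            accu.append(train_and_test(images_train, labels_train, images_test, labels_test, images_test, labels_test))
        print("stratified cross-validation accuracy:", np.mean(accu))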

    4. Results

    This section presents the program output, results, and related analysis, covering:

    1) Model accuracy on the validation set (output and screenshot) (10 points)

    step: 1 Loss=2.238192558288574 accuracy=0.17159998416900635
    step: 4097 Loss=0.09725397080183029 accuracy=0.9717997312545776
    step: 8193 Loss=0.10235630720853806 accuracy=0.9781997203826904
    step: 12289 Loss=0.13071678578853607 accuracy=0.9735997915267944
    step: 16385 Loss=0.12960655987262726 accuracy=0.9757996797561646
    step: 20481 Loss=0.14140461385250092 accuracy=0.9765996932983398
    step: 24577 Loss=0.16358020901679993 accuracy=0.9759997129440308
    === test accuracy: 0.97  ===
    

    2) Effect of different model parameters (number of hidden layers, nodes per hidden layer) on accuracy, with analysis (10 points)

    Varying the number of hidden layers
    • 0 hidden layers:
    step: 1 Loss=2.5073938369750977 accuracy=0.0729999914765358
    step: 4097 Loss=0.27769413590431213 accuracy=0.9217997789382935
    step: 8193 Loss=0.26662880182266235 accuracy=0.9259997010231018
    step: 12289 Loss=0.263393372297287 accuracy=0.9231997728347778
    step: 16385 Loss=0.26742368936538696 accuracy=0.9237997531890869
    step: 20481 Loss=0.26651620864868164 accuracy=0.9251997470855713
    step: 24577 Loss=0.26798802614212036 accuracy=0.9247996807098389
    === test accuracy: 0.9248  ===
    0.92479974
    
    • 1 hidden layer:
    step: 1 Loss=2.4127447605133057 accuracy=0.09719999134540558
    step: 4097 Loss=0.08607088774442673 accuracy=0.9745997190475464
    step: 8193 Loss=0.07784661650657654 accuracy=0.9785997271537781
    step: 12289 Loss=0.095745749771595 accuracy=0.9759998321533203
    step: 16385 Loss=0.09472983330488205 accuracy=0.9799997210502625
    step: 20481 Loss=0.09713517129421234 accuracy=0.9787996411323547
    step: 24577 Loss=0.0993366464972496 accuracy=0.9801996946334839
    === test accuracy: 0.9802  ===
    0.98019964
    
    • 2 hidden layers:
    step: 1 Loss=2.238192558288574 accuracy=0.17159998416900635
    step: 4097 Loss=0.09725397080183029 accuracy=0.9717997312545776
    step: 8193 Loss=0.10235630720853806 accuracy=0.9781997203826904
    step: 12289 Loss=0.13071678578853607 accuracy=0.9735997915267944
    step: 16385 Loss=0.12960655987262726 accuracy=0.9757996797561646
    step: 20481 Loss=0.14140461385250092 accuracy=0.9765996932983398
    step: 24577 Loss=0.16358020901679993 accuracy=0.9759997129440308
    === test accuracy: 0.97  ===
    

    Conclusion: adding a hidden layer helps substantially (0.9248 with none vs. 0.9802 with one), but the returns diminish quickly: a second hidden layer took longer to train yet did no better than one here (0.97 vs. 0.9802).

    Varying the number of hidden-layer nodes
    • 10*10 hidden nodes:
    step: 1 Loss=2.300844669342041 accuracy=0.12519998848438263
    step: 4097 Loss=0.2754775583744049 accuracy=0.9239997863769531
    step: 8193 Loss=0.24036210775375366 accuracy=0.9319997429847717
    step: 12289 Loss=0.22833241522312164 accuracy=0.9349997639656067
    step: 16385 Loss=0.22694511711597443 accuracy=0.9351996779441833
    step: 20481 Loss=0.2160138636827469 accuracy=0.9395997524261475
    step: 24577 Loss=0.20927678048610687 accuracy=0.9417997598648071
    === test accuracy: 0.9392  ===
    0.93919969
    
    • 16*16 hidden nodes:
    step: 1 Loss=2.302095890045166 accuracy=0.10459998995065689
    step: 4097 Loss=0.24206139147281647 accuracy=0.9285997152328491
    step: 8193 Loss=0.19353719055652618 accuracy=0.9429997801780701
    step: 12289 Loss=0.18354550004005432 accuracy=0.9491997361183167
    step: 16385 Loss=0.18149533867835999 accuracy=0.9485996961593628
    step: 20481 Loss=0.1877274215221405 accuracy=0.9493997097015381
    step: 24577 Loss=0.1913667917251587 accuracy=0.951799750328064
    === test accuracy: 0.9548  ===
    0.95479971
    
    • 128*128 hidden nodes:
    step: 1 Loss=2.238192558288574 accuracy=0.17159998416900635
    step: 4097 Loss=0.09725397080183029 accuracy=0.9717997312545776
    step: 8193 Loss=0.10235630720853806 accuracy=0.9781997203826904
    step: 12289 Loss=0.13071678578853607 accuracy=0.9735997915267944
    step: 16385 Loss=0.12960655987262726 accuracy=0.9757996797561646
    step: 20481 Loss=0.14140461385250092 accuracy=0.9765996932983398
    step: 24577 Loss=0.16358020901679993 accuracy=0.9759997129440308
    === test accuracy: 0.97  ===
    

    Conclusion:

    More nodes per hidden layer mean a rapidly growing parameter count and longer training, but also higher accuracy (0.9392 at 10*10, 0.9548 at 16*16, 0.97 at 128*128).

    (Past a certain size, however, the accuracy gain levels off, and overfitting may even set in.)

    3) Effect of different training parameters (batch size, epoch num, learning rate) on accuracy, with analysis (10 points)

    Varying the batch size
    • batch size = 64
    step: 1 Loss=2.238192558288574 accuracy=0.17159998416900635
    step: 4097 Loss=0.09725397080183029 accuracy=0.9717997312545776
    step: 8193 Loss=0.10235630720853806 accuracy=0.9781997203826904
    step: 12289 Loss=0.13071678578853607 accuracy=0.9735997915267944
    step: 16385 Loss=0.12960655987262726 accuracy=0.9757996797561646
    step: 20481 Loss=0.14140461385250092 accuracy=0.9765996932983398
    step: 24577 Loss=0.16358020901679993 accuracy=0.9759997129440308
    === test accuracy: 0.97  ===
    
    • batch size = 4096
    step: 1 Loss=2.2310731410980225 accuracy=0.16619999706745148
    === test accuracy: 0.9718  ===
    0.97179973
    
    • batch size = 32786
    step: 1 Loss=2.2919344902038574 accuracy=0.15059998631477356
    === test accuracy: 0.8782  ===
    0.87819982
    

    Conclusion: a larger batch size means fewer steps per epoch, so training finishes faster.

    However, an overly large batch size consumes too much GPU memory (and may overflow it), and it also weakens the stochasticity that stochastic gradient descent benefits from; here accuracy fell to 0.8782 at batch size 32786.

    Varying the epoch num
    • epoch num = 1
    step: 1 Loss=2.343087673187256 accuracy=0.11779998987913132
    step: 129 Loss=0.330795556306839 accuracy=0.9037997722625732
    step: 257 Loss=0.24847447872161865 accuracy=0.925399661064148
    step: 385 Loss=0.198109969496727 accuracy=0.9409997463226318
    step: 513 Loss=0.17987975478172302 accuracy=0.9479997754096985
    step: 641 Loss=0.1628917008638382 accuracy=0.9541996717453003
    step: 769 Loss=0.14910820126533508 accuracy=0.9547997117042542
    === test accuracy: 0.9526  ===
    0.95259976
    
    • epoch num = 32
    step: 1 Loss=2.238192558288574 accuracy=0.17159998416900635
    step: 4097 Loss=0.09725397080183029 accuracy=0.9717997312545776
    step: 8193 Loss=0.10235630720853806 accuracy=0.9781997203826904
    step: 12289 Loss=0.13071678578853607 accuracy=0.9735997915267944
    step: 16385 Loss=0.12960655987262726 accuracy=0.9757996797561646
    step: 20481 Loss=0.14140461385250092 accuracy=0.9765996932983398
    step: 24577 Loss=0.16358020901679993 accuracy=0.9759997129440308
    === test accuracy: 0.97  ===
    

    Conclusion: the epoch num determines how much data the model is trained on before stopping. Training should ideally continue until the accuracy plateaus; stopping too early (e.g. after a single epoch, 0.9526) leaves accuracy short of what the model reaches after 32 epochs (0.97).

    Varying the learning rate
    • learning rate = 0.0001
    step: 1 Loss=2.328317642211914 accuracy=0.09679999947547913
    step: 129 Loss=1.5215153694152832 accuracy=0.7011998891830444
    step: 257 Loss=0.8109210133552551 accuracy=0.8205997943878174
    step: 385 Loss=0.5582057237625122 accuracy=0.863199770450592
    step: 513 Loss=0.4527219235897064 accuracy=0.8837997317314148
    step: 641 Loss=0.39591166377067566 accuracy=0.8925997614860535
    step: 769 Loss=0.3588014245033264 accuracy=0.8997997641563416
    === test accuracy: 0.9004  ===
    0.9003998
    
    • learning rate = 0.001
    step: 1 Loss=2.343087673187256 accuracy=0.11779998987913132
    step: 129 Loss=0.330795556306839 accuracy=0.9037997722625732
    step: 257 Loss=0.24847447872161865 accuracy=0.925399661064148
    step: 385 Loss=0.198109969496727 accuracy=0.9409997463226318
    step: 513 Loss=0.17987975478172302 accuracy=0.9479997754096985
    step: 641 Loss=0.1628917008638382 accuracy=0.9541996717453003
    step: 769 Loss=0.14910820126533508 accuracy=0.9547997117042542
    === test accuracy: 0.9526  ===
    0.95259976
    
    • learning rate = 0.01
    step: 1 Loss=2.3110170364379883 accuracy=0.2709999680519104
    step: 129 Loss=0.2452460527420044 accuracy=0.9277997016906738
    step: 257 Loss=0.215981587767601 accuracy=0.9361997246742249
    step: 385 Loss=0.21104326844215393 accuracy=0.9363997578620911
    step: 513 Loss=0.172766774892807 accuracy=0.9469997882843018
    step: 641 Loss=0.14438582956790924 accuracy=0.9573997855186462
    step: 769 Loss=0.15849816799163818 accuracy=0.9527996778488159
    === test accuracy: 0.9558  ===
    0.95579976
    
    • learning rate = 0.1
    step: 1 Loss=43.770484924316406 accuracy=0.10619999468326569
    step: 129 Loss=1.7850791215896606 accuracy=0.2809999883174896
    step: 257 Loss=1.7752128839492798 accuracy=0.3105999827384949
    step: 385 Loss=1.719871997833252 accuracy=0.3147999942302704
    step: 513 Loss=1.6704318523406982 accuracy=0.3511999845504761
    step: 641 Loss=1.6277217864990234 accuracy=0.34059998393058777
    step: 769 Loss=1.8401107788085938 accuracy=0.2733999788761139
    === test accuracy: 0.2738  ===
    0.27379999
    

    Conclusion:

    A smaller learning rate converges more slowly but more steadily (0.0001 only reached 0.9004 within the same training budget).

    A larger learning rate speeds up learning but is prone to oscillation, and an excessively large one (0.1 here) fails to converge at all (0.2738).

    4) Effect of different hold-out split ratios on the results, with analysis (10 points)

    print("===== hold-out =====")
    print("train_percentage: 0.8: ", end='')
    hold_out(total_images, total_labels, 0.8)
    print("train_percentage: 0.9: ", end='')
    hold_out(total_images, total_labels, 0.9)
    print("train_percentage: 0.5: ", end='')
    hold_out(total_images, total_labels, 0.5)
    print("train_percentage: 0.2: ", end='')
    hold_out(total_images, total_labels, 0.2)
    

    Results:

    ===== hold-out =====
    
    train_percentage: 0.8:
    hold-out accuracy: [0.97072774, 0.974455, 0.97690958, 0.97781873, 0.97545499, 0.96990955, 0.97654593, 0.97209132, 0.97390956, 0.97772777]
    
    train_percentage: 0.9: 
    hold-out accuracy: [0.97945446, 0.97327256, 0.97381806, 0.97654533, 0.97981811, 0.97472721, 0.97472715, 0.97327256, 0.97436351, 0.97163624]
    
    train_percentage: 0.5: 
    hold-out accuracy: [0.97501898, 0.97083724, 0.97414637, 0.97454625, 0.97087359, 0.97469169, 0.97520077, 0.96894628, 0.97367346, 0.97367346]
    
    train_percentage: 0.2:
    hold-out accuracy: [0.96202344, 0.95893252, 0.95929617, 0.95911437, 0.95968258, 0.95965987, 0.96059167, 0.96009171, 0.96056885, 0.95806891]
    

    Conclusion: extreme values of train_percentage weaken the evaluation. Too small a training fraction (0.2) leaves too little data to train on (accuracy drops to roughly 0.96), while too large a fraction leaves too few test samples for a reliable estimate. A value around 0.8 is a good compromise.

    5) Effect of different values of k in k-fold cross validation on the results, with analysis (10 points)

    print("===== cross-validation =====")
    print("k=5: ", end='')
    cross_validation(total_images, total_labels, 5)
    print("k=10: ", end='')
    cross_validation(total_images, total_labels, 10)
    print("k=20: ", end='')
    cross_validation(total_images, total_labels, 20)
    print("k=2: ", end='')
    cross_validation(total_images, total_labels, 2)
    

    Results:

    ===== cross-validation =====
    k=5:
    cross-validation accuracy: 0.975146
    
    k=10: 
    cross-validation accuracy: 0.976927
    
    k=20: 
    cross-validation accuracy: 0.977491
    
    k=2: 
    cross-validation accuracy: 0.973474
    
    

    Conclusion: k-fold cross validation is quite robust to the choice of k; the estimate varies only from 0.9735 (k=2) to 0.9775 (k=20), with larger k giving each fold more training data and hence a slightly higher estimate, at the cost of more training runs. It is noticeably less sensitive to its split parameter than the hold-out method is to its split ratio.

    5. Summary and Reflections

    Using the MNIST handwritten digit dataset to train a simple digit-recognition network as an example, I learned the techniques for training fully connected neural networks with TensorFlow and explored how the network's various parameters affect both the training process and the final results.

    I also tried the two model evaluation methods, hold-out and k-fold cross validation, and explored how their parameters affect the evaluation results.

    References

    Installing TensorFlow cleanly via Anaconda (no manual CUDA/cuDNN installation needed):

    conda create --name tf_gpu_env python=3.6 anaconda tensorflow-gpu

    不踩坑:Ubuntu下安装TensorFlow的最简单方法(无需手动安装CUDA和cuDNN) - 知乎 (zhihu.com)

    Fixing an issue encountered when running jupyter:

    彻底解决:AttributeError:type object IOLoop has no attribute initialized_Joyyang_c的博客-CSDN博客
