• [MindSpore 易点通] Model Testing and Validation


    1 Model testing

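    After training completes, the model's performance must be measured on the test set. There are two cases, depending on how the model is evaluated.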

    1. The evaluation method is already implemented in MindSpore

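    MindSpore provides a range of built-in metrics: Accuracy, Precision, Recall, F1, TopKCategoricalAccuracy, Top1CategoricalAccuracy, Top5CategoricalAccuracy, MSE, MAE, and Loss. To use them during testing, define a dict that names the metrics you want and pass it in when constructing the model. A later call to model.eval() then returns a dict whose keys are the metric names and whose values are the results.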

    # Import paths follow MindSpore 1.x and may differ in other versions.
    from mindspore import nn, Model, load_checkpoint, load_param_into_net

    ...

    def test_net(network, model, test_data_path, test_batch):
        """Define the evaluation method."""
        print("============== Start Testing ==============")
        # Load the saved checkpoint for evaluation
        param_dict = load_checkpoint("./train_resnet_cifar10-1_390.ckpt")
        # Load the parameters into the network
        load_param_into_net(network, param_dict)
        # Load the test dataset
        ds_test = create_dataset(test_data_path, do_train=False,
                                 batch_size=test_batch)
        # Returns a dict such as {'accuracy': ..., 'loss': ...}
        acc = model.eval(ds_test, dataset_sink_mode=False)
        print("============== test result:{} ==============".format(acc))

    if __name__ == "__main__":
        ...
        net = resnet()
        loss = nn.loss.SoftmaxCrossEntropyWithLogits(sparse=True,
                                                     reduction='mean')
        # Keyword arguments keep WEIGHT_DECAY from landing in nn.SGD's dampening slot
        opt = nn.SGD(net.trainable_params(), learning_rate=LR_ORI,
                     momentum=MOMENTUM_ORI, weight_decay=WEIGHT_DECAY)
        metrics = {
            'accuracy': nn.Accuracy(),
            'loss': nn.Loss()
        }
        model = Model(net, loss, opt, metrics=metrics)
        test_net(net, model, TEST_PATH, TEST_BATCH_SIZE)
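
To make the shape of the returned dict concrete, here is a minimal pure-Python sketch of the pattern. It is not MindSpore's implementation: `SimpleAccuracy` and `evaluate` are hypothetical names used only for illustration, and the inputs are plain lists rather than tensors.

```python
class SimpleAccuracy:
    """Hypothetical stand-in for an accuracy metric (not the MindSpore class)."""

    def __init__(self):
        self.clear()

    def clear(self):
        self.correct = 0
        self.total = 0

    def update(self, logits, labels):
        # logits: list of per-class score lists; labels: list of int class ids
        for scores, label in zip(logits, labels):
            predicted = max(range(len(scores)), key=scores.__getitem__)
            self.correct += int(predicted == label)
            self.total += 1

    def eval(self):
        if self.total == 0:
            raise RuntimeError("no samples were seen")
        return self.correct / self.total


def evaluate(batches, metrics):
    """Reset every metric, feed it all batches, and return {name: value}."""
    for metric in metrics.values():
        metric.clear()
    for logits, labels in batches:
        for metric in metrics.values():
            metric.update(logits, labels)
    return {name: metric.eval() for name, metric in metrics.items()}


batches = [
    ([[0.1, 0.9], [0.8, 0.2]], [1, 0]),  # both predictions correct
    ([[0.3, 0.7], [0.6, 0.4]], [0, 0]),  # first wrong, second correct
]
result = evaluate(batches, {"accuracy": SimpleAccuracy()})
print(result)  # {'accuracy': 0.75}
```

The keys of the result mirror the keys of the metrics dict, which is why the later examples read `acc["accuracy"]` and `acc["loss"]`.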

    2. The evaluation method is not implemented in MindSpore

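    If MindSpore's built-in metrics do not meet your needs, use accuracy.py as a reference: define your own metric by inheriting from the Metric base class and overriding the three methods clear, update, and eval. Then call the model.predict() interface to obtain the network output and compute the result according to your own evaluation criterion.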

    The following implements a custom metric, using test-set accuracy as the example:

    import numpy as np
    # Import paths follow MindSpore 1.x and may differ in other versions;
    # EvaluationBase lives in a private module there.
    from mindspore import nn, Model, load_checkpoint, load_param_into_net
    from mindspore.nn.metrics._evaluation import EvaluationBase

    class AccuracyV2(EvaluationBase):
        def __init__(self, eval_type='classification'):
            super(AccuracyV2, self).__init__(eval_type)
            self.clear()

        def clear(self):
            """Clears the internal evaluation result."""
            self._correct_num = 0
            self._total_num = 0

        def update(self, output_y, label_input):
            # Convert Tensor inputs to NumPy arrays
            y_pred = self._convert_data(output_y)
            y = self._convert_data(label_input)
            # Predicted class is the index of the largest logit
            indices = y_pred.argmax(axis=1)
            results = (np.equal(indices, y) * 1).reshape(-1)
            self._correct_num += results.sum()
            self._total_num += label_input.shape[0]

        def eval(self):
            if self._total_num == 0:
                raise RuntimeError('Accuracy can not be calculated')
            return self._correct_num / self._total_num

    def test_net(network, model, test_data_path, test_batch):
        """Define the evaluation method."""
        print("============== Start Testing ==============")
        # Load the saved checkpoint for evaluation
        param_dict = load_checkpoint("./train_resnet_cifar10-1_390.ckpt")
        # Load the parameters into the network
        load_param_into_net(network, param_dict)
        # Load the test dataset
        ds_test = create_dataset(test_data_path, do_train=False,
                                 batch_size=test_batch)
        metric = AccuracyV2()
        metric.clear()
        # Accumulate the metric batch by batch
        for data, label in ds_test.create_tuple_iterator():
            output = model.predict(data)
            metric.update(output, label)
        results = metric.eval()
        print("============== New Metric:{} ==============".format(results))

    if __name__ == "__main__":
        ...
        net = resnet()
        loss = nn.loss.SoftmaxCrossEntropyWithLogits(sparse=True,
                                                     reduction='mean')
        # Keyword arguments keep WEIGHT_DECAY from landing in nn.SGD's dampening slot
        opt = nn.SGD(net.trainable_params(), learning_rate=LR_ORI,
                     momentum=MOMENTUM_ORI, weight_decay=WEIGHT_DECAY)
        model_constructed = Model(net, loss, opt)
        test_net(net, model_constructed, TEST_PATH, TEST_BATCH_SIZE)
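
The same clear/update/eval pattern carries over to other criteria. As an illustration only, the sketch below computes mean per-class accuracy in plain Python without inheriting from any MindSpore class. The name `MeanClassAccuracy` and its list-based inputs (already-decoded class ids) are assumptions made for this example.

```python
from collections import defaultdict


class MeanClassAccuracy:
    """Hypothetical metric: accuracy averaged over classes, not over samples."""

    def __init__(self):
        self.clear()

    def clear(self):
        # Per-class counters, keyed by the true class id
        self._correct = defaultdict(int)
        self._total = defaultdict(int)

    def update(self, predicted_labels, true_labels):
        for pred, true in zip(predicted_labels, true_labels):
            self._total[true] += 1
            self._correct[true] += int(pred == true)

    def eval(self):
        if not self._total:
            raise RuntimeError("Mean class accuracy can not be calculated")
        per_class = [self._correct[c] / n for c, n in self._total.items()]
        return sum(per_class) / len(per_class)


metric = MeanClassAccuracy()
metric.update([0, 0, 1, 1], [0, 0, 0, 1])  # class 0: 2 of 3 correct; class 1: 1 of 1
metric.update([1], [1])                    # class 1: now 2 of 2
balanced = metric.eval()                   # (2/3 + 1) / 2
print(round(balanced, 4))
```

To use such a metric with real network output, update would first need to convert the tensors and take the argmax, as AccuracyV2 does.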

    2 Validating while training

    During training, the model can be evaluated on the validation set. MindSpore currently offers two ways to do this.

    1. Call model.train() and model.eval() alternately to validate while training.

    # Import paths follow MindSpore 1.x and may differ in other versions.
    from mindspore import nn, Model
    from mindspore.train.callback import CheckpointConfig, ModelCheckpoint, LossMonitor

    ...

    def train_and_val(model, dataset_train, dataset_val, steps_per_train,
                      epoch_max, evaluation_interval):
        config_ck = CheckpointConfig(save_checkpoint_steps=steps_per_train,
                                     keep_checkpoint_max=epoch_max)
        ckpoint_cb = ModelCheckpoint(prefix="train_resnet_cifar10",
                                     directory="./", config=config_ck)
        # Train for evaluation_interval epochs, then evaluate once
        model.train(evaluation_interval, dataset_train,
                    callbacks=[ckpoint_cb, LossMonitor()], dataset_sink_mode=True)
        acc = model.eval(dataset_val, dataset_sink_mode=False)
        print("============== Evaluation:{} ==============".format(acc))

    if __name__ == "__main__":
        ...
        ds_train, steps_per_epoch_train = create_dataset(
            TRAIN_PATH, do_train=True, batch_size=TRAIN_BATCH_SIZE, repeat_num=1)
        ds_val, steps_per_epoch_val = create_dataset(
            VAL_PATH, do_train=False, batch_size=VAL_BATCH_SIZE, repeat_num=1)
        net = resnet()
        loss = nn.loss.SoftmaxCrossEntropyWithLogits(sparse=True,
                                                     reduction='mean')
        # Keyword arguments keep WEIGHT_DECAY from landing in nn.SGD's dampening slot
        opt = nn.SGD(net.trainable_params(), learning_rate=LR_ORI,
                     momentum=MOMENTUM_ORI, weight_decay=WEIGHT_DECAY)
        metrics = {
            'accuracy': nn.Accuracy(),
            'loss': nn.Loss()
        }
        model = Model(net, loss, opt, metrics=metrics)
        # Each round trains EVAL_INTERVAL epochs and then evaluates
        for i in range(EPOCH_MAX // EVAL_INTERVAL):
            train_and_val(model, ds_train, ds_val, steps_per_epoch_train,
                          EPOCH_MAX, EVAL_INTERVAL)
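
One detail of the loop above: it runs a whole number of rounds of EVAL_INTERVAL epochs, so when EPOCH_MAX is not a multiple of EVAL_INTERVAL the leftover epochs are never trained. The following is a hedged sketch of one way to plan the rounds so the remainder is included. `plan_rounds` is a hypothetical helper, not part of the sample code or of MindSpore.

```python
def plan_rounds(epoch_max, eval_interval):
    """Return the number of epochs to train in each round before evaluating."""
    if epoch_max <= 0 or eval_interval <= 0:
        raise ValueError("epoch_max and eval_interval must be positive")
    full_rounds, remainder = divmod(epoch_max, eval_interval)
    rounds = [eval_interval] * full_rounds
    if remainder:
        rounds.append(remainder)  # final, shorter round
    return rounds


print(plan_rounds(10, 3))  # [3, 3, 3, 1]
```

Each entry would be passed as the epoch count of one model.train() call, followed by one model.eval() call.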

    2. Call the model.train interface and pass a custom EvalCallBack instance in callbacks, so that validation runs as part of training.

    # Import paths follow MindSpore 1.x and may differ in other versions.
    from mindspore import nn, Model
    from mindspore.train.callback import (Callback, CheckpointConfig,
                                          ModelCheckpoint, LossMonitor)

    class EvalCallBack(Callback):
        def __init__(self, model, eval_dataset, eval_epoch, result_evaluation):
            super(EvalCallBack, self).__init__()
            self.model = model
            self.eval_dataset = eval_dataset
            self.eval_epoch = eval_epoch
            self.result_evaluation = result_evaluation

        def epoch_end(self, run_context):
            cb_param = run_context.original_args()
            cur_epoch = cb_param.cur_epoch_num
            # Evaluate once every eval_epoch epochs and record the results
            if cur_epoch % self.eval_epoch == 0:
                acc = self.model.eval(self.eval_dataset, dataset_sink_mode=False)
                self.result_evaluation["epoch"].append(cur_epoch)
                self.result_evaluation["acc"].append(acc["accuracy"])
                self.result_evaluation["loss"].append(acc["loss"])
                print(acc)

    if __name__ == "__main__":
        ...
        ds_train, steps_per_epoch_train = create_dataset(
            TRAIN_PATH, do_train=True, batch_size=TRAIN_BATCH_SIZE,
            repeat_num=REPEAT_SIZE)
        ds_val, steps_per_epoch_val = create_dataset(
            VAL_PATH, do_train=False, batch_size=VAL_BATCH_SIZE,
            repeat_num=REPEAT_SIZE)
        net = resnet()
        loss = nn.loss.SoftmaxCrossEntropyWithLogits(sparse=True,
                                                     reduction='mean')
        # Keyword arguments keep WEIGHT_DECAY from landing in nn.SGD's dampening slot
        opt = nn.SGD(net.trainable_params(), learning_rate=LR_ORI,
                     momentum=MOMENTUM_ORI, weight_decay=WEIGHT_DECAY)
        metrics = {
            'accuracy': nn.Accuracy(),
            'loss': nn.Loss()
        }
        model = Model(net, loss, opt, metrics=metrics)
        # ckpoint_cb was not defined in the original snippet; it is built here
        # the same way as in the first approach.
        config_ck = CheckpointConfig(save_checkpoint_steps=steps_per_epoch_train,
                                     keep_checkpoint_max=EPOCH_MAX)
        ckpoint_cb = ModelCheckpoint(prefix="train_resnet_cifar10",
                                     directory="./", config=config_ck)
        result_eval = {"epoch": [], "acc": [], "loss": []}
        eval_cb = EvalCallBack(model, ds_val, EVAL_PER_EPOCH, result_eval)
        model.train(EPOCH_MAX, ds_train,
                    callbacks=[ckpoint_cb, LossMonitor(), eval_cb],
                    dataset_sink_mode=True, sink_size=steps_per_epoch_train)
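
Because the callback appends to the `result_eval` dict at every evaluation, the dict can be inspected once training finishes, for example to find the best epoch. This is a minimal sketch, assuming the dict has the "epoch", "acc", and "loss" lists shown above. `best_epoch` is a hypothetical helper, and the numbers are placeholders chosen only to exercise it, not measured results.

```python
def best_epoch(result_evaluation):
    """Return (epoch, accuracy, loss) for the evaluation with the highest accuracy."""
    records = list(zip(result_evaluation["epoch"],
                       result_evaluation["acc"],
                       result_evaluation["loss"]))
    if not records:
        raise ValueError("no evaluation results recorded")
    # max() keeps the first record on ties, i.e. the earliest epoch
    return max(records, key=lambda record: record[1])


# Illustrative placeholder values, not measured results
example = {"epoch": [2, 4, 6], "acc": [0.5, 0.75, 0.7], "loss": [1.4, 0.9, 1.0]}
print(best_epoch(example))  # (4, 0.75, 0.9)
```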

    3 Using the sample code

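    The sample code in this article trains ResNet-50 as a classification network on CIFAR-10. It reads the binary version of the dataset through the datasets.Cifar10Dataset interface, so download the "CIFAR-10 binary version (suitable for C programs)" and set the data path in the code.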
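
For orientation, each record in the CIFAR-10 binary files is 3073 bytes: one label byte followed by 3072 pixel bytes (1024 each for the red, green, and blue planes of a 32x32 image). The dataset interface handles this for you. The sketch below only illustrates the layout on synthetic bytes, and `read_records` is a hypothetical helper that is not part of the sample code.

```python
RECORD_BYTES = 1 + 3 * 32 * 32  # label byte + R, G, B planes


def read_records(raw):
    """Split raw CIFAR-10 binary content into (label, pixel_bytes) pairs."""
    if len(raw) % RECORD_BYTES != 0:
        raise ValueError("data length is not a multiple of the record size")
    records = []
    for start in range(0, len(raw), RECORD_BYTES):
        label = raw[start]                          # first byte: class id 0-9
        pixels = raw[start + 1:start + RECORD_BYTES]  # remaining 3072 bytes
        records.append((label, pixels))
    return records


# Two synthetic records with labels 3 and 7 (not real CIFAR-10 data)
fake = bytes([3]) + bytes(3072) + bytes([7]) + bytes([255]) * 3072
parsed = read_records(fake)
print([label for label, _ in parsed])  # [3, 7]
```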

    Launch command:

    python xxx.py --data_path=xxx --epoch_num=xxx
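
The two flags in this command could be parsed with argparse. This is a sketch only: the flag names come from the command above, while the defaults, help strings, and the `parse_args` wrapper are assumptions, since the original script is not reproduced here.

```python
import argparse


def parse_args(argv=None):
    """Parse the command-line flags used by the launch command."""
    parser = argparse.ArgumentParser(description="ResNet-50 on CIFAR-10 (sketch)")
    parser.add_argument("--data_path", type=str, required=True,
                        help="directory containing the CIFAR-10 binary files")
    parser.add_argument("--epoch_num", type=int, default=1,
                        help="number of training epochs (default is an assumption)")
    return parser.parse_args(argv)


# Passing a list makes the parser usable without a real command line
args = parse_args(["--data_path", "./cifar10", "--epoch_num", "5"])
print(args.data_path, args.epoch_num)
```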

    Running the script prints the network's results.

    The complete code is available for download from the MindSpore forum (Huawei Cloud Forum).

     

  • Original article: https://blog.csdn.net/skytttttt9394/article/details/126600024