• Using a fine-tuned deberta-v3-large to predict emotion classes


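    Foreword:

    Yesterday we covered how to fine-tune the deberta-v3-large model on the emotion dataset. Today we feed the model some data to see how accurate it is. For convenience, I simply use the first ten samples of the test set.

    Code: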

    from transformers import AutoModelForSequenceClassification, AutoTokenizer
    import torch
    import numpy

    # Tokenizer of the base model and the fine-tuned checkpoint saved during training
    tokenizer = AutoTokenizer.from_pretrained("deberta-v3-large")
    model = AutoModelForSequenceClassification.from_pretrained("result/checkpoint-500", num_labels=6)
    model.eval()  # inference mode: disables dropout

    # First ten sentences of the test set
    raw_inputs = [
        "im feeling rather rotten so im not very ambitious right now",
        "im updating my blog because i feel shitty",
        "i never make her separate from me because i don t ever want her to feel like i m ashamed with her",
        "i left with my bouquet of red and yellow tulips under my arm feeling slightly more optimistic than when i arrived",
        "i was feeling a little vain when i did this one",
        "i cant walk into a shop anywhere where i do not feel uncomfortable",
        "i felt anger when at the end of a telephone call",
        "i explain why i clung to a relationship with a boy who was in many ways immature and uncommitted despite the excitement i should have been feeling for getting accepted into the masters program at the university of virginia",
        "i like to have the same breathless feeling as a reader eager to see what will happen next",
        "i jest i feel grumpy tired and pre menstrual which i probably am but then again its only been a week and im about as fit as a walrus on vacation for the summer",
    ]
    inputs = tokenizer(raw_inputs, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # gradients are not needed for inference
        outputs = model(**inputs)
    # Predicted label id for each sentence
    print(outputs.logits.argmax(-1).numpy())
    # Class probabilities for each sentence
    output_tensor = torch.softmax(outputs.logits, dim=1)
    numpy.set_printoptions(suppress=True, precision=15)
    print(output_tensor.detach().numpy())

    Labels:

    [0 0 0 1 0 4 3 1 1 3]

    Test results:

    [0 0 0 1 0 4 4 2 1 3]
    [[0.99185866 0.0011510316 0.00038844926 0.0026896652 0.0029623401 0.00094986777]
     [0.9918577 0.0011512033 0.00038886679 0.0026923663 0.0029585315 0.000951257]
     [0.99185807 0.0011446937 0.00038163515 0.0026456509 0.0030354485 0.00093440723]
     [0.00041773843 0.9972398 0.0014854104 0.0002909223 0.00036231524 0.00020376328]
     [0.99185014 0.0011451623 0.00038086114 0.0026396883 0.0030524035 0.00093187904]
     [0.015044774 0.0025362356 0.00041989447 0.015223678 0.95009714 0.016678285]
     [0.11319714 0.030935207 0.007336047 0.3035547 0.47545433 0.069522515]
     [0.0011094044 0.18334262 0.8081213 0.0011003793 0.0007297965 0.005596481]
     [0.0004444314 0.9972433 0.0014491597 0.00028465112 0.00037411976 0.00020446534]
     [0.00241266 0.00079152075 0.00092184055 0.9924028 0.0024109248 0.0010602956]]
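
    The predicted ids are easier to read as emotion names. The snippet below is a minimal sketch that assumes the label order published with the Hugging Face emotion dataset (0 sadness, 1 joy, 2 love, 3 anger, 4 fear, 5 surprise). That order is not stated in this post, so check it against your own dataset before relying on it.

```python
# ASSUMPTION: label order of the Hugging Face "emotion" dataset.
# Verify against your dataset's label feature before relying on it.
LABEL_NAMES = ["sadness", "joy", "love", "anger", "fear", "surprise"]

# Predicted ids copied from the test output above
predicted_ids = [0, 0, 0, 1, 0, 4, 4, 2, 1, 3]

# Translate each id into its emotion name
predicted_names = [LABEL_NAMES[i] for i in predicted_ids]

for position, name in enumerate(predicted_names, start=1):
    print(position, name)
```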

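    Comparison:

    All samples are predicted correctly except the seventh and the eighth, so eight of the ten predictions match the labels.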
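
    The comparison can also be done in code rather than by eye. This is a small sketch in plain Python that uses the two id lists printed above; the variable names are only illustrative.

```python
# Ground-truth labels and model predictions, copied from the output above
labels = [0, 0, 0, 1, 0, 4, 3, 1, 1, 3]
predictions = [0, 0, 0, 1, 0, 4, 4, 2, 1, 3]

# 1-based positions of the samples where prediction and label differ
mismatches = [
    position
    for position, (label, predicted) in enumerate(zip(labels, predictions), start=1)
    if label != predicted
]

# Fraction of samples predicted correctly
correct = sum(label == predicted for label, predicted in zip(labels, predictions))
accuracy = correct / len(labels)

print("wrong samples:", mismatches)
print("accuracy:", accuracy)
```

    Ten samples are far too few for a reliable estimate, so a real evaluation should run over the whole test set.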

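    Code explanation:

    1. raw_inputs: the text to classify. Here you could use a while loop together with input() to interact with the user. It is simplest to always pass a list, even when the user enters only a single sentence; the tokenizer then treats it as a batch of one.

    2. return_tensors="pt": tells the tokenizer to return PyTorch tensors.

    3. argmax(-1): takes the argmax of the floating-point logits tensor along its last axis (axis -1), which gives, for each sentence, the index of the largest value, i.e. the predicted label id.

    4. softmax(outputs.logits, dim=1): dim is the dimension along which softmax is computed. The logits have shape (batch size, number of labels), so dim=1 normalizes each row into probabilities that sum to 1. For this 2-D tensor, dim=1 is the same as dim=-1, the last dimension. torch.softmax requires dim to be passed explicitly.

    5. numpy.set_printoptions(suppress=True, precision=15): adjusts how NumPy prints arrays. suppress=True turns off scientific notation, and precision sets the number of digits printed after the decimal point.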
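
    The points above can be sketched together as an interactive loop. The example is plain Python so that it runs without a model. interactive_loop, fake_score_fn and the logits it returns are hypothetical stand-ins, not part of the original script. In real use, score_fn would wrap the tokenizer and model calls and return outputs.logits.tolist().

```python
import math
from typing import Callable, List, Tuple


def softmax(row: List[float]) -> List[float]:
    """Softmax over one row of logits, as dim=1 does for each row of a 2-D tensor."""
    peak = max(row)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [value / total for value in exps]


def argmax(row: List[float]) -> int:
    """Index of the largest value, i.e. the predicted label id."""
    return max(range(len(row)), key=row.__getitem__)


def interactive_loop(
    read_line: Callable[[], str],
    score_fn: Callable[[List[str]], List[List[float]]],
) -> List[Tuple[int, List[float]]]:
    """Read sentences until an empty line and classify each one."""
    results = []
    while True:
        text = read_line().strip()
        if not text:  # an empty line ends the session
            break
        logits = score_fn([text])  # wrap the single sentence in a list
        for row in logits:
            results.append((argmax(row), softmax(row)))
    return results


def fake_score_fn(batch: List[str]) -> List[List[float]]:
    """HYPOTHETICAL stand-in for tokenizer + model; returns made-up logits."""
    return [
        [2.0, 0.5, 0.1, 0.1, 0.1, 0.1] if "sad" in text
        else [0.1, 3.0, 0.2, 0.1, 0.1, 0.1]
        for text in batch
    ]


# Scripted demo; the final empty string plays the role of the user pressing Enter
scripted = iter(["i feel sad today", "what a great day", ""])
demo_results = interactive_loop(lambda: next(scripted), fake_score_fn)
for label_id, probabilities in demo_results:
    print(label_id, [round(p, 4) for p in probabilities])
```

    For a real session, pass lambda: input("text> ") as read_line. Taking both callables as parameters keeps the loop easy to try out without loading the checkpoint.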

  • Original article: https://blog.csdn.net/duzm200542901104/article/details/132722117