• Similarity papers revisited


    Towards a Unified Multi-Dimensional Evaluator for Text Generation

    Evaluates the quality of generated text along multiple dimensions, such as consistency and fluency.

    Each dimension has 30K pseudo-labeled samples. The dataset is constructed by the authors:

    we first design specific rules for several commonly evaluated dimensions to construct pseudo data, and then combine them to train the evaluator.

    Task formats: summarization and dialogue.

    Experimental validation: the baselines compared include BLEU, METEOR, ROUGE, BERTScore, ...

    Human-annotated data: to verify that the proposed evaluator is qualified, we need to calculate its correlations with human scores on each benchmark.
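    The correlation check described above can be sketched as follows. This is a minimal illustration using Spearman rank correlation, which is one common choice for comparing metric scores with human scores; the score lists are made-up numbers, not results from the paper, and the helper names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for four generated texts (illustration only).
evaluator_scores = [0.91, 0.40, 0.75, 0.10]
human_scores = [4.5, 2.0, 4.0, 1.0]
print(spearman(evaluator_scores, human_scores))
```

    A value close to 1 means the evaluator orders the texts the same way the human annotators do. With real benchmarks a library routine such as scipy.stats.spearmanr would normally be used instead of hand-written helpers.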

    Train the evaluator for 1-3 epochs. Supervised method.

    BARTSCORE: Evaluating Generated Text as Text Generation

    Conditional text generation (for example, machine translation): the goal is to generate a hypothesis (h = h1, · · · , hm) based on a given source text (s = s1, · · · , sn).
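    As a rough illustration of scoring a hypothesis h given a source s: BARTScore is defined in the paper as a weighted sum of token log-probabilities, i.e. the sum over t of w_t * log p(h_t | h_<t, s). The sketch below assumes the per-token probabilities are already available; the function name and the toy numbers are hypothetical, and in practice the probabilities would come from a pretrained seq2seq model such as BART.

```python
import math


def weighted_log_likelihood(token_probs, weights=None):
    """Sum of w_t * log p_t over the hypothesis tokens.

    token_probs: probability the model assigns to each hypothesis token,
                 conditioned on the source and the preceding tokens.
    weights:     optional per-token weights; uniform weights of 1.0 are
                 assumed here when none are given.
    """
    if weights is None:
        weights = [1.0] * len(token_probs)
    if len(weights) != len(token_probs):
        raise ValueError("weights and token_probs must have the same length")
    return sum(w * math.log(p) for w, p in zip(weights, token_probs))


# Hypothetical per-token probabilities for two candidate hypotheses.
fluent_hypothesis = [0.9, 0.8, 0.85]
unlikely_hypothesis = [0.2, 0.1, 0.3]

print(weighted_log_likelihood(fluent_hypothesis))
print(weighted_log_likelihood(unlikely_hypothesis))
```

    The score is always at most 0, and a hypothesis the model finds more likely gets a score closer to 0.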

    Existing metrics either require human judgments to train (i.e., supervised metrics): COMET [57], BLEURT [63]; or are human-judgment-free (i.e., unsupervised): BLEU [51], ROUGE-1, ROUGE-2, ROUGE-L, CHRF [53], PRISM [66], MoverScore [77], BERTScore [76].

    Datasets (each built for a specific area, from 2015 onward):

    Task          Dataset   Description
    SUM           NER18     60 articles
    MT            WMT 19
    Factuality    Rank19    373 triples of a source sentence with two summary sentences, one correct and one incorrect.
    Factuality    QAGS20    235 test outputs on CNNDM dataset from [16] and 239 test outputs on XSUM dataset [48] from BART fine-tuned on XSUM
    Data to Text  BAGEL     202 samples; each sample consists of one meaning representation, multiple references, and utterances generated by different systems

    Train the evaluator.

    BERTSCORE: EVALUATING TEXT GENERATION WITH BERT

    Datasets: WMT 18 and WMT 19

    Task formats: machine translation and image captioning

    No training. 
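    Because no training is involved, the core of BERTScore reduces to greedy matching of token embeddings by cosine similarity. The following is a minimal sketch of that matching step only: it uses toy 2-dimensional vectors in place of real BERT embeddings and leaves out the optional idf weighting and baseline rescaling. The function names and vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_match_scores(ref_vecs, cand_vecs):
    """Return (precision, recall, f1) from greedy token matching.

    Recall: each reference token is matched to its most similar candidate token.
    Precision: each candidate token is matched to its most similar reference token.
    Assumes precision + recall is non-zero.
    """
    recall = sum(max(cosine(r, c) for c in cand_vecs) for r in ref_vecs) / len(ref_vecs)
    precision = sum(max(cosine(c, r) for r in ref_vecs) for c in cand_vecs) / len(cand_vecs)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy "embeddings": one vector per token (stand-ins for BERT outputs).
reference = [[1.0, 0.0], [0.0, 1.0]]
candidate = [[1.0, 0.0], [1.0, 1.0]]

precision, recall, f1 = greedy_match_scores(reference, candidate)
print(precision, recall, f1)
```

    Scoring a sentence against itself gives 1.0 for all three values, since every token finds an exact match.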

  • Original article: https://blog.csdn.net/Hekena/article/details/128125876