• 论文阅读---REALISE model


    REALISE model:

    1.utilizes multiple encoders to obtain the semantic ,phonetic , and graphic information to distinguish the similarities of Chinese characters and correct the spelling errors.
    2.And then, develop a selective modality fusion module to obtain the context-aware multimodal representations.
    3.Finally ,the output layer predict the probabilities of error corrections.

    Encoders:

    Semantic encoder:

    BERT, which provides rich contextual word representation with the unsupervised pretraining on large corpora.

    from transformers import BertTokenizer
    tokenizer = BertTokenizer.from_pretrained('bert-base-chinese')
    
    • 1
    • 2

    Tokenizer是一种文本处理工具,用于将文本分解成单个单词(称为tokens)或其他类型的单位,例如标点符号和数字。在自然语言处理领域,tokenizer通常用于将句子分解为单个单词或词元,以便进行文本分析和机器学习任务。常用的tokenizer包括基于规则的tokenizer和基于机器学习的tokenizer,其中基于机器学习的tokenizer可以自动识别单词和短语的边界,并将其分解为单个tokens。

    Phonetic encoder

    pinyin: initial(21)+final(39)+tone(5)
    hierarchical phonetic encoder :character-level encoder and sentence-level encoder

    Character-level encoder

    GRU:
    GRU(Gate Recurrent Unit)是循环神经网络(Recurrent Neural Network, RNN)的一种。和LSTM(Long-Short Term Memory)一样,也是为了解决长期记忆和反向传播中的梯度等问题而提出来的。

    GRU和LSTM在很多情况下实际表现上相差无几,那么为什么我们要使用新人GRU(2014年提出)而不是相对经受了更多考验的LSTM(1997提出)呢。
    我们在我们的实验中选择GRU是因为它的实验效果与LSTM相似,但是更易于计算。

    Sentence-level Encoder: obtain the contextualized phonetic representation for each Chinese characters

    4-layer Transformer with the same hidden size as the semantic encoder
    because independent phonetic vectors are not distinguished in order, so we add the positional embeading to each vector. +pack the vector together ->transformer layers to calculate the contextualized representation in acoustic modality.

    Graphic Encoder

    ResNet
    three fonds correpond to the three channels of the character images whose size is set to 32*32 pixel

    Selective Modality Fusion Module

    Ht, Ha,Hv ==textual ,acoustic,visual
    fuse information i n different modalities
    selective gate unit: select how much information flow to the mixed multimodal representation.
    gate values :fully-connected layer followed by a sigmoid function.

    Acoustic and Visual Pretraining

    aims to learn the acoustic-textual and visual-textual relationships
    phonetic encoder:input method pretraining objective
    graphhic encoder:OCP pretraining objective

    Data and Metrics

    data:SIGHAN —>convert to simplified chinese by using the OPENCC tools

    two level :detection and correction level to test the model

  • 相关阅读:
    Python 基本语法
    渗透测试-一次完整的渗透测试流程
    vue中设置动态路由
    c#求STDEV标准偏差方法
    交叉编译BusyBox
    jupyter notebook连接不上内核
    java计算机毕业设计新能源汽车故障分析2021源码+系统+lw+数据库+调试运行
    算法基础 1.2 归并排序
    Python 的运算符和语句(条件、循环、异常)基本使用指南
    MarkDown语法——更好地写博客
  • 原文地址:https://blog.csdn.net/qq_48566899/article/details/132560529