• Comparison of Adversarial Training Methods in NLP


    Adversarial training is a training scheme that injects noise into the inputs. It acts as a regularizer on the parameters and improves the model's robustness and generalization.

    Common adversarial training methods include FGSM, FGM, PGD, FreeAT, YOPO, FreeLB, SMART, and AWP.

    This post gives the code and experimental results for FGSM, FGM, PGD, and FreeAT; the objective they all optimize is sketched below.
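
    Formally, adversarial training is usually written as a min-max problem: the inner maximization finds the worst-case perturbation r inside an ε-ball, and the outer minimization updates the parameters against it. A standard formulation (textbook notation, not taken from the repository) is:

    \min_\theta \; \mathbb{E}_{(x,y)\sim\mathcal{D}} \Big[ \max_{\|r\| \le \epsilon} L\big(f(x + r;\, \theta),\, y\big) \Big]

    In NLP the perturbation is applied to the embedding layer rather than to the discrete tokens, which is exactly what the implementations below do.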

    Repository: GTyingzi/Compare_Adversial (github.com)

    Adversarial training code

    FGSM
    Reference implementation
    class FGSM:
        def __init__(self, model, eps=1):
            self.model = model
            self.eps = eps
            self.backup = {}

        def attack(self, emb_name='embedding'):
            # Replace emb_name with the name of your model's embedding parameter
            for name, param in self.model.named_parameters():
                if param.requires_grad and emb_name in name:
                    self.backup[name] = param.data.clone()
                    # FGSM step: move each coordinate by eps along the gradient sign
                    r_at = self.eps * param.grad.sign()
                    param.data.add_(r_at)

        def restore(self, emb_name='embedding'):
            # Replace emb_name with the name of your model's embedding parameter
            for name, param in self.model.named_parameters():
                if param.requires_grad and emb_name in name:
                    assert name in self.backup
                    param.data = self.backup[name]
            self.backup = {}
    
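    FGSM takes a single step along the sign of the gradient, so every coordinate of the perturbation has magnitude eps (an L∞-style step). In standard notation (not taken from the repository):

    r_{adv} = \epsilon \cdot \mathrm{sign}\big(\nabla_x L(x, y;\, \theta)\big)
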
    Usage example
    fgsm = FGSM(model=model)
    for i, (trains, labels) in enumerate(train_iter):
        # Normal training step
        outputs = model(trains)
        loss = F.cross_entropy(outputs, labels)
        loss.backward()  # backward pass to get the clean gradients
        # Adversarial step
        fgsm.attack()  # add the adversarial perturbation to the embedding
        outputs = model(trains)
        loss_adv = F.cross_entropy(outputs, labels)
        loss_adv.backward()  # accumulate the adversarial gradients on top of the clean ones
        fgsm.restore()  # restore the embedding parameters
        # Gradient descent: update the parameters
        optimizer.step()
        model.zero_grad()
    
    FGM
    Reference implementation
    import torch


    class FGM:
        def __init__(self, model, eps=1):
            self.model = model
            self.eps = eps
            self.backup = {}

        def attack(self, emb_name='embedding'):
            # Replace emb_name with the name of your model's embedding parameter
            for name, param in self.model.named_parameters():
                if param.requires_grad and emb_name in name:
                    self.backup[name] = param.data.clone()
                    norm = torch.norm(param.grad)
                    if norm != 0 and not torch.isnan(norm):
                        # FGM step: move by eps along the L2-normalized gradient
                        r_at = self.eps * param.grad / norm
                        param.data.add_(r_at)

        def restore(self, emb_name='embedding'):
            # Replace emb_name with the name of your model's embedding parameter
            for name, param in self.model.named_parameters():
                if param.requires_grad and emb_name in name:
                    assert name in self.backup
                    param.data = self.backup[name]
            self.backup = {}
    
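    FGM replaces FGSM's sign with an L2-normalized gradient, so the perturbation points exactly along the gradient direction and has norm eps. In standard notation (not taken from the repository):

    r_{adv} = \epsilon \cdot \frac{\nabla_x L(x, y;\, \theta)}{\|\nabla_x L(x, y;\, \theta)\|_2}
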
    Usage example
    fgm = FGM(model=model)
    for i, (trains, labels) in enumerate(train_iter):
        # Normal training step
        outputs = model(trains)
        loss = F.cross_entropy(outputs, labels)
        loss.backward()  # backward pass to get the clean gradients
        # Adversarial step
        fgm.attack()  # add the adversarial perturbation to the embedding
        outputs = model(trains)
        loss_adv = F.cross_entropy(outputs, labels)
        loss_adv.backward()  # accumulate the adversarial gradients on top of the clean ones
        fgm.restore()  # restore the embedding parameters
        # Gradient descent: update the parameters
        optimizer.step()
        model.zero_grad()
    
    PGD
    Reference implementation
    import torch


    class PGD:
        def __init__(self, model, eps=1, alpha=0.3):
            self.model = model
            self.eps = eps      # radius of the projection ball
            self.alpha = alpha  # step size of each attack iteration
            self.emb_backup = {}
            self.grad_backup = {}

        def attack(self, emb_name='embedding', is_first_attack=False):
            # Replace emb_name with the name of your model's embedding parameter
            for name, param in self.model.named_parameters():
                if param.requires_grad and emb_name in name:
                    if is_first_attack:
                        self.emb_backup[name] = param.data.clone()
                    norm = torch.norm(param.grad)
                    if norm != 0 and not torch.isnan(norm):
                        # one step along the L2-normalized gradient, then project
                        r_at = self.alpha * param.grad / norm
                        param.data.add_(r_at)
                        param.data = self.project(name, param.data)

        def restore(self, emb_name='embedding'):
            for name, param in self.model.named_parameters():
                if param.requires_grad and emb_name in name:
                    assert name in self.emb_backup
                    param.data = self.emb_backup[name]
            self.emb_backup = {}

        def project(self, param_name, param_data):
            # project the accumulated perturbation back onto the eps-ball
            r = param_data - self.emb_backup[param_name]
            if torch.norm(r) > self.eps:
                r = self.eps * r / torch.norm(r)
            return self.emb_backup[param_name] + r

        def backup_grad(self):
            for name, param in self.model.named_parameters():
                if param.requires_grad and param.grad is not None:
                    self.grad_backup[name] = param.grad.clone()

        def restore_grad(self):
            for name, param in self.model.named_parameters():
                if param.requires_grad and param.grad is not None:
                    param.grad = self.grad_backup[name]
    
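    PGD strengthens the attack by iterating the normalized-gradient step K times with step size alpha, projecting the perturbed embedding back onto the eps-ball around the original point after every step. In standard notation (not taken from the repository):

    x^{t+1} = \Pi_{\|x - x^0\| \le \epsilon}\Big(x^t + \alpha \cdot \frac{\nabla_x L(x^t, y;\, \theta)}{\|\nabla_x L(x^t, y;\, \theta)\|_2}\Big)
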
    Usage example
    pgd = PGD(model=model)
    for i, (trains, labels) in enumerate(train_iter):
        # Normal training step
        outputs = model(trains)
        loss = F.cross_entropy(outputs, labels)
        loss.backward()  # backward pass to get the clean gradients
        # Adversarial step
        pgd.backup_grad()  # save the clean gradients so restore_grad can bring them back
        pgd_k = 3
        for _t in range(pgd_k):
            pgd.attack(is_first_attack=(_t == 0))  # perturb the embedding; back up param.data on the first attack
            if _t != pgd_k - 1:
                model.zero_grad()
            else:
                pgd.restore_grad()  # restore the clean gradients before the last backward pass
            outputs = model(trains)
            loss_adv = F.cross_entropy(outputs, labels)
            loss_adv.backward()  # accumulate the adversarial gradients on top of the clean ones
        pgd.restore()  # restore the embedding parameters
        # Gradient descent: update the parameters
        optimizer.step()
        model.zero_grad()
    
    FreeAT
    Reference implementation
    import torch


    class FreeAT:
        def __init__(self, model, eps=0.1):
            self.model = model
            self.eps = eps
            self.emb_backup = {}
            self.grad_backup = {}
            self.last_r_at = 0  # perturbation carried over from the previous replay/batch

        def attack(self, emb_name='embedding', is_first_attack=False):
            # Replace emb_name with the name of your model's embedding parameter
            for name, param in self.model.named_parameters():
                if param.requires_grad and emb_name in name:
                    if is_first_attack:
                        self.emb_backup[name] = param.data.clone()
                    # apply the carried-over perturbation, project, then update it
                    param.data.add_(self.last_r_at)
                    param.data = self.project(name, param.data)
                    self.last_r_at = self.last_r_at + self.eps * param.grad.sign()

        def restore(self, emb_name='embedding'):
            for name, param in self.model.named_parameters():
                if param.requires_grad and emb_name in name:
                    assert name in self.emb_backup
                    param.data = self.emb_backup[name]
            self.emb_backup = {}

        def project(self, param_name, param_data):
            # project the accumulated perturbation back onto the eps-ball
            r = param_data - self.emb_backup[param_name]
            if torch.norm(r) > self.eps:
                r = self.eps * r / torch.norm(r)
            return self.emb_backup[param_name] + r

        def backup_grad(self):
            for name, param in self.model.named_parameters():
                if param.requires_grad and param.grad is not None:
                    self.grad_backup[name] = param.grad.clone()

        def restore_grad(self):
            for name, param in self.model.named_parameters():
                if param.requires_grad and param.grad is not None:
                    param.grad = self.grad_backup[name]
    
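    FreeAT makes adversarial training "free" by reusing a single backward pass to update both the model parameters and the perturbation: each batch is replayed m times, and the perturbation is carried over between replays (in this implementation, via last_r_at, it is even carried across batches). As a standard description (not taken from the repository), each replay updates

    r^{t+1} = \Pi_{\|r\| \le \epsilon}\Big(r^t + \epsilon \cdot \mathrm{sign}\big(\nabla_x L(x + r^t, y;\, \theta)\big)\Big)
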
    Usage example
    free_at = FreeAT(model=model)
    for i, (trains, labels) in enumerate(train_iter):
        # Normal training step
        outputs = model(trains)
        loss = F.cross_entropy(outputs, labels)
        loss.backward()  # backward pass to get the clean gradients
        # Adversarial step: replay the batch m times
        free_at.backup_grad()  # save the clean gradients so restore_grad can bring them back
        m = 5
        for _t in range(m):
            free_at.attack(is_first_attack=(_t == 0))  # perturb the embedding; back up param.data on the first attack
            if _t != m - 1:
                model.zero_grad()
            else:
                free_at.restore_grad()  # restore the clean gradients before the last backward pass
            outputs = model(trains)
            loss_adv = F.cross_entropy(outputs, labels)
            loss_adv.backward()  # accumulate the adversarial gradients on top of the clean ones
        free_at.restore()  # restore the embedding parameters
        # Gradient descent: update the parameters
        optimizer.step()
        model.zero_grad()
    

    Experimental results

    TextCNN + attack_train
    baseline + attack_train    precision    recall    F1
    TextCNN                    0.9083       0.9078    0.9079
    TextCNN + FGSM             0.9105       0.9103    0.9103
    TextCNN + FGM              0.9110       0.9104    0.9105
    TextCNN + PGD              0.9103       0.9098    0.9099
    TextCNN + FreeAT           0.9104       0.9097    0.9096
    TextRNN + attack_train
    baseline + attack_train    precision    recall    F1
    TextRNN                    0.9046       0.9034    0.9038
    TextRNN + FGSM             0.9068       0.9055    0.9058
    TextRNN + FGM              0.9160       0.9161    0.9160
    TextRNN + PGD              0.9144       0.9142    0.9140
    TextRNN + FreeAT           0.9064       0.9062    0.9059

    References:

    attack_train/Attack-Train-Compare-Pytorch at main · tanshoudong/attack_train (github.com)

    Adversarial training with FGM, FGSM, and PGD: principles and source-code analysis (CSDN blog)

    A one-stop guide to adversarial training in NLP: FGSM/FGM/PGD/FreeAT/YOPO/FreeLB/SMART (zhihu.com)

    A summary of adversarial learning: FGSM -> FGM -> PGD -> FreeAT, YOPO -> FreeLB -> SMART -> LookAhead -> VAT (CSDN blog)

    lonePatient/TorchBlocks: A PyTorch-based toolkit for natural language processing (github.com)

  • Original article: https://blog.csdn.net/mynameisgt/article/details/127588733