• Fine-tuning pretrained models for text classification (single-task and multi-task settings) with huggingface.transformers


    诸神缄默不语 - index of my personal CSDN blog posts

    transformers official documentation: https://huggingface.co/docs/transformers/index
    AutoModel documentation: https://huggingface.co/docs/transformers/v4.23.1/en/model_doc/auto#transformers.AutoModel
    AutoTokenizer documentation: https://huggingface.co/docs/transformers/v4.23.1/en/model_doc/auto#transformers.AutoTokenizer

    Single-task: take the BERT representation directly, then add a Dropout layer and a single linear layer (essentially the same as using AutoModelForSequenceClassification directly; see the short sketch after this overview).
    Multi-task on a single dataset: replace the single-task linear layer with one linear layer per task.

    https://github.com/huggingface/transformers/blob/ad654e448444b60937016cbea257f69c9837ecde/src/transformers/modeling_utils.py
    https://github.com/huggingface/transformers/blob/ee0d001de71f0da892f86caa3cf2387020ec9696/src/transformers/models/bert/modeling_bert.py

    Multi-task over multiple datasets: following the official transformers source code (the two URLs above), extend the single-dataset multi-task setup by also splitting BertEmbeddings out per dataset, so that all tasks share only the BertEncoder.

    (There are in fact many paradigms of multi-task learning; this article only uses basic hard parameter sharing.)
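
    As a point of reference for the single-task setup above, here is a minimal sketch (assuming the bert-base-chinese checkpoint and binary labels; the example sentence is a toy input, and this snippet is not part of the experiment code below) of the equivalent built-in classification head:

    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    pretrained_path = 'bert-base-chinese'  # or a local checkpoint directory
    tokenizer = AutoTokenizer.from_pretrained(pretrained_path)
    # num_labels=2 attaches a dropout + linear head on top of the pooled output,
    # which is essentially what the hand-written ClsModel below does
    model = AutoModelForSequenceClassification.from_pretrained(pretrained_path, num_labels=2)

    batch = tokenizer(['这家酒店性价比很高'], padding=True, truncation=True, return_tensors='pt')
    logits = model(**batch).logits  # shape: (batch_size, 2)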

    1. Single-task text classification

    The dataset used here is https://raw.githubusercontent.com/SophonPlus/ChineseNlpCorpus/master/datasets/ChnSentiCorp_htl_all/ChnSentiCorp_htl_all.csv, and the pretrained language model is https://huggingface.co/bert-base-chinese

    You can also refer to another project of mine, PolarisRisingWar/pytorch_text_classification.

    Code:

    import csv,random
    from tqdm import tqdm
    from copy import deepcopy
    
    from sklearn.metrics import accuracy_score,precision_score,recall_score,f1_score
    
    import torch
    import torch.nn as nn
    from torch.utils.data import Dataset,DataLoader
    
    from transformers import AutoModel, AutoTokenizer
    
    # Hyperparameter settings
    random_seed=20221125
    split_ratio='6-2-2'
    pretrained_path='/data/pretrained_model/bert-base-chinese'
    dropout_rate=0.1
    max_epoch_num=16
    cuda_device='cuda:2'
    output_dim=2
    
    # Data preprocessing
    with open('other_data_temp/ChnSentiCorp_htl_all.csv') as f:
        reader=csv.reader(f)
        header = next(reader)  # header row
        data = [[int(row[0]),row[1]] for row in reader]  # each element is [label (0/1), review text]
    
    random.seed(random_seed)
    random.shuffle(data)
    split_ratio_list=[int(i) for i in split_ratio.split('-')]
    split_point1=int(len(data)*split_ratio_list[0]/sum(split_ratio_list))
    split_point2=int(len(data)*(split_ratio_list[0]+split_ratio_list[1])/sum(split_ratio_list))
    train_data=data[:split_point1]
    valid_data=data[split_point1:split_point2]
    test_data=data[split_point2:]
    
    # Build the Dataset and DataLoaders
    class TextInitializeDataset(Dataset):
        def __init__(self,input_data) -> None:
            self.text=[x[1] for x in input_data]
            self.label=[x[0] for x in input_data]
        
        def __getitem__(self, index):
            return [self.text[index],self.label[index]]
        
        def __len__(self):
            return len(self.text)
    
    tokenizer=AutoTokenizer.from_pretrained(pretrained_path)
    
    def collate_fn(batch):
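        # tokenize the whole batch at once; padding=True pads dynamically to the longest sequence in this batch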
        pt_batch=tokenizer([x[0] for x in batch],padding=True,truncation=True,max_length=512,return_tensors='pt')
        return {'input_ids':pt_batch['input_ids'],'token_type_ids':pt_batch['token_type_ids'],'attention_mask':pt_batch['attention_mask'],
                'label':torch.tensor([x[1] for x in batch])}
    
    train_dataloader=DataLoader(TextInitializeDataset(train_data),batch_size=16,shuffle=True,collate_fn=collate_fn)
    valid_dataloader=DataLoader(TextInitializeDataset(valid_data),batch_size=128,shuffle=False,collate_fn=collate_fn)
    test_dataloader=DataLoader(TextInitializeDataset(test_data),batch_size=128,shuffle=False,collate_fn=collate_fn)
    
    # Model definition
    class ClsModel(nn.Module):
        def __init__(self,output_dim,dropout_rate):
            super(ClsModel,self).__init__()
    
            self.encoder=AutoModel.from_pretrained(pretrained_path)
    
            self.dropout=nn.Dropout(dropout_rate)
            self.classifier=nn.Linear(768,output_dim)
        
        def forward(self,input_ids,token_type_ids,attention_mask):
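            # use BERT's pooled [CLS] output ('pooler_output') as the sentence representation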
            x=self.encoder(input_ids=input_ids,token_type_ids=token_type_ids,attention_mask=attention_mask)['pooler_output']
            x=self.dropout(x)
            x=self.classifier(x)
    
            return x
    
    
    loss_func=nn.CrossEntropyLoss()
    
    model=ClsModel(output_dim,dropout_rate)
    model.to(cuda_device)
    
    optimizer=torch.optim.Adam(params=model.parameters(),lr=1e-5)
    
    max_valid_f1=0
    best_model={}
    
    for e in tqdm(range(max_epoch_num)):
        for batch in train_dataloader:
            model.train()
            optimizer.zero_grad()
            input_ids=batch['input_ids'].to(cuda_device)
            token_type_ids=batch['token_type_ids'].to(cuda_device)
            attention_mask=batch['attention_mask'].to(cuda_device)
            outputs=model(input_ids,token_type_ids,attention_mask)
            train_loss=loss_func(outputs,batch['label'].to(cuda_device))
            train_loss.backward()
            optimizer.step()
        
        # Validation
        with torch.no_grad():
            model.eval()
            labels=[]
            predicts=[]
            for batch in valid_dataloader:
                input_ids=batch['input_ids'].to(cuda_device)
                token_type_ids=batch['token_type_ids'].to(cuda_device)
                attention_mask=batch['attention_mask'].to(cuda_device)
                outputs=model(input_ids,token_type_ids,attention_mask)
                labels.extend([i.item() for i in batch['label']])
                predicts.extend([i.item() for i in torch.argmax(outputs,1)])
            f1=f1_score(labels,predicts,average='macro')
            if f1>max_valid_f1:
                best_model=deepcopy(model.state_dict())
                max_valid_f1=f1
    
    # Test
    model.load_state_dict(best_model)
    with torch.no_grad():
        model.eval()
        labels=[]
        predicts=[]
        for batch in test_dataloader:
            input_ids=batch['input_ids'].to(cuda_device)
            token_type_ids=batch['token_type_ids'].to(cuda_device)
            attention_mask=batch['attention_mask'].to(cuda_device)
            outputs=model(input_ids,token_type_ids,attention_mask)
            labels.extend([i.item() for i in batch['label']])
            predicts.extend([i.item() for i in torch.argmax(outputs,1)])
        print(accuracy_score(labels,predicts))
        print(precision_score(labels,predicts,average='macro'))
        print(recall_score(labels,predicts,average='macro'))
        print(f1_score(labels,predicts,average='macro'))
    

    Runtime: about 1h35min.

    Results:

    accuracy | macro-P | macro-R | macro-F
    91.89    | 91.39   | 90.33   | 90.82
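
    For completeness, a minimal inference sketch (assuming the fine-tuned model, tokenizer and cuda_device from the code above are still in scope; the example sentences are made up):

    new_texts = ['房间很干净,服务也不错', '设施太旧了,不会再住']
    model.eval()
    with torch.no_grad():
        encoded = tokenizer(new_texts, padding=True, truncation=True, max_length=512, return_tensors='pt')
        encoded = {k: v.to(cuda_device) for k, v in encoded.items()}
        logits = model(encoded['input_ids'], encoded['token_type_ids'], encoded['attention_mask'])
        preds = torch.argmax(logits, dim=1).tolist()  # predicted label ids (0/1)
    print(preds)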

    2. Multi-task text classification (single dataset)

    The dataset used here, TEL-NLP, comes from https://github.com/scsmuhio/MTGCN
    The specific data file I used is https://raw.githubusercontent.com/scsmuhio/MTGCN/main/Data/ei_task.csv
    Source paper: MT-Text GCN: Multi-Task Text Classification using Graph Convolutional Networks for Large-Scale Low Resource Language
    The Telugu BERT weights I used are https://huggingface.co/kuppuluri/telugu_bertu (not the representation model used in the dataset's original paper)

    This is a Telugu multi-task text classification dataset. I don't actually know any Telugu, so in principle I would rather not have used it, but it is the only typical single-dataset multi-task text classification dataset I could find!

    Dataset example: (screenshot of sample rows omitted)

    The preprocessing here is similar to what the paper describes, though it cannot be identical: first, this dataset is not the same as the data described in the paper (I asked about this in the GitHub repo: Questions about data · Issue #1 · scsmuhio/MTGCN); second, the code does not provide the actual splits, so I can only approximate them with my own random seed; third, I never fully understood how the paper splits the data. As far as I can tell, they make five random 7-1-2 splits and report the average over the five runs, but I couldn't be bothered to run it that many times.
    The data is split randomly at a 7-1-2 ratio (random seed 20221028).
    (As a result, my final numbers are simply not comparable with those reported in the paper; they are not even on the same scale...)
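
    If one did want to approximate that five-split protocol, a rough sketch could look like the following (assuming data has already been read in as in the code below; the seed list and the run_once() helper are hypothetical, with run_once() wrapping one full fine-tune/validate/test run and returning its test macro-F1):

    from statistics import mean
    import random

    def split_712(data, seed):
        # shuffle a copy of the data and split it 7-1-2 into train/valid/test
        data = data[:]
        random.Random(seed).shuffle(data)
        p1 = int(len(data) * 7 / 10)
        p2 = int(len(data) * 8 / 10)
        return data[:p1], data[p1:p2], data[p2:]

    f1_list = []
    for seed in [20221028, 20221029, 20221030, 20221031, 20221032]:  # hypothetical seeds
        train_data, valid_data, test_data = split_712(data, seed)
        f1_list.append(run_once(train_data, valid_data, test_data))  # run_once() is a hypothetical wrapper around the training code below
    print(mean(f1_list))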

    I ran two experiments to compare the single-task and multi-task classification paradigms. Each fine-tunes for at most 16 epochs and tests with the model from the epoch with the highest validation macro-F1 (for the multi-task model, the highest average macro-F1 across tasks).
    Judging from the results alone, the multi-task paradigm shows no clear advantage or disadvantage here. Then again, the multi-task setup is very plain and unoptimized; I may refine the code when I have time.

    Single-task version:

    import csv,os,random
    from tqdm import tqdm
    from copy import deepcopy
    
    from sklearn.metrics import accuracy_score,precision_score,recall_score,f1_score
    
    import torch
    import torch.nn as nn
    from torch.utils.data import Dataset,TensorDataset,DataLoader
    
    from transformers import AutoModel, AutoTokenizer, pipeline
    
    # Data preprocessing
    with open('other_data_temp/telnlp_ei.csv') as f:
        reader=csv.reader(f)
        header = next(reader)  # header row
        print(header)
        data=list(reader)
    
        # map the string labels to integers
        map1={'neg':0,'neutral':1,'pos':2}
        map2={'angry':0,'sad':1,'fear':2,'happy':3}
        map3={'yes':0,'no':1}
    
        random.seed(20221028)
        random.shuffle(data)
        split_ratio_list=[7,1,2]
        split_point1=int(len(data)*split_ratio_list[0]/sum(split_ratio_list))
        split_point2=int(len(data)*(split_ratio_list[0]+split_ratio_list[1])/sum(split_ratio_list))
        train_data=data[:split_point1]
        valid_data=data[split_point1:split_point2]
        test_data=data[split_point2:]
    
    # Build the Dataset and DataLoaders
    class TextInitializeDataset(Dataset):
        def __init__(self,input_data) -> None:
            self.text=[x[0] for x in input_data]
            self.sentiment=[map1[x[1]] for x in input_data]
            self.emotion=[map2[x[2]] for x in input_data]
            self.hate=[map3[x[3]] for x in input_data]
            self.sarcasm=[map3[x[4]] for x in input_data]
        
        def __getitem__(self, index):
            return [self.text[index],self.sentiment[index],self.emotion[index],self.hate[index],self.sarcasm[index]]
        
        def __len__(self):
            return len(self.text)
    
    tokenizer = AutoTokenizer.from_pretrained("/data/pretrained_model/telugu_bertu",clean_text=False,handle_chinese_chars=False,
                                            strip_accents=False,wordpieces_prefix='##')
    
    def collate_fn(batch):
        pt_batch=tokenizer([x[0] for x in batch],padding=True,truncation=True,max_length=512,return_tensors='pt')
        return {'input_ids':pt_batch['input_ids'],'token_type_ids':pt_batch['token_type_ids'],'attention_mask':pt_batch['attention_mask'],
                'sentiment':torch.tensor([x[1] for x in batch]),'emotion':torch.tensor([x[2] for x in batch]),'hate':torch.tensor([x[3] for x in batch]),
                'sarcasm':torch.tensor([x[4] for x in batch])}
    
    train_dataloader=DataLoader(TextInitializeDataset(train_data),batch_size=64,shuffle=True,collate_fn=collate_fn)
    valid_dataloader=DataLoader(TextInitializeDataset(valid_data),batch_size=512,shuffle=False,collate_fn=collate_fn)
    test_dataloader=DataLoader(TextInitializeDataset(test_data),batch_size=512,shuffle=False,collate_fn=collate_fn)
    
    # Model definition
    class ClsModel(nn.Module):
        def __init__(self,output_dim,dropout_rate):
            super(ClsModel,self).__init__()
    
            self.encoder=AutoModel.from_pretrained("/data/pretrained_model/telugu_bertu")
    
            self.dropout=nn.Dropout(dropout_rate)
            self.classifier=nn.Linear(768,output_dim)
        
        def forward(self,input_ids,token_type_ids,attention_mask):
            x=self.encoder(input_ids=input_ids,token_type_ids=token_type_ids,attention_mask=attention_mask)['pooler_output']
            x=self.dropout(x)
            x=self.classifier(x)
    
            return x
    
    # Run
    dropout_rate=0.1
    max_epoch_num=16
    cuda_device='cuda:1'
    od_map={'sentiment':3,'emotion':4,'hate':2,'sarcasm':2}
    
    loss_func=nn.CrossEntropyLoss()
    
    for the_label in ['sentiment','emotion','hate','sarcasm']:
        model=ClsModel(od_map[the_label],dropout_rate)
        model.to(cuda_device)
    
        optimizer=torch.optim.Adam(params=model.parameters(),lr=1e-5)
    
        max_valid_f1=0
        best_model={}
    
        for e in tqdm(range(max_epoch_num)):
            for batch in train_dataloader:
                model.train()
                optimizer.zero_grad()
                input_ids=batch['input_ids'].to(cuda_device)
                token_type_ids=batch['token_type_ids'].to(cuda_device)
                attention_mask=batch['attention_mask'].to(cuda_device)
                outputs=model(input_ids,token_type_ids,attention_mask)
                train_loss=loss_func(outputs,batch[the_label].to(cuda_device))
                train_loss.backward()
                optimizer.step()
            
            # Validation
            with torch.no_grad():
                model.eval()
                labels=[]
                predicts=[]
                for batch in valid_dataloader:
                    input_ids=batch['input_ids'].to(cuda_device)
                    token_type_ids=batch['token_type_ids'].to(cuda_device)
                    attention_mask=batch['attention_mask'].to(cuda_device)
                    outputs=model(input_ids,token_type_ids,attention_mask)
                    labels.extend([i.item() for i in batch[the_label]])
                    predicts.extend([i.item() for i in torch.argmax(outputs,1)])
                f1=f1_score(labels,predicts,average='macro')
                if f1>max_valid_f1:
                    best_model=deepcopy(model.state_dict())
                    max_valid_f1=f1
        
        # Test
        model.load_state_dict(best_model)
        with torch.no_grad():
            model.eval()
            labels=[]
            predicts=[]
            for batch in test_dataloader:
                input_ids=batch['input_ids'].to(cuda_device)
                token_type_ids=batch['token_type_ids'].to(cuda_device)
                attention_mask=batch['attention_mask'].to(cuda_device)
                outputs=model(input_ids,token_type_ids,attention_mask)
                labels.extend([i.item() for i in batch[the_label]])
                predicts.extend([i.item() for i in torch.argmax(outputs,1)])
            print(the_label)
            print(accuracy_score(labels,predicts))
            print(precision_score(labels,predicts,average='macro'))
            print(recall_score(labels,predicts,average='macro'))
            print(f1_score(labels,predicts,average='macro'))
    

    Multi-task version:

    import csv,os,random
    from tqdm import tqdm
    from copy import deepcopy
    from statistics import mean
    
    from sklearn.metrics import accuracy_score,precision_score,recall_score,f1_score
    
    import torch
    import torch.nn as nn
    from torch.utils.data import Dataset,TensorDataset,DataLoader
    
    from transformers import AutoModel, AutoTokenizer, pipeline
    
    # Data preprocessing
    with open('other_data_temp/telnlp_ei.csv') as f:
        reader=csv.reader(f)
        header = next(reader)  # header row
        print(header)
        data=list(reader)
    
        # map the string labels to integers
        map1={'neg':0,'neutral':1,'pos':2}
        map2={'angry':0,'sad':1,'fear':2,'happy':3}
        map3={'yes':0,'no':1}
    
        random.seed(20221028)
        random.shuffle(data)
        split_ratio_list=[7,1,2]
        split_point1=int(len(data)*split_ratio_list[0]/sum(split_ratio_list))
        split_point2=int(len(data)*(split_ratio_list[0]+split_ratio_list[1])/sum(split_ratio_list))
        train_data=data[:split_point1]
        valid_data=data[split_point1:split_point2]
        test_data=data[split_point2:]
    
    # Build the Dataset and DataLoaders
    class TextInitializeDataset(Dataset):
        def __init__(self,input_data) -> None:
            self.text=[x[0] for x in input_data]
            self.sentiment=[map1[x[1]] for x in input_data]
            self.emotion=[map2[x[2]] for x in input_data]
            self.hate=[map3[x[3]] for x in input_data]
            self.sarcasm=[map3[x[4]] for x in input_data]
        
        def __getitem__(self, index):
            return [self.text[index],self.sentiment[index],self.emotion[index],self.hate[index],self.sarcasm[index]]
        
        def __len__(self):
            return len(self.text)
    
    tokenizer = AutoTokenizer.from_pretrained("/data/pretrained_model/telugu_bertu",clean_text=False,handle_chinese_chars=False,
                                            strip_accents=False,wordpieces_prefix='##')
    
    def collate_fn(batch):
        pt_batch=tokenizer([x[0] for x in batch],padding=True,truncation=True,max_length=512,return_tensors='pt')
        return {'input_ids':pt_batch['input_ids'],'token_type_ids':pt_batch['token_type_ids'],'attention_mask':pt_batch['attention_mask'],
                'sentiment':torch.tensor([x[1] for x in batch]),'emotion':torch.tensor([x[2] for x in batch]),'hate':torch.tensor([x[3] for x in batch]),
                'sarcasm':torch.tensor([x[4] for x in batch])}
    
    train_dataloader=DataLoader(TextInitializeDataset(train_data),batch_size=64,shuffle=True,collate_fn=collate_fn)
    valid_dataloader=DataLoader(TextInitializeDataset(valid_data),batch_size=512,shuffle=False,collate_fn=collate_fn)
    test_dataloader=DataLoader(TextInitializeDataset(test_data),batch_size=512,shuffle=False,collate_fn=collate_fn)
    
    # Model definition
    class ClsModel(nn.Module):
        def __init__(self,output_dims,dropout_rate):
            super(ClsModel,self).__init__()
    
            self.encoder=AutoModel.from_pretrained("/data/pretrained_model/telugu_bertu")
    
            self.dropout=nn.Dropout(dropout_rate)
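            # one linear classification head per task; the BERT encoder and dropout are shared across all tasks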
            self.classifiers=nn.ModuleList([nn.Linear(768,output_dim) for output_dim in output_dims])
        
        def forward(self,input_ids,token_type_ids,attention_mask):
            x=self.encoder(input_ids=input_ids,token_type_ids=token_type_ids,attention_mask=attention_mask)['pooler_output']
            x=self.dropout(x)
            xs=[classifier(x) for classifier in self.classifiers]
    
            return xs
    
    # Run
    dropout_rate=0.1
    max_epoch_num=16
    cuda_device='cuda:2'
    od_name=['sentiment','emotion','hate','sarcasm']
    od=[3,4,2,2]
    
    loss_func=nn.CrossEntropyLoss()
    
    model=ClsModel(od,dropout_rate)
    model.to(cuda_device)
    
    optimizer=torch.optim.Adam(params=model.parameters(),lr=1e-5)
        
    max_valid_f1=0
    best_model={}
    
    for e in tqdm(range(max_epoch_num)):
        for batch in train_dataloader:
            model.train()
            optimizer.zero_grad()
            input_ids=batch['input_ids'].to(cuda_device)
            token_type_ids=batch['token_type_ids'].to(cuda_device)
            attention_mask=batch['attention_mask'].to(cuda_device)
            outputs=model(input_ids,token_type_ids,attention_mask)
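            # total loss: an unweighted sum of the four per-task cross-entropy losses (basic hard sharing)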
            loss_list=[loss_func(outputs[i],batch[od_name[i]].to(cuda_device)) for i in range(4)]
            train_loss=torch.sum(torch.stack(loss_list))
            train_loss.backward()
            optimizer.step()
        
        # Validation
        with torch.no_grad():
            model.eval()
            labels=[[] for _ in range(4)]
            predicts=[[] for _ in range(4)]
            for batch in valid_dataloader:
                input_ids=batch['input_ids'].to(cuda_device)
                token_type_ids=batch['token_type_ids'].to(cuda_device)
                attention_mask=batch['attention_mask'].to(cuda_device)
                outputs=model(input_ids,token_type_ids,attention_mask)
                for i in range(4):
                    labels[i].extend([t.item() for t in batch[od_name[i]]])
                    predicts[i].extend([t.item() for t in torch.argmax(outputs[i],1)])
            f1=mean([f1_score(labels[i],predicts[i],average='macro') for i in range(4)])
            if f1>max_valid_f1:
                best_model=deepcopy(model.state_dict())
                max_valid_f1=f1
    
    # Test
    model.load_state_dict(best_model)
    with torch.no_grad():
        model.eval()
        labels=[[] for _ in range(4)]
        predicts=[[] for _ in range(4)]
        for batch in test_dataloader:
            input_ids=batch['input_ids'].to(cuda_device)
            token_type_ids=batch['token_type_ids'].to(cuda_device)
            attention_mask=batch['attention_mask'].to(cuda_device)
            outputs=model(input_ids,token_type_ids,attention_mask)
            for i in range(4):
                labels[i].extend([t.item() for t in batch[od_name[i]]])
                predicts[i].extend([t.item() for t in torch.argmax(outputs[i],1)])
        for i in range(4):
            print(od_name[i])
            print(accuracy_score(labels[i],predicts[i]))
            print(precision_score(labels[i],predicts[i],average='macro'))
            print(recall_score(labels[i],predicts[i],average='macro'))
            print(f1_score(labels[i],predicts[i],average='macro'))
    

    (The multi-task run took roughly 1/4 of the total time of the single-task runs; I did not time the difference precisely.)
    Comparison of results (values ×100, rounded to 2 decimal places):

    setting - task        | accuracy | macro-P | macro-R | macro-F
    single - sentiment    | 85.69    | 64.38   | 63.55   | 63.73
    multi - sentiment     | 86.37    | 65.74   | 63.29   | 63.9
    single - emotion      | 87.61    | 72.18   | 73.16   | 72.47
    multi - emotion       | 88.28    | 79.97   | 66.51   | 70.81
    single - hate-speech  | 96.58    | 63.99   | 69.15   | 66.12
    multi - hate-speech   | 96.84    | 66.36   | 72.78   | 68.99
    single - sarcasm      | 98.34    | 64.47   | 68.55   | 66.25
    multi - sarcasm       | 98.03    | 60.92   | 66.04   | 62.96

    3. Multi-task text classification (multiple datasets)

    The data here consists of two Sina Weibo datasets, both from https://github.com/SophonPlus/ChineseNlpCorpus:
    one labeled with binary sentiment (0/1): https://pan.baidu.com/s/1DoQbki3YwqkuwQUOj64R_g
    one labeled with 4 emotions: https://pan.baidu.com/s/16c93E5x373nsGozyWevITg

    The pretrained language model is https://huggingface.co/bert-base-chinese

    (Training takes so long that I got lazy and only ran 1 epoch for everything.)

    Single-task code:

    import csv,random
    from tqdm import tqdm
    from copy import deepcopy
    
    from sklearn.metrics import accuracy_score,precision_score,recall_score,f1_score
    
    import torch
    import torch.nn as nn
    from torch.utils.data import Dataset,DataLoader
    
    from transformers import AutoModel, AutoTokenizer
    
    # Hyperparameter settings
    random_seed=20221125
    split_ratio='6-2-2'
    pretrained_path='/data/pretrained_model/bert-base-chinese'
    dropout_rate=0.1
    max_epoch_num=1
    cuda_device='cuda:3'
    output_dim=[['/data/other_data/weibo_senti_100k.csv',2],['/data/other_data/simplifyweibo_4_moods.csv',4]]  # [dataset path, number of classes] for each task
    
    # Data preprocessing
    random.seed(random_seed)
    
    # Build the Dataset and DataLoaders
    class TextInitializeDataset(Dataset):
        def __init__(self,input_data) -> None:
            self.text=[x[1] for x in input_data]
            self.label=[x[0] for x in input_data]
        
        def __getitem__(self, index):
            return [self.text[index],self.label[index]]
        
        def __len__(self):
            return len(self.text)
    
    tokenizer = AutoTokenizer.from_pretrained(pretrained_path)
    
    def collate_fn(batch):
        pt_batch=tokenizer([x[0] for x in batch],padding=True,truncation=True,max_length=512,return_tensors='pt')
        return {'input_ids':pt_batch['input_ids'],'token_type_ids':pt_batch['token_type_ids'],'attention_mask':pt_batch['attention_mask'],
                'label':torch.tensor([x[1] for x in batch])}
    
    
    
    # Model definition
    class ClsModel(nn.Module):
        def __init__(self,output_dim,dropout_rate):
            super(ClsModel,self).__init__()
    
            self.encoder=AutoModel.from_pretrained(pretrained_path)
    
            self.dropout=nn.Dropout(dropout_rate)
            self.classifier=nn.Linear(768,output_dim)
        
        def forward(self,input_ids,token_type_ids,attention_mask):
            x=self.encoder(input_ids=input_ids,token_type_ids=token_type_ids,attention_mask=attention_mask)['pooler_output']
            x=self.dropout(x)
            x=self.classifier(x)
    
            return x
    
    # Run
    loss_func=nn.CrossEntropyLoss()
    
    for task in output_dim:
        with open(task[0]) as f:
            reader=csv.reader(f)
            header = next(reader)  # header row
            data = [[int(row[0]),row[1]] for row in reader]  # each element is [label, comment text]; labels are 0/1 for the first dataset and 0-3 for the second
    
        split_ratio_list=[int(i) for i in split_ratio.split('-')]
        split_point1=int(len(data)*split_ratio_list[0]/sum(split_ratio_list))
        split_point2=int(len(data)*(split_ratio_list[0]+split_ratio_list[1])/sum(split_ratio_list))
        train_data=data[:split_point1]
        valid_data=data[split_point1:split_point2]
        test_data=data[split_point2:]
    
        train_dataloader=DataLoader(TextInitializeDataset(train_data),batch_size=16,shuffle=True,collate_fn=collate_fn)
        valid_dataloader=DataLoader(TextInitializeDataset(valid_data),batch_size=128,shuffle=False,collate_fn=collate_fn)
        test_dataloader=DataLoader(TextInitializeDataset(test_data),batch_size=128,shuffle=False,collate_fn=collate_fn)
        # batch sizes of 64/512 work on the first dataset but OOM on the second, so I just use the same settings for both
    
        model=ClsModel(task[1],dropout_rate)
        model.to(cuda_device)
    
        optimizer=torch.optim.Adam(params=model.parameters(),lr=1e-5)
    
        max_valid_f1=0
        best_model={}
    
        for e in tqdm(range(max_epoch_num)):
            for batch in train_dataloader:
                model.train()
                optimizer.zero_grad()
                input_ids=batch['input_ids'].to(cuda_device)
                token_type_ids=batch['token_type_ids'].to(cuda_device)
                attention_mask=batch['attention_mask'].to(cuda_device)
                outputs=model(input_ids,token_type_ids,attention_mask)
                train_loss=loss_func(outputs,batch['label'].to(cuda_device))
                train_loss.backward()
                optimizer.step()
            
            # Validation
            with torch.no_grad():
                model.eval()
                labels=[]
                predicts=[]
                for batch in valid_dataloader:
                    input_ids=batch['input_ids'].to(cuda_device)
                    token_type_ids=batch['token_type_ids'].to(cuda_device)
                    attention_mask=batch['attention_mask'].to(cuda_device)
                    outputs=model(input_ids,token_type_ids,attention_mask)
                    labels.extend([i.item() for i in batch['label']])
                    predicts.extend([i.item() for i in torch.argmax(outputs,1)])
                f1=f1_score(labels,predicts,average='macro')
                if f1>max_valid_f1:
                    best_model=deepcopy(model.state_dict())
                    max_valid_f1=f1
        
        # Test
        model.load_state_dict(best_model)
        with torch.no_grad():
            model.eval()
            labels=[]
            predicts=[]
            for batch in test_dataloader:
                input_ids=batch['input_ids'].to(cuda_device)
                token_type_ids=batch['token_type_ids'].to(cuda_device)
                attention_mask=batch['attention_mask'].to(cuda_device)
                outputs=model(input_ids,token_type_ids,attention_mask)
                labels.extend([i.item() for i in batch['label']])
                predicts.extend([i.item() for i in torch.argmax(outputs,1)])
            print(task[0])
            print(accuracy_score(labels,predicts))
            print(precision_score(labels,predicts,average='macro'))
            print(recall_score(labels,predicts,average='macro'))
            print(f1_score(labels,predicts,average='macro'))
    

    Multi-task code:

    import csv,random
    from tqdm import tqdm
    from copy import deepcopy
    
    from sklearn.metrics import accuracy_score,precision_score,recall_score,f1_score
    
    import torch
    import torch.nn as nn
    from torch.utils.data import Dataset,DataLoader
    
    from transformers import AutoTokenizer,AutoConfig
    from transformers.models.bert.modeling_bert import BertEmbeddings,BertEncoder,BertPooler
    from transformers.modeling_outputs import BaseModelOutputWithPoolingAndCrossAttentions
    from transformers.modeling_utils import ModuleUtilsMixin
    
    
    # Hyperparameter settings
    random_seed=20221125
    split_ratio='6-2-2'
    pretrained_path='/data/pretrained_model/bert-base-chinese'
    dropout_rate=0.1
    max_epoch_num=1
    cuda_device='cuda:2'
    output_dim=[2,4]
    
    # Data preprocessing
    random.seed(random_seed)
    
    # Dataset 1: binary sentiment
    with open('/data/other_data/weibo_senti_100k.csv') as f:
        reader=csv.reader(f)
        header = next(reader)  # header row
        data = [[int(row[0]),row[1]] for row in reader]  # each element is [label (0/1), comment text]
    
    random.shuffle(data)
    split_ratio_list=[int(i) for i in split_ratio.split('-')]
    split_point1=int(len(data)*split_ratio_list[0]/sum(split_ratio_list))
    split_point2=int(len(data)*(split_ratio_list[0]+split_ratio_list[1])/sum(split_ratio_list))
    train_data1=data[:split_point1]
    valid_data1=data[split_point1:split_point2]
    test_data1=data[split_point2:]
    
    # Dataset 2: four moods
    with open('/data/other_data/simplifyweibo_4_moods.csv') as f:
        reader=csv.reader(f)
        header = next(reader)  # header row
        data = [[int(row[0]),row[1]] for row in reader]  # each element is [label (0-3), comment text]
    
    random.shuffle(data)
    split_ratio_list=[int(i) for i in split_ratio.split('-')]
    split_point1=int(len(data)*split_ratio_list[0]/sum(split_ratio_list))
    split_point2=int(len(data)*(split_ratio_list[0]+split_ratio_list[1])/sum(split_ratio_list))
    train_data2=data[:split_point1]
    valid_data2=data[split_point1:split_point2]
    test_data2=data[split_point2:]
    
    # Build the Dataset and DataLoaders
    class TextInitializeDataset(Dataset):
        def __init__(self,input_data) -> None:
            self.text=[x[1] for x in input_data]
            self.label=[x[0] for x in input_data]
        
        def __getitem__(self, index):
            return [self.text[index],self.label[index]]
        
        def __len__(self):
            return len(self.text)
    
    tokenizer=AutoTokenizer.from_pretrained(pretrained_path)
    
    def collate_fn(batch):
        pt_batch=tokenizer([x[0] for x in batch],padding=True,truncation=True,max_length=512,return_tensors='pt')
        return {'input_ids':pt_batch['input_ids'],'token_type_ids':pt_batch['token_type_ids'],'attention_mask':pt_batch['attention_mask'],
                'label':torch.tensor([x[1] for x in batch])}
    
    train_dataloader1=DataLoader(TextInitializeDataset(train_data1),batch_size=16,shuffle=True,collate_fn=collate_fn)
    train_dataloader2=DataLoader(TextInitializeDataset(train_data2),batch_size=16,shuffle=True,collate_fn=collate_fn)
    valid_dataloader1=DataLoader(TextInitializeDataset(valid_data1),batch_size=128,shuffle=False,collate_fn=collate_fn)
    valid_dataloader2=DataLoader(TextInitializeDataset(valid_data2),batch_size=128,shuffle=False,collate_fn=collate_fn)
    test_dataloader1=DataLoader(TextInitializeDataset(test_data1),batch_size=128,shuffle=False,collate_fn=collate_fn)
    test_dataloader2=DataLoader(TextInitializeDataset(test_data2),batch_size=128,shuffle=False,collate_fn=collate_fn)
    
    config=AutoConfig.from_pretrained(pretrained_path)
    
    # Model definition
    class ClsModel(nn.Module):
        def __init__(self,output_dim,dropout_rate):
            super(ClsModel,self).__init__()
    
            self.config=config
            self.embedding1=BertEmbeddings(config)
            self.embedding2=BertEmbeddings(config)
            self.encoder=BertEncoder(config)
            self.pooler=BertPooler(config)
    
            self.dropout=nn.Dropout(dropout_rate)
            self.classifier1=nn.Linear(768,output_dim[0])
            self.classifier2=nn.Linear(768,output_dim[1])
        
        def forward(self,input_ids,token_type_ids,attention_mask,type):
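            # `type` selects the dataset/task: 1 -> the binary-sentiment Weibo dataset, 2 -> the 4-mood Weibo dataset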
            output_attentions=self.config.output_attentions
            output_hidden_states=self.config.output_hidden_states
            return_dict=self.config.use_return_dict
    
            if self.config.is_decoder:
                use_cache=self.config.use_cache
            else:
                use_cache = False
    
            input_shape = input_ids.size()
    
            batch_size, seq_length = input_shape
            device = input_ids.device
    
            # past_key_values_length
            past_key_values_length = 0
    
            if attention_mask is None:
                attention_mask = torch.ones(((batch_size, seq_length + past_key_values_length)), device=device)
    
            # hard sharing across datasets: each dataset has its own BertEmbeddings,
            # while BertEncoder and BertPooler are shared
            if type==1:
                self.embeddings=self.embedding1
            else:
                self.embeddings=self.embedding2
    
            dtype=attention_mask.dtype
            # We can provide a self-attention mask of dimensions [batch_size, from_seq_length, to_seq_length]
            # ourselves in which case we just need to make it broadcastable to all heads.
            if attention_mask.dim() == 3:
                extended_attention_mask = attention_mask[:, None, :, :]
            elif attention_mask.dim() == 2:
                # Provided a padding mask of dimensions [batch_size, seq_length]
                # - if the model is a decoder, apply a causal mask in addition to the padding mask
                # - if the model is an encoder, make the mask broadcastable to [batch_size, num_heads, seq_length, seq_length]
                if self.config.is_decoder:
                    extended_attention_mask = ModuleUtilsMixin.create_extended_attention_mask_for_decoder(
                        input_shape, attention_mask, device
                    )
                else:
                    extended_attention_mask = attention_mask[:, None, None, :]
            else:
                raise ValueError(
                    f"Wrong shape for input_ids (shape {input_shape}) or attention_mask (shape {attention_mask.shape})"
                )
    
            # Since attention_mask is 1.0 for positions we want to attend and 0.0 for
            # masked positions, this operation will create a tensor which is 0.0 for
            # positions we want to attend and the dtype's smallest value for masked positions.
            # Since we are adding it to the raw scores before the softmax, this is
            # effectively the same as removing these entirely.
            extended_attention_mask = extended_attention_mask.to(dtype=dtype)  # fp16 compatibility
            # note: BertModel itself uses torch.finfo(self.dtype).min here; since attention_mask is an integer tensor in this script, torch.iinfo is used instead
            extended_attention_mask = (1.0 - extended_attention_mask) * torch.iinfo(dtype).min
    
            encoder_extended_attention_mask = None
    
            # Prepare head mask if needed
            # 1.0 in head_mask indicate we keep the head
            # attention_probs has shape bsz x n_heads x N x N
            # input head_mask has shape [num_heads] or [num_hidden_layers x num_heads]
            # and head_mask is converted to shape [num_hidden_layers x batch x num_heads x seq_length x seq_length]
            head_mask=[None] *self.config.num_hidden_layers
    
            embedding_output = self.embeddings(
                input_ids=input_ids,
                position_ids=None,
                token_type_ids=token_type_ids,
                inputs_embeds=None,
                past_key_values_length=past_key_values_length,
            )
    
            encoder_outputs = self.encoder(
                embedding_output,
                attention_mask=extended_attention_mask,
                head_mask=head_mask,
                encoder_hidden_states=None,
                encoder_attention_mask=encoder_extended_attention_mask,
                past_key_values=None,
                use_cache=use_cache,
                output_attentions=output_attentions,
                output_hidden_states=output_hidden_states,
                return_dict=return_dict,
            )
    
            sequence_output = encoder_outputs[0]
            pooled_output = self.pooler(sequence_output) if self.pooler is not None else None
    
            if not return_dict:
                return (sequence_output, pooled_output) + encoder_outputs[1:]
    
            x=BaseModelOutputWithPoolingAndCrossAttentions(
                last_hidden_state=sequence_output,
                pooler_output=pooled_output,
                past_key_values=encoder_outputs.past_key_values,
                hidden_states=encoder_outputs.hidden_states,
                attentions=encoder_outputs.attentions,
                cross_attentions=encoder_outputs.cross_attentions,
            )['pooler_output']
    
            x=self.dropout(x)
    
            # select the per-dataset classification head
            if type==1:
                self.classifier=self.classifier1
            else:
                self.classifier=self.classifier2
    
            x=self.classifier(x)
    
            return x
    
    
    loss_func=nn.CrossEntropyLoss()
    
    model=ClsModel(output_dim,dropout_rate)
    model.to(cuda_device)
    
    optimizer=torch.optim.Adam(params=model.parameters(),lr=1e-5)
    
    max_valid_f1=0
    best_model={}
    
    for e in tqdm(range(max_epoch_num)):
        for batch in train_dataloader1:
            model.train()
            optimizer.zero_grad()
            input_ids=batch['input_ids'].to(cuda_device)
            token_type_ids=batch['token_type_ids'].to(cuda_device)
            attention_mask=batch['attention_mask'].to(cuda_device)
            outputs=model(input_ids,token_type_ids,attention_mask,1)
            train_loss=loss_func(outputs,batch['label'].to(cuda_device))
            train_loss.backward()
            optimizer.step()
        
        for batch in train_dataloader2:
            model.train()
            optimizer.zero_grad()
            input_ids=batch['input_ids'].to(cuda_device)
            token_type_ids=batch['token_type_ids'].to(cuda_device)
            attention_mask=batch['attention_mask'].to(cuda_device)
            outputs=model(input_ids,token_type_ids,attention_mask,2)
            train_loss=loss_func(outputs,batch['label'].to(cuda_device))
            train_loss.backward()
            optimizer.step()
        
        # Validation
        with torch.no_grad():
            model.eval()
    
            labels=[]
            predicts=[]
            for batch in valid_dataloader1:
                input_ids=batch['input_ids'].to(cuda_device)
                token_type_ids=batch['token_type_ids'].to(cuda_device)
                attention_mask=batch['attention_mask'].to(cuda_device)
                outputs=model(input_ids,token_type_ids,attention_mask,1)
                labels.extend([i.item() for i in batch['label']])
                predicts.extend([i.item() for i in torch.argmax(outputs,1)])
            f11=f1_score(labels,predicts,average='macro')
    
            labels=[]
            predicts=[]
            for batch in valid_dataloader2:
                input_ids=batch['input_ids'].to(cuda_device)
                token_type_ids=batch['token_type_ids'].to(cuda_device)
                attention_mask=batch['attention_mask'].to(cuda_device)
                outputs=model(input_ids,token_type_ids,attention_mask,2)
                labels.extend([i.item() for i in batch['label']])
                predicts.extend([i.item() for i in torch.argmax(outputs,1)])
            f12=f1_score(labels,predicts,average='macro')
    
            f1=(f11+f12)/2
            if f1>max_valid_f1:
                best_model=deepcopy(model.state_dict())
                max_valid_f1=f1
    
    # Test
    model.load_state_dict(best_model)
    with torch.no_grad():
        model.eval()
        labels=[]
        predicts=[]
        for batch in test_dataloader1:
            input_ids=batch['input_ids'].to(cuda_device)
            token_type_ids=batch['token_type_ids'].to(cuda_device)
            attention_mask=batch['attention_mask'].to(cuda_device)
            outputs=model(input_ids,token_type_ids,attention_mask,1)
            labels.extend([i.item() for i in batch['label']])
            predicts.extend([i.item() for i in torch.argmax(outputs,1)])
        print(accuracy_score(labels,predicts))
        print(precision_score(labels,predicts,average='macro'))
        print(recall_score(labels,predicts,average='macro'))
        print(f1_score(labels,predicts,average='macro'))
    
        labels=[]
        predicts=[]
        for batch in test_dataloader2:
            input_ids=batch['input_ids'].to(cuda_device)
            token_type_ids=batch['token_type_ids'].to(cuda_device)
            attention_mask=batch['attention_mask'].to(cuda_device)
            outputs=model(input_ids,token_type_ids,attention_mask,2)
            labels.extend([i.item() for i in batch['label']])
            predicts.extend([i.item() for i in torch.argmax(outputs,1)])
        print(accuracy_score(labels,predicts))
        print(precision_score(labels,predicts,average='macro'))
        print(recall_score(labels,predicts,average='macro'))
        print(f1_score(labels,predicts,average='macro'))
    

    Single-task results:
    (I'm also puzzled about why the second dataset comes out like this, but these really are the numbers that got printed! One thing worth noting: unlike the multi-task version below, this single-task script never calls random.shuffle(data) before splitting, so the splits follow the original file order; if the CSV happens to be grouped by label, that alone could produce degenerate results.)

    dataset               | accuracy | macro-P | macro-R | macro-F | runtime
    weibo_senti_100k      | 90.04    | 50      | 45.02   | 47.38   | 32min
    simplifyweibo_4_moods | 0        | 0       | 0       | 0       | 2h

    Multi-task results (runtime: 2h30min):

    dataset               | accuracy | macro-P | macro-R | macro-F
    weibo_senti_100k      | 85.54    | 88.62   | 85.69   | 85.29
    simplifyweibo_4_moods | 57.33    | 43.07   | 30.15   | 27.81
  • Original article: https://blog.csdn.net/PolarisRisingWar/article/details/127365675