Binary Text Classification with BERT

A training network built on BERT (Bidirectional Encoder Representations from Transformers) can be implemented in PyTorch. Here is a simple example:


import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer
 
# Load BERT tokenizer and model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
bert_model = BertModel.from_pretrained('bert-base-uncased')
 
# Example input sentence
input_sentence = "I love BERT!"
 
# Tokenize input sentence; return_tensors='pt' yields tensors of shape (1, max_length)
tokens = tokenizer.encode_plus(input_sentence, add_special_tokens=True, padding='max_length', truncation=True, max_length=10, return_tensors='pt')
 
# Get input tensors
input_ids = tokens['input_ids']
attention_mask = tokens['attention_mask']
 
# Define BERT-based model
class BERTModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.bert = bert_model
        self.fc = nn.Linear(768, 2)  # 768 = hidden size of bert-base; 2 output classes

    def forward(self, input_ids, attention_mask):
        # [0] is the last hidden state, shape (batch, seq_len, 768)
        bert_output = self.bert(input_ids=input_ids, attention_mask=attention_mask)[0]
        cls_output = bert_output[:, 0, :]  # Representation of the first token ([CLS])
        # Return raw logits: nn.CrossEntropyLoss applies log-softmax internally,
        # so no softmax layer is needed here
        return self.fc(cls_output)

# Initialize BERT model
model = BERTModel()

# Example of training process
# input_ids and attention_mask keep their batch dimension: shape (1, 10)
labels = torch.tensor([0])  # Example: binary classification with label 0
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)  # small learning rate, typical for fine-tuning BERT

# Training loop
model.train()
for epoch in range(10):
    optimizer.zero_grad()

    output = model(input_ids, attention_mask)
    loss = criterion(output, labels)

    loss.backward()
    optimizer.step()

    print(f"Epoch {epoch+1} - Loss: {loss.item()}")

# Example of using trained BERT model for prediction
test_sentence = "I hate BERT!"
test_tokens = tokenizer.encode_plus(test_sentence, add_special_tokens=True, padding='max_length', truncation=True, max_length=10, return_tensors='pt')
test_input_ids = test_tokens['input_ids']  # shape (1, 10); keep the batch dimension
test_attention_mask = test_tokens['attention_mask']
model.eval()  # disable dropout for inference
with torch.no_grad():
    test_output = model(test_input_ids, test_attention_mask)
    predicted_label = torch.argmax(test_output, dim=1).item()
print(f"Predicted label: {predicted_label}")

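This example uses Hugging Face's transformers library to load the pretrained BERT model and tokenizer. It then defines a custom model made of the BERT encoder (bert_model) followed by a linear layer that maps the 768-dimensional [CLS] representation to two class scores.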
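
To make the tensor shapes concrete, here is a minimal sketch that runs the same encoder-plus-linear-head pattern on a tiny, randomly initialized BERT. The config sizes (hidden size 32, one layer, vocabulary of 100) and the names `tiny_bert` and `head` are assumptions chosen only so that nothing has to be downloaded; bert-base-uncased uses a hidden size of 768, and random token ids stand in for real tokenizer output.

```python
import torch
import torch.nn as nn
from transformers import BertConfig, BertModel

# Tiny, randomly initialized BERT. These sizes are illustrative assumptions,
# not the real bert-base-uncased configuration.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)
tiny_bert = BertModel(config)
tiny_bert.eval()

# Classification head: hidden size -> 2 classes
head = nn.Linear(config.hidden_size, 2)

# A fake batch of 3 sequences with 10 token ids each
input_ids = torch.randint(0, config.vocab_size, (3, 10))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    outputs = tiny_bert(input_ids=input_ids, attention_mask=attention_mask)
    hidden = outputs[0]          # last hidden state: (batch, seq_len, hidden)
    cls_vec = hidden[:, 0, :]    # first token ([CLS]) per sequence: (batch, hidden)
    logits = head(cls_vec)       # class scores: (batch, 2)

print(hidden.shape, cls_vec.shape, logits.shape)
```

The batch dimension is always present: even a single sentence is passed as a tensor of shape (1, seq_len), which is what `return_tensors='pt'` produces.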

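Training uses the cross-entropy loss and the Adam optimizer. In each epoch, the input is passed through BERT and the linear layer, the loss is computed from the output, and the model weights are updated. The loop trains on a single sentence, so it only demonstrates the mechanics; a real task needs a labeled dataset.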
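
One detail worth checking in any setup like this: `nn.CrossEntropyLoss` applies log-softmax internally, so it expects raw logits rather than probabilities. The sketch below uses made-up logits and labels to show that the loss equals `nll_loss(log_softmax(logits))`, and that passing already-softmaxed values gives a different, compressed loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Made-up raw scores for a batch of 2 examples and 2 classes
logits = torch.tensor([[2.0, -1.0], [0.5, 1.5]])
labels = torch.tensor([0, 1])

criterion = nn.CrossEntropyLoss()

# Intended usage: feed raw logits
loss_from_logits = criterion(logits, labels)

# Equivalent manual computation: log-softmax followed by negative log-likelihood
manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

# Common mistake: feed probabilities, so softmax is effectively applied twice
loss_from_probs = criterion(F.softmax(logits, dim=1), labels)

print(loss_from_logits.item(), manual.item(), loss_from_probs.item())
```

With two classes, probabilities differ by at most 1, so a loss computed from probabilities can never fall below log(1 + e^-1) ≈ 0.313, no matter how confident the model is. That weakens the gradient signal, which is why a classifier trained with this loss should return logits.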

For prediction, the input sentence is encoded with the tokenizer and passed to the model; argmax over the output then gives the most likely label.
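
If a confidence value is wanted alongside the label, softmax can be applied at inference time. The following is a minimal sketch, assuming the model returns raw logits of shape (batch, 2); the helper name `logits_to_predictions` and the example logits are hypothetical and not part of the original code.

```python
import torch
import torch.nn.functional as F

def logits_to_predictions(logits):
    """Turn a (batch, num_classes) tensor of raw logits into
    a list of (class_id, confidence) pairs."""
    probs = F.softmax(logits, dim=1)            # each row sums to 1
    confidences, class_ids = probs.max(dim=1)   # best class per row
    return list(zip(class_ids.tolist(), confidences.tolist()))

# Made-up logits standing in for model output on two sentences
example_logits = torch.tensor([[1.2, -0.3], [-2.0, 0.5]])
preds = logits_to_predictions(example_logits)

for class_id, confidence in preds:
    print(f"Predicted label: {class_id} (confidence {confidence:.3f})")
```

Softmax does not change which class has the highest score, so argmax on the logits and argmax on the probabilities always agree; the probabilities are only needed when the confidence itself is reported.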

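Before running the code, make sure PyTorch and the transformers library are installed, for example with pip install torch transformers. The pretrained weights (bert-base-uncased) are downloaded and cached automatically the first time from_pretrained is called, so the first run needs network access.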

Original article: https://blog.csdn.net/Metal1/article/details/132890852