• HuggingFace: Using Accelerate


    Overview

    🤗 Accelerate is a library that enables the same PyTorch code to be run across any distributed configuration by adding just four lines of code! In short, training and inference at scale made simple, efficient and adaptable.

    Demo

    # Lines prefixed with + are added when using accelerate; lines prefixed with - are removed
    + from accelerate import Accelerator
     from transformers import AdamW, AutoModelForSequenceClassification, get_scheduler
    
    + accelerator = Accelerator()
    
     model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
     optimizer = AdamW(model.parameters(), lr=3e-5)
    
    - device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
    - model.to(device)
    
    + train_dataloader, eval_dataloader, model, optimizer = accelerator.prepare(
    +     train_dataloader, eval_dataloader, model, optimizer
    + )
    
     num_epochs = 3
     num_training_steps = num_epochs * len(train_dataloader)
     lr_scheduler = get_scheduler(
         "linear",
         optimizer=optimizer,
         num_warmup_steps=0,
         num_training_steps=num_training_steps
     )
    
     progress_bar = tqdm(range(num_training_steps))
    
     model.train()
     for epoch in range(num_epochs):
         for batch in train_dataloader:
    -         batch = {k: v.to(device) for k, v in batch.items()}
             outputs = model(**batch)
             loss = outputs.loss
    -         loss.backward()
    +         accelerator.backward(loss)
    
             optimizer.step()
             lr_scheduler.step()
             optimizer.zero_grad()
             progress_bar.update(1)
    

    In short, an `Accelerator` object is added to manage distributed training: the dataloaders, model, and optimizer are wrapped by `accelerator.prepare()`, and `loss.backward()` becomes `accelerator.backward(loss)`.
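To make that division of labor concrete, here is a minimal pure-Python sketch of what the `Accelerator` pattern above abstracts away. This is a toy stand-in, not the real library: `ToyAccelerator`, `FakeModel`, and `FakeLoss` are hypothetical names for illustration. The real `accelerate.Accelerator` additionally sets up the distributed processes, shards data, and applies mixed-precision loss scaling.

```python
# Conceptual sketch (NOT the real accelerate library): mimics the shape of
# the Accelerator API from the demo above to show what it abstracts away.

class ToyAccelerator:
    def __init__(self, device="cpu"):
        # The real Accelerator detects the distributed setup automatically;
        # here we just fix a device name for illustration.
        self.device = device

    def prepare(self, *objects):
        # The real prepare() wraps dataloaders for per-process sharding and
        # wraps the model for DDP when needed. Here we only move
        # device-aware objects (those with a .to() method) to self.device.
        return tuple(self._place(obj) for obj in objects)

    def _place(self, obj):
        if hasattr(obj, "to"):   # model-like objects
            return obj.to(self.device)
        return obj               # dataloaders/optimizers pass through

    def backward(self, loss):
        # The real backward() is where gradient scaling (mixed precision)
        # and gradient accumulation hook in before calling loss.backward().
        loss.backward()


# Tiny fakes standing in for a model and a loss tensor:
class FakeModel:
    def __init__(self):
        self.device = None

    def to(self, device):
        self.device = device
        return self


class FakeLoss:
    def __init__(self):
        self.called = False

    def backward(self):
        self.called = True


acc = ToyAccelerator(device="cuda:0")
(model,) = acc.prepare(FakeModel())   # model is now "on" cuda:0
loss = FakeLoss()
acc.backward(loss)                    # dispatches to loss.backward()
```

The point of the pattern is that the training loop itself no longer mentions devices: everything device-specific is concentrated in `prepare()` and `backward()`, which is exactly why the `batch.to(device)` lines disappear from the diff above.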

    Installation & Configuration

    Installation and configuration follow the official documentation. Configuration is done by answering a series of questions in the terminal (via `accelerate config`); this generates a YAML file named default_config.yaml, saved under the `~/.cache/huggingface/accelerate` directory in your home directory.
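For reference, a single-machine multi-GPU setup might produce a `default_config.yaml` roughly like the fragment below. The exact keys depend on the Accelerate version and the answers given during `accelerate config`, so treat this as an illustrative sketch rather than the canonical file:

```yaml
compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU
mixed_precision: fp16
num_machines: 1
num_processes: 2
machine_rank: 0
main_training_function: main
use_cpu: false
```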

    After configuration, you can run `accelerate env [--config_file] [config_file_name]` to verify that the configuration file is valid.

    Sample `accelerate env` output (with the default configuration):

    - `Accelerate` version: 0.11.0.dev0
    - Platform: Linux-5.10.0-15-cloud-amd64-x86_64-with-debian-11.3
    - Python version: 3.7.12
    - Numpy version: 1.19.5
    - PyTorch version
  • Original article: https://blog.csdn.net/c___c18/article/details/127616417