• AI实战营第二期 第十节 《MMagic 代码课》——笔记11


    AI实战营第二期 第十节 《MMagic 代码课》

    MMagic (Multimodal Advanced, Generative, and Intelligent Creation) 是一个供专业人工智能研究人员和机器学习工程师去处理、编辑和生成图像与视频的开源 AIGC 工具箱。
    在这里插入图片描述

    MMagic 允许研究人员和工程师使用最先进的预训练模型,并且可以轻松训练和开发新的定制模型。

    MMagic 支持各种基础生成模型,包括:

    • 无条件生成对抗网络 (GANs)

    • 条件生成对抗网络 (GANs)

    • 内部学习

    • 扩散模型

    • 还有许多其他生成模型即将推出!

    MMagic 支持各种应用程序,包括:

    • 图文生成

    • 图像翻译

    • 3D 生成

    • 图像超分辨率

    • 视频超分辨率

    • 视频插帧

    • 图像补全

    • 图像抠图

    • 图像修复

    • 图像上色

    • 图像生成

    • 还有许多其他应用程序即将推出!

    在这里插入图片描述

    【课程链接】https://www.bilibili.com/video/BV1gM4y1n7vP/
    【讲师介绍】张子豪 OpenMMLab算法工程师

    OpenMMLab 生成模型+底层视觉+AIGC+多模态 算法库 MMagic
    MMagic主页:https://github.com/open-mmlab/mmagic
    代码教程:https://github.com/TommyZihao/MMagic_Tutorials
    中文文档:https://mmagic.readthedocs.io/zh_CN/latest/

    【代码教程目录】
    安装配置MMagic环境
    黑白老照片上色
    文生图-Stable Diffusion
    文生图-Dreambooth
    图生图-ControlNet

    安装配置MMagic

    安装Pytorch

    !pip3 install install torch==1.10.1+cu113 torchvision==0.11.2+cu113 torchaudio==0.10.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
    
    • 1

    安装MMCV和MMEngine环境

    [2]

    !pip3 install openmim
    !mim install 'mmcv>=2.0.0'
    !mim install 'mmengine'
    
    • 1
    • 2
    • 3

    安装MMagic

    !mim install 'mmagic'
    从源码安装MMagic
    !rm -rf mmagic # 删除原有的 mmagic 文件夹(如有)
    !git clone https://github.com/open-mmlab/mmagic.git # 下载 mmagic 源代码
    
    • 1
    • 2
    • 3
    • 4
    import os
    os.chdir('mmagic')
    !pip3 install -e .
    
    • 1
    • 2
    • 3

    检查安装成功

    # 检查 Pytorch
    import torch, torchvision
    print('Pytorch 版本', torch.__version__)
    print('CUDA 是否可用',torch.cuda.is_available())
    
    • 1
    • 2
    • 3
    • 4

    [

    检查 mmcv

    import mmcv
    from mmcv.ops import get_compiling_cuda_version, get_compiler_version
    print('MMCV版本', mmcv.__version__)
    print('CUDA版本', get_compiling_cuda_version())
    print('编译器版本', get_compiler_version())
    
    • 1
    • 2
    • 3
    • 4
    • 5

    检查 mmagic

    import mmagic
    print('MMagic版本', mmagic.__version__)
    MMagic版本 1.0.2dev0
    
    • 1
    • 2
    • 3

    安装其它工具包

    !pip install opencv-python pillow matplotlib seaborn tqdm -i https://pypi.tuna.tsinghua.edu.cn/simple
    !pip install clip transformers gradio 'httpx[socks]' diffusers==0.14.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
    !mim install 'mmdet>=3.0.0'
    
    • 1
    • 2
    • 3

    黑白照片上色

    进入 MMagic 主目录

    import os
    os.chdir('mmagic')
    
    • 1
    • 2

    下载样例图片

    [2]

    !wget https://zihao-openmmlab.obs.cn-east-3.myhuaweicloud.com/20230613-MMagic/data/test_colorization.jpg -O test_colorization.jpg
    
    • 1

    运行预测

    [3]
    !python demo/mmagic_inference_demo.py
    –model-name inst_colorization
    –img test_colorization.jpg
    –result-out-dir out_colorization.png

    文生图-Stable Diffusion

    from mmagic.apis import MMagicInferencer
    sd_inferencer = MMagicInferencer(model_name='stable_diffusion')
    text_prompts = 'A panda is having dinner at KFC'
    
    text_prompts = 'A Persian cat walking in the streets of New York'
    
    sd_inferencer.infer(text=text_prompts, low_cpu_mem_usage=True,result_out_dir='output/sd_res.png')
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7

    在这里插入图片描述

    文生图-Dreambooth

    新建文件夹data/dreambooth/imgs/

    在这里插入图片描述
    修改config/dreambooth/文件夹中的dreambooth-lora.py脚本
    在这里插入图片描述

    dataset = dict(
        type='DreamBoothDataset',
        data_root='./data/dreambooth',
        # TODO: rename to instance
        concept_dir='imgs',
        prompt='a photo of gril',
        pipeline=pipeline)
    
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8

    然后执行命令:

    !bash tools/dist_train.sh configs/dreambooth/dreambooth-lora.py 1
    
    • 1

    用训练好的模型做预测

    from mmengine import Config
    
    from mmagic.registry import MODELS
    from mmagic.utils import register_all_modules
    
    register_all_modules()
    cfg = Config.fromfile('./mmagic/configs/dreambooth/dreambooth-lora.py')
    dreambooth_lora = MODELS.build(cfg.model)
    
    state = torch.load('mmagic/work_dirs/dreambooth-lora/iter_1000.pth')['state_dict']
    
    def convert_state_dict(state):
        state_dict_new = {}
        for k, v in state.items():
            if '.module' in k:
                k_new = k.replace('.module', '')
            else:
                k_new = k
            if 'vae' in k:
                if 'to_q' in k:
                    k_new = k.replace('to_q', 'query')
                elif 'to_k' in k:
                    k_new = k.replace('to_k', 'key')
                elif 'to_v' in k:
                    k_new = k.replace('to_v', 'value')
                elif 'to_out' in k:
                    k_new = k.replace('to_out.0', 'proj_attn')
            state_dict_new[k_new] = v
        return state_dict_new
    dreambooth_lora.load_state_dict(convert_state_dict(state))
    dreambooth_lora = dreambooth_lora.cuda()
    samples = dreambooth_lora.infer('side view of gril', guidance_scale=5)
    samples['samples'][0]
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33

    请添加图片描述

    图生图-ControlNet-Canny

    导入工具包

    import cv2
    import numpy as np
    import mmcv
    from mmengine import Config
    from PIL import Image
    
    from mmagic.registry import MODELS
    from mmagic.utils import register_all_modules
    
    register_all_modules()
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10

    载入ControlNet模型

    cfg = Config.fromfile('configs/controlnet/controlnet-canny.py')
    controlnet = MODELS.build(cfg.model).cuda()
    
    • 1
    • 2

    输入Canny边缘图

    control_url = 'https://user-images.githubusercontent.com/28132635/230288866-99603172-04cb-47b3-8adb-d1aa532d1d2c.jpg'
    control_img = mmcv.imread(control_url)
    control = cv2.Canny(control_img, 100, 200)
    control = control[:, :, None]
    control = np.concatenate([control] * 3, axis=2)
    control = Image.fromarray(control)
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6

    咒语Prompt

    prompt ='Room with blue walls and a yellow ceiling.'
    
    • 1

    执行预测

    output_dict = controlnet.infer(prompt, control=control)
    samples = output_dict['samples']
    for idx, sample in enumerate(samples):
        sample.save(f'sample_{idx}.png')
    controls = output_dict['controls']
    for idx, control in enumerate(controls):
        control.save(f'control_{idx}.png')
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7

    图生图-ControlNet-Pose

    import mmcv
    from mmengine import Config
    from PIL import Image
    
    from mmagic.registry import MODELS
    from mmagic.utils import register_all_modules
    
    register_all_modules()
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8

    载入ControlNet模型

    cfg = Config.fromfile('configs/controlnet/controlnet-pose.py')
    # convert ControlNet's weight from SD-v1.5 to Counterfeit-v2.5
    cfg.model.unet.from_pretrained = 'gsdf/Counterfeit-V2.5'
    cfg.model.vae.from_pretrained = 'gsdf/Counterfeit-V2.5'
    cfg.model.init_cfg['type'] = 'convert_from_unet'
    controlnet = MODELS.build(cfg.model).cuda()
    # call init_weights manually to convert weight
    controlnet.init_weights()
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8

    咒语Prompt

    prompt = 'masterpiece, best quality, sky, black hair, skirt, sailor collar, looking at viewer, short hair, building, bangs, neckerchief, long sleeves, cloudy sky, power lines, shirt, cityscape, pleated skirt, scenery, blunt bangs, city, night, black sailor collar, closed mouth'
    
    
    • 1
    • 2

    输入Pose图

    control_url = 'https://user-images.githubusercontent.com/28132635/230380893-2eae68af-d610-4f7f-aa68-c2f22c2abf7e.png'
    control_img = mmcv.imread(control_url)
    control = Image.fromarray(control_img)
    control.save('control.png')
    
    • 1
    • 2
    • 3
    • 4

    在这里插入图片描述

    执行预测

    output_dict = controlnet.infer(prompt, control=control, width=512, height=512, guidance_scale=7.5)
    samples = output_dict['samples']
    for idx, sample in enumerate(samples):
        sample.save(f'sample_{idx}.png')
    controls = output_dict['controls']
    for idx, control in enumerate(controls):
        control.save(f'control_{idx}.png')
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7

    图生图-ControlNet Animation

    方式一:Gradio命令行

    !python demo/gradio_controlnet_animation.py
    
    • 1

    点击URL,打开Gradio在线交互式网站,上传视频,执行预测
    方式二:MMagic API

    # 导入工具包
    from mmagic.apis import MMagicInferencer
    
    # Create a MMEdit instance and infer
    editor = MMagicInferencer(model_name='controlnet_animation')
    # 指定 prompt 咒语
    prompt = 'a girl, black hair, T-shirt, smoking, best quality, extremely detailed'
    negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality'
    
    # 待测视频
    # https://user-images.githubusercontent.com/12782558/227418400-80ad9123-7f8e-4c1a-8e19-0892ebad2a4f.mp4
    video = '../run_forrest_frames_rename_resized.mp4'
    save_path = '../output_video.mp4'
    # 执行预测
    editor.infer(video=video, prompt=prompt, image_width=512, image_height=512, negative_prompt=negative_prompt, save_path=save_path)
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15

    训练自己的ControlNet

    下载数据集

    !rm -rf fill50k.zip fill50k
    !wget https://huggingface.co/lllyasviel/ControlNet/blob/main/training/fill50k.zip
    !unzip fill50k.zip >> /dev/null # 解压压缩包
    !rm -rf fill50k.zip # 删除压缩包
    
    • 1
    • 2
    • 3
    • 4

    训练

    !bash tools/dist_train.sh configs/controlnet/controlnet-1xb1-demo_dataset 1
    
    • 1
    from mmagic.apis import MMagicInferencer
    import matplotlib.pyplot as plt
    sd_inferencer = MMagicInferencer(model_name='stable_diffusion')
    
    import cv2
    import numpy as np
    import mmcv
    from mmengine import Config
    from PIL import Image
    
    from mmagic.registry import MODELS
    from mmagic.utils import register_all_modules
    
    register_all_modules()
    cfg = Config.fromfile('configs/controlnet/controlnet-canny.py')
    controlnet = MODELS.build(cfg.model).cuda()
    control_img = mmcv.imread('11.JPG')
    control = cv2.Canny(control_img, 100, 200)
    control = control[:, :, None]
    control = np.concatenate([control] * 3, axis=2)
    control = Image.fromarray(control)
    plt.subplot(121)
    plt.imshow(control_img)
    plt.subplot(122)
    plt.imshow(control)
    plt.show()
    prompt = 'Make this room full of warmth.'
    output_dict = controlnet.infer(prompt, control=control)
    samples = output_dict['samples']
    for idx, sample in enumerate(samples):
        sample.save(f'sample_{idx}.png')
    controls = output_dict['controls']
    for idx, control in enumerate(controls):
        control.save(f'control_{idx}.png')
    
    plt.subplot(121)
    plt.imshow(control_img)
    plt.subplot(122)
    sample_0 = mmcv.imread('./sample_0.png')
    plt.imshow(sample_0)
    plt.show()
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33
    • 34
    • 35
    • 36
    • 37
    • 38
    • 39
    • 40
    • 41

    请添加图片描述

    在这里插入图片描述

    在这里插入图片描述

    在这里插入图片描述

  • 相关阅读:
    【乐吾乐2D可视化组态编辑器】开关、阀门、报警状态切换
    123456
    Python——第6章 Numpy库的使用
    国密国际SSL双证书解决方案,满足企事业单位国产国密SSL证书要求
    django的csrf跨站请求伪造
    【初试396分】西北工业大学827学长经验分享
    基于PyTorch机器学习与深度学习实践应用与案例分析
    【CSH 入门基础 8 -- csh 中 set 与 setenv 的区别 】
    搞懂TypeScript的类型声明
    【每日一题】34. 在排序数组中查找元素的第一个和最后一个位置
  • 原文地址:https://blog.csdn.net/hhhhhhhhhhwwwwwwwwww/article/details/131251962