PandaGPT Deployment Demo
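PandaGPT is a general-purpose instruction-following model that can both see and hear. Experiments show that PandaGPT can perform complex tasks such as generating detailed image descriptions, writing stories inspired by videos, and answering questions about audio. More interestingly, PandaGPT can take multimodal inputs simultaneously and compose their semantics naturally. For example, it can connect how objects look in a photo with how they sound in an audio clip. To do so, PandaGPT combines the multimodal encoders from ImageBind with the large language model Vicuna. Notably, although PandaGPT demonstrates impressive cross-modal capabilities across six modalities (text, image/video, audio, depth, thermal, and IMU), it is trained only on aligned image-text pairs, thanks to the shared embedding space provided by ImageBind. The authors hope PandaGPT will serve as a first step toward building AGI that can perceive and understand inputs in different modalities holistically, as humans do.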

GitHub: https://github.com/yxuansu/PandaGPT (PandaGPT: One Model To Instruction-Follow Them All)

1. Set up the environment
```
conda create -n pandagpt python=3.9
conda activate pandagpt
```
Create a conda virtual environment named pandagpt with Python 3.9, then activate it.
```
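# Clone into the directory used throughout this post
git clone https://github.com/yxuansu/PandaGPT.git /content/drive/MyDrive/PandaGPT
cd /content/drive/MyDrive/PandaGPT
# Install the project dependencies, then the CUDA 11.7 build of PyTorch 1.13.1
pip install -r requirements.txt
pip install torch==1.13.1+cu117 -f https://download.pytorch.org/whl/torch/
# Gradio and mdtex2html for the web demo
pip install gradio mdtex2html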
```
Download the PandaGPT source code and install its dependencies.
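Before moving on, it can help to confirm that the packages installed above are actually importable from the active environment. The following is a minimal sketch, not part of PandaGPT: `missing_packages` is a hypothetical helper, and the list contains only the packages installed explicitly in this step.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names of the packages installed explicitly in the step above.
REQUIRED = ["torch", "gradio", "mdtex2html"]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("missing:", ", ".join(missing) if missing else "none")
```

If anything is reported as missing, the most likely cause is that the pandagpt environment was not active when pip ran.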

2. Download the models
```
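# curl -o does not create missing directories, so make the target directory first
mkdir -p /content/drive/MyDrive/imagebind_ckpt
curl -L https://dl.fbaipublicfiles.com/imagebind/imagebind_huge.pth -o /content/drive/MyDrive/imagebind_ckpt/_huge.pth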
```
Prepare the ImageBind checkpoint and save it as /content/drive/MyDrive/imagebind_ckpt/_huge.pth

Prepare the Vicuna checkpoint and save it to /content/drive/MyDrive/vicuna_ckpt/7b_v0/vicuna-7b-v0. For details, see the CSDN blog post by Spielberg_1 on merging Llama-7b-hf and vicuna-7b-delta-v0 into vicuna-7b-v0.
```
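# Python script; requires the huggingface_hub package (shell: pip install huggingface_hub)
from huggingface_hub import snapshot_download

save_dir = "/content/drive/MyDrive/pandagpt_ckpt/7b"
repo_id = "openllmplayground/pandagpt_7b_max_len_1024"

# Download every file in the repo into save_dir as real files, not symlinks
snapshot_download(repo_id=repo_id, local_dir=save_dir, local_dir_use_symlinks=False)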
```
Download the PandaGPT weights and save them to /content/drive/MyDrive/pandagpt_ckpt/7b

You can also download the model files manually from the Hugging Face website.
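With three separate downloads, a wrong or missing path is easy to overlook until the demo fails to start. Here is a small sketch that checks whether each location from this section exists. `check_checkpoints` is a hypothetical helper, and the paths are the ones used in this post, so adjust them to your own layout.

```python
from pathlib import Path


def check_checkpoints(expected):
    """Map each label to True if its path exists (file or directory), else False."""
    return {label: Path(path).exists() for label, path in expected.items()}


# Paths exactly as used in this post; change them if your layout differs.
EXPECTED = {
    "imagebind": "/content/drive/MyDrive/imagebind_ckpt/_huge.pth",
    "vicuna": "/content/drive/MyDrive/vicuna_ckpt/7b_v0/vicuna-7b-v0",
    "pandagpt": "/content/drive/MyDrive/pandagpt_ckpt/7b",
}

for label, found in check_checkpoints(EXPECTED).items():
    print(f"{label:10s} {'found' if found else 'MISSING'}")
```

This only checks that the paths exist; it does not verify that the downloads are complete or uncorrupted.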

3. Deploy the demo
```
cd /content/drive/MyDrive/PandaGPT/code
```
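Change into the PandaGPT code directory, open /PandaGPT/code/web_demo.py, and make the following changes.

Set imagebind_ckpt_path and vicuna_ckpt_path to the locations of your own checkpoints.

Set delta_ckpt_path to the PandaGPT weights; this path must end in the .pt file itself.

In the last line, change the share argument to True so that a shareable link is generated and the demo can be opened in a browser.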
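The three path settings can be sketched as follows. This is not the content of web_demo.py: `build_ckpt_args` is a hypothetical helper that only assembles the values to paste into the script. The key names come from the text above; treating imagebind_ckpt_path as a directory is an assumption to check against the script. Because the .pt filename is not given here, the sketch searches for it rather than guessing.

```python
from pathlib import Path


def build_ckpt_args(root):
    """Assemble the three checkpoint paths named above, relative to `root`."""
    root = Path(root)
    # The .pt filename is not stated in this post, so search for it.
    pt_files = sorted((root / "pandagpt_ckpt" / "7b").rglob("*.pt"))
    return {
        # Assumption: the script expects the ImageBind checkpoint directory.
        "imagebind_ckpt_path": str(root / "imagebind_ckpt"),
        "vicuna_ckpt_path": str(root / "vicuna_ckpt" / "7b_v0" / "vicuna-7b-v0"),
        # None means no .pt file was found under pandagpt_ckpt/7b.
        "delta_ckpt_path": str(pt_files[0]) if pt_files else None,
    }


for key, value in build_ckpt_args("/content/drive/MyDrive").items():
    print(f"{key} = {value}")
```

If delta_ckpt_path prints as None, the PandaGPT weights were not downloaded to the expected directory.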

Now the demo is finally ready to run.
```
cd /content/drive/MyDrive/PandaGPT/code
CUDA_VISIBLE_DEVICES=0 python web_demo.py
```
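The command above sets CUDA_VISIBLE_DEVICES=0 so that the process only sees the first GPU. If you would rather start the demo from a Python script, the same launch can be sketched with subprocess. `build_launch` and `launch_demo` are hypothetical helpers, and the directory is the one used in this post.

```python
import os
import subprocess
import sys


def build_launch(gpu_ids="0", script="web_demo.py"):
    """Return (command, environment) mirroring `CUDA_VISIBLE_DEVICES=<ids> python web_demo.py`."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = gpu_ids  # restrict the child process to these GPUs
    return [sys.executable, script], env


def launch_demo(code_dir="/content/drive/MyDrive/PandaGPT/code"):
    """Run the web demo from `code_dir`; blocks until the server exits."""
    cmd, env = build_launch()
    subprocess.run(cmd, cwd=code_dir, env=env, check=True)
```

Calling `launch_demo()` starts the server and blocks until it stops. Using `sys.executable` keeps the child process in the same Python environment as the caller.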
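Let's look at the result. Open the link and the page shows four upload boxes: image, audio, video, and thermal image, in that order.

Upload time depends on the network. An image may take a few seconds, and audio and video take somewhat longer. An upload-complete indicator would be helpful (hopefully a future update adds one).

Type your question in the [input] box and click [submit]. Once inference finishes, the answer appears in the [chatbot] panel.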

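Upload a picture of a cat and ask: "What color is this?" Answer: gray and white.

Upload a recording of thunder and ask: "What sound is this?"

Upload a picture of a hamburger and ask: "Name two manufacturers."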

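Summary: overall, PandaGPT is trained only on image-text pairs, yet it accepts input from other modalities and generates text in response, which is a pleasant surprise. Thanks to the authors for sharing it; I look forward to further improvements.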

Problems encountered:

ImportError: LlamaTokenizer requires the SentencePiece library but it was not found in your environment. Checkout the instructions on the installation page of its repo: https://github.com/google/sentencepiece#installation and follow the ones that match your environment. Please note that you may need to restart your runtime after installation.
```
pip install sentencepiece
```
/mnt/PandaGPT/code/web_demo.py:154: GradioUnusedKwargWarning: You have unused kwarg parameters in Blocks, please remove them: {'scale': 4}
with gr.Blocks(scale=4) as demo:
/mnt/PandaGPT/code/web_demo.py:157: GradioUnusedKwargWarning: You have unused kwarg parameters in Row, please remove them: {'scale': 4}
with gr.Row(scale=4):
/mnt/PandaGPT/code/web_demo.py:163: GradioUnusedKwargWarning: You have unused kwarg parameters in Video, please remove them: {'type': 'file'}
video_path = gr.Video(type='file', label="Video")
/mnt/PandaGPT/code/web_demo.py:167: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
chatbot = gr.Chatbot().style(height=300)
/mnt/PandaGPT/code/web_demo.py:171: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
user_input = gr.Textbox(show_label=False, placeholder="Input...", lines=10).style(container=False)
/mnt/PandaGPT/code/web_demo.py:173: GradioUnusedKwargWarning: You have unused kwarg parameters in Row, please remove them: {'scale': 1}
with gr.Row(scale=1):
/mnt/PandaGPT/code/web_demo.py:175: GradioUnusedKwargWarning: You have unused kwarg parameters in Row, please remove them: {'scale': 1}
with gr.Row(scale=1):
Running on local URL: http://0.0.0.0:24000

To create a public link, set `share=True` in `launch()`.
Error: no DISPLAY environment variable specified

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB (GPU 0; 23.70 GiB total capacity; 7.16 GiB already allocated; 12.69 MiB free; 7.22 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

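PyTorch has reserved only about 7 GiB of the 23.70 GiB on this GPU, yet almost nothing is free, so other processes are holding the rest. Check which processes are occupying GPU memory and kill them.

Reference: 李卓璐's CSDN blog post, "Running Alpaca on Ubuntu fails with torch.cuda.OutOfMemoryError: CUDA out of memory. Fix (check CUDA usage and clear the GPU cache)".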
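To see which processes are holding GPU memory, nvidia-smi can print one CSV row per compute process. The sketch below wraps that query and parses its output. The helper names are hypothetical, and it assumes nvidia-smi is on PATH when `gpu_processes` is called.

```python
import subprocess

# One CSV row per compute process: pid, process name, used memory in MiB.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(text):
    """Parse nvidia-smi CSV rows into (pid, process_name, used_mib) tuples."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        pid, rest = line.split(",", 1)
        name, mem = rest.rsplit(",", 1)
        mem = mem.strip()
        # Memory can be reported as a non-numeric placeholder; keep it as None.
        rows.append((int(pid), name.strip(), int(mem) if mem.isdigit() else None))
    return rows


def gpu_processes():
    """Run the query and return the parsed rows."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_compute_apps(result.stdout)
```

Call `gpu_processes()` on the GPU machine, decide which PIDs are stale, and stop them with `kill <pid>` in the shell. Make sure a PID really belongs to a leftover job before killing it.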

FileNotFoundError: [Errno 2] No such file or directory: 'ffprobe'
```
# ffprobe ships with FFmpeg itself, which pip does not install; use the system package
sudo apt-get install -y ffmpeg
```
Thanks to:
FileNotFoundError: [Errno 2] No such file or directory: 'ffprobe': 'ffprobe' | 那些遇到过的问题

python - FileNotFoundError: No such file or directory: 'ffprobe' FileNotFoundError: No such file or directory: 'ffmpeg' - Stack Overflow

python - FileNotFoundError: [Errno 2] No such file or directory: 'ffprobe': 'ffprobe' - Stack Overflow
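The ffprobe error means Python tried to launch an executable named ffprobe and could not find it on PATH. A quick way to confirm whether the FFmpeg tools are visible to the current environment is sketched below; `missing_tools` is a hypothetical helper.

```python
import shutil


def missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the executables from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
print("missing:", ", ".join(missing) if missing else "none")
```

If ffprobe is still reported as missing after installation, the install location is probably not on the PATH of the process that runs the demo.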

Our pilot experiments show that 我们的实验显示

reason over 推理 reasons over the user's input 对用户的输入进行推理

a set of 一套

understanding/reasoning 理解/推理

knowledge-grounded description 基于知识的描述

multi-turn conversation 多轮对话

general-purpose 多用途，多功能

pilot experiments 中试试验，发布之前的实验

perform complex tasks 执行复杂任务

image description generation 生成图像描述

writing stories inspired by videos 创作受视频启发故事

answering questions about audios 回答关于音频的问题

take multimodal input simultaneously 同时接收多种模式的输入
Original article: https://blog.csdn.net/qq_18815817/article/details/132639674