efficientsam-pytorch: inference with point, box, and segment-everything prompts

EfficientSAM

Paper

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
- https://arxiv.org/abs/2312.00863
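Model Architecture

EfficientSAM relies on SAM-leveraged masked image pretraining (SAMI), which learns to reconstruct features from the SAM image encoder for effective visual representation learning. The SAMI-pretrained lightweight image encoder is then combined with a mask decoder to build the EfficientSAM models, which are fine-tuned on the SA-1B dataset for the segment anything task. EfficientSAM-S reduces SAM's inference time by about 20x and its parameter count by about 20x, with only a small drop in performance.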

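Algorithm

The model is built in two stages: SAMI pretraining on ImageNet and SAM fine-tuning on SA-1B. EfficientSAM's core components are a cross-attention decoder, a linear projection head, and a reconstruction loss.
- Cross-attention decoder: under supervision from the SAM features, the decoder reconstructs the masked tokens, while the encoder output serves as the anchor for reconstruction. The decoder queries come from the masked tokens, and the keys and values come from the encoder and the unmasked features. The encoder and decoder output features are merged to form the MAE output embedding and reordered to their original positions in the image.
- Linear projection head: the encoder and decoder output features are fed into a linear projection head, which aligns them with the SAM image encoder features and resolves the feature dimension mismatch.
- Reconstruction loss: in each training iteration, SAMI consists of a forward pass through the SAM image encoder to extract features, plus a forward and backward pass through the MAE. The reconstruction loss is computed by comparing the output of the SAM image encoder with the output of the MAE linear projection head.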
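
As a rough illustration of how the three components fit together, the sketch below runs one toy SAMI step on plain Python lists. It is not the authors' implementation: the helper names (`cross_attention`, `merge_tokens`, `linear_head`, `reconstruction_loss`) are hypothetical, attention is single-head with no learned weights, the keys and values are taken from the encoder output only, and the loss is assumed to be a mean squared error over tokens.

```python
import math

def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention (no learned weights)."""
    scale = math.sqrt(len(queries[0]))
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return outputs

def merge_tokens(encoder_out, decoder_out, unmasked_idx, masked_idx):
    """Put encoder (unmasked) and decoder (masked) features back in image order."""
    merged = [None] * (len(unmasked_idx) + len(masked_idx))
    for pos, feat in zip(unmasked_idx, encoder_out):
        merged[pos] = feat
    for pos, feat in zip(masked_idx, decoder_out):
        merged[pos] = feat
    return merged

def linear_head(tokens, weight):
    """Project every token from dim d to dim t with a (d x t) weight matrix."""
    return [
        [sum(tok[i] * weight[i][j] for i in range(len(tok)))
         for j in range(len(weight[0]))]
        for tok in tokens
    ]

def reconstruction_loss(predicted, target):
    """Squared L2 distance per token, averaged over tokens (assumed MSE form)."""
    per_token = [
        sum((p - t) ** 2 for p, t in zip(pr, tg))
        for pr, tg in zip(predicted, target)
    ]
    return sum(per_token) / len(per_token)

# Toy example: 4 tokens of dim 2; tokens 1 and 3 are masked.
encoder_out = [[1.0, 0.0], [0.0, 1.0]]    # features of unmasked tokens 0 and 2
mask_queries = [[0.5, 0.5], [0.5, 0.5]]   # placeholder queries for tokens 1 and 3
decoder_out = cross_attention(mask_queries, encoder_out, encoder_out)
merged = merge_tokens(encoder_out, decoder_out, unmasked_idx=[0, 2], masked_idx=[1, 3])
projected = linear_head(merged, weight=[[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])  # dim 2 -> 3
sam_features = [[0.0, 0.0, 0.0]] * 4      # stand-in for the SAM encoder output
loss = reconstruction_loss(projected, sam_features)
print(f"reconstruction loss: {loss:.4f}")
```

In a real training loop the SAM encoder would be frozen, and only the lightweight encoder, the decoder, and the projection head would receive gradients from this loss.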

Environment Setup

Docker (Method 1)

The address for pulling the Docker image from SourceFind (光源) and the steps for using it are as follows:
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-centos7.6-dtk23.10-py38
docker run -it --shm-size=64G -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal:/opt/hyhal --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name efficientsam_pytorch  <your IMAGE ID> bash # replace <your IMAGE ID> with the ID of the image pulled above; for this image it is ffa1f63239fc
cd /path/your_code_data/efficientsam_pytorch
pip install -e .
pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/  --trusted-host mirrors.aliyun.com
git clone https://github.com/facebookresearch/segment-anything.git
cd segment-anything
pip install -e .
```
Dockerfile (Method 2)

The Dockerfile is used as follows:
```
docker build --no-cache -t efficientsam:latest .
docker run -it --shm-size=64G -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal:/opt/hyhal --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name efficientsam_pytorch  efficientsam  bash
cd /path/your_code_data/efficientsam_pytorch
pip install -e .
pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/  --trusted-host mirrors.aliyun.com
git clone https://github.com/facebookresearch/segment-anything.git
cd segment-anything
pip install -e .
```
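Anaconda (Method 3)

Detailed steps for local configuration and build are given here, for example:

The special deep learning libraries that this project needs for DCU cards can be downloaded and installed from the Guanghe developer community (光合开发者社区).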
```
DTK driver: dtk23.10
python: python3.8
torch: 2.1.0
torchvision: 0.16.0
triton: 2.1.0
```
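Tips: the versions of the DCU-related tools above (DTK driver, Python, torch, and so on) must match one another exactly.

Install the remaining dependencies as follows: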
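
Because the versions have to line up exactly, it can help to check the Python side of the environment before installing anything else. The following is a minimal sketch, not part of the project: `find_mismatches` is a hypothetical helper, and it assumes DCU builds may append a local suffix to the version string (for example `2.1.0+...`), so it compares prefixes only. The DTK driver version is not visible from Python and is not checked here.

```python
import sys
from importlib import metadata

# Versions listed in this guide.
EXPECTED = {"torch": "2.1.0", "torchvision": "0.16.0", "triton": "2.1.0"}

def find_mismatches(expected, lookup=metadata.version):
    """Return {package: found_version_or_None} for packages that do not match."""
    problems = {}
    for name, wanted in expected.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found is None or not found.startswith(wanted):
            problems[name] = found
    return problems

if __name__ == "__main__":
    if sys.version_info[:2] != (3, 8):
        print(f"warning: expected Python 3.8, found {sys.version.split()[0]}")
    for name, found in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {EXPECTED[name]}, found {found or 'not installed'}")
```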
```
cd /path/your_code_data/efficientsam_pytorch
pip install -e .
pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/  --trusted-host mirrors.aliyun.com
git clone https://github.com/facebookresearch/segment-anything.git
cd segment-anything
pip install -e .
```
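Datasets

The pretraining stage uses ImageNet.

The fine-tuning stage uses SA-1B.

Training

Training code has not been released officially.

Inference

The model weights can be obtained from the links in the table below. For inference, download them and place them in the weights folder.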

| EfficientSAM-S | EfficientSAM-Ti |
| --- | --- |
| Download | Download |
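
Before running the inference scripts, a quick check that the checkpoints are where the scripts expect them can save a confusing stack trace. This is a minimal sketch: `missing_weights` is a hypothetical helper, and the file names are an assumption based on the upstream EfficientSAM repository, so adjust them if the downloaded files are named differently.

```python
from pathlib import Path

# Assumed checkpoint names (upstream EfficientSAM naming); change them if yours differ.
EXPECTED_WEIGHTS = ("efficient_sam_vitt.pt", "efficient_sam_vits.pt")

def missing_weights(weights_dir="weights", names=EXPECTED_WEIGHTS):
    """Return the expected checkpoint files that are absent from weights_dir."""
    root = Path(weights_dir)
    return [name for name in names if not (root / name).is_file()]

if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("missing checkpoints in weights/: " + ", ".join(missing))
    else:
        print("all expected checkpoints found")
```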

Single-card inference

Enter the code directory:
```
cd /path/your_code_data/efficientsam_pytorch
```
Point-prompted and box-prompted inference (for more details, see notebooks/EfficientSAM_example.ipynb):

Point-prompted inference
```
python inference_point_prompt.py
```
Box-prompted inference
```
python inference_box_prompt.py
```
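
To make the difference between the two prompt types concrete, the sketch below builds the prompt arrays as nested Python lists. It rests on assumptions taken from the upstream EfficientSAM example rather than from the scripts above: points are shaped `[batch, num_queries, num_points, 2]`, labels are shaped `[batch, num_queries, num_points]`, label `1` marks a foreground click, and a box is encoded as its two corners with labels `2` (top-left) and `3` (bottom-right). The helper names `point_prompt` and `box_prompt` are hypothetical.

```python
def point_prompt(points, labels=None):
    """Wrap (x, y) clicks as points [1, 1, N, 2] and labels [1, 1, N]."""
    if labels is None:
        labels = [1] * len(points)  # assumed convention: 1 = foreground click
    if len(labels) != len(points):
        raise ValueError("need exactly one label per point")
    coords = [[float(x), float(y)] for x, y in points]
    return [[coords]], [[list(labels)]]

def box_prompt(x1, y1, x2, y2):
    """Encode a box as two corner points (assumed labels: 2 = top-left, 3 = bottom-right)."""
    left, right = sorted((x1, x2))
    top, bottom = sorted((y1, y2))
    return point_prompt([(left, top), (right, bottom)], labels=[2, 3])

points, labels = point_prompt([(580, 350), (650, 350)])
box_points, box_labels = box_prompt(400, 100, 50, 300)
print(points, labels)
print(box_points, box_labels)
```

With PyTorch installed, these nested lists would typically be wrapped in `torch.tensor(...)` before being passed to the model together with the image tensor.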
Segment-everything inference (for more details, see notebooks/EfficientSAM_segment_everything_example.ipynb):
```
python inference_segment_everything.py
```
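
Segment-everything modes in SAM-style models are commonly driven by prompting the model with a regular grid of points and then filtering and de-duplicating the resulting masks. The sketch below shows only the grid step and is an illustration, not the script's actual code: `build_point_grid` is a hypothetical helper, and the grid size of 32 per side is an assumption.

```python
def build_point_grid(n_per_side, width, height):
    """Return n_per_side**2 (x, y) pixel coordinates, evenly spaced, row by row."""
    if n_per_side < 1:
        raise ValueError("n_per_side must be at least 1")
    offset = 1.0 / (2 * n_per_side)  # centre each point inside its grid cell
    fractions = [offset + i / n_per_side for i in range(n_per_side)]
    return [(fx * width, fy * height) for fy in fractions for fx in fractions]

# One single-point prompt per grid location (assumed 32 x 32 grid on a 1024 x 1024 image).
grid = build_point_grid(32, width=1024, height=1024)
print(len(grid), grid[0], grid[-1])
```

Each grid point would then be used as an independent foreground prompt, and low-quality or overlapping masks would be discarded afterwards.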
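Results

Point-prompted results for EfficientSAM-S and EfficientSAM-Ti:

Box-prompted results for EfficientSAM-S and EfficientSAM-Ti:

Segment-everything results for EfficientSAM-S and EfficientSAM-Ti:

Accuracy

None

Application Scenarios

Algorithm category

Image segmentation

Key application industries

Manufacturing, broadcast media, energy, healthcare, home furnishing, education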

Source Repository and Issue Feedback
- ModelZoo / efficientsam_pytorch · GitLab
References
- GitHub - yformer/EfficientSAM: EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
Original article: https://blog.csdn.net/qq_27815483/article/details/139712860
