Transformer reading list (a minimal attention sketch follows the list):
1. Understanding the Transformer
2. Understanding Vision Transformer principles and code: this technical survey is all you need
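Everything in these posts builds on one core operation, scaled dot-product attention: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V. Below is a minimal PyTorch sketch of just this op; the shapes are illustrative, and a full Transformer adds multi-head projections, masking, and feed-forward blocks on top.

import math
import torch

def scaled_dot_product_attention(q, k, v):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.shape[-1])
    return scores.softmax(dim=-1) @ v

# Self-attention: queries, keys, and values all come from the same sequence.
q = k = v = torch.randn(1, 8, 10, 64)  # (batch, heads, seq_len, d_k)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([1, 8, 10, 64])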
DETR reading list (a toy DETR forward pass follows the list):
1. Understanding DEtection TRansformer (DETR)
2. End-to-End Object Detection with Transformers [DETR]
3. Initial explorations of DETR (DEtection TRansformer)
4. DETR explained: can the Transformer from NLP also do object detection?
5. DETR: Facebook's Transformer-based object detection paradigm, performance on par with Faster R-CNN | ECCV 2020 Oral
6. Implementing the new detection paradigm DETR (Transformer-based) in PyTorch
7. What to make of End-to-End Object Detection with Transformers?
8. DETR study notes
9. Training DETR on your own dataset: usage guide
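As a companion to the posts above, here is a toy sketch of the DETR forward pass they describe: a CNN backbone produces a feature map, a transformer encoder-decoder attends over it, and a fixed set of learned object queries is decoded into class and box predictions. This is an illustration under simplifying assumptions (random-weight backbone, naive learned positional encoding, no Hungarian matching or auxiliary losses); every name in it is made up for the sketch, not taken from the official repo.

import torch
import torch.nn as nn
from torchvision.models import resnet50

class MinimalDETR(nn.Module):
    def __init__(self, num_classes=91, num_queries=100, d_model=256):
        super().__init__()
        # ResNet-50 up to the last conv stage (drop avgpool and fc); random weights here.
        self.backbone = nn.Sequential(*list(resnet50(weights=None).children())[:-2])
        self.input_proj = nn.Conv2d(2048, d_model, kernel_size=1)
        self.transformer = nn.Transformer(d_model, nhead=8,
                                          num_encoder_layers=6, num_decoder_layers=6)
        self.query_embed = nn.Embedding(num_queries, d_model)          # learned object queries
        self.pos_embed = nn.Parameter(torch.randn(2500, 1, d_model))   # naive positional encoding
        self.class_head = nn.Linear(d_model, num_classes + 1)          # +1 for the "no object" class
        self.bbox_head = nn.Linear(d_model, 4)                         # (cx, cy, w, h), normalized

    def forward(self, images):
        feat = self.input_proj(self.backbone(images))                  # (B, d_model, H, W)
        B, d, H, W = feat.shape
        src = feat.flatten(2).permute(2, 0, 1) + self.pos_embed[:H * W]  # (HW, B, d)
        tgt = self.query_embed.weight.unsqueeze(1).repeat(1, B, 1)       # (Q, B, d)
        hs = self.transformer(src, tgt)                                  # (Q, B, d)
        return self.class_head(hs), self.bbox_head(hs).sigmoid()

logits, boxes = MinimalDETR()(torch.randn(1, 3, 480, 640))
print(logits.shape, boxes.shape)  # torch.Size([100, 1, 92]) torch.Size([100, 1, 4])

The sigmoid keeps box coordinates in [0, 1]: DETR regresses boxes directly in normalized image coordinates, which is what lets it drop anchors and NMS.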
ViT reading list (a patch-embedding sketch follows):
1. A 10,000-word roundup of the Transformer (ViT), 2021's hottest paper topic
2. Vision Transformer
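What the ViT posts boil down to is treating an image as a sequence: cut it into 16x16 patches, linearly project each patch to a token, prepend a class token, and add positional embeddings. A minimal sketch (names and sizes are illustrative, matching ViT-Base defaults):

import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    def __init__(self, img_size=224, patch=16, dim=768):
        super().__init__()
        # A strided conv is the standard trick for "split into patches + linear projection".
        self.proj = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        num_patches = (img_size // patch) ** 2
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, dim))

    def forward(self, x):                                  # (B, 3, 224, 224)
        tok = self.proj(x).flatten(2).transpose(1, 2)      # (B, 196, 768)
        cls = self.cls_token.expand(x.shape[0], -1, -1)    # one class token per image
        return torch.cat([cls, tok], dim=1) + self.pos_embed  # (B, 197, 768)

print(PatchEmbed()(torch.randn(2, 3, 224, 224)).shape)  # torch.Size([2, 197, 768])

T2T-ViT, covered next, replaces this single-shot patchify with a progressive tokens-to-token step, which is what the "adopt performer encoder for tokens-to-token" line in the log below refers to.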
T2T-ViT:
1. Transformer fully surpasses ResNet: Yitu open-sources T2T-ViT, which scales both up and down, with the lightweight version outperforming MobileNet

Evaluating the released T2T-ViT-14 checkpoint on a single GPU:
nx@nx-desktop:~/Projects/T2T-ViT$ CUDA_VISIBLE_DEVICES=0 python3 main.py data --model t2t_vit_14 -b 100 --eval_checkpoint models/81.5_T2T_ViT_14.pth.tar
Training with a single process on 1 GPUs.
adopt performer encoder for tokens-to-token
Model t2t_vit_14 created, param count: 21545550
Data processing configuration for current model + dataset:
input_size: (3, 224, 224)
interpolation: bicubic
mean: (0.485, 0.456, 0.406)
std: (0.229, 0.224, 0.225)
crop_pct: 0.9
AMP not enabled. Training in float32.
Scheduled epochs: 310
Loaded state_dict_ema from checkpoint 'models/81.5_T2T_ViT_14.pth.tar'
Test: [ 0/0] Time: 1.624 (1.624) Loss: 9.8540 (9.8540) Acc@1: 0.0000 ( 0.0000) Acc@5: 0.0000 ( 0.0000)
Top-1 accuracy of the model is: 0.0%
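The result deserves a second look. "Test: [ 0/0]" shows the loader produced only a single batch, and a loss near 9.85 with 0% top-1 from a checkpoint named 81.5_T2T_ViT_14 points at a data problem rather than a model problem. A plausible cause (an assumption; the log alone doesn't prove it) is that data/val is not laid out the way the timm-style script expects, one subdirectory per ImageNet class, so images receive wrong or meaningless labels. Below is a minimal sketch of the same evaluation, reconstructed from the data config printed above; the models.t2t_vit import path and the t2t_vit_14 factory follow the T2T-ViT repo layout and should be treated as assumptions.

import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Preprocessing reconstructed from the logged config: 224x224 input, bicubic
# interpolation, crop_pct 0.9 (224 / 0.9 -> resize to 248), ImageNet mean/std.
eval_tf = transforms.Compose([
    transforms.Resize(248, interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

# ImageFolder requires data/val/<class_dir>/*.JPEG; a flat folder of images
# collapses into a single bogus class, which would explain the 0.0% above.
loader = DataLoader(datasets.ImageFolder('data/val', eval_tf),
                    batch_size=100, num_workers=4)

from models.t2t_vit import t2t_vit_14  # assumed import path inside the T2T-ViT repo
model = t2t_vit_14()
ckpt = torch.load('models/81.5_T2T_ViT_14.pth.tar', map_location='cpu')
model.load_state_dict(ckpt.get('state_dict_ema', ckpt))  # the log loaded state_dict_ema
model.eval().cuda()

correct = total = 0
with torch.no_grad():
    for x, y in loader:
        correct += (model(x.cuda()).argmax(1).cpu() == y).sum().item()
        total += y.numel()
print(f'Top-1: {100 * correct / total:.1f}%')

With the 50,000-image ImageNet validation split arranged into its standard class subfolders, this checkpoint should reproduce roughly the 81.5% top-1 its filename advertises.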