CV Computer Vision Daily Open-Source Code: Papers with Code Roundup (2023.11.14)



    1. [Backbone Architecture: Transformer] Aggregate, Decompose, and Fine-Tune: A Simple Yet Effective Factor-Tuning Method for Vision Transformer

    • Paper: https://arxiv.org//pdf/2311.06749

    • Code (coming soon): https://github.com/Dongping-Chen/EFFT-EFfective-Factor-Tuning

    2. [Defect Detection] Self-supervised Context Learning for Visual Inspection of Industrial Defects

    • Paper: https://arxiv.org//pdf/2311.06504

    • Code (coming soon): https://github.com/wangpeng000/VisualInspection

    3. [Object Detection, Segmentation] CD-COCO: A Versatile Complex Distorted COCO Database for Scene-Context-Aware Computer Vision

    • Paper: https://arxiv.org//pdf/2311.06976

    • Code: https://github.com/Aymanbegh/CD-COCO

    4. [Video Segmentation] Sketch-based Video Object Segmentation: Benchmark and Analysis

    • Paper: https://arxiv.org//pdf/2311.07261

    • Code (coming soon): https://github.com/YRlin-12/Sketch-VOS-datasets

    5. [Multimodal] SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models

    • Paper: https://arxiv.org//pdf/2311.07575

    • Code: https://github.com/Alpha-VLLM/LLaMA2-Accessory

    6. [Multimodal] To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning

    • Paper: https://arxiv.org//pdf/2311.07574

    • Code (coming soon): https://github.com/X2FD/LVIS-INSTRUCT4V

    7. [Multimodal] GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation

    • Paper: https://arxiv.org//pdf/2311.07562

    • Code (coming soon): https://github.com/zzxslp/MM-Navigator

    8. [Multimodal] GPT-4V(ision) as A Social Media Analysis Engine

    • Paper: https://arxiv.org//pdf/2311.07547

    • Code (coming soon): https://github.com/VIStA-H/GPT-4V_Social_Media

    9. [Multimodal] InfMLLM: A Unified Framework for Visual-Language Tasks

    • Paper: https://arxiv.org//pdf/2311.06791

    • Code: https://github.com/mightyzau/InfMLLM

    10. [Multimodal] Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

    • Paper: https://arxiv.org//pdf/2311.06783

    • Project page: Q-Instruct | [IQA, Low-level Vision, MLLM] Low-level visual instruction tuning, with a 200K dataset and a model zoo for fine-tuned checkpoints.

    • Code: https://github.com/Q-Future/Q-Instruct/

    11. [Multimodal] ChatAnything: Facetime Chat with LLM-Enhanced Personas

    • Paper: https://arxiv.org//pdf/2311.06772

    • Project page: ChatAnything

    • Code: https://github.com/zhoudaquan/ChatAnything

    12. [Multimodal] Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models

    • Paper: https://arxiv.org//pdf/2311.06607

    • Code (coming soon): https://github.com/Yuliang-Liu/Monkey

    13. [Multimodal] An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation

    • Paper: https://arxiv.org//pdf/2311.07397

    • Code (coming soon): https://github.com/junyangwang0410/AMBER

    14. [Multimodal] Volcano: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision

    • Paper: https://arxiv.org//pdf/2311.07362

    • Code (coming soon): https://github.com/kaistAI/Volcano

    15. [Multimodal] ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models

    • Paper: https://arxiv.org//pdf/2311.07022

    • Project page: ViLMA - Video Language Model Assessment

    • Code: https://github.com/ilkerkesen/ViLMA

    16. [Digital Humans] (WACV 2024) CVTHead: One-shot Controllable Head Avatar with Vertex-feature Transformer

    • Paper: https://arxiv.org//pdf/2311.06443

    • Code (coming soon): https://github.com/HowieMa/CVTHead

    17. [Depth Estimation] MonoDiffusion: Self-Supervised Monocular Depth Estimation Using Diffusion Model

    • Paper: https://arxiv.org//pdf/2311.07198

    • Code (coming soon): https://github.com/ShuweiShao/MonoDiffusion

    18. [Depth Estimation] (ICCV 2023) NDDepth: Normal-Distance Assisted Monocular Depth Estimation and Completion

    • Paper: https://arxiv.org//pdf/2311.07166

    • Code (coming soon): https://github.com/ShuweiShao/NDDepth

    19. [Autonomous Driving: BEV] Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection

    • Paper: https://arxiv.org//pdf/2311.07152

    • Code: https://github.com/HuangJunJie2017/BEVDet

    20. [Autonomous Driving: BEV] Deep Perspective Transformation Based Vehicle Localization on Bird's Eye View

    • Paper: https://arxiv.org//pdf/2311.06796

    • Code (coming soon): https://github.com/IPM-HPC/Perspective-BEV-Transformer

    21. [Diffusion] Sampler Scheduler for Diffusion Models

    • Paper: https://arxiv.org//pdf/2311.06845

    • Code: https://github.com/Carzit/sd-webui-samplers-scheduler

    22. [NeRF] L0-Sampler: An L0 Model Guided Volume Sampling for NeRF

    • Paper: https://arxiv.org//pdf/2311.07044

    • Project page: L0-Sampler: An L0 Model Guided Volume Sampling for NeRF

    • Code: https://github.com/USTC3DV/L0-Sampler-code

    23. [Visual Question Answering] Analyzing Modular Approaches for Visual Question Decomposition

    • Paper: https://arxiv.org//pdf/2311.06411

    • Code: https://github.com/brown-palm/visual-question-decomposition


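    CV Computer Vision Discussion Group

    The group brings together experts from many areas, including object detection, image segmentation, object tracking, Transformer, multimodal learning, NeRF, GAN, defect detection, salient object detection, keypoint detection, super-resolution reconstruction, SLAM, face analysis, OCR, biomedical imaging, 3D reconstruction, pose estimation, autonomous driving perception, depth estimation, video understanding, action recognition, image dehazing, image deraining, image restoration, image retrieval, lane detection, point cloud object detection, point cloud segmentation, image compression, motion prediction, neural network quantization, and network deployment. Members share technical knowledge, interview tips, and referral and job postings from time to time.

    To join, add the administrator on WeChat: PingShanHai666. When sending the friend request, include a note in the form: school/company + research area + nickname.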

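    Recommended reading:

    CV Computer Vision Daily Open-Source Code: Papers with Code Roundup (2023.11.13)

    CV Computer Vision Daily Open-Source Code: Papers with Code Roundup (2023.11.10)

    CV Computer Vision Daily Open-Source Code: Papers with Code Roundup (2023.11.9)

    CV Computer Vision Daily Open-Source Code: Papers with Code Roundup (2023.11.8)

    CV Computer Vision Daily Open-Source Code: Papers with Code Roundup (2023.11.7)

    CV Computer Vision Daily Open-Source Code: Papers with Code Roundup (2023.11.6)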

  • Original article: https://blog.csdn.net/zhangkai950121/article/details/134479138