Open-Source LLM ChatGLM2-6B, Part 2: Building an LLM + Knowledge-Base Q&A System by Following the LangChain Reference Docs


    0. Environment


    Rented one GPU server: Ubuntu 20, Tesla V100 16 GB.

    (The GPU server has since been shut down and the rental has ended.)
    SSH address: *
    Port: 17520

    SSH account: root
    Password: Jaere7pa

    Intranet port: 3389, external port: 17518

    VNC address: *
    Port: 17519

    VNC username: root
    Password: Jaere7pa

    Hardware requirements for ChatGLM2-6B are about the same as for ChatGLM-6B:

    Quantization level        Minimum GPU memory
    FP16 (no quantization)    13 GB
    INT8                      10 GB
    INT4                      6 GB
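
    These levels correspond to how the model is loaded. A minimal sketch based on the usage shown in the THUDM/ChatGLM2-6B README (the .quantize() method is provided by the model's own remote code, enabled by trust_remote_code=True):

    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
    # FP16 (~13 GB):  .half().cuda()
    # INT8 (~10 GB):  .quantize(8).cuda()
    # INT4 (~6 GB):   .quantize(4).cuda()
    model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).quantize(4).cuda()
    model = model.eval()

    response, history = model.chat(tokenizer, "Hello", history=[])
    print(response)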

    1. Basic Environment


    1.1 Test the GPU

    (base) root@ubuntuserver:~# nvidia-smi
    Tue Sep 12 02:06:45 2023
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 510.54       Driver Version: 510.54       CUDA Version: 11.6     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  Tesla V100-PCIE...  Off  | 00000000:00:07.0 Off |                    0 |
    | N/A   42C    P0    38W / 250W |      0MiB / 16384MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+
    (base) root@ubuntuserver:~#

    1.2 Python


    The current LangChain installation notes require Python 3.8 - 3.10.
    Running python3 shows Python 3.9, which is within range.

    # If your version is outside this range, create an environment with conda
    $ conda create -p /root/work/conda_py310_chatglm2 python=3.10
    # Activate the environment
    $ source activate /root/work/conda_py310_chatglm2
    # Upgrade pip
    $ pip3 install --upgrade pip
    # Deactivate the environment
    $ conda deactivate
    # Remove the environment
    $ conda env remove -p /root/work/conda_py310_chatglm2

    1.3 pip

    pip3 install --upgrade pip -i https://pypi.tuna.tsinghua.edu.cn/simple

    1.4 Prepare the Repository

    git clone https://github.com/chatchat-space/Langchain-Chatchat.git
    cd Langchain-Chatchat

    1.5 Upgrade CUDA


    Check the driver version required by each CUDA release:
    https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html

    CUDA 11.8 needs a driver >= 450.80.02; the installed 510.54 driver already satisfies this.

    Run the following to install CUDA 11.8:

    wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
    sh cuda_11.8.0_520.61.05_linux.run


    -> type accept
    -> uncheck Driver
    -> select Install

    export PATH=$PATH:/usr/local/cuda-11.8/bin
    nvcc --version

    Prepare a switch-cuda.sh script for switching between installed CUDA versions:

    #!/usr/bin/env bash
    # Copyright (c) 2018 Patrick Hohenecker
    #
    # Permission is hereby granted, free of charge, to any person obtaining a copy
    # of this software and associated documentation files (the "Software"), to deal
    # in the Software without restriction, including without limitation the rights
    # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
    # copies of the Software, and to permit persons to whom the Software is
    # furnished to do so, subject to the following conditions:
    #
    # The above copyright notice and this permission notice shall be included in all
    # copies or substantial portions of the Software.
    #
    # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
    # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
    # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
    # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
    # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
    # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
    # SOFTWARE.

    # author: Patrick Hohenecker
    # version: 2018.1
    # date: May 15, 2018

    set -e

    # ensure that the script has been sourced rather than just executed
    if [[ "${BASH_SOURCE[0]}" = "${0}" ]]; then
        echo "Please use 'source' to execute switch-cuda.sh!"
        exit 1
    fi

    INSTALL_FOLDER="/usr/local"  # the location to look for CUDA installations at
    TARGET_VERSION=${1}          # the target CUDA version to switch to (if provided)

    # if no version to switch to has been provided, then just print all available CUDA installations
    if [[ -z ${TARGET_VERSION} ]]; then
        echo "The following CUDA installations have been found (in '${INSTALL_FOLDER}'):"
        ls -l "${INSTALL_FOLDER}" | egrep -o "cuda-[0-9]+\\.[0-9]+$" | while read -r line; do
            echo "* ${line}"
        done
        set +e
        return
    # otherwise, check whether there is an installation of the requested CUDA version
    elif [[ ! -d "${INSTALL_FOLDER}/cuda-${TARGET_VERSION}" ]]; then
        echo "No installation of CUDA ${TARGET_VERSION} has been found!"
        set +e
        return
    fi

    # the path of the installation to use
    cuda_path="${INSTALL_FOLDER}/cuda-${TARGET_VERSION}"

    # filter out those CUDA entries from the PATH that are not needed anymore
    path_elements=(${PATH//:/ })
    new_path="${cuda_path}/bin"
    for p in "${path_elements[@]}"; do
        if [[ ! ${p} =~ ^${INSTALL_FOLDER}/cuda ]]; then
            new_path="${new_path}:${p}"
        fi
    done

    # filter out those CUDA entries from the LD_LIBRARY_PATH that are not needed anymore
    ld_path_elements=(${LD_LIBRARY_PATH//:/ })
    new_ld_path="${cuda_path}/lib64:${cuda_path}/extras/CUPTI/lib64"
    for p in "${ld_path_elements[@]}"; do
        if [[ ! ${p} =~ ^${INSTALL_FOLDER}/cuda ]]; then
            new_ld_path="${new_ld_path}:${p}"
        fi
    done

    # update environment variables
    export CUDA_HOME="${cuda_path}"
    export CUDA_ROOT="${cuda_path}"
    export LD_LIBRARY_PATH="${new_ld_path}"
    export PATH="${new_path}"

    echo "Switched to CUDA ${TARGET_VERSION}."

    set +e
    return

    Usage:

    source switch-cuda.sh 11.8

    1.6 Install the GPU build of PyTorch separately

    $ pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

    1.7 Install all dependencies

    $ pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

    Verify that torch was built with CUDA support:

    import torch

    # prints "cuda:0" when the GPU build is active, otherwise "cpu"
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print(device)
    print(torch.version.cuda)  # CUDA version torch was built against, e.g. 11.8

    2. Download the Models


    2.1 chatglm2-6b

    GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/THUDM/chatglm2-6b
    Then download the model files the ChatGLM2 authors uploaded to the Tsinghua cloud drive:
    https://cloud.tsinghua.edu.cn/d/674208019e314311ab5c/?p=%2Fchatglm2-6b&mode=list
    and copy them over the chatglm2-6b directory.

    I had assumed wget could fetch the weights, but the downloaded files all came out the same size, which broke inference.
    On Windows 10, verify the SHA256 of each file one by one; each must match the Git LFS Details shown on https://huggingface.co/THUDM/chatglm2-6b.

    C:\Users\qjfen\Downloads\chatglm2-6b>certutil -hashfile pytorch_model-00001-of-00007.bin SHA256
    pytorch_model-00001-of-00007.bin         cdf1bf57d519abe11043e9121314e76bc0934993e649a9e438a4b0894f4e6ee8
    pytorch_model-00002-of-00007.bin         1cd596bd15905248b20b755daf12a02a8fa963da09b59da7fdc896e17bfa518c
    pytorch_model-00003-of-00007.bin         812edc55c969d2ef82dcda8c275e379ef689761b13860da8ea7c1f3a475975c8
    pytorch_model-00004-of-00007.bin         555c17fac2d80e38ba332546dc759b6b7e07aee21e5d0d7826375b998e5aada3
    pytorch_model-00005-of-00007.bin         cb85560ccfa77a9e4dd67a838c8d1eeb0071427fd8708e18be9c77224969ef48
    pytorch_model-00006-of-00007.bin         09ebd811227d992350b92b2c3491f677ae1f3c586b38abe95784fd2f7d23d5f2
    pytorch_model-00007-of-00007.bin         316e007bc727f3cbba432d29e1d3e35ac8ef8eb52df4db9f0609d091a43c69cb

    Then push the files to the server, and re-verify the large files on Ubuntu with sha256sum.
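
    The check can also be scripted. A small helper of my own (not part of either repo) that recomputes each shard's SHA-256 in Python, so the same script runs unchanged on both Windows and Ubuntu when executed inside the chatglm2-6b directory:

    import hashlib

    # expected digests, copied from the Git LFS Details / certutil output above
    EXPECTED = {
        "pytorch_model-00001-of-00007.bin": "cdf1bf57d519abe11043e9121314e76bc0934993e649a9e438a4b0894f4e6ee8",
        "pytorch_model-00002-of-00007.bin": "1cd596bd15905248b20b755daf12a02a8fa963da09b59da7fdc896e17bfa518c",
        "pytorch_model-00003-of-00007.bin": "812edc55c969d2ef82dcda8c275e379ef689761b13860da8ea7c1f3a475975c8",
        "pytorch_model-00004-of-00007.bin": "555c17fac2d80e38ba332546dc759b6b7e07aee21e5d0d7826375b998e5aada3",
        "pytorch_model-00005-of-00007.bin": "cb85560ccfa77a9e4dd67a838c8d1eeb0071427fd8708e18be9c77224969ef48",
        "pytorch_model-00006-of-00007.bin": "09ebd811227d992350b92b2c3491f677ae1f3c586b38abe95784fd2f7d23d5f2",
        "pytorch_model-00007-of-00007.bin": "316e007bc727f3cbba432d29e1d3e35ac8ef8eb52df4db9f0609d091a43c69cb",
    }

    def sha256_of(path, chunk=1 << 20):
        # hash the file in 1 MiB chunks to keep memory use flat on multi-GB shards
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(chunk), b""):
                h.update(block)
        return h.hexdigest()

    for name, want in EXPECTED.items():
        print(name, "OK" if sha256_of(name) == want else "MISMATCH")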

    2.2 text2vec


    GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese text2vec
    Download these two files and place them inside text2vec:

    model.safetensors                        eaf5cb71c0eeab7db3c5171da504e5867b3f67a78e07bdba9b52d334ae35adb3
    pytorch_model.bin                        5883cb940ac5509b75e9fe23a9aea62694045849dc8c8c2da2894861a045d7f5

    3. Parameter Configuration

    # from the Langchain-Chatchat repository root
    cp configs/model_config.py.example configs/model_config.py
    cp configs/server_config.py.example configs/server_config.py

    Edit configs/model_config.py:

    embedding_model_dict = {
        "text2vec": "/root/work/Langchain-Chatchat/text2vec",
    }
    # name of the embedding model to use
    EMBEDDING_MODEL = "text2vec"

    llm_model_dict = {
        "chatglm2-6b": {
            "local_model_path": "/root/work/Langchain-Chatchat/chatglm2-6b",
        },
    }
    # name of the LLM
    LLM_MODEL = "chatglm2-6b"
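
    Before moving on, a quick sanity check of my own (not part of the repo) that the local path in embedding_model_dict actually loads. It assumes sentence-transformers was pulled in by requirements.txt; SentenceTransformer can wrap a plain transformers checkpoint like text2vec-large-chinese with mean pooling automatically (it prints a warning when doing so):

    from sentence_transformers import SentenceTransformer  # assumed installed via requirements.txt

    # load the same local path configured in model_config.py and embed one sentence
    model = SentenceTransformer("/root/work/Langchain-Chatchat/text2vec")
    vec = model.encode("This is a test sentence.")
    print(vec.shape)  # text2vec-large-chinese should produce 1024-dimensional vectors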

    4. Knowledge Base Initialization and Migration


    Initialize the knowledge base:

    $ python init_database.py --recreate-vs

    5. One-Click Start of the API Service and Web UI


    5.1 Start command


    The one-click start script startup.py launches all the FastChat services, the API service, and the WebUI service in one step:

    $ python startup.py -a

    5.2 Testing

    Open 127.0.0.1:8501 in a browser.

    The dialogue modes cover plain LLM chat, knowledge-base Q&A, and search-engine Q&A.

    The knowledge-base Q&A mode appears to be this repo's authors' own pipeline, generating answers from analysis and retrieval over the indexed documents.
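
    The API service started alongside the WebUI can also be called directly. A hedged sketch, assuming the default API port 7861 and the /chat/knowledge_base_chat route of this Langchain-Chatchat version, plus a knowledge base named "samples" (all three are assumptions; check configs/server_config.py and the server's /docs page for the real values and schema):

    import requests

    # the route, port, and payload fields below are assumptions, not verified API docs
    resp = requests.post(
        "http://127.0.0.1:7861/chat/knowledge_base_chat",
        json={
            "query": "What is LangChain?",
            "knowledge_base_name": "samples",  # hypothetical knowledge base name
            "top_k": 3,
            "history": [],
        },
        stream=True,  # the server streams the answer chunk by chunk
    )
    for line in resp.iter_lines():
        if line:
            print(line.decode("utf-8"))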


    References:

    [1] https://github.com/THUDM/ChatGLM2-6B
    [2] ChatGLM-6B (introduction and local deployment), https://blog.csdn.net/qq128252/article/details/129625046
    [3] ChatGLM2-6B | an open-source local language model, https://openai.wiki/chatglm2-6b.html
    [4] Deploy an open-source LLM (MOSS) for free, https://zhuanlan.zhihu.com/p/624490276
    [5] LangChain + ChatGLM2-6B: building a personal knowledge base, https://zhuanlan.zhihu.com/p/643531454
    [6] https://pytorch.org/get-started/locally/

Original article: https://blog.csdn.net/qq_27158179/article/details/132838432