Deploying and Running Dify Locally from Source

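Background

Dify is an open-source platform for developing LLM applications. Its intuitive interface combines AI workflows, RAG pipelines, agent capabilities, model management, observability features, and more, so you can move quickly from prototype to production.

Dify offers a hosted version that you can try online. It can also be deployed with Docker or from source. Deploying from source lets you inspect Dify's implementation details and customize it. This post records the problems I ran into while deploying from source and how I solved them.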

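Prerequisites

Because this is a source deployment that I also intend to modify, I set it up on Windows, specifically Windows 11.

The official Dify documentation recommends running the source on Linux, so WSL2 has to be installed on Windows to provide a Linux subsystem. I used Ubuntu 20.04.6 under WSL2.

Docker Desktop also has to be installed on Windows. The official download link may be unreachable from mainland China; if it does not open, find a Docker Desktop installer from another source. I used Docker Desktop 4.31.1. Install it normally, then register an account and sign in. Docker Desktop is used later to pull images.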

Deployment steps

1. Fetch the source code

On Windows, clone the repository:
```
git clone https://github.com/langgenius/dify.git
```
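2. Pull the required images

First, open the cloned Dify source, go to the docker folder, and open docker-compose.middleware.yaml to see which images and versions it defines.
They are:
- image: postgres:15-alpine
- image: redis:6-alpine
- image: semitechnologies/weaviate (note: the version pinned in the official file could not be pulled in Docker Desktop, so I removed the version tag and pulled the latest image)
- image: langgenius/dify-sandbox:0.2.1
- image: ubuntu/squid:latest

Search for each of these images in Docker Desktop and click Pull to download it (do not start the images from Docker Desktop).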
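
The same images can probably also be pulled from a terminal instead of the Docker Desktop UI. The following is only a sketch: the function name `pull_middleware_images` and the `DRY_RUN` switch are made up for illustration, and the image list simply mirrors the one above (weaviate is left untagged, as described).

```shell
# Sketch: pull the middleware images from the command line.
# pull_middleware_images and DRY_RUN are hypothetical names, not part of Dify.
MIDDLEWARE_IMAGES="postgres:15-alpine redis:6-alpine semitechnologies/weaviate langgenius/dify-sandbox:0.2.1 ubuntu/squid:latest"

pull_middleware_images() {
  # Relies on word splitting of the space-separated list (sh/bash behaviour).
  for image in $MIDDLEWARE_IMAGES; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "docker pull $image"   # only print what would be run
    else
      docker pull "$image"        # pull the image; does not start a container
    fi
  done
}

# Usage:
#   DRY_RUN=1 pull_middleware_images   # preview the commands
#   pull_middleware_images             # pull for real
```

Running it with `DRY_RUN=1` first makes it easy to compare the list against your copy of docker-compose.middleware.yaml before downloading anything.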

3. Start the containers

From WSL2, go to the cloned Dify source directory, enter the docker folder, and start the containers with:
```
docker compose -f docker-compose.middleware.yaml up -d
```
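Note: once Docker Desktop is installed on Windows, the docker command is also available inside WSL2, provided WSL integration is enabled for the distribution in Docker Desktop's settings.

This is where I hit a problem. According to the official instructions, the command above should start the containers normally, but in practice the postgres container failed to start with this error: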
```
initdb: error: could not change permissions of directory "/var/lib/postgresql/data/pgdata"
```
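The container has no permission to operate on /var/lib/postgresql/data/pgdata. In docker-compose.middleware.yaml, this path is defined in both PGDATA and volumes.

I worked around the problem rather than fixing it directly. Because I am running from source in order to study Dify's code, it makes little difference whether the data is mounted out of the container, so I chose not to mount this path. I also stopped setting the PGDATA environment variable and left it at its default.

To do this, I started the postgres image from Docker Desktop instead of with docker compose.

First, stop the postgres container that docker compose started in WSL, then remove it with rm, because it keeps failing and restarting and cannot be used. Then, in Docker Desktop on Windows, start the postgres image with the run parameters set according to docker-compose.middleware.yaml (except for volumes and PGDATA).

A postgres container started this way runs normally and can also be used from WSL2.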
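
I set these parameters in the Docker Desktop UI, but a roughly equivalent command line might look like the sketch below. Everything here is an assumption to check against your own docker-compose.middleware.yaml: the container name `dify-postgres`, the function name, and the `DRY_RUN` switch are hypothetical, the password is deliberately left as a required argument, and the database name `dify` and port 5432 are taken from the connection string in the worker log later in this post.

```shell
# Sketch: start postgres with no volume mount and no PGDATA override.
# start_postgres_no_volume, dify-postgres and DRY_RUN are hypothetical names.
start_postgres_no_volume() {
  # The password must match the database password configured for the API service.
  pg_password="${1:?usage: start_postgres_no_volume PASSWORD [DATABASE]}"
  pg_database="${2:-dify}"   # assumed database name, check the compose file

  # Rebuild the argument list so every value stays correctly quoted.
  set -- docker run -d --name dify-postgres \
    -e "POSTGRES_PASSWORD=${pg_password}" \
    -e "POSTGRES_DB=${pg_database}" \
    -p 5432:5432 \
    postgres:15-alpine

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"   # only print the command
  else
    "$@"        # actually start the container
  fi
}

# Usage:
#   DRY_RUN=1 start_postgres_no_volume 'your-password'
```

Because no volume is mounted, the database contents disappear when the container is removed, which is acceptable for studying the code but not for data you want to keep.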

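4. Start the backend services

The backend consists of two services, both of which need to be started: an API service and a worker service that consumes the asynchronous queue.

First, install Anaconda in WSL2; the installation itself is not covered here.
Then create a virtual environment: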
```
conda create --name dify python=3.10
```
Then switch to this virtual environment:
```
conda activate dify
```
Then do the following:
1. In WSL2, go to the api folder of the Dify source and create the .env file:
```
cp .env.example .env
```
2. Generate a SECRET_KEY:
```
openssl rand -base64 42
```
3. Copy the generated key into the .env file as the value of SECRET_KEY.
4. Install the Python dependencies required by the API service:
```
pip install -r requirements.txt
```
5. Initialize the postgres tables:
```
flask db upgrade
```
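**Note:** this step failed for me. I did not record the exact error, but it was roughly a missing Python package. Run pip install for the package named in the error message, then run the command again, and it succeeds.
6. Start the API service: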
```
flask run --host 0.0.0.0 --port=5001 --debug
```
Output like the following means the service started successfully:

     * Debug mode: on
    INFO:werkzeug:WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
     * Running on all addresses (0.0.0.0)
     * Running on http://127.0.0.1:5001
    INFO:werkzeug:Press CTRL+C to quit
    INFO:werkzeug: * Restarting with stat
    WARNING:werkzeug: * Debugger is active!
    INFO:werkzeug: * Debugger PIN: 695-801-919

7. Start the worker service
Open a new WSL2 terminal, switch to the dify virtual environment, cd to the api folder of the Dify source, and run:
```
celery -A app.celery worker -P solo --without-gossip --without-mingle -Q dataset,generation,mail --loglevel INFO
```
Output like the following means the worker started successfully (the host name, platform, and timestamps will differ on your machine):

     -------------- celery@TAKATOST.lan v5.2.7 (dawn-chorus)
    --- ***** -----
    -- ******* ---- macOS-10.16-x86_64-i386-64bit 2023-07-31 12:58:08
    - *** --- * ---
    - ** ---------- [config]
    - ** ---------- .> app:         app:0x7fb568572a10
    - ** ---------- .> transport:   redis://:**@localhost:6379/1
    - ** ---------- .> results:     postgresql://postgres:**@localhost:5432/dify
    - *** --- * --- .> concurrency: 1 (gevent)
    -- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
    --- ***** -----
     -------------- [queues]
                    .> dataset          exchange=dataset(direct) key=dataset
                    .> generation       exchange=generation(direct) key=generation
                    .> mail             exchange=mail(direct) key=mail

    [tasks]
      . tasks.add_document_to_index_task.add_document_to_index_task
      . tasks.clean_dataset_task.clean_dataset_task
      . tasks.clean_document_task.clean_document_task
      . tasks.clean_notion_document_task.clean_notion_document_task
      . tasks.create_segment_to_index_task.create_segment_to_index_task
      . tasks.deal_dataset_vector_index_task.deal_dataset_vector_index_task
      . tasks.document_indexing_sync_task.document_indexing_sync_task
      . tasks.document_indexing_task.document_indexing_task
      . tasks.document_indexing_update_task.document_indexing_update_task
      . tasks.enable_segment_to_index_task.enable_segment_to_index_task
      . tasks.generate_conversation_summary_task.generate_conversation_summary_task
      . tasks.mail_invite_member_task.send_invite_member_mail_task
      . tasks.remove_document_from_index_task.remove_document_from_index_task
      . tasks.remove_segment_from_index_task.remove_segment_from_index_task
      . tasks.update_segment_index_task.update_segment_index_task
      . tasks.update_segment_keyword_index_task.update_segment_keyword_index_task

    [2023-07-31 12:58:08,831: INFO/MainProcess] Connected to redis://:@localhost:6379/1
    [2023-07-31 12:58:08,840: INFO/MainProcess] mingle: searching for neighbors
    [2023-07-31 12:58:09,873: INFO/MainProcess] mingle: all alone
    [2023-07-31 12:58:09,886: INFO/MainProcess] pidbox: Connected to redis://:@localhost:6379/1.
    [2023-07-31 12:58:09,890: INFO/MainProcess] celery@TAKATOST.lan ready.

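
Backend steps 2 and 3 above (generating a SECRET_KEY and copying it into .env) can also be done from the shell instead of by hand. This is a minimal sketch that assumes the .env file already contains a line beginning with `SECRET_KEY=`, as copied from .env.example; the function name `set_secret_key` is hypothetical.

```shell
# Sketch: write a generated key into the SECRET_KEY line of an env file.
# set_secret_key is a hypothetical helper, not part of Dify.
set_secret_key() {
  env_file="$1"
  key="$2"
  tmp_file="${env_file}.tmp"
  # Use | as the sed delimiter because base64 keys can contain / and +.
  # Writing to a temporary file avoids relying on the non-portable sed -i.
  sed "s|^SECRET_KEY=.*|SECRET_KEY=${key}|" "$env_file" > "$tmp_file" \
    && mv "$tmp_file" "$env_file"
}

# Usage (run inside the api folder):
#   set_secret_key .env "$(openssl rand -base64 42)"
```

Only the line that starts with `SECRET_KEY=` is rewritten; all other settings in the file are left as they were.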
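5. Start the frontend service

Open a new WSL2 terminal, go to the web folder of the Dify source, and do the following:
1. Install Node.js and npm.
  Node.js v18.x (LTS) or later and npm 8.x.x or later are required.
  Install them as follows: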
```
# installs nvm (Node Version Manager)
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash

# download and install Node.js (you may need to restart the terminal)
nvm install 18

# verifies the right Node.js version is in the environment
node -v # should print a v18.x version, e.g. `v18.20.3`

# verifies the right NPM version is in the environment
npm -v # should print the bundled npm version, e.g. `10.7.0`
```
2. Install the frontend dependencies:
```
npm install
```
3. Copy the .env.example file in the web folder and rename the copy to .env.local. Its contents do not need to be changed.
4. Build the frontend:
```
npm run build
```
5. Start the frontend service:
```
npm run start
```
Output like the following means the service started successfully:

    ready - started server on 0.0.0.0:3000, url: http://localhost:3000
    warn  - You have enabled experimental feature (appDir) in next.config.js.
    warn  - Experimental features are not covered by semver, and may cause unexpected or broken application behavior. Use at your own risk.
    info  - Thank you for testing appDir please leave your feedback at https://nextjs.link/app-feedback

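6. Open the application and initialize the account

Open http://127.0.0.1:3000. If the login or sign-up page appears, the deployment has succeeded.
Register an account first. The pitfall here is that a password containing special characters such as * or parentheses can be registered successfully but then fails at login. Use a password made up only of upper- and lower-case letters and digits.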
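
If you want to check a candidate password against this rule before registering, a small shell test is enough. The sketch below only encodes the workaround described above (letters and digits only); it is not Dify's own validation logic, and the function name `is_safe_password` is hypothetical.

```shell
# Sketch: succeed only if the password consists of ASCII letters and digits.
# is_safe_password is a hypothetical helper, not part of Dify.
is_safe_password() {
  case "$1" in
    ''|*[!A-Za-z0-9]*) return 1 ;;   # empty, or contains some other character
    *) return 0 ;;
  esac
}

# Usage:
#   is_safe_password 'Abc12345' && echo "ok to use"
#   is_safe_password 'Abc*123'  || echo "contains special characters"
```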

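Finally

This deployment follows the official documentation and adds the pitfalls that the documentation does not mention. Use the official documentation together with this post when deploying from source locally.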
Original article: https://blog.csdn.net/qq1309664161/article/details/139709161
