Finding Your Ideal Other Half with ES


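😊 Hi, I'm Xiaohang, a literary-minded young man who is getting balder and stronger.
🔔 This post is a hands-on walkthrough of search matching with ElasticSearch. Follows are welcome!
🔔 Let's grind together!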

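Contents

Preface:
I. Designing the Database
II. Initializing the Project
III. Implementing the Features
1. Parent-child nodes:
2. Search engine:
Preparation:
Integrating Elasticsearch
Database and index design
Adding data
Searching data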

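Preface:

One day, while browsing Zhenai.com (a dating site), it suddenly struck me that I still don't have a girlfriend, and I even overlooked the fact that I'm a guy! Of course that's not the point. The point is that the site failed to match me with my ideal other half, so I decided to write my own search-and-match feature to find her.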


I. Designing the Database

The SQL design is as follows:
Dictionary table (the user's city, interests, ...):

CREATE TABLE `data_dict` (
  `id` bigint NOT NULL AUTO_INCREMENT COMMENT 'Primary key ID',
  `node_name` varchar(50) NOT NULL COMMENT 'Node name',
  `parent_id` bigint NOT NULL DEFAULT '0' COMMENT 'Parent ID',
  `type` int NOT NULL COMMENT 'Type: 0 = city; 1 = interest',
  `node_level` int NOT NULL COMMENT 'Node level',
  `show_status` int NOT NULL COMMENT 'Visibility: 1 = shown; 0 = hidden',
  `sort` int NOT NULL COMMENT 'Sort order',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

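II. Initializing the Project

Project structure: (shown as a screenshot in the original post)

Project initialization is not repeated here; see the earlier hands-on tutorials in this series.

III. Implementing the Features

1. Parent-child nodes:

Modify the DataDictEntity class:

Add the logical-delete annotation
Add the children property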

package com.example.demo.entity;

import com.baomidou.mybatisplus.annotation.TableField;
import com.baomidou.mybatisplus.annotation.TableId;
import com.baomidou.mybatisplus.annotation.TableLogic;
import com.baomidou.mybatisplus.annotation.TableName;

import java.io.Serializable;
import java.util.Date;
import java.util.List;

import com.fasterxml.jackson.annotation.JsonInclude;
import lombok.Data;

/**
 *
 *
 * @author Liu
 * @email 1531137510@qq.com
 * @date 2022-10-06 20:48:15
 */
@Data
@TableName("data_dict")
public class DataDictEntity implements Serializable {
	private static final long serialVersionUID = 1L;

	/**
	 * Primary key ID
	 */
	@TableId
	private Long id;
	/**
	 * Node name
	 */
	private String nodeName;
	/**
	 * Parent ID
	 */
	private Long parentId;
	/**
	 * Type: 0 = city; 1 = interest
	 */
	private Integer type;
	/**
	 * Node level
	 */
	private Integer nodeLevel;
	/**
	 * Visibility: 1 = shown; 0 = hidden
	 */
	@TableLogic(value = "1", delval = "0")
	private Integer showStatus;
	/**
	 * Sort order
	 */
	private Integer sort;

	@JsonInclude(JsonInclude.Include.NON_EMPTY) // skip the property during serialization when it is empty, which is easier for the front end
	@TableField(exist = false) // no such column exists in the database table
	private List<DataDictEntity> children;
}

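Logical deletion can also be configured globally in the configuration file. The values below mirror the annotation above: 0 marks a row as deleted (hidden) and 1 marks it as not deleted (shown):

mybatis-plus:
  mapper-locations: classpath:/mapper/*.xml
  global-config:
    db-config:
      id-type: auto  # auto-increment primary key
      logic-delete-value: 0      # value that marks a row as deleted
      logic-not-delete-value: 1  # value that marks a row as not deleted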

Next, we write the endpoint:

Controller layer
ApiController:

@RestController
public class ApiController {

    @Autowired
    DataDictService dataDictService;

    @GetMapping("/list/tree")
    public Result<List<DataDictEntity>> listWithTree() {
        List<DataDictEntity> entities = dataDictService.listWithTree();
        return new Result<List<DataDictEntity>>().ok(entities);
    }
}

Service layer
DataDictServiceImpl:

    /**
     * Tree query
     */
    @Override
    public List<DataDictEntity> listWithTree() {
        // 1. Load every node (a single database query; the tree is assembled in memory)
        List<DataDictEntity> entities = baseMapper.selectList(null);
        // 2. Assemble the tree
        return entities.stream().filter(node -> node.getParentId() == 0) // keep only the top-level nodes
                .peek((nodeEntity) -> {
                    nodeEntity.setChildren(getChildrens(nodeEntity, entities)); // recursively attach the children of each top-level node
                }).sorted(Comparator.comparingInt(node -> (node.getSort() == null ? 0 : node.getSort()))).collect(Collectors.toList());
    }

    /**
     * Recursively collect the child nodes
     */
    private List<DataDictEntity> getChildrens(DataDictEntity root, List<DataDictEntity> all) {
        return all.stream().filter(node -> root.getId().equals(node.getParentId())) // find the direct children of root
                .peek(dept -> {
                    dept.setChildren(getChildrens(dept, all)); // attach each child's own children
                }).sorted(Comparator.comparingInt(node -> (node.getSort() == null ? 0 : node.getSort()))).collect(Collectors.toList());
    }

The logic is explained in the code comments.
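
The recursive version rescans the full list once for every node. As a rough sketch (not the author's code), the same tree can be assembled by grouping the nodes by parentId once. DictNode and DictTreeSketch are hypothetical stand-ins for DataDictEntity and the service method, trimmed down to plain Java so the example runs on its own:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for DataDictEntity, reduced to the fields the tree needs. */
final class DictNode {
    final long id;
    final long parentId;
    final Integer sort; // may be null, as the original comparator allows
    List<DictNode> children = new ArrayList<>();

    DictNode(long id, long parentId, Integer sort) {
        this.id = id;
        this.parentId = parentId;
        this.sort = sort;
    }
}

final class DictTreeSketch {
    // Same ordering rule as the original: a null sort value counts as 0.
    private static final Comparator<DictNode> BY_SORT =
            Comparator.comparingInt(n -> n.sort == null ? 0 : n.sort);

    /** Groups nodes by parentId once, then wires up children; roots have parentId == 0. */
    static List<DictNode> buildTree(List<DictNode> all) {
        Map<Long, List<DictNode>> byParent = new HashMap<>();
        for (DictNode node : all) {
            byParent.computeIfAbsent(node.parentId, k -> new ArrayList<>()).add(node);
        }
        for (List<DictNode> siblings : byParent.values()) {
            siblings.sort(BY_SORT);
        }
        for (DictNode node : all) {
            node.children = byParent.getOrDefault(node.id, new ArrayList<>());
        }
        return byParent.getOrDefault(0L, new ArrayList<>());
    }
}
```

Each node is visited a constant number of times, so this matters mostly when the dictionary grows large; for a handful of rows the recursive version is perfectly fine.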

Let's add some test data:

INSERT INTO `data_dict` VALUES (1, '1', 0, 0, 1, 1, 2);
INSERT INTO `data_dict` VALUES (2, '1-1', 1, 0, 2, 1, 1);
INSERT INTO `data_dict` VALUES (3, '1-1-1', 2, 0, 3, 1, 1);
INSERT INTO `data_dict` VALUES (4, '2', 0, 0, 1, 1, 1);
INSERT INTO `data_dict` VALUES (5, '3', 0, 0, 1, 0, 1);

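Open the API testing tool Apifox and test:

Send a GET request: http://localhost:8080/list/tree
Response (node "3", whose show_status is 0, counts as logically deleted and is therefore absent):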

{
    "code": 0,
    "msg": "success",
    "data": [
        {
            "id": 4,
            "nodeName": "2",
            "parentId": 0,
            "type": 0,
            "nodeLevel": 1,
            "showStatus": 1,
            "sort": 1
        },
        {
            "id": 1,
            "nodeName": "1",
            "parentId": 0,
            "type": 0,
            "nodeLevel": 1,
            "showStatus": 1,
            "sort": 2,
            "children": [
                {
                    "id": 2,
                    "nodeName": "1-1",
                    "parentId": 1,
                    "type": 0,
                    "nodeLevel": 2,
                    "showStatus": 1,
                    "sort": 1,
                    "children": [
                        {
                            "id": 3,
                            "nodeName": "1-1-1",
                            "parentId": 2,
                            "type": 0,
                            "nodeLevel": 3,
                            "showStatus": 1,
                            "sort": 1
                        }
                    ]
                }
            ]
        }
    ]
}

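If the tree data changes rarely and is not critical, we can cache it to speed up queries.

For an earlier, detailed Redis caching walkthrough, see: Integrating External APIs + Performance Tuning

This is an ordinary scenario with a small amount of cached data, so a third-party cache is unnecessary; Spring Cache is enough:

1. Enable caching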

@SpringBootApplication
@EnableCaching
public class DemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }

}

2. Add the @Cacheable annotation

    /**
     * Tree query
     * value: cache name
     * key: an explicitly specified key, which Spring recommends; written in SpEL (Spring Expression Language)
     * sync = true guards against cache breakdown (many concurrent misses on the same key hitting the database at once)
     */
    @Cacheable(value = {"data_dict"}, key = "#root.method.name", sync = true)
    @Override
    public List<DataDictEntity> listWithTree() {
        // 1. Load every node (a single database query; the tree is assembled in memory)
        List<DataDictEntity> entities = baseMapper.selectList(null);
        log.info("查询了数据库！"); // "Queried the database!"
        // 2. Assemble the tree
        return entities.stream().filter(node -> node.getParentId() == 0) // keep only the top-level nodes
                .peek((nodeEntity) -> {
                    nodeEntity.setChildren(getChildrens(nodeEntity, entities)); // recursively attach the children of each top-level node
                }).sorted(Comparator.comparingInt(node -> (node.getSort() == null ? 0 : node.getSort()))).collect(Collectors.toList());
    }

Test again through the API tool.
After calling the endpoint twice, the log shows:

 查询了数据库！
 # ("Queried the database!") appears only once!
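
To get a feel for why the log line shows up only once, here is a minimal plain-Java sketch of a "load once per key" cache. It is only an illustration of the idea behind a synchronized cache lookup, not Spring's implementation; SyncCacheSketch and its loads counter are hypothetical names:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Toy cache: the loader runs at most once per key, even under concurrent access. */
final class SyncCacheSketch {
    private final ConcurrentHashMap<String, Object> store = new ConcurrentHashMap<>();
    final AtomicInteger loads = new AtomicInteger(); // counts how often the "database" was hit

    @SuppressWarnings("unchecked")
    <T> T get(String key, Supplier<T> loader) {
        // computeIfAbsent runs the loader atomically for a missing key,
        // so concurrent callers wait for the first result instead of loading again.
        return (T) store.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.get();
        });
    }
}
```

Calling get("listWithTree", ...) twice runs the loader on the first call only; the second call is served from the map, which is the behavior observed in the log.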

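To use a third-party cache provider, add its dependency alongside spring-boot-starter-cache, then set spring.cache.type in the configuration file:

<dependency>
	<!-- third-party cache dependency -->
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-cache</artifactId>
</dependency>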

The details are not covered here.

2. Search engine:

Preparation:

(1) Download Elasticsearch (storage and search) and Kibana (visual search)

# the two versions must match
docker pull elasticsearch:7.4.2
docker pull kibana:7.4.2

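(2) Configure:

# these host directories under /mydata are mounted into the container,
# so editing files under /mydata changes what the container sees
mkdir -p /mydata/elasticsearch/config
mkdir -p /mydata/elasticsearch/data

# allow ES to be reached from any remote machine
echo "http.host: 0.0.0.0" >/mydata/elasticsearch/config/elasticsearch.yml

# recursively open up permissions so that ES can access the directories
chmod -R 777 /mydata/elasticsearch/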

(3) Start Elasticsearch:

# 9200 is the HTTP port used by clients; 9300 is the transport port used for cluster communication
# -e "discovery.type=single-node" runs ES as a single node
# -e ES_JAVA_OPTS sets the JVM heap size; in production it would be much larger (the original suggests 32G)
docker run --name elasticsearch -p 9200:9200 -p 9300:9300 \
-e "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms64m -Xmx512m" \
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
-v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d elasticsearch:7.4.2


# start elasticsearch automatically whenever Docker starts
docker update elasticsearch --restart=always

(4) Start Kibana:

# ELASTICSEARCH_HOSTS points Kibana at the ES HTTP port 9200; 5601 is the port of the Kibana web UI
docker run --name kibana -e ELASTICSEARCH_HOSTS=http://192.168.56.10:9200 -p 5601:5601 -d kibana:7.4.2


# start kibana automatically whenever Docker starts
docker update kibana  --restart=always

(5) Test

Check the Elasticsearch version information: http://192.168.56.10:9200

{
    "name": "66718a266132",
    "cluster_name": "elasticsearch",
    "cluster_uuid": "xhDnsLynQ3WyRdYmQk5xhQ",
    "version": {
        "number": "7.4.2",
        "build_flavor": "default",
        "build_type": "docker",
        "build_hash": "2f90bbf7b93631e52bafb59b3b049cb44ec25e96",
        "build_date": "2019-10-28T20:40:44.881551Z",
        "build_snapshot": false,
        "lucene_version": "8.2.0",
        "minimum_wire_compatibility_version": "6.8.0",
        "minimum_index_compatibility_version": "6.0.0-beta1"
    },
    "tagline": "You Know, for Search"
}

Show the Elasticsearch node information: http://192.168.56.10:9200/_cat/nodes

127.0.0.1 14 99 25 0.29 0.40 0.22 dilm * 66718a266132

66718a266132 is the node shown above
* marks the master node
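
If you want to read that line from code rather than by eye, the sketch below splits it into a few named fields. It assumes the default `_cat/nodes` column order of ES 7.x (ip, heap.percent, ram.percent, cpu, load_1m, load_5m, load_15m, node.role, master, name), which matches the sample above; CatNodesLine is a hypothetical helper, not part of any ES client:

```java
/** Hypothetical holder for a few columns of one `_cat/nodes` line. */
final class CatNodesLine {
    final String ip;
    final int heapPercent;
    final String roles;
    final boolean master;
    final String name;

    private CatNodesLine(String ip, int heapPercent, String roles, boolean master, String name) {
        this.ip = ip;
        this.heapPercent = heapPercent;
        this.roles = roles;
        this.master = master;
        this.name = name;
    }

    /** Parses one whitespace-separated line in the assumed default column order. */
    static CatNodesLine parse(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length != 10) {
            throw new IllegalArgumentException("unexpected column count: " + f.length);
        }
        // f[7] = node.role, f[8] = master marker ("*" for the elected master), f[9] = node name
        return new CatNodesLine(f[0], Integer.parseInt(f[1]), f[7], "*".equals(f[8]), f[9]);
    }
}
```

For the sample line, parse(...) yields the name 66718a266132 with master set to true.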

Open Kibana: http://192.168.56.10:5601/app/kibana
To make ES more secure, we set a password here:

Edit elasticsearch.yml (version 6.2 and earlier need X-Pack installed separately; newer versions ship with it):

vim /mydata/elasticsearch/config/elasticsearch.yml

## add the following:
xpack.security.enabled: true
xpack.license.self_generated.type: basic
xpack.security.transport.ssl.enabled: true

Restart the ES service:

docker restart elasticsearch

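Enter the elasticsearch container and initialize the passwords from its bin directory:

docker exec -it elasticsearch /bin/bash
/usr/share/elasticsearch/bin/elasticsearch-setup-passwords interactive
# Passwords must be set for several built-in users (elastic, apm_system, kibana, kibana_system, logstash_system, beats_system, remote_monitoring_user; the exact list depends on the version), so this takes a while. Be patient. It is normal that nothing is echoed while you type a password.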

Here we set the password to 123456.
Verify the new password:
Open http://192.168.56.10:9200 in a browser and log in when prompted:

- Username: elastic
- Password: 123456
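
The browser prompt is plain HTTP Basic authentication, so any HTTP client can log in by sending the same credentials in an Authorization header. A small sketch of how that header value is built (BasicAuthSketch is a hypothetical helper; the credentials are the ones set above):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class BasicAuthSketch {
    /** Builds the value of the HTTP Authorization header for Basic authentication. */
    static String header(String username, String password) {
        String pair = username + ":" + password;
        // Basic auth is just "Basic " + base64(username:password); it is encoding, not encryption
        return "Basic " + Base64.getEncoder().encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }
}
```

Because the value is only Base64-encoded, a weak password such as 123456 over plain http is acceptable for a local demo at best.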

exit  # leave the elasticsearch container
# enter the kibana container
docker exec -it kibana /bin/bash

vi config/kibana.yml

# append to the end of kibana.yml:
elasticsearch.username: "elastic"
elasticsearch.password: "123456"

# restart kibana
exit
docker restart kibana

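Installing the IK analyzer:

By default every language is tokenized with the Standard Analyzer, which handles Chinese poorly (it splits Chinese text into single characters). A Chinese analyzer therefore has to be installed.

Check your own Elasticsearch version number:

Visit: http://192.168.56.10:9200

Version compatibility: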

IK version	ES version
master	7.x -> master
6.x	6.x
5.x	5.x
1.10.6	2.4.6
1.9.5	2.3.5
1.8.1	2.2.1
1.7.0	2.1.1
1.5.0	2.0.0
1.2.6	1.0.0
1.2.5	0.90.x
1.1.3	0.20.x
1.0.0	0.16.2 -> 0.19.0

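Download the IK analyzer

Earlier we mapped the container's /usr/share/elasticsearch/plugins directory to /mydata/elasticsearch/plugins on the host, so we only need to download the elasticsearch-analysis-ik-7.4.2.zip file and unzip it into its own subdirectory there (the later steps assume /mydata/elasticsearch/plugins/elasticsearch-analysis-ik-7.4.2). After installing, remember to restart the elasticsearch container.

Once installed, test the analyzer:

Open the Kibana Dev Tools console: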

GET _analyze
{
   "analyzer": "ik_smart", 
   "text":"小航是中国人"
}

Output:

{
  "tokens" : [
    {
      "token" : "小",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_CHAR",
      "position" : 0
    },
    {
      "token" : "航",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "是",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "CN_CHAR",
      "position" : 2
    },
    {
      "token" : "中国人",
      "start_offset" : 3,
      "end_offset" : 6,
      "type" : "CN_WORD",
      "position" : 3
    }
  ]
}

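"小航" (Xiaohang) was not recognized as a word!!!

That won't do. "小航" has to be treated as a single word, so let's set up a custom dictionary:

Install Nginx: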

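# create a directory for nginx first
cd /mydata/
mkdir nginx
# start a throwaway nginx 1.10 container just to obtain its default configuration for the volume mapping (docker run pulls the image first if it is missing)
docker run -p 80:80 --name nginx -d nginx:1.10
# copy the configuration directory out of the container into the current directory
docker container cp nginx:/etc/nginx .
# check that /mydata/nginx now contains files; if so, the copy succeeded and the container can be stopped and removed
docker stop nginx
docker rm nginx
# to avoid clashes when creating the new nginx container, rename the copied directory (still inside /mydata)
mv nginx conf
# recreate the nginx directory and move conf into it
mkdir nginx
mv conf nginx/
# start the real nginx container
docker run -p 80:80 --name nginx \
 -v /mydata/nginx/html:/usr/share/nginx/html \
 -v /mydata/nginx/logs:/var/log/nginx \
 -v /mydata/nginx/conf/:/etc/nginx \
 -d nginx:1.10
# create a directory under nginx's html directory
cd /mydata/nginx/html
mkdir es
cd es
# create fenci.txt, append the word "小航", and display it
echo '小航' >> ./fenci.txt
cat fenci.txt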

Once nginx is running, check that the file is reachable:
http://192.168.56.10/es/fenci.txt

Edit IKAnalyzer.cfg.xml in /mydata/elasticsearch/plugins/elasticsearch-analysis-ik-7.4.2/config: uncomment the remote_ext_dict entry and set its URL:


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
	<comment>IK Analyzer 扩展配置</comment>

	<entry key="ext_dict"></entry>

	<entry key="ext_stopwords"></entry>

	<entry key="remote_ext_dict">http://192.168.56.10/es/fenci.txt</entry>

</properties>

Important: restart ES!

docker restart elasticsearch

Test again:

GET _analyze
{
   "analyzer": "ik_smart", 
   "text":"小航是中国人"
}

Output:

{
  "tokens" : [
    {
      "token" : "小航",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "是",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "中国人",
      "start_offset" : 3,
      "end_offset" : 6,
      "type" : "CN_WORD",
      "position" : 2
    }
  ]
}

Nice！
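
Why does a single dictionary entry change the result? The toy segmenter below uses forward maximum matching: at each position it takes the longest dictionary word it can find and falls back to a single character otherwise. This is only a sketch to illustrate the effect of a dictionary; IK's real algorithm is more elaborate, and ToySegmenter is a hypothetical class:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class ToySegmenter {
    /** Forward maximum matching: longest dictionary word first, single character as fallback. */
    static List<String> segment(String text, Set<String> dictionary, int maxWordLength) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(text.length(), start + maxWordLength);
            // shrink the candidate until it is a dictionary word or only one character is left
            while (end > start + 1 && !dictionary.contains(text.substring(start, end))) {
                end--;
            }
            tokens.add(text.substring(start, end));
            start = end;
        }
        return tokens;
    }
}
```

With a dictionary that only knows "中国人", the sentence "小航是中国人" comes out as 小 / 航 / 是 / 中国人; once "小航" is added to the dictionary, the same call returns 小航 / 是 / 中国人, mirroring the two analyzer outputs above.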

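Integrating Elasticsearch

If you are not familiar with basic ES operations, please learn those first! A quick-start ES tutorial may follow when time allows; this post only covers environment setup and integration.

There are two ways to talk to ES from Java:

1) Port 9300: TCP

spring-data-elasticsearch: transport-api.jar;
The transport-api.jar version varies with the Spring Boot version and may not match the ES version;
It is discouraged since 7.x and slated for removal after 8.

2) Port 9200: HTTP

JestClient: unofficial, slow to update;
RestTemplate: sends raw HTTP requests, so many ES operations have to be wrapped by hand, which is tedious;
HttpClient: same as above;
Elasticsearch-Rest-Client: the official REST client; it wraps ES operations in a cleanly layered API and is easy to pick up.

We go with Elasticsearch-Rest-Client (elasticsearch-rest-high-level-client). Documentation: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high.html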

1. Add the dependencies (Spring Boot manages version 7.6 by default here, which does not match ours, so we exclude it and declare our own):

<properties>
    <elasticsearch.version>7.4.2</elasticsearch.version>
</properties>


        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>${elasticsearch.version}</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-client</artifactId>
            <version>${elasticsearch.version}</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>${elasticsearch.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>org.elasticsearch</groupId>
                    <artifactId>elasticsearch</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.elasticsearch.client</groupId>
                    <artifactId>elasticsearch-rest-client</artifactId>
                </exclusion>
            </exclusions>
        </dependency>


(The original post showed before/after screenshots of the resolved dependency versions here.)

2. Configuration:

application.yml:

elasticsearch:
  schema: http
  host: 192.168.56.10
  port: 9200
  username: elastic
  password: 123456

Write the ElasticSearchConfig configuration class:

package com.example.demo.config;

import lombok.Data;
import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.elasticsearch.client.*;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * @author xh
 * @Date 2022/10/8
 */
@Data
@Configuration
@ConfigurationProperties(prefix = "elasticsearch")
public class ElasticSearchConfig {

    public static final RequestOptions COMMON_OPTIONS;

    static {
        RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
        // The default response buffer limit is 100 MB; lower it to 30 MB here.
        builder.setHttpAsyncResponseConsumerFactory(
                new HttpAsyncResponseConsumerFactory
                        .HeapBufferedResponseConsumerFactory(30 * 1024 * 1024));
        COMMON_OPTIONS = builder.build();
    }

    private String schema;
    private String host;
    private Integer port;
    private String username;
    private String password;

    @Bean
    public RestHighLevelClient client() {
        // Elasticsearch requires basic auth
        final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
        // set the username and password
        credentialsProvider.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(username, password));
        // create the REST client through the builder and configure the HTTP client via HttpClientConfigCallback
        RestClientBuilder builder = RestClient.builder(new HttpHost(host, port, schema))
                .setHttpClientConfigCallback(httpClientBuilder -> {
                    httpClientBuilder.disableAuthCaching();
                    return httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
                });
        return new RestHighLevelClient(builder);
    }
}


3. Test:

@SpringBootTest
class DemoApplicationTests {

    @Autowired
    RestHighLevelClient client;

    /**
     * Verify that the Elasticsearch client bean is available
     */
    @Test
    void contextLoads() {
        System.out.println(client);
    }

    /**
     * Index a test document
     **/
    @Test
    public void indexData() throws IOException {

        // target index
        IndexRequest indexRequest = new IndexRequest("users");
        indexRequest.id("1");

        User user = new User();
        user.setUsername("张三");
        Gson gson = new Gson();
        String jsonString = gson.toJson(user);

        // set the document body and declare its content type
        indexRequest.source(jsonString, XContentType.JSON);

        // create the index (if absent) and store the document
        IndexResponse index = client.index(indexRequest, ElasticSearchConfig.COMMON_OPTIONS);

        System.out.println(index);

    }

    @Data
    class User {
        private String username;
    }

}

Result:

org.elasticsearch.client.RestHighLevelClient@47248a48

This shows that the Elasticsearch client was loaded into the Spring context.


IndexResponse[index=users,type=_doc,id=1,version=1,result=created,seqNo=0,primaryTerm=1,shards={"total":2,"successful":1,"failed":0}]

The index was created and the document was stored.

Database and index design

Add a new table: data_info

CREATE TABLE `data_info` (
  `id` bigint NOT NULL AUTO_INCREMENT COMMENT 'Primary key ID',
  `title` varchar(255) NOT NULL COMMENT 'Title',
  `info` text NOT NULL COMMENT 'Details',
  `img` varchar(255) DEFAULT NULL COMMENT 'Title image',
  `likes` bigint NOT NULL DEFAULT '0' COMMENT 'Number of likes',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=18 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

Create the data_info index:

PUT data_info
{
    "mappings":{
        "properties": {
        	"dataId":{ "type": "long" },
            "dataTitle": { 
                "type": "text",
                "analyzer": "ik_max_word",
                "search_analyzer":"ik_smart"
            },
            "dataInfo": {
                "type": "text",
                "analyzer": "ik_max_word",
                "search_analyzer":"ik_smart"
            },
            "dataLike":{ "type":"long" },
            "dataImg":{
                "type": "keyword",
                "index": false, 
                "doc_values": false 
            },
            "nodes": {
                "type": "nested",
                "properties": {
                    "nodeId": {"type": "long"  },
                    "nodeName": {
                        "type": "keyword",
                        "index": false,
                        "doc_values": false
                    }
                }
            }
        }
    }
}

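Index explained (annotated for reading only; the # comments are not part of the request):

PUT data_info
{
    "mappings":{
        "properties": {
            "dataId":{ "type": "long" }, # record ID
            "dataTitle": { # record title
                "type": "text",
                "analyzer": "ik_max_word",
                "search_analyzer":"ik_smart"
            },
            "dataInfo": { # short description
                "type": "text",
                "analyzer": "ik_max_word",
                "search_analyzer":"ik_smart"
            },
            "dataLike":{ "type":"long" }, # number of likes
            "dataImg":{ # preview image
                "type": "keyword",
                "index": false,  # not searchable: no index is built, the value is only displayed on the page
                "doc_values": false # not available for aggregations (the default is true)
            },
            "nodes": { # node information (the field name must match the "nodes" key used in the documents and queries)
                "type": "nested",
                "properties": {
                    "nodeId": {"type": "long"  },
                    "nodeName": {
                        "type": "keyword",
                        "index": false,
                        "doc_values": false
                    }
                }
            }
        }
    }
}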

Adding data

Add a new endpoint to ApiController: save

    @Autowired
    DataDictService dataDictService;

    @Autowired
    DataInfoService dataInfoService;

    @Autowired
    RestHighLevelClient client;


    @PostMapping("/save")
    public Result<String> saveData(@RequestBody List<ESModel> esModels) {
        boolean flag = dataInfoService.saveDatas(esModels);
        if(flag) {
            // TODO make data searchable only after it passes review
            flag = esUpdate(esModels);
        }
        if(flag) {
            return new Result<String>().ok("数据保存成功！"); // "Data saved successfully!"
        } else {
            return new Result<String>().error("数据保存失败！"); // "Failed to save data!"
        }
    }

    private boolean esUpdate(List<ESModel> esModel) {
        // 1. Build one index request per model and collect them in a bulk request
        BulkRequest bulkRequest = new BulkRequest();
        Gson gson = new Gson();
        for (ESModel model : esModel) {
            // target index
            IndexRequest indexRequest = new IndexRequest("data_info");
            // use the database ID as the document ID
            indexRequest.id(model.getDataId().toString());
            indexRequest.source(gson.toJson(model), XContentType.JSON);
            bulkRequest.add(indexRequest);
        }
        // 2. Save everything in one bulk call
        BulkResponse bulk;
        try {
            bulk = client.bulk(bulkRequest, ElasticSearchConfig.COMMON_OPTIONS);
        } catch (IOException e) {
            // return early: otherwise bulk would be null below and cause a NullPointerException
            log.error("ES bulk request failed", e);
            return false;
        }
        boolean hasFailures = bulk.hasFailures();
        if(hasFailures){
            // log only the IDs of the items that actually failed
            List<String> collect = Arrays.stream(bulk.getItems())
                    .filter(BulkItemResponse::isFailed)
                    .map(BulkItemResponse::getId).collect(Collectors.toList());
            log.error("ES indexing failed for IDs: {}", collect);
        }
        return !hasFailures;
    }

The details are explained in the comments, so they are not repeated here.

DataInfoServiceImpl:

package com.example.demo.service.impl;

import com.example.demo.entity.DataDictEntity;
import com.example.demo.vo.ESModel;
import org.springframework.stereotype.Service;
import com.baomidou.mybatisplus.extension.service.impl.ServiceImpl;

import com.example.demo.dao.DataInfoDao;
import com.example.demo.entity.DataInfoEntity;
import com.example.demo.service.DataInfoService;

import java.util.ArrayList;
import java.util.List;


@Service("dataInfoService")
public class DataInfoServiceImpl extends ServiceImpl<DataInfoDao, DataInfoEntity> implements DataInfoService {

    @Override
    public boolean saveDatas(List<ESModel> esModels) {
        List<DataInfoEntity> dataInfoEntities = new ArrayList<>();
        for (ESModel esModel : esModels) {
            DataInfoEntity dataInfoEntity = new DataInfoEntity();
            dataInfoEntity.setImg(esModel.getDataImg());
            dataInfoEntity.setInfo(esModel.getDataInfo());
            dataInfoEntity.setLikes(0L);
            dataInfoEntity.setTitle(esModel.getDataTitle());
            dataInfoEntities.add(dataInfoEntity);
            baseMapper.insert(dataInfoEntity);
            esModel.setDataId(dataInfoEntity.getId());
            esModel.setDataLike(dataInfoEntity.getLikes());
        }
//        return saveBatch(dataInfoEntities);
        return true;
    }
}

TODO: the batch handling here still needs optimizing; I'll put that off for now!

Start the project and test:
Test data:

[
    {
        "dataTitle": "title",
        "dataInfo": "dataInfo",
        "dataImg": "dataImg",
        "nodes": [
            {
                "nodeId": 1,
                "nodeName": "1"
            }
        ]
    }
]

Response:

{
    "code": 0,
    "msg": "success",
    "data": "数据保存成功！"
}

Let's check the result in the ES console:

Command:
GET /data_info/_search
{
  "query": {"match_all": {}}
}
Result:
{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "data_info",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "dataId" : 1,
          "dataTitle" : "title",
          "dataInfo" : "dataInfo",
          "dataLike" : 0,
          "dataImg" : "dataImg",
          "nodes" : [
            {
              "nodeId" : 1,
              "nodeName" : "1"
            }
          ]
        }
      }
    ]
  }
}

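Perfect!

Searching data

First, let's think about what the search conditions might be:

Full-text search: dataTitle, dataInfo
Sorting: dataLike (number of likes)
Filtering: nodes.nodeId
Aggregation: nodes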

keyword=小航&
sort=dataLike_desc/asc&
node=3:4
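
As a rough sketch of how such parameters could be turned into query inputs: the helper below splits a sort value like dataLike_desc into field and direction, and does the paging arithmetic for a fixed page size. SearchParamSketch and its method names are hypothetical, and the `from` formula is an assumption about how a page number maps to the DSL's from/size:

```java
import java.util.Locale;

final class SearchParamSketch {
    /** Splits "dataLike_desc" into {"dataLike", "desc"}; rejects anything that is not field_asc or field_desc. */
    static String[] parseSort(String sort) {
        int idx = sort.lastIndexOf('_');
        if (idx <= 0) {
            throw new IllegalArgumentException("expected field_asc or field_desc: " + sort);
        }
        String order = sort.substring(idx + 1).toLowerCase(Locale.ROOT);
        if (!order.equals("asc") && !order.equals("desc")) {
            throw new IllegalArgumentException("unknown sort order: " + order);
        }
        return new String[] {sort.substring(0, idx), order};
    }

    /** Offset of the first hit for a 1-based page number (assumed mapping to the DSL "from"). */
    static int fromOffset(int pageNum, int pageSize) {
        return (Math.max(pageNum, 1) - 1) * pageSize;
    }

    /** Number of pages needed for `total` hits, rounding up. */
    static int totalPages(long total, int pageSize) {
        return (int) ((total + pageSize - 1) / pageSize);
    }
}
```

For example, parseSort("dataLike_desc") gives the field dataLike with order desc, and with a page size of 5, page 3 starts at offset 10 while 11 hits need 3 pages.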

Hmm, the requirements seem a bit too simple to tie all the concepts together.

Add another set of test data:

[
    {
        "dataTitle": "速度还是觉得还是觉得合适机会减少",
        "dataInfo": "网络新词 网络上经常会出现一些新词，比如“蓝瘦香菇”，蓝瘦香菇默认情况下会被分词，分词结果如下所示 蓝，瘦，香菇 这样的分词会导致搜索出很多不相关的结果，在这种情况下，我们使用扩展词库",
        "dataImg": "dataImg",
        "nodes": [
            {
                "nodeId": 1,
                "nodeName": "节点1"
            }
        ]
    }
]

Write the DSL query:

GET /data_info/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "速度",
            "fields": [
              "dataTitle",
              "dataInfo"
            ]
          }
        }
      ],
      "filter": {
        "nested": {
          "path": "nodes",
          "query": {
            "bool": {
              "must": [
                {
                  "term": {
                    "nodes.nodeId": {
                      "value": 1
                    }
                  }
                }
              ]
            }
          }
        }
      }
    }
  },
  "sort": [
    {
      "dataLike": {
        "order": "desc"
      }
    }
  ],
  "from": 0,
  "size": 5,
  "highlight": {
    "fields": {
      "dataTitle": {},
      "dataInfo": {}
    },
    "pre_tags": "",
    "post_tags": ""
  }
}

Query result:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "data_info",
        "_type" : "_doc",
        "_id" : "17",
        "_score" : null,
        "_source" : {
          "dataId" : 17,
          "dataTitle" : "速度还是觉得还是觉得合适机会减少",
          "dataInfo" : "网络新词 网络上经常会出现一些新词，比如“蓝瘦香菇”，蓝瘦香菇默认情况下会被分词，分词结果如下所示 蓝，瘦，香菇 这样的分词会导致搜索出很多不相关的结果，在这种情况下，我们使用扩展词库",
          "dataLike" : 0,
          "dataImg" : "dataImg",
          "nodes" : [
            {
              "nodeId" : 1,
              "nodeName" : "节点1"
            }
          ]
        },
        "highlight" : {
          "dataTitle" : [
            "速度还是觉得还是觉得合适机会减少"
          ]
        },
        "sort" : [
          0
        ]
      }
    ]
  }
}

Next, we run the same DSL from Java.
SearchParam, the request parameters:

package com.example.demo.vo;

import lombok.Data;

import java.util.List;

/**
 * @author xh
 * @Date 2022/10/12
 */
@Data
public class SearchParam {
    // full-text keyword passed from the page: keyword=小航
    private String keyword;
    // sort condition: sort=dataLike_desc/asc
    private String sort;
    /*** filter by node */
    // node=3:4
    private List<String> nodes;
    /*** page number */
    private Integer pageNum = 1;
    /*** the raw query string of the request */
    private String _queryString;
}

SearchResult, the response object:

package com.example.demo.vo;

import com.example.demo.entity.DataInfoEntity;
import lombok.Data;

import java.util.List;

/**
 * @author xh
 * @Date 2022/10/12
 */
@Data
public class SearchResult {
    /** All matching DataInfo records */
    private List<DataInfoEntity> dataInfos;
    /** Current page number */
    private Integer pageNum;
    /** Total number of records */
    private Long total;
    /** Total number of pages */
    private Integer totalPages;
}

One of our requirements is that the tags of each record are displayed as well, so DataInfoEntity gets a nodeNames field:

@Data
@TableName("data_info")
public class DataInfoEntity implements Serializable {
	private static final long serialVersionUID = 1L;

	/**
	 * Primary key ID
	 */
	@TableId(type = IdType.AUTO)
	private Long id;
	/**
	 * Title
	 */
	private String title;
	/**
	 * Details
	 */
	private String info;
	/**
	 * Title image
	 */
	private String img;
	/**
	 * Number of likes
	 */
	private Long likes;
	/**
	 * Tags
	 */
	@TableField(exist = false)
	private List<String> nodeNames;

}

Write the endpoint:
ApiController:

	@Autowired
    DataDictService dataDictService;

    @Autowired
    DataInfoService dataInfoService;

    @Autowired
    RestHighLevelClient client;

    public static final Integer PAGE_SIZE = 5;


	@GetMapping("/search")
    public Result<SearchResult> getSearchPage(SearchParam searchParam, HttpServletRequest request) {
        // TODO encrypt request parameters && add anti-crawler protection
        // keep the raw query string of the request
        searchParam.set_queryString(request.getQueryString());
        SearchResult result = getSearchResult(searchParam);
        return new Result<SearchResult>().ok(result);
    }

    /**
     * Run the search and return the result
     */
    public SearchResult getSearchResult(SearchParam searchParam) { // built from the incoming request parameters
        SearchResult searchResult= null;
        // build the search request from the request parameters
        SearchRequest request = buildSearchRequest(searchParam);
        try {
            SearchResponse searchResponse = client.search(request,
                    ElasticSearchConfig.COMMON_OPTIONS);
            // wrap the ES response in our result object
            searchResult = buildSearchResult(searchParam,searchResponse);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return searchResult;
    }

    private SearchResult buildSearchResult(SearchParam searchParam, SearchResponse searchResponse) {
        SearchResult result = new SearchResult();

        SearchHits hits = searchResponse.getHits();
        //1. Collect the matching records
        if (hits.getHits()!=null&&hits.getHits().length>0){
            List<DataInfoEntity> dataInfoEntities = new ArrayList<>();
            for (SearchHit hit : hits) {
                // parse the source JSON into an ESModel
                String sourceAsString = hit.getSourceAsString();
                Gson gson = new Gson();
                ESModel esModel = gson.fromJson(sourceAsString, new TypeToken<ESModel>() {
                }.getType());
                // convert ESModel to DataInfoEntity
                DataInfoEntity dataInfoEntity = new DataInfoEntity();
                dataInfoEntity.setTitle(esModel.getDataTitle());
                dataInfoEntity.setInfo(esModel.getDataInfo());
                dataInfoEntity.setImg(esModel.getDataImg());
                dataInfoEntity.setId(esModel.getDataId());
                dataInfoEntity.setLikes(esModel.getDataLike());
                dataInfoEntity.setNodeNames(esModel.getNodes().stream()
                        .map(ESModel.Node::getNodeName).collect(Collectors.toList()));
                //设置高亮属性
                if (!StringUtils.isEmpty(searchParam.getKeyword())) {
                    HighlightField dataTitle = hit.getHighlightFields().get("dataTitle");
                    if(dataTitle != null) {
                        String highLight = dataTitle.getFragments()[0].string();
                        dataInfoEntity.setTitle(highLight);
                    }
                    HighlightField dataInfo = hit.getHighlightFields().get("dataInfo");
                    if(dataInfo != null) {
                        String highLight = dataInfo.getFragments()[0].string();
                        dataInfoEntity.setInfo(highLight);
                    }
                }
                dataInfoEntities.add(dataInfoEntity);
            }
            result.setDataInfos(dataInfoEntities);
        }

        //2. 封装分页信息
        //2.1 当前页码
        result.setPageNum(searchParam.getPageNum());
        //2.2 总记录数
        long total = hits.getTotalHits().value;
        result.setTotal(total);
        //2.3 总页码
        Integer totalPages = (int)total % PAGE_SIZE == 0 ?
                (int)total / PAGE_SIZE : (int)total / PAGE_SIZE + 1;
        result.setTotalPages(totalPages);
        return result;
    }


    /**
     * Build the search request (the DSL query).
     */
    private SearchRequest buildSearchRequest(SearchParam searchParam) {
        // Builder for the DSL body
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        // 1. Build the bool query
        BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
        // 1.1 must: full-text match on title and details
        if (!StringUtils.isEmpty(searchParam.getKeyword())) {
            boolQueryBuilder.must(
                    QueryBuilders.multiMatchQuery(searchParam.getKeyword(), "dataTitle", "dataInfo")
            );
        }
        // 1.2 filter: one nested clause per requested node id, so a record must carry every node.
        // Putting all ids into a single nested query would require one nested object to match
        // every id at once, which cannot succeed for more than one id.
        List<Long> nodes = searchParam.getNodes();
        if (nodes != null && !nodes.isEmpty()) {
            for (Long nodeId : nodes) {
                boolQueryBuilder.filter(QueryBuilders.nestedQuery("nodes",
                        QueryBuilders.termQuery("nodes.nodeId", nodeId), ScoreMode.None));
            }
        }
        // 1.3 bool query is complete
        searchSourceBuilder.query(boolQueryBuilder);
        // 2. Sort, e.g. sort=dataLike_desc or sort=dataLike_asc
        if (!StringUtils.isEmpty(searchParam.getSort())) {
            String[] sortSplit = searchParam.getSort().split("_");
            if (sortSplit.length > 0 && !sortSplit[0].isEmpty()) {
                // Anything other than "asc" (including a missing suffix) sorts descending
                SortOrder order = sortSplit.length > 1 && "asc".equalsIgnoreCase(sortSplit[1])
                        ? SortOrder.ASC : SortOrder.DESC;
                searchSourceBuilder.sort(sortSplit[0], order);
            }
        }

        // 3. Pagination of the search results
        searchSourceBuilder.from((searchParam.getPageNum() - 1) * PAGE_SIZE);
        searchSourceBuilder.size(PAGE_SIZE);

        // 4. Highlighting
        if (!StringUtils.isEmpty(searchParam.getKeyword())) {
            HighlightBuilder highlightBuilder = new HighlightBuilder();
            highlightBuilder.field("dataTitle");
            highlightBuilder.field("dataInfo");
            // Empty tags leave matches unmarked; set them to the markup the front end should render
            highlightBuilder.preTags("");
            highlightBuilder.postTags("");
            searchSourceBuilder.highlighter(highlightBuilder);
        }

        log.debug("Built DSL: {}", searchSourceBuilder.toString());
        return new SearchRequest(new String[]{"data_info"}, searchSourceBuilder);
    }
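
The `sort` parameter handled above follows a `field_direction` convention such as `dataLike_desc`. Here is a rough, standalone sketch of that parsing without any Elasticsearch classes. The `SortSpec` class and its `parse` method are hypothetical names of mine, not part of the project, and splitting on the last underscore is an assumption so that field names containing `_` would still work.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not part of the original project.
final class SortSpec {
    final String field;       // e.g. "dataLike"
    final boolean ascending;  // true only when the suffix is "asc" (case-insensitive)

    private SortSpec(String field, boolean ascending) {
        this.field = field;
        this.ascending = ascending;
    }

    /**
     * Parse "field_asc" / "field_desc". A missing or unknown suffix
     * falls back to descending; an empty field yields Optional.empty().
     */
    static Optional<SortSpec> parse(String raw) {
        if (raw == null || raw.isEmpty()) {
            return Optional.empty();
        }
        // Assumption: split on the LAST underscore so the field itself may contain '_'
        int idx = raw.lastIndexOf('_');
        String field = idx < 0 ? raw : raw.substring(0, idx);
        String direction = idx < 0 ? "" : raw.substring(idx + 1);
        if (field.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(new SortSpec(field, "asc".equalsIgnoreCase(direction)));
    }

    public static void main(String[] args) {
        SortSpec spec = parse("dataLike_desc").get();
        // prints: dataLike ascending=false
        System.out.println(spec.field + " ascending=" + spec.ascending);
    }
}
```

Returning an `Optional` keeps the caller from indexing into a split array that may be shorter than expected when the parameter is malformed.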

Test the endpoint:
Request (GET): http://localhost:8080/search?keyword=速度&sort=dataLike_desc&nodes=1

Response:

{
    "code": 0,
    "msg": "success",
    "data": {
        "dataInfos": [
            {
                "id": 17,
                "title": "速度还是觉得还是觉得合适机会减少",
                "info": "网络新词 网络上经常会出现一些新词，比如“蓝瘦香菇”，蓝瘦香菇默认情况下会被分词，分词结果如下所示 蓝，瘦，香菇 这样的分词会导致搜索出很多不相关的结果，在这种情况下，我们使用扩展词库",
                "img": "dataImg",
                "likes": 0,
                "nodeNames": [
                    "节点1"
                ]
            }
        ],
        "pageNum": 1,
        "total": 1,
        "totalPages": 1
    }
}
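
The `pageNum`, `total`, and `totalPages` values in the response come from simple page arithmetic. A minimal sketch of it, assuming 1-based page numbers and the page size of 5 set by `PAGE_SIZE` (the `PageMath` class is a hypothetical name used only for illustration):

```java
// Hypothetical helper for illustration; not part of the original project.
final class PageMath {
    private PageMath() {
    }

    /** Zero-based offset of the first hit on a 1-based page (the "from" value). */
    static int fromOffset(int pageNum, int pageSize) {
        if (pageNum < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNum and pageSize must be >= 1");
        }
        return (pageNum - 1) * pageSize;
    }

    /** Number of pages needed to show `total` hits, using ceiling division. */
    static long totalPages(long total, int pageSize) {
        if (total < 0 || pageSize < 1) {
            throw new IllegalArgumentException("total must be >= 0 and pageSize >= 1");
        }
        return (total + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(3, 5));  // prints 10
        System.out.println(totalPages(11, 5)); // prints 3
        System.out.println(totalPages(1, 5));  // prints 1, as in the response above
    }
}
```

Validating `pageNum` up front matters because a page number of 0 would otherwise produce a negative offset.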

All done!

Original article: https://blog.csdn.net/m0_51517236/article/details/127187569