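One way MongoDB differs from MySQL is its native support for nested documents. A recent project of ours had a scenario where users hold a dialogue about an audio transcription result: one audio file corresponds to multiple dialogue rounds, and both the audio data and the dialogue records are stored in a single MongoDB document. The collection structure looks roughly like this: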
- {
- "_id":23424234234324234,
- "audioId": 2689944,
- "contextId": "cht000d24ab@dx187d1168a449a4b540",
- "dialogues": [{
- "ask": "Is today Sunday?",
- "answer": "Yes",
- "createTime": 1697356990966
- }, {
- "ask": "You keep it up too",
- "answer": "Let's go!",
- "createTime": 1697378011483
- }, {
- "ask": "See you next week",
- "answer": "Bye-bye!",
- "createTime": 1697378072063
- }]
- }
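The methods in this post refer to AITextResult and AIDialoguesResult without showing them. As a rough mental model, the document above could be mirrored in plain Java along these lines. The class and field names here are hypothetical and simply follow the JSON keys, and the ask/answer strings are placeholders; only the ids and timestamps are taken from the sample.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plain-Java mirror of the sample document; not the project's real classes. */
final class AudioDialogueModel {

    /** One element of the embedded "dialogues" array. */
    static final class Dialogue {
        final String ask;
        final String answer;
        final long createTime; // epoch milliseconds

        Dialogue(String ask, String answer, long createTime) {
            this.ask = ask;
            this.answer = answer;
            this.createTime = createTime;
        }
    }

    /** The parent document: one per audio, holding all of its dialogue rounds. */
    static final class AudioDocument {
        final long audioId;
        final String contextId;
        final List<Dialogue> dialogues = new ArrayList<>();

        AudioDocument(long audioId, String contextId) {
            this.audioId = audioId;
            this.contextId = contextId;
        }
    }

    /** Builds a document with the ids and timestamps of the sample (placeholder texts). */
    static AudioDocument sample() {
        AudioDocument doc = new AudioDocument(2689944L, "cht000d24ab@dx187d1168a449a4b540");
        doc.dialogues.add(new Dialogue("ask-1", "answer-1", 1697356990966L));
        doc.dialogues.add(new Dialogue("ask-2", "answer-2", 1697378011483L));
        doc.dialogues.add(new Dialogue("ask-3", "answer-3", 1697378072063L));
        return doc;
    }
}
```

The point of the shape is that the dialogue rounds live inside the parent document rather than in a separate table joined by audioId, which is what the queries and updates in this post have to work with.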
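Below are a few simple operations we used in this project. The first one returns the number of elements in the dialogues array for a given audioId.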
- public Integer getDialoguesSize(Long audioId) {
- Integer datasSize = 0;
- List<Document> group = Arrays.asList(
- new Document("$match",
- new Document("audioId",
- new Document("$eq", audioId)
- )
- ), new Document("$match",
- new Document("dialogues",
- new Document("$exists", true)
- )
- ), new Document("$project",
- new Document("datasSize",
- new Document("$size", "$dialogues"))
- )
- );
- AggregateIterable<Document> aggregate = generalCollection.aggregate(group);
- Document document = aggregate.first();
- if (document != null) {
- datasSize = (Integer) document.get("datasSize");
- }
- return datasSize;
- }
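The next method looks up the document for a given audioId, selects the entries in its dialogues array whose createTime is earlier than the createTime passed in, and returns at most limit of them as one page. It uses the Aggregates builders and the unwind stage to run this as an aggregation query; see the official MongoDB documentation for the details of each stage.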
- public AIDialoguesResultDTO queryAiResult(Long audioId, Long createTime, Integer limit) {
- AIDialoguesResultDTO aiDialoguesResultDTO = new AIDialoguesResultDTO();
-
- List<Bson> pipeline = Arrays.asList(
- Aggregates.match(Filters.eq("audioId", audioId)),
- Aggregates.unwind("$dialogues"),
- Aggregates.match(Filters.lt("dialogues.createTime", createTime)),
- Aggregates.sort(Sorts.descending("dialogues.createTime")),
- Aggregates.limit(limit)
- );
-
- AggregateIterable<Document> aggregate = generalCollection.aggregate(pipeline);
- List<AIDialoguesResult> aiDialoguesResultList = new ArrayList<>();
- String contextId = Constant.EMPTY_STR;
- for (Document document : aggregate) {
- AIDialoguesResult aiDialoguesResult = new AIDialoguesResult();
- List<String> key = Collections.singletonList("dialogues");
-
- aiDialoguesResult.setAnswer(document.getEmbedded(key, Document.class).getString("answer"));
- aiDialoguesResult.setAsk(document.getEmbedded(key, Document.class).getString("ask"));
- aiDialoguesResult.setCreateTime(document.getEmbedded(key, Document.class).getLong("createTime"));
- aiDialoguesResultList.add(aiDialoguesResult);
- contextId = document.getString("contextId");
- }
-
- if (!CollectionUtils.isEmpty(aiDialoguesResultList)) {
- aiDialoguesResultList = aiDialoguesResultList.stream().sorted(Comparator.comparingLong(AIDialoguesResult::getCreateTime)).collect(Collectors.toList());
- }
-
- aiDialoguesResultDTO.setCount(aiDialoguesResultList.size());
- aiDialoguesResultDTO.setContextId(contextId);
- aiDialoguesResultDTO.setResult(aiDialoguesResultList);
- return aiDialoguesResultDTO;
- }
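To make the paging semantics of this method concrete, here is a plain-Java simulation of the same steps (filter by createTime, sort newest first, limit, then re-sort oldest first) on a list of timestamps. It is only a sketch for reasoning about the results: the class and method names are made up, and nothing here talks to MongoDB.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

/** In-memory simulation of the match -> sort desc -> limit steps; illustration only. */
final class DialoguePagingSketch {

    /**
     * Returns at most {@code limit} timestamps strictly older than {@code before},
     * choosing the newest ones and presenting them oldest first.
     */
    static List<Long> page(List<Long> createTimes, long before, int limit) {
        return createTimes.stream()
                .filter(t -> t < before)            // match: createTime < before
                .sorted(Comparator.reverseOrder())  // sort: newest first
                .limit(limit)                       // limit: one page
                .sorted()                           // final re-sort: oldest first
                .collect(Collectors.toList());
    }

    /** Walks backwards through the whole history, one page at a time. */
    static List<List<Long>> allPages(List<Long> createTimes, int limit) {
        List<List<Long>> pages = new ArrayList<>();
        long cursor = Long.MAX_VALUE;               // first request: no upper bound
        while (true) {
            List<Long> current = page(createTimes, cursor, limit);
            if (current.isEmpty()) {
                break;
            }
            pages.add(current);
            cursor = current.get(0);                // oldest entry becomes the next cursor
        }
        return pages;
    }
}
```

With the three sample timestamps and a limit of 2, the first page holds the two newest rounds and the second page holds the remaining oldest one. One thing the simulation makes visible: because the comparison is strict, two rounds with exactly the same createTime could be split across a page boundary and one of them skipped, so this style of paging assumes timestamps are unique within a document.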
Of course, there is also a simpler way to write it:
- public AIDialoguesResultDTO queryAiResultBackupVersion(Long audioId, Long createTime, Integer limit) {
- Bson query = and(eq("audioId", audioId));
- AITextResult aiTextResult = mongoDao.findSingle(query, AITextResult.class);
- AIDialoguesResultDTO aiDialoguesResultDTO = new AIDialoguesResultDTO();
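- if (Objects.isNull(aiTextResult)) {
- aiDialoguesResultDTO.setResult(Collections.emptyList());
- aiDialoguesResultDTO.setCount(0);
- aiDialoguesResultDTO.setContextId("");
- // Return early; otherwise the getDialogues() call below throws a NullPointerException
- return aiDialoguesResultDTO;
- }
- List<AIDialoguesResult> aiDialoguesResultList = aiTextResult.getDialogues();
-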
- if (CollectionUtils.isEmpty(aiDialoguesResultList)) {
- return aiDialoguesResultDTO;
- }
-
- Long finalCreateTime = createTime;
- List<AIDialoguesResult> afterFilterAiDialoguesResultList =
- aiDialoguesResultList.stream().filter(t -> t.getCreateTime()
- < finalCreateTime).sorted(Comparator.comparingLong(AIDialoguesResult::getCreateTime).reversed())
- .limit(limit).collect(Collectors.toList());
-
- if (CollectionUtils.isEmpty(afterFilterAiDialoguesResultList)) {
- aiDialoguesResultDTO.setCount(0);
- } else {
- aiDialoguesResultDTO.setCount(afterFilterAiDialoguesResultList.size());
- }
- afterFilterAiDialoguesResultList = afterFilterAiDialoguesResultList.
- stream().sorted(Comparator.comparingLong(AIDialoguesResult::getCreateTime)).collect(Collectors.toList());
- aiDialoguesResultDTO.setResult(afterFilterAiDialoguesResultList);
- aiDialoguesResultDTO.setContextId(aiTextResult.getContextId());
- return aiDialoguesResultDTO;
- }
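This version is more direct: it matches on audioId, loads the document's entire dialogues array into memory, and then filters, sorts, and pages in memory. The obvious drawback is that memory usage grows with the length of the dialogues array, so it becomes expensive when the array is large.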
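To append an element to the dialogues array, one option is to fetch the whole document for the audioId, add the new element to the end of the List, and save it back. The code looks roughly like this: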
- public void saveAiResult(SaveAIResultDTO saveAIResultDTO) {
- Long audioId = saveAIResultDTO.getAudioId();
- Bson filter = Filters.eq("audioId", audioId);
- AITextResult aiTextResult = mongoDao.findSingle(filter, AITextResult.class);
- if (Objects.isNull(aiTextResult)) {
- aiTextResult = AITextResult.buildAiTextResult(saveAIResultDTO);
- mongoDao.saveOrUpdate(aiTextResult);
- return;
- }
- List<AIDialoguesResult> aiDialoguesResults = aiTextResult.getDialogues();
- AIDialoguesResult aiDialoguesResult = new AIDialoguesResult();
- aiDialoguesResult.setCreateTime(new Date().getTime());
- aiDialoguesResult.setAsk(saveAIResultDTO.getAsk());
- aiDialoguesResult.setAnswer(saveAIResultDTO.getAnswer());
- aiDialoguesResults.add(aiDialoguesResult);
- aiTextResult.setDialogues(aiDialoguesResults);
- mongoDao.saveOrUpdate(aiTextResult);
- }
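There is nothing wrong with this approach as such, but when the dialogues array is large, reading the whole array for every append can use a lot of memory. Instead, we can use MongoDB's push operation ($push) to append the element directly: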
- public void saveAiResultIncremental(SaveAIResultDTO saveAIResultDTO) {
- Long audioId = saveAIResultDTO.getAudioId();
- Document query = new Document("audioId", audioId);
- Bson projection = Projections.fields(Projections.include("contextId"), Projections.excludeId());
- FindIterable<Document> result = generalCollection.find(query).projection(projection);
- AITextResult aiTextResult;
-
- if (!result.iterator().hasNext()) {
- aiTextResult = AITextResult.buildAiTextResult(saveAIResultDTO);
- mongoDao.saveOrUpdate(aiTextResult);
- return;
- }
-
- AIDialoguesResult aiDialoguesResult = new AIDialoguesResult();
- aiDialoguesResult.setCreateTime(new Date().getTime());
- aiDialoguesResult.setAsk(saveAIResultDTO.getAsk());
- aiDialoguesResult.setAnswer(saveAIResultDTO.getAnswer());
- Bson update = push("dialogues", aiDialoguesResult);
- Bson filter = Filters.eq("audioId", audioId);
- generalCollection.updateOne(filter, update);
- }
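To put a rough number on the difference between the two append strategies, the sketch below counts how many dialogue elements travel between the application and the database over a series of appends. It is a back-of-the-envelope model under stated assumptions: the read-modify-write version is assumed to read and write back the full array each time (this depends on what mongoDao.saveOrUpdate actually does), and the small existence check in the push version is ignored. The class and method names are hypothetical.

```java
/** Back-of-the-envelope count of dialogue elements moved between app and database. */
final class AppendCostSketch {

    /**
     * Read-modify-write: the k-th append reads the k-1 existing elements
     * and writes back all k elements (assumes a full-document save).
     */
    static long readModifyWrite(int appends) {
        long moved = 0;
        for (int k = 1; k <= appends; k++) {
            moved += (k - 1) + k;
        }
        return moved;
    }

    /** Push: each append sends only the single new element. */
    static long push(int appends) {
        return appends;
    }
}
```

Under this model the read-modify-write cost grows quadratically (1,000 appends move 1,000,000 elements in total), while the push cost grows linearly (1,000 elements), which is why the gap matters more the longer a conversation gets.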
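Once you have chosen MongoDB, you should not keep writing queries the MySQL way. Learn to take advantage of MongoDB's own features; otherwise you will often fall short of the results you expected.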