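vectorStore: This component is typically used as an in-memory vector store; the same store is also used to obtain its retriever.
The in-memory vector store takes three parameters: the documents, the embeddings, and the requested output.
Its output is either the vectorStore (the vector store itself) or a retriever (a vector retriever), so the requested output has to be read and each case handled separately.
Steps:
Step 1: Read the parameter values.
Step 2: Define the type and instantiate it [using MMR for retrieval].
Step 3: Handle each output type separately.
Reference: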
https://www.langchain.com.cn/modules/prompts/example_selectors/examples/mmr
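The MMR (maximal marginal relevance) retrieval mentioned in the steps above can be illustrated without any library. The following is a minimal sketch in plain Python, not LangChain's implementation: the names `cosine`, `mmr_select`, and `lambda_mult`, and the toy vectors, are assumptions chosen for illustration. It greedily picks the candidate that is most similar to the query while penalising similarity to candidates already picked.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
    k: int = 2,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Return the indices of up to k candidates chosen greedily by MMR.

    lambda_mult = 1.0 ranks by relevance only; lower values favour diversity.
    """
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best_idx, best_score = remaining[0], -math.inf
        for i in remaining:
            relevance = cosine(query, candidates[i])
            # Redundancy: highest similarity to anything already selected.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected


# Toy data (hypothetical): candidate 1 is a near-duplicate of candidate 0.
query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.9, 0.1], [0.6, 0.8]]

print(mmr_select(query, candidates, k=2, lambda_mult=1.0))  # [0, 1]
print(mmr_select(query, candidates, k=2, lambda_mult=0.3))  # [0, 2]
```

With `lambda_mult=1.0` the near-duplicate is returned second because only relevance counts; with `lambda_mult=0.3` the more distinct candidate is preferred. This trade-off is the reason to choose MMR over a plain similarity search.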
import os
from typing import Any, Dict, List, Optional, Union

import chromadb
from langchain.embeddings.base import Embeddings
from langchain.schema import BaseRetriever, Document
from langchain.vectorstores import Chroma
from langchain.vectorstores.base import VectorStore
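
# Assumption: the original code passed an undefined `chromadb_client`.
# An in-memory (ephemeral) Chroma client is created here to match the
# "in-memory vector store" description.
chromadb_client = chromadb.Client()


class MemoryVectorStore:
    """In-memory Chroma store, exposed either as a vector store or as an MMR retriever."""

    def __init__(self, param_dict: Optional[Dict[str, Any]] = None):
        # Step 1: read the parameter values.
        param_dict = param_dict or {}
        documents = param_dict.get("document")
        embeddings: Optional[Embeddings] = param_dict.get("embeddings")
        if not documents:
            raise ValueError("'document' must be a non-empty list of document lists")
        if not isinstance(embeddings, Embeddings):
            raise TypeError("'embeddings' must be an Embeddings instance")

        # Flatten the nested document lists into plain strings, dropping newlines.
        texts = []
        for doc_group in documents:
            for doc in doc_group:
                texts.append(doc.page_content.replace("\n", ""))

        # Step 2: build the store. `texts` holds strings, so from_texts is used
        # instead of from_documents, which expects Document objects.
        self.__vectorstore = Chroma.from_texts(
            texts=texts,
            embedding=embeddings,
            client=chromadb_client,
        )

        # `outputs` is indexed by key, so it is a mapping; default to "retriever".
        outputs: Optional[Dict[str, Any]] = param_dict.get("outputs")
        self.__output = outputs["output"] if outputs else "retriever"

    def source(self) -> Optional[Union[BaseRetriever, VectorStore]]:
        # Step 3: return the requested output type (case-insensitive match).
        output = self.__output.lower()
        if output == "retriever":
            # Use maximal marginal relevance (MMR) as the search strategy.
            return self.__vectorstore.as_retriever(search_type="mmr")
        if output == "vectorstore":
            return self.__vectorstore
        return None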
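
The constructor above reads `document` as a list of lists of documents and `outputs` as a mapping with an `output` key. The sketch below mirrors just that input handling in plain Python so the expected shape of `param_dict` is easier to see. `FakeDoc`, `flatten_texts`, and `resolve_output` are hypothetical names introduced for illustration; they are not part of LangChain or of the class above.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a LangChain Document; only page_content is used."""
    page_content: str


def flatten_texts(documents: List[List[FakeDoc]]) -> List[str]:
    """Flatten nested document lists into strings with newlines removed."""
    return [
        doc.page_content.replace("\n", "")
        for group in documents
        for doc in group
    ]


def resolve_output(outputs: Optional[Dict[str, Any]]) -> str:
    """Pick the requested output type, defaulting to "retriever"."""
    return outputs["output"] if outputs else "retriever"


# Example parameter dictionary (the embeddings entry is omitted in this sketch).
param_dict = {
    "document": [
        [FakeDoc("first\nchunk"), FakeDoc("second chunk")],
        [FakeDoc("third")],
    ],
    "outputs": {"output": "vectorStore"},
}

print(flatten_texts(param_dict["document"]))      # ['firstchunk', 'second chunk', 'third']
print(resolve_output(param_dict.get("outputs")))  # vectorStore
print(resolve_output(None))                       # retriever
```

Note that newlines are removed rather than replaced with spaces, so words separated only by a line break are joined together (`"first\nchunk"` becomes `"firstchunk"`). If that is not intended, replacing `"\n"` with a space would keep the words apart.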