
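This tutorial shows how to build a RAG (Retrieval-Augmented Generation) pipeline with Milvus and SiliconFlow's SiliconCloud.
Preparation
1. Environment setup and dependencies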
pip install --upgrade pymilvus openai requests tqdm
SiliconCloud provides API keys similar to OpenAI's (https://docs.siliconflow.cn/quickstart). Log in on its official website, create an API key, and expose it as the environment variable SILICON_FLOW_API_KEY.

import os

os.environ["SILICON_FLOW_API_KEY"] = "***********"
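A missing key only shows up later as an authentication error, so it can help to fail fast. The following is a minimal sketch; the helper name `require_env` is hypothetical and not part of any SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the stripped value of environment variable `name`.

    Raises RuntimeError if the variable is unset or blank.
    `require_env` is a hypothetical helper used only for illustration.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

Calling `require_env("SILICON_FLOW_API_KEY")` before creating the client would then surface a missing key immediately.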
2. Prepare the data
We will use the FAQ pages from the Milvus 2.4.x documentation (https://github.com/milvus-io/milvus-docs/releases/download/v2.4.6-preview/milvus_docs_2.4.x_en.zip) as the private knowledge for our RAG, which is a good data source for a simple RAG pipeline.
Download the zip file and extract it into a folder named milvus_docs.

wget https://github.com/milvus-io/milvus-docs/releases/download/v2.4.6-preview/milvus_docs_2.4.x_en.zip
unzip -q milvus_docs_2.4.x_en.zip -d milvus_docs

from glob import glob

text_lines = []

# Read every FAQ markdown file and split it into chunks at heading markers.
for file_path in glob("milvus_docs/en/faq/*.md", recursive=True):
    with open(file_path, "r") as file:
        file_text = file.read()

    text_lines += file_text.split("# ")
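Splitting on `"# "` is a deliberately simple chunking rule, and it leaves some residue: an empty first chunk and trailing `#` characters from deeper headings. The sketch below uses a made-up sample string (not taken from the Milvus docs) to show that behavior, plus a hypothetical `clean_chunks` helper that tidies the result. Note that the cleanup would also strip a chunk that legitimately ends in `#`.

```python
# Made-up markdown sample with one top-level and two fourth-level headings.
sample = (
    "# FAQ\n\n"
    "#### How do I install Milvus?\n\nUse Docker.\n\n"
    "#### Is Milvus free?\n\nYes.\n"
)

# Same rule as the tutorial: split wherever "# " occurs.
chunks = sample.split("# ")


def clean_chunks(raw_chunks):
    """Drop empty chunks and strip leftover heading markers and whitespace."""
    cleaned = []
    for chunk in raw_chunks:
        text = chunk.strip().rstrip("#").strip()
        if text:
            cleaned.append(text)
    return cleaned


cleaned_chunks = clean_chunks(chunks)
```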
from openai import OpenAI

siliconflow_client = OpenAI(
    api_key=os.environ["SILICON_FLOW_API_KEY"], base_url="https://api.siliconflow.cn/v1"
)

def emb_text(text):
    return (
        siliconflow_client.embeddings.create(input=text, model="BAAI/bge-large-en-v1.5")
        .data[0]
        .embedding
    )

test_embedding = emb_text("This is a test")
embedding_dim = len(test_embedding)
print(embedding_dim)
print(test_embedding[:10])

1024
[0.011475468054413795, 0.02982141077518463, 0.0038535362109541893, 0.035921916365623474, -0.0159175843000412, -0.014918108470737934, -0.018094222992658615, -0.002937349723652005, 0.030917132273316383, 0.03390815854072571]

from pymilvus import MilvusClient

milvus_client = MilvusClient(uri="./milvus_demo.db")

collection_name = "my_rag_collection"
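Note
About the MilvusClient arguments:
Setting the URI to a local file, e.g. ./milvus.db, is the most convenient method, because it automatically uses Milvus Lite to store all data in that file.
If you have a large amount of data, you can set up a Milvus server on Docker or Kubernetes, which runs more efficiently at that scale. In this setup, use the server URI, e.g. http://localhost:19530.
If you want to use Zilliz Cloud, the fully managed cloud service for Milvus, set the URI and token to the Public Endpoint and API key of your Zilliz Cloud cluster (https://docs.zilliz.com/docs/on-zilliz-cloud-console#free-cluster-details).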
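The three deployment options differ only in the arguments passed to `MilvusClient`. As a rough sketch, the choice can be captured in one place; the helper `milvus_client_kwargs` and its mode names are hypothetical, and only the `uri` and `token` keyword arguments come from pymilvus.

```python
def milvus_client_kwargs(mode: str, uri: str = "", token: str = "") -> dict:
    """Build keyword arguments for MilvusClient for one deployment mode.

    Hypothetical helper; the mode names "lite", "server", and "zilliz"
    are illustrative labels, not pymilvus terminology.
    """
    if mode == "lite":
        # Local file: Milvus Lite stores everything in this file.
        return {"uri": uri or "./milvus_demo.db"}
    if mode == "server":
        # Self-hosted Milvus on Docker or Kubernetes.
        return {"uri": uri or "http://localhost:19530"}
    if mode == "zilliz":
        # Zilliz Cloud needs both the public endpoint and an API key.
        if not uri or not token:
            raise ValueError("Zilliz Cloud requires both uri and token")
        return {"uri": uri, "token": token}
    raise ValueError(f"Unknown mode: {mode}")
```

With pymilvus installed, `MilvusClient(**milvus_client_kwargs("lite"))` would open the local file used in this tutorial.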
Before creating the collection, check whether one with the same name already exists, and drop it if so to avoid a duplicate.

if milvus_client.has_collection(collection_name):
    milvus_client.drop_collection(collection_name)

milvus_client.create_collection(
    collection_name=collection_name,
    dimension=embedding_dim,
    metric_type="IP",  # Inner product distance
    consistency_level="Strong",  # Strong consistency level
)
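With the inner product (`IP`) metric, a larger score means a closer match. For unit-length vectors the inner product equals cosine similarity; whether a particular embedding model returns normalized vectors is an assumption worth checking. The pure-Python sketch below (all helper names are illustrative) shows the relationship on toy vectors.

```python
import math


def inner_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(inner_product(v, v))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = l2_norm(v)
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Inner product divided by the product of the vector lengths."""
    return inner_product(a, b) / (l2_norm(a) * l2_norm(b))


a, b = [3.0, 4.0], [4.0, 3.0]
ip_of_unit_vectors = inner_product(normalize(a), normalize(b))
cosine = cosine_similarity(a, b)
```

Checking `l2_norm(test_embedding)` against 1.0 is one way to see whether IP scores from a given model can be read as cosine similarities.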
from tqdm import tqdm

data = []

for i, line in enumerate(tqdm(text_lines, desc="Creating embeddings")):
    data.append({"id": i, "vector": emb_text(line), "text": line})

milvus_client.insert(collection_name=collection_name, data=data)

Creating embeddings: 100%|██████████| 72/72 [00:04<00:00, 16.97it/s]
{'insert_count': 72, 'ids': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71], 'cost': 0}
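Inserting all 72 rows in one call is fine at this scale. For a larger corpus, one option is to insert in fixed-size batches. The sketch below is a generic batching helper; the name `iter_batches` and the batch size are illustrative assumptions, not Milvus requirements.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Toy data standing in for the list of row dictionaries.
example_batches = list(iter_batches(list(range(5)), 2))
```

Applied to this tutorial, each batch would be passed to `milvus_client.insert(collection_name=collection_name, data=batch)` inside a loop over `iter_batches(data, 100)`.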
question = "How is data stored in milvus?"
search_res = milvus_client.search(
    collection_name=collection_name,
    data=[
        emb_text(question)
    ],  # Use the `emb_text` function to convert the question to an embedding vector
    limit=3,  # Return top 3 results
    search_params={"metric_type": "IP", "params": {}},  # Inner product distance
    output_fields=["text"],  # Return the text field
)

import json

retrieved_lines_with_distances = [
    (res["entity"]["text"], res["distance"]) for res in search_res[0]
]
print(json.dumps(retrieved_lines_with_distances, indent=4))
[[" Where does Milvus store data?\n\nMilvus deals with two types of data, inserted data and metadata. \n\nInserted data, including vector data, scalar data, and collection-specific schema, are stored in persistent storage as incremental log. Milvus supports multiple object storage backends, including [MinIO](https://min.io/), [AWS S3](https://aws.amazon.com/s3/?nc1=h_ls), [Google Cloud Storage](https://cloud.google.com/storage?hl=en#object-storage-for-companies-of-all-sizes) (GCS), [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs), [Alibaba Cloud OSS](https://www.alibabacloud.com/product/object-storage-service), and [Tencent Cloud Object Storage](https://www.tencentcloud.com/products/cos) (COS).\n\nMetadata are generated within Milvus. Each Milvus module has its own metadata that are stored in etcd.\n\n###",0.833885133266449],["How does Milvus flush data?\n\nMilvus returns success when inserted data are loaded to the message queue. However, the data are not yet flushed to the disk. Then Milvus' data node writes the data in the message queue to persistent storage as incremental logs. If `flush()` is called, the data node is forced to write all data in the message queue to persistent storage immediately.\n\n###",0.812842607498169],["Does the query perform in memory? What are incremental data and historical data?\n\nYes. When a query request comes, Milvus searches both incremental data and historical data by loading them into memory. Incremental data are in the growing segments, which are buffered in memory before they reach the threshold to be persisted in storage engine, while historical data are from the sealed segments that are stored in the object storage. Incremental data and historical data together constitute the whole dataset to search.\n\n###",0.7714196443557739]]
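The top 3 hits are always returned, even when some are only loosely related. One possible refinement is to keep only hits above a minimum score. The sketch below uses hand-written hits that mimic the shape of `search_res[0]` (an `entity` dictionary plus a `distance`); the helper `filter_hits` and the threshold 0.8 are arbitrary assumptions for illustration.

```python
def filter_hits(hits, min_score):
    """Return (text, score) pairs for hits whose score is at least `min_score`.

    Assumes a similarity metric such as IP, where higher means closer.
    """
    return [
        (hit["entity"]["text"], hit["distance"])
        for hit in hits
        if hit["distance"] >= min_score
    ]


# Hand-written stand-ins for search results, not real Milvus output.
sample_hits = [
    {"entity": {"text": "Where does Milvus store data?"}, "distance": 0.83},
    {"entity": {"text": "How does Milvus flush data?"}, "distance": 0.81},
    {"entity": {"text": "Does the query perform in memory?"}, "distance": 0.77},
]

kept = filter_hits(sample_hits, min_score=0.8)
```

A suitable threshold depends on the embedding model and the data, so it would need to be tuned rather than copied from this example.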
context = "\n".join([line_with_distance[0] for line_with_distance in retrieved_lines_with_distances])
SYSTEM_PROMPT = """
Human: You are an AI assistant. You are able to find answers to the questions from the contextual passage snippets provided.
"""
USER_PROMPT = f"""
Use the following pieces of information enclosed in <context> tags to provide an answer to the question enclosed in <question> tags.
<context>
{context}
</context>
<question>
{question}
</question>
"""
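Because the user prompt is an f-string, it is fixed at the moment it is defined. To ask several questions, the same template can be wrapped in a function. This is a minimal sketch; `build_user_prompt` is a hypothetical helper that reuses the tutorial's wording.

```python
def build_user_prompt(context: str, question: str) -> str:
    """Fill the tutorial's prompt template with a context and a question."""
    return (
        "Use the following pieces of information enclosed in <context> tags "
        "to provide an answer to the question enclosed in <question> tags.\n"
        f"<context>\n{context}\n</context>\n"
        f"<question>\n{question}\n</question>"
    )


example_prompt = build_user_prompt(
    context="Milvus stores metadata in etcd.",
    question="Where is metadata stored?",
)
```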
response = siliconflow_client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V2.5",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": USER_PROMPT},
    ],
)
print(response.choices[0].message.content)

In Milvus, data is stored in two main categories: inserted data and metadata.

- **Inserted Data**: This includes vector data, scalar data, and collection-specific schema, which are stored in persistent storage as incremental logs. Milvus supports various object storage backends such as MinIO, AWS S3, Google Cloud Storage (GCS), Azure Blob Storage, Alibaba Cloud OSS, and Tencent Cloud Object Storage (COS).