RAG for Beginners: What Is a SQL Agent?

Coggle数据科学 2024-03-04

What Is a SQL Agent?

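A SQL Agent is a system or tool that interacts with a database and carries out tasks against a SQL database. In LangChain, the SQL Agent offers a more flexible way to interact with SQL databases than a fixed chain. It can answer questions based not only on the database's contents but also on its schema, for example by describing a specific table.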

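The main advantages of a SQL Agent are:

Schema- and content-based answers: the agent can answer questions using both the database schema and its contents, so it can also provide structural information such as a description of a specific table.
Error recovery: if a generated query fails when executed, the agent can capture the error message (traceback) and try again with a regenerated query.
Multi-step dependent queries: some questions need several queries that depend on one another before they can be answered, and the agent can handle this.
Lower resource use: the agent only considers the schemas of the relevant tables, which keeps prompts smaller and saves tokens.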
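
The error-recovery behaviour can be illustrated with a minimal sketch. It does not use LangChain: `fake_generate_sql` is a hypothetical stand-in for the LLM call, and the in-memory `Artist` table is an assumption made only so the example runs on its own. The point is the loop that feeds the database error back into the next generation attempt.

```python
import sqlite3

def fake_generate_sql(question, error=None):
    """Hypothetical stand-in for an LLM call.

    The first attempt deliberately uses a wrong column name; once it is
    given an error message it returns a corrected query.
    """
    if error is None:
        return "SELECT Title FROM Artist ORDER BY ArtistId"  # wrong column on purpose
    return "SELECT Name FROM Artist ORDER BY ArtistId"

def run_with_retry(conn, question, max_attempts=3):
    """Generate a query, execute it, and regenerate it on failure."""
    error = None
    for _ in range(max_attempts):
        sql = fake_generate_sql(question, error)
        try:
            return sql, conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            # Feed the database error back into the next generation step.
            error = str(exc)
    raise RuntimeError(f"no valid query after {max_attempts} attempts: {error}")

# Small in-memory database (assumed data, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO Artist (Name) VALUES (?)", [("AC/DC",), ("Accept",)])

final_sql, rows = run_with_retry(conn, "List the artists")
print(final_sql)
print(rows)
```

In a real agent the stub would be replaced by a model call whose prompt includes the previous query and its error message.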

LangChain Example

https://python.langchain.com/docs/use_cases/sql/agents

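At a high level, a SQL Agent works in three steps:

Convert the question into a SQL query
Execute the SQL query
Use the query result to respond to the user's input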
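
The three steps above can be sketched without any LLM at all. In the sketch below, `question_to_sql` and `respond` are hypothetical stubs for the two places where a real agent would call a model, and the in-memory `Artist` table is assumed data; only the overall shape of the pipeline is the point.

```python
import sqlite3

def question_to_sql(question):
    """Step 1 (stubbed): a real agent would ask an LLM to write this SQL."""
    canned = {
        "How many artists are there?": "SELECT COUNT(*) FROM Artist",
    }
    return canned[question]

def execute_sql(conn, sql):
    """Step 2: run the query and collect the rows."""
    return conn.execute(sql).fetchall()

def respond(question, rows):
    """Step 3 (stubbed): a real agent would let the LLM phrase the answer."""
    return f"Q: {question} A: {rows[0][0]}"

# Small in-memory database (assumed data, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO Artist (Name) VALUES (?)", [("AC/DC",), ("Accept",)])

question = "How many artists are there?"
sql = question_to_sql(question)
rows = execute_sql(conn, sql)
answer = respond(question, rows)
print(answer)
```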

Connecting to the database

from langchain_community.utilities import SQLDatabase

db = SQLDatabase.from_uri("sqlite:///Chinook.db", sample_rows_in_table_info=3)
print(db.dialect)
print(db.get_usable_table_names())
print(db.run("SELECT * FROM Artist LIMIT 10;"))

Creating the SQL query chain

from langchain.chains import create_sql_query_chain
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
chain = create_sql_query_chain(llm, db)
chain.get_prompts()[0].pretty_print()

You are a SQLite expert. Given an input question, first create a syntactically correct SQLite query to run, then look at the results of the query and return the answer to the input question.
Unless the user specifies in the question a specific number of examples to obtain, query for at most 5 results using the LIMIT clause as per SQLite. You can order the results to return the most informative data in the database.
Never query for all columns from a table. You must query only the columns that are needed to answer the question. Wrap each column name in double quotes (") to denote them as delimited identifiers.
Pay attention to use only the column names you can see in the tables below. Be careful to not query for columns that do not exist. Also, pay attention to which column is in which table.
Pay attention to use date('now') function to get the current date, if the question involves "today".

Use the following format:

Question: Question here
SQLQuery: SQL Query to run
SQLResult: Result of the SQLQuery
Answer: Final answer here

Only use the following tables:
{table_info}

Question: {input}

Inspecting the database context

context = db.get_context()
print(list(context))
print(context["table_info"])

SQL question answering

prompt_with_context = chain.get_prompts()[0].partial(table_info=context["table_info"])
print(prompt_with_context.pretty_repr()[:1500])

Validating SQL Queries

https://python.langchain.com/docs/use_cases/sql/query_checking

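Validating input before a query is executed is important: checking that it matches the expected format and types guards against problems caused by malicious or mistaken input. For generated SQL, an additional LLM step can review the query and rewrite it if needed: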

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

system = """Double check the user's {dialect} query for common mistakes, including:
- Using NOT IN with NULL values
- Using UNION when UNION ALL should have been used
- Using BETWEEN for exclusive ranges
- Data type mismatch in predicates
- Properly quoting identifiers
- Using the correct number of arguments for functions
- Casting to the correct data type
- Using the proper columns for joins

If there are any of the above mistakes, rewrite the query. If there are no mistakes, just reproduce the original query.

Output the final SQL query only."""
prompt = ChatPromptTemplate.from_messages(
    [("system", system), ("human", "{query}")]
).partial(dialect=db.dialect)
validation_chain = prompt | llm | StrOutputParser()

full_chain = {"query": chain} | validation_chain
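
An LLM-based check can be complemented by a cheap, deterministic one. The sketch below is not part of the LangChain example: `check_query` is a hypothetical helper that uses only Python's built-in `sqlite3` module, and it assumes a SQLite database. It rejects anything that is not a plain SELECT and then asks SQLite to compile the statement with EXPLAIN, which surfaces syntax errors and unknown tables or columns without running the query.

```python
import sqlite3

def check_query(conn, sql):
    """Return (ok, message) without running the query itself."""
    stripped = sql.strip().rstrip(";")
    # Deliberately strict: this also rejects queries that start with WITH.
    if not stripped.lower().startswith("select"):
        return False, "only SELECT statements are allowed"
    try:
        # EXPLAIN makes SQLite compile the statement and return its
        # bytecode listing instead of executing it.
        conn.execute("EXPLAIN " + stripped)
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return False, str(exc)
    return True, "ok"

# Small in-memory database (assumed schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT)")

print(check_query(conn, "SELECT Name FROM Artist"))
print(check_query(conn, "SELECT Nope FROM Artist"))
print(check_query(conn, "DROP TABLE Artist"))
```

A check like this catches structural mistakes only; it cannot tell whether a valid query actually answers the question, which is what the LLM review step is for.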

Querying Across Many Tables

https://python.langchain.com/docs/use_cases/sql/large_db

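When a database has many tables, many columns, or high-cardinality columns, it is no longer feasible to include all of its information in every prompt.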

Extracting relevant table names

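First, identify the table names that may be relevant to the user's input, using text-processing techniques such as regular expressions or natural language processing.
This can be done by analyzing the question, extracting keywords or phrases, and matching them against the table names in the database.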
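
A minimal sketch of the keyword-matching approach described above, using only the standard library. The function name `relevant_tables`, the table list, and the naive plural rule (appending "s") are all assumptions for illustration; a production system would need something more robust, and an LLM can also be asked to pick the tables.

```python
import re

def relevant_tables(question, table_names):
    """Return table names whose name (or naive plural) appears in the question."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    matches = []
    for name in table_names:
        key = name.lower()
        if key in words or key + "s" in words:
            matches.append(name)
    return matches

# Table names assumed for illustration.
tables = ["Album", "Artist", "Customer", "Invoice", "Track"]
selected = relevant_tables("Which artists have the most albums?", tables)
print(selected)  # ['Album', 'Artist']
```

Exact keyword matching misses synonyms (a question about "songs" will not match a `Track` table), which is the main reason to hand this step to a model when the schema is large.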

Retrieving table schemas

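Once the relevant table names are known, their schema information can be retrieved with a database query.
The exact command depends on the dialect: for example DESCRIBE table_name; in MySQL, or PRAGMA table_info(table_name); in SQLite.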
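
For SQLite, schema retrieval can be sketched with Python's built-in `sqlite3` module. The helper name `table_schema` and the in-memory `Artist` table are assumptions for illustration. Because the table name is interpolated into the statement, the sketch first checks it against the tables that actually exist.

```python
import sqlite3

def table_schema(conn, table_name):
    """Return (column_name, declared_type) pairs for one table."""
    known = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    if table_name not in known:
        raise ValueError(f"unknown table: {table_name}")
    rows = conn.execute(f'PRAGMA table_info("{table_name}")').fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in rows]

# Small in-memory database (assumed schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT)")

schema = table_schema(conn, "Artist")
print(schema)  # [('ArtistId', 'INTEGER'), ('Name', 'TEXT')]
```

Only the schemas of the selected tables then need to go into the prompt, rather than the schema of the whole database.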

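This article is reposted from Coggle数据科学. If you believe it infringes your rights, please email contact@modb.pro with supporting evidence; once verified, 墨天轮 will remove the content immediately.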