核心功能和优势:
结合搜索:
不同于传统数据库只能处理结构化数据,AI Vector Search 能够理解和处理非结构化数据(如图片、PDF、音频等)。 利用 AI 矢量化技术,将文本、图像等数据转换为数学向量,通过计算相似度快速找到相关内容。
实时运行:
依托 Oracle Database 23ai,AI 算法直接与数据库集成,无需将数据导出到外部 AI 平台。 这意味着用户可以直接在数据库中实时运行 AI 模型,节省数据移动所需的时间和资源。
高效开发:
Oracle 提供了统一的开发工具和关键任务支持,开发者可以用一种更高效的方式构建智能应用程序。 数据专业人员无需掌握复杂的 AI 技术,也能轻松利用 AI Vector Search 提高业务搜索效率。
安全可靠:
由于所有操作都在 Oracle 数据库内部完成,数据始终受到高标准的安全保护。 结合企业的关键任务功能,可确保搜索过程的稳定性和高性能。
适用场景:
- 知识管理:
在企业内部快速搜索相关文档、合同、图片等,提升员工的生产力。 - 客户服务:
通过矢量搜索匹配客户提问与企业知识库中的答案,构建智能客服系统。 - 推荐系统:
根据用户行为和兴趣向量推荐产品或服务。 - 医疗健康:
在医学影像、病例中快速找到相关数据,用于诊断或研究。
通过 Oracle AI Vector Search,开发人员和数据专业人员能够更轻松地将 AI 与数据库融合,开发智能应用程序,并处理关键任务工作负载,提升企业效率与竞争力。
很多人跟我一样对自己制作模型一无所知,所以这里我可以使用 Oracle 为我们提供的一个名为 all-MiniLM-L12-v2 的模型。
2、下载并解压模型
我们创建一个目录来保存模型,并将模型解压缩到该目录中。
mkdir -p u01/modelscd u01/modelswget https://adwc4pm.objectstorage.us-ashburn-1.oci.customer-oci.com/p/VBRD9P8ZFWkKvnfhrWxkpPe8K03-JIoM5h_8EJyJcpE80c108fuUjg7R5L5O7mMZ/n/adwc4pm/b/OML-Resources/o/all_MiniLM_L12_v2_augmented.zipunzip -oq all_MiniLM_L12_v2_augmented.zip
3、下载模型(第1步已含)
wget https://adwc4pm.objectstorage.us-ashburn-1.oci.customer-oci.com/p/VBRD9P8ZFWkKvnfhrWxkpPe8K03-JIoM5h_8EJyJcpE80c108fuUjg7R5L5O7mMZ/n/adwc4pm/b/OML-Resources/o/all_MiniLM_L12_v2_augmented.zip
4、解压缩模型包(第1步已含)
unzip -oq all_MiniLM_L12_v2_augmented.zip
5、连接数据库
我们连接到数据库,创建测试用户,创建指向物理目录的目录对象,并为测试用户授予访问权限。
conn sys/SysPassword1@//localhost:1521/freepdb1 as sysdba
6、创建测试用户和模型目录并授权
-- 创建用户并授权create user if not exists testuser1 identified by testuser1 quota unlimited on users;grant create session, db_developer_role, create mining model to testuser1;-- 创建目录并授权create or replace directory model_dir as '/u01/models';grant read, write on directory model_dir to testuser1;
7、将模型加载到数据库
我们现在可以使用DBMS_VECTOR包将模型加载到数据库中。
begindbms_vector.drop_onnx_model (model_name => 'ALL_MINILM_L12_V2',force => true);dbms_vector.load_onnx_model (directory => 'model_dir',file_name => 'all_MiniLM_L12_v2.onnx',model_name => 'ALL_MINILM_L12_V2');end;/
8、查看模型信息
我们在USER_MINING_MODELS视图中看到了模型信息。
column model_name format a30column algorithm format a10column mining_function format a15select model_name, algorithm, mining_functionfrom user_mining_modelswhere model_name = 'ALL_MINILM_L12_V2';MODEL_NAME ALGORITHM MINING_FUNCTION------------------------------ ---------- ---------------ALL_MINILM_L12_V2 ONNX EMBEDDINGSQL>
9、生成向量(VECTOR数据类型)
et long 1000000select vector_embedding(all_minilm_l12_v2 using 'Quick test' as data) AS my_vector;MY_VECTOR--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------[-3.8644433E-002,7.27762729E-002,-6.99380785E-003,-7.29616638E-003,8.81515723E-003,-6.36086538E-002,4.39667143E-003,-4.20215279E-002,-1.32307231E-001,-5.83763141E-003,-1.32369539E-002,-1.62914675E-002,6.54895976E-003,-4.98352237E-002,-1.98450759E-002,-4.69920859E-002,1.03937693E-001,-8.96753371E-002,-2.77858647E-003,4.13947217E-002,-6.51627034E-002,-1.0990192E-001,-8.73052608E-003,2.533352E-002,-1.42030744E-002,-2.42071413E-002,1.91592015E-002,4.93750861E-003,6.30867062E-003,-1.24127828E-001,-7.17296917E-003,3.73173878E-002,4.9763605E-002,4.52162512E-002,1.49683282E-002,-2.21796334E-002,-3.67936343E-002,-6.20212057E-004,7.16803446E-002,5.33913262E-003,1.92087796E-002,-9.91346464E-002,3.90679725E-002,2.22725477E-002,5.04363291E-002,1.81943234E-002,5.34031764E-002,1.44161787E-002,-1.99908093E-002,-1.20323906E-002,-2.63888612E-002,-4.14666645E-002,6.24738075E-002,-4.68838811E-002,1.16748791E-002,-2.43180897E-002,-3.11982501E-002,-7.57505326E-003,2.25466546E-002,-4.17359509E-002,1.23237111E-002,4.31706719E-002,-7.83751085E-002,1.24918474E-002,5.42060696E-002,4.33742851E-002,2.52278009E-003,-1.15482165E-002,-9.98713658E-004,-2.12613232E-002,1.00960573E-002,3.17986757E-002,-1.13147125E-002,-1.26893111E-002,2.66182888E-002,-7.5068539E-003,-3.70341614E-002,1.9485198E-002,-2.921376E-002,-2.61210185E-002,2.86212768E-002,-9.15901735E-002,1.50552308E-002,-4.9816858E-002,2.29324102E-002,7.82513991E-003,4.22973074E-002,3.37974578E-002,-4.23457436E-002,-6.32970929E-002,3.84950414E-002,-1.93851739E-002,1.96237396E-003,-3.91559181E-004,7.80333811E-003,5.63595779E-002,4.45815139E-002,-4.97016683E-002,1.36383837E-002,2.76547611E-001,6.35802001E-002,-1.69337578E-002,-3.25948372E-002,2.74621435E-002,-1.84808951E-002,-3.58916447E-002,3.18281911E-003,-3.92074399E-002,-5.03929285E-003,-3.95198613E-002,2.64223926E-002,5.44404797E-002,-2.97637819E-003,1.0656476E-002,4.55005579E-002,-9.6166715E-002,4.53018732E-002,3.02239601E-002,-1.11025617E-001,6.18583001E-002,8.55141282E-002,-1.51456818E-002,-5.6408301E-002,-5.93017961E-004,1.07500203E-001,-6.81523681E-002,1.85917076E-002,3.75313126E-002,-3.27163152E-002,-4.72421981E-002,5.59753105E-002,2.20437739E-002,2.74991728E-002,2.6306238E-002,-5MY_VECTOR--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------.36230095E-002,6.86871773E-003,5.06029651E-003,8.86866376E-002,3.97678688E-002,-1.4952058E-003,-1.11991599E-001,-1.48920929E-002,-1.42183332E-002,5.44683374E-002,-5.46902604E-002,-3.37714627E-002,-3.93099897E-002,-8.8875033E-003,-2.50034444E-002,-3.8671799E-002,7.15422854E-002,-1.72948167E-002,5.7217218E-002,1.23445876E-002,-6.25340268E-002,-1.97963398E-002,4.08164188E-002,9.2235636E-003,2.36456394E-002,-4.27598739E-003,-1.24366455E-001,8.28649029E-002,-5.27118742E-002,-1.11121098E-002,4.35752161E-002,1.35777248E-002,-2.25060564E-002,4.55260463E-002,3.89738753E-002,-8.9330703E-002,1.17293425E-001,5.51190637E-002,-2.56631747E-002,-5.30632809E-002,-8.39535818E-002,4.83218301E-003,6.57674894E-002,8.87271166E-002,-1.52742919E-002,1.05254455E-002,-1.58144645E-002,-3.10783144E-002,-2.9069094E-002,7.04223011E-003,-3.09849512E-002,-4.46299138E-003,-7.20088407E-002,-7.05658719E-002,4.65546325E-002,1.10276632E-001,3.60871851E-002,1.86070222E-002,-6.10643029E-002,3.21829244E-002,-1.43657476E-002,-6.75653145E-002,8.07486102E-002,1.68783069E-002,-1.0059043E-001,-7.55800903E-002,-1.69591829E-002,-4.45710421E-002,-8.60542338E-003,4.33787964E-002,4.29520719E-002,3.94066162E-002,8.67492054E-003,-8.52123275E-002,1.20206453E-001,-1.14268251E-001,-1.70285124E-002,8.87670461E-003,-4.69080843E-002,-3.02138515E-002,4.61057499E-002,-4.92520146E-002,1.56181203E-002,-9.27053615E-002,-6.08294345E-002,1.46451779E-002,-1.8469112E-002,-1.40407547E-001,5.35490811E-002,5.85880674E-033,7.62652978E-002,-3.0770693E-002,-6.7476593E-003,1.03074148E-001,7.20860213E-002,-9.75818858E-002,1.51840553E-001,7.43322074E-002,-2.99238227E-002,9.39518213E-002,1.50299622E-002,4.35530245E-002,-7.58084841E-003,-7.49263093E-002,-5.07647246E-002,4.0109925E-002,-7.43360296E-002,4.62087467E-002,9.6142469E-003,3.1514265E-004,6.2026035E-002,1.57011151E-002,3.29307579E-002,5.6974791E-002,-7.89973959E-002,9.78369173E-003,1.16775157E-002,-3.65987495E-002,-5.30386977E-002,-1.22491308E-002,5.65312728E-002,3.41438502E-002,-4.26849015E-002,9.84478667E-002,1.52464816E-003,-6.92429468E-002,9.64930728E-002,-1.85021404E-002,4.28027436E-002,-4.41830456E-002,-2.54553054E-002,5.20384647E-002,-1.3808256E-002,-1.59469489E-002,2.10003MY_VECTOR--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------518E-002,-1.85695086E-002,2.29395907E-002,1.91418286E-002,4.09490913E-002,2.35137846E-002,-3.91655453E-002,3.57466601E-002,4.80409786E-002,-1.02699241E-002,1.45040052E-002,-4.27465513E-002,-5.0925903E-002,-7.12790117E-002,-9.191221E-002,1.79740768E-002,-3.53490785E-002,-2.26370115E-002,1.64974518E-002,1.05952621E-001,-3.52565721E-002,-3.41151282E-002,-5.72828352E-002,-3.10265515E-002,6.9757998E-002,-2.50361152E-002,-3.91423702E-002,1.37532372E-002,6.76134787E-003,-5.15896529E-002,-3.5578981E-002,6.91288859E-002,-3.47342566E-002,-1.09835267E-002,-2.45214235E-002,-3.24611999E-002,1.08489944E-002,8.2179103E-003,-2.97738314E-002,4.81431969E-002,-5.78631386E-002,2.85624769E-002,4.0292047E-002,2.82907616E-002,-3.79493125E-002,8.59350059E-003,1.02058621E-002,2.48055886E-002,7.16195907E-003,-6.24535345E-002,-3.2572642E-002,4.26035373E-033,-7.57267745E-003,-4.15649898E-002,-4.98135239E-002,1.02479551E-002,3.28872576E-002,1.50394207E-003,-6.39199391E-002,-7.5357914E-002,-2.46184096E-002,-3.06450184E-002,4.16100062E-002,7.04020485E-002,-8.15085992E-002,2.55300701E-002,1.8981915E-002,4.26408909E-002,-2.19986513E-002,7.1362392E-003,-3.42554934E-002,3.70068592E-003,-3.15256999E-003,1.41581176E-002,5.00134267E-002,7.54985586E-002,6.42605498E-002,7.55612329E-002,1.52721843E-002,1.15661159E-001,-2.45987438E-002,1.08358078E-002,5.02406172E-002,6.28810897E-002,-5.52952439E-002,-5.51969372E-002,-4.60022828E-003,-1.46539817E-002,6.40283972E-002,5.18338121E-002,2.51765884E-002,6.45218417E-002,-8.35603774E-002,3.57579961E-002,6.28177961E-003,3.15946154E-003,2.32084282E-002,4.76812199E-002,-5.47785498E-003,-1.06323607E-001,-1.45862792E-002,-5.92180602E-002,-1.59236379E-002,-1.90922264E-002,4.61262129E-002,2.41158772E-002,-7.9021994E-003,1.11448221E-001,1.11205485E-002,-2.0573834E-002,-4.08657975E-002,5.54621816E-002,5.37177958E-002,4.76263165E-002,-3.29907499E-002,4.3731384E-002]SQL>
10、下载测试用的CSV文本数据
我们可以在DML中使用VECTOR_EMBEDDING函数,这是我们将在下一个示例中执行的操作。
从《拥抱的脸》中摘录了一些电影台词。
cd u01/modelwget https://huggingface.co/datasets/ygorgeurts/movie-quotes/resolve/main/movie_quotes.csv?download=true -O movie_quotes.csv
11、根据csv内容创建一个新的测试表
conn testuser1/testuser1@//localhost:1521/freepdb1drop table if exists movie_quotes purge;create table movie_quotes asselect movie_quote, movie, movie_type, movie_yearfrom external ((movie_quote varchar2(400),movie varchar2(200),movie_type varchar2(50),movie_year number(4))type oracle_loaderdefault directory model_diraccess parameters (records delimited by newlineskip 1badfile model_dirlogfile model_dir:'moview_quotes_ext_tab_%a_%p.log'discardfile model_dirfields csv with embedded terminated by ',' optionally enclosed by '"'missing field values are null(movie_quote char(400),movie,movie_type,movie_year))location ('movie_quotes.csv')reject limit unlimited);desc movie_quotesName Null? Type----------------------------------------- -------- ----------------------------MOVIE_QUOTE VARCHAR2(400)MOVIE VARCHAR2(200)MOVIE_TYPE VARCHAR2(50)MOVIE_YEAR NUMBER(4)SQL>
12、添加向量数据列
添加字段 movie_quote_vector 数据类型为 vector,用来保存向量数据。
alter table movie_quotes add (movie_quote_vector vector);desc movie_quotesName Null? Type----------------------------------------- -------- ----------------------------MOVIE_QUOTE VARCHAR2(400)MOVIE VARCHAR2(200)MOVIE_TYPE VARCHAR2(50)MOVIE_YEAR NUMBER(4)MOVIE_QUOTE_VECTOR VECTOR(*, *)SQL>
13、生成向量填充新列
update movie_quotesset movie_quote_vector = vector_embedding(all_minilm_l12_v2 using movie_quote as data);commit;
14、使用 Vector_DISTANCE 进行向量搜索
我们使用VECTOR_DISTANCE函数进行搜索。这接受两个向量,顾名思义,返回它们之间的距离。由于我们使用的模型为文本数据生成向量,因此我们预计两个相似的向量的向量距离会更小。在以下示例中,我们根据搜索文本创建一个向量,并根据搜索文本和引用文本之间的向量距离对查询的输出进行排序。
首先,我们要求“有励志演讲的电影”。
variable search_text varchar2(100);exec :search_text := 'Films with motivational speaking in them';set linesize 200column movie format a50column movie_quote format a100SELECT vector_distance(movie_quote_vector, (vector_embedding(all_minilm_l12_v2 using :search_text as data))) as distance,movie,movie_quoteFROM movie_quotesorder by 1fetch approximate first 5 rows only;DISTANCE MOVIE MOVIE_QUOTE---------- -------------------------------------------------- ----------------------------------------------------------------------------------------------------6.786E-001 Once Upon a Time in Hollywood That was the best acting i've ever seen in my whole life.6.979E-001 Dead Poets Society You must strive to find your own voice because the longer you wait to begin, the less likely you aregoing to find it at all.7.169E-001 The Pursuit of Happyness Walk that walk and go forward all the time. Don't just talk that talk, walk it and go forward. Also,the walk didn't have to be long strides; baby steps counted too. Go forward.7.186E-001 Joker My mother always tells me to smile and put on a happy face. She told me I had a purpose to bring laughter and joy to the world.7.234E-001 Blazing Saddles Men, you are about to embark on a great crusade to stamp out runaway decency in the west. Now you men will only be risking your lives, whilst I will be risking an almost certain Academy Award nomination for Best Supporting Actor.SQL>
接下来,我们搜索“关于战争的电影”。
variable search_text varchar2(100);exec :search_text := 'Films about war';set linesize 200column movie format a50column movie_quote format a100SELECT vector_distance(movie_quote_vector, (vector_embedding(all_minilm_l12_v2 using :search_text as data))) as distance,movie,movie_quoteFROM movie_quotesorder by 1fetch approximate first 5 rows only;DISTANCE MOVIE MOVIE_QUOTE---------- -------------------------------------------------- ----------------------------------------------------------------------------------------------------5.682E-001 Dr. Strangelove Gentlemen, you can't fight in here! This is the War Room!6.346E-001 Blazing Saddles Men, you are about to embark on a great crusade to stamp out runaway decency in the west. Now you men will only be risking your lives, whilst I will be risking an almost certain Academy Award nomination for Best Supporting Actor.6.587E-001 Fury Ideals are peaceful; history is violent.7.243E-001 The Kill Team You give me your loyalty, and I?ll guarantee that each and every one of you will have a chance to bea warrior, to actually be a part of history.7.253E-001 Dr. No Bond. James BondSQL>
5月18号TiDB社区走进360|聊聊全球视野下的TiDB应用实践!阵容超强大!想要get降本增效大佬秘籍的TiDBer快报名! Oracle 23c 中使用注解 Oracle 23 ai 中 SUM 和 AVG 聚合函数与 INTERVAL 数据类型一起使用 Oracle Control File中的SECTION NUMBERS什么意思? 北纬36度线上的神奇的小城——诸城 诸城方言 自制数字人播报视频 信息系统项目管理师学习笔记(一) Oracle 23c 中列默认值定义为 DEFAULT ON NULL FOR INSERT AND UPDATE Oracle 数据库中的数据质量运算符 在Sybase的Syslogs中我怎么确定Secondary Truncation Point OGG将Oracle的Number数据类型映射为Sybase的Timestamp或者Varbinary 数据类型 Oracle 23c 中的 DBMS_HCHECK 包 在这个喧嚣的世界里,我们都是匆匆过客 如果有一天我不在了(AI音乐创作) OpenAI推出Sora:用AI生成高清视频的魔法工具 香港自由行-游记 软考高项学习笔记 | 一文概括高项49个项目管理过程主要知识点 Oracle Control File中的SECTION NUMBERS什么意思? Windows下Oracle的systimestamp函数的最高精度问题 Oracle Adaptive Plans新特性的个人理解




