暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

Oracle 需要地址替换正则表达式的帮助

ASKTOM 2021-03-31
343

问题描述

我有一个递归查询如下:

with addresses as
(
  select cust_id,address addr from
  (
  select 10 cust_id,'9 Help Street, Level 4' address from dual union all
  select 11 cust_id,'22 Victoria Street' address from dual union all
  select 12 cust_id,'1495 Franklin Str.' address from dual union all
  select 13 cust_id,'30 Hasivim St.,Petah-Tikva' address from dual union all
  select 14 cust_id,'2 Jakaranda St' address from dual union all
  select 15 cust_id,'61, Science Park Rd' address from dual union all
  select 16 cust_id,'61, Social park road.' address from dual union all
  select 17 cust_id,'Av. Hermanos Escobar 5756' address from dual union all
  select 18 cust_id,'Ave. Hermanos Escobar 5756' address from dual union all
  select 19 cust_id,'8000 W FLORISSANT Ave.' address from dual union all
  select 20 cust_id,'8600 MEMORIAL PKWY SW' address from dual union all
  select 21 cust_id,'8200 FLORISSANTMEMORIALWAYABOVE SW' address from dual union all
  select 22 cust_id,'8600 MEMORIALFLORISSANT PKWY SW.' address from dual
  ) t1
),
 replacements as
  (
  select id,to_str,from_string from_str from
  (
  select 1 id,'St' to_str,'Street' from_string from dual union all
 select 2 id,'St' to_str,'St' from_string from dual union all
 select 3 id,'St' to_str,'Strit' from_string from dual union all
 select 4 id,'St' to_str,'Str' from_string from dual union all
 select 5 id,'Rd' to_str,'Rd.' from_string from dual union all
 select 6 id,'Rd' to_str,'road' from_string from dual union all
 select 7 id,'Av' to_str,'Av.' from_string from dual union all
 select 8 id,'Av' to_str,'Ave' from_string from dual union all
 select 9 id,'Av' to_str,'Avenue' from_string from dual union all
 select 10 id,'Av' to_str,'Aven.' from_string from dual union all
 select 11 id,'West' to_str,'W' from_string from dual union all
 select 12 id,'South West' to_str,'SW' from_string from dual
 ) t2
),
 r(cust_id,addr,test_addr,l) as 
 (
   select cust_id,addr,regexp_replace(addr,'(^|\W)' || from_str || '(\W|$)','\1' || to_str || '\2') test_addr,
     id - 1
     from  
     addresses,
     replacements
     where id = (select count(*) from replacements)
   union all
     select cust_id,addr,regexp_replace(test_addr,'(^|\W)' || from_str || '(\W|$)','\1' || to_str || '\2') test_addr,
     l - 1
     from r,
     replacements
     where id = l
  )
 select cust_id,addr,test_addr,l
 from r
 where l=0
 ;


当前输出:

cust_id addr test_addr
10 9 Help Street, Level 4          9 Help St, Level 4
11 22 Victoria Street                   22 Victoria St
12 1495 Franklin Str.                   1495 Franklin St.
13 30 Hasivim St.,Petah-Tikva    30 Hasivim St.,Petah-Tikva
14 2 Jakaranda St                    2 Jakaranda St
15 61, Science Park Rd 61, Science Park Rd
16 61, Social park road. 61, Social park Rd
17 Av. Hermanos Escobar 5756 Av Hermanos Escobar 5756
18 Ave. Hermanos Escobar 5756 Av Hermanos Escobar 5756
19 8000 W FLORISSANT Ave. 8000 West FLORISSANT Av
20 8600 MEMORIAL PKWY SW 8600 MEMORIAL PKWY South West
21 8200 FLORISSANTMEMORIALWAYABOVE SW 8200 FLORISSANTMEMORIALWAYABOVE South West
22 8600 MEMORIALFLORISSANT PKWY SW. 8600 MEMORIALFLORISSANT PKWY South West.


对于像16,18,22这样的cust_id,查询不能按预期工作。在cust_id中,dot在道路之后存在,但仍然变为Rd。

我需要2个查询... 一个完全匹配,包括点,另一个sql与点或不匹配。

期望为第一个sql完全匹配:

cust_id addr test_addr
10 9 Help Street, Level 4 9 Help St, Level 4
11 22 Victoria Street 22 Victoria St
12 1495 Franklin Str. 1495 Franklin St.
13 30 Hasivim St.,Petah-Tikva 30 Hasivim St.,Petah-Tikva
14 2 Jakaranda St 2 Jakaranda St
15 61, Science Park Rd 61, Science Park Rd
16 61, Social park road. 61, Social park road.
17 Av. Hermanos Escobar 5756 Av Hermanos Escobar 5756
18 Ave. Hermanos Escobar 5756 Ave. Hermanos Escobar 5756
19 8000 W FLORISSANT Ave. 8000 West FLORISSANT Ave.
20 8600 MEMORIAL PKWY SW 8600 MEMORIAL PKWY South West
21 8200 FLORISSANTMEMORIALWAYABOVE SW 8200 FLORISSANTMEMORIALWAYABOVE South West
22 8600 MEMORIALFLORISSANT PKWY SW. 8600 MEMORIALFLORISSANT PKWY SW.


第二个sql的预期输出包括点if存在:

cust_id addr test_addr
10 9 Help Street, Level 4 9 Help St, Level 4
11 22 Victoria Street 22 Victoria St
12 1495 Franklin Str. 1495 Franklin St.
13 30 Hasivim St.,Petah-Tikva 30 Hasivim St.,Petah-Tikva
14 2 Jakaranda St 2 Jakaranda St
15 61, Science Park Rd 61, Science Park Rd
16 61, Social park road. 61, Social park Rd
17 Av. Hermanos Escobar 5756 Av Hermanos Escobar 5756
18 Ave. Hermanos Escobar 5756 Av Hermanos Escobar 5756
19 8000 W FLORISSANT Ave. 8000 West FLORISSANT Av
20 8600 MEMORIAL PKWY SW 8600 MEMORIAL PKWY South West
21 8200 FLORISSANTMEMORIALWAYABOVE SW 8200 FLORISSANTMEMORIALWAYABOVE South West
22 8600 MEMORIALFLORISSANT PKWY SW. 8600 MEMORIALFLORISSANT PKWY South West.


谢谢

专家解答

问题在于替换的递归性质。

替换数据图

路 => 路
研发。=> 研发

因此,当查询通过cust 16的替换工作时,它将进行以下更改:

社会公园路61号。=>
社会公园路61号。=>
社会公园路61号

无论你做什么,你都可能会得到像这样的替换链,使用递归SQL给出意想不到的结果。另外,它会尝试对每个地址进行各种替代,这是浪费精力。

您可以通过以下方式避免这些问题:

-将源地址拆分为单词
-将这些外部连接到替换字符串
-使用listagg将这些重新组合回地址

这看起来像:

with addresses as
(
  select cust_id,address addr from
  (
  select 10 cust_id,'9 Help Street, Level 4' address from dual union all
  select 11 cust_id,'22 Victoria Street' address from dual union all
  select 12 cust_id,'1495 Franklin Str.' address from dual union all
  select 13 cust_id,'30 Hasivim St.,Petah-Tikva' address from dual union all
  select 14 cust_id,'2 Jakaranda St' address from dual union all
  select 15 cust_id,'61, Science Park Rd' address from dual union all
  select 16 cust_id,'61, Social park road.' address from dual union all
  select 17 cust_id,'Av. Hermanos Escobar 5756' address from dual union all
  select 18 cust_id,'Ave. Hermanos Escobar 5756' address from dual union all
  select 19 cust_id,'8000 W FLORISSANT Ave.' address from dual union all
  select 20 cust_id,'8600 MEMORIAL PKWY SW' address from dual union all
  select 21 cust_id,'8200 FLORISSANTMEMORIALWAYABOVE SW' address from dual union all
  select 22 cust_id,'8600 MEMORIALFLORISSANT PKWY SW.' address from dual
  ) t1
),
 replacements as
  (
  select id,to_str,from_string from_str from
  (
  select 1 id,'St' to_str,'Street' from_string from dual union all
 select 2 id,'St' to_str,'St' from_string from dual union all
 select 3 id,'St' to_str,'Strit' from_string from dual union all
 select 4 id,'St' to_str,'Str' from_string from dual union all
 select 5 id,'Rd' to_str,'Rd.' from_string from dual union all
 select 6 id,'Rd' to_str,'road' from_string from dual union all
 select 7 id,'Av' to_str,'Av.' from_string from dual union all
 select 8 id,'Av' to_str,'Ave' from_string from dual union all
 select 9 id,'Av' to_str,'Avenue' from_string from dual union all
 select 10 id,'Av' to_str,'Aven.' from_string from dual union all
 select 11 id,'West' to_str,'W' from_string from dual union all
 select 12 id,'South West' to_str,'SW' from_string from dual
 ) t2
), words as (
  select cust_id, addr,
         regexp_substr ( addr, '[^ ]+', 1, rn ) wd, rn
  from   addresses, lateral (
    select level rn from dual
    connect by level <= regexp_count ( addr, ' ' ) + 1
  )
)
  select cust_id, addr,
         listagg ( nvl ( to_str, wd ), ' ' ) 
           within group ( order by rn ) new_addr
  from   words
  left   join replacements
  on     wd = from_str
  group  by cust_id, addr;
  
CUST_ID ADDR                                NEW_ADDR                                     
     10 9 Help Street, Level 4              9 Help Street, Level 4                        
     11 22 Victoria Street                  22 Victoria St                                
     12 1495 Franklin Str.                  1495 Franklin Str.                            
     13 30 Hasivim St.,Petah-Tikva          30 Hasivim St.,Petah-Tikva                    
     14 2 Jakaranda St                      2 Jakaranda St                                
     15 61, Science Park Rd                 61, Science Park Rd                           
     16 61, Social park road.               61, Social park road.                         
     17 Av. Hermanos Escobar 5756           Av Hermanos Escobar 5756                      
     18 Ave. Hermanos Escobar 5756          Ave. Hermanos Escobar 5756                    
     19 8000 W FLORISSANT Ave.              8000 West FLORISSANT Ave.                     
     20 8600 MEMORIAL PKWY SW               8600 MEMORIAL PKWY South West                 
     21 8200 FLORISSANTMEMORIALWAYABOVE SW  8200 FLORISSANTMEMORIALWAYABOVE South West    
     22 8600 MEMORIALFLORISSANT PKWY SW.    8600 MEMORIALFLORISSANT PKWY SW.  


如果要使用尾随点替换字符串 (并将其删除),请将连接条件更改为:

on     ( wd = from_str or wd = from_str || '.' )

文章转载自ASKTOM,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论