结果如下:
[Mc]
Select
* continent=Europe */ Code, City.Name, City.population from
world.Country join world.City on
world.City.CountryCode=world.Country.Code where City.population >
10000 group by Name order by City.Population desc limit 5;
+------+---------------+------------+
| Code | Name | population |
+------+---------------+------------+
| RUS | Moscow | 8389200 |
| GBR | London | 7285000 |
| RUS | St Petersburg | 4694000 |
| DEU | Berlin | 3386667 |
| ESP | Madrid | 2879052 |
+------+---------------+------------+
可以用下面的查询看到ProxySQL的内部信息:
[Pa]
select
active,hits, mysql_query_rules.rule_id, match_digest, match_pattern,
replace_pattern, cache_ttl, apply,flagIn,flagOUT FROM mysql_query_rules
NATURAL JOIN stats.stats_mysql_query_rules ORDER BY
mysql_query_rules.rule_id;
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
|
active | hits | rule_id | match_digest |
match_pattern | replace_pattern | cache_ttl |
apply | flagIN | flagOUT |
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
|
1 | 1 | 33 | NULL |
\S*\s*\/\*\s*continent=.*Europe\s*\*.* | NULL | NULL |
0 | 0 | 25 |
| 1 |
4 | 34 | NULL |
world. | Europe. | NULL |
0 | 25 | 25 |
| 1 |
0 | 35 | NULL |
\S*\s*\/\*\s*continent=.*Africa\s*\*.* | NULL | NULL |
0 | 0 | 24 |
| 1 |
0 | 36 | NULL |
world. | Africa. | NULL |
0 | 24 | 24 |
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
And:
[Pa]
select * from stats_mysql_query_digest;
<snip and taking only digest_text>
Select
Code, City.Name, City.population from Europe.Country join Europe.City
on Europe.City.CountryCode=Europe.Country.Code where City.population
> ? group by Name order by City.Population desc limit ?
可以看到ProxySQL已经在查询中很好地用Europe取代了world,它运行QR34四次。
这对insert/update/delete操作显然也是有用的。
这样的查询:
insert into * continent=Europe */ world.City values(999999,'AAAAAAA','ITA','ROMA',0) ;
将被转换为:
[Pa]
select digest_text from stats_mysql_query_digest;
+-------------------------------------------+
| digest_text |
+-------------------------------------------+
| insert into Europe.City values(?,?,?,?,?) |
+-------------------------------------------+
并且只在需要的schema上执行。
Sharding by host(主机分片)
Using hint
如何分片且重定向查询到host(而不是schema)?这其实更容易!
最主要的一点是,无论什么查询匹配了规则都应该传到定义的HG。没有重写,这意味着更少的工作。
这是如何实现的呢?和之前一样,我有三个节点:192.168.1.[5-6-7]。对于这个例子,我将使用world DB(没有continent schema),分布在每一个节点,我将检索绑定到IP的节点确保我去了正确的位置。
我指示ProxySQL通过HINT发送查询到特定主机。我选择hint ”shard_host_HG”,我将把hint作为一个comment插入到查询中。
因此查询规则将会是:
[Pa]
delete from mysql_query_rules where rule_id in (40,41,42, 10,11,12);
INSERT
INTO mysql_query_rules
(rule_id,active,username,match_pattern,destination_hostgroup,apply)
VALUES (10,1,'user_shardRW',"\/\*\s*shard_host_HG=.*Europe\s*\*.",10,0);
INSERT
INTO mysql_query_rules
(rule_id,active,username,match_pattern,destination_hostgroup,apply)
VALUES (11,1,'user_shardRW',"\/\*\s*shard_host_HG=.*Asia\s*\*.",20,0);
INSERT
INTO mysql_query_rules
(rule_id,active,username,match_pattern,destination_hostgroup,apply)
VALUES (12,1,'user_shardRW',"\/\*\s*shard_host_HG=.*Africa\s*\*.",30,0);
LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;
我将测试的查询是:
[Mc]
Select
/* shard_host_HG=Europe */ City.Name, City.Population from
world.Country join world.City on
world.City.CountryCode=world.Country.Code where Country.code='ITA' limit
5; SELECT * /* shard_host_HG=Europe */ from
information_schema.GLOBAL_VARIABLES where variable_name like 'bind%';
Select
/* shard_host_HG=Asia */ City.Name, City.Population from world.Country
join world.City on world.City.CountryCode=world.Country.Code where
Country.code='IND' limit 5; SELECT * /* shard_host_HG=Asia */ from
information_schema.GLOBAL_VARIABLES where variable_name like 'bind%';
Select
/* shard_host_HG=Africa */ City.Name, City.Population from
world.Country join world.City on
world.City.CountryCode=world.Country.Code where Country.code='ETH' limit
5; SELECT * /* shard_host_HG=Africa */ from
information_schema.GLOBAL_VARIABLES where variable_name like 'bind%';
执行Africa的查询,将得到:
[Mc]
Select
/* shard_host_HG=Africa */ City.Name, City.Population from
world.Country join world.City on
world.City.CountryCode=world.Country.Code where Country.code='ETH' limit
5; SELECT * /* shard_host_HG=Africa */ from
information_schema.GLOBAL_VARIABLES where variable_name like 'bind%';
+-------------+------------+
| Name | Population |
+-------------+------------+
| Addis Abeba | 2495000 |
| Dire Dawa | 164851 |
| Nazret | 127842 |
| Gonder | 112249 |
| Dese | 97314 |
+-------------+------------+
+---------------+----------------+
| VARIABLE_NAME | VARIABLE_VALUE |
+---------------+----------------+
| BIND_ADDRESS | 192.168.1.7 |
+---------------+----------------+
将给我的结果:
[Pa]
select
active,hits, mysql_query_rules.rule_id, match_digest, match_pattern,
replace_pattern, cache_ttl, apply,flagIn,flagOUT FROM mysql_query_rules
NATURAL JOIN stats.stats_mysql_query_rules ORDER BY
mysql_query_rules.rule_id;
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
|
active | hits | rule_id | match_digest |
match_pattern | replace_pattern | cache_ttl |
apply | flagIN | flagOUT |
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
|
1 | 0 | 40 | NULL |
\/\*\s*shard_host_HG=.*Europe\s*\*. | NULL | NULL |
0 | 0 | 0 |
| 1 |
0 | 41 | NULL |
\/\*\s*shard_host_HG=.*Asia\s*\*. | NULL | NULL |
0 | 0 | 0 |
| 1 |
2 | 42 | NULL |
\/\*\s*shard_host_HG=.*Africa\s*\*. | NULL | NULL |
0 | 0 | 0 | <-- Note the HITS (2 as the run queries)
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
在这个例子中,我们没有replace_patter。只有一个匹配和重定向规则,目标HG在插入时由destination_hostgroup属性值定义。Africa 的HG 是30。
HG 30中的主机:
[Pa]
select hostgroup_id,hostname,port,status from mysql_servers ;
+--------------+-------------+------+--------+
| hostgroup_id | hostname | port | status |
+--------------+-------------+------+--------+
| 10 | 192.168.1.5 | 3306 | ONLINE |
| 20 | 192.168.1.6 | 3306 | ONLINE |
| 30 | 192.168.1.7 | 3306 | ONLINE | <---
+--------------+-------------+------+--------+
这完美的匹配了我们的返回值。
你可以自己试试另外2个continent。
Using destination_hostgroup(使用目标主机组)
另外一个标记最终主机的方法是用目标主机组(destination_hostgroup),在查询中设置Schema_name属性并使用use schema语法。
例如:
[Pa]
INSERT INTO mysql_query_rules (active,schemaname,destination_hostgroup,apply) VALUES
(1, 'shard00', 1, 1), (1, 'shard01', 1, 1), (1, 'shard03', 1, 1),
(1, 'shard04', 2, 1), (1, 'shard06', 2, 1), (1, 'shard06', 2, 1),
(1, 'shard07', 3, 1), (1, 'shard08', 3, 1), (1, 'shard09', 3, 1);
然后在查询中执行:
use shard02; Select * from tablex;
这可能会产生一个错误,因为shard03可能不在包含shard01的主机上。
因此只有当你100%确定你在做什么的时候才能使用这种方法,当你确定没有查询有显式申明的schema。
Shard by host and by schema(通过主机和schema分片)
最后,显然是可以结合这上面两种方法的,即通过主机和schema的一个子集分片。
为此,让我们用所有的三个节点且他们的schema分布如下:
Europe on Server 192.168.1.5 -> HG 10
Asia on Server 192.168.1.6 -> HG 20
Africa on Server 192.168.1.7 -> HG 30
我已经用HINT设置过查询规则,我所要做的就是使用它们来连接操作:
[Mc]
Select
/* shard_host_HG=Asia */ /* continent=Asia */ City.Name,
City.Population from world.Country join world.City on
world.City.CountryCode=world.Country.Code where Country.code='IND' limit
5; SELECT * /* shard_host_HG=Asia */ from
information_schema.GLOBAL_VARIABLES where variable_name like 'bind%';
+--------------------+------------+
| Name | Population |
+--------------------+------------+
| Mumbai (Bombay) | 10500000 |
| Delhi | 7206704 |
| Calcutta [Kolkata] | 4399819 |
| Chennai (Madras) | 3841396 |
| Hyderabad | 2964638 |
+--------------------+------------+
5 rows in set (0.00 sec)
+---------------+----------------+
| VARIABLE_NAME | VARIABLE_VALUE |
+---------------+----------------+
| BIND_ADDRESS | 192.168.1.6 |
+---------------+----------------+
1 row in set (0.01 sec)
[Pa]
mysql> select digest_text from stats_mysql_query_digest;
+--------------------------------------------------------------------------------------------------------------------------------------------+
| digest_text |
+--------------------------------------------------------------------------------------------------------------------------------------------+
| SELECT * from information_schema.GLOBAL_VARIABLES where variable_name like ? |
| Select City.Name, City.Population from Asia.Country join Asia.City on Asia.City.CountryCode=Asia.Country.Code where Country.code=? limit ? |
+--------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
mysql> select active,hits, mysql_query_rules.rule_id, match_digest, match_pattern, replace_pattern, cache_ttl, apply,flagIn,flagOUT FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;
可以看到,Rule11 有2个命中,这意味着我的查询将去到相关的HG。但给定到Rule11的Apply= 0,ProxySQL会继续处理查询规则。
因此也会把查询传到Rule 31和32,每一个都有预期命中数(第一为1,第二个为4,因为循环数)。
这是我们在ProxySQL中执行两层分片的所有操作。
Conclusion(结论)
ProxySQL允许用户用一种非常简单的方法,通过分片访问数据。查询规则遵循正则表达式的模式,结合查询规则和主机组的方法的可能性,给了我们巨大且相对简单的应用灵活性。




