OneProxy :: 利用多个视图(View)分片来并行加速MySQL单个大表的查询速度

昨天在测试Spark over MySQL/OneProxy时,发现网上有人使用Spark来加速(通过并行)MySQL上单个大表的查询速度,马上想到OneProxy里也有并行功能,是不是也可以用来加速MySQL上单个大表的查询速度呢?毕竞Spark的任务下发过程中时延还是比较大的,可能OneProxy更适合这个场景。假设在MySQL数据库中有一张比较大的表,如下所示:

mysql> select count(*) from my_hash2;
+----------+
| count(*) |
+----------+
|  8698317 |
+----------+
1 row in set (2.17 sec)

可以通过视图来将数据分成多个分片,如下所示:

create view my_hash2_p0 as select * from
       my_hash2 where tid < 1000000;
create view my_hash2_p1 as select * from
       my_hash2 where tid >= 1000000 and tid < 2000000;
......
create view my_hash2_p8 as select * from
      my_hash2 where tid >= 8000000 and tid < 9000000;

接下来只需要在OneProxy里配置一个分区表,就可以使用OneProxy的并行查询企业版特有功能)来加速了。分区表的配置如下所示:

[
   {
       "table"      : "my_hash2",
       "pkey"       : "tid",
       "type"       : "int",
       "method"     : "range",
       "partitions":
          [
              { "suffix" : "_p0", "group": "shixun1", "value" : 1000000 },
       	      { "suffix" : "_p1", "group": "shixun1", "value" : 2000000 },
       	      { "suffix" : "_p2", "group": "shixun1", "value" : 3000000 },
       	      { "suffix" : "_p3", "group": "shixun1", "value" : 4000000 },
       	      { "suffix" : "_p4", "group": "shixun1", "value" : 5000000 },
       	      { "suffix" : "_p5", "group": "shixun1", "value" : 6000000 },
       	      { "suffix" : "_p6", "group": "shixun1", "value" : 7000000 },
       	      { "suffix" : "_p7", "group": "shixun1", "value" : 8000000 },
       	      { "suffix" : "_p8", "group": "shixun1", "value" : 9000000 }
          ]
    }
]

接下来通过OneProxy来查询表的记录数,看看需要多长时间,如下所示:

mysql> select /* parallel */ count(*) from my_hash2;
+----------+
| count(*) |
+----------+
|  8698317 |
+----------+
1 row in set (0.49 sec)

可以看到,查询速度提升了将近4倍多,MySQL 5.7对于视图合并的SQL优化好象进步比较大,不再是将视图先生成临时表来处理了,那时这个并行加速的方法将相当实用,Good Luck!