カラム指向DB比較 - MonetDB インストールの巻。
カラム指向DB比較 - MonetDB パフォーマンス検証の巻。
カラム指向DB比較 - InfiniDB インストールの巻。
カラム指向DB比較 - InfiniDB(番外編) - はまりポイント挙げるの巻。
の続き。
Infinidb をインストールしたので MonetDB の時と同じクエリでパフォーマンスを検証してみる。
実施環境
カラム指向DB比較 - MonetDB パフォーマンス検証の巻。
カラム指向DB比較 - InfiniDB インストールの巻。
カラム指向DB比較 - InfiniDB(番外編) - はまりポイント挙げるの巻。
の続き。
Infinidb をインストールしたので MonetDB の時と同じクエリでパフォーマンスを検証してみる。
- 検証用DB作成
- 検証データ投入
- クエリによるパフォーマンス
実施環境
- CloudCore
- OS: Ubuntu 12.04
- CPU: AMD Phenom(tm) 9550 Quad-Core Processor 2.2GHz
- Memory: 2G
1. 検証用DB作成
まずはデータベースの作成。
$ idbmysql -u root Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 3485 Server version: 5.1.39 MySQL Community / Calpont InfiniDB Community 2.2.9-2 Final (COSS LA) mysql> create database performance; Query OK, 1 row affected (0.00 sec) mysql> use performance; Database changed
2. 検証データ投入
テストデータとして下記3ファイルをバルクロードする。
- users: 100万件
- payments: 5000万件
- products: 26件
# users mysql> create table users(id int, prop1 int, prop2 int, prop3 int, prop4 varchar(8), registered_date date) engine=infinidb; Query OK, 0 rows affected (0.81 sec) # payments mysql> create table payments(id int, user_id int, product_id int, quantity int, sale int, buy_at datetime) engine=infinidb; Query OK, 0 rows affected (0.57 sec) # products mysql> create table products(id int, name varchar(8), price int) engine=infinidb; Query OK, 0 rows affected (0.27 sec) # fact mysql> create table fact(user_id int, prop1 int, prop2 int, prop3 int, prop4 varchar(8), registered_date date, quantity int, sales int, first_buy_at datetime, last_buy_at datetime) engine=infinidb; Query OK, 0 rows affected (0.44 sec) mysql> show tables; +-----------------------+ | Tables_in_performance | +-----------------------+ | fact | | payments | | products | | users | +-----------------------+ 4 rows in set (0.00 sec)続いてデータ投入
# users $ cpimport performance users /usr/local/var/db/data/no_header/users.csv -s "," ~省略~
2012-09-01 22:42:00 (16816) INFO : No of Read Threads Spawned = 1 # ここの設定がパフォーマンスに影響しそう 2012-09-01 22:42:00 (16816) INFO : No of Parse Threads Spawned = 3 # ここの設定がパフォーマンスに影響しそう
~省略~ 2012-09-01 22:38:27 (16752) INFO : For table performance.users: 1000000 rows processed and 1000000 rows inserted. 2012-09-01 22:38:27 (16752) INFO : Bulk load completed, total run time : 5.50646 seconds # payments $ cpimport performance payments /usr/local/var/db/data/no_header/payments.csv -s "," ~省略~ 2012-09-01 22:44:01 (16816) INFO : For table performance.payments: 50000000 rows processed and 50000000 rows inserted. 2012-09-01 22:44:01 (16816) INFO : Bulk load completed, total run time : 120.644 seconds # products $ cpimport performance products /usr/local/var/db/data/no_header/products.csv -s "," ~省略~ 2012-09-01 22:45:18 (16863) INFO : For table performance.products: 26 rows processed and 26 rows inserted. 2012-09-01 22:45:18 (16863) INFO : Bulk load completed, total run time : 0.113074 seconds1秒で約41万件以上(=5000万件 / 120s)インポートしている。相当速いが MonetDB よりは遅め。
3. クエリによるパフォーマンス検証
# all count: users mysql> select count(id) from users; +-----------+ | count(id) | +-----------+ | 1000000 | +-----------+ 1 row in set (0.16 sec) # all payments: count mysql> select count(id) from payments; +-----------+ | count(id) | +-----------+ | 50000000 | +-----------+ 1 row in set (5.95 sec) # all products: count mysql> select count(id) from products; +-----------+ | count(id) | +-----------+ | 26 | +-----------+ 1 row in set (0.03 sec) # 100 records: users mysql> select * from users order by id limit 100; +------+-------+-------+-------+--------+-----------------+ | id | prop1 | prop2 | prop3 | prop4 | registered_date | +------+-------+-------+-------+--------+-----------------+ | 1 | 41 | 857 | 7976 | prop_W | 2012-11-09 | … | 100 | 75 | 580 | 4855 | prop_X | 2011-05-14 | +------+-------+-------+-------+--------+-----------------+ 100 rows in set (3.03 sec) # 100 records: payments mysql> select * from payments order by id limit 100; +------+---------+------------+----------+--------+---------------------+ | id | user_id | product_id | quantity | sale | buy_at | +------+---------+------------+----------+--------+---------------------+ | 1 | 817590 | 6 | 12 | 87960 | 2011-05-03 21:12:16 | … | 100 | 468994 | 14 | 39 | 300300 | 2012-08-29 07:10:37 | +------+---------+------------+----------+--------+---------------------+ 100 rows in set (2 min 9.46 sec) # 100 records: products mysql> select * from products order by id limit 100; +------+--------+-------+ | id | name | price | +------+--------+-------+ | 1 | item_A | 3800 | … | 26 | item_Z | 8240 | +------+--------+-------+ 26 rows in set (0.05 sec) # a user: users mysql> select * from users where id = 500000; +--------+-------+-------+-------+--------+-----------------+ | id | prop1 | prop2 | prop3 | prop4 | registered_date | +--------+-------+-------+-------+--------+-----------------+ | 500000 | 23 | 371 | 5839 | prop_N | 2012-08-22 | +--------+-------+-------+-------+--------+-----------------+ 1 row in set (0.40 sec) # a user: payments mysql> select * from payments where user_id = 500000; +----------+---------+------------+----------+--------+---------------------+ | id | user_id | product_id | quantity | sale | buy_at | +----------+---------+------------+----------+--------+---------------------+ | 1316372 | 500000 | 5 | 48 | 415680 | 2012-11-07 03:52:05 | … | 46098674 | 500000 | 9 | 32 | 7680 | 2011-04-01 11:17:47 | +----------+---------+------------+----------+--------+---------------------+ 40 rows in set (3.18 sec) # group by: users mysql> select -> registered_date, -> count(id) as uu -> from users -> group by registered_date -> order by registered_date; +-----------------+------+ | registered_date | uu | +-----------------+------+ | 2010-01-01 | 974 | … | 2012-12-31 | 940 | +-----------------+------+ 1096 rows in set (0.42 sec) # group by: payments mysql> select -> product_id, -> count(id) as cnt, -> sum(sale) as sale -> from payments -> group by product_id -> order by product_id; +------------+---------+--------------+ | product_id | cnt | sale | +------------+---------+--------------+ | 1 | 1922437 | 92240910320 | … | 26 | 1924741 | 189485558780 | +------------+---------+--------------+ 26 rows in set (17.75 sec) # data join mysql> select -> usr.id as user_id, -> pdt.name as product_name, -> pmt.quantity, -> pmt.sale, -> pmt.buy_at -> from users as usr -> left outer join payments as pmt -> on -> pmt.user_id = usr.id -> left outer join products as pdt -> on -> pdt.id = pmt.product_id -> limit 100; +---------+--------------+----------+--------+---------------------+ | user_id | product_name | quantity | sale | buy_at | +---------+--------------+----------+--------+---------------------+ | 662139 | item_Q | 42 | 399840 | 2012-03-24 07:13:09 | … | 823105 | item_F | 25 | 183250 | 2011-10-26 21:08:21 | +---------+--------------+----------+--------+---------------------+ 100 rows in set (1.40 sec) # data join & sort mysql> select -> usr.id as user_id, -> pdt.name as product_name, -> pmt.quantity, -> pmt.sale, -> pmt.buy_at -> from users as usr -> left outer join payments as pmt -> on -> pmt.user_id = usr.id -> left outer join products as pdt -> on -> pdt.id = pmt.product_id -> order by usr.id -> limit 100; +---------+--------------+----------+--------+---------------------+ | user_id | product_name | quantity | sale | buy_at | +---------+--------------+----------+--------+---------------------+ | 1 | item_V | 13 | 30290 | 2011-08-21 01:30:40 | … | 2 | item_N | 48 | 369600 | 2011-11-23 09:00:20 | +---------+--------------+----------+--------+---------------------+ 100 rows in set (3 min 56.20 sec) # join count mysql> select count(usr.id) -> from users as usr -> left outer join payments as pmt -> on -> pmt.user_id = usr.id -> left outer join products as pdt -> on -> pdt.id = pmt.product_id; +---------------+ | count(usr.id) | +---------------+ | 50000000 | +---------------+ 1 row in set (1 min 32.36 sec) # delete a user: users mysql> delete from users where id = 500000; Query OK, 1 row affected (0.42 sec) # delete a user: payments mysql> delete from payments where user_id = 500000; Query OK, 40 rows affected (3.27 sec) # delete whole data mysql> delete from users; Query OK, 999999 rows affected (1.52 sec) mysql> delete from payments; Query OK, 49999960 rows affected (1 min 35.67 sec) mysql> delete from products; Query OK, 26 rows affected (0.32 sec)select-insert が実施出来ないので select 文でデータ抽出&バルクロードという形で fact のデータ作成を実施。
# データ抽出 $ time idbmysql -u root -h localhost -D performance -e " select usr.id, min(usr.prop1), min(usr.prop2), min(usr.prop3), min(usr.prop4), min(usr.registered_date), sum(pmt.quantity), sum(pmt.sale), min(pmt.buy_at), max(pmt.buy_at) from users as usr left outer join payments as pmt on pmt.user_id = usr.id group by usr.id; " > result.tsv real 4m24.022s user 0m2.212s sys 0m0.400s # データ投入 cpimport performance fact ./result.tsv -s "\t" ~省略~ 2012-09-01 23:53:32 (19885) INFO : For table performance.fact: 1000000 rows processed and 1000000 rows inserted. 2012-09-01 23:53:32 (19885) INFO : Bulk load completed, total run time : 8.37294 seconds
全体的に MonetDB よりもパフォーマンスは遅めなのかなという印象。ただ、クエリ実施中に cpu が 100% 近く達している事からもっと性能の良いマシンであれば結果は変わってくるかもしれない。またメモリの量も効いてくるはず。
あくまでも『今回の環境』だと上記の結果という事しか言えない。
※DB設定はインストール時のまま検証を行なっている。DWH用途にある程度のチューニングは実施済みだと思うが頑張ればもっとパフォーマンスは上がるかもしれない
0 件のコメント:
コメントを投稿