» tagged pages
» logout

sorted by: recent | see : popular
Content Tagged with performance + Database

InnoDB plugin row format performance - II

This is a continuation post on InnoDB plugin row format performance with few more test scenarios to cover based on the feedback and questions that I got in my Inbox.

This time the initial table looks like this

CREATE TABLE `sbtest` (
  `id` int(10) unsigned NOT NULL,
  `k` int(10) unsigned NOT NULL DEFAULT ‘0′,
  `c` char(120) NOT NULL DEFAULT ,
  `pad` char(60) NOT NULL DEFAULT ,
  PRIMARY KEY (`id`),
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED;

This time, the table does not have a secondary index in the create time, instead it will be added after the table is populated with the data to test the performance of the new faster index support from the plugin. I also added a new scenario to test 2K compression as we noticed 4K compression seems to be a bottleneck in the earlier case.

Same test environment is chosen as of the first one, RHEL-4 64-Bit having MySQL-5.1.24 with InnoDB plugin 1.0 statically linked.

InnoDB configuration variables:

   1: mysql> show variables like ‘%innodb%’;
   2: +———————————+————————–+
   3: | Variable_name                   | Value                    |
   4: +———————————+————————–+
   5: | have_innodb                     | YES                      |
   6: | innodb_adaptive_hash_index      | ON                       |
   7: | innodb_additional_mem_pool_size | 104857600                |
   8: | innodb_autoextend_increment     | 8                        |
   9: | innodb_autoinc_lock_mode        | 1                        |
  10: | innodb_buffer_pool_size         | 6442450944               |
  11: | innodb_checksums                | ON                       |
  12: | innodb_commit_concurrency       | 0                        |
  13: | innodb_concurrency_tickets      | 500                      |
  14: | innodb_data_file_path           | ibdata1:256M:autoextend  |
  15: | innodb_data_home_dir            | /mysql/ibdata |
  16: | innodb_doublewrite              | ON                       |
  17: | innodb_fast_shutdown            | 1                        |
  18: | innodb_file_format              | Barracuda                |
  19: | innodb_file_io_threads          | 4                        |
  20: | innodb_file_per_table           | ON                       |
  21: | innodb_flush_log_at_trx_commit  | 1                        |
  22: | innodb_flush_method             | O_DIRECT                 |
  23: | innodb_force_recovery           | 0                        |
  24: | innodb_lock_wait_timeout        | 50                       |
  25: | innodb_locks_unsafe_for_binlog  | OFF                      |
  26: | innodb_log_buffer_size          | 8388608                  |
  27: | innodb_log_file_size            | 268435456                |
  28: | innodb_log_files_in_group       | 2                        |
  29: | innodb_log_group_home_dir       | /mysql/iblog  |
  30: | innodb_max_dirty_pages_pct      | 90                       |
  31: | innodb_max_purge_lag            | 0                        |
  32: | innodb_mirrored_log_groups      | 1                        |
  33: | innodb_open_files               | 300                      |
  34: | innodb_replication_delay        | 0                        |
  35: | innodb_rollback_on_timeout      | OFF                      |
  36: | innodb_strict_mode              | OFF                      |
  37: | innodb_support_xa               | ON                       |
  38: | innodb_sync_spin_loops          | 20                       |
  39: | innodb_table_locks              | ON                       |
  40: | innodb_thread_concurrency       | 32                       |
  41: | innodb_thread_sleep_delay       | 10000                    |
  42: +———————————+————————–+
  43: 37 rows in set (0.00 sec)

Normally I disable the following InnoDB configuration options when running any benchmark tests; but this time enabled all of them (including in the first case)…

innodb_checksums
innodb_doublewrite
innodb_support_xa

Table Load:

Load time from a dump of SQL script having 10M rows (not batched)

  Compact Compressed (8K) Compressed (4K) Compressed (2K) Dynamic
6G buffer pool 27m 34s 28m 25s 29m 37s 32m 49s 27m 23s

Secondary Index Creation:

Secondary Index create time after the table has been populated. The index added using

CREATE INDEX k ON sbtest(k)
 
  Compact Compressed (8K) Compressed (4K) Compressed (2K) Dynamic
6G buffer pool 0m 51s 1m 23s 7m 0s 7m 42s 0m 51s

File Sizes:

Here is the size of the .ibd file after each data load

  Compact Compressed (8K) Compressed (4K) Compressed (2K) Dynamic
Before Secondary Index 2.2G 1.1G 552M 284M 2.2G
After Secondary Index 2.3G 1.2G 592M 324M 2.3G

Testing:

I used the sysbench to run the tests by varying the threads to run mixed read/write on a  pre-populated table by creating the table with different ROW_FORMATS along with KEY_BLOCK_SIZE. The server is taken down by cleaning its logs and data directories completely and a fresh server is started back to re-run a different row-format .

Here is the sysbench options:

--test=oltp --db-driver=mysql --oltp-table-size=10000000
–oltp-skip-trx=on –oltp-auto-inc=off –mysql-table-engine=innodb
–mysql-user=root –mysql-ignore-duplicates=onmax-requests=100000
–init-rng=on –oltp-test-mode=complex –oltp-dist-type=uniform
–mysql-engine-trx=yes –oltp-read-only=off
–num-threads=XXX run

Some variables are non-relevant to test, but I use the same scripts to run lot of other cases, –mysql-ignore-duplicates is not a standard sysbench option, its a patch that I created to make sure the test does not bail on duplicate key errors due to poor random number generator.

The test typically executes 950000 queries in each iteration (700K read queries and 250K write queries) with max of 50000 transactions in this test and 100K transactions in the part-I case. Here is the typical output from sysbench:

LTP test statistics:
    queries performed:
        read:                            700000
        write:                           250000
        other:                           0
        total:                           950000
    transactions:                        50000  (206.00 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 950000 (3914.01 per sec.)
    other operations:                    0      (0.00 per sec.)

Test execution summary:
    total time:                          242.7179s
    total number of events:              50000
    total time taken by event execution: 123984.2189
    per-request statistics:
         min:                            0.2133s
         avg:                            2.4797s
         max:                            15.6503s
         approx.  95 percentile:         4.0491s

Performance:

Here is the performance of various row formats with threads ranging from 1 to 512 for 6G buffer pool size for both concurrent reads and writes.

compress6g-new

 Observations:

  • The load time is almost same for both Compact and Dynamic row formats. In this table and data scenario, both perform same amount of transactions (as you can see from the graph, they overlap)
  • The load/insert time increases based on the compression ratio from 8K, 4K, 2K, 1K etc
  • When the whole data set can fit in memory; then compression will relatively slowdown the transaction rate
  • When the whole data set can not fit in memory, then compression on the table seems to be a right choice (8K in this case)
  • Creation of secondary index still takes lot of time when the compression is enabled. The 8K compression is much better and bit close to Compact/Dynamic; but 4K/2K seems to take ~6 times more time than 8K
  • The best compression in terms of performance is 8K; but it may vary based on the table structure and data and its distribution.
  • Part-I observations can be found from here

Looks like there is something wrong in terms of performance of 4K/2K compression when compared with 8K when the data set can fit everything in memory (as we have 6G buffer pool). I will dig further and see if we can find a solution for the actual problem or file a bug report..more later.

Convert from one row format to another:

Alter table seems to be a right choice to change from one row format to another one or to change the compression ratio and it is efficient. For example, here is a case where the table is converted from compressed to compact and compact to Compressed and InnoDB re-creates both .frm and .ibd by using temp files. Even here you can see that 4K takes much longer time than expected.

mysql> alter table sbtest ROW_FORMAT=Compact;
Query OK, 10000000 rows affected, 1 warning (2 min 56.48 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> alter table sbtest ROW_FORMAT=Compressed KEY_BLOCK_SIZE=8;
Query OK, 10000000 rows affected (4 min 11.56 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> alter table sbtest ROW_FORMAT=Compressed KEY_BLOCK_SIZE=4;
Query OK, 10000000 rows affected (20 min 6.36 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> alter table sbtest ROW_FORMAT=Compact;
Query OK, 10000000 rows affected, 1 warning (2 min 56.61 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> show warnings;
+———+——+—————————————————————–+
| Level   | Code | Message                                                         |
+———+——+—————————————————————–+
| Warning | 1478 | InnoDB: ignoring KEY_BLOCK_SIZE=4 unless ROW_FORMAT=COMPRESSED. |
+———+——+—————————————————————–+
1 row in set (0.00 sec)

mysql> alter table sbtest ROW_FORMAT=Dynamic;
Query OK, 10000000 rows affected, 1 warning (2 min 42.12 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> show warnings;
+———+——+—————————————————————–+
| Level   | Code | Message                                                         |
+———+——+—————————————————————–+
| Warning | 1478 | InnoDB: ignoring KEY_BLOCK_SIZE=4 unless ROW_FORMAT=COMPRESSED. |
+———+——+—————————————————————–+
1 row in set (0.00 sec)

mysql> alter table sbtest ROW_FORMAT=Compressed KEY_BLOCK_SIZE=2;
Query OK, 10000000 rows affected (20 min 53.76 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

Nice to have:

It will be nice if InnoDB team can extend the SHOW TABLE STATUS to show the compression details. Right now the Information schema.cmp will only lists general compression statistics but not on per table basis.

Also, it will be added advantage if they can support the following two config variables:

innodb_default_row_format (default compact or dynamic) and innodb_default_key_block_size (optionally, and make sure it does not raise a warning on default row_format)

MySQL: Planet MySQL

InnoDB plugin row format performance

Here is a quick comparison of the new InnoDB plugin performance between different compression, row formats that is introduced recently.

The table is a pretty simple one:

CREATE TABLE `sbtest` (
  `id` int(10) unsigned NOT NULL,
  `k` int(10) unsigned NOT NULL DEFAULT ‘0′,
  `c` char(120) NOT NULL DEFAULT ,
  `pad` char(60) NOT NULL DEFAULT ,
  PRIMARY KEY (`id`),
  KEY `k` (`k`)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;

The table is populated with 10M rows with average row length being 224 bytes. The tests are performed for Compact, Dynamic and Compressed (8K and 4K)  row formats using MySQL-5.1.24 with InnoDB plugin-1.0.0-5.1 on Dell PE2950  1x Xeon quad core with 16G RAM, RAID-10 with RHEL-4 64-bit.

Here are the four test scenarios:

  1. No compression, ROW_FORMAT=Compact
  2. ROW_FORMAT=Compressed with KEY_BLOCK_SIZE=8
  3. ROW_FORMAT=Compressed with KEY_BLOCK_SIZE=4
  4. ROW_FORMAT=Dynamic

All the above tests are repeated with innodb_buffer_pool_size=6G and 512M to make sure one fits everything in memory and another one overflows. The rest of the InnoDB settings are all default except that innodb_thread_concurrency=32.

Here is the summary of the test results:

Table Load:

Load time from a dump of SQL script having 10M rows (not batched)

Compact Compressed (8K) Compressed (4K) Dynamic
28m 18s 29m 46s 36m 43s 27m 55s

File Sizes:

Here is the size of the .ibd file after each data load

Compact Compressed (8K) Compressed (4K) Dynamic
2.3G 1.2G 592M 2.3G

Data and Index Size from Table Status:

Here is the Data and Index size in bytes from SHOW TABLE STATUS and you can see the original data size here rather than the compressed size

  Compact Compressed (8K) Compressed (4K) Dynamic
Data 2247098368 2247098368 2249195520 2247098368
Index 137019392 137035776 160301056 137019392

Compression Stats:

Here is the compression stats after the table is populated from information_schema.InnoDB_cmp; and you notice that 4K takes more operations and time for both compression and un-compression

  Page_size Compress_ops Compress_ops_ok Compress_time Uncompress_ops Uncompress_time
8K 8192 446198 445598 73 300 0
4K 4096 1091421 1012917 463 38801 13

Performance:

Here is the performance of various row formats with threads ranging from 1-512 for both 512M and 6G buffer pool size for both concurrent reads and writes.

compress512m

compress6g

Observations:

Few key observations from the performance tests that I performed without looking to any of the sources, as I could be wrong, someone can correct me here. Its hard to draw from these input scenarios, but helps to estimate what is what.

  • The load time is almost same except that the 4K compression seems to take longer than the rest; and compression in general is hitting the INSERT/Load performance a little bit.
  • Compact or Dynamic, there is no compression; so the data and index file sizes will be almost same
  • The SHOW TABLE STATUS for compressed table will have its original Data_Length and Index_Length statistics rather than the compressed statistics (may be a bug or InnoDB needs to extend SHOW TABLE STATUS to show any compressed sizes or other means, right now only option is to view your files manually)
  • 8K compression reduced the .ibd file by nearly 50% (1.2G out of 2.3G) and 4K compression reduced the size by 1/4th (592M out of 2.3G); and it could vary based on table types and data.
  • 8K compression takes less ops and time for both compression and de-compression when compared to 4K (obvious)
  • When there is enough Innodb buffer pool size to act data in memory, the compression is a bit overhead, but you will be saving space
  • When there is a overflow from buffer pool (IO bound), compression seems to really help
  • 4K compression in general seems to be slower when compared with 8K or any other row_format.

MySQL: Planet MySQL

Railers Need To Stop Not Caring About The Database

In other news, most Rails apps are slow because the developers are ignorant of database basics. Not slamming the blog's author, but it is a good example of how easy it is to mess up a Rails application's performance by something simple like not creating indices in your database.

WakkaWiki: mornography.de

commons-dbcp

A framework for database connection pooling, a technique to help database-intensive applications scale and run faster by sharing database connections.

SourceLabs includes DBCP in its Self Support Suite for Linux and Open Source Java

google-mysql-tools - Google Code

"Various tools for managing, maintaining, and improving the performance of MySQL databases, originally written by Google."

open-source: del.icio.us tag/open-source

Query Processing at Light Speed

<sep/>MonetDB is a open-source database system for high-performance applications in data mining, OLAP, GIS, XML Query, text and multimedia retrieval. MonetDB often achieves a 10-fold raw speed improvement for<sep/>

XML: del.icio.us/tag/xml

Database Sharding Helps High-Traffic Sites - Data Center Knowledge

About sharding (horizontal partitioning) data between multiple databases. It includes links to Hibernate Shards

Hibernate: del.icio.us tag/hibernate

[from jeoffw] Oracle JDBC Performance (ch 19)

Java Programming with Oracle JDBC: Chapter 19: Performance (oreilly)

User:jeyrb: del.icio.us/network/jey

Page 1 | Next >>