Content Tagged with performance + MySQL

Getting Hibernate and MySQL's "ON DUPLICATE KEY UPDATE" Feature to Play Nice Together...

MySQL has a handy feature that allows you to turn an INSERT into an UPDATE if a unique or primary key duplication is detected:

http://dev.mysql.com/doc/refman/5.1/en/insert-on-duplicate.html

A common usage pattern for this is “lazy initialization” of a row in a database, which is exactly what my team was using it for yesterday to solve a problem in the backend for version 2.0 of the MySQL Enterprise Monitor. However, we ran into an issue where Hibernate would throw an exception complaining that when the INSERT was turned into an UPDATE, it couldn't retrieve the generated primary key value (we are using auto increments on this particular table, as it's not a high insertion-rate table).

To understand why this happens, you have to know a little bit about how Statement.getGeneratedKeys() works with MySQL's JDBC driver. When an auto increment value is generated by MySQL, that value is returned on the wire as part of the message back to the client that contains the update count, warnings, etc. When an application asks for the generated keys from MySQL's JDBC driver, the driver takes the value returned on the wire and crafts a “synthetic” result set to represent it.

In the case where “ON DUPLICATE KEY UPDATE” is in play, it turns out that the value returned on the wire is, by default, indeterminate, which we would've learned from reading our own manual more carefully:

“If a table contains an AUTO_INCREMENT column and INSERT ... UPDATE inserts a row, the LAST_INSERT_ID() function returns the AUTO_INCREMENT value. If the statement updates a row instead, LAST_INSERT_ID() is not meaningful.”

It turns out, the answer to this is to add a little assignment to the UPDATE clause:

INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE id=LAST_INSERT_ID(id)

With that little “id=LAST_INSERT_ID(id)” bit added to the INSERT statement, the server now returns the already existing auto increment value back to the client, and Hibernate is happy again.
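To make the semantics above concrete, here is a toy in-memory model in Python. It is only a sketch: it is not MySQL, JDBC, or Hibernate code, the class and method names are hypothetical, and it uses None to stand in for the "not meaningful" value the manual describes.

```python
class ToyTable:
    """Toy model of a table with an auto-increment id and one unique key.

    Hypothetical illustration only; real MySQL behavior has more cases.
    """

    def __init__(self):
        self.rows = {}              # unique key -> {"id": ..., "value": ...}
        self.next_id = 1
        self.last_insert_id = None  # what the "server" reports to the client

    def insert_on_duplicate(self, key, value, set_last_insert_id=False):
        row = self.rows.get(key)
        if row is None:
            # Plain insert: a new id is generated and reported back.
            row = {"id": self.next_id, "value": value}
            self.rows[key] = row
            self.next_id += 1
            self.last_insert_id = row["id"]
        else:
            # Duplicate key: the statement becomes an update.
            row["value"] = value
            # set_last_insert_id=True models "id=LAST_INSERT_ID(id)";
            # without it the reported value is not meaningful (None here).
            self.last_insert_id = row["id"] if set_last_insert_id else None
        return self.last_insert_id


table = ToyTable()
first = table.insert_on_duplicate("a", 1)
plain_update = table.insert_on_duplicate("a", 2)
fixed_update = table.insert_on_duplicate("a", 3, set_last_insert_id=True)
print(first, plain_update, fixed_update)
```

In this model the plain update gives the client nothing usable, which corresponds to the situation Hibernate complained about, while the variant with the extra assignment reports the id of the existing row.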

MySQL: Planet MySQL

MySQL: Mark Matthew

High Performance MySQL Second Edition Schedule

I just got the rest of the production schedule from the publisher, plus the PDF files for quality control, for our upcoming book. (Now I have to proofread the whole book!) This is the first time I’ve seen the entire production schedule. The book is supposed to go to the printer in the first week of June. I don’t know what the on-the-shelf date will be, but I think very shortly after that. The publisher has promised that it’ll physically be on sale at Velocity.

I also took a peek at the PDFs. Without the appendixes, the last page of Chapter 14 (Tools for High Performance) is page 604. The appendixes bring it to 660 pages. That’s real material, not including tables of contents and indexes. So my estimate (620) was not too far off.

660 pages is not bad, considering that the contract was for 384 pages.

Another note: the marketing materials for the book emphasize that it covers MySQL 5.1. While this is true, I want to point out that we took a real-life approach: we write about what we’ve seen in the real world, and 5.1 is not as widely deployed in the real world. The book’s real value, as far as version-specific content goes, is its tremendous depth and breadth in MySQL 4.1 and 5.0. These have been “out there” for a long time, and among the four of us we’ve seen about every conceivable scenario with them. So you’ll get a lot of insight about current, production-ready, widely-used versions. Let the other guys speculate; we just report the facts. It’s not like there’s any shortage of things to say about 5.0, right?

MySQL: Planet MySQL

Hyperic Announces MySQL Performance Study Results

First Enterprise Application to Prove MySQL Supports Immense Scale

JAVAONE, San Francisco, Calif. - News Release:

  • Today at JavaOne, Hyperic (Booth #1028) announced the results of a large-scale performance study on Hyperic using Sun Microsystems' MySQL database as a backend.
  • Results showed Hyperic monitoring upwards of 2.3 million metric transactions per minute.
  • These results definitively prove that MySQL and Hyperic can be a compelling option for even the most demanding enterprise and web applications.
  • Hyperic customer CNET was part of large-scale beta testing of MySQL on Hyperic.

About the tests:

  • The tests simulated a large-scale deployment monitoring 32,000 services across 675 discrete managed platforms.
  • There were two primary tests, designed to determine the actual and maximum number of metrics per minute that Hyperic is capable of handling on MySQL.
    • In the Actual Load test, Hyperic HQ averaged 220,000 metrics per minute, with minimal server load for either Hyperic HQ or MySQL.
    • In the Maximum Load test, Hyperic HQ achieved a record 2.3 million metrics per minute, with constraints on the Hyperic Server and only moderate load on MySQL.
  • The test setup used standard hardware that would typically be used to run Hyperic HQ with MySQL
    • The Hyperic HQ Server ran on a 2 Quad Core, 2GHz machine with 16 GB of RAM, 4 GB of Heap and a NIC with a 1 GB interconnect between HQ and MySQL
    • The MySQL Database Server ran on a 2 Quad Core, 1596 MHz machine with 8 GB of RAM, 4.5 GB InnoDB buffer pool, 16 thread concurrency, O_DSYNC flush method, and RAID-1 146G SAS 3G HardDrives

Supporting Quotes:

“For growing web-driven companies, scaling their web applications is critical to their business. Traffic is unpredictable and can grow exponentially. Operations teams must not only monitor every component of their application stack, but quickly respond if things go wrong. These performance results prove that the combination of Hyperic and MySQL is a good fit for companies that need a massively scalable web infrastructure.” - Paul Melmon, senior vice president of engineering at Hyperic

“MySQL has been designed and optimized to handle the fast-growth and high-traffic requirements of today's modern online applications. As Hyperic is also targeting this same Web audience, there is a natural synergy between our products. MySQL and Hyperic address enterprise-level needs for performance, scalability, availability and reliability.” - Zack Urlocker, vice president of products, Sun Microsystems Database Group

“Support for MySQL has proven to be a major win for Hyperic customers by offering a scalable, enterprise class data store with the array of features they demand to handle reliable backup, archive, and disaster recovery of the highly valuable data Hyperic HQ captures. Since the official release in late January, we’ve had about a quarter of our Enterprise customers either migrate or express interest in migrating to MySQL as a database backend.” - Marty Messer, director of customer success at Hyperic

Supporting resources:

MySQL Performance study

More about Hyperic HQ

More on MySQL's new 5.1 version

More about JavaOne

Hyperic's History with MySQL

MySQL: Planet MySQL

Come to beCamp 2008

I’m going to be at beCamp 2008, the followup to the first beCamp, which I sadly missed.

beCamp is a BarCamp un-conference. Tonight was about meeting, greeting, and throwing ideas at the wall to see which ones stick. Literally. We stuck pieces of paper on the wall with our ideas — things we can either talk about or want to hear about — and then scratched our votes on them to see which are popular.

I live and breathe MySQL for a decent part of the day, so I hesitated, but then stuck “MySQL Performance” on the wall. It got quite a few votes, so I assume I will be giving a talk on MySQL performance basics at some point during the conference. (The exact schedule is probably being determined right now, in my absence, but I’m so tired right now that I’ll just take my chances on it not being at 8:00 AM tomorrow.) [edit: I just checked the website and there won’t be anything before 9:00, and the schedule is determined tomorrow. I did say I’m tired, right?]

See you there!

PS: if you want to meet some of my colleagues from my former employer, the Rimm-Kaufman Group, they’ll be there too, wearing the “We’re Hiring” t-shirts. They’re hiring, by the way.


MySQL: Planet MySQL

Pre-Order High Performance MySQL Second Edition


If you’re waiting for High Performance MySQL Second Edition to hit the shelf, you’re not the only one. I am too! I can’t wait to actually hold it in my hands.

But you don’t have to wait idly. No, not at all! You can pre-order it and then you’ll get it as soon as possible. Plus your pre-order will help them figure out how much demand there is, so it doesn’t sell out and make you wait for your own copy.


MySQL: Planet MySQL

The Top 20 Design Tips for Enterprise Data Architects

At the 2008 MySQL Conference and Expo, Ronald Bradford delivered "The Top 20 Design Tips for Enterprise Data Architects". See the slides on the Forge at http://forge.mysql.com/wiki/MySQLConf2008ThursdayNotes#Top_20_DB_Design_Tips_Every_Architect_Needs_to_Know

MySQL: Planet MySQL

Video: Applied Partitioning and Scaling Your Database System

At the 2008 MySQL User Conference and Expo, Phil Hildebrand spoke on "Applied Partitioning and Scaling Your Database System". Download the slides, see people's notes, and more on the MySQL Forge Wiki at http://forge.mysql.com/wiki/MySQLConf2008WednesdayNotes#Applied_Partitioning_and_Scaling_Your_Database_System.

MySQL: Planet MySQL

Video: The MySQL Query Cache

At the 2008 MySQL User Conference and Expo, Baron Schwartz spoke on "The MySQL Query Cache". Download the slides, see people's notes, and more on the MySQL Forge Wiki at http://forge.mysql.com/wiki/MySQLConf2008WednesdayNotes#The_MySQL_Query_Cache

MySQL: Planet MySQL

Video: Optimizing MySQL and InnoDB on Solaris 10 for World's Largest Photo Blogging Community

At the 2008 MySQL User Conference and Expo, Farhan Mashraqi spoke about "Optimizing MySQL and InnoDB on Solaris 10 for World's Largest Photo Blogging Community". Download the slides, see people's notes, and more on the MySQL Forge Wiki at http://forge.mysql.com/wiki/MySQLConf2008ThursdayNotes#Optimizing_MySQL_and_InnoDB_on_Solaris_10_for_World.27s_Largest_Photo_Blogging_Community

MySQL: Planet MySQL

Video: Addressing Challenges of Data Warehousing - a Panel Discussion

At the 2008 MySQL Conference and Expo, there was a panel discussion on "Addressing Challenges of Data Warehousing - a Panel Discussion" including:

Robin Schumacher (Sun/MySQL) (moderator)

Brian Miezejewski (MySQL), Charles Hooper (Pro Relational Systems), Paul Whittington (NitroSecurity, Inc.), Raj Cherabuddi (Kickfire), Victoria Eastwood (InfoBright Inc.)

MySQL: Planet MySQL

Video: Portable Scale-out Benchmarks for MySQL

At the 2008 MySQL User Conference and Expo, Robert Hodges spoke on "Portable Scale-out Benchmarks for MySQL". Download slides and see links to blog postings at the MySQL Forge Wiki at http://forge.mysql.com/wiki/MySQLConf2008WednesdayNotes#Portable_Scale-out_Benchmarks_for_MySQL

MySQL: Planet MySQL

Video: Faster, Greener, Cheaper: Why Every MySQL Database Server Will One Day Have a SQL Chip

At the 2008 MySQL User Conference and Expo, Rick Falkvinge of the Swedish Pirate Party delivered a keynote on "Copyright Regime vs. Civil Liberties".

MySQL: Planet MySQL

Video: Practical MySQL for Web Applications

At the 2008 MySQL User Conference and Expo, Domas Mituzas gave a workshop on "Practical MySQL for Web Applications".

MySQL: Planet MySQL

Video: Memcached and MySQL

At the 2008 MySQL Users Conference and Expo, Brian Aker (MySQL) and Allan Kasindorf (SixApart) gave a presentation on Memcached and MySQL. Download the slides and see blog posts others have written about the tutorial from the Forge Wiki at http://forge.mysql.com/wiki/MySQLConf2008MondayNotes#Memcached_and_MySQL.

If you'd like to download the WMV video file in parts, here's a link to:
Part 1, 112.09 Mb


MySQL: Planet MySQL

Video: Replication Tutorial

At the 2008 MySQL Users Conference and Expo, Lars Thalmann and Mats Kindahl gave a tutorial on replication. Download the slides and see blog posts others have written about the tutorial from the Forge Wiki at http://forge.mysql.com/wiki/MySQLConf2008MondayNotes#MySQL_Replication_Tutorial.

If you'd like to download the WMV video file in parts, here's a link to:
Part 1, 175.27 Mb

MySQL: Planet MySQL

InnoDB plugin row format performance - II

This is a continuation of the post on InnoDB plugin row format performance, covering a few more test scenarios based on the feedback and questions I received in my inbox.

This time the initial table looks like this:

CREATE TABLE `sbtest` (
  `id` int(10) unsigned NOT NULL,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED;

This time, the table does not have a secondary index at creation time; instead, the index is added after the table is populated, to test the performance of the new faster index creation support in the plugin. I also added a new scenario to test 2K compression, as 4K compression seemed to be a bottleneck in the earlier case.

The test environment is the same as in the first post: RHEL-4 64-bit running MySQL-5.1.24 with InnoDB plugin 1.0 statically linked.

InnoDB configuration variables:

mysql> show variables like '%innodb%';
+---------------------------------+--------------------------+
| Variable_name                   | Value                    |
+---------------------------------+--------------------------+
| have_innodb                     | YES                      |
| innodb_adaptive_hash_index      | ON                       |
| innodb_additional_mem_pool_size | 104857600                |
| innodb_autoextend_increment     | 8                        |
| innodb_autoinc_lock_mode        | 1                        |
| innodb_buffer_pool_size         | 6442450944               |
| innodb_checksums                | ON                       |
| innodb_commit_concurrency       | 0                        |
| innodb_concurrency_tickets      | 500                      |
| innodb_data_file_path           | ibdata1:256M:autoextend  |
| innodb_data_home_dir            | /mysql/ibdata            |
| innodb_doublewrite              | ON                       |
| innodb_fast_shutdown            | 1                        |
| innodb_file_format              | Barracuda                |
| innodb_file_io_threads          | 4                        |
| innodb_file_per_table           | ON                       |
| innodb_flush_log_at_trx_commit  | 1                        |
| innodb_flush_method             | O_DIRECT                 |
| innodb_force_recovery           | 0                        |
| innodb_lock_wait_timeout        | 50                       |
| innodb_locks_unsafe_for_binlog  | OFF                      |
| innodb_log_buffer_size          | 8388608                  |
| innodb_log_file_size            | 268435456                |
| innodb_log_files_in_group       | 2                        |
| innodb_log_group_home_dir       | /mysql/iblog             |
| innodb_max_dirty_pages_pct      | 90                       |
| innodb_max_purge_lag            | 0                        |
| innodb_mirrored_log_groups      | 1                        |
| innodb_open_files               | 300                      |
| innodb_replication_delay        | 0                        |
| innodb_rollback_on_timeout      | OFF                      |
| innodb_strict_mode              | OFF                      |
| innodb_support_xa               | ON                       |
| innodb_sync_spin_loops          | 20                       |
| innodb_table_locks              | ON                       |
| innodb_thread_concurrency       | 32                       |
| innodb_thread_sleep_delay       | 10000                    |
+---------------------------------+--------------------------+
37 rows in set (0.00 sec)

Normally I disable the following InnoDB configuration options when running benchmark tests, but this time I enabled all of them (as in the first post):

innodb_checksums
innodb_doublewrite
innodb_support_xa

Table Load:

Load time from a dump of SQL script having 10M rows (not batched)

                 Compact   Compressed (8K)   Compressed (4K)   Compressed (2K)   Dynamic
6G buffer pool   27m 34s   28m 25s           29m 37s           32m 49s           27m 23s

Secondary Index Creation:

Secondary index creation time after the table has been populated. The index was added using:

CREATE INDEX k ON sbtest(k)
 
                 Compact   Compressed (8K)   Compressed (4K)   Compressed (2K)   Dynamic
6G buffer pool   0m 51s    1m 23s            7m 0s             7m 42s            0m 51s
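To put the index creation times in proportion, the short Python sketch below converts them to seconds and expresses each one relative to the 8K result. The helper name to_seconds is my own; the numbers are copied from the table above.

```python
import re


def to_seconds(text):
    """Convert a duration such as '7m 0s' into seconds."""
    match = re.fullmatch(r"(\d+)m\s+(\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = map(int, match.groups())
    return minutes * 60 + seconds


# CREATE INDEX times from the table above (6G buffer pool).
index_times = {
    "Compact": "0m 51s",
    "Compressed (8K)": "1m 23s",
    "Compressed (4K)": "7m 0s",
    "Compressed (2K)": "7m 42s",
    "Dynamic": "0m 51s",
}

baseline = to_seconds(index_times["Compressed (8K)"])
ratios = {name: to_seconds(t) / baseline for name, t in index_times.items()}

for name, ratio in ratios.items():
    print(f"{name:16s} {ratio:.2f}x the 8K time")
```

By this measure 4K and 2K take a little over 5 and about 5.6 times as long as 8K, respectively.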

File Sizes:

Here is the size of the .ibd file after each data load

                         Compact   Compressed (8K)   Compressed (4K)   Compressed (2K)   Dynamic
Before Secondary Index   2.2G      1.1G              552M              284M              2.2G
After Secondary Index    2.3G      1.2G              592M              324M              2.3G
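As a rough check on these sizes, the following Python sketch computes each format's size relative to Compact and how much the secondary index added. It assumes G means 1024M, and the reported sizes are rounded, so the results are approximate; the helper name to_mib is hypothetical.

```python
UNITS = {"M": 1, "G": 1024}  # assumption: 1G = 1024M


def to_mib(text):
    """Convert a size such as '552M' or '2.2G' into MiB."""
    return float(text[:-1]) * UNITS[text[-1]]


# .ibd sizes from the table above.
before = {
    "Compact": "2.2G",
    "Compressed (8K)": "1.1G",
    "Compressed (4K)": "552M",
    "Compressed (2K)": "284M",
    "Dynamic": "2.2G",
}
after = {
    "Compact": "2.3G",
    "Compressed (8K)": "1.2G",
    "Compressed (4K)": "592M",
    "Compressed (2K)": "324M",
    "Dynamic": "2.3G",
}

compact = to_mib(before["Compact"])
size_ratio = {name: to_mib(size) / compact for name, size in before.items()}
index_growth = {name: to_mib(after[name]) - to_mib(before[name]) for name in before}

for name in before:
    print(f"{name:16s} {size_ratio[name]:.3f} of Compact, "
          f"index added ~{index_growth[name]:.0f}M")
```

Each halving of KEY_BLOCK_SIZE roughly halves the file in this data set, and the secondary index adds about 100M uncompressed or with 8K pages versus 40M with 4K and 2K pages.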

Testing:

I used sysbench to run the tests, varying the number of threads running a mixed read/write workload on a pre-populated table created with different ROW_FORMAT and KEY_BLOCK_SIZE settings. Between runs the server is taken down, its log and data directories are wiped completely, and a fresh server is started to run the next row format.

Here are the sysbench options:

--test=oltp --db-driver=mysql --oltp-table-size=10000000
--oltp-skip-trx=on --oltp-auto-inc=off --mysql-table-engine=innodb
--mysql-user=root --mysql-ignore-duplicates=on --max-requests=100000
--init-rng=on --oltp-test-mode=complex --oltp-dist-type=uniform
--mysql-engine-trx=yes --oltp-read-only=off
--num-threads=XXX run

Some options are not relevant to this test, but I use the same scripts to run a lot of other cases. Note that --mysql-ignore-duplicates is not a standard sysbench option; it's a patch I created to make sure the test does not bail out on duplicate key errors caused by a poor random number generator.

The test typically executes 950000 queries in each iteration (700K read queries and 250K write queries) with max of 50000 transactions in this test and 100K transactions in the part-I case. Here is the typical output from sysbench:

OLTP test statistics:
    queries performed:
        read:                            700000
        write:                           250000
        other:                           0
        total:                           950000
    transactions:                        50000  (206.00 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 950000 (3914.01 per sec.)
    other operations:                    0      (0.00 per sec.)

Test execution summary:
    total time:                          242.7179s
    total number of events:              50000
    total time taken by event execution: 123984.2189
    per-request statistics:
         min:                            0.2133s
         avg:                            2.4797s
         max:                            15.6503s
         approx.  95 percentile:         4.0491s
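The per-second rates and the average in this output follow directly from the totals, which the small Python sketch below recomputes. The numbers are copied from the sample output; the last figure, average requests in flight, is my own derived estimate and not something sysbench prints.

```python
# Totals copied from the sample sysbench output above.
total_time_s = 242.7179
transactions = 50000
read_write_requests = 950000
event_time_total_s = 123984.2189

transactions_per_sec = transactions / total_time_s
requests_per_sec = read_write_requests / total_time_s
avg_request_time_s = event_time_total_s / transactions

# Derived estimate (not printed by sysbench): summed event time divided by
# wall-clock time approximates how many requests were in flight on average.
avg_in_flight = event_time_total_s / total_time_s

print(f"transactions/sec: {transactions_per_sec:.2f}")
print(f"requests/sec:     {requests_per_sec:.2f}")
print(f"avg request time: {avg_request_time_s:.4f}s")
print(f"avg in flight:    {avg_in_flight:.1f}")
```

The first three values match the 206.00, 3914.01, and 2.4797s shown in the output. The in-flight estimate comes out just under 512, which suggests this sample was taken from a run at the top of the thread range.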

Performance:

Here is the performance of various row formats with threads ranging from 1 to 512 for 6G buffer pool size for both concurrent reads and writes.

[Graph: compress6g-new]

Observations:

  • The load time is almost the same for the Compact and Dynamic row formats. With this table and data set, both achieve the same transaction rate (as you can see from the graph, they overlap).
  • The load/insert time increases with the compression level, from 8K to 4K, 2K, 1K, etc.
  • When the whole data set fits in memory, compression slows down the transaction rate somewhat.
  • When the whole data set cannot fit in memory, compressing the table seems to be the right choice (8K in this case).
  • Creating a secondary index still takes a lot of time when compression is enabled. 8K compression is much better and fairly close to Compact/Dynamic, but 4K and 2K take roughly 5 to 6 times longer than 8K.
  • The best compression setting in terms of performance is 8K, but this may vary with the table structure, the data, and its distribution.
  • Part-I observations can be found in the first post.

It looks like something is wrong with the performance of 4K/2K compression compared with 8K when the whole data set fits in memory (as it does with a 6G buffer pool). I will dig further and see whether we can find a solution for the actual problem or file a bug report. More later.

Convert from one row format to another:

ALTER TABLE seems to be the right way to change from one row format to another, or to change the compression level, and it is efficient. For example, here is a case where the table is converted from Compressed to Compact and from Compact to Compressed; InnoDB re-creates both the .frm and the .ibd file using temporary files. Even here you can see that 4K takes much longer than expected.

mysql> alter table sbtest ROW_FORMAT=Compact;
Query OK, 10000000 rows affected, 1 warning (2 min 56.48 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> alter table sbtest ROW_FORMAT=Compressed KEY_BLOCK_SIZE=8;
Query OK, 10000000 rows affected (4 min 11.56 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> alter table sbtest ROW_FORMAT=Compressed KEY_BLOCK_SIZE=4;
Query OK, 10000000 rows affected (20 min 6.36 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> alter table sbtest ROW_FORMAT=Compact;
Query OK, 10000000 rows affected, 1 warning (2 min 56.61 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> show warnings;
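+---------+------+-----------------------------------------------------------------+
| Level   | Code | Message                                                         |
+---------+------+-----------------------------------------------------------------+
| Warning | 1478 | InnoDB: ignoring KEY_BLOCK_SIZE=4 unless ROW_FORMAT=COMPRESSED. |
+---------+------+-----------------------------------------------------------------+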
1 row in set (0.00 sec)

mysql> alter table sbtest ROW_FORMAT=Dynamic;
Query OK, 10000000 rows affected, 1 warning (2 min 42.12 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> show warnings;
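+---------+------+-----------------------------------------------------------------+
| Level   | Code | Message                                                         |
+---------+------+-----------------------------------------------------------------+
| Warning | 1478 | InnoDB: ignoring KEY_BLOCK_SIZE=4 unless ROW_FORMAT=COMPRESSED. |
+---------+------+-----------------------------------------------------------------+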
1 row in set (0.00 sec)

mysql> alter table sbtest ROW_FORMAT=Compressed KEY_BLOCK_SIZE=2;
Query OK, 10000000 rows affected (20 min 53.76 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

Nice to have:

It would be nice if the InnoDB team could extend SHOW TABLE STATUS to show compression details. Right now information_schema.InnoDB_cmp only lists general compression statistics, not statistics on a per-table basis.

It would also be an added advantage if they could support the following two config variables:

innodb_default_row_format (default compact or dynamic) and, optionally, innodb_default_key_block_size (making sure it does not raise a warning with the default row format)

MySQL: Planet MySQL

InnoDB plugin row format performance

Here is a quick performance comparison of the different compression settings and row formats recently introduced in the new InnoDB plugin.

The table is a pretty simple one:

CREATE TABLE `sbtest` (
  `id` int(10) unsigned NOT NULL,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k` (`k`)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;

The table is populated with 10M rows with average row length being 224 bytes. The tests are performed for Compact, Dynamic and Compressed (8K and 4K)  row formats using MySQL-5.1.24 with InnoDB plugin-1.0.0-5.1 on Dell PE2950  1x Xeon quad core with 16G RAM, RAID-10 with RHEL-4 64-bit.

Here are the four test scenarios:

  1. No compression, ROW_FORMAT=Compact
  2. ROW_FORMAT=Compressed with KEY_BLOCK_SIZE=8
  3. ROW_FORMAT=Compressed with KEY_BLOCK_SIZE=4
  4. ROW_FORMAT=Dynamic

All the above tests are repeated with innodb_buffer_pool_size=6G and 512M to make sure one fits everything in memory and another one overflows. The rest of the InnoDB settings are all default except that innodb_thread_concurrency=32.

Here is the summary of the test results:

Table Load:

Load time from a dump of SQL script having 10M rows (not batched)

Compact   Compressed (8K)   Compressed (4K)   Dynamic
28m 18s   29m 46s           36m 43s           27m 55s

File Sizes:

Here is the size of the .ibd file after each data load

Compact   Compressed (8K)   Compressed (4K)   Dynamic
2.3G      1.2G              592M              2.3G

Data and Index Size from Table Status:

Here are the data and index sizes in bytes from SHOW TABLE STATUS. Note that it reports the original (uncompressed) data size rather than the compressed size.

        Compact      Compressed (8K)   Compressed (4K)   Dynamic
Data    2247098368   2247098368        2249195520        2247098368
Index   137019392    137035776         160301056         137019392

Compression Stats:

Here are the compression stats from information_schema.InnoDB_cmp after the table is populated. Notice that 4K takes more operations and more time for both compression and decompression.

     Page_size   Compress_ops   Compress_ops_ok   Compress_time   Uncompress_ops   Uncompress_time
8K   8192        446198         445598            73              300              0
4K   4096        1091421        1012917           463             38801            13
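One way to read these counters is to look at how many compression attempts did not succeed and how much time each attempt cost. The Python sketch below does that with the numbers from the table; it assumes Compress_time is reported in seconds, and the derived metrics are my own framing rather than anything InnoDB reports directly.

```python
# Counters copied from the table above.
stats = {
    "8K": {"compress_ops": 446198, "compress_ops_ok": 445598, "compress_time": 73},
    "4K": {"compress_ops": 1091421, "compress_ops_ok": 1012917, "compress_time": 463},
}

summary = {}
for page_size, s in stats.items():
    failed = s["compress_ops"] - s["compress_ops_ok"]
    summary[page_size] = {
        "failed_ops": failed,
        "success_rate": s["compress_ops_ok"] / s["compress_ops"],
        # Assumes compress_time is in seconds.
        "seconds_per_1000_ops": 1000 * s["compress_time"] / s["compress_ops"],
    }

for page_size, m in summary.items():
    print(f"{page_size}: {m['failed_ops']} failed attempts, "
          f"{m['success_rate']:.2%} succeeded, "
          f"{m['seconds_per_1000_ops']:.3f}s per 1000 ops")
```

With 4K pages about 7% of compression attempts fail, compared with roughly 0.1% for 8K, and each attempt costs more on average, which fits the slower load time seen with 4K.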

Performance:

Here is the performance of various row formats with threads ranging from 1-512 for both 512M and 6G buffer pool size for both concurrent reads and writes.

[Graph: compress512m]

[Graph: compress6g]

Observations:

Here are a few key observations from the performance tests. I made them without looking at any of the source code, so I could be wrong, and someone can correct me here. It's hard to draw firm conclusions from these input scenarios, but they help to estimate what is what.

  • The load time is almost the same, except that 4K compression seems to take longer than the rest; compression in general hurts INSERT/load performance a little.
  • With Compact or Dynamic there is no compression, so the data and index file sizes are almost the same.
  • SHOW TABLE STATUS for a compressed table reports the original Data_length and Index_length rather than the compressed sizes (this may be a bug, or InnoDB needs to extend SHOW TABLE STATUS or provide some other means to show compressed sizes; right now the only option is to check the files manually).
  • 8K compression reduced the .ibd file by nearly 50% (1.2G instead of 2.3G) and 4K compression reduced it to roughly a quarter (592M instead of 2.3G); this could vary with table types and data.
  • 8K compression takes fewer operations and less time for both compression and decompression compared to 4K (obvious).
  • When the InnoDB buffer pool is large enough to hold the data in memory, compression adds a bit of overhead, but you save space.
  • When the data overflows the buffer pool (IO bound), compression seems to really help.
  • 4K compression in general seems to be slower than 8K or any other row format.

MySQL: Planet MySQL

MySQL Engines: MyISAM vs. InnoDB

Why use InnoDB?

InnoDB is commonly viewed as anything but performant, especially when compared to MyISAM. Many actually call it slow. This view is mostly supported by old facts and mis-information. In reality, you would be very hard-pressed to find a current, production-quality MySQL Database Engine with the CPU efficiency of InnoDB. It has its performance "quirks" and there are definitely workloads for which it is not optimal, but for standard OLTP (Online Transaction Processing) loads, it is tough to find a better, safer fit.


MySQL: Planet MySQL

EXPLAIN Cheatsheet

At the 2008 MySQL Conference and Expo, The Pythian Group gave away EXPLAIN cheatsheets. They were very nice, printed in full color and laminated to ensure you can spill your coffee* on it and it will survive.

For those not at the conference, or those that want to make more, the file is downloadable as a 136Kb PDF at explain-diagram.pdf

* or tea, for those of us in the civilized world.

MySQL: Planet MySQL

Spring 2008 issue of MySQL Magazine

Keith Murphy and his hard-working crew have released the spring 2008 issue of MySQL Magazine. Go take a look — it includes quite a few articles on various topics, even a mention of our upcoming book (High Performance MySQL, Second Edition).


MySQL: Planet MySQL

Liveblogging: 10,000 Tables Can't Be Wrong

10,000 Tables Can't Be Wrong: Designing a Highly Scalable MySQL Architecture for Write-intensive Applications by Richard Chart

Chose MySQL for performance and stability, and less important but still there, experience and support. Support is becoming increasingly more and more important.

Starting point: 1 appliance supporting 200 devices
Problem/Goal: Extensible architecture with deep host and app monitoring, over 1000 devices with 100 mgmt points each
Distributed collection over a WAN, with latency and security concerns
Current reality: several times the scale of the original goal
Commercial embedded product, so they actually pay for the embedded MySQL server

Future: The fundamentals are sound: next generation of the product moves up another order of magnitude

Data Characteristics
>90% writes
ACID not important
Resilient to loss, because gaps in data do not invalidate the rest of the data
Data elements by themselves are valuable, but much more so when relationships are added.

Chose MyISAM because: (more…)

MySQL: Planet MySQL

Liveblogging: A Match Made in Heaven? The Social Graph and the Database

Jeff Rothschild of Facebook’s “A Match Made in Heaven? The Social Graph and the Database”

Taking a look at the social graph and what it means for the database.

The social graph:

  • At its heart it’s about people and their connections.
  • Learning about people who are in your world.
  • Can be a powerful tool for accelerating the use of an application.

“The social graph has transformed a seemingly simple application such as photos into something tremendously more powerful.” We’re interested in what people are saying about us, and about our friends. Social applications are compelling.

Facebook users blew through the estimate for 6 months of storage in 6 weeks. It is serving 250,000 photos per second at peak time, not including profiles. Facebook serves more photos than even the photo sites out there, and serves more event invitations than any other website out there.

E-mail invitations are an example of the power of the social graph. If you get a newsfeed or an invitation that tells you 12 friends are attending an event, you have more information, and can then make a better decision on whether or not you want to go. (more…)

MySQL: Planet MySQL