Content Tagged with databases + Programming

MySQL Slave Delay Explained And 7 Ways To Battle It

Slave delay can be a nightmare. I battle it every day and know plenty of people who curse the serialization problem of replication. For those who are not familiar with it, replication on MySQL slaves runs commands in series, one by one, while the master may run them in parallel. This usually causes bottlenecks. Consider these 2 examples:

  • Between 1 and 100 UPDATE queries are constantly running on the master in parallel. If the slave IO is only fast enough to handle 50 of them without lagging, then as soon as 51 are running, the slave starts to lag.
  • A more common problem is when one query takes an hour to run (let's say, it's an UPDATE with a big WHERE clause that doesn't use an index). In this case, the query runs on the master for an hour, which isn't a big problem because it doesn't block other queries. However, when the query moves over to the slaves, all of them start to lag because it plugs up the single replication thread.

Sidenote: when I hear an argument that a master has to be the most powerful machine in the group, I cringe at the logic. If the master can crunch more INSERTs/UPDATEs after an upgrade to a better machine, then replication will fall behind even faster.

There is nothing you can do right now to fix the way MySQL handles replication. If the replication threads could run in parallel, I'm guessing horrible things would happen to the data integrity due to race conditions, canceled queries, slave restarts, differences in query execution times due to server load and configuration, etc. Replication is already an asynchronous, prone to getting out of sync business (hint: use maatkit tools by Baron Schwartz and specifically mk-table-checksum and mk-table-sync to sync up your slaves).

To see if a slave is lagging, execute the 'show slave status' command and look for the Seconds_Behind_Master value. The way this value is calculated can be unclear, so I'll explain. It is simply the difference between 2 timestamps: the time of the last received query (queued up in the relay log) that already executed on the master, and the time of the query currently executing on the slave. Thus this value is not real time (it is possible to catch up to the master much faster); it's an approximation, or a special metric if you will, that helps point out problems.
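As a rough illustration of the calculation described above, here is a minimal Python sketch. The function and parameter names are hypothetical, and it only models the timestamp difference as this post describes it, not MySQL's internal implementation.

```python
from datetime import datetime
from typing import Optional


def seconds_behind_master(last_queued: datetime,
                          currently_executing: Optional[datetime]) -> int:
    """Approximate lag as described in the text (hypothetical helper).

    last_queued: master timestamp of the newest query in the relay log.
    currently_executing: master timestamp of the query the slave is
        running now, or None if the slave has nothing left to execute.
    """
    if currently_executing is None:
        # Nothing is pending, so the slave is treated as caught up.
        return 0
    delta = last_queued - currently_executing
    # Clamp at zero so clock quirks never yield a negative lag.
    return max(0, int(delta.total_seconds()))


lag = seconds_behind_master(
    datetime(2008, 4, 14, 12, 0, 0),    # newest query queued in the relay log
    datetime(2008, 4, 14, 11, 58, 30),  # query the slave is executing now
)
print(lag)  # 90
```

Note that a lag of 90 here says nothing about how long catching up will take: if the queued queries are cheap, the slave may close the gap in a second or two.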

So what can you do if you start hitting replication lag? This is the ultimate question, and the answer depends on your application. Here are the things I came up with after dealing with MySQL for a few years (there are undoubtedly other techniques, but these all come from my own experience):

  1. Normalize your data, if it is not already. Non-normalized tables lead to repetition and are generally considered bad practice. More data - more IO in most cases. There can be cases, however, where you can normalize too much. Having JOINs is much slower than not having them, and it can hurt your queries if you JOIN a lot. Finally, the extreme case is mentioned at highscalability.com: How I Learned to Stop Worrying and Love Using a Lot of Disk Space to Scale. "You–pause for dramatic effect–duplicate data instead of normalize it. *shudder*". Flickr is provided as an example.
  2. Shard (meaning, slice) your data, horizontally and vertically. For example, you can horizontally partition by some sort of key, hash, username, or other properties. You can also partition vertically by moving out some table columns into other databases. As an example, if you had a database of videos, storing view counts, number of favorites, etc. is OK but if these fields receive a lot of frequent updates, you are bound to have slave lag. Instead, separate these into a dedicated stats table(s). You don't have to shard all of your data - even sharding the most active bits helps immensely (for example, you can choose to shard your stats tables and leave the main one alone).
  3. Upgrade machines running MySQL (first slaves, then master, for the reasons given above). 99% of the time, disk IO is the bottleneck, CPU being the other 1%. Move to RAIDed setups (RAID10 or RAID0) with 6-10 15K RPM SCSI or SSD drives. Add a lot of RAM. Make sure you're running a 64 bit OS if you have more than 3GB of RAM, so that the mysql process may utilize more of it. My search for the best MySQL server under $10K may be of some help here.
  4. Separate your applications onto different MySQL instances. If you are running separate applications A, B, and C that don't depend on each other, consider giving them their own machines, otherwise a single long-running UPDATE or INSERT query in application A will delay all writes by application B and C. This is actually quite common - even though the server may not appear to be loaded, the annoying slave delay will still show its cowardly tail. I want to highlight this again: the replication thread is shared between all databases on the server.
  5. Another solution to (4) is running multiple MySQL instances on one machine, provided that the machine isn't already overloaded. In that case, installing more than one mysql daemon would separate the replication threads and allow running multiple applications, like A, B, and C, on one machine without affecting each other. MySQL sandbox achieves just that - it is my preferred solution.
  6. Split up longer running queries into shorter ones. This should be pretty straightforward - a single query on 10 million rows may run a few hours. Splitting it into batches of 50,000, for example, will give other queries a chance to run in between. Of course, you should take care of data integrity and generally double check what you are doing.
  7. Don't overload the same slave by sending all queries to it, as it will just make the matter worse. You can round-robin the queries using either round-robin DNS (eww), round-robin within the application logic (better), smarter application logic, like checking slave load and status from time to time, or my personal favorite - using MySQL proxy and having it pick the least lagging slave for you. An official solution utilizing mysql proxy, called MySQL load balancer, is apparently in the works (I was promised beta access but haven't got it so far).
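Items 6 and 7 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `batch_ranges`, `pick_least_lagging`, the `videos` table, the `flagged` column, and the host names are all hypothetical. The batching assumes an integer primary key, and the lag values are assumed to come from each slave's Seconds_Behind_Master (with None meaning replication is not running).

```python
from typing import Dict, Iterator, Optional, Tuple


def batch_ranges(first_id: int, last_id: int,
                 batch_size: int = 50_000) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) id ranges covering first_id..last_id."""
    start = first_id
    while start <= last_id:
        end = min(start + batch_size - 1, last_id)
        yield start, end
        start = end + 1


def pick_least_lagging(lag_by_host: Dict[str, Optional[int]]) -> Optional[str]:
    """Return the host with the smallest known lag, or None if none is usable."""
    usable = {host: lag for host, lag in lag_by_host.items() if lag is not None}
    if not usable:
        return None
    return min(usable, key=usable.get)


# One big UPDATE becomes several short ones (hypothetical table and column).
for start, end in batch_ranges(1, 120_000):
    print(f"UPDATE videos SET flagged = 0 WHERE id BETWEEN {start} AND {end}")

# Send reads to whichever slave reports the smallest lag.
print(pick_least_lagging({"slave1": 12, "slave2": 3, "slave3": None}))
```

Each short UPDATE occupies the replication thread only briefly, so other writes can slip in between batches. Whether the batches may run as separate transactions depends on your data integrity requirements.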

As a bonus, I wanted to throw in this idea of helping minimize a certain corner case cause of slave delay and feed it to the hungry MySQL minds. I'm not sure if it is mentioned anywhere else, as I have not Googled it. If it's a widely known fact, then I will consider this post as just adding my vote to the usefulness of the technique.

Here goes: if you have replication setups that use a lot of INSERT commands and you expect that most of those INSERTs would duplicate existing data (and you are using INSERT IGNORE, not REPLACE), consider replacing such queries with SELECTs, followed by only the necessary INSERTs. The reasoning is simple: INSERTs have to run on the master and propagate to all the slaves. SELECTs can run on any slave and don't propagate anywhere, so if only 0.01% of the queries result in new rows, this technique will get rid of a lot of unnecessary slave query traffic.
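Here is a minimal sketch of that idea in Python. It uses the standard library's sqlite3 module purely so the example is self-contained; the `tags` table and the `insert_only_missing` function are hypothetical, and in a real setup the read connection would point at a slave and the write connection at the master.

```python
import sqlite3
from typing import Iterable, List


def insert_only_missing(read_conn: sqlite3.Connection,
                        write_conn: sqlite3.Connection,
                        names: Iterable[str]) -> List[str]:
    """Check which names already exist via a read, then insert only the rest.

    Returns the names that were actually sent to the write connection.
    """
    wanted = sorted(set(names))
    if not wanted:
        return []
    placeholders = ",".join("?" for _ in wanted)
    rows = read_conn.execute(
        f"SELECT name FROM tags WHERE name IN ({placeholders})", wanted
    ).fetchall()
    existing = {row[0] for row in rows}
    missing = [name for name in wanted if name not in existing]
    # Keep the "ignore duplicates" form: a lagging slave may not have seen
    # a row yet, so the write must still tolerate duplicates.
    # (SQLite spells it INSERT OR IGNORE; MySQL spells it INSERT IGNORE.)
    write_conn.executemany(
        "INSERT OR IGNORE INTO tags (name) VALUES (?)",
        [(name,) for name in missing],
    )
    write_conn.commit()
    return missing


# Demo: one in-memory database plays both the slave and the master role.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (name TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO tags (name) VALUES (?)", [("mysql",), ("php",)])
conn.commit()

written = insert_only_missing(conn, conn, ["mysql", "php", "replication"])
print(written)  # ['replication']
```

Only one of the three candidate rows reaches the write connection, so only that one statement would need to be replicated. The SELECT can return stale results on a lagging slave, which is why the write keeps its duplicate-tolerant form.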

Well, there you have it. Comments are open, so feel free to share your own replication strategies and thoughts about mine.

MySQL: Planet MySQL

My MySQL Conference Schedule

Were there too many "my"'s in that title? Anyway… this week's MySQL conference is promising to be really busy and exciting. I can't wait to finally be there and experience it in all its glory. Thanks to the O'Reilly personal conference planner and scheduler and the advice of my fellow conference goers, I was able to easily (not really) pick out the speeches I am most interested in attending.

Here goes (my pass doesn't include Monday):

Tuesday

8:30am Tuesday, 04/15/2008

State of MySQL

Keynote Ballroom E

Mårten Mickos (MySQL)

In his annual State of MySQL keynote, Marten discusses the current and future role of MySQL in the modern online world. The presentation also covers the acquisition by Sun of MySQL, the role open source is playing for users and customers all over the planet, and what the visions for the future are. Read more.

 

9:05am Tuesday, 04/15/2008

Open Source: The Heart of the Network Economy

Keynote Ballroom E

Jonathan Schwartz (Sun Microsystems)

Free software and open communities are the lifeblood of network innovation. Sun Microsystems CEO Jonathan Schwartz will highlight the rising open source tide and how Sun's recently announced acquisition of MySQL furthers free software as a platform for the web economy. Read more.

 

9:40am Tuesday, 04/15/2008

A Head in the Cloud - The Power of Infrastructure as a Service

Keynote Ballroom E

Werner Vogels (Amazon.com)

There are many challenges when building a reliable, flexible architecture that can manage unpredictable behaviors of today's internet business. This presentation will review some of the lessons learned from building one of the world's largest distributed systems; Amazon.com. Read more.

 

10:50am Tuesday, 04/15/2008

Performance Guide for MySQL Cluster

MySQL Cluster and High Availability, Performance Tuning and Benchmarks Ballroom D

Mikael Ronstrom (MySQL)

Learn about all the tricks required to make MySQL Cluster high performance. This includes using real-time scheduling, batching in all its forms, cluster interconnects, and locking threads to CPUs. Read more.

 

11:55am Tuesday, 04/15/2008

The Future of MySQL: What You Need to Know About What's Coming

Architecture and Technology, General Ballroom B

Robin Schumacher (Sun/MySQL), Rob Young (Sun/MySQL)

What enhancements can you expect in the MySQL Server in the next few years? What new tools, services, and software is MySQL going to deliver this year and next to help you deploy and maintain MySQL applications? This session will let you in on all the plans MySQL has for the server, the Enterprise Monitor, the upcoming Load Balancer and Query Analyzer, management tools, and more. Read more.

 

2:00pm Tuesday, 04/15/2008

InnoDB: Status, Architecture, and New Features

Architecture and Technology Ballroom F

Heikki Tuuri (Innobase / Oracle Corp.), Ken Jacobs (Oracle / Innobase)

Ken Jacobs and Heikki Tuuri will describe the InnoDB architecture in depth, and discuss the new powerful performance-enhancing capabilities in InnoDB. Read more.

 

3:05pm Tuesday, 04/15/2008

Investigating Innodb Scalability Limits

Performance Tuning and Benchmarks Ballroom F

Peter Zaitsev (MySQL Performance Blog), Vadim Tkachenko (MySQLPerformanceBlog.com)

You may have heard Innodb has limited scalability with multiple CPUs and some of these limits were fixed in recent MySQL 5.0 versions. In this presentation we will look into which problems are fixed. Read more.

 

4:25pm Tuesday, 04/15/2008

Disaster is Inevitable - Are You Prepared?

Security and Database Administration Ballroom B

Farhan Mashraqi (Fotolog)

What's the worst disaster you expect to happen? What can you do to better prepare for the disaster? Join us in this heart-racing, real-life inspired presentation for answers to these questions and more. Read more.

 

5:15pm Tuesday, 04/15/2008

Mitigating Replication Latency in a Distributed Application Environment

Architecture and Technology, Business and Case Studies, Replication and Scale-Out Ballroom E

Jeff Freund (Clickability)

Master-Master replication provides high availability and serviceability for the applications. Publishing web sites is a read-intensive operation, and the combination of Master-Slave replication with an application layer that intelligently splits database read and write operations allows for rapid scale out. Hear how Clickability solves issues for both environments. Read more.

 

Wednesday

8:30am Wednesday, 04/16/2008

Copyright Regime vs. Civil Liberties

Keynote Ballroom E

Rick Falkvinge (Swedish Pirate Party)

Rick Falkvinge, founder of the Swedish Pirate Party, talks about the rise and success of pirates and why pirates are necessary in today's politics. He'll also outline the next steps in the pirates' strategy to change global copyright laws. Read more.

 

9:15am Wednesday, 04/16/2008

Scaling MySQL - Up or Out?

Keynote Ballroom E

John Allspaw (Flickr (Yahoo!)), Jeff Rothschild (Facebook.com), Monty Taylor (MySQL), Domas Mituzas (MySQL), Paul Tuckfield (YouTube)

This lively panel discussion keynote will address the challenges large, modern web properties face in scaling MySQL. Panelists from Facebook, YouTube, and Flickr pair up with MySQL engineers in discussing the current and future problem domain and possible solutions. Read more.

 

10:00am Wednesday, 04/16/2008

Faster, Greener, Cheaper: Why Every MySQL Database Server Will One Day Have a SQL Chip

Keynote Ballroom E

Raj Cherabuddi (Kickfire)

The history of computing is full of algorithms such as graphics processing that are fine-tuned in general purpose CPUs over decades. Only when they are finally ported to dedicated hardware are tremendous improvements in speed, cost, and power realized. Raj Cherabuddi explains how a new SQL chip will revolutionize today's database query processing. Read more.

 

10:50am Wednesday, 04/16/2008

Portable Scale-out Benchmarks for MySQL

Architecture and Technology, Performance Tuning and Benchmarks, Replication and Scale-Out Ballroom D

Robert Hodges (Continuent.com)

This talk presents new open source tools that allow users to set up and run database scale-out benchmarks easily. Hodges illustrates with benchmark results from your favorite MySQL configurations. Read more.

 

11:55am Wednesday, 04/16/2008

Applied Partitioning and Scaling Your Database System

General Ballroom D

Phil Hildebrand (thePlatform)

Take advantage of MySQL partitioning to allow your database applications to scale in both size and performance. A practical look at applying partitioning to OLTP database systems. Read more.

 

2:00pm Wednesday, 04/16/2008

Architecture of Maria: A New Storage Engine with a Transactional Design

Architecture and Technology, Performance Tuning and Benchmarks Ballroom E

Michael Widenius (MySQL)

A deep tour into the design of Maria, a new MVCC storage engine for MySQL from the original authors of MySQL that is designed to support transactions and automatic recovery. Read more.

 

3:05pm Wednesday, 04/16/2008

An Introduction to BLOB Streaming for MySQL Project

Java, Storage Engine Development and Optimization Ballroom A

Paul McCullagh (PrimeBase Technologies GmbH)

This session explains how the BLOB Streaming engine solves the problems involved in storing pictures, films, MP3 files, and other binary and text objects (BLOBs) in the database. Read more.

 

4:25pm Wednesday, 04/16/2008

Benchmarking and Monitoring: Tools of the Trade (Part I)

Performance Tuning and Benchmarks Ballroom D

Tom Hanlon (MySQL)

Benchmarking and Profiling are extremely important and a large array of tools exist for the job. Join Tom Hanlon for a tour of the current landscape. Demos of each tool will be shown. Read more.

 

5:15pm Wednesday, 04/16/2008

Benchmarking and Monitoring: Tools of the Trade (Part II)

Performance Tuning and Benchmarks, Security and Database Administration Ballroom D

Tom Hanlon (MySQL)

Join us for a presentation of the wonderful world of benchmarks and monitoring tools. Here you will learn what is available, how each tool works, and a demonstration using each tool against a running database from a veteran MySQL expert. Read more.

 

8:30pm Wednesday, 04/16/2008

Sun Sponsor Party

Event Ballroom F

Have a drink, mingle with fellow conference participants, and enter our raffle to win great prizes, including a Sony PS3! Sponsored by Sun Microsystems. Read more.

 

Thursday

8:30am Thursday, 04/17/2008

Who is the Dick on My Site?

Keynote Ballroom E

Dick Hardt (Sxip Identity Corporation)

Much of the data in a database is about people. Identity 2.0 technologies will lower the friction for people to provide and easily move data about themselves online. This fast paced keynote will offer a background on Identity 2.0, discuss current roadblocks and future opportunities, and explore the potential impacts these will have on databases. Read more.

 

9:15am Thursday, 04/17/2008

A Match Made in Heaven? The Social Graph and the Database

Keynote Ballroom E

Jeff Rothschild (Facebook.com)

Social applications integrate information about many different facets of people's lives. Join us as Jeff Rothschild from Facebook looks at the power of the social graph, how it can increase the utility and adoption of applications, and its implications on storage architectures. Read more.

 

10:50am Thursday, 04/17/2008

MySQL Proxy, the Friendly Man in the Middle

Architecture and Technology Ballroom F

Jan Kneschke (MySQL), Jimmy Guerrero (Sun-MySQL)

MySQL Proxy is a tool to route, rewrite, handle, and block queries on the MySQL Protocol level. Load Balancing, Query Replay, Online Query Rewrites, and more with a grain of scripting. Read more.

 

11:55am Thursday, 04/17/2008

Sphinx: High Performance Full Text Search for MySQL

General Ballroom C

Andrew Aksyonoff (Sphinx Technologies), Peter Zaitsev (MySQL Performance Blog)

Sphinx is an open source full-text search engine designed for indexing databases and integrated especially well with MySQL. We'll talk about its features, capabilities, and real-world applications. Read more.

 

2:00pm Thursday, 04/17/2008

Top 20 DB Design Tips Every Architect Needs to Know

Architecture and Technology, Data Warehousing and Business Intelligence, Security and Database Administration Ballroom B

Ronald Bradford (Primebase Technologies)

Each database product has strengths and weaknesses. Having chosen MySQL as your database product, leverage the strengths of the product to maximize design and performance. Learn the things to avoid. Read more.

 

2:50pm Thursday, 04/17/2008

The Power of Lucene

Architecture and Technology, Java, Ruby and MySQL Ballroom G

Farhan Mashraqi (Fotolog)

Lucene is a high performance, scalable, full-text search engine library that allows you to add search to any application. This presentation shows you how you can use Lucene within your environment. Read more.

 

3:50pm Thursday, 04/17/2008

The Science and Fiction of Petascale Analytics

Keynote Ballroom E

Jacek Becla (Stanford Linear Accelerator Center)

Scientists are trying to understand dark matter, discover distant galaxies, hunt for the Higgs boson, detect asteroids, and take movies of molecules. Their science is fascinating but their analysis requirements may seem like science fiction. Few have experienced the reality of petascale analytics so far, but everybody, including you, will face it tomorrow. Are we ready? Read more.

 

4:35pm Thursday, 04/17/2008

Farewell Closing Reception

Event Ballroom E

Take the opportunity to network one last time at this closing event. Say thank you and exchange contact information until next year. Read more.

 

Phew. I think I've picked out the most interesting topics. I'm excited to see Peter, Farhan, Ron, Paul, Jan, and everyone else. I hope I didn't skip anything interesting…

MySQL: Planet MySQL

Postgres and LOLCODE: GIMMEH RECORDZ OUTTA DATABUKKIT

I was wrong in my last post, it seems that all Sun database developers are now part of the same organisation, including PostgreSQL's Josh Berkus.

MySQL has the pluggable storage engine architecture, which is unique in the industry. The idea is you pick from among a suite of storage engines the most suitable one. PostgreSQL on the other hand has a plugin architecture for programming languages you can then use for stored procedures. And the cool thing about Open Source...

Someone went as far as to implement a PostgreSQL plugin of LOLCODE, a funny programming language I didn't know about until recently. So now you could do this with PostgreSQL:

read more

MySQL: Planet MySQL

Argo UML

Opensource UML tool - works on the mac!

UML: del.icio.us tag/uml

XML::DB::Database - Abstract class for extension by XML:SimpleDB drivers - search.cpan.org

This is an abstract class implementing the interface Database from the XML:DB base specification. It should only be used indirectly, as superclass for a specific database…

XML: del.icio.us/tag/xml

XML::DB::Database::Xindice - XML:DB driver for the Xindice database - search.cpan.org

This is the Xindice XML-RPC driver. It is intended to be used through the XML:DB API, so that it is never called directly from user code. It implements the internal API defined in XML::DB::Database.

XML: del.icio.us/tag/xml

Innodb Locks, ActiveRecord and acts_as_ferret Problem

For the last few days, one of our customers (one of the largest Ruby on Rails sites on the Net) has been struggling with a really strange problem: every once in a while they were getting an error from ActiveRecord on their site:

(ActiveRecord::StatementInvalid) "Mysql::Error: Lock wait timeout exceeded; try restarting transaction: UPDATE some_table.....

They have innodb_lock_wait_timeout set to 20 seconds. After a few hours of looking for strange transactions, we decided to create a script that dumps the output of the SHOW INNODB STATUS and SHOW FULL PROCESSLIST commands to a file every 10 seconds, to catch one of those moments when this error occurred.
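A script along those lines might look like the following Python sketch. It is not the script used in this investigation; the function names, the output path passed to `dump_forever`, and the use of the mysql command-line client (assumed to be on PATH, with credentials in ~/.my.cnf) are all assumptions.

```python
import subprocess
import time
from datetime import datetime
from typing import Callable, TextIO

# The two statements named in the post.
STATEMENTS = ["SHOW INNODB STATUS", "SHOW FULL PROCESSLIST"]


def run_with_mysql_cli(statement: str) -> str:
    """Run one statement through the mysql command-line client.

    Assumes the client is on PATH and credentials come from ~/.my.cnf.
    """
    result = subprocess.run(
        ["mysql", "-e", statement],
        capture_output=True, text=True, check=False,
    )
    return result.stdout + result.stderr


def dump_once(out: TextIO,
              run: Callable[[str], str] = run_with_mysql_cli) -> None:
    """Append a timestamped snapshot of every statement's output to `out`."""
    out.write(f"===== {datetime.now().isoformat()} =====\n")
    for statement in STATEMENTS:
        out.write(f"--- {statement} ---\n")
        out.write(run(statement))
        out.write("\n")
    out.flush()


def dump_forever(path: str, interval_seconds: int = 10) -> None:
    """Take a snapshot every `interval_seconds` until interrupted."""
    with open(path, "a") as out:
        while True:
            dump_once(out)
            time.sleep(interval_seconds)
```

Calling `dump_forever("/tmp/innodb_dump.log")` (a hypothetical path) would then leave a timestamped trail to compare against the time of the next lock wait timeout.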

Today we got the next error and started digging through our logs…

(more…)

MySQL: Planet MySQL

Small Tip: How to Enable ActiveRecord Logging in Merb

Today I was developing a small merb application for one of our projects and needed to see ActiveRecord logging on the console like I do in Rails. After a little research I found out that the merb_active_record plugin passes its MERB_LOGGER to AR by default, so I decided to try changing the merb log level, and here they are - my pretty colored AR logs!

So, if you want to see ActiveRecord logs in your application in development mode, then you need to add one line to your conf/environments/development.rb file:

puts "Loaded DEVELOPMENT Environment..."
MERB_LOGGER.level = Merb::Logger::DEBUG

That’s it for now. Long live merb! :-)

MySQL: Planet MySQL

SubSonic: All Your Database Are Belong To Us

A Super High-fidelity Batman Utility Belt. SubSonic works up your DAL for you, throws in some much-needed utility functions, and generally speeds along your dev cycle.

opensource: del.icio.us tag/opensource

MMM checkers memory leak?

One of the MMM users reported that they’re experiencing really weird memory leaks in the checker processes used by MMM. After a deep investigation I’ve found that the Perl part of the checker and the checker modules do not leak (at least I didn’t find any leaks there), so I think it could be caused by some problem in the MySQL DBD module (the client uses Ubuntu server).

So, I’d like to ask all users to check whether their checker processes use more memory than expected and, if so, which OS, MySQL library versions, and Perl version are used on their servers.

Thanks in advance for any help.

MySQL: Planet MySQL

Useful Cacti Templates to Monitor Your Servers

Recently a consulting customer asked me, aside from mysql optimization and so on, to install and set up cacti to monitor their pretty generic LAMP application. I started setting it all up and never thought it could be so painful… lots of different templates for the same tasks, all of them incompatible with recent cacti releases, etc. So this post is basically a list of the templates I used, with the fixes I’ve made to get them working on a recent cacti release.

(more…)

MySQL: Planet MySQL

CouchDB Document Database

* A document database server, accessible via a RESTful JSON API. * Ad-hoc and schema-free with a flat address space. * Distributed, featuring robust…

json: del.icio.us/tag/json

CouchDB Document Database

* A document database server, accessible via a RESTful JSON API. * Ad-hoc and schema-free with a flat address space. * Distributed, featuring robust, incremental replication with bi-directional conflict detection and management. * Quer

opensource: del.icio.us tag/opensource

Labnotes » CouchDB: Thinking beyond the RDBMS

…document you want to retrieve. The second is referencing it from another document. And remember, it's JSON in/JSON out, with REST access all around, so relative URLs and you're fine.

json: del.icio.us/tag/json
