I have been getting a lot of comments and errata from people who seem to be mistakenly buying the first edition and believing it’s the second edition. A lot of the blame for this probably rests with Amazon, who did not distinguish between the two editions at all until the editor and I (among others) leaned on them persistently for about 6 weeks. I think some people are buying the second edition and getting the first edition.
I’ve even spoken to people in person who said “yeah, I’ve been reading it” and I give them a copy of the second edition to hold in their hands, and they go “whoa, that is like twice the size. I don’t have this edition at all.”
If you have any question at all, just look at the front cover. If you have the second edition, you will see it clearly in the upper right-hand corner of the cover, as shown in this picture:

I feel like a Microsoft Anti-Piracy Minion “educating” you about how to verify that you are installing Genuine Spyware. Don’t worry, the feeling will pass and I’ll be okay *grin*
If you ordered the second edition and got the first edition, Amazon should send you a new book free of charge. If they’re really making the mistake that they seem to be, I predict they’ll fix it when it starts costing them money.
Apparently High Performance MySQL, 2nd Edition is selling quite well (I’m not sure exactly how well), because we’re preparing for a second printing. This makes me very happy. I don’t think they anticipated going back to the press for quite some time.
The book fluctuates between sales rank 1000 and 2000 on Amazon during the day, and has reached as high as 600 or so. This is just phenomenal. The O’Reilly team was psyched when it broke 5000, and so was I — but now we’ve stayed under 2000 for a long time (except when Amazon sold out of it). Frankly I’d have thought that for a niche-market book like this, we’d have been in the 10,000 range or something like that.
Clearly we (the authors, editors, publisher, etc) have done something right! This is a great feeling.
Thanks for sending errata, by the way. I have just completed proofreading the whole book myself, and found a number of things that may be fixed in the second printing. I think certain types of errors won’t be fixed, but the important ones certainly will be.
Books, mysql, writing
The Sphinx project just released version 0.9.8, with many enhancements since the previous release. There’s never been a better time to try it out. It’s really cool technology.
What is Sphinx? Glad you asked. It’s fast, efficient, scalable, relevant full-text searching and a heck of a lot more. In fact, Sphinx complements MySQL for a lot of non-search queries that MySQL frankly isn’t very good at, including WHERE clauses on low-selectivity columns, ORDER BY with a LIMIT and OFFSET, and GROUP BY. A lot of you are probably running fairly simple queries with these constructs and getting really bad performance in MySQL. I see it a lot when I’m working with clients, and there’s often not much room for optimization. Sphinx can execute a subset of such queries very efficiently, due to its smart I/O algorithms and the way it uses memory. By “subset” I mean you don’t get the full complexity of SQL, but you get enough functionality for lots of the poorly-performing queries I see in the wild. It’s a 95% solution.
Is Sphinx for you? Good question. You can find answers in Appendix C in High Performance MySQL. And yes, that is why I wrote this blog post — to put in a plug for the book. *grin* But before I go, let me put in another plug for Sphinx: go vote for it on Sourceforge! If it’s voted as one of the Community Choice projects of the year, that will be fantastic.
The book is done now, right? What’s next?
Don’t tell my wife this, but a book is never done.
Right now I’m proofreading the printed copy. I proofread PDF after PDF during production, but some problems will always slip through and make it to paper. I’m finding quite a few little mistakes. For example, at one point we refer to TPC as TCP three times in a row. Oops.
These problems will be corrected in the next printing. Please notify me if you find any errors yourself, and I’ll add them to the list of things to fix! Also let me know if you find things that should just be “fixed” in general. For example, the layout and page-breaking on pages 364 and 365 are totally confusing: it’s hard to tell which figures are associated with which text.
I’m not offering rewards like Donald Knuth, sorry…
I will place a list of errata on the official High Performance MySQL Second Edition website, including a non-mangled Figure 8-12. (Another thing that didn’t make it through production unscathed.)
errata, printing, proofreading, writing
My post on what it’s like to write a technical book was a stream-of-consciousness look at the process of writing High Performance MySQL, Second Edition. I got a lot of responses from it and learned some neat things I wouldn’t have learned if I hadn’t written the post. I also got a lot of questions, and my editor wrote a response too. I want to follow up on these things.
I really intended to write the post as just “here’s what it’s like, just so you’re prepared.” But at some point I got really deep into it and lost my context. That’s when I started to write about the things that didn’t go so smoothly with the publisher, and some of these things had a little extra sting in them that I would have done well to edit out.
All of us are human and the process wasn’t that bad, all things considered — the book was just a massive project that put huge demands on all of us and stressed everything from the capabilities of our chosen tools to our patience. As the editor points out in his response to my blog post, this is precisely why nobody else has ever been able to pull this off. This book stands head and shoulders above the crowd. It’s just hard to write, and very few people in the world actually have the knowledge to do it, much less the time, inclination, and ability.
Everything I said was (I believe) factual and correct, although as the editor points out there are different stories behind them. I also want to mention that I’d shared all those concerns with my editor; I avoid criticizing people behind their backs. In hindsight, throwing all of my concerns onto a blog post without warning isn’t the kind of thing I like to do either.
So I believe I was honest, but unfair to the editor. I’ve apologized to him. And by the way, yes I would work with him again, and I fully expect that it would be easier because I have learned more about the process.
I ran this post by my editor before publishing it.
Several people asked me to say more about my heuristics for improving the quality of the writing. I’ve already explained many of them, but here’s more:
The tools I used to find sentences and phrases that score badly on some readability metric were pretty helpful to me as I tightened the writing up more and more. Nobody has reviewed the book yet, but I think when they do, they’ll be unlikely to mention “oh, and by the way the writing is wonderfully compact!” If we pulled this off right, you won’t notice that the writing is clear and compact. Writing is like a stereo system: you’re supposed to hear the music, not the speakers.
Anyway, my point is that we expanded the first edition’s actual coverage many times over, and ended up with only 658 pages of actual material. So the writing is much more compressed, and to do that you have to find and eliminate confusing writing. Confusing writing usually means that the concepts don’t flow clearly, and it takes more words to say the same thing because you’re kind of bumbling about, gesturing at your meaning from several angles instead of saying it clearly just once.
Here’s how I analyzed each chapter:
As I wrote in my previous post, the analyzer uses a combination of readability metrics and “other stuff” to measure the badness of each sentence and paragraph. It aggregates sentences and paragraphs by the metrics. I calculated the number of words, percent of complex words, syllables per word, number of sentences, words per sentence, and a bunch of other things, as well as the standard readability metrics. Each sentence and paragraph got scored on these. Then I printed overall metrics, and sorted the sentences and paragraphs worst-first and printed out a snippet of the offending text. Here’s a sample of chapter 3’s metrics (originally numbered chapter 4) at some intermediate stage in the writing process.
This was a lot of work. If I had been writing with Vim, I could have done better. I could have used the compiler integration and set my “make” program to the analysis program. If you use Vim and you don’t know about this, it’s a pity. My next book will be written in Vim, by the way.
Actually, I probably could have done better regardless, but this was good enough. I just searched for the snippets and then examined what was going on.
There were some false positives. For example, bullet-points often scored badly on the readability metrics, and so a five-word bullet point item would look like terrible writing just because it was short enough that it had a high percentage of complex words. It’s not an exact science. Maybe next time will be better.
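To make the scoring idea concrete, here is a minimal Python sketch of the worst-first approach. It is not the analyze_text.pl script itself: the syllable counter is a crude vowel-run heuristic, the sentence splitter is naive, and ranking by Flesch Reading Ease alone is an assumption standing in for the real combination of metrics. All function names are made up for illustration.

```python
import re

def count_syllables(word):
    # Crude heuristic (an assumption of this sketch): one syllable per run of vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' when followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def sentence_metrics(sentence):
    # Words are runs of letters, optionally joined by an apostrophe (don't, it’s).
    words = re.findall(r"[A-Za-z]+(?:['’][A-Za-z]+)*", sentence)
    if not words:
        return None
    syllables = [count_syllables(w) for w in words]
    n_words = len(words)
    syllables_per_word = sum(syllables) / n_words
    return {
        "text": sentence,
        "words": n_words,
        "syllables_per_word": syllables_per_word,
        # "Complex" here means three or more syllables.
        "pct_complex": 100.0 * sum(1 for s in syllables if s >= 3) / n_words,
        # Flesch Reading Ease applied to one sentence (words per sentence = n_words).
        "flesch": 206.835 - 1.015 * n_words - 84.6 * syllables_per_word,
    }

def worst_first(text, limit=5):
    # Score every sentence, then sort so the hardest-to-read ones come first
    # (a lower Flesch Reading Ease score means harder text).
    scored = [m for m in map(sentence_metrics, split_sentences(text)) if m]
    scored.sort(key=lambda m: m["flesch"])
    return scored[:limit]

if __name__ == "__main__":
    sample = (
        "The cat sat. "
        "Unnecessarily complicated terminology obfuscates otherwise "
        "comprehensible documentation."
    )
    for m in worst_first(sample):
        # Print the score plus a short snippet to search for in the manuscript.
        print(f"{m['flesch']:8.1f}  {m['pct_complex']:5.1f}%  {m['text'][:40]}")
```

A short item made mostly of long words gets a very low score in this sketch too, which is the same kind of false positive the bullet points produced.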
If you’d like to see the source code, here’s the clean_text.pl and here’s the analyze_text.pl. Enjoy!
Perl, writing
The book, that is. My dog is already studying it. You should buy a few copies for yourself, your family, and all your pets.

Final versions of High Performance MySQL, Second Edition sample content are posted at the official website. You can download unrestricted PDFs of the foreword, table of contents, chapter 4 (Query Performance Optimization), and the index.
PDF, Sample Chapter
Today High Performance MySQL, Second Edition went to press. I’ve been working with the production team over the last couple of weeks, proofreading and checking the index and working with the artist who re-drew the illustrations.
I spoke to the production editor this morning and she told me the schedule is for the bound-book date to be the 16th of June. The official in-stock date is June 19th. I don’t know how many copies they’re printing for the first printing. But I think there have been a lot of pre-orders (rumors I’ve heard from my Amazon Affiliate account).
I cannot wait to hold my copy in my hands!
Publishing
In preparation for the book’s launch next month, I’ve created a website for it: High Performance MySQL. You may notice that the URL isn’t the same as the site for the first edition. It proved to be difficult to transfer that domain. If we accomplish it later on, I’ll set up a redirect.
Why an official site? To give you free stuff, of course. Final drafts of the front matter (TOC, preface, foreword), a sample chapter, and the index are there already. When the final quality control is done, I’ll update these. Right now they don’t have professionally drawn figures. That will change soon.
Also, you’ll eventually find various things such as errata* and book-related info that I feel belongs there instead of here. You can subscribe to the site’s RSS feed to find out when these planned additions become reality.
* Surely there will be no errata, right? Right?
mysql
I just got the rest of the production schedule from the publisher, plus the PDF files for quality control, for our upcoming book. (Now I have to proofreeed the whole book!) This is the first time I’ve seen the entire production schedule. The book is supposed to go to the printer in the first week of June. I don’t know what the on-the-shelf date will be, but I think very shortly after that. The publisher has promised that it’ll physically be on sale at Velocity.
I also took a peek at the PDFs. Without the appendixes, the last page of Chapter 14 (Tools for High Performance) is page 604. The appendixes bring it to 660 pages. That’s real material, not including tables of contents and indexes. So my estimate (620) was not too far off.
660 pages is not bad, considering that the contract was for 384 pages.
Another note: the marketing materials for the book emphasize that it covers MySQL 5.1. While this is true, I want to point out that we took a real-life approach: we write about what we’ve seen in the real world, and 5.1 is not as widely deployed in the real world. So the book’s real value, as far as version-specific content goes, is its tremendous depth and breadth in MySQL 4.1 and 5.0. These have been “out there” for a long time, and among the four of us we’ve seen about every conceivable scenario with them. So you’ll get a lot of insight about current, production-ready, widely-used versions. Let the other guys speculate; we just report the facts. It’s not like there’s any shortage of things to say about 5.0, right?
mysql
I’m going to be at beCamp 2008, the followup to the first beCamp, which I sadly missed.
beCamp is a BarCamp un-conference. Tonight was about meeting, greeting, and throwing ideas at the wall to see which ones stick. Literally. We stuck pieces of paper on the wall with our ideas — things we can either talk about or want to hear about — and then scratched our votes on them to see which are popular.
I live and breathe MySQL for a decent part of the day, so I hesitated, but then stuck “MySQL Performance” on the wall. It got quite a few votes, so I assume I will be giving a talk on MySQL performance basics at some point during the conference. (The exact schedule is probably being determined right now, in my absence, but I’m so tired right now that I’ll just take my chances on it not being at 8:00 AM tomorrow.) [edit: I just checked the website and there won’t be anything before 9:00, and the schedule is determined tomorrow. I did say I’m tired, right?]
See you there!
PS: if you want to meet some of my colleagues from my former employer, the Rimm-Kaufman Group, they’ll be there too, wearing the “We’re Hiring” t-shirts. They’re hiring, by the way.
BarCamp, beCamp, beCamp2008, mysql, Rimm Kaufman Group
If you’re waiting for High Performance MySQL Second Edition to hit the shelf, you’re not the only one. I am too! I can’t wait to actually hold it in my hands.
But you don’t have to wait idly. No, not at all! You can pre-order it and then you’ll get it as soon as possible. Plus your pre-order will help them figure out how much demand there is, so it doesn’t sell out and make you wait for your own copy.
Keith Murphy and his hard-working crew have released the spring 2008 issue of MySQL Magazine. Go take a look: it includes quite a few articles on various topics, even a mention of our upcoming book (High Performance MySQL, Second Edition).
Keith Murphy, mysql, MySQL Magazine
If you’re at the MySQL Conference and Expo, you can get a free sample chapter of the upcoming High Performance MySQL Second Edition. Just go to the exhibition area. As you go through the doors, take an immediate left and look for the sample chapter on O’Reilly’s table. It’s a rough draft and contains typos and my incredibly crude drawings instead of those that will go into the final book, but it should serve to give you an idea of the book’s depth and scope. Kudos to Andy Oram, our editor, who was able to get these done for us on very short notice.
Andy Oram, mysqluc2008
There are quite a few business angles you might see only if you’re here at the conference, and you won’t get from blogs. For example, let’s take a look at the contents of the shoulder bags they hand out with your registration. (This is only a partial list.)
Sorry. I have a short attention span.
Kickfire, Marketing, mysql, mysqluc2008, Zmanda
Just a quick note to say we have reached the production stage of the book project. Production is the process of transforming our OpenOffice.org files into the final page layout using a professional typesetting program.
As you can probably guess, this is later than we would have wished. This also means we won't have the book for sale at the upcoming MySQL Conference and Expo. We will have a display copy at the O'Reilly booth at the conference, and you will be able to pre-order the book at a discount at that booth. (Several details remain to be worked out -- do not trust the Amazon.com information on the book, as it is a weird blend of the first and second editions).
The book is very, very good. You will not be disappointed. I can't think of a credible way to explain how good this book is -- it's just very, very good. Better than anything else you've ever read on the subject. So good that you will not want to share, because you'll want to have your own copy handy for frequent reference (I currently refer to the OpenOffice.org files several times a week myself, and I wrote them!). But I'll let you see for yourself. Buy a copy for yourself, your boss, your coworkers, and your mom. And your cat.
I dashed off a hasty post about speeding up replication slaves, and gave no references or explanation. That's what happens when I write quickly! This post explains what the heck I was talking about.
Well, if my perfectionist nature were allowed to run free, and if Peter et al's encyclopedic knowledge were somehow all transferred to paper, the second edition of High Performance MySQL would end up being the perfect encyclopedia of MySQL performance. But as it is, you're apparently going to have to settle for "very good." This quote by Sheeri Kritzer Cabral, one of our tech reviewers, really made my day:
I gotta hand it to Peter, Vadim, Arjen, and Baron. They know how to write a book!
And now I must begin a solid weekend of revisions... wish me luck!
It's been a while since I said anything about the progress on the book. That doesn't mean we are not still working on it, though.
As Peter wrote a while ago, he is basically wearing the hat of a very advanced technical reviewer at this point. We've finished writing all the chapters from his detailed outlines. He has worked through about half the chapters, and I'm continuing to spend my evenings and weekends and holidays (yes, nearly all my free time -- just ask my wife!) writing some new material (an appendix on EXPLAIN, for example), finishing unfinished things marked with TODO in the text, and revising chapters after Peter reviews them. Vadim is working on benchmarks. For example, he just finished some benchmarks for something I profiled with SHOW STATUS. I thought that would be good enough to assert something about the performance. Sure enough, SHOW STATUS says it does less work, but Vadim's benchmarks show it's slower :-) This is why we check each other's work!
The core chapters on MySQL performance -- beginning with Benchmarking and Profiling, and continuing through Optimizing Server Settings -- are the ones Andy Oram, our editor, thinks we should put the most effort into, and I agree. We will probably circle back and go through another review/edit cycle before we release them for technical review. Some of the other chapters, such as Replication, are already out for technical review.
Despite the fact that all of the chapters and appendixes are theoretically a "first draft," as of several weeks ago, there is still a lot of work to do. Depending on the chapter, it takes me a solid weekend to revise a chapter after Peter reviews it. Each little thing anyone points out (does MySQL version X really do Y by default?) requires some research, testing, benchmarks, or even reading the source code.
Some miscellanea:
Well, I've run out of my allotted thirty minutes of blogging! Back to the salt mines! Just kidding... I'm actually off to the climbing gym soon to get my mind off it.
Very fast, as it turns out. Click through to the full article for details.
As I wrote a few days ago, I'm writing the replication chapter for the second edition of High Performance MySQL. I'm writing about replication filtering rules right now, and I thought it would be good to get input on this. If you have favorite replication filtering tricks you'd like to share, or tasks that always frustrate and/or confuse you, please post them in the comments. I'm making a section that shows how to accomplish common filtering and rewriting needs, such as preventing GRANT statements from replicating to the slaves.
Thanks very much! I hope the community involvement will make this book more useful for everyone.
Your comments on the Advanced MySQL Features chapter were great. A lot of the questions I got (in the comments and via email) about chapter 6 are really addressed in chapter 5, "Query Performance Optimization," so I'm posting its outline too. I have the same questions: are there things you'd like to see us cover? Do you have any favorite techniques you'd like to see us include? Any other comments or questions?
Work continues apace on High Performance MySQL, Second Edition (the link leads to the chapter outline). I'm working now on Chapter 6, Advanced SQL Functionality, and thought I'd solicit input on it. Are there things you'd like to see us cover? Do you have any favorite techniques you'd like to see us include? Feel free to leave feedback in the comments. The chapter is already significantly done, with 26 pages written, but the ink's not on paper yet, so there's still time to correct omissions!
I wrote a couple weeks ago about my work on the Backup and Recovery chapter for High Performance MySQL, 2nd Edition. Thanks for your comments and suggestions, and thanks to those of you who helped me over email as well.
I've had several questions about what is included in the chapter, so I thought I'd post the outline as it stands now.
I mentioned earlier that I'd blog about progress on the book as we go. It's not only progress on the book itself -- I want to write about the process of writing, because I think it's very interesting and relevant to software engineering. I'm finding a lot of the work in writing a book comes from some of the same things that make software hard: coordinating work, deciding what should go where, and so on.
| RANK | INSTITUTION | COMPUTER | #PROCESSORS | #Tflops |
| 1 | DOE/NNSA/LLNL | eServer Blue Gene Solution | 131072 | 280.6 |
| 2 | NNSA/Sandia National Laboratories | Sandia/ Cray Red Storm | 26544 | 101.4 |
| 3 | IBM Thomas J. Watson Research Center | eServer Blue Gene Solution | 40960 | 91.290 |
| 4 | DOE/NNSA/LLNL | eServer pSeries p5 575 1.9 GHz | 12208 | 75.760 |
| 5 | Barcelona Supercomputing Center | BladeCenter JS21 Cluster | 10240 | 62.630 |
| 6 | NNSA/Sandia National Laboratories | PowerEdge 1850 | 9024 | 53 |
| 7 | Commissariat a l'Energie Atomique (CEA) | NovaScale 5160 | 9968 | 52.840 |
| 8 | NASA/Ames Research Center/NAS | SGI Altix 1.5 GHz | 10160 | 51.870 |
| 9 | GSIC Center, Tokyo Institute of Technology | Sun Fire x4600 Cluster | 11088 | 47.380 |
| 10 | Oak Ridge National Laboratory | Cray XT3 | 10424 | 43.480 |