Brian Aker gives the “zinger” lightning talk about the newly announced “Drizzle”. This short (under 8 minutes) video captures Aker’s highlights of why he started the Drizzle project and how Drizzle is different from MySQL — both in what has been removed from MySQL and what features Drizzle can accomodate.
Play the video directly in your browser at http://technocation.org/node/576/play or download the 116 Mb file at http://technocation.org/node/576/download.
Yesterday I had a good conversation with Monty Widenius (a MySQL founder) about MySQL 5.1. Specifically, about the fact that MySQL 5.1 is not a GA (generally available) release.
My impression, which was wrong, was that it was difficult getting critical mass to download 5.1 and use it simply because it was not a GA release yet. I thought the paradox of needing to have a certain amount of usage before release was the barrier.
That’s not the case at all.
(more…)
Greetings from Wisconsin! Welcome to the 106th edition of the Log Buffer. Mr. Edwards is on a brief holiday and kindly asked me to fill in for him. So join me as we take a tour of some of this week’s database blogging activity.
I’d like to start by sharing the story of MySQL engineer Andrii Nikitin’s young son, Ivan. The short story is that Ivan is in need of a bone marrow transplant and that would also require travelling outside of their native Ukraine for the procedure. The family is asking for donations to cover the cost of the operation and trip, so please consider donating via the previous link.
Now, moving on to the database topics, we begin with my own area of (relative) expertise, Oracle. The big news this week is the July 2008 CPU, or Critical Patch Update, which Oracle releases quarterly. The Integrity blog has a nice summary of what’s included in this latest patchset. I’m sure plenty of you will be heading to Oracle OpenWorld 2008 as well, and OTN’s Justin Kestelyn announced a community preview page with some sneak peeks of what to expect (spoiler: beer!).
In a quick round-up of Oracle 11g topics, Amit Bansal at AskDBA.org has a nice guide for installing Oracle 11g on Solaris 10. Meanwhile, Oracle performance guru (and fellow Wisconsin native) Greg Rahn makes a great case for using incremental global statistics on partitioned tables. I see that Tim Hall has taken a break from reviewing summer blockbusters to write up a nice article on AWR baseline enhancements in Oracle 11gR1, and Alex Nuijten at AMIS has a nice presentation on using Oracle 11g’s virtual columns to clean up your data model.
(more…)
This week, database blogs seemed to talk about conforming to stereotypes as well as breaking them.
Fulfilling the stereotype of Microsoft software being unsecure, Microsoft released a very important SQL Server update that Aaron Bertrand notes patches “four elevation of privilege vulnerabilities.” That’s right, not one, not two, but four!!! At least there is a patch now……go forth and patch! Usually it is MySQL that throws ambiguous warnings or errors which are not an accurate representation of the actual error, but Bertrand also notes that there is A Little Management Studio Oops.
Contrary to stereotype, Microsoft is giving away content with NO DRM! Kalen Delaney asks, Did You Know the History of SQL Server? and shares a link to a chapter from a book on SQL Server 2000 in the Inside SQL Server series.
The PSS SQL Escalation Services team has fought many times about SQL Server I/O Bottleneck, I don’t have one, YES YOU DO! The team gives some reminders about how to interpet averages properly.
Allen White asks about Tools for the Reluctant DBA — that is, a programmer or administrator so good at databases that they were promoted to DBA, but may not actually want the job. Check out the comments and add your own.
Tibor Karaszi shares his stored procedure to find physical index details. Now you can use one stored procedure does what a stored procedure plus 3-4 tables ordinarily does.
By “Good SQL Querying”, I am not referring to “how to make your queries more perfomant.” I am about to go on a mini-rant about how to make readable and self-documenting SQL queries.
One practice that will get me instantly going on a rant is using a comma join. There is NO reason to do the following:
-- uses the sakila sample database SELECT first_name, last_name, address FROM customer,address;
What kind of join did the original author intend? A CROSS JOIN? Or did they really want an INNER JOIN and forget the WHERE clause?
The answer: you do not know for sure; you can only guess. Had the query been
SELECT first_name,last_name,address FROM customer INNER JOIN address;
you would know that the author intended an INNER JOIN; had the query been
SELECT first_name,last_name,address FROM customer CROSS JOIN address;
you would know that the author intended a CROSS JOIN. I advocate using INNER JOIN instead of JOIN because like the comma, JOIN is ambiguous.
For full disclosure, the MySQL EXPLAIN plan is exact same for the above queries. The difference is purely semantic, but I think it a very important difference, because it makes the query author’s intentions clear. There are issues with the comma join not having the same precedence as INNER JOIN, but that is for more complex queries. To wit:
SELECT first_name,last_name,address FROM customer, address INNER JOIN city;
will actually do address INNER JOIN city first, then join with customer. This was changed to be more like the SQL standard, so it’s only in MySQL 5.0.12 and higher that it acts this way.
My feeling is that a comma join should NEVER be used in MySQL. There is no such thing as a “comma join”, really what you mean is an “inner join” or a “left/right join” or “cross join” (or “natural join”).
I know that naysayers will say that
SELECT first_name,last_name,address FROM customer INNER JOIN address;
actually does a CROSS JOIN (Cartesian product), so it is folly to rely only on the name of the join. And I agree. However, if you name your joins appropriately for what you want and intend them to be, it is much easier to catch mistakes.
And while we are on the topic of good SQL query techniques…..I would like to rant a bit about join criteria. (more…)
I was contacted by the folks at MONyog and asked if I would review MONyog. Since using MONyog is something I have been wanting to do for a while, I jumped at the chance. Of course, “jumped” is relative; Rohit asked me at the MySQL User Conference back in April, and here it is two months later, in June. My apologies to folks for being slow.
This review is an overall review of MONyog as well as specifically reviewing the newest features released in the recent beta (Version 2.5 Beta 2). Feature requests are easily delineated with (feature request). This review is quite long, feel free to bookmark it and read it at your leisure. If you have comments please add them, even if it takes a while for you to read this entire article.
While the webyog website gives some information about what MONyog can do, it is a bit vague about what MONyog is, although there is a link to a PDF whitepaper on What is MONyog? which does answer much of these questions.
The screenshots available from the website are accurate, so I will not reproduce them here. I will note that I have not shared this feedback with the webyog team yet, so I may be upset that a feature is lacking, and the feature may be implemented but I missed it. I will post a follow-up in that case, even though they will likely comment here too.
My reference points — I have used other monitoring and graphing tools such as Nagios, Cacti, and Intermapper as well as MySQL’s Enterprise Monitor.
As an overall review — MONyog is the best out-of-the-box GUI monitoring tool for MySQL that I have seen. It “just works.” As promised, getting up and running quickly is easy, and having a centralized location for monitoring is very useful. The graphs are beautiful and the statistics that are graphed are useful time-savers.
(more…)
From 7:30 - 8:30 pm tonight, Wed. June 25th, in the Berkeley room of the Sheraton Boston, I will be hosting a Birds of a Feather conversation entitled “Pros and Cons of Managed Services”. This will go beyond MySQL and even beyond remote database management, and just deal with the overall pros and cons.
Come, share your good and bad experiences, and discuss why managed services may or may not be appropriate for your situation. I will try to take notes at the BoF.
(Note: I have no idea if they check badges for Birds of a Feather sessions or not)
If you are attending Usenix 2008 at the Sheraton Hotel in downtown Boston, you can meet me and ask your burning MySQL questions at my “The Guru is In” session. On Friday, June 27th, 2008 from 2 - 3:30 pm in Constitution B, I will be helping folks out by optimizing queries and schemas, teaching general principles of working with MySQL databases, and answering (to the best of my ability) any other question they may throw at me.
The event details are at:
http://www.usenix.org/events/usenix08/tech/#fri
Hope to see you there!
So, we have all heard that Billy Joel played a concert at Oracle’s OpenWorld in 2007.
What follows is an actual IRC conversation among Don Seiler, Dave Edwards, and myself:
(4:02:46 PM) don: ha @ Billy Joel at OOW
(4:03:38 PM) dave: “We didn’t fire the startup…”
(4:07:53 PM) don: “we didn’t start the backup”?
(4:12:53 PM) dave: “Don’t go changin’ . . . your slave and master”
(4:20:19 PM) ***sheeri shoots Dave
(4:20:49 PM) sheeri: “I don’t want clever replication, we never could have come this far”
(4:24:05 PM) sheeri: “And the server sounds like an aero-plane, and replication chugs along as it must…and the inserts go on, replication corrupts, and I say “Man, now I’m workin’ all night!”
(4:24:29 PM) dave: “I said ‘ls -u’ . . . that’s for access”
[”I said I love you . . . that’s forever”]
(4:24:30 PM) don: UP-TIME GIRL
(4:34:09 PM) dave: “Say it’s not wrong, execution plan!”
(4:43:39 PM) sheeri: Where’s my execution plan, oh man?
[Sing us a song of a piano man]
(4:45:52 PM) sheeri: Go ahead with your schema, leave me alone!
Comment here with your own database-themed parody of a Billy Joel song. Perhaps if we get enough MySQL-themed entries, we can get him to come to the MySQL Conference in April.
That and maybe thousands of dollars………..
Post Summary: An apology with a lesson.
When Steve Curry contacted me just after the MySQL Conference and Expo asking me if I’d be interested in a community roundtable, I was excited. Not just because Steve Curry brought me an inflatable pink dolphin after I squee‘d that I needed one, although I never forget when someone does me a favor.
However, a few weeks ago it seemed like the event was more of a PR gathering than a community roundtable. I was disappointed, and told Steve as much.
And then, one of two things happened:
1) My concerns were brought up, discussed and folks decided a roundtable involving community was a good idea;
or
2) I had come up with two different pictures of the event in my mind, based on my expectations of “community roundtable” at first and “event with businesses and PR, to include community” as the final description.
Now, last night was an excellent opportunity for me and also a lot of fun. A lot of the questions were really implicitly asking, “Is open source better? Why?” More on that in the next post, I promise.
So I wanted to say to MySQL that I was wrong.
I am sorry.
Sure, MySQL did not know what I was thinking. And certainly the event could have turned out to be one I did not enjoy.
The lesson to learn from this is that sometimes we get upset at our perception of reality, and not reality itself.
And to follow up on my cranky post where I was annoyed at the MySQL’s website’s lack of functionality at http://www.pythian.com/blogs/1016/mysql-website-a-reflection-of-values, I feel I should note that I got a call later that day from MySQL’s web designer telling me that my concerns were valid and MySQL was actively working on them. Indeed, www.mysql.com has added a “Documentation” link in the orange submenu (first is “Products” and second is “Downloads”, so I completely agree with their prioritization as well).
The other lesson: Always trade business cards with people, so they have your contact information when they want to contact you. A phone call was so much more powerful than an e-mail ever could have been.
At last night’s event, a lot of the questions were really implicitly asking, “Is open source better? Why?”
The first answer everyone comes up with is that it’s free, and that’s better.
However, that is neither necessary nor sufficient to deem it “better”.
If MySQL did exactly the same tasks Oracle did, but was free, there’s still a huge amount of money involved when migrating. Merely staffing the migration costs a lot of money.
Companies using open source technologies because they are free are (probably) making the right software choice for the wrong reason.
Firstly, open source does not have to be free — MySQL proves that. Their Enterprise source code is free to paying customers (and whoever paying customers distribute to, but that is not the issue).
Secondly, open source’s benefits far outweigh mere license costs, though the license cost is definitely the most tangible benefit.
I realized while the benefits of open source were being touched upon that the benefits are not lacking in the closed software world, they are simply much harder to come by. For instance, there are companies that reverse engineer solutions, develop their own in-house solutions without being able to read a line of original code. Surely it is easier to build a home-grown solution when the code is readable to begin with.
As well, the talent pool for open source is greater, because there is a lower barrier to entry. It’s still just as difficult to separate the wheat from the chaff as it is in a closed source world, however if your company is willing to hire the top 10%, I’d rather try to find the top 10% from a pool of tens of thousands of people than from a pool of thousands.
The oft-quoted “you can hack it yourself if you want” still applies, and moreso the idea that “even if the company goes out of business, or the core developers stop developing, others can pick up where the previous developers left off.”
One issue we did not touch upon was that open source tends to follow a popular concept in “extreme programming” — the idea that the software is always working. It may not have all the features, maybe it’s not much more than “hello world”, but it works. A feature is added, the code integrated, and it still works, now with +1 feature.
I think the issue is that in general, it is *easier* to reap these benefits from open source than from closed. It makes the argument more difficult, because it’s *possible* to reap similar (or the same) benefits from closed source, but it’s easier with open source.
Twitter has had many outages recently. On May 17th, 2008 http://blog.twitter.com/2007/05/devils-in-details.html was posted and says:
What went wrong? We checked in code to provide more accurate pagination, to better distribute and optimize our messaging system?basically we just kept tweaking when we should have called it a day. Details are great but getting too caught up in them is a mistake. I’ve been CEO of Twitter for two months now and this an awesome lesson learned. We’re seeing the bigger picture and Twitter is back. Please contact us if something isn’t working right (with Twitter that is).
(in other news, that post was made on May 17th and does not show up on http://blog.twitter.com, which it should, between the May 16th and May 19th posts. I found a reference in other posts and had to search the site to find that post).
A real “awesome lesson learned” is “do not tweak production without testing first.” In every job I have had I have first learned and then taught the concept of “test everything possible.” Which Twitter has not learned yet, because http://blog.twitter.com/2008/05/not-true.html, posted on Tuesday May 20th, states:
We caused a database to fail during a routine update early this afternoon.
As someone who has years of experience working with MySQL, and before that was a systems adminsitrator; as someone who was referred to as “the MySQL Queen” yesterday (by someone who wanted me to test their product, so yes, they were flattering me); I can assure you:
no matter how “routine” a change is, if you do it on production without testing it first, you are playing with fire, and 95% of the fires caused by not testing first are completely preventable.
I will repeat this, because repetition is important to learning concepts.
no matter how “routine” a change is, if you do it on production without testing it first, you are playing with fire, and 95% of the fires caused by not testing first are completely preventable.
With a proper testing environment, 19 out of 20 “whoops, didn’t expect THAT from a routine change!” issues are caught. And I can tell you that often “routine changes” cause unexpected results.
Now, I was online during an outage, and http://twitter.com/home was showing their “site isn’t working” page for at least 3 hours between 2 and 5 am EDT yesterday (Tuesday, May 20th, 2008).
So…..there is no read-only copy around that Twitter could use? Maybe I cannot tweet, but I should at least be able to read what was done before!
Of course, since last week Twitter has done the opposite — often I can see the most recent 20 or so posts, but not anything prior. Now, I understand that it is hard to get all the histories for the people I follow. But it only needs to be done once, and could then be cached — “Posts from who Sheeri follows on 5/20″. It would not be difficult, and I would be OK with the functionality changing such that “once you follow a new person, their tweets prior to when you followed them do not show up in the history.”
Alternatively, you could go the snarky way and say: http://www.techcrunch.com/2008/05/20/twitter-something-is-technically-wrong/ states:
What would be great is if Twitter just moved their blog to another platform so that it doesn?t fail when users need it most.
I am not a huge user of rails. But I will say that given the content of the public announcements, the platform is not the problem. It is the code release process that is the problem. Maybe there’s “agile development” happening, paired programming and code reviews. But there is not adequate testing.
Twitter — if you truly need scaling help, please ask for help — I know Pythian would be happy to help. However, if it really is as it seems — that basic good practice is not being followed — I would like to remind you that backups are really important too, just on the off chance that backups are not happening.
I understand that MySQL as a company wants to recruit paying customers. However, as a community user I have a hard time finding what I want on the MySQL website. Today’s frustration is brought to you by trying to find the documentation.
Go ahead, hit http://www.mysql.com. From there, where do you go to find the documentation?
It’s not Services, not even Services -> Support.
According to Products, the community server is not even a product. How is a potential new user, who wants to learn about MySQL, supposed to know a community version exists? Here are the products listed on the Products page:
MySQL Enterprise
MySQL Enterprise Monitor
MySQL Cluster
MySQL Embedded Database
MySQL Database Drivers
MySQL Database Tools
Where’s “MySQL Database” on that list? A website user basically has to know what they’re looking for, since that page does not help find “the mysql database”. You can guess it’s the “MySQL Enterprise” — but you’d be wrong. Imagine if you’re a person unfamiliar with MySQL who has been told “go to the MySQL website and get the free version of the database, it’s great!”
OK, OK, I know, you’re saying “click on Downloads”. Which takes me to the “choose which version you want to download.” OK, nice for newbies, but really annoying for the experienced. Why aren’t the links to download at the top, with the explanations just underneath? A newbie would be presented with the links, think “I don’t know which to choose!” but then see that the descriptions are just underneath. An experienced user can just click and go.
This of course is made even more silly when you realize that the community download link simply scrolls down. Yes, that’s right, the community downloads immediately follow that chart of explanations. Which means that in the current state, when the link is at the bottom of the chart, the link points you to the next line. If the link is at the top of the page, at least there’s a reason for it — to scroll down past the explanations.
Now recall that the exercise was to find documentation. There is no way to download the documentation here.
I am a non-paying user, not a developer. Going to the “Developer Zone” is not intuitive for me. However, that is exactly where I need to be:
* DevZone
* Downloads
* Documentation
* Articles
* Forums
* Bugs
* Forge
* Blogs
Given that list of topics, why on earth is this section called the “Developer Zone”? (it’s a rhetorical question!) Sure, if I’m a software developer I might think it’s useful to me. On any other website “developer zone” is relegated to advanced users, or folks using some different part of the product than what most people use (think Apple’s Developer Connection).
Take a look at that list. On any other website, it would be under “Support” or “Help” or “Learn”.
No wonder folks have no idea that the forums exist, much less the Forge or Planet MySQL. If I was a new user to MySQL and I wanted to find the documentation, I’d be very unhappy.
Heck, I’m very unhappy anyway — what kind of company has a huge community of people supporting each other and contributing back to the company, and does not give people an easy way to find the community?
One of the most frequently needed functionality in the MySQL Proxy is the need to know which server you are on. This is not given, on purpose, by the proxy, because the proxy is supposed to be transparent. It is not supposed to matter which back-end server you are on.
However, for testing purposes we often want to know which back-end server we’re on. Thus I developed functionality for SHOW PROXY BACKEND [INDEX ADDRESS OTHER].
SHOW PROXY BACKEND INDEX — gives the index of the server you’re on (backend_ndx, ie 1)
SHOW PROXY BACKEND ADDRESS — gives the address of the server you’re on (ie, foo.bar.com:3306)
SHOW PROXY BACKEND OTHER — gives the address of all the other servers except those you’re not on, in multiline format.
Note that I was pretty lazy and the commands are case-sensitive. But I figured that since this is supposed to be used mostly in testing circumstances, it did not really matter.
The code is on the MySQL Forge Wiki at http://forge.mysql.com/tools/tool.php?id=139
Interestingly enough, this script is actually being used in production — a site has a primary and failover server, and wants to check that when the primary server is in use, there are no connections on the failover. I wrote that check as well, but as the logic is somewhat particular, I am not sure it would be useful to many. The logic is:
SHOW PROXY BACKEND INDEX.SHOW PROXY BACKEND OTHER.OTHER address, connect to them and find what that connection’s host looks like (sometimes it looks like foo.bar.com:4325, other times it looks like 1.2.3.4:4573). Get the value of “my host” by stripping off the port.OTHER address again, killing off connections from “my host” except for the “system user” and a few other special accounts (replication slave being one of them). Log each kill with the thread number and a warning.Let me know if you’d like to see that…it’s a shell script, and it requires the mysql client and bintools like grep and cut.
I have already blogged about this keynote at http://www.pythian.com/blogs/948/liveblogging-who-is-the-dick-on-my-site.
If you are interested in actually seeing the video, the 286 Mb .wmv file can be downloaded at http://technocation.org/videos/original/mysqlconf2008/2008_04_17_panelDick.wmv and played through your browser by clicking the “play” link at http://tinyurl.com/55c5ps. This is not to be missed!
At the 2008 MySQL Conference and Expo, The Pythian Group gave away EXPLAIN cheatsheets. They were very nice, printed in full color and laminated to ensure you can spill your coffee* on it and it will survive.
For those not at the conference, or those that want to make more, the file is downloadable as a 136Kb PDF at explain-diagram.pdf
* or tea, for those of us in the civilized world.
In a despicable business practice, I received a message from a PR Firm representing Ingres. Now, I even wrote about the controversy that seems to have swept the open source community; but even my writings were not completely factually correct — I wrote that even if online backups were closed it was not necessarily the worst thing in the world. The actual parts of the online backup that are not open source and free are compression and encryption — that is all.
So really, we are talking about a very small part of backup. The last I saw most people used their own compressing (ie, | gzip -c) and encryption for backups. And honestly, I would rather use tried and true compression and encryption than something new that MySQL comes up with, so I do not even see most people wanting compression nor encryption.
But that’s besides the point. If Ingres thinks they can win customers over by swooping in when a controversy is happening, they are way more evil than I would ever have guessed. It’s not good business practice to do this kind of thing — it is cold, calculated, uncalled for, and just plain wrong.
As for my comments — firstly, MySQL did not make any particular “announcement”, and secondly, MySQL is already losing “feedback and contributions on its products from a large group of users in the community” because it does not have an easy way to get community patches committed to the source tree.
There is so much else that bothered me about the e-mail I received, so I will just copy it here for full context for readers:
Subject: Ingres' Thoughts on Today's MySQL Announcement and What it Means for the Open Source Community From: Lindsey Pappas Date: Apr 15, 2008 5:33 PM Hi Sheeri, Did you see MySQL ’s announcement on new features in MySQL Enterprise version of its product? This is interesting news for the open source community, as it appears that MySQL is moving away from true open source towards a proprietary model by not providing the same features in its community version. Ingres is a a leading provider of open source database management software and support services and views the news as the loss of a true competitor in the open source space. By doing this, MySQL will lose feedback and contributions on its products from a large group of users in the community. Costs will go up and quality will go down. Ingres is a believer in the true open source of its products. I can put you in touch with an executive at Ingres to hear thoughts on the news and what MySQL’s announcement means for the future of the open source community, including comment on: · The benefits of open source solutions that are identical between community and enterprise, functionality should not differ between products, cost or version. · The importance of an online backup component for mission critical applications. · In a community where everyone benefits from the valuable contributions and feedback on new developments and innovations from other players, MySQL?s move away from open source is a loss for the open source community. Please let me know if you?d like to connect in the next day or so to discuss MySQL?s announcement and where the open source community is headed. Thanks, Lindsey Lindsey Pappas Atomic Public Relations 8 California Street San Francisco, CA 94111 (415) 402-0230 lindsey@atomicpr.com
10,000 Tables Can?t Be Wrong: Designing a Highly Scalable MySQL Architecture for Write-intensive Applications by Richard Chart
Chose MySQL for performance and stability, and less important but still there, experience and support. Support is becoming increasingly more and more important.
Starting point: 1 appliance supporting 200 devices
Problem/Goal: Extensible architecture with deep host and app monitoring, over 1000 devices with 100 mgmt points each
Distributed collection over a WAN, with latency and security concerns
Current reality: several times the scale of the original goal
Commercial embedded product, so they actually pay for the embedded MySQL server
Future: The fundamentals are sound: next generation of the product moves up another order of magnitude
Data Characteristics
>90% writes
ACID not important
Resilient to loss, because gaps in data do not invalidate the rest of the data
Data elements by themselves are valuable, but much more so when relationships are added.
Chose MyISAM because: (more…)
Jeff Rothschild of Facebook’s “A Match Made in Heaven? The Social Graph and the Database”
Taking a look at the social graph and what it means for the database.
The social graph:
“The social graph has transformed a seemingly simple application such as photos into something tremendously more powerful.” We’re interested about what people are saying about us, and about our friends. Social applications are compelling.
Facebook users blew through the estimate for 6 months of storage in 6 weeks. It is serving 250,000 photos per second at peak time, not including profiles. Facebook serves more photos than even the photo sites out there, and serves more event invitations than any other website out there.
E-mail invitations are an example of the power of the social graph. If you get a newsfeed or an invitation that tells you 12 friends are attending an event, you have more information, and then can have a better decision on whether or not you want to go. (more…)
Yesterday’s keynote panel on “Scaling MySQL — Up or Out?”
Directly download the 310MB wmv file (not if you are on the conference wireless please!), or watch it in your browser via streaming — simply click the “play” link on this page.
Keith Murphy managed to take painstaking notes with all the facts and figures. As well, Venu Anuganti presents a chart with the results as well as notes on the more detailed answers. Ronald Bradford has a brief summary of the 20 seconds of wisdom from each panelist.
Identity 2.0: A world that’s simple, safe and secure.
Who is the Dick on My Site? by Dick Hardt (Sxip Identity Corporation)
Quotes:
“Really, data is about people. It’s really identity data.”
“Identity helps you predict behavior.”
“Identity is who you are.”
“Identity is also what you like.”
“Identity enables you to uniquely identify somebody.”
“There are things that other people say about you, too.”
“Modern identity is about photo IDs so you can prove your identity.”
“Identity is a complicated issue….Everyone has a different idea of what it is.”
Identity transactions are:
Identity transactions can be: (more…)
Liveblogging: Extending MySQL by Brian “Krow” Aker
Brian wins the award for “most frequent great quotes during a talk”.
Before MySQL 5.1 a UDF was the only way to extend MySQL.
All you need in a UDF is: init() execute() deinit()
my_bool id3_parse_init(UDF_INIT *initid UDF_ARGS *args, char *message)
UDF_ARGS tell you about incoming args
char *message is the output that might return
args->arg_count is the # of args
WARNING: use STRICT mode in MySQL, otherwise there are tons of silent failures.
“When you work on databases you start to put everything in databases. Tip, don’t put a DVD into a database, because really long BLOBs aren’t actually supported….”
In MySQL 5.1, you can now install plugins (example is memcache_servers plugin):
mysql> INSTALL PLUGIN memcache_servers SONAME 'libmemcache_engine.so'
In the plugin architecture, there’s a callback mechanism to call a structure and then the structure is self-hosted into the environment. mysql_declare_plugin() is usually found at the bottom of the file because that’s how Brian wrote the example….
A storage engine is really about connecting to anything.
MySQL has a parser, and optimizer, and a retrieve/store part, which talks to the storage engines. The engine interface is written in C++.
Quote about the BLACKHOLE storage engine: “It’s amazing how much money can be made out of something that does NOTHING.”
What about defining features?
tables_flags()
index_flags()
handlerton flags
Table object methods:
base_ext()
create()
delete_table()
rename_table()
optimize(), repair(), etc.
As you can see, the API is pretty specific to actions that are happening at the SQL layer.
Reads
rnd_init() — scan is about to begin
rnd_next() — get the next record
(rnd = random or read next data)
index_first() — index read is about to begin
index_next() — get the next record in the index
Writes:
write_row() — “here’s new data, please store this”.
delete_row()
update_row()
on delete or update, MySQL starts reading the rows and once the predicate matches, then the delete_row() or update_row() is called. update_row() will be passed the before and after image of the row.
What about information_schema?
The information_schema API is an easy way to get more information into the information_schema tables.
(there’s a really nice example of an information_schema plugin)
A quote about his code — “That in the middle there looks like a bug….but it’s never called, so that’s OK. That’s what I get for copying and pasting my own examples.”
The plugin is responsible for all the locking.
Daemon plugins created because otherwise people would “bastardize the storage engine plugins.”
A daemon plugin has full access to the server internals, and it is a simple interface to be able to extend.
“If you’re really going to shoot yourself in the foot, you really want to go full blast.”
UDP Daemon just allows UDP packets to be put into the db (instead of TCP/IP with its darn authentication schema). Simplest interface to MySQL that exists. No connector, just UDP. The point is you can put other listeners to MySQL, such as an HTTP listener.
Daemon plugins are also a way to
In 3 words:
They already do.
MySQL Enterprise is more than just a binary. http://mysql.com/products/enterprise/ has the details on the other features MySQL Enterprise includes.
One of these features is the MySQL Enterprise Monitor, which is closed source, proprietary alerting software.
So when bloggers make statements such as:
MySQL will start offering some features (specifically ones related to online backups) only in MySQL Enterprise. This represents a substantive change to their development model ? previously they have been developing features in both MySQL Community and MySQL Enterprise. However, with a shift to offering some features only in MySQL Enterprise, this means a shift to development of those features occurring (and thus code being tested) only in MySQL Enterprise.
All they are doing is showing their lack of knowledge about what MySQL Enterprise really is. Complaining that MySQL will likely charge for online backups is a valid one, but currently InnoDB Hot Backup is for sale only. There are solutions out there that do not cost anything, but nothing that MySQL produces. The only free [mostly] online backup product is mysqlhotcopy, and that has limitations such as “for MyISAM tables only”.
While I have said that software should be open source and free, where “free” means “free as in water”, that does not mean that we should take it for granted that free software will always be there. The path to that is to have a contrast of software one pays for, and software one does not.
Of course, that is not why MySQL charges for the software. They charge for some of their products because without money, they cease to exist, and then even the free version stops being developed on.
“A necessary evil” is a dangerous phrase, and can lead to a slippery slope….but in this case, I do not envision that it is a big problem. Maybe I will regret saying this if things keep getting more and more closed, but I do not believe so.
And I, too, am frustrated that we have been told that “online backups are coming” only to find out that they may not come in the means I expected. However, software is not bug-free, often does not come out on time (MySQL 5.1 itself is a year late!), and sometimes, promises get broken.
My faith in MySQL is unbroken, and I hope most others reading this feel this way as well.
Architecture of Maria: A New Storage Engine with a Transactional Design
Goals of Maria:
Maria is actually faster than MyISAM.
Will support READ COMMITTED and REPEATABLE READ (was a specific question).
“You should be able to upgrade Maria and have it just work, without ever having to backup and restore as part of the upgrade.”
Why do we need 2 storage engines, Maria and Falcon?
More storage engines = more scenarios to be able to handle. Falcon good for lots of memory, and shorter queries, Maria is to take care of long transactions and data warehousing side, and also to be an overall “good” engine for most purposes (at least as good as InnoDB).
Project plan — MySQL 5.1 and higher. Maria strives to be bug-free.
Maria 1.0 = “crash safe”, released in Jan 2008. Cacheable row format.
Maria 1.5 = “concurrent insert/select”, Apr/May 2008. Will be merged into the MySQL 6.0 release. Non-transactional.
Supports:
MySQL 6.0 (includes Maria)
Maria 2.0 = “Transactional and ACID” alpha, Q4 2008
Maria 3.0 = “High concurrency + Online backup” alpha, Q1 2009; GA, Q2 2009
Advantages of Maria compared to MyISAM
Before Apr/May release of Maria 1.5
Disadvantages, not likely to be fixed
XDB indexes — group records, and get the min and max for each group (groups are not sorted), and save that in a separate block. This means that stats like min and max are very fast. This is something that will help data warehousing queries.
Resources:
Best quote from Monty: “You don’t steal ideas in open source. You respect people.”
Here are my liveblogging notes from MySQL Sandbox: Easily Using Multiple Database Servers in Isolation by Giuseppe Maxia
Giuseppe has been a community member since 2001, and in the past year or so, a MySQL Employee.
He likes to give things away for free — he gave away T-shirts to the early arrivers to the workshop, and that’s why he’s giving away the sandbox as well. The sandbox is NOT an official MySQL product. It is released from GPL, available from http://sf.net/projects/mysql-sandbox.
Why the sandbox? To be able to set up 1 server in under 10 seconds. And to be able to set up multiple MySQL instances very quickly, and to use them quickly.
The sandbox untars in seconds, for installing alternative servers, not main instance, it creates a separated environment (datadir, port, sockets) — for groups of related or unrelated servers. Really good for testing out new server versions. WARNING: If you do not use separated environment (separate datadirs, ports and sockets), you can corrupt your data.
Doing it manually (the old, hard way):
unpack tarball, move to separate directory, create db tables, create .cnf with port, socket, datadir, launch mysqld_safe manually, launch mysql commandline script with options.
OK for doing it once, but a good DBA will automate this if they’re doing this a lot, to avoid mistakes — for example, while doing QA to test several versions.
So the easy automated way — MySQL Sandbox!
Just provide the version # and it creates $HOME/VER/data, VER, /tmp/mysql_VER.sock
Two examples:
Version 5.1.24 datadir = $HOME/5_1_24/data port 5124 socket = /tmp/mysql_5124.sock
Version 6.0.5 $HOME/6_0_5/data port 6005 socket = /tmp/mysql_6005.sock
(NOTE: you can have multiple instances of the same version)
Once installed:
In the case of a single sandbox — commands are start, stop and clear (removes all data and files in the datadir except for the mysql system db and tables), and use (instead of typing mysql -S /path/to/socket -u user -p you can use the use shell command and it will read what it needs from the my_sandbox.cnf file).
In the case of multiple sandbox, you can use the same commands as with a single instance, but there are commands that can affect all the instances. These are start_all, stop_all, clear_all, and multi_cmd. The first three have the obvious results; multi_cmd command executes command for all nodes — so you do not have to call it on each node. For example:
multi_cmd "select * from test.t1"
The easy way to install — download the package from Sourceforge. The sandbox doesn’t contain MySQL build, so you have to download a tarball or compile one yourself.
To install a single instance of MySQL 5.1.23:
express_install.pl /path/mysql-OS-5.1.23.tar.gz
Instead of using express_install.pl, to install 1 master and 2 slaves, run:
set_replication.pl /path/mysql-OS-5.1.23.tar.gz
To install multiple servers of the same version at the same time:
set_many.pl /path/mysql-OS-5.1.23.tar.gz
This installs 3 instances by default, but you can specify how many you want with options to set_many.pl.
For multiple servers of different versions, either:
Download the tarballs and run set_custom_many.pl /path/to/mysql-OS-5.0.51.tar.gz /path/to/mysql-OS-5.1.23.tar.gz /path/to/mysql-OS-6.0.5.tar.gz.
or:
Expand the tarballs in $HOME/opt/mysql and run set_custom_many.pl 5.0.51 5.1.23 6.0.5
Fine tuning - express_install.pl /path/mysql-OS-5.1.23.tar.gz [option]… can customize port, datadir, enable federated tables, disable innodb, skip networking, and so on.
You can pick your default my.cnf “size” with the my_file option:
express_install.pl /path/mysql-OS-5.1.23.tar.gz --my_file=small
express_install.pl /path/mysql-OS-5.1.23.tar.gz --my_file=medium
express_install.pl /path/mysql-OS-5.1.23.tar.gz --my_file=large
express_install.pl /path/mysql-OS-5.1.23.tar.gz --my_file=huge
These will use the my_small.cnf, my_medium.cnf, my_large.cnf or my_huge.cnf sample config files bundled with MySQL.
To easily fine tune the 50+ options in the sandbox, you can run
express_install.pl /path/mysql-OS-5.1.23.tar.gz --interactive
to ask you the values for each value — if you want to skip the rest of the questions while you are in the interactive mode and continue the install using default values for the rest of the questions, you can type “default” at any prompt. You can also enter “back” at any prompt to go back to the previous question; or enter “quit” at any prompt to quit the interactive server without completing the sandbox installation.
The use shell command starts the mysql client, using the credentials in my_sandbox.cnf. By the way, the default username/password = msandbox/msandbox, default root password = msandbox
There are shortcuts for using mysqldump, mysqlbinlog and mysqladmin for each instance in a sandbox. These shortcuts start the mysql client using the credentials in my_sandbox.cnf.
my sqldump
my sqlbinlog
my sqladmin
Using a multi-instance sandbox
start_all starts the master, then slaves. stop_all stops the slaves and then the master. clear_all clears all the slaves and then the master.
multi_cmd was already mentioned to run the same command on all the instances. However, there are different commands to run a command on a single instance of a multi-instance sandbox. Instead of the use command, the shell commands to use the multiple instances are:
Replication sandbox
m to use the master
s1 to use the first slave
s2 to use the second slave
Multi-node sandbox
n1 to use the first node
n2 to use the second node
n3 to use the third node
Jonathan Schwartz’s Keynote at the 2008 MySQL Conference can be played directly in your browser or you can download the 147MB .wmv file. (Please do not download the movie on the conference wireless system!)
I finally realized who Jonathan Schwartz reminds me of:

That’s right, comedian and magician Penn Jillette.
Schwartz started by saying, “enough of this free software stuff!” It got lots of laughs. He started to talk about Sun’s agenda, and mentioned that MySQL and Sun had similar values, as well as similar dysfunctions as well — particularly that each engineer has his/her own opinion.
But Schwartz goes on in earnest to say “The future will be defined by free….and freedom,” that “freedom is a price tag *and* a philosophy.” I agree completely.
He sees “the network is a social utility” much like heat and electricity are utilities. His talk somewhat reminded me of the Free as in Water post I made in June 2007.
He finished up by comparing the Amazon River to Sun — The Amazon River is really ecosystem of many rivers — 10,000 smaller rivers. Sun is really an ecosystem of many communities — Java, Open Office, Solaris, Open Solaris, MySQL, and so on. I was very pleased to hear that he feels that community is not only inclusive of all users, paid or otherwise, but that the ecosystem cannot exist without each part of that “smaller river” contributing to the whole. And *that* is Sun’s agenda, to continue to build that river, so if one part has problems, the entire river is not damaged.
Also, Sun has a quote that “innovation happens elsewhere,” so it is crucial to build those bridges (mixed metaphor unintended) so that Sun can support and enable the innovation. In a world where diversity is key, this is an excellent message.
Bravo, Jonathan!
Yesterday I presented “Best Practices for Database Administrators” at the MySQL User Conference and Expo. I was successful in streaming the live video on www.ustream.tv, and you can see it in totality at http://www.ustream.tv/recorded/352479.
The slides are available in pdf form. And here are some of the links I spoke about:
MySQL 5.0 manual
explain
show variables
show status
data types
procedure analyse()
mysqldumpslow’s manual page says “Use mysqldumpslow –help to see the options that this command supports.”
mysqlsla
mysqlreport
Maatkit, contains mk-query-profiler and mk-table-sync
Hello everyone,
Reading PlanetMySQL today, I discovered that Alex Gorbachev’s announcement that he has released the first public beta of his Oracle Grid Control plugin for MySQL was not aggregated! This is probably because Alex is primarily working on our Oracle space and so his feed isn’t on planet.
This plugin has been under development since 2006 and this is a major achievement.
Knowing that my feed is aggregated, and not willing to let this news and this amazing work go unnoticed by the MySQL community during the conference (I am at MySQLConf listening to Amazon.com’s CTO speak right now!)
In any event, if you missed them inline up there, here’s a link to Alex’s announcement with some impressive screenshots, and here’s a link to the product’s home page.
And check out the very positive comments from the first testers already on the announcement post.
Congratulations and thanks, Alex!