I was writing a utility in Python (using boto) to test/play with Amazon’s SQS service. As boto isn’t particularly well documented where SQS specifically is concerned, I also plan to post some examples (either here or on Linuxlaboratory.org, or both). When I had some trouble getting a message that was sent to a queue, I went to the Amazon documentation, and found this little gem in the Amazon Web Services FAQ:
I am sure that my queue has messages, but a call to ReceiveMessage returned none. What could be the problem?
Due to the distributed nature of the queue, a weighted random set of machines is sampled on a ReceiveMessage call. That means only the messages on the sampled machines are returned. If the number of messages in the queue is small (less than 1000), it is likely you will get fewer messages than you requested. If the number of messages in the queue is extremely small, you might not receive any messages in a particular ReceiveMessage response. Your application should be prepared to poll the queue until a message is received. Note that with the 2008-01-01 version of Amazon SQS, you’re charged for each request you make, so set your polling frequency with that in mind.
So… if you were planning to decouple application components with SQS under an ‘eventual consistency’ model, keep in mind that Amazon is using the same model, and that they’re charging you for the privilege of eventually getting the messages you’ve already paid to put there, messages that aren’t necessarily available at any given point in time. I personally think this is a little goofy, and wrong.
If I put a message in a queue, I should be charged for actually getting the message. I should *not* be charged for checking to see if Amazon’s internal workings have made my messages available to me yet.
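For what it’s worth, here is a minimal sketch of the kind of polling loop the FAQ is asking for. It is not boto’s API: `receive` is a stand-in for whatever call your client library uses to fetch messages (assumed here to return a list, which is empty when nothing came back), and the function name, the backoff numbers, and the injectable `sleep` are all illustrative assumptions. It waits longer after each empty response and counts requests, since each one is billable.

```python
import time


def poll_for_message(receive, max_attempts=10, initial_delay=0.5,
                     max_delay=30.0, sleep=time.sleep):
    """Call receive() until it returns a non-empty list of messages.

    `receive` is a hypothetical zero-argument callable wrapping your
    client library's receive call. Returns (messages, requests_made);
    every call to receive() counts as one request.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        messages = receive()
        if messages:
            return messages, attempt
        if attempt < max_attempts:
            # Empty response: wait, then double the wait (capped) so an
            # idle queue costs fewer requests over time.
            sleep(delay)
            delay = min(delay * 2, max_delay)
    # Gave up: no messages seen within max_attempts requests.
    return [], max_attempts
```

The second return value is the number of requests you made just to find out whether your messages were visible yet, which is the number this whole complaint is about.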
UPDATE - 2008-06-23 - A member of O’Reilly’s editing team commented that this privilege has *NOT* been discontinued, and all O’Reilly authors should receive a free Safari account. Thanks a bunch, Mary, for the clarification (see comments for more).
I learned from one of the authors of the recently released second (read: first, squared) edition of High Performance MySQL that O’Reilly apparently did away with the idea of giving O’Reilly book authors free Safari accounts. Lame.
I do not know why in the world they would discontinue this offering for authors. Perhaps they’re not aware, but a great many of the O’Reilly authors are also bloggers. Tech bloggers. Some of them write on the O’Reilly blogs themselves, but almost all of them blog outside of that arena as well. And guess what they blog about? Well, lots of stuff, but there’s plenty of blogging about “something I learned”, or “this book rocks”, etc. Heck, we even blog about products we use — I’ve even blogged about Safari… *today* even!
In a world where people are paid to blog about products, it surprises me that O’Reilly wouldn’t offer people who are already actively blogging in and around their content, and who have actually formally joined the O’Reilly family, the opportunity to become users, and thereby advocates, of their other offerings.
I am an O’Reilly author, and have a free Safari account (we’ll see how long that lasts after this post goes live). I can think of *plenty* of instances where I’ve recommended that people who don’t have an account try to get one, or try to get their employer to get them (or their whole site) an account. I consult (as do TONS of O’Reilly authors), and I’ve also recommended to my clients that they get Safari accounts for their technical staff. Had O’Reilly not offered me the free account, that would never have happened. I am confident that the amount of money grossed by O’Reilly due to my big mouth since 2005 is approaching 6 figures, if it hasn’t exceeded that already.
Not to mention the fact that having the account is a very real, very sincerely felt way to make the authors feel appreciated, because lord knows we don’t write books for the money.
So, Tim, do you think you could find it in your budget to give the guys with probably the most popular MySQL Performance blog (and probably consulting outfit as well) free Safari accounts? Please? If they agree to put some badge on their blog or something? (I’d gladly do that as well).
Here’s to hoping they see the light, fellas.
All right! In the past, some books seem to be delayed in getting into O’Reilly’s Safari site, but on the day that Baron announces the book’s arrival, I find that I’m able to access it in Safari right now! Sweet!
I’m going to OSCON in July, and just about everyone I know who is a participant in this crazy life we call IT (or web 2.0, or whatever it’s called now) is flying to a conference or something in 2008. I’m starting to notice more and more posts like this one, so if you can avoid it, don’t put anything in a checked bag that you can’t afford to lose, avoid US Airways, and pass it on. When you see the list of things they don’t cover in their lost baggage policy, you’ll suddenly feel lucky to still have anything you ever checked with your bags.
This rocks. It’s not complete, but Pyshards is the closest thing I’ve seen to a real attempt at making a more or less generic sharding toolkit, written in Python. This is not just great because it’s written in Python or because it helps people who need sharding capabilities in MySQL. It’s great because having a toolkit to use for this benefits the community by creating a point of reference for how to get things done, and can help unite those who are treading into this territory and help them all get a leg up on this beast that is “sharding”.
I, for one, have found ways (so far) to avoid having to do this. It’s a good bit of complexity for data that would otherwise be very simple, and an infrastructure architecture that would otherwise also be simple (by design). But one of the things that makes sharding seem complex is that there aren’t any standardized tools to aid the admin in setting up, and (worse) maintaining/rebalancing shards.
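To make the rebalancing pain concrete, here is a rough sketch of the simplest possible scheme: hash a key and take it modulo the number of shards. This is not Pyshards’ API or its approach; `shard_for` and `moved_keys` are hypothetical helpers that exist only for illustration.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard number in range(shard_count).

    Uses MD5 only as a stable, well-spread hash (not for security),
    so the same key always lands on the same shard.
    """
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def moved_keys(keys, old_count, new_count):
    """Return the keys whose shard changes when the shard count changes."""
    return [k for k in keys
            if shard_for(k, old_count) != shard_for(k, new_count)]
```

With evenly spread hashes, going from 4 to 5 shards leaves a key in place only when its hash gives the same remainder under both counts, which works out to about one key in five. Everything else has to be moved, and that is exactly the kind of chore a shared toolkit could help with.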
Hi. My name is Brian, and I’m a tech bibliophile.
I have owned more books covering more technologies than I care to admit. Some of my more technical friends have stood in awe of the number of tech books I own. I am also constantly rotating old books that almost *can’t* be useful anymore out of my collection because there’s just no room to keep them all, and it would be an almost embarrassingly large collection if not for the fact that I have no shame or guilt associated with my need for dead trees.
If you need further proof:
I have also co-authored a book for O’Reilly, and in addition to my day job (I’m the director of IT for AddThis.com), I also work for a publisher, MTA, the publisher of Python Magazine, php|architect, as well as a line of books. Oh yeah. I’m into it. It’s bad.
I’ve learned quite a bit about buying books, and some of that learning came from unexpected places. There’s even more that I don’t know, but at least now I know that I don’t know it, and can try to figure out more stuff.
So here are a few things to keep in mind when you need to buy a technical book, or one just tugs at your impulse buy strings.
The first books about PHP 5 were dreadful. I never, ever return books to a book store, even if I don’t particularly care for them, but I returned a book about PHP 5 because the level of inadequacy was just insulting to me as a consumer. This was quite some time ago (when PHP 5 books first hit the shelves), and thinking about it now I’m still amazed at how terrible that book was. Of course, PHP 5 is just one example. Way, way back in the day (1998-9 or so?) when the first books about Java 2 hit the shelves (some might remember that booksellers actually put stickers over the part of the title that said “1.2” when it was renamed “2”), I had the same experience.
It’s not exclusive to languages either. When the first MySQL books came out that said “covers mysql 5”, they just barely covered MySQL 5. In fact, there’s a new edition of High Performance MySQL coming out that is *going* to say “covers MySQL 5.1” on it, and it’s not really going to cover much about 5.1, so says one of the book’s authors (whose honesty I greatly appreciate, by the way - I’d love to see that from the various book publishers).
At the OS level, I’m mostly a Linux guy, and at this point I wouldn’t take a book about a specific version of a specific distribution of Linux if you paid me to take it. Those books are mostly rehashes of the last version of the book put together as marketing objects. I know, because when the “<distro><version> Bible” series first came out, I read them (I think RedHat was the only distro covered initially), and I followed up with later versions of the books, and was always disappointed. Nowadays, I don’t know how you can think that a book about something as fast moving as Fedora Core is going to be useful. Maybe if you’re learning it for the first time something like this can work out, but if you’re looking to exploit new features, you’re really better off just reading the release notes and changelog.
Lesson learned. Books take time to write, to edit, to format, to print, to distribute, and to get on the shelves. Keep that in mind when you see a book about Python 3000 on the shelves within days of a GA release of Python 3000. It’s likely that that book was completely written and in an editor’s hands 3 months ago, and writing for that project began probably 9 months ago… 9 months before Python 3000 was a reality in this example. Some changes can be accounted for during the writing process, but a book that is released 6 months after the release of a new technology is likely to be built on more solid ground (of course, this is only part of assessing the quality of the book - but I suspect it’s often overlooked).
I’d also like to note that this probably wasn’t the case quite so much in the days when, for each language or technology or application or whatever, there were far fewer titles in print on the topic, and an authoritative title was more easily identified. Nowadays, the number of books about Ruby is dizzying to witness on the shelves of your local retailer. I just don’t think there was a market to support that kind of sensationalistic publishing model back when, say, C++ hit the scene. Maybe I’m mistaken there and some more… distinguished folks can enlighten all of us.
Book reviews are lame, unless you know the source. When I say “know”, I don’t mean “have heard of”. I mean “know” in the sense that you have some idea what this reviewer is working with on a day-to-day basis, you know what their leanings are within the technological landscape, and you recognize that person as an authority on some topic at least loosely related to the book being reviewed.
I wouldn’t put much faith in the reviews on Amazon unless it is an established title that’s in its second edition. First edition books that all of a sudden have 20 reviews on Amazon within the first week of the release are probably reviews done by other authors who work for the same publisher, or who have some other motivation for writing the review.
You can learn to identify lame reviews or astroturfing on sight (now that you’re aware of it, it’s not all that hard to recognize), so be on the lookout. If you can, google the reviewer by name. Some of those folks work for the same publisher, and should likely just be discarded. I hate astroturfing, but I guess the publishers feel like they have to do it to compete with everyone else who is doing it and creating buzz around their titles. Sad.
By the way, astroturfing in this context means sending everyone you know (and/or everyone who works for you, or wants to) to do reviews, talk up the book, link to the book’s web site or the author’s blog (where his book is probably displayed prominently), run ads on their blogs, or mention the book on irc, digg, del.icio.us, slashdot, etc. If you get enough people to do this, it gives the impression that there’s a lot of buzz and “grass roots” enthusiasm around the book. Except the “grass” is fake. Hence “astroturfing”. This is the kind of thing that Digg fights against all the time. Mostly unsuccessfully. It goes like this:
…But I digress. Just take reviews with a grain of salt. Same goes for big numbers on Digg and other like services.
The K&R book on C is a timeless tome. The GoF book on Design Patterns is a timeless tome. Stevens on TCP/IP is a timeless tome. C.J. Date’s early Intro to Database Systems is a timeless tome. These books came out a shockingly long time ago considering how often they are referenced and recommended and handed down through generations of technologists. If you need a solid foundation in some technology like this, you should look for books on the topic that have stood the test of time.
However, time isn’t always your friend, and some of these tomes are enormous. That’s why there are books like “Learn Java in 24 Hours”. If you go after this type of book, fine. I have tons of books like this. Just know that going through it does NOT mean you “know” Java. See here for details.
Timeless tomes seem harder to find now that there are stores with 150,000 titles in stock. They get lost in the noise. They’re out there, though. I have a built-in Amazon storefront on LinuxLaboratory.org that I try to keep updated with books I have read and found genuinely useful. I’m a little behind on that, but the books there are a mix of huge tomes (Understanding and Deploying LDAP is enormous), and useful reference or “contextual” books that explain how to use a technology in a particular context (Perl for System Administration, for example, is a good book). The next book I need to add there is “The Art of SQL”, which completely rocks and I highly recommend if you *already know* SQL.
Technology moves at breakneck speed. Some books that are still on the shelves that say “PHP and MySQL” cover versions that aren’t even supported anymore. Oracle 8i books are still around. Some books about Apache only make passing references to Apache 2. It would take some time to sit around flipping through pages to figure out if the version you need information about is covered. If you have some familiarity with the subject, checking the copyright date is a quick reference that can let you know if this book is the one you need. It can also help you avoid the dreaded “written before the technology was GA” problem mentioned above. If you know that FooLang 24 came out in February of 2008, the book in your hands that says “FooLang 24” on the cover should not have a 2007 copyright, ideally.
First: there are “Volumes” and there are “Editions”. A second volume is a completely different book from the first volume. A second *edition* is an updating of the first edition. It’s the same basic material. Or… that’s how it used to be. Nowadays, marketing sometimes dictates that new editions should include whole new sections about new and exciting buzzwords of the day and stuff like that. Have you seen the most recent edition of “Programming Python”? It’s probably the thickest technology book I own, even beating out Understanding and Deploying LDAP Directories. I have no idea if anything in there was put upon the author by O’Reilly - and I’m not making accusations (I’ve worked for O’Reilly and have no reason to believe they’re guilty of this practice) - I’m just saying that the first edition was probably around half the size of the second.
For what it’s worth, I own the latest edition of Programming Python, and am not sorry I bought it. In my editing work for Python Magazine, I came across code that used seemingly every conceivable Python module, and I had to be able to quickly reference and read up on stuff that was in unfamiliar territory. Of course, we have tech editors (who rock, by the way), but I still needed to make sure the text was explaining things in a way that made sense and didn’t contradict the code (or vice versa). That book covers a ton of stuff, and I was glad to have it.
I’ve worked with a good number of publishers, and I have definitely been encouraged to make mention of different things I had no interest in writing about, because it was good for Google rankings, or blog buzz, or tag clouds, or whatever. I have friends in tech publishing circles (and tech authors) who have confirmed that this *does* happen.
Understand that publishers, no matter how granola they look, run businesses, and businesses need to grow and make money, which is an enormous feat to pull off in publishing. Eventually, they hire marketing people, and priorities can conflict, and bad things can happen. This is not a diatribe against the publishers. It’s a guide for the reader and technical bibliophiles.
My $.02.
As usual, the more information the better, so share your thoughts!!
Following a link from the High Scalability blog, I found this really great article about scalability practices, as told by Randy Shoup at eBay. Randy is very good at explaining some of the more technical aspects in more or less plain English, and it even helped me find some wording I was looking for to help me explain the notion (and benefits) of functional partitioning. He also covers ideas that apply directly to your application code, your database architecture (including a little insight into their sharding strategy), and more. Even more about eBay’s architecture can be found here.