I am a big fan of Ruby’s Struct class. It makes it trivial to build container objects, and then later expand them into full blown classes when needed. If you’ve not worked with them before, here’s an example:
Person = Struct.new(:first_name, :last_name, :email)
joe = Person.new
joe.first_name = "Joe"
joe.last_name = "Loop"
joe.email = "joeloop@blah.com"
class Person
def name
first_name + " " + last_name
end
end
joe.name #=> "Joe Loop"
That’s all good fun. But what about when you have a ton of nested attributes and not a whole lot of logic? You could use something like OpenStruct, which allows you to arbitrarily assign values to attributes on an object… but that’s a great way to have a typo or two ruin your day.
Usually, that means putting together a bunch of classes with accessors, and boring methods that chain the parent objects to their children. Here at RubyConf, I got tired of that.
So now, with my little Structure object, you can define containers like this:
Settings = Structure.build do |s|
s.width
s.height
s.data_type
# ....
s.background do |bg|
bg.color
bg.alpha
bg.border_color
bg.border_alpha
bg.file
end
s.plot_area do |pa|
pa.color
pa.alpha
pa.border_color
pa.border_alpha
pa.margins do |m|
m.left
m.top
m.right
m.bottom
end
end
#...
end
This creates a skeleton class with nested accessors, without giving up the notion of having a NoMethodError on a typo. That means we can do something like:
s = Settings.new
s.plot_area.color = "green"
Now, I must admit, the implementation of Structure is a hack. I guess it’s the spirit of the conference that has me playing fast-and-loose with my code… but please leave comments with suggestions if you’d like.
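For the curious, here is a minimal sketch of how a builder like this could work. To be clear, this is not the actual Structure implementation; the Recorder class and its __fields__ method are names made up for illustration. The idea is to record every method call made inside the block, then generate a class with matching accessors, recursing for nested blocks.

```ruby
# Hypothetical sketch only; NOT the real Structure implementation.
class Structure
  # Records every attribute name declared inside a build block.
  # Inheriting from BasicObject keeps names like `display` or `hash`
  # from colliding with methods defined on Object.
  class Recorder < BasicObject
    def initialize
      @fields = {}
    end

    # Made-up accessor name, chosen to be unlikely to clash with an attribute.
    def __fields__
      @fields
    end

    def method_missing(name, *args, &block)
      # A block means "nested container": build a class for it recursively.
      @fields[name] = block ? ::Structure.build(&block) : nil
      nil
    end
  end

  def self.build
    recorder = Recorder.new
    yield recorder
    fields = recorder.__fields__

    Class.new do
      attr_accessor(*fields.keys)

      # Pre-populate nested containers so s.plot_area.color works right away.
      define_method(:initialize) do
        fields.each do |name, nested|
          instance_variable_set("@#{name}", nested.new) if nested
        end
      end
    end
  end
end

Settings = Structure.build do |s|
  s.width
  s.plot_area do |pa|
    pa.color
    pa.margins { |m| m.left }
  end
end

settings = Settings.new
settings.plot_area.color = "green"
settings.plot_area.margins.left = 10
```

Because only the declared names get accessors, a typo such as settings.plot_area.colour = "green" still raises NoMethodError.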
Did you know you can do this with Ruby out of the box?
# A real lambda
λ { puts 'Hello' }.call # prints "Hello"
# Sigma - sum of all elements
∑ [1,2,3] => 6
# Square root
√ 49 => 7.0
How difficult was this to implement? Keep reading!
# Be sure to run with the "-Ku" flag!
module Kernel
alias λ proc
def ∑(*args)
sum = 0
args.flatten.each{ |e| sum += e } # flatten so that ∑ [1,2,3] works as well as ∑ 1,2,3
sum
end
def √(root)
Math.sqrt(root)
end
end
Pretty tricky, eh?
Just remember the “-Ku”. :)
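For anyone trying this on a newer interpreter: since Ruby 2.0 the default source encoding is UTF-8, so the flag should no longer be needed there. Below is a rough equivalent under that assumption. It uses a small wrapper method for λ instead of the alias, and Array#sum (Ruby 2.4+) instead of the hand-rolled loop; both are my substitutions, not part of the original hack.

```ruby
module Kernel
  # Returns the given block as a Proc object.
  def λ(&block)
    block
  end

  # Sum of all arguments; accepts either a list or a single array.
  def ∑(*args)
    args.flatten.sum
  end

  # Square root.
  def √(value)
    Math.sqrt(value)
  end
end

hello = λ { "Hello" }
total = ∑([1, 2, 3])
root  = √(49)
```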
These days, it seems I hardly have the time for doing fun random hacks. So here I’ve started one, and if anyone finds it interesting, please take it from here and let me know how it turns out.
Loosely based off of AIML, kind of, but not really:
class Conversation
def initialize(person)
@person = person
@response_id = 0
end
attr_reader :response_id
def say(msg)
print "#{@person}: "
@response_id = Response[@response_id].respond_to(msg)
end
class Response
def self.responses
@responses ||= {}
end
def self.[](id)
responses[id]
end
def initialize(id)
@id = id
self.class.responses[id] = self
@matchers = []
@messages = []
end
attr_reader :id,:matchers
def when(pattern,id)
@matchers
I’m looking for someone to take over PDF::Writer, color-tools, and Transaction::Simple. I do not have time to maintain these anymore. I should have done this months ago, but pride of ownership and a belief that more free time would be just around the corner got in the way.
You can read more details on my original blog posting at my personal blog.
Anyone interested? Anyone know anyone interested?
I think most Rubyists have picked up a good trick or two from Jim Weirich. Though it’s only a tiny part of his latest article (Using Flexmock to Test Computational Fluid Dynamics Code), I got excited to see his ‘Existence Test’ in his code:
def test_initial_conditions
q = F3DQueue.new
assert_not_nil q
end
Looks pretty simple, eh? You might be quick to say that this doesn’t do anything. However, it is actually a pretty clever practice. This test makes sure the tests themselves are working as expected. I was already in the habit of starting with a failure, usually something like:
def test_doomed
flunk
end
The purpose of the above is simply to make sure your tests are picked up within your suite, and aren’t being overlooked by your Rakefile, autotest, or whatever runner you’re using. But the existence test actually goes a little farther. Because you’re initializing an object, you’re making sure that the files you need to be loading are present, that you can build your objects, *and* that your tests are hooked up.
After you’ve got a couple tests passing, you can remove this sanity check or morph it into a setup(), whatever makes sense.
Many people think this is a little paranoid, and most of the time, it is. Still, all it takes is one bad experience coding under falsely passing tests, and you’ll be converted in no time. :)
Hey folks, I’ve picked a winner for June’s Ruby project spotlight and will have a post out within the next few days about it, but I’d like to remind folks that this is an ongoing project.
What that means is that I’m now accepting July submissions. Every submission we got for June was excellent, and if you were not selected, you can always resubmit for a later month. Here’s a recap of the rules, but see the original post for details.
Please email me if you’ve got a cool project to submit!
From Nick Sutterer, A Computer Science Undergraduate at Albert-Ludwigs University (Freiburg, Germany)
When writing an article about Apotomo I had to make a decision: either introduce it as a simple widget plugin for rails or - as the name Apotomo (“all power to the model”) implies - end up in monologues about model-driven component-oriented enterprise concepts. Today I will simply introduce Apotomo as a widget library for rails.
Apotomo is a widget library for rails. The concept is familiar to everyone who’s already worked with a GUI library: Take a window, draw some frames in it and throw in some buttons. Attach some logic to the buttons, hook a method to the frame and you’re done.
In Apotomo (that’s a widget library), the central place - where all this drawing and attaching happens - is the modeling tree. For your convenience I prepared a meaningful model which is the foundation of an imaginary drinking application: people can track their drinks in a database, can list what they drank and can view their current blood alcohol value. Useful? Not very.
def drinking_model_tree
top_page = page("Top Page!", 'top_page')
track_page = section("Tracking Page", 'tracking_page')
tracking_notebook = notebook('tracking_tabs')
track_tab = tab("Track a Drink", 'track_tab')
list_tab = tab("List tracked", 'list_tab')
level_page = section("Permille Page", 'permille_page')
top_page
top_page [:track_drink])
tracking_notebook
return top_page
end
And a controller method:
def apotomo
act_as_widget('top_page', drinking_model_tree)
end
It may look a bit weird at first, but it is very simple: I nest widget objects to model my application. For example, I create a notebook widget which has two tab children, that again have children.
Using the #act_as_widget method in a Controller action I can command Apotomo to render my top page.
Have a look at the rendered states of the application! Can you see the cool tabs? This is all done by Apotomo since it provides some handy and ready-to-use widgets. We will now look at some of those widgets.

Apotomo (did I mention that this is a widget library for rails?) is based on another rails plugin called “rails cells”. A typical cell looks like a controller, with methods and respective views, but is not bound to a specific controller and thus can be called throughout the application.
Every widget in Apotomo is a derived cell and can be fully adjusted to the developer’s needs - behaviour as well as the templates used for rendering can be overridden. This is an important principle in Apotomo.
Pages, sections, notebooks, and tabs are all structural widgets used for grouping parts of the application. They have predefined (but customizable!) views and behaviour.
Notebooks and pages are very similar: they render themselves and their current sub page/tab. Apotomo’s state and addressing system provides information about which pages or tabs are presently focused by the user.
The crucial part of every application - the business logic - is packed into logic cells. Looking at our example app, the line
list_tab
attaches the #list_drinks method of the drinker cell class to the "Tracking Page" widget. This cell method could look like
def list_drinks
  id = param(:user_id)
  @drinks = drinks_for_user_id(id)
end
This really looks common to us - it’s somehow identical to a controller method.
But wait, what is this param() call?
It’s more than a widget library!
In conventional Rails parameter values are accessed using the #params method. Apotomo (Hey! Did you know that this is a widg… ok, I’ll be quiet) provides a more progressive approach to parameter accessing. Instead of looking directly in the global request parameter hash, a parameter request through the param method travels up the widget hierarchy asking every ascendent widget if it knows the value.
This opens the way for a completely new parameter management and some innovative concepts which are already implemented in so called domain widgets in Apotomo. I’ll discuss this in the next article, promised!
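To make the lookup idea concrete, here is a small sketch in plain Ruby. It is not Apotomo's code: the Widget class, its constructor, and the << method are invented for illustration. Each widget answers from its own values if it can, and otherwise passes the question to its parent.

```ruby
# Illustrative only; this is NOT Apotomo's API.
class Widget
  attr_reader :name, :parent

  def initialize(name, params = {})
    @name = name
    @params = params
    @parent = nil
    @children = []
  end

  # Nest a child widget below this one.
  def <<(child)
    child.parent = self
    @children << child
    self
  end

  # Answer locally if possible, otherwise ask the parent widget.
  def param(key)
    return @params[key] if @params.key?(key)
    parent ? parent.param(key) : nil
  end

  protected

  attr_writer :parent
end

top_page = Widget.new("top_page", user_id: 42)
notebook = Widget.new("tracking_tabs")
list_tab = Widget.new("list_tab", sort: "date")

top_page << notebook
notebook << list_tab

user_id = list_tab.param(:user_id) # found two levels up, on top_page
sort    = list_tab.param(:sort)    # found on the tab itself
```

A value nobody in the chain knows simply comes back as nil in this sketch; a real implementation would presumably fall back to the request parameters at the top.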
Another feature is the cell addressing in Apotomo. Normally in Rails you have to know the respective controller/method combination to address a specific function, e.g. when linking to another page, or in a form. We rather address widgets in Apotomo.
Let’s take the line
top_page
that attaches the cell method pages_menu to the Top Page to render a clickable navigation menu. The view template for this method might look like
<ul>
  <li><%= link_to_cell("Tracking Page", 'tracking_page') %></li>
  <li><%= link_to_cell("What's my level?", 'permille_page') %></li>
</ul>
That’s cool - we refer to cells (or widgets) - and this brings a great advantage: the addressing mechanism travels - similar to the parameter thing - up the hierarchy and to the target cell, asking every cell on its way if it wants to add something to the address. This is extremely helpful for saving state-relevant information in URLs. We’ll stick to this in the next article as well.
Apotomo integration
Basically Apotomo could control a complete application. However people will be sceptical with this new concept. The #act_as_widget method provides a way to render only parts of the modeling tree within views or controllers of an existing application. For example I could integrate only the tracking-notebook widget and its tabs in some view of my app - leaving it open to the developer how much “Apotomo” he wants.
And now?
This was a very brief overview about Apotomo, it has more features that will (hopefully) be discussed in another article. I’d love to see some discussion going on at http://nick.smt.de/trac/nick/wiki/ApotomoDiscussion . If you have any issues with Apotomo feel free to mail me at nick@tesbo.com.
Props go out to Google for sponsoring Apotomo during Summer of Code 2007 and my mentor Patrick Hurley for his help and support!
More about cells and Apotomo can be found at http://apotomo.rubyforge.org.
The example app can be downloaded as standalone rails environment at http://rubyforge.org/projects/apotomo, but please notice that the API may change in the near future.
To answer a question on RubyTalk the other day, I had to reference Mauricio Fernandez’s nicely compiled list of Changes in Ruby 1.9. While I was there I took another walk through the whole thing.
There are of course some features I *don’t* like.
a = ->(b,c){ b + c }
a.call(1,2) # => 3
But there are quite a few that I do, and here I’ve listed ten I think will totally rock. I use Mauricio’s examples, so all credit goes to him. Also, this article is from February, so if you find any features below that have changed, shout and I’ll update.
This means all your enumerable objects can return Enumerators without a require, and also avoids some use of enum_for
a = 4.times
a = a.each
a.inject{|s,x| s+x} # => 6
I had to cross my eyes a couple times to understand what was going on there, but I came to the conclusion that ultimately, that is going to rock.
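If you are crossing your eyes too, it may help to poke at the pieces one at a time. The snippet below should run on 1.9 or anything later; the variable names are mine.

```ruby
enum = 4.times                    # no block given, so we get an Enumerator back

kind   = enum.class               # Enumerator
values = enum.to_a                # the values the block would have received
sum    = enum.inject { |s, x| s + x }

# Because it is Enumerable, everything else comes along for free:
pairs   = enum.each_slice(2).to_a
scaled  = enum.with_index(1).map { |x, i| x * i }
```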
I think most people will at some point be looking to do a map_with_index, and this brings you quite close:
[1,2,3,4,5,6].map.with_index {|x,i|[2,5].include?(i) ? x : x*2} #=> [2, 4, 3, 8, 10, 6]
[1,2,3,4].to_s # => "[1, 2, 3, 4]"
{1=>2, 3=>4}.to_s # => "{1=>2, 3=>4}"
IIRC, puts will still do its magic when used on Arrays.
class A; def foo; end end
a = A.new
a.method(:foo).receiver # => #<A:0xa7c9f6d8>
class A; def foo; end end
a = A.new
a.method(:foo).owner # => A
I’m sure we’ll find something evil to do with that. :)
Process.daemon() => fixnum
Process.daemon(nochdir=nil,noclose=nil) => fixnum
By default, this will detach the process and change the working dir to the root. It’ll also redirect all output to /dev/null. Sounds like this will be a *nix only feature but having built in support for daemonizing scripts should be great.
define_method(:foo){|&b| b.call(bar)}
Hooray, a simultaneous win for higher order procedures and metaprogramming goodness!
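Here is a slightly fuller version of that one-liner, in case the bare form is hard to picture. The Wrapper class and its bar method are invented for the example.

```ruby
class Wrapper
  def bar
    21
  end

  # The block given to define_method can itself accept a block (&b),
  # so the generated method behaves like an ordinary block-taking method.
  define_method(:foo) { |&b| b.call(bar) }
end

result = Wrapper.new.foo { |x| x * 2 }
```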
a = 1
10.times{|a| } # !> shadowing outer local variable - a
a # => 1
Compared to the nasty behaviour on 1.8:
a = 1
10.times { |a| }
a # => 9
This is going to be great for making fake named arguments look even prettier
{ a: 1, b: 2 }
is now equivalent to:
{ :a => 1, :b => 2 }
which means you could easily do something like:
foo(a: 1, b: 2)
Slightly weakening the case against them…
A lot of times, you want a minimalist object. There have been plenty of hacks to show how to construct one in Ruby 1.8, but we’ll get one for free in 1.9
BasicObject.instance_methods # => ["__send__", "funcall", "__id__", "==", "send", "respond_to?", "equal?", "object_id"]
Object.ancestors # => [Object, Kernel, BasicObject]
Enumerable#group_by looks like it rocks. Symbol#to_proc wasn’t mentioned here but it’s handy (lets you do something.map(&:some_attr)). Also, the best damn Regex engine ever, Oniguruma, is built into Ruby 1.9
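A quick taste of the first two, using a made-up word list:

```ruby
words = %w[apple avocado banana blueberry cherry]

# group_by buckets the elements by whatever the block returns.
by_letter = words.group_by { |w| w[0] }

# Symbol#to_proc: &:length is shorthand for { |w| w.length }.
lengths = words.map(&:length)
```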
I wonder if the core team is still on schedule for a Christmas release…. Should be interesting to see how people make use of all this new stuff.
Consider this fact: Multi-core CPUs are not only the future, they’re the only way CPUs can continue to grow at their current pace. It’s also a hotly debated subject in the software world. Multi-threaded programming is different and not seen as often as procedural programming, and therefore it’s not yet as well understood. So the question is, how can programming languages (and Ruby in particular) make it easier to harness these systems?
As Ruby struggles to graduate from its current implementation into something more powerful, we’ve already seen several projects attempt to update Ruby to help developers cope. Those who’ve been working with Ruby for a while may remember YARV, which promises to provide more threading support. JRuby offers all the power of Java’s threads to Ruby, if it can harness it. And Evan Phoenix’s small but rapidly growing project Rubinius is attempting to be the next big contender.
No matter what implementation becomes the next de-facto Ruby platform, one thing is clear: People are interested in taking advantage of their newer, more powerful multi-core systems (as the surge of interest in Erlang at recent RailsConfs and RubyConfs has shown). As Ruby becomes increasingly part of solutions that deal in high volumes of data processing, this demand can only increase.
That’s why it’s so very surprising to see David Heinemeier Hansson dismiss the whole notion out of hand regarding Rails. His argument seems to be that Rails already scales to multiple cores in the same way it scales to multiple machines, via UNIX process distribution. After all, isn’t this the very crux of “Share Nothing?”
But the math says something different, because for a single server “Share Nothing” doesn’t really exist. Even if the processes don’t share state, they share the same pool of resources (e.g., system memory, disk and system bus bandwidth). Each one can be a serious issue. Consider a deployed Rails application, where each Rails process (running mongrel) weighs in at about 200 real megabytes of RAM. If we wanted to take advantage of 8 cores, we’d be using a bare minimum of 1.6gb of memory–not to mention an even more dire situation with system bus bandwidth. With a dual-processor setup, you could easily see a machine with 16gb of RAM being resource starved.
David talks about welcoming a 64-core chip, but the truth is that Rails’s process-level concurrency can already barely accommodate today’s top of the line. Within 6 months we will see machines with 32 and possibly 64 cores in a dual-processor configuration as a top of the line, and today’s best being commonplace. What scales for many machines doesn’t scale for one.
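The back-of-the-envelope math is easy to reproduce. The sketch below assumes the 200 MB per process figure quoted above and one process per core, and counts a gigabyte as 1000 MB to match the 1.6gb number; the method name is mine.

```ruby
# Memory needed to run one Rails process per core, in (decimal) gigabytes.
def memory_gb(cores, mb_per_process = 200)
  cores * mb_per_process / 1000.0
end

[8, 16, 32, 64].each do |cores|
  puts "#{cores} cores: at least #{memory_gb(cores)} GB for Rails processes alone"
end
```

That is memory only; it says nothing about the bus and disk contention, which is the harder part of the problem.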
It isn’t surprising that many Ruby libraries prefer to scale at the process level. The argument for process-level concurrency is a good one: It’s dead simple. We’re already doing it, and it’s worked fairly well up until now. It’s also simple because some of the many Ruby libraries that Rails uses don’t play nice with threads. Changing that requires a lot of work, and it’s work that wouldn’t immediately yield up any new features. It’s a lot of work for a status quo, which can be hard to invest time in.
The most important thing to remember when thinking about the future of Ruby is that just because we don’t have convenient methods for threading Ruby today, it doesn’t mean we shouldn’t explore all the possible avenues. YARV, JRuby, or Rubinius may come along any day and blow us away with completely new ways to think about working with concurrency. If Rails is ready for this, it can continue to be on the forefront of web toolkits. If it is not, it will rapidly fall behind, because ignoring the problem at this stage is ignoring problems that well-funded startups have already encountered.
Talking to some major Rails developers for 5 minutes, ideas like simultaneous request-processing (something Ezra Zygmuntowicz’s Merb already does), parallelized partial rendering, and really crazy out-there future-talking ideas like stateful HTTP, or trivial implementations of Comet-like technology were mentioned immediately. Imagine what could be accomplished with a real implementation to play with?
Keep an eye on these new Ruby implementations. The first people to really innovate technically with them will have an enormous advantage over their competitors.
I was not mowing my lawn like Gregory, but I was reading this blog when I got an idea for a Rails version of the Ruby Project Spotlight series Gregory is spinning up.
The idea is that I’ll post an entry once or twice a month about a new and active Rails project that’s looking for more exposure. The project can be a gem, a plugin, an open source Rails application: basically anything that’s related to Rails. The process for getting a project mentioned is simple: send me an e-mail about your project. It should follow these simple rules:
The rules are so specific because the coverage is plentiful of things that don’t fit within them, but a lot of the newer and (usually) more interesting projects aren’t really gaining the exposure they need to get a flourishing community to spring up around them.
So, if you’re a developer on one of these projects, e-mail me your submission by June 30th. I’ll judiciously pick through them and make my rather subjective choice for July soon after. Hopefully this can bring some really cool Rails projects some exposure, and expose our readers to tools and projects they can make use of.
As I was mowing the lawn, I had an idea for a series I thought might be fun.
I’d like to put out an entry once a month about a new or highly active Ruby project that’s looking to gain some extra exposure. People can send me proposals for their projects, and I’ll pick one each month to write about here. This way, if you’re way behind on your mailing list reading, you’ll be able to easily find at least one new project announcement each month.
Below are the semi-arbitrary rules for submission:
The reason these rules are fairly specific is that I’m hoping that this series will be a sort of grass-roots effort to have Ruby recognized as a useful language standing on its own two legs. If someone else wanted to start up a similar series about Rails projects on this blog or elsewhere, that’d be great.
I’m also stipulating that the project needs to be relatively new and fresh, because there is no shortage of coverage of the more popular Ruby projects out there. This is a chance to give new folks and new projects some time to shine.
Please email me your submissions by June 30th. I’ll get in touch with my favorite pick some time in the first week of July, and have a post out then. Hopefully this will be a fun little series, and a useful service to those having trouble keeping up on the latest Ruby software.
Last year, one of the most difficult things about keeping track of the progress of the Ruby projects in Google’s Summer of Code was finding where the students / mentors were talking about their projects. Since several of the bloggers on O’Reilly Ruby are directly involved in the Summer of Code in one way or another, we decided that we’d try to make things a little easier for the community for GSoC 2007.
We’ve sent out an open invite to all students and mentors who are assigned to RubyCentral for the summer. Rather than just relaying second hand news, we’ve encouraged those involved to submit blog posts to us, and we’ll post them all here using the special GSoC account. If you missed the original announcement, please contact Gregory Brown, as he’ll be coordinating the effort.
Better than half of the students involved this summer have expressed interest in participating with us. We’re busy collecting bios, and will soon make a post that introduces the folks who will be blogging with us this summer, and a little more detail about their projects.
One of the students involved has plans to have an announcement about their project ready by the end of the month, so keep an eye out for that!
At RailsConf 2007 DHH mentioned that Rails 2.0 would support query caching on the client side in order to speed up AR. I immediately thought to myself, “Huh? Why do it on the client side when the database server will handle that?”.
The answer is that ActiveRecord (AR) doesn’t support bind parameters. In fact, AR is downright deceptive in this regard, because it sure *looks* like it’s using bind parameters. Consider this simple example:
orders = Order.find(
:all,
:conditions => ['name = ?', 'Daniel Berger']
)
In DBI, that query would look something like this:
...
sth = dbh.prepare('select * from orders where name = ?')
sth.execute('Daniel Berger')
orders = sth.fetch
...
Anyone coming from a DBI background would look at the AR version and think that bind parameters were being used. But, in fact, AR is doing variable interpolation behind the scenes. This is not optimal for databases that support bind parameters. But first, a bit about Oracle and bind parameters. This mostly applies to PostgreSQL as well 1.
When you send a query to an Oracle DB server a few things happen. First, the query is parsed to ensure that it’s formed properly. Then, Oracle determines the execution plan (aka ‘explain plan’). The execution plan is, in short, the strategy that the DB server forms in order to fetch the rows for the query, e.g. “do a full table scan on table X and an index scan on table Y”. Finally, it fetches the data.2 Parsing the SQL and creating the execution plan is the most expensive portion of the operation.
Back to the part about why not supporting bind parameters is a bad idea. First, and foremost, is performance. With variable interpolation SQL is re-parsed and the execution plan is re-formed every time the query is run. By using bind parameters the execution plan is generated only *once* (and is stored in a query cache) since the strategy for fetching the rows doesn’t change, merely the particular column value that you happen to be looking for.
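If the caching argument feels abstract, here is a toy model of it. This is emphatically not how Oracle works internally; ToyServer is a made-up class that simply counts how many distinct SQL texts it has had to "plan", keyed on the statement text.

```ruby
# Toy model only: a "server" that builds one plan per distinct SQL string.
class ToyServer
  attr_reader :plans_built

  def initialize
    @plan_cache = {}
    @plans_built = 0
  end

  # Bind values are accepted but never become part of the cache key.
  def execute(sql, *binds)
    @plan_cache[sql] ||= begin
      @plans_built += 1
      "plan-#{@plans_built}"
    end
  end
end

numbers = %w[5550001 5550002 5550003]

# Variable interpolation: every number produces a brand new statement text.
interpolating = ToyServer.new
numbers.each do |tn|
  interpolating.execute("select * from orders where telephone_number = '#{tn}'")
end

# Bind parameters: the statement text never changes, only the bound value.
binding_server = ToyServer.new
numbers.each do |tn|
  binding_server.execute("select * from orders where telephone_number = ?", tn)
end

puts "interpolated: #{interpolating.plans_built} plans"
puts "bound:        #{binding_server.plans_built} plan"
```

With three numbers the difference is three plans against one; with 10,000 numbers the gap grows accordingly.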
To prove my point, here’s some sample code that’s very similar to what I use in a production report. It grabs a list of telephone numbers from a plain text file and gets the necessary information based on that number. The first example uses variable interpolation, the second uses bind parameters. Benchmarks then follow. Column and table names changed to protect the innocent:
# Variable Interpolation
IO.foreach('numbers.txt'){ |tn|
tn = tn.chomp # foreach yields each line with its trailing newline
sth = dbh.prepare("
select so.order_number, so.order_id, so.service_number
from some_order so, network nh, service se
where se.service_id = so.service_id
and nh.child_link = se.child_link(+)
and nh.telephone_number = '#{tn}'
")
sth.execute
info = sth.fetch
}
# Bind parameters
sth = dbh.prepare("
select so.order_number, so.order_id, so.service_number
from some_order so, network nh, service se
where se.service_id = so.service_id
and nh.child_link = se.child_link(+)
and nh.telephone_number = ?
")
IO.foreach('numbers.txt'){ |tn|
sth.execute(tn.chomp) # strip the trailing newline before binding
info = sth.fetch
}
When I ran this against our reports database against 10,000 telephone numbers, the first example took approximately 3:50 in repeated runs, while the second example took approximately 1:45 in repeated runs.3 This example is typical and, in fact, most of my reports consist of nested cursors and much more complex SQL. In those cases the performance difference is even more significant.
In addition, bind parameters are a better defense against SQL injection attacks than the quoting AR relies on to protect you. They also cut down on the CPU cycles used by the DB server, which the DBAs will appreciate. Lastly, bind parameters are *absolutely necessary* for getting at binary data. This point is crucial, because it’s a problem that caching queries on the client side won’t solve.
What can we do about it? I believe Izumi 4 has the best idea - refactor the AR design to support actual bind parameters for those vendors that support them, and fake the rest with quoting to make the interface seamless. When necessary, disable automatic binding for those situations where they aren’t ideal. 5 In the case of MySQL, that would probably be often, since bind parameters actually seem to slow down many queries. Furthermore, I’ve been told that the MySQL query cache is cleared every time an INSERT or UPDATE occurs - not very useful.
His suggestion has been submitted, but the adapters still need work. Please take a look.
One final Oracle-specific note. Oracle has the concept of “hints”, little bits of meta information you can embed in the SQL directly in order to alter the way Oracle generates its explain plan and/or to change the behavior of the results, e.g. PARALLEL to take advantage of multiple CPU architectures and FIRST_ROWS to optimize for immediate results.6 I’d like to see the Oracle adapter support this, and it would be easier to integrate if we adopt Izumi’s architecture.
See you next Wednesday.7
1 I’ll talk about MySQL later. I am not familiar enough with DB2 or SQLServer to comment, but my hunch is that they support bind parameters as well.
2 For a more in depth explanation of what’s going on behind the scenes, please see Luca Mearelli’s excellent article at http://www.oracle.com/technology/pub/articles/mearelli-optimizing-oracle-rails.html
3 For those wondering, I actually used the NOCACHE hint in the SQL to make sure the DB server wasn’t using a pre-existing cache.
4 http://izumi.plan99.net/blog/ - scroll down a bit for the SVG. Don’t use IE.
5 Such as histograms and ‘like’ queries.
6 Tools like TOAD use this so they can display a small result set immediately before all the rows are actually fetched. This would be useful for pagination.
7 On a final note I would like to thank Tim Bunce (of Perl fame) for his DBI book that taught me some basic knowledge of bind parameters back when I was a fledgling Perl programmer.
Just because Everyone Is Here In The Future, doesn’t mean you should be too!
The culprit, in camping’s reloader:
# The timestamp of the most recently modified app dependency.
def mtime
  ((@requires || []) + [@script]).map do |fname|
    fname = fname.gsub(/^#{Regexp::quote File.dirname(@script)}\//, '')
    begin
      File.mtime(File.join(File.dirname(@script), fname))
    rescue Errno::ENOENT
      remove_app
      @mtime
    end
  end.max
end
If your most recent modified time is more recent than your current system time, your reloader will break until you go Back To The Future.
I figure this is probably a rare case, and not really a bug, but if you’re playing around with system time dependent apps (I am), this might bite you.
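If you want to see the effect without touching your system clock, you can fake a file from the future instead. The stale? check below is only my guess at the general shape of a reloader's rule (reload when a file is newer than the last load time); it is not Camping's actual logic.

```ruby
require "tempfile"

# Assumed rule, NOT Camping's code: a file needs reloading when its
# mtime is newer than the moment we last loaded the app.
def stale?(path, loaded_at)
  File.mtime(path) > loaded_at
end

file = Tempfile.new("app")
future = Time.now + 3600               # pretend the file was saved an hour from now
File.utime(future, future, file.path)  # sets atime and mtime

loaded_at = Time.now                   # "we just reloaded"
still_stale = stale?(file.path, loaded_at)
file.close!                            # close and delete the temp file

puts "stale right after loading? #{still_stale}"
```

Under that assumed rule, the file keeps looking modified no matter how many times you reload, until the clock catches up with its timestamp.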
I’m excited to be able to finally get around to another post in this Digging Deep series, in which I hope to delve head first into some Ruby esoterica. This time around, I have an interview with Ara T. Howard about some of the hackery he does with packaging rq.
Hope you enjoy it! Questions follow.
>> I’d like to ask you some questions about the way you package Ruby Queue,
>> but before I do that, can you give me a short description of the project?
DESCRIPTION
ruby queue (rq) is a zero-admin zero-configuration tool used to create instant
unix clusters. rq requires only a central nfs file system in order to manage a
simple sqlite database as a distributed priority work queue. this simple
design allows researchers with minimal unix experience to install and
configure, in only a few minutes and without root privileges, a robust unix
cluster capable of distributing processes to many nodes - bringing dozens of
powerful cpus to their knees with a single blow. clearly this software should
be kept out of the hands of free radicals, seti enthusiasts, and one mr. j
safran.
the central concept of rq is that n nodes work in isolation to pull jobs
from a centrally mounted nfs priority work queue in a synchronized fashion.
the nodes have absolutely no knowledge of each other and all communication
is done via the queue meaning that, so long as the queue is available via
nfs and a single node is running jobs from it, the system will continue to
process jobs. there is no centralized process whatsoever - all nodes work
to take jobs from the queue and run them as fast as possible. this creates
a system which load balances automatically and is robust in face of node
failures.
although the rq system is simple in its design it features powerful
functionality such as priority management, predicate and sql query, compact
streaming command-line processing, programmable api, hot-backup, and
input/capture of the stdin/stdout/stderr io streams of remote jobs. to date
rq has had no reported runtime failures and is in operation at dozens of
research centers around the world.
URIS:
the short version is this:
rq is a command line application which runs in several modes: in daemon mode
it pulls jobs from an nfs mounted priority queue and runs them. in other
modes it manipulates said queue by submitting jobs to it, queries existing
jobs, etc. rq is designed such that both the queue and code live on an nfs
mount. if sweet graphics help you understand things then this will be
illuminating
  +-----------+                        +-----------+
  |           |                        |           |
  |  compute  |                        |  compute  |
  |   node    |                        |   node    |
  |           |                        |           |
  +-----------+                        +-----------+
          `.                              .'
            `.                          .'
          +--------------------------------+
          |           NFS SERVER           |
          |                                |
          |   /nfs/exported/priority/q     |
          |                                |
          |   /nfs/exported/bin/rq         |
          |                                |
          |   /nfs/exported/bin/ruby       |
          |   /nfs/exported/lib/ruby       |
          +--------------------------------+
            .'                          `.
          .'                              `.
  +-----------+                        +-----------+
  |           |                        |           |
  |  compute  |                        |  compute  |
  |   node    |                        |   node    |
  |           |                        |           |
  +-----------+                        +-----------+
the idea is that you dump some code on an nfs box and start some daemons on
the clients and you’re done.
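The pull model described above (independent workers taking the highest priority job from one shared queue under a lock) can be sketched in a few lines of Ruby. This is only an illustration under loud assumptions: the `TinyQueue` class, its JSON file format and its method names are hypothetical, and rq itself keeps its queue in sqlite rather than in a JSON file. Whether `flock` is honoured across NFS clients also depends on how the mount is configured.

```ruby
require "json"
require "tmpdir"

# Hypothetical file-backed priority queue, for illustration only.
# rq itself stores its queue in an sqlite database.
class TinyQueue
  def initialize(path)
    @path = path
    File.write(@path, "[]") unless File.exist?(@path)
  end

  # Append a pending job with the given priority.
  def submit(command, priority = 0)
    transaction do |jobs|
      jobs << { "command" => command, "priority" => priority, "state" => "pending" }
    end
  end

  # Mark the highest priority pending job as running and return it,
  # or return nil when nothing is pending.
  def take
    transaction do |jobs|
      job = jobs.select { |j| j["state"] == "pending" }.max_by { |j| j["priority"] }
      job["state"] = "running" if job
      job
    end
  end

  private

  # Read, modify and rewrite the queue file while holding an exclusive lock.
  # The lock is released when the file is closed at the end of the block.
  def transaction
    File.open(@path, File::RDWR) do |f|
      f.flock(File::LOCK_EX)
      jobs = JSON.parse(f.read)
      result = yield jobs
      f.rewind
      f.truncate(0)
      f.write(JSON.generate(jobs))
      f.flush
      result
    end
  end
end

queue = TinyQueue.new(File.join(Dir.mktmpdir, "queue.json"))
queue.submit("echo low", 1)
queue.submit("echo high", 9)
while (job = queue.take)
  puts job["command"]   # "echo high" first, then "echo low"
end
```

Each worker only ever talks to the queue file, never to another worker, which is why adding or losing a node needs no coordination.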
>> You had mentioned at the MountainWest RubyConf that instead of relying on
>> the sqlite3 gems which build native extensions, you actually manually
>> package and build sqlite3 within rq’s own gem install process, can you
>> explain what this gains you?
actually, rq uses sqlite v2. word on the street is that sqlite v3 is actually
slower than sqlite v2 in some cases and, because the ruby apis are better
tested with v2, i chose to go that route. also, when i began developing rq
sqlite v3 had only just come out and i was looking to be robust, not bleeding
edge.
sqlite, and the ruby bindings, are a pretty simple pair of things to install
if you’ve ever compiled anything. however, things can get complicated since
many linux distros may have one version installed and users have sometimes
installed another, say in /usr/local or whatever, and trying to build a v2
ruby binding against a v3 lib, or vice versa, is a nightmare of course. then
gems take this and abstract it one step further - requiring a few
incantations to get the right compiler flags through to the underlying sqlite
setup.rb script, which sometimes fails without error, etc. in short the
sqlite + ruby sqlite install is completely normal with respect to installing
some related open source packages, which is to say not trivial unless you are
the kind of guy who knows what ’strings libsqlite.so|grep version’ does.
so the short answer is that i don’t want people to have to deal with
installing the right version of sqlite and making sure rq finds it.
so, added to all this, is the fact that the target audience for rq is
non-technical users who happen to need to set up a small linux cluster - today.
the original rq installer actually dumped everything it needs onto an nfs
mount, including ruby. the hope was that the user may not have even heard of
ruby but could still get a linux cluster up in an hour or so. as it turned
out that worked well - many users downloaded the rq tar ball, unpacked, ran a
/bin/sh script and voila - live linux cluster.
the problem is with the ruby community, who don’t want a massive installer
that clobbers your ruby and sqlite installation when run (damn them!). they
want a gem, of course.
>> Can you share some of the details of how you actually do this?
it’s pretty straightforward
1) the rq dist includes the src for sqlite and sqlite-ruby
2) during install, the installer first builds both of them with this logic
require 'rbconfig'
rqlib = './lib' # rq's own libdir!
c = RbConfig::CONFIG # RbConfig is the current name; the old Config alias is gone from modern rubies
arch = c['sitearch'] || c['arch'] # i686-linux for example
prefix = File.join rqlib, arch
bindir = File.join prefix, 'bin'
libdir = File.join prefix, 'lib'
# ....
system "./configure --prefix=#{ prefix } && make && make install"
so the result is that rq has both sqlite and the sqlite-ruby bindings
dumped into an arch specific directory that only it knows about. the
directory is the ‘lib’ dir, the very same directory that
gems/install.rb/setup.rb will install during the normal installation
procedure. now it’s simply a matter of having rq.rb do the following
dirname, basename = File.split(File.expand_path(__FILE__))
require 'rbconfig'
c = RbConfig::CONFIG # RbConfig is the current name; the old Config alias is gone from modern rubies
arch = c['sitearch'] || c['arch']
prefix = File.join dirname, arch
bindir = File.join prefix, 'bin'
libdir = File.join prefix, 'lib'
ENV['LD_LIBRARY_PATH'] = [ libdir, ENV['LD_LIBRARY_PATH'] ].join ':'
ENV['PATH'] = [ bindir, ENV['PATH'] ].join ':'
so, at runtime, rq will configure itself such that both its private
sqlite binary and sqlite libraries are first in its path. now, when rq.rb does
require File.join(dirname, 'sqlite') # require rq's sqlite binding
or
system 'sqlite ...'
it can be sure that the sqlite.so or sqlite binary that’s picked up is
its very own - regardless of what the user may have installed in various
locations around the system.
in summary rq simply sets up its own private cache of binary prerequisites
and arranges for them to be found at run time.
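The runtime half of this trick can be restated as a small helper. This is a sketch, not rq's code: `private_env` is a hypothetical name, and it assumes the arch-specific `bin` and `lib` directories were created at install time as described above. It also uses `RbConfig` and `compact`, so an unset `LD_LIBRARY_PATH` does not leave a stray separator (an empty path entry is treated as the current directory by many loaders).

```ruby
require "rbconfig"

# Hypothetical helper (not part of rq): compute the environment values that
# put a program's privately bundled binaries and libraries first.
def private_env(root, env = ENV)
  c      = RbConfig::CONFIG
  arch   = c["sitearch"] || c["arch"]   # e.g. "i686-linux"
  prefix = File.join(root, arch)
  bindir = File.join(prefix, "bin")
  libdir = File.join(prefix, "lib")
  sep    = File::PATH_SEPARATOR         # ":" on unix

  {
    # compact drops an unset (nil) variable before joining
    "PATH"            => [bindir, env["PATH"]].compact.join(sep),
    "LD_LIBRARY_PATH" => [libdir, env["LD_LIBRARY_PATH"]].compact.join(sep),
  }
end

# Inspect what would be exported for an install rooted at /opt/rq/lib.
private_env("/opt/rq/lib", { "PATH" => "/usr/bin" }).each do |name, value|
  puts "#{name}=#{value}"
end
```

Assigning the returned values into `ENV` before any `require` or `system` call gives the behaviour described in the answer above.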
>> What about Windows? In the case of rq, it’s a non-issue obviously,
>> but have you tried using this technique with projects that need to run
>> on Windows? If so, does it involve some cygwin / mingw mess?
you know - i have compiled both the gsl and narray for windows using the mingw
approach. it’s a pain. you have to compile using mingw but install manually
since rbconfig is F.U.B.A.R for the one click installer.
it’s my personal belief that, despite the benefit it has brought to the
community, the ruby one-click installer is badly broken in that it results in
a ruby that’s lacking one of its most important features: the ability to
bootstrap itself with new binary features. that is to say that a ruby without
a working mkmf.rb and rbconfig.rb is badly broken in my eyes. i’ve spoken to
austin many times about this and we have different opinions on how this should
be solved: he’s for the total ms compatibility approach and i’m for a windows
dist which includes the msys compiler toolchain. either is valid and both
would result in a ruby which could, either after a toolchain install or
without it, bootstrap itself into the wonderful world of ruby extensions.
i’m sure i’ve started a holy war here so i best stop while i’m ahead…
>> Is there anything that’s missing in RubyGems that would help solve this
>> problem?
rubygems is not an application installer - it’s a library installer. i think
rubygems should incorporate all the wonderful work erik veenstra has done such
that a user can install a ruby application that bundles ruby itself - one that
will continue to run as rubys get upgraded and gem libraries come and go. in
short we need a robust way to install applications on systems where people
may never have even heard of ruby. rubygems may not be the ultimate answer,
but it would be interesting to explore integrating rubyscript2exe application
creation from within the gemspec framework.
i’ll point out that this is no oversight of rubygems - it was not designed to
be an application installer but, now that ruby is mainstream we, as
developers, need simple tools to distribute ruby applications to the
uninitiated.
>> Got any other neat packaging tricks?
there is one - but i cannot write about it in public (seriously!).
I spend most of my time building relatively large applications with Ruby, and this makes me forget how easy the quick and dirty hacks are. In less than the time it’d take me to google the right UNIX tool for escaping HTML, here is my tiny script that I use for things like blog entries and mass spam emails.
#!/usr/bin/env ruby
require "cgi"
puts CGI.escapeHTML(ARGF.read)
Mmh,… sweet simplicity. If you’ve not worked with the CGI lib before, there are probably other goodies in there so have a look at the API docs.
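To see what the script does to its input, here is a small sketch that escapes a string instead of reading from ARGF. The sample markup is made up for the example; `CGI.escapeHTML` and `CGI.unescapeHTML` are the standard library calls.

```ruby
require "cgi"

snippet = %q{<a href="/faq">Tom & Jerry</a>}
escaped = CGI.escapeHTML(snippet)

puts escaped
# prints: &lt;a href=&quot;/faq&quot;&gt;Tom &amp; Jerry&lt;/a&gt;

# unescapeHTML reverses the transformation
puts CGI.unescapeHTML(escaped) == snippet
# prints: true
```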
UPDATE: Sam Aaron does a good job of explaining what this script actually does in the comments
For those who saw my other post on what the RubyForge forum is about, I apologize for the redundancy here. However, I feel like perhaps if some folks pass this reminder along, it’ll get the message out. I’ve seen a huge resurgence of off topic posts, and I’m actually feeling bad that people end up waiting for replies only to get the same ‘we don’t deal with those questions here’ reply.
The RubyForge support forum is meant to support RubyForge itself. That means that if you have a feature request you might want to talk about before submitting a formal proposal, if you think you might have found a broken service in RubyForge, but you’re not sure, or if you just want to talk to us about some of the stuff we offer, you’ve found the right place.
If you have svn access that works on Windows but not on Linux, if you can’t install rails but you don’t suspect our gem servers are broken, or if you just want to ask what a particular library does, please don’t use the RubyForge forum. You will get much better help elsewhere.
I am the only active volunteer monitoring our forum right now, so please… help me out a bit by using the great mailing lists out there!
This isn’t to discourage people from using our forum, in fact, if in doubt, post to us anyway. But please, read the FAQ before you post. If others can spread the word by linking my other article on what our forum is for, that’d be very helpful!
Here’s a little problem I ran into in some old code of mine.
Sometimes we’ve got methods where we want to return a new instance of the same class.
It’s tempting to write the following code:
>> class A
>>   def a_whole_new_me
>>     A.new
>>   end
>> end
Sure enough, that seems to work:
>> a = A.new
>> a.class
=> A
>> a.a_whole_new_me.class
=> A
So what’s wrong with it? We forgot about subclasses!
>> class B < A
>> end
>> b = B.new
>> b.class
=> B
>> b.a_whole_new_me.class
=> A
If we’re expecting a copy of B and get A, this is certainly going to cause trouble, but the scary part is that it might not fail right away (since B usually has all of A’s methods, but not necessarily the other way around).
Luckily, this is easy to fix: just use self.class
>> class A
>>   def a_whole_new_me
>>     self.class.new
>>   end
>> end
>> a = A.new
>> a.class
=> A
>> a.a_whole_new_me.class
=> A
>> class B < A
>> end
>> b = B.new
>> b.class
=> B
>> b.a_whole_new_me.class
=> B
This practice is usually a good idea whenever we want to refer to our class object. Rather than making things rigid, if you use self.class when possible, your code will be easier to extend and behave better in general. Of course, your mileage may vary depending on your task.
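Here is the same idea as a script rather than an irb session, with both variants side by side. The `Shape` and `Circle` names and the `a_whole_new_shape` method are made up for this sketch.

```ruby
class Shape
  # Builds a fresh instance of whatever class the receiver actually is.
  def a_whole_new_me
    self.class.new
  end

  # Hard-coded variant, kept only for contrast.
  def a_whole_new_shape
    Shape.new
  end
end

class Circle < Shape
end

circle = Circle.new
puts circle.a_whole_new_me.class     # prints: Circle
puts circle.a_whole_new_shape.class  # prints: Shape
```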
Over the last couple of Ruby releases I’ve made some improvements (with Eric Hodel’s help and blessing) to RDoc for C extensions that I thought I would share with you. If you write C extensions with Ruby then keep reading. If you don’t do C and/or don’t care that much about RDoc, this post may not be that interesting for you. :)
First, and most significantly, you no longer need to use the “Document-class” directive for source files that contain multiple classes and/or classes that don’t match the ‘xxx’ portion of ‘Init_xxx’. Prior to 1.8.6, for example, you might have something like this:
/*
* Document-class: Top
* This is the Top namespace.
*/
/*
* Document-class: Bar
* This is the Bar class
*/
/*
* Document-class: Baz
* This is the Baz class.
*/
void Init_foo(){
VALUE mTop, cBar, cBaz;
mTop = rb_define_module("Top");
cBar = rb_define_class_under(mTop, "Bar", rb_cObject);
cBaz = rb_define_class_under(mTop, "Baz", rb_cObject);
}
I thought that having to explicitly document classes and modules outside of the Init_xxx function using special directives like that was ugly, so I dug into the rdoc source (scary!) and figured out how to get this working. The short of it is that you can document your classes and modules in a manner that is much more in line with the way rdoc works for other C elements:
void Init_foo(){
VALUE mTop, cBar, cBaz;
/* This is the top namespace */
mTop = rb_define_module("Top");
/* This is the Bar class */
cBar = rb_define_class_under(mTop, "Bar", rb_cObject);
/* This is the Baz class */
cBaz = rb_define_class_under(mTop, "Baz", rb_cObject);
}
This is a much nicer DWIM approach in my opinion. No special directives required.
The second improvement was a minor one. In pure Ruby you can have “personal comments” in methods that won’t be picked up in the rdoc by using “#--” to delineate them:
# This is the foo method. There are many like it but this one is mine.
#--
# This was a major pain to implement for MS Windows.
def foo
  "hello"
end
In the above example the comment “This was a major pain to implement for MS Windows” is not picked up for the final rdoc output. Prior to Ruby 1.8.5 there was no similar mechanism for C extensions. Now, however, you can use the same approach:
/*
* This is the foo method. There are many like it but this one is mine.
*--
* This was a major pain to implement for MS Windows.
*/
The last thing I’ll mention is an improvement in the way constants are documented. Prior to 1.8.6 the constant definitions were parsed literally because rdoc has no way of knowing what a constant C value is. For example, if you had something like this:
#define FOO_VERSION "1.2.0"
void Init_foo(){
VALUE cFoo = rb_define_class("Foo", rb_cObject);
/* The version of this package */
rb_define_const(cFoo, "VERSION", rb_str_new2(FOO_VERSION));
}
The end result would be “VERSION = rb_str_new2(FOO_VERSION)”. Not what we want. Now, however, you can specify the literal value yourself by using the “value: comment” syntax:
#define FOO_VERSION "1.2.0"
void Init_foo(){
VALUE cFoo = rb_define_class("Foo", rb_cObject);
/* 1.2.0: The version of this package */
rb_define_const(cFoo, "VERSION", rb_str_new2(FOO_VERSION));
}
Enjoy!
I haven’t posted here in a very long time, but I recently got a full-time job using Ruby and Rails (hurray) so Ruby is more on my mind lately. In fact I’ve gotten a better understanding of what life is like for the average Rails developer by seeing how my co-worker Alex writes his Ruby code. Now Alex is a smart guy: he has been doing web sites for years and is proficient in ColdFusion, PHP, Flash, HTML and CSS, yet his Ruby code is not always as elegant as he or I would like. Of course I’ve been using Ruby for almost 6 years, so I know it quite well (and that is a big reason why I was hired).
Still even I find the occasional new nugget and figured this blog would be a good forum to expose some of my new insights. This way other Rails developers like Alex who aren’t as proficient in Ruby as they would like can benefit from my experience.
Recently I was perusing the documentation for the Enumerable module and took a closer look at the grep method. This method is surprisingly more powerful than it might seem at first glance. To learn more, please continue reading this entry…
First, let’s describe the basic workings of the grep method, straight from the Ruby documentation:
enumObj.grep( pattern ) -> anArray
enumObj.grep( pattern ) {| obj | block } -> anArray
Returns an array of every element in enumObj for which Pattern === element. If the
optional block is supplied, each matching element is passed to it, and the
block's result is stored in the output array.
The most obvious use of grep is with arrays of Strings and a Regexp as the argument:
irb(main):001:0> names = ["Joe", "Bill", "Jill", "Susan", "Sam"]
=> ["Joe", "Bill", "Jill", "Susan", "Sam"]
irb(main):002:0> names.grep(/^J/)
=> ["Joe", "Jill"]
irb(main):003:0> names.grep(/^S/) {|name| name.upcase}
=> ["SUSAN", "SAM"]
But the key thing to remember is that grep actually uses the === operator when comparing the argument passed to each element in the Array. So any class that intelligently implements that operator can be used:
irb(main):004:0> numbers = [1, 2, 3, 4, 5, 6, 8, 9]
=> [1, 2, 3, 4, 5, 6, 8, 9]
irb(main):005:0> numbers.grep(3..6)
=> [3, 4, 5, 6]
irb(main):006:0> dates = [Date.new(2000, 1, 1), Date.new(2002, 2, 2),
Date.new(2004, 3, 3), Date.new(2006, 4, 4)]
=> [#<Date: 4903089/2,0,2299161>, #<Date: 4904615/2,0,2299161>,
#<Date: 4906135/2,0,2299161>, #<Date: 4907659/2,0,2299161>]
irb(main):007:0> dates.grep(Date.new(2001, 1, 1)..Date.new(2005, 1, 1)) {|date| date.to_s }
=> ["2002-02-02", "2004-03-03"]
irb(main):008:0> class Base;end
=> nil
irb(main):009:0> class Child1 < Base;end
=> nil
irb(main):010:0> class Child2 < Base;end
=> nil
irb(main):011:0> class NotAChild;end
=> nil
irb(main):012:0> objects = [Child1.new, NotAChild.new, Child2.new]
=> [#<Child1:0x2df6ce8>, #<NotAChild:0x2df6cd4>, #<Child2:0x2df6cc0>]
irb(main):013:0> objects.grep(Base)
=> [#<Child1:0x2df6ce8>, #<Child2:0x2df6cc0>]
In the above examples I used the Range#=== and Class#=== operators to grep through numbers, dates and instances of classes.
Something that I haven’t yet talked about, but which I’ve used in the examples, is the block passed to grep. This acts much like the block in Enumerable#map, taking a member of the array and returning it transformed in some way. Above I’ve made Strings uppercase and turned dates into more readable Strings, but this block can be as complex as you might need.
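To make the relationship with === and map concrete, here is a short sketch showing that grep with a block gives the same result as a select on `pattern === element` followed by a map. The `mixed` array is made up for the example.

```ruby
names = ["Joe", "Bill", "Jill", "Susan", "Sam"]

with_grep   = names.grep(/^S/) { |name| name.upcase }
spelled_out = names.select { |name| /^S/ === name }.map { |name| name.upcase }

p with_grep                 # prints: ["SUSAN", "SAM"]
p with_grep == spelled_out  # prints: true

# The pattern is always the receiver of ===, so classes and ranges work too.
mixed    = [1, "two", 3.0, 4, :five, 10]
integers = mixed.grep(Integer)
p integers                  # prints: [1, 4, 10]
p integers.grep(2..5)       # prints: [4]
```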
If you have classes which you might want to use with grep, all you need to do is implement an intelligent === method for whatever you will be passing to grep. In fact, as an example I decided to implement a Magic class which takes a block for use in the === method:
class Magic
def initialize(&block)
@block = block
end
def ===(other)
@block.call(other)
end
end
class Animal < Struct.new(:name, :sound, :class)
def to_s
"#{name}'s go '#{sound}'"
end
end
animals = [
Animal.new("Cow", "Moo!", "Mammal"),
Animal.new("Snake", "Hiss!", "Reptile"),
Animal.new("Dog", "Bark!", "Mammal"),
Animal.new("Eagle", "Go America!", "Bird"),
Animal.new("Cat", "Meow!", "Mammal"),
Animal.new("Shark", "Da Dum, Da Dum, Da Dum!", "Fish")
]
puts animals.grep(Magic.new {|a| a.class == "Mammal"})
# Results in:
# Cow's go 'Moo!'
# Dog's go 'Bark!'
# Cat's go 'Meow!'
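On Ruby 1.9 and later, Proc#=== calls the proc with its argument, so a lambda can be handed to grep directly and plays the role of the Magic class above. In this sketch the third Struct member is named `kind`, a rename made only for this example so that it does not shadow Object#class.

```ruby
# :kind is a renamed field for this sketch; it avoids overriding Object#class.
Animal = Struct.new(:name, :sound, :kind)

animals = [
  Animal.new("Cow", "Moo!", "Mammal"),
  Animal.new("Snake", "Hiss!", "Reptile"),
  Animal.new("Dog", "Bark!", "Mammal"),
  Animal.new("Shark", "Da Dum, Da Dum, Da Dum!", "Fish")
]

# Proc#=== invokes the lambda, so grep can use it as the pattern.
mammal = ->(animal) { animal.kind == "Mammal" }

p animals.grep(mammal).map(&:name)                     # prints: ["Cow", "Dog"]
p animals.grep(mammal) { |animal| animal.sound.upcase } # prints: ["MOO!", "BARK!"]
```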
I hope this relatively brief look into the Ruby Core was informative and will help any readers produce more elegant and maintainable Ruby code in their applications. As I find other interesting methods and uses I’ll post about them.
This came up in #camping today and I figured it was worth at least a mention:
Vanilla HashWithIndifferentAccess is slightly more choosy than Camping’s.
A quick irb session with each shows the difference.
>> require "active_support"
=> true
>> a = HashWithIndifferentAccess.new
=> {}
>> a.apple
NoMethodError: undefined method `apple' for {}:HashWithIndifferentAccess
from (irb):4
>> a.apple "bar"
NoMethodError: undefined method `apple' for {}:HashWithIndifferentAccess
from (irb):5
>> a.apple = "bar"
NoMethodError: undefined method `apple=' for {}:HashWithIndifferentAccess
from (irb):6
>> require "camping"
=> true
>> a = HashWithIndifferentAccess.new
=> {}
>> a.apple
=> nil
>> a.apple "bar"
NoMethodError: apple
from /usr/local/lib/ruby/gems/1.8/gems/camping-1.5/lib/camping.rb:51:in `method_missing'
from (irb):5
>> a.apple = "bar"
=> "bar"
This is not a complaint, just an observation I hope will be helpful. :)
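The looser behaviour in the second session is the kind of thing method_missing makes possible. Below is a minimal sketch of that technique. `LooseHash` is a hypothetical stand-in written for this example, not Camping's actual source, and it only mirrors the three calls shown above.

```ruby
# Hypothetical stand-in for illustration; not Camping's implementation.
class LooseHash < Hash
  def method_missing(name, *args)
    key = name.to_s
    if key.end_with?("=") && args.size == 1
      self[key.chomp("=")] = args.first  # a.apple = "bar" stores under "apple"
    elsif args.empty?
      self[key]                          # a.apple reads the key, nil when unset
    else
      super                              # a.apple "bar" still raises NoMethodError
    end
  end
end

a = LooseHash.new
p a.apple          # prints: nil
a.apple = "bar"
p a.apple          # prints: "bar"
p a["apple"]       # prints: "bar"
```

The trade-off is the one this post hints at: a mistyped reader such as `a.aple` quietly returns nil instead of raising.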
Since picking up RoR I have had to dig in to technologies which I had previously only glanced at. One of these is CSS, part of the DHTML set of technologies.
The experience has been rewarding and at times frustrating, but the outcome has been positive. It is a technology you need to get your head around in order to develop an effective and usable front end for your web application.
Here are some resources that eased the learning curve