We’re starting a new project and I’m finding myself adding things to the code base that we’ve done in the past… hence the last few posts. As we’re doing this, I’d like to highlight some of the little things that we do on each project to maintain some consistency and in that process reach out to the community for alternative approaches.
I’m intrigued by the vendor everything concept, but we haven’t adopted it on any of our projects (yet).
What we have been doing is to maintain a REQUIRED_GEMS file in the root directory of our Rails application.
For example:
$ cat REQUIRED_GEMS
actionmailer
actionpack
actionwebservice
activerecord
activesupport
cgi_multipart_eof_fix
daemons
fastercsv
fastthread
feedtools
gem_plugin
image_science
mongrel
mongrel_cluster
mysql
rails
rake
RedCloth
Ruby-MemCache
soap4r
uuidtools
Everybody on the team (designers/developers) knows to look here to make sure they have everything installed when beginning to work on the application.
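The "look here and compare" step can also be scripted. Below is a minimal sketch, assuming the one-gem-per-line layout shown above and a reasonably recent RubyGems. The helper names `read_required_gems` and `missing_gems` are made up for this illustration, and the support for `#` comment lines is an assumption (the file above has none).

```ruby
require "rubygems"

# Hypothetical helper: parse the text of a REQUIRED_GEMS file,
# one gem name per line, skipping blank lines and '#' comments
# (comment support is an assumption, not something the file uses).
def read_required_gems(text)
  text.lines.map(&:strip).reject { |line| line.empty? || line.start_with?("#") }
end

# Hypothetical helper: return the names RubyGems cannot find locally.
def missing_gems(names)
  names.reject do |name|
    begin
      Gem::Specification.find_by_name(name)
      true
    rescue Gem::LoadError
      false
    end
  end
end

# Only report when run from a directory that contains the file.
if File.exist?("REQUIRED_GEMS")
  missing = missing_gems(read_required_gems(File.read("REQUIRED_GEMS")))
  if missing.empty?
    puts "all required gems are installed"
  else
    puts "missing gems: #{missing.join(', ')}"
  end
end
```

Run from the root of the Rails application, this only reports what is missing; installing is still left to each team member.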
This has worked fairly well from project to project but since we’re starting a new project, I’m curious if anybody has some better ways to approach this. Should we look more seriously at the vendor everything approach or are there any alternative approaches?
Until now I mainly wrote about exposing a workflow/BPM engine as a set of resources. I’ll write a few lines about the other side. I could have named this post “RESTful orchestration”, but BPM is shorter and is music related as well.
For orchestration vs choreography, the best quote, via Stefan Tilkov, might be :
In orchestration, there’s someone — the conductor — who tells everybody in the orchestra what to do and makes sure they all play in sync.
One might argue that the conductor is merely a metronome that cares about BPM and that every participant knows where he fits in the ensemble. Would ‘orchestration’ rather refer to the preparation work of the execution ? Arrangements ?
Let’s forget about big words and acronyms, there’s work to do and no time for taking part in an already old argument. Let’s get started with the resources and where they live. I’ll go with a content management example, it’s just a few steps away (not hardcore ROA).
There is one staging server, one preview server and two production servers. Content is developed on the staging server, pushed on the preview server for approval and then passed to the production servers.
shell scripts
The “push to the preview server” might look like :
#!/bin/bash
# to_preview.sh
# $1 is the name of the page to push (first command line argument)

curl http://staging.example.com/pages/$1 > tmp.out

curl -X PUT \
  --data-binary "@tmp.out" \
  -H "Content-Type: text/plain" \
  http://preview.example.com/pages/$1

curl -X POST \
  -d "uri=http://preview.example.com/pages/$1" \
  -d "title=please approve" -d "team=review" \
  http://preview.example.com:4567/tasks
Which deals with “/pages” resources getting and putting, and finally creating a task for the review team (a “/tasks” resource on the preview host). (thanks curl)
Note that this script (and this post) doesn’t care about authentication and authorization, nor about the content type of our pages (it’s also limited to one ‘page’; what about batches of them ?).
The content producer and designers push for preview by running :
./to_preview.sh summer_catalog.html
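For those who would rather stay in Ruby, the same three requests can be sketched with the standard library's Net::HTTP. This is only an illustration under the same assumptions as the shell script (no authentication, plain text pages, one page at a time); the method names are made up for this example.

```ruby
require "net/http"
require "uri"

STAGING = "http://staging.example.com"
PREVIEW = "http://preview.example.com"
TASKS   = "http://preview.example.com:4567/tasks"

# Build the PUT that uploads the page body to the preview server.
def build_page_put(page, body)
  request = Net::HTTP::Put.new(URI("#{PREVIEW}/pages/#{page}"))
  request.body = body
  request["Content-Type"] = "text/plain"
  request
end

# Build the POST that creates a review task for the pushed page.
def build_task_post(page)
  request = Net::HTTP::Post.new(URI(TASKS))
  request.set_form_data(
    "uri" => "#{PREVIEW}/pages/#{page}",
    "title" => "please approve",
    "team" => "review"
  )
  request
end

# Send a prepared request to the host named in its URI.
def send_request(request)
  uri = request.uri
  Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }
end

# Fetch from staging, push to preview, then create the review task.
def push_to_preview(page)
  body = Net::HTTP.get(URI("#{STAGING}/pages/#{page}"))
  send_request(build_page_put(page, body))
  send_request(build_task_post(page))
end

# Usage (performs real HTTP calls, so it is left commented out):
# push_to_preview("summer_catalog.html")
```

Keeping the request building apart from the sending makes it possible to inspect what would go over the wire without touching any server.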
I could show the scripts for refusing the preview material or for approving and publishing it, but they’re not that hard to deduce.
All is well, provided that the producers/designers are not too averse to command line tools. There are just two kinds of resource to orchestrate, pages and tasks, and three sets of page resources (staging, preview, production).
But this workflow is kind of loose, there seems to be no “conductor”, just “agents” pushing stuff around with tiny scripts. Business Process Management is hard, let’s go shopping… I mean, this business process is hard to manage, because it is fragmented.
business process scripting
“There he goes again…”, let’s try to do that with a workflow/BPM engine. Feel free to just click away.
A workflow engine deals with “long running asynchronous processes”, that means it is patient enough to wait for replies, to play its conductor role. And if interrupted/restarted, it will wait again.
It would be nice to have one process definition to track this and all the implicit requirements :
There is one staging server, one preview server and two production servers. Content is developed on the staging server, pushed on the preview server for approval and then passed to the production servers.
The “business process” could look like :
class PubProcess0 < OpenWFE::ProcessDefinition
  cursor do
    participant "producer"
    hget "http://staging.example.com/pages/${f:page}"
    hput "http://preview.example.com/pages/${f:page}"
    participant "reviewer"
    cancel :if => "${f:decision} == cancel"
    rewind :if => "${f:decision} == redo"
    # else publish...
    hget "http://staging.example.com/pages/${f:page}"
    concurrence do
      hput "http://prod1.example.com/pages/${f:page}"
      hput "http://prod2.example.com/pages/${f:page}"
    end
  end
end
This definition uses the cursor expression and its ‘rewind’ and ‘cancel’ subexpressions, as well as the new http expression (hget, hput, hpost, … more on it later)
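To make the ‘rewind’ and ‘cancel’ semantics concrete, here is a toy model in plain Ruby. It is not how OpenWFEru implements cursors (the real engine persists its state and waits asynchronously for participants); it only mimics the control flow. The names `run_cursor` and `publication_steps`, as well as the `max_rewinds` guard, are inventions for this sketch.

```ruby
# Toy cursor: run the steps in order. A step returning :cancel stops
# the cursor, a step returning :rewind restarts it from the first step,
# any other return value moves on to the next step.
def run_cursor(steps, workitem, max_rewinds: 10)
  rewinds = 0
  index = 0
  while index < steps.size
    case steps[index].call(workitem)
    when :cancel
      return :cancelled
    when :rewind
      rewinds += 1
      raise "too many rewinds" if rewinds > max_rewinds
      index = 0
    else
      index += 1
    end
  end
  :completed
end

# Steps mirroring the publication process; the reviewer's successive
# decisions are taken from the 'decisions' array.
def publication_steps(decisions, trace)
  [
    ->(wi) { trace << "producer" },
    ->(wi) { trace << "preview" },
    ->(wi) { wi[:decision] = decisions.shift; trace << "reviewer: #{wi[:decision]}" },
    ->(wi) { :cancel if wi[:decision] == "cancel" },
    ->(wi) { :rewind if wi[:decision] == "redo" },
    ->(wi) { trace << "publish" }
  ]
end

trace = []
outcome = run_cursor(publication_steps(%w[redo publish], trace), {})
puts outcome
puts trace.join(" > ")
```

With the decisions "redo" then "publish", the trace goes through producer, preview and reviewer twice before reaching the publish step.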
The process definition looks neat, but the interaction with the /tasks resource has vanished; maybe I’ll use the worklist embedded in ruote-web or ruote-rest.
But what if we really wanted to use that /tasks resource ?
cursor do
  #...
  #participant "reviewer"
  hpost(
    "http://preview.example.com/tasks",
    :params => {
      :uri => "http://preview.example.com/pages/${f:page}"
    })
  hpoll(
    "${f:hheaders.location}",
    :until => "${f:hcode} == 404")
  #...
end
Assuming that the ‘hpost’ successfully created the task and was kind enough to reply with a “Location” header, the ‘hpoll’ expression would block the process, waiting for the ‘until’ condition to be met (the condition here is quite silly: it will poll every ten seconds until the /tasks/{id} resource goes 404, ‘task not found’ being interpreted as ‘task executed’).
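Outside of the engine, the polling idea boils down to something like the sketch below. It is plain Ruby with Net::HTTP, not the hpoll implementation. The ten second interval comes from the description above, while `poll_until_gone`, `http_status` and the `max_polls` limit are assumptions made for this example.

```ruby
require "net/http"
require "uri"

# Default status fetcher: a plain GET, returning the numeric HTTP status.
def http_status(url)
  Net::HTTP.get_response(URI(url)).code.to_i
end

# Poll the URL until it answers 404 ('task not found' read as 'task
# executed'). Returns the number of polls it took, or nil when
# max_polls is exhausted first. fetch_status and pause can be
# replaced, which keeps the loop testable without a server.
def poll_until_gone(url, interval: 10, max_polls: 360,
                    fetch_status: method(:http_status),
                    pause: ->(seconds) { sleep(seconds) })
  max_polls.times do |attempt|
    return attempt + 1 if fetch_status.call(url) == 404
    pause.call(interval)
  end
  nil
end

# Usage (performs real HTTP calls, so it is left commented out):
# poll_until_gone(location_header_value)
```

Unlike the engine, this loop dies with its Ruby process; a workflow engine would resume the polling after a restart.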
RESTful orchestration
Has it been reached ? By adding the HTTP verbs to a business process definition language ?
Isn’t that a kind of specialization ? BPEL was born with a similar specialization a few years ago.
The move from “participant ‘reviewer’” to the “hpost/hpoll” pair is a specialization. With the ‘participant’ the details are abstracted away from the business process [definition], with hpost/hpoll there is an assumption of a resource oriented architecture/context.
The hget/hpost/…/hpoll expressions are a few hours of work; they are not finished. I have meant to include them in Ruote (OpenWFEru) for a long time, but this “specialization” issue held me back. I feel they are not true to the “participants” vs “definitions” concept / dichotomy.
I don’t want to invest too much time into those new expressions. They should only cover basic cases. Complex cases should be abstracted to participant implementations.
(I started OpenWFE in Java a few years ago. I wanted to implement an open source workflow engine. In 2006, I switched to Ruby, this language has this huge expressiveness gain. I could [re]write the engine with less code. But I was/am also banking on people doing the last mile by themselves. Need a report ? Check prawn and write it by yourself. I don’t have to write all the participant implementations, I don’t have to cover all cases and all options. Ruby with its huge expressiveness and its openness (and its stock of gems) is making it easy for people to plug in their participants)
Anyway, those new Ruote expressions were fun to implement, they are available in the upcoming 0.9.19 release of OpenWFEru Ruote. Feedback is welcome on the user mailing list.
post scriptum / 303
I have not shown how to do authentication here. I have not talked about conditional GETs. Exceptions and retrying have not been discussed either. Maybe yet another incentive for hiding those details under the roof of a [custom] participant implementation.
RESTful BPM… Jason thinks there might be an elephant in the room.

OpenWFEru (Ruote) is an open source workflow and BPM engine written in Ruby. I just released version 0.9.18.
Why “Ruote” ? “OpenWFEru” is long to spell… The initial nickname was “Rufus”, but as I reused it for the subprojects not related to BPM, I needed another one. Tomaso suggested “Ruote” and it sounded just right.
“Ruote” means “wheels” in Italian, but it has migrated to English as well. Leopard’s dictionary says :
ruote |roōˈōtē|
noun
pasta that resembles small wheels with five spokes radiating from a hub.
ORIGIN Italian, literally ‘wheels’ .
And it’s OK to have a pasta as the nickname of a project that now allows scripting spaghetti business processes thanks to the new step expression.
The complete release announcement as addressed to the users mailing list.
OK, back to work, I’d like to improve ruote-rest, I guess I’ll have to improve (finish) the example first before feeling restful.

It may seem like an exercise, writing something restful. In the first take, I used Ruby on Rails to wrap OpenWFEru with a RESTful interface.
In this second take, I considered the alternatives among the Ruby web frameworks and went for Sinatra by Blake Mizerany.
Ruote-rest is mainly the result of a collaboration with a company which is integrating OpenWFEru (ruote) among its .NET applications. Its software artifacts speak XML over HTTP with ruote.
Ruote-rest, as a “take two”, is also the refinement of the concepts explored in the take one.
So what’s this RESTful workflow/BPM engine ?
It’s named “ruote-rest” and it’s available from GitHub at http://github.com/jmettraux/ruote-rest.
It provides for now 3 resources, /processes, /participants and /workitems (and the special /expressions resource).
These resources are sufficient for
Ruote-rest currently provides two representations of its resources, an XML one and an HTML one. The XML representation is meant for automated consumption, the main case, while the HTML one is meant for learning / debugging.
The installation instructions are in the README.txt, and there is also a packaged release available.
— warning : boring technical post ahead —
a resource tour
Let’s have a look at ruote-rest once it’s running, through its ‘HTML interface’, which demonstrates the API ruote-rest exposes, via hyperlinks and forms.
/processes enumerates the processes currently running in the engine.
New process instances can be launched (POSTed) with the form at the bottom of the HTML representation of /processes.
The form shows two ways of providing a process definition to the engine, via a URL (first input box) or by providing the process definition directly (text area).
Let’s just hit POST /processes.
Ruote-rest answers with an HTTP 201 reply containing a link to the newly created process instance. (The current HTTP status code is indicated on the top right of the HTML page, most of the time it’s at 200 OK).
Getting /processes/20080507-buparijitsu yields an HTML representation of the running [business] process instance (the workflow instance id being the 20080507-buparijitsu mentioned in the URI).
This page features links to all the expressions (/expressions/:wfid) that make up a process instance and to the JSON representation of the process definition itself.
It’s this representation that is interpreted in ruote-web when rendering graphically a process definition to some subset of BPMN. (Note that if the process instance got modified “en route”, it’s the process definition as running, not as launched which is returned).
There are two fat submit buttons, one for pausing (or resuming) the process instance (a PUT), and one for terminating (canceling) it altogether (a DELETE).
There is also a link to the “currently active expression” in the process (in other words, where the process is currently waiting). If there are concurrent execution paths for the process, there will be multiple “currently active expressions”.
A process instance is made up of expressions linked in an execution tree. The most common expressions are participant, sequence and concurrence.
The participant expression is used to link the execution of a workflow with an [external] participant, be it a service, a human, a small snip of code, …
OpenWFEru allows modifying the expressions in a running process instance. It may be necessary for administrative or business purposes (missing plan B maybe).
Expressions are stored in two states : unapplied and applied. Unapplied expressions have not yet been reached by the execution flow, they mainly are stored “raw”, uninterpreted (the whole branch they form with their unwound children expression).
The /expressions/:wfid/:expid page features links back to the process and to the list of all expressions. There are also links to the parent expression, the environment expression (scope) and the children expressions, if any.
At the end of the page, there is a submit button for terminating the expression (cancelling it) (DELETE). If the execution path is located in the branch being terminated, the flow will be resumed after the cancelled branch.
By hitting the show link above the delete button, a text area containing the YAML representation of the expression will be displayed, along with a submit button for PUTting the updated expression back in the workflow engine.
That shows how to update business process instances, on the fly (this subject has been addressed for ruote-web in process gardening).
Ruote-rest, like ruote-web, integrates a worklist for storing workitems for human consumption (not comestible though).
From the /workitems, the page for each individual workitem can be reached. It is also OK to query for the workitems belonging to just one process (/workitems?wfid=20080507-buparijitsu for example).
The HTML representation of a workitem has a text area containing the current payload of the workitem, as a JSON string, ready for being PUT back. If the “proceed” box is checked, the process will resume (with the updated payload).
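For automated clients, the URIs seen in this tour can be built with a few lines of Ruby. The sketch below only prepares URIs and requests with Net::HTTP and sends nothing. The base address `http://localhost:4567` is an assumption about where an instance might listen, and the helper names are made up for this example; request bodies and headers for launching or pausing are not covered here.

```ruby
require "net/http"
require "uri"

# Assumption: address of a locally running ruote-rest instance.
BASE = "http://localhost:4567"

# URI of one process instance, identified by its workflow instance id.
def process_uri(wfid, base = BASE)
  URI("#{base}/processes/#{wfid}")
end

# URI listing the workitems that belong to one process instance.
def workitems_uri(wfid, base = BASE)
  uri = URI("#{base}/workitems")
  uri.query = URI.encode_www_form("wfid" => wfid)
  uri
end

# DELETE request that terminates (cancels) a process instance.
def cancel_process_request(wfid, base = BASE)
  Net::HTTP::Delete.new(process_uri(wfid, base))
end

puts workitems_uri("20080507-buparijitsu")
```

Sending such a request is then a matter of handing it to `Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }`.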
The /participants page allows for addition / deletion of participants.
It currently only provides a way to add “active participants”, i.e. participants that place workitems in the integrated worklist. Other classes of participants can be added via the conf/participants.rb configuration file.
now and then
Ruote-rest is not finished. For example the /errors resource is not yet implemented (replaying a business process after an error is something important).
The initial representation is XML, along with HTML for educational purposes. JSON is a must. Why not include some AtomPub as well, as Kisha (the “take one” approach) did ?
Maybe YAML would be interesting too, for Ruby clients at least.
I’ve chosen Sinatra because I adhere to its convention :
get "/things/:tid" do
  "<p>I'm thing and my name is #{params[:tid]}</p>"
end
On the todo list is also some support for etags and if-modified-since, let’s make ruote-rest a good, respectful HTTP citizen.
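What that support amounts to can be sketched in framework-free Ruby. This is not ruote-rest code, just an illustration of the conditional GET decision: derive an ETag from the representation, compare it with If-None-Match (which takes precedence), fall back to If-Modified-Since, and answer 304 with an empty body when the client's copy is still good. The method names and the Rack-style return value are choices made for this example.

```ruby
require "digest"
require "time"

# ETag derived from the body of the representation (quoted, as HTTP requires).
def etag_for(body)
  %("#{Digest::SHA256.hexdigest(body)[0, 32]}")
end

# True when the resource has not changed since the date in the header.
# An unparseable date is treated as 'changed'.
def not_modified_since?(last_modified, header_value)
  last_modified.to_i <= Time.httpdate(header_value).to_i
rescue ArgumentError
  false
end

# Returns a Rack-style [status, headers, body] triple.
def conditional_get(body, last_modified, request_headers)
  etag = etag_for(body)
  headers = { "ETag" => etag, "Last-Modified" => last_modified.httpdate }

  fresh =
    if (candidates = request_headers["If-None-Match"])
      candidates.split(",").map(&:strip).include?(etag)
    elsif (since = request_headers["If-Modified-Since"])
      not_modified_since?(last_modified, since)
    else
      false
    end

  fresh ? [304, headers, ""] : [200, headers, body]
end
```

A client that sends back the ETag it received gets a 304 until the representation changes, which saves re-sending process or workitem representations that did not move.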
The security stuff is rudimentary for now, here is how it looks :
# called before each request gets processed
#
before do
  throw :halt, [ 401, "get off !" ] \
    if request.env['REMOTE_ADDR'] != '127.0.0.1'
end
The company for which ruote-rest is being specifically developed doesn’t require any fancy auth mechanism. But with Sinatra, such mechanisms aren’t that far away (see the so project for example or these posts).
Talking about security, the readers will certainly have noticed that the process definition can be fed directly into the engine, that’s very practical but you have to trust your auth mechanisms…
For persistence, ruote-rest currently uses the filesystem for the engine itself and a database (via Ruby on Rails’ ActiveRecord) for its worklist.
I’m thinking about integrating the automatic graphical process rendering found in ruote-web (ex-Densha) into the HTML “educational pages”, it would look good and help people understand what’s going on.
at the github
All this happens at http://github.com/jmettraux/ruote-rest, feel free to fork. Find the mailing lists at google groups.
