GentleCMS

GentleCMS is a resource-oriented content management system, designed for high-traffic websites or large-scale content repositories.

gentlecms.com

GentleCMS Development Log: Part 4

I’ve been up to no good again. I keep changing my directory structure around. Nothing feels quite right, but each time I change it, it seems a bit better than the last time. In any case, my svn repository for this project is now something of a mess. :-(

Anyways, I decided my URI implementation still had a weak spot that needed to be taken care of.

I figure I’ll fill out the missing chart on Sam’s list with the results of Ruby’s two URI implementations:

testuri.1.rb produces:

http://example.com/          http://example.com           true
HTTP://example.com/          http://example.com/          false
http://example.com/          http://example.com:/         true
http://example.com/          http://example.com:80/       true
http://example.com/          http://Example.com/          true
http://example.com/~smith/   http://example.com/%7Esmith/ false
http://example.com/~smith/   http://example.com/%7esmith/ false
http://example.com/%7Esmith/ http://example.com/%7esmith/ false
http://example.com/%C3%87    http://example.com/C%CC%A7   false

testuri.2.rb produces:

http://example.com/          http://example.com           true
HTTP://example.com/          http://example.com/          true
http://example.com/          http://example.com:/         true
http://example.com/          http://example.com:80/       true
http://example.com/          http://Example.com/          true
http://example.com/~smith/   http://example.com/%7Esmith/ true
http://example.com/~smith/   http://example.com/%7esmith/ true
http://example.com/%7Esmith/ http://example.com/%7esmith/ true
http://example.com/%C3%87    http://example.com/C%CC%A7   true
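
Most of the rows that flip from false to true come down to the case and percent-encoding normalization described in RFC 3986, section 6.2.2. Here’s a minimal sketch of just the percent-encoding part. The helper name is made up for illustration, and this is not the actual GentleCMS URI code. The last row is a different problem (Ç versus C followed by a combining cedilla), which needs Unicode normalization and isn’t covered here.

```ruby
# Hypothetical helper, not part of GentleCMS or Ruby's URI library.
# Unreserved characters per RFC 3986: letters, digits, "-", ".", "_", "~".
UNRESERVED = /\A[A-Za-z0-9\-._~]\z/

def normalize_percent_encoding(uri_string)
  uri_string.gsub(/%([0-9A-Fa-f]{2})/) do
    hex = Regexp.last_match(1)
    char = hex.hex.chr
    # Decode escapes of unreserved characters; uppercase the hex digits of the rest.
    char.match?(UNRESERVED) ? char : "%#{hex.upcase}"
  end
end

a = normalize_percent_encoding("http://example.com/~smith/")
b = normalize_percent_encoding("http://example.com/%7esmith/")
puts a == b
```

Comparing the normalized strings instead of the raw ones is roughly what separates the two charts for the ~smith rows.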

FeedTools: Sporkmonger Blog

GentleCMS Development Log: Part 5

Get me rewrite! Again.

I’m not certain exactly how many times I’ve rewritten the base stuff for GentleCMS, but I’m pretty sure it’s been at least 5 times now, and every time it gets better, cleaner, simpler, and better specified. I just finished up writing the code for managing resource properties (arbitrary file metadata). This time around, I (re)wrote the whole thing as a fancy subclass of Hash, complete with 100.0% C0 code coverage and a fairly respectable 1.37:1 spec:code ratio. I’ll probably begin writing the specs for the ResourceNode class (the most important class in GentleCMS) in the next couple of days.

I’ve been making a few evolutionary improvements to the URI class lately. (Which, by the way, is now at 100.0% C0 code coverage and a 1.63:1 spec:code ratio. Ultimately the goal is to have 100.0% C0 code coverage for every single piece of GentleCMS and at least a 1.0:1 spec:code ratio for every file.) The improvements were mostly related to escaping. I misread parts of the RFC related to escaping certain characters and I had to go back and improve the specs for that. I’m pretty sure there are still some edge cases that need to be better specified in my RSpec code. Whenever there’s some section of the RFC that supplies an example, I’ve been adding it verbatim to the RSpec code with a comment to allow easy cross-referencing between the executable specification and the RFC. I almost wish RSpec had some functionality similar to RDoc that would merge the comments with the generated HTML specification somehow. The generated stuff tends to be fairly bland (though still somewhat useful), but it would be really cool if it could be fleshed out a bit.

Anyways, since the properties code was what I just finished up, I thought I’d explain a bit about how the feature works exactly. Properties in GentleCMS are loosely modeled on Subversion’s metadata system, which is really quite simple. Metadata in both systems is basically represented as a set of key-value string pairs, which is why it makes a lot of sense to code it up as a Hash subclass. Namespacing is dealt with by simply prefixing the key’s name with the namespace string followed by a colon. So for example, Subversion uses “svn:mime-type” to store the mime-type of a file, while GentleCMS uses “cms:mime-type”. There’s really nothing special about the way namespacing is done; it’s actually not much more than a style convention.
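
To make that concrete, here’s a rough sketch of what a Hash subclass with colon-prefixed namespaces could look like. The class name and the in_namespace method are made up for illustration; this is not the actual GentleCMS properties code.

```ruby
# Hypothetical sketch; names are illustrative, not GentleCMS's API.
class Properties < Hash
  # Keys and values are always stored as strings,
  # in the spirit of Subversion-style properties.
  def []=(key, value)
    super(key.to_s, value.to_s)
  end

  def [](key)
    super(key.to_s)
  end

  # Returns only the properties whose keys start with "namespace:".
  def in_namespace(namespace)
    prefix = "#{namespace}:"
    select { |key, _value| key.start_with?(prefix) }
  end
end

props = Properties.new
props["cms:mime-type"] = "text/html"
props["svn:mime-type"] = "text/plain"
props[:"cms:encoding"] = "utf-8"

puts props.in_namespace("cms").keys.sort.inspect
```

Since the namespace is just part of the key string, filtering by namespace is nothing more than a prefix check.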

However, GentleCMS’s metadata system is significantly different in one respect from Subversion’s. GentleCMS allows properties to be auto-generated. GentleCMS has a special class called ResourceAdaptor. Subclasses of ResourceAdaptor are able to selectively alter the behavior and state of ResourceNodes depending on the ResourceNode’s state. For example, if you wanted to auto-detect the encoding of an XML file in order to eliminate many of the problems that crop up as a result of RFC 3023, you could write a ResourceAdaptor subclass that only accepts XML file ResourceNodes. The subclass would add a generated property to the ResourceNode, “cms:encoding” whose value was obtained by inspecting the ResourceNode’s content and determining the XML file’s encoding. GentleCMS knows what to do with the “cms:encoding” property and will automatically supply the correct HTTP headers when a representation of the resource is requested.
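
Here’s a minimal sketch of the XML encoding example. Everything in it is a stand-in: the Struct-based ResourceNode, the accepts? and adapt method names, and the regular expression are all assumptions made for illustration, not the real GentleCMS classes. It also only looks at the XML declaration and ignores byte order marks, so treat it as a simplification.

```ruby
# Hypothetical stand-ins for the classes described above.
ResourceNode = Struct.new(:content, :properties)

class ResourceAdaptor
  def self.accepts?(_node)
    raise NotImplementedError
  end

  def self.adapt(_node)
    raise NotImplementedError
  end
end

class XmlEncodingAdaptor < ResourceAdaptor
  XML_TYPES = ["application/xml", "text/xml"].freeze
  # Matches the encoding pseudo-attribute of a leading XML declaration.
  ENCODING_PATTERN = /\A<\?xml[^>]*encoding=["']([A-Za-z][A-Za-z0-9._-]*)["']/

  # Only XML file nodes are accepted.
  def self.accepts?(node)
    XML_TYPES.include?(node.properties["cms:mime-type"])
  end

  # Adds a generated "cms:encoding" property based on the node's content.
  def self.adapt(node)
    return node unless accepts?(node)
    match = ENCODING_PATTERN.match(node.content)
    # Simplification: fall back to UTF-8 when no encoding is declared.
    node.properties["cms:encoding"] = match ? match[1].downcase : "utf-8"
    node
  end
end

node = ResourceNode.new(
  %(<?xml version="1.0" encoding="ISO-8859-1"?><feed/>),
  { "cms:mime-type" => "application/xml" }
)
XmlEncodingAdaptor.adapt(node)
puts node.properties["cms:encoding"]
```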

FeedTools: Sporkmonger Blog

A Question

Lately, this has become something of a design pattern for me: A base class describes a type of behavior, and rather than subclasses inheriting the base class’s behavior (which tends to just raise NotImplementedErrors), the subclasses override the base class’s methods to do their own thing. The catch, of course, is that some of the base class’s methods are not, in fact, just stubs. Some of them actually dispatch the message just received to each of the subclasses. So if you send GentleCMS::Cache the :clear message, that message will actually get relayed to GentleCMS::ResponseCache, GentleCMS::ResourceCache, and GentleCMS::RouteCache, thus clearing out all caching systems within GentleCMS. This allows for either selective or indiscriminate clearing of the cache. Very useful.

The problem is that my method for finding subclasses is slow, and I was hoping that you, Dear Readers, might have some suggestions for how I might improve the performance of this method:

class Module #:nodoc:
  # Returns a list of modules and classes that descend from this module.
  def descendants
    descendant_modules = []
    ObjectSpace.each_object do |object|
      next if !object.kind_of? Module
      next if object == self
      descendant_modules << object if object.ancestors.include? self
    end
    return descendant_modules
  end
end

I tend to cache the results of calling this method, so it doesn’t have a huge performance hit during normal operation, but startup times have become rather surprisingly long.

Update: I discovered that ObjectSpace.each_object could take an optional type parameter. The code runs much faster now:

class Module #:nodoc:
  # Returns a list of modules and classes that descend from this module.
  def descendants
    descendant_modules = []
    ObjectSpace.each_object(Module) do |object|
      next if object == self
      descendant_modules << object if object.ancestors.include? self
    end
    return descendant_modules
  end
end

Update: Cool, that took the worst-case startup time down from 12 seconds to 3 seconds. Not bad: a 4x speedup (a 75% reduction in startup time) from changing two lines of code. I can live with that.
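
Another way to skip the ObjectSpace scan entirely is to have each base class record its subclasses as they get defined, using Ruby’s Class#inherited hook. This is only a sketch of that alternative, not what GentleCMS does: the Sketch module, the store methods, and direct_subclasses are made up for illustration, and unlike the Module#descendants patch above it only tracks classes, not modules.

```ruby
# Hypothetical sketch: these classes only borrow the names used in this post.
module Sketch
  class Cache
    class << self
      def direct_subclasses
        @direct_subclasses ||= []
      end

      # Ruby calls this hook on the parent each time a subclass is defined.
      def inherited(subclass)
        super
        direct_subclasses << subclass
      end

      # All classes below this one, found without touching ObjectSpace.
      def descendants
        direct_subclasses.flat_map { |klass| [klass] + klass.descendants }
      end

      # Relays the :clear message to each registered subclass.
      def clear
        direct_subclasses.each(&:clear)
      end
    end
  end

  class ResponseCache < Cache
    def self.store
      @store ||= {}
    end

    def self.clear
      store.clear
    end
  end

  class RouteCache < Cache
    def self.store
      @store ||= {}
    end

    def self.clear
      store.clear
    end
  end
end

Sketch::ResponseCache.store["/index"] = "<html/>"
Sketch::Cache.clear
puts Sketch::ResponseCache.store.empty?
```

The trade-off is that registration happens at class definition time, so there is nothing to scan at startup, but only subclasses of the instrumented base class are tracked.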

FeedTools: Sporkmonger Blog

GentleCMS Development Log: Part 6

I’m beginning to think that perhaps GentleCMS is badly named. It’s a content management system, yes, but it’s also a lot more. It would probably be more accurate to call it a content management framework.

I’ve pretty much finished the main stuff for the backend. There’s some polishing left to do still, but without a frontend in place, further work on the backend code is going to inevitably end up getting out of touch with reality. I really need to get the code to the point where frontend development is sane, and as quickly as possible.

I’ve almost got the second rewrite of the theming code done. (Seriously, I don’t think there will be even a single component that hasn’t been rewritten at least once when I finally release this thing.) I’m going to keep the internal templating code exceptionally simple for the time being because at this point, it’s looking highly likely that I’m going to end up writing a custom templating system, probably based on Kid. (I’ll release it as a separate project most likely.) It’s not necessary yet though, so in the meantime, I’m just going to stick with very, very simple Erb templates so that any work I do is eventually going to be reusable. In any case, GentleCMS doesn’t really care too much how the templates are implemented, and as an end user, you’ll be able to pick and choose without any fuss.

Originally, my intention had been for GentleCMS to be a Rails application which was able to host other applications. That plan had a lot of shortcomings, not the least of which was that I really didn’t think a Rails application was going to be a good way to distribute, deploy, or upgrade things. Gem-based installers like the one Typo and other projects have recently started using would alleviate some of the pain points, but upgrades would still be problematic.

But beyond that, GentleCMS is nothing like Rails. Rails’ sweet-spot (quickly producing small database-backed web apps) is that which you should probably never try to do on GentleCMS. So, aside from hype-factor, why on earth would I want to tie this thing into Rails? Rails isn’t a webserver, and really, a webserver is all GentleCMS needs, not a full-blown framework, because GentleCMS is already the framework.

Which brings me to the option I’ve actually decided to go with: hooking directly into Mongrel. This gives me a lot of extra options, like for instance, multithreading like Merb, as well as a lot more control. Plus the performance is a lot better.

FeedTools: Sporkmonger Blog

GentleCMS Development Log: Part 3

The extract method is basically done. I’m sure it could be improved a bit more, but it seems to be fairly effective. I added a few extra features beyond the original URI class’s capabilities, such as supplying a base URI to resolve relative URIs against. You can also have it return the parsed URIs instead of the strings, at no extra processing cost, since it has to parse each URI internally anyways. Tried it out on Sam Ruby’s feed (as you may have noticed, currently my favorite chunk of text to try just about everything out on) and it seems to have gone ok:

>> GentleCMS::URI.extract(text,
  :base => "http://www.intertwingly.net/blog/index.atom")
=> ["http://www.w3.org/2005/Atom",
 "http://purl.org/syndication/thread/1.0",
 "http://www.intertwingly.net/blog/index.atom",
 "http://www.intertwingly.net/blog/index.atom",
 "tag:intertwingly.net,2004:2340",
 "http://www.w3.org/1999/xhtml",
 "http://www.tbray.org/ongoing/When/200x/2006/07/07/With-Bloglines-to-Atom",
 "http://www.w3.org/1999/xhtml",
 "http://www.tbray.org/ongoing/When/200x/2006/07/07/With-Bloglines-to-Atom",
 "http://www.w3.org/2000/svg",
 "http://www.atomenabled.org/developers/syndication/atom-format-spec.php#element.link",
 "http://www.atomenabled.org/developers/syndication/atom-format-spec.php#rfc.section.3.1.1",
 "http://www.w3.org/TR/2001/REC-xmlbase-20010627/",
 "http://www.bloglines.com/preview?siteid=235142",
 "http://www.atomenabled.org/developers/syndication/atom-format-spec.php#element.author",
 "http://www.bloglines.com/preview?siteid=235142",
 "http://www.atomenabled.org/developers/syndication/atom-format-spec.php#element.source",
 "http://www.bloglines.com/preview?siteid=5319444",
 "http://www.atomenabled.org/developers/syndication/atom-format-spec.php#element.updated",
 "http://www.bloglines.com/preview?siteid=2375595",
 "http://www.bloglines.com/preview?siteid=50",
 "http://www.bloglines.com/preview?siteid=2438392",
 "http://weblog.philringnalda.com/2005/12/18/who-knows-a-title-from-a-hole-in-the-ground",
 "http://www.niallkennedy.com/blog/archives/2006/07/google-sitemaps-2.html",
 "http://www.stephenduncanjr.com/2006/06/atom-10-and-blogger.shtml",
 "tag:intertwingly.net,2004:2339",
 "http://www.w3.org/1999/xhtml",
 "http://www.1060.org/blogxter/entry?publicid=8A0DC194929914711F1C0470FFDB7B73",
 "http://www.intertwingly.net/slides/2005/xmlconf/",
 "http://www.intertwingly.net/slides/2005/etcon/",
 "tag:intertwingly.net,2004:2338",
 "http://www.w3.org/1999/xhtml",
 "http://www.w3.org/2000/svg",
 "http://en.wikipedia.org/wiki/Penrose_tiling",
 "http://intertwingly.net/stories/2006/07/06/penroseTiling.svg",
 "tag:intertwingly.net,2004:2337",
 "http://www.w3.org/1999/xhtml",
 "http://www.w3.org/2000/svg",
 "http://www.unto.net/unto/work/on-rss-and-atom/",
 "http://www.unto.net/unto/opensearch/more-on-rss-and-atom/",
 "tag:intertwingly.net,2004:2336",
 "http://www.w3.org/1999/xhtml",
 "http://intertwingly.net/stories/2006/07/04/clean_utf8_for_xml.c",
 "http://www.intertwingly.net/blog/",
 "http://www.intertwingly.net/blog/2006/07/08/Bloglines-Edge-Cases",
 "http://www.intertwingly.net/blog/2340.atom",
 "http://www.intertwingly.net/blog/2006/07/06/Blame-Somebody",
 "http://www.intertwingly.net/blog/2339.atom",
 "http://www.intertwingly.net/blog/2006/07/06/Penrose-Tiling",
 "http://www.intertwingly.net/blog/2338.atom",
 "http://www.intertwingly.net/blog/2006/07/04/Just-a-Technical-Detail",
 "http://www.intertwingly.net/blog/2337.atom",
 "http://www.intertwingly.net/blog/2006/07/04/Clean-utf-8-for-XML",
 "http://www.intertwingly.net/blog/2336.atom"]

The original’s output:

URI.extract(text)
=> ["http://www.w3.org/2005/Atom",
 "xmlns:thr=",
 "http://purl.org/syndication/thread/1.0",
 "http://www.intertwingly.net/blog/index.atom",
 "http://www.intertwingly.net/blog/index.atom",
 "T20:30:05-04:00",
 "tag:intertwingly.net,2004:2340",
 "thr:count=",
 "thr:when=",
 "T20:30:01-04:00",
 "http://www.w3.org/1999/xhtml",
 "http://www.tbray.org/ongoing/When/200x/2006/07/07/With-Bloglines-to-Atom",
 "http://www.w3.org/1999/xhtml",
 "http://www.tbray.org/ongoing/When/200x/2006/07/07/With-Bloglines-to-Atom",
 "http://www.w3.org/2000/svg",
 "float:right",
 "out:",
 "http://www.atomenabled.org/developers/syndication/atom-format-spec.php#element.link",
 "http://www.atomenabled.org/developers/syndication/atom-format-spec.php#rfc.section.3.1.1",
 "http://www.w3.org/TR/2001/REC-xmlbase-20010627/",
 "xml:base",
 "http://www.bloglines.com/preview?siteid=235142",
 "http://www.atomenabled.org/developers/syndication/atom-format-spec.php#element.author",
 "http://www.bloglines.com/preview?siteid=235142",
 "http://www.atomenabled.org/developers/syndication/atom-format-spec.php#element.source",
 "http://www.bloglines.com/preview?siteid=5319444",
 "http://www.atomenabled.org/developers/syndication/atom-format-spec.php#element.updated",
 "http://www.bloglines.com/preview?siteid=2375595",
 "http://www.bloglines.com/preview?siteid=50",
 "http://www.bloglines.com/preview?siteid=2438392",
 "http://weblog.philringnalda.com/2005/12/18/who-knows-a-title-from-a-hole-in-the-ground",
 "http://www.niallkennedy.com/blog/archives/2006/07/google-sitemaps-2.html",
 "http://www.stephenduncanjr.com/2006/06/atom-10-and-blogger.shtml",
 "T18:06:55-04:00",
 "tag:intertwingly.net,2004:2339",
 "thr:count=",
 "thr:when=",
 "T12:45:00-04:00",
 "http://www.w3.org/1999/xhtml",
 "http://www.1060.org/blogxter/entry?publicid=8A0DC194929914711F1C0470FFDB7B73",
 "http://www.intertwingly.net/slides/2005/xmlconf/",
 "http://www.intertwingly.net/slides/2005/etcon/",
 "T21:07:59-04:00",
 "tag:intertwingly.net,2004:2338",
 "thr:count=",
 "thr:when=",
 "T19:56:01-04:00",
 "http://www.w3.org/1999/xhtml",
 "http://www.w3.org/2000/svg'",
 "float:right",
 "http://en.wikipedia.org/wiki/Penrose_tiling",
 "http://intertwingly.net/stories/2006/07/06/penroseTiling.svg",
 "T17:55:35-04:00",
 "tag:intertwingly.net,2004:2337",
 "thr:count=",
 "thr:when=",
 "T08:45:19-04:00",
 "http://www.w3.org/1999/xhtml",
 "http://www.w3.org/2000/svg",
 "float:right",
 "http://www.unto.net/unto/work/on-rss-and-atom/",
 "http://www.unto.net/unto/opensearch/more-on-rss-and-atom/",
 "T12:15:13-04:00",
 "T21:19:04-04:00",
 "tag:intertwingly.net,2004:2336",
 "thr:count=",
 "thr:when=",
 "T22:27:59-04:00",
 "http://www.w3.org/1999/xhtml",
 "http://intertwingly.net/stories/2006/07/04/clean_utf8_for_xml.c",
 "T08:59:42-04:00"]

Here’s the diffs:

(uri_result - gentle_uri_result)
=> ["xmlns:thr=",
 "T20:30:05-04:00",
 "thr:count=",
 "thr:when=",
 "T20:30:01-04:00",
 "float:right",
 "out:",
 "xml:base",
 "T18:06:55-04:00",
 "thr:count=",
 "thr:when=",
 "T12:45:00-04:00",
 "T21:07:59-04:00",
 "thr:count=",
 "thr:when=",
 "T19:56:01-04:00",
 "http://www.w3.org/2000/svg'",
 "float:right",
 "T17:55:35-04:00",
 "thr:count=",
 "thr:when=",
 "T08:45:19-04:00",
 "float:right",
 "T12:15:13-04:00",
 "T21:19:04-04:00",
 "thr:count=",
 "thr:when=",
 "T22:27:59-04:00",
 "T08:59:42-04:00"]
(gentle_uri_result - uri_result)
=> [".",
 "2006/07/08/Bloglines-Edge-Cases",
 "2340.atom",
 "2006/07/06/Blame-Somebody",
 "2339.atom",
 "2006/07/06/Penrose-Tiling",
 "2338.atom",
 "2006/07/04/Just-a-Technical-Detail",
 "2337.atom",
 "2006/07/04/Clean-utf-8-for-XML",
 "2336.atom"]
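
The relative references in that second list are what the :base option is for. The resolution step itself can be sketched with URI.join from Ruby’s standard library; this only illustrates resolving against a base, and is not the GentleCMS extract code.

```ruby
require 'uri'

base = "http://www.intertwingly.net/blog/index.atom"
relative = [".", "2340.atom", "2006/07/08/Bloglines-Edge-Cases"]

# Resolve each relative reference against the feed's own URI.
resolved = relative.map { |ref| URI.join(base, ref).to_s }
resolved.each { |uri| puts uri }
```

The resolved strings correspond to the intertwingly.net entries at the end of the first result list above.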

The extract code was designed to work especially well with SGMLish text and Textile-formatted text. The regular expressions should work perfectly with BBCode and Markdown as well, though I haven’t tried it.

I do admit that I totally cheated and threw out basically all of those false positives specifically for this example, but I’ll probably also be expanding the rejection list as time goes on, since it’s a fairly lightweight check. Good enough for my purposes anyhow.

FeedTools: Sporkmonger Blog

Monkey Patching Goodness

FeedTools 0.2.25: Now with 625 lines of monkey patching, and all the same terrible performance you’ve come to expect!

I decided to extract all of my REXML monkey patches out into a single file instead of leaving them all in feed_tools.rb for this release. Tests should all pass on Ruby 1.8.4 now. And Sam Ruby’s feed should be handled correctly again. His use of “.” as his link URI caused one of the parser’s heuristics to throw a hissy fit and misreport the feed’s URI as nil and the value of the feed’s link as the feed’s URI. Weird stuff. Anyways, that works again. (NetNewsWire was breaking on Sam’s feed last I checked.) HTTP redirection handling has been changed so that FeedTools won’t barf if a relative Location: header is supplied. And the parser should generally work a little bit better with FeedUpdater.

I’ll probably make another release when I get around to integrating my new URI code. After that, there likely won’t be another release for quite some time. Virtually all of my free coding time will be spent on GentleCMS instead. Just a heads-up.

FeedTools: Sporkmonger Blog

GentleCMS Development Log: Part 2

I have an admission to make.

I hate Ruby’s URI class. There, I’ve said it. It’s out in the open, I can’t take it back. I just really don’t like it. It constantly tells me that my URIs are invalid and that I’m a bad person and it basically just hates me. So yeah. I’m just not gonna take it anymore. URI class, I’m quitting you. I’ve found someone else who treats me nicely, someone who knows how to normalize without making me do lots of extra work, someone who actually understands IRIs, and who makes allowance for the fact that sometimes people are ignorant morons who don’t read the specifications.

I’d like to introduce GentleCMS’s replacement for Ruby’s standard library URI implementation: GentleCMS::URI. It’s based on RFC 3986 instead of RFC 2396 and RFC 2732, so it’s a little bit more modern, it’s fairly close in functionality to Ruby’s URI class, and for the most part, it shares all of the same methods. The opaque method is gone because it’s no longer used in RFC 3986, although if necessary, an alias could be created I suppose, since path now takes opaque’s place.

The class will automatically use Ruby’s bindings for libidn if they’re present:

require 'system/uri'
uri = GentleCMS::URI.parse(
  "http://www.詹姆斯.com/atomtests/iri/詹.html")
uri.normalize.to_s
# => "http://www.xn--8ws00zhy3a.com/atomtests/iri/%E8%A9%B9.html"

I wanted to write a pure-Ruby IDNA implementation, but in the interests of not yak-shaving for the rest of my life, I decided not to bother.
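
The path half of that normalization needs no libidn at all: non-ASCII characters are encoded as UTF-8 and each byte is percent-escaped. Here is a rough sketch of just that step, assuming only RFC 3986 unreserved characters are left alone. The percent_encode helper is made up for this example and is not the GentleCMS::URI implementation.

```ruby
# Characters RFC 3986 calls "unreserved"; these are never escaped.
UNRESERVED = /\A[A-Za-z0-9\-._~]\z/

# Percent-encode everything except unreserved characters, byte by byte.
# percent_encode is a hypothetical helper, not part of GentleCMS::URI.
def percent_encode(segment)
  segment.encode("UTF-8").each_char.map do |char|
    if char.match?(UNRESERVED)
      char
    else
      char.bytes.map { |byte| format("%%%02X", byte) }.join
    end
  end.join
end

percent_encode("詹.html")
# => "%E8%A9%B9.html"
```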

The code, thus far, is checked into the repository if you want to play with it. Complete with an executable specification and 100.0% C0 code coverage according to RCov.

It is my intention to polish it up a bit longer, improve the documentation, run with it for a while in the real world, and then see what it would take to have it replace the implementation in Ruby’s standard library. I’ll probably start having FeedTools use the code as well.

651 lines of implementation, 739 lines of specification.

FeedTools: Sporkmonger Blog

GentleCMS Development Log: Part 1

I got a lot of work on GentleCMS done while I was at RailsConf. I’ll probably start putting regular entries up on my progress so that it doesn’t get too quiet around here.

I checked in support for search/indexing with Ferret, which may eventually be replaced with Hyper Estraier, or I might work on making the indexing code modular enough to support either implementation. Not sure. Seth Fitzsimmons gave a presentation at RailsConf that sort of threw a wet blanket on my enthusiasm for Ferret. If Ferret isn’t sufficiently stable, I’ll have to work out an alternative, but we’ll see. It hasn’t crashed on me yet, but I haven’t really done anything interesting with it either.

The start of templating is in place, with the dependency tracking handled by the indexing code. The basic templating stuff works, but right now, setting file dependencies is still a very manual task, and that needs to be fixed. I’ll have to finish work on the generated properties code to get that working. Representation caching hasn’t been written either, and that’s a major prerequisite for the templating system.

Filtering, however, is working perfectly. The filters I’ve currently finished writing are:

  • textile
  • markdown
  • tidy-page
  • tidy-fragment

Chaining filters is trivial, so you can apply tidy-fragment to a textile page if you like, or really, any combination of filters. I’d like to add in support for sanitized (X)HTML and BBCode by default. I can probably lift the sanitization code from FeedTools fairly easily, though I really ought to clean it up first. It’s honestly kinda messy. Does anyone know if there’s a BBCode implementation for Ruby? I’m sure it wouldn’t be hard to write, but no sense reinventing the wheel if it’s already out there. I’ll probably lift Typo’s syntax-highlighting code as well if Scott et alii don’t mind.
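
As a rough idea of what chaining looks like, here is a sketch that treats each filter as something that takes a string and returns a string, then folds the content through a list of filter names. The FILTERS table, its toy filters, and apply_filters are all placeholders invented for this example; the real textile, markdown, and tidy filters are not shown.

```ruby
# Toy stand-ins for real filters. Each one maps a string to a string.
# These names are hypothetical, not the actual GentleCMS filters.
FILTERS = {
  "strip"     => ->(text) { text.strip },
  "paragraph" => ->(text) { "<p>#{text}</p>" },
  "newline"   => ->(text) { text + "\n" }
}

# Run the text through each named filter in order.
def apply_filters(text, names)
  names.inject(text) { |result, name| FILTERS.fetch(name).call(result) }
end

apply_filters("  Hello  ", ["strip", "paragraph"])
# => "<p>Hello</p>"
```

Because every filter has the same shape, the order of the names is the only thing that decides the result, and an unknown name fails loudly with a KeyError instead of being silently skipped.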

The theming engine has a long way to go, and I’m really not sure exactly how I should go about implementing it. I started writing it during Rails Day, but I think it might need to be scrapped and restarted.

Authentication/authorization is still non-existent; I’ll get around to it when I have more to work with. I’m planning on having an authentication proxy of sorts so that you can script the attachment of, say, an OpenID identity to an account or set of privileges within GentleCMS. This should allow you to let people sign into their OpenID account, and have the CMS treat that user as if they’d signed into an anonymous GentleCMS account with a specific set of privileges. Or at least, it should go something like that. I’m not sure exactly. OpenID will probably be the only “Identity 2.0” protocol that will be natively supported, but I intend to make it reasonably simple to add in support for other authentication systems as plugins.

Last but not least, I decided to use RSpec for the testing framework for GentleCMS. I know, I know, we’re not supposed to call it a “testing framework”, but I’m going to anyways, since I’m guessing that I’ll get a few raised eyebrows from the people who haven’t used it yet if I called it a “specification framework”. I’ve decided to practice BDD in the development of GentleCMS (at least, from here forward anyways). I’m still finishing implementing the specifications for the code I have so far (yes, I know, naughty me for not doing it first from the beginning, but I promise that I’m repenting). From what I’ve gotten done so far, I have to say, I’m really impressed. It’s clear that a lot of refinement has gone into BDD and actually practicing it is definitely resulting in much more thorough testing than I normally would be doing. Beyond that, RSpec feels like a much more natural way of expressing testing code… er, specification code.

I had a few problems getting RSpec going, though. I initially was trying to use RSpec with the RSpec on Rails plugin, but it didn’t seem to supply rake tasks for some unknown reason. I wrote my own rake task for it instead, but there were bunches of problems and eventually I gave up on it. But… then I noticed that there was actually a Rails generator for RSpec on the RSpec homepage. I installed it and it worked… fairly well. Now, this isn’t really a criticism of RSpec or the RSpec generator, because actually, Test::Unit under Rails has the exact same problem (the RSpec generator actually borrowed this code). If you don’t have a valid connection to the database through database.yml, neither RSpec nor Test::Unit will work very well. There are some problems with fixtures because the testing frameworks assume you are using ActiveRecord. Which I’m not. So I did a little bit of hacking on both Rails’ testing code and the RSpec generator. Then I discovered that RSpec’s rake task isn’t keen on testing a Rails app that has no model specifications. So I had to hack that too since I have no ActiveRecord models. My general feeling here is that RSpec shouldn’t be throwing a generic exception in the case where no specifications are found. If no specifications are found, that may be because there’s nothing to specify, rather than because the programmer was too lazy to specify. This case should probably be handled more elegantly. IIRC, Rails’ Test::Unit support also has this problem (though I haven’t checked recently). Plus a huge chunk of my code resides in my Rails app’s lib directory, so I added a new rake task for executing the specifications for the lib directory.

Finally, RCov is awesome. And RSpec’s tight integration with RCov is really, really nice. There is a tiny, though easily fixable, problem with RSpec’s RCov support within the Rails environment. By default, RSpec tells RCov to exclude the lib/spec/ directory from the code coverage reporting, but in the Rails environment, the specifications are actually in the spec/ directory, so they aren’t excluded. I’d like to see the ability to pass in a list of directories to have RCov exclude from within the Rake tasks. Should be trivial to do. I’ll have a patch written for it shortly. Actually, I’ll probably write up patches for all of the issues I’ve run into so far. Largely because, right now, I’m the only person who can actually run my specifications since I’ve hacked my local copies of ActiveRecord, Rails, and RSpec to get things working correctly. I already submitted a patch/ticket regarding the Rails issue.

Of this I’m sure: RSpec is now my preferred choice for ensuring that my code works as intended.

FeedTools: Sporkmonger Blog

GentleCMS and Rails Day

I posted the code I wrote yesterday for Rails Day on GentleCMS to the RubyForge svn repository. I didn’t come even close to finishing it in the 24 hours, but that’s hardly a surprise for such an ambitious project. One does not write a viable competitor to Documentum or Typo3 in 24 hours. Anyways, if you’re interested in the code thus far, feel free to take a look. I’ll have something more presentable in a week or two, probably.

Oh, and Sam? I know you’re going to ask the question. Yes, I have every intention of supporting ETags. In fact, it would be silly for me not to, since all data stored by GentleCMS is stored as just regular files.
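
To show why plain files make ETags cheap, here is a sketch that derives a validator from a file's contents and compares it against an incoming If-None-Match value. This is only an assumption about how it could work: the post doesn't say how GentleCMS will compute ETags, and etag_for and not_modified? are hypothetical names.

```ruby
require 'digest/md5'

# Build a strong ETag from the file's contents. An mtime/size pair would
# be cheaper; a content digest is used here to keep the sketch simple.
# etag_for is a hypothetical helper, not GentleCMS code.
def etag_for(path)
  %("#{Digest::MD5.file(path).hexdigest}")
end

# True when the client's If-None-Match value matches the current file,
# in which case a 304 Not Modified could be sent instead of the body.
def not_modified?(path, if_none_match)
  !if_none_match.nil? && if_none_match == etag_for(path)
end
```

A server would call not_modified? with the stored file's path and the request header, and only read and send the representation when it returns false.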

FeedTools: Sporkmonger Blog
