Decoupling Content Management

Traditional content management systems are monolithic beasts. Just to make your website editable you need to accept the web framework imposed by the system, the templating engine used by the system, and the editing tools used by the system. Want to have a better user interface? Be prepared to rewrite your whole website, and to endure the pain of migrating content between different storage systems.

But none of this should be necessary. When web editing tools were more immature, it made sense for the same people to build the whole stack from database content models to web page generation and editing tools. But that was ten years ago. Now we can do better.

Here is what a traditional CMS looks like:

cms-monolithic-approach.png

As you can see, the whole system is a monolithic block. The CMS provides content storage, routing, templating, editing tools, the kitchen sink. Probably you're even tied to a particular relational database for content storage. Want to use a cool new editor like Aloha, or a different templating engine, or maybe a trendy NoSQL storage back-end? You'll have to convince the whole CMS project or vendor to switch over.

A much better picture would be something like the following:

cms-decoupled-approach.png

In this scenario, the concept of Content Management is decoupled. There is a content repository that manages content models and how to store them. This could be something like JCR, PHPCR, CouchDB or Midgard2. Then there is a web framework, responsible for matching URL requests to particular content and generating the corresponding web pages. This could be Drupal, Flow3, Django, CodeIgniter, Midgard MVC, or something similar. And finally there is the web editing tool. The web editing tool provides an interface for managing the contents of the web pages. This includes functionality like rich text editing, workflows and image handling.

The web editing tools have traditionally been part of the web framework, the framework serving forms and toolbars to the user as part of the generated web pages. But with modern browsers you could throw forms out of the window and just make pages editable as they are.

Common representation of content on the HTML level

How would the communication between the web editing tool and the backend work, then?

cms-decoupled-communications.png

First of all, the web editing tool has to understand the contents of the page. It has to understand what parts of the page should be editable, and how they connect together. If there is a list of news for instance, the tool needs to understand it well enough to let users add new news items. The easy way of accomplishing this is to add some semantic annotations to the HTML pages. These annotations could be handled via Microformats or HTML5 microdata, but the most power lies with RDFa.

RDFa is a way to describe the meaning of particular HTML elements using simple attributes. For example:

<div typeof="http://rdfs.org/sioc/ns#Post" about="http://example.net/blog/news_item">
    <h1 property="dcterms:title">News item title</h1>
    <div property="sioc:content">News item contents</div>
</div>

Here we get all the necessary information for making a blog entry editable:

  • typeof tells us the type of the editable object. On typical CMSs this would map to a content model or a database table
  • about gives us the identifier of a particular object. On typical CMSs this would be the object identifier or database row primary key
  • property ties a particular HTML element to a property of the content object. On a CMS this could be a database column

As a side effect, we also manage to make our page more understandable to search engines and other semantic tools. So the annotations are not just needed for UI, but also for SEO.
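
The mapping described above can be sketched in JavaScript. This is a minimal illustration, not the actual utility library: it assumes the annotated markup has already been parsed into a simple tree of plain objects. The `attrs`, `text` and `children` fields and the `extractEntity` name are hypothetical; in a browser the same attribute values would be read from DOM elements.

```javascript
// Hypothetical parsed form of the RDFa-annotated blog post shown earlier.
// In a browser these values would come from DOM elements instead.
const annotatedPost = {
  attrs: {
    'typeof': 'http://rdfs.org/sioc/ns#Post',
    about: 'http://example.net/blog/news_item'
  },
  children: [
    { attrs: { property: 'dcterms:title' }, text: 'News item title', children: [] },
    { attrs: { property: 'sioc:content' }, text: 'News item contents', children: [] }
  ]
};

// Collect the type, identifier and properties of one annotated object.
function extractEntity(node) {
  const entity = {
    type: node.attrs['typeof'], // content model, e.g. a database table
    id: node.attrs.about,       // object identifier, e.g. a primary key
    properties: {}              // property name -> current value
  };

  // Walk the descendants and record every element carrying a `property`.
  (function visit(current) {
    for (const child of current.children || []) {
      // A nested `about` starts a different object, so skip that subtree.
      if (child.attrs && child.attrs.about) continue;
      if (child.attrs && child.attrs.property) {
        entity.properties[child.attrs.property] = child.text;
      }
      visit(child);
    }
  })(node);

  return entity;
}

const post = extractEntity(annotatedPost);
console.log(post.properties['dcterms:title']); // prints "News item title"
```

The resulting plain object is the kind of data that could then be handed to a client-side model.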

Common representation of content on the JavaScript level

Having contents of a page described via RDFa makes it very easy to extract the content model into JavaScript. We can have a common utility library for doing this, but we also should have a common way of keeping track of these content objects. Enter Backbone.js:

Backbone supplies structure to JavaScript-heavy applications by providing models with key-value binding and custom events, collections with a rich API of enumerable functions, views with declarative event handling, and connects it all to your existing application over a RESTful JSON interface.

With Backbone, the content extracted from the RDFa-annotated HTML page is easily manageable via JavaScript. Consider for example:

objectInstance.set({title: 'Hello, world'});
objectInstance.save(null, {
    success: function(savedModel, response) {
        alert("Your article '" + savedModel.get('title') + "' was saved to server");
    }
});

This JavaScript would work across all the different CMS implementations. Backbone.js provides a quite nice RESTful implementation for communicating with the server via JSON, but it can easily be replaced with a CMS-specific implementation by overriding the Backbone.sync method. Look for example at the localStorage sync implementation for Backbone.js.
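
As an illustration of that override point, here is a minimal sketch of a replacement sync function. It follows the `(method, model, options)` signature that `Backbone.sync` is called with, where `method` is one of `create`, `read`, `update` or `delete`. The in-memory `store`, the `cmsSync` name and the stand-in model object are assumptions made so the example runs without Backbone; a real implementation would talk to the CMS instead.

```javascript
// In-memory stand-in for a CMS backend (an assumption for this sketch).
const store = new Map();
let nextId = 1;

// Same (method, model, options) signature that Backbone.sync is called with.
function cmsSync(method, model, options) {
  options = options || {};
  let response;

  switch (method) {
    case 'create':
      // Assign an identifier, as a server would for a new object.
      response = Object.assign({}, model.toJSON(), { id: 'item-' + nextId++ });
      store.set(response.id, response);
      break;
    case 'update':
      response = model.toJSON();
      store.set(model.id, response);
      break;
    case 'read':
      response = store.get(model.id); // undefined when nothing is stored
      break;
    case 'delete':
      store.delete(model.id);
      response = {};
      break;
    default:
      response = undefined;
  }

  if (response === undefined) {
    if (options.error) options.error('Cannot ' + method + ' ' + model.id);
    return;
  }
  if (options.success) options.success(response);
}

// Minimal stand-in for a Backbone model: just enough for cmsSync to use.
const article = {
  id: undefined,
  attributes: { title: 'Hello, world' },
  toJSON() { return Object.assign({}, this.attributes); }
};

let created;
cmsSync('create', article, {
  success(response) { created = response; }
});
console.log(created.id + ': ' + created.title); // prints "item-1: Hello, world"
```

With Backbone loaded, a function like this could be assigned to `Backbone.sync`, or to the `sync` property of an individual model or collection.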

New possibilities for collaboration

Once the different Content Management Systems describe their content with RDFa, and provide a unified JavaScript API to it, lots of things become possible. While most systems probably want to have their own look-and-feel, much functionality can still be shared. Consider for example:

  • Using the browser's localStorage for storing drafts of content edited by the user. Never lose content!
  • Collaborative editing via XMPP or WebSockets
  • Versioning and undo
  • Semantic enrichment of content using tools like Apache Stanbol

All of these would be quite hard for an individual CMS project to implement. But if we have a common JS layer available, the effort can be shared by all CMS projects implementing these ideas.
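
To make the localStorage idea concrete, here is a rough sketch of draft storage keyed by the `about` identifier of a content object. It relies only on the `getItem`, `setItem` and `removeItem` methods of the Web Storage API. The `createDraftStore` helper, the `draft:` key prefix and the in-memory storage object (used so the example also runs outside a browser) are assumptions for illustration.

```javascript
// Tiny in-memory object with the same three methods as window.localStorage,
// so the sketch can run where no real localStorage exists.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem(key) { return data.has(key) ? data.get(key) : null; },
    setItem(key, value) { data.set(key, String(value)); },
    removeItem(key) { data.delete(key); }
  };
}

// Hypothetical helper: keeps unsaved edits under "draft:<about identifier>".
function createDraftStore(storage) {
  const keyFor = (about) => 'draft:' + about;
  return {
    // Remember the current property values of an object being edited.
    save(about, properties) {
      storage.setItem(keyFor(about), JSON.stringify(properties));
    },
    // Return the stored draft, or null when there is none.
    load(about) {
      const raw = storage.getItem(keyFor(about));
      return raw === null ? null : JSON.parse(raw);
    },
    // Drop the draft, for example after a successful save to the server.
    discard(about) {
      storage.removeItem(keyFor(about));
    }
  };
}

// In a browser, pass window.localStorage here instead.
const drafts = createDraftStore(createMemoryStorage());
const about = 'http://example.net/blog/news_item';

drafts.save(about, { 'dcterms:title': 'Unsaved title' });
console.log(drafts.load(about)['dcterms:title']); // prints "Unsaved title"
```

Because the draft key is derived from the RDFa identifier rather than from anything CMS-specific, the same code could in principle be shared across systems.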

There have been prior efforts at doing something similar. In the early 2000s, OSCOM made the Twingle tool that was able to edit and save content with multiple CMSs. Then there was the Atom Publishing Protocol and the Neutrol Protocol efforts, and also CMIS. But all of these required the systems to implement some particular server-side protocol. The advantage of the approach promoted here is that the only server-side change needed is adding RDFa annotations to HTML templates, and the rest happens on the JavaScript level.

The new CMS interface we've built for Midgard2 already uses these concepts. Here at the Aloha Editor Developer conference we're talking with Drupal and TYPO3 developers about rolling out the same ideas in their systems. Other systems and projects are also more than welcome to participate.

Update: Work is underway to generalize the RDFa-Backbone.js bridge I originally wrote for Midgard Create. You can find it on GitHub. We're currently experimenting with it on both Midgard2 and TYPO3.

