Endorsing MidCOM's Migration to File System

08 September 2004

5 minute read

Current situation

To those unfamiliar with Midgard development, Midgard Framework itself has been written as a C system library and a PHP extension. The Framework provides services like authentication, templating and content storage. The actual Midgard applications are written on that foundation in PHP and stored in the Midgard database as pages and "snippet" objects.

MidCOM functions as a component system written on the application layer. With MidCOM, the actual point applications, or components are written following a strict set of coding standards and stored in the Midgard snippet tree. The components must provide a standardized set of interfaces, and all input and output with them is controlled by the MidCOM framework.

The fact that all application code is stored in the database causes several problems, the topmost being lack of good development tools and difficulty of version control. Torben explains these points well in the "Current problems" section of the mRFC.

Additional issue is performance. A component framework is a big thing, and its services and interfaces require many code libraries to be loaded. It has been calculated that an average MidCOM page request causes 2000 database queries. With careful optimization we could probably get the number down to about 1300, but that is still a lot. If we move the libraries to file system, only database I/O will be actual content and template queries. Also, if the code is stored in the file system, it can be compiled into faster byte code using tools like Zend Optimizer.

Reasoning for the old way

With all these issues, why did we do the illogical thing of putting our code into the database in the first place? The code snippet system was added to Midgard in the 1.4.0 cycle, and there were several reasons:

At that time most Midgard applications were much smaller
Storage in Midgard database provides metadata support. Licensing information, documentation and icons can be stored into the code objects themselves
Code can be replicated together with the site data
Editing code does not require filesystem access
Permissions come from Midgard's group system

An important consideration is also that the lack of file system access makes Midgard much more secure, as noted by Alexander Bokovoy in his Midgard 5th anniversary presentation. When the whole system accesses only the database, and even that only through the Midgard libraries, this cuts down the risk of security holes through bad coding. In its 5 years of history, there hasn't been a since publicized security hole in Midgard. And this even though many high-profile hacking targets like large organizations, military, financial institutions and governments run Midgard.

Changing deployment models

Back with Midgard 1.4.0 almost all Midgard applications were built in-house and were not distributed. This way version control of the code was more closely connected with the general staging/live routines of the developers. For this the fact that code could be replicated with content was advantageous.

Now most applications like MidCOM and its components are being developed in centralized fashion by the Midgard Community. In this setup code is developed and tested on separate systems, and then release versions are deployed widely across production servers. This way version control of the code between a content staging and live is not so important, as both will share the same component versions, and will differ only in versions of templates, configuration and local code.

Separated concepts

Moving MidCOM to file system not only makes development easier, but will also make the separation of centralized "system" code and local scripts clearer. MidCOM and its components are a part of the Midgard Framework, and are installed by the system administrator and not modified by a site developer. The things that a site developer will modify will be local and specific to a website, organization or a server environment, and so are stored into the Midgard database. This includes local configurations, output templates and local scripts.

The new way

If mRFC 0006 gets implemented, all MidCOM core code and components will be developed just like any PHP application, using the file system and include() statements. This means that version control and development will become substantially simpler.

However, what we lose there is the ability to install a whole MidCOM component or the framework from a single repligard .xml.gz file. I have proposed using the PEAR installer for this, and Torben has promised to include it to the mRFC.

The PEAR installer uses an XML configuration file for defining the files shipping with a package, and enables server administrators to install new software with a single, Debian APT-like command regardless of the operating system. The PEAR installer is available by default on PHP 4.3.0 and newer releases.

Using the common PHP installation methods of the PEAR community would make The Midgard Project far better connected with the PHP community in general and promote code sharing.

To promote this even further, the Midgard Community should fully embrace the PEAR Coding Standards, and use the same code documentation tools.

Theoretically MidCOM could even move to the PEAR repository, but it would probably be seen as too Midgard-specific by the PEAR community. It provides a quite generic component architecture, but all components and its methods like the URL parser depend heavily on the midgard-php extension.

mRFC status

Torben has still left the proposal to Draft state, and some things will have to be polished before it can be voted on.

If the mRFC passes the actual transition should be relatively simple. We can just use the Aegir FileSync AddOn to copy all code snippets of the MidCOM core and component directories to the file system, and then programmatically change all mgd_include_snippet() calls in the system to use MidCOM's internal $midcom->include() method.

The PEAR package description files should be easy to generate automatically. In addition, we should create a "MidCOM uninstaller" Repligard file that would include delete calls for all objects in the MidCOM 1.4.0 release.

The actual MidCOM sites will be easy to migrate, as most of them contain only a single mgd_include_snippet("/midcom/midcom") call. This could be either changed to a filesystem require() call, or the MidCOM system could ship a stub snippet for this.

Updated 13:20: There is some discussion about this on the midgard-dev mailing list.

Updated 2004-10-28: The vote on mRFC 0006 passed today, so we can finally start implementation.