Delivering database updates

While there are several existing products for designing database models (like power amc, generally commercial) and for developing on databases (SQL clients like sql developer, mysql query brower, ..., generally open source), the aspect of delivering updates to a database seriously lacks of productive tools.

In this article we will try to expose the specific problematic of database deliveries to explain how our IDE can help you.

Understanding database deliveries

Databases are a persistent storage system, it can therefore not be dropped and replaced each time you want to alter the data model as this would imply to loose all your precious data. Instead, you need to update the database structure & data by submitting incremental instructions which will tell the database server what to add, what to remove and what to update in the existing structure.

The following diagram represents the evolution of a production database over time :

Known states

The formula that can guarantee quality database upgrades is to always have your target database in ''known states''. The process of upgrading a database is then to shift the database from a known state to a next known state.

Indeed, with databases you never know at development time what you will have at runtime. With the application tier for example you build your runtime from your code base which implies that you know exactly what is being built and what will therefore be deployed. In the database world, you have no ''runtime'' since the runtime is the database and is hosted and administrated by your client (or production server). The result of all your developments is a set of ''upgrade'' scripts. And those scripts will expect a very specific initial configuration and will be able to alter it to another very specific target configuration of the database.

Knowing those states and having the ability to validate (or invalidate) them is the key aspect of database deliveries. As soon as you know that your ''expected initial state'' is reliable, your upgrade script has 90% of chances to succeed. The other 10% might come from the data, but we'll see this aspect in another article.

Validating states

Since our ability to rely on the expected states is critical for our database deployment, validating the states '''before''' and '''after''' applying a change is a key.

Initial state validation enforces the following :

Nobody has altered the database structure manually
The assertions from the script about to be deployed are guaranteed

Target state validation enforces the following :

Whatever the errors I had or I did not have during the execution of the upgrade, the database has been upgraded to the expected target state
The database schema expected by the dependent application tier has been properly installed and will work as expected
The target state of an upgrade will be the initial state of the next upgrade so this needs to be OK

Taking control from development

Now that we had an overview of the deployment part, we can try to tackle this problem from a development point of view. Because everything comes from the development.

We see many attempts to solve this database deployment "''problem''" by deployment-oriented tools. For example, many will tell you to make your developments on a database and then when you are ready to deploy to use some kind of ''database comparison'' tool which will be able to compare this database with another one in the previous version and which will generate the SQL scripts resulting from the comparison. Some software vendors have even developped very advanced features of ssh tunneling so that you can access your production database to generate the diff.

To our eyes, this kind of approach is bad and is only a workaround which does not completely answer to the problem (or it answers to a different question). Here is why this is bad :

We are talking about '''known''' states being the key aspect of good deployment : how could you trust an ''open bar'' development database to be your reference for building your deliveries ?
These solutions assumes that the reference database is reliable, why should it be ?

Our key concept here is a "''Trust no one''" rule.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Delivering database updates

Understanding database deliveries

Known states

Validating states

Taking control from development

Clone this wiki locally