Manage Database Changes with Liquibase

June 28, 2010

I love discovering a new tool, especially when it completely changes the way you approach developing software.  Liquibase is one of those tools.  Once I came across their site and started reading more about it and how simple it was, I started to reflect back on all the failed production migrations I had in the past, and I shed a tear for all the weekend hours I lost.  No longer will that be the case!

Liquibase

What is Liquibase?

Liquibase is an open source database change management tool built on Java.  Rather than writing SQL directly against the database to create, update or drop database objects, developers define their desired database changes in XML files.  The XML file, called a changelog, contains a list of changesets that define a desired database change in an database agnostic abstraction.  The changelog is intended to contain an evolving list of database changes the team would like to apply to a target database.  This list is additive over time.  Here is a simple example:

<databaseChangeLog>
   <changelog id="FOO-196-01" author="Mike McGarr" >
      <createTable tableName="users">
         <column name="id" type="int">
            <constraints primaryKey="true" nullable="false"/>
         </column>
         <column name="name" type="varchar(100)">
            <constraints nullable="false"/>
         </column>
      </createTable>
   </changelog>
</databaseChangeLog>

Liquibase can be executed through either the command line or as part of a build using Ant, Maven or the like (I would recommend build integration).  Liquibase will apply the changesets directly to the database and can handle rollbacks and tagging of database state.

How does it work?

When Liquibase is executed, you must specify the database against which to apply your changesets.  Liquibase uses two tables to manage changes to the database: databasechangelog and databasechangelock.  If the tables don't exist on the target database, they are created.  An entry is added to the databasechangelock table which ensures only one instance of Liquibase is running at a time.  The databasechangelog table contains a listing of every changeset that has been applied to this database.  Liquibase conducts a diff of the table contents with the XML files and determines which changes still need to be applied.  Once this is determined, Liquibase will then apply the changes to the database.  If you are new to Liquibase, then I recommend checking out their documentation on how to setup and get started.

Using Liquibase on a Team

When working on a team, I found that there are two key lessons to learn.  The first is to make sure that you organize your changelog files in version control.  The current Liquibase Best Practice page recommends organizing your changelog files in a single directory, with each file named by a release number.  A previous version of the Liquibase Best Practice page recommended using a folder per release and allowed for multiple changelog files per release.  This also allows developers to create a changelog file for a set of logical changes as part of a release.  Our team chose the later approach and have been successful with a small team.  I can see the advantages of using the "single file per release" approach as well, but the key point is you should choose one approach for organizing your files.

The second key lesson we learned was to utilize a separate database schema for each developer.  You do not want developers to be testing out their database changes against the same database schema prior to check-in.  For the same reason developers test their code changes locally on their own machine before checking code in, you want them to be able to test their database changes.

Guidelines for Using Liquibase

In addition to the team recommendations I outlined above, I have a few other guidelines for how to best utilize Liquibase in your environment.

  • Follow their Best Practices - When I first started using Liquibase, their Best Practices page had a number of recommendations on it that I found very helpful.  They have since trimmed this page down and I am not sure why.  Since their main site is a Wiki, I recommend reading the previous versions of this page.
  • Minimize Rollbacks - While in the midst of development, a developer should minimize the amount of rollbacks they conduct.  You should reserve rollbacks for failed migrations or backing code out from a test environment.  If a developer adds a table, and decides they no longer need this table, then simply add another entry in the changelog to drop the table.
  • Only Use Liquibase - If your team is starting to use Liquibase, it is a good idea to make it a rule that all database changes should be made through Liquibase.  If you only use Liquibase, it makes it easier to keep track of all the changes you made.  But if you only sometimes use Liquibase, there is good chance you will miss something when it comes time to migrate.
  • Consider Continuous Integration - I find it hard to envision using Liquibase without automating its use through a Continuous Integration Server.  By using a Continuous Integration Server, you can have your development database updated on a nightly basis, integrating all changes checked in that day.  This of course should coincide with a deployment of the application that matches the version of the database, which should also be automated through your CI Server.
  • Make Sure You Test! - It is important to add some integration tests to your automated build.  This helps ensure that your code has been updated to take into account the changes to the database schema.
  • Define a Volatile Schema - In our automated build, we defined a database schema that our nightly build points to that is only used for nightly builds.  Our Liquibase process runs before the integration tests that depend on these changes.  If our build fails for whatever reason, we don't want the database changes to have been migrated to our development database.  If the build is successful, then we re-run Liquibase, this time pointing it at the development database.  This takes longer but ensures that changes migrated to the development database will work.
  • Be Consistent about Naming! - Liquibase doesn't force you to name every constraint you define, which allows you to leave it up to the database to autogenerate a name.  This can get you into trouble, especially if some of your constraint names are explicitly defined and use the same naming convention as the database.  I recommend explicit naming to avoid this issue. (Remember, names should be portable between environments.)
  • Use a Context - Liquibase provides a context feature that allows you define when this changelog gets applied.  It is useful for defining changes that should only be applied to say a test or demo database.  Just add the context to the changeset and then when you configure your various databases, make sure each one defines the contexts it uses.  Don't get carried away though!
  • Reset your Changelogs Periodically - If your project uses Liquibase for long enough, you will find that you accumulate a lot of changelogs.  It is a good idea to use Liquibase to build a completely new changelog after a major release.  This will cut down on the time it takes to build.

A Caution about Schemas

What I found very early on is that you have to determine whether or not you are going to define schemas in your changelogs.  This decision should be based on the environment you are in.  In our environment, we decided to have the schema implied based on the default schema of the connection, which is usually the username.  This was due to the fact that each developer had their own schema on the same machine for development.  If we had hardcoded a schema for our application, this would mean each developer would need their own database server with that schema name.  We found this approach to be easier and it worked, so long as you have the correct database schema defined for the connection.

Share this:


comments powered by Disqus