Categories
josephscott

Source Control for Databases

Josh Berkus asks a great question, what options are there for source control management (SCM) of databases? He focuses on three specific issues:

  • Database Diffs are DDL
  • Some Data Gets Copied, Some Doesn’t
  • A Database SCM Tool Should Control Source and Update the Database in One Action

I’d love to be able to get a SQL based diff for database schemas for different versions of an application. A tool like that would be a huge step by itself. So what does it take to tokenize SQL syntax? Perhaps CPAN‘s SQL::Statement would good enough. It seems to understand a fair bit of SQL syntax. Additional work would have to be done get a set of ALTER TABLE commands to update the schema.

Such a tool will also fall prey to the issue support multiple databases. The two obvious targets to start with would be PostgreSQL and MySQL.