Josh Berkus asks a great question, what options are there for source control management (SCM) of databases? He focuses on three specific issues:
- Database Diffs are DDL
- Some Data Gets Copied, Some Doesn’t
- A Database SCM Tool Should Control Source and Update the Database in One Action
I’d love to be able to get a SQL based diff for database schemas for different versions of an application. A tool like that would be a huge step by itself. So what does it take to tokenize SQL syntax? Perhaps CPAN‘s SQL::Statement would good enough. It seems to understand a fair bit of SQL syntax. Additional work would have to be done get a set of ALTER TABLE commands to update the schema.
Such a tool will also fall prey to the issue support multiple databases. The two obvious targets to start with would be PostgreSQL and MySQL.