Intro to Source-Control for SQL Server Development

Mar 9 2007

[Minor updates 2009-07-16 during re-posting.]

It seems developers coming from a “compiled code” background into database management struggle at first with how to version-control the database artifacts. Turns out, it isn’t really that much different. The first step is to reframe the question: “How do I version control the scripts that create my database?”

Since my good friend Chris recently asked me to remind him how we did it at a company where we worked together, I thought I’d write up a poor-man’s starter way of doing source control for database development. This works no matter what development tool or source-control system you use.

First, organize your source files:

  • One file for bootstrapping. This includes "create database" and any other necessary artifacts like filegroups, custom roles, etc.
  • One file per “object” like procedures, views, triggers, functions. Anything you’d want to have an independent log of changes on. For me this is usually everything except "schema" changes, by which I mean Tables with their associated constraints and indexes.
  • One file per "schema version update." I tend to roll together any table/index updates with whatever DML is needed to transform and move the data into the new schema. I keep a "Version" table in the database itself (created during bootstrap) that can track these updates.
  • Then, once per version update, create a little *.bat or *.vbs file that combines the necessary files in the right order for deployment.

This general strategy has worked well regardless of the source-control system I'm using.

(Aside: I recently started a little .NET utility to make this pattern easier to apply; it relies an a single XML manifest file listing the scripts in dependency order each with an attribute of which 'version' they were last modified in. The DB has a 'version' table, and provides a 2-proc interface for "GetCurrentVersion" and "UpdateVersion", then the *.exe can determine which scripts to run in which order. Updating objects in source is relatively painless because they can be checked in (n) times during iterations and the manifest is only updated when I'm confident enough in it to include in the official build/release/rollout.)

If you use the full Team Suite editions of Visual Studio, I'd also recommend Visual Studio for Database Professionals, which automates shredding the DB structure into a script&local-only environment for editing and builds deployment scripts for you. My only gripes with it are (a) it obviously doesn't handle any custom data transforms you might need during a schema update; (b) it shreds the DB a bit too granularly for my tastes -- each column, each index, each constraint, etc.

Some general tips for painless source-control:

  • Make scripts "rerunnable". For DDL, I always use some variation of "if not exists... then /* do something */". For objects like procedures and triggers I use the pattern "if not exists, create <some-stub-object>; ... alter <real-object-code>". This emulates Oracle's "create or replace package" syntax which I always liked. The script works the very first time and is re-runnable.
  • Include copious "print" statements for easily seeing what happened or logging your rollout to a file.
  • At the top of files, include a statement that raises a severe error if you're connected to Master or something. Like this:

if (db_name() in ('master','model','msdb','tempdb','Northwind','pubs','AdventureWorks'))
    raiserror('Must not target a built-in or sample database.', 20, 1) with log

  • Create ad hoc batch files to apply handfuls of updates as needed during development. This especially helps coordinate with other developers during tight iterations where you don't want the "real build" to see the changes yet.
  • Use CodeSmith or some other tool to code-generate most of the cute little change-script stubs. I plug in TableName, ColumnName and Index Options, and get a nice little rollout script.

Some DB source-control anti-patterns:

  • Don’t deploy new databases by (a) restoring a backup or (b) attaching a whole DB set of files. Ick. You lose the ability to track edits over time and you bloat your source control immensely!
  • Don’t combine all schema and object changes into one huge monolithic script to update the version. You might have checkin history on that one file, but every change ever is in one file (needle in haystack problem—good luck finding the version when you messed up the tax-calculation logic!) and multiple developers working together is messy or impossible.
  • Don’t develop against a 'live' database then using some—hopefully robust—third party tool to magically figure out your changes for you. I've never seen this lazy approach work robustly. Usually more time is spent validating the changes than it would have taken to just write the change scripts by hand.


I hope to post more in the future on this, including some sample files. I'm interested to hear your thoughts on what works well for you.

Database, MS SQL Server ,

Comments

Add comment




biuquote
  • Comment
  • Preview
Loading