Easy database migrations with C# and FluentMigrator

Database migrations are an increasingly common pattern for managing and automating the creation of your project's database schema. Typically, each migration has three elements:

  • A Unique Id – each new migration is given a numeric identifier higher than the previous one.
  • An UP component – describing the table / row / column / key you want to create
  • A DOWN component – which exactly reverses the change you’re making in the UP component.

Thus by running each of the migrations in order you can go from an empty database to the schema required by your project; or, if an error occurs, you can simply roll back any number of migrations to get back to a known stable state.

But why?

The advantages of this approach may not be immediately obvious, but I’d say the main ones are:

  • No separate SQL scripts, db exports etc.  All database migrations are contained within the project and can be reviewed and managed by the team – no DBA required.
  • Easy deployment onto as many servers as required – most medium to large projects will have a number of environments, minimally dev, test and live.  By creating DB migrations it’s simple to keep each of these environments in sync.
  • As the project grows, database migrations can be created as required, meaning you can easily roll out new changes to live projects.
  • Easy rollback – if you roll out a patch containing a migration, you can instantly roll it back, rather than discovering you don’t have rollback scripts.
  • Rollouts to the live db are usually possible with no downtime

Introducing FluentMigrator

FluentMigrator is a package that allows you to create “fluent” migrations within Visual Studio.  The easiest thing to do is just to demo a migration:


using FluentMigrator;

namespace Nant.Builder.Sample.Migrations
{
    [Migration(3)]
    public class Mig003_CreateUserRoleTable : Migration
    {
        private const string TableName = "UserRole";

        public override void Up()
        {
            Create.Table(TableName)
                .WithColumn("UserId").AsInt32().NotNullable()
                .WithColumn("RoleId").AsInt16().NotNullable();

            var compKey = new[] { "UserId", "RoleId" };
            Create.PrimaryKey("PK_UserRole").OnTable(TableName).Columns(compKey);

            Create.ForeignKey("FK_UserRole_User").FromTable(TableName).ForeignColumn("UserId").ToTable("User").PrimaryColumn("UserId");
            Create.ForeignKey("FK_UserRole_Role").FromTable(TableName).ForeignColumn("RoleId").ToTable("Role").PrimaryColumn("RoleId");
        }

        public override void Down()
        {
            Delete.ForeignKey("FK_UserRole_User").OnTable(TableName);
            Delete.ForeignKey("FK_UserRole_Role").OnTable(TableName);
            Delete.Table(TableName);
        }
    }
}

You can see that the migration class is decorated with a number, in this case 3.  The UP migration creates the UserRole table, its composite primary key and two foreign keys.  The DOWN migration exactly reverses this change: deleting the keys, then deleting the table itself.

Thus if a problem occurs during the creation of this table, FluentMigrator can roll back the migration, as described in your DOWN migration.

Running the migrations

Migrations on their own are of little value if you can’t run them against a target database.  FluentMigrator integrates with both MSBuild and Nant, meaning you can run migrations as part of your build process.  What’s nice about this is that DB changes only get run in if all your unit tests etc pass, and if you have a more sophisticated build script you can control which db migrations are rolled out onto which environments.

Readers of my blog will be familiar with my love of Nant and my Nuget package Nant.Builder.  So it won’t come as a great surprise to find out I’ve added a FluentMigrator target into my Nant.Builder scripts.

You can set up the appropriate values in Nant.xml.  You can see a working sample in Nant.Builder.Sample, which uses SQLite (note it will only run on 64-bit Windows):


<!-- Database Migration Settings - expects FluentMigrator -->
<property name="run.db.migrations" value="true" />
<!-- The name of the project containing your FluentMigrator migrations -->
<property name="migrations.project.name" value="Nant.Builder.Sample.Migrations" />
<!-- Database type, eg sqlserver2008, sqlite -->
<property name="database.type" value="sqlite" />
<!-- The connection string for the db you want to migrate -->
<property name="connection.string" value="Data Source=C:\dev\tools\sqlite\nant-builder-sample.sqlite;Version=3;" />
<!-- Set flag to true if you wish to rollback a db migration; you need to specify a migration number also -->
<property name="rollback.db.migrations" value="false" />
<!-- The migration number you wish to rollback to - WARNING be careful you don't delete data by setting this value incorrectly -->
<property name="rollback.to.version" value="0" />

Best Practice

As far as best practice goes, as I demonstrate in the sample project I’d advise:

  • Keep all the migration classes in a separate project
  • Number the migration classes in line with their migration number, ie Mig012_CreateTableX – this makes them easier to manage once you have more than a few migrations.

Here’s a snap of our migrations on one of our large projects; as you can see, after iteration 18 it occurred to us to start creating folders per iteration, allowing us to keep things a bit more ordered:

Conclusions

FluentMigrator takes a lot of the pain out of managing your database schema over multiple environments.  If there’s a negative, it’s that the project is a little bit flaky on databases other than SQL Server – I’ve tried it on SQL Server CE and SQLite, and it either didn’t work or only worked after poking at the code for a while.

On the positive side the project is being actively maintained and updated, I had a minor update accepted and committed within a few days of posting it.  So download it and get involved.

Testing with Selenium Webdriver, Visual Studio and NUnit

Update 20/06/12 – Updated NareshScaler to work with the new IEDriverServer that ships with the latest version of Selenium Webdriver

We have a multi-tenant solution at work.  As we added tenants we were happy to discover that our app scaled nicely, but we were sad to discover that Naresh, our lone QA superstar, did not.  We discovered (unsurprisingly) it wasn’t possible to stick to a fixed release schedule and do regression/integration testing manually.  So some additional automation was required.

Integration Testing Automation Requirements

Our requirements were:

  1. Naresh could create integration tests with minimum oversight from the team
  2. Tests needed to be created quickly without a lot of additional coding
  3. Ideally the tools would run in Visual Studio

Ruling out Specflow

I’d been keen to investigate Specflow and the BDD style of integration testing.  However, it quickly became apparent that this would require a significant investment of time to wire up the tests successfully, and a reasonable amount of dev work to get each test passing – thus failing requirements 1 and 2.

Ruling in Selenium Webdriver

We already had a bit of experience with Selenium, and we found this excellent blog post from Stephen Walther.  Reading it we realised that Selenium WebDriver met our requirements perfectly.  We could install Selenium IDE into Firefox, record our tests, and export the scripts as C# WebDriver classes.  We could then add the classes into a simple test-runner and we’d be able to scale Naresh 🙂

Some Selenium Pitfalls

However, it’s not all good news.  One thing to point out is that there can be a certain amount of flakiness when the various Selenium drivers are trying to locate elements on your page, ie a test will fail once, then pass again later.  This is obviously far from ideal, but overall I think the benefits outweigh the drawbacks.  Tests that fail consistently can be investigated.

One way to minimise these failures would be to run the tests in only one browser, the Firefox driver seems the most reliable.  If you’re not doing loads of Javascript this is probably safe enough.
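
Another mitigation that helps – my own sketch, not part of the exported macros – is to use an explicit wait rather than a bare FindElement call, so a slow-rendering page doesn’t fail the test immediately.  WebDriverWait lives in the Selenium support assembly (OpenQA.Selenium.Support.UI):

using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;

public static class WaitHelper
{
    // Polls for up to 10 seconds for the element to appear, ignoring the
    // NoSuchElementException thrown while the page is still rendering.
    public static IWebElement WaitFor(IWebDriver driver, By locator)
    {
        var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
        wait.IgnoreExceptionTypes(typeof(NoSuchElementException));
        return wait.Until(d => d.FindElement(locator));
    }
}

You’d then call WaitHelper.WaitFor(driver, By.Id("searchBoxInput")) in place of a bare driver.FindElement call.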

Introducing NareshScaler

I wrote NareshScaler to allow Naresh to quickly add each Selenium macro into the test-runner.  I’d also been wanting to try out creating a nuget package for a while, so this seemed like a perfect chance to give it a go.

You can install NareshScaler into your Integration Test project using the nuget package manager.  Once successfully installed, you should add your Selenium Webdriver class file(s).
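
From the NuGet Package Manager Console that’s just:

Install-Package NareshScaler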

Then simply mark each class as inheriting from NareshScalerTest.

You will now have to override the RunSeleniumTests method.  You simply need to wire up the driver and list any test methods in your Selenium WebDriver class, ie:

using NUnit.Framework;
using OpenQA.Selenium;
// NareshScalerTest is the base class supplied by the NareshScaler package

[TestFixture]
public class NugetOrgTest : NareshScalerTest
{
    private IWebDriver driver;
    private string baseURL;

    [SetUp]
    public void SetupTest()
    {
        baseURL = "http://nuget.org/";
    }

    public void Test_That_NareshScaler_Exists_On_NugetOrg()
    {
        driver.Navigate().GoToUrl(baseURL + "/");
        driver.FindElement(By.Id("searchBoxInput")).Clear();
        driver.FindElement(By.Id("searchBoxInput")).SendKeys("NareshScaler");
        driver.FindElement(By.Id("searchBoxSubmit")).Click();
        driver.FindElement(By.LinkText("Naresh Scaler")).Click();
    }

    public override void RunSeleniumTests(IWebDriver webDriver)
    {
        driver = webDriver;
        Test_That_NareshScaler_Exists_On_NugetOrg();
    }
}

We use NUnit here, so once everything is wired up, you should be able to point NUnit at your Integration Tests dll and see all your tests running in IE, Firefox and Chrome – which I have to admit is pretty cool when you see everything running automatically.
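
For example, using NUnit’s console runner (the dll name here is illustrative):

nunit-console.exe Your.IntegrationTests.dll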

Additionally, NareshScaler includes a Nant build file, to allow you to wire your integration tests into CruiseControl etc.  I’ve added a sample, so you can hopefully see how it works.  Hope you find it useful.

Windows Azure Diagnostics with SDK 1.6 for WebRoles

There appears to be a lot of conflicting and confused advice about configuring Diagnostics on Windows Azure.  The situation is not at all helped by Microsoft’s own site which, to paraphrase Morecambe and Wise, has all the right pieces of information, just not necessarily in the right order.

It doesn’t help that what used to work with earlier versions of the Azure SDK, no longer works with later versions.  So here I outline:

  • The steps to get Diagnostics outputting correctly to Windows Azure Storage with SDK 1.6 for WebRoles (although I’d imagine it’s largely the same for WorkerRoles)
  • Azure 1.5 ate my diagnostics – Another post where I update my Auto Packaging post to be compatible with SDK 1.6

Setting up Windows Azure Diagnostics for your WebRole with SDK 1.6

1. Configure Web.Config – required if you are using Trace statements

I use Log4Net for my general logging/tracing needs, so I don’t use Trace statements; thus the example shown in step 3 below does not require you to complete this step.

However, if you are using Trace statements,  ie:

System.Diagnostics.Trace.TraceError("Error has occurred");

You’ll need to configure Web.config as described here:

<system.diagnostics>
    <trace>
        <listeners>
            <add type="Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitorTraceListener,
                Microsoft.WindowsAzure.Diagnostics,
                Version=1.0.0.0,
                Culture=neutral,
                PublicKeyToken=31bf3856ad364e35"
                name="AzureDiagnostics">
                <filter type="" />
            </add>
        </listeners>
    </trace>
</system.diagnostics>

2. Initialise Diagnostics

As outlined here, you’ll need to ensure you add the Import element for the Diagnostics module in your ServiceDefinition.csdef file.  Here’s what mine looks like:

<?xml version="1.0" encoding="utf-8"?>
<ServiceDefinition name="HuzuSocial.Azure" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
    <WebRole name="HuzuSocial.App" vmsize="Small" >
        <Sites>
            <Site name="Web">
                <Bindings>
                    <Binding name="Endpoint1" endpointName="Endpoint1" />
                </Bindings>
            </Site>
        </Sites>
        <Endpoints>
            <InputEndpoint name="Endpoint1" protocol="http" port="80" />
        </Endpoints>
        <Imports>
            <Import moduleName="Diagnostics" />
        </Imports>
    </WebRole>
</ServiceDefinition>

Secondly you’ll need to add your Azure Storage Account details into your ServiceConfiguration.cscfg, mine looks like this (obviously replace with your account name and key):

<?xml version="1.0" encoding="utf-8"?>
<ServiceConfiguration serviceName="HuzuSocial.Azure" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="1" osVersion="*">
        <Role name="HuzuSocial.App">
        <Instances count="2" />
        <ConfigurationSettings>
            <Setting name="Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString" value="DefaultEndpointsProtocol=https;AccountName=[youracountnamehere];AccountKey=[youraccountkeyhere]/>
        </ConfigurationSettings>
        <Certificates>
        </Certificates>
    </Role>
</ServiceConfiguration>

3. Override the OnStart method in WebRole.cs

In the root of your web project you should have a WebRole class.  You’ll need to override the OnStart method to correctly initialise the diagnostics.  There’s loads of different sample code out there, some of it highly dubious.  This is my configuration, and it works well for me.  (I lifted it from a post out there somewhere; unfortunately I forgot to bookmark it and can no longer find it, so thank you, whoever you are.)

public override bool OnStart()
{
    DiagnosticMonitorConfiguration diagConfig = DiagnosticMonitor.GetDefaultInitialConfiguration();

    var perfCounters = new List<string>
    {
        @"\Processor(_Total)\% Processor Time",
        @"\Memory\Available Mbytes",
        @"\TCPv4\Connections Established",
        @"\ASP.NET Applications(__Total__)\Requests/Sec",
        @"\Network Interface(*)\Bytes Received/sec",
        @"\Network Interface(*)\Bytes Sent/sec"
    };

    // Add perf counters to configuration
    foreach (var counter in perfCounters)
    {
        var counterConfig = new PerformanceCounterConfiguration
                            {
                                CounterSpecifier = counter,
                                SampleRate = TimeSpan.FromSeconds(5)
                            };

        diagConfig.PerformanceCounters.DataSources.Add(counterConfig);
    }

    diagConfig.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(1.0);

    //Windows Event Logs
    diagConfig.WindowsEventLog.DataSources.Add("System!*");
    diagConfig.WindowsEventLog.DataSources.Add("Application!*");
    diagConfig.WindowsEventLog.ScheduledTransferPeriod = TimeSpan.FromMinutes(1.0);
    diagConfig.WindowsEventLog.ScheduledTransferLogLevelFilter = LogLevel.Warning;

    //Azure Trace Logs
    diagConfig.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(1.0);
    diagConfig.Logs.ScheduledTransferLogLevelFilter = LogLevel.Warning;

    //Crash Dumps
    CrashDumps.EnableCollection(true);

    //IIS Logs
    diagConfig.Directories.ScheduledTransferPeriod = TimeSpan.FromMinutes(1.0);

    DiagnosticMonitor.Start("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", diagConfig);

    return base.OnStart();
}
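
For reference, the snippet above assumes the usual usings at the top of WebRole.cs – something like:

using System;
using System.Collections.Generic;
using Microsoft.WindowsAzure.Diagnostics;   // DiagnosticMonitor, LogLevel, CrashDumps etc
using Microsoft.WindowsAzure.ServiceRuntime; // RoleEntryPoint, which WebRole derives from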

4. That’s it

When deployed to Azure, your diagnostics should be successfully transferred to Azure Storage.  To analyse them in any meaningful way, I’d recommend Cerebrata’s Azure Diagnostics Manager, which gives you a nice dashboard.

CRUD – Destroyer of Worlds

As web developers a large part of what we do is CRUD.  The average Service class in my project is at least 50% CRUD methods, and often a lot more.

If you’re anything like me, you’ve probably not given much thought to CRUD methods; you just write them as required.  However, I recently had a small epiphany: this approach was making the code base sprawl, and the team’s approaches to CRUD were non-standard.  That led to new CRUD methods being created when they weren’t required, because of the difficulty of understanding what was already there.

I wouldn’t start from here…

So how did it come to this?  I use the Repository pattern for my DB layer, and for whatever reason a lot of the documentation around this pattern has method signatures like this:

Customer GetCustomerById(int customerId)
Customer GetCustomerByAddress(Address address)

Thinking that this was the accepted pattern, I just blindly started adding this style of method to my code base, not only in the Repository but in the Service layer too.

Here’s an interface from our codebase for working with images, where you can see CRUD methods jumbled in with other methods to get PagedLists etc.  All in all, not very obvious:

public interface IMediaItemsService
{
    PagedList<MediaItem> GetForUserId(int userId, MediaTypeEnum mediaType, int page, int pageSize);
    List<MediaItem> GetForUsername(string username, MediaTypeEnum? mediaType, int? numRecords);
    void Save(MediaItem mediaItem);
    void Delete(MediaItem mediaItem);
    void Delete(int mediaItemId);
    void Delete(int mediaItemId, int currentUserId);
    MediaItem GetByMediaItemId(int mediaItemId);
    MediaItem GetByUniqueId(Guid uniqueId);
    IPagedList<MediaItem> GetAll(int page, int pageSize);
    IPagedList<MediaItem> Search(MediaSearchParams mediaSearchParams, int page, int pageSize);
    MediaItem Update(MediaForm mediaModel);
    IEnumerable<MediaItem> GetAllMediaItemsForRequest(Guid uploadRequestId);
}

The Anti-Pattern

What I’ve come to realise is that Byxxx naming is an anti-pattern that makes your interfaces far harder to work with.

Not only that, but we had some non-standard approaches: Creates were sometimes called Save (does that do updates too?), Deletes were sometimes called Remove, etc.  The upshot was that before adding any functionality you had to study the Service class to separate the CRUD methods from the other methods.  It was at this point I realised there was a much better approach to dealing with CRUD (credit must go to @leegunn for planting the germ of the idea).

Convention over Configuration, Overloading and Regions

In my opinion, the most important message from Convention over Configuration (taken from the Wikipedia article) is:

“… a software design paradigm which seeks to decrease the number of decisions that developers need to make…”

So I came up with the following CRUD conventions for our project:

  1. CRUD methods will be added at the top of each class, in a region called CRUD.
  2. All CRUD methods should be overloaded wherever possible.
  3. Overloads should ideally call down to one method containing all the business logic – see the sketch after this list (hopefully I’ll manage to expand on this in an additional post).
  4. CRUD methods should be entered in the order Create, Read, Update, Delete.
  5. Create methods will be named Create, and should always return the object that was created.
  6. Read methods will be named Retrieve where they return a single object.
  7. Read methods will be named RetrieveAll where they return an IEnumerable<T> collection of objects.
  8. Update methods will be named Update, and should always return the object that was updated.
  9. Delete methods will be named Delete and return void; an exception may be thrown if an error occurs during the transaction.
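
To illustrate convention 3, here’s a sketch (not lifted verbatim from our codebase) of how the four Delete overloads can funnel down into one method that owns the business logic.  The OwnerId property and _mediaItemsRepository field are illustrative only:

// Inside the MediaItemsService implementation
public void Delete(int mediaItemId, int currentUserId)
{
    Delete(Retrieve(mediaItemId), currentUserId, isAdmin: false);
}

public void Delete(MediaItem mediaItem, int currentUserId)
{
    Delete(mediaItem, currentUserId, isAdmin: false);
}

public void Delete(int mediaItemId, int currentUserId, bool isAdmin)
{
    Delete(Retrieve(mediaItemId), currentUserId, isAdmin);
}

public void Delete(MediaItem mediaItem, int currentUserId, bool isAdmin)
{
    // The single home for the business rules: permission checks,
    // then the actual delete
    if (!isAdmin && mediaItem.OwnerId != currentUserId)
        throw new InvalidOperationException("User cannot delete this media item");

    _mediaItemsRepository.Delete(mediaItem);
}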

The Results

By applying these rules to the interface above we get:

public interface IMediaItemsService
{
    #region CRUD

    MediaItem Create(MediaItem mediaItem);
    MediaItem Retrieve(int mediaItemId);
    MediaItem Retrieve(Guid uniqueId);
    IEnumerable<MediaItem> RetrieveAll(Guid uploadRequestId);
    IEnumerable<MediaItem> RetrieveAll(string username, int? numRecords);
    MediaItem Update(MediaForm mediaModel);
    MediaItem Update(MediaItem mediaItem);
    void Delete(MediaItem mediaItem, int currentUserId);
    void Delete(int mediaItemId, int currentUserId);
    void Delete(MediaItem mediaItem, int currentUserId, bool isAdmin);
    void Delete(int mediaItemId, int currentUserId, bool isAdmin);

    #endregion

    IPagedList<MediaItem> GetAll(int page, int pageSize);
    PagedList<MediaItem> GetForUserId(int userId, MediaTypeEnum mediaType, int page, int pageSize);
    IPagedList<MediaItem> Search(MediaSearchParams mediaSearchParams, int page, int pageSize);
}

By leveraging overloading wherever possible we’ve got rid of Byxxx, which makes working with Reads a lot easier – I can just type MediaItemsService.Retrieve and study the available overloads.  Same for Updates, Deletes etc.

Additionally, by separating your interfaces into CRUD and non-CRUD methods, it becomes easier to spot when a new class, factory etc might be appropriate.  In our case we can see that the non-CRUD methods could do with being refactored, possibly renamed to RetrieveAllPaged etc.  The point being, the refactoring is now easy to spot.

So by conforming to CoC principles for CRUD methods, we’ve definitely decreased the number of decisions our devs need to make.  Score! 🙂

Postscript

While writing this post I came across an intriguing idea, mentioned at the bottom of this Stackoverflow question, about using lambda expressions within your CRUD interface to allow you to effectively pass a query through to the DB.  This would drastically cut down on overloads, not to mention repository methods.

So the interface would be:

T Single<T>(Expression<Func<T, bool>> expression) where T : class, new();

Usage would be:

var user = _dbService.Single<User>(user => user.ID == 12);

There are a few things that could make this tricky to work with, especially if you needed to enforce business rules etc.  However, as a starting place for working with CRUD it’s a really smart idea, and warrants more investigation.
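
To make the idea concrete, here’s a minimal sketch of how such a Single<T> method might be implemented – over an in-memory list for illustration; with a LINQ-capable ORM (Entity Framework, NHibernate etc) the expression tree would be translated to SQL instead of being evaluated in memory:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class InMemoryDbService
{
    // Stand-in for a real data context or ORM session
    private readonly List<object> _rows = new List<object>();

    public void Add(object row)
    {
        _rows.Add(row);
    }

    public T Single<T>(Expression<Func<T, bool>> expression) where T : class, new()
    {
        // Queryable.Single accepts the expression tree directly; an ORM
        // provider would translate it rather than compile and run it here
        return _rows.OfType<T>().AsQueryable().Single(expression);
    }
}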

Update 07/12/11

After comments from @alanjmburns, I decided he was right: the Delete methods are better returning void rather than a status, and throwing an exception if necessary.

Resharper – Going too var?

Since installing Resharper 5.1 about a year ago, I’ve found it to be an invaluable tool.  One of its more controversial tenets is that all variables should be implicitly typed rather than explicitly typed, ie:

string rockStatus = "Iain rocks the casbah!";

Should be

var rockStatus = "Iain implicitly rocks the casbah!";

The rationale for this is held to be two-fold:

  1. It makes your code more readable, ie Dictionary<int, object> example = new Dictionary<int, object>(); is a pain to write and read – much better to just use var.
  2. It encourages developers to give their variables more meaningful names, since the type is not present to impart additional meaning.

Aside – personally I think another reason is it makes C# a bit more like the dynamic languages that all the cool kids are running (hello Ruby).  “Dude, thinking about what types my variables will hold gets in the way of discovering the conceptual model of what I’m building.  There is no spoon, maaaaan.”

Having resisted the notion for a while, I came round to this way of thinking (re points 1 and 2, not about the spoon ;-)), and generally I think Resharper is correct.

However, refactoring some code today, I realised there are a couple of potential pitfalls, one that could be very common in Web programming with MVC, and a less common scenario.

1. Using an implicitly typed variable for a value that is set by an external method and then returned to a view.

ie:

public ActionResult Index()
{
    var model = _activityStreamService.GetActivityStream("iain");
    return View(model);
}

Now if you, as I did, refactor the external method to return a different type, the compiler doesn’t help you: your var silently takes on the new return type, and the first you know about it is when the view blows up.  On a complex system this may go unnoticed for a while.
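
The simple defence is to explicitly type variables that cross a method boundary like this, so the changed return type fails at compile time rather than in the view.  A sketch – ActivityStreamModel is a made-up name for whatever the service actually returns:

public ActionResult Index()
{
    // Compiler error here if GetActivityStream stops returning ActivityStreamModel
    ActivityStreamModel model = _activityStreamService.GetActivityStream("iain");
    return View(model);
}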

2. Using an implicitly typed variable that is set by an external method and then downcast when processed by another method.

ie (imagine Helpers is a different class; I’ve just put them together for quickness):

public static string ToJson(object serializeMe)
{
    var jSerializer = new JavaScriptSerializer();
    return jSerializer.Serialize(serializeMe);
}

public ActionResult Index()
{
    var activityStream = _activityStreamService.GetActivityStream("iain");
    var json = Helpers.ToJson(activityStream);

    return View(json);
}

In this example, if we refactor the return type of GetActivityStream we may be in even more serious trouble: because ToJson accepts a plain object, everything still compiles and runs, so you might completely miss that there’s a problem until your customers complain.

The counter-arguments are that unit tests would hopefully catch this scenario (although interestingly, if you’re using implicitly typed variables in your unit tests, they might not – plus who honestly has 100% coverage?).  Also, return types are rarely refactored; you might change the body of a method, but if it started off returning a string it probably always will.

But I think the more important point is that there are cases where using implicit typing is a bad choice, and may store up trouble for you, especially in large projects.

SEO Friendly Strings, with a simple ToSeo() extension

In web development you’ll spend a lot of time converting blog titles, image titles and page titles into an SEO-friendly format, ie:

iains-blogs-rock-my-world

In the past I would have written a static helper utility method to do this, eg:

string convertMe = "Iain’s Blogs Rock My World";
string seoString = Utils.ConvertToSeo(convertMe);

However, C# 3 gave us the awesome power of extension methods, which in the formal words of MSDN are:

Extension methods enable you to “add” methods to existing types without creating a new derived type, recompiling, or otherwise modifying the original type.

Or to put it another way, you can add methods to classes you didn’t write, be they classes in the framework or classes written by charlatans like my good self.

So we want to add a ToSeo() method onto the string object. To perform this black magic you must ensure the following is true:

  1. Your extension method must be static
  2. The first parameter specifies which type the method operates on, and the parameter must be preceded by the this keyword
  3. The extension method must live in a public static class

So in our case we have:

using System.Text;

namespace HuzuSocial.Web.Helpers
{
    public static class Extensions
    {
        public static string ToSeo(this string str)
        {
            if (str == null)
                return null;

            // Remove any punctuation
            var sb = new StringBuilder();
            foreach (char c in str)
            {
                if (!char.IsPunctuation(c))
                    sb.Append(c);
            }

            // Replace spaces with dashes and return
            return sb.ToString().ToLower().Replace(" ", "-");
        }
    }
}

And now our new ToSeo() extension is available on all strings, as long as we have a using directive for the namespace containing our Extensions class.
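
For example (with HuzuSocial.Web.Helpers referenced):

string seoString = "Iain’s Blogs Rock My World".ToSeo();
// seoString == "iains-blogs-rock-my-world"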

I think this is a definite improvement over the old utils class approach, and gives us a more flexible and natural way of adding functionality.