Month: October 2012

Windows Azure Cloud Storage using the Repository pattern

Repository pattern instead of an ORM but with added Unit of Work and Specification patterns

When querying Azure Tables you will usually use the .NET client to the RESTful interface. The .NET client provides a familiar ADO.NET syntax that is easy to use and works wonderfully with LINQ. To prevent the access code becoming scattered through your code you should collect it into some kind of data access layer (DAL). You should also be thinking about the testability of your code, and the simplest way to provide this is to put interfaces in front of your data access code. Okay, so there’s nothing earth-shattering here, but getting the patterns together and learning to use Azure Tables to their best is probably new to you or your project.

IRepository

What do you want to provide to every object that needs a backing store? I’d suggest searching and saving, so here are the two methods every repository is going to need.

public interface IRepository<TEntity> where TEntity : TableServiceEntity
{
  IEnumerable<TEntity> Find(params Specification<TEntity>[] specifications);

  void Save(TEntity item);
}

IEntityRepository

What about getting back a particular entity, making changes and saving that back? The first thing to note is that in Azure Tables an entity is stored in the properties of a Table row *but* other entities may also be stored in the same Table. So think entity and not table, which is different to how you would normally think of a repository.

Let’s say for this example I want to be able to get a single entity, a range of entities, to be able to delete a given entity and even to page through a range of entities.

To keep the code cleaner I’m going to pass in the parameters as already formed predicates for my where clause. There’s little advantage to using the Specification pattern here, other than that I think it makes the code a little more explicit.

public interface IEntityRepository : IRepository<Entity>
{
    void Delete(Entity item);

    Entity GetEntity(params Specification<Entity>[] specifications);

    IEnumerable<Entity> GetEntities(
        params Specification<Entity>[] specifications);

    IEnumerable<Entity> GetEntitiesPaged(
        string key, int pageIndex, int pageSize);
}

EntityRepository

public class EntityRepository : RepositoryBase, IEntityRepository
{
    public EntityRepository(IUnitOfWork context) 
        : base(context, "table")
    {
    }

    public void Save(Entity entity)
    {
        // Insert or Merge Entity aka Upsert (>= v.1.4).
        // In case we are already tracking the entity we must 
        // first detach for the Upsert to work.
        this.Context.Detach(entity);
        this.Context.AttachTo(this.Table, entity);
        this.Context.UpdateObject(entity);
    }

    public void Delete(Entity entity)
    {
        this.Context.DeleteObject(entity);
    }

    public Entity GetEntity(
        params Specification<Entity>[] specifications)
    {
        return this.Find(specifications).FirstOrDefault();
    }

    public IEnumerable<Entity> GetEntities(
        params Specification<Entity>[] specifications)
    {
        // new ByKeySpecification("partitionKey")
        return this.Find(specifications);
    }

    public IEnumerable<Entity> GetEntitiesPaged(
        string partitionKey, int pageIndex, int pageSize)
    {
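        // Find materializes the whole partition, so the paging below
        // happens in memory, not in the storage query.
        var results = this.Find(
            new ByPartitionKeySpecification(partitionKey));

        return results.Skip(pageIndex * pageSize).Take(pageSize);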
    }

    public IEnumerable<Entity> Find(
        params Specification<Entity>[] specifications)
    {
        IQueryable<Entity> query = 
            this.Context
            .CreateQuery<Entity>(this.Table)
            .AsTableServiceQuery();

        query = specifications.Aggregate(
            query, (current, spec) => 
            current.Where(spec.Predicate));

        return query.ToArray();
    }
}
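The Specification<Entity> type and ByPartitionKeySpecification that the repository relies on are not listed in this post. Below is a minimal sketch of what they might look like, inferred from the way Find uses spec.Predicate. The Entity stand-in, ByRowKeySpecification and the SpecificationQuery.Apply helper are assumptions for illustration only, and LINQ to Objects stands in for the table query so that the sketch runs without the storage SDK.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Stand-in for the stored entity (assumption: the real class would
// derive from TableServiceEntity, which supplies these two keys).
public class Entity
{
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }
}

// Assumed shape: a specification is a named, reusable predicate.
public abstract class Specification<TEntity>
{
    public abstract Expression<Func<TEntity, bool>> Predicate { get; }
}

public class ByPartitionKeySpecification : Specification<Entity>
{
    private readonly string partitionKey;

    public ByPartitionKeySpecification(string partitionKey)
    {
        this.partitionKey = partitionKey;
    }

    public override Expression<Func<Entity, bool>> Predicate
    {
        get { return e => e.PartitionKey == this.partitionKey; }
    }
}

// Hypothetical second specification, to show that they compose.
public class ByRowKeySpecification : Specification<Entity>
{
    private readonly string rowKey;

    public ByRowKeySpecification(string rowKey)
    {
        this.rowKey = rowKey;
    }

    public override Expression<Func<Entity, bool>> Predicate
    {
        get { return e => e.RowKey == this.rowKey; }
    }
}

// Hypothetical helper that performs the same fold as Find.
public static class SpecificationQuery
{
    public static IQueryable<Entity> Apply(
        IQueryable<Entity> query,
        params Specification<Entity>[] specifications)
    {
        // Each Where narrows the previous query, so the
        // specifications are effectively ANDed together.
        return specifications.Aggregate(
            query, (current, spec) => current.Where(spec.Predicate));
    }
}
```

Because every specification adds another Where to the same query, passing both a partition key and a row key specification narrows the result to a single entity, while passing none returns the query untouched.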

It’s easy enough to pass in a context for your repository following the Unit of Work pattern. You can create this quite simply (see TableStorageContext following). You have to define which Table your entity is stored in and you want that and your context as properties of your class. I find it cleaner to manage (and easier for the next developer to implement) if that work is done in a base class, RepositoryBase.

public class RepositoryBase
{
    public RepositoryBase(IUnitOfWork context, string table)
    {
        if (context == null)
        {
            throw new ArgumentNullException("context");
        }

        if (string.IsNullOrEmpty(table))
        {
            throw new ArgumentNullException(
                "table", "Expected a table name.");
        }

        this.Context = context as TableServiceContext;
        this.Table = table;

        // belt-and-braces code - 
        // ensure the table is there for the repository.
        if (this.Context != null)
        {
            var cloudTableClient = 
                new CloudTableClient(
                    this.Context.BaseUri, 
                    this.Context.StorageCredentials);
            cloudTableClient.CreateTableIfNotExist(this.Table);
        }
    }

    protected TableServiceContext Context { get; private set; }

    protected string Table { get; private set; }
}

So now we actually get to the meat of the matter and implement the CRUD functionality we need on top of TableServiceContext. In this example I’ve a single Save method that uses the ‘Upsert’ (InsertOrMerge) functionality available in Azure since v.1.4 (2011-08). The Find method is there for convenience; if it doesn’t suit your query then simply don’t use it.

TableStorageContext

public class TableStorageContext : TableServiceContext, IUnitOfWork
{
    // Constructor allows for setting up a specific 
    // connection string (for testing).
    public TableStorageContext(string connectionString = null)
        : base(
            BaseAddress(connectionString),
            CloudCredentials(connectionString))
    {
        this.SetupContext();
    }

    // NOTE: the implementation of Commit may vary depending on 
    // your desired table behaviour.
    public void Commit()
    {
        try
        {
            // Insert or Merge Entity aka Upsert (>=v.1.4) uses 
            // SaveChangesOptions.None to generate a merge request.
            this.SaveChanges(SaveChangesOptions.None);
        }
        catch (DataServiceRequestException exception)
        {
            var dataServiceClientException =       
                exception.InnerException as 
                DataServiceClientException;
            if (dataServiceClientException != null)
            {
                if (
                    dataServiceClientException.StatusCode == 
                    (int)HttpStatusCode.Conflict)
                {
                    // a conflict may arise on a retry where it
                    // succeeded so this is ignored.
                    return;
                }
            }

            throw;
        }
    }

    public void Rollback()
    {
        // TODO: clean up context.
    }

    private static string BaseAddress(string connectionString)
    {
        return CloudStorageAccount(connectionString)
            .TableEndpoint.ToString();
    }

    private static StorageCredentials CloudCredentials(
        string connectionString)
    {
        return CloudStorageAccount(connectionString).Credentials;
    }

    private static CloudStorageAccount CloudStorageAccount(
        string connectionString)
    {
        var cloudConnectionString = 
            connectionString ?? 
                CloudConfigurationManager
                .GetSetting("CloudConnectionString");
        var cloudStorageAccount =     
            Microsoft.WindowsAzure.CloudStorageAccount.Parse(
                cloudConnectionString);
        return cloudStorageAccount;
    }

    private void SetupContext()
    {
        /*
         * This retry policy introduces a greater delay when there
         * are retries than the original setting of 3 retries in
         * 3 seconds, but a problem with the system will then show
         * up without the system failing completely.
         */
        this.RetryPolicy = 
            RetryPolicies.RetryExponential(
                RetryPolicies.DefaultClientRetryCount, 
                RetryPolicies.DefaultClientBackoff);

        // don't throw a DataServiceRequestException when 
        // a row doesn't exist.
        this.IgnoreResourceNotFoundException = true;
    }
}

In my ServiceDefinition config I have a CloudConnectionString. This has to be parsed to get the endpoint and account details before I can create the TableServiceContext. A couple of static methods do the job. This object also implements the Commit and Rollback methods for the Unit of Work. My Commit is implementing ‘Upsert’ so you may want it to be different or you may want to have different implementations of TableStorageContext that you can pass in to your Repository class depending on how it needs to talk to storage.
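IUnitOfWork itself isn’t listed above. Judging by the Commit and Rollback methods on TableStorageContext it probably needs no more than the two members below. The rest of this sketch is hypothetical: UnitOfWorkExtensions.Run shows one way calling code might drive a unit of work, and RecordingUnitOfWork is a test double that only counts calls.

```csharp
using System;

// Assumed shape, inferred from the members TableStorageContext implements.
public interface IUnitOfWork
{
    void Commit();

    void Rollback();
}

// Hypothetical helper: run some work, commit on success,
// roll back and rethrow on failure.
public static class UnitOfWorkExtensions
{
    public static void Run(this IUnitOfWork unitOfWork, Action work)
    {
        if (unitOfWork == null)
        {
            throw new ArgumentNullException("unitOfWork");
        }

        if (work == null)
        {
            throw new ArgumentNullException("work");
        }

        try
        {
            work();
            unitOfWork.Commit();
        }
        catch
        {
            unitOfWork.Rollback();
            throw;
        }
    }
}

// Hypothetical test double that records how it was used.
public class RecordingUnitOfWork : IUnitOfWork
{
    public int Commits { get; private set; }

    public int Rollbacks { get; private set; }

    public void Commit()
    {
        this.Commits++;
    }

    public void Rollback()
    {
        this.Rollbacks++;
    }
}
```

Note that RepositoryBase casts its context to TableServiceContext, so a double like this suits code that sits above the repositories rather than the repositories themselves.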

Further Architectural Options

I favour Uncle Bob’s Clean Architecture and as such I wouldn’t expose my Repository classes to other modules. I would wrap them in a further service layer that would receive and pass back Model objects. Cloud Table Storage is much more flexible than relational database storage but you have to think about it quite differently and the structure of your code will be very different to what you may be used to.
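To make the service layer idea a little more concrete, here is a rough sketch. Every name in it (Customer, CustomerEntity, ICustomerRepository, CustomerService, InMemoryCustomerRepository) is hypothetical and not part of the repository project. The repository interface is a cut-down stand-in for IEntityRepository so that the example runs without the storage SDK, and committing is left to whoever owns the unit of work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical storage entity: the shape that lives in the table.
public class CustomerEntity
{
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }
    public string Name { get; set; }
}

// Hypothetical model object: the shape other modules see.
public class Customer
{
    public string Region { get; set; }
    public string Id { get; set; }
    public string Name { get; set; }
}

// Cut-down stand-in for a repository interface.
public interface ICustomerRepository
{
    IEnumerable<CustomerEntity> GetByPartition(string partitionKey);

    void Save(CustomerEntity item);
}

// The service layer maps between models and entities, so the
// storage keys never leak out to callers.
public class CustomerService
{
    private readonly ICustomerRepository repository;

    public CustomerService(ICustomerRepository repository)
    {
        if (repository == null)
        {
            throw new ArgumentNullException("repository");
        }

        this.repository = repository;
    }

    public IList<Customer> GetCustomers(string region)
    {
        return this.repository.GetByPartition(region)
            .Select(e => new Customer
            {
                Region = e.PartitionKey, Id = e.RowKey, Name = e.Name
            })
            .ToList();
    }

    public void Register(Customer customer)
    {
        this.repository.Save(new CustomerEntity
        {
            PartitionKey = customer.Region,
            RowKey = customer.Id,
            Name = customer.Name
        });
    }
}

// In-memory fake for unit tests; no storage account required.
public class InMemoryCustomerRepository : ICustomerRepository
{
    private readonly List<CustomerEntity> rows = new List<CustomerEntity>();

    public IEnumerable<CustomerEntity> GetByPartition(string partitionKey)
    {
        return this.rows.Where(r => r.PartitionKey == partitionKey).ToArray();
    }

    public void Save(CustomerEntity item)
    {
        // Simplified upsert: replace any row with the same keys.
        this.rows.RemoveAll(r =>
            r.PartitionKey == item.PartitionKey && r.RowKey == item.RowKey);
        this.rows.Add(item);
    }
}
```

Because the service depends only on the interface, the in-memory fake can replace the table-backed repository in tests, which is the testability benefit mentioned at the start of the post.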

I’ve placed the Repository project on GitHub: WindowsAzureRepository.
