Wednesday, December 14, 2011

Stubbing out NHibernate's ISession

When I first started using NHibernate I didn't know much about design patterns. My early attempts were messy, ugly, and unstable. I finally learned about the repository pattern and started using it. Things were great: I was able to write cleaner code and abstract away my use of NHibernate. The thing is, NHibernate's ISession is, in my opinion, already an implementation of this pattern, so abstracting an abstraction just adds an extra layer of complexity. That's not to say that you shouldn't use the repository pattern to persist objects, but for basic reads, using the ISession directly has a lot of benefits. (For a more thorough analysis see Ayende's blog here and here.)

There is, however, one advantage to abstracting away the ISession: it makes your code much easier to unit test. Since you can easily mock out your repository object, you can write unit tests for modules that use it. This is much harder if you use the ISession directly. Whether you use the Criteria, QueryOver, or LINQ query APIs, it's almost impossible to stub out this interaction. Even if you do manage to stub it out with a mocking framework, the Arrange step becomes so messy that your unit tests end up completely unreadable.

So what's the solution? In a recent project, we decided not to abstract away NHibernate when doing queries. Of course, we ran up against the problem of unit testing the code. The solution was remarkably simple. Since we were using the QueryOver API, all we did was write a QueryOverStub implementation of the IQueryOver<TRoot, TSub> interface. Here's a simple implementation with unused methods left unimplemented.


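    // Note: besides System, System.Collections.Generic, and System.Linq.Expressions,
    // this class uses types from the NHibernate, NHibernate.Criterion (and its
    // Lambda sub-namespace), NHibernate.SqlCommand, and NHibernate.Transform
    // namespaces, plus Rhino.Mocks for the stubbed ICriteria below.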
    public class QueryOverStub<TRoot, TSub> : IQueryOver<TRoot, TSub>
    {
        private readonly TRoot _singleOrDefault;
        private readonly IList<TRoot> _list;
        private readonly ICriteria _root = MockRepository.GenerateStub<ICriteria>();

        public QueryOverStub(IList<TRoot> list)
        {
            _list = list;
        }

        public QueryOverStub(TRoot singleOrDefault)
        {
            _singleOrDefault = singleOrDefault;
        }

        public ICriteria UnderlyingCriteria
        {
            get { return _root; }
        }

        public ICriteria RootCriteria
        {
            get { return _root; }
        }

        public IList<TRoot> List()
        {
            return _list;
        }

        public IList<U> List<U>()
        {
            throw new NotImplementedException();
        }

        public IQueryOver<TRoot, TRoot> ToRowCountQuery()
        {
            throw new NotImplementedException();
        }

        public IQueryOver<TRoot, TRoot> ToRowCountInt64Query()
        {
            throw new NotImplementedException();
        }

        public int RowCount()
        {
            return _list.Count;
        }

        public long RowCountInt64()
        {
            throw new NotImplementedException();
        }

        public TRoot SingleOrDefault()
        {
            return _singleOrDefault;
        }

        public U SingleOrDefault<U>()
        {
            throw new NotImplementedException();
        }

        public IEnumerable<TRoot> Future()
        {
            return _list;
        }

        public IEnumerable<U> Future<U>()
        {
            throw new NotImplementedException();
        }

        public IFutureValue<TRoot> FutureValue()
        {
            throw new NotImplementedException();
        }

        public IFutureValue<U> FutureValue<U>()
        {
            throw new NotImplementedException();
        }

        public IQueryOver<TRoot, TRoot> Clone()
        {
            throw new NotImplementedException();
        }

        public IQueryOver<TRoot> ClearOrders()
        {
            return this;
        }

        public IQueryOver<TRoot> Skip(int firstResult)
        {
            return this;
        }

        public IQueryOver<TRoot> Take(int maxResults)
        {
            return this;
        }

        public IQueryOver<TRoot> Cacheable()
        {
            return this;
        }

        public IQueryOver<TRoot> CacheMode(CacheMode cacheMode)
        {
            return this;
        }

        public IQueryOver<TRoot> CacheRegion(string cacheRegion)
        {
            return this;
        }

        public IQueryOver<TRoot, TSub> And(Expression<Func<TSub, bool>> expression)
        {
            return this;
        }

        public IQueryOver<TRoot, TSub> And(Expression<Func<bool>> expression)
        {
            return this;
        }

        public IQueryOver<TRoot, TSub> And(ICriterion expression)
        {
            return this;
        }

        public IQueryOver<TRoot, TSub> AndNot(Expression<Func<TSub, bool>> expression)
        {
            return this;
        }

        public IQueryOver<TRoot, TSub> AndNot(Expression<Func<bool>> expression)
        {
            return this;
        }

        public IQueryOverRestrictionBuilder<TRoot, TSub> AndRestrictionOn(Expression<Func<TSub, object>> expression)
        {
            throw new NotImplementedException();
        }

        public IQueryOverRestrictionBuilder<TRoot, TSub> AndRestrictionOn(Expression<Func<object>> expression)
        {
            throw new NotImplementedException();
        }

        public IQueryOver<TRoot, TSub> Where(Expression<Func<TSub, bool>> expression)
        {
            return this;
        }

        public IQueryOver<TRoot, TSub> Where(Expression<Func<bool>> expression)
        {
            return this;
        }

        public IQueryOver<TRoot, TSub> Where(ICriterion expression)
        {
            return this;
        }

        public IQueryOver<TRoot, TSub> WhereNot(Expression<Func<TSub, bool>> expression)
        {
            return this;
        }

        public IQueryOver<TRoot, TSub> WhereNot(Expression<Func<bool>> expression)
        {
            return this;
        }

        public IQueryOverRestrictionBuilder<TRoot, TSub> WhereRestrictionOn(Expression<Func<TSub, object>> expression)
        {
            return new IQueryOverRestrictionBuilder<TRoot, TSub>(this, "prop");
        }

        public IQueryOverRestrictionBuilder<TRoot, TSub> WhereRestrictionOn(Expression<Func<object>> expression)
        {
            return new IQueryOverRestrictionBuilder<TRoot, TSub>(this, "prop");
        }

        public IQueryOver<TRoot, TSub> Select(params Expression<Func<TRoot, object>>[] projections)
        {
            return this;
        }

        public IQueryOver<TRoot, TSub> Select(params IProjection[] projections)
        {
            return this;
        }

        public IQueryOver<TRoot, TSub> SelectList(Func<QueryOverProjectionBuilder<TRoot>, QueryOverProjectionBuilder<TRoot>> list)
        {
            return this;
        }

        public IQueryOverOrderBuilder<TRoot, TSub> OrderBy(Expression<Func<TSub, object>> path)
        {
            return new IQueryOverOrderBuilder<TRoot, TSub>(this, path);
        }

        public IQueryOverOrderBuilder<TRoot, TSub> OrderBy(Expression<Func<object>> path)
        {
            return new IQueryOverOrderBuilder<TRoot, TSub>(this, path, false);
        }

        public IQueryOverOrderBuilder<TRoot, TSub> OrderBy(IProjection projection)
        {
            return new IQueryOverOrderBuilder<TRoot, TSub>(this, projection);
        }

        public IQueryOverOrderBuilder<TRoot, TSub> OrderByAlias(Expression<Func<object>> path)
        {
            return new IQueryOverOrderBuilder<TRoot, TSub>(this, path, true);
        }

        public IQueryOverOrderBuilder<TRoot, TSub> ThenBy(Expression<Func<TSub, object>> path)
        {
            return new IQueryOverOrderBuilder<TRoot, TSub>(this, path);
        }

        public IQueryOverOrderBuilder<TRoot, TSub> ThenBy(Expression<Func<object>> path)
        {
            return new IQueryOverOrderBuilder<TRoot, TSub>(this, path, false);
        }

        public IQueryOverOrderBuilder<TRoot, TSub> ThenBy(IProjection projection)
        {
            return new IQueryOverOrderBuilder<TRoot, TSub>(this, projection);
        }

        public IQueryOverOrderBuilder<TRoot, TSub> ThenByAlias(Expression<Func<object>> path)
        {
            return new IQueryOverOrderBuilder<TRoot, TSub>(this, path, true);
        }

        public IQueryOver<TRoot, TSub> TransformUsing(IResultTransformer resultTransformer)
        {
            return this;
        }

        public IQueryOverFetchBuilder<TRoot, TSub> Fetch(Expression<Func<TRoot, object>> path)
        {
            return new IQueryOverFetchBuilder<TRoot, TSub>(this, path);
        }

        public IQueryOverLockBuilder<TRoot, TSub> Lock()
        {
            throw new NotImplementedException();
        }

        public IQueryOverLockBuilder<TRoot, TSub> Lock(Expression<Func<object>> alias)
        {
            throw new NotImplementedException();
        }

        public IQueryOver<TRoot, U> JoinQueryOver<U>(Expression<Func<TSub, U>> path)
        {
            return new QueryOverStub<TRoot, U>(new List<TRoot>());
        }

        public IQueryOver<TRoot, U> JoinQueryOver<U>(Expression<Func<U>> path)
        {
            return new QueryOverStub<TRoot, U>(new List<TRoot>());
        }

        public IQueryOver<TRoot, U> JoinQueryOver<U>(Expression<Func<TSub, U>> path, Expression<Func<U>> alias)
        {
            return new QueryOverStub<TRoot, U>(_list);
        }

        public IQueryOver<TRoot, U> JoinQueryOver<U>(Expression<Func<U>> path, Expression<Func<U>> alias)
        {
            return new QueryOverStub<TRoot, U>(new List<TRoot>());
        }

        public IQueryOver<TRoot, U> JoinQueryOver<U>(Expression<Func<TSub, U>> path, JoinType joinType)
        {
            return new QueryOverStub<TRoot, U>(new List<TRoot>());
        }

        public IQueryOver<TRoot, U> JoinQueryOver<U>(Expression<Func<U>> path, JoinType joinType)
        {
            return new QueryOverStub<TRoot, U>(new List<TRoot>());
        }

        public IQueryOver<TRoot, U> JoinQueryOver<U>(Expression<Func<TSub, U>> path, Expression<Func<U>> alias, JoinType joinType)
        {
            return new QueryOverStub<TRoot, U>(new List<TRoot>());
        }

        public IQueryOver<TRoot, U> JoinQueryOver<U>(Expression<Func<U>> path, Expression<Func<U>> alias, JoinType joinType)
        {
            return new QueryOverStub<TRoot, U>(new List<TRoot>());
        }

        public IQueryOver<TRoot, U> JoinQueryOver<U>(Expression<Func<TSub, IEnumerable<U>>> path)
        {
            return new QueryOverStub<TRoot, U>(new List<TRoot>());
        }

        public IQueryOver<TRoot, U> JoinQueryOver<U>(Expression<Func<IEnumerable<U>>> path)
        {
            return new QueryOverStub<TRoot, U>(new List<TRoot>());
        }

        public IQueryOver<TRoot, U> JoinQueryOver<U>(Expression<Func<TSub, IEnumerable<U>>> path, Expression<Func<U>> alias)
        {
            return new QueryOverStub<TRoot, U>(new List<TRoot>());
        }

        public IQueryOver<TRoot, U> JoinQueryOver<U>(Expression<Func<IEnumerable<U>>> path, Expression<Func<U>> alias)
        {
            return new QueryOverStub<TRoot, U>(new List<TRoot>());
        }

        public IQueryOver<TRoot, U> JoinQueryOver<U>(Expression<Func<TSub, IEnumerable<U>>> path, JoinType joinType)
        {
            return new QueryOverStub<TRoot, U>(new List<TRoot>());
        }

        public IQueryOver<TRoot, U> JoinQueryOver<U>(Expression<Func<IEnumerable<U>>> path, JoinType joinType)
        {
            return new QueryOverStub<TRoot, U>(new List<TRoot>());
        }

        public IQueryOver<TRoot, U> JoinQueryOver<U>(Expression<Func<TSub, IEnumerable<U>>> path, Expression<Func<U>> alias, JoinType joinType)
        {
            return new QueryOverStub<TRoot, U>(new List<TRoot>());
        }

        public IQueryOver<TRoot, U> JoinQueryOver<U>(Expression<Func<IEnumerable<U>>> path, Expression<Func<U>> alias, JoinType joinType)
        {
            return new QueryOverStub<TRoot, U>(new List<TRoot>());
        }

        public IQueryOver<TRoot, TSub> JoinAlias(Expression<Func<TSub, object>> path, Expression<Func<object>> alias)
        {
            return this;
        }

        public IQueryOver<TRoot, TSub> JoinAlias(Expression<Func<object>> path, Expression<Func<object>> alias)
        {
            return this;
        }

        public IQueryOver<TRoot, TSub> JoinAlias(Expression<Func<TSub, object>> path, Expression<Func<object>> alias, JoinType joinType)
        {
            return this;
        }

        public IQueryOver<TRoot, TSub> JoinAlias(Expression<Func<object>> path, Expression<Func<object>> alias, JoinType joinType)
        {
            return this;
        }

        public IQueryOverSubqueryBuilder<TRoot, TSub> WithSubquery
        {
            get { return new IQueryOverSubqueryBuilder<TRoot, TSub>(this); }
        }

        public IQueryOverJoinBuilder<TRoot, TSub> Inner
        {
            get { return new IQueryOverJoinBuilder<TRoot, TSub>(this, JoinType.InnerJoin); }
        }

        public IQueryOverJoinBuilder<TRoot, TSub> Left
        {
            get { return new IQueryOverJoinBuilder<TRoot, TSub>(this, JoinType.LeftOuterJoin); }
        }

        public IQueryOverJoinBuilder<TRoot, TSub> Right
        {
            get { return new IQueryOverJoinBuilder<TRoot, TSub>(this, JoinType.RightOuterJoin); }
        }

        public IQueryOverJoinBuilder<TRoot, TSub> Full
        {
            get { return new IQueryOverJoinBuilder<TRoot, TSub>(this, JoinType.FullJoin); }
        }
    }

So you can easily specify the item or list of items you want the query to return in your test.

new QueryOverStub<Entity, Entity>(new List<Entity>())
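Wiring the stub up in a test with Rhino Mocks (which the stub above already references) might look like this; Entity stands in for one of your mapped classes:

var canned = new List<Entity> { new Entity() };
var session = MockRepository.GenerateStub<ISession>();
session.Stub(s => s.QueryOver<Entity>())
       .Return(new QueryOverStub<Entity, Entity>(canned));

// whatever chain the code under test builds, it ends in our canned list
var result = session.QueryOver<Entity>()
                    .Where(e => e != null) // filters are no-ops in the stub
                    .List();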

The only things missing are projections and aggregates like counts. To make this implementation work in those situations, we would just need to write implementations for the List<U> and SingleOrDefault<U> methods. In an upcoming blog post I'll talk about how to use an in-memory SQLite database as an alternative to using this stub.

Monday, November 28, 2011

Writing testable ETL processes with Rhino-ETL

On a recent project we had to integrate several external data sources into the application database. Some of these sources were CSV files, while another was an Oracle database. Our application, meanwhile, used a SQL Server database. Furthermore, some of the data had to be loaded automatically, while other data had to be loaded by a business user. We tossed around several ideas for how to accomplish this: SSIS, Informatica, something custom built. We finally settled on Ayende Rahien's (aka Oren Eini) ETL framework, Rhino-ETL. It looked like it would fit the bill, as it could be integrated into any .NET application and it allowed us to write unit and integration tests around it.

Unfortunately there is a dearth of information on how to use this framework. The only good piece of documentation is this great video tutorial. While that's a good starting point, I thought I'd write a blog post to show how to get started with the framework.

Rhino-ETL is based on a pipeline concept. Each process consists of a series of operations strung together; each operation's output is fed into the next operation as input. The operations interact through the following method:

public interface IOperation : IDisposable
{
    ...
    IEnumerable<Row> Execute(IEnumerable<Row> rows);
    ...
}

So each operation takes in a collection of Rows and outputs a collection of Rows. A Row is basically an instance of Dictionary<object, object> with an added twist: no exceptions are thrown if a key does not exist. A Row simply returns null when a key is not in the dictionary. Each operation can also leverage the oft-unused C# keyword yield return. A typical operation might look something like this:

public class MyOperation : AbstractOperation
{
    public override IEnumerable<Row> Execute(IEnumerable<Row> rows)
    {
        foreach (var row in rows)
        {
            //process the row
            yield return row;
        }
    }
}

The yield return keyword instructs the compiler to generate an iterator that, for each row, executes whatever code you put in the foreach statement. This lets us defer processing of each row until the moment it passes through the operation pipeline.

Since your operations expose the Execute method, you can easily write unit tests around each operation with any kind of data as input, as in the sketch below. Voila: by using this framework and developing a good suite of unit tests, your ETL process becomes far more robust than if you were using a tool like SSIS.
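For instance, a test can construct Rows in memory and assert on the operation's output. A minimal sketch, assuming NUnit and the Rhino.Etl.Core namespaces; TrimNameOperation is a made-up example operation:

public class TrimNameOperation : AbstractOperation
{
    public override IEnumerable<Row> Execute(IEnumerable<Row> rows)
    {
        foreach (var row in rows)
        {
            // normalize the incoming value
            row["Name"] = ((string)row["Name"]).Trim();
            yield return row;
        }
    }
}

[Test]
public void TrimNameOperation_trims_whitespace()
{
    var input = new Row();
    input["Name"] = "  Alice  ";

    var output = new TrimNameOperation().Execute(new[] { input }).ToList();

    Assert.AreEqual("Alice", output[0]["Name"]);
}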

The main thing to note about Rhino-ETL is that you will have to code all of your operations. Typically your operations will inherit from the AbstractOperation class. Some other useful built-in operations:

SqlBulkInsertOperation - sets up the schema and does a bulk insert into a table. Useful for inserting large data sets quickly and efficiently.
SqlBatchOperation - batches your sql operations to reduce server roundtrips.
BranchingOperation - sends rows to multiple operation pipelines.
JoinOperation - joins your rows to a result set from another operation.
PartialProcessOperation - typically used with BranchingOperation to create an operation pipeline within a branch.

A process is typically created by inheriting from the EtlProcess class.

public class MyProcess : EtlProcess
{
    protected override void Initialize()
    {
        Register(new MyOperation());
        //register other operations
    }
}
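Running the process is then just a matter of instantiating it and calling Execute. Rhino-ETL collects exceptions thrown by operations instead of letting them bubble up, so it's worth surfacing them after the run (a sketch; GetAllErrors is the method I've used for this):

var process = new MyProcess();
process.Execute();

foreach (var error in process.GetAllErrors())
    Console.WriteLine(error);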

A final note: in order to get any output from your ETL process you will need to set up log4net. A simple console appender can be configured with the following entry in your App.config:

<log4net debug="false">
    <appender name="console" type="log4net.Appender.ConsoleAppender, log4net">
      <threshold value="WARN" />
      <layout type="log4net.Layout.PatternLayout,log4net">
        <param name="ConversionPattern" value="%d [%t] %-5p %c - %m%n"/>
      </layout>
    </appender>
    <root>
      <level value="INFO" />
      <appender-ref ref="console"/>
    </root>
</log4net>

and by adding the following line of code to your application startup:

log4net.Config.XmlConfigurator.Configure();

To conclude, while you do give up fancy designers and useful operations like SSIS's Fuzzy Lookup, the benefit gained from being able to write unit tests around your ETL process can be invaluable. Due to its simplicity and the fact that it can be easily integrated into other .NET applications, Rhino ETL is a very useful tool to have in your arsenal.

Wednesday, July 13, 2011

Why you should use a DVCS

Over the last several years DVCSs (Distributed Version Control Systems) have gained enormous acceptance in the development community. There are lots of good reasons to ditch your old Subversion (or god forbid CVS) repository and switch to one of the new kids on the block like Git or Mercurial. Let's take a look at some of these reasons.

Experiment to your heart's content

I'm sure this has happened to everybody. You start developing a feature. You change a few minor things here and there (e.g. add a new property or method to an entity). Then you make a few more changes; and a few more; and a few more. All of a sudden you realize that your approach just isn't going to work. Hold on though, some of the changes you made earlier are still valid and you want to keep them. So you start trying to revert just the later changes, only to give up and revert everything back to the latest copy that was in source control. Begrudgingly you reproduce your earlier work as you curse your source control system.

If you're using a DVCS, all of that can be avoided. Since you have a source control repository right on your own machine tracking your code, you can easily reset your code to one of the commit points along your journey to the dead end you've found yourself in. You can even save your progress in a tag or a branch, in case you want to refer to some of the code in the future, as shown below.
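For example, with Git the rescue might look like this (the branch name is just an example):

$ git commit -a -m "WIP: dead end, keeping for reference"
$ git branch dead-end-idea
$ git reset --hard HEAD~3

The dead-end-idea branch keeps the abandoned work around, while your working copy rewinds three commits to where things still made sense.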

Branch and merge with ease

A common scenario in software development is to do a product release, create a release branch, and continue development on the main line trunk. If a bug is reported in the release, you would check out the release branch, fix the bug and check in your changes. After a few of these changes you can do another release and merge your branch fixes into your main line trunk. At least that's how it's supposed to work. What happens more often is on the last step, when you try to do a merge, you find so many conflicts that it then takes the entire team a day or more to resolve them.

Since a DVCS is built around branching and merging, these operations are not only easy to do but also painless. Most of the time you won't even need to merge anything manually, as these systems are very good at figuring out what was intended. So go ahead: make as many branches as you want, merge them whenever you want, and stop contorting your process around the limitations of your VCS.

Safety and redundancy

One of the great things about having multiple copies of the repository floating around among the development team and the central location is redundancy. In a traditional central version control system environment, if your repository became corrupted or the server hard drive failed, you would have to find a backup and restore it, possibly losing work and holding up development.

In a distributed environment, everyone has a copy of the repository. So if the same scenario were to happen, the fix can be as simple as one of the developers pushing the latest copy to a new shared location. Alternatively, one developer's repository can serve as the central repository until the server is rebuilt. That is not to say you shouldn't have backups in a distributed environment, but it is far less catastrophic not to have them.

The bottom line is that the day-to-day benefits of working with a DVCS far outweigh the initial pain of converting. If you're developing on Windows, I highly recommend Mercurial; if Linux or Mac is your preferred environment, then Git is definitely the way to go. So go on and give it a try. I promise you won't go back to your old VCS.

Monday, May 30, 2011

Git with ssh on Windows

While Git is a great source control system that can bend to almost any source control workflow you might have, support on Windows varies from awesome (Git Extensions, msysgit) to downright awful (hosting over the git, ssh, or http protocols). I've struggled enough with setting up an ssh server on Windows to host a central repository to warrant documenting it.

First you will need to install Cygwin with the openssh and git packages enabled.

After the installation we'll need to let Cygwin know about the domain accounts we want to use. For this we'll use the mkpasswd command. For domain accounts use the '-d' option.

$ mkpasswd -d -u username >> /etc/passwd

This will append the entry to the /etc/passwd file. Do this for all of the accounts you want to add. You may need to subsequently edit the /etc/passwd file and set the appropriate home directory to /home/username, especially if your company likes to have a network home drive for all of its users. A typical entry looks like the example below.
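For reference, an entry follows the standard passwd layout of name, password placeholder, uid, gid, comment, home directory, and shell; the ids and comment below are made up and will differ on your machine:

username:unused:11001:10513:U-DOMAIN\username:/home/username:/bin/bash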

Now it's time to create a new bare repository.

$ git init --bare myrepo.git
$ git config core.sharedRepository group

The second command lets git know that it should use group write permissions when creating files and directories. We'll also want to set the setgid bit so that newly created files and folders inherit the same group.

$ chmod -R g+ws .

Now that the repository is set up, let's set up the ssh server.

$ ssh-host-config

The script will begin by generating host key files and config files. It will then ask whether you want to use privilege separation. Answer yes, and yes again when it prompts you to create an unprivileged user.

Next the script will ask if you want to install sshd as a service. Answer yes. The next question will tell you that you need a privileged account to run the service and ask if you want to use the name 'cyg_server'. Either say no, or if you already have an appropriate privileged account, answer yes and enter the name you want to use. If the account you entered does not have the appropriate permissions, the script will warn you. I recommend letting the script create a 'cyg_server' account. When prompted, enter an appropriate password that conforms to your password policies.

Finally the script will ask you to enter the values for the CYGWIN environment variable. Enter 'tty acl'. The 'acl' setting lets Cygwin use Windows security on the file system, while 'tty' gives you proper terminal behavior.

After all of that, ssh will be configured so we just need to start it.

$ cygrunsrv --start sshd

Alternatively, you can go into the services list, find CYGWIN sshd, and start it there.
If you start seeing errors at this point, there is likely something wrong with the password you set up for the 'cyg_server' account.

As a final step you will likely want to create a symlink to your repository in the root folder. This will allow the ssh URI to be more concise and friendly.

$ ln -s /path/to/your/repo /repo

Once you have the ssh server running, it's time to connect to it. Let's push our local repository to the remote.

$ git push ssh://server/repo

If you did not set up a symlink, your address will need to contain the appropriate Cygwin path to your repo starting from /. The first time you connect, it will prompt you to add the host key to your known_hosts file. Finally, you will be prompted to enter your password. At this point you have everything in place to access the repository remotely.

As a final step you can set up public/private keys in order to identify yourself to the ssh server. This will allow you to forgo having to enter your password. First, generate your ssh keys:

$ ssh-keygen -t rsa

This will create 2 files in your ~/.ssh directory: your public key (id_rsa.pub) and your private key (id_rsa). You then need to copy the contents of your public key into the /home/username/.ssh/authorized_keys file on the server. One way of doing this is to create a share to your home drive on the server that only your account can access, then create an authorized_keys file there and paste the contents in Notepad.
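If you'd rather skip the share, the copy can also be done over ssh itself (assuming the sshd service from earlier is running):

$ cat ~/.ssh/id_rsa.pub | ssh username@server 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'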

Git can still be a pain to work with on Windows, but hopefully this will make your life a little easier.

Update:
If you would rather serve Git over the http protocol, have a look at this post for an Apache setup, or check this out for an IIS solution.

Wednesday, May 4, 2011

Custom properties for IPrincipal

Sometimes you need to provide application-specific values that are associated with a particular user. Let's say your application must handle role-based security by using AD groups. Furthermore, let's say you want those group mappings to be configurable. One way to solve this is to create your own implementation of IPrincipal. It might look something like this:

public class ApplicationPrincipal : IPrincipal
{
    private readonly IIdentity _identity;
    private readonly IPrincipal _principal;
    private readonly ISecurityConfiguration _config;

    public ApplicationPrincipal(IIdentity identity, IPrincipal principal, ISecurityConfiguration config)
    {
        _identity = identity;
        _principal = principal;
        _config = config;
    }

    public IIdentity Identity { get { return _identity; } }

    public bool IsInRole(string role)
    {
        var windowsGroup = MapInternalRoleToWindowsGroup(role);
        return _principal.IsInRole(windowsGroup);
    }

    public string MapInternalRoleToWindowsGroup(string role)
    {
        switch (role)
        {
            case "Admin": return _config.AdminGroup;
            case "User": return _config.UserGroup;
            default: throw new SecurityException("Invalid group name: " + role);
        }
    }
}

Now that we have our own principal implementation we need a way to let the application access it. A good place to do that would be in an HttpModule that handles the PostAuthenticateRequest event.

public class ApplicationPrincipalModule : IHttpModule
{
    private HttpApplication _context;

    public void Init(HttpApplication context)
    {
        _context = context;
        context.PostAuthenticateRequest += PostAuthenticateRequest;
    }

    public void Dispose()
    {
        // unsubscribe using the reference saved in Init
        if (_context != null)
            _context.PostAuthenticateRequest -= PostAuthenticateRequest;
    }

    private void PostAuthenticateRequest(object sender, EventArgs e)
    {
        var httpContext = HttpContext.Current;
        var principal = new ApplicationPrincipal(
                             httpContext.User.Identity, httpContext.User, 
                             new SecurityConfiguration());
        httpContext.User = principal;
    }
}

Once a user has been authenticated, by whichever method your application is set up to authenticate with, this module will set HttpContext.Current.User to your implementation of IPrincipal. This can then be referenced where needed in your application; role checks like User.IsInRole("Admin") will now go through the configurable mapping.
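Don't forget to register the module. A minimal sketch for the IIS 7 integrated pipeline (the namespace and assembly names are placeholders for your own):

<system.webServer>
  <modules>
    <add name="ApplicationPrincipalModule"
         type="MyApp.Security.ApplicationPrincipalModule, MyApp" />
  </modules>
</system.webServer>

Under the classic pipeline, the equivalent <add> entry goes in <system.web><httpModules> instead.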

Monday, April 4, 2011

Getting started with BDD using NBehave

What is BDD? BDD stands for Behavior Driven Development, and at its core it is about implementing an application by describing its behavior from the perspective of its stakeholders. It is an agile software development methodology that started as an offshoot of TDD. However, in my opinion it is better suited to software delivery than TDD is. The reason is that TDD is great as a code design methodology, but there is a disconnect between code design and stakeholder value. When practicing TDD, developers typically produce cleaner and more maintainable code. However, that elegant code may not deliver what the stakeholders wanted.

BDD also allows non-technical users, like BAs and business users, to specify system behavior in a non-technical, natural language. Tools like NBehave allow the developers to verify and demonstrate that the system delivers what the stakeholders expect.

How do we write a scenario? Scenarios typically follow the Given/When/Then structure. So for example:

Scenario: User logs in
Given that I am not logged in
And I have an account
When I enter my username and password
Then I should see “Welcome Vadim”

How would we use NBehave to verify that the system does what the scenario says? First things first: download the latest NBehave build and extract the contents of the zip file.
There are 2 versions of the DLLs, one for .NET 3.5 and one for .NET 4.0.

Create a project for your tests and add a reference to NBehave.Narrator.Framework.dll. Also add a reference to your favorite testing framework (like NUnit or xUnit) and the appropriate NBehave.Spec.<framework>.dll. Currently NBehave comes with extension methods for NUnit, xUnit, MbUnit, and MSTest. These methods are simply wrappers around Assert statements that allow a more natural-language feel when writing your code. So if you're using something like MSpec, which already has Should<DoSomething>-style assertions, you don't need to worry about this last step.

Next, create a features directory inside your project. This is where you'll store your feature and scenario files. Let's pretend we're writing a calculator application. Add a file to the features folder called calculator_turns_on.feature. Time to write our scenario.

Scenario: I turn on the calculator
Given the calculator is off
When I press the on button
Then the calculator should display 0

Let's add a folder called steps. This is where we're going to put our tests. Also create a code file and call it CalculatorTurnsOn.cs. Let's start with a skeleton test:

[ActionSteps]
public class CalculatorTurnsOn
{
    
}

Compile the solution then run the NBehave-Console like so

NBehave-Console.exe bin\Debug\NBehaveIntro.dll /sf=features\calculator_turns_on.feature

You should see all results in yellow with a PENDING tag next to them. These aren't failures; they just tell us that we still have to implement this scenario. Let's create a Calculator class.

public class Calculator
{
    public float Display { get; set; }
}

Next we'll flesh out our test.

[ActionSteps]
public class CalculatorTurnsOn
{
    private Calculator calculator;

    [Given("the calculator is off")]
    public void CalculatorOff()
    {
        calculator = null;
    }

    [When("I press the on button")]
    public void TurnCalculatorOn()
    {
        calculator = new Calculator();
    }

    [Then("the calculator should display 0")]
    public void CalculatorDisplayIs0()
    {
        calculator.Display.ShouldEqual(0);
    }
}

Compile and run the command again. Now the tests all pass.

Note how the attributes are decorated with the same sentences that were written in our scenario. That's how NBehave knows how the scenarios map to code. You can decorate a method with multiple Given, When, and Then attributes, and you can have any number of methods in a given ActionSteps class. Furthermore, you can use regular expressions or string tokens in the description as well. For example, if we had the following scenario:

Scenario: I can add two numbers
Given numbers [one] and [two]
When I add them
Then the sum should be [sum]

Examples:
|one|two|sum|
|1  |2  |3  |
|2  |2  |4  |
|10 |35 |45 |

Our attribute decorations might look like:
[Given("numbers $one and $two")]
public void TwoNumbers(float one, float two)
...
[Then("the sum should be $sum")]
public void VerifySum(float sum)
...

One last note: the scenario above also contains an example table. This table allows us to write one scenario with multiple inputs and expected outputs, which saves us from having to write repetitive scenarios. One possible implementation of the steps for that scenario is sketched below.
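For completeness, here is one way those steps might look (a sketch; the class and field names are my own):

[ActionSteps]
public class CalculatorAddsTwoNumbers
{
    private float _one;
    private float _two;
    private float _sum;

    [Given("numbers $one and $two")]
    public void TwoNumbers(float one, float two)
    {
        _one = one;
        _two = two;
    }

    [When("I add them")]
    public void AddThem()
    {
        _sum = _one + _two;
    }

    [Then("the sum should be $sum")]
    public void VerifySum(float sum)
    {
        _sum.ShouldEqual(sum);
    }
}

NBehave runs the scenario once per row in the Examples table, binding the table values to the $one, $two, and $sum tokens.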

For further info see the following links:
http://nbehave.org/
http://nbehave.codeplex.com/wikipage?title=0.5.0&referringTitle=Documentation

Monday, March 21, 2011

NHibernate unit of work in MVC using Ninject (part 2)

In part 1 I described the NHibernate Unit of Work implementation that we originally had in our MVC application. In part 2 I will talk about why this approach turned out to be problematic and the ultimate solution that we came up with.

If you recall, the original solution was to use an HttpModule to subscribe to the BeginRequest and EndRequest events. When we upgraded to Ninject 2.1, none of our entities would get saved. Reads all worked fine, but no CUD actions worked. The problem turned out to be a new component of Ninject called the OnePerRequestModule. This module subscribes to the EndRequest event and disposes any objects that were bound in request scope. The problem was that this module would dispose of our session before we had a chance to commit the transaction and end the session ourselves. It also registered first (which meant it was first to handle EndRequest), and there appeared to be no hook to turn it off.

The solution that we came up with was to use a filter attribute in order to encapsulate our unit of work. The first part of the refactor was to create an IUnitOfWork interface:

public interface IUnitOfWork
{
 void Begin();
 void End();
}

That's all a unit of work really should look like. Second we moved our BeginSession and EndSession code into an implementation of IUnitOfWork:

public class NHibernateUnitOfWork : IUnitOfWork, IDisposable
{
 private readonly ISession _session;

 public NHibernateUnitOfWork(ISession session)
 {
  _session = session;
 }

 public void Begin()
 {
  _session.BeginTransaction();
 }

 public void End()
 {
  if (_session.IsOpen)
  {
   CommitTransaction();
   _session.Close();
  }
  _session.Dispose();
 }

 private void CommitTransaction()
 {
  if (_session.Transaction != null && _session.Transaction.IsActive)
  {
   try
   {
    _session.Transaction.Commit();
   }
   catch (Exception)
   {
    _session.Transaction.Rollback();
    throw;
   }
  }
 }

 public void Dispose()
 {
  End();
 }
}

This separated repository logic from the unit of work implementation. The implementation itself was bound in request scope. Finally we created our filter attribute:

public class UnitOfWorkAttribute : FilterAttribute, IActionFilter
{
 [Inject]
 public IUnitOfWork UnitOfWork { get; set; }

 public UnitOfWorkAttribute()
 {
  Order = 0;
 }

 public void OnActionExecuting(ActionExecutingContext filterContext)
 {
  UnitOfWork.Begin();
 }

 public void OnActionExecuted(ActionExecutedContext filterContext)
 {
  UnitOfWork.End();
 }
}

The order was set to 0 to ensure that this filter always runs first. We also did not have to explicitly call End on the unit of work, since it would be disposed at the end of the request and call End itself; however, it's good practice to be explicit about these things. Finally, you can apply this filter at the controller level or the action level, so you can decorate only those actions that actually perform data access, as below.
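Putting it together, the binding and usage look roughly like this (ProductsController is just an example name):

// one unit of work (and therefore one session/transaction) per web request
Bind<IUnitOfWork>().To<NHibernateUnitOfWork>().InRequestScope();

// decorate data-access controllers (or individual actions)
[UnitOfWork]
public class ProductsController : Controller
{
    // actions here execute inside the request's unit of work
}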

Tuesday, March 8, 2011

NHibernate unit of work in MVC using Ninject (part 1)

We just released our third and (likely) final release of the application we've been working on for the last 11 months. During that time I've gone through two implementations of the NHibernate Unit of Work pattern. I will talk about them in two parts. The first part covers our first implementation, which turned out to be a little flawed and very incompatible with Ninject 2.1. The second covers the better approach that we shipped with this final release.

I've previously blogged about NHibernate and the unit of work pattern here. The approach I first decided to take was to create a generic Repository class. This class was not only responsible for CRUD operations and transaction management, but it also had 2 special methods on it called BeginRequest and EndRequest. Here's a simple implementation:

public class Repository : IRepository
{
    private readonly ISession _session;
    public Repository(ISession session) { _session = session; }

    public void Save<T>(T entity) where T : Entity
    {
        _session.SaveOrUpdate(entity);
    }
    public T Get<T>(int id) where T : Entity
    {
        return _session.Get<T>(id);
    }
    public void Delete<T>(T entity) where T : Entity
    {
        _session.Delete(entity);
    }

    public void BeginTransaction() { _session.BeginTransaction(); }
    public void CommitTransaction() { _session.Transaction.Commit(); }
    public void RollbackTransaction() { _session.Transaction.Rollback(); }

    public void BeginRequest() { BeginTransaction(); }
    public void EndRequest() { CommitTransaction(); }
}

Obviously the real class was more robust and had much more error checking around transaction management, but those details aren't important for this discussion.

In addition to this Repository we also had an HttpModule that called Repository.BeginRequest() on BeginRequest and Repository.EndRequest() on EndRequest. It looked something like this:

public class DataModule : IHttpModule
{
 public void Init(HttpApplication context)
 {
  context.BeginRequest += BeginRequest;
  context.EndRequest += EndRequest;
 }

 public void Dispose()
 {
 }

 public void BeginRequest(object sender, EventArgs e)
 {
  var app = (WebApplication)sender;
  app.Kernel.Get<IRepository>().BeginRequest();
 }

 public void EndRequest(object sender, EventArgs e)
 {
  var app = (WebApplication)sender;
  app.Kernel.Get<IRepository>().EndRequest();
 }
}

Our Ninject bindings for these classes were as follows:

Bind<ISession>()
       .ToMethod(context => NHibernateSessionFactory.Instance.OpenSession())
       .InRequestScope();
Bind<IRepository>().To<Repository>().InTransientScope();

An important note is that the NHibernate session was bound in request scope, while the repository was bound in transient scope. So all instances of Repository in a given request shared the same session. (NHibernateSessionFactory is our singleton in charge of configuring the SessionFactory and creating new sessions.)

This approach worked fine with Ninject 2.0, but suffered from several problems. First of all, we would be creating a new session and starting/committing a new transaction on every http request. This means any request, even one for content files like javascript or css, would trigger the DataModule. The second problem was more architectural/stylistic: we had unit of work code living inside a repository. We could have moved the Begin/End Request logic off to the DataModule, and that would have alleviated this problem somewhat. However, we would still have been stuck with the first problem.

The kick in the pants for the rewrite came when we upgraded to Ninject 2.1 and Ninject.Web.Mvc 2.1. Ninject 2.1 introduced a very eager way of ensuring that objects bound in request scope didn't hang around after the EndRequest event. I'll talk about this and our implementation rewrite in part 2.

Monday, February 7, 2011

JavaScript/jQuery code optimization (or famous IE bottlenecks)

After we pulled out the Infragistics grid in favor of Telerik MVC Extensions, we had to code several features ourselves that were available in Infragistics out of the box. This was partly due to our data structure not being fully supported by the Telerik grid; that's a topic for another blog post though. One of the features was the ability to select several cells and perform spreadsheet-like functionality, such as copy/paste and fill. We were mostly testing this out in Chrome/Firefox and with small tables. However, as typically happens, after releasing into our demo environment, our BA told us that the performance was brutal.

Surprise, surprise: it turned out that our grid page ran very slowly in IE 7 with very large tables (50 rows by 35 columns). I wound up tracing the bottleneck down to this chunk of code.
$('tbody td', $(this.element)).each(function() {
 if (this.parentElement.rowIndex >= selection.startRowIndex && 
 this.parentElement.rowIndex <= selection.endRowIndex &&
 this.cellIndex >= selection.startColIndex && 
 this.cellIndex <= selection.endColIndex) {
  if (columns[this.cellIndex].readonly != true) {
   $(this).addClass('selected');
  }
 }
});
This is the code for when a user has a cell selected and is either dragging the mouse or holding shift and clicking to select a block of cells. What's happening here is that we're going through all cells in the table body and checking whether each one is between the first selected cell and the cell located at the mouse position, i.e. whether or not the cell is in a 'box'. The problem is that we're going through every cell in the table body, even if only one has been selected.

The first optimization was to only go through the rows/columns that were inside the selected box. This was done using the jQuery slice method. The code is below.

$('tbody tr', this.element)
 .slice(selection.startRowIndex, selection.endRowIndex + 1)
 .each(function() {
  $(this).children('td')
   .slice(selection.startColIndex, selection.endColIndex + 1)
   .each(function() {
    if (!columns[this.cellIndex].readonly) {
     $(this).addClass('selected');
    }
   });
 });

The performance improved a lot when the selection box was small, but if the user selected anything bigger than a 4-row by 10-column box, the script crawled again. Each row took approximately 190 ms to process in IE7, so if the user selected 10 rows it would take almost 2 seconds to run. The problem seemed to be the readonly check; if it was taken out, the timing dropped to a fraction of a millisecond.

The problem is that Internet Explorer is very slow at processing conditional statements. Doing some unscientific tests on this site, I got roughly the following numbers:

Browser                Average Time
Chrome 8               0.02 ms
Firefox 4 b10          0.66 ms
Internet Explorer 8    52 ms
Internet Explorer 7    94 ms

So we had to find a way to optimize that conditional statement. I finally found a jQuery call that seemed to do the trick.

$('tbody tr', this.element)
 .slice(selection.startRowIndex, selection.endRowIndex + 1)
 .each(function() {
  $(this).children('td')
   .slice(selection.startColIndex, selection.endColIndex + 1)
   .not(function() {
    return columns[this.cellIndex].readonly;
   })
   .addClass('selected');
 });

After going through the jQuery code, I'm still not 100% sure why the performance is better with the last statement. A couple of things I've noticed: using each to run a check on every item seems to be slightly more expensive than not (or, for that matter, filter), since each also checks whether the callback has returned false. Also, addClass has a lot of overhead, so it's quite inefficient to run it on elements one by one. I'm not sure that's the whole story either, though, because the 2nd code snippet ran very quickly when the readonly check was taken out.

If anyone has any ideas on why the 3rd code snippet runs much faster than the 2nd one, I would love to hear them. Until then, this is just another example of why it's important to optimize your jQuery code.

Monday, January 31, 2011

Using CVS and Git in tandem

VCS systems are indispensable for software developers. They provide many benefits: team development, code recovery, history tracking, and many more. Everyone has their favorite, but it's not always possible in a corporate environment to use the one you like. On my current project, the corporate standard is CVS. It's been a while since I've used it. Not to badmouth CVS, but the shine has really worn off this VCS in the last decade.

I decided to set up a local Git repository to use for development and use the CVS repository for large scale commits. This was working well, until a second developer came on board. We had to set up a system in which merges were not a giant pain in the butt. The system we eventually came to use works like this. We both have our local Git repositories in which we do our feature development. We also set up a bare repository on the build server that we push to once we've completed our features. This central repository is also used for our continuous build integration.

The last part of the setup is a separate folder that contains the CVS checkout; I've named it CVS. This folder also contains another Git repository. The process of checking into CVS goes as follows. We first do a CVS update to bring our checkout in line with any CVS checkins the other developer may have done. We do this first because Git is far more forgiving than CVS about incoming files that already exist on the file system. After we have synchronized our checkout with the latest from CVS, we then do a code pull from our central bare repository. At this point we have to go through and manually do a `cvs remove` on any files that were deleted during our feature development, and a `cvs add` on any new files. Once all that has been done, we are ready to check in. The whole round trip looks roughly like the commands below.
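In command form (paths and file names here are placeholders):

$ cd CVS
$ cvs update -d
$ git pull /path/to/central/repo.git master
$ cvs add NewFile.cs
$ cvs remove DeletedFile.cs
$ cvs commit -m "latest features from git"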

So that's it. The process saves us a lot of headaches while we are doing our feature development and we are able to use our pet VCS without too much trouble.

Thursday, January 27, 2011

Debugging JavaScript

The tools available to web developers have grown a lot in the last 5 years. Time was, the only way to debug JavaScript was to pepper your code with alert statements. Now you can get a variety of plugins and tools that allow you to examine HTML, CSS, JavaScript, and other elements of a web page. Here are some of the better tools out there:

Firebug
One of the best debugging tools out there. Allows you to inspect and modify HTML elements and style. Comes with a great JavaScript debugger and allows you to examine network usage. Available as an add-on for Firefox. (Note: if you're using the Firefox 4.0 beta, make sure you are using version 1.7 of Firebug and Firefox 4.0 beta 10, as previous versions are broken.)
Internet Explorer Developer Toolbar
Let's face it: if you've developed any kind of web application, you've grown to despise IE. With its quirks and non-standard implementation of all things HTML, CSS, and JavaScript, IE is one of the biggest pain points for any web developer. The Internet Explorer Developer Toolbar tries to ease some of that pain by providing the ability to examine HTML and make modifications to style.
Visual Studio
A decent tool for debugging JavaScript in IE. Can be a bit of a pain, but contains everything a debugger should have: call stack, breakpoints, watch variables, and quick watch.
Chrome Developer Tools
One of the easiest browsers to develop for, Chrome also comes with a great set of developer tools. View and change HTML elements, trace styles, debug JavaScript like a pro, and examine loaded resources, including XHR content. It also provides optimization and performance analysis.

All of these tools are great, but what prompted me to write this blog entry is another great tool out there: JSFiddle. If you've ever tried debugging a small JS problem, you know it's a real pain to constantly update your code, reload the page, and step through the multitude of lines that your page might contain. JSFiddle provides a great way to isolate problem code and debug it. It allows you to enter your HTML, CSS, and JavaScript, then run it all in one page. It also allows you to easily load different versions of most of the common OSS JS frameworks. Not only that, but it also provides versioning of your "fiddles". I highly recommend signing up for an account. You don't need one to use the site, but if you want to save and share your fiddles you will.

Saturday, January 22, 2011

Dynamic stored procedures without using IFs

Have you ever written a stored procedure like this?
create procedure GetProducts
   @id int
AS BEGIN
   if @id is not null
      SELECT *
      FROM Products
      WHERE id = @id
   ELSE
      SELECT *
      FROM Products
END
Here's a tip that DBAs and SQL developers have known about for a while. You can shorten your procedure to this:
create procedure GetProducts
   @id int
AS
   SELECT *
   FROM Products
   WHERE (@id is null OR id = @id)
You should take care to examine the execution plan and make sure that it's not doing a table scan. Otherwise performance will suffer.

Monday, January 17, 2011

The NHibernate Anti Pattern

On Stack Overflow, a lot of NHibernate questions come with a code snippet similar to the following:

public IList<MyObject> GetObjects()
{
   using (ISession session = sessionFactory.OpenSession())
   {
      return session.CreateCriteria<MyObject>().List<MyObject>();
   }
}

What's wrong with this code? Well, at first glance not much. We're opening a session, doing our query, and then cleaning up any open resources.

Well, if we were dealing with ADO.NET, opening a connection, doing our work, then closing the connection would be the typical process. However, NHibernate sessions are a lot more than database connections. One of the biggest benefits is the first-level cache. The session caches objects that have been retrieved by their id, as well as saved objects. This means that if you've retrieved an object once, then as long as you haven't closed the session, any time you retrieve the object again NHibernate will give you the cached version. This saves a round trip to the database.
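To make that concrete, here's a small sketch (sessionFactory and MyObject as in the snippets in this post):

using (var session = sessionFactory.OpenSession())
{
    var first = session.Get<MyObject>(1);   // hits the database
    var second = session.Get<MyObject>(1);  // same id: served from the session cache
    // first and second are the same instance; only one SELECT was issued
}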

Another benefit is batching database writes. Since updates are stored in the cache, NHibernate doesn't need to write them to the database immediately. If you let NHibernate flush automatically, it can hold on to its updates until it needs them and then run them all in one round trip to the database.

What about lazy loading? If our MyObject class looks like this:
public class MyObject
{
   public int Id { get; set; }
   public AnotherObject Reference { get; set; }
}
and we wanted to access the Reference property after the session was closed, we would need to eagerly load each associated AnotherObject. This is because NHibernate uses proxy objects: when your references are lazy loaded and you access the Reference property, NHibernate uses the same session the original object was loaded with to fetch the associated reference. If the session has been closed, you will receive a LazyInitializationException telling you that the associated session has been closed.

So if the pattern in the first code snippet is a bad one, which pattern should we be following? The folks over at NHibernate recommend the Unit of Work pattern. The idea is that you create a session and keep it alive for the duration of a so-called "conversation". An example of a conversation could be the HttpRequest in a web application: a session would be opened at the beginning of the request, and at the end of the request it would be closed and disposed. You can even wrap the session in a transaction for extra reliability.

In a Windows app it's a little more tricky. One way to define a unit of work is in the context of a thread. Another is to scope it to a user action: for example, if a user action triggers a button to update some data, a new session would be created and then closed after all work has been completed.

Finally you can use your favorite IoC container to define how and when a new session is created and then your data access layer can be injected with the reusable session.

Saturday, January 8, 2011

Rendering Asp.Net WebForm via an MVC action

As part of our rewrite of WebForms to MVC, we have some reports that are generated via Infragistics. I know I said in the previous post that the reason we're doing the rewrite is that we hate Infragistics. However, one thing that Infragistics does well is give a lot of power to the developer for developing reports.

There is no GUI for report development and the documentation is virtually non-existent, but if you need dynamic reports that are generated at run time, it's a great library to use.

I won't bore you with the implementation details of our reports, but the fact of the matter is that we need to render a WebForm. How do you do that inside a controller action? Fortunately the folks at Microsoft have thought of this scenario: there is an IView implementation called WebFormView. So if you need to return a classic ASP.NET Web Form as part of your action, you can do it like so.

public ActionResult Index()
{
   return View(new WebFormView("~/path/to/your/webform.aspx"));
}

The only catch is that your WebForm must inherit from System.Web.Mvc.ViewPage rather than System.Web.UI.Page. Behind the scenes it will call the BuildManager to build your web form and call the ProcessRequest method with HttpContext.Current. This means the page is taken through the ASP.NET page life cycle.

You can even serve up WebControls like this; again, they have to inherit from System.Web.Mvc.ViewUserControl. There is one more limitation: the user control can either contain only html elements (albeit ones that may have the runat=server attribute), or, if you really need server controls, it must contain a form with the runat=server attribute.

One final note: if for whatever reason you cannot change your web forms to inherit from ViewPage, or you need access to the compiled instance of the Page, you can still render them inside a controller action like so.

public ActionResult Index()
{
   var page = (Page)BuildManager
       .CreateInstanceFromVirtualPath("~/path/to/your/webform.aspx", typeof(Page));
   page.ProcessRequest(HttpContext.Current);
   return new EmptyResult();
}