Posts in design

11.1.2013

Using invariants to avoid null check insanity

Null checks aren't fun, but even worse are the ever-ambiguous runtime NullReferenceExceptions we might otherwise receive.

Variant

Take the following code:

public class Family
{
    public List<string> Names;
}

// consumer creating a family
var family = new Family();
family.Names = new[] {"John", "Jane"}.ToList();

// consumer adding a name
family.Names = family.Names ?? new List<string>();
family.Names.Add("Baby");

// consumer searching names
var searchName = "Bob";
var hasSomeoneNamed = family.Names != null && family.Names.Contains(searchName);


  • Note: each // consumer comment delineates a separate example of using the Family type
  • There's a whole extra line of code just to add a family member!
  • Searching isn't a simple query: it also needs a preventative null check and clever use of the short-circuiting && operator.
  • This extra noise is especially confusing to novice programmers.

Invariant

Invariant - "never changing"

A few simple changes can make things much simpler for consumers. If we require a list of names upon creation of a family, we can do the following:

public class Family
{
    public readonly List<string> Names;

    public Family(IEnumerable<string> names)
    {
        Names = names.ToList();
    }
}

  • As a consumer I can see that Names is readonly, which means it can't be changed after creation.
  • Even without knowing the implementation details, seeing that the constructor requires a list of names makes it safe to assume the names list is never going to be null, unless someone is out to troll me!
  • Even without requiring a list of names, what would be the purpose of a Names list that was always null?
  • We could go further by including this invariant in the class description or with CodeContracts and bring in static analysis support to help avoid null checks, but readonly alone is a great start.

Look at the impact on consumers:

// consumer creating a family
var names = new[] { "John", "Jane" };
var family = new Family(names);

// consumer adding a name
family.Names.Add("Baby");

// consumer searching names
var searchName = "Bob";
var hasSomeoneNamed = family.Names.Contains(searchName);

I'd much rather maintain this code!

One step further with Encapsulation

We could also encapsulate the Names list:

public class Family
{
    protected readonly List<string> Names;

    public Family(IEnumerable<string> names)
    {
        Names = names.ToList();
    }

    public void AddName(string name)
    {
        Names.Add(name);
    }

    public bool HasSomeoneNamed(string searchName)
    {
        return Names.Contains(searchName);
    }
}

Now our consumers don't even have to be aware of the fact that Names exists, let alone that it might be null:

// consumer creating a family
var names = new[] { "John", "Jane" };
var family = new Family(names);

// consumer adding a name
family.AddName("Baby");

// consumer searching names
var searchName = "Bob";
var hasSomeoneNamed = family.HasSomeoneNamed(searchName);


  • This is suggested by the principles behind the Law of Demeter.
  • All null checking would be confined to the Family type, making it easy to see where checks are (and aren't) necessary.

However, I usually don't go this far:

  • It can lead to a lot of boilerplate code, which has its own readability and maintainability concerns, especially when redefining all the list operations on the Family type (AddName/RemoveName/EnumerateNames, etc.).
  • I prefer to wait to create methods on a type until I see re-use benefits among consumers and can weigh those against the purpose of the type in the first place.

I prefer the invariant-only approach: give consumers the guarantee that Names won't be null and let them take it from there.

Serialization concerns

  • If you are serializing objects to a database or other medium, be aware of how that impacts your invariants.
  • readonly can cause a lot of friction with serialization; if it does, try an auto property with a public getter and a protected/private setter, but be aware that deserializers may leave it null.
    • protected/private setters at least guarantee that code consumers aren't modifying the field.
  • I strongly recommend a good understanding of your serializer and possibly even some unit/integration tests to verify this invariant.
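As a sketch of that compromise (the guard and the parameterless constructor here are my additions, and your serializer's behavior may differ), an auto property with a private setter can preserve most of the invariant while staying serializer-friendly:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Family
{
    // Private setter: only Family (and a reflection-based deserializer) can assign Names.
    public List<string> Names { get; private set; }

    // Parameterless constructor for serializers that require one;
    // it still upholds the invariant by starting with an empty list.
    public Family()
    {
        Names = new List<string>();
    }

    public Family(IEnumerable<string> names)
    {
        if (names == null) throw new ArgumentNullException("names");
        Names = names.ToList();
    }
}
```

A unit test that round-trips a Family through your serializer and asserts Names is not null is cheap insurance.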

Conclusion

Null check insanity is often a sign of design smell; invariants are a great first step in the direction of creating a solid contract between producers and consumers of a type.

It may seem like work to enforce invariants, but the dividends in maintainability and readability are worth it. Think how often you stumble upon null checks, or the lack thereof. Understanding these patterns will make them second nature.

1.23.2012

Creating abstractions instead of using generic collections

9.24.2009

The slippery slope of "var"

The var keyword has been a rather controversial addition to the C# language. Many developers initially fear it, getting lost in demos that use it. Eventually, they come to understand it as something "lazy" programmers would use and often dismiss it. I've gone through these stages. I even remember my blood boiling when I saw ReSharper, out of the box, suggest using it! I never really understood it as anything more than implicit typing.

Recently, I decided that I should learn new concepts with new languages instead of just trying to learn and do examples in the same old. Why not kill two birds with one stone! I've been exploring a lot of interpreted languages (functional and imperative), focusing on Scheme (LISP) and python. I've had great joy reading and conceptualizing new things in these languages.

Studying new concepts and applying them with dynamic languages has made me notice, more than ever, all the boilerplate code needed to get anything done in C#. After a weekend of hacking in python, I find myself skipping type declarations in C# only to get compiler errors :(. The simplicity of

names = ('Bob', 'John')
in python is doubled in C# to
List<string> names = new List<string>{ "Bob", "John" };
without any added value!

I have been struggling to find ways to bring this simplicity to C#, short of switching to python altogether :). My favorite way is to cut some of the crap and go with
var names = new List<string>{ "Bob", "John" };
. No loss of information and some added clarity! However, the use of var is often frowned upon as unprofessional; my peers read my latest code only to comment on my "abuse" of it!

What I was missing is the next step in the progression of understanding var. I was starting to realize that it adds clarity through readability! No longer do I have to scan through a bunch of type verbiage in a variable declaration to find the name, let alone the content. Readability alone wasn't convincing my critics, so I pondered the topic some more in regard to another set of concepts I have been working to grasp, DDD.

In studying DDD (Domain Driven Design) I detected a sense of deja vu. The core concept, creating a ubiquitous language and model that permeates the code, was resonating all the way down to the level of variables. If a variable represents a list of employee names, it should be labeled employeeNames. If we cram that much intent and meaning into our variable names, why do we even care what primitive or compound type is used?

employeeNames might be implemented as a List<string> in C# or a simple list in python. However, when all is said and done, employeeNames is neither a List<string> in C# nor a list in python. Thinking about it as such adds no value, just translation overhead. employeeNames is simply a variable to store employee names; its type is employeeNames! The name implies this directly. It describes a kind of name, employee, and it's plural, clearly a collection or set of names. The same would apply for a variable to represent age: even if implemented as an integer, it's not an integer, it is an age!
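To illustrate (the variable names here are mine, not from any particular codebase): when the name carries the intent, var drops the redundant type verbiage without losing anything:

```csharp
using System.Collections.Generic;
using System.Linq;

// The name carries the intent: a collection of employee names.
var employeeNames = new List<string> { "Bob", "John" };

// Compare with the noisier equivalent:
// List<string> employeeNames = new List<string> { "Bob", "John" };

// Consumers read intent from the name, not from the declared type.
var longEmployeeNames = employeeNames.Where(n => n.Length > 3).ToList();
```

Either way the compiler knows the exact static type; the only question is how much of it the reader has to wade through.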

I think this is the answer to help convince people that using var is actually a good thing. Especially when writing unit tests where readability is a primary concern. I would even go so far as to suggest using it anywhere when declaring a variable. The only argument I have heard against this, thus far, is that this will lead to ambiguity. But wait a minute, I think the opposite is true! Static typing leaves room for intention to be left in the type itself. Because of this, developers get sloppy with variable names. When developers don't have a type to prefix a variable, they typically put more intention into the name, for their own sanity!

I would be more inclined to believe I would run across
List<string> employees = new List<string>();
in C# than
employees = ('Bob', 'John')
in python. This example requires knowledge that employees is implemented as a list of strings for someone to extrapolate that employees holds names and not ages or something else. However, with a list of strings, this may just be a guess! It could be a list of their emails or maybe home addresses! I know I've seen this in the past and I know I've done it myself. This added translation only decreases the maintainability of the code.

So the next evolution, in the slope of understanding and using var will be understanding it as a tool of readability and to help avoid leaving intention in types. This adds a layer of linguistic abstraction that hides the implementation of our variable's type and makes it more flexible.

I think the last evolution with var will be to help developers get more familiar with the ideas behind dynamic typing. Implicit typing is like a step in the direction of dynamic typing. The final step, it seems, would be to drop var altogether. It's simply linguistic boilerplate to developers (even if it serves a purpose to the C# compiler).

So get your var on, don't be ashamed!

7.21.2009

Null Object Pattern

Code reuse is very important for developers; most of the patterns and refactorings exist solely to reduce the smell of duplicated code. Null checks are one of the biggest culprits of duplicated code. They are also, to a degree, a concern that often gets scattered throughout an application and isn't properly separated. Without testing and/or good documentation, it's often hard to determine the expectations of a method that returns a reference type or a nullable value type. Often, unnecessary, paranoid null checks are incorporated to deal with the problem.

The null object pattern almost solely exists as a pattern to reduce code, so if it cannot reduce the amount of code you are writing, don’t use it!

The idea is that instead of setting a reference type to null initially, you set it to an implementation of the type that executes "null behavior" when methods are called on its contract that would normally throw a null reference exception.

Pros

  • Reduce code
  • Separated concerns when handling null logic

Cons

  • Not familiar to newer developers
  • Can complicate code; make sure it is actually reducing the amount of code before implementing! This is not a pattern to start with in a standard toolset!

Say we have a contract for a type:

  public interface IThing
  {
    void Do();
    bool Execute();
  }

and your system makes a large number of calls to the Do method but has to check whether the Thing is null before calling Do/Execute, you could initialize the reference to a Null implementation instead. Then, if someone calls a method before a real instance has been assigned, it won't have any side effects (including null reference exceptions).

  public class NullThing : IThing
  {
    public void Do()
    {

    }

    public bool Execute()
    {
      return true;
    }
  }

Then, anywhere that used this type could initialize to the NullThing:

  public class ThingHaver
  {
    private IThing _Thing = new NullThing();
    public IThing Thing
    {
      get
      {
        return _Thing;
      }
      set
      {
        _Thing = value;
      }
    }
  }

What would be really cool is if we could somehow overload the default operator in C# and set what it returns. Then, if we used an abstract base class of IThing (ThingBase), we could make default return our NullThing, assuming that behind the scenes the compiler wired up calls to default whenever it ran across:

private ThingBase _Thing;
Then we wouldn’t even have to set our variable to NullThing! Though maybe I’m getting too far off on a tangent here :) My one concern with this pattern is that it is very easy to produce a disparity in a team if they all aren’t aware of the pattern or it’s implementation in a particular scenario.

-Wes

6.20.2009

Program to an interface, not to an implementation

Program to an interface, not to an implementation.
This maxim was popularized by the Gang of Four's Design Patterns book, and it's a great concept.
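A minimal sketch of the idea (the types here are illustrative, not from any particular codebase): consumers depend on the contract, so implementations can be swapped without touching them:

```csharp
using System;

public interface ILogger
{
    void Log(string message);
}

public class ConsoleLogger : ILogger
{
    public void Log(string message) { Console.WriteLine(message); }
}

public class NullLogger : ILogger
{
    public void Log(string message) { /* no-op, handy in tests */ }
}

public class OrderProcessor
{
    private readonly ILogger _logger;

    // Programmed to ILogger, not to ConsoleLogger:
    // any implementation can be supplied.
    public OrderProcessor(ILogger logger)
    {
        _logger = logger;
    }

    public void Process(string orderId)
    {
        _logger.Log("Processing " + orderId);
    }
}
```

OrderProcessor never mentions a concrete logger, so swapping ConsoleLogger for NullLogger (or anything else implementing ILogger) requires no changes to it.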