Posts in refactoring

1.4.2014

Refactoring: Pushing Behavior Into The Domain

"Reduce duplication, bugs and shotgun surgery, all while increasing testability by pushing behavior into the domain!"

12.29.2013

Refactoring: Renaming Ambiguous Variables When Reading Code

"How many times a day do you find yourself trying to figure out what variables represent because they are named ambiguously?"

3.18.2013

Nebraska Code Camp 3 Recap

Thanks again to everyone that came to my workshop and presentation.

You can find the content of the presentation here:

https://github.com/g0t4/Presentations

I've added a tag for nebraskacodecamp3 in the event I reuse and continue to modify these presentations in the future.

Async Programming in JavaScript with Await and Defer: look in JavascriptAwaitAndDefer. I exported my Google Docs analysis spreadsheet to an HTML file and checked it in too.

HALF DAY WORKSHOP: Introduction to Refactoring with Resharper, look in RefactoringWithResharper.

3.12.2013

Teaser: HALF DAY WORKSHOP: Introduction to Refactoring with Resharper #ncc3

12.18.2012

Why requirejs?

Why are we using requirejs? I've even asked it of myself. For me, organizing large javascript projects and managing dependencies led to it. There are lots of great posts about how to use requirejs but I wanted to focus on WHY.

Explicit Dependency Specifications

Open up any significantly sized javascript file, 100+ lines of code or even 50, and tell me, within 5 seconds, what dependencies it has. Can't do it? With modules and requirejs, you can.

...
function View() {
    this.name = ko.observable('bob');
    this.age = ko.observable(20);
    ...
}
var view = new View();
ko.applyBindings(view, $('viewPlaceHolder')[0]);    
...

versus

require(['jquery', 'knockout'], function($, ko) {
    // who knows what's in here, but I do know what it needs
});


How often do you use this information as a developer? Think about the value of not spending 5 minutes every time you need to know. Or, more likely, the value of not creating a bug because you didn't take the time to find all the dependencies, or missed one.

Explicit Export Specifications

Along with explicit dependencies, modules have an explicit mechanism to define what they export. We can quickly read a module and know what it provides by looking at the return statement, instead of wondering what globals to access. define makes explicit that it is providing an interface to consumers.

define('math', ['dependencyA', 'dependencyB'], function(dependencyA, dependencyB) {
    // horrible things can happen in here, but they stay in local scope
    return {
        min: min,
        max: max,
        average: average
    };
});

Avoid Script Tag Soup

<script src="scripts/utils.js"></script>
<script src="scripts/numbers.js"></script>
<script src="scripts/common.js"></script>
<script src="scripts/formatting.js"></script>
<script src="scripts/thing.js"></script>
<script src="scripts/other.js"></script>
<script src="scripts/site.js"></script>

Did I type the path correctly? Are they ordered properly? Did someone change the order for a reason? Did they leave a comment, is it still applicable? What if I need to refactor one of them and change its position in the list?

With requirejs we can simply work on the dependency lists alone:

define('utils', [], function(){
    ...
});

define('numbers', [], function(){
    ...
});

define('common', ['utils', 'numbers'], function(){
    ...
});

define('formatting', [], function(){
    ...
});

define('thing', ['formatting'], function(){
    ...
});

...

This makes the dependencies explicit, as shown above, instead of hiding meaning in an ordered list of script tags. Those lists are likely copy/pasted every time someone needs to load a set of dependencies, because who is going to manage such a list, trimmed to only the pieces they need, in every view of every application?

Confidence in Refactoring

Unlike in statically typed languages, I can't just build the project to see if I messed up a dependency.

If dependency management is automatic, if we have an explicit, consistent specification of the interface each module exports, and if we use function scope to organize our locals, it becomes very easy to make changes within a module, or to split a module into parts. Our feeble human minds can focus on one module at a time, but have a very difficult time holding the entire scope of an application in view. It's not that you can't practice this with function scopes alone; it's that requirejs makes it a requirement.

Shines a Light on Circular Dependencies and Intent

It's very easy to create circular dependencies in javascript. Refactoring non-module code into modules lets those issues bubble up to the dependency list and makes them apparent. It also helps other developers see how modules were intended to be used together, so they don't make assumptions and violate intended boundaries. You simply cannot see this stuff without a modular approach.

Simplified Testing

This falls back to script tag soup. When we set up tests, we have to worry about loading scripts, even if it's just the module we are testing. We have three choices, take your pick:

  • Copy/paste a shared list between test fixtures, hurting test speed and potentially breaking tests whenever the list changes.
  • Manually create the list for each test fixture by scanning for dependencies by hand (see above).
  • Use requirejs and forget about it.

Simplifies Sharing Code

We have lots of handy JS we create, but sometimes we don't share it because it's too difficult to know what it needs. With requirejs, it's easy to see what needs to go with a piece of code, to use it elsewhere.

Works Client Side

Mobile and other solutions don't always end up using a server-side technology, and even when they do, it varies, so a module framework on the client is a huge benefit, especially for organizing complex mobile applications.

Optimized Build

If desired, you can combine and minify your modules into one script for an optimized experience. The great part is that you can use requirejs in development without this step and add it only at deploy time, alleviating the need to build after every change in dev.
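As a rough sketch of that deploy step (the paths and module names here are hypothetical), a build profile for the r.js optimizer can be as small as:

```javascript
// build.js -- a minimal, hypothetical r.js optimizer profile.
// Run with: node r.js -o build.js
({
    baseUrl: 'scripts',  // where the modules live
    name: 'site',        // entry module; its dependency graph is traced
    out: 'site.min.js',  // single combined, minified output file
    optimize: 'uglify'   // minifier to apply (the default)
})
```

The optimizer walks the same explicit dependency lists discussed above, so nothing in dev changes to enable it.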

Other benefits

  • Only load what you need, on demand and asynchronously!
  • Lazy loaded scripts
  • Although native modules are coming in ES Harmony:

    unless someone can get Microsoft to adopt rapid IE updates, even for old IEs, AMD will be around for a long time - James Burke

  • Consistent approach to mocking modules if testing via stubbing.

Articles against modules

Articles supporting modules

4.11.2012

Programming doesn't have to be Magic

Computer Locke

In the show LOST, the Swan Station had a button that “had to be pushed” every 100 minutes to avoid disaster. Several characters in the show took it upon themselves to have faith and religiously push the button, resetting the clock and averting the unknown “disaster”. There are striking similarities in this story to the code we write every day. Here are some common ones that I encounter:

  • “I don’t know what it does but the application doesn’t work without it”
  • “I added that code because I saw it in other similar places, I didn’t understand it, but thought it was necessary.” (for consistency, or to make things “work”)
  • “An error message recommended it”
  • “I copied that code” (and didn’t look at what it was doing)
  • “It was suggested in a forum online and it fixed my problem so I left it”

In all of these cases we haven’t done our due diligence to understand what the code we are writing is actually doing. In the rush to get things done it seems like we’re willing to push any button (add any line of code) just to get our desired result and move on. All of the above explanations are common things we encounter, and are valid ways to work through a problem we have, but when we find a solution to a task we are working on (whether a bug or a feature), we should take a moment to reflect on what we don’t understand. Remove what isn’t necessary, comprehend and simplify what is.

Why is it detrimental to commit code we don’t understand?

  • Perpetuates unnecessary code
    • If you copy code that isn’t necessary, someone else is more likely to do so, especially peers
  • Perpetuates tech debt
    • Adding unnecessary code leads to extra code that must be understood, maintained and eventually cleaned up in longer lived projects
    • Tech debt begets tech debt as other developers copy or use this code as guidelines in similar situations
  • Increases maintenance
    • How do we know the code is simplified if we don’t understand it?
  • Perpetuates a lack of ownership
    • Makes it seem ok to commit anything so long as it “gets the job done”
  • Perpetuates the notion that programming is magic
    • If we don’t take the time to understand every line of code we add, then we are contributing to the notion that it is simply enough to make the code work, regardless of how.

TLDR

Don’t commit code that you don’t understand, take the time to understand it, simplify it and then commit it!

4.5.2011

Refactoring Part II - Tight rope walking / what can break and KISS

Like it or not, we humans make mistakes. If we embrace the fact that we are going to make mistakes, we can direct our efforts to reduce mistakes in areas that are critical in exchange for potentially making more mistakes in areas that aren’t. Gasp! We need to get over the silly notion that our work can ever be 100% perfect and try to maximize in the areas that matter.

Does it really matter?

These are the things I’ve found that typically don’t matter as much in the grand scheme of development. Start training yourself to identify areas that matter!

  • Infrequently used features
    • Especially if there’s an easy workaround during a failure
  • Administrative crud pages
    • Especially in smaller apps, the developer is usually the admin and can just hack at the DB in a failure
  • MVC Controllers
  • Logging
    • This just means debugging will be a bit harder and I’m sure we’ll fix it quick enough.
  • User management / authentication
    • Development typically involves logging in daily, so it’s likely we’ll catch the mistakes.
    • Please just use OpenId already or another common alternative to rolling your own.
    • If no one can login, deploying a fix won’t interrupt anyone!
  • Easily fixed/deployed features
    • Any feature that isn’t critical, that can easily be fixed and deployed.
  • CSS and images
    • How many of the things we do with CSS and images are just for aesthetic purposes and really don’t matter if the application still functions?
    • Do I really care if my bank sends me a statement and their logo shows up as a red X?
  • Reports versus entry
    • If we allow a change (like a balance transfer) to occur that is invalid, it’s probably a bigger problem than if we accidentally show an invalid balance total on a statement report. This is highly subjective, but I’m more worried about data going into my systems than data coming out, except where data flowing out flows into other systems.
  • Features that are no longer used / should be removed
  • Areas where testing seems to slow down development
    • IMHO, testing typically slows down development in areas that don’t matter (stubbing controllers/services, duplicated/overlapping test cases, KISS code). In areas that are important, we typically find complexity, and testing often helps avoid bugs in complexity faster than F5 debugging.

KISS

In the areas that don’t matter, we should strive for simplicity. Readable and simplified code is less likely to contain mistakes. Controllers can be thin instead of fat. Reports can isolate complexity in a tested model or functional components of reuse (calculations).

What does this mean?

So we know what doesn’t matter as much, what does that mean? For me it means less testing:

  • Not writing automated tests (or very few) … integration or unit
  • Not double/triple checking my work
  • Sometimes compiling is good enough
  • Sometimes a quick run of a common scenario on a related page, but not all scenarios
  • Rarely, the occasional bat shit crazy refactoring with little regard to these areas.

Conclusions

Some of this may sound radical, if so, don’t adopt it. I can refactor a lot faster if I know where I can run versus where I should crawl. Always crawling (extra effort is expended upfront) is as reckless as always running (extra effort is expended after the fact), an optimum is usually found in balancing what can and what shouldn’t break.

Sadly, I’ve seen a lot of hugely beneficial refactoring passed up simply because it would be invasive to areas that ironically aren’t that important in the grand scheme of things.

Happy Coding!

-Wes

1.29.2011

Refactoring Part 1 : Intuitive Investments

Fear, it’s what turns maintaining applications into a nightmare. Technology moves on, teams move on, someone is left to operate the application, and what was green is now perceived brown. Eventually the business will evolve and changes will need to be made. The approach to those changes often dictates the long term viability of the application. Fear of change, lack of passion and a lack of interest in understanding the domain often lead to a paranoia about doing anything that doesn’t involve duct tape and baling twine. Don’t get me wrong, those have a place in the short term viability of a project, but they don’t have a place in the long term. Add to it “us versus them” attitudes between the original team and those that maintain the application, internal politics and other factors, and you have a recipe for disaster. This results in code that quickly becomes unmanageable. Even the most clever of designs will eventually become suboptimal, and debt will mount that makes changes exponentially more difficult.

This is where refactoring comes in, and it’s something I’m very passionate about. Refactoring is about improving the process whereby we make change, it’s an exponential investment in the process of change. Without it we will incur exponential complexity that halts productivity. Investments, especially in the long term, require intuition and reflection.

How can we tackle new development effectively via evolving the original design and paying off debt that has been incurred?

The longer we wait to ask and answer this question, the more it will cost us. Small requests don’t warrant big changes, but realizing when changes now will pay off in the long term, and especially in the short term, is valuable.

I have done my fair share of maintaining applications and continuously refactoring as needed, but recently I’ve begun work on a project that hasn’t had much debt, if any, paid down in years. This is the first in a series of blog posts to try to capture the process which is largely driven by intuition of smaller refactorings from other projects.

Signs that refactoring could help:

Testability

  • How can decreasing test time not pay dividends?
  • One of the first things I found was that a very important piece often takes 30+ minutes to test. I can only imagine how much time this has cost historically, but more importantly the time it might cost in the coming weeks: I estimate at least 10-20 hours per person! This is simply unacceptable for almost any situation. As it turns out, about 6 hours of working with this part of the application and I was able to cut the time down to under 30 seconds! In less than the lost time of one week, I was able to fix the problem for all future weeks!
  • If we can’t test fast then we can’t change fast, nor with confidence.
  • Code is used by end users and it’s also used by developers, consider your own needs in terms of the code base. Adding logic to enable/disable features during testing can help decouple parts of an application and lead to massive improvements. What exactly is so wrong about test code in real code? Often, these become features for operators and sometimes end users.
  • If you cannot run an integration test within a test runner in your IDE, it’s time to refactor.

Readability

  • Are variables named meaningfully via a ubiquitous language?
  • Is the code segmented functionally or behaviorally so as to minimize the complexity of any one area?
  • Are aspects properly segmented to avoid confusion (security, logging, transactions, translations, dependency management etc)
  • Is the code declarative (what) or imperative (how)? What matters, not how. LINQ is a great abstraction of the what, not how, of collection manipulation. The Reactive framework is a great example of the what, not how, of managing streams of data.
  • Are constants abstracted and named, or are they just inline?
  • Do people constantly bitch about the code/design?
  • If the code is hard to understand, it will be hard to change with confidence. It’s a large undertaking if the original designers didn’t pay much attention to readability, and as such it will never be done to “completion.” Make sure not to go overboard; instead, use this as you change an application, not in lieu of changes (like with testability).

Complexity

  • Simplicity will never be achieved, it’s highly subjective. That said, a lot of code can be significantly simplified, tidy it up as you go.
  • Refactoring will often converge upon a simplification step after enough time, keep an eye out for this.

Understandability

  • In the process of changing code, one often gains a better understanding of it. Refactoring code is a good way to learn how it works. However, it’s usually best in combination with other reasons, in effect killing two birds with one stone. Often this is done when readability is poor, in which case understandability is usually poor as well. In the large undertaking we are making with this legacy application, we will be replacing it. Therefore, understanding all of its features is important and this refactoring technique will come in very handy.

Unused code

  • How can deleting things not help?
  • This is a freebie in refactoring, it’s very easy to detect with modern tools, especially in statically typed languages. We have VCS for a reason, if in doubt, delete it out (ok that was cheesy)!
  • If you don’t know where to start when refactoring, this is an excellent starting point!

Duplication

  • Do not pray and sacrifice to the anti-duplication gods, there are excellent examples where consolidated code is a horrible idea, usually with divergent domains. That said, mediocre developers live by copy/paste. Other times features converge and aren’t combined. Tools for finding similar code are great in the example of copy/paste problems. Knowledge of the domain helps identify convergent concepts that often lead to convergent solutions and will give intuition for where to look for conceptual repetition.

80/20 and the Boy Scouts

  • It’s often said that 80% of the time 20% of the application is used most. These tend to be the parts that are changed. There are also parts of the code where 80% of the time is spent changing 20% (probably for all the refactoring smells above). I focus on these areas any time I make a change and follow the philosophy of the Boy Scout in cleaning up more than I messed up. If I spend 2 hours changing an application, in the 20%, I’ll always spend at least 15 minutes cleaning it or nearby areas.
  • This gives a huge productivity edge on developers that don’t.
  • Ironically after a short period of time the 20% shrinks enough that we don’t have to spend 80% of our time there and can move on to other areas.

Refactoring is highly subjective, never attempt to refactor to completion! Learn to be comfortable with leaving one part of the application in a better state than others. It’s an evolution, not a revolution. These are some simple areas to look into when making changes and can help get one started in the process. I’ve often found that refactoring is a convergent process towards simplicity that sometimes spans a few hours but often can lead to massive simplifications over the timespan of weeks and months of regular development.

11.25.2009

Remove<T>(this IList<T> source, Func<T,bool> selector) why not?

Maybe I am just crazy, but it seems like removing or deleting items from a collection is always an afterthought. Take IList<T>, for example: a list of items with the ability to add and remove from it. We have a flurry of extension methods inherited from IEnumerable<T> to query items, but it seems like no one thought it would be nice to beef up the Remove method with some extension methods. Maybe I missed an extension namespace, maybe I am just crazy. How many times do we have to write the following bloated code just to remove something from a collection?

var ingredient = Ingredients.FirstOrDefault(i => i.FeedId == feed.Id);
Ingredients.Remove(ingredient);

Even worse is when we want to remove multiple items from a list based on some criteria. I have to add a foreach loop of some sort and remove each item one at a time. Oh, and deal with the always wonderful System.InvalidOperationException: "Collection was modified; enumeration operation may not execute." So I have to remember to create a new list to avoid this dreadful mistake:

var ingredients = Ingredients.Where(i => i.FeedId == feed.Id).ToList();
foreach (var ingredient in ingredients)
{
  Ingredients.Remove(ingredient);
}

Of course, now that I have created a new list, ToList(), I can simplify this to:

var ingredients = Ingredients.Where(i => i.FeedId == feed.Id).ToList();
ingredients.ForEach(i=> Ingredients.Remove(i));

But that is still icky, all that code just to remove an item or a set of items? So enough complaining on my part, it's time to put up or shut up. This is what I want to do in code:

Ingredients.Remove(i => i.FeedId == feed.Id);

Simple, right? And to do this here is the simple extension method:

public static void Remove<T>(this IList<T> source, Func<T, bool> selector)
{
  if (source == null || selector == null)
  {
    return;
  }

  var itemsToRemove = source.Where(selector).ToList();
  itemsToRemove.ForEach(i => source.Remove(i));
}

Now, I no longer have to worry about all these things when I want to remove items based on searching my list:

  • Doing a separate search for the items I want to remove
  • Creating a new list, avoiding the pitfall of enumerating when removing and getting "Collection was modified" exceptions.
  • Iterating over those items and calling remove on each

Some purists out there would be mad that I quietly allow the source/selector to be null rather than throwing an exception. That's how I chose to implement this; if you want different behavior, that's great. Just make sure you add this extension method to your arsenal! Maybe another RemoveSingle that throws an exception if it finds 0 or more than 1 item would be appropriate to add as well?
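The same idea ports to other languages too. As a quick, hypothetical JavaScript analogue (the function name `removeWhere` is mine, not from any library), you can sidestep the modified-while-enumerating trap a different way, by iterating backwards instead of materializing a copy:

```javascript
// Remove every element matching the predicate, in place.
// Iterating backwards means each splice only shifts indexes we
// have already visited -- the same pitfall the C# version avoids
// by materializing a separate list first.
function removeWhere(source, predicate) {
    if (!source || !predicate) {
        return; // mirror the "quietly allow null" choice above
    }
    for (var i = source.length - 1; i >= 0; i--) {
        if (predicate(source[i])) {
            source.splice(i, 1);
        }
    }
}
```

Usage mirrors the one-liner the post aims for: `removeWhere(ingredients, function(i) { return i.feedId === feed.id; });`.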

Happy coding!

10.22.2009

Apply DeMorgan's law refactoring

After reading up on some refactorings after a few hours of refactoring today, I thought it might be nice to share this refactoring with everyone.

I ran into some code today, a simple equality comparison method, except it was comparing about 7 properties, so even worse than this:

public bool Compare(item x, item y)
{
  return !(x.a != y.a || x.b != y.b || x.c != y.c);
}

Even though it doesn't take long to figure out what it is doing, it takes more time to figure out than if we applied DeMorgan's law in what is known as the avoiding double negatives refactoring:

public bool Compare(item x, item y)
{
  return x.a == y.a && x.b == y.b && x.c == y.c;
}

Any time you run across code where you negate a set of conditions, do everyone else a favor and invert the conditions. Apply DeMorgan's law to make the code more readable. To freshen up on the rules, visit http://en.wikipedia.org/wiki/De_Morgan's_laws. Saving even 30 seconds of translation every time someone looks at that line of code becomes a huge savings.
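Since this is a purely mechanical transformation, it's easy to verify mechanically. Here is a quick sketch (plain JavaScript, not from the original post) that exhaustively checks both forms agree:

```javascript
// De Morgan's law: !(p || q) is equivalent to !p && !q.
// The refactoring above is this law applied to comparison results:
// !(x.a != y.a || x.b != y.b || ...) becomes x.a == y.a && x.b == y.b && ...
function negatedForm(p, q) { return !(p || q); }
function positiveForm(p, q) { return !p && !q; }

// Exhaustively check every boolean combination of inputs.
[true, false].forEach(function(p) {
    [true, false].forEach(function(q) {
        if (negatedForm(p, q) !== positiveForm(p, q)) {
            throw new Error('forms diverge for ' + p + ', ' + q);
        }
    });
});
```

Four input combinations cover every case, which is why the inverted form is a safe, behavior-preserving rewrite.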

6.17.2009

Refactoring - Compose Method

This refactoring, again from Refactoring to Patterns, reduces ambiguity in methods that have too much detail or conditional logic to quickly comprehend. By breaking a method down into smaller methods, the ambiguity can be removed.

The mechanics of this refactoring are rather subjective, but obviously lead to the use of Extract Method. Study the code and make sure you understand it, the longer this takes, the more you need to apply this refactoring! Clean up the code, look for unused or duplicated pieces. Then, extract detail to smaller methods. Reveal intention through naming of new variables and methods. Once you are done, make sure all the code in the Composed Method is of the same level of detail.

Pros

  1. Readability
  2. Uniform level of detail.
  3. Joshua lists difficulty in debugging as a drawback, however, I think a stack trace that points to the smaller, intentionally named method will increase the likelihood that I can figure out what is wrong before I even crack open the code!
  4. Another benefit to add to Joshua’s list: code re-use of the extracted methods to replace other duplicated code.

Cons

  1. Excessive quantity of small methods, though Joshua points out this could lead to applying Extract Class.

So I rummaged around in some code and stumbled upon a text box validation method that was a great candidate for this refactoring:

public virtual void Validate()
{
  IsValid = true;
  if (Visible && Enabled)
  {
    if (IsRequired)
    {
      if (String.IsNullOrEmpty(Text) || Text.Trim().Length == 0)
      {
        ErrorMessage = String.Format(Messages.TextBoxCannotBeEmpty, InvalidDisplayName);
        IsValid = false;
      }
    }

    if (MaxLength != 0 && !String.IsNullOrEmpty(Text) && Text.Trim().Length > MaxLength)
    {
      ErrorMessage = String.Format(Messages.TextBoxExceedsMaximumLength, InvalidDisplayName, MaxLength);
      IsValid = false;
    }

    if (MinLength > 0 && (String.IsNullOrEmpty(Text) || Text.Trim().Length < MinLength))
    {
      ErrorMessage = String.Format(Messages.TextBoxUnderMinimumLength, InvalidDisplayName, MinLength);
      IsValid = false;
    }
  }
}

The level of nested conditionals makes it rather hard to follow. First, I want to remove some of the nesting by returning immediately if the validation isn’t applicable:

public virtual void Validate()
{
  IsValid = true;
  if (!Visible || !Enabled)
  {
    return;
  }
  if (IsRequired)
  {
    if (String.IsNullOrEmpty(Text) || Text.Trim().Length == 0)
    {
      ErrorMessage = String.Format(Messages.TextBoxCannotBeEmpty, InvalidDisplayName);
      IsValid = false;
    }
  }

  if (MaxLength != 0 && !String.IsNullOrEmpty(Text) && Text.Trim().Length > MaxLength)
  {
    ErrorMessage = String.Format(Messages.TextBoxExceedsMaximumLength, InvalidDisplayName, MaxLength);
    IsValid = false;
  }

  if (MinLength > 0 && (String.IsNullOrEmpty(Text) || Text.Trim().Length < MinLength))
  {
    ErrorMessage = String.Format(Messages.TextBoxUnderMinimumLength, InvalidDisplayName, MinLength);
    IsValid = false;
  }
}

Next, I notice 3 different “sub validations” occurring. One validates if the field is required, the next validates the maximum length and the last validates the minimum length. Each of these can become their own methods:

public virtual void Validate()
{
  IsValid = true;
  if (!Visible || !Enabled)
  {
    return;
  }

  ValidateRequired();
  ValidateMaximumLength();
  ValidateMinimumLength();
}

private void ValidateRequired()
{
  if (IsRequired)
  {
    if (String.IsNullOrEmpty(Text) || Text.Trim().Length == 0)
    {
      ErrorMessage = String.Format(Messages.TextBoxCannotBeEmpty, InvalidDisplayName);
      IsValid = false;
    }
  }
}

private void ValidateMaximumLength()
{
  if (MaxLength != 0 && !String.IsNullOrEmpty(Text) && Text.Trim().Length > MaxLength)
  {
    ErrorMessage = String.Format(Messages.TextBoxExceedsMaximumLength, InvalidDisplayName, MaxLength);
    IsValid = false;
  }
}

private void ValidateMinimumLength()
{
  if (MinLength > 0 && (String.IsNullOrEmpty(Text) || Text.Trim().Length < MinLength))
  {
    ErrorMessage = String.Format(Messages.TextBoxUnderMinimumLength, InvalidDisplayName, MinLength);
    IsValid = false;
  }
}

Finally, I realize the level of detail isn’t uniform; I should extract a new property to check whether the control is in use. Also, even more troublesome, the intention of the first line, setting IsValid to true, is lost on future developers. It is needed to reset the validation process; otherwise, if a control is validated and fails, subsequent calls may still return invalid. My new Composed Method finally becomes:

public virtual void Validate()
{
  ResetValidation();
  if (!InUse)
  {
    return;
  }

  ValidateRequired();
  ValidateMaximumLength();
  ValidateMinimumLength();
}

private void ResetValidation()
{
  IsValid = true;
}

private bool InUse
{
  get { return Visible && Enabled; }
}

...

The new Composed Method is much easier to comprehend. Ironically enough, looking at other validation methods, the newly extracted methods can be re-used, either through inheritance or possibly by applying Extract Class. The new InUse property may eventually be exposed for public consumption.

6.17.2009

Chain Constructors

Chain Constructors is another refactoring from the book Refactoring to Patterns; it helps remove duplication in constructor overloads. This should be a very familiar refactoring for most developers. The idea is to produce a catch-all constructor, hopefully just one, that the other constructors rely on either directly or indirectly.

Mechanics

  1. Study the constructors in the class and find the two with the most shared code. Say I have two constructors to create lines as follows:
    public class Line
    {
      ...
      private Line()
      {
        Style = LineStyle.Solid;
        Color = LineColor.Black;
        Thickness = 1;
      }
    
      private Line(LineStyle style, LineColor color, int thickness)
      {
        Style = style;
        Color = color;
        Thickness = thickness;
      }
      ...
    }
    
    These can be chained together, to reduce redundant code and leave a common spot for future property initializations.
    public class Line
    {
      ...
      private Line()
        : this(LineStyle.Solid, LineColor.Black, 1)
      {
      }
    
      private Line(LineStyle style, LineColor color, int thickness)
      {
        Style = style;
        Color = color;
        Thickness = thickness;
      }
      ...
    }
    
  2. Repeat step one, even with constructors already chained, to maximize chaining and code re-use.
  3. Re-scope any constructors that no longer need to be public.

Pros

  1. Code reuse
  2. Avoid future mistakes when new code added to one constructor isn’t copied to the others.

Cons

  1. Highly nested constructors might become confusing. Try to keep it simple; if you find many levels of chaining, you may want to consider applying refactorings to or towards Creation Methods and/or Factory patterns to help reduce ambiguity.

6.16.2009

Refactoring, Creation Methods

Lately, I’ve been engulfed in several design books, one of which is Refactoring to Patterns by Joshua Kerievsky. The book is an extension of Martin Fowler’s book, Refactoring. It emphasizes learning to use patterns by working with existing code, identifying code that “smells” and then finding a pattern to refactor the code “to, towards or away from.” Smelly code typically involves duplication, complexity and/or ambiguity.

Rather than heavy pattern use up front, Joshua believes in letting code evolve into or away from patterns. I thought it might be fun to share some of these refactorings, with examples from my code past (gasp).

One of the first refactorings is Replace Constructors with Creation Methods. Simply put, when we have lots of overloads or parameters for our constructor(s), ambiguity arises as to the intended use. Often we add conditional plumbing to construct different instances of the same class with different run time behavior. The problem is that the constructors are all called the same way, and we have no way of conveying intention through naming conventions like we can with methods and other members.

I went back to some old code and found an example of where this refactoring could help. The code was doing various types of charting by abstracting the creation of lines to separate instances before rendering the chart. One of the classes, scrubbed, had constructors as follows:

public class Line
{
  ...
  public Line(int year, int month, int productId)
  {
    Initialize(year, month, productId);
    LoadData();
  }

  public Line(int month, int productId)
  {
    Initialize(2000, month, productId);
  }
  ...
}

The first overload has an extra parameter for the year; that is all a user of this class would see at instantiation sites. Without digging deeper, they wouldn't realize that one overload queries and loads actual data points for the specified product, while the other was intended to provide a way to manipulate and map existing lines to new lines.

The refactoring uses methods to create instances of the class and subsequently, if appropriate, hides the original constructors so users are forced to use the new, hopefully less ambiguous, Creation Methods.

Every refactoring has a set of steps, which should be performed one at a time until the desired changes are complete. Joshua emphasizes that not all steps must be completed. Sometimes only part of a refactoring is sufficient to clean up smelly code.

The “Mechanics,” as Joshua refers to in his book, are as follows:

  1. Find a spot where the class is instantiated and apply the refactoring Extract Method.

    Line line = new Line(2009, 2, productId);
    

    Refactors to (note the extracted method should be static):

    Line line = CreateLine(2009, 2, productId);
    ...
    public static Line CreateLine(int year, int month, int productId)
    {
      return new Line(year, month, productId);
    }
    
  2. Next, apply Move Method and move the new method to the class it instantiates.

    Line line = Line.CreateLine(2009, 2, productId);
    ...
    public class Line
    {
      ...
      public static Line CreateLine(int year, int month, int productId)
      {
        return new Line(year, month, productId);
      }
      ...
    }
    
  3. Find all code that uses this specific constructor, for this specific intended use, and refactor it to use the new Creation Method. Be careful to check for constructors that serve several different intended purposes; each purpose should get its own Creation Method. Red flags include conditional logic in the constructor and/or null/default parameter values that drive different class behaviors.
  4. Joshua says to repeat steps 1 to 3 for all other constructor overloads. I'd rather swap his last two steps. So, in this step, make any constructors that are no longer used publicly private or internal, to keep users from calling the constructors directly. Remember, each step is optional; in some cases you may not want to re-scope the constructor.

    public class Line
    {
      ...
      private Line(int year, int month, int productId)
      {
        Initialize(year, month, productId);
        LoadData();
      }
      ...
    }
    
  5. Repeat for all applicable constructors. To save time, I will show the final form of my Line class:

    public class Line
    {
      ...
      private Line(int year, int month, int productId)
      {
        Initialize(year, month, productId);
        LoadData();
      }

      private Line(int month, int productId)
      {
        Initialize(2000, month, productId);
      }

      public static Line CreateLine(int year, int month, int productId)
      {
        return new Line(year, month, productId);
      }

      public static Line CreateMappingLine(int month, int productId)
      {
        return new Line(month, productId);
      }
      ...
    }


It is easy to see how the intentions of the different constructors are present in the names of their Creation Methods. The second overload’s Creation Method now clearly creates a mapping line whereas the first creates an actual line. This helps the developer avoid assumptions when reading the client code!
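To illustrate the red flag from step 3 on its own, here is a hypothetical, self-contained sketch: a single constructor whose boolean flag silently selects different behavior, wrapped by two Creation Methods so the intent shows at every call site. The `loadData` flag and `LoadedData` property are stand-ins I invented for the real `Initialize`/`LoadData` calls.

```csharp
public class Line
{
    // Stand-in for the side effect of LoadData() in the real class.
    public bool LoadedData { get; private set; }

    // One private constructor; the flag picks the behavior.
    private Line(int year, int month, int productId, bool loadData)
    {
        // Initialize(year, month, productId) would run here.
        if (loadData)
        {
            LoadedData = true;
        }
    }

    // Creates an actual line backed by queried data points.
    public static Line CreateLine(int year, int month, int productId)
    {
        return new Line(year, month, productId, true);
    }

    // Creates a line used only to map existing lines; no data load.
    public static Line CreateMappingLine(int month, int productId)
    {
        return new Line(2000, month, productId, false);
    }
}
```

A caller reading `Line.CreateMappingLine(2, productId)` no longer has to know what an omitted year implies.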

Joshua lists pros and cons for each refactoring. For this one, the positives are that we can express intent and increase the readability and maintainability of our code. Also, where a single constructor creates instances that behave differently, Creation Methods can share the exact same parameter set while showing the intended differences in their names; with constructors we can only have one overload per parameter set. Finally, a word of caution: Creation Methods are rather non-standard. I would personally, and Joshua hints at this too, use some type of Factory pattern rather than bare Creation Methods. Developers are more likely to be familiar with Factory patterns and to look for one when public constructors are missing, at least more so than Creation Methods.
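A minimal sketch of that alternative, with the class and member names invented for illustration: the Creation Methods move onto a dedicated factory, and the constructor is scoped so only the factory's assembly can call it.

```csharp
public class Line
{
    public int Year { get; private set; }
    public int Month { get; private set; }
    public int ProductId { get; private set; }
    public bool IsMapping { get; private set; }

    // internal: only code in this assembly (the factory) constructs lines
    internal Line(int year, int month, int productId, bool isMapping)
    {
        Year = year;
        Month = month;
        ProductId = productId;
        IsMapping = isMapping;
    }
}

// A dedicated factory gives developers a familiar construct to look for
// when they notice the public constructors are gone.
public class LineFactory
{
    public Line CreateLine(int year, int month, int productId)
    {
        return new Line(year, month, productId, false);
    }

    public Line CreateMappingLine(int month, int productId)
    {
        // mapping lines default the year, mirroring the original overload
        return new Line(2000, month, productId, true);
    }
}
```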
Happy Coding!