Programming In The Small

PITS

PITS – Details matter. [svg]

“Uh oh. I made a clean spot here. Now I’ve done it. Guess I’ll have to do the whole thing.” –E. Ripley

The risk with getting attached to the idea of Programming In The Small—abbreviated PITS from now on—is that you can not stop once you start using it on a piece of code.

That is, in my humble opinion, an acceptable risk.

The idea – back to basics

PITS is the idea that you do not need to overhaul the current programming language in favor of the hottest trend, nor switch to a “better” technology/API, not even waste another minute complaining about how badly written the legacy system you work on is.

All you need is going back to basics; the nuts and bolts of good programming style.

Things like good names for variables and methods. Reducing local duplication in methods and classes. Throwing away unnecessary comments explaining the code, and writing methods instead. In short: increasing readability in the smallest of scales. Forget about the big picture – focus on the small details instead.

The simple, bread-and-butter of coding—not the fancy design patterns, superior technologies or new paradigms. The latter could be viewed as Castles in the Sky in comparison to the mundane craft of PITS.

Another good aspect of PITS

Compared to bigger picture refactorings such as applying more or less complicated design patterns, or swapping one technology for the other, it is also less intensive on the team. What I mean by that is the pros and cons of applying different patterns, or the merits of one tool, language or API over the other can keep developers talking for hours and hours, often without reaching any consensus and—in the worst case—ending in unresolved micro conflicts. The small scale refactorings proposed by PITS is not really up for discussion, and very few people will be upset by anyone improving the readability of the code.

So are you opposed to learning new languages, technologies and design patterns?

Not at all. I think it is great, and necessary, to learn new stuff. I love it!

However, if I was in a position to put together a software developer team, and had to choose between working with a programmer that knew a lot of patterns and other more advanced programming techniques, but did not understand nor appreciate PITS, and a programmer that was not that into the latest trends/technologies/patterns, but with an eye for PITS, I would choose the latter.

An example of PITS

To illustrate PITS I have found a snippet of C# code to work with. It is a property getter which is not that hard to read to start with. Still – it does require some staring at to convince yourself it is correct.

// Version 1 - Original code snippet (found in Entity Framework)
public static string[] Args
{
    get
    {
        // GetCommandLineArgs puts the program in element 0
        // The args passed to Main do not do this so remove it
        var commandLineArgs = CommandEnvironment.GetCommandLineArgs();
        args = new string[commandLineArgs.Length - 1];
        for (var i = 0; i < commandLineArgs.Length - 1; i++)
        {
            args[i] = commandLineArgs[i + 1];
        }

        return args;
    }
    internal set { args = value; }
}

This is the original source code snippet. I found it in Entity Framework, which is open source and hosted at codeplex.com. The first thing I noticed was a big comment in the middle. An explaining comment is a smell for me; if code needs explanation it has good potential to be written clearer*. However, this comment does provide some value to us–it explains why the algorithm is needed at all.

First thing I do is introduce a variable ‘length’ to get rid of the two uses of ‘commandLineArgs.Length’.

// Version 2 - added local variable length, reducing duplication
public static string[] Args
{
    get
    {
        var commandLineArgs = CommandEnvironment.GetCommandLineArgs();
        var length = commandLineArgs.Length;
        args = new string[length - 1];
        for (var i = 0; i < length - 1; i++)
        {
            args[i] = commandLineArgs[i + 1];
        }

        return args;
    }
    internal set { args = value; }
}

Already the code is easier to read in my opinion. Also, doing that refactoring made me spot another duplication: ‘length – 1’ is performed twice. Let us extract another variable and call it ‘resultLength’, since the result array will be of that length.

// Version 3 - added local variable resultLength, reducing duplication
public static string[] Args
{
    get
    {
        var commandLineArgs = CommandEnvironment.GetCommandLineArgs();
        var length = commandLineArgs.Length;
        var resultLength = length - 1;
        args = new string[resultLength];
        for (var i = 0; i < resultLength; i++)
        {
            args[i] = commandLineArgs[i + 1];
        }

        return args;
    }
    internal set { args = value; }
}

The initial refactoring, adding local variable ‘length’, now seems to add clutter instead of clarity. I therefore inline ‘length’ into the assignment of ‘resultLength’.

// Version 4 - removed local variable length, reducing clutter
public static string[] Args
{
    get
    {
        var commandLineArgs = CommandEnvironment.GetCommandLineArgs();
        var resultLength = commandLineArgs.Length - 1;
        args = new string[resultLength];
        for (var i = 0; i < resultLength; i++)
        {
            args[i] = commandLineArgs[i + 1];
        }

        return args;
    }
    internal set { args = value; }
}

To me the code is starting to look a lot more readable. Still, it could be even cleaner. We could separate the actual algorithm into a subroutine and call it something appropriate to add readability to the code. Remember, method names instead of explaining comments.

Since the algorithm creates a new string array containing all elements but the first, I will call it ‘AllButFirst’.

// Version 5 - extracted method AllButFirst to improve readability
public static string[] Args
{
    get
    {
        var commandLineArgs = CommandEnvironment.GetCommandLineArgs();
        args = AllButFirst(commandLineArgs);
        return args;
    }
    internal set { args = value; }
}

private static string[] AllButFirst(string[] array)
{
    var resultLength = array.Length - 1;
    result = new string[resultLength];
    for (var i = 0; i < resultLength; i++)
    {
        result[i] = array[i + 1];
    }
    return result;
}

In the extracted method I have renamed the parameter to ‘array’ and all mentions of ‘args’ or ‘command’ has been removed to.

We could go even further and convince ourselves to get rid of the loop altogether by using a temporary List<string>.

// Version 6 - rewrote method AllButFirst
public static string[] Args
{
    get
    {
        var commandLineArgs = CommandEnvironment.GetCommandLineArgs();
        args = AllButFirst(commandLineArgs);
        return args;
    }
    internal set { args = value; }
}

private static string[] AllButFirst(string[] array)
{
    var tempList = new List(array);
    tempList.RemoveAt(0);
    return tempList.ToArray();
}

I will stop now, but as you see it can be quite fun and the only** risk you run into is not being able to stop. Try to stick to the Scouts’ Rule – leave the place a little cleaner – but do not build a house beside the camp fire!

I hope you agree with me the end result is quite a lot easier to understand (and thus maintain) than the original version.

Apart from the code quality gains PITS gives us, it is also a lot of fun. It is a challenge to make code more readable, you can always ask yourself “Can I make this code even more readable?”.

* Some explaining comments I do fancy. That is those that explain code that is completely odd; situations like having to send in a NULL value to a built-in API to get around a bug, or similiar “hacks”. Comments in general will be the topic for a separate upcoming post.
** OK, not the only risk. You need to keep your code behaving the same. In a statically typed language like C# you can do most of the above refactorings safely (rename and extract method refactorings are built-in Visual Studio) making unit tests a little less necessary than in a dynamically typed language where most refactorings are manual. But unit tests is my preferred method of convincing myself I do not break code anyway, since it gives me more freedom restructuring code (not only autorefactorings available). In fact, the necessity of writing unit tests even when performing simple refactorings could be seen as a good property of dynamic languages.

Background

I found out about the PITS idea during a Software Craftmanship seminar in London 2009. You can read the notes from that seminar by clicking this link.

In My Humble Opinion…is a book being written by me, Olof Bjarnason. It is about my views on the business of developing software. This blog post will probably end up in that book.

 

[ About | Book posts ]

Advertisements

6 thoughts on “Programming In The Small

  1. Pingback: Scouts Rule | Voice Steam

  2. In this specific case i would do:

    return CommandEnvironment.GetCommandLineArgs().Skip(1).ToArray();

    In the general case, I think overly refactoring small non duplicated code to separate functions is a bad habit since it often makes the code harder to read, as it is more work to actually check what the function actually does than to handle 3-4 lines of code inline. There is simply some kind of limit to where it is counter productive to introduce an extra identifier to handle a small operation like that.

    This is why I simply love LINQ, as it introduces powerful new “known operations” whose meaning is already clear in the mind, and therefore doesn’t clutter the thinking with new unknown functions, but still lets us write terse code that introduces much less amibigouity of what it actually does.

    Compare the
    var resultLength = array.Length – 1;
    result = new string[resultLength];
    for (var i = 0; i < resultLength;
    {
    result[i] = array[i + 1];
    }
    return result;

    with:
    return array.Skip(1).ToArray();

    Yes, performance wise, it is probably not as good, but from the point of understandability, it is unbeatable.

  3. Yes you have a good point in LINQs succinctness and how too granular identifiers/methods can obscure instead of clarify. However, the example is not meant to be read too blindly–it is just an illustration for the sake of explanation. It does take up a lot of “real estate” in the text; maybe because of that I should remove some steps and use Skip() instead?

  4. …or rather (since args should be assigned):
    args = CommandEnvironment.GetCommandLineArgs().Skip(1).ToArray();
    return args;

    or even
    return args = CommandEnvironment.GetCommandLineArgs().Skip(1).ToArray();

  5. In general, I think that code should require you to keep as small context as possible in your head to understand what is going on, and I think you are right in that it often helps to make simplifications locally.

    But I think it is hard to say that it always makes things simpler to for example remove all duplication, or that the opposite always is right. It depends a bit on how you do it,
    For example, if you replace a simple, obvious, expression like .Length – 1 with a variable it would, for me, depend on how you name the variable… Named “resultLength” it would often introduce something I’ll might have to look up how it was calculated, but named “ArrLengthMinus1”, it would be helpful…

    Regarding comments, I prefer the kind that explains why the code is there, i.e. summary comments on metods and properties, rather than the one that explains things I just as well can see in the code. In your example, I think the comment you removed actually was helpful, but not as an inline comment in the getter, but rather as a summary comment on the property as a whole.

  6. Pingback: Don’t Repeat Yourself – 1 | Voice Steam

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s