Steve Lillis' Code Blog

Tuesday, 30 December 2014

The When and Why: Properties and Fields

Fields and properties are two important concepts that come up very early on when learning to code. Because of their surface-level similarities, it can be confusing to work out when to use each of them and why.

When?

Only ever use fields with the private access modifier. They can be used with protected or public but it's generally poor practice and I'll explain why later.

Use properties for all non-private scenarios, even if the properties are auto-implemented. You can also use them in a private context when implementing get and set logic.

public class ExampleClass
{
    // This particular field is used as a 'backing' field
    // for PropertyWithLogic
    private int _myField;

    // protected, so a property, even though it's an auto-property
    protected string AutoProperty { get; set; } 

    // private but has logic, so a property
    private int PropertyWithLogic
    {
        get { return _myField; }
        set { _myField = value >= 0 ? value : 0; } 
    }
}

Why?

Good Object Oriented Programming is all about encapsulation: separating the interface of a class from its implementation. Interfaces are the blueprint for what a class should do; classes are how it does it.

The signature and implementation of a field are inseparable. That is to say, there is no separation of the what of a field (getting and setting of a value) from the how of a field (storing the value in memory). For this reason, fields are an implementation detail and can't be part of an interface.

Properties, on the other hand, separate signature from implementation. In the below example, the interface specifies that implementations must provide a property that you can get and it's the classes that define how that property is implemented.

public interface IExample
{
    string Description { get; }
}

public class Example1 : IExample
{
    // Returns a hard-coded string.
    public string Description
    {
        get { return "Example1"; }
    }
}

public class Example2 : IExample
{
    private int _id;
    private DateTime _createdDate;

    // Performs some logic to return a string
    // that includes some member field values.
    public string Description
    {
        get 
        { 
            return string.Format(
                        "Example2, id: {0}, created: {1}",
                        _id,
                        _createdDate); 
        }
    }
}

public class Example3 : IExample
{
    // Not implemented. Throws an exception when accessed.
    public string Description
    {
        get 
        { 
            throw new NotImplementedException("Todo!");
        }
    }
}

public class Example4 : IExample
{
    // Auto-property implementation, will return whatever
    // we set the value to.  Value is stored in an
    // automatically generated backing field.
    public string Description { get; set; }
}

Keeping interfaces separate from implementation in this way is a critical step in keeping the code you are writing from collapsing under its own weight over time.

An additional reason to use public auto-implemented properties rather than public fields is that, should you later need to add logic to the getter or setter, swapping a field for a property is a breaking change. It can break serialization, and dependent assemblies that were compiled against the field must be recompiled before they will work with the property. Taking the time to do it right now will save you unnecessary difficulty when the application has grown or live data is involved.
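Here's a minimal sketch of that last point. The Account class, its Balance property and AccountDemo are names I've invented for illustration. The property started life as an auto-property and later gained validation, yet calling code reads and writes it exactly as before.

```csharp
using System;

public class Account
{
    private decimal _balance;

    // Originally: public decimal Balance { get; set; }
    // The signature is unchanged; only the implementation gained logic.
    public decimal Balance
    {
        get { return _balance; }
        set
        {
            if (value < 0)
                throw new ArgumentOutOfRangeException("value", "Balance cannot be negative.");
            _balance = value;
        }
    }
}

public static class AccountDemo
{
    public static void Main()
    {
        var account = new Account();
        account.Balance = 25m;  // same syntax callers used with the auto-property
        Console.WriteLine(account.Balance);
    }
}
```

Had Balance been a public field, adding this validation would have meant changing a field into a property, with the compatibility problems described above.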

Wednesday, 10 December 2014

The When and Why: Dependency Injection

Dependency Injection is one form of Inversion of Control - a collection of programming patterns focused on minimising your classes' direct reliance on each other, known as their coupling.

In its simplest form Dependency Injection just means to provide an object its dependencies instead of having the object create them for itself.

When?

In any class where you use the new keyword to instantiate another class, you should think about using Dependency Injection instead. It may feel like overkill at first but the benefits are big and get bigger as the application grows in size.

Here's a real world before and after example and I'll explain the why of it straight after.

BEFORE

public class CustomerService
{
    private const string SqlConnectionString = 
         @"server=localhost;username=admin;password=password;";

    private const string LoggingFilePath = 
         @"C:\logs\MyApplication.txt";

    private readonly ILogger _logger;
    private readonly IDataContext _context;

    public CustomerService()
    {
        // In this example all of the services create
        // their own logger and data context instance when
        // they are created.
        // The classes FileLogger and SqlDataContext are this
        // class's 'dependencies' because they are needed in 
        // order for this class to compile.
        _logger = new FileLogger(LoggingFilePath);
        _context = new SqlDataContext(SqlConnectionString);
    }

    public IEnumerable<Customer> GetCustomers(int accountID)
    {
        // TODO: Use logger to log
        // TODO: Use context to get customers for account ID
    }
}

public static class Application
{
    public static void Main()
    {
        var service = new CustomerService();
        foreach (var customer in service.GetCustomers(55))
        {
            Console.WriteLine(customer.Name);
        }
    }
}

AFTER

public class CustomerService
{
    private readonly ILogger _logger;
    private readonly IDataContext _context;

    public CustomerService(IDataContext context, ILogger logger)
    {
        // In this example instead of instantiating the 
        // dependencies themselves, the service classes like 
        // CustomerService expect them to be passed in.
        // Taking them via the constructor is known as 
        // 'Constructor Injection' and is the preferred method
        // of Dependency Injection because it forces the
        // developer to provide the right dependencies before
        // the class can be used.
        _logger = logger;
        _context = context;
    }

    public IEnumerable<Customer> GetCustomers(int accountID)
    {
        // TODO: Use logger to log
        // TODO: Use context to get customers for account ID
    }
}

public static class Application
{
    private const string SqlConnectionString = 
         @"server=localhost;username=admin;password=password;";

    private const string LoggingFilePath = 
         @"C:\logs\MyApplication.txt";

    public static void Main()
    {
        // This is messier than it was but the mess is now in
        // a single location which can be easily tightened up 
        // using a DI framework such as Ninject or Unity.
        var logger = new FileLogger(LoggingFilePath);
        var context = new SqlDataContext(SqlConnectionString);
        var service = new CustomerService(context, logger);
        foreach (var customer in service.GetCustomers(55))
        {
            Console.WriteLine(customer.Name);
        }
    }
}

Why?

There are many convincing reasons to follow the Dependency Injection pattern; I'll cover a few of them here.

Implementation Agnosticism

The before example enforces the use of SqlDataContext and FileLogger. It doesn't need those specific implementations to work properly, it actually just needs something that implements IDataContext and something that implements ILogger in order for it to perform its responsibility of getting the customers, but the implementation in before has specified a concrete implementation anyway.

If we wanted one Customer Service in our application to log to a file and another to log to a database, the pattern in before would force us either to copy and paste the whole class to make this small change, or to start passing enums or booleans to the constructor to pick an implementation. Either option would seriously harm the scalability of the application by coupling this class to every implementation it could possibly be used with.

The Customer Service in the after example is much more flexible because it does not arbitrarily restrict us to working with certain implementations of the interface. The class definition is also much clearer about exactly what the Customer Service needs in order to function correctly and also what its purpose is: getting customers from any given IDataContext and logging about it.
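To make that flexibility concrete, here's a small sketch. None of these types come from the example above: the Log method on ILogger, ConsoleLogger, InMemoryLogger (standing in for a database logger) and GreetingService are all assumptions made for illustration. The same service class is used twice, each time with a different logger, without changing a line of the service.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the logging interface.
public interface ILogger
{
    void Log(string message);
}

// Writes each message to the console.
public class ConsoleLogger : ILogger
{
    public void Log(string message) { Console.WriteLine(message); }
}

// Keeps messages in memory, standing in for a database logger.
public class InMemoryLogger : ILogger
{
    private readonly List<string> _messages = new List<string>();

    public IList<string> Messages { get { return _messages; } }

    public void Log(string message) { _messages.Add(message); }
}

// Hypothetical service that only knows about ILogger.
public class GreetingService
{
    private readonly ILogger _logger;

    public GreetingService(ILogger logger) { _logger = logger; }

    public void Greet(string name) { _logger.Log("Hello, " + name); }
}

public static class AgnosticismDemo
{
    public static void Main()
    {
        var memoryLogger = new InMemoryLogger();

        // Same class, two different logging implementations.
        var first = new GreetingService(new ConsoleLogger());
        var second = new GreetingService(memoryLogger);

        first.Greet("console");
        second.Greet("memory");

        Console.WriteLine(memoryLogger.Messages.Count);
    }
}
```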

Avoiding unnecessary usage restrictions and minimising Class Coupling like this is the key to developing maintainable systems that don't become exponentially more complicated to work with as the codebase grows.

Single Responsibility Principle

Having a single responsibility makes the class more likely to be reusable, more accessible and more easily tested. Beyond making the programmer's life easier there's a structural benefit to the Single Responsibility Principle too.

CustomerService Responsibilities Before: Create Log File Connection, Create Database Context, Getting the Customers
CustomerService Responsibilities After: Getting the Customers

Two loggers generally can't hold the same file open for writing at the same time, so to share a FileLogger between classes using the approach in before we'd have to make the logger available as a public static. If we ever need to change the SQL connection string for the application, we'd have to make sure we update it in all locations or make the connection string a static variable, too.

You might be tempted to resolve these sorts of issues by making everything static and publicly available, but that is an anti-pattern and doesn't resolve the issue anyway. It just hides the issue long enough for you to write a few thousand more lines of code before reaching a scenario it can't handle.

An example of such a scenario would be needing the logger and data context to live longer than a customer service but not forever and not globally across the application - for the duration of an individual web request, for instance.

Managing the lifespan of class instances as a separate responsibility and injecting them into their dependents keeps code from being bogged down under ever-changing business requirements.

Unit Testing

Unit Testing is the act of testing individual units of functionality in your code to prove that specific expectations are met. For example, a useful Unit Test for CustomerService would be GetCustomers Gets Only Customers For Provided Account, where we test to ensure that the implementation only retrieves customers with a matching account ID.

Because we don't control which ILogger and IDataContext implementations the Customer Service gets in the before example, if we were to test GetCustomers then we'd be including the workings of SqlDataContext and FileLogger within that test too. If the SQL database has the wrong data or FileLogger had a bug, then the GetCustomers Gets Only Customers For Provided Account test would fail, even if the core logic of GetCustomers is correct.

A good Unit Test tests a small, specific unit of code. In the after example, we can pass in whatever implementations of ILogger and IDataContext best suit our needs. In the real code, we provide a SqlDataContext and FileLogger. In the Unit Test, however, we can provide an implementation of ILogger that does nothing and an IDataContext implementation that provides a specific set of records for testing whether GetCustomers does its job properly given that data.
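Here's a sketch of what such a test might look like without any testing framework. The article never shows the members of Customer, ILogger or IDataContext, so the AccountID and Name properties, the Log method, the Customers property and the body of GetCustomers are all assumptions, as are the NullLogger and FakeDataContext test doubles.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shapes: the original example leaves these undefined.
public class Customer
{
    public int AccountID { get; set; }
    public string Name { get; set; }
}

public interface ILogger { void Log(string message); }

public interface IDataContext { IEnumerable<Customer> Customers { get; } }

public class CustomerService
{
    private readonly ILogger _logger;
    private readonly IDataContext _context;

    public CustomerService(IDataContext context, ILogger logger)
    {
        _logger = logger;
        _context = context;
    }

    // One possible implementation of the TODOs, written for this test.
    public IEnumerable<Customer> GetCustomers(int accountID)
    {
        _logger.Log("Getting customers for account " + accountID);
        return _context.Customers.Where(c => c.AccountID == accountID).ToList();
    }
}

// Test double: a logger that does nothing.
public class NullLogger : ILogger
{
    public void Log(string message) { }
}

// Test double: a data context that serves a fixed set of records.
public class FakeDataContext : IDataContext
{
    private readonly List<Customer> _customers;

    public FakeDataContext(IEnumerable<Customer> customers)
    {
        _customers = customers.ToList();
    }

    public IEnumerable<Customer> Customers { get { return _customers; } }
}

public static class CustomerServiceTests
{
    public static bool GetCustomersGetsOnlyCustomersForProvidedAccount()
    {
        var context = new FakeDataContext(new[]
        {
            new Customer { AccountID = 55, Name = "Ann" },
            new Customer { AccountID = 56, Name = "Bob" },
            new Customer { AccountID = 55, Name = "Cy" }
        });
        var service = new CustomerService(context, new NullLogger());

        var names = service.GetCustomers(55).Select(c => c.Name).ToList();

        return names.SequenceEqual(new[] { "Ann", "Cy" });
    }

    public static void Main()
    {
        Console.WriteLine(
            GetCustomersGetsOnlyCustomersForProvidedAccount() ? "PASS" : "FAIL");
    }
}
```

No database and no log file are involved, so if this test fails then the fault lies in GetCustomers itself.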

Unit Testing is very much a topic in its own right and it's well worth taking the time to understand it.

A Challenge

See if you can write an entire application where the new keyword is used in only one class and all other classes have their dependencies injected via their constructor!

Tuesday, 2 December 2014

Useful LINQ Extensions You Might Not Know

If you've used LINQ for any period of time then you've no doubt come to appreciate the power, abstraction of responsibilities and expressiveness of it. You'll be used to extension methods such as Select, Where, GroupBy and so on, but there are a few gems in the System.Linq namespace that you just don't know until you know.

SelectMany

No doubt you're familiar with the Select extension method. SelectMany is similar, but produces a flat result set instead of a nested one. So when you would ordinarily work with Select like so:

public class Parent
{
    public string Name { get; set; }
    public List<string> Children { get; set; }
}

// The below gives us a list of lists...
var children = parents.Select(s => s.Children);

// ...which means going through them like this:
foreach (var childGroup in children)
{
    foreach (var child in childGroup)
    {
        // Do work here
    }
}

You can, instead, use SelectMany to flatten the results!

var children = parents.SelectMany(s => s.Children);

foreach (var child in children)
{
    // Do work here
}

If you need access to the parent as well as the child for every row like a SQL join, you can add a result selector:

var relationships = 
    parents.SelectMany(
               s => s.Children, 
               (parent, child) => new { parent, child });

foreach (var relationship in relationships)
{
    // Do work here
    // i.e. relationship.parent.Name or relationship.child
}
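For a version you can run, here's a sketch that uses the Parent class from above with some made-up data. The names and the SelectManyDemo class are purely illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Parent
{
    public string Name { get; set; }
    public List<string> Children { get; set; }
}

public static class SelectManyDemo
{
    public static List<string> Pairs()
    {
        var parents = new List<Parent>
        {
            new Parent { Name = "Alice", Children = new List<string> { "Bob", "Carol" } },
            new Parent { Name = "Dave",  Children = new List<string> { "Eve" } }
        };

        // One row per parent/child combination, like a SQL join.
        return parents
            .SelectMany(p => p.Children,
                        (parent, child) => parent.Name + " -> " + child)
            .ToList();
    }

    public static void Main()
    {
        foreach (var pair in Pairs())
        {
            Console.WriteLine(pair);
        }
        // Prints:
        // Alice -> Bob
        // Alice -> Carol
        // Dave -> Eve
    }
}
```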

Cast

If you've ever needed to cast the elements of an enumerable to a different Type, you've probably written it like this:

var specificEnumerable = originalEnumerable
                            .Select(s => (SpecificType)s);

You can write this more expressively using the Cast method:

var specificEnumerable = originalEnumerable
                            .Cast<SpecificType>();

OfType

Cast is all well and good so long as every item in the enumerable can be cast to the target Type (if one can't, an InvalidCastException is thrown when that item is reached), but there are times when you have a mixture of Types and only want the items of a particular Type. You may be tempted to write the following:

var onlySpecificEnumerable = originalEnumerable
                                .Where(w => w is SpecificType)
                                .Select(s => (SpecificType)s);
// or, perhaps:
var onlySpecificEnumerable = originalEnumerable
                                .Select(s => s as SpecificType)
                                .Where(w => w != null);

The OfType extension performs the same filter plus cast action as the above, but with cleaner syntax and less room for mistakes!

var onlySpecificEnumerable = originalEnumerable
                                .OfType<SpecificType>();
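The difference between the two is easiest to see on a mixed collection. In this sketch the Mixed array and the CastVersusOfTypeDemo class are made up for illustration. Note that Cast is lazily evaluated, so the exception only appears once the results are enumerated (here by ToList).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CastVersusOfTypeDemo
{
    private static readonly object[] Mixed = { "one", 2, "three", 4.0 };

    // OfType keeps the strings and quietly skips everything else.
    public static List<string> OnlyStrings()
    {
        return Mixed.OfType<string>().ToList();
    }

    // Cast insists that every item is a string and throws on the first
    // item that isn't.
    public static bool CastThrows()
    {
        try
        {
            Mixed.Cast<string>().ToList();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", OnlyStrings())); // one, three
        Console.WriteLine(CastThrows());                     // True
    }
}
```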

Wednesday, 26 November 2014

The When and Why: Static

The static modifier, much like the access modifiers, is one of the first keywords that new developers learn and, much like the public access modifier, static is often misused by new developers because it provides a seemingly easy shortcut around learning how to write proper Object Oriented code.

Excessive use of the static modifier can lead to messy, overly complex code. On the other hand, careful use of the static modifier can lead to functionality that is elegant, scalable and performs well. The aim of this article is to help new developers strike the right balance. First I'll describe when to use static, then I'll explain why.

When?

Ideally, static should only be used for private or protected members when you need to share a single value between all instances of the class to which those members belong. You should avoid using public static wherever possible.
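As a small sketch of a private static doing exactly that, the hypothetical Ticket class below shares one counter between all of its instances. The class and its members are invented for illustration, and the counter is deliberately simple: it isn't thread-safe.

```csharp
using System;

public class Ticket
{
    // Shared by every Ticket instance and invisible outside the class.
    // Not thread-safe; fine for a single-threaded illustration.
    private static int _nextNumber = 1;

    public int Number { get; private set; }

    public Ticket()
    {
        // Each new ticket takes the current number and advances the counter.
        Number = _nextNumber++;
    }
}

public static class TicketDemo
{
    public static void Main()
    {
        var first = new Ticket();
        var second = new Ticket();

        // The second ticket's number is always one higher than the first's.
        Console.WriteLine(second.Number - first.Number);
    }
}
```

Because the counter is private, no other class can read or change it, so the sharing doesn't leak out as coupling.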

There are a couple of occasions where a public static is required, however.

Providing Common Values for structs

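If you've ever used string.Empty or int.MinValue then you've used a public static member (string.Empty is a static read-only field and int.MinValue is a constant, which is implicitly static). Here's an example of applying the same idea to a Point struct using a static property.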

public struct Point
{
    public int X;
    public int Y;

    // Static property that returns an instance of point
    public static Point Zero 
    {
        get { return new Point { X = 0, Y = 0 }; }
    }
}

// This...
var point = Point.Zero; 

// ...is cleaner and more descriptive than writing this...
var point = new Point { X = 0, Y = 0 };

// ...every time we want a zero value point

Extension Methods

Extension methods are a way to add new functionality to classes that you don't have access to the code for. The language requires an extension method to be a static method declared in a static class, and it needs to be public if code in other assemblies is to use it.

public static class StringExtensions
{
    public static string Left(this string str, int count)
    {
        return str.Length <= count 
               ? str 
               : str.Substring(0, count);
    }
}

// We use the extension method just like any normal method
var example = "Hello world".Left(5);

Why?

Over-use of the static modifier is an anti-pattern because it increases and actually encourages class coupling. Code that features classes that reference each other excessively is often referred to as spaghetti code because of how difficult it is to untangle and understand.

When all of your classes know about each other in this way they become less reusable and they also become harder to maintain. After all, it's much easier for you as a programmer to understand and reuse a class that only knows about itself than it is to understand one that accesses eight or nine other classes, each of which accesses another eight, and so on.

Tuesday, 25 November 2014

When Worlds Collide: IronPython

In the post A Brief Encounter with Python I mentioned that I quite like Python's syntax and script-like feel. I also make no secret of my passion for the .NET framework... So imagine my delight when I discovered an open-source fusion of the two!

With the syntax of Python and the vast wealth of utility found in the .NET framework, IronPython is an exciting prospect. C# still has the edge when it comes to cool language features (async/await, anyone?) but I can't deny that IronPython interests me from the perspective of combining something new with rapid application development.

Since you can try IronPython without having to download it there's nothing to lose, so why not have a go with it?

Sunday, 23 November 2014

The Problem With a Language That Grows

The Problem

There is no denying that C# is a very powerful language. A large contributor toward that power is that the C# language definition is regularly expanded and built upon to grow alongside the ever-changing landscape of information technology. Each iteration of the language has opened up new, powerful capabilities that either did not exist within the language prior to that release or that were only possible through the writing of many, many lines of brittle code.

The inherent problem with this kind of growth is that for programmers who didn't join the C# party early on, but rather are just arriving now, there are almost too many tools available for problem solving. Worse still, some of those tools are incredibly powerful and can be easily and accidentally abused when placed in the hands of a novice. This is an unfortunate side effect because those same tools in the hands of an experienced programmer make for compelling and powerful solutions.

An Example

One such modern language feature to which I am referring is the Type dynamic. As a C# advocate one of the features that I laud loudest is type safety. In C# if the Type you're referencing doesn't have a DoSomething() method then you can't write myObj.DoSomething(). Well you can... but it won't compile!

The Type dynamic removes that powerful precaution. It says "whatever you write, I'll trust you, but if I later try to run code that I can't then I'll crash". That's quite the deal with the Devil and should only be used by those who really know what they're signing up for and only when there is truly no other option.

In my entire career I've used the Type dynamic exactly once.

In Outsorcery, when your serialized object of Type IWorkItem<T> gets to the server, the server is unable to infer the Type T. It doesn't need to know it either, because the resulting object of that Type will just be serialized and sent back to the client immediately. I use the Type dynamic there to call the method DoWorkAsync(), which I know exists but can't prove to the compiler without reams and reams of infrastructure code. It is an incredibly rare scenario and one that I have locked down to only a couple of lines of code, wrapped in a try/catch, in the lowest levels of my library.

More and more I've started to see novice developers diving straight onto using the Type dynamic in scenarios where other language features would give them both the type safety and the flexibility that are available within C#. I've seen people use dynamic when they should be using Interfaces, when they should be using Generics, when they should be using Type Casting and even in cases when they should just be using the Type object.
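Here's a sketch of the trade-off. IWorker, Worker and DynamicDemo are names invented for illustration. With the interface, a call to a method that doesn't exist is rejected by the compiler; with dynamic, the same mistake compiles and only surfaces at runtime as a RuntimeBinderException.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public interface IWorker
{
    string DoSomething();
}

public class Worker : IWorker
{
    public string DoSomething() { return "done"; }
}

public static class DynamicDemo
{
    // Type safe: only objects that really have DoSomething() can be passed in.
    public static string ViaInterface(IWorker worker)
    {
        return worker.DoSomething();
    }

    // Compiles happily even though string has no DoSomething() method.
    public static bool DynamicCallFails()
    {
        dynamic notAWorker = "just a string";
        try
        {
            notAWorker.DoSomething();
            return false;
        }
        catch (RuntimeBinderException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ViaInterface(new Worker())); // done
        Console.WriteLine(DynamicCallFails());         // True
    }
}
```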

A Solution?

So the question inevitably becomes a take on the same question we see time and time again in sport, in online gaming and even in general life.

How to protect new C# developers from being overwhelmed,
without holding experienced C# developers back?

The solution that I propose is one that aims to not be overly invasive, but instead informative. Drawing on the experience of the world of online gaming, I suggest that when you write code that uses a heavy-handed feature for the first time, Visual Studio should present you with a tutorial window. Something like this:

dynamic myObj = otherObj.Operation();

Of course, veteran programmers will be able to disable this feature via the pop-up either on a per tutorial basis or as a global setting.

Having these tutorials built into the IDE shouldn't get in anybody's way; it just means that for the programmer who's just starting out, the relevant information is right next to the code they're writing. In effect, it'll be like pair programming between the novice and the collective knowledge of veteran C# developers.

What do you think?

Friday, 21 November 2014

The When and Why: For and Foreach

I've often seen it asked which type of loop out of for and foreach is more appropriate under a given set of circumstances. I'll explain when to use each type of loop, then I'll explain the why, much in the same way that I did for access modifiers.

Definitions

The keywords for and foreach have such broad applications that it can be a bit befuddling trying to understand what all the related terms mean. Below is a quick summary of the terms I'll be using.

Enumerable: An object that can be enumerated (it implements IEnumerable). Don't worry if that's too technical, just know that you can use foreach with any object that is enumerable (i.e. all arrays and collections).
Enumerate: To go through each item in an Enumerable in turn.
Index: An item's zero-indexed position in the collection/array. e.g. items[3] is the 4th item in the array named 'items'.

Note: Not all enumerables have an index, but anything with an index is enumerable.

When

Always start with foreach; you can swap to for later if the need arises. You can't add or remove items in a collection while enumerating it, but you can call methods and get/set properties on the item, which is the most common reason for enumerating.

foreach (var item in items)
{
    item.DoWork();
    item.Property = "Some other value";
}

Use for only when you need access to the current index and are working with an array or array-based collection (e.g. a list). One scenario where you might need this is removing items from a collection while going through it (something you can't easily achieve with foreach). Here's an example.

// go through the list backward so that removing 
// items doesn't cause us to skip the item that follows.
for (int i = myList.Count - 1; i >= 0; --i) 
{
    if (myList[i].ShouldBeRemovedFromList)
    {
        myList.RemoveAt(i);
    }
}
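To see all of this in action, here's a sketch using a plain List<int>, with RemovalDemo and its method names invented for illustration. It shows that removing inside a foreach throws, that the backward for loop works, and that List<T> also offers a built-in RemoveAll method that does the same job in one call.

```csharp
using System;
using System.Collections.Generic;

public static class RemovalDemo
{
    // Removing during foreach invalidates the enumerator.
    public static bool ForeachRemovalThrows()
    {
        var numbers = new List<int> { 1, 2, 3, 4 };
        try
        {
            foreach (var n in numbers)
            {
                if (n % 2 == 0)
                    numbers.Remove(n);
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // The backward for loop from the article, applied to even numbers.
    public static List<int> BackwardFor()
    {
        var numbers = new List<int> { 1, 2, 3, 4 };
        for (int i = numbers.Count - 1; i >= 0; --i)
        {
            if (numbers[i] % 2 == 0)
                numbers.RemoveAt(i);
        }
        return numbers;
    }

    // List<T>.RemoveAll takes a predicate and removes every match.
    public static List<int> WithRemoveAll()
    {
        var numbers = new List<int> { 1, 2, 3, 4 };
        numbers.RemoveAll(n => n % 2 == 0);
        return numbers;
    }

    public static void Main()
    {
        Console.WriteLine(ForeachRemovalThrows());             // True
        Console.WriteLine(string.Join(", ", BackwardFor()));   // 1, 3
        Console.WriteLine(string.Join(", ", WithRemoveAll())); // 1, 3
    }
}
```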

Why

There are many technical reasons for favouring foreach over for, but my main reason isn't technical at all. foreach is fewer lines of code, uses cleaner syntax and also expresses the intent of your code. That is, 'go through this collection without changing which items are in the collection'.

As luck would have it, foreach isn't just better looking, it often performs better too!

As mentioned in the definitions section, not all enumerables have an index, but all objects with an index can be enumerated. Similarly, arrays have length and some collections have Count, but some implementations of enumerable have neither. You'd be tempted to cheat and use the extension method Count(), but that'd be a mistake. Why? Let's take a look at a comparison.

// foreach on an enumerable
foreach (var item in items)
{
    if (item.GiveUp)
        break;
}

// for loop on enumerable
// Count() extension method is not the same as List<T>'s Count,
// and a plain enumerable has no indexer, so we need ElementAt()
for (int i = 0; i < items.Count(); ++i) 
{
    if (items.ElementAt(i).GiveUp)
        break;
}

You'd be forgiven for thinking that the two examples do the same thing and, from the perspective that they achieve the same result, they do. However, the second example is actually a performance issue waiting to happen.

Let's say that the enumerable items has 10,000 items in it and that it is the 5th item of those 10,000 that gives up.

First example: We enumerate the first five items, reach the item that indicates we should give up and exit the loop (by using the break keyword).
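Second example: We do the same thing, except that the call to the extension method Count() is made every time the loop condition is checked. Unless items happens to be a collection that already knows its size, each of those calls goes through the whole enumerable and counts all the items.

So while the foreach example makes 5 visits, the for example makes around 50,000 (five full counts of 10,000 items)... That's a pretty massive difference!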
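If you'd like to measure this for yourself, here's a sketch that counts the visits. VisitCounter and its members are invented for illustration, and the enumerable is a simple iterator method, so it has no Count property and no indexer. Count() is re-evaluated every time the loop condition is checked, and ElementAt(i) has to walk from the start to reach item i, so the for version pays both costs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VisitCounter
{
    public static int Visits;

    // An iterator: every item handed out is recorded as a visit.
    public static IEnumerable<int> Items()
    {
        for (int i = 1; i <= 10000; i++)
        {
            Visits++;
            yield return i;
        }
    }

    public static int CountForeachVisits()
    {
        Visits = 0;
        foreach (var item in Items())
        {
            if (item == 5)
                break;
        }
        return Visits;
    }

    public static int CountForVisits()
    {
        Visits = 0;
        var items = Items();
        for (int i = 0; i < items.Count(); ++i)
        {
            if (items.ElementAt(i) == 5)
                break;
        }
        return Visits;
    }

    public static void Main()
    {
        Console.WriteLine(CountForeachVisits()); // 5
        Console.WriteLine(CountForVisits());     // 50015
    }
}
```

The 50,015 is five full counts of 10,000 items plus 1 + 2 + 3 + 4 + 5 visits made by the ElementAt calls.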

This is just one example and there are many more where foreach is the smarter choice, but this article is quite long already, so if you have any specific scenarios you're not sure about then let me know in the comments and I'll add them to the article.
