Friday, 21 November 2014

The When and Why: For and Foreach

I've often seen it asked which type of loop out of for and foreach is more appropriate under a given set of circumstances. I'll explain when to use each type of loop, then I'll explain the why, much in the same way that I did for access modifiers.

Definitions

The keywords for and foreach have such broad applications that it can be a bit befuddling trying to understand what all the related terms mean. Below is a quick summary of the terms I'll be using.

  • Enumerable: An object that can be enumerated (it implements IEnumerable). Don't worry if that's too technical, just know that you can use foreach with any object that is enumerable (e.g. all arrays and collections).
  • Enumerate: To go through each item in an Enumerable in turn.
  • Index: An item's zero-indexed position in the collection/array. e.g. items[3] is the 4th item in the array named 'items'.

Note: Not all enumerables have an index, but the arrays and collections that do have an index are all enumerable.
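Here's a minimal sketch of that note. The class and method names are made up for illustration; List<T> and HashSet<T> are the standard .NET collections. The list can be both indexed and enumerated, while the set can only be enumerated.

```csharp
using System;
using System.Collections.Generic;

public static class IndexVersusEnumerable
{
    // Works on any enumerable because foreach never needs an index.
    public static string Join(IEnumerable<string> items)
    {
        var result = "";
        foreach (var item in items)
        {
            result += item + ";";
        }
        return result;
    }

    public static void Main()
    {
        // A list has an index, so both items[i] and foreach work.
        var list = new List<string> { "a", "b", "c" };
        Console.WriteLine(list[1]);    // prints b
        Console.WriteLine(Join(list)); // prints a;b;c;

        // A HashSet<T> is enumerable but has no indexer:
        // set[0] would not compile, yet foreach is fine.
        var set = new HashSet<string> { "x" };
        Console.WriteLine(Join(set));  // prints x;
    }
}
```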

When

Always start with foreach; you can swap to for later if the need arises. You can't add or remove items in a collection while enumerating it, but you can call methods and get/set properties on each item, which is the most common reason for enumerating.

foreach (var item in items)
{
    item.DoWork();
    item.Property = "Some other value";
}
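To see the "can't add or remove" rule in action, here's a small sketch using a List<int> (the class and method names are hypothetical). With List<T>, changing the list inside foreach makes the loop throw an InvalidOperationException; other collection types may behave differently.

```csharp
using System;
using System.Collections.Generic;

public static class ModifyDuringForeach
{
    // Returns true if removing an item inside foreach made the loop throw.
    public static bool RemoveInsideForeachThrows()
    {
        var items = new List<int> { 1, 2, 3 };
        try
        {
            foreach (var item in items)
            {
                if (item == 2)
                {
                    items.Remove(item); // changes the list mid-enumeration
                }
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            // List<T> notices the change when the loop asks for the next item.
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(RemoveInsideForeachThrows()); // prints True
    }
}
```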

Use for only when you need access to the current index and are working with an array or array-based collection (e.g. a list). One scenario where you might need this is removing items from a collection while going through it (something you can't easily achieve with foreach). Here's an example.

// go through the list backward so that removing 
// items doesn't cause us to skip the item that follows.
for (int i = myList.Count - 1; i >= 0; --i) 
{
    if (myList[i].ShouldBeRemovedFromList)
    {
        myList.RemoveAt(i);
    }
}
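If you're wondering why the direction matters, this sketch runs the same removal forward and backward on a copy of the same list. The class and method names are invented for the example, and plain ints stand in for the items.

```csharp
using System;
using System.Collections.Generic;

public static class RemoveDirection
{
    // Forward: after RemoveAt(i) the next item slides into position i,
    // then ++i steps over it, so it never gets checked.
    public static List<int> RemoveForward(List<int> source, int unwanted)
    {
        var list = new List<int>(source);
        for (int i = 0; i < list.Count; ++i)
        {
            if (list[i] == unwanted)
            {
                list.RemoveAt(i);
            }
        }
        return list;
    }

    // Backward: removing an item only shifts items we've already checked.
    public static List<int> RemoveBackward(List<int> source, int unwanted)
    {
        var list = new List<int>(source);
        for (int i = list.Count - 1; i >= 0; --i)
        {
            if (list[i] == unwanted)
            {
                list.RemoveAt(i);
            }
        }
        return list;
    }

    public static void Main()
    {
        var input = new List<int> { 1, 2, 2, 3 };
        Console.WriteLine(string.Join(",", RemoveForward(input, 2)));  // prints 1,2,3
        Console.WriteLine(string.Join(",", RemoveBackward(input, 2))); // prints 1,3
    }
}
```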

Why

There are many technical reasons for favouring foreach over for, but my main reason isn't technical at all. foreach is fewer lines of code, uses cleaner syntax and also expresses the intent of your code. That is, 'go through this collection without changing which items are in the collection'.

As luck would have it, foreach isn't just better looking, it often performs better too!

As mentioned in the definitions section, not all enumerables have an index, but all objects with an index can be enumerated. Similarly, arrays have Length and some collections have Count, but some implementations of enumerable have neither. You might be tempted to cheat and use the LINQ extension method Count(), but that'd be a mistake. Why? Let's take a look at a comparison.

// foreach on an enumerable
foreach (var item in items)
{
    if (item.GiveUp)
        break;
}

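// for loop on enumerable
// Count() extension method is not the same as List<T>'s Count
// (Count() and ElementAt() both need 'using System.Linq;')
for (int i = 0; i < items.Count(); ++i) 
{
    // a plain enumerable has no indexer, so items[i] won't compile;
    // ElementAt(i) walks from the start to reach position i
    if (items.ElementAt(i).GiveUp)
        break;
}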

You'd be forgiven for thinking that the two examples do the same thing and, from the perspective that they achieve the same result, they do. However, the second example is actually a performance issue waiting to happen.

Let's say that the enumerable items has 10,000 items in it and that it is the 5th item of those 10,000 that gives up.

  • First example: We enumerate the first five items, reach the item that indicates we should give up and exit the loop (by using the break keyword).
  • Second example: We look at the same five items, except that the loop condition is checked before every iteration, and each check calls the extension method Count(), which (for an enumerable that doesn't store its size) goes through the whole enumerable and counts all 10,000 items.

So while the foreach example makes 5 visits, the for example makes more than 50,000 (five full counts of 10,000 items, plus the visits to the items themselves)... That's a pretty massive difference!
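You don't have to take my word for it. Here's a rough sketch that counts the visits, assuming a made-up lazy source called Numbers that yields 1 to 10,000 and bumps a counter for every item it hands out. The 5th item plays the part of the one that gives up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VisitCounter
{
    private static int visits;

    // Hypothetical lazy source: has no Count property and no indexer.
    private static IEnumerable<int> Numbers(int count)
    {
        for (int n = 1; n <= count; ++n)
        {
            visits++;
            yield return n;
        }
    }

    public static int VisitsWithForeach()
    {
        visits = 0;
        foreach (var item in Numbers(10000))
        {
            if (item == 5)
                break;
        }
        return visits;
    }

    public static int VisitsWithFor()
    {
        visits = 0;
        var items = Numbers(10000);
        // Count() runs on every check of the loop condition,
        // and each call walks the whole source again.
        for (int i = 0; i < items.Count(); ++i)
        {
            if (items.ElementAt(i) == 5)
                break;
        }
        return visits;
    }

    public static void Main()
    {
        Console.WriteLine(VisitsWithForeach()); // prints 5
        Console.WriteLine(VisitsWithFor());     // a little over 50,000
    }
}
```

If you swapped the lazy source for a List<int>, Count() could read the stored count instead of walking the list, which is exactly why the problem tends to go unnoticed until the data source changes.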

This is just one example and there are many more where foreach is the smarter choice, but this article is quite long already, so if you have any specific scenarios you're not sure about then let me know in the comments and I'll add them to the article.

Further Reading

There are occasions when you'll want to run a loop until a particular condition is met, rather than running a loop for every item in an array or collection. When such an occasion arises, you'll be glad to know about the while statement.
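As a quick taste, here's a minimal sketch of a while loop that isn't tied to any collection at all. The method name is made up; it keeps halving a number until it can't go any lower than 1 and reports how many steps that took.

```csharp
using System;

public static class WhileExample
{
    // Counts how many times value can be halved before it reaches 1.
    public static int CountHalvings(int value)
    {
        int steps = 0;
        while (value > 1) // keep going until the condition stops being true
        {
            value /= 2;
            steps++;
        }
        return steps;
    }

    public static void Main()
    {
        Console.WriteLine(CountHalvings(8)); // prints 3
    }
}
```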

If you're interested in the fine-grained difference between an array and a collection then check out this MSDN article on Arrays and Collections.

foreach and the yield keyword go hand in hand. It's the yield keyword that lets you enumerate a collection or array once, instead of once per method that does work on it.
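Here's a small sketch of what that looks like, using three made-up iterator methods and a log so you can see the order things happen in. Notice that the source is read once, with each number passing through both methods before the next one is read.

```csharp
using System;
using System.Collections.Generic;

public static class YieldPipeline
{
    // Hypothetical source that logs every number it hands out.
    private static IEnumerable<int> Source(List<string> log)
    {
        for (int n = 1; n <= 4; ++n)
        {
            log.Add("read " + n);
            yield return n;
        }
    }

    private static IEnumerable<int> OnlyEven(IEnumerable<int> source)
    {
        foreach (var n in source)
        {
            if (n % 2 == 0)
                yield return n;
        }
    }

    private static IEnumerable<int> Doubled(IEnumerable<int> source)
    {
        foreach (var n in source)
        {
            yield return n * 2;
        }
    }

    public static string Run()
    {
        var log = new List<string>();
        foreach (var n in Doubled(OnlyEven(Source(log))))
        {
            log.Add("got " + n);
        }
        return string.Join(", ", log);
    }

    public static void Main()
    {
        // prints read 1, read 2, got 4, read 3, read 4, got 8
        Console.WriteLine(Run());
    }
}
```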

I've saved the best for last, as once you have a handle on the basics of foreach and yield it's well worth getting to grips with using LINQ. It can be quite a big topic with lots of confusing and varied applications, but hopefully these two hints will help make things clearer.

  1. LINQ does different things depending on what you use it on. For example, using LINQ on an in-memory list will run code against the list, whereas using it on a LINQ to SQL table object will generate and execute SQL against the database!
  2. LINQ makes use of a concept known as Lazy Evaluation (also called deferred execution), which means that the aforementioned generation and code running doesn't happen until you actually enumerate the results.

If you'd like to see a beginner's guide to LINQ, let me know in the comments, as I'd love to write one!
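In the meantime, here's a tiny sketch of the second hint, using LINQ on an in-memory list only (the class and method names are made up). The query is defined before the 4 is added to the list, yet the 4 still shows up in the results, because nothing runs until the query is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyLinq
{
    public static string Run()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Defining the query doesn't run it.
        var query = numbers.Where(n => n > 1);

        // Change the list after the query has been defined.
        numbers.Add(4);

        // Enumerating (here via string.Join) is what actually runs the filter.
        return string.Join(",", query);
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints 2,3,4
    }
}
```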
