Steve Lillis' Code Blog: Optimising DotLiquid: Part 1

Introduction

A colleague recently introduced me to a very interesting open source library for content templating called DotLiquid. It's actually a .NET port of a system originally produced for Ruby, and that system originated from Shopify! This kind of legacy is good, because it (usually) means that an API has stood the test of time for being fast and easy to work with.

Intrigued, I downloaded the source for DotLiquid and was pleased to find that not only is the provided API easy to understand and work with without knowing the underlying code, but the source itself has a very well structured Object Oriented architecture, too.

Digging a bit deeper I found a few places that can potentially be improved for better performance. How big the performance gain will be depends on the size and complexity of the content being passed in to be rendered.

How important that performance gain will be really depends on how time critical your rendering of content is. If you're using the templating system for rarely updated web page content, the time saved by these changes will likely be negligible, but if you're rendering a thousand different content pieces in quick succession using the same template then every millisecond saved per render is a whole second saved per batch.

Templates in DotLiquid can be parsed and cached in advance but rendering needs to occur for each piece of content using that template, so in this series I'll be working on improving the performance of DotLiquid's render method specifically.

Testing Performance: Never Guess

The most important thing to do when making efforts to optimise code is to build a reliable way to take before and after performance measurements. By taking measurements in this way, it can be said for certain whether a change to the source had a positive, negative or negligible impact on how it performs.

As the ambition of this series is to improve the speed at which DotLiquid renders content, I have written a simple method for taking timings as my performance metric.

static void RunTest(int iterations)
{
    var template = Template.Parse(TemplateCode);

    var stopwatch = new Stopwatch();
    var timings = new List<double>();
            
    for (var i = 0; i < iterations; ++i)
    {
        var hash = Hash.FromAnonymousObject(GetFreshTestObject());

        stopwatch.Reset();
        stopwatch.Start();

        template.Render(hash);

        stopwatch.Stop();

        timings.Add(stopwatch.Elapsed.TotalMilliseconds);
    }

    Console.WriteLine(@"Iterations: {0}", iterations);

    Console.WriteLine(@"   Average: {0:0.00000}ms", 
                      timings.Average());

    Console.WriteLine(@"   Std Dev: {0:0.00000}ms", 
                      CalculateStdDev(timings));

    Console.WriteLine(@"   Minimum: {0:0.00000}ms", 
                      timings.Min());

    Console.WriteLine(@"   Maximum: {0:0.00000}ms", 
                      timings.Max());

    Console.WriteLine();
}

static void Main()
{
    // Warm up
    Console.WriteLine("Warm up");
    RunTest(100);

    // Real tests
    Console.WriteLine("Real tests");
    RunTest(10000);
    RunTest(1000);
            
    Console.ReadKey();
}

You may notice that I run a warm up test first. C#'s JIT compiler has to compile the code on first execution and I don't want the extra time taken to compile to impact the timings. The warm up run gets JIT compilation out of the way.

To ensure that performance testing results are as reliable as possible, I'll be running the Release build because it is properly optimised and I'll be running it outside of the IDE to remove the IDE's performance impact.

The Test Template

I'll be using the same DotLiquid template throughout the series to keep the parameters of the test consistent. I've used a variety of tags so that optimisations that only affect one tag type will still be included in the timings.

<div>
<p><b>
{% if user.name == 'Steve Lillis' -%}
  Welcome back
{% else -%}
  I don't know you!
{% endif -%}
</b></p>
{% unless user.name == 'Steve Thompson' -%}
  <i>Unless example</i>
{% endunless -%}
{% comment %}A comment for comments sake{% endcomment %}
<ul>
<li>This entry and something about baked goods</li>
<li>
{% assign handle = 'cake' -%}
{% case handle -%}
  {% when 'cake' -%}
     This is a cake
  {% when 'cookie' -%}
     This is a cookie
  {% else -%}
     This is not a cake nor a cookie
{% endcase -%}
</li>
</ul>
</div>
<p>{{ user.name | upcase }} has the following items:</p>
<table>
{% for item in user.items -%}
  <tr>
     <td>
        {% cycle 'one', 'two', 'three' %}
     </td>
     <td>
        {{ item.description }} 
        {% assign handle = 'cake' -%}
        {% case handle -%}
          {% when 'cake' -%}
             This is a cake
          {% when 'cookie' -%}
             This is a cookie
          {% else -%}
             This is not a cake nor a cookie
        {% endcase -%}
     </td>
     <td>
        {{ item.cost }}
     </td>
  </tr>
{% endfor -%}
{% for item in user.items reversed -%}
  <tr>
     <td>
        {% cycle 'one', 'two', 'three' %}
     </td>
     <td>
        {% if item.description == 'First Item' -%}
            {{ item.description | upcase }}
        {% else %}
            {{ item.description }}
        {% endif %}
     </td>
     <td>
        {{ item.cost }}
     </td>
  </tr>
{% endfor -%}
</table>

Initial Timings

Iterations:	1000	10000
Minimum (ms)	3.44020	3.42810
Maximum (ms)	8.25000	7.02620
Range (ms)	4.80980	3.59810
Average (ms)	3.95568	3.76747
Standard Deviation (ms)	0.41129	0.23009

Early Analysis

I only have the initial timings, having made no changes to the code yet, but there's already some points of note.

The standard deviation for 1k and 10k iterations only differs by a fairly negligible ~0.2ms, but the maximum time any one iteration took differs by ~1.2ms. This discrepancy combined with a preliminary examination of DotLiquid's code suggests that excessive object allocation on the heap could be triggering garbage collection, which I'd want to avoid if I can.

The average in both cases is a lot closer to the minimum than the maximum, even when taking the standard deviation into account. This could imply that the first time a DotLiquid template is rendered it caches values to save time on subsequent renders. It might be possible to improve initial render performance by moving the caching out of rendering and into parsing or even compile time. A very brief look over the codebase reveals a few Regexes that aren't pre-compiled, there may be similar savings elsewhere too.

The difference in average between 1k and 10k is just ~0.2ms, hardly a difference at all. This tells me that the render method has been built with consideration to being called multiple times on a single template object, so I likely won't have to make many improvements to how the render method cleans up.

Going Forward

That's it for now. In the next post in this series I'll be investigating the DotLiquid codebase, making changes and hopefully presenting some improved timings. Wish me luck!

Steve Lillis' Code Blog

Friday, 9 January 2015

Optimising DotLiquid: Part 1