What to Improve First?
Choosing where to focus optimisation attention is all about identifying performance hotspots. Hotspots are code paths with a single, long execution time or paths that have short execution times but are visited frequently.
Using a simple divide and conquer approach, I broadly identified the areas that made up the majority of the ~4ms of rendering in the initial timings.
The biggest costs came from the main infrastructure class Context which manages the template's memory stack while it is rendered and the class Condition which evaluates the if and unless statements that are used frequently in the test template.
First Pass of Optimisations
In this pass I focused on making small, simple changes that eliminate unnecessary rework occurring within some of the most frequently called methods. I re-ran the tests after each change was applied to ensure that no individual change was raising the execution time back up. In the cases that this did occur, I undid the change.
I've provided the changes and relevant example code below and you can see the full changeset on GitHub.
The Condition Class
I amended the class to keep a reference to the operation delegate whenever the operation string is set instead of every time the condition is evaluated, reducing the number of times the lookup occurs from thousands to just a handful.
// BEFORE // ================================== private static bool InterpretCondition( string left, string right, string op, Context context) { if (string.IsNullOrEmpty(op)) { object result = context[left]; return (result != null && (!(result is bool) || (bool) result)); } var leftObject = context[left]; var rightObject = context[right]; ConditionOperatorDelegate opDelegate; if (!Operators.TryGetValue( op, out opDelegate)) throw new Exceptions.ArgumentException(...); return opDelegate(leftObject, rightObject); } public virtual bool Evaluate(Context context) { context = context ?? new Context(); bool result = InterpretCondition( Left, Right, Operator, context); ... // AFTER // ================================== private string _operatorString; private ConditionOperatorDelegate _operatorDelegate; private string Operator { get { return _operatorString; } set { _operatorString = value; if (string.IsNullOrEmpty(value)) _operatorDelegate = (l, r) => NoOperator(l); else if (!Operators.TryGetValue( value, out _operatorDelegate)) throw new Exceptions.ArgumentException(...); } } public virtual bool Evaluate(Context context) { // The whole InterpretCondition method is avoided var result = _operatorDelegate( context[_left], context[_right]); ...
I replaced comparisons against arbitrary strings "and" and "or" with byte comparisons against constants fields named And and Or.
// BEFORE // ================================== switch (_childRelation) { case "and": return result && _childCondition.Evaluate(context); case "or": ... // AFTER // ================================== if (_childRelation == And) // And is a const byte value return result && _childCondition.Evaluate(context); if (_childRelation == Or) // Or is a const byte value ...
I replaced occurrences of double casting with a single as operator and null check.
// BEFORE // ================================== if (left is Symbol) return ((Symbol)left).EvaluationFunction(right); ... // AFTER // ================================== var symbolLeft = left as Symbol; if (symbolLeft != null) return symbolLeft.EvaluationFunction(right); ...
The Context Class
I stored a reference to the first and last scopes in the context's list of scopes and replaced all lookups with the new references.
// BEFORE // ================================== context.Scopes.Last()[_to] = _from.Render(context); ... var orphanedBlocks = ((List<Block>)context.Scopes[0]["extends"]) ?? new List<Block>(); ... // AFTER // ================================== context.GlobalScope[_to] = _from.Render(context); ... var orphanedBlocks = ((<Block>)context.LocalScope["extends"]) ?? new List<Block>(); ...
Timings
The timings seen below were all taken during the same time period on the same machine. They are based on 10,000 iterations per test.
Before Optimisations | After Optimisations | |
---|---|---|
Minimum (ms) | 3.45680 | 3.36080 |
Maximum (ms) | 4.95130 | 4.97570 |
Range (ms) | 1.49450 | 1.61490 |
Average (ms) | 3.64347 | 3.54507 |
Standard Deviation (ms) | 0.16981 | 0.18134 |
Analysis
A reduction in average execution time of ~0.1ms may not seem like a lot, but it represents a 3% reduction in the total time taken to render, which is a sizeable chunk.
It's worth noting that micro-optimisation is not often required, but in the case of trying to bring down a ~4ms execution time, micro-optimisations are exactly what I needed!
Summary
The careful adjustments made during this pass had a small but notable impact and were a great way to get used to the DotLiquid codebase.
I expect to make a huge performance boost in the next pass of optimisations by replacing all the uses of static method Regex.Match with pre-compiled Regex instance equivalents.
Exciting stuff!
No comments:
Post a Comment