Creating a Template Engine Part 1: Groundwork

Although I strongly recommend using a popular, well-tested engine, I believe that creating your own is a great learning experience. Since I’ve recently done this and couldn’t find any blog posts explaining the process, I thought I’d take the time to explain how one works, hopefully teaching a few budding JavaScripters looking to expand their knowledge.

By the end of this, you should be able to see a template like this:

<ul>
    ${#if responses.length}
        ${#each responses as response}
        <li>${response.name}</li>
        ${#end each}
    ${#end if}
    ${#if !responses.length}
    <li><em>No responses</em></li>
    ${#end if}
</ul>

… combine it with data like this:

var data = {
    responses: [
        { name: "Alpha" },
        { name: "Bravo" },
        { name: "Charlie" }
    ]
};

… and be able to turn it into rendered markup like this:

<ul>
    <li>Alpha</li>
    <li>Bravo</li>
    <li>Charlie</li>
</ul>

… with some minor allowances for whitespace.

This isn’t hugely difficult but there are a few steps to go through. I’m going to spend this blog post explaining the groundwork that you will need to understand before the more interesting parts in the next post will make sense. If you’re curious enough to want to know how to build your own JavaScript template engine then there’s a good chance that you already know this stuff, but I’ll go over it anyway just to make sure we’re all on the same page.

Where to Start

The best way to start something like this is to start small. The smallest part of a template engine is string interpolation. There are native ways of doing that these days but this post isn’t about what’s available – it’s about what can be done!

String Interpolation

There are a few ideas out there about string interpolation (one of my favourites used to be James Padolsey’s Straight-up Interpolation) but I’ve always found that Douglas Crockford’s Remedial JavaScript is the easiest to understand (even if it is a little dated these days). Here’s Crockford’s original function:

if (!String.prototype.supplant) {
    String.prototype.supplant = function (o) {
        return this.replace(
            /\{([^{}]*)\}/g,
            function (a, b) {
                var r = o[b];
                return typeof r === 'string' || typeof r === 'number' ? r : a;
            }
        );
    };
}

// Usage:
"hello {thing}".supplant({thing: "world"}); // -> "hello world"

We can tidy up the function a little to bring it into line with JSLint’s current recommendations.

function supplant(string, replacements) {

    return string.replace(/(\$\{(.*?)\})/g, function (ignore, whole, key) {

        var value = replacements[key];

        return (
                typeof value === "string"
                || typeof value === "number"
            )
            ? value
            : whole;

    });

}

Still simple, but (in my honest and humble opinion) easier to read. I’ve also swapped the syntax from {key} to ${key} to more closely match the native template strings. Despite the tidy-up, this function still has an issue: we can’t escape the match.

supplant("hello ${thing}", {thing: "world"}); // -> "hello world" :-)
supplant("hello \\${thing}", {thing: "world"}); // -> "hello \\world" :-(

This is the sort of thing that makes people who use template engines cry. PrototypeJS has a fairly simple template engine that does allow escaping. My first thought was to just swipe their regular expression and call it a day, but there’s an issue with doing that as well:

function supplant(string, replacements) {

    return string.replace(
        /(^|.|\r|\n)(\$\{(.*?)\})/g,
        function (ignore, prefix, whole, key) {

            var value = replacements[key];
            var replacement = (
                    typeof value === "string"
                    || typeof value === "number"
                )
                ? value
                : whole;

            return prefix === "\\"
                ? whole
                : prefix + replacement;

        }
    );

}

That function works perfectly …

supplant("hello ${thing}", {thing: "world"}); // -> "hello world" :-)
supplant("hello \\${thing}", {thing: "world"}); // -> "hello ${thing}" :-D

… until you put two placeholders next to each other.

supplant("hello ${thing}${point}", {thing: "world", point: "!"});
// -> "hello world${point}" :-(

It turns out that Prototype’s trick was a little deeper than remedial JavaScript: they break apart a string based on the regular expression given and process each part, relying on a little-known property of the result of String.prototype.match. When the pattern has no global flag, match.index is the index where the match appears in the string. This means that string.substr(0, match.index) is everything before the match and string.substr(match.index + match[0].length) is everything afterwards; if we set string to that second substr then we can repeat the process to get the next match. If there is no match (or no more matches) then whatever is left of the string must come after our final match, and a zero-length match would cause an infinite loop, so in either case we add the remaining string to the results and stop.
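To make that concrete, here is a single step of the process on its own. It’s only an illustrative sketch (the variable names are mine) and it assumes a pattern without the global flag, since that’s the only time match.index is available.

```javascript
// One step of the splitting process, using a non-global pattern.
var string = "a<b>c<d>e";
var match = string.match(/<(\w+)>/);

// Everything before the match.
var before = string.substr(0, match.index);
// The match itself.
var matched = match[0];
// Everything after the match; this becomes the string for the next step.
var after = string.substr(match.index + match[0].length);

// before  -> "a"
// matched -> "<b>"
// after   -> "c<d>e"
```

Repeating the same step on after would find "<d>" next, which is the loop that the function below wraps up.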

function tokenise(string, pattern, handler, context) {

    var parts = [];
    var match;

    if (!(pattern instanceof RegExp)) {
        pattern = /(?:)/; // only matches the empty string, so nothing is split
    }

    if (typeof handler !== "function") {

        // returns first entry
        handler = function (matches) {
            return matches[0];
        };

    }

    while (string.length) {

        match = string.match(pattern);

        if (!match || match[0].length === 0) {

            parts.push(string);
            string = "";

        } else {

            parts.push(
                string.substr(0, match.index),
                handler.call(context, match)
            );
            string = string.substr(match.index + match[0].length);

        }

    }

    return parts;

}

This gives us a function that can break a string into matches and optionally process any match it finds.

tokenise("a<b>c<d>e", /<(\w+)>/);
// -> ["a", "<b>", "c", "<d>", "e"]

tokenise("a<b>c<d>e", /<(\w+)>/, function (matches) {
    return "(" + matches[1].toUpperCase() + ")";
});
// -> ["a", "(B)", "c", "(D)", "e"]

We shouldn’t be in the habit of trusting functions that another developer can pass in – we returned a string this time, but returning something else might give us strange results. I find that a simple function like this can fix many problems like that, and we can even use it in supplant:

function interpretString(string) {

    return typeof string === "string"
        ? string
        : (string === null || string === undefined)
            ? ""
            : String(string);

}

No matter what we pass to interpretString, we get back a string …

interpretString("hello");   // -> "hello"
interpretString(12345);     // -> "12345"
interpretString([1, 2, 3]); // -> "1,2,3"
interpretString(true);      // -> "true"
interpretString();          // -> ""

… meaning that we can add it to tokenise to avoid any future problems:

function tokenise(string, pattern, handler, context) {

    var str = interpretString(string);

    // ...

            parts.push(
                str.substr(0, match.index),
                interpretString(handler.call(context, match))
            );

    // ...

}

We can do other tricks such as ensuring that pattern doesn’t have a global flag, but I’ll avoid those so this post doesn’t get too distracted.

If you want to do that yourself, regexp.flags.split("") will give an array of flags, [].indexOf and [].splice can remove entries and regexp.source can be passed to new RegExp() – that’s how I’d approach it.
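As a rough sketch of that approach, assuming an environment that supports regexp.flags (withoutGlobalFlag is a hypothetical name and isn’t used anywhere else in this post):

```javascript
// Hypothetical helper: returns a copy of the pattern without the "g" flag.
function withoutGlobalFlag(regexp) {

    var flags = regexp.flags.split("");
    var index = flags.indexOf("g");

    if (index > -1) {
        flags.splice(index, 1);
    }

    return new RegExp(regexp.source, flags.join(""));

}

var pattern = withoutGlobalFlag(/<(\w+)>/gi);
// pattern.global     -> false
// pattern.ignoreCase -> true
```

The reason to bother is that "".match with a global pattern returns a plain list of matches without the index property that tokenise depends on.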

Back on point, our supplant function just needs to be altered to use tokenise rather than "".replace. We can also take advantage of interpretString:

function supplant(string, replacements) {

    string = interpretString(string);
    replacements = replacements || {};

    return tokenise(string, /(^|.|\r|\n)(\$\{(.*?)\})/, function (matches) {

        var prefix = matches[1];
        var whole = matches[2];
        var value = replacements[matches[3]];
        var replacement = (
                typeof value === "string"
                || typeof value === "number"
            )
            ? value
            : whole;

        return prefix === "\\"
            ? whole
            : prefix + replacement;

    }).join("");

}

Quick test:

supplant("hello ${thing}", {thing: "world"});
// -> "hello world" :-)
supplant("hello \\${thing}", {thing: "world"});
// -> "hello ${thing}" :-D
supplant("hello ${thing}${point}", {thing: "world", point: "!"});
// -> "hello world!" X-D

… and we have a fully-functional string interpolation function that can handle escaping and consecutive placeholders! All sorted, right?

Not quite. If you check the original template that I teased you with, you’ll notice that we don’t actually reference a property directly, we get a nested property (response.name as it happens). In order to get that functionality working, we’re going to have to … erm … adapt some code from another library. Luckily, Lodash is open-source and I’ve got my eye on their _.get() method. Because Lodash was built with performance and modularity in mind instead of readability, going through their GitHub repo is not without its challenges, but here’s the essence of accessing a nested property on an object.

It starts with some impressive regex-fu to break the string into an array of each path request. I’m not going to try and explain it because (just between you and me) I don’t fully understand it and I’m just glad it works.

function stringToPath(paths) {

    var result = [];
    var str = interpretString(paths);

    if ((/^\./).test(str)) {
        result.push('');
    }

    str.replace(
        /[^.[\]]+|\[(?:(-?\d+(?:\.\d+)?)|(["'])((?:(?!\2)[^\\]|\\.)*?)\2)\]|(?=(?:\.|\[\])(?:\.|\[\]|$))/g,
        function(match, number, quote, string) {

            result.push(
                quote
                    ? string.replace(/\\(\\)?/g, '$1')
                    : (number || match)
            );

        }
    );

    return result;

}

Now that we’ve got a list of paths, we just need to run through those paths, stopping and returning undefined if we can’t find one.

If you haven’t seen the technique before, [].every will stop processing as soon as the function returns a falsy value – you can think of it as [].doWhile.
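A tiny illustration of that early exit (the array and variable names are just for the demo):

```javascript
var visited = [];

// The callback records each entry it sees and asks to continue
// only while the number is below 2.
var allSmall = [1, 2, 3, 4].every(function (number) {
    visited.push(number);
    return number < 2;
});

// allSmall -> false
// visited  -> [1, 2] (3 and 4 were never looked at)
```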

function access(object, path) {

    stringToPath(path).every(function (property) {

        object = (
                object !== null
                && object !== undefined
                && Object.prototype.hasOwnProperty.call(object, property)
            )
            ? object[property]
            : undefined;

        return object !== undefined;


    });

    return object;

}
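If you’d like to play with the walk on its own, here’s a cut-down sketch. simpleAccess is a hypothetical stand-in for access that only understands dot-separated paths (no brackets or quotes), so it doesn’t need stringToPath; the property check is the same one used above.

```javascript
// Hypothetical, simplified stand-in for access(): dot-separated paths only.
function simpleAccess(object, path) {

    String(path).split(".").every(function (property) {

        // Only follow properties that the current object owns.
        object = (
                object !== null
                && object !== undefined
                && Object.prototype.hasOwnProperty.call(object, property)
            )
            ? object[property]
            : undefined;

        // Stop walking as soon as a step comes back undefined.
        return object !== undefined;

    });

    return object;

}

var sample = {
    responses: [
        { name: "Alpha" },
        { name: "Bravo" }
    ]
};

simpleAccess(sample, "responses.1.name"); // -> "Bravo"
simpleAccess(sample, "responses.length"); // -> 2
simpleAccess(sample, "responses.5.name"); // -> undefined
simpleAccess(sample, "toString");         // -> undefined
```

The last call shows the effect of the hasOwnProperty check: inherited properties such as toString are treated as missing, while an array’s length is its own property and can still be reached.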

We can now modify supplant to take advantage of access:

function supplant(string, replacements) {

    // ...

    return tokenise(string, /(^|.|\r|\n)(\$\{(.*?)\})/, function (matches) {

        var value = access(replacements, matches[3]);

        // ...

    }).join("");

}

I’m a big fan of keeping these functions tidy, so here’s some minor house-keeping (bonus points if you recognise this variable style).

var util = {

    Object: {
        access: access
    },

    String: {
        interpret: interpretString,
        toPath: stringToPath,
        supplant: supplant,
        tokenise: tokenise
    }

};

I’ll be using util in the next post as well.

This now gives you enough to render 4 lines of the template:

<ul>
    <!-- ${#if responses.length} -->
        <!-- ${#each responses as response} -->
        <li>${response.name}</li>
        <!-- ${#end each} -->
    <!-- ${#end if} -->
    <!-- ${#if !responses.length} -->
    <li><em>No responses</em></li>
    <!-- ${#end if} -->
</ul>

Specifically how to handle the process blocks (those ${#if, ${#each and ${#end lines) will be explained next time – before that I need to explain how to handle the nesting since that trips up a lot of budding template engineers and it can be tricky to google if you don’t know the correct search term. That term is:

The Composite Pattern

Pro JavaScript Design Patterns describes the composite pattern like this:

The composite pattern … is tailor-made for creating dynamic user interfaces … you can initiate complex or recursive behaviours on many objects with a single command. This allows your glue code to be simpler and easier to maintain, while delegating the complex behaviours to the objects.

The book goes on to give an example of the structure, showing the difference between a “Composite” (an object that will have child objects) and a “Leaf” (an object that will have no children but will likely handle the functionality).

Composite─┬─Leaf
          ├─Composite─┬─Leaf
          │           ├─Leaf
          │           └─Leaf
          └─Composite─┬─Leaf
                      └─Leaf

A “Composite” (or “Branch”) can have any child, “Leaf” or “Composite”, and the nesting can go as deep as you like. A “Composite” will have a method that executes the same method on each of its children; on a “Leaf”, that method will execute something or return something.
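Stripped right down, the idea might look something like this. It’s a bare-bones sketch rather than anything from the book: makeLeaf, makeComposite and total are hypothetical names chosen for the demo.

```javascript
// Hypothetical leaf: no children, it just returns its own value.
function makeLeaf(value) {

    return {
        total: function () {
            return value;
        }
    };

}

// Hypothetical composite: asks each child for its total and adds them up.
// It neither knows nor cares whether a child is a leaf or another composite.
function makeComposite() {

    var children = [];

    return {

        addChild: function (child) {
            children.push(child);
        },

        total: function () {
            return children.reduce(function (sum, child) {
                return sum + child.total();
            }, 0);
        }

    };

}

var root = makeComposite();
var branch = makeComposite();

branch.addChild(makeLeaf(2));
branch.addChild(makeLeaf(3));
root.addChild(makeLeaf(1));
root.addChild(branch);

root.total(); // -> 6
```

One call on the root reaches every leaf, however the branches happen to be nested.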

To explain the pattern, let’s build a menu that will display items and their prices. We can separate the menu with headings. We’ll start by defining the food items:

function makeItem(name, price) {

    return {

        draw: function () {
            return "<dt>" + name + "</dt><dd>" + price + "</dd>";
        }

    };

}

Nothing extravagant here, we’re just building a definition list. The logical next step is to create something that will manage lists. Since the sections will technically be lists as well, we need to make the list generic enough that both situations will work. For the sake of simplicity, I’ll just allow the list to define the immediate code before and after the list itself.

function makeList(prefix, postfix) {

    var children = [];

    return {

        addChild: function (child) {
            children.push(child);
        },

        draw: function () {

            var pre = prefix || "";
            var post = postfix || "";

            return pre + children.map(function (child) {
                return child.draw();
            }).join("") + post;

        }

    };

}

The magic of the “list” object is the draw method – see how it calls the draw method of all of its children? It doesn’t matter how many children there are – there can even be none. Keeping the method names the same makes the composite pattern dynamic as it doesn’t matter whether the child is an “item” object or another “list”.

Moving on, we’ll create a function that will create the sections. They’re based on the “list” since they also need to have children and draw them, but they put a heading in front of the list:

function makeSection(name, level, prefix, postfix) {

    var list = makeList(prefix, postfix);
    var listDraw = list.draw;

    list.draw = function () {

        var tag = "h" + level;

        return "<" + tag + ">" + name + "</" + tag + ">" + listDraw();

    };

    return list;

}

Now we just need a menu to tie all these parts together. Since this is just a demo, I’ll hard-code the branches (in next week’s code, the branches will be dynamically built up). This will show you how sections can have “item” objects (a “Leaf”) or sub sections (a “Composite”).

function makeMenu() {

    // Create the branches.

    var menu = makeSection("Menu", 1);
    var pizza = makeSection("Pizza", 2);
    var basic = makeSection("Basic", 3, "<dl>", "</dl>");
    var advanced = makeSection("Advanced", 3, "<dl>", "</dl>");
    var sides = makeSection("Sides", 2, "<dl>", "</dl>");
    var drinks = makeSection("Drinks", 2);
    var bottles = makeSection("Bottles", 3, "<dl>", "</dl>");
    var cans = makeSection("Cans", 3, "<dl>", "</dl>");

    // Attach the branches and add the leaves.

    menu.addChild(pizza);
    menu.addChild(sides);
    menu.addChild(drinks);

    pizza.addChild(basic);
    pizza.addChild(advanced);

    basic.addChild(makeItem("St Tropez", "£9.99"));
    basic.addChild(makeItem("Ploughman's", "£9.99"));
    basic.addChild(makeItem("Hawaiian", "£9.99"));

    advanced.addChild(makeItem("Rio", "£12.99"));
    advanced.addChild(makeItem("Texan", "£12.99"));

    sides.addChild(makeItem("Chips", "£0.99"));
    sides.addChild(makeItem("Coleslaw", "£0.99"));

    drinks.addChild(bottles);
    drinks.addChild(cans);

    bottles.addChild(makeItem("Coke", "£1.99"));
    bottles.addChild(makeItem("Diet Coke", "£1.99"));

    cans.addChild(makeItem("Coke", "£0.99"));
    cans.addChild(makeItem("Diet Coke", "£0.99"));

    return {
        draw: menu.draw
    };

}

… and even the “menu” object has a draw method. That means that all we need to do is call a single method to render the entire menu, no matter how many branches were actually added. If you’d like to see what the code does, I’ve put together a JSFiddle example.

Final Words

This is the basic knowledge that you’ll need to understand before you can put together a template engine of your own. In part 2 I’ll show you the code necessary to actually build the engine and put this theory into practice. I’ve got the code for the interpolation up on GitHub if you want to see it all together.

I hope you’ve enjoyed reading this post and that you learned something. Part 2 will be up next week.
