Sam's Blog v1.00 Released

Date: Monday, 10 May 2010, 16:22.

Categories: perl, ironman, template-benchmark, template-sandbox, benchmarking.

Well, after 13(!) point releases, I'm happy enough with Template::Benchmark to make the first stable release.

v1.00 should be hitting a CPAN mirror near you sometime soon.

Now that it's officially released, I thought I'd go into a little more detail about what Template::Benchmark does, how it can be useful to you, what motivated me to write it in the first place, and some thoughts on where to take the project next.

If you're not interested in why I wrote Template::Benchmark, I suggest you skip the next section and cut to the chase.

How It Came To Be

If there's one thing that Perl doesn't have a shortage of, it's templating modules. It's probably the classic example of redundant development for the language.

It's what happens when "There's More Than One Way To Do It" meets "Not Really What I Was After".

Taken together there's a proliferation of modules aimed at solving slightly different subsets of problems with slightly different implementations, and everyone knows that if they just wrote their own, it'd fit their needs perfectly.

Across sixteen years of perl development, I've honestly lost count of how many times I've written a template engine from scratch for a company. Always to fit the particular needs of that company, because the modules on CPAN either didn't exist at the time, or had some shortcoming that either was or seemed critical at that time and place. Thinking back, the only times I haven't were when someone else had beaten me to the punch.

I've even released Template::Sandbox to CPAN, because it has the particular mix of features that I feel are important. In particular (shameless plug) it: is decently sand-boxed from your application data and flow; uses third-party caching modules rather than rolling its own; has simple syntax and "just enough" complexity to help separate application logic from UI design; and lastly, but for this article most importantly, it's "high performance".

I use quotes on that last statement advisedly.

Every template engine at some point in its documentation, abstract, website, or elsewhere in its sales pitch, makes the claim that it's "high performance" or "fastest template engine evva!!!!".

In some ways that's no surprise, people seldom publish their code for public consumption under a banner of "Slow As Molasses But Use It Anyway".

Now, at one point I'd spent a fair number of releases, and probably way too much development-time, profiling and optimizing Template::Sandbox, with the help of the invaluable Devel::NYTProf, and I was feeling nicely proud of the improvements I'd made.

But it's one thing to know you've improved your code against previous versions; you're still working in a little bubble-universe of your own. You could be up a dead-end alley, happily making 5% gains here, 10% gains there, while everyone else is roaring off down the motorway using a methodology that's inherently ten times faster than where you started from.

It pays to look at what other people have written, to see what they're doing that works better than what you've done, so you can nick their ideas.

Er, I mean take inspiration from and give credit where credit's due.

In this context, that "works better than" phrase is an awful lot like the "high performance" phrase earlier... it's more than a bit fuzzy, and you need to measure it before you can be sure it's true.

Now, being a veteran of writing and using template engines, and also being a veteran of benchmarking code over the years, I already knew that there are some intrinsic issues here: to get a meaningful benchmark you need to compare like-with-like, and well... I've already pointed out that the reason there are so many template engines is that they're written to solve different needs, so no two template engines are alike.


Knocking around in the back of my head for the past five or six years had been the idea that each template engine tends to solve a specific subset of a common superset of problems. So, in theory, it would be possible to find the common subset for any combination of template engines and benchmark across just that subset.

After all, if the end-user is happy that for their needs two template engines are interchangeable, they're surely only going to be interested in the performance of the interchangeable bits.

You'd need to have some idea of how each template engine implemented each feature, and how to combine different features, and... well it looked like a lot of work and I kept finding more interesting things to do instead.

Well fast-forward to late 2009 and I was poking around other template engine distributions to see what interesting approaches they'd used.

I was rifling through the code of Template::Alloy, which had an interesting Parrot-like idea of compiling to a generic abstract syntax tree, so that with different parsers and execution engines it could behave like any template engine it wanted.

Because of this emulation of other template engines the author, Paul Seamons, had written a benchmark script comparing a number of other template engines, each with different syntaxes, and it actually ran a template that was more complicated than the usual "Hello World" benchmarks. It also tested a wide range of different caching options.

I snagged a copy and added Template::Sandbox to the list of engines and sat back to bask in the glory of my outstanding performance.

Hmm, yes, well.

Actually, it wasn't a complete disaster from my ego's point of view: Template::Sandbox performed pretty robustly in some circumstances (in particular my decision to use external caching modules was a huge win), and where it under-performed, it was usually against either an XS module/C-library binding, or against an engine with a much smaller feature-set.

Of course, for me this highlighted the fact that I wasn't benchmarking that wider set of features, or the fancy optimizations I'd made for them.

So I added them to the Template::Sandbox section, and to the sections for the other engines that supported similar features, then added some code that skipped the template engines that didn't support those extra bits when they were asked for.

From there it was a short step to split up the templates that Paul had originally written into their constituent parts, so I could selectively choose those too.

By this point the original script had been mangled into a fairly unrecognisable state and, in all honesty, an unmaintainable one.

So, there I was, converting hard-coded template snippets and benchmark closures stored in randomly-named variables into some sort of hash-lookup, when the penny belatedly dropped: I'd turned it into that generic "build a benchmark for just the features I want" application that I'd always dismissed as too much work.

It didn't look to be much work (ha!) to turn the script into some sort of generic plugin architecture with different template engine features and caching types and a bajillion options to let you just produce the benchmark you want.

"So I wrote it... (in Perl)", as Damian Conway would say.

What It Does

At its core, there are 24 template features ranging from scalar_variable through variable_function, and 6 caching types such as uncached_string or shared_memory_cache. Template::Benchmark then has a plugin for every template engine it knows about, 21 of them currently, and each plugin provides hooks to say how to produce a snippet of template syntax for each given template feature, and a closure to run the template in a given caching configuration.
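By way of illustration, here's roughly the shape of a plugin for a made-up engine. The method names and signatures below are my loose recollection of the plugin interface rather than gospel; treat this as a sketch and read an existing Template::Benchmark::Engines:: plugin for the real thing.

```perl
package Template::Benchmark::Engines::MyEngine;

use strict;
use warnings;

use base qw/Template::Benchmark::Engine/;

#  One hook per template feature: return a snippet of this engine's
#  syntax for the feature, or undef if the feature is unsupported.
sub feature_syntax
{
    my ( $self, $feature ) = @_;

    my %syntaxes = (
        #  Hypothetical syntax for a hypothetical engine.
        scalar_variable => '<: expr scalar_variable :>',
        );
    return( $syntaxes{ $feature } );
}

#  One hook per caching type: return named closures that take the
#  template and its variables and return the rendered output.
sub benchmark_functions_for_uncached_string
{
    my ( $self ) = @_;

    return( {
        MyEng =>
            sub
            {
                my ( $template, $vars ) = @_;
                #  ... feed $template and $vars to the engine and
                #  return the output string ...
            },
        } );
}

1;
```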

All you need to do is tell Template::Benchmark that you're interested in a particular group of template features and caching types, and it goes off and determines what plugins support those choices, produces appropriate templates and code snippets and then runs benchmarks, before serving you the results.
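As a sketch of what that looks like from code — the constructor option names here are assumptions on my part, following the feature and caching-type names above, so check the Template::Benchmark POD for the authoritative list:

```perl
#!/usr/bin/perl

use strict;
use warnings;

use Data::Dumper;
use Template::Benchmark;

#  Option names are assumed, not verified against the POD.
my $bench = Template::Benchmark->new(
    duration            => 5,    #  CPU seconds per benchmark
    repeats             => 10,   #  copies of each feature snippet
    scalar_variable     => 1,    #  template features wanted...
    variable_function   => 1,
    uncached_string     => 1,    #  ...and caching types wanted
    shared_memory_cache => 1,
    );

#  benchmark() works out which plugins support those choices, builds
#  the templates, runs the benchmarks and returns a hash-ref of results.
my $result = $bench->benchmark();

print Dumper( $result );
```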

There's a command-line script to make using the module easy, and you get a few extra benefits from the information Template::Benchmark needs to know: you can get some pretty matrices of which engines support which features or caching types, or even whether they're pure-perl and whether their syntax is embedded-perl or a mini-language.

You can use --featuresfrom to quickly select the set of features supported by a specific engine, useful if you're looking for something with equivalent functionality.

Using --onlyplugin or --skipplugin you can filter out different template engines.

You can change the length of the generated template by controlling how many repeats of each template snippet there are with the --repeats option, and you can control the CPU time spent on each benchmark with the --duration option.

--json gives you the ability to dump the output in a machine-readable format for storage or subsequent analysis, a topic I've covered before in "Regression Benchmarks with Template::Benchmark".
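Pulling those options together, an invocation might look something like this. I'm assuming the bundled script is named benchmark_template_engines and that plugins are referred to by their module suffix; check the distribution if those guesses are off.

```shell
#  Benchmark just the features Template::Sandbox supports, against
#  two engines, with a longer template and machine-readable output.
benchmark_template_engines \
    --featuresfrom TemplateSandbox \
    --onlyplugin TemplateSandbox --onlyplugin TemplateToolkit \
    --repeats 10 --duration 5 \
    --json > results.json
```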

There's plenty more to Template::Benchmark, but rather than replicate it all here, I'll just point you at the documentation if you're interested in learning more.

Where Next?

Template::Benchmark has reached its first stable release, and should provide a decent level of functionality, but it remains only a v1.00 release; there are many more things that could and should be done.

Some loose plans, in no particular order:

  • Bundle::Template::Benchmark.

    A bundle distro for CPAN that installs all the supported template engines, so you don't have to manually install each of them.

  • Improved documentation of the results data-structure.

    Frankly, this currently sucks. Partly because I keep wondering if it should return some manner of result object instead of just a bare hash-ref. But then I decide that feels like overkill. I should JFDI.

  • Improved test-suite.

    The current test-suite is rather anaemic because it doesn't know what (if any!) template engines the target machine has installed, so it can't make any predictions about behaviour that relies on a plugin existing. There needs to be a mock plugin object used as part of the test-suite.

  • A test-suite for plugins.

    Having a plugin system is fine and dandy, but there needs to be a decent test-suite to help other people write plugins and test that they've done it right.

  • Pretty HTML report generation.

    Pointless fluff to some people, a useful presentation tool to others. Something like Devel::NYTProf's nytprofhtml makes for a happy compromise.

  • "Meta-features" or "feature families".

    Some template features are quite similar in behaviour: there's several types of loops, of conditional statements and expression handling. There are good reasons for this, but it can be a pain to turn them all on or all off; it'd be nice if you could do --noloops or something similar and turn them all off with a single option.

  • Per-feature repeat values.

    Currently the --repeats option applies globally to all the feature snippets that make up the template being benchmarked. A finer-grained control over what gets repeated, and how many times, could be useful to let people simulate their intended environment more accurately.

If there's an obvious improvement that you think ought to be on that list, now's a good time to suggest it.

In Conclusion

I feel that Template::Benchmark should make people's lives easier: for people trying to choose a template engine that meets their needs (a one-stop feature-comparison list), for those choosing the one whose performance characteristics meet their feature and caching requirements, or for template module authors who want to see what strengths and weaknesses their implementation has against similar engines.

Let me know if you agree.

© 2009-2013 Sam Graham, unless otherwise noted. All rights reserved.