Sam's Blog entries for category 'regexp'

Readable Regexps: Why you should use /x

Date: Wednesday, 2 June 2010, 09:53.

Categories: perl, ironman, regexp, craft, basic, tutorial.

Regexps are one of Perl's strongest features, but they're also one of the causes of Perl's greatest criticism: that it looks like line noise.

If you've ever had to examine someone else's regexp, or worse, debug one, you'll probably agree that there's some merit in that criticism.

It doesn't have to be that way, however: there are some simple steps you can take to make your regexps more readable and more maintainable, and this week we look at one of them: the /x modifier.
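
As a rough illustration of what /x buys you, here is a minimal sketch. The date pattern and the variable names are my own assumptions for the example, not taken from the article itself. With /x, whitespace and `#` comments inside the pattern are ignored, so the same regexp can be laid out piece by piece:

```perl
use strict;
use warnings;

# Hypothetical example: match an ISO-style date such as "2010-06-02".
# Compact form, hard to scan at a glance:
my $compact = qr/^(\d{4})-(\d{2})-(\d{2})$/;

# The same pattern with the x modifier: whitespace and comments inside
# the pattern are ignored, so each piece can be laid out and labelled.
my $readable = qr/
    ^               # start of string
    (\d{4})         # year:  four digits
    -
    (\d{2})         # month: two digits
    -
    (\d{2})         # day:   two digits
    $               # end of string
/x;

for my $date ('2010-06-02', 'not a date') {
    my $old = $date =~ $compact  ? 'match' : 'no match';
    my $new = $date =~ $readable ? 'match' : 'no match';
    print "$date: compact=$old readable=$new\n";
}
```

Both forms should behave identically; the only difference is how easy the second one is to read and maintain.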

Did you mean +, not *, in that regexp?

Date: Wednesday, 28 April 2010, 13:36.

Categories: perl, ironman, regexp, craft, basic, tutorial.

Continuing from my previous article "Anchoring Regexps", another common regexp mistake I see is the use of * where the author really meant +.

So today I cover + and *: what's the difference and why does it matter?
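
To make the difference concrete, here is a small sketch using a made-up validation task (checking that an input is all digits); the patterns and inputs are assumptions for illustration only. `*` means "zero or more", so it is satisfied by nothing at all, while `+` means "one or more":

```perl
use strict;
use warnings;

# Hypothetical validation: is the input made up entirely of digits?
my $star = qr/^\d*$/;   # zero or more digits: also matches the empty string
my $plus = qr/^\d+$/;   # one or more digits: requires at least one digit

for my $input ('123', '', 'abc') {
    printf "%-5s star=%s plus=%s\n",
        "'$input'",
        ($input =~ $star ? 'yes' : 'no'),
        ($input =~ $plus ? 'yes' : 'no');
}
```

The empty string passes the `*` version but not the `+` version, which is usually the behaviour the author actually wanted.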

Anchoring Regexps

Date: Thursday, 22 April 2010, 15:14.

Categories: perl, ironman, regexp, craft, optimization, basic, tutorial.

A common mistake I find whenever I look at someone else's regexps is a failure to anchor the regexp.

Anchoring is often, in my experience, the single biggest thing you can do to improve the performance of a regexp. It's one of those things you should learn to do in every regexp where applicable, which should be almost every regexp unless you're specifically looking for "something somewhere in the middle but I don't know where".

So, what is anchoring, and why does it have such a big impact?
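
As a minimal sketch of the idea (the pattern and test strings here are invented for the example), an anchor such as `^` pins the match to a fixed position in the string. An unanchored pattern may be tried at every position in turn, while an anchored one can typically be rejected after a single attempt:

```perl
use strict;
use warnings;

my $unanchored = qr/foo/;    # may match anywhere in the string
my $anchored   = qr/^foo/;   # may only match at the very start

for my $text ('foo bar', 'bar foo', 'bar baz') {
    printf "%-8s unanchored=%s anchored=%s\n",
        $text,
        ($text =~ $unanchored ? 'yes' : 'no'),
        ($text =~ $anchored   ? 'yes' : 'no');
}
```

Note that the two patterns do not mean the same thing: anchoring only helps when you really do know where in the string the match should be.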

In my previous blog entry, "Advanced Benchmark Analysis I: Yet more white-space trimming", I left you with the thought that our benchmarks changed with changing input.

This article shows you how to analyze those changes and how to draw conclusions from them.

It seems my previous blog entry, "Some simple "white-space trim" benchmarks", caught people's attention, and I've received some interesting suggestions and observations worthy of a follow-up article. This also gives me the chance to explain more advanced benchmark analysis.

So, deep breath, here goes.

Some simple "white-space trim" benchmarks

Date: Wednesday, 3 March 2010, 15:38.

Categories: perl, ironman, benchmarking, trim, regexp, optimization, basic, tutorial.

Laufeyjarson asked on Monday about stripping whitespace from both ends of a string. The comments contain lots of suggestions but no hard figures, so I thought I'd reproduce them here along with the code used to generate the benchmarks: it provides a simple example of how to write a quick and reliable benchmark.
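
For readers who want a feel for the shape of such a benchmark, here is a rough sketch using the core Benchmark module. The two trim variants, their names, and the sample input are assumptions chosen for illustration; they are not necessarily the candidates or the code from the article:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Two hypothetical trim variants to compare.
sub trim_two_pass {
    my ($s) = @_;
    $s =~ s/^\s+//;     # strip leading whitespace
    $s =~ s/\s+$//;     # strip trailing whitespace
    return $s;
}

sub trim_one_pass {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//g;   # strip both ends with one alternation
    return $s;
}

my $input = "   some text with padding   ";

# Check that the variants agree before timing them: a fast wrong
# answer is not worth benchmarking.
die "trim variants disagree"
    unless trim_two_pass($input) eq trim_one_pass($input);

# A negative count asks Benchmark to run each candidate for at least
# that many CPU seconds, then print a comparison table.
cmpthese(-1, {
    two_pass => sub { my $r = trim_two_pass($input) },
    one_pass => sub { my $r = trim_one_pass($input) },
});
```

The figures will vary with your perl version, hardware, and input, so treat any single run as indicative rather than definitive.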

© 2009-2013 Sam Graham, unless otherwise noted. All rights reserved.