[oclug] A Friday Regex Head Scratcher
yanick at babyl.dyndns.org
Mon Apr 20 19:14:35 EDT 2009
Jon Earle wrote:
> Dave O'Neill wrote:
>> Using $& is generally a bad idea, because using it once imposes a
>> performance penalty on all other regular expression matches.
> I was just reading up on the $& stuff, and according to Programming Perl, 3rd
> Ed, pg 146, last para:
> "Perl uses a similar mechanism to produce $1, $2 and so on, so you also pay a
> price for each pattern that contains capturing parentheses."
> In my case, the matches are to be done to generate a web page and the results
> are plenty fast enough (fast enough that I can't detect a page loading
> It would be interesting to see if there are in fact, performance differences
> between $& and $1 usage though, but I don't have the time to run those tests
> at the moment.
For a single regex, the performance will be the same (basically, in
both cases you ask Perl to remember what was matched, which takes a
little more work than just match'n'forget). It's only if there are many
regexes that you'll notice a difference. Whereas the use of $1, $2,
etc., will only impart a performance penalty to the regexes in which they
are used[*], using $& will trigger that penalty for *all* regexes in the
program (simply because there's no way to know a priori which regex we
want to capture the content of).
Now, that being said, the performance difference is typically fairly
small. For little scripts, you'll have to squint real hard to see it.
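If someone does find the time to run those tests, here is a rough sketch
of one way to go about it, assuming the core Benchmark module. The sample
text and the sub names are made up for illustration. It only compares a
capturing match against a non-capturing one; it deliberately leaves $&
out, since mentioning $& anywhere in the script would slow down both
candidates. The numbers will depend on your Perl version (from 5.20 on,
copy-on-write removed most of the $& penalty), so take any result as
indicative only.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample input: 100 copies of the phrase from the example.
my $text = "Such a fool " x 100;

# Capturing group: Perl has to remember what matched so $1 is available.
sub with_capture {
    my $n = 0;
    $n++ while $text =~ /(fo+)/g;
    return $n;
}

# Non-capturing group: same pattern, but nothing is stored for $1.
sub without_capture {
    my $n = 0;
    $n++ while $text =~ /(?:fo+)/g;
    return $n;
}

# Run each sub 20,000 times and print a comparison table.
cmpthese(20_000, {
    capture    => \&with_capture,
    no_capture => \&without_capture,
});
```

To measure the $& effect itself, you would probably want two separate
scripts, one of which mentions $& somewhere, and compare their timings.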
[*] By the way, to group patterns without storing the matched value in
$1 and its ilk, there's the (?:) construct. E.g.:
    "Such a fool" =~ /(fo+)/;    # $1 is now 'foo'
    "Such a fool" =~ /(?:fo+)/;  # $1 is undef
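Where (?:) tends to earn its keep is when you need grouping for an
alternation but only care about capturing something else. A minimal
sketch (the colour_value sub and its input format are invented for the
example):

```perl
use strict;
use warnings;

# Hypothetical helper: pull the value out of a "color=..." or
# "colour=..." line. The (?:...) groups the alternation without
# capturing it, so the value lands in $1 rather than $2.
sub colour_value {
    my ($line) = @_;
    return $line =~ /^(?:color|colour)=(\w+)$/ ? $1 : undef;
}

print colour_value("colour=green"), "\n";   # prints "green"
```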
> I ended up studying the heck out of Yanick's example and learned quite a bit
> about lookbehinds, the map function and advanced regex usage.
Excellent. I was half afraid you'd take a look at the regex, turn
green, and immediately proceed to format your hard-drive. :-)
> Thanks Dave and Yanick for your advice and suggestions!
Pleasure was all mine. :-)