[oclug] A Friday Regex Head Scratcher
yanick at babyl.dyndns.org
Fri Apr 17 17:40:37 EDT 2009
On Fri, Apr 17, 2009 at 02:38:04PM -0400, Jon Earle wrote:
> Probably simple, but the answer is eluding me at the moment. I have a block
> of text that contains:
>
> some text PR 1234 more text
>
> and I want to convert the number to a link to the issue in the bug reporting
> database. This is no problem, there is a regex to do just that:
>
> (\b(PR|SCR)[:s#]?\s?) # PR or SCR followed by :|s|#|whitespace
> (\s[a-z0-9-]+\/)? # a category name & a slash (optional)
> ([0-9]+) # the PR number
>
> and that works perfectly. Now, suppose that the block of text contains:
>
> some text PR 1234, 5678 ,2345 more text
>
> How would I need to adjust the regex to account for the unknown number of
> additional comma-number sequences so that they can each be htmlified?
A way of doing it in Perl would be:
=begin code
my $string = "some text PR 1234, 5678 ,2345 more text";
$string =~ s[ (?<=PR\s)                 # preceded by 'PR '
              \s*\d+\s*(,\s*\d+\s*)*    # a number, then any further comma-separated numbers
            ][
              join( ', ',
                  map {
                      s/^\s+|\s+$//g;                  # trim whitespace
                      "<a href='http://$_'>$_</a>";    # url-ify
                  }
                  split ',' => $&
              )
              . ' '                     # add a whitespace after the list
            ]xeg;
print $string;
=end code
Basically, I matched the '1234, 5678 ,2345' part of the string,
split it on the commas, url-ified each number independently and then
glued the pieces back together.
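The same match / split / url-ify / join steps can also be wrapped in a small sub that keeps the PR|SCR prefix from the original question. This is only a rough sketch: the names linkify_prs, pr_link and $BASE_URL are made up for illustration, the bug-tracker URL is a placeholder (the thread never says what the real one is), and the optional "category/" part of the original regex is left out to keep it short.

```perl
use strict;
use warnings;

# Hypothetical base URL: the real bug tracker address is not given in the thread.
my $BASE_URL = 'http://bugs.example.com/pr/';

# Build the anchor tag for a single PR number.
sub pr_link {
    my ($num) = @_;
    return "<a href='$BASE_URL$num'>$num</a>";
}

# Turn every number in a run such as "PR 1234, 5678 ,2345" into its own link.
sub linkify_prs {
    my ($text) = @_;
    $text =~ s{
        \b((?:PR|SCR)[:s\#]?\s?)    # prefix, as in the original question
        (\d+(?:\s*,\s*\d+)*)        # first number plus any further comma-number pairs
    }{
        my ($prefix, $list) = ($1, $2);
        # split on commas (with surrounding whitespace), link each, re-join
        $prefix . join( ', ', map { pr_link($_) } split( /\s*,\s*/, $list ) );
    }xeg;
    return $text;
}

print linkify_prs("some text PR 1234, 5678 ,2345 more text"), "\n";
```

Compared with the snippet above, this uses capture groups instead of the lookbehind and $&, and because the pattern stops at the last digit there is no trailing whitespace to put back after the list.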
Joy,
`/anick