<?xml version="1.0" encoding="windows-1252"?>
<node id="291543" title="Perl Idioms Explained - @ary = $str =~ m/(stuff)/g" created="2003-09-15 08:53:53" updated="2005-08-15 11:49:25">
<type id="120">
perlmeditation</type>
<author id="80749">
tachyon</author>
<data>
<field name="doctext">
&lt;p&gt;Suppose you want a regex to find every occurrence of something in a string and store the matches in an array. Perl has a very convenient idiom for this:
&lt;code&gt;
@ary = $str =~ m/(stuff)/g;
&lt;/code&gt;
&lt;p&gt;If you are after a single match, put the scalar in parentheses so that the assignment happens in list context, i.e.
&lt;code&gt;
($scalar) = $str =~ m/(this)/;
&lt;/code&gt;
&lt;p&gt;Note that if you forget the ( ) around $scalar, the match is evaluated in scalar context, so on a successful match $scalar will contain the integer value 1 rather than the captured text. Don't forget the ( ): they give you the list context you need.
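&lt;p&gt;A minimal sketch of the difference, assuming a made-up sample string (the variable names $flag and $word are just for illustration):
&lt;code&gt;
use strict;
use warnings;

my $str = 'find this here';

# Scalar assignment: the match runs in scalar context and returns true or false.
my $flag = $str =~ m/(this)/;

# Parenthesised assignment: list context, so the captured text is returned.
my ($word) = $str =~ m/(this)/;

print "flag=$flag word=$word\n";   # prints "flag=1 word=this"
&lt;/code&gt;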
&lt;p&gt;Anyway, although this might not make a lot of sense at first glance it is really very simple. If we had just this:
&lt;code&gt;
$str =~ m/(stuff)/;
print $1;
&lt;/code&gt;
&lt;p&gt;then we would expect our code to print 'stuff' if the string contained the literal 'stuff', as this gets captured into $1. Adding /g means that, in scalar context, each successive evaluation of &lt;code&gt;$str =~ m/(stuff)/g&lt;/code&gt; finds the next match, so $1 will sequentially contain 'stuff' EVERY time the match is true, i.e. 0..n times. Now, since a regex match is a valid R-VALUE in an expression and we can write &lt;code&gt;L-VALUE = R-VALUE&lt;/code&gt;, we can understand that:
&lt;code&gt;
@all_the_matches   = $str =~ m/(stuff)/g;
&lt;/code&gt;
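&lt;p&gt;To make the two behaviours concrete, here is a small sketch that collects the captures first with a scalar-context loop and then with the list-context idiom. The sample string, the \d+ pattern and the array names are assumptions chosen so that each capture is visibly different:
&lt;code&gt;
use strict;
use warnings;

my $str = 'a1 b22 c333';

# Scalar context: each pass through the loop finds the next match,
# and $1 holds the capture for that pass.
my @one_by_one;
while ($str =~ m/(\d+)/g) {
    push @one_by_one, $1;
}

# List context: the same captures come back in one go.
my @all_at_once = $str =~ m/(\d+)/g;

print "@one_by_one | @all_at_once\n";   # prints "1 22 333 | 1 22 333"
&lt;/code&gt;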
&lt;p&gt;In list context we get all the matches into our array. So for example you can do:
&lt;code&gt;
@links = $html =~ m/&lt;a[^&gt;]+href\s*=\s*["']?([^"'&gt; ]+)/ig;
&lt;/code&gt;
&lt;p&gt;This is a reasonably reliable and quick way to extract all the &amp;lt;a...href=...&amp;gt; links from HTML. Although you can certainly use [cpan://HTML::LinkExtor] or any of the other [cpan://HTML::Parser] based widgets, there are times when you want to extract, say, only the links that look like:
&lt;code&gt;
&lt;A CLASS="blah" HREF="foo.com"&gt;
&lt;/code&gt;
&lt;p&gt;A carefully chosen regex can extract exactly what you want, without any excess, as you can easily make it match a specific subset of links. Using this idiom you can grab the matches into an array in one elegant line of Perl.
&lt;p&gt;The uses are of course limited only by your imagination. Using ^ and /m you can do things like extract a specific field from each line of a space-separated data set:
&lt;code&gt;
$data = '
f1 f2 f3
f4 f5 f6
f7 f8 f9
';

@second = $data =~ m/^\S+\s+(\S+)/mg;
print "@second";
&lt;/code&gt;
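&lt;p&gt;The same idiom also works with more than one capture group: in list context with /g, Perl returns all the captures from every match as one flat list. As a hedged sketch, assuming a made-up key=value string (the names $config and %setting are hypothetical), that flat list can be assigned straight to a hash:
&lt;code&gt;
use strict;
use warnings;

my $config = 'host=localhost port=8080 user=guest';

# Two capture groups with /g in list context return a flat list:
# ('host', 'localhost', 'port', '8080', 'user', 'guest')
my %setting = $config =~ m/(\w+)=(\w+)/g;

print "$setting{host}:$setting{port}\n";   # prints "localhost:8080"
&lt;/code&gt;
&lt;p&gt;This only behaves well when every key is matched together with its value, so treat it as a quick extraction trick rather than a real parser.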
&lt;p&gt;As always YMMV and you should pick the best hammer to drive the nail at hand.
&lt;h3&gt;Update&lt;/h3&gt;
&lt;p&gt;Technical inaccuracy removed. For the details on what happens if you put a scalar on the LHS of this idiom see [id://292089|this], where [bart] gets to poke fun at me for making an untested assumption, and my pitiful excuses [id://292107|here], along with a roundabout hack.</field>
</data>
</node>
