
Re: A regex question

by roboticus (Chancellor)
on Oct 28, 2011 at 20:22 UTC (#934502)

in reply to A regex question


Here's a quick bit of code to get you started:

use strict;
use warnings;

$/ = undef;
while (my $line = <DATA>) {
    for ($line =~ m/<a[^>]*>(.*?)<\/a>/gs) {
        print "Name '$_'\n";
    }
}
__DATA__
<a href="foo">Jon.Martinez</a><li>gabba, gabba, hey!</li><a href=bar>Mary Jones</a><p>Gazebo!</p><a href="baz">Rob Oticus</a><a>Joe Blow</a>

Note that we slurp the whole file in at once ($/=undef); otherwise we can't find names spread over two lines (like Mary Jones). We also need the 's' switch on the regular expression to let '.' match newlines (again, to pick up Mary Jones!).

Running it gives you:

$ perl 1
Name 'Jon.Martinez'
Name 'Mary Jones'
Name 'Rob Oticus'
Name 'Joe Blow'
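
If you want to see for yourself why the slurp and the 's' switch matter, here's a small sketch. The input string is made up for the demo (a name deliberately split by a newline), and the variable names are just mine:

```perl
use strict;
use warnings;

# Hypothetical input: the second name is split across two lines.
my $html = qq{<a href="foo">Jon.Martinez</a>\n<a href=bar>Mary\nJones</a>\n};

# Whole string, no /s: '.' will not cross the newline inside "Mary\nJones".
my @without_s = $html =~ m/<a[^>]*>(.*?)<\/a>/g;

# Whole string, with /s: '.' matches newlines too, so both names are found.
my @with_s = $html =~ m/<a[^>]*>(.*?)<\/a>/gs;

# Line by line (no slurp): each line is matched on its own, so the
# split name is never seen in one piece, even with /s.
my @by_line = map { m/<a[^>]*>(.*?)<\/a>/gs } split /^/, $html;

print scalar(@without_s), " match(es) on the whole string without /s\n";
print scalar(@with_s),    " match(es) on the whole string with /s\n";
print scalar(@by_line),   " match(es) when matching line by line\n";
```

Only the slurped string combined with the 's' switch picks up both names; the captured second name still contains the newline, so you may want to tidy the whitespace before printing it.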

Now, having said all that: remember to review perlre and perlop. Also, you may want to use a real HTML parser instead of hacking away with regular expressions. Otherwise, markup you didn't anticipate (a '>' inside an attribute value, tags nested inside the link, commented-out links) can trip up the pattern.
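
For the parser route, here's a minimal sketch using HTML::TokeParser. I'm assuming you have the HTML-Parser distribution from CPAN installed (it isn't part of core Perl), and this is just one of several parser modules you could pick:

```perl
use strict;
use warnings;
use HTML::TokeParser;    # CPAN module (HTML-Parser distribution), not core Perl

# Same sample markup as the __DATA__ section of the script above.
my $html = '<a href="foo">Jon.Martinez</a><li>gabba, gabba, hey!</li>'
         . '<a href=bar>Mary Jones</a><p>Gazebo!</p>'
         . '<a href="baz">Rob Oticus</a><a>Joe Blow</a>';

# Passing a reference to a string parses the string instead of a file.
my $parser = HTML::TokeParser->new(\$html)
    or die "Could not create parser";

my @names;
# get_tag('a') skips ahead to the next <a ...> start tag;
# get_trimmed_text('/a') returns the text up to the closing </a>,
# with surrounding whitespace trimmed and inner whitespace collapsed.
while ($parser->get_tag('a')) {
    push @names, $parser->get_trimmed_text('/a');
}

print "Name '$_'\n" for @names;
```

On this sample it should give the same four names as the regex version, but the parser takes care of quoting and whitespace in the tags for you.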


When your only tool is a hammer, all problems look like your thumb.

Update: changed 'e' to 's' (thanks for catching that, hbm!)

Replies are listed 'Best First'.
Re^2: A regex question
by emelianenko (Initiate) on Oct 28, 2011 at 20:58 UTC
    Thank you wholeheartedly. I am going to study what you wrote. I definitely want to add Perl to my baggage, but I am finishing C now. Right after that I will, because I am fanatical about managing information. Thank you again, best regards.
