PerlMonks  

Re: parsing hmtl file with regex

by Marshall (Canon)
on Oct 01, 2011 at 07:10 UTC ( [id://929000]=note )


in reply to parsing hmtl file with regex

I try to avoid the use of $1, $2 etc.

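I find it better and easier, as far as the coding goes, to put the left-hand side of the regex match into list context and, for example, assign $catalog_num directly instead of fiddling with $1! In list context a successful match returns the captured groups, and a failed match returns an empty list.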

    #!/usr/bin/perl -w
    use strict;

    my @lines = (
        'Catalog Number: PAL1001',
        ' Catalog Number:PAL1001',
        'Catalog Number: Catalog Number: PAL1001',
        'Catalog Number: PAL1001Catalog Number: PAL1001',
        'Cat Number: PAL1001Catalog',
        'Catalog Number: 123PAL100',
    );

    foreach my $line (@lines) {
        my ($catalog_num) = $line =~ /^\s*Catalog Number:\s*([A-Za-z]+\d+)/;
        if ($catalog_num) {
            print "$catalog_num\n";
        }
        else {
            print "Bad Line!...$line\n";
        }
    }
    __END__
    PAL1001
    PAL1001
    Bad Line!...Catalog Number: Catalog Number: PAL1001
    PAL1001
    Bad Line!...Cat Number: PAL1001Catalog
    Bad Line!...Catalog Number: 123PAL100

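The same list-context idea extends to more than one capture group. Here is a minimal sketch, assuming you wanted the letter prefix and the digits separately; the helper name split_catalog_num and the two-group pattern are my own illustration, not part of the code above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): returns the letter
# prefix and the digits of a catalog number, or an empty list on no match.
sub split_catalog_num {
    my ($line) = @_;

    # In list context the match returns both captures, or () on failure.
    my ($prefix, $digits) =
        $line =~ /^\s*Catalog Number:\s*([A-Za-z]+)(\d+)/;

    return defined $prefix ? ($prefix, $digits) : ();
}

for my $line ('Catalog Number: PAL1001', 'Cat Number: PAL1001') {
    # A list assignment in scalar context yields the number of values
    # returned, so this is true only when the pattern matched.
    if (my ($prefix, $digits) = split_catalog_num($line)) {
        print "prefix=$prefix digits=$digits\n";
    }
    else {
        print "Bad Line!...$line\n";
    }
}
```

Each capture gets a meaningful name right at the match, and nothing depends on $1 or $2 still holding what you expect a few lines later.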
Update:
When writing if statements like the one above, you have to consider truthiness. $catalog_num is false if it is undefined, the empty string, or zero (numeric 0 or the string "0"). An undefined value means the regex did not match, and that's a "Bad Line!"

In the above, the pattern requires at least one letter before the digits, so a valid catalog_num can never be zero or empty, and I can just test "if ($catalog_num)" instead of "if (defined $catalog_num)".
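For contrast, here is a rough sketch of a case where the shortcut would not be safe. It assumes a hypothetical "Quantity:" field (not in the original data) whose captured value can legitimately be 0, so the test has to be "defined" rather than plain truth.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): extract a purely
# numeric field. Returns undef when the line does not match.
sub quantity {
    my ($line) = @_;
    my ($qty) = $line =~ /^\s*Quantity:\s*(\d+)/;
    return $qty;
}

for my $line ('Quantity: 12', 'Quantity: 0', 'Qty: 5') {
    my $qty = quantity($line);

    # "if ($qty)" would wrongly reject the valid value 0 here,
    # so test for definedness instead.
    if (defined $qty) {
        print "qty=$qty\n";
    }
    else {
        print "Bad Line!...$line\n";
    }
}
```

With "if ($qty)" the second line would be reported as a bad line even though it matched.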
