PerlMonks  

Re: Help with Regex

by Mercio (Scribe)
on Jul 06, 2004 at 22:22 UTC (#372222)


in reply to Help with Regex

Ok, I've read a few of them and decided to try to pull all the HTML tag names out of a file and print them; however, I am running into a few problems. This is what I have.

    $content = "<head><body blah></body><foo></foo></head>";
    while ($content =~ /<([^(?:\s|>)]+).*>.*<\/\1>/ig) {
        print $1."\n";
    }
This works fine as long as the HTML tags do not enclose other HTML tags. In this case they do, and it only finds the outermost tag (head). Is there something I'm doing wrong? I've tried everything.
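
One way to see why only one name comes out is to record where each match ends. The sketch below reuses the pattern from the post unchanged; the helper name trace_matches is made up for illustration. It suggests that the first match swallows the whole string, so the /g loop has nothing left to scan on the next pass.

```perl
use strict;
use warnings;

# Hypothetical helper: runs the pattern from the post and records, for
# each match, the captured tag name and the offset where the match ended.
sub trace_matches {
    my ($content) = @_;
    my @trace;
    while ($content =~ /<([^(?:\s|>)]+).*>.*<\/\1>/ig) {
        push @trace, [ $1, pos($content) ];
    }
    return @trace;
}

my $content = "<head><body blah></body><foo></foo></head>";
for my $m (trace_matches($content)) {
    printf "%s (match ended at offset %d of %d)\n",
        $m->[0], $m->[1], length $content;
}
```

Note also that [^(?:\s|>)] is a character class, so the parentheses, "?", ":" and "|" inside it are literal characters to exclude, not grouping or alternation.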


Re^2: Help with Regex
by ercparker (Hermit) on Jul 07, 2004 at 02:02 UTC
    If you're trying to match that entire string, you could try this.
    It matches from the first tag to the last tag.
    I hope I understood what you were trying to do.
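        $content = "<head><body blah></body><foo></foo></head>";
        # \2 refers back to the tag name captured by the inner group;
        # only print when the match actually succeeded
        print $1."\n" if $content =~ m[^(<(.+?)>.*?</\2>)$];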
    If you just wanted to match and print out the individual tags, you could do this:
        $content = "<head><body blah></body><foo></foo></head>";
        while ($content =~ m[(<.+?>)]g) {
            print $1."\n";
        }
    There is a great tutorial on PerlMonks covering how a regex will match.
    Hope this helps.
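
The loop above prints whole tags, closing tags and attributes included. If only the names of opening tags are wanted, as in the original question, a minimal sketch could look like the following. The helper name tag_names is made up here, and the pattern assumes simple markup: no comments, and no ">" inside attribute values.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the name of every opening tag, in order.
# Matching one tag at a time (rather than an open/close pair) lets the
# /g loop resume right after each tag, so nested tags are visited too.
sub tag_names {
    my ($content) = @_;
    my @names;
    while ($content =~ m{<([A-Za-z][A-Za-z0-9]*)\b[^>]*>}g) {
        push @names, $1;
    }
    return @names;
}

# Prints head, body and foo, one per line.
print "$_\n" for tag_names("<head><body blah></body><foo></foo></head>");
```

Closing tags are skipped because the character after "<" must be a letter.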
Re^2: Help with Regex
by TomDLux (Vicar) on Jul 07, 2004 at 03:33 UTC
    For anything but the simplest HTML processing, you should be using HTML::Parser, not a regex.

    --
    TTTATCGGTCGTTATATAGATGTTTGCA

      I agree, even if it's your own HTML and you know what to expect. Rolling your own is never worth it, and it will bite back eventually.
      Since I started using HTML::TokeParser I've never looked back. I use it even on "the simplest HTML". Why go to all that effort when others (who know what they're doing) already have?
      The best advice I've seen in regex tutorials is "don't roll your own html parser".
      wfsp
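
For comparison with the regex attempts, here is a rough sketch of the same tag-name listing done with HTML::TokeParser. It assumes the HTML-Parser distribution from CPAN is installed; the helper name tag_names_via_parser is made up for illustration.

```perl
use strict;
use warnings;
use HTML::TokeParser;   # from the HTML-Parser distribution on CPAN

# Hypothetical helper: list the names of all start tags in a document.
sub tag_names_via_parser {
    my ($content) = @_;
    # A reference to a scalar is parsed as the literal document text.
    my $parser = HTML::TokeParser->new(\$content)
        or die "Could not create parser";
    my @names;
    # get_tag() with no arguments returns the next start or end tag as
    # an array ref; end tags are reported with a leading "/".
    while (my $tag = $parser->get_tag) {
        my $name = $tag->[0];
        next if $name =~ m{^/};
        push @names, $name;
    }
    return @names;
}

print "$_\n" for tag_names_via_parser(
    "<head><body blah></body><foo></foo></head>"
);
```

The parser reports tag names in lower case and copes with attributes and nesting on its own, which is the effort the replies above suggest not repeating by hand.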
