Re: mutiple-line regexes?

by jens (Pilgrim)
on Aug 19, 2002 at 06:55 UTC


in reply to mutiple-line regexes?

Here's some simplified code to give you an idea of what I wanted to accomplish:

Example HTML code (produced by saving as HTML from OpenOffice Calc) simplified for sake of argument:

<tag1>
<tag2>
<tag3>
<tag4><MYTAG>is there stuff here?</MYTAG></tag4>
</tag3>
</tag2></tag1>

If the content between the MYTAG tags was *blank*, then I wanted to delete all six lines.
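For illustration, one way this could be written as a single multi-line substitution is sketched below. It assumes the whole file has been read into one string, that every block is exactly six lines with MYTAG on the fourth, and that "blank" means empty or only spaces and tabs. The sample data and the variable names `$html` and `$cleaned` are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical sample input: two six-line blocks, the second with an empty MYTAG.
my $html = join "\n",
    '<tag1>', '<tag2>', '<tag3>',
    '<tag4><MYTAG>is there stuff here?</MYTAG></tag4>',
    '</tag3>', '</tag2></tag1>',
    '<tag1>', '<tag2>', '<tag3>',
    '<tag4><MYTAG></MYTAG></tag4>',
    '</tag3>', '</tag2></tag1>',
    '';

# /m lets ^ match at the start of every line, /x allows the layout below,
# and /g repeats the substitution for every matching block.
(my $cleaned = $html) =~ s{
    (?: ^ .* \n ){3}                      # three lines before the MYTAG line
    ^ .* <MYTAG> [ \t]* </MYTAG> .* \n    # a line holding a blank MYTAG
    (?: ^ .* \n ){2}                      # two lines after it
}{}xmg;

print $cleaned;
```

Because `.` does not match a newline without /s, each `.*` stays on its own line, which is what keeps the count at six lines per deletion.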



--jens

Replies are listed 'Best First'.
Re: Re: mutiple-line regexes?
by Chady (Priest) on Aug 19, 2002 at 08:38 UTC

    Does your data always look like tagged pseudo-HTML? If so, you might get better results with HTML::TokeParser.

    Here's some untested code:

    use HTML::TokeParser;

    # $html is a file name or file handle; pass \$html to parse a string instead.
    my $p = HTML::TokeParser->new($html) or die "Can't tokenize: $!";

    # get each <tag1> alone.
    while (my $token = $p->get_tag('tag1')) {
        # store the original text of the <tag1> start tag
        my $origtext = $token->[3];

        # get data between <MYTAG></MYTAG>
        # (HTML::TokeParser reports tag names in lower case)
        my $myTag = $p->get_tag('mytag');
        my $text  = $p->get_text('/mytag');

        if ($text ne '') {
            # tag is not empty.. so $origtext retains
            # the data we want..
            # ... do whatever with $origtext and move on
        }
    }

    He who asks will be a fool for five minutes, but he who doesn't ask will remain a fool for life.

    Chady | http://chady.net/
Re: Re: mutiple-line regexes?
by Arien (Pilgrim) on Aug 19, 2002 at 07:33 UTC

    Instead of using a multiline regex, you could loop over the lines in turn and remove each group of six lines whose fourth line matches.

    I think the code is pretty self-explanatory (@lines contains your file):

    my $i = 3;                              # from line 4 ...
    while ($i < @lines - 2) {               # ... until last but 3
        if ($lines[$i] =~ m!<MYTAG></MYTAG>!) {
            splice(@lines, $i - 3, 6);      # kill 6 lines from $i - 3
        } else {
            $i++;
        }
    }

    — Arien
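To try the loop on sample data, it could be wrapped in a small sub, roughly as sketched here. The sub name `remove_empty_blocks` and the sample lines are assumptions made for the example, not part of the reply above. The sketch also assumes blocks are exactly six lines and do not overlap.

```perl
use strict;
use warnings;

# Hypothetical helper around the same loop: takes a list of lines and
# returns the list with every six-line block with an empty MYTAG removed.
sub remove_empty_blocks {
    my @lines = @_;
    my $i = 3;                              # index of the fourth line
    while ($i < @lines - 2) {               # two lines must follow the match
        if ($lines[$i] =~ m!<MYTAG></MYTAG>!) {
            splice(@lines, $i - 3, 6);      # drop the whole block
        } else {
            $i++;
        }
    }
    return @lines;
}

# Made-up input: the first block has content, the second is empty.
my @input = (
    "<tag1>\n", "<tag2>\n", "<tag3>\n",
    "<tag4><MYTAG>is there stuff here?</MYTAG></tag4>\n",
    "</tag3>\n", "</tag2></tag1>\n",
    "<tag1>\n", "<tag2>\n", "<tag3>\n",
    "<tag4><MYTAG></MYTAG></tag4>\n",
    "</tag3>\n", "</tag2></tag1>\n",
);

my @kept = remove_empty_blocks(@input);
print @kept;
```

After a splice, `$i` is left unchanged on purpose: the line now at index `$i` is the fourth line of the next block, so no candidate is skipped.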
