PerlMonks  

delete and substitute

by Anonymous Monk
on Dec 09, 2009 at 04:31 UTC ( [id://811858]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

#!/usr/bin/perl
while($_ = <DATA>){
    chomp;
    #print $_;
    s/<.+><\/.+>//g;
    s/<!\[CDATA\[<p><\/p><p>/<![CDATA[<p>/g;
    print "$_\n";
}
__DATA__
<Text><p>
<Text></Text>
<Memo></Memo>
<Text></p>
<![CDATA[<p></p><p>

How can I remove empty tag pairs such as <Text></Text> and <Memo></Memo> from a line, but leave </p><p> alone? And how do I replace <![CDATA[<p></p><p> with <![CDATA[<p>?

Replies are listed 'Best First'.
Re: delete and substitute
by GrandFather (Saint) on Dec 09, 2009 at 06:43 UTC

    Take a serious look at HTML::TreeBuilder or one of the other HTML-munging modules on CPAN. You really don't want to be doing that sort of work with hand-rolled code!


    True laziness is hard work
Re: delete and substitute
by Anonymous Monk on Dec 09, 2009 at 06:42 UTC
    Why aren't you using a module for parsing XML?
Re: delete and substitute
by doug (Pilgrim) on Dec 09, 2009 at 16:41 UTC

    Are you the same person who yesterday wanted to edit XML without parsing it? Hopeless task, that. Parse it, look for the stuff you want to remove, and then write it back. Sleep well knowing that your program isn't a house of cards.

    - doug
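The parse, remove, write-back cycle described above might look roughly like the following. This is only a sketch: it assumes the XML::LibXML module from CPAN is installed, and it uses a hypothetical well-formed document (the `<Doc>` wrapper and the "keep me" text are invented for illustration), since the sample data in the question is not well-formed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;    # CPAN module, not part of core Perl

# Hypothetical well-formed input, invented for this sketch.
my $xml = <<'XML';
<Doc><Text><p>keep me</p></Text><Text></Text><Memo></Memo></Doc>
XML

# Parse the string into a DOM tree.
my $doc = XML::LibXML->load_xml(string => $xml);

# Select every element that has no child nodes at all
# (no text, no child elements) and detach it from the tree.
for my $node ($doc->findnodes('//*[not(node())]')) {
    $node->unbindNode;
}

# Serialize the modified tree back to text.
print $doc->toString;
```

Note that the XPath expression is evaluated once, so a parent that becomes empty only because its children were removed is left in place; whether that is desirable depends on the data.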

Re: delete and substitute
by vitoco (Hermit) on Dec 09, 2009 at 12:50 UTC

    Your sample data doesn't seem to be valid XML, so XML modules will fail.

    Try the following line as a replacement for both of yours:

    s/<(\w+)>\s*<\/\1>//g;

    Beware that this won't remove two "adjacent" tags from different lines, because you are processing one line at a time.
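To make the line-at-a-time caveat concrete, here is a minimal sketch that applies the same substitution two ways: once per line, and once to the whole input joined into a single string. The helper name `strip_empty_pairs` and the sample lines (a variation on the question's data with `<Memo>` and `</Memo>` split across two lines) are assumptions made for illustration, and joining the input is just one possible workaround, not something proposed in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Remove every empty same-name tag pair, e.g. <Text></Text> or <p> </p>.
# (Hypothetical helper wrapping the substitution shown above.)
sub strip_empty_pairs {
    my ($text) = @_;
    $text =~ s/<(\w+)>\s*<\/\1>//g;
    return $text;
}

# Invented sample: <Memo> and </Memo> sit on different lines.
my @lines = (
    "<Text><p>\n",
    "<Text></Text>\n",
    "<Memo>\n",
    "</Memo>\n",
    "<![CDATA[<p></p><p>\n",
);

# Line at a time, as in the original loop: the split pair survives.
my $by_line = join '', map { strip_empty_pairs($_) } @lines;

# Whole input at once: \s* also matches the newline inside the split pair.
my $slurped = strip_empty_pairs(join '', @lines);

print "--- by line ---\n$by_line--- whole input ---\n$slurped";
```

In both modes `<![CDATA[<p></p><p>` becomes `<![CDATA[<p>`, because the pattern only starts matching at `<p>`. Only the whole-input pass removes the `<Memo>` pair that spans two lines.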

      This person keeps switching between HTML and XML, always trying to edit it with regexes, and each time the question changes only a little.
