PerlMonks  

Re: how to strip XML into Plain Text file

by sleepingsquirrel (Hermit)
on Jan 26, 2005 at 00:21 UTC (#425086)


in reply to how to strip XML into Plain Text file

perl -p -e 's/<[^>]*>//g' <foo.xml
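For anyone who wants to see what that substitution does before pointing it at a real file, here is a rough script version of the same idea. The sample string and the `strip_tags` name are made up for illustration; the original one-liner reads foo.xml from STDIN instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input, not taken from the original question.
my $xml = qq{<note><to>Tove</to> <body>Hello, world</body></note>\n};

# Same substitution as the one-liner: delete every run that starts at '<'
# and ends at the next '>'. Everything outside such runs is left alone.
sub strip_tags {
    my ($text) = @_;
    $text =~ s/<[^>]*>//g;
    return $text;
}

print strip_tags($xml);   # prints "Tove Hello, world"
```

Note that this only removes markup; entity references such as `&amp;` are left in the output as written.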


-- All code is 100% tested and functional unless otherwise noted.


Re^2: how to strip XML into Plain Text file
by Fletch (Chancellor) on Jan 26, 2005 at 01:00 UTC

    ... <img alt="Next >>" src="../next_button.jpg" />*Boom*

    And this is why you use a real parser, not just a regex . . .

    Update: Just to clarify, the above is a pathological case. If you're reasonably sure it won't occur, go ahead and use the simple s///, but be aware that it's not bulletproof, and know where to find the right tool when the sledgehammer doesn't cut it any more.
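To make the *Boom* concrete, here is a small sketch that runs the simple s/// and a parser over the same input. It assumes XML::LibXML from CPAN is installed (it is not core Perl); the wrapping `<testing>` element and the "Page 1" text are added for illustration only, and any other real XML parser would serve equally well.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;   # CPAN module, assumed to be installed

# The pathological element from above, wrapped in a root element with some text.
my $xml = '<testing>Page 1<img alt="Next >>" src="../next_button.jpg" /></testing>';

# Regex approach: the match ends at the first '>' inside the alt attribute,
# so the rest of the tag leaks into the output.
(my $regex_text = $xml) =~ s/<[^>]*>//g;
print "regex:  $regex_text\n";    # regex:  Page 1>" src="../next_button.jpg" />

# Parser approach: attribute values are never part of the text content.
my $doc         = XML::LibXML->load_xml(string => $xml);
my $parsed_text = $doc->documentElement->textContent;
print "parser: $parsed_text\n";   # parser: Page 1
```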

      Since we're being pedantic about it, is '>' actually allowed inside attribute values in XML?

        xmllint doesn't gripe about it:

        freebie:~ 677> cat foo.xml                               + 9:34:27
        <?xml version="1.0" encoding="utf8" ?>
        <testing>
        <img alt="Next >>" src="../next_button.jpg" />
        </testing>
        freebie:~ 678> xmllint --noout foo.xml                   + 9:34:29
        freebie:~ 679>                                           + 9:34:35

        Yes. Only < is not.

        Makeshifts last the longest.
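The same check can be scripted rather than run through xmllint. This is only a sketch, assuming XML::LibXML from CPAN is installed; `is_well_formed` is a hypothetical helper name, and it relies on `load_xml` dying when the input is not well-formed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;   # CPAN module, assumed to be installed

# Hypothetical helper: true if the string parses as well-formed XML.
# load_xml throws on a parse error, so the eval block returns undef then.
sub is_well_formed {
    my ($xml) = @_;
    return eval { XML::LibXML->load_xml(string => $xml); 1 } ? 1 : 0;
}

# A literal '>' inside an attribute value is accepted...
print is_well_formed('<img alt="Next >>" />') ? "ok\n" : "rejected\n";
# ...while a literal '<' inside an attribute value is not.
print is_well_formed('<img alt="<< Prev" />') ? "ok\n" : "rejected\n";
```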
