PerlMonks  

extract xml data and insert new xml tag into same file

by lakssreedhar (Acolyte)
on Aug 11, 2013 at 15:21 UTC
lakssreedhar has asked for the wisdom of the Perl Monks concerning the following question:

I have an XML file like this: <collection><document><title><text> Hello how are you</text></title></document><document><abstract><text>I am fine</text></abstract></document></collection>. There may be many documents within a collection. I need to extract the data between every pair of text tags and then print a passage after each text element, enclosed in <passage> tags.

Re: extract xml data and insert new xml tag into same file
by afoken (Parson) on Aug 11, 2013 at 15:59 UTC

    So - what exactly is your problem? Show your code! If you have no code, are you aware that perlmonks is not a code writing service?

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: extract xml data and insert new xml tag into same file
by NetWallah (Abbot) on Aug 11, 2013 at 19:47 UTC
    It's a little intimidating to get started with an XML parser, so here is a working solution for the text you presented:

    Please use it as a model to learn how to use the XML parser for your real life problem.

    $ perl -MXML::LibXML -e '
        my $dom = XML::LibXML->load_xml(string => shift);
        print $_->to_literal(), "\n" for $dom->findnodes("/collection/document/*/text");
      ' "<collection><document><title><text> Hello how are you</text></title></document><document><abstract><text>I am fine</text></abstract></document></collection>"
    Output:
     Hello how are you
    I am fine
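
The one-liner above only extracts the text. A minimal sketch of the second half of the question, inserting a <passage> element after each <text> element, might look like the following. The helper passage_for and its placeholder wording are hypothetical (the original post does not say what the passage should contain), so treat this as a starting point rather than a finished solution.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper (an assumption, not from the thread): decides what
# goes inside <passage>. Here it just echoes the trimmed text.
sub passage_for {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;
    return "passage for: $text";
}

# Parse an XML string, add a <passage> sibling right after every <text>
# element, and return the modified document as a string.
sub add_passages {
    my ($xml) = @_;
    my $dom = XML::LibXML->load_xml(string => $xml);

    # findnodes returns a list in list context, so it is safe to modify
    # the tree while looping over the matches.
    for my $text ($dom->findnodes('/collection/document/*/text')) {
        my $passage = $dom->createElement('passage');
        $passage->appendText(passage_for($text->textContent));
        $text->parentNode->insertAfter($passage, $text);
    }
    return $dom->toString;
}

my $xml = '<collection><document><title><text> Hello how are you</text>'
        . '</title></document><document><abstract><text>I am fine</text>'
        . '</abstract></document></collection>';

print add_passages($xml);
```

To change a file in place rather than a string, one option is to load it with load_xml(location => $path) and write the result back with $dom->toFile($path), ideally after keeping a copy of the original.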

                 My goal ... to kill off the slow brain cells that are holding me back from synergizing my knowledge of vertically integrated mobile platforms in local cloud-based content management system datafication.

Re: extract xml data and insert new xml tag into same file
by sundialsvc4 (Abbot) on Aug 13, 2013 at 12:05 UTC

    You could come pretty darned close to this ... no, you could get this ... by using “XSLT stylesheets” and an open-source tool such as Saxon.   In this case, you would write no program at all ... not in Perl, and not in anything else.   (Although Perl also knows how to do XSLT transformations.)

    Now, this use of the word “stylesheet” is another one of the abuses of human language that are so common in the data processing world:   these stylesheets have absolutely nothing to do with, say, CSS.

    The transform would read one (version of the) file as input, and generate a new (version of the) file to replace it.   Thus, the process is non-destructive and therefore repeatable.   You would, presumably, archive the old version in some useful way and then replace it with the latest one.

    A critical idea behind XSLT is XPath expressions, which are vaguely like “queries for XML documents.”   So, basically, you will write an expression that says what you want to get (not how to get it ... no code-writing here), and then, what you want to produce from each subtree that XSLT finds.   In your original post, you already say that in human terms, and the XSLT will simply say the same thing in computer terms.
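
As a rough illustration of that idea, here is a sketch of a stylesheet for the original poster's document, driven from Perl through XML::LibXSLT instead of Saxon (a substitution made here only to keep the example in one language). The XPath pattern "text" says what to match, and the template body says what to produce for each match. The wording inside <passage> is a placeholder assumption, and transform_xml is a hypothetical helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXSLT;

# Stylesheet: the first template copies everything unchanged (the usual
# identity transform); the second overrides it for <text> elements and
# emits a <passage> element right after the copy.
my $XSL = <<'XSL';
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
  <xsl:template match="text">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
    <passage>passage for: <xsl:value-of select="normalize-space(.)"/></passage>
  </xsl:template>
</xsl:stylesheet>
XSL

# Hypothetical helper: apply the stylesheet to an XML string and return
# the transformed document as a string.
sub transform_xml {
    my ($xml) = @_;
    my $style = XML::LibXSLT->new->parse_stylesheet(
        XML::LibXML->load_xml(string => $XSL));
    my $result = $style->transform(XML::LibXML->load_xml(string => $xml));
    return $style->output_as_chars($result);
}

print transform_xml(
    '<collection><document><title><text> Hello how are you</text></title>'
  . '</document><document><abstract><text>I am fine</text></abstract>'
  . '</document></collection>'
);
```

The same stylesheet text could be saved to a file and handed to a standalone XSLT processor; the Perl wrapper is only one way to run it.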

    Web browsers, for example, already know how to do XSLT, and you can readily find very impressive examples of what can be done:   an interactive Periodic Table of the Elements, for example, written entirely with XSLT and a smidgen of JavaScript.   Many word-processors and statistical analysis programs also know how to use both XSLT and XPath.   The DocBook electronic publishing format (the source of all those O’Reilly books) is also built entirely with XSLT.

      In this case, you would write no program at all ... not in Perl, and not in anything else.

      XSLT is programming, deal with it

        Anonymity does not become you.   Nevertheless, there is a huge difference between writing a custom Perl program to extract and to re-form data from an XML data source, and leveraging an existing technology to do the same thing.   There is an entire world of endeavor ... particularly in the world of electronic publishing and education ... which relies completely on XSLT, and a single standardized tool-chain, with nary a line of “source code” to be found.   Perl is sometimes used to drive that tool chain ... at Apple, for example, until quite recently ... but the extraction of data and the subsequent compiling of all manner of deliverables occurs entirely without custom code.   The OP’s requirement appears to me to be a classic use-case.   And, for the curious, I myself have spent years working in exactly this space.

      XSLT, also known as XML in XML that XMLs XML to XML XML in XML for XML xml XML XML? No thanks!

      By your definition, writing SQL is not programming either. You write an expression that says what you want to get, not how to get it. Heck, Prolog is not programming either then: you write what you want to hold true for the result, not how to compute the result.

      Writing XSLT IS programming. The syntax is insane, the readability for all nontrivial "stylesheets" incredibly low, but it is programming.

      Jenda
      Enoch was right!
      Enjoy the last years of Rome.

Node Type: perlquestion [id://1049009]
Front-paged by Corion