
 The other day I filled out about 10 forms using InfoPath. The problem is that InfoPath saves every form as XML, and one cannot easily extract the contents of each XML tag as plain text.

 I will explain how to extract information from XML tags and save it in a DOC file. You can then apply the same concept to converting InfoPath forms to DOC.

 So here goes...

 XPath is one of the many XML technologies you can use to traverse an XML tree. When you open a file in your file explorer, its path looks something like "C:\folder1\file1.txt". XPath uses a similar concept to walk through your XML file, which can be thought of as a tree: a parent node containing many child nodes.

Let's take a very simple example. Suppose your XML file looks like this:


<Books>
  <Book>
    <title>Perl Magic</title>
    <author>Karthik</author>
    <publisher>ORielly</publisher>
    <price currency="Rupees" value="330"/>
  </Book>
  <Book>
    <title>Perl for Dummies</title>
    <author>Mark</author>
    <publisher>ORielly</publisher>
    <price currency="Rupees" value="420"/>
  </Book>
</Books>

Now you want to extract the necessary book information, i.e., the title and author of each book.
That means extracting the content of the <title> and <author> tags from the XML file.

Now in XPath you get the "title" tag's content by using the path:

//Books/Book/title

Similarly for the "author" tag's content:

//Books/Book/author

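To see what these two paths select before writing the full converter, here is a minimal sketch. It assumes the XML::XPath module is installed, and it loads a trimmed copy of the sample data from an inline string (via the `xml =>` constructor option) instead of books.xml so that it runs on its own. The variable names are only for illustration.

```perl
use strict;
use warnings;
use XML::XPath;

# Trimmed inline copy of the sample data, so no books.xml is needed.
my $xml = <<'XML';
<Books>
  <Book><title>Perl Magic</title><author>Karthik</author></Book>
  <Book><title>Perl for Dummies</title><author>Mark</author></Book>
</Books>
XML

my $xp = XML::XPath->new(xml => $xml);

# In list context findnodes() returns the element nodes matched by the path;
# string_value gives the text inside each element.
my @titles  = map { $_->string_value } $xp->findnodes('//Books/Book/title');
my @authors = map { $_->string_value } $xp->findnodes('//Books/Book/author');

print "Titles:  ", join(', ', @titles),  "\n";
print "Authors: ", join(', ', @authors), "\n";
```

Each path walks from the root <Books> element down through <Book> to the tag you want, much like a directory path.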
Now that you know what to extract and how to extract it, it's time to use Perl.

The beauty of Perl is that you can turn the idea in your mind into reality so easily. There are
Perl modules to make your life easy.

I am going to use the XML::XPath module, which lives in the XML:: namespace on CPAN. You load the module like this:

use XML::XPath;

Now you need to put the name of the books.xml file into a variable and create a new XML::XPath object from it.

my $file = "books.xml";
my $xp   = XML::XPath->new(filename => $file);

Open an output file named after the XML file with .doc appended (here, books.xml.doc):

open(INFO3, '>', "$file.doc") or die "Cannot open $file.doc: $!";

Print the heading information to the DOC file:

print INFO3 "Perl Xpath\n\n";
print INFO3 "BOOK INFORMATION:\n\n";

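 Call the find method on the XPath object and give it the path to the <Book> elements:

$xp->find('//Books/Book')

 This returns a node set. Its get_nodelist method gives you the list of <Book> nodes, and from each of those you extract the <title> and <author> tags
and print them to the DOC file.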
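The same loop can also read attributes, such as the currency and value on the <price> tag of the sample file. The sketch below is one way to do that, not part of the original script: it assumes XML::XPath is installed, uses a trimmed inline copy of the data, and relies on findvalue accepting a context node as its second argument. The @lines array is just an illustrative name.

```perl
use strict;
use warnings;
use XML::XPath;

# Trimmed inline copy of the sample data, including the <price> attributes.
my $xml = <<'XML';
<Books>
  <Book><title>Perl Magic</title><price currency="Rupees" value="330"/></Book>
  <Book><title>Perl for Dummies</title><price currency="Rupees" value="420"/></Book>
</Books>
XML

my $xp = XML::XPath->new(xml => $xml);

my @lines;
foreach my $book ($xp->find('//Books/Book')->get_nodelist) {
    # Paths without a leading slash are evaluated relative to the current <Book>.
    my $title    = $book->find('title')->string_value;
    # "@name" selects an attribute; $book is passed as the context node.
    my $value    = $xp->findvalue('price/@value',    $book);
    my $currency = $xp->findvalue('price/@currency', $book);
    push @lines, "$title: $value $currency";
}

print "$_\n" for @lines;
```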


The resulting Perl file (bookextract.pl) is given below.


use strict;
use warnings;
use XML::XPath;

my $file = "books.xml";
my $xp   = XML::XPath->new(filename => $file);

# Write the report to books.xml.doc
open(INFO3, '>', "$file.doc") or die "Cannot open $file.doc: $!";

print INFO3 "Perl Xpath\n\n";
print INFO3 "BOOK INFORMATION:\n\n";

# Visit every <Book> under <Books> and print its title and author
foreach my $book ($xp->find('//Books/Book')->get_nodelist) {
    print INFO3 "TITLE:";
    print INFO3 $book->find('title')->string_value . "\n";
    print INFO3 "AUTHOR:";
    print INFO3 $book->find('author')->string_value . "\n";
    print INFO3 "\n\n";
}

print "Converted XML file into WORD file\n\n";
print $file . " WORD document generated";

close(INFO3);

After you run the Perl script you will have a DOC file named after the XML file (books.xml.doc)
containing the extracted information as plain text, which Word can open.

Now that you know how to extract tags and their content into a Word DOC, you can apply the same method to
converting InfoPath XML files into Word documents.
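One difference to expect with InfoPath files is that their elements normally carry a namespace prefix (typically my:), so the paths need that prefix too. The sketch below is only an illustration under assumptions: the form content, the field names (name, department) and the namespace URI are made up for the example, so check your own form's XML for the real ones. It relies on XML::XPath resolving a prefix in a path from the namespaces declared in the document itself.

```perl
use strict;
use warnings;
use XML::XPath;

# Hypothetical InfoPath-style form; field names and the URI are assumptions.
my $form = <<'XML';
<my:myFields xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2005-01-13T00:00:00">
  <my:name>Karthik</my:name>
  <my:department>Engineering</my:department>
</my:myFields>
XML

my $xp = XML::XPath->new(xml => $form);

# The path uses the same my: prefix as the document.
# "*" matches every child element of the root, whatever its name.
my %field;
foreach my $node ($xp->findnodes('/my:myFields/*')) {
    # getLocalName drops the "my:" prefix from the element name.
    $field{ $node->getLocalName } = $node->string_value;
}

print "$_: $field{$_}\n" for sort keys %field;
```

From here the hash can be printed to the DOC file in the same way as the book titles and authors.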

Hope this helps.

Happy Coding.

20050113 Janitored by Corion: Fixed formatting

20050114 Unconsidered by Corion: was considered as move to Meditations (edit:14 keep:7 del:0)

In reply to Using Perl XPath for converting Infopath XML files to Word Documents by karthik4perl
