PerlMonks  

Monks,

I have been at this for much longer than should be necessary. I am using XML::Simple, and I have also tried XML::Twig without success. I have a file that looks like the example below. Each tag is used only once. I know it is possible to use a few regexen, but given how I need to use the data later, I want it in a hash similar to the one returned by XML::Simple.

<CVS>
$Id: File_Find.pl,v 1.1 2006-12-17 19:25:03 eric Exp $
This That <this@that.com>
Desc: Test file
</CVS>
<DATE>2006-12-10</DATE>
<INTRODUCTION>Blah, <b>blah</b>, blah</INTRODUCTION>
<TITLE>Foo</TITLE>
<AUTHOR>Bar</AUTHOR>
...
<ARTICLE>
<p>foo, test</p>
<p>bar</p>
<p>baz</p>
</ARTICLE>

When I have the above text, all I get is what's inside of <CVS>. When I add <XML> tags surrounding the whole file, all I get is the following using Data::Dumper:

$VAR1 = \{ 'CVS' => '<The entire file and this is the only tag in the hash with no other tags even making it in here>' };

What is the best way to extract the data from this file (with or without an XML module) and pull what I need into a hash? Thanks.

Update: I forgot to include the code I actually tried. I know that $file is correct and that @articles is populated.

use XML::Simple;
use Data::Dumper;

foreach my $file (@articles) {
    # Parse each article file, ignoring attributes
    my $article = XMLin($file, NoAttr => 1);
    print "<pre>", Dumper(\$article), "</pre>";
}

Update 2: I think I needed to provide a better example of the data. The data example above now gives a much better picture of what I am dealing with.
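
For reference, the "few regexen" route mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: every top-level tag is upper-case, appears once, and is never nested inside itself. The names `parse_article` and `$sample` are hypothetical and not part of the code I actually tried. It does no real XML parsing, so it will not handle attributes, entities, or CDATA.

```perl
use strict;
use warnings;

# Hypothetical helper: collect each top-level <TAG>...</TAG> block into a hash.
# Assumes upper-case tag names that occur once and are never self-nested.
sub parse_article {
    my ($text) = @_;
    my %article;
    # Non-greedy match up to the closing tag with the same name; /s lets . span newlines
    while ($text =~ m{<([A-Z]+)>(.*?)</\1>}gs) {
        my ($tag, $body) = ($1, $2);
        $body =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        $article{$tag} = $body;
    }
    return \%article;
}

# Sample data modelled on the example file (the elided "..." is left out)
my $sample = <<'END';
<CVS>
$Id: File_Find.pl,v 1.1 2006-12-17 19:25:03 eric Exp $
This That <this@that.com>
Desc: Test file
</CVS>
<DATE>2006-12-10</DATE>
<INTRODUCTION>Blah, <b>blah</b>, blah</INTRODUCTION>
<TITLE>Foo</TITLE>
<AUTHOR>Bar</AUTHOR>
<ARTICLE>
<p>foo, test</p>
<p>bar</p>
<p>baz</p>
</ARTICLE>
END

my $article = parse_article($sample);
print "$_ => $article->{$_}\n" for sort keys %$article;
```

Lower-case markup such as <b> and <p>, and the <this@that.com> address, are left untouched inside the values because the pattern only matches upper-case tag names.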


In reply to Parsing XML with XML::Simple by madbombX
