PerlMonks  

XML Transaction - random access.

by tomazos (Deacon)
on Jun 29, 2001 at 22:28 UTC ([id://92749])

tomazos has asked for the wisdom of the Perl Monks concerning the following question:

My company receives an XML transaction by email from our payment processing service every time a user changes their email address. A transaction looks like this:

    <?xml version='1.0' encoding='ISO-8859-1'?>
    <transactionUpdate xmlns='http://www.mypayproc.com/transaction'>
      <!-- This email is to let you know blah blah blah. -->
      <transaction>
        <theTID>GH95QFEFAC80</theTID>
        <registerTo>John Citizen</registerTo>
        <customerEmails>
          oldjohn@citizen.com
        </customerEmails>
      </transaction>
      <updatedInformation dateOfChange='2001-6-25'>
        <customerEmails>
          newjohn@citizen.com
        </customerEmails>
        <customerEmail>newjohn@citizen.com</customerEmail>
      </updatedInformation>
    </transactionUpdate>
    <!-- END XML FILE -->

Rather than write my own parser, I want to reuse XML::Parser or whatever is appropriate. I want it to parse the above into an object so that I can randomly access elements in it.

I have seen code that parses XML and just prints it out again, but nothing that functions as a real object factory. I am pretty sure XML::Parser can do this, but I have never seen a code example of it actually being done.

Does anyone know how to make XML::Parser act as an object factory? Example code appreciated.

Replies are listed 'Best First'.
Re: XML Transaction - random access.
by andreychek (Parson) on Jun 29, 2001 at 22:41 UTC
    If that's as big as it's going to get, you could always consider using XML::Simple to transform the XML data into a hash. The biggest problem would arise if your XML grew large: you'd end up using a lot of RAM.

    To use XML::Simple, you can just use code like this:

        use XML::Simple;

        # Gets XML data from a file
        $hashref = XMLin('/path/to/somefile.xml');

    or even:

        # Assumes you already have the XML data within a string
        $hashref = XMLin($XMLinaString);

    And that would provide you with a nested hash containing all your data, properly structured (as opposed to flattened). From that point it wouldn't be too difficult to write code to randomly access the keys/data within.
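
    As a rough sketch of what that random access could look like, the snippet below parses an abbreviated copy of the transaction from the question and reads individual fields out of the nested hash. The variable names are illustrative only, and the exact shape of the hash depends on the XML::Simple options in effect (defaults are assumed here).

```perl
use strict;
use warnings;
use XML::Simple;

# Abbreviated copy of the transaction from the question (assumption:
# the comment and the duplicate customerEmails element are left out).
my $xml = <<'XML';
<?xml version='1.0' encoding='ISO-8859-1'?>
<transactionUpdate xmlns='http://www.mypayproc.com/transaction'>
  <transaction>
    <theTID>GH95QFEFAC80</theTID>
    <registerTo>John Citizen</registerTo>
    <customerEmails> oldjohn@citizen.com </customerEmails>
  </transaction>
  <updatedInformation dateOfChange='2001-6-25'>
    <customerEmail>newjohn@citizen.com</customerEmail>
  </updatedInformation>
</transactionUpdate>
XML

# With default options the root element is dropped and each child
# element becomes a hash key; attributes sit next to child elements.
my $data = XMLin($xml);

my $tid       = $data->{transaction}{theTID};
my $old_email = $data->{transaction}{customerEmails};
my $new_email = $data->{updatedInformation}{customerEmail};
my $changed   = $data->{updatedInformation}{dateOfChange};

# Text content keeps its surrounding whitespace, so trim it by hand.
s/^\s+|\s+$//g for $old_email, $new_email;

print "TID:       $tid\n";
print "Old email: $old_email\n";
print "New email: $new_email\n";
print "Changed:   $changed\n";
```

    Dumping $data with Data::Dumper first is an easy way to confirm the structure before hard-coding any keys.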

    And by all means, no one could ever go wrong using XML::Twig for the task if you were concerned about the size of your XML document :-)
    -Eric

    Update: Added code example.
      Two notes on the use of XML::Simple:

      1. Since malformed XML is fatal to the parser, you may want to wrap its invocation in an eval block.
      2. Use forcearray to avoid problems with references when the returned structure has empty elements.

          $hashref = eval { XMLin($XMLinaString, forcearray => 1) };
          xml_parse_failure($@) if $@;   # an error trapping sub
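
      To make both notes concrete, here is a minimal sketch that wraps the parse in eval and shows how ForceArray changes the access path. The parse_xml helper and the two sample strings are hypothetical, and KeyAttr => [] is an extra assumption added to switch off array folding so the structure stays predictable.

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical inputs: one well-formed document, one with an unclosed tag.
my $good = q{<transactionUpdate><transaction><theTID>GH95QFEFAC80</theTID></transaction></transactionUpdate>};
my $bad  = q{<transactionUpdate><transaction></transactionUpdate>};

# Hypothetical helper: returns (data, undef) on success, (undef, error) on failure.
sub parse_xml {
    my ($xml) = @_;
    my $data = eval { XMLin($xml, ForceArray => 1, KeyAttr => []) };
    my $err  = $@;
    return $err ? (undef, $err) : ($data, undef);
}

my ($data, $error) = parse_xml($good);

# With ForceArray => 1 every nested element is an array reference,
# even when it occurs only once, so each level needs an index.
my $tid = $data->{transaction}[0]{theTID}[0];
print "TID: $tid\n";

my (undef, $bad_error) = parse_xml($bad);
print "Parse failed as expected\n" if $bad_error;
```

      The cost of ForceArray is the extra [0] at each level; the benefit is that the code does not break when an element that usually appears once shows up twice.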

      --
      Check out my Perlmonks Related Scripts like framechat, reputer, and xNN.

Re: XML Transaction - random access.
by mirod (Canon) on Jun 30, 2001 at 12:40 UTC

    I think what you are looking for is the Objects style of XML::Parser. I have not used it as it generates a structure I (and many others!) find quite ugly.

    Try this to figure out what you can get from it:

        #!/bin/perl -w
        use strict;
        use XML::Parser;
        use Data::Denter;
        use Text::Iconv;

        my $p= new XML::Parser( Style=> 'Objects');
        my $doc= $p->parse( \*STDIN);
        print Denter( $doc);

    I agree with andreychek and I would advise you to use XML::Simple, which takes the structure generated by the Tree style of XML::Parser and simplifies it a _lot_ (and as epoptai mentions, don't forget to use the force(array) option, Luke!).

    One important drawback of XML::Simple is that it does not deal with mixed content (in <p>this is <b>mixed</b> content</p>, text and elements are mixed directly within the p element). If your data includes mixed content (or if it might include it one day) you can have a look at XML::XPath (nice, solid module, well supported), XML::DOM (ugly, dangerous and not too well-supported but the DOM is a W3C standard) or of course XML::Twig (which I wrote, so I like it!). All of those modules will generate a tree structure from the XML and allow you to navigate and update it.
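
    As an illustration of the tree approach, here is a short XML::Twig sketch that navigates the transaction, reads mixed content, and updates a value. The <note> element is a hypothetical addition (it is not in the original transaction) included only to show mixed content, and the namespace declaration is left out for brevity.

```perl
use strict;
use warnings;
use XML::Twig;

# Abbreviated transaction; <note> is a made-up element with mixed content.
my $xml = <<'XML';
<transactionUpdate>
  <transaction>
    <theTID>GH95QFEFAC80</theTID>
    <registerTo>John Citizen</registerTo>
    <customerEmails> oldjohn@citizen.com </customerEmails>
  </transaction>
  <updatedInformation dateOfChange='2001-6-25'>
    <customerEmail>newjohn@citizen.com</customerEmail>
  </updatedInformation>
  <note>this is <b>mixed</b> content</note>
</transactionUpdate>
XML

my $twig = XML::Twig->new;
$twig->parse($xml);
my $root = $twig->root;

# Navigate by element name.
my $txn     = $root->first_child('transaction');
my $tid     = $txn->first_child_text('theTID');
my $old     = $txn->first_child_trimmed_text('customerEmails');
my $changed = $root->first_child('updatedInformation')->att('dateOfChange');

# Mixed content: text() concatenates the text of the element and its descendants.
my $note      = $root->first_child('note');
my $note_text = $note->text;
my $bold      = $note->first_child_text('b');

# Update the tree in place.
$txn->first_child('registerTo')->set_text('Jane Citizen');
my $name = $txn->first_child_text('registerTo');

print "$tid / $old / $changed / $note_text / $bold / $name\n";
```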

    There is quite a lot of information about those on this site, just do a SuperSearch on XML and you'll get more information than you might ever want ;--)

    If you are looking for pure speed you can also use XML::Parser directly and build your own structure; look at the XML::Parser Tutorial for help.
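
    One possible shape for such a hand-built structure, sketched under the assumption that a flat hash keyed by element path is enough for this document: the handlers track the current path and collect text and attributes into %value. The path-style keys and the "@" prefix for attributes are conventions invented for this example, not something XML::Parser provides.

```perl
use strict;
use warnings;
use XML::Parser;

# Abbreviated copy of the transaction from the question.
my $xml = <<'XML';
<?xml version='1.0' encoding='ISO-8859-1'?>
<transactionUpdate xmlns='http://www.mypayproc.com/transaction'>
  <transaction>
    <theTID>GH95QFEFAC80</theTID>
    <registerTo>John Citizen</registerTo>
    <customerEmails> oldjohn@citizen.com </customerEmails>
  </transaction>
  <updatedInformation dateOfChange='2001-6-25'>
    <customerEmail>newjohn@citizen.com</customerEmail>
  </updatedInformation>
</transactionUpdate>
XML

my @path;    # stack of currently open element names
my %value;   # "a/b/c" => text, "a/b/@attr" => attribute value

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $elt, %attr) = @_;
            push @path, $elt;
            my $here = join '/', @path;
            $value{"$here/\@$_"} = $attr{$_} for keys %attr;
        },
        End  => sub { pop @path },
        Char => sub {
            my ($expat, $text) = @_;
            # Text may arrive in several chunks, so append rather than assign.
            $value{ join '/', @path } .= $text;
        },
    },
);
$parser->parse($xml);

# Trim whitespace and drop entries that held only indentation.
for my $key (keys %value) {
    $value{$key} =~ s/^\s+|\s+$//g;
    delete $value{$key} if $value{$key} eq '';
}

print "$_ => $value{$_}\n" for sort keys %value;
```

    A flat hash like this loses repeated elements (a second <transaction> would be appended to the first); a real structure would need arrays at the levels that can repeat.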

    There is also a tutorial on Perl and XML on my web site at www.xmltwig.com (the DNS info for it might be corrupted at the moment so you can try 193.251.86.24 instead).

    Another important note is that since your input is in ISO-8859-1 you might want to get the output in the same character set. XML::Parser (and all of the modules based on it) will convert it to UTF-8 (which is an encoding for the Unicode character set...). Your best bet here is to use Text::Iconv if the iconv library is available on your system (it should be, I think you can get it even on Windows):

        #!/bin/perl -w
        use strict;
        use Text::Iconv;
        use XML::Simple;

        my $converter= Text::Iconv->new("utf8", "ISO-8859-1")
          or die "cannot generate the converter";

        # note that if you want the text with XML::Parser
        # Objects style you will need to get
        # $doc->[0]->{Kids}->[0]->{Text}
        my $doc= XMLin( \*DATA);
        my $latin1_text= $converter->convert( $doc);
        print "text: $latin1_text (was $doc)\n";

        __DATA__
        <?xml version="1.0" encoding="ISO-8859-1"?>
        <doc>español</doc>
Re: XML Transaction - random access.
by toma (Vicar) on Jun 30, 2001 at 22:25 UTC
    Here is example code using XPath that might be what you are looking for. I put the XML from your question in a file called 'xmlexample.xml'. XPath doesn't know that the whitespace in your XML isn't part of the email address, so this example strips the whitespace from the returned result.

        use strict;
        use XML::XPath;

        my $xp = XML::XPath->new(filename => "xmlexample.xml");
        my $oldemail= $xp->findvalue('/transactionUpdate/transaction/customerEmails');
        $oldemail =~ s/\s//g; # Get rid of whitespace
        print $oldemail,"\n";

    Thanks to mirod for providing XML::XPath lessons in the ChatterBox!

    It should work perfectly the first time! - toma

Re: XML Transaction - random access.
by Anonymous Monk on Jul 01, 2001 at 06:32 UTC
    This is such a tiny file. You should try either XML::Simple or XML::EasyOBJ, which will return a hash. It doesn’t get any simpler.
