PerlMonks  

Re^2: Retrieving XML From a File Based On Child Node Attribute

by the_r (Initiate)
on Feb 14, 2017 at 19:07 UTC (id://1181985)


in reply to Re: Retrieving XML From a File Based On Child Node Attribute
in thread Retrieving XML From a File Based On Child Node Attribute

Thanks for the reply. I used regular expressions because the actual XML is contained in a log file that holds other information besides XML. Will XML::LibXML handle any type of file, or does it strictly need an XML file?

I tried the following script and am getting the parser error "Start tag expected, '<' not found". Below is the code:

#!/usr/bin/perl
use XML::LibXML;

my $requestId = $ARGV[0];
my $fileName = "sample.xml";
print "$requestId\n";
print "$fileName\n";

my $doc = XML::LibXML->load_xml(string=>$fileName);
my @nodes = $doc->findnodes("/*/EventInfo[\@RequestId='$requestId']");
for my $node (@nodes) {
    print "### ", $node->getParentNode->toString, " ###\n\n";
}

The sample xml file is as follows:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<DeliveryTimeChanged CurrentStatus="OnHold" xmlns:ns2="http://com/post/orderupdatesasync/jaxbxml">
  <EventInfo EventId="666313444" CreationDatetime="2017/02/09 07:59:17 369 GMT" RequestId="321150454">
    <TopicCounts TopicName="DELIVERY.TIME.CHANGED" TopicCount="1"/>
  </EventInfo>
  <DeliveryChangeOperationType OperationTypeCode="DELAY" OperationSubtypeCode="HOLD" DeliveryChangeReason="Weather" DeliveryDate="20170210"/>
</DeliveryTimeChanged>

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<DeliveryRouteChanged CurrentStatus="OnHold" xmlns:ns2="http://com/post/orderupdatesasync/jaxbxml">
  <EventInfo EventId="666313445" CreationDatetime="2017/02/09 07:59:23 639 GMT" RequestId="321150454">
    <TopicCounts TopicName="DELIVERY.ROUTE.CHANGED" TopicCount="1"/>
  </EventInfo>
  <DeliveryRouteType OperationTypeCode="AIR" OperationSubtypeCode="HOLD" DeliveryDate="20170210"/>
</DeliveryRouteChanged>

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<DeliveryCanceled CurrentStatus="Canceled" xmlns:ns2="http://com/post/orderupdatesasync/jaxbxml">
  <EventInfo EventId="666313446" CreationDatetime="2017/02/09 07:59:44 963 GMT" RequestId="421150444">
    <TopicCounts TopicName="DELIVERY.STATUS.CANCELED" TopicCount="1"/>
  </EventInfo>
  <DeliveryStatusType DeliveryStatusCode="CX" OperationSubtypeCode="CANCELED" DeliveryDate="20170210"/>
</DeliveryCanceled>

Re^3: Retrieving XML From a File Based On Child Node Attribute
by haukex (Archbishop) on Feb 14, 2017 at 22:19 UTC

    Hi the_r,

    First, have a look at the documentation, and note that XML::LibXML->load_xml(string=>$fileName); is trying to parse the string contained in $fileName, that is, the literal text "sample.xml", as XML. That is why the parser complains that it found no start tag. What you want is XML::LibXML->load_xml(location=>$fileName); instead.
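To illustrate the difference between the two options, here is a minimal sketch. It assumes XML::LibXML is installed; the inline document and the temporary file are hypothetical stand-ins for a well-formed sample.xml, not the poster's actual data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
use File::Temp qw(tempfile);

# Hypothetical, well-formed stand-in for sample.xml (one document only).
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<DeliveryTimeChanged CurrentStatus="OnHold">
  <EventInfo EventId="666313444" RequestId="321150454"/>
</DeliveryTimeChanged>
XML

# Write the stand-in document to a temporary file so there is a real path to load.
my ($fh, $path) = tempfile(SUFFIX => '.xml', UNLINK => 1);
print {$fh} $xml;
close $fh or die "Cannot close $path: $!";

# string => ... parses the scalar's contents as XML, so a file name fails.
my $ok  = eval { XML::LibXML->load_xml(string => $path); 1 };
my $err = $ok ? '' : "$@";
print "string => \$path failed: $err\n" unless $ok;

# location => ... treats the scalar as a file name (or URL) and reads it.
my $doc   = XML::LibXML->load_xml(location => $path);
my @nodes = $doc->findnodes("/*/EventInfo[\@RequestId='321150454']");
print "### ", $_->parentNode->toString, " ###\n\n" for @nodes;
```

If the XML is already in memory, for example after extracting it from a larger file, string => is the right option; location => is for a path or URL.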

    Will XML::LibXML handle any type of file, or does it strictly need an XML file?

    It will need an XML file conforming to the specifications. I am having trouble understanding the sample data you posted; please use <code> tags. Is this all one file, or three separate files? If the latter, then the above change should be all you need.

    If, however, the input you pasted here is from one single file (as you seem to be saying with the "log file"), then this is not a standard XML file, as the <?xml...?> declaration may only appear once, at the top of the file. First, I would recommend you check the source of the data and whether you can retrieve the pieces of XML as individual files. If not, I might complain to whoever is generating this data that it does not conform to XML specifications :-)

    If that doesn't work, you may be left with parsing the file and breaking it into individual chunks that a normal XML parser can handle. In that case, you'll have to show a sample input that is representative of the data you're getting, in <code> tags. But try to see if you can get data conforming to the standards first.
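As a rough sketch of the chunking idea, the code below splits the input at each <?xml ...?> declaration and parses every chunk on its own. It rests on assumptions that the real log may not satisfy: each document starts with its own declaration, and nothing but whitespace follows a document's root element before the next declaration. The sub names split_xml_documents and matching_documents, the sample log line, and the trimmed-down documents are all hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Split text into candidate XML documents: each chunk starts at an XML
# declaration and runs up to the next declaration or the end of the input.
# Anything before the first declaration is ignored.
sub split_xml_documents {
    my ($text) = @_;
    my @chunks = $text =~ /(<\?xml\s.*?)(?=<\?xml\s|\z)/gs;
    return @chunks;
}

# Return the serialized root element of every document that has an
# EventInfo child whose RequestId attribute equals $request_id.
sub matching_documents {
    my ($text, $request_id) = @_;
    my @found;
    for my $chunk (split_xml_documents($text)) {
        # Parse each chunk separately; skip chunks that are not well-formed.
        my $doc = eval { XML::LibXML->load_xml(string => $chunk) } or next;
        # Compare the attribute in Perl rather than interpolating the id
        # into the XPath expression.
        my @hits = grep { ($_->getAttribute('RequestId') // '') eq $request_id }
                   $doc->findnodes('/*/EventInfo');
        push @found, $doc->documentElement->toString if @hits;
    }
    return @found;
}

# Hypothetical log excerpt: one non-XML line followed by two documents.
my $log = <<'LOG';
2017-02-09 07:59:17 INFO received event
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<DeliveryTimeChanged CurrentStatus="OnHold">
  <EventInfo EventId="666313444" RequestId="321150454"/>
</DeliveryTimeChanged>
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<DeliveryCanceled CurrentStatus="Canceled">
  <EventInfo EventId="666313446" RequestId="421150444"/>
</DeliveryCanceled>
LOG

print "### $_ ###\n\n" for matching_documents($log, '321150454');
```

If other log text is interleaved after a document's closing tag, that chunk will fail to parse and be skipped here, so the split pattern would need to be adapted to the real format.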

    Hope this helps,
    -- Hauke D
