Greetings Monks. I'm trying to parse a chunk of XML, and the problem has become significantly more complex than I'm used to. I am requesting an XML page that describes one or more ads. Any number of ads can be returned, including any number of ads of a specific type. I am OK when there is only one ad of a given type, but multiple ads of the same type are problematic. Here is an example of the XML returned from a request for one Preroll, three Midrolls, and one Postroll:
<AdXML>
  <Preroll>
    <Creative>Preroll_30sec</Creative>
    <CompanionId>N/A</CompanionId>
    <Impression>TBD</Impression>
    <Completion>http://192.168.0.1:80/foo/bar</Completion>
    <TrackingId>null:414</TrackingId>
    <Length>4</Length>
  </Preroll>
  <Postroll>
    <Creative>Postroll_60sec</Creative>
    <CompanionId>N/A</CompanionId>
    <Impression>TBD</Impression>
    <Completion>http://192.168.0.1:80/foo/bar</Completion>
    <TrackingId>null:418</TrackingId>
    <Length>6</Length>
  </Postroll>
  <Midroll>
    <Creative>Midroll_45sec_3</Creative>
    <CompanionId>N/A</CompanionId>
    <Impression>TBD</Impression>
    <Completion>http://192.168.0.1:80/foo/bar</Completion>
    <TrackingId>null:417</TrackingId>
    <Length>5</Length>
  </Midroll>
  <Midroll>
    <Creative>Midroll_45sec_1</Creative>
    <CompanionId>N/A</CompanionId>
    <Impression>TBD</Impression>
    <Completion>http://192.168.0.1:80/foo/bar</Completion>
    <TrackingId>null:415</TrackingId>
    <Length>5</Length>
  </Midroll>
  <Midroll>
    <Creative>Midroll_45sec_2</Creative>
    <CompanionId>N/A</CompanionId>
    <Impression>TBD</Impression>
    <Completion>http://192.168.0.1:80/foo/bar</Completion>
    <TrackingId>null:416</TrackingId>
    <Length>5</Length>
  </Midroll>
</AdXML>
Using XML::Simple, I can read this all into an object. When I run that object through Data::Dumper, I get this:
Response Dump:
$VAR1 = {
          'Preroll' => {
                         'Length' => '4',
                         'TrackingId' => 'null:414',
                         'CompanionId' => 'N/A',
                         'Creative' => 'Preroll_30sec',
                         'Impression' => 'TBD',
                         'Completion' => 'http://192.168.0.1:80/foo/bar'
                       },
          'Midroll' => [
                         {
                           'Length' => '5',
                           'TrackingId' => 'null:415',
                           'CompanionId' => 'N/A',
                           'Creative' => 'Midroll_45sec_1',
                           'Impression' => 'TBD',
                           'Completion' => 'http://192.168.0.1:80/foo/bar'
                         },
                         {
                           'Length' => '5',
                           'TrackingId' => 'null:417',
                           'CompanionId' => 'N/A',
                           'Creative' => 'Midroll_45sec_3',
                           'Impression' => 'TBD',
                           'Completion' => 'http://192.168.0.1:80/foo/bar'
                         },
                         {
                           'Length' => '5',
                           'TrackingId' => 'null:416',
                           'CompanionId' => 'N/A',
                           'Creative' => 'Midroll_45sec_2',
                           'Impression' => 'TBD',
                           'Completion' => 'http://192.168.0.1:80/foo/bar'
                         }
                       ],
          'Postroll' => {
                          'Length' => '6',
                          'TrackingId' => 'null:418',
                          'CompanionId' => 'N/A',
                          'Creative' => 'Postroll_60sec',
                          'Impression' => 'TBD',
                          'Completion' => 'http://192.168.0.1:80/foo/bar'
                        }
        };
What I need to do is walk through the object so I can compare the values for the end parameters (Length, Creative, etc) for each ad with the expected values. The problems are:
  1. I won't know in advance what order the xml elements will be in. It may be Preroll, Midroll, Postroll, or it may be Midroll, Postroll, Preroll. There is no way of knowing in advance.
  2. If there is only one ad returned for a specific ad type, '*roll' will be a hash reference. If there are multiple ads returned, '*roll' will be a reference to an anonymous array of hashes. It is possible to know in advance which ad type will have multiple ads returned and how many there should be.
What I need is an algorithm that will walk the master hash reference and be smart enough to recognize whether it's encountered a simple hash or an array of hashes and act accordingly. I've tried this:
my ($self, $response) = @_;
foreach my $asset_type ( keys %{$response} ) {
    $logger->debug("Starting asset_type $asset_type");
    foreach my $asset_param ( keys %{$response->{$asset_type}} ) {
        $logger->debug("Top of middle FOR loop asset_param = $asset_param: $response->{$asset_type}->{$asset_param}");
        if ( exists ($response->{$asset_type}->{$asset_param}) ) {
            $logger->debug("\t$asset_type asset_param $asset_param exists: ($asset_param) = $response->{$asset_type}->{$asset_param}"); ## LINE 779
        }
        else {
            $logger->debug("\t$asset_type asset_param $asset_param is an array reference");
            my $i = 0;
            while ($response->{$asset_type}[$i]) {
                foreach my $subkey ( keys %{$response->{$asset_type}[$i]}) {
                    $logger->debug("\t\tTesting $asset_type asset number $i (subkey $subkey) = ($response->{$asset_type}[$i]->{$subkey})");
                }
                $i++;
            }
            $logger->debug("Broke innermost WHILE loop asset_param = $asset_param");
        }
        $logger->debug("Bottom of middle FOR loop asset_param = $asset_param");
    }
    $logger->debug("Broke middle FOR asset param loop");
}
The output is:
- Starting asset_type Preroll
- Top of middle FOR loop asset_param = Length: 4
-     Preroll asset_param Length exists: (Length) = 4
- Bottom of middle FOR loop asset_param = Length
- Top of middle FOR loop asset_param = TrackingId: null:414
-     Preroll asset_param TrackingId exists: (TrackingId) = null:414
- Bottom of middle FOR loop asset_param = TrackingId
- Top of middle FOR loop asset_param = CompanionId: N/A
-     Preroll asset_param CompanionId exists: (CompanionId) = N/A
- Bottom of middle FOR loop asset_param = CompanionId
- Top of middle FOR loop asset_param = Creative: KohlFauPreroll_30sec
-     Preroll asset_param Creative exists: (Creative) = KohlFauPreroll_30sec
- Bottom of middle FOR loop asset_param = Creative
- Top of middle FOR loop asset_param = Impression: TBD
-     Preroll asset_param Impression exists: (Impression) = TBD
- Bottom of middle FOR loop asset_param = Impression
- Top of middle FOR loop asset_param = Completion: http://172.24.16.84:8380/baapi/hics
-     Preroll asset_param Completion exists: (Completion) = http://172.24.16.84:8380/baapi/hics
- Bottom of middle FOR loop asset_param = Completion
- Broke middle FOR asset param loop
- Starting asset_type Midroll
- Top of middle FOR loop asset_param = Length:
-     Midroll asset_param Length is an array reference
-         Testing Midroll asset number 0 (subkey Length) = (5)
-         Testing Midroll asset number 0 (subkey TrackingId) = (null:417)
-         Testing Midroll asset number 0 (subkey CompanionId) = (N/A)
-         Testing Midroll asset number 0 (subkey Creative) = (KohlFauMidroll_45sec_3)
-         Testing Midroll asset number 0 (subkey Impression) = (TBD)
-         Testing Midroll asset number 0 (subkey Completion) = (http://172.24.16.84:8380/baapi/hics)
-         Testing Midroll asset number 1 (subkey Length) = (5)
-         Testing Midroll asset number 1 (subkey TrackingId) = (null:415)
-         Testing Midroll asset number 1 (subkey CompanionId) = (N/A)
-         Testing Midroll asset number 1 (subkey Creative) = (KohlFauMidroll_45sec_1)
-         Testing Midroll asset number 1 (subkey Impression) = (TBD)
-         Testing Midroll asset number 1 (subkey Completion) = (http://172.24.16.84:8380/baapi/hics)
-         Testing Midroll asset number 2 (subkey Length) = (5)
-         Testing Midroll asset number 2 (subkey TrackingId) = (null:416)
-         Testing Midroll asset number 2 (subkey CompanionId) = (N/A)
-         Testing Midroll asset number 2 (subkey Creative) = (KohlFauMidroll_45sec_2)
-         Testing Midroll asset number 2 (subkey Impression) = (TBD)
-         Testing Midroll asset number 2 (subkey Completion) = (http://172.24.16.84:8380/baapi/hics)
- Broke innermost WHILE loop asset_param = Length
- Bottom of middle FOR loop asset_param = Length
At that point the program dies with the error "Bad index while coercing array into hash at OO_HttpInterfaceTest.pm line 779." Line 779 in this case is:
$logger->debug("\t$asset_type asset_param $asset_param exists: ($asset_param) = $response->{$asset_type}->{$asset_param}");
If I remove the logging statement, the code chokes on line 780 with the same error, which leads me to suspect that the actual error is in the expression $response->{$asset_type}->{$asset_param}.

The break is happening after the Midroll section is evaluated. The value for 'Completion' is displayed at which point the loop should exit. What seems to be happening is the code tests if $response->{$asset_type}->{$asset_param} exists, finds that it doesn't, and exits rather than going to the else condition. I have no idea why it only does this when transitioning from walking the anonymous array back to a normal hash.
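
One possible reading of this, offered as an assumption rather than a confirmed diagnosis: on a Perl old enough to still have pseudo-hashes (5.8 and earlier), using hash syntax on the Midroll array reference treats its first element as a key-to-index map, so a value such as 'null:417' ends up being used as an array index. A recent Perl would instead die with "Not a HASH reference". Either way, `exists` cannot tell a hash from an array; `ref` can. The sketch below uses a hypothetical helper, `ads_of`, and a cut-down copy of the dumped structure.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): return a list of ad
# hashes whether XML::Simple produced one hash or an array of hashes.
sub ads_of {
    my ($node) = @_;
    return @$node  if ref $node eq 'ARRAY';
    return ($node) if ref $node eq 'HASH';
    return;    # plain string or undef: no ads to walk
}

# Cut-down copy of the structure from the Data::Dumper output.
my $response = {
    Preroll => { Length => 4, Creative => 'Preroll_30sec' },
    Midroll => [
        { Length => 5, Creative => 'Midroll_45sec_1' },
        { Length => 5, Creative => 'Midroll_45sec_3' },
    ],
};

my @lines;
for my $asset_type (sort keys %$response) {
    my $n = 0;
    for my $ad (ads_of($response->{$asset_type})) {
        # $ad is always a plain hash here, so hash syntax is safe.
        for my $param (sort keys %$ad) {
            push @lines, "$asset_type #$n $param = $ad->{$param}";
        }
        $n++;
    }
}
print "$_\n" for @lines;
```

XML::Simple also has a ForceArray option that makes the named elements arrays even when only one is present, which would collapse the two cases into one at parse time.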

I've been on this for most of the day. Please help! And if there's some vastly easier/less complex way to do this, I'm all ears.
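
For the comparison against expected values, one approach that does not depend on the order of the returned elements is to index each ad type by some identifying field first. The sketch below assumes Creative is unique within an ad type, which the post does not state. The helper names `index_by_creative` and `compare_ads`, the layout of `$expected`, and the deliberately wrong Length of 7 are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: index the ads of one type by Creative, so the
# order in which they came back no longer matters.
sub index_by_creative {
    my ($node) = @_;
    my @ads = ref $node eq 'ARRAY' ? @$node
            : ref $node eq 'HASH'  ? ($node)
            :                        ();
    my %by_creative;
    $by_creative{ $_->{Creative} } = $_ for @ads;
    return \%by_creative;
}

# Returns a list of mismatch descriptions; an empty list means all matched.
sub compare_ads {
    my ($expected, $response) = @_;
    my @problems;
    for my $type (sort keys %$expected) {
        my $got = index_by_creative($response->{$type});
        for my $creative (sort keys %{ $expected->{$type} }) {
            my $ad = $got->{$creative};
            if (!$ad) {
                push @problems, "$type/$creative: missing from response";
                next;
            }
            my $want = $expected->{$type}{$creative};
            for my $param (sort keys %$want) {
                my $have = defined $ad->{$param} ? $ad->{$param} : '(undef)';
                push @problems,
                    "$type/$creative: $param is '$have', expected '$want->{$param}'"
                    if $have ne $want->{$param};
            }
        }
    }
    return @problems;
}

# Assumed layout for the expected values: type => creative => params.
my $expected = {
    Preroll => { 'Preroll_30sec' => { Length => 4 } },
    Midroll => {
        'Midroll_45sec_1' => { Length => 5, TrackingId => 'null:415' },
        'Midroll_45sec_2' => { Length => 5, TrackingId => 'null:416' },
    },
};

# Response with Midrolls in a different order and one altered Length.
my $response = {
    Preroll => { Creative => 'Preroll_30sec', Length => 4 },
    Midroll => [
        { Creative => 'Midroll_45sec_2', Length => 5, TrackingId => 'null:416' },
        { Creative => 'Midroll_45sec_1', Length => 7, TrackingId => 'null:415' },
    ],
};

my @problems = compare_ads($expected, $response);
print "$_\n" for @problems;
```

Checking the number of ads per type could be added the same way, by comparing the key counts of the expected and indexed hashes.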

Thanks,

-Logan
"What do I want? I'm an American. I want more."


In reply to XML::Simple Meets Complex Hash Structure by logan
