PerlMonks  

Re^3: Passing complex html-tag input over command line to HTML TreeBuilder method look_down() properly

by Anonymous Monk
on Apr 19, 2019 at 07:00 UTC (#1232788)


in reply to Re^2: Passing complex html-tag input over command line to HTML TreeBuilder method look_down() properly
in thread Passing complex html-tag input over command line to HTML TreeBuilder method look_down() properly

#!/usr/bin/perl --
use strict;
use warnings;
use HTML::TreeBuilder::XPath;
use HTML::Selector::XPath 'selector_to_xpath';

Main( @ARGV );

sub Main {
    my $tree = HTML::TreeBuilder::XPath->new;
    # $tree->parse_file('foo.html');
    $tree->parse_content( DemoHtml() );

    # Convert the CSS selector to XPath, then visit each matching container.
    for my $node ( $tree->findnodes( selector_to_xpath( 'div.comic__container' ) ) ) {
        MeImagins( $node );
    }
}

sub MeImagins {
    my( $node ) = @_;

    # './/img' is relative to $node; a bare '//img' searches the whole document.
    for my $img ( $node->findnodes('.//img') ) {
        print
          "\n###", "\n",
          $img->address(), "\n",
          $img->attr( 'src' ), "\n",
          $img->attr( 'alt' ), "\n",
          ;
    }
}

sub DemoHtml {
    return <<'__HTML__';
<div class="comic__container">
  <div class="comic__image js-comic-swipe-target">
    <div class="swipe-preview swipe-preview__previous js-preview-previous">
      <div class="swipe-preview__group">
        <h5 class="card-subtitle">
          <date>April 16, 2019</date>
        </h5>
        <div class="swipe-preview__ubadge">
          <div class="gc-avatar gc-avatar--creator sm"><img srcset="https://assets.gocomics.com/assets/transparent-3eb10792d1f0c7e07e7248273540f1952d9a5a2996f4b5df70ab026cd9f05517.png" data-srcset="https://avatar.amuniversal.com/feature_avatars/ubadge_images/features/cw/small_u-201701251613.png, 72w" class="lazyload" alt="9 Chickweed Lane" src="https://avatar.amuniversal.com/feature_avatars/ubadge_images/features/cw/small_u-201701251613.png"></div>
        </div>
      </div>
    </div>
    <a itemprop="image" class="js-item-comic-link" href="/9chickweedlane/2019/04/17" title="9 Chickweed Lane">
      <picture class="item-comic-image"><img class="lazyload img-fluid" srcset="https://assets.gocomics.com/assets/transparent-3eb10792d1f0c7e07e7248273540f1952d9a5a2996f4b5df70ab026cd9f05517.png" data-srcset="https://assets.amuniversal.com/93d41d70391d01379025005056a9545d 900w" sizes=" (min-width: 992px) 900px, (min-width: 768px) 600px, (min-width: 576px) 300px, 900px" alt="9 Chickweed Lane Comic Strip for April 17, 2019 " src="https://assets.amuniversal.com/93d41d70391d01379025005056a9545d" width="100%"></picture>
    </a>
    <meta itemprop="isFamilyFriendly" content="true">
    <div class="swipe-preview swipe-preview__next js-preview-next">
      <div class="swipe-preview__group">
        <h5 class="card-subtitle">
          <date>April 18, 2019</date>
        </h5>
        <div class="swipe-preview__ubadge">
          <div class="gc-avatar gc-avatar--creator sm"><img srcset="https://assets.gocomics.com/assets/transparent-3eb10792d1f0c7e07e7248273540f1952d9a5a2996f4b5df70ab026cd9f05517.png" data-srcset="https://avatar.amuniversal.com/feature_avatars/ubadge_images/features/cw/small_u-201701251613.png, 72w" class="lazyload" alt="9 Chickweed Lane" src="https://avatar.amuniversal.com/feature_avatars/ubadge_images/features/cw/small_u-201701251613.png"></div>
        </div>
      </div>
    </div>
  </div>
  <nav class="gc-calendar-nav" role="group" aria-label="Date Navigation Controls">
    <div class="gc-calendar-nav__previous">
      <a role="button" href="/9chickweedlane/1993/07/12" class="fa btn btn-outline-secondary btn-circle fa fa-backward sm " title=""></a>
      <a role="button" href="/9chickweedlane/2019/04/16" class="fa btn btn-outline-secondary btn-circle fa-caret-left sm js-previous-comic" title=""></a>
    </div>
    <div class="gc-calendar-nav__select">
      <div class="btn btn-outline-secondary gc-calendar-nav__datepicker js-calendar-wrapper" data-date="2019/04/17" data-name="/9chickweedlane/" data-year="2019" data-month="04" data-day="17" data-feature="9chickweedlane" data-ct="" data-start="1993/07/12" data-end="2019/04/19" data-open="2019-04-17">
        <i class="fa fa-calendar xs"></i>
        <input name="startDate" placeholder="April 17, 2019" readonly="readonly" class="cal off calendar-input date js-calendar-input datepicker js-calendar-input-link" type="text">
      </div>
      <a class="btn btn-outline-secondary" alt="Click to View a Random 9 Chickweed Lane Comic Strip!" href="/random/9chickweedlane">Random</a>
    </div>
    <div class="gc-calendar-nav__next">
      <a role="button" href="/9chickweedlane/2019/04/18" class="fa btn btn-outline-secondary btn-circle fa-caret-right sm " title=""></a>
      <a role="button" href="/9chickweedlane/2019/04/19" class="fa btn btn-outline-secondary btn-circle fa-forward sm " title=""></a>
    </div>
  </nav>
</div>
__HTML__
}