PerlMonks  

Re: Help with web crawling

by tobyink (Abbot)
on Dec 09, 2012 at 11:01 UTC


in reply to Help with web crawling

use HTML::HTML5::Parser;

my $uri   = 'http://www.sec.gov/Archives/edgar/data/935226/000114420411058092/0001144204-11-058092-index.htm';
my $xpath = '//*[@class="formGrouping" and ./*[@class="infoHead" and contains(./text(), "Items")]]/*[@class="info"]';

my $item = HTML::HTML5::Parser
    ->load_html(location => $uri)
    ->findvalue($xpath);

print $item, "\n";
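To see what the XPath is doing without hitting the network, here is a rough sketch that runs the same expression against an inline HTML string. The markup and its values are made up for illustration (they only imitate the formGrouping / infoHead / info layout the expression expects, and are not copied from the SEC page), and the extract_items helper is a hypothetical name, not part of any module.

```perl
use strict;
use warnings;
use HTML::HTML5::Parser;

# Hypothetical markup: two groups, only one of which has an "Items" heading.
my $html = <<'HTML';
<!DOCTYPE html>
<html><head><title>Sample</title></head><body>
<div class="formGrouping"><div class="infoHead">Filing Date</div><div class="info">2000-01-01</div></div>
<div class="formGrouping"><div class="infoHead">Items</div><div class="info">Item 9.99: Sample Event</div></div>
</body></html>
HTML

# Same expression as in the snippet above: pick the formGrouping whose
# infoHead child contains "Items", then take that group's info child.
my $xpath = '//*[@class="formGrouping" and ./*[@class="infoHead" and contains(./text(), "Items")]]/*[@class="info"]';

# Hypothetical helper: parse a string of HTML and return the matched text.
sub extract_items {
    my ($markup) = @_;
    my $doc = HTML::HTML5::Parser->new->parse_string($markup);
    return $doc->findvalue($xpath);
}

print extract_items($html), "\n";
```

The element names are wildcards (*) because HTML::HTML5::Parser puts elements in the XHTML namespace, so a plain div name test would need a registered prefix. As far as I know, findvalue joins the text of every matching node into one string, so if a page has several matching groups, findnodes is probably the better choice.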
perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
