Re: Script for a URL that constantly changes

by anneli (Pilgrim)
on Oct 19, 2011 at 23:03 UTC (#932522)


in reply to Script for a URL that constantly changes

Uh, so, what would you like a hand with? What have you got so far?

How do I post a question effectively?


Re^2: Script for a URL that constantly changes
by semrich (Initiate) on Oct 20, 2011 at 00:00 UTC

    This is what I have so far:

    use strict;
    use warnings;
    $|++;    # flush STDOUT after every print

    use Test::WWW::Mechanize;
    use WWW::Mechanize::Sleepy;
    use File::Basename;
    use WWW::Mechanize;
    use WWW::Mechanize::Image;
    use Storable;
    use HTTP::Cookies;
    use HTML::SimpleParse;

    my @sku = ('NUMBER');

    for my $sku (@sku) {
        # sleep between 1 and 3 seconds between requests
        my $mech = WWW::Mechanize::Sleepy->new( sleep => '1..3' );

        # search page that lists every item matching the SKU
        my $URL = "https://www.vwrsp.com/psearch/ControllerServlet.do?D=$sku&spage=header&CurSel=Ntt&Nty=1&Ntx=mode%2bmatchpartialmax&cntry=us&Ntk=All&N=0&Ntt=$sku";
        $mech->get($URL);
        $mech->success or die $mech->response->status_line;

        my $url1 = $mech->uri();
        print "$url1\n";

        # follow every product link found on the search page
        my @links = $mech->find_all_links( url_regex => qr/\/catalog\/product/i );
        for my $link (@links) {
            $mech->get( $link->url() );
            $mech->success or die $mech->response->status_line;

            my $pike = "|";
            open( my $price_file, '<', "FILE NAME" )
                or die "can't open price.txt: $!\n";

            # parse the page content, not the $mech object itself
            my $some_html = $mech->content();
            my $p = HTML::SimpleParse->new($some_html);
            print $p;    # prints the object reference, not the parsed data
        }
    }

    What happens on the site is this: you go to the URL listed above to get a list of items matching the SKU fed to it. Each listing then has its own page, which holds all the information you see on the initial page (where the list is). The URL of that second page contains a catalog number that changes depending on which listing you clicked on (which is why I am not using that URL to parse). I need to somehow capture that catalog number each time the SKU changes, so I can get the correct information from the "second" page. I am not sure that made any more sense; it is sort of complicated to explain.
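
One way to capture the catalog number described above is to pull it out of each product link's URL before following it. The following is only a minimal sketch under stated assumptions: the thread never shows a real product URL, so the query parameter name `catalog_number` and the example.com URLs are hypothetical stand-ins for whatever `$link->url()` actually returns on the site.

```perl
use strict;
use warnings;

# Hypothetical: assumes the product-page URL carries the catalog number in a
# query parameter named "catalog_number". Adjust the name to the real URL.
sub extract_catalog_number {
    my ($url) = @_;
    return $url =~ /[?&;]catalog_number=([^&;#]+)/ ? $1 : undef;
}

# Made-up URLs standing in for the links found on the search-results page.
my @product_urls = (
    'https://www.example.com/catalog/product/index.cgi?catalog_number=12345-678&x=1',
    'https://www.example.com/catalog/product/index.cgi?x=1',
);

for my $url (@product_urls) {
    my $catalog = extract_catalog_number($url);
    if ( defined $catalog ) {
        print "catalog number: $catalog\n";
    }
    else {
        print "no catalog number in $url\n";
    }
}
```

In the script above, the same function could be called on each `$link->url()` inside the inner loop, so the catalog number is recorded alongside the SKU that produced it. If the number turns out to live in the URL path rather than in the query string, only the regex would need to change.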
