Re^2: Script for a URL that constantly changes

by semrich (Initiate)
on Oct 20, 2011 at 00:00 UTC (#932529)

in reply to Re: Script for a URL that constantly changes
in thread Script for a URL that constantly changes

This is what I have so far:

use strict;
use warnings;
$|++;    # unbuffer STDOUT

use Test::WWW::Mechanize;
use WWW::Mechanize::Sleepy;
use File::Basename;
use WWW::Mechanize;
use WWW::Mechanize::Image;
use Storable;
use HTTP::Cookies;
use HTML::SimpleParse;

my @sku = ('NUMBER');

for my $sku (@sku) {
    # sleep between 1 and 3 seconds between requests
    my $mech = WWW::Mechanize::Sleepy->new( sleep => '1..3' );

    my $URL = "$sku&spage=header&CurSel=Ntt&Nty=1&Ntx=mode%2bmatchpartialmax&cntry=us&Ntk=All&N=0&Ntt=$sku";
    $mech->get($URL);
    $mech->success or die "get failed: ", $mech->response->status_line;

    my $url1 = $mech->uri();
    print "$url1\n";

    # follow every product link on the result list
    my @links = $mech->find_all_links( url_regex => qr/\/catalog\/product/i );
    for my $link (@links) {
        $mech->get( $link->url() );
        $mech->success or die "get failed: ", $mech->response->status_line;

        my $pike = "|";
        # lexical filehandle, three-argument open (read mode, as before)
        open( my $price_fh, '<', "FILE NAME" ) or die "can't open price.txt: $!\n";

        my $some_html = $mech->content();
        # parse the page HTML, not the $mech object
        my $p = HTML::SimpleParse->new($some_html);
        print $p->output;
    }
}

Here is what happens on the site: you go to the URL above and get a list of results matching the SKU you fed it. Each listing in that list has its own page, which holds all the information shown on the initial (list) page. The URL of that second page contains a catalog number that changes depending on which listing you clicked, which is why I am not using that URL directly to parse. I need to somehow capture that catalog number each time the SKU changes so I can get the correct information from the "second" page. I am not sure whether that makes any more sense; it is rather complicated to explain.
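A minimal sketch of the capture step described above, under loud assumptions: the real URL layout is not shown in this post, so the `catalogId` query parameter, the sample links, and the helper name `catalog_number` are all hypothetical stand-ins. The idea is to pull the number out of each product link's URL rather than hard-coding it, so it follows whichever listing the SKU search returns.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the catalog number from a product link,
# or undef if the URL is not a product link or carries no number.
# ASSUMPTION: the number is a run of digits in a "catalogId" parameter.
sub catalog_number {
    my ($url) = @_;
    return undef unless $url =~ m{/catalog/product}i;
    return undef unless $url =~ m{[?&;]catalogId=(\d+)}i;
    return $1;
}

# Made-up stand-ins for the URLs that find_all_links would return per SKU.
my %links_for = (
    'SKU1' => [ '/catalog/product.jsp?catalogId=1001&sku=SKU1', '/help/contact' ],
    'SKU2' => [ '/catalog/product.jsp?sku=SKU2&catalogId=2002' ],
);

# Collect the catalog numbers seen for each SKU.
my %catalog_for;
for my $sku ( sort keys %links_for ) {
    my @ids = grep { defined } map { catalog_number($_) } @{ $links_for{$sku} };
    $catalog_for{$sku} = \@ids;
    print "$sku|", join( ',', @ids ), "\n";
}
```

In the script itself, the same helper could be fed `$link->url_abs` for each link returned by `find_all_links`, with the regex adjusted to whatever the site's product URLs actually look like.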
