Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?

Re: Script for a URL that constantly changes

by anneli (Pilgrim)
on Oct 19, 2011 at 23:03 UTC ( #932522=note: print w/ replies, xml ) Need Help??

in reply to Script for a URL that constantly changes

Uh, so, what would you like a hand with? What have you got so far?

How do I post a question effectively?

Comment on Re: Script for a URL that constantly changes
Replies are listed 'Best First'.
Re^2: Script for a URL that constantly changes
by semrich (Initiate) on Oct 20, 2011 at 00:00 UTC

    This is what I have so far:

    use strict; $|++; use Test::WWW::Mechanize; use WWW::Mechanize::Sleepy; use File::Basename; use WWW::Mechanize; use WWW::Mechanize::Image; use Storable; use HTTP::Cookies; use HTML::SimpleParse; my @sku = ('NUMBER'); for my $sku (@sku) { # sleep between 5 and 20 seconds between requests my $mech = WWW::Mechanize::Sleepy->new( sleep => '1..3' ); my $URL ="$sku&sp +age=header&CurSel=Ntt&Nty=1&Ntx=mode%2bmatchpartialmax&cntry=us&Ntk=A +ll&N=0&Ntt=$sku"; $mech->get( $URL ); $mech->success or die $mech->response->status_line +; $mech->success or die "post failed: "; my $url1= $mech->uri(); print "$url1\n"; my @links = $mech->find_all_links( url_regex => qr/\/catalog\/product/ +i); for my $links (@links) { $mech->get( $links->url() ); $mech->success or die $mech->response->status_line; $mech->success or die "post failed: "; my $pike = "|"; open (price_file, "FILE NAME") || die "can't open price.txt: $!\n"; my $some_html = $mech-> content(); my $p = new HTML::SimpleParse ($mech); print $p;

    What happens on the site is, you go to the URL listed above to get a list that matches the SKU fed to it, then from there each listing has it's own page where it holds all the information you see on the initial page (where the list is). When you go to that second page, in the URL there is a catalog number, that changes depending on which listing you clicked on (which is why I am not using that URL to parse), and I need to somehow capture that catalog number each time the SKU changes so I can get the correct information from the "second" page. I am not sure if that made any more sense. It is sort of complicated to explain.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://932522]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others browsing the Monastery: (8)
As of 2015-11-28 23:02 GMT
Find Nodes?
    Voting Booth?

    What would be the most significant thing to happen if a rope (or wire) tied the Earth and the Moon together?

    Results (746 votes), past polls