in reply to A regex on the same content fails and works, with conditions
I managed to solve this using some HTML::Element fu:
my ($author) = map { $_->as_text } $t->look_down(_tag => 'a', href => qr{^http://news\.example\.com/\?author=});
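For anyone following along, here is a minimal self-contained sketch of that look_down call. The sample markup and the author name are made up for illustration (the real page layout isn't shown in this thread), and it assumes `$t` is an HTML::TreeBuilder tree.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample markup; the real page is not shown in this thread.
my $html = <<'HTML';
<html><body>
<a href="http://news.example.com/?section=world">World</a>
<a href="http://news.example.com/?author=42">Jane Doe</a>
</body></html>
HTML

my $t = HTML::TreeBuilder->new_from_content($html);

# In list context look_down returns every matching element;
# the list assignment keeps only the text of the first one.
my ($author) = map { $_->as_text }
    $t->look_down(
        _tag => 'a',
        href => qr{^http://news\.example\.com/\?author=},
    );

print defined $author ? "$author\n" : "no author link found\n";

$t->delete;    # free the parse tree explicitly
```

Matching on the href with a regex rather than an exact string is what lets this pick out the author link whatever the id in the query string happens to be.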
But when I went to test it, it turned out the upstream site has some blocking/throttling mechanism, so now I can't test at all: it throws back pages saying I'm "reading articles faster than a human can read" (even though my code had a 10-second delay in it).
Now I'm adding randomization across an array of anonymous proxies to try to alleviate that blocking, but the list of proxies is not reliable.
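Roughly what I have in mind, as a sketch only: try the proxies in random order until one works, and jitter the delay instead of sleeping a fixed 10 seconds. The proxy URLs, the sub names `fetch_via_random_proxy` and `jittered_delay`, and the stand-in fetcher are all hypothetical, and I have no idea yet whether this actually gets past the site's throttling.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);
use Time::HiRes qw(sleep);

# Hypothetical proxy list; substitute whatever you actually have.
my @proxies = qw(
    http://proxy1.example.net:8080
    http://proxy2.example.net:8080
    http://proxy3.example.net:8080
);

# Try each proxy once, in random order, until $fetch returns a defined value.
# $fetch is a callback taking ($url, $proxy); it returns undef or dies on failure.
sub fetch_via_random_proxy {
    my ($url, $fetch, @candidates) = @_;
    for my $proxy (shuffle @candidates) {
        my $body = eval { $fetch->($url, $proxy) };
        return ($body, $proxy) if defined $body;
        warn "proxy $proxy failed, trying the next one\n";
    }
    return;    # every proxy failed
}

# Sleep for a random interval in [$min, $max) seconds so requests
# are not spaced at a fixed, machine-like interval.
sub jittered_delay {
    my ($min, $max) = @_;
    my $delay = $min + rand($max - $min);
    sleep $delay;
    return $delay;
}

# Stand-in fetcher for demonstration: pretends only proxy2 works.
my $fake_fetch = sub {
    my ($url, $proxy) = @_;
    return $proxy =~ /proxy2/ ? "page for $url" : undef;
};

my ($body, $used) = fetch_via_random_proxy(
    'http://news.example.com/?author=42', $fake_fetch, @proxies);
print "got '$body' via $used\n" if defined $body;
```

With LWP::UserAgent the callback would presumably call `$ua->proxy('http', $proxy)` before `$ua->get($url)` and return the content only on success. Passing the fetcher in as a coderef keeps the retry logic testable without touching the network.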
Too many yaks to shave in one day.
Re^2: A regex on the same content fails and works, with conditions (rude)
by tye (Sage) on Oct 22, 2007 at 17:06 UTC