PerlMonks  

Negate link search in Mechanize

by Anonymous Monk
on Nov 02, 2011 at 04:50 UTC (#935289)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to scrape some links using Mechanize and cannot find a way to negate the search argument. I need to find links that do not contain (for example) 'foo'. I tried Super Search, perlre, and the Mechanize docs without success, but I feel like I'm overlooking something obvious. Can anyone help?

    # this finds the links I do not want:
    my @links = $mech->find_all_links(url_regex => qr/foo/);

    # these (as well as some really bizarre attempts) don't work:
    my @links = $mech->find_all_links(url_regex => !qr/foo/);
    my @links = $mech->find_all_links(url_regex !~ qr/foo/);

Re: Negate link search in Mechanize
by mwp (Hermit) on Nov 02, 2011 at 04:57 UTC
Re: Negate link search in Mechanize
by Plankton (Priest) on Nov 02, 2011 at 05:04 UTC
    Why not just get all the links like so ...
    ...
    use WWW::Mechanize;
    my $mech = WWW::Mechanize->new();
    $mech->get( $url );
    my @links = $mech->links();
    ... then iterate over @links. Something like ...
    my @non_foo_links;
    for my $l (@links) {
        next if $l->url() =~ /foo/;    # skip links whose URL contains 'foo'
        push( @non_foo_links, $l );
        # ...
    }
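The same filter can be written more compactly with grep. This is a minimal sketch, assuming only that each link object has a url() method, as WWW::Mechanize::Link objects do. The Local::Link package, the links_not_matching helper, and the sample URLs are hypothetical stand-ins so the snippet runs without fetching a page.

```perl
use strict;
use warnings;

# Hypothetical stand-in for WWW::Mechanize::Link so the example runs offline.
# Real link objects returned by $mech->links() also provide url().
{
    package Local::Link;
    sub new { my ( $class, $url ) = @_; return bless { url => $url }, $class }
    sub url { return $_[0]{url} }
}

# Hypothetical helper: keep only the links whose URL does NOT match $pattern.
sub links_not_matching {
    my ( $pattern, @links ) = @_;
    return grep { $_->url() !~ $pattern } @links;
}

# Made-up sample data for illustration.
my @links = map { Local::Link->new($_) } qw(
    http://example.com/foo
    http://example.com/bar
    http://example.com/baz/foo.html
);

my @non_foo_links = links_not_matching( qr/foo/, @links );
print $_->url(), "\n" for @non_foo_links;
```

With a live Mechanize object the one-liner would presumably be `my @non_foo_links = grep { $_->url() !~ /foo/ } $mech->links();`.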
      Why not just get all the links like so ...

      Because I'm lazy (a Perl virtue, you know). But thanks, I was too focused on letting Mechanize do the work and did overlook the obvious. Maybe time to get some sleep.

Re: Negate link search in Mechanize
by Anonymous Monk on Nov 02, 2011 at 13:34 UTC
    my @links = $mech->find_all_links(url_regex => qr/^(?!.*foo)/);
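This works because `(?!.*foo)` is a negative lookahead anchored at the start of the string: the pattern matches only when 'foo' does not appear anywhere later in the URL. Here is a quick check on plain strings; the sample URLs are made up for illustration.

```perl
use strict;
use warnings;

# Negative lookahead: match at the start of the string only if
# "foo" does not appear anywhere after that point.
my $no_foo = qr/^(?!.*foo)/;

# Hypothetical sample URLs, for illustration only.
my @urls = qw(
    http://example.com/foo
    http://example.com/bar
    http://example.com/baz/foo.html
    http://example.com/
);

# A positive match against $no_foo selects the same URLs as a
# negated match against /foo/.
my @by_lookahead = grep { $_ =~ $no_foo } @urls;
my @by_negation  = grep { $_ !~ /foo/ } @urls;

print "$_\n" for @by_lookahead;
```

Since url_regex takes a compiled regex rather than a condition, folding the negation into the pattern itself is what lets find_all_links do the filtering.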
