Negate link search in Mechanize

by Anonymous Monk
on Nov 02, 2011 at 04:50 UTC (#935289, perlquestion)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to scrape some links using Mechanize and cannot find a way to negate the search argument. I need to find links that do not contain (for example) 'foo'. I tried Super Search, perlre, and the Mechanize docs without success, but I feel like I'm overlooking something obvious. Can anyone help?

    # this finds the links I do not want:
    my @links = $mech->find_all_links(url_regex => qr/foo/);

    # these (as well as some really bizarre attempts) don't work:
    my @links = $mech->find_all_links(url_regex => !qr/foo/);
    my @links = $mech->find_all_links(url_regex !~ qr/foo/);

Re: Negate link search in Mechanize
by mwp (Hermit) on Nov 02, 2011 at 04:57 UTC
Re: Negate link search in Mechanize
by Plankton (Priest) on Nov 02, 2011 at 05:04 UTC
    Why not just get all the links like so ...
    ...
    use WWW::Mechanize;
    my $mech = WWW::Mechanize->new();
    $mech->get( $url );
    my @links = $mech->links();

    ... then iterate over @links. Something like ...

    for my $l (@links) {
        next if $l->url() =~ /foo/;
        push ( @non_foo_links, $l );
        ...
      Why not just get all the links like so ...

      Because I'm lazy (a Perl virtue, you know). But thanks, I was too focused on letting Mechanize do the work and did overlook the obvious. Maybe time to get some sleep.
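
The loop suggested above can also be written as a single grep. This is only a minimal sketch: it uses plain URL strings (made-up sample data) in place of the link objects returned by $mech->links(), so it runs without a network connection. With real link objects the test inside the block would presumably be `$_->url() !~ /foo/` instead.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the URLs of $mech->links().
my @urls = (
    'http://example.com/foo/index.html',
    'http://example.com/bar.html',
    'http://example.com/baz?x=foo',
    'http://example.com/',
);

# Keep only the URLs that do NOT match /foo/.
my @non_foo_urls = grep { $_ !~ /foo/ } @urls;

print "$_\n" for @non_foo_urls;
```

grep returns the elements for which the block is true, so the separate result array and the explicit next/push are no longer needed.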

Re: Negate link search in Mechanize
by Anonymous Monk on Nov 02, 2011 at 13:34 UTC
    my @links = $mech->find_all_links(url_regex => qr/^(?!.*foo)/);
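
Why this works: `(?!...)` is a zero-width negative lookahead, so `^(?!.*foo)` matches at the start of the string only when `.*foo` cannot match from there, that is, when 'foo' appears nowhere in the URL. By contrast, `!qr/foo/` in the original question is not a negated pattern at all: it applies boolean negation to the compiled regex object and yields a false value. The sketch below checks the pattern against a few made-up URL strings (the sample data and variable names are assumptions for illustration, not part of the thread).

```perl
use strict;
use warnings;

# Pattern from the reply: matches only when "foo" appears nowhere in the string.
my $without_foo = qr/^(?!.*foo)/;

# Hypothetical sample URLs, for illustration only.
my @sample_urls = (
    'http://example.com/foo.html',
    'http://example.com/about.html',
    'http://example.com/docs/food.html',
    'http://example.com/contact.html',
);

# A positive match against the lookahead pattern selects the non-foo URLs.
my @matching = grep { $_ =~ $without_foo } @sample_urls;

print "$_\n" for @matching;
```

Note that 'food.html' is rejected too, because the pattern tests for the substring 'foo' anywhere in the URL.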
