
Re^3: Create a ipset for blocking networks based on internet sources

by jwkrahn (Abbot)
on Apr 25, 2012 at 01:49 UTC


in reply to Re^2: Create a ipset for blocking networks based on internet sources
in thread Create a ipset for blocking networks based on internet sources

You're welcome.

I made a little mistake in one suggestion:

qr/(^([0-9]{1,3}\.){3}[0-9]{1,3})/,
...
qr/^(\d.*\d)/,
...
qr/(.*)/,
...
foreach ( $response->content =~ /$regex/g ) {

And you "corrected" thusly:

qr/\n(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/, # 46.137.194.0 46.137.194.255 24 2650

The problem is that this pattern needs a newline before the address, so it matches on every line except the first. The proper solution is to use the /m option so that ^ matches at the beginning of every line:

qr/(^([0-9]{1,3}\.){3}[0-9]{1,3})/m,
...
qr/^(\d.*\d)/m,
...
qr/(.*)/,
...
foreach ( $response->content =~ /$regex/g ) {
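To see the difference between the two anchors, here is a minimal sketch. The sample text is made up to look like the block list, and the variable names are only for illustration:

```perl
use strict;
use warnings;

# Sample text in the same shape as the block list (illustrative data).
my $text = "116.45.99.0 116.45.99.255 24 1799\n"
         . "46.21.150.0 46.21.150.255 24 1708\n"
         . "121.243.146.0 121.243.146.255 24 1446\n";

# Anchoring on a literal newline skips the first line,
# because nothing precedes it.
my @after_newline = $text =~ /\n(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g;

# With /m, ^ matches at the start of the string and after each newline.
my @line_start = $text =~ /^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/mg;

print "newline anchor: @after_newline\n";
print "/m anchor:      @line_start\n";
```

The first list should be missing 116.45.99.0, while the second contains all three addresses.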


my @sys = (qw(ipset create), "temp_$set_name", split / /,$set_type);
...
@sys = (qw(ipset create -exist), $set_name, split / /,$set_type);

The use of / / with split may not do what you want, and it is certainly not what the shell would do: / / splits on every single space, so leading or repeated spaces produce empty fields. You should use ' ' instead, which splits on runs of whitespace and ignores leading whitespace:

my @sys = (qw(ipset create), "temp_$set_name", split ' ',$set_type);
...
@sys = (qw(ipset create -exist), $set_name, split ' ',$set_type);
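A small sketch of the two split forms side by side. The value of $set_type here is a hypothetical example with a leading space and a doubled space, not the one from the script:

```perl
use strict;
use warnings;

# Hypothetical set type string with a leading space and a doubled space.
my $set_type = " hash:net  family inet";

my @by_pattern = split / /, $set_type;   # one field per single space
my @by_string  = split ' ', $set_type;   # awk-like whitespace splitting

print scalar(@by_pattern), " fields with / /\n";
print scalar(@by_string),  " fields with ' '\n";
```

With / / the list contains empty strings that would be handed to ipset as empty arguments; with ' ' only the three words remain.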


$fh->open("> $f_dates_last") || die "Unable to save timestamp urls in $f_dates_last: $?";

The $? variable will have no useful information if open fails, because it holds the status of the last child process. You should use $! or $^E instead.
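As a rough illustration, the sketch below forces an open to fail and reports it with $!. It uses the built-in three-argument open rather than the IO::File method from the script, and the path is a made-up one inside a temporary directory:

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical path inside a directory that does not exist, so open must fail.
my $dir = tempdir( CLEANUP => 1 );
my $f_dates_last = File::Spec->catfile( $dir, 'no_such_subdir', 'dates_last' );

my $message = '';
if ( open my $fh, '>', $f_dates_last ) {
    close $fh;
}
else {
    # $! carries the operating system error for the failed open.
    $message = "Unable to save timestamp urls in $f_dates_last: $!";
}
print "$message\n";
```

In a real script the else branch would normally be a die with the same message.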

Re^4: Create a ipset for blocking networks based on internet sources
by mimosinnet (Beadle) on Apr 25, 2012 at 18:40 UTC

    Thanks jwkrahn, I really appreciate the corrections! I have included them in the script and read the section on error variables in perlvar.

    Also, my first version used Moose (influenced by the book "Modern Perl"). After looking at the script and comparing it with the bash scripts, I found that it took more time and resources. I guess that Moose is for larger and more complex projects, so I have rewritten it in what I assume is "procedural programming". It was trivial to take Moose out, transforming the package into a subroutine.

    One thing that is puzzling me is the regex. In a file in this format:

    Start End Netblock Attacks Name Country email
    116.45.99.0 116.45.99.255 24 1799
    46.21.150.0 46.21.150.255 24 1708
    121.243.146.0 121.243.146.255 24 1446

    Applying this regex:

     qr/^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/m

    Gives me:

    116.45.99.0 46.21.150.0 121.243.146.0

    And this regex:

    qr/^((\d{1,3}\.){3}\d{1,3})/m

    Gives me:

    116.45.99.0 99. 46.21.150.0 150. 121.243.146.0 146.

    I would expect the same result.

      qr/^((\d{1,3}\.){3}\d{1,3})/m

      The problem there is that you have two sets of capturing parentheses, so in list context each match returns both $1 and $2. You can change the inner capturing parentheses to non-capturing parentheses:

      qr/^((?:\d{1,3}\.){3}\d{1,3})/m
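A minimal sketch of the two patterns applied to one line of the sample data (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $line = "116.45.99.0 116.45.99.255 24 1799\n";

# Two capturing groups: list context returns $1 and $2 for every match.
# The repeated inner group keeps only its last repetition.
my @both  = $line =~ /^((\d{1,3}\.){3}\d{1,3})/mg;

# Inner group made non-capturing: only the whole address is returned.
my @outer = $line =~ /^((?:\d{1,3}\.){3}\d{1,3})/mg;

print "capturing:     @both\n";
print "non-capturing: @outer\n";
```

The first list holds the address followed by "99.", which is the extra element seen in the output above; the second holds the address alone.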

      And you could change the foreach loop:

      foreach ( $response->content =~ /$regex/g ) {
          @sys = (qw(ipset add), "temp_$set_name", $_);
          system(@sys) == 0 or die "Unable to add $_ to temp_$set_name because: $?";
      }

      To a while loop that only accesses $1:

      while ( $response->content =~ /$regex/g ) {
          @sys = (qw(ipset add), "temp_$set_name", $1);
          system(@sys) == 0 or die "Unable to add $1 to temp_$set_name because: $?";
      }

        Thanks for the tip on the use of the extended pattern (?: ) to not make backreferences!

        It has also been very enlightening to see the relationship between the foreach loop and the $_ variable, and what would be the equivalent of the while loop and the $1 backreference.

        Your second suggestion has made me explore more how the while loop works, realizing that while ( $response->content =~ /$regex/g ) gets the content each time that the loop is called. I have changed it to:

        my $resp = $response->content;
        while ( $resp =~ /$regex/g ) {
            @sys = (qw(ipset add), "temp_$set_name", $1);
        }
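A sketch of why the copy into a variable matters, assuming the usual scalar-context behaviour of //g: the position of the last match is stored with the variable being matched, so each pass of the loop resumes from pos($resp). If the method hands back a fresh copy of the content on every call, which seems to be the case here, there is no stored position to resume from. The sample data below stands in for the downloaded body:

```perl
use strict;
use warnings;

# Stand-in for the downloaded body; the sample data is illustrative.
my $resp = "116.45.99.0 116.45.99.255 24 1799\n"
         . "46.21.150.0 46.21.150.255 24 1708\n";
my $regex = qr/^((?:\d{1,3}\.){3}\d{1,3})/m;

my @seen;
while ( $resp =~ /$regex/g ) {
    # In scalar context each //g match resumes from pos($resp).
    push @seen, "$1 (pos " . pos($resp) . ")";
}
print "$_\n" for @seen;
```

Each address is printed once, with the offset where the next search starts.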

        Your comments have been very useful! Although all these issues are in the documentation, it is sometimes difficult at the beginning to find the right section at the right time. It has been great to apply it to a specific example. Very grateful!

        Cheers!
