Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

Generic Parsing class with callback

by QuillMeantTen (Friar)
on Aug 25, 2015 at 08:03 UTC ( [id://1139795]=CUFP: print w/replies, xml ) Need Help??

Greetings,
I'm not sure if this is a real cool use, but with this class you can specify any regex for a standard parser tool as well as any callback in string form (pretty useful if you want to do all of that in a configuration file under version management)

I have some serious doubts about the way I used eval in this code so I would welcome feedback on how to do it better. I looked over the forum for ways to handle the storing of substitution regex and back references but what I found I felt I did not understand enough to re use properly so if you feel like engraving it into my forehead so I can get it feel free to do so (I'm refering to this node and that one

now here is the code :

package Security::Monitoring::Logs::Normalization::Parser; use 5.006; use strict; use warnings; use Carp; use Security::Monitoring::Utils; =head1 NAME Normalization::Parser =head1 VERSION Version 0.01 =cut our $VERSION = '0.01'; =head1 SYNOPSIS my %params = { regex=>'single quoted string regex to be evalued', name=>"my_parser_name", tag=>"log_type_or_anything", callback_ref=>'string containing a sub that will be called whe +never matching does or does not happen,like writing to a file handle +' } my $instance = $class->init(\%params); open my $fh, '<','to_be_parsed.log'; $instance->parse($fh); the callback ref should use one input and one output, see tests fo +r examples =head1 DESCRIPTION this module provides a class for the parser instance that will be +in charge of normalizing each logs and store the metadata =head1 SUBROUTINES/METHODS =head2 new instance creator =cut sub new { my $class = shift; my $params = shift; if (!defined($params)){ croak("params are not defined!\n"); } my $self = {}; bless $self,$class; $self->_init($params); return $self; } =head2 _init instance initialisation subroutine =cut sub _init{ my $self = shift; my $params = shift; #makes code from the callback string my $callback = eval $params->{callback_ref}; $params->{callback_ref} = $callback; my @keys = keys %{$params}; foreach my $key (@keys){ $self->{$key} = $params->{$key}; } } =head2 parse starts parsing from a file handle ref =cut sub parse{ my ($self,$input,$output) = @_; if (!defined($self->{regex})|| !defined($input) || !defined($outpu +t)|| !defined($self->{callback_ref})){ croak "sorry, my caller must have a defined regex and my input + and output have to be defined, the caller callback has to be too"; } else{ my $reghash_ref = $self->{regex}; while(<$input>){ my $line = $_; my $result = undef; eval '$result = $line =~ '.$self->{regex}; $self->{callback_ref}($result,$line,$output); } } } 1; # End of Security::Monitoring::Logs::Normalization::Parser

And, as usual, the tests, iirc its almost 100% cover

#!perl -T use 5.006; use strict; use warnings; use Test::More; use Test::Exception; use Security::Monitoring::Logs::Normalization::Parser; use diagnostics; BEGIN { plan tests => 9; use_ok( 'Security::Monitoring::Logs::Normalization::Parser' ) || p +rint "Bail out!\n"; } my $class = 'Security::Monitoring::Logs::Normalization::Parser'; diag( "Testing Parser module $Security::Monitoring::Logs::Normalizatio +n::Parser::VERSION, Perl $], $^X" ); my $callback = 'sub { my($result,$input,$output) = @_; if ($result){ print $output $input; } }'; my $params ={name=>"dummy",tags=>"silly",regex=>'m/tata/',callback_ref +=>$callback}; our $parser = $class->new($params); dies_ok(sub{my $new = $class->new(undef)},"new dies when params undef" +); ok(defined($parser),'my dummy parser correctly created'); my $input = "tata\ntoto\ntutu"; my $output; open my $fh_input, '<',\$input; open my $fh_output,'>',\$output; $parser->parse($fh_input,$fh_output); is($output,"tata\n",'match parsing works'); $parser->{regex} = 's/(tata)/$1tutu/'; open $fh_output, '>',\$output; seek $fh_input,0,0; $parser->parse($fh_input,$fh_output); is($output,"tatatutu\n","substitution parsing works"); dies_ok(sub{ $parser->parse(undef,$fh_output)},"parse dies with undefi +ned input"); dies_ok(sub{$parser->parse($fh_input,undef)}, "parse dies with undefin +ed output"); dies_ok(sub {$parser->{regex} = undef;$parser->parse($fh_input,$fh_output)},"parse dies with undefined regex"); $parser->{regex} = 'm/tata/'; dies_ok(sub {$parser->{callback_ref} = undef; $parser->parse($fh_input,$fh_output); },"parser dies with undefined callback"); close $fh_output;
Thank you for your feedback :)

Replies are listed 'Best First'.
Re: Generic Parsing class with callback
by stevieb (Canon) on Aug 31, 2015 at 19:55 UTC

    I would force the caller to send in a callback instead of doing eval on a string...

    use warnings; use strict; sub work { my ($x, $y, $callback) = @_; if (ref $callback ne 'CODE'){ die "callback param needs to be a code reference"; } $callback->($x, $y); } # with a ref to an existing sub sub this { my ($x, $y) = @_; print "$x $y\n"; } work(3, 4, \&this); # with an explicit code ref my $cref = sub { print "$_\n" for @_; }; work(2, 2, $cref); # or inline work(2, 3, sub { print "$_[0], $_[1]\n"; })

    ...and in your documentation, clearly state what parameters (the number and type) the callback should expect to receive.

    It doesn't look like you need an eval later on either. eval is pretty much a try/catch to ensure no problems happened, and puts any errors it encounters in $@ so you can handle them gracefully. I think your else could be re-written as such (untested), given that it looks like you're only trying to see if a match was made, not the matches themselves:

    my $regex = qr/$self->{regex}/; while(<$input>){ my $result = /$regex/; $self->{callback_ref}($result,$line,$output); }

    -stevieb

      I tried storing the regex with a simple qr but it broke whenever I tried to do a substitution regex, that's why I chose this quite unelegant way to handle it.
      If you have a better idea though I'd gladly change the way its written :)

      everything else was very good advice, it looks better but most importantly it feels cleaner.
      If you'd be so kind, have a look at the improved code!

      Update : the use of the eval might not seem obvious, this submodule has been written to help normalize Logs before putting them into a database hence the provisioning for substitutions, I want to extract metadata to fill dublin core style tables
      Since I plan to use the same objects to scrap the database for suspicious behaviour pattern matching also was of interest

      and the updated test file

        I'll take a closer look later when I have more time, but I didn't realize you were doing substitution regexes. For that, you either have to use eval or here are two other methods that I know how to get around the eval call.

        Force the caller to pass in a code reference with the entire details of the substitution...

        my $str = "This is one thing. This."; sub cref_regex { my ($str, $re_cref) = @_; print $re_cref->($str); } my $cref = sub { $_[0] =~ s/This/That/ig; }; cref_regex($str, $cref);

        ...or, force the caller to send in both a $search and if required, an optional $replace variable...

        sub separate_regex { my ($str, $search, $replace) = @_; if (! $replace){ print $str =~ /$search/; } else { print $str =~ s/$search/$replace/g; } } separate_regex($str, 'This', 'That');

        Both of those are more work for the caller, so I can't tell which is better in the long run. I've just always heard "don't use string eval", so I try to avoid it where possible.

        In a recent module of mine I updated, I opted for the $search, $replace option.

        -stevieb

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: CUFP [id://1139795]
Approved by herveus
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others cooling their heels in the Monastery: (3)
As of 2024-04-24 22:09 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found