PerlMonks  

Re: Can't seem to match from one varibale to next one.

by tobyink (Canon)
on Sep 16, 2012 at 18:45 UTC


in reply to Can't seem to match from one varibale to next one.

This bit:

while ( $line = <> )

... is only reading a single line at a time. Adjusting your regular expression won't make a difference because the variable that you're matching it against is only ever one line!

You need to loop through the lines, doing nothing except accumulating them into a variable. Only when you hit the start of a new record do you process what you have accumulated (and then reset it).

Here's a somewhat simplified example:

use Data::Dumper;

sub process_record {
    my $record = join q[], @{+shift};
    warn "Malformed record: $record"
        unless $record =~ /^ \[ (.+?) \] \s+ \[ (.+?) \] \s+ (.+) $/xs;

    local $Data::Dumper::Terse = 1;
    print "Got record ", Dumper +{
        datetime => $1,
        status   => $2,
        info     => $3,
    };
}

my $current_record;
while (<DATA>) {
    # we have hit a new record
    if (/^ \[ \d{4}-\d{2}-\d{2} /x) {
        process_record($current_record) if $current_record;
        $current_record = [];    # start a new record
    }
    push @$current_record, $_;
}

# don't forget to process the final record
process_record($current_record) if $current_record;

__DATA__
[2012-09-14 16:55:22,498] [ACTIVE] INFO - this is a single line
[2012-09-14 16:55:22,498] [ACTIVE] INFO - this is a multi
line record
[2012-09-14 16:55:22,500] [ACTIVE] INFO - this is another single line
[2012-09-14 16:55:22,500] [ACTIVE] INFO - this is yet single line
[2012-09-14 16:55:22,500] [ACTIVE] INFO - this one is
on two lines
[2012-09-14 16:55:22,500] [ACTIVE] INFO - one last record

perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'

Replies are listed 'Best First'.
Re^2: Can't seem to match from one varibale to next one.
by Ben328 (Novice) on Sep 16, 2012 at 19:24 UTC
    Thank you for your answer. Since I am a newbie to Perl, I am also trying to understand what you are actually doing in the script. I looked online, but the descriptions of the following were pretty vague. Here are my questions:

    What is "join q[], @{+shift}" doing? Normally join is like join("expression", "list")

    And what is this line doing? local $Data::Dumper::Terse = 1;

    Hope you can help me understand this script better. Thanks, Ben

      G'day Ben328,

      Welcome to Perl and the monastery.

      I don't know where you're looking online for Perl documentation. Except in very rare cases, the following two sites have provided all the Perl documentation I've needed for many years:

      • perldoc.perl.org - Perl Programming Documentation. You'll find Perl syntax, functions, built-in modules, etc. here.
      • search.cpan.org - CPAN (Comprehensive Perl Archive Network). You'll find user contributed modules here.

      All documentation links I provide below are to pages on the first of those sites.

      What is "join q[], @{+shift}" doing? Normally join is like join("expression", "list")

      If online documentation has said that join("expression", "list") is normal syntax, it is wrong and I wouldn't use that site again. Did it say that or have you paraphrased what it said or, perhaps, taken it out of context? That piece of code evaluates to just "list" making join("expression", and the closing parenthesis completely pointless:

      $ perl -Mstrict -Mwarnings -E '
          my $x = join("expression", "list");
          say $x;
          my $y = "list";
          say $y;
      '
      list
      list

      Take a look at join which shows the syntax as: join EXPR, LIST. (It only has one example which shows join(EXPR, LIST) - read on for a further explanation).

      When you write a subroutine, e.g. sub some_function { ... }, you'd call it like this: some_function(arg1, ..., argN). Perl's built-in functions can, but don't need to, use the parentheses. It's normally perfectly fine to omit the parentheses; here's an example where you might include them:

      print 'Tabbed items: ', join("\t", @items), "\n";
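
      A minimal sketch of the two call styles, assuming a made-up @items list (it is not from the original thread). Both calls build the same string; the parentheses only matter when join sits in the middle of a longer argument list, as in the print statement above.

```perl
use strict;
use warnings;

# @items is a hypothetical list used only for illustration.
my @items = qw(alpha beta gamma);

# Without parentheses: join takes everything to its right as its list.
my $without_parens = join "\t", @items;

# With parentheses: the argument list ends at the closing parenthesis.
my $with_parens = join("\t", @items);

print "Same result\n" if $without_parens eq $with_parens;

# Inside a longer list, the parentheses stop the inner join from
# swallowing the trailing "\n" as one more item to join.
my $line = join('', 'Tabbed items: ', join("\t", @items), "\n");
print $line;
```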

      [Advanced exception: There is a way to make your functions act like Perl's built-in functions. It's generally a bad idea to do this. I strongly recommend that you do not do this - certainly not until you are way past the "newbie" stage. As you may see it in others' code, here's what I'm recommending you don't use: perlsub - Prototypes.]

      It is all too easy to read '' as ". Can you see the difference in your browser? Perl has a number of Quote-Like Operators which you can use to avoid this potential confusion. That documentation shows q/.../, q!...! and q(...); tobyink has used q[...]; my personal preference is for q{...}; you can pick some other delimiter if you want. q[] is a zero-length string: it is unambiguous and doesn't require you to decide if you're looking at one double-quote or two single-quotes.
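
      As a rough illustration of the quote-like operators mentioned above (the variable names here are made up for the example), the following sketch checks that the different spellings of the empty string are equivalent and shows that q[...] does not interpolate while qq[...] does:

```perl
use strict;
use warnings;

# Four ways to write the same zero-length string.
my @empties = ('', "", q[], q{});

# True only if none of them has a non-zero length.
my $all_empty = !grep { length $_ } @empties;
print "all empty\n" if $all_empty;

# q[...] behaves like single quotes: no interpolation takes place.
my $name    = 'Ben';
my $literal = q[Hello, $name];     # holds the text $name, not "Ben"
my $interp  = qq[Hello, $name];    # qq[...] behaves like double quotes
print "$literal\n$interp\n";
```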

      The start of the process_record function could have been written more verbosely as:

      sub process_record {
          my $array_ref = shift;
          my $record = join q[], @{$array_ref};
          ...

      See shift if you're not sure how that works. The code @{shift} is ambiguous: it could be read as @shift or as @{shift(@_)}, and Perl resolves it to the array @shift. Adding the + tells Perl you mean the second interpretation. See perlop - Symbolic Unary Operators for a discussion of this.
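
      To see that the short and long forms behave the same, here is a small sketch. The sub names join_short and join_long and the sample data are invented for this example; only the join q[], @{+shift} expression comes from tobyink's code.

```perl
use strict;
use warnings;

# Short form, as in tobyink's code: unary + makes Perl parse "shift"
# as a function call, so @{...} dereferences the array ref it returns.
sub join_short {
    return join q[], @{+shift};
}

# Long form: the same steps with a named temporary variable.
sub join_long {
    my $array_ref = shift;
    return join q[], @{$array_ref};
}

# Hypothetical sample data, not from the original thread.
my $lines = [ "first line\n", "second line\n" ];

print join_short($lines);
print join_long($lines);
```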

      And what is this line doing? local $Data::Dumper::Terse = 1;

      I don't know what part of that you're having trouble with. Have a read of local and Data::Dumper ($Data::Dumper::Terse is mentioned in a few places). If you're still in the dark with one or more parts of that line of code, please specify where you're having problems.
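
      In case it helps, here is a minimal sketch of what that line changes. The sub name terse_dump and the sample hash are assumptions made for this example. With $Data::Dumper::Terse set to 1, Dumper leaves out the "$VAR1 = " prefix, and local puts the old value back when the enclosing block is left.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical sample data, not from the original thread.
my $data = { status => 'ACTIVE' };

# Returns the dump without the "$VAR1 = " prefix. "local" restores the
# previous value of $Data::Dumper::Terse when this sub returns.
sub terse_dump {
    local $Data::Dumper::Terse = 1;
    return Dumper($_[0]);
}

my $terse   = terse_dump($data);
my $default = Dumper($data);    # Terse is back to its default here

print "terse:   $terse";
print "default: $default";
```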

      -- Ken

        This is great. I will check the docs that you have mentioned and go on from there. I am just starting to learn Perl and was confused before. Thanks a bunch. You guys are great. Ben

      my $record = join q[], @{+shift}; does a whole lot of stuff in one statement, but basically it takes a reference to an array of strings and joins them all into a single string.

      It could alternatively be written as:

      my $ref_to_list_of_strings = shift @_;
      my @list_of_strings = @{ $ref_to_list_of_strings };
      my $empty_string = q[];
      my $record = join($empty_string, @list_of_strings);

      ... but that sort of coding has been known to cause repetitive strain injuries. What's more, the short way also avoids creating a bunch of temporary variables, so it probably runs (ever so slightly) faster.

