PerlMonks  

Re: Extract a small part of a long sentence using regular expressions

by QM (Parson)
on Dec 02, 2014 at 13:34 UTC ( [id://1108960] )


in reply to Extract a small part of a long sentence using regular expressions

As long as the target is a parenthesized list of integers, this will grab the list:

Update: Fixed the regex to capture correctly by putting parens around the list inside the literal parens, and ignoring captures on the internal group.

    # First, just grab the list
    if (my ($list) = $line =~ /\((\d+(?:,\d+)*)\)/) {
        # split the list by commas, assuming no whitespace
        my @list = split ',', $list;
        # initialise the magic alpha incrementer key
        my $key = 'a';
        my %hash;
        for my $value (@list) {
            next unless $value;
            $hash{$key} = $value;
            # increment magically
            ++$key;
        }
        do_something_with(%hash);
    }

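Then the question is whether you need to do something with %hash for each line, or accumulate these across the whole file. If it's file level, move the my %hash; (and the my $key = 'a';, otherwise every line starts again at 'a' and overwrites earlier entries) to before the loop that reads the lines, and call do_something_with(%hash) after that loop.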
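A minimal sketch of the file-level variant, assuming the lines are already in an array. The helper name accumulate_lists and the two sample lines are made up for illustration; the regex and the skip-zeros rule are the ones from the snippet above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the post): collect the non-zero integers
# from every parenthesized list in @lines into one hash keyed 'a', 'b', ...
sub accumulate_lists {
    my @lines = @_;
    my %hash;
    my $key = 'a';    # declared outside the loop so keys continue across lines
    for my $line (@lines) {
        # same regex as in the post: capture the comma-separated integers
        next unless my ($list) = $line =~ /\((\d+(?:,\d+)*)\)/;
        for my $value (split /,/, $list) {
            next unless $value;    # skip zeros, as in the original loop
            $hash{$key} = $value;
            ++$key;
        }
    }
    return %hash;
}

my %all = accumulate_lists('action(62,1,0)', 'action(0,5,53)');
print join(', ', map { "$_=$all{$_}" } sort keys %all), "\n";
# prints: a=62, b=1, c=5, d=53
```

Keeping both %hash and $key outside the per-line loop is what makes the keys run on from one line to the next.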

Also, do_something_with(%hash) might be better as a hash reference:

do_something_with(\%hash);
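The reason a reference can be better: a hash passed directly is flattened into a list of keys and values, while a reference arrives as a single argument. A small sketch, with two made-up stand-ins for do_something_with (count_args_flat and count_keys_ref are hypothetical names):

```perl
use strict;
use warnings;

# Hypothetical stand-ins for do_something_with(); the names are made up.
sub count_args_flat { return scalar @_ }    # sees a flattened key/value list
sub count_keys_ref  { my ($h) = @_; return scalar keys %$h }    # sees one reference

my %hash = (a => 62, b => 1, c => 5);

print count_args_flat(%hash), "\n";    # 6: three keys plus three values
print count_keys_ref(\%hash), "\n";    # 3: one reference, three keys behind it
```

With the reference, the sub can also take further arguments after the hash without them being swallowed into it.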

-QM
--
Quantum Mechanics: The dreams stuff is made of

Replies are listed 'Best First'.
Re^2: Extract a small part of a long sentence using regular expressions
by AnomalousMonk (Archbishop) on Dec 02, 2014 at 17:40 UTC
    if (my $list = $line =~ /\(\d+(,\d+)*\)/) { ... }

    The problem with this is that it captures only the match success status in the $list scalar:

    c:\@Work\Perl>perl -wMstrict -le
    "my $line = '[AHB_REPORTER][INFO]: action(62,1,0,0,0,0,5,53,9,0,190)D:/XYZ/reg/Tests/Mcu/A_test.cCALL: (null)';
     if (my $list = $line =~ /\(\d+(,\d+)*\)/) {
       print qq{'$list'};
       }
    "
    '1'

    Because a quantified capture group such as (,\d+)* keeps only its last repetition, changing $list to an array @list isn't much better:

    c:\@Work\Perl>perl -wMstrict -le
    "my $line = '[AHB_REPORTER][INFO]: action(62,1,0,0,0,0,5,53,9,0,190)D:/XYZ/reg/Tests/Mcu/A_test.cCALL: (null)';
     if (my @list = $line =~ /\(\d+(,\d+)*\)/) {
       print qq{(@list)};
       }
    "
    (,190)

    (This works the same with or without a /g modifier on the m// match.)
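The three behaviours can be put side by side in a short script. This is only a sketch on a shortened sample line (assumed for illustration); the third form is the corrected regex from the updated parent post.

```perl
use strict;
use warnings;

my $line = 'action(62,1,0,190)';    # shortened sample, assumed for illustration

# Scalar assignment: $status holds only the success of the match.
my $status = $line =~ /\(\d+(,\d+)*\)/;

# List assignment with a repeated group: only the last repetition survives.
my @last = $line =~ /\(\d+(,\d+)*\)/;

# Capture the whole list and make the repeated group non-capturing.
my ($whole) = $line =~ /\((\d+(?:,\d+)*)\)/;

print "status: $status\n";    # status: 1
print "last:   @last\n";      # last:   ,190
print "whole:  $whole\n";     # whole:  62,1,0,190
```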

    To extract all digit groups, you could do something like:

    c:\@Work\Perl>perl -wMstrict -MData::Dump; -le
    "my $line = '[AHB_REPORTER][INFO]: action(62,1,0,0,0,0,5,53,9,0,190)D:/XYZ/reg/Tests/Mcu/A_test.cCALL: (null)';
     if (my @list = $line =~ m{ (?: \G , | action\( ) (\d+) }xmsg) {
       printf qq{'$_' } for @list;
       print '';
       my %hash = do { my $k = 'a'; map { $_ ? ($k++ => $_) : () } @list };
       dd \%hash;
       }
    "
    '62' '1' '0' '0' '0' '0' '5' '53' '9' '0' '190'
    { a => 62, b => 1, c => 5, d => 53, e => 9, f => 190 }

    (Add \s* whitespace flavoring to taste.) (Update: The \G , pattern assumes that a , (comma) never occurs at the beginning of $line, because \G also matches at the very start of the string before any match has been made.)
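One way the whitespace flavoring might look, as a sketch only: the helper name action_numbers and both sample strings are invented here, and where exactly to allow \s* is a guess about what the data could contain.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the post): pull the integers out of an
# action(...) list, tolerating whitespace around the commas and parens.
sub action_numbers {
    my ($line) = @_;
    return $line =~ m{ (?: \G \s* , \s* | action \s* \( \s* ) (\d+) }xmsg;
}

print join(' ', action_numbers('action( 62 , 1,0 ,190 )')), "\n";
# prints: 62 1 0 190

# The caveat from the update: a comma at the very start of the line
# satisfies the \G branch, so a stray number slips in.
print join(' ', action_numbers(',7 action(1,2)')), "\n";
# prints: 7 1 2
```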

    Update: If you want to get a bit fancy, do it all in one swell foop and then just test if the hash has anything in it:

    c:\@Work\Perl>perl -wMstrict -MData::Dump -le
    "my $line = '[AHB_REPORTER][INFO]: action(62,1,0,0,0,0,5,53,9,0,190)D:/XYZ/reg/Tests/Mcu/A_test.cCALL: (null)';
     ;;
     my %hash = do {
         my $k = 'a';
         map { $_ ? ($k++ => $_) : () }
         $line =~ m{ (?: \G , | action [(] ) \K \d+ }xmsg;
         };
     ;;
     if (%hash) {
       dd \%hash;
       }
     else {
       print 'no got';
       }
    "
    { a => 62, b => 1, c => 5, d => 53, e => 9, f => 190 }

    (The \K regex operator comes with Perl versions 5.10+. If your version pre-dates 5.10, let me know and I'll supply a simple fix.)
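The post does not spell out the pre-5.10 fix. One possibility, offered here as an assumption rather than as the author's answer, is to go back to a capture group as in the earlier example, so that \K is not needed at all. The sample line is shortened for illustration.

```perl
use strict;
use warnings;

my $line = 'action(62,1,0,0,0,0,5,53,9,0,190)';    # shortened sample line

my %hash = do {
    my $k = 'a';
    # (\d+) captures the digits directly, so \K is not required
    map { $_ ? ($k++ => $_) : () }
        $line =~ m{ (?: \G , | action [(] ) (\d+) }xmsg;
};

print join(', ', map { "$_ => $hash{$_}" } sort keys %hash), "\n";
# prints: a => 62, b => 1, c => 5, d => 53, e => 9, f => 190
```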

      This worked like a charm! Thank you. I have now learnt what a 'named backreference' is and what it can do, and also how magical the incrementer my $k = 'a'; can be!
Re^2: Extract a small part of a long sentence using regular expressions
by Anonymous Monk on Dec 02, 2014 at 16:00 UTC
    Thank you! The fact that one small thing in Perl can be figured out in so many different ways gives me the creeps! This is a very interesting approach and I must admit I had not thought of this...

    Just one question though: this magic alpha incrementer, I don't get it. Is it like a normal counter where we say $count = 1; and then increment it, or is this something different?

      It is like a normal incrementer but it works on string variables, which is the 'magical' part. The variable must have been used only in string contexts since it was set, and its value must match the pattern:

      /^[a-zA-Z]*[0-9]*$/

      and must not be the empty string. It's pretty much designed for cases like this. If you have more than 26 keys, it will go from 'z' to 'aa' and so on. The autodecrement operator (--) ISN'T magical, and the magic covers only the ASCII letters and digits in that pattern rather than Unicode in general, but it's still pretty cool.
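A short sketch of the rules just described. The starting strings are picked only for illustration.

```perl
use strict;
use warnings;

my @results;
for my $start ('a', 'z', 'Az', 'zz', 'a9') {
    my $s = $start;    # a fresh copy that has only ever been used as a string
    $s++;              # magical string increment, with carry
    push @results, "$start becomes $s";
}
print "$_\n" for @results;
# a becomes b
# z becomes aa
# Az becomes Ba
# zz becomes aaa
# a9 becomes b0

# Decrement has no such magic: 'aa' is treated as the number 0.
my $d = 'aa';
{ no warnings 'numeric'; $d--; }
print "aa-- gives $d\n";    # aa-- gives -1
```

The 'no warnings' block is there only because decrementing a non-numeric string would otherwise emit a warning under 'use warnings'.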
