PerlMonks  

Array Filter and Lookahead

by daugh016 (Initiate)
on Nov 20, 2012 at 17:10 UTC (#1004756=perlquestion)
daugh016 has asked for the wisdom of the Perl Monks concerning the following question:

Monks, I am getting very confused with a concept. I am trying to adapt some code that I wrote from matching strings to matching elements of an array. My original code would take a string and return the last number in the string using a lookahead:

my $test_str = "1 text text text 2 text 3 ";
my $test_out = q{};
my $sec_num_comb_pat = $empty_str;
$sec_num_comb_pat .= '^(([0-9]+)(?![^0-9]*[0-9]))';
my $sec_num_rx = qr{$sec_num_comb_pat}xms;
if ( $test_str =~ m{($sec_num_rx)}xms ) {
    $test_out = $1;
}
print "\$test_out\n" . $test_out . "\n";

The output would be 3

Now, I have an array @test that looks like the following:

#test[ 0]: "text text text 1 text"
#test[ 1]: "1 text text text 2 text 3 "
#test[ 2]: "text 1 text text 2 text"
#test[ 3]: "text text text 1 text 2 3.0"
#test[ 4]: "1 2 3 4 5 6 7 8 9 10 11"
#test[ 5]: "text text text text"

I want to get the last number in every element and put it into a new array @test_filtered. The output would look like the following:

#test_filtered[ 0]: "1"
#test_filtered[ 1]: "3"
#test_filtered[ 2]: "2"
#test_filtered[ 3]: "3"
#test_filtered[ 4]: "11"
#test_filtered[ 5]: ""

I have tried countless different things. Here is one attempt I have tried:

foreach (@test) {
    @test_filtered = grep { /^(([0-9]+)(?![^0-9]*[0-9]))/xsm } @test;
}

Please help me! Thanks ahead of time!




UPDATE - 11.20.12 at 3:02 PM CST

Thank you everyone for the feedback! I have had some stuff come up tonight and will look at all the replies tomorrow.

Also, the reason for the truncating of the decimal is that the numbers are actually section numbers so sometimes they are written "section 1" and sometimes it could be "section 1.", "section 1)", "section 1.)", etc. I am just actually wanting the last section number in the string.

Thank you again Monks for all the help!

Re: Array Filter and Lookahead
by thundergnat (Deacon) on Nov 20, 2012 at 17:25 UTC

    Update: added intermediate variables.

    my @test = (
        "text text text 1 text",
        "1 text text text 2 text 3 ",
        "text 1 text text 2 text",
        "text text text 1 text 2 3.0",
        "1 2 3 4 5 6 7 8 9 10 11",
        "text text text text"
    );

    my @filtered = map {/(\d+)(?:\.0*)?\D*$/; $1} @test;

    print join "\n", @filtered;

    yields:
    1
    3
    2
    3
    11
    
    

    Or possibly, depending on your needs, something like:

    my @filtered = map {/(\d+(?:\.\d*)?)\D*$/; 0 + $1 || ''} @test;

    print join "\n", @filtered;

      The solutions you propose have various problems:

      • handling a number like 3.0, of which daugh016 seems to want to capture only the integer portion;
      • handling 0;
      • and the problem of having capture variables undefined if a match fails (or else having the value of these variables persist from a previous match).

      Here is a possible solution avoiding those problems:

      >perl -wMstrict -le
      "my @test = (
         'text text text 1 text',
         '1 text text text 2 text 3 ',
         'text 1 text text 2 text',
         'text text text 1 text 2 3.0',
         '1 2 3 4 5 6 7 8 9 10 11',
         'text 0',
         'text 3.1',
         'text text text text',
         );
       ;;
       'Zonk' =~ m{ (\w+) }xms;
       ;;
       my @filtered = map {/(\d+)(?:\.0*)?\D*$/; $1} @test;
       printf qq{'$_' } for @filtered;
       print qq{\n};
       ;;
       @filtered = map {/(\d+(?:\.\d*)?)\D*$/; 0 + $1 || ''} @test;
       printf qq{'$_' } for @filtered;
       print qq{\n};
       ;;
       @filtered =
          map { m{ (\d+) (?: [.] \d*)? \D* \z }xms ? $1 : '' } @test
          ;
       printf qq{'$_' } for @filtered;
       print qq{\n};
      "
      '1' '3' '2' '3' '11' '0' '1' 'Zonk'
      Argument "Zonk" isn't numeric in addition (+) at -e line 1.
      '1' '3' '2' '3' '11' '' '3.1' ''
      '1' '3' '2' '3' '11' '0' '3' ''

Re: Array Filter and Lookahead
by moritz (Cardinal) on Nov 20, 2012 at 19:06 UTC

    If you really want the last integer in a string, you can find that as /.*(?<!\d)(\d+)/, and get the numbers as

    my @filtered = map { /.*(?<!\d)(\d+)/; $1 // '' } @lines;

    However you seem to want to turn 3.0 into 3 (and not 0), and I don't know the rules behind that. Do you want to permit decimals, and cut off zero-decimals behind the dot? Or something else?

      >perl -wMstrict -le
      "my @lines = ('xx', '0 xx', '1 xx 3.2');
       ;;
       'Zonk' =~ m{ (\w+) }xms;
       ;;
       my @filtered = map { /.*(?<!\d)(\d+)/; $1 // '' } @lines;
       printf qq{'$_' } for @filtered;
      "
      'Zonk' '0' '2'

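The stale 'Zonk' in that output appears because `$1` is read without checking whether the match succeeded. A minimal sketch of one way around it, reusing moritz's pattern unchanged and testing the match result in a ternary (the sample data is the same as in the one-liner above; nothing here is claimed to be the intended fix):

```perl
use strict;
use warnings;

my @lines = ('xx', '0 xx', '1 xx 3.2');

# Prime $1 with an unrelated value to show that it no longer leaks through.
'Zonk' =~ m{ (\w+) }xms;

# Read $1 only when the match succeeded; otherwise fall back to ''.
my @filtered = map { /.*(?<!\d)(\d+)/ ? $1 : '' } @lines;

print join(' ', map { "'$_'" } @filtered), "\n";
```

Because the ternary keys off the match itself rather than off the value of `$1`, a legitimate `0` survives and a failed match yields the empty string.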
      Thanks for the feedback! The numbers are actually section numbers so sometimes they are written "section 1" and sometimes it could be "section 1.", "section 1)", "section 1.)", etc.

        So what's the extraction rule? One can make a guess based on the examples just given, but an explicit definition is always nice to have.
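One guess at such a rule, based only on the examples given in this thread and not confirmed by daugh016: treat every run of digits as a candidate section number, swallow an optional fractional part such as `.0`, ignore any trailing punctuation, and keep the last candidate. A sketch under those assumptions, with `last_section_number` as a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the integer part of the last number in the
# string, or '' when the string contains no digits at all.
sub last_section_number {
    my ($str) = @_;

    # A list-context /g match returns every captured integer part; the
    # non-capturing group consumes a fractional part, so "3.0" yields "3".
    my @nums = $str =~ /(\d+)(?:\.\d+)?/g;

    return @nums ? $nums[-1] : '';
}

my @test = (
    "text text text 1 text",
    "1 text text text 2 text 3 ",
    "text 1 text text 2 text",
    "text text text 1 text 2 3.0",
    "1 2 3 4 5 6 7 8 9 10 11",
    "text text text text",
    "see section 1.)",
);

my @test_filtered = map { last_section_number($_) } @test;

printf qq{#test_filtered[%2d]: "%s"\n}, $_, $test_filtered[$_]
    for 0 .. $#test_filtered;
```

Collecting all matches and indexing with `[-1]` avoids both the lookahead and the stale-capture problem, at the cost of scanning the whole string. Whether something like "1.2.3" should count as one section number or several is left open here, as it is in the thread.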

Re: Array Filter and Lookahead
by Kenosis (Priest) on Nov 20, 2012 at 19:13 UTC

    Here's another option:

    my @test_filtered = map { (/(\d+)[\d.]*/g)[-1] // '' } @test;

    Update: changed to // to handle 0s. Thanks again, AnomalousMonk.

      >perl -wMstrict -le
      "my @test = ('0 xx', 'xx 0', '0');
       ;;
       my @test_filtered = map { (/(\d+)[\d.]*/g)[-1] || '' } @test;
       printf qq{'$_' } for @test_filtered;
      "
      '' '' ''

      Update:  // vice  || does the trick.
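The difference is that `||` tests for truth while `//` (defined-or, available since Perl 5.10) tests for definedness, and the string '0' is defined but false. A small sketch using Kenosis's expression, wrapped in two helper subs whose names are made up for this comparison:

```perl
use strict;
use warnings;
use 5.010;    # the // operator needs Perl 5.10 or later

my @test = ('0 xx', 'xx 0', '0', 'no digits');

# Hypothetical helpers, named only for this illustration.
sub last_num_or  { my ($s) = @_; return ( $s =~ /(\d+)[\d.]*/g )[-1] || '' }
sub last_num_dor { my ($s) = @_; return ( $s =~ /(\d+)[\d.]*/g )[-1] // '' }

for my $s (@test) {
    printf "%-12s ||: '%s'  //: '%s'\n",
        "'$s'", last_num_or($s), last_num_dor($s);
}
```

For the three inputs containing a zero, the `||` version returns the empty string and the `//` version returns '0'; for the input without digits, both return the empty string.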

        Excellent catch, thank you! Changed disjunct positions...

Re: Array Filter and Lookahead
by CountZero (Chancellor) on Nov 20, 2012 at 20:13 UTC
    There is more than one way to do it:
    use Modern::Perl;

    my @test = (
        "text text text 1 text",
        "1 text text text 2 text 3",
        "text 1 text text 2 text",
        "text text text 1 text 2 3.0",
        "1 2 3 4 5 6 7 8 9 10 11",
        "text text 4.5 text text"
    );

    for (@test) {
        say "$_: $1" if /(\d+)\.?\d*?\D*?$/;
    }

    Output:
    text text text 1 text: 1
    1 text text text 2 text 3: 3
    text 1 text text 2 text: 2
    text text text 1 text 2 3.0: 3
    1 2 3 4 5 6 7 8 9 10 11: 11
    text text 4.5 text text: 4

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics
Re: Array Filter and Lookahead
by ww (Bishop) on Nov 20, 2012 at 20:34 UTC

    Yet another approach (founded, in part, on moritz' point that the rule for dealing with decimals is UNspecified).

    Given the OP, if the spec is that trailing decimal digits should be truncated to integers (let us hope there's no hidden requirement for rounding :) ) then a two-part approach is also viable.

    So, less elegantly but, IMO, very obvious in its tactics and using regular expressions similar to those proposed above, a TIMTOWTDI might look like this:

    #!/usr/bin/perl
    use 5.014;
    #1004756

    my @init_array = ("text text text 1 text",
                      "1 text text text 2 text 3 ",
                      "text 1 text text 2 text",
                      "text text text 1 text 2 3.0",
                      "1 2 3 4 5 6 7 8 9 10 11",
                      "text text text text",
                     );

    my ($sub_select, $select, @selected);

    for $_(@init_array) {
        $_ =~ s/(.+)\.0\z/$1/;      # Truncate trailing decimal numbers
        say "\t \$_: $_";           # debug
        ($select) = $_ =~ /
                            .+?
                            (?:(\d+))
                            (?:[\D]*)
                            $               # EOL
                          /x;
        push @selected, $select;
    }

    my $i = 0;
    for my $captured(@selected) {
        say "\t\t" . '#test_filtered' . "[$i]: " . '"' . $captured . '"';
        $i++;
    }

    =head EXECUTION:
    C:\>perl 1004756.pl
             $_: text text text 1 text
             $_: 1 text text text 2 text 3
             $_: text 1 text text 2 text
             $_: text text text 1 text 2 3
             $_: 1 2 3 4 5 6 7 8 9 10 11
             $_: text text text text
                    #test_filtered[0]: "1"
                    #test_filtered[1]: "3"
                    #test_filtered[2]: "2"
                    #test_filtered[3]: "3"
                    #test_filtered[4]: "11"
                    #test_filtered[5]: ""
    =cut
