http://www.perlmonks.org?node_id=1072624

ccelt09 has asked for the wisdom of the Perl Monks concerning the following question:

Hello Perl Monks!

I have a program to count 0's in a long string of characters, but only in the substrings specified within the foreach loop. Right now I am just counting the 0's in each substring separately, but I would like to print the sum of the 0's from all the substrings. Is there an elegant, quick way to do this without a looping variable? Many thanks!

my $input_dir = "/Users/logancurtis-whitchurch/Desktop/IB_Senior_Thesis/consensus_files/mask_files/"; #directory with mask files
my $input_file = "$input_dir"."mask."."$population".".txt";
open(CGS, "<$input_file") or die "can't open input_file\n";
my $cgs = <CGS>;

my $interval = "/Users/logancurtis-whitchurch/Desktop/chrX_divisions/"."$region"."_$filter".".txt"; #specifies intervals by region and filter version
open (INTERVAL, "<$interval") or die "can't open interval file\n";

foreach ( <INTERVAL> ) {
    my (undef, $start, $end) = split '\s+', $_;
    my $subs_length = $end-$start;
    my $included_length = substr( $cgs, $start, $subs_length );
    print length $included_length;
    printf "found %d zeros\n", $included_length =~ tr[0][];
}

Re: Summing Variables in foreach loop
by davido (Cardinal) on Jan 30, 2014 at 05:46 UTC

    Are you asking how to get a total of all the results printed by printf "found %d zeros\n", $included_length =~ tr[0][];?

    Update: I'll go with that assumption, and that you're looking for an elegant solution, by some subjective definition of elegant. A word of warning: elegance may be in the eye of the beholder. Anyway, here's my stab at it:

    use strict;
    use warnings;
    no warnings 'once';
    use feature 'say';
    use List::Util 'reduce';

    my $cgs = '100010111010100101011010000101011010111011110101001010101010';

    sub substr_at {
        substr $_[0], $_[1], $_[2]-$_[1];
    }

    sub count_zeros {
        ( my $string, local $_ ) = @_;
        substr_at( $string, ( split )[1,2,1] ) =~ tr[0][];
    }

    say reduce { $a += count_zeros($cgs,$b) } 0, <DATA>;

    __DATA__
    junk 5 15
    junk 23 59
    junk 18 34
    junk 10 20
    junk 9 19
    junk 40 49

    Here's how it works:

    • The reduce statement is self-documenting if you're familiar with List::Util::reduce: iterating over the lines in the DATA filehandle, $b is passed each time to a subroutine that counts the zeros in a string. That value is then added to the $a(ccumulator). At the end, the final value of the accumulator, $a, is returned as our sum.
    • substr_at should also be self-documenting by its name; we specify the string, and then the "at" values (start at, end at). I'll grant that it would be clearer if I unpacked @_, but it's pretty simple to look at.
    • count_zeros takes a target string, and a criteria string, and binds the substr_at of the string to the tr/// operator to return a count.
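
    As conceded in the second bullet, unpacking @_ would make the helpers easier to read. A minimal sketch of that variant follows; it is a rewrite for illustration, not the code posted above. The variable names ($string, $start, $end, $line, $piece) and the @ranges array standing in for <DATA> are assumptions.

```perl
use strict;
use warnings;

# Return the part of $string from offset $start up to, but not including, $end.
sub substr_at {
    my ( $string, $start, $end ) = @_;
    return substr $string, $start, $end - $start;
}

# Count the '0' characters in the range named by a line such as "junk 5 15".
sub count_zeros {
    my ( $string, $line ) = @_;
    my ( undef, $start, $end ) = split ' ', $line;
    my $piece = substr_at( $string, $start, $end );
    # tr/0// with an empty replacement list only counts; it does not modify $piece.
    return $piece =~ tr/0//;
}

# Sample string and ranges taken from the thread; @ranges replaces <DATA> here.
my $cgs    = '100010111010100101011010000101011010111011110101001010101010';
my @ranges = ( 'junk 5 15', 'junk 23 59', 'junk 18 34',
               'junk 10 20', 'junk 9 19', 'junk 40 49' );

my $total = 0;
$total += count_zeros( $cgs, $_ ) for @ranges;
print "$total\n";    # prints 45
```

    The explicit accumulator is only there to keep the sketch free of extra modules; the reduce call from the original works just as well with these subs.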

    And the reason I think it's elegant is that it is uncluttered, and seems to flow nicely. At any level, from the low end (substr_at) to the high end (reduce), you can look at what's happening and understand without spending too much time trying to grok the code.

    Keep in mind though, any time you're summing anything, there's a loop involved, whether you see it or not. In the code I provided, "reduce" loops over the contents of <DATA>. Internally there are other loops as well, such as the one that the transliteration operator must use to count matches in a string. But even though we know the loops are there, it's nice to work at a level of abstraction where we're not boringly typing out "foreach..."


    Dave

      Derived from davido — dunno if it's more elegant or not:

      use strict;
      use warnings;
      use List::Util qw(sum);

      my $cgs = '100010111010100101011010000101011010111011110101001010101010';

      print    # prints 45 as does davido's solution
          sum
          map substr($cgs, $_->[1], $_->[2]-$_->[1]) =~ tr/0//,
          map [ split ],
          <DATA>
          ;

      __DATA__
      junk 5 15
      junk 23 59
      junk 18 34
      junk 10 20
      junk 9 19
      junk 40 49

      Update: Changed
          map scalar(substr($cgs, $_->[1], $_->[2]-$_->[1]) =~ tr/0//),
      to
          map substr($cgs, $_->[1], $_->[2]-$_->[1]) =~ tr/0//,
      (same result).

      A more normal way to use reduce would be to use the + operator in there, not +=. Using += could give the impression that the variable $a can be usefully modified within the block.

      say reduce { $a + count_zeros($cgs,$b) } 0, <DATA>;
        An even more normal way would be to avoid reduce entirely :)
        say sum 0, map count_zeros($cgs,$_), <DATA>;
      yes that is a better way to say it

        I had thought to use something like this

        my $total_coverage = 0; #looping variable
        my $interval = "/Users/logancurtis-whitchurch/Desktop/chrX_divisions/"."$region"."_$filter".".txt"; #specifies intervals by region and filter version
        open (INTERVAL, "<$interval") or die "can't open interval file\n";

        foreach ( <INTERVAL> ) {
            my (undef, $start, $end) = split '\s+', $_;
            my $subs_length = $end-$start;
            my $included_length = substr( $cgs, $start, $subs_length );
            my $coverage = $included_length =~ tr[0][];
            $total_coverage = $total_coverage + $coverage;
        }
        print "$total_coverage\n";

        Your approach uses an understanding of the language that I hope to attain some day, quite cool. Thank you!

Re: Summing Variables in foreach loop
by AnomalousMonk (Archbishop) on Jan 30, 2014 at 09:40 UTC

    The approaches shown in the OP and in the solutions of davido here and myself here treat the ranges given in the data as exclusive of the terminating offset. E.g., in the range 'junk 23 59' the final '0' in the 60-character string '100010111010100101011010000101011010111011110101001010101010' (i.e., the character at offset 59) is not counted. Is this correct? Put another way, in the range 'junk 5 5' is there one character or zero, one '0' or none (assuming the same $cgs string)? If the range is inclusive, the fix is very simple.
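
    If the ranges do turn out to be inclusive, one way to sketch the fix is to add 1 to the substring length. The total_zeros helper and its $inclusive flag below are hypothetical names, not part of any posted solution; the sample $cgs string and ranges are the ones used in the replies above.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Sum the '0' counts over all ranges of the form "label start end".
# $inclusive is a hypothetical flag: when true, the character at offset
# $end is counted as part of the range.
sub total_zeros {
    my ( $string, $inclusive, @ranges ) = @_;
    return sum 0, map {
        my ( undef, $start, $end ) = split ' ', $_;
        my $length = $end - $start + ( $inclusive ? 1 : 0 );
        my $piece  = substr $string, $start, $length;
        $piece =~ tr/0//;    # count only; $piece is not modified
    } @ranges;
}

my $cgs    = '100010111010100101011010000101011010111011110101001010101010';
my @ranges = ( 'junk 5 15', 'junk 23 59', 'junk 18 34',
               'junk 10 20', 'junk 9 19', 'junk 40 49' );

print total_zeros( $cgs, 0, @ranges ), "\n";    # exclusive end: prints 45
print total_zeros( $cgs, 1, @ranges ), "\n";    # inclusive end: prints 47
```

    With the same string, 'junk 5 5' yields no zeros under the exclusive reading and one under the inclusive reading, since the character at offset 5 is a '0'.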