Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

List manipulation

by Anonymous Monk
on May 31, 2012 at 16:02 UTC ( #973551=perlquestion: print w/ replies, xml ) Need Help??
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear fellow Monks,

I have a list with three columns. The first one contains a list of IDs,second different values and third a value or blank (its A\t1_x\tblank). The ID is being repeated many times with different or same values. Example

A 1_x 9_z A 1_x A 1_x g_z B 2_c B 1_x 1_z C 1_x 1_z C v_x 8_z

What I would like to get is a list of IDs with corresponding number of 1_x and any non blank value in third column. So from the above example A 3,2 B 1,1 C 1,2. The list is preaty long so I though maybe I should go line by line append to hash and get the number of 1_x and any non blank string in third column? I started with appending the values in three arrays replacing blanks with NA (thought it might help). How should I tackle this to only go through the list once and not for each ID? Thanks for help

Comment on List manipulation
Download Code
Re: List manipulation
by choroba (Abbot) on May 31, 2012 at 16:25 UTC
    See the comments for explanation:
    #!/usr/bin/perl use strict; use warnings; use feature 'say'; my $last = ''; # Remember the id. my $count_1_x; my $count_nonblank; while (<DATA>) { my ($id, $col2, $col3) = split; if ($id ne $last) { # A new record begin +s. say "$last $count_1_x $count_nonblank" if $last; # Do not print at th +e very beginning. $last = $id; $count_1_x = $count_nonblank = 0; } $count_1_x++ if '1_x' eq $col2; $count_nonblank++ if defined $col3 and length $col3; } say "$last $count_1_x $count_nonblank" if $last; # Print for the last + record. __DATA__ A 1_x 9_z A 1_x A 1_x g_z B 2_c B 1_x 1_z C 1_x 1_z C v_x 8_z D v_x s_x E 1_x F
    Update: Updated to comfort warnings on emtpy column 3.

      This was more an experiment in whether it would be possible, than a piece of sound advice...

      use 5.010; use List::MoreUtils qw< part >; say for map { my $x = $_; sprintf( '%s %d,%d', $x->[0][0], scalar(grep { $_->[1] eq '1_x' } @$x), scalar(grep { defined $_->[2] } @$x), ) } part { # me thinks List::MoreUtils needs a way to make this simpler state $part = 0; state $last = undef; $part++ if defined $last && $last ne $_->[0]; $last = $_->[0]; $part } map { chomp; [split /\s+/] } sort <DATA>; __DATA__ A 1_x 9_z A 1_x A 1_x g_z B 2_c B 1_x 1_z C 1_x 1_z C v_x 8_z
      perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
Re: List manipulation
by Not_a_Number (Parson) on May 31, 2012 at 18:47 UTC
    use 5.010; my %parsed; while ( <DATA> ){ my @tmp = split; $parsed{$tmp[0]}{x1}++ if $tmp[1] eq '1_x'; $parsed{$tmp[0]}{c3}++ if @tmp > 2; } for my $k ( sort keys %parsed ) { say join ' ', $k, $parsed{$k}->{x1}, $parsed{$k}->{c3}; } __DATA__ A 1_x 9_z A 1_x A 1_x g_z B 2_c B 1_x 1_z C 1_x 1_z C v_x 8_z

    Update: The output part of the above code breaks for a given ID if (a) '1_x' never appears in column 2, or (b) '1_x' appears in column 2 but there is never a third column for this ID. To understand what I mean, add the following lines to __DATA__:

    D v_x s_x E 1_x

    Solution: change the line in the for loop to:

      say join ' ', $k, $parsed{$k}->{x1} || 0, $parsed{$k}->{c3} || 0;
Re: List manipulation
by kcott (Abbot) on May 31, 2012 at 21:26 UTC

    I saw Not_a_Number's solution as I was about to post mine. I've added his extra test data plus an additional one of my own which has nothing in either column 2 or 3.

    #!/usr/bin/env perl use 5.010; use strict; use warnings; my %result; while (<DATA>) { map { $result{$_->[0]}[0] += defined $_->[1] && $_->[1] eq q{1_x} ? +1 : 0; $result{$_->[0]}[1] += defined $_->[2] ? 1 : 0; } ([split]); } say qq{$_ }, join q{,} => @{$result{$_}} for sort keys %result; __DATA__ A 1_x 9_z A 1_x A 1_x g_z B 2_c B 1_x 1_z C 1_x 1_z C v_x 8_z D v_x s_x E 1_x F

    This outputs:

    $ pm_extract_1_x.pl A 3,2 B 1,1 C 1,2 D 0,1 E 1,0 F 0,0

    -- Ken

      OP here. That looks brilliant thanks very much for that.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://973551]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others cooling their heels in the Monastery: (4)
As of 2014-11-28 07:25 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My preferred Perl binaries come from:














    Results (193 votes), past polls