Beefy Boxes and Bandwidth Generously Provided by pair Networks Cowboy Neal with Hat
The stupid question is the question not asked
 
PerlMonks  

List manipulation

by Anonymous Monk
on May 31, 2012 at 16:02 UTC ( #973551=perlquestion: print w/ replies, xml ) Need Help??
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear fellow Monks,

I have a list with three columns. The first one contains a list of IDs,second different values and third a value or blank (its A\t1_x\tblank). The ID is being repeated many times with different or same values. Example

A 1_x 9_z A 1_x A 1_x g_z B 2_c B 1_x 1_z C 1_x 1_z C v_x 8_z

What I would like to get is a list of IDs with corresponding number of 1_x and any non blank value in third column. So from the above example A 3,2 B 1,1 C 1,2. The list is preaty long so I though maybe I should go line by line append to hash and get the number of 1_x and any non blank string in third column? I started with appending the values in three arrays replacing blanks with NA (thought it might help). How should I tackle this to only go through the list once and not for each ID? Thanks for help

Comment on List manipulation
Download Code
Re: List manipulation
by choroba (Abbot) on May 31, 2012 at 16:25 UTC
    See the comments for explanation:
    #!/usr/bin/perl use strict; use warnings; use feature 'say'; my $last = ''; # Remember the id. my $count_1_x; my $count_nonblank; while (<DATA>) { my ($id, $col2, $col3) = split; if ($id ne $last) { # A new record begin +s. say "$last $count_1_x $count_nonblank" if $last; # Do not print at th +e very beginning. $last = $id; $count_1_x = $count_nonblank = 0; } $count_1_x++ if '1_x' eq $col2; $count_nonblank++ if defined $col3 and length $col3; } say "$last $count_1_x $count_nonblank" if $last; # Print for the last + record. __DATA__ A 1_x 9_z A 1_x A 1_x g_z B 2_c B 1_x 1_z C 1_x 1_z C v_x 8_z D v_x s_x E 1_x F
    Update: Updated to comfort warnings on emtpy column 3.

      This was more an experiment in whether it would be possible, than a piece of sound advice...

      use 5.010; use List::MoreUtils qw< part >; say for map { my $x = $_; sprintf( '%s %d,%d', $x->[0][0], scalar(grep { $_->[1] eq '1_x' } @$x), scalar(grep { defined $_->[2] } @$x), ) } part { # me thinks List::MoreUtils needs a way to make this simpler state $part = 0; state $last = undef; $part++ if defined $last && $last ne $_->[0]; $last = $_->[0]; $part } map { chomp; [split /\s+/] } sort <DATA>; __DATA__ A 1_x 9_z A 1_x A 1_x g_z B 2_c B 1_x 1_z C 1_x 1_z C v_x 8_z
      perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
Re: List manipulation
by Not_a_Number (Parson) on May 31, 2012 at 18:47 UTC
    use 5.010; my %parsed; while ( <DATA> ){ my @tmp = split; $parsed{$tmp[0]}{x1}++ if $tmp[1] eq '1_x'; $parsed{$tmp[0]}{c3}++ if @tmp > 2; } for my $k ( sort keys %parsed ) { say join ' ', $k, $parsed{$k}->{x1}, $parsed{$k}->{c3}; } __DATA__ A 1_x 9_z A 1_x A 1_x g_z B 2_c B 1_x 1_z C 1_x 1_z C v_x 8_z

    Update: The output part of the above code breaks for a given ID if (a) '1_x' never appears in column 2, or (b) '1_x' appears in column 2 but there is never a third column for this ID. To understand what I mean, add the following lines to __DATA__:

    D v_x s_x E 1_x

    Solution: change the line in the for loop to:

      say join ' ', $k, $parsed{$k}->{x1} || 0, $parsed{$k}->{c3} || 0;
Re: List manipulation
by kcott (Abbot) on May 31, 2012 at 21:26 UTC

    I saw Not_a_Number's solution as I was about to post mine. I've added his extra test data plus an additional one of my own which has nothing in either column 2 or 3.

    #!/usr/bin/env perl use 5.010; use strict; use warnings; my %result; while (<DATA>) { map { $result{$_->[0]}[0] += defined $_->[1] && $_->[1] eq q{1_x} ? +1 : 0; $result{$_->[0]}[1] += defined $_->[2] ? 1 : 0; } ([split]); } say qq{$_ }, join q{,} => @{$result{$_}} for sort keys %result; __DATA__ A 1_x 9_z A 1_x A 1_x g_z B 2_c B 1_x 1_z C 1_x 1_z C v_x 8_z D v_x s_x E 1_x F

    This outputs:

    $ pm_extract_1_x.pl A 3,2 B 1,1 C 1,2 D 0,1 E 1,0 F 0,0

    -- Ken

      OP here. That looks brilliant thanks very much for that.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://973551]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (8)
As of 2014-04-18 23:27 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    April first is:







    Results (473 votes), past polls