PerlMonks  

printing out a matrix for a data list

by Angharad (Pilgrim)
on Aug 05, 2006 at 13:21 UTC (#565794=perlquestion)

Angharad has asked for the wisdom of the Perl Monks concerning the following question:

Hi there
I have been trying to do the following
Taking 'ob1' as the 'object of interest' I have a text file which looks like this
    ob1, ob2, 34
    ob1, ob3, 56
    ob1, ob4, 12
    ob1, ob5, 78
    ob1, ob6, 23
    ob3, ob1, 56
    ob7, ob1, 23
    ob8, ob1, 12
    ob9, ob1, 90
    etc ...
and what I need is to create a matrix like this
         ob1  ob2  ob3  ob4  ob5  ob6  ob7  ob8  ob9
    ob1    0   34   56   12   78   23   23   12   90
    ob2   34
    ob3   56
    ob4   12
    ob5   78
    ob6   23
    ob7   23
    ob8   12
    ob9   90
And for any 'duplicate results' like these
    ob1, ob3, 56
    ob3, ob1, 56
Only take the first instance (as the result is the same, regardless of the direction).
ob2 then becomes the 'object of interest', and column two and row two are populated in the same way (but potentially with different values) as the first column and first row were when ob1 was the 'object of interest'. Then ob3 becomes the 'object of interest', and so on until the matrix is completely filled. In some cases there may be missing values (for example, there may not be an 'ob4 ob5' value), and I need to take that into account, perhaps by printing 'NULL' or something.
I asked for help from Perl Monks yesterday and this is the code I have so far
    #!/usr/local/bin/perl
    use strict;

    my $data = $ARGV[0];

    # open file here
    open(DATA, "$data") || die "cant open file for reading\n";

    my %table;
    my %rows;
    my %cols;

    for (<DATA>) {
        my ($row, $col, $val) = split ',';
        $table{$row}{$col} = $val;
        $rows{$row}++;
        $cols{$col}++;
    }

    for my $col (sort keys %cols) {
        print "\t$col";
    }
    print "\n";

    for my $row (sort keys %rows) {
        print "$row\t";
        for my $col (sort keys %cols) {
            print $table{$row}{$col} if defined $table{$row}{$col};
            print "\t";
        }
        print "\n";
    }
The result I get using the code above is:
         ob1  ob2  ob3  ob4  ob5  ob6
    ob1        34   56   12   78   23
    ob3   56
    ob7   23
    ob8   12
    ob9   90
Which isn't quite what I need, but I don't know how to fix it (due to my blind spot regarding hashes I suspect).
Any suggestions much appreciated.

Replies are listed 'Best First'.
Re: printing out a matrix for a data list
by rodion (Chaplain) on Aug 05, 2006 at 14:59 UTC
    Try the code below. The changes include
    • adding chomp to get rid of the extra newlines that are left on the $val
    • removing the spaces, along with the commas, on the split, so they don't end up as part of the hash indices
    • using a canonical form for the table index, so that "ob1,ob2" and "ob2,ob1" go in the same bin
    • just one hash for which rows and columns are present
    • print "-" if there's no entry

    You were well on your way, but the first two items on this list can confuse things enough that you can't see the rest. That's the way it happens: it's the things you're not looking at that get you.
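    The canonical-index idea from the list above can be looked at in isolation. The following is only a minimal sketch, not the code from this reply: the helper name store_pair is hypothetical, and the conflicting value 99 is made up purely to show that the first value seen for a pair is the one kept, as the original question asks. Note that labels such as "ob1" have to be compared with the string operator gt; the numeric > treats both sides as 0.

```perl
use strict;
use warnings;

# Hypothetical helper: store $val under an alphabetically ordered pair of
# labels, keeping the first value seen for that pair.
sub store_pair {
    my ($table, $x, $y, $val) = @_;
    ($x, $y) = ($y, $x) if $x gt $y;    # string comparison, not numeric >
    $table->{$x}{$y} = $val
        unless exists $table->{$x} && exists $table->{$x}{$y};
    return;
}

my %table;
store_pair(\%table, 'ob1', 'ob3', 56);
store_pair(\%table, 'ob3', 'ob1', 99);  # made-up conflicting duplicate, ignored
print "$table{ob1}{ob3}\n";             # prints 56
```

    Because every pair is stored in one direction only, the printing loop has to look up the cell in both orders, which is what the elsif branch in the code below does.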

        use strict;

        my %table;
        my %rows_cols;

        for (<DATA>) {
            chomp;
            my ($row, $col, $val) = split ', *';
            # canonical order, so "ob3, ob1" lands in the same bin as "ob1, ob3";
            # the labels are strings, so compare with gt rather than numeric >
            if ($row gt $col) { ($row, $col) = ($col, $row); }
            $table{$row}{$col} = $val;
            $rows_cols{$row}++;
            $rows_cols{$col}++;
        }

        for my $col (sort keys %rows_cols) {
            print "\t$col";
        }
        print "\n";

        for my $row (sort keys %rows_cols) {
            print "$row\t";
            for my $col (sort keys %rows_cols) {
                if (defined $table{$row}{$col}) {
                    print $table{$row}{$col};
                } elsif (defined $table{$col}{$row}) {
                    print $table{$col}{$row};
                } else {
                    print "-";
                }
                print "\t";
            }
            print "\n";
        }
        __END__
        ob1, ob2, 34
        ob1, ob3, 56
        ob1, ob4, 12
        ob1, ob5, 78
        ob1, ob6, 23
        ob3, ob1, 56
        ob7, ob1, 23
        ob8, ob1, 12
        ob9, ob1, 90
        ob3, ob2, 87
    prints
Re: printing out a matrix for a data list
by lima1 (Curate) on Aug 05, 2006 at 13:50 UTC

    Fix the output? Or the matrix filling code?

    One problem I see in the output code is that you don't treat missing values. Probably you should iterate over all objects.

        my %all = ( %rows, %cols );
        my @objs = keys %all;
    UPDATE: Another (or the?) problem I see is that you don't remove the newline in your data. chomp should fix this.
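    Both points can be tried out on a few sample lines. This is only a rough sketch: the helper names parse_record and all_objects are made up for illustration, and the three input lines are taken from the data in the question. It strips the newline and the spaces around the commas, then collects every label that appears as either a row or a column.

```perl
use strict;
use warnings;

# Hypothetical helper: split one "ob1, ob2, 34" record into clean fields.
sub parse_record {
    my ($line) = @_;
    chomp $line;
    $line =~ s/^\s+|\s+$//g;            # trim leading/trailing whitespace
    return split /\s*,\s*/, $line;      # commas plus any surrounding spaces
}

# Hypothetical helper: every label used as a row or a column, sorted.
sub all_objects {
    my ($table) = @_;
    my %seen;
    for my $row (keys %$table) {
        $seen{$row} = 1;
        $seen{$_}   = 1 for keys %{ $table->{$row} };
    }
    return sort keys %seen;
}

my %table;
for my $line ("ob1, ob2, 34\n", "ob3, ob1, 56\n", "ob7, ob1, 23\n") {
    my ($row, $col, $val) = parse_record($line);
    $table{$row}{$col} = $val;
}
print join(' ', all_objects(\%table)), "\n";   # prints: ob1 ob2 ob3 ob7
```

    Without the chomp and the whitespace handling, the keys would be strings like " ob2" and the values would carry a trailing newline, which is why the rows and columns in the original output do not line up.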
Re: printing out a matrix for a data list
by Tanktalus (Canon) on Aug 05, 2006 at 14:09 UTC

    Have you investigated CPAN? Specifically, Text::Table? The trick becomes massaging your input data into the format that Text::Table wants, but once that's done, getting it to print correctly becomes much easier.
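    As a rough illustration of that "massaging" step, the sketch below turns a hash of hashes into one array ref per row, which is a shape Text::Table's load method accepts. The helper names cell and matrix_rows, the 'obj' header label and the small sample table are assumptions made for this example; the 0 on the diagonal and the 'NULL' placeholder follow the question. Text::Table is not a core module, so the sketch falls back to plain tab-separated output when it is not installed.

```perl
use strict;
use warnings;

# Hypothetical helper: look a value up in either direction; 0 on the
# diagonal, $missing when neither direction has an entry.
sub cell {
    my ($table, $r, $c, $missing) = @_;
    return 0 if $r eq $c;
    for my $pair ([$r, $c], [$c, $r]) {
        my ($x, $y) = @$pair;
        return $table->{$x}{$y}
            if exists $table->{$x} && defined $table->{$x}{$y};
    }
    return $missing;
}

# Hypothetical helper: one array ref per object, holding the row label
# followed by one cell per column.
sub matrix_rows {
    my ($table, $objs, $missing) = @_;
    return map {
        my $r = $_;
        [ $r, map { cell($table, $r, $_, $missing) } @$objs ]
    } @$objs;
}

# Small sample table, assumed for illustration only.
my %table = ( ob1 => { ob2 => 34, ob3 => 56 }, ob3 => { ob2 => 87 } );
my @objs  = qw(ob1 ob2 ob3 ob4);
my @rows  = matrix_rows(\%table, \@objs, 'NULL');

if (eval { require Text::Table; 1 }) {
    my $tb = Text::Table->new('obj', @objs);   # column titles
    $tb->load(@rows);                          # each row is an array ref
    print $tb;
} else {
    print join("\t", 'obj', @objs), "\n";
    print join("\t", @$_), "\n" for @rows;
}
```

    Keeping the lookup in a separate helper means the same rows can be handed to Text::Table or to a plain print loop without changing how missing values are treated.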

Re: printing out a matrix for a data list
by gellyfish (Monsignor) on Aug 05, 2006 at 22:46 UTC
      Yes, actually. It was a slightly different question and I was upfront from the beginning as to it being an update. Any particular reason why you had to be rude?
Re: printing out a matrix for a data list
by Angharad (Pilgrim) on Aug 05, 2006 at 15:06 UTC
    thanks for your comments/suggestions so far. much appreciated
Re: printing out a matrix for a data list
by TedPride (Priest) on Aug 06, 2006 at 22:20 UTC
    Very basic hack:
        use strict;
        use warnings;

        my ($delim, $row, $col, $val, %matrix, %cols, @rows, @cols) = ' ';

        while (<DATA>) {
            chomp;
            ($row, $col, $val) = split /, /, $_;
            $matrix{$row}{$col} = $val;
            $cols{$col} = ();
        }
        @cols = sort keys %cols;
        @rows = sort keys %matrix;

        print join $delim, ' ', @cols;
        for $row (@rows) {
            print "\n", $row;
            for $col (@cols) {
                print $delim, $matrix{$row}{$col} ? sprintf('%3d', $matrix{$row}{$col}) : ' -';
            }
        }
        __DATA__
        ob1, ob2, 34
        ob1, ob3, 56
        ob1, ob4, 12
        ob1, ob5, 78
        ob1, ob6, 23
        ob3, ob1, 56
        ob7, ob1, 23
        ob8, ob1, 12
        ob9, ob1, 90
      Thank you. Your help is much appreciated :)
