Sorting an array or hashes

by hok_si_la (Curate)
on Jan 23, 2012 at 08:27 UTC (#949334)
hok_si_la has asked for the wisdom of the Perl Monks concerning the following question:

Good localtime monks,

I asked a question early last week about extracting information from a specific file format (Extracting information from file to Hash), and BrowserUK was good enough to point me in the right direction. However, I am having an issue sorting my AoH (array of hashes). The error I get when running a command-line trace is: "Can't use string ("1") as a HASH ref while "strict refs" in use at getCollections.pl line 164, <ARCFILE> line 10." I can reference and print the unsorted keys just fine, but the sorted AoH is empty. For instance, $collectionData[$i]{'Missing'} contains a value, whereas $sortedCollectionData[$i]{'Missing'} is undef.

Here is my file format:
    CollectionId=>26154 Framecount=>6 Status=>SC Missing=>0 Modified=>01/22/2012 22:12:09
    CollectionId=>26155 Framecount=>6 Status=>I Missing=>4 Modified=>01/22/2012 22:12:20
    CollectionId=>25000 Framecount=>6 Status=>SC Missing=>0 Modified=>01/22/2012 22:13:07
    CollectionId=>25002 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:14
    CollectionId=>25009 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:19
    CollectionId=>25309 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:25
    CollectionId=>25349 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:31
    CollectionId=>25318 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:37
    CollectionId=>21318 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:43
    CollectionId=>21342 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:56
Here is my sub:
    sub printCollectionData {
        my $arcFile = shift;
        my $orderBy = shift;
        my (@collectionData, @sortedCollectionData);
        my $semaphore = $arcFile . '.lock';
        my ($longStatus, $rowStyle, $rowColor);

        open(LOCKFILE, ">>$semaphore") or die "$semaphore: $!";
        flock(LOCKFILE, LOCK_EX) or die "flock() failed for $semaphore: $!";
        open (ARCFILE, "<$arcFile") or die "Failed to open $arcFile: $!";

        # Retrieve file information from arcFile as an array of hashes
        while( <ARCFILE> ) {
            my( $col, $cnt, $stat, $miss, $mod) = m[
                ^
                CollectionId \s* => \s* (\d+)? \s*
                Framecount   \s* => \s* (\d+)? \s*
                Status       \s* => \s* (\w+)? \s*
                Missing      \s* => \s* ([\d,]+)? \s*
                Modified     \s* => \s* ([\d/]+\s[\d:]+)? \s*
                $
            ]x or warn "Bad format at line $.\n" and next;

            my( $modday, $modmon, $modyear, $modhrs, $modmin, $modsec )
                = $mod =~ m[(\d+)/(\d+)/(\d+) (\d+):(\d+):(\d+)]
                or warn "Bad date format in line $." and next;

            push @collectionData, {
                CollectionId => $col,
                Framecount   => $cnt,
                Status       => $stat,
                Missing      => $miss,
                Modified     => sprintf(
                    "%4d/%02d/%02d %02d:%02d:%02d",
                    $modyear, $modmon, $modday, $modhrs, $modmin, $modsec
                ),
            };
        }

        # Sort the collection according to orderBy param
        if($orderBy eq "collection") {
            @sortedCollectionData = sort {
                $collectionData[ $b ]{CollectionId} <=> $collectionData[ $a ]{CollectionId}
                || $collectionData[ $b ]{Modified} cmp $collectionData[ $a ]{Modified}
            } 0 .. $#collectionData;
        }elsif($orderBy eq "framecount") {
            @sortedCollectionData = sort {
                $collectionData[ $b ]{Framecount} <=> $collectionData[ $a ]{Framecount}
                || $collectionData[ $b ]{Modified} cmp $collectionData[ $a ]{Modified}
            } 0 .. $#collectionData;
        }elsif($orderBy eq "status") {
            @sortedCollectionData = sort {
                $collectionData[ $a ]{Status} cmp $collectionData[ $b ]{Status}
                || $collectionData[ $b ]{Modified} cmp $collectionData[ $a ]{Modified}
            } 0 .. $#collectionData;
        }elsif($orderBy eq "missing") {
            @sortedCollectionData = sort {
                $collectionData[ $b ]{Missing} <=> $collectionData[ $a ]{Missing}
                || $collectionData[ $b ]{Modified} cmp $collectionData[ $a ]{Modified}
            } 0 .. $#collectionData;
        }else {
            @sortedCollectionData = sort {
                $collectionData[ $b ]{Modified} cmp $collectionData[ $a ]{Modified}
            } 0 .. $#collectionData;
        }

        my $transactions = scalar(@collectionData);
        for (my $i=0; $i < $transactions; $i++) {
            if($sortedCollectionData[$i]{'Status'} eq "I") {
                $longStatus = "Incomplete";
            }elsif($sortedCollectionData[$i]{'Status'} eq "SI") {
                $longStatus = "Submitted Incomplete";
            }elsif($sortedCollectionData[$i]{'Status'} eq "C") {
                $longStatus = "Submitted";
            }else{
                $longStatus = "Submitted Complete";
            }

            if (($i%2) == 0){
                $rowStyle = "oddrow";
                $rowColor = "#e5e5e5";
            } else {
                $rowStyle = "evenrow";
                $rowColor = "#ffffff";
            }

            print qq{
                <tr class=$rowStyle style=Cursor:hand onclick= \"location.href=\'getCollection.pl?id=$collectionData[$i]{'CollectionId'}\';\" onMouseOver=\"style.backgroundColor='#c5c5c5'\" onMouseOut=\"style.backgroundColor='$rowColor'\">
                <th class=graycenter>$sortedCollectionData[$i]{'CollectionId'}</th>
                <th class=graycenter>$sortedCollectionData[$i]{'Framecount'}</th>
                <th class=graycenter>$longStatus</th>
                <th class=graycenter>$sortedCollectionData[$i]{'Missing'}</th>
                <th class=graycenter>$sortedCollectionData[$i]{'Modified'}</th>
                </tr>
                </div>
            };
        }
    }

Re: Sorting an array or hashes
by Corion (Pope) on Jan 23, 2012 at 08:57 UTC

    The problem is that @sortedCollectionData contains only the indices into @collectionData, but you access it as if it contained the elements themselves when you print the data:

    ... <th class=graycenter>$sortedCollectionData[$i]{'CollectionId'}</th> ...

    should be

    <th class=graycenter>$collectionData[$sortedCollectionData[$i]]{'CollectionId'}</th>
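The difference between a sorted list of indices and a sorted list of records can be seen in isolation. The following is a minimal sketch with made-up sample records (the three hashes are assumptions for illustration, not the poster's data); it sorts indices exactly as the original code does and then maps each index back to its record.

```perl
use strict;
use warnings;

# Hypothetical sample records, shaped like the hashes built from the file.
my @collectionData = (
    { CollectionId => 25000, Status => 'SC' },
    { CollectionId => 26155, Status => 'I'  },
    { CollectionId => 21318, Status => 'I'  },
);

# Sorting 0 .. $#collectionData yields a list of indices, not records.
my @sortedIndices = sort {
    $collectionData[$b]{CollectionId} <=> $collectionData[$a]{CollectionId}
} 0 .. $#collectionData;

# Each index has to be mapped back to its record before a field is read.
my @sortedIds = map { $collectionData[$_]{CollectionId} } @sortedIndices;

print "indices: @sortedIndices\n";
print "ids:     @sortedIds\n";
```

Writing $sortedIndices[0]{CollectionId} here would treat a plain number as a hash reference, which is what "strict refs" rejects in the error message quoted in the question.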

    The deeper problem is that your subroutine does too many things at once, which makes debugging this kind of thing much harder than it needs to be. I split the subroutine into three steps: readCollectionData, sortCollectionData (where I thought the problem was) and printCollectionData (where I found the problem). That made it much easier to separate things and see what each step returns.

    my $orderby = 'collection';
    my @data = readCollectionData('dummyFilename.txt');
    my @indices = sortCollectionData($orderby, @data);
    print Dumper \@indices;
    printCollectionData(\@data, @indices);

    As an aside, you had relatively large if ... elsif ... else ... blocks that decide what to sort on, and the same kind again to translate the short status code into a long status message. I replaced them with a hash lookup:

    ...
    if($sortedCollectionData[$i]{'Status'} eq "I") {
        $longStatus = "Incomplete";
    }elsif($sortedCollectionData[$i]{'Status'} eq "SI") {
        $longStatus = "Submitted Incomplete";
    }elsif($sortedCollectionData[$i]{'Status'} eq "C") {
        $longStatus = "Submitted";
    }else{
        $longStatus = "Submitted Complete";
    }
    ...

    becomes

    my %translateLongStatus = (
        'I'  => 'Incomplete',
        'SI' => 'Submitted Incomplete',
        'C'  => 'Submitted',
    );
    ...
        my ($longStatus, $rowStyle, $rowColor);
        # Look the record up through the sorted index, not through $i itself
        $longStatus = $translateLongStatus{ $collectionData[ $sortedCollectionData[$i] ]{Status} }
            || 'Submitted Complete';
    ...
    }
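The lookup-with-default idea can be tried on its own. This is a small sketch, assuming the same three status codes as above; the helper name long_status is hypothetical and only serves to keep the fallback in one place.

```perl
use strict;
use warnings;

my %translateLongStatus = (
    I  => 'Incomplete',
    SI => 'Submitted Incomplete',
    C  => 'Submitted',
);

# Hypothetical helper: any code missing from the table falls back to the
# same default that the original else branch produced.
sub long_status {
    my ($code) = @_;
    return $translateLongStatus{$code} || 'Submitted Complete';
}

my @labels = map { long_status($_) } qw(I SI C SC);
print "$_\n" for @labels;
```

As with the original else branch, every unrecognised code (not only SC) ends up as "Submitted Complete", so a typo in the data file would go unnoticed.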

    In the end, my program looks like this (with much of the HTML printing removed):

    #!perl -w
    use strict;
    use Data::Dumper;

    sub readCollectionData {
        my $arcFile = shift;
        my (@collectionData);
        my $semaphore = $arcFile . '.lock';
        my ($longStatus, $rowStyle, $rowColor);

        #open(LOCKFILE, ">>$semaphore") or die "$semaphore: $!";
        #flock(LOCKFILE, LOCK_EX) or die "flock() failed for $semaphore: $!";
        #open (ARCFILE, "<$arcFile") or die "Failed to open $arcFile: $!";
        local *ARCFILE = *DATA;

        # Retrieve file information from arcFile as an array of hashes
        while( <ARCFILE> ) {
            my( $col, $cnt, $stat, $miss, $mod) = m[
                ^
                CollectionId \s* => \s* (\d+)? \s*
                Framecount   \s* => \s* (\d+)? \s*
                Status       \s* => \s* (\w+)? \s*
                Missing      \s* => \s* ([\d,]+)? \s*
                Modified     \s* => \s* ([\d/]+\s[\d:]+)? \s*
                $
            ]x or warn "Bad format at line $.\n" and next;

            my( $modday, $modmon, $modyear, $modhrs, $modmin, $modsec )
                = $mod =~ m[(\d+)/(\d+)/(\d+) (\d+):(\d+):(\d+)]
                or warn "Bad date format in line $." and next;

            push @collectionData, {
                CollectionId => $col,
                Framecount   => $cnt,
                Status       => $stat,
                Missing      => $miss,
                Modified     => sprintf(
                    "%4d/%02d/%02d %02d:%02d:%02d",
                    $modyear, $modmon, $modday, $modhrs, $modmin, $modsec
                ),
            };
        }
        return @collectionData
    };

    # Map the program orderby names to the internal names in the hash
    my %sort_columns = (
        collection => 'CollectionId',
        framecount => 'Framecount',
        status     => 'Status',
        missing    => 'Missing',
    );

    sub sortCollectionData {
        my ($orderby, @collectionData) = @_;

        # Sort the collection according to orderBy param
        my $sort_col = $sort_columns{ $orderby } || 'Modified';

        # Note: <=> compares numerically, so this is only correct for the
        # numeric columns; Status and Modified would need cmp instead
        my @sortedCollectionData = sort {
            $collectionData[ $b ]{$sort_col} <=> $collectionData[ $a ]{$sort_col}
            || $collectionData[ $b ]{Modified} cmp $collectionData[ $a ]{Modified}
        } 0 .. $#collectionData;
        return @sortedCollectionData
    }

    my %translateLongStatus = (
        'I'  => 'Incomplete',
        'SI' => 'Submitted Incomplete',
        'C'  => 'Submitted',
    );

    sub printCollectionData {
        my ($collectionData, @sortedCollectionData) = @_;
        my @collectionData = @$collectionData;

        my $transactions = scalar(@collectionData);
        for (my $i=0; $i < $transactions; $i++) {
            # Look the record up through the sorted index, not through $i itself
            my $record = $collectionData[ $sortedCollectionData[$i] ];

            my ($longStatus, $rowStyle, $rowColor);
            $longStatus = $translateLongStatus{ $record->{Status} } || 'Submitted Complete';

            if (($i%2) == 0){
                $rowStyle = "oddrow";
                $rowColor = "#e5e5e5";
            } else {
                $rowStyle = "evenrow";
                $rowColor = "#ffffff";
            }

            print $i, $longStatus, $record->{CollectionId}, "\n";
        }
    }

    my $orderby = 'collection';
    my @data = readCollectionData('dummyFilename.txt');
    my @indices = sortCollectionData($orderby, @data);
    print Dumper \@indices;
    printCollectionData(\@data, @indices);

    __DATA__
    CollectionId=>26154 Framecount=>6 Status=>SC Missing=>0 Modified=>01/22/2012 22:12:09
    CollectionId=>26155 Framecount=>6 Status=>I Missing=>4 Modified=>01/22/2012 22:12:20
    CollectionId=>25000 Framecount=>6 Status=>SC Missing=>0 Modified=>01/22/2012 22:13:07
    CollectionId=>25002 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:14
    CollectionId=>25009 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:19
    CollectionId=>25309 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:25
    CollectionId=>25349 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:31
    CollectionId=>25318 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:37
    CollectionId=>21318 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:43
    CollectionId=>21342 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:56
      Thanks for the help Corion. I cleaned up my code a bit and used the following to print sorted elements of my AoH (@collectionData):
      foreach my $j (@sortedCollectionData) {
          my ($longStatus, $rowStyle, $rowColor);
          $longStatus = $translateLongStatus{ $collectionData[$j]->{Status} } || 'Submitted Complete';

          # NOTE: $i is not updated in this loop; a row counter must be
          # incremented for the striping to alternate
          if (($i%2) == 0){
              $rowStyle = "oddrow";
              $rowColor = "#e5e5e5";
          } else {
              $rowStyle = "evenrow";
              $rowColor = "#ffffff";
          }

          print qq{
              <tr class=$rowStyle style=Cursor:hand onclick= \"location.href=\'getCollection.pl?id=$collectionData[$j]->{CollectionId}\';\" onMouseOver=\"style.backgroundColor='#c5c5c5'\" onMouseOut=\"style.backgroundColor='$rowColor'\">
              <th class=graycenter>$collectionData[$j]->{'CollectionId'}</th>
              <th class=graycenter>$collectionData[$j]->{'Framecount'}</th>
              <th class=graycenter>$collectionData[$j]->{'Missing'}</th>
              <th class=graycenter>$longStatus</th>
              <th class=graycenter>$collectionData[$j]->{'Modified'}</th>
              </tr>
          };
      }
Re: Sorting an array or hashes
by moritz (Cardinal) on Jan 23, 2012 at 08:51 UTC
    @sortedCollectionData = sort {
        $collectionData[ $b ]{CollectionId} <=> $collectionData[ $a ]{CollectionId}
        || $collectionData[ $b ]{Modified} cmp $collectionData[ $a ]{Modified}
    } 0 .. $#collectionData;

    What you are storing here are numbers (specifically from 0 to $#collectionData), so @sortedCollectionData now contains these numbers in some order or another. And then you write $sortedCollectionData[$i]{'Status'}, and try to access one of these numbers as if it was a hash reference.

    You might want to sort your hash refs directly instead:

    @sortedCollectionData = sort {
        $b->{CollectionId} <=> $a->{CollectionId}
        || $b->{Modified} cmp $a->{Modified}
    } @collectionData;
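To see the direct sort and its tie-break at work, here is a short sketch on made-up records (the data is hypothetical, and two records deliberately share a CollectionId so that the second comparison decides their order).

```perl
use strict;
use warnings;

# Hypothetical sample records; two of them share a CollectionId.
my @collectionData = (
    { CollectionId => 25000, Modified => '2012/01/22 22:13:07' },
    { CollectionId => 26155, Modified => '2012/01/22 22:12:20' },
    { CollectionId => 25000, Modified => '2012/01/22 22:15:00' },
);

# Inside the block, $a and $b are the hash references themselves.
my @sortedCollectionData = sort {
       $b->{CollectionId} <=> $a->{CollectionId}
    || $b->{Modified}     cmp $a->{Modified}
} @collectionData;

# The sorted array now holds records, so fields can be read directly.
my @summary = map { "$_->{CollectionId} $_->{Modified}" } @sortedCollectionData;
print "$_\n" for @summary;
```

With this form the printing loop from the question works unchanged, because $sortedCollectionData[$i]{'Status'} really is a hash lookup on a record.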
Re: Sorting an array or hashes
by salva (Abbot) on Jan 23, 2012 at 10:07 UTC
    Those very similar, ugly sorting blocks can be generated from metadata, especially if you use a sorting module from CPAN such as Sort::Key or Sort::Maker:
    # untested!
    use Sort::Key;

    my %key_type = (CollectionId => 'int',
                    Framecount   => 'int',
                    Status       => 'str',
                    Missing      => 'int',
                    Modified     => 'str');

    my %order = (collection => [qw(-CollectionId -Modified)], # the minus
                 framecount => [qw(-Framecount -Modified)],   # sign means
                 status     => [qw(Status -Modified)],        # descending order
                 missing    => [qw(-Missing -Modified)],
                 modified   => [qw(-Modified)]);

    my %sorter;
    for my $order (keys %order) {
        my @types;
        my @keys;
        for (@{$order{$order}}) {
            /^(-?)(\w+)$/ or die;
            push @types, "$1$key_type{$2}";
            push @keys, $2;
        }
        $sorter{$order} = Sort::Key::multikeysorter { @{$_}{@keys} } @types;
    }

    sub printCollectionData {
        ...
        # Sort the collection according to orderBy param
        @sortedCollectionData = $sorter{$orderBy}->(@collectionData);
        ...
    }
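The same metadata-driven idea can be sketched with core Perl only, for readers who cannot install a CPAN module. This is not Sort::Key's API: make_comparator is a hypothetical helper written for this illustration, the sample records are made up, and only three of the orderings are included to keep it short.

```perl
use strict;
use warnings;

# Column types: 'int' columns are compared with <=>, 'str' columns with cmp.
my %key_type = (
    CollectionId => 'int', Framecount => 'int', Status => 'str',
    Missing      => 'int', Modified   => 'str',
);

# A leading minus sign means descending order.
my %order = (
    collection => [qw(-CollectionId -Modified)],
    status     => [qw(Status -Modified)],
    modified   => [qw(-Modified)],
);

# Hypothetical helper: turn a list of column specs into a comparator that
# takes two hash references and returns -1, 0 or 1.
sub make_comparator {
    my @specs;
    for my $spec (@_) {
        my ($minus, $key) = $spec =~ /^(-?)(\w+)$/
            or die "Bad sort spec '$spec'";
        my $type = $key_type{$key} or die "Unknown column '$key'";
        push @specs, [ $key, $type eq 'int', $minus eq '-' ];
    }
    return sub {
        my ($x, $y) = @_;
        for my $s (@specs) {
            my ($key, $numeric, $desc) = @$s;
            my $cmp = $numeric ? $x->{$key} <=> $y->{$key}
                               : $x->{$key} cmp $y->{$key};
            $cmp = -$cmp if $desc;
            return $cmp if $cmp;    # first column that differs decides
        }
        return 0;
    };
}

my %sorter;
$sorter{$_} = make_comparator(@{ $order{$_} }) for keys %order;

# Hypothetical sample records.
my @collectionData = (
    { CollectionId => 26154, Status => 'SC', Modified => '2012/01/22 22:12:09' },
    { CollectionId => 26155, Status => 'I',  Modified => '2012/01/22 22:12:20' },
    { CollectionId => 25002, Status => 'I',  Modified => '2012/01/22 22:13:14' },
);

my $by_status = $sorter{status};
my @sorted    = sort { $by_status->($a, $b) } @collectionData;
my @ids       = map { $_->{CollectionId} } @sorted;
print "@ids\n";
```

A comparator called through a code reference is slower than an inline sort block, which is one reason a dedicated module may still be preferable for large inputs.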
