PerlMonks  

Merging 2 Hashes finding result sorted by one of the fields

by kris1511 (Acolyte)
on Aug 31, 2017 at 05:35 UTC ( [id://1198386]=perlquestion )

kris1511 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I am new to Perl, so I am trying out a number of problems from all over. Here is a problem where I seek some wisdom. I have 2 files with the following data:

FILE 1:
NAME,STRIDE_LENGTH,STANCE
Euoplocephalus,1.87,quadrupedal
Stegosaurus,1.90,quadrupedal
Tyrannosaurus Rex,5.76,bipedal
Hadrosaurus,1.4,bipedal
Deinonychus,1.21,bipedal
Struthiomimus,1.34,bipedal
Velociraptor,2.72,bipedal

FILE 2:
NAME,LEG_LENGTH,DIET
Hadrosaurus,1.2,herbivore
Struthiomimus,0.92,omnivore
Velociraptor,1.0,carnivore
Triceratops,0.87,herbivore
Euoplocephalus,1.6,herbivore
Stegosaurus,1.40,herbivore
Tyrannosaurus Rex,2.5,carnivore

Question:
# List all herbivore dinos
# List their name and leg_length, sorted

I am able to find the dinos that are herbivores, but I am unsure how to sort the result by leg length. Is this code the right way to solve this problem?
use Data::Dumper;

open my $fh1, '<', '1.csv' or die $!;
open my $fh2, '<', '2.csv' or die $!;

my $header1 = <$fh1>;
my $header2 = <$fh2>;
my %app_map;

while(my $row = <$fh1>){
    my ($name, $leg_len, $diet) = split /\,/, $row;
    if($leg_len ne ''){
        $app_map{$name}{leg_len}=$leg_len;
    }else{
        $app_map{$name}{leg_len}="NA";
    }
    if ($diet ne '') {
        $app_map{$name}{diet}=$diet;
    } else {
        $app_map{$name}{diet}="NA";
    }
}
close $fh1 or die $!;

while(my $row = <$fh2>){
    my ($name, $str_len, $stance) = split /\,/, $row;
    if($str_len ne ''){
        $app_map{$name}{str_len}=$str_len;
    }else {
        $app_map{$name}{str_len}="NA";
    }
    if($stance ne ''){
        $app_map{$name}{stance}=$stance;
    }else {
        $app_map{$name}{stance}="NA";
    }
}
close $fh2 or die $!;

while ( my ($k, $v) = each %app_map ) {
    if(defined($app_map{$k}{diet}) && $app_map{$k}{diet} =~/herbivore/){
        ####SORT BY STR_LEN#####################
        #print $k." - ".$app_map{$k}{str_len},"\n";
    }
}
#print Dumper \%app_map;

Replies are listed 'Best First'.
Re: Merging 2 Hashes finding result sorted by one of the fields
by kcott (Archbishop) on Aug 31, 2017 at 07:35 UTC

    G'day kris1511,

    Your text has "leg_length sorted", but your code has "#SORT BY STR_LEN", so it's unclear exactly which you want. I've used stride length for sorting, but output both lengths, in the example code below.

    See the sort function for an explanation, and lots of examples, of how to sort in Perl.

    Note that you sort numbers using '<=>' and strings using 'cmp'. I mention this specifically because you're defaulting numeric values to the string "NA"; this will likely cause problems when sorting.
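As a minimal sketch of that difference: the values below and the `numeric_only` helper are illustrative assumptions, not taken from the question's code. It shows the same list ordered numerically and as strings, and one possible way to keep a placeholder like "NA" out of a numeric sort.

```perl
use strict;
use warnings;

# Illustrative values only; 10.5 does not appear in the thread's data.
my @lengths = ('1.90', '10.5', '1.4', '5.76');

my @as_numbers = sort { $a <=> $b } @lengths;   # numeric order
my @as_strings = sort { $a cmp $b } @lengths;   # character-by-character order

# Hypothetical helper: keep only plain decimal numbers, so that a
# placeholder such as "NA" never reaches the numeric comparison
# (where it would be treated as 0 and trigger a warning).
sub numeric_only {
    return grep { defined($_) && /\A[0-9]+(?:\.[0-9]+)?\z/ } @_;
}

my @clean = sort { $a <=> $b } numeric_only('1.87', 'NA', '1.4');

print "@as_numbers\n";   # 1.4 1.90 5.76 10.5
print "@as_strings\n";   # 1.4 1.90 10.5 5.76
print "@clean\n";        # 1.4 1.87
```

Filtering before sorting is only one option; defaulting missing values to 0, as the example script in this reply does, keeps those rows in the output instead.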

    See my earlier post to you, "Re: Multiple File handling and merging records from 2 files", where I described Text::CSV and other elements of my example code below.

    The basic sort code you'll want will look something like this:

    ... sort { $data{$a}{stride_length} <=> $data{$b}{stride_length} } ...

    Here's that in an example script.

    #!/usr/bin/env perl -l

    use strict;
    use warnings;

    use Text::CSV;
    use Inline::Files;

    my %data;
    my $csv = Text::CSV::->new;

    while (my $row = $csv->getline(\*FILE1)) {
        $data{$row->[0]}{stride_length} = $row->[1];
    }

    while (my $row = $csv->getline(\*FILE2)) {
        @{$data{$row->[0]}}{qw{leg_length diet}} = @$row[1,2];
    }

    print "$_ @{$data{$_}}{qw{stride_length leg_length}}"
        for sort { $data{$a}{stride_length} <=> $data{$b}{stride_length} }
            grep { $data{$_}{diet} eq 'herbivore' }
            map {
                $data{$_}{diet} ||= 'NA';
                $data{$_}{stride_length} ||= 0;
                $data{$_}{leg_length} ||= 0;
                $_;
            } keys %data;

    __FILE1__
    Euoplocephalus,1.87,quadrupedal
    Stegosaurus,1.90,quadrupedal
    Tyrannosaurus Rex,5.76,bipedal
    Hadrosaurus,1.4,bipedal
    Deinonychus,1.21,bipedal
    Struthiomimus,1.34,bipedal
    Velociraptor,2.72,bipedal
    __FILE2__
    Hadrosaurus,1.2,herbivore
    Struthiomimus,0.92,omnivore
    Velociraptor,1.0,carnivore
    Triceratops,0.87,herbivore
    Euoplocephalus,1.6,herbivore
    Stegosaurus,1.40,herbivore
    Tyrannosaurus Rex,2.5,carnivore

    Output:

    Triceratops 0 0.87
    Hadrosaurus 1.4 1.2
    Euoplocephalus 1.87 1.6
    Stegosaurus 1.90 1.40

    — Ken

Re: Merging 2 Hashes finding result sorted by one of the fields
by huck (Prior) on Aug 31, 2017 at 07:30 UTC

    First off, your files are backwards, there are spaces in the data segments, and it always makes sense to keep the key as part of the data. That said:

    use strict;
    use warnings;
    use Data::Dumper;

    my %app_map;
    readit (\%app_map,'2.csv','name',qw /name leg_len diet/);
    readit (\%app_map,'1.csv','name',qw /name str_len stance/);

    my @result;
    while ( my ($k, $v) = each %app_map ) {
        if(defined($v->{diet}) && $v->{diet} eq 'herbivore' && defined($v->{str_len}) ){
            push @result,$v;
        }
    }
    # compare the same field on both sides of <=>
    my @sorted=sort {$a->{leg_len} <=> $b->{leg_len}} @result;
    for my $row (@sorted) {
        print $row->{name}.' '.$row->{leg_len}."\n";
    }
    #print Dumper \%app_map;
    exit;

    sub readit {
        my $db=shift;
        my $fn=shift;
        my $key=shift;
        my @vars=@_;
        open my $fh, '<', $fn or die $!;
        my $header1 = <$fh>;
        my $vn=$#vars;
        while(my $row = <$fh>){
            chomp $row;
            my $h0={};
            my @parts= split /\,/, $row;
            for my $ii (0..$vn){
                unless (defined ($parts[$ii])) {$parts[$ii]='';}
                $parts[$ii]=~s/^\s+//;
                $parts[$ii]=~s/\s+$//;
                unless ($parts[$ii] eq '') { $h0->{$vars[$ii]}=$parts[$ii]; }
            }
            my $keyval=$h0->{$key};
            my $h1=$db->{$keyval};
            unless ($h1) { $db->{$keyval}=$h0; }
            else { while ( my ($k, $v) = each %$h0 ) {$h1->{$k}=$v;} } # exists
        } # row
        close $fh or die $!;
    } # readit
    Notice the use of a subroutine to read both files, passing a reference to %app_map, naming the key column, and listing the names of the columns. Also notice trimming the blanks from the beginning and the end of the values, storing a value only if it exists ("NA" is a "defined" value), and either building a new subhash or updating an existing one. Also notice the use of $v as the subhash in the testing phase.
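The "build a new subhash or update an existing one" step can be sketched on its own. This is a minimal illustration, not the code from the post: `merge_record` is a hypothetical helper name, and the `//=` operator assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: merge one record (a hash ref) into %$db,
# filed under the value of the record's $key field.
sub merge_record {
    my ($db, $key, $record) = @_;
    # Create the subhash the first time this key value is seen.
    my $slot = $db->{ $record->{$key} } //= {};
    # Hash slice: copy every field of the record into the subhash.
    @{$slot}{ keys %$record } = values %$record;
    return $slot;
}

my %app_map;
merge_record(\%app_map, 'name',
    { name => 'Stegosaurus', leg_len => '1.40', diet => 'herbivore' });
merge_record(\%app_map, 'name',
    { name => 'Stegosaurus', str_len => '1.90', stance => 'quadrupedal' });

my $dino = $app_map{Stegosaurus};
print join(', ', map { "$_=$dino->{$_}" } sort keys %$dino), "\n";
# diet=herbivore, leg_len=1.40, name=Stegosaurus, stance=quadrupedal, str_len=1.90
```

Because the key stays inside each record, the subhashes can be pushed onto a list and sorted without carrying the outer hash key along.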
    Hadrosaurus 1.4
    Euoplocephalus 1.87
    Stegosaurus 1.90
    Edit: oops, as kcott pointed out at Re: Merging 2 Hashes finding result sorted by one of the fields, I too started by sorting/printing str_len because that's what the code had, so I fixed it above and here are the new results:
    Hadrosaurus 1.2
    Stegosaurus 1.40
    Euoplocephalus 1.6
    If you turn this in as your homework assignment, the teacher will know you got someone to do it for you, so you had better be able to explain EVERYTHING. Adding comments to explain the code will go a long way toward not being failed. That task turns this assignment into a learning exercise now.

Re: Merging 2 Hashes finding result sorted by one of the fields
by BillKSmith (Monsignor) on Aug 31, 2017 at 17:01 UTC
    An alternate approach lets command-line switches do most of the work.
    C:\Users\Bill\forums\monks>type merg.pl
    #!perl -F/,/an
    use strict;
    use warnings;
    #
    our @animals;
    next if $F[0] eq 'NAME';
    push @animals, [@F[0,1]] if $F[2] eq "herbivore\n";
    END{
        our @animals;
        printf "\n%-15s %10s\n\n", 'NAME', 'LEG_LENGTH';
        printf "%-15s %4.2f\n" x scalar (@animals),
            map {@$_} sort {$$a[1] <=> $$b[1]} @animals;
    }

    C:\Users\Bill\forums\monks>perl merg.pl file1.txt file2.txt

    NAME            LEG_LENGTH

    Triceratops     0.87
    Hadrosaurus     1.20
    Stegosaurus     1.40
    Euoplocephalus  1.60
    Bill
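For readers unfamiliar with those switches: `-n` wraps the script body in a loop over the input lines, and `-a` with `-F/,/` splits each line on commas into `@F`. The sketch below spells that loop out by hand. It is only an approximation of what the switches supply; the function name `herbivores_by_leg_length` and the in-memory sample (a few rows from FILE 2 in the question) are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hand-written version of the loop that -n, -a and -F/,/ provide.
sub herbivores_by_leg_length {
    my ($fh) = @_;
    my @animals;
    while (my $line = <$fh>) {            # the implicit loop from -n
        my @F = split /,/, $line;         # the autosplit from -a with -F/,/
        next if $F[0] eq 'NAME';          # skip the header row
        # The line is not chomped, so the last field still ends in "\n".
        push @animals, [ @F[0, 1] ] if $F[2] eq "herbivore\n";
    }
    return sort { $a->[1] <=> $b->[1] } @animals;   # numeric sort on leg length
}

# A few rows from FILE 2, read through an in-memory filehandle.
my $csv = "NAME,LEG_LENGTH,DIET\n"
        . "Hadrosaurus,1.2,herbivore\n"
        . "Velociraptor,1.0,carnivore\n"
        . "Triceratops,0.87,herbivore\n";

open my $fh, '<', \$csv or die $!;
printf "%-15s %4.2f\n", @$_ for herbivores_by_leg_length($fh);
close $fh or die $!;
```

The explicit form is longer, but it makes the unchomped newline and the header check visible, which are easy to miss in the switch-based version.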

Node Type: perlquestion [id://1198386]
Approved by hdb
Front-paged by Corion