PerlMonks  

Entering the land of Perl

by manbroski (Initiate)
on Apr 04, 2013 at 21:27 UTC (#1027035=perlquestion)
manbroski has asked for the wisdom of the Perl Monks concerning the following question:

Greetings, Monks.

In my weary travels, I have found a serene beauty in your language. It is sometimes harsh, but it is also beautiful. I come from the land of Python, where folks carry their language as a hammer, always clear and to the point. However, at times I wish that I could wield a rapier, swift and effective. Which is why I come to the Monastery.

Recently, I have started looking at problems on Rosalind (rosalind.org) and decided to solve all the tasks in Perl. Question number 1 is particularly interesting. It is simple, yet writing clean code for it may be a challenge. The task is simple - count letters in a string and print counts in alphabetical order.

Here is what I came up with:

#!/usr/bin/env perl
#Question 1: Counting DNA nucleotides
use v5.16;

sub sort_and_print_hash_keys (\%) {
    my %hash = %{shift()};
    foreach (sort keys %hash) {
        print "$hash{$_} ";
    }
    print "\n";
}

sub counting {
    my %letters = ();
    chomp(my $seq = readline);
    foreach my $base (split //, $seq) {
        $letters{$base}++;
    }
    sort_and_print_hash_keys(%letters);
}

counting();

As you might see, I like clean and meaningful code. One thought per line.

My question is twofold. First, is this idiomatic Perl? Would you have done something differently? Please, critique this.

Second, I am wondering if the magic behind hash ref passing has any side effects. Here, I use a subroutine prototype declaration to ensure argument type safety and dereference immediately with %{shift()}. Are there any undesired consequences of this? Does any other code flatten out data structures or make unnecessary copies? I am aware that 'sort keys' would create a separate array of keys.

Thank you, monks. I eagerly await your wisdom.

Re: Entering the land of Perl
by choroba (Abbot) on Apr 04, 2013 at 21:38 UTC
    Hello manbroski, welcome to the Monastery!
    Prototypes are seldom used in Perl. It is usually enough to pass a reference as the argument directly:
    sort_and_print_hash_keys(\%letters);
    Moreover, counting characters in a string is idiomatically done via the tr operator. We know in advance that the possible characters are A, C, T and G only, so you can write:
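    #!/usr/bin/perl
    use strict;
    use warnings;
    use feature qw(say);

    my $s = shift;
    # The backslash keeps $s from being interpolated into the eval string,
    # so each eval runs e.g. '$s =~ tr/A//' against the variable itself.
    say join " ", map eval "\$s =~ tr/$_//", qw/A C G T/;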
    Which was exactly my solution to the DNA problem on Rosalind :-)
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
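The string eval above is needed because tr builds its character table at compile time and does not interpolate variables. As a minimal sketch of an eval-free variant, assuming the four bases are fixed in advance, the counts can be taken with four literal tr calls. The helper name count_bases is hypothetical and does not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): count A, C, G and T with four
# literal tr/// calls. With an empty replacement list tr leaves the string
# unchanged and returns the number of matching characters.
sub count_bases {
    my ($dna) = @_;
    return (
        ($dna =~ tr/A//),
        ($dna =~ tr/C//),
        ($dna =~ tr/G//),
        ($dna =~ tr/T//),
    );
}

print join(" ", count_bases("AGCTTTTCATTCTGAC")), "\n";   # prints "3 4 2 7"
```

The trade-off is repetition: the eval version scales to any list of characters, while this one avoids compiling code from a string.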

      That, sir, is a beauty.

      Anything functional (in the map-reduce sense) makes me quite happy. You have my thanks.

      EDIT: What version is this? On 5.16 with 'use strict', I get zero for each count. Does that provide the function return code?

        The shift requires the string as an argument of the script. If you want to read it from a file, you have to use
        $s = <>;
        instead and provide the input file as the argument.
        لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
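A minimal sketch of the line-reading step, for illustration only. The helper first_line is a hypothetical name, and an in-memory filehandle stands in for the input file so that the example runs on its own; in the script above the line would come from <> instead.

```perl
use strict;
use warnings;

# Hypothetical helper: read the first line from an open filehandle
# and strip the trailing newline.
sub first_line {
    my ($fh) = @_;
    my $line = <$fh>;
    return '' unless defined $line;   # empty input
    chomp $line;
    return $line;
}

# In-memory filehandle used only to keep the sketch self-contained.
my $data = "AGCT\nTTTT\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
print first_line($fh), "\n";   # prints "AGCT"
```

Without the chomp, the newline would stay attached to the string, which matters for any approach that counts every character.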
Re: Entering the land of Perl
by jwkrahn (Monsignor) on Apr 04, 2013 at 23:30 UTC
    Please, critique this.

    You say you want to "count letters in a string" but you are counting all characters.

    $letters{$1}++ while $seq =~ /([[:alpha:]])/g;
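To see what the letters-only match changes in practice, here is a small sketch wrapping that line in a hypothetical helper called count_letters. The input string is made up for the example and contains a dash, spaces, digits and a newline, none of which should be counted.

```perl
use strict;
use warnings;

# Hypothetical helper: count alphabetic characters only, using the
# while-match idiom shown above. Returns a reference to the count hash.
sub count_letters {
    my ($seq) = @_;
    my %letters;
    $letters{$1}++ while $seq =~ /([[:alpha:]])/g;
    return \%letters;
}

my $counts = count_letters("AC-GT 12 AA\n");
print join(" ", map { "$_=$counts->{$_}" } sort keys %$counts), "\n";
# prints "A=3 C=1 G=1 T=1"
```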


    You shouldn't use prototypes. Besides, you are just copying the contents of the hash anyway, which you could do more simply as:

    sub sort_and_print_hash_keys { my %hash = @_;

      Good call on the letters versus characters. But as far as the prototype passing, I was certain that it enforces a pass by reference. Notice that I have to dereference it after the passage. So the hope is that there is no copying.

        Hi manbroski,

        "..Notice that I have to dereference it after the passage.." Why is that?

        Did you also notice your subroutine sort_and_print_hash_keys definition

        sub sort_and_print_hash_keys (\%) {..
        and how you eventually called it:
        sort_and_print_hash_keys(%letters); # you passed a hash variable, not a hash ref.
        ".. But as far as the prototype passing, I was certain that it enforces a pass by reference..."

        If I may suggest, you would do well to heed the wisdom of jwkrahn regarding the use of prototypes, for this reason:

        When you use a reference prototype, like "\$", "\@", "\%", "...those symbols don't actually say that you must pass in a scalar reference, an array reference, and a hash reference. Rather, they say you must pass in a scalar variable, an array variable, and a hash variable. That means that the compiler insists upon seeing a properly notated variable of the given type, complete with "$", "@", or "%" in that slot. You must not use a backslash. The compiler silently supplies the backslash for you... "
        from Far More than Everything You've Ever Wanted to Know about Prototypes in Perl -- by Tom Christiansen
        -- by liverpole, under subheading Problems with Reference Prototypes
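The behaviour described in the quote can be checked with a short sketch. Both subroutine names below are hypothetical and exist only for this illustration: one carries a (\%) prototype, the other has none.

```perl
use strict;
use warnings;

# With a (\%) prototype the caller writes a plain hash variable and the
# compiler passes a reference to it.
sub ref_type_with_proto (\%) {
    my ($arg) = @_;
    return ref $arg;
}

# Without a prototype, a hash argument is flattened into a key/value list.
sub arg_count_without_proto {
    return scalar @_;
}

my %letters = (A => 2, C => 1);
print ref_type_with_proto(%letters), "\n";       # prints "HASH"
print arg_count_without_proto(%letters), "\n";   # prints "4"
```

Note that the prototype only takes effect when the subroutine is declared before the call is compiled, which is one reason prototypes tend to surprise people.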

        Hope this helps.

        If you tell me, I'll forget.
        If you show me, I'll remember.
        if you involve me, I'll understand.
        --- Author unknown to me
        Notice that I have to dereference it after the passage. So the hope is that there is no copying.

        In your code you have:

        sub sort_and_print_hash_keys (\%) {
            my %hash = %{shift()};
            foreach (sort keys %hash) {
                print "$hash{$_} ";
            }
            print "\n";
        }

        Which is copying the entire hash. If you didn't want to copy the hash you could do it like this:

        sub sort_and_print_hash_keys (\%) {
            my $hash = shift;
            foreach (sort keys %$hash) {
                print "$hash->{$_} ";
            }
            print "\n";
        }
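One way to observe the difference between the two versions is to modify the hash inside the subroutine and look at the caller's hash afterwards. This is only a sketch; bump_copy and bump_ref are hypothetical names that mirror the two dereferencing styles above.

```perl
use strict;
use warnings;

# Copies the hash first, so the increment touches only the local copy.
sub bump_copy (\%) {
    my %hash = %{ shift() };
    $hash{A}++;
    return;
}

# Works through the reference, so the increment reaches the caller's hash.
sub bump_ref (\%) {
    my $hash = shift;
    $hash->{A}++;
    return;
}

my %letters = (A => 1);
bump_copy(%letters);
print "$letters{A}\n";   # prints "1": the original is untouched
bump_ref(%letters);
print "$letters{A}\n";   # prints "2": the original was modified
```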
Re: Entering the land of Perl
by Discipulus (Deacon) on Apr 05, 2013 at 07:42 UTC
    Welcome to Perl! This is a wonderful land..

    TIMTOWTDT (I'm not a master, so take my code with some doubt..)
    perl -e ' print scalar split (//,$ARGV[0]),"\t", (sort split (//,$ARGV[0])),"\n"' TGAC

    L*
    there are no rules, there are no thumbs..
Re: Entering the land of Perl
by hdb (Parson) on Apr 05, 2013 at 10:18 UTC

    I cannot resist providing my take on it as well...

    use strict;
    use warnings;

    # local instead of "my $_": lexical $_ was removed in Perl 5.24
    local $_ = "AAAGCCCCTTTAAACCCCxxxxxx";
    my %h;
    $h{$_}++ for /./g;
    print "ACGT only: ", join " ", @h{qw(A C G T)}, "\n";
    print "All chars: ", join " ", @h{sort keys %h}, "\n";
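The @h{...} syntax above is a hash slice: it returns the values for a list of keys in the order given. A base that never occurs has no entry, so its slot is undef. The sketch below, using a hypothetical helper named slice_counts, defaults those slots to 0.

```perl
use strict;
use warnings;

# Hypothetical helper: count characters, then read the counts back with a
# hash slice in a fixed A, C, G, T order. Missing bases default to 0.
sub slice_counts {
    my ($seq) = @_;
    my %h;
    $h{$_}++ for $seq =~ /./g;
    return map { $_ // 0 } @h{qw(A C G T)};
}

print join(" ", slice_counts("AAAGCC")), "\n";   # prints "3 2 1 0"
```

The defined-or operator (//) needs Perl 5.10 or later.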
Re: Entering the land of Perl
by Cristoforo (Deacon) on Apr 24, 2013 at 22:18 UTC
    Or, using the Bio::SeqIO and Bio::Tools::SeqStats modules from the BioPerl distribution, you can come up with this (though it is more useful for less trivial tasks than the one here).
    #!/usr/bin/perl
    use strict;
    use warnings;
    use 5.014;
    use Bio::SeqIO;
    use Bio::Tools::SeqStats;

    my $in = Bio::SeqIO->new(-fh => \*DATA, -format => 'fasta');

    while (my $seq = $in->next_seq()) {
        my $seq_stats = Bio::Tools::SeqStats->new(-seq => $seq);
        my $count = $seq_stats->count_monomers();
        print "Count: A $count->{A} T $count->{T} G $count->{G} C $count->{C}\n";
    }

    __DATA__
    >NR_037701 1
    aggagctatgaatattaatgaaagtggtcctgatgcatgcatattaaaca
    tgcatcttacatatgacacatgttcaccttggggtggagacttaatattt
    aaatattgcaatcaggccctatacatcaaaaggtctattcaggacatgaa
    ggcactcaagtatgcaatctctgtaaacccgctagaaccagtcatggtcg
    gtgggctccttaccaggagaaaattaccgaaatcactcttgtccaatcaa
    agctgtagttatggctggtggagttcagttagtcagcatctggtggagct
    gcaagtgttttagtattgtttatttagaggccagtgcttatttagctgct
    agagaaaaggaaaacttgtggcagttagaacatagtttattcttttaagt
    gtagggctgcatgacttaacccttgtttggcatggccttaggtcctgttt
    gtaatttggtatcttgttgccacaaagagtgtgtttggtcagtcttatga
    cctctattttgacattaatgctggttggttgtgtctaaaccataaaaggg
    aggggagtataatgaggtgtgtctgacctcttgtcctgtcatggctggga
    actcagtttctaaggtttttctggggtcctctttgccaagagcgtttcta
    ttcagttggtggaggggacttaggattttatttttagtttgcagccaggg
    tcagtacatttcagtcacccccgcccagccctcctgatcctcctgtcatt
    cctcacatcctgtcattgtcagagattttacagatatagagctgaatcat
    ttcctgccatctcttttaacacacaggcctcccagatctttctaacccag
    gacctacttggaaaggcatgctgggtctcttccacagactttaagctctc
    cctacaccagaatttaggtgagtgctttgaggacatgaagctattcctcc
    caccaccagtagccttgggctggcccacgccaactgtggagctggagcgg
    Output:
    C:\Old_Data\perlp>perl t7.pl
    Count: A 244 T 311 G 232 C 213

    C:\Old_Data\perlp>
