PerlMonks  

Re: Creating a column of frequency for the unique entries of another column

by Cristoforo (Deacon)
on Oct 29, 2011 at 20:28 UTC (#934647)


in reply to Creating a column of frequency for the unique entries of another column

    Tags         Frequency
    EFFFFDDDDDR  3
    FFFFEFFEEDD  3
    EFFDFEDEDDR  2
    FFFFEFFEEDD  2
    ............

I got different results from your dataset using the code below. I am assuming you want a new output file for every *.seq input file. You would just need to uncomment the 4 commented statements and change the foreach my $input_file ('o66.txt') line to foreach my $input_file (@input_files).

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $sequence    = 'ABCD';
    my @headings    = qw/ Tags Frequency /;
    my @input_files = <*.seq>;

    foreach my $input_file ('o66.txt') {
        open INPUT, "<", $input_file or die "Cannot open file \"$input_file\". $!";
        # Escape the dot and anchor at the end so only a trailing ".seq" is replaced.
        (my $outfile = $input_file) =~ s/\.seq$/.tag.txt/i;
        my %freq;

        # Each matching line contributes two 11-character tags.
        while (my $line = <INPUT>) {
            if ($line =~ m/$sequence(.{11})(.{11})$sequence/i) {
                $freq{$_}++ for $1, $2;
            }
        }
        close INPUT or die "Cannot close file \"$input_file\". $!";

        #open OUTPUT, ">", $outfile or die "Cannot open file \"$outfile\". $!";
        #printf OUTPUT "%-12s%s\n", @headings;
        printf "%-12s%s\n", @headings;
        for my $tag (sort { $freq{$b} <=> $freq{$a} } keys %freq) {
            #printf OUTPUT "%-12s%5s\n", $tag, $freq{$tag};
            printf "%-12s%5s\n", $tag, $freq{$tag};
        }
        #close OUTPUT or die "Unable to close \"$outfile\". $!";
    }
    __END__

    o66.txt is below:
    @HWDFFFDDABCDEFFFFDDDDDRFFFFEFFEEDDABCDEDDDDDD
    @HWDFFFDDABCDEFFDFEDEDDRFFFFEFFEEDDABCDEDDDDDD
    @HWDFFFDDABCDEFFFFDDDDDRFFFFEFFEEDDABCDEDDDDDD
    @HWDFFFDDABCDEFFFFDDDDDRFFFFEFFEEDDABCDEDDDDDD
    @HWDFFFDDABCDEFFDFEDEDDRFFFFEFFEEDDABCDEDDDDDD

    output is:
    C:\Old_Data\perlp>perl t.pl
    Tags        Frequency
    FFFFEFFEEDD     5
    EFFFFDDDDDR     3
    EFFDFEDEDDR     2
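
For anyone who wants the per-file variant spelled out, here is a minimal sketch of what the script might look like once the output statements are enabled and the loop runs over every *.seq file. The sub names count_tags and write_report are hypothetical, and the lexical filehandles, the escaped `\.seq$` pattern, the overwrite guard, and the alphabetical tie-break in the sort are my own choices rather than anything from the code above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $sequence = 'ABCD';

# Count the two 11-character tags that sit between two copies of $sequence.
# Takes an open read filehandle; returns a hash reference (tag => count).
sub count_tags {
    my ($in) = @_;
    my %freq;
    while (my $line = <$in>) {
        if ($line =~ m/$sequence(.{11})(.{11})$sequence/i) {
            $freq{$_}++ for $1, $2;
        }
    }
    return \%freq;
}

# Write the frequency table for one input file to a sibling .tag.txt file.
# Returns the name of the file that was written.
sub write_report {
    my ($input_file) = @_;

    (my $outfile = $input_file) =~ s/\.seq$/.tag.txt/i;
    # Assumption: inputs end in .seq; otherwise the name is unchanged and
    # writing would clobber the input, so stop instead.
    die "No .seq suffix on \"$input_file\"\n" if $outfile eq $input_file;

    open my $in, '<', $input_file
        or die "Cannot open file \"$input_file\". $!";
    my $freq = count_tags($in);
    close $in or die "Cannot close file \"$input_file\". $!";

    open my $out, '>', $outfile
        or die "Cannot open file \"$outfile\". $!";
    printf {$out} "%-12s%s\n", qw/ Tags Frequency /;
    # Highest count first; ties broken alphabetically so reruns are stable.
    for my $tag (sort { $freq->{$b} <=> $freq->{$a} || $a cmp $b } keys %$freq) {
        printf {$out} "%-12s%5s\n", $tag, $freq->{$tag};
    }
    close $out or die "Unable to close \"$outfile\". $!";

    return $outfile;
}

# One report per .seq file in the current directory.
write_report($_) for glob '*.seq';
```

Run from the directory holding the .seq files, this should leave one .tag.txt file next to each input. Because the reports end in .tag.txt, the `*.seq` glob does not pick them up on a later run.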


Re^2: Creating a column of frequency for the unique entries of another column
by bluray (Sexton) on Oct 29, 2011 at 21:10 UTC
    Thanks Cristoforo, I used the .seq because I have several input files. Anyway, if you delete the output file from the same directory, you would get the same result when you run the script again.
