 
PerlMonks  

Re: Creating a column of frequency for the unique entries of another column

by Cristoforo (Chaplain)
on Oct 29, 2011 at 20:28 UTC (#934647)


in reply to Creating a column of frequency for the unique entries of another column

Tags         Frequency
EFFFFDDDDDR  3
FFFFEFFEEDD  3
EFFDFEDEDDR  2
FFFFEFFEEDD  2
............
I got different results from your dataset using the code below. I am assuming you want a new output file for every *.seq input file. To get that, uncomment the 4 commented statements and change the line foreach my $input_file ('o66.txt') to foreach my $input_file (@input_files).
#!/usr/bin/perl
use strict;
use warnings;

my $sequence    = 'ABCD';
my @headings    = qw/ Tags Frequency /;
my @input_files = <*.seq>;

foreach my $input_file ('o66.txt') {
    open INPUT, "<", $input_file
        or die "Cannot open file \"$input_file\". $!";
    # Derive the output name, e.g. foo.seq becomes foo.tag.txt
    (my $outfile = $input_file) =~ s/\.seq$/.tag.txt/i;
    my %freq;
    while (my $line = <INPUT>) {
        # Capture the two 11-character tags between the marker sequences
        if ($line =~ m/$sequence(.{11})(.{11})$sequence/i) {
            $freq{$_}++ for $1, $2;
        }
    }
    close INPUT or die "Cannot close file \"$input_file\". $!";

    #open OUTPUT, ">", $outfile or die "Cannot open file \"$outfile\". $!";
    #printf OUTPUT "%-12s%s\n", @headings;
    printf "%-12s%s\n", @headings;
    # Most frequent tags first
    for my $tag (sort { $freq{$b} <=> $freq{$a} } keys %freq) {
        #printf OUTPUT "%-12s%5s\n", $tag, $freq{$tag};
        printf "%-12s%5s\n", $tag, $freq{$tag};
    }
    #close OUTPUT or die "Unable to close \"$outfile\". $!";
}
__END__

o66.txt is below:

@HWDFFFDDABCDEFFFFDDDDDRFFFFEFFEEDDABCDEDDDDDD
@HWDFFFDDABCDEFFDFEDEDDRFFFFEFFEEDDABCDEDDDDDD
@HWDFFFDDABCDEFFFFDDDDDRFFFFEFFEEDDABCDEDDDDDD
@HWDFFFDDABCDEFFFFDDDDDRFFFFEFFEEDDABCDEDDDDDD
@HWDFFFDDABCDEFFDFEDEDDRFFFFEFFEEDDABCDEDDDDDD

output is:

C:\Old_Data\perlp>perl t.pl
Tags        Frequency
FFFFEFFEEDD     5
EFFFFDDDDDR     3
EFFDFEDEDDR     2
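For anyone who wants to check the counting step without creating o66.txt, here is a minimal sketch that runs the same regex over the five sample lines held in memory. The helper name count_tags is hypothetical (it is not in the script above), and the \Q...\E quoting and the tie-break on tag name are additions of mine, assuming the marker should be matched literally and that a repeatable order is wanted for equal counts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of the original script: tally the two
# 11-character tags that sit between a pair of marker sequences.
sub count_tags {
    my ($marker, @lines) = @_;
    my %freq;
    for my $line (@lines) {
        # \Q...\E makes the marker match literally (an assumption on my part)
        if ($line =~ m/\Q$marker\E(.{11})(.{11})\Q$marker\E/i) {
            $freq{$_}++ for $1, $2;
        }
    }
    return %freq;
}

# The five sample lines from o66.txt; qw() does not interpolate the '@'
my @lines = qw(
    @HWDFFFDDABCDEFFFFDDDDDRFFFFEFFEEDDABCDEDDDDDD
    @HWDFFFDDABCDEFFDFEDEDDRFFFFEFFEEDDABCDEDDDDDD
    @HWDFFFDDABCDEFFFFDDDDDRFFFFEFFEEDDABCDEDDDDDD
    @HWDFFFDDABCDEFFFFDDDDDRFFFFEFFEEDDABCDEDDDDDD
    @HWDFFFDDABCDEFFDFEDEDDRFFFFEFFEEDDABCDEDDDDDD
);

my %freq = count_tags('ABCD', @lines);

printf "%-12s%s\n", qw/ Tags Frequency /;
# Highest count first; equal counts fall back to alphabetical order
for my $tag (sort { $freq{$b} <=> $freq{$a} or $a cmp $b } keys %freq) {
    printf "%-12s%5d\n", $tag, $freq{$tag};
}
```

Because each tag is tallied once per file regardless of which capture group it came from, FFFFEFFEEDD ends up as a single row rather than two separate ones.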


Re^2: Creating a column of frequency for the unique entries of another column
by bluray (Sexton) on Oct 29, 2011 at 21:10 UTC
    Thanks Cristoforo, I used the .seq because I have several input files. Anyway, if you delete the output file from the same directory, you would get the same result when you run the script again.
