
Re^2: Re-organising entries

by $new_guy (Acolyte)
on Feb 14, 2011 at 15:12 UTC (#888006)

in reply to Re: Re-organising entries
in thread Re-organising entries

Hi Moritz and Limbic_Region,

I'm afraid your scripts don't work very well. They only generate 2 columns, and the entries within a column don't share the same prefix. On my scratchpad I have put the first seven rows (my .txt file has about 8000 rows). Note that each row starts with Cluster(\d+), i.e. the word "Cluster" followed by a number, e.g. 1, 2, 3, etc.

The code I have come up with so far is:

#!/usr/bin/perl -w
use warnings;
use strict;
use List::Util 'max';

# Read in the file
my $FILENAME3 = "clusters3.txt";
open(DATA, $FILENAME3);

#create arrays and hashes to store stuff
my (%data, %all, @keys);

while (<DATA>) {
    # avoid \n on last field
    chomp;
    #split the data into chunks
    my @chunks = split;
    #create keys for the chunks
    my $key = shift @chunks;
    #store the keys in an array unless they already exist
    push @keys, $key unless exists $data{$key};
    foreach my $chunk (@chunks) {
        #return references using hashes
        $data{$key}{$chunk}++;
        #add all chunks to the hash '%all'
        $all{$chunk} = 1;
    }

    #now make a file for the output
    my $outputfile = "new_cluster.txt";
    if (! open(POS, ">>$outputfile") ) {
        print "Cannot open file \"$outputfile\" to write to!!\n\n";
        exit;
    }

    #sort the fields/columns keys and save them as an array
    #my @fields = sort {$a <=> $b} keys %all;
    my @fields = sort {lc($a) cmp lc($b)} keys %all; ##<--this sorting didn't work

    #find the longest entry in an array
    #my $longest = max map {length} @fields;
    my $longest = max map {scalar grep $_=~ /\(\d+\)\_\(\d+\)\_\(\d+\)\_/, @fields} @fields; #the line I think has a problem!

    #organise the data
    foreach my $key (@keys) {
        while (keys %{$data{$key}}) {
            print POS $key, " ";
            foreach my $field (@fields) {
                if ($data{$key}{$field}){
                    printf POS "%${longest}s ", $field;
                    delete $data{$key}{$field} unless --$data{$key}{$field};
                }
                else {
                    printf POS "%${longest}s ", "-";
                }
            }
            print POS"\n";
        }
    }
}

In the code, clusters3.txt is my .txt file, but it spits out rubbish.

Is it possible to have each entry in each row arranged tidily in columns?

Generally, for entries beginning with a letter, the prefix is separated from the rest by an underscore, except for 'spr', 'HMPREF', and 'pseudoSPN23F' (which is equivalent to SPN23F and should go in the same column as SPN23F).

For entries beginning with digits, the prefix runs from the beginning to the last underscore, e.g. 3850_1_7_ and 3850_1_8_ .
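
The prefix rules above could be sketched roughly as follows. This is only an illustration under assumptions: the sub name prefix_of and the sample row are hypothetical, and because the post only says that 'spr' and 'HMPREF' are exceptions, the sketch assumes the literal tag itself serves as the prefix for them.

```perl
use strict;
use warnings;

# Return the column prefix for one entry (hypothetical helper).
sub prefix_of {
    my ($entry) = @_;
    # pseudoSPN23F entries share a column with SPN23F
    return 'SPN23F' if $entry =~ /^pseudoSPN23F/;
    # ASSUMPTION: for the 'spr' and 'HMPREF' exceptions, use the tag itself
    return $1 if $entry =~ /^(spr|HMPREF)/;
    # entries starting with a digit: everything up to the last underscore
    return $1 if $entry =~ /^(\d.*_)/;
    # entries starting with a letter: everything before the first underscore
    return $1 if $entry =~ /^([A-Za-z][^_]*)_/;
    # fallback: no recognisable prefix, keep the entry as its own group
    return $entry;
}

# Hypothetical row in the "ClusterN entry entry ..." layout described above.
my $line = "Cluster1 SPN23F_0001 pseudoSPN23F_0002 3850_1_7_00012 spr0042";
my ($cluster, @entries) = split ' ', $line;

# Group the entries of this row by prefix.
my %by_prefix;
push @{ $by_prefix{ prefix_of($_) } }, $_ for @entries;

# One output line per prefix, tab-separated.
for my $prefix (sort keys %by_prefix) {
    print join("\t", $cluster, $prefix, @{ $by_prefix{$prefix} }), "\n";
}
```

Grouping by prefix first means the column for each entry is decided by its prefix rather than by its full name, which is what sorting on the whole entry cannot do.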


