Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses
 
PerlMonks  

Re^2: concatenating identical sequences

by $new_guy (Acolyte)
on Oct 05, 2011 at 05:49 UTC ( [id://929708]=note: print w/replies, xml ) Need Help??


in reply to Re: concatenating identical sequences
in thread concatenating identical sequences

Thanks Limbic~Region and BrowserUk for your good advice. I have had a quick go with BrowserUk's script and it seems to work OK. However, the merge is not so clean and results after the merge seem not to be in equal chunks (I have posted this on to my scratchpad). I am speculating that this will continue as > continue to add more block (i.e thousands). I have adapted your script as follows:

#! perl -slw use strict; if (scalar(@ARGV) != 1) { print "\n"; print "Usage: script.pl <alignment file>"; print "\n"; exit(); } my ($FILENAME) = @ARGV; #read in file open(DATA, $FILENAME); my( $id, %accu, @order ); ## remove existing files my $remove = "new_alignment_".$FILENAME; #remove any existing results + file if (unlink($remove) == 1) { print "Existing \"$remove\" file was removed\n +"; } ## generate a temporary storage file my $outputfile = "new_alignment_".$FILENAME; #make a big file in which + the final results will be printed unless ( open(POS, ">>$outputfile") ) { print "Cannot open file \"$outputfile\" to write to!!\n\n"; exit; } while( <DATA> ) { chomp; if( m[^(\S+_\S+_\S+)\s+(.+)\s*$] ) { $id = $1; unless( exists $accu{ $id } ) { push @order, $id; $accu{ $id } = $2; } else { $accu{ $id } .= ' ' . $2; } } else { $accu{ $id } .= ' ' . $_; } } for my $key ( @order ) { printf POS "%-10s %s\n", $key, substr( $accu{ $key }, 0, 55, '' ); print POS substr( $accu{ $key }, 0, 66, '' ) while length $accu{ $ +key }; print POS ''; }
I must also confess I do not understand the last bit of code:
for my $key ( @order ) { printf POS "%-10s %s\n", $key, substr( $accu{ $key }, 0, 55, '' ); print POS substr( $accu{ $key }, 0, 66, '' ) while length $accu{ $ +key }; print POS ''; }
Could you kindly please explain it to me.

PS: Still thinking of how to implement what Limbic~Region has pointed out.

$new_guy

Replies are listed 'Best First'.
Re^3: concatenating identical sequences
by BrowserUk (Patriarch) on Oct 05, 2011 at 10:56 UTC

    Or you could try this version first:

    #! perl -slw use strict; my( $id, %accu, @order ); while( <DATA> ) { chomp; if( m[^(\S+_\S+_\S+)\s+(.+)\s*$] ) { $id = $1; unless( exists $accu{ $id } ) { push @order, $id; $accu{ $id } = $2; } else { $accu{ $id } .= ' ' . $2; } } else { my $pad = 10 - length() % 11; $accu{ $id } .= ' ' . $_ . ' ' x $pad; } } for my $key ( @order ) { printf "%-10s %s\n", $key, substr( $accu{ $key }, 0, 55, '' ); print substr( $accu{ $key }, 0, 66, '' ) while length $accu{ $key +}; print ''; }

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      It works like a charm, thank you very much BrowserUk

Re^3: concatenating identical sequences
by BrowserUk (Patriarch) on Oct 05, 2011 at 10:35 UTC
    However, the merge is not so clean and results after the merge seem not to be in equal chunks (I have posted this on to my scratchpad).

    I missed this bit of your spec: "(except for the very last chunk at the end of each block which might be shorter than 10 letters)".

    Looks like you'll have to use Limbic~Region's solution of splitting the lines and storing the data as arrays.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://929708]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (9)
As of 2024-04-23 08:20 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found