Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

Re: Parsing newick trees

by BrowserUk (Patriarch)
on Oct 17, 2008 at 18:17 UTC ( [id://717834]=note: print w/replies, xml ) Need Help??


in reply to Parsing newick trees

A non-eval solution. This constructs a AoA. I'd define constants: use constant { LEFT => 0, RIGHT => 1 }; rather than use a HoH for this, but converting the code to produce a HoH is trivial. As coded, it does require an absence of spaces, so strip them first if they might be present:

Updated with a slightly less verbose version

#! perl -slw use strict; use Data::Dumper; sub newickFromString { my $in = shift; return $in unless $in =~ tr[(),][(),]; my( $depth, $p ) = 0; for ( 0 .. length( $in ) -1 ) { ## find split point my $c = substr $in, $_, 1; ++$depth if $c eq '('; --$depth if $c eq ')'; $p = $_ and last if $c eq ',' and $depth == 1; } return [ newickFromString( substr $in, 1, $p-1 ), newickFromString( substr $in, $p+1, -1 ) ]; } my $input = "((Alpha,(Bravo,((Charlie,(Delta,Echo)),(Foxtrot,Golf)))), +Hotel)"; my $newick = newickFromString( $input ); print Dumper $newick; __END__ c:\test>junk8 $VAR1 = [ [ 'Alpha', [ 'Bravo', [ [ 'Charlie', [ 'Delta', 'Echo' ] ], [ 'Foxtrot', 'Golf' ] ] ] ], 'Hotel' ];

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^2: Parsing newick trees
by JadeNB (Chaplain) on Oct 17, 2008 at 21:32 UTC
    Rather than doing all the substr's, couldn't you just do
    $in = s/^\((.*)\)$/$1/; my @splat = split(/,/, $in); my ( $depth, $left ); while ( my $c = shift @splat ) { $left .= ( $left ? ',' : '' ) . $c; ++$depth if index($c, '(') >= $[; --$depth if index($c, ')') >= $[; last if $depth == 0; } my $right = join(',', @splat);
    and then recurse on $left and $right? (Maybe searching each time for the parentheses is as hard as substr'ing, though.)

    UPDATE: Per BrowserUK's response, the answer seems to be “Yes, but why?” I always thought that substr's were very expensive, but apparently not. Hurrah for premature optimisation!

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://717834]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others examining the Monastery: (4)
As of 2024-04-26 09:09 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found