
Parser object to return different objects

by rvosa (Curate)
on Sep 01, 2005 at 22:27 UTC ( #488509=perlquestion )

rvosa has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

I need to parse text files in several formats. Based on the contents of the text files, one or more trees, matrices, and/or taxon objects are created. I wonder if you can advise me, from a user perspective, on what would be a convenient interface. Right now, what you'd do is:
    use Bio::Phylo::Parsers;
    my $parser = new Bio::Phylo::Parsers;

    # the newick format contains one or more trees.
    my $trees = $parser->parse( -format => 'newick', -file => $newickfile );

    # the 'taxlist' format is simply a list of names, from
    # which a taxa object is created
    my $taxa = $parser->parse( -format => 'taxlist', -file => $taxonfile );

    # the nexus format is a mixed format, that can contain
    # trees, taxa, matrices, etc.
    my $arrayref = $parser->parse( -format => 'nexus', -file => $nexusfile );
The Bio::Phylo::Parsers package functions as a facade that requires the appropriate parser submodule, based on the -format switch.
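
For readers who have not seen the pattern, here is a minimal sketch of a facade that loads a parser submodule on demand from the -format switch. It is not the actual Bio::Phylo::Parsers code: the My::Parsers::* packages and the from_file method are hypothetical stand-ins, defined inline and registered in %INC so the sketch runs on its own.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the real parser submodules, defined inline so
# the sketch is self-contained. Registering them in %INC makes require()
# treat them as already loaded.
BEGIN {
    package My::Parsers::Newick;
    sub new { return bless {}, shift }
    sub from_file { my ( $self, $file ) = @_; return "trees from $file" }
    $INC{'My/Parsers/Newick.pm'} = __FILE__;

    package My::Parsers::Taxlist;
    sub new { return bless {}, shift }
    sub from_file { my ( $self, $file ) = @_; return "taxa from $file" }
    $INC{'My/Parsers/Taxlist.pm'} = __FILE__;
}

package My::Parsers;

sub new { return bless {}, shift }

sub parse {
    my ( $self, %args ) = @_;
    my $format = $args{-format} or die "No -format given\n";
    die "Bad format name '$format'\n" unless $format =~ /\A\w+\z/;

    # 'newick' -> My::Parsers::Newick -> My/Parsers/Newick.pm
    my $class = 'My::Parsers::' . ucfirst( lc $format );
    ( my $path = "$class.pm" ) =~ s{::}{/}g;

    # Load the submodule only when it is actually asked for.
    eval { require $path; 1 }
        or die "No parser for format '$format': $@";

    return $class->new->from_file( $args{-file} );
}

package main;

my $parser = My::Parsers->new;
my $trees  = $parser->parse( -format => 'newick',  -file => 'tree.dnd' );
my $taxa   = $parser->parse( -format => 'taxlist', -file => 'taxa.txt' );
print "$trees\n$taxa\n";
```

The caller only ever sees the facade, so a new format can be added by dropping in another submodule without touching calling code.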

The problem lies primarily in the nexus format, which can contain a bunch of different things. Right now, without recourse to the nexus text file, it is impossible to say what is returned by the parser (it simply returns an array ref with all objects it parsed from the file). This seems too ad hoc. All other parsers return Bio::Phylo::* objects.

An option I've been considering is to make it such that not all text in the file is parsed, but only those things (if present) that the user wants:
    my $trees = $parser->parse( -format => 'nexus', -I_want => 'trees', -file => $nexusfile );
So, based on the "-I_want => " switch, the parser only ever returns what the user wants (if present).
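
One way the "-I_want" idea might look on the parser side is sketched below. Everything here is an assumption made for illustration: parse_nexus, the %block_parser table and the My::* classes are hypothetical, and the real parser would of course read the file rather than bless an empty shell.

```perl
use strict;
use warnings;

# Hypothetical per-block parsers; each stands in for the real work of
# building a trees, taxa or matrices object from the nexus text.
my %block_parser = (
    trees    => sub { return bless { from => shift }, 'My::Trees' },
    taxa     => sub { return bless { from => shift }, 'My::Taxa' },
    matrices => sub { return bless { from => shift }, 'My::Matrices' },
);

sub parse_nexus {
    my %args = @_;
    my $want = $args{-I_want};

    if ( defined $want ) {
        my $code = $block_parser{$want}
            or die "Don't know how to extract '$want'\n";

        # Only the requested block is parsed, so the return type is predictable.
        return $code->( $args{-file} );
    }

    # No -I_want: keep the current behaviour, an array ref of everything.
    return [ map { $block_parser{$_}->( $args{-file} ) } sort keys %block_parser ];
}

my $trees = parse_nexus( -file => 'data.nex', -I_want => 'trees' );
my $all   = parse_nexus( -file => 'data.nex' );
printf "%s, then %d objects\n", ref $trees, scalar @$all;
```

With a dispatch table like this, blocks the user did not ask for are never parsed at all.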

Would that be a convenient way to go about things? What would you do?


Replies are listed 'Best First'.
Re: Parser object to return different objects
by Roger (Parson) on Sep 02, 2005 at 00:48 UTC
    it is impossible to say what is returned by the parser

    That's not true though: you can tell what type each object is by inspecting the ref of the returned object.

    I think your suggestion is workable. Perhaps you want to consider making the -I_want switch optional. A more convenient interface would be a standardised approach for every file type, something like...
        use Bio::Phylo::Parsers;
        my $parser = new Bio::Phylo::Parsers;
        my $collection = $parser->parse(
            -format => $format,
            -file   => $filename,
            -I_want => qw/ trees /,    # I_want is optional
        );
        for my $obj (@$collection) { .... }
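
The point about inspecting the ref of each returned object could be sketched roughly as follows. The collection and the My::* class names are made up for the example. Scalar::Util's blessed is used instead of plain ref() so that unblessed references are skipped; for blessed objects both return the class name.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical stand-ins for what a nexus parse might return.
my $collection = [
    bless( {}, 'My::Trees' ),
    bless( {}, 'My::Taxa' ),
    bless( {}, 'My::Trees' ),
];

# Group the objects by class so the caller can pick out what it needs.
my %by_class;
for my $obj (@$collection) {
    my $class = blessed($obj) or next;    # skip anything that is not an object
    push @{ $by_class{$class} }, $obj;
}

for my $class ( sort keys %by_class ) {
    printf "%s: %d\n", $class, scalar @{ $by_class{$class} };
}
```

This keeps a single return type (an array ref) for every format, at the cost of leaving the sorting to the caller.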

Node Type: perlquestion [id://488509]
Approved by Enlil