Flame has asked for the
wisdom of the Perl Monks concerning the following question:
Well, bored out of my mind earlier today, I decided that I wanted to reach into the realm of math for a while. Thus was born the concept behind Math::ParseEquasion (Tempname until I can come up with something better)
Now that I have the idea though, I'm stuck with a few serious problems. Most significant is that I have no clue how to write a parser. So to start with, can anyone suggest any particularly good parsing modules I should examine?
Now, basically this is the way I'm aiming to have it work:
my $equasion = new Math::Parser('tan(x^4)+2sin(x)');
That little bit should have parsed it into a tree resembling this:
'tan'
'x'
'^'
'4'
'+'
'2'
'*'
'sin'
'x'
Hopefully that explained my goals well enough.
To wrap up, I ask that if anyone has any advice or suggestions, please post them. Although I did a search for something similar to this, and didn't find it, that doesn't mean it's not there, so if someone has already written something to do this, please say so.
Thanks.
"Weird things happen, get used to it."
Flame ~ Lead Programmer: GMS
Re: Seeking Advice: Writing a parser by Flame (Deacon) on Feb 08, 2002 at 04:29 UTC 
Note: This module is being written to lead into another module I'm considering, Math::Solver.
"Weird things happen, get used to it."
Flame ~ Lead Programmer: GMS
 [reply] 
Re: Seeking Advice: Writing a parser by jryan (Vicar) on Feb 08, 2002 at 05:07 UTC 
Something like this needs a recursive parser. Aside from suggesting Parse::RecDescent or the example in Japhy's book, I'll give a few suggestions on how I would approach this.
First off, you are correct in storing it a treetype data structure. However, don't use a normal hash; I recommend Tie::IxHash to keep an ordered hash since you will need to reorder later.
Next, you will need to figure out a way of determining whether a term is top level, and separate those terms from each other. Simply splitting on /[+]/ wont be enough; you'll need to inchworm through in order and determine context. Next, you'll have to separate those terms which are /[*\/]/, and mark those as separate parts of one term. Finally, you'll need to deal with leftovers, such as exponentiation, function calls, etc.
Once you have the top level terms in the tree, you'll have descend into each node and repeat the process (here is where the recursion comes in.) You'll need to do it again, and again, and again, until everything is parsed to its lowest level. Finally, you'll have to work backwards from the bottom to perform the calculations.
Not as easy as you thought, was it? That's why there is CPAN :)
 [reply] [d/l] [select] 
Re: Seeking Advice: Writing a parser by jackdied (Monk) on Feb 08, 2002 at 05:09 UTC 
Update : jryan hit submit two seconds before I did, and with a more verbose post. read that
Parse::RecDescent is an easy to use little parser. It includes examples of describing small grammars. If that isn't enough, do a google search on 'BNF Grammars' and check out the university comp sci pages.  [reply] 
Re: Seeking Advice: Writing a parser by Zaxo (Archbishop) on Feb 08, 2002 at 05:14 UTC 
Math::Expr does exactly what you want. Release date is 20011001, so it's fairly up to date.
NAME
Math::Expr  Parses mathematical expressions
SYNOPSIS
use Math::Expr;
SetOppDB(new Math::Expr::OpperationDB('<DBFileName>'));
$e=Parse("a+4*bd/log(s)+f(d,e)");
DESCRIPTION
Parses mathematical expressions into a tree structure. The expressions
may contain integers, real numbers, alphanumeric variable names,
alphanumeric function names and most other characters might be used
as operators...
From the pod. That's no reason not to write your own as an exercise :)
After Compline, Zaxo
 [reply] 
Re: Seeking Advice: Writing a parser by BrentDax (Hermit) on Feb 08, 2002 at 09:18 UTC 
I highly recommend you use Parse::RecDescent. It has an <autotree> directive which, although it doesn't work exactly as you describe, builds a tree that's probably better for your purposes:
+
/ \
/ \
/ \
tan *
 / \
^ 2 sin
/ \ 
x 4 x
The advantage of this tree is that you can just do a postorder traversal with a number stack to run it. Of course, if you really like the tree design you describe above, you can also just provide the treebuilding actions yourself instead of using <autotree>.
=cut
Brent Dax
There is no sig.  [reply] [d/l] 
Re: Seeking Advice: Writing a parser by mattr (Curate) on Feb 08, 2002 at 15:30 UTC 
Well I can't access Math::Expr through search.cpan.org so full throttle ahead! But please call it Math::ParseEquation
instead.
Also I think I might have a different preference for the
way you order things in your tree. That said, perhaps the problem could be redefined by designing a converter from the above infix notation to ReversePolish (postfix) Notation? Some linx:
<a href=http://www.onthenet.com.au/~grahamis/int2008/week02/lect02.html#stackapp">1
2
3
Move SIG!  [reply] 

Math::ParseEquation was the name I originally planned to use, since no one has suggested I change it, I guess I'm sticking with it.
"Weird things happen, get used to it."
Flame ~ Lead Programmer: GMS
 [reply] 

Math::Expr at search.cpan.org. Maybe the site was down, or you typed it wrong.
Also I did
perl MCPAN e 'install Math::Expr'
and it found it and installed it...
 Snazzy tagline here
 [reply] [d/l] 

Forbidden
You don't have permission to access /search on this server.

+
Apache/1.3.22 Server at search.cpan.org Port 80
 [reply] [d/l] 
Re: Seeking Advice: Writing a parser by Dominus (Parson) on Feb 08, 2002 at 22:52 UTC 
One problem you will need to think about that I think nobody
else has pointed out yet is that mathematics, as conventionally
written, is ambiguous. For example, if you get an
expression like b(x+1), it may not be clear whether
the intent is to multiply the quantity b by x+1 or
to give x+1 as an argument to the function b.
Mathematicians use two things to determine which case holds:
One is information from the surrounding mathematical context;
if b has been previously described as being a function,
then you know the second interpretation is correct. Making use
of contextual details is tricky in a parser. The other thing
the mathematicians look at is subtle typographical
distinctions, which aren't available
to you here.
Computer languages get around this by requiring explicit
multiplication symbols: The expression
above becomes b*(x+1) if multiplication is
intended. Your example above indicates that you want implicit
multiplication, and that can be very tricky. If you see
something like y = mx, how will you know how to parse it?
Is mx one variable, or is it the product of m
and x? These are the sorts of side problems
that you'll have to solve to build a parser that works the way you want
it to.
You might want to consider first writing a parser
for a simpler and less ambiguous languagesay,
arithmetic expressions involving only numerals, with
explicit multiplication. Once you have some experience solving the simple
problem, you can go back and embellish it to handle more complex
expressions.

Mark Dominus
Perl Paraphernalia
 [reply] 

Just to clarify your first point;
for the benefit of those w/o
an education from the American public schools ;)
it's PEMDAS!
 Parantheses (I would imagine trig functions go here)
 Exponents (I would imagine logarythms go here)
 Multiplication, Division
 Addition, Subtraction
For the second part you can be smart.
Declare your parser as not lazy, then
if mx has not yet been seen but m and x
have there is implicit multiplication.
On the gripping hand say m, x, and mx
are defined. I'd opt for mx and force
explicit multiplcation for m and x.

perl pe "s/\b;([st])/'\1/mg"
 [reply] 

