Prevent "Out of Memory" error
by tford (Beadle) on Nov 23, 2009 at 02:20 UTC
tford has asked for the wisdom of the Perl Monks concerning the following question:
Hello all, I have written a "chart parser" for math notation, entirely in Perl.
It works by building tree structures corresponding to every valid substring of the input, and storing them in memory. Then later on, it may combine some of the subtrees to make larger trees, which also need to be stored.
The problem is that a malicious user (or just a very ignorant one) can enter inputs that are so ambiguous that they will cause a great many trees to be built during the course of the parse.
It is possible to run the parser out of memory this way, and indeed it has already happened.
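To make the blow-up concrete, here is a minimal sketch of this kind of chart. It is not the actual parser: the `build_chart` function, the `"start,end"` hash keys, and the toy rule that any two adjacent subtrees may combine are all assumptions chosen to make the input as ambiguous as possible.

```perl
use strict;
use warnings;

# Build a chart for a list of tokens. $chart{"$i,$j"} holds every tree
# covering tokens $i .. $j-1. The toy rule used here (any two adjacent
# subtrees may combine) is hypothetical and deliberately ambiguous.
sub build_chart {
    my @tokens = @_;
    my $n = @tokens;
    my %chart;

    # Spans of length 1: one leaf per token.
    $chart{ "$_," . ($_ + 1) } = [ $tokens[$_] ] for 0 .. $n - 1;

    # Longer spans: try every split point and combine every pair of
    # subtrees found on either side of it.
    for my $len (2 .. $n) {
        for my $i (0 .. $n - $len) {
            my $j = $i + $len;
            my @trees;
            for my $k ($i + 1 .. $j - 1) {
                for my $left (@{ $chart{"$i,$k"} }) {
                    for my $right (@{ $chart{"$k,$j"} }) {
                        push @trees, [ $left, $right ];
                    }
                }
            }
            $chart{"$i,$j"} = \@trees;
        }
    }
    return \%chart;
}

my $chart = build_chart(qw(a b c d e));
my $full  = scalar @{ $chart->{"0,5"} };
print "trees covering the whole input: $full\n";   # prints 14
```

With this rule the number of trees over the full input follows the Catalan numbers (1, 1, 2, 5, 14, 42, ...), so a few dozen tokens is already far more than can be stored, and every one of those trees is kept alive by the chart.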
Another thing that makes this a difficult problem is that the "Out of Memory" error can occur at different places in the code. Sometimes it happens when the parser tries to construct a new "Node" object. Other times, if memory is already running low, it happens when the parser is just adding an element to a pre-existing hash.
My question is, "Is there a way to monitor the free memory available from within the program?"
I've looked a little bit at Devel::Peek, and it seems promising, but I figured I'd check with you guys first. Surely someone else has run across this same issue!
Barring that, I'm thinking I may have to write my own versions of each basic data structure (array, hash, node) in C, as XSUBs. Each one could then have a "safety valve" that uses malloc to check whether there is actually enough memory to create or extend the given data. If not, the function could return an error value that stops the parse and sends a message to the user.
One thing I've already tried is simply putting a limit on the number of "Node" objects that can be created during parsing, but this is not a real solution: the amount of free memory available to the process apparently varies with other factors, so no single fixed limit is both safe and generous.
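For reference, a minimal sketch of such a node cap. Everything here is an assumption made for illustration rather than the parser's real code: the `Node` constructor, the `$MAX_NODES` value, the `"NODE_LIMIT\n"` exception string, and the `parse_with_limit` wrapper.

```perl
use strict;
use warnings;

package Node;

our $MAX_NODES  = 100_000;  # hypothetical cap, chosen arbitrarily
our $node_count = 0;

sub new {
    my ($class, %args) = @_;
    # Refuse to allocate once the budget for this parse is spent.
    die "NODE_LIMIT\n" if $node_count >= $MAX_NODES;
    $node_count++;
    my $self = { %args };
    return bless $self, $class;
}

package main;

# Run a parsing routine (a code ref) and turn a blown node budget into
# an ordinary error message instead of letting the process keep growing.
sub parse_with_limit {
    my ($parse) = @_;
    local $Node::node_count = 0;    # fresh budget for each parse
    my $result;
    my $ok = eval { $result = $parse->(); 1 };
    return ($result, undef) if $ok;
    return (undef, 'input is too ambiguous to parse') if $@ eq "NODE_LIMIT\n";
    die $@;                         # anything else is a real bug
}

my ($tree, $error) = parse_with_limit(sub {
    my @nodes = map { Node->new(id => $_) } 1 .. 10;
    return \@nodes;
});
print defined $error ? "error: $error\n" : "built " . @$tree . " nodes\n";
```

The cap counts objects, not bytes, so it says nothing about how large each node is or how much memory the process can actually get at that moment, which is exactly the weakness described above.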
Any help will be greatly appreciated!