Re^4: Performance, Abstraction and HOP

Wouldn't it be better if it had something in it that you hadn't already thought of?

No, I figure you are smarter than I am, and I know you are a better writer, so I figured if you covered the subject of dynamically generated code it would be well worth reading. :-)

But it seems to me that you cannot reasonably hold the opinion that Perl is worth using for nontrivial programs and also that Perl's function calls are too slow to use.

Of course that would be an unreasonable position. And it's not the position I hold. The position that I hold is that there are some tasks where Perl's function call overhead starts to become unreasonable, and thus for those tasks I do what I can to avoid function calls. Eliminating them entirely is usually not feasible, but with a bit of caching (not a la Memoize.pm, which is too slow -- yes, I've benchmarked it) you can often minimize their impact.
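To illustrate the kind of caching meant here, a minimal sketch: the cache is a plain hash consulted inline in the loop, so a cache hit costs a hash lookup and no function call at all. Memoize, by contrast, wraps the function, so even a hit still pays for a call. The function expensive_lookup and the sample keys are hypothetical stand-ins, not anything from the real system.

```perl
use strict;
use warnings;

my $calls = 0;

# Hypothetical stand-in for a costly function we want to call as rarely as possible.
sub expensive_lookup {
    my ($key) = @_;
    $calls++;
    return uc $key;
}

my %cache;
my @out;
for my $key (qw(a b a c b a)) {
    # Inline hash check: the function is only called on a cache miss.
    my $val = exists $cache{$key}
        ? $cache{$key}
        : ($cache{$key} = expensive_lookup($key));
    push @out, $val;
}

print "calls=$calls out=@out\n";    # prints "calls=3 out=A B A C B A"
```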

I can only imagine that what you are doing does not really make sense, or that it is such a weird special situation that it has little applicability to general discussions like this one.

Well, I'm talking about processing tens to hundreds of millions of records per program run. For instance, every day a program I have has to apply a set of rules stipulated in an ini file to all of the records that we received the previous day. Exactly which rules apply to a record is determined by criteria like the type of file being processed.

A naive implementation of code like this would have a single main loop and then a lot of conditional logic, checking to see which rules apply. In addition it would probably have the rule handling logic factored out into method calls and subroutines. The naive implementation will take a big speed hit just for the method calls.
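For concreteness, a rough sketch of what such a naive version might look like. The class Rule::MinLength, its applies_to and check methods, and the file types are all made up for illustration. The point is that every record pays for two method calls per rule, and the "does this rule apply?" question is re-answered for each record even though the answer never changes within a run.

```perl
use strict;
use warnings;

{
    # Hypothetical rule class; names and fields are illustrative only.
    package Rule::MinLength;
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub applies_to { my ($self, $file_type) = @_; return $self->{types}{$file_type} }
    sub check { my ($self, $rec) = @_; return length($rec) >= $self->{min} }
}

sub process_naive {
    my ($records, $file_type, @rules) = @_;
    my %stats = (accepted => 0, rejected => 0);
    RECORD: for my $rec (@$records) {
        for my $rule (@rules) {
            # Method call per rule per record, just to decide applicability.
            next unless $rule->applies_to($file_type);
            # Second method call per rule per record to do the actual check.
            unless ($rule->check($rec)) { $stats{rejected}++; next RECORD; }
        }
        $stats{accepted}++;
    }
    return \%stats;
}

my @rules = (Rule::MinLength->new(min => 3, types => { csv => 1 }));
my $csv = process_naive([qw(abc ab abcd)], 'csv', @rules);
my $xml = process_naive([qw(abc ab abcd)], 'xml', @rules);
print "csv: accepted=$csv->{accepted} rejected=$csv->{rejected}\n";    # prints "csv: accepted=2 rejected=1"
print "xml: accepted=$xml->{accepted} rejected=$xml->{rejected}\n";    # prints "xml: accepted=3 rejected=0"
```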

When I realized how large the speed hit was from the method calls, I started looking into how I could unroll them and dynamically generate the loop, so that once it started all lookups were statically resolved. Thus each method knows how to represent itself as a snippet. Each rule object knows how to represent itself as an if statement, etc. The result is a considerably faster record processing engine. IOW, instead of writing a generic record processing tool, I wrote an engine that would produce a tailored record processing tool that would then do the processing.
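A stripped-down sketch of the idea, not the production code: the rule classes, the as_snippet method, build_processor, and the sample records are all hypothetical names chosen for illustration. Each rule renders itself as an if statement in Perl source, the snippets are pasted into one loop, and the loop is compiled once with a string eval. After that, the per-record path contains no method calls at all.

```perl
use strict;
use warnings;

{
    # Hypothetical rule: reject records shorter than a minimum length.
    package Rule::MinLength;
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub as_snippet {
        my ($self) = @_;
        my $min = int $self->{min};    # only a plain integer is interpolated
        return "if (length(\$rec) < $min) { \$stats->{rejected}++; next RECORD; }\n";
    }
}

{
    # Hypothetical rule: reject records that lack a given prefix.
    package Rule::StartsWith;
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub as_snippet {
        my ($self) = @_;
        my $prefix = quotemeta $self->{prefix};    # escape regex metacharacters
        return "if (\$rec !~ /^$prefix/) { \$stats->{rejected}++; next RECORD; }\n";
    }
}

# Assemble the snippets into the source of one loop and compile it once.
sub build_processor {
    my (@rules) = @_;
    my $body = join '', map { '        ' . $_->as_snippet } @rules;
    my $src = <<"END_SRC";
sub {
    my (\$records, \$stats) = \@_;
    RECORD: for my \$rec (\@\$records) {
$body        \$stats->{accepted}++;
    }
    return \$stats;
}
END_SRC
    my $code = eval $src or die "generated code failed to compile: $@\n$src";
    return $code;
}

my $process = build_processor(
    Rule::MinLength->new(min => 3),
    Rule::StartsWith->new(prefix => 'A'),
);
my $stats = $process->([qw(Apple Ax Banana Avocado)], { accepted => 0, rejected => 0 });
print "accepted=$stats->{accepted} rejected=$stats->{rejected}\n";    # prints "accepted=2 rejected=2"
```

Deciding which rules apply (by file type and so on) would happen before build_processor is called, so that decision is made once per run rather than once per record.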

I've used this technique quite successfully a number of times, and normally I find that it provides a perfect compromise. In most parts of the code, function call overhead is negligible. DB calls, file operations across the network, etc. are the bottlenecks, not the function call overhead. But inside of a tight loop that has to process millions of records per day, doing what I can to avoid unnecessary method and function calls has proved to be a powerful optimization technique.

The other, as you said, is to be a problem-solving engineer, to recognize that you are in a hopeless situation, and to obtain a tool that is not hopelessly defective. I have recently been informed that Java's function calls are faster than Perl's; perhaps that would be a better choice.

Well, personally I never considered changing language, just squeezing more performance out of the one I have. If it seemed to me that I could not reach acceptable performance levels with Perl regardless of the techniques I used, then I probably would have done so. But the truth is that Perl is pretty fast at a lot of things, and is flexible too. I am usually able to avoid method and function calls fairly easily when I need to, so I don't worry about it too much.

And actually it's amusing that you say to use Java, as that's one of the few languages that I actually haven't seen used for any serious systems in my field. Apparently it's just too slow. :-)

---
$world=~s/war/peace/g
