mrbbking has asked for the wisdom of the Perl Monks concerning the following question:
I'm looking for real-life uses of Perl's study function.
The discussion of study here at the Monastery is largely the same as what's in the Camel Book, and the description leaves me wondering when and why one might use it.
Before I get too long-winded, do you use study? What for, and why?
Replies are listed 'Best First'.
Re: Why study a SCALAR?
by clintp (Curate) on Dec 26, 2001 at 08:59 UTC
Following the guidelines in the Camel, the online docs, and what I could grok of the source code, I kept coming up with seriously contrived uses. Okay, I did find a few uses that seemed to apply and didn't look so bad. So, as with any "optimization" scheme, I set up some benchmarks so that I could recommend it, or at least specify *exactly* where study benefits. By the time I had a broad list of examples where study was beneficial, and a list where it didn't help (or even hurt) performance, it would have taken several pages to explain *why* it works this way, and along the way I would have been dancing around implementation details of the language that I really didn't care to explain. Very, very small changes in the input data would cause large swings in the benchmark timings. I didn't want a huge checklist of cases and exceptions with disclaimers making the whole thing moot anyway. So I documented what I found (which is essentially what the Camel 3ed and the docs say) but with even broader warnings and more vigorous handwaving.
by melora (Scribe) on Nov 14, 2003 at 02:55 UTC
Re: Why study SCALAR?
by LunaticLeo (Scribe) on Dec 26, 2001 at 22:20 UTC
I have benchmarked both repeating the same regex on a scalar and running multiple regexes on the same scalar. I have never found a speedup. BTW, my benchmarks were like:
Basically, study() is an anachronism. Feel free to ignore it; everybody else does.
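For readers who want to try this kind of comparison themselves, here is a minimal sketch of a benchmark that runs several regexes against one scalar, with and without study. It is not LunaticLeo's code: the test string, the pattern list, and the helper name count_all are all made up for illustration. Note also that perlfunc documents study as a no-op since Perl 5.16, so on a modern perl the two timings should come out essentially the same.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Hypothetical test data: one scalar of a few kilobytes.
my $text = join ' ', ('the quick brown fox jumps over the lazy dog') x 200;

# Hypothetical set of patterns to run against the same scalar.
my @patterns = (qr/quick/, qr/fox/, qr/lazy/, qr/zebra/);

# Count every match of every pattern in the string.
# When $use_study is true, study is called once before any matching.
sub count_all {
    my ($string, $use_study) = @_;
    study $string if $use_study;
    my $total = 0;
    for my $re (@patterns) {
        my $n = () = $string =~ /$re/g;   # list-context match, then count
        $total += $n;
    }
    return $total;
}

# Time both variants; the iteration count is arbitrary.
timethese(500, {
    without_study => sub { count_all($text, 0) },
    with_study    => sub { count_all($text, 1) },
});
```

Both variants return the same count, so any difference in the report comes from the study call alone. Changing the patterns or the data may shift the numbers considerably, which matches what clintp describes above.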
Re: Why study SCALAR?
by atcroft (Abbot) on Dec 27, 2001 at 01:14 UTC
I considered using study() in a project at work, but could not find a sufficient gain in efficiency to justify it (the application I considered it for was a CGI searching for a limited amount of information). I did, however, test study() again out of curiosity after reading the replies by clintp and LunaticLeo. My testing consisted of searching for the word "lease" in a large file (a sample taken from a DHCP server's leases file, consisting of 787'811 lines / 22'741'219 characters, with the word occurring 69'474 times) using the code below. I wrote the results from the program to STDERR (to be able to filter them later), and tested 3 possibilities:
My results might have been different had my search string contained some characters rarer than others, and I am still learning to Benchmark effectively. The moral (I believe) is that if you think study() might prove helpful, Benchmark it and see, and remember, as always, YMMV.
Update: I stand corrected by the experience and knowledge of chipmunk. Thank you, chipmunk, for the correction to my understanding (or lack thereof).
Update: After considering chipmunk's correction, I have edited and retested the code to try to determine the effect of the study() statement. The new code is below, but I have left the code above as text for those who may learn from the correction, as I have. I used the same datafile as before. The new tests were:
Question: what effect could the caching in the Benchmark.pm module have on this code/results?
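A rough sketch of a line-by-line test along these lines follows. It is not atcroft's code and does not reproduce the three possibilities he tested: the synthetic lines stand in for a real leases file, and the helper name count_word is hypothetical. It compares searching each line for "lease" as-is against calling study on each line first.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical stand-in for a leases file: synthetic lines built in memory.
my @lines = map {
    ("lease 10.0.0.$_ {",
     "  starts 2001/12/26 08:59:00;",
     "  binding state active;",
     "}")
} 1 .. 250;

# Count how many times the word occurs across all lines.
# When $use_study is true, each line is studied before it is searched.
sub count_word {
    my ($lines, $use_study) = @_;
    my $total = 0;
    for my $line (@$lines) {
        study $line if $use_study;
        $total++ while $line =~ /lease/g;
    }
    return $total;
}

# Compare the two variants; the iteration count is arbitrary.
cmpthese(1000, {
    no_study       => sub { count_word(\@lines, 0) },
    study_per_line => sub { count_word(\@lines, 1) },
});
```

Here each line is matched against a single pattern after being studied, so whatever study costs is paid once per line with little chance to earn it back. On Perl 5.16 and later, where study is documented as a no-op, the two rows should be close to identical.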
by chipmunk (Parson) on Dec 27, 2001 at 02:52 UTC
Of course, study won't be a win if you're only going to perform a single match on the target string. And it turns out that it probably won't be a win even if you do a bunch of matches on the target string. The regular expression engine has had lots of optimizations added to it over time, making it pretty fast with or without the use of study.