Re^5: Given When Syntax

by LanX (Archbishop)
on Mar 17, 2014 at 15:22 UTC ( #1078616=note )


in reply to Re^4: Given When Syntax
in thread Given When Syntax

I'll post benchmarks after GPW.

Sorry, but ATM you are mixing different things and effectively comparing apples with plastic food.

BTW, it seems you are also "proving" that gotos to labels with code execution are faster than simple hash lookups (which would support using goto).

update

I'm not sure but I somehow remember benchmarks indicating that goto was not very efficiently implemented.

Since it has linear complexity, it could still be faster for a small number of cases than the constant overhead of hash lookups.

Cheers Rolf

( addicted to the Perl Programming Language)

Re^6: Given When Syntax
by Laurent_R (Canon) on Mar 17, 2014 at 18:07 UTC

    BTW, it seems you are also "proving" that gotos to labels with code execution are faster than simple hash lookups.

    The difference might come from something else, such as the use of a regex on the one hand and of substr on the other for extracting the first digit. I did not change them because I wanted to use the various solutions as posted, or as close as possible to the way they were posted. After all, the way to get the first digit is also part of the solution. But, of course, if one wants to compare only the process for finding the values associated with that digit, then the easiest approach is to remove the digit extraction part by passing a single integer to the function. That was not my goal when posting the benchmark above. I can certainly do that later when I get home, just not right now.
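A minimal sketch of how the two concerns could be timed separately, using the core Benchmark module. The lookup tables, the sample input and all sub names are assumptions made for illustration; the code actually benchmarked in this thread is not shown here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical data: the tables used in the thread are not reproduced here.
my %by_hash  = map { $_ => "value$_" } 0 .. 9;
my @by_array = map { "value$_" } 0 .. 9;
my $input    = '7abcdef';    # sample string starting with a digit

# Two ways to extract the first digit.
sub digit_regex  { my ($d) = $_[0] =~ /^(\d)/; return $d }
sub digit_substr { return substr $_[0], 0, 1 }

# Lookups that receive the digit directly, so extraction is not timed.
sub lookup_hash  { return $by_hash{ $_[0] } }
sub lookup_array { return $by_array[ $_[0] ] }

# Run each candidate for at least one CPU second and print a comparison table.
print "Extraction only:\n";
cmpthese(-1, {
    regex  => sub { digit_regex($input) },
    substr => sub { digit_substr($input) },
});

print "Lookup only:\n";
cmpthese(-1, {
    hash  => sub { lookup_hash(7) },
    array => sub { lookup_array(7) },
});
```

Keeping extraction and lookup in separate cmpthese runs makes it possible to see which of the two dominates the difference. The figures will vary with the machine and the Perl version.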

      There is more to do: switches on consecutive numbers are very unlikely, so array solutions don't prove anything, even if the example is so simplistic.

      Arbitrary keys in dispatch tables need hashes.

      Furthermore all my solutions are able to define a "default" case, like given/when does.

      Then goto's allow jumping to multiple labels leading to the same code; for this you need multiple entries in a dispatch hash.
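As a rough illustration of the last two points, here is a sketch of a dispatch hash in which several keys share one coderef and a fallback plays the role of a "default" case. The handler names, keys and return values are hypothetical and are not taken from the solutions posted in this thread.

```perl
use strict;
use warnings;

# Hypothetical handlers; names and return values are illustrative only.
my $low  = sub { return "low:$_[0]" };
my $high = sub { return "high:$_[0]" };

# Several keys can share one coderef, which is the dispatch-hash
# counterpart of several goto labels leading to the same code.
my %dispatch = (
    (map { $_ => $low }  qw(0 1 2)),
    (map { $_ => $high } qw(7 8 9)),
);

# Fallback used when the key is missing, similar to a given/when "default".
my $default = sub { return "other:$_[0]" };

sub dispatch {
    my ($key) = @_;
    my $handler = $dispatch{$key} // $default;    # // needs Perl 5.10+
    return $handler->($key);
}

print dispatch($_), "\n" for qw(1 8 5 x);
```

The map calls keep the "multiple entries" short to write, but each key still occupies its own slot in the hash.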

      And it should be clear (but mostly ignored/assumed in this thread) that given/when's linear testing of conditions is more flexible than switching to code by literal keys. (C-style I suppose)

      The latter is an important but limited sub-case, which is ideally optimized under the hood, but measuring both side by side is like comparing apples and oranges.
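To make the distinction concrete, here is a sketch of linear condition testing written without given/when: an ordered list of predicate/action pairs, tried one after the other. The rules themselves are invented for illustration. Conditions such as numeric ranges or regex matches cannot be expressed as literal hash keys, which is where the extra flexibility comes from, at the price of a linear scan.

```perl
use strict;
use warnings;

# Each rule pairs a predicate with an action. Rules are tried in order,
# as an if/elsif chain would do. All rules here are made up.
my @rules = (
    [ sub { $_[0] =~ /^\d+$/ && $_[0] < 10 }, sub { "small number" } ],
    [ sub { $_[0] =~ /^\d+$/ },               sub { "large number" } ],
    [ sub { $_[0] =~ /^[a-z]/i },             sub { "word" } ],
);

sub classify {
    my ($value) = @_;
    for my $rule (@rules) {
        my ($test, $action) = @$rule;
        return $action->($value) if $test->($value);
    }
    return "default";    # no rule matched
}

print "$_ => ", classify($_), "\n" for qw(7 42 perl), '?';
```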

      > I did not change them because I wanted to use the various solutions as posted,

      yes but you made (rather bold) statements about the execution time of "sub-calls" by benchmarking these arbitrary snippets.

      edit

      Better you concentrate on this point and try variations on the number of cases.

      I hope it's evident that "clean" benchmarks are not trivial.

      Cheers Rolf

      ( addicted to the Perl Programming Language)

        There is more to do: switches on consecutive numbers are very unlikely, so array solutions don't prove anything, even if the example is so simplistic.

        No, switches on consecutive numbers are certainly less common, but not unlikely. The OP describes a situation where the switch is done on the first digit of a string always starting with a digit. The values can only be 0 to 9. I have seen many similar cases where decisions have to be made on the basis of two (or three) specific digits of a telephone number or an IMSI number, or the first byte of an IP address, or the length of words in a phrase, or some sequentially ordered items, etc. This is far from uncommon, and there is no reason not to use an array when that occurs.

        Arbitrary keys in dispatch tables need hashes.

        Yes, and I said in an earlier post in this thread that dispatch tables are usually hashes whose values are coderefs. But if the keys are not arbitrary values but consecutive numbers, as in the case in point, why not think outside the box and use an array if that is likely to be more efficient (although the difference is only marginal)?

        Furthermore all my solutions are able to define a "default" case, like given/when does.

        Mine return undef, a perfectly admissible default case.
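For readers who want to see the shape of such a solution, here is a sketch of an array-based dispatch table indexed by the first digit, with undef as the default. The handlers and the process sub are hypothetical stand-ins, not the code that was benchmarked.

```perl
use strict;
use warnings;

# Hypothetical handlers indexed by digit 0..9; real ones would do actual work.
my @dispatch = map {
    my $d = $_;
    sub { return "handler $d got $_[0]" }
} 0 .. 9;

sub process {
    my ($string) = @_;
    my $first = substr $string, 0, 1;
    return unless $first =~ /^[0-9]$/;    # not a digit: undef is the default
    return $dispatch[$first]->($string);
}

for my $input (qw(3abc 9xyz nodigit)) {
    my $result = process($input);
    print "$input: ", defined $result ? $result : 'undef', "\n";
}
```

The digit check is only needed if the input may not start with a digit. The OP's data always does, in which case the array lookup alone would be enough.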

        Then goto's allow jumping to multiple labels leading to the same code; for this you need multiple entries in a dispatch hash.

        First, I am only very mildly interested in discussing goto solutions. Second, this has nothing to do with the problem outlined in the OP. Third, the goto solution is not very efficient. Yet, granted, there may be cases where it might simplify the code.

        And it should be clear (but mostly ignored/assumed in this thread) that given/when's linear testing of conditions is more flexible than switching to code by literal keys. (C-style I suppose)

        Yes, that is absolutely true, but off-topic. The given/when solution is clearly deprecated (well, "highly experimental" is the official expression) at this point.

        The latter is an important but limited sub-case, which is ideally optimized under the hood, but measuring both side by side is like comparing apples and oranges.

        Not sure I understand what you mean, but, again, I only compared solutions as they were offered by their original posters, nothing more.

        You made (rather bold) statements about the execution time of "sub-calls" by benchmarking these arbitrary snippets.

        Methinks that I clearly stated that there was a penalty in using dispatch tables. I only said that the penalty was not as large as most people say, especially when the subs are really doing some work. The dispatch table option I used in my benchmark was able to process more than 1.5 million requests per second on my poor laptop.

        I made some timings a few months ago on a program that had to split input data between about 8 or 9 output files, based on somewhat complex conditions on the input data. The input data was about 30 million lines and the execution time about 13 minutes. I tried two different approaches: a traditional C-style approach, and a heavily HOP-oriented approach (with dispatch tables, function factories, etc.). I did not benchmark them properly, because it would have taken ages, but I timed them several times. The largest execution time difference that I found was, if I remember correctly, 16 seconds, and the average difference about 10 seconds. Does that matter? I don't think so when we are talking about more than 10 minutes of execution time. One program was 300 lines, the other just 70. Guess which was which?

        I hope it's evident that "clean" benchmarks are not trivial.

        Thank you for the lesson, it looks like I did not know, sorry. As I said, I used various people's solutions as they were written, including yours. Your goto solution was faster than your hash solution, probably because one used the substr function and the other a regex. These are YOUR solutions as you wrote them, so please don't blame me for that.

        Having said all that, I'll soon post a new benchmark in which I am trying to make things more comparable. It will shed a different light on this.
