Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

Meditations

( #480=superdoc: print w/ replies, xml ) Need Help??

If you've discovered something amazing about Perl that you just need to share with everyone, this is the right place.

This section is also used for non-question discussions about Perl, and for any discussions that are not specifically programming related. For example, if you want to share or discuss opinions on hacker culture, the job market, or Perl 6 development, this is the place. (Note, however, that discussions about the PerlMonks web site belong in PerlMonks Discussion.)

Meditations is sometimes used as a sounding-board — a place to post initial drafts of perl tutorials, code modules, book reviews, articles, quizzes, etc. — so that the author can benefit from the collective insight of the monks before publishing the finished item to its proper place (be it Tutorials, Cool Uses for Perl, Reviews, or whatever). If you do this, it is generally considered appropriate to prefix your node title with "RFC:" (for "request for comments").

User Meditations
[RFC: Cool Uses for Perl] A weighted randomized word generator for conlangers
1 direct reply — Read more / Contribute
by flowdy
on Jun 20, 2014 at 18:32
    Hi,

    A decade or so ago, I was a conlanger. I liked to construct a language spoken by a virtual people I didn't however, in contrary to J.R.R. Tolkien, shape regarding their culture. Obviously I had a plenty of mind capacity for such things, and they called me quixotic. No more. Now I prefer programing and hobbies and a real life, too. I remember that the grammar part was in favour compared to developing a comprehensive lexicon so the language is talkable. Inflection and word order and features I liked in other languages however were far more interesting. The best part about it I found were its 35 cases, actually two case systems interacting like cogwheels.

    I just programmed a word generator as a kind of tribute to that young nerd's passion. It respects how words can be shaped in a language and how not. In English for instance there is no word like quirge, but it still seems more English than, er, dampfschiff which is by the way, believe it or not, actually a word in a spoken language. These unwritten principles are characteristic for every language. Despite of their being unwritten, you can approximate these principles by probabilistic shares and the rand() of perl.

    Even if you are not interested in conlanging, you might want to study this example how curried closures can lead to efficient code, i.e. code of which I am quite sure it's efficient, until someone will prove me wrong. For those who do not know yet: Closures are anonymous subroutines that cling, for their own life-time, to any used lexical variables which would have been garbage-collected otherwise on going out of their respective scopes. Higher Order is when you pass these subroutine references to other functions which call them back when appropriate. Learned that once from the High Order Perl book by Mark J. Dominus (available online but if you can effort it, consider buying it).

    Update: Corrected name of the Higher Order Perl author. Further confused Currying and Higher Order. Will have to re-read myself the very book I cited.

    -- flowdy

Perl Job Marketability Question - very important for me!
6 direct replies — Read more / Contribute
by o2bwise
on Jun 13, 2014 at 17:55

    Hi,

    I got laid off a couple of months ago and am in the midst of the whole job search process. With that in mind, one thing I am wanting to have a good idea of is my marketability.

    I have been programming in Perl for close to 15 years now, but I think my use of the language has been pretty limited in scope.

    Now, my last job included an application called CA eHealth. I did a ton of Perl programming involving this application. (In fact, I largely lost my job due to my employer’s decision to replace this application.)

    There is one particular kind of scripting that I just love, but I am frankly unsure if it is sufficiently marketable. What I love doing is having Perl read data from a number of sources (databases, flat files) and produce desired data. I just LOVE that kind of coding. I love the whole portion in between those two points. Data aggregation, coming up with one or more nested hash data structures, looping through the hashes and writing out data as needed. Sometimes needing to do math and/or stats as part of the data aggregation. Mining through lines of text and db queries, and so on.

    My question is, is just the above thing I love to do marketable? I honestly do not know, but I sure hope so.


    Tony

Bring back the smartmatch operator (but with sane semantics this time)!
6 direct replies — Read more / Contribute
by smls
on Jun 10, 2014 at 12:52

    No I'm not kidding, please hear me out... :)

    History of smartmatch in Perl

    Smartmatching was invented for Perl 6 where it turned out to be a very useful and well-loved1 feature, but the attempt to backport it to Perl 5.10 in 2009 did not turn out so great (and it was consequently deprecated again in Perl 5.18). Among the Perl 6 community the commonly accepted explanation for that is, from what I heard:

    • Perl lacks a strong & fine-grained type system (which in Perl 6, where it exists, adds much sanity to the concept of dynamic dispatch)
    • Perl lacks composable related features like Junctions (which allow the Perl 6 smartmatching rules to be simpler and less arbitrary than the rules that would be needed to facilitate the same set of use-cases in Perl)

    These limitations are hard to circumvent, but I don't think that means Perl should have no smartmatching at all, it just means it should have less ambitious / more focused smartmatching.

    I wasn't around at the time, but it looks to me as if the Perl 5.10+ smartmatching was designed with these goals:

    1. Support all use-cases that the Perl 6 smartmatch operator supports
    2. Use it as an opportunity to sneak in useful new comparison/searching operations into the core, without having to invent separate operator names for them, and without having to justify them individually

    ...and was thus doomed to failure.

    Some later proposals for re-designing the smartmatch operator (like this 2011 post by brian d foy) tend to avoid mistake no. 1, but still fall into the second trap.

    If there are comparison/searching operations that are deemed worthy of being added to Perl (say, "deep comparison" of two arrays, or checking whether an array contains a given scalar), then each of them should get its own operator. That's the normal Perl way: One operator per type of operation (that's why we have both == and eq for example).

    How smartmatch should be designed

    Smartmatching explicitly breaks with the conventional "one meaning per operator" rule by dynamically deciding what operation to perform based on its arguments. This means it should be carefully designed around use-cases where you actually need to dynamically decide what operation to perform. Operations that you would likely never want to mix-and-match, have no business being part of the smartmatch operator, even if they would be useful to have in core by themselves.

    So, what are those use-cases where you actually need dynamic smartmatching? I can think of two major ones:

    1. When you want to avoid writing out  ($_ <operator> ...)  in a given/when construct, for the purpose of brevity/elegance:

      use v6; given $username { when 'root' { dostuff } when /^guest\d*$/ { die "You're not allowed to do stuff." } when any(<http apache>) { authenticate :web; dostuff } default { authenticate :local; dostuff } }

      Of course it is only elegant when the meaning is self-evident without consulting a manual, so this use-case only makes sense for commonly used & unambiguous comparison operations.

    2. When you want your code to test things against a "filter/pattern/rule" that is passed in from the outside, and you don't want to restrict it to just one way of filtering (e.g. only by string comparison, or only by regex, or only by callback etc.)

      For example, consider the Perl 6 built-in function dir, which lists the contents of a directory in the filesystem. It takes an optional 'test' argument, against which it promises to smartmatch each filename and only return the matching ones. Since smartmatch is built into the language, Perl 6 programmers need no further documentation to understand that parameter; they know they can use anything that would be valid as the right-hand-side argument of ~~ as the test, for example:

      use v6; dir '/some/directory', test => /\.txt$/; # a regex dir '/some/directory', test => none('.', '..'); # a junction² dir '/some/directory', test => &validate_filename; # a coderef

      The result is a very flexible but still elegant and predictable API that is easy to imitate in your own functions/modules that want to allow their users to "match" or filter stuff: Just use smartmatch as your filter implementation!

    We can make new Perl 5 smartmatching rules useful for those use-cases, while still keeping them sane and predictable, by adhering to the these two principles:

    1. Decide what operation to perform, based on the type of the right-hand-side argument (and nothing else!)
      (Put another way, this means that  LHS ~~ RHS  can always be expressed in words as the question "Does LHS fit the constraint/template defined by RHS?")

    2. Blindly coerce the left-hand-side argument to the type that the chosen operation requires, just as normal Perl operators like eq also coerce their arguments.
      (So, for example, @foo ~~ /foo/ would be the same as @foo =~ /foo/, even though that may not be useful, rather than doing anything special just because it's an array!)

    Sensible smartmatch rules

    With that in mind, we can start to think about the kind of right-hand-side "things" that it should be possible to smartmatch against.

    The following are no-brainers imo:

    if RHS is an... (example) then  LHS ~~ RHS  should do...
    undefined scalar $x ~~ undef !defined(LHS)
    simple scalar $x ~~ 'foo' LHS eq RHS
    regex (literal or reference) $x ~~ /foo/ LHS =~ RHS
    code reference $x ~~ sub { ... } RHS->(LHS)
    an object that overloads ~~ $x ~~ $object call the overload method, with LHS as argument

    The 'simple scalar' case is not as elegant as one might wish it to be; Ideally it would be able to dynamically decide between string or numeric comparison like it does in Perl 6, but I don't think that is possible to do safely in Perl (its type system being what it is), so we need to take what we can get.

    The following two rules also tend to be pretty useful in Perl 6, and it might make sense to add them to our hypothetical new Perl smartmatch, but I'm unsure about them because range literals and typename barewords are not usually treated as first-class "things" in Perl, so it might feel strange:

    if RHS is a... (example) then  LHS ~~ RHS  should do...
    bareword $node ~~ XML::LibXML::Node ref(LHS) eq "RHS"
    range literal $age ~~ 0..17 interpret LHS as a number, and check if is within the range

    Lastly, the lack of junctions in Perl could be partially remedied by interpreting an array/list on the right-hand-side like an any() junction:

    if RHS is an... (example) then  LHS ~~ RHS  should do...
    array or list $switch ~~ qw(yes true on 1) (grep { LHS ~~ $_ } RHS) >= 1

    Of course, a better solution would be to add junctions to Perl together with re-adding smartmatch... :)
    (Perl6::Junctions already exists on CPAN, but it relies on at least one awful hack due to the fact that it is non-core).

    Anyway, the above rules would be more or less a subset of both Perl 6 smartmatching and the deprecated Perl 5.10+ smartmatching, but without the craziness of the latter.

    And that's it; All cases not handled by these rules should generate a runtime error.
    I don't think any other special cases need to be added - in particular, all the arbitrary behaviors that Perl 5.10+ smartmatching added for when one or both arguments were arrays/hashes, only served to confuse people and made the operator "not safe to use" in practice. Let's not repeat that mistake.

    PS: In case you want to get a "feel" for what this kind of smart-matching is like in practice, check out Toby Inkster's match::simple module which implements very similar rules to what is discussed here (but suffers from some unavoidable limitations due to the fact that it is not in core).

    ---

    1) Among the small but passionate fan base of Perl 6 :)
    2) This particular junction is in fact used as the default when the 'test' argument is omitted.
RFC - early draft of Menu::Simple
2 direct replies — Read more / Contribute
by RonW
on Jun 06, 2014 at 13:13

    A while back, to further simplify making menus for occasional Tk driven Perl tool, I started Menu::Simple as an alternative to the -menuitems option in Tk::Menu::Items. It has significantly helped others to add their own Tk GUIs to Perl tools. (None of us here do much GUI coding, so anything that makes it easier helps us.)

    Recently, while adding a Tk GUI to yet another tool I created using Perl, it occurred to me that Menu::Simple might be worth sharing.

    It still needs more documentation, but I do have a simple example that shows the basic usage. (It also needs some clean up. Over time, I have changed and enhanced it. Just haven't thought about sharing outside my team at work.)

    So, here is the module and usage example. (Warning: It currently uses the 'current_sub' feature, so at least Perl 5.16 is needed.)

    Update: Determined that Perl 5.16 is currently needed. Will try to make it work with earlier versions.

    Update 2: After clean-ups, of course, would this be worth submitting to CPAN? Maybe change the name to Tk::Menu::Simple?

    Update 3: Removed requirement for Perl 5.16, but not sure how old a Perl it will work with.

    Update 4: Added a draft of documentation. Also, changed name to Menu::Builder. (I am planning to contact the maintainer of the Tk namespace on CPAN about having this included as Tk::Menu::Builder.)

at continue, last
7 direct replies — Read more / Contribute
by djerius
on Jun 06, 2014 at 00:53
    Update The imprecise nature of the language in this post has muddied the waters. The point of the post was to enumerate some ways of ensuring a single path for cleanup from a conditional clause, not all of them practical. My thanks to the respondents who persevered in spite of the muddiness of the water.

    On to the original post...

    I need to ensure that a variable is decremented regardless of how an if clause is exited, which might be in the middle of the block. I can code it this way:

    if ( $condition1 ) {{ ... --$loop, last if $condition2; #bail! ... --$loop; }}
    The repeated decrements are not DRY. This one is better:
    if ( $condition1 ) { { ... last if $condition2; ... } --$loop; }
    Along the way I stumbled upon these monstrosities:
    for ( ; $condition1 ; $loop--, last ) { ... next if $condition2; ... }
    Even worse:
    while ( $condition1 ) { ... next if $condition2; ... } continue { --$loop; last; }
    A last in a continue? Who knew?
Tiny Perl puzzle
4 direct replies — Read more / Contribute
by Dominus
on Jun 05, 2014 at 17:16
    Without testing first, guess what this program will print:
    print (two + two == five ? "true" : "false")
    Then figure out why you were wrong. (Please mark spoilers accordingly.)
Perl::Minimal -- the good, bad, and the ugly...
7 direct replies — Read more / Contribute
by taint
on May 29, 2014 at 21:18
    Hello, Monks.

    I've seen quite a few questions in SOPW related to creating, or working with Perl installs of Reduced size. The thought occurred to me, that maybe there was some potential to something in the Minimal::Perl | Perl::Minimal namespace on the CPAN.

    But maybe not. I can see where there is great diversity needed to satisfy the needs of every-wants-a-smaller-version-of-perl project.

    Maybe so. But maybe there is some "common ground". Some RE of "features" that would only require some small subset of Modules to complete the scheme for any given project.
    In other words; some small subset of Modules only needed to be added to the Minimal-Perl.

    Anyway. I thought this topic had merit, and thought it'd be wonderful to get some feedback on the concept, from everyone.

    Thanks for taking the time.

    --Chris

    UPDATE: added some clarity.

    ¡λɐp ʇɑəɹ⅁ ɐ əʌɐɥ puɐ ʻꜱdləɥ ꜱᴉɥʇ ədoH

Thought(s) about SOP questions...
6 direct replies — Read more / Contribute
by wjw
on May 23, 2014 at 13:34

    SOP on PM: a basic template for successful questions


    Purpose:

    To provide a 'thought' template (or form/checklist) that `concisely' guides a SOPW participant through preparing to post a question on said SOPW.

    How do I post a question effectively? is one of the places to really dig in to writing a good post. This hopefully serves the purpose of getting you to think about all that good advice.


    What I am working on(brief overview)
    • Platform
    • OS
    • Integrating with...(excel, libreoffice...etc...)
    • Perl Version
    • Using (module::name::?) if appropriate
    • What I want for output
    • What I am using for input
    • Combine this with 'What the problem is' below and you will have a good start to answering How do I compose an effective node title?
    What the problem is
    • First, ensure this is not XY Problem
    • Post relevant code (use code tags, 'code' or 'code')
    • Keep to relevant portions.
    • Make sure it compiles (or would if complete, copy exactly what you have(not data though))
    What you have tried
    • If you provide some of the above, it will prove you did something and are not looking for free coding service.
    Errors you're are getting ... if you already tried the following: (helpfully suggested by LanX)
    • use warnings;
    • use strict;
    • use diagnostics;
    • Then:
    • Copy/paste the errors in code tags with the rest of your post.
    My input
    • Include __DATA__ block or at least a few lines of the input file(also in code tags, 'code' or 'c')
    • If it is a known format like JSON or TNNAME.ORA, mention that.
    • Clean it for security purposes, but don't screw it up! Original size, character counts etc... are important!
    What it is you don't understand about solving the problem
    • I don't know how to format...
    • I don't understand the data structure...
    • I don't understand this code at line...
    • ...

    Further excellent suggestions regarding the subject matter from other PM's:


    If you manage to get at least one clear, concise statement about what you are trying to solve from each those first three main categories, you are likely to get pretty good, pretty quick help. If you can't do that, go back and ask yourself those questions in the first three categories, get them clear in your head, then write them. The clarity of your question might lead you to your own answer, and if not, it will surely help those attempting to assist you...

    No one is going to beat you up for not knowing how to state every question above. Hell, my posts have sadly not even followed this format! They will from now on! But if you give it half a shot your results are going to be a lot more satisfying, and the questions you get about your question for clarification will help you better define your question, and therefore your solution. Really, this is all covered in [id://174051>, but here is some of it again...

    Hope that is helpful...


    A solution is simply a well stated problem...otherwise the problem is not a problem, it is a fact...

      Updates:(newest first)
    • Added a few select links
    • Clean up, re-order in more logical step by step sequence base on other successful SOP postings(based on my opinion
    • Added link to suggestion, Added 'Purpose Statement'
    • Clean up, clarify Title
    • Added reference to XY problem
    • Added this update list
    • Added suggestions from subsequent posters
      ToDo
    • Provide example of Excellent Question
    • Provide example of Marginal Question
    • Provide example of Poor Question

    ...the majority is always wrong, and always the last to know about it...

    Insanity: Doing the same thing over and over again and expecting different results...

RFC: config::simple vs config::ini
3 direct replies — Read more / Contribute
by thanos1983
on May 20, 2014 at 19:23

    Dear all,

    Recently I discovered, thanks to people of this forum the use of Benchmark. A great powerful tool comparing speeds, processes and times.

    Well since I love playing around with details I could not resist to make this make not useful test and compare the Config::IniFiles Vs Config::Simple.I decided to run an experiment and observe the output.

    Update:

    Thanks to Anonymous Monk and davido for their contributions and suggestions, I have modified my experiment and came up with new data.

    The experimental code:

    #!/usr/bin/perl use strict; use warnings; use List::Compare; use Config::Simple; use Config::IniFiles; use Fcntl qw(:flock); use Benchmark qw(:all) ; use Data::Dumper qw(Dumper); $|=1; #flush every time the program =flock sub LOCK_SH { 1 } ## shared lock sub LOCK_EX { 2 } ## exclusive lock sub LOCK_NB { 4 } ## non-blocking sub LOCK_UN { 8 } ## unlock =cut my %information; my $path = 'conf.ini'; my $count = -5 || die "Need a count!\n"; sub complex { open my $fh , '<' , "".$path."" or die "Could not open file: ".$path." - $!\n"; flock($fh, LOCK_SH) or die "Could not lock '".$fh."' - $!\n"; tie my %ini, 'Config::IniFiles', ( -file => "".$path."" ) or die "Error: IniFiles->new: @Config::IniFiles::errors"; close ($fh) or die "Could not close '".$path."' - $!\n"; print Dumper(\%ini); return %ini; } # end sub complex sub val { open my $fr , '<' , "".$path."" or die "Could not open file: ".$path." - $!\n"; flock($fr, LOCK_SH) or die "Could not lock '".$fr."' - $!\n"; my $cfg = Config::IniFiles->new( -file => "".$path."" ) or die "error: IniFiles->new: @Config::IniFiles::errors"; close ($fr) or die "Could not close '".$path."' - $!\n"; my $values = $cfg->val( 'Perl', 'test' ); my @array = split(',', $values); print Dumper(\@array); return @array; } # End of sub val sub simple { open my $fr , '<' , "".$path."" or die "Could not open file: ".$path." - $!\n"; flock($fr, LOCK_SH) or die "Could not lock '".$fr."' - $!\n"; Config::Simple->import_from( "".$path."", \%information) or die Config::Simple->error(); close ($fr) or die "Could not close '".$path."' - $!\n"; print Dumper(\%information); return %information; } # End of sub simple my $r = timethese ( $count , { 'complex' => '&complex', 'val' => '&val', 'simple' => '&simple' } ); cmpthese $r;

    I am using a common conf.ini folder that both scripts are reading from with flock process applied since I am planning to use in combination with other scripts.

    Sample of the conf.ini folder:

    [Perl] test= bar,baz,foo,quux

    The results after the test are the following:

    val: 6 wallclock secs ( 5.69 usr + 0.30 sys = 5.99 CPU) @ 989.98/s +(n=5930) Rate complex val simple complex 700/s -- -29% -46% val 990/s 41% -- -24% simple 1297/s 85% 31% --

    Well to be honest I was expecting the complex version of Config::IniFiles to be faster in comparison to Config::Simple due to simplicity of the code. But the results have proved my assumption wrong. Thanks to Anonymous Monk that he elaborate regarding the tie and OOP process I decided to add also the val process to make it more fair. Also davido point out that the unlock process is a trap and possibly the data can still remain within the process, so it is better to close the file in order to flush the output.

    Again thank you all for your contribution, to beginners like my self this is a big boost on the learning curve.

    Well I do not know if this comparison makes any sense to anyone. But since I am beginner and all of this stuff make a huge impression, I felt it would be nice to mention this. Just in case that someone needs to use one of these two solutions to get also an idea about speed.

For magnificent glory, a TT Replacement.
6 direct replies — Read more / Contribute
by EvanCarroll
on May 20, 2014 at 14:23
    TT is pretty old and dated, so I've endeavored to implement Jade in Perl. Jade is a modern markup that is quickly becoming somewhat of a standard for Node.js. Here is what Jade looks like, (and all of this currently works in my port).
    doctype html
    html(lang="en")
      head
        title= pageTitle
        script(type='text/javascript').
          if (foo) {
             bar(1 + 5)
          }
      body
        h1 Jade - node template engine
        #container.col
          if youAreUsingJade
            p You are amazing
          else
            p Get on it!
          p.
            Jade is a terse and simple
            templating language with a
            strong focus on performance
            and powerful features.
    
    You can read more about it on jade-lang.com/ There are a lot of key features to it. The big thing is, of course, not having to manage closing tags with whitespace sensitivity. This may be very un-Perlesque. But all good Perl users should join the rest of the rest of world in passionately hating sgml/xml. I've got very few tests on it currently, and most importantly it does not set out to recreate a DOM. It's a basic Jade recursive decent parser that supports interpolated (Blocks) and compiled (includes). The only things not yet implemented are in the POD. I plan on getting to them in the future.


    Evan Carroll
    The most respected person in the whole perl community.
    www.evancarroll.com
The 10**21 Problem (Part 4)
6 direct replies — Read more / Contribute
by eyepopslikeamosquito
on May 19, 2014 at 07:48

    In part three of this seemingly never-ending series on high performance computing we continued what may become the longest whittle in Perl Monks history by further reducing the running time of our magic formula search program, from 728 years down to just 286.

    The latest round of improvements were achieved not by a major breakthrough, rather by applying a succession of micro-optimizations. To achieve the required massive speed boost, it was becoming increasingly clear that we needed to find a new breakthrough idea.

    Constraint Satisfaction Problems

    Constraint satisfaction problems (CSPs) are mathematical problems defined as a set of objects whose state must satisfy a number of constraints or limitations. CSPs are the subject of intense research in both artificial intelligence and operations research, since the regularity in their formulation provides a common basis to analyze and solve problems of many unrelated families. CSPs often exhibit high complexity, requiring a combination of heuristics and combinatorial search methods to be solved in a reasonable time.

    -- Constraint satisfaction problem (wikipedia)

    So, while the talk of this being a “10**21 problem” is an interesting one, and the latest installment (#3) about whacking the TLB of the processor is even more so, it seems like an incredible amount of work to unleash on a problem that has a trivial algorithmic solution ... This is on the one hand very interesting and computer-sciency, as I have already said, but the choice of example-problem puzzles me.

    -- sundialsvc4

    Roman numerals were the challenge at hand. The method of the example is common to constraint searches beyond the example. Constraint searches exist in drug research and many other fields ... Pick any field which has a large search space for just the right combination of properties in an as yet undiscovered item. Write a program which tries and makes a preliminary fitness determination for each possibility. Have that program spit out a short list of candidates for further investment of testing and development ... In this specific case, the fitness is a maximum length, a handful of inputs, and a handful of outputs that map correctly to those inputs ... The point of an example is that it is a concrete thing that is completed and shown rather than an abstract idea.

    -- mr_mischief beautifully explains the point to sundialsvc4

    mr_mischief was right on the money, sundialsvc4 on the wrong track.

    This was indeed a classic constraint satisfaction problem ... which I felt woefully ill-equipped to solve. Lacking the mathematical ability of an ambrus, the only strategy I could think of was to desperately search for a hack, any hack, that would allow me to abandon a potential solution as early as possible, as soon I was certain it could not be successfully completed.

    In computer science and mathematical optimization, a metaheuristic is a higher-level procedure or heuristic designed to find, generate, or select a lower-level procedure or heuristic (partial search algorithm) that may provide a sufficiently good solution to an optimization problem, especially with incomplete or imperfect information or limited computation capacity.

    -- Metaheuristic (wikipedia)

    Ahem. Instead of "hack" above, please feel free to substitute "metaheuristic" -- a new word I learnt while researching this series.

    An "early abandonment hack", erm metaheuristic, would clearly be much faster than a brute force complete enumeration of all candidate solutions. How to find such a hack?

    Know Your Data

    To get a feel for what the data in the 4GB lookup tables actually looked like, I wrote some small Perl programs similar to the one below, which displays the data in 128-byte blocks:

    use strict; use warnings; use List::MoreUtils qw(natatime); # See [id://861938] sub chunk_array { my ( $n, @vals ) = @_; my $str; my $iter = natatime( $n, @vals ); while ( my @line = $iter->() ) { $str .= join( " ", @line ) . "\n"; } return $str; } sub dump_blocks { my $file = shift; # bytemap file to read my $nblock = shift; # number of blocks to dump local $/ = \128; # Make <> operator read 128 bytes open( my $fh, '<', $file ) or die "error: open '$file': $!"; binmode($fh); for my $n ( 1 .. $nblock ) { my $block = <$fh>; length($block) == 128 or die "oops"; print "block $n:\n"; print chunk_array( 16, map { sprintf "%3d", ord } split //, $b +lock ); } close($fh); } # A 4GB bytemap consists of four 1GB physical files (suffix: "-0" .. " +-3"). # (See [id://1086481] for a program that generates a bytemap). my $bytemap_root = "lookzpM.byte"; my $fname = $bytemap_root . "-0"; dump_blocks( $fname, 16 );
    which produced:
    block 1: 0 118 0 126 0 126 0 102 0 102 0 94 0 94 0 86 0 86 0 94 0 94 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 block 2: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 75 55 51 55 59 63 35 47 43 39 35 39 27 31 19 15 11 23 19 23 27 31 block 3: 0 111 0 103 0 103 0 127 0 79 0 119 0 119 0 127 0 111 0 103 0 103 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 block 4: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 114 0 122 0 114 0 106 0 98 0 90 0 66 0 74 0 82 0 90 0 82 44 42 40 34 36 58 56 2 12 0 8 50 52 58 56 50 44 42 40 34 36 block 5: 0 88 0 76 0 72 0 84 0 88 0 108 0 104 0 100 0 120 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 block 6: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 109 0 97 0 101 0 65 0 77 0 113 0 117 0 113 0 109 0 97 0 101 5 1 1 0 0 17 17 21 21 17 17 45 45 33 33 37 0 65 0 77 0 block 7: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 block 8: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 68 0 72 0 76 0 88 0 84 0 104 0 108 0 24 26 4 2 8 0 12 0 120 0 116 0 104 0 108 0 120 0 68 0 72 0 76 0 88 0 84 0 block 9: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 block 10: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 83 0 91 0 67 0 75 0 67 63 59 55 51 55 43 47 51 63 59 7 3 7 11 15 3 31 27 23 19 23 0 111 0 95 0 71 0 71 0 79 0 127 0 119 0 119 0 111 0 127 0 block 11: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 block 12: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 124 0 112 0 116 0 112 0 124 0 64 0 68 0 64 0 92 0 80 0 84 22 16 30 28 30 0 6 4 6 0 62 60 62 48 54 52 54 48 62 60 62 0 70 0 70 0 94 0 94 0 86 0 0 0 0 0 0 0 0 0 0 0 block 13: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 block 14: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 85 0 89 0 93 0 73 0 69 0 121 0 125 0 105 0 117 0 121 0 125 1 9 5 5 1 25 29 29 17 105 0 85 0 89 0 93 0 73 0 69 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 block 15: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 block 16: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 78 0 70 0 70 0 94 44 46 40 22 20 22 24 30 12 14 8 6 4 6 0 126 0 110 0 118 0 118 0 126 0 78 0 70 0 70 0 94 0 110 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    Hmmm, every one of these blocks looks like it is one of just three distinct types:
    • All zeros
    • All non-zero numbers even
    • All non-zero numbers odd
    Every one of those blocks is one of just three distinct types! Every one of those blocks is one of just three distinct types?!! Whoa. But does that hold true everywhere? To find out, I nervously ran a brute force search over all seven 4GB lookup tables and was elated to learn that this "theorem" does indeed hold true for every single block of every single lookup table!

    Only odd modulos (i.e. 1001, 1003, 1005, ...) can produce a valid solution

    -- ambrus' third theorem

    For modulo 1001, each and every 128-byte block in all seven 4GB lookup tables contains either all zeros, all non-zero numbers even, or all non-zero numbers odd

    -- ambrus' fourth theorem

    Once again I could not restrain myself from adding to our list of the theorems of ambrus. :-) I expect ambrus' third theorem is easy to prove, his fourth much harder.

    Curiously, out of all odd modulos in the range 1001..1221, the fourth theorem of ambrus applies to modulos 1001 and 1221 only. Why only those two? Who decides? Weird. Maybe ambrus can show mathematically why it must be so. I "proved" it only via brute force enumeration of hundreds of 4GB lookup tables.

    For modulo 1003..1219 you would need to find a different heuristic to trim the search space (if one exists). That is the main reason why I used modulo 1001 only.

    As you might have guessed, noticing this oddity in the lookup table data was a key breakthrough. Why?

    • All zero blocks can be skipped immediately. 125 searches eliminated.
    • To get a hit, all blocks must be even or all blocks must be odd (a value from an odd block can never match one from an even block).
    • We can encode each 128 byte block in just two bits: one to indicate zero or non-zero, the other to indicate odd or even. This reduces the lookup table size from 4GB down to just 8MB! This is crucial in reducing CPU cache misses.

    Wait, there's more. Each of MDCLXVI can produce a valid solution only if all their lookup table blocks are of the same (non-zero) type. Assuming an equal number of blocks of each type, that occurs with a probability of (2/3) (first block non-zero) times (1/3)**6 (next six blocks match first non-zero block). After a preliminary check of all seven bitmaps therefore, only one in 1093.5 candidate solutions needs further (more expensive) checking for an exact match -- via a 4GB table lookup plus calculation, as detailed in earlier episodes of this series.

    Bitmaps

    Here is the code (64-bit compile, 32-bit int, 64-bit size_t) to create a 8MB bitmap from a 4GB lookup table:

    #define LAST_CHUNK_IDX_4MB 33554431 #define LAST_CHUNK_IDX_8MB (33554432*2 - 1) #define BIT_MAP_SIZE_BYTES ((LAST_CHUNK_IDX_8MB+1) >> 3) typedef unsigned char bytev_t; void make_bitmapA( const char* bitfname, // in: bit map file name const bytev_t* bytevec, // in: byte vector unsigned char* bitvec // out: bitvec put here (assumed already z +eroed) ) { unsigned int uu; unsigned int uu2; unsigned int byte; unsigned int bit; const size_t* psz; const bytev_t* psy; int j; unsigned int neven; unsigned int nodd; unsigned int totzero = 0; unsigned int toteven = 0; unsigned int totodd = 0; for (uu = 0; uu <= LAST_CHUNK_IDX_4MB; ++uu) { psy = &bytevec[uu << 7]; psz = (size_t*)psy; // This is the zero case. if (psz[0] == 0 && psz[1] == 0 && psz[2] == 0 && psz[3] == 0 && psz[4] == 0 && psz[5] == 0 && psz[6] == 0 && psz[7] == 0 && psz[8] == 0 && psz[9] == 0 && psz[10] == 0 && psz[11] == 0 && psz[12] == 0 && psz[13] == 0 && psz[14] == 0 && psz[15] == 0) { // zero: both bits are zero, nothing to do ++totzero; continue; } // All non-zero elements in block will be all even or all odd! neven = 0; nodd = 0; for (j = 0; j < 128; ++j) { if (psy[j] != 0) { if (psy[j] % 2 == 0) { ++neven; } else { ++nodd; } } } if (neven > 0 && nodd > 0) { fprintf(stderr, "oops 1\n"); exit +(1); } if (neven == 0 && nodd == 0) { fprintf(stderr, "oops 2\n"); exit +(1); } uu2 = uu * 2; // nonzero: set first bit byte = uu2 >> 3; bit = uu2 & 7; bitvec[byte] |= (1 << (bit+1)); if (neven > 0) { // even: second bit is zero, nothing to do ++toteven; continue; } // odd: set second bit ++totodd; bitvec[byte] |= (1 << bit); } printf("zero=%u even=%u odd=%u from %u (%u)\n", totzero, toteven, t +otodd, LAST_CHUNK_IDX_4MB+1, uu); FILE* fh; printf("write %s\n", bitfname); fh = fopen(bitfname, "wb"); if (fh == NULL) { fprintf(stderr, "oops open '%s' failed: errno=%d\n", bitfname, e +rrno); exit(1); } if (fwrite(bitvec, BIT_MAP_SIZE_BYTES, 1, fh) != 1) { fprintf(stderr, "oops write '%s' failed: errno=%d\n", bitfname, +errno); exit(1); } fclose(fh); }

Object Oriented Perl - very basic guide
9 direct replies — Read more / Contribute
by Preceptor
on May 15, 2014 at 10:10

    I've been doing Perl for a while now, and one of the things I've meant to get to grips with, is 'object oriented' Perl. I think I'm getting there now, so I'm writing up what I've got so far in the hopes that it'll be a helpful reference.

    First, the basics - object oriented languages encapsulate code into 'objects'. The object has an internal state, and it's own 'code' for handling that state.

    The reason that's useful, is because if I write a module for you to use - then all you need to know is how to run the internal code, and don't have to worry about what goes on inside.

    That's why it's widely used in modules on CPAN, and chances are pretty good that you've already used it if you've used any CPAN modules.

    The way perl 'does' objects, is that it treats each object as a hash. The 'attributes' or 'internal state' are hash keys.

    As a start point (save as 'MyObject.pm')

    #!/usr/bin/perl use strict; use warnings; package MyObject; sub new { print "method 'new' called with parameters: ", join ("\n", @_ ), "\n +"; my ( $class, @args ) = @_; my $self = {}; $self -> {state} = "newly created"; bless $self, $class; return $self; } sub set_state { my ( $self, $new_state ) = @_; $self -> {state} = $new_state; } sub get_state { my ( $self ) = @_; return $self -> {state}; }

    You can then 'drive' this module with:

    # use_oo.pl #!/usr/bin/perl use strict; use warnings; use MyObject my $object_instance = MyObject -> new(); print "Object is: ", $object_instance,"\n"; $object_instance -> set_state ( "here is a new state" ); print "Object state: ", $object_instance -> get_state(),"\n";

    It's fairly simple, but what it means is that I can 'build in' all sorts of code and state variables into 'MyObject' - for example, validating when you 'set state' to ensure that it's not forcing something odd to happen. But this can happen transparently, and your code snippet doesn't actually need to worry about it.

    What you may not have seen before is the 'bless' keyword - when you create '$self' then you create a hash reference - which you could (and essentially, do) treat it as any other hash reference. What 'bless' does is 'mark' the reference as a specific type of object. So you'll see when you print '$object_instance' that it looks something like:

    MyObject=HASH(x012345)

    That's key, because it lets perl identify where it should look when you use a method - e.g. 'get_state' - it knows to look inside the 'MyObject' module explicitly.

    $object_instance -> set_state ( "here is a new state" );

    is translated by perl (because it knows it's a 'MyObject' because of that 'bless') into:

    MyObject::set_state ( $object_instance, "here is a new state" );

    (Passing $object_instance is necessary, because that has it's own internal state, and we need to know which internal state to modify).

    I wouldn't say it's something I'll always use as a technique, but it's a useful extra tool - if I've got a set of scripts that all do similar and related things, then there may well be value in creating a module for them to all use... and if you do, it's certainly worth considering using an object oriented module instead.

At the risk of saying something stupid-but-obvious about Roman Numerals
4 direct replies — Read more / Contribute
by sundialsvc4
on May 12, 2014 at 08:34

    Like everyone else, I have been reading the “10**21” threads with some academic interest – but also, I confess, some confusion when talk is made of “4 gigabyte lookup-tables.”   If the poor Romans actually had to do that, they’d never have been able to count anything.   An algorithm for converting Roman numerals to decimal is trivially easy:   just reverse the string (say ...) so that you intepret it from right-to-left instead of left-to-right.

    use strict; use warnings; my %letters = ('I' => 1, 'V' => 5, 'X' => 10, 'L' => 50, 'C' => 100, 'D' => 500, 'M' => 1000); my $roman = "MCCMLXXIV"; #1874 my $result = 0; my $high_score = 0; foreach my $letter ( split(//, reverse($roman)) ) { my $score = $letters{$letter}; if ($score >= $high_score) { $result += $score; } else { $result -= $score; } $high_score = $score if $score > $high_score; } print "$result\n";

    When the string is viewed from left-to-right, parsing seems difficult because in a well-formed Roman string like MCCMLXXIV you don’t know what C means until you encounter M.   However, that’s not the way the Romans did it.   They probably also read the string from right-to-left, as was and is the standard for many human languages of the region.   Now, the rule is one that any human could do in his head:   “if the digit is as big or bigger than the biggest one you have seen so-far, reading from right to left, add it to the total; otherwise, subtract.”   They did not use arbitrary combinations of letters, although the algorithm shown above would accept them.   (As an example, I doubt that they actually used IM to represent 999.)   In fact, the rules that they did use make the process similar to “making change” given various denominations of currency, and this can’t be accidental.

    So, while the talk of this being a “10**21 problem” is an interesting one, and the latest installment (#3) about whacking the TLB of the processor is even more so, it seems like an incredible amount of work to unleash on a problem that has a trivial algorithmic solution.   I am certainly not “dissing” that thread for its informative discussion of how to deal with problems that would require massive search, but Roman Numerals does not seem to be a good example to use here, unless I am entirely missing the point.   (Which possibility I do not deny.)   The proper solution for Roman Numerals would be to “find a better algorithm,” then optimize-the-hell out of that algorithm.   I cannot imagine that any solution would be faster, especially when the time required to populate those gigabytes of data is included, but I say this only because a nearly cost-free algorithm to solve this problem readily exists, as just shown.   For problems that lack such alternatives, the issues discussed in that thread are of course hugely relevant, and the solutions, very informative.

    - - - - -

    For the “true die-hard golfers” among us, the same algorithm can be expressed, even without any conditional branches at all other than for the scan-loop, as a finite-state machine (FSM) with twenty-six-element tables containing numbers that are either positive or negative depending on the table.   The machine’s state would, of course, represent the largest digit, if any, yet seen when scanning from right to left.   For instance, in the table for state LARGEST_SEEN_IS_V (i.e. all tables except LARGEST_SEEN_IS_I and INITIAL_STATE), the entry for I would now be (-1).   A “real” Perl implementation would use a C-language subroutine.   If an ERROR_STATE is provided, such an algorithm could also detect and reject “syntax error” inputs (as the algorithm shown above cannot do).   Which is exactly why we build language-parsers and lexical-scanners this way . . .

    - - - - -

    Clearly, the intent of the main thread must be to present efficient solutions to problems that actually do consist of arbitrary-string inputs, for which all possible strings are equally probable, under conditions where arbitrarily-large amounts of physical RAM are presumed to be available.   Problems that, unlike Roman Numerals, do not present a trivial algorithmic solution.   Problems for which an efficient solution in the Perl language (or at least, the Perl environment) is required.

    - - - - -

    (Postscript:   I recently helped with a geocaching-like team building challenge for highly-paid “alpha-male” executives, in which you had to use a GPS to find nearby things in an unfamiliar town and obtain clues.   One of the things we took them to was a statue which had a date written in Roman Numerals.   They just needed to get that date.   No big deal, any third-grader could do that, right?   Uh uh.   I was speechless to watch several(!) of them busily searching-for and downloading a “Roman Numerals app” to let them decipher the clue ... which they had utterly no idea how to do on their own.   They could not read Roman Numerals!   So they sat there, locating and downloading – in a couple of cases, buying – an app for their cell-phones.   “Oy vey!”)

The 10**21 Problem (Part 3)
2 direct replies — Read more / Contribute
by eyepopslikeamosquito
on May 11, 2014 at 09:04

    To recap, in parts one and two we reduced the estimated running time of a magic formula search program from 52,887,477 years down to just 728 years. Though that was gratifying, I really didn't want to wait that long. I was happy to wait a year or two, but no more. So I had to find a way to somehow make the search program run a hundred times faster.

    That's not a minor disadvantage, we're talking about things being 50 to 100 times slower with a linked list. Compactness matters. Vectors are more compact than lists. And predictable usage patterns matter enormously. With a vector, you have to shove a lot of elements over, but caches are really really good at that.

    When you traverse lists you keep doing random access ... so you are actually random accessing your memory and you are maximizing your cache misses, which is exactly the opposite of what you want.

    -- Bjarne Stroustrup: Why you should avoid Linked Lists (3:30-5:00) (from GoingNative 2012 Keynote)

    At the end of Part 2, you may recall that we had become stalled by a nasty "cache miss" performance problem, similar to the one Stroustrup describes above. By following Stroustrup's advice to "stay compact, stay predictable" -- and without changing the essential algorithm -- I was able to eventually achieve the massive speed up required.

    Sometimes you think a code change "should" improve performance yet it does not. Each specific code "performance improvement" therefore has to be benchmarked separately, accepted if it improves performance, backed out if not. Painstaking work. One step at a time ... one insight leads to another ... just like playing code golf. :-)

    Lookup Table vs Calculation

    In Part 1 we noticed, when playing golf, there is often a battle between a lookup table and a magic formula. Curiously, when searching for a code golf magic formula, there is often a similar battle, this time between a lookup table and a calculation: for example, to eliminate the inner loop, ambrus used a calculation; I used a lookup table.

    When searching for our magic formula, lookup table wins for the first Roman Numeral value, it's a tie for the second value, and calculation wins for all subsequent values. Why?

    Well, from Part 2, the code to calculate the first Roman Numeral value (D):

    d8 = (d7 ^ q8) * H_PRIME ^ HASH_LEN; d9 = d8 & ~127; d9 %= 1001; if (d9 < 0) d9 += 1001; if (d9 < 500 - 127) continue; if (d9 > 500 + 127) continue; dist = d9 < 500 ? 500 - d9 : d9 - 500; q9 = (d8 & 127) ^ dist; if (q9 == 0 || q9 == 10 || q9 == 13) continue;
    is significantly more work than the code to calculate the second and subsequent values namely:
    c9 = (c7 ^ q8) * H_PRIME ^ HASH_LEN ^ q9; c9 %= 1001; if (c9 < 0) c9 += 1001; if (c9 != 100) continue;
    Apart from being more work, the first value always needs to be calculated, while the second value only need be calculated about 255 times per 1001 (i.e. when d9 above is in the range 373..627). Since the second value matches only once per 1001 (i.e. when c9 above is precisely 100), the third value only need be calculated with a probability of (255/1001) * (1/1001), the fourth value (255/1001) * (1/1001) * (1/1001), ... and so on. Testing showed that hitting the third and subsequent values (i.e. L, X, M) is a sufficiently rare event that, with current commodity hardware, it is not worth cluttering all the memory caches with their 4GB lookup tables.

    Of course, whether lookup table is faster than calculation depends on CPU versus memory speed. And main memory speed can be improved by spending a lot of money. At one stage, I was considering boosting memory performance by buying expensive high speed memory and overclocking my PC.

    By the way, a nice bonus from reducing five 4GB lookup tables down to just one is that the program's total memory requirement is now much lower, allowing us to run it in certain massively parallel environments, such as Xeon Phi for example (which currently has a 6GB limit per core).

    Here is the inner loop code revised to use just one 4GB lookup table:

    #define H_PRIME 1000003 #define HASH_LEN 11 // my_m128_t mimics Intel __m128i type typedef struct my_m128_t my_m128_t; struct my_m128_t { short m128i_i16[8]; }; int inner( const unsigned char* bytevecM, int startval, int endval, int m2, int d2, int c2, int l2, int x2, my_m128_t* ps) { int m3, m4, m5, m6, m7; int d3, d4, d5, d6, d7, d9; int c3, c4, c5, c6, c7, c9; int l3, l4, l5, l6, l7, l9; int x3, x4, x5, x6, x7, x9; int q3, q4, q5, q6, q7, q8, q9; int iret = 0; for (q3 = startval; q3 < endval; ++q3) { if (q3 == 10 || q3 == 13) continue; m3 = (m2 ^ q3) * H_PRIME; d3 = (d2 ^ q3) * H_PRIME; c3 = (c2 ^ q3) * H_PRIME; l3 = (l2 ^ q3) * H_PRIME; x3 = (x2 ^ q3) * H_PRIME; for (q4 = 1; q4 < 128; ++q4) { if (q4 == 10 || q4 == 13) continue; m4 = (m3 ^ q4) * H_PRIME; d4 = (d3 ^ q4) * H_PRIME; c4 = (c3 ^ q4) * H_PRIME; l4 = (l3 ^ q4) * H_PRIME; x4 = (x3 ^ q4) * H_PRIME; for (q5 = 1; q5 < 128; ++q5) { if (q5 == 10 || q5 == 13) continue; m5 = (m4 ^ q5) * H_PRIME; d5 = (d4 ^ q5) * H_PRIME; c5 = (c4 ^ q5) * H_PRIME; l5 = (l4 ^ q5) * H_PRIME; x5 = (x4 ^ q5) * H_PRIME; for (q6 = 1; q6 < 128; ++q6) { if (q6 == 10 || q6 == 13) continue; m6 = (m5 ^ q6) * H_PRIME; d6 = (d5 ^ q6) * H_PRIME; c6 = (c5 ^ q6) * H_PRIME; l6 = (l5 ^ q6) * H_PRIME; x6 = (x5 ^ q6) * H_PRIME; for (q7 = 1; q7 < 128; ++q7) { if (q7 == 10 || q7 == 13) continue; m7 = (m6 ^ q7) * H_PRIME; d7 = (d6 ^ q7) * H_PRIME; c7 = (c6 ^ q7) * H_PRIME; l7 = (l6 ^ q7) * H_PRIME; x7 = (x6 ^ q7) * H_PRIME; for (q8 = 1; q8 < 128; ++q8) { if (q8 == 10 || q8 == 13) continue; q9 = bytevecM[(unsigned int)(m7 ^ q8)]; if (q9 == 0) continue; d9 = (d7 ^ q8) * H_PRIME ^ HASH_LEN ^ q9; d9 %= 1001; if (d9 < 0) d9 += 1001; if (d9 != 500) continue; c9 = (c7 ^ q8) * H_PRIME ^ HASH_LEN ^ q9; c9 %= 1001; if (c9 < 0) c9 += 1001; if (c9 != 100) continue; l9 = (l7 ^ q8) * H_PRIME ^ HASH_LEN ^ q9; l9 %= 1001; if (l9 < 0) l9 += 1001; if (l9 != 50) continue; x9 = (x7 ^ q8) * H_PRIME ^ HASH_LEN ^ q9; x9 %= 1001; if (x9 < 0) x9 += 1001; if (x9 != 10) continue; ps[iret].m128i_i16[0] = q3; ps[iret].m128i_i16[1] = q4; ps[iret].m128i_i16[2] = q5; ps[iret].m128i_i16[3] = q6; ps[iret].m128i_i16[4] = q7; ps[iret].m128i_i16[5] = q8; ps[iret].m128i_i16[6] = q9; ++iret; } } } } } } return iret; }
    As you can see, we simply combined the "first theorem of ambrus" code and the "first hack of eyepopslikeamosquito" code from Part 2.

    Doing this reduced the running time from 95 to 74 seconds = 74 * 125**4 / 60 / 60 / 24 / 365 = 572 years.

    Prefetching

    In emerging and future high-end processor systems, tolerating increasing cache miss latency and properly managing memory bandwidth will be critical to achieving high performance. Prefetching, in both hardware and software, is among our most important available techniques for doing so.

    -- When Prefetching Works, When it Doesn't, and Why (pdf)

    What more can be done to speed up our (newly single) 4GB lookup table? Well, before we enter the inner loop, we already know the limited range of values required from the bytevecM lookup table. In fact, just as in ambrus's first theorem, there are only 128 values, and they are all right next to each other in memory. To reduce memory latency, can we somehow tell the computer to start the tortuous load sequence of those 128 bytes from main memory before we enter the inner loop? Yes! The MSVC _mm_prefetch compiler instrinsic, which under the covers generates the x86 SSE SIMD prefetcht0 instruction, does just that. That is, adding the following code immmediately before the q8 loop above:

    _mm_prefetch(&bytevecM[(unsigned int)m7 & 0xffffff80], _MM_HINT_T0); _mm_prefetch(&bytevecM[64+((unsigned int)m7 & 0xffffff80)], _MM_HINT_T +0);
    begins bringing into all CPU caches the two (64-byte) cache lines required by the q8 loop for the bytevecM lookup table before the loop starts. Note that 0xffffff80 above is just ~127 in disguise.

    Adding this prefetch hack reduced the running time by 6 seconds to 68 seconds = 68 * 125**4 / 60 / 60 / 24 / 365 = 526 years. Update: this was later improved from 6 to 17 seconds.

Nothingness - The far end of language development
2 direct replies — Read more / Contribute
by wjw
on May 03, 2014 at 04:56
    After having read an impressive meditation by eyepopslikeamosquito here The 10**21 Problem (Part 2) I went on a bit of a walk-about through CPAN; Not looking for anything in particular, just looking.

    Along the way I ran into a Moose, which was a bit of a surprise. I had heard of this creature in the world of Perl, had seen references to it, read a bit about it. But my understanding was that is was one of those 'Object' creatures which I have, for a long time, 'Object'ed to.

    Not for any good reason I guess. Mostly because God and everyone had been talking about object oriented programming as if the worlds existence prior to its appearance was somehow miraculous. "How did people get anything done?". Being the obstinate type, I plodded along shaking my head and writing my procedural code, getting the job done without accessors, roles, classes, instances, methods or any of the other goofy constructs that generated a whole new way of talking about code. I liked(and still do like) my simple variables, and sub-routines.

    But that illusion is finally to be set aside. My obstinate Luddite attitude, I have to admit finally, is being undermined by my use of all those handy CPAN modules that make it possible for me to talk to databases, coerce odd date representations into something I consider readable, deliver web pages and data for them, etc... .

    Yes, I am finally, after reading perlootut and perlobj, ready to admit that the goddamn OO stuff is useful, and I probably should have been learning it instead of ignoring it.

    All this from a wander through CPAN... And then I find it... The ultimate module! The one which inspires me to realize that I might as well go a head and learn something new after all this time. Someday, I might write a module with this level of elegance...something so fundamental that it leaves observers with nothing but the unanswerable question 'Why?'. It will not generate debate like our current Voting Booth subject. Nor will it genearate heady (worth-while, informative) study like the one previously mentioned that started tonights personal Chautauqua.

    A module which is, and does and contributes like this one. Some day... . Someday, I will achieve M!

    ...the majority is always wrong, and always the last to know about it...
    Insanity: Doing the same thing over and over again and expecting different results...

Add your Meditation
Title:
Meditation:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":


  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • Outside of code tags, you may need to use entities for some characters:
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?
    Username:
    Password:

    What's my password?
    Create A New User
    Chatterbox?
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others chilling in the Monastery: (5)
    As of 2014-08-01 02:51 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?

      My favorite superfluous repetitious redundant duplicative phrase is:









      Results (256 votes), past polls