Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses

The Monastery Gates

( #131=superdoc: print w/replies, xml ) Need Help??

Donations gladly accepted

If you're new here please read PerlMonks FAQ
and Create a new user.

poll ideas quest 2020
Starts at: Jan 01, 2020 at 00:00
Ends at: Dec 31, 2020 at 23:59
Current Status: Active
5 replies by pollsters
    First, read How do I create a Poll?. Then suggest your poll here. Complete ideas are more likely to be used.

    Note that links may be used in choices but not in the title.

Break out from download
1 direct reply — Read more / Contribute
by IB2017
on Apr 09, 2020 at 14:33

    Hello, I am creating an updater for my Tk program on Windows. It simply needs to download an installer and run it. The download part is done by the following subrutine which also updates a progress bar. Is there an easy way to break out of the download (possibily without to rewrite the whole program using threads etc.).

    sub doUpdate{ print "doUpdate\n"; my $url = ''; my $file = $Path . '/updater.exe'; print "Saving updater in $file\n"; my $ttlDown = 0; my $ua = LWP::UserAgent->new; open my $out, '>:raw', $file or die "$file: $!"; my $resp = $ua->get( $url, ':content_cb' => sub { my ($data, $response) = @_; my $size = $response->content_length; $ttlDown += length $data; $percent_done= $ttlDown * 100.0 / $size; print {$out} $data; $mw->update(); }, ); close $out; print "Out\n"; } depreciated?
3 direct replies — Read more / Contribute
by guthrie
on Apr 08, 2020 at 23:20
    ON Debian linux, I am using XML::Xpath & XML::XPath::Parser for parsing some web files, and started getting this error: Cannot open file 'this service is depreciated as of 1/15/2020' at /usr/share/perl5/xml/
Installing Net::SCP::Expect - it can't find/install Expect
3 direct replies — Read more / Contribute
by mrkrinkle
on Apr 08, 2020 at 19:14

    I'm running Ubuntu 18.04.2

    This is perl 5, version 26, subversion 1 (v5.26.1) built for x86_64-linux-gnu-thread-multi (with 67 registered patches, see perl -V for more detail)

    I have make installed (all I did was sudo apt-get update; sudo apt-get install make).

    I installed CPAN according to this:

    The first line

    perl -MCPAN -e install Net::SCP::Expect

    seems to have only install cpan (well MCPAN is what it's called now)?

    So then I did:

    sudo perl -MCPAN -e shell

    And "install Net::SCP::Expect"

    It ended with this:

    Test Summary Report ------------------- t/01-test.t (Wstat: 512 Tests: 0 Failed: 0) Non-zero exit status: 2 Parse errors: Bad plan. You planned 14 tests but ran 0. t/02-bc.t (Wstat: 512 Tests: 0 Failed: 0) Non-zero exit status: 2 Parse errors: No plan found in TAP output t/03-log.t (Wstat: 512 Tests: 0 Failed: 0) Non-zero exit status: 2 Parse errors: Bad plan. You planned 11 tests but ran 0. t/04-multiline.t (Wstat: 512 Tests: 0 Failed: 0) Non-zero exit status: 2 Parse errors: Bad plan. You planned 33 tests but ran 0. t/10-internal.t (Wstat: 512 Tests: 0 Failed: 0) Non-zero exit status: 2 Parse errors: Bad plan. You planned 17 tests but ran 0. t/11-calc.t (Wstat: 512 Tests: 0 Failed: 0) Non-zero exit status: 2 Parse errors: No plan found in TAP output Files=6, Tests=0, 1 wallclock secs ( 0.03 usr 0.00 sys + 0.27 cusr + 0.06 csys = 0.36 CPU) Result: FAIL Failed 6/6 test programs. 0/0 subtests failed. Makefile:883: recipe for target 'test_dynamic' failed make: *** [test_dynamic] Error 2 JACOBY/Expect-1.35.tar.gz make test -- NOT OK //hint// to see the cpan-testers results for installing this module, t +ry: reports JACOBY/Expect-1.35.tar.gz RYBSKEJ/Net-SCP-Expect-0.16.tar.gz Has already been unwrapped into directory /home/bog/.cpan/build/Net- +SCP-Expect-0.16-2 RYBSKEJ/Net-SCP-Expect-0.16.tar.gz Has already been prepared Running make for R/RY/RYBSKEJ/Net-SCP-Expect-0.16.tar.gz Warning: Prerequisite 'Expect => 1.14' for 'RYBSKEJ/Net-SCP-Expect-0.1 +6.tar.gz' failed when processing 'JACOBY/Expect-1.35.tar.gz' with 'ma +ke_test => NO'. Continuing, but chances to succeed are limited. cp blib/lib/Net/SCP/ Manifying 1 pod document RYBSKEJ/Net-SCP-Expect-0.16.tar.gz make -- OK Running make test PERL_DL_NONLAZY=1 "/usr/bin/perl" "-MExtUtils::Command::MM" "-MTest::H +arness" "-e" "undef *Test::Harness::Switches; test_harness(0, 'blib/l +ib', 'blib/arch')" t/*.t t/attributes.t .... ok t/glob.t .......... ok t/signature.t ..... skipped: Set the environment variable TEST_SIGNATU +RE to enable this test. t/string_parse.t .. ok All tests successful. Files=4, Tests=54, 0 wallclock secs ( 0.03 usr 0.00 sys + 0.26 cusr + 0.03 csys = 0.32 CPU) Result: PASS RYBSKEJ/Net-SCP-Expect-0.16.tar.gz Tests succeeded but one dependency not OK (Expect) RYBSKEJ/Net-SCP-Expect-0.16.tar.gz [dependencies] -- NA Failed during this command: TODDR/IO-Tty-1.14.tar.gz : writemakefile NO '/usr +/bin/perl Makefile.PL INSTALLDIRS=site' returned status 6400 JACOBY/Expect-1.35.tar.gz : make_test NO RYBSKEJ/Net-SCP-Expect-0.16.tar.gz : make_test NO one depen +dency not OK (Expect)
    I could give you the whole output if you want. What am I doing wrong? I'm not doing anything else with perl right now. (I mean Ubuntu probably is somewhere but that's all standard stuff.)
Convert JSON to Perl and back with unicode
5 direct replies — Read more / Contribute
by bliako
on Apr 08, 2020 at 18:16

    Hello Monks,

    I am struggling to make either of Data::Dump or Data::Dumper to print rendered(?) unicode characters rather than those ugly escapes but I can't seem to succeed. Perl prints them nicely but dump and dumper escape.

    use utf8; binmode STDOUT, ':encoding(UTF-8)'; use Data::Dumper; use Data::Dump qw/pp/; my $pv = {'&#945;&#946;&#947;' => '&#967;&#968;&#950;'}; #<<<proper gr +eek key and value print pp($pv)."\n"; print Dumper($pv); print "XX:'".$pv->{'&#945;&#946;&#947;'}."'\n"; # proper greek nicely +printed # madness: { "\x{3B1}\x{3B2}\x{3B3}" => "\x{3C7}\x{3C8}\x{3B6}" } $VAR1 = { "\x{3b1}\x{3b2}\x{3b3}" => "\x{3c7}\x{3c8}\x{3b6}" }; # nicely printed XX:'&#967;&#968;&#950;' #<<<< that's proper greek

    thanks, bliako

    Edit: sorry, I did not mention JSON (thanks haukex for reminding me). What I am trying to do is to visualise a long JSON by converting it to a Perl var and then possibly edit the perl var, and finally save back to JSON (with the changes). So, yes, actually I am serialising and de-serialising but I can't seem to find an ascii-text-based, unicode-friendly serialiser other than Dump and Dumper. And for me, YAML is too tiring with all that spaces. Or I am just used to nested Perl data.

    So, the input and output are JSON. Long JSON with unicode. I want to edit that JSON too. But it's too cumbersome in a text-based editor. And so I prefer to covert JSON to Perl, edit the Perl and then convert back to JSON. My procedure/tool was working until some unicode broke it.

Where is documenation for defining servers for use by Rex?
3 direct replies — Read more / Contribute
by nysus
on Apr 07, 2020 at 17:24

    I'm having trouble locating documentation that explains how to set up servers for use in Rex scripts. Can someone please point me in the right direction? Thanks.

    $PM = "Perl Monk's";
    $MCF = "Most Clueless Friar Abbot Bishop Pontiff Deacon Curate Priest Vicar";
    $nysus = $PM . ' ' . $MCF;
    Click here if you love Perl Monks

Override new in Moose for flyweight objects
2 direct replies — Read more / Contribute
by Aaronrp
on Apr 07, 2020 at 14:15

    I've found the module MooseX::Role::Flyweight a useful one for keeping a cache of objects, but I've never been happy with the way it requires you to use Class->instance(y => 'x') instead of Class->new(y => 'x') to get the new object. What if I change my mind about caching, or want to debug the code without the cache? Then I have to find all the calls in all my code and change them. What if I miss one, and call new from outside the class? So much for the cache.

    I came up with this alternative approach, which seems to work. It overrides Moose's new method but still inlines the constructor.

    package TestNew 0.001; use Moose; has thingy => ( required => 1, is => 'ro', ); my %obj_cache; override 'new' => sub { my ( $class, @args ) = @_; my $params = $class->BUILDARGS(@args); # from Moose::Object my $thingy = $params->{thingy}; my $cachekey = $thingy; if ( exists $obj_cache{$cachekey} ) { return $obj_cache{$cachekey}; } return $obj_cache{$thingy} = $class->_new( thingy => $thingy ); }; *Moose::Object::_new = \&Moose::Object::new; __PACKAGE__->meta->make_immutable( constructor_name => '_new' ); 1;

    Other than the unfortunate need to mess around inside the Moose::Object namespace, I don't see any huge downsides, but I must admit I don't really know what I'm looking at when I poke around inside Class::MOP::Class and was wondering if others had thoughts.


Parsing Emacs Lisp sexpr?
3 direct replies — Read more / Contribute
by perlancar
on Apr 07, 2020 at 13:41
    Wondering if there's something on CPAN or elsewhere which can parse the contents of into a reasonable Perl representation? I think Data::SExpression chokes on bracket character and that module seems to be pretty much what CPAN has to offer for something relating to parsing S-expression. I guess whipping up a new parser is not hard...
Connecting to remote server with Rex using .pem file
1 direct reply — Read more / Contribute
by nysus
on Apr 07, 2020 at 10:25

    I'm stumped trying to figure out how to use Rex to log into a remote server. The server provides me with a .pem file. I have successfully used this file to log in manually using ssh -i PEM_FILE_NAME blah@

    I tried converting it to public key with ssh-keygen -y -f PEM_FILE_NAME > public.key and have the following in my RexFile:

    Rex::connect( server => '', user => 'my_user_name', public_key => '/home/my_user_name/public.key', );

    But this isn't working. Not sure what else to try. Thanks.

    $PM = "Perl Monk's";
    $MCF = "Most Clueless Friar Abbot Bishop Pontiff Deacon Curate Priest Vicar";
    $nysus = $PM . ' ' . $MCF;
    Click here if you love Perl Monks

Lost in compressed encodings
4 direct replies — Read more / Contribute
by Skeeve
on Apr 06, 2020 at 04:22

    This is a follow up of my Lost in encodings question.

    Thanks to the answers given there I was able to solve that issue, but now I'm clueless again as I now need to read in compressed (gzip) UTF-8 files and I do not know how to convince perl to read them as UTF-8.

    My code for opening the files is:

    open my $in, '<:utf8', $filename or die "Can't read $filename: $!\ +n"; if ($filename=~/\.gz$/) { $in= new IO::Uncompress::Gunzip $in, { AutoClose => 1 }; }

    When reading uncompressed data, it works fine as I could verify with the help given in my previous thread. I did so by setting the debugger to UTF-8. When reading uncompressed data the Umlaut "" is correctly displayed as "". When reading the same data from a compressed file the "" is displayed as "ü".

    I have no idea how to make perl consider the compressed data as UTF-8?

Regex to Array lookup question
4 direct replies — Read more / Contribute
by johnfl68
on Apr 05, 2020 at 17:23


    Looking for a suggestion of a better way to do this, instead of doing about 30 regex's in a row.

    I have NWS API data for icons that references their long list of a wide array of icons with extra data that I do not need.,20/tsra_sct,40?size=me +dium,30/tsra_hi,30?size +=medium,30/rain_showers? +size=medium",

    All I really need is the modifier (tsra, rain, sleet, bkn, skc, few, ovc, etc). I don't really need anything else. Because the format somewhat changes with each response, it's a bit hard to regex down to just the modifier, as some times there are 2, and no established list of all the possibilities. At this point I figure possibly use just the first modifier listed. I am going to try and regex it down to just the modifier and see how that works, but I am afraid they will through a wrench in the works at some point that will trip up that regex.

    Instead of doing a separate regex for each modifier, is there a way to use an array with a single regex to do a look up table to get the new icon reference? Or another way to see if any modifier is anywhere in the string, then return referenced new icon name?

    { 'skc' => "clear-day", 'few' => "partly-cloudy-day", 'sct' => "partly-cloudy-day", 'bkn' => "cloudy", 'wind' => "wind" }

    Any suggestions would be appreciated. Once pointed in the right direction, usually I can figure the rest out. I just can't think of a better way to do this. Too many things on my mind as well, like many others right now. Thanks you!

Niche programming
5 direct replies — Read more / Contribute
by maurocavendish
on Apr 05, 2020 at 11:31

    Hi, fellow monks!

    I'm a returning Perl programmer, stuck in a vortex of bad management and poor quality code at my regular job in modern IT sweatshops in Italy.

    At the start of my career, I worked few years doing software/hardware monitoring in Perl, and I loved it. I am a self-taught 42-years-old programmer, with no CS/maths education. I'd like to take some time to explore my possibilities in my spare time, for fun and growth. A lot of the sexier technologies are out of my reach, because I cannot afford years of mentally and financially taxing college level studies. Also, I have a very practical mindset, more keen on simple, concise solutions to well-defined problems. I'm thinking on the lines of things like Domain Specific Languages, APIs, focused libraries.

    Could you provide some insights into niche but deep domains I can tackle with Perl, without having to resort to learn advanced college stuff or crazy architectural patterns? I'm aware I can do couple Web projects, but it would feel like being at work again, and I've also become weary of Web Development/CRUD apps/Scrum gimmicks/you name it. I'd love to find concrete domains that I can tackle with a somewhat scripting mindset for my leisure and enrichment in the next couple years. I hope I make sense, and I wait to be enlightened by your comments!
COVID-19 data analytics, cryptography, and some things you should know
5 direct replies — Read more / Contribute
by tachyon-II
on Apr 04, 2020 at 21:15

    Hi All,

    It's been a decade or so but my love of Perl continues.

    The background of the Why? for this question can be found at which I recommend you read, particularly if you take any medication. It might just save your life.

    The data collection tool to which it applies can be found at We are gathering this data because the US, Chinese and Canadian CDC's are not and when the Italians partially gathered this medication data it showed 73% of all COVID-19 deaths occurred in the ~3% of people taking two specific classes of medication. The Turkish data release a few days later found similarly - 68.8% of deaths occurred in this really small group.

    The problem to be solved is the safe, deidentified release of IP addresses and Browser strings to fulfil the requirements of HIPAA, GDPR and CCPA. We simply do not gather any other PII (personally identifiable information) so it is impossible for this to leak. Age range, sex, disease severity and outcome and medications are the other data points.

    These 2 identifiers will assist researchers in assessing if the crowdsourced data we are gathering is "gamed" or "believable". We have taken steps to make automated submission difficult, but as we hackers know, virtually nothing is impossible if you really put your mind to it...

    So the task to hand is to convert an IP and Browser String into a cryptographically secure hash that can not be reversed or revealed with a rainbow table. IP addresses and Browser strings both exist in a small finite search space.

    Given this data will be released publicly and is timestamped it is trivial for an attacker to correlate a known IP address and Browser string to these hashes. Given the secret packing data I don't believe this would allow any quantity of computational power to elucidate the packing data and thus create a lookup table, but if that is incorrect I would be good to know now and fix it before our impending first data release.

    While I am expert in the field of medicine my knowledge of crypto and how you attack it is less. SHA3 was chosen for its resistance to length extension attacks and the packing data size to give enough random data for the resultant hash to spread evenly across that space. Maybe that's good enough, maybe it can/should/must be done better.

    Here is a draft version of those hashing functions. Expert commentary appreciated, particularly from cryptographers.

    #!/usr/bin/env perl package SHA3; use strict; use warnings; use Digest::SHA3 qw(sha3_256_hex); use Digest::MD5 qw(md5_hex); use Socket qw( inet_pton AF_INET AF_INET6 ); my @packing = qw( fa13a941b76466850c2558d9ae5d969f e71ab0d8bb54c75b37ad23a449050121 6736564ec6bc9bbc8ba42df565317443 c3e088a5cf247ec0df971c5cb9ee6eec 6cf20d548878cdd82b8f207192f58c80 660a311b8d75d5fb28c73f7e2ec5d25e 377f92899b81ad7c5e1d08b81ccc8904 8e1f27dee8ae3374ae5c462adf37bba5 ccd558ff6b9de48ca22023ead2dbd7a2 ff228ef28ae8544155323180ba070d1b ); print SHA3::sha3_ip(''), "\n"; print SHA3::sha3_ip(''), "\n"; print SHA3::sha3_ip('2001:0db8:0000:0000:0000:8a2e:0370:7334'), "\n"; print SHA3::sha3_ip('2001:0db8:0000:0000:0000:8a2e:0370:7335'), "\n"; print SHA3::sha3_bs('Mozilla'), "\n"; print SHA3::sha3_bs('Win32'), "\n"; =head 2 sha3_ip { Expects a dot quad or an IPv6 address and returns a SHA3_256_hex string or null string for invalid input =cut sub sha3_ip { my $ip = shift; my $pack_format; if ( $ip =~ m/^\d+\.\d+\.\d+\.\d+$/ ) { my $bytes = pack("H32 a4 H32", $packing[0], inet_pton( AF_INET +, $ip ), $packing[7]); my $hash = sha3_256_hex($bytes); return $hash; } elsif ( $ip =~ /^([0-9a-f]{0,4}:){0,7}([0-9a-f]{0,4})$/i ) { my $bytes = pack("H32 a16 H32", $packing[0], inet_pton( AF_INE +T6, $ip ), $packing[7]); my $hash = sha3_256_hex($bytes); return $hash; } warn "Invalid IP:$ip\n"; return ''; } =head 2 sha3_bs { Expects a browser string and returns a SHA3_256_hex string or a null string for invalid input =cut sub sha3_bs { my $bs = shift; unless (length $bs > 4 ) { warn "Insufficient data in browser string $bs"; return ''; } my $bytes = pack("H32 H32 H32", $packing[1], md5_hex($bs), $packin +g[8]); my $hash = sha3_256_hex($bytes); return $hash; }
RFC: Perl<->JSON<->YAML<->Dumper : roundtripping and possibly with unicode
2 direct replies — Read more / Contribute
by bliako
on Apr 09, 2020 at 10:55

    Here is a collection of subroutines for converting between:

    • Perl variables (nested data structures),
    • JSON strings,
    • YAML strings,
    • Data::Dumper output strings

    I really needed one just recently after my older implementation broke because of unicode content and Data::Dumper's obsession with escaping unicode. And here is what I have whipped up for my own use and anyone else's after posting Convert JSON to Perl and back with unicode and getting pointers from haukex, kcott, an anonymous monk and Corion who solved (hopefully for eternity) how to make Data::Dumper unicode escaping optional,(see Corion's answer Re: Convert JSON to Perl and back with unicode).

    Because I don't leave a challenge unchallenged here is the code, in the hope to be released as a module with your comments and suggestions.

    And here is a test script which demonstrates usage and tests unicoded content:

    #!perl -T
    use 5.006;
    use strict;
    use warnings;
    use utf8;
    binmode STDERR, ':encoding(UTF-8)';
    binmode STDOUT, ':encoding(UTF-8)';
    binmode STDIN,  ':encoding(UTF-8)';
    # to avoid wide character in TAP output
    # do this before loading Test* modules
    use open ':std', ':encoding(utf8)';
    use Test::More;
    #use Test::Deep;
    my $num_tests = 0;
    use Data::Roundtrip;
    use Data::Dumper qw/Dumper/;
    my $abc = "abc-αβγ";
    my $xyz = "χψζ-xyz";
    my $json_string = <<EOS;
    $json_string =~ s/\s*$//;
    my $yaml_string = <<EOS;
    $abc: $xyz
    #$yaml_string =~ s/\s*$//;
    my $perl_var = {$abc => $xyz};
    # perl2json
    my $result = Data::Roundtrip::perl2json($perl_var);
    ok(defined $result, "perl2json() called."); $num_tests++;
    ok($result eq $json_string, "perl2json() checked (got: '$result', expected: '$json_string')."); $num_tests++;
    # json2perl
    $result = Data::Roundtrip::json2perl($json_string);
    ok(defined $result, "json2perl() called."); $num_tests++;
    for (keys %$result){
    	ok(exists $perl_var->{$_}, "json2perl() key exists."); $num_tests++;
    	ok($perl_var->{$_} eq $result->{$_}, "json2perl() values are the same."); $num_tests++;
    for (keys %$perl_var){
    	ok(exists $result->{$_}, "json2perl() key exists (other way round)."); $num_tests++;
    # this fails:
    #cmp_deeply($perl_var, $result, "json2perl() checked (got: '".Dumper($result)."', expected: ".Dumper($perl_var).")."); $num_tests++;
    # perl2yaml
    $result = Data::Roundtrip::perl2yaml($perl_var);
    ok(defined $result, "perl2yaml() called."); $num_tests++;
    ok($result eq $yaml_string, "perl2yaml() checked (got: '$result', expected: '$yaml_string')."); $num_tests++;
    # yaml2perl
    $result = Data::Roundtrip::yaml2perl($yaml_string);
    ok(defined $result, "yaml2perl() called."); $num_tests++;
    for (keys %$result){
    	ok(exists $perl_var->{$_}, "yaml2perl() key exists."); $num_tests++;
    	ok($perl_var->{$_} eq $result->{$_}, "yaml2perl() values are the same."); $num_tests++;
    for (keys %$perl_var){
    	ok(exists $result->{$_}, "yaml2perl() key exists (other way round)."); $num_tests++;
    # yaml2json
    $result = Data::Roundtrip::yaml2json($yaml_string);
    ok(defined $result, "yaml2json() called."); $num_tests++;
    ok($result eq $json_string, "perl2yaml() checked (got: '$result', expected: '$json_string')."); $num_tests++;
    # json2yaml
    $result = Data::Roundtrip::json2yaml($json_string);
    ok(defined $result, "json2yaml() called."); $num_tests++;
    ok($result eq $yaml_string, "perl2yaml() checked (got: '$result', expected: '$yaml_string')."); $num_tests++;
    # perl2dump and dump2perl with unicode quoting (default Data::Dumper behaviour)
    my $adump = Data::Roundtrip::perl2dump($perl_var, {'terse'=>1});
    ok(defined $adump, "perl2dump() called."); $num_tests++;
    ok($adump=~/\\x\{3b1\}/, "perl2dump() unicode quoted."); $num_tests++;
    # dump2perl
    $result = Data::Roundtrip::dump2perl($adump);
    ok(defined $result, "dump2perl() called."); $num_tests++;
    for (keys %$result){
    	ok(exists $perl_var->{$_}, "perl2dump() and dump2perl() key exists."); $num_tests++;
    	ok($perl_var->{$_} eq $result->{$_}, "perl2dump() and dump2perl() values are the same."); $num_tests++;
    for (keys %$perl_var){
    	ok(exists $result->{$_}, "perl2dump() and dump2perl() key exists (other way round)."); $num_tests++;
    # perl2dump and dump2perl WITHOUT unicode quoting
    $adump = Data::Roundtrip::perl2dump($perl_var, {'terse'=>1, 'dont-bloody-escape-unicode'=>1});
    ok(defined $adump, "perl2dump() called."); $num_tests++;
    ok($adump!~/\\x\{3b1\}/, "perl2dump() unicode not quoted."); $num_tests++;
    # dump2perl
    $result = Data::Roundtrip::dump2perl($adump);
    ok(defined $result, "dump2perl() called."); $num_tests++;
    for (keys %$result){
    	ok(exists $perl_var->{$_}, "perl2dump() and dump2perl() key exists."); $num_tests++;
    	ok($perl_var->{$_} eq $result->{$_}, "perl2dump() and dump2perl() values are the same."); $num_tests++;
    for (keys %$perl_var){
    	ok(exists $result->{$_}, "perl2dump() and dump2perl() key exists (other way round)."); $num_tests++;

    bw, bliako

Optimizing with Caching vs. Parallelizing (MCE::Map)
5 direct replies — Read more / Contribute
by 1nickt
on Apr 05, 2020 at 11:17

    Mon cher ami Laurent_R recently blogged about his solution to the "extra credit" problem in the Perl Weekly Challenge #54. He showed a solution using memoizing, or caching, to reduce the number of repeated calculations made by a program.

    I wondered about the strategy. Obviously calculating the sequences for numbers up to 1,000,000 without some optimization would be painfully or maybe unworkably slow. But the task looks computation-intensive, so I wanted to see whether more cycles would be more beneficial than caching.

    Here is the solution presented by Laurent:

    This runs on my system pretty quickly:

    real 0m22.596s user 0m21.530s sys 0m1.045s

    Next I ran the following version using mce_map_s from MCE::Map. mce_map_s is an implementation of the parallelized map functionality provided by MCE::Map, optimized for sequences. Each worker is handed only the beginning and end of the chunk of the sequence it will process, and workers communicate amongst themselves to keep track of the overall task. When using mce_map_s, pass only the beginning and end of the sequence to process (also, optionally, the step interval and format).

    use strict; use warnings; use feature 'say'; use Data::Dumper; use MCE::Map; my @output = mce_map_s { my $input = $_; my $n = $input; my @result = $input; while ( $n != 1 ) { $n = $n % 2 ? 3 * $n + 1 : $n / 2; push @result, $n; } return [ $input, scalar @result ]; } 1, 1000000; MCE::Map->finish; @output = sort { $b->[1] <=> $a->[1] } @output; say sprintf('%s : length %s', $_->[0], $_->[1]) for @output[0..19];

    This program, with no caching, runs on my system about five times faster (I have a total of 12 cores):

    real 0m4.322s user 0m27.992s sys 0m0.170s

    Notably, reducing the number of workers to just two still ran the program in less than half the time than Laurent's single-process memoized version. Even running with one process, with no cache, was faster. This is no doubt due to the fact MCE uses chunking by default. Even with one worker the list of one million numbers was split by MCE into chunks of 8,000.

    Next, I implemented Laurent's cache strategy, but using MCE::Shared::Hash. I wasn't really surprised that the program then ran much slower than either previous version. The reason, of course, is that this task pretty much only makes use of the CPU, so while throwing more cycles at it it a huge boost, sharing data among the workers - precisely because the task is almost 100% CPU-bound - only slows them down. Modern CPUs are very fast at crunching numbers.

    I was curious about how busy the cache was in this case, so I wrapped the calls to assign to and read from the hash in Laurent's program in a sub so I could count them. The wrappers look like:

    my %cache; my $sets = my $gets = 0; sub cache_has { $gets++; exists $cache{$_[0]} } sub cache_set { $sets++; $cache{$_[0]} = $_[1] } sub cache_get { $gets++; $cache{$_[0]} }

    The result:

    Sets: 659,948 Gets: 16,261,635
    That's a lot of back and forth.

    So the moral of the story is that while caching is often useful when you are going to make the same calculations over and over, sometimes the cost of the caching exceeds the cost of just making the calculations repeatedly.

    Hope this is of interest!

Log In?

What's my password?
Create A New User
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others having an uproarious good time at the Monastery: (5)
As of 2020-04-10 10:27 GMT
Find Nodes?
    Voting Booth?
    The most amusing oxymoron is:

    Results (49 votes). Check out past polls.