Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine
 
PerlMonks  

Meditations

( #480=superdoc: print w/replies, xml ) Need Help??

If you've discovered something amazing about Perl that you just need to share with everyone, this is the right place.

This section is also used for non-question discussions about Perl, and for any discussions that are not specifically programming related. For example, if you want to share or discuss opinions on hacker culture, the job market, or Perl 6 development, this is the place. (Note, however, that discussions about the PerlMonks web site belong in PerlMonks Discussion.)

Meditations is sometimes used as a sounding-board — a place to post initial drafts of perl tutorials, code modules, book reviews, articles, quizzes, etc. — so that the author can benefit from the collective insight of the monks before publishing the finished item to its proper place (be it Tutorials, Cool Uses for Perl, Reviews, or whatever). If you do this, it is generally considered appropriate to prefix your node title with "RFC:" (for "request for comments").

User Meditations
Breathing life into the (Emacs) cperl-mode
2 direct replies — Read more / Contribute
by haj
on Jul 02, 2020 at 11:29
    This also appeared on blogs.perl.org

    If you are an Emacs user, you might know or even use cperl-mode. I am using it, more or less since my first days with Perl. Back then, newsgroups were a thing, and Ilya Zakharevich recommended it occasionally. In older times cperl-mode was shipped with Perl, today it is part of Emacs.

    If you use cperl-mode, you might also have had a look at the code and noticed that it hasn't seen much love in the last decade or so. Perl, on the other hand, evolves. Version 5.32, for example, brings a new infix operator, and some future version might bring Cor. Wouldn't it be nice if cperl-mode understood these new keywords?

    I'm on my way to get familiar with emacs-lisp, ERT, and other stuff to see what I can do. Ideas, contributions, comments, bug requests and criticism is welcome - There's a GitHub repository to get started.

RFC: Destructuring Assignment (aka Unpacking aka Type Patterns) in pure Perl
3 direct replies — Read more / Contribute
by LanX
on Jul 02, 2020 at 09:41

    At yesterday online meeting of Germany PM we had a discussion about Destructuring Assignment aka unpacking aka Type Patterns which is available in languages like JS (ES6) and Haskell.

    I claimed it shouldn't be to difficult to do this in pure Perl, this is what I got so far.

    NB: This is a proof of concept !

    Neither syntactic sugar nor performance are at it's limits.

    (you know "release often ... yadda yadda" ;-)

    Grabbing of (almost greedy) lists @arr and %hsh not implemented yet.

    Comments welcome.

    use strict; use warnings; use Data::Dump qw/pp dd/; use Scalar::Util qw/reftype/; use autovivification; use Test::More; use constant DBG => 0; my @rules; sub parse_arr (\@$) { my ( $arr, $path ) = @_; for ( my $i=0; $i <= @$arr; $i++ ) { my $item = $arr->[$i]; parse( $item, "${path}[$i]"); } } sub parse_hsh (\%$) { my ( $hsh, $path ) = @_; while ( my ($key,$val) = each %$hsh ) { parse( $val, "${path}{$key}"); } } sub parse { my ( $item, $path ) = @_; $path //= ''; my $type = ref $item; return unless $type; if ( $type eq "GRAB") { push @rules, [ $item->{ref} , $path ]; } elsif ( $type eq "ARRAY" ) { parse_arr( @$item, $path ) } elsif ( $type eq "HASH" ) { parse_hsh( %$item, $path ) } else { warn "Unknown $type"; } } sub assign (++){ my ($pattern,$in) =@_; @rules = (); parse($pattern); warn pp @rules if DBG; no autovivification; for my $rule (@rules) { my ($ref,$path) = @$rule; my $code = "\$in->$path"; $$ref = eval($code); } } sub set (\[@$%]) { bless {ref => $_[0]}, "GRAB"; } my @pattern = ( 1,2,3, set my $y , [ 5,6, set my $z ] ); my $src = [ 1,2,3, 42 , [5,6, 666] ]; assign @pattern => $src; is_deeply ( [$z,$y], [666, 42], "AoA grabbing different levels" ); assign { u => [undef, set( my $u )], v => set my $v } => { u => [1,123] , v => "VVV"} ; is_deeply ( [$u,$v], [123,"VVV"], "HoA grabbing different levels" ); done_testing;

    C:/Perl_524/bin\perl.exe d:/exp/grab_assign.pl ok 1 - AoA grabbing different levels ok 2 - HoA grabbing different levels 1..2

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

Modernizing the Postmodern Language?
3 direct replies — Read more / Contribute
by WaywardCode
on Jun 29, 2020 at 12:44

    Fellow Monks,

    Today I re-read the 1999 talk: "Perl, the first postmodern computer language." I couldn't help but laugh to see that--as of this writing--the #1 advertised post on the right side-panel is the Perl 7 announcement, with the tag-line Perl 5 with modern defaults.

    Do we want our postmodern language modernized? It may not be the sound of one hand clapping, but still worthy of some meditation, I think. I recently noted that Perl 5.30 will run almost all of the Perl 1 test suite unmodified (the rare failures are on subroutines called with do whatever();). If I understand the Perl 7 plan correctly, Perl 7 will run approximately zero percent of the Perl 1 test suite due to mandating strict and dropping bareword filehandles. Does that matter? I guess not, but it feels like a loss to me.

    The virtues extolled in the talk were (1) incorporating what rules (2) leaving out what sucks and (3) letting the duct work show. To uphold those virtues, it seems to me that newer Perls should look around and incorporate things that both (1) rule and (2) aren't already in Perl. Perl should primarily accrete, and occasionally mutate. So, Monks, should it concern me that one of the two motivations given for Perl 7 is the ability to remove syntax? Should it concern me that SawyerX's Guac project (youtube) calls the regularized, easily-parsable subset of Perl "Standard Perl"? Despite the don't-worry-disclaimers, if the pumpking can call a language without autoquoting or heredocs Standard, that speaks volumes to me about where his head is at. Perhaps I am overreacting.

Java Recommendations for a Perl developer
6 direct replies — Read more / Contribute
by talexb
on Jun 16, 2020 at 22:12

    I'm currently doing Perl development, but there's a fair bit of Java code at work .. so I'm wondering if any monks can recommend a book, website or approach they found useful when getting up to speed.

    My background in Perl started in about '98, after about 15 years of developing C (with a few years of Pascal thrown in for good measure). Before that, I did development in a variety of assemblers (x86, 68000, 6809). I never did any C++, so any OO nomenclature I've picked up has been via Perl's take on OO.

    What I've seen of Java is that it's very hierarchical, and is also exception based -- Try::Tiny seems to neatly encapsulate the try / catch of Java.

    Alex / talexb / Toronto

    Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

n-dim array generator
1 direct reply — Read more / Contribute
by jo37
on Jun 16, 2020 at 15:54

    Every now and then I need to create a n-dimensional array from some given formula, where n almost always is 1 or 2.
    I've been searching for a module providing such functionality without success. It would be a surprise if there really wasn't such a thing.

    But so I rolled my own and came to this:

    # arg_1 to arg_n specify size in dimension k, # last arg is a reference to a sub # - expecting n arguments x1 ... xn # - returning the desired value at $na->[x1]...[xn] sub narray { my $size = shift; my $val = pop; my $na; for my $i (0 .. $size - 1) { $na->[$i] = @_ ? narray(@_, sub {&$val($i, @_)}) : &$val($i); } $na; }
    which can be used in various ways and is not limited in dimension:
    $ar = narray(3, sub {$_[0]}); # [0, 1, 2] $mr = narray(3, 3, sub {$_[0] == $_[1] || 0}); # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    just to get an idea.

    Does anybody know of a module providing something similar?
    Otherwise: Is this of use for someone else or would it even fit into something existing?

    UPDATE 1: This would be more perlish if the sub ref argument came first and the sub itself was prototyped:

    sub narray (&@); $ar = narray {$_[0]} 3;

    UPDATE 2: Incorporating LanX's suggestion of using map into UPDATE 1 results in:

    sub narray (&@); sub narray (&@) { my $val = shift; my $size = shift; [map {my $i = $_; @_ ? narray {&$val($i, @_)} @_ : &$val($i)} (0 . +. $size - 1)]; } my $na = narray {"<@_>"} 3, 2 # [['<0 0>', '<0 1>'], ['<1 0>', '<1 1>'], ['<2 0>', '<2 1>']]

    Greetings,
    -jo

    $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
foreach $1
1 direct reply — Read more / Contribute
by choroba
on Jun 05, 2020 at 17:49
    When answering a question on StackOverflow, I noticed an interesting anti-idiom (antidiom?) I haven't seen before (and haven't thought of trying):
    foreach $1 (@array)

    Interestingly, you can use it under strict and warnings and Perl doesn't protest. But $1 is aliased to the elements of the array and loses its magic behaviour:

    #!/usr/bin/perl use warnings; use strict; for $1 ('a' .. 'h') { print $1 if "A" =~ /(.)/; }

    Can you guess what the output is?

    So, if you ever get the idea of using special variables for something else than what they were designed to, stop.

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
(Ab)using the Regex Engine
2 direct replies — Read more / Contribute
by jo37
on May 25, 2020 at 05:19

    The TASK #2 of perl-weekly-challenge-061 was to split a given string into certain subparts. There were two solutions that (ab)use Perl's regular expression engine to get all matches for the leading part of a regular expression. Though being one of the authors, I'm not so sure about this approach. How smart is the engine allowed to be? Is there a way to guarantee that it actually tries all possibilities?

    The section Embedded Code Execution Frequency in perlre says:

    How non-accepting pathways and match failures affect the number of times a pattern is executed is specifically unspecified and may vary depending on what optimizations can be applied to the pattern and is likely to change from version to version.
    This is a rather clear statement, that the proposed solutions may fail in future versions of Perl. But does this hold in any case? See examples in this program:

    #!/usr/bin/perl use strict; use warnings; my $match = qr[([ab]+)([ab]+)]; my $str = 'aba'; $str =~ /^ $match $ (?{ print "1: $1-$2\n" }) [c] /x; $str =~ /^ $match $ (?{ print "2: $1-$2\n" }) (?!) /x; $str =~ /^ $match $ (??{ print "3: $1-$2\n"; qr[(?!)] }) /x; __DATA__ 2: ab-a 2: a-ba 3: ab-a 3: a-ba

    Explanations to the numbered samples:

    1. There is a non-matched character class [c] at the end of the pattern. In my copy of the "Camel Book" (3rd Edition, 2000) it is stated that the engine is smart enough to optimize away the match attempt if there is a single character, but not if it is inside a character class. The engine has become smarter since then: the (?{CODE}) block is not executed.
    2. Currently, using a negative look-ahead assertion as a non-matcher outsmarts the engine into trying to match the string. I reckon that the matching attempt might be optimized away in future versions.
    3. With a small change, the resulting pattern remains the same but isn't known to the regex engine from the beginning, as the final part now is the returned value from a (??{CODE}) block. To decide if there is a match, the CODE has to be executed and thus cannot be optimized away. Would sniffing at the CODE and detecting that it always returns something non-matchable be "legal"? I feel kind of safe with this but I may be wrong.
    Would you agree with this statement, that seems to be in contrast to the quotation above?
    A (??{CODE}) block is guaranteed to be executed, if the failing or success of a pathway containing this block solely depends on the returned subexpression.
    Could we even have a zero-width assertion like (?!?{CODE}) that always fails but must not be optimized away in the sense of the previous proposition?

    I'd be glad to see your opinions.

    BTW: What matches and what is matched? Is a regex matching a string or is a string matching a regex? I don't know.

    Greetings,
    -jo

    $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
Converting Hashes to Objects
9 direct replies — Read more / Contribute
by haukex
on May 17, 2020 at 09:13

    Inspiration whacked me in the head yesterday, and I realized that it was pretty easy to take an arbitrary hash reference and turn it into an object, so that instead of writing $hash->{key} I can just write $hash->key, saving two characters and giving me protection from typos. Enter Util::H2O:

    use Util::H2O; my $hash = h2o { foo => "bar", x => "y" }, qw/ more keys /; print $hash->foo, "\n"; # prints "bar" $hash->x("z"); # setter

    Deeply nested structures can be converted too...

    my $struct = { hello => { perl => { world => "yay" } } }; h2o -recurse, $struct; print $struct->hello->perl->world, "\n"; # prints "yay"

    The function blesses its hashref argument into a new package, and returns that hashref as well. The hashrefs otherwise remain the same as before, so you can still use $hash->{key} and so on if you like. The newly created package is cleaned up by default when the object is destroyed (configurable). These objects are of course a bit less performant than a regular hashref, but this wasn't intended for high-performance applications.

    Update: As of the freshly-released v0.06, the h2o function also locks the hash's keyset by default, to prevent typos when using the hash like a regular hashref. This can be disabled with the -lock=>0 option. /Update

    I quickly realized I could do more with this, like add methods to these objects:

    my $obj = h2o -meth, { one => 123, two => 456, sum => sub { my $self = shift; return $self->one + $self->two; } }; print $obj->sum, "\n"; # prints "579"

    And I could even make it easy to create classes; the -new option generates a constructor:

    h2o -class=>'Point', -new, -meth, { angle => sub { my $self = shift; atan2($self->y, $self->x) }, }, qw/ x y /; my $one = Point->new(x=>1, y=>2); my $two = Point->new(x=>3, y=>4); $two->y(5); printf "%.4f\n", $two->angle; # prints 1.0304

    Yes, I realize that might be taking it a little bit too far, but it was easy to implement - plus, I figured this could possibly be useful for whipping up mock objects in tests. And yes, I'm aware similar tools exist ;-)

Ways of quoting
4 direct replies — Read more / Contribute
by Aaronrp
on May 14, 2020 at 19:16

    I knew that you could use :: to make sure that a package name wasn't used as a subroutine, but I didn't really understand what it did.

    $ perl -MO=Deparse -e '$x = anything_could_go_here::' $x = 'anything_could_go_here'; -e syntax OK

    Just like single quotes?

    TIMTOWTDI I guess...

    So we have

    q{} '' => :: qw{} - (as in -bareword)

    And of course

    no strict; $x = bareword;

    not to mention

    "" and qq{} `` and qx{}

    Somebody will tell me that I missed some, I'm sure.

[OT] Am I just a bad programmer?
11 direct replies — Read more / Contribute
by maurocavendish
on May 08, 2020 at 07:33
    Hi, fellow monks!

    I've come to a point in my programming career when I seriously doubt that I have it in myself to go on. I worked years as a Perl programmer/system integrator. Few years ago, my company pulled me out of a project, and then it began. I was constantly moved on new projects, with new technologies stack, ranging from PHP, Python, AngularJS, Struts, Spring Boot, Vue, SQL Server, Node, React. You name it. As I understand, I am now under the umbrella of the Java Competence Center, so they expect me to deliver on that.

    I never received any guidance, training, or support. Every project since then I was, sometimes literally, alone. Of course, my boss wants me to deliver fast and clean. At the moment, I am working on a huge Java project, again on my own. I'm in charge of analysis, development, sometimes testing. Also, no one gave me a proper introduction to the app we are developing, that is heavy on financial business logic. All the while, despite having worked in the field for more than 10 years, my current job title is Junior Software Developer, with no hope of advancing given the current situation.

    I feel like I've reached a wall. I cannot deliver. I have constant migraines. All in all, I fear I'll lose my job. Is this just a sign that I'm not cut out for the job, or that I'm perhaps too old to keep up (I'm 42)? I still study in my spare time what fascinates me, but I'm beginning to hate programming, and I'm not putting in the same amount of work into my education as I did when I started. I'd need some real side projects under my belt to brush up my resume, but I do not have the spare mental capacity to push even outside the workplace.

    Unfortunately, it seems that here in Italy all the coding jobs are like so, therefore I'm questioning my worth in the market as a programmer. Soon I think I'll try to switch company, but I'm not convinced anything would really change. Is a valuable professional programmer required to just go along with everything the company throws at him, or is it more sensible to craft a specific role in the spare time and go for it elsewhere?

    I understand this post is highly OT, but I could seriously use some help for maturing a better understanding.

    Regards.

    Training is everything
Why a regex *really* isn't good enough for HTML and XML, even for "simple" tasks
8 direct replies — Read more / Contribute
by haukex
on May 05, 2020 at 07:41

    TL;DR: Working code below!

    Say you "just" want to extract some links. Are you sure the HTML's formatting will never change (whitespace, order of attributes, its structure, and so on)? Well, here's some perfectly valid HTML - good luck!

    Solutions that work on all of the above:

lexical vs. local file handles
3 direct replies — Read more / Contribute
by jo37
on Apr 18, 2020 at 14:26

    EDIT:
    I use used bareword filehandles. In the course of this discussion I was convinced not to do so.

    I've seen many Perl programmers advocating the usage of lexical scalars instead. Though there are good reasons for this, it has a drawback. Filehandles in Perl are special on a syntactical level. The compiler is capable of catching errors like this:

    open FH, '<', 'myfile'; print FH, "something\n";

    by emitting an error:

    No comma allowed after filehandle at ...

    Whereas strange things happen at runtime if you use:

    open my $fh, '<', 'myfile'; print $fh, "something\n";
    NB: Someone else pointed to this subtle difference but sadly I don't remember who and where it was. She/he should be credited here.

    So I stay with barewords and try to avoid the problems caused by the usage of global variables using this idiom:

    open local *FH, '<', 'myfile'; while (<FH>) { # do something } close FH;

    Some features:

    • usages of equally named file handles do not affect each other in different scopes
    • the usage in a sub does no harm if the sub is in the same package and it uses the same idiom
    • the usage in a sub of a different package does not harm.
    So, as long as I follow this pattern im my own package, I feel kind of safe. Side effects of localizing a glob are easily circumvented.

    The question remains: Do I miss something here? Do you see any pitfalls using this approach?

    Here is an example demonstrating the issues I'm aware of.

    #!/usr/bin/perl use Test2::V0; package Foobar; our $foobar = "foobar\n"; our @foobar; sub use_fh { # unlocalized use of FH in separate package open FH, '<', \$foobar; @foobar = <FH>; close FH; } package main; my $foo = "foo\n"; my $bar = "bar\n"; my $baz = "baz\n"; my @baz; sub use_localized_fh { # protect caller's FH open local *FH, '<', \$baz; @baz = <FH>; close FH; } sub close_fh { # unlocalized use of FH close FH; } # open now, use later open FH, '<', \$bar; my @foo; # create new scope { # use localized FH, protecting handle opened on \$bar open local *FH, '<', \$foo; # call sub that uses localized FH use_localized_fh; # call sub in other package that uses FH Foobar::use_fh; # FH still intact @foo = <FH>; close FH; } is \@baz, [$baz], 'got $baz in sub'; is \@Foobar::foobar, [$Foobar::foobar], 'got $Foobar::foobar in foreig +n sub'; is \@foo, [$foo], 'good: got $foo'; { open local *FH, '<', \$foo; # call sub that closes FH close_fh; @foo = <FH>; close FH; } is \@foo, [], 'bad: FH was closed in sub'; # FH at this scope is still untouched my @bar = <FH>; close FH; is \@bar, [$bar], 'good: got $bar'; done_testing;

    Greetings,
    -jo

    $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
MCE::Flow + Caching via File::Map
1 direct reply — Read more / Contribute
by marioroy
on Apr 14, 2020 at 21:00

    Dearest Monks of the Monastery,

    Recently, I tried computing the longest Collatz progression here. I was pleasantly surprised by File::Map's performance. Our fellow monk Laurent_R posted an update to his original code for computing the Collatz sequences. And what a speedup it is.

    Here, I want to try Laurent's code and run parallel. Yes, with caching. The first thing I do typically is apply optimizations to the serial implementation. Because you know, just think of any domino impact running parallel might have. See my update to Laurent's code. That went well and so will take collatz3_e.pl there and use File::Map here. This is exciting for me because this is a great use case for File::Map for running the algorithm in parallel. But with all things, a serial version using File::Map is needed for comparison.

    Update 1: Map using map_anonymous, previously map_file.
    Update 2: Use 16-bit signed integer for pack/unpack.

    Note: The OS must have ~ 3.8 GiB of available memory to compute size 1e9.

    collatz3_filemap:

    #!/usr/bin/env perl use strict; use warnings; use File::Map qw/map_anonymous unmap/; my $size = shift || 1e6; $size = 1e6 if $size < 1e6; # minimum $size = 1e9 if $size > 1e9; # maximum map_anonymous my $cache, $size * 2, 'shared'; # init cache with -1's, then set 0, 1, 2 substr($cache, 0, $size * 2, ( my $neg1 = pack('s', -1) ) x $size); substr($cache, $_ * 2, 2, pack('s', $_)) for 0..2; my @seqs; sub collatz_seq { my ( $seq_beg, $seq_end ) = @_; my ( $n, $steps, $tmp ); for my $input ( $seq_beg..$seq_end ) { $n = $input, $steps = 0; while ( $n != 1 ) { $steps += unpack('s', $tmp), last if ($n < $size && ($tmp = substr($cache, $n * 2, 2)) n +e $neg1); $n % 2 ? ( $steps += 2, $n = (3 * $n + 1) >> 1 ) : ( $steps += 1, $n = $n >> 1 ); } substr($cache, $input * 2, 2, pack('s', $steps)) if $input < $ +size; push @seqs, [ $input, $steps ] if $steps > 400; } } collatz_seq(2, $size); unmap $cache; @seqs = ( sort { $b->[1] <=> $a->[1]} @seqs )[ 0..19 ]; printf "Collatz(%5d) has sequence length of %3d steps\n", @$_ for @seqs;

    collatz3_parallel:

    This is the serial implementation converted to run parallel. The collatz_seq function is identical, no changes there.

    #!/usr/bin/env perl use strict; use warnings; use File::Map qw/map_anonymous unmap/; use MCE::Flow; use MCE::Candy; my $size = shift || 1e6; $size = 1e6 if $size < 1e6; # minimum $size = 1e9 if $size > 1e9; # maximum map_anonymous my $cache, $size * 2, 'shared'; # init cache with -1's, then set 0, 1, 2 substr($cache, 0, $size * 2, ( my $neg1 = pack('s', -1) ) x $size); substr($cache, $_ * 2, 2, pack('s', $_)) for 0..2; # local to workers and the manager process my @seqs; sub collatz_seq { my ( $seq_beg, $seq_end ) = @_; my ( $n, $steps, $tmp ); for my $input ( $seq_beg..$seq_end ) { $n = $input, $steps = 0; while ( $n != 1 ) { $steps += unpack('s', $tmp), last if ($n < $size && ($tmp = substr($cache, $n * 2, 2)) n +e $neg1); $n % 2 ? ( $steps += 2, $n = (3 * $n + 1) >> 1 ) : ( $steps += 1, $n = $n >> 1 ); } substr($cache, $input * 2, 2, pack('s', $steps)) if $input < $ +size; push @seqs, [ $input, $steps ] if $steps > 400; } } my $chunk_size; $chunk_size = int( $size / MCE::Util::get_ncpu() / 80 + 1 ); $chunk_size += 1 if $chunk_size % 2; mce_flow_s { max_workers => MCE::Util::get_ncpu(), chunk_size => $chunk_size, bounds_only => 1, gather => MCE::Candy::out_iter_array(\@seqs), }, sub { my ($mce, $chunk_ref, $chunk_id) = @_; collatz_seq(@{ $chunk_ref }); @seqs > 20 ? MCE->gather($chunk_id, ( sort { $b->[1] <=> $a->[1] } @seqs +)[ 0..19 ]) : MCE->gather($chunk_id, @seqs); @seqs = (); }, 2, $size; MCE::Flow->finish; unmap $cache; @seqs = ( sort { $b->[1] <=> $a->[1]} @seqs )[ 0..19 ]; printf "Collatz(%5d) has sequence length of %3d steps\n", @$_ for @seqs;

    Results:

    Caching using File::Map obviously will have overhead plus having to serialize/unserialize using pack/unpack.

    $ time perl collatz3_a.pl 1e7 AMD 3970x, Docker Container, Ubuntu + Perlbrew Perl 5.30.1 collatz3_a.pl 1e7 13.130s (a) original, accepts argument collatz3_b.pl 1e7 12.394s (b) a + replaced division with >> 1 collatz3_c.pl 1e7 12.261s (c) b + removed 1 level of branching collatz3_d.pl 1e7 9.170s (d) c + reduced loop iterations collatz3_e.pl 1e7 7.681s (e) d + less caching collatz3_filemap 8.889s 1 core collatz3_parallel 8.982s 1 core collatz3_parallel 4.548s 2 cores collatz3_parallel 2.378s 4 cores collatz3_parallel 1.233s 8 cores collatz3_parallel 0.661s 16 cores collatz3_parallel 0.408s 32 cores Collatz(8400511) has sequence length of 686 steps Collatz(8865705) has sequence length of 668 steps Collatz(6649279) has sequence length of 665 steps Collatz(9973919) has sequence length of 663 steps Collatz(6674175) has sequence length of 621 steps Collatz(7332399) has sequence length of 616 steps Collatz(7532665) has sequence length of 616 steps Collatz(5649499) has sequence length of 613 steps Collatz(8474249) has sequence length of 611 steps Collatz(6355687) has sequence length of 608 steps Collatz(8847225) has sequence length of 606 steps Collatz(9533531) has sequence length of 606 steps Collatz(6635419) has sequence length of 603 steps Collatz(9953129) has sequence length of 601 steps Collatz(7464846) has sequence length of 598 steps Collatz(7464847) has sequence length of 598 steps Collatz(3732423) has sequence length of 597 steps Collatz(5598635) has sequence length of 595 steps Collatz(8397953) has sequence length of 593 steps Collatz(6298465) has sequence length of 590 steps

    Some will say, let's add cores. Some will say, let's improve the algorithm. Few might say, let's try both. It turns out that caching and parallel are possible. It's unbelievable, TBH. Processors are equipped with many CPU cores. I made the time to try and retry. Mainly, for future Monks to the Monastery, way after I'm gone. Years ago the saying was, "IO and Parallel" isn't possible. Input IO in MCE is sequential, not random.

    What I have witnessed tonight is that Meta::Cpan is a treasure box. In other words, a big gigantic box of Legos. I opened the box and picked out File::Map, MCE::Flow and then went over to this wonderful Monastery. There I looked for Laurent_R's code.

    I tried this not knowing what to expect. This is my first time using File::Map with many workers.

    Regards, Mario

Shared DBI handle supporting threads and processes
5 direct replies — Read more / Contribute
by marioroy
on Apr 12, 2020 at 21:58

    Dearest Monks,

    Have you ever wanted to share a DBI handle?

    Let's imagine an environment that consists of 200 compute blades. Each blade has 10 CPU cores, 20 logical cores total with hyperthreading/SMT enabled. That might be a lot of DB connections 4,000 (200 x 20) and IMHO not graceful at all. Better yet, imagine an environment with 400 compute blades. These are the new generations having 32 cores (64 logical cores with SMT enabled). Certainly ( 400 x 64 = 25,600 cores ) may be too much for the DB to handle.

    Fortunately, there is a way. One DB connection per blade, no matter the number of CPU cores, is possible with Perl and MCE::Shared. The code that follows is based on my reply to 1nickt's elegant MCE demonstration.

    First attempt

    Creating a shared DBI handle is not a problem. Unfortunately, it does not work with STMT objects failing due to the STMT object looking and saying, wait a minute, this is not a DBI object. Ah...

    my $dbh = MCE::Shared->share({ module => 'DBI', new => 'connect' }, $d +sn, $user, $password, $params );

    Second attempt

    Another way is writing a wrapper class with the things you need. Think of MCE::Shared as a proxy server. It does nothing more than passing the method name you want to call and arguments over to the shared-manager process. Likewise, returning data on the way back.

    Update: Added missing 'do' method to the shared class. Also, updating a record.

    Schema:

    create table mytable( field1 integer, field2 varchar(24), field3 varchar(24), field4 varchar(24), field5 varchar(24) );

    Perl:

    use strict; use warnings; use Data::GUID; use DBD::Pg; use SQL::Abstract; use Tie::Cycle; use MCE::Loop max_workers => 4; use MCE::Shared; my $sqla = SQL::Abstract->new; my @cols = map {"field$_"} 1..5; # https://www.cattlegrid.info/2006/06/13/write-no-more-sql-abstract-it +.html my $ins_sql = $sqla->insert('mytable', { map { $_ => '' } @cols }); my $sel_sql = $sqla->select('mytable', 'count(*)', { field2 => '' }); my $upd_sql = $sqla->update('mytable', { field2 => '' }, { field2 => ' +' }); #--------------------------------------------------------------------# package My::DBI { use DBI; sub new { my ( $class, $dsn, $user, $password, $params ) = @_; my $self = {}; # MCE::Shared will emit the error and exit if fail to connect $self->{DBH} = DBI->connect($dsn, $user, $password, $params); $self->{STMT} = {}; bless $self, $class; } sub prepare_cached { my ( $self, $key, $sql ) = @_; $self->{STMT}{$key} = $self->{DBH}->prepare_cached($sql); 1; } sub do { my $self = shift; $self->{DBH}->do(@_); } sub execute { my ( $self, $key ) = ( shift, shift ); if ( my $stmt = $self->{STMT}{$key} ) { $stmt->execute(@_); } } sub fetchrow_array { my ( $self, $key ) = ( shift, shift ); if ( my $stmt = $self->{STMT}{$key} ) { $stmt->execute(@_); $stmt->fetchrow_array; } } sub finish { my ( $self, $key ) = @_; if ( $key ) { $self->{STMT}{$key}->finish if $self->{STMT}{$key}; } else { $self->{STMT}{$_}->finish for keys %{ $self->{STMT} }; } return 1; } sub disconnect { my ( $self ) = @_; $self->finish; $self->{DBH}->disconnect; 1; } }; #--------------------------------------------------------------------# my $dsn = 'DBI:Pg:dbname=test_db;host=localhost;port=5432'; my $sdb = MCE::Shared->share( { module => 'My::DBI' }, $dsn, $ENV{USER}, undef, { AutoCommit => 1, RaiseError => 1, PrintError => 1 }, ); $sdb->prepare_cached('ins_sth', $ins_sql); $sdb->prepare_cached('sel_sth', $sel_sql); $sdb->prepare_cached('upd_sth', $upd_sql); mce_loop { my ($mce, $chunk, $chunk_id) = @_; for my $record( @{$chunk} ) { $sdb->execute('ins_sth', @{$record}); my $field2_old = $record->[1]; my $field2_new1 = Data::GUID->new->as_base64; my $field2_new2 = Data::GUID->new->as_base64; # update using a prepared statement $sdb->execute('upd_sth', $field2_new1, $field2_old); # update using the dbh handle inside the shared class my ($query, @bind) = $sqla->update( 'mytable', { field2 => $field2_new2 }, { field2 => $field2_new1 }, ); $sdb->do($query, undef, @bind); # pass any arguments for execute inside the shared class my ($count) = $sdb->fetchrow_array('sel_sth', $field2_new2); # count is 1 due to selecting field2 = $field2_new2 my $msg = sprintf 'wid %s; chnk %s; ins %s; cnt %s', $mce->wid, $chunk_id, $record->[0], $count; MCE->say($msg); } } get_sample_data(); # ^^ do not pass @{ get_sample_data() } to mce_loop # it will not work if @{ [ has 1 element ] } # pass the array ref instead, MCE accepts it MCE::Loop->finish; $sdb->disconnect; #--------------------------------------------------------------------# sub get_sample_data { tie my $value1, 'Tie::Cycle', [ 40 .. 49 ]; return [ map { [ $value1, map { Data::GUID->new->as_base64 } 0..3] } 1..1000 ]; }

    Add to the My::DBI class any DBI/STMT methods that your application uses. The code is straight forward I hope. The fetchrow_array is typically preceded with an execute. So the method in the shared class handles both execute and fetchrow_array. This is important. Likewise, be sure to pass the execute arguments when calling fetchrow_array in the application.

    Well, the wrapper class works very well. The number of CPU cores keeps increasing every couple of years. Meaning that new problems emerge and so do possibilities.

    Kind regards, Mario

Longest Collatz using MCE::Flow, Inline::C, and GCC compiler intrinsics
1 direct reply — Read more / Contribute
by marioroy
on Apr 11, 2020 at 22:52

    Dearest Monks,

    My mind has been amused with Collatz conjecture. See 1nickt's post about obtaining the top 20 sequences. Below is code for obtaining the longest progression. Here I try a GCC compiler intrinsic to further increase performance. That went well and so updated my prior post adding collatz_count_c2 there.

    use strict; use warnings; use feature 'say'; use MCE::Flow; use Inline C => Config => CCFLAGSEX => '-O2 -fomit-frame-pointer', clean_after_build => 0; use Inline C => <<'END_OF_C_CODE'; #include <stdlib.h> #include <stdint.h> #if defined(_WIN32) #define strtoull _strtoui64 #endif void collatz_longest_c1( SV* _beg_n, SV* _end_n ) { uint64_t beg_n, end_n, i, n, steps; uint64_t number = 0, highest = 0; Inline_Stack_Vars; #ifdef __LP64__ beg_n = SvUV( _beg_n ); end_n = SvUV( _end_n ); #else beg_n = strtoull( SvPV_nolen( _beg_n ), NULL, 10 ); end_n = strtoull( SvPV_nolen( _end_n ), NULL, 10 ); #endif for ( i = end_n; i >= beg_n; i-- ) { n = i, steps = 0; /* count using the T(x) notation */ do { n % 2 ? ( steps += 2, n = (3 * n + 1) >> 1 ) : ( steps += 1, n = n >> 1 ); } while ( n != 1 ); if ( steps >= highest ) { number = i, highest = steps; } } Inline_Stack_Reset; Inline_Stack_Push( sv_2mortal( newSVuv(number ) ) ); Inline_Stack_Push( sv_2mortal( newSVuv(highest) ) ); Inline_Stack_Done; } void collatz_longest_c2( SV* _beg_n, SV* _end_n ) { uint64_t beg_n, end_n, i, n, steps; uint64_t number = 0, highest = 0; Inline_Stack_Vars; #ifdef __LP64__ beg_n = SvUV( _beg_n ); end_n = SvUV( _end_n ); #else beg_n = strtoull( SvPV_nolen( _beg_n ), NULL, 10 ); end_n = strtoull( SvPV_nolen( _end_n ), NULL, 10 ); #endif /* based on GCC compiler intrinsics demonstration by Alex Shirley +*/ /* https://stackoverflow.com/questions/38114205/c-collatz-conjectu +re-optimization */ /* https://www.go4expert.com/articles/builtin-gcc-functions-builti +nclz-t29238 */ for ( i = beg_n; i <= end_n; i++ ) { n = i, steps = 0; if ( n % 2 == 0 ) { steps += __builtin_ctz(n); /* account for all evens */ n >>= __builtin_ctz(n); /* always returns an odd */ } /* when we enter we're always working on an odd number */ do { n = 3 * n + 1; steps += __builtin_ctz(n) + 1; /* account for odd and even + */ n >>= __builtin_ctz(n); /* always returns an odd */ } while ( n != 1 ); if ( steps > highest ) { number = i, highest = steps; } } Inline_Stack_Reset; Inline_Stack_Push( sv_2mortal( newSVuv(number ) ) ); Inline_Stack_Push( sv_2mortal( newSVuv(highest) ) ); Inline_Stack_Done; } END_OF_C_CODE sub collatz_longest { my ( $beg_n, $end_n ) = @_; my ( $number, $highest ) = ( 0, 0 ); my ( $i, $n, $steps ); for ( $i = $end_n; $i >= $beg_n; $i-- ) { $n = $i, $steps = 0; # count using the T(x) notation $n % 2 ? ( $steps += 2, $n = (3 * $n + 1) >> 1 ) : ( $steps += 1, $n = $n >> 1 ) while $n != 1; $number = $i, $highest = $steps if ( $steps >= $highest ); } return ( $number, $highest ); } no warnings 'once'; #*collatz = \&collatz_longest; # choose collatz here #*collatz = \&collatz_longest_c1; # using T(x) notation *collatz = \&collatz_longest_c2; # using compiler intrinsics my $m = shift || '1e7'; my ( @sizes, $chunk_size ); $chunk_size = int( $m / MCE::Util::get_ncpu() / 80 + 1 ); $chunk_size += 1 if $chunk_size % 2; mce_flow_s { max_workers => MCE::Util::get_ncpu(), chunk_size => $chunk_size, gather => \@sizes, bounds_only => 1, }, sub { MCE->gather([ collatz( @{ $_[1] } ) ]); }, 1, $m; MCE::Flow->finish; # Output the longest progression for the initial starting number. # https://en.wikipedia.org/wiki/Collatz_conjecture my $highest = ( sort { $b->[1] <=> $a->[1] } @sizes )[ 0 ]->[ 1 ]; say "Longest Collatz (index and value)"; say "@$_" for ( sort { $a->[0] <=> $b->[0] } grep { $_->[1] == $highest } @sizes )[ 0..0 ];

    Output

    The times include launching Perl, loading modules, spawning workers, reaping workers, and output (~ 0.100 seconds).

    This outputs the longest progression for the number of steps to reach 1.
    These numbers are the lowest ones with the indicated step count.

    1e7 : 8400511 685 1 core collatz_longest 1m16.034s collatz_longest_c1 0m01.868s collatz_longest_c2 0m00.778s 2 cores collatz_longest 0m37.912s collatz_longest_c1 0m00.965s collatz_longest_c2 0m00.422s 4 cores collatz_longest 0m19.799s collatz_longest_c1 0m00.516s collatz_longest_c2 0m00.239s 8 cores collatz_longest 0m10.042s collatz_longest_c1 0m00.285s collatz_longest_c2 0m00.147s 16 cores collatz_longest 0m05.196s collatz_longest_c1 0m00.178s collatz_longest_c2 0m00.109s 32 cores collatz_longest 0m02.717s collatz_longest_c1 0m00.137s collatz_longest_c2 0m00.105s collatz_longest_c1 (Inline C), collatz_longest (Perl) 32 cores 1e8 : 63728127 949 Inline C 0m00.738s Perl 0m30.554s 1e9 : 670617279 986 Inline C 0m07.198s Perl 5m51.938s 1e10 : 9780657630 1132 Inline C 1m17.059s 1e11 : 75128138247 1228 Inline C 13m51.122s collatz_longest_c2 (Inline C) 32 cores 1e8 : 63728127 949 Inline C 0m00.340s 1e9 : 670617279 986 Inline C 0m03.023s 1e10 : 9780657630 1132 Inline C 0m33.152s 1e11 : 75128138247 1228 Inline C 6m10.355s

    I can now sleep knowing that MCE can handle this. Just be sure to use sequence generation in MCE (i.e. mce_flow_s) with the bounds_only option.

    Regards, Mario


Add your Meditation
Title:
Meditation:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":


  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?
    Username:
    Password:

    What's my password?
    Create A New User
    Chatterbox?
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others drinking their drinks and smoking their pipes about the Monastery: (2)
    As of 2020-07-05 04:14 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?

      No recent polls found

      Notices?