Perl Monk, Perl Meditation



If you've discovered something amazing about Perl that you just need to share with everyone, this is the right place.

This section is also used for non-question discussions about Perl, and for any discussions that are not specifically programming related. For example, if you want to share or discuss opinions on hacker culture, the job market, or Perl 6 development, this is the place. (Note, however, that discussions about the PerlMonks web site belong in PerlMonks Discussion.)

Meditations is sometimes used as a sounding-board — a place to post initial drafts of perl tutorials, code modules, book reviews, articles, quizzes, etc. — so that the author can benefit from the collective insight of the monks before publishing the finished item to its proper place (be it Tutorials, Cool Uses for Perl, Reviews, or whatever). If you do this, it is generally considered appropriate to prefix your node title with "RFC:" (for "request for comments").

User Meditations
Does Go steal from Perl? :-)
3 direct replies — Read more / Contribute
by reisinge
on Aug 03, 2018 at 04:28

    I've started to read The Go Programming Language. I have come across this example code:

    // Dup1 prints the text of each line that appears more than
    // once in the standard input, preceded by its count.
    package main

    import (
        "bufio"
        "fmt"
        "os"
    )

    func main() {
        counts := make(map[string]int)
        input := bufio.NewScanner(os.Stdin)
        for input.Scan() {
            counts[input.Text()]++
        }
        // NOTE: ignoring potential errors from input.Err()
        for line, n := range counts {
            if n > 1 {
                fmt.Printf("%d\t%s\n", n, line)
            }
        }
    }

    The counts[input.Text()]++ construct looks pretty familiar, like a hash autovivification in Perl. Was this idea taken from Perl and put into Go?
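
    For comparison, here is a minimal Perl sketch of the same duplicate counter. The sub name count_lines and the sample data are made up for illustration, and it takes a list instead of reading STDIN so that it runs as-is. In Perl the missing hash entry is created on the first increment; in Go, indexing a missing map key yields the zero value of the element type, so both start counting from 0.

```perl
use strict;
use warnings;

# Count how often each line occurs. $counts{$line}++ creates the hash
# entry on first use; the initial undef is treated as 0 by ++.
sub count_lines {
    my @lines = @_;
    my %counts;
    $counts{$_}++ for @lines;
    return \%counts;
}

# Demo on a fixed list (hypothetical data) instead of STDIN.
my $counts = count_lines(qw(apple pear apple fig pear apple));
for my $line (sort keys %$counts) {
    my $n = $counts->{$line};
    printf "%d\t%s\n", $n, $line if $n > 1;
}
```

    Only the lines seen more than once are printed, each preceded by its count, as in the Go version.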

    Leave no stone unturned. -- Euripides
RIP Win 10
2 direct replies — Read more / Contribute
by SimonClinch
on Jul 30, 2018 at 11:01
    17 years ago I erased Windows from my home environment owing to the relative richness of multimedia applications available from planet CCRMA (for Fedora linux). Indeed, at that time, the best open source DVD ripper was written in Perl and intended mainly for linux.

    Then, seven years later, certain annoying office-related circumstances drove me back to using Windows XP at home. When that got unsupported I skipped the detestable Windows 7 and upgraded to 8. I lived with that thru 10 until recently.

    What happened next was that something went wrong in the April 2018 upgrade which didn't even hit my Win 10 until July -- whatever had been happening to delay windows upgrade is not related to the next bug I suffered: it wouldn't run windows updates because of old SIDs in the registry that did not match any users. But when these were removed (I had to carefully check the valid SIDs and edit the registry), it refused to log in because some typically nasty MS programming was linking these old SIDs to the PIN and password details instead of the new SIDs created by the upgrade. Too late to create a backup admin logon and my project (carefully backed up elsewhere) was ready to migrate to linux (as previously planned), so it was enough reason, relative to faint counter-argument, once again to banish windows from my home computers.

    You only live twice, Mister Gates...

    One world, one people

RFC: Inline::Blocks or inline as a keyword?
4 direct replies — Read more / Contribute
by shmem
on Jul 30, 2018 at 02:26

    Some days ago, Ovid on the perl5porters mailing list asked under the subject "inline keyword?"

    I know people have tried to inline Perl subs before but with little success. Instead of going down that road, have we ever considered an inline keyword where the developer says what can be inlined?

    There have been only a few answers, most off the point (mine included in the latter).

    I haven't considered inlining of subroutines as do blocks at all up to now, and whipped up some benchmark code to prove Ovid's point.

    That's pretty impressive. But the inlined subroutine consists of just one integer division.

    Experimenting further, I found that adding seven reciprocals as 1 / $_; to the plain sub put it on par with the sub subroutine. This means that the overhead of calling a subroutine, compared with plain inlined code, is roughly that of seven integer divisions. With heavily used small subs, inlining gives the most benefit in performance; the percentage drops as those subs get more complex, but inlining can still be a significant performance boost.
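
    A comparison of this kind can be sketched with the core Benchmark module. This is not the benchmark used for the numbers discussed here; the sub names half, sum_with_sub and sum_with_do are hypothetical stand-ins, and the sub body is a single integer division as described.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical stand-in for the benchmarked code: one integer division.
sub half { my ($n) = @_; return int($n / 2); }

# Variant 1: ordinary subroutine call inside the loop.
sub sum_with_sub {
    my $sum = 0;
    $sum += half($_) for 1 .. $_[0];
    return $sum;
}

# Variant 2: the sub body pasted in as a do block ("inlined" by hand).
sub sum_with_do {
    my $sum = 0;
    $sum += do { my ($n) = $_; int($n / 2) } for 1 .. $_[0];
    return $sum;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    sub_call => sub { sum_with_sub(1000) },
    do_block => sub { sum_with_do(1000) },
});
```

    The rates will differ between machines and perl versions, so treat the table as a relative comparison only.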

    What do you think? Should we have an inline keyword in Perl?

    Or should we delegate that to a module?

    Making inline into a Perl keyword proper would mean giving it an opcode, cloning of optrees and injecting them at the places where an inlined subroutine is called. Leaving it to a module would mean source filtering.

    Thinking about it, inlining subroutines isn't anything special to the language. It is just duplicating code - for some reason - all over the place, something you don't want to do statically for the sake of DRY (Don't Repeat Yourself) and to avoid a maintenance nightmare. So you want to leave that to some mechanism at compile time, let the computer do it, and not have the results in the source code.

    Since implementing inline as a keyword proper is much more difficult than whipping up a source filter, and since I am lazy, I decided to do the latter.

    Before you go O Noes, another source filter module o_O - that's brittle, evil eval etc., think about this: the whole perl source code relies on a source code filter named the C preprocessor. All perl C source code is shoved through the preprocessor prior to compilation. If preprocessing fails, there's no compilation; cpp doesn't check whether it produces valid C code, but fails only within its own rules. Validating the produced code is the task of the compiler.

    The same applies for Perl source filters. The fact that the source filter is invoked in the compile phase doesn't change that, that is just how perl works - it switches between parsing, compiling and runtime in the compile phase (think BEGIN blocks and use), so there's nothing special about it.

    Source filters have their merits. For instance, IO::All is a wonderful tool in my box, and I use it where appropriate.

    After that long preamble providing the rationale and defense for the perpetration, here's the module.

    update: edited according to tobyink's remarks below. The match variables are no longer package globals, and are overridable via import parameters.

    Using this module, the following

    use Inline::Blocks;

    inline sub capitalize_next;

    print uppercaseIncrementAsString('a'..'f'), "\n";

    sub uppercaseIncrementAsString {
        my @l = @_;
        my $ret;
        $ret .= capitalize_next($_) for @l;
        $ret;
    }

    sub capitalize_next {
        my ($thing) = @_;
        uc inline increase($thing);
    }

    sub increase {
        my ($foo) = @_;
        ++$foo;
    }

    results (via B::Deparse) in

    use Inline::Blocks;
    print uppercaseIncrementAsString(('a', 'b', 'c', 'd', 'e', 'f')), "\n";
    sub uppercaseIncrementAsString {
        my(@l) = @_;
        my $ret;
        $ret .= do {
            my($thing) = $_;
            uc do {
                my($foo) = $thing;
                ++$foo
            }
        } foreach (@l);
        $ret;
    }
    sub capitalize_next {
        my($thing) = @_;
        uc do {
            my($foo) = $thing;
            ++$foo
        };
    }
    sub increase {
        my($foo) = @_;
        ++$foo;
    }

    What do you think? Does that suffice, or should we have an inline keyword? Apart from answers to that question, criticism is welcome, as are improvements, e.g. for the regexps in the regexp variables, their names etc.

    perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
use Memoize;
2 direct replies — Read more / Contribute
by Anonymous Monk
on Jul 16, 2018 at 15:18
    I was porting a script to a module and noticed it kept getting slower. The script could initialize its expensive data structure once at the top and be done with it, but in order to encapsulate, the module was calling the function several times. I remembered the core module Memoize and added one line to the top of the program and now it runs fast again, 4x faster than without Memoize!
    use Memoize; memoize('some_sub');
    Only 1.5 seconds to start a program that was taking 6 seconds!
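
    A minimal sketch of the effect, using a made-up build_table sub as a stand-in for the expensive initialisation (the name and the counter are assumptions for illustration only):

```perl
use strict;
use warnings;
use Memoize;

my $calls = 0;

# Hypothetical stand-in for the expensive initialisation.
sub build_table {
    my ($size) = @_;
    $calls++;    # counts how often the real body runs
    return join ',', map { $_ * $_ } 1 .. $size;
}

# Replace build_table with a caching wrapper keyed on its arguments.
memoize('build_table');

build_table(5) for 1 .. 3;    # the body runs only on the first call
print "build_table body ran $calls time(s)\n";
```

    Memoize caches by argument list, so this only fits functions whose result depends on nothing but their arguments.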
Find the shortest word in the English Language with: a b c d e f
9 direct replies — Read more / Contribute
by usemodperl
on Jul 03, 2018 at 20:41
    Edit: Not going to edit this node because of replies but Eily noticed there is an extra single quote at the end of my port :-/

    This recent post in r/programming poses an interesting question for a bit of golf: "What is the shortest word in the English Language which contains: a b c d e f?" Some Junk™ code was posted to figure this out and my Perl version is a bit shorter. I know you perl better than me, if you do, and can see how else to do this:

    The Junk Code:
    sorted([w.strip() for w in open('/usr/share/dict/words', 'r').readlines() if set(list('abcdef')).issubset(set(list(w.strip())))], key=lambda x: len(x))
    My Perl Port:
    open$:,"</usr/share/dict/words";while(<$:>){next unless/(?=.*a)(?=.*b)(?=.*c)(?=.*d)(?=.*e)(?=.*f)/i;push@_,$_}@_=sort{length$a<=>length$b}@_'

    For your convenience:

    Print the list:
    open$:,"</usr/share/dict/words";while(<$:>){next unless/(?=.*a)(?=.*b)(?=.*c)(?=.*d)(?=.*e)(?=.*f)/i;push@_,$_}print for sort{length$a<=>length$b}@_
    Print the word:
    open$:,"</usr/share/dict/words";while(<$:>){next unless/(?=.*a)(?=.*b)(?=.*c)(?=.*d)(?=.*e)(?=.*f)/i;push@_,$_}for(sort{length$a<=>length$b}@_){print;last}
    Compare length of Perl to Junk:
    #!/usr/bin/perl -lw
    use strict;
    my $PERL = <<'PERL';
    open$:,"</usr/share/dict/words";while(<$:>){next unless/(?=.*a)(?=.*b)(?=.*c)(?= .*d)(?=.*e)(?=.*f)/i;push@_,$_}@_=sort{length$a<=>length$b}@_'
    PERL
    {
    no strict;
    $JUNK = <<JUNK;
    sorted([w.strip() for w in open('/usr/share/dict/words', 'r').readlines() if set(list('abcdef')).issubset(set(list(w.strip())))], key=lambda x: len(x))
    JUNK
    print length $PERL, ' PERL: ', $PERL;
    print length $JUNK, ' JUNK: ', $JUNK;
    }
    # I think junk could lose 4 spaces making it 148.
    __END__
    PERL: 144
    JUNK: 152
    Of course different versions of dict give different results...
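
    For anyone who wants the un-golfed logic, here is a readable sketch. The sub name words_with_letters and the sample word list are made up for illustration; it takes the words as a list rather than opening /usr/share/dict/words, so it runs anywhere.

```perl
use strict;
use warnings;

# Return the words containing every letter in $letters, shortest first
# (ties broken alphabetically). Matching is case-insensitive.
sub words_with_letters {
    my ($letters, @words) = @_;
    my @need = split //, lc $letters;
    my @hits = grep {
        my $w = lc $_;
        # keep the word only if no required letter is missing
        !grep { index($w, $_) < 0 } @need;
    } @words;
    return sort { length $a <=> length $b or $a cmp $b } @hits;
}

# Hypothetical sample data standing in for a dictionary file.
my @sample = qw(feedback boldface fabric backfield cab);
print "$_\n" for words_with_letters('abcdef', @sample);
```

    With the sample list this prints boldface, feedback and backfield, in that order. To run it against a real dictionary, read the file into a list, chomp it, and pass that in instead of @sample.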

Avoiding perl's Atof when assigning floating point values
5 direct replies — Read more / Contribute
by syphilis
on Jun 28, 2018 at 21:35

    Most perls assign floating point values using perl's internal Atof function - and that includes perls that define "Perl_strtod".
    But Perl's Atof function is notoriously incorrect, and a far better alternative IMO is to have floats assigned using Perl_strtod, which is just a wrapper around C's strtod() or strtold() or strtoflt128() - whichever is appropriate for the particular perl's nvtype.

    First up, I should point out that -Dusequadmath builds (ie builds for which $Config{nvtype} reports "__float128") already use Perl_strtod(), with the result that the __float128 values are assigned correctly, in my experience on Ubuntu-16.04. (By "correctly", I mean rounded to nearest, ties to even.)

    But when perl's nvtype is "double" or "long double", then values are being assigned using perl's Atof function and there's a fair chance that values are being assigned incorrectly.
    The magnitude of Atof's inaccuracies is not particularly large - mostly it's only 1 unit of least precision (ULP). But it can be as large as 7 ULP when nvtype is "double" and as large as 54 ULP when nvtype is the extended precision "long double".
    (The figures of "7" and "54" are the largest I've found, having tested millions of random values - and those 2 numbers turn up often enough.)
    The actual likelihood of striking inaccuracies with Atof depends upon the exponent range that you're working in. If the exponent is in the range (say) -10 to 10 the likelihood of an incorrect assignment is about 10%.
    But when I randomly select values across the full exponent range, I'm finding that the chances of an incorrect assignment rise to around 97% for "doubles" and 82% for "long doubles".
    When I hack the perl source to use Perl_strtod, the chances of an incorrect assignment become 0. (Ok ... I haven't checked every value ... but I've not yet found a value that has been incorrectly assigned by Perl_strtod on Ubuntu.)

    It turns out that using Perl_strtod instead of perl's Atof is very easy to implement. We just need to open up numeric.c in the top level perl source folder, replace (the one occurrence of) "strtoflt128" with "Perl_strtod", replace every occurrence of "USE_QUADMATH" with "Perl_strtod", and rebuild perl.
    The actual patch (for perl-5.28.0 source) can be downloaded from my scratchpad.

    UPDATE: Better to grab this patch because:
    a) it's a portable patch for both mingw-w64 built Windows perl && Linux perl;
    b) at some time I'll probably clear my scratchpad.

    That's about it. If your perl's nvtype is "__float128" or your build of perl doesn't define "Perl_strtod", then applying the patch will not change anything.
    Otherwise, however, if you build perl using the patched numeric.c then perl will assign floating point values using Perl_strtod instead of perl's Atof.

    It's very much the same story on MS Windows wrt mingw-w64 builds of perl whose nvtype is "double", where exactly the same patch makes equally dramatic improvements to the assigning of floating point values.
    Sadly, however, for "long doubles" on Windows, there's and that complicate matters.
    And there's also an issue wrt strtold's assigning of some subnormal long double values - for which I've yet to submit a bug report.
    (More about Windows at a later date.)

    Here's the script I use to check $ARGV[1] randomly selected values within a specified exponent range (-$ARGV[0] to +$ARGV[0]).
    #
    # Test a range of values for
    # correctness of assignment

    use strict;
    use warnings;
    use Math::MPFR qw(:mpfr);

    die "Upgrade to Math-MPFR-4.03" unless $Math::MPFR::VERSION >= 4.03;
    die "Usage: perl maximum_exponent how_many_values" unless @ARGV == 2;

    $|++;

    my $display = 0;

    while($display !~ /^y/i && $display !~ /^n/i) {
      print("Do you want mismatched values to be displayed ? [y|n]: \n");
      $display = <STDIN>;
    }

    $display = 0 if $display =~ /n/i;

    my($mant, $exp, $perl_unpacked, $mpfr_unpacked, $str_value);
    my($count, $diff, $max_diff, $min_diff) = (0, 0, 0, 0);
    my $max_exp = $ARGV[0];
    $max_exp++;

    # $workspace is the Math::MPFR object to which
    # the value being tested is assigned.
    # Here we set the precision of $workspace to the
    # same number of bits as perl's NV.
    my $workspace = Rmpfr_init2($Math::MPFR::BITS);

    my $failed = 0;
    my($perl_nv, $mpfr_nv);

    for(;;) {
      $count++;
      $mant = int(rand(10)) . '.' . int(rand(10)) . int(rand(10)) . int(rand(10))
              . int(rand(10)) . int(rand(10)) . int(rand(10)) . int(rand(10))
              . int(rand(10));
      $exp = int(rand($max_exp));
      $exp = "-$exp" if $count % 2;
      $str_value = $mant . "e$exp";

      # Assign $str_value to $mpfr_nv using mpfr
      $mpfr_nv = atonv($workspace, $str_value);

      # Assign $str_value to $perl_nv using perl
      $perl_nv = "$str_value" + 0;

      # $mpfr_nv and $perl_nv should be exactly equivalent.
      # Else at least one of mpfr and perl has assigned incorrectly.
      # IME, mpfr does not assign incorrectly.
      unless($perl_nv == $mpfr_nv) {
        $failed++;
        $perl_unpacked = scalar reverse unpack "h*", pack "F<", $perl_nv;
        $mpfr_unpacked = scalar reverse unpack "h*", pack "F<", $mpfr_nv;
        print "$str_value: $mpfr_nv:\n $perl_unpacked vs $mpfr_unpacked\n\n" if $display;
        $diff = hex(substr($perl_unpacked, -8, 8)) - hex(substr($mpfr_unpacked, -8, 8));
        if($diff > $max_diff) { $max_diff = $diff; }
        elsif($diff < $min_diff) { $min_diff = $diff; }
      }
      last if $count == $ARGV[1];
    }

    print "Count: $count\n";
    print "Failed: $failed\n";
    print "Largest differences were $max_diff ULPs and $min_diff ULPs\n";
    It requires Math-MPFR-4.03. If you want to test values in the subnormal range, you should build Math::MPFR against mpfr-4.0.x as earlier versions of mpfr were buggy in their calculation of subnormals.
    As a starter, run perl 300 100, opting to display mismatches, and see how that fares.
    Whenever I run that command against a patched perl-5.28.0, 0 mismatches are detected, irrespective of perl's nvtype.
    Whenever I run that command against an unpatched perl-5.28.0, about 80 failures are detected unless, of course, nvtype is "__float128" - in which case no failures still occur.
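
    The way the script measures a mismatch in ULPs can be tried in isolation without Math::MPFR. The sketch below is a simplification with made-up helper names (double_bits, ulp_diff): it packs values as 64-bit doubles ("d") rather than as the native NV ("F"), so it only mirrors the nvtype "double" case, and like the script it compares only the low 32 bits.

```perl
use strict;
use warnings;

# Hex dump of a value as an IEEE-754 double, most significant byte first.
sub double_bits { return unpack 'H*', pack 'd>', $_[0] }

# Difference in the low 32 bits of two doubles, i.e. the ULP distance
# when both values share the same upper 32 bits; undef otherwise.
sub ulp_diff {
    my ($x, $y) = @_;
    my ($hi_x, $lo_x) = unpack 'N2', pack 'd>', $x;
    my ($hi_y, $lo_y) = unpack 'N2', pack 'd>', $y;
    return undef unless $hi_x == $hi_y;
    return $lo_x - $lo_y;
}

my $one      = 1;
my $next_one = 1 + 2**-52;    # the double immediately above 1

print double_bits($one), "\n";            # 3ff0000000000000
print double_bits($next_one), "\n";       # 3ff0000000000001
print ulp_diff($next_one, $one), "\n";    # 1
```

    Two values that should be identical but differ by a nonzero ulp_diff correspond to the "failures" the script counts.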

    There's probably not many who would bother, but I certainly intend to continue building perl with this hack in place.

    UPDATE: For the record, gcc version on my Ubuntu box is 5.4.0, and libc version is 2.23

Why did you become a Perl expert (or programmer)?
12 direct replies — Read more / Contribute
by QM
on Jun 25, 2018 at 05:23
    Prompted by this comic at Commit Strip.

    Me? (Not that I'm an expert.) Because Perl was handy, extremely useful, and didn't require a separate compile phase. Because I could solve other people's problems with it. Because it was general purpose, and not specifically geared for stream picking / editing. Because it was free. Because it was more fun than any other language I knew at the time.

    Quantum Mechanics: The dreams stuff is made of

The CPAN Apocalypse: June 25, 2018
4 direct replies — Read more / Contribute
by usemodperl
on Jun 19, 2018 at 16:06
    I wonder what will happen the day goes away, and the day after. One more week and we will know! Does Perlmonks get flooded with identical questions? Does FUD gleefully announcing the death of Perl and CPAN make the rounds?? Do those redirects work???

Review of CGI::Alternatives
11 direct replies — Read more / Contribute
by Anonymous Monk
on Jun 09, 2018 at 11:44
    CGI::Alternatives is a module that "doesn't do anything"1 except vehemently deny and propagandize against the utility of one of the most useful forms of programming EVER conceived: CGI2 programming.

    CGI::Alternatives perpetuates all the common fallacies against CGI that if heeded only disempower independent developers. This includes advocating the replacement of all of Perl's wonderful and extremely simple, stable, mature, powerful CGI modules with vastly more byzantine "frameworks"1 which rigidly enforce all sorts of corporate nonsense like "full separation of concerns"1, total object-oriented lack of any possible function-ality, and the absurd complication of allowing oneself to be used by something as annoyingly totalitarian as templates1 EVEN when they're not appropriate! All of these techniques have their place of course, mostly in big projects, with lots of tiny modules (to confuse management, ensure job stability, in competitive workplaces, stretching hours into months, for the children), but not usually in code written by us individuals for fun, prototyping, and extreme levels of pure: results.3

    With all due respect to the author's efforts to change how we Perl into something his bosses find acceptable, the author of CGI::Alternatives is actually in charge of! How is this even possible? I realize the author is a talented programmer who has contributed significantly to CPAN, but this quote directly reflects his inappropriate state of mind towards the CGI paradigm (which is derided with that weird novelty-obsessed bigotry for being "old", as he removes perfectly sensible functions, only to prove his pointless point):

    "You can't just hand a template to the web-designers and allow them to work their magic. Don't mix the business logic and the presentation layer. Just don't."

    This guy doesn't even know what a CGI programmer does yet he dictates to us? This is CRAZY! We ARE the web-designers, OF the business logic, AND the presentation layer--ALL mixed together--like a SWISS ARMY CHAINSAW: this is our TECHNIQUE! THIS IS Perl! Something YOU (Lee) obviously don't understand. Mixing it all up is exactly how some other language(s) seized the web from Perl (along with plenty of well-funded corporate FUD). Even though we still do it far better.

    We have been here from the beginning and we remain no matter how many of our tools you try to disable or how much FUD you spread about our primordially awesome technique of producing ONE UNIFIED FILE, USING CORE MODULES, QUITE OFTEN VASTLY SUPERIOR TO FRAGMENTED TEMPLATES, WRITTEN BY ONE PERL GENIUS, RATHER THAN A TEAM OF HOPELESSLY ABSTRACTED CORPORATE DRONES: Because Larry wrote and maintains Perl that way; Blessed be.

    How dare you tell us to stop doing what we love and what Perl empowers us to do? How dare you remove the HTML generation functions from Who do you think you are anyway? People who come to Perl and say things have got to change don't appreciate Perl and should be led as far away from Perl as possible (Python), not in charge of (formerly, unfortunately) core modules!

    Can someone who cares please take away from Lee Johnson (LEEJO)? I would feel far more comfortable with someone we can trust, like ikegami or Merlyn3, in charge of maintaining At least we know they would give us what we want and need, and more, rather than inflicting torture by removing legacy functionality FOR EMOTIONAL REASONS thereby violating operational stability.

    News for you Lee: What worked 20 years ago still works today: UNIX, POSIX, BASH, PERL, ME, AND MAYBE EVEN YOU. Mature technology never stops working! I appreciate innovation so don't necessarily stop trying to reinvent the wheel, but please do stop trying to shove your shiny new wheels in sheep's clothing down our throats because PERL ALREADY WON.4

    If we stopped wasting time and spirit listening to ideologically driven flame warrior infiltrators who keep trying to change Perl we would already have a perfectly backward compatible and "fixed" (even though it has never failed me to this very day thank you Lincoln Stein5) on the corelist joined by other bits we desperately need and use EVERY SINGLE DAY like CGI::Carp, Data::Dumper and File::Slurp.

    Some examples:

    This extremely useful one-line CGI debugger is now broken thanks to LEEJO (thanks!):

    print header('text/plain'), Dumper $data; exit;

    This is never ever going away:

    start_html now reduces efficiency by 100%!:

    '</body></html>' end_html # removed from, for ideological anti-reasons

    If you agree PLEASE respond! If you do NOT agree please DO NOT hijack the thread because you guys already kinda won and I hope this thread can be for CGI programmers to chime in and support this seemingly lost cause which is really not even close to lost in the real non-ideological world of actual programmers who GET STUFF DONE.


    1. CGI::Alternatives
How will Artificial Intelligence change the way we code?
5 direct replies — Read more / Contribute
by LanX
on Jun 09, 2018 at 07:55
To <=80 char code line length or not
11 direct replies — Read more / Contribute
by stevieb
on Jun 07, 2018 at 18:15

    In all of my current 44 CPAN distributions, along with my near 80 Github Open Source repositories (Perl, C, C++, C#, Python etc), I (with few exceptions) enforce an 80 char limit on the length of the lines of code.

    I also do this even in my POD, Changes and test files (again, with some exceptions).

    I know that this practice is based on legacy console line-length reasons, but I still like to stick with it, as it keeps things very consistent, as well as allows my IDE to display the project layout, two open files side-by-side, and the overview (structure) of the file I'm currently working on to be viewed clearly and easily.

    Even when I'm using just vi/vim on the CLI outside of my IDE of choice, I can count on my code being consistently wide in all aspects.

    What are your thoughts here? Many coders I speak to go as far as 120 chars and they say that is helpful, and at $work (Python and C++), there's a 79 char limit and many hate it. Seems as though newer generations prefer longer line lengths, but here I am curious as to what the Perl community feels.

What is a Bool?
No replies — Read more | Post response
by tobyink
on Jun 07, 2018 at 05:01

    Already posted on

    Perl allows pretty much any value to be evaluated in a boolean context:

    if ($something) { ... }

    No matter what $something is, it will safely evaluate to either true or false. (With the exceptions of a few edge cases like blessed objects which are overloaded to throw an error when evaluated as booleans.)

    So when a Moose class does something like this, what does it mean?

    has something => (
        is  => 'ro',
        isa => 'Bool',
    );

    If absolutely any value could work when $self->something was accessed in boolean context, then what need is there to check what value is passed to the constructor? Should Bool basically be the same as Any, just spelled differently for documentation purposes?

    So what does Moose do? The documentation says:

    Bool accepts 1 for true, and undef, 0, or the empty string as false.

    However, that's not the full story. Blessed objects which overload stringification are accepted, but only if the stringification returns the strings "0", "1", or the empty string at the time the type constraint is checked. If the object stringifies to something else, but also overloads boolification sensibly, then too bad. Of course when you write if ($self->something) it's the boolification overloading which matters, but Moose only checks the stringification overloading.

    Moose's support for objects that overload stringification as booleans is not explicitly documented, nor is it covered at all by the Moose test suite.
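
    The gap between boolification and stringification can be seen with a small core-only sketch. The Flag class and the is_documented_bool sub are hypothetical: the sub is modelled on the documented rule quoted above (1 is true; undef, 0 and the empty string are false) and is not Moose's actual implementation.

```perl
use strict;
use warnings;

# Hypothetical class whose boolean and string overloads disagree
# with what a string-based Bool check expects.
package Flag {
    use overload
        'bool' => sub { $_[0]{on} },
        '""'   => sub { $_[0]{on} ? 'yes' : 'no' };
    sub new { my ($class, $on) = @_; return bless { on => $on }, $class }
}

# Hypothetical check modelled on the documented rule only;
# references (including blessed objects) are rejected outright.
sub is_documented_bool {
    my ($v) = @_;
    return 1 if !defined $v;
    return 0 if ref $v;
    return $v eq '1' || $v eq '0' || $v eq '' ? 1 : 0;
}

my $flag = Flag->new(1);
print "boolean context: ", ($flag ? 'true' : 'false'), "\n";    # true
print "string context:  $flag\n";                               # yes
print "passes check:    ", is_documented_bool($flag), "\n";     # 0
```

    The object behaves sensibly in if ($flag), yet a check that looks only at plain values or at the stringified form has no way to accept it.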

    What does Mouse do? Well, that's even weirder. It mostly follows Moose's documented behaviour. It accepts "1" for true, and "0", undef, and the empty string for false. But also, it accepts objects overloading boolification for false. Yes, that's right — if you overload boolification to return true, it will fail the type check. Overload it to return false, and you're golden!

    So where does this leave my module Types::Standard? Well, the pure Perl implementation follows what Moose does, and the (optional) XS implementation is forked from Mouse.

    For the latest release of the XS version, I've dropped support for objects which overload boolification to return false, bringing it in line with Moose's documented behaviour. I plan for the pure Perl implementation to also follow suit, dropping support for objects which overload stringification to return a boolean value.

    If you need support for objects overloading boolification, a quick workaround is this:

    has something => (
        is  => 'ro',
        isa => 'Any',   # Bool
    );

    Or use coercions (example uses Types::Standard):

    has something => (
        is     => 'ro',
        isa    => Bool->plus_coercions(Any, q{ !!$_ }),
        coerce => 1,
    );

    In the case of read-only attributes, I happen to believe accepting a blessed object as a boolean value could be harmful. The contents of the object could later change, changing the value from true to false, or vice versa, despite its read-onlyness.

RFC: LWP::UserAgent hit counter
2 direct replies — Read more / Contribute
by bliako
on Jun 03, 2018 at 10:30

    As I get sucked deeper and deeper into web scrapers -- the Cosmo Kramers of our era -- and constantly doing so with my faithful companion, the LWP::UserAgent, the need arose, primarily out of courtesy to the hosts, for counting the number of requests (hits) I made over a certain time interval and holding the scraper back by sleep()ing for some time.

    Eventually, I decided I wanted to know the ratio of active hitting sessions to sleep times, and also to control and tweak the hit rate and the resulting burden on the host for particular traffic situations (late night or noon) with just a few parameters, mainly the sleep() durations between the various phases of scraping and form filling. The latter can be imagined as a complex state machine that leads you down deterministic -- most of the time -- but highly complex paths.

    And so I have devised two methods/tools to assist me in my endeavours, one is a hit counter for LWP::UserAgent and the other is a counter of sleep() seconds which works across all sleep() calls even in far and foreign modules.

    I will proceed now to lay out a module-based implementation of so-called UserAgent-with-Stats, including a test script.

    The basic idea is to subclass LWP::UserAgent in order to add a handler (via set_handler), when requested by the user, to the "request_send" phase of LWP's request(). The purpose of this handler is to increment our internal hit counter every time a request is sent by LWP (GET/POST/etc.).

    Additionally, there are two time counters to assist us in calculating the time interval between when the counter was turned on and either the last hit or when it was turned off. The aim is to know the number of hits that occurred within a time interval. Thinking about it, maybe an interval from time-of-first-hit to time-of-last-hit makes more sense.

    Now, one may ask why there is a need to subclass rather than create a new class which takes a LWP::UserAgent object, adds a handler to it and keeps the counters. Indeed, that is another possibility.

    In any event, that's the basic idea. I would like to ask for your comments, corrections and recommendations. I will do the same for the sleep-count module in my next post.

    And here is a test script:

    I will detail the sleep-count in my next post.

    Thanks, bliako

Mocking LDAP in your tests
1 direct reply — Read more / Contribute
by Ea
on May 31, 2018 at 10:48
    Your mother was X.500 and your father smells of RFCs! Now go away or I shall mock you a second time!
      - from the original draft of Monty Python and the Holy Grail.

    I finally buckled down and started to mock the LDAP server in my tests rather than trying to connect to a live server. Other than the documentation, there aren't a lot of examples out there for Test::Net::LDAP::Mock or Test::Net::LDAP::Util, so here are the results from banging away at it for an afternoon, along with what I think is going on. Please feel free to point out what I've done wrong. I've stopped where it started working for me.

    I have a Mojolicious app that authenticates against LDAP, but the tests would fail when using dummy accounts or when I wasn't connected. Here's the test I wrote

    Steps to mocking

    Set up your test environment as usual and use Test::Net::LDAP::Util qw/ldap_mockify/; The ldap_mockify method intercepts all calls to Net::LDAP->new() and redirects them to your mocked LDAP directory.
    1. Create a new Net::LDAP object
    2. Use the object to populate your mocked server with data using the add method
    3. If you want to mock the authentication process, use the mock_bind method with a call back that returns LDAP_SUCCESS or LDAP_INVALID_CREDENTIALS
    4. Now that your LDAP server is all mocked up, run your tests
    5. Don't forget the }; at the end of the method. It's a funny error message when you forget the semicolon at the end.


    • the $basedn that you mock has to be the same as the base DN that you search in your application. this is easier if you keep the values in a config file and read the same file in your test (not shown here for brevity)
    • testing authentication, you don't set the password for an entry with mock_password(), but instead supply mock_bind() with a callback
    • if you haven't imported Net::LDAP::Constant, you'll need to use the fully qualified name to report success Net::LDAP::Constant::LDAP_SUCCESS or failure Net::LDAP::Constant::LDAP_INVALID_CREDENTIALS
    • most of the methods in Test::Net::LDAP::Util seem to want to return success, regardless of the underlying data, which can be frustrating until you work that out and code accordingly.

    Well, what do you think? Does it get the job done?

    Edit - while cleaning up tabs used for putting this post together, I found a relevant question on StackOverflow from 5 years ago, but it hasn't been answered so far.


    Sometimes I can think of 6 impossible LDAP attributes before breakfast.

    YAPC::Europe::2018 — Hmmm, need to talk to work about sending me there or to Mojoconf.

GDPR ( Global Data Protection Rights )
6 direct replies — Read more / Contribute
by trippledubs
on May 17, 2018 at 01:33


    What do you think of the General Data Protection Regulation? I'm interested to know whether your companies are behind it or minimally complying. I'm more interested to know whether you think individuals ought to have the rights expressed in that law, and whether there is really a moral obligation on site owners to comply. Or should it be scrapped or changed?

    The right of erasure specifically contradicts PM policy, which is defended with the same argument that Wikipedia uses, the "memory hole" argument. If one user decides to revoke the site owner's permission to use their nodes, that creates a hole in the chain, and every user is negatively affected. That is a pretty utilitarian viewpoint. It smells slightly self-serving to me to hear that argument from sites whose success rides directly on user-generated content.

    It really only benefits future users, because if you were there, you don't need a tattoo of the conversation to remember it later. I don't see why a site owner, especially one that isn't the hoster (i.e. the back-in-time machines), gets a perpetual license after you leave. Take recipe sites -- let's say you participate for years honing your craft and eventually decide to write a cookbook. Do you never have the right to take your recipes down off the boards and make the world pay for your stuff? But your dishes have probably benefited from all that recipe sharing, so it seems you would owe something too.

    I can't help but think of the social contract put forth in Crito. You have a good idea of what you are getting into when you participate online, and it seems reasonable that the site architects who built your playground would be able to dictate the terms, but I don't see how they have the right to continue to do so once you leave.

    I googled social contract, copyright law, and landlord-tenant law, and looked up about 10 web sites that were closing down or blocking EU customers, but I can't make up my mind. There seem to be a lot of data players operating in the shadows without consent, and that should be addressed, but I can't see how it affects my life at all. I see an ad about something I almost bought on Amazon, big deal.

    Well, surely we do not live in a perfect world, but does the GDPR move the decimal point in either direction? Or does it just add more compliance factories to the world? And who are the people who wrote the bill that made me get all this TOS spam? I tried to find the authors' names and could not. Maybe this is a stepping stone to better "digital rights"?
