If you've discovered something amazing about Perl that you just need to share with everyone, this is the right place.

This section is also used for non-question discussions about Perl, and for any discussions that are not specifically programming related. For example, if you want to share or discuss opinions on hacker culture, the job market, or Perl 6 development, this is the place. (Note, however, that discussions about the PerlMonks web site belong in PerlMonks Discussion.)

Meditations is sometimes used as a sounding-board — a place to post initial drafts of perl tutorials, code modules, book reviews, articles, quizzes, etc. — so that the author can benefit from the collective insight of the monks before publishing the finished item to its proper place (be it Tutorials, Cool Uses for Perl, Reviews, or whatever). If you do this, it is generally considered appropriate to prefix your node title with "RFC:" (for "request for comments").

User Meditations
Perl monks vs other sites
3 direct replies — Read more / Contribute
by f77coder
on May 23, 2015 at 23:09
    Hello All,

    I wasn't sure where to post this, so apologies if this is not the place.

    I wanted to say how great Perl Monks is at helping out noobs, compared with the knuckle-dragging Neanderthals at places like Stack Overflow. People here are generally orders of magnitude nicer.

    Kudos to the site.

RFC: Swagger-codegen for Perl
2 direct replies — Read more / Contribute
by wing328
on May 15, 2015 at 02:17
    Hi all,

    https://github.com/swagger-api/swagger-codegen contains a template-driven engine to generate client code in different languages by parsing your Swagger Resource Declaration. Recently I've added the Perl template. To test the code generation, please perform the following (assuming you have the dependencies installed):

    git clone https://github.com/swagger-api/swagger-codegen.git
    cd swagger-codegen
    git checkout develop_2.0
    mvn clean && ./bin/perl-petstore.sh

    If you do not want to install the dependencies and just want a peek at the auto-generated Perl SDK, please go to the directory samples/client/petstore/perl to have a look at the Perl SDK for Petstore (http://petstore.swagger.io). Please make sure you're in the develop_2.0 branch.

    The Perl SDK is not perfect, and I would appreciate it if you could take the time to test and review it and share your feedback with me.

    (ideally I would like to post this at "Meditations" but I couldn't find a way to post there)

OOP: How to (not) lose Encapsulation
3 direct replies — Read more / Contribute
by Arunbear
on May 12, 2015 at 08:25


    Moose and its smaller cousin Moo have become popular ways of creating Object Oriented code, but the existing tutorials do not have much to say about Information hiding (also known as Encapsulation), so this is an attempt to fill that gap.

    Alice and the Set

    Alice wants to use a Set (a collection of unique items) to keep track of data her program has already seen. She creates a Set class which initially looks like this (assume that she either hasn't learnt about hashes yet, or doesn't like the "fiddliness" of simulating a Set via a Hash):
    package Set;
    use Moo;

    has 'items' => (is => 'ro', default => sub { [ ] });

    no Moo;

    sub has {
        my ($self, $e) = @_;
        scalar grep { $_ == $e } @{ $self->items };
    }

    sub add {
        my ($self, $e) = @_;
        if ( ! $self->has($e) ) {
            push @{ $self->items }, $e;
        }
    }

    1;
    It can be used like this:
    % reply
    0> use Set
    1> my $s = Set->new
    $res[0] = bless( { 'items' => [] }, 'Set' )
    2> $s->has(42)
    $res[1] = 0
    3> $s->add(42)
    $res[2] = 1
    4> $s->has(42)
    $res[3] = 1

    Bob, min and max

    Another programmer, Bob, starts using this class. Bob finds it useful but misses the ability to find the min/max element of a set, so he creates some utilities:
    package BobUtil;
    use strict;
    use List::Util qw(min max);

    sub set_min {
        my ($set) = @_;
        min @{ $set->items };
    }

    sub set_max {
        my ($set) = @_;
        max @{ $set->items };
    }

    1;
    which he can use like so:
    % reply
    0> use Set
    1> use BobUtil
    2> my $s = Set->new
    $res[0] = bless( { 'items' => [] }, 'Set' )
    3> $s->add($_) for 1 .. 5
    $res[1] = ''
    4> BobUtil::set_min($s)
    $res[2] = 1
    5> BobUtil::set_max($s)
    $res[3] = 5
    Bob eventually finds this usage too cumbersome and decides to make it simpler by using Inheritance to create his own set:
    package BobSet;
    use Moo;
    use List::Util qw(min max);

    extends 'Set';

    no Moo;

    sub numeric_min {
        my ($self) = @_;
        min @{ $self->items };
    }

    sub numeric_max {
        my ($self) = @_;
        max @{ $self->items };
    }

    1;
    And now he can do this:
    % reply
    0> use BobSet
    1> my $s = BobSet->new
    $res[0] = bless( { 'items' => [] }, 'BobSet' )
    2> $s->add($_) for 1 .. 5
    $res[1] = ''
    3> $s->numeric_min
    $res[2] = 1
    4> $s->numeric_max
    $res[3] = 5

    Alice updates the Set

    Now realising that linear scans don't scale up as well as hash lookups, Alice decides to update her Set class to use a hash rather than an array:
    package Set;
    use Moo;

    has 'items' => (is => 'ro', default => sub { { } });

    no Moo;

    sub has {
        my ($self, $e) = @_;
        exists $self->items->{ $e };
    }

    sub add {
        my ($self, $e) = @_;
        if ( ! $self->has($e) ) {
            $self->items->{ $e } = 1;
        }
    }

    1;
    News of the new improved Set reaches Bob, and he installs the new version, but alas:
    % reply
    0> use BobSet
    1> my $s = BobSet->new
    $res[0] = bless( { 'items' => {} }, 'BobSet' )
    2> $s->add($_) for 1 .. 5
    $res[1] = ''
    3> $s->numeric_min
    Not an ARRAY reference at BobSet.pm line 11.
    And BobUtil is just as broken:
    % reply
    0> use Set
    1> use BobUtil
    2> my $s = Set->new
    $res[0] = bless( { 'items' => {} }, 'Set' )
    3> $s->add($_) for 1 .. 5
    $res[1] = ''
    4> BobUtil::set_min($s)
    Not an ARRAY reference at BobUtil.pm line 8.

    Encapsulation lost via accessor

    By making the internal representation of the Set public, anyone depending on that representation will be in trouble if the representation changes. Alice updates the Set again to correct this design error:
    package Set;
    use Moo;

    has '_items' => (is => 'ro', default => sub { { } });

    no Moo;

    sub items {
        my ($self) = @_;
        [ keys %{ $self->_items } ];
    }

    sub has {
        my ($self, $e) = @_;
        exists $self->_items->{ $e };
    }

    sub add {
        my ($self, $e) = @_;
        if ( ! $self->has($e) ) {
            $self->_items->{ $e } = 1;
        }
    }

    1;
    Here the internal representation of the Set is made "private" via the "leading underscore" convention, and a public method is provided to access (a copy of) the set items. It would be better to have no accessor for the set items at all, so that there would be no need to rely on a convention for privacy, but that is beyond the power of Moo. And now BobSet works once again:
    % reply
    0> use BobSet
    1> my $s = BobSet->new
    $res[0] = bless( { '_items' => {} }, 'BobSet' )
    2> $s->add($_) for 1 .. 5
    $res[1] = ''
    3> $s->numeric_min
    $res[2] = '1'
    4> $s->numeric_max
    $res[3] = '5'

    Encapsulation lost via constructor

    There's still another way in which Encapsulation is lost, consider:
    % reply
    0> use Set
    1> my $s = Set->new(_items => [1 .. 4])
    $res[0] = bless( { '_items' => [ 1, 2, 3, 4 ] }, 'Set' )
    2> $s->add(5)
    Not a HASH reference at Set.pm line 15.
    The constructor generated by Moo will, by default, allow any attribute to be set via an "init_arg", which is clearly not desirable in this case. There are a few ways to fix this, such as constraining the value with an "isa" directive or passing an "init_arg" => undef directive.
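    As a minimal sketch of the init_arg approach (only the attribute declaration changes; the rest of Alice's class is as before):

    ```perl
    package Set;
    use Moo;

    # init_arg => undef tells Moo's generated constructor to ignore
    # any "_items" key passed to new(), so callers can no longer
    # inject their own representation.
    has '_items' => (
        is       => 'ro',
        init_arg => undef,
        default  => sub { { } },
    );

    no Moo;
    1;
    ```

    With this in place, Set->new(_items => [1 .. 4]) still returns a Set, but the rogue argument is discarded (Moo ignores unknown constructor arguments by default) and the default empty hashref is used instead.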

    Yet another way is to use BUILDARGS e.g.

    package Set;
    use Moo;

    has '_items' => (is => 'ro', default => sub { { } });

    no Moo;

    sub BUILDARGS {
        shift;
        return { _items => { map { $_ => 1 } @_ } };
    }

    sub items {
        my ($self) = @_;
        [ keys %{ $self->_items } ];
    }

    sub has {
        my ($self, $e) = @_;
        exists $self->_items->{ $e };
    }

    sub add {
        my ($self, $e) = @_;
        if ( ! $self->has($e) ) {
            $self->_items->{ $e } = 1;
        }
    }

    1;
    Now any (explicit) constructor arguments will become elements of the Set:
    % reply
    0> use Set
    1> my $s = Set->new(1 .. 4)
    $res[0] = bless( { '_items' => { '1' => 1, '2' => 1, '3' => 1, '4' => 1 } }, 'Set' )
    2> $s->has(5)
    $res[1] = ''
    3> $s->has(3)
    $res[2] = 1

    Encapsulation lost via Inheritance

    Yet another programmer, Chuck, also starts using Alice's Set class, but he finds that he wants the Set to be able to remember how many times it has encountered a given value, i.e. Chuck wants to create a new type of set with this behaviour:
    % reply
    0> use RememberingSet
    1> $s = RememberingSet->new
    3> $s->has(1)
    $res[1] = ''
    4> $s->add(1)
    $res[2] = 1
    5> $s->seen(1)
    $res[3] = 2
    6> $s->seen(2)
    $res[4] = 0
    Here the set remembers that it has seen the value 1 twice, once via has() and once via add(), whereas it hasn't seen the value 2 via either add() or has(). Chuck uses Inheritance to create this type of set:
    package RememberingSet;
    use Moo;

    has '_count' => (is => 'ro', default => sub { { } });

    extends 'Set';

    no Moo;

    sub has {
        my ($self, $e) = @_;
        $self->_count->{ $e }++;
        $self->SUPER::has($e);
    }

    sub add {
        my ($self, $e) = @_;
        $self->_count->{ $e }++;
        $self->SUPER::add($e);
    }

    sub seen {
        my ($self, $e) = @_;
        exists $self->_count->{ $e } ? $self->_count->{ $e } : 0;
    }

    1;
    The RememberingSet overrides the has() and add() methods, in both cases updating a counter before calling the corresponding method in Alice's Set. But Chuck finds that this new set doesn't work as expected:
    % reply
    0> use RememberingSet
    1> my $s = RememberingSet->new
    $res[0] = bless( { '_count' => {}, '_items' => {} }, 'RememberingSet' )
    2> $s->has(1)
    $res[1] = ''
    3> $s->add(1)
    $res[2] = 1
    4> $s->seen(1)
    $res[3] = 3
    This has happened because in the Set class, the add() method calls the has() method. Chuck could fix this by not updating the count in his add() method, but this is a fragile solution as seen() would yield the wrong answer if Alice decided to update add() so that it didn't call has().


    The problem with Inheritance is that it requires Chuck to know the internal workings of Alice's Set class in order to use it correctly (hence the loss of Encapsulation). A safer form of reuse is Composition, which looks like this:
    package RememberingSet;
    use Moo;
    use Set;

    has '_count' => (is => 'ro', default => sub { { } });
    has '_set'   => (is => 'ro', default => sub { Set->new });

    no Moo;

    sub has {
        my ($self, $e) = @_;
        $self->_count->{ $e }++;
        $self->_set->has($e);
    }

    sub add {
        my ($self, $e) = @_;
        $self->_count->{ $e }++;
        $self->_set->add($e);
    }

    sub seen {
        my ($self, $e) = @_;
        exists $self->_count->{ $e } ? $self->_count->{ $e } : 0;
    }

    1;
    This solution provides the expected behaviour. Composition works by wrapping the "derived" class around the original one and forwarding (or delegating) the appropriate methods to it. There are even Moosisms like "handles" and "around" that could be used to simplify this solution, e.g.:
    package RememberingSet;
    use Moo;
    use Set;

    my $Delegated = [qw/add has/];

    has '_count' => (is => 'ro', default => sub { { } });
    has '_set'   => (
        is      => 'ro',
        default => sub { Set->new },
        handles => $Delegated,
    );

    around $Delegated => sub {
        my ($orig, $self, $e) = @_;
        $self->_count->{ $e }++;
        $self->$orig($e);
    };

    no Moo;

    sub seen {
        my ($self, $e) = @_;
        exists $self->_count->{ $e } ? $self->_count->{ $e } : 0;
    }

    1;
    The REPL used in the examples above is reply.


    • Consider making all attributes private (if only via convention)
    • Consider turning off init_args
    • Consider using Composition instead of Inheritance
Never say never
2 direct replies — Read more / Contribute
by Lady_Aleena
on May 05, 2015 at 23:37

    A thud can be heard throughout the monastery as Lady Aleena's head hits her desk in the scriptorium. Shortly thereafter the sound of breaking glass filters through the monastery's halls as her preconceptions are shattered.

    Never say you will never do a thing because, one day, you will do the thing then feel foolish.

    Now here is some history and what happened tonight. I said I would never use pipes in my data fields. Tonight I did. Now I am feeling more than foolish. (And I've made more work for myself while trying to make less work for myself.) I just learned my lesson.

    If you feel you are about to say "I will never...", stop, think very hard, count to ten, whatever. Just do not say it! One day you will be writing a piece of code, humming a little tune stuck in your head, then you will crash into a wall because you did not take into consideration the exception you just wrote.

    Have a nice day!

    I started this thought in the CB and continued it here.

    No matter how hysterical I get, my problems are not time sensitive. So, relax, have a cookie, and a very nice day!
    Lady Aleena
perlnews item worth reading!
No replies — Read more | Post response
by ww
on May 05, 2015 at 16:12
10,000 days of Perl
2 direct replies — Read more / Contribute
by johnbio
on May 05, 2015 at 07:43
    Just a brief meditation on this landmark day for Perl. If the release of Perl 1.0 according to perlhist is taken as the date of birth, today Perl is 10,000 days old. As someone who has used Perl an awful lot for ... more than 7,000 days, and continues to do so, I would like to post a big Thank You and a 5-digit landmark congratulation to Perl, Larry Wall and the Community :-) I may post again in 2042.
Refactoring Perl5 with XS++
5 direct replies — Read more / Contribute
by rje
on Apr 25, 2015 at 01:06

    Last time I mused aloud about "refactoring" Perl, I referenced Chromatic's statement/challenge:

    "If I were to implement a language now, I'd write a very minimal core suitable for bootstrapping. ... Think of a handful of ops. Think very low level. (Think something a little higher than the universal Turing machine and the lambda calculus and maybe a little bit more VMmy than a good Forth implementation, and you have it.) If you've come up with something that can replace XS, stop. You're there. Do not continue. That's what you need." (Chromatic, January 2013)

    I know next to nothing about XS, so I started reading perldoc.

    I'm thinking about the problem, so if there is a question, it would be "what NOW?"

    Should I bother with thinking about bytecodes? In what sense could it be a replacement for XS? What does "replace XS" even MEAN? (i.e. perhaps it just means "remove the need to use perl's guts to write extensions, and improve the API").

    Most importantly, am I wasting people's time by asking?

    I'm trying to come up with my own answers, and learn by trying. But wisdom is in knowing that some of you guys have already thought through this. If you can help bring me up to speed, I'd appreciate it.

    UPDATE: I see even within the Lorito page, it was understood that the discussion was to some degree about Perl's API: "This is an issue of API design. If we understand the non-essential capabilities we want to support (e.g. optimization passes, etc), we can design the API so that such capabilities can be exploited but not required. - cotto "

Want to make an Everything site of your own?
1 direct reply — Read more / Contribute
by thomas895
on Apr 20, 2015 at 03:48

    Ever wanted to experiment with the engine PerlMonks is built on? I did, but it's rather difficult to install, so I thought I'd write this for anyone who wanted to give it a go themselves.

    My Perl is v5.16.2. YMMV for others.


    • A MySQL/MariaDB server

    • GNU patch

    • C compiler

    • Some knowledge of how Apache works

    Estimated time: a quiet evening or so

    1. Download Apache 1.3.9 and mod_perl 1.14 from your nearest mirror, then unpack them. You may use other versions, but this guide won't apply to them.

    2. I wanted to install it all to my home directory. I ran mod_perl's Makefile.PL like so:

      perl Makefile.PL APACHE_SRC=../apache_1.3.9/src \
          APACHE_PREFIX=$HOME/opt/apache1.3.9 \
          DO_HTTPD=1 USE_APACI=1 EVERYTHING=1 \
          PREFIX=/home/thomas/perl5
      Adjust as needed.
    3. If you have a relatively new version of gcc and a Perl v5.14 or newer, you will need to make some changes to the source. Apply this patch file to the mod_perl directory, and this one to the apache directory. It does the following (you can skip these details if you want):

      • In v5.16, $< and friends are no longer cached. I just tried removing the problematic section that used these variables, and that seemed to work. You might not be able to run your server as root (which requires being able to change its own permissions), but I haven't checked.

      • For some reason, it was a trend in those days to supply your own version of getline (maybe the libc one was broken; I haven't looked it up). In any case, gcc complains about it, so I updated all of the code to use the Apache one. (It only affects the password utility, which is not really needed in our case, but it does cause make to fail.)

      • In v5.14, you can't use the result of GvCV and friends as lvalues anymore, so I replaced the places where something was assigned to the result of that function with the appropriate _set macro, as the delta advises.

    4. Run make and make install, and go make some coffee. You can make test, too, but then also grab a sandwich.

    5. Try to start Apache as make install's instructions suggest, to make sure it works. You may need to choose a different port number; do so with the Listen and Port options in httpd.conf.

      • If you installed Apache locally, you will need to modify apachectl and/or your shell startup script: make sure that the PERL5LIB environment variable is set to where mod_perl's Perl libraries are installed.

      Now for Everything else...

    6. Download this, unpack it, and follow QUICKINSTALL up to (but not yet including) the install_esite step.

      • When running Makefile.PL, if you want to install locally, don't forget to set PREFIX accordingly.

      • It is not necessary to let it append things to your httpd.conf, in a later step I'll show you why and what to do instead.

    7. If you have a modern mysql/mariadb, some of the SQL scripts won't work. Here is another patch to fix them.

      • It mostly has to do with the default values of the integer columns: by getting rid of the default value of a quoted zero, mysql accepts it.

      • There is also a timestamp column that is given a size in the script, but mysql doesn't like that; getting rid of the size makes it work again.

    8. Now run install_esite, as QUICKINSTALL says.

    9. For some reason, index.pl only showed up as text, perhaps due to the other mod_perl settings I'd been playing with, or perhaps it was something else. I added this to httpd.conf, and then it worked:

      PerlModule Apache::Registry
      PerlModule Apache::DBI
      PerlModule CGI
      <Files *.pl>
          SetHandler perl-script
          PerlHandler Apache::Registry
          Options +ExecCGI
          PerlSendHeader On
          PerlSetupEnv On
      </Files>
    10. (Re)start Apache, visit /index.pl, and have lots of fun!

    If something doesn't work for you, post it below.

    Happy hacking!

    Edit: forgot the third patch

    "Excuse me for butting in, but I'm interrupt-driven..."
    Did you know this software was released when I was only 3 years old? Still works, too -- I find that amazing.
Data-driven Programming: fun with Perl, JSON, YAML, XML...
6 direct replies — Read more / Contribute
by eyepopslikeamosquito
on Apr 19, 2015 at 04:41

    The programmer at wit's end for lack of space can often do best by disentangling himself from his code, rearing back, and contemplating his data. Representation is the essence of programming.

    -- from The Mythical Man Month by Fred Brooks

    Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.

    -- Rob Pike

    As part of our build and test automation, I recently wrote a short Perl script for our team to automatically build and test specified projects before checkin.

    Lamentably, another team had already written a truly horrible Windows .BAT script to do just this. Since I find it intolerable to maintain code in a language lacking subroutines, local variables, and data structures, I naturally started by re-writing it in Perl.

    Focusing on data rather than code, it seemed natural to start by defining a table of properties describing what I wanted the script to do. Here is a cut-down version of the data structure I came up with:

    # Action functions (return zero on success).
    sub find_in_file {
        my $fname  = shift;
        my $str    = shift;
        my $nfound = 0;
        open( my $fh, '<', $fname ) or die "error: open '$fname': $!";
        while ( my $line = <$fh> ) {
            if ( $line =~ /$str/ ) {
                print $line;
                ++$nfound;
            }
        }
        close $fh;
        return $nfound;
    }
    # ...

    # ------------------------------------------------------------------
    # Globals (mostly set by command line arguments)
    my $bldtype = 'rel';

    # ------------------------------------------------------------------
    # The action table @action_tab below defines the commands/functions
    # to be run by this program and the order of running them.
    my @action_tab = (
        {
            id      => 'svninfo',
            desc    => 'svn working copy information',
            cmdline => 'svn info',
            workdir => '',
            logfile => 'minbld_svninfo.log',
            tee     => 1,
            prompt  => 0,
            run     => 1,
        },
        {
            id      => 'svnup',
            desc    => 'Run full svn update',
            cmdline => 'svn update',
            workdir => '',
            logfile => 'minbld_svnupdate.log',
            tee     => 1,
            prompt  => 0,
            run     => 1,
        },
        # ...
        {
            id      => "bld",
            desc    => "Build unit tests ${bldtype}",
            cmdline => qq{bldnt ${bldtype}dll UnitTests.sln},
            workdir => '',
            logfile => "minbld_${bldtype}bldunit.log",
            tee     => 0,
            prompt  => 0,
            run     => 1,
        },
        {
            id      => "findbld",
            desc    => 'Call find_strs_in_file',
            fn      => \&find_in_file,
            fnargs  => [ "minbld_${bldtype}bldunit.log", '[1-9][0-9]* errors' ],
            workdir => '',
            logfile => '',
            tee     => 1,
            prompt  => 0,
            run     => 1,
        },
        # ...
    );

    Generally, I enjoy using property tables like this in Perl. I find them easy to understand, maintain and extend. Plus, a la Pike above, focusing on the data first usually makes the coding a snap.

    Basically, the program runs a specified series of "actions" (either commands or functions) in the order specified by the action table. In the normal case, all actions in the table are run. Command line arguments can further be added to specify which parts of the table you want to run. For convenience, I added a -D (dry run) option to simply print the action table, with indexes listed, and a -i option to allow a specific range of action table indices to be run. A number of further command line options were added over time as we needed them.
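    The driver that walks such a table can stay tiny. A sketch of the idea (not my actual script: the handling of workdir, logfile, tee and prompt is omitted here, and the error handling is simplified):

    ```perl
    # Walk the action table in order, running each enabled action.
    # Both commands and action functions return zero on success.
    for my $i ( 0 .. $#action_tab ) {
        my $act = $action_tab[$i];
        next unless $act->{run};
        print "[$i] $act->{desc}\n";
        my $rc = $act->{fn}
            ? $act->{fn}->( @{ $act->{fnargs} || [] } )   # action function
            : system( $act->{cmdline} );                  # external command
        die "action '$act->{id}' failed (rc=$rc)\n" if $rc != 0;
    }
    ```

    Because all the policy lives in the table, options like -D (dry run) and -i (index range) only need to filter or print the table; the loop itself never changes.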

    Initially, I started with just commands (returning zero on success, non-zero on failure). Later "action functions" were added (again returning zero on success and non-zero on failure).

    As the table grew over time, it became tedious and error-prone to copy and paste table entries. For example, if there are four different directories to be built, rather than having four entries in the action table that are identical except for the directory name, I wrote a function that took a list of directories and returned an action table. None of this was planned, the script just evolved naturally over time.

    Now is time to take stock, hence this meditation.

    Coincidentally, around the same time as I wrote my little script, we inherited an elaborate testing framework that specified tests via XML files. To give you a feel for these, here is a short excerpt:

    <Test>
        <Node>Muss</Node>
        <Query>Execute some-command</Query>
        <Valid>True</Valid>
        <MinimumRows>1</MinimumRows>
        <TestColumn>
            <ColumnName>CommandResponse</ColumnName>
            <MatchesRegex row="0">THRESHOLD STARTED.*Taffy</MatchesRegex>
        </TestColumn>
        <TestColumn>
            <ColumnName>CommandExitCode</ColumnName>
            <Compare function="Equal" row="0">0</Compare>
        </TestColumn>
    </Test>

    Now, while I personally detest using XML for these sorts of files, I felt the intent was good, namely to clearly separate the code from the data, thus allowing non-programmers to add new tests.

    Seeing all that XML at first made me feel disgusted ... then uneasy because my action table was embedded in the script rather than more cleanly represented as data in a separate file.

    To allow my script to be used by other teams, and by non-programmers, I need to make it easier to specify different action tables without touching the code. So I seek your advice on how to proceed:

    • Encode the action table as an XML file.
    • Encode the action table as a YAML file.
    • Encode the action table as a JSON (JavaScript Object Notation) file.
    • Encode the action table as a "Perl Object Notation" file (and read/parse via string eval).
    • Turn the script and action table/s into Perl module/s.

    Another concern is that when you have thousands of actions, or thousands of tests, a lot of repetition creeps into the data files. Now dealing with repetition (DRY) in a programming language is trivial -- just use a function or a variable, say -- but what is the best way of dealing with unwanted repetition in XML, JSON and YAML data files? Suggestions welcome.
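    One partial answer for the YAML option: YAML has anchors, aliases, and the merge key, which give you a DRY mechanism inside the data file itself. A sketch using the fields from my action table (note the merge key << is a YAML 1.1 feature; check that your loader supports it):

    ```yaml
    defaults: &defaults
      workdir: ''
      tee: 1
      prompt: 0
      run: 1

    actions:
      - <<: *defaults
        id: svninfo
        desc: svn working copy information
        cmdline: svn info
        logfile: minbld_svninfo.log
      - <<: *defaults
        id: svnup
        desc: Run full svn update
        cmdline: svn update
        logfile: minbld_svnupdate.log
    ```

    Plain JSON has no comparable feature, which is one reason YAML is often preferred when humans maintain large data files by hand.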


Effective Automated Testing
3 direct replies — Read more / Contribute
by eyepopslikeamosquito
on Apr 18, 2015 at 04:50

    I'll be giving a talk at work about improving our test automation. Initial ideas are listed below. Feedback on talk content and general approach are welcome along with any automated testing anecdotes you'd like to share. Possible talk sections are listed below.

    Automation Benefits

    • Reduce cost.
    • Improve testing accuracy/efficiency.
    • Regression tests ensure new features don't break old ones. Essential for continuous delivery.
    • Automation is essential for tests that cannot be done manually: performance, reliability, stress/load testing, for example.
    • Psychological. More challenging/rewarding. Less tedious. Robots never get tired or bored.

    Automation Drawbacks

    • Opportunity cost of not finding bugs had you done more manual testing.
    • Automated test suite needs ongoing maintenance. So test code should be well-designed and maintainable; that is, you should avoid the common pitfall of "oh, it's only test code, so I'll just quickly cut n paste this code".
    • Cost of investigating spurious failures. It is wasteful to spend hours investigating a test failure only to find out the code is fine, the tests are fine, it's just that someone kicked out a cable. This has been a chronic nuisance for us, so ideas are especially welcome on techniques that reduce the cost of investigating test failures.
    • May give a false sense of security.
    • Still need manual testing. Humans notice flickering screens and a white form on a white background.

    When and Where Should You Automate?

    • Testing is essentially an economic activity. There are an infinite number of tests you could write. You test until you cannot afford to test any more. Look for value for money in your automated tests.
    • Tests have a finite lifetime. The longer the lifetime, the better the value.
    • The more bugs a test finds, the better the value.
    • Stable interfaces provide better value because it is cheaper to maintain the tests. Testing a stable API is cheaper than testing an unstable user interface, for instance.
    • Automated tests give great value when porting to new platforms.
    • Writing a test for customer bugs is good because it helps focus your testing effort around things that cost you real money and may further reduce future support call costs.

    Adding New Tests

    • Add new tests whenever you find a bug.
    • Around code hot spots and areas known to be complex, fragile or risky.
    • Where you fear a bug. A test that never finds a bug is poor value.
    • Customer focus. Add new tests based on what is important to the customer. For example, if your new release is correct but requires the customer to upgrade the hardware of 1000 nodes, they will not be happy.
    • Documentation-driven tests. Go through the user manual and write a test for each example given there.
    • Add tests (and refactor code if appropriate) whenever you add a new feature.
    • Boundary conditions.
    • Stress tests.
    • Big ones, but not too big. A test that takes too long to run is a barrier to running it often.
    • Tools. Code coverage tools tell you which sections of the code have not been tested. Other tools, such as static (e.g. lint) and dynamic (e.g. valgrind) code analyzers, are also useful.

    Test Infrastructure and Tools

    • Single step, automated build and test. Aim for continuous integration.
    • Clear and timely build/test reporting is essential.
    • Quarantine flaky failing tests quickly; run separately until solid, then return to main build. No broken windows.
    • Make it easy to find and categorize tests. Use test metadata.
    • Integrate automated tests with revision control, bug tracking, and other systems, as required.
    • Divide test suite into components that can be run separately and in parallel. Quick test turnaround time is crucial.

    Design for Testability

    • It is much easier/cheaper to write automated tests for systems that were designed with testability in mind in the first place.
    • Interfaces Matter. Make them: consistent, easy to use correctly, hard to use incorrectly, easy to read/maintain/extend, clearly documented, appropriate to audience, testable in isolation.
    • Dependency Injection is perhaps the most important design pattern in making code easier to test.
    • Mock Objects are also frequently useful and are broader than just code. For example, I've written a number of mock servers in Perl (e.g. a mock SMTP server) so as to easily simulate errors, delays, and so on.
    • Consider ease of support and diagnosing test failures during design.
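    To illustrate the mock-server point, here is a bare-bones sketch (not one of the actual servers mentioned above; just enough SMTP to accept one connection and inject a configurable error on MAIL FROM):

    ```perl
    use strict;
    use warnings;
    use IO::Socket::INET;

    # A toy mock SMTP server: the client under test connects to
    # localhost:2525 and gets a simulated temporary failure.
    my $inject = '451 4.3.0 Simulated temporary failure';

    my $listener = IO::Socket::INET->new(
        LocalPort => 2525,
        Listen    => 1,
        ReuseAddr => 1,
    ) or die "listen: $!";

    my $client = $listener->accept;
    print $client "220 mock.example.com ESMTP\r\n";
    while ( my $line = <$client> ) {
        if    ( $line =~ /^(?:HELO|EHLO)/i ) { print $client "250 mock.example.com\r\n" }
        elsif ( $line =~ /^MAIL FROM/i )     { print $client "$inject\r\n" }
        elsif ( $line =~ /^QUIT/i )          { print $client "221 Bye\r\n"; last }
        else                                 { print $client "500 Unrecognized\r\n" }
    }
    close $client;
    ```

    Swapping $inject for different SMTP reply codes lets you exercise the client's retry and error-reporting paths without a real mail server.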

    Test Driven Development (TDD)

    • Improved interfaces and design. Especially beneficial when writing new code. Writing a test first forces you to focus on interface. Hard to test code is often hard to use. Simpler interfaces are easier to test. Functions that are encapsulated and easy to test are easy to reuse. Components that are easy to mock are usually more flexible/extensible. Testing components in isolation ensures they can be understood in isolation and promotes low coupling/high cohesion.
    • Easier Maintenance. Regression tests are a safety net when making bug fixes. No tested component can break accidentally. No fixed bugs can recur. Essential when refactoring.
    • Improved Technical Documentation. Well-written tests are a precise, up-to-date form of technical documentation.
    • Debugging. Spend less time in crack-pipe debugging sessions.
    • Automation. Easy to test code is easy to script.
    • Improved Reliability and Security. How does the code handle bad input?
    • Easier to verify the component with memory checking and other tools (e.g. valgrind).
    • Improved Estimation. You've finished when all your tests pass. Your true rate of progress is more visible to others.
    • Improved Bug Reports. When a bug comes in, write a new test for it and refer to the test from the bug report.
    • Reduce time spent in System Testing.
    • Improved test coverage. If tests aren't written early, they tend never to get written. Without the discipline of TDD, developers tend to move on to the next task before completing the tests for the current one.
    • Psychological. Instant and positive feedback; especially important during long development projects.


the sorry state of Perl unit testing framework
6 direct replies — Read more / Contribute
by bulk88
on Apr 07, 2015 at 02:51
    Updated: Test::Tiny was benchmarked and analyzed, Test::More alpha release 1.301001_101 with new backend tried

    Soon I will be going to QAH 2015 in Berlin. My number one goal is to get parallel testing for TAP::Harness working on Win32 Perl, but this post isn't about parallel testing. TAP::Harness stymied me many times when I tried to touch the codebase. It is militantly OOP, obfuscated with its own implementation of method dispatch, and written in a "declarative language" similar to a makefile. It has layers of faux-abstraction that superficially purport to support pluggability, yet don't allow anything but the current implementation to fit the abstraction layers. I believe it is a pedagogical exercise from an ivory tower: in plain terms, a homework assignment, written to appeal to a professor's ego, that shows off all the skills itemized on your syllabus.

    Researching the bloated design of TAP::Harness also led me to investigate the other side of the TAP connection, Test::Simple/Test::More. I discovered it is equally inefficient.

    All these tests were done on a 2.6 GHz 2-core machine running 32-bit WinXP. The primary tool I use for benchmarking in this post is timeit.exe. It is a simple times()-like benchmark tool that asks the NT kernel for its counters after the process exits. The resolution of these counters is 15.625 ms, but the workloads I benchmark take seconds or minutes to complete, so 15.625 ms resolution isn't an issue.

    The workload is always 1 million tests, run by GenTAP.pm, which is from http://github.com/bulk88/Win32-APipe. GenTAP.pm and fastprint.t, fasttinyok.t and fastok.t should be portable and run on any Perl platform if you want to reproduce these benchmarks yourself.

    I refrained from using nytprof in this write-up since nytprof has overhead, and questioning individual subs and lines of code is pointless if slowness is a conscious systemic design rule, not a couple of bad drive-by patches over the years. The output is redirected to a file, so there is no overhead from the Win32 console in writing to STDOUT.


    fastok.t calls Test::More's ok() from version 1.001014, 1 million times in a loop, with each test always passing and a randomly generated test name.
    timeit perl t\t\fastok.t > t.txt

    Version Number:   Windows NT 5.1 (Build 2600)
    Exit Time:        2:11 am, Saturday, April 4 2015
    Elapsed Time:     0:02:00.671
    Process Time:     0:01:59.234
    System Calls:     3664127
    Context Switches: 320686
    Page Faults:      948528
    Bytes Read:       3339101868
    Bytes Written:    100048020
    Bytes Other:      73765993

    which works out to 0.000119 seconds per ok() call, or about 0.1 milliseconds. This is significant: for every 10,000 tests, 1 second of overhead. It also means you can't run more than about 10,000 tests per second per core no matter what you do. How often do you run "make test", wait for it to finish, and it feels like filling a gas tank? How slow is Travis or whatever CI solution you use?

    If your unit testing consists of code-generated permutations of parameters to your module, 10,000 or even 100,000 tests is very easily reachable in one software project/module. Some very popular CPAN modules do this style of code-generated permutation testing.

    Now, what could be the fastest possible TAP generation? "type file_of_tap.txt" or "cat file_of_tap.txt" is cheating. Under the same conditions, the best-case scenario to compare Test::More's overhead against is a simple "print "ok ".(++$counter)." - $testname\n";" in a loop instead of "ok(1, $testname)", which is what fastprint.t does.
    timeit perl t\t\fastprint.t > t.txt

    Version Number:   Windows NT 5.1 (Build 2600)
    Exit Time:        2:19 am, Saturday, April 4 2015
    Elapsed Time:     0:00:02.156
    Process Time:     0:00:02.109
    System Calls:     21413
    Context Switches: 6818
    Page Faults:      14663
    Bytes Read:       270297
    Bytes Written:    60606608
    Bytes Other:      6971963

    That is 0.0000021 seconds per DIY ok() call.

    0.000119/0.0000021 = 57x more time. FIFTY-SEVEN times more CPU. To summarize: if you use Test::More, you might as well imagine there is a gigabit ethernet cable and a UDP socket between your TAP emitter and TAP consumer.
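    For anyone wanting to reproduce the gap without timeit.exe, here is a rough sketch (not the original fastok.t/fastprint.t): N is reduced from 1 million, and all TAP goes to in-memory buffers so console speed doesn't skew the numbers. The exact ratio will vary by machine and Test::More version.

```perl
use strict;
use warnings;
use Test::More;
use Time::HiRes qw(gettimeofday tv_interval);

my $n = 20_000;

# Send Test::More's TAP to an in-memory buffer instead of the console.
my $tap = '';
open my $fh, '>', \$tap or die $!;
Test::More->builder->output($fh);
Test::More->builder->failure_output($fh);
plan tests => $n;

my $t0 = [gettimeofday];
ok( 1, "test number $_" ) for 1 .. $n;
my $tm_time = tv_interval($t0);

# DIY emitter, as in fastprint.t: one print per TAP line, no framework.
my $diy = '';
open my $dfh, '>', \$diy or die $!;
my $counter = 0;
$t0 = [gettimeofday];
print {$dfh} 'ok ' . (++$counter) . " - test number $_\n" for 1 .. $n;
my $diy_time = tv_interval($t0);
close $dfh or die $!;

printf STDERR "Test::More: %.4fs  DIY print: %.4fs  ratio: %.0fx\n",
    $tm_time, $diy_time, $diy_time ? $tm_time / $diy_time : 0;
```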

    Now about memory use of Test::More. I modified fastok.t as such
    unshift(@INC, '.');
    require 't/t/GenTAP.pm';
    require Test::More; # load but don't call anything in Test::More;
                        # we want runtime mem overhead, not loadtime,
                        # otherwise the require will happen inside GenTAP
                        # if it isn't done here
    system('pause');    # sample memory
    GenTAP(0, 0, 'ok', 1000000);
    system('pause');    # sample memory

    Before: 3,828 KB; after: 397,828 KB; peak: 397,840 KB.

    (397840-3828)/1000000 = 0.394 KB per test emitted: roughly 400 bytes per test. What on earth is in those 400 bytes? My test name passed to T::M::ok() is always 42 bytes long. Let's round that up to the next 16-byte Win32 malloc boundary: 48 + 12 (perl's win32 ithread malloc wrapper in vmem.h) + 16 (SV head) + 8 (SVPV body) + 4 (an SV * somewhere else) = 88 bytes for storing the test name. Where did the other 300 bytes go? Why is Test::More saving the names of passing tests? Showing off your skills in LOC per hour for your CV? Writing job-for-life unmaintainable code? The TAP parser is responsible for maintaining TAP state, not the TAP emitter. A TAP emitter should have no memory increase between successive calls to ok().

    Test::More has no competitors on CPAN except for Test::Tiny, which makes no attempt at API compatibility but has a similar ok() sub. So using fasttinyok.t, which calls Test::Tiny's ok() sub 1 million times, I get:
    timeit perl t\t\fasttinyok.t > t.txt

    Version Number:   Windows NT 5.1 (Build 2600)
    Exit Time:        7:42 pm, Tuesday, April 7 2015
    Elapsed Time:     0:00:05.218
    Process Time:     0:00:05.140
    System Calls:     49612
    Context Switches: 24005
    Page Faults:      17396
    Bytes Read:       146216
    Bytes Written:    57859498
    Bytes Other:      17639334
    Test::Tiny's ok() is (5.140/2.109 = 2.437) 2.4x slower than my ideal DIY ok() implementation, which, compared to Test::More's 57x, is a rounding error. And remember, Test::Tiny is a real working CPAN module.

    About Test::Tiny's memory usage, using the same breakpoint positions: before 3,008 KB, after 3,028 KB; 3028-3008 = 20 KB. 20000/1000000 = 0.02 bytes per test, which is unmeasurably small. Basically no increase in memory usage, unlike the hundreds of MBs seen with Test::More.
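    The Test::Tiny result shows how little state a TAP emitter actually needs. A minimal sketch of a constant-memory ok() (not the real Test::Tiny code; TinyTAP is invented here): two integers of state, nothing retained per test.

```perl
use strict;
use warnings;

{
    package TinyTAP;
    # All emitter state: how many tests ran, and how many failed.
    my ( $count, $failed ) = ( 0, 0 );

    sub ok {
        my ( $pass, $name ) = @_;
        ++$count;
        $failed++ unless $pass;
        print( ( $pass ? 'ok ' : 'not ok ' ), $count,
               ( defined $name ? " - $name" : '' ), "\n" );
        return $pass;
    }

    sub done_testing {
        print "1..$count\n";    # trailing plan, valid TAP
        return $failed == 0;
    }
}

TinyTAP::ok( 1,     'first test' );
TinyTAP::ok( 2 > 1, 'comparison holds' );
my $all_passed = TinyTAP::done_testing();
```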

    Even a drop-in replacement for Test::More that is 10x slower than the ideal DIY ok() implementation above, or in other words 4x slower than Test::Tiny, would still be over 5x faster than Test::More. Just about anything is faster than a sloth pulling a wagon. Something needs to be done about Test::More; the entire Perl community relies on it, and it is unworkably slow. Either a drop-in replacement, or replacing all of the internals of Test::More with a simplified architecture.

    I was told to try an alpha release (1.301001_101) of Test::More, which includes a new backend that is hoped to improve its performance. I will therefore benchmark it.
    timeit perl t\t\fastok.t > t.txt

    Version Number:   Windows NT 5.1 (Build 2600)
    Exit Time:        11:37 pm, Tuesday, April 14 2015
    Elapsed Time:     0:02:10.859
    Process Time:     0:02:09.375
    System Calls:     2399091
    Context Switches: 238722
    Page Faults:      543275
    Bytes Read:       3031284
    Bytes Written:    103643867
    Bytes Other:      87114348
    The results are bad: 10 seconds more than the old stable 1.001014 Test::More, or 9% slower.


    On to the TAP consumer, TAP::Harness. For the next example, remember that fastprint.t takes 2 seconds of CPU to print its 1 million tests. I don't think fastprint.t's process time is included by the timeit.exe tool, but with the numbers shown, 2 seconds is a rounding error if it is.

    C:\sources\Win32-APipe>timeit C:\perl521\bin\perl.exe "-MExtUtils::Command::MM"
    "-MTest::Harness" "-e" "undef *Test::Harness::Switches; test_harness(0,
    'blib\lib', 'blib\arch')" t\t\fastprint.t
    t\t\fastprint.t .. ok
    All tests successful.
    Files=1, Tests=1000000, 66 wallclock secs (65.86 usr +  0.13 sys = 65.98 CPU)
    Result: PASS

    Version Number:   Windows NT 5.1 (Build 2600)
    Exit Time:        3:00 am, Saturday, April 4 2015
    Elapsed Time:     0:01:06.406
    Process Time:     0:01:06.203
    System Calls:     483839
    Context Switches: 182749
    Page Faults:      78950
    Bytes Read:       62566241
    Bytes Written:    53961404
    Bytes Other:      11595189

    C:\sources\Win32-APipe>

    (60+6)/1000000 = 0.000066 seconds for TAP::Harness to process 1 TAP test. That is better than Test::More: parsing 1 test takes TAP::Harness 55% of the time it takes Test::More to emit 1 test.

    Now about the memory usage of TAP::Harness. Checking the process memory with Task Manager at a breakpoint right before the test_harness sub is called shows 5,868 KB; the Windows OS shows the process peaked at 106,368 KB; and a breakpoint right after the test_harness sub shows 96,636 KB. There are two memory problems here that need to be broken down.

    Problem 1: TAP::Harness uses about 100 bytes of memory for each *passing* test ((106368000-5868000)/1000000). The internal state of test results isn't a sparse array or linked list of failed tests, or a vec() or C bitfield; heck, it isn't even an @array with undef/uninit slices for successful tests, which would be 4 bytes per test on a 32-bit OS. It is 100 bytes per passing test. What is 100 bytes? 100/4 is 25 pointers/slice members. For reference, each scalar you create on a 32-bit OS is 4 slice members. I will guess it uses a blessed hash with 1 or 2 hash keys for each test, without even looking at TAP::Harness's implementation.

    Problem 2: at the breakpoint after test_harness() executed, memory dropped from 106,368 KB only to 96,636 KB, about 10 MB. What is inside the remaining 96,636-5,868 = 90,768 KB?

    Here is the console log with the breakpoints (the "Press any key to continue . . ." lines) to show where memory was sampled.

    C:\sources\Win32-APipe>C:\perl521\bin\perl.exe "-MExtUtils::Command::MM"
    "-MTest::Harness" "-e" "undef *Test::Harness::Switches; system 'pause';
    test_harness(0, 'blib\lib', 'blib\arch'); system 'pause';" t\t\fastprint.t
    Press any key to continue . . .
    t\t\fastprint.t .. ok
    All tests successful.
    Files=1, Tests=1000000, 65 wallclock secs (65.05 usr +  0.11 sys = 65.16 CPU)
    Result: PASS
    Press any key to continue . . .

    C:\sources\Win32-APipe>
    All tests successful.
    Files=1, Tests=1000000, 65 wallclock secs (65.05 usr +  0.11 sys = 65.16 CPU)
    Result: PASS

    was already printed, so what do those 90 MB contain? Why is TAP::Harness holding onto state after printing the final line? Surely this can't all be malloc fragmentation preventing a release of memory? Or was TAP::Harness written to leak memory on the theory that "the process will exit soon, don't waste CPU freeing anything" (perhaps a legitimate reason)? I doubt that was the intention of the person who designed TAP::Harness's OOP API.

    In combination, TAP::Harness + Test::More take 0.185 ms of overhead per ok() call. is(), which is more common than ok(), will probably take even longer, so 0.185 ms per test is the current best-case scenario using the existing unit testing framework.



    Rewriting Test::Simple and Test::More back into standalone modules like they were 20 years ago, and removing their usage of Test::Builder, would be best, but that requires Test::More's authors and maintainers to agree that the code is deeply flawed and agree to replacing it.

    Summarizing some code in Test::Builder: do we really need to ever implement this, anywhere?
    sub in_todo {
        my ($todo, $caller);
        $todo = defined( (caller ++$caller)[0] )
                  ? (caller $caller)[0]->can('is_todo')
                      ? (caller $caller)[0]->is_todo
                          ? 1
                          : undef
                      : undef
                  : 0
            until defined $todo;
        $todo;
    }
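    For contrast, here is the same caller-stack walk written plainly (a sketch of equivalent logic, not a drop-in Test::Builder replacement; the Todo::Scope demo package is invented):

```perl
use strict;
use warnings;

# Walk up the call stack; the first frame whose package knows is_todo()
# and reports true means we are inside a TODO block. No such frame: 0.
sub in_todo {
    my $level = 0;    # the original also starts one frame up
    while ( my ($package) = caller ++$level ) {
        return 1 if $package->can('is_todo') && $package->is_todo;
    }
    return 0;
}

# Invented demo package, two call frames deep so the walk finds it.
{
    package Todo::Scope;
    sub is_todo { 1 }
    sub inner   { main::in_todo() }
    sub outer   { inner() }
}

print Todo::Scope::outer() ? "in todo\n" : "not in todo\n";   # in todo
```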

    Since I expect significant protests from the people whose CVs depend on protecting their precious snowflakes (see this incredible post https://github.com/Test-More/test-more/issues/252 from 2012; leaking memory is by design and will never change, according to the then author), a drop-in replacement for Test::More under a different namespace, patching dists away from T::M/T::B, and removing T::M/T::B from the Perl 5 core is probably the easiest way forward.


    Rewriting TAP::Harness from scratch is probably the only solution, since a couple of 3rd-party modules are crazy enough to integrate with its internals, like TAP::Harness::Archive. The typical "make test" has no use for TAP::Harness's OOP bloat, with the only 2 options being TEST_VERBOSE on or off, and parallel or not.

    I have done nytprof-ing of TAP::Harness, but nothing is fixable there without admitting that all of its design rules are a list of what not to do.

    A simple design for a new harness: a TAP source class (usually a process class) that returns a stream class. The stream class returns a string name() (the filename of the .t, or an http:// URL of TAP, or a disk path of TAP), and returns multi-KB blocks and eventually undef as the EOS indicator; just 2 methods.

    For passing tests, store nothing (undef), or store a "sparse" range of passing tests in unblessed objects; store failed tests, unknown lines and diag in a linked list for dumping/summing at the end. Rewrite the parser in XS to quickly search for newline and the "not ok" and "ok" tokens, maybe using the BM algorithm. Even for a PP version, use index and substr in a 1-pass parser through the block. If a TAP stream has all passing tests and reaches the end of the stream, all the passing tests are represented by 1 hash with 2 keys (start range, end range).

    This is a long shot, but ideally, pipes shouldn't even be used between a TAP consumer and a TAP emitter. A future Test::More could communicate through shared memory, via XS, with a future TAP::Harness, with the IPC buffer maybe looking like a stream; or the TAP "parsing" (no TAP is generated) is "done" (there is no TAP, just an array of test records) on the client side in Test::More, which also gives the benefit of "out of sequence" TAP being impossible, due to perl thread safety in Test::More::XS.
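    The "store nothing for passing tests" idea is easy to sketch. Assuming results arrive in order, runs of passes collapse to [first, last] pairs; record_result() and its data layout are invented for illustration:

```perl
use strict;
use warnings;

# Sparse result storage: consecutive passes collapse into ranges, only
# failures are kept individually. An all-pass stream costs one range,
# not one record per test.
my @pass_ranges;    # each element: [ first_test_num, last_test_num ]
my @failures;       # each element: { num => ..., name => ... }

sub record_result {
    my ( $num, $passed, $name ) = @_;
    if ($passed) {
        if ( @pass_ranges && $pass_ranges[-1][1] == $num - 1 ) {
            $pass_ranges[-1][1] = $num;    # extend the current run
        }
        else {
            push @pass_ranges, [ $num, $num ];
        }
    }
    else {
        push @failures, { num => $num, name => $name };
    }
}

# 100_000 tests with a single failure in the middle:
record_result( $_, $_ != 50_000, "test $_" ) for 1 .. 100_000;

printf "ranges: %d  failures: %d\n", scalar @pass_ranges, scalar @failures;
# ranges: 2  failures: 1
```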

    Why is TAP::Harness's design flawed? I saw all of the following being done while stepping through and nytprof-ing TAP::Harness.

    Things not to do include:

    • No method accessors, no callbacks, no method dispatchers in PP, no declarative syntax, no pluggable tokenizers.
    • No roll-your-own RTTI; in fact, no RTTI at all. Just 3 classes. And absolutely no "Harness::Data::Integer" class: Perl isn't Javascript and will not JIT your Integer class into a scalar.
    • Always use hash keys; do not use classes and constructors where an integer bitfield or integer bools/!!0 will do. This is Perl, not C++, not Java.
    • Do not build bool methods that should be bitfields and that are aggregations of other bool methods: you wind up with exponentially many method calls. And since caching is evil ("it can't be plugged later on", according to the OOP dogma), the result is parsing a TAP line over and over.
    • Do not use "class factory" classes. There is no reason to have closured, dynamically generated, anonymous classes. If you need 2 classes both named "Record" because you are too incompetent to prefix "Customer::" or "Inventory::" to the word "Record", you should name classes after your pets; I hope your house doesn't have 2 Rustys. Perl has "packages"; do not invent your own.
    • Do not nest hashes inside hashes. Perl hashes aren't C structs, where all the "."s in "pointer->a.b.c.d" are optimized away.
    • Do not write classes where the ctor does nothing except bless an empty hash, and every method then checks whether the object was "really ctored" and conditionally calls the real ctor, in some crazy attempt to optimize for bad callers that unnecessarily create objects, never call a single method on them, then dtor them.
    • Do not ask an object whether it can() do anything; that means your objects are lying about their abstract base class. If you bought a car on eBay and the seller mails you 4 tires with a $1K shipping charge, do you timidly do nothing and buy another car online?
    • Do not implement has_pending() with "return !!scalar(shift->pending());" where pending() is "return map{ref($_)->new($_)} @{shift->{queue}}".
    • Do not implement meaningless sort calls, such as in "$new->{$_} = $self->{$_} foreach sort @clone;".
    • Do not collect error/fail state diagnostics and build error message strings when there is no error and you will never use that error message or state snapshot again.
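    The has_pending() example makes the point well: asking "is anything queued?" should not construct an object per queued item. A sketch of the cheap version, with Queue::Sketch invented as a minimal stand-in:

```perl
use strict;
use warnings;

{
    package Queue::Sketch;
    sub new     { bless { queue => [] }, shift }
    sub enqueue { push @{ $_[0]{queue} }, $_[1] }

    # Emptiness check: one array lookup, zero allocations, versus the
    # quoted pattern that builds (and discards) an object per entry.
    sub has_pending { @{ $_[0]{queue} } ? 1 : 0 }
}

my $q = Queue::Sketch->new;
print $q->has_pending, "\n";    # 0
$q->enqueue('job');
print $q->has_pending, "\n";    # 1
```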

Is pushing strict and warnings still relevant?
7 direct replies — Read more / Contribute
by stevieb
on Apr 06, 2015 at 17:47

    I've been out of using Perl for some time now. After I decided to leave the Network Engineering field, many things changed.

    Before I left, I wrote some tutorials et al. (v5.10-ish) and did a preliminary examination report on Perl6, but since then, I've found a job where I've been pushed into Python.

    My question is: as I dabble here on PerlMonks and in some of my older code, I wonder, is it still important to remind people to use warnings and strictures, or am I getting old?
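    For what it's worth, the classic argument for the pragmas still holds; a small sketch (the variable names are made up) showing strict turning a typo into a compile-time error instead of a silent undef:

```perl
use strict;
use warnings;

# Under strict (whose effect propagates into string eval), a typo'd
# variable name fails to compile instead of quietly evaluating to undef.
my $result = eval 'my $total = 42; print $tota1; 1';   # typo: $tota1
my $error  = $@;
print $result ? "compiled\n" : "caught: $error";
```

    Without strict, the typo'd code compiles and prints nothing, and the bug lives on.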


Why Boyer-Moore, Horspool, alpha-skip et.al don't work for bit strings. (And is there an alternative that does?)
7 direct replies — Read more / Contribute
by BrowserUk
on Apr 05, 2015 at 07:16

    The basic premise of the fast string matching algorithms named in the title is that you preprocess the needle and construct a table of shifts or skips that let you skip over chunks of the haystack and thus speed things up.

    The following explanation, which I hope is clear (clear ones are few and far between), describes a variant called Quick Search.

    You start with a haystack and needle:

    00001110100111001011101101110001111010111100111110111111100000001000100100001010010101000001101000110010011000101101010110011011000000
    000010100101010000011010

    You inspect the needle and build the skip table:

    00001010 01010100 00011010

    00001010   shift 24
    01010100   shift 16
    00011010   shift  8
    xxxxxxxx   shift 32   (all other patterns not found in the needle allow maximum shift)

    And apply it. Try the pattern at the start of the haystack; doesn't match, so look up the next 8 bits of the haystack in the table, and find the shift = 32:

    000011101001110010111011 01110001 111010111100111110111111100000001000100100001010010101000001101000110010011000101101010110011011000000
    000010100101010000011010 ????????   shift = 32

    So, apply the shift and try the needle at the new position. No match, lookup the next 8 bits of the haystack to find a shift of 32:

    00001110100111001011101101110001111010111100111110111111100000001000100100001010010101000001101000110010011000101101010110011011000000
                                    000010100101010000011010????????   shift = 32

    Apply the shift, try the needle at the new position. No match, look up the next 8 bits, get the shift of 8:

    00001110100111001011101101110001111010111100111110111111100000001000100100001010010101000001101000110010011000101101010110011011000000
                                                                    000010100101010000011010????????   shift = 8

    Apply the shift, try the match, And success. Needle found.

    00001110100111001011101101110001111010111100111110111111100000001000100100001010010101000001101000110010011000101101010110011011000000
                                                                            000010100101010000011010   Found at 72

    Now let's try that with the same haystack but another needle:

    10000101 00101010 00001101

    10000101   shift 24
    00101010   shift 16
    00001101   shift  8
    xxxxxxxx   shift 32

    00001110100111001011101101110001111010111100111110111111100000001000100100001010010101000001101000110010011000101101010110011011000000
    100001010010101000001101????????   shift 32
                                    100001010010101000001101????????   shift 32
                                                                    100001010010101000001101????????   shift 32
                                                                                                    100001010010101000001101????????   shift 32
                                                                                                                                    100001010010101000001101
    >>> Not found.

    Great! Four compares & four skips to discover the needle isn't in the haystack.

    Except that it is!

    00001110100111001011101101110001111010111100111110111111100000001000100100001010010101000001101000110010011000101101010110011011000000
                                                                           100001010010101000001101

    And that's why Boyer-Moore, Horspool, Alpha-Skip, Quick Search, et al. don't work for bit-strings.

    It doesn't matter what size you make the chunks of bits -- above I used 8 -- the skip table will only ever consider aligned groups of bits, but bit-strings inherently consist entirely of unaligned bits!

    (Don't believe me; just try it for yourself.)
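    Taking up that invitation, here is a sketch of the whole failure in Perl (following the 8-bit-chunk scheme above, not any particular published implementation): the chunked Quick Search reports "not found" while index() finds the needle at the unaligned offset 71.

```perl
use strict;
use warnings;

# The 134-bit haystack and 24-bit needle from the example above.
my $haystack = '0000111010011100101110110111000111101011110011111011111110000000'
             . '1000100100001010010101000001101000110010011000101101010110011011'
             . '000000';
my $needle   = '100001010010101000001101';

# Build the skip table from the needle's aligned 8-bit chunks.
my %shift;
for ( my $pos = 0; $pos + 8 <= length $needle; $pos += 8 ) {
    $shift{ substr $needle, $pos, 8 } = length($needle) - $pos;
}
my $max_shift = length($needle) + 8;

# Scan: on a mismatch, shift by the table entry for the 8 bits just
# past the current window (maximum shift if that chunk isn't known).
sub quick_search {
    my ( $hay, $ndl ) = @_;
    my $pos = 0;
    while ( $pos + length($ndl) <= length $hay ) {
        return $pos if substr( $hay, $pos, length $ndl ) eq $ndl;
        my $next = substr $hay, $pos + length($ndl), 8;
        $pos += $shift{$next} // $max_shift;
    }
    return -1;    # not found
}

printf "quick search: %d  index(): %d\n",
    quick_search( $haystack, $needle ), index( $haystack, $needle );
# quick search: -1  index(): 71
```

    The aligned chunks the table knows about simply never line up with the needle's actual position in the haystack, so every lookup returns a full-window shift.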

    And the point of this meditation?

    To prove my instincts were right?

    Maybe, but mostly because I wanted to be wrong. I wanted there to be a better than brute force mechanism. And I'm still hoping that someone will point me at one.

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
    In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
A Big "Thank You" To Strawberry Perl Folks.
No replies — Read more | Post response
by Anonymous Monk
on Mar 31, 2015 at 03:14

    Big Thank You to all the Strawberry Perl creators/Maintainers/Developers. You have created an awesome distribution. What is even more awesome is Strawberry Portable Perl. It has made my life simpler. No Admin Rights needed to install it. I have it running on our production servers. There are some applications which are using Perl, and I treat that as "System Perl". So no fiddling there. I first downloaded the portable perl to my workstation, installed a few modules, and simply copied the folder to the production server, where I had some scripts running. Worked like a charm. Beautiful.

    It also ended up installing gmake/dmake etc which were extremely useful. I use gVim on windows and was recently playing around with some plugins which required vimproc. Compiling it was easy peasy. All thanks to the extra goodies you folks have provided.

    Thank you all once again.

MJDs Contract Warnings - courtesy of Perlweekly
4 direct replies — Read more / Contribute
by ww
on Mar 30, 2015 at 08:10
