http://www.perlmonks.org?node_id=11133348

I often have to update Perl and the CPAN dists i use for my projects on many machines at once. Over time i have come to realize more and more how much of a drag badly written test suites are. In this post, i'll try to formulate this into a somewhat readable list of things that can be improved.

Make your tests as short and fast as possible

Some dists really seem to drag their feet when it comes to running tests. Do you really need to run a gazillion slow tests on every single computer your dist is installed on? This wastes the user's time.

For example, do you really need to test if 1+1=2? You can assume that Perl did some extensive tests on the basics during its installation, so you don't have to.

Don't run extensive "timeout" tests unless you absolutely have to.

Make more of your tests Author-only tests

Decide which tests only make sense for you, the author. Prime examples would be tests that check the documentation or run Perl::Critic. Others might check whether your network protocol handler conforms to the specification. Unless part of your code is specific to the system architecture, these tests only need to be run whenever you change the code.
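A minimal sketch of how such a test file could gate itself, assuming the widely used AUTHOR_TESTING and RELEASE_TESTING environment variables. The helper name wants_author_tests and the placeholder check are made up for illustration.

```perl
use strict;
use warnings;
use Test::More;

# True when the person running the suite has opted in to author tests.
# Takes the environment as a hash reference so it is easy to exercise.
sub wants_author_tests {
    my ($env) = @_;
    return ($env->{AUTHOR_TESTING} || $env->{RELEASE_TESTING}) ? 1 : 0;
}

SKIP: {
    skip 'author test; set AUTHOR_TESTING=1 to run', 1
        unless wants_author_tests(\%ENV);

    # Placeholder for an expensive check (documentation, Perl::Critic, ...).
    ok(1, 'expensive author-only check would run here');
}

done_testing();
```

On a user's machine the block is reported as skipped and costs next to nothing. The author runs the same file with AUTHOR_TESTING=1 before a release.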

Don't make assumptions about the user's network architecture.

Quite a few dists on CPAN make assumptions about how the user's network setup behaves. Don't assume that the user is even allowed or able to access the internet. For the same reason, don't assume DNS resolution is going to get you the expected results; these days many users and companies run filters on their nameservers.
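One way to keep network access strictly opt-in, sketched under two assumptions: the NO_NETWORK_TESTING variable is honoured as a hard "no", and MYDIST_NETWORK_TESTS is a hypothetical opt-in switch invented for this example.

```perl
use strict;
use warnings;
use Test::More;

# Decide whether network tests may run, given an environment hash reference.
# NO_NETWORK_TESTING always wins; MYDIST_NETWORK_TESTS is a made-up opt-in
# switch for this sketch, not an established convention.
sub network_tests_allowed {
    my ($env) = @_;
    return 0 if $env->{NO_NETWORK_TESTING};
    return $env->{MYDIST_NETWORK_TESTS} ? 1 : 0;
}

SKIP: {
    skip 'network tests are opt-in; set MYDIST_NETWORK_TESTS=1', 1
        unless network_tests_allowed(\%ENV);

    # A real test would talk to a server the author controls here.
    ok(1, 'network check would run here');
}

ok(1, 'offline tests always run');
done_testing();
```

With nothing set, the default is the safe one: no packets leave the machine and the rest of the suite still runs.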

Don't try to access servers you don't personally own or have explicit permission to use

I've seen tests that run against other people's (or companies') servers. This is generally frowned upon: you are wasting the server owner's resources and quite possibly their money.

Also, your tests might suddenly start to fail. You don't have control over those servers. If their configuration suddenly changes or they go offline, your users might have problems installing your dist.

Accessing certain sites may also be against company policy or even against the law in the country where the user resides. Think "news sites" for example. Accessing news sites during work hours might get the user into trouble. Trying to access these sites might even get the user into legal trouble because they are "banned" in the user's country.

Don't ask stupid questions while installing

Quite a few dists hold up the installation process to ask stupid questions. This is very annoying, especially if you run the installation on multiple hosts at once and have to constantly switch through all the tabs of your terminal just to check whether some installation script has put its lazy feet on the table and won't do anything until you press enter.

A typical example: "Should i run the network tests over the internet?" Answer: "No, you shouldn't" (see above). Provide some author tests instead that users can run if they run into trouble, and document this in the dist's "Troubleshooting" section.
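If a question really cannot be avoided, it should at least carry a safe default so that unattended installs don't hang. A small sketch using the prompt() helper from ExtUtils::MakeMaker, which returns the default without waiting when PERL_MM_USE_DEFAULT is set or no terminal is attached. The question text is made up.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker qw(prompt);

# Simulate an unattended install for this demonstration.
local $ENV{PERL_MM_USE_DEFAULT} = 1;

# The second argument is the default; 'n' keeps the safe behaviour.
my $answer = prompt('Run the optional online tests?', 'n');

print "online tests: ", ($answer =~ /^y/i ? 'enabled' : 'disabled'), "\n";
```

This does not remove the question, it only makes sure nobody has to press enter for the installation to continue.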

Another good example is Template Toolkit. You get the questions "Do you want to build the XS Stash module?" and "Do you want to use the XS Stash by default?". How the f should i know? You are the author and you should know the answer to that better than me. If the author's answer is "i don't know", then try to build the module and treat errors as non-fatal. If the module builds, run its tests; if those tests pass, use it as the default.
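The same "try it, fall back quietly" idea can also be applied at load time instead of install time. The following is only a sketch of that variant, not how Template Toolkit does it; the backend module names are hypothetical.

```perl
use strict;
use warnings;

# Return the first module name from the list that can be loaded, or undef.
sub first_loadable {
    my @candidates = @_;
    for my $module (@candidates) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        return $module if eval { require $file; 1 };
    }
    return;
}

# Hypothetical names: an optional XS backend and a pure-Perl fallback.
my $backend = first_loadable('My::Dist::Backend::XS',
                             'My::Dist::Backend::PurePerl');
print defined $backend ? "using $backend\n" : "no backend available\n";
```

The user is never asked anything: the fast implementation is used when it is there, and the fallback otherwise.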

Don't keep outdated tests and workarounds for external modules

If your dist uses external modules, and you encounter bugs, you will probably check for those bugs and write a workaround. Say the author of that module has since fixed the bug. Instead of keeping slow workarounds and extensive tests for that outdated module around, just require the current version of that external module as your minimum version.
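In a dist this requirement normally lives in the prerequisite list of the build file. The check behind it can be sketched with the VERSION method every module inherits, which dies when the installed version is too old. List::Util merely stands in for "that external module" here, and has_min_version is a made-up helper name.

```perl
use strict;
use warnings;

# True if $module can be loaded and is at least version $min.
sub has_min_version {
    my ($module, $min) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; $module->VERSION($min); 1 } ? 1 : 0;
}

# List::Util is only a stand-in for the external module in question.
print "List::Util 1.0 or later: ",
    (has_min_version('List::Util', '1.0') ? "yes" : "no"), "\n";
```

Once the minimum version is declared, the workaround and its tests can simply be deleted.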

Don't install if your requirements are not fulfilled.

If your dist needs either ACME::Foo or ACME::Foo::PurePerl installed to function properly, don't just print a message and continue. Either fail the installation (not good) or pick one at development time. In this case, picking the PurePerl version might be more reliable. Just printing a message during installation is pretty useless, especially if your dist is getting installed at the same time as a number of other dists. Your message might be on screen for only a fraction of a second.

Don't make the computer unusable during building and testing (unless you have to).

A good example where this is unavoidable is installing Tk. It has to pop up many different windows to test the GUI library, which makes working on the computer while the installation is running impossible due to focus stealing. But if at all possible, avoid this.

Be careful when testing floating point numbers

Different systems might handle floating point numbers in slightly different ways. Just because 1.0 + 0.05 = 1.05 on your machine, this might not be true on the user's system. It might only be "1.04999999". That is how floating point numbers work on computers, because they use binary representations of varying length (precision) internally.
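A common way to deal with this in a test is to compare with a tolerance instead of using ==. A minimal sketch; the helper name nearly_equal and the default tolerance of 1e-9 are arbitrary choices for this example and need to be adapted to the precision your code actually promises.

```perl
use strict;
use warnings;
use Test::More;

# True if $got is within a relative tolerance of $expected.
# For expected values below 1 the tolerance is treated as absolute.
sub nearly_equal {
    my ($got, $expected, $tolerance) = @_;
    $tolerance = 1e-9 unless defined $tolerance;
    my $scale = abs($expected) > 1 ? abs($expected) : 1;
    return abs($got - $expected) <= $tolerance * $scale ? 1 : 0;
}

ok(nearly_equal(1.0 + 0.05, 1.05), '1.0 + 0.05 is close enough to 1.05');
ok(!nearly_equal(1.05, 1.06),      'values that really differ still fail');
done_testing();
```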

perl -e 'use Crypt::Digest::SHA256 qw[sha256_hex]; print substr(sha256_hex("the Answer To Life, The Universe And Everything"), 6, 2), "\n";'

Re: Let's try for a better CPAN experience
by Tux (Canon) on Jun 01, 2021 at 11:40 UTC

    I agree on many items, but not on all.

    Removing tests for workarounds that got fixed /might/ be a speedup, but it for sure has value to those that want to make your module work on older versions of dependencies that are officially not supported.

    The author releases a best-effort of their work and tested that against a minimum version of resources. That however is no hard limit. Someone might take the source and make it work on minimumversion - 3 and or with a deprecated version of a module that once was supported but due to lack of maintainers now isn't anymore. Those tests help a lot in getting the new code working in the old environment(s).

    For some modules/distributions requiring a "recent" version of perl or other prereq isn't a real problem, but some authors try really hard to make their work function as expected on a range of perl releases and configurations that many of the end users do not really care about, but this will for sure make the code more reliable and probably easier to port to new architectures or configured environments. Having those tests is or might be a slowdown in 95% of the cases but it makes development a lot easier.

    I 100% agree with the Tk tests making the desktop useless, certainly if you do 10 in parallel on different systems, but I do not have a sane workaround to that :(

    On the floating point numbers, I'd like to add that next to different architectures, there are also different configurations. 32bit, 64bit, longdouble and quadmath are a few that have huge impact on test results. I know it is close to impossible for the majority of CPAN authors to verify that all of that works throughout the test suite, but some modules really start off wrong in their expectations. Additional problems occur when the test suite communicates with servers of a different architecture (e.g. NFS, databases, SOAP, ...), where different rounding and truncation change the returned values.

    Last but not least, when rolling large sets of installations and or updates, please make your own life easier and start using distroprefs. The CPAN client supports a way to answer all those nasty questions for you with what *you* think are the only appropriate answers. I have mine available on github, but Andreas has an even more extensive set of examples in the distribution. DO NOT BLINDLY COPY! Your preferences might not match!


    Enjoy, Have FUN! H.Merijn

      I agree on many items, but not on all.

      That is the result i was hoping for. Only if people disagree can both sides learn something new :-)

      For example, i completely did not realize that Distroprefs was a thing... Thanks!

      On win32 at least, Tk can do all its popping up in the background ... if cmd.exe doesn't have focus when testing starts (some kind of heuristic).
Re: Let's try for a better CPAN experience
by eyepopslikeamosquito (Bishop) on Jun 01, 2021 at 10:23 UTC

    Do you really need to run a gazillion slow tests on every single computer your dist is installed? ... Don't run extensive "timeout" tests unless you absolutely have to ... Make your tests as short and fast as possible ... Make more of your tests Author-only tests ...
    Thanks for taking the time to report your real world experiences in this area. Very much appreciated.

    Though agreeing with Your Mother's sentiment (namely I refuse to criticize devs willing to do things I am not) I felt very sad about the reception my Perl CPAN test metadata proposal received back in 2010. Actually, I still feel my suggested declarative trumps imperative approach to CPAN test metadata is the best general approach to this tricky problem ... though it appears to have little support from the people actually doing the work.

      In case you're interested, though unable to sell my test metadata ideas to the perl-qa folks, I was allowed to implement a simple test metadata scheme at work, mostly for C++, but also Perl and other languages. We used identical test metadata names across all languages and all types of tests (not just unit tests) ... and integrated with our build and release tools.

      In practice, the most popular and useful metadata was Smoke, with a value of Smoke=1 indicating a Smoke Test. Smoke tests need to be robust and fast because if they fail, the change is automatically rejected by our build tools.

      We also learnt that it's vital to quarantine intermittently failing tests quickly and to fix them quickly ... only returning them to the main build when reliable. If you don't do that, people start ignoring test failures! You need a mindset of zero tolerance for test failures, aka No Broken Windows.

      An interesting metadata extension is to keep metrics on the test suite itself. Is a test providing "value"? How often does it fail validly? How often does it fail spuriously? How long does it take to run? Who writes the "flakiest" tests? ;-)

      See also: Effective Automated Testing

Re: Let's try for a better CPAN experience
by hippo (Bishop) on Jun 01, 2021 at 10:08 UTC

    If you are setting the testing environment variables to your requirements then many of these problems should be eliminated. If a test suite doesn't honour the variables then that's worth raising as an issue against the dist in question.

    Don't run extensive "timeout" tests unless you absolutely have to.

    If you are referring to idle timeouts then installing/updating your modules in parallel can mitigate that. However, I agree with you and long timeouts should not be part of a standard test suite at all.


    🦛

      Yes, you can work around some of that stuff by environment variables. My thinking was more along the lines of a standard setup.

      Also, it would be great to have an option in the cpan shell config to just set those whenever calling the shell for installing a dist. At least, i haven't seen an option like this.

Re: Let's try for a better CPAN experience
by syphilis (Bishop) on Jun 02, 2021 at 07:42 UTC
    Be careful when testing floating point numbers

    Indeed, that's good advice.
    However, if you're dealing only with nvtypes of 'double' or '__float128', there's pretty much only ever one correct result - so if you find a discrepancy across different systems with either of those 2 nvtypes, then you've almost certainly exposed a bug.
    It gets a lot murkier with the 'long double' nvtype as there are a number of different kinds of 'long double' - and this is further complicated by the fact that some of those implementations are rather questionable (if not outright buggy) to begin with.
    For example, I've encountered long double builds of perl where sqrt(2) != 2 ** 0.5
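To see which case a given perl falls into, the nvtype can be read from the core Config module. The sketch below prints it and compares the two expressions mentioned above; the result naturally depends on the build it runs on.

```perl
use strict;
use warnings;
use Config;

# nvtype is the C type perl uses for floating point values on this build.
printf "nvtype=%s nvsize=%d bytes\n", $Config{nvtype}, $Config{nvsize};

# Print both results with plenty of digits so a mismatch is visible.
my ($root, $power) = (sqrt(2), 2 ** 0.5);
printf "sqrt(2)  = %.40g\n2 ** 0.5 = %.40g\n", $root, $power;
print $root == $power ? "identical\n" : "different\n";
```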

    Cheers,
    Rob
Re: Let's try for a better CPAN experience
by EvanCarroll (Chaplain) on Jun 01, 2021 at 16:42 UTC
    Got a big one for 2021. Don't assume that your code is built as root or with elevated privileges. https://github.com/rurban/Net-Ping/issues/27


    Evan Carroll
    The most respected person in the whole perl community.
    www.evancarroll.com

      As a general principle that's probably a good guideline, but in this specific case . . . eeeh. Writing to a raw IP socket has historically depended on being done as root. If you're writing something which in the normal *NIX sense requires elevated privileges ((say) managing or changing effective UID, or writing to raw network devices) then unless you're running as root the code is not going to work.

      Perhaps it hints at needing a more fine-grained permissions / capabilities framework that could automagically enable / disable tests where the current context is lacking (or provide gated access via something like sudo). Or maybe MOAR ENVIRONMENT VARIABLES to dictate the context.
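Until something more fine-grained exists, a test can at least check its own context and skip the privileged parts. A minimal sketch for Unix-like systems, where an effective UID of 0 means root; the helper name is_root and the placeholder test are made up for this example.

```perl
use strict;
use warnings;
use Test::More;

# True if the given effective UID belongs to root (Unix-like systems only).
sub is_root {
    my ($euid) = @_;
    return $euid == 0 ? 1 : 0;
}

SKIP: {
    # $> holds the effective UID of the running process.
    skip 'raw socket tests need root', 1 unless is_root($>);

    # Placeholder for a test that opens a raw socket.
    ok(1, 'privileged test would run here');
}

ok(1, 'unprivileged tests still run');
done_testing();
```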

      The cake is a lie.
      The cake is a lie.
      The cake is a lie.

Re: Let's try for a better CPAN experience
by Anonymous Monk on Jun 01, 2021 at 20:41 UTC
    Base your tests exactly on what you wrote. Your tests should match what your module expects to be able to do, and how it expects to do it, and should exercise every one of those things. It should test for every library that it depends on, and exercise (only, but every one of) the calls that it expects to make against those libraries. It doesn't need to test the Perl interpreter itself. It doesn't need to test that math works.

      The reason why it is potentially a good idea to test that math works is human beings and computers don't conceptualize math the same way and the line between known solid ground and buggy assumptions can be wide.

      Meaning: Math != what your code is doing. There is an example of anti-intuitive testing and why a simplistic test that passes 99.9% of the time can lead to an exercise in hair pulling; Re: why Test::More?

      And with regards to the original thesis: 20 extra seconds, nay, minutes!, of a test run is a *gift* if it prevents a single shipped bug. I have never once spent less than that, probably not less than an hour, diagnosing and fixing a bug in someone else's distribution. Final note: adding tests itself is a form of thinking about the code concretely and not as a conception. That alone is frequently extremely helpful and extra tests can always be removed or factored into bigger tests as subtests.