Advice wanted for debugging CPAN Testers failures

by pryrt (Abbot)
on Aug 23, 2016 at 15:59 UTC

pryrt has asked for the wisdom of the Perl Monks concerning the following question:

Fellow Monks,

How do you go about debugging failures from CPAN Testers when your own configurations are not failing? I'd like advice, both in general, and anything you see in my specific examples below.

For example, this test matrix has a bunch of failures -- but when I test on my machines, I cannot replicate the errors they are getting.

Before releasing, I tested on a couple of different versions I have access to (Strawberry Perl 5.24.0_64 on Win7 and an ancient CentOS 4.6 Linux 2.6.9-55 box with perl 5.8.5), and neither failed my test suite. Since seeing the CPAN Testers failures, I've started adding berrybrew installations to improve version coverage -- but so far, they've all passed, even on Perl versions that failed in the Linux column.

After I've exhausted the available Strawberry installations, I will probably grab one of my Linux virtual machines and start adding perlbrew installations, and run through as many as I can there (I cannot install perlbrew or other local perls on the CentOS machine I noted, due to disk restrictions). But even after trying a slew of new versions, I cannot guarantee that I'll see the same failures that CPAN Testers is showing me.

I know where I'll be looking for the specific errors: my expected values are wrong. The expected values were being generated by functions I thought were fully tested earlier in my test suite, so I'll have to look into that some more, and also consider generating the expected values independently.

But if I cannot replicate the exact failures from CPAN Testers, it's going to be harder to know I've solved the problem. For my last feature release, I ended up submitting beta versions to CPAN with extra debug printing, waiting overnight while the CPAN Testers ran, and then basing my fixes on changes in those results. That's a rather slow debug process... and I noticed that with every submission I was getting fewer results from the testers: I think some of those auto-testers have submission limits, or otherwise remember that a particular module fails and stop testing new versions.

Any advice, generic or specific, would be welcome.

Replies are listed 'Best First'.
Re: Advice wanted for debugging CPAN Testers failures
by stevieb (Canon) on Aug 23, 2016 at 19:05 UTC

    Got a git repo that we can point to for our own testing? I wrote Test::BrewBuild for testing these kinds of things (note that on win2k12+, the bbtester program can NOT be run in the background; one must use --fg mode. Background mode works fine on win2k8), and I'll gladly throw the tests against as many servers and Perl configurations as I can :)

    You can test against all Perl/Berrybrew instances installed locally like this (inside of your distribution's root directory):

        # test against all installed perls, without doing anything else
        brewbuild

        # -r removes all installed Perls, less the one you're 'switch'ed to
        # -n installs N number of perls, randomly, and tests against them all
        brewbuild -r -n 4

        # same thing, but after removing all Perls, install
        # specific versions to test against
        brewbuild -r -i 5.22.1 -i 5.24.0 -i 5.10.1

    Essentially, the ideal test is to wipe all of your perl/berrybrew instances away on each test run, so you are literally starting them from scratch.

    When all else fails, email/contact the owner of the tester. This is listed in the Tester column under each OS/Perl ver test. For example, see this aggregate test report. Sometimes their email is listed in the end test results, other times, you can just drop their name into an Author search on MetaCPAN. If you can't find contact info on their Meta page, open up one of their distributions, and an email address can likely be found down around the Copyright section of the documentation.

    People who operate testers for us are very friendly, and are willing to help diagnose problems you can't diagnose yourself... such as running custom builds on specific configurations outside of normal CPAN Testers operations, providing feedback on their environment, or even looking at the code to see if they can spot issues you may have missed. (In fact, some of them are proactive: if they know what the problem is, they'll email you about it before you've even read your failure report the next morning.)

        I can repro some of the failures on Linux Mint with Perlbrew and my brewbuild application.

        I just did a whole bunch of transplanting so I'm done for the day (time to go read; I bought the Harry Potter collection and I'm on the second one ;), but I will have a good look tomorrow to see if I can figure out why it's breaking. If I don't get the time to thoroughly look at it tomorrow and you still don't have any *nix VMs configured, /msg me your email address and I'll give you access to one of my AWS servers that you can troubleshoot with when I get home from work tomorrow.

        Here's the test build output:

        spek@sainai:~/repos/Data-IEEE754-Tools$ brewbuild -r -i 5.10.1 -i 5.20.3
        - removing previous installs...
        - installing perl-5.10.1...
        - installing perl-5.20.3...
        5.24.0 :: PASS
        5.20.3 :: FAIL
        5.10.1 :: FAIL

        Here's the full brewbuild FAIL log for 5.20.3.

Re: Advice wanted for debugging CPAN Testers failures
by syphilis (Archbishop) on Aug 24, 2016 at 00:46 UTC
    Any advice, generic or specific, would be welcome.

    Have you looked at the specific failing reports?

    For example, this one (MS Windows, perl-5.16.0) shows that all of the failures are related to the way that NaN is being displayed.
    MS Windows has a myriad of ways of displaying NaN.
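
    If you want to see what a given build does, a quick probe is trivial (just a sketch; the sample outputs in the comment are strings seen from various compilers/runtimes, not an exhaustive list):

        # print this build's NaN stringifications
        my $nan = 9**9**9 - 9**9**9;    # inf - inf yields NaN
        print "$nan ", -$nan, "\n";
        # outputs seen in the wild include 'NaN', 'nan', '-nan', and
        # MSVC-style '1.#QNAN' / '-1.#IND', depending on the C runtime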

    Interestingly, the module passes all tests on 5.16.0 for me (on Windows) but I'm not using the precise Strawberry build that produced that FAIL report.
    You could grab the perl that produced that FAIL report and give it a spin yourself.
    (My perl-5.16.0 was built with a different version of gcc from a different vendor.)

    Similarly, the other FAIL reports I looked at arose because an expected 'nan' turned out to be '-nan' (or vice-versa).

    It seems to me that there's a diversity in the way that the sign of the NaN is determined.
    If IEEE-754 decrees that such diversity should not exist then you're obviously up against a number of perl builds that are not compliant with IEEE-754 in that regard.
    And it's quite likely not perl's fault - rather the fault of the underlying system/compiler/libc.

    If you really want to test the sign of NaN, I would suggest that, at the 'perl Makefile.PL' step, you run a C program that determines whether NaN's signedness is being treated as you expect.
    Then proceed to act on that determination as you wish.
    (That is, if there's a problem with the setting of the sign, do you want to provide a workaround? ... or do you want to skip the tests regarding the sign? ... or do you want to do something else?)
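
    (If you'd rather avoid a compile step, the same check can be sketched in pure Perl, assuming IEEE-754 doubles; pack's 'd>' modifier needs perl-5.10 or later.)

        use strict;
        use warnings;

        # probe the sign bit of this build's default NaN, so Makefile.PL
        # can decide whether to skip or adapt the sign-sensitive tests
        my $nan   = 9**9**9 - 9**9**9;           # inf - inf => NaN
        my ($msb) = unpack 'C', pack 'd>', $nan; # first byte, big-endian
        my $sign  = $msb >> 7;                   # top bit is the sign
        print $sign ? "default NaN is negative on this build\n"
                    : "default NaN is positive on this build\n";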

    As a ghastly example of what I'm alluding to, you could look at what I do in Math::LongDouble to determine such things as whether nan**0 is evaluated correctly.

    Cheers,
    Rob
      Thanks. I had looked at a couple of the reports, and had seen that it was a sign issue on the NaN: specifically, my expected value is the wrong sign (at least in the instances I dug into), which means the problem is somewhere in the way I produce my expected value (I probably should've just hardcoded the expected values once I knew what my test matrix looked like).
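
      (By "hardcoded" I mean something like the sketch below; to_hexstr is just a hypothetical stand-in for whatever function generated the expected values.)

          use Test::More tests => 2;
          # literal expected strings can't inherit a bug from helper
          # functions that were themselves tested earlier in the suite
          is( to_hexstr( 0.0), '0000000000000000', 'positive zero' );
          is( to_hexstr(-0.0), '8000000000000000', 'negative zero' );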

      The last time I'd looked, the Windows 5.16.0 failure hadn't shown up in the matrix. That should make it easier to replicate, since it's just a Strawberry Perl build that I should be able to download tomorrow.

      Again, thanks.

      This morning, I've been able to replicate the error on Strawberry Perl 5.16.0 32bit. Which means I can now focus on debugging (when I have time). It's so much easier to debug something you can reproduce locally. :-)

      ++syphilis ++stevieb for all the advice, help, and extra miles.

Re: Advice wanted for debugging CPAN Testers failures
by pryrt (Abbot) on Aug 28, 2016 at 22:32 UTC
    Final update:

    I've set up my virtual machine with a slew of Linux perls in perlbrew and, after updating my berrybrew list to include more versions, I was able to do a complete debug of the problem.

    First issue: I was using Perl's default stringification of the floating-point numbers for the comparisons in the failing tests... which was what was causing my Linux 5.8.5, and the >= 5.22 perls, to pass. Really, they would have been failing too if I'd been using my own stringification rather than Perl's, and I would have found the real problem before doing my beta release to CPAN to get the CPAN Testers results.
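
    To illustrate the masking (a sketch, not the actual test code): two NaNs can stringify identically even when their sign bits differ, so a string comparison can pass where a bit-level comparison would fail.

        use strict;
        use warnings;

        my $nan     = 9**9**9 - 9**9**9;   # a NaN
        my $neg_nan = -$nan;               # may flip the sign bit

        # the default stringifications can compare equal...
        printf "strings: '%s' vs '%s'\n", $nan, $neg_nan;
        # ...while the underlying bit patterns differ in the sign bit
        printf "bits:    %s vs %s\n",
            map { unpack 'H16', pack 'd>', $_ } $nan, $neg_nan;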

    Second issue: Perl has apparently decided not to follow the IEEE Std 754™-2008 rules, which specify that for abs() (and negate() and copySign()), NaNs are to be treated identically to other numeric values and have the sign bit cleared (or negated, or copied). Since abs never claims to be following the standard, that's their prerogative. I just got rid of all instances of CORE::abs() in my test suite, and have specifically recommended using the module's abs() function when standard-compliant handling of NaN signs is desired.
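
    For anyone following along, here's the distinction in miniature (a sketch; ieee754_abs is an illustrative helper, not the module's actual implementation):

        use strict;
        use warnings;

        # IEEE-754-style abs(): clear the sign bit and nothing else,
        # treating NaN exactly like any other value
        sub ieee754_abs {
            my @bytes = unpack 'C*', pack 'd>', shift;
            $bytes[0] &= 0x7F;              # clear the sign bit
            return unpack 'd>', pack 'C*', @bytes;
        }

        # build a negative quiet NaN directly from its bit pattern
        my $neg_nan = unpack 'd>', pack 'C*', 0xFF, 0xF8, (0) x 6;

        # CORE::abs makes no promise about a NaN's sign; the IEEE-style
        # version always reports sign bit 0
        for my $try ( [ 'CORE::abs'   => CORE::abs($neg_nan)   ],
                      [ 'ieee754_abs' => ieee754_abs($neg_nan) ] ) {
            my ($msb) = unpack 'C', pack 'd>', $try->[1];
            printf "%-11s result sign bit = %d\n", $try->[0], $msb >> 7;
        }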

    Now on to the one last feature I'm adding in this group, and then a potential v0.014 live release to CPAN in the near future.

    Thanks to ++stevieb and ++syphilis for the advice, debugging, and testing.

      Nice work! Glad you were able to get it sorted.

Re: Advice wanted for debugging CPAN Testers failures
by Anonymous Monk on Aug 24, 2016 at 01:28 UTC
    the version of the c-runtime would have more to do with NaN than perl version
      the version of the c-runtime would have more to do with NaN than perl version

      which is why it surprised me that, in the test matrix, the same user on apparently the same system, with two different perl versions but the same gcc version and the same libc version, got different results.

        apparently the same system, with two different perl versions but same gcc version and same libc version, got different results

        Are you sure?
        It's no big deal, but I'd be interested to see two reports indicative of that.

        Cheers,
        Rob
