Trojan Perl Distributions

There has been a furore of discussion recently on PerlMonks and on the Modules list, because one author exposed an exploit in the CPAN install process. This exploit is nothing new; it has been there from the beginning, and it is only through the good nature of Perl authors that it hasn't been exploited before. To my mind that says a lot about how much most of us care about CPAN, rather than suggesting that authors didn't know the exploit existed.

However, this is one area that has been bothering me for a while, more or less since I started being a cpan-tester. Aside from the exploit that has recently been highlighted, there is another, just as serious, which many authors liberally (ab)use.

I have usually referred to this other exploit in the context of laboured testing. It exists when authors, to ensure their own modules work, attempt to test other modules that interface to third-party software. The most obvious and most commonly seen example is surrogate DBD testing, where the author of a module tests the connection to a database, through DBI and a DBD driver, to ensure that queries can be processed. I have often asked: why?

In a couple of instances authors have connected to my MySQL database, created a test table, dumped data into it, then reported a successful test. They then forget they've dumped a whole load of gunk into a user's database, with no thought to cleaning up after themselves. Now what happens if that test table already exists (some test tables are not called 'test') and the module author dumps gunk in there? Or perhaps they are trying to be conscientious and delete the table for you? What happens to the application that requires that table? This has happened to me before. I took issue with the module author, and to his credit he did create several tests to ensure that he wasn't clobbering any existing tables. Thankfully I back up my database regularly, and it was only test data, so no real harm was done. But that wouldn't be the case when installing on a live server. Another scenario is when a module does check for the existence of a named table and fails because it exists. It may only exist because a previous test run of the same module crashed and left it behind. Is that really a failure?

Testing the database is a job for DBI and the DBD drivers, and they all do it very well. So why test it all over again, at the risk of damaging your reputation as a good and conscientious CPAN author? There are some distributions that even test several drivers! Shouldn't you be testing what your modules do with the results, rather than whether the driver works? Some distributions do prompt the user or expect environment variables, so at least you know what they are trying to do. However, there are some that don't catch bad or non-existent settings, and thus an automated FAIL report is sent. That then prompts the author to respond to the cpan-tester, asking them not to do that! Why? Why should a distribution fail just because it currently doesn't have access to an active database?
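As a rough illustration of catching non-existent settings, here is a minimal sketch of a test script that skips its live database tests, rather than failing, when the tester hasn't supplied any connection details. The environment variable names (MYMODULE_TEST_DSN, MYMODULE_TEST_USER) and the skip_reason helper are made up for this example; a real distribution would document its own.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical setting names; a real distribution would document its own.
my @REQUIRED = qw(MYMODULE_TEST_DSN MYMODULE_TEST_USER);

# Return a skip reason if any required setting is missing or empty,
# or undef when the tester has supplied everything.
sub skip_reason {
    my ($env) = @_;
    my @missing = grep { !defined $env->{$_} || $env->{$_} eq '' } @REQUIRED;
    return @missing ? "no test database configured (missing: @missing)" : undef;
}

SKIP: {
    my $reason = skip_reason(\%ENV);
    skip $reason, 1 if defined $reason;

    # Live database tests would go here; they only run on opt-in.
    ok(1, 'database settings supplied');
}

done_testing();
```

With nothing set, the script reports a skipped test instead of a failure, so an automated tester without a database has nothing to send a FAIL report about.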

Thankfully there is another way ... Test::MockObject. For some time I have been trying to steer people who would dearly benefit from its features towards this little module. Recently I have been trying to find the time to write the database tests, using Test::MockObject, for one author's recent CPAN module. I started well, but realised there was a better way to code all these tests using a combination of Storable and Test::MockObject. Alas I have been too busy to finish it all :(

I have used Test::MockObject quite successfully to test applications that can use several different databases, without ever accessing a database. I don't need to. I can rely on the fact that DBI and the DBD drivers will do the right thing, and just test that the result sets of known requests are used correctly.
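To give a flavour of the idea, here is a minimal sketch, not taken from any real distribution. The active_user_names function, the users table and its rows are all invented for illustration; the only assumption about the code under test is that it is handed something that behaves like a DBI handle.

```perl
use strict;
use warnings;
use Test::More;
use Test::MockObject;

# Hypothetical code under test: it only needs something that behaves
# like a DBI handle, so a mock can stand in for the real thing.
sub active_user_names {
    my ($dbh) = @_;
    my $rows  = $dbh->selectall_arrayref('SELECT name, active FROM users');
    my @names = map { $_->[0] } grep { $_->[1] } @$rows;
    return [ sort @names ];
}

# The mock returns a canned result set; no database is ever contacted.
my $dbh = Test::MockObject->new;
$dbh->set_always(
    selectall_arrayref => [ [ 'carol', 1 ], [ 'alice', 1 ], [ 'bob', 0 ] ],
);

my $names = active_user_names($dbh);
is_deeply($names, [ 'alice', 'carol' ], 'inactive users are filtered out');

# Check that the code asked the question we expected it to ask.
# next_call returns the method name and its arguments (object first).
my ($method, $args) = $dbh->next_call;
is($method, 'selectall_arrayref', 'queried through the handle');
like($args->[1], qr/FROM users/, 'with the expected SQL');

done_testing();
```

The test exercises what the module does with a known result set, which is the part the module author is actually responsible for, and it passes on any machine whether or not a database is installed.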

By the way, just in case you were wondering, I didn't write Test::MockObject, chromatic did. It's him you should thank for thinking up a cool module :)

The only argument I can see for not using Test::MockObject is that it adds yet another dependency, which authors may not want. Personally, I would rather have the dependency. Then again, maybe I'm missing a more obvious reason ... authors don't know that Test::MockObject exists and what it can do. However, there have been a couple of discussions recently on PerlMonks that involve mocking, so perhaps we will start to see a few more authors adopting this approach.

With the recent discussions in mind, I sincerely hope that some misguided soul doesn't upload a module that includes a drop database query, and moreover that nobody falls foul of it.

--
Barbie | Birmingham Perl Mongers | http://birmingham.pm.org/
