http://www.perlmonks.org?node_id=627174

The problem: automating testing of a data-driven application requires creating test data to run the test, and removing the test data again when the test is finished.

The goal: Make the data cleanup process automatic. The data should be cleaned up without an explicit call to a cleanup function, and it should work even if the test script dies half-way through.

The solution: As part of creating the test data, keep track of the IDs of the data created in a package-scoped array. Use an END {} block to call the cleanup function, using the package-scoped array to know what to clean up. The END block gets called when the script is exiting, even if it is exiting because of a "die".

Some example pseudo-code:

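# IDs of everything the test helpers have created so far
our @IDS_TO_DELETE;

# Create a widget and remember its ID for later cleanup
sub create_test_widget {
    my $widget = Widget->new;
    push @IDS_TO_DELETE, $widget->id;
    return $widget;
}

# Delete every widget that create_test_widget() recorded
sub delete_test_widgets {
    for my $id (@IDS_TO_DELETE) {
        Widget->delete($id);
    }
}

# Runs when the script exits, including after a die
END { delete_test_widgets(); }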

There you have it. Now, just by creating some test data, you ensure it will automatically be removed when the test is finished. I suggest putting a pair of functions like this in a private testing module and then importing the test data creation function.
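For readers who want to try the idea end to end, here is a minimal runnable sketch of such a private testing module. It is only an illustration under stated assumptions: the in-memory Widget class, the module name My::TestData, and the SIMULATE_FAILURE environment variable are hypothetical stand-ins, and everything is kept in one file so the sketch runs on its own (Perl 5.14 or later).

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for the real Widget class.
package Widget {
    my %STORE;            # id => object, plays the role of the database table
    my $NEXT_ID = 1;

    sub new {
        my ($class) = @_;
        my $self = bless { id => $NEXT_ID++ }, $class;
        $STORE{ $self->{id} } = $self;
        return $self;
    }
    sub id     { return $_[0]{id} }
    sub delete { my ($class, $id) = @_; delete $STORE{$id}; return }
    sub count  { return scalar keys %STORE }
}

# Hypothetical private testing module; normally it would live in its own file.
package My::TestData {
    use Exporter 'import';
    our @EXPORT_OK = qw(create_test_widget);

    our @IDS_TO_DELETE;

    sub create_test_widget {
        my $widget = Widget->new;
        push @IDS_TO_DELETE, $widget->id;
        return $widget;
    }

    # pop() empties the list as it goes, so calling this twice is harmless.
    sub delete_test_widgets {
        while (defined(my $id = pop @IDS_TO_DELETE)) {
            Widget->delete($id);
            print "removed test widget $id\n";
        }
        return;
    }

    END { delete_test_widgets() }
}

package main;

# With the module in its own file this would be:
#   use My::TestData qw(create_test_widget);
My::TestData->import('create_test_widget');

create_test_widget();
create_test_widget();
print "widgets created: ", Widget->count, "\n";

# Set SIMULATE_FAILURE=1 to watch the END block clean up after a die.
die "simulated failure\n" if $ENV{SIMULATE_FAILURE};
```

Run normally, the script reports two widgets and the END block then removes both. Run with SIMULATE_FAILURE=1, the script dies first and the same two removals still happen.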

Re: Test Technique: Self-removing test data
by Mutant (Priest) on Jul 18, 2007 at 08:44 UTC

    If you use Test::Class (which I highly recommend!), you can create 'setup' and 'teardown' methods (which run before and after each test), and 'startup' and 'shutdown' methods (which run before and after each test class).

    You can even push these methods into a base class, and have related test classes inherit from this.
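    A rough sketch of that layout, assuming Test::Class is installed from CPAN. The in-memory Widget class and the package names My::Test::Base and My::Test::Widget are hypothetical, chosen only to keep the example in one file:

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for the real data store.
package Widget {
    my %STORE;
    my $NEXT_ID = 1;

    sub new {
        my ($class) = @_;
        my $self = bless { id => $NEXT_ID++ }, $class;
        $STORE{ $self->{id} } = $self;
        return $self;
    }
    sub id     { return $_[0]{id} }
    sub delete { my ($class, $id) = @_; delete $STORE{$id}; return }
    sub count  { return scalar keys %STORE }
}

# Base class: holds the fixture handling shared by related test classes.
package My::Test::Base {
    use parent 'Test::Class';

    # Runs before every test method.
    sub create_fixture : Test(setup) {
        my $self = shift;
        $self->{widget} = Widget->new;
        return;
    }

    # Runs after every test method.
    sub remove_fixture : Test(teardown) {
        my $self = shift;
        Widget->delete( $self->{widget}->id );
        return;
    }
}

# Concrete test class: inherits setup and teardown from the base class.
package My::Test::Widget {
    use parent -norequire, 'My::Test::Base';
    use Test::More;

    sub widget_has_id : Test(1) {
        my $self = shift;
        ok( $self->{widget}->id, 'fixture widget has an id' );
    }

    sub one_widget_per_test : Test(1) {
        is( Widget->count, 1, 'only the current fixture exists' );
    }
}

package main;
My::Test::Widget->runtests;
```

    Each test method sees exactly one widget, because the inherited teardown removes the fixture created by the inherited setup.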

      Test::Class looked interesting to me, too, and I tried it in 2005, but I got a bizarre error and gave up.

      That issue has since been resolved, and I'll consider giving it another shot. However, from looking at the source of Test::Class, it doesn't use the END {} block technique. Even though it has a teardown() method, it is going to fail to clean up when the test script dies in the middle, which seems quite possible during development.

        Test::Class catches exceptions that happen in any of your test methods (including startup, shutdown, setup, and teardown methods) and registers the tests as failed.

        It's conceivable that you could die from outside of your test methods and that would prevent your cleanup from happening, but it would also probably mean that your setup didn't happen either.

        Test::Class is by no means perfect, but it deserves a little more credit than you give it.

Re: Test Technique: Self-removing test data
by perrin (Chancellor) on Jul 18, 2007 at 03:51 UTC

    We did something like this on my last project, and got pretty far with it. Eventually though, I started to wonder why we were bothering. For one thing, it's still not really safe to run it on a production database. What if a test dies? Mess.

    In the end, there were enough special exceptions and additional things to clear that it probably would have worked better to just wipe the whole database before each test script.

      That's the other major model to use: a dedicated test database. I recall that's what Ruby on Rails sets up by default. I have to confess to not trying that approach in earnest because of a perception that it would be too difficult to get started with and maintain as the "fixture" data and data model change over time.

      I agree that it would certainly have the benefit of allowing you to (more) safely exercise the production code-line.

      Are there any specific tools you are using to make this approach easier?

        I haven't tried tearing down the database between scripts. I use the method you described. However, I think the only fixture data you would need is whatever you already have to create a new database (in my case, a script that drops and recreates the tables, and fills in lookup values). The test-specific data is added by the test in either approach.

        Even with an END block, it's possible for your test to die in a way where it won't be able to effectively remove the data you added. That's what makes this approach unsafe for use on a production database.

        It's still faster than dropping the database and recreating it, but as things got more complex, I spent a lot of time troubleshooting problems with deleting the test data, and I think it would have been wiser ultimately to trade a little test speed for the saved debugging time.

        Another wrinkle is web testing with Mechanize on code that creates data. If your web tests cause data to be added to the database, it won't be in your stack to delete, so you end up with manual deletes, and END blocks that complain and crash if the script dies before the data was added. It gets messy.

        We work on a test database, to which we apply the same upgrade script as the one being offered to clients. So maintaining the database is part of the tests too. Apart from that, I think the two methods are more complementary than supplementary in nature, i.e. even on a test database, it is a good idea to distinguish between data which is part of the environment and data introduced by the tests and to clean up the latter, for which the method under discussion is a good idea.
        Tabari
Re: Test Technique: Self-removing test data
by CountZero (Bishop) on Jul 18, 2007 at 21:05 UTC
    Couldn't you clean up the database through the use of transactions? Just by failing to commit the changes made to the database (e.g. when the test dies) or expressly rolling back the changes? Of course not all databases support transactions and if you wish to test something which already uses transactions, you are out of luck.
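    As a hedged illustration of the rollback idea, here is a small sketch using DBI. It assumes DBD::SQLite is installed and uses an in-memory database; the widget table and the widget_count helper are hypothetical names for this example only:

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; :memory: stands in for the real database.
my $dbh = DBI->connect( 'dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 } );

$dbh->do('CREATE TABLE widget (id INTEGER PRIMARY KEY, name TEXT)');

# Hypothetical helper: number of rows currently visible in the table.
sub widget_count {
    my ($count) = $dbh->selectrow_array('SELECT COUNT(*) FROM widget');
    return $count;
}

my $during;

# Everything done between begin_work and rollback is discarded.
$dbh->begin_work;
eval {
    $dbh->do( 'INSERT INTO widget (name) VALUES (?)', undef, 'test widget' );
    $during = widget_count();
    die "simulated test failure\n";
};
$dbh->rollback;    # roll back whether or not the test body died

my $after = widget_count();
print "rows during test: $during, after rollback: $after\n";
```

    The inserted row is visible inside the transaction and gone after the rollback, so no list of IDs has to be kept. This only works when the test and the code under test share the same database handle.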

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      No. The test script often drives code which is tested as a black box, so we don't have access to its internal database connection, and it already uses its internal transactions, too.