I frequently suggest Thread::Queue for pooled thread applications, but in addition to some non-queue-like behavioural cruft, that module has no way of auto-limiting the size of the queue. That means it is all too easy to populate the queue at a rate far in excess of the pool's ability to process those entries, and that can lead to excessive memory consumption.
Here is an implementation of a shared queue to address that deficiency:
    #! perl -slw
    use strict;

    package Q;

    use threads;
    use threads::shared;

    use constant {
        NEXT_WRITE => -2,
        N          => -1,
    };

    sub new {
        # warn "new: @_\n";
        my( $class, $Qsize ) = @_;
        $Qsize //= 3;
        my @Q :shared;
        $#Q = $Qsize;
        @Q[ NEXT_WRITE, N ] = ( 0, 0 );    ## nextWrite, N
        # warn sprintf "new: size %d\n\n", scalar @Q;
        return bless \@Q, $class;
    }

    sub nq {
        # warn "nq: @_\n";
        my $self = shift;
        lock @$self;
        for( @_ ) {
            cond_wait @$self until $self->[ N ] < ( @$self - 2 );
            $self->[ $self->[ NEXT_WRITE ]++ ] = $_;
            ++$self->[ N ];
            $self->[ NEXT_WRITE ] %= ( @$self - 2 );
            cond_signal @$self;
        }
    }

    sub dq {
        # warn "dq: @_\n";
        my $self = shift;
        lock @$self;
        cond_wait @$self until $self->[ N ] > 0;
        my $p = $self->[ NEXT_WRITE ] - $self->[ N ]--;
        $p += @$self - 2 if $p < 0;
        my $out = $self->[ $p ];
        cond_signal @$self;
        return $out;
    }

    sub n {
        my $self = shift;
        # lock @$self;
        return $self->[ N ];
    }

    sub _state {
        no warnings;
        my $self = shift;
        lock @$self;
        return join '|', @{ $self };
    }

    return 1 if caller;
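For orientation, typical use looks something like this (a hypothetical driver, not part of the post; an undef item is used here as an ad-hoc stop signal):

    use threads;
    use Q;

    my $q = Q->new( 10 );                  # producers block once the queue fills

    my $consumer = threads->create( sub {
        while( defined( my $item = $q->dq ) ) {
            print "got: $item\n";
        }
    } );

    $q->nq( $_ ) for 1 .. 100;             # nq blocks whenever the queue is full
    $q->nq( undef );                       # tell the consumer to stop
    $consumer->join;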
Criticisms and comments on the implementation are welcome, but what I'd really like is for people to post what tests they would implement for this module, and how they would implement them.
It's a big ask, I know, and there is a not-so-hidden agenda. Anyone prepared to step up?
Re: Testing methodology
by tobyink (Canon) on Mar 04, 2012 at 17:34 UTC
You don't provide any documentation for the package, but a rigorous test suite would ideally start by testing that the API for the package does actually exist: i.e. the constructor constructs an object, blessed into the right package, and each public method is callable.

Then you test the basic functionality of the package. In this case it's a queue, so FIFO: you'd put some stuff into it and test that it comes out in the correct order.

A feature of the package is that it has a limited size, so you'd want to test that this limit is enforced - i.e. try enqueuing more items than the size limit and check that it blocks. I haven't done any threaded programming in Perl for years, but in my limited experience I'm not sure it's possible to do that test reliably, as it may introduce race conditions. Keeping the enqueued items simple (e.g. integers), and sleeping for a second before testing that the queue is blocked, seems to be sufficient protection against race conditions.

Here's a test script using no non-core modules (apart from Q.pm of course!):
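(What follows is a hypothetical sketch in the spirit of the tests described above, not the original script. The sleep-based blocking check is the crude, admittedly race-prone technique mentioned in the text.)

    use strict;
    use warnings;
    use threads;
    use threads::shared;
    use Test::More tests => 7;

    BEGIN { use_ok( 'Q' ) }

    my $q = Q->new( 3 );
    isa_ok( $q, 'Q' );
    can_ok( $q, qw( nq dq n ) );

    # Single-threaded FIFO check.
    $q->nq( 'a', 'b' );
    is( $q->dq, 'a', 'first in, first out' );
    is( $q->dq, 'b', 'second item follows' );

    # Blocking check: a producer pushing more items than the queue can hold
    # should stall until a consumer drains it. The sleep gives it time to fill.
    my $done :shared = 0;
    my $thr = threads->create( sub { $q->nq( $_ ) for 1 .. 10; $done = 1 } );
    sleep 1;
    ok( !$done, 'producer blocks once the queue is full' );

    my @got = map $q->dq, 1 .. 10;    # draining unblocks the producer
    $thr->join;
    is_deeply( \@got, [ 1 .. 10 ], 'everything arrives, in order' );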
Update: comments on the implementation are welcome... It would be handy to have a few additional methods, for example:

- max_length - the size limit the queue was created with
- is_full - whether the queue is currently at that limit
- peek / peek_all - look at the next item (or all items) without dequeueing
Many of the above are trivial to implement by inspecting @$q; however, implementing them outside the package itself breaks encapsulation. If people using your module start relying on the internal details of how Q.pm works (that it uses an arrayref, keeps its stats in the last two array elements, etc.), this leaves you less freedom to refactor Q.pm in the future if you discover a more efficient way of doing it.

Not only would the above make the module more testable, they'd also make it more useful. As I said, I don't know an awful lot about Perl threading, but I know a bit about parsing, which also tends to operate on a FIFO queue. Parsers for pretty much any non-trivial language have a peek_token method or two hidden away somewhere, for using tokens further up the stream to disambiguate the current token.
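For instance, a peek that respected encapsulation would have to live inside Q.pm itself. A minimal sketch, reusing the index arithmetic from dq (hypothetical - not part of the posted module):

    # Return the oldest item without removing it; blocks until one exists.
    sub peek {
        my $self = shift;
        lock @$self;
        cond_wait @$self until $self->[ N ] > 0;
        my $p = $self->[ NEXT_WRITE ] - $self->[ N ];   # slot of the oldest item
        $p += @$self - 2 if $p < 0;                     # wrap around the ring
        return $self->[ $p ];
    }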
by BrowserUk (Patriarch) on Mar 05, 2012 at 02:23 UTC
Replying to the update only at this time: "It would be handy to have ... Not only would the above make the module more testable, they'd also make it more useful."

Sorry, but I disagree completely with both halves of that statement. A queue has one purpose in life: take things in at one end and let them out at the other as efficiently as possible. Compromising that function to make testing easier is not going to happen. Adding could-dos without use-cases, for their own sake, is not going to happen.
by tobyink (Canon) on Mar 05, 2012 at 08:17 UTC
Re max_length: let's say I'm writing a module Amusement::Park which has a list of Amusement::Rides. Each Amusement::Ride has an associated Q. (This is a popular amusement park.) I'm writing the Amusement::Park but have no control over the Amusement::Rides, so I don't initialise the Q objects. Good amusement park management says I need to point customers at rides with the least busy queues; to do this I need to calculate length ÷ max_length.

A use case for is_full: say I generate events of multiple levels of importance - errors, warnings and debugging messages. If the queue is full, I might choose to silently drop debugging messages rather than being blocked. Or there might be several queues I could potentially add an item to, and I wish to choose one that is not full.

peek/peek_all - there may be several producers but only one consumer, so no danger of peeking a value but somebody else dequeueing it.
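In code, the ride-picking logic might look something like this (hypothetical: it assumes Q gained length and max_length accessors, which the posted module does not have):

    # Pick the queue with the lowest occupancy ratio.
    sub least_busy_queue {
        my @queues = @_;
        my( $pick ) = sort {
            $a->length / $a->max_length <=> $b->length / $b->max_length
        } @queues;
        return $pick;
    }

    # e.g. least_busy_queue( map { $_->queue } @rides )->nq( $customer );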
by BrowserUk (Patriarch) on Mar 05, 2012 at 11:17 UTC
by BrowserUk (Patriarch) on Mar 06, 2012 at 03:23 UTC
Firstly, as I already said, thank you for stepping up. I note that the 'big guns' have ducked and covered, presumably keeping their powder dry.

I've spent the best part of yesterday responding to your test suite section by section, and I just discarded most of it because it would be seen as picking on you, rather than targeting the tools I am so critical of. Again, thank you for being a willing subject. Now's your chance for revenge :) Take it!

Here is my module complete with its test suite:
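(The file as posted is not reproduced here; the following is a hypothetical sketch of the kind of harness described below - three chained queues, two worker pools, a bit-vector tally - and not BrowserUk's actual code. The $ITEMS, $POOL and $QSIZE parameter names are invented, and it assumes the Q package above is loadable.)

    #! perl -slw
    use strict;
    use threads;
    use Q;

    our( $ITEMS, $POOL, $QSIZE );
    $ITEMS //= 1000;   # integers to push through the pipeline
    $POOL  //= 4;      # threads in each of the two pools
    $QSIZE //= 10;     # size limit for all three queues

    my $Qin  = Q->new( $QSIZE );
    my $Qmid = Q->new( $QSIZE );
    my $Qout = Q->new( $QSIZE );

    # Feeder (1 -> many): push the data, then one shutdown token per A worker.
    my $feeder = threads->create( sub {
        $Qin->nq( $_ ) for 1 .. $ITEMS;
        $Qin->nq( undef ) for 1 .. $POOL;
    } );

    # Pool A (many -> many): forward everything from Qin to Qmid.
    my @poolA = map threads->create( sub {
        while( defined( my $item = $Qin->dq ) ) { $Qmid->nq( $item ) }
        $Qmid->nq( undef );                 # pass the shutdown token along
    } ), 1 .. $POOL;

    # Pool B (many -> 1): forward everything from Qmid to Qout.
    my @poolB = map threads->create( sub {
        while( defined( my $item = $Qmid->dq ) ) { $Qout->nq( $item ) }
        $Qout->nq( undef );
    } ), 1 .. $POOL;

    # Collector (main thread): tally every integer in a bit vector so that
    # both duplicates and omissions show up.
    my( $tally, $done ) = ( '', 0 );
    while( $done < $POOL ) {
        my $item = $Qout->dq;
        if( !defined $item ) { ++$done; next }
        die "duplicate item: $item" if vec( $tally, $item, 1 );
        vec( $tally, $item, 1 ) = 1;
    }

    $_->join for $feeder, @poolA, @poolB;

    my $seen = unpack '%32b*', $tally;
    die "lost items: only $seen of $ITEMS arrived" unless $seen == $ITEMS;
    print "OK: $ITEMS items through 1 -> $POOL -> $POOL -> 1, queue size $QSIZE";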
You'll notice that in addition to performing a default test, it can be configured through command-line parameters to vary the key parameters of the test.

The actual test consists of setting up 3 queues. One thread feeds data via the first queue to a pool of threads (1 to many). That pool dequeues the input and passes it on to a second pool of threads via the second queue (many to many). And finally those threads pass the data back to the main thread via the third queue (many to 1).

The data for a run consists of a simple list of integers. Once they make it back to the main thread, they are checked off against a bitmap tally to ensure that nothing is dequeued twice, nor omitted.

All in one file; no extraneous modules; no extraneous output; completely compatible with any other test tools available, because it is nothing more than a simple perl script. Feel free to rip it to shreds.
by tobyink (Canon) on Mar 06, 2012 at 10:17 UTC
Depends why loading fails. If Q.pm parses OK and executes OK but returns false, then the use_ok test will fail but the other twenty tests will still pass. None of your tests cover the situation where Q.pm returns false, because you never attempt to "use Q" or "require Q".

Nothing compels you to put a BEGIN { ... } block around it, but as a matter of style (in both test suites and regular code) I tend to make sure all modules load at compile time unless I'm purposefully deferring the load for a specific reason.
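That is, something like this (a minimal illustration, not taken from either suite):

    use strict;
    use warnings;
    use Test::More;

    BEGIN { use_ok( 'Q' ) }   # compile-time load; fails if Q.pm dies or returns false

    done_testing();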
No it doesn't. It checks that the returned object is of the same class as the name supplied, or a descendent class in an inheritance hierarchy. This still allows you to return Q::Win32 or Q::Nix objects depending on the current OS, provided that they both inherit from Q. To have a class method called "new" in Q which returns something other than an object that "isa" Q would be bizarre and likely to confuse users. Bizarreness is worth testing against; tests don't just have to catch bugs - they can catch bad ideas.

But it can catch bugs anyway. Imagine Q/Win32.pm contains:

    my @ISA = ('Q');

Oops! That should be our @ISA. This test catches that bug.
Notice none of my further tests touch the "n" method? Well, at least its existence is tested for. If for some reason during a refactor it got renamed, this test would fail and remind me to update the documentation. I don't think any of your tests check the "n" method either; if you accidentally removed it during a refactor, end users might get a nasty surprise.

A can_ok test is essentially a pledge that you're not going to remove a method, or not without some deliberate decision process. Use of a formalised testing framework can act as a contract - not necessarily in the legal sense - between the developer and the end users. It's a statement of intention: this is how my software is expected to work; if you're relying on behaviour that's not tested here, then you're on dangerous ground; if I deviate from this behaviour in future versions, it will only have been after careful consideration, and hopefully documentation of the change.

☆ ☆ ☆

Overall, most of your complaints about Test::More seem to revolve around three core concerns: the verbosity of its output; the fact that a run carries on after a failure; and being forced to jump through hoops.

Verbosity of output has never been an issue for me. The "prove" command (bundled with Perl since 5.8.x) gives you control over the granularity of result reporting: one line per test, one line per file, or just a summary for the whole test suite. Yes, you get more lines when a test fails, but as a general rule most of your tests should not be failing, and when they do, you typically want to be made aware of it as loudly as possible.

The fact that test running continues after a failure I regard as a useful feature. Some test files are computationally expensive to run. If lots of calculations occur, then a minor test of limited importance fails, I still want to be able to see the results of the tests following it, so if there are any more failures I can fix them all before re-running the expensive test file. If a particular test is so vital that you think the test file should bail out when it fails, it's not especially difficult to add "or BAIL_OUT($reason)" to the end of the test.
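For example (illustrative only; the test itself is invented):

    # A make-or-break test: abort the entire run if this fails.
    is( $q->dq, 1, 'dequeue returns the first item' )
        or BAIL_OUT( 'basic dequeueing is broken; nothing else can pass meaningfully' );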
Test::Most offers the facility to make all tests bail out on failure, but I've never really used Test::Most.

One man's "forced to jump through hoops" is another man's "saved from writing repetitive code". new_ok saves me from writing:
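(Roughly this, per the Test::More documentation:)

    my $q = Q->new;
    isa_ok( $q, 'Q' );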
Ultimately, if I did ever feel like a particular set of tests wasn't a natural fit for Test::More, there would be nothing to stop me sticking a few non-TAP scripts into my distro's "t" directory, provided I didn't name them with a ".t" at the end. They can live in the same directory structure as my other tests; they just won't get run by "prove" or "make test", and won't be reported on by CPAN testers. It doesn't have to be an either/or situation.
by BrowserUk (Patriarch) on Mar 06, 2012 at 12:03 UTC
by tye (Sage) on Mar 06, 2012 at 05:26 UTC
Yeah, I eventually realized that there was almost nothing in Test::More that I was actually using or even found to be a wise thing to use. isa_ok()? I just can't imagine that finding a real mistake. I can certainly see it complaining about an implementation detail that I might change. I also don't want to use is_deeply(). I think a test suite should complain about what it cares about; if there is extra stuff that it doesn't care about, then it shouldn't complain. And I find Test::Simple to be just a stupid idea.

But I do use and value Test (though I use my own wrapper around it for a few minor reasons and to provide decent Lives() and Dies() implementations - better than the several modules I've seen that purport to provide such). I certainly make frequent use of skip()ing. The fact that Test doesn't use Test::Builder is a minor side benefit that becomes a major benefit every so often when I feel the need to look at the source code. Test::More's skip() is so bizarrely defined that I can't actually use it correctly without reading the code that implements it, and the act of trying to find said code is so terribly aggravating since Test::Builder is involved, so I'm happy to have realized that I never have to go through that again.

There are tons of tools built on top of TAP (and other testing schemes, such as those used by some of our Ruby-based tests). It is actually useful in the larger context for each individual test to get numbered, so we can often correlate different failure scenarios and make concise reports easy. And we have more than one test file per code file in many cases. This is especially useful when there are interesting set-up steps required for some tests. Testing leaf modules is the easiest case and usually doesn't really stress one's testing chops.

Many of my test files abstract a few patterns of test and then run lots of simple tests that are specified with a small amount of data. So, for example, I might have a few dozen lines where each line specifies an expected return value, a method name, and an argument list (and maybe a test description).

Also, having the test code in the same file as the code being tested would complicate coverage measurement, easily distinguishing commits that are fixing code from commits that are fixing tests, searching for real uses of a specific feature while ignoring tests that make use of it, ...

But, no, I'm not interested in "stepping up" to your challenge. Many of my reasons would just come across as a personal attack, so I'll not go into them. But most of what I'm talking about I can't demonstrate well by pasting a bit of code; I have no interest in trying such a feat.

- tye
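A minimal illustration of that data-driven pattern (hypothetical, and using Test::More's is() rather than tye's private wrapper around Test):

    use strict;
    use warnings;
    use Test::More;
    use Q;

    my $q = Q->new( 5 );
    $q->nq( 1, 2 );

    my @cases = (
        # expect, method, args, description
        [ 2, 'n',  [], 'two items on the queue'      ],
        [ 1, 'dq', [], 'oldest item dequeues first'  ],
        [ 1, 'n',  [], 'one item left after dequeue' ],
    );

    for my $case ( @cases ) {
        my( $expect, $method, $args, $desc ) = @$case;
        is( $q->$method( @$args ), $expect, $desc );
    }

    done_testing();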
by BrowserUk (Patriarch) on Mar 06, 2012 at 13:24 UTC
by tye (Sage) on Mar 06, 2012 at 15:32 UTC
Re: Testing methodology
by Anonymous Monk on Mar 04, 2012 at 18:27 UTC
by BrowserUk (Patriarch) on Mar 06, 2012 at 04:06 UTC
"... set up more-than-two consumers and more-than-two producers with random rates of speed from cycle to cycle but some known to be faster/slower than the others."

Thanks for the response, anonymonk. With regard to trying to orchestrate indeterminacy: I've tried in the past, and it is a sucker's game. The one sure-fire thing you learn about concurrency, when you've done enough of it, is that you do not have to orchestrate deadlocks, live-locks, race conditions, or any of the other nasties. Run a bad system for a while, making sure plenty of other, different things are happening in the same system, and the nasties will make themselves known.

Hence, my surety against these anomalies is to run my test suite (posted elsewhere) with big numbers and then play music, watch iPlayer, and defrag my hard disks concurrently. It is a fair bet that a more diverse set of inter-thread timings occurred during that than I could ever hope to orchestrate deliberately. If the test suite completes correctly with all that going on, it is probably bomb proof.

A typical test run consists of this:
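(The captured command and output are missing. Given the -s switch parsing in the shebang line, an invocation of the shape described below might look something like this - the parameter names are hypothetical, matching the sketch earlier in the thread:)

    perl Q.pm -ITEMS=1000000 -POOL=200 -QSIZE=10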
That's 1 million items fed from 1 thread via a queue to a pool of 200 threads; those threads feed it via a second queue to another pool of 200 threads; which in turn feed it via a third queue back to the main thread.

At the same time I'm listening to The Division Bell, whilst "Racing for Time" (a movie) plays away to itself (with the volume off) in a tab in my browser. All of which simply means that my 4 cores are averaging over 90% usage each, and I don't need any heat in the room, despite it being close to zero outside, because the cpu fan is running close to flat out.
Re: Testing methodology
by duelafn (Parson) on Mar 05, 2012 at 14:30 UTC
Why would you not use Thread::Conveyor? As for testing your module, adapting Conveyor.t would at least be a place to start.

Good Day,
by BrowserUk (Patriarch) on Mar 05, 2012 at 14:36 UTC
"Why would you not use Thread::Conveyor?"

Because I know better. Why are you suggesting it when you obviously do not?

I was asked to update this to explain my reasoning, so here it is: Thread::Conveyor doesn't work. It isn't thread-safe. Don't believe me? Try it for yourself! I'll even provide the test script for you:
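(The script itself is not reproduced here; the following is a hypothetical stand-in of the same shape - several producers and consumers hammering one belt - and not the original test script. The producer/consumer counts and item totals are invented.)

    #! perl -slw
    use strict;
    use threads;
    use Thread::Conveyor;

    my $belt = Thread::Conveyor->new( { maxboxes => 50, minboxes => 25 } );

    my @producers = map threads->create( sub {
        $belt->put( $_ ) for 1 .. 100_000;
    } ), 1 .. 4;

    my @consumers = map threads->create( sub {
        while( 1 ) {
            my( $item ) = $belt->take;
            last unless defined $item;     # undef doubles as a shutdown token
        }
    } ), 1 .. 4;

    $_->join for @producers;
    $belt->put( undef ) for @consumers;    # one token per consumer
    $_->join for @consumers;

    print 'done';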
Nine times out of ten this will segfault. On the 10th occasion it will emit error output before hanging. And tracing the run shows where it goes wrong.
by duelafn (Parson) on Mar 05, 2012 at 17:33 UTC
Thank you very much for updating and giving your reason. I do in fact use Thread::Conveyor and had never run into the segfault issue. Upon investigation, I only get the segfault when I set the stack size globally (when I load the threads module, or via threads->set_stack_size()). I have no problems if I set the stack size at thread creation time (below). This explains why I have never run into trouble.

Clearly, there is some problem with Thread::Conveyor, and you have convinced me that there might be some reason to avoid it. Determining whether the problem is fixable is probably beyond me.
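That is, something like the following (illustrative; the stack size and the trivial worker body are placeholders):

    use threads;

    # Stack size supplied per-thread at creation time, not set globally:
    my $thr = threads->create(
        { 'stack_size' => 64 * 4096 },
        sub { 1 },                      # worker code goes here
    );
    $thr->join;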
Good Day,
by BrowserUk (Patriarch) on Mar 06, 2012 at 02:59 UTC |