http://www.perlmonks.org?node_id=865829

Originally written in response to:

..before I begin to post gory details which might turn out as being sideshows..

Though the nice thing about sideshows is that you don't have to look. You can carry on to the main event as if they did not exist. But, for some, the sideshows can be more interesting than the main event. If there was no interest, they wouldn't exist.

For the OPs of most SoPWs, the main event is getting their problem solved and their applications written.

But for responders, beyond the pure altruism of helping out a fellow perler, there are often other reasons for their interest in the OP's problem.

For me, helping you do that--if I can/have--is the side event. The main event is finding the best ways to solve particular problems. For some problems, fork (where available) is the right solution; for some it is threading; for some a select-type mechanism is appropriate. But the problem is that these are too often seen as all-or-nothing alternatives to each other.

And that's where the 'frameworks'--and my distaste for them--come in. Things like POE, AnyEvent, Coro, Parallel::ForkManager, Thread::Pool et al. all try to "simplify" the use of what are, at their hearts, very simple programming constructs: fork, select, async etc. But in those attempts to simplify, they fail dismally.

By wrapping over the basic constructs in these huge, unwieldy APIs, they not only trade the small-but-required learning curves of those basic constructs for the far larger learning curves of their own APIs; they also dictate the architecture of entire applications and application suites in the process, thus forcing the abandonment of the other basic constructs. And so you end up with one-size (or rather, one-hammer) force-fits-all solutions. They have to be all fork; all select or all threads, because that is what the frameworks expect and dictate. And that's a nonsense.

There is no good reason not to use select within a fork or a thread. Or threads with a fork. Or fork or thread within a select loop. And if you avoid the overarching frameworks built around these basic constructs, and simply learn to use the basic constructs themselves, then choosing the right tool for each particular job is easy.
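
To make the mixing concrete, here is a minimal sketch (not taken from any real application) of fork and select used together: the parent forks one child per work item and then runs a plain select loop, via the core IO::Select module, over the pipes coming back. The item list and the uppercase "work" are stand-ins invented for illustration, and a Unix-like fork is assumed.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical workload: each child "processes" one item and reports back.
my @items = qw(alpha beta gamma);
my $sel   = IO::Select->new;
my %buf;

for my $item (@items) {
    pipe(my $rd, my $wr) or die "pipe: $!";
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {                      # child: do the work, report, exit
        close $rd;
        syswrite($wr, uc($item) . "\n");
        close $wr;
        exit 0;
    }
    close $wr;                            # parent keeps only the read end
    $sel->add($rd);
    $buf{ fileno($rd) } = '';
}

# Parent: an ordinary select loop over the children's pipes.
my @results;
while ($sel->count) {
    for my $fh ($sel->can_read) {
        my $n = sysread($fh, my $chunk, 4096);
        die "sysread: $!" unless defined $n;
        if ($n) {
            $buf{ fileno($fh) } .= $chunk;
        }
        else {                            # EOF: the child closed its end
            push @results, delete $buf{ fileno($fh) };
            $sel->remove($fh);
            close $fh;
        }
    }
}
1 while wait() != -1;                     # reap the children
chomp @results;
print join(',', sort @results), "\n";     # ALPHA,BETA,GAMMA
```

Nothing here stops one of those children from running its own select loop, or the parent from handing the loop to a thread; each construct is used only where it fits.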

That's why I've resolutely avoided publishing any, of the many, "threading framework" modules I've written. Because once you move beyond the "poster child" application, into trying to define an API to cover a wide range of typically messy, general purpose problems, the APIs become far more complicated and unwieldy than the basic:

async { while( my $workItem = $Q->dequeue ) { ... } };

they are intended to "simplify" through encapsulation. And along the way, usually preclude the flexibility of using fork or select where that is appropriate.

It's not that those other techniques can't be used within a threading framework. It's just that using them then requires detailed knowledge of the internals of the framework to make them play well together. So then you get into the game of writing wrappers around popular modules like IO::Socket, LWP, etc. The whole thing snowballs and you end up with a huge, fragile and interdependent morass of modules that need intimate knowledge of the framework internals to maintain.

Ie. Exactly what you see when you search for POE::* or AnyEvent::* or Coro::* or Thread::*.

Even if I am capable of writing & maintaining such a suite of complex interdependent modules, I've no desire to have others dependent upon my ability and interest in maintaining them in the long term. But even more important to me, is that I am philosophically opposed to creating such monocultures.

The (my) bottom line is that it would be far better for people to learn to use the basic constructs of fork & exec, select & sysread, async & enqueue()/dequeue() and apply them as required; than to try and skip the (relatively shallow) initial learning curves of those basic constructs by using "a framework", and then finding themselves locked in to monoculture and having to force-fit every aspect of their applications to its limitations and dogmas.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re: Your main event may be another's side-show.
by sundialsvc4 (Abbot) on Oct 17, 2010 at 20:32 UTC

    It is interesting to me how our points-of-view differ ... so, in the spirit of the old TV show, “Point : Counterpoint”, let me offer my counter-points.   My alternate points of view.   (Not “my dissent.”)

    First of all, my “reason for responding,” to any post and every post that I respond to, is simply to help that person if I can.   I do this because I know that I can come to the same place, “seek,” and find.   There is no other reason for me.   I take.   I give.

    Second, I find packages like the ones that you mention to be helpful, either as reference or as a source of an actual solution.   While I do not always use these materials exactly as they are written, they do, nevertheless, represent “a complete thought.”   And, yes, I do find myself using these modules quite a bit because I simply do not have the time to do original coding on my own.   I can usually find in the source-code the evidence of obstacles that have been smashed-into and learned from.   Also, I look at some modules and wonder, “why was the module designed this way?”   You know that you are peeking over the shoulders of a complete work that was built by a professional programmer colleague, and refined to the point to where said programmer was willing to publish it as a reusable resource.   Such code is full of surprises, most of them useful.

    Still, it really does pay to “use the Source, Luke!”   CPAN modules are ... well ... what they are.   You can treat them as “black boxes” to a limited extent.   But sometimes you have to do the exact opposite:   peel all the covers off, and study the thing.

    A very interesting and thought-provoking thread. . .   I look forward to the other comments.

      Second, I find packages like the ones that you mention to be helpful, either as reference or as a source of an actual solution. While I do not always use these materials exactly as they are written, they do, nevertheless, represent “a complete thought.” And, yes, I do find myself using these modules quite a bit because I simply do not have the time to do original coding on my own.

      I'm far from averse to using CPAN modules. Much less looking inside them to see what makes them tick.

      One or more of them, Data::Dump, List::Util, Time::HiRes, are used in almost every script I write. And a bunch more--LWP, IO::Socket, IO::Select, GD, etc.--I use as if they were an integral part of Perl. I use them so frequently that I never have to look up their basic functionality; nor often even their more involved details.

      And I've never even considered trying to re-create their functionality using the basic Perl built-ins (like socket, accept, bind & listen) they use under the covers. They're, (their APIs), just so well thought through and simple to use, that messing around with the low-level stuff they encapsulate rarely enters my head. And when it does, it is only to understand how to use them better, or on very rare occasions, to patch my way past a limitation.

      But the thing to note is that none of those modules is a framework. Their designs are such, and their apis so well thought through and developed, that they can be used in conjunction with threads or fork or select. They assist the application writer by encapsulating the messy low-level details in such a way as to allow him to be ignorant of them. But they do it without imposing a particular architecture; or requiring that they be masters of all they survey.

      The distinction is as between: doing a self-build where you hire an architect to encapsulate your ideas, and individual contractors to enact them; and buying a house from a developer, who offers you choices of, the number of bedrooms, and the color of the bathroom suites & the tiles on the kitchen floor.



        I now see that I mis-understood the thrust of your original post.   Thank you for the clarification.

Re: Your main event may be another's side-show.
by zentara (Archbishop) on Oct 18, 2010 at 12:21 UTC
    Ie. Exactly what you see when you search for POE::* or AnyEvent::* or Coro::* or Thread::*.

    I'm relieved you left out Glib :-)

    It seems to me that everyone picks a methodology for solving problems that fits in with their mental view of the world. Some like the Object Oriented approach and some like the monolithic script. Some like event-loop systems, others not. Each has its own merits. That is why perlmonks is such a great place..... no matter what your main show is, you MAY provide just the right side show that someone else is seeking, but just for that particular problem. It's like casting for a Hollywood movie.... your method may be just the perfect bit part for a particular movie, but not for the next movie, nor all movies forever.

    It seems that people want to find general solutions for all problems of a certain type, then set that problem aside as being solved, freeing their minds to attack another problem. Of course that can lead to mental laziness, because you do not approach each new problem with a fresh approach.


    I'm not really a human, but I play one on earth.
    Old Perl Programmer Haiku ................... flash japh
      I'm relieved you left out Glib :-)

      Drat! Knew I was forgetting something :)

      It seems that people want to find general solutions for all problems of a certain type, then set that problem aside as being solved,

      I understand the motivation, but often I see "them" creating themselves far more work than they save, by doing so.

      Some problems lend themselves to a particular approach naturally. For example, GUIs require an event loop--nothing else will do. The problem arises when the fact that a particular application needs a GUI is used to determine that everything else that application does, has to be force-fitted into an event-driven architecture. It simply doesn't need to be that way.

      You know yourself--as you've done it here many times--that it is perfectly good and proper to set the GUI event loop running in a thread--main or otherwise--and push other, naturally serial processes--like read-process-write loops--off into background threads. Sure, there are issues in that most of the available GUI libraries were never written to work across threads, but as you know, there are simple solutions to those problems.

      There are other things that naturally lend themselves to event-driven architectures. Eg. Starting a whole thread to ping each of a long list of machines doesn't make sense. Sure, you can avoid that by using a thread pool and a queue, but it still probably isn't the best mechanism. Which is why when that question comes up, I suggest the asynchronous capabilities of Net::Ping.

      The really nice thing about that module, is that it happily runs in a thread. So, if the application wants to display the running status of the network in a gui, it is very easy to separate and test those two aspects of the application. You stick the gui in one thread, the network discovery in another and communicate the status between them using a queue.

      In this way, both parts of the application can be developed and tested separately, and you avoid having to try and interleave their processing in a way that means you are always having to trade the requirements of one, for those of the other.

      Similarly, in a spidering application, LWP::Parallel can be used to great advantage in conjunction with threads. If you try to use just threads, you need to start many threads per core to ensure that each core always has something to do. If you try to use just an event loop architecture, you don't scale across cores and so waste 3/4, 15/16ths, 255/256ths etc. of your available processing power. Start one or two threads per core and run an event loop (LWP::Parallel) instance in each, and you get the best of both worlds.

      And that's my 'problem' with 'frameworks'(*). By inverting the flow of control, they dictate the architecture of every application that uses them. They force the conflation of what should be separate concerns. By precluding other constructs and methodologies, they exclude the 'natural' solutions to simple problems and you end up with people trying to use complicated semaphoring mechanisms to re-create simple, linear-flow, serial processing. Or polling every 1/10 second to check on the progress of stuff that may take hours--when they don't really care how long it takes, they just need to do something when it is done.

      I love chips. But the idea of fried spaghetti or fried strawberries fills me with horror. I like mashed potato; but boiled fish or steak?

      (*)Not all frameworks are bad. From my limited understanding of CGI work, you pretty much need a framework to do anything remotely complex with the stateless nature of HTTP. You either use an existing framework, or are destined to create one (usually badly), in order to bring some level of statefulness to your application.


Re: Your main event may be another's side-show.
by TomDLux (Vicar) on Oct 18, 2010 at 15:09 UTC

    Mark-Jason Dominus' rejection of Design Patterns ( he was essentially quoting someone else ) is that they do not represent an advance, but rather a manual implementation of something missing from the language. The earliest machine language had the Subroutine design pattern, implemented by saving certain values to certain places, and jumping to certain machine addresses; every language since simply has functions and/or subroutines. Java has an Iterator design pattern; Perl has for() acting on a list.

    In the same way, threads, forks, selects deal with implementation. People are searching for an abstraction which simplifies away the complications and allows a higher level of comprehension.

    So far, they've had limited success. Maybe the problem needs several abstractions, for different situations. But you can't fault them for trying.

    As Occam said: Entia non sunt multiplicanda praeter necessitatem.

      My problem with Design Patterns, (the book rather than the underlying concept) is that it attempts to do for programming what 'painting-by-numbers' does for art. And from my experience, the former 'succeeds' to almost exactly the same extent as does the latter.

      That's not to say that individuals cannot read the book and draw very useful lessons from it. They can and do. But it's my contention that those same individuals would have quickly started to recognise recurrent themes in the code they write, read and maintain. Without the silly names, or falling into the trap of fill-in-the-blanks development practices.

      The problem is that far too many CS teachers rely solely on teaching "the patterns"; and so we have a whole generation of programmers that have never been taught the processes and skills of analysis. And so you end up with a high proportion of those that learnt their programming in this way, that never actually look for the patterns.

      They simply write every program in terms of the "top eight patterns"--Abstract Factory; Adapter; Composite; Decorator; Factory Method; Observer; Strategy; Template Method et al. Because, after all, the book said:

      It's hard to find an object-oriented system that doesn't use at least a couple of these patterns, and large systems use nearly all of them. This subset will help you understand design patterns in particular and good object-oriented design in general

      And so, it became a self-fulfilling prophecy. If you doubt this, just wander over to CPAN and search for all the modules and suites that contain a "*::Factory::*" element. (Or three). And then take a look inside some of them and see how many of them will never be called to instantiate more than once in any given program. You don't need a factory to produce a single, bespoke item.

      That might be seen as 'unfair', as it is very easy to confuse the purpose of a 'Factory'--which is meant to instantiate different (sub)classes--with that of a 'class' whose job is to instantiate instances of that class. But you'll see many of the other 'top eight' misused in similar ways. Because the programmers are taught the 'rules' without being taught when to apply them. The patterns, without the analysis skills to differentiate between a pattern and a meta-pattern.

      Hence, you'll also see the Decorator used to add a single attribute to a class, when a simple subclass would be appropriate. Or two copies of the Observer pattern used to implement bi-directional communications.

      Recognising patterns is good. Implementing patterns because that's what you've been told you should be doing, is bad.

      In the same way, threads, forks, selects deal with implementation. People are searching for an abstraction which simplifies away the complications and allows a higher level of comprehension.

      I agree. That is what people are looking for. The problem is, you almost always do them a disservice by supplying it.

      Every one of Perl's looping constructs can be implemented using redo (and one or two variables). But if everyone dropped for/foreach/while/until/map/grep et al. in favour of redo, their code would be a nightmare to read and maintain.
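
      As a small illustration of that claim (a sketch only; the variable names are made up for the purpose), here is the same three-iteration loop written once with while and once with nothing but a bare block, a counter, last and redo:

      ```perl
      use strict;
      use warnings;

      # Conventional form.
      my @with_while;
      my $i = 0;
      while ($i < 3) {
          push @with_while, $i * $i;
          $i++;
      }

      # The same loop built from a bare block, one counter and redo.
      # A bare block is a loop that runs once; redo restarts it and
      # last leaves it.
      my @with_redo;
      my $j = 0;
      {
          last if $j >= 3;            # the loop test, written by hand
          push @with_redo, $j * $j;
          $j++;
          redo;                       # jump back to the top of the block
      }

      print "@with_while\n";          # 0 1 4
      print "@with_redo\n";           # 0 1 4
      ```

      It works, but the test, the increment and the jump are now three separate things the reader has to reassemble in their head.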

      Programming is a complex process. Perhaps as, if not more, complex than any other single occupation. There is a point below which it is not productive or useful to simplify or encapsulate further. It becomes a self-defeating exercise to try and wrap over every basic mechanism, or combination of basic mechanisms, with a higher-level abstraction.

      At each higher level of abstraction you either lose the possibility of control over the finer details; or you end up with an API that attempts to encapsulate all the exponential combinations of parameters to all the low-level constructs being encapsulated. And you either get monster APIs; or monster suites of interdependent wrap-over modules; or both.

      By inverting the flow of control, and generalising the interface for the 'poster child' typical case, you end up creating a far steeper, and broader and deeper, learning curve tree than the one you set out to reduce. And along the way, you discard the flexibility to use simple, appropriate solutions as the need arises.

      Far better methinks, to say that there are some, somewhat difficult, basic concepts that you need to just knuckle down and learn. Because once they've clicked, you'll rapidly assimilate them in a way that means you'll have a full toolkit at your disposal when it comes to solving real-world coding problems. And your efforts will be rewarded over and over.

      I think good design shows. There's only one DBI; one LWP; one IO::Socket, one List::Util. And one threads. But half a dozen dumpers; half a dozen Find::*; half a dozen *event* thingies.

      And I think the quality of a module's design shows in the number of times people can offer, off-the-cuff, working code solutions to OPs' problems, written in terms of those modules.

      If you want my measure of a CPAN module's "Kwality" [sic], there it is in a nutshell. When I see advocacy through demonstration rather than referral; I see something worthy of investigation.


      ... they do not represent an advance, but rather a manual implementation of something missing from the language.

      Be careful that you do not reject the whole field based upon a book promoted too widely and misinterpreted even further. By the original intent of design patterns one might classify "a distribution bundled for the CPAN" as a design pattern, and rightly so.

        The fact that “design patterns” were “promoted too widely and misinterpreted even more so,” certainly blunted its effectiveness as a potential teaching tool.   But, no one can deny, “silver bullets sell much better than the regular ones.”

        The observation is a sound one:   many of the world’s successful computer programs have a certain observable structure, and said structure can be usefully studied in a purely-abstract way.   But students are rewarded for knowing the “right” answer, and for constructing their little programs in just the “right” way.   And, for picking up a hammer and seeing nothing but nails everywhere.   Because they were rewarded for seeing patterns everywhere, and for selecting the “right” pattern and for producing their stuff to “look just like that,” the good-idea backfired.   I find students who have entered the working-world who do not really know how to work with more-“indefinite,” production, code bases.   They are still looking for “the right answer.”

Re: Your main event may be another's side-show.
by aquarium (Curate) on Oct 20, 2010 at 00:36 UTC
    surely there are forking frameworks that are much smaller and easier to use than their big brothers. the reason why the bigger and more complex frameworks exist is because they do a whole lot more than a simple fork, typically allowing all sorts of scheduling and queueing. forking is their primary mechanism, but not their central goal.
    the hardest line to type correctly is: stty erase ^H
      surely there are forking frameworks that are much smaller and easier to use than their big brothers. the reason why the bigger and more complex frameworks exist is because they do a whole lot more than a simple fork, typically allowing all sorts of scheduling and queueing. forking is their primary mechanism, but not their central goal.

      That's fine if you want everything they do; and if they do everything you want.

      But if the former is not true, you have to carry its costs. And if the latter is not true, you are into the game of either waiting for them to provide it, or trying to understand enough of their internals to allow you to wrap over the modules--CPAN or your own--in such a way to make them compatible with the framework.

      It all comes down to: Do I call them if and when I need them?

      Or: Do I have to try and arrange for them to call me sufficiently frequently that if there is a possibility of there being something to do, I can poll around and see what, if anything, that might be?

      (Oh. And then remember to store enough information about what I was currently doing before they interrupted me that I can pick it up when I get back to doing it.)

      (Oh. And oh. And not forgetting that if I do find something to do, I'll need to retrieve whatever information I stored last time I was doing it. And then remember to store the modified state once I've finished doing whatever it is that I found to do.)

      (Always assuming that I succeed in finishing it and don't get interrupted again before then.)

      (That's assuming that there was actually something else to do when I got interrupted.)

      In a nutshell. Frameworks suck because they force me to work their way for everything. Even when most of the things I need to do aren't a natural fit to that way of working.

      On the other hand, the nitty-gritty of forking and IPC can be encapsulated in a way that allows it to be used intuitively and without the need to invert the flow of control, or requiring bending the entire application to the will of the encapsulating module.
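
      A rough sketch of what such an encapsulation might look like follows. The two helpers, spawn() and collect(), are hypothetical names invented for this illustration, not any CPAN module's API; a Unix-like fork is assumed and error handling is kept to the bare minimum.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: run a code ref in a child process. Returns the
      # child's pid and a read handle carrying the code ref's return value.
      sub spawn {
          my ($code) = @_;
          pipe(my $rd, my $wr) or die "pipe: $!";
          my $pid = fork();
          die "fork: $!" unless defined $pid;
          if ($pid == 0) {                       # child
              close $rd;
              my $out    = eval { $code->() };
              my $failed = $@ ? 1 : 0;
              print {$wr} defined $out ? $out : '';
              close $wr;
              exit $failed;
          }
          close $wr;                             # parent keeps the read end
          return ($pid, $rd);
      }

      # Hypothetical helper: block until the child is done, then return its
      # output and exit status. The caller decides when to call this.
      sub collect {
          my ($pid, $rd) = @_;
          local $/;                              # slurp mode
          my $out = <$rd>;
          close $rd;
          waitpid($pid, 0);
          return (defined $out ? $out : '', $? >> 8);
      }

      my ($pid, $fh) = spawn(sub { join ' ', map { $_ * 2 } 1 .. 5 });
      # ... the caller is free to do anything else here ...
      my ($result, $status) = collect($pid, $fh);
      print "$result (exit $status)\n";          # 2 4 6 8 10 (exit 0)
      ```

      The point of the shape is that the caller calls the helpers if and when it needs them; nothing calls back into the application, and the returned handle can just as easily be dropped into a select loop or read from a thread.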


        In a nutshell. Frameworks suck because they force me to work their way for everything. Even when most of the things I need to do aren't a natural fit to that way of working.

        Your arguments are too abstract. The framework introduced by a certain library is intended to save the time of developing your own framework. So in real life you have the choice of doing everything yourself, or adjusting your work to the ready framework. According to your statement, you just dislike forcing yourself into the ways proposed by others. This is fine, because tastes differ. But in real life the frameworks created by others can be very helpful. So it is incorrect to simply say "they shouldn't be because I don't like them". There's a saying (at least in Russian): don't spit in the well, one day you may drink water from it :)

        don't get me wrong. i agree that some (maybe most) have a lot of bloat etc. in the end it's up to you to decide if/when/how to use a fork framework. i guess a lot of typical problems involve pre-forking daemon process(es) with a central queue mechanism and such. and hence these more complicated frameworks proliferate. it certainly would be nice to have a very minimal forking wrapper, with a nice fork function that takes care of the tiny details (handling process initiation and termination, and which signals to use etc), and nothing else.