PerlMonks  

Litmus test: It's ok to roll your own if...

by davido (Archbishop)
on Sep 18, 2003 at 23:28 UTC (#292539, perlmeditation)

This is not intended to be a list of why or why not to reinvent one's own wheel. Rather, it's a subset of talking points, a preliminary checklist to consider before one dives into the trouble of hand-coding something that's already been done. Consider these points that I weigh as I'm tinkering with code. These are talking points, not cut and dried criteria. As you conscientiously develop code, undoubtedly you will have your own barometers. But I think that these are valid and interesting points for consideration. Now for a little preaching to the choir:

It's ok to roll your own (in place of using a well-written module) when:

  • Your dumb boss requires it. 'Dumb' is accurate if his requirement to do so lacks understanding of the following list items. But if the ramifications are understood, it might not be so bad.

  • You already looked at the module's source and know how it falls short of your needs.

  • You already looked at the module's source, and know how to do it better.

  • You have control over the data source or user base such that your solution won't break even if it isn't necessarily as robust as the comparable module.

  • You're making a single-use or one-user program that really has no need to stand up to anything beyond that single use or user, and your hand-rolled version is acceptable as a quick-n-dirty solution.

  • You are advanced enough to understand all of the caveats and ramifications that the existing modules robustly withstand, and know that despite not being subjected to the fiery rigors of use by thousands of people, your version will definitely withstand 100% of the rigors to which you or anyone following you will put it, even if future use and application cannot be predicted.

  • You have time to accomplish all of the above. Remember that time, being finite, is a scarce resource, which means that the laws of economics and utility apply. That is to say, the benefits of rolling your own must outweigh the opportunity cost of the time it takes to roll your own, which consumes that finite resource and prevents it (time) from being spent elsewhere, such as boating or going home from the office at the end of the day.

  • You have a desire to maintain your own home-rolled code rather than simply installing updated revisions of a module as "Better & More Robust Ways To Do It" are required and developed.

While many people are capable of rolling their own, and may even frequently do so when they have a specific need, those situations that require such treatment, I assert, are the exception, and not the norm. Of course, for the purposes of this discussion, I use "module" to mean "well-written, publicly available (via CPAN or core Perl distribution) module".

What also seems to be the norm among many a beginner is the script-kiddy attitude that, "I know what I need, and I'll just plink out a few lines of code to do it; the module is overkill." What is really going on in this kiddy's mind is probably that he/she doesn't really care to wade through 5 minutes' worth of POD to understand how the module works, and that he/she knows darn well that these six lines of code will fit his/her need. What is implicitly ignored by that person is all of the above list items.

Think about what an extensible tool the original formmail.pl would have been, for example, had it originally been written with 'use CGI'. As Lincoln released updates to CGI.pm, inspired by the feedback of many thousands of uses, a simple upgrade-installation of that module would import much of its improved security and robustness into the original kiddie's script, without even changing a thing in the original script. (Note: Lincoln has nothing to do with the original, poorly written formmail.pl, but has a lot to do with the well written CGI.pm). And those things that might be required to be changed (due to changes in the user-interface of CGI.pm) would force the author of formmail.pl to keep up with the state-of-the-art in CGI security.

I don't mean to pick on formmail specifically. It just happens to be a well-known example. There are plenty of lines of script-kiddie code out there that suffer from similar short-sightedness.
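To make the "six lines of code" point concrete, here is a minimal sketch of the kind of query-string parsing a form script needs. Both function names (`naive_parse`, `careful_parse`) are hypothetical and written only for illustration; neither is how formmail.pl or CGI.pm actually does it, and even the second version ignores things a maintained module handles (character encodings, POST bodies, uploads, size limits).

```perl
use strict;
use warnings;

# Hypothetical "six lines will do" parser: splits on & and =, nothing else.
sub naive_parse {
    my ($query) = @_;
    my %param;
    for my $pair (split /&/, $query) {
        my ($key, $value) = split /=/, $pair;
        $param{$key} = $value;
    }
    return \%param;
}

# Slightly more careful version (still a sketch, not a CGI.pm replacement):
# decodes '+' and %XX escapes, accepts ';' separators, keeps repeated keys.
sub careful_parse {
    my ($query) = @_;
    my %param;
    for my $pair (split /[&;]/, $query) {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $value = '' unless defined $value;
        for ($key, $value) {
            tr/+/ /;                              # '+' encodes a space
            s/%([0-9A-Fa-f]{2})/chr hex $1/ge;    # %40 becomes '@', etc.
        }
        push @{ $param{$key} }, $value;           # every value is kept
    }
    return \%param;
}

my $query   = 'to=a%40example.com&subject=Hello+there&cc=x&cc=y';
my $naive   = naive_parse($query);
my $careful = careful_parse($query);

print "naive to:   $naive->{to}\n";          # escapes left undecoded
print "careful to: $careful->{to}[0]\n";
print "naive cc:   $naive->{cc}\n";          # only the last cc survives
print "careful cc: @{ $careful->{cc} }\n";
```

The naive version silently drops the first `cc` value and hands the mailer an undecoded address; those are exactly the kinds of gaps that feedback from many users tends to close in a shared module.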

I'm certainly no expert on either rolling my own or using modules. I can say that when I use a module for something, I tend to get a better solution than when I don't bother with figuring out how to apply the module in question to my script.

There recently was a discussion on Usenet about "why should I use a module when I already know how to parse HTML". About two posts into the thread it became readily apparent that "already know how" was an overstated assertion. Modules are more often than not developed by people who have a strong understanding of the subject matter. Modules are rigorously tested. Modules are used by a vast array of individuals, in a vast array of applications, and put to tests that no single author could ever foresee. And feedback rolls in, "This needs to be tweaked. That might be a security risk. This other thing might be more efficient if coded as follows..." And those maintaining the module gain the benefit of feedback from many, many end users, rather than only being able to rely on their own experience in refining the module.

With all this in mind, it seems to be a significant exercise in "misguided laziness" (as opposed to the healthy sort of laziness that Mr. Wall promotes) to roll one's own rather than take 5 minutes away from the keyboard to learn how to use the time-tested and real-world-tested module.

I'm, of course, interested in hearing any comments to this discussion. It's possible that I've missed some points, made some incorrect assertions, or simply misunderstood some aspect of the subject. I'd like to hear what others think.

Dave

"If I had my life to do over again, I'd be a plumber." -- Albert Einstein

Re: Litmus test: It's ok to roll your own if...
by BrowserUk (Pope) on Sep 19, 2003 at 01:29 UTC

    I don't understand the basic premise of this post.

    Why the hell should anyone else give one flying fuig for whether I choose to re-invent a wheel, or legs, or mucus-lubricated, muscular wave locomotion?

    Many apparently do, but why they do I have not the vaguest concept.

    If my boss says do this, and I say I can grab a module from CPAN and have it running in 10 minutes, and he says--No. It would take me 3 months to go through all the IT, Purchasing, SysAdmin and Security departments' red tape and associated BS, do it "in-house"--then I might grab that module, pull it apart, re-write it in my own or the house style making sure that I understood exactly how it worked, and add a note at the top pointing to the original upon which it was based.

    Or if I didn't like the method of implementation, then I might re-implement it.

    As I currently don't have a boss, I tend to look at what's available on CPAN, read the sources, and then sit down and write my own. With some modules, I get so far into the project and decide I can't improve on what I saw on CPAN and revert to that. Tie::File is one such module. IMO it is pretty close to perfect; I couldn't improve on it, leastwise not until I get my asynchronous IO version going anyway.

    For many other modules I have written my own version, and I prefer mine. Note: That doesn't mean mine are better and should replace the CPAN versions, it just means I prefer them. As an example, there are many cases of modules on CPAN that are entirely procedural, but the authors have wrapped them up in a pseudo-OO interface. This invariably bugs the heck out of me. There are others that I dislike for a variety of reasons. For some I've written my own versions of bits of them. Many of these are "one-liners" that get stuck into My::Utils.pm.

    The most frequent problems I have with CPAN (and core) modules are not their implementation but their interfaces. This is kind of illustrated by your comment

    ...when I don't bother with figuring out how to apply the module in question to my script....

    The absence of a cohesive design overview means that if my application could be written by importing, say, 3 different CPAN modules (suites) and gluing them together, the resulting "glue code" spends much of its time coercing the output/storage formats of one module into the input/storage formats of the next.

    If the script is a simple step1/module1, step2/module2, step3/module3 progression, then you can sometimes get away with this, but if the script loops around a lot, the mapping back and forth between formats into and out of each of the different modules is a pain and ultimately undoes much of the reliability and testing that you proffer as reasons for doing it that way.
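A minimal sketch of the kind of glue being described, assuming two made-up modules whose data formats disagree. `fetch_rows`, `total_size` and `rows_to_records` are hypothetical stand-ins, not functions from any real CPAN distribution; the point is only that the conversion layer belongs to neither module and is covered by neither module's tests.

```perl
use strict;
use warnings;

# Hypothetical module 1: returns rows as array refs, header row first.
sub fetch_rows {
    return ( [ 'name', 'size' ], [ 'a.txt', 120 ], [ 'b.txt', 64 ] );
}

# Hypothetical module 2: expects a list of hash refs with a 'size' key.
sub total_size {
    my @records = @_;
    my $total = 0;
    $total += $_->{size} for @records;
    return $total;
}

# The glue: converts module 1's output into module 2's input.
# It is the application author's code, so it has had no wider testing.
sub rows_to_records {
    my ($header, @rows) = @_;
    my @records;
    for my $row (@rows) {
        my %record;
        @record{@$header} = @$row;    # hash slice: header names become keys
        push @records, \%record;
    }
    return @records;
}

my @records = rows_to_records( fetch_rows() );
print total_size(@records), "\n";
```

In a script that loops, this conversion (and its inverse) runs on every pass, which is where the cost and the room for error accumulate.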

    Anyone with an eye to the history of software development will know that using tried and tested components does not make for fewer bugs unless the components have been previously integration tested together. In large systems, more bugs result from the (unintended) interaction between subsystems than from within the individual subsystems. This is a well-known and documented fact of life of systems integration. There are entire companies that make a living from simply integrating other people's tried and tested components.

    We have all either experienced the situation, or heard of those that have, where the OS vendor blames the RDBMS vendor, who blames the Transport Layer vendor, who blames the Messaging API vendor, who blames the Network vendor, who blames the OS vendor....

    The bottom line is, it's my code and I'll code it my way. I don't know why anyone else would give a flying fuig about the way I choose to go about it, but I can tell you that I don't give a flying fuig for what they think.

    If there is one thing I dislike about this place, which in every other way I love, it is the pervading attitude that there is "one true way" and the propensity for people to answer OP's questions with

    "Stupid man, "How do you do X!", why do you ask such a stupid question when it's perfectly obvious that you shouldn't be trying to do "X" in the first place. You should be installing Cygwin, learning a completely different set of OS commands so that you can download the latest set of patches and rebuild sources so that you can use the latest, greatest installer that would allow you to install this bundle of CPAN modules. When you've done that, one of the dependencies that will be automatically installed will allow you to use
    `yayacc --thisenormouslongcommandlineparameterdoesntserveanypurposeotherthantoinformthecommandthatthenextvalueshouldbeusedliterallyandnotautomaticallyinterpolatedglobbedandreversepolishized 2`

    which means you wouldn't need to do X in the first place!"


    Update: Er....{grin}!


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
    If I understand your problem, I can solve it! Of course, the same can be said for you.

      Why the hell should anyone else give one flying fuig for whether I choose to re-invent a wheel, or legs, or mucus-lubricated, muscular wave locomotion?

      Many apparently do, but why they do I have not the vaguest concept.

      I can give a couple of simple reasons why I care enough about this issue to bother at least pointing people towards a publicly available wheel and encouraging them to have a go at it.

      First, there is self-interest. If people are reinventing wheels, they are usually not contributing useful patches and ideas to CPAN. A strong CPAN makes my life easier. Also, it has been my experience that despite what you say about integration problems, reinventing wheels usually leads to wasted time, less functionality, and more bugs, all of which makes Perl look bad. I want Perl to look great, so there will be lots of fun jobs for me.

      In a slightly wider sense, but still out of self-interest, I want it to be accepted that using open source components is a reasonable way to write applications because it's the way I like to write them. If everyone just rolls their own all the time and that becomes the norm, managers will think it's strange when I tell them I don't want to write the ten thousandth template engine from scratch.

      Obviously you and every other Perl developer can do whatever they want to, but there are some simple and sensible reasons why I (and others) will continue to encourage people to use CPAN and not reinvent wheels.

        ...at least pointing people towards a publicly available wheel and encouraging them to have a go at it.

        I've no problem with people "pointing to and encouraging" the use of CPAN--I've even been known to do this myself, surprisingly. I do take umbrage at being told that the "solution" to my problem is in doing something completely different to what I am trying to do (and about which I hadn't asked for advice) just so that I can utilise a CPAN module I have no interest in.

        My re-invented solutions have caused no one any "wasted time, less functionality, and more bugs", nor made "Perl look bad", because I've not made any of them available on CPAN nor anywhere else.

        The only module I've so far uploaded to CPAN was withdrawn within 1 day. I uploaded it because 3 people separately encouraged me to do so. I withdrew it because one person, whose views I respect, suggested that it "polluted the namespace" because I had named it as a pragma, rather than deeply nesting it in some obscure namespace where it would never be seen. Reserving namespaces is well and good, but if they are never allowed to be used, then they become wasted space. C'est la vie!

        As for the rest of my modules, I re-invent for my own learning purposes. As and when I have something that I think is of sufficient quality to warrant inclusion, I will upload it.


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
        If I understand your problem, I can solve it! Of course, the same can be said for you.

        If people are reinventing wheels, they are usually not contributing useful patches and ideas to CPAN.
        Really? If someone reinvents a wheel by creating a module that does X, while there is already a module doing X, he basically made a patch the size of a module. Reinventing wheels is usually done because someone didn't like the original wheel. It's often more efficient to reinvent a round wheel, than to submit patches for a square wheel (if only the creator of the square wheel likes squares, or because round wheels aren't compatible with the vehicles equipped with square wheels).

        A strong CPAN makes my life easier. Also, it has been my experience that despite what you say about integration problems, reinventing wheels usually leads to wasted time, less functionality, and more bugs, all of which makes Perl look bad.

        There are no quality requirements for CPAN. Anyone can already load any crap on CPAN, and many already do. Non-overlapping functionality doesn't make CPAN stronger, it probably makes it weaker. Choice is good (remember one of the slogans of Perl: there's more than one way of doing it?) Your reasoning would mean that the first module doing Y that was uploaded to CPAN is necessarily the best one. Good heavens. Had Matt Wright made his formmail.pl available on CPAN, you would have preferred that no one else had created a form-to-email utility, and that they had all used Matt's script.

        Abigail

      I don't understand the basic premise of this post.

      Perhaps not. The basic premise is that if you are confident that you gain more utility (in the economics sense of the word) by doing it by hand, and are confident that you can do so in a way that safely meets your needs, you have passed the litmus test and should do it however you please, by all means. Particularly if your confidence is properly placed. The premise can be extended to assert that the test is yours to define. I suggested some talking points, but never claimed to know all of the criteria that you use in evaluating the benefits of reinvention versus reuse. I did assert that if you know enough about the problem and solution to do it better yourself, have at it.

      The assertion I made is that many will fail to gain economic utility by doing by hand what has already been done in a well-written module. They will fail because their approach is not as robust, or because it took longer to implement and debug, or because it wasn't supported as future uses develop, or because it didn't benefit from the sharpening stone of mass-use, or because it is not extensible, or because of a number of other reasons.

      But the person writing the code (and perhaps his superiors, customers, or shareholders) must be the final judge of whether it is a maximization of utility to invent his own wheel or try to make an existing wheel spin on his/her axle. I never intended to put myself in the position of saying yes or no to you coding however you please.

      I pointed out that frequently, the wrong choice is made, favoring reinvention out of misguided laziness: too lazy to figure out how a superior wheel works, not understanding that extra up-front effort might lead to increased utility (free time, or a better product) in the future. Certainly it's not always the case that code reuse is the best way to go. It is definitely true that there are many instances where a new wheel works better. But just as certainly, there are way too many times that the wrong decision is made. This is particularly an issue for beginners, who may not yet have learned enough about the issues surrounding the chore they've set out to accomplish to write their own safe, robust, and effective solution. This is certainly the case with most of those who post questions about how to parse CGI or XML with regexps. They know enough to know that regexps are powerful tools in text manipulation, but not enough to know the shortcomings and downright dangerous pitfalls.

      Your specific cases may well pass the litmus test, and you may even add or subtract items of your own criteria to that test. But every time you, yourself, use or don't use a module, you are in your own way weighing the alternatives. I only tried to bring to the forefront a few of the items that should be on one's mind as one weighs the pros and cons of using a particular module versus cooking up something altogether different.

      Dave

      "If I had my life to do over again, I'd be a plumber." -- Albert Einstein

Re: Litmus test: It's ok to roll your own if...
by nimdokk (Vicar) on Sep 19, 2003 at 12:34 UTC
    Why I wrote my own module. It had nothing to do (really) with management decisions about purchasing, etc. It was simply easier to customize what we needed instead of going out and searching for different modules to do what I needed. I could customize the functionality to our needs and environment(s). Also, some of what we needed was simply not available in modules because essentially the routines that I have written are wrappers for external programs. I am also wrapping regular modules into a routine so that my co-workers and I can quickly plug that routine into our boiler-plate code and run it without having to dig into what a module needs, because my two co-workers are not as proficient with Perl as I am (although the training in that area is proceeding nicely). Those routines also cover several work-arounds that I have needed to put into place to make our job easier, since it is not all that easy for me to install modules that I do need (although I have convinced the Unix admin team leader that we need to be on the latest version of Perl when we upgrade the system we are using).

    Just my 2 bits. :-)


    "Ex libris un peut de tout"
Re: Litmus test: It's ok to roll your own if...
by jonadab (Parson) on Sep 19, 2003 at 15:29 UTC

    Lots of good reasons, some of which you point out, and some of which you neglect; here are a couple of the ones you neglect...

    • Learning. Sure, Perl is a language for getting your job done, but there's no reason it can't also be a language for learning. Sometimes your entire purpose in writing something is to learn, and sometimes you don't just want to learn how to make it work; sometimes you want to understand the internal details of how it works. Sure, you could study an existing module, but sometimes writing the code yourself gives a clearer understanding. This is the same kind of reasoning that leads people to read RFC documents instead of (or in addition to) tutorials.
    • Sometimes the existing module, despite technically doing the same thing you want to do, isn't right for your needs. Maybe its interface is needlessly complex for a simple task; maybe the module on CPAN uses XS, and you want a solution you can use easily on different systems without compiling anything. Maybe you need to be able to distribute it as part of something and don't feel like hiring lawyers to figure out whether the license of the module on CPAN allows this. (I'm a programmer, Jim, not a lawyer.) Maybe a lot of things.

    In short, my criterion for rolling my own is this: I'm aware that I'm re-inventing a wheel, and I have a good reason for doing it. Do I see the value in avoiding reimplementation by using existing solutions? Yes, of course; I can't count the number of CPAN modules I've used in the last year. But sometimes, just sometimes, there's a reason for doing otherwise.


    $;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}} split//,".rekcah lreP rehtona tsuJ";$\=$ ;->();print$/
Re: Litmus test: It's ok to roll your own if...
by krisahoch (Deacon) on Sep 19, 2003 at 15:47 UTC

    A small addition to extend your points

    • You already looked at the module's source and know how it falls short of your needs AND the module cannot be extended/overridden without modifying the module's source
    • You already looked at the module's source, and know how to do it better AND the module cannot be extended/overridden without modifying the module's source
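The "extended/overridden without modifying the module's source" condition can be sketched with plain inheritance. `Some::Formatter` below is a hypothetical stand-in for a CPAN module and `My::Formatter` is a hypothetical local subclass; no real distribution is implied. If a change of this shape is enough, the case for a full rewrite is weaker.

```perl
use strict;
use warnings;

# Stand-in for a CPAN module we cannot (or should not) edit. Hypothetical.
package Some::Formatter;

sub new {
    my ($class, %args) = @_;
    return bless { width => $args{width} || 20 }, $class;
}

sub format_line {
    my ($self, $text) = @_;
    # Left-justify the text in a field of the configured width.
    return sprintf '%-*s|', $self->{width}, $text;
}

# Local extension: inherit everything, override only what falls short.
package My::Formatter;
our @ISA = ('Some::Formatter');

sub format_line {
    my ($self, $text) = @_;
    # Upper-case the text, then delegate to the original implementation.
    return $self->SUPER::format_line(uc $text);
}

package main;

my $formatter = My::Formatter->new(width => 8);
print $formatter->format_line('perl'), "\n";
```

Because the subclass delegates through `SUPER::`, an upgraded `Some::Formatter` would still supply the base behavior, which is the benefit the original post attributes to staying on the module.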

    Kristofer

Re: Litmus test: It's ok to roll your own if...
by blssu (Pilgrim) on Sep 23, 2003 at 15:56 UTC

    It's always ok to write your own code. Duh!

    But... that doesn't mean your code will be cleaner, smaller, faster or more reliable. In fact, you're probably going to repeat mistakes and/or re-invent features that at first glance you thought were stupid.

    I'm constantly thinking my previous self was a moron for doing something obviously wrong. Then when I get back into the problem I learn that my previous self has made a design trade-off to solve an unobvious problem. What I really need to remember is that my *future* self is a moron and write better documentation...

    On the macro scale: The free software world works on the basis of massive parallelism. Creating duplicate projects improves long-term productivity by allowing different projects to learn about different things. This avoids problems that overly connected fields (like modern science) have due to peer pressure causing everyone to move in the same direction.

    Take an example close to our hearts. Perl, Python and Ruby all have very similar goals. Perl 6 tries to learn from them. If all those developers worked only on Perl 5, would Perl 6 be as rich as it is? Would there be as many minds working on the problem?

    How about Xt, Tk, Qt and Gtk?

    How about Linux and the BSDs?

Node Type: perlmeditation [id://292539]
Approved by phydeauxarff
Front-paged by bart