http://www.perlmonks.org?node_id=242357

Monks,

A couple of things have come up today at work that've gotten me thinking about the pros (and cons) of using other people's code (in this case, modules) in a production environment.

I work for a Very Large Organisation, and, as such, we have some fairly strict (and I presume fairly standard) rules about what can, and can't, enter our production environment. If something hasn't undergone an extensive testing and parallel run, it can't be implemented. This is, in my mind, sensible - if things break, people get upset, and we all get fired unless it can be quickly fixed.

Now, we're in the early stages of implementing a number of new systems that output their data in XML. I'm looking into the options we have for reading this data, and I'd very much like to use a couple of CPAN modules rather than having to roll my own code to handle the XML. This is (for us) a controversial subject - we have almost a "no external modules" policy here, and are basically restricted to the "core" Perl modules and DBI.

I can't help but think it's a little bizarre, and probably a bad idea financially, to have programmers sit and essentially reinvent the wheel, when adopting a module and testing it extensively could, and probably would, be a more efficient use of time.

I guess I have two questions that I'm interested to hear thoughts about ... firstly, how standard is this sort of rule across the industry? I'm in my first (real) coding position, and I guess I'm interested to find out how other shops work with respect to this sort of thing.

Secondly, have people had problems using (mature) CPAN modules to carry out essential tasks within production code? The scripts for the area I work in cannot afford to "lose"/throw out data, and while extensive testing can help to prevent this, it's still very difficult to pick up one missing row in a couple of hundred thousand.

I think I can see the argument from both sides - it's a great safety net (for both the company and colleagues) to have someone local to talk to when things go wrong - but I can also see (from my own PoV) that it's very frustrating to have a pre-written solution, or partial solution, available, but being unable to make use of it.

Any thoughts or comments are welcome.

-- Foxcub
A friend is someone who can see straight through you, yet still enjoy the view. (Anon)

Re: Production Environments and "Foreign" Code
by adrianh (Chancellor) on Mar 12, 2003 at 14:50 UTC

    I've used a lot of CPAN modules in production code. CPAN is probably the primary reason I use perl for most of my work since it enables me to build quality apps quickly. Not using CPAN is... foolish :-)

    The important thing is to have some sort of process. My general rules when using other people's modules are:

    • Look for a test suite. If there isn't one be wary. Strongly consider writing one before you start using it :-)
    • Review the code.
    • Consider wrapping all the external code in a proxy so you can swap it out easily if necessary.
    • Make sure your integration tests exercise all the modules, not just the ones you wrote.
    • Keep an eye on CPAN for updates. Other people fix bugs (isn't it great :-)
    • Do not blindly update to the latest version of a module when it hits CPAN. Read the changelog. Run a diff against the one used on the production machine. Run regression tests.
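    The "wrap the external code in a proxy" rule above can be sketched roughly as follows. This is only an illustration under stated assumptions: the package name My::RecordReader, the backend table, and the 'builtin' stand-in backend are all hypothetical, and the stand-in exists only so the sketch runs with core Perl. In real use a second backend entry would call whichever CPAN module you had reviewed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical in-house facade: application code calls only this package,
# never the third-party module directly.
package My::RecordReader;

# Backend name => code ref that turns raw text into an array ref of rows.
# 'builtin' is a stand-in so this sketch runs with core Perl only; a real
# entry would delegate to the reviewed CPAN module instead.
my %BACKENDS = (
    builtin => sub {
        my ($text) = @_;
        return [ map { [ split /,/, $_, -1 ] } grep { length } split /\n/, $text ];
    },
);

sub new {
    my ($class, %args) = @_;
    my $name = defined $args{backend} ? $args{backend} : 'builtin';
    my $impl = $BACKENDS{$name} or die "Unknown backend '$name'\n";
    return bless { impl => $impl }, $class;
}

# The only method callers depend on; swapping the backend does not change it.
sub read_rows {
    my ($self, $text) = @_;
    return $self->{impl}->($text);
}

package main;

my $reader = My::RecordReader->new(backend => 'builtin');
my $rows   = $reader->read_rows("a,b,c\n1,2,3\n");
print scalar(@$rows), " rows, first field of row 2: $rows->[1][0]\n";
```

    Because callers only ever see read_rows, replacing the backend later means touching one table entry rather than every script that reads data.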

    Even if you do all of the above it will still take less time than writing something like DBI or Template from scratch.

    I've only very occasionally had problems. Some of the non-upward-compatible changes made to Class::DBI were the only ones that ever caused serious hassles - and that was because somebody else ignored the last point above :-/

    Hope this helps.

      ++adrianh.

      I absolutely refuse to install any module without a test suite (and I hate changelogs which lie -- diff is key).

      Another tip is to not rule out modules with artificial requirements.

      A module may say use Foo 5.5, but that may be only because that's the version they had installed (test anyway).

      Also, you can find lots of useful modules which could work with your version of perl, if they used the vars pragma instead of our $VERSION = 3.1;.
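      For illustration, this is roughly what the vars form looks like. The package name My::Module is made up for the example; the point is only that `our` needs perl 5.6 or later, while the vars pragma declares the same package variable on older perls too.

```perl
package My::Module;   # hypothetical module name, for illustration only
use strict;

# Perls before 5.6 have no 'our', so declare the package variable with
# the vars pragma instead. This form also works on later perls.
use vars qw($VERSION);
$VERSION = '3.1';

package main;
print "My::Module version $My::Module::VERSION\n";
```

      Whether a given module needs anything else from a newer perl is a separate question, so the module's own tests still have to pass after such a change.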


      MJD says you can't just make shit up and expect the computer to know what you mean, retardo!
      I run a Win32 PPM repository for perl 5.6x+5.8x. I take requests.
      ** The Third rule of perl club is a statement of fact: pod is sexy.

Re: Production Environments and "Foreign" Code
by IlyaM (Parson) on Mar 12, 2003 at 14:42 UTC
    What stops your company from adopting CPAN modules as its "own"? I mean, it is open source after all, not some proprietary black-box library. Get the modules you need into your CVS, review their code, and, if they don't have tests, write tests with 100% coverage if you want. It will still be more time-efficient than doing everything from scratch.

    --
    Ilya Martynov, ilya@iponweb.net
    CTO IPonWEB (UK) Ltd
    Quality Perl Programming and Unix Support UK managed @ offshore prices - http://www.iponweb.net
    Personal website - http://martynov.org

      Then, once you have those tests, upload them back to CPAN!

      If your management balks, do the following:

      1. Ask them if they like the rapid and safe development of their system.
      2. If they say "No", then that's an answer.
      3. If they say "Yes", then tell them what the cost would have been to hand-roll all the underlying modules. Estimate about 2-3 man-years (at least!) for DBI, CGI, XML::Parser, and others.
      4. Ask them for your back-pay.
      5. When they balk at that, tell them that this is the "cost" of using open-source software.
      6. Let them compare for themselves the cost of closed-source vs. open-source.

      ------
      We are the carpenters and bricklayers of the Information Age.

      Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

      Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

      Have you actually read the licenses?

      Depending on the license, mingling open source code with proprietary is generally not wise. Unless your developers are very clear about drawing a line in the sand between proprietary and non-proprietary (eg no you can't cut and paste from here to there), you can get into trouble.

      Sure the open source community is nice about it. Much nicer than the average corporation who you might have cut a deal with for code access. The FSF likes to get you to open source some affected code and then set up a voluntary compliance program rather than a lawsuit. But it is a real cost, and there is the fear that in 5-10 years someone who thinks that the GPL needs to be tested in court will be a real jerk about it. And depending on what your company does, you really might not want to unexpectedly have to GPL your code.

      This doesn't mean that your approach is wrong. But the thought of developers who think this stuff is all free (and don't appreciate the legal risks) is what keeps corporate lawyers awake.

        You have a point here. Before using open source software you should read its license and decide if it is appropriate for you. On the other hand, the majority of CPAN modules (but not all!) are dual-licensed under GPL/Artistic, and Artistic is a very flexible license with nearly no restrictions. I doubt using Artistic-licensed modules can get you in any trouble.

        --
        Ilya Martynov, ilya@iponweb.net
        CTO IPonWEB (UK) Ltd
        Quality Perl Programming and Unix Support UK managed @ offshore prices - http://www.iponweb.net
        Personal website - http://martynov.org

Re: Production Environments and "Foreign" Code
by Biker (Priest) on Mar 12, 2003 at 14:39 UTC

    I share your double point of view. Using external code is sensitive but may be very rewarding.

    Now, the other side of the coin is that Perl modules (like all Perl code) are distributed as source code. You can inspect all the code to verify that it suits your requirements and make any necessary changes.

    In theory, you could accept the project of rewriting all the XML related modules you need in your shop, copy them from CPAN and tell the folks you wrote it yourself. (That wouldn't be a very nice thing to do, but as a theoretical example it works.)

    Then imagine that you re-write the whole lot. Besides taking a lot of time, your code will most likely not be as well tested as the code you find in the most popular CPAN modules. I would guess that you would introduce more bugs by re-writing the stuff than by relying on already tested code. (With all due respect. ;-)

    In our shop we do extensive testing and we are happy with that.

    My (and our) opinion is that the art is not to reinvent the wheel, but to choose which existing wheel you should fit to your vehicle. I.e. choosing the right, mature and extensively tested CPAN modules is where you should put your efforts.


    Everything went worng, just as foreseen.

Re: Production Environments and "Foreign" Code
by BrowserUk (Patriarch) on Mar 12, 2003 at 16:32 UTC

    Work with their system, and use it to your (mutual) advantage.

    If you were going to implement your own in-house version of, say, XML::Parser, the first thing you need is a specification document: use the XML::Parser documentation as your spec. Pretty much the next thing you should be designing is your test suite, test data and acceptance criteria.

    Offer this as your test procedure.

    • Set up a test drone--network connected but firewalled for inbound-only traffic if they are paranoid.
    • It runs a script using XML::Parser, that monitors an inbound directory.
    • Have the owners/admins of the dept. producing or receiving XML data do daily ftp dumps of the stuff they produce/receive to the test drone.
    • Whenever an XML file appears there, have the monitoring script parse it, and then simply re-write the XML to a new file in another directory.
    • Then, preferably using a separate non-perl process, use diff or something similar to compare the results and raise an alert if differences are found.
    • Have a human procedure that formally investigates and explains the differences.
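    A minimal sketch of the monitoring script in the procedure above, under loud assumptions: the directory names 'inbound' and 'roundtrip' are invented, and round_trip() is a placeholder that just passes the text through so the sketch runs with core Perl only. In the real drone that function is where the module under evaluation would parse the XML and write it back out. The comparison step is left to an external diff, as suggested.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Placeholder for "parse with the module under evaluation, then serialise
# again". It returns the text unchanged; a real drone would call the parser.
sub round_trip {
    my ($xml_text) = @_;
    return $xml_text;
}

# Read a whole file into one string.
sub slurp {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot read $path: $!";
    local $/;                       # slurp mode for this scope only
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Round-trip every .xml file in $in that has no counterpart in $out yet.
# Returns the names of the files handled in this pass.
sub process_new_files {
    my ($in, $out) = @_;
    opendir(my $dh, $in) or die "Cannot open $in: $!";
    my @names = sort grep { /\.xml\z/i } readdir $dh;
    closedir $dh;

    my @done;
    for my $name (@names) {
        my $target = File::Spec->catfile($out, $name);
        next if -e $target;         # already handled on an earlier pass
        my $text = slurp(File::Spec->catfile($in, $name));
        open(my $ofh, '>', $target) or die "Cannot write $target: $!";
        print {$ofh} round_trip($text);
        close($ofh) or die "Cannot close $target: $!";
        push @done, $name;
    }
    return @done;
}

# Assumed directory names; pass two arguments to override them.
my ($inbound, $outbound) = @ARGV == 2 ? @ARGV : ('inbound', 'roundtrip');
if (-d $inbound && -d $outbound) {
    print "round-tripped $_\n" for process_new_files($inbound, $outbound);
}
else {
    print "nothing to do: $inbound or $outbound is not a directory\n";
}
```

    Run from cron, each pass only touches files it has not seen before, so the external diff job can compare the two directories at its own pace.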

    Management willing, that shouldn't take more than a few days to set up at the start of the project. You then sit down to write the rest of the application--the part that uses the XML::Parser-like clone you are going to write--using the XML::Parser interface, and using XML::Parser as a substitute for your own clone for development purposes.

    If, by the time the rest of the application is written, the test drone process has highlighted serious or repetitive errors in XML::Parser, then you can set about writing (or refactoring) Your::XML::Parser, with the additional knowledge gained of the original's weak spots and deficiencies. If, however, it proves to be reliable, then you have saved the company money, and hopefully gained a little respect/kudos for yourself and CPAN at the same time.

    Write everything up as a proposal up front, nothing hidden. Your bullet points for presenting the proposal and the management summary are:

    1. You reduce risk by starting out with a tried and tested specification for the XML processing.
    2. You allow a fast start for the team developing the business processing part of the application, by decoupling their development from the XML parsing development through a mature interface design.
    3. You get a head start on discovering anomalies and problems rooted in the inbound XML, and decouple those problems from the rest of the development work by isolating them before they get there.
    4. You raise the potential of saving the company money and reducing the time to go live.
    5. You establish a possible approvals mechanism (and precedent) for adoption of further CPAN modules.
    6. If XML::Parser (or other chosen module) proves not up to the job, and you have to write your own, many of the above benefits will still have been achieved, and the few days of effort required to set up the test drone are more than offset by those retained benefits.

    If you think that it would be viewed in a sympathetic light, you could add a not-too-obscure footnote somewhere to the effect that if the process pans out and the company does adopt the module, thereby saving X man-hours/$1000s in development costs, a contribution by the company to the Perl Foundation of some percentage of that saving would be beneficial and gratefully received.


    Examine what is said, not who speaks.
    1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
    2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
    3) Any sufficiently advanced technology is indistinguishable from magic.
    Arthur C. Clarke.
Re: Production Environments and "Foreign" Code
by dragonchild (Archbishop) on Mar 12, 2003 at 14:28 UTC
    Most of my employers have had the attitude "Do whatever you like, but it's your $#%@ if it breaks." I personally have never run into a similar policy.

    As for modules ... my feeling is that the "core" modules are a subset of the real "core" modules. I have never had a problem d/l'ing almost any CPAN module and having it work according to docs. (This doesn't mean that I didn't have to extensively augment/modify the module(s), but that's another story.) No data loss, no corruption, no nothing. YMMV.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

      My advice to you, and to the poster of the root node, is to "protect your $#@%@" (to the root node author: that's if you do convince your employer to allow you to use CPAN ;-) ). This means a couple of things that people have already said, as well as a couple more unique suggestions:
      • Check http://testers.cpan.org, but don't take it to be perfect. Take notice of whether the failures apply to you (meaning check whether the failures are platform specific, dependency problems, etc.)
      • Check http://rt.cpan.org. This is a great resource.
      • Since it's open source check the code on your own.
      • Registered modules seem to be more reliable/better tested than unregistered modules.
      • They also provide a DLSIP definition that can be very helpful in assessing the viability of the module. This is key to me; the most important things about it are the support level, and developmental stage.
      • Check around with fellow perl coders. See if any of them have ever had trouble with the code.
      • Check the version of the code (duh). 0.0x generally means something is beta or incomplete, treat it as such.
      • Install the module on your/your company's test box, and test it. Don't only use their test software; do further regression testing, and test the code in the environment you'll use it in (what will you use it for?)
      • If you still can't discern the credibility, email the author, and ask them if they have to 'fix' anything for the next version of the software.
      • Check the author's credibility. Damian, Lincoln, etc. provide instant credibility because their name is on their code. They can't publish crap because then their name value goes down. On the other hand, if you're using a module by a fool like me, you might want to think twice about putting it in an environment where your job is on the line. { grin }

      Gyan Kapur
      gkapur@myrealbox.com
Re: Production Environments and "Foreign" Code
by adrianh (Chancellor) on Mar 12, 2003 at 15:29 UTC

    On a slightly sneakier front I have found something like this works quite well when you have to deal with Not Invented Here syndrome:

    "I agree completely. You're correct and we need to write the modules ourselves. However, in the interest of producing a prototype quickly why don't we use these modules for the moment. They have a nice API which we can keep, and we can rewrite the internals later once we have things up and running."

    "Later" never arrives. Especially when the expense of rewriting something that works becomes obvious.

Re: Production Environments and "Foreign" Code
by djantzen (Priest) on Mar 12, 2003 at 14:47 UTC

    I cannot speak to your first question, however I can say that my experience with CPAN modules has been consistently positive and satisfying. Moreover, in cases where I've encountered problems or shortcomings, I've found the support from the Perl and OSS communities to be superior to anything offered by proprietary software companies. And of course the availability of the source means that modification/extension of the downloaded software is a feasible task.

    Your employers need to be aware that utilizing a CPAN module in no way means that they won't have good support, or even local support. Almost certainly you'd be better off grokking and being the local goto guy for the local installation of a prewritten e.g., XML module than writing one from scratch and handling all of the design, development, and debugging yourself.


    "The dead do not recognize context" -- Kai, Lexx
Re: Production Environments and "Foreign" Code
by PodMaster (Abbot) on Mar 12, 2003 at 14:58 UTC
    IMHO, such policies result from bad experience and ignorance. Did you hand-roll your CGI parsing code?

    I very much doubt you'll experience data loss, and if you do, you should have your backups (duh).

    Every XML parsing module has about 0% chance of losing data on its own (it's only reading after all, and that can't be damaging :D).

    Now, look at the reports of cpantesters, and check your target platform(s) for success/failure (a lack of test reports doesn't mean much; inspect reports closely, as not all cpan-testers know what they're doing -- true even more today due to CPANPLUS)
    http://testers.cpan.org/search?request=dist&dist=XML-Parser
    http://testers.cpan.org/search?request=dist&dist=XML-LibXML
    http://testers.cpan.org/search?request=dist&dist=XML-Twig

    Do a codereview like monks have already suggested, but also be sure to review the tests (and the more tests there are, the better).

    Look for bug reports (http://rt.cpan.org).

    Now, also look at who wrote the module. I frankly wouldn't do too much of a codereview on a module like Text::xSV which is written by tilly. Some of SCHWERNs stuff I might inspect a little closer though ;D


    MJD says you can't just make shit up and expect the computer to know what you mean, retardo!
    I run a Win32 PPM repository for perl 5.6x+5.8x. I take requests.
    ** The Third rule of perl club is a statement of fact: pod is sexy.

      We don't parse CGI .. *grin* .. at the minute it's all CSV and TDT in flat files, and yes, the modules to read them are all hand-rolled (long before my time). As I said in the original post, we have core Perl modules and DBI installed, as well as the in-house modules people have written over the years, and nothing further than that.

      By losing data I meant badly-formatted or wrongly tagged lines being silently kicked out, not the module itself failing to read or "damage" data. Error reporting and handling is, I believe, one of the reasons management here decided to move away from external code - we're *very* liable if something isn't reported on correctly - and forcing people to write their own code to complete tasks makes you at least stop and think about how the code will cope if the data isn't the *exact* format it should be (spaces in tags, blank lines in the middle of XML, things like that).

      Similarly, the scripts can't fall over if they encounter data they don't know what to do with - errors should be reported and the reports run with the data that *does* exist - we can always re-run that section of the batch run if needs be the following day.
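      A small sketch of that requirement, with assumptions stated up front: parse_line() is a made-up stand-in (it only checks for three comma-separated fields) for whatever parser, hand-rolled or from CPAN, ends up doing the work. The part that matters is the wrapper around it: every bad line is caught, recorded with its line number, and counted, so nothing is kicked out silently and the run still finishes with the data that does exist.

```perl
use strict;
use warnings;

# Hypothetical record check: a line must have exactly three
# comma-separated fields. Dies with a message on anything else.
sub parse_line {
    my ($line) = @_;
    my @fields = split /,/, $line, -1;
    die "expected 3 fields, got " . scalar(@fields) . "\n" unless @fields == 3;
    return \@fields;
}

# Parse every line and never drop one silently. Bad lines are kept with
# their line number and error text so they can be reported and re-run.
sub load_records {
    my (@lines) = @_;
    my (@good, @rejected);
    my $n = 0;
    for my $line (@lines) {
        $n++;
        my $rec = eval { parse_line($line) };
        if (defined $rec) {
            push @good, $rec;
        }
        else {
            my $why = $@;
            chomp $why;
            push @rejected, { line => $n, text => $line, error => $why };
        }
    }
    # Reconciliation: every input line must be accounted for.
    die "row count mismatch\n" unless @good + @rejected == @lines;
    return (\@good, \@rejected);
}

my ($good, $rejected) = load_records("a,b,c", "broken line", "1,2,3");
printf "%d loaded, %d rejected\n", scalar @$good, scalar @$rejected;
print "line $_->{line}: $_->{error}\n" for @$rejected;
```

      The same wrapper works whoever wrote the parser, which is the point: the error-handling discipline lives in your code either way.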

      This is a policy that's existed since long before I got here, and while I'm arguing against it, I can see why it exists. Saying it's all down to ignorance is all well and good, and I agree, it doesn't make a lot of sense, but when you're fighting against years of "this is just the way we do it here", I don't know if progress can ever easily be made. People can, and do, get very set in their ways - even minor changes to policy can come across as a very big thing.

      -- Foxcub
      A friend is someone who can see straight through you, yet still enjoy the view. (Anon)

        Well arm yourself with knowledge (the perlmonks can help, especially if you choose XML::Twig and ask mirod ;D).

        It's easy to fight the PHBs (pointy-haired bosses) if you've got the right ammo.

        It is your duty as the developer/programmer, to get that ammo and shoot it up their wazoo, until they give in (see article referenced in Article on how to be a programmer, same sentiment).


        MJD says you can't just make shit up and expect the computer to know what you mean, retardo!
        I run a Win32 PPM repository for perl 5.6x+5.8x. I take requests.
        ** The Third rule of perl club is a statement of fact: pod is sexy.

Re: Production Environments and "Foreign" Code
by demerphq (Chancellor) on Mar 12, 2003 at 18:27 UTC

    we have almost a "no external modules" policy here, and are basically restricted to the "core" Perl modules and DBI.

    There's not much to add to this thread but a minor observation. The "core" modules have been evolving for quite a while. Not only that, but there are several collections of "core" modules. ActiveState fwict bundles a _lot_ more into their releases than are in the "standard". Each new version of perl has a slightly different set of modules accompanying the code, and not only that, they have different versions. (Does your company let you upgrade modules in the "core" from CPAN? Will it let you install a module that has been included in a later perl release but not in the one you use?) A few modules are IMO still in the core because they have been in the core for so long nobody wants to remove them, _but_ they have been generally replaced in new code by other more powerful modules anyway. File::Basename vs File::Spec comes to mind as an example. Does your company have a policy about which of the two should be used?

    Afaik the list of modules that is included is constrained more by space (how many users really are going to use Parse::RecDescent, for instance?) and a module's utility to actually building perl itself than by the module's worthiness. Not that unworthy modules would ever make it into a standard release, but there are tons of worthy modules that will never get included because their user base wouldn't be large enough to justify burdening every install with them. (The size of the distribution is already being discussed in some circles as excessive.)
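    One way to check how the "core" set has shifted between releases, sketched on the assumption that Module::CoreList is available (it is a CPAN module that is also bundled with more recent perls, not with something as old as 5.004). The module names queried are just the ones mentioned in this thread.

```perl
use strict;
use warnings;
use Module::CoreList;

# For each module, ask which perl release first shipped it.
for my $module (qw(File::Basename File::Spec Test::More)) {
    my $first = Module::CoreList->first_release($module);
    printf "%-15s %s\n", $module,
        defined $first ? "first shipped with perl $first" : "not in core";
}
```

    A "core modules only" policy then has to say which perl's core it means, since the answer differs from release to release.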


    ---
    demerphq


      Does your company let you upgrade modules in the "core" from CPAN?

      No.

      Will it let you install a module that has been included in a later perl release but not in the one you use?

      No.

      We're using Perl 5.004.04 (which I know is full of bugs and some quite serious security holes anyway). If it didn't come with that, we don't have it, and we can't get it, at least, that's the way it works at the minute.

      I agree, it's a crazy system; I personally think that it causes more problems than it solves. I'm all for CPAN - I use a lot of the modules from there in my own personal code, and I see no reason not to use CPAN from a corporate point of view.

      I do think the current core module set is pretty well balanced, though: it allows a lot of development and useful reuse of code without bloating the distribution too much. Size seems to be very rarely an issue nowadays anyway, with disk space no longer costing what it did a few years back.

      -- Foxcub
      A friend is someone who can see straight through you, yet still enjoy the view. (Anon)

        We're using Perl 5.004.04 (which I know is full of bugs and some quite serious security holes anyway).

        You do what? No, seriously. I think there's something wrong with that attitude.
        I'm not arguing the usefulness of evaluation of software to be used in production - by no means.

        But this seems reckless to me. On the other hand, very probably your codebase actually relies on those bugs in external software (I've seen that oh so often ... :-( ), and if it does, upgrading would be disastrous indeed. But consider: how many people actually know the trouble spots in 5.004.04 and know how to avoid the pitfalls? Is the primary reason to distrust outside code a missing test suite which ensures the robustness of your system?

        Is your staff actually a good enough team to tackle XML parsing on its own without taking years to do it right, and without hiding disastrous bugs in that code? And if they are good enough, wouldn't they be good enough to do proper code review on existing modules and feed those bug fixes back to the community, serving others as you perfect your system?

        I know from personal experience that we employed external code (CPAN modules in this case) which contained bugs. We found them in our testing environment, fixed them and sent patches to the author.
        In other cases we looked into modules which supposedly did what we needed, but simply weren't usable. So we wrote our own.

        Ergo: By using external code, you don't have to use all external code there is. Which means: You can still write your own if you want/need to. Careful selection of modules is of course necessary as some others have noted above, too.

        janx

Re: Production Environments and "Foreign" Code
by hardburn (Abbot) on Mar 12, 2003 at 15:05 UTC

    Any CPAN module worth installing is (at the least) going to run a few basic tests before it is installed. You can add your own tests if you want (usually just a matter of adding a script in the t/ subdirectory of the module). You can audit the code if you want--a tedious task, but a far better use of expensive programmer time than rewriting it. Something as popular as an XML parser is going to be widely used and tested, so you may not even bother.
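    As a rough idea of what such an added test script might look like: the file name t/99-local.t is invented, and File::Basename stands in for "the module under evaluation" only so that the sketch runs with core Perl. It assumes Test::More is installed, which it is on reasonably recent perls. The checks are meant to pin down the specific behaviour your own code relies on.

```perl
# t/99-local.t (hypothetical file name): extra checks for the behaviour
# your own code depends on, run alongside the module's shipped tests.
use strict;
use warnings;
use Test::More tests => 3;

# File::Basename is a stand-in for the module being evaluated.
use File::Basename qw(basename dirname fileparse);

is( basename('/data/in/feed.xml'), 'feed.xml', 'basename strips the directory' );
is( dirname('/data/in/feed.xml'),  '/data/in', 'dirname keeps the directory' );

my ($name, $path, $suffix) = fileparse('/data/in/feed.xml', qr/\.xml/);
is( $suffix, '.xml', 'fileparse splits off the suffix we expect' );
```

    Dropped into the distribution's t/ directory, a script like this runs with the rest of the suite on every "make test", including after each upgrade.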

    If your employer still wants it rewritten, I'd just give up and go along with it. It's their money to waste. I suggest documenting all your suggestions beforehand, so that when (almost certainly not 'if') the whole project turns into a stinking mess and management is looking for someone to blame, you have something to point at.

    ----
    Reinvent a rounder wheel.

    Note: All code is untested, unless otherwise stated

Re: Production Environments and "Foreign" Code
by perrin (Chancellor) on Mar 12, 2003 at 17:37 UTC
    What are they afraid of? Bugs? Your OS has bugs. So does your database. Paying lots of money, or writing it yourself does not preclude bugs. However, with CPAN modules you have the source, so you can do as much investigation as you need to before putting that code into production. Add in extra logging, throw torture tests at it, etc. This is the beauty of having the source.
Re: Production Environments and "Foreign" Code
by Abigail-II (Bishop) on Mar 12, 2003 at 21:42 UTC
    Well, if the rule is "no foreign code", then that rule is silly. Or did they write all their OSses, editors, compilers, MTAs, database, webservers, themselves?

    However I can understand a policy of saying "this and this and this and this we take from the outside, and anything else needs to be evaluated first". Anyone can upload as much crap on CPAN as they wish. The standard distribution has been tested by a lot of people - but you don't know that from some random module on CPAN. I don't share the often voiced opinion "if it's on CPAN, it has to be good", although it's usually not said with those words. There are a lot of crap coders out there. There are a lot of crap Perl coders out there as well. All that's required to upload something to CPAN is pushing a few buttons on a webpage. Pause is not a 'bad code' firewall.

    I've used CPAN modules in production code. But only well used modules, which have gotten good reviews.

    Abigail

Re: Production Environments and "Foreign" Code
by michaeld (Monk) on Mar 13, 2003 at 10:01 UTC

    I work for an international VLO also. My experience is quite similar to yours: as long as the software has got a nice label on it and costs LOADS of money, it's OK.
    Preferably, the package should come with a costly upgrade program and a - yet again - very costly support program.

    So when people like me come up with Open Source Software, or any other free and readily available stuff, managers tend to get frightened and turn away....

    I think it's a very common phenomenon which IMO is caused by two things:

    Ignorance
    Plain ignorance about what Open Source really is. To most managers, it only represents a risk (and I must admit that they're right to some extent). What they fail to see is that it also represents possibilities (to tailor the software to your own needs, etc...)
    Ergo: there's still a lot of work in advocating open source software... get to it!!!

    Fright
    Nobody ever got fired for choosing IBM (or any other major player in IT...)
    "What will I look like if Ze Big Boss finds out I've used Open Source in my project???"
    I don't think this needs much explanation, does it?

    As for myself, I use Perl to "fill up some of the gaps" in the project I'm working on: checking data integrity,etc...
    I don't think management is fully aware of this yet, but don't think I'll get much head wind when they do find out (as it does the job..)
    I'll keep my fingers crossed.

    Cheers
    Michael.

Re: Production Environments and "Foreign" Code
by helgi (Hermit) on Mar 13, 2003 at 14:21 UTC
    I find this a totally bizarre requirement/policy.

    Is it only Perl programmers who labour under these ridiculous restrictions or everybody?

    How on earth would Java or C++ or Delphi programmers function without "foreign" code?

    The whole idea is ludicrous.

    --
    Regards,
    Helgi Briem
    helgi AT decode DOT is

      I think we all bump into these kinds of restrictions some time.

      But... there's no need to despair just yet!...I seem to remember that not long ago, Java was looked upon as "weird-stuff-never-to-be-touched-by-a-decent-IT-person". (In our company anyway...)

      And look where Java ended up today!

      It's just a matter of time and patience...

      Cheers,
      Michael.

      The snide answer is that only Perl programmers are productive enough to be able to rewrite the foreign code and still get their work done.

      But the truth is that there are many who have helped major projects crash and burn by re-inventing the wheel. Check out Crash And Burn Java, for instance: among its tips is to re-invent your infrastructure so that you can use up some programming talent and then have something else to sell afterwards. (That rant is, I understand, based on an actual project.)