Beefy Boxes and Bandwidth Generously Provided by pair Networks DiBona
Perl-Sensitive Sunglasses

Best way to fix a broken but functional program?

by idnopheq (Chaplain)
on Aug 22, 2001 at 22:03 UTC ( #107060=perlquestion: print w/ replies, xml ) Need Help??
idnopheq has asked for the wisdom of the Perl Monks concerning the following question:

Hi, All!

I am plagued, PLAGUED I said, by a bit of perl exceeding 4K lines, where 'use strict;' and '-w' were actively avoided. Not only that, the program is replete w/ symbolic references, vars never accessed, subs called in void contexts, the Holy texts never consulted, no external modules used, and - as a result - the wheel is reinvented poorly more times than the Yankees have World Series titles. Enabling warnings, strict, *or* diagnostics overflow my scrollback buffer.

I want to laugh at it, but my incescent sobbing gets in the way. On top of it all, my girlfriend won't let me bitch about it anymore ( I'm amazed at how much bitching can be fit into a 36 hour period! ).

My query is 'How to I begin knocking this damn thing into shape?'. Of course, some suit made this huge CRAWLING HORROR, nay WOMBAT, mission critical, and the unstable mind that brought this monstrosity into the world is no longer with us ( in more ways than one, RIP! HHOS ). The folks running either end are third generation and don't know how this works, either. Thus, I inherrited this damnable thing and it's the glue somehow.

The thing stinks slightly of PASCAL, too.

Via vgrep, I cannot certainly discern what the script does, but we've proven we need it to function in our e-commerce app. Best I can tell it querries data from a VAXen or two based off of some kind of mangled pre-SQL query from an HTML text box and kicks a job off to some knock-off Win32 app. The inputs and outputs are unclear at best, though analysis of the tcpdump output may reveal something.

Were it elegant but obfuscated, that would be one thing. But it looks so nasty that I think it was writen in M$ Notepad ( except for the length ). I cannot post it, as it may or may not divulge secrets into life, the universe, and everything. So, I am looking for methodology.

So far, I've devised:

  1. enable warnings, and eliminate each.
  2. enable strict, and eliminate each.
  3. follow Camel et al. and follow sane coding practice.
  4. test, fix, test, etc. ( tho I actually have no means to test outside of production at this moment )
  5. reach a critical mass, when there is nothing left to fix and/or remove.
  6. document, document, document, especially as there's none today.

I'm guessing at least a month to fully nurse this thing to what I consider health and continue with my normal duties, but is there an easier way to do it? I don't normally look for a magic spell to fix a problem, but I dunno what else to do. Granted, I have EMACS and the ever so lovely cperl-mode to help me through formatting, highlighting, and the imenu, but I've not tackled something so monsterous and ugly before. I fear fixing it to the point it no longer works.

Advice from the more proficient or those who have tackled something like this is most appreicated and solicited. BTW, I did do a Super Search and some Googling, but nothing truly satisfactory was returned ... did I miss something?

Apply yourself to new problems without preparation, develop confidence in your ability to to meet situations as they arrise.

Comment on Best way to fix a broken but functional program?
(jeffa) Re: Best way to fix a broken but functional program?
by jeffa (Chancellor) on Aug 22, 2001 at 22:15 UTC
    You can't figure out what the program is supposed to do, eh? My first recommendation is to understand what the problem is and re-write the solution. If you guess it will take you a month to 'fully nurse that thing' then i guess it will take about a week to re-write it. Maybe you could take advantage of CPAN modules even.


Re: Best way to fix a broken but functional program?
by mr.dunstan (Monk) on Aug 22, 2001 at 22:16 UTC
    Scary situation, not fun, but I think determining a strategy is your best bet. I don't know if this has ever been done, but I wonder if you could write a utility that would figure out all the scalars, arrays and hashes declared, and have this utility grep thru the code to figure out where 'my' hasn't been used.

    It might be too gnarly a path to even go down though ...

    I wonder if you could break the code into smaller chunks and lib-ify it, so you can fix and do unit testing on individual subs?

    If the code is bad enough I usually just plan on transplanting code into a new file and fixing as I rewrite each sub, that way there's a method to the madness, and I can incrementally fix the code without 'breaking' the original.

Re: Best way to fix a broken but functional program?
by traveler (Parson) on Aug 22, 2001 at 22:20 UTC
    Well, The Elements of Programming Style by Kernighan and Plaugher says, "Don't patch bad code -- rewrite it". Pretty good advice. A month of patching vs rewriting may or may not be a simple choice, but it may be the one you have to make.

    If you decide to attack it as it is, perhaps the best thing to do is to first break it into "chunks" whether these are functions, loops, blocks of declarations or whatver, so you can begin to tackle the program. Without seeing this disaster it's hard to discern the best method of chunking.

    HTH, --traveler

Bite off a little at a time...
by dragonchild (Archbishop) on Aug 22, 2001 at 22:28 UTC
    The very first thing to do is to run it and get a set of known output. You don't want to be messing with something you don't understand until you have a baseline.

    The second thing is to make a copy somewhere, not in your working directory. (Oh, you do have a directory structure just for this effort, right?) Then, make another backup copy in your working directory. Then, you have your working copy. If this thing is critical in your e-commerce system, then you have to maintain a copy that works, even if it's unmaintainable.

    The third thing is to print out a copy and take it home. Go to a diner and drink coffee, smoke a pack of cigarettes, and read the code. Mark up your printout.

    The fouth thing is to reformat it. DO NOT MAKE A SINGLE CODE CHANGE! Simply get the code to "look normal". Get the squigglies to line up. Get the capitalization of if-then out of there.

    Up until this point, you haven't made a single functional change. You've also read the code at least 2-3 times all the way through, in the process of reformatting. Now, before enabling strict, start going through and making it strict. You haven't enabled strict, yet.

    Every iteration of this, you have to re-run it and make sure it still works. You will have to find a way of doing this. THIS IS THE MOST CRITICAL PART. Give it the exact same input and run against the known output. diff the files and make sure their identical. If they're not, back up and find out what you accidentally changed.

    Once you think you got all the my things in place, enable strict vars. Do NOT enable strict 'refs' yet. You'll go nuts. Don't be afraid to 'use vars qw(....)' a lot. Clean up the ones you missed.

    After that, start moving the code around. Make functions. Start uncoupling the data. Start localizing the data. Start perl-izing the code. Use hash-slices. Use unpack instead of substr. Use the backwards-if. Use common-sense.

    Once you have that done, you should have gotten most of the symbolic references cleaned up. Enable strict 'refs'. Clean up what you missed.

    Now, go to full strict. Clean up what you missed.

    Now, add warnings. Clean up what you missed.

    During this entire process, document what you can.

    Don't be surprised if you discover that, in the process of cleaning this up, you have to clean up other stuff. Maybe there were globals that were cross-file. Maybe there were dependencies you didn't know about. Leave those dependencies in there for the other files!! Don't attempt to touch anything but the file you're working on. You'll go nuts. Mark them to be cleaned up later.

    Then, when you're done, you'll probably be down to under 1000 lines and have a very easy script to work with.

    I'd estimate a good 2-3 weeks completely on this.

    /me wants to be the brightest bulb in the chandelier!

    Vote paco for President!

      Well, all, I guess the core problem is that I cannot decypher it enough to support it, which I have been tasked with. By some miracle, the process has run for over a year w/o restarting, so there is magic within it, no doubt. But what is that story from the Jargon file, about Mel, a real programmer?

      I see areas where things like Net::Telnet maybe employed versus opening raw sockets, where Tie::Hash would make life easier, even where removing seemingly obvious redundencies and lengthy if-elsif statements, would improve things. But, being so sensitive to day-to-day operatios, is there a delicate way to proceed?

      Like I said, cperl-mode will help me with lining things up, and hopefully tcpdump will tell me what is passing between the various hosts. I would most definitely rewrite the thing, but w/o a model office to play in, what can I do other than tweek and test?

      Remember I'm working IT for the auto industry, in a group of four which used to be twenty.

      Apply yourself to new problems without preparation, develop confidence in your ability to to meet situations as they arrise.

        A little at a time, my friend. You have to go slowly.

        The very first thing is that you should ALWAYS have some manner of development box which (relatively) closely mirrors the production environment. What're you going to do, develop in production?!? Of course not!

        Get it lined up. That will help you understand what's going on. Comment it extensively. Then, start breaking out functions. Rename variables. Take it very slowly.

        You really can't do very much, unfortunately, without that development area. Demand that, or say that what the manager requested cannot be safely done.

        /me wants to be the brightest bulb in the chandelier!

        Vote paco for President!

Re: Best way to fix a broken but functional program?
by chromatic (Archbishop) on Aug 22, 2001 at 22:30 UTC
    I'd run it through perltidy (on SourceForge, I believe) first, so the structure is workable. You might also use B::Deparse, to get it it something resembling actual code. Then read it. All of it. Print it out and mark it up, if necessary. 4000 lines translates to ~60 pages, at 8 point font. That's not too awful.

    Having read it, you might have noticed duplications or near duplications. There's your chance to factor out common code into subroutines. This also allows you better encapsulation -- you can use lexicals instead of globals, passing in and receiving data. The program will get shorter and shorter, as you do this.

    Based on my experiences reworking File::Find (soon, soon), you may be able to cut out 25% or more of the code.

    (Extra free advice: wouldn't it be nice if this old code had unit tests, so you could change things and verify it still performed as expected? Someday, your code will be maintained. Make life easier for someone else.)

      THX, chromatic, perltidy looks tasty. :p

      Just from reformatting via cperl-mode, the script came down about 200 lines. But, as you say, the globals are sticky and I began replaicing pages of if-elsif code into simple switch/case statements, removing many lines of useless code.

      So far, tho, the tcpdump of the TCP/IP traffic is proving unhelpful as the Sun box and the VAXen share data amongst various processes. The Sun 2 W2K is more helpful. I can at least see what the two sides are talking about ...

      Looks like I will be able to totally redo at least half of the existing program.

      Apply yourself to new problems without preparation, develop confidence in your ability to to meet situations as they arrise.

Re: Best way to fix a broken but functional program?
by runrig (Abbot) on Aug 22, 2001 at 22:30 UTC
    I would vote for rewriting the thing from scratch. You'll probably spend roughly an equal amount of time whether you rewrite or fix, and if you rewrite you will (almost certainly) end up with better code. Of course, alot of this time will be spent in determining what the program does, and 'breaking it into' smaller pieces as others have suggested, but just think, if its that badly written, I bet you can turn that 4K program into 1K (ok, maybe 2) :)
Re: Best way to fix a broken but functional program?
by Corion (Pope) on Aug 22, 2001 at 22:40 UTC

    I'd follow the suggestions above and do a rewrite instead of patching up the broken program. But even if I were to patch it up, here's my approach (free, and thus, worth its price) :

    1. First, make a backup of all files that seem remotely interesting, and store that backup on a CD, on a directory of the same machine, and possibly in your personal workspace somewhere.
    2. Set up your modified version, be it the patch or the rewrite, in a different directory than the original file. Use symlinks for data files if necessary, or better, use copies.
    3. Test your program in parallel to the original - check that the output of your new program is identical to the output of the original. You will face interesting challenges if the original modifies the input data - then you might have to isolate/rewrite the input and output routines to create a copy of the input and output, praying that the original program still works (You did make the backups, right ?).
    4. In each development step, compare your output against the original output (Perl can do this for you). Thus, you will end up with a program that is not as broken as the original, but works the same.
    5. At some time, you will decide that the original produces wrong data. Ask another, non-technical, person who should know what they're doing, if your data seems more correct than the original data. If the modification is OK, adapt your test result routine to accomodate for the difference in output, and go on.

    This strategy will keep you close to the original, whatever way you chose to implement the functionality of the original. But by forcing your input/output to be the same, you are also bounded by those design decisions and miss the chance to refactor the whole process. But as the whole process seems to be lost in the winds of time, first trying a local fix seems much more manageable.

Re: Best way to fix a broken but functional program?
by ralphie (Friar) on Aug 22, 2001 at 23:15 UTC
    why not install ptkdb, run it as the debugger, and step through the thing a few times watching what it does? if you're really not sure what goes in and what goes out you might just want to let it cycle through its structure a few times.
Re (tilly) 1: Best way to fix a broken but functional program?
by tilly (Archbishop) on Aug 22, 2001 at 23:37 UTC
    I am going to suggest a completely different way of doing this.

    You know what the script is used for, don't you? Well define your task into 2 interacting jobs:

    1. To write a library or libraries that make it easy to write acripts that do its task or tasks.
    2. To simplify this script by having it written in terms of that library or libraries.
    If you were to take these two jobs to completion, you should have a clean library and a simple script. At present you are nowhere near this goal and have no idea how you are going to get there.


    What you want to do is look at the script and identify something simple which might belong in a more general library that you can easily drop into the script. Found it? Good. Now start a simple module. You can use the following template:

    package My::Module; # Magical export invocation. use Exporter; @ISA = 'Exporter'; @EXPORT_OK = qw(); # Will fill in # Rest of header section. use strict; # Body here. # Will fill in 1; __END__ =head1 NAME My::Module - Will fill in =head1 SYNOPSIS use My::Module; # Will fill in =head1 DESCRIPTION Will fill in =head1 BUGS AND LIMITATIONS None known.
    Note that the name of the module should match the package name, in this case My/, and where it goes depends on your current development setup.

    Now go to the hideous, ugly script and have it use your module. Next take that simple function or bit of functionality which you identified, and put it in your module. Put the name of it in @EXPORT_OK, describe it and then put it in the import list in your script. In other words put a bit in all the things which are, "Will fill in" above. Then replace a bit of the script with your brand new function.

    Wash, rinse, and repeat, developing tests, refactoring your library, etc as necessary. At no step are you writing the script from scratch. But you are eating away at it, and eventually there will be little enough left that you can do the rewrite easily. Plus when you get done you have this convenient library.

    Note that this approach is not doable with all kinds of scripts. But when it is appropriate, it can be a good path to know about from ugly legacy code to a cleaner design.

Re: Best way to fix a broken but functional program?
by cLive ;-) (Parson) on Aug 22, 2001 at 23:47 UTC
    what fun!

    Here are my suggestions...

    Add a log function:

    sub log { open (LOG,">>log") || die $!; print LOG "$_[0]\n"; close LOG; }
    Then, at the beginning and end of each sub-routine, add:
    sub sub-name { log("Entering sub-name"); ... log("Leaving sub-name"); }
    Then add various log calls when things happen that you're unsure of, tacking vars used. eg:
    log("querying VAX: $vaxen_query"); execute $vaxen_query; log("SQL query: $mangled_sql_query"); execute $mangled_sql_query;
    Adding some vars fom caller() may also give you some added insight.

    Once you have some idea what gets sent where and when, you can replan what the scipt does and rewite cleanly.


    cLive ;-)

      If you want to write such a logging function, there is no need to add it manually everywhere. Try just dropping the following at the start of the script:
      foreach my $used (keys %main::) { no strict; if (defined &$used) { my $old_sub = \&$used; *$used = sub { print "Calling $used\n"; &$old_sub; }; } }
      This one only abuses subroutines in your main package. Alter as desired. :-)
Re: Best way to fix a broken but functional program?
by Sifmole (Chaplain) on Aug 23, 2001 at 00:07 UTC
    I would argue against blanket rewriting. A complete rewrite also means a complete QA/Defect resolution cycle. What you are basically looking at is a piece of code that needs a serious refactoring.

    The suggestions regarding Unit Testing and using them after each little change is an extremely good one. You will save yourself significant trouble and time by making sure you have several examples of the various outputs based on various inputs; This is true whether you rewrite or not, especially if other processes use the codes output as input.

    tilly suggestion regarding breaking code functionality into seperate libraries is very logical. Doing this as part of the refactoring is natural and will increase the potential for the result code to be much easier to maintain.

    Martin Fowler has a book devoted to refactoring, Refactoring: Improving the Design of Existing Code. It is a good resource for recipes to look for during refactoring.

    Good luck, and a project like this can actually be fun if you approach it with the right perspective.

    Edited: footpad, 22 Aug 01, ~21:05 PDT

Re: Best way to fix a broken but functional program?
by clemburg (Curate) on Aug 23, 2001 at 12:42 UTC

    The first thing you need is a test environment. You *really need to build a test environment* to pull this thing off, since:

    • Testing in production is no option
    • The risk to introduce bugs while fixing is very high
    • You need to test to understand the program
    • Building a test environment will make you understand your inputs and outputs
    • You will be the one to blame after you touched this thing, so you better be able to explain its functionality exactly, and to demonstrate it can do what you advertise
    • Building a test environment will make you understand your limits in fixing this thing, so you can advertise them to your superiors, or even pull out of the operation

    After you have the test environment, you can start to refactor using one of the strategies outlined above. But I honestly warn you to touch this thing or assume any responsibility for it unless you are allowed the time to construct a test environment. If they won't allow you to build this, they are not looking for a solution, but for a scapegoat. Decide if you want to be this scapegoat. If people actively resist sensible approaches to problem solution, the best strategy is to get out of their reach and to document your reasons for doing so.

    Christian Lemburg
    Brainbench MVP for Perl

Re: Best way to fix a broken but functional program?
by idnopheq (Chaplain) on Aug 23, 2001 at 19:33 UTC
    UDATE! Grand news! My boss and I pushed back, and the VAXen and Windows admins *have* to set up a test environment before I start my work.

    I thank all of you for your advice, and will apply lots 'o it as I go. I'll let you know how it proceeds.

    Apply yourself to new problems without preparation, develop confidence in your ability to to meet situations as they arrise.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://107060]
Approved by root
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (9)
As of 2014-04-19 03:09 GMT
Find Nodes?
    Voting Booth?

    April first is:

    Results (475 votes), past polls