http://www.perlmonks.org?node_id=1203327

likbez has asked for the wisdom of the Perl Monks concerning the following question:

This is the kind of topic that previously was reserved for Cobol and PL/1 forums ;-) but now that Perl is almost 30 years old, it looks like the space for Perl archeology is gradually opening ;-).

I have a dozen fairly large scripts (several thousand lines each) written in a (very) early version of Perl 5 (below Perl 5.6). I now need to:

1. Convert them to use the strict pragma. The problem is that all of them share information (some heavily, some not) between the main program and the subroutines (and sometimes among subroutines too) via global variables in addition to (or sometimes instead of) parameters. Those scripts mostly do not use my declarations either.

So I need to sort the variables into local and global namespaces for each subroutine (around 40 per script; each pretty small -- less than a hundred lines) to declare them properly.

As an initial step I plan to just use global variables with namespace qualification, or our lists for each subroutine. Currently I plan to postprocess the output of

perl -MO=Xref old_perl_script.pl

and generate such statements. Is there a better way?

2. If possible I want to split the main namespace into at least two chunks, putting all subroutines into another namespace or module. I actually do not know how to export subroutine names into another namespace (for example main::) when only package statements are used, as in the example below. Modules do some magic via Exporter that I just use but do not fully understand. For example, if we have

#main_script
... ... ...
x::a(1,2,3);
... ... ...
package x;
sub a {...}
sub b {...}
sub c {...}
package y;
... ... ...

How can I access subs a,b,c without qualifying them with namespace x from the main:: namespace?

3. Generally this task looks like a case of refactoring. I wonder if any Perl IDE has some of the required capabilities, or whether there are tools that can be helpful.

My time to make the conversion is limited and using some off the shelf tools that speed up the process would be a great help.

Any advice will be greatly appreciated.

Replies are listed 'Best First'.
Re: Perl archeology: Need help in refactoring of old Perl code that does not use strict
by Corion (Patriarch) on Nov 14, 2017 at 08:45 UTC

    In addition to AnomalousMonk's advice of a test suite, I would suggest at the very least investing the time up front to run automatic regression tests between whatever development version of the program you have and the current "good" (but ugly) version. That way you can easily verify whether your change affected the output and operation of the program. Ideally, the output of your new program and the old program should remain identical while you are cleaning things up.

    Note that you can enable strict locally in blocks, so you don't need to make the main program compliant but can start out with subroutines or files and slowly convert them.
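
    A minimal sketch of this incremental approach, assuming a hypothetical legacy global $legacy_total and a hypothetical subroutine converted(); the file as a whole stays non-strict, while the converted subroutine declares everything it touches:

```perl
#!/usr/bin/perl
# No "use strict" at file level: the unconverted legacy code keeps working.

$legacy_total = 0;            # undeclared package global, as in the old code

sub converted {
    use strict;               # strict applies only inside this block
    use warnings;
    my ($n) = @_;             # parameters become lexicals
    our $legacy_total;        # the global is now declared explicitly
    $legacy_total += $n;
    return $legacy_total;
}

print converted(5), "\n";     # prints 5
print converted(2), "\n";     # prints 7
```

    Once every subroutine compiles this way, the our lines double as an inventory of the shared state that still has to be turned into parameters.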

    For your second question, have a look at Exporter. Basically it allows you to im/export subroutine names between packages:

    package x;
    use Exporter 'import';
    our @EXPORT_OK = ('a', 'b', 'c');

    #main_script
    use x 'a', 'b'; # makes a() and b() available in the main namespace
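
    The use statement above expects the package to live in its own .pm file. For the single-file layout in the question, one possible sketch is to define the package inside a BEGIN block and mark it as already loaded in %INC. The package name LegacySubs (standing in for x) and the sub bodies are placeholders, not part of the original scripts:

```perl
#!/usr/bin/perl
use strict;
use warnings;

BEGIN {
    package LegacySubs;                  # hypothetical name, stands in for "x"
    use Exporter 'import';
    our @EXPORT_OK = qw(a b c);

    sub a { return "a(@_)" }
    sub b { return "b(@_)" }
    sub c { return "c(@_)" }

    # Pretend the module file was already loaded, so that the
    # "use LegacySubs" below does not search @INC for LegacySubs.pm.
    $INC{'LegacySubs.pm'} = __FILE__;
}

use LegacySubs qw(a b);                  # imports a() and b() into main::

print a(1, 2, 3), "\n";                  # prints a(1 2 3)
print LegacySubs::c(4), "\n";            # c() was not imported, so qualify it
```

    Moving the package into a real LegacySubs.pm later only requires deleting the BEGIN wrapper and the %INC line.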

    To find and collect the global variables, maybe it helps you to dump the global namespace before and after your program has run. All these names are good candidates for being at least declared via our to make them visible, and then ideally removed to pass the parameters explicitly instead of implicitly:

    #!perl -w
    use strict;

    our $already_fixed = 1; # this won't show up

    # Put this right before the "uncleaned" part of the script starts
    my %initial_variables;
    BEGIN {
        %initial_variables = %main::; # make a copy at the start of the program
    }

    END {
        #use Data::Dumper;
        #warn Dumper \%initial_variables;
        #warn Dumper \%main::;

        # At the end, look what names came newly into being, and tell us about them:
        for my $key (sort keys %main::) {
            if( ! exists $initial_variables{ $key } ) {
                print "Undeclared global variable '$key' found\n";
                my $glob = $main::{ $key };
                if( defined *{ $glob }{GLOB}) {
                    print "used as filehandle *'$key', replace by a lexical filehandle\n";
                };
                if( defined *{ $glob }{CODE}) {
                    print "used as subroutine '$key'\n"; # so maybe a false alarm unless you dynamically load code?!
                };
                if( defined *{ $glob }{SCALAR}) {
                    print "used as scalar \$'$key', declare as 'our'\n";
                };
                if( defined *{ $glob }{ARRAY}) {
                    print "used as array \@'$key', declare as 'our'\n";
                };
                if( defined *{ $glob }{HASH}) {
                    print "used as hash \%'$key', declare as 'our'\n";
                };
            };
        };
    }

    no strict;
    $foo = 1;
    @bar = (qw(baz bat man));
    open LOG, '<', *STDIN;
    sub foo_2 {}
    use strict;

    The above code is a rough cut and for some reason it claims all global names as scalars in addition to their real use, but it should give you a start at generating a list of undeclared names.
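
    A likely explanation for the scalar reports: *glob{SCALAR} returns a reference even when the scalar was never used, so testing it with defined always succeeds. Likewise, *glob{GLOB} is a reference to the glob itself, while filehandles live in the IO slot. The sketch below uses a hypothetical helper, used_slots(), that tests the scalar's value instead; as a consequence it misses scalars that are declared but still undef.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: list the slots of a main:: symbol that are really in use.
sub used_slots {
    my ($name) = @_;
    no strict 'refs';                       # look the glob up by name
    my $glob = \*{"main::$name"};
    my @slots;
    # The SCALAR slot always yields a reference, so test the value instead.
    push @slots, 'scalar'     if defined ${ *{$glob}{SCALAR} };
    push @slots, 'array'      if defined *{$glob}{ARRAY};
    push @slots, 'hash'       if defined *{$glob}{HASH};
    push @slots, 'code'       if defined *{$glob}{CODE};
    push @slots, 'filehandle' if defined *{$glob}{IO};
    return @slots;
}

our $foo = 1;
our @bar = qw(baz bat man);
sub foo_2 {}

print "$_: ", join(', ', used_slots($_)), "\n" for qw(foo bar foo_2);
```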

    Also see Of Symbol Tables and Globs.

      Thank you. That's very good advice. Both the idea to use the "strict" pragma selectively and the idea to use Data::Dumper are simply great!!!
Re: Perl archeology: Need help in refactoring of old Perl code that does not use strict
by AnomalousMonk (Archbishop) on Nov 14, 2017 at 07:20 UTC

    I'd like to suggest that you also need a

    Step 0: Write a test suite that the current code passes for all normal modes of operation and for all failure modes.
    With this test suite, you can be reasonably certain that refactored code isn't just going to be spreading the devastation.

    Given that you seem to be describing a spaghetti-coded application with communication from function to function via all kinds of secret tunnels and spooky-action-at-a-distance global variables, I'd say you have a job on your hands just with Step 0. But you've already taken a test suite into consideration... Right?


    Give a man a fish:  <%-{-{-{-<

      This is what I would do after 'Step 0':

      • identify a function using a global variable.
      • verify the global variable does not change during execution of this function, e.g. because some other function called by this function modifies it. (insert some code to do this for you)
      • convert global variable into an argument and update all callers.

      If the variable does change during the run then pick a different function first. When you got the global state disentangled a bit it's a lot easier to reason about what this code is doing. Everything that's still using a global needs to be treated with very careful attention.
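
      The "insert some code to do this for you" step could be sketched as a wrapper that compares the global before and after each call. All names here ($separator, join_fields, watch_global, join_fields_explicit) are hypothetical, and the wrapper is a simplification: it always calls the original in list context and only compares scalar values.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our $separator = ', ';                              # global read by the legacy sub

sub join_fields { return join $separator, @_ }      # legacy style: implicit input

# Replace a named sub in main:: with a wrapper that warns when the
# watched scalar has a different value after the call than before it.
sub watch_global {
    my ($sub_name, $var_ref) = @_;
    no strict 'refs';
    no warnings 'redefine';
    my $orig = \&{"main::$sub_name"};
    *{"main::$sub_name"} = sub {
        my $before = $$var_ref;
        my @result = $orig->(@_);
        warn "$sub_name: watched global changed during the call\n"
            if ($before // '') ne ($$var_ref // '');
        return wantarray ? @result : $result[0];
    };
    return;
}

watch_global('join_fields', \$separator);
print join_fields(qw(a b c)), "\n";                 # prints a, b, c

# After a silent test run, the global can become an explicit argument:
sub join_fields_explicit {
    my ($sep, @fields) = @_;
    return join $sep, @fields;
}
print join_fields_explicit($separator, qw(a b c)), "\n";
```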

      Thank you. In this case we already have a set of test cases whose output can be compared, and supposedly they cover all or most of the branches.

      I am reading "Perl Medic: Transforming Legacy Code" hoping to understand this problem better, and it does contain this recommendation.

Re: Perl archeology: Need help in refactoring of old Perl code that does not use strict
by 1nickt (Canon) on Nov 14, 2017 at 19:53 UTC

    Hi,

    Since no one has addressed this part...

    How can I access subs a,b,c without qualifying them with namespace x from the main:: namespace?

    See Exporter or Exporter::Tiny. Also see perlmod.

    "main" script:

    use strict;
    use warnings;
    use Foo qw/ bar /;
    print bar('baz');
    __END__

    file 'Foo.pm':
    package Foo;
    use strict;
    use warnings;
    use parent qw/ Exporter /;
    our @EXPORT_OK = qw/ bar /;
    sub bar { return uc shift }
    1;

    Hope this helps!


    The way forward always starts with a minimal test.
      Thank you very much. Never heard about existence of Exporter::Tiny. That might help me.
Re: Perl archeology: Need help in refactoring of old Perl code that does not use strict
by AnomalousMonk (Archbishop) on Nov 14, 2017 at 18:11 UTC

    Further to Corion's wise general advice, I would add to

    ... find and collect the global variables ... and then [remove them and] pass the parameters explicitly instead of implicitly ...
    that if you can also identify a global variable as, in fact, not varying during program execution, you can convert this variable into a true constant and thus remove a potential headache: a global constant is, in general, a Good Thing (or at least no Bad Thing).
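
    A small sketch of that conversion, assuming a hypothetical global $field_sep that is assigned once and never changed; the core constant pragma turns it into a compile-time constant:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Before (legacy): $field_sep = ':';  set once at startup, read everywhere.
use constant FIELD_SEP => ':';          # hypothetical name

sub format_record {
    return join(FIELD_SEP, @_);         # no hidden, mutable input any more
}

print format_record(qw(root x 0)), "\n";   # prints root:x:0
```

    Note that such constants are subroutines under the hood, so they do not interpolate inside double-quoted strings.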


Re: Perl archeology: Need help in refactoring of old Perl code that does not use strict
by karlgoethebier (Abbot) on Nov 14, 2017 at 19:15 UTC
    "...any Perl IDE has some of required capabilities..."

    Even listing and visualizing vars and so on might be a big relief i guess.

    You might take a look at EPIC. But no warranty. I have used Eclipse for refactoring jobs with PHP, ActionScript and Java with varying results: From pretty cool to totally inferior. Unfortunately i don't have any serious experience with the mentioned plugin. You have been warned. Good luck.

    Best regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

    perl -MCrypt::CBC -E 'say Crypt::CBC->new(-key=>'kgb',-cipher=>"Blowfish")->decrypt_hex($ENV{KARL});'Help

Re: Perl archeology: Need help in refactoring of old Perl code that does not use strict (hurry up and wait)
by Anonymous Monk on Nov 14, 2017 at 08:26 UTC

    1) ... strict pragma ...My time to make the conversion is limited and using some off the shelf tools that speed up the process would be a great help.

    Hurry up and leave it alone :)

     use strict; itself confers no benefits; The benefits come from avoidance of the bad practices forbidden by  strict :)

    That pretty much means convert one at a time by hand after you have learned the understanding of importance of knowing :) Speed kills

    2. If possible ... I do not understand ...

    That is a hint you shouldn't be refactoring anything programmatically.

    There are a million nodes on perlmonks, and a Reader's Digest version might be Modern Perl, a loose description of how experienced and effective Perl 5 programmers work... You can learn this too.

    Hurry up and bone up

    3. Generally this task looks like a case of refactoring. I wonder if any Perl IDE has some of the required capabilities, or whether there are tools that can be helpful.

    I hope you have foot insurance :) happy hunting :) perlcritic, PPI/PPIx::XPath , PPIx::EditorTools,
    App::EditorTools - Command line tool for Perl code refactoring
    Code::CutNPaste - Find Duplicate Perl Code

     

    So enjoy, test first, step0++

      use strict; itself confers no benefits; The benefits come from avoidance of the bad practices forbidden by strict :)
      That's very true. But if we are talking about the modernization of legacy code this advice sounds like "it is better to be rich and healthy, than poor and sick" ;-)

      The code is valuable and will probably live another 20 years, so leaving it alone is not an optimal solution. And modernization always has resource constraints, so it is important not to "overachieve". I chose a very modest goal -- implementing the "strict" pragma, because "use strict" and "use warnings" are two pragmas which do improve maintainability of Perl scripts. Other new stuff mostly doesn't.

      Not to open religious wars, but as for your recommendation to read "Modern Perl", I respectfully reject it because I suspect that chromatic is a "complexity junkie" at heart :-).

      So this is an implicit attempt to push me into "overachiever mode". By "overachiever mode" I mean conversion of the code using all those fancy idioms available in Perl 5.22 and above and advocated by chromatic, especially the unhealthy fascination with OO (inspired by the desire to compete with Python), which I consider counterproductive. When I see a bless statement in simple scripts I suspect fraud :-). Also, during modernization of legacy code it is important to respect the original author's way of thinking and coding.

      BTW when they introduced escaping of opening curly brackets in regexes in 5.22 (which was a blunder) I thought that now all bets are off and I am staying with the teen versions of Perl forever ;-). Later I changed my mind and now use 5.26 in some cases, but the problem remains: an inability to reduce the complexity of the language, only to add to it, sometimes screwing up previously healthy parts of the language in the process.

        So this is an implicit attempt to push me into "overachiever mode". By "overachiever mode" I mean conversion of the code using all those fancy idioms available in Perl 5.22 and above and advocated by chromatic, especially the unhealthy fascination with OO (inspired by the desire to compete with Python), which I consider counterproductive. When I see a bless statement in simple scripts I suspect fraud.

        I don't get anything out of it if you read it or don't, but it's a shame that you might give other people the impression that the book tries to do something it was never intended to do. For example, you won't see anything in the book about using:

        • Smartmatch (except "don't use this")
        • Postfix-dereferencing (because it wasn't explicitly marked as stable for the version supported in 4e)
        • Subroutine signatures (again stability)

        The book has always been freely available online, in all of its versions. I'm disappointed that you'd write this without having at least skimmed the book for yourself to see if it's true. (It's not.)

        unhealthy fascination with OO (inspired by the desire to compete with Python), which I consider counterproductive
        Python? Really? Do you mean Ruby? Quoting Matz from An Interview with the Creator of Ruby: "Then I came across Python. It was an interpretive, object-oriented language. But I didn't feel like it was a "scripting" language. In addition, it was a hybrid language of procedural programming and object-oriented programming. I wanted a scripting language that was more powerful than Perl, and more object-oriented than Python."

        When I see a bless statement in simple scripts I suspect fraud
        That's an extreme position to take. Though an obsession with OO is unhealthy, your apparent anti-OO obsession is just as unhealthy IMHO. Don't want to use Moose? Fine. But don't blindly reject sound principles of design - which include using OO when appropriate. As for whether and when to use OO, there is no substitute for judgement and taste. A simple rule of thumb is to ask "do I need more than one?": if the answer is yes, an object is indicated; if the answer is no, then a module.

        High-level Design Checklist (derived from On Coding Standards and Code Reviews)

        • Coupling and Cohesion. Systems should be designed as a set of cohesive modules as loosely coupled as is reasonably feasible.
        • Testability. Systems should be designed so that components can be easily tested in isolation.
        • Data hiding. Minimize the exposure of implementation details. Minimize global data.
        • Interfaces matter. Once an interface becomes widely used, changing it becomes practically impossible (just about anything else can be fixed in a later release).
        • Design the module's interface first.
        • Design interfaces that are: consistent; easy to use correctly; hard to use incorrectly; easy to read, maintain and extend; clearly documented; appropriate to your audience. Be sufficient, not complete; it is easier to add a new feature than to remove a mis-feature.
        • Use descriptive, explanatory, consistent and regular names.
        • Correctness, simplicity and clarity come first. Avoid unnecessary cleverness. If you must rely on cleverness, encapsulate and comment it.
        • DRY (Don't repeat yourself).
        • Establish a rational error handling policy and follow it strictly.

        Hi

        I chose a very modest goal -- implementing the "strict" pragma, because "use strict" and "use warnings" are two pragmas which do improve maintainability of Perl scripts. Other new stuff mostly doesn't.

        You can be strict/warnings compliant and not benefit

        Being in a hurry to automate strict/warnings compliance hints that you just might be missing the point of strict/warnings

        The slideshow I linked is very good. It's called Program Repair Shop, and it's about refactoring/strict/warnings

        So this is an implicit attempt to push me into "overachiever mode".

        No, it's an explicit invitation to answer question #2 yourself,

        Chapter 9. Managing Real Programs, Modules, Organizing Code With Modules

        I'm boning up right now

        It's hard being an archeologist