
Translating Legacy Code to Perl

by Anonymous Monk
on Aug 07, 2004 at 15:09 UTC ( #380918=perlquestion )
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,

I'm about to begin a large project: porting half a million lines of old 4D business logic to Perl (4D's language is similar to Pascal, but with more database features). I've written a code converter using regular expressions that does a pretty decent job, but I'm having trouble making it much better.

It seems that the best way to do such a project would be to parse the legacy code into a tree structure and then build the Perl code up from that tree. Has this been your experience? And more importantly, which tools (CPAN or otherwise) would you recommend to make this process easier? I've seen Parse::RecDescent; is this the best tool for this project?

My goal isn't to generate perfect code, only code that requires the least debugging. Also, if you have any general suggestions about writing a code translator, let me know that too.

Thanks so much,


Replies are listed 'Best First'.
Re: Translating Legacy Code to Perl
by dfaure (Chaplain) on Aug 07, 2004 at 15:30 UTC

    If I had such a task to do, I would use Parse::RecDescent or Parse::Yapp and a small grammar to analyze and translate the source code on the fly. Pascal is known to have a small, regular grammar; you might even look at Free Pascal, which may provide a grammar that is close to suitable.

    You may refer to this code example where I use Parse::RD to build SQL queries from LDAP filters.

    HTH, Dominique
    My two favorites:
    If the only tool you have is a hammer, you will see every problem as a nail. --Abraham Maslow
    Bien faire, et le faire savoir...

Re: Translating Legacy Code to Perl
by kvale (Monsignor) on Aug 07, 2004 at 17:24 UTC
    You are essentially building a compiler from 4D to Perl. In addition to the modules mentioned above, it is also possible to build your own top-down recursive descent parser. Each sub would represent a nonterminal, and terminals generally correspond to regexes. Translation can be done either as you parse or by walking the parse tree afterwards.
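A hand-built recursive descent translator of the kind described above can be sketched as follows. This is only an illustration, written in Python so it runs standalone; the same shape carries over to Perl subs. The grammar is a toy Pascal-like subset invented for the example (assignment with `:=`, `If (...) ... End if`, flat arithmetic), not real 4D syntax, and every name in it (`TOKEN_RE`, `Translator`, `translate`) is hypothetical. It copies operators through unchanged, which is only correct if both languages agree on precedence; that is an assumption to check against the real source language.

```python
import re

# Terminals are regexes. The token set is a toy assumption, not real 4D.
TOKEN_RE = re.compile(r"""
    \s*(?:
        (?P<NUMBER>\d+)
      | (?P<KEYWORD>End\ if\b|If\b)
      | (?P<IDENT>[A-Za-z_]\w*)
      | (?P<OP>:=|[-+*/()=<>])
    )""", re.VERBOSE)

BINARY_OPS = {"+", "-", "*", "/", "<", ">", "="}


def tokenize(source):
    """Split source text into (kind, text) pairs."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if not match:
            raise SyntaxError(f"unexpected text at offset {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


class Translator:
    """One method per nonterminal; each returns emitted Perl text."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return (None, None)

    def expect(self, kind, text=None):
        tok_kind, tok_text = self.peek()
        if tok_kind != kind or (text is not None and tok_text != text):
            raise SyntaxError(f"expected {text or kind}, got {tok_text!r}")
        self.pos += 1
        return tok_text

    # program := statement*
    def program(self, indent=0):
        lines = []
        while self.peek()[0] is not None and self.peek() != ("KEYWORD", "End if"):
            lines.extend(self.statement(indent))
        return lines

    # statement := if_stmt | assignment
    def statement(self, indent):
        if self.peek() == ("KEYWORD", "If"):
            return self.if_stmt(indent)
        return self.assignment(indent)

    # assignment := IDENT ':=' expr
    def assignment(self, indent):
        name = self.expect("IDENT")
        self.expect("OP", ":=")
        pad = "    " * indent
        return [f"{pad}${name} = {self.expr()};"]

    # if_stmt := 'If' '(' expr ')' statement* 'End if'
    def if_stmt(self, indent):
        self.expect("KEYWORD", "If")
        self.expect("OP", "(")
        cond = self.expr()
        self.expect("OP", ")")
        body = self.program(indent + 1)
        self.expect("KEYWORD", "End if")
        pad = "    " * indent
        return [f"{pad}if ({cond}) {{", *body, f"{pad}}}"]

    # expr := term (binary_op term)*   (flat: operators are copied through)
    def expr(self):
        parts = [self.term()]
        while self.peek()[0] == "OP" and self.peek()[1] in BINARY_OPS:
            op = self.expect("OP")
            parts.append("==" if op == "=" else op)
            parts.append(self.term())
        return " ".join(parts)

    # term := NUMBER | IDENT | '(' expr ')'
    def term(self):
        kind, _ = self.peek()
        if kind == "NUMBER":
            return self.expect("NUMBER")
        if kind == "IDENT":
            return "$" + self.expect("IDENT")
        self.expect("OP", "(")
        inner = self.expr()
        self.expect("OP", ")")
        return f"({inner})"


def translate(source):
    parser = Translator(tokenize(source))
    lines = parser.program()
    if parser.peek()[0] is not None:
        raise SyntaxError(f"unexpected {parser.peek()[1]!r}")
    return "\n".join(lines)
```

Calling `translate("total := 0")` returns the Perl statement `$total = 0;`. This version translates as it parses; building a tree first and walking it afterwards would mean having each method return a node instead of text.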

    I have hand-built parsers a number of times with good success. It allows one to really understand the nature of the language and customize as needed. Hand-built parsers can also be faster than P::RD, but this may not be a big consideration except in development of the compiler itself.

    Whether you use the hand-built or module approach, I would recommend reading up on compilers to avoid pitfalls (such as ambiguous grammars) that would show up in either case. The best book I know of is "Compilers: Principles, Techniques, and Tools" by Aho, Sethi, and Ullman.


Re: Translating Legacy Code to Perl
by dragonchild (Archbishop) on Aug 08, 2004 at 02:44 UTC
    This topic comes up every 3-6 months. It's a different language each time, but I will give the same answer.

    You will almost never like the results of doing a direct translation from language X to language Y, especially if they're as different from each other as a Pascal-type language vs. a scripting language. A small and incomplete list of the reasons why would include:

    • Line counts for the same program can vary by up to 50x between languages. My personal estimate for Pascal -> Perl would be about 20x, as in Perl is 20x more concise than the equivalent Pascal code.
    • Many languages do not have the same concepts. (Q.v. Beating the Averages) This means that you will end up with Perl code that is highly inefficient. If you're converting because you convinced your boss there would be a performance benefit, s/he will be sorely disappointed.
    • Verification of the translation is going to be a bitch. Especially on 500K lines of Pascal-type code.
    • Database access in Perl is generally done through DBI. I'm betting 4D doesn't have a similar feature.

    Basically, I'm saying that you will want to rewrite the application from the ground up, with all the risks that entails. It will be a smaller risk and a greater reward than translating 500K lines of Pascal-type code to Perl. Plus, this can be seen as a good opportunity to update the business rules, as needed. It will probably take between 2 and 5 man-years. Good luck! (You'll need it ...)

    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

    I shouldn't have to say this, but any code, unless otherwise stated, is untested

Re: Translating Legacy Code to Perl
by tilly (Archbishop) on Aug 08, 2004 at 04:11 UTC
    Why are you doing this port?

    If it is because you are having trouble maintaining the old code, then I'll wish you luck. If it is because the old code is forcing you into some platform dependency that you don't like, then it may be easier to build a new interpreter for the language than to try to port existing code. This might even be an appropriate use for Parrot.

Re: Translating Legacy Code to Perl
by sleepingsquirrel (Hermit) on Aug 08, 2004 at 04:50 UTC
    Here are some additional resources about available virtual machine architectures: Parrot, LLVM, Mono/CLR, and JVM.

    -- All code is 100% tested and functional unless otherwise noted.
Re: Translating Legacy Code to Perl
by CountZero (Bishop) on Aug 08, 2004 at 18:30 UTC
    I assume that after careful consideration you have chosen Perl as a good substitute for 4D; otherwise, of course, there is no reason to switch to Perl.

    The second thing you should think about is whether you already have an exhaustive test suite for your existing 4D application.

    Indeed, even if it is possible to do a machine translation (and other monks have already pointed out that it is by no means certain, efficient, or desirable), you will still need to verify that the translated application runs correctly.

    Without a good test suite there is no way that you can even be remotely certain that the Perl programs faithfully emulate your 4D code.
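One way to act on this, if no test suite exists yet, is to record inputs and outputs from the running legacy application and replay them against the translated code. The following is a minimal sketch in Python; the function name, the recorded cases, and the stand-in routine are all hypothetical, and a real harness would call the generated Perl instead of a Python function.

```python
def compare_against_legacy(recorded_cases, translated_routine):
    """Return the cases where the translated routine disagrees with the legacy output."""
    mismatches = []
    for inputs, legacy_output in recorded_cases:
        try:
            actual = translated_routine(*inputs)
        except Exception as exc:
            # A crash counts as a mismatch, not as a harness failure.
            actual = f"raised {type(exc).__name__}: {exc}"
        if actual != legacy_output:
            mismatches.append((inputs, legacy_output, actual))
    return mismatches


# Hypothetical recordings: ((amount_cents, tax_percent), output of the legacy system).
recorded_cases = [((1000, 20), 1200), ((500, 0), 500)]


def price_with_tax(amount_cents, tax_percent):
    # Stand-in for a translated routine.
    return amount_cents + amount_cents * tax_percent // 100


mismatches = compare_against_legacy(recorded_cases, price_with_tax)
```

An empty `mismatches` list only shows agreement on the recorded cases, so the value of the check depends on how well those recordings cover the business logic.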

    A good book to read is Perl Medic: it is all about transforming legacy code.

    Update: Of course it is about Perl legacy code but the concepts touched upon in the book are valid outside the Perl realm as well.


    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Node Type: perlquestion [id://380918]
Approved by Zaxo
Front-paged by dfaure