http://www.perlmonks.org?node_id=257211

I've been building a perl 'application' for several months now, which has grown to require about 15 separate modules (and may require several more before I'm done). I mainly develop on Linux/BSD systems, not Windows. A good majority of the users of this application will likely be Windows users, based on our download statistics from the main project website that this application will become a part of. So far, so good... except I have a few questions about the 'linting' and 'delivery' approach:
  1. What considerations should I take to make sure that code that runs clean (with strict, warnings, diagnostics, -T) on POSIX systems will continue to run clean on Windows systems?
  2. Is there any change to @INC or the shebang between these systems I should worry about?
  3. When delivering the application itself, should I also bundle the required modules that it uses? Or should I add a snippet to the README that describes how to use PAR or PPM/CPAN to fetch and install these modules? I don't want to get bogged down in having to teach everyone how to use PPM or CPAN because they got some error or another with this, so I'd prefer it to be as clean as possible. This will also affect the POSIX (mostly Linux) users when installing these modules, which have quite a few requirements of their own, like XML::LibXML and others.
  4. What about using the modules as "local packages" under the application directory itself? (this code is not a module in itself, yet..)
  5. What rules change as far as error trapping/reporting, that can be abstracted into something portable between systems? i.e. how would cmd.exe (NT/XP) vs. command.com (98/ME) vs. /bin/sh differ in their interpolation in this regard?
  6. What about operations like open(), sysread(), and system()? Do they need specific "precautions" as well when run on Windows?

One of the thoughts I had was to write a quick script (batch/cmd file on Windows, shell script on POSIX, or maybe just a pure perl script itself, which can detect the OS in place and act accordingly) which can then unpack the modules delivered with my application, push them into the right place, and so on, but that has its gotchas also.

I'm used to writing portable code across POSIX systems (Unix, BSD, Linux, Solaris, etc.) in C, but not with perl, and I know there are many more things to look out for.

Any insight from those who have done this? I'm specifically looking for options, alternatives, gotchas, and other means of accomplishing the task. Most of the modules are low-level XML, HTML, IO modules wrapped around some code that is doing a lot of text processing.

Re: Delivering "portable" code between POSIX and Windows
by Aristotle (Chancellor) on May 11, 2003 at 12:11 UTC
    1. That's less a matter of whether Perl can digest the code: all Perl constructs work everywhere, except for a number of functions geared towards Unix specifics. See perlport.
    2. Windows doesn't use the shebang line at all, although perl examines it for switches after launch. You may run into issues with "too late for -T".
    3. I would suggest you deliver two versions, at least for Windows users: an unbundled one and one built with PAR (a short sketch of the PAR approach follows below). The download page should suggest they get the PAR'd one unless they know why they'd want the other; I don't see a need to write module-fetching tutorials for the unbundled package then. Consider wrapping the Windows PAR version in an InstallShield-ish installer - I quite liked the NullSoft installer creator, myself. Due to XS modules, a PAR'd version for people on *nixoid systems would be difficult to offer; you'd need to account for a lot of systems.
    4. I'm not sure what that question means. The quoting mechanisms are very different, at any rate. I'm not sure the MS shells even offer sufficiently powerful escaping mechanisms to deal with arbitrary data. sh-type shells do - clumsy ones, but they don't impose restrictions.
    5. I don't think there's anything specific about those. You will have to look into binmode, though. (This is a no-op on *nixoid systems, so you don't need to maintain two different versions.) On another note, remember that MS systems don't pay attention to capitalization in filenames.
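    As promised above, a minimal sketch of what the PAR route can look like from the Perl side: a .par archive (essentially a zip of module files) gets pushed onto @INC so the modules inside become loadable. The archive name here is hypothetical; see the PAR documentation for details.

        use PAR 'myapp-deps.par';   # modules inside the archive become loadable
        use XML::LibXML;            # resolved from the archive if not installed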

    Makeshifts last the longest.

      2. This can be resolved by creating a second file association binding and putting the -T there. Thus *.plt are taint-sensitive perl scripts and *.pl are not. (Or whatever.)

      5. binmode isn't a no-op in the *nix world anymore, AFAIUI. It's necessary in order to open multibyte-character files in byte mode.
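      A minimal sketch of what that means on a PerlIO-based perl (5.8-ish); the file name is made up:

          open my $fh, '<', 'blob.dat' or die "open: $!";
          binmode $fh, ':raw';    # plain bytes, no translation or decoding
          # whereas text to be decoded as UTF-8 would use:
          # binmode $fh, ':encoding(UTF-8)';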


      ---
      demerphq

      <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...
        The latter was new to me. But then, it's something that was added for the ongoing Unicode support effort, which is (historically speaking) relatively new. Thanks for the pointer.

        Makeshifts last the longest.

•Re: Delivering "portable" code between POSIX and Windows
by merlyn (Sage) on May 11, 2003 at 11:57 UTC
Re: Delivering "portable" code between POSIX and Windows
by dws (Chancellor) on May 11, 2003 at 17:10 UTC
    1. What considerations should I take to make sure that code that runs clean (with strict, warnings, diagnostics, -T) on POSIX systems will continue to run clean on Windows systems?

    Avoid features that aren't portable to Windows, and test on both platforms. I've never noticed an appreciable difference with -T / strict / warnings.

    2. Is there any change to @INC or the shebang between these systems I should worry about?

    Unless you're using Apache on Win32, the path part of the shebang line is ignored. The paths in @INC are going to be different, but what you probably care about is whether the packages your application expects are there or not. Test.
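    A tiny sketch of that kind of startup test; the module list is just an example:

        my @need = qw(XML::LibXML HTML::Parser);
        for my $mod (@need) {
            (my $file = "$mod.pm") =~ s{::}{/}g;
            eval { require $file; 1 }
                or die "Missing required module $mod: $@";
        }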

    3. When delivering the application itself, should I also bundle the required modules that it uses?

    That depends on your user base. If doing a CPAN install or running PPM is going to be an obstacle for them, then bundle what you need and use lib.
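    If you do bundle, here is a minimal sketch of the "use lib" approach, assuming the modules ship in a lib/ directory next to the script (FindBin and lib are core modules):

        use FindBin qw($Bin);
        use lib "$Bin/lib";    # search the bundled tree before the system @INC
        use XML::LibXML;       # now resolvable from the bundled copy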

    4. What about using the modules as "local packages" under the application directory itself?

    See 3.

    5. how would cmd.exe (NT/XP) vs. command.com (98/ME) vs. /bin/sh differ in their interpolation in this regard?

    Shell meta-characters aren't completely portable across platforms. Beyond that, test.

    6. What about operations like open(), sysread(), and system()? Do they need specific "precautions" as well when run on Windows?

    If you want to write binary data portably, use binmode() to prevent newlines from getting translated, and use the "network portable" pack() formats. Use of system() depends largely on what you're trying to invoke, and whether it exists on the target platform.
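    A short sketch of both points together; the file name and record format are made up:

        # binmode() stops CRLF translation on Windows; "N" is the
        # big-endian ("network order") 32-bit pack format everywhere.
        open my $fh, '>', 'records.bin' or die "open: $!";
        binmode $fh;
        my $payload = "some data";
        print $fh pack('N', length $payload), $payload;
        close $fh or die "close: $!";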

    Any insight from those who have done this?

    If you run into messy platform differences is some area, consider writing the equivalent of a Java Interface class (or an abstract base class) that hides the differences behind the interface, then write platform-specific subclasses. Write factory methods that cough up the appropriate platform-specific subclass.

    This may sound like generic advice, but it worked well on an 80KLOC Perl middleware server with a web front end and a database back end.
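    A rough sketch of the factory idea, with all package and method names invented for illustration:

        package App::Platform;
        sub new {
            my $base  = shift;    # invocant, unused here
            my $class = $^O eq 'MSWin32'
                ? 'App::Platform::Win32'
                : 'App::Platform::Unix';
            return bless {}, $class;
        }

        package App::Platform::Unix;
        our @ISA = ('App::Platform');
        sub null_device { '/dev/null' }

        package App::Platform::Win32;
        our @ISA = ('App::Platform');
        sub null_device { 'NUL' }

        package main;
        my $platform = App::Platform->new;    # right subclass for this OS
        print $platform->null_device, "\n";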

      This may sound like generic advice, but it worked well on an 80KLOC Perl middleware server with a web front-end and a database back-end.

      And it is employed by File::Spec, as well as a few other places in the standard(ish) modules.


      ---
      demerphq

      <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...
Re: Delivering "portable" code between POSIX and Windows
by demerphq (Chancellor) on May 11, 2003 at 20:10 UTC

    Since a list seems to be the style....

    1. NT is a POSIX-compliant OS. If you are worried about -T, then you will need to create a new file association on top of the standard ones if you want people to avoid having to do "perl -T script.pl". One thing that seems to be a convention in the *nix world is scripts with no extension. Win32 doesn't like files without extensions; it doesn't know what to do with them, so make sure _all_ your scripts have extensions.
    2. Shebang has limited utility on Win32. I don't bother with it for anything but -l and -i, and even then rarely. Others have dealt with this point as well as it can be dealt with.
    3. A key issue here is going to be how much binary code you have, either directly or indirectly. The vast majority of Win32 Perl users don't have C compilers and aren't comfortable with using cygwin/gcc (although this is a viable, if slower, way to do things). This means their only source of binaries will typically be you and AS (ActiveState). So if you are using stuff that AS doesn't keep online on PPM, you will have problems. BTW, this usually means that they don't even have make, nmake, or dmake. I don't know about PAR, but if I were you I would make sure that all the modules I depend on are part of the AS library and work with the versions they provide (they aren't usually as up to date as the CPAN ones).
    4. Local modules are IMO a pain. Just set them up to install as normal. This isn't usually a problem on Win32 boxes; there won't be an admin saying "you can't install here". OTOH, this does assume that they have some way of installing. Hell, you could just copy the files into the correct location.
    5. This question is a little hard to understand. Shell-level interpolation on the three shells you mentioned is, AFAIK, completely different. But people like Barrie Slaymaker and the CPANPLUS boys are all over these issues, so why not review their work and take it from there? I think IPC::Run should do the trick nicely; see the sketch after this list. Incidentally, interpolation rules on cmd.exe are very weird. Don't bother trying to guess them without a lot of time; they are barely documented by anyone, and even the perl source doesn't always do the right thing. (Although I say that from the position of having deliberately sought the errors out.)
    6. AFAIK there aren't many traps in this regard. Obviously the issues I mentioned above about interpolation may come into play, but generally I think that if it works on Unix it will work fine on Win32.
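    The sketch promised in point 5 - IPC::Run sidesteps shell quoting differences entirely, because the argument list goes straight to the child process and neither cmd.exe nor /bin/sh ever parses it:

        use IPC::Run qw(run);

        my @cmd = ('perl', '-e', 'print "hello from the child\n"');
        my ($in, $out, $err) = ('', '', '');
        run \@cmd, \$in, \$out, \$err
            or die "command failed: $?";
        print "got: $out";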

      I suppose one potential trap is code that has been written without properly using things like File::Spec. If you're doing regex-based file/path games you might get bitten. Even with File::Spec, getting sloppy because part of the output is always "" on Unix may bite you on Win32. Stick with using File::Spec carefully (a short sketch follows below) and you should be OK. This will also make Mac and other exotic-OS owners like you.

      Ah! I just thought of one common Unix trick that is a trap on Win32: you can't delete a file that is open. Also, locking semantics are handled differently.
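      The File::Spec sketch mentioned above; the path pieces are made up, and the same code yields the right separators on Unix, Win32, and classic Mac:

          use File::Spec;

          my $path = File::Spec->catfile('data', 'xml', 'input.xml');
          my ($vol, $dirs, $file) = File::Spec->splitpath($path);
          print "file part: $file\n";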

    One of the thoughts I had was to write a quick script (batch/cmd file on Windows, shell script on POSIX, or maybe just a pure perl script itself, which can detect the OS in place and act accordingly) which can then unpack the modules delivered with my application, push them into the right place, and so on, but that has its gotchas also.

    Don't do this. There are other people who have already cracked this nut better than you will. If PPMs aren't good enough and CPAN isn't a viable solution, then use PAR or whatever.

    Most of the modules are low-level XML, HTML, IO modules wrapped around some code that is doing a lot of text processing.

    I would really make sure that they are available/buildable on Win32. A lot of XML stuff that I have tried to install has failed. If AS provides it via PPM then it's safe; otherwise I'd be hesitant....

    Good luck.


    ---
    demerphq

    <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...

      Overall I think your post is very good. It answers most of the OP's questions, and doles out good advice. I do have a nit to pick with your opening sentence, however. I have seen this claim made a number of times through the ages, and it always struck me as odd. It just didn't feel right. So I decided to do some research....

      Based on what I could find, the native POSIX subsystem in Windows NT/2000 is limited at best. It seems to be incomplete, and even bordering on non-functional. According to a paper on Microsoft's own MSDN, "the Windows NT/2000 POSIX subsystem ... only supports POSIX 1003.1," and "The 1003.1 system is of limited interest for fully featured applications, because it does not include many capabilities (such as those in 1003.2, network support, and so on)." You can read the entire article to get full context.

      I guess the claim that Windows NT/2000 has a POSIX layer is true, but it seems rather disingenuous to say "Windows NT is a POSIX compliant OS." How this relates to porting of Perl apps to Windows, I'm not sure. It just seems like something that should be pointed out for future reference.

        Ok, fair comment. I have to admit that I considered adding a comment along the lines of "maybe we have a different concept of what POSIX implies", but I couldn't get the wording right. And it would seem that on a closer reading I am misunderstanding things; I'll strike that comment right after I post. I would actually say, however, that you have probably found the ideal resource for hacker's concerns about porting his Unix stuff over. That link seems to detail the primary avenues for porting C code to Win32, along with a brief survey of the issues involved. Nice one. ++


        ---
        demerphq

        <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...
Re: Delivering "portable" code between POSIX and Windows
by mandog (Curate) on May 11, 2003 at 18:14 UTC

    This is a tiny thing, but if you expect to move code back and forth between UNIX and Windows, append a space at the end of the shebang line.

    "/usr/bin/perl -wT\r\n"

    ...is not valid on UNIX, because the \r sticks to the -wT switch cluster and perl rejects it, whereas

    "/usr/bin/perl -wT \r\n"

    ...works just fine on both platforms.

    Alternatively, you can explain to people how to turn on ASCII mode in their FTP client.
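    Or fix the endings in place on the Unix side; a one-liner like this (the file name is an example) strips the stray carriage returns:

        perl -i -pe 's/\r$//' script.pl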



    email: mandog