PerlMonks  

Pure Perl Modules, XS Modules, what's the current trends?

by skazat (Hermit)
on Dec 20, 2007 at 19:33 UTC (#658221=perlquestion)
skazat has asked for the wisdom of the Perl Monks concerning the following question:

If you look at my writeup history, you'll see that many of the questions I pose are about getting complex web apps working without the help of CPAN. I love CPAN myself, but a lot of regular users don't know how to use it, or a CLI, etc. My webapps started with manual configuration in mind (think webapps written in PHP), and I'm slowly working up to having something that can be distributed on CPAN without alienating my users, who are used to the way it's been for years. I am very much looking forward to using something like Module::Install to get the dependencies I need for the webapp, but I still need a fallback plan for other users (right now, probably the majority of them).

One of my main issues is trying to package up the needed CPAN modules my webapps require into a perllib that can just be included with the webapp.

The other issue I have is figuring out what to do with a CPAN module that has a dependency chain that includes an XS module. This is a type of module that I can't easily include, since it has to be compiled, etc.

One of the things I'm noticing is that some newer XS Perl modules are shipped without a Pure Perl implementation to fall back on, but there's another Pure Perl module by someone else that attempts to closely mimic the behavior of the first one.

For example, an older module that has a fallback behavior would be something like MIME::Base64 (or, geez, used to)

This new trend, I'm noticing in things like Text::CSV_XS and Text::CSV_PP, as well as Crypt::Rijndael and Crypt::Rijndael_PP. Now I see there's also a MIME::Base64::Perl

At the moment, I basically keep good track of what CPAN modules I use and keep them in an app-specific perllib. I also say in the dev notes that this is so, and that it'd be a good idea to install these modules yourself (a Bundle is used, at the moment).

I say things like this:

WebAppX uses CSV files, which are handled by Text::CSV_PP by default, but you'd be better off using Text::CSV_XS. Please install it yourself and WebAppX will know what to do. Huzzah!

I'm wondering what the best way is to have this in code. Right now, I do something like this:

my $csv;
eval { require Text::CSV_XS };
if (!$@) {
    $csv = Text::CSV_XS->new;
}
else {
    require Text::CSV_PP;
    $csv = Text::CSV_PP->new;
}

It seems somewhat scrappy and wasteful.

I know of the Best module, but the low version number and long-time-since-hacked-upon date make me wonder how well it'd work.

I'm also starting to work with CPAN authors to help get their modules that have dependency chains including XS modules to offer a Pure Perl implementation (hey, if it's already there, why not?)

Is there a better method that you guys use? I hope people can understand the importance of having a fall back, pure perl implementation of some of these modules. I do understand that in many instances, these modules are much much slower, but it is better than nothin'.

-justin simoni
skazat me

Re: Pure Perl Modules, XS Modules, what's the current trends?
by webfiend (Vicar) on Dec 20, 2007 at 20:32 UTC

    Put Text::CSV_PP and Text::CSV in your app include dir for this specific example and use Text::CSV - since that module already tries to do the best thing:

    1. Try to import Text::CSV_XS
    2. Try to import Text::CSV_PP if the first step failed.

    You may want to take a look at modules like Text::CSV to see their approach to conditionally including modules.

    Update: Added the steps I was thinking, since it's unlikely that you can read my mind from there.
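
The two steps above can be sketched generically. This is a minimal sketch and not how Text::CSV itself is written: `load_first` is a hypothetical helper name, and the demo deliberately uses a module name that does not exist followed by the core module List::Util so that it runs on any Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: try each module name in order and return the
# first one that loads, or undef if none of them can be loaded.
sub load_first {
    my @candidates = @_;
    for my $module (@candidates) {
        # Turn Foo::Bar into Foo/Bar.pm so require accepts a runtime name.
        (my $file = "$module.pm") =~ s{::}{/}g;
        # The trailing "1" makes success explicit instead of testing $@.
        return $module if eval { require $file; 1 };
    }
    return;
}

# In the thread's case the list would be ('Text::CSV_XS', 'Text::CSV_PP').
my $backend = load_first('No::Such::Module_XS', 'List::Util');
defined $backend or die "no backend available\n";
print "Using $backend\n";
```

Since both CSV modules in the thread are constructed with `->new`, the returned class name could be used directly, along the lines of `my $csv = load_first('Text::CSV_XS', 'Text::CSV_PP')->new;`, after checking that something was actually loaded.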

      Wow!

      Thanks for showing me that - the last time I did my research on Text::CSV, it was before this recent 1.00 release, and the version available was very, very old and had fragmented alternatives (basically, Text::CSV_XS and Text::CSV_PP) - it's nice that the anarchy that is CPAN and its users figured it all out :)

      Well, that solves at least one problem on my checklist (and gives me something to go on for the rest)

      ++

       

      -justin simoni
      skazat me

Re: Pure Perl Modules, XS Modules, what's the current trends? (allowing easy choices)
by tye (Cardinal) on Dec 20, 2007 at 20:36 UTC

    The best route is to have a pure-Perl module that provides the interface and have an XS module that requires the pure-Perl interface and is used via the pure-Perl interface (yes, this can be done efficiently and easily). That way, if you want/need the XS implementation, then you just install that one XS module (and your install tool likely installs the interface module automatically because of the declared dependency). If somebody else has difficulty installing an XS module, then they just install the pure-Perl one and they're set as well. You don't have to try to write code to get all of the varied install tools to try to ask the user whether they want the XS piece or if they want to continue installing even though the XS part didn't build or other such complex magic.

    I've seen several modules go that route though I don't recall the names of any of them at the moment.

    If the XS module is wrapping some external libraries, then this whole question is probably moot. Otherwise, there are usually only a few (if any) features that can't be provided without XS (most of the time, the vast majority of the features can be done in pure-Perl and the XS is just there to increase performance and the number of bugs) or just haven't been coded yet in Perl, in which case the pure-Perl replacement likely just dies/croaks (at least for now).

    But you were more asking about what you should do when things didn't go this "best" route. I'd write a wrapper module that transparently uses either the XS module or the pure-Perl module and make it require the pure-Perl module. And I'd encourage the author of the XS module to be involved and to encourage users to switch to using this interface abstraction module (which, again, usually isn't terribly difficult to write while still being very efficient, though exactly how to write it will depend on the interface being implemented) to the point of getting the XS module to require this abstraction module.

    If you can also get the author of the pure-Perl implementation involved such that the interface abstraction module is actually just made a part of the pure-Perl implementation module, then you've got the original best route that I described (but the only real advantage of this last step is reducing the number of "moving parts").

    Note that it is also nice to have a hook whereby the user of the abstraction module can request that the pure-Perl implementation be used even if the XS implementation is installed. This is polite but can also be important as almost always the XS implementation will have more bugs or just not work on some less-vanilla data, etc. and will certainly be harder to tweak to work around bugs, awkward design choices, etc.
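
A minimal sketch of such an interface abstraction module with a pure-Perl override hook. Everything named here is hypothetical: the package `My::Sum`, the environment variable `MY_SUM_PUREPERL`, and the use of core List::Util as a stand-in for "the XS implementation". It only illustrates the shape of the idea, not any particular CPAN module.

```perl
use strict;
use warnings;

package My::Sum;    # hypothetical interface module

our $BACKEND;

# Pure-Perl implementation, always available.
sub _pp_sum {
    my $total = 0;
    $total += $_ for @_;
    return $total;
}

# Pick an implementation. The (assumed) environment variable
# MY_SUM_PUREPERL lets a user force the pure-Perl code even when
# the faster module is installed.
sub _choose_backend {
    if (!$ENV{MY_SUM_PUREPERL}) {
        # Stand-in for "the XS module": use List::Util::sum0 if it loads
        # and provides that function; otherwise fall through.
        my $fast = eval { require List::Util; List::Util->can('sum0') };
        if ($fast) {
            $BACKEND = 'XS';
            return $fast;
        }
    }
    $BACKEND = 'PP';
    return \&_pp_sum;
}

# Install the chosen implementation under the public name, once, at
# load time. Callers pay no per-call dispatch cost afterwards.
*sum = _choose_backend();

package main;

print "backend: $My::Sum::BACKEND\n";
print "sum: ", My::Sum::sum(1, 2, 3), "\n";
```

Assigning the code reference to the glob is what keeps the wrapper cheap: after load time, `My::Sum::sum` is the chosen implementation itself rather than a function that decides on every call.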

    Update: Here is a block diagram of the proposal for a quick grasp of the concept without trudging through the long description:

    - tye        

      I'd write a wrapper module that transparently uses either the XS module or the pure-Perl module and make it require the pure-Perl module.

      Upvoted and seconded. Unification behind an interface that can support either version solves a lot of problems.

        I just realized that that's what they did with Text::CSV a little less than a month ago - a little bit after I did my research on Text::CSV. The world is moving faster than I am.

         

        -justin simoni
        skazat me

      I agree with this approach, but what I'm seeing (albeit I haven't looked at every single module that mixes in this way) is that modules /used/ to be written like this but aren't anymore, opting instead for a pure XS route, and then someone comes along and writes a Pure Perl version. Which is causing me all kinds of headaches.

      For example, in MIME::Base64 2.23 (just as an example), you'll see basically what your diagram is sayin':

      eval { bootstrap MIME::Base64 $VERSION; };
      if ($@) {
          # can't bootstrap XS implementation, use perl implementation
          *encode_base64 = \&old_encode_base64;
          *decode_base64 = \&old_decode_base64;

          $OLD_CODE = $@;
          #warn $@ if $^W;
      }

      # Historically this module has been implemented as pure perl code.
      # The XS implementation runs about 20 times faster, but the Perl
      # code might be more portable, so it is still here.

      (followed by the Pure Perl version)

      In the newest version (as of my writing: 3.07) that's gone, but out has sprouted MIME::Base64::Perl - which is probably the same implementation that *was* in MIME::Base64 to start off with.

      Seems like a step backwards.

       

      -justin simoni
      skazat me

        No, that first block doesn't look like what I described. That first block looks like the XS version is bundled with the interface module and they try to detect if the XS part of the one module just isn't installed. Trying to install a module distribution where you can't build the XS part is ugly. So it is nice of them to try to detect that situation but it doesn't look to me like they've really done anything to make life easy for people who have a need for that situation.

        So, splitting out the pure-Perl version into a separate package doesn't bother me one way or the other in the abstract. Splitting it out into a separately distributed module might be a step backward, I agree, though perhaps only a tiny step. I prefer to have the interface distribution be the "visible" one with the pure-Perl implementation less visible but required by (or included in) the interface distribution, since the alternate situation (people having to specifically request that the pure-Perl implementation be installed) tends to lead to less transparent fall-back.

        If the XS code were similarly split out and the XS code distribution declared that it depends on the base interface distribution, then that would be a big step forward.

        In this particular case, I suspect that part of the motivation for these changes is that this particular module looks like it was added to "core" (based on comments in the "Changes" file) and so the XS component should be built when Perl is built and so most people shouldn't have to worry about trying to build the XS part. Of course, this idea that "/it/ being in core means that everybody who gets Perl also gets /it/" is not absolute.

        For example, my primary Linux box has a Perl that is missing many core modules and that box doesn't have a C compiler installed (there at least used to be a C compiler available but it isn't easy to find nor install and I'd probably have to purchase extra hardware to do that anyway). So I don't buy the "who cares if they can't build an XS module, nearly everybody can if they just take the time" argument. Time is valuable so not having to take the time is a valuable option to provide to people. I've seen plenty of cases where time isn't the only obstacle (in my case, hardware restrictions, in many cases company policy restrictions or factors of the scale involved due to a large number of target platforms, etc).

        So it'd be nice to help these modules be better citizens toward those unfortunate enough to not have the ideal Perl build environment to work with. But my experience is that even when accommodating such cases is neither complex nor difficult, many will still object to having to even contemplate the possibility at all, considering it a waste of their valuable time. So I wish you luck.

        But, yes, for this particular case, the creation of MIME::Base64::Perl looks like a backward step to me as well.

        It is sad that somebody expects one to do s/MIME::Base64/MIME::Base64::Perl/g to their code base (including to modules one didn't write) in order to allow it to work in an unfortunate environment, especially when this previously wasn't required.
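
For code you do control, one way to avoid that search-and-replace is to load whichever module is available and import the functions from it. This is only a sketch, and it assumes that MIME::Base64::Perl exports the same `encode_base64` and `decode_base64` functions as MIME::Base64; verify that against the versions you ship.

```perl
use strict;
use warnings;

# Try the XS module first, then the pure-Perl one, and import the
# two functions from whichever one loaded.
BEGIN {
    my $loaded;
    for my $module ('MIME::Base64', 'MIME::Base64::Perl') {
        (my $file = "$module.pm") =~ s{::}{/}g;
        next unless eval { require $file; 1 };
        $module->import(qw(encode_base64 decode_base64));
        $loaded = $module;
        last;
    }
    die "No Base64 implementation found\n" unless $loaded;
}

# The empty second argument suppresses the trailing newline.
my $encoded = encode_base64('hello', '');
print "$encoded\n";
print decode_base64($encoded), "\n";
```

This does nothing for third-party modules that say `use MIME::Base64` themselves, which is the harder half of the complaint above.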

        - tye        

Re: Pure Perl Modules, XS Modules, what's the current trends?
by stvn (Monsignor) on Dec 20, 2007 at 20:39 UTC
    At the moment, I basically keep good track of what CPAN modules I use and keep them in an app-specific perllib. I also say in the dev notes that this is so, and that it'd be a good idea to install these modules yourself (a Bundle is used, at the moment).

    Actually, Bundles are (and have always been) kind of a dirty hack. The newer trend is toward Task modules, which themselves are implemented using Module::Install. See the docs in Task for a more detailed explanation.

    I know of the Best module, but the low version number and long-time-since-hacked-upon date makes me wonder how good it'd work as well.

    I have used Best and it is totally stable and production worthy (IMO of course).

    Low version numbers don't mean a thing, nor does it matter how long ago something was last hacked on (unless you are talking 5-6+ years, in which case the module may not be in keeping with current community best practices; no, not Damian's book, the community's best practices, there is a subtle difference). In the case of Best the code is so simple that I would be more worried if there were more releases. If you look over the Changes file you will see that most of the releases have been for doc fixes and a few minor bugs, which tells me that other people are using this module and that the author cares about the clarity of his documentation, both very good signs.

    I hope people can understand the importance of having a fall back, pure perl implementation of some of these modules. I do understand that in many instances, these modules are much much slower, but it is better than nothin'.

    I disagree with the importance of Pure Perl fallbacks. They are nice to have in some cases, but really, just about every *nix system comes with a C compiler, and with the existence of things like Strawberry Perl the Windows platform is becoming less of an issue. IMO, effort is better spent making it easier to compile and install C based extensions than rewriting these extensions in a slower Pure Perl version.

    Is there a better method that you guys use?

    If you are looking to distribute your code easily without your users needing to install 1/2 of CPAN, then you might want to look into PAR, which is about as close to PHP-ease-of-installation as Perl gets these days.

    -stvn
      Low version numbers don't mean a thing, nor does it matter how long ago something has been hacked on (unless you are talking like +5-6 years,

      It is hard to distinguish between Production Ready and not without signifiers, like version numbers above "1" or something like that. There's always the "Just because it's on CPAN doesn't mean it's made of gold" idea, but what is? And what isn't? Sometimes it's hard to distinguish.

      I disagree with the importance of Pure Perl fallbacks. They are nice to have in some cases, but really, just about every *nix system comes with a C compiler, and with the existence of things like Strawberry Perl the Windows platform is becoming less of an issue. IMO, effort is better spent making it easier to compile and install C based extensions than rewriting these extensions in a slower Pure Perl version.

      Still, the ecological niche is that the users don't know how to use CPAN, even if they have it available. I tear out my own hair sometimes when CPAN gets unwieldy. That's not something my users are even going to attempt.

      If you are looking to distribute your code easily without your users needing to install 1/2 of CPAN, then you might want to look into PAR, which is about as close to PHP-ease-of-installation as Perl gets these days.

      If I could find an easy enough to use tutorial on how to do exactly that, I would - but the docs on CPAN still read as if I know what the heck I'm already doing. Perl is only user-friendly to Perl hackers, it seems. It's why people *use* php.

      And an app packaged in PAR needs PAR to work! It's a chicken 'n egg thing again.

       

      -justin simoni
      skazat me

        It is hard to distinguish between Production Ready and not without signifiers, like version numbers above "1" or something like that. There's always the "Just because it's on CPAN doesn't mean it's made of gold" idea, but what is? And what isn't? Sometimes it's hard to distinguish.

        But a version of 1.0 or above is no more reliable an indicator; I have seen lots of lumps of crap wrapped up in a bow with a big 1.0 on it and uploaded to CPAN. In the end you have to do things like: 1) ask the community, 2) look for indicators like the ones I mentioned in my post (resolved bugs, nicely updated docs, etc.), and finally 3) read the code and make your own judgement. It is open source, after all, so you can always patch it and send the fix to the author, or just fork the whole module and maintain your own version.

        Still, the ecological niche is that the users don't know how to use CPAN, even if they have it available. I tear out my own hair sometimes when CPAN gets unwieldy. That's not something my users are even going to attempt.

        But how is a Pure Perl version of Module::X going to make that any easier? I have had hard times installing Pure Perl modules too, it is not just C/XS based modules that cause issues.

        If I could find an easy enough to use tutorial on how to do exactly that, I would - but the docs on CPAN still read as if I know what the heck I'm already doing. Perl is only user-friendly to Perl hackers, it seems. It's why people *use* php.

        The PAR Tutorial is pretty good actually, and anything you don't understand you can just ask here or on IRC and I am sure you can get people to help you.

        And an app packaged in PAR needs PAR to work! It's a chicken 'n egg thing again.

        This is not actually true; you can tell PAR to package it all into a self-contained file which only depends on itself. Look in the PAR Tutorial; it specifically says "Requires only core Perl to run on the target machine".

        -stvn
Re: Pure Perl Modules, XS Modules, what's the current trends?
by almut (Canon) on Dec 20, 2007 at 20:48 UTC
    I hope people can understand the importance of having a fall back, pure perl implementation of some of these modules. I do understand that in many instances, these modules are much much slower, but it is better than nothin'.

    The problem is you can't easily have a pure Perl implementation in every case. In particular, a module which implements a binding to some C library, would have to re-implement the entire functionality of that library in Perl... (think of Net::SSH::Perl as a (more or less failed) attempt along these lines).  Similarly, if you need to make system calls into the OS.

      That point's well understood, and I don't want the universe to bend to my will, but the 3 examples I gave all have XS and Pure Perl versions, and they're all packaged in different modules.

      That's the reality.

      I'm just trying to see what may be the best way to say, "Use this if ya got it (the XS module), and if not, use this (the Pure Perl module)", 'cause they both do the same thing: one's probably faster, one's a little easier to package in a home-brewed system.

      I may just try the Best module, since I can package that myself and forgo the eval() statements.

       

      -justin simoni
      skazat me

      In particular, a module which implements a binding to some C library, would have to re-implement the entire functionality of that library in Perl...

      ... or if I could get P5NCI stable and more complete, it could just use that instead of XS.

        ... but would you want to call it "pure Perl" in that case? I mean, from an installation point of view, you'd still have to make sure the respective prerequisite shared object file(s) will be installed on the target system...

        But irrespective of that, a stable P5NCI would be a very nice thing to have :)

Re: Pure Perl Modules, XS Modules, what's the current trends?
by HeatSeekerCannibal (Beadle) on Dec 21, 2007 at 17:26 UTC
    So glad to hear someone who can clearly articulate the questions I've had for so long but haven't been able to express coherently.

    I've no answer to give, other than "I feel your pain".

    Best regards,

    Heatseeker Cannibal
