PerlMonks  

Maintaining Local CPAN Patches

by Ovid (Cardinal)
on Feb 27, 2008 at 10:49 UTC (#670583, perlmeditation)

Sometimes a CPAN module is broken and you have to handle it. As a general rule, I've found that these fall into three categories.

  1. You have a policy (formal or otherwise) never to load said module.
  2. You need it and will live with local patches.
  3. You want it, but will wait until it's updated.

For category #1, we never load the UNIVERSAL:: modules. That's a shame because I really, really love these modules. Well, I love the idea of these modules. There are two problems, though. Not only are they a slight performance hit (rewriting in C would alleviate that), but they've introduced strange bugs and warnings. For example, you can't load UNIVERSAL::can and Template::Timer together. chromatic knows about this. Andy Lester knows about this. Neither author appears to be budging on this matter and we're stuck in the middle. So what do we do? We've disallowed UNIVERSAL::can and patched Template::Timer.

But what do you do with the patch? We have a deps/ directory which has all of our CPAN dependencies. Yesterday, we introduced our deps_patched/ directory for local patches. Our UNIVERSAL::can looks like this:

package UNIVERSAL::can;

# Small performance hit and breaks otherwise working code.
# http://rt.cpan.org/Ticket/Display.html?id=31709

use Config;
my $max_int = 2 ** ( $Config{intsize} * 8 );
our $VERSION = $max_int;

1;

Note that we embed the RT number in the package. Later programmers must have an idea of what has been patched and why. We also set the version number high enough that it's unlikely to be upgraded by automated tools. (Is there a better way to handle this?)
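To illustrate why the sentinel version discourages upgrades, here is a minimal sketch. It assumes the upgrade tool compares versions numerically; `is_upgrade` is a hypothetical helper, and the release numbers in the loop are made up for illustration.

```perl
use strict;
use warnings;
use Config;

# Same sentinel as the stub UNIVERSAL::can above.
my $sentinel = 2 ** ( $Config{intsize} * 8 );

# Hypothetical helper: would a tool that compares versions numerically
# treat $candidate as an upgrade over $installed?
sub is_upgrade {
    my ( $installed, $candidate ) = @_;
    return $candidate > $installed ? 1 : 0;
}

# Made-up release numbers, for illustration only.
for my $released (qw( 1.12 1.16 2.00 )) {
    printf "%-5s upgrade over sentinel? %s\n",
        $released, is_upgrade( $sentinel, $released ) ? 'yes' : 'no';
}
```

No realistic release number exceeds the sentinel, so a numeric comparison never flags the stub as out of date.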

Our Template::Timer code is similar, but we leave the current version number, have the RT ticket in the code, and have a local patch.

In our script directory, we have check_cpan_upgrades.pl and it looks like this (courtesy of Matt Trout):

use CPAN;
{
    no warnings 'redefine';
    sub CPAN::Module::inst_file {
        shift->_file_in_path(['deps_patched/lib']);
    }
}
CPAN::Shell->r;

Now, if a new version of any of our patched modules is released, we can see which ones and evaluate them. This code needs to be cleaned up to integrate better with automated tools, but this is a good start.
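As a possible step toward tool integration, the sketch below lists each module under a patch directory with its declared version, without loading any of them. It is not part of our script: `patched_versions` is a hypothetical helper, and it relies on the core Module::Metadata module, which reads `$VERSION` statically. A computed version such as the sentinel above may not be reported accurately by a static scan.

```perl
use strict;
use warnings;
use File::Find ();
use Module::Metadata;

# Hypothetical helper: map each module found under $lib_dir to the
# version it declares, without compiling the module itself.
sub patched_versions {
    my ($lib_dir) = @_;
    my %version_of;
    File::Find::find(
        {
            no_chdir => 1,    # keep $_ as the full path
            wanted   => sub {
                return unless -f && /\.pm\z/;
                my $meta = Module::Metadata->new_from_file($_) or return;
                my $name = $meta->name or return;
                my $version = $meta->version;
                $version_of{$name} = defined $version ? "$version" : 'undef';
            },
        },
        $lib_dir,
    );
    return \%version_of;
}

# Directory name taken from the post; skipped if it does not exist.
my $lib_dir = 'deps_patched/lib';
if ( -d $lib_dir ) {
    my $versions = patched_versions($lib_dir);
    printf "%-30s %s\n", $_, $versions->{$_} for sort keys %$versions;
}
```

The resulting list could be fed to whatever reporting the automated tools expect.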

Finally, we have our tests in aggtests/unit/local_patches.t. The tests look sort of like this (redacted):

use Test::Most 'no_plan';
use Config;
use UNIVERSAL::can;
use Sub::Information;    # for inspect()

cmp_ok $UNIVERSAL::can::VERSION, '==', ( 2 ** ( $Config{intsize} * 8 ) ),
  'We should be loading our local version of modules';

# XXX this looks rather obscure, but what's going on is that the original
# &UNIVERSAL::can's package is UNIVERSAL, but if UNIVERSAL::can is really
# loaded, then the package is reported as UNIVERSAL::can.
is inspect(\&UNIVERSAL::can)->package, 'UNIVERSAL',
  '... and not be accidentally reloading';

# more tests

require Exporter::NoWork;
foreach ('import') {    # force an alias to a read-only constant
    eval { Exporter::NoWork::import( __PACKAGE__, $_ ) };
    ok my $error = $@,
      'Trying to import an unexportable tag with Exporter::NoWork should fail';
    unlike $error, qr/Modification of a read-only value/,
      '... but not with a "modification of readonly value" error';

    my $package = <<'    END_PACKAGE';
    package Foo;
    use Exporter::NoWork;
    no warnings 'redefine';
    sub unusual_function_name { return 'here I am' }
    END_PACKAGE
    eval $package;
    eval $package;
    is +(scalar grep { /Exporter::NoWork/ } @Foo::ISA), 1,
      '... and multiple uses of Exporter::NoWork should only add to @ISA once';
}

As you can see, we have two local patches applied to Exporter::NoWork and we look forward to this module being upgraded. We ensure that UNIVERSAL::can is our local version and the CPAN version is not accidentally loaded.
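A complementary check, sketched here under assumptions, is to ask `%INC` which file a module was actually loaded from. `loaded_from` is a hypothetical helper, not something in our test suite, and it assumes Unix-style path separators.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: true if $module was loaded from a file under $dir.
sub loaded_from {
    my ( $module, $dir ) = @_;

    # %INC keys look like "UNIVERSAL/can.pm".
    ( my $key = "$module.pm" ) =~ s{::}{/}g;
    my $path = $INC{$key} or return 0;    # not loaded at all

    my $root = File::Spec->rel2abs($dir);
    $root .= '/' unless $root =~ m{/\z};  # avoid matching "lib2" for "lib"
    return index( File::Spec->rel2abs($path), $root ) == 0 ? 1 : 0;
}

# File::Spec is a core module, so it should not come from the patch directory.
print loaded_from( 'File::Spec', 'deps_patched/lib' )
    ? "File::Spec is a local patch\n"
    : "File::Spec is not a local patch\n";
```

In a test file this could sit next to the version check, for example as `ok loaded_from( 'UNIVERSAL::can', 'deps_patched/lib' )`, assuming the patch directory has been put on `@INC` with `use lib`.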

Suggestions for improvements to this process are welcome.

Cheers,
Ovid

New address of my CGI Course.

Re: Maintaining Local CPAN Patches
by Corion (Pope) on Feb 27, 2008 at 12:06 UTC

    Depending on whether you want to do monkeypatching or real source patching (both of which can be confusing depending on whether you change/increase $VERSION or not), you might find the idea of the Distroprefs in CPAN interesting. The distroprefs allow local setting of preferences and (more to the point) automatic application of local patches, which hopefully apply cleanly across more than one version.

Re: Maintaining Local CPAN Patches
by Fletch (Chancellor) on Feb 27, 2008 at 18:30 UTC

    Not a pure Perl mechanism, but your local packaging mechanism might support automatically applying patches on build to a stock source tree (specifically I'm thinking BSD-y ports, but I'm pretty sure RPM provides something similar and would be surprised if apt/dpkg didn't as well). You could make local packages which pull the official source and then apply your patch in the build/install. That'd also give you the benefit of having the modules under package system control.

    (Granted I'm iffy on doing this with macports myself versus installing straight from CPAN, but thought I'd throw the idea out.)

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

Re: Maintaining Local CPAN Patches
by chromatic (Archbishop) on Feb 27, 2008 at 19:11 UTC
    Suggestions for improvements to this process are welcome.

    Patching away the real bug in Template::Timer is a good start, but I'd be happy to release a non-development version of UNIVERSAL::can and UNIVERSAL::isa if I could get a success or failure report on the problem for the development versions of either module... released three months ago.

    My preference is to work with upstream to get these bugs fixed and then use local patches as a last resort.

    Update: After thinking about this some more, I think this bug is a different one than the one I was thinking of when I wrote this reply.

      Regrettably, it's unlikely that even newer versions of these modules will be used:

      • It's a small performance hit (UNIVERSAL::isa alone cost us 20 seconds in our test suite)
      • We're finding so many bugs (even segfaults) with code which has a global effect that we try to eliminate all of it as soon as we see it.

      It's nothing specific against your modules, but their small impact coupled with the small impact of several other "global scope" modules has created a large impact in terms of performance, bugs, and time spent resolving these issues.

      Also, 20 seconds seems like a lot, but our test suite was taking almost 11 minutes to run at the time we removed it, so 20 seconds was only about 3% of the entire runtime. However, find just a few modules to remove that gain us 20 seconds each and we can quickly and easily gain a 10% improvement in our test suite (just yesterday I shaved off 50 seconds by reimplementing part of Test::Strict internally, but we're up to over 16 minutes again). We have to be aggressive about test performance because it's a huge drain on resources.

      Cheers,
      Ovid

      New address of my CGI Course.

        I feel your pain (just look at Parrot's test suite, and then compare it to the Pugs test suite...), but everything I've seen you write publicly about speeding up your test suite has been microoptimizations that gain you five seconds here and ten seconds there.

        Now I'm obviously not privy to the details, and you may have 500,000 assertions that run in 11 minutes (which is a pretty good clip), but my instincts tell me to look for the equivalent of algorithmic improvements, not two and three percent here and there.

        I don't remember the name of the health insurance billing project you saw me work on, but I did a brief statistical analysis of their test database, dropped 90% of the records, and cut the running time of certain tests by an order of magnitude. I hope you've looked at those types of optimizations and are just doing cleanup now.

Re: Maintaining Local CPAN Patches
by tilly (Archbishop) on Feb 27, 2008 at 23:57 UTC
    Suggested improvement. Use CPAN::Mini to create a local copy of CPAN, and then inject your versions of patched modules into it. And then configure all of your machines to get copies of modules they need off of that version of CPAN. And now all automated tools will automatically go against your CPAN server and will not accidentally install the wrong version of the module.

    Bonus, when you go to install new machines, you'll know that they are getting versions of CPAN modules that have actually been tested in your development environment. (Just be sure to actually upgrade regularly to avoid the pain of a "big bang" upgrade down the road.)

    I forget the exact details of setting this up, but brian_d_foy gave an excellent talk on this late last year to the Los Angeles perlmongers. You should be able to get them from him (possibly for the cost of ordering a back issue of The Perl Journal).

      This is an excellent suggestion, worth more than just an upvote.

Re: Maintaining Local CPAN Patches
by sundialsvc4 (Monsignor) on Feb 28, 2008 at 19:34 UTC

    Well, what I would do is to introduce a new directory in your production @INC path, to contain your patched modules. This directory must of course occur early in the list so that it will always be chosen first. In other words, even as CPAN is used to update the main CPAN directories on your system, these packages will never actually be selected by Perl from those directories. Instead, your patched versions will be discovered and used first.

    Now, you need a reliable system for maintaining those patches so that you can maintain your changes and incorporate the regular CPAN-derived changes, all at the same time. For that, you need to set up and use a version-control system. subversion (svn) might be an apropos choice, since that's generally what CPAN uses.

    See: http://en.wikipedia.org/wiki/Subversion_(software)#Branching_and_tagging.

    You would start by checking-in the original code from CPAN, which you used to create your original patch. That's your “base vendor-branch.” Tag this as your original starting-point. Now, branch off from that and check-in your modified code to create the first “production branch.” svn will automatically determine the source-code differences between the two.

    Now, from time to time, as CPAN updates its modules, you can check those in on the vendor-branch (the total set of changes made in CPAN will be captured), tagging them as you go. Next, merge those back into your production-branch, which contains your revisions, and tag that also as-you-go.

    If there are any conflicts between the two, you'll automatically be notified and can resolve them on a case-by-case basis. (The by-product of that resolve-step is, of course, another explicit revision, that gets committed into the system, so you know exactly what was done.)

    In general, this will allow you to reliably, and for the most part, automagically, maintain your site-specific patches and apply them to the CPAN-derived code ... thus maintaining both the ability to make local changes and the ability to employ CPAN's latest-and-greatest. Furthermore, you can at any time revert back to any previous change-point, e.g. to any one of the “tags” that you set along the way on any branch. You can compare any one to any other with absolute accuracy.

    You're always storing your merged revisions back to the local directory that you've created for that purpose; the one that's listed early in @INC. Although CPAN is updating the main directory, you're solely maintaining the version of these modules that you employ in production ... through the version-control system.

Re: Maintaining Local CPAN Patches
by demerphq (Chancellor) on Mar 01, 2008 at 11:08 UTC

    I'm not sure this is exactly what you want, but newer version control tools like mercurial and git have features specifically for handling cases like this. In the case of mercurial, its patch queue features are very useful.

    For all of you who are stuck on centralized version control systems: take a moment to learn either mercurial or git. You won't look back, and you'll be a much happier person. There has been a paradigm shift in version control going on almost silently in our industry, and it's something you should know about.

    ---
    $world=~s/war/peace/g

Re: Maintaining Local CPAN Patches
by eserte (Deacon) on Mar 01, 2008 at 21:03 UTC
    "Local" is not good - I think patches should always go to CPAN (my CPAN patch directory is $CPAN/authors/id/S/SR/SREZIC/patches) and the patch should be referenced in the corresponding RT ticket. And then use a distropref to apply the patch when building the module locally. E.g.

    ---
    comment: "bleadperl 31194 broke the test suite, also 5.00505 problems"
    match:
      distribution: "^MSERGEANT/XML-Parser-2.34.tar.gz"
    patches:
      - "SREZIC/patches/XML-Parser-2.34-SREZIC-01.patch"
