Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options
 
PerlMonks  

Breaking Tie::Hash into three modules

by afoken (Abbot)
on Sep 08, 2018 at 18:26 UTC ( #1221954=perlmeditation: print w/replies, xml ) Need Help??

Trigger

This meditation was triggered by Re^3: How to tie a hash to a class.

The Problem

Why can we do this ...

package My::Hash; use strict; use warnings; use parent 'Tie::Hash'; 1;

... but not this?

package My::Hash; use strict; use warnings; use parent 'Tie::StdHash'; 1;

Why do we have to write this instead?

package My::Hash; use strict; use warnings; use Tie::Hash (); our @ISA='Tie::StdHash'; 1;

Or, even worse, this, following the documentation found in Tie::Hash?

package My::Hash; # NO use strict; or things will break! use warnings; require Tie::Hash; @ISA='Tie::StdHash'; 1;

A little bit of history

Well, at some point in time, someone must have thought it was clever to put all three classes Tie::Hash, Tie::StdHash, and Tie::ExtraHash into the same file, Tie/Hash.pm. Looking at https://perl5.git.perl.org/perl.git/history/HEAD:/lib/Tie/Hash.pm, that must have happened before 1996-02-03. So this oddity is embedded deep in Perl's history.

At that time, Tie::Hash was still named TieHash, and Tie::StdHash was still named TieHash::Std. Tie::ExtraHash did not exist. And for reasons that are not clear to me, TieHash::Std inherited from TieHash. TieHash::Std reimplemented each and every method, and so only new() was really inherited from TieHash. That is the most useless method in the entire file, because tie never calls it. The TieHash::TIEHASH() constructor adds some extra work just to check for a new() implementation, and that's still there.

Three days after the first version, the classes were renamed to their current names.

Six years later, Tie::ExtraHash appeared in the same file, and it did NOT inherit from Tie::Hash. At the same time, Tie::StdHash stopped inheriting the nonsense new() constructor from Tie::Hash:

# @ISA = qw(Tie::Hash); # would inherit new() only

In all of the following versions up to the latest from 2013, Tie::ExtraHash never started inherited from Tie::Hash, and the @ISA line in Tie::StdHash was never commented back in.

Cleaning up the mess

First, Tie::ExtraHash

As shown, Tie::ExtraHash never inherited from Tie::Hash, so it is completely safe to move it into a separate file. Just cut the last 13 lines from Tie/Hash.pm and paste them into a new file Tie/ExtraHash.pm:

package Tie::ExtraHash; sub TIEHASH { my $p = shift; bless [{}, @_], $p } sub STORE { $_[0][0]{$_[1]} = $_[2] } sub FETCH { $_[0][0]{$_[1]} } sub FIRSTKEY { my $a = scalar keys %{$_[0][0]}; each %{$_[0][0]} } sub NEXTKEY { each %{$_[0][0]} } sub EXISTS { exists $_[0][0]->{$_[1]} } sub DELETE { delete $_[0][0]->{$_[1]} } sub CLEAR { %{$_[0][0]} = () } sub SCALAR { scalar %{$_[0][0]} } 1;

Then, Tie::StdHash

For the last 16 years, it seems that nobody complained about Tie::StdHash no longer inheriting the useless new() from Tie::Hash. So I am very sure that we can also move that class safely out of Tie/Hash.pm. From the edited Tie/Hash.pm, cut the last 17 lines and paste them into a new file Tie/StdHash.pm. Add a trailing 1; and remove the @ISA line completely:

# The Tie::StdHash package implements standard perl hash behaviour. # It exists to act as a base class for classes which only wish to # alter some parts of their behaviour. package Tie::StdHash; sub TIEHASH { bless {}, $_[0] } sub STORE { $_[0]->{$_[1]} = $_[2] } sub FETCH { $_[0]->{$_[1]} } sub FIRSTKEY { my $a = scalar keys %{$_[0]}; each %{$_[0]} } sub NEXTKEY { each %{$_[0]} } sub EXISTS { exists $_[0]->{$_[1]} } sub DELETE { delete $_[0]->{$_[1]} } sub CLEAR { %{$_[0]} = () } sub SCALAR { scalar %{$_[0]} } 1;

Third, Tie::Hash

The most obvious problem first: Append a trailing 1; or else loading Tie::Hash will fail.

Now that we have cleaned up everything, legacy knocks at the door, complaining about modules failing to inherit from Tie::StdHash after loading only Tie::Hash. So, we have to load the two modules, just to avoid legacy trouble. Adding two use lines is easy:

use Tie::StdHash; use Tie::ExtraHash;

A little bit of cleaning up

Both Tie::StdHash and Tie::ExtraHash lack a version number, so add our $VERSION = '1.06' to both files, and bump $VERSION in Tie::Hash to 1.06, too.

All three classes should use strict; use warnings;.

The POD

Tie::StdHash and Tie::ExtraHash have no POD at all. We could work around with something like this:

=head1 NAME Tie::StdHash =head1 SYNOPSIS See L<Tie::Hash> =head1 DESCRIPTION See L<Tie::Hash> =cut

A Patch

Instead of adding workarounds, we could clean up the POD in Tie::Hash, move the parts documenting Tie::StdHash and Tie::ExtraHash into the respective files, and clean up the examples to 21st century Perl.

Here is a diff, based on the current HEAD:

Should anyone get paranoid about legal stuff: This diff is free software, use it under the same license as Perl (perlartistic).

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

Replies are listed 'Best First'.
Re: Breaking Tie::Hash into three modules
by Corion (Pope) on Sep 09, 2018 at 07:05 UTC

    Not to detract from your excellent point, parent allows for (and documents) this case:

    package MyHash; use Tie::Hash; use parent -norequire, 'Tie::StdHash';

    But the real fix is to split the modules up into separate files.

      Not to detract from your excellent point, parent allows for (and documents) this case:

      package MyHash; use Tie::Hash; use parent -norequire, 'Tie::StdHash';

      I understand that use parent -norequire, 'Some::Class'; can be helpful, but it requires even more typing than our @ISA=('Some::Class');. I've considered for a few seconds if parent could fix what Tie::Hash broke, without forcing me to write more code.

      Adding a few special cases for core modules like Tie::StdHash that exist in files where they should not be. A simple hash mapping broken classes to their containing files ({ 'Tie::StdHash' => 'Tie::Hash', 'Tie::ExtraHash' => 'Tie::Hash', ... }), plus a few lines of code to use that hash. Perhaps only if require failed to load the base class.

      It would have solved the problem that use parent 'Tie::StdHash'; does not work out of the box. But the same would have to be added to base, and to every other module that manipulates @ISA for other modules. And it would require updating the workaround hash from time to time, whenever someone blindly copies the Tie::Hash anti-pattern.

      Another way could have been adding a way to load a class and inherit from a different class to parent, eleminating the need to manually load the file containing the base class. Something like use parent 'Tie::StdHash', -from => 'Tie::Hash'; or use parent { 'Tie::StdHash' => 'Tie::Hash' };.

      But again, why should parent fix what is wrong in other modules?

      After all, parent's history is quite clear:

      This module was forked from base to remove the cruft that had accumulated in it.

      So, after about three seconds, I discarded that idea and did not write about it in the meditation.

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: Breaking Tie::Hash into three modules
by afoken (Abbot) on Sep 09, 2018 at 10:16 UTC

    Tie::StdHash

    For the last 16 years, it seems that nobody complained about Tie::StdHash no longer inheriting the useless new() from Tie::Hash. So I am very sure that we can also move that class safely out of Tie/Hash.pm.

    Having a fresh look at the code of both Tie::StdHash and Tie::Hash explains why nobody complained:

    • Tie::Hash::TIEHASH() is designed to call new() in an inheriting class, or to be reimplemented by the inheriting class. If the inheriting class implements neither new() nor TIEHASH(), Tie::Hash::TIEHASH() croak()s.
    • (If the inheriting class implements new(), but not TIEHASH(), Tie::Hash::TIEHASH() warns that it calls new() because TIEHASH() is missing. So why even bother to implement new()?)
    • Tie::Hash::new(), which is never called by tie, just calls TIEHASH() of the inheriting class. Implementing new() in an inheriting class does not make any sense at all, because you need a class name, not an object, for tie.
    • Tie::StdHash implements its own TIEHASH(), as expected by Tie::Hash and tie. This prevents new() from being called indirectly from tie().

    So, no class inheriting from Tie::StdHash needs a new() implementation.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: Breaking Tie::Hash into three modules
by afoken (Abbot) on Sep 09, 2018 at 11:46 UTC

    More mess

    Tie::Array and Tie::Scalar also place Tie::StdArray and Tie::StdScalar in the wrong files. Tie::Scalar also copies the useless new() method from Tie::Hash, including the warnings generated by the real constructor (TIESCALAR() / TIEHASH()).

    Tie::Array does not. It does not even implement a TIEARRAY() constructor. I think this is the cleanest solution. If inheriting classes do not implement TIEARRAY(), perl will automatically complain, no code required at all.

    Tie::Array implements a dummy DESTROY() method. I don't think it is strictly needed, but it should not hurt.

    Tie::Handle does not embed Tie::StdHandle. It was moved out into Tie/StdHandle.pm in 2007 and replaced by a backwards-compatible use Tie::StdHandle;. But it also copies the useless new() method.

    More patches

    I've written two more patches, that split out the Tie::Std* classes, add strict and warnings, and use parent instead of require and assigning to @ISA.

    Tie::Array

    Note: Tie::StdArray does not inherit from Tie::Array, because it would inherit only the empty methods EXTEND() and DESTROY(). Instead, it implements its own empty EXTEND() and DESTROY() methods.

    Based on https://perl5.git.perl.org/perl.git/blob/83aebc99b27428f0566bab5ded4d1df2167a9d4a:/lib/Tie/Array.pm:

    Tie::Scalar

    Note: The new Tie::StdScalar does not have a new() method at all. It is not needed, for the same reasons as explained in Re: Breaking Tie::Hash into three modules for Tie::StdHash. It also does not inherit from Tie::Scalar, because it would reimplement all methods (except for new()).

    Based on https://perl5.git.perl.org/perl.git/blob/2515a12cf493baaa211da7524a276dd30695ca29:/lib/Tie/Scalar.pm:

    Tie::Handle

    No patch. I can use parent 'Tie::StdHandle';, so I don't really need to patch here. The synopsis in Tie::StdHandle is wrong (copied from Tie::Handle), but it is quite obvious that only the class name has to be changed to Tie::StdHandle.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: Breaking Tie::Hash into three modules
by shmem (Chancellor) on Sep 09, 2018 at 19:34 UTC

    Perhaps the way to go would be making the three modules into just one: Tie::Hash. Same with Tie::Array and Tie::Scalar.

    perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
      Perhaps the way to go would be making the three modules into just one: Tie::Hash. Same with Tie::Array and Tie::Scalar.

      And for completeness, also merge Tie::StdHandle into Tie::Handle.

      I really like that idea. I shortly thought about it, but I did not have the courage to drastically change the behaviour of Tie::Hash, Tie::Array, Tie::Scalar, and Tie::Handle.

      But:

      Looking again at the source, there is a pattern in Tie::Array, Tie::Hash, and Tie::Handle, hidden between the nonsense code: All three classes implement methods for inheriting classes that do NOT depend on the actual implementation of the class, but use existing primitive methods. And I think that for this reason, we should keep the class names and functionality as they are now. The generic class implements generic methods that allow omitting some methods in inheriting classes, even with completely different inner structures, and the Std* class implements standard behaviour based on a trivial blessed reference of the respective type.

      Tie::Hash

      In case of Tie::Hash, there is only CLEAR() that slowly clears the tied hash by iterating over all keys and removing them one after the other. Consequentially, both Tie::StdHash and Tie::ExtraHash implement faster variants that simply assign an empty list to the internal hash.

      All of the other methods in Tie::Hash are just code that complains where perl would also complain without any extra code. use Carp, use warnings::register, and the methods new(), TIEHASH(), and EXISTS() can simply be removed from Tie::Hash.

      Tie::Array

      The methods UNSHIFT, SHIFT, CLEAR, PUSH, POP, and SPLICE, and EXTEND in Tie::Array are again generic methods that do not depend on the implementation.

      The remaining methods EXISTS and DELETE in Tie::Array just complain, and DESTROY is redundant. They all should be removed.

      Tie::StdArray implements standard array behaviour in short and efficient code, reimplementing most of the methods in Tie::Array with faster code.

      Tie::Scalar

      A scalar is so simple that it needs only three methods. All of them are primitive and can not be replaced by calls to other methods. Consequentially, Tie::Scalar contains only code to complain, all of it should be removed: All methods, use Carp, and use warnings::register. What remains is POD, the package Tie::Scalar, and our $VERSION. Moving Tie::StdScalar to Tie/StdScalar.pm breaks backwards compatibility, so use Tie::StdScalar should be added to Tie::Scalar. Also, Tie::StdScalar should keep inheriting from the empty Tie::Scalar.

      Yes, it looks like nonsense. But it allows to add more methods to a tied scalar class in future perl versions, and having a sane default for the extra methods in either Tie::Scalar or Tie::StdScalar.

      Tie::Handle

      Tie::Handle once again contains some generic methods (PRINT and PRINTF use WRITE, GETC uses READ), and several complaining methods (TIEHANDLE, READLINE, READ, WRITE, CLOSE), as well as the useless new(). new() and the complaining ones have to go. Tie::StdHandle implements all standard methods, and a simpler version of GETC. No changes needed here, except for the wrong class name in the SYNOPSIS section of the POD.

      Backwards compatibility needed?

      I wanted to know which CPAN modules use one of the Tie::* classes, and which of those modules actually use the new() nonsense. https://grep.metacpan.org/search?size=20&q=Tie%3A%3A&qd=*&qft= runs into an HTTP 500 error. Is there a better way than downloading and unpacking the entire CPAN?

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

        There is a life outside Perl, that keeps me dry, warm and well-fed. But sometimes, it needs time I wanted to use for perl. So, after an unplanned delay, here are ...

        Even more patches

        Preparing

        I've downloaded perl 5.28.0 from cpan to a spare machine, unpacked and compiled it, and run its test. No errors, no warnings, no problems. So my testing environment (Slackware 14.2) is ok. I did not expect anything else.

        Then, I did the same after changing Tie/Scalar.pm, Tie/Array.pm, Tie/Hash.pm and moving the Tie::Scalar, Tie::StdArray, Tie::StdHash, and Tie::ExtraHash classes to new files. I've completely removed Carp, warnings::register and new(). The latter is nonsense, as explained in Re: Breaking Tie::Hash into three modules and in Note 1 below. Removing new() made all of the checks and warnings in the dummy TIEARRAY() / TIESCALAR() / TIEHASH() constructors obsolete, and because they contained no useful code beyond that, I just removed them. I also removed dummy methods that just called called croak() together with Carp used only there. If the constructors or required methods are missing, tie will complain as loudly as the code in the dummy constructors and methods did before. No Perl code needed for that.

        Perl compiled fine, but unrelated tests failed. Typos, missing edits, you name it. The new files have to be added to MANIFEST. D'oh! Could have thought of that. Maintainers.pl complains about the new files. Patch that, wash, rinse, repeat.

        Strange Tests

        With clean code, some tests still failed. Not because the cleanup did not work. It worked fine, not a single test complained about the methods I've moved around or removed. They failed because abusing the classes did not cause the same errors as before.

        They failed because the nonsense new() constructor no longer existed that tie never ever calls. lib/Tie/scalar.t explicitly calls Tie::StdScalar->new(). No sane code would do that!

        Patching Tests

        I've considered a long time if I should treat the tests like a requirement document that must not be changed. And yes, for me, they are a kind of requirement document, but with bad wording. Following the wording would have meant to re-introduce all of that nonsense code around new() just to pass the unmodified tests.

        I choose to follow the spirit, but not the words of that implied requirement document. Wrong or missing constructors shall fail, and I won't change that requirement. But I will change the test that checks for the exact error message. And because I've removed the nonsense new(), the tests rotating around that will have to go.

        A huge patch

        34 kBytes, 1128 lines. Much of the POD for the new files wss copied, and moving stuff from one file to another really inflates diff output. If there was a notation for "move this chunk of text to that file", the output would be much shorter.


        Notes

        (1) Yes, we could make life a little bit easier for people coming from C++, by giving them an additional new constructor faking the C++ new operator. But: tie does not call that constructor, and you don't have a new() constructor in C++ classes. The constructor of a C++ class has the same name as the class. So new() is complete and utter nonsense, even if we want to make Perl look like C++.

        (2) As before, the patch is under the same license as Perl itself. Feel free to test it.

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

        cpan.grep.me is usually better than grep.metacpan.org, although both tend to have issues with downtime.

        There are plenty of modules inheriting from these classes, although I haven't looked at how they are using them.

      I agree with this thinking. It's both logical and practical. Once the core work is done, seems to me it would be easier to maintain (and write tests against).

      How would that work? The three packages have different behavior that can't be combined.
        How would that work? The three packages have different behavior that can't be combined.

        Read Re: Breaking Tie::Hash into three modules again.

        shmem++ proposes to merge Tie::StdHash into Tie::Hash, Tie::StdArray into Tie::Array, and Tie::StdScalar into Tie::Scalar.

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: Breaking Tie::Hash into three modules
by Anonymous Monk on Sep 10, 2018 at 15:41 UTC
    Have you submitted this to perlbug@perl.org so that it can be discussed?

      Not yet

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

        If you do "report" this to perlbug or the P5P, note me in as an advocate, even if I don't understand all of the details (I don't think I need to really).

        I think and feel that it's a sane move, even if it requires higher-level discussions for 'approval'. I have a few cycles, and some hardware to play on if necessary.

        Some tips for successfully submitting issues for Perl in case you haven't before: 1. Send an initial email without attachments to perlbug@perl.org describing the bug and including the word Perl so the spamfilter doesn't silently drop it. Send it from the perlbug program if you have email capability on your Perl development machine (I don't). Otherwise include perl -V output if it is relevant. 2. Log into rt.perl.org and hope that your account isn't randomly blocked from accessing all bug reports. 3. Go to your bug report and reply with attachments in the web interface.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlmeditation [id://1221954]
Approved by marto
Front-paged by marto
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (3)
As of 2018-09-23 15:46 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    Eventually, "covfefe" will come to mean:













    Results (191 votes). Check out past polls.

    Notices?
    • (Sep 10, 2018 at 22:53 UTC) Welcome new users!