Honourable Monks,
I'm working on a program that reads XML files and, along the way, needs to store XML elements and their values in a Perl hash. The entries in the hash are not plain DOM trees; they are meant to be a bit more usable.
More often than not, XML documents use CamelCase or lowerCamelCase, while I would much prefer lower_case_with_underscore. (I'm not even sure if it is possible to deviate from that; what does the standard say?) From a certain point of view, it makes sense to store the keys exactly as they appear in the original document. This makes it easier to find what you want from the hash, simply by reading the schema, if available; no guesswork needed. However, the clash in naming conventions is enough to make me gouge my eyes out.
Enter Hash::CamelCase. This is a (trivial?) little module for tied hashes that simply converts all CamelCase and lowerCamelCase keys to lower_case_with_underscore, which is the internal representation. Problem solved -- it is now possible to simply store the keys as they appear in the XML document, and later access them using either CamelCase or lower_case keys.
Hash::CamelCase inherits from Tie::ExtraHash. Other modules that deal with the case of hash keys exist (such as Tie::CPHash and the Hash::Case framework), but none provides exactly what I need.
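For readers who have not subclassed Tie::ExtraHash before, here is a minimal sketch of the mechanism the module relies on. The tied object is an array reference whose element 0 is the hash that actually stores the data, so a subclass only has to rewrite the key before touching that hash. The package name LowerKeyHash is made up for this illustration and simply lowercases keys; it is not part of Hash::CamelCase.

```perl
use strict;
use warnings;

# Hypothetical demo package, for illustration only.
package LowerKeyHash;
use Tie::Hash;                       # Tie::ExtraHash lives in Tie/Hash.pm
our @ISA = ('Tie::ExtraHash');

# $_[0] is the tied object (an array ref); $_[0][0] is the storage hash.
# Each accessor rewrites the key, then works on the storage hash.
sub STORE  { $_[0][0]{ lc($_[1]) } = $_[2] }
sub FETCH  { $_[0][0]{ lc($_[1]) } }
sub EXISTS { exists $_[0][0]{ lc($_[1]) } }
sub DELETE { delete $_[0][0]{ lc($_[1]) } }

package main;

tie my %h, 'LowerKeyHash';
$h{Foo} = 42;

print $h{FOO}, "\n";                              # 42
print join(',', keys %h), "\n";                   # foo
print join(',', keys %{ tied(%h)->[0] }), "\n";   # foo (the storage hash itself)
```

Iteration (keys, each) needs no override, because the inherited FIRSTKEY and NEXTKEY already walk the storage hash, which only ever holds rewritten keys.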
package Hash::CamelCase;

=head1 NAME

Hash::CamelCase - A hash whose keys are CamelCase-insensitive.

=head1 SYNOPSIS

    use Hash::CamelCase;

    my %hash;
    tie %hash, 'Hash::CamelCase';

    $hash{ThisIsAKey} = 1;
    $hash{this_is_a_key} = 0;
    # $hash{ThisIsAKey} is now 0.
    $hash{thisIsAKey} = 5;
    # $hash{ThisIsAKey} is now 5.

    # However, these are different keys from the above three:
    print "Not defined\n" if (not defined $hash{THIS_IS_A_KEY});
    print "Not defined\n" if (not defined $hash{This_Is_A_Key});

=head1 REQUIRES

Perl 5

=head1 INHERITS FROM

Tie::ExtraHash (which, in turn, inherits from L<Tie::Hash>)

=cut

use 5.000;
use strict;
use warnings;

use Tie::Hash;

use vars qw($VERSION @ISA);
@ISA = qw(Tie::ExtraHash);

use subs qw(_internalize);

=head1 EXPORTS

Nothing.

=head1 DESCRIPTION

Hash::CamelCase is a simple subclass of Tie::ExtraHash. It provides
"CamelCase insensitive" keys: key names in CamelCase, lowerCamelCase, and
lower_case_with_underscore are all equivalent. In other words, keys in any
of those three forms will be converted to a common, internal representation.

This module was originally created in the TIMTOWTDI spirit, intended to be
used to store XML elements and their values in a Perl hash. Quite often, XML
documents use CamelCase or lowerCamelCase, while the author adheres to the
lowercase underscore naming convention in Perl code. Hash::CamelCase allows
one to use both conventions with the same hash.

The following three naming conventions are considered equivalent:

=over

=item 1

MyLongVariableName

=item 2

myLongVariableName

=item 3

my_long_variable_name

=back

However, the following are different from all three above:
My_Long_Variable_Name, my_LongVariableName, MYLONGVARIABLENAME,
MY_LONG_VARIABLE_NAME, My_long_variable_name, _MyLongVariableName,
MYLongVariableName, etc.

The module does not prevent the user from storing keys that are not
CamelCase, lowerCamelCase, or lower_case_with_underscore. Use for other
purposes at your own risk.

To use Hash::CamelCase, simply tie your hash with it:

    my %hash;
    tie %hash, 'Hash::CamelCase';

You can now access the same keys in any of the three naming conventions:

    $hash{variable_name} = 1;
    print "I'm a camel!\n" if ($hash{VariableName});
    print "I'm a small camel!\n" if ($hash{variableName});
    print "I'm a confused beast.\n" if ($hash{Variable_Name});
    # This will print both "I'm a camel!" and "I'm a small camel!",
    # but not "I'm a confused beast."

Integer sequences are counted as words. In other words, C<VariableName1>
and C<variable_name_1> are the same key, as are C<Variable1Name> and
C<variable_1_name>. However, C<Var1ableName> is not CamelCase and will not
be equivalent to C<var_1able_name>! Use C<Var1AbleName>.

=cut

# INTERNAL METHODS

# Overridden methods from Tie::ExtraHash

# For all overridden methods, simply convert the key to the internal
# representation first, if needed, then act normal (and call the methods of
# the superclass).

sub FETCH {
    $_[0][0]->{_internalize $_[1]};
}

sub STORE {
    $_[0][0]->{_internalize $_[1]} = $_[2];
}

sub EXISTS {
    exists $_[0][0]->{_internalize $_[1]};
}

sub DELETE {
    delete $_[0][0]->{_internalize $_[1]};
}

# Utility functions

# $result = _internalize($string)
#
# _internalize will convert CamelCase and camelCase to
# lower_case_with_underscore, and leave the rest as is. $string may
# contain any characters at all that are legal in Perl hashes.
#
# _internalize is package global and functional.
sub _internalize {
    my $word = shift;

    for ($word) {
        m{^[[:upper:][:lower:]0-9]+$}
            and not m{[[:upper:]]{3,}}
            and not m{[0-9][[:lower:]]}
            and do {
                s{([0-9]+)}{_$1}g;
                s{([[:upper:]])}{_$1}g;
                # If $word begins with an uppercase letter or number,
                # then the above will prefix it with an underscore.
                # Remove the underscore.
                s{^_}{};
                $_ = lc;
            }
    }

    return $word;
}

=head1 VERSION

1.0

=cut

$VERSION = '1.0';

=head1 SEE ALSO

L<Tie::Hash>, L<perltie>.

For a semi-official definition of CamelCase and mixedCase, see
L<http://en.wikipedia.org/wiki/CamelCase> and
L<http://www.python.org/dev/peps/pep-0008/>.

A related module in spirit is L<String::CamelCase> by YAMASHINA Hio, which
converts between CamelCase and lower_case_with_underscore.

Other modules related to hashes and key cases:

=over

=item By Mark Overmeer:

L<Hash::Case>, L<Hash::Case::Lower>, L<Hash::Case::Upper>,
L<Hash::Case::Preserve>.

=item By Christopher J. Madsen:

L<Tie::CPHash>.

=back

=head1 AUTHOR

Ville R. Koskinen E<lt>w-ber@iki.fiE<gt>

=head1 COPYRIGHT

Copyright (C) 2007 by Ville R. Koskinen.

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself, either Perl version 5.8.8 or, at your
option, any later version of Perl 5 you may have available.

=cut

1;
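The equivalences claimed in the POD can be checked without installing anything. The script below is a sketch: it copies the three tests and three substitutions of _internalize into a standalone sub under the made-up name normalize_key, then runs it over the keys mentioned in the documentation. The name, the table of cases, and the output format are assumptions for illustration only.

```perl
use strict;
use warnings;

# Standalone copy of the normalisation rule (hypothetical name), so that
# the script runs without Hash::CamelCase being installed.
sub normalize_key {
    my ($key) = @_;
    return $key if $key !~ /^[[:upper:][:lower:]0-9]+$/;  # letters and digits only
    return $key if $key =~ /[[:upper:]]{3,}/;             # e.g. MYLongVariableName
    return $key if $key =~ /[0-9][[:lower:]]/;            # e.g. Var1ableName
    $key =~ s/([0-9]+)/_$1/g;        # a run of digits counts as a word
    $key =~ s/([[:upper:]])/_$1/g;   # every capital starts a new word
    $key =~ s/^_//;                  # drop the underscore added at the front
    return lc $key;
}

# [ input key, expected internal form ], taken from the POD above.
my @cases = (
    [ 'MyLongVariableName',    'my_long_variable_name' ],
    [ 'myLongVariableName',    'my_long_variable_name' ],
    [ 'my_long_variable_name', 'my_long_variable_name' ],
    [ 'VariableName1',         'variable_name_1'       ],
    [ 'Variable1Name',         'variable_1_name'       ],
    [ 'Var1AbleName',          'var_1_able_name'       ],
    [ 'Var1ableName',          'Var1ableName'          ],  # left untouched
    [ 'MYLongVariableName',    'MYLongVariableName'    ],  # left untouched
    [ 'My_Long_Variable_Name', 'My_Long_Variable_Name' ],  # left untouched
);

for my $case (@cases) {
    my ($in, $want) = @$case;
    my $got = normalize_key($in);
    printf "%-22s -> %-22s %s\n", $in, $got,
        $got eq $want ? 'ok' : "NOT ok (wanted $want)";
}
```

One consequence of the three-capitals test worth keeping in mind: a key such as XMLParser is left as is, because it contains a run of four uppercase letters.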
The only shortcoming I can think of at the moment is that the documentation is much longer than the actual code, which is, well, almost trivial. Does this module have any right to exist? Would anyone other than me find a use for it? Should it be shoehorned into the Hash::Case framework?
Download the module as a CPAN-esque package.
Thank you for your patience.