Neat code! (Though a few named constants would make it a little more readable:)
I had thought that when processing fixed-length records, rather than having to unpack every record into separate scalars, operate upon them and then repack them, I could:
1. Set up a buffer.
2. Grab some lvalue refs into the sections of the buffer.
3. Read or lvalue assign the record directly into that buffer.
4. Operate upon the named (lvalue) sections.
5. Write the record out.
6. Repeat from step 3 until done.
Something like this: (only shows one field, but that's the perl limitation I encountered).
#! perl -sw
use strict;
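my $buf = ' ' x 70;                    # fixed-length record buffer
my $genus = \substr($buf, 40, 10);     # lvalue ref to the genus field
while( <DATA> ) {
    substr($buf, 0) = $_;              # overwrite the buffer contents with the record
    $$genus = uc($$genus);             # modify the field through the ref
    print $buf;
}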
#23456789012345678901234567890123456789012345678901234567890123456789
__DATA__
00001 fox brown indian Vulpes Bengalensis
00002 fox blanford quick Vulpes Cana
00003 fox brown cape Vulpes Chama
00004 fox grey tree Vulpes Cinereoargenteus
00005 fox brown quick Vulpes Corsac
00006 fox brown tibetian Vulpes Ferrilata
00007 fox grey quick Vulpes Littoralis
00008 fox pale quick Vulpes Pallida
00009 fox brown quick Vulpes Ruppelli
00010 fox brown swift Vulpes Velox
00011 fox red quick Vulpes Vulpes
00012 fox brown quick Vulpes Zerda
00013 fox white arctic Alopex Lagopus
00014 fox Culpeo Dusicyon Culpaeus
00015 fox Grey Argentine Dusicyon Griseus
00016 fox Azara Dusicyon Gymnocercus
00017 fox small eared Dusicyon Microtis
00018 fox Sechuran Dusicyon Sechurae
00019 fox crab eating Dusicyon Thous
00020 fox Hoary Dusicyon Vetulus
00021 fox bat eared Octocyon Megalotis
Output:
00001 fox brown indian VULPES Bengalensis
00002 fox blanford quick VULPES Cana
00003 fox brown cape VULPES Chama
00004 fox grey tree VULPES Cinereoargenteus
00005 fox brown quick VULPES Corsac
00006 fox brown tibetian VULPES Ferrilata
00007 fox grey quick VULPES Littoralis
00008 fox pale quick VULPES Pallida
00009 fox brown quick VULPES Ruppelli
00010 fox brown swift VULPES Velox
00011 fox red quick VULPES Vulpes
00012 fox brown quick VULPES Zerda
00013 fox white arctic ALOPEX Lagopus
00014 fox Culpeo DUSICYON Culpaeus
00015 fox Grey Argentine DUSICYON Griseus
00016 fox Azara DUSICYON Gymnocercus
00017 fox small eared DUSICYON Microtis
00018 fox Sechuran DUSICYON Sechurae
00019 fox crab eating DUSICYON Thous
00020 fox Hoary DUSICYON Vetulus
00021 fox bat eared OCTOCYON Megalotis
Somewhat COBOLish, but possibly useful for efficiency, especially if long records are involved.
Whilst it can be achieved using a tie'd scalar as you've shown, as Elian pointed out above, the costs of tie pretty much negate the benefits of the idea.
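I think it's probably more efficient to set up a HoA, something like:
my %fields = (
    record_num       => [0,10],
    characteristic_1 => [10,10],
    characteristic_2 => [20,10],
    genus            => [30,10],
    subspecies       => [40,20],
);
# substr's prototype puts its arguments in scalar context, so the
# [offset, length] pair has to be pulled out of the array first.
my( $off, $len ) = @{ $fields{genus} };
substr($buf, $off, $len) = uc( substr($buf, $off, $len) );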
but that's nowhere near as nice as the equivalent line above. Unpacking to named scalars and then repacking them is much cleaner than this.
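For comparison, here is a minimal sketch of the unpack/repack approach. The template (five 10-char fields followed by one 20-char field, 70 chars in total) is an assumption inferred from the 70-char buffer and the genus offset of 40 in the code above; the sub name upcase_genus and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Assumed layout: five 10-char fields plus a 20-char field (70 chars in all).
my $template = 'A10 A10 A10 A10 A10 A20';

# Hypothetical helper: unpack a record into named scalars, upper-case the
# genus, then repack it to the same fixed width.
sub upcase_genus {
    my ($record) = @_;
    my ($num, $kind, $char1, $char2, $genus, $subspecies)
        = unpack $template, $record;      # 'A' strips trailing spaces
    $genus = uc $genus;
    return pack $template,                # 'A' pads with spaces again
        $num, $kind, $char1, $char2, $genus, $subspecies;
}

# Build one sample record in the assumed layout and process it.
my $rec = sprintf '%-10s%-10s%-10s%-10s%-10s%-20s',
    '00001', 'fox', 'brown', 'indian', 'Vulpes', 'Bengalensis';
print upcase_genus($rec), "\n";
```

The price is that every field gets copied out and back in for each record, even when only one of them changes.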
I also hoped that creating an array of lvalue refs to a string instead of using split to create an array of 1 char scalars from that string, would allow me to treat a string as an array of char.
my @str = map{ \substr($str, $_, 1) } 0..length($str)-1;
$str[3] = 'b' if $str[7] eq 'N';
This is a feature that I think perl deserves and really hope that it will be available in Perl 6. It irks me every time that I use:
my @c = split'',$str;
$c[3] = 'b' if $c[7] eq 'N';
$str=join'',@c;
I can do this with substr but using an array of lvalues would allow me the syntax I desire. It would still need to be wrapped into a tie'd scalar with the penalties that involves.
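As a rough sketch of the plain substr version of the split/join snippet above: set_char_if is a hypothetical helper name, and the positions are 0-based as with substr itself.

```perl
use strict;
use warnings;

# Hypothetical helper: overwrite the char at $pos with $new, but only when
# the char at $test_pos equals $expect. Works on a copy and returns it.
sub set_char_if {
    my ($str, $pos, $new, $test_pos, $expect) = @_;
    substr($str, $pos, 1) = $new if substr($str, $test_pos, 1) eq $expect;
    return $str;
}

# Same effect as: $c[3] = 'b' if $c[7] eq 'N';
print set_char_if('abcdefgNij', 3, 'b', 7, 'N'), "\n";
```

No intermediate array is built, but the intent is buried under the substr calls, which is exactly the complaint.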
With perl's current use of sigils, it isn't possible to add the syntactic sugar that would allow $str[n] to mean the nth char of $str as there is no way to distinguish this from accessing the nth element of the array @str. And there is no clean way to provide that syntax using overload or a tied scalar as far as I can tell.
I'm hoping that the powers that be will allow this syntax once references to @str always start with @ in Perl 6. Then $str[3] and @str[3] will be clearly distinct things, and $str[3..7] could be the equivalent of substr($str, 3, 5). I think that would be much more usable, as I often know the start and end positions and have to perform (unintuitive) math to translate them into a start/length pair. I quite like the idea of $str[3,5,6,7] = $str[5,3,7,6]; as a similar concept to an array slice operation on a string, but I can see problems with it.
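Until something like that exists, the two ideas can be approximated with ordinary subs. This is only a sketch: char_range and assign_chars are hypothetical names, and both work on a copy of the string rather than modifying it in place.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $str[$first..$last]: takes inclusive start and
# end positions and does the start/length arithmetic in one place.
sub char_range {
    my ($str, $first, $last) = @_;
    return substr($str, $first, $last - $first + 1);
}

# Hypothetical stand-in for $str[@to] = $str[@from]: all source chars are
# read before any are written, as with an array slice assignment.
sub assign_chars {
    my ($str, $to, $from) = @_;
    my @chars = map { substr($str, $_, 1) } @$from;
    substr($str, $to->[$_], 1, $chars[$_]) for 0 .. $#$to;
    return $str;
}

print char_range('abcdefghij', 3, 7), "\n";                       # defgh
print assign_chars('abcdefgh', [3,5,6,7], [5,3,7,6]), "\n";       # abcfedhg
```

One of the problems hinted at above shows up straight away: the sketch silently assumes both position lists have the same length.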
The nice thing about not being Mr. Wall et al, is that you can have such random, speculative dreams without having to consider all the details:).
Examine what is said, not who speaks.
The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.