
Re^3: lhs substr(): refs vs. scalars

by BrowserUk (Patriarch)
on Oct 08, 2005 at 18:57 UTC


in reply to Re^2: lhs substr(): refs vs. scalars
in thread lhs substr(): refs vs. scalars

I'm the wrong person to ask, but I would assume so, as they (substr refs) have steadily been corrected and improved over the last few versions.

They have been available since before my time (5.6.1), but because of a bug in the implementation, only one lvalue ref could originally exist at a time for any given string in the program. This was fixed in 5.8.5.

The most useful use of them is processing fixed length record files where you allocate the input buffer and create an array of lvalue refs to the fields. You can now read or sysread subsequent records directly into the buffer overlaying the previous record, and the fields array now refers to the fields of the new record.
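A minimal sketch of that technique, under stated assumptions: the 12-byte record layout and the sample data are made up for illustration, and an in-memory filehandle stands in for a real file so the snippet is self-contained. It uses the `//` operator, so it needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical layout: three fields as [offset, length]; 12 bytes per record.
my @layout = ( [0, 4], [4, 6], [10, 2] );
my $reclen = 12;

# Made-up sample data; an in-memory handle stands in for a real file.
my $data = 'AAAAbbbbbbCC' . 'DDDDeeeeeeFF';
open my $fh, '<', \$data or die "open: $!";

# Allocate the buffer once and take one lvalue ref per field.
my $buf    = ' ' x $reclen;
my @fields = map { \ substr( $buf, $_->[0], $_->[1] ) } @layout;

my @seen;
while ( ( read( $fh, $buf, $reclen ) // 0 ) == $reclen ) {
    # No re-splitting here: the same refs now see the new record.
    push @seen, join '|', map { $$_ } @fields;
}
close $fh;

print "$_\n" for @seen;
```

The field refs are created once, before the loop; each `read` simply overwrites the buffer they point into.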

It saves re-divvying the buffer over and over for each record, which can save a good deal of memory (re)allocation when processing large files. Add a few seeks and you have an efficient and fairly cache-friendly way of doing in-place editing on huge, fixed-record-length files.
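The in-place editing variant might look roughly like the following. This is only a sketch: the 8-byte record (a 4-byte id plus a 4-byte status), the sample contents and the temp file are all assumptions made so the example can run on its own. The key points are that the replacement value has the same width as the field, and that a `seek` separates each switch between reading and writing.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical record: 4-byte id followed by 4-byte status (8 bytes total).
my $reclen = 8;

# Build a small made-up file to edit.
my ( $tmp, $path ) = tempfile( UNLINK => 1 );
binmode $tmp;
print {$tmp} '0001open', '0002open', '0003open';
close $tmp or die "close: $!";

open my $fh, '+<:raw', $path or die "open: $!";

my $buf = ' ' x $reclen;
my ( $id, $status ) = map { \ substr( $buf, $_->[0], $_->[1] ) } [0, 4], [4, 4];

while ( ( read( $fh, $buf, $reclen ) // 0 ) == $reclen ) {
    next unless $$id eq '0002';

    $$status = 'done';                  # same width, so the buffer keeps its size
    seek $fh, -$reclen, 1 or die "seek: $!";   # back to the start of this record
    print {$fh} $buf      or die "print: $!";  # overwrite it in place
    seek $fh, 0, 1        or die "seek: $!";   # needed before reading again
}
close $fh or die "close: $!";

# Read the file back to see the result.
open my $in, '<:raw', $path or die "open: $!";
my $result = do { local $/; <$in> };
close $in;

print "$result\n";
```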

Not that they're much in vogue these days, but they do have their uses :). I played with manipulating huge TIFF images like this one (Warning!!! 11,477 x 7,965 x 24 image, 204MB) directly on disk.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.

Re^4: lhs substr(): refs vs. scalars
by renodino (Curate) on Oct 08, 2005 at 19:41 UTC
    Actually, the segment refs may be valuable for me. I'm using a pool of fixed buffers that get populated with binary msgs whose headers have a fixed format. So if I grab refs to the header fields of interest when the buffers are created, my logic may be a bit simpler, and hopefully faster (tho I still need to pack/unpack(), so it may not be worth the effort).

    But still very good to know!
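
For illustration only, here is one way the header-field idea could be combined with pack/unpack. The header format (a 2-byte type followed by a 4-byte length, both big-endian), the 64-byte buffer size and the hash keys are assumptions, not anything described in the thread.

```perl
use strict;
use warnings;

# Hypothetical header: 2-byte message type, then 4-byte body length,
# both big-endian ("n" and "N" in pack terms).
my $buf = "\0" x 64;                       # one fixed buffer from the pool
my %hdr = (
    type => \ substr( $buf, 0, 2 ),
    len  => \ substr( $buf, 2, 4 ),
);

# Simulate a message arriving in the buffer. The assignment keeps the
# buffer's size unchanged, so the refs stay aligned with the header.
substr( $buf, 0, 6 ) = pack 'n N', 7, 1234;

my $type = unpack 'n', ${ $hdr{type} };
my $len  = unpack 'N', ${ $hdr{len} };

# Writing works the same way, as long as the packed value keeps its width.
${ $hdr{len} } = pack 'N', 99;
my @after = unpack 'n N', $buf;

print "type=$type len=$len after=@after\n";
```

The refs remove the offset bookkeeping, but the pack/unpack calls are still needed for each access, as noted above.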

Re^4: lhs substr(): refs vs. scalars
by ysth (Canon) on Oct 09, 2005 at 03:45 UTC
    Can you give a reference to the pre-5.8.5 bug? I recall there being a bug fixed where 3-arg substr misbehaved if the same call was used in both lvalue and non-lvalue mode, but that sounds like a different case than you describe.

      The "only one substr ref per string" bug I was referring to is demonstrated here using 5.6.1:

      P:\test>c:\perl561\bin\perl5.6.1.exe \bin\p1.pl
      perl> $bigScalar = 'the quick brown fox jumps over the laxy dog';;
      perl> @lvrefs = map{ \substr $bigScalar, $_->[0], $_->[1] } [0,3], [4,5], [10,5], [16,3], [20,5], [26,4], [31,3], [35,4], [40,3];;
      perl> print $$_ for @lvrefs;;
      dog dog dog dog dog dog dog dog dog

      With the work around being to use string eval to bypass the restriction:

      perl> $bigScalar = 'the quick brown fox jumps over the laxy dog';;
      perl> @lvrefs = map{ eval '\ substr $bigScalar, $_->[0], $_->[1]' } [0,3], [4,5], [10,5], [16,3], [20,5], [26,4], [31,3], [35,4], [40,3];;
      perl> print $$_ for @lvrefs;;
      the quick brown fox jumps over the laxy dog

      If I remember correctly, the original fix for this went in circa 5.8.3?, but there was another problem also, which I forget the details of, but I will try to remember/retrace my path.

      There is (IMO) still an existing problem with lvalue refs in 5.8.5, which I thought I raised a perlbug for, but I have lost track of the outcome. The problem is this:

      P:\test>c:\perl5.8.5\bin\perl5.8.5.exe \bin\p1.pl
      perl> $bigScalar = 'the quick brown fox jumps over the laxy dog';;
      perl> @lvrefs = map{ \substr $bigScalar, $_->[0], $_->[1] } [0,3], [4,5], [10,5], [16,3], [20,5], [26,4], [31,3], [35,4], [40,3];;
      perl> print $$_ for @lvrefs;;
      the quick brown fox jumps over the laxy dog
      perl> ${ $lvrefs[ 4 ] } = 'x';;
      perl> print $bigScalar;;
      the quick brown fox x over the laxy dog
      perl> ${ $lvrefs[ 4 ] } = 'xxxxxx';;
      perl> print $bigScalar;;
      the quick brown fox xxxxxxr the laxy dog
      perl> print $$_ for @lvrefs;;
      the quick brown fox xxxxx r th la y do

      Once an lvalue ref has been used in a way that shrinks or grows the target string, it appears to lose track of what part of the string it should now be referring to.

      I.e. when the lvref that points to substring 20,5 is assigned a single char, it continues to refer to a 5-char-long substr starting at pos 20. Hence, when a second (longer) assignment is done via that same lvref, instead of expanding the string (as it would the first time), it overlays a part of the string that it shouldn't.

      I believe it would be both possible, and more correct, for the lvrefs that are assigned to in such a way that they shrink or expand the original string to have their length value adjusted accordingly so that second and subsequent assignments would only overlay that part of the original string that remains after the first assignment.

      perl> $s = 'abcde';;
      perl> $lv = \ substr $s, 1, 3;;
      perl> print $$lv;;
      bcd
      perl> $$lv = '1234';;
      perl> print $$lv;;
      123
      ## The lvref still refers to a 3 char substring.
      ## This could be adjusted to reflect the new length.

      Ideally, any other lvrefs referring to that part of the string, or to later parts of the string that would also be affected by the change in the string's length, would also be adjusted. But I can see that this would require keeping track of all lvrefs that point to a given string and inspecting them all each time, which would be complex and costly.

      But I think adjusting the one assign through would be easy and useful.
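
Until the refs track length changes themselves, one way to sidestep the problem is to never change a field's width when assigning through its ref. The sketch below assumes a hypothetical helper, `set_field`, which is not part of Perl or of the code above; it pads or truncates the new value to the field's current width using pack's `A` format, so every other lvref into the same string stays aligned.

```perl
use strict;
use warnings;

# Hypothetical helper: assign through an lvalue ref without changing the
# width of the field. pack's "A" format space-pads short values and
# truncates long ones to the given width.
sub set_field {
    my ( $ref, $value ) = @_;
    my $width = length $$ref;
    $$ref = pack "A$width", $value;
    return;
}

my $big    = 'the quick brown fox jumps over the laxy dog';
my @lvrefs = map { \ substr( $big, $_->[0], $_->[1] ) }
    [0,3], [4,5], [10,5], [16,3], [20,5], [26,4], [31,3], [35,4], [40,3];

set_field( $lvrefs[4], 'x' );         # "jumps" becomes "x" plus four spaces
set_field( $lvrefs[4], 'xxxxxx' );    # truncated to "xxxxx"

print join( '|', map { $$_ } @lvrefs ), "\n";
```

The trade-off is that values are silently padded or cut to fit, which suits fixed-width records but not strings that are meant to grow.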


        "But I think adjusting the one assign through would be easy and useful."
        This is changed in bleadperl and will be in 5.10. In 5.8.x at some point a grammatically torturous paragraph was added to the end of the substr doc warning about some of these issues.
