Indirect Object Syntax Tomfoolery

by larryk (Friar)
on Feb 13, 2002 at 12:55 UTC (#145152=perlquestion)
larryk has asked for the wisdom of the Perl Monks concerning the following question:

There is a module that lets me access a scalar variable as if it were a file, Tie::Handle::Scalar, but I can't seem to find one to do the opposite: access a file as if it were a scalar variable.

The reason I want to do this is that I was looking at YAPE::HTML, which wants a string containing HTML to parse, and I had an HTML file. If I could get YAPE::HTML to think it had a string while it was really accessing my file, then I wouldn't have to slurp the file in (not such a big problem for relatively small files, but Out of memory! errors lurk around every corner).

After a while in the CB I couldn't find a suitable wheel. I looked at perldoc perltie, but it won't let me override the core string functions. I decided the only way was to create an object and somehow force any string ops to use the object's methods. That is fine if I am writing the script, since I can say $pseudo_string->substr(0,10), but not so fine if I am trying to fool A::N::Other package into thinking it has a real scalar variable on which it can call substr($pseudo_string,0,10). I read the WARNING section in perldoc perlobj about indirect object syntax, and while my proposed solution does not suffer from the first problem:

    The first problem is that an indirect object is limited to a name, a
    scalar variable, or a block, because it would have to do too much
    lookahead otherwise...

the second most certainly applies:

    As if that weren't bad enough, think about this: Perl must guess *at
    compile time* whether "name" and "move" above are functions or methods.
    Usually Perl gets it right, but when it doesn't it, you get a function
    call compiled as a method, or vice versa.

I made a start (see code below), but the only way I can substr my $pseudo_string using indirect syntax is by fully qualifying the method name: Pseudo::Tie::Scalar::Handle::substr $pseudo_string, 10, 20. That defeats the purpose, because I would then have to modify A::N::Other package.

Am I barking up the wrong tree? Or just barking? How can I force the indirect syntax of A::N::Other package to call the $pseudo_string's methods instead of core functions?

    package Pseudo::Tie::Scalar::Handle;

    use strict;
    use warnings;
    use Fcntl;
    use IO::File;

    sub new {
        my($class,$file) = @_;
        my $fh = IO::File->new($file,O_RDWR|O_CREAT);
        return unless defined $fh;
        bless {
            file   => $file,
            handle => $fh,
        }, $class;
    }

    sub AUTOLOAD {
        no strict 'vars';
        warn "$AUTOLOAD not implemented for a Pseudo::Tie::Scalar::Handle object\n";
    }

    sub DESTROY {}

    sub substr ($$;$$) {
        my $self = shift;
        my $fh = $self->{handle};
        seek $fh, shift, 0;
        my $size = shift;
        my $chunk;
        if (defined $size) {
            warn "substr replacement not implemented\n" if @_;
            read $fh, $chunk, $size;
        }
        else {
            $chunk = do { local $/; <$fh> }
        }
        return $chunk;
    }

    1;

    package main;

    use strict;
    use warnings;

    my $pseudo_string = Pseudo::Tie::Scalar::Handle->new('/test.txt');
    print "\nnot-working: ",substr $pseudo_string, 0, 43;

perl -le "s,,reverse killer,e,y,rifle,lycra,,print"
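
For reference, the perltie route mentioned in the question can be sketched roughly as below. This is only an illustration under stated assumptions: the class name FileBackedScalar, the $fetches counter, and the temporary file are made up for the example and do not come from any module in this thread. A tied scalar can hand back a file's contents, but FETCH has to return the whole value before substr ever sees it, so the file is still slurped on every access.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

package FileBackedScalar;    # hypothetical class, for illustration only

our $fetches = 0;            # counts how often the file is read in full

sub TIESCALAR {
    my ($class, $path) = @_;
    return bless { path => $path }, $class;
}

# Called every time the tied scalar is read. It must return the complete
# value, so the whole file is loaded here.
sub FETCH {
    my ($self) = @_;
    $fetches++;
    open my $fh, '<', $self->{path} or die "open $self->{path}: $!";
    local $/;                # slurp mode
    my $content = <$fh>;
    close $fh;
    return $content;
}

# This sketch is read-only.
sub STORE { die "FileBackedScalar is read-only in this sketch\n" }

package main;

# Create a small throwaway file to stand in for the HTML file.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "<html><body>hello</body></html>\n";
close $tmp_fh;

tie my $html, 'FileBackedScalar', $tmp_path;

# The builtin substr triggers FETCH, then works on the in-memory copy.
my $head = substr $html, 0, 6;
print "$head\n";             # prints "<html>"
print "full reads so far: $fetches\n";
```

The tie interface only intercepts reading and writing the scalar as a whole; it has no hook for substr, index, or the regex engine, which is why it does not avoid the memory cost.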

Replies are listed 'Best First'.
Re (tilly) 1: Indirect Object Syntax Tomfoolery
by tilly (Archbishop) on Feb 13, 2002 at 13:57 UTC
    First of all, your approach won't work, since if Perl already knows about a function, it won't think that a call to it is meant to be a method call. (Perl already knows about its internal functions.)
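
A minimal sketch of that point, assuming a made-up class Demo::String that wraps an ordinary in-memory string (it is not the poster's file-backed class): the arrow form reaches the object's method, while the bare substr call is compiled as the builtin and operates on the stringified reference.

```perl
use strict;
use warnings;

package Demo::String;    # hypothetical class, for illustration only

sub new {
    my ($class, $text) = @_;
    return bless { text => $text }, $class;
}

# A method that happens to share its name with the builtin.
sub substr {
    my ($self, $offset, $length) = @_;
    return CORE::substr($self->{text}, $offset, $length);
}

package main;

my $obj = Demo::String->new('Hello, world');

# Explicit method call: dispatches to Demo::String::substr.
my $via_method = $obj->substr(0, 5);

# Looks like indirect object syntax, but Perl already knows substr as a
# builtin, so this is CORE::substr applied to the stringified reference,
# which looks like "Demo::String=HASH(0x...)".
my $via_builtin = substr $obj, 0, 5;

print "method:  $via_method\n";     # prints "method:  Hello"
print "builtin: $via_builtin\n";    # prints "builtin: Demo:"
```

Code in another package that writes substr($thing, 0, 5) is therefore bound to the builtin at compile time, no matter what class $thing is blessed into.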

    An approach with more chance of success is to try to override the core function as described in perlsub. But on experiment that doesn't seem to work.

    A major problem with your approach is that even if you can get substr to work, the RE engine assumes that it knows all about the low-level string representation, and uses that knowledge directly. I don't think you will easily be able to get native REs to transparently cross line boundaries. And this problem is not simple; Perl has at least one bug related to that. The following code will work in an eval, but not in a file, for exactly this reason:

    use strict; my %demo = (foo => "bar", );
    However all is not (yet) lost. Open source encourages code reuse. Usually you reuse wisely. Sometimes that means that you might just take the module, copy it, and then edit it to do what you want. Should you be able to find a way to do this configurably, then submit the patch back. If you don't, then you at least can get your job done. (The last time I used this approach was to create a version of that showed me the communication between LWP and the webserver...)

    If you can't figure out how to do that edit, well, you may be out of luck this time, but you probably just learned something useful about how overriding does(n't) work, and likely also learned from the source code. Knowledge isn't always the reward you want, but getting it isn't a bad consolation prize...

Re: Indirect Object Syntax Tomfoolery
by japhy (Canon) on Feb 13, 2002 at 15:04 UTC
    If I ever get back to work on another version of YAPE::HTML, it'll have stream-capacity, so you can send it a filehandle and it'll do its duty the right way.

    Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who could use a job
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

(Zaxo) Mmap Re: Indirect Object Syntax Tomfoolery
by Zaxo (Archbishop) on Feb 14, 2002 at 04:45 UTC

    Mmap allows you to treat a file as a scalar variable. There are caveats; from the pod:

           Mmap - uses mmap to map in a file as a perl variable

               use Mmap;
               mmap($foo, 0, PROT_READ, MAP_SHARED, FILEHANDLE) or die "mmap: $!";
               @tags = $foo =~ /<(.*?)>/g;
               munmap($foo) or die "munmap: $!";
               mmap($bar, 8192, PROT_READ|PROT_WRITE, MAP_SHARED, FILEHANDLE);
               substr($bar, 1024, 11) = "Hello world";

           The Mmap module lets you use mmap to map in a file as a perl variable
           rather than reading the file into dynamically allocated memory. It
           depends on your operating system supporting UNIX or POSIX.1b mmap, of
           course. You need to be careful how you use such a variable...

    After Compline,
