Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Filehandle Filter

by btrott (Parson)
on Aug 11, 2000 at 02:56 UTC ( [id://27415]=CUFP: print w/replies, xml ) Need Help??

There was (brief) discussion yesterday about how to use tied filehandles. And I put up an example of using a tied filehandle to filter STDOUT, but I didn't really like it because, well, it ended up printing to STDERR. :)

So I came up with this, which is nicer, because at least it prints to the filehandle you've specified. But to do that I have to untie the filehandle before I print to it; otherwise I'll end up writing to my tied filehandle, which will write to my tied filehandle, and so on. Deep recursion. :)

So I've implemented this, which first untie's the filehandle, then tie's it back up again afterwards. Which is kind of ugly, particularly since I've had to turn off warnings to get rid of the "untie attempted while 1 inner references still exist" error. Which perhaps means that I'm leaking memory.

Any ideas on cleaning that bit up would be appreciated.

Anyway, usage:

use Filter::Handle; my $f = Filter::Handle->new(\*STDOUT); print "Foo"; print "Bar";

You don't have to use STDOUT, of course; that's the whole idea. Any filehandle will do.

package Filter::Handle; use strict; sub new { my $class = shift; my $fh = shift; tie *{$fh}, 'Filter::Handle::Tie', $fh; bless { fh => $fh }, $class; } sub DESTROY { my $self = shift; my $fh = $self->{fh}; undef $self; { local $^W = 0; untie *{$fh} } } package Filter::Handle::Tie; use vars qw/@ISA/; use Tie::Handle; @ISA = qw/Tie::Handle/; sub TIEHANDLE { my $class = shift; my $fh = shift; bless { fh => $fh }, $class; } sub PRINT { my $self = shift; my $fh = $self->{fh}; my($file, $line) = (caller)[1,2]; { local $^W = 0; untie *{$fh} } print $fh sprintf "%s:%d - %s\n", $file, $line, join ' ', @_; tie *{$fh}, __PACKAGE__, $fh; } 1;

Replies are listed 'Best First'.
RE: Filehandle Filter
by chromatic (Archbishop) on Aug 11, 2000 at 04:10 UTC
    I like this, a lot. It could be a CPAN module if someone (push me and I'll do it) provided a mechanism where you could specify the output. Here's how I got around the tie-untie reference problem:
    package Filter::Handle::Tie; use vars qw/@ISA/; use Tie::Handle; @ISA = qw/Tie::Handle/; sub TIEHANDLE { my $class = shift; my $fh = shift; bless { fh => $fh }, $class; } sub PRINT { my $self = shift; my $fh = *{ $self->{fh} }; my($file, $line) = (caller)[1,2]; print $fh sprintf "%s:%d - %s\n", $file, $line, join ' ', @_; }
    This avoids the deep recursion by not printing to a tied filehandle -- it's copied by dereference before printing to it.

    Maybe users could pass in a subref with stuff they want to print instead. Interesting idea.

      Well since you beat me to the original, here is an implementation that by default does what btrott's original display did, but can do anything you want. For giggles and grins I have implemented new and PRINTF using two different idioms:
      package Filter::Handle; use strict; use Carp; sub PRINT { my $self = shift; my $fh = $self->{fh}; print $fh $self->{disp}->(@_); } sub PRINTF { my $self = shift; @_ = ($self, sprintf(shift, @_)); goto &PRINT; } sub TIEHANDLE { my $class = shift; my $fh = shift or croak("Need a filehandle to tie to."); my $disp = shift || sub { my ($file, $line) = (caller(1))[1,2]; sprintf("%s:%d - %s\n", $file, $line, "@_"); }; return bless ({fh => $fh, disp => $disp}, $class); } # An OO interface for free! :-) *new = *TIEHANDLE; *print = *PRINT; *printf = *PRINTF; 1;
      (And yes, chip this is for you because of RE: RE: Shot myself in the foot with a pos. Who said that goto was useless? :-)

      EDIT
      Chip made 2 very good style suggestions on the anon sub, and I am glad to have made both of them.

      EDIT 2
      In addition to the suggestions chip made, I had removed my other return. At the time I wondered if I should, and tye's misgivings confirmed my misgivings. It belongs, it really does.

      Note that this could be more compact still. For instance the anon sub could be made even shorter by removing the temporary variable. But that would not optimize my ability to understand my code. :-)

      EDIT 3
      OK, I got tired of the temporary variable in PRINTF that was clearly not needed. I have not (yet) convinced myself that it would be a good thing to drop the two in the anon sub. I also added a print and printf functions as syntactic sugar to make the OO interface a bit nicer. Oh, and I added a comment. :-)

      For those who do not understand tie, this package has two distinct interfaces. The OO one works like this:

      my $out = new Filter::Handle(\*STDOUT); $out->print("Hello world\n"); $out->printf("%s %s\n", "Hello", "world");
      The tied interface like this:
      tie (*OUT, 'Filter::Handle', \*STDOUT); select(OUT); print "Hello world\n"; printf ("%s %s\n", "Hello", "world");
      The apparently shocking similarity in the implementations drives home the fact that tie is nothing more than syntactic sugar to allow you to think about an object which happens to provide the right methods as a native Perl datatype. :-)

      BTW my absolutely favorite part of this code is the DESTROY method. Stuff like that is what makes Perl great! :-)

        I love it! :-)

        My favorite bit is *new = *TIEHANDLE. That's just inspired.

        I have to deduct a style point for unnecessary use of the return operator, and the explicit call of join when "@_" would be both shorter and clearer (IMO).

        But to repeat: This is a neat hack. Thanks.

            -- Chip Salzenberg, Free-Floating Agent of Chaos

      Ooh. Thanks so much for that trick. Very nice.

      How's this for the output functionality?

      package Filter::Handle; use strict; use vars qw/@ISA/; use Tie::Handle; @ISA = qw/Tie::Handle/; sub TIEHANDLE { my $class = shift; bless { @_ }, $class; } sub new { my $class = shift; my $fh = shift; tie *{$fh}, __PACKAGE__, fh => $fh, @_; bless { fh => $fh }, $class; } sub DESTROY { my $self = shift; my $fh = $self->{fh}; { local $^W = 0; untie *{$fh} } } sub PRINT { my $self = shift; my $fh = *{ $self->{fh} }; die "No output handler installed" unless defined $self->{output}; print $fh $self->{output}->(@_); } sub CLOSE { }
      Now to be used like this:
      my $f = Filter::Handle->new(\*STDOUT, output => sub { my($file, $line) = (caller(1))[1,2]; return sprintf "%s:%d - %s\n", $file, $line, join ' ', @_; } ); print "Foo";
      I think this might be nice to put on CPAN. I've just sent in my registration for a user ID. So... thanks very much for that idea!
        Okay, I got rid of the hash and used blessed typeglobs, just because I'm that kinda guy and I want the approval of heroes like chip and merlyn and Damian.
        package Filter::Handle; use strict; use vars qw/@ISA/; use Tie::Handle; @ISA = qw/Tie::Handle/; sub TIEHANDLE { my $class = shift; my $fh = shift; bless $fh, $class; } sub new { my $class = shift; my $fh = shift; my $sub = shift; *fh = $sub; tie *{$fh}, __PACKAGE__, $fh; bless $fh, $class; } sub DESTROY { my $self = shift; my $fh = *{ $self }; { local $^W = 0; untie *{$fh} } } sub PRINT { my $self = shift; my $fh = *$self; my $code = *fh{CODE}; die "No output handler installed" unless defined $code; print $fh $code->(@_); } sub CLOSE { }
        You can pass an anonymous sub or a sub ref to the constructor:
        #!/usr/bin/perl -w use strict; use Filter::Handle; local *OUTPUT; open(OUTPUT, ">handle.txt") || die "Can't open: $!"; my $fh = \*OUTPUT; my $f = Filter::Handle->new($fh, sub { my($file, $line) = (caller(1))[1,2]; return sprintf "%s:%d - %s\n", $file, $line, @_; }); print OUTPUT "Foo"; print OUTPUT "Bar";
        Yeah, that's pretty evil. But hey, why not?
RE: Filehandle Filter
by btrott (Parson) on Aug 11, 2000 at 08:51 UTC
    All right, here's a new version. This one has support for filtering and unfiltering filehandles by calling functions on them, as opposed to the object going out of scope method.

    Usage below.

    package Filter::Handle; use strict; sub import { my $class = shift; return if !@_; my $caller = caller; if ($_[0] eq "subs") { no strict 'refs'; for my $sub (qw/Filter UnFilter/) { *{"${caller}::$sub"} = \&{"${class}::$sub"}; } } } sub Filter { my $fh = $_[0]; tie *{ $fh }, __PACKAGE__, @_; } sub UnFilter { my $fh = shift; { local $^W = 0; untie *{ $fh } } } sub TIEHANDLE { my $class = shift; my $fh = shift or die "Need a filehandle."; my $output = shift || sub { my($file, $line) = (caller(1))[1,2]; sprintf "%s:%d - %s\n", $file, $line, "@_" }; bless { fh => $fh, output => $output }, $class; } sub new { Filter(@_[1..$#_]); bless { fh => $_[1] }, $_[0] } sub DESTROY { my $self = shift; UnFilter($self->{fh}); } sub PRINT { my $self = shift; my $fh = *{ $self->{fh} }; ## thanks, chromatic :) print $fh $self->{output}->(@_); } sub PRINTF { my $self = shift; my $fmt = shift; @_ = ($self, sprintf $fmt, @_); goto &PRINT; ## thanks, tilly :) } 1;
    Usage is either what we had before:
    my $f = Filter::Handle->new(\*STDOUT, sub { "Foo: @_\n" }); print "Bar";
    Or the new
    use Filter::Handle qw/subs/; Filter \*STDOUT, sub { "Foo: @_\n" }; print "Bar"; UnFilter \*STDOUT;
      Here's my POD documentation for this module (which is basically what the latest version looks like):
      =head1 NAME Filter::Handle - Apply filters to output filehandles =head1 SYNOPSIS use Filter::Handle; my $f = Filter::Handle->new(\*STDOUT); use Filter::Handle qw/subs/; Filter \*STDOUT; ... UnFilter \*STDOUT; =head1 DESCRIPTION I<Filter::Handle> allows you to apply arbitrary filters to output filehandles. You can perform any sorts of transformations on the outgoing text: you can prepend it with some data, you can replace all instances of one word with another, etc. You can even filter all of your output to one filehandle and send it to another; for example, you can filter everything written to STDOUT and write it instead to another filehandle. To do this, you need to explicitly use the I<tie> interface (see below). =head2 Calling Interfaces There are three interfaces to filtering a handle: =over 4 =item * Functional use Filter::Handle qw/subs/; Filter \*STDOUT; print "I am filtered text"; UnFilter \*STDOUT; print "I am normal text"; The functional interface works by exporting two functions into the caller's namespace: I<Filter> and I<UnFilter>. To start filtering a filehandle, call the I<Filter> function; to stop, call I<UnFilter> on that same filehandle. Any writes between the time you start and stop filtering will be filtered. =item * Object-Oriented use Filter::Handle; { my $f = Filter::Handle->new(\*STDOUT); print "I am filtered text"; } print "I am normal text"; The object-oriented interface works by filtering the filehandle while your object is in scope. Once all references to that object have gone out of scope--typically, this is after your one reference has gone away--the filehandle will no longer be filtered. =item * Tie Interface use Filter::Handle; local *HANDLE; tie *STDOUT, 'Filter::Handle', \*HANDLE; print "I am filtered text written to HANDLE"; untie *STDOUT; The I<tie> interface will filter your filehandle until you explicitly I<untie> it. This is the only interface that allows you to filter one filehandle through another. The above example will filter all writes to STDOUT through the output filter, then write it out on HANDLE. Note that this is different behavior than that of the first two interfaces; if you want your output written to the same handle that you're filtering, you could use: tie *STDOUT, 'Filter::Handle', \*STDOUT; Which is exactly what the first two interfaces do. =back =head2 Customized Filters The default filter is relatively boring: it simply prepends any text you print with the filename and line of the invoking caller. You'll probably want to do something more interesting. To do so, pass an anonymous subroutine as a second argument to either the I<new> method, if you're using the OO interface, or to the I<Filter> function, if you're using the functional interface. Your subroutine will be passed the list originally passed to print, and it should return another list, suitable for passing to your (unfiltered) output filehandle. For example, say that we want to replace all instances of "blue" with "red". We could say: use Filter::Handle qw/subs/; Filter \*STDOUT, sub { local $_ = "@_"; s/blue/red/g; $_ }; print "My house is blue.\n"; print "So is my cat, whose nose is blue.\n"; UnFilter \*STDOUT; print "And the plane is also blue.\n"; This prints: My house is red. So is my cat, whose nose is red. And the plane is also blue. As expected. =head1 CAVEATS Note that this won't work correctly with output from XSUBs or system calls. This is due to a limitation of Perl's I<tie> mechanism when tying filehandles. =head1 AUTHOR Benjamin Trott, ben@rhumba.pair.com =head1 CREDITS Thanks to tilly, chromatic, Adam, and merlyn at PerlMonks.org for suggestions, critiques, and code samples. =cut
        Excellent! And for those who look at the code and are totally lost, here is a useful hint on how tie works. All that tie does is allow an object in a class that defines the right methods to look like a native Perl datatype. The only thing you have to do is make sure that you are providing the OO interface that Perl is looking for.

        To find out what methods are part of the interface that Perl knows to look for type "perldoc -f tie". Note that the documentation here of tie is somewhat misleading since it documents the limitations of tie that were in Perl 5.003. For instance you probably can create a full tied interface to an array.

        It feels strange the first few times you create an implementation of a tied class, but it really is not very hard and it is an excellent example of how encapsulation can lead to good things later. :-)

      Hey now, 21 years after this inspired module was added, I just wanted to note a small bug in the Filter subroutine (in package Filter::Handle). It should be:
      sub Filter { my $fh = shift; tie *{ $fh }, __PACKAGE__, @_; }
      Note how the subroutine UnFilter uses the shift correctly. Also, it's a shame that this is no longer on CPAN. But this chunk of module is enough to put into one's "private" library for use. Using a CODE reference, you can duplicate all your STDOUT and STDERR (including any that comes out of perl warnings), like the following:
      select STDERR; $|=1; select STDOUT; $|=1; use FileHandle; my $LOG_FH = new FileHandle($logfile, "w"); $LOG_FH->autoflush; open(DUPOUT, ">&STDOUT") or die "Couldn't dup STDOUT: $!\n"; open(DUPERR, ">&STDERR") or die "Couldn't dup STDERR: $!\n"; use Filter::Handle qw/subs/; our $FILTER_STDOUT = sub { local $_ = "@_"; print DUPOUT $_; sprintf "[STDOUT]: %s", "@_" if (defined $LOG_FH && $LOG_FH->opened); }; our $FILTER_STDERR = sub { local $_ = "@_"; print DUPERR $_; sprintf "[STDERR]: %s", "@_" if (defined $LOG_FH && $LOG_FH->opened); }; ## Call Filter to tie the filehandles ## Call UnFilter to untie the filehandles (don't care) Filter \*STDOUT, $LOG_FH, $FILTER_STDOUT; Filter \*STDERR, $LOG_FH, $FILTER_STDERR;
      I like this much better than IO::Tee, because I don't need a custom filehandle to print to to get a logfile of all output. I'm continuing to play around with this and may update this thread more later.
        What version of Perl are you using? The original usage example
        use Filter::Handle qw/subs/; Filter \*STDOUT, sub { "Foo: @_\n" }; print "Bar"; UnFilter \*STDOUT;
        works for me correctly in 5.6.2, but fails in 5.10.1 with
        Deep recursion on subroutine "Filter::Handle::PRINT" at .../lib/Filter +/Handle.pm line 51. Segmentation fault (core dumped)

        When I replace $_[0] with shift, it fails in all the versions from 5.6.2 to blead:

        Not a GLOB reference at .../lib/Filter/Handle.pm line 50. (in cleanup) Not a GLOB reference at .../lib/Filter/Handle.pm line + 25 during global destruction.

        > it's a shame that this is no longer on CPAN

        See Text::OutputFilter.

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      That module is really cool. I wrote the following filter to use with it to generate line numbers:
      use Filter::Handle qw/subs/; use strict; { my( $i, $n ) = (0,1); # Scoped, only the filter sees them Filter \*STDOUT, sub { @_ = @_; # Laziness my $string = $n ? "Line ".++$i.": " : ""; for(@_){ s/\n(.)/"\nLine ".++$i.": $1"/egm; $string .= $_ } $n = $_[$#_] =~ /\n$/; return $string; } } # A quick test: print "line 1\n"; print 'line 2', ' line 2 cont.'; print ' more stuff for line 2', "\n"; print "this is line 3\nand this is line 4"; print "\nthis is line 5"; __END__ # And yes, this prints: Line 1: line 1 Line 2: line 2 line 2 cont. more stuff for line 2 Line 3: this is line 3 Line 4: and this is line 4 Line 5: this is line 5
RE: Filehandle Filter
by btrott (Parson) on Aug 22, 2000 at 19:55 UTC
    For all interested: this module is now on CPAN as the module Filter::Handle. It's technically now at version 0.03, but that's just because I did every little thing possible to move up the version number. :) The OO interface has changed, slightly, as well. But anyway... enjoy, if you want to.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: CUFP [id://27415]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others having an uproarious good time at the Monastery: (4)
As of 2024-03-19 11:42 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found