http://www.perlmonks.org?node_id=586864


in reply to Top 10 reasons to start using ack

I wrote diotalevi's grep. Its one distinguishing feature is that it automatically looks inside tarballs and zip files, which is something I need all the time. It does a dirt-simple check of the file extension and opens .tgz/.zip/.whatever using the streaming abilities of archive extraction programs. The code isn't sexy and it isn't even good, but it does a good enough job. It'd be possible to do a much better job given a few minutes and the interest. I'd use ack immediately if it got this feature.

    use IPC::Open3;

    sub open_file_harder {
        my ($filename) = @_;
        return if not defined $filename;

        # Only the last extension is captured, so foo.tar.gz is seen as ".gz".
        if ( my ($extension) = $filename =~ /(\.[^.]+)\z/mx ) {

            # Each entry: extension pattern, then the command that streams
            # the decompressed contents to STDOUT.
            my @readers = (
                [ qr/\.t(?:ar\.)?gz\z/ => qw( gzcat ),    $filename ],
                [ qr/\.zip\z/          => qw( unzip -p ), $filename ],
                [ qr/\.Z\z/            => qw( zcat ),     $filename ],
                [ qr/\.gz\z/           => qw( gzcat ),    $filename ],
                [ qr/\.bz2\z/          => qw( bzcat ),    $filename ],
            );

            for my $reader (@readers) {
                my ( $pattern, @command ) = @{$reader};
                if ( $extension =~ $pattern ) {

                    # Read the child's output through $fh.
                    open3( undef, my $fh, undef, @command );
                    return $fh;
                }
            }
        }

        # Not a recognized archive: open it as a plain file.
        open my $fh, '<', $filename
            or die "Couldn't open $filename: $!";
        return $fh;
    }

⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

Re^2: Top 10 reasons to start using ack
by petdance (Parson) on Nov 30, 2006 at 05:28 UTC
    A couple of thoughts:

    * I wouldn't have it on by default. It would only be via the -A,--archive switch, for example.

    * How would I show the resultant filename? If the file is in a tarball, what do I show for the filename?

    xoxo,
    Andy

      It'd be sane to treat these containers like directories; a directory is just another kind of container. Given a tarball baz.tgz in /foo/bar containing the files a and b/c, you'd tell the user about matches in the paths /foo/bar/baz.tgz/a and /foo/bar/baz.tgz/b/c. I think Windows "Compressed Folders" work like this.
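
A rough sketch of that naming scheme, assuming Archive::Tar (with IO::Zlib for gzipped tarballs) is available. It walks the archive one member at a time in memory, so nothing is unpacked to disk. The helper name grep_tarball and the path:line:text output format are made up for illustration; they are not part of ack or of my grep.

```perl
use strict;
use warnings;
use Archive::Tar;

# Hypothetical helper: search every regular file inside a tarball for
# $pattern and report matches under a directory-style path such as
# /foo/bar/baz.tgz/b/c.
sub grep_tarball {
    my ( $archive, $pattern ) = @_;
    my @matches;

    # iter() returns a closure that yields one archive member per call
    # instead of loading the whole tarball; the true second argument
    # asks for a decompressing read.
    my $next = Archive::Tar->iter( $archive, 1 );
    while ( my $member = $next->() ) {
        next if not $member->is_file;

        # Displayed name: archive path, then the member's path inside it.
        my $display = "$archive/" . $member->full_path;

        my $content = $member->get_content;
        $content = '' if not defined $content;

        my $line_no = 0;
        for my $line ( split /^/m, $content ) {
            $line_no++;
            chomp $line;
            push @matches, "$display:$line_no:$line" if $line =~ $pattern;
        }
    }
    return @matches;
}
```

Calling grep_tarball( '/foo/bar/baz.tgz', qr/needle/ ) would return one string per matching line, each prefixed with a path like /foo/bar/baz.tgz/a.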

      ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

        I could see using "foo.gz" as the filename for a standalone file, using your streaming method.

        For tarballs, though, I would have to extract to a temporary directory and delve through that, and that means write privileges that I don't want to assume.

        xoxo,
        Andy