in reply to Possible to have regexes act on file directly (not in memory)

I haven't tried this myself, but my first thought is whether there's any way to get perl to mmap a file. Check CPAN for modules that let you mmap a file into memory. The file's contents are still brought into memory in chunks (the OS pages them in on demand), but the process is much more transparent to your code.

Of course, there's the other side of the coin: you need better structured data :)

Re^2: Possible to have regexes act on file directly (not in memory)
by karlgoethebier (Abbot) on May 02, 2014 at 18:41 UTC

    Is this the :mmap layer described in the Custom Layers section of PerlIO?

    Regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

      From a cursory glance at the documentation, no. That layer still treats the memory map as a filehandle. You need to treat it as a string or scalar. A quick check on CPAN gives Sys::Mmap as a reasonable candidate, based on its documentation.
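To illustrate the distinction, here is a minimal sketch of reading through the :mmap PerlIO layer. The helper name `grep_mmap_layer` is hypothetical (it is not from this thread), and the sketch assumes a perl built with the :mmap layer available. The point is that the map is only reachable through `<$fh>`, so a regex can be applied per line but not to the file as a single scalar.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the lines of $path that
# match $re, reading the file through the :mmap PerlIO layer.
sub grep_mmap_layer {
    my ( $path, $re ) = @_;

    # The layer maps the file behind the scenes, but what we get back is
    # still an ordinary filehandle, not a string.
    open my $fh, '<:mmap', $path or die "open $path: $!";

    my @hits;
    while ( my $line = <$fh> ) {
        chomp $line;
        push @hits, $line if $line =~ $re;
    }
    close $fh;
    return @hits;
}

# Usage, assuming the same word list used elsewhere in this thread exists.
my $words = '/usr/share/dict/words';
if ( -r $words ) {
    print "$_\n" for grep_mmap_layer( $words, qr/^wal/ );
}
```

A pattern that needs to span several lines cannot be written this way, which is why a module that exposes the mapping as a scalar is the better fit for the original question.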

        The winner seems to be File::Map. It is less complicated to use, and at least it behaves in accordance with its documentation (unlike Sys::Mmap). ++ to leont once again.

        use strict;
        use warnings;
        use File::Map 'map_file';

        map_file my $map, '/usr/share/dict/words', '<';

        while ( $map =~ m/^(wal.*)$/mg ) {
            print "$1\n";
        }

        Though its user interface is fairly simple, it is a module where reading 100% of the documentation is probably a very good idea.


        In principle this seems to work. Here's an example:

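        use strict;
        use warnings;
        use Sys::Mmap;

        my $string;
        open my $fh, '<', '/usr/share/dict/words' or die $!;

        # Sys::Mmap documents the order as (VARIABLE, LENGTH, PROTECTION, FLAGS, FILEHANDLE),
        # so PROT_READ goes before MAP_SHARED. A length of 0 maps the whole file.
        Sys::Mmap::mmap( $string, 0, PROT_READ, MAP_SHARED, $fh ) or die "mmap: $!";

        while ( $string =~ m/^(wal.*)$/mg ) {
            print $1, "\n";
        }

        Sys::Mmap::munmap( $string ) or die "munmap: $!";
        close $fh;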

        On my system I get a dump of everything in the dict/words file starting with 'wal'. However, I end up getting the following error message when munmap is called:

        variable is not a string at ./ line 17.

        The XS code testing whether the variable being unmapped is a string is this:

        if(SvTYPE(var) != SVt_PV) {
            croak("variable is not a string");
            return;
        }

        ...which doesn't strike me as wrong. Must be something in the XS for 'mmap' that is not happening properly.
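One possible explanation, offered as an assumption and not confirmed anywhere in this thread: a scalar-context `m//g` match records `pos()` on the target scalar using magic, and attaching magic upgrades the scalar's internal type beyond plain `SVt_PV`. If that is what happens here, the strict `!=` test would fail after the loop regardless of what mmap did. The sketch below uses the core `B` module on an ordinary string (no Sys::Mmap involved) to show the internal type before and after such a loop; `sv_class` is a hypothetical helper name.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: report the B:: class that mirrors a scalar's
# internal SV type (for example B::PV for SVt_PV).
sub sv_class {
    my ($ref) = @_;
    return ref B::svref_2object($ref);
}

my $string = "walnut\nwalrus\n";
my $before = sv_class( \$string );

# Same style of loop as in the thread: scalar-context /g matching, which
# stores pos() on $string between iterations.
while ( $string =~ m/^(wal.*)$/mg ) { }

my $after = sv_class( \$string );
print "before: $before, after: $after\n";
```

If the type does change on your perl, unmapping before running any `/g` match, or matching against a copy, would be ways to test whether this is the cause.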


      I have always been intimidated by Perl's I/O layer concept. The documentation is sparse at best. Perhaps it is time for another look.