PerlMonks  

Re: Loading a part of the file to array using Tie::File

by ikegami (Patriarch)
on Nov 22, 2017 at 17:20 UTC


in reply to Loading a part of the file to array using Tie::File

First of all, you don't want Tie::File. You never want Tie::File. It can easily end up using more memory than just loading the entire file into memory (despite the goal of limiting memory usage), and it's orders of magnitude slower than the alternatives.

The following is a solution using tail as you suggested:

# '-|' runs the command and lets us read its output.
open(my $fh, '-|', "tail", "-n", "+$InStartLineNumber", $InLogFIlePath)
    or die("Can't tail \"$InLogFIlePath\": $!\n");

while (<$fh>) {
    ...
}

# close() waits for tail to finish and sets $? to its wait status.
close($fh);
if ( $? == -1  ) { die("Can't tail \"$InLogFIlePath\": $!\n"); }
if ( $? & 0x7F ) { die("Can't tail \"$InLogFIlePath\": Killed by signal ".( $? & 0x7F )."\n"); }
if ( $? >> 8   ) { die("Can't tail \"$InLogFIlePath\": Exited with error ".( $? >> 8 )."\n"); }
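The checks after close() decode the wait status that Perl leaves in $?. As a minimal sketch, the same decoding can be pulled into a helper; the name describe_wait_status is hypothetical and not part of the original post:

```perl
use strict;
use warnings;

# Hypothetical helper: turn a wait status (the value of $? after close() on
# a pipe, or after system()) into a short description.
# Returns undef when the status indicates success.
sub describe_wait_status {
    my ($status) = @_;

    # -1 must be tested first, because -1 & 0x7F is also non-zero.
    return "could not be started or waited for"    if $status == -1;

    # The low 7 bits hold the number of the signal that killed the child.
    return "killed by signal " . ($status & 0x7F)  if $status & 0x7F;

    # The exit code lives in the high byte.
    return "exited with error " . ($status >> 8)   if $status >> 8;

    return;
}
```

After close($fh), something like `my $problem = describe_wait_status($?); die("tail $problem\n") if defined($problem);` would replace the three separate checks.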

However, there's no reason to involve tail when you can easily do the same thing much more cleanly in Perl.

open(my $fh, '<', $InLogFIlePath)
    or die("Can't open \"$InLogFIlePath\": $!\n");

while (<$fh>) {
    next if $. < $InStartLineNumber;
    ...
}
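Since the original question was about loading part of a file into an array, here is a minimal self-contained sketch of the pure-Perl approach that does so. The sub name read_from_line and its parameters are hypothetical; chomping each line is an assumption about what the caller wants.

```perl
use strict;
use warnings;

# Hypothetical helper: return a reference to an array holding every line of
# $path from line $start_line (1-based) to the end of the file.
sub read_from_line {
    my ($path, $start_line) = @_;

    open(my $fh, '<', $path)
        or die("Can't open \"$path\": $!\n");

    my @lines;
    while (my $line = <$fh>) {
        # $. is the number of the line just read from $fh.
        next if $. < $start_line;
        chomp($line);
        push @lines, $line;
    }

    close($fh);
    return \@lines;
}
```

Only the wanted lines are kept in memory; the skipped lines are read and discarded one at a time.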

Re^2: Loading a part of the file to array using Tie::File
by karlgoethebier (Abbot) on Nov 23, 2017 at 10:49 UTC
    "..You never want Tie::File..."

    Wait:

    From the friendly manual:

    "...default memory limit is 2Mib ... about 310 bytes per cached record ... overhead..."

    Sure, a lot of overhead.

    I'm not so sure (or don't know) what bad things could happen.

    But I'm also sure that neither the author nor the maintainer is an idiot.

    And I have heard that there are files out in the wild > $my_ram. Let's say 20 GiB or so ;-)

    Best regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

    perl -MCrypt::CBC -E 'say Crypt::CBC->new(-key=>'kgb',-cipher=>"Blowfish")->decrypt_hex($ENV{KARL});'Help

      It's not the buffer/cache (which has a configurable size) that's the problem; it's the index. Its size is proportional to the highest line index encountered, and it can't be limited. For files with a small average line length (e.g. source code), the index uses more memory than the actual file. For example, if you read through a 20 GiB file using Tie::File, the index can end up using 20 GiB of memory (on top of the 2 MiB).
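To make the index argument concrete, here is a rough sketch of a per-line offset table of the kind described above. It is an illustration only, not Tie::File's actual code, and the sub name build_line_offsets is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return a reference to an array with the byte offset
# at which each line of $path starts. The array has one element per line,
# no matter how short the lines are.
sub build_line_offsets {
    my ($path) = @_;

    open(my $fh, '<', $path)
        or die("Can't open \"$path\": $!\n");

    my @offsets = (0);
    while (<$fh>) {
        # After reading a line, tell() is where the next line would start.
        push @offsets, tell($fh);
    }
    pop @offsets;   # the last entry is end-of-file, not the start of a line

    close($fh);
    return \@offsets;
}
```

Every element is a full Perl scalar, so for a file made of very short lines the table can plausibly take more memory than the file's own bytes, which is the effect described above.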

        Thanks ikegami.

        But this is a rigorous verdict which marks Tie::File as unusable and not recommendable, right?

        Or do you see any serious use cases for it?

        Best regards, Karl


        But as far as I remember, interpreting the results of Benchmark is problematic, and it doesn't tell us anything about memory usage, which was the basic theme IMHO. Best regards, Karl

        P.S.: Yes, I know Devel::NYTProf

