Re^5: Loading a part of the file to array using Tie::File

by haukex (Bishop)
on Nov 24, 2017 at 19:10 UTC


in reply to Re^4: Loading a part of the file to array using Tie::File
in thread Loading a part of the file to array using Tie::File

Or do you see any serious use cases for it?

Dunno, I think that for random access of small files (say, maybe, under a megabyte) in situations where performance is not critical, the ease of implementation can still outweigh the cost. On the other hand, at least in my experience such files are rare. For example, when inserting lines somewhere, one usually has to scan the file to locate the insertion point anyway, so in such cases a while(<>) loop would still feel more natural to me than a linear search in an array. It's maybe a nice module to show off some of the power (Update: as in expressiveness / TIMTOWTDI, not speed) of Perl to newcomers, although that risks the problem of "if all you have is a hammer, everything looks like a nail".
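To illustrate the "scan to find the insertion point" case, here is a minimal sketch of a streaming insert. The helper name insert_after_match and its arguments are made up for this example, and it assumes every input line ends in a newline:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): copy lines from $in to $out,
# adding $new_line right after the first line that matches $re.
# Assumes every input line is newline-terminated.
sub insert_after_match {
    my ($in, $out, $re, $new_line) = @_;
    my $done = 0;
    while ( my $line = <$in> ) {
        print {$out} $line;
        if ( !$done && $line =~ $re ) {
            print {$out} $new_line, "\n";
            $done = 1;
        }
    }
    return $done;    # true if the insertion point was found
}
```

Only one line is held in memory at a time, and the caller decides where the output goes (for instance a temporary file that is renamed over the original afterwards).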

Of course ikegami has a good point. There is a huge difference between reading even a ~1MB file into an array, vs. reading it with Tie::File:

$ cp -L /usr/share/dict/words /tmp/test.txt
$ wc -l /tmp/test.txt
99132 /tmp/test.txt
$ du -sh /tmp/test.txt
920K /tmp/test.txt
$ time perl -MTie::File -e 'open F, "/tmp/test.txt" or die; print `ps -orss $$`; my @x = <F>; print `ps -orss $$`'
RSS
7408
RSS
23604

real 0m0.042s
user 0m0.024s
sys 0m0.012s
$ time perl -MTie::File -e 'tie my @a, "Tie::File", "/tmp/test.txt"; print `ps -orss $$`; $a=$_ for @a; print `ps -orss $$`'
RSS
7612
RSS
55024

real 0m1.001s
user 0m0.908s
sys 0m0.088s

Re^6: Loading a part of the file to array using Tie::File
by ikegami (Pope) on Nov 24, 2017 at 21:52 UTC

    Dunno, I think that for random access of small files (say, maybe, under a megabyte) in situations where performance is not critical, the ease of implementation can still outweigh the cost

    Except it's just as easy to load the file into memory and write it back out when it's that small.

      Except it's just as easy to load the file into memory and write it back out when it's that small.

      By "ease of implementation" I meant:

      open my $ifh, '<', $file or die $!;
      chomp( my @a = <$ifh> );
      close $ifh;
      # vs.
      use Tie::File;
      tie my @a, 'Tie::File', $file;

      # and

      open my $ofh, '>', $file or die $!;
      print $ofh $_,"\n" for @a;
      close $ofh;
      # vs.
      untie @a;

      I think several arguments, some subjective, could be made in either direction as to which one is "easier". Whether it's worth the cost (speed and memory penalty) depends on the application, and of course the user must be aware of the penalty in the first place, so it's definitely good to caution. I just personally wouldn't reject the module outright.

      Granted, if it's about reducing the length of the code, it's also possible to do something like this, although it still has a speed penalty, just not quite as bad as Tie::File:

      use Path::Class qw/file/;
      my @a = file($file)->slurp(chomp=>1);
      ...
      file($file)->spew_lines(\@a);

      A while(<>) should still be faster than any of the above. (I ran some quick back-of-the-envelope benchmarks with /usr/share/dict/words.)
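For anyone who wants to repeat that kind of comparison, a rough sketch using the core Benchmark module might look like the following. The three sub names are made up for this example, each one just counts the lines it sees, and this is not the exact code that was timed above:

```perl
use strict;
use warnings;
use Tie::File;
use Benchmark qw(cmpthese);

# Slurp the whole file into an array, then walk the array.
sub count_slurp {
    my ($file) = @_;
    open my $fh, '<', $file or die "$file: $!";
    chomp( my @lines = <$fh> );
    close $fh;
    my $n = 0;
    $n++ for @lines;
    return $n;
}

# Walk the file through a Tie::File array.
sub count_tied {
    my ($file) = @_;
    tie my @lines, 'Tie::File', $file or die "$file: $!";
    my $n = 0;
    $n++ for @lines;
    untie @lines;
    return $n;
}

# Read line by line without keeping anything in memory.
sub count_stream {
    my ($file) = @_;
    open my $fh, '<', $file or die "$file: $!";
    my $n = 0;
    $n++ while <$fh>;
    close $fh;
    return $n;
}

# Only benchmark when run as a script with a file name argument.
if ( !caller && @ARGV ) {
    my $file = shift @ARGV;
    cmpthese( -1, {
        slurp  => sub { count_slurp($file) },
        tied   => sub { count_tied($file) },
        stream => sub { count_stream($file) },
    } );
}
```

Run it with a file name as the only argument, e.g. the copy of /usr/share/dict/words used earlier. The numbers will vary by machine and Perl version, so treat them as a relative comparison only.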

      Karl asked about "serious" use cases. Under the conditions I named, I'd say it's just a matter of TIMTOWTDI, but if you were to say that's not a "serious" use case, then you wouldn't be wrong. I think I'll be cautioning against Tie::File more in the future.

        The fact that you can't even use Tie::File correctly says everything: The non-Tie implementation is clearly the easier of those two implementations.

        Aside from the fact that your Tie::File code only saved you one line (after getting rid of those closes) and that it replaced well-known operations with obscure ones, you forgot to throw an error for non-existing files in your Tie::File solution, and you hid the fact that you have to use a bunch of encodes and decodes with Tie::File while the alternative automatically uses the one encoding you set using use open.
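To make the missing error check concrete, here is a minimal sketch of tying a file so that a non-existing file is an error. By default Tie::File opens with O_RDWR | O_CREAT and so creates the file; passing mode => O_RDWR turns that off. The helper name tie_existing_file is made up for this example:

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR);
use Tie::File;

# Hypothetical helper (not from the thread): tie $file to an array and
# return a reference to it, dying if the file cannot be opened.
sub tie_existing_file {
    my ($file) = @_;
    # O_RDWR without O_CREAT: fail instead of silently creating the file.
    tie my @lines, 'Tie::File', $file, mode => O_RDWR
        or die "Cannot tie $file: $!";
    return \@lines;
}
```

The encoding side is not handled here: as noted above, the records would still need explicit Encode::decode / Encode::encode calls around each read and write, which is where the extra code comes from.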

Re^6: Loading a part of the file to array using Tie::File
by karlgoethebier (Abbot) on Nov 25, 2017 at 19:42 UTC

    Thank you very much for the advice, haukex. But to be honest: I struggle a bit with what you benchmarked. I guess I need to take a closer look at your results ;-) Best regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

    perl -MCrypt::CBC -E 'say Crypt::CBC->new(-key=>'kgb',-cipher=>"Blowfish")->decrypt_hex($ENV{KARL});'Help

Re^6: Loading a part of the file to array using Tie::File
by Anonymous Monk on Nov 25, 2017 at 05:49 UTC
    Didn't you look at the benchmark?
