http://www.perlmonks.org?node_id=439978

amt has asked for the wisdom of the Perl Monks concerning the following question:

Gentlemen,

I am trying to read an XLS file into a scalar variable. I am curious whether slurping the file into a scalar with this method:

    while(<XLSFILE>){$xlsdata = $_;}

would work, because after reading the file into the scalar, I need to insert it into a LONGBLOB field in a database.

Many thanks in advance.
amt.


Replies are listed 'Best First'.
Re: Binary files
by dragonchild (Archbishop) on Mar 16, 2005 at 15:23 UTC
    .= would work better, so you're not overwriting what you just read in.

    However, you really want something like:

    my $xlsfile = do { local $/; <XLSFILE>; };

    Or, just disregard everything and go with File::Slurp. :-)

    Being right, does not endow the right to be rude; politeness costs nothing.
    Being unknowing, is not the same as being stupid.
    Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
    Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

Re: Binary files
by jmcnamara (Monsignor) on Mar 16, 2005 at 16:10 UTC

    Just in case you, or someone else down the line, run this on Windows, you should binmode the filehandle:
    ... binmode XLSFILE; ...

    --
    John.
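
Putting the suggestions so far together (slurp with an undefined $/, plus binmode), here is a minimal sketch. The helper name slurp_binary and the temp-file demo are my own assumptions for illustration, not anything from the posts above; it uses a lexical filehandle rather than the bareword XLSFILE.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not from the thread): read a whole file as raw bytes.
sub slurp_binary {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;                        # no newline translation on Windows
    my $data = do { local $/; <$fh> };  # undefined $/ reads to end of file
    close $fh;
    return defined $data ? $data : '';
}

# Demo: write some bytes that text mode could mangle, then read them back.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
my $bytes = "PK\x00\x1A\r\n\x00binary\n" x 3;
print {$out} $bytes;
close $out or die "close failed: $!";

my $xlsdata = slurp_binary($path);
printf "read %d bytes\n", length $xlsdata;
```

The resulting scalar holds the bytes unchanged, which is what you want before handing it to a BLOB column.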

Re: Binary files
by BrowserUk (Patriarch) on Mar 16, 2005 at 15:40 UTC

    my $xlsdata = do{ local $/ = \-s( XLSFILE ); <XLSFILE> };

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco.
    Rule 1 has a caveat! -- Who broke the cabal?

      I don't get it. Why would one prefer

      my $xlsdata = do{ local $/ = \-s( XLSFILE ); <XLSFILE> };
      over plain ol'
      my $xlsdata = do { local $/; <XLSFILE> };
      ??

      the lowliest monk
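
For anyone puzzled by the syntax itself: setting $/ to a reference to an integer makes readline return fixed-size records of at most that many bytes, so a record size equal to the file size yields the whole file in one read. A small sketch of that behaviour follows; the 10-byte temp file and the variable names are assumptions made up for the demo. I believe newer perls reject a reference to zero, so an empty file would need a guard.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Demo file: 10 bytes of 'x' (made up for illustration).
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} 'x' x 10;
close $out or die "close failed: $!";

open my $fh, '<', $path or die "Cannot open $path: $!";
binmode $fh;

# Record size 4: each read returns at most 4 bytes.
my @chunks = do { local $/ = \4; <$fh> };

# Rewind, then use the file size as the record size: one read, whole file.
seek $fh, 0, 0 or die "seek failed: $!";
my $size  = -s $fh;
my $whole = do { local $/ = \$size; scalar <$fh> };
close $fh;

print join('|', map { length($_) } @chunks), "\n";   # prints 4|4|2
print length($whole), "\n";                          # prints 10
```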

        Leeeeeeeeeeeeeeeeeeeeets BENCHMARK!

        use Benchmark qw(:all);

        timethese( 100, {
            'By File Size' => sub {
                open XLSFILE, "x";
                my $xlsdata = do { local $/ = \-s (XLSFILE); <XLSFILE> };
            },
            'Plain Slurp' => sub {
                open XLSFILE, "x";
                my $xlsdata = do { local $/; <XLSFILE> };
            },
        } );

        Where "x" is 22 MB:

        By File Size: 11 wallclock secs ( 7.54 usr + 3.11 sys = 10.65 CPU) @ 9.39/s (n=100)
        Plain Slurp: 11 wallclock secs ( 8.50 usr + 2.95 sys = 11.45 CPU) @ 8.73/s (n=100)

        It appears that using the file size is a gnat's whisker faster, which is what I would expect, as the traditional slurp has to go looking for the end of the file.

        /J\

        I originally got it into my head that setting the size for slurping was a good thing after reading Slurp-Eazy. I seem to recall that it seemed quicker on my old portable under 5.6.1.

        However, looking again now under 5.8.4, the picture is both mixed and confusing:

        (ordered both ways to eliminate the possibility of "first run bias")

        #! perl -slw
        use strict;
        use Benchmark qw[ cmpthese ];

        open our $raw, '<:raw', $ARGV[ 0 ] or die $!;
        open our $txt, '< ', $ARGV[ 0 ] or die $!;
        our $s = -s $raw;

        cmpthese -3, {
            d_raw_trad => q[ my $data = do{ local $/; <$raw> }; ],
            c_raw_size => q[ my $data = do{ local $/ = \$s; <$raw> }; ],
            b_txt_trad => q[ my $data = do{ local $/; <$txt> }; ],
            a_txt_size => q[ my $data = do{ local $/ = \$s; <$txt> }; ],
        };

        __END__
        [15:46:09.53] P:\test>440395 100.dat
                       Rate   raw_size   raw_trad   txt_trad   txt_size
        a_raw_size   7969/s         --       -87%       -88%       -96%
        b_raw_trad  63136/s       692%         --        -2%       -69%
        c_txt_trad  64282/s       707%         2%         --       -69%
        d_txt_size 206315/s      2489%       227%       221%         --

        [15:53:19.11] P:\test>440395 100.dat
                       Rate c_raw_size d_raw_trad b_txt_trad a_txt_size
        c_raw_size   7855/s         --       -87%       -88%       -96%
        d_raw_trad  62625/s       697%         --        -4%       -70%
        b_txt_trad  65166/s       730%         4%         --       -68%
        a_txt_size 206449/s      2528%       230%       217%         --

        Quite why slurping a file in text mode would be faster (and so much faster!) than doing so in raw mode leaves me at a total loss for an explanation. It seems that I should reassess what I thought I knew about Perl IO layers and performance.

