http://www.perlmonks.org?node_id=522499

xmerlin has asked for the wisdom of the Perl Monks concerning the following question:

The problem: The code below fails with "out of memory" if the file is too large. How can this be solved? Code snippet:
    if (-e $logfile) {
        my $fh = new IO::File($logfile, 'r');
        @lines = $fh->getlines();
        $fh->close;
    }
    my $log = [];
    foreach my $l (@lines) {
        if ($l =~ m/error/i) {
            push @$log, { logLine => $l, error => 1 };
        } else {
            push @$log, { logLine => $l };
        }
    }
    $tmpl->param(log1 => $log);

template snippet:

    <tmpl_loop name=log1>
      <tmpl_if name=error><font color=red></tmpl_if>
      <tmpl_var escape=html name=logLine><br>
      <tmpl_if name=error></font></tmpl_if>
    </tmpl_loop>

Replies are listed 'Best First'.
Re: html::template and large files
by gryphon (Abbot) on Jan 11, 2006 at 17:47 UTC

    Greetings xmerlin,

    In your example, you have the entire log file getting loaded into @lines all at once. If your log file is especially large, you'll run out of memory. What you probably want instead is to only look at a single line in the file at a time.

        my @log;
        if (-e $logfile) {
            open(LOG, '<', $logfile) or die $!;
            while (<LOG>) {
                push @log, { logLine => $_, error => (/error/i) ? 1 : 0 };
                last if @log > $some_large_number;
            }
            close(LOG);
        }
        $tmpl->param( log1 => \@log );

    The other problem, though, is that it looks like you want to dump the entire log into a web page. With an especially large log file, that will be a problem for the client's memory and take a long time for the data transfer. You'll want to add an "if the log is bigger than n lines, stop" line inside the while.

    gryphon
    Whitepages.com Development Manager (DSMS)
    code('Perl') || die;

      Thanks for your input. I was already thinking of segmenting the large file into workable chunks: using @$log1 for the first 20,000 lines, @$log2 for the next batch, and so on, and finally, if it grows beyond 100,000 lines, telling the user it cannot be rendered on the web page, so they have to look at the file locally on the server. I am not an HTML::Template expert, so my next question is: how can I write the following more efficiently?
        <tmpl_loop name=log1>
          <tmpl_if name=error><font color=red></tmpl_if>
          <tmpl_var escape=html name=logLine><br>
          <tmpl_if name=error></font></tmpl_if>
        </tmpl_loop>
        <tmpl_loop name=log2>
          <tmpl_if name=error><font color=red></tmpl_if>
          <tmpl_var escape=html name=logLine><br>
          <tmpl_if name=error></font></tmpl_if>
        </tmpl_loop>
        etc ....

        Greetings xmerlin,

        I would avoid using @$log1, 2, 3, etc. Whether you have the data stored in one array or fifteen array references, it's still in memory. You aren't going to gain anything by splitting the data into multiple array references (other than ending up with more complex code).

        Just put all the data in @log, then give HTML::Template a reference to the data (e.g. \@log). Also, you may want to avoid using the font tag in your HTML.

        <tmpl_loop name="log">
          <tmpl_if name="error"><span style="color: red"></tmpl_if>
          <tmpl_var escape="html" name="logLine"><br/>
          <tmpl_if name="error"></span></tmpl_if>
        </tmpl_loop>

        From the perspective of usability, it's hard for me to see much value in presenting 100,000 lines of log in a single browser window. Why not present 1,000 lines in a page? Then you can paginate the rest and give the user forward and backward buttons. Maybe adding a search feature would be helpful. Anyway, I don't know your problem domain, so maybe you have a special case, but I'd have a hard time using a web page that just dumped a huge log into the browser. I'd encourage you to look into other ways to present this data.
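        The pagination idea above could be sketched roughly as follows. This is a minimal sketch, not code from the thread: the `read_page` helper and the 0-based page number are hypothetical names, and in a real CGI script the page number would come from the query string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read only the lines belonging to one page of the log, so memory
# use is bounded by $per_page rather than by the file size.
# $page is 0-based; earlier lines are still read and skipped, which
# is unavoidable for plain line-oriented files.
sub read_page {
    my ($logfile, $page, $per_page) = @_;
    my @log;
    open my $fh, '<', $logfile or die "Cannot open $logfile: $!";
    my $start = $page * $per_page;        # lines before this page
    while (my $line = <$fh>) {
        next if $. <= $start;             # $. is the current line number
        last if $. >  $start + $per_page; # page is full, stop reading
        push @log, { logLine => $line, error => ($line =~ /error/i) ? 1 : 0 };
    }
    close $fh;
    return \@log;
}

# The page's rows are then handed to HTML::Template as before:
# $tmpl->param( log1 => read_page($logfile, $page, $per_page) );
```

        Forward and backward buttons then just link to the same script with the page number incremented or decremented.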

        gryphon
        Whitepages.com Development Manager (DSMS)
        code('Perl') || die;