
Re: Optimising processing for large data files.

by tachyon (Chancellor)
on Apr 10, 2004 at 13:10 UTC (#344120)

in reply to Optimising processing for large data files.

Good optimisation. In C you can of course avoid the copy-and-move overhead of the substr buffer you use: just flip pointers between a pair of buffers to get the sliding window.

Runtime on a 1GHz laptop was 10 minutes for a 3GB test file. So the benefits of doing it in C are real, but hardly worth the effort unless saving 20 minutes of runtime justifies the X minutes of extra coding time.

$ cat file.c
#include <stdio.h>

#define FILENAME "c:\\test.txt"
#define CHUNK 500

int main() {
    FILE *f;
    char buf1[CHUNK],buf2[CHUNK],pair[3],*fbuf,*bbuf,*swap;
    int r, i;
    f=fopen(FILENAME,"r");
    if (!f) return 1;
    fbuf=buf1;
    bbuf=buf2;
    r=(int)fread( fbuf, sizeof(char), CHUNK, f );
    if ( !r || r<CHUNK ) return 1;
    pair[2]=0;
    while ( (r=(int)fread( bbuf, sizeof(char), CHUNK, f )) ) {
        for( i=0;i<r;i++ ) {
            pair[0]=fbuf[i];
            pair[1]=bbuf[i];
            /* printf("%s\n",pair);*/
        }
        /* Move old back buffer pointer to front buffer ptr
         * And vice versa. Net effect is to slide buffer->R
         * As we will refill the back buffer with fresh data.
         * Thus we simply pour data from disk to memory with
         * no wasted copying effort.
         */
        swap=fbuf; fbuf=bbuf; bbuf=swap;
    }
    fclose(f);
    return 0;
}
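
If you want to check the buffer flip on a small input before pointing it at a 3GB file, here is a minimal sketch of the same idea wrapped in a function. The function name count_equal_pairs, the tiny CHUNK of 4, and the choice to count equal pairs (rather than build the pair string) are my assumptions for illustration only; they are not part of the program above.

```c
#include <assert.h>
#include <stdio.h>

#define CHUNK 4  /* deliberately tiny so the flip is easy to trace (assumption) */

/* Hypothetical helper: counts offsets i where the byte at i equals the byte
 * at i + CHUNK, streaming the input through two fixed buffers. */
long count_equal_pairs(FILE *f)
{
    char buf1[CHUNK], buf2[CHUNK];
    char *fbuf = buf1, *bbuf = buf2, *swap;
    size_t r, i;
    long matches = 0;

    assert(f != NULL);

    /* Prime the front buffer; without one full chunk there is nothing to pair. */
    if (fread(fbuf, 1, CHUNK, f) < CHUNK)
        return 0;

    /* r may be short on the final read, so only r pairs are compared. */
    while ((r = fread(bbuf, 1, CHUNK, f)) > 0) {
        for (i = 0; i < r; i++) {
            if (fbuf[i] == bbuf[i])
                matches++;
        }
        /* Flip the pointers: the block just read becomes the front buffer,
         * and the old front buffer is reused for the next read. No bytes
         * are copied between the buffers. */
        swap = fbuf; fbuf = bbuf; bbuf = swap;
    }
    return matches;
}
```

Pass it any stream you have already opened for reading, e.g. count_equal_pairs(fopen(FILENAME, "rb")) after checking the fopen result. Only the two pointers change hands per chunk, so the per-chunk cost stays at one fread plus the compare loop.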


