
Re^2: Perl always reads in 4K chunks and writes in 1K chunks... Loads of IO!

by NeilF (Sexton)
on Jan 01, 2006 at 19:55 UTC (#520282)


in reply to Re: Perl always reads in 4K chunks and writes in 1K chunks... Loads of IO!
in thread Perl always reads in 4K chunks and writes in 1K chunks... Loads of IO!

BrowserUk, thanks... Two questions/comments regarding your post.

Wouldn't your code mean the lines are stripped of the line feeds they originally had? i.e. when you came to write the array out, it would no longer have the line feeds and you'd have to add them back onto every line?

The area I'm looking at is where I'm posting a new message in a forum, which reads the forum in, manipulates the lines and then writes it back out. So this code is not used for general browsing, just when updating.


I'll have a play with your example and see what the outcome is... You reckon it reads it in one(ish) hit and not in horrible 4K blocks?

Re^3: Perl always reads in 4K chunks and writes in 1K chunks... Loads of IO!
by BrowserUk (Patriarch) on Jan 01, 2006 at 23:35 UTC

    The problem is, it is quite likely that your ISP is measuring your IO in terms of bytes read and written rather than the number of reads and writes, so reducing the latter is unlikely to satisfy them.

    Also, when you have read the entire file, there is no need to re-write the entire thing in order to add a new line. If you open the file for reading and writing, then once you have read it, the file pointer will be perfectly placed to append any new line to the end. That will reduce your writes to 1 per new addition. If there is no new addition and the user is just refreshing, then you'll have no writes.
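A minimal sketch of the read-then-append idea, not the poster's actual code. The temp file, the sample lines, and the name read_then_append are made up for illustration. It opens the file with '+<', reads it, and writes only when there is a new line to add. The explicit seek is there because perlfunc notes that on some systems a seek is needed when switching between reading and writing on one handle.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);
use File::Temp qw(tempfile);

# Hypothetical forum file, created here only so the sketch is self-contained.
my ($tmp, $forum_file) = tempfile(UNLINK => 1);
print {$tmp} "first post\n", "second post\n";
close $tmp or die "close: $!";

# Read every line; append $new_line only if one was supplied.
sub read_then_append {
    my ($path, $new_line) = @_;

    # '+<' opens an existing file for both reading and writing.
    open my $fh, '+<', $path or die "$path: $!";
    flock($fh, LOCK_EX) or die "flock: $!";

    my @lines = <$fh>;    # reading to EOF leaves the pointer at the end

    if (defined $new_line) {
        # Reposition explicitly before switching from reading to writing.
        seek($fh, 0, SEEK_END) or die "seek: $!";
        print {$fh} "$new_line\n";
    }

    close $fh or die "close: $!";
    return @lines;
}

my @before = read_then_append($forum_file, 'third post');  # one write
my @after  = read_then_append($forum_file);                # refresh: no write
print scalar(@before), " lines before, ", scalar(@after), " after\n";
```

Run as is, this prints "2 lines before, 3 after": the first call sees the two original lines and appends one, the second call only reads.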

    Also, you presumably do not redisplay the entire forum each time, but rather only the last 20 or so lines?

    If this is so, then you should not bother to re-read the entire file each time, but rather use File::ReadBackwards to get just those lines you intend to display. If you do this, then you can use seek FH, 0, 2 to reposition the pointer to the eof and then append new lines without having to re-write the entire file each time.

    Using this method, you can fix the total overhead per invocation to (say) 20 reads and 0 or 1 writes. You'll need to deploy locking, but from your code above you seem to be already familiar with that.
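The fixed-overhead approach can be roughed out with core Perl only. Note this is a stand-in, not File::ReadBackwards: it seeks back a fixed-size chunk from the end and keeps the last few lines from it. The names tail_lines and show_and_append, the 4096-byte chunk, and the sample data are all assumptions for the sake of the example. If the chunk is too small to hold the lines you want, you simply get fewer lines back.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET SEEK_END);
use File::Temp qw(tempfile);

# Return up to $want trailing lines, reading at most $chunk bytes from the end.
sub tail_lines {
    my ($fh, $want, $chunk) = @_;
    my $size = (-s $fh) || 0;
    return () if $size == 0;

    my $start = $size > $chunk ? $size - $chunk : 0;
    seek($fh, $start, SEEK_SET) or die "seek: $!";
    defined read($fh, my $buf, $size - $start) or die "read: $!";

    my @lines = split /(?<=\n)/, $buf;      # keep the newlines
    shift @lines if $start > 0 && @lines;   # first piece may be a partial line
    splice @lines, 0, @lines - $want if @lines > $want;
    return @lines;
}

# Fetch the most recent lines and, optionally, append one new line.
sub show_and_append {
    my ($path, $new_line) = @_;
    open my $fh, '+<', $path or die "$path: $!";
    flock($fh, LOCK_EX) or die "flock: $!";

    my @recent = tail_lines($fh, 20, 4096);

    if (defined $new_line) {
        seek($fh, 0, SEEK_END) or die "seek: $!";   # same as seek FH, 0, 2
        print {$fh} "$new_line\n";
    }
    close $fh or die "close: $!";
    return @recent;
}

# Hypothetical forum file with 1000 messages.
my ($tmp, $forum_file) = tempfile(UNLINK => 1);
print {$tmp} "message $_\n" for 1 .. 1000;
close $tmp or die "close: $!";

my @recent = show_and_append($forum_file, 'newest message');
my @again  = show_and_append($forum_file);
print "showing ", scalar(@recent), " lines, last is: $again[-1]";
```

The cost per invocation stays roughly constant however large the file grows: one bounded read from the tail, plus at most one small write.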


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Thanks... Interesting stuff! They do seem to talk specifically about IO processes! Is File::ReadBackwards a standard library/module? i.e. will it exist on sites?
        They do seem to talk specifically about IO processes!

        Hmm. As a phrase "IO processes" doesn't make a lot of sense.

        The lines you see, and the numbers in the first column under the heading "#" in Filemon, are IO events, not processes. The process IDs are appended to the filenames in the third column under the heading "Process". The best I can suggest to you is that you ask them by what measure they are deciding that you are using too much resource.

        File::ReadBackwards isn't a standard module, but it is pure perl, so it is easy to install it in the same place as your scripts live, and then add use lib './lib'; so that it can be found. Assuming your scripts live in a subdirectory /cgi-bin, then you would create a directory structure and copy ReadBackwards.pm from CPAN into it as follows:

        /cgi-bin/lib/File/ReadBackwards.pm

        Then in your script you would have

        use lib './lib'; use File::ReadBackwards;

        I seem to remember someone posting a more thorough explanation of this somewhere, but I could not find it via supersearch. Maybe someone else remembers it and will post a link.


Re^3: Perl always reads in 4K chunks and writes in 1K chunks... Loads of IO!
by BrowserUk (Patriarch) on Jan 02, 2006 at 01:51 UTC

    I just realised I completely ignored one of your questions.

    Wouldn't your code mean the lines are stripped of the line feeds they originally had?

    Yes, as I coded it the newlines would be removed. This would effectively do a free chomp @test;. I don't see this as a problem as it would cost very little to replace them when writing the lines out again.

    However, if you want them left in place, then you could use the following split instead.

    #! perl -slw
    use strict;

    my $file = 'test.txt';

    open DF, '<:raw', $file or die "$file : $!";
    my @test = split /(?<=\n)/, do{ local $/ = \ -s( $file ); <DF> };
    close DF;
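To see the difference between the two splits on a small string, here is a quick comparison. It assumes the earlier version split on a plain /\n/ (that post isn't reproduced in this node), and the sample text is made up.

```perl
use strict;
use warnings;

my $text = "alpha\nbeta\ngamma\n";

# Splitting on the newline itself consumes it: a "free chomp".
my @stripped = split /\n/, $text;

# A zero-width lookbehind splits *after* each newline, so each line keeps it.
my @kept = split /(?<=\n)/, $text;

# Writing the stripped form back only needs the newline re-added per line.
my $rejoined = join '', map { "$_\n" } @stripped;

print "round trip ok\n" if $rejoined eq join '', @kept;
```

Either way the file can be written back out byte for byte; the only question is whether the newline travels with each array element or gets re-added on output.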

    All that said, if you are only appending to the end of the file, why read the file at all? Have you heard of opening a file for append?
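For the append-only case, something along these lines would do. The temp file, the sample lines, and the name append_post are hypothetical. Opening with '>>' means every write goes to the end of the file, and nothing is read at all.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Hypothetical forum file, created here only so the sketch is self-contained.
my ($tmp, $forum_file) = tempfile(UNLINK => 1);
print {$tmp} "existing post\n";
close $tmp or die "close: $!";

# Append one line without reading the file first.
sub append_post {
    my ($path, $line) = @_;
    open my $fh, '>>', $path or die "$path: $!";
    flock($fh, LOCK_EX) or die "flock: $!";   # keep concurrent posters apart
    print {$fh} "$line\n";
    close $fh or die "close: $!";
}

append_post($forum_file, 'new post');
```

That is zero reads and one write per new message, with the existing content left untouched.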


