To read Char-by-Char from a file

by Anonymous Monk
Hi Monks, how do we read a file character by character without loading it into memory? Regards, Sid

    As long as each character has one byte:

    $/ = \1;

    Setting $/ to a reference to an integer makes Perl read fixed-size records instead of lines: each read returns $x bytes ($x here being the integer), or fewer at the end of the file.

    Note: Bytes, not characters.

    Try it out:

    perl -pe 'BEGIN{$/=\1}s/.*/<$&>/' file
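The one-liner above can be expanded into a small script. This is only a minimal sketch: read_records() is a hypothetical helper name, and the in-memory handle stands in for a real file so the example runs on its own. It assumes a handle with no encoding layer, so one record is one byte.

```perl
use strict;
use warnings;

# read_records() is a hypothetical helper used only for this sketch.
# It reads a handle in fixed-size records by pointing $/ at an integer.
sub read_records {
    my ($fh, $size) = @_;
    local $/ = \$size;    # record mode: each readline returns up to $size bytes
    my @records;
    while (defined(my $chunk = <$fh>)) {
        push @records, $chunk;
    }
    return @records;
}

# Demo on an in-memory handle so no real file is needed.
my $data = "abc\n";
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
my @chars = read_records($fh, 1);
close $fh;

# Show the newline as a visible '\n' so every record can be seen.
print join('|', map { $_ eq "\n" ? '\n' : $_ } @chars), "\n";   # prints a|b|c|\n
```

Using local keeps the change to $/ confined to the helper, so line-oriented reads elsewhere in the program are not affected.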

    my ($read, $char);
    # read returns the number of characters read, 0 at end of file, undef on error
    while ($read = read FILE, $char, 1) {
        print "got: $char\n";
    }
    die "read error: $!" if not defined $read;

    Note that this reads a character at a time, not a byte; that's not the same thing unless you're using a 7/8-bit encoding. See read for more details.

    Update: in terms of memory, this actually does read more than one byte at a time; but this is normally what you want. Perl performs buffering for you so you don't make many needless system calls. The buffer doesn't endanger your memory. See also sysread for the low-level call.
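To see the character-versus-byte difference from the note above, here is a small sketch. It writes a three-byte UTF-8 file to a scratch location and counts how many one-unit read calls it takes to reach the end under two different layers. count_reads() and the scratch file are assumptions made for the illustration, not part of the original suggestion.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# "é" is two bytes in UTF-8 (0xC3 0xA9), so the file holds
# three bytes but only two characters.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "\xC3\xA9!";
close $tmp or die "cannot close $path: $!";

# count_reads() is a hypothetical helper: it opens the file with the given
# layer and counts how many read(..., 1) calls succeed before end of file.
sub count_reads {
    my ($layer, $file) = @_;
    open my $fh, "<$layer", $file or die "cannot open $file: $!";
    my ($n, $char) = (0, '');
    my $read;
    while ($read = read $fh, $char, 1) {
        $n++;
    }
    die "read error: $!" if not defined $read;
    close $fh;
    return $n;
}

print count_reads(':raw', $path), "\n";              # 3: one read per byte
print count_reads(':encoding(UTF-8)', $path), "\n";  # 2: one read per character
```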

    If you have an open file handle, see perlfunc:getc, otherwise cog's suggestion is quite elegant (assuming your character-set is not "wide", requiring more than one byte per character).
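For the getc route, a minimal sketch might look like this. chars_via_getc() is a hypothetical helper name, and the in-memory handle is only there so the example is self-contained; any open read handle should work the same way.

```perl
use strict;
use warnings;

# chars_via_getc() is a hypothetical helper that drains an already-open
# handle one character at a time with the built-in getc.
sub chars_via_getc {
    my ($fh) = @_;
    my @chars;
    while (defined(my $c = getc($fh))) {   # getc returns undef at end of file
        push @chars, $c;
    }
    return @chars;
}

my $text = "Sid\n";
open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
my @chars = chars_via_getc($fh);
close $fh;

print scalar(@chars), " characters read\n";   # prints "4 characters read"
```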


      Absolutely. This has the added benefit of bypassing Perl's buffered IO, which is what one really wants if (s)he needs to read a precise amount of data from a "source".

      In addition, if you want to read byte-by-byte instead of char-by-char (which is different due to Unicode support), it's safer to use binmode:

      binmode FILE; sysread(FILE,$buffer,1);
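A fuller sketch of the binmode plus sysread approach, with error and end-of-file handling, could look like the following. bytes_via_sysread() is a hypothetical helper, and the scratch file exists only to make the example runnable. It uses a real file rather than an in-memory handle because sysread works on the underlying file descriptor.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write three raw bytes to a scratch file so the sketch is self-contained.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "\x00\xFF\x41";
close $tmp or die "cannot close $path: $!";

# bytes_via_sysread() is a hypothetical helper name.
sub bytes_via_sysread {
    my ($file) = @_;
    open my $fh, '<', $file or die "cannot open $file: $!";
    binmode $fh;                          # raw bytes, no newline translation
    my @bytes;
    while (1) {
        my $buffer = '';
        my $n = sysread($fh, $buffer, 1); # one system call per byte
        die "sysread error: $!" if not defined $n;
        last if $n == 0;                  # 0 means end of file
        push @bytes, ord $buffer;
    }
    close $fh;
    return @bytes;
}

print join(' ', bytes_via_sysread($path)), "\n";   # prints "0 255 65"
```

One system call per byte is slow on large files, so this is mainly worth it when the exact read position matters more than throughput.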

    I wonder what you need that approach for? Your question sounds like it is either a) a fake, or b) senseless.

      or, c) the pursuit of knowledge, which is never bad.
      who knows, there could easily be a good reason, too. maybe this is the only way to do what he's trying to do, or maybe it's not. if the goal is a good one that could be reached a different, better way, then maybe that's a mistake the op will have to make by doing it the wrong way first... also not always a bad thing.
      A fake question? Isn't that logically inconsistent? Something is either a question, or not, surely?

      my name's not Keith, and I'm not reasonable.
        Something is either a question, or not, surely?

        This is a fake answer.

        Perhaps what he meant by 'fake' was 'misrepresentative' as in the case of homework?

      Or c) correct, because (s)he really wants to read a file (descriptor) char-by-char or byte-by-byte.

      Think about a modem supervising application that waits for some defined input on the line and then hands over to another application chosen based on the input seen so far. If you read line-by-line, you risk consuming too much input.
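The hand-over scenario can be sketched roughly as follows. Everything here is an assumption made for illustration: read_until() is a hypothetical helper, the marker string is made up, and a scratch file stands in for the modem line. The point is only that unbuffered one-byte reads stop exactly at the marker and leave the rest of the input unread for whoever takes over.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# read_until() is a hypothetical helper: it pulls one byte per sysread call
# and stops as soon as the collected input ends with $marker.
sub read_until {
    my ($fh, $marker) = @_;
    my $seen = '';
    while (1) {
        my $byte = '';
        my $n = sysread($fh, $byte, 1);
        die "sysread error: $!" if not defined $n;
        last if $n == 0;                      # end of input, marker never seen
        $seen .= $byte;
        last if $seen =~ /\Q$marker\E\z/;     # stop right after the marker
    }
    return $seen;
}

# A scratch file plays the role of the line; the content is invented.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "ATDT\r\nCONNECT\r\nrest of stream";
close $tmp or die "cannot close $path: $!";

open my $fh, '<', $path or die "cannot open $path: $!";
binmode $fh;
my $head = read_until($fh, "CONNECT\r\n");

# Nothing past the marker was consumed, so the next reader gets it all.
my $rest = '';
my $n = sysread($fh, $rest, 100);
die "sysread error: $!" if not defined $n;
close $fh;

print "$rest\n";   # prints "rest of stream"
```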

