http://www.perlmonks.org?node_id=238998


in reply to Read a line with max length ?

There is a detail in "perldoc perlvar" about assigning a reference to an integer to $/, which led me to discover the following, which I think is just what you want:
    my $line   = "";
    my $maxlen = 10;
    $/ = \1;    # read one byte per record
    while (<>) {
        $line .= $_;
        last if ( /[\r\n]/ or length( $line ) == $maxlen );
    }
    print $line;
I haven't tested this thoroughly in terms of what happens with underlying input buffers, but in terms of the behavior of variables and values within the perl script, it seems to do exactly what you'd like.

Setting $/ to \1 means the input record size is one byte; the while loop appends one byte at a time to $line, and terminates either when you have read $maxlen bytes or when you get any sort of line termination (\r or \n). (This will work sensibly for all character encodings I've heard of.)
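If you want to reuse the same idea without leaving $/ changed for the rest of the script, here is a minimal sketch. The sub name read_capped_line is made up for illustration, and the in-memory filehandle only stands in for whatever handle you are really reading from:

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): read one line from $fh,
# but never more than $maxlen bytes.
sub read_capped_line {
    my ($fh, $maxlen) = @_;
    local $/ = \1;    # one-byte records; the old $/ is restored on return
    my $line = '';
    while (length($line) < $maxlen) {
        my $byte = <$fh>;
        last unless defined $byte;     # end of input
        $line .= $byte;
        last if $byte =~ /[\r\n]/;     # any line terminator ends the line
    }
    return $line;
}

# Demonstration on an in-memory filehandle (an assumption for testing only).
my $data = "short\nthis line is far too long\n";
open my $fh, '<', \$data or die "open failed: $!";

my $first  = read_capped_line($fh, 10);   # a complete short line
my $second = read_capped_line($fh, 10);   # the first 10 bytes of the long line
```

Using local keeps the one-byte record size from leaking into other reads elsewhere in the program.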

No doubt this will raise some hackles, because reading input one byte at a time looks like a really non-optimal amount of overhead. Maybe you can set $/ to \$maxlen instead, but then, if you're really expecting to do line-oriented input and you're going back for additional reads during a given connection, you have to make sure that any residue that follows a line termination is carried over to the next time you clear $line and start filling it again. One way or another, you pay extra for being really careful (so just believe that it ends up less expensive than being left open to hackers).
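To make the "carry over the residue" bookkeeping concrete, here is a rough sketch that reads in records of up to $maxlen bytes and keeps the leftovers between calls. The name make_line_reader is hypothetical, $maxlen is assumed to be a positive integer, and the example runs against an in-memory filehandle, so it says nothing about blocking behavior on a live socket:

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that yields, per call, either one
# complete line of at most $maxlen bytes or a $maxlen-byte piece of a longer
# line. Leftover bytes are kept in $residue between calls.
sub make_line_reader {
    my ($fh, $maxlen) = @_;
    my $residue = '';
    return sub {
        local $/ = \$maxlen;    # records of at most $maxlen bytes
        while (1) {
            my $nl = index($residue, "\n");
            if ($nl >= 0 && $nl < $maxlen) {
                # A whole line (newline included) fits within the limit.
                return substr($residue, 0, $nl + 1, '');
            }
            if (length($residue) >= $maxlen) {
                # Over-long line: hand back exactly $maxlen bytes.
                return substr($residue, 0, $maxlen, '');
            }
            my $chunk = <$fh>;
            if (!defined $chunk) {
                # End of input: flush what is left, then signal EOF with undef.
                return length($residue)
                    ? substr($residue, 0, length($residue), '')
                    : undef;
            }
            $residue .= $chunk;
        }
    };
}

my $data = "ab\ncdefghij\nxy";
open my $fh, '<', \$data or die "open failed: $!";

my $next_line = make_line_reader($fh, 5);
my @pieces;
while (defined(my $piece = $next_line->())) {
    push @pieces, $piece;
}
```

The point of the closure is only that $residue survives between calls; an object or a package variable would do the same job.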

UPDATE: Having thought about this a bit more, I think that any approach that tries to read more than one byte at a time will get into a lot of trouble, if your intention is really to do line-oriented input safely.

The point is that, as soon as you leave behind the default value of $/ and expect some minimum number of bytes greater than one on each read, you run the risk that (a closing portion of) a line will be left stranded in the input buffer until either: (a) more stuff is written by the remote host to fill the buffer, or (b) you close the connection. This would hose your process, putting it into an indefinite wait. I bow to Elian's more informed experience on this issue -- but also second Zaxo's point about making sure to watch for multiple lines in one read. Thanks, folks!

Re: Re: Read a line with max length ?
by isotope (Deacon) on Feb 27, 2003 at 05:27 UTC
Re: Re: Read a line with max length ?
by Elian (Parson) on Feb 27, 2003 at 23:15 UTC
    You should read more than one byte at a time. It's not particularly dangerous, and works just fine. A buffer size of 1500 is good, since it tends to match the maximum size of a TCP/IP packet on Ethernet (the usual 1500-byte MTU). Sockets have timeouts, so worst case your program will pause waiting for the remote end, but that'll happen for a one-byte read just as often, so it's not a problem that you're avoiding.

    Note that a read on a socket for more data than is available won't stall. If you issue a read for 1500 bytes but there's only 100 available, you'll get 100 back, barring really bizarre OS bugs ("features").
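    A small sketch of that short-read behavior, assuming a Unix-like system where socketpair is available. The local socket pair only stands in for a real network connection, and the variable names are made up for the example:

```perl
use strict;
use warnings;
use Socket;

# A connected pair of local stream sockets, standing in for a network socket.
socketpair(my $reader, my $writer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

# The "remote end" sends only 100 bytes.
my $sent = syswrite($writer, 'x' x 100);
defined $sent or die "syswrite failed: $!";

# Ask for up to 1500 bytes. sysread returns as soon as some data is
# available; it does not wait for the full 1500.
my $got = sysread($reader, my $buf, 1500);
defined $got or die "sysread failed: $!";

print "asked for 1500 bytes, got $got\n";
```

    Note that this uses sysread rather than <$fh>, so Perl's own buffering does not get involved; mixing the two on the same handle is best avoided.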

Re: Re: Read a line with max length ?
by fauxpas (Initiate) on Feb 27, 2003 at 04:15 UTC
    This seems to do the trick, thanks. I'll be careful now and figure out how to be optimal later. ;)