This is a very frequently asked question. It appears in perlfaq5, along with related questions "How do I delete a line from a file?" and "How do I change one line in a file?" It sounds like it should be easy, but it isn't.

The problem is that although we think of files as made of lines, the operating system usually thinks of them as made of bytes. You can overwrite a byte, but not a line. If you want to replace a line, you either have to overwrite every byte exactly, or you have to move the following part of the file up or down. There isn't even an easy way to find a line in a file; you have to read through the file counting newline characters until you get to the place you want.
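To make the point about counting newlines concrete, here is a minimal sketch of what finding a line costs without any help. The function name `line_offset` and the scratch-file contents are made up for illustration; nothing here comes from the FAQ.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return the byte offset at which line $n (counting from 0) starts,
# or undef if the file has no such line.
sub line_offset {
    my ($path, $n) = @_;
    open my $fh, '<:raw', $path or die "Can't open $path: $!";
    my $offset = 0;
    for (1 .. $n) {
        my $line = <$fh>;
        return undef unless defined $line;   # ran out of lines
        $offset += length $line;             # bytes, including the newline
    }
    return eof($fh) ? undef : $offset;       # nothing starts at end of file
}

# Demo on a scratch file.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "alpha\nbeta\ngamma\n";
close $tmp or die "Can't write $path: $!";

my $offset = line_offset($path, 2);
print "line 2 starts at byte $offset\n";     # "alpha\n" + "beta\n" = 11 bytes
```

Every lookup has to read everything before the line it wants, which is exactly the bookkeeping you would rather not write yourself.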

The FAQ starts with a rather snotty remark about how "Perl is not a text editor." It follows with a 500-word article sketching several more-or-less difficult ways to do this. Most of them involve throwing away the original file and replacing it with a modified copy.
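For comparison, the "modified copy" approach looks roughly like the sketch below. This is my own illustration rather than the FAQ's code: `insert_line_by_copy` is a hypothetical name, and the sketch ignores file permissions and locking.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Insert $text as a new line before line $n (counting from 0) by writing
# a modified copy and renaming it over the original.
sub insert_line_by_copy {
    my ($path, $n, $text) = @_;
    open my $in, '<', $path or die "Can't read $path: $!";
    # Create the copy in the same directory so the rename stays on one filesystem.
    my ($out, $tmp) = tempfile(DIR => dirname($path));
    while (my $line = <$in>) {
        print {$out} "$text\n" if $. == $n + 1;   # $. is the 1-based line number
        print {$out} $line;
    }
    close $in;
    close $out or die "Can't write $tmp: $!";
    rename $tmp, $path or die "Can't replace $path: $!";
    return;
}

# Demo on a scratch file.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "one\nthree\n";
close $fh or die "Can't write $file: $!";
insert_line_by_copy($file, 1, 'two');
```

Note that every byte of the file is copied even though only one line was added, and if `$n` is past the end of the file nothing is inserted at all.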

At last, there is a better way.

The new Tie::File module makes a file look like a Perl array. Each array element is one line of the file. If you read the array, you get a line from the file. If you modify the array, the file is modified as you requested.
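A minimal sketch of that idea, assuming a scratch file with made-up contents. The helper name `swap_line` is hypothetical; the only Tie::File call used is `tie` itself. In current versions of the module, records come back without their trailing newline.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Replace line $n (counting from 0) of $path and return its old contents.
sub swap_line {
    my ($path, $n, $new) = @_;
    tie my @lines, 'Tie::File', $path or die "Can't tie $path: $!";
    my $old = $lines[$n];    # reading an element reads that line
    $lines[$n] = $new;       # assigning to an element rewrites that line
    untie @lines;            # release the file
    return $old;
}

# Demo on a scratch file.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "first\nsecond\nthird\n";
close $tmp or die "Can't write $path: $!";

my $old = swap_line($path, 1, 'SECOND');
print "replaced '$old'\n";
```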

It's safe. It's reliable. It's efficient.

Best of all, it's easy.

Let's take an example. Suppose you want to go through a file and replace PERL with perl everywhere. One easy way is to use Perl's -i option:

    perl -i.bak -lpe 's/PERL/perl/g' file

This is convenient, but it has the drawback that it rewrites the entire file. If you want to do this as part of a larger program, it's rather less convenient, and a lot more bizarre. The FAQ suggests:

    {
        local ($^I, @ARGV) = ('.bak', 'file');
        while (<>) {
            s/PERL/perl/g;
            print;
        }
    }

You get poor error checking if you do this: the open is implicit, so there's no way to catch the error if it fails.

Here's the Tie::File version:

    use Tie::File;

    tie @lines, 'Tie::File', 'file' or die ...;
    for (@lines) {
        s/PERL/perl/g;
    }
    untie @lines;

Not only is this simpler (what the heck is local($^I), anyway?) but it's a lot more efficient. Unlike perl -i, which promises to modify the file "in place", and then actually creates a totally new file from scratch, Tie::File really does modify the file in place. If the file is ten megabytes long and contains PERL ten times, the -i solution writes ten megabytes; Tie::File writes just the ten records that changed.

Here's another common task; people ask about this in comp.lang.perl.misc every week: I have some text, in $text, and I want to insert it into an HTML file just after the line that says <!-- insert here -->. Again, I could use -i, which rewrites the whole file. Or I can use Tie::File:

    for (@lines) {
        if (/<!-- insert here -->/) {
            $_ .= $text;
            last;
        }
    }

Instead of rewriting the entire file, this only rewrites what is necessary, the part of the file after the comment. If $text happens to be empty, it rewrites only the one line. And the code is really simple and obvious.
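If `$text` holds several lines and you want each one to become its own record, a variant with `splice` may be a better fit; as far as I know, the module's documentation suggests `splice` for inserting records. This is a sketch only, and `insert_after_marker` is a hypothetical name.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Insert each line of $text as its own record right after the first
# record containing the marker comment. Returns true if the marker was found.
sub insert_after_marker {
    my ($path, $text) = @_;
    tie my @lines, 'Tie::File', $path or die "Can't tie $path: $!";
    my $found = 0;
    for my $i (0 .. $#lines) {
        next unless $lines[$i] =~ /<!-- insert here -->/;
        splice @lines, $i + 1, 0, split /\n/, $text;   # insert, remove nothing
        $found = 1;
        last;
    }
    untie @lines;
    return $found;
}

# Demo on a scratch file with made-up contents.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "<html>\n<!-- insert here -->\n</html>\n";
close $tmp or die "Can't write $path: $!";

insert_after_marker($path, "<p>one</p>\n<p>two</p>")
    or warn "marker not found\n";
```

As with the loop above, only the part of the file after the marker has to move.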

Here's another common problem which is trivially solved by Tie::File. How do I add a new record at the beginning of a file instead of at the end?

    unshift @lines, $new;    # Or add more than one record

This does rewrite the entire file, but there's no getting around that. All you can do is make it easy to write the code, and now it is easy to write the code.

Now let's suppose you have a database with several columns, and the first column is the key. For concreteness, let's say it's the Unix password file, and the key is the username. (Or maybe it's your web server's password file, which has the same format.) Suppose you have a program that needs to look up data in this database.

One good way to do this is to read the database into a hash, and use the usernames as the hash keys, like this:

    open DB, "< $database" or die ...;
    while (<DB>) {
        chomp;
        my ($username) = split /:/;
        $db{$username} = $_;
    }

    sub lookup {
        my $username = shift;
        return $db{$username};
    }

The major drawback of this approach is that if the database is big, you will run out of memory for the hash. (That is probably not a consideration with the password file, but many other databases are bigger.) But you can use Tie::File here to get an easy and efficient solution:

    tie @DB, 'Tie::File', $database or die ...;
    for (@DB) {
        my ($username) = split /:/, $_;
        $recno{$username} = $lineno++;
    }

    sub lookup {
        my $username = shift;
        return $DB[$recno{$username}];
    }

We're still using a hash, and the usernames are still the keys. But instead of associating the data with the usernames (which would take a lot of space) we only associate a record number with each username. If we look up $recno{'merlyn'}, we don't get the information for merlyn directly. Instead, we get a number like 1123, which tells us that merlyn's data is in record 1123 of the data file (counting from zero). Then we look at $DB[1123] and Tie::File immediately recovers the data for us: it remembered where record 1123 was from the last time it saw it go by, and goes directly to the right place in the file to find it. We get fast access to every record without storing the entire database in memory.

Even if the database is small, you might still want to use Tie::File if you need to change the data. With Tie::File, you're not limited to only reading the database; you can modify it also:

    sub replace_data {
        my ($username, $new_data) = @_;
        my $recno = $recno{$username};
        if (defined $recno) {
            # Update existing user
            $DB[$recno] = $new_data;
        } else {
            # Add new user at the end and remember its record number
            push @DB, $new_data;
            $recno{$username} = $#DB;
        }
    }

    sub update_password {
        my ($username, $new_password) = @_;
        # random_salt() is not shown here
        my $crypted_password = crypt($new_password, random_salt());
        # The -1 keeps trailing empty fields so the record round-trips intact
        my @data = split /:/, lookup($username), -1;
        $data[1] = $crypted_password;
        replace_data($username, join(':', @data));
    }

When we call replace_data, the data in the file is overwritten in place with the new data.

Tie::File arrays support all the Perl array operations, including push, pop, shift, unshift, splice, and $#a = $N. There are some other fancy features that you probably won't ever need, but if you do, they are in the manual.
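As a small illustration of two of those operations, here is a sketch that trims a file from either end. The function names `keep_first` and `keep_last` are made up for this example.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Truncate the file to its first $n lines by shrinking the array.
sub keep_first {
    my ($path, $n) = @_;
    tie my @lines, 'Tie::File', $path or die "Can't tie $path: $!";
    $#lines = $n - 1 if @lines > $n;
    untie @lines;
    return;
}

# Drop everything but the last $n lines by splicing off the front.
sub keep_last {
    my ($path, $n) = @_;
    tie my @lines, 'Tie::File', $path or die "Can't tie $path: $!";
    splice @lines, 0, @lines - $n if @lines > $n;
    untie @lines;
    return;
}

# Demo on a scratch file with made-up contents.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} map { "line $_\n" } 1 .. 5;
close $tmp or die "Can't write $path: $!";

keep_last($path, 3);     # file now holds lines 3 to 5
keep_first($path, 2);    # file now holds lines 3 and 4
```

Shrinking from the end only has to truncate the file, while removing records from the front forces the remaining ones to be moved up, just as `unshift` forces them to be moved down.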

Tie::File is available on CPAN and also from my website. It will be included with Perl 5.8, which will be released in April. It is distributed under the same terms as Perl.

You will like it.

Mark Dominus
Perl Paraphernalia

Re: How do I insert a line into a file?
by Juerd (Abbot) on Mar 31, 2002 at 14:44 UTC

    Dominus, you have again created a GREAT module, and I will surely put it to use very often.

    A string cannot be used to store data structures, unless serialized. It would however be great to use this efficient Tie::File together with array or hash references. Because putting the serialization in Tie::File would ruin all non-serializing operation, I thought it would be nice to tie an array and have automatic serialization. I have searched CPAN, but couldn't find a module that does what I want, so I created this quick hack:

        package Tie::FreezeThaw;

        # NOTE:
        #   This is a quick hack and has NOT been tested thoroughly!
        # NOTE:
        #   You can't use the elements directly as references!
        #   (if @xyzzy is tied, $xyzzy[1][2] won't work.
        #   Use $foo = $xyzzy[1]; $foo->[2] instead.)

        use FreezeThaw qw(freeze thaw);
        use base 'Tie::Array';
        use strict;

        sub TIEARRAY  { bless $_[1], $_[0] }
        sub FETCHSIZE { scalar @{ $_[0] } }
        sub STORESIZE { $#{ $_[0] } = $_[1] - 1 }
        sub EXISTS    { exists $_[0]->[$_[1]] }
        sub DELETE    { delete $_[0]->[$_[1]] }

        sub STORE {
            # Escape every non-printable byte as \xFF plus two hex digits,
            # so a frozen value never contains a record separator.
            # Work on a copy so the caller's value is not overwritten.
            (my $frozen = freeze $_[2]) =~ s/([^\x20-\x7E])/sprintf "\xFF%02x", ord $1/ge;
            $_[0]->[$_[1]] = $frozen;
        }

        sub FETCH {
            (my $foo = $_[0]->[$_[1]]) =~ s/\xFF(..)/chr hex $1/ge;
            return (thaw $foo)[0];
        }

        1;

    This allows me to use Tie::File with more complex data structures (of course, entire records will be overwritten, but that's still more efficient than rewriting the entire file) without having to think about the serialization.

        use Tie::File;
        use Tie::FreezeThaw;
        use strict;

        tie my @foo, 'Tie::File', 'testfile' or die $!;
        tie my @bar, 'Tie::FreezeThaw', \@foo;

        push @bar, [ 1 .. 10 ];

    (If there's already a module like my quick Tie::FreezeThaw hack, please let me know)