<?xml version="1.0" encoding="windows-1252"?>
<node id="843710" title="Re: Reorganizing file contents" created="2010-06-08 13:27:41" updated="2010-06-08 13:27:41">
<type id="11">
note</type>
<author id="840762">
rjt</author>
<data>
<field name="doctext">
&lt;p&gt;The following should be reasonably efficient. It keeps one output file open per unique DESCRP value. If your real data has many distinct DESCRP values, you may hit the per-process limit on open files and need to rethink this, possibly with a least-recently-used scheme for the filehandles.&lt;/p&gt;

&lt;code&gt;
use warnings;
use strict;

my %fh_of; # Hash of filehandles

foreach my $file (&lt;R*-*.txt&gt;) {
    open my $in, '&lt;', $file or die "Couldn't open $file: $!";
    my $fh; # Current output handle; must persist across lines of this file
    while (&lt;$in&gt;) {
        if (/^DESCRP\s+(.+?)$/) {
            my $des = $1;
            # Open each output file once, in append mode, and cache the handle
            unless (exists $fh_of{$des}) {
                open $fh_of{$des}, '&gt;&gt;', "$des.txt"
                    or die "Couldn't open $des.txt: $!";
            }
            $fh = $fh_of{$des};
            $file =~ /^(R.+?)-/; # Glob guarantees match
            print $fh "$1$des\n";
            next;
        }
        # Records before the first DESCRP line are discarded
        print $fh $_ if $fh;
    }
    close $in;
}
close $_ for (values %fh_of);
&lt;/code&gt;
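
&lt;p&gt;If the number of distinct DESCRP values turns out to be too large to keep every file open, the least-recently-used idea could look roughly like the sketch below. This is only an illustration: the helper name fh_for, the @recent list and the $MAX_OPEN limit of 2 are made up for the example, and the real limit depends on your system. It relies on the output files being opened in append mode, so a handle that was closed can be reopened later without losing data.&lt;/p&gt;

&lt;code&gt;
use warnings;
use strict;

my $MAX_OPEN = 2; # Hypothetical limit, kept tiny for illustration
my %fh_of;        # Open filehandles, keyed by DESCRP value
my @recent;       # Keys of %fh_of, least recently used first

# Return an append-mode handle for $name, closing the least
# recently used handle first if the cache is already full.
sub fh_for {
    my ($name) = @_;
    # Move $name to the most-recently-used end of the list
    @recent = grep { $_ ne $name } @recent;
    push @recent, $name;
    unless (exists $fh_of{$name}) {
        if (scalar(keys %fh_of) &gt;= $MAX_OPEN) {
            my $old    = shift @recent;
            my $old_fh = delete $fh_of{$old};
            close $old_fh or die "Couldn't close $old.txt: $!";
        }
        open $fh_of{$name}, '&gt;&gt;', "$name.txt"
            or die "Couldn't open $name.txt: $!";
    }
    return $fh_of{$name};
}
&lt;/code&gt;

&lt;p&gt;In the main script, the unless/open block and the assignment to $fh would then collapse to $fh = fh_for($des); and the final close over the values of %fh_of would still work unchanged.&lt;/p&gt;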

&lt;p&gt;With the input files as you've given them, I get the expected output. All errors are fatal; you might want to handle them more gracefully depending on your application. If an input file does not start with a DESCRP line, $fh will not be defined, so I just throw away records until I see a DESCRP line. Again, you may want to handle this differently.&lt;/p&gt;</field>
<field name="root_node">
843691</field>
<field name="parent_node">
843691</field>
</data>
</node>
