<?xml version="1.0" encoding="windows-1252"?>
<node id="973586" title="Re: List manipulation" created="2012-05-31 14:47:28" updated="2012-05-31 14:47:28">
<type id="11">
note</type>
<author id="258724">
Not_a_Number</author>
<data>
<field name="doctext">
&lt;c&gt;use 5.010;

my %parsed;
while ( &lt;DATA&gt; ){
   my @tmp = split;
   $parsed{$tmp[0]}{x1}++ if $tmp[1] eq '1_x';
   $parsed{$tmp[0]}{c3}++ if @tmp &gt; 2;
}

for my $k ( sort keys %parsed ) {
  say join ' ', $k, $parsed{$k}-&gt;{x1}, $parsed{$k}-&gt;{c3};
}

__DATA__
A 1_x 9_z
A 1_x 
A 1_x g_z
B 2_c 
B 1_x 1_z
C 1_x 1_z
C v_x 8_z&lt;/c&gt;
&lt;br&gt;
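&lt;p&gt;&lt;b&gt;Update:&lt;/b&gt; The output part of the above code misbehaves for a given ID if (a) '1_x' never appears in column 2, or (b) '1_x' appears in column 2 but there is never a third column for this ID. In either case the missing count is undefined, so it prints as an empty field (and triggers an "uninitialized value" warning under &lt;c&gt;use warnings&lt;/c&gt;). To see this, add the following lines to &lt;c&gt;__DATA__&lt;/c&gt;:&lt;/p&gt;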
&lt;c&gt;D v_x s_x
E 1_x&lt;/c&gt;
&lt;p&gt;Solution: default each missing count to 0 by changing the &lt;c&gt;say&lt;/c&gt; line in the &lt;c&gt;for&lt;/c&gt; loop to:&lt;/p&gt;
&lt;c&gt;  say join ' ', $k, $parsed{$k}-&gt;{x1} || 0, $parsed{$k}-&gt;{c3} || 0;
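  # Alternative sketch, not tested against the OP's real data: since the script
  # already requires Perl 5.10 via 'use 5.010', the defined-or operator (//)
  # could be used instead. It falls back to 0 only when the count is undefined,
  # which makes the intent explicit (here both forms print the same thing,
  # because a stored count is never 0):
  #   say join ' ', $k, $parsed{$k}{x1} // 0, $parsed{$k}{c3} // 0;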
&lt;/c&gt;</field>
<field name="root_node">
973551</field>
<field name="parent_node">
973551</field>
</data>
</node>
