http://www.perlmonks.org?node_id=857332


in reply to Splitting file into separate files based on record lengths and identifiers

Hope this helps. It uses a hash, keyed on the identifier character, to collect the values.

#!/usr/bin/perl -w
# use warnings;
use strict;

my %hash = ();

while (<DATA>) {
    chomp($_);
    my $n = length($_);
    my $i = 0;
    while ($i<$n) {
        my $long = substr($_,$i,4);
        $i += 4;
        my $delim = substr($_,$i,1);
        $i += 1;
        my $val = substr($_,$i,$long);
        $i += $long;
        # print $long,' ',$delim, ' ', $val,"\n";
        $hash{ $delim } .= $val.',';
    }
}

s/,\z// for values %hash;

while ( my ($key, $value) = each(%hash) ) {
    print "$key => $value\n";
}
print "size of hash: " . keys( %hash ) . ".\n";
print '-' x (60),"\n";

__DATA__
0004$ADAM0002*330004%19770004$BOB 0002*430004%1967
0003$XDA0002*440004%22220003$XOB0002*990004%3333

Results.

Process started >>>
$ => ADAM,BOB ,XDA,XOB
% => 1977,1967,2222,3333
* => 33,43,44,99
size of hash: 3.
------------------------------------------------------------
<<< Process finished.
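
The loop above trusts every 4-digit length prefix. If a record ends early, `substr` simply returns a shorter string and the problem goes unnoticed. As a rough sketch only, the same walk can be wrapped in a function that checks each header and each length before taking the value. The name `parse_record` and the error messages are hypothetical and not part of the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): walk one record and
# return a list of [identifier, value] pairs, dying on malformed input.
sub parse_record {
    my ($record) = @_;
    my @fields;
    my $pos = 0;
    my $len = length $record;
    while ($pos < $len) {
        # Each field starts with a 4-digit length and a 1-character identifier.
        my $header = substr($record, $pos, 5);
        my ($size, $id) = $header =~ /\A(\d{4})(.)\z/s
            or die "Bad field header '$header' at offset $pos\n";
        $pos += 5;
        # Refuse to read past the end of the record.
        die "Field '$id' at offset $pos wants $size characters, only "
            . ($len - $pos) . " left\n"
            if $pos + $size > $len;
        push @fields, [ $id, substr($record, $pos, $size) ];
        $pos += $size;
    }
    return @fields;
}

# Group the values by identifier, keeping them in arrays instead of
# building a comma-terminated string.
my %by_id;
for my $record ('0004$ADAM0002*330004%1977', '0003$XDA0002*44') {
    push @{ $by_id{ $_->[0] } }, $_->[1] for parse_record($record);
}
print "$_ => ", join(',', @{ $by_id{$_} }), "\n" for sort keys %by_id;
```

Storing the values in arrays avoids the trailing-comma cleanup, and a record whose declared length runs past the end of the line is reported instead of being silently shortened.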

Re^2: Splitting file into separate files based on record lengths and identifiers
by monty77 (Initiate) on Aug 26, 2010 at 21:53 UTC

    The hash solution intrigued me, but it seems to break when I make the strings longer. Can someone enlighten me as to why? Sample data:

    0100$THIS IS A 100 CHAR FIELD £$%£%$£                                                               0030*THIS IS A 30 CHAR ONE

    Thanks!

      It is working fine for me. The £ sign comes out funny because I am in Latin America and using the standard character set.

      DATA

      0004$ADAM0002*330004%19770004$BOB 0002*430004%1967
      0003$XDA0002*440004%22220003$XOB0002*990004%3333
      0100$THIS IS A 100 CHAR FIELD £$%£%$£ + 12345 0030*THIS IS A 30 CHAR ONE

      Result

      Process started >>>
      $ => ADAM,BOB ,XDA,XOB,THIS IS A 100 CHAR FIELD ú$%ú%$ú + 12345
      % => 1977,1967,2222,3333
      * => 33,43,44,99,THIS IS A 30 CHAR ONE
      size of hash: 3.
      ------------------------------------------------------------
      <<< Process finished.
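
The thread does not establish why the longer record broke for monty77. One possible cause, offered here only as an assumption, is the £ sign: if the file is saved as UTF-8 and read without a decoding layer, `length` and `substr` count bytes, and each £ occupies two bytes, so a length prefix that was counted in characters no longer lines up. The sketch below uses a made-up 7-character field to show the mismatch. The sample record is illustrative and not taken from the posts above.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "\x{A3}" is the code point of the pound sign.
my $chars = "\x{A3}\$%\x{A3}%\$\x{A3}";   # 7 characters
my $bytes = encode('UTF-8', $chars);      # the same text as UTF-8 bytes

print "characters: ", length($chars), "\n";   # 7
print "bytes:      ", length($bytes), "\n";   # 10

# An illustrative record: a 7-character field followed by a second field.
my $record_chars = '0007$' . $chars . '0002*33';
my $record_bytes = encode('UTF-8', $record_chars);

# Take 7 units after the 5-character header, as the parsing loop would.
my $from_chars = substr($record_chars, 5, 7);
my $from_bytes = substr($record_bytes, 5, 7);

print "decoded input, field intact:  ",
    ($from_chars eq $chars ? "yes" : "no"), "\n";   # yes
print "raw bytes, field intact:      ",
    ($from_bytes eq $bytes ? "yes" : "no"), "\n";   # no, cut 3 bytes short

# Decoding the bytes first restores character semantics.
my $round_trip = decode('UTF-8', $record_bytes);
print "after decode, field intact:   ",
    (substr($round_trip, 5, 7) eq $chars ? "yes" : "no"), "\n";   # yes
```

If this is the cause, opening the input with a layer such as `'<:encoding(UTF-8)'` would make the character counts match. If the length prefixes were instead counted in bytes, the input should be left undecoded. It is also worth checking that each field really contains as many characters as its prefix claims.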