http://www.perlmonks.org?node_id=696464


in reply to EBCDIC and COBOL records

I use the following module to parse COBOL copy structures and to create the appropriate decoders:

package PIC; use strict; use Encode qw(decode encode); use POSIX qw(ceil); use Carp qw(croak); use vars qw(%parse_PIC); =head2 C<decode_COMP3> Decodes a BCD number with trailing sign. =cut sub decode_COMP3 { # 0C -> + # 0D -> - # 0F -> (unsigned) my $digits = unpack "H*", $_[0]; my $sign = chop $digits; #print "$digits\n"; if ($sign eq 'D' or $sign eq 'd') { $sign = '-' } elsif ($sign eq 'C' or $sign eq 'c') { $sign = '+' } elsif ($sign eq 'F' or $sign eq 'f') { $sign = ' ' } else { $digits .= $sign; $sign = '?' }; "$sign$digits" }; =head2 C<make_COMP3_decoder PRE, POST, SEP> Returns a subroutine that parses the digits into PRE places before and POST places after the decimal delimiter and returns a string with SEP as the decimal delimiter =cut sub make_COMP3_decoder { my ($pre,$post,$sep) = @_; $sep ||= ","; return sub { my $res = decode_COMP3($_[0]); substr($res,$pre+2,0) = $sep; $res }; }; =head2 C<decode_EBCDIC> Decodes an EBCDIC string to a Perl internal string Shorthand for decode( 'cp1047', $_[0] ) =cut sub decode_EBCDIC { decode( 'cp1047', $_[0] )}; # The patterns for things I recognize in the COBOL copy structures I e +ncounter %parse_PIC = ( qr/^PIC\s+S?9\s*\((\d+)\)$/ => sub { return 0+$1, \&decode_EBCDIC +}, qr/^PIC\s+X\s*\((\d+)\)$/ => sub { return 0+$1, \&decode_EBCDIC }, qr/^PIC\s+S?9\s*\((\d+)\)\s*COMP-3$/ => sub { return ceil(($1+1) / +2), \&decode_COMP3 }, qr/^PIC\s+S?9\s*\((\d+)\)V9\((\d+)\)\s*(?:USAGE\s+)?COMP-3$/ => su +b { return ceil(($1+$2+1)/2), make_COMP3_decoder($1,$2,$_[0]) }, qr/^PIC\s+S?9\s*\((\d+)\)V\s*COMP-3$/ => sub { return ceil(($1+1) +/2), \&decode_COMP3 }, qr/^PIC\s+S?9\s*\((\d+)\)\s+OCCURS\s+(\d+)$/ => sub { return $1*$2 +, \&decode_EBCDIC }, ); =head2 C<decode_PIC> Returns two values: =over 4 =item * Length in bytes of the PIC expression =item * Code reference that decodes the passed value according to the +copy structure =back =cut sub decode_PIC { my ($pic,$sep) = @_; for my $re (keys %parse_PIC) { if ($pic =~ /$re/) { return ($parse_PIC{$re}->($sep)); }; }; croak "Couldn't decode '$pic'"; }; =head2 C<PIC_length> Returns the length of the expression in bytes =cut sub PIC_length { my ($pic) = @_; my ($len,$decoder) = decode_PIC($pic); $len }; 1;