I suppose I should have explained that a little better. It sets $/ to \108 because it wants to read at least the full ISA segment (105 bytes, not counting the segment terminator), plus the segment terminator (1 byte), plus (and this is the odd thing) a possible trailing carriage return and/or newline (up to 2 more bytes): 105 + 1 + 2 = 108. It doesn't matter if the read runs into the GS segment, because this is only a look-ahead to determine the separator characters. After peeking ahead into the file, it seeks back to the beginning of the file (that's the seek ARGV, 0, 0).
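In case the \108 part looks odd: assigning a reference to an integer to $/ makes readline return fixed-size records instead of delimited lines. Here is a minimal sketch of just that behaviour, using an in-memory filehandle and made-up filler data (the 'A' x 105 string merely stands in for a real ISA segment):

```perl
use strict;
use warnings;

# Stand-in data (an assumption for illustration): 105 filler bytes where the
# ISA would be, a "~" terminator, a CR/LF pair, then the start of a GS segment.
my $data = ('A' x 105) . "~\r\n" . "GS*HP~\r\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

my $chunk;
{
    local $/ = \108;    # a reference to an integer: read fixed 108-byte records
    $chunk = <$fh>;
}
close $fh;

print length($chunk), "\n";    # prints 108
```

The last three bytes of $chunk are the terminator and the CR/LF pair, which is exactly the region the regex goes on to inspect.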
Anyway, to go through that code bit by bit, it's like this:
$/=\108;
($/) = (<> =~ /^.{105}(.?\r?\n?)/);
seek ARGV, 0, 0;
or to be more understandable, but less terse:
# Look ahead into the file for the ISA segment and the few characters which
# follow it, so that we can determine the segment terminator character(s).
$/ = \108;
my $ISA_and_trailing_chars = <>;

# examine the text after the segment for the segment termination character(s)
$ISA_and_trailing_chars =~ /
    ^.{105}   # The actual ISA segment, itself
    (
        .?    # The proper segment terminator character
        \r?   # If the segment terminator is followed by a carriage return,
        \n?   # a newline, or both, include that as part of the segment
              # terminator. this isn't really legal to do, but some
              # folks do it, regardless, and we want to be able to handle
              # that sort of garbage as though it weren't garbage.
    )
/x;
my $segment_terminator = $1;

# set the input record separator to the segment termination character(s), so
# that subsequent reads from the file will read one segment at a time.
$/ = $segment_terminator;

# since we peeked ahead into the file, to determine *how* to read it, now
# seek back to the beginning of it, so that we can read it (all) correctly
seek ARGV, 0, 0;
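If you want to try the look-ahead without a real X12 file handy, here is a self-contained sketch that runs the same steps against an in-memory filehandle instead of ARGV. The sample interchange (sender and receiver IDs, dates, control numbers) is invented for illustration, and it deliberately puts a newline after each "~" to exercise the "garbage" case:

```perl
use strict;
use warnings;

# Made-up ISA segment, padded to the fixed field widths so it is 105 bytes.
my $isa = join '*', 'ISA', '00', ' ' x 10, '00', ' ' x 10,
    'ZZ', sprintf('%-15s', 'SENDER'), 'ZZ', sprintf('%-15s', 'RECEIVER'),
    '240101', '1200', 'U', '00401', '000000001', '0', 'P', ':';
die "sample ISA should be 105 bytes" unless length($isa) == 105;

# Each segment ends in "~" followed by a (not strictly legal) newline.
my $data = '';
$data .= "$_~\n" for $isa, 'GS*HP*SENDER*RECEIVER', 'ST*835*0001';
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

# Peek at the first 108 bytes and capture whatever follows the ISA.
$/ = \108;
my $ISA_and_trailing_chars = <$fh>;
my ($segment_terminator) = $ISA_and_trailing_chars =~ /^.{105}(.?\r?\n?)/;

# Read by segment from here on, starting over from the top of the file.
$/ = $segment_terminator;
seek $fh, 0, 0;

my @ids;
while (my $segment = <$fh>) {
    chomp $segment;                           # strips the whole "~\n" terminator
    push @ids, (split /\*/, $segment)[0];     # keep just the segment ID
}
print "@ids\n";    # prints: ISA GS ST
```

Note that chomp removes the full two-byte terminator here, because chomp strips whatever $/ currently holds.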
Hope that explains it better. For what it's worth, I have since updated the little script I use for this to also pull out the field separator and the component separator. I wrapped it in a small shell script, so that I can easily write simple Perl command-line scripts with the very basics of X12 syntax handling built in. Here it is:
#!/bin/sh
# since "perl -p ... -n ..." will be treated by perl as just
# "perl -p ... ..." (not allowing the -n to override the -p)
# we have to scan through any switches on the command line,
# so that commands like "x12cat -ne 'print if /^PLB/' ..."
# will work as expected... but still allow an implicit "-p"
# if no "-n" is specified. Granted this isn't perfect...
# But this isn't so bad, because an explicit "-p" can override
# this.
looptype=""
for i do
    case "$i" in
        -m*) ;;             # -m and -M start command-line "use" directives,
        -M*) ;;             # "n"s in them should be ignored
        -*n*) looptype=n;;
        -h) echo 'x12cat: "magic" x12 spooler wrapper around command-line perl.
-splits files by segment terminator
-auto sets $f and $c to (regex-escaped) field and component separators
-auto sets the autosplit to the field separator (for the -a autosplit)
-allows -pe or -ne operations on file (or none at all to just cat the file)'; exit;;
        -*) ;;
        *) break;;
    esac
done
if [ "x$looptype" = "x" ]; then
    looptype=p
fi
perl -l -$looptype -e 'BEGIN{$SIG{PIPE}="exit"; $|=1; $/=\108; ($f,$c,$/)=(<> =~ /^ISA(.).{100}(.)(.?\r?\n?)/); ($f,$c)=("\Q$f","\Q$c"); seek ARGV,0,0; $.=0}' -F'/$f/' "$@"
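To show what the longer regex in that BEGIN block captures, here is a standalone sketch that applies it to a string rather than to ARGV. The ISA contents and the SVC segment are invented for illustration; the point is only where the three captures land and why $f and $c get regex-escaped:

```perl
use strict;
use warnings;

# Made-up 105-byte ISA using "*" for fields and ":" for components, followed
# by a "~" terminator and a CR/LF pair (108 bytes in total).
my $isa = join '*', 'ISA', '00', ' ' x 10, '00', ' ' x 10,
    'ZZ', sprintf('%-15s', 'SENDER'), 'ZZ', sprintf('%-15s', 'RECEIVER'),
    '240101', '1200', 'U', '00401', '000000001', '0', 'P', ':';
die "sample ISA should be 105 bytes" unless length($isa) == 105;
my $head = $isa . "~\r\n";

# Byte 4 is the field separator, byte 105 is the component separator, and
# whatever follows byte 105 is the segment terminator.
my ($f, $c, $terminator) = $head =~ /^ISA(.).{100}(.)(.?\r?\n?)/;

# Escape the separators so they are safe to use inside a regex; an
# unescaped "*" would be a quantifier, not a literal character.
($f, $c) = (quotemeta $f, quotemeta $c);

# Split a (hypothetical) segment into fields, then one field into components.
my @F     = split /$f/, 'SVC*HC:99213*100*80';
my @parts = split /$c/, $F[1];

print "field separator regex:     $f\n";        # \*
print "component separator regex: $c\n";        # \:
print "components of SVC01:       @parts\n";    # HC 99213
```

Inside x12cat the same escaped $f is what -F'/$f/' hands to the autosplit.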
The idea of that script (I call it x12cat) is that I can do fairly simple things like:
$ x12cat -ne 'print if /^GS/' some_file.x12
to grep a file for GS segments. Or to do something more interesting (showing off how it ties into perl's -a, or "autosplit", command-line parameter):
$ x12cat -ane '$paid += $F[4] if $F[0] eq "CLP"; END{print $paid}' some_835_file.x12
which would print the sum of all CLP04 fields (total amount paid in an 835 electronic remittance advice file).
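For anyone not used to -a, this is roughly what that one-liner does once the segments have been read and chomped, written out as an ordinary script. The segments and amounts are made up, and the field separator is assumed to be "*":

```perl
use strict;
use warnings;

# Assumed field separator, regex-escaped the same way x12cat escapes $f.
my $f = quotemeta '*';

# Made-up segments, as they would look after the terminator is chomped off.
my @segments = (
    'CLP*CLAIM1*1*100.00*80.50*19.50',
    'NM1*QC*1*DOE*JANE',
    'CLP*CLAIM2*1*50.00*19.25*30.75',
);

my $paid = 0;
for my $segment (@segments) {
    my @F = split /$f/, $segment;        # what -a with -F'/$f/' does per record
    $paid += $F[4] if $F[0] eq 'CLP';    # $F[0] is the segment ID, so $F[4] is CLP04
}
print "$paid\n";    # prints 99.75
```

Since $F[0] holds the segment ID, the index into @F matches the element number, which keeps these one-liners easy to read against the implementation guide.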
------------
:Wq
Not an editor command: Wq