note
etcshadow
I suppose I should have explained that a little better. Assigning a reference to an integer to $/ makes <code>&lt;&gt;</code> read fixed-size records of that many bytes. It sets $/ to \108 because it wants to read <i>at least</i> the full ISA segment (105 bytes, not counting the segment terminator), plus the segment terminator, plus (and this is the odd thing) a possible trailing carriage return and/or newline. It doesn't matter if the read runs into the GS segment, because this is only look-ahead to determine the separator characters. After peeking ahead into the file, it seeks back to the beginning (that's the <code>seek ARGV, 0, 0</code>).
<p>
Anyway, to go through that code bit by bit, it's like this:
<p>
<code>
$/=\108;
($/) = (<> =~ /^.{105}(.?\r?\n?)/);
seek ARGV, 0, 0;
</code>
<p>
Or, to be more understandable but less terse:
<p>
<code>
# Look ahead into the file for the ISA segment and the few characters which
# follow it, so that we can determine the segment terminator character(s).
$/ = \108;
my $ISA_and_trailing_chars = <>;
# examine the text after the segment for the segment termination character(s)
$ISA_and_trailing_chars =~ /
    ^.{105}     # The actual ISA segment itself
    (
        .?      # The proper segment terminator character
        \r?     # If the segment terminator is followed by a carriage
        \n?     # return or a newline (or both), include that as part of
                # the segment terminator. This isn't really legal to do,
                # but some folks do it regardless, and we want to handle
                # that sort of garbage as though it weren't garbage.
    )
/x;
my $segment_terminator = $1;
# set the input record separator to the segment termination character(s), so
# that subsequent reads from the file will read one segment at a time.
$/ = $segment_terminator;
# since we peeked ahead into the file, to determine *how* to read it, now
# seek back to the beginning of it, so that we can read it (all) correctly
seek ARGV, 0, 0;
</code>
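<p>
If you want to try the detection without a real file on hand, here is a minimal sketch of the same peek-and-rewind idea. It reads from an in-memory filehandle instead of ARGV, and the ISA contents are made-up filler (only the 105-byte length matters), so treat the data and variable names as assumptions for illustration, not as a real interchange:
<p>
<code>
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data: a fake 105-byte ISA segment terminated by "~"
# plus a (not really legal) CRLF, followed by two more segments.
my $isa  = 'ISA*' . ('x' x 100) . ':';    # 4 + 100 + 1 = 105 bytes
my $data = join '', map { "$_~\r\n" } $isa, 'GS*HP*SENDER*RECEIVER', 'GE*1*1';

# Assumption: an in-memory filehandle stands in for ARGV here.
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

my $segment_terminator;
{
    local $/ = \108;    # fixed-size read: ISA + terminator + optional CR/LF
    my $peek = <$fh>;
    ($segment_terminator) = $peek =~ /^.{105}(.?\r?\n?)/
        or die "input does not start with a 105-byte ISA segment\n";
}

# Rewind, then read one segment at a time using the detected terminator.
seek $fh, 0, 0 or die "cannot rewind: $!";
$/ = $segment_terminator;

my @segments;
while (my $segment = <$fh>) {
    chomp $segment;     # chomp strips the current $/, i.e. the terminator
    push @segments, $segment;
}

printf "%d segments, terminator is %d byte(s) long\n",
    scalar @segments, length $segment_terminator;
</code>
<p>
Unlike the terse version, this one dies if the regex doesn't match, rather than leaving $/ undefined (which would slurp the whole file in one read).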
<p>
Hope that explains it better. For what it's worth, I've since updated the little script I use for this to also pull out the field separator and component separator. I made a little shell-script wrapper around it, so that I can easily write simple Perl command-line scripts with the very basics of X12 syntax handling built in. Here it is:
<p>
<code>
#!/bin/sh
# since "perl -p ... -n ..." will be treated by perl as just
# "perl -p ... ..." (not allowing the -n to override the -p)
# we have to scan through any switches on the command line,
# so that commands like "x12cat -ne 'print if /^PLB/' ..."
# will work as expected... but still allow an implicit "-p"
# if no "-n" is specified. Granted this isn't perfect...
# But this isn't so bad, because an explicit "-p" can override
# this.
looptype=""
for i do
    case "$i" in
        -m*) ;; # -m and -M start command-line "use" directives,
        -M*) ;; # "n"s in them should be ignored
        -*n*) looptype=n;;
        -h) echo 'x12cat: "magic" x12 spooler wrapper around command-line perl.
-splits files by segment terminator
-auto sets $f and $c to (regex-escaped) field and component separators
-auto sets the autosplit to the field separator (for the -a autosplit)
-allows -pe or -ne operations on file (or none at all to just cat the file)'; exit;;
        -*) ;;
        *) break;;
    esac
done
if [ "x$looptype" = "x" ]; then
    looptype=p
fi
perl -l -$looptype -e 'BEGIN{$SIG{PIPE}="exit"; $|=1; $/=\108; ($f,$c,$/)=(<> =~ /^ISA(.).{100}(.)(.?\r?\n?)/); ($f,$c)=("\Q$f","\Q$c"); seek ARGV,0,0; $.=0}' -F'/$f/' "$@"
</code>
<p>
The idea of that script (I call it x12cat) is that I can do fairly simple things like:
<p>
<code>
$ x12cat -ne 'print if /^GS/' some_file.x12
</code>
<p>
to grep a file for GS segments. Or, to do something more interesting (showing off how it ties into Perl's -a, or "autosplit", command-line switch):
<p>
<code>
$ x12cat -ane '$paid += $F[4] if $F[0] eq "CLP"; END{print $paid}' some_835_file.x12
</code>
<p>
which would print the sum of all CLP04 fields (total amount paid in an 835 electronic remittance advice file).
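<p>
For anyone without the wrapper, here is roughly what that one-liner amounts to when written out long-hand. This is only a sketch: the 835 fragment is invented and heavily abbreviated, it reads from an in-memory filehandle rather than a file, and the explicit split stands in for what -a and -F do for you:
<p>
<code>
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, heavily abbreviated 835 fragment. The ISA is filler padded
# to 105 bytes; "*" is the field separator, ":" the component separator,
# and "~" the segment terminator.
my $isa  = 'ISA*' . ('0' x 100) . ':';
my $data = join '', map { "$_~" }
    $isa,
    'CLP*CLAIM1*1*100*80.5*0',
    'CLP*CLAIM2*1*50*19.5*0',
    'SE*3*0001';

# Assumption: an in-memory filehandle stands in for the real input file.
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

# Peek at the ISA to pick up all three delimiters, as x12cat does.
my ($f, $c, $terminator);
{
    local $/ = \108;
    my $peek = <$fh>;
    ($f, $c, $terminator) = $peek =~ /^ISA(.).{100}(.)(.?\r?\n?)/
        or die "input does not start with an ISA segment\n";
}
seek $fh, 0, 0 or die "cannot rewind: $!";
$/ = $terminator;

# Long-hand version of: -ane '$paid += $F[4] if $F[0] eq "CLP"'
my $paid = 0;
while (my $segment = <$fh>) {
    chomp $segment;
    my @F = split /\Q$f\E/, $segment;    # what -a with -F'/$f/' does
    $paid += $F[4] if $F[0] eq 'CLP';    # $F[4] is CLP04
}
print "$paid\n";
</code>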
<div class="pmsig"><div class="pmsig-296575">
<code>
------------
:Wq
Not an editor command: Wq
</code>
</div></div>