Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine

EDI Store and Forward System

by diskcrash (Hermit)
on Oct 02, 2002 at 03:37 UTC ( #202181=sourcecode: print w/replies, xml ) Need Help??
Category: Miscellaneous
Author/Contact Info diskcrash
Description: This code is the main part of an X12 EDI (Electronic Data Interchange) store and forward system. It parses aggregate files of EDI transactions for ISA header information and forwards the EDI transaction on to an output directory. From there you can move the data to an EDI forwarder by FTP or other tools. It won't win ANY golf contests but was written to be maintainable.
#This perl script is used to parse ISA headers for EDI Transactions
#The transactions can then be forwarded to appropriate receivers by FT
#or other means. In effect it is an EDI store and forward system.
#diskcrash Jan 24, 2002 
#Change History
#date      #Description                        #who did it     
#01/28/02  V1.0 Creation                      diskcrash
#02/06/02  V1.1 Added choice of sender or receiver for output     disk
#02/06/02  V1.2 Removed additional end of transaction LF chars    disk

#****  Now set which directories are created, for senders vs. receiver
#****  Generally the directories created are receivers of wxyz data

my($nameflag)="r";             #file set for receiver output

#***  now set the default device for the file system

#*** now establish the default directories

$excpdir=$defdev."/wxyzprod/wxyzexcp/";        #dir to put exception f
+iles in
$logfile=$defdev."/wxyzprod/wxyzlog.txt";    #log file
$indir=$defdev."/wxyzprod/wxyzin/";        #input directory
$outdir=$defdev."/wxyzprod/suppliers/";        #top of suppliers dir t
$archive=$defdev."/wxyzprod/wxyzarch/";   #original files go here afte
+r processing

undef $/;        #this removes the default \n input line separator 
            #so that files shall be read
            #in as one giant string, regardless of embedded
            #line controls.
            #Note that this radically impacts all file reads
            #vs. normal "read by record" behavior.

#Start this perl script on system boot
#It first writes a timestamp to the wxyzlog.txt file in /wxyzprod
#In this startup section we'll check to make sure the script can read 
#the /wxyzprod/wxyzin directory, and stop if it can't
#The script also tests to see if the logfile is larger than 100 megaby
#if so it needs looking at or deleting. The logfile will be recreated 

#******write start time  to logfile

open( LOGGER, ">>$logfile");


print LOGGER $hour , $min ,$sec , $mday , $mon , $year," wxyzprod star

close (LOGGER);

#*******check for input directory and exception
if  (!opendir(PRODDIR,"$indir"))

    $timestamp= $hour . $min .$sec . $mday . $mon . $year;
    open( LOGGER, ">>$logfile");
     print LOGGER $timestamp . "\wxyzprod\wxyzin 
    directory not found-stopped!\n";
     die("No wxyzprod/wxyz input directory found");   

#******check for logfile size and e-mail
#******if the log file got really huge, it should be at least inspecte

$blksize,$blocks) = stat(LOGGER);

if ($size > 100000000)  # if log file size over 100 meg stop!

    $timestamp= $hour . $min .$sec . $mday . $mon . $year;
    open( LOGGER, ">>$logfile");
    print LOGGER $timestamp . "log file greater than 100Meg - WARNING!

#The intro stuff is done, the main loop follows. It sleeps for 20 seco
#then checks for new files in the /wxyzprod/wxyzin directory.
#If it finds one or more, it gets the file list and then checks the fo

#     Is the file at least 10 seconds old since last modified, if not 
#    it throws it back to get processed 20 seconds later

#    Does the file have a string ISA in it? If not it copies it to the
#    wxyzprod/wxyzexcp directory, might have been a CAD file or someth
#    It wasn't an EDI file, and there is no ISA header to decode it.

#    Does the file have a string IEA in it? (Same story, see above)

# If the file meets the above criteria, then its scanned for embedded 
#Feeds - as in ASCII "0A". These are replaced by nulls. There is a sin
#trailing LF.

#It is assumed there could be an unlimited number of ISA/IEA header 
#pairs. The file is scanned for each ISA and - Lordy Yes- fixed byte 
#count offsets are read in to reach the sender, receiver, date and tim
#This practice of fixed offsets could, of course, be blown out of the 
#with a single change. Fortunately the ISA header structure
#has been relatively fixed and stable over the years. 
#Just be aware of this potential issue.  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<

#***** main loop - the other end is at the bottom of the script

#*****sleep 20 seconds, then check for files again


#set deadman variable to 1
#stubbed out

#determine if any files exist in wxyzprod\wxyzin
#open the directory and get file list, then read files one by one 
#from the file list

opendir (FDIR,"$indir");

# Now the first two "files" will be .  and ..  which we will skip

#So.. if the file list is null, the following loop is bypassed
# This next loop reads each filename from @filelist

foreach $infile (@filelist)  #*** This is the start of the file proces
                             #loop, which ends after the last file is 

if ($infile eq "." || $infile eq "..")   #vestigial returned files-ign
goto escape1;

#we want to make sure the file is finished being written, so..
#determine if $mtime > 10 seconds and skip if not
#If its too young we'll get back to it next time


$blksize,$blocks) = stat($getinfile);


if ($delta < 10)     #***file too young, wait a few seconds
next;              #***jump to the end of the loop

#***ok, its older than ten seconds
#***so read the whole file into an array at one time

open(INF,"<$getinfile");  #***open each file in turn

$line=<INF>;      #This reads the whole input file at one time into th
+e $line. 
        #it will usually have only one entry (a single record)
        #, but if it has more and it is an EDI file, it will will need
        #all of the embedded LFs replaced by nulls.
        #This process has been tested for file sizes up to 50 Mbytes
        #on a 128 Mbyte Win2K system. I wouldn't really push this 
        #for files above 5 Mbytes, but then most EDI files are under
        # 10K bytes

close(INF);        #***close the input file

#*****determine if ISA header exists and if not post a log event and t
#the file to the /wxyzproc/wxyzexcp exception directory

if (index($line,"ISA") == -1)
{                    #if we are here,there is no ISA header
                    # so its not an EDI file
                    #write to the exception area and 
                    #delete from wxyzin dir


    $timestamp= $hour . $min .$sec . $mday . $mon . $year;
    print LOGGER "$timestamp , $infile , no ISA header found \n";

    $outfile=$excpdir . $infile;    
    print IEXCP "$line";

#****now delete the original file from the input directory

    goto escape1;   #exit from the file reading loop and look for next
+ file


#********determine if IEA trailer exists and exception
if (index($line,"IEA") == -1)
{                    #if you are here,it has no IEA trailer
                    # not an EDI file
                    #write to the exception area and 
                    #delete from wxyzin dir


    $timestamp= $hour . $min .$sec . $mday . $mon . $year;
    print LOGGER "$timestamp . $infile . -no IEA trailer found\n";
    $outfile=$excpdir . $infile;    
    print IEXCP "$line";

#****now delete the original file from the input directory

    goto escape1;   #exit from the file reading loop and look for next
+ file

$origline=$line ;  #***keep a copy of the original line, before LF rem
           #This string will be written to the wxyz Arcive, as is
#strip embedded LFs

$line=~ s/\x0a//;      #this replaces line feeds with nulls, in the wh

#*** (STUBBED OUT FOR NOW now put the trailing LF back in
# the file is now built to  spec  - January,2002

#In this next loop we will cycle through the file looking for the star
+t of
#ISA headers. Each header represents a transaction, possibly from a un
#sender. Each one is parsed out and the transaction is written with a 
#file name to the /wxyzprod/supplier/"unique" directory.

$ipos=0;            #set initial scanning point in $line, in the file

#***find next ISA - This is the start of the transaction loop, within 
+a file

while (index($line,"ISA",$ipos) > -1)        #if false we have process
+ed all transactions
                #in the file, so skip forward to next file
$ipos=index($line,"ISA",$ipos);  #Use this position to get next variab

#***sender is 35 after ISA

#***get sender


#***get receiver

$receiver=substr($line,$ipos+54, 5);

#*** based on the $nameflag being r or s, use the receiver or sender a
+s the key
#***name to used to make target directories

if ($nameflag eq "r")
elsif ($nameflag eq "s")

#***get date

$edidate=substr($line,$ipos+70, 6);

#***get time

$editime=substr($line,$ipos+77, 4);

#***now check for the IEA position, near the end of the transaction
$ieapos=index($line,"IEA",$ipos+80);      #always at least 80 chars aw
$transend=$ieapos+14;            #14 chars after the IEA it ends
$tlength=$transend-$ipos+1;        #get length of the transaction

#**** Now, get just this transaction as a substring from the $line str

$ipos=$transend;              #set up for next ISA scan
                    #as the start point for the next
                    #transaction (if there is one)

#***the following file counter is used if there are  ambiguous file na


#***check for sender or receiver directory as selected by $nameflag

if (!opendir(DIR,$outdir.$fname)) 
    $timestamp= $hour . $min .$sec . $mday . $mon . $year;
    print LOGGER "$timestamp . $infile . new dir made for $fname\n";

#***Check for unique file name and keep going until you get one

while (-e $outdir.$fname.$slash.$outfile)

#***ok, there is now a unique file name
#***write new file to dir and filename, close it

$supname=$outdir . $fname . $slash . $outfile;

open (FOUT, ">$supname");
print FOUT "$transaction";

}    #******this is the end of the transaction loop, within a file

#so now we write the original data to the archive and make a note
#in the log file

#**write create date/time, process date/time, file size and trans coun
+t to log
    $timestamp= $hour . $min .$sec . $mday . $mon . $year;
$outbuf=$timestamp ." " . $infile ." sender ".$sender." Receiver ".$re
    ." length ". $flen;
    print LOGGER "$outbuf\n";
#***write the original file into the archives, with original file name

$archname=$archive . $infile;
    open(ARCH, ">$archname");
    print ARCH "$origline\n";

#****now delete the original file from the /wxyzprod/wxyzin directory

escape1:    #if file name is . or .. we got here and bypassed processs

    #***** This is the end of the file processing loop, all files
    #in the wxyzprod/wxyzin directory have been read and processed at 
    #point - so go back and sleep for a few seconds, then look for fil

#*****back to the top
goto A;

#This is the end of the wxyzprod perl script

#The Deadman process - stubbed out for now

#Every 20 seconds the wxyzprod sets a variable to 1
#Deadman runs every two minutes and sets it to zero
#If deadman detects two zeros in a row, send e-mail and log file it

Replies are listed 'Best First'.
Re: EDI Store and Forward System (what's with the gotos?)
by grinder (Bishop) on Oct 02, 2002 at 09:33 UTC

    Scanning the code quickly, I see you use goto escape1, and the label escape1 appears at the end of block. This means you can replace the statement by a last or next instead. This is much more maintainable (which you state as a goal). People will have to have to ask themselves whether to add code before or after the label, and wonder what the difference is.

    At the end of the script, there is another unconditional goto to somewhere above. This would be much better written as a while(1) { ... } loop or even { ...; redo}.

    The comments are useless. You can't do anything with them, apart from reading them when looking at the code. If you want to comment that much, you should be doing it in POD. That way you can extract the documentation and do something with it with external tools

    You are not checking to see whether your open calls are succeeding or not. Things could be broken and you wouldn't know about it.

    I would write:

    $excpdir=$defdev."/wxyzprod/wxyzexcp/"; $logfile=$defdev."/wxyzprod/wxyzlog.txt"; $indir=$defdev."/wxyzprod/wxyzin/"; $outdir=$defdev."/wxyzprod/suppliers/"; $archive=$defdev."/wxyzprod/wxyzarch/";
    my $dir = '/wxyzprod'; $excpdir="$defdev$dir/wxyzexcp/"; $logfile="$defdev$dir/wxyzlog.txt"; $indir="$defdev$dir/wxyzin/"; $outdir="$defdev$dir/suppliers/"; $archive="$defdev$dir/wxyzarch/";

    Factor out the common substrings. Use interpolation. It's easier to maintain (one change) and easier on the eyes (less keystrokes). I'd also use whitespace air the code visually.

    If your statting a file to get its mtime, just say so.

    ($device,$inode,$mode,$nlink,$uid,$gid,$rdev,$size,$atime,$mtime,$ctim +e, $blksize,$blocks) = stat($getinfile); my($timenow)=time; $delta=$timenow-$mtime; # versus $delta = time - (stat($getinfile))[9];

    That way you don't have all those unused variables lying around.

    And look for a way to factor out the blocks beginning with if (index($line,"ISA") == -1) into a subroutine. Any time you have large slabs of lines doing the same thing, it will be far more maintainable to have only one copy, and pass the information that changes via parameters.

    Hope this helps.

    print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'
Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: sourcecode [id://202181]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others examining the Monastery: (4)
As of 2017-03-25 02:49 GMT
Find Nodes?
    Voting Booth?
    Should Pluto Get Its Planethood Back?

    Results (310 votes). Check out past polls.