PerlMonks  

Cannot work on second file after reading first file.

by 1straw (Novice)
on Feb 28, 2014 at 15:40 UTC (#1076547=perlquestion)
1straw has asked for the wisdom of the Perl Monks concerning the following question:


Hi Monks:

I'm starting to learn Perl and trying to use it in my first real life project. This is my first post on perlmonks.org.

The purpose of the program is to read 2 files and write a CSV batch file to print price labels with glabels (the program is unfinished).

The files to read are:

1) An EDI purchase order. This is a file with fields separated by * and lines terminated by ~

2) A CSV file with three columns: Store number, type of store (there are 2 types that require different graphics on the label), and store location.

So far I've managed to extract the information I need from the EDI file (probably in a very long-winded way). I wrote a subroutine to read the CSV file (last sub in the code: &separate_stores) and it worked fine when I ran it by itself.

PROBLEM:

Seeing that &separate_stores worked as I wanted, I included it in my main program. When I run it with the other subroutines, &separate_stores seems to never go into the while loop and the two store number arrays return empty. I thought this might be due to the fact that I've read the other file (EDI) earlier in the program and it is somehow interfering with the second read, but I'm not sure and cannot find a way to fix this. Any help is much appreciated. Thank you!

EDI file structure:

The EDI file is a standard EDI purchase order. The program deletes all information before the first line starting with PO1. It then reads the product code from each PO1 line (e.g. *THSB01*). The next line is unnecessary. The following line gives store numbers and quantities to be supplied for that product (e.g. *00001*12*00002*10*etc*), then another PO1 line, etc.
ISA*00* *00* *ZZ*EXAMPLE *ZZ*EDISUPPLIER *130315*0332*U*00304*000001345*0*P*:~GS*PO*EXAMPLE*EDISUPPLIER*130315*0332*993*X*003040~ST*850*000003696~BEG*00*SA*20202020**130315~REF*DP*0155~DTM*011*130312~DTM*001*130318~N1*SU**92*123456~N1*ST**92*00090~PO1**17*EA*42*CA*EN*1234567890123*SK*1234567890*IZ*v*CL*v*ST*THSB10*BC*LIMPIEZA*BL*005~CTP**RTL*11~SDQ*UN*92*00021*15*00072*2~PO1**6*EA*42*CA*EN*1234567890123*SK*1234567890*IZ*v*CL*v*ST*THSB11*BC*LIMPIEZA*BL*005~CTP**RTL*11~SDQ*UN*92*00012*2*00134*4~PO1**4*EA*46*CA*EN*1234567890123*SK*1234567890*IZ*v*CL*v*ST*THSB12*BC*LIMPIEZA*BL*005~CTP**RTL*11~SDQ*UN*92*00016*2*00113*1*00165*1~PO1**2*EA*103*CA*EN*1234567890123*SK*1234567890*IZ*v*CL*v*ST*THSB01*BC*LIMPIEZA*BL*005~CTP**RTL*20~SDQ*UN*92*00012*2~PO1**1*EA*52*CA*EN*1234567890123*SK*1234567890*IZ*v*CL*v*ST*THSB03*BC*LIMPIEZA*BL*005~CTP**RTL*11~SDQ*UN*92*00065*1~PO1**5*EA*27*CA*EN*1234567890123*SK*1234567890*IZ*v*CL*v*ST*THSB09*BC*LIMPIEZA*BL*005~CTP**RTL*6~SDQ*UN*92*00031*1*00080*3*00165*1~CTT*6*35~SE*27*000003696~GE*1*993~IEA*1*000001345~
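The structure described above can be sketched roughly as follows. This is only an illustration, not the code from the post: the sub name parse_po and the shortened sample string are made up, and the sketch assumes that the product code is always the field right after the ST qualifier in a PO1 segment and that the store/quantity pairs of an SDQ segment start at its fourth field.

```perl
use strict;
use warnings;

# Hypothetical helper: returns { product => { store => quantity } }.
# Assumes "~" ends each segment and "*" separates fields.
sub parse_po {
    my ($edi_text) = @_;
    my (%order, $product);

    for my $segment (split /~/, $edi_text) {
        my @fields = split /\*/, $segment;
        next unless @fields;

        if ($fields[0] eq 'PO1') {
            # The product code follows the "ST" qualifier.
            for my $i (1 .. $#fields - 1) {
                if ($fields[$i] eq 'ST') {
                    $product = $fields[$i + 1];
                    last;
                }
            }
        }
        elsif ($fields[0] eq 'SDQ' and defined $product) {
            # Skip "SDQ", unit and qualifier; the rest are store/quantity pairs.
            my @pairs = @fields[3 .. $#fields];
            while (@pairs >= 2) {
                my ($store, $qty) = splice @pairs, 0, 2;
                $order{$product}{$store} += $qty;
            }
        }
        elsif ($fields[0] eq 'CTT') {
            last;    # end of the line-item details
        }
    }
    return \%order;
}

# Shortened, made-up sample in the same shape as the purchase order above.
my $sample = 'N1*ST**92*00090~'
           . 'PO1**17*EA*42*CA*SK*1234567890*ST*THSB10*BC*LIMPIEZA~CTP**RTL*11~'
           . 'SDQ*UN*92*00021*15*00072*2~CTT*1*17~';
my $order = parse_po($sample);
print "$_: $order->{THSB10}{$_}\n" for sort keys %{ $order->{THSB10} };
```

A nested hash keeps product codes and store numbers apart, which a flat list would not. Note that the N1*ST segment is ignored because only segments whose first field is PO1 are searched for the ST qualifier.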

CSV example file:
# de tienda, tipo, ubicacion
00001,li,location1
00002,li,location2
00003,li,location3
00004,li,location4
00005,li,location5
00006,li,location6
00007,li,location7
00010,ff,location8
00011,ff,location9
00012,ff,location10
00014,li,location11
00015,ff,location12
00016,ff,location13
00017,ff,location14
00019,ff,location15
00021,li,location16
00022,li,location17
00025,ff,location18
00026,ff,location19
00027,ff,location20
00028,ff,location21
00029,li,location22
00031,li,location23
00034,ff,location24
00035,ff,location25
00036,ff,location26
00037,ff,location27

(The file consists of close to 100 locations. The preceding is a smaller version.)

CODE:
#!/usr/bin/perl
use strict;
use warnings;
use File::HomeDir;
use List::Util 'first';
use List::MoreUtils 'first_index';
use Text::CSV;

my @po = &open_po;
#my @po_no_headings = &erase_headings(@po);
#my @po_separated = &make_product_store_quantity(@po_no_headings);
#print "@po_separated\n";
my ($li_stores_ref, $ff_stores_ref) = &separate_stores;
my @liv_stores = @$li_stores_ref;
my @ffr_stores = @$ff_stores_ref;
print "Li stores: @liv_stores\n";
print "FF Stores: @ffr_stores\n";

# ****************************************
# 1) open .EDI file
# 2) split into lines through "~" line separator
# 3) Place .EDI file contents into @po_array
sub open_po {
    my (edi_file, $po_data, @po_array);
    edi_file = File::HomeDir->my_home . "/example/po//example.EDI" or die ".EDI file not found\n";
    open ($po_data, '<', edi_file) or die "Could not open the file 'edi_file' $!\n";
    undef $/;
    @po_array = split (/\~/, <$po_data>);
    close $po_data;
    return @po_array;
}

# ****************************************
# Erase all information from purchase order before first PO1 line
sub erase_headings {
    my ($delete_point);
    $delete_point = (first_index {/PO1/} @_) - 1; # Find first PO1 line
    # Shift array up to first PO1 line
    for (0..$delete_point) {
        shift @_;
    }
    @_;
}

# ****************************************
# Return array with product, store, quantity
# (i.e. THSB01 00021 3 00043 1 THSB02...etc.)
sub make_product_store_quantity {
    my (@line, @product_stores, $product, $product_index);
    while(scalar (@_) !=0) {
        @line = split (/\*/, $_[0]); # Split line into fields
        # Find product code (i.e. THSB01) and shift array twice
        if ($line[0] eq "PO1") {
            $product_index = (first_index {/ST/} @line) + 1;
            $product = $line[$product_index];
            shift;
            shift;
        }
        # Find stores & quantity for each store.
        # Write product code followed by stores & quantity to array @product_stores
        if ($line[0] eq "SDQ") {
            shift (@line);
            shift (@line);
            shift (@line);
            unshift (@line, $product);
            push (@product_stores, @line);
            shift;
        # Exit when PO line starts with CTT, meaning end of PO1 details
        } elsif ($line[0] eq "CTT") {
            undef (@_);
        }
    }
    @product_stores;
}

# ****************************************
# 1) Open and read stores.csv
# 2) Place all "li" type store numbers in @li_store array
# 3) Place all "ff" type store numbers in @ff_store array
# 4) Return array references
sub separate_stores {
    my (@li_stores, @ff_stores);
    my $stores_file = File::HomeDir->my_home . "/example/data/example.csv";
    open my $fh, "<", $stores_file or die "$stores_file: $!";
    my $csv = Text::CSV->new ({
        binary => 1,
        auto_diag => 1,
    });
    my $count_li = 0;
    my $count_ff = 0;
    $csv ->getline ($fh);
    while (my $row = $csv->getline ($fh)) {
        if ($row->[1] eq "li") {
            $li_stores[$count_li] = $row->[0];
            $count_li ++;
        } else {
            $ff_stores[$count_ff] = $row->[0];
            $count_ff ++;
        }
    }
    close $fh;
    return (\@li_stores, \@ff_stores);
}

########################

***PROBLEM SOLVED***

SUMMARY:

1) A subroutine to open and read a CSV file worked fine in isolation but stopped working when integrated into a larger script.

2) The problem stemmed from an earlier subroutine (open_po) which globally modified the input record separator ($/) through: undef $/;

3) Following sn1987a's post, the problem code was changed to modify the state locally: local ($/);

4) This modification solved the problem.

Thank you all for your help!

Re: Cannot work on second file after reading first file.
by davido (Archbishop) on Feb 28, 2014 at 16:08 UTC

    You're intentionally skipping the first line of your CSV, which is probably because it contains field names, and isn't useful to you. But there's nothing to test your assertion that the first line of the file will be field names, or that it even exists. You might change it to:

    my $head = $csv->getline($fh);
    die "Empty CSV." unless defined $head;
    warn "Field names detected: (@{$head})\n" if $ENV{MYSCRIPT_DEBUG};
    die "No more CSV rows to process after header.\n" if $csv->eof;

    Set an environment variable named "MYSCRIPT_DEBUG" to a true value before running if you want debugging info (the warn statement). This will at least test the assertion that your CSV file is not empty, that it starts out with field names, and that there are additional rows available to process after the header row.


    Dave


      Yes, the first line of the CSV contains field names, sorry for not mentioning that.

      Thank you for the recommendations. I had not tested for the existence of the file or the assertion that the first line is field names because this is a CSV that I made myself to use with this program. It will only be modified when the customer opens new shops. I will follow your recommendations.

      What I'm trying to get to with the program is on one hand to have an array with each product, store & quantity, taken from the EDI file, and on the other to have one array listing store numbers of each type.

      I then plan to go through the "product,store,quantity" array and make a sum of totals per product for each store type (i.e. THSB01 for li type stores = 30, THSB01 for ff type stores = 15, etc.). This is because I need a different price label format for each store type.
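One possible way to compute the totals described above is sketched below. It is not code from the thread: the sub name totals_by_type, the nested-hash layout of the input, and the store numbers and quantities are all assumptions chosen for illustration (the quantities are picked so that the result matches the 30 and 15 mentioned in the post).

```perl
use strict;
use warnings;

# Hypothetical helper: sums quantities per product and store type.
#   $order   : { product => { store_number => quantity } }
#   $type_of : { store_number => store type, e.g. "li" or "ff" }
sub totals_by_type {
    my ($order, $type_of) = @_;
    my %total;
    for my $product (keys %$order) {
        for my $store (keys %{ $order->{$product} }) {
            my $type = $type_of->{$store};
            unless (defined $type) {
                warn "Store $store is not in the store list\n";
                next;
            }
            $total{$product}{$type} += $order->{$product}{$store};
        }
    }
    return \%total;
}

# Illustrative data only.
my %type_of = ('00001' => 'li', '00002' => 'li', '00010' => 'ff');
my %order   = (THSB01 => { '00001' => 12, '00002' => 18, '00010' => 15 });

my $total = totals_by_type(\%order, \%type_of);
for my $product (sort keys %$total) {
    for my $type (sort keys %{ $total->{$product} }) {
        print "$product for $type type stores = $total->{$product}{$type}\n";
    }
}
```

Looking up the store type in a hash keyed by store number avoids searching two separate arrays for every store in the purchase order.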

      Would it help if I posted dummy store.csv and po.edi files?

      If I put the code for the CSV file by itself and run it, it does what I want it to do (i.e. list of store numbers in two arrays). The problem only happens when I try to integrate that subroutine into the rest of the program.

      Thanks!

        Did you try running it with the additions I proposed? They are tests that attempt to get to the bottom of why your while() loop seems to never be entered. That's relevant information.


        Dave

Re: Cannot work on second file after reading first file.
by McA (Priest) on Feb 28, 2014 at 16:09 UTC

    Hi,

    just a guess. I've to cite the man page for Text::CSV: "...getline...It reads a row from the IO object $io using $io->getline () and parses this row into an array ref. This array ref is returned by the function or undef for failure.".

    IMHO you're not checking the case of a failure, so it seems possible that the while loop is exited because of an error.

    UPDATE: There is another problem:

    my (edi_file, $po_data, @po_array);

    I get a compiling error thrown.

    Regards
    McA

      Presumably Text::CSV's auto_diag attribute works correctly, so testing for error isn't so much the issue as testing for additional lines to get.


      Dave

        You're right. I saw it afterwards that the auto_diag flag is set. So, I'm happy I wrote "guess" ;-)

        Regards
        McA


      Thank you McA. Sorry for the error. I was writing the program with variables in Spanish and decided to change to English before posting here. I ran a search & replace on that variable and replaced without $. All edi_file should be $edi_file.

        I've added some comments to your separate_stores function to try to better explain the potential issues I see.

        sub separate_stores {
            my (@li_stores, @ff_stores);
            my $stores_file = File::HomeDir->my_home . "/example/data/example.csv";
            open my $fh, "<", $stores_file or die "$stores_file: $!";

            # So at this point we know that you were able to successfully
            # open example.csv, because you tested your open.

            my $csv = Text::CSV->new ({
                binary => 1,
                auto_diag => 1,
            });

            # Also know that if your parsing of CSV generates an error,
            # that error will be spat to the screen for us. Good.

            my $count_li = 0;
            my $count_ff = 0;
            $csv ->getline ($fh);

            # We don't know if the preceding line returned 'undef',
            # indicating an empty CSV file. We also don't know if
            # it really stripped away a header row, or something useful
            # instead. We also don't know if there is any more CSV remaining
            # after processing that first line.

            #################
            # Change the preceding line "$csv ->getline ($fh);" to the following,
            # for better diagnostics:
            #
            my $head = $csv->getline($fh) or die "Empty CSV file $!\n";
            warn "CSV header line contained (@{$head})\n" if $ENV{MYTEST_DEBUG};
            die "Nothing found after CSV header line.\n" if $csv->eof;
            #
            # Now set an environment variable "MYTEST_DEBUG" true, and re-run your test.
            #################

            while (my $row = $csv->getline ($fh)) {

                # Your 'while' loop will never be entered if you're already at
                # the end of file. We don't know if that's an issue, because you
                # didn't explicitly test what happened when you stripped the first
                # CSV line.

                if ($row->[1] eq "li") {
                    $li_stores[$count_li] = $row->[0];
                    $count_li ++;
                } else {
                    $ff_stores[$count_ff] = $row->[0];
                    $count_ff ++;
                }
            }
            close $fh;
            return (\@li_stores, \@ff_stores);
        }

        It's almost certain that if this subroutine is failing in the big script, one of those assertions will also fail in the big script. And that will give you better information on why your subroutine is failing to do what you want.


        Dave

Re: Cannot work on second file after reading first file.
by sundialsvc4 (Abbot) on Feb 28, 2014 at 18:13 UTC

    Hello, and welcome to the Monastery!

    Since this has turned into quite a detailed-and-deep (as well as very useful) thread, could you please post a summary of how things turned out, at or near the end of the thread.   What the issues turned out to be, and how they were dealt with.   This will make it that much easier for the next Monk to quickly grok the issue by (starting with ...) reading one post.   It’s a nice way, I think, to package a thread for posterity.


      Thank you sundialsvc4!

      I have not yet been able to solve this issue. The small script works well in isolation but stops working when integrated into the larger script. I still have no idea what's causing this.

      I will certainly write this up once I figure out what's causing the problem and how to fix it!
Re: Cannot work on second file after reading first file.
by GotToBTru (Curate) on Feb 28, 2014 at 18:38 UTC

    As a beginner with perl, it would do you good to learn about the debugger. It would allow you to execute each line of your program one at a time, and inspect the values of variables to confirm that things are happening as you expect. That can be very useful when the problem is one like this, where you suspect the outer program is somehow changing something.


      Thank you for the advice! I will spend some time learning to use the debugger as well as Data::Dumper.
Re: Cannot work on second file after reading first file.
by sn1987a (Monk) on Feb 28, 2014 at 18:45 UTC

    The immediate problem is the undef $/; in open_po. Replace it with local ($/);

    A couple notes on my process for debugging this:

    First, I used Data::Dumper to dump the result of calling $csv->getline_all to see the entire parse being returned. This verified the only thing being returned was the header row.

    Then I commented out the call to open_po and saw that the file was now being parsed correctly. This meant the problem was some state being globally modified in open_po.

    Study of that sub led to the problem.
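A minimal sketch of the suggested fix, using only core Perl. The sub name slurp and the two in-memory strings are made up for the example and merely stand in for the EDI and CSV files. The point is that local restores the previous value of $/ when the sub returns, so later reads are unaffected.

```perl
use strict;
use warnings;

# Hypothetical helper: reads everything from a filehandle in one go.
# "local $/" undefines the record separator only until the sub returns.
sub slurp {
    my ($fh) = @_;
    local $/;
    return scalar <$fh>;
}

# In-memory filehandles stand in for the EDI and CSV files.
my $edi_text = "ST*850~BEG*00~";
my $csv_text = "header\n00001,li\n00002,ff\n";

open my $edi_fh, '<', \$edi_text or die "open: $!";
my @segments = split /~/, slurp($edi_fh);
close $edi_fh;

# $/ is back to its default here, so the next file is read line by line.
open my $csv_fh, '<', \$csv_text or die "open: $!";
my @lines = <$csv_fh>;
close $csv_fh;

print scalar(@segments), " segments, ", scalar(@lines), " lines\n";
```

With a plain undef $/; inside slurp instead of local $/;, the second read would return the whole CSV text as a single element.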


      Thank you sn1987a! This solved the problem.

      I will learn to use Data::Dumper & read the debugger tutorial suggested as well.

      SUMMARY:

      1) A subroutine to open and read a CSV file worked fine in isolation but stopped working when integrated into a larger script.

      2) The problem stemmed from an earlier subroutine (open_po) which globally modified the input record separator ($/) through: undef $/;

      3) Following sn1987a's post, the problem code was changed to modify the state locally: local ($/);

      4) This modification solved the problem.

      Thank you all for your help!
Re: Cannot work on second file after reading first file.
by McA (Priest) on Feb 28, 2014 at 19:34 UTC

    Hi all,

    coming home and seeing that this thread is growing I really get curious about what the problem is. 1straw seems to be a nice guy. So, I took the corrected program, created the directories it refers to and put the data into the files.

    After the first run I see the same weird result as 1straw.

    So I followed the advice of davido, who gave the hint that it would be interesting to see what is slurped by the first getline in the relevant function. I added a use Data::Dumper; to its friends at the beginning of the script and changed the line

    $csv ->getline ($fh);

    into

    my $header = $csv ->getline ($fh);
    print Dumper(\$header), "\n";

    Another run showed the following output:

    $VAR1 = \[ '# de tienda', ' tipo', ' ubicacion', 'li', 'location1', 'li', 'location2', 'li', 'location3', 'li', 'location4', 'li', 'location5', 'li', 'location6', 'li', 'location7', 'ff', 'location8', 'ff', 'location9', 'ff', 'location10', 'li', 'location11', 'ff', 'location12', 'ff', 'location13', 'ff', 'location14', 'ff', 'location15', 'li', 'location16', 'li', 'location17', 'ff', 'location18', 'ff', 'location19', 'ff', 'location20', 'ff', 'location21', 'li', 'location22', 'li', 'location23', 'ff', 'location24', 'ff', 'location25', 'ff', 'location26', 'ff', 'location27' ];

    Uuuuuppsss, what is that? The whole file is slurped as one line. How can this be? What is the line separator? Oh, there is no explicit definition! So, what is taken? I looked at the docs: Ahhh, the default seems to be the special variable $/. Oh, I've seen that variable in the script. Looking around I found this code in open_po:

    open ($po_data, '<', $edi_file) or die "Could not open the file '$edi_file' $!\n";
    undef $/;
    @po_array = split (/\~/, <$po_data>);

    Oh, oh, you've changed a GLOBAL variable which changes the semantics of Text::CSV. (Search for "Global variables are bad" in Perlmonks ;-) )
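The effect of an undefined $/ can be reproduced with Perl's built-in readline alone, without Text::CSV. The following is only a sketch: the sub name first_record and the three-line sample text are made up for illustration, and it says nothing about Text::CSV internals beyond what the docs quoted in this thread state.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: reads one record from in-memory text using the
# given record separator (undef means "no separator", i.e. slurp mode).
sub first_record {
    my ($text, $separator) = @_;
    local $/ = $separator;
    open my $fh, '<', \$text or die "open: $!";
    my $record = <$fh>;
    close $fh;
    return $record;
}

my $csv_text = "# de tienda, tipo, ubicacion\n00001,li,location1\n00002,li,location2\n";

# Show newlines as \n in the dump instead of printing them literally.
$Data::Dumper::Useqq = 1;

print Dumper(first_record($csv_text, "\n"));    # just the header line
print Dumper(first_record($csv_text, undef));   # the whole text as one record
```

Setting $Data::Dumper::Useqq makes embedded newlines and other control characters visible, which helps when a "line" turns out to contain more than expected.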

    So, a simple proof of whether this is the culprit. Put a local in front of it:

    local $/ = undef;

    and the next run shows that it's found.

    Conclusion: davido gave the right hint, but it seems you did not put a print statement on the line that reads the header, which would definitely have pointed to the cause. Changing global variables without localisation is risky. Rather than localizing by hand, consider CPAN modules that do what you want, e.g. File::Slurp. Use Data::Dumper as often as possible when you want to print debug lines: it frees you from thinking about what is stored in a variable.

    Hope, we all could help you.

    UPDATE: Argggghhh, this is the problem here: When you write a long answer, it can happen that someone is answering meanwhile. After looking at the results of my post I've seen that the right hints were given meanwhile. So, a ++ to those who were faster. :-)

    Best regards
    McA

      That's exactly why I wanted to print "(@{$head})". It would tell me if $head contained stuff that shouldn't be there. The OP told me the following, "When I ran the script it printed: Field names detected: (# de tienda  tipo  ubicacion". I should have known -- because he didn't show me the closing paren ')' -- that his report to us was incomplete, and should have suspected that it lacked exactly the piece of information we were looking for.

      That doesn't excuse my not having seen "undef $/" earlier in the script, but had he reported the full output, or had I noticed that his report was incomplete, it would have led us to look for the modification of $/ sooner even though it was at first not noticed.


      Dave

        After seeing the output of the first line, the cause was obvious in combination with having seen $/ in the script. But without that I would have read over this piece of code many times without seeing the problem.

        But the next time my presented solution will be a one-liner and not a whole story to see myself being late in the thread... ;-)

        Best regards
        McA


        Hi Dave:

        I still don't understand what went wrong when I followed your instructions. I did mention the odd open parenthesis "(" without a closing parenthesis, but that's definitely all that was output. I just tried again with undef $/; and I can confirm it. (The "Li stores:" and "FF Stores:" lines were in the original output before implementing your code and were still there afterward, but I didn't think it was important to mention.)

        x@y:~/example$ perl example
        Field names detected: (# de tienda tipo ubicacion
        Li stores:
        FF Stores:
        x@y:~/example$

        I post this out of curiosity & to avoid making the same mistake in the future

        Thank you for your help, I really appreciate the time you took to help me figure this out!!

        1straw

      Thank you for the long answer McA! I will look into File::Slurp as well. I hope to continue learning and avoid these silly errors. I was aware that global variables were a bad idea, I just didn't realize I was changing $/ globally. I have very little programming experience in general, and about 1 week experience with Perl :P

      Thanks again!

        Hi 1straw,

        you're welcome!

        IMHO it's outstanding how you come back to your thread and respond to the comments and advice given. That does not happen too often. And I'm pretty sure this is the reason why you got help from all directions.

        I wish you much fun with programming in Perl.

        Best regards
        McA
