AssFace has asked for the wisdom of the Perl Monks concerning the following question:

I am seeing an odd behavior here, which is making me assume at this point that something I thought I was doing properly is actually incorrect.

I have a directory A. In that directory are thousands of files whose names are all capital letters: AAA, ABC, TTH, RTYU, etc. (stock tickers).

I have a directory within that A directory that has files that are named exactly like the files in the directory above. But there is different information in there.
That data looks like:
H: 1.000000 B: 2,0,-1.57
My code is starting off and getting all of the filenames out of the subdir and populating an array. I have checked that visually and each spot is correct and there are no blank spots in there.

I iterate over that array and I open the file that matches that in the A directory. I put that into an array, and I then splice off parts of it, and then reverse that array.
All of that works perfectly over and over and over again for thousands of iterations.

But the instant I uncomment this code here that follows that code, it does something I've never seen before.
$strASub = '';
open(ASUBFILE,"/A/sub/$strTName") or die "can't open the asub file: $strTName : $!\n";
while(<ASUBFILE>){
    $strASub .= $_;
}
close(ASUBFILE) or die "can't close the asub file: $strTName :$!\n";
Once that code is uncommented, then it fails. It will dump out something that clears my ssh screen and then a bunch of space, and then some other tickernames will show up separated by odd characters like superscript 1s and other things like that.

If I comment the code out, then everything works well; uncommented, it crashes. I can't track down which file is causing it because the script dies and clears the screen. I tried debugging it locally on my laptop, but this laptop overheats and dies easily.

Is that open call incorrect? Is there an easy way I could check over 1000 files to see if one has odd data in it that is causing this to break?
It looks like a similar issue to a buffer overflow.


-------------------------------------------------------------------
There are some odd things afoot now, in the Villa Straylight.

Re: Strange crash - any ideas?
by jonadab (Parson) on Oct 20, 2003 at 02:23 UTC

    First thing I'd do is stick a test in to check for any unexpected data. Write a regex that should match the data you _do_ expect, and log anything that doesn't match, together with the name of the file it comes from...

    while(<ASUBFILE>){
        unless (/^[A-Za-z0-9 other expected chars]*$/) {
            print LOG "Encountered unexpected data for $strTName: $_\n";
        }
        $strASub .= $_;
    }
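
The check suggested above can also be run over every file in one pass. What follows is only a sketch: the sub name `find_odd_lines` is hypothetical, and the set of allowed characters is an assumption guessed from the single sample line in the question (`H: 1.000000 B: 2,0,-1.57`), so it would need adjusting to the real data. Non-printable bytes are reported as `\xNN` so that the report itself cannot disturb the terminal.

```perl
use strict;
use warnings;

# Hypothetical helper: return one "file:line: text" report for every line in
# the given files that contains characters outside an expected set.
# The allowed set is an assumption based on the sample line in the question
# (letters, digits, space, colon, comma, period, minus sign).
sub find_odd_lines {
    my @files = @_;
    my @reports;
    for my $file (@files) {
        open(my $fh, '<', $file) or die "can't open $file: $!\n";
        while (my $line = <$fh>) {
            chomp $line;
            next if $line =~ /^[A-Za-z0-9 :,.\-]*$/;
            # Show non-printable bytes as \xNN instead of printing them raw.
            (my $safe = $line) =~ s/([^\x20-\x7e])/sprintf('\\x%02x', ord $1)/ge;
            push @reports, sprintf('%s:%d: %s', $file, $., $safe);
        }
        close($fh) or die "can't close $file: $!\n";
    }
    return @reports;
}

# Example use: scan every plain file in the directory from the question.
# print "$_\n" for find_odd_lines(grep { -f } glob('/A/sub/*'));
```

Using named variables (`$file`, `$line`) rather than `$_` keeps the two loops from interfering with each other.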

    $;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}} split//,".rekcah lreP rehtona tsuJ";$\=$ ;->();print$/
Re: Strange crash - any ideas?
by Roger (Parson) on Oct 20, 2003 at 02:20 UTC
    Question 1 - is "/A/sub/$strTName" a file? If so, you probably need to open it with open ASUBFILE, "</A/sub/$strTName".

    Question 2 - is $strTName defined? Try to insert a print statement and print out the $strTName variable.

    Question 3 - Why is there a reference to $strTickerName, not $strTName?

    The behaviour you have described happens most likely when you read past the end of the file, which happened to me a few times in the past.

    If you are running the script in a Unix environment, you could try the excellent 'ddd' debugger, which does a good job of debugging Perl scripts.
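
On Question 1, for reference: a two-argument open whose filename has no mode character already opens the file for reading, so the leading "<" is optional there. The three-argument form with a lexical filehandle is a common alternative, because the mode is never parsed out of the filename. A minimal sketch, in which the sub name `slurp_file` is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file into one string using the
# three-argument form of open and a lexical filehandle.
sub slurp_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "can't open $path: $!\n";
    local $/;    # undefine the input record separator so one read returns everything
    my $contents = <$fh>;
    close($fh) or die "can't close $path: $!\n";
    return defined $contents ? $contents : '';
}

# Example use with the path from the question:
# my $strASub = slurp_file("/A/sub/$strTName");
```

This version reads into a named variable and never assigns to `$_`.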

      Q1) Yes, it is a file. For just plain reading in of a file, I have always just left it as is and it has worked. I made the change that you pointed out and it still fails.

      Q2) I posted up another node on here that shows a chunk of the code that is having the issue. The first time through the loop, the $strTName is defined and outputs, the second time through the loop, it doesn't appear to be defined. Where is it getting killed?

      Q3) $strTickerName is what it is called in my code. I was trying to shorten the names I had posted so that they wouldn't line wrap as much and would be easier to read. I just missed that one in my haste to post, so I went back and changed it.


      -------------------------------------------------------------------
      There are some odd things afoot now, in the Villa Straylight.
Re: Strange crash - any ideas?
by AssFace (Pilgrim) on Oct 20, 2003 at 02:25 UTC
    Here is a larger code example - the array @testDateRows has the series of dates that I will be testing - in this case close to a thousand days.
    The array @algTickers has all of the names of the data files that will be looked at - several thousand are in there.

    When I set it to close after it runs once (tmpCounter > 0), then it works great that one time. It will output all the tickers quickly and it doesn't complain.
    When I set the tmpCounter to 1, as it is below, meaning that it runs through the outer loop twice, then I see issues in there for every single spot.
    I get the error "Use of uninitialized value in concatenation (.) or string at" some line number in the thing. (it refers to the line number where the open is trying to take place)

    This would make me think that since the names are derived from pulling out the filenames of a directory, then the "." and ".." snuck into the array. But I have accounted for their removal in the code that populates that array, and I have output that array to a file to look at the data and it is all correct.

    What is wrong with the code below that would allow it to work perfectly one time through the loop, and then the next time through the loop (and every time after that) fail?
    I also haven't figured out why at some point it will also clear the screen out - at this level of iteration it doesn't do it, but it will if I get rid of the if statement that will exit.
    my $tmpCounter = 0;
    for(@testDateRows){
        if($tmpCounter > 1){
            exit;
        }
        @tmpArr = split(',',$_);
        $formattedName = $tmpArr[0];

        #iterate through all the tickers
        for(@algTickers){
            $strTickerName = $_;

            #there is an open call in here that works, I commented it out to see if I could still reproduce the problem
            # open...

            print "$strTName:$formattedName\n";

            #now we need to load the a file
            $strAlg = '';
            open(ALGFILE,"/A/sub/$strTName") or die "can't open the asub file: $strTName : $!\n";
            while(<ALGFILE>){
                $strAlg .= $_;
            }
            close(ALGFILE) or die "can't close the asub file: $strTName :$!\n";
        }
        $tmpCounter++;
    }


      This seems to have resolved the problem. Changing the for loop so that it had:
      for $ticker (@algTickers){
          #use $ticker for anything for this spot in the array
      }
      instead of
      for(@algTickers){
          #use $_ into a var and then use that var for this spot in the array
      }
      This must have been something I missed as I was learning about "$_" out of a for loop - I must have only used it in straight loops and not nested loops up until now. Checked the camel book and in the foreach section it does say that each $_ is a reference, which I knew, but I don't see any code there that is clearing out that spot in the array... not sure why it was doing it, but this seems to have resolved the issue and then the ensuing side effects.

      Now I need to uncomment a lot of lines to confirm that this fixes the issue.
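
A likely explanation for the symptom, offered as a sketch rather than a diagnosis of the original script: in `for(@algTickers)`, `$_` is an alias for the current array element, and `while(<FH>)` assigns each line it reads to that same global `$_` without localizing it. The inner read therefore overwrites the array element and leaves it undefined once the file is exhausted. That would match a first pass that works and later passes that warn about uninitialized values. The ticker names and the in-memory "file" below are made up for illustration.

```perl
use strict;
use warnings;

# Illustrative data only: an in-memory "file" stands in for /A/sub/<ticker>.
my $data = "H: 1.000000 B: 2,0,-1.57\n";

# Version 1: implicit $_ in both loops, as in the original code.
my @clobbered = ('AAA', 'ABC', 'TTH');
for (@clobbered) {
    open(my $fh, '<', \$data) or die "can't open in-memory file: $!\n";
    while (<$fh>) { }    # assigns to the same $_ that aliases the array element
    close($fh);
}

# Version 2: named loop variables, as in the fix.
my @intact = ('AAA', 'ABC', 'TTH');
for my $ticker (@intact) {
    open(my $fh, '<', \$data) or die "can't open in-memory file: $!\n";
    while (my $line = <$fh>) { }    # $line is private, $ticker is untouched
    close($fh);
}

print "clobbered: ", join(',', map { defined $_ ? $_ : 'undef' } @clobbered), "\n";
print "intact:    ", join(',', @intact), "\n";
```

After the first loop every element of `@clobbered` is undefined, while `@intact` still holds its three names.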


      -------------------------------------------------------------------
      There are some odd things afoot now, in the Villa Straylight.
        Add the following to the top of your perl script:
        use strict;
        use Data::Dumper;
        And then add print Dumper(\@algTickers); before and after your for loop, so you can inspect the array and see what has changed.
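
With several thousand tickers, two full dumps are hard to compare by eye. One option is to compare the two dumps as strings and print only when they differ. This is a sketch under assumptions: the sub name `changed_by` and the placeholder tickers are hypothetical, and the code reference passed in stands in for the real loop.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: run a block of code and report whether it changed
# the array, by comparing Data::Dumper snapshots taken before and after.
sub changed_by {
    my ($array_ref, $code) = @_;
    local $Data::Dumper::Sortkeys = 1;    # stable output if elements hold hashes
    my $before = Dumper($array_ref);
    $code->();
    my $after = Dumper($array_ref);
    return $before ne $after;
}

my @algTickers = ('AAA', 'ABC');    # placeholder tickers
# Stand-in for the real loop: overwrite every element through the $_ alias.
my $changed = changed_by(\@algTickers, sub { $_ = undef for @algTickers });
print "array was modified\n" if $changed;
```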