PerlMonks  

Re^2: Multithreading leading to Out of Memory error

by joemaniaci (Sexton)
on Jun 14, 2013 at 20:28 UTC


in reply to Re: Multithreading leading to Out of Memory error
in thread Multithreading leading to Out of Memory error

So after a few days of intermittent network connectivity (the data is on a networked drive) and testing, I figured it out. I think.

push(@array, $data);

I cut things down to a single processing thread, and then to parsing only a single file type at a time. Once I did that the behavior went away, and of course it wasn't until the final file type that the behavior came back. I looked through my code to see what that parser had that no other file type had, and it was...

 @array = sort { $a <=> $b } @array;

So I took out that code and tested again, but the problem was still present. So I commented things out piecemeal until I narrowed it down to the single push line shown at the top of this post. With that single line of code in place, 3 additional MB of memory is used up as the thread leaves the parsing method for that particular file type.

So here is the basic rundown of this file.

sub parsefiletypeX {
    my $filename = shift;
    #get the directory from the filename (w/ directory)
    open(IN, $filename) or die...
    open(OUT, $outfile) or die...
    my $lineCount = 1;
    my $nextline = <IN>;
    my $headerlines;
    my $samplesize = 1;
    my @array;

    $nextline = trim_whitespace($nextline); #my subroutine
    ++$lineCount; #Did this before I learned about $.

    #read the header for the file (five lines)
    #Do a bunch of regex checks on the header lines

    #Read in the first record, which contains its own line of sub header data as well as the two lines of actual data in pascal float format I believe. Or fortran actually.
    #Regex on the first line and push a certain piece of data. This line is NOT the bad line
    push( @array, @fi[4]); #1st line has 5 values

    #Regex checks on the next two lines

    #Then read in the rest of the 3-lined records
    until(eof(IN)) {
        #get the same three lines and do the regex checks
        #now the faulty call is made
        push( @array, @fi[4]);
    }

    #Do stuff to the @array, like sorting and determining certain checks.
    close IN;
    close OUT;
    #call function that uploads a record to the database.
}
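For anyone who wants something they can actually run, here is a minimal sketch of the same flow. Everything in it is an assumption made for illustration: the name parse_filetype_x, the five-line header, and the "three lines per record, five whitespace-separated values on the first line" layout are stand-ins, not the real format. It uses a lexical filehandle and $. instead of a hand-rolled line counter, and reads from an in-memory string so no real data is needed.

```perl
use strict;
use warnings;

# Hypothetical layout (NOT the real file format): five header lines, then
# three-line records whose first line holds five whitespace-separated values.
sub parse_filetype_x {
    my ($fh) = @_;
    my @values;

    # Consume the five header lines; $. holds the current line number.
    while (my $line = <$fh>) {
        last if $. >= 5;
    }

    until (eof($fh)) {
        my $first = <$fh>;    # sub-header line with five values
        my $data1 = <$fh>;    # two data lines, ignored in this sketch
        my $data2 = <$fh>;
        my @fi = split ' ', $first;
        push @values, $fi[4] if @fi >= 5;
    }

    # Return a reference so the caller gets one sorted copy.
    return [ sort { $a <=> $b } @values ];
}

# Build a tiny in-memory "file" with three made-up records.
my $text = join '', map { "header $_\n" } 1 .. 5;
$text .= "a b c d $_\n1.0 2.0\n3.0 4.0\n" for (3, 1, 2);

open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $sorted = parse_filetype_x($fh);
close $fh;

print "@$sorted\n";
```

The sketch pushes $fi[4] (a scalar element) where the post has @fi[4] (a one-element slice). In a push the two add the same value, so this is a style choice and not a claimed fix for the memory growth.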

I even tried clearing out that array after I was done using it, such as...

@array = (); undef @array;

But this has no effect!?!? So what is going on?
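One thing that may be relevant, though it is not confirmed as the cause here: the two statements above do different things, and neither is guaranteed to shrink the process. @array = () empties the array but keeps its allocated buffer, while undef @array also hands that buffer back to perl's allocator. perl typically keeps freed memory for reuse and does not return it to the operating system, so a process monitor may show no drop either way. The sketch below (the helper name summarise and its stand-in data are invented) shows both calls, plus the alternative of keeping the big array in a tight lexical scope and returning only the small result.

```perl
use strict;
use warnings;

# Hypothetical helper: compute a summary without letting the large array
# outlive the block that fills it.
sub summarise {
    my ($count) = @_;
    my ($min, $max);
    {
        my @array = map { ($_ * 7919) % 1000 } 1 .. $count;   # stand-in data
        @array = sort { $a <=> $b } @array;
        ($min, $max) = @array[0, -1];
    }   # @array goes out of scope here, so perl can reuse its memory

    return ($min, $max);
}

my @demo = (1 .. 10);
@demo = ();      # empties the array but keeps its allocated buffer
undef @demo;     # also releases the buffer back to perl's allocator
print scalar(@demo), "\n";

my ($min, $max) = summarise(10_000);
print "$min $max\n";
```

This explains why clearing might look like it has no effect, but it would not by itself explain memory that keeps growing across calls, so treat it as background and not a diagnosis.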

Replies are listed 'Best First'.
Re^3: Multithreading leading to Out of Memory error
by BrowserUk (Patriarch) on Jun 15, 2013 at 08:29 UTC
    So what is going on?

    There is simply not enough information here to even begin to guess.

    If you want this debugged, you are going to have to find some way around your 'can't post the real code' problem and supply -- publicly or privately -- real, runnable source code + sample data that demonstrates the problem. If not, you're on your own I'm afraid.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      To get the same behavior I would have to mock the structure of the file with false data, but even the structure of the file is classified. So if I made an attempt to approximate the structure/data of the file, that would be fine, but knowing my luck, the problem would go away. I'll just keep whacking at it.
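A rough sketch of what such a mock could look like, in case it helps anyone trying the same thing. The layout here (five header lines, then three-line records with five fields on the first line and two exponent-format floats on each of the next two) is entirely made up and only mirrors the shape described in the post. write_mock_file is a hypothetical name, and the values are random.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Writes an invented file: five header lines, then $records three-line records.
# The layout and values are placeholders, not the real (classified) structure.
sub write_mock_file {
    my ($records) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);   # removed when the program exits

    print {$fh} "HEADER LINE $_\n" for 1 .. 5;
    for my $i (1 .. $records) {
        printf {$fh} "%d A B C %.3f\n", $i, rand(1000);   # five fields
        printf {$fh} "%12.5E %12.5E\n", rand(), rand();   # data line 1
        printf {$fh} "%12.5E %12.5E\n", rand(), rand();   # data line 2
    }
    close $fh or die "Cannot close $path: $!";
    return $path;
}

my $path = write_mock_file(1_000);

# Sanity check: count the lines that were written.
open my $in, '<', $path or die "Cannot open $path: $!";
my $lines = 0;
$lines++ while <$in>;
close $in;

print "$path has $lines lines\n";
```

Feeding a file like this to the real parser in a loop would at least show whether the growth depends on the data or only on the code path.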
