PerlMonks  

Consumes memory then crashes

by allhellno (Novice)
on Mar 24, 2012 at 12:15 UTC (#961394=perlquestion)
allhellno has asked for the wisdom of the Perl Monks concerning the following question:

Howdy monks! I have been trying to code a script to fetch info for a game, and I found that without threads it is simply too slow to be effective. However, in an attempt to make things faster I added threading, but now when I run the script it quickly shoots up to 4gb memory then crashes. I am completely new to this, so could someone school me on what I'm doing wrong so this never happens again? Note: the var $names is 16k lines long.

use threads;
use warnings;
use LWP::Simple;

$names='';
my $re69='((?:[a-z0-9_]+))';
open (lookup, '>>rstlookup.txt');
my @a = ();
my @b = ();
print "Starting main program\n";
my $nb_process = 16832;
my $nb_compute = 20;
my $i=0;
my @running = ();
my @Threads;
while (scalar @Threads < $nb_compute) {
    @running = threads->list(threads::running);
    while ($names =~ m/$re69/isg) {
        if (scalar @running < $nb_process) {
            $name = $1;
            my $thread = threads->new(\&lookup);
            push (@Threads, $thread);
            my $tid = $thread->tid;
        }
        @running = threads->list(threads::running);
        foreach my $thr (@Threads) {
            if ($thr->is_running()) {
                my $tid = $thr->tid;
            }
            elsif ($thr->is_joinable()) {
                my $tid = $thr->tid;
                $thr->join;
            }
        }
    }
    @running = threads->list(threads::running);
    $i++;
}
while (scalar @running != 0) {
    foreach my $thr (@Threads) {
        $thr->join if ($thr->is_joinable());
    }
    @running = threads->list(threads::running);
}
sub lookup {
    my $name2 = $name;
    my $lookup = get("http://rscript.org/lookup.php?type=track&time=62899200&user=".$name2."&skill=all");
    print "Looking up $name2...\n";
    $reg1='(ERROR)';
    if ($lookup) {
        my $re1='(gain)';
        my $re2='(:)';
        my $re3='(Overall)';
        my $re4='(:)';
        my $re5='(\\d+)';
        my $re6='(:)';
        my $re7='(\\d+)';
        my $re=$re1.$re2.$re3.$re4.$re5.$re6.$re7;
        if ($lookup =~ m/$re/isg) {
            print lookup "$name2 $7\n";
        }
    }
    if ($lookup =~ m/$reg1/isg) {
        print lookup "$name2 doesn't exist \n"
    }
    else{
        print lookup "$name2 0\n";
    }
}
close (lookup);

Re: Consumes memory then crashes
by BrowserUk (Pope) on Mar 24, 2012 at 12:51 UTC
    In an attempt to make things faster I added threading, but now when I run the script it quickly shoots up to 4gb memory then crashes.

    You are attempting to run 16832 threads concurrently (the $nb_process limit in your code). 4GB / 16832 is roughly 250k. Perl's threads (and threads in most languages) require more than that each. Ergo, what you are trying to do won't work.
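To make the arithmetic concrete, here is a minimal sketch. The helper name per_thread_kib is made up for illustration, 4GB is read as 4 GiB, and the thread count is the $nb_process value from the posted code.

```perl
use strict;
use warnings;

# Hypothetical helper: average memory available per thread, in KiB.
sub per_thread_kib {
    my ($ram_bytes, $threads) = @_;
    return $ram_bytes / $threads / 1024;
}

# 4 GiB shared among the 16832 threads allowed by $nb_process.
printf "%.0f KiB per thread\n", per_thread_kib(4 * 1024**3, 16832);
```

That comes out at about 249 KiB per thread, which is the budget the reply says is too small.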

    Now look at it another way. Your stated goal is "an attempt to make things faster".

    Does your machine have 16,000 cores?

    If not, then using 16832 threads is not going to speed things up.

    Rather than each core starting one lookup(), running it until it completes, and then starting the next one, each core is doing a bit of one, then switching to another and doing a bit, then switching to another and doing a bit, ...

    All that switching costs time and cpu. Time and cpu that can no longer be used for solving the original problem. Ergo, it takes longer!

    Judicious use of threads can speed up some cpu-intensive tasks; but throwing 1000s of threads at a problem is never going to help unless you happen to have around $10,000,000 with which to purchase a machine that has thousands of cores.
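One common way to keep the thread count small and fixed is a worker pool fed from a queue. The sketch below is an illustration only, not something posted in this thread: it assumes a perl built with thread support, uses the core Thread::Queue module, takes the pool size of 20 from the poster's stated goal, and replaces the real HTTP request with a hypothetical stand-in (fetch_score) so that it runs offline. The run_pool name is also made up.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $WORKERS = 20;    # assumed pool size, taken from the poster's stated goal

# Hypothetical stand-in for the real HTTP lookup, so the sketch runs offline.
sub fetch_score {
    my ($name) = @_;
    return length $name;
}

# Start a fixed number of workers, feed them names, collect "name score" lines.
sub run_pool {
    my ($workers, @names) = @_;
    my $todo = Thread::Queue->new;
    my $done = Thread::Queue->new;

    my @pool = map {
        threads->create(sub {
            # Each worker pulls names until it receives an undef stop marker.
            while (defined(my $name = $todo->dequeue)) {
                $done->enqueue("$name " . fetch_score($name));
            }
        });
    } 1 .. $workers;

    $todo->enqueue(@names);
    $todo->enqueue((undef) x $workers);    # one stop marker per worker
    $_->join for @pool;

    # All workers have finished, so a non-blocking drain is enough.
    my @results;
    while (defined(my $line = $done->dequeue_nb)) {
        push @results, $line;
    }
    return sort @results;
}

print "$_\n" for run_pool($WORKERS, qw(zezima alice bob));
```

However long the name list is, only $WORKERS threads ever exist, so memory use stays bounded by the pool size rather than by the number of names.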

    If you want help in improving the performance of your code, show us the unthreaded version(*) and tell us how long it takes and how much faster you would like it to be. Then, once we've checked that your task cannot be sped up by using a better algorithm, we might suggest a threading solution.

    (*) But do ensure that your code is readable and compiles clean with use strict and warnings. Your current code is barely intelligible and has obviously had strict and my slapped into it to try to placate this place. It doesn't work and it doesn't help -- you or us.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      I was trying to get it to only allow 20 working threads at one time, but here is the original:
      use LWP::Simple;

      $names = 'zezima';
      $re69='((?:[a-z0-9_]+))';
      open (lookup, '>>rstlookup.txt');
      while ($names =~ m/$re69/isg) {
          $name = $1;
          $lookup = get("http://rscript.org/lookup.php?type=track&time=62899200&user=".$name."&skill=all");
          print "Looking up $name...\n";
          $reg1='(ERROR)';
          $re1='(gain)';
          $re2='(:)';
          $re3='(Overall)';
          $re4='(:)';
          $re5='(\\d+)';
          $re6='(:)';
          $re7='(\\d+)';
          $re=$re1.$re2.$re3.$re4.$re5.$re6.$re7;
          if ($lookup =~ m/$re/isg) {
              print lookup "$name $7\n";
          }
          elsif ($lookup =~ m/$reg1/isg) {
              print lookup "$name doesn't exist \n"
          }
          else{
              print lookup "$name 0\n";
          }
      }
      close (lookup);

        There doesn't seem to be any good reason to build up your regex a bit at a time in separate variables, then concatenate those bits into another variable, and then interpolate that into a regex:

        $re1='(gain)';
        $re2='(:)';
        $re3='(Overall)';
        $re4='(:)';
        $re5='(\\d+)';
        $re6='(:)';
        $re7='(\\d+)';
        $re=$re1.$re2.$re3.$re4.$re5.$re6.$re7;
        if ($lookup =~ m/$re/isg) {

        And even less reason to repeat the same 3-stage exercise over and over again each time around a loop. Especially as it means that the pieces are reassigned, concatenated and re-interpolated every time around that loop even though nothing changes (Perl only skips the recompile itself when the interpolated string is unchanged).
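One way to avoid the repeated work is to compile the pattern once with qr// before the loop. This is a minimal sketch: the helper name overall_gain is hypothetical, and the sample strings are made up to fit the pattern rather than taken from real server output.

```perl
use strict;
use warnings;

# Compiled once, before the loop; the pattern never changes.
my $GAIN_RE = qr/gain:Overall:\d+:(\d+)/i;

# Hypothetical helper: returns the captured number, or undef on no match.
sub overall_gain {
    my ($text) = @_;
    return $text =~ $GAIN_RE ? $1 : undef;
}

# Sample strings are invented to fit the pattern, not real responses.
for my $text ('gain:Overall:2496:12345', 'ERROR') {
    my $gain = overall_gain($text);
    print defined $gain ? "gain $gain\n" : "no match\n";
}
```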

        And why capture 7 different bits of the string you are matching against:

        $re = '(gain)(:)(Overall)(:)(\\d+)(:)(\\d+)'; ... "$name $7\n"

        when 5 of them are constants; and you are only using one of them?
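To illustrate the point about captures, the sketch below runs both forms against a made-up sample string (an assumption, not real server output) and shows that the seventh group of the long pattern and the only group of the short one hold the same value.

```perl
use strict;
use warnings;

my $text = 'gain:Overall:2496:12345';    # made-up sample, not real output

# Seven capture groups; only the last one (index 6 of the match list) is used.
my ($seventh) = ($text =~ /(gain)(:)(Overall)(:)(\d+)(:)(\d+)/i)[6];

# One capture group around the only part that is needed.
my ($only) = $text =~ /gain:Overall:\d+:(\d+)/i;

print "seven groups: $seventh, one group: $only\n";
```

With a single group the value arrives in $1 rather than $7, so any code that reads the capture has to change along with the pattern.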

        Equally, there is nothing to be gained from assigning a constant string to a variable and then interpolating it into a regex:

        $reg1='(ERROR)'; ... elsif ($lookup =~ m/$reg1/isg)

        Cleaning those up, and a few other things; adding strict and -w; and moving the regexing into a subroutine (to make later threading easier), I get:

        #! perl -slw
        use strict;
        use LWP::Simple;

        sub lookup {
            my( $fh, $name ) = @_;    # filehandle to write to, user name to look up
            my $lookup = get(
                "http://rscript.org/lookup.php?type=track&time=62899200&user=$name&skill=all"
            );
            print "Looking up $name...\n";
            if( $lookup =~ m/gain:Overall:\d+:(\d+)/isg ) {
                print { $fh } "$name $1\n";    # only one capture group now, so $1
            }
            elsif( $lookup =~ m/(ERROR)/isg ) {
                print { $fh } "$name doesn't exist \n"
            }
            else{
                print { $fh } "$name 0\n";
            }
        }

        my $names = 'zezima';
        open( LOOKUP, '>>rstlookup.txt' ) or die $!;
        # Scalar-context //g steps through the names one match at a time.
        while( $names =~ m/([a-z0-9_]+)/isg ) {
            my $name = $1;    # copy before lookup() runs its own regexes
            lookup( \*LOOKUP, $name );
        }
        close( LOOKUP );

        Which will probably not run much more quickly as you are IO-bound, but (I hope you'll agree) is easier to read and will at least consume less cpu.



Node Type: perlquestion [id://961394]
Approved by moritz
Front-paged by Corion