PerlMonks  

Venerable Monks,

I come to the monastery once again, this time seeking advice about the CPU usage of a multi-threaded script I recently wrote.

First, let me explain the script. Its purpose is to walk filesystems, gather information about the files it encounters, write that information to temporary files, and import those files into a MySQL database table, essentially creating a giant 'dir' listing. Not very elegant, I know, but it suits our purposes.

The actual scanning is done by worker threads using FileList, a very nice freeware utility from the makers of TreeSize. The threads are pooled using the Thread::Pool::Simple module. I'll post the code at the bottom of this message.

The script works, but the problem is that the actual Perl process is using far more resources than (I think) it should. Here's what a typical Task Manager pane ends up looking like:

Windows Task Manager

Image Name     User Name   CPU   Mem Usage
Perl.exe       Zenshai     75    125,588K
FileList.exe   Zenshai     06    688K
FileList.exe   Zenshai     04    3,952K
FileList.exe   Zenshai     02    488K
FileList.exe   Zenshai     02    688K
cmd.exe        Zenshai     00    72K
cmd.exe        Zenshai     00    72K
cmd.exe        Zenshai     00    72K
cmd.exe        Zenshai     00    72K
etc...         Zenshai     00    <2000K


So, as you can see, Perl seems to be hogging most of the resources when all it should really be doing is handing out jobs from a queue. I think this is hurting the performance of the FileList.exe processes, which normally use about 08-12 CPU units each.

I'm still very new to Perl, so I don't even know where to begin debugging a problem like this. I'm looking for any help you may be able to provide.

Here is the code creating the pool and adding the jobs:
    my $pool = Thread::Pool::Simple->new(
        min      => 2,                 # at least 2 workers
        max      => $total_threads,    # at most x workers
        load     => 10,                # add worker if each worker has 10 jobs queued
        do       => [ \&do_handler ],  # job handler for each worker
        passid   => 1,                 # pass the job id->1st arg to &do_handler
        lifespan => 10000,             # total jobs handled by each worker
    );

    open( FLP, $deeper_file_path );
    foreach my $file_path (<FLP>) {
        $pool->add( $file_path, $filelistEXE_path, $scan_table_name );
    }
    $pool->join();
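
For comparison, here is a minimal sketch of the same hand-out-jobs pattern using only the core threads and Thread::Queue modules instead of Thread::Pool::Simple. It is not the script's actual code: run_job, the worker count, and the sample paths are hypothetical stand-ins, and the real do_handler is not shown here.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $worker_count = 2;    # hypothetical; stands in for $total_threads
my $queue        = Thread::Queue->new();

# Hypothetical stand-in for do_handler: it only measures the path length.
sub run_job {
    my ($file_path) = @_;
    return length $file_path;
}

# Each worker blocks in dequeue() until a job arrives, so an idle worker
# sleeps rather than polling. An undef job tells the worker to exit.
sub worker {
    my $handled = 0;
    while ( defined( my $file_path = $queue->dequeue() ) ) {
        run_job($file_path);
        $handled++;
    }
    return $handled;
}

# Create the workers in scalar context so join() returns a single count.
my @workers;
for ( 1 .. $worker_count ) {
    my $thr = threads->create( \&worker );
    push @workers, $thr;
}

# Made-up sample paths in place of the lines read from the path file.
my @paths = ( 'C:\\data\\a', 'C:\\data\\b', 'C:\\data\\c' );
$queue->enqueue($_) for @paths;

# One undef per worker so every thread leaves its loop.
$queue->enqueue(undef) for @workers;

my $total_handled = 0;
for my $thr (@workers) {
    my $count = $thr->join();
    $total_handled += $count;
}
print "handled $total_handled of ", scalar(@paths), " jobs\n";
```

Because dequeue() blocks, idle workers and a main thread waiting in join() should need essentially no CPU in a setup like this. Whether Thread::Pool::Simple waits the same way internally is something I have not verified.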


Update: I am running: Perl v5.8.8 built for MSWin32-x86-multi-thread
threads version 1.63

Let me know if there are any other snippets I can post that would help figure this out.

Thanks for reading, and sorry for the long post.

In reply to Multithreaded Script CPU Usage by Zenshai
