PerlMonks  

Well, I am a bit puzzled by the idea that subroutine calls are eating the processor.

You are recursively traversing a subtree, opening all the files and generating MD5 checksums. This will consume a lot of processor time, as the math involved in calculating MD5s is CPU-intensive. The cost of a subroutine call is minuscule by comparison and is a complete red herring.
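If you want to see the difference for yourself, something along these lines would give a rough comparison. It is only a sketch: the 1 MB string is a stand-in I made up for a file's contents, and noop is a hypothetical do-nothing sub, so the absolute numbers will not match your workload.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Digest::MD5 qw(md5_hex);

# Assumption: 1 MB of dummy data stands in for one file's contents.
my $data = 'x' x ( 1024 * 1024 );

# Hypothetical sub that does nothing, to measure bare call overhead.
sub noop { return }

# Run each case for at least 1 CPU second and print a comparison table.
cmpthese( -1, {
    sub_call => sub { noop() },
    md5_1mb  => sub { md5_hex($data) },
} );
```

The rates column shows how many times per second each case runs; the gap between the two is the point, not the exact figures.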

You say it is taking 10-15 minutes as if that is too long. How many files, and how big are they? It doesn't sound unreasonable to me.

Other than that, it is not clear to me exactly what problem you are asking for help with. I have your code, but I obviously cannot run it without creating a subdirectory tree that contains files with the names of those you are looking for, and I could not verify your timing without having the same number and sizes of files as you have.

The biggest problem I see with your code is that you read all the directory entries into an array at each level of recursion, and recurse whenever you encounter a nested directory. Each level's array stays in memory until the levels below it have finished, so if your directories have lots of files and/or the directory structure is very deep, you consume large amounts of memory as you descend the tree.

I think that perhaps your process is consuming so much memory that it is pushing your machine into swapping?

If you are determined to continue to use your own directory traversal routine, then you should avoid "slurping" the whole directory into an array. Instead, call readdir in a while loop and process one entry at a time. This will require that you avoid using a BAREWORD directory handle (like DIRECTORY) and use a lexical instead. Otherwise you will run into conflicts during recursion.
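A minimal sketch of that approach follows. The names walk_tree and md5_of_file are mine, not from your code, and I am assuming you want a hex digest per plain file; adapt as needed.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper: return the MD5 hex digest of one file, or undef.
sub md5_of_file {
    my ($path) = @_;
    open( my $fh, '<', $path ) or do {
        warn "Cannot open '$path': $!\n";
        return;
    };
    binmode $fh;    # checksum the raw bytes
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $digest;
}

# Hypothetical helper: call $callback->($path) for every plain file
# below $dir, reading one directory entry at a time.
sub walk_tree {
    my ( $dir, $callback ) = @_;

    # A lexical handle is private to this call, so a recursive call
    # cannot clobber it the way a shared bareword handle would.
    opendir( my $dh, $dir ) or do {
        warn "Cannot open directory '$dir': $!\n";
        return;
    };

    while ( defined( my $entry = readdir $dh ) ) {
        next if $entry eq '.' or $entry eq '..';
        my $path = "$dir/$entry";

        if ( -d $path and not -l $path ) {
            walk_tree( $path, $callback );    # descend, skipping symlinks
        }
        elsif ( -f $path ) {
            $callback->($path);
        }
    }
    closedir $dh;
    return;
}

# Usage: perl script.pl DIR [DIR ...]
for my $root (@ARGV) {
    walk_tree( $root, sub {
        my ($path) = @_;
        my $digest = md5_of_file($path);
        print "$digest  $path\n" if defined $digest;
    } );
}
```

Only one entry per level is held at any time, so memory use depends on the depth of the tree rather than on the number of files in each directory.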

If none of that previous paragraph makes sense to you, then you should probably consider using File::Find or similar instead.
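For comparison, a File::Find version might look roughly like this. Again a sketch: md5_tree is a name I invented, and returning a hash of path to digest is my assumption about what you want to do with the results.

```perl
use strict;
use warnings;
use File::Find;
use Digest::MD5;

# Hypothetical helper: return a hash ref of path => MD5 hex digest
# for every plain file below $root.
sub md5_tree {
    my ($root) = @_;
    my %digest_for;

    find(
        {
            no_chdir => 1,    # with this set, $_ holds the full path
            wanted   => sub {
                return unless -f $_;
                open( my $fh, '<', $_ ) or do {
                    warn "Cannot open '$_': $!\n";
                    return;
                };
                binmode $fh;    # checksum the raw bytes
                $digest_for{$_} = Digest::MD5->new->addfile($fh)->hexdigest;
                close $fh;
            },
        },
        $root,
    );
    return \%digest_for;
}

# Usage: perl script.pl DIR [DIR ...]
for my $root (@ARGV) {
    my $digests = md5_tree($root);
    print "$digests->{$_}  $_\n" for sort keys %$digests;
}
```

File::Find handles the traversal and the directory handles for you, so the only code left to write is what happens to each file.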

BTW, you should have use strict; (not use Strict;). Module names are case-sensitive, and the pragma is spelled in lowercase.


Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.

In reply to Re: Subroutine speed by BrowserUk
in thread Subroutine speed by prad_intel
