PerlMonks  

Hi Monks-

I've been working on a project for my degree, and it's about time to be done with the thing. This script is just one part of it. (See my bio for more info.) Anyway, I feel like I've never had my code looked over by someone who knows what they're doing, so I posted the db-loading program on my scratchpad, along with a sample of the gene data input. If you feel like you have some spare time, I'd appreciate some constructive criticism.

(I didn't use a generic module because I knew that wouldn't be in the spirit of the project as my advisor saw it. If you'd like to advocate your favorite module, I'll try to study it before the next time I have to do something like this. :) )

My main concern is that the program takes too long to run (14-17 hours for 26,000+ records). There is indexing on several of the fields. I've heard that if I didn't index while inserting data, everything might go faster. I've also heard that the indexing speeds up the multiple db searches the program needs while it runs, so any gain from indexing at the end would be lost to the extra search time within the script. Erg.

As for the script, I did pass array references to the subroutines instead of copying arrays, but what are some other practical ways that you would optimize it for speed? It blazes on the first thousand records or so, then gets slower as it goes on. I expect that to some degree, since it has to search ever-increasing tables as it progresses, but is 14-17 hours realistic? I am seeing some material on optimizing the many regexes in Programming Perl (pp 594-9), but I'm not sure of the best way to apply it. I have read that tr// is faster than s//, which I could use in a couple of places.
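On the tr// versus s// point, here is a minimal sketch of how that claim could be checked with the core Benchmark module before changing anything. The sample sequence and the two subroutine names (`upper_tr`, `upper_s`) are made up for illustration and are not taken from the script on my scratchpad; the actual substitutions there may differ.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data; the real input is the gene data on the scratchpad.
my $seq = 'acgtnacgtn' x 1_000;

# Uppercase the bases with tr// (plain character translation).
sub upper_tr {
    my ($s) = @_;
    (my $copy = $s) =~ tr/acgtn/ACGTN/;
    return $copy;
}

# Same result with s///, which goes through the regex engine.
sub upper_s {
    my ($s) = @_;
    (my $copy = $s) =~ s/([acgtn])/\U$1/g;
    return $copy;
}

# The two versions must agree before any timing is meaningful.
die "results differ\n" unless upper_tr($seq) eq upper_s($seq);

# Run each version for at least one CPU second and print a comparison table.
cmpthese(-1, {
    'tr' => sub { upper_tr($seq) },
    's'  => sub { upper_s($seq) },
});
```

This only applies where the substitution is a fixed character-for-character mapping; anything that needs a real pattern still has to stay as s//. It also says nothing about how much of the 14-17 hours is spent in regexes as opposed to the database.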

Thanks for any comments you can make. If you have questions or need more information, I'll oblige.

-Jenn


In reply to Up for Critique by biograd
