PerlMonks  

Right now I'm revising a script that has to do the following:
  • iterate over a list of ~600 files
  • uncompress each of the files using gzip
  • reformat each uncompressed file using a proprietary program over which I have no control
  • recompress each file so that our disk doesn't run out of space
At the moment I do all of this sequentially, one step after another, and man is it SLOW. There are lots of ways to optimize this, the easiest and most obvious of which is to 'use Thread'. However, my Perl wasn't compiled with Thread support, and I'm reluctant to replace it on my system when so much of what we do depends on Perl. :(

A few questions that I have:
  • Should I even try to optimize this without using Thread? Any ideas on how to do this?
  • Is recompiling Perl with threading support that big of a deal?
  • How usable is Perl threading for a task such as this? It's not enabled in a default compile of Perl; is it a stable feature or a semi-usable hack?
Being lazy, I've left my script in its current state until I have time to step back and really look at it. I've thought about using some system("gunzip $myfile &") style calls, but I can't figure out how to make them synchronize the way I want without some unnecessary assumptions about file size and how it relates to compression/uncompression speed.
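
For what it's worth, the synchronization problem with backgrounded system() calls can be avoided without threads: fork() and wait() let the parent block until a child actually finishes, so nothing has to be guessed from file sizes. Below is a minimal sketch of that idea, not a tested solution for this workload. The names run_pool, process_file and reformat_tool are made up for illustration (the real name and arguments of the proprietary program are unknown), and the sketch assumes every file name is unique and ends in .gz.

```perl
use strict;
use warnings;
use POSIX ();

# Run $worker->($job) for every job, keeping at most $max_children
# child processes alive at once. Returns a hash ref mapping each job
# to its raw wait status ($?); 0 means the worker returned true.
# Assumes job names are unique (they are used as hash keys).
sub run_pool {
    my ($max_children, $jobs, $worker) = @_;
    my (%job_for_pid, %status_for);

    # Block until some child exits, then record its status.
    my $reap_one = sub {
        my $pid = wait();
        die "wait() found no children but jobs are pending\n" if $pid == -1;
        my $job = delete $job_for_pid{$pid};
        $status_for{$job} = $? if defined $job;
    };

    for my $job (@$jobs) {
        # Pool is full: wait for a free slot instead of guessing at timing.
        $reap_one->() while keys(%job_for_pid) >= $max_children;

        my $pid = fork();
        die "fork failed: $!\n" unless defined $pid;
        if ($pid == 0) {
            # Child: do the work and leave at once, skipping END blocks.
            my $ok = eval { $worker->($job) };
            POSIX::_exit($ok ? 0 : 1);
        }
        $job_for_pid{$pid} = $job;
    }

    # Collect the children that are still running.
    $reap_one->() while %job_for_pid;
    return \%status_for;
}

# One file's pipeline: uncompress, reformat, recompress.
# 'reformat_tool' is a placeholder for the proprietary program.
sub process_file {
    my ($gz) = @_;
    (my $plain = $gz) =~ s/\.gz\z// or return 0;
    return system('gunzip', $gz) == 0
        && system('reformat_tool', $plain) == 0
        && system('gzip', $plain) == 0;
}
```

Something like run_pool(4, \@files, \&process_file) would then keep four files in flight at a time, and any file whose status in the returned hash is nonzero failed somewhere in its pipeline. The pool size is a guess to tune: past the number of CPUs, or once the disk is saturated, more children are unlikely to help.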

Also, I'm a long-time reader of PM and a first-time poster, and I just made a small donation to the cause. Many thanks to the Perl Monks community for making this such a valuable resource. :)

In reply to Best Practices for Uncompressing/Recompressing Files? by biosysadmin
