Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??
When I first read your question, I thought that your intent to "create space" meant that once a huge .zip file had been created, it would be archived somewhere else and a whole mess of files all over the place would be deleted. Upon second reading, it appears that you just intend to do a one-to-one replacement of each original file with zipped version of the original file.

7-zip is a good choice. From my experience on Windows, 7-zip is faster and generates a smaller file (compresses better) than the built-in zip functionality while still being 100% compatible with the standard unzipper.

I would prioritize data reliability over any speed considerations. I would suggest using File::Find or one of its cousins to create an array with the full path names of all of the work to be done. Of course mileage does vary, but in general, I strive to have the "wanted() routine" do as little actual work as possible. The find() routine will be changing directories as it goes about its work and if some calculation causes an error, then you could be left with the current working directory at some random point. Making a "to do list" doesn't add much computational effort at all. This has the added bonus of being able to calculate progress in terms of "% completed" as you go along. Sometimes long executing time programs get killed by a user because they think that it is "hung" when in fact it is not.

I would not do this unless there is some compelling reason to do so, but the "to do list" could be a work queue that is parceled out to N threads running in parallel.

Do not delete the original file until you are certain that the zip file has been properly created. This rules out any kind of "move" to zip file functionality.

You should run some tests to estimate how much space you will save by doing this. The compression ratio is highly data dependent. I suspect that more space could be saving by deleting or archiving completely obsolete files? Disks are relatively cheap.

Have you considered what would happen if one if these files happened to be in use by another program at the time? There are many "yeah, buts" and "what if's" down that rabbit hole. I am just mentioning it as "something to consider".

In reply to Re: Automatic zip and delete by Marshall
in thread Automatic zip and delete by justin423

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or or How to display code and escape characters are good places to start.
Log In?

What's my password?
Create A New User
Domain Nodelet?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others rifling through the Monastery: (4)
As of 2022-05-16 19:46 GMT
Find Nodes?
    Voting Booth?
    Do you prefer to work remotely?

    Results (63 votes). Check out past polls.