in reply to Serving tarball contents as part of your webspace (impractical but fun)

One of your main problems is that gzip compresses the tar file as a single stream, so you have to decompress the whole archive in order to extract any particular file from it.

I would expect far better performance if you gzipped the individual files and then tarred the bundle, rather than the other way around. This isn't usually done because it results in less overall compression, but it means that any particular file can be extracted relatively cheaply.
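A minimal sketch of that approach, using Archive::Tar and IO::Compress::Gzip (the snippets/ file names are made up): each file is gzipped on its own, and the already-compressed data is then added to a plain, uncompressed tar.

    use strict;
    use warnings;
    use Archive::Tar;
    use IO::Compress::Gzip qw(gzip $GzipError);

    # Hypothetical snippet files to bundle.
    my @files = glob 'snippets/*.html';

    my $tar = Archive::Tar->new;
    for my $file (@files) {
        my $compressed;
        gzip $file => \$compressed
            or die "gzip of $file failed: $GzipError";
        # Each member is stored pre-compressed; the tar itself stays
        # uncompressed, so a member can be pulled out without inflating
        # the whole archive first.
        $tar->add_data( "$file.gz", $compressed );
    }
    $tar->write('snippets.tar');    # deliberately no COMPRESS_GZIP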

Also note that you could improve performance even more by just losing the tar altogether and serving the gzipped files directly. But if you do that, then you need to be very, very careful about security, or else someone will be able to access any gzipped file on your webserver.
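The security issue is essentially path traversal: the requested name must never be allowed to escape the directory of pre-gzipped files. A minimal CGI sketch, assuming a hypothetical /var/www/gzipped tree of *.gz files and a strict whitelist on the path:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use CGI;

    my $q       = CGI->new;
    my $docroot = '/var/www/gzipped';    # hypothetical tree of *.gz files

    my $path = $q->path_info;            # e.g. "/docs/page.html"
    # Whitelist strictly: slash-separated runs of word characters,
    # dots and dashes only, and no ".." anywhere.
    if ( $path !~ m{\A(?:/[\w.-]+)+\z} or $path =~ m{\.\.} ) {
        print $q->header( -status => '403 Forbidden' );
        exit;
    }

    if ( open my $fh, '<', "$docroot$path.gz" ) {
        binmode $fh;
        binmode STDOUT;
        # Serve the compressed bytes as-is and let the browser inflate them.
        print $q->header( -type => 'text/html', -Content_Encoding => 'gzip' );
        print while <$fh>;
    }
    else {
        print $q->header( -status => '404 Not Found' );
    }

Serving the bytes with Content-Encoding: gzip pushes the decompression onto the browser, which is what makes this cheap on the server.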


Re^2: Serving tarball contents as part of your webspace (impractical but fun)
by Aristotle (Chancellor) on Jan 05, 2002 at 22:29 UTC

    That’s why I tried Archive::Zip – zipfiles contain individually compressed files. However, as I said, performance improved only marginally at best. The fundamental problem remains: a multitude of simultaneously running scripts, all doing the same very CPU-intensive work. Losing the tarball is not helpful, since the main idea was to keep a document tree consisting of oodles of tiny snippet files from eating an ungodly number of inodes.

    You’re giving me an idea though – I’ll check how it performs with an uncompressed zipfile. I know uncompressed tarballs won’t make a huge difference, since Archive::Tar always slurps the whole tarball into memory no matter what. However, zipfiles are indexed, and maybe Archive::Zip is smart enough to exploit that, in which case this thing may actually be useful.
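    For reference, a sketch of that experiment (file names hypothetical): build a zipfile whose members use COMPRESSION_STORED, then pull a single member back out. Whether Archive::Zip really uses the zip’s central directory as an index and seeks straight to the member – rather than slurping the whole archive the way Archive::Tar does – is exactly what the timing would have to show.

        use strict;
        use warnings;
        use Archive::Zip qw(:ERROR_CODES :CONSTANTS);

        # Build a zip whose members are stored uncompressed.
        my $zip = Archive::Zip->new;
        for my $file ( glob 'snippets/*.html' ) {
            my $member = $zip->addFile($file);
            $member->desiredCompressionMethod(COMPRESSION_STORED);
        }
        $zip->writeToFileNamed('snippets.zip') == AZ_OK
            or die "cannot write snippets.zip";

        # Later, fetch one member; the zip's central directory acts
        # as an index, so only this member's data should need reading.
        my $reader = Archive::Zip->new;
        $reader->read('snippets.zip') == AZ_OK
            or die "cannot read snippets.zip";
        my $member = $reader->memberNamed('snippets/foo.html')
            or die "no such member";
        print $member->contents;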

    I’ll update as soon as I’ve found the time to run a quick check.