$ ls -l T*.pdf
-rw-rw-r-- 1 zaxo zaxo 362870 Sep 3 2002 The_Perl_Review_0_5.pdf
-rw-rw-r-- 1 zaxo zaxo 352282 Jan 6 2003 The_Perl_Review_0_6.pdf
-rw-rw-r-- 1 zaxo zaxo 263105 Jan 6 2003 The_Perl_Review_0_7.pdf
$ perl -MPerlIO::gzip -e'for (@ARGV) {local $/ = \4096; open my $ih, "<:raw", $_ or warn $! and next; open my $oh, ">:gzip", $_.".gz" or warn $! and next; while (<$ih>) { print $oh $_ }}' T*.pdf
$ ls -l T*.pdf.gz
-rw-rw-r-- 1 zaxo zaxo 331242 Jun 25 14:57 The_Perl_Review_0_5.pdf.gz
-rw-rw-r-- 1 zaxo zaxo 327834 Jun 25 14:57 The_Perl_Review_0_6.pdf.gz
-rw-rw-r-- 1 zaxo zaxo 235150 Jun 25 14:57 The_Perl_Review_0_7.pdf.gz
$ file T*.pdf.gz
The_Perl_Review_0_5.pdf.gz: gzip compressed data, deflated, last modified: Fri Jun 25 14:57:55 2004, os: Unix
The_Perl_Review_0_6.pdf.gz: gzip compressed data, deflated, last modified: Fri Jun 25 14:57:56 2004, os: Unix
The_Perl_Review_0_7.pdf.gz: gzip compressed data, deflated, last modified: Fri Jun 25 14:57:56 2004, os: Unix
$
The local $/ = \4096; statement sets the input record separator to a reference to an integer, which puts readline into fixed-size record mode: each <$ih> returns up to 4096 bytes instead of one newline-terminated line. Reading the input files in chunks of that size speeds things up quite a bit.
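The record-mode read can be demonstrated in isolation. This is a minimal sketch (the scratch filename and byte counts are made up for the demo, not taken from the session above): write 10,000 bytes to a file, then read it back with $/ set to \4096 and note the size of each record.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical scratch file for the demo.
my $file = 'chunk_demo.bin';
open my $out, '>:raw', $file or die "open: $!";
print {$out} 'x' x 10_000;    # 10,000 bytes, no newlines at all
close $out or die "close: $!";

my @sizes;
{
    # A reference to an integer switches readline to record mode.
    local $/ = \4096;
    open my $in, '<:raw', $file or die "open: $!";
    # In record mode each <$in> returns up to 4096 bytes.
    push @sizes, length $_ while <$in>;
    close $in;
}
unlink $file;

print "@sizes\n";    # 4096 4096 1808
```

The last record is short (1808 bytes) because record mode never pads: it simply returns whatever remains of the file.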
Evidently PDF files are not very redundant to begin with, so they don't gzip all that much tighter.