Re^4: PDF::OCR2 results not what I was hoping for

by nysus (Parson)
on Feb 08, 2016 at 18:35 UTC

in reply to Re^3: PDF::OCR2 results not what I was hoping for
in thread PDF::OCR2 results not what I was hoping for

Bam! Got it. I set the "density" setting to "300x300" before reading the image in; by default it is 72 dpi.

PDF::OCR2 is now reading the text in the cropped rectangle flawlessly.

Thanks for pointing me in the right direction.

Here is the sample code:

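use strict;
use warnings;
use Image::Magick;
use PDF::OCR2;

my $image = Image::Magick->new;

# Set the density before reading so the PDF is rasterized at 300 dpi, not the default 72
$image->Set(density=>'300x300');
$image->Read('agendas/2016-02-02 Natural Resources.pdf', compression=>'None');

# Crop to the rectangle of interest (width x height + x offset + y offset, in pixels)
$image->Crop(geometry=>'1248x520+936+520');
$image->Write(filename=>'crop.pdf', compression=>'None');

# OCR the cropped PDF and print the recognized text
my $p = PDF::OCR2->new('crop.pdf');
my $text_all = $p->text;
print $text_all;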
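One thing to keep in mind: the crop geometry is in pixels, so the numbers above only fit a page rasterized at 300 dpi. If the density changes, the rectangle has to be rescaled to match. Here is a minimal sketch of that arithmetic. Note that scale_geometry is a hypothetical helper written for illustration, not part of Image::Magick or PDF::OCR2, and it assumes a plain WxH+X+Y geometry string.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Image::Magick or PDF::OCR2):
# rescale a WxH+X+Y pixel geometry from one density (dpi) to another.
sub scale_geometry {
    my ($geometry, $from_dpi, $to_dpi) = @_;

    # Assumes the simple WxH+X+Y form with non-negative offsets
    my ($w, $h, $x, $y) = $geometry =~ /^(\d+)x(\d+)\+(\d+)\+(\d+)$/
        or die "Unsupported geometry: $geometry";

    my $factor = $to_dpi / $from_dpi;

    # Round each value to the nearest whole pixel
    return sprintf '%dx%d+%d+%d',
        map { int($_ * $factor + 0.5) } $w, $h, $x, $y;
}

# The 300 dpi rectangle from the code above, rescaled for a 150 dpi render
print scale_geometry('1248x520+936+520', 300, 150), "\n";   # 624x260+468+260
```

The result can be passed to Crop(geometry=>...) in place of the hard-coded string when the density setting differs from 300.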

