PerlMonks  

Re^4: PDF::OCR2 results not what I was hoping for

by nysus (Parson)
on Feb 08, 2016 at 18:35 UTC ( [id://1154660]=note )


in reply to Re^3: PDF::OCR2 results not what I was hoping for
in thread PDF::OCR2 results not what I was hoping for

Bam! Got it. I set the "density" setting to "300x300" before reading the image in; by default it is 72 dpi.

PDF::OCR2 is now reading the text in the cropped rectangle flawlessly.

Thanks for pointing me in the right direction.

Here is the sample code:

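use Image::Magick;
use PDF::OCR2;

my $image = Image::Magick->new;

# Set the density before Read so the PDF is rasterized at 300 dpi, not the default 72
$image->Set(density=>'300x300');
$image->Read('agendas/2016-02-02 Natural Resources.pdf', compression=>'None');

# Crop to the rectangle of interest (width x height + x offset + y offset, in pixels)
$image->Crop(geometry=>'1248x520+936+520');
$image->Write(filename=>'crop.pdf', compression=>'None');

# OCR the cropped PDF and print the recognized text
my $p = PDF::OCR2->new('crop.pdf');
my $text_all = $p->text;
print $text_all;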
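One thing to watch: the crop geometry is given in pixels, so it only matches the rectangle at the density the PDF was read in at. If you change the density, the geometry has to scale with it. Here is a rough sketch of that arithmetic. Note that scale_geometry is a hypothetical helper of my own, not part of Image::Magick or PDF::OCR2, and it assumes a plain "WxH+X+Y" geometry string.

```perl
use strict;
use warnings;

# Hypothetical helper (not an Image::Magick or PDF::OCR2 function):
# rescale a "WxH+X+Y" pixel geometry from one density to another.
sub scale_geometry {
    my ($geometry, $from_dpi, $to_dpi) = @_;

    # Only the simple WxH+X+Y form is handled in this sketch
    my ($w, $h, $x, $y) = $geometry =~ /^(\d+)x(\d+)\+(\d+)\+(\d+)$/
        or die "Unsupported geometry: $geometry";

    my $factor = $to_dpi / $from_dpi;

    # Round each value to the nearest whole pixel
    return sprintf '%.0fx%.0f+%.0f+%.0f', map { $_ * $factor } $w, $h, $x, $y;
}

# The same rectangle as above, if the PDF were read at 600 dpi instead of 300
print scale_geometry('1248x520+936+520', 300, 600), "\n";   # prints 2496x1040+1872+1040
```

The result could then be passed to Crop(geometry=>...) in place of the hard-coded string, as long as the density set before Read matches the target density.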

$PM = "Perl Monk's";
$MCF = "Most Clueless Friar Abbot Bishop Pontiff Deacon";
$nysus = $PM . $MCF;
