
Re^4: PDF::OCR2 results not what I was hoping for

by nysus (Parson)
on Feb 08, 2016 at 18:35 UTC ( [id://1154660]=note )

in reply to Re^3: PDF::OCR2 results not what I was hoping for
in thread PDF::OCR2 results not what I was hoping for

Bam! Got it. I set the "density" setting to "300x300" before reading the image in; by default it is set to 72 dpi.

PDF::OCR2 is now reading the text in the cropped rectangle flawlessly.

Thanks for pointing me in the right direction.

Here is the sample code:

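use Image::Magick;
use PDF::OCR2;

my $image = Image::Magick->new;

# Set the density before Read so the PDF is rasterized at 300 dpi
# instead of the 72 dpi default.
$image->Set(density=>'300x300');
$image->Read('agendas/2016-02-02 Natural Resources.pdf', compression=>'None');

# Crop to the rectangle of interest (WIDTHxHEIGHT+X+Y, in pixels).
$image->Crop(geometry=>'1248x520+936+520');
$image->Write(filename=>'crop.pdf', compression=>'None');

# OCR the cropped page and print the recognized text.
my $p = PDF::OCR2->new('crop.pdf');
my $text_all = $p->text;
print $text_all;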
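One side effect of raising the density: the page is rasterized to more pixels, so a crop rectangle measured at the 72 dpi default no longer lands in the same place. Here is a minimal sketch of rescaling a geometry string between densities. The helper name scale_geometry and the sample rectangle are made up for illustration; they are not part of Image::Magick or PDF::OCR2, and the rectangle is not the one from my agenda file.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Image::Magick or PDF::OCR2):
# rescale a "WxH+X+Y" geometry string from one density to another.
sub scale_geometry {
    my ($geometry, $from_dpi, $to_dpi) = @_;

    # Only the plain WIDTHxHEIGHT+X+Y form is handled here.
    my ($w, $h, $x, $y) = $geometry =~ /^(\d+)x(\d+)\+(\d+)\+(\d+)$/
        or die "Unrecognized geometry: $geometry";

    # Scale each value and round to the nearest whole pixel.
    my @scaled = map { int($_ * $to_dpi / $from_dpi + 0.5) } ($w, $h, $x, $y);

    return sprintf '%dx%d+%d+%d', @scaled;
}

# A rectangle measured on a 72 dpi rendering, converted for 300 dpi.
print scale_geometry('240x120+144+72', 72, 300), "\n";   # prints 1000x500+600+300
```

The result can be passed as the geometry argument to Crop once the image has been read in at the higher density.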

$PM = "Perl Monk's";
$MCF = "Most Clueless Friar Abbot Bishop Pontiff Deacon";
$nysus = $PM . $MCF;
Click here if you love Perl Monks
