in reply to How to extract image captions from a PDF file using perl
The PDF modules on CPAN would probably be a good start. CAM::PDF, IIRC, can do that (well, the image part; the caption is iffy). Also see HTML::HTMLDoc. (What was I yammering about here?)
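For the iffy caption part, here is a rough, untested-on-your-data sketch of one way to go at it with CAM::PDF: pull the text of each page with getPageText and keep the lines that look like captions. The "Figure N" / "Fig. N" pattern and the helper names find_captions and captions_in_pdf are my own assumptions, not anything CAM::PDF provides, and text extraction from PDFs is often lossy, so treat this as a starting point only.

```perl
use strict;
use warnings;

# Heuristic (an assumption): a caption is any line that starts with
# "Figure N" or "Fig. N". Real documents may label captions differently.
sub find_captions {
    my ($text) = @_;
    my @captions;
    for my $line (split /\r?\n/, $text) {
        $line =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        push @captions, $line if $line =~ /^(?:Figure|Fig\.)\s*\d+/i;
    }
    return @captions;
}

# Walk every page of the PDF and collect caption-like lines per page.
sub captions_in_pdf {
    my ($file) = @_;
    require CAM::PDF;    # loaded lazily so find_captions works without it
    my $pdf = CAM::PDF->new($file)
        or die "CAM::PDF could not open $file\n";
    my %by_page;
    for my $page (1 .. $pdf->numPages()) {
        my $text = $pdf->getPageText($page);
        next unless defined $text;    # some pages yield no extractable text
        my @found = find_captions($text);
        $by_page{$page} = \@found if @found;
    }
    return \%by_page;
}

# Usage: perl captions.pl some.pdf
if (@ARGV) {
    my $by_page = captions_in_pdf($ARGV[0]);
    for my $page (sort { $a <=> $b } keys %$by_page) {
        print "page $page: $_\n" for @{ $by_page->{$page} };
    }
}
```

This only finds caption text; it does not tie a caption to a particular image on the page. If the captions in your PDFs are not labelled "Figure" or "Fig.", the regex in find_captions is the bit to change.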