in reply to
How to extract image captions from a PDF file using perl
The PDF modules on CPAN would probably be a good start. CAM::PDF, IIRC, can do that (well, the image part; the caption is iffy). Also see HTML::HTMLDoc. (What was I yammering about here?)
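For the iffy caption part, here is a rough, untested-on-your-files sketch of one way to go about it: pull the text of each page with CAM::PDF and keep the lines that look like captions. The "starts with Figure N or Fig. N" rule is purely my assumption about how captions are written, and the helper names find_captions and captions_in_pdf are made up for this example. It does not tie a caption to a particular image, and the text CAM::PDF returns depends a lot on how the PDF was produced.

```perl
use strict;
use warnings;

# Heuristic (an assumption, not anything PDF-specific): a caption is a
# line that starts with "Figure N" or "Fig. N".
sub find_captions {
    my ($text) = @_;
    my @captions;
    for my $line (split /\n/, $text) {
        $line =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
        push @captions, $line if $line =~ /^(?:Figure|Fig\.)\s*\d+/i;
    }
    return @captions;
}

# Returns a hashref: page number => arrayref of caption-like lines.
sub captions_in_pdf {
    my ($file) = @_;
    require CAM::PDF;               # loaded lazily; install from CPAN
    my $pdf = CAM::PDF->new($file)
        or die "Cannot open $file: $CAM::PDF::errstr\n";
    my %by_page;
    for my $page (1 .. $pdf->numPages()) {
        my $text = $pdf->getPageText($page);
        next unless defined $text;
        my @found = find_captions($text);
        $by_page{$page} = \@found if @found;
    }
    return \%by_page;
}

# Only runs when a file name is given on the command line.
if (@ARGV) {
    my $captions = captions_in_pdf($ARGV[0]);
    for my $page (sort { $a <=> $b } keys %$captions) {
        print "page $page: $_\n" for @{ $captions->{$page} };
    }
}
```

Run it as `perl captions.pl some.pdf` (the script name is arbitrary). If your captions follow a different convention, the regex in find_captions is the only thing that should need changing.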