Try xpdf: http://www.foolabs.com/xpdf/ (GPL). It converts PDF to formats such as text, HTML and XML, and seems to handle font subsets well.
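
For plain text extraction, something like the following might be a starting point. It is only a sketch: it assumes the `pdftotext` tool that ships with xpdf is installed and on your PATH, and the `pdf_to_text` wrapper and the file names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper around xpdf's pdftotext (assumed to be on PATH).
# Usage: pdf_to_text input.pdf [output.txt]
pdf_to_text() {
    in=$1
    # Default output name: same as the input, with .pdf replaced by .txt
    out=${2:-${in%.pdf}.txt}

    # Fail early if the input file is missing or unreadable.
    if [ ! -r "$in" ]; then
        echo "cannot read: $in" >&2
        return 1
    fi

    # Fail with a hint if xpdf's command-line tools are not installed.
    if ! command -v pdftotext >/dev/null 2>&1; then
        echo "pdftotext not found; install xpdf first" >&2
        return 127
    fi

    # -layout asks pdftotext to keep the physical page layout as far as it can.
    pdftotext -layout "$in" "$out"
}
```

Calling `pdf_to_text report.pdf` should leave the text in `report.txt`, which you can then read from Perl like any other file. Whether the result is better than what CAM::PDF gives you will depend on how the fonts in your PDF are embedded, so check the output on one of the problem files first.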

In reply to Re^5: CAM::PDF did't extract all pdf's content by Anonymous Monk
in thread CAM::PDF did't extract all pdf's content by Gangabass
