PerlMonks  

Re: Need Help for Convert PDF to HTML

by steve (Deacon)
on Feb 11, 2011 at 16:11 UTC (#887640)


in reply to Need Help for Convert PDF to HTML

Another difficulty I do not see listed among the replies here is embedded fonts. PDF documents can embed fonts, and HTML cannot. If the source PDF embeds non-standard (non-web) fonts, then extracting those fonts becomes a significant challenge. Some tools are available to help with that. CAM::PDF can Extract Font Info from PDF, but when brian_d_foy asked about extracting the fonts themselves, Chris Dolan replied that he intends never to add that feature.

If you already have the font files, the job may be easier. It really depends on your source PDF document.

CSS can be used to specify such fonts through @font-face rules (see the FontSpring "Bulletproof" Method and the Smiley Variation, among many others).
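As a rough illustration of what such a CSS rule looks like, here is a minimal Python sketch that prints an @font-face block in the general shape of the Fontspring syntax as I understand it. The family name, the file paths, and the helper name font_face_rule are all hypothetical, and it assumes you already have the font converted to the .eot, .woff, .ttf, and .svg formats.

```python
def font_face_rule(family: str, base_url: str) -> str:
    """Build an @font-face rule listing several font formats.

    `family` and `base_url` are hypothetical values for illustration;
    `base_url` is the path of the font files without an extension.
    """
    sources = [
        # The "?#iefix" suffix is the trick aimed at older Internet Explorer.
        (f"{base_url}.eot?#iefix", "embedded-opentype"),
        (f"{base_url}.woff", "woff"),
        (f"{base_url}.ttf", "truetype"),
        # Assumption: the SVG font id matches the family name.
        (f"{base_url}.svg#{family}", "svg"),
    ]
    src = ",\n       ".join(
        f"url('{url}') format('{fmt}')" for url, fmt in sources
    )
    return f"@font-face {{\n  font-family: '{family}';\n  src: {src};\n}}"


if __name__ == "__main__":
    print(font_face_rule("MyPdfFont", "fonts/mypdffont-webfont"))
```

The generated rule goes into your stylesheet, after which the family name can be used in ordinary font-family declarations.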

Many fonts also carry licensing restrictions. Depending on your circumstances (and perhaps the font requirements), this may be of concern to you.


Re^2: Need Help for Convert PDF to HTML
by inman2787 (Initiate) on Mar 26, 2011 at 04:29 UTC
    1. Convert the PDF file to a text file using Acrobat Reader or any similar program. Just save it as a text file; there is no need for the Pro or Extended versions of Reader.
    2. Open TextEdit.app, open the text file you created, and copy/paste the whole thing into a new document window.
    - Open Preferences in TextEdit.
    - Go to the "Open/Save" tab.
    - Change Document Type to HTML Strict or XHTML Strict, depending on your needs. In Styling, select No CSS.
    - Go back and save the new document as an HTML file.
    There are step-by-step instructions on how to convert PDF to HTML.
    Hope that helps!
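If TextEdit is not available, the second half of those steps (wrapping already-extracted plain text in bare HTML with no CSS) can be approximated with a short script. This is only a sketch in Python, not what TextEdit actually does internally. The function name text_to_html and the default title are made up for illustration, and it assumes paragraphs in the text file are separated by blank lines.

```python
import html
import re


def text_to_html(text: str, title: str = "Converted document") -> str:
    """Wrap plain text in a minimal HTML page, one <p> per paragraph."""
    # Normalise Windows line endings, then split on blank lines.
    chunks = re.split(r"\n\s*\n", text.replace("\r\n", "\n"))
    paragraphs = [chunk.strip() for chunk in chunks if chunk.strip()]
    body = "\n".join(
        # Escape <, > and & so the text cannot break the markup,
        # and keep single line breaks as <br>.
        "<p>" + html.escape(p).replace("\n", "<br>\n") + "</p>"
        for p in paragraphs
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n<body>\n"
        f"{body}\n"
        "</body>\n</html>\n"
    )


if __name__ == "__main__":
    sample = "First paragraph,\nsecond line.\n\nSecond paragraph & more."
    print(text_to_html(sample))
```

Like the manual route, this keeps only the text: layout, images, and fonts from the PDF are lost.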
Re^2: Need Help for Convert PDF to HTML
by Anonymous Monk on Dec 31, 2011 at 15:25 UTC
    HTML5-era browsers support embedded fonts through CSS @font-face; no JavaScript is required.
