Re^2: Need Help for Convert PDF to HTML
by LanX (Chancellor) on Feb 12, 2011 at 00:40 UTC
> Building the HTML page (and probably some CSS as well) to mimic the PDF layout. This will be more difficult than one thinks, as the HTML document format is actually very bad at placing "things" at exactly the spot you want. The whole idea of HTML (and CSS) is that the layout is "flowing" and will adapt itself (more or less) gracefully to the output method of the client viewing it.
Actually, most of this has been solvable since CSS positioning was introduced (maybe 10 years ago?). The real problem is that arbitrary fonts are (in practice) not embeddable in HTML, and reconstructing words, lines and paragraphs with even slightly different font metrics looks awkward.
For example, some may remember how Google used to produce HTML previews of PDFs, with those random gaps in the lines of text.
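To illustrate the CSS positioning approach, here is a minimal sketch in Python. It assumes the text fragments and their coordinates have already been extracted from the PDF by some other tool; the function name `fragments_to_html` and the `(x, y, font_size, text)` tuple layout are made up for this example and do not come from any particular library. Each fragment is pinned to its position with `position:absolute`. The browser still renders the text with whatever font it has available, which is where the gaps come from.

```python
from html import escape


def fragments_to_html(fragments, page_width_pt, page_height_pt):
    """Render text fragments as absolutely positioned spans.

    fragments: iterable of (x_pt, y_pt, font_size_pt, text) tuples, with the
    origin at the top-left corner of the page (a hypothetical input format).
    """
    spans = []
    for x, y, size, text in fragments:
        # Pin each fragment to its coordinates; white-space:pre keeps
        # the spaces inside a fragment from being collapsed.
        spans.append(
            f'<span style="position:absolute;left:{x:.1f}pt;top:{y:.1f}pt;'
            f'font-size:{size:.1f}pt;white-space:pre">{escape(text)}</span>'
        )
    body = "\n".join(spans)
    # The relatively positioned container acts as the "page" that the
    # absolute coordinates refer to.
    return (
        f'<div style="position:relative;width:{page_width_pt:.1f}pt;'
        f'height:{page_height_pt:.1f}pt">\n{body}\n</div>'
    )


# Two fragments on an A4-sized page (595 x 842 pt), values chosen for illustration.
page = fragments_to_html(
    [(72, 72, 12, "Hello,"), (110.5, 72, 12, "world & co.")],
    595,
    842,
)
print(page)
```

Note that the second fragment starts at a fixed `left` offset regardless of how wide "Hello," is actually rendered. If the browser's substitute font is narrower or wider than the original, the fragments drift apart or overlap.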
As I already said, it depends heavily on the use case (and on differing definitions of what HTML is).